J. Chem. Inf. Comput. Sci. 1995, 35 (4), 708-713


Three-Dimensional Molecular Descriptors Based on Electron Charge Density Weighted Graphs

Ernesto Estrada
Centro de Bioactivos Químicos, Universidad Central de Las Villas, Santa Clara 54830, Villa Clara, Cuba

Received May 15, 1994. Abstract published in Advance ACS Abstracts, June 15, 1995.

Electron charge density calculated from quantum-chemical methods is used as vertex weight in molecular graphs. One graph-theoretical index is obtained by using the Randić-type invariant in the present approach. The new index contains information on 3D features of molecules, and it discriminates between geometrical isomers, such as cis-trans alkenes, and between conformers. This index is used in quantitative structure-property relationships with boiling points of alkenes, and its performance is compared to the valence molecular connectivity index of Kier and Hall. Another index, which includes a correction for the influence of the different hydrogen atoms in the molecule, is also proposed. Both novel indexes are more isomer sensitive than 2D molecular connectivity indexes, and the statistical analysis shows that correlations obtained with the 3D molecular descriptors are better than those obtained with the 2D ones.

INTRODUCTION

The study of topological features of molecules has become of major interest in recent years, and a great number of derived indices have been published in the chemical literature. These indices are based on graph-theoretical representations of molecules and the use of several invariants from mathematical properties of a structure [8].

One of the principal areas of research in chemical graph theory is the development and application of topological indices in quantitative structure-property (QSPR) and quantitative structure-activity relationships (QSAR) studies. The importance of this type of approach in reducing the analysis of molecules to property-property or property-activity comparisons, where mathematical properties are compared to physicochemical or biological properties, has been emphasized by Randić [13-15].

There are many topological indices described in the literature [10] (more than 120). Among the most important ones are the Wiener number [16], Hosoya index [17], Randić index of molecular connectivity [18], valence molecular connectivity of Kier and Hall [19], Balaban index [20], molecular ID number [21], and so forth. With the proliferation of new topological indices, some attempts at their systematization and generalization have been proposed in the literature.

Topological indices have two main limitations in describing the chemical structure of molecules: multiple bonds and heteroatoms are not considered in the calculations, and information about molecular spatial properties, such as conformation, is lacking.

The first limitation has been considered in several works, and different approaches to resolve this problem can be found in the literature [1,15,32-34].

The second limitation is related to the nature of graph-theoretical descriptors: they are derived from a molecular graph, which is a two-dimensional representation of a molecule rather than the molecular structure itself.

It is clear that three-dimensional properties of molecules are of profound importance, especially when one considers stereospecific interactions and bioactivity, which are abundant in the information of drug-receptor complexes [35]. This appears to be a serious drawback and generates criticism from users of this type of approach [36].

Only three approaches, to our knowledge, have been proposed to solve this limitation of topological indices. Randić [37-39] proposes an extension of graph-theoretical methodology to structures embedded in three-dimensional space; Bogdanov et al. suggest the use of the geometric distance matrix to calculate a three-dimensional Wiener number [40]; and Estrada and Montero use the bond orders, calculated from different quantum-chemical methods, as edge weights in the calculation of three-dimensional molecular connectivity [41].

Our purpose in the present work is to use the electron charge density, calculated from quantum chemical semiempirical methods, as vertex weights in molecular graphs in order to generate a three-dimensional valence molecular connectivity index that permits the differentiation of molecules with multiple bonds and heteroatoms as well as molecules with different spatial properties, such as configuration and conformation. The influence of different hydrogen atoms in molecules will be considered as a correction to the proposed index.

THE MOLECULAR CONNECTIVITY INDEXES

In the context of chemical graph theory, molecules are considered as simple graphs $G = \{V, E\}$, where $V = \{v_i \mid i = 1, 2, ..., n\}$ is the vertex set representing atoms and $E = \{e_i \mid i = 1, 2, ..., m\}$ is the edge set, whose elements represent bonds in the molecule.

The molecular connectivity index was introduced by Randić [18] as a good measurement of branching in molecules, and it is calculated as

$$\chi = \sum (\delta_i \delta_j)^{-1/2}$$

where the summation is over all pairs of adjacent atoms (bonds) in the molecule, and $\delta_i$ is the degree of atom i, calculated as the sum of all elements of the ith row or column in the adjacency matrix A of the graph G representing the molecule.
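The definition of the simple connectivity index can be illustrated with a minimal Python sketch. The function name `randic_index` and the three example graphs are illustrative choices, not taken from the paper; the graphs are the hydrogen-suppressed carbon skeletons written as adjacency matrices.

```python
from math import sqrt

def randic_index(adjacency):
    """Sum of (delta_i * delta_j) ** -0.5 over all bonds of the graph."""
    n = len(adjacency)
    # delta_i is the row sum of the adjacency matrix (vertex degree)
    degree = [sum(row) for row in adjacency]
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):        # visit each bond once
            if adjacency[i][j]:
                total += 1.0 / sqrt(degree[i] * degree[j])
    return total

# Three-carbon chain (skeleton of propane or propene)
chain3 = [[0, 1, 0],
          [1, 0, 1],
          [0, 1, 0]]

# Four-carbon chain (skeleton of n-butane)
chain4 = [[0, 1, 0, 0],
          [1, 0, 1, 0],
          [0, 1, 0, 1],
          [0, 0, 1, 0]]

# Central carbon bonded to three others (skeleton of isobutane)
star4 = [[0, 1, 1, 1],
         [1, 0, 0, 0],
         [1, 0, 0, 0],
         [1, 0, 0, 0]]

chi_chain3 = randic_index(chain3)
chi_chain4 = randic_index(chain4)
chi_star4 = randic_index(star4)
print(chi_chain3, chi_chain4, chi_star4)
```

The branched four-carbon skeleton gives a smaller value than the linear one, which is the sense in which the index measures branching. Because only the adjacency matrix enters, any two molecules with the same skeleton receive the same value.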

0095-2338/95/1635-0708$09.00/0 © 1995 American Chemical Society


This index was further developed by Kier and Hall, who introduced a treatment of unsaturation and a rational way of quantifying heteroatom content [42]; the result is called the valence molecular connectivity index ($\chi^v$). The calculation of $\chi^v$ proceeds exactly as for the simple molecular connectivity index but uses a modified value for the $\delta_i$ parameter. The new value, $\delta_i^v$, is assigned to an atom based on the number of valence electrons ($Z_i^v$) not involved in bonds to hydrogen. Thus, $\delta_i^v = Z_i^v - h_i$, where $h_i$ is the number of hydrogen atoms bonded to atom i.

A great number of applications published in the literature use molecular connectivity indexes to correlate several physicochemical properties as well as biological activities of different types of compounds.

The principal disadvantages of molecular connectivity indexes are their low power to discriminate between isomeric compounds and their inability to reflect fine stereochemical alternatives; i.e., they do not register any difference between cis and trans isomers or between conformers.

THREE-DIMENSIONAL DESCRIPTORS

In the present approach the electron charge densities on atoms are used as weights for vertices in the hydrogen-suppressed graph representing molecules. The electron charge density on atom i ($q_i$) is calculated from quantum chemical methods using the expression

[equation not recoverable from the source; only the summation index $\mu \in i$ survives]

where $Z_i$ is the nuclear charge, P is the density matrix, S is the overlap matrix, and the term $\sum_{\mu \in i} (PS)_{\mu\mu}$ is the Mulliken population [43] (the number of electrons in each atomic orbital).

Now we calculate the electron charge density connectivity, $\delta_i(q)$, of atom i by subtracting from $q_i$ the number of hydrogen atoms bonded to i ($h_i$)

$$\delta_i(q) = q_i - h_i$$

and the three-dimensional valence connectivity index $\Omega(q)$ is calculated using the Randić-type invariant as

$$\Omega(q) = \sum [\delta_i(q)\,\delta_j(q)]^{-1/2}$$

where the summation is taken over all pairs of bonded atoms in the hydrogen-depleted graph.

In order to consider the influence of the different hydrogen atoms in the molecule we calculate a corrected electron charge density connectivity as follows

$$\delta_i^c(q) = q_i - \sum_j q_{h_j}$$

where $q_{h_j}$ is the electron charge density of the jth hydrogen atom bound to atom i. A corrected 3-D valence connectivity index $\Omega^c(q)$ is calculated in the same way as $\Omega(q)$ but using values of $\delta_i^c(q)$ instead of $\delta_i(q)$.

Calculations of $\Omega$ indexes were performed by using the electron charge density calculated from the quantum chemical semiempirical method PM3 [44]. Full geometry optimizations with the Broyden-Fletcher-Goldfarb-Shanno method were carried out using the package MOPAC version 6.0 [46], and computation of $\Omega$ indexes was made with the system MODEST version 1.0 [47]. As an example, the computation of both indices for propene is shown in Figure 1.

Figure 1. The computation of $\Omega$ indices for propene: (a) values of electron charge density on each atom of the molecule (I) and molecular graph representing the carbon skeleton of propene with labeled vertices (II), (b) calculations of electron charge density connectivities for each atom of propene, and (c) calculation of $\Omega$ indices.

(a) Hydrogen values legible in the panel: 0.9185 and 0.9213 (on C1), 0.9063 (on C2), 0.9559 and 0.9558 (on C3; the third value is not legible).

(b) Calculations of electron charge density connectivities for each atom of propene:

$\delta_1(q) = 4.1699 - 2 = 2.1699$
$\delta_2(q) = 4.1360 - 1 = 3.1360$
$\delta_3(q) = 4.0781 - 3 = 1.0781$

$\delta_1^c(q) = 4.1699 - 1.8398 = 2.3301$
$\delta_2^c(q) = 4.1360 - 0.9063 = 3.2297$
$\delta_3^c(q) = 4.0781 - 2.8697 = 1.2084$

(c) Calculation of $\Omega$ indices:

$\Omega(q) = 0.9272$
$\Omega^c(q) = 0.8708$

APPLICATIONS IN QSPR STUDIES

The valence molecular connectivity index and the three-dimensional descriptors $\Omega(q)$ and $\Omega^c(q)$ were used in QSPR models for predicting the boiling points of a series of 53 C4-C8 alkenes. The experimental boiling points and topological indexes are given in Table 1. In order to describe the boiling points of alkenes as a function of topological indexes (TI) we use a linear least-squares fit of the form

$$\mathrm{bp}\ (^\circ\mathrm{C}) = a + b\,\mathrm{TI}$$
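A one-variable least-squares fit of this form, together with the statistics quoted for it (r, s, and F), can be sketched in plain Python. This is only an illustration under stated assumptions: the helper name `fit_line` is hypothetical, and the six data pairs are a small subset of Table 1 (taking the second index column as $\Omega(q)$), so the coefficients it returns are not those of the 53-compound fits.

```python
from math import sqrt

def fit_line(x, y):
    """Ordinary least squares for y = a + b*x, with r, s, and the F ratio."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx                       # slope
    a = mean_y - b * mean_x             # intercept
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    r = sxy / sqrt(sxx * syy)           # linear correlation coefficient
    s = sqrt(sse / (n - 2))             # standard deviation of regression
    f = (syy - sse) / (sse / (n - 2))   # Fisher ratio, one regressor
    return {"a": a, "b": b, "r": r, "s": s, "F": f}

# Illustrative subset of Table 1: (index value, boiling point in deg C)
# cis-butene-2, pentene-1, 2,3-diMe-butene-2, 4-Me-pentene-1, hexene-1, octene-1
omega_q = [1.4045, 1.9123, 2.1483, 2.2511, 2.3876, 3.3390]
bp_exp = [0.88, 29.90, 73.20, 53.90, 63.30, 121.30]

fit = fit_line(omega_q, bp_exp)
print(fit)
```

The same routine can be applied to each index column in turn; comparing r, s, and F across the fits is how the descriptors are ranked in the text.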

The statistical parameters of these correlations are given in Table 2 for the three descriptors, where r is the linear correlation coefficient, s is the standard deviation of regression, and F is the Fisher ratio. As expressed by these parameters, equations using the three-dimensional descriptors $\Omega(q)$ and $\Omega^c(q)$ represent better QSPR models than the equation with the two-dimensional valence connectivity index $\chi^v$.

With the objective of obtaining good QSPR models to describe the boiling points of alkenes according to the recently proposed methodology of Mihalić and Trinajstić [10], we introduce the number of methyl groups directly bonded to double bonds ($\#\mathrm{CH_3}$) as a new variable in the regression equations, obtaining models of the form

$$\mathrm{bp}\ (^\circ\mathrm{C}) = a + b\,\mathrm{TI} + c\,\#\mathrm{CH_3}$$

The statistical characteristics of the above correlations are given in Table 3. The better agreement with experimental boiling temperatures obtained with these models can be explained by the fact that methyl groups increase the electron density in the double bond (by inductive and hyperconjugative effects), which must cause variations in van der Waals molecular interactions not included in topological indexes. Boiling temperatures calculated by the six equations considered here are shown in Table 4.

Table 1. Boiling Points (in °C) and Topological Indexes χ^v, Ω(q), and Ω^c(q) for a Series of 53 Alkenes

no.  alkene                   bp (a)   χ^v     Ω(q)    Ω^c(q)
1    cis-butene-2               0.88   1.4881  1.4045  1.3197
2    trans-butene-2             3.72   1.4881  1.4051  1.3209
3    3-Me-butene-1             20.10   1.8963  1.7959  1.7024
4    pentene-1                 29.90   2.0235  1.9123  1.8096
5    2-Me-butene-1             31.20   1.9143  1.8173  1.7417
6    cis-pentene-2             36.90   2.0260  1.9150  1.8045
7    trans-pentene-2           36.40   2.0260  1.9156  1.8054
8    3,3-diMe-butene-1         41.20   2.1969  2.0858  1.9862
9    2-Me-butene-2             38.50   1.8661  1.7735  1.6776
10   4-Me-pentene-1            53.90   2.3794  2.2511  2.1352
11   3-Me-pentene-1            54.10   2.4342  2.3016  2.1808
12   cis-4-Me-pentene-2        56.30   2.3988  2.2715  2.1487
13   trans-4-Me-pentene-2      58.60   2.3988  2.2737  2.1514
14   2,3-diMe-butene-1         55.70   2.2971  2.1829  2.0735
15   2-Me-pentene-1            60.70   2.4143  2.2874  2.1677
16   hexene-1                  63.30   2.5235  2.3876  2.2627
17   2-Et-butene-1             64.70   2.4750  2.3471  2.2216
18   trans-hexene-3            67.10   2.5639  2.4259  2.2894
19   cis-hexene-3              66.40   2.5639  2.4240  2.2864
20   2-Me-pentene-2            67.30   2.4040  2.2845  2.1627
21   cis-3-Me-pentene-2        70.40   2.4268  2.3042  2.1789
22   trans-3-Me-pentene-2      67.60   2.4268  2.3040  2.1787
23   trans-hexene-2            67.90   2.5260  2.3893  2.2575
24   cis-hexene-2              68.80   2.5260  2.3895  2.2582
25   4,4-diMe-pentene-1        72.50   2.6700  2.5344  2.4138
26   2,3-diMe-butene-2         73.20   2.2500  2.1483  2.0405
27   3,3-diMe-pentene-1        77.50   2.7576  2.6119  2.4814
28   2,4-diMe-pentene-1        81.60   2.7702  2.6293  2.4989
29   2,4-diMe-pentene-2        83.40   2.7768  2.6406  2.5072
30   3-Et-pentene-1            85.00   2.9721  2.8080  2.6596
31   2,3-diMe-pentene-1        84.30   2.8350  2.6895  2.5530
32   4-Me-hexene-1             87.50   2.9173  2.7601  2.6180
33   3-Me-2-Et-butene-1        89.00   2.8578  2.7135  2.5753
34   2-Me-hexene-1             91.10   2.9143  2.7668  2.6281
35   2-Et-pentene-1            94.00   2.9750  2.8195  2.6712
36   trans-heptene-3           96.00   3.0639  2.9009  2.7452
37   cis-heptene-3             95.70   3.0639  2.8984  2.7394
38   2-Me-hexene-2             95.80   2.9004  2.7557  2.6113
39   2-Et-pentene-2            94.00   2.9268  2.8285  2.6688
40   2,3-diMe-pentene-2        97.50   2.8107  2.6757  2.5363
41   cis-heptene-2             98.50   3.0260  2.8641  2.7098
42   trans-heptene-2           98.00   3.0260  2.8645  2.7108
43   2,4,4-triMe-pentene-1    101.40   3.0608  2.9081  2.7703
44   2,4,4-triMe-pentene-2    101.90   3.0774  2.9300  2.7897
45   2,5-diMe-hexene-2        112.60   3.2599  3.1010  2.9489
46   2,3,4-triMe-pentene-2    116.50   3.1935  3.0356  2.8793
47   octene-1                 121.30   3.5235  3.3390  3.1700
48   cis-octene-3             122.90   3.5639  3.3743  3.1946
49   trans-octene-3           123.30   3.5639  3.3753  3.1953
50   cis-octene-4             122.50   3.5639  3.3730  3.1935
51   trans-octene-4           121.90   3.5639  3.3734  3.1939
52   cis-octene-2             125.60   3.5260  3.3409  3.1657
53   trans-octene-2           125.00   3.5260  3.3408  3.1653

(a) Experimental boiling points were taken from ref 48.

Table 2. Statistical Parameters for the Linear Correlation between Boiling Points and Topological Indexes of Alkenes

model  index    a         b        r       s     F ratio
1      χ^v      -79.905   58.238   0.9824  5.90  1411
2      Ω(q)     -79.967   61.437   0.9841  5.62  1561
3      Ω^c(q)   -79.534   64.658   0.9842  5.59  1576
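As a quick check of how the one-variable models behave on a cis/trans pair, the Table 2 coefficients can be applied to the index values of alkenes 1 and 2. The sketch below assumes that the three index columns of Table 1 are, in order, $\chi^v$, $\Omega(q)$, and $\Omega^c(q)$; this assignment is an inference, chosen because it reproduces the corresponding entries of Table 4. The names `MODELS`, `ALKENES`, and `predict_bp` are illustrative.

```python
# (a, b) regression coefficients as listed in Table 2
MODELS = {
    "chi_v": (-79.905, 58.238),    # model 1
    "omega": (-79.967, 61.437),    # model 2
    "omega_c": (-79.534, 64.658),  # model 3
}

# Index values for alkenes 1 and 2 of Table 1 (column assignment assumed)
ALKENES = {
    "cis-butene-2": {"chi_v": 1.4881, "omega": 1.4045, "omega_c": 1.3197},
    "trans-butene-2": {"chi_v": 1.4881, "omega": 1.4051, "omega_c": 1.3209},
}

def predict_bp(index_name, value):
    """Boiling point in deg C from bp = a + b * TI."""
    a, b = MODELS[index_name]
    return a + b * value

predictions = {
    name: {key: round(predict_bp(key, value), 2) for key, value in indexes.items()}
    for name, indexes in ALKENES.items()
}
for name, row in predictions.items():
    print(name, row)
```

The 2D model returns the same value for both isomers because $\chi^v$ is identical for them, whereas the two 3D models return slightly different values for cis and trans.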

Table 3. Statistical Parameters for the Linear Correlation between Boiling Points vs Topological Indexes and Number of Methyl Groups Bonded to Double Bond of Alkene

model  index    a         b        c      r       s     F ratio
4      χ^v      -91.794   60.862   5.335  0.9951  3.14  2558
5      Ω(q)     -90.986   63.966   5.041  0.9955  3.03  2743
6      Ω^c(q)   -90.400   67.282   4.994  0.9954  3.04  2715
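The effect of the additional methyl-count variable can be seen by evaluating model 4 for two compounds. In this sketch the methyl counts (0 for pentene-1, 2 for cis-butene-2) are read off the compound names and are an assumption, as is the use of the first index column of Table 1 as $\chi^v$; the helper `predict_bp_methyl` is a hypothetical name.

```python
# Coefficients of model 4 as listed in Table 3 (index: chi_v)
MODEL_4 = {"a": -91.794, "b": 60.862, "c": 5.335}

def predict_bp_methyl(ti, n_methyl, model=MODEL_4):
    """Boiling point in deg C from bp = a + b * TI + c * (#CH3)."""
    return model["a"] + model["b"] * ti + model["c"] * n_methyl

# name: (chi_v from Table 1, assumed number of methyls on the double bond)
examples = {
    "pentene-1": (2.0235, 0),
    "cis-butene-2": (1.4881, 2),
}

bp_calc = {name: predict_bp_methyl(ti, n_me) for name, (ti, n_me) in examples.items()}
for name, value in bp_calc.items():
    print(f"{name}: {value:.2f}")
```

With these inputs the values agree with the model 4 entries for the same compounds in Table 4 to within rounding of the coefficients.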

Table 4. Calculated Boiling Points (in °C) of the Studied Series of Alkenes

alkene  model 1  model 2  model 3  model 4  model 5  model 6
1         6.76     6.32     5.80     9.44     8.94     8.38
2         6.76     6.36     5.87     9.44     8.97     8.46
3        30.53    30.37    30.54    23.62    23.89    24.14
4        37.94    37.52    37.47    31.36    31.34    31.35
5        31.58    31.68    33.08    30.05    30.30    31.78
6        38.09    37.68    37.14    36.85    36.55    36.00
7        38.09    37.72    37.20    36.85    36.59    36.06
8        48.04    48.18    48.89    41.91    42.44    43.23
9        28.77    28.99    28.94    37.78    37.58    37.45
10       58.67    58.33    58.52    53.02    53.01    53.26
11       61.86    61.44    61.47    56.36    56.24    56.33
12       59.80    59.59    59.40    59.54    59.35    59.16
13       59.80    59.72    59.57    59.54    59.49    59.34
14       53.88    54.14    54.54    53.35    53.69    54.10
15       60.70    60.56    60.63    60.48    60.37    60.44
16       67.06    66.72    66.77    61.79    61.74    61.84
17       64.24    64.23    64.11    58.84    59.15    59.07
18       69.41    69.07    68.50    64.25    64.19    63.63
19       69.41    68.96    68.30    64.25    64.07    63.43
20       60.10    60.39    60.30    65.19    65.23    65.10
21       61.43    61.60    61.35    66.57    66.49    66.19
22       61.43    61.58    61.34    66.57    66.47    66.17
23       67.21    66.82    66.43    67.28    66.89    66.48
24       67.21    66.84    66.48    67.28    66.90    66.53
25       75.59    75.74    76.54    70.71    71.13    72.00
26       51.13    52.02    52.40    66.48    66.60    66.86
27       80.69    80.50    80.91    76.04    76.09    76.55
28       81.43    81.57    82.04    82.14    82.24    82.72
29       81.81    82.26    82.58    87.88    88.00    88.28
30       93.19    92.55    92.43    89.09    88.63    88.54
31       85.20    85.27    85.54    86.08    86.09    86.36
32       90.00    89.60    89.74    85.76    85.57    85.74
33       86.53    86.74    86.98    82.14    82.58    82.87
34       89.82    90.02    90.39    90.91    91.04    91.42
35       93.36    93.25    93.18    89.27    89.37    89.32
36       98.53    98.25    97.97    94.68    94.57    94.30
37       98.53    98.10    97.59    94.68    94.41    93.91
38       89.01    89.33    89.31    95.40    95.37    95.28
39       90.55    93.81    93.03    91.67    94.98    94.16
40       83.79    84.42    84.46    95.27    95.29    95.23
41       96.33    95.99    95.68    97.71    97.26    96.91
42       96.33    96.02    95.74    97.71    97.28    96.98
43       98.35    98.70    99.59    99.83   100.07   100.98
44       99.32   100.04   100.84   106.17   106.52   107.28
45      109.95   110.55   111.14   117.28   117.45   118.00
46      106.08   106.53   106.64   113.24   113.27   113.31
47      125.30   125.17   125.43   122.65   122.60   122.88
48      127.65   127.34   127.02   125.11   124.85   124.54
49      127.65   127.40   127.07   125.11   124.92   124.59
50      127.65   127.26   126.95   125.11   124.77   124.46
51      127.65   127.28   126.98   125.11   124.80   124.49
52      125.45   125.29   125.16   128.14   127.76   127.59
53      125.45   125.28   125.13   128.14   127.75   127.56

CHARACTERISTIC FEATURES OF Ω INDEXES

The general features of the $\Omega$ indices depend on the sensitivity of the $\delta_i$ values on which their calculation is based. The use of electron charge density in the calculation of $\delta_i$ values improves their sensitivity in differentiating dissimilar atoms or atomic groups in molecules.
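The sensitivity argument can be made concrete with the propene values reported in Figure 1. The sketch below compares the integer 2D values $\delta_i^v = Z_i^v - h_i$ with the 3D values $\delta_i(q)$ and $\delta_i^c(q)$ for the three carbons and then forms the corresponding indices. The carbon charge densities and the per-carbon hydrogen sums are taken from the figure; the variable names, and the choice of $Z^v = 4$ for carbon, are illustrative assumptions.

```python
from math import sqrt

# Hydrogen-suppressed graph of propene: C1=C2-C3 (0-based vertex labels)
BONDS = [(0, 1), (1, 2)]

q_carbon = [4.1699, 4.1360, 4.0781]    # electron charge density on C1, C2, C3
n_hydrogen = [2, 1, 3]                 # hydrogens bonded to each carbon
q_h_sum = [1.8398, 0.9063, 2.8697]     # summed charge density of those hydrogens
z_valence = [4, 4, 4]                  # valence electrons of carbon (assumption)

def connectivity(delta):
    """Randic-type invariant: sum over bonds of (delta_i * delta_j) ** -0.5."""
    return sum(1.0 / sqrt(delta[i] * delta[j]) for i, j in BONDS)

delta_v = [z - h for z, h in zip(z_valence, n_hydrogen)]   # 2D valence deltas
delta_q = [q - h for q, h in zip(q_carbon, n_hydrogen)]    # 3D deltas
delta_c = [q - s for q, s in zip(q_carbon, q_h_sum)]       # corrected 3D deltas

chi_v = connectivity(delta_v)
omega = connectivity(delta_q)
omega_c = connectivity(delta_c)

print("delta_v:", delta_v)
print("delta_q:", [round(d, 4) for d in delta_q])
print("delta_c:", [round(d, 4) for d in delta_c])
print(f"chi_v={chi_v:.4f}  omega={omega:.4f}  omega_c={omega_c:.4f}")
```

The 2D deltas are integers fixed by atom type and hydrogen count, while the 3D deltas shift with the computed charge density, so they respond to changes in geometry. The two $\Omega$ values agree with those of Figure 1 to within rounding in the fourth decimal.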


Table 5. Values of 2D-$\delta_i$ and 3D-$\delta_i$ Parameters for Different Atomic Groups
group CH3“ CH3b -CH2=CH2 -NHf -NH2’ -CH< =CH-C =NHc =NHd -NH”-NHb=C < >C