
J. Phys. Chem. B 2007, 111, 10174-10179

Predicting Thermodynamic Properties with a Novel Semiempirical Topological Descriptor and Path Numbers

Congyi Zhou,*,†,‡ Xi Chu,† and Changming Nie‡

Department of Chemistry, The University of Montana, Missoula, Montana 59812, and School of Chemistry and Chemical Engineering, Nanhua University, Hengyang, People's Republic of China, 421001

Received: January 25, 2007; In Final Form: May 26, 2007

Group electronegativities, which take into account the influence of chemical environment, are calculated. Equilibrium electronegativity is defined on the basis of group electronegativity. Using relative bond lengths and equilibrium electronegativities, we create a novel semiempirical topological descriptor, Nt. A quantitative structure-property relationship (QSPR) model is subsequently developed using Nt together with path numbers P2 and P3 to accurately predict thermodynamic properties of organic compounds. Excellent correlation coefficient values demonstrate the accuracy of the method. The contribution analysis indicates that Nt plays the most important role in the modeling. With the new QSPR model, we are able to predict a wide range of thermodynamic properties of an extensive number of molecules. The predictions are provided in the Supporting Information.

I. Introduction

Quantitative structure-property/activity relationship (QSPR/QSAR) models using descriptors generated from molecular graphs have been widely used to obtain physicochemical and biological properties and the activity of molecules.1-25 The concept of QSPR/QSAR dates back more than a century. In 1884, Mills developed a QSPR model to predict the melting points and boiling points of homologous series.1 Soon afterward, similar pioneering works were carried out on the relationship between oil-water partition coefficients and the potency of local anesthetics2 and on that between chain length and narcosis.3 The first molecular descriptors, the Wiener index4 and the Platt index,5 were proposed in 1947 to model the boiling points (BPs) of hydrocarbons. Hammett6,7 and Taft8-12 later made significant progress in the QSAR methodology. In the 1960s, Hansch and Fujita13 developed models that connect the biological activity of compounds to their hydrophobic, electronic, and steric properties. Free and Wilson14 modeled additive group contributions to the biological activity. QSPR/QSAR analysis using descriptors is now a well-established technique to correlate various properties and the activity of compounds with their molecular structures.15

Among descriptors, graph theoretical descriptors have attracted much recent research interest.16-25 The Wiener index,4 Randic-Kier index,17,18 Hosoya index,19 Balaban index,20 Estrada index,21 and Xu index22 are the best-known topological indexes. In the past, our research group has suggested some topological descriptors.23-25 Here, we propose a novel semiempirical topological descriptor, Nt. Nt encodes information on the molecular mass, structure, and intermolecular interactions, which are crucial ingredients for modeling thermodynamic properties. It is adapted from the traditional distance matrix, and it has equilibrium electronegativities and relative bond lengths as components. It characterizes organic compounds well and has wide applications in the field of QSPR/QSAR.

* Corresponding author. E-mail: [email protected].
† The University of Montana. ‡ Nanhua University.

Figure 1. Plot of group structure.

II. Principle Concepts and Methods

A. Group Electronegativity. The electronegativity of atom A, χA, is a measure of the ability of A to attract electrons when it is in a compound. The electronegativity of group G, χG, can be defined as a weighted average of the electronegativities of its components and calculated using the Pauling scale.33 We first break down a group into k levels, as illustrated in Figure 1. For a two-level group such as =C=O or -CHI2, all of the atoms are equally weighted, so that

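$$\chi_{=\mathrm{CO}} = \tfrac{1}{2}\,[\chi_{\mathrm{C}} + \chi_{\mathrm{O}}] = \tfrac{1}{2}\,[2.55 + 3.44] = 2.9950$$

and

$$\chi_{-\mathrm{CHI_2}} = \tfrac{1}{4}\,[\chi_{\mathrm{C}} + \chi_{\mathrm{H}} + 2\chi_{\mathrm{I}}] = \tfrac{1}{4}\,[2.55 + 2.20 + 5.32] = 2.5175$$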

10.1021/jp070660r CCC: $37.00 © 2007 American Chemical Society Published on Web 08/07/2007

Novel Semiempirical Topological Descriptor

J. Phys. Chem. B, Vol. 111, No. 34, 2007 10175

TABLE 1: Group Electronegativities

For a group with more than two levels, all of the atoms or groups attached to the “anchor atom” are weighted equally. For instance,

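$$\chi_{-\mathrm{CH_2CN}} = \tfrac{1}{4}\,[\chi_{\mathrm{C}} + 2\chi_{\mathrm{H}} + \chi_{-\mathrm{CN}}] = \tfrac{1}{4}\Bigl[\chi_{\mathrm{C}} + 2\chi_{\mathrm{H}} + \tfrac{1}{2}(\chi_{\mathrm{C}} + \chi_{\mathrm{N}})\Bigr] = \tfrac{1}{4}\Bigl[2.55 + 4.40 + \tfrac{1}{2}(2.55 + 3.04)\Bigr] = 2.4363$$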

Representative groups and their electronegativities are listed in Table 1.
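The weighting rule described above is recursive: every substituent on the anchor atom contributes either its atomic value or, if it is itself a group, its own group value, and all contributions are averaged with equal weight. The following Python lines are a minimal sketch of that recursion. The nested-tuple encoding and the name `group_electronegativity` are illustrative assumptions rather than the authors' program, and the iodine value 2.66 is inferred from the 2χI = 5.32 term in the -CHI2 example.

```python
# Pauling electronegativities used in the worked examples of the text.
# The iodine value is inferred from 2*chi_I = 5.32 (assumption).
PAULING = {"H": 2.20, "C": 2.55, "N": 3.04, "O": 3.44, "I": 2.66}


def group_electronegativity(group):
    """Equal-weight average over the anchor atom and its substituents.

    `group` is either an atom symbol (str) or a tuple
    (anchor_symbol, [substituents]); each substituent is again an atom
    symbol or a nested group tuple. This encoding is hypothetical.
    """
    if isinstance(group, str):
        return PAULING[group]
    anchor, substituents = group
    values = [PAULING[anchor]]
    values += [group_electronegativity(s) for s in substituents]
    return sum(values) / len(values)


# Two-level groups: =C=O and -CHI2
chi_carbonyl = group_electronegativity(("C", ["O"]))
chi_chi2 = group_electronegativity(("C", ["H", "I", "I"]))

# Three-level group: -CH2CN, where -CN is itself a two-level group
chi_ch2cn = group_electronegativity(("C", ["H", "H", ("C", ["N"])]))

print(f"{chi_carbonyl:.4f} {chi_chi2:.4f} {chi_ch2cn:.5f}")
```

With this encoding the three worked examples in the text are reproduced (2.9950, 2.5175, and 2.43625, which the text rounds to 2.4363). Ring-containing groups such as -C6H5 would need an encoding that the sketch does not attempt.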

Different definitions of the group electronegativity are available.26-29 In Table 2, we compare our values with those suggested by other authors. Sanderson's26 values are the geometric means of the atomic values of "stability ratio", which are not linearly related to the Pauling scale.27 Nevertheless, they correlate well with our values, and the correlation coefficient is 0.9454. Bratsch's27 values are the harmonic means of the Pauling atomic electronegativities. This definition, however, only considers the numbers and species of atoms that form the group and ignores the group structure, which affects the group's ability to donate or attract charge. As a result, its values for some groups, such as -OCOCH3 and -OPh, deviate significantly from the values of other authors. Inamoto's28 and Han's29 definitions involve a complex way of counting the number of valence electrons on the central atom. Compared to other definitions of group electronegativity, ours has the advantages of considering the group structure, being easy to calculate, and using the widely adopted Pauling unit.

TABLE 2: Group Electronegativities Calculated According to Our Definition and Other Definitions26-29

group      this work  Sanderson26  Bratsch27
-CH3       2.2875     2.63         2.28
-CH2CH3    2.3094     2.64         2.29
-C6H5      2.4333     2.67         2.38
-CHO       2.7300     2.96         2.64
-COOH      2.9467     3.12         2.80
-CF3       3.6225     3.64         3.49
-CCl3      3.0075     3.28         2.98
-CN        2.7950     2.96         2.77
-SiH3      2.1250     2.48         2.12
-SiF3      3.4600     3.42         3.12
-N3        3.0400     3.19         3.04
-NH2       2.4800     2.78         2.42
-NO        3.2400     3.42         3.23
-NO2       3.3067     3.49         3.30
-NF2       3.6667     3.75         3.61
-NCO       3.0100     3.18         2.97
-NCS       2.8025     2.96         2.71
-OPh       2.9367     2.75         2.44
-OCH3      2.8638     2.81         2.44
-PH2       2.1967     2.57         2.20
-OH        2.8200     3.08         2.68
-OCl       3.3000     3.56         3.29
-OClO      3.3700     3.59         3.34
-OClO2     3.3933     3.61         3.37
-OClO3     3.4050     3.62         3.38
-OBrO2     3.3583     3.54         3.31
-OIO2      3.3083     3.41         3.21
-ONO       3.3400     3.49         3.30
-ONO2      3.3734     3.53         3.33
-OCHO      3.0850     3.12         2.80
-OCOCH3    3.0996     2.91         2.56
-OCN       3.1175     3.18         2.97
-SH        2.3900     2.77         2.37
-SCN       2.6875     2.96         2.71

Inamoto and Masuda28 and Han29 columns (entries exist only for some groups; values are given as Inamoto and Masuda/Han pairs in table order, with semicolons separating runs of rows): 2.47/2.45, 2.48/2.46, 2.72/2.74, 2.87/2.83, 2.82/2.73, 2.98/3.09, 2.67/2.63, 3.21/3.11; 2.30/2.34; 2.99/3.00, 3.57/3.42, 3.42/3.88; 3.55/3.35, 3.50/3.33, 3.52/3.48, 3.54/3.45, 2.19/2.35, 3.49/3.44; 3.51/3.48, 3.51/3.49; 2.62/2.55.

B. Equilibrium Electronegativity. For a molecule with an equilibrium structure, we further define the equilibrium electronegativity, χe, of atom i as

$$\chi_{e,i} = \Bigl(\chi_i + \sum_{G} \chi_G\Bigr) \Big/ (1 + n) \qquad (1)$$

where χi is the Pauling electronegativity of atom i, n is the total number of groups attached to i, and χG is the electronegativity of a group G that is directly connected to i. Taking the equilibrium electronegativity into account leads to the successful establishment of the index Nt.

TABLE 3: Relative Bond Lengths (l/lC-C, lC-C = 0.154 nm)31

Figure 2. Molecular description of 2,2-dimethylpentanal.

C. Topological Descriptor, Nt. Let the graph GV,E = {V,E} be a hydrogen-suppressed graph of a molecule with N non-hydrogen atoms.22,30 The set of vertices, V, represents the N non-hydrogen atoms in the molecule, and E represents bonds between pairs of atoms. Such a graph depicts the topology of chemical species. Invariants derived from it can be used to characterize a molecule.22,30 We define the distance matrix L associated with GV,E as a symmetric real positive N-by-N matrix, [lij]N×N, where lij is the sum of bond lengths along the shortest path between vertices i and j. For example, for propane, C1-C2-C3, l12 = l21 = l23 = l32 = 0.154 nm and l13 = l31 = 2 × 0.154 nm. Note that the diagonal matrix elements are zero. We further introduce a relative bond length matrix, L′ = [l′ij]N×N, such that

$$l'_{ij} = l_{ij} / l_{\mathrm{C-C}} \qquad (2)$$

where lC-C is the bond length of a carbon-carbon single bond. Examples of relative bond lengths are given in Table 3. With the addition of L′ and a diagonal equilibrium electronegativity matrix, a revised distance matrix, D1, is created. Note that D1 is a symmetric real positive matrix.
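To make the construction concrete, the sketch below assembles D1 for propane, the molecule used as the distance example in the text. It is only an illustration under stated assumptions: the function names are hypothetical, the shortest-path sums are obtained with a plain Floyd-Warshall pass, and the equilibrium electronegativities of the propane carbons are derived here from eq 1 using the Table 2 group values for -CH3 (2.2875) and -CH2CH3 (2.3094); they are not values quoted by the authors.

```python
import numpy as np

L_CC = 0.154  # nm, C-C single bond length used as the reference in eq 2


def distance_matrix(n_atoms, bonds):
    """Sum of bond lengths along the shortest path for every atom pair."""
    dist = np.full((n_atoms, n_atoms), np.inf)
    np.fill_diagonal(dist, 0.0)
    for i, j, length in bonds:
        dist[i, j] = dist[j, i] = length
    for k in range(n_atoms):  # Floyd-Warshall relaxation through atom k
        dist = np.minimum(dist, dist[:, [k]] + dist[[k], :])
    return dist


def equilibrium_electronegativity(chi_atom, chi_groups):
    """Eq 1: (chi_i + sum of attached group values) / (1 + n)."""
    return (chi_atom + sum(chi_groups)) / (1 + len(chi_groups))


def revised_distance_matrix(n_atoms, bonds, chi_e):
    """Relative bond lengths (eq 2) off the diagonal, chi_e on the diagonal."""
    relative = distance_matrix(n_atoms, bonds) / L_CC
    return relative + np.diag(chi_e)


CHI_C, CHI_H = 2.55, 2.20
CHI_CH3, CHI_C2H5 = 2.2875, 2.3094  # group values taken from Table 2

# Propane, hydrogen-suppressed: C1-C2-C3
bonds = [(0, 1, 0.154), (1, 2, 0.154)]
chi_terminal = equilibrium_electronegativity(CHI_C, [CHI_H] * 3 + [CHI_C2H5])
chi_middle = equilibrium_electronegativity(CHI_C, [CHI_H] * 2 + [CHI_CH3] * 2)

D1_propane = revised_distance_matrix(3, bonds, [chi_terminal, chi_middle, chi_terminal])
print(np.round(D1_propane, 4))
```

Hydrogen atoms are counted as attached groups in eq 1 here, which is the reading that reproduces the diagonal entries of the 2,2-dimethylpentanal example in the text (for instance, n = 1 for the carbonyl oxygen).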

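$$
D_1 = \begin{bmatrix}
\chi_{e,1} & l_{12}/l_{\mathrm{C-C}} & \cdots & l_{1(n-1)}/l_{\mathrm{C-C}} & l_{1n}/l_{\mathrm{C-C}} \\
l_{21}/l_{\mathrm{C-C}} & \chi_{e,2} & \cdots & l_{2(n-1)}/l_{\mathrm{C-C}} & l_{2n}/l_{\mathrm{C-C}} \\
\cdots & \cdots & \cdots & \cdots & \cdots \\
l_{i1}/l_{\mathrm{C-C}} & \cdots & \chi_{e,i} & \cdots & l_{in}/l_{\mathrm{C-C}} \\
\cdots & \cdots & \cdots & \cdots & \cdots \\
l_{n1}/l_{\mathrm{C-C}} & l_{n2}/l_{\mathrm{C-C}} & \cdots & l_{n(n-1)}/l_{\mathrm{C-C}} & \chi_{e,n}
\end{bmatrix}
$$

The new topological descriptor, Nt, is defined as

$$N_t = \log \lambda \, \log N \qquad (3)$$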

where λ is the largest eigenvalue of D1 and N is the number of non-hydrogen atoms. For example, the non-hydrogen skeletal description of 2,2-dimethylpentanal is given in Figure 2 and the corresponding revised distance matrix is

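$$
D_1 = \begin{bmatrix}
2.6375 & 1 & 2 & 3 & 4 & 2 & 2 & 0.7922 \\
1 & 2.4330 & 1 & 2 & 3 & 1 & 1 & 1.7922 \\
2 & 1 & 2.3446 & 1 & 2 & 2 & 2 & 2.7922 \\
3 & 2 & 1 & 2.3182 & 1 & 3 & 3 & 3.7922 \\
4 & 3 & 2 & 1 & 2.2952 & 4 & 4 & 4.7922 \\
2 & 1 & 2 & 3 & 4 & 2.3241 & 2 & 2.7922 \\
2 & 1 & 2 & 3 & 4 & 2 & 2.3241 & 2.7922 \\
0.7922 & 1.7922 & 2.7922 & 3.7922 & 4.7922 & 2.7922 & 2.7922 & 2.9050
\end{bmatrix}
$$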
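The descriptor for this matrix can be checked numerically. The sketch below evaluates eq 3 for the 8-by-8 matrix of 2,2-dimethylpentanal, assuming base-10 logarithms (the base that is consistent with the worked value 1.2898 × log(8) = 1.1648 reported in the text). The function name `topological_descriptor` is illustrative, and NumPy stands in for the authors' Matlab program.

```python
import numpy as np

# Revised distance matrix D1 of 2,2-dimethylpentanal, copied from the text.
D1 = np.array([
    [2.6375, 1, 2, 3, 4, 2, 2, 0.7922],
    [1, 2.4330, 1, 2, 3, 1, 1, 1.7922],
    [2, 1, 2.3446, 1, 2, 2, 2, 2.7922],
    [3, 2, 1, 2.3182, 1, 3, 3, 3.7922],
    [4, 3, 2, 1, 2.2952, 4, 4, 4.7922],
    [2, 1, 2, 3, 4, 2.3241, 2, 2.7922],
    [2, 1, 2, 3, 4, 2, 2.3241, 2.7922],
    [0.7922, 1.7922, 2.7922, 3.7922, 4.7922, 2.7922, 2.7922, 2.9050],
])


def largest_eigenvalue(matrix):
    """Largest eigenvalue of a real symmetric matrix."""
    return float(np.linalg.eigvalsh(matrix).max())


def topological_descriptor(matrix):
    """Eq 3 with base-10 logarithms (assumed): Nt = log(lambda) * log(N)."""
    n_atoms = matrix.shape[0]
    return float(np.log10(largest_eigenvalue(matrix)) * np.log10(n_atoms))


lam = largest_eigenvalue(D1)
nt = topological_descriptor(D1)
print(f"lambda = {lam:.4f}, log10(lambda) = {np.log10(lam):.4f}, Nt = {nt:.4f}")
```

The printed Nt can be compared with the value of 1.1648 quoted for this molecule; small differences in the last digit would only reflect rounding of the tabulated matrix entries.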

For this molecule, Nt = log λ × log N = 1.2898 × log(8) = 1.1648. This example demonstrates the impact of using equilibrium electronegativities: each diagonal element has a different value, which reflects the unique environment of each atom. If the atomic electronegativities were used, the diagonal elements would have been 2.55, 2.55, 2.55, 2.55, 2.55, 2.55, 2.55, and 3.44 instead. When we correlate the Nt values of 12 alkanes, including propane, pentane, hexane, heptane, octane, nonane, decane, undecane, dodecane, tridecane, tetradecane, and pentadecane, with their BPs, the correlation coefficient, R, and standard deviation, S, values are 0.99885 and 4.43621, respectively. By contrast, if we use atomic electronegativities rather than equilibrium electronegativities, the R and S values are 0.99879 and 4.53560, respectively. The inclusion of the equilibrium electronegativity results in a better correlation and a smaller deviation.

Relative bond lengths and electronegativities are equally important in determining the value of λ. The Perron-Frobenius theorem32 relates the largest eigenvalue of a real positive N-by-N matrix to the sums of the matrix elements in each row: the eigenvalue lies between the minimum and the maximum of the row sums. Matrix elements of D1 contain relative bond lengths and electronegativities; log λ is thus related to their sum. They are both dimensionless, although they describe different properties. Pauling electronegativities are in the range from 0.70 to 3.98,33 whereas relative bond lengths run from less than 1 to as large as a molecule can be. Therefore, relative bond lengths carry more numerical weight for larger molecules than for smaller molecules. The majority of the molecules involved in our extensive study have more than four non-hydrogen atoms. One should be cautious about applying our models to very small (diatomic or triatomic) molecules that contain atoms with very large electronegativities. Such molecules are outside the range of our study. At the molecular level, thermodynamic properties are related to the molecular mass, molecular structure, and intermolecular interactions.

TABLE 4: Models for Boiling Points and Their Statistical Results (property = a1Nt + a2P2 + a3P3 + a4)

no.  compound                a1        a2        a3       a4        R       F      S       N
1    alkanes                 139.1756  -1.6177   3.6202   -63.8401  0.9969  7394   4.3148  143
2    hydrocarbons            233.1561  -11.9346  0        -96.1359  0.9994  16574  4.0863  46
3    aldehydes and ketones   149.2880  -2.4017   0.3006   -21.3744  0.9975  6252   3.8491  98
4    mercaptans              195.6176  -6.5953   -1.9678  -20.9997  0.9993  5617   3.3825  26
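The Perron-Frobenius bracket used in the discussion of λ, and the growing numerical weight of the relative bond lengths in larger molecules, can be illustrated on idealized matrices. The sketch below is not taken from the paper: it builds a hypothetical unbranched chain of N identical atoms whose bonds all have relative length 1 and whose diagonal entries are all set to an assumed equilibrium electronegativity of 2.30, then compares the largest eigenvalue with the row-sum bounds and reports the share of the matrix sum that comes from the diagonal.

```python
import numpy as np


def chain_matrix(n_atoms, chi=2.30):
    """Idealized D1 for an unbranched chain (illustrative assumption).

    Off-diagonal entries are |i - j| (relative path lengths with unit
    bonds); every diagonal entry is the same assumed chi value.
    """
    idx = np.arange(n_atoms)
    relative = np.abs(idx[:, None] - idx[None, :]).astype(float)
    return relative + chi * np.eye(n_atoms)


def row_sum_bounds(matrix):
    """Minimum and maximum row sums, which bracket the largest eigenvalue."""
    sums = matrix.sum(axis=1)
    return float(sums.min()), float(sums.max())


def diagonal_share(matrix):
    """Fraction of the total matrix sum contributed by the diagonal."""
    return float(np.trace(matrix) / matrix.sum())


results = []
for n_atoms in range(3, 11):
    m = chain_matrix(n_atoms)
    lam = float(np.linalg.eigvalsh(m).max())
    low, high = row_sum_bounds(m)
    results.append((n_atoms, low, lam, high, diagonal_share(m)))

for n_atoms, low, lam, high, share in results:
    print(f"N={n_atoms:2d}  {low:6.2f} <= {lam:6.2f} <= {high:6.2f}  "
          f"diagonal share = {share:.3f}")
```

In this toy setting the diagonal (electronegativity) share falls steadily as the chain grows, which is the numerical effect the text describes for larger molecules; real molecules with heteroatoms and branching would of course give different numbers.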
Thermodynamic properties such as the entropy and the heat capacity are related to the translational, rotational, and vibrational energies of a molecule at a given temperature and pressure, which are determined by the molecular structure. Intermolecular interactions are crucial for phase transitions. Information on the mass is mainly included in the log N factor of Nt, whereas log λ contains information on the molecular structure and intermolecular interactions. The structure of a molecule and the corresponding translational, rotational, and vibrational energies can be uniquely determined if the atomic species, bond lengths, bond angles, and torsion angles are specified. At a semiempirical level, our D1 matrix contains the essential topological information regarding the bonds, and the electronegativities correlate with the atomic species. The dipole moment of a molecule is the most important factor that determines its intermolecular interactions. The polarizability and hyperpolarizability play a less important role. The electronegativity measures the ability of the atom to attract charge. It is related to the charge on the atomic centers, and together with bond lengths, it contains the information on the dipole moment. The ability of different atomic centers to attract or lose charge in a molecule and how these centers are distributed are also related to the polarizabilities of a molecule. The largest eigenvalue of the revised matrix therefore contains the essential ingredients of molecular structure and intermolecular interactions for modeling thermodynamic properties.

TABLE 5: Statistical Results for the Leave-One-Out Cross-Validations and the Predictions of BP (°C)

                         cross-validation of the training set   prediction of the test set
compound                 R       N    rms    MRE (%)            R       N    rms    MRE (%)
alkanes                  0.9965  96   5.15   3.59               0.9978  47   3.30   2.38
hydrocarbons             0.9992  35   3.92   2.07               0.9991  11   6.65   4.88
aldehydes and ketones    0.9977  70   4.05   2.45               0.9969  28   3.91   2.55
mercaptans               0.9994  19   3.84   2.59               1.0000  7    5.81   3.58

D. Path Numbers P2 and P3. The idea of path number was initially proposed by Wiener,4 and here, we use Pm to denote the total number of m-bond paths between all pairs of non-hydrogen atoms in a molecule. According to Gordon and Scantlebury,34 P2 characterizes the degree of branching of the molecule. Wiener used P3 in his modeling,4 and it reflects the shape of a molecule. For 2,2-dimethyl-3-pentanone, for example, P2 and P3 are 10 and 8, respectively.

E. Computational Details. Values of Nt are calculated with a program developed in our lab using Matlab 7.0 (MathWorks, Inc.). Pm is obtained by using Hyperchem 7.5 (Hypercube, Inc.) and Dragon 5.4 (Talete srl, Milano, Italy). We use Excel (Microsoft Corp.), SPSS 13.0 (SPSS, Inc.), and Origin 7.0 (OriginLab) to perform statistical work and data analysis.

III. Results and Discussion

A. Model Development. Nt encodes information on the molecular mass, structure, and intermolecular interactions. P2 and P3 contain information on the shape of a molecule. Here, we build a multiple linear regression (MLR)35,36 model for the property of a molecule:

$$\text{property} = a_1 N_t + a_2 P_2 + a_3 P_3 + a_4 \qquad (4)$$

where a1, a2, and a3 are the contribution coefficients of Nt, P2, and P3, respectively, and a4 is a constant. Models are built so as to maximize the correlation coefficient R and minimize the standard error S. R, S, and the Fisher value, F, can be used to assess the quality of the models.

B. Modeling BPs. In the chemical and petrochemical industry, normal BPs are important for process engineering.37 To test our method, we build QSPR models of BP using eq 4. Values of a1, a2, a3, and a4 are listed in Table 4 together with statistical results for quality assessment. Experimental data for 143 alkanes and hydrocarbons are from ref 37; BP values of aldehydes and ketones are from refs 38-42; BP values of mercaptans are from ref 30. To validate the models in Table 4, the sample is divided into a training set and a test set by random sampling. For the training set, the leave-one-out method is used to perform a cross-validation. Each time, one compound is left out of the training set and the model based on the others is used to predict the property of this particular compound. Properties of the test set are predicted by models constructed using the entire training set. The quality of the models can be assessed by values of R, the root-mean-square (rms) error, and the mean relative error (MRE). Results of such validations are shown in Table 5. In addition, the experimental values, the calculated values with models before cross-validation, and the predicted values with models after cross-validation agree well with each other (see the tables in the Supporting Information). Mihalic and Trinajstic43 suggested that a good QSPR model for BP must have R > 0.99 and S < 5.0 °C. The statistical results in Tables 4 and 5 demonstrate the validity and quality of our models.

C. Modeling Other Thermodynamic Properties. Thermodynamic properties are important for chemical engineering designs.37 For example, in order to design a heat exchanger for vaporizing liquids, one must know their vaporization enthalpy, ∆Hv. The heat capacity, Cp, is also important for a design of similar purpose.37 In Supporting Information Table 5, we list the calculated values of ∆Hv for 142 alkanes in comparison with experimental values. The Nt, P2, and P3 values are also included in the table. In addition, the standard enthalpy of formation, ∆fHm° (Supporting Information Table 6); standard entropy, Sm° (Supporting Information Table 7); standard free energy of formation, ∆fGm° (Supporting Information Table 8); sublimation enthalpy, ∆Hs (Supporting Information Table 9); and van der Waals constants a and b (Supporting Information Table 10) are included. Cp values for liquids (Cp°, Supporting Information Table 11) and at 400, 600, 800, and 1000 K (Supporting Information Table 12) are tabulated. All of the experimental values are taken from refs 38 and 44. Model coefficients and statistical results are listed in Table 6. For all of these properties except Cp° (R = 0.9863), the values of R are larger than 0.99, which demonstrates the quality of our models.

TABLE 6: Models and Statistical Results for Various Thermodynamic Properties (property = a1Nt + a2P2 + a3P3 + a4)

no.  property                      a1         a2       a3       a4        R       F     S        N
1    ∆Hv (kJ·mol-1)                16.2318    -0.3945  0.2859   14.9947   0.9973  8598  0.4098   142
2    -∆fHm° (kJ·mol-1)             -114.7625  -5.4357  1.8803   -49.5929  0.9961  3274  5.4634   82
3    Sm° (kJ·mol-1·K-1)            306.0837   -7.7361  1.8673   135.7128  0.9942  2234  13.4510  82
4    ∆fGm° (kJ·mol-1)              44.7826    0.4594   1.7783   -52.6715  0.9908  1376  3.5948   81
5    ∆Hs (kJ·mol-1)                36.8085    -0.6742  0.2823   -0.3556   0.9958  1609  1.8873   45
6    a (L2·bar·mol-2)              40.2316    -0.2046  0.0649   -10.6189  0.9924  864   1.0498   44
7    b (L·mol-1)                   0.2005     -0.0005  -0.0018  0.0029    0.9965  1908  0.0033   44
8    Cp° (J·K-1·mol-1)             197.4804   -0.9212  0.8589   15.5910   0.9863  879   12.2069  78
9    Cp(400 K) (J·K-1·mol-1)       156.4124   4.7770   1.4750   16.5403   0.9973  1611  10.4965  30
10   Cp(600 K) (J·K-1·mol-1)       212.6456   7.3363   1.5227   18.0192   0.9956  984   18.3858  30
11   Cp(800 K) (J·K-1·mol-1)       242.0185   8.0191   2.9790   34.1285   0.9972  1566  16.9304  30
12   Cp(1000 K) (J·K-1·mol-1)      280.5307   8.1678   2.6064   36.0521   0.9975  1693  18.2261  30

D. Modeling Critical Properties. The critical properties we study include the acentric factor, ω; critical compressibility factor, Zc; critical temperature, Tc; and critical volume, Vc. These are important physical properties of organic and inorganic compounds. Experimentally determining their values, however, can be time-consuming and expensive. QSPR, on the other hand, provides a fast and inexpensive way to determine these values. We take the experimental values listed in ref 37 to construct QSPR models. Details of our results are listed in Supporting Information Tables 13 and 14. Our calculation agrees well with experimental values. Model coefficients and statistical results are presented in Table 7, which demonstrates the quality and predictability of our models.

TABLE 7: Models and Statistical Results for Critical Properties (property = a1Nt + a2P2 + a3P3 + a4)

no.  property          a1        a2       a3       a4        R       F     S        N
1    ω                 0.4186    -0.0159  -0.0030  -0.0061   0.9945  4162  0.0117   142
2    Zc                -0.0280   0.0002   -0.0012  0.2965    0.9138  233   0.0057   142
3    Tc (K)            105.4943  -0.0152  6.7504   387.5881  0.9830  1306  10.3264  141
4    Vc (cm3·mol-1)    426.9150  0.4381   -6.9434  -3.8565   0.9853  1515  23.6766  141

TABLE 8: Contribution Analysis of the Models

property                     Nt: Ψr (Ψf)       P2: Ψr (Ψf)       P3: Ψr (Ψf)
BP(alkanes)                  191.03 (79.97%)   15.85 (6.67%)     30.53 (12.78%)
BP(hydrocarbons)             320.15 (73.14%)   117.03 (26.74%)   0.00 (0.00%)
BP(aldehydes and ketones)    204.91 (88.27%)   23.53 (10.14%)    2.54 (1.09%)
BP(mercaptans)               268.5 (76.67%)    64.62 (18.45%)    16.60 (4.74%)
∆Hv                          22.28 (77.62%)    3.86 (13.45%)     2.41 (8.40%)
-∆fHm°                       67.10 (69.98%)    21.80 (22.74%)    6.23 (6.50%)
Sm°                          178.96 (81.82%)   31.03 (14.19%)    6.19 (2.83%)
∆fGm°                        58.65 (75.74%)    4.13 (5.33%)      13.24 (17.10%)
∆Hs                          46.35 (79.01%)    5.44 (9.27%)      6.38 (10.88%)
a                            45.22 (94.26%)    1.63 (3.40%)      0.40 (0.83%)
b                            0.2253 (93.10%)   0.0040 (1.65%)    0.0110 (4.55%)
Cp°                          245.61 (92.06%)   7.89 (2.96%)      6.05 (2.27%)
Cp(400 K)                    203.23 (80.88%)   37.58 (14.96%)    9.10 (3.62%)
Cp(600 K)                    276.30 (79.75%)   57.71 (16.66%)    9.39 (2.71%)
Cp(800 K)                    314.47 (78.98%)   63.08 (15.84%)    18.37 (4.61%)
Cp(1000 K)                   364.51 (81.53%)   64.25 (14.37%)    16.07 (3.59%)
ω                            0.5747 (75.21%)   0.1556 (20.36%)   0.0254 (3.32%)
Zc                           0.0384 (63.49%)   0.0020 (3.30%)    0.0101 (16.7%)
Tc                           144.89 (69.34%)   0.1488 (0.07%)    56.88 (27.22%)
Vc                           586.33 (87.69%)   4.29 (0.64%)      58.50 (8.75%)

E. Contribution of Different Indexes to the Models. In order to investigate the relative importance of individual indexes, we calculate the relative and fraction contribution of each index according to45

$$\Psi_r^{\,i} = a_i \bar{T}_i \qquad (5)$$

$$\Psi_f^{\,i} = \frac{R^2\,|\Psi_r^{\,i}|}{\sum_i |\Psi_r^{\,i}|} \times 100\% \qquad (6)$$

where Ψr, Ψf, and T̄i are, respectively, the relative contribution, fraction contribution, and average value of the ith topological index. The square of the correlation coefficient, R2, is the coefficient of determination. The sum is over all of the indexes in the model. Results of the contribution analysis are summarized in Table 8. Table 8 shows that Nt plays the most important role in our modeling. The contribution of Nt ranges from 73.14 to 88.27% for BP, from 63.49 to 87.69% for critical properties, and from 69.98 to 94.26% for other properties. Nt correlates well with thermodynamic properties. If we use Nt alone to model the BPs of a large number of alkanes, hydrocarbons, aldehydes and ketones, and mercaptans, the R values are 0.9910, 0.9979, 0.9960, and 0.9972, respectively. The contributions of P2 and P3 are less significant, although the presence of P2 and P3 improves the quality of the modeling.

IV. Conclusion

QSPR models are constructed with a novel semiempirical topological index, Nt, together with path numbers P2 and P3. Nt is built on the basis of equilibrium electronegativity and relative bond length. Values of the boiling points, critical properties, and other thermodynamic properties predicted by these models agree well with experimental measurements. Contribution analysis shows that index Nt plays the most important role in the modeling. With the addition of the path numbers, the quality of the models is further improved, which implies that the combination of electronegativity, bond length, degree of branching, and size encodes inherent chemical information of molecules.

Acknowledgment. We acknowledge the support from the National Science Foundation Grant No. EPS-0346458 and the Office of the Vice President for Research and Development of the University of Montana.
Supporting Information Available: Tables showing topological descriptor values and boiling point values for alkanes, hydrocarbons, aldehydes, ketones, and mercaptans and vaporization enthalpy, standard enthalpy of formation, standard entropy, standard free enthalpy of formation, sublimation enthalpy, van der Waals constant, heat capacity, acentric factor, critical compressibility factor, and critical property values for alkanes. This material is available free of charge via the Internet at http://pubs.acs.org.

References and Notes

(1) Mills, E. Philos. Mag. 1884, 17, 173.
(2) Meyer, H. Arch. Exp. Pathol. Pharmakol. 1899, 42, 109.
(3) Overton, E. Studien über die Narkose zugleich ein Beitrag zur allgemeinen Pharmakologie; Verlag von Gustav Fischer: Jena, Germany, 1901.
(4) Wiener, H. J. Am. Chem. Soc. 1947, 69, 17.
(5) Platt, J. R. J. Chem. Phys. 1947, 15, 419.
(6) Hammett, L. P. Chem. Rev. 1935, 17, 125.
(7) Hammett, L. P. Physical Organic Chemistry; McGraw-Hill: New York, 1940.
(8) Taft, R. W. J. Am. Chem. Soc. 1952, 74, 2729.
(9) Taft, R. W. J. Am. Chem. Soc. 1952, 74, 3120.
(10) Taft, R. W. J. Am. Chem. Soc. 1953, 75, 4231.
(11) Kamlet, M. J.; Taft, R. W. J. Am. Chem. Soc. 1976, 98, 737.
(12) Taft, R. W. J. Am. Chem. Soc. 1953, 75, 4538.
(13) Hansch, C.; Fujita, T. J. Am. Chem. Soc. 1964, 86, 1616.
(14) Free, S. M.; Wilson, J. W. J. Med. Chem. 1964, 7, 395.
(15) Alan, R. K.; Dan, C. F. Energy Fuels 2005, 19, 922.
(16) Matamala, A. R.; Estrada, E. J. Phys. Chem. A 2005, 109, 9890.
(17) Randic, M. J. Am. Chem. Soc. 1975, 97, 6609.
(18) Kier, L. B.; Hall, H. Molecular Connectivity in Chemistry and Drug Research; Academic Press: New York, 1976.
(19) Hosoya, H. Bull. Chem. Soc. Jpn. 1971, 9, 2332.
(20) Balaban, A. T. Chem. Phys. Lett. 1982, 89, 399.
(21) Estrada, E. J. Chem. Inf. Comput. Sci. 1995, 1, 31.
(22) Ren, B. Y. J. Chem. Inf. Comput. Sci. 1999, 1, 139.
(23) Zhou, C. Y.; Nie, C. M.; Li, S.; et al. J. Comput. Chem., in press.
(24) Zhou, C. Y.; Nie, C. M.; Li, S.; et al. Chin. J. Inorg. Chem. 2007, 1, 25.
(25) Zhou, C. Y.; Nie, C. M. Bull. Chem. Soc. Jpn. 2007, 8, in press.
(26) Sanderson, R. T. Polar Covalence; Academic Press: New York, 1983.
(27) Bratsch, S. G. J. Chem. Educ. 1985, 62, 101.
(28) Inamoto, N.; Masuda, S. Chem. Lett. 1982, 7, 1003.
(29) Han, R. C. Acta Chim. Sin. 1990, 48, 627.
(30) Ren, B. Y. J. Chem. Inf. Comput. Sci. 2003, 43, 161.
(31) Dean, J. A. Lange's Handbook of Chemistry, 15th ed.; Science Press: Beijing, 2003 (Wei Jun-fa et al., transl.).
(32) Horn, R. A.; Johnson, C. R. Matrix Analysis; Cambridge University Press: London, 1990.
(33) Huheey, J. E. Inorganic Chemistry: Principles of Structure and Reactivity, 2nd ed.; Harper and Row: New York, 1978.
(34) Gordon, M.; Scantlebury, G. R. Trans. Faraday Soc. 1964, 60, 604.
(35) Ling, Y. Z.; Yu, R. Q. Chemometrics; Higher Education Press: Beijing, 2003.
(36) Liu, S. S.; Cao, C. Z.; Li, Z. L. J. Chem. Inf. Comput. Sci. 1998, 38, 387.
(37) Yaws, C. L. Chemical Properties Handbook; McGraw-Hill: Beijing, 1999.
(38) Dean, J. A. Lange's Handbook of Chemistry, 15th ed.; McGraw-Hill: Beijing, 1999.
(39) Weast, R. CRC Handbook of Chemistry and Physics, 70th ed.; CRC Press: Boca Raton, FL, 1989-1990.
(40) Lide, D. R.; Milne, G. W. A. Handbook of Data on Common Organic Compounds; CRC Press: Boca Raton, FL, 1992.
(41) Dictionary of Organic Chemistry, 6th ed.; Chapman & Hall: London, 1996.
(42) Huang, F.; Liu, X. Aldehydes. Encyclopedia of Chemical Industry; Chemical Industry Press: Beijing, 1997; Vol. 13.
(43) Mihalic, M.; Trinajstic, N. J. Chem. Educ. 1992, 69, 701.
(44) Yao, R. B. Handbook of Physic-Chemistry; Scientific and Technologic Press of Shanghai: Shanghai, 1985.
(45) Needham, D. E.; Wei, I. C.; Seybold, P. G. J. Am. Chem. Soc. 1988, 110, 4186.