Quantitative Structure−Property Relationship Prediction of Gas Heat Capacity for Organic Compounds

Sep 14, 2012 - Department of Chemical Engineering, Amirkabir University of Technology (Tehran Polytechnic), Hafez Avenue, 15914 Tehran, Iran...
Article pubs.acs.org/IECR

Quantitative Structure−Property Relationship Prediction of Gas Heat Capacity for Organic Compounds
Aboozar Khajeh and Hamid Modarress*
Department of Chemical Engineering, Amirkabir University of Technology (Tehran Polytechnic), Hafez Avenue, 15914 Tehran, Iran
Supporting Information

ABSTRACT: In the present work, a quantitative structure−property relationship study is performed to predict the gas heat capacity of a structurally wide variety of organic compounds using the genetic function approximation (GFA) and the adaptive neuro-fuzzy inference system (ANFIS) methods. The proposed models are simple: they contain only three 3D-independent descriptors, calculated solely from the molecular structure of the compounds. The models were validated by an external prediction set. Good results were obtained from both models, with squared correlation coefficients of 0.996 and 0.997 for GFA and ANFIS, respectively. This study reveals enhanced correlations of the heat capacity of gases with their molecular structures, wherein the influence of the size of molecules is found to predominate.

1. INTRODUCTION
Heat capacity is among the most important thermophysical properties of a compound; it is defined as the amount of heat required to change a substance’s temperature by a given amount. Sound speed measurements and flow calorimetric techniques offer two convenient and accurate routes for determining the heat capacity of gases.1 Although many investigators have reported data for the gas heat capacity of many common organic chemicals, theoretical approaches are needed to predict the gas heat capacity of new compounds without expensive and complicated experiments. To this end, several models based on equations of state,2 group contribution methods,3,4 quantum mechanical methods,5 and statistical mechanical formulas6 have been extended to the prediction of gas heat capacity. The quantitative structure−property relationship (QSPR) approach is an efficient tool for correlating and predicting diverse physicochemical properties of compounds from their structures alone, without relying on any experimental properties. Once a correlation between structure and the desired property is found in a QSPR model, it becomes a valuable aid in the development and application of new molecules. QSPR methodology has been applied to predict the gas heat capacity of alkanes.7−12 Moreover, the gas-phase heat capacity has been predicted by the QSPR method for several data sets of organic compounds, including aliphatic aldehydes and ketones,13,14 monosaccharides,15 and methyl benzene derivatives.16 Xue et al.17 utilized linear and nonlinear methods to develop a QSPR model for the prediction of the gas heat capacity of a diverse set of 182 compounds.
In QSPR studies, after the calculation of the molecular descriptors, linear methods such as multiple linear regression (MLR)18,19 and partial least-squares analysis (PLS),20,21 or nonlinear methods such as the multilayer perceptron (MLP) neural network,22,23 the radial basis function (RBF) neural network,24 and the support vector machine (SVM),25 can be used to establish a correlation between the structural descriptors and the desired property.
© 2012 American Chemical Society

In addition to the above-mentioned methods, genetic function approximation (GFA) and the adaptive neuro-fuzzy inference system (ANFIS) have proved to be simple and powerful tools for developing QSPR models.26−31 GFA is a computational method that combines a genetic algorithm with statistical modeling tools to produce a population of statistically valid regression equations that best fit the response data. ANFIS is a fuzzy inference system (FIS) implemented in the framework of adaptive neural networks; it combines the qualitative approximation of fuzzy logic with the learning capability of adaptive neural networks. It has been successfully applied to modeling complex nonlinear systems.32,33 In the present investigation, accurate theoretical QSPR models were obtained for the prediction of the gas heat capacity of organic compounds by using a large data set of diverse molecules. The GFA and ANFIS methods were utilized to establish quantitative linear and nonlinear relationships for the prediction of the heat capacity of gases based on only three 3D-independent descriptors. In addition, an analysis was performed to see which structural features or groups are important to the gas heat capacity of organic compounds.

Received: May 20, 2012. Revised: August 27, 2012. Accepted: September 14, 2012. Published: September 14, 2012.
Industrial & Engineering Chemistry Research, Article. dx.doi.org/10.1021/ie301317f | Ind. Eng. Chem. Res. 2012, 51, 13490−13495

2. MATERIALS AND METHODS

2.1. Data Set. The gas heat capacities (J/(mol K)) of 1174 organic compounds at 25 °C were collected from Yaws’ Handbook.34 The set contains compounds of very different chemical structures, such as alkanes, cycloalkanes, alkenes, cycloalkenes, alkynes, acids, alcohols, aldehydes, ketones, amines, halides, ethers, esters, and aromatics. For modeling purposes, the data set was divided into two parts: (1) a training set of 940 compounds and (2) a validation set of 234 compounds. The names of the compounds used in this study, together with their heat capacity values, are listed in the Supporting Information.

2.2. Molecular Descriptors. For each molecule, more than 1000 theoretical descriptors of various types were calculated using the Dragon Web version developed by the Milano Chemometrics and QSAR research group.35 These descriptors are classified as (a) 0D-constitutional (atom and group counts); (b) 1D-functional groups and atom centered fragments; (c) 2D-topological, counts, autocorrelations, connectivity indices, information indices, topological indices, and eigenvalue-based indices; and (d) 3D-geometrical, WHIM, and GETAWAY descriptors.

2.3. Genetic Function Approximation. The genetic function approximation (GFA) is a genetics-based variable selection algorithm developed by Rogers and Hopfinger;36 it combines Friedman’s multivariate adaptive regression splines (MARS) algorithm37 with Holland’s genetic algorithm.38 GFA evolves a population of equations that correlate best with the responses. The major advantages of this approach are that it produces a population of models rather than a single model, estimates the most appropriate number of features, resists overfitting, and allows control over the smoothness of fit. GFA works in the following way:
(a) a particular number of equations (e.g., 100) are generated by a random choice of descriptors;
(b) pairs of parents are selected from the present population of equations, with probabilities proportional to their fitness;
(c) crossovers are performed at randomly chosen points within the equations to generate progeny equations combining the characteristics of both parents;
(d) the goodness of each progeny equation is assessed by various scores such as R-square, adjusted R-square, and Friedman’s lack of fit (LOF);
(e) the new progeny equation with better fitness is preserved.
In this work the LOF score is used, defined by the following equation:

$$\mathrm{LOF} = \frac{\mathrm{SSE}}{\left(1 - \dfrac{c + dp}{n}\right)^{2}} \tag{1}$$

where SSE is the sum of squares of errors, c is the number of basis functions (other than the constant term), d is a user-defined smoothness factor, p is the number of features in the model, and n is the number of data points from which the model is built.

2.4. Adaptive Neuro-Fuzzy Inference System (ANFIS). The adaptive neuro-fuzzy inference system (ANFIS), first introduced by Jang in 1993,39 is a class of adaptive networks that is functionally equivalent to a fuzzy inference system. A fuzzy inference system includes four steps: (1) fuzzification of the input variables, (2) evaluation of the output for each rule, (3) aggregation of the rules’ outputs, and (4) defuzzification, which can be done by different approaches. By using a hybrid learning procedure, ANFIS can determine the fuzzy inference parameters and construct an input−output mapping based on a collection of input−output data. ANFIS is based on the Takagi−Sugeno−Kang (TSK)40 inference model. In a TSK model with M fuzzy if−then rules, each having p antecedents, the ith rule can be expressed as follows.

Rule i: If $x_1$ is $F_1^i$ and ... and $x_p$ is $F_p^i$, then

$$y^i(X) = c_0^i + c_1^i x_1 + c_2^i x_2 + \dots + c_p^i x_p = C^i X \tag{2}$$

where i = 1, 2, ..., M, $c_k^i$ (k = 0, 1, ..., p) are the consequent parameters, $y^i(X)$ is the output of the ith rule, and $F_k^i$ (k = 1, 2, ..., p) are fuzzy sets. The overall output, y(X), of the model is obtained by combining the outputs from the M rules in the following prescribed way:

$$y(X) = \frac{\sum_{i=1}^{M} f^i(X)\, y^i(X)}{\sum_{i=1}^{M} f^i(X)} = \frac{\sum_{i=1}^{M} f^i(X)\left(c_0^i + c_1^i x_1 + \dots + c_p^i x_p\right)}{\sum_{i=1}^{M} f^i(X)} \tag{3}$$

where the $f^i(X)$ are the rule firing levels (strengths), defined as

$$f^i(X) = T_{k=1}^{p}\, \mu_{F_k^i}(x_k) \tag{4}$$

in which T denotes a t-norm, usually minimum or product. The ANFIS structure contains five layers, described below for a system with two inputs, x and y, and two rules.

In the first layer, all the nodes are adjustable nodes. They generate the fuzzy membership grades of the inputs, and the outputs of this layer are given by

$$O_{1,i} = \mu_{A_i}(x), \quad i = 1, 2 \tag{5}$$

$$O_{1,i} = \mu_{B_{i-2}}(y), \quad i = 3, 4 \tag{6}$$

where $\mu_{A_i}(x)$ and $\mu_{B_{i-2}}(y)$ can adopt any fuzzy membership function. For example, if the generalized bell-shaped membership function is employed, $\mu_{A_i}(x)$ is given by

$$\mu_{A_i}(x) = \frac{1}{1 + \left[\left(\dfrac{x - c_i}{a_i}\right)^{2}\right]^{b_i}} \tag{7}$$

where $a_i$, $b_i$, and $c_i$ are the parameters of the membership function, which govern the shape of the bell-shaped function.

The second layer, consisting of fixed nodes, represents the t-norm operators that combine the input membership grades in order to compute the firing strength of each rule. The outputs of this layer are given by

$$O_{2,i} = w_i = \mu_{A_i}(x)\, \mu_{B_i}(y), \quad i = 1, 2 \tag{8}$$

which are the so-called firing strengths of the rules.

The third layer implements a normalization function, and the outputs of this layer can be represented as

$$O_{3,i} = \bar{w}_i = \frac{w_i}{w_1 + w_2}, \quad i = 1, 2 \tag{9}$$

In the fourth layer, the nodes are adjustable nodes and every node i has the following function:

$$O_{4,i} = \bar{w}_i f_i = \bar{w}_i \left(p_i x + q_i y + r_i\right), \quad i = 1, 2 \tag{10}$$

where $\bar{w}_i$ is the output of layer 3 and $\{p_i, q_i, r_i\}$ is the parameter set.

The fifth layer represents the aggregation of the outputs, performed by weighted summation. The output is computed as

$$O_{5,i} = \sum_i \bar{w}_i f_i = \frac{\sum_i w_i f_i}{w_1 + w_2} \tag{11}$$
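The five layers of eqs 5−11 amount to a single forward pass through the network. The following is a minimal sketch of that pass for two inputs and two rules, assuming the product t-norm and the bell-shaped membership function of eq 7. It is an illustration only, not the Matlab implementation used in this work; the names `Bell` and `anfis_forward` and all parameter values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Bell:
    """Bell-shaped membership function of eq 7 (parameters a, b, c)."""
    a: float  # width
    b: float  # slope
    c: float  # center

    def __call__(self, x: float) -> float:
        return 1.0 / (1.0 + (((x - self.c) / self.a) ** 2) ** self.b)


def anfis_forward(x, y, mf_a, mf_b, consequents):
    """Forward pass of a two-input, two-rule first-order Sugeno ANFIS.

    mf_a, mf_b  : membership functions [A1, A2] for x and [B1, B2] for y
    consequents : [(p1, q1, r1), (p2, q2, r2)], the layer-4 parameters
    Returns the network output and the normalized firing strengths.
    """
    # Layer 1: membership grades of the inputs (eqs 5 and 6)
    mu_a = [mf(x) for mf in mf_a]
    mu_b = [mf(y) for mf in mf_b]
    # Layer 2: firing strengths, product t-norm (eq 8)
    w = [ma * mb for ma, mb in zip(mu_a, mu_b)]
    # Layer 3: normalized firing strengths (eq 9)
    total = sum(w)
    w_bar = [wi / total for wi in w]
    # Layer 4: weighted first-order rule outputs (eq 10)
    f = [p * x + q * y + r for p, q, r in consequents]
    o4 = [wb * fi for wb, fi in zip(w_bar, f)]
    # Layer 5: weighted summation gives the overall output (eq 11)
    return sum(o4), w_bar


# Usage with made-up (hypothetical) premise and consequent parameters.
mf_x = [Bell(2.0, 2.0, 0.0), Bell(2.0, 2.0, 5.0)]
mf_y = [Bell(1.0, 1.0, 0.0), Bell(1.0, 1.0, 3.0)]
output, weights = anfis_forward(1.0, 2.0, mf_x, mf_y,
                                [(1.0, 1.0, 0.0), (2.0, -1.0, 1.0)])
print(output, weights)
```

In this sketch the `Bell` parameters play the role of the premise parameters and the `(p, q, r)` tuples the role of the consequent parameters; training would adjust both sets, which the sketch does not attempt.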

ANFIS uses a hybrid learning algorithm in order to train the network according to input−output data pairs. A hybrid algorithm is divided into a forward pass and a backward pass.


The forward pass of the learning algorithm stops at the nodes in layer four, and the consequent parameters are identified by the least-squares method. In the backward pass, the error signals propagate backward and the premise parameters are updated by gradient descent. This hybrid algorithm has been shown to be highly efficient in training the ANFIS.39

3. RESULTS AND DISCUSSION

In this work the best combination of descriptors was determined by using the GFA method, and then the most significant linear QSPR model was produced. To reach the optimal model complexity, the number of descriptors was chosen as three, because adding more descriptors did not appreciably improve the model: the increase in the R2 value was less than 0.01. The smoothing parameter d, which controls the number of terms in the model, was set at its default value of 1.0; altering d has no effect on the result, however, because the number of descriptors in the model is fixed at 3. Other variables of the GFA approach that mainly affect the convergence rate, such as population size and mutation probability, were set to the default values in the Materials Studio software of Accelrys.41 The GFA method identified three descriptors that significantly influence the heat capacity of gases. On the basis of the training data set, the following linear equation was derived by using the GFA method (10000 iterations, LOF score, 50 population size, 50% mutation probability):

$$C_p = -2.31 + 9.34 \times (\mathrm{S1K}) + 4.50 \times (\mathrm{nAT}) + 2.32 \times (\mathrm{nCb\text{-}}) \tag{12}$$

S1K belongs to the Kier alpha-modified shape descriptors; it represents paths of order 1 and encodes information about the count of atoms and the relative cyclicity of molecules.42 This descriptor increases with the size of molecules and decreases with their cyclicity. nAT is the number of atoms in a molecule. The nCb- descriptor is a functional group count, which represents the number of substituted benzene C(sp2) groups in a substance. All three descriptors are 3D-independent: they reflect the molecular composition of a compound without any information about its molecular geometry, and they can be easily calculated and used in QSPR studies. The molecular descriptors and their physical meanings are presented in Table 1.

Table 1. The Three Molecular Descriptors Used in eq 12

| ID | molecular descriptor | type | definition |
|----|----------------------|------|------------|
| 1 | S1K | topological descriptors | 1-path Kier alpha-modified shape index |
| 2 | nAT | constitutional descriptors | number of atoms |
| 3 | nCb- | functional group counts | number of substituted benzene C(sp2) |

The standardization of the regression coefficients (shown in parentheses below) in eq 13 enables an assignment of more importance to the variables of the model exhibiting larger absolute standardized coefficients.

$$\mathrm{nAT}\ (0.615) > \mathrm{S1K}\ (0.405) > \mathrm{nCb\text{-}}\ (0.0168) \tag{13}$$

According to inequality 13, it can be concluded that the nAT and S1K descriptors make a larger contribution to the gas heat capacity, and the most significant descriptor appearing in regression eq 12 is nAT. Therefore the size and cyclicity of molecules are the major factors affecting the heat capacity of gases.

The applicability of the three descriptors selected by the GFA approach to predicting the heat capacity of organic compounds in the gas state was analyzed with a nonlinear QSPR model constructed by using the hybrid subtractive clustering ANFIS. All ANFIS calculations were carried out using Matlab mathematical software with the fuzzy logic toolbox for Windows running on a personal computer, while the GFA model was derived by using the Materials Studio software of Accelrys.

To evaluate the performance of the QSPR models presented in this work, the data were randomly divided into two subsets, a training set and an external test set, in a ratio of approximately 80:20. The test set data were not used during the model development. The experimental and predicted heat capacities of gases, as well as the corresponding QSPR descriptors for the training and test sets, are presented in the Supporting Information. To ensure the significance and predictive power of the proposed QSPR models, two statistical parameters, the squared correlation coefficient (R2) and the root mean square error (RMSE), were calculated with the following formulas:

$$R^2 = 1 - \frac{\sum_{i=1}^{n} \left(y_i^{\mathrm{expt}} - y_i^{\mathrm{calcd}}\right)^2}{\sum_{i=1}^{n} \left(y_i^{\mathrm{expt}} - \bar{y}\right)^2} \tag{14}$$

$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n} \left(y_i^{\mathrm{expt}} - y_i^{\mathrm{calcd}}\right)^2}{n}} \tag{15}$$

where $y_i^{\mathrm{expt}}$, $y_i^{\mathrm{calcd}}$, and $\bar{y}$ are the experimental, calculated, and average values and n is the number of compounds in the data set. The values of these statistical parameters for the GFA and ANFIS models are shown in Table 2. The R2 values higher than 0.99 and the low values of the RMSE indicate that the two proposed models are reliable and predictive. The linear model is further validated on the test set by the following criteria, as recommended by Golbraikh and Tropsha:43,44

$$0.85 \le k = \frac{\sum y_i^{\mathrm{expt}}\, y_i^{\mathrm{calcd}}}{\sum \left(y_i^{\mathrm{calcd}}\right)^2} \le 1.15 \tag{16}$$

$$0.85 \le k' = \frac{\sum y_i^{\mathrm{expt}}\, y_i^{\mathrm{calcd}}}{\sum \left(y_i^{\mathrm{expt}}\right)^2} \le 1.15 \tag{17}$$

$$m = \frac{R^2 - R_0^2}{R^2} \le 0.1 \quad \text{or} \quad n = \frac{R^2 - R_0'^2}{R^2} \le 0.1 \tag{18}$$

$$R_0^2 = 1 - \frac{\sum_{i=1}^{n} \left(y_i^{\mathrm{calcd}} - k\, y_i^{\mathrm{calcd}}\right)^2}{\sum_{i=1}^{n} \left(y_i^{\mathrm{calcd}} - \bar{y}^{\mathrm{calcd}}\right)^2}, \qquad R_0'^2 = 1 - \frac{\sum_{i=1}^{n} \left(y_i^{\mathrm{expt}} - k'\, y_i^{\mathrm{expt}}\right)^2}{\sum_{i=1}^{n} \left(y_i^{\mathrm{expt}} - \bar{y}^{\mathrm{expt}}\right)^2} \tag{19}$$

For the test set, the values of k = 0.9996 and k′ = 0.9990 are close to unity, and m = −0.0049 and n = −0.0049 are less than 0.1, which shows the predictive ability of the proposed model. A comparison between the QSPR model developed in this work for estimating the gas heat capacity and other QSPR


models is presented in Table 3. The results in this table indicate that the accuracy of the proposed model is comparable with that of the other models, with the added advantages that it is valid for a wide range of compounds and needs only three easily calculated 3D-independent descriptors for estimating gas heat capacity.

Table 2. Squared Correlation Coefficient (R2), Root Mean Squares Error (RMSE), and Absolute Relative Deviation (ARD%) for GFA and ANFIS Methods

| | GFA RMSE | GFA R2 | GFA ARD% | ANFIS RMSE | ANFIS R2 | ANFIS ARD% |
|---|---|---|---|---|---|---|
| training set | 6.5841 | 0.99605 | 2.757 | 5.9148 | 0.99681 | 2.6315 |
| test set | 9.0181 | 0.99508 | 3.0606 | 7.4464 | 0.99664 | 2.9831 |
| total | 7.1358 | 0.99579 | 2.8175 | 6.2501 | 0.99677 | 2.7016 |

Table 3. Comparison between the Presented Models and Previous Models

| no. | model | R2 | RMSE | Ncomponent | compound type(s) |
|---|---|---|---|---|---|
| 1 | ref 12 | 0.975 | | 134 | alkane |
| 2 | ref 13 | 0.996 | | 18 | aldehydes-ketones |
| 3 | ref 14 | 0.996 | | 18 | aldehydes-ketones |
| 4 | ref 15 | 0.996 | | 10 | monosaccharides |
| 5 | ref 16 | 0.998 | | 13 | benzene methyl derivatives |
| 6 | ref 17 (MLR) | 0.970 | 4.648 | 182 | diverse |
| 7 | ref 17 (RBF) | 0.947 | 4.337 | 182 | diverse |
| 8 | ref 17 (SVM) | 0.988 | 2.931 | 182 | diverse |
| 9 | this work (GFA) | 0.997 | 7.136 | 1174 | diverse |
| 10 | this work (ANFIS) | 0.997 | 6.250 | 1174 | diverse |

An RMSE of 2.354 is also listed for one of entries 1−5; its row cannot be determined from the recovered table layout.

Figure 2. Williams plot describing the applicability domain of the ANFIS model (h* = 0.0128).

Analyzing the applicability domain (AD) of a model is a suitable way to evaluate the reliability of a QSPR model for predicting a property of a new chemical. In this work the AD of the models was analyzed in a plot of the standardized residuals versus the leverage values (the Williams plot), shown in Figures 1 and 2. In the Williams plot, a leverage (h) greater than the critical hat value (h*) suggests that the compound is influential on the model. Moreover, compounds with standardized residuals greater than three standard deviation units (>3s) are considered response outliers. From Figures 1 and 2 it can be seen that the majority of compounds are located within the applicability domain and are predicted accurately. However, some compounds have a leverage (h) greater than the critical hat value (h*). The high leverage values of these compounds are related to high values of the number of atoms and of the number of substituted benzene C(sp2) groups in the molecule. Because the predictions for these compounds have small residuals, they are so-called good influence points, which stabilize the model and make it more precise. Pyromellitic acid and hexachlorocyclopentadiene are two bad influence points because they have both high leverages and high residuals. Moreover, most of the response outliers have low leverage values, which may be attributed to wrong experimental data rather than to the molecular structures or to the method (GFA or ANFIS) used for model development. Figures 3 and 4 show the predicted versus experimental values of the heat capacity of gases for the GFA and ANFIS methods, respectively. As seen from these figures, the proposed models are statistically stable and fit the data well.
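The Williams-plot reasoning above can be made concrete with a short sketch. The text quotes h* = 0.0128 without stating how it was obtained; the widely used definition h* = 3(p + 1)/n, with p = 3 descriptors and n = 940 training compounds, gives 12/940 ≈ 0.0128, so the sketch assumes that definition. It also assumes the leverage is computed as h = xᵀ(XᵀX)⁻¹x with an intercept column added to the descriptor matrix, which is one common convention and is not confirmed by the source. The function names are hypothetical.

```python
def solve(a, b):
    """Solve a z = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def leverage(x_train, x):
    """Leverage h = x^T (X^T X)^-1 x of one descriptor row x.

    x_train holds the training descriptor rows; an intercept column
    is added to both (an assumption of this sketch).
    """
    rows = [[1.0] + list(r) for r in x_train]
    xq = [1.0] + list(x)
    k = len(xq)
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
           for i in range(k)]
    z = solve(xtx, xq)  # z = (X^T X)^-1 x, assuming X^T X is nonsingular
    return sum(xi * zi for xi, zi in zip(xq, z))


def critical_leverage(n_train, n_descriptors):
    """Warning leverage h* = 3(p + 1)/n (assumed definition)."""
    return 3.0 * (n_descriptors + 1) / n_train


def classify(h, std_residual, h_star, limit=3.0):
    """Return (is_influential, is_response_outlier) for one compound."""
    return h > h_star, abs(std_residual) > limit


# With the training-set size and descriptor count reported in this work:
print(round(critical_leverage(940, 3), 4))
```

Under these assumptions, a compound with h > h* and a small standardized residual corresponds to a good influence point, while one with both h > h* and a residual beyond three standard deviation units corresponds to a bad influence point.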

Figure 1. Williams plot describing the applicability domain of the GFA model (h* = 0.0128).

Figure 3. Predicted gas heat capacity by GFA method versus experimental data.34

Figure 4. Predicted gas heat capacity by ANFIS method versus experimental data.34

4. CONCLUSIONS

The QSPR methodology has been successfully applied to the prediction of gas heat capacity for a diverse set of organic compounds. The GFA method was used to select molecular descriptors from a large descriptor space and to develop a linear model. To analyze the nonlinear behavior of these molecular descriptors, the ANFIS method was employed. The validation of the obtained results confirms the goodness of fit, robustness, and predictive capacity of the proposed models. The physicochemical meaning of each descriptor was examined to extract the structural properties of the compounds that influence gas heat capacity. For the structurally diverse data set, the number of atoms in a molecule is the most important factor affecting the heat capacity of gases. Moreover, the heat capacity of gases increases with decreasing cyclicity and with an increasing number of substituted benzene C(sp2) groups in the molecule.

ASSOCIATED CONTENT

Supporting Information
Names of the compounds used in this study together with their heat capacity values. This material is available free of charge via the Internet at http://pubs.acs.org.

AUTHOR INFORMATION

Corresponding Author
*Tel.: +98 21 64543176. E-mail: [email protected].

Notes
The authors declare no competing financial interest.

REFERENCES

(1) Wilhelm, E.; Letcher, T. Heat Capacities: Liquids, Solutions and Vapours; Royal Society of Chemistry: Cambridge, U.K., 2010.
(2) Solimando, R.; Rogalski, M.; Coniglio, L. Heat capacity estimations using equations of state. Thermochim. Acta 1992, 211, 1−11.
(3) Joback, K. G.; Reid, R. C. Estimation of pure-component properties from group contributions. Chem. Eng. Commun. 1987, 57, 233−243.
(4) Coniglio, L.; Daridon, J. L. A group contribution method for estimating ideal gas heat capacities of hydrocarbons. Fluid Phase Equilib. 1997, 139, 15−35.
(5) Speis, M.; Delfs, U.; Lucas, K. Quantum mechanical calculations of molecular properties and ideal gas heat capacity of difluoromethane. Fluid Phase Equilib. 2000, 170, 285−296.
(6) Aly, F. A.; Lee, L. L. Self-consistent equations for calculating the ideal gas heat capacity, enthalpy, and entropy. Fluid Phase Equilib. 1981, 6, 169−179.
(7) Liu, S.; Cai, S.; Cao, C.; Li, Z. Molecular electronegative distance vector (MEDV) related to 15 properties of alkanes. J. Chem. Inf. Comput. Sci. 2000, 40, 1337−1348.
(8) Ivanciuc, O.; Ivanciuc, T.; Cabrol-Bass, D.; Balaban, A. T. Evaluation in quantitative structure−property relationship models of structural descriptors derived from information-theory operators. J. Chem. Inf. Comput. Sci. 2000, 40, 631−643.
(9) Ren, B. A new topological index for QSPR of alkanes. J. Chem. Inf. Comput. Sci. 1999, 39, 139−143.
(10) Toropov, A. A.; Toropova, A. P. QSPR modeling of alkanes properties based on graph of atomic orbitals. J. Mol. Struct. (THEOCHEM) 2003, 637, 1−10.
(11) Thanikaivelan, P.; Subramanian, V.; Rao, J. R.; Nair, B. U. Application of quantum chemical descriptor in quantitative structure activity and structure property relationship. Chem. Phys. Lett. 2000, 323, 59−70.
(12) Ivanciuc, O.; Ivanciuc, T.; Klein, D. J.; Seitz, W. A.; Balaban, A. T. Wiener index extension by counting even/odd graph distances. J. Chem. Inf. Comput. Sci. 2001, 41, 536−549.
(13) Lu, C.; Guo, W.; Hu, X.; Wang, Y.; Yin, C. A novel Lu index to QSPR studies of aldehydes and ketones. J. Math. Chem. 2006, 40, 379−388.
(14) Ren, B. New atom-type-based AI topological indices: Application to QSPR studies of aldehydes and ketones. J. Comput. Aided Mol. Des. 2003, 17, 607−620.
(15) Dyekjær, J. D.; Jonsdottir, S. O. QSPR models for various physical properties of carbohydrates based on molecular mechanics and quantum chemical calculations. Carbohydr. Res. 2004, 339, 269−280.
(16) Golovanov, I. B.; Zhenodarova, S. M. Quantitative structure−property relationship: XI. Properties of methyl benzene derivatives. Russ. J. Gen. Chem. 2003, 73, 240−243.
(17) Xue, C. X.; Zhang, R. S.; Liu, H. X.; Liu, M. C.; Hu, Z. D.; Fan, B. T. Support vector machines-based quantitative structure−property relationship for the prediction of heat capacity. J. Chem. Inf. Comput. Sci. 2004, 44, 1267−1274.
(18) Katritzky, A. R.; Kuanar, M.; Stoyanova-Slavova, I. B.; Slavov, S. H.; Dobchev, D. A.; Karelson, M.; Acree, W. E. Quantitative Structure−Property Relationship Studies on Ostwald Solubility and Partition Coefficients of Organic Solutes in Ionic Liquids. J. Chem. Eng. Data 2008, 53, 1085−1092.
(19) Gharagheizi, F. A new molecular-based model for prediction of enthalpy of sublimation of pure components. Thermochim. Acta 2008, 469, 8−11.


(20) Li, L.; Xie, S.; Cai, H.; Bai, X.; Xue, Z. Quantitative structure−property relationships for octanol−water partition coefficients of polybrominated diphenyl ethers. Chemosphere 2008, 72, 1602−1606.
(21) Golmohammadi, H.; Dashtbozorgi, Z. Quantitative structure−property relationship studies of gas-to-wet butyl acetate partition coefficient of some organic compounds using genetic algorithm and artificial neural network. Struct. Chem. 2010, 21, 1241−1252.
(22) Jarvas, G.; Quellet, C.; Dallos, A. Estimation of Hansen solubility parameters using multivariate nonlinear QSPR modeling with COSMO screening charge density moments. Fluid Phase Equilib. 2011, 309, 8−14.
(23) Patel, S. J.; Ng, D.; Mannan, M. S. QSPR flash point prediction of solvents using topological indices for application in computer aided molecular design. Ind. Eng. Chem. Res. 2009, 48, 7378−7387.
(24) Tetteh, J.; Suzuki, T.; Metcalfe, E.; Howells, S. Quantitative Structure−Property Relationships for the Estimation of Boiling Point and Flash Point Using a Radial Basis Function Neural Network. Ind. Eng. Chem. Res. 2009, 48, 7378−7387.
(25) Kazakov, A.; Muzny, C. D.; Diky, V.; Chirico, R. D.; Frenkel, M. Predictive correlations based on large experimental datasets: Critical constants for pure compounds. Fluid Phase Equilib. 2010, 298, 131−142.
(26) Khajeh, A.; Modarress, H. Quantitative structure−property relationship prediction of liquid heat capacity at 298.15 K for organic compounds. Ind. Eng. Chem. Res. 2012, 51, 6251−6255.
(27) Khajeh, A.; Modarress, H. QSPR prediction of flash point of esters by means of GFA and ANFIS. J. Hazard. Mater. 2010, 179, 715−720.
(28) Khajeh, A.; Modarress, H. Quantitative structure−property relationship for surface tension of some common alcohols. J. Chemom. 2011, 25, 333−339.
(29) Khajeh, A.; Modarress, H. Quantitative structure−property relationship prediction of liquid thermal conductivity for some alcohols. Struct. Chem. 2011, 22, 1315−1323.
(30) Khajeh, A.; Modarress, H. QSPR prediction of surface tension of refrigerants from their molecular structures. Int. J. Refrig. 2012, 35, 150−159.
(31) Khajeh, A.; Modarress, H. Quantitative structure−property relationship for flash point of alcohols. Ind. Eng. Chem. Res. 2011, 50, 11337−11342.
(32) Khajeh, A.; Modarress, H.; Rezaee, B. Application of adaptive neuro-fuzzy inference system for solubility prediction of carbon dioxide in polymers. Expert Syst. Appl. 2009, 36, 5728−5732.
(33) Khajeh, A.; Modarress, H. Prediction of solubility of gases in polystyrene by adaptive neuro-fuzzy inference system and radial basis function neural network. Expert Syst. Appl. 2010, 37, 3070−3074.
(34) Yaws, C. L. Yaws’ Handbook of Thermodynamic and Physical Properties of Chemical Compounds; Knovel: Norwich, NY, 2003.
(35) Ballabio, D.; Manganaro, A.; Consonni, V.; Mauri, A.; Todeschini, R. Introduction to MOLE DB - On-line Molecular Descriptors Database. MATCH Commun. Math. Comput. Chem. 2009, 62, 199−207.
(36) Rogers, D.; Hopfinger, A. J. Application of genetic function approximation to quantitative structure−activity relationships and quantitative structure−property relationships. J. Chem. Inf. Comput. Sci. 1994, 34, 854−866.
(37) Friedman, J. Multivariate Adaptive Regression Splines; Laboratory for Computational Statistics, Department of Statistics, Stanford University: Stanford, CA, Nov. 1988 (revised Aug 1990); Technical Report No. 102.
(38) Holland, J. Adaptation in Artificial and Natural Systems; University of Michigan Press: Ann Arbor, MI, 1975.
(39) Jang, J. ANFIS: adaptive network-based fuzzy inference systems. IEEE Trans. Syst. Man. Cybern. 1993, 23, 665−685.
(40) Sugeno, M. Industrial Applications of Fuzzy Control; Elsevier: Amsterdam, The Netherlands, 1985.
(41) Accelrys Software, Inc.: San Diego, CA, 2005 (http://accelrys.com).
(42) Todeschini, R.; Consonni, V. Molecular Descriptors for Chemoinformatics; Wiley−VCH: Weinheim, Germany, 2009.
(43) Golbraikh, A.; Tropsha, A. Beware of q2! J. Mol. Graphics Modell. 2002, 20, 269−276.
(44) Melagraki, G.; Afantitis, A.; Sarimveis, H.; Koutentis, P. A.; Kollias, G.; Igglessi-Markopoulou, O. Predictive QSAR workflow for the in silico identification and screening of novel HDAC inhibitors. Mol. Divers. 2009, 13, 301−311.