Artificial Neural Network Investigation of the Structural Group Contribution Method for Predicting Pure Components Auto Ignition Temperature


Ind. Eng. Chem. Res. 2003, 42, 5708-5714

CORRELATIONS

Artificial Neural Network Investigation of the Structural Group Contribution Method for Predicting Pure Components Auto Ignition Temperature

Tareq A. Albahri* and Reena S. George

Chemical Engineering Department, Kuwait University, P.O. Box 5969, Safat 13060, Kuwait

A theoretical method for predicting the auto ignition temperature (AIT) of pure components is presented. Artificial neural networks were used to investigate several structural group contribution (SGC) methods available in the literature. The networks were used to probe the structural groups that contribute significantly to the overall AIT property of pure components and to arrive at the set of groups that best represents the auto ignition temperature for about 490 substances. The 58 single and binary structural groups listed were derived from the Ambrose, Joback, and Chueh-Swanson definitions of group contributions and modified to account for the location of the functional groups in the molecule. The proposed method can predict the auto ignition temperature of pure components, from only the knowledge of the molecular structure, with an average error of 2.8% and a correlation coefficient of 0.98. The results are further compared to the more traditional approach of the SGC method along with other methods in the literature, and shown to be far more accurate. The method is notable because no method has previously been available for estimating pure-component AIT from molecular structure alone.

Introduction

In recent years, the term environmental impact has extended its traditional meaning to include other extensive concepts in view of the possibility of industrial accidents which, because of their magnitude, are capable of causing significant damage to people and the environment. This concern, which in the past was principally associated with the nuclear industry, now includes the chemical industry and its safety. Among those concerns are fire disasters caused by the leak of materials at or above their auto ignition temperature (AIT) in chemical and petrochemical plants.
Auto ignition temperature, which is also referred to as autogenous ignition temperature, spontaneous ignition temperature, and self-ignition temperature (SIT), is one of the most important safety specifications used to characterize the hazard potential of a chemical substance. As determined by ASTM standard test method E659-78,1 auto ignition temperature is the lowest temperature at which the substance will produce hot-flame ignition in air at atmospheric pressure without the aid of an external energy source such as a spark or flame. In the vernacular, auto ignition temperature is the temperature at which a material will spontaneously burst into flames when exposed to the atmosphere. At the auto ignition temperature, the rate of heat evolved by the exothermic oxidation reaction overbalances the rate at which heat is lost to the surroundings and causes ignition. Auto ignition temperature is dependent not only on the chemical and physical properties of the substance but also on the method and apparatus employed for its determination, such as the volume and material of the vessel used, the test pressure, and the oxygen concentration.1 In the ASTM test method, auto ignition temperature is determined by inserting a small test sample into a heated flask containing air at a predetermined temperature and then observing, in a dark room, for the sudden appearance of a flame and a sharp rise in the temperature of the gas mixture. The test procedure is repeated for a series of prescribed sample volumes, and the lowest predetermined internal flask temperature leading to said flame and temperature rise is then taken to be the auto ignition temperature.

Experimental determination of AIT is laborious and, moreover, cannot always be done. It is therefore important to develop methods for calculating it. Because the quality of a fuel depends on its composition, one could theoretically calculate the auto ignition temperature from a comprehensive analysis of the individual components in a fuel and their contribution to the overall auto ignition quality. Aside from the fact that a theory describing and quantifying the relationship between composition and auto ignition has not yet been fully developed, the major obstacles to this approach are the lack of a method to obtain the compositional data and the scarcity of auto ignition data for pure components. The former impediment is partially solved by the introduction of new analytical (gas and liquid chromatography) and theoretical techniques,2 whereas

* To whom correspondence should be addressed. Tel.: (+965) 481-7662 (7459). Fax: (+965) 483-9498. E-mail: albahri@kuc01.kuniv.edu.kw. Web site: http://www.albahri.info.

10.1021/ie0300373 CCC: $25.00 © 2003 American Chemical Society Published on Web 09/27/2003

Ind. Eng. Chem. Res., Vol. 42, No. 22, 2003 5709

the latter, which still awaits the introduction of new techniques for estimation, is solved in this work.

Background

No method is available in the literature for estimating the auto ignition temperature from knowledge of the molecular structure alone; existing methods require additional input properties. Suzuki,3 for example, developed several equations to predict AIT using a quantitative structure-property relationship (QSPR) model. The most important of these is an empirical correlation developed using a set of 250 pure components and given by the following equation, which predicts AIT in °C:

\[
\mathrm{AIT} = 1.73P_c - 3.48P_A + 191.4\,{}^{0}\chi - 246.8\,Q_r^{-} - 121.3\,I_{\mathrm{ald}} + 70.4\,I_{\mathrm{ket}} + 302.5 \tag{1}
\]

In the above equation, $I_{\mathrm{ald}}$ and $I_{\mathrm{ket}}$ are indicator variables for aldehyde and ketone functionality, respectively, which take a value of 1 for such compounds and 0 for all other compounds; ${}^{0}\chi$ is the zeroth-order molecular connectivity index (dimensionless); $Q_r^{-}$ is the sum of absolute negative atomic charges (a.u.); $P_c$ is the critical pressure (Pa); and $P_A$ is the parachor at 20 °C (cm3/mol). Although the above correlation was able to predict AIT reasonably well, with an average percentage error of 4.5% and a correlation coefficient of 0.95, it requires unconventional parameters, whose limited availability restricts its applicability. The correlation was developed using a set of only 250 pure components, which is less than half the experimental values available in the literature. Furthermore, the correlation was unable to predict the AIT accurately for 23 pure components that were eliminated during testing. When those are included, the correlation coefficient drops to 0.89 and the average percentage error increases to 5.4%. Here we show that AIT can be predicted from the knowledge of only the molecular structure of the compound, with accuracy higher than any other method available in the literature. The proposed structural group contribution (SGC) method has been shown to be not just another correlation technique but in fact a theoretically consistent method for predicting pure and multicomponent properties.4,5 AIT is one of the most difficult properties to estimate or correlate because of its complex dependency on the molecular structure of the compound. A careful examination of the AITs of hundreds of compounds reveals this complex nature. For example, the AIT of n-paraffins is a function of the size or number of carbon atoms in the molecule.
AIT decreases sharply with increasing carbon number up to C7 then remains almost the same for C7 through C16, ranging between 473 and 479 K, which is within the range of experimental error. For more complex compounds such as iso-paraffins in addition to the total number of carbon atoms AIT depends on the number, type, length, and degree of branching in the molecule. The location of the alkyl group in isoparaffins, however, has no effect. In addition to all of the above factors, the AIT for olefins is a function of the number (olefins and diolefins) and nature of the bonds (double or triple). Normal 1-alkenes, for example, behave more or less the same as n-paraffins. AIT decreases significantly with increasing carbon number up to C5 then remains about the same from C6 through C18, ranging from 503 to 526 K which is within the range of

experimental error. The location of the unsaturated bond along the chain and the cis/trans structural orientation has very minor influence on AIT. The AIT of aromatics is a function of the number and type of benzene rings (condensed or noncondensed), the number of alkyl groups attached to the benzene ring, their type, length, and degree of branching, and sometimes their location on the ring in the ortho, meta, and para positions. The AIT of aromatics in general ranges from about 700 to 840 K which is higher than those of paraffins and olefins of the same carbon number. Furthermore, no obvious correlation exists with either the carbon number or the boiling point and AIT. Cyclic compounds are the most complex because, in addition to all of the above factors, their AIT is a function of not only the number of cyclic rings but their size as well, in addition to the number and degree of branching of the alkyl groups attached to the cyclic ring. The structural orientation (cis/trans) and the location of the alkyl groups on the ring are of minor influence on AIT of cyclic compounds. Cyclic compounds have AIT values that are between those of aromatics and noncyclic compounds (paraffins and olefins) of the same carbon number. In general, AITs of cyclic compounds range from about 510 to 750 K but for a few exceptions. As in aromatics, no obvious correlation exists with either the carbon number or the boiling point and AIT. This is further complicated by the coexistence of several of these groups in one molecule in addition to other functional groups for halogenated compounds, ethers, alcohols, phenols, aldehydes, ketones, acids, esters, amines, and anhydrides, which makes it difficult to formulate a model that can incorporate the behavior of all the different groups without taking into account the structure of the molecules. 
Such complex dependency on the molecular structure can only be adequately represented by a model that takes into account the contribution of such structures in the molecule to the AIT property.

Structural Group Contribution

A careful examination of the AITs of hundreds of pure components reveals a complex dependency on the molecular structure of the substance. In this work we investigate this structural dependency of AIT using an SGC approach, which has proven to be a very powerful tool for predicting many physical and chemical properties of pure compounds. The method has been used successfully to predict pure component and mixture properties including, but not limited to, the critical temperature, critical pressure, critical volume, boiling point, freezing point, molar volume, viscosity, surface tension, diffusivity, thermal conductivity, heat capacity, heat of formation, heat of combustion, entropy, and Gibbs free energy.6 Furthermore, many commercial applications in the form of computer programs that estimate the properties of pure components from their chemical structure are currently being marketed. From the above discussion it can be clearly seen that the SGC approach has already been used extensively to estimate almost all pure component properties, with a few exceptions. These include the properties related to the combustion of these molecules such as AIT, octane number, flash point, and upper and lower flammability limits. There are many structural group contribution methods in the literature including, but not limited to, the


work of Ambrose, Joback, Fedors, Thinh et al., Benson, Orrick-Erbar, Grunberg-Nissan, and Chueh-Swanson.6 The main differences between these are in the choice of the structural groups and the way in which they contribute to the overall property.

Technical Development

Auto ignition temperature is one of the macroscopic properties of compounds which are related to the molecular structure and determine the magnitude and predominant types of the intermolecular forces. The concept of structure suggests that a macroscopic property can be calculated from group contributions. The relevant characteristics of structure are related to the atoms (atomic groups, bond type, etc.); to them we assign weighting factors and then determine the property, usually by an algebraic operation which sums the contributions from the molecule's parts. Of the many SGC estimation methods available in the literature, a combination of the Ambrose, Joback, and Chueh-Swanson group contributions was selected on the basis of generality and accuracy. This combination was tested and then modified to account for the location of the functional groups in the molecule, retaining the modifications that gave the best correlation coefficient and average error using artificial neural networks (ANN). It was found necessary to modify the structural groups to account for only those that have an influence on the overall AIT property. For example, no distinction in the AIT existed between the cis and trans structural orientations in olefins or cyclic compounds. Hence, such distinction was avoided in the choice of the structural groups (Table 1). It was also unnecessary to account for the location of the alkyl substitutions on the benzene ring in the ortho, meta, and para positions in aromatics, the location of the alkyl branches along the chain for iso-paraffins and iso-olefins, the location of the double bond along the chain in olefins, and the alkyl substitutions and ring size for naphthenes.
Our attempts to enhance the model results by using two sets of structural groups, one for the aromatic ring in aromatics and another for the cyclic ring in naphthenes, did not result in a significant improvement in the model predictions and correlation of the experimental data. Therefore such distinctions were avoided.

Method 1: SGC-Based ANN Model

The neural network technique has been applied widely to various engineering areas. Nonlimiting examples in chemical engineering include the modeling of such petroleum refining processes as the hydrocracker7 and the prediction of the thermodynamic8 and transport9 properties of pure components. The neural network method of computation has several advantages over traditional methods, especially in the speed of computation, learning ability, and fault tolerance. The theoretical basis of neural computing has been reported elsewhere.10,11 The concept of using SGC-based ANNs to predict pure-component properties is not new. It has previously been demonstrated to predict very accurately the thermodynamic properties of pure components such as the normal boiling point, the critical properties, and the acentric factor,8 in addition to the enthalpy of fusion.12 Nevertheless, the accuracy of the neural predictions was

Table 1. Structural Groups Corresponding to the Input Nodes in Figure 1 and Their Contributions Used with Equation 3 to Estimate Pure Components Auto Ignition Temperature

serial no.   group                          (AIT)i
1            -CH3                           -1.0952
2            >CH2                           -0.8995
3            >CH-                           0.4397
4            >C<                            3.5905
5            =CH2                           0.1432
6            =CH-                           -2.5374
7            =C<                            -1.5587
8            =C=                            -41.414
9            ≡CH                            -3.0843
10           >CH2 (ring)                    -1.0340
11           >CH- (ring)                    -0.4399
12           >C< (ring)                     17.435
13           =CH- (ring)                    0.2355
14           =C< (ring)                     1.1114
15           -F                             -0.6821
16           -Cl                            1.7790
17           -Br                            3.8344
18           -CH3 (attached to -Cl)         31.590
19           >CH2 (attached to -Cl)         -3.2486
20           >CH- (attached to -Cl)         6.5139
21           >C< (attached to -Cl)          5.8284
22           -OH (alcohol)                  -0.4083
23           -O- (nonring)                  -1.8794
24           >C=O (nonring)                 0.8520
25           O=CH- (aldehyde)               -10.075
26           -COOH (acid)                   1.0103
27           -COO- (ester)                  1.7842
28           =O                             2.3019
29           -NH2                           -0.7274
30           >NH (nonring)                  0.2640
31           >N- (nonring)                  -5.7057
32           -CN                            4.6050
33           -NO2                           -2.9752
34           -OH (phenol)                   1.3483
35           -O- (ring)                     -0.9632
36           >C=O (ring)                    18.257
37           >NH (ring)                     0.8183
38           -N= (ring)                     8.0665
39           -SiH                           20.882
40           >N- (ring)                     -20.913
41           -H                             1.1830
42           >Si<                           -4.8107
43           -SiO                           7.2808
44           -BH3                           -6.4022
45           -SiH2                          -14.727
46           -SO4                           21.068
47           >S                             -9.5369
48           >SO                            -7.7443
49           -SH                            31.187
50           -SiH3                          -11.572
51           -PH3                           -12.865
52           -P=S                           11.108
53           -CO-O-OC- (anhydride)          -1.3897
54           >Pb<                           -4.5846
55           -Na                            -22.020
56           -PO4                           8.2509
57           ≡O                             -18.267
58           =S                             14.259

Note: Groups 18 through 21 are for chlorine atoms in paraffinic compounds only.
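As an illustration of how the groups in Table 1 act as network inputs, the following minimal sketch encodes a molecule as a 58-element vector of group counts and passes it through a three-layer 58-6-1 sigmoid network of the kind shown in Figure 1. It is not the authors' MATLAB model: the weights are random placeholders rather than the fitted values, the helper names `encode_groups` and `forward` are hypothetical, and the min-max scaling of the sigmoid output to the 303-1283 K range of the data set is an assumption made only so that the output carries units.

```python
import math
import random

N_GROUPS = 58   # number of structural groups (input nodes) in Table 1
N_HIDDEN = 6    # hidden-layer size reported for the final network

def encode_groups(counts):
    """Map {serial no.: count} to a 58-element input vector (0 for absent groups)."""
    x = [0.0] * N_GROUPS
    for serial, n in counts.items():
        x[serial - 1] = float(n)
    return x

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, w_hidden, b_hidden, w_out, b_out):
    """One forward pass: sigmoid hidden layer, single sigmoid output in (0, 1)."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    return sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + b_out)

# Placeholder weights (assumption): the fitted weights are not reproduced here.
rng = random.Random(0)
w_hidden = [[rng.uniform(-0.1, 0.1) for _ in range(N_GROUPS)]
            for _ in range(N_HIDDEN)]
b_hidden = [0.0] * N_HIDDEN
w_out = [rng.uniform(-0.1, 0.1) for _ in range(N_HIDDEN)]
b_out = 0.0

# n-Decane: two -CH3 groups (serial 1) and eight >CH2 groups (serial 2).
x_decane = encode_groups({1: 2, 2: 8})
y = forward(x_decane, w_hidden, b_hidden, w_out, b_out)

# Assumed min-max scaling of the network output to kelvin.
AIT_MIN, AIT_MAX = 303.0, 1283.0
ait_untrained = AIT_MIN + y * (AIT_MAX - AIT_MIN)
```

Because the weights are untrained, `ait_untrained` has no predictive meaning; the sketch only shows the data flow from group counts to a single output neuron.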

never compared to those of the traditional SGC approach and was limited only to the above-mentioned properties. In this work we investigate the structural dependency of AIT using a SGC approach. For that purpose an artificial neural network model was constructed using MATLAB13 code to test several SGC methods available in the literature.6 The model was used to probe the structural groups that have significant contribution to the AIT property of pure components and arrive at the


final groups (shown in Table 1) that provided the best correlation of the experimental data. Furthermore, the final groups arrived at were assessed for their capacity to predict the AIT of pure components using the said ANN model. Several ANN architectures were tried and the one that best simulated the AIT was retained. The final network structure is shown in Figure 1 and consists of three layers: input, hidden, and output. The input layer has a number of neurons equal to the number of structural groups being investigated. The hidden layer is a single layer with six neurons, and the output layer consists of one neuron representing the predicted AIT property. A sigmoid function was selected as the transfer function for each neuron. The inputs to the network algorithm are the numbers of each structural group in the given substance. If a certain input group in the network did not exist in a molecule, an input value of zero was assigned to that group. When comparing the different SGC methods in the literature and arriving at the final list of single and binary groups listed in Table 1, the complete data set of 490 pure substances was used in the ANN structure as input. This probing data set was taken from the property databank of AIChE-DIPPR.14 Each one of these 490 input sets included a group-number vector representing the number of the structural groups in a given substance. The connection weights of the network were adjusted iteratively by the back-propagation algorithm with the generalized delta rule to minimize the mean square error between the desired and the actual outputs. During the learning course, we recorded the average absolute deviation (AAD), maximum deviation, and the correlation coefficient of the predictions along with the corresponding time steps (epochs).

Figure 1. Architecture of artificial neural network for predicting the auto ignition temperature of pure components from their structural groups.

It was found that 300 epochs were sufficient to achieve the convergence of learning, where the deviation between the actual and the desired responses shows no significant change; thus the training was terminated at that number of time steps. Convergence took less than a minute for all cases investigated on a Pentium IV 1.7 GHz PC. To avoid extensive trials in determining the number of neurons in the hidden layer, the following simple equation was used to calculate the maximum number of neurons in the hidden layer (H) for a three-layer network architecture shown in Figure 1,

\[
H = \frac{D - 1}{I + 2} \tag{2}
\]

where D is the number of experimental data points and I is the number of input neurons. This equation was developed based on the fact that, for a feasible solution, the number of dependent variables (weights and biases) may not be greater than the number of independent variables (experimental data). Therefore, H must be rounded down to the nearest integer. When separate training and testing sets of data are used, D is the number of data points used in network training. The actual number of hidden-layer neurons was arrived at by stepping down one neuron at a time until the best results, as indicated by the correlation coefficient, were obtained for both the training and testing data sets. After arriving at the best network architecture shown in Figure 1, we demonstrated the predictive ability of the ANN by training it with only 470 of the experimental data points. The trained network was then applied to predict the AIT of the 20 remaining substances, which were not included in the learning database. The compounds in the testing set were chosen on the basis of the abundance of their counterparts (the class of compounds they represent) in the training data set on which the neural networks were trained. This is necessary because an ANN cannot predict the AIT property of a class of compounds on which it was not trained. The accuracy of these predictions was compared with the available experimental data.14

Method 2: Traditional SGC Approach

In a traditional structural contribution approach, the group contributions are usually incorporated in some form of an algebraic equation relating other properties such as boiling point, molecular weight, or just plain constants, to estimate the desired property. Many equations have been proposed, ranging from simple relations to complicated polynomials.6 We have previously tested several equations and found the best to predict the target property in the following form4

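\[
\mathrm{AIT} = 729.7 + 24.9\Bigl(\sum_i (\mathrm{AIT})_i\Bigr) - 1.57\Bigl(\sum_i (\mathrm{AIT})_i\Bigr)^{2} - 0.0773\Bigl(\sum_i (\mathrm{AIT})_i\Bigr)^{3} + 0.0032\Bigl(\sum_i (\mathrm{AIT})_i\Bigr)^{4} \tag{3}
\]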

where AIT is the auto ignition temperature and ∑(AIT)i is the sum of the group contributions to the total AIT property shown in Table 1. The calculation procedure for the AIT property using the above equation and the SGC values in Table 1 is illustrated elsewhere.4 Data on the AIT of about 490 pure components from the AIChE-DIPPR14 were used to estimate the values of the coefficients of eq 3 and the various group contributions shown in Table 1. An optimization algorithm based on the least-squares method was used for that purpose. The algorithm minimizes the sum of the squared differences between the calculated and experimental AITs using the solver function in Microsoft Excel. Convergence of the above nonlinear regression algorithm was achieved in less than 10 min on a Pentium IV 1.7 GHz PC.
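The two closed-form calculations used in the methods above, the hidden-layer bound of eq 2 and the polynomial of eq 3, can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, the cubic coefficient of eq 3 is taken as negative (an assumption that gives physically reasonable values for n-paraffins), and the result is assumed to be in the same units as the AIT values tabulated for the test compounds.

```python
def max_hidden_neurons(n_data, n_inputs):
    """Equation 2: H = (D - 1) / (I + 2), rounded down to an integer."""
    return (n_data - 1) // (n_inputs + 2)

def ait_traditional(group_sum):
    """Equation 3 as a quartic in the summed group contributions.

    Assumption: the cubic coefficient is negative.
    """
    s = group_sum
    return 729.7 + 24.9 * s - 1.57 * s**2 - 0.0773 * s**3 + 0.0032 * s**4

# Contributions for the two groups needed here, copied from Table 1.
CONTRIB = {"-CH3": -1.0952, ">CH2": -0.8995}

# 470 training points and 58 input groups.
h_training = max_hidden_neurons(470, 58)

# n-Decane: two -CH3 groups and eight >CH2 groups.
decane_sum = 2 * CONTRIB["-CH3"] + 8 * CONTRIB[">CH2"]
decane_ait = ait_traditional(decane_sum)
```

With 470 training points and 58 inputs the bound evaluates to 7, which is consistent with the six hidden neurons finally retained after stepping down. For n-decane the group sum is -9.3864 and the polynomial returns roughly 446, compared with the experimental value of 473.9 listed in Table 3.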

Table 2. Comparison of the Three Methods

method        set                            AAD (°C)   avg % error   max % error   R2
ANN           probing (all 490 compounds)    17.7       2.8           20            0.98
ANN           training (470 compounds)       17.8       2.9           21            0.98
ANN           testing (20 compounds)         16.7       2.6           7             0.98
Suzuki3       eq 1 (250 compounds)3          34.8       5.4           33            0.89
traditional   eq 3 (all 490 compounds)       58         9.2           125           0.79

Figure 2. Parity plot for the auto ignition temperature (AIT) for the whole data set of 490 pure components using the SGC method and ANN.

We have had considerable success in the past using the least-squares method and the solver function in MS Excel to estimate the parameters of the traditional SGC approach for predicting such properties as octane number,4 aniline point,15 flash point,16 and upper and lower flammability limits16 with correlation coefficients (R2) as high as 0.99. In some of these cases we have had more components and more structural groups. Therefore, using MS Excel seems to be a judicious choice for its simplicity, surprisingly fast convergence, and effectiveness equal to that of other optimization tools for the task at hand.

Results

Using the probing set of data on the auto ignition temperature of 490 pure compounds,14 which included hydrocarbons, halogenated compounds, ethers, alcohols, phenols, aldehydes, ketones, acids, esters, amines, and anhydrides, several structural groups derived from the Ambrose, Joback, and Chueh-Swanson definitions of group contributions6 were tested and modified for the location of the functional groups in the molecules. During this probing stage, the correlation coefficient was used to discriminate between the SGC methods and to identify the structural groups that contribute significantly to the AIT property of pure compounds. We finally arrived at the set of 58 structural groups shown in Table 1, which best represents the experimental data with a correlation coefficient of 0.98. In addition to the above proposed structural groups, several other groups have also been investigated. Although better results were obtained with a larger number of structural groups, the improvement was not significant. A parity plot showing the accuracy of the model's correlation is presented in Figure 2. The average deviation in the predicted AITs for all types of pure compounds ranging in AIT from 303 to 1283 K was

Figure 3. Percentage error range for the auto ignition temperature for the whole data set of 490 pure components using the SGC-based ANN; 82% of data are below the 5% error range, 15% of data are between 5 and 10% error range, and only 3% of data are between 10 and 20% error range.

2.8% as shown in Table 2. The percentage errors for the probing data set are shown in Figure 3. To assess the accuracy of the model's predictions, the data were then separated into training and testing sets consisting of 470 and 20 pure components, respectively. The percent error between the predicted AIT and the actual data used in training the network was calculated. The results from the trained network are summarized in Table 2, indicating that the average error for the AIT calculations in this mode was about 2.9% with a correlation coefficient of 0.98. As can be seen, the correlation of the neural network model for the training data set is good. The predictions of the trained neural network were cross-validated against a testing set of 20 components not originally used in the training process. The percentage error between the predicted AIT and the experimental data used in testing the network was calculated. The network's predictions compared well against this new set of data, with average and maximum percentage errors of 2.6% and 7%, respectively, and a correlation coefficient of 0.98. The detailed results and compounds tested are listed in Table 3. As shown in Table 2, the predictions for the testing set are comparable to those for the training set in terms of AAD and correlation coefficient. A parity plot showing the accuracy of the model's correlation for both training and testing is presented in Figure 4. As can be seen, the predictions of the neural network model for the testing data set are excellent. The maximum percent errors are also satisfactory. The testing data set showed a smaller maximum error (7%) than the training data set (21%). This is explained by the fact that the compounds in the testing set were preselected as representative examples of the different classes of compounds in the training data set on which the neural networks were trained. This is not to say that

Table 3. Testing Set of Components Not Used during Model Development for AIT

compound                        AIT exp.   AIT pred.   % error(a)
1-ethylnaphthalene              753.0      748.5       0.6
1-heptene                       536.0      533.6       0.4
1-hexadecene                    513.0      532.4       3.8
1-octene                        503.0      518.6       3.1
2-methyl-1-propanol             681.0      659.5       3.2
2-pentanol                      615.9      631.5       2.5
3-methyl-1-butanol              623.0      631.5       1.4
benzyl alcohol                  708.9      756.6       6.7
butanol                         615.9      606.1       1.6
cis-1,2-dimethylcyclohexane     577.0      585.3       1.4
cis-1,4-dimethylcyclohexane     576.9      585.3       1.5
cis-4-methylcyclohexanol        570.0      569.2       0.1
decane                          473.9      439.8       7.2
dimethyl phthalate              828.9      857.1       3.4
ethanol                         695.9      664.6       4.5
maleic anhydride                689.0      709.0       2.9
m-diethylbenzene                831.9      858.1       3.2
propenenitrile                  643.9      632.2       1.8
sec-butylbenzene                678.9      659.5       2.8
trans-decahydronaphthalene      570.0      569.2       0.1

(a) Average % error = 2.6, correlation coefficient = 0.98.
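The summary statistics quoted for the testing set can be recomputed directly from the experimental and predicted values in Table 3. The sketch below assumes that the percentage error is the absolute deviation divided by the experimental value, which is not stated explicitly in the text but reproduces the tabulated figures; the variable names are illustrative.

```python
# (experimental AIT, predicted AIT) pairs from Table 3, in table order.
TABLE3 = [
    (753.0, 748.5), (536.0, 533.6), (513.0, 532.4), (503.0, 518.6),
    (681.0, 659.5), (615.9, 631.5), (623.0, 631.5), (708.9, 756.6),
    (615.9, 606.1), (577.0, 585.3), (576.9, 585.3), (570.0, 569.2),
    (473.9, 439.8), (828.9, 857.1), (695.9, 664.6), (689.0, 709.0),
    (831.9, 858.1), (643.9, 632.2), (678.9, 659.5), (570.0, 569.2),
]

# Average absolute deviation between prediction and experiment.
aad = sum(abs(pred - exp) for exp, pred in TABLE3) / len(TABLE3)

# Percentage error relative to the experimental value (assumed definition).
pct_errors = [100.0 * abs(pred - exp) / exp for exp, pred in TABLE3]
avg_pct_error = sum(pct_errors) / len(pct_errors)
max_pct_error = max(pct_errors)
```

Rounded to one decimal place, the three quantities come out as 16.7, 2.6, and 7.2, in agreement with the testing row of Table 2, where the maximum error is reported as 7.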

Figure 4. Parity plot for the auto ignition temperature (AIT) of a training set of 470 pure components and a testing set of 20 components using the SGC-based ANNs.

ANN can predict AIT of pure components with an average error of 2.6% and a maximum error of 7%; rather, an average error of 2.8% and a maximum error of 20% are to be expected, as can be seen in rows 1 and 2 of Table 2. This is still better than the method of Suzuki,3 which produces an average error of 5.4% and a maximum error of 33%, keeping in mind that it was developed with a much smaller data set of 250 pure compounds, about half the number of compounds used to develop our model. It is unclear how that method would perform when tested on a larger set of data containing some very unconventional compounds. Our proposed method is therefore superior to that of Suzuki3 in terms of accuracy, generality, and simplicity, as it requires only the molecular structure of the compound, which is always known. The results for the traditional SGC method for predicting AIT using eq 3 and the group contributions in Table 1 are summarized in Table 2. The model's predictions did not correlate very well with the experimental data, as shown in Figure 5, with an average error of about 9.2% and a correlation coefficient of 0.79. This corresponds well with our previous findings that AIT

Figure 5. Parity plot for the auto ignition temperature (AIT) of 490 pure components using the traditional structural group contributions method.

is especially difficult to correlate using the traditional SGC method even for a relatively small number of components and fewer structural groups.15 From the success we have had in the past in predicting other properties,4,15,16 it is our conclusion that the impediment is related not to the optimization tool used but to the AIT property itself, which is too complex to model using the traditional SGC approach based on nonlinear regression and the least-squares technique, an approach that suffers from several shortcomings. These limitations are mainly associated with using a simple equation such as eq 3, which is unable to capture the complex nature of the AIT property. In addition, as in many other iterative techniques, the method's success depends on providing appropriate initial values of the group contributions. It is our experience from our previous work and the task at hand that minor improvement, if any, can be obtained by using other optimization tools such as MATLAB and GAMS. The ANN method avoids these inconveniences, inaccuracies, and limitations of eq 3 and offers a promising alternative for modeling for a number of reasons. ANNs are able to capture the nonlinearity in the system behavior very effectively. Once properly trained, ANNs offer predictions quickly and accurately on a personal computer. Furthermore, the connection weights and network architecture make predictions possible using a spreadsheet. A trained network for AIT can also be used for synthesizing molecules (i.e., choosing a molecule with a desired AIT). This can be done by invoking the inverse property of the network: what is the best combination of inputs that leads to certain outputs? Finally, this work demonstrates that the complex combustion property, AIT, can be modeled by back-propagation neural network models.
Considering the difficulty and complexity of developing a first principles mechanistic model of AIT involving the kinetics and dynamics of combustion, neural networks can be an effective alternative. AIT has a number of intrinsic physical parameters associated with the molecular structure, such as group interactions, structural orientations, skew, hindrance, steric, resonance, inductive, and chiral effects, that are usually unknown and have to be determined through a parameter estimation methodology. Neural networks can learn about these


inherent relationships among various structural groups and their contribution to the overall AIT property of the molecule.

Conclusion and Future Work

In conclusion, no structure-based technique for estimating the auto ignition temperature of pure components has previously been available. The group contribution approach presented here is the first and proves to be a powerful tool for predicting the auto ignition temperature of pure substances from only their molecular structure. Having obtained only limited success with the traditional SGC approach using the least-squares method (R2 = 0.79), an ANN model was developed for the same purpose. The neural network offered a significant improvement (R2 = 0.98) and an advantage over the traditional SGC method as well as other methods.3 Another major advantage is the ability of the ANNs to probe the structural groups that have significant contribution to the overall AIT property of pure compounds, which is very difficult and time-consuming to perform with the traditional SGC approach using nonlinear regression. This is a novel use of neural networks and one of the significant contributions of this work. We are currently developing a method for estimating the AIT of petroleum fuels and other mixtures from pure components, utilizing the current procedure for the automatic generation and reliable estimation of AIT of pure components for which no data exist.

Acknowledgment

T.A.A. acknowledges Research Grant EC 04/01 from Kuwait University and Dr. Ali Elkamel for helpful discussions on neural computing.

Supporting Information Available: The MATLAB code and the AIT data for the ANN model may be obtained by E-mail from the corresponding author. Weights fitted in the ANN architecture are shown in Table 4 in the Supporting Information.
Nomenclature

AIT = Auto ignition temperature
D = Number of experimental data points
H = Maximum number of neurons in the hidden layer for a three-layer network architecture
I = Number of input neurons
i = Number of structural groups (shown in Table 1)
Iald = Indicator variable for aldehyde functionality
Iket = Indicator variable for ketone functionality
PA = Parachor at 20 °C (cm³/mol)
Pc = Critical pressure (Pa)
Qrˇ = Sum of absolute negative atomic charges (a.u.)
⁰χ = Zeroth-order molecular connectivity index (dimensionless)
∑(AIT)i = Summation of the structural group contributions (shown in Table 1)
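The summation ∑(AIT)i defined above and the two figures of merit quoted in this work (R² and average percent error) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the sum is assumed to be the occurrence-weighted sum of group contributions, R² is assumed to mean the coefficient of determination, and the numbers in the usage example are made up rather than taken from Table 1 or the AIT data set. How the sum is mapped to AIT (by the fitted correlation or by the ANN) is not reproduced here.

```python
from typing import Sequence


def group_contribution_sum(counts: Sequence[int],
                           contributions: Sequence[float]) -> float:
    """Occurrence-weighted sum of structural group contributions.

    Assumption: each group's contribution is multiplied by the number of
    times that group appears in the molecule.
    """
    return sum(n * c for n, c in zip(counts, contributions))


def r_squared(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def average_percent_error(observed: Sequence[float],
                          predicted: Sequence[float]) -> float:
    """Mean absolute deviation relative to the observed value, in percent."""
    total = sum(abs(o - p) / o for o, p in zip(observed, predicted))
    return 100.0 * total / len(observed)


if __name__ == "__main__":
    # Hypothetical values for illustration only (not from the paper).
    observed_ait = [500.0, 600.0, 700.0]
    predicted_ait = [510.0, 590.0, 700.0]
    print(group_contribution_sum([2, 4], [10.0, 5.0]))
    print(round(r_squared(observed_ait, predicted_ait), 4))
    print(round(average_percent_error(observed_ait, predicted_ait), 4))
```

Applying the same two metric functions to the predictions of both models over all D data points would give a like-for-like comparison of the kind summarised in the conclusion.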

Literature Cited

(1) ASTM International. ASTM Standard Test Method E659-78; The American Society for Testing and Materials: West Conshohocken, PA, 2000.
(2) Neurock, M.; Nigam, A.; Trauth, D.; Klein, M. T. Molecular Representation of Complex Hydrocarbon Feedstocks through Efficient Characterization and Stochastic Algorithms. Chem. Eng. Sci. 1994, 49, 4153.
(3) Suzuki, T. Quantitative Structure-Property Relationships for Auto-Ignition Temperatures of Organic Compounds. Fire Mater. 1994, 18, 81.
(4) Albahri, T. A. Structural group contribution method for predicting the octane number of pure hydrocarbon liquids. Ind. Eng. Chem. Res. 2003, 42, 657.
(5) Benson, S. W.; Buss, J. H. Additivity rules for the estimation of molecular properties. Thermodynamic properties. J. Chem. Phys. 1958, 29, 546.
(6) Reid, R. C.; Prausnitz, J. M.; Poling, B. E. The Properties of Gases and Liquids; McGraw-Hill: New York, 1987.
(7) Elkamel, A.; Al-Ajmi, A.; Fahim, M. Modeling the Hydrocracking Process Using Artificial Neural Networks. Pet. Sci. Technol. 1999, 17, 931.
(8) Lee, M. J.; Chen, J. T. Fluid Property Prediction with the Aid of Neural Networks. Ind. Eng. Chem. Res. 1993, 32, 995.
(9) Ismail, A.; Soliman, M. S.; Fahim, M. A. Prediction of the viscosity of heavy petroleum fractions and crude oils by neural networks. Sekiyu Gakkaishi 1996, 39, 383.
(10) Lippmann, R. P. An Introduction to Computing with Neural Nets. IEEE ASSP Mag. 1987, 4, 10.
(11) Widrow, B.; Lehr, M. A. 30 years of Adaptive Neural Networks: Perceptron, Madaline, and Back-propagation. Proc. IEEE 1990, 78, 1415.
(12) Bunz, A. P.; Braun, B.; Janowsky, R. Quantitative structure-property relationships and neural networks: correlation and prediction of physical properties of pure components and mixtures from molecular structure. Fluid Phase Equilib. 1999, 158, 367.
(13) MATLAB V6.1, Neural Network toolbox; The MathWorks, Inc.: 2001. (http://www.mathworks.com).
(14) AIChE, DIPPRO, public version; DIPPR Project 801 Pure Component Data; 1996. (http://www.chempute.com/sinet.htm).
(15) Albahri, T. A.; Zayed, F. I. Method for Predicting the Aniline Point of Pure Hydrocarbon Liquids; Presented at ACS 226th National Meeting, New York, September, 2003.
(16) Albahri, T. A. Flammability Characteristics of Pure Hydrocarbons. Chem. Eng. Sci. 2003, 58, 3629.

Received for review January 13, 2003
Revised manuscript received August 13, 2003
Accepted August 20, 2003
IE0300373