
ARTICLE pubs.acs.org/IECR

Quantitative Structure Property Relationship Studies for Predicting Dust Explosibility Characteristics (KSt, Pmax) of Organic Chemical Dusts

Olga J. Reyes, Suhani J. Patel, and M. Sam Mannan*

Mary Kay O’Connor Process Safety Center, Artie McFerrin Department of Chemical Engineering, Texas A&M University, College Station, Texas 77843-3122, United States

ABSTRACT: In the chemical process industry, hazardous chemical dusts have recently caused many severe incidents leading to fatalities and injuries, so it is important to characterize the hazards posed by dust materials. In this work, the quantitative structure property relationship (QSPR) technique was used to predict the explosibility characteristics (i.e., KSt, Pmax) of chemical dusts. The data set consisted of 31 chemical dusts, divided into training and test sets. The mathematical models were developed using the Materials Studio 5.0 software and the genetic function approximation (GFA) algorithm. The final predictive models for the maximum overpressure (Pmax) and the dust deflagration index (KSt) are cubic equations with 5 parameters and high correlation coefficients (R² of 0.96 and 0.91, respectively). The correlations also account for the effect of dust particle size by using the median value as a parameter in the developed models. This is essential because dust explosion behavior is principally governed by particle size. Such prediction models for hazardous dust properties give an indication for the design of protective measures against dust explosions, and they can also be applied to hazard classification and screening of dust materials.

Received: June 27, 2010. Revised: November 7, 2010. Accepted: January 3, 2011. Published: January 20, 2011.
© 2011 American Chemical Society | Industrial & Engineering Chemistry Research | dx.doi.org/10.1021/ie1013663 | Ind. Eng. Chem. Res. 2011, 50, 2373–2379

1. INTRODUCTION

Dust explosions are among the most serious and widespread explosion hazards in the process industry.1 They can occur whenever dusts are produced, stored, or processed in a facility. In the U.S. alone, 281 dust fires and explosions occurred between 1980 and 2005, killing 119 people, injuring 718, and extensively damaging industrial facilities.2 Dust explosion incidents happen very frequently in the process industry; one of the most recent and relevant examples is the incident at the Imperial Sugar refinery in Port Wentworth, near Savannah, Georgia (February 2008). This accident killed 14 workers and injured 36 others.3 Thus, more guidelines for protection and mitigation against hazards arising from dust explosions are needed. These guidelines require knowledge and evaluation, either experimental or theoretical, of the explosibility characteristics (i.e., Pmax, KSt).4

The maximum explosion overpressure, Pmax, is the difference between the pressure at the time of ignition (normal pressure) and the pressure at the highest point in the pressure-time record resulting from a dust explosion.5 It gives an indication of the magnitude of the damaging pressures that may be generated during a dust explosion.6 It is normally used to design containment, venting, suppression, isolation, and partial inerting.

The dust deflagration index, KSt, is a concept introduced for scaling maximum rates of pressure rise to larger volumes by normalizing them. The maximum rate of pressure rise is multiplied by the cube root of the explosion chamber volume; this is called the cube root law (eq 1).7

$$K_{St} = \left(\frac{dP}{dt}\right)_{\max} V^{1/3} \qquad (1)$$

The resulting dust explosion severity then forms the design basis for explosion protection (e.g., explosion relief venting, explosion suppression). This approach rests entirely on the validity of the cube root law.6 The dust deflagration index, KSt, represents the maximum rate of pressure rise in a 1 m3 vessel when a dust is ignited (i.e., the dust explosion violence) and is commonly used to classify dust explosibility as follows (KSt in bar m/s):1 (1) KSt = 0, no explosion; (2) 0 < KSt < 200, weak explosion (dust explosion class 1); (3) 200 < KSt < 300, strong explosion (dust explosion class 2); (4) 300 < KSt, very strong explosion (dust explosion class 3).

Some authors have previously explained how the KSt and Pmax values are influenced by many factors, such as particle size, moisture content, ambient humidity, oxygen available for combustion, particle shape, dust concentration, and ignition source.1,8,9 In general, a cloud of combustible powder or dust ignites more readily, and the violence of the explosion increases, with decreasing particle size. This is because finer particles have greater surface area per unit mass, are easily dispersed in air, and remain airborne for longer periods.1 There is also a critical particle size below which the effect of particle size on the explosion and combustion rates is negligible.8 Overall, an important aspect of characterizing a dust explosion is to determine the Pmax and KSt values.

Quantitative structure property relationship (QSPR) studies have been successfully applied in the process-safety field9-13 to predict properties such as reactivity of hazardous materials, flash points, autoignition temperatures, and explosion limits. In this work, the technique has been applied to build a mathematical model to predict the explosibility characteristics of chemical dusts. The advantage of this approach over other techniques is that it requires only knowledge of the chemical structure and does not depend on any experimental measurements.14

Although the mathematical models obtained in this work compared well with the experimental data, these correlations were developed for chemical dust materials only (according to the OSHA types of dust materials in the process industry15). Other important dust materials that are handled in the process industry every day (such as plastics) may not be covered by the developed correlations. This study is an initial approach to applying QSPR techniques to the prediction of dust explosibility characteristics, and it can be extended with more data and to other dust types.
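The cube root law (eq 1) and the KSt class boundaries quoted in the Introduction can be sketched in a few lines of Python. This is only an illustration: the function names `kst_from_rate` and `dust_class` are hypothetical, the input numbers in the demo are made up, and the treatment of values exactly equal to 200 or 300 bar m/s (which the strict inequalities in the text leave open) is an assumption.

```python
def kst_from_rate(dpdt_max_bar_per_s: float, volume_m3: float) -> float:
    """Cube root law (eq 1): KSt = (dP/dt)_max * V^(1/3), in bar m/s."""
    if dpdt_max_bar_per_s < 0 or volume_m3 <= 0:
        raise ValueError("rate must be non-negative and volume positive")
    return dpdt_max_bar_per_s * volume_m3 ** (1.0 / 3.0)


def dust_class(kst: float) -> int:
    """Return 0 for no explosion, otherwise dust explosion class 1, 2, or 3.

    Assumption: the boundary values 200 and 300 bar m/s are assigned to the
    lower class, since the text gives only strict inequalities.
    """
    if kst < 0:
        raise ValueError("KSt cannot be negative")
    if kst == 0:
        return 0
    if kst <= 200:
        return 1
    if kst <= 300:
        return 2
    return 3


if __name__ == "__main__":
    # Hypothetical measurement in a 20 L (0.020 m3) sphere.
    kst = kst_from_rate(500.0, 0.020)
    print(f"KSt = {kst:.1f} bar m/s, class {dust_class(kst)}")
```

Because the classification changes abruptly at the class boundaries, a predicted KSt close to 200 or 300 bar m/s should be treated with particular care.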

2. METHODOLOGY

2.1. Data Set. The experimental data for the explosibility characteristics (i.e., KSt, Pmax) and the corresponding median values for the 31 chemical dusts were taken from the online database GESTIS-DUST-EX, combustion and explosion characteristics of dusts.16 The median value of a dust sample can be used as a representative, characteristic value of the particle size distribution of the sample. The median dust particle size is the particle diameter that divides the frequency distribution in half; 50% of the dust mass has particles with a larger diameter, and 50% has particles with a smaller diameter.17 In cases where the sample contains a high percentage of smaller or larger sized particles, however, the median value may not be the most appropriate representation. For some of the compounds, data were collected for different median values, leading to a total of 40 data points for the study. The collected data are based on tests in the 1 m3 vessel and/or the 20 L sphere. A detailed explanation of the test methods, the characteristics of the database, and the limits of applicability can be found in the guidelines of the database.16

The data set was divided into two subsets: the training set, formed by 29 different compounds and 31 data points, and the test set, formed by 6 different compounds and 9 data points. The difference between the number of compounds and the number of data points results from experimental values obtained for several particle sizes. The training set was used to develop the mathematical model and the test set to validate it.

2.2. Geometry Optimization. With the data set obtained, the molecular structure of each compound was retrieved from the PubChem database18 and then sketched using GaussView 3.0 (ref 19). Since the values of some molecular descriptors depend on bond lengths, bond angles, and other such parameters, it is imperative to perform a geometry optimization for each chemical structure prior to calculating the molecular descriptors.20 The geometry optimization was carried out with the Gaussian 03 package21 at the B3LYP/6-31G level. The B3LYP functional was chosen for its effective performance for systems involving organic compounds.22-24

2.3. Determination of Molecular Descriptors. The optimized molecular structures were used as input to the Materials Studio 5.0 software package.25 This software was used to calculate the molecular descriptors of all compounds in the training and test sets. More than 200 molecular descriptors were calculated using the descriptors’ models and VAMP Electrostatics in the simulation model of the Materials Studio software package.25

After the molecular descriptors were calculated, an initial analysis was performed to remove some of them, both to reduce the size of the model development problem and to keep only the relevant descriptors. Following a heuristic method, first the molecular descriptors that were not available for all of the input structures were removed, such as crystal based (periodic) descriptors and charge based (Jurs) descriptors, and second the molecular descriptors that had similar values for most of the structures were removed from the analysis.

Thereafter, a univariate analysis was performed on the input values of KSt and Pmax. This helps to assess the quality of the available data and its suitability for further analysis. The KSt values did not exhibit a normal distribution, so they were transformed to their square roots, which did (as shown in Figure 1). The final model has therefore been built as a function of the square root of KSt.

Figure 1. Histogram and distribution for KSt and (KSt)1/2 values.

Finally, to complete the initial analysis, a correlation matrix was used; this tool gives the correlation between each pair of descriptors included in the analysis. Correlation coefficients approaching +1.0 or -1.0 for a pair of columns suggest that the two columns of data are highly dependent on each other. Therefore, one descriptor was removed from each pair whose correlation coefficient exceeded 0.9 in absolute value. At the end of this initial analysis, a total of 34 molecular descriptors remained, along with the median value, which is used as one of the governing descriptors in the final model.

Table 1. Molecular Descriptors Used in the Correlations and Their Definitions

| molecular descriptor | type | definition |
|---|---|---|
| dust size | | dust median particle size |
| Kappa-3(AM) | topological descriptor | molecular shape index order 3 with α-modified atom count |
| nC | atomistic descriptor | number of carbons |
| Ly | spatial (geometric) descriptor | shadow length: LY |
| Rotlbonds | structural descriptor | number of rotatable bonds |
| Syz, f | spatial (geometric) descriptor | shadow area fraction: YZ plane |
| Hbond donor | structural descriptor | number of hydrogen-bond donors |
| BIC | information-content descriptor | bond information content |
| E_ADJ_mag | information-content descriptor | edge adjacency/magnitude |
| Sxy, f | spatial (geometric) descriptor | shadow area fraction: XY plane |
| Lz | spatial descriptor | shadow length: LZ |
| Total dipole | electronic information | total dipole magnitude |
| Hbond acceptor | structural descriptor | number of hydrogen-bond acceptors |
| Chi(3):cluster | topological descriptor | simple 3rd order cluster chi index |
| HOMO | electronic information | energy of the highest occupied molecular orbital |
| LUMO | electronic information | energy of the lowest unoccupied molecular orbital |
| Octupole xxz | electronic information | electrostatics octupole moment in the plane xxz |
| Octupole xzz | electronic information | electrostatics octupole moment in the plane xzz |
| Quadrupole yy | electronic information | electrostatics quadrupole moment in the yy plane |

2.4. Developing the Model. The genetic function approximation (GFA) algorithm is a useful technique for searching a large parameter space when the data set is small. This method provides multiple models that are created by evolving random initial models using different descriptors. Models are improved by performing a crossover operation to recombine terms, providing better scoring models. The GFA approach has a number of important advantages over other techniques: it builds multiple models rather than a single model, it automatically selects which features are to be used in the models, and it can build models using linear relations or higher order polynomials, splines, Gaussians, etc.26 GFA has been applied successfully in a wide range of QSAR/QSPR studies.27-31 The method combines Friedman’s multivariate adaptive regression splines (MARS)32 and Holland’s genetic algorithm.33 It also uses Friedman’s lack of fit (LOF) measure, along with the common R-squared, cross validated R-squared, and F-value, to evaluate the significance of the regression in the QSPR/QSAR model. In Materials Studio, the LOF is measured using a slight variation of the original Friedman formula, as shown in eq 2.32

$$\mathrm{LOF} = \frac{\mathrm{SSE}}{M\left(1 - \lambda\,\dfrac{c + dp}{M}\right)^{2}} \qquad (2)$$

where SSE is the sum of squared errors, c is the number of terms in the model other than the constant term, d is a scaled smoothing parameter, p is the total number of descriptors contained in all model terms (ignoring the constant term), M is the number of samples in the training set, and λ is a safety factor, with the value 0.99, which ensures that the denominator of the expression can never become zero. The scaled smoothing parameter, d, is related to the user-defined smoothness parameter, α, by the expression shown in eq 3, where cmax is the maximum equation length.

$$d = \alpha\left(\frac{M - c_{\max}}{c_{\max}}\right) \qquad (3)$$

The GFA algorithm was applied to the training set, using the R-squared value as the scoring function and a LOF smoothness parameter equal to 0.5. The program was run repeatedly, varying the length and order of the equations. The returned equations were evaluated using the following parameters: (a) Friedman LOF measure, (b) R-squared, (c) cross-validated R-squared, and (d) F-value. These values were calculated with the statistical models in the Materials Studio software.25 After this assessment, two full cubic equations, each 5 parameters in length, were selected as the most appropriate QSPR models for the Pmax and KSt explosibility characteristics. The total number of molecular descriptors used in predicting Pmax is 9 (not including dust size) and for KSt is 12 (not including dust size).
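The lack-of-fit measure of eqs 2 and 3 is simple enough to compute by hand, and the following Python sketch shows one way to do so. The function names are hypothetical, and the SSE, term count, and descriptor count in the demo are invented for illustration; they are not the values behind Table 3, and the sketch makes no claim to reproduce the internals of Materials Studio.

```python
def scaled_smoothing(alpha: float, m: int, c_max: int) -> float:
    """Eq 3: d = alpha * (M - c_max) / c_max."""
    if c_max <= 0:
        raise ValueError("c_max must be positive")
    return alpha * (m - c_max) / c_max


def friedman_lof(sse: float, c: int, p: int, m: int, d: float,
                 lam: float = 0.99) -> float:
    """Eq 2: LOF = SSE / (M * (1 - lam * (c + d * p) / M) ** 2)."""
    if m <= 0:
        raise ValueError("M must be positive")
    shrink = 1.0 - lam * (c + d * p) / m
    if shrink == 0.0:
        raise ZeroDivisionError("denominator of eq 2 vanished")
    return sse / (m * shrink ** 2)


if __name__ == "__main__":
    # Illustrative inputs only: 31 training points, smoothness alpha = 0.5
    # (the value quoted in the text), and an assumed maximum length of 5.
    d = scaled_smoothing(alpha=0.5, m=31, c_max=5)
    for terms in (3, 5):
        # Same hypothetical SSE for both models; one descriptor per term.
        print(terms, friedman_lof(sse=10.0, c=terms, p=terms, m=31, d=d))
```

For a fixed SSE, the longer model receives the larger LOF, which is how the measure penalizes equations that add terms without a matching reduction in error.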

3. RESULTS AND DISCUSSION

The correlation obtained for the maximum overpressure, Pmax, using the GFA algorithm is shown in eq 4. The definitions and types of the descriptors used in the QSPR correlations of eqs 4 and 5 are shown in Table 1. The descriptor values calculated with Materials Studio for eq 4 are given in Table 5 in the Supporting Information, and those used in eq 5 are given in Table 6 in the Supporting Information.

$$P_{\max} = -0.7245X_1 - 3.0034\times10^{-4}X_2 + 7.2018\times10^{-3}X_3 - 2.0618\times10^{-2}X_4 + 1.9373\times10^{-4}X_5 + 10.0614 \qquad (4)$$

where X1 = (BIC)²(Kappa-3(AM)); X2 = (dust size)(nC)(Ly); X3 = (dust size)(Rotlbonds)(Syz, f); X4 = (dust size)(Hbond donor)(BIC); X5 = (dust size)(BIC)(E_ADJ_mag).

The correlation obtained for the dust deflagration index, KSt, using the GFA algorithm is shown in eq 5.

$$\sqrt{K_{St}} = -0.4367X_1 - 2.3624\times10^{-2}X_2 - 6.5637\times10^{-2}X_3 + 4.1064\times10^{-4}X_4 - 1.1604\times10^{-2}X_5 + 14.65534 \qquad (5)$$

where X1 = (Hbond donor)²(Sxy, f); X2 = (dust size)(BIC)(total dipole); X3 = (Hbond acceptor)(Chi(3):cluster)(HOMO); X4 = (BIC)(octupole xxz)(octupole xzz); X5 = (Lz)(LUMO)(quadrupole yy).

The number of carbons (nC) and other descriptors related to the strength of bonds in a molecule (in this case, the number of rotatable bonds, Rotlbonds; the number of hydrogen-bond donors, Hbond donor; the bond information content, BIC; the number of hydrogen-bond acceptors, Hbond acceptor; the edge adjacency/magnitude, E_ADJ_mag; and Kappa-3(AM)) are among the molecular descriptors that have appeared previously in a QSPR predictive model for the net heat of combustion.20 When an ignited dust cloud is confined, the heat of combustion may result in rapid development of pressure.1 Properties such as Pmax and KSt are therefore related to the heat of combustion, and it is not surprising that these descriptors appear in one or both of the models developed in the present study.

The shadow lengths LY (Ly) and LZ (Lz) are defined as the length of the molecule in the y and z directions, respectively. The shadow area fractions in the XY plane (Sxy, f) and the YZ plane (Syz, f) are defined as the fraction of the area of the molecular shadow in the xy plane and yz plane, respectively, over the area of the enclosing rectangle. Molecular shadow indices are a set of geometric descriptors that help to characterize the shape of molecules. Particle shape and porosity greatly affect the particle surface area and the reaction rates. In general, shapes with greater surface area will propagate a flame more readily; therefore, shape is of primary importance with regard to dust explosibility characteristics as well.34

The electronic information descriptors were calculated by the Materials Studio software using VAMP electrostatics. Descriptors of this type appeared only in the model built to predict the dust deflagration index, KSt. The total dipole moment, Octupole xxz, Octupole xzz, and Quadrupole yy are electrostatic descriptors that imply molecular polarizability, which is a measure of how much the overall electronic charge distribution can be distorted by an external field.35 On observation of eq 5, it is seen that (KSt)1/2 increases as Octupole xxz and Octupole xzz increase and decreases as the total dipole moment and the Quadrupole yy decrease. HOMO and LUMO are the energies of the highest occupied and lowest unoccupied molecular orbitals, respectively. It has been shown that these orbitals play a major role in governing many chemical reactions.36 Because the violence of an explosion is related to the rate of energy release due to chemical reactions, the degree of confinement, and heat losses,34 it is reasonable that these descriptors influence the dust deflagration index KSt in the developed QSPR model.

The values of Pmax and KSt predicted using eqs 4 and 5 are compared with the experimental GESTIS-DUST-EX data in Figures 2 and 3. The corresponding values can be found in Table 2.
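To make the structure of eqs 4 and 5 concrete, the sketch below evaluates both correlations in Python. The coefficients and term definitions are taken from eqs 4 and 5, reading X1 in eq 4 as (BIC)²(Kappa-3(AM)); the function and argument names are hypothetical, and the descriptor units are assumed to be whatever Materials Studio reports, with dust size presumably in μm as in Table 2. Real descriptor values would have to come from Tables 5 and 6 in the Supporting Information, so no worked compound is shown here.

```python
def pmax_eq4(dust_size, bic, kappa3_am, n_c, ly, rotl_bonds, syz_f,
             hbond_donor, e_adj_mag):
    """Eq 4: predicted maximum overpressure Pmax (barg)."""
    x1 = bic ** 2 * kappa3_am
    x2 = dust_size * n_c * ly
    x3 = dust_size * rotl_bonds * syz_f
    x4 = dust_size * hbond_donor * bic
    x5 = dust_size * bic * e_adj_mag
    return (-0.7245 * x1 - 3.0034e-4 * x2 + 7.2018e-3 * x3
            - 2.0618e-2 * x4 + 1.9373e-4 * x5 + 10.0614)


def sqrt_kst_eq5(dust_size, hbond_donor, sxy_f, bic, total_dipole,
                 hbond_acceptor, chi3_cluster, homo, octupole_xxz,
                 octupole_xzz, lz, lumo, quadrupole_yy):
    """Eq 5: predicted square root of KSt."""
    x1 = hbond_donor ** 2 * sxy_f
    x2 = dust_size * bic * total_dipole
    x3 = hbond_acceptor * chi3_cluster * homo
    x4 = bic * octupole_xxz * octupole_xzz
    x5 = lz * lumo * quadrupole_yy
    return (-0.4367 * x1 - 2.3624e-2 * x2 - 6.5637e-2 * x3
            + 4.1064e-4 * x4 - 1.1604e-2 * x5 + 14.65534)


def kst_eq5(**descriptors):
    """Back-transform eq 5 to KSt (bar m/s) by squaring.

    Note: a negative square-root prediction would square to a misleading
    positive value, so callers may want to inspect sqrt_kst_eq5 directly.
    """
    return sqrt_kst_eq5(**descriptors) ** 2
```

Note that dust size enters eq 5 only through the term containing the total dipole, so for a molecule with zero dipole the predicted KSt would not change with particle size.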

Figure 2. Comparison between the predicted values of Pmax and experimental data for training and test sets.

Figure 3. Comparison between the predicted values of KSt and experimental data for training and test sets.

Table 2. Comparison between the Thermo-Kinetic Predicted Values (by Equations 4 and 5) and the Experimental Values from the GESTIS-DUST-EX Database

| chemical dust name | median size (μm) | Pmax exptl (barg) | Pmax calcd eq 4 | KSt exptl (bar m/s) | KSt calcd eq 5 |
|---|---|---|---|---|---|
| Training Set | | | | | |
| sodium stearate | 22 | 8.8 | 9.09 | 123 | 116.08 |
| lactose | 27 | 8.3 | 8.13 | 82 | 80.24 |
| acetylsalicylic acid | 39 | 9.5 | 9.38 | 258 | 226.76 |
| cetrimonium bromide | 23 | 9 | 9.17 | 201 | 211.18 |
| pentoxifylline | 14 | 9.4 | 9.36 | 197 | 203.29 |
| 1,3-dimethyldiphenyl urea | 34 | 9.3 | 9.78 | 212 | 192.55 |
| lactose | 130 | 1.7 | 2.06 | 3 | 6.53 |
| pentaerythritol | 135 | 9 | 8.95 | 158 | 164.95 |
| pentaerythritol | 230 | 8.7 | 8.38 | 158 | 164.95 |
| sodium ascorbate | 23 | 8.4 | 8.82 | 119 | 100.63 |
| sugar | 30 | 8.5 | 8.80 | 138 | 124.78 |
| N-methyl-N′-diphenyl urea | 20 | 9.1 | 9.32 | 217 | 188.74 |
| wax (N,N′-ethylenebisstearamide) | 10 | 8.7 | 8.68 | 269 | 256.96 |
| hexamethylene tetramine | 27 | 10.5 | 10.18 | 286 | 243.91 |
| L-cystine | 15 | 8.5 | 7.98 | 142 | 116.06 |
| vitamin C | 14 | 6.6 | 7.26 | 48 | 54.43 |
| 2-ethoxy-4,6-dihydroxy pyrimidine | 25 | 9.2 | 9.05 | 162 | 218.17 |
| colophony | 22 | 8.5 | 8.30 | 169 | 160.92 |
| anthracene | 235 | 8.7 | 8.37 | 231 | 224.79 |
| sodium hydrogen cyanamide | 40 | 7 | 6.64 | 47 | 75.91 |
| 2,2′-methylene-bis-6-(1,1-dimethyl-ethyl)-4-methylphenol | 15 | 9.3 | 9.08 | 257 | 308.68 |
| anthranilic acid | 50 | 8 | 7.91 | 110 | 129.92 |
| phenytoin | 80 | 8.8 | 8.66 | 205 | 141.60 |
| dimethyl terephthalate | 27 | 9.7 | 9.97 | 247 | 240.00 |
| sodium glutamate | 90 | 5.4 | 5.15 | 29 | 23.10 |
| 2,5-ditert-amyl hydroquinone | 13 | 9.5 | 9.03 | 363 | 300.26 |
| ditertiary butyl-p-cresol | 92 | 8.8 | 8.74 | 143 | 217.00 |
| anthraquinone | 49 | 8.8 | 9.47 | 263 | 307.06 |
| guanine | 15 | 8.8 | 8.87 | 96 | 105.24 |
| benzoguanamine | 19 | 8.8 | 8.82 | 171 | 137.51 |
| acetoguanamine (2,4-diamino-6-methyl-1,3,5-triazine) | 24 | 8.9 | 8.98 | 77 | 134.75 |
| Test Set | | | | | |
| sugar | 34 | 8.2 | 8.72 | 90 | 120.84 |
| lactose | 34 | 7.6 | 7.88 | 35 | 72.64 |
| pentaerythritol | 85 | 9.1 | 9.25 | 188 | 164.95 |
| tetramethylpiperidine | 21 | 8.9 | 9.40 | 289 | 263.09 |
| naphthalene | 95 | 8.5 | 8.76 | 178 | 216.93 |
| phthalazone | 33 | 9.2 | 9.30 | 182 | 163.18 |

4. VALIDATION OF THE QSPR MODELS

4.1. Internal Validation within Training Set. After development of a GFA model, the statistical results can be analyzed to demonstrate the validity and quality of the model. The principal statistical parameters for each of the models developed can be seen in Table 3. As explained before, the Friedman lack of fit (LOF) measures the resistance of a model to overfitting and cannot be decreased by adding new terms to the equation. In general, the lower the LOF value the better. The R-squared parameter measures how well the model fits the data in the training set. A value close to 1.0 shows that the genetic function approximation equation explains the dependent variable well. The cross validated R-squared (R²cv) value has commonly been used to measure the predictive power of a developed model. It is assumed that the closer the value is to 1.0, the better the predictive power. Although a small R²cv indicates low predictive ability of a model, the opposite is not necessarily true. This parameter must not be used as the ultimate proof of the high predictive power of a model.37 Lastly, the F test determines whether or not the regression is statistically significant. A higher F-value is preferred for developing such correlations.

In general, both of the developed models show good statistical parameters, as shown in Table 3. The R-squared and cross validated R-squared values are fairly close (over or near 0.9), the F-value is high enough, and the Friedman LOF is not very high. On the basis of these values, the equations that were developed fit the data set well and are statistically significant. It is still necessary to evaluate the true predictive power of each model by applying it to a test set, as shown in section 4.2.

Table 3. Statistical Results for the Correlations Shown in Equations 4 and 5

| statistical parameters | Pmax eq 4 | (KSt)1/2 eq 5 |
|---|---|---|
| Friedman LOF | 1.048 | 13.493 |
| R-squared | 0.960 | 0.907 |
| cross validated R-squared | 0.904 | 0.868 |
| significance-of-regression F-value | 106.033 | 42.832 |

4.2. External Validation Using Test Set. To evaluate the true predictive capacity of both models, the models were independently tested with 20% of the complete data set (test set), as recommended in previous literature.38 The predicted values and the experimental values from the GESTIS-DUST-EX database (for both of the explosibility characteristics) were compared within the external test set. The predictive power of a QSPR model can be conveniently estimated by an external q² defined as shown in eq 6.37

$$q_{ext}^{2} = 1 - \frac{\sum_{i=1}^{test}\left(y_i - \hat{y}_i\right)^{2}}{\sum_{i=1}^{test}\left(y_i - \bar{y}_{tr}\right)^{2}} \qquad (6)$$

where $y_i$ and $\hat{y}_i$ are the experimental and predicted values over the test sets and $\bar{y}_{tr}$ is the mean value of the experimental Pmax or KSt for the training sets. A QSPR model can be considered predictive when q²ext is above 0.6.39 The q²ext obtained for the developed models is equal to 0.72 and 0.82 for Pmax and KSt, respectively. Thus, it can be deduced that the models are fairly predictive.
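The external q² of eq 6 can be computed directly from three lists of numbers. The Python sketch below is a minimal illustration, assuming the denominator uses the mean of the training-set experimental values as stated in the text; the function name `q2_ext` and the toy numbers in the demo are hypothetical and are not taken from the paper's data.

```python
def q2_ext(y_test, y_pred, y_train):
    """Eq 6: 1 - sum((y_i - yhat_i)^2) / sum((y_i - mean(y_train))^2).

    Both sums run over the test set; the reference mean comes from the
    experimental values of the training set.
    """
    if len(y_test) != len(y_pred):
        raise ValueError("y_test and y_pred must have the same length")
    if not y_test or not y_train:
        raise ValueError("test and training sets must be non-empty")
    mean_train = sum(y_train) / len(y_train)
    press = sum((y - yhat) ** 2 for y, yhat in zip(y_test, y_pred))
    spread = sum((y - mean_train) ** 2 for y in y_test)
    return 1.0 - press / spread


if __name__ == "__main__":
    # Toy data for illustration only.
    q2 = q2_ext(y_test=[2.0, 4.0], y_pred=[2.0, 3.0], y_train=[0.0, 2.0])
    # 0.6 is the acceptance threshold quoted in the text.
    print(q2, "predictive" if q2 > 0.6 else "not predictive")
```

Because the reference mean comes from the training set rather than the test set, q²ext can be negative when the model predicts the test data worse than that constant would.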

Comparing the experimental and predicted values in Table 2, it can be noticed that although the q²ext for the Pmax model is lower than that for the KSt model, the predicted Pmax values are fairly close to the experimental ones (the absolute difference being around or below 0.6 barg). In addition, most of the predicted values in the training set and all of the predicted values in the test set for Pmax are higher than the experimental values from the data set. This is an important consideration in terms of safety, because underestimating this property could result in faulty design strength of industrial equipment that may then be unable to withstand the violence of an explosion. Thus, slight overestimation of Pmax is at times preferable for such hazardous operations.

Finally, a statistical evaluation of the errors and deviations between the experimental and predicted values in the training and test sets was done using the definitions shown in eqs 7-9.40 The results for the average absolute deviation, the average absolute relative deviation, and the average percent bias for each of the models developed are shown in Table 4. It can be observed that while Pmax has low deviation measures, the calculated values of (KSt)1/2 have higher deviation measures. Thus, the correlations should be used after accounting for such uncertainties in calculations.

Table 4. Errors and Deviation between Experimental and Calculated Values

| | Pmax eq 4 | (KSt)1/2 eq 5 |
|---|---|---|
| average absolute deviation | 0.261 barg | 25.496 bar m/s |
| average absolute relative deviation | 0.036 | 0.225 |
| average percent bias | -0.863% | -9.858% |

$$\text{average absolute deviation} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right| \qquad (7)$$

$$\text{average absolute relative deviation} = \frac{1}{n}\sum_{i=1}^{n}\left|\frac{y_i - \hat{y}_i}{y_i}\right| \qquad (8)$$

$$\text{average percent bias} = \frac{1}{n}\sum_{i=1}^{n}\frac{y_i - \hat{y}_i}{y_i}\times 100 \qquad (9)$$

where $y_i$ and $\hat{y}_i$ are the experimental and predicted values over the training and test sets.
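The three deviation measures of eqs 7-9 can be sketched in Python as follows. The function names are hypothetical. The demo feeds in only the last six Pmax rows of Table 2 (sugar at 34 μm through phthalazone), so its output is not expected to match Table 4, which covers the training and test sets together.

```python
def average_absolute_deviation(y, y_hat):
    """Eq 7: mean of |y_i - yhat_i|."""
    return sum(abs(a - b) for a, b in zip(y, y_hat)) / len(y)


def average_absolute_relative_deviation(y, y_hat):
    """Eq 8: mean of |(y_i - yhat_i) / y_i|; requires non-zero y_i."""
    return sum(abs((a - b) / a) for a, b in zip(y, y_hat)) / len(y)


def average_percent_bias(y, y_hat):
    """Eq 9: mean of (y_i - yhat_i) / y_i, times 100.

    A negative result means the model overpredicts on average.
    """
    return 100.0 * sum((a - b) / a for a, b in zip(y, y_hat)) / len(y)


if __name__ == "__main__":
    # Experimental and calculated Pmax (barg) for six rows of Table 2.
    pmax_exptl = [8.2, 7.6, 9.1, 8.9, 8.5, 9.2]
    pmax_calcd = [8.72, 7.88, 9.25, 9.40, 8.76, 9.30]
    print(average_absolute_deviation(pmax_exptl, pmax_calcd))
    print(average_absolute_relative_deviation(pmax_exptl, pmax_calcd))
    print(average_percent_bias(pmax_exptl, pmax_calcd))
```

The sign convention of eq 9 (experimental minus predicted) is why an overpredicting model, which is the conservative direction for Pmax, shows up as a negative bias.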

5. CONCLUSIONS

In this work, two QSPR models have been developed for the prediction of the explosibility characteristics (i.e., Pmax, KSt) of chemical dusts. For the development of these models, the data set comprised 31 chemical dusts, and the experimental values were taken from the GESTIS-DUST-EX database for combustion and explosion characteristics of dusts. Both of the models are multivariate cubic equations of five parameters in length (each parameter is a combination of molecular descriptors). The dust median particle size is included as one of the descriptors because dust explosibility characteristics are highly dependent on the size of the particles. The molecular descriptors were calculated from the optimized chemical structures of the molecules. The final models were developed using the GFA algorithm. Statistical parameters of the models show that they are highly predictive and accurate.

The models developed are an initial approach to QSPR studies applied to dust explosion characteristics, and the values predicted with these equations serve as an indication of the level of hazard involved when experimental testing cannot be performed. These values must not be used as a final basis for the design of safety measures, but they can be used effectively for screening purposes. Using these data for classification of hazardous dust materials requires some caution near the values that demarcate different categories (such as KSt = 200 and 300). If the predicted values lie around these figures, it would be beneficial to gather experimental data for the material.

QSPR is an emerging property prediction technique that has found widespread application for certain biological responses, polymer behaviors, and other material characterization. It also holds promise for the estimation of hazardous properties of dusts, as shown in this work. With the availability of a comprehensive database of experimental measurements of dust explosion characteristics and corresponding molecular structure information, QSPR models can be developed in the future for universal application to most types of dust materials.

ASSOCIATED CONTENT

Supporting Information. Calculated descriptors used in eq 4 (Table 5) and calculated descriptors used in eq 5 (Table 6). This material is available free of charge via the Internet at http://pubs.acs.org.

AUTHOR INFORMATION

Corresponding Author
*E-mail: [email protected].

’ ACKNOWLEDGMENT This research was sponsored by the Mary Kay O’Connor Process Safety Center, Artie McFerrin Department of Chemical Engineering, Texas A&M University. ’ REFERENCES (1) Abbasi, T.; Abbasi, S. A. Dust explosions-Cases, causes, consequences, and control. J. Hazard. Mater. 2007, 140 (1-2), 7–44. (2) CSB.Combustible Dust Hazard Study. U.S. Chemical Safety and Hazard Investigation Board, November 2006, Investigation Report 2006-H-1. (3) CSB, Sugar Dust Explosion and Fire. U.S. Chemical Safety and Hazard Investigation Board, September 2009, Investigation Report No. 2008-05-I-GA. (4) Di Benedetto, A.; Russo, P. Thermo-kinetic modelling of dust explosions. J. Loss Prev. Process Ind. 2007, 20 (4-6), 303–309.  .; García-Torrent, J.; Aguado, P. J. Determination of (5) Ramírez, A parameters used to prevent ignition of stored materials and to protect against explosions in food industries. J. Hazard. Mater. 2009, 168 (1), 115–120. (6) Dahoe, A. E.; Cant, R. S.; Scarlett, B. On the Decay of Turbulence in the 20-Liter Explosion Sphere. Flow, Turbul. Combust. 2001, 67, 159–184. (7) Amyotte, P. R.; Eckhoff, R. K. Dust explosion causation, prevention and mitigation: An overview. J. Chem. Health Saf. 2010, 17 (1), 15–28. (8) Eckhoff, R. K. Understanding dust explosions. The role of powder science and technology. J. Loss Prev. Process Ind. 2009, 22 (1), 105–116. (9) Saraf, S. R.; Rogers, W. J.; Mannan, M. S. Prediction of reactive hazards based on molecular structure. J. Hazard. Mater. 2003, 98 (1-3), 15–29. (10) Saraf, S. R.; Rogers, W. J.; Ford, D. M.; Mannan, M. S. Integrating molecular modeling and process safety research. Fluid Phase Equilib. 2004, 222-223, 205–211. 2378

dx.doi.org/10.1021/ie1013663 |Ind. Eng. Chem. Res. 2011, 50, 2373–2379

(11) Gharagheizi, F. A QSPR model for estimation of lower flammability limit temperature of pure compounds based on molecular structure. J. Hazard. Mater. 2009, 169 (1–3), 217–220.
(12) Pan, Y.; Jiang, J.; Wang, R.; Cao, H.; Cui, Y. A novel QSPR model for prediction of lower flammability limits of organic compounds based on support vector machine. J. Hazard. Mater. 2009, 168 (2–3), 962–969.
(13) Cao, H. Y.; Jiang, J. C.; Pan, Y.; Wang, R.; Cui, Y. Prediction of the net heat of combustion of organic compounds based on atom-type electrotopological state indices. J. Loss Prev. Process Ind. 2009, 22 (2), 222–227.
(14) Pan, Y.; Jiang, J.; Wang, R.; Cao, H. Advantages of support vector machine in QSPR studies for predicting auto-ignition temperatures of organic compounds. Chemom. Intell. Lab. Syst. 2008, 92 (2), 169–178.
(15) OSHA. Combustible Dust. http://www.osha.gov/Publications/combustibledustposter.pdf (accessed January 2010).
(16) IFA. GESTIS-DUST-EX. Database Combustion and explosion characteristics of dusts. http://www.dguv.de/ifa/en/gestis/expl/index.jsp (accessed January 2010).
(17) BAM, PTB, DECHEMA. CHEMSAFE Database for recommended safety characteristics. http://i-systems.dechema.de/chemsafe/def_e.php#median%20value (accessed January 2010).
(18) NCBI. PubChem Project. http://pubchem.ncbi.nlm.nih.gov/ (accessed February 2010).
(19) GaussView03; Gaussian, Inc.: Pittsburgh, PA, 2003.
(20) Gharagheizi, F. A simple equation for prediction of net heat of combustion of pure chemicals. Chemom. Intell. Lab. Syst. 2008, 91 (2), 177–180.
(21) Frisch, M. J.; Trucks, G. W.; Schlegel, H. B.; Scuseria, G. E.; Robb, M. A.; Cheeseman, J. R.; Montgomery, J. A., Jr.; Vreven, T.; Kudin, K. N.; Burant, J. C.; Millam, J. M.; Iyengar, S. S.; Tomasi, J.; Barone, V.; Mennucci, B.; Cossi, M.; Scalmani, G.; Rega, N.; Petersson, G. A.; Nakatsuji, H.; Hada, M.; Ehara, M.; Toyota, K.; Fukuda, R.; Hasegawa, J.; Ishida, M.; Nakajima, T.; Honda, Y.; Kitao, O.; Nakai, H.; Klene, M.; Li, X.; Knox, J. E.; Hratchian, H. P.; Cross, J. B.; Bakken, V.; Adamo, C.; Jaramillo, J.; Gomperts, R.; Stratmann, R. E.; Yazyev, O.; Austin, A. J.; Cammi, R.; Pomelli, C.; Ochterski, J. W.; Ayala, P. Y.; Morokuma, K.; Voth, G. A.; Salvador, P.; Dannenberg, J. J.; Zakrzewski, V. G.; Dapprich, S.; Daniels, A. D.; Strain, M. C.; Farkas, O.; Malick, D. K.; Rabuck, A. D.; Raghavachari, K.; Foresman, J. B.; Ortiz, J. V.; Cui, Q.; Baboul, A. G.; Clifford, S.; Cioslowski, J.; Stefanov, B. B.; Liu, G.; Liashenko, A.; Piskorz, P.; Komaromi, I.; Martin, R. L.; Fox, D. J.; Keith, T.; Al-Laham, M. A.; Peng, C. Y.; Nanayakkara, A.; Challacombe, M.; Gill, P. M. W.; Johnson, B.; Chen, W.; Wong, M. W.; Gonzalez, C.; Pople, J. A. Gaussian 03, revision A.1; Gaussian, Inc.: Pittsburgh, PA, 2003.
(22) Zhu, R.; Zhang, D.; Wu, J.; Liu, C. A DFT study on the mechanism and regioselectivity of the tandem O-nitroso aldol/Michael reaction of nitrosobenzene and cyclohexenone. J. Mol. Struct. 2007, 815 (1–3), 105–109.
(23) Sultan, A. S.; Balbuena, P. B.; Hill, A. D.; Nasr-El-Din, H. A. Ab initio and Molecular Simulation Studies of Organic and Inorganic Counter Effect on Anionic Viscoelastic Surfactants. SPE International Symposium on Oilfield Chemistry, Woodlands, TX, April 20–22, 2009.
(24) Tirado-Rives, J.; Jorgensen, W. L. Performance of B3LYP density functional methods for a large set of organic molecules. J. Chem. Theory Comput. 2008, 4, 297–306.
(25) Accelrys Inc. Materials Studio 5.0; Accelrys Inc.: San Diego, CA, 2009.
(26) Ponnurengam, M. S.; Sethu, K. G.; Doble, M. QSAR Studies on Chalcones and Flavonoids as Anti-tuberculosis Agents Using Genetic Function Approximation (GFA) Method. Chem. Pharm. Bull. 2006, 55 (1), 44–49.
(27) Rogers, D.; Hopfinger, A. J. Application of Genetic Function Approximation to Quantitative Structure-Activity Relationships and Quantitative Structure-Property Relationships. J. Chem. Inf. Comput. Sci. 1994, 34 (4), 854–866.
(28) Shi, L. M.; Fan, Y.; Lee, J. K.; Waltham, M.; Andrews, D. T.; Scherf, U.; Paull, K. D.; Weinstein, J. N. Mining and Visualizing Large Anticancer Drug Discovery Databases. J. Chem. Inf. Comput. Sci. 1999, 40 (2), 367–379.
(29) Fan, Y.; Shi, L. M.; Kohn, K. W.; Pommier, Y.; Weinstein, J. N. Quantitative Structure-Antitumor Activity Relationships of Camptothecin Analogues: Cluster Analysis and Genetic Algorithm-Based Studies. J. Med. Chem. 2001, 44 (20), 3254–3263.
(30) Hou, T.; Li, Y.; Zhang, W.; Wang, J. Recent Developments of In Silico Predictions of Intestinal Absorption and Oral Bioavailability. Comb. Chem. High Throughput Screening 2009, 12, 497–506.
(31) Couling, D. J.; Bernot, R. J.; Docherty, K. M.; Dixon, J. K.; Maginn, E. J. Assessing the factors responsible for ionic liquid toxicity to aquatic organisms via quantitative structure-property relationship modeling. Green Chem. 2006, 8 (1), 82–90.
(32) Friedman, J. H. Multivariate Adaptive Regression Splines. Ann. Stat. 1991, 19 (1), 1–67.
(33) Holland, J. Adaptation in Artificial and Natural Systems; University of Michigan: Ann Arbor, MI, 1975.
(34) Cashdollar, K. L. Overview of dust explosibility characteristics. J. Loss Prev. Process Ind. 2000, 13 (3–5), 183–199.
(35) Chang, H.-J.; Kim, H. J.; Chun, H. S. Quantitative structure-activity relationship (QSAR) for neuroprotective activity of terpenoids. Life Sci. 2007, 80 (9), 835–841.
(36) Karelson, M.; Lobanov, V. S.; Katritzky, A. R. Quantum-Chemical Descriptors in QSAR/QSPR Studies. Chem. Rev. 1996, 96 (3), 1027–1044.
(37) Tropsha, A.; Gramatica, P.; Gombar, V. K. The Importance of Being Earnest: Validation is the Absolute Essential for Successful Application and Interpretation of QSPR Models. QSAR Comb. Sci. 2003, 22 (1), 69–77.
(38) Gramatica, P. Principles of QSAR models validation: internal and external. QSAR Comb. Sci. 2007, 26 (5), 694–701.
(39) Golbraikh, A.; Shen, M.; Xiao, Z.; Xiao, Y.-D.; Lee, K.-H.; Tropsha, A. Rational selection of training and test sets for the development of validated QSAR models. J. Comput.-Aided Mol. Des. 2003, 17, 241–253.
(40) Patel, S. J.; Ng, D.; Mannan, M. S. QSPR Flash Point Prediction of Solvents Using Topological Indices for Application in Computer Aided Molecular Design. Ind. Eng. Chem. Res. 2009, 48 (15), 7378–7387.


dx.doi.org/10.1021/ie1013663 |Ind. Eng. Chem. Res. 2011, 50, 2373–2379