CORRELATION pubs.acs.org/IECR
Development of a Simple Method to Predict Boiling Points and Flash Points of Acyclic Alkenes
Felix A. Carroll,*,† Justin M. Godinho,† and Frank H. Quina‡
† Department of Chemistry, Davidson College, Davidson, North Carolina 28035, United States
‡ Instituto de Química, Universidade de São Paulo, CP 26077, São Paulo 05513-970, Brazil
ABSTRACT: Boiling points (TB) of acyclic alkenes are predicted from their boiling point numbers (YBP) with the relationship $T_B(\mathrm{K}) = -16.802\,Y_{BP}^{2/3} + 337.377\,Y_{BP}^{1/3} - 437.883$. In turn, YBP values are calculated from structure using the relationship $Y_{BP} = O_i + 1.726 + 2.779C + 1.716M_3 + 1.564M + 4.204E_3 + 3.905E + 5.007P - 0.329D + 0.241G + 0.479V + 0.967T + 0.574S$. Here Oi is a parameter that depends on the substitution pattern of the alkene, while the rest of the equation is the same as that reported earlier for calculating the YBP values of alkanes. For a data set consisting of 250 acyclic alkenes having from 6 to 36 carbon atoms, the average absolute deviation between literature TB values and those predicted with these equations was 1.29 K, and the R² of the correlation was 0.999. In addition, the calculation of boiling points by this method provides a useful means to predict the flash points of alkenes from structure.
INTRODUCTION
The correlation of macroscopic physical properties with molecular structure is an enduring theme in organic chemistry. We implicitly assume that properties follow from structure, yet it can be challenging to express the numerical value of a particular physical property with a simple mathematical relationship based directly on molecular structure. Boiling point (TB) is perhaps the most important physical property of an organic compound that is liquid at room temperature, so boiling point prediction methods are the subject of particular interest. They range from simple but relatively inaccurate approaches based on functional group counts and substituent relationships that are directly evident from molecular structure to complex multiparametric or neural network methods based on theoretical electronic properties or connectivity functions.1 Thus, boiling point prediction methods exemplify the inherent tension in chemistry between accuracy and simplicity.2
One of the difficulties in predicting boiling points from structure is that boiling points of a series of homologous organic compounds increase regularly with the number of repeat units, such as the number of CH₂ groups in an alkyl chain, but this increase is not linear. For example, Figure 1 shows the curvature in a plot of the boiling points (●) of the 1-alkenes from 1-hexene to 1-hexatriacontene. Therefore, TB prediction methods often incorporate a set of parameters that, taken together, can model this curvature.3,4 A simpler and conceptually more satisfying approach is first to convert physical property values that are not linear with the count of structural units into values of a new parameter that does vary linearly with structural increments.
One of the earliest efforts along these lines was reported by Kinney, who introduced the concept of the boiling point number, Y, as a new measure of TB values.5,6 Kinney’s relationship between TB and Y values is as follows:
$$T_B(\mathrm{K}) = 230.14\,Y^{1/3} - 269.85 \tag{1}$$
In turn, Kinney’s Y could be calculated from structural increments of alkanes using
$$Y = 0.8C + H + 3.05M + 5.5E + 7P - 0.4D + 0.5V_{2\text{ or }3} + V_{4+} \tag{2}$$
Here C is the number of carbon atoms in the longest carbon chain, H is the number of hydrogen atoms attached to this main chain, M is the number of methyl substituents on this chain, E is the number of ethyl substituents, P is the number of propyl substituents, D is the number of 2,2-dimethyl groups, $V_{2\text{ or }3}$ is 1 for structures having either two or three adjacent substituents on a chain of six carbons or fewer (otherwise it is 0), and $V_{4+}$ is 1 for compounds having four or more adjacent substituents on a chain of six carbons or fewer. Kinney’s approach was reasonably accurate for lower molecular weight alkanes, but eq 1 is not as accurate for higher molecular weight compounds. Recently we improved the Kinney approach for paraffins by using a quadratic equation to calculate boiling points from a new boiling point parameter, which we termed YBP:7
$$T_B(\mathrm{K}) = -16.802\,Y_{BP}^{2/3} + 337.377\,Y_{BP}^{1/3} - 437.883 \tag{3}$$
We also improved Kinney’s eq 2 for alkanes by adding some additional parameters and then optimizing the coefficients of all parameters. The resulting calculation of YBP from structure for alkanes is
$$Y_{BP} = 1.726 + 2.779C + 1.716M_3 + 1.564M + 4.204E_3 + 3.905E + 5.007P - 0.329D + 0.241G + 0.479V + 0.967T + 0.574S \tag{4}$$
Received: June 9, 2011. Revised: November 2, 2011. Accepted: November 8, 2011. Published: November 28, 2011.
dx.doi.org/10.1021/ie201241e | Ind. Eng. Chem. Res. 2011, 50, 14221–14225
Industrial & Engineering Chemistry Research
Table 1. Contribution of Different Olefinic Patterns to YBP Values
Figure 1. Nonlinear relationship of literature TB values (●, left axis) and linear relationship of YBP values (○, right axis) with the number of carbon atoms in a series of 1-alkenes from 1-hexene to 1-hexatriacontene. The solid line shows the best-fit linear correlation of the YBP values.
Here C is the number of carbon atoms in the longest chain, M3 is the number of methyl substituents on carbon 3 of this chain (counting from either end), M is the number of methyl substituents at other positions, E3 and E are the numbers of corresponding ethyl substituents, P is the number of propyl substituents, D is the number of 2,2-dimethyl groupings (again counting from either end), G is the number of geminal substitutions at other positions, V is the number of vicinal alkyl substituents, T is the number of instances of two methyl substituents on both carbons one and three of a three-carbon segment of the main chain, and S is the square of the ratio of the total number of carbons to the number of carbons in the longest chain. This approach gave a correlation of experimental and predicted boiling points with R² = 0.999 for acyclic alkanes containing from 6 to 30 carbons.7 The success of this method results both from the good correlation of TB values with YBP values in eq 3 and from the ability of the parameters and coefficients in eq 4 to model the dispersion forces responsible for the intermolecular interactions of the alkanes.
On the basis of the success of this method in predicting the boiling points of alkanes, we have begun to explore the prediction of boiling points from structure for other classes of organic compounds. Here we report the prediction of TB values for acyclic alkenes. The results provide a simple yet accurate way to predict both the boiling points and the flash points of this class of organic compounds.
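As an illustration of how eqs 3 and 4 fit together, the following minimal Python sketch evaluates them for n-hexane. The function and argument names (`ybp_alkane`, `tb_from_ybp`, `total_carbons`) are our own labels for this example and do not come from the original work; the coefficients are those of eqs 3 and 4.

```python
def ybp_alkane(C, total_carbons, M3=0, M=0, E3=0, E=0, P=0, D=0, G=0, V=0, T=0):
    """Boiling point number of an acyclic alkane from structural counts (eq 4)."""
    # S is the square of (total carbons / carbons in the longest chain).
    S = (total_carbons / C) ** 2
    return (1.726 + 2.779 * C + 1.716 * M3 + 1.564 * M + 4.204 * E3
            + 3.905 * E + 5.007 * P - 0.329 * D + 0.241 * G
            + 0.479 * V + 0.967 * T + 0.574 * S)


def tb_from_ybp(ybp):
    """Boiling point in kelvin from a boiling point number (eq 3)."""
    x = ybp ** (1.0 / 3.0)  # eq 3 is quadratic in the cube root of YBP
    return -16.802 * x ** 2 + 337.377 * x - 437.883


# n-Hexane: six carbons, all in the main chain, no substituents.
ybp_hexane = ybp_alkane(C=6, total_carbons=6)
tb_hexane = tb_from_ybp(ybp_hexane)
print(f"YBP = {ybp_hexane:.3f}, predicted TB = {tb_hexane:.1f} K")
```

For an unbranched alkane every substituent count is zero and S = 1, so YBP depends only on C, which is the linear behavior the boiling point number is designed to capture.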
METHOD AND RESULTS
From literature sources we obtained the boiling points of 250 linear and branched acyclic alkenes containing from 6 to 36 carbon atoms and boiling from 54 °C to 496 °C.8 We then determined the experimental YBP values for these alkenes from eq 5, where a = −16.802, b = 337.377, and c = −437.883.7 As shown in Figure 1, these YBP values are quite linear with the number of methylene units in a series of 1-alkenes spanning a wide range of molecular sizes.
$$Y_{BP} = \left[\frac{-b + \sqrt{b^{2} - 4a\,(c - T_B)}}{2a}\right]^{3} \tag{5}$$
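Equation 5 is the inverse of eq 3, obtained by solving the quadratic in YBP^(1/3). The short Python sketch below is only an illustration (the function names are ours): it implements both directions and checks that they round-trip. It assumes the negative signs of a and c given by eq 3.

```python
import math

# Coefficients of eq 3, written as a*x**2 + b*x + c with x = YBP**(1/3).
A_COEF, B_COEF, C_COEF = -16.802, 337.377, -437.883


def tb_from_ybp(ybp):
    """Boiling point in kelvin from a boiling point number (eq 3)."""
    x = ybp ** (1.0 / 3.0)
    return A_COEF * x ** 2 + B_COEF * x + C_COEF


def ybp_from_tb(tb_kelvin):
    """Boiling point number from a boiling point in kelvin (eq 5)."""
    discriminant = B_COEF ** 2 - 4.0 * A_COEF * (C_COEF - tb_kelvin)
    # Because a is negative, this choice of sign gives the smaller positive
    # root, which is the physically meaningful branch of the parabola.
    x = (-B_COEF + math.sqrt(discriminant)) / (2.0 * A_COEF)
    return x ** 3


# Round trip: YBP -> TB -> YBP should return the starting value.
start = 20.548
recovered = ybp_from_tb(tb_from_ybp(start))
print(f"start = {start}, recovered = {recovered:.6f}")
```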
substitution pattern                    Oi
monosubstituted                         −0.392
disubstituted 2-alkene                  −0.224
disubstituted 3- or higher alkene       −0.548
geminal disubstituted                   0.317
trisubstituted                          0.473
tetrasubstituted                        1.391
Next we sought to modify eq 4, which was developed specifically for alkanes, so that the YBP values of alkenes could also be calculated directly from molecular structure. We felt that this new equation should resemble eq 4 as closely as possible because the structural components that contribute to YBP values for alkanes should contribute in the same way to the YBP values of the saturated portion of alkenes. In other words, in this new equation not only the structural parameters but also their coefficients should be the same as in eq 4. Therefore we changed eq 4 only by introducing the parameter Oi, which measures the effect of various types of olefinic units on the boiling point of an alkene derived from an alkane with the same number of carbon atoms and the same molecular connectivity. The resulting equation is
$$Y_{BP} = O_i + 1.726 + 2.779C + 1.716M_3 + 1.564M + 4.204E_3 + 3.905E + 5.007P - 0.329D + 0.241G + 0.479V + 0.967T + 0.574S \tag{6}$$
Other than the new parameter Oi, all of the parameters and coefficients in eq 6 are identical to those in eq 4. However, we did alter slightly the interpretation of three of the parameters in order to reflect the differences in geometry between an olefin and an alkane. First, the V count was 0.25 for each vicinal interaction of an olefinic substituent with an adjacent allylic substituent, but V remained 1.0 for each vicinal interaction along a saturated section of the alkyl chain. Second, the parameter S was given a slightly different function for cis alkenes in order to account for the difference in shape of cis and trans isomers. For trans alkenes, S was calculated as described earlier for alkanes. For cis isomers, however, S was calculated as the square of the ratio of (the total number of carbons plus 0.25) to (the number of carbons in the longest chain). Finally, a methyl substituent on C3 of a 1-, 2-, or 3-alkene was counted as M (not M3), and a 2,2-dimethyl grouping in an allylic position was counted as G (not D).
To evaluate this approach, we divided the 250 compounds in the data set randomly into a training set containing 200 alkenes and a test set containing the other 50 compounds. Next we calculated YBP values of the alkenes in the training set with eq 6, using initial guesses for the Oi values for different patterns of olefinic substitution. Then we used the Solver add-in of Microsoft Excel to determine the values of Oi that produced the lowest average absolute deviation (AAD) for the compounds in the training set between YBP values calculated from boiling points with eq 5 and those predicted from structure using eq 6. The optimized values of Oi are shown in Table 1. As noted above, the parameters in eq 4 measure the effects of structural features on the dispersion forces of the alkanes. These same forces should operate in the saturated portions of an alkene, so the Oi values reflect the change in intermolecular forces of attraction resulting from the replacement of a single bond in an
Figure 2. Correlation of literature boiling points of 200 linear and branched acyclic alkenes in the training set with values predicted using YBP values calculated from structure via eq 6. The diagonal line represents perfect correlation of literature and predicted TB values.
Figure 3. Correlation of literature boiling points of 50 linear and branched acyclic alkenes in the test set with values predicted using YBP values calculated from structure via eq 6. The diagonal line represents perfect correlation of literature and predicted TB values.
alkane with a double bond. Some Oi values are negative, consistent with the slight decrease in dispersion of an alkene caused by removal of two hydrogen atoms from the corresponding alkane. The values of Oi generally increase with increasing substitution of the double bond, which can be explained on the basis of the greater planarity of an olefin having more alkyl substituents on the double bond in comparison with an isomer having fewer alkyl substituents on the double bond. The less negative Oi value for disubstituted 2-alkenes than for 3- or higher disubstituted alkenes is consistent with the general trend in boiling points observed for alkene positional isomers. A 2-alkene necessarily has a terminal segment with at least four coplanar carbon atoms, which should allow for greater intermolecular attraction at that end of the molecule. Alkenes with double bonds further along the main chain would also have coplanar segments, but the noncoplanar segments around them would limit their contribution to intermolecular attraction.
As an example of the application of eq 6, consider the calculation of YBP for (Z)-3,4-dimethylpent-2-ene. The double bond has three alkyl substituents, so the value of Oi is 0.473 from Table 1. The longest chain has five carbon atoms, so C = 5. There is one methyl group on C3 of this 2-alkene and another methyl substituent on C4, so M = 2. The methyl substituent on the olefinic carbon C3 is vicinal to the methyl group on C4, so V = 0.25. The M3, E3, E, P, D, G, and T parameters are all 0. Since this is a cis alkene, the S parameter is ((7 + 0.25)/5)² = 2.102. For this compound, therefore, YBP = 0.473 + 1.726 + (2.779 × 5) + (1.564 × 2) + (0.479 × 0.25) + (0.574 × 2.102) = 20.548. The YBP value calculated from the experimental boiling point and eq 5 is 20.549. We used eq 6 to calculate YBP values of the 200 alkenes in the training set from their structures.
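The worked example for (Z)-3,4-dimethylpent-2-ene can be reproduced with a short Python sketch of eq 6 followed by eq 3. The names `O_I`, `ybp_alkene`, and `tb_from_ybp` are illustrative choices of ours. The Oi magnitudes are those of Table 1; the negative signs on the monosubstituted and both disubstituted entries are an assumption based on the statement that some Oi values are negative, and only the trisubstituted value is exercised here.

```python
# Oi values from Table 1 (signs of the first three entries are assumed negative).
O_I = {
    "monosubstituted": -0.392,
    "disubstituted 2-alkene": -0.224,
    "disubstituted 3- or higher alkene": -0.548,
    "geminal disubstituted": 0.317,
    "trisubstituted": 0.473,
    "tetrasubstituted": 1.391,
}


def ybp_alkene(pattern, C, total_carbons, cis=False,
               M3=0, M=0, E3=0, E=0, P=0, D=0, G=0, V=0, T=0):
    """Boiling point number of an acyclic alkene from structural counts (eq 6)."""
    # For cis isomers, 0.25 is added to the total carbon count before taking
    # the ratio to the longest-chain length; trans isomers use the alkane rule.
    S = ((total_carbons + (0.25 if cis else 0.0)) / C) ** 2
    return (O_I[pattern] + 1.726 + 2.779 * C + 1.716 * M3 + 1.564 * M
            + 4.204 * E3 + 3.905 * E + 5.007 * P - 0.329 * D + 0.241 * G
            + 0.479 * V + 0.967 * T + 0.574 * S)


def tb_from_ybp(ybp):
    """Boiling point in kelvin from a boiling point number (eq 3)."""
    x = ybp ** (1.0 / 3.0)
    return -16.802 * x ** 2 + 337.377 * x - 437.883


# (Z)-3,4-dimethylpent-2-ene: C = 5, seven carbons in total, M = 2, V = 0.25.
ybp = ybp_alkene("trisubstituted", C=5, total_carbons=7, cis=True, M=2, V=0.25)
tb = tb_from_ybp(ybp)
print(f"YBP = {ybp:.3f}, predicted TB = {tb:.1f} K")
```

The caller supplies V already weighted (0.25 per olefinic-allylic interaction, 1.0 per saturated vicinal interaction), which keeps the function a direct transcription of eq 6.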
When these YBP values were used to predict TB values with eq 3, an excellent correlation was obtained, as shown in Figure 2. Here, R² = 0.999, the standard error was 1.80 K, and the average absolute deviation (AAD) between literature and predicted TB values was 1.27 K.9 When the Oi values determined for the training set (Table 1) were used to predict the YBP values and then the TB values of the 50 compounds in the test set, the correlation (Figure 3) of literature and predicted boiling points was comparable. The R² was 0.999, the standard error was 1.70 K, and the AAD was 1.34 K. For the entire set of 250 alkenes, the R² was 0.999, the standard error was 1.79 K, and the AAD was 1.29 K.
We should note that eq 3 was developed for a set of n-alkanes containing from 6 to 30 carbons, and it gave a very good correlation of YBP and TB values of alkanes over this range of carbon skeleton sizes. The present data set includes compounds having from 5 carbons up to 36 carbons in the main chain, so this work is an extension of the correlation developed for alkanes. While the results in Figures 1, 2, and 3 indicate that this extension is justifiable, further work will be needed to determine the upper molecular size limit for the application of the equations used here.
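The Oi values were fitted by minimizing the AAD between YBP values from eq 5 and from eq 6. The Python sketch below is not the Excel Solver procedure used in this work; it shows an alternative route that relies on Oi entering eq 6 as a purely additive term, in which case the AAD-minimizing Oi for each substitution pattern is a median of the differences between the experimental YBP and the rest of eq 6. The residual values are invented solely for illustration.

```python
from statistics import mean, median

# Hypothetical residuals: (YBP from eq 5) minus (eq 6 evaluated without Oi),
# grouped by substitution pattern. These numbers are made up for this sketch.
residuals = {
    "monosubstituted": [-0.50, -0.40, -0.30],
    "trisubstituted": [0.35, 0.45, 0.50, 0.60],
}

# For an additive offset, a median of the residuals in a class minimizes the
# mean absolute deviation for that class, so each Oi can be fitted separately.
fitted_oi = {pattern: median(values) for pattern, values in residuals.items()}


def aad(oi_by_pattern):
    """Average absolute deviation in YBP units over all compounds."""
    deviations = [abs(r - oi_by_pattern[pattern])
                  for pattern, values in residuals.items()
                  for r in values]
    return mean(deviations)


print(fitted_oi)
print(f"AAD = {aad(fitted_oi):.4f}")
```

The same `aad` helper, applied to literature and predicted TB values in kelvin instead of YBP residuals, would give the kind of AAD statistic quoted for Figures 2 and 3.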
DISCUSSION
The results reported here compare favorably with those obtained using the more complicated boiling point prediction methods mentioned in the Introduction. In particular, a neural network method incorporating five structure and connectivity parameters gave an AAD of 2.38 K for a test set of 16 compounds.10 Another neural network approach using four connectivity indices, a shape index, and dipole moments gave an AAD of 1.51 K for a set of 144 alkenes containing up to 10 carbons, while a Fuzzy ARTMAP method using the same data set produced an AAD of 0.95 K.11 A multiparametric linear regression (MLR) correlation based on eight molecular connectivity parameters produced an R² of 0.998 and an AAD of 1.30 K for a set of 107 acyclic alkenes containing 10 or fewer carbon atoms.12 A method incorporating both group contributions and topological
parameters gave an AAD of 2.10 K for a group of 119 alkenes.13 A six-variable linear model incorporating molecular connectivity and computed values of electron density surfaces that was developed for hydrocarbons in general gave an AAD of 2.32 K for a set of 52 acyclic alkenes containing from 6 to 20 carbon atoms.14 Because each of these other methods was optimized for a different data set, an exact comparison of the accuracy of the present work with earlier publications is not feasible. It seems reasonable to conclude, however, that the simple method reported here produces an AAD value that is comparable to those obtained by the other methods. Moreover, these other methods require input parameters that can be tedious to calculate without specialized software. In contrast, eq 6 requires only parameters that are obvious from a structure drawing, so the method reported here is accessible to any chemist.
Figure 4. Correlation of reported flash points of 124 linear and branched acyclic alkenes with TFP values calculated using YBP values predicted with eq 6. The diagonal line represents a perfect correlation of literature and predicted values, and the data points are sized to indicate the standard error in the correlation.
The results of the present study may also be used to predict the flash points (TFP) of alkenes. The flash point of a liquid is the lowest temperature at which the mixture of vapor and air above the substance can be ignited. For this reason, flash points are the most commonly cited measure of the fire hazard associated with the storage, transport, and use of flammable compounds. In spite of their importance, we were able to locate reported flash points for only about half of the alkenes in the current data set. It is partly because experimental flash points of alkenes are generally less available than are experimental boiling points that many methods have been proposed to predict TFP values. These methods typically use as inputs molecular connectivity indices or theoretical descriptors such as computed properties of an electron density surface.15–18 Recently we introduced the flash point number, NFP, as a new parameter to characterize the flammability hazards of organic compounds. The relationship of flash points and NFP values is19
$$T_{FP}(\mathrm{K}) = 23.369\,N_{FP}^{2/3} + 20.010\,N_{FP}^{1/3} + 31.901 \tag{7}$$
There is a strong relationship between NFP and YBP values of hydrocarbons:20
$$N_{FP} = 0.987\,Y_{BP} + 0.176D + 0.687T + 0.712B - 0.176 \tag{8}$$
Here D is the number of olefinic double bonds in the structure, T is the number of triple bonds, and B is the number of aromatic rings. For monoalkenes, D = 1 and both T and B are 0, so eq 8 reduces to NFP = 0.987 YBP. We used eq 8 and YBP values calculated with eq 6 to predict the NFP values of the compounds in our data set, and then we used those NFP values in eq 7 to calculate their flash points. Equation 7 was developed using a data set containing linear and branched alkanes with boiling points up to 589 K, and we observed an increasing deviation between literature and predicted values for alkenes boiling above 600 K. Equation 7 seems to work well for alkenes boiling below 600 K, however, as shown by the good correlation between predicted TFP values and the reported TFP values of 124 olefins boiling below 600 K for which reported flash points could be found (Figure 4).8 The R² of the correlation was 0.984, and the AAD was 4.59 K. This correlation compares favorably with that of other methods for predicting hydrocarbon flash points from structure, which usually give AAD values of 6–12 K.20,21 Therefore prediction of YBP values with eq 6 provides a way to estimate the flash points of acyclic alkenes from structure more easily and more accurately than with the other flash-point prediction methods.
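The two-step flash point calculation (eq 8, then eq 7) can be sketched in Python as follows. The function and argument names are ours, introduced only for this illustration, and the input YBP is the value obtained from eq 6 for (Z)-3,4-dimethylpent-2-ene in the worked example. The sketch makes no claim about that compound's experimental flash point.

```python
def nfp_from_ybp(ybp, double_bonds=0, triple_bonds=0, aromatic_rings=0):
    """Flash point number from a boiling point number (eq 8)."""
    return (0.987 * ybp + 0.176 * double_bonds + 0.687 * triple_bonds
            + 0.712 * aromatic_rings - 0.176)


def tfp_from_nfp(nfp):
    """Flash point in kelvin from a flash point number (eq 7)."""
    x = nfp ** (1.0 / 3.0)  # eq 7 is quadratic in the cube root of NFP
    return 23.369 * x ** 2 + 20.010 * x + 31.901


# Monoalkene: one double bond, no triple bonds or aromatic rings, so the
# +0.176 and -0.176 terms cancel and NFP = 0.987 * YBP.
ybp = 20.548
nfp = nfp_from_ybp(ybp, double_bonds=1)
tfp = tfp_from_nfp(nfp)
print(f"NFP = {nfp:.3f}, predicted TFP = {tfp:.1f} K")
```

Because eq 7 was developed for compounds boiling up to 589 K, a practical implementation would probably warn when the predicted boiling point exceeds about 600 K, where the text reports growing deviations.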
CONCLUSIONS
The method for predicting the boiling points of acyclic alkenes presented here not only is very simple to use but also is comparable in accuracy to neural network and MLR methods requiring connectivity or topological parameters. In addition, the YBP values calculated with eq 6 provide a useful way to predict the flash points of alkenes. Efforts to develop similar correlations for compounds containing other functional groups are currently underway.
ASSOCIATED CONTENT
Supporting Information. The data set of 250 compounds along with their literature boiling points and references, YBP values, counts of the structural parameters used in eq 6, predicted TB values, reported flash points and references, and predicted TFP values. This material is available free of charge via the Internet at http://pubs.acs.org.
AUTHOR INFORMATION
Corresponding Author
*Tel. 704-894-2544. Fax: 704-894-2709. E-mail: fecarroll@davidson.edu.
ACKNOWLEDGMENT
Financial and fellowship support from Davidson College and from Conselho Nacional de Desenvolvimento Científico e Tecnológico are gratefully acknowledged. F.H.Q. is affiliated with the Brazilian National Institute for Catalysis in Molecular and Nanostructured Systems (INCT-Catalysis) and the USP Consortium for Photochemical Technology.
REFERENCES
(1) For a discussion and leading references, see Katritzky, A. R.; Kuanar, M.; Slavov, S.; Hall, C. D.; Karelson, M.; Kahn, I.; Dobchev, D. A. Quantitative Correlation of Physical and Chemical Properties with Chemical Structure: Utility for Prediction. Chem. Rev. 2010, 110, 5714.
(2) Carroll, F. A. Perspectives on Structure and Mechanism in Organic Chemistry, 2nd ed.; John Wiley & Sons: Hoboken, NJ, 2010; Chapter 1.
(3) Methods to predict TB values with parameters that vary linearly with the number of structural increments can be somewhat accurate if only a small range of molecular sizes is considered. For example, see Joback, K. G.; Reid, R. C. Estimation of Pure-Component Properties from Group-Contributions. Chem. Eng. Commun. 1987, 57, 233. In our study, the Joback–Reid method produced very large errors for higher molecular weight compounds (e.g., 250 K for 1-hexatriacontene).
(4) For example, Nelson, S. D.; Seybold, P. G. Molecular Structure–Property Relationships for Alkenes. J. Mol. Graphics Modell. 2001, 20, 36 predicted the boiling points of alkenes using as parameters the number of carbon atoms in the structure, the square of the number of carbons, the square root of the number of carbons, the number of terminal methyl groups, the number of C–C–C single bond paths, the number of exterior double bonds, and the number of allylic carbons.
(5) Kinney, C. R. A System Correlating Molecular Structure of Organic Compounds with their Boiling Points. I. Aliphatic Boiling Point Numbers. J. Am. Chem. Soc. 1938, 60, 3032.
(6) Kinney, C. R. Calculation of Boiling Points of Aliphatic Hydrocarbons. Ind. Eng. Chem. 1940, 32, 559.
(7) Palatinus, J. A.; Sams, C. M.; Beeston, C. M.; Carroll, F. A.; Argenton, A. B.; Quina, F. H. Kinney Revisited: An Improved Group Contribution Method for the Prediction of Boiling Points of Acyclic Alkanes. Ind. Eng. Chem. Res. 2006, 45, 6860.
(8) The alkenes, their boiling points and flash points, and literature references are provided in the Supporting Information.
(9) These results were much more accurate than those obtained with a method for alkenes proposed by Kinney (ref 6). His approach required separate treatments for two families of olefins, depending upon whether the double bond is incorporated into the longest chain of carbon atoms in the structure. Moreover, the predicted values were reasonably accurate for lower molecular weight alkenes but produced significant errors for larger olefins. For the 250 alkenes in our data set, the Kinney method gave an AAD of 3.5 K.
(10) Liu, S.; Zhang, R.; Liu, M.; Hu, Z. Neural Network Topological Indices Approach to the Prediction of Properties of Alkene. J. Chem. Inf. Comput. Sci. 1997, 37, 1146.
(11) Espinosa, G.; Yaffe, D.; Cohen, Y.; Arenas, A.; Giralt, F. Neural Network Based Quantitative Structural Property Relations (QSPRs) for Predicting Boiling Points of Aliphatic Hydrocarbons. J. Chem. Inf. Comput. Sci. 2000, 40, 859.
(12) Hansen, P. J.; Jurs, P. C. Prediction of Olefin Boiling Points from Molecular Structure. Anal. Chem. 1987, 59, 2322 (the statistical parameters reported here were calculated from data in this reference).
(13) Li, H.; Higashi, H.; Tamura, K. Estimation of Boiling and Melting Points of Light, Heavy and Complex Hydrocarbons by Means of a Modified Group Vector Space Method. Fluid Phase Equilib. 2006, 239, 213.
(14) Wessel, M. D.; Jurs, P. C. Prediction of Normal Boiling Points of Hydrocarbons from Molecular Structure. J. Chem. Inf. Comput. Sci. 1995, 35, 68 (the AAD value reported here was computed from data in Table 1 of this reference).
(15) Suzuki, T.; Ohtaguchi, K.; Koide, K. A Method for Estimating Flash Points of Organic Compounds from Molecular Structures. J. Chem. Eng. Jpn. 1991, 24, 258.
(16) Gharagheizi, F.; Alamdari, R. F. Prediction of Flash Point Temperature of Pure Components Using a Quantitative Structure–Property Relationship Model. QSAR Comb. Sci. 2008, 27, 679.
(17) Patel, S. J.; Ng, D.; Mannan, M. S. QSPR Flash Point Prediction of Solvents Using Topological Indices for Application in Computer Aided Molecular Design. Ind. Eng. Chem. Res. 2009, 48, 7378.
(18) (a) Katritzky, A. R.; Petrukhin, R.; Jain, R.; Karelson, M. QSPR Analysis of Flash Points. J. Chem. Inf. Comput. Sci. 2001, 41, 1521. (b) Katritzky, A. R.; Stoyanova-Slavova, I. B.; Dobchev, D. A.; Karelson, M. QSPR Analysis of Flash Points: An Update. J. Mol. Graphics Modell. 2007, 26, 529.
(19) Carroll, F. A.; Lin, C.-Y.; Quina, F. H. Calculating Flash Point Numbers from Molecular Structure: An Improved Method for Predicting the Flash Points of Acyclic Alkanes. Energy Fuels 2010, 24, 392.
(20) Carroll, F. A.; Lin, C.-Y.; Quina, F. H. Improved Prediction of Hydrocarbon Flash Points from Boiling Point Data. Energy Fuels 2010, 24, 4854.
(21) Carroll, F. A.; Lin, C.-Y.; Quina, F. H. Simple Method to Evaluate and to Predict Flash Points of Organic Compounds. Ind. Eng. Chem. Res. 2011, 50, 4796.