Research: Science and Education
Further Analysis of Boiling Points of Small Molecules, CHwFxClyBrz
Guy Beauchamp
Department of Chemistry, Collège de l’Outaouais, Gatineau, PQ J8Y 6M5, Canada; [email protected]

One of the goals of physical chemistry is to better understand the relationships between the observed physical properties of molecules and the molecular or atomic parameters that may explain the macroscopic behavior. An article published in this Journal studied the boiling points (BP) of a number of molecules, CHwFxClyBrz, having a similar structure, thus eliminating the complication resulting from shape (1). A series of potential predictors were discussed to various degrees. That eloquent analysis identified no unique determinant of the boiling point of the molecules CHwFxClyBrz.

The purpose of this article is to present specific hypotheses that satisfactorily explain the boiling point variation of these molecules and then to analyze the model with the help of multiple linear regression, a data analysis tool. Several variables were considered: molecular mass, dipole moment, presence of a bromine atom, molecular volume (∝ parachor), molar refraction, and the uniqueness of the role of hydrogen. The 30 compounds used in the analysis are listed in Table 1.
Table 1. Physical Properties, Weak H-Bonding Categorical Factors, and Predicted Values of Boiling Points of Halogenated Methanes

| No. | Formula | Exp BP (a) / °C | Molar Refraction (b) | Weak H-Bonding F1 (c) | Weak H-Bonding F2 (c) | Weak H-Bonding F3 (c) | Dipole Moment (d) / D | Pred BP (a) / °C |
|---|---|---|---|---|---|---|---|---|
| 1 | CH2F2 | −52 | 6.6 | −1 | 0 | 1 | 1.97 | −59 |
| 2 | CH2FCl | −9 | 11.6 | −1 | 0 | 1 | 1.82 | −9 |
| 3 | CH2Cl2 | 40 | 16.6 | −1 | 0 | 1 | 1.80 | 42 |
| 4 | CH2ClBr | 69 | 19.5 | −1 | 0 | 1 | 1.66 | 71 |
| 5 | CH2Br2 | 97 | 22.4 | −1 | 0 | 1 | 1.43 | 100 |
| 6 | CHF3 | −84 | 6.5 | 0 | 1 | −1 | 1.65 | −91 |
| 7 | CHF2Cl | −40 | 11.5 | 0 | 1 | −1 | 1.42 | −41 |
| 8 | CHF2Br | −14 | 14.4 | 0 | 1 | −1 | 1.60 | −11 |
| 9 | CHFCl2 | 9 | 16.5 | 0 | 1 | −1 | 1.29 | 10 |
| 10 | CHFClBr | 36 | 19.4 | 0 | 1 | −1 | 1.50 | 39 |
| 11 | CHCl3 | 62 | 21.5 | 0 | 1 | −1 | 1.01 | 60 |
| 12 | CHFBr2 | 65 | 22.3 | 0 | 1 | −1 | 0.72 | 68 |
| 13 | CHCl2Br | 88 | 24.4 | 0 | 1 | −1 | 1.31 | 89 |
| 14 | CHBr3 | 149 | 30.2 | 0 | 1 | −1 | 0.99 | 148 |
| 15 | CF4 | −129 | 6.4 | 1 | 0 | 1 | 0.00 | −130 |
| 16 | CF3Cl | −81 | 11.4 | 1 | 0 | 1 | 0.50 | −79 |
| 17 | CF3Br | −59 | 14.3 | 1 | 0 | 1 | 0.65 | −50 |
| 18 | CF2Cl2 | −30 | 16.4 | 1 | 0 | 1 | 0.51 | −29 |
| 19 | CFCl3 | 24 | 21.4 | 1 | 0 | 1 | 0.49 | 21 |
| 20 | CF2Br2 | 25 | 22.2 | 1 | 0 | 1 | 0.66 | 29 |
| 21 | CCl4 | 76 | 26.4 | 1 | 0 | 1 | 0.00 | 72 |
| 22 | CCl3Br | 105 | 29.3 | 1 | 0 | 1 | 0.21 | 101 |
| 23 | CFBr3 | 108 | 30.1 | 1 | 0 | 1 | 0.58 | 109 |
| 24 | CCl2Br2 | 135 | 32.2 | 1 | 0 | 1 | 0.25 | 130 |
| 25 | CBr3Cl | 160 | 35.1 | 1 | 0 | 1 | 0.20 | 160 |
| 26 | CBr4 | 189 | 38.0 | 1 | 0 | 1 | 0.00 | 189 |
| 27 | CH3F | −78 | 6.7 | 0 | −1 | −1 | 1.85 | −76 |
| 28 | CH3Cl | −24 | 11.7 | 0 | −1 | −1 | 1.87 | −26 |
| 29 | CH3Br | 4 | 14.6 | 0 | −1 | −1 | 1.81 | 4 |
| 30 | CH4 | −161 | 6.8 | | | | 0.00 | |

(a) Experimental boiling points are taken from ref 13. Experimental and predicted boiling points were rounded to whole numbers.
(b) Values of the atomic bond refractions are taken from ref 12: C–H is 1.7; C–F is 1.6; C–Cl is 6.6; and C–Br is 9.5.
(c) See Note 1.
(d) Values of the dipole moments are taken from refs 7–11.
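The molar refraction column of Table 1 can be retraced by summing the bond refractions quoted in footnote (b). The sketch below is only an illustration of that additive arithmetic; the function name `molar_refraction` and the formula parser are assumptions introduced here, not part of the original article.

```python
import re

# Atomic bond refractions quoted in the Table 1 footnote (from ref 12).
BOND_REFRACTION = {"H": 1.7, "F": 1.6, "Cl": 6.6, "Br": 9.5}


def molar_refraction(formula: str) -> float:
    """Sum the four C-X bond refractions of a halomethane such as 'CH2FCl'.

    Hypothetical helper: it only understands formulas built from
    C, H, F, Cl, and Br with exactly one carbon and four substituents.
    """
    counts = {}
    # Two-letter symbols are listed first so 'Cl' is not read as 'C' + 'l'.
    for symbol, number in re.findall(r"(Cl|Br|C|H|F)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + (int(number) if number else 1)

    substituents = {s: n for s, n in counts.items() if s != "C"}
    if counts.get("C", 0) != 1 or sum(substituents.values()) != 4:
        raise ValueError(f"not a substituted methane: {formula!r}")

    total = sum(BOND_REFRACTION[s] * n for s, n in substituents.items())
    return round(total, 1)


if __name__ == "__main__":
    for name in ("CH2F2", "CH2FCl", "CCl2Br2", "CBr4", "CH4"):
        print(name, molar_refraction(name))
```

For CH2FCl, for example, the sum is 2(1.7) + 1.6 + 6.6 = 11.6, which matches the tabulated value.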
Journal of Chemical Education • Vol. 82 No. 12 December 2005 • www.JCE.DivCHED.org
Role of Hydrogen in These Compounds

A good starting point in the analysis of the molecules is the plot of the boiling points of the individual molecules as a function of the molar refraction, as was done in ref 1 (Figure 1). As the molar refraction increases, the boiling points fall neatly on almost parallel but separate lines of molecular families having 0, 1, 2, or 3 hydrogen atoms. The influence of hydrogen in this group of molecules appears to be much more specific than its mere presence or absence. Since the CH4 molecule did not fit on any of the lines, it was excluded from further analysis.

Figure 1. Plot of boiling point versus molar refraction for the compounds CHwFxClyBrz included in the study. [Axes: boiling point / °C versus molar refraction. Series: CH2FxClyBrz, CHFxClyBrz, CH3FxClyBrz, CFxClyBrz. Labeled points: CBr4, CHBr3, CH2Br2, CCl4, CH2F2, CF2Cl2, CF4, CH4.]

Interestingly, for a series of molecules having a similar molar refraction value (CF4, CHF3, CH2F2, CH3F; molar refraction = 6.4–6.7), plotting the boiling points against the number of hydrogen atoms in the molecule generates the plot shown in Figure 2. Similar plots are obtained for the CF3Cl, CHF2Cl, CH2FCl, CH3Cl and the CF3Br, CHF2Br, CH2FBr, CH3Br series. In Figure 2, the maximum boiling point is observed when two hydrogen atoms are present. The boiling points are lower when one or three hydrogen atoms are present in the molecules and are even lower when no hydrogen atom is present. This pattern strongly suggests an important role of H-bonding on the boiling point, since the greatest boiling point is observed for a 2:2 ratio of hydrogen:halogen atoms in a molecule. For a 1:3 or 3:1 hydrogen:halogen ratio, a reduced and similar number of H-bonds can be achieved, which correlates well with the boiling points. Finally, fully halogenated molecules, with no possibility of H-bonding, have even lower boiling points. The weak H-bonding contribution to the boiling point variation is categorical in nature, as is the inclusion of a bromine atom. Other explanatory variables, such as molar refraction, are continuous.

Figure 2. Variation of boiling point as a function of the number of hydrogen atoms in a series of molecules having a similar molar refraction value.

Multiple Linear Regression Analysis

Multiple linear regression (MLR) analysis (2) is a statistical technique that can be a powerful tool for ascertaining the validity of a mathematical model when two or more explanatory (independent) variables are simultaneously investigated to explain the variance of the boiling point (dependent variable or criterion). We thus obtain a linear equation of the type:

$$\mathrm{BP} = B_0 + B_1 X_1 + B_2 X_2 + \dots \tag{1}$$

When applied to the data of interest, B0 is the intercept, and B1 and B2 are the partial regression coefficients of the independent variables X1 and X2; BP is the predicted value of the boiling point for the individual compounds. A simple way to understand the process of MLR analysis is to examine the Venn diagram shown in Figure 3, which exemplifies the most commonly observed pattern for causal models. The circles represent the variances of each variable, and the overlapping areas correspond to the square of the correlation coefficient (r2) between these variables. The total area of boiling point covered by the X1 and X2 areas (A + B + C) represents the proportion of the boiling point’s variance accounted for by the two independent variables. Thus MLR specifically quantifies (as percent of total) each explanatory variable’s effect on the dependent variable.

Figure 3. Venn diagram of variables in MLR analysis. Area of circles represents variances. Overlapping areas (A, B, C, and D) depict degree of association (r2) between variables. Surface A illustrates the percent explanation of independent variable X1 on dependent variable boiling point not accounted for by variable X2. Surface B represents the percent of boiling point explained by both independent variables X1 and X2.
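The areas of the Venn diagram can be estimated numerically as differences between r2 values of nested least-squares fits. The sketch below uses NumPy on made-up toy data (the arrays `x1`, `x2`, and `bp` are hypothetical and are not the article's measurements); the helper `r_squared` is likewise an illustrative assumption.

```python
import numpy as np


def r_squared(y, *predictors):
    """Fraction of the variance of y explained by an ordinary least-squares
    fit on the given predictors plus an intercept."""
    y = np.asarray(y, dtype=float)
    columns = [np.ones_like(y)] + [np.asarray(p, dtype=float) for p in predictors]
    X = np.column_stack(columns)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    residual = y - X @ coef
    return 1.0 - (residual @ residual) / np.sum((y - y.mean()) ** 2)


# Hypothetical toy data: two correlated predictors and a dependent variable.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = 0.6 * x1 + rng.normal(size=200)      # deliberately correlated with x1
bp = 3.0 * x1 + 2.0 * x2 + rng.normal(size=200)

total = r_squared(bp, x1, x2)             # A + B + C
area_a = total - r_squared(bp, x2)        # explained by X1 only
area_c = total - r_squared(bp, x1)        # explained by X2 only
area_b = total - area_a - area_c          # shared by X1 and X2

print(f"A = {area_a:.3f}, B = {area_b:.3f}, C = {area_c:.3f}, total = {total:.3f}")
```

Areas A and C cannot be negative because adding a predictor never lowers r2. The shared area B is obtained by difference and can turn negative when one predictor acts as a suppressor, a case the simple Venn picture does not represent.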
In addition to validating predictor variables of the continuous type, the flexibility of MLR allows independent variables of a categorical nature, such as grouping factors, to be inserted into the analysis. If a dichotomy is tested, for example whether or not a molecule has one or more bromine atoms, then the contrast values −1 and +1 are assigned to the independent variable used to distinguish group membership. Expressing the weak H-bonding category required three variables, presented in Table 1 as F1, F2, and F3 (see Note 1).

The goals in using MLR analysis are (i) to reduce the number of explanatory variables by discarding those that, when added to the other variables already in the model, are not statistically associated with the boiling point (p value greater than 0.05) and (ii) to include the greatest number of molecules from the sample population. The stepwise order of insertion was determined by the relative size of each individual variable's r2 (Table 2). Simply stated, the largest contributor, molar refraction, was inserted first and then the molecular volume was added. If the molecular volume contribution was not statistically significant (p > 0.05), it was removed from the analysis, and the next most important variable (molecular mass) was added. This was repeated until all independent variables were tested. The final result eliminated most predictors, either because of a significant association with variables inserted before them (redundancy) or simply because they lacked any additional significant association with the boiling point. Following this analysis, two predictors stood out from the group: the molar refraction, associated with the molecule’s polarizability (see Note 2), and the weak H-bonding category (as three variables).
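The contrast coding described above can be written out as a small lookup. The F1, F2, and F3 values per hydrogen count are copied from Table 1; the function names are hypothetical, and the sign convention for the bromine dichotomy (+1 for bromine present) is an assumption, since the text only states that −1 and +1 are used.

```python
# Contrast codes (F1, F2, F3) for each number of hydrogen atoms, from Table 1.
H_CONTRASTS = {
    0: (1, 0, 1),     # CFxClyBrz
    1: (0, 1, -1),    # CHFxClyBrz
    2: (-1, 0, 1),    # CH2FxClyBrz
    3: (0, -1, -1),   # CH3FxClyBrz
}


def weak_h_bond_factors(n_hydrogen: int) -> tuple:
    """Return (F1, F2, F3) for a halomethane with 0 to 3 hydrogen atoms."""
    return H_CONTRASTS[n_hydrogen]


def bromine_contrast(n_bromine: int) -> int:
    """Dichotomous contrast for bromine; the +1/-1 assignment is assumed."""
    return 1 if n_bromine > 0 else -1


# Each factor sums to zero over the four groups and the factors are mutually
# orthogonal, so each one compares a distinct pair or set of groups.
columns = list(zip(*(H_CONTRASTS[h] for h in range(4))))
column_sums = [sum(col) for col in columns]
dot_products = [
    sum(a * b for a, b in zip(columns[i], columns[j]))
    for i in range(3) for j in range(i + 1, 3)
]
print(column_sums, dot_products)
```

Read this way, F1 contrasts the 0-H and 2-H groups, F2 contrasts the 1-H and 3-H groups, and F3 contrasts the even-H groups with the odd-H groups.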
Using contrast coding, each molecule was in turn compared to all the others to test it as a potential outlier (criterion: absolute Student t value of 3.0; ref 3); this procedure, known as studentized residual analysis, revealed no outliers in this group of molecules. The combined explanation of molar refraction, R, and weak H-bonding amounted to 99.8% of the variance exhibited by the boiling points. No other predictor was useful in defining the model. The final multiple regression equation obtained is

$$\mathrm{BP} = 10.09R - 34.36F_1 - 6.55F_2 - 4.96F_3 - 155.11\ {}^{\circ}\mathrm{C} \tag{2}$$
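A fit of this form can be retraced from the values in Table 1 with ordinary least squares. The sketch below uses NumPy's `np.linalg.lstsq`; it is not the software used in the article, the data are transcribed from the rounded values in Table 1 (CH4 excluded), and small differences from the published coefficients should therefore be expected.

```python
import numpy as np

# (formula, number of H atoms, molar refraction R, experimental BP / deg C),
# transcribed from Table 1; CH4 is left out, as in the article.
DATA = [
    ("CH2F2", 2, 6.6, -52), ("CH2FCl", 2, 11.6, -9), ("CH2Cl2", 2, 16.6, 40),
    ("CH2ClBr", 2, 19.5, 69), ("CH2Br2", 2, 22.4, 97),
    ("CHF3", 1, 6.5, -84), ("CHF2Cl", 1, 11.5, -40), ("CHF2Br", 1, 14.4, -14),
    ("CHFCl2", 1, 16.5, 9), ("CHFClBr", 1, 19.4, 36), ("CHCl3", 1, 21.5, 62),
    ("CHFBr2", 1, 22.3, 65), ("CHCl2Br", 1, 24.4, 88), ("CHBr3", 1, 30.2, 149),
    ("CF4", 0, 6.4, -129), ("CF3Cl", 0, 11.4, -81), ("CF3Br", 0, 14.3, -59),
    ("CF2Cl2", 0, 16.4, -30), ("CFCl3", 0, 21.4, 24), ("CF2Br2", 0, 22.2, 25),
    ("CCl4", 0, 26.4, 76), ("CCl3Br", 0, 29.3, 105), ("CFBr3", 0, 30.1, 108),
    ("CCl2Br2", 0, 32.2, 135), ("CBr3Cl", 0, 35.1, 160), ("CBr4", 0, 38.0, 189),
    ("CH3F", 3, 6.7, -78), ("CH3Cl", 3, 11.7, -24), ("CH3Br", 3, 14.6, 4),
]

# Weak H-bonding contrast codes (F1, F2, F3) per hydrogen count, from Table 1.
H_CONTRASTS = {0: (1, 0, 1), 1: (0, 1, -1), 2: (-1, 0, 1), 3: (0, -1, -1)}

refraction = np.array([row[2] for row in DATA], dtype=float)
bp = np.array([row[3] for row in DATA], dtype=float)
factors = np.array([H_CONTRASTS[row[1]] for row in DATA], dtype=float)

# Design matrix columns: R, F1, F2, F3, intercept.
X = np.column_stack([refraction, factors, np.ones(len(DATA))])
coef, *_ = np.linalg.lstsq(X, bp, rcond=None)

residuals = bp - X @ coef
dof = len(DATA) - X.shape[1]                     # 29 molecules, 5 parameters
std_error = np.sqrt(residuals @ residuals / dof)
r2 = 1.0 - (residuals @ residuals) / np.sum((bp - bp.mean()) ** 2)

for name, value in zip(("R", "F1", "F2", "F3", "intercept"), coef):
    print(f"{name:>9}: {value:8.2f}")
print(f"standard error = {std_error:.1f} deg C, r2 = {r2:.3f}")
```

Because the three contrast factors plus the intercept simply give each hydrogen-count family its own offset, the model amounts to four parallel lines in a plot of boiling point against molar refraction, which is the pattern described for Figure 1.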
From this equation, predicted values were calculated based on the independent variables (Table 1); the standard error of the predicted boiling points of the 29 molecules was 3.7 °C.

Table 2. Simple Linear Regression Data of Explanatory Variables

| Predictor Variable | r2 | Probability |
|---|---|---|
| Molecular mass | 0.71 | 0.000 |
| Dipole moment | 0.12 | 0.062 |
| Molar refraction | 0.91 | 0.000 |
| Inclusion of bromine atom | 0.39 | 0.000 |
| Molecular volume | 0.81 | 0.000 |
| Role of hydrogen atom | 0.08 | 0.574 |

NOTE: A probability value greater than 0.05 indicates no significant correlation with the boiling point.

Dipole Moment

The finding that the dipole moment did not further increase the model precision indicates that the boiling point variance is sufficiently well defined by the molecular polarizability within each hydrogen-atom-number subgroup. One could argue that dipole moments differ from one hydrogen-atom-number group to the other and could explain the observed boiling point variation. This correlation is generally true when we consider molecules with 0, 1, and 2 hydrogen atoms but breaks down in the CH3FxClyBrz group, where dipole moments remain elevated (or even increase) while boiling points decline (Figure 2). Interestingly, when the molar refraction and the dipole moment are combined in the MLR scheme, the standard error of the predicted boiling points increases to 10 °C.

Model Internal Validity

Testing the internal validity of the proposed model gave a further indication that the dipole moment does not dictate the boiling point of this series of molecules. The boiling points of five molecules for which values of the dipole moments were not found in the literature, and which were hence not used in building the model, were calculated using eq 2. The predicted boiling points (Table 3) are close to the experimental boiling points, indicating that the dipole moment was not needed to predict the boiling points.

Table 3. Experimental and Predicted Values of Boiling Points of Molecules Not Used in Building the Model

| No. | Formula | Molar Refraction | Exp BP / °C | Pred BP / °C |
|---|---|---|---|---|
| 1 | CHClBr2 | 27.3 | 120 | 119 |
| 2 | CH2FBr | 14.5 | 18 | 21 |
| 3 | CFCl2Br | 24.3 | 53 | 51 |
| 4 | CFClBr2 | 27.2 | 80 | 80 |
| 5 | CF2ClBr | 19.3 | −4 | 0 |

Discussion

It is unexpected that a seemingly complex property such as the boiling point can be characterized with two predictors. The polarizability is by far the most important predictor, followed by weak H-bonding. In the simple linear regression analysis of the molecules (Table 2), the role of hydrogen was not significantly related to the boiling point. Nonetheless, when this predictor was inserted in the MLR analysis with the molar refraction, its true influence on the boiling point emerged. That weak H-bonding has such a significant influence in building a model is surprising inasmuch as it is rarely considered in chemistry textbooks when explaining boiling point variations of molecules (4, 5).
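The internal-validity check reported in Table 3 can be retraced by evaluating eq 2 directly. In the sketch below the coefficients come from eq 2 and the molar refractions and experimental boiling points from Table 3; the function name `predict_bp` and the hydrogen-count lookup are illustrative assumptions.

```python
# Weak H-bonding contrast codes (F1, F2, F3) per hydrogen count, from Table 1.
H_CONTRASTS = {0: (1, 0, 1), 1: (0, 1, -1), 2: (-1, 0, 1), 3: (0, -1, -1)}


def predict_bp(molar_refraction: float, n_hydrogen: int) -> float:
    """Boiling point in deg C from eq 2 of the article."""
    f1, f2, f3 = H_CONTRASTS[n_hydrogen]
    return 10.09 * molar_refraction - 34.36 * f1 - 6.55 * f2 - 4.96 * f3 - 155.11


# (formula, number of H atoms, molar refraction, experimental BP / deg C), Table 3.
TABLE_3 = [
    ("CHClBr2", 1, 27.3, 120),
    ("CH2FBr", 2, 14.5, 18),
    ("CFCl2Br", 0, 24.3, 53),
    ("CFClBr2", 0, 27.2, 80),
    ("CF2ClBr", 0, 19.3, -4),
]

predicted = [round(predict_bp(r, h)) for _, h, r, _ in TABLE_3]
errors = [pred - exp for pred, (_, _, _, exp) in zip(predicted, TABLE_3)]

for (formula, _, _, exp), pred in zip(TABLE_3, predicted):
    print(f"{formula:8s} exp {exp:4d}  pred {pred:4d}")
```

The rounded predictions reproduce the Pred BP column of Table 3, and the largest deviation from experiment among the five molecules is 4 °C (CF2ClBr).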
Interestingly, even though the molar mass is somewhat correlated with the boiling point in a simple linear regression, it was discarded in the MLR scheme since it showed no association beyond that demonstrated by the two outstanding predictors. This empirical observation concurs with the fact that the molar mass is conceptually unrelated to the boiling point (6) and answers an important question asked in ref 1. As for the excluded CH4 molecule, just about any characteristic that distinguishes hydrogen from the halogens could be invoked to explain its failure to fit. One that immediately comes to mind is that hydrogen has no nonbonding electrons. The apparently significant effect of the presence or absence of bromine atoms observed in Table 2 arises because bromine-containing molecules simply have higher boiling points than other molecules; the inverse spurious relationship might have been observed for fluorine-containing molecules.

Conclusion

MLR analysis was useful in selecting the predictor variables that could significantly clarify the boiling point variation of CHwFxClyBrz molecules. The important combination of the polarizability and weak H-bonding was emphasized in the development of a linear model equation.

Notes

1. Weak H-bonding categorical factors F1, F2, and F3 are variables that relate to the comparison of subgroups of molecules having a different number of hydrogen atoms. For example, in F1, molecules having two hydrogen atoms (−1 value) are contrasted to molecules having no hydrogen atom (+1 value). Molecules having a zero value in F1 or F2 were not contrasted within that specific factor.

2. Molar refraction, R, is related to the molecule’s polarizability by the Lorentz–Lorenz equation,

$$R = \frac{4}{3}\pi N \alpha$$

where N is Avogadro’s number and α is polarizability.
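As a worked example of Note 2, the Lorentz–Lorenz relation can be inverted to estimate a polarizability from a tabulated molar refraction. The sketch assumes that the molar refractions in Table 1 are in cm3/mol (the article does not state the unit), so that α comes out as a polarizability volume in cm3; the function name is hypothetical.

```python
import math

AVOGADRO = 6.02214076e23  # mol^-1


def polarizability_volume(molar_refraction_cm3: float) -> float:
    """Invert R = (4/3) * pi * N * alpha to get alpha in cm^3 per molecule.

    Assumes the molar refraction is given in cm^3/mol.
    """
    return 3.0 * molar_refraction_cm3 / (4.0 * math.pi * AVOGADRO)


# CH4 with R = 6.8 (Table 1) gives a polarizability volume of about 2.7e-24 cm^3.
alpha_ch4 = polarizability_volume(6.8)
print(f"{alpha_ch4:.2e} cm^3")
```

Because α is directly proportional to R, ranking the molecules by molar refraction is equivalent to ranking them by polarizability, which is why the text treats the two interchangeably.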
Literature Cited

1. Laing, M. J. Chem. Educ. 2001, 78, 1544–1550.
2. Fox, J. Applied Regression Analysis, Linear Models, and Related Methods; Sage Publications: Thousand Oaks, CA, 1997.
3. Tabachnick, B. J.; Fidell, L. S. Using Multivariate Statistics, 4th ed.; Allyn and Bacon: New York, 2001; pp 66–67.
4. Chang, R. Chemistry, 6th ed.; McGraw-Hill: New York, 1998; p 448.
5. Bruice, P. Y. Organic Chemistry, 3rd ed.; Prentice Hall: Upper Saddle River, NJ, 2001; pp 82–85.
6. Rich, R. U. Chem. Educ. 2003, 7, 35–36.
7. CRC Handbook, 52nd ed.; Lide, D. R., Ed.; CRC Press: Boca Raton, FL, 1971; Section E-51.
8. Bock, E.; Iwacha, D. Can. J. Chem. 1968, 46, 523.
9. Miller, R. C.; Smyth, C. P. J. Chem. Phys. 1956, 24, 814.
10. Novak, I. Zeitschrift für Physikalische Chemie 1990, 167, 123.
11. Bauder, A.; Beil, A.; Luckhaus, D.; Müller, F.; Quack, M. J. Chem. Phys. 1997, 106, 7558.
12. Glasstone, S. Textbook of Physical Chemistry, 2nd ed.; MacMillan: London, 1956; pp 528–538, 543–545.
13. Dictionary of Organic Compounds, 5th ed.; Chapman and Hall: London, 1982.