Computer Programming Hyunyong Kim
in Physical Chemistry Laboratory
University of Missouri Columbia, 65201
I
Least-squares analysis
Introducing the method of digital computation to undergraduate chemistry majors has been emphasized in several articles of THIS JOURNAL for the last couple of years and needs no further elaboration. We have been requiring Fortran programming in our physical chemistry laboratory course, with a minimal introduction to the subject. It appears that there is an extremely high barrier to the first interaction between the student and the computer. After the first encounter, students learn to program quite smoothly. We therefore have our students write a very simple program and obtain the computed output after a one hour introduction to Fortran programming. The determination of the structure of carbonyl sulfide (OCS) from the experimentally determined moments of inertia by a least-squares analysis has been used as the first problem. The method of least-squares analysis is treated by most textbooks for physical chemistry laboratory (1, 8). A more rigorous discussion on this subject has also been published in THIS JOURNAL (3). For our student, the least-squares expression given in reference (1) is only slightly modified in form so that it is more adaptable to digital computation. Furthermore, the partial derivatives in a Taylor series expansion of the analytical function need not be explicitly expressed. This has a considerable advantage since the partial derivat.ives can become quite a lengthy expression as in the case of our problem. We present here the least-squares procedure in a more practical form and the stepwise programming exercises used in our course.
j(al, ma...m,, x,) is now expanded in Taylor series about a trial set of values, aJo( j = 1,2,. . .n).
+ (g)Aa,, + higher order terms
(3)
yt0 = f(alO,aso,.. . a,O, XJ are the calculated values using the trial set of parameters. A series expansion (eqn. (3)) with terms higher than firsborder neglected is substituted into eqn. (2) and with rearrangement, the following linear simultaneous equations are obtained
$ (2)(2)
A 4
=
c (G) " afi
(vi - UP)
Least Squares Procedure
Following the discussion given in reference (I), consider an analytical function y = f (a,, a%,. . .an,x), where a,, an, . . . a, are parameters to be determined. It is assumed that the number of independent parameters n is less than the number of experimental values m.
The residual sum of squares R as definedby
is to be minimized with respect to each aJ. y, represents the experimental values. The condition
yields n equations of the form
The solution of the above normal equations yields Aa,, the first-order correction to the trial set of values ajo. The new improved set of values (aJo AmJ) is
+
then used as a trial set, and the procedure is repeated until convergence is obtained. The iterative adjnstments are necessary except for the simple linear case, where the first iteration yields the unique solution. The expression given in eqn. (4) is not very suitable for computer programming. Rewriting eqn. (4) in matrix form, however, makes it more adaptable to digital computation. Students need not wait until they have had a course in linear algebra to use matrix operations. The only matrix operation required for this problem is matrix multiplication. In the matrix formulation (4), the residual sum of squares R is written as R = [Y - (yo AYN' . [Y - (Ja A Y ) ~ (5
+
120 1 lournol of Chemical Education
+
where the brackets denote a column matrix and the prime indicates the matrix transposition (the rows and columns interchanged) and the element of Ay matrix, Ayi represents the incremental change in f i introduced by the small changes in the parameters a,. It is assumed that none of the adjustment Ay, is unduly large and that the adjustments are in the nature of refinement rather than gxoss change. With further definition of the relation between Ay and Aa, it will become apparent that the above assumption is equivalent to the assumption made in eqn. (3) where terms higher then first-order are neglected. A relation between Ay and A a is definedby Ay = J
. Aa
(6)
The coefficient J is the so-called Jacobian matrix and has elements
Defining the error vector, e = y - 9,whose element ri is the difference between the experimental value y, and the calculated value y,O computed on the basis of the trial parameters ajo,and substituting eqn. (6) into eqn. (5) Upon differentiation and setting it equal to zero
and transposing terms (note that the J'. J is symmetric matrix) J ' . J , A a = J'.e
differences between the experimental values yi and ydQ calculated with the trial set of values, ago. After numerical substitution and matrix multiplication are carried out, eqn. (9) yields a set of first-order corrections which determine a new set of values for a,. These values are used to calculate new values of y,O and a new' Jacobian matrix. The cycle is repeated until convergence to a, is obtained. Convergence means that the elements of Am have become so small that the corresponding changes Ayi are negligible. Often in practice the Jacobian matrix need not be calculated for every iterative cycle. When the trial set of values is reasonably close to the converging set of values, a single Jacobian matrix can be used during the iterative operation. The final calculation, however, should include a calculation of the last Jacobian. The next step is to estimate the standard error of the parameters ort. The variance-covariance matrix of a is related to the Jacobian matrix by V ( a ) = (J1.J)-'.a'
(10)
where the elements of variance matrix are
LUA&s. . .im% J The variance (squares of the standard deviation) is the diagonal element in V(a) and the covariance is the offdiagonal element. c2 is the variance of the parent distribution and is estimated from the residual mean square sa
(7)
When eqn. (7) is expanded
The parameters in f are the final set of the calculated values. Further discussion on the error estimates and criteria for convergence are found in references (3) . . and
(4).
Students can readily show that eqn. (8) is equivalent to eqn. (4) by carrying out the matrix multiplication. The solution for Ao! is Aa = (J'. ] ) - I . J'e
(9)
(J' . J) -' is the inverse of (J' . J), and the elements of (J' . J)-' can be obtained by the use of Cramer's rule when the matrix J' . J is nonsingular (determinant is not zero) (4). It is convenient and sufficient in practice to construct an approximate Jacobian matrix for digital computation in the following way. A small increment (-1% of aQ)is given to the jth parameter aj, the other parameters remaining unchanged. This set of parameters is then used to calculate the m values of y,'. The m values of (y,' - ycO)/Aa, then correspond to the jth column of Jacobian matrix. The same increment is then given to each of the other orj in turn, and yi' calculated for each set of parameters. I n this way all of them X n elements of the Jacobian matrix are obtained. The elements of the e vector are obtained by taking the
In eqn. (I), the residual sum of squares R is defined on an absolute basis. It is sometimes desirable, however, to fit the data on a percentage basis. This is particularly true when the magnitudes of the a, values are spread over a wide range. Using W ,as a weighting factor, R can be redefined as
and, when a similar derivation is carried out as above, eqn. (9) becomes A a = (J'.W.J)-'.J'.W.e
where W i s an m X m diagonal matrix whose elements are l/y,=. Cornpuler Programming
The problem is to compute the least squares bond distances r(0-C) and r(C-S) of carbonyl sulfide from the four (4) moments of inertia determined by microwave spectroscopy. The moment of inertia expressed for a linear triatomic molecule is
Volume 47, Number 2, February
1970
/
121
matrix J'. J has the 2 X 2 dimension. Also multiply the J' matrix to the error vector s computed in exercise 4. The J'.e matrix is of course 2 X 1 matrix. Now you have the two normal equations in matrix form. Exercise 7. Compute All, and Ah using Cramer's rule.
Experimental data (5) to he used are Isotopes
0
C
S
MOI (amu A')
1 2 3
16.0000 16.0000 16.0000
12.0038 12.0038 13.0095
31.9822 33.9986 31.9822
83.1262 85.2092 83.3949
A , and Det. are cofactor and determinant of the J'. J
The following stepwise programming exercises have been used to introduce students to Fortran IV and the IBM 7040 computer. About 30 min per week were used to discuss each exercise. The calculations are simple enough so that a student can readily check the computer output by carrying out the calculations for each exercise on a desk calculator. Exercise 1. Given a set of 3 atomic masses, ml, m, and maatoms, compute the total mass. Exercise 2. Given a set of 3 atomic masses, ml, nh, ma and a set of bond distances, ZI = 1.20 PI and 12 = 1.50 A, compute the moment of inertia. Exercise 3. Compute all four moments of inertia I, using "DO LOOP." Use subscripted variables for moments of inertia, bond distances, and masses. Exercise 4. Include the experimental moments of inertia in your input along with the isotopic masses, and compute the 4 elements of the error vector s. el = It - I,Q,i = 1,2,3, and 4. Exercise 5. Compute the elements of the Jacobian matrix in the following way. First compute Il (h = 1.20, l2 = 1.50) as was done in exercise 3. Then com0.01, h = 1.50), and construct pute I,' (Il = 1.20 the first column of the Jacobian matrix, @I& 11) = (I,' - 1,0)/0.01. Next compute I" (11 = 1.20, la = 1.50 0.01). (I" - I,0)/0.01 is the second column of the Jacobian Matrix, (bI,/b12). Use doubly subscripted variables for the Jacobian matrix. Two "DO LOOPS" should be used; the inner "DO LOOP" extending over the four isotopic species and the outer "DO LOOP" covering the two bond parameters. Exercise 6. Multiply the 4 X 2 Jacobian matrix computed in the previous exercise by its transpose (the rows and columns interchanged). The multiplied
matrix. Compute the new bond distance 11 and la by adding the corrections Ah and Ah. Repeat the above adjustment procedure unt.il l, and k converge to within 0.001 A. You should find that they converge quite rapidly; only 3 cycles are needed. Exercise 8. Using the final set of parameters determined in exercise 7, compute the residual mean square s2 as given in eqn. (11) and then compute the standard deviations for 11and h according to eqn. (10). Most students finish the above problem by the middle of the semester and are then able to use the computer for the reduction of their experimental data. A repre sentative student program is available from Author on request. Our students are required to apply this technique to two additional problems: one is to calculate the bond distances and bond angles for ketene molecule from microwave data (6) by a least-squares analysis (four structural parameters are determined) and second to calculate the molecular force constants and thermodynamic properties from high resolution infrared data of HCN and DCN (7). These two prohlems are assigned during the second half of the semester.
+
Literahre Cited
+
SHOEMAXER, D. P., A N D OARLAND, C. W.. "Experiments in Physical Chemistry," (2nd ed.1 MeGrau-Hill Book Co.. Nev York, 1965. (2) DmleLs. F.. W I L L ~ A MJ. I , W., D D N D ~ RP. . . A ~ ~ B B R TR. T . A,. A N D C O ~ N W E LC G . ,D., "Experimental Physical Chemistry." (6th ad.) MoGraw-Hill Book Co.. New York, 1962. (3) WENWORTH. W. E.. J. CIIEM.E o u ~ . . 4 2 , 9 6 162 . (1965). (4) D n ~ m n N. , R.. AND SMITH, H.. "Applied Regremion Analysis," John Wiley 81 Sons, Ino.. New York. 1967. ( 5 ) T o w ~ e s ,C. H., H o m e ~ , . A .N., A N D ME~ILITT, P. R.. PAW. Ren.. (1)
l O d--,. RI . ---- l,."
74 ., Ill?.
(6) JOXNBON, H. R., AND STRAIPDBER~, N. W. P., J . Chem. PAya. 20, 687 (19521. (7) LTTTGE, R., J. CXEY. EDUC.,43.3 (1966).
+ A Simple, Student Photochemical Reactor A recent note concerning an inexpensive photochemical reactor [PAVLICK, J. W., J. CHEM.EDUC.,46, 568 (1969)l prompt3 me to describe a photochemical reactor we have successfully used at Cartland. I t consists of an erlenmeyer flask, a large beaker, and a couple of pieces of glass tubing. The size of the flask depends on the amount of material being photolyzed. It is Large enough so that the reaction mixture forms a. thin layer an the bottom of the flask. The beaker, which holds the cooling bath, is large enough to accommodate the WL flask, water inlet, and outlet. The flask and beaker are clamped on a ring stand so as to allow the water to circulate under the flask. The flask is irradiated from underneath with a standard conical spotlamp or sunlamp. The circulating water level is kept constant by attaching the water outlet tube to an aspirator. A rubber stopper with the appropriate number of holes can be used to connect gas inlets, addition funnels, etc., to the reaction flask. The sebup has been used successfully for a variety of simple photochemical reactions.
--
-
ROBERT SILBERMAN AT CORTLAND STATEUNIVERSITY COLLEQE CORTLAND, NEWYORK13045
122
/
Journal of Chemicol Education
Experimental data (5) to he used are Isotopes
0
C
S
MOI (amu A')
1 2 3
16.0000 16.0000 16.0000
12.0038 12.0038 13.0095
31.9822 33.9986 31.9822
83.1262 85.2092 83.3949
The following stepwise programming exercises have been used to introduce students to Fortran IV and the IBM 7040 computer. About 30 min per week were used to discuss each exercise. The calculations are simple enough so that a student can readily check the computer output by carrying out the calculations for each exercise on a desk calculator. Exercise 1. Given a set of 3 atomic masses, ml, m, and ma atoms, compute the total mass. Exercise 2. Given a set of 3 atomic masses, ml, nh, ma and a set of bond distances, ZI = 1.20 PI and 12 = 1.50 A, compute the moment of inertia. Exercise 3. Compute all four moments of inertia I, using "DO LOOP." Use subscripted variables for moments of inertia, bond distances, and masses. Exercise 4. Include the experimental moments of inertia in your input along with the isotopic masses, and compute the 4 elements of the error vector s. el = It - I,Q,i = 1,2,3, and 4. Exercise 5. Compute the elements of the Jacobian matrix in the following way. First compute Il (h = 1.20, l2 = 1.50) as was done in exercise 3. Then com0.01, h = 1.50), and construct pute I,' (Il = 1.20 the first column of the Jacobian matrix, @I& 11) = (I,' - 1,0)/0.01. Next compute I" (11 = 1.20, la = 1.50 0.01). (I" - I,0)/0.01 is the second column of the Jacobian Matrix, (bI,/b12). Use doubly subscripted variables for the Jacobian matrix. Two "DO LOOPS" should be used; the inner "DO LOOP" extending over the four isotopic species and the outer "DO LOOP" covering the two bond parameters. Exercise 6. Multiply the 4 X 2 Jacobian matrix computed in the previous exercise by its transpose (the rows and columns interchanged). The multiplied
+
+
122
/
Journal of Chemicol Education
matrix J'. J has the 2 X 2 dimension. Also multiply the J' matrix to the error vector s computed in exercise 4. The J'.e matrix is of course 2 X 1 matrix. Now you have the two normal equations in matrix form. Exercise 7. Compute All, and Ah using Cramer's rule.
A , and Det. are cofactor and determinant of the J'. J matrix. Compute the new bond distance 11 and la by adding the corrections Ah and Ah. Repeat the above adjustment procedure unt.il l, and k converge to within 0.001 A. You should find that they converge quite rapidly; only 3 cycles are needed. Exercise 8. Using the final set of parameters determined in exercise 7, compute the residual mean square s2 as given in eqn. (11) and then compute the standard deviations for 11and h according to eqn. (10). Most students finish the above problem by the middle of the semester and are then able to use the computer for the reduction of their experimental data. A repre sentative student program is available from Author on request. Our students are required to apply this technique to two additional problems: one is to calculate the bond distances and bond angles for ketene molecule from microwave data (6) by a least-squares analysis (four structural parameters are determined) and second to calculate the molecular force constants and thermodynamic properties from high resolution infrared data of HCN and DCN (7). These two prohlems are assigned during the second half of the semester. Literahre Cited SHOEMAXER, D. P., A N D OARLAND, C. W.. "Experiments in Physical Chemistry," (2nd ed.1 MeGrau-Hill Book Co.. Nev York, 1965. (2) DmleLs. F.. W I L L ~ A MJ. I , W., D D N D ~ RP. . . A ~ ~ B B R TR. T . A,. A N D C O ~ N W E LC G . ,D., "Experimental Physical Chemistry." (6th ad.) MoGraw-Hill Book Co.. New York, 1962. (3) WENWORTH. W. E.. J. CIIEM.E o u ~ . . 4 2 , 9 6 162 . (1965). (4) D n ~ m n N. , R.. AND SMITH, H.. "Applied Regremion Analysis," John Wiley 81 Sons, Ino.. New York. 1967. ( 5 ) T o w ~ e s ,C. H., H o m e ~ , . A .N., A N D ME~ILITT, P. R.. PAW. Ren.. (1)
l O d--,. RI . ---- l,."
74 ., Ill?.
(6) JOXNBON, H. R., AND STRAIPDBER~, N. W. P., J . Chem. PAya. 20, 687 (19521. (7) LTTTGE, R., J. CXEY. EDUC.,43.3 (1966).