Lecture graphic aids for least squares analysis - Journal of Chemical

There is a need to convey the fundamental concept of a variance surface and the problem of locating its minimum more clearly to students...
Lecture Graphic Aids for Least-Squares Analysis

Giles Henderson
Eastern Illinois University, Charleston, IL 61920

The ultimate goal of many laboratory measurements is often to extract the best or most probable value of one or more unknown parameters from several observations. In the simplest case, the unknown quantity, $y$, can be measured directly, such as the length of a rod or the weight of a sample. We wish to find a single value of $y$, based on $N$ measurements, which best represents the true value. The quality of this particular choice, $y^*$, may be expressed as the sum of squared residuals,

$$R' = \sum_{i=1}^{N} (y_i - y^*)^2 \tag{1}$$

where the best quality value of $y^*$ corresponds to a minimum $R'$. For example, if we made several measurements of the length of a rod, we could calculate and plot $R'$ for several choices of $y^*$. This plot is clearly a parabola with a unique minimum in the ($y^*$, $R'$) space. By applying the least-squares principle (1) it can be shown that the unique $y^*$ value, corresponding to a minimum $R'$, is the most probable value of $y$ and turns out to also be the mean value of the data set, i.e., the average value of the length measurements, providing the errors in measurement are random.

In many cases we rely on, or wish to test, some theoretical model that relates the parameters of interest to the experimental observables: the molar absorptivity to percent transmittance, the heat of vaporization to vapor pressures, the electric dipole moment to dielectric constants, the activation energy to reaction rates, molecular structure to microwave spectra, or in general $y = f(x, a_0, a_1, \ldots)$, where $a_i$ is one of the unknown parameters of interest and $x$ is an independent variable, such as concentration or temperature, that may be systematically varied between measurements. In order to evaluate the quality of the model, we need to ensure that the number of observations is greater than the number of unknown parameters. Our task then is to find a unique set of parameter values that will bring the model function into best agreement with the experimental observations. The least-squares analysis is the most common solution to

this problem. This once tedious method has become a routine procedure in undergraduate laboratories with the ready availability of digital computers. The basic linear regression theory (2-5) and statistical significance of optimized parameters (6-9) have been discussed in this Journal. The importance of proper weighting techniques (10-17) has been extensively developed, along with analysis methods for which experimental errors exist in both the dependent and the independent variables (18, 19). Both direct grid search (20) and graphical methods (21) have been presented in addition to matrix (22) and iterative computer algorithms (23, 24). The ordinary least-squares method is developed in most of our current physical chemistry laboratory texts and in some lecture texts, e.g., Noggle (25).

In spite of this attention, there remains a need to convey the fundamental concept of a variance surface and the problem of locating its minimum more clearly to our students. One of the difficulties in developing the basic concepts of linear regression is the lack of appropriate three-dimensional illustrations of how the quality of a fit depends on the choice of regression parameters. A survey of over 100 statistics texts revealed only a single example (26) of a three-dimensional figure depicting what is going on in a simple least-squares optimization of two parameters. Surprisingly, very few had even two-dimensional figures; see for instance (27). Thus digital graphics are employed here to prepare an accurate projection of a typical three-dimensional variance surface. Contours of the variance topography are then overlaid on plots of the normal equations to yield a conceptually clear graphic description of the least-squares algorithm.
These figures, it is hoped, will prove to be useful teaching aids, as have other three-dimensional representations of PVT phase diagrams, potential energy surfaces, two-dimensional NMR spectra, two-dimensional fluorescence spectra (28), time-dependent quantum probability surfaces (29, 30), etc.

In the simplest application, we have a set of ($x$, $y$) data points in which only the values of the independent variable are known exactly and all the $y$ measurements are of uniform precision. If these points are to fit an equation of the form

$$y = a_0 + a_1 x \tag{2}$$
Volume 65

Number 11 November 1986

1001

our problem is to optimize the values of the slope ($a_1$) and intercept ($a_0$) parameters to give the best quality fit. Figures 1 and 2 show the effects of varying one of these parameters as the other is held constant for a typical data set. The optimum or "least-squares" fit is obtained for the values of $a_0$ and $a_1$ that minimize the sum of residuals squared:

$$\begin{aligned} R &= \sum (y_i - a_0 - a_1 x_i)^2 \\ &= \sum y_i^2 - 2a_0 \sum y_i + N a_0^2 - 2a_1 \sum x_i y_i + 2a_0 a_1 \sum x_i + a_1^2 \sum x_i^2 \end{aligned} \tag{3}$$
where $N$ is the number of data points (10 in our example). Note that the variance, $S^2$, is obtained by normalizing the sum of the residuals squared with respect to the number of degrees of freedom:

$$S^2 = R/(N - 1) \tag{4}$$

It is particularly instructive to examine a plot of the $R$ function, eq 3, in the ($a_0$, $a_1$) space; see Figure 3. Here the domains of $a_0$ and $a_1$ have been scaled differently for convenience in graphing. The surface has been truncated at $R = 200$, and surface contours have been plotted in the ($a_0$, $a_1$) plane for $R$ = 72, 104, 136, 168, and 200. Figures 4 and 5 depict cross sections of the unnormalized variance surface. In the cross section of Figure 4, the slope ($a_1$) is held constant and we allow the intercept ($a_0$) to vary, as in Figure 1. In Figure 5, the cross section shows the effect of varying the slope ($a_1$) as the intercept ($a_0$) is held constant, as in Figure 2. These parabolic cross sections clearly reveal the quadratic dependence of the variance (quality of fit) on our choice of parameters and closely relate to the two-dimensional $R'$ plot described above for the simple one-parameter case.

We now seek the least-squares coordinates ($a_0$, $a_1$) that minimize the variance and maximize the quality of our fit. It is evident that the minima in the variance cross sections correspond to coordinates for which the tangent or first derivative is zero: a minimum in the Figure 4 cross section is given by the condition,

$$\left(\frac{\partial R}{\partial a_0}\right)_{a_1} = -2\sum y_i + 2N a_0 + 2a_1 \sum x_i = 0 \tag{5}$$
Figure 1. Effect of varying the intercept ($a_0$) of the least-squares line with the slope ($a_1$) held constant. The upper and lower limits correspond to the domain of $a_0$ in Figures 3 and 4.
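The constant-slope cross section can also be explored numerically. The following Python sketch is only an illustration, not the author's Fortran software: the ten data points are hypothetical (the original data set is not listed), and the function names are invented for this example. It evaluates $R$ both as a direct sum of squared residuals and from the expanded quadratic form, then samples a cut through the surface at fixed slope, centered on the intercept at which the derivative with respect to $a_0$ vanishes.

```python
# Hypothetical data set (NOT the author's data): ten points scattered
# about the line y = 12.5 + 9.0 x.
X = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
Y = [22.3, 29.8, 40.6, 47.9, 58.4, 65.1, 76.9, 84.0, 93.2, 103.3]


def residual_sum(a0, a1, xs=X, ys=Y):
    """R as the direct sum of squared residuals for the line y = a0 + a1*x."""
    return sum((y - a0 - a1 * x) ** 2 for x, y in zip(xs, ys))


def residual_sum_expanded(a0, a1, xs=X, ys=Y):
    """R from the expanded quadratic form in a0 and a1."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    syy = sum(y * y for y in ys)
    return (syy - 2 * a0 * sy + n * a0 ** 2
            - 2 * a1 * sxy + 2 * a0 * a1 * sx + a1 ** 2 * sxx)


def best_intercept_at_fixed_slope(a1, xs=X, ys=Y):
    """Intercept at which dR/da0 = 0 when the slope a1 is held constant."""
    return (sum(ys) - a1 * sum(xs)) / len(xs)


def cross_section(a1, offsets, xs=X, ys=Y):
    """(a0, R) pairs along a constant-slope cut, centered on its minimum."""
    a0_min = best_intercept_at_fixed_slope(a1, xs, ys)
    return [(a0_min + d, residual_sum(a0_min + d, a1, xs, ys)) for d in offsets]


if __name__ == "__main__":
    for a0, r in cross_section(9.0, [-4.0, -2.0, 0.0, 2.0, 4.0]):
        print(f"a0 = {a0:7.3f}   R = {r:8.3f}")
```

Along such a cut, $R$ rises by $N d^2$ when the intercept is displaced by $d$ from the cross-section minimum, which is the parabolic behavior the cross-section figures illustrate.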

The ($a_0$, $a_1$) coordinates that satisfy this condition (eq 5) fall on a straight line, which is plotted in Figure 6 along with the unnormalized variance contours. This line connects the ($a_0$, $a_1$) points for which a horizontally level rule held parallel to the $a_0$ axis, satisfying $(\partial R/\partial a_0)_{a_1} = 0$, is tangent to the surface in Figure 3 or 4. Similarly, if we hold $a_0$ constant and allow $a_1$ to vary, as in Figure 5, a variance minimum is given by

$$\left(\frac{\partial R}{\partial a_1}\right)_{a_0} = -2\sum x_i y_i + 2a_0 \sum x_i + 2a_1 \sum x_i^2 = 0 \tag{6}$$
The ($a_0$, $a_1$) coordinates that satisfy this condition define a second straight line, which is also plotted in Figure 6. These points are obtained by holding our horizontally level rule tangent to the surface in Figure 3 or 5 and parallel to the $a_1$ axis, $(\partial R/\partial a_1)_{a_0} = 0$. These normal equations (eqs 5 and 6), the corresponding horizontal tangent plane, and the normal line all intersect at the point of minimum variance. In this case, the point of intersection can be found analytically by solving the normal equations simultaneously for $a_0$ and $a_1$. This yields the well-known linear regression parameters, the least-squares values for the slope and intercept:

$$a_1 = \frac{N \sum x_i y_i - \sum x_i \sum y_i}{N \sum x_i^2 - \left(\sum x_i\right)^2} \tag{7}$$

$$a_0 = \frac{\sum y_i \sum x_i^2 - \sum x_i \sum x_i y_i}{N \sum x_i^2 - \left(\sum x_i\right)^2} \tag{8}$$
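The simultaneous solution of the two normal equations is short enough to check by hand or in a few lines of code. The Python sketch below is an illustration under stated assumptions: the data are hypothetical, the function names are invented here, and Python stands in for the Fortran used in the original work. It computes the closed-form slope and intercept from the sums over the data and then confirms that both partial derivatives of $R$ vanish at that point.

```python
def least_squares_line(xs, ys):
    """Closed-form intercept a0 and slope a1 for the line y = a0 + a1*x."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    denom = n * sxx - sx ** 2
    if denom == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    a1 = (n * sxy - sx * sy) / denom
    a0 = (sy * sxx - sx * sxy) / denom
    return a0, a1


def normal_equation_residuals(a0, a1, xs, ys):
    """Return (dR/da0, dR/da1); both are zero at the least-squares point."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    d_a0 = -2 * sy + 2 * n * a0 + 2 * a1 * sx
    d_a1 = -2 * sxy + 2 * a0 * sx + 2 * a1 * sxx
    return d_a0, d_a1


# Hypothetical data (NOT the author's data set).
X = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
Y = [22.3, 29.8, 40.6, 47.9, 58.4, 65.1, 76.9, 84.0, 93.2, 103.3]

if __name__ == "__main__":
    a0, a1 = least_squares_line(X, Y)
    print("intercept:", a0, "slope:", a1)
    print("normal-equation residuals:", normal_equation_residuals(a0, a1, X, Y))
```

For points that lie exactly on a line, such as (0, 1), (1, 3), (2, 5), (3, 7), the function returns an intercept of 1 and a slope of 2.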

Journal of Chemical Education

Figure 2. Effect of varying the slope ($a_1$) of the least-squares line with the intercept ($a_0$) held constant. The upper and lower limits correspond to the domain of $a_1$ in Figures 3 and 5.

The subsequent least-squares function is depicted as the center line in both Figures 1 and 2. These figures were prepared with a Nicolet 20 DXB computer using Fortran software and a Hewlett Packard model 7470A digital plotter. The three-dimensional plotting algorithm employed in Figures 3, 4, and 5 has been described previously (28). The ($a_0$, $a_1$) coordinates of the variance contours were located with a polar search in which regular increments in the search angle, $\phi$, were followed by Newton-Raphson radial iterations to the desired level. The final ($r$, $\phi$)

Figure 3. The three-dimensional least-squares unnormalized variance surface given by eq 3 for the data in Figures 1 and 2, presented in both opaque and transparent blocks. The domains of $a_0$ and $a_1$ are $2.96 \le a_0 \le 22.05$ and $8.25 \le a_1 \le 9.79$, respectively. The surface has been truncated at $R = 200$ and contours are plotted for $R$ = 72, 104, 136, and 200.

Figure 6. The least-squares coordinates, $a_0 = 12.5 \pm 1.5$ and $a_1 = 9.0 \pm 0.1$, of the point of minimum variance are given by the simultaneous solution of the normal equations. The contour perimeter corresponds to the $a_0$ and $a_1$ domains of Figure 3.

polar search coordinates were then transformed to ($a_0$, $a_1$) Cartesian plotting coordinates. The apparent segmentation in certain regions of the contours is an artifact resulting from the more than 10-fold difference in the scaling of the slope and intercept ($a_1$ and $a_0$) axes. The author has found the figures presented here particularly useful as graphic lecture aids. These figures, along with equations 2-8, are available from the author on either overhead transparencies or 35-mm slides. Please enclose $10.00 payable to Eastern Illinois University to cover our production and postage costs.
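The polar contour search lends itself to a compact illustration. The Python sketch below is one possible reading of that procedure, not a reconstruction of the author's Fortran code: the data are hypothetical, the function and parameter names (`contour_points`, `scale0`, `scale1`) are invented here, and the choice to center the search on the least-squares point and to scale the two axes differently are assumptions. For each search angle, a Newton-Raphson iteration on the radius $r$ locates the point where $R$ equals the requested contour level, and the result is converted to Cartesian ($a_0$, $a_1$) coordinates.

```python
import math

# Hypothetical data (NOT the author's data set).
X = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
Y = [22.3, 29.8, 40.6, 47.9, 58.4, 65.1, 76.9, 84.0, 93.2, 103.3]


def residual_sum(a0, a1, xs, ys):
    """Sum of squared residuals R for the line y = a0 + a1*x."""
    return sum((y - a0 - a1 * x) ** 2 for x, y in zip(xs, ys))


def gradient(a0, a1, xs, ys):
    """Partial derivatives (dR/da0, dR/da1)."""
    g0 = -2.0 * sum(y - a0 - a1 * x for x, y in zip(xs, ys))
    g1 = -2.0 * sum(x * (y - a0 - a1 * x) for x, y in zip(xs, ys))
    return g0, g1


def least_squares_line(xs, ys):
    """Closed-form intercept and slope at the minimum of R."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    denom = n * sxx - sx ** 2
    return (sy * sxx - sx * sxy) / denom, (n * sxy - sx * sy) / denom


def contour_points(xs, ys, level, scale0=1.0, scale1=0.1,
                   n_angles=36, tol=1e-12, max_iter=50):
    """Locate points with R = level by a polar search about the minimum.

    scale0 and scale1 are assumed axis scale factors, so that one radial
    unit spans comparable plotted distances along a0 and a1.
    """
    a0_min, a1_min = least_squares_line(xs, ys)
    if level <= residual_sum(a0_min, a1_min, xs, ys):
        raise ValueError("contour level must exceed the minimum of R")
    points = []
    for k in range(n_angles):
        phi = 2.0 * math.pi * k / n_angles      # regular angular increments
        d0, d1 = scale0 * math.cos(phi), scale1 * math.sin(phi)
        r = 1.0  # start away from r = 0, where dR/dr vanishes
        for _ in range(max_iter):
            a0, a1 = a0_min + r * d0, a1_min + r * d1
            f = residual_sum(a0, a1, xs, ys) - level
            g0, g1 = gradient(a0, a1, xs, ys)
            step = f / (g0 * d0 + g1 * d1)      # Newton-Raphson update in r
            r -= step
            if abs(step) <= tol * abs(r):
                break
        # Transform the final (r, phi) to Cartesian (a0, a1) coordinates.
        points.append((a0_min + r * d0, a1_min + r * d1))
    return points
```

Because $R$ is quadratic along any ray from the minimum, the radial iteration reduces to the familiar square-root recurrence and converges in a handful of steps from any positive starting radius.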

Figure 4. Constant-slope cross section of the least-squares variance surface presented in both opaque and transparent blocks. This plot clearly reveals the quadratic dependence of the quality of fit on the choice of intercept as we hold the slope constant, as in Figure 1.

Acknowledgment

The author wishes to express his gratitude for the many useful suggestions by one of the reviewers.

Literature Cited

1. Margenau, H.; Murphy, G. M. The Mathematics of Physics and Chemistry; Van Nostrand: Princeton, NJ, 1956; p 504.
2. Acton, F. S. J. Chem. Educ. 1953, 30, 128.
3. Nelson, L. S. J. Chem. Educ. 1956, 33, 126.
4. Wentworth, W. E. J. Chem. Educ. 1965, 42, 96.
5. Wentworth, W. E. J. Chem. Educ. 1965, 42, 162.
6. Hancock, C. K. J. Chem. Educ. 1965, 42, 608.
7. Davis, W. H.; Pryor, W. A. J. Chem. Educ. 1976, 53, 265.
8. Sands, D. E. J. Chem. Educ. 1977, 54, 90.
9. Pattengill, M. D.; Sands, D. E. J. Chem. Educ. 1979, 56, 244.
10. Anderson, K. P.; Snow, R. L. J. Chem. Educ. 1967, 44, 758.
11. Smith, E. D.; Mathews, D. M. J. Chem. Educ. 1967, 44, 757.
12. Kohman, T. P. J. Chem. Educ. 1970, 47, 657.
13. Pollnow, G. F. J. Chem. Educ. 1971, 48, 518.
14. Musulin, B. J. Chem. Educ. 1973, 50, 79.
15. Sands, D. E. J. Chem. Educ. 1974, 51, 473.
16. Christian, S. D.; Lane, E. H.; Garland, F. J. Chem. Educ. 1974, 51, 475.
17. de Levie, R. J. Chem. Educ. 1986, 63, 10.
18. Irvin, J. A.; Quickenden, T. I. J. Chem. Educ. 1983, 60, 711.
19. Kalantar, A. H. J. Chem. Educ. 1987, 64, 28.
20. Beerey, J. G.; Berke, I.; Callan, J. R. J. Chem. Educ. 1968, 45, 728.
21. Christian, S. D. J. Chem. Educ. 1965, 42, 804.
22. Kim, H. J. Chem. Educ. 1970, 47, 121.
23. Trindle, C. J. Chem. Educ. 1983, 60, 566.
24. Christian, S. D.; Tucker, E. E. J. Chem. Educ. 1984, 61, 788.
25. Noggle, J. H. Physical Chemistry; Little, Brown: Boston, 1985; p 902.
26. Box, G. E. P.; Hunter, W. G.; Hunter, J. S. Statistics for Experimenters: An Introduction to Design, Data Analysis, and Model Building; Wiley: New

Figure 5. Constant-intercept cross section of the least-squares variance surface presented in both opaque and transparent blocks. This plot reveals the quadratic dependence of the quality of fit on the choice of slope as we hold the intercept constant, as in Figure 2.
