Least squares: Linear and non-linear - Journal of Chemical Education

The use of modern computers affords the instructor an excellent opportunity to introduce highly refined techniques of data treatment...
Boris Musulin, Southern Illinois University, Carbondale, 62901


Pollnow¹ recently illustrated some differences which occur when a specific problem (fitting of data to the Clausius-Clapeyron Equation) is treated by the techniques of linear least squares and non-linear least squares. He points out that the latter method is more accurate since it minimizes directly the deviations in the measured variable. The purpose of this note is to call attention to two implicit variables which affect the choice of a least squares method.

The first implicit variable is choice of functional form. For example, the Clausius-Clapeyron Equation is derived using several assumptions. If the particular system which is being measured fails to adhere to those assumptions, then the equation which is being determined is only an approximation. Under these circumstances, the non-linear least squares procedure will yield the best two-parameter approximation of that particular functional form.

Of more importance, the second implicit variable is error by the experimenter, i.e., the student or the research worker. Assuming that the data truly follow the selected functional form, then the deviations will be large or small according to the ability of the experimenter and the nature of the experimental equipment. A comparison of the error in one coefficient which has been obtained by the two different methods illustrates this point. The equation

$$\ln y \pm d = \ln A + Bx \tag{1}$$

where y and x are variables, A and B are constants to be determined, and d is the average deviation, is in the form suitable for a linear least squares determination of A and B. After much algebra, solution of the normal equation for B yields

$$\ldots + \text{terms of the order } (d^2, d^3, d^4) \tag{2}$$

In exponential form, eqn. (1) becomes

$$y \exp(\pm d) = A \exp(Bx) \tag{3}$$

Solution of the normal equations given by Pollnow¹ for dB using the usual series expansion for the exponential², $\exp(\pm d) = 1 \pm d + \ldots$, yields

$$\ldots \tag{4}$$

where Pollnow's dK has been called d. To first order, the second term of eqn. (2) and eqn. (4) are the errors in B resulting from a linear least squares determination and a non-linear least squares determination, respectively. Each component term of the errors is homogeneous of degree 1 in d. Consequently, as the limit $d \to 0$ is approached, the errors from the two methods approach each other in absolute value. A similar treatment of the constant A shows that the two least squares methods yield similar values in the limit, $d \to 0$.

Table. Comparison of Coefficients Determined by Different Least Squares Techniques. Columns: Independent Variable; Linear Least Squares (A, B, Error %); Non-Linear Least Squares (A, B, Error %). (Tabulated values not preserved.)

The preceding abstractions are given concrete form by the consideration of some typical student data, viz., viscosity data given by Wettaw, et al.³ Both $\eta$ and $\eta V$, where $\eta$ is viscosity and V is molar volume, are said⁴ to obey eqn. (3) with the independent variable as reciprocal temperature, 1/T. The goodness of fit is measured by the Error Per Cent defined in eqn. (5),

$$\text{Error Per Cent} = \frac{\sigma}{M} \times 100 \tag{5}$$

where M is the mean property value of the N samples and $\sigma$ is the mean square deviation. Some typical results are given in the table. The calculations were performed on an IBM 7040 and on an IBM 360/65 with programs obtained from IBM.⁵

In summary, the use of modern computers affords the instructor an excellent opportunity to introduce highly refined techniques of data treatment. Further, the instructor is provided an opportunity, which should not be disregarded, to emphasize the effect of the experimenter's interaction upon the final quantitative results.

The author is indebted to the Data Processing and Computing Center of Southern Illinois University for the gratis use of their facilities. In particular, thanks are accorded to Mr. W. J. Jones for adjusting IBM programs to the local facilities.

¹ Pollnow, G. F., J. CHEM. EDUC., 48, 518 (1971).
² Dwight, H. B., "Tables of Integrals and Other Mathematical Data" (Rev. Ed.), Macmillan Co., N. Y., 1947.
³ Wettaw, J. F., McEnary, E. E., Drennan, J. D., and Musulin, B., J. Chem. Eng. Data, 14, 181 (1969).
⁴ Glasstone, S., Laidler, K. J., and Eyring, H., "The Theory of Rate Processes," McGraw-Hill Book Co., Inc., N. Y., 1941.
⁵ Purcell, T. D., "Least Squares Regression for the IBM 7040 by Orthonormal Linear Functions Using the Choleski Method," Data Processing and Computing Center, Southern Illinois University, Carbondale, Illinois, Mimeographed Materials. Middleton, J. A., "Contributed Program Library (for IBM S/360, 1130 and 1800)," No. 360D-13.2.003, Hawthorne, N. Y., 1969.
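
The limiting behavior described in the note can be explored numerically. The sketch below is not the IBM programs used by the author; it is a minimal illustration under stated assumptions. The constants `A_TRUE` and `B_TRUE`, the sample points `XS`, and the fixed sign pattern `SIGNS` are hypothetical, the data are synthetic and generated from the exponential form of eqn. (3), the non-linear fit uses a plain Gauss-Newton iteration (the note does not say which algorithm was used), and $\sigma$ in the Error Per Cent is taken to be the root-mean-square residual in y, which is one reading of "mean square deviation."

```python
import math

# Hypothetical "true" constants and sample points, chosen only for illustration.
A_TRUE, B_TRUE = 2.0, 0.5
XS = [0.5 * i for i in range(9)]
SIGNS = [1, -1, -1, 1, -1, 1, 1, -1, 1]  # fixed pattern standing in for the +/- sign


def make_data(d):
    """Return y = A exp(Bx) exp(s*d) with s = +1 or -1 (synthetic data)."""
    return [A_TRUE * math.exp(B_TRUE * x + s * d) for x, s in zip(XS, SIGNS)]


def fit_linear(xs, ys):
    """Linear least squares on ln y = ln A + B x; returns (A, B)."""
    n = len(xs)
    ln_y = [math.log(y) for y in ys]
    mean_x = sum(xs) / n
    mean_l = sum(ln_y) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxl = sum((x - mean_x) * (l - mean_l) for x, l in zip(xs, ln_y))
    b = sxl / sxx
    return math.exp(mean_l - b * mean_x), b


def fit_nonlinear(xs, ys, iterations=50):
    """Gauss-Newton minimisation of sum (y - A exp(Bx))^2; returns (A, B)."""
    a, b = fit_linear(xs, ys)  # the linear fit serves as the starting guess
    for _ in range(iterations):
        e = [math.exp(b * x) for x in xs]
        r = [y - a * ei for y, ei in zip(ys, e)]      # residuals in y
        ja = e                                        # d(model)/dA
        jb = [a * x * ei for x, ei in zip(xs, e)]     # d(model)/dB
        # Normal equations (J^T J) delta = J^T r for the two parameters.
        saa = sum(u * u for u in ja)
        sab = sum(u * v for u, v in zip(ja, jb))
        sbb = sum(v * v for v in jb)
        ra = sum(u * v for u, v in zip(ja, r))
        rb = sum(u * v for u, v in zip(jb, r))
        det = saa * sbb - sab * sab
        da = (ra * sbb - rb * sab) / det
        db = (saa * rb - sab * ra) / det
        a, b = a + da, b + db
        if abs(da) < 1e-12 * (1 + abs(a)) and abs(db) < 1e-12 * (1 + abs(b)):
            break
    return a, b


def error_percent(xs, ys, a, b):
    """100 * sigma / M, with sigma the RMS residual in y (an assumption)."""
    n = len(ys)
    mean_y = sum(ys) / n
    sse = sum((y - a * math.exp(b * x)) ** 2 for x, y in zip(xs, ys))
    return 100.0 * math.sqrt(sse / n) / mean_y


def compare(d):
    """Fit both ways for deviation d; returns {name: (A, B, Error Per Cent)}."""
    ys = make_data(d)
    results = {}
    for name, fit in (("linear", fit_linear), ("non-linear", fit_nonlinear)):
        a, b = fit(XS, ys)
        results[name] = (a, b, error_percent(XS, ys, a, b))
    return results


if __name__ == "__main__":
    for d in (0.1, 0.01, 0.001):
        res = compare(d)
        gap = abs(res["linear"][1] - res["non-linear"][1])
        print(f"d = {d}: |B(linear) - B(non-linear)| = {gap:.2e}")
```

Running the script prints the difference between the two values of B for decreasing d, so the approach of the two methods toward each other as d shrinks can be inspected directly. Because the non-linear fit minimizes the residuals in y itself, its Error Per Cent under this definition should not exceed that of the linear fit on the same data.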

Volume 50, Number 1, January 1973 / 79