Least-squares fitting of multilinear equations

Computer Series, 110

Richard T. O'Neill¹
Department of Chemistry, Xavier University, Cincinnati, OH 45207

David C. Flaspohler
Department of Mathematics, Xavier University, Cincinnati, OH 45207

Recent articles in this Journal (1, 2) have stressed the importance of curve fitting of experimental data in the teaching and practice of chemistry. Optimally, the method used should (a) report the parameters fitted and their uncertainties, (b) weight data correctly, (c) be implementable on a wide variety of computers and calculators, (d) be as general as possible and not restricted to the linear case, and (e) avoid iteration if at all possible. To satisfy criterion c, the algorithm employed should not use matrices as such, since calculators and microcomputers do not usually support MAT commands, and it should complete the calculation in a single computing cycle. Iteration should be avoided if possible because the functions involved may not converge rapidly, good initial estimates may not be available, and the process may converge to only a local minimum, giving an answer that is incorrect but not obviously incorrect. Methods have been described that solve the curve-fitting problem for the strictly linear case with uncertainty in both dependent and independent variables (3), the general nonlinear case with uncertainty in the dependent variable only (4), and the general case when there is uncertainty in both dependent and independent variables (5). However, these methods all involve iteration.

¹ Author to whom correspondence should be directed. BITNET address: ONEILL@XAVIER.


For the process described in this paper, all uncertainty is assumed to reside in the dependent variable, more than one independent variable is allowed, and iteration is not required.

Multilinear Equations

It has been stated that when the model equation cannot be arranged in linear form, the data analysis problem must be solved by iterative methods (5). Linear form in this sense refers to the parameters to be fitted and not to the variables. It is not restricted to bivariate (X, Y) data and to the Y = mX + b form. Thus the van der Waals equation, after series expansion and rearrangement:

$$PV - RT = a\left(-\frac{1}{V}\right) + b\left(\frac{RT}{V}\right) + c\left(\frac{RT}{V^2}\right)$$

is not linear in the independent variables (V, T) but is linear in the parameters to be fitted (a, b, c = b²). We use the term multilinear to refer to an equation that describes a dependent variable or function of variables (independent and dependent) as the sum of constants multiplied by functions of independent variables:

$$F(Y, X_1, X_2, \ldots) = \sum_{i=1}^{k} \alpha_i \, G_i(X_1, X_2, \ldots)$$

or, if restricted to k ≤ 3 as in this paper,

$$F(Y, X_1, X_2, \ldots) = \alpha \, G(X_1, X_2, \ldots) + \beta \, H(X_1, X_2, \ldots) + \gamma \, K(X_1, X_2, \ldots)$$

where Y is the dependent variable (error normally distributed); $X_1, X_2, \ldots$ are the independent variables (negligible error); F, G, H, and K are functions given by the model equation; and $\alpha$, $\beta$, and $\gamma$ are the parameters to be determined. In the van der Waals equation given above, $F = PV - RT$, $G_1 = G = -1/V$, $G_2 = H = RT/V$, $G_3 = K = RT/V^2$, $\alpha_1 = \alpha = a$, $\alpha_2 = \beta = b$, and $\alpha_3 = \gamma = b^2$.
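As a minimal sketch of how the van der Waals identification above turns raw (P, V, T) observations into the multilinear terms, the following Python fragment computes F, G, H, and K for one data point. The function name `vdw_terms`, the value and units chosen for R, and the numerical parameters are assumptions made for illustration only; they do not come from the paper.

```python
R = 0.082057  # gas constant in L atm / (mol K); units are an assumption of this sketch


def vdw_terms(P, V, T):
    """Return (F, G, H, K) for one (P, V, T) observation."""
    F = P * V - R * T    # left-hand side, contains the dependent variable P
    G = -1.0 / V         # multiplies alpha = a
    H = R * T / V        # multiplies beta = b
    K = R * T / V**2     # multiplies gamma = b**2
    return F, G, H, K


# Hypothetical parameters and state point, used only to check the bookkeeping.
a, b = 3.6, 0.043
V, T = 1.0, 300.0

# Pressure generated from the expanded equation itself, so the multilinear
# form should reproduce F essentially exactly.
P = (R * T + a * (-1.0 / V) + b * (R * T / V) + b**2 * (R * T / V**2)) / V

F, G, H, K = vdw_terms(P, V, T)
residual = F - (a * G + b * H + b**2 * K)
print(residual)
```

The residual is zero to within rounding because P was built from the expanded form; with measured pressures it would instead carry the experimental error in the dependent variable.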


Table 1. Examples of Multilinear Equations of Chemical Interest

van der Waals equation (after series expansion)
Variables: P, V, T
$PV - RT = [a][-1/V] + [b][RT/V] + [b^2][RT/V^2]$

General heat capacity
Variables: C, T
$C = [a_0][1] + [a_1][T] + [a_2][T^2]$

Fundamental vibration-rotation bands of heteronuclear diatomic molecule
Variables: $\nu_{R,J}$, $\nu_{P,J+1}$, J
$\nu_{R,J} + \nu_{P,J+1} = [\nu_0][2] + [\alpha_e][(-2)(J+1)^2] + [\beta_e][(-2)(J^4 + 4J^3 + 7J^2 + 6J + 2)]$

Note that the parameters must be multipliers of the functions. An equation that cannot be rearranged to this form is not multilinear. Table 1 gives in multilinear form several examples of equations of chemical interest that are usually not considered to be linear. Data for these equations can be fit analytically, not just iteratively, to obtain the coefficients. The table is constructed so that the variables and parameters for each equation can be easily identified with the terms (F, G, H, K, $\alpha$, $\beta$, $\gamma$) in the multilinear form.
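To illustrate what fitting analytically can look like for k = 3, here is a hedged Python sketch that accumulates the weighted sums of the normal equations in a single pass and solves the resulting 3 × 3 system by Cramer's rule, so that no matrix library and no iteration are needed. Cramer's rule is a choice made for this sketch and is not necessarily the route taken by the algorithm of Table 2. The names `det3` and `fit_three_terms` and the synthetic data are hypothetical.

```python
def det3(m):
    """Determinant of a 3x3 array stored as nested lists."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def fit_three_terms(F, G, H, K, W=None):
    """Weighted least-squares estimates of (alpha, beta, gamma) in
    F = alpha*G + beta*H + gamma*K, with all error assumed to be in F."""
    if W is None:
        W = [1.0] * len(F)  # equal weights by default

    # One pass over the data accumulates every sum the normal equations need.
    sGG = sGH = sGK = sHH = sHK = sKK = sGF = sHF = sKF = 0.0
    for f, g, h, k, w in zip(F, G, H, K, W):
        sGG += w * g * g
        sGH += w * g * h
        sGK += w * g * k
        sHH += w * h * h
        sHK += w * h * k
        sKK += w * k * k
        sGF += w * g * f
        sHF += w * h * f
        sKF += w * k * f

    normal = [[sGG, sGH, sGK], [sGH, sHH, sHK], [sGK, sHK, sKK]]
    rhs = [sGF, sHF, sKF]

    d = det3(normal)
    if d == 0.0:
        raise ValueError("normal equations are singular")

    # Cramer's rule: replace one column at a time with the right-hand side.
    estimates = []
    for col in range(3):
        m = [row[:] for row in normal]
        for r in range(3):
            m[r][col] = rhs[r]
        estimates.append(det3(m) / d)
    return tuple(estimates)


# Synthetic, noise-free data for F = 2 + 0.5*x - 0.1*x**2 (G = 1, H = x, K = x**2).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
F = [2.0 + 0.5 * xi - 0.1 * xi**2 for xi in x]
alpha, beta, gamma = fit_three_terms(F, [1.0] * len(x), x, [xi**2 for xi in x])
print(alpha, beta, gamma)
```

Because the synthetic data contain no noise, the estimates should agree with the generating coefficients to within rounding. For poorly scaled terms, determinants lose precision quickly, so the highest available floating-point precision is advisable.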

Table 2. Algorithm for Estimation of Parameters

1. Enter N, C, M
   Comment: N is the number of observations, C is the number of functions on the right-hand side of the model equation (the number of parameters to be estimated), and M is the number of independent variables.

The Procedure

In least-squares analysis for this type of function, the usual assumptions are that

$$F_i = \alpha G_i + \beta H_i + \gamma K_i + e_i$$

where the $e_i$ are independent and normally distributed with mean zero and variance $\sigma^2$. ... (SD - 1). The problem can be avoided by using the alternate method of calculating the variance indicated in Table 2, step 12, by asterisks. In any case, the highest precision available with the computer or calculator should be used. The algorithm has been tested with VAX, IBM PC, Apple II, and Commodore 64 computers using the same BASIC program.



When all the error is assumed to be in the dependent variable, the weighting function for each point should be:

$$W_i = \frac{1}{\left(\partial F/\partial Y\right)_i^{2}}$$

In the previous discussion it was assumed that $\sigma_i$, the uncertainty in $Y_i$, was the same for all Y. If it is not, then the global weight $1/(\partial F/\partial Y)^2$ should be multiplied by $1/\sigma_i^2$, all evaluated at the point in question, in order to calculate the weighting factor $W_i$. The global part of the weight can be included explicitly for particular cases (6), but for reasons of generality, the procedure described here approximates the $\partial F/\partial Y$ term as:


For situations where error exists in the independent variable(s) or where the equation cannot be arranged in a multilinear form, iterative techniques are needed. However, many equations of chemical interest that are not linear in the dependent variable(s) are nonetheless multilinear in form. If the error is essentially all in the dependent variable, these can conveniently be treated analytically, without iteration, by the technique described.
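To make the weighting factor discussed earlier concrete, the following Python sketch evaluates $W_i = 1/[\sigma_i^2 (\partial F/\partial Y)_i^2]$ at a single point. The central-difference derivative used here is an assumption of this sketch and should not be read as the approximation adopted in the procedure itself. The function names `dF_dY`, `weight`, and `F_log`, and the logarithmic model $F = \ln Y$, are hypothetical illustrations.

```python
import math


def dF_dY(F, y, x, rel_step=1e-6):
    """Central-difference estimate of dF/dY at (y, x).

    The step rule is an assumption of this sketch.
    """
    h = rel_step * max(abs(y), 1.0)
    return (F(y + h, x) - F(y - h, x)) / (2.0 * h)


def weight(F, y, x, sigma=1.0):
    """Weighting factor W = 1 / (sigma**2 * (dF/dY)**2) at one point."""
    d = dF_dY(F, y, x)
    return 1.0 / (sigma**2 * d**2)


def F_log(y, x):
    """Hypothetical left-hand side F = ln(Y); x is unused here."""
    return math.log(y)


# For F = ln Y the derivative is 1/Y, so the weight should be close to
# Y**2 / sigma**2, which is 1600 for Y = 4.0 and sigma = 0.1.
w = weight(F_log, 4.0, None, sigma=0.1)
print(w)
```

The resulting list of weights can be passed to any weighted least-squares routine in which each observation's contribution to the sums is multiplied by its $W_i$.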

Journal of Chemical Education

