
J. B. OPFELL¹ and B. H. SAGE, California Institute of Technology, Pasadena, Calif. (¹Present address: Cutter Laboratories, Berkeley, Calif.)

Applications of Least Squares Methods

The advent of automatic digital computing equipment has made these methods a practical tool in evaluating coefficients of analytical expressions from experimental data.

STATISTICAL methods have not been widely applied to chemical engineering problems. For this reason it appears worth while to present briefly the method of least squares and a few illustrations of its application, without going into the detailed mathematical background of statistics or the numerous methods of curve fitting that have been developed. No effort has been made to include numerical examples, as they are well covered in other publications, particularly in regard to application of the Benedict equation to experimental data (7-9).

Method of Least Squares

The method of least squares is useful for obtaining an approximation of a set of points or a curve by an analytic expression. It is based upon the assumption that the best analytical approximation to a smooth curve or a set of points is that one for which the sum of the squares of the approximation errors is a minimum. In most cases the magnitudes of the approximation errors are known precisely; thus the method is not statistical. The primary reason for choosing such a criterion stems from the opportunity for applying the convenient methods of normal regression analysis. The normal equations of the two methods are identical and are obtained by the differentiation of the likelihood function (15), which may be written in terms of the quantity Nχ².

Analysis. Assume that N pairs of observations (xᵣ, yᵣ), r = 1, 2, ..., N, of the quantities x and y have been made. For theoretical reasons, the experimenter expects that y = φ₀(x), where φ₀(x) is some function, not necessarily elementary, of the variable x. Let, for example,

φ₀(x) = K₁φ₁(x) + K₂φ₂(x) + ... + Kₘφₘ(x)    (1)

where m < N and the Kᵢ are constant coefficients of the functions φᵢ. Generally the functions φᵢ are known and contain no unknown constants. For example, if φ₀(x) is a polynomial in x, then φᵢ(x) = xⁱ. In a formal manner the difference between the value y (or φ₀) predicted by Equation 1 and the corresponding observation is

Δyᵣ = yᵣ − Σ_{i=1}^{m} Kᵢφᵢ(xᵣ)

The method of least squares determines the coefficients Kᵢ in such a manner that

Σ_{r=1}^{N} (Δyᵣ)² = minimum    (2)

In the usual manner the minimum is obtained by differentiation with respect to each of the coefficients Kᵢ. In the strictest sense, the summation over r should be replaced by integration with respect to x over the interval throughout which the approximation function applies. Summation, though less precise, usually is adequate and is used in this discussion.

∂[Σ_{r=1}^{N} (Δyᵣ)²]/∂Kᵢ = 0    (i = 1, 2, ..., m)    (3)

These m equations may be rewritten as

Σ_{r=1}^{N} φᵢ(xᵣ) yᵣ = Σ_{j=1}^{m} Kⱼ Σ_{r=1}^{N} φᵢ(xᵣ) φⱼ(xᵣ)    (i = 1, 2, ..., m)    (4)

The set of equations shown above is called the normal equations of the method of least squares.
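As a concrete illustration of Equations 1 through 4 (not part of the original article), the following short Python sketch forms the normal equations for a polynomial basis, φᵢ(x) = xⁱ, and solves them for the coefficients Kᵢ; the data and variable names are synthetic and purely illustrative.

```python
import numpy as np

# Synthetic observations (x_r, y_r), r = 1..N -- illustrative only.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 2.0, 25)
y = 1.5 + 0.8 * x - 0.3 * x**2 + rng.normal(0.0, 0.02, x.size)

# Basis functions phi_i(x) = x**i (Equation 1).
m = 3
Phi = np.column_stack([x**i for i in range(m)])   # N x m matrix of phi_i(x_r)

# Normal equations (Equation 4):
#   sum_r phi_i(x_r) y_r = sum_j K_j sum_r phi_i(x_r) phi_j(x_r)
G = Phi.T @ Phi          # matrix of inner products (phi_i, phi_j)
b = Phi.T @ y            # vector of inner products (phi_i, y)
K = np.linalg.solve(G, b)

residuals = y - Phi @ K
print("coefficients K:", K)
print("sum of squared residuals (Equation 2):", np.sum(residuals**2))
```

Forming the inner-product matrix explicitly mirrors Equation 4; in practice a library routine such as numpy.linalg.lstsq may be preferred for numerical stability.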

For convenience the following briefer notation is often used:

(φᵢ, φⱼ) = Σ_{r=1}^{N} φᵢ(xᵣ) φⱼ(xᵣ)    (5)

In Equation 5, (φᵢ, φⱼ) is the inner product of the vectors φᵢ and φⱼ. With this notation the ith normal equation may be written as

(φᵢ, φ₀) = Σ_{j=1}^{m} Kⱼ (φᵢ, φⱼ)    (6)

In conventional matrix notation (11, 12) Equation 6 becomes

[(φᵢ, φⱼ)] K = [(φᵢ, φ₀)]    (7)

Equations 3, 4, and 7 by themselves do not constitute sufficient conditions for the likelihood function Nχ² to be a minimum. The necessary argument was supplied by Milne (14) for the case in which the functions φᵢ contain no unknown parameters, which corresponds to the case in which the normal equations are linear with respect to the empirical coefficients (15). The solution (K₁, K₂, ..., Kₘ) of the normal equations has the following properties:

1. If N > m, the solution is unique.
2. If N = m, the approximation function of Equation 1 fits the observations exactly.
3. If N < m, a family of proportional solutions is obtained.

Applications arise in which the experimental observations are grouped into regions so compactly that the normal equations behave as though N ≤ m. In such situations it is difficult to obtain precise values for the empirical coefficients, though the approximating function describes the observations closely. In such cases the matrix of the coefficients of the normal equations is said to be ill conditioned (20).

Frequently some φᵢ(x) cannot be completely separated from unknown constants. In such a case the normal equations will not be linear in the constant coefficients and the evaluation of the solution will be more difficult. One of the more feasible methods of attack is an iteration in which the linear coefficients are obtained as functions of the nonlinear terms. Equation 2 may then be satisfied by appropriate adjustment of the nonlinear coefficients. Another practical approach is through the use of polynomial approximations to the nonlinear functions.

Sometimes the direct application of the experimental variables in the least squares analysis is not as convenient as the use of elementary combinations of them. An effort should be made to minimize the sum of squares of the residuals of that variable which can be expressed in the form of Equation 1 in which the functions φᵢ are independent of all constants.
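The ill conditioning described above can be made concrete with a small numerical experiment, again a hypothetical sketch rather than anything from the article: when the observations are confined to a compact region, the matrix of the normal equations for even a modest polynomial basis becomes nearly singular.

```python
import numpy as np

# Compare the conditioning of the normal equations for well-spread and
# compactly grouped observations (synthetic abscissas, illustrative only).
m = 4
x_spread = np.linspace(0.0, 10.0, 50)        # well-spread abscissas
x_tight = np.linspace(5.0, 5.2, 50)          # compactly grouped abscissas

for label, x in (("spread", x_spread), ("compact", x_tight)):
    Phi = np.column_stack([x**i for i in range(m)])
    G = Phi.T @ Phi                          # inner-product matrix [(phi_i, phi_j)]
    print(label, "condition number of normal equations:", np.linalg.cond(G))
```

A large condition number signals that small changes in the observations produce large changes in the fitted Kᵢ, even though the approximating function may still describe the data closely.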


In the absence of specific information it is often assumed that the random observational errors are normally distributed (15). According to this assumption the probability that an observed quantity will differ from the corresponding predicted quantity by an amount Δyᵢ is described by the function

dP = [1/(σᵢ√2π)] exp[−(Δyᵢ)²/(2σᵢ²)] d(Δyᵢ)

If it is assumed further that each error in observation is independent of all others, and that errors Δyᵢ associated with one variable are independent of errors Δyⱼ associated with the other variables, the probability of a set of errors is equal to the product of the probabilities of each of the errors. A sufficient condition for the most likely set of errors to occur under the hypothesis that the prediction represents the observed phenomenon is that

Σ_{i=1}^{N} (Δyᵢ/σᵢ)²

be a minimum. In the method of least squares, the variances σᵢ² are not known a priori, so 1/σᵢ² is replaced by weighting factors, w, to which numerical values are assigned. The numerical weighting factors are determined from estimates of the relative accuracy of the experimental information.
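A minimal sketch of the weighting just described, assuming only that the assigned factors w stand in for the unknown 1/σᵢ² (the function and data names are illustrative, not from the article):

```python
import numpy as np

def weighted_least_squares(Phi, y, w):
    """Solve the weighted normal equations, with assigned weights w_r
    standing in for the unknown 1/sigma_r**2."""
    W = np.diag(w)
    G = Phi.T @ W @ Phi        # weighted inner products (phi_i, phi_j)
    b = Phi.T @ W @ y          # weighted inner products (phi_i, y)
    return np.linalg.solve(G, b)

# Synthetic example: later observations judged less accurate.
x = np.linspace(0.0, 1.0, 20)
y = 2.0 + 3.0 * x
Phi = np.column_stack([np.ones_like(x), x])
w = np.where(x < 0.5, 1.0, 0.25)     # relative-accuracy estimates, illustrative
print(weighted_least_squares(Phi, y, w))
```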

Applications

Least squares techniques can often be employed in determining the coefficients for analytical expressions used to describe experimental data.

Vapor Pressure Data. One of the simplest applications of the method of least squares is found in the determination of the coefficients, A, B, and C, of the Antoine equation (18):

ln P″ = A − B/(T + C)    (8)

This expression for the vapor pressure, P″, in terms of the temperature, T, is not in the form of Equation 1. If the following substitutions are made in Equation 1 and it is rearranged,

φ₀ = T ln P″    φ₁ = T    φ₂ = ln P″    φ₃ = 1.0000000
K₁ = A    K₂ = −C    K₃ = AC − B    (9)

it follows that:

T ln P″ = AT − C ln P″ + (AC − B)    (10)

Equation 8 is now in the form of Equation 1, and the constants A, B, and C can be evaluated by least squares without iteration. If, on the other hand, it were necessary that the expression retain the form ln P″ = φ(T), iterative methods would have to be employed. The polynomial approximation would be that of Equation 11, where the value of Tₙ₊₁ used in the (n + 1)th iteration is equal to Tₙ + Cₙ. Tₙ is the value of T used in the nth iteration and Cₙ is the value of C computed from the nth iteration. T₁ is the experimentally observed temperature corresponding to the vapor pressure, P″. After convergence has been attained, it follows that:

C = Tₙ − T₁    (12)

The least squares approximation will diverge for C > T_min if Equation 11 is used. For the Antoine equation the use of iterative methods has essentially no advantage over the more direct substitution method. Values of A, B, and C which minimize the sum of the squares of residuals for ln P″ will be different from those which minimize the corresponding sum for T ln P″. The magnitude of the difference in the coefficients will be inversely proportional to the accuracy with which the Antoine equation describes the experimental information.
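The substitutions of Equations 9 and 10 reduce the Antoine fit to a single linear solve. The sketch below is a hypothetical illustration with synthetic vapor pressure data and assumed constants; it recovers A, B, and C from the fitted K₁, K₂, and K₃.

```python
import numpy as np

# Synthetic vapor-pressure data generated from assumed Antoine constants
# (illustrative values only, not from the paper).
A_true, B_true, C_true = 15.7, 2700.0, -40.0
T = np.linspace(250.0, 350.0, 30)                      # temperature
lnP = A_true - B_true / (T + C_true)                   # Equation 8

# Equation 10:  T ln P" = A*T - C*ln P" + (A*C - B)
phi0 = T * lnP
Phi = np.column_stack([T, lnP, np.ones_like(T)])       # phi_1, phi_2, phi_3
K, *_ = np.linalg.lstsq(Phi, phi0, rcond=None)

A = K[0]
C = -K[1]
B = A * C - K[2]                                       # from K3 = A*C - B
print("recovered A, B, C:", A, B, C)
```

Because the residuals minimized here are those of T ln P″ rather than ln P″, the recovered constants will in general differ slightly from a fit performed on ln P″ directly, as noted above.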

Diffusion Coefficients. Fick diffusion coefficients of hydrocarbons in the liquid phase are measured under transient conditions in an isochoric vessel (17). The weight of gas phase introduced as a function of time is described by an expression of the following form:

(Δm + μ)² = D(θ + ϑ)    (13)

The experimental observations are Δm and θ. Expansion of this expression gives

θ = (1/D)(Δm)² + (2μ/D)(Δm) + (μ²/D − ϑ)    (14)

Equation 14 assumes the form of Equation 1, if the following substitutions are made:

φ₀ = θ    φ₁ = (Δm)²    φ₂ = Δm    φ₃ = 1.00000
K₁ = 1/D    K₂ = 2μ/D    K₃ = μ²/D − ϑ    (15)

If desired, Equation 13 may be expressed in a nonlinear form which would require iterative methods for a least squares solution. One such form is:

Δm = [D(θ + ϑ)]^½ − μ    (16)

The nonlinear part of the expression was approximated by a polynomial, and the substitutions of Equation 17 were made. Again the solutions for Kⱼ from these two approaches will not be the same, and neither will the corresponding values of D, ϑ, and μ.
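The linearized diffusion fit of Equations 14 and 15 can be sketched in the same way; the parameter values and data below are synthetic and illustrative only.

```python
import numpy as np

# Assumed "true" values, illustrative only.
D_true, mu_true, vartheta_true = 4.0e-4, 0.02, 5.0

# Synthetic observations of weight introduced (dm) and time (theta),
# generated from Equation 13: (dm + mu)**2 = D * (theta + vartheta).
dm = np.linspace(0.05, 0.5, 40)
theta = (dm + mu_true) ** 2 / D_true - vartheta_true

# Equations 14 and 15: theta = (1/D)*dm**2 + (2*mu/D)*dm + (mu**2/D - vartheta)
Phi = np.column_stack([dm**2, dm, np.ones_like(dm)])
K, *_ = np.linalg.lstsq(Phi, theta, rcond=None)

D = 1.0 / K[0]
mu = K[1] * D / 2.0
vartheta = mu**2 / D - K[2]
print("recovered D, mu, vartheta:", D, mu, vartheta)
```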

Coefficients of Equations of State. Benedict (1) utilized least squares techniques in connection with manual methods to evaluate the coefficients for an equation of state for nitrogen. Brough (9) applied least squares methods with the aid of automatic digital computing equipment to the evaluation of the coefficients of the Benedict equation for propane. The latter methods illustrate another approach to the problem. The Benedict equation (4) may be written in terms of the compressibility factor in the form of Equation 18. Following the preceding methods the expression may be rewritten in terms of the corresponding functions and coefficients of Equation 1. Such action entails the substitutions of Equation 19, in which

φ₀ = z − 1    K₁ = B₀    (19)

the remaining functions φⱼ being formed from the reciprocal molal volume and the absolute temperature, and the remaining coefficients Kⱼ from the coefficients of the Benedict equation of state.
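The paragraphs that follow describe how the nonlinear coefficient γ is handled: for each trial value of γ the equation is linear in the remaining coefficients and can be fitted by the machinery already outlined, and the trial value giving the smallest sum of squared residuals is retained. The sketch below illustrates that scan. It assumes a compressibility-factor grouping of the Benedict-Webb-Rubin terms that may differ in detail from Equation 18, and all numerical values are synthetic.

```python
import numpy as np

def basis(sigma, T, gamma):
    """Linear basis functions of reciprocal molal volume and temperature for a
    fixed value of gamma (assumed z-explicit BWR-like grouping, illustrative)."""
    expo = (1.0 + gamma * sigma**2) * np.exp(-gamma * sigma**2)
    return np.column_stack([
        sigma, sigma / T, sigma / T**3,
        sigma**2, sigma**2 / T, sigma**5 / T,
        sigma**2 * expo / T**3,
    ])

rng = np.random.default_rng(1)
sigma = rng.uniform(0.01, 0.3, 200)          # reciprocal molal volume (synthetic)
T = rng.uniform(300.0, 500.0, 200)           # absolute temperature (synthetic)
K_true = np.array([0.05, -20.0, -4.0e6, 2.0e-3, -0.8, 0.01, 1.0e5])
gamma_true = 6.0
z_minus_1 = basis(sigma, T, gamma_true) @ K_true

best = None
for gamma in np.linspace(1.0, 12.0, 45):     # trial values of gamma
    Phi = basis(sigma, T, gamma)
    K, *_ = np.linalg.lstsq(Phi, z_minus_1, rcond=None)
    sse = np.sum((z_minus_1 - Phi @ K) ** 2)
    if best is None or sse < best[0]:
        best = (sse, gamma, K)

print("gamma giving minimum sum of squares:", best[1])
```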

The function φ₁ contains the unknown parameter γ, and the solution of the normal equations depends upon its value. The set of values for the coefficients will not be uniquely determined by the minimization of the sum of the squares of the residuals, since γ enters the normal equations in a nonlinear manner (19). In principle, the method involves choosing a value of γ on the basis of the best evidence available and determining the best fit in the sense of the least squares method with respect to the other variables. Another value of γ is then chosen and a new set of coefficients determined. This is repeated until the value of γ which yields the minimum deviation in the least squares sense is established. The details of this method follow.

Benedict, Webb, and Rubin (2-6) proposed Equation 18 as an empirical representation of the thermodynamic properties of light hydrocarbons and their mixtures and described a method for obtaining numerical values for the coefficients of this expression. Using the least squares criterion to fit the equation to the experimental observations, Benedict (1, 4, 5) determined sets of coefficients for several of these compounds. The least squares method as applied assumed that all the random experimental errors were reflected in the reported experimental values of pressure and that temperatures and specific volumes were known exactly. Experimental observations of pressure were weighted equally and the equation was fitted to one isochor at a time. The resulting values of the volume-dependent functions were smoothed and adjusted so that predictions of fugacities and pressures for phases coexisting at heterogeneous equilibrium would be equal. Finally, the coefficients of the smoothed volume-dependent functions were determined by least squares.

To describe the volumetric behavior for purposes of this discussion, Equation 18 will be used rather than the pressure-explicit form presented by Benedict, Webb, and Rubin (4, 5). The form of the Benedict equation of state shown by Equation 18 has several advantages over other arrangements when considered from the standpoint of numerical operations. It is linear in seven of the eight coefficients, and the terms of the sum are not dependent upon the system of units employed to describe the experimental measurements. The range of absolute magnitude of the compressibility factor, z, is not great. States at low temperatures and specific weights will be of greater influence in establishing the coefficients than in the earlier method (4).

Deming (10) described the application of the method of least squares to an exponential function with a linear component. The exponent in Deming's example was independent of experimentally observed variables and in this respect differed materially from the evaluation of the coefficients of the Benedict equation. In a fashion similar to Benedict, Webb, and Rubin (2-6), Deming (10) carried out the analysis and formulated the method in expectation of using graphical and manual calculations.

Brough (7-9) described a method of obtaining numerical values for the coefficients of the Benedict equation of state which is suitable for use with automatic digital computers. This discussion extends Brough's analysis to the case in which data from several sources are used simultaneously to obtain a set of coefficients which minimizes the root-mean-square deviations of the predicted values from the several sets of observed volumetric behavior. Brough's analysis (8) assumed that all experimental errors were reflected in measurements of pressure, whereas temperature and molal volumes were known exactly. By the use of Equation 18, the deviations in pressure are weighted in such a manner that those at low temperature and molal volume are of greatest importance. The criterion of optimum agreement with observed behavior in the least squares sense may be expressed:

χ² = Σₛ wₛ Σᵣ (zᵣ − zₑ)² = minimum    (20)

The following expression is obtained as a typical equation in the set of normal equations:

Σₛ wₛ (φᵢ, φ₀)ₛ = Σ_{j=1}^{7} Kⱼ Σₛ wₛ (φᵢ, φⱼ)ₛ    (i = 1, 2, ..., 7)    (21)

The normal equations may be expressed in the form of Equation 7, inasmuch as the two sums in Equation 21 are equal. The analog of Equation 7 is

[Σₛ wₛ (φᵢ, φⱼ)ₛ] K = [Σₛ wₛ (φᵢ, φ₀)ₛ]    (22)

It is possible to obtain coefficients for two equations simultaneously by least squares methods, if the Kⱼ's are the same in the two equations (16). An example would be the evaluation of the coefficients of the Benedict equation, using both Equation 18 and the following equation expressing the experimental information on the heterogeneous equilibrium states:

(P_b − P_d)/P″ = 0    (23)

The normal equations for this expression can be combined with the normal equations for Equation 18 in exactly the same manner as data from different sources have been combined to give Equation 22. As shown in Equation 23, the orthobaric specific weights and temperatures are to be used in the Benedict equation to obtain an expression for the differences in predicted bubble point and dew point pressures. It is necessary to assign a value to the weighting factor, wₛ, for each set of normal equations employed. Assignment of numerical values for the wₛ's arises from physical intuition rather than statistical considerations.

It is desirable that an equation of state describe the volumetric behavior in the vicinity of the critical state. For a one-component, single-phase system, the specification of two independent properties is sufficient to determine the state of the system. The critical state may be characterized in part by the following:

g = (∂P/∂σ)_T = 0    (24)

h = (∂²P/∂σ²)_T = 0    (25)

The Benedict equation of state for a one-component, single-phase system allows only two degrees of freedom, but when the values of two intensive properties for the critical state are substituted into this equation, there is no a priori reason to expect that the value of the predicted property will correspond to that determined experimentally for the critical state. If Equations 24 and 25 are imposed as auxiliary restraints in connection with the least squares evaluation of the coefficients, the equation will predict the critical temperature and the critical specific volume as accurately as they are known experimentally. Prediction of the critical pressure from the Benedict equation under such circumstances will be a least squares approximation. The latter intensive property could be predicted more accurately by imposing further auxiliary restraints, with a corresponding loss in accuracy in homogeneous regions perhaps remote from the critical state. Because Equations 20, 24, and 25 together overspecify the values of the empirical coefficients, the forms χ², g,

and h are linearly dependent. There exist two numbers, K₈ and K₉, such that

χ² + K₈g + K₉h = χ²_min,    K₈, K₉ ≠ 0    (26)

Equations 24, 25, and the derivatives of 26 provide nine linear restraints on nine unknown coefficients and so specify a unique set of values for the nine components of K for each value of γ. The components of K are K₁, K₂, ..., K₈, K₉. Components K₈ and K₉ are called the Lagrange undetermined multipliers (13). They have no physical significance in this particular case and their numerical values are obtained only as a consequence of determining values for the other coefficients. Equation 26 has some of the same characteristics as Equation 20. The necessary conditions for χ² to exhibit a relative minimum and at the same time to provide values of the coefficients which will permit correct values of the critical temperature and specific volume are stipulated in a matrix equation similar to Equation 22, which is obtained by differentiating χ² with respect to each of the components of K. Equation 22 may be solved by using Jordan's method (20) for each value of the coefficient, γ.

Obtaining agreement of the equation of state with the auxiliary restraints is consistent with the usual method of establishing the coefficients of the van der Waals equation. In the latter instance a knowledge of three properties at the critical state is sufficient to determine numerical values of the empirical coefficients. The earlier assumption that all random experimental errors are reflected in the reported values of pressure was retained in the derivation of these expressions.

Measure of Fit. After evaluation of the coefficients, it is desirable to determine the agreement of the predictions of the equation of state with experiment. A reasonable over-all measure of agreement is the weighted average variance of estimate (21), which may be defined as shown in Equation 27. Average standard errors of estimate (21) of the compressibility factor for each set of experimental data sometimes are of greater utility; the latter quantity is identified by the expression of Equation 28.

Other special measures of agreement may be employed for a particular situation, but Equations 27 and 28 are generally useful.
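As an illustration of such measures of agreement (a sketch only; the generic weighted-average and per-source root-mean-square forms used here may differ in detail from the definitions of Equations 27 and 28):

```python
import numpy as np

def fit_measures(residuals_by_source, weights):
    """Generic weighted-average variance of estimate and per-source standard
    errors of estimate for residuals in the compressibility factor.
    residuals_by_source: list of 1-D arrays of (z_observed - z_predicted).
    weights: one weighting factor w_s per data source."""
    sq = [np.sum(r**2) for r in residuals_by_source]
    n = [r.size for r in residuals_by_source]
    avg_variance = sum(w * s for w, s in zip(weights, sq)) / \
                   sum(w * m for w, m in zip(weights, n))
    std_errors = [np.sqrt(s / m) for s, m in zip(sq, n)]
    return avg_variance, std_errors

# Synthetic residuals from two data sources, illustrative only.
rng = np.random.default_rng(2)
res_a = rng.normal(0.0, 0.002, 60)
res_b = rng.normal(0.0, 0.005, 40)
avg_var, std_err = fit_measures([res_a, res_b], weights=[1.0, 0.5])
print("weighted average variance of estimate:", avg_var)
print("standard errors of estimate by source:", std_err)
```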


Acknowledgment

Interest in this field resulted from the authors' association with Project 37 of the American Petroleum Institute. The pertinent suggestions of Manson Benedict are gratefully acknowledged.

Nomenclature

A, B, C = coefficients of Antoine equation
A₀, B₀, C₀, a, b, c, α, γ = coefficients of Benedict equation of state
D = a diffusion coefficient, pounds per second
d = total differentiation operator
e = base of natural logarithm
exp[ ] = exponential function
g = defined in Equation 24
h = defined in Equation 25
i, j = running sequence over number of terms
Kⱼ = jth constant coefficient of empirical approximating function for φ₀(x)
ln = natural logarithm
m = number of coefficients in empirical function
N = number of observations used in least squares analysis
P = absolute pressure, p.s.i.
P″ = vapor pressure, p.s.i.
R = universal gas constant, (lb./sq. in.)(cu. ft.)/(lb. mole)(°R.)
r = running sequence over number of observations N
s² = weighted average variance of estimate
T = absolute temperature, °R.
w = weighting factor
χ² = sum of deviations of observed values of dependent variable from values predicted by empirical approximating function, variance of estimate
χ²_min = defined by Equation 26
x = independent variable
y = dependent variable
yᵢ = ith observed value of dependent variable y
z = compressibility factor, dimensionless
∂ = partial differentiation operator
Δ = difference in
Δm = measured weight of gas introduced into an isochoric vessel, pounds
θ = time, seconds
ϑ = coefficient representing systematic error in time measurement, seconds
μ = coefficient representing systematic error in measurement of weight of gas introduced into isochoric vessel, pounds
ν = total number of sources of data
Σ = summation operator
σ = reciprocal molal volume, lb. mole/cu. ft.
σ_y = standard deviation of the errors in variable y
φ = function of
φⱼ = jth function of independent variable in empirical approximating function


φ₀(x) = function of independent variable (x) which is represented by an empirical approximating function
ψ = function (P_b − P_d)/P″ represented by Benedict equation

SUBSCRIPTS
b = value of property at bubble point
d = value of property at dew point
e = value of property predicted by equation of state
i, j, k = running index
m = final coefficient in empirical function
n = running index indicating number of iterations
r = running index ranging over number N of observations
s = running index referring to data source
min = minimum value of a variable

Literature Cited

(1) Benedict, M., J. Am. Chem. Soc. 59, 1589 (1937).
(2) Benedict, M., Webb, G. B., Rubin, L. C., Chem. Eng. Progr. 47, 419 (1951).
(3) Ibid., p. 449.
(4) Benedict, M., Webb, G. B., Rubin, L. C., J. Chem. Phys. 8, 334 (1940).
(5) Ibid., 10, 747 (1942).
(6) Benedict, M., Webb, G. B., Rubin, L. C., Friend, L. C., Chem. Eng. Progr. 47, 609 (1951).
(7) Brough, H. W., chemical engineering thesis, California Institute of Technology, 1950.
(8) Brough, H. W., Schlinger, W. G., Sage, B. H., Am. Doc. Inst., Washington, D. C., Doc. 3235 (1951).
(9) Brough, H. W., Schlinger, W. G., Sage, B. H., Ind. Eng. Chem. 43, 2442 (1951).
(10) Deming, W. E., "Statistical Adjustment of Data," Wiley, New York, 1943.
(11) Ferrar, W. L., "Algebra," Oxford University Press, London, 1941.
(12) Frazer, R. A., Duncan, W. J., Collar, A. R., "Elementary Matrices," Cambridge University Press, 1938.
(13) Margenau, H., Murphy, G. M., "The Mathematics of Physics and Chemistry," Van Nostrand, New York, 1943.
(14) Milne, W. E., "Numerical Calculus," Princeton University Press, Princeton, 1949.
(15) Mood, A. M., "Introduction to the Theory of Statistics," McGraw-Hill, New York, 1950.
(16) Opfell, J. B., Sage, B. H., Am. Doc. Inst., Washington, D. C., Doc. 3849 (1952).
(17) Reamer, H. H., Opfell, J. B., Sage, B. H., Ind. Eng. Chem. 48, 275 (1956).
(18) Rossini, F. D., others, "Selected Values of Properties of Hydrocarbons," Natl. Bur. Standards, Washington, D. C., 1947.
(19) Selleck, F. T., Opfell, J. B., Sage, B. H., Ind. Eng. Chem. 45, 1350 (1953).
(20) Turing, A. M., Quart. J. Mech. and Appl. Math. 1, 287 (1948).
(21) Wilks, S. S., "Elementary Statistical Analysis," Princeton University Press, Princeton, 1949.

RECEIVED for review October 15, 1956. ACCEPTED October 11, 1957.