Weighted least squares curve fitting using ... - ACS Publications

spectrographic method becomes insensitive. Sensitivities in the solder matrix seem to be about five to ten times worse than in an aqueous solution when an air-acetylene flame was used. This can be expected because of lower nebulization and atomization efficiency in the solder matrix. However, when a nitrous oxide-acetylene flame was used to analyze Al and As, no appreciable change in their sensitivities was observed compared to the aqueous solution. The high flame temperature seems to minimize at least the reduction of the atomization efficiency. In conclusion, the following can be summarized:

1. Sn-Pb solders can be easily dissolved in mixed acids of nitric and fluoroboric acid for atomic absorption measurements.
2. Only a single preparation of solder sample is required to analyze all the trace and minor elements under the present experimental conditions.
3. Precision and accuracy of the present method are quite comparable with the wet and other instrumental methods but far better compared to the optical emission spectrographic method.
4. Sensitivity of the method is, in general, much better than wet and optical emission spectrographic methods.
5. The method is also rapid, simple, and accurate for the analysis,

RECEIVED for review February 16, 1970. Accepted April 15, 1970.

Weighted Least Squares Curve Fitting Using Functional Transformations

Peter C. Jurs

Department of Chemistry, The Pennsylvania State University, University Park, Pa. 16802

The method of weighted least squares curve fitting using functional transformations is described. A general discussion of functional transformations is followed by example transformations from exponential to linear, Gaussian to quadratic, and Cauchy to quadratic. Application of the method to experimental curve fitting is demonstrated, and its utility for the development of calibration curves is indicated.

WEIGHTED LEAST SQUARES curve fitting is a common technique used in research and routine laboratory situations. The basics of the method are included in many textbooks dealing with introductory numerical methods. Recent contributions to the literature in this area include papers describing methods for deconvoluting envelopes which may contain more than one peak (1-4), the development of “best” calibration curves using polynomial fits of varying degrees (5), and other general papers (6). Many methods of gamma-ray spectrum analysis based on least squares techniques have been described as well (7-9).

This paper will discuss the method of fitting curves to data using functional transformations. Application of a functional transformation to data before fitting is advantageous in two cases: the data are known to follow a function which is not linear (or quadratic) in the parameters to be fit but which can be reduced to one of these forms, or it is desired to test a functionality to see if it fits the data well. The general method of functional transformations will be discussed and several specific examples with useful transformations will be given.

(1) J. Pitha and R. N. Jones, Nat. Res. Counc. Can., Bull. No. 12, 1968.
(2) R. D. B. Fraser and E. Suzuki, ANAL. CHEM., 38, 1770 (1966).
(3) J. R. Morrey, ibid., 40, 905 (1968).
(4) A. W. Westerberg, ibid., 41, 1770 (1969).
(5) M. Margoshes and S. D. Rasberry, ibid., p 1163.
(6) S. Margulies, Rev. Sci. Instrum., 39, 478 (1968).
(7) E. Schonfeld, Nucl. Instrum. Methods, 51, 177 (1967).
(8) “Applications of Computers to Nuclear and Radiochemistry,” G. D. O’Kelley, Ed., NAS-NS-3107, U. S. Atomic Energy Commission, 1963.
(9) “Modern Trends in Activation Analysis,” J. R. DeVoe, Ed., Vol. II, National Bureau of Standards Special Publication 312, U. S. Government Printing Office, June 1969.

Least squares methods depend on minimizing the sum

$$S = \sum_{i=1}^{n} w_i \left[ y_i - f(x_i) \right]^2 \qquad (1)$$

where $(x_i, y_i)$, $i = 1, 2, \ldots, n$ are a series of points to be fit, $w_i$ is the weight associated with the $i$th point, and $f(x_i)$ is the function being fit. The sum may be differentiated with respect to each of the parameters and set equal to zero, yielding a set of normal equations. If the function $f(x_i)$ is represented as the following polynomial

$$f(x_i) = \sum_{j=0}^{m} a_j x_i^{j} \qquad (2)$$

then the normal equations are

$$\sum_{i} w_i \Bigl( y_i - \sum_{j} a_j x_i^{j} \Bigr) x_i^{k} = 0 \quad \text{for } k = 0, 1, \ldots, m \qquad (3)$$

which can be solved for the $a_j$'s using standard methods. Polynomial fits are often used empirically to fit data which cannot be fit well by linear or quadratic forms (5). However, it is often desirable to fit exponential, logarithmic, or other functions to data in a quantitative manner. This cannot always be done directly because the derivatives are not sufficiently convenient to allow the normal equations to be solved without iteration. This difficulty can be overcome in some cases by using a functional transformation which puts the data into a form which can then be fit using one of the standard methods. Several examples of the implementation of this method follow. These include fitting an exponential function, a logarithmic function, and a sine function by transforming them to linear forms, and fitting of Gaussian and Cauchy functions by transforming them to quadratic forms.

LOGARITHMIC TRANSFORMATION

Many analytical problems, such as radioactive decay experiments, yield data in which two variables are related by an exponential function. Data such as these can be fit using a linear least squares fitting routine by logarithmically transforming the data and the associated weighting factors (10). To fit the points $(x_i, y_i)$, $i = 1, 2, \ldots, n$ with a set of associated weights $w_i$, the following procedure is used. The general form of the function to be fit is

$$f(x_i) = \alpha \exp(\beta x_i) \qquad (4)$$

where the parameters $\alpha$ and $\beta$ are to be fit. Taking the natural logarithm of both sides gives $\ln y_i = \ln \alpha + \beta x_i$. Making the substitutions $y_i' = \ln y_i$, $a_0 = \ln \alpha$, and $a_1 = \beta$ results in a linear function which can be fit using a linear fitting routine. The weights must also be transformed to maintain the correct relationship between them and the points being fit. The general equation for the propagation of standard deviations (cf. Ref. 11) for a function $F(x, y, \ldots)$ is as follows

$$\sigma_F^2 = \left( \frac{\partial F}{\partial x} \right)^2 \sigma_x^2 + \left( \frac{\partial F}{\partial y} \right)^2 \sigma_y^2 + \cdots \qquad (5)$$

Since the weights are often calculated from the standard deviations of the points being fit (10), $w_i = 1/\sigma_i^2$, Equation 5 is used to calculate the new weights

$$w_i' = w_i \left( \frac{\partial y_i'}{\partial y_i} \right)^{-2} \qquad (6)$$

For the logarithmic transformation being used

$$w_i' = w_i y_i^2 \qquad (7)$$

and the new set of weights $w_i'$ may be used along with the new transformed points in performing the linear fit. Even if the original points $(x_i, y_i)$ were equally weighted, e.g., $w_i = 1$ for all $i$, the transformed weights must be calculated according to Equation 6 to maintain the relationship. Use of the untransformed weights with the transformed function and points will give unintended results. A normal linear least squares fitting routine will provide the two parameters $a_0$ and $a_1$ and their standard deviations $\sigma_{a_0}$ and $\sigma_{a_1}$. The exponential parameter $\alpha$ and its associated standard deviation are calculated from the transformations

$$\alpha = \exp(a_0), \qquad \sigma_\alpha^2 = \sigma_{a_0}^2 \exp(2 a_0) \qquad (8)$$

Only the transformation for $\alpha$ is given because $\beta = a_1$. Thus the exponential function of Equation 4 can be fit to a set of data points using a linear fitting routine while maintaining the original correspondence between the data points and their weights.

(10) P. G. Guest, “Numerical Methods of Curve Fitting,” Cambridge University Press, 1961.
(11) D. C. Baird, “An Introduction to Measurement Theory and Experiment Design,” Prentice-Hall, Englewood Cliffs, N. J., 1962.

ANALYTICAL CHEMISTRY, VOL. 42, NO. 7, JUNE 1970

Table I. Test of Logarithmic Transformation
Exponential function: Equation 4. Given parameters: $\alpha$ = 5000; $\beta$ = -1.0. Fifty points.

                          $\alpha$    $\beta$     $\sigma$
Transformed weights       4995        -0.999      15.23
Untransformed weights     4989        -0.999      15.28

[The $R_\alpha$ and $R_\beta$ entries of Table I are not recoverable from the source.]

Table II. Example Transformations

                          Exponential transformation           Arcsin transformation
Original function         $y_i = k \ln(p x_i)$                 $y_i = k \sin(\alpha + \beta x_i)$
Transformation of $y_i$   $y_i' = \exp(y_i/k)$                 $y_i' = \arcsin(y_i/k)$
Transformation of $w_i$   $w_i' = w_i k^2 \exp(-2 y_i/k)$      $w_i' = w_i (k^2 - y_i^2)$
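The logarithmic-transformation procedure (Equations 4, 6, 7, and 8) can be sketched in a few lines of Python with NumPy. This is a minimal illustration, not the author's program: the function names `weighted_polyfit` and `fit_exponential`, the x range of the synthetic data, and the use of the inverse normal matrix as the parameter covariance (which assumes $w_i = 1/\sigma_i^2$) are all assumptions made for the example.

```python
import numpy as np

def weighted_polyfit(x, y, w, degree):
    """Weighted polynomial fit by direct solution of the normal equations.

    Assumes w_i = 1/sigma_i^2, so the inverse of the normal matrix is taken
    as the covariance matrix of the coefficients a_0 ... a_m.
    """
    x, y, w = (np.asarray(v, dtype=float) for v in (x, y, w))
    X = np.vander(x, degree + 1, increasing=True)  # columns 1, x, x^2, ...
    normal = X.T @ (w[:, None] * X)                # sum_i w_i x_i^(j+k)
    rhs = X.T @ (w * y)                            # sum_i w_i y_i x_i^k
    coeffs = np.linalg.solve(normal, rhs)
    sigmas = np.sqrt(np.diag(np.linalg.inv(normal)))
    return coeffs, sigmas

def fit_exponential(x, y, w):
    """Fit y = alpha * exp(beta * x) through the logarithmic transformation."""
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    y_t = np.log(y)                  # y' = ln y
    w_t = w * y**2                   # Equation 7: w' = w y^2
    (a0, a1), (s_a0, s_a1) = weighted_polyfit(x, y_t, w_t, 1)
    alpha = np.exp(a0)               # Equation 8
    sigma_alpha = s_a0 * np.exp(a0)  # square root of sigma_a0^2 exp(2 a0)
    return alpha, a1, sigma_alpha, s_a1

# Synthetic decay curve: 50 points offset by +/-1% with alternating sign.
# The x range 0..5 is a hypothetical choice for this example.
x = np.linspace(0.0, 5.0, 50)
signs = np.where(np.arange(50) % 2 == 0, 1.0, -1.0)
y = 5000.0 * np.exp(-1.0 * x) * (1.0 + 0.01 * signs)
alpha, beta, sigma_alpha, sigma_beta = fit_exponential(x, y, 1.0 / y)
print(alpha, beta, sigma_alpha, sigma_beta)
```

Passing `w` unchanged instead of `w * y**2` to the linear fit reproduces the "untransformed weights" case, which makes it easy to compare the two weighting schemes on the same data.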

In order to demonstrate the logarithmic transformation, a test set of data was fit as shown in Table I. The test data consisted of 50 points which simulate a radioactive decay curve. The points were developed using Equation 4, but they were offset from their nominal values by ±1% (with alternating sign) in order to simulate laboratory data. The weights were calculated as they would be in a counting experiment, i.e., $w_i = 1/y_i$. The results of fitting these data are shown in Table I, where $\alpha$ and $\beta$ are the calculated parameters, $R_\alpha$ and $R_\beta$ are the relative standard deviations of the fit parameters, and $\sigma$ is the standard deviation of the entire curve to the data. The results of fitting with transformed and untransformed weights are compared. The fit using the transformed weights gives considerably smaller relative standard deviations of the parameters, and the overall $\sigma$ is slightly smaller. It should be noted that these results were obtained by a direct linear least squares computation after the transformation. Normal iterative procedures require one linear least squares fit per iteration, yielding a considerably longer computation.

The same approach can be used to fit other functions too, for example, a logarithmic function or a sine function. Table II gives the transformations used in these two cases. In each case the fit of a linear curve to the transformed points $(x_i, y_i')$ gives the parameters (and their associated standard deviations) of the original function directly.

GAUSSIAN TO QUADRATIC TRANSFORMATION

The same general method of functional transformation can be used in reducing more complex functions. Gaussian functions, either alone or in combination with other functions, are widely used for fitting experimental data (1, 8). A Gaussian function can be fit to a set of data points using a quadratic fitting routine after a suitable transformation (12). The Gaussian function to be fit is

$$y_i = \alpha \exp\left[ -\frac{(x_i - \beta)^2}{\gamma} \right] \qquad (9)$$

where $\alpha$ is the amplitude coefficient, $\beta$ is the mean, and $\gamma$ is directly proportional to the variance of the curve (var = $\gamma/2$). Taking the natural logarithm of both sides gives $\ln y_i = \ln \alpha - \beta^2/\gamma + (2\beta/\gamma) x_i - x_i^2/\gamma$. A series of substitutions are made: $y_i' = \ln y_i$; $a_0 = \ln \alpha - \beta^2/\gamma$; $a_1 = 2\beta/\gamma$; $a_2 = -1/\gamma$; yielding an equation which can be fit using a standard quadratic fitting routine. The transformation of the weights is the same as for the logarithmic transformation, Equation 7. The standard quadratic fitting routine determines the quadratic parameters $a_0$, $a_1$, and $a_2$ and their standard deviations. The following transformations give the Gaussian parameters and associated standard deviations.

$$\alpha = \exp\left( a_0 - \frac{a_1^2}{4 a_2} \right), \qquad \beta = -\frac{a_1}{2 a_2}, \qquad \gamma = -\frac{1}{a_2}, \qquad \sigma_\gamma^2 = \frac{\sigma_{a_2}^2}{a_2^4} \qquad (10)$$

[The expressions for $\sigma_\alpha^2$ and $\sigma_\beta^2$ in Equation 10 are not legible in the source.]

Curves are often fit to experimental data in order to develop a calibration curve which then can be used to calculate $y_i$ and its standard deviation from a measured $x_i$. When the calibration curve is derived using the functional transformation method described here, the calculation of $y_i$ from $x_i$ can be performed in two ways: using the Gaussian parameters or using the quadratic (transformed) parameters. Use of the Gaussian parameters involves the following steps. Measure $x_i$ and its associated $w_i$. Then calculate $y_i$ using Equation 9 and the parameters $\alpha$, $\beta$, and $\gamma$. The standard deviation of $y_i$ is calculated from the equation

[Equation 11 is not legible in the source.]

which is derived by applying Equation 5 to Equation 9. Using the quadratic parameters involves the following steps. Measure $x_i$ and $w_i$. Calculate $y_i'$ and its associated standard deviation

$$y_i' = a_0 + a_1 x_i + a_2 x_i^2 \qquad (12)$$

[The accompanying expression for $\sigma_{y_i'}$ in Equations 12 is not legible in the source.]

Then calculate $y_i$ and $\sigma_{y_i}$ using the transformations $y_i = \exp(y_i')$ and $\sigma_{y_i} = \sigma_{y_i'} \exp(y_i')$. The second method, using the quadratic parameters, is a considerably shorter and more convenient calculation to perform routinely at a desk calculator, and its use could save time and reduce errors.

(12) B. R. Kowalski and T. L. Isenhour, ANAL. CHEM., 40, 1186 (1968).

CAUCHY TO QUADRATIC TRANSFORMATION

Experimental data are often fit with Cauchy functions, or combinations of Cauchy and Gaussian functions (13). Cauchy functions can be fit to a set of data points using a quadratic least squares routine after a suitable transformation. The general form of the Cauchy function to be fit is

$$y_i = \frac{\alpha}{1 + \gamma^{-1} (x_i - \beta)^2} \qquad (13)$$

where $\alpha$ is the amplitude coefficient, $\beta$ is the mean, and $\gamma$ is a measure of the spread of the curve. (The standard deviation is not defined for this function.) Taking the reciprocal of both sides and rearranging gives

$$\frac{1}{y_i} = \frac{1}{\alpha} + \frac{\beta^2}{\alpha\gamma} - \frac{2\beta}{\alpha\gamma} x_i + \frac{1}{\alpha\gamma} x_i^2 \qquad (14)$$

The following substitutions are made: $y_i' = 1/y_i$; $a_0 = 1/\alpha + \beta^2/\alpha\gamma$; $a_1 = -2\beta/\alpha\gamma$; $a_2 = 1/\alpha\gamma$; and a quadratic equation results. The weights are transformed using Equation 6, which gives

$$w_i' = w_i y_i^4 \qquad (15)$$

Then the standard quadratic fitting routine determines the quadratic parameters and their associated standard deviations. The following substitutions give the Cauchy parameters and the associated standard deviations for $\alpha$, $\beta$, and $\gamma$.

$$\alpha = \frac{4 a_2}{4 a_0 a_2 - a_1^2}, \qquad \beta = -\frac{a_1}{2 a_2}, \qquad \gamma = \frac{4 a_0 a_2 - a_1^2}{4 a_2^2} \qquad (16)$$

[The expressions for the standard deviations in Equation 16 are not legible in the source.]

This method for deriving the Cauchy parameters allows a series of Cauchy functions to be fit, for example, to nonoverlapping infrared band envelopes by direct computation, which possesses several advantages over iterative methods. Alternatively, the method could be used to find starting values of parameters to be used in an iterative routine with overlapping bands. Once the Cauchy function is fit to the set of data to give a calibration curve, a new $y_i$ can be calculated from a new $x_i$ in two ways. Using the Cauchy parameters involves the following steps. Measure $x_i$ and $w_i$. Calculate $y_i$ using Equation 13 and the parameters $\alpha$, $\beta$, and $\gamma$. Calculate the standard deviation of $y_i$ from the equation

[Equation 17 is not legible in the source.]

which is derived by using Equation 5 with Equation 13. Using the quadratic parameters involves the following steps. Measure $x_i$ and $w_i$. Calculate $y_i'$ and its standard deviation using Equations 12. Then calculate $y_i$ and its standard deviation using the transformations $y_i = 1/y_i'$ and $\sigma_{y_i}^2 = \sigma_{y_i'}^2 y_i^4$. The method using the quadratic parameters is a shorter and more convenient calculation.

(13) R. N. Jones and R. P. Young, Nat. Res. Counc. Can., Bull. No. 13, 1969, p 23.

To demonstrate the transformation method for fitting Gaussian and Cauchy functions, two sets of test data have been fit as shown in Table III. The test points were developed using Equations 9 and 13 with ±1% offsets used as before. The fits done with transformed and untransformed weights yield much different results in these cases. The overall standard deviation of the Gaussian fit is four times as large for the untransformed case as for the transformed case. Once again, these fits have been done with a single quadratic least squares fit and no iterations.
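The two quadratic transformations can be sketched with one shared weighted quadratic solver. The code below is an illustration only, written under stated assumptions: the function names are hypothetical, the test data are noise-free points generated from the parameter values $\alpha = 100$, $\beta = 1$, $\gamma = 2$, and the x range is chosen arbitrarily. The parameter inversions follow directly from the substitutions for $a_0$, $a_1$, and $a_2$ given in the text.

```python
import numpy as np

def weighted_quadratic(x, y, w):
    """Return a_0, a_1, a_2 of a weighted quadratic fit (normal equations)."""
    x, y, w = (np.asarray(v, dtype=float) for v in (x, y, w))
    X = np.vander(x, 3, increasing=True)  # columns 1, x, x^2
    return np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * y))

def fit_gaussian(x, y, w):
    """Fit y = alpha * exp(-(x - beta)^2 / gamma) via y' = ln y, w' = w y^2."""
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    a0, a1, a2 = weighted_quadratic(x, np.log(y), w * y**2)
    gamma = -1.0 / a2
    beta = -a1 / (2.0 * a2)
    alpha = np.exp(a0 - a1**2 / (4.0 * a2))
    return alpha, beta, gamma

def fit_cauchy(x, y, w):
    """Fit y = alpha / (1 + (x - beta)^2 / gamma) via y' = 1/y, w' = w y^4."""
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    a0, a1, a2 = weighted_quadratic(x, 1.0 / y, w * y**4)
    disc = 4.0 * a0 * a2 - a1**2
    alpha = 4.0 * a2 / disc
    beta = -a1 / (2.0 * a2)
    gamma = disc / (4.0 * a2**2)
    return alpha, beta, gamma

# Noise-free check data; the x range -2..4 is a hypothetical choice.
x = np.linspace(-2.0, 4.0, 20)
w = np.ones_like(x)
y_gauss = 100.0 * np.exp(-((x - 1.0) ** 2) / 2.0)
y_cauchy = 100.0 / (1.0 + (x - 1.0) ** 2 / 2.0)
print(fit_gaussian(x, y_gauss, w))
print(fit_cauchy(x, y_cauchy, w))
```

Both fits are single direct solves with no iteration, which is the point made in the text. Note that both transformations require strictly positive $y_i$, so baseline-subtracted data with zero or negative points would need to be screened first.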

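For the Gaussian case, the standard deviations of $\alpha$ and $\beta$ can be worked out from the quadratic coefficients by applying the propagation formula (Equation 5) to the inversions $\gamma = -1/a_2$, $\beta = -a_1/2a_2$, and $\alpha = \exp(a_0 - a_1^2/4a_2)$. The derivation below is a sketch under the assumption that the errors in $a_0$, $a_1$, and $a_2$ are treated as independent (no covariance terms); it is not taken from the paper's printed equations and may differ in form from them.

```latex
\begin{align*}
\sigma_\gamma^2 &= \left(\frac{\partial\gamma}{\partial a_2}\right)^2 \sigma_{a_2}^2
                 = \frac{\sigma_{a_2}^2}{a_2^4}, \\
\sigma_\beta^2  &= \left(\frac{\partial\beta}{\partial a_1}\right)^2 \sigma_{a_1}^2
                 + \left(\frac{\partial\beta}{\partial a_2}\right)^2 \sigma_{a_2}^2
                 = \frac{\sigma_{a_1}^2}{4 a_2^2} + \frac{a_1^2\,\sigma_{a_2}^2}{4 a_2^4}, \\
\sigma_\alpha^2 &= \alpha^2 \left[ \sigma_{a_0}^2
                 + \frac{a_1^2}{4 a_2^2}\,\sigma_{a_1}^2
                 + \frac{a_1^4}{16 a_2^4}\,\sigma_{a_2}^2 \right].
\end{align*}
```

The first line agrees with the one standard-deviation expression that survives in the text, $\sigma_\gamma^2 = \sigma_{a_2}^2 / a_2^4$. In practice the quadratic coefficients are correlated, so a full treatment would add the covariance terms from the inverse normal matrix.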

Table III. Transformations to Quadratic Fits

Gaussian function: Equation 9. Given parameters: $\alpha$ = 100.0; $\beta$ = 1.00; $\gamma$ = 2.00. Twenty points.

                          $\alpha$    $\beta$     $\gamma$    $\sigma$
Transformed weights       100.01      1.0000      2.0000      0.457
Untransformed weights      96.03      0.9998      2.0072      1.833

Cauchy function: Equation 13. Given parameters: $\alpha$ = 100.0; $\beta$ = 1.00; $\gamma$ = 2.00. Twenty points.

                          $\alpha$    $\beta$     $\gamma$    $\sigma$
Transformed weights       100.01      0.9999      2.0000      0.509
Untransformed weights     100.92      0.9940      1.9788      0.646

[The $R_\alpha$, $R_\beta$, and $R_\gamma$ entries of Table III are not recoverable from the source.]

Curve envelopes can be fit by functions which are combinations of Gaussian and Cauchy functions such as their sum

$$y_i = \frac{\alpha}{1 + \gamma^{-1} (x_i - \beta)^2} + \delta \exp\left[ -\frac{(x_i - \beta)^2}{\epsilon} \right] \qquad (18)$$

Reports of fitting this function to an infrared band envelope indicate that good fits are obtained with $\delta = 0.1\alpha$ (the Gaussian has a smaller amplitude than the Cauchy function) and $\gamma = k\epsilon$ where $0.4 < k < 0.8$ (the Gaussian is wider than the Cauchy function) (13). The iterative fitting of this function to an infrared band envelope is aided by supplying the best possible parameters from which to start. A set of starting parameters can be derived as follows. Fit a Cauchy function to the band envelope. The three Cauchy parameters can then be used to obtain starting iterative parameters consistent with the findings mentioned above. Use the found Cauchy centroid for $\beta$; use nine tenths of the Cauchy amplitude for $\alpha$ and one tenth for $\delta$; use the Cauchy width for $\gamma$ and twice its value for $\epsilon$. Thus one has obtained values for the five parameters of Equation 18.

It has been found that fitting a Cauchy function to such an envelope gives very good values for $\beta$, the centroid, and good values for $\alpha$ and $\gamma$. For example, an envelope was developed from Equation 18 with the parameters $\alpha = 90.0$, $\beta = 0.0$, $\gamma = 2.50$, $\delta = 10.0$, and $\epsilon = 3.90$. The resulting Cauchy parameters from the fit were $\alpha = 102.2$, $\beta = 0.00$, $\gamma = 2.28$, and the overall standard deviation of the fit was 0.81, which shows that the fit was good. These Cauchy parameters could be used as described to start an iterative routine to fit Equation 18 to the envelope.

Curve fitting using transformations of data to different functional forms can be used empirically. The original data points to be fit $(x_i, y_i)$ can be transformed into a different functional form $(x_i, y_i')$ and they can be fit empirically in that form. The weights associated with the points must always be transformed using Equation 6. Use of this method may result in the fitting of the original data with an acceptably good curve using fewer parameters than an expansion might require.
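The recipe for starting parameters of the Cauchy-plus-Gaussian sum can be written down directly. The sketch below assumes the sum has the five-parameter form with a shared centroid (Cauchy amplitude $\alpha$ and width $\gamma$, Gaussian amplitude $\delta$ and width $\epsilon$, common centre $\beta$); the function names are hypothetical, and the input values are the Cauchy fit results quoted in the text.

```python
import numpy as np

def cauchy_plus_gaussian(x, alpha, beta, gamma, delta, epsilon):
    """Evaluate the assumed five-parameter sum of a Cauchy and a Gaussian band."""
    x = np.asarray(x, dtype=float)
    d2 = (x - beta) ** 2
    return alpha / (1.0 + d2 / gamma) + delta * np.exp(-d2 / epsilon)

def starting_parameters(cauchy_alpha, cauchy_beta, cauchy_gamma):
    """Starting values for an iterative fit, derived from a single Cauchy fit."""
    return {
        "alpha": 0.9 * cauchy_alpha,    # nine tenths of the Cauchy amplitude
        "beta": cauchy_beta,            # Cauchy centroid
        "gamma": cauchy_gamma,          # Cauchy width
        "delta": 0.1 * cauchy_alpha,    # one tenth of the Cauchy amplitude
        "epsilon": 2.0 * cauchy_gamma,  # twice the Cauchy width
    }

# Cauchy parameters reported in the text for the test envelope.
start = starting_parameters(102.2, 0.00, 2.28)
print(start)
print(cauchy_plus_gaussian(0.0, **start))
```

These values would then seed whatever iterative least squares routine is used for the five-parameter fit; the choice of routine is outside the scope of this sketch.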

RECEIVED for review January 9, 1970. Accepted April 6, 1970.

Identification of Polycyclic Aromatic and Heterocyclic Crude Oil Carboxylic Acids

Wolfgang K. Seifert
Chevron Oil Field Research Co., P.O. Box 1627, Richmond, Calif. 94802

Richard M. Teeter
Chevron Research Co., Richmond, Calif. 94802

Compound classes of polycyclic aromatic and heterocyclic carboxylic acids, hitherto unknown in petroleum, were found in a California crude oil after conversion of the acids to their corresponding hydrocarbons followed by chromatographic separation on alumina and molecular spectrometry by various methods, including high resolution mass spectrometry. New classes of carboxylic acids found are mainly derivatives of benzfluorene-, acridine-, tetrahydrobenzacridine-, benzcarbazole-, tetrahydrobenzcarbazole-, cyclopentanophenanthrene-, dibenzfuran-, and benzologs and of partly hydrogenated benzologs of the latter and of benzthiophene. The present study confirms and extends a preceding investigation of the polycyclic naphthenic, mono-, and diaromatic crude oil carboxylic acids. The conventional view of “Naphthenic Acids” in petroleum as primarily mononaphthenic and alkanoic acids is expanded by about 40 classes of carboxylic acids. Quantitative estimates are presented.


PRIOR TO THE WORK recently published from this laboratory (1), structural information on crude oil carboxylic acids was limited to alkanoic, 1-ring naphthenic, and simple aromatic types. The literature on the large amount of previous work has been summarized in one of our recent papers (2) which deals with the conversion of these California crude oil carboxylic acids to hydrocarbons for easier identification. High resolution molecular spectrometry of the derived hydrocarbons of low polarity (1) led to the discovery of some 15-20 compound classes of carboxylic acids of predominantly terpenoid polynuclear naphthenic, and mono- and diaromatic structure whose presence in petroleum was hitherto unknown.

(1) W. K. Seifert and R. M. Teeter, ANAL. CHEM., 42, 180 (1970).
(2) W. K. Seifert, R. M. Teeter, W. G. Howells, and M. J. R. Cantow, ibid., 41, 1638 (1969).