Practical least squares approximation of chromatograms - Analytical

Raman amide I and amide III spectra. Robert W. Williams ... H.J.G. Debets , A.W. Wijnsma , D.A. Doornbos , H.C. Smit. Analytica Chimica Acta 1985 ...
0 downloads 0 Views 727KB Size
Practical Least Squares Approximation of Chromatograms S. M. Roberts Process Industry Development, IBM, Houston, Texas

D. H. Wilkinson DACS Center, IBM, Houston, Texas

L. R . Walker IBM, Beaumont, Texas This paper describes practical experience with least squares approximation of chromatograms. Practical rules of thumb are suggested for making the initial estimates of the parameters for the underlying Gaussian or skewed Gaussian curves. For least squares approximations which do not converge, the paper introduces the forcing factor, a numerical device, which forces the least squares process to converge. A variety of examples are presented demonstrating the utility of the forcing factor concept.

SEVERAL INVESTIGATORS have proposed resolving fused peaks and other curve anomalies of chromatograms and spectral traces by least squares techniques (1-4). These schemes are based on the assumption that the traces may be approximated by Gaussian curves or skewed Gaussian curves. The gist of the least squares approximation technique is that for each chromatogram peak a set of trial values for the parameters, commonly peak amplitude, time of peak amplitude, “width” of peak, and “skewedness” factor are chosen and a theoretical Gaussian or skewed Gaussian curve is generated. For each point in time at which the laboratory data are taken, a comparison is made between the chromatogram data and the theoretical Gaussian or skewed Gaussian curve in the least squares sense. As a result, a set of corrections to the trial values for the parameters are generated and the process is repeated for new estimates of the values of the parameters. The cycle is terminated by one of two conditions: the value of each parameter converges to a “steady” state value or the maximum iteration count is exceeded. The least squares technique for chromatograms has been found to work successfully in a number of cases. By “successfully’’ we imply the Gaussian or skewed Gaussian curve approximates the chromatograms “closely” and the area under the Gaussian curve is within a small tolerance, say less than 5 of the chromatogram area. This does not mean (1) R. D. B. Fraser and E. Suzuki, ANAL.CHEM., 38, 1770-1773, 1966. (2) H. M. Gladney, B. F. Dowden, and J. D. Swalen, ibid., 41, 883 (1969). (3) W. 0. Keller, T. R. Lusebrink, and C. H. Sederholm, J . Chern. Phys., 44, 782-793 (1966). (4) H. Stone, J. Opt. SOC.Amer., 52,998-1003 (1962).

that the least squares technique is guaranteed to work in all cases. The objectives of this paper are practical rather than theoretical. Since the least squares method requires some judgment on the part of the user, we discuss, based on our own computing experience, some suggestions to assist the user of this method. Included are procedures to obtain initial trial values for the parameters and suggestions on how to speed up the computation. Since the least squares method can fail to converge, we exhibit several ways in which failure manifests itself. We also propose a method to force convergence and illustrate its efficacy. THEORY

We briefly discuss the least squares procedure. details may be found in (5).

Let xi = independent variable, i = 1, 2, . .,N Yi = Y ( x i ) = experimentally observed data, i = 1 , 2 , . .,N pj = parameters in a theoretical function F(P1, PB ,.., P , , x ) , j = 1 , 2,..,m pjw = the estimated value of PI for theqth iteration wi = weighting factor for the ith data point, a scalar = F(P1,Pz, , ., Pm,x i ) = a theoretical function Fi evaluated at the point x i = number of experimentally observed data N points = number of parameters P j m The least squares procedure requires that a set of N experiYB, , ,, Y , be curve fit by a funcmentally observed data Y1, tion F(P1,P2, , ., P,, x ) such that the residual sum of squares N

s

= i=l

wi( F , -

Yr)2

ANALYTICAL CHEMISTRY, VOL. 42, NO. 8, JULY 1970

(1)

is minimized through the choice of values for Pi, P2, . ., P,. The choice of the function F is made so that F more or less has the general shape of the plot of the data. The minimization of Equation 1 leads to the normal equations ( 5 ) N. R. Draper and H. Smith, “Applied Regression Analysis,”

Wiley, New York, N.Y., 1966, Chapter 10.

886

More

which is a set of m linear symmetric equations in the m unknowns APlC‘), APz(*),. .,

The least square process is an iterative process where for the qth set of trial or estimated values of the parameters PI(^), PZ(*), . ., P,(*),the function Fi is given by F J g ) and all the partial derivatives are evaluated using P,(@)k = 1, 2, . ., m. The solution of the normal equations gives the corrections APn(g)k = 1 , 2, . ,, m to the estimated P,(Q. New estimates of the parameters are found by forming: p,(g+l) = p,(d

+ Ap,(g) =

1, 2,

. ., m

(4)

Q-+=

Practically the iterative process is terminated when:

where e k is a user supplied tolerance. The success of the method is crucially dependent on whether an inverse of the matrix in Equation 2 exists. This in turn depends on how well the trial values of the parameters P k ( q ) , k = 1 , 2 , . ., m approximate the true but unknown values of the parameters Pk,k = 1, 2, . ., m. The function F may be a single function or a sum of functions, not necessarily all the same type. If the experimental data consist of a series of s peaks and if each peak may be described by m parameters, we may write 8

~i

Fj,i( P j , t , pj,2

=

9

Pj,mt xi)

(6)

3=1

where = 1 , 2 , . ., s for the jth theoretical function, and kth parameter,

Fi,*= jth theoretical function evaluated at x i , j Pi,,

=

k = 1 , 2,.., m; j = 1 , 2,.., s

Equation 2 calls for the partial derivatives b F f / b P j . These may be obtained by differentiation of F or may be obtained by numerical differentiation if the derivative is unwieldy analytically. Having discussed the idea of least squares approximation, for a general function, we now specialize the results to skewed Gaussian functions. Gladney, Dowden, and Swalen have suggested a skewed Gaussian function which may be applied to chromatograph data among others (2). The function is f = A&, to, w , 7 ) = A 2 d r z exp (z2

- 2yz) erf 0, - z) (7)

where A

= Amplitude of the peak

U

= distribution function = 2 G z exp (zz

t to Y

= independent variable (time) = time at which peak occurs

Z

= w / A r

2yz)erfb - r )

= (t

-

TIME, t

Figure 1. Evaluation of w and

7

(3)

Theoretically the iterative process converges when: Lim Pk(Q+ Pn, k

to

-

7

=

W

=

skewness factor width of curve

To determine w , drop a perpendicular from the peak height to the corrected base line. From a point one half the peak height, the distance to the left to the curve w L is taken as the initial estimate of the width w. To determine r, at a point l / 4 the peak height measured from the base line, the distance to the right to the curve is wR. The skewness factor is given by 7

=

WR

-

WL.

(8)

See Figure 1 . When the ratio of r / w < .4,the curve is Gaussian and instead of Equation 7, we use the Gaussian expression:

f

= A&, to, w ) = A exp

[

-----( t ;w:0)2]

(7’)

To associate the variables in Equation 7 with the general function F(P1, Pz,. ., Pm, x t ) , we note f corresponds to F, the parameters A , to, w , r correspond to P I ,Pz, PB,P 4 and the independent variable time t i corresponds to x f . One advantage of the Gladney, Dowden, Swalen approach is that an analytical Expression 7 or 7’ is obtained which can be used to integrate analytically or numerically the area under the peak, once the parameters are fixed by the least squares process. In fact, it can be shown that J-aa

f ( t >dt

= f i A w

(9)

Alternatively for curves which do not start or return to base line, one can numerically integrate, for an example, a single peak as Pl,

J

N

A ~ ( i it ,o , W , ~ ) A t i

f(t)dt = 1,’

i=l

tr = initial point on curve t f = final point on curve N = number of data points

For two overlapping peaks the area is computed as

to)/dTw tr = ti t f = tN t , = initial point for the second peak ANALYTICAL CHEMISTRY, VOL. 42, NO. 8, JULY 1970

887

Estimating the Partial Derivatives. Since the least squares procedure is an iterative process, it is not necessary for the partial derivatives to be known exactly. Gladney, Dowden, and Swalen, for example estimated the partials as follows:

AI

-bf-- v bA

t2 +3

t4

t5

TIME,t

Figure 2. Estimation of peak parameters

The first summation refers to the first peak and is taken over all the data between tr and t,. The second summation refers to the second peak and is taken over all the data between t, and t,. In the interval t, to t , both skewed Gaussian curves use the same data. Multiple peaks are handled in the same manner. See Figure 2 where t, = f 3 and t , = t s . The problem of obtaining the initial estimates of the parameters for each peak, as well as the span ( t l , t,), (t,, t,) over which each peak data are assumed to exist, will be discussed later. JUSTIFICATION OF THE LEAST SQUARES METHOD

The application of a least squares method to chromatography can only be justified when the analyst requires a level of accuracy and reproducibility that are lacking in other methods. The analyst pays a price for the least squares method; namely, the cost of the acquisition and storage of the raw data, and the processing of much more data than other methods. In addition the computation must be executed in a background mode or off-line mode so the results are not published as quickly as other methods. The least squares method does not provide area or composition directly but rather the “best” values for the parameters. From these the area and composition are computed. As described below the least squares procedure can not be used blindly but requires intelligent implementation by the user. The least squares method should be applied selectively only to those groups of peaks that require careful resolution. One potential use of the least squares procedure is as a standard against which “short cut” or “ad hoc” methods may be compared. PRACTICAL IMPLICATION OF THE LEAST SQUARES METHOD

While the least squares method of Gladney, Dowden, and Swalen theoretically offers a way to resolve fused peaks by skewed Gaussian curves, a number of practical decisions and actions must be carried out in order for the method to be fruitful. Among these are: removing the base-line drift; estimating the partial derivatives; scaling the matrix of the normal equations; estimating the initial trial values for the parameters; forcing convergence; speeding up the convergence. Removing the Base-Line Drift. It is obvious, but worth mentioning anyway, that the raw data must be corrected for base-line drift before the least squares method can be applied properly. 888

ANALYTICAL CHEMISTRY, VOL. 42, NO. 8, JULY 1970

and due to its complicated analytical expression, bf,br was obtained numerically.

- -- AV(4 t o , bf’ d7

w,

+

AT)

- Au(t, t o ,

w,

T), for each

t*,

AT i

=

1, 2,

. .,N

(16)

A convenient value for AT = 0.01 r Scaling for the Matrix of Normal Equations. If we ex-

amine the expressions for the partials we observe bf@A is independent of A , bf/br is dependent on A through f evaluated AT, while bf/btoand bfjbw are directly proat r and r portional to A . As a result of these relationships, for groups of peaks in which the amplitude of one peak (or more peaks) is very large compared to the others, the terms in the normal equations may differ widely in magnitude. As a consequence, the normal equations may be ill-conditioned. To by-pass this problem we have found it wise to rescale the data so the maximum amplitude is less than or equal to 1. Initial Estimates of the Values of Parameters. For each peak the user must find nominal or initial values of the following information A , to, w, T initial peak time, final peak time. Let us consider as an example Figure 2, a group of 2 peaks. For the first peak the amplitude AI and the initial estimate of time of peak amplitude = tz are read off directly. The width is also found by direct measurements. To obtain T we sketch in the line EF which we estimate would be the shape of the first curve, had no overlapping peak occurred. The first peak is taken to exist over the interval ( t l , ts). The second peak poses more problems. We sketch in the line G to fa, which we estimate to be the shape of the second peak extrapolated to base line. The point G may be thought of as the “initial” point of the second peak. The points E and G may coincide and be taken as the point in the trough between the two peaks. The amplitude A Z = J H is measured from the intersection of the dotted line EF to the crest of the small peak. The time at which the peak amplitude occurs is to,z= r4. The peak width is measured from a point halfway up the amplitude from the base at point H to the left. The distance wL = w may intersect either the curve or the extension of the curve from G to f a depending on the shape of the second peak and the line EF. The T is estimated using its definition. The interval over which the second peak is defined is ( f a , ts). The data in the interval (fa, r5) is used by the least squares procedure for both peaks. The suggested procedures are not to be considered rigid or inflexible. The rough idea is to sketch in the curves as best as one is able and then estimate the parameters. Our experience has been that initial estimates of the values of parameters as much as 25z in error compared to the true

+

values of the parameters will often still lead to convergence. Sometimes initial estimates of parameters in error in excess of 25 of true values will still lead to convergence. The specification of the data spans for each curve, (h, fb) for the first curve and ( t 3 , f6) for the second curve in Figure 2 is desirable since it speeds up the computation. By limiting the range of data for each curve, the program only processes those points which affect each peak. Convergence of the Method. The least squares method may be looked at as Newton’s method for the determination of the roots of an equation. As in Newton’s method the initial estimates of the roots are crucial for successful solution of the equations. Initial guesses of the values of the parameters which are too “far” from the true but unknown solution may cause the method to diverge rather than to converge. Some of the ways in which nonconvergence manifests itself are discussed in the Section on Numerical Experience. An extremely effective device to force and accelerate convergence is the forcing factor. The forcing factor is a constant 2 1.0, which is used to multiply the diagonal terms of the matrix in Equation 2. We will discuss the forcing factor from two points of view: its theoretical justification and its use. The theoretical justification of the forcing factor rests on two bases. First from matrix theory it is well known that a matrix which is diagonally dominant possesses an inverse. By definition matrix B with elements btl is said to be diagonally dominant if

In other words the absolute value of the diagonal element in each row is larger than the sum of the absolute values of the off diagonal terms. By multiplying the diagonal terms by a forcing factor, which is sufficiently large, we convert the matrix of Equation 2 into a diagonally dominant matrix. Second the introduction of the forcing factor is a realization of Levenberg’s damped least squares procedure (6). Instead of minimizing Equation 1, this procedure minimizes N

S =

C W f(F, i=l

Yr)’

+ d2

m

Cj APj2

(1 8)

j=1

where C, are parameters to be chosen and d is a scalar “damping” factor. As a consequence of this function, the least squares procedure produces a scalar multiplier (1 d2) of the diagonal terms in Equation 2. We see therefore that (1 dz) is our forcing factor. Levenberg recommends Equation 18 to induce convergence. The forcing factor therefore serves two roles: it guarantees the solution of Equation 2 for each iteration and it helps to promote convergence of the iterative scheme. To use the forcing factor we set it equal to 1.0 until we run into some difficulties. If the problem fails to converge or converges slowly, then the forcing factor is increased. The choice of the proper value for the forcing factor is therefore a matter of experience. We recommend increasing the forcing factor from 1.0 to 2.0 as the first adjustment. Sometimes increasing the forcing factor trivially beyond 1.O is sufficient to make a dramatic difference in the convergence properties of the solution. We see no real advantage in making the forcing factor any larger than necessary. Our

+

+

( 6 ) K. Levenberg, Quart. Appl. Marh., 2, 164-168 (1944).

experience has been that if a certain forcing factor helps the iterative process achieve convergence a substantially larger forcing factor does not necessarily accelerate convergence. Speeding Up to the Computation. A number of devices are available to the analyst for speeding up computation. INITIAL ESTIMATES OF THE VALUESOF THE PARAMETERS. One of the most powerful ways to accelerate the convergence of the method is to use good initial estimates of the parameters. While our discussion on initial estimates of the value of parameters is practical when starting with new samples, other schemes may be more efficient. One simple but useful technique is to store “standard” initial estimates for certain classes or groups of samples. Another effective method is to save the last successful solution for each class of samples and use these values as the new initial estimates of the parameters. This has the advantage that if a type of sample is changing with time, the initial trial values of the parameters also move with time. NUMBER OF DATA POINTS.The chromatograph generates a generous amount of data for each trace. For the least squares process, one need not process all the data available from the chromatograph. Provided the data points are spread over the span of the curve, as few as 3 or 4 data points per estimated parameter may suffice to determine the values of the parameters with sufficient accuracy. SPANOF EACHPEAK. A computer program with the specification of the span for each peak restricts the least squares program to process only the data pertinent to each peak. The computation therefore is speeded up by not processing points where the& = 0. ERF FUNCTION EVALUAT~ON. The evaluation of the Erf function in Equation 7 by an approximating polynomial rather than a table-look-up and interpolation speeds up the computation. RESTRICTION ON A P p . A practical programming device is to restrict the size of the AP5(q),j = 1, 2, . ., m, q = 0, 1, . . ., which are determined by the solution of the normal equations. When the trial estimates of the values of the parameter are poor, some of the AP+‘-‘) corrections may be very large relative to the Pj(‘). This may lead to absurd results and/or ill conditioned normal equations. If the calculated AP+Q exceeds P+*) by lo%, an effective programming device has been to set AP5(Q = 0.10 P5(*) where the sign of the calculated is used in the restricted AP+Q. FORCING FACTOR. As discussed above the forcing factor not only promotes convergence, in some instances it accelerates the rate of convergence. SYMMETRY OF THE NORMAL EQUATIONS.Since the matrix of the normal equations is symmetric, it is obvious that any well written computer program will take advantage of this fact.

I

I

I

I,

NUMERICAL EXPERIENCE

To test the least squares procedure we applied it to true Gaussian curves and then to actual chromatograph data. First we synthesized Gaussian curves of known data and parameters to see whether the least squares procedure would recover the original parameters. As an example of this, in Table Ia are tabulated data for two Gaussian curves and the composite curve, Curve D, which appears as the solid line in Figure 3. In the last column of Table l b are listed the true parameters of the Gaussian curves, while in the first three columns are listed the estimated parameters for the 0, 5, 10 iterations. Note for the first peak and the 0th ANALYTICAL CHEMISTRY, VOL. 42, NO. 8, JULY 1970

889

OSO

I-

Table Ib. Synthetic Gaussian Data (Figure 3)

0.40

Iteration No. Peak 1

fctl

A

0.30

to W

7

0.20

Forcing factor Span Relative area" peak 1

0.IO

Peak 2 A 0

to

W

TIME

7

-AAA 0-

Peak1 Peak 2

Fifth iteration, forcing factor = 1.0 Tenth iteration, forcing factor = 1.0 Fifth iteration, forcing factor = 2.0 Chromatograph trace Initial estimates of parameters

-0

-

A 0.3989 0.125

to

3.0 4.0

W

1.0 1.3

7

0.41 0.11

Span

0.00443 0.00791 0.01358 0.02239 0.03547 0.05399 0.07895 0.11092 0.14972 0.19418 0.24197 0.28969 0.33322 0.36827 0.39104 0.39894 0.39104 0.36827 0.33322 0.28969 0.24197 0.19418 0.14972 0.11092 0.07895 0.05399 0.03547 0.02239 0.01358 0.00791 0.00443

0.00221 0.02699 0.12098 0.19947 0.12098 0.02697 0.00221

0.00443 0.00791 0.01358 0.02239 0.03547 0.05399 0.07895 0.11092 0.14972 0.19418 0.24197 0.28969 0.33322 0.36827 0.39104 0.39894 0.39104 0.37048 0.36021 0.41067 0.44144 0.31516 0.17671 0.11313 0.07895 0.05399 0.03547 0.02239 0.01358 0.00791 0.00443

iteration, the estimated value of 7 = 0.41 specifies a slightly skewed curve. In both Table I b (for forcing factor = l.O), and Table IC (for forcing factor = 2.0), the least squares procedure recovers the true parameters of the underlying curves within an error of 1.5% of the true parameters. The relative area under each peak compared to the total area under both peaks is also given in Tables Ib and IC. The 890

10

0.3989 3.0000 1.0000 0.4100 1 .O 0.0-6.0

0.3968 2.9889 1.0410 0.3714 1.0 0.0-6.0 0.9203

0.3964 3.0086 1.0135 0.3714 1.o 0.0-6.0 0.8967

0.1347 4.2285 0.2646 0.1185 1 .o 2.5-6.0 0.0797

0.1979 3.9184 0.2334 0.1229 1 .o 2.5-6.0 0.1033

0.1250 4.oooO 0.3000 0.1100 1.O 2.5-6.0

ANALYTICAL CHEMISTRY, VOL. 42, NO. 8, JULY 1970

=

0.3989

3.oooo

.oooo

1 0.0

0.8887

0.1995 4 . m 0.2500 0.0 0.1113

Area under peak/Total area

Table IC. Synthetic Gaussian Data, (Figure 3) Iteration No. 0 5 Peak 1

0-6.0 2.5-5.0

Table la. Synthetic Gaussian Curve Data First Gaussian curve Second Composite curve, curve D Time Y Gaussian curve 0.0 0.2 0.4 0.6 0.8 1 .o 1.2 1.4 1.6 1.8 2.0 2.2 2.4 2.6 2.8 3.0 3.2 3.4 3.6 3.8 4.0 4.2 4.4 4.6 4.8 5.0 5.2 5.4 5.6 5.8 6.0

5

Forcing factor Span Relative area" peak 2 Relative area for each peak under peaks.

Figure 3. Curve D 0- -0

0

True value of parameters

A to W T

Forcing factor Span Relative area peak 1

0.3989

3.oooo

1.0000 0.4100 2.0 0.0-6.0

0.4034 3.0133 1.0017 0.3690 2.0 0.0-6.0 0 9002 I

Peak 2 A to W

7

Forcing factor Span Relative area peak 2

0.1250 4 . m 0.3000 0.1100 2.0 2.5-5.0

0.1981 4.0416 0.2256 0.0882

2.0 2.5-5.0 0.0998

results presented here for this one example are more or less typical results one may expect from Gaussian data being approximated by Gaussian curves in the least squares sense. To illustrate the effect of the forcing factor consider, for example, Curve D,Figure 3 which gives the results for the forcing factor = 1.0 and 2.0. Note how with the forcing factor = 1.0, the 5th iteration omits the second peak while the 10th iteration discloses it. For exactly the same initial estimates, the 5th iteration with the forcing factor = 2.0 gives comparable results to the 10th iteration with the forcing factor = 1.0. (See Tables I b and IC.) Here is a case where the forcing factor accelerated convergence. While we have suggested and illustrated here that the use of the forcing factor can sometimes aid the convergence of the least squares process, the utilization of larger and larger forcing factors does not necessarily promote convergence or faster convergence. Our experience indicates one should use as small a forcing factor as possible if convergence problems should arise. The remainder of our discussion will deal with applying the least squares technique to actual chromatograph data. As an example of a successful application of the least

Table 11. Benzene, 2,2,4Trimethylpentane, n-Heptane Sample Iteration No.

W

Pk 1 0.4820 1060.0 27.5

Pk 2 0.2360 1205.0 66.9

Pk 3 0.4850 1360.0 31.8

Pk 1 0.4771 1058.3 42.4

5 Pk 2 0.2589 1221.6 57.3

Pk 3 0.5007 1349.7 42.0

7

0.0

0.0

0.0

0.0

0.0

0.0

33.1 35.9

28.4 25.4

38.4 38.7

37.8 35.9

22.6 25.4

39.6 38.7

A to

Calculated composition, % Measured composition, %

0

Table 111. Parameters

US.

Iteration Number for Curves in Figure 5

Run No.

Iteration No.

2

A to W 7

Forcing factor 4

A to W 7

Forcing factor

0.5856 184.0 10.00 4.20

1 0.6441 173.4 9.0 4.62

2 0.7086 168.9 8.10 5.08

3 0.7794 160.2 7.29 5.59

4 0.8514 144.2 6.56 6.15

5 0.7716 158.6 7.21 5.53

6 0.8488 142.1 6.49 6.00

1.o 0.5856 184.0 10.00 4.20

0.5963 183.9 9.95 3.84

0.5880 185.4 9.94 3.84

0.5867 186.1 9.96 3.84

0.5859 186.5 9.99 3.84

0.5849 186.8 10.03 3.84

0.5838 186.9 10.07 3.84

0

2.0 7,000

300

6,000

,400

,300

t

5

f(tl

.zoo

4,000

f(t) ,100

3,000 .000 900 DO0 IO0 1200 1300 1400 Is00 1600 1700

2,000

TIME

Figure 4. Benzene, 2,2,4trimethylpentane, n-heptane sample 000

1,000

Fifth iteration

- Chromatogram corrected for base line squares procedure, consider Figure 4. In Figure 4, the solid line represents a chromatogram, corrected for base line, of a sample consisting of 35.9 Z benzene, 25.4 Z 2,2,4-trimethylpentane, and 38.7 Z n-heptane. The dots represent the results of the 5th iteration of the least squares process. Table I1 lists the estimated peak parameters and the compositions calculated from them. The calculated compositions are within 3 Z of the measured compositions. To illustrate how the least squares method fails, let us examine Table I11 and Figure 5 , where the solid line represents the curve drawn through experimental data taken from a laboratory chromatograph. Initial parameter estimates are identical for Runs 2 and 4 except the forcing factor for Run 2 is 1.0, while that for Run 4 is 2.0. An examination of the sequence of parameters for Run 2 in Table I11 shows that the method does not converge. The lack of convergence is indicated by several items. First, the peak amplitude

0 160

170

180

190 200 210

220

TIME

Figure 5. Example of effect of forcing factor A- .-A Sixth iteration, Run 2 -0 Sixth iteration, Run 4 - Chromatograph trace Initial estimates of parameters Forcing A to W 7 factor 0.5856 184.0 10.0 4.2 1.0 Run2 Run4 0.5856 184.0 10.0 4.2 2.0

C- -0 Fifth iteration, Run 2 0-

--

oscillates from iteration to iteration from about 0.85 to 0.77. Next, the to decreases monotonically so that at the 6th iteration the estimated time at which the peak height occurs, namely 142.7, is not even within the range of data 161.0206.0. The peak width also oscillates between 6.5 and 7.2. The least squares results for iteration five and six of Run 2 ANALYTICAL CHEMISTRY, VOL. 42, NO. 8, JULY 1970

891

160 170

169 190 2CQ

210 220 230 240

250 260 270 280 990 300 310

320 330 340 350

TIME

Figure 6. Example of fit with nonconvergence of parameters 0- -0

Sixth iteration, Run 33

- Chromatograph trace

Initial estimates of parameters

Peak 1 Peak 2 Peak 3

A

to

W

7

Span

Forcing factor

0.5856 0.6681 0.5542

184.0

10.0 16.0 18.0

0 0 26.0

160-220 190-280 240-355

2.0 2.0 2.0

226.0 285.0

Table IV. Parameters Run

VS.

Iteration Number for Curves in Figure 6

Iteration

0 1 2 3 Pk 1 Pk 2 Pk3 Pk 1 33 A 0.5856 0.6681 0.5542 0.5778 0.6478 0.6089 0.5822 0.6398 0.6292 0.5882 0.6377 0,6488 to 184.0 226.0 285.0 184.5 227.0 283.3 184.8 227.5 282.2 185.1 227.7 281.2 w 10.0 16.0 18.0 9.0 15.5 17.7 8.7 15.2 17.5 8.5 15.0 17.3 7 0.0 0.0 26.0 0.0 0.0 23.4 0.0 0.0 21.0 0.0 0.0 18.9 Forcing 2.0 factor

No.

No.

Run Iteration No.

No. A to W 7

4 5 Pk 1 Pk2 Pk3 Pk 1 Pk2 0.5931 0.6384 0.6616 0.5966 0.6403 185.2 227.7 280.5 185.3 227.7 8.4 14.8 17.1 8.4 14.7 0.0 0.0 17.0 0.0 0.0

6 P k 3 . Pk 1 Pk2 0.6668 0.5986 0.6426 279.9 185.3 227.7 16.9 8.5 14.6 15.4 0.0 0.0

are an absurd approximation of the chromatography data. This situation is changed very dramatically in Run 4 where the forcing factor is taken at 2.0. The least squares procedure provides a good approximation to the data in this case. Table IV and Figure 6 illustrate an important point about convergence. The data in Table IV indicate that the parameters for Peaks 1 and 2 have converged while the parameters for Peak 3 have not converged. To see this, we observe that for the 6th iteration, the estimated peak amplitude for the third peak is 0.6662; while the true peak height is 0.5542. If we examine the plot in Figure 6, we discover the interesting result that curve produced by the least squares method is quite a good approximation to the experimental data. This apparently unusual result in which 892

ANALYTICAL CHEMISTRY, VOL. 42, NO. 8, JULY 1970

Pk3 0.6662 279.5 16.7 14.7

parameters, not the true parameters, generate nevertheless a good fit to the data can be explained by the theorem associated with approximation. The theorem is a sufficiency theorem which states in effect that if the trial parameters converge to the true but unknown parameters, then the approximation curve will match very closely the observed data. The theorem does not state (since it is a sufficiency theorem), however, what happens if the parameters do not converge. It is, therefore, conceivable that a set of nonconvergent parameters of a skewed Gaussian curve can still yield a satisfactory fit. This is indeed what has happened here. In the least squares equations are weighing factors Wt associated with each of the i = 1, 2, , , N data points. The weighting factors are another set of variables that can be exploited for better curve fitting. It is possible to arbi-

.

350

370

390

410'

430

450

470

490

510

530

TIME

Figure 7. Approximation of single peak by two Gaussian curves 0- -0 Sixth iteration, Run 35A A- --A Sixth iteration, Run 42 Chromatograph trace Initial estimates of parameters

-

A 0.4184

to

W

Run 35A 395.0 22.2 Run 42" Peak lb 0.4184 395.0 22.2 Peak 2c 0.4184 395.0 30.0 a Run 42 treated data as if two peaks existed. Peak 1 is symmetric Gaussian over the range 355-405. Peak 2 is skewed Gaussian over the range 410-545.

trarily choose weights greater than one to help shape the approximating curve. If Wc = p > 1, this implies or has the effect that the ith data point was measured p times rather than once. Since we have only made casual use of Wi > 1 for selected portions of some chromatographs, we can not report at this time how effective manipulating the weighting factor is. As a variation on the least squares procedure, we may treat a single peak as if it were the composite of two peaks. For example in Figure 7 we have considered this single peak to be composed of two Gaussian curves, the first curve exists over the span 355.0 to 405.0 and the second curve over the span 410.0 to 545.0. In this manner the chro-

9.0

Span 355-545

Forcing factor 2.0

0 22.5

355-405 410-545

2.0 2.0

7

matograph data are approximated by six parameters, three for the leading edge curve and three for the trailing edge curve rather than three for the entire curve. If skewed Gaussian curves are assumed, four parameters rather than three are employed. Furthermore there is no reason why we can not consider the leading edge curve Gaussian and the trailing edge curve skewed Gaussian. We may anticipate then with more adjustable parameters that we should be able to curve fit better. Such is the case for Run 42 in Figure 7 where we also display a curve with only three parameters, Run 35A.

RECE~VED for review July 25, 1969. Accepted May 5, 1970.

ANALYTICAL CHEMISTRY, VOL. 42, NO. 8, JULY 1970

893