Effect of random experimental error on the generalized standard

The catalyst increases the decomposition products by a factor of 2. Phenanthrene accounts for 10% of the uncatalyzed reaction products and about 20% o...
0 downloads 0 Views 1MB Size
562

Anal. Chem. 1084. 56,562-569

product, one can verify that these thermal processes do not scramble the label into other positions in the molecule. In the presence of the catalyst, ring opening amounting to about 4% of the total mixture is observed. A peak near 14 ppm, having some contribution from the label, is assigned to n-butylbenzene. This product is absent without the catalyst. The OHP spectra are much more complex than the tetralin spectra, but the same kinds of conclusions are reached. With or without the catalyst, the major decomposition products are phenanthrene and products with methyl-substituted fivemembered rings. The catalyst increases the decomposition products by a factor of 2. Phenanthrene accounts for 10% of the uncatalyzed reaction products and about 20% of the catalyzed reaction products. As was the case for tetralin, a small amount of ring opening is observed only in the presence of the catalyst. The uncatalyzed resulta are in good agreement with decomposition patterns observed previously in this laboratory and monitored by gas chromatography (13).

ACKNOWLEDGMENT We thank D. A. Danner for technical assistance and also E. J. Eisenbraun and associates for synthesis of labeled compounds. Registry NO.DMT, 87937-46-0; OHP, 5325-97-3; [1-13C]OHP, 87937-45-9; naphthalene, 91-20-3; mesitylene, 108-67-8; [ 1-13C]-

tetralin, 87937-44-8; tetralin, 119-64-2,

LITERATURE CITED Cronauer. D. C.; McNell, R. I.; Young, D. C.; Ruberto, R. G. Fue/ 1982. 61,610-619. Buchner, W.; Scheutzow, D. Org. Magn. Reson. 1975, 7 , 615-616. Ratto, J. J.; Heredy, L. A.; Skowronski R. P. I n “Coal Liquefactlon Fundamentals”; Whitehurst, D. D., Ed.; Amerlcan Chemlcal Society: Washlngton, DC, 1980 ACS Symposlum Serles 139, pp 347-370. McNell, R. I.; Cronauer, D. C.; Young, D. C. Fuel 1983, 6 2 , 401-406. Seshadrl, K. S.; Jewell, D. M.; Bymaster, D. L.; Dobbs, T. K.; Brown, C. E.; Vickery, E. H.; Eisenbraun, E. J., unpublished work, Oklahoma State Unlverslty and Gulf Research & Development Co., 1981. King, H. H.; Stock, L. M. Fuel 1981, 60, 748-749. Seshadri, K. S.; Ruberto, R. 0.;Jeweil, D. M.; Malone, H. P. Fuel 1978, 57, 111-116. Patt, S. L.; Shoolery, J. N. J. Magn. Reson. 4 6 , 1982, 535-539. Morrls, G.; Freeman, R. J. Am. Chem. Soc. 1979, 101, 760-762. Doddreii, D. M.; Pegg, D. T.; Bendall, M. R. J. M g n . Reson. I981 4 8 , 323-327. Bax, A. D.; Morrls, 0. A. J. Magn. Reson. 1981, 4 2 , 501-505. Hansen, P. E. Annu. Rep. NMR Spectrosc. 1981, I l a , 66-181. Cronauer, D.C.; Jeweli, D. M.; Modi, R. J.; Seshadri, K. S.; Shah, Y. T. “Coal Llquefactlon Fundamentals”; Whitehurst, D. D.,Ed.; American Chemical Society: Washlngton, DC, 1980; ACS Symposium Serles NO. 139; pp 371-392.

RECEIVED for review January 27,1983. Resubmitted October 27,1983. Accepted October 27,1983. Funding of this research was provided by the U.S.Department of Energy under Project DE-AC22-80PC30080.

Effect of Random Experimental Error on the Generalized Standard Addition Method Mark G. Moran and Bruce R. Kowalski*

Laboratory for Chemometrics,Department of Chemistry, University of Washington, Seattle, Washington 98195

The effect of random experimental error on the initial concentrations calculated by the generalized standard addition method (GSAM) is examined. It is seen that random noise has two separate effects on the final result. The extent to which random noise degrades the preclslon with which the final results can be quoted is quantitatively investigated. For a given level of random noise in the analytical sensors, the standard deviations of initial concentrations derived by GSAM are calculated. Additlonaiiy, random noise can cause bias, that is, deviations in accuracy in the estimate of the initial concentrations. The circumstances which lead to biased estimates are investigated. Finally, strategies for maximizlng precision and minimlzing bias of estimates are examined.

The analysis of multicomponent chemical systems remains one of the consistently most difficult and challengingproblems presented to the analytical chemist. It is desirable to carry out such an analysis and learn the chemical identity and concentration of each component in the mixture. Many approaches to the problem have been proposed over the years. Recently an algorithm has been suggested by Saxberg and Kowalski (I)to recover the concentrations of each component of the mixture in the presence of matrix effects. The purpose of this report will be to characterize the theoretical aspects of this method and investigate the statistical performance of the estimates. 0003-2700/84/0356-0562$01.50/0

Traditionally, there have been two approaches to the quantitation of chemical species: calibration curve methods and the method of standard additions. In the calibration curve approach, a response vs. concentration of analyte plot is generated by empiriciallyobserving responses at three or more concentrations. In order for successful recovery of concentration information, these points must bracket the true concentration of the unknown. This fact presupposes an approximate knowledge of the desired result and imposes a limitation on the method. Once the response-concentration curve has been generated, a response is obtained from the unknown, and by comparison to the standard curve, the desired concentration is estimated. Although the response curve need not be linear, and indeed is often nonlinear, it is most desirable to work with a linear response function. Such a function is inherently more accurate, since nonlinear functions place increased demands on the precision of the measurements. Additionally, estimates derived from such functions are very amenable to statistical analysis (2) and are characterized with a minimum of assumptions invoked by the experimentalist. Practically speaking, this procedure does not lend itself well to multicomponent analysis. Even in the limiting case of a mutually noninterfering n component mixture, the analysis is laborious since n calibration curves must be generated. TO the end that the calibration curves exhibit linear behavior over the concentration of interest, it is necessary to take additional steps to minimize or eliminate interferences which are usually 0 1984 American Chemical Soclety

ANALYTICAL CHEMISTRY, VOL. 56, NO. 3, MARCH 1984

present in “real world” samples. Another approach to the quantitation problem is the well-known standard addition method. In this method, the response of a sensor which generates some analytical signal proportional to the amount of analyte present is observed. A known addition of that species is made and another observation of the response noted. The amount of unknown material can be calculated, if the relationship between concentration and response is well-known. Generally, the analysis is performed under such conditions that this response function is also linear. If this assumption is made, the analyst must take care not to exceed the region in concentration space where the response is linear. Failure to do so will lead to an erroneous result. The analyst, when working with multicomponent systems, has no guarantee, that the signal is due solely to the particular analyte under consideration. In general, a method used to measure a given amount of analyte is selective to that analyte but not specific to that analyte. It is to this problem that the generalized standard addition method (herein referred to as the GSAM) is applicable. The GSAM considers the responses of p analytical sensors which develop signal proportional to r analytes in the system. A minimum of r standard additions must be made, one addition of each analyte, in order to estimate the concentrations of the r components. The condition p C r must be satisfied in practice. The extent to which a given sensor responds to each analyte is given by a vector of constants k such that the response of the jth sensor to the analytes is Rj = Cikiici. The response of p sensom to r analytes Is written by using matrix notation as R=CK

(1)

+

where R is at least (r 1) X p , K is p X r and C is (r + 1) X p . Subtraction of the first row of R and C from each of the succeeding rows yields the response change matrix AR resulting from concentration changes, AC, both at least dimension r X p . A R = ACK

(2)

If the analysis is conducted in solution, the AR matrix is replaced by AQ where each element of R is multiplied by the total volume. The AC matrix is replaced by AN, where AN is the matrix representing the number of moles of analyte added. Working with this form, one has AQ = ANK

(3)

K, the estimate of K is calculated by

K = (ANTAN)-lANTAQ

(4)

Finally, No the column vector of the analyte amounts is computed by using the estimate of K, provided it is nonsingular and r = p .

No= (KT)-lQ

(5)

One Dimensional Standard Addition Revisited. As one might imagine, the GSAM involves mathematical manipulations in vector spaces of higher order. In the simplest twocomponent case there are actually four dimensions, two response and two concentration dimensions. Human intuition breaks down a t this level, and reliance on abstract mathematics is absolute. As a method of gaining insight into the statistical foundation and limitations of the GSAM, the theoretical foundations of the single analyte standard addition case will be reexamined. A one-dimensional standard addition of a chemical species in solution proceeds as follows: (i) Observe the initial response Ro = k(no/uO)where no and uo are the initial number of moles

563

+

Flgure 1. Orlginal formulation of the GSAM as proposed by Saxberg and Kowalski (I).

of analyte and initial sample volume, respectively. (ii) Make a standard addition of amount nl in volume u1 and remeasure response R1 = k(no + nl)/(uo + ul). (iii) Calculate the original amount of analyte no = n1Qo/(Q1 - Qo)where Qo = V,,R0and Q1 = Rl(Vo+ Vl). As noted previously, the response per unit concentration change at Co = (no/uo) must be the same as at C1 = (no+ %)/(Uo

+ u1).

Usually, once no is obtained the analysis is “finished”. However, with modern instrumentation capable of producing copious amounts of data, the analyst is obligated to pursue the matter further. Along with a point estimate of the mean, some knowledge of the confidence band within which the true value may fall is highly desirable. Probability and statistics can be brought to bear on the problem. In the absence of deterministic error, the variations in the observable quantities may be successfullymodeled as random deviates. In other words the R’s of eq 1may be considered as normally distributed random variables. From a sample population of observations of the random variables, statistical methods allow estimation of the mean value and an associated level of confidence, that is, the probability that the true result is within certain numerical limits.

EXPERIMENTAL SECTION In order to isolate the effect of random error on the results of GSAM experiments, extensive simulation was conducted on twoand three-component model systems. Experimental design was constrained only by the requirement that all additions of standards be done serially. That is, all additions of one component are made followed by all additions of the other component(s). This results in a AN matrix of the form shown in Figure 1. Each response in the model system was perturbed by uncorrelated, Gaussian noise of mean zero and known standard deviation. These data were reduced by using a modified version of the GSAM program (original is available from Infometrix, Seattle, WA). For comparison with theoretical predictions, the average of ten simulation sets was used. In each set, there were 300 runs of the same GSAM experiment. All calculations were made on the chemistry department’s VAX 11/780 32 bit word computer (DigitalEquipment Corp., Maynard MA). Single precision accuracy was used for all cases. RESULTS AND DISCUSSION Bias in One-Dimensional Standard Addition Estimates. Complications arise with these procedures since the analyst is not concerned with the observables themselves, but functions of them. In some cases, these quantities can be characterized statistically, and in some cases, not. Consider the case of the one-dimensional standard addition method. The estimate of the analysis amount is

no = niQo/(Qi - Qo)

(6)

Suppose Q0 is distributed N(qO,q,P),meaning a normal distribution with mean qo and standard deviation uo2,and Q1is distributed N(ql,u?). It is well-known that the difference of such random variables is distributed N ( q , - qo, go2 u12).

+

564

ANALYTICAL CHEMISTRY, VOL. 56, NO. 3, MARCH 1984

Table I. Effect of Random Noise on Recovery of Initial Concentration molar absorptivity ( E = 2.67 x k , = 3.00 x 10-3 k , = 1.30 x lo-' c, = 1.550 X M absorbance 0.400 0.600 0.800 1.00

1.20

unweighted leastsquares fit weighted least-squares fit Co = (AN/vo)(Qo/Q, - Q o )

signal to noise ratio, absorbance 36.03 44.01 46.13 43.89 38.77

lo4)

104AN, mmol

0.00

1.91 3.86 5.84 7.85

total vol, mL 25.00 25.10 25.20 25.30 25.40

std dev of molar absorptivity

std dev of the recovered concn

(as % recovered)

20 range

% error due to bias

9.24 X 10' 3.78 x l o 2 4.04 x 103

6.64 x 10-7 4.16 x 10-7 8.66 X lo-'

93.22-107.02 94.42-105.58 90.89-107.69

0.12 0.02 2.29

Consequently, eq 6 is the ratio of two normals. It is of interest to know how the new random variable, the ratio of normals, is distributed. Knowledge of this distribution allows the second moment of the distribution to be calculated, which in turn is used to construct confidence limits for the mean. If z = x / y is the quotient of two random variables which are distributed normally about means ml and m2and possess variances a12,az2, then the probability distribution of the quotient is given by

When m, = m2= 0, this reduces to a cauchy distribution which is symmetric about zero. This is a noisy distribution as the tails fall off as z-2 giving nonnegligible probabilities to fairly large values of z. The moments of this distribution formally diverge. This is caused by the finite probability of division by zero. Physically, any ratio calculated by using the value y = 0 would be discarded. This corresponds mathematically to the use of a truncated distribution where an arbitrary cutoff value of z, z, is introduced. With this truncated distribution, location and scale parameters can be estimated. In the general case of m, inequal to m2which most often arises in the analysis of chemical data, the distribution is no longer symmetric. It can be shown that the most probable value of the distribution is a value slightly less than ml/m2. Additionally, the mean value of the random variable z is not equal to ml/mz. On the basis of Monte Carlo simulations of up to lo6 trials (3))it appears that the median of the distribution converges to the value of ml/m2. The practical ramification of these facts is that an average of values obtained by, say, repetitive runs of a standard addition result will not be symmetrically distributed about the true result. In other words, even if all sources of deterministic error are excluded, random error alone can cause the average of a set of standard addition experimental results to behave as if they are biased. The magnitude of the bias depends on the amount of random noise in the system. Fortunately, most results obtained by standard addition techniques contain only a very negligible error due to bias if the signal to noise ratio of the measurements is reasonable. The computation of the mean of the quotient, z, made by using a distribution free assumption gives the following result:

In the case of independent, uncorrelated numerator and denominator which have signal to noise ratios C 15, only the first term in the series on the right-hand side needs be re-

tained. The mean of the quotient is approximately equal to ml/m2+ (ml/mz)(a2/m2)2. The average of the quotients of random variables will converge to this result which is greater than the true result ml/m2. The relative error due to bias is therefore given by (a2/m2)X 100. It is crucial that the denominator of any ratio be well determined if the introduction of excessive noise in the result is to be avoided. In the limiting case of a perfectly known value of the denominator, E ( z ) = mr/m2and uz2 = ur2/m2. To illustrate these points, consider an example of quantitation of a dyestuff material by visible absorption spectrometry. In one case, a single standard addition is made and the initial concentration is estimated using eq 6. In another case, four standard additions are made and the data are reduced by using unweighted least-squares and weighted least-squares fits. The details of the experimental design are given in Table I. The noise in the spectrophotometer was modeled as a combination of photomultiplier shot noise and background noise. The shot noise was assumed to be of the form us = k2(T + p)'/'.The background noise was modeled as ab = k,T. The total noise in the transmittance is then UT = (a, + The values of kl and k2 are given by Rathman, Crouch, and Ingle (4).The noise in the absorbance measurement is then uA = 0.434 aT/T. In the case of a single standard addition of analyte which causes a %fold increase in sample absorbance, the 95% confidence limits on the estimate of the original concentration range from 90.89 to 107.69% . Additionally, the result is, on the average, biased on the high concentration side by 2.29%. In the four standard addition case, an unweighted leastsquares fit produces a recovery range of 93.22 to 107.02% at the 95% confidence level. However, due to the improvement in the estimate of the slope of the response curve, the bias in the result is sharply reduced to 0.12% of the original concentration. By use of weighted least-squares fitting which takes into account the noise characteristics of the instrument, the recovery range is 94.42 to 105.58% at 95% confidence and the bias is reduced to a truly negligible 0.02% of the original concentration. Precision of the Estimates of the K Matrix. The objective of this section is to investigate the statistical performance of the parameters estimated by GSAM, a multivariate standard addition method. The reason for this is 2-fold. Limits of precision of measurements (e.g., confidence limits) are assigned on the basis of estimates of variance of the measurements. Second, the method of statistical estimation used can often influence the experimentalist in planning optimum noise reducing experiments. This point is well illustrated by again referring to the quantitation of an unknown using one-dimensional standard addition. To es-

ANALYTICAL CHEMISTRY, VOL. 56, NO. 3, MARCH 1984

timate the unknown concentration the respone of the analytical sensor is plotted against the number of moles added to the unknown. The best straight line is fitted to these points and extended back to intersect the abcissa. The point where the intersection occurs is the initial number of moles of unknown. Presumably the “best” straight line fit to the data will yield the “best” result. In terms of precision, the line whose slope has minimum variance yields the most precise estimate of the initial concentration. Consider the following Statistical model for a linear response function: (9) = PO + Plxi + ei where yi is the measured value and xiis a constant, specified number and ei is the error of the measurement and is generally considered to be a random variable with a normal distribution about zero with variance 2. It can be shown ( 5 ) that the variance in the slope is equal to the quantity u/Ci(xi - Z)2. Optimal precision in the estimate of PI demands that the term Ci(xi- z ) be ~ maximized. This point was addressed by Franke, Zeeuw, and Hakkert (6)who recommended that the best way to achieve this was to load half of the measurements at each end of the extremes of the linear dynamic range of the sensor. Out of all possible curves which could be generated in such an experiment, this particular design yields optimally precise estimates of initial concentration. In multicomponent analysis, the best possible precision for the estimates is also desirable. The question is how to achieve the best estimates given the freedom to choose different experimental designs. Jochum, Jochum, and Kowalski (7)have analyzed this problem from a numerical analysis point of view. They made quantitative calculations of the deterministic upper error bound for a particular experimental design given a certain level of error present in the system. As they observe, the upper error bound depends on the condition number of the AWAN matrix where AN has rows equal to the amounts of standards added at each addition. Mathematically, this matrix need only be of full rank in order to yield a solution for K and hence No. Any linear dependence among the columns of the AN matrix raises the upper error bounds. Consequently it is ideal that columns of AN be orthogonal. In their original paper, Saxberg and Kowalski (1) suggest differencing the ith response and the zeroth response in order to calculate AQ; and hence ANP This data reduction method has come to be known as the total difference calculation (TDC). In order to make the columns of the AN matrix orthogonal, Jochum, Jochum, and Kowalski suggested that all additions of the first standard are made, followed by all additions of the second standard, and so on. The AQ and AN matrix elements for the J t h component are formed by taking the following incremental differences: Yi

AQiJ = QiJ - Qi-lJ

(10)

NiJ - Ni-lJ

(11)

miJ=

Differencing successive measurements in this way makes AN orthogonal and renders the problem well conditioned from a computational point of view. This method of data reduction is the so-called incremental difference method (IDC). The minimum deterministic error bound for No is achieved in this way. The purpose of the remaining sections is to examine how the precision of the estimates depends on experimental design and make recommendations of how to achieve optimal results. The statistical model of the GSAM is

Q=NK+e

(12)

Q, N, and K are as previously defined. t is the matrix of random variables which represent the error in each mea-

0

-A$

4

0

4

A

0

0

0

0

0

0

4

0

0

0

AN:

A$

0

4 A N ; o 4

4

4

AN!

565

0 0

0 0

0

o

o

0

0

A4

0

0

0

A$

0.

0

0

4

0

0

0

A$

0

0

0

A 1

AN:

0

0

A$

AN:

0

o

A$

AN:

0

0

A$

AN!

Flgure 2. Equivalent representation of the GSAM which is used for theoretical analysis.

surement. Formation of AQ as called for by either the incremental or total difference method gives

AQ = ANK

+ Ae

(13)

Now the error term of the right-hand side is a composite term. The ramifications of this fact will be discussed. It is useful to explicitly write out eq 13 for the case of two sensors and two ~ a l y t e with s four standard additions per analyte. Figure 1 shows eq 13 in this case. AQiu is the volume corrected response of sensor A to the ith standard addition of component A. AN: is the total amount of component A added after the ith standard addition of component A. t is the number of standard additions of each component. The balanced design where equal numbers of standard additions per component is easiest to handle notationally. In the two-component case there are then n = t t total standard additions. Including the measurement of initial response, a minimum of rt + r total measurements must be made. The structure of the AN matrix reflects the fact that all standard additions of A are made followed by all standard additions of B. K B A is the coefficient which reflects how much contribution to the signal of sensor A is made by the presence of analyte B. K A B is analogously representative of how selective sensor B is to analyte A. Atiu is the composite noise from the formation of the difference Qiu - Qou (TDC) or Qiu - Qi-lu (IDC). In order to actually calculate statistical descriptors of the parameters, it may be useful to note that eq 13 may be equivalently rewritten as shown in Figure 2. In the total difference algorithm each entry of the AN matrix is the total amount of standard added upon the ith addition. However, in the incremental difference algorithm each entry is the actual amount added upon the ith addition. Consequently, in the latter scheme, the matrix elements labeled AiVtAare identically zero since only one addition of standard is made at a time. Thus the AN matrix is orthogonal by columns. In order to proceed with the statistical analysis of the estimates, the following assumptions are made. (i) Each measurement is statistically independent. (ii) The noise in each measurement is modeled by a normally distributed random variable. (iii) The variance of the noise is independent of signal level. Assumption (iii) may seem excessively restrictive especially in light of the fact that many analytical sensors have noise which depends on the magnitude of the signal (8). However, this assumption leads to useful insight into the statistical

+

566

ANALYTICAL CHEMISTRY, VOL. 56, NO. 3, MARCH 1984

1

8.

v',

er+v',v:

v:

v',

1

-

(Vy+v:

v:

0

0

0

0

v:

0

0

0

0

v:

0

0

0

0

v:

Flgure 3. E(AcAeT)matrix for total difference calculation.

properties of the estimates. If need be, an appropriate transform of the raw data could be applied in order to render the variance independent of the signal level (9). Since the noise is random and uncorrelated, E ( q ) = E(eiej) = 0. Also, by assumption (iii) E(eiei)= 8. The variance associated with sensor A is labeled .A2 whereas uB2 applies to the variance in sensor B. The estimates of the elements of K are unbiased. This is readily seen by calculating the expectation value of eq 4 in light of assumptions (i)-(iii)

E(K)= (ANTAN)-'ANTE(AQ) E(AQ) = E ( A N K )

(14)

+ E(Ac)

8,

=

Figure 4. Matrices E,, E,, and E, of eq 19.

(15)

Substituting eq 15 into eq 14 and noting that E(e) = 0 proves the point.

E(K)= (ANTAN)-'ANTANK = K

Equation 16 statesJhat if the only noise present is random, the estimate of K, K, will be sFmetrically distributed about the-true valu:. The variance of K requires the expressionE([K - (K)][K - (K)IT)to be computed. By use of eq 13 and 4 this can be shown to be

VAR(K) = (ANTAN)-~ANTE(A~A~T)AN(ANTAN)-' (17) The matrix E(AsAeT) is the most important factor in eq 17. It contains effects which depend on experimental design as well as on the choice of incremental difference or total difference calculation algorithm. Generally E(AeAeT) is dimension r2t X r2t. In the 2 X 2 example, this is 4t X 4t. Initially, consider the case where the TDC algorithm is used. Also consider that standards are added as solutions;hence the total volume of the system increases upon each addition. Under these circumstances, the first t diagonal elements of AcAcT are given by A ~ i A t j=

(Vp)2(~iAA)2 - 2VpVo~i~to* + Vo2(toA)2

0

(16)

(18)

Taking the expectation value and making use of assumptions (i)-(iii) the matrix element reduces to ( Vk)2~A2 + V$UA', where ViAis the total volume after the ith addition of standard A, uA2 is the variance of the response of sensor A, and VOis the initial volume. The next t diagonal elements of the matrix are of the same form, but with VF, the total volume after the ith addition of component B, instead of V t . Off diagonal elements in the first 2 t rows are given by V,ZUA'. This is because the cross products AeiAti each contain common to terms. It is important to note that each volume-corrected response QiAis statistically independent of each QjA,j # i. However, each AQiA is correlated with all other AQ?, j # i, because of the commonality of terms such as VotoA. Within the context of the assumption of physically independent sensors, any term involving factors such as will have zero expectation value. Thus,terms such as AQ! are not correlated with terms such as AQiB. Consequently E(AcAeT) is block diagonal. The upper left hand block in the 2 X 2 case has 2t X 2t nonzero entries. The lower right hand block has also 2 t X 2 t nonzero entries which have the same structure as the other block but with uA2 replaced by uB2. Written out explicitly, the (AcAcT) matrix for this case, but now with two

0

1

(Vy+q -(Vy -wy wy+q

Figure 5. E(AcAcT)matrix for incremental difference calculation.

standard additions per component, is shown in Figure 3. The symbol @ indicates the direct product matrix operation. Inserting the expression for E(AeAtT) into e,q 17 and performing the indicated multiplication yields VAR(K). Before doing this, it is instructive to partition E(AcAtT) into three matrices El, Ez,and E3 This is shown in Figure 4. In the diagonal matrix of eq 24 each entry Vk is the incremental increase in the total volume due to the ith addition of component A and analogously for ViB. Inspection of the expression

VAR(K) = (ANTAN)-lANT(E1

+ E2 + E,)AN(ANT6N)-l

(19)

shows at a glance how different factors contribute to the total variance in K. The term (ANTAN)-lANTElAN(ANTAN)-l reduces to a particularly simple block diagonal expression which is exactly the variance one would calculate if each AQi were uncorrelated and the volume of solution remained constant. The term (AwAN)-1AN'%2AN(AwAN)-1 represents the contribution to variance due to effects of volume increase as a result of making additions of standard solutions. The reader will note that assumption (iii) can now be relaxed. In the case of solution standard additions, the variance in each volume-corrected measurement increases as volume increases. This is essentially no different from a case where a sensor develops a level of noise dependent of signal magnitude. To model such an effect the analyst need only know roughly how the variance depends on signal level. For many analytical instruments such as absorption spectrophotometers, this is well-known ( 4 ) . Additionally this effect could be estimated by an examination of residual plots (5). If the incremental difference algorithm is chosen, the structure of E(AeAcT) is slightly different. The diagonal elements do not change. Since AQiJ = (Q: - Qi-lJ)(J = A or B) each AQiJ is correlated only with AQi-lJ and each AQi+lJ. Consequently, only the diagonals above and below the main diagonals of E(AcAtT) contain nonzero elements. In this case the matrix is given explicitly as shown in Figure 5. Partitioning (AcheT)to illustrate contributions to variance proceeds

ANALYTICAL CHEMISTRY, VOL. 58, NO. 3, MARCH 1984

567

Table 11. Standard Deviations in Results Calculated by Using Equation 32 Model: V R , = K , , N , t K,,N, V R , =K,,Nl K=

std dev input noise Rl 0.01 0.005 0.015

0.025 0.005 0.030

R2 0.01 0.015 0.015 0.010 0.025 0.005

6::; ?::)

signal to noise ratio SNR(R,) 405

810 270 162 810

135

SNR(R,) 232.5 155 155 232.5 93.0 465

along the same lines. Matrices Eland E3are exactly similar to the TDC case. The matrix E2is tridiagonal and contains many more zeros than in the TDC case. At this point it is appropriate to address the question of whether or not there are any palpable differences between the different algorithms. Jochum, Jochum, and Kowalski (7),as discussed, showed that the incremental calculation by virtue of its superior conditioning is more robust to deterministic error. Extensive simulation in the authors' laboratory shows that the total difference +xlation is more robust to random error. That is for the K matrix and the resulting concentrations, more precise estimates are obtained by using the total difference calculation. The difference lies in the determinant of the matrix ANTAN. For the incremental difference case the matrix is diagonal. For the total difference case it is not. The determinant in the total difference is several orders of magnitude greater than in the incremental difference case. This effect even overcomes the fact that E2is more sparse in the incremental difference case. Generally the elements of VAR(K) will be uniformly smaller in the total difference case when the results are computed from the same data. This effect propagates through to the final concentrations whose precision ultimately depends on the precision of the estimates-of K. The previously derived equations show that VAR(K) depends explicitly on the square of the initial volume. This appears to suggest that only analyses of small volume samples are practical. This is untrue. The factor of Vo2is recovered because the calculation of concentration requires division of No by V,. This introduces a factor of VOT2 into the estimate of variance of concentration which exactly cancels the contribution due to V,Z. Once the matrices K and VAR(K) have been calculated, it remains to compute the initial amounts of analytes in the sample. This can be done by using the following expression when r = p :

No = (KT)-'Q (20) Additionally, some statement of the precision of the estimate of Noshould be given. The computation of VAFt(K) altkough somewhat tedious is relatively straightforward since K is a linear function of random variables. As stated in eq 16, the estimate of K is unbiased. In other words, the presence of random noise does not, on the average,shift the estimate above or below the true mean. Furtherpore, it can be shown that the elements of each column of K belong to a multivariate normal distribution (2). Precision and Accuracy in the Estimate of- Initial Concentration. The properties of th estimates of Nodiffer considerably-from the properties of the estimates of K. Each elem-ent of No is a complicated function of the elements of the K matrix. The probability distribution function governing NOcannot be expressed in close form. Consequently, the first and second moments of the distribution cannot be given ex-

+ K,,N, N,' = 3.3 N,Z = 1.5

calcd std deviation No'

0.178 0.147 0.281 0.351 0.212 0.491

N,Z 0.161 0.244 0.250 0.189 0.380 0.211

simulated std dev N,' 0.177 0.145 0.266 0.414 0.201 0.479

N,Z 0.168 0.233 0.251 0.225 0.402 0.171

actly. Nevertheless, it is still possible to make semiquantitative statements concerning these quantities. Simulation of GSAM experiments for the two-copponent case shows that the distribution of the elements of No qualitatively resembles a 0 distribution. The skewness of the distribution begins to become noticeable when the signal to noise ratio of the initial response drops below about 80. Because it is a multivariate technique, the problem of bias in estimates of No is a more serious potential problem in GSAM than in one-dimensional standard addition. For the ith componeTt, the bias increases a t least as cast as NJ(UO(Q/D(K)),~ where uDCfi2 is the variance of D(K), the determinant of K. For the two-component case this variance is given as

Contributions from the diagonal of K dominate this term. The correlation between the numerator and denominator is extremely small, owing to the fact that each term is of the order u t (uiis the standard deviation of the ith sensor), and partial cancellation among the terms. Monte Carlo simulation of GSAM experiments yith variable amounts of random noise show that the bias in No increases with input noise according to eq 21. Simulation also indicates that the median of the distribution is an unbiased estimate of No for almost all practical values of signal to noise ratio. Simulation results of the two-component case also reveal that while one component exhibits positive skewness, the other reveals negative skewness. This suggests that the results are correlated. In order to investiga: this point, consider for the time being that the elements of K are true constants. Then, the covariance between analytes one and two can be given directly as

Where u1 and u2 are the variances associated with sensors 1 and 2, respectively. Examination of eq 22 shows that the correlation increases as the noise in the sensors increased. This accounts for the way in which one noisy sensor can poison not only the estimate of the analyte to which it is most selective but all others in the system. Specifically, consider a case where sensor 1is relatively quiet while sensor 2 is relatively noisy, This noise in sensor 2 contributes to the effects of biasing the result for analyte 2 in one direction. Since the estimates are correlated, this drives the estimate of analyte 1 in the other direction thus degrading accuracy as well as precision. Another point which can be inferred from eq 22 is that when the sensors are not very selective the terms K2, and K12are relatively large. This tends to increase the correlation. When

568

ANALYTICAL CHEMISTRY, VOL. 56, NO. 3, MARCH 1984

the off diagonal terms are small, this fffect is minimized. Thus, by inspecting the condition of the K matrix, the analyst can tell quantitativ_elyhow much bias to expect in the result. The variance of No is of considerable interest to the analyst if quantitative error bounds on the results are to be quoted. Since a distribution function for No cannot be found, an approximate calculation of the variance-must suffice. In the two-component case each element of No is a function of six variables. The pairwise correlations and the vyiance in each variable all contribute to the total variance in N,. Due the complex nature of this function, a calculation of VAR(No)by propagation of error mathematics must take all of these into account. The propagation of error technique involves expanding an arbitrary function of random variables in a Taylor series and then calculating the first and second moments of the series. For a function f ( x l...x,) the result to second order is

+

...

VARCf(x1 x,)) = f(a,...a,) + Ci(af/axi)u'Xi2 2Ci(af/axi)(af/axj)COV(x,,xj) (23) In many cases, the series converges so quickly that many of these terms need not be included. However, the authors have found that in order to calculate VAR(No) for the GSAM estimates all terms to second order must be retained. This will suffice for cases where all sensors have approximately equal variance. If one sensor is considerably noisier (e.g., u2 > 10ul), then third-order terms also must be included. However, in the favorable case, retention of only second-order terms yields an excellent approximation. Computation of the terms in eq 23 are considerably simplified by noticing that aN,$/aKij = (Cik/D)Nd(-l)'+k+'where Cik is defined below. With these kinds of relationships, the variance of N,' can be written generally as

(2/D)C

CNJN$Cij%OV(K~jK,,) (24)

l=k k=2 j=1

The following definitions apply. Cik is the cofactor of the ikth element of the K matrix. is the sum of the squares of the cofactors of the ith row of the K matrix. uk2 is the variance of t)e kth analytical sensor. Since the variance of the kth row of K is proportional to U k 2 , this constant of proportionality is labeled (Yk. prnis a constant which arises from the calculation of the covariance between the Qoiresponse and the element of kji of K. It should be noted that the covariance between these terms is appreciable only because the individual measurements for each sensor AQ? are correlated. The constant prnis l m T ( w A N ) - l W A Qwhere 1, is a column vector of length rp which has each element equal to zero except for the rntb element which is one. _The index m is equal to ir + j . D(K) is the determinant of K and Vo is the initial volume of the sample. The summation limit r is the number of analytes in the unknown. Table I1 shows the results of calculations of VAR(No)using eq 24 with simulated results of the same quantity. It should be noted that agreement is good despite the fact that the selectivity of the sensors is poor. Also, this particular experimental design is relatively noisy since the total volume is increased by a factor of 3 during the course of the experiment. Recommendations for Optimally Precise Analysis. The objective of the analytical chemist is to obtain the most accurate and precise estimates of analyte concentrations or amounts possible. To this end the authors make several

recommendations for optimal GSAM experiments. The following conditions reduce variance and increase the accuracy of the technique. Volume increases as a result of standard additions must be kept to a minimum unless a weighted estimation procedure is used. Examination of eq 19 shows that the contribution to VAR(K) due to El and E2 are about the same order of magnitude. The matrix E3 contains volume effects. Evaluating this term for a given- experimental design reveals that the contribution to VAR(K) from this term goes as (V,! Vo)2 where V , is the final volume. Concentrated solutions of standards, or if possible, solid standards should be used to make additions. Dilute solutions of standards are advantageous in that larger volumes are easier to dispense with greater precision. Concentrated solutions, while advantageous from a statistical point of view, can introduce unwanted error in AN unless care is taken. Since the increase in variance of the measurement due to relative volume increase is clearly identified in eq 19 by the contribution due to matrix Estthis effect can be eliminated by an appropriate weighting scheme. The weighting vector which would multiply AQ in eq 14 would contain the reciprocals of the entries of the diagonal of E,. Additionally, if the variance of the sensor for each level of signal measured is known a priori, the weights due to this effect will be contained in the vector WQ The combined volume-variance weighting vector is the diagonal of E3WBT. This manner of weighting reduces contribution to VAR(K) to solely those contained in the matrices El and E2. In cases where random noise is the predominant source of error, the total difference calculation method should be used. This is true whether or not the noise in the sensors is independent or dependent on the signal magnitude. This is because the buildup of Det(ApAN) imparts favorable statistical properties to the solution of eq 4. It should be noted that because of the more sparse matrices involved, an error analysis routine using the incremental calculation should run faster. Franke et al. (6) recommended that optimization of onedimensional standard addition experiments can be done by loading half the measurements at each end of the extremes of the linear dynamic ranges. This recommendation can be generalized for the (%AM. To minimize the variance of K and consequently No, D e t ( M A N ) must be maximized. Consider two different approaches. In one case t standard additions for each of the r components could be made to a single sample, for a total of rt standard additions. Alternatively a sample could be partitioned into t subsamples to which one large standard addition for each of the r components is made, again for a total of rt standard additions. The latter case unequivocally maximizes Det(ANTAN) for a given concentration excursion. When practical, this should be done. While respecting the linear dynamic range of all sensors simultaneously, standard additions should be made which step each sensor to the limit of its dynamic range. This makes maximal use of the "statistical potential" of the technique. Examination of eq 24 shows that VAR(No)depends on the variances of the analytical sensors as well as the quantities ai and p, which are functions of 9. Since AN is under the experimentalist's control, VAR(No) can be reduced by experimental design. The quantity aiis minimized by strategies previously discussed; that is, maximizing Det(ANTAN) and minimizing volume increases. The second term of eq 24 involves the sume NJpl + NO2p2for the two-component case. p1is >O and p2 is N:. Any knowledge of this kind can and should be put to good use. That row of AN which corresponds to the smallest value of N,' should have the largest values of the incremental molar additions. This requires abandoning the symmetric design of additions, causing the condition of (ANTAN)-lto worsen. Nevertheless, statistical precision is gained a t the expense of amplification of deterministic error. In the twocomponent example chosen, redesigning the experiTent such that Win,= ('/&Minc reduces the variance in No by approximately 50%. The above considerations suggest a procedure analogous to the Stein two-stage sampling procedure in statistics. Out of a total or rt standard additions, the first (r - .$t data points could be used to calculate a rough estimate of No. With this, AN can be rechosen so that &/&= No1/N22. In summation, the GSAM algorithm provides the obvious advantage of simultaneously estimating initial concentrations of analytes when matrix effects and mutual interference effects are present. To obtain full advantage of this technique, care

569

must be applied in the analysis. ACKNOWLEDGMENT The authors thank Maynarhs de Koven for helpful discussions.

LITERATURE CITED (1) Saxberg, 6. E. H.; Kowalski, B. R. Anal. Chem. 1979, 57, 1031. (2) Grayblll, F. W. "Theory and Applicatlons of the Linear Model"; Duxbury Press: North Scituate, MA, 1976. (3) Moran, J. M., personal communlcatlon. (4) Rathman, L. D.; Crouch, S. R.; Ingle, J. D. Anal. Chem. 1972, 44, 1375. (5) Mendenhaii, W.; Scheaffer, R. L. "Mathematical Statistics with Applications"; Duxbury Press: North Scltuate, MA, 1973. (6) Franke, J. P.; de Zeeuw, R. A.; Hakkert, 6. R. Anal. Chem. 1978, 50, 1374. (7) Jochum, C.; Jochum, P.; Kowalskl, B. R. Anal. Chem. 1981, 53, 85. (8) Skoog, D. A.; West, D. M. "Fundamentals of Analytlcal Chemistry"; Holt Rlnehart and Winston: New York, 1976. (9) Harwlt, M.; Sloane, N. J. A. "Handamard Transform Optics"; Academ ic Press: New York, 1979.

RECEIVED for review July 15, 1983. Accepted December 7, 1983. This material is based upon work supported by the National Science Foundation under Grant CHE-8004220.

Portable Generator for On-Site Calibration of Peroxyacetyl Nitrate Analyzers Daniel Grosjean,*' Kochy Fung, John Collins, Jeffrey Harrison, and Edmund Breitung

Environmental Research & Technology, Znc., 2625 Townsgate Road, Westlake Village, California 91361

Parts per bllllon (ppb) levels of the atmospherlc pollutant peroxyacetyl nltrate (PAN, CH,C( 0)00N02) are synthesized In a portable photochemical flow reactor from lrradlated mixtures of acetaldehyde, chlorlne, and nitrogen dloxlde In alr. Direct, on-site calibration of PAN analyzers (e.g., electron capture gas chromatographs) can thus be performed In the range 2-400 ppb. The constant PAN output of the generator Is quantitated by Ion chromatography followlng alkaline hydrolysis of PAN to acetate In Implngers, whose collection efficiency for PAN and potential lnterferents has been established. Generator output stablllty, PAN output as a lunctlon of temperature, and effect of matrix air humldlty have also been characterized.

Peroxyacetyl nitrate [PAN, CH3C(0)OON02, systematic nomenclature name ethane peroxoic nitric anhydride ( I ) ] is a major product of photochemical transformations involving oxides of nitrogen in the atmosphere (2). Of the several methods available for trace levels measurements of PAN in laboratory, smog chamber, and ambient atmospheres, electron capture gas chromatography (EC-GC) is the most widely employed (2). Instrument calibrations are generally performed by dilution of high concentrations of PAN (- 1000 ppm in N,) prepared by standard methods (2) and stored in compressed gas cylinders. However, the stability of PAN in these conditions is highly erratic (2) and there has been, for a number of years, a critical need for developing a method suitable for calibration of EC-GC instruments directly in the Present address: Daniel Gros'ean and Associates,Inc., Suite 645,

350 N. Lantana St., Camarillo;&A 93010.

0003-2700/84/0356-0569$01.50/0

parts per billion (ppb) range relevant to atmospheric levels. The objective of this paper is to describe such a method, which has been developed in our laboratory to calibrate automated EC-GC instruments employed for PAN measurements as part of smog chamber studies of hydrocarbon photochemistry (3) and of field experiments conducted in urban atmospheres (4). A portable PAN generator has been constructured, in which known, stable, ppb amounts of PAN are generated by chlorine-initiated photooxidation of acetaldehyde in the presence of NO2in air, a method first described by Gay et al. (5). The portable generator can be used for direct, on-site calibration of EC-GC PAN analyzers in the range of -2-400 ppb. The PAN output of the generator is measured by ion chromatography following alkaline hydrolysis of PAN to acetate ion.

EXPERIMENTAL SECTION PAN Analyzer. Portable PAN analyzers constructed in our laboratory include a Shimazdu electron capture detector (63Ni) and electrometer, a 0.3 X 50 cm Teflon column filled with 10% Carbowax 400 on 60/80 mesh Chromosorb G, an 8-port sampling valve with a %cm3sampling loop actuated by a timer-controlled solenoid value with a cycle time of 15 min, and an insulated, compact frame housing all components (dimensions 43 x 30 x 15 cm). Typical operating conditions are as follows: oven temperature, 30 O C ; detector temperature, 30 O C ; and nitrogen carrier gas flow rate, 40 cm3min-'. PAN is eluted in -5 min under these conditions. PAN Generator. The portable PAN generator includes a temperature-controlled permeation oven (dimensions 46 X 35 X 20 cm, weight -2 kg) and a photochemical flow reactor (dimensions 76 X 46 X 28 cm, weight - 2 kg). The oven houses chlorine, acetaldehyde, and nitrogen dioxide permeation tubes (for example, 3 cm tube for CH3CH0and wafer F653 for C1, and NOz,Metronics). Constant temperature in the range 30.0 to 45.0 0 1984 American Chemlcai Soclety