Statistical Uncertainties of Analyses by Calibration of Counting Measurements

Lowell M. Schwartz

Department of Chemistry, University of Massachusetts, Boston, Massachusetts 02125

Methods are developed for calculating statistical uncertainties in the form of approximate confidence limits for analyses determined by calibration of counting experiments for which the calibration curve is linear. When these approximations are used, the analytical uncertainties may be calculated without elaborate digital computation. The appropriate method in a given experiment depends on the size of the background count relative to the sample counts, and criteria are discussed for selecting the appropriate form. These procedures are relevant to radiochemical assays, to x-ray and UV-visible spectrometric methods utilizing photon counting, and in some instances to biological "dose-response" experiments.

This paper deals with the problem of assigning statistical uncertainty estimates in the form of confidence limits to analyses determined by reading a count response measurement through a linear calibration or standard curve. The well-known method of finding confidence limits with a linear calibration curve (1) requires that the precision or variance of the data points be uniform (homoscedastic) along the curve, but when the response measurement is a count, the variances are inherently nonuniform from theoretical considerations (2). This problem can be approached in two ways. One method is to transform the measured variables in such a way that the transformed response variable is made homoscedastic. The other method is to construct the linear calibration curve by a least-squares regression procedure which applies weighting factors at each point inversely proportional to the variance at that point. Both methods are described in the statistics literature and an interested reader might well refer to an article by Finney (3) as an introduction to these sources. The text by Brownlee (4) describes the weighted-least-squares method in detail. No comparably detailed account of the transformed-variable method applied specifically to the confidence limit problem in counting experiments appears to exist in the chemical literature, and this author has not seen one elsewhere either. Consequently, the approach taken in this paper is to utilize the transformed-variable method and to take advantage of the a priori knowledge of the variance of the count measurements to derive equations for the analysis confidence limits. It turns out that approximations can be made in cases when the background count is a small fraction of the sample count so that confidence limits can be calculated without reliance on overly powerful computing machinery. This condition no doubt pertains to many experimental situations and so the approximations calculated in these cases can be obtained at minimal computing cost. When the background count is not small, more elaborate computations are required and some guidance is offered with respect to alternative strategies. Although the results found here apply to any analytical experiment yielding a response which is a count and is linearly related to the sample size variable, the exposition here will be phrased in terms of a radiochemical assay where the standards and unknowns are concentrations of radioactive

material in samples of fixed size and the measurements are the counted numbers of radioactive disintegrations in these samples in a fixed period of time. The equations will also be applicable to other analytical methods such as x-ray and ultraviolet-visible spectrometry which make use of photon counting techniques. Also, if the fundamental assumptions are met, the results can be used in certain biological experiments where the response is a count of perhaps cells or organisms affected in some way by a treatment or dosage. An analysis of an unknown dosage is found by counting the responses due to a series of known dosages, constructing a "dose-response" curve, counting the response due to an unknown dosage, and reading the unknown value from the curve.

FORMULATION

Several standard samples (n in number) are prepared containing various known concentrations x_i of radioactive material. Each such sample is counted for the same fixed length of time and the total counts y_Ti are recorded. If n_i replicate measurements are made on sample i, then the symbol ȳ_Ti denotes the mean of these measurements. Then samples prepared identically to the standards but containing the unknown radioactive concentrations X are counted for the same fixed length of time and yield counts Y. N replicate counts yield the mean Ȳ over the replications. The unknown values X are easily read by projecting Ȳ through the calibration line defined by the standard measurements. The problem, however, is to find confidence limits for each X. We will assume a behavioral model which includes two essential features: (1) the calibrating line representing the y_Ti vs. x_i dependence is linear and (2) the statistical scatter of each counting measurement follows the Gaussian or normal distribution. The first of these assumptions allows for a background count y_B so that the total count y_Ti = y_B + y_Si, where the sample count y_Si = b x_i and b is a proportionality constant, the slope of the y_Ti vs. x_i calibrating line which is sometimes known as the "specific activity". The second assumption implies that two conditions are satisfied: (1) that the number of atoms disintegrating during the counting period is a small fraction of the population of radioactive atoms so that the counting statistics follow the Poisson distribution (2); and (2) that the number of counts is sufficiently large that the Poisson distribution is well approximated by the normal distribution (2). These two conditions usually hold well in radiochemical experiments because the number of radioactive atoms in a sample is quite large and the counting period can be selected relative to the half-life in such a way as to obtain a large count without significantly depleting the radioactive population. However, in a biological context, where organisms or cells are counted, these conditions may or may not be satisfied depending on the nature of the particular experiment. Under assumption 2 the population variance of the count σ²_yT equals y_T (2), but if the counting detector circuitry amplifies or attenuates the actual count by a fixed factor before readout, then the variance would be expected to be

σ²_yT = k y_T



where k is a known factor; k = 1 when the actual count is recorded. As was mentioned above, the calibrating line cannot be fit to the (x_i, ȳ_Ti) data by conventional least-squares formulas because these are derived on the assumption that all data points have uniform precision, which is not valid for counting measurements. The approach here will be to transform the counts y_T into another variable, z, which is homoscedastic. The procedure by which this is accomplished is described by Mandel (1) (Chapter 13) and the resulting transformation is z = y_T^(1/2), as has been noted by Finney (3) and by Sokal and Rohlf (5), for example. By applying propagation-of-variance to this transformation we see that the population variance of z is

σ_z^2 = k y_T/(4 y_T) = k/4

which is, indeed, a constant with respect to x, but if the original calibration curve was linear, the transformed calibration curve

z_cal = (y_B + bx)^(1/2)   (1)

has become nonlinear. The problem of finding confidence limits for analyses read through nonlinear calibration curves of arbitrary form has been discussed earlier (6) and the methods described in that paper may well be applied here. The purpose of the present work is to show that under the usual experimental condition when the background count is a small fraction of the total count, the calculation of confidence limits may be done by a less elaborate computation than is required for a curve of arbitrary form. Equation 1 may be rewritten as a binomial series expansion

z_cal = (bx)^(1/2) + (y_B/2)(bx)^(-1/2) - (y_B^2/8)(bx)^(-3/2) + ...

and if the background count y_B is sufficiently small relative to the sample count (y_S = bx), the first one or two terms will be a sufficient approximation. We will consider both these possibilities: if the first term alone suffices, the resulting development will be called the "zero-background approximation"; if the first two terms suffice, it will be called the "low-background approximation".
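Because the development that follows relies repeatedly on σ_z^2 = k/4, a quick simulation may help fix ideas. The following minimal sketch (ours, not from the paper) assumes k = 1 and checks that the square-root transform stabilizes the variance of Poisson counts near 0.25 once the counts are reasonably large.

```python
# Sketch: variance stabilization of Poisson counts by z = y**0.5 (assumes k = 1).
import numpy as np

rng = np.random.default_rng(1)
for mean_count in (20, 80, 200):
    y = rng.poisson(mean_count, size=100_000)   # simulated total counts y_T
    z = np.sqrt(y)                              # transformed variable z = y_T^(1/2)
    print(f"mean {mean_count:3d}: var(y) = {y.var():7.2f} (~{mean_count}), "
          f"var(z) = {z.var():.3f} (~0.25)")
```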

ZERO-BACKGROUND APPROXIMATION

This approximation implies that the calibration curve is of the form z = (bx)^(1/2) so that if the concentration values x are also square-root transformed into u = x^(1/2), the resulting z vs. u calibration curve z_cal = c1 u is linear and passes through the origin of z,u coordinates. A least-squares fit of a straight line to data transformed in this manner yields a value for the slope c1 = b^(1/2) of the line (4)

c1 = Σ_i n_i z̄_i u_i / Σ_i n_i u_i^2 = S_zu/S_uu

where z̄_i is the mean value of the transformed measurements. At this point we have introduced less cumbersome notation in the form S_fg to represent summations such as

S_fg = Σ_{i=1}^{n} n_i f_i g_i

The variance of the parameter c1 is

var(c1) = [Σ_i n_i^2 u_i^2 var(z̄_i)] / (Σ_i n_i u_i^2)^2 = σ_z^2/S_uu

where the substitution var(z̄_i) = σ_z^2/n_i has been used. It has been shown (4, 6) that confidence limits can be calculated from calibrating points and unknown sample response measurements by utilizing the statistical t distribution. Using the notation of Ref. (6), when the difference ε = Ȳ - y_cal(X) is a normally distributed random variable, then the ratio ε/[var(ε)]^(1/2) is distributed as Student's t and confidence limits are derived from this relationship. In the present instance, we note, however, that the square-root transformation applied to counting measurements Y and y distorts their intrinsic normal distribution so that the variables Z and z cannot be strictly normally distributed. Statisticians often use procedures derived for the normal distribution to obtain approximate results in cases when the distribution is non-normal. The rationale is the so-called Central Limit Theorem (4, 8), which states that the sum of n non-normally distributed random variables approaches normality as n increases. We shall appeal to this same rationale to use the properties of the t distribution to approximate the statistical behavior of ε = Z̄ - z_cal(U) and its variance var(ε) = var(Z̄) + U^2 var(c1), where Z̄ is the mean of the N replicated and transformed counts Y_T for one particular unknown sample. The equation

ε^2 = t^2 var(ε)   (2)

is solved for two roots, U, which are the desired (transformed) confidence limits. In this case Equation 2, after appropriate substitutions, becomes

(Z̄ - c1 U)^2 = t^2 (σ_z^2/N + U^2 σ_z^2/S_uu)   (3)

which is quadratic in U. t is the critical value of Student's t for a given confidence level and the number of degrees of freedom inherent in var(ε), which number is infinite if the population value σ_z is used. The two roots of Equation 3 are easily found by the well-known formula and these will be satisfactory solutions provided the condition S_uu c1^2 > t^2 σ_z^2 is met as explained elsewhere (6). The analysis for the unknown X is, of course, (Z̄/c1)^2 as is seen by substituting Z̄ for z_cal and back-transforming the resulting U.
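As a concrete illustration, the zero-background analysis and its limits can be computed in a few lines. The sketch below is our own (names such as zero_background_limits are not the paper's) and assumes the population value σ_z^2 = 0.25 is known, so that t carries infinite degrees of freedom.

```python
# Sketch of the zero-background approximation (Equations 2 and 3).
import numpy as np

def zero_background_limits(x, zbar, n, Zbar, N, t=1.960, var_z=0.25):
    """x: standard concentrations; zbar: mean transformed counts; n: replicates;
    Zbar: mean transformed count of the unknown over N replicates."""
    u = np.sqrt(x)
    S_uu = np.sum(n * u**2)
    c1 = np.sum(n * zbar * u) / S_uu              # slope c1, estimates b**0.5
    # Equation 3 rearranged to quadratic form a*U**2 + b*U + c = 0
    a = c1**2 - t**2 * var_z / S_uu
    if a <= 0:
        raise ValueError("condition S_uu*c1**2 > t**2*var_z not met; see Ref. (6)")
    b = -2.0 * Zbar * c1
    c = Zbar**2 - t**2 * var_z / N
    U_lo, U_hi = sorted(np.roots([a, b, c]))
    return (Zbar / c1)**2, U_lo**2, U_hi**2       # analysis and limits in X units
```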

LOW-BACKGROUND APPROXIMATION

If the background count is significant but small, the first two terms of the binomial expansion imply that the square-root-transformed calibration data be fit with the form

z_cal = c1 u + c2 u^(-1)

The development of least-squares formulas for such a fit is deferred to an appendix to this paper. The results are

c1 = (S_zu S_/uu - ν S_z/u)/Δ   (4)

c2 = (S_uu S_z/u - ν S_zu)/Δ   (5)

var(c1) = S_/uu σ_z^2/Δ   (6)

var(c2) = S_uu σ_z^2/Δ   (7)

cov(c1,c2) = -ν σ_z^2/Δ   (8)

where Δ = S_uu S_/uu - ν^2 and where ν is the sum total of all replications over all calibrating points. The summation notation has been extended to include S_/uu = Σ_i n_i u_i^(-2) and S_z/u = Σ_i n_i z̄_i u_i^(-1). By comparing the transformed calibration form to the binomial expansion, it can be seen that the parameters c1 and c2 may be interpreted as b^(1/2) and y_B/(2 b^(1/2)), respectively. Consequently, the background count y_B and the specific activity b may be found from c1 and c2 if desired.
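A direct transcription of Equations 4-8 is straightforward; the sketch below is ours (the function name and the returned back-interpretations are not the paper's notation) and again assumes σ_z^2 = 0.25.

```python
# Sketch of the low-background fit z_cal = c1*u + c2/u (Equations 4-8).
import numpy as np

def low_background_fit(x, zbar, n, var_z=0.25):
    u = np.sqrt(np.asarray(x, dtype=float))
    n = np.asarray(n, dtype=float)
    nu = n.sum()                                       # ν, total replications
    S_uu = np.sum(n * u**2)
    S_inv_uu = np.sum(n / u**2)                        # S_/uu
    S_zu = np.sum(n * zbar * u)
    S_z_over_u = np.sum(n * zbar / u)                  # S_z/u
    delta = S_uu * S_inv_uu - nu**2                    # Δ
    c1 = (S_zu * S_inv_uu - nu * S_z_over_u) / delta   # Equation 4
    c2 = (S_uu * S_z_over_u - nu * S_zu) / delta       # Equation 5
    return {"c1": c1, "c2": c2,
            "var_c1": S_inv_uu * var_z / delta,        # Equation 6
            "var_c2": S_uu * var_z / delta,            # Equation 7
            "cov_c12": -nu * var_z / delta,            # Equation 8
            "b": c1**2, "y_B": 2.0 * c1 * c2}          # specific activity, background
```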

To find confidence limits in this case

ε = Z̄ - z_cal(U) = Z̄ - c1 U - c2 U^(-1)   (9)


and

var(ε) = var(Z̄) + var[z_cal(U)]   (10)

The first variance on the right is σ_z^2/N; the second is written in terms of variances and covariance of the parameters

var[z_cal(U)] = U^2 var(c1) + U^(-2) var(c2) + 2 cov(c1,c2)   (11)

Substituting Equations 6, 7, and 8 into Equation 11, that result into Equation 10, and that result together with Equation 9 into Equation 2 yields

(Z̄ - c1 U - c2 U^(-1))^2 = t^2 [σ_z^2/N + (U^2 S_/uu + U^(-2) S_uu - 2ν) σ_z^2/Δ]   (12)

which is quartic in U. Two of four possible roots must be extracted to find the desired confidence limits and (aside from attempting to solve the quartic directly) two practical solution procedures may be followed. One procedure is to plot the two calibration bands which are given by

z(u) = c1 u + c2 u^(-1) ± t [var(ε)]^(1/2)   (13)

and to note the intersections of the horizontal line z = Z̄ with these. If there are four intersections, judgment can usually serve to select the two appropriate ones as illustrated elsewhere (6). The second procedure is to first find the transformed unknown Ū corresponding to Z̄ by solving the quadratic equation

z = c1 U + c2 U^(-1)

This value is

Ū = [Z̄ + (Z̄^2 - 4 c1 c2)^(1/2)]/(2 c1)

Then using Ū as a starting value, solve Equation 13 numerically (Newton-Raphson iteration) with z(u) = Z̄. The two (±) options yield two U roots which, hopefully, are the two proper ones. Needless to say, the analyst is cautioned to examine these.
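The two-step numerical procedure just described might be scripted as follows. This is a sketch under our own naming, using scipy's Newton iteration, with the summary statistics from the low-background fit passed in explicitly.

```python
# Sketch: confidence limits from Equation 13 via Newton-Raphson, started at Ū.
from scipy.optimize import newton

def low_background_limits(Zbar, N, c1, c2, S_uu, S_inv_uu, nu, delta,
                          t=1.960, var_z=0.25):
    U0 = (Zbar + (Zbar**2 - 4.0 * c1 * c2)**0.5) / (2.0 * c1)  # quadratic root Ū

    def var_eps(U):          # var(ε) assembled from Equations 10-12
        return var_z / N + (U**2 * S_inv_uu + S_uu / U**2 - 2.0 * nu) * var_z / delta

    limits = []
    for sign in (+1.0, -1.0):            # the two (±) options of Equation 13
        band = lambda U, s=sign: c1 * U + c2 / U + s * t * var_eps(U)**0.5 - Zbar
        limits.append(newton(band, U0))  # Newton-Raphson from the starting value
    return U0, sorted(limits)            # transformed units; square for X units
```

As the text cautions, the two returned roots should be inspected (for example, both positive and bracketing Ū) before being reported as confidence limits.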

SIMULATED EXAMPLES

An experimenter wishing to calculate confidence limits by the zero-background or low-background approximations described above might well wonder which, if either, method is suitable given a particular background count in his case. To provide some guidance we will simulate a series of five counting "experiments" by digital computer, calculate confidence limits from these by several methods, and examine the accuracies of the results. Each simulation involves ten standard samples with equally spaced known x_i values from 40. to 400. acu (arbitrary concentration units). The "true" specific activity is b* = 0.5000 counts per acu but this information would not be known in a real experiment. The sample counts on the standards vary from about 20 to about 200. The sample count is generated by means of a subroutine (7) which calculates random numbers satisfying the Poisson distribution with parameter λ = b*x_i. To each such sample count is added a background count, also a Poisson-generated number. The λ parameter for these is y_B*, the "true" background count, where y_B* is taken as 1, 3, 10, 30, and 100, respectively, in each of the five simulations. These choices provide a range of background counts relative to sample count from quite small to quite substantial. The sum of background count and sample count is the total count y_T and each of these is repeatedly generated ("measured") six times so that n_i = 6 and ν = 60 in all cases. An unknown sample with "true" X = 160. acu and "true" background equal to one of the five y_B* values is also counted six times (N = 6).
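The design described above can be reproduced with numpy's Poisson generator standing in for the subroutine of Ref. (7); seeds and names below are our own.

```python
# Sketch of one simulated "experiment": ten standards, six replicates each.
import numpy as np

def simulate(y_B_true=10.0, b_true=0.5, n_rep=6, X_true=160.0, seed=0):
    rng = np.random.default_rng(seed)
    x = np.arange(40.0, 401.0, 40.0)                  # standards, 40. to 400. acu
    y = rng.poisson(b_true * x, (n_rep, x.size)) \
        + rng.poisson(y_B_true, (n_rep, x.size))      # total = sample + background
    Y = rng.poisson(b_true * X_true, n_rep) + rng.poisson(y_B_true, n_rep)
    z, Z = np.sqrt(y), np.sqrt(Y)
    return (x, y.mean(0), y.var(0, ddof=1),           # columns 1-3 of Table I
            z.mean(0), z.var(0, ddof=1),              # columns 4-5 of Table I
            Z.mean(), Z.var(ddof=1))                  # unknown-sample summary
```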

Table I. Calibration Data for a Simulated Experiment Involving "True" Background Count of 10 and Specific Activity of 0.5000

Standard samples:

  x_i     ȳ_Ti     var(y_Ti)    z̄_i      var(z_i)
   40     29.33     16.27       5.405    0.1389
   80     50.67     85.87       7.095    0.3988
  120     64.67     60.27       8.030    0.2316
  160     90.33     72.67       9.496    0.1940
  200    115.7     129.1       10.74     0.2832
  240    126.3     283.9       11.22     0.6030
  280    150.8      50.97      12.28     0.0854
  320    169.2      74.17      13.00     0.1113
  360    187.2     198.6       13.67     0.2605
  400    208.5      90.70      14.44     0.1103

Unknown sample: Z̄ = 9.745, var(Z) = 0.0532.

The analysis of one simulated experiment will be presented in detail and then the corresponding results of the other four will be summarized. Table I shows the simulation data for the case y_B* = 10. The mean of the replicated total counts and the variance of these counts are given in the second and third columns, respectively. The mean and variance of the square-root-transformed total counts are in the fourth and fifth columns. The first step in the analysis is to verify that the fundamental assumptions about the counting statistics are valid. Statistical tests may be done to check that var(z_i) is constant and that this constant is 0.25. The first of these is to make a linear regression analysis of the var(z_i) vs. x_i data. The least-squares slope of such an analysis is -0.0003 and the standard error estimate of the slope is 0.106. A t-test made for the null hypothesis that the true slope is zero yields t = 0.003 which when compared to t_0.05 = 2.31 for eight degrees of freedom clearly indicates that the slope is zero and hence var(z_i) is constant with respect to x_i. The second test is that var(z_i) is 0.25. In this case the var(Z) datum should be included since this is measured identically. The mean of these eleven variances is 0.2246 and the standard error estimate of this mean is 0.162. A t-test for the null hypothesis that the population mean is 0.25 yields t = 0.52 which when compared to t_0.05 = 2.23 for ten degrees of freedom confirms the null hypothesis. The next step would be to estimate the background count. This could be done by an independent experiment making replicate counts of a sample with x_i = 0. Alternatively, the background count can be extracted from the measurements already made by plotting ȳ_Ti vs. x_i and noting the intercept on the ȳ axis. If this is done by linear regression analysis, then weighting factors must be used at each calibrating point. The weighting factors are inversely proportional to the population variance of the total count at that point. Examination of the third-column figures in Table I reveals that these sample variances are poor approximations to the population variance. Better weighting factors could be calculated from

w_i = (σ²_yTi*)^(-1) = (y_Ti*)^(-1) = (y_B* + b*x_i)^(-1)

where y_Ti* is the "true" total count. Since the experimenter does not know y_B* and b*, a reasonable iterative procedure would start with unity weighting factors, calculate the least-squares slope and intercept, use these values as approximations to b* and y_B*, respectively, calculate a weighted-least-squares slope and intercept, then recalculate weighting factors and a refined slope and intercept. For the data in Table I, this procedure converged to four significant figures in three iterations and yielded slope, intercept, and standard error of the intercept values of 0.4991, 9.454, and 2.51, respectively.
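A sketch of that iteration follows (ours, not the paper's code; np.polyfit's w argument takes reciprocal standard deviations, i.e., the square roots of these weights, and the intercept's standard error needed for the t-test is omitted for brevity).

```python
# Sketch: iteratively reweighted linear fit of mean total counts vs. x.
import numpy as np

def iterative_wls(x, ybar, n_iter=10):
    w = np.ones_like(x)                   # unity weighting factors to start
    for _ in range(n_iter):
        b, y_B = np.polyfit(x, ybar, 1, w=np.sqrt(w))
        w = 1.0 / (y_B + b * x)           # w_i = (y_B + b*x_i)^(-1), re-estimated
    return b, y_B                         # refined slope and intercept
```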


Table II. Results of Analysis and 95% Confidence Limits Calculated from the Data of Table I by Three Methods

                                   Transformed units (U)    Original units (X)
Zero-background approximation
  Analysis                         13.25                    175.6
  Confidence limits (a)            12.60, 13.93             158.8, 194.0
Low-background approximation
  Analysis                         13.12                    172.1
  Confidence limits (a)            12.38, 13.85             153.3, 191.8
Polynomial fit (6) (b)
  Analysis                         13.02                    169.5
  Confidence limits (a)            12.38, 13.65             153.3, 186.3
"True" analysis                    12.65                    160.0

(a) 95% confidence limits based on σ_z^2 = 0.2500 and t = 1.960 for ∞ degrees of freedom. (b) Utilizing z̄_i vs. u_i data.

A t-test for the null hypothesis that the background count is zero rejects this hypothesis with t = 5.97 compared to t_0.05 = 2.00 or t_0.01 = 2.66. Therefore, the experimenter cannot expect the zero-background approximation to yield accurate confidence limits. However, y_B of about 9 is less than the smallest sample count of about 20 so that the low-background approximation should be reasonably valid. Nevertheless, analyses and 95% confidence limits are calculated by both approximations and by the polynomial method (6) on z̄_i vs. u_i data. The results are shown in Table II. Each of the confidence limit calculations uses the population variance σ_z^2 = 0.2500 and so t_0.05 = 1.960 for ∞ degrees of freedom. We now examine the discrepancies between the analyses and confidence limits, as calculated, and the "true" analysis, which in this case is known to be U* = 12.65. The zero-background analysis is U = 13.25, which is in error by a net bias of 4.7%. Similarly, U values of 13.12 and 13.02 represent net biases of 3.7% and 2.9% for the low-background and polynomial fit analyses, respectively. These biases are attributable to two sources: (1) the bias in the measured transformed response Z̄ relative to the "true" or population mean response Z* = (y_B* + 80.)^(1/2), and (2) the bias due to inaccuracy in the fitted calibration curve z_cal(x) relative to the "true" calibration curve z_cal* = (y_B* + b*x)^(1/2). With y_B* = 10, we see that Z* = 9.487 and so a measured Z̄ = 9.745 represents a 2.7% bias, which is the same for all three methods. The z_cal bias is found by projecting the true Z* through the fitted z_cal(x) curve and then comparing this pseudo-analysis to U*. The zero-background pseudo-analysis is Z*/c1 = 9.487/0.7353 = 12.90 which is in error by 2.0% relative to U*. The low-background approximation yields a pseudo-analysis of 12.74 (0.7% bias) as the proper root of the quadratic equation 9.487 = 0.7082 U + 5.960 U^(-1). Similarly, the polynomial calibration curve causes a z_cal bias of only 0.1%. Biases calculated by these procedures are shown in Table III for all five simulated experiments.
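The quoted pseudo-analysis can be verified with a two-line root computation (a check of ours, not part of the paper):

```python
# Check: proper root of 9.487 = 0.7082*U + 5.960/U, i.e. c1*U**2 - Z*U + c2 = 0.
import numpy as np

c1, c2, Z_star = 0.7082, 5.960, 9.487
U = max(np.roots([c1, -Z_star, c2]))                       # root near U* = 12.65
print(round(U, 2), f"{100 * (U / 12.65 - 1):.1f}% bias")   # -> 12.74, 0.7% bias
```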


Unfortunately, such a quantity as a "true" confidence limit is undefined so that the validities of the calculated confidence limits of the various methods for the various simulations cannot be examined directly. Nevertheless the biases calculated above must be compared in some manner to the uncertainty ranges predicted by the confidence limits. For this purpose we arbitrarily adopt the quantity: one quarter of the 95% confidence interval. This quantity would be approximately one standard deviation if the random variable U were distributed normally. These quantities, expressed as a percentage of U*, are given in Table III in the columns headed 1/4 C.I. Finally, we also include in Table III the results of t-tests made on the null hypothesis that y_B = 0 based on the statistical uncertainty of the intercept of the weighted-least-squares line fitted to ȳ_Ti vs. x_i data for each simulation. These tests are made both at the 95% and 99% confidence levels and in all five cases the results of these two were in agreement.

Table III. Summary of Calculations Made on Five Simulated Counting Calibration Experiments

                   Zero-background approx.            Low-background approx.             Polynomial fit (6)
 y_B*  t-test(d)   Z̄(a)  z_cal  net   1/4 C.I.(b)     Z̄     z_cal  net   1/4 C.I.        Z̄     z_cal  net   1/4 C.I.
   1   Accept      1.4   0.1    1.5   2.7             1.4   0.5    0.9   2.9             1.4   0.6    0.9   2.4
   3   Accept      0.7   0.4    1.1   2.6             0.7   0.2    0.9   2.8             0.7   0.1    0.9   2.4
  10   Reject      2.7   2.0    4.7   2.6             2.7   0.7    3.7   2.9             2.7   0.1    2.9   2.5
  30   Reject      0.3   5.2    5.5   2.5             0.3   2.3    ..    (c)             0.3   0.1    0.5   2.8
 100   Reject      0.0  10.2   10.3   2.0             0.0   5.4    5.5   (c)             0.0   2.3    2.2   3.5

(a) All biases are relative to U* = 12.65. (b) One quarter the difference between 95% confidence limits, relative to U* = 12.65. (c) Equation 12 fails to yield two satisfactory roots interpretable as 95% confidence limits. (d) Both at 95% and 99% confidence levels.

DISCUSSION

First we notice that t-tests made on the hypotheses that y_B = 0 for each simulation accept those hypotheses when y_B* is 1 or 3 but reject them when y_B* is greater. This means that background counts as small as 3 are lost in the random scatter of the sample counts but counts of 10 and above are not. We must conclude that the zero-background approximation cannot logically be expected to be reliable in these latter cases. We also notice that when y_B* is 30 and 100, the low-background approximation, Equation 12, fails to yield two roots which can be accepted as satisfactory 95% confidence limits. Consequently, we might conclude that the low-background approximation is invalid when y_B is as large as 30 (compared to the smallest sample count of 20). The Z̄ biases represent the deviation between the measured mean transformed responses Z̄ and the population means Z*. These biases are independent of y_B. They are due to the small number (N = 6) of repeated counts made on the unknown sample and can, in principle, be reduced to zero by taking an unlimited number of replications of these counts. On the other hand, the z_cal biases in general cannot be eliminated by greater replication because these biases must be regarded as comprising both a random component due to scatter of the calibration measurements, z_i, and a systematic error component due to lack-of-fit of the model equation used for the calibration curve. We can get some idea of the magnitude of the random component by noting that in several cases these biases are as small as 0.1%. Since each calibration curve is constructed from the same number of measurements (ν = 60) and since this number is greatly in excess of the number of fitted parameters in each case, we can conclude that the random component of the z_cal bias in each case is approximately the same, perhaps 0.5% or less. A calibration scheme with excessive systematic error cannot be expected to be

reliable, and so if we arbitrarily decide that 1% z_cal bias implies excessive lack-of-fit, we must reject the zero-background approximation if y_B ≥ 10 and reject the "low-background" approximation if y_B ≥ 30. Another notable feature of Table III is that in all cases where lack-of-fit is acceptably small, the calculated confidence intervals vary very little, i.e., the 1/4 C.I. figures are all between 2.4% and 2.8%. This means that the confidence intervals are not significantly dependent on the background count, the model calibration equation, the Z̄ bias, nor the z_cal bias provided the fit is adequate. This constancy is directly attributable to the use of the same value σ_z^2 = 0.2500 for var(z_i) in the various formulas for confidence limits and this use is possible only because we know the population variance of the calibrating measurement a priori. Needless to say, similar behavior will not be found in other types of (noncounting) calibration experiments where in general the population variance of the measurements is not known.

The following guidance is offered to an analyst planning to use the calculational methods discussed in this paper. (1) Because the accuracy of a calibration procedure increases as more parameters are admitted into the model calibration equation, the reliability increases in the order (1) zero-background approximation, (2) low-background approximation, (3) polynomial fit. (2) The computational complexity and, hence, the required computational resources increase for these methods in the same order.

An analyst seeking to minimize computational complexity while retaining reliability might proceed as follows (a sketch of this decision procedure appears after the list).

(1) Fit the calibrating data ȳ_Ti vs. x_i to a weighted-least-squares straight line and perform a t-test for the hypothesis that the intercept y_B is zero. If this is acceptable, use the zero-background approximation.

(2) If this hypothesis is not acceptable, but y_B is less than the smallest sample count, three options are available in order of increasing computational complexity. (a) Use the y_T vs. x_i weighted-least-squares line to calculate the analysis of the unknown but use the zero-background formula Equation 3 to calculate confidence limits. Shift the confidence interval (difference between limits) symmetrically about the analysis value to approximate the correct confidence limits. This option is possible only because of the insensitivity of the confidence interval as discussed above. (b) Use the low-background approximation for both the analysis and the confidence limits. (c) Use the polynomial fit method on z_i vs. u_i data.

(3) If the background count is greater than the sample counts, use the polynomial fit method on z_i vs. x_i data. This suggestion is made because the binomial expansion of (y_B + bx)^(1/2) is a power series in x when y_B > bx.

(4) If the background count is comparable to the sample counts, the polynomial fit method does not work well on either z_i vs. x_i or z_i vs. u_i data because neither binomial expansion with a limited number of terms is accurate. Evidence for this behavior is the poor performance of the polynomial fit calculation on the y_B* = 100 simulation as shown in Table III. In such circumstances, the analyst might well change the size of his samples to enable the conditions y_B > bx_i or y_B < bx_i to hold. If this were not possible, he could calculate a weighted-least-squares linear calibration line and find confidence limits from this as described by Brownlee (4).
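The decision procedure above can be summarized in a small helper. This sketch is ours (using scipy's t quantile for the intercept test), with all inputs coming from the weighted straight-line fit.

```python
# Sketch of the suggested strategy for choosing a calculation method.
from scipy import stats

def choose_method(y_B_hat, se_y_B, dof, smallest_sample_count, alpha=0.05):
    t_crit = stats.t.ppf(1.0 - alpha / 2.0, dof)
    if abs(y_B_hat / se_y_B) < t_crit:           # intercept consistent with zero
        return "zero-background approximation"
    if y_B_hat < smallest_sample_count:          # small but nonzero background
        return "low-background approximation (or shifted zero-background interval)"
    return "polynomial fit on z vs. u or x (or weighted linear fit, Brownlee (4))"
```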
APPENDIX

Least-Squares Analysis of the Form z = c1 u + c2 u^(-1). We seek to minimize the sum of squares

S = Σ_i n_i (c1 u_i + c2 u_i^(-1) - z̄_i)^2

with respect to the parameters c1 and c2. The vanishing of partial derivatives of S with respect to c1 and c2 yields the normal equations

c1 S_uu + c2 ν = S_zu

c1 ν + c2 S_/uu = S_z/u

where ν = Σ_i n_i. Subscripted S notation is explained earlier in the paper. Solution of the normal equations follows as Equations 4 and 5. To find expressions for the variances and covariance of c1 and c2, we note that the quantities S_zu and S_z/u are random variables whereas S_uu, S_/uu, ν, and Δ = S_uu S_/uu - ν^2 are nonrandom. For c1, we expand Equation 4 and collect coefficients of z̄_i:

c1 = [S_/uu (Σ_i n_i z̄_i u_i) - ν (Σ_i n_i z̄_i u_i^(-1))]/Δ = [Σ_i n_i (u_i S_/uu - ν u_i^(-1)) z̄_i]/Δ   (A1)

The variance of c1 is thus

var(c1) = [Σ_i n_i^2 (u_i S_/uu - ν u_i^(-1))^2 var(z̄_i)]/Δ^2

Substituting var(z̄_i) = σ_z^2/n_i and then expanding the squared terms in the summations yields

var(c1) = (σ_z^2/Δ^2)[S_/uu^2 Σ_i n_i u_i^2 - 2ν S_/uu Σ_i n_i + ν^2 Σ_i n_i u_i^(-2)] = (σ_z^2/Δ^2)[S_/uu (S_/uu S_uu - ν^2)] = σ_z^2 S_/uu/Δ

which is Equation 6. The variance of c2 as expressed in Equation 7 follows in like manner. The covariance cov(c1,c2) cannot be zero because both c1 and c2 are functions of the same random variables z̄_i. The procedure by which covariances of functions are calculated from variances of common independent variables is described by Meyer (8). We note that covariances are zero amongst the various z̄_i because these are uncorrelated measurements. The formula is, therefore,

cov(c1,c2) = Σ_i (∂c1/∂z̄_i)(∂c2/∂z̄_i) var(z̄_i)   (A2)

The partial derivatives are best found from Equation A1 and the corresponding equation for c2. These are

∂c1/∂z̄_i = n_i (u_i S_/uu - ν u_i^(-1))/Δ

and

∂c2/∂z̄_i = n_i (u_i^(-1) S_uu - ν u_i)/Δ

Substituting these expressions along with var(z̄_i) = σ_z^2/n_i into Equation A2 yields

cov(c1,c2) = (σ_z^2/Δ^2) Σ_i n_i (u_i S_/uu - ν u_i^(-1))(u_i^(-1) S_uu - ν u_i)

which upon expansion of the summation terms yields

cov(c1,c2) = (σ_z^2/Δ^2)[S_uu S_/uu Σ_i n_i - 2ν S_uu S_/uu + ν^2 Σ_i n_i] = (σ_z^2/Δ^2)[-ν(S_uu S_/uu - ν^2)] = -ν σ_z^2/Δ

which is Equation 8.
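As a numerical cross-check of Equations 6-8 (our own verification, not part of the paper), the same covariances follow from σ_z^2 times the inverse of the 2 x 2 normal-equation matrix:

```python
# Check: cov matrix of (c1, c2) equals var_z * inv([[S_uu, ν], [ν, S_/uu]]).
import numpy as np

n = np.full(10, 6.0)                       # six replicates at ten standards
u = np.sqrt(np.arange(40.0, 401.0, 40.0))  # u_i = x_i^(1/2)
S_uu, S_inv_uu, nu = (n * u**2).sum(), (n / u**2).sum(), n.sum()
delta, var_z = S_uu * S_inv_uu - nu**2, 0.25

cov = var_z * np.linalg.inv(np.array([[S_uu, nu], [nu, S_inv_uu]]))
print(np.isclose(cov[0, 0], S_inv_uu * var_z / delta))   # Equation 6
print(np.isclose(cov[1, 1], S_uu * var_z / delta))       # Equation 7
print(np.isclose(cov[0, 1], -nu * var_z / delta))        # Equation 8
```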

LITERATURE CITED

(1) J. Mandel, "The Statistical Analysis of Experimental Data", Wiley-Interscience, New York, N.Y., 1964, Chapter 12.

(2) G. Friedlander, J. W. Kennedy, and J. M. Miller, "Nuclear and Radiochemistry", 2nd ed., John Wiley and Sons, New York, N.Y., 1964, Chapter 6.
(3) D. J. Finney, Biometrics, 32, 721 (1976).
(4) K. A. Brownlee, "Statistical Theory and Methodology in Science and Engineering", 1st ed., John Wiley and Sons, New York, N.Y., 1960.
(5) R. R. Sokal and F. J. Rohlf, "Biometry", W. H. Freeman and Co., San Francisco, Calif., 1969, Chapter 13.
(6) L. M. Schwartz, Anal. Chem., 49, 2062 (1977).
(7) H. E. Schaffer, Commun. ACM, 13, 49 (1970).
(8) S. L. Meyer, "Data Analysis for Scientists and Engineers", John Wiley and Sons, New York, N.Y., 1975, Chapter 10.

Received for review January 9, 1978. Accepted March 27, 1978.
