Chapter 10
Effects of Analytical Calibration Models on Detection Limit Estimates
K. G. Owens¹, C. F. Bauer, and C. L. Grant, Department of Chemistry, University of New Hampshire, Durham, NH 03824
Detection limit estimates derived from confidence bands around analytical calibration curves are highly dependent on the experimental design and on the statistical data treatment. Procedures are described for testing the linearity of data and whether the intercept differs significantly from zero. The insensitivity of the correlation coefficient for evaluating the goodness of fit of calibration models is emphasized. Unweighted linear models with an intercept often yield overly conservative detection limits. Frequently, an unweighted zero-intercept model is justified on both theoretical and statistical grounds. This model yields confidence bands and detection limits consistent with experiment. When the variance of signal measurements increases with concentration, more realistic confidence bands and detection limits are produced by weighting the data.

Despite numerous papers dealing with the specification of analytical method detection limits (see, for example, 1-6), much disagreement remains about the choice of both experimental and computational procedures. In part, these disagreements appear to be related to the technical objectives of the experimenter. For example, a detection limit (DL) might be estimated from some multiple of the standard deviation of blank solution signals measured during a short time interval using a painstakingly optimized instrument. Such a DL estimate clearly provides useful information, but it would be unrealistic to expect to maintain an equivalent DL during an extended analysis program with real samples. Unfortunately, these differences are frequently ignored, thereby encouraging unproductive controversy.
It is important to remember that a DL is not an intrinsic property; rather, it is
¹Current address: Chemistry Department, Indiana University, Bloomington, IN 47405
0097-6156/88/0361 -0194$06.00/0 © 1988 American Chemical Society
the interactive product of many variables including the method, the instrumentation, the nature of the samples, and the experimenter. This discussion will be restricted to estimation of DL's appropriate for routine application of methods in which a calibration function relates a signal to concentration for a series of standards. No attempt will be made to evaluate the merits of various terms such as limit of quantitation, method detection limit, lower limit of reliable assay measurement, and others. For a discussion of those issues, the reader is referred to the articles cited here and to other papers from this symposium.

Consider a typical procedure such as the spectrophotometric determination of an analyte in groundwater samples. Quite likely, a single calibration curve will be used to cover a concentration range that extends from below the regulatory limit (hopefully) to some elevated concentration far removed from the limit. In this situation, a DL can be based on confidence limits (CL) around the calibration curve (7-14). The DL estimate produced in this fashion can then reflect the combined uncertainties in sample analysis and calibration. Clearly, the design of the calibration procedure and the statistical analysis of the data are both important considerations. Questions which require attention are conveniently divided into two groups: those pertaining to the experimental design and those pertaining to the statistical analysis. Design questions include:
a) What concentration range should be covered?
b) How many standards and blanks should be used?
c) How should standards be distributed over the range of interest?
d) How many replicate measurements should be made on standards and blanks, in what order, and over what time frame?
e) How should standards be prepared?
Statistical analysis questions may include:
a) Are the signal measurements ($y_i$) normally distributed at each concentration level used?
b) Are the variances of the signals ($S_{y_i}^2$) homogeneous, i.e., independent of concentration?
c) Is a linear model justified or is curvature indicated?
d) If linear, is the intercept ($b_0$) significantly different from zero, i.e., is a zero-intercept model suitable?
Calculation of CL's should proceed only after these questions have been answered. Too frequently, a linear model of the form $y = b_0 + b_1 x$ is fitted using least squares procedures and CL's are calculated without proper attention to these issues. If the calibration model is inappropriate, it almost certainly follows that DL's based on CL's around that model are inaccurate. Therefore, a major portion of this paper will deal with the selection of a calibration model. At the end of the paper, we return to a consideration of the effects of that model on DL estimates.
DETECTION IN ANALYTICAL CHEMISTRY
Experimental Design Questions

Concentration Range. Choice of the concentration range to be covered is generally dictated by the nature of the samples to be analyzed, precision and accuracy requirements, and the inherent limitations of the procedure. For example, an environmental pollutant may be present in samples at concentrations near the DL on the fringes of an area but at substantially elevated concentrations within the most severely impacted area. Low concentrations must be accurately estimated in order to define the geographic distribution in accordance with regulatory guidelines. However, it is also necessary to determine high concentrations accurately to plan cleanup strategy. In these circumstances, it may be best to produce two calibration curves: one for the low concentration range, from which the DL can be estimated, and one for the full concentration range. If only the latter calibration is performed, it is likely that DL estimates will be unsatisfactory.

Number of Standards. The number of standards required depends in large measure on the concentration range and the nature of the anticipated functional relationship. A zero intercept linear model for a limited concentration range may be adequately defined with three standards, although most investigators prefer four or five. In contrast, a thorough evaluation of a curvilinear model requires at least five standards, and more may be desirable when spanning a wide concentration range.

Distribution of Standards. The location of the calibration points on the concentration axis exerts far more influence on DL estimates than has been generally recognized.
Common practice is to space standards equidistant across the entire range of interest. However, for a specified number of standards, lower estimates of DL's are obtained without compromising the reliability of the high concentration range when an unsymmetrical distribution favoring low concentrations is used. These effects will be illustrated in a later section where CL computations are discussed.
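One way to see why the placement of standards matters is to look at the textbook expression for the variance of a fitted unweighted straight line at concentration x0, which is proportional to 1/N + (x0 - mean x)^2 / Sxx. The sketch below is only an illustration under stated assumptions: the two five-standard designs are hypothetical (they are not the authors' later examples), and the function name `leverage` is introduced here for convenience.

```python
# Hypothetical five-standard designs covering the same range (0.5 to 10);
# these concentrations are illustrative assumptions, not data from the chapter.
equidistant = [0.5, 2.875, 5.25, 7.625, 10.0]
low_weighted = [0.5, 1.0, 2.0, 5.0, 10.0]


def leverage(x0, standards):
    """Return 1/N + (x0 - mean)^2 / Sxx for an unweighted fit with intercept.

    The variance of the fitted line at x0 is this factor times the
    residual variance, so a smaller value means a narrower confidence band.
    """
    n = len(standards)
    mean_x = sum(standards) / n
    sxx = sum((x - mean_x) ** 2 for x in standards)
    return 1.0 / n + (x0 - mean_x) ** 2 / sxx


# Evaluate both designs at zero concentration, where DL estimates are made.
h_equidistant = leverage(0.0, equidistant)
h_low_weighted = leverage(0.0, low_weighted)
print(h_equidistant, h_low_weighted)
```

For the same number of standards and the same residual variance, the design that favors low concentrations gives the smaller factor at zero concentration, which is consistent with the lower DL estimates described in the text.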
Replication of Standards. An obvious benefit of replication is improved reliability of the results. Other benefits are the ease of testing the goodness of fit of the calibration model and the opportunity to intersperse measurements on standards randomly across an entire lot of samples to which that calibration curve will apply. By this arrangement the standard deviation estimated from the calibration data will usually correspond closely to the value estimated from replicate measurements on samples. Otherwise, reproducibility of samples may be poorer than for standards. Of course, this assumes no systematic drift of signal response during the course of the measurements. If drift is suspected, it can be checked by making several measurements on standards over an extended time period. Any procedure should be demonstrated to be performing normally and in control before starting on a lot of samples.

The question of how many replicate measurements to make must include consideration of the magnitude of variability, the time
required, the cost of each measurement, and the reliability required in the final result. For many situations we find duplicates or triplicates are quite adequate.

Preparation of Standards. To meet the requirement of independence of errors, each replicate and each standard should be prepared separately and with great care (15). Subsequent regression analysis will assume no error in the concentration (or at least, that the error is small in comparison to the error in signal measurement) and that the errors are independent. Thus, solution standards should ideally be prepared from more than one stock solution using a variety of pipets and volumetric flasks.

Choosing a Calibration Model

An assumed regression model may be either linear or nonlinear. Choosing the correct model is crucial to obtaining accurate results. Although several recent papers (16-18) extol the virtues of nonlinear calibration curves as a means of improving accuracy or extending the range of concentrations covered, this discussion will consider only linear models. Evaluation of nonlinear models is an extension of the linear case with a similar conceptual framework.

Unweighted least squares curve fitting is based on the assumptions that (a) measurement errors follow a Gaussian distribution and that (b) variances are independent of concentration, i.e., they are homogeneous. In typical calibrations, insufficient data are collected to rigorously test either assumption. Fortunately, modest violations do not cause serious errors, but Garden et al.
(19) warned that incorrect use of an unweighted least squares analysis could cause gross errors in the estimation of trace concentrations. In the initial portion of this discussion we will consider examples where both assumptions appear valid. Later we will examine the effects of nonuniform variance.

Linear Model With Intercept. There are two distinct linear first order regression models that are generally encountered in analytical calibration. The non-zero intercept model is the most familiar, and it is given by Equation 1.

$$y = b_0 + b_1 x \tag{1}$$
The estimates of intercept ($b_0$) and slope ($b_1$) are calculated so as to minimize the sum of squares (SS) of the deviations of the observed signals ($y_i$) from the predicted value ($y$) at any concentration ($x$) without constraints. For some determinations, however, theory predicts that the response of the instrument should be linear with concentration and should also be zero when there is no analyte present. Thus, if the instrument has been calibrated correctly, the calculated line should pass through the origin by definition. The proper regression model would then be the zero intercept model shown as Equation 2.

$$y = b_1^{o} x \tag{2}$$
The estimate of the true slope, $b_1^{o}$, is calculated so as to minimize the SS of deviations from the line with the restriction that the line must pass through the origin. Each of these models will be considered in the following paragraphs.

To facilitate this discussion, we have fabricated some typical spectrophotometric calibration data employing duplicate absorbance measurements at five concentrations. A reagent blank was used to set zero absorbance. Two sets of data are shown in Table I. The absorbance values at each of the four lowest concentrations are identical for each set. The difference occurs in the absorbance values for the highest standard, where a negative deviation from Beer's law is represented by a reduced absorbance for Case II. The regression equations and correlation coefficients were calculated according to standard equations available in any text on regression analysis.
Table I. Data and Statistical Analysis for Spectrophotometric Calibration

Concentrations of       Measured Absorbances (y)
Standards (x)           Case I                    Case II
 0.500                  0.054, 0.050              0.054, 0.050
 1.00                   0.103, 0.109              0.103, 0.109
 2.00                   0.202, 0.192              0.202, 0.192
 5.00                   0.494, 0.514              0.494, 0.514
10.00                   0.975, 1.005              0.915, 0.945

Least Squares Model     y = 0.00431 + 0.0988x     y = 0.0149 + 0.0927x
With Intercept
Correlation             r = 0.9996                r = 0.9988
Coefficients
Models Through Origin   y = 0.0994x               y = 0.0948x
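The fitted equations in Table I can be reproduced from the tabulated absorbances with the ordinary unweighted least squares formulas. The following is a minimal sketch rather than the authors' own program; the function names `fit_with_intercept` and `fit_through_origin` are assumptions introduced for illustration, and only the Python standard library is used.

```python
import math

# Table I data: duplicate absorbances at five concentrations.
x = [0.5, 0.5, 1.0, 1.0, 2.0, 2.0, 5.0, 5.0, 10.0, 10.0]
case_1 = [0.054, 0.050, 0.103, 0.109, 0.202, 0.192, 0.494, 0.514, 0.975, 1.005]
case_2 = [0.054, 0.050, 0.103, 0.109, 0.202, 0.192, 0.494, 0.514, 0.915, 0.945]


def fit_with_intercept(x, y):
    """Unweighted least squares fit of y = b0 + b1*x; returns (b0, b1, r)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return b0, b1, r


def fit_through_origin(x, y):
    """Unweighted least squares slope for a line forced through the origin."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)


b0_1, b1_1, r_1 = fit_with_intercept(x, case_1)
b0_2, b1_2, r_2 = fit_with_intercept(x, case_2)
slope0_1 = fit_through_origin(x, case_1)
slope0_2 = fit_through_origin(x, case_2)

print(f"Case I : y = {b0_1:.5f} + {b1_1:.4f}x, r = {r_1:.4f}, origin slope = {slope0_1:.4f}")
print(f"Case II: y = {b0_2:.4f} + {b1_2:.4f}x, r = {r_2:.4f}, origin slope = {slope0_2:.4f}")
```

Both correlation coefficients round to values above 0.998 even though only Case I is a satisfactory linear calibration, which is the point taken up in the goodness of fit discussion.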
Goodness of Fit. The fitted model with intercept for Case I is seen to have a correlation coefficient of 0.9996, which would often be interpreted to mean that the equation fits the data very well. However, we shall see from the Case II data set that the correlation coefficient is not a sensitive method of evaluating curve fit. Hunter (20) notes that in statistical theory, correlation is a measure of the relationship between two random (dependent) variables. In a calibration problem, however, it is assumed that there is a definite functional relationship between the dependent and independent variables. Correlation, in its strict statistical sense, does not exist. Van Arendonk et al. (21) point out that the correlation coefficient is an insensitive tool for use in evaluating the quality of the fitted equation, and its use in such a manner may lead to erroneous conclusions.

We believe that it is far more instructive to perform a regression analysis in which the sources of variation are fractionated into the sums of squares (SS) attributable to regression and the SS for residuals. When replicate measurements have been made, the residual SS can be further fractionated into a
systematic error component and a random error component. The systematic error component is designated the SS due to lack of fit (LOF) because it arises from the inadequacy of the fitted regression model to describe the experimental points. Table II-a gives the equation for calculating the SS of residuals with N-2 degrees of freedom (d.f.), since two regression coefficients were fitted. Many statistical analysis programs routinely provide the SS of residuals.

The SS for random error (SS error) is independent of the regression model employed, i.e., it depends solely on the distribution of replicates around the mean response for each standard. When duplicate measurements have been acquired for each standard, the SS error is calculated as shown in Table II-a, where d = difference in signal for each set of duplicates. The total d.f. in this error estimate would be equal to the number of duplicate sets, since each would contribute 1 d.f. The SS for LOF is obtained by difference between the residual SS and the random error SS. Similarly, the d.f. associated with LOF is obtained by difference.

These calculations are illustrated in Table II for linear models with intercepts fitted to the data sets of Table I. Inspection of Table II reveals that the F-ratio for LOF for the Case I results is not significant, as expected, i.e., the model is an adequate description of the data. For the Case II results, however, the LOF is significant at the 0.10 significance level despite finding a correlation coefficient of 0.9988!
With the high probability that the linear model does not properly fit the data for Case II, it seems unreasonable to use such a calibration curve without trying to resolve the problem. Note that the nature of this test is such that the LOF will not show significance if large random errors are present. In fact, when random error is large, it is difficult to detect systematic variations that might result in LOF. In this example, however, random error is the same for each case. The LOF is caused by a negative deviation of absorbance for the highest concentration standard. The problem can be resolved by reducing the concentration of the highest standard to the upper limit of the linear range or possibly by fitting a nonlinear model. The important point is that the correlation coefficient provides no real insight concerning the extent or nature of residuals, whereas the LOF test does.

It is important to note that an observation (or set of observations) on a standard may be rejected as an outlier only if it is not at the extreme ends of the calibration curve. If the lowest (or highest) standard appears to be an outlier, it cannot be determined from the data collected whether the concentration of the standard is in error or if the response of the instrument is beginning to deviate from linearity. The "outlying" observation would have to be retained unless additional measurements made on a standard of lower (higher) concentration indicate that the deviation from the calculated line is not due to non-linearity of the response function.
When replicate measurements are not available, a thorough analysis is required of the residuals: the individual differences between the experimental points and the calculated regression line. Patterns in residual plots provide insight concerning the validity of the fitted equation and possible causes when the fit is
Table II-a. Formulation of Regression Analysis Table Using the Calibration Data of Table I and a Linear Model With Intercept

Source of Variation   Sum of squares (SS)                                             Degrees of freedom (df)   Mean square (MS)   F-ratio (F)
Residual              $[\sum y^2 - (\sum y)^2/N] - b_1^2[\sum x^2 - (\sum x)^2/N]$    8                         resid. SS / 8      -
Error                 $\sum d^2 / 2$                                                  5                         SS error / 5       -
Lack of Fit (LOF)     Residual SS - Error SS                                          3                         SS LOF / 3         MS LOF / MS error
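The error row of Table II-a rests on a simple identity for duplicates. The short derivation below is added as a worked check; the indices $j$ and $k$ are notation assumed here and do not appear in the original table.

```latex
% Duplicate signals y_{j1}, y_{j2} on standard j, with mean \bar{y}_j
% and difference d_j = y_{j1} - y_{j2}; k is the number of duplicate sets.
\begin{align*}
(y_{j1}-\bar{y}_j)^2 + (y_{j2}-\bar{y}_j)^2
  &= \left(\tfrac{d_j}{2}\right)^2 + \left(-\tfrac{d_j}{2}\right)^2
   = \frac{d_j^2}{2}, \\
\text{SS error} &= \sum_{j=1}^{k} \frac{d_j^2}{2}
   \quad (k \text{ d.f., one per duplicate set}), \\
\text{Table I:}\quad \sum_j d_j^2
  &= 0.004^2 + 0.006^2 + 0.010^2 + 0.020^2 + 0.030^2 = 0.001452,
   \qquad \text{SS error} = 0.000726 .
\end{align*}
```

Because the duplicate differences are the same in Case I and Case II, the two cases share the same SS error with 5 d.f.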
Table II-b. Regression Analyses With LOF Tests for Table I Calibration Data Using Intercept Model

Source of Variation   SS          df   MS          F-ratio*
Case I
  Residual            0.000872    8    0.000109
  Error               0.000726    5    0.000145
  LOF                 0.000146    3    0.0000487   0.34
Case II
  Residual            0.002518    8    0.000315
  Error               0.000726    5    0.000145
  LOF                 0.001792    3    0.000597    4.12

*The F-ratios required for 3 and 5 df at various significance levels are 3.62 for 0.10 and 5.41 for 0.05.
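The entries of Table II-b can be recomputed directly from the Table I data. The sketch below is an illustration, not the authors' code: the function name `lack_of_fit` and the returned dictionary keys are assumptions, the data are assumed to be ordered as consecutive duplicate pairs, and the critical F values are simply the ones quoted in the table footnote.

```python
# Table I data, ordered so that each consecutive pair is one set of duplicates.
x = [0.5, 0.5, 1.0, 1.0, 2.0, 2.0, 5.0, 5.0, 10.0, 10.0]
case_1 = [0.054, 0.050, 0.103, 0.109, 0.202, 0.192, 0.494, 0.514, 0.975, 1.005]
case_2 = [0.054, 0.050, 0.103, 0.109, 0.202, 0.192, 0.494, 0.514, 0.915, 0.945]

# Critical values for 3 and 5 df, as quoted in the Table II-b footnote.
F_CRIT_0_10 = 3.62
F_CRIT_0_05 = 5.41


def lack_of_fit(x, y):
    """Partition the residual SS of an intercept model into error and LOF."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x

    # Residual SS about the fitted line, with N - 2 df.
    ss_resid = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    df_resid = n - 2

    # Error SS from duplicates: sum of d^2 / 2, one df per duplicate set.
    ss_error = sum((y[i] - y[i + 1]) ** 2 / 2 for i in range(0, n, 2))
    df_error = n // 2

    # LOF is obtained by difference, for both SS and df.
    ss_lof = ss_resid - ss_error
    df_lof = df_resid - df_error
    f_ratio = (ss_lof / df_lof) / (ss_error / df_error)
    return {"ss_resid": ss_resid, "ss_error": ss_error, "ss_lof": ss_lof,
            "df_lof": df_lof, "df_error": df_error, "f_ratio": f_ratio}


lof_1 = lack_of_fit(x, case_1)
lof_2 = lack_of_fit(x, case_2)
for label, result in (("Case I", lof_1), ("Case II", lof_2)):
    significant = result["f_ratio"] > F_CRIT_0_10
    print(label, f"F = {result['f_ratio']:.2f}", "LOF significant at 0.10:", significant)
```

With these data the Case I ratio falls well below the 0.10 critical value, while the Case II ratio lies between the 0.10 and 0.05 critical values, in agreement with the table.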
poor. This subject is comprehensively discussed in Draper and Smith (15).

Zero Intercept Model. For spectrophotometric determinations, theory predicts that the response of the instrument should be linear with concentration and that the response should be zero when there is no analyte present. The zero intercept regression model (Equation 2) provides parameter estimates which meet this restriction. The expression used to calculate the slope of the line through the origin is:

$$b_1^{o} = \frac{\sum xy}{\sum x^2} \tag{3}$$

Fitted models through the origin are shown in Table I for the two sets of data previously discussed. Before the equation of the line calculated using the zero intercept model is employed to evaluate unknowns, it must be tested to determine if the model is adequate to describe the experimental data. Regression analysis tables are constructed prior to testing the statistical validity of the assumption that the intercept of the line is zero. The format for calculation of the regression analysis tables is shown in Table III-a and the analyses of the Table I data are shown in Table III-b. Inspection of these tables shows that the LOF test results are very similar to those for the models with intercepts.

Comparison of Tables II-b and III-b reveals that the SS residuals are somewhat larger for the zero intercept models than for the models with an intercept. This difference can be used to test the hypothesis that the intercept is zero. First, it must be demonstrated that the LOF is not significant, since it would not make good sense to test the zero intercept hypothesis for linear models shown not to fit the data. Furthermore, the SS error and SS(LOF) should not be combined as SS residuals when LOF is significant. These requirements are met by the Case I results. To test the hypothesis that the intercept does not differ significantly from zero, calculate:
$$F = \frac{(\text{SS residual for zero intercept model}) - (\text{SS residual of model with intercept})}{\text{MS residual of model with intercept}} \tag{4}$$
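Equation 4 can be evaluated numerically from the Table I data. The sketch below is an illustration under stated assumptions: the helper names are introduced here, residual sums of squares are computed directly from the fitted lines, and the critical value is the one quoted in the text for 1 and 8 d.f. at the 0.05 level.

```python
# Case I data from Table I.
x = [0.5, 0.5, 1.0, 1.0, 2.0, 2.0, 5.0, 5.0, 10.0, 10.0]
case_1 = [0.054, 0.050, 0.103, 0.109, 0.202, 0.192, 0.494, 0.514, 0.975, 1.005]

# Required F for 1 and 8 d.f. at the 0.05 level, as quoted in the text.
F_CRIT_1_8 = 5.32


def residual_ss_with_intercept(x, y):
    """Residual SS for the unweighted model y = b0 + b1*x (N - 2 d.f.)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))


def residual_ss_through_origin(x, y):
    """Residual SS for the zero intercept model (N - 1 d.f.)."""
    slope = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)
    return sum((yi - slope * xi) ** 2 for xi, yi in zip(x, y))


def intercept_f_ratio(x, y):
    """F of Equation 4: extra SS from dropping the intercept (1 d.f.)
    divided by the residual MS of the model with intercept."""
    n = len(x)
    ss_full = residual_ss_with_intercept(x, y)
    ss_zero = residual_ss_through_origin(x, y)
    ms_full = ss_full / (n - 2)
    return (ss_zero - ss_full) / ms_full


f_case_1 = intercept_f_ratio(x, case_1)
print(f"F = {f_case_1:.2f}; reject zero intercept at 0.05: {f_case_1 > F_CRIT_1_8}")
```

A ratio well below the critical value means the data give no grounds for rejecting the zero intercept hypothesis.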
For the Case I data,

$$F = \frac{0.000960 - 0.000872}{0.000109} = 0.81$$

The d.f. in the numerator will always be 1 because (N-1) - (N-2) = 1 and, therefore, the difference in these SS is divided by 1 to get the MS. The d.f. in the denominator are N-2, or 8 in this example. At the 0.05 significance level, the required F value with 1 and 8 d.f. is 5.32. Clearly, we cannot reject the hypothesis that the intercept is zero, and consequently we conclude that this model is consistent with the data.

It can be very advantageous to achieve a calibration that has a zero intercept if it can be demonstrated that this condition can be sustained on a long term basis. We find that some systems that are carefully zeroed on blanks will meet this requirement. Under
Table III-a. Formulation of Regression Analysis Table Using the Calibration Data in Table I and a Zero Intercept Model

                Sum of squares (SS)                     Degrees of freedom   Mean square (MS)
Residual        $\sum y^2 - (\sum xy)^2 / \sum x^2$
Error           $\sum d^2 / 2$
Lack of Fit     Residual SS - Error SS