Calibration: Detection, Quantification, and Confidence Limits Are (Almost) Exact When the Data Variance Function Is Known

Feature. Cite This: Anal. Chem. 2019, 91, 8715−8722
pubs.acs.org/ac

Inverse variance weighting ensures optimal parameter estimation in least-squares fitting, with exact parameter standard errors for linear least-squares with known data variance. In this Feature, I emphasize the virtues of numerical methods for estimating data variance functions and for determining these limits for any calibration model, linear or nonlinear.

Joel Tellinghuisen*
Department of Chemistry, Vanderbilt University, Nashville, Tennessee 37235, United States

Published: June 10, 2019
© 2019 American Chemical Society

Calibration is perhaps the most common computational procedure in analytical chemistry and is arguably the most important. The goal is the determination of the amount of an unknown and its uncertainty. In many applications, the uncertainty is subject to rules and guidelines for acceptability, often stated in terms of limits of detection (LOD) and limits of quantification (LOQ).1 The definitions of these limits have been the subject of debate in the analytical community for decades,2−7 as reviewed exhaustively, critiqued, and corrected in recent works by Voigtman.8−10 The focus of the present work is not those definitions; rather, it is the least-squares standard errors (SE) that occur in all such calibration-based limits. For example, for the universally treated linear response function,

y = a + bx    (1)

the calibration parameters a and b are estimated by least-squares fitting (regression) of the calibration data, having measured response (y) at selected values of the independent variable x (amount, concentration). The unknown x0 is obtained from its response y0, as x0 = (y0 − a)/b. In the case of data of constant uncertainty σy (homoscedastic), the variance in x0 can be stated as

σx0² = (σy²/b²)[1/m + 1/n + (x0 − x̄)²/Σ(xi − x̄)²]    (2)

where m and n are the numbers of measurements of y0 and the calibrants, respectively, and overbars denote averages. An equivalent expression replaces (x0 − x̄) by (y0 − ȳ)/b. In eq 2, the data error σy (standard deviation, SD) is assumed to be known, in which case the SE in x0 (σx0) is almost exact; almost, because the slope b carries estimation uncertainty and because x0 is a nonlinear estimator, for which the linear least-squares (LLS) rules hold only approximately (see below).11,12 In the context of Monte Carlo (MC) simulations, σy is known exactly, since it is used by the computationalist to set the scale of the data error. In the analysis of experimental data, σy is usually treated as unknown and is replaced by its estimate sy, obtained from the LS fit residuals, δi = ycalc(xi) − yi, using

sy² = Σδi²/(n − p)    (3)

where p is the number of adjustable parameters in the LS fit (2 for eq 1). I will use the designation “post” (or a posteriori) in referring to sy-based estimates of the SEs and “prior” (a priori) for σy-based SEs. Confidence limits are defined to encompass a specified percentage of possible outcomes (usually 90% or 95%) and are commonly obtained from the normal (Gaussian) distribution when σy is known and the t-distribution when it is estimated as sy. In the latter case, one must also specify the number of degrees of statistical freedom ν (= n − p = n − 2 here), because the t-distribution depends on ν, converging on the normal distribution in the limit ν = ∞. Quantities similar to those in eq 2 occur in the definitions of LOD and LOQ.13 This equation can be recognized as a sum of variances divided by the squared slope, with the first term in parentheses accounting for the variance in the estimate of y0 from m measurements, while the second and third together give the variance in the response function (RF) at x0. (This sum of variances is called the prediction variance, usually with m = 1 assumed.) This relationship holds for heteroscedastic data (data of varying σy) as well as homoscedastic, and for any RF, including those requiring nonlinear LS (NLS) for estimation. In other words, the general form of eq 2 is

σx0² = [σf²(x0) + σy0²]/(df/dx)0²    (4)

where f(x) represents the RF. There are ways to express RFs so that the LS fit yields σf²(x0) directly, making eq 4 a numerical


as well as conceptual generalization of the very limited and formulaic eq 2. Moreover, it is easy to obtain x0 and its SE directly from the fit of the calibration data, for any weighting and any RF (see below). Many analytical techniques give data that are inherently strongly heteroscedastic, usually from error that is proportional to the signal (sometimes called constant coefficient of variation) or to the square root of the signal (counting data). Analysts are increasingly aware that this situation calls for weighted LS, but fewer seem clear on an important property of the method of linear least-squares: minimum-variance estimates of the parameters are obtained if and only if the data are weighted as their inverse variance. Any other weighting leads to (1) loss of precision in the LS estimation and, more importantly, (2) incorrect estimates of the variances that occur in the numerator of eq 4. Accordingly, any limits based on incorrectly weighted data will be wrong, including those obtained using eq 2 for a linear RF with heteroscedastic data. To appreciate the importance of correct weighting, consider the example in Figure 1, for a linear model analyzed with two different error structures: constant (σy = 1) and proportional (σyi = xi/4). The scales of the errors have been set to make the intercept more precise and the slope less precise in the heteroscedastic case. The SEs are exact results, and they are confirmed through MC simulations when the data have random error of these magnitudes and are analyzed with weights wi = 1/σyi². However, suppose the heteroscedastic data are fitted unweighted, i.e., using wi = 1. MC simulations show that the actual SE for intercept a is 1.02, representing almost a factor of 3 loss in precision. Even worse, the post SE estimate obtained from standard expressions for “ordinary least-squares” (OLS, unweighted fitting) is sa ≈ 1.36. This would lead the analyst to report an (incorrect) LOD almost a factor of 4 larger than would be obtained with proper weighting. Alternatively, suppose the homoscedastic error structure is correct, but the analyst, not happy with the large uncertainty for the intercept, decides to weight the data as 1/x² in order to better estimate a. Sadly, “you cannot fool Mother Nature”: this weighting increases the actual SE in a by 40%, but the fitting software now returns an estimate of sa ≈ 0.33, deceiving the analyst into thinking this procedure has indeed achieved its purpose, and leading to an LOD that is falsely optimistic by a factor of 3.5.

Figure 1. Linear model with intercept 1, slope 2, and 5 xi values (1.1, 3.3, 5.5, 8.3, and 12.0). The illustrated results are obtained using the KaleidaGraph program14 on exactly fitting data. “Errors” are the a priori SEs obtained for the indicated error models.

Figure 2. Replicate-based variance estimates for 12 of the 16 species studied in refs 35 and 36 (omitting nos. 6, 11, 13, and 16), displayed as functions of concentration (ng/mL, A) and average peak area (B). The eight symbols represent the eight different concentration groups, in which the individual concentrations were chosen to give approximately the same peak area.

accompanying Supporting Information document, with examples in KaleidaGraph and Excel.

Recognition that inverse-variance weighting is required for correct use of the method of LS should force greater emphasis on the determination of data variance functions (DVFs). That such recognition remains far from universal is clear from literature efforts that treat weights as a means to an end: a tool to be varied in trial-and-error fashion in the quest for a “best weighting formula”, as judged from criteria other than the data variance.15−27 Those other criteria include “quality coefficients” (QC),16,20 which MC simulations have shown to be fundamentally flawed.28 Many of refs 15−27 have settled on 1/x² weighting for data which likely are dominated by proportional error in the large-signal limit. However, such weighting is an oversimplification that does not correctly represent the variance at low signal, where most instrumental techniques have constant limiting variance. Accordingly, 1/x² weighting is likely to yield overly precise estimates of the LOD and LOQ. In regions where proportional error indeed dominates and the response is linear, 1/x² weighting does yield optimal estimates of the LS parameters and valid post SEs, but absolute DVFs are needed to realize the virtue of exact a priori SE prediction stated in the title to this work. The straightforward way to get data variances is from replicates. Indeed, “weighted least squares” (WLS) is generally interpreted to mean wi = 1/syi², with syi² being the sampling estimate of the variance from the replicates for xi. However, this can be a poor choice if the number of replicates k is small, because such estimates have relative error (2/(k − 1))^1/2. For example, with k = 3, the relative error is 100%, leading to considerable scatter in the estimates. Homoscedastic data treated this way show a large range of weights, leading to reduced precision.29 The preferred alternative is to fit all of the replicate-based variance estimates to a variance function that correctly accounts for the behavior of the variance over the full range of observed signals. This generally means at least two adjustable parameters, to account for constant variance in the low-signal limit and x/y-dependent error at large signal. Two commonly used forms for proportional error are

σy(z) = cs0 + cs1 z    (5a)

and

σy²(z) = cv0 + cv1 z²    (5b)

where z is either x or y. The latter is the logical choice, since the uncertainty in the signal is expected to be a function of the signal, not the control variable that leads to that signal. In double-beam spectrophotometry, for example, the DVF is a function of wavelength and absorbance but independent of analyte.30,31 Nonetheless, x seems preferred in most efforts to derive weighting formulas and DVFs. From a fundamental standpoint, eq 5b is preferred over eq 5a, since for independent sources of error, variances add but SDs do not. Also, s is a biased estimator, while s² is unbiased.10,32 Sampling estimates of σy² and σy have proportional error, which means that in LS fits to eq 5a and eq 5b, both sy and sy² should be weighted inversely as their own squares.32 Yet many efforts by chemical analysts have employed OLS to estimate the DVF parameters,33−39 though sometimes with logarithmic transformations, which do correctly convert proportional error to constant error. Correct weighting has been employed in a number of recent works, which include variance function estimation for several techniques by my colleagues and me.31,32,40−47 In LLS, incorrect weighting gives reduced precision but not biased parameter estimates; thus, neglect of weights in DVF estimation does not ensure “wrong” results but may give estimates that lie outside the confidence region for correct weighting. For example, Zorn et al. noted the possibility of negative cs0 in fitting to the quadratic extension of eq 5a;35 with proper weighting, that possibility is greatly reduced because the low syi values at low x are much more heavily weighted than the high syi at large x. An alternative to fitting replicate-based sy² (or sy) estimates for determining DVFs is generalized least-squares (GLS), in which the DVF is estimated from the RF fit residuals, in an iterative cyclical process.17,40,43,48 Although GLS can perform better than replicate analysis for a given DVF, it does not permit proper comparison among different functional forms of DVF, while replicate analysis does. In both cases, the results will fail to achieve the goal highlighted in the title to this work, because they will always be limited by the number of statistical degrees of freedom for the target data set, which is typically 1, but I will normally use the default of m = 1. Then with the DVF assumed known, the prediction uncertainty is obtained from the numerator in eq 4,

σC² = σa² + σy²(0)    (13)

in which the second term is the DVF at x = 0. (It would be divided by m for protocols having m > 1.) tp is a t coverage factor that depends on ν and % confidence. The default is 1-sided 5%, meaning 5% of true blanks will be interpreted as detections. Since I am assuming known DVF, tp = zp = 1.6449 for 5% false positives. With this assumption, σC and SC are exact for a given linear response model, but xC is uncertain from the estimated slope.
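The Figure 1 weighting comparison above can be checked with a few lines of linear algebra. The sketch below (Python, purely illustrative; the paper's own analysis used KaleidaGraph) computes the a priori covariance (XᵀWX)⁻¹ for correct inverse-variance weighting and the "sandwich" covariance (XᵀX)⁻¹XᵀΣX(XᵀX)⁻¹ that gives the actual parameter errors when the heteroscedastic data are fitted unweighted; it reproduces the actual SE of 1.02 for the intercept and the near-threefold loss in precision quoted in the text.

```python
import numpy as np

x = np.array([1.1, 3.3, 5.5, 8.3, 12.0])
X = np.column_stack([np.ones_like(x), x])   # design matrix for y = a + b*x
Sig = np.diag((x / 4.0) ** 2)               # proportional error, sigma_i = x_i/4

# Correct weighting: a priori covariance (X^T W X)^-1 with W = Sig^-1
cov_wls = np.linalg.inv(X.T @ np.linalg.inv(Sig) @ X)

# Unweighted fit of the same heteroscedastic data: the *actual* covariance
# is the sandwich (X^T X)^-1 X^T Sig X (X^T X)^-1
G = np.linalg.inv(X.T @ X)
cov_ols = G @ X.T @ Sig @ X @ G

print(np.sqrt(cov_wls[0, 0]), np.sqrt(cov_ols[0, 0]))  # SE(a): weighted vs unweighted
```

The same sandwich construction gives the actual SEs for any wrong weighting (e.g., the 1/x² misstep discussed above) by replacing (XᵀX)⁻¹ with the corresponding weighted normal-equations inverse.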

The detection limit in the content domain is defined similarly, as

xD = SD/b    (14)

with

SD = SC + tq σD    (15)

where

σD² = σf²(xD) + σy²(xD)    (16)

from which σD differs from σC even for homoscedastic data (σy(xD) = σy(0) = constant), because σf typically decreases from σa at x = 0. This makes xD a function of itself, leading many to adopt the approximation σD = σC, which is conservative, in yielding slightly larger xD. For the default 5% false-negatives coverage, this gives SD = 2SC and

xD2 = 2xC    (17)

In fact, for the homoscedastic case, Currie has given equations for xD without this approximation (eqs 19 and 20 in ref 57 and section 3.2.5 in ref 58). For both homo- and heteroscedastic data, solution for xD from eq 14 is a simple numerical task, as is illustrated below. Quantification limits have typically been defined in terms of the relative SD (RSD) in either the response or content domain, with 10% RSD being the default.57 In the response domain, this is in terms of σy rather than the prediction uncertainty, so for homoscedastic data, this means

SQ = 10σy    (18)

which translates into SQ = 6.08 SC and ∼3 SD for the default 5% ranges for SC and SD. As before, xQ is then obtained as SQ/b. However, because of the role of varying σf² in eq 4, this does not give the same confidence range for xQ. Accordingly, some workers have defined xQ as

xQ = 10σ(xQ)    (19)

another expression that can be implemented easily as a successive-approximation numerical task.

Under the assumption of Gaussian error, the definition of SD in eq 15 ensures that a specified percentage of net responses centered at SD with σ = σD will fall below SC and hence be detected as blanks (false negatives). Thus, for the default 5% and known DVF, tq = zq = 1.6449. This CI is exact. However, because x0 is a nonlinear estimator, the corresponding eq 14 is not exact. The errors in the two definitions of xD, eqs 14 and 17, will be examined through MC simulations below. The equations given above for xC and xD assume a linear RF with slope b. For nonlinear response, these relations change, because xC and xD are defined as the values at which the RF equals the specified values of yC and yD, which are a + SC and a + SD, respectively, for RFs with intercept a. This case will be considered further below.

ALGORITHMS AND LIMITS

I have described an algorithm that assesses x0 and its uncertainty numerically in conjunction with the calibration fit for any RF and any DVF.55,56,59 It does this by including y0 for the unknown among the calibration data but treating it differently. For example, with the linear RF of eq 1, y0 is fitted to y0 = a + bx0. Since x0 is now an adjustable parameter, its SE is returned along with those for the calibration parameters. In effect, this algorithm just propagates the uncertainties of a, b, and y0 into that for x0. Even though x0 is a nonlinear estimator, with a non-normal distribution and inexact parametric SE, MC simulations have shown that these shortcomings are negligible unless the relative SE (RSE) is large, with RSE < 0.1 being a useful guideline for validity.52,60 This algorithm is illustrated in Figure 3, using the same data as in Figure 1.

Figure 3. (Left) Model of Figure 1, with second and fourth points removed and first and last doubled, giving improved precision for both parameters in the homoscedastic model but slightly increased SE for b in the heteroscedastic model. (Right) Inclusion of unknown having y0 = 20 in the homoscedastic model: 5 calibration points and 1 unknown (upper); 3 calibration points and 3 unknowns (lower). A dummy x value > 12 is used to identify the y0 values, leading to the unusual appearance of the plotted curve. In KaleidaGraph, the expression for “calibration” is y = (x>12)?(a + b*c):(a + b*x).

The left frame shows how altering the data structure can improve precision. (For linear RFs with constant data error, precision is improved by putting points at the extremes of the range.61) The right frame includes y0 = 20 for the unknown in the homoscedastic model, yielding x0 (c) and its SE, for two different distributions of the six measurements. Note that the 3,3 distribution yields better precision for the unknown, even though the precision for the calibration parameters is reduced. (For a fixed number of measurements, best precision for x0 is obtained for about equal numbers of knowns and unknowns.) Again, this approach is not limited to homoscedastic data and linear RFs, nor to known DVFs and exactly fitting data. In the Supporting Information, I illustrate it for realistic data. Here I will show how to use it to determine limits and their uncertainties, the latter being the reason for the ν-dependent t-factors when SEs are postestimated, but an aspect of detection limits that has generally not been addressed directly.62 From eqs 4, 12, and 13, we see that xC can be expressed in terms of σx0 for x0 = 0:

xC = zp σx0(0)    (20)

where I continue to assume that the DVF is known. xC is thus obtained directly from the calibration algorithm and hence also xD2 (eq 17). For xD we write

xD = xC + zq σx0(xD)    (21)

which is solvable by successive approximation. Starting with xD = 0 on the rhs, we have xD2 initially for xD, which we then use for xD on the rhs. For the model on the right (upper) in Figure 3, using zq = zp = 1.6449, we obtain 2.03844 (xD2) → 1.97642 → 1.97800 → 1.97796 → 1.97796, the converged estimate agreeing with results obtained using eq 32 in ref 58, the rhs of which is xD2(K/I). Although eq 21 ensures that the lower 5% formal limit for x0 = xD is xC, MC simulations show that for this example the actual percentage of x0 estimates falling below xC, when y0 is taken as the response for x0 = xD, is 5.46(4)%. For comparison, x0 = xD2 yields 4.45(4)% false negatives. Thus, in this case the “true” LOD is about midway between xD and xD2. Whether such small differences are of any concern can be addressed by evaluating the uncertainty in xD itself. Under the assumption of known DVF, only b is uncertain, and its relative uncertainty of 4.6% (Figure 3) translates into the same relative uncertainty in xD, or about 0.09, from which the difference between xD and xD2 is of no practical concern. If the DVF is not known, SC and SD are proportional to the estimate sy, and the uncertainty in xD is dominated by that in sy, which for the present model has relative magnitude (2ν)^−1/2 = 0.41. This leads to significantly larger limits, from the factor t90,3 = 2.353 in place of zp = 1.6449 for 5% one-sided uncertainties. For the LOQ for this model, solution of eq 19 by successive approximation converges to xQ = 5.49095 in 4 steps, starting from xQ = xD. The heteroscedastic version of this model (Figure 3, left) is discussed in the Supporting Information, where a third version of xD is obtained by approximating σf²(xD) in eq 16 by σa²,9

xD3 = xC + (zq/b)(σa² + σy²(xD3))^1/2    (22)

This ignores the decrease in σf with increasing x, giving a more conservative xD. The considerations in the Supporting Information show that just doubling xC to get xD2 is not a safe procedure for setting LODs with heteroscedastic data. (For homoscedastic linear models, xD3 = xD2.) A realistic heteroscedastic example is considered in Annex C of ref 13, consisting of 4 replicates at each of 6 analyte concentrations. The writers of this ISO document first estimate the data error function, fitting the replicate-based estimates of σy to the present eq 5a, through an appropriate iterative reweighting procedure. They then use this σy(x) function to weight the data in the calibration fit, from which they estimate xC and xD. I have repeated this analysis and examined the results with MC simulations, from which I have identified several problems with the treatment in ref 13. The reader is referred to the Supporting Information for my detailed reassessment of this example. One important finding is that the effective degrees of freedom ν from DVF estimation can be much smaller than predicted from simple expressions like ν = n − p − 2 for a linear RF with a p-parameter DVF.35 This means use of this expression yields optimistic estimates of LODs, LOQs, and confidence limits where DVFs are estimated by LS fitting. The RSDs for DVF parameters can be predicted reliably from models, as is discussed below, where it is shown that blanks (measurements at x = 0) can provide a precision floor (and hence a minimal ν) for constant terms in both the DVF and the RF. Up until now, I have considered just linear RFs. However, many RFs are inherently nonlinear and should be treated as such rather than approximated as linear. As I have noted, equations relating limits in the response and content domains are then altered, from eqs 12a and 14 to

f(xC) = f(0) + SC    (23a)

and

f(xD) = f(0) + SD    (23b)

with the definitions for SC and SD in terms of σC and σD unchanged. These equations can also be solved by successive approximation, but they require a different approach. In the Supporting Information, I illustrate two examples of nonlinear RFs: one where there is saturation behavior at large x but the linear approximation works well at small x, and a second where the response is sigmoidal and the linear approximation is inappropriate.

VARIANCE FUNCTION ESTIMATION

Having addressed the advantages that follow from knowledge of DVFs, I now turn to their determination. Key results are largely predictable from the fact that sampling estimates of variances are χ² distributed, with RSD = (2/ν)^1/2, while SD estimates have RSDs smaller by a factor of 2.

Figure 4. Least-squares treatment of model SDFs taken as constant and following eq 5a for 4 replicates at each of the 6 xi values in example 2 in Annex C of ref 13. The unusual expression for the fit function upper left is required to get KaleidaGraph to fit to a constant.

Figure 4 shows LS fits to two simple SD functions (SDFs), with weighting in accord with proportional errors of 1/√6 for the 4 SD replicates at each concentration. For constant sy, the fit gives RSD = 1/6, which in turn gives ν = 18 exactly. Since the postestimates of the SEs for all RF parameters contain a factor of sy, all will be characterized by ν = 18, as found by fitting MC χ² distributions in Figure S-11. For the linear SDF, results are more complicated. The RSD in cs0 translates into ν = 3.24, while that in cs1 gives ν = 10.08. These results also depend on the values of the SDF parameters. Thus, for example, if cs0 and


cs1 are changed to cs0 = 3 and cs1 = 0.25, ν = 1.75 for the former and 11.52 for the latter. Including blanks directly improves the precision of cs0 in the linear SDF in Figure 4 in a simple way. Thus, adding νb + 1 blanks (x = 0) increases the effective ν for cs0 by νb; e.g., from 3.24 to 6.24 for the example in Figure 4 with the addition of 4 blanks. The effect on the intercept a in the RF is similar but includes a reduction in σa as well as an increase in the effective ν. First, suppose a is estimated from νb + 1 blanks alone. Then sa will have ν = νb. Including these blanks in the calibration fit with other calibration data must yield a more precise a, characterized by a larger effective ν. The increased precision at small x contributes directly to smaller xC and xD, from decreases in σf²(x0) in the numerator of eq 4. For estimated SDs, the effect on the SDF is just the increase in ν. Thus, for the constant σy example in Figure 4, increasing the replicates from 4 to 7 doubles ν (to 36) without changing the expected value of σy. This leaves the second term in the numerator of eq 4 unchanged but does give a smaller t value for confidence and detection limits.

Next, consider the data from ref 35, shown in Figure 2. This is an ideal case for DVF estimation for a collective data set, with variance taken as a function of the response y rather than the control variable x. The obvious choice of independent variable is ⟨y⟩, as in Figure 2B. This raises a new issue, as ⟨y⟩, being based on 7 estimates, is not error-free, in violation of one of the requirements for linear LS. Further, there are questions about the best way to carry out the proportional weighting of the data, an issue that does not arise in fitting the exact data in Figure 4. There are at least four choices here:
(1) Take wi⁻¹ = svi², with svi = (2/νi)^1/2 si². Here si² is the variance estimate from νi + 1 yi replicates (having average ⟨y⟩i).
(2) Similar, but replace si² with its estimated value, giving svi,calc = (2/νi)^1/2 σy²(⟨y⟩i), where σy² is the DVF. This is generally considered better than (1), but it makes the computation iterative, because the DVF changes with changes in the wi.
(3) Minimize the quantity Σ(δi/svi,calc)², where δi = (calc − obs)i = σy²(⟨y⟩i) − si². This is a version of the effective variance method that I have called EV2,63,64 in which the weights are a direct part of the optimization process. Like (2), it is iterative and nonlinear. (Results from (3) are not the same as those from (2), where the wi are adjusted after each cycle but are treated as constant in the minimization computations.)
(4) Fit log(si²) to log(σy²). By error propagation, the logarithmic transformation converts the wi to constant values, wi = νi/2. This makes this approach appealing, but for most DVFs, nonlinear LS is again required.
Using methods like those discussed in refs 63 and 64, I have checked on the effect of uncertainty in ⟨y⟩ and found it negligible here. To compare the four weighting choices, I devised MC codes to model the data in Figure 2B, taking 125 ⟨y⟩ values spaced evenly on a log scale between 3 and 4400 and assuming a DVF of the form of eq 5b having cv0 = A² and cv1 = B², with A = 2.1 and B = 0.15. Option 2 proved best, with negligible bias in the parameters and only +3% in χ². Option 1 gave parameters biased by about −35% and χ² by +24%, while option 3 gave +15% bias in the parameters and −24% in χ². Log fitting gave −8% bias in the parameters and +21% in χ². The MC simulations were in fact prompted by similar differences observed for the fitting of the actual data. The MC results inspire confidence in the Option 2 results, which are

σy²(⟨y⟩) = [2.15(21)]² + [0.1502(44)⟨y⟩]²; χ² = 133.4 (126 points)    (24)

as obtained fitting variance estimates from all 16 data sets from ref 35 (after editing, see the Supporting Information). This χ² value is reasonably close to its expected value of 124. I experimented with other functions and found that a third term in this DVF proportional to ⟨y⟩ was statistically significant. However, after reweighting, the converged χ² was actually higher (138), so the DVF of eq 24 is preferred. (This behavior is a specific consequence of the reweighting and never occurs in ordinary LLS, where deletion of a coefficient having SE smaller than its magnitude always leads to a rise in the reduced χ².51) The coefficient A = 2.15 is an SD, so its uncertainty translates into ν = 52, for which the 1-sided 5% t value is ∼2% larger than the 5% z value; this in spite of the ∼720 dof in the data set. More points at small ⟨y⟩ would have given more precise A and larger ν. The effect on xC and xD for the different data sets in ref 35 varies somewhat because of the varying intercepts for those data. Interestingly, the effective ν for the constant (A) in eq 24 is almost identical to the value obtained using ν = n − p − 2 by Zorn et al. in analyzing the data behind Figure 2 and eq 24 to obtain detection limits and alternative minimum levels.35,36 These authors estimated the SDF for each data set (usually 56 points) using unweighted fitting to the quadratic extension of eq 5a, taken as a function of x. Calculations like those shown in Figure 4 indicate an effective ν for the intercept cs0 of ∼8, when the data are properly weighted. For unweighted fitting, MC simulations confirm that the intercept RSD exceeds 1, leading to ν ≈ 1 and much larger limits than those given in ref 35. On the other hand, analysis of these data using the DVF of eq 24 gives xD values consistently smaller than those given in refs 35 and 36 (see the Supporting Information), a clear triumph for DVF estimation in calibration.

A further point about the calibration fitting of the ref 35 data is that the 2-parameter linear RF is statistically justified for only 5 of the 16 data sets, with a 3-parameter quadratic response called for in 8 cases and a 1-parameter linear (intercept = 0) in 3 (see the Supporting Information). The equations given earlier make numerical assessment of xC and xD straightforward in all these cases. With a little practice using such numerical methods, analysts should find them a pleasing relief from the mind-numbing formulas that have dominated the literature on detection and quantification limits, formulas which in fact rarely cover response functions beyond the linear one of eq 1.65,66
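To make the successive-approximation recipes concrete, the sketch below (Python, standing in for the KaleidaGraph/Excel implementations described in the Supporting Information) iterates eq 21 for the homoscedastic Figure 3 model and, using the full-precision one-sided 5% normal factor, reproduces the sequence 2.03844 → … → 1.97796 quoted earlier. The second part solves the nonlinear analogue, eq 23b, by simple bisection; the saturating response function and its SD value there are hypothetical, chosen only to illustrate the bracketing idea.

```python
import numpy as np

def sigma_x0(x0, x, sigma_y=1.0, b=2.0, m=1):
    """eq 2 prediction SE of x0 for a linear fit with known data error."""
    x = np.asarray(x, float)
    xbar = x.mean()
    return (sigma_y / b) * np.sqrt(
        1.0/m + 1.0/len(x) + (x0 - xbar)**2 / ((x - xbar)**2).sum())

zp = zq = 1.644854                     # one-sided 5% normal coverage factor
xcal = [1.1, 1.1, 5.5, 12.0, 12.0]     # Figure 3 design (ends doubled)
xC = zp * sigma_x0(0.0, xcal)          # eq 20
xD = 0.0                               # first cycle gives xD2 = 2*xC
for _ in range(8):                     # eq 21 by successive approximation
    xD = xC + zq * sigma_x0(xD, xcal)

# Nonlinear RFs (eq 23b) yield to the same idea; here a bisection on a
# hypothetical saturating response f(x) = c*x/(1 + d*x) with an assumed SD:
def f(x, c=2.0, d=0.05):
    return c * x / (1 + d * x)

SD = 2.0                               # illustrative net response at the LOD
lo, hi = 0.0, 100.0
for _ in range(60):                    # solve f(xD_nl) - f(0) = SD
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if f(mid) - f(0.0) < SD else (lo, mid)

print(round(2*xC, 5), round(xD, 5), round(0.5*(lo + hi), 5))
```

The fixed-point iteration converges rapidly because the contraction factor zq·dσx0/dx is small near xD; for poorly conditioned cases, the bisection used for the nonlinear RF works for the linear one as well.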



ASSOCIATED CONTENT

S Supporting Information *

The Supporting Information is available free of charge on the ACS Publications website at DOI: 10.1021/acs.analchem.9b00119. Analysis of calibration data to obtain data variance functions (DVFs), limits of detection and quantification, estimates of unknowns and their standard errors (SEs), and methods for carrying out the computations discussed in the paper (PDF) Calib-ISO-Ex2.xls, Excel worksheet analysis of example shown in Figure S-3; ACFigure-S8.qpc, KaleidaGraph plot file for example in Figure S-8 (w/ data); VFdata8721

DOI: 10.1021/acs.analchem.9b00119 Anal. Chem. 2019, 91, 8715−8722

Feature

Analytical Chemistry



ASSOCIATED CONTENT

Supporting Information

Zorn.txt, variance estimates from data in ref 35, used to estimate the DVF; CalibData-Zorn.txt, data used to produce Table S-1 (ZIP).

AUTHOR INFORMATION

Corresponding Author

*E-mail: [email protected]. Phone: (615) 322-4873. Fax: (615) 343-1234.

ORCID

Joel Tellinghuisen: 0000-0002-4487-5197

Notes

The author declares no competing financial interest.

Biography

Joel Tellinghuisen is Professor of Chemistry, Emeritus, at Vanderbilt University. He cut his data analysis teeth on problems in optical spectroscopy. More recently he has emphasized problems in other areas, including isothermal titration calorimetry (ITC) and analytical applications of the polymerase chain reaction (PCR).

ACKNOWLEDGMENTS

I have profited greatly from frequent exchanges with Ed Voigtman on matters concerning detection limits. I cannot thank him enough.

REFERENCES

(1) Raposo, F. TrAC, Trends Anal. Chem. 2016, 77, 167−185. Table 1 of this work conveniently summarizes 14 such guideline documents, from 1993 to 2014.
(2) Kaiser, H. Spectrochim. Acta 1947, 3, 40−67.
(3) Currie, L. A. Anal. Chem. 1968, 40, 586−593.
(4) Hubaux, A.; Vos, G. Anal. Chem. 1970, 42, 849−855.
(5) Thompson, M. Analyst 1998, 123, 405−407.
(6) Currie, L. A. Anal. Chim. Acta 1999, 391, 127−134.
(7) Belter, M.; Sajnog, A.; Baralkiewicz, D. Talanta 2014, 129, 606−616.
(8) Voigtman, E. Spectrochim. Acta, Part B 2008, 63, 129−141.
(9) Voigtman, E. Spectrochim. Acta, Part B 2008, 63, 142−153.
(10) Voigtman, E. Limits of Detection in Chemical Analysis; Wiley: Hoboken, NJ, 2017.
(11) Shukla, G. K. Technometrics 1972, 14, 547−553.
(12) Tellinghuisen, J. Fresenius' J. Anal. Chem. 2000, 368, 585−588.
(13) International Organization for Standardization (ISO). Capability of Detection, Part 2: Methodology in the Linear Calibration Case; ISO 11843-2, 2000.
(14) Tellinghuisen, J. J. Chem. Educ. 2000, 77, 1233−1239.
(15) Nagaraja, N. V.; Paliwal, J. K.; Gupta, R. C. J. Pharm. Biomed. Anal. 1999, 20, 433−438.
(16) Almeida, A. M.; Castel-Branco, M. M.; Falcão, A. C. J. Chromatogr. B: Anal. Technol. Biomed. Life Sci. 2002, 774, 215−222.
(17) Sadray, S.; Rezaee, S.; Rezakhah, S. J. Chromatogr. B: Anal. Technol. Biomed. Life Sci. 2003, 787, 293−302.
(18) Kiser, M. K.; Dolan, J. W. LC-GC N. Am. 2004, 22, 112−115.
(19) Burns, M. J.; Valdivia, H.; Harris, N. Anal. Bioanal. Chem. 2004, 378, 1616−1623.
(20) De Beer, J. O.; De Beer, T. R.; Goeyens, L. Anal. Chim. Acta 2007, 584, 57−65.
(21) Steliopoulos, P.; Stickel, E. Anal. Chim. Acta 2007, 592, 181.
(22) Lavagnini, I.; Magno, F. Mass Spectrom. Rev. 2007, 26, 1−18.
(23) Jain, R. B. Clin. Chim. Acta 2010, 411, 270−279.
(24) Burrows, J. Bioanalysis 2013, 5, 959−978.
(25) Gu, H.; Liu, G.; Wang, J.; Aubry, A.-F.; Arnold, M. E. Anal. Chem. 2014, 86, 8959−8966.
(26) Desharnais, B.; Camirand-Lemyre, F.; Mireault, P.; Skinner, C. D. J. Anal. Toxicol. 2017, 41, 261−268.
(27) Desharnais, B.; Camirand-Lemyre, F.; Mireault, P.; Skinner, C. D. J. Anal. Toxicol. 2017, 41, 269−276.
(28) Tellinghuisen, J. J. Chromatogr. B: Anal. Technol. Biomed. Life Sci. 2008, 872, 162−166.
(29) Jacquez, J. A.; Norusis, M. Biometrics 1973, 29, 771−779.
(30) Rothman, L. D.; Crouch, S. R.; Ingle, J. D., Jr. Anal. Chem. 1975, 47, 1226−1233.
(31) Tellinghuisen, J. Appl. Spectrosc. 2000, 54, 431−437. Changes in the optical properties of the solvent would affect such a calibration.
(32) Tellinghuisen, J. Analyst 2008, 133, 161−166.
(33) Schwartz, L. M. Anal. Chem. 1979, 51, 723−727.
(34) Garden, J. S.; Mitchell, D. G.; Mills, W. N. Anal. Chem. 1980, 52, 2310−2315.
(35) Zorn, M. E.; Gibbons, R. D.; Sonzogni, W. C. Anal. Chem. 1997, 69, 3069−3075.
(36) Zorn, M. E.; Gibbons, R. D.; Sonzogni, W. C. Environ. Sci. Technol. 1999, 33, 2291−2295.
(37) Wilson, M. D.; Rocke, D. M.; Durbin, B.; Kahn, H. D. Anal. Chim. Acta 2004, 509, 197−208.
(38) Desimoni, E.; Brunetti, B. Anal. Chim. Acta 2009, 655, 30−37.
(39) Jimenez-Chacon, J.; Alvarez-Prieto, M. Accredit. Qual. Assur. 2010, 15, 19−28.
(40) Tellinghuisen, J. Anal. Biochem. 2005, 343, 106−115.
(41) Zeng, Q. C.; Zhang, E.; Tellinghuisen, J. Analyst 2008, 133, 1649−1655.
(42) Zeng, Q. C.; Zhang, E.; Dong, H.; Tellinghuisen, J. J. Chromatogr. A 2008, 1206, 147−152.
(43) Tellinghuisen, J. Chemom. Intell. Lab. Syst. 2009, 99, 138−149.
(44) Tellinghuisen, J.; Bolster, C. Environ. Sci. Technol. 2010, 44, 5029−5034.
(45) Thompson, M.; Coles, B. J. Accredit. Qual. Assur. 2011, 16, 13−19.
(46) Sadler, W. A. Ann. Clin. Biochem. 2016, 53, 141−149.
(47) Noblitt, S. D.; Berg, K. E.; Cate, D. M.; Henry, C. S. Anal. Chim. Acta 2016, 915, 64−73.
(48) Baumann, K.; Wätzig, H. J. Chromatogr. B 1995, 700, 9−20.
(49) Tellinghuisen, J. J. Chem. Educ. 2005, 82, 157−166.
(50) Bevington, P. R. Data Reduction and Error Analysis for the Physical Sciences; McGraw-Hill: New York, 1969.
(51) Press, W. H.; Flannery, B. P.; Teukolsky, S. A.; Vetterling, W. T. Numerical Recipes; Cambridge University Press: Cambridge, U.K., 1986.
(52) Tellinghuisen, J. Biochim. Biophys. Acta, Gen. Subj. 2018, 1862, 886−894.
(53) Tellinghuisen, J. Analyst 2007, 132, 536−543.
(54) Tellinghuisen, J. J. Phys. Chem. A 2001, 105, 3917−3921.
(55) Tellinghuisen, J. Analyst 2000, 125, 1045−1048.
(56) Tellinghuisen, J. Analyst 2005, 130, 370−378.
(57) Currie, L. A. Pure Appl. Chem. 1995, 67, 1699−1723.
(58) Currie, L. A. Chemom. Intell. Lab. Syst. 1997, 37, 151−181.
(59) Tellinghuisen, J. Methods Enzymol. 2009, 454, 259−285.
(60) Tellinghuisen, J. J. Phys. Chem. A 2000, 104, 2834−2844.
(61) Francois, N.; Govaerts, B.; Boulanger, B. Chemom. Intell. Lab. Syst. 2004, 74, 283−292.
(62) Badocco, D.; Lavagnini, I.; Mondin, A.; Pastore, P. Spectrochim. Acta, Part B 2014, 96, 8−11.
(63) Tellinghuisen, J. Chemom. Intell. Lab. Syst. 2010, 103, 160−169.
(64) Tellinghuisen, J. J. Chem. Educ. 2018, 95, 970−977.
(65) Lavagnini, I.; Magno, F. Mass Spectrom. Rev. 2007, 26, 1−18.
(66) Holstein, C. A.; Griffin, M.; Hong, J.; Sampson, P. D. Anal. Chem. 2015, 87, 9795−9801.

DOI: 10.1021/acs.analchem.9b00119 Anal. Chem. 2019, 91, 8715−8722