Anal. Chem. 2006, 78, 8113-8120
Uncertainty-Based Internal Quality Control. Harmonization Considerations
E. Bonet-Domingo,† L. Escuder-Gilabert, M. J. Medina-Hernández, and S. Sagrado*
Departamento de Química Analítica, Universitat de València, C/ Vicente Andrés Estellés s/n, E-46100, Burjassot, Valencia, Spain
Three main quality aspects for analytical laboratories are internal method validation, internal quality control (IQC), and sample result uncertainty. Unfortunately, in the past they have been used in a nonharmonized way. The most universal IQC tool is the mean chart, but some criteria used to fix its control limits do not fit the real nature of analytical results. A new approach for fixing these limits is proposed (the u-approach). The key is the combined uncertainty, u, obtained from the method validation information, which is also used for estimating the sample result uncertainty. A comparative study on “in-control” simulated, bibliographic, and real laboratory data suggests that the u-approach is more reliable than other well-established criteria. In addition, the u-approach mean chart emerges as an IQC tool, consistent with chemical assays, which harmonizes the validation-control-uncertainty process.

Among others, three main quality aspects for analytical laboratories are (internal) method validation, sample result uncertainty, and (internal) quality control (IQC).1,2 Method validation should lead to a declaration that the method is “fit for purpose”.3 Uncertainty (i.e., the expanded uncertainty of a result, U) should define an interval around the sample result, Res ± U, where it is probable to find the true value with a given confidence level.4 When a method has been validated using adequate references, it should produce consistent results.5 However, the fit-for-purpose status must be extended over time by checking whether the validation characteristics are maintained, i.e., whether the method is in statistical control (“in-control”). For instance, the ISO 17025 standard1 specifies quality assurance strategies, among others, the regular use of reference materials or IQC.

* To whom correspondence should be addressed. E-mail:
[email protected]. Tel.: +34 96 354 4878. Fax: +34 96 354 4953.
† Current address: General de Análisis Materiales y Servicios (GAMASER) S.L., Valencia, Spain.
(1) ISO/IEC 17025. General Requirements for the Competence of Testing and Calibration Laboratories; ISO: Geneva, 1999.
(2) Sagrado, S.; Bonet, E.; Medina, M. J.; Martín, Y. Manual Práctico de Calidad en los Laboratorios. Enfoque ISO 17025, 2nd ed.; Ediciones AENOR: Madrid, 2005.
(3) EURACHEM/CITAC Guide. The Fitness for Purpose of Analytical Methods: A Laboratory Guide to Method Validation and Related Topics; 1998.
(4) Hund, E.; Massart, D. L.; Smeyers-Verbeke, J. Trends Anal. Chem. 2001, 8, 394-406.
(5) EURACHEM/CITAC Guide, Traceability in Chemical Measurement; 2003.
10.1021/ac0611216 CCC: $33.50 Published on Web 11/03/2006
© 2006 American Chemical Society
The most used IQC tools are control charts, mainly the control chart for the mean (also known as X-bar, Shewhart, or simply mean chart).6 The critical aspects of the mean chart have been thoroughly reviewed; see refs 6 and 7 and references therein. Control charts were developed to control production processes (not measurement processes). However, analysts have considered the analytical system as a process where the products are the analytical results.7 Therefore, laboratories have employed classical control criteria to develop their internal mean charts. However, Mullins6 has warned the analytical community that classical control limits adapted from engineering books (the main source of such information) are inadequate for controlling analytical methods. Specifically, the interval between the upper and lower control limits obtained from replicate assays on a control material over a period of time with a given method could be too narrow, since it takes into account only the repeatability contribution. Thus, the false alarm probability (declaring that the method is out of control when in fact it is in control) becomes too large. As a result, quality managers tend to avoid using such control charts.

In the past, quality aspects such as internal method validation, method control, and result uncertainty have been treated separately. However, recent approaches to harmonize the internal validation-uncertainty process have been reported.4,8 For instance, the latest approach to estimating result uncertainty uses the information obtained during the internal validation of the method under intermediate precision conditions (adjusting the day-analyst-equipment experimental setup to that used in routine).8 However, there is a lack of harmonization of the validation-control process or, what would be even more interesting, of the validation-control-uncertainty one.
In this paper, we propose a new IQC criterion (the u-approach for the mean chart) consistent with the nature of chemical analysis. This criterion involves adjusting the experimental effort of the quality tasks and allows harmonizing the validation-control-uncertainty process. The proposed criterion is compared with previous criteria used for the mean chart, in terms of statistical consistency and practical decision-making consequences.
(6) Mullins, E. Analyst 1999, 124, 433-442.
(7) Mullins, E. Statistics for the Quality Control Chemistry Laboratory; Royal Society of Chemistry: Cambridge, 2003.
(8) Maroto, A.; Riu, J.; Boqué, R.; Rius, F. X. Anal. Chim. Acta 1999, 391, 173-185.
Analytical Chemistry, Vol. 78, No. 23, December 1, 2006 8113
THEORETICAL CALCULATIONS
An optimized experimental design for assessing accuracy (bias and intermediate precision) during the method validation stage consists of performing Nr replicates along Ns series (involving different days and/or analysts and/or equipment, etc., usually referred to as “runs”). Such setups can be organized as a two-factor, fully nested design taking into account the between-run variations and replicates.8 This approach analyzes the X (Nr × Ns) data matrix by means of ANOVA, obtaining the residual mean square (MSr) and the between-run mean square (MSrun), from which the precision estimates, namely the repeatability standard deviation (sr), the between-run standard deviation (srun), the intermediate precision standard deviation (si), and the standard deviation of the run mean (smean), can be calculated8
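$$s_r^2 = MS_r \tag{1}$$
$$s_{\mathrm{run}}^2 = \frac{MS_{\mathrm{run}} - MS_r}{N_r} \tag{2}$$
$$s_i^2 = s_{\mathrm{run}}^2 + s_r^2 \tag{3}$$
$$s_{\mathrm{mean}}^2 = \frac{MS_{\mathrm{run}}}{N_r} \tag{4}$$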
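As an illustration of eqs 1-4, the following minimal sketch computes the precision estimates from a balanced validation matrix with replicates in rows and runs in columns. The function name, the dictionary keys, and the truncation of a negative srun² estimate at zero are assumptions made here for illustration; they are not taken from the paper.

```python
import numpy as np

def precision_estimates(x):
    """Eqs 1-4 for a balanced (Nr x Ns) matrix: rows = replicates, columns = runs."""
    x = np.asarray(x, dtype=float)
    n_r, n_s = x.shape
    run_means = x.mean(axis=0)
    grand_mean = x.mean()
    # Residual (within-run) mean square, with Ns * (Nr - 1) degrees of freedom
    ms_r = ((x - run_means) ** 2).sum() / (n_s * (n_r - 1))
    # Between-run mean square, with Ns - 1 degrees of freedom
    ms_run = n_r * ((run_means - grand_mean) ** 2).sum() / (n_s - 1)
    s_r2 = ms_r                                # eq 1
    # eq 2; a negative estimate is set to zero (assumption of this sketch)
    s_run2 = max((ms_run - ms_r) / n_r, 0.0)
    s_i2 = s_run2 + s_r2                       # eq 3
    s_mean2 = ms_run / n_r                     # eq 4
    return {"s_r2": s_r2, "s_run2": s_run2, "s_i2": s_i2, "s_mean2": s_mean2}

# Toy data (hypothetical): 2 replicates in each of 2 runs
est = precision_estimates([[1.0, 3.0], [3.0, 5.0]])
```

For a balanced design, smean² from eq 4 equals the sample variance of the run means, which offers a quick consistency check on the ANOVA arithmetic.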
Harmonizing the Internal Validation-Uncertainty Process. To assess bias, a reference is necessary, i.e., certified reference material (CRM), spiked blank or material, reference method, etc. References should provide an accepted true value µref for the analyte and its associated uncertainty, uref. Different operational definitions of uncertainty have been described depending on the available reference.4 For instance, if a CRM is used in the method validation stage, the contribution of bias and traceability to CRM can be joined to the repeatability and run effects, to calculate the “absolute uncertainty”.4 Assuming that analytical sample results are obtained from Nrs replicates in a single run, the combined standard uncertainty, u, can be estimated as2,4,8
$$u = \sqrt{\left(s_{\mathrm{run}}^2 + \frac{s_r^2}{N_{rs}}\right) + \left(\frac{s_{\mathrm{mean}}^2}{N_s} + u_{\mathrm{ref}}^2\right)} \tag{5}$$
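Eq 5 can be sketched as a small function, assuming that the variance components have already been estimated from the validation data. The function and argument names, the split into a "precision" and a "bias" group, and the numerical values in the usage lines are hypothetical and chosen only so that the arithmetic is easy to follow.

```python
import math

def combined_uncertainty(s_run2, s_r2, s_mean2, n_s, n_rs, u_ref=0.0):
    """Eq 5: combined standard uncertainty of a sample result obtained
    as the mean of n_rs replicates in a single run."""
    precision_part = s_run2 + s_r2 / n_rs    # run effect + repeatability of the result
    bias_part = s_mean2 / n_s + u_ref ** 2   # uncertainty of the bias assessment + reference
    return math.sqrt(precision_part + bias_part)

def expanded_uncertainty(u, k=2.0):
    """U = k * u, with coverage factor k."""
    return k * u

# Hypothetical variance components: 4 + 8/2 + 20/5 + 2**2 = 16, so u = 4
u = combined_uncertainty(s_run2=4.0, s_r2=8.0, s_mean2=20.0, n_s=5, n_rs=2, u_ref=2.0)
U = expanded_uncertainty(u)  # k = 2
```

Passing the number of routine replicates separately from the validation design keeps the repeatability term tied to how the samples are actually measured.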
It should be noted that Nrs (replicates used for samples in the routine analysis stage) has to be used instead of Nr (replicates corresponding to the validation stage). After that, the expanded uncertainty can be obtained using a coverage factor, k, as U = ku. For k = 2 (a common option), U is roughly equivalent to half the length of a 95% confidence interval.4 This approach evidences the harmonization of the method validation-sample result uncertainty process.

Statistical Limits for the Mean Control Chart. Traditionally, some preliminary work (“training stage” or “phase I”) must be undertaken to establish reliable estimates of the process mean and standard deviation. Ideally, replicate results of the control material must be obtained in separate runs (batches, days), generating a phase I-control data matrix, XI. From it, the center line, CL (estimating the process mean), and S (estimating the process standard deviation) can be estimated. CL is normally set as the overall average of the data, after removing or downweighting “assignable cause” outliers.2,9 In contrast, the estimation of S still deserves more attention.6 Classical approaches are based on averaging repeatability statistics, such as the standard error of the mean (SEM), the range, or the standard deviation (s), of the Ns vectors (the so-called subgroups)9
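$$S = \sqrt{\frac{\sum_{j=1}^{N_s} \left(SEM_j\right)^2}{N_s}} \tag{6}$$
$$S = \frac{\left(\sum_{j=1}^{N_s} Range_j\right)/N_s}{d_2 \times \sqrt{N_r}} \tag{7}$$
$$S = \frac{\left(\sum_{j=1}^{N_s} s_j\right)/N_s}{c_4 \times \sqrt{N_r}} \tag{8}$$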
where j refers to the jth run and c4 and d2 (the Hartley’s constant, also known as dn) can be found in many textbooks.9 The RSC Analytical Methods Committee10 has proposed to calculate S from Nelson’s suggestion:11
$$S = \frac{\left(\sum_{j=1}^{N_s - 1} MR_j\right)/(N_s - 1)}{1.128} \tag{9}$$
where MR is the moving range (MRj = |Meanj+1 - Meanj|). Once S is estimated, control limits (upper, UCL, and lower, LCL) can be obtained following the “three σ limits” approach7 (UCL = CL + 3S and LCL = CL - 3S). UCL and LCL are called action limits. Other limits can be useful, for instance, the so-called warning limits (WCL = CL ± 2S). Once CL and the control limits have been set, the monitoring process (“control stage” or “phase II”) can start. New assay results of the control material can be plotted on the mean chart and checked by applying some rules to decide on the method status (in control or out of control) over time. If a method is in statistical control, the warning lines include ∼95% of all values and the action lines ∼99.7%.2,9
(9) Massart, D. L.; Vandeginste, B. M. G.; Buydens, L. M. C.; De Jong, S.; Lewi, P. J.; Smeyers-Verbeke, J. Handbook of Chemometrics and Qualimetrics; Elsevier: Amsterdam, 1997; part A.
(10) RSC Analytical Methods Committee. AMC Technical Briefs No. 12, March 2003. Available from: http://www.rsc.org/lap/rsccom/amc/amc_techbriefs.htm.
(11) Nelson, L. S. J. Qual. Technol. 1982, 14, 172-173.

Harmonizing the Validation-Control-Uncertainty Process. We propose an approach to estimate S that allows harmonizing the validation-control-uncertainty process. Some changes have to be made with respect to classical approaches: (i) The phase I-control data (XI matrix) should be avoided by using the validation data (X matrix). (ii) If a CRM is used for assessing accuracy, the same CRM must be used as control material. (iii) The same between-run variables (say day, analyst, equipment) used during routine sample analyses must be accounted for in the validation and phase II-control stages. (iv) S has to be estimated from uncertainty information, which comes from validation data. The last item can be understood as follows. During the phase II-control stage, over the routine analysis period, the CRM (used as control material; where µref = µCRM and uref = uCRM) is analyzed as a sample. If we assume that the 95% confidence interval for a sample result is Res ± U (U = 2u; for k = 2), we can also assume it for the result of the CRM. On the other hand, we assume that when the method is in control (and the process errors are normally distributed), the warning lines (µCRM ± 2S interval) include 95% of the control results. Therefore, 2u ∼ 2S, and then, S could be estimated from the combined uncertainty, u. So, rewriting eq 5 and assuming that each CRM control result corresponds to the mean of Nrc replicates in a single run:
$$S = u = \sqrt{s_{\mathrm{run}}^2 + \frac{s_r^2}{N_{rc}} + \frac{s_{\mathrm{mean}}^2}{N_s} + u_{\mathrm{CRM}}^2} \tag{10}$$
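To make the link between S and the chart limits concrete, the sketch below evaluates S by the moving-range estimate (eq 9) and by the u-approach (eq 10), and then derives warning (CL ± 2S) and action (CL ± 3S) limits. The function names, the dictionary keys (LWL/UWL for the lower and upper warning limits), and all numerical inputs are hypothetical; only the formulas come from the text.

```python
import numpy as np

def s_moving_range(run_means):
    """Eq 9: mean moving range of consecutive run means divided by 1.128."""
    m = np.asarray(run_means, dtype=float)
    moving_ranges = np.abs(np.diff(m))   # Ns - 1 values |Mean_{j+1} - Mean_j|
    return float(moving_ranges.mean() / 1.128)

def s_u_approach(s_run2, s_r2, s_mean2, n_s, n_rc, u_crm=0.0):
    """Eq 10: S set equal to the combined uncertainty u, for control
    results that are means of n_rc replicates in a single run."""
    return float(np.sqrt(s_run2 + s_r2 / n_rc + s_mean2 / n_s + u_crm ** 2))

def control_limits(cl, s):
    """Warning (CL +/- 2S) and action (CL +/- 3S) limits of the mean chart."""
    return {"LCL": cl - 3 * s, "LWL": cl - 2 * s, "CL": cl,
            "UWL": cl + 2 * s, "UCL": cl + 3 * s}

# Hypothetical inputs, for illustration only
s_mr = s_moving_range([10.0, 12.0, 11.0, 13.0])
s_u = s_u_approach(s_run2=4.0, s_r2=8.0, s_mean2=20.0, n_s=5, n_rc=2, u_crm=2.0)
limits = control_limits(cl=50.0, s=s_u)
```

Because eq 10 takes the number of control replicates as an argument, the limits can be recomputed for a different phase II replicate number without repeating the validation experiment.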
As before, Nrc (replicates used for the phase II-control stage) has to be used instead of Nr. The term uCRM² in eq 10 agrees with the definition of absolute uncertainty;4 it is the uncertainty related to the certified reference material, as specified on the certificate (“class B uncertainty”). This approach, which we have named the u-approach, is now consistent with the method validation information but also with the sample result uncertainty.

EXPERIMENTAL SECTION
Simulation Study. In order to statistically compare the proposed u-approach (eq 10) with other previously proposed approaches (eqs 6-9), a simulation study was performed. This guarantees that the assumptions on the data quality are satisfied. An algorithm was developed for simulating validation and control laboratory results, assuming that they come from a fit-for-purpose/in-statistical-control method. The first step was to generate the validation X (Nr × Ns) data set. The model proposed by Kuttatharmmakul et al.12 was followed. It assumes normally distributed validation results, consistent with the accepted true values (µ0, E0, RSDr0, and RSDrun0; the zero in the subscripts is used to differentiate the true values from the estimated ones) and uses the RANDN function in MATLAB (The Math Works, 1992):13
$$x_{ij} = \mu_0\left(1 + \frac{E_0}{100}\right) + f_j + e_{ij} \tag{11}$$
where xij is the result for the ith replicate of the jth condition (i.e., day, analyst, and equipment), µ0 represents the accepted reference value (of the true analyte content), E0 represents the true bias of the method as a percentage, and fj is the random run effect for the jth condition (fj ∼ N(0, σrun²), where σrun corresponds to RSDrun0); eij is the random error under repeatability conditions for the ith replicate and jth condition (eij ∼ N(0, σr²), where σr corresponds to RSDr0). In order to compare the different approaches in terms of false alarm probabilities, we have simulated the situation µ0 = 10 ± 0 (uCRM² = 0 in eq 10) and E0 = 0 (unbiased method). From the X (Nr × Ns) matrix, validation results, but also S values (eqs 6-10) for control purposes, were obtained. The second step was to generate the phase II-control results, Xc (Nrc × Nsc), under the same conditions as X. Using them, the false alarm probabilities according to rules 1 (P1%) and 2 (P2%) were calculated and compared with the expected values (α-error probabilities ∼0.3 and 0.1%, respectively).7
(12) Kuttatharmmakul, S.; Massart, D. L.; Smeyers-Verbeke, J. Chemom. Intell. Lab. Syst. 2000, 52, 61-73.
(13) Matlab reference guide; The Math Works: Natick, MA, 1992; pp 402-403.
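The two simulation steps can be sketched as follows. This is not the authors' MATLAB algorithm: it uses NumPy's random generator, it assumes that the RSD values are percentages of µ0, and it takes rule 1 to mean "a control mean outside the CL ± 3S action limits", which is consistent with the expected ∼0.3% but is not defined in the text above. All function names and parameter values are hypothetical.

```python
import numpy as np

def simulate_runs(rng, n_rep, n_runs, mu0=10.0, e0=0.0, rsd_r0=1.0, rsd_run0=1.0):
    """Eq 11: x_ij = mu0 * (1 + E0/100) + f_j + e_ij, as an (n_rep x n_runs) matrix.
    RSD values are taken as percentages of mu0 (assumption of this sketch)."""
    sigma_r = rsd_r0 * mu0 / 100.0
    sigma_run = rsd_run0 * mu0 / 100.0
    f = rng.normal(0.0, sigma_run, size=n_runs)          # run effects f_j
    e = rng.normal(0.0, sigma_r, size=(n_rep, n_runs))   # repeatability errors e_ij
    return mu0 * (1.0 + e0 / 100.0) + f + e              # f_j is shared within a run

def u_approach_s(x, n_rc, u_crm=0.0):
    """Eq 10, with the variance components taken from the ANOVA of eqs 1-4."""
    n_r, n_s = x.shape
    run_means = x.mean(axis=0)
    ms_r = ((x - run_means) ** 2).sum() / (n_s * (n_r - 1))
    ms_run = n_r * run_means.var(ddof=1)
    s_run2 = max((ms_run - ms_r) / n_r, 0.0)  # negative estimate set to zero (assumption)
    return float(np.sqrt(s_run2 + ms_r / n_rc + (ms_run / n_r) / n_s + u_crm ** 2))

def rule1_false_alarm_percent(control_means, cl, s):
    """Percentage of control means outside CL +/- 3S (assumed meaning of rule 1)."""
    return float(100.0 * np.mean(np.abs(control_means - cl) > 3.0 * s))

rng = np.random.default_rng(0)
x_val = simulate_runs(rng, n_rep=4, n_runs=25)          # validation stage, X
s_u = u_approach_s(x_val, n_rc=2)                       # S from the u-approach
x_ctrl = simulate_runs(rng, n_rep=2, n_runs=10_000)     # phase II-control stage, Xc
p1 = rule1_false_alarm_percent(x_ctrl.mean(axis=0), cl=10.0, s=s_u)
```

A single validation set gives a single estimate of S and therefore a single P1% value; repeating the whole procedure many times would be needed to study how P1% scatters around its expected value.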
Bibliographic Data. As an example to illustrate the different approaches and strategies, we have used a data table available from Miller and Miller.14 The X4×25 (replicates × days) data matrix corresponds to an IQC standard solution of 50 mg·kg⁻¹. This “target value” is used as CL instead of the run mean (more usual under the IQC theory).

Real Laboratory Data. Two experimental data sets were considered in this study. Analyses were performed under routine conditions in a laboratory (GAMASER S.L., Valencia, Spain) having ISO 17025 (ref 1) accreditation and ISO 14001 (ref 15) certification (so appropriate residues management is performed). The first data set comes from a hemodialysis water sample (high-quality water purified to be adequate for hemodialysis purposes; low organic matter and nitrate ion,
0) or inconvenient (phase II-P1% or -P2% > 5%) results appear in parentheses.
to be taken with caution, especially in the case of RSDrun (Figure 3b). Globally, the u-approach shows P1% and P2% values closer to the expected ones, and also more precise, than the MR-approach. This fact is particularly evident in the case of high RSDr/RSDrun ratios (Figure 3c-f). Therefore, a more robust behavior for u-approach-based mean control plots can be expected. The more imprecise results for MR-based control plots could be explained in terms of the smaller amount of information behind this statistic (only Ns - 1 MR pairs) with respect to the u-approach, which comes from ANOVA. As expected, the rest of the classical approaches (eqs 6-8) exhibited inadequate results (not shown) except for the RSDrun ∼ 0 case, where results similar to those from the MR-approach were observed.

Comparing Strategies for the Bibliographic Data. Figure 4 shows some different phase-I IQC results, corresponding to the X4×25 data matrix (validation data) from Miller and Miller,14 as a function of the approach and strategy. Since RSDr > RSDrun the
Figure 5. Routine control results of the conductivity method. Comparison of all the approaches studied (named as in Table 1; u and u-all refer to including or not, respectively, the uncertainty of the CRM) in terms of false alarm probabilities for rules 1 (P1%) and 2 (P2%), represented by vertical bars. Horizontal dashed lines refer to expected probability values. From the available data, eight different subsets were reorganized to form X3×25 matrices for validation (subset-1, period 1-25; subset-2, period 26-50; and so on), with the rest of the data as the Xc (Nrc = 3) control matrix. The x-axis number corresponds to the subset number.
Range-approach could show results as consistent as the MR or u ones. This can be observed in Figure 4a2,b2,c2 when the CL = run mean strategy was used, suggesting that the method is stable according to classical IQC theory. However, this conclusion disagrees with that expressed in ref 14 based on CL = 50 (reference value), where a nonconformity is pointed out when using the Range-approach (here, shown in Figure 4a1). The MR-approach shows two means violating rule 2 (Figure 4b1), while the u-approach does not show any problem associated with the validation stage (Figure 4c1). This example points out that the decision criteria in IQC still deserve more attention from the quality organizations. However, it also illustrates that the u-approach, which generates slightly larger S values than the approaches based on Range or MR, could offer equivalent conclusions when interchanging the CL = run mean and CL = µCRM strategies when bias is relatively low, as in the present example (E = 2.5%). Note that S(u-approach) = u = 1.63 (used in Figure 4c1,2) was estimated for the Nrc = Nr = 4 case. If Nrc = 2 is fixed for phase II-control, S is even larger (1.95) according to eq 10.

Controlling Real Data. The validation-control information corresponding to the method for determining nitrate ion is given in Table 3. In this example the CL = µCRM strategy was followed. The results correspond to Nr = 4 replicates under repeatability conditions over several hours (here used as runs) per day during different days, resulting in 96 available runs. The data were organized in three subsets. Subset-1 (X4×25 data) represents the method validation stage. According to Table 3, RSDrun > RSDr, and so the classical approaches (eqs 6-8) generate inadequate control plots. Both the MR- and u-approaches provide a conforming phase I (on the X4×25 validation data) and in-control phase II results (on the rest of the data, reduced to
duplicate results, as Xc2×71). The analysis of the uncertainty components of eq 10 indicates that the main contributing factor is srun² (∼6), with an almost negligible contribution of the uCRM² (= 0.16) term. This can explain the similar S estimates in both approaches and the equivalent results found. In contrast to simulation, when working with real data generated by methods working in routine conditions, there is no a priori guarantee on the quality of the results. Therefore, it is difficult to compare the merits of the mean chart approaches beyond observing the decision-making aspects involved in using them. Examining the subset-1 results (using the 1-25 period for validating the method), it should be concluded that both the MR- and u-approaches provide evidence that the method is in control during the whole control period (26-96) studied, even with an estimated bias (E = 4.1%) affecting the control plot. This means that, a priori, interchanging the validation and control data would not produce drastic changes in the results. To verify this, subsets-2 (validation period 26-50) and -3 (validation period 51-75) were considered instead of subset-1 to perform the validation-control stages. The results are shown in Table 3. Interchanging the subsets changes the validation results, although probably without affecting the decisions on method validity. However, the decrease observed in the RSDrun estimates in relation to the RSDr estimates is evident. This causes a larger decrease in the S estimates from the MR-approach than from the u-approach, which has a remarkable impact on the control results. In consequence, in both subsets-2 and -3, one failure is detected in the phase I-control using the MR-approach, which should invalidate the control limits used, and an inconvenient number of alarms (say >5%) is also produced in the phase II-control stage. In contrast, using the u-approach S estimates has no consequences on the acceptability of the control limits in phase I-control in subsets 2 and 3 and provides tolerable results in subset-2 (although the differences are evident, particularly between subset-3 and subset-1 data).

The second data set comes from results of the conductivity method, re-expressed as Rec% (with respect to the nominal values) for three conductivity levels used as pseudoreplicates (Nr = Nrc = 3). So, they reflect the between-conductivity level variability rather than repeatability, which could affect the RSDr/RSDrun ratio. In fact, the RSDrun value obtained was 0 in this case. This means that, a priori, all approaches in Table 1 could offer acceptable results. In addition, this case shows another relevant particularity related to the uncertainty term uCRM² (0.16), which is now the most important one in eq 10, followed by sr² (0.13). So, the inclusion (or not, if it was not considered or was assumed to be negligible) of such a contribution in eq 10 should affect the results of the u-approach. Figure 5 shows the control results based on eight consecutive subsets, interchanging the data used for validation (X3×25) to calculate the control limits with all the approaches in Table 1. The u-approach was used in two ways, including or not the uCRM² term in eq 10. With all approaches, subset-1 generates mean charts indicating that the method is in control. However, when some of the remaining subsets were used as the validation set, different results were observed, except for the proposed u-approach (all the uncertainty terms included). Moreover, even eliminating the uCRM² term in eq 10, the u-approach still shows better between-subsets stability than the rest of the approaches.

Final Remarks. As stated above, IQC results depend on the decision of using the CL = run mean or CL = µCRM strategy.
For instance, in Table 3 using CL = run mean, the MR-approach results for subsets-2 and -3 become more consistent with those from subset-1 (for subset-2, there is still one phase I-fail but P1 and P2 are reduced to 1%; while for subset-3, a conforming phase I and reduced P1 and P2 values, to 7 and 6%, respectively, are observed). Also, slightly better results are observed for the u-approach. This question should be evaluated by analysts, quality managers, and standardization bodies in order to offer and document future validation protocols. Table 1 summarizes the use that, in our opinion, should be made of each of the approaches tested in this paper, according to the simulation and practical cases evaluated. The main suggestion from this table is that both the MR- and u-approaches can coexist in IQC planning. The results in this paper address only the type I error (false alarm). Normally, this is the practical aspect of interest for laboratories and for most of the research papers on the IQC topic. However, from a statistical point of view, the type II error is also relevant (although it requires fixing a threshold value beforehand to consider a control result as truly biased). On the other hand, it is obvious that there is a tradeoff between the two types of error. In this sense, the u-approach provides a lower type I error (lower false alarm rate) than the MR-approach, which means that it necessarily has a lower capability to detect (real) method shifts when they occur. This argument reinforces the suggestion of a combined use of both approaches. Table 3 shows a trend that deserves more attention. The decrease of S values from subset-1 to subset-3 could be just chance; however, it could be explained in terms of a better performance of the laboratory with time, resulting in a reduction of variability. This is the reason for the recommendation to revise (and update) the control limits periodically. If this is the case, the validation statistics and the u estimates for control and sample uncertainty should also be updated periodically. In the case of analyses that must cover a range of analyte concentrations and/or matrices, validation, uncertainty, and also quality control must take these ranges into account. In such cases, the proposed approaches for validation and phase-II quality control should be extended to cover all matrices and at least three analyte concentration levels.2,9

CONCLUSION
Internal method validation under intermediate precision conditions could show initial evidence of method stability, but continuous evidence of the assay’s reliability is only available under IQC schemes; so control charts are complementary to method validation in generating fit-for-purpose data. Samples should be analyzed only when the method is in control, in order to ensure consistent results and reported uncertainty. Therefore, harmonizing the information in the validation-control-uncertainty process is essential for quality assurance tasks. A single statistic, u, estimated from a safe (in terms of cost-benefit analysis) (Nr,Ns)-experimental design (say 4,25) is enough to ensure the consistency in these three stages. The u-approach mean control chart, which shows better applicability and robustness than previous approaches (see Table 1), seems to be an excellent option for controlling methods exhibiting moderate bias, even fixing the center line at the reference value. 
This approach seems to be compatible with phase II-control over time based on as few as Nrc = 2 replicates, which can be in agreement with laboratory needs. Moreover, classical criteria should be eliminated for monitoring analytical data, and the MR-approach should be used with caution (i.e., to monitor the method stability based on a center line equal to the run mean from the validation stage).

ACKNOWLEDGMENT
The authors acknowledge the Spanish Ministry of Science and Technology (MCYT) and the European Regional Development Fund (ERDF) (Project SAF2005-01435) for the financial support.

Received for review June 21, 2006. Accepted September 24, 2006.
AC0611216