Anal. Chem. 1992, 64, 1390-1395


Performance Characteristics of a Composite Multivariate Quality Control System

Samuel P. Caudill, S. J. Smith, James L. Pirkle, and David L. Ashley

Division of Environmental Health Laboratory Sciences, National Center for Environmental Health and Injury Control, Centers for Disease Control, Public Health Service, U.S. Department of Health and Human Services, Atlanta, Georgia 30333

We present the results of an evaluation of the performance characteristics of a composite multivariate quality control (CMQC) system that incorporates quality control rules for univariate, multivariate, and correlation conditions. The CMQC system evaluated is designed to help analysts detect unacceptable trends and systematic error in one or more variables, unacceptable random error in one or more variables, and unacceptable changes in the correlation structure of any pair of variables. It is also designed to be tolerant of missing data, to allow analysts to reject as few as one or as many as all variables in a run, and to provide analysts with control statistics and graphics that logically relate to sources of analytical error. We show that the various components of the CMQC system have adequate statistical power to detect systematic errors, random errors, and correlation changes under the conditions likely to be encountered with multivariate analytical measurement systems: (1) a single variable with increased systematic or random error; (2) all variables or a subgroup of variables affected by a common problem that increases systematic or random error; and (3) missing data for one or more variables in a run. We also show that the power of the multivariate component of the CMQC system to detect systematic and random errors is higher than the power of an alternative multivariate test criterion.

INTRODUCTION

The composite multivariate quality control (CMQC) system proposed by Smith et al. (ref 1) is designed for situations involving the simultaneous control of many variables. In such situations other multivariate quality control methods are inadequate due to (1) the large sample sizes required to obtain precise estimates of the correlation matrix; (2) the requirement that all data be observed for all variables in every run; (3) the difficulty of relating individual variable values to the multivariate control statistic; and (4) the difficulty of logically relating out-of-control conditions to specific sources of analytical error. The CMQC system was specifically designed to eliminate these deficiencies and to provide the analyst with the following capabilities: (1) can detect unacceptable trends and systematic error in one or more variables; (2) can detect unacceptable random error in one or more variables; (3) can detect unacceptable changes in correlation structure in any pair of variables; (4) can tolerate missing data; (5) can reject as few as one or as many as all variables in a run; (6) can provide control statistics that logically relate to sources of analytical error. Using data from 63 analytical runs of 40 volatile organic compounds measured by mass spectrometry methods, Smith et al. (ref 1) illustrated the above capabilities.

Our main purpose in this report is to demonstrate the statistical power of the CMQC system to detect the types of errors likely to be encountered in a laboratory process involving many variables. Using Monte Carlo simulation methods, we demonstrated the power of the various components of the CMQC system to detect systematic errors, random errors, and correlation changes under the conditions likely to be encountered with multivariate analytical measurement systems.

Smith et al. (ref 1) also previously discussed some of the problems associated with using Hotelling's T² (ref 2) as a control statistic when many variables are being monitored. Although we believe those arguments are sufficient to warrant the use of the multivariate control statistic (MCS) of the CMQC system rather than the T² statistic, our second purpose in this report is to compare the statistical power of the MCS with that of T². The reader should keep in mind, however, that in many practical applications involving many variables, the use of T² may not be possible. That is, test statistic values for T² cannot be computed if the data are missing for one or more variables in a given run or if the number of variables exceeds the number of available characterization runs. We compared the power of the two statistics in situations involving 5, 10, 20, 30, and 40 variables, when means, standard deviations, and pairwise correlation coefficients are based on 40 characterization runs. We chose 40 characterization runs for these comparisons because, as demonstrated previously (ref 1), 40 characterization runs are usually adequate for univariate QC, and increasing the number of characterization runs beyond 40 has little effect on reducing false reject rates for bivariate correlation tests.

METHODS

A complete description of the CMQC statistical methods and graphical output is given by Smith et al. (ref 1). A summary of the CMQC univariate, bivariate, and multivariate quality control rules is given in Table I. The univariate rules are probability-based and are designed to help analysts detect changes in single variables. The bivariate rules are also probability-based and are designed to help analysts detect changes in pairs of variables. The bivariate rules are based on a statistic denoted by PCORR, which is the observed significance level associated with a bivariate Hotelling's T² statistic. Thus, PCORR is the estimated probability that an observed result for a pair of variables is due to the random variation expected on the basis of the observed variance of each variable and the observed pairwise correlation between the two variables during characterization runs. The multivariate rules are based on Monte Carlo simulation and are designed to help analysts detect changes in groups of variables. The multivariate rules are based on the MCS, which is defined as

$$\mathrm{MCS} = \frac{\log\left[\sum_i d_i^2 + c\right]}{m} \qquad (1)$$

where d_i is the standardized deviate from the target concentration obtained in characterization runs for the ith variable, m is the number of control variables measured in a single run, and c is the number of variables with missing data in a single run. The standardized deviate for each run equals the control variable result minus the overall mean of the control variable in the characterization runs, divided by the standard deviation of the control variable in the characterization runs. Analysts use both univariate and bivariate rules in rejecting partial runs, and they use multivariate rules in rejecting entire runs.

Table I. Composite Multivariate Quality Control Rules

Univariate Rules: For a Single Variable in a Run
U1  reject any single variable outside ±3 SD during one run (extreme deviation)
U2  reject any single variable outside ±2 SD on the same side of the mean for a second consecutive run (moderate deviation)
U3  reject any single variable displaying a consistent trend up or down for seven consecutive runs (trend 1)
U4  reject any single variable falling on the same side of the mean for 10 consecutive runs (trend 2)

Bivariate Rules: For a Pair of Variables in a Run
B1  reject any pair of variables with PCORR^a values less than 0.003 on at least two consecutive runs (extreme deviation)
B2  reject any pair of variables with PCORR values that decrease for seven consecutive runs (trend 1)
B3  examine any pair of variables with PCORR values less than 0.05 during one run (moderate deviation)
B4  examine any pair of variables with PCORR values less than 0.5 for 10 consecutive runs (trend 2)

Multivariate Rules: For All Variables in a Run
M1  reject a run if the MCS^b is outside the 0.003 probability level (extreme deviation)
M2  reject the second consecutive run in which the MCS is outside the 0.05 probability level (moderate deviation)
M3  reject the seventh run if MCS increases in value for seven consecutive runs (trend 1)
M4  reject the tenth run if MCS falls above the median for 10 consecutive runs (trend 2)
M5  examine a single run outside the 0.05 level

a The bivariate rules are based on a statistic denoted by PCORR, which is the observed significance level associated with a bivariate Hotelling's T² statistic. b MCS is the multivariate quality control statistic (see text).

(1) Smith, S. J.; Caudill, S. P.; Pirkle, J. L.; Ashley, D. L. Anal. Chem. 1991, 63, 1419-1425.
(2) Hotelling, H. Multivariate Quality Control, Techniques of Statistical Analysis; McGraw-Hill: New York, 1947.

This article not subject to U.S. Copyright. Published 1992 by the American Chemical Society.

ANALYTICAL CHEMISTRY, VOL. 64, NO. 13, JULY 1, 1992

To evaluate the performance characteristics of the CMQC, we used actual mass spectrometry data from analysis for volatile organic compounds (VOCs) in human blood to simulate data for power calculations. The calibrators for the VOC samples consisted of six different preparations, and each preparation contained from 5 to 10 different compounds, for a total of 40 compounds. The characteristics of the simulated data set correspond to 40 QC characterization runs obtained in conjunction with the Centers for Disease Control's (CDC's) third National Health and Nutrition Examination Survey (NHANES III). That is, means, standard deviations, and pairwise correlations computed from these 40 QC characterization runs were used to generate simulated characterization and test runs. Control limits for the MCS and PCORR were determined by Monte Carlo simulation or direct calculation as described in the supplementary material. After obtaining the necessary control limits, we used Monte Carlo simulation to estimate α levels and power for the MCS and PCORR. The simulation procedures are also described in the supplementary material. We checked the accuracy of the simulations by verifying that the observed α levels for the MCS and PCORR conformed to the expected α levels. We then identified types of multivariate errors that could occur and simulated each of these by generating new runs of test data that reflected these error types. We determined α levels and power values for the MCS and PCORR, and using the method described by Morrison (ref 3), we directly calculated α and power levels for T². The power estimates presented in this report represent instantaneous power, that is, the power of a particular control statistic to detect changes in the process between the current and previous runs or between the current and previous consecutive runs.

The first set of simulations was designed to demonstrate how the CMQC system responds to the types of problems likely to be encountered in a 40-variable process. In the final set of simulations, we compared the power of T² and the MCS component of the CMQC system to determine whether the MCS may be preferable to T² even when the number of variables being monitored is not large, missing observations are not likely, or the number of characterization runs is adequate for the number of variables.

The multivariate control rules (see Table I) used in the CMQC power simulations include the extreme deviation rule (M1), the moderate deviation rule (M2), and the trend rule (M4). The bivariate control rules used in the CMQC power simulations include the extreme deviation rule (B1), the moderate deviation rule (B3), and the trend rule (B4). Power comparisons between the MCS and T² are based on 0.05 level tests. Power estimates for the multivariate trend rule (M3), which monitors consecutive increases in the MCS, or for the bivariate trend rule (B2), which monitors consecutive decreases in PCORR values, were not included in the simulations since these conditions, which are important to detect, rarely occur in practice.
Power calculations for the univariate rules were excluded since these results could be calculated directly and are available elsewhere (ref 4). In our study, we performed the following types of simulations:

All Measurements within a Subgroup of Analytes Affected by a Common Problem. When many analytes are measured in a QC process, it is convenient to prepare analytical standards or QC materials in groups so that each group contains a fraction of all measured analytes. If this method is followed, measurements within a group are correlated and subject to common errors. For example, dilution variability will affect all analytes within each group equally. For simulation, we chose two groups of standards from the 40-analyte VOC data set discussed above. The first group consisted of analytes 1-9 (low concentration, medium volatility VOCs). This group of analytes had relatively low pairwise correlations. The second group consisted of analytes 26-35 (VOCs that are normally present in human blood). The second group of analytes had substantially higher concentrations, and several analytes in that group had high pairwise correlations. For each group of analytes, we increased systematic laboratory measurement error in the analyte subgroup of interest, but did not increase systematic measurement error in the remaining analytes. We report the estimated power of the MCS statistic in these simulations.

Change in Pairwise Correlation between Measurements for Two Analytes. A change in pairwise correlation between two analytes can occur when the analytical system is altered by some factor that affects an analyte or a group of analytes differently than it affects another analyte or group of analytes. For example, if two analytes in different groups are highly correlated and one group is incorrectly prepared, the pairwise correlation between these two analytes would be altered. We simulated these situations with three typical VOC characterization correlations (i.e., -0.5, 0.5, and 0.9). We then generated VOC test data with different correlations and estimated the power of the bivariate rules to detect these shifts. Using the three typical VOC characterization correlations, we also generated test data but added or subtracted a constant from each result to simulate a shift in one or both analytes. We present estimates of the power of the bivariate rules to detect these shifts.

(3) Morrison, D. F. Multivariate Statistical Methods; McGraw-Hill: New York, 1967; pp 148-149.
(4) Duncan, A. J. Quality Control and Industrial Statistics, 4th ed.; Richard D. Irwin: Homewood, IL, 1974; pp 442-450.

Table II. Power of the MCS To Detect Common Systematic Error in a Group of QC Materials (Analytes 1-9) That Have Lower Intercorrelations (See Text)

                          number of standard deviations that simulated runs
                          were shifted from characterization mean values
control rule              0.0    0.5    1.0    1.5    2.0    2.5    3.0
extreme deviation (M1)    0.3    0.4    2.9    3.5   13.4   36.1   67.3
moderate deviation (M2)   0.3    0.7    1.7    9.9   34.8   75.8   96.7
trend 2 (M4)              0.0    0.4    4.7   28.4   88.4  100.0  100.0

Table III. Power of the MCS To Detect Common Systematic Error in a Group of QC Materials (Analytes 26-35) That Have Higher Intercorrelations (See Text)

                          number of standard deviations that simulated runs
                          were shifted from characterization mean values
control rule              0.0    0.5    1.0    1.5    2.0    2.5    3.0
extreme deviation (M1)    0.3    0.6    2.0    7.1   20.8   43.9   70.4
moderate deviation (M2)   0.3    0.6    2.7   11.1   36.5   72.0   93.5
trend 2 (M4)              0.0    0.4    3.7   29.1   82.9   98.2  100.0

Increase in the Systematic Error of Several Analytes. Dilution errors may cause a sudden shift in the concentration of several analytes; internal standard preparation errors may affect an entire group of standards. We illustrate MCS power curves when 4 of 40, 8 of 40, 16 of 40, 32 of 40, or 40 of 40 analytes simultaneously undergo a systematic shift ranging from 0.5 to 3.0 standard deviations.

Increase in the Random Error of All Analytes. We also considered what happens when an increase in random error affects all analytes equally (e.g., when an instrument detector gradually fails). We present MCS power estimates for situations when increased random error causes each analyte's standard deviation to increase by 10%, 25%, 50%, 75%, or 100%.

Proportional Decrease in the Signal-to-Noise Ratio. The noise level (background) for the analytical process can increase because of a decrease in instrument sensitivity. For example, in mass spectrometry, as samples are run, deposits accumulate on the mass spectrometry source, resulting in a steady decrease in the sensitivity. As a result of this general increase in noise, the analytes with lower concentrations have a proportionally decreased signal-to-noise ratio. We present power estimates for the MCS when the noise level for the analytical process is raised so that the standard deviation of the analyte with the smallest error increases by 50-300% and the standard deviation of each other analyte increases in inverse proportion to the ratio of its standard deviation to the standard deviation of the analyte with the smallest error. For example, if the standard deviations for analytes B and C are 3 times larger and 5 times larger, respectively, than the standard deviation for analyte A (the analyte with the smallest error), then a 100% increase in the standard deviation of analyte A would translate into a 33% and 20% increase, respectively, in the standard deviations for analytes B and C.

Missing Data for Several Analytes. Missing results for one or several analytes can occur when the specimen concentration is below the detection limits of the instrument, when standards are inadvertently omitted from the run, or when complete runs are not obtained because of instrument difficulties. We present power estimates for the MCS to detect changes in analytes 1-9 when 1, 5, 10, 15, or 20 other analytes of the 40 are missing in the simulated test runs.
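The standardized deviates and the missing-data correction in the MCS can be illustrated with a minimal Python sketch. It assumes that eq 1 reads MCS = log(Σ d_i² + c)/m, that the logarithm is natural, and that m counts every control variable in the run including the missing ones; the function names and the use of `None` to mark a missing result are hypothetical choices made for illustration, not part of the CMQC software.

```python
import math


def standardized_deviates(results, char_means, char_sds):
    """Return d_i = (result - mean) / sd for each variable, or None if missing.

    char_means and char_sds are the characterization-run mean and standard
    deviation of each control variable.
    """
    return [
        None if x is None else (x - mu) / sd
        for x, mu, sd in zip(results, char_means, char_sds)
    ]


def mcs(deviates):
    """MCS under the assumed reading of eq 1: log(sum of d_i^2 + c) / m."""
    m = len(deviates)                       # assumption: all control variables in the run
    observed = [d for d in deviates if d is not None]
    c = m - len(observed)                   # number of variables with missing data
    return math.log(sum(d * d for d in observed) + c) / m


# Hypothetical characterization statistics for three control variables.
means = [10.0, 20.0, 30.0]
sds = [1.0, 2.0, 3.0]

complete_run = standardized_deviates([11.0, 22.0, 36.0], means, sds)
partial_run = standardized_deviates([11.0, None, 36.0], means, sds)

mcs_complete = mcs(complete_run)
mcs_partial = mcs(partial_run)
```

In this example the second variable sits exactly 1 standard deviation from its mean in the complete run, so replacing it with the correction term (one unit per missing variable) leaves the statistic unchanged.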


Comparison of the MCS and T². Using 0.05 level tests, we compared the power of the MCS and T² to detect shifts in a subset of variables. Using 40 characterization runs, we estimated the power of the MCS and of T² to detect shifts involving 5, 10, 20, 30, or 40 variables.
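The PCORR statistic is described above only as the observed significance level of a bivariate Hotelling's T². A minimal Python sketch of one way to compute such a value follows. It assumes the textbook prediction form of T² for a single new run judged against n characterization runs, which may differ in detail from the computation used by Smith et al.; the function name and arguments are hypothetical. With two variables, the upper tail of the F(2, n - 2) distribution has a closed form, so no statistics library is needed.

```python
def pcorr(z1, z2, r, n):
    """Observed significance level of a bivariate Hotelling T^2 for one test run.

    z1, z2 : standardized deviates of the two variables in the test run
    r      : pairwise correlation estimated from the characterization runs
    n      : number of characterization runs
    Assumption: T^2 = n/(n+1) * z' R^-1 z, with (n-2)/(2(n-1)) * T^2 ~ F(2, n-2).
    """
    p = 2
    # Quadratic form z' R^-1 z for a 2 x 2 correlation matrix R.
    q = (z1 * z1 - 2.0 * r * z1 * z2 + z2 * z2) / (1.0 - r * r)
    t2 = q * n / (n + 1.0)
    nu = n - p
    f = t2 * nu / (p * (n - 1.0))
    # Closed-form upper tail probability of F(2, nu).
    return (1.0 + 2.0 * f / nu) ** (-nu / 2.0)


# Both variables shifted by +3 SD after 40 characterization runs.
p_contradicts = pcorr(3.0, 3.0, -0.5, 40)   # shift contradicts a negative correlation
p_agrees = pcorr(3.0, 3.0, 0.9, 40)         # shift roughly follows a strong positive correlation
```

Under these assumptions, a joint positive shift gives a much smaller significance level when the characterization correlation was negative than when it was strongly positive, which is the behavior the bivariate rules rely on.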

RESULTS

All Measurements within a Subgroup of Analytes Affected by a Common Problem. Tables II and III contain estimates of the power of the MCS to detect increased systematic error in a group of analytes. The first analysis involved a group of 9 analytes with lower pairwise correlations, and the second analysis involved a group of 10 analytes with higher pairwise correlations. The pairwise correlations (before the simulated shifts indicated in Table II) for the first group ranged in absolute value from 0.01 to 0.55, with 28 (78%) below 0.25 and 1 (3%) above 0.50. The pairwise correlations for the second group (before the simulated shifts indicated in Table III) ranged in absolute value from 0.01 to 0.86, with 21 (47%) below 0.25 and 14 (31%) above 0.50. A comparison of the results in the two tables suggests that the original correlations among the affected analytes had little effect on the power of the MCS to detect shifts in the group of analytes. For example, if the moderate deviation rule (M2) was used, and if the entire first (Table II) or second (Table III) group of analytes was shifted 1 standard deviation, the chance of detecting the shifts would be approximately 2-3%. If the systematic error increased to an amount equal to 2.5 standard deviations, the chance of detecting the shift would be approximately 70-75%. Even though the power of the MCS is not substantially affected by the underlying correlation structure of the data, the MCS is sensitive to changes in correlation structure, since a common shift imposed on a group of variables results in an increased positive correlation among those variables.

Change in Pairwise Correlation between Measurements for Two Analytes. The power of the bivariate rules to detect a change in true correlation between two variables is shown in Table IV.
The column labeled "correlation change" displays the true correlation during simulated characterization runs (left side of column) and during simulated test runs (right side of column). The columns labeled "shift in variable 1" and "shift in variable 2" represent the amount of shift imposed on variables 1 and 2, respectively, during simulated test runs. No shift was imposed on either variable during simulated characterization runs. The first three rows of Table IV show the type I error rates (i.e., the percentages of times that use of the indicated bivariate rules will result in an in-control run being incorrectly labeled out-of-control). As expected, these error rates are comparable for a range of pairwise correlations.

Table IV. Power of the CMQC Bivariate Rules To Detect Changes in Correlation between Two Variables

correlation     shift in      shift in      action      warning     trend
change          variable 1    variable 2    rule (B1)   rule (B3)   rule (B4)
-0.5 to -0.5    none          none            0.0         5.3         0.2
 0.5 to  0.5    none          none            0.0         5.0         0.1
 0.9 to  0.9    none          none            0.0         5.1         0.4
-0.5 to  0.0    none          none            0.1        11.4         0.7
-0.5 to  0.5    none          none            0.2        16.8         0.5
-0.5 to  0.9    none          none            0.6        20.7         0.4
 0.5 to -0.5    none          none            0.2        16.1         0.5
 0.5 to  0.0    none          none            0.0        11.2         0.2
 0.5 to  0.9    none          none            0.0         3.1         0.0
 0.9 to -0.5    none          none           11.4        52.7        10.6
 0.9 to  0.0    none          none            6.6        45.3        10.8
 0.9 to  0.5    none          none            1.4        31.1         5.4
-0.5 to -0.5    +1 sd         +1 sd           1.1        39.0        30.6
 0.5 to  0.5    +1 sd         +1 sd           0.0        16.1         2.6
 0.9 to  0.9    +1 sd         +1 sd           0.0        13.6         1.8
-0.5 to -0.5    +3 sd         +3 sd          88.8       100.0       100.0
 0.5 to  0.5    +3 sd         +3 sd          23.5        85.5        94.1
 0.9 to  0.9    +3 sd         +3 sd          12.7        75.5        83.2
-0.5 to -0.5    -3 sd         +3 sd          22.3        84.8        92.9
 0.5 to  0.5    -3 sd         +3 sd          88.5       100.0       100.0
 0.9 to  0.9    -3 sd         +3 sd          90.9       100.0       100.0

The next nine rows represent imposed changes in the underlying correlation between variables 1 and 2. Since the extreme deviation action rule (B1) was designed to have a low type I error rate (about 0.09% or 0.32%) to avoid unnecessary out-of-control events, the instantaneous power of this rule is minimal for all situations illustrated except for those involving major changes in correlation (i.e., changes from 0.9 to 0.0 or from 0.9 to -0.5). The moderate warning rule (B3), on the other hand, is more sensitive to less extreme correlation changes, thus alerting the analyst to subtle changes in the analytical process that, if left unattended, may lead to out-of-control conditions in later runs. The trend rule (B4) is moderately sensitive to major shifts in correlation (i.e., changes from 0.9 to 0.0 or from 0.9 to -0.5).

The last nine rows represent situations for which the correlations in the simulated test runs are equal to the correlations in the simulated characterization runs, but a shift has been imposed on both variables. Such a shift obviously results in a relationship between the two variables that is inconsistent with the characterized correlation, unless the absolute value of the characterization correlation is near 1 and both variables are shifted in accord with the characterization correlation. The power of the extreme deviation rule (B1) is moderately high or high (and desirably so) when both variables shift by 3 standard deviations, yet the power is low (also desirably so) when both variables shift by only 1 standard deviation. Whether the power is moderately high or high when both variables are shifted by 3 standard deviations depends upon the extent to which the shift contradicts the correlation observed during characterization runs. For example, a -3 standard deviation shift in one variable and a +3 standard deviation shift in the other variable is more likely to be detected if the characterization correlation was 0.5 than if it was -0.5.
The trend rule (B4) is moderately sensitive to minor shifts in two variables when the shift for both variables contradicts the correlation observed between the variables during characterization runs (e.g., -0.5 correlation accompanied by a positive shift in both variables of 1 standard deviation). The trend rule (B4) is very sensitive to major shifts in a pair of variables.

Increase in the Systematic Error of Several Analytes. Figures 1 and 2 show the estimated power of the MCS extreme deviation rule (M1) and moderate deviation rule (M2), respectively, to detect a 0-3 standard deviation shift in 4, 8, 16, 32, or 40 variables. The expected type I error rates (zero shift imposed for all variables) for this extreme deviation rule (M1) and this moderate deviation rule (M2) are 0.3% and 0.25%, respectively. Note that if only four variables are shifted, the MCS extreme deviation rule (M1) has less than a 15% chance of identifying the shift, even when all four variables are shifted 3 standard deviations. Thus, the MCS is not prone to reject an entire run when only a few variables are out-of-control. In such cases, the univariate rules would lead to rejection of only the results for the affected variables. If, however, 16 of 40 variables are shifted 3 standard deviations, there is about a 99% chance of declaring the entire run out-of-control. Thus, when small or large numbers of variables are shifted, the MCS performs as designed (i.e., provides low power to detect a few out-of-control variables; provides high power to detect many out-of-control variables).

Figure 1. Power of the MCS extreme deviation rule (M1) when 4, 8, 16, 32, or 40 variables are shifted. The x-axis represents the number of standard deviations each variable is shifted from its characterization mean. The y-axis represents 100 times the probability that the MCS extreme deviation rule (M1) will detect the indicated shift.

Figure 2. Power of the MCS moderate deviation rule (M2) when 4, 8, 16, 32, or 40 variables are shifted. The x-axis represents the number of standard deviations each variable is shifted from its characterization mean. The y-axis represents 100 times the probability that the MCS moderate deviation rule (M2) will detect the indicated shift.

Increase in the Random Error of All Analytes. In Table V we show the estimated power of the MCS to detect an increase in random error for all variables. The simulated increase in the standard deviation of each variable ranged from 10% to 100%. As can be observed, the MCS extreme deviation rule (M1) would be very sensitive to a doubling of the standard deviation of all variables in a 40-variable process, and the moderate deviation rule (M2) would have a 7 in 10 chance of detecting a 50% increase in the standard deviation of all variables in a 40-variable process.

Table V. Power of the MCS To Detect an Increase in Random Error Affecting All Variables

                                control rule
percent       extreme            moderate           trend 2
increase^a    deviation (M1)     deviation (M2)     (M4)
  0               0.3                0.3              0.0
 10               2.1               14.4             68.8
 25               7.0               34.1             86.8
 50              30.3               69.3             97.8
 75              59.8               89.5             99.7
100              82.7               97.3             99.9

a The standard deviation of each variable was increased by the percentage indicated. This increase simulates an increase in the random error for each variable.

Proportional Decrease in the Signal-to-Noise Ratio. In Table VI we show the estimated power of the MCS to detect an increase in the noise of an analytical process. To simulate this situation, we increased the standard deviation of each variable in proportion to its magnitude relative to that of the variable with the smallest standard deviation. The simulated amount of increase (relative to the variable with the smallest standard deviation) ranged from 0.5 to 3.0 in increments of 0.5, where 3.0 represents a 300% increase in standard deviation for the variable with the smallest standard deviation. The amount of increase in standard deviation for the other 39 variables for the most extreme situation simulated (i.e., a 300% increase in the standard deviation of the variable with the smallest standard deviation) broke down as follows: 10 variables had less than a 5% increase, 10 had a 5-25% increase, 3 had a 25-50% increase, 12 had a 50-100% increase, and 4 had a 100% to less than a 200% increase. The last row of Table VI indicates the power of the MCS to detect this array of increases in standard deviation among the 40 variables. Thus, the extreme deviation rule (M1) would have about a 75% chance of detecting this particular increase in the noise of the analytical system.

Table VI. Power of the MCS To Detect a Proportional Increase in Noise-to-Signal Ratio^a

                                control rule
percent       extreme            moderate           trend 2
increase      deviation (M1)     deviation (M2)     (M4)
  0               0.3                0.3              0.0
 50               2.7                3.4              7.3
100              11.3               19.0             33.4
150              25.1               40.5             63.8
200              43.6               61.0             87.5
250              61.7               79.2             90.2
300              74.8               89.0             96.4

a The standard deviation of the variable with the smallest standard deviation was increased by the percentage indicated. The standard deviation of each of the remaining variables was increased in proportion to its magnitude relative to that of the variable with the smallest standard deviation. These increases simulate an increase in the noise-to-signal ratio, or a decrease in the signal-to-noise ratio.

Table VII. Power of the MCS To Detect a Common Shift of Two Standard Deviations in a Group of QC Materials (Variables 1-9) When Results for a Number of Other Variables Are Missing

                          number of variables missing
control rule              0      1      5      10     15     20
extreme deviation (M1)    13.4   12.5   11.8   10.7    9.5    8.2
moderate deviation (M2)   34.8   34.6   33.7   33.5   33.3   33.8
trend 2 (M4)              88.4   89.9   92.8   89.7   96.3   96.5
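The proportional noise scaling used for Table VI can be checked with a few lines of Python. The sketch follows the worked example in Methods (analytes B and C with standard deviations 3 and 5 times that of analyte A), in which each variable's percentage increase equals the increase for the smallest-error variable multiplied by the ratio of the smallest standard deviation to its own. The function names and the standard deviation values are hypothetical.

```python
def percent_sd_increases(char_sds, pct_increase_smallest):
    """Percent increase in each variable's SD for a given increase in the smallest SD."""
    smallest = min(char_sds)
    return [pct_increase_smallest * smallest / sd for sd in char_sds]


def inflated_sds(char_sds, pct_increase_smallest):
    """Standard deviations after applying the proportional increase."""
    pcts = percent_sd_increases(char_sds, pct_increase_smallest)
    return [sd * (1.0 + pct / 100.0) for sd, pct in zip(char_sds, pcts)]


# Analytes A, B, and C, with B and C having 3 and 5 times the SD of A.
sds = [1.0, 3.0, 5.0]
increases = percent_sd_increases(sds, 100.0)   # 100% increase for analyte A
new_sds = inflated_sds(sds, 100.0)
```

Under this reading every variable gains the same absolute amount of noise (here 1.0 unit), so variables with large characterization standard deviations are affected least in relative terms.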
Missing Data for Several Analytes. The MCS contains a correction factor for missing data as indicated in eq 1. In Table VII we show the effect of missing (unobserved) data on the power of the MCS to detect a 2 standard deviation shift in the mean of variables 1-9 when variables 10, 10-14, 10-19, 10-24, or 10-29 are missing. We estimated that the power of the extreme deviation rule (M1) decreases slightly as the number of missing variables becomes excessive, whereas the power of the trend rule (M4) increases. These apparent changes in power are due to slight changes in the type I error rate as the number of missing variables increases. This change in the error rate arises because the correction factor is based on the assumption that the true mean and variance of the missing variables are known. That is, the correction factor, c, in eq 1 is the expected value of Σd_i² when d_i is computed by using the true mean and standard deviation of the ith variable. Since in practice the mean and standard deviation must be estimated from the characterization data, we conducted the simulation in the same way. Therefore, c is a biased estimate of the expected value of Σd_i². At the tails of the MCS distribution the bias is negative, and near the center of the distribution it is positive. Thus, as c increases, the estimated power of the extreme deviation rule (M1) decreases and the estimated power of the trend rule (M4) increases. The estimated power of the moderate deviation rule (M2) is essentially constant, regardless of the number of missing variables.

Comparison of the MCS and T². A comparison of the power of T² and the MCS to detect systematic errors is presented in Table VIII. For these comparisons the first 5, 10, 20, 30, or 40 of the 40 VOCs previously discussed were used to generate correlated data for the power estimates. Power estimates are based on estimated means, standard deviations, and pairwise correlations from 40 characterization runs. We estimated mean shifts of 0.0-3.0 standard deviation units in increments of 0.5 units for 40% of the 5, 10, 20, 30, or 40 variables (i.e., 2 of 5, 4 of 10, 8 of 20, 12 of 30, or 16 of 40). We chose 40% in order to obtain the broadest range of power values for this range of mean shifts.
Using a smaller percentage of shifted variables, we would have focused only on the lower portion of the T² and MCS power curves, and using a larger percentage of shifted variables we would have focused only on the upper portion of the power curves. We used 0.05 level tests to estimate the power of the MCS and of T². The first row in Table VIII (shift of 0.0 units) is always equal to 5% for T² because the results were obtained by direct calculation. The type I error estimates that we obtained using the MCS deviate only slightly from 5%, demonstrating that the simulation performed well. The power of T² to detect a given shift in 40% of the variables does not rise steadily with the number of variables: it is better for 10 or 20 variables than for 5 or 30 variables. The reason for this difference has to do with the relative sizes of the numerator and denominator degrees of freedom associated with the T² statistic, which has an F distribution.⁵ As Table VIII also shows, the power of the MCS to detect a given shift in 40% of the variables increases as the number of variables increases. This increase is probably due to the similarity between the MCS and a χ² statistic. The MCS is a simple function of a χ² distributed variable (with degrees of freedom equal to the number of variables) if all variables are mutually independent (uncorrelated). Since the power of a χ² statistic increases as the number of degrees of freedom increases, the power of the MCS for a noncorrelated set of variables should also increase as the number of variables increases. Though we do not know the exact distribution of the MCS when the variables are correlated, we expect that the power of the MCS should increase as the number of variables increases. The more important information conveyed by Table VIII, however, is that the power of the MCS is comparable to and may be considerably greater than the power of T².

(5) Dixon, W. J.; Massey, F. J. Introduction to Statistical Analysis, 2nd ed.; McGraw-Hill: New York, 1957; pp 256-259.
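The χ² argument above can be illustrated with a small Monte Carlo sketch. It is not the MCS itself: it assumes independent standard normal variables whose means and standard deviations are known exactly, so the statistic is a plain sum of squared standardized deviations, and it estimates the 0.05 level critical value by simulation. The function names, the seed, and the number of simulated runs are hypothetical choices made only for this illustration.

```python
import random


def sum_sq_stat(p, n_shifted, shift, rng):
    """Sum of squared standardized deviations for one simulated run.

    Assumes p independent N(0, 1) variables; the first n_shifted of them
    are offset by `shift` standard deviations.
    """
    total = 0.0
    for i in range(p):
        mean = shift if i < n_shifted else 0.0
        total += rng.gauss(mean, 1.0) ** 2
    return total


def simulated_critical_value(p, alpha, n_sim, rng):
    """Upper-alpha critical value estimated from in-control simulated runs."""
    null = sorted(sum_sq_stat(p, 0, 0.0, rng) for _ in range(n_sim))
    index = min(n_sim - 1, int(round((1.0 - alpha) * n_sim)))
    return null[index]


def estimated_power(p, shift, frac_shifted=0.4, alpha=0.05, n_sim=4000, seed=1):
    """Fraction of shifted runs whose statistic exceeds the critical value."""
    rng = random.Random(seed)
    crit = simulated_critical_value(p, alpha, n_sim, rng)
    n_shifted = round(frac_shifted * p)  # 2 of 5, 4 of 10, ..., 16 of 40
    hits = sum(sum_sq_stat(p, n_shifted, shift, rng) > crit for _ in range(n_sim))
    return hits / n_sim


if __name__ == "__main__":
    for p in (5, 10, 20, 30, 40):
        print(p, estimated_power(p, shift=2.0))
```

Under these simplifying assumptions the estimated power at a fixed shift should rise with the number of variables, because shifting a fixed 40% of the variables makes the total squared shift grow in proportion to the number of variables. The values will not match the MCS columns of Table VIII, which were obtained with correlated variables and estimated parameters.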

ANALYTICAL CHEMISTRY, VOL. 64, NO. 13, JULY 1, 1992


Table VIII. Power of T² versus Power of the MCS To Detect Systematic Errors as a Function of the Number of Variables

number of variables (number of shifted variables); entries are power in %

               5 (2)         10 (4)         20 (8)        30 (12)        40 (16)
shiftᵃ      T²     MCS     T²     MCS     T²     MCS     T²     MCS     T²     MCS
0.0        5.0     5.2    5.0     5.3    5.0     5.3    5.0     4.7     -ᵇ     5.3
0.5        6.9     8.2    7.4     7.6    7.5     8.1    6.8     7.5     -ᵇ     7.7
1.0       13.9    16.2   16.7    17.3   17.7    18.8   13.6    19.5     -ᵇ    20.6
1.5       27.8    29.9   36.7    35.0   40.4    44.8   28.5    48.5     -ᵇ    53.8
2.0       48.6    53.9   64.3    63.6   70.3    77.8   51.3    84.3     -ᵇ    88.2
2.5       71.2    76.5   87.0    88.7   91.3    96.1   74.7    98.4     -ᵇ    99.4
3.0       88.2    90.1   97.2    97.7   98.7    99.9   90.5   100.0     -ᵇ   100.0

ᵃ The number of standard deviations that simulated runs were shifted from zero target values. ᵇ T² cannot be computed when the number of variables is greater than or equal to the number of characterization runs.
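The table footnote on T² and the degrees-of-freedom remark in the text can be made concrete with one standard result, given here as an assumption because the article does not state which form of T² was used. For a single new run x compared with a mean vector and covariance matrix S estimated from n characterization runs on p normally distributed variables:

```latex
T^2 = (\mathbf{x} - \bar{\mathbf{x}})^{\mathsf T}\,\mathbf{S}^{-1}\,(\mathbf{x} - \bar{\mathbf{x}})
\;\sim\; \frac{p\,(n+1)(n-1)}{n\,(n-p)}\;F_{p,\,n-p}
```

With n = 40, the denominator degrees of freedom n - p are 35, 30, 20, and 10 for p = 5, 10, 20, and 30. For p = 40 they fall to zero, and S, whose rank is at most n - 1, is singular and cannot be inverted.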

DISCUSSION

Using actual data from 63 analytical runs of 40 analytes measured by mass spectrometry methods, Smith et al.¹ demonstrated various features of the CMQC system and how the CMQC system can be applied to an actual multivariate process. Such a demonstration, however, does not provide an adequate basis for assessing how well the system will perform in general. The performance of a quality control system is usually measured in terms of statistical power (i.e., the probability that certain quantifiable errors will be detected by the control rules). The power of a given statistic can be evaluated in absolute terms (i.e., on the basis of statistical power estimates under various out-of-control situations) or by way of comparison with the power of an alternative statistic, if one exists.

The results of our power computations demonstrate that the various components of the CMQC system are sensitive to the types of errors likely to be encountered with multivariate analytical measurement systems. The power of the MCS was unaffected by the underlying correlation structure of the process variables, yet the MCS was also sensitive to changes in correlation structure from characterization to test runs. This sensitivity resulted primarily from the systematic shifts imposed on variables affected by the correlation changes. The power of the MCS was only slightly diminished by missing observations on some variables. Because the MCS is based on the sum of squared deviations, the MCS was also sensitive to increases in the random variation of several variables (i.e., when the deviation on either side of a given variable's characterization mean increases, MCS values also increase).

Because T² represents an alternative to the MCS (the multivariate component of the CMQC system), we also compared the power of the MCS to that of T². On the basis of this comparison, we demonstrated that the MCS actually performs better than T² for processes with as few as 5 or as many as 40 variables. This may be because T² values are subject to errors in estimation of the correlation structure, whereas the MCS values are not. Thus, disregarding the correlation structure (as is done in constructing the MCS) results in more (not less) power to detect shifts in the means of variables. The comparisons we made are all based on means, standard deviations, and correlation coefficients estimated from 40 characterization runs. The power of T² could have been increased if more characterization runs had been used, but in many applications, resource constraints could make it difficult to obtain even 40 runs. In our simulations we chose 40 characterization runs as an upper limit because of this constraint and as a lower limit because, even though 30 characterization runs is usually considered adequate for univariate QC, a substantial reduction in bivariate false reject rates can be achieved by going from 30 to 40 characterization runs.¹ The power of the MCS for a fixed number of characterization runs actually increases as the number of variables increases.

In summary, the MCS will reject entire runs when a substantial number of variables are out-of-control, whereas the univariate and bivariate control rules will target individual out-of-control variables so that entire runs are not rejected simply because of a few out-of-control variables. By having a quality control system with univariate, bivariate, and multivariate components, the power to detect errors in only one, in several, or in many variables can be high without substantially increasing false rejections of entire runs because of errors in only one or a few variables. In addition, by examining results of univariate, bivariate, and multivariate quality control rules, analysts have a much improved capability of identifying the specific sources in the measurement process that account for out-of-control conditions.

One disadvantage of the CMQC system is that control limits for the MCS must be estimated by simulation. The results of preliminary studies of the distribution of MCS critical values, however, suggest that it may be possible to develop an algorithm that will produce critical value estimates for the MCS which are functions of the number of variables and one or more summary measures of the estimated correlation structure of the variables. MCS critical values may also possibly be obtained by using a modified bootstrap method. We are investigating these issues.

ACKNOWLEDGMENT

Use of trade names is for identification only and does not constitute endorsement by the Public Health Service or the U.S. Department of Health and Human Services.

SUPPLEMENTARY MATERIAL AVAILABLE

Steps for determining control limits and α level and power estimations for the MCS and PCORR procedures (8 pages). Photocopies of the supplementary material from this paper or microfiche (105 X 148 mm, 24X reduction, negatives) may be obtained from Microforms & Back Issues Office, American Chemical Society, 1155 16th Street, NW, Washington, DC 20036. Orders must state whether for photocopy or microfiche and give complete title of article, names of authors, journal issue date, and page numbers. Prepayment, check or money order for $17.50 for photocopy ($19.50 foreign) or $10.00 for microfiche ($11.00 foreign), is required, and prices are subject to change. Canadian residents should add 7% GST.

RECEIVED for review December 9, 1991. Accepted March 27, 1992.