T = 0.128 when there is no bias). The maximum permissible value of the true CV_T will be referred to as its "target level". In order to have a confidence level of 95% that a subject method meets this required target level, on the basis of the CV_T estimated from laboratory tests, an upper confidence limit for CV_T is calculated which must satisfy the following criterion: reject the method (i.e., decide it does not meet the accuracy standard) if the 95% upper confidence limit for CV_T exceeds the target level of CV_T. Otherwise, accept the method. This decision criterion was implemented in the form of the Decision Rule given below, which is based on the assumptions that errors are normally distributed and the method is unbiased. Biased methods are discussed further below.

For our validations, the estimated CV_T is a pooled estimate calculated from the particular type of statistical data set (36 samples) described earlier in the Statistical Experimental Design section of this report. A statistical procedure is given in Hald (1) for determining an upper confidence limit for the coefficient of variation. This general theory had to be adapted appropriately for application to a pooled CV_T estimate. For this design, and under the stated assumptions, there is a one-to-one correspondence between values of the estimated CV_T and upper confidence limits for CV_T. Therefore, the confidence limit criterion given above is equivalent to another criterion based on the relationship of the estimated CV_T and its critical value. The
In Chemical Hazards in the Workplace; Choudhary, G.; ACS Symposium Series; American Chemical Society: Washington, DC, 1981.
31. BUSCH AND TAYLOR    NIOSH Validation Tests    509
Decision Rule is as follows:

Decision Rule: The estimated CV_T from lab tests would have to be less than the critical value 0.105 to be 95% confident that the true CV_T is at or below 0.128 (i.e., in order to be 95% confident that future errors by the same method would not exceed ±25% more than 5% of the time).
Downloaded by NORTH CAROLINA STATE UNIV on January 7, 2013 | http://pubs.acs.org Publication Date: April 2, 1981 | doi: 10.1021/bk-1981-0149.ch031
Figure 1 provides adjustments to critical values for the estimated CV_T when a method is biased. The dotted curve gives critical values of the estimated CV_T as a function of bias for a statistical significance test performed at the 5% probability level. Because uniform replicate determinations of the bias were not made in the validation tests, the bias is treated as a known constant rather than an estimated value. The experimental design could be modified to permit determination of the imprecision in the bias by providing for uniform replication of the independent method as well as the method under evaluation. Then the decision chart could be modified to include allowance for variability of replicate bias determinations. In cases where confidence limits can be calculated for the bias, the critical CV_T should be read from the dotted curve at a position corresponding to the 95% upper confidence limit for the bias. This is a conservative procedure.

The calculated points through which the curves of Figure 1 were drawn (using a French curve) are given below.

Bias (%)    Target CV_T (%)    Critical CV_T (%)
 0          12.8               10.5
 2.5        12.5               10.3
 5.0        11.8                9.8
10.0         9.1                7.9
15.0         6.1                5.8
16.8         5.0                5.0
20.0         3.0               (Unattainable)
25.0         0                 (Unattainable)
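As a rough illustration of how the tabulated points could be used without the printed chart, the sketch below interpolates the critical CV_T linearly between the tabulated biases and then applies the Decision Rule. Linear interpolation is an assumption made here for simplicity (the published curves were drawn by hand through these points), the function names `critical_cv` and `accept_method` are hypothetical, and only the tabulated non-negative biases up to 16.8% are handled.

```python
from bisect import bisect_right
from typing import Optional

# Points tabulated in the text: bias (%) -> critical estimated CV_T (%).
# Biases of 20% and 25% are listed as unattainable, so they are omitted.
BIAS_PCT = [0.0, 2.5, 5.0, 10.0, 15.0, 16.8]
CRITICAL_CV_PCT = [10.5, 10.3, 9.8, 7.9, 5.8, 5.0]


def critical_cv(bias_pct: float) -> Optional[float]:
    """Critical CV_T (%) for a given bias (%), by linear interpolation.

    Assumption: straight-line interpolation between tabulated points.
    Returns None outside the tabulated range of 0 to 16.8%.
    """
    if bias_pct < BIAS_PCT[0] or bias_pct > BIAS_PCT[-1]:
        return None
    if bias_pct == BIAS_PCT[-1]:
        return CRITICAL_CV_PCT[-1]
    i = bisect_right(BIAS_PCT, bias_pct) - 1  # index of the left neighbor
    x0, x1 = BIAS_PCT[i], BIAS_PCT[i + 1]
    y0, y1 = CRITICAL_CV_PCT[i], CRITICAL_CV_PCT[i + 1]
    return y0 + (y1 - y0) * (bias_pct - x0) / (x1 - x0)


def accept_method(estimated_cv_pct: float, bias_pct: float) -> bool:
    """Decision Rule: accept only if the estimate is below the critical value."""
    limit = critical_cv(bias_pct)
    return limit is not None and estimated_cv_pct < limit
```

For example, an estimated CV_T of 9.0% would pass for an unbiased method (critical value 10.5%) but not for a method with 10% bias (critical value 7.9%). When a confidence interval for the bias is available, the text's conservative procedure corresponds to passing the 95% upper confidence limit of the bias as `bias_pct`.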
Operating Characteristics of the Validation Test Program

As would be expected, in order to be able to have at least 95% confidence that the true CV_T does not exceed its target level, we must suffer the penalty of sometimes falsely accepting a "bad" method (i.e., one whose true CV_T is unsatisfactory). Such decision errors, referred to as "type-1 errors", occur randomly but have a controlled long-term frequency of less than 5% of the cases. (The 5% probability of type-1 error is by definition the complement of the confidence level.) The upper confidence limit on CV_T is below the target level when the method is judged acceptable under the Decision Rule.
The validation test program can also have a "type-2 error", which is the mistake of deciding that a method is "bad" (CV_T > 0.128) when in fact it is "good" (CV_T < 0.128). Unlike the risk of a type-1 error, the risk (probability) of making a type-2 decision error is not bounded. Rather, it depends on the true CV_T. In a previous report (2), it was shown that the probability of a type-2 error is large (0.88) for a "borderline" true CV_T (just below 0.128) but decreases to small probabilities of 0.10 for CV_T = 0.091, and 0.05 for CV_T = 0.088. Thus, more than 95% of methods whose CV_T's are below 0.088 (8.8%) will be accepted on the basis of their test results.

"Good" methods whose true CV_T's are in the range 8.8% to 12.8% run a higher risk of not being approved; this risk could be lowered by using more than the now-prescribed 3 sets of 6 samples for the CV_T laboratory estimates in (each phase of) this program. However, the rate of improvement in the precision of the laboratory estimates of CV_T from using more samples would be small. For example, using 45 samples (15 for each of 3 groups) for each of the two phases instead of 18 (6 per group) only increases the "safe approval level" (0.05 probability of type-2 error) for CV_T from 0.088 (18 samples) to 0.099 (45 samples). The decision was made, therefore, to perform the smaller number (18) of tests for each of the two phases of the program.
Results of Validation Tests

Over 300 methods have been validated using the statistical protocol described above. Histograms have been prepared showing the distributions of precisions and biases obtained in the validation tests. Of 310 methods validated, only 31 (10%) had precision estimates (estimated CV_T's) above 9% (see Figure 2). Apparently, only a small number of "good" methods have been tested whose CV_T's are in the borderline range where there is an appreciable chance of rejecting "good" methods. Since the pump error has a CV of 5% by itself, no values of CV_T fall below this level except for a few cases for which the method does not involve use of a personal sampling pump. It should also be noted that most of the methods have precisions clustering around 6-7%, indicating the high quality of the analytical methods tested.

The distribution of estimated biases for these methods is shown in Figure 3. Except for a bias of zero, the methods tend to be distributed evenly in the -10% to 10% bias region. The high proportion of zero-bias methods may be explained by the number of filter collection methods which have 100% collection efficiency; many of these methods use low-biased analysis techniques, particularly atomic absorption spectroscopy.
Figure 2. Histogram of estimated CV_T (estimated coefficient of variation of net error attributable to sampling and analysis) for 310 methods. (Horizontal axis: Estimated Coefficient of Variation, CV_T.)

Figure 3. Estimated biases for 310 test methods. (Horizontal axis: Estimated Bias, %.)
Summary

We have presented a statistical experimental design and a protocol to use in evaluating laboratory data to determine whether the sampling and analytical method tested meets a defined accuracy criterion. The accuracy is defined relative to a single measurement from the test method rather than for a mean of several replicate test results. Accuracy here is the difference between the test result and the "true" value, and thus must combine the two sources of measurement error: 1) the random errors of the sampling and analysis (i.e., precision), represented by the total coefficient of variation (CV_T) of replicate measurements around their own mean, and 2) the error due to a real bias (systematic error), represented by the difference between average results by the subject collection-and-measurement method and average results from an independent method. The American Society for Testing and Materials, in its accuracy standard (3), states that accuracy does include both of these errors (Section 4.1). We have estimated both types of errors and referred the results to a decision chart (Figure 1) to see whether the test method does or does not meet the accuracy criterion.

Finally, we would like to point out that the statistical protocol for validation deals mainly with the last step in determining the validity of a monitoring method. The statistical protocol is not appropriate for application to a method that has not been completely developed. Tests for such items as sample collection efficiency, stability, and recovery; sampler capacity; and analytical range and calibration all should be evaluated prior to application of the statistical protocol in connection with laboratory validation testing.
Literature Cited

(1) Hald, A., "Statistical Theory with Engineering Applications", Chapter 11, parts 11.8 and 11.9; Wiley, 1952.
(2) Busch, K. A., "Statistical Properties of the SRI Contract Protocol (CDC 99-74-45) for Estimation of Total Errors of Air Sampling/Analysis Procedures", memorandum to Deputy Director, Division of Laboratories and Criteria Development, Jan. 6, 1975.
(3) "Standard Recommended Practice for Use of the Terms Precision and Accuracy as Applied to Measurement of Property of a Material", E 177-71, in Annual Book of Standards, part 41, American Society for Testing and Materials: Philadelphia, Pa., 1976.
APPENDIX I

TARGET VALUE OF CV_T FOR A BIASED METHOD
The maximum permissible CV_T (target value) for a biased method can be found by means of the formulae given below.
Let B = bias ratio for the method = (mean result by the method) ÷ (true concentration). Standard normal deviates for the left and right sides of the normal distribution corresponding to large errors (errors beyond ±25%) are given by:

$$Z_L = \frac{0.75 - B}{CV_T} \quad \text{and} \quad Z_R = \frac{1.25 - B}{CV_T}$$
For a given B, CV_T is the solution of the equation:

$$\int_{-\infty}^{Z_L} \frac{1}{\sqrt{2\pi}}\, e^{-(1/2)Z^2}\, dZ + \int_{Z_R}^{\infty} \frac{1}{\sqrt{2\pi}}\, e^{-(1/2)Z^2}\, dZ = 0.05$$

The equation must be solved iteratively. For any selected B, CV_T's are selected by trial and error in order to find the value of CV_T for which the sum of the integrals equals 0.05.
Example: For B = 1.1,

$$Z_L = \frac{-0.35}{CV_T}, \qquad Z_R = \frac{0.15}{CV_T}$$

For CV_T = 0.09116, Z_L = -3.8394, Z_R = 1.6455, and the sum of the integrals is 0.0001 + 0.0499 = 0.0500. Thus a method with B = 1.1 (i.e., 10% bias) has CV_T = 0.091 as its target level.
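The trial-and-error search described in this appendix can be automated. The sketch below is one possible approach, not the authors' procedure: it assumes the deviates are Z_L = (0.75 - B)/CV_T and Z_R = (1.25 - B)/CV_T, as in the worked example, evaluates the two normal tail areas with the complementary error function, and finds CV_T by bisection. The function names and the search bracket (1e-6 to 1.0) are illustrative choices.

```python
import math


def large_error_probability(b: float, cv_t: float) -> float:
    """Probability that a single result errs by more than 25% of the true value.

    b is the bias ratio (mean result / true concentration); cv_t is CV_T.
    """
    z_left = (0.75 - b) / cv_t
    z_right = (1.25 - b) / cv_t
    # Area below z_left and area above z_right under the standard normal curve.
    left_tail = 0.5 * math.erfc(-z_left / math.sqrt(2.0))
    right_tail = 0.5 * math.erfc(z_right / math.sqrt(2.0))
    return left_tail + right_tail


def target_cv(b: float, alpha: float = 0.05, tol: float = 1e-9) -> float:
    """Find the CV_T at which the large-error probability equals alpha."""
    if not 0.75 < b < 1.25:
        raise ValueError("bias ratio must lie strictly between 0.75 and 1.25")
    lo, hi = 1e-6, 1.0  # assumed bracket; the probability rises with cv_t
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if large_error_probability(b, mid) > alpha:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for bias_ratio in (1.0, 1.025, 1.05, 1.10, 1.15):
        print(f"B = {bias_ratio:.3f}  target CV_T = {target_cv(bias_ratio):.4f}")
```

Bisection is used because, for a bias ratio between 0.75 and 1.25, both tail areas grow steadily as CV_T increases, so the equation has a single root inside the bracket.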
APPENDIX II

COMPUTATIONAL FORMULAE FOR STATISTICAL ANALYSIS

This appendix gives the formulae and definitions used in the protocol to statistically analyze laboratory data from validation tests.
Definitions and symbols are listed below:

Mean - arithmetic mean or average (x̄), defined as the sum of the observations divided by the number of observations (n).

Standard Deviation - the positive square root of the variance, which in turn is defined as the sum of squares of the deviations of the observations from their mean (x̄) divided by one less than the number of observations (n - 1).

$$\text{Std Dev} = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$$

CV - coefficient of variation, or relative standard deviation, defined as the standard deviation divided by the mean.

$$CV = \frac{\text{Std Dev}}{\text{Mean}}$$

CV_1 - coefficient of variation (estimated value) for the six analytical samples at each of the 0.5, 1, and 2X OSHA PEL's for the recommended sample volume.

CV_2 - coefficient of variation (estimated value) for the six generated samples at each of the 0.5, 1, and 2X OSHA PEL's.

Pooled CV - pooled coefficient of variation: the value derived from the coefficients of variation (of a given type, e.g. CV_1 or CV_2) obtained from the analysis of 6 samples at each of the three test levels. The mathematical equation is expressed as:
CV
Ji
f
i
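The definitions of the mean, standard deviation, and CV given earlier in this appendix can be sketched in a few lines. The six readings below are made-up illustrative values, not data from the validation tests, and the function names are hypothetical; the pooled CV is not included because its formula is not reproduced here.

```python
import math
from typing import Sequence


def mean(values: Sequence[float]) -> float:
    """Sum of the observations divided by the number of observations."""
    return sum(values) / len(values)


def std_dev(values: Sequence[float]) -> float:
    """Square root of the sum of squared deviations divided by (n - 1)."""
    m = mean(values)
    return math.sqrt(sum((x - m) ** 2 for x in values) / (len(values) - 1))


def coefficient_of_variation(values: Sequence[float]) -> float:
    """Standard deviation divided by the mean."""
    return std_dev(values) / mean(values)


# Hypothetical set of six replicate results at one test level.
samples = [9.0, 10.0, 11.0, 10.0, 9.0, 11.0]
cv = coefficient_of_variation(samples)
print(f"mean = {mean(samples):.2f}, std dev = {std_dev(samples):.4f}, CV = {cv:.4f}")
```

Python's standard `statistics.mean` and `statistics.stdev` compute the same quantities; the explicit versions are shown only to mirror the wording of the definitions.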