31

Statistical Protocol for the NIOSH Validation Tests

KENNETH A. BUSCH and DAVID G. TAYLOR

Downloaded by NORTH CAROLINA STATE UNIV on January 7, 2013 | http://pubs.acs.org Publication Date: April 2, 1981 | doi: 10.1021/bk-1981-0149.ch031

National Institute for Occupational Safety and Health, Robert A. Taft Laboratories, 4676 Columbia Parkway, Cincinnati, OH 45226

Early in 1974, the National Institute for Occupational Safety and Health (NIOSH) and the Occupational Safety and Health Administration (OSHA) announced a joint program to complete the existing workroom level standards promulgated by the U.S. Department of Labor in 1972 (29 CFR 1910.1000). At that time, a statistical protocol was developed which has since been used for laboratory validation of over 300 sampling and analytical methods for monitoring employee exposure to the toxic substances in the OSHA regulations. The validations were conducted by Stanford Research Institute (now SRI International) under contracts CDC-99-74-45 and 210-76-0123 with NIOSH. The contractor set up laboratory facilities and air generation-dilution systems to validate methods over a concentration range from one-half to two times the permissible exposure limits (PELs) for the toxic substances shown in 29 CFR 1910.1000, Tables Z-1, Z-2, and Z-3. The OSHA PELs are occupational health standards for personal exposure and may be either an 8-hour time-weighted average (TWA) concentration or a ceiling standard specified for a short time interval (generally 30 minutes or less).

The purpose of the validation program was to assure that accurate personal sampling and analytical methods would be available for use by OSHA in monitoring for non-compliance with the OSHA PELs. The methods are also available to others who may want to use them to determine worker exposure to the substances in the OSHA regulations.

When a standardized sampling/analytical method is used to measure the concentration of a workplace air contaminant, it is certain that there will be some error in the result.
But the exact amount of error in a given result is uncertain because quantitative errors occur as if they were random variables, i.e. in a chance manner, even when the method is used correctly. However, for a method which is "in control", what is predictable is the long-term proportion of individual errors which do not exceed a selected limit of error. The probability that a given error will be less than some selected limit could be calculated if certain statistical parameters of the method were known, namely its coefficient of variation (CV) and (any) bias. (The CV is referred to as the relative standard deviation by chemists. It is the ratio of the standard deviation of replicate concentration measurements to the mean concentration provided by the method.) Usually, an approximately normal distribution of errors can be assumed to exist as a basis for calculating such probabilities. The CV is assumed to be constant over the four-fold range of concentrations used in a given method's validation tests.

In this paper, we define an accuracy standard in terms of these two statistical parameters. However, in order to evaluate the accuracy of a particular method in terms of its statistical parameters, we have the problem that estimates of the method's statistical parameters are themselves subject to random sampling variations because the estimates must be calculated from only a finite number of replicate samples. The high cost of generating and analyzing large numbers of replicate samples necessitated using only enough samples to assure that reasonably accurate estimates were obtained of the CV and bias parameters of a method. Therefore, we also give statistical decision criteria by which test data for a method can be evaluated to determine whether there is reasonable confidence that the method meets the accuracy standard.

Several assumptions were made prior to initiating the actual validation of a given method:

1. The analytical method had to be previously developed and tested for items such as sample collection efficiency, recovery, and sample stability.
2. Both the air sampling and analytical method were to be validated.
3. An independent method was needed to verify the laboratory generation atmospheres used to validate the method.
4. The accuracy requirement developed for the methods had to apply to a single sample analysis, and not require an average of the analyses of several samples, because OSHA compliance determinations may be made on the basis of a single sample.
5. The bias determined in the validation referred to the difference between average results of the test method and average results of the independent reference method. However, it was recognized that other sources of bias, e.g. some interferences, may increase the true bias of the method in some unique field situations.

This chapter not subject to U.S. copyright. Published 1981 American Chemical Society.

In Chemical Hazards in the Workplace; Choudhary, G.; ACS Symposium Series; American Chemical Society: Washington, DC, 1981.


NIOSH Accuracy Criterion. Accuracy is determined by both the precision and bias of the sampling and analytical method. Bias was defined under item 5 above as the difference between average results by the test method and average results by an independent reference method. Precision refers to the distribution of sizes of differences between results for replicate samples and the mean for the test method at that concentration. The accuracy criterion and its implications with respect to the worst precision and bias which are allowable are discussed below. The goal, however, is to assure that, in the long run, single measurements by the method will come within ±25% of corresponding "true" air concentrations at least 95% of the time. This accuracy requirement applies to a concentration range of 0.5 to 2.0 times the environmental PEL.

In the case of normally distributed sampling and analysis errors (and no bias) the above requirement implies that the true coefficient of variation of the total error (i.e. net precision error of sampling and analysis), denoted by $CV_T$, should be no greater than 0.128, derived as follows: $CV_T = 0.25/1.96 = 0.128$. The number 0.128 is the largest acceptable true $CV_T$ for which the net error would not exceed ±25% at the 95% confidence level. The number 1.96 is the appropriate Z-statistic (from tables of the standard normal distribution) at the same confidence level. If bias exists, the largest acceptable $CV_T$ would have to be smaller than 0.128 in order for there to be less than 5% "large errors" (i.e. errors exceeding ±25%).
In such cases, there would not be a 50-50 division of positive and negative large errors; rather, large errors in the direction of the bias would occur more often than 2.5% of the time. Large errors in the other direction would occur correspondingly less often, to keep the total occurrence in both directions at 5%.

The solid curve in Figure 1 shows the relationship between the bias and the largest acceptable level of the true precision parameter (denoted in Figure 1 as the "target level" of the $CV_T$ of a method). Note that when the bias is zero, the largest acceptable true $CV_T$ is 0.128. Formulae are given in Appendix I for computing the solid curve giving the $CV_T$ target level and bias combinations which meet the NIOSH accuracy standard. The dotted curve of Figure 1 gives corresponding maximum permissible estimates of $CV_T$ (designated $\widehat{CV}_T$), based on laboratory tests performed under the experimental design described below. The shaded area indicates the acceptable $\widehat{CV}_T$ region for validation of a method. The concept of making allowance for the sampling error in the precision estimate itself will be developed more fully below under Statistical Analysis Protocol. Basically, in the case of an unbiased method, the estimate $\widehat{CV}_T$ must be at or below 0.105 in order to be at least 95% confident that the true $CV_T$ is at or below 0.128.
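The arithmetic behind the 0.128 target, and the uneven split of large errors when a method is biased, can be checked with a short Python sketch. It assumes normally distributed errors and expresses the error standard deviation relative to the true concentration, as in the worked example of Appendix I. The function name `large_error_rates` and the illustrative inputs are not from the original.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def large_error_rates(bias_ratio, cv_total, limit=0.25):
    """Return (low, high): long-run fractions of results falling below
    (1 - limit) or above (1 + limit) times the true concentration.

    bias_ratio -- mean result / true concentration (1.0 means unbiased)
    cv_total   -- true coefficient of variation of the total error
    Assumes normal errors (hypothetical helper, not from the paper).
    """
    z_low = (1.0 - limit - bias_ratio) / cv_total
    z_high = (1.0 + limit - bias_ratio) / cv_total
    return STD_NORMAL.cdf(z_low), 1.0 - STD_NORMAL.cdf(z_high)


# Largest acceptable CV for an unbiased method: 0.25 / z(0.975).
z_975 = STD_NORMAL.inv_cdf(0.975)   # about 1.96
cv_target_unbiased = 0.25 / z_975   # about 0.128

# Unbiased method at the target CV: large errors split evenly.
low0, high0 = large_error_rates(1.0, cv_target_unbiased)

# A +5% bias with CV = 0.118 (the 5% bias point tabulated for Figure 1):
# most large errors now fall on the high side.
low5, high5 = large_error_rates(1.05, 0.118)

print(f"target CV (no bias): {cv_target_unbiased:.3f}")
print(f"no bias : low={low0:.4f} high={high0:.4f}")
print(f"+5% bias: low={low5:.4f} high={high5:.4f}")
```

With a positive bias the high-side rate rises above 2.5% and the low-side rate falls below it, while the total stays near 5% at the target level.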


Statistical Experimental Design

Since the accuracy of an air concentration measurement is a function of both sampling and analysis, it is important to evaluate the method by testing both the sampling and analytical portions of the method. The validation program was designed to permit separate evaluation of the levels of error in these two parts of the method, as well as the total (net) error. All validation tests for a given method were carried out in a single laboratory, and although many of the methods had been used in the field previously, no field validations were undertaken.

Initially, the analytical method was tested to assure that it was acceptable for analyte recovery as well as for precision. The sampling medium was spiked with known amounts of the test chemical at three levels corresponding to one-half, one, and two times the occupational PEL for a given air volume. Six spiked samples for each level were analyzed. The success of this portion of the validation assured that the analytical precision was acceptable for the desired concentration range.

The second portion of the validation was to test the net precision due to both the sampling procedure and the analytical method used in sequence. This required the generation of known airborne concentrations of the toxic substance in a laboratory generation-dilution system. Three concentrations, at one-half, one, and two times the PEL, were prepared to test the sampling method. The generated concentrations were verified by a completely independent sampling and analytical method.
For some substances this procedure was not possible, and calculations based upon known flow and delivery rates, or on the experimentally determined collection efficiency, sample stability, and recovery, were necessary to estimate the generated concentration.

After selecting the recommended flow rate and sample volume (based upon the sampler capacity), the samples were collected from the laboratory generation-dilution system. Six samples at each of the three concentrations were collected using calibrated critical orifices. The data from these 18 samples, along with the 18 spiked sample results obtained in the analytical validation, were the basic statistical set of data used for the overall method validation. The data were used to determine the precision and bias of the method, which together determine its accuracy. The error of the personal sampling pump was not evaluated experimentally since sample flows in the laboratory tests were controlled by critical orifices in most cases. However, in the field, sampling pumps are used and their error was assumed to have a relative standard deviation of 0.05 (i.e. 5%) based on pump specifications.
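As a rough illustration of how six replicate results at one test level yield a precision estimate, and of how the assumed 5% pump error raises it, the sketch below uses invented replicate values. Combining independent error components by root-sum-of-squares is a standard assumption made here for illustration only; the protocol's own formulae are those of Appendix II.

```python
from math import sqrt
from statistics import mean, stdev

PUMP_CV = 0.05  # assumed relative standard deviation of the sampling pump


def coefficient_of_variation(results):
    """Sample standard deviation (n - 1 divisor) divided by the mean."""
    return stdev(results) / mean(results)


def add_pump_error(cv_lab, cv_pump=PUMP_CV):
    """Combine a laboratory CV with pump error, assuming the two error
    sources are independent (root-sum-of-squares; an assumption here)."""
    return sqrt(cv_lab ** 2 + cv_pump ** 2)


# Hypothetical results (mg/m3) for six generated samples at one level.
replicates = [9.6, 10.3, 10.1, 9.8, 10.4, 9.9]

cv_lab = coefficient_of_variation(replicates)
cv_with_pump = add_pump_error(cv_lab)

print(f"CV from six replicates:  {cv_lab:.4f}")
print(f"CV including pump error: {cv_with_pump:.4f}")
```

Under this assumption the combined value can never fall below the pump error itself, however precise the laboratory part of the method is.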


Statistical Analysis Protocol

The purpose of the statistical analysis is to estimate the bias and the precision (measured by the $CV_T$ of the total precision error of a subject method) and resolve the latter error into components $CV_S$ due to the sampling method (less pump error), $CV_A$ due to the analytical method (including error in the desorption efficiency factor), and $CV_P$ (an assumed level of pump error). Appendix II gives the definitions and computational formulae for the statistical analysis.

Assuming normally distributed sampling and analysis errors (and no bias), the NIOSH accuracy standard is met if the true coefficient of variation of the total error, denoted by $CV_T$, is no greater than 0.128. However, estimates of $CV_T$ (denoted by $\widehat{CV}_T$), which were obtained in the laboratory validations, are themselves subject to appreciable random errors of estimation. Therefore, a "critical value" for the $\widehat{CV}_T$ was needed (i.e. the value not to be exceeded by an experimental $\widehat{CV}_T$ if the method is to be judged acceptable). The critical value of $\widehat{CV}_T$ [...] ($CV_T$ = 0.128 when there is no bias). The maximum permissible value of the true $CV_T$ will be referred to as its "target level".

In order to have a confidence level of 95% that a subject method meets this required target level, on the basis of $\widehat{CV}_T$ estimated from laboratory tests, an upper confidence limit for $CV_T$ is calculated which must satisfy the following criterion: reject the method (i.e. decide it does not meet the accuracy standard) if the 95% upper confidence limit for $CV_T$ exceeds the target level of $CV_T$. Otherwise, accept the method. This decision criterion was implemented in the form of the Decision Rule given below, which is based on assumptions that errors are normally distributed and the method is unbiased. Biased methods are discussed further below.

For our validations, $\widehat{CV}_T$ is a pooled estimate calculated from the particular type of statistical data set (36 samples) described earlier in the Statistical Experimental Design section of this report. A statistical procedure is given in Hald (1) for determining an upper confidence limit for the coefficient of variation. This general theory had to be adapted appropriately for application to a pooled $\widehat{CV}_T$ estimate. For this design, and under the stated assumptions, there is a one-to-one correspondence between values of $\widehat{CV}_T$ and upper confidence limits for $CV_T$. Therefore, the confidence limit criterion given above is equivalent to another criterion based on the relationship of $\widehat{CV}_T$ and its critical value. The Decision Rule is as follows:

Decision Rule: The $\widehat{CV}_T$ from lab tests would have to be less than the critical value 0.105 to be 95% confident that the true $CV_T$ is at or below 0.128 (i.e., in order to be 95% confident that future errors by the same method would not exceed ±25% more than 5% of the time).
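One way to see where a critical value near 0.105 can come from is a chi-square upper confidence bound on a variance. The sketch below is not the adapted Hald procedure itself. It assumes, hypothetically, that the laboratory part of the variance is estimated with 30 degrees of freedom (six groups of six samples, five degrees of freedom each) and that the 0.05 pump error is a fixed, known component. It uses the Wilson-Hilferty approximation for the chi-square quantile so that only the standard library is needed.

```python
from math import sqrt
from statistics import NormalDist


def chi2_quantile(p, df):
    """Approximate chi-square quantile (Wilson-Hilferty approximation)."""
    z = NormalDist().inv_cdf(p)
    a = 2.0 / (9.0 * df)
    return df * (1.0 - a + z * sqrt(a)) ** 3


def critical_cv(target_cv, df=30, pump_cv=0.05, confidence=0.95):
    """Largest estimate of total CV whose upper confidence limit still
    does not exceed target_cv, under the assumptions stated above.

    Returns None when the target is below the fixed pump error, in which
    case no laboratory estimate can satisfy the criterion.
    """
    lab_target_var = target_cv ** 2 - pump_cv ** 2
    if lab_target_var < 0.0:
        return None
    # Upper limit for the lab variance is s^2 * df / chi2(1 - confidence, df),
    # so the estimate must be shrunk by chi2 / df to stay under the target.
    shrink = chi2_quantile(1.0 - confidence, df) / df
    return sqrt(lab_target_var * shrink + pump_cv ** 2)


print(f"critical value for target 0.128: {critical_cv(0.128):.3f}")
```

Under these assumptions the bound comes out close to 0.105 for the unbiased target of 0.128, and a target level below the 0.05 pump error has no attainable critical value.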


Figure 1 provides adjustments to critical values for $\widehat{CV}_T$ when a method is biased. The dotted curve gives critical values of $\widehat{CV}_T$ as a function of bias for a statistical significance test performed at the 5% probability level. Because uniform replicate determinations of the bias were not made in the validation tests, the bias is treated as a known constant rather than an estimated value. The experimental design could be modified to permit determination of the imprecision in the bias by providing for uniform replication of the independent method as well as the method under evaluation. Then the decision chart could be modified to include allowance for variability of replicate bias determinations. In cases where confidence limits can be calculated for the bias, the critical $\widehat{CV}_T$ should be read from the dotted curve at a position corresponding to the 95% upper confidence limit for the bias. This is a conservative procedure.

The calculated points through which the curves of Figure 1 were drawn (using a French curve) are given below.

Bias (%)    Target $CV_T$ (%)    Critical $\widehat{CV}_T$ (%)
 0.0        12.8                 10.5
 2.5        12.5                 10.3
 5.0        11.8                  9.8
10.0         9.1                  7.9
15.0         6.1                  5.8
16.8         5.0                  5.0
20.0         3.0                 (Unattainable)
25.0         0                   (Unattainable)
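For routine use, the tabulated points can stand in for reading the dotted curve by eye. The sketch below interpolates linearly between tabulated biases, which is an assumption, since the original curves were drawn through these points with a French curve. It also assumes the chart applies to the magnitude of the bias regardless of sign. The function names are illustrative.

```python
# (bias %, critical estimated CV %) points tabulated for Figure 1.
CRITICAL_POINTS = [
    (0.0, 10.5), (2.5, 10.3), (5.0, 9.8),
    (10.0, 7.9), (15.0, 5.8), (16.8, 5.0),
]


def critical_cv_percent(bias_percent):
    """Interpolate the critical CV (%) for a given bias (%).

    Uses the magnitude of the bias (assumes the chart is symmetric in the
    sign of the bias). Returns None beyond 16.8%, where the table lists
    the critical value as unattainable.
    """
    b = abs(bias_percent)
    if b > CRITICAL_POINTS[-1][0]:
        return None
    for (b0, c0), (b1, c1) in zip(CRITICAL_POINTS, CRITICAL_POINTS[1:]):
        if b0 <= b <= b1:
            return c0 + (c1 - c0) * (b - b0) / (b1 - b0)
    return None  # not reached for biases within the tabulated range


def method_is_accepted(cv_estimate_percent, bias_percent):
    """Apply the decision rule: accept only if the estimated CV is below
    the critical value for the method's bias."""
    critical = critical_cv_percent(bias_percent)
    return critical is not None and cv_estimate_percent < critical


print(method_is_accepted(6.5, 4.0))    # estimated CV 6.5%, bias 4%
print(method_is_accepted(8.5, -10.0))  # estimated CV 8.5%, bias -10%
```

When a confidence interval for the bias is available, passing its 95% upper limit as `bias_percent` gives the conservative reading described in the text.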

Operating Characteristics of the Validation Test Program

As would be expected, having at least 95% confidence that the true $CV_T$ does not exceed its target level still carries the penalty of sometimes falsely accepting a "bad" method (i.e. one whose true $CV_T$ is unsatisfactory). Such decision errors, referred to as "type-1 errors", occur randomly but have a controlled long-term frequency of less than 5% of the cases. (The 5% probability of type-1 error is by definition the complement of the confidence level.) The upper confidence limit on $CV_T$ is below the target level when the method is judged acceptable under the Decision Rule.


The validation test program can also have a "type-2 error", which is the mistake of deciding that a method is "bad" ($CV_T$ > 0.128) when in fact it is "good" ($CV_T$ ≤ 0.128). Unlike the type-1 error, the risk (probability) of making a type-2 decision error is not bounded. Rather, it depends on the true $CV_T$. In a previous report (2), it was shown that the probability of a type-2 error is large (0.88) for a "borderline" true $CV_T$ (just below 0.128) but decreases to small probabilities of 0.10 for $CV_T$ = 0.091, and 0.05 for $CV_T$ = 0.088. Thus, more than 95% of methods whose $CV_T$'s are below 0.088 (8.8%) will be accepted on the basis of their test results. "Good" methods whose true $CV_T$'s are in the range 8.8% to 12.8% run a higher risk of not being approved; this risk could be lowered by using more than the now-prescribed 3 sets of 6 samples for the $\widehat{CV}_T$ laboratory estimates in (each phase of) this program. However, the rate of improvement in the precision of the laboratory estimates $\widehat{CV}_T$ from using more samples would be small. For example, using 45 samples (15 per each of 3 groups) for each of the two phases instead of 18 (6 per group) only increases the "safe approval level" (0.05 probability of type-2 error) for $CV_T$ from 0.088 (18 samples) to 0.099 (45 samples). The decision was made, therefore, to perform the smaller number (18) of tests for each of the two phases of the program.

Results of Validation Tests

Over 300 methods have been validated using the statistical protocol described above. Histograms have been prepared showing the distributions of precisions and biases obtained in the validation tests. Of 310 methods validated, only 31 (10%) had precision estimates ($\widehat{CV}_T$'s) above 9% (see Figure 2). Apparently, only a small number of "good" methods have been tested whose $CV_T$'s are in the borderline range where there is an appreciable chance of rejecting "good" methods. Since the pump error has a $CV_P$ of 5% by itself, no values of $\widehat{CV}_T$ fall below this level except for a few cases for which the method does not involve use of a personal sampling pump. It should be noted also that most of the methods have precisions clustering around 6-7%, indicating the high quality of the analytical methods tested.

The distribution of estimated biases for these methods is shown in Figure 3. Apart from a concentration at zero bias, the methods tend to be distributed evenly in the -10% to 10% bias region. The high proportion of zero-bias methods may be explained by the number of filter collection methods which have 100% collection efficiency; many of these methods use low-bias analysis techniques, particularly atomic absorption spectroscopy.

Figure 2. Histogram of $\widehat{CV}_T$ (estimated coefficient of variation of net error attributable to sampling and analysis) for 310 methods. Horizontal axis: Estimated Coefficient of Variation ($\widehat{CV}_T$).

Figure 3. Estimated biases for 310 test methods. Horizontal axis: Estimated Bias (%).


Summary

We have presented a statistical experimental design and a protocol to use in evaluating laboratory data to determine whether the sampling and analytical method tested meets a defined accuracy criterion. The accuracy is defined relative to a single measurement from the test method rather than for a mean of several replicate test results. Accuracy here is the difference between the test result and the "true" value, and thus must combine the two sources of measurement error: 1) the random errors of the sampling and analysis (i.e. precision), represented by the total coefficient of variation ($CV_T$) of replicate measurements around their own mean, and 2) the error due to a real bias (systematic error), represented by the difference between average results by the subject collection-and-measurement method and average results from an independent method. The American Society for Testing and Materials, in their accuracy standard, states that accuracy does include both of these errors (Section 4.1). We have estimated both types of errors and referred results to a decision chart (Figure 1) to see if the test method does or does not meet the accuracy criterion.

Finally, we would like to point out that the statistical protocol for validation deals mainly with the last step in determining the validity of a monitoring method. The statistical protocol is not appropriate for application to a method that has not been completely developed. Items such as sample collection efficiency, stability, and recovery; sampler capacity; and analytical range and calibration should all be evaluated prior to application of the statistical protocol in connection with laboratory validation testing.

Literature Cited

(1) Hald, A. "Statistical Theory with Engineering Applications", Chapter 11, parts 11.8 and 11.9; Wiley, 1952.

(2) Busch, K. A. "Statistical Properties of the SRI Contract Protocol (CDC 99-74-45) for Estimation of Total Errors of Air Sampling/Analysis Procedures", memorandum to Deputy Director, Division of Laboratories and Criteria Development, Jan. 6, 1975.

(3) "Standard Recommended Practice for Use of the Terms Precision and Accuracy as Applied to Measurement of a Property of a Material", E 177-71, in Annual Book of Standards, part 41; American Society for Testing and Materials: Philadelphia, Pa., 1976.


APPENDIX I

TARGET VALUE OF $CV_T$ FOR A BIASED METHOD

The maximum permissible $CV_T$ (target value) for a biased method can be found by means of the formulae given below.


Let B = bias ratio for the method = (mean result by the method) ÷ (true concentration). Standard normal deviates for the left and right sides of the normal distribution corresponding to large errors (errors beyond ±25%) are given by:

\[ Z_L = \frac{0.75 - B}{CV_T} \quad \text{and} \quad Z_R = \frac{1.25 - B}{CV_T} \]

For a given B, $CV_T$ is the solution of the equation:

\[ \int_{-\infty}^{Z_L} \frac{1}{\sqrt{2\pi}}\, e^{-(1/2)z^2}\, dz \;+\; \int_{Z_R}^{\infty} \frac{1}{\sqrt{2\pi}}\, e^{-(1/2)z^2}\, dz \;=\; 0.05 \]

The equation must be solved iteratively. For any selected B, $CV_T$'s are selected by trial and error in order to find the value of $CV_T$ for which the sum of the integrals equals 0.05.

Example: B = 1.1, so that

\[ Z_L = \frac{-0.35}{CV_T}, \qquad Z_R = \frac{0.15}{CV_T} \]

For $CV_T$ = 0.09116, $Z_L$ = -3.8394, $Z_R$ = 1.6455, and the sum of integrals is 0.0001 + 0.0499 = 0.0500. Thus a method with B = 1.1 (i.e. 10% bias) has $CV_T$ = 0.091 as its target level.
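The trial-and-error search described above is easy to automate. The following Python sketch uses bisection on $CV_T$ with the standard library's normal distribution. The function names, search bracket and tolerance are illustrative choices rather than part of the original protocol.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def large_error_probability(bias_ratio, cv_total):
    """Sum of the two tail integrals: P(result < 0.75 x true) plus
    P(result > 1.25 x true) for normally distributed errors."""
    z_left = (0.75 - bias_ratio) / cv_total
    z_right = (1.25 - bias_ratio) / cv_total
    return STD_NORMAL.cdf(z_left) + (1.0 - STD_NORMAL.cdf(z_right))


def target_cv(bias_ratio, alpha=0.05, tol=1e-8):
    """Find the CV at which the large-error probability equals alpha.

    Returns 0.0 when the bias alone reaches the +/-25% limits, since no
    positive CV can then satisfy the criterion.
    """
    if not 0.75 < bias_ratio < 1.25:
        return 0.0
    lo, hi = 1e-6, 1.0  # the tail probability increases with CV here
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if large_error_probability(bias_ratio, mid) > alpha:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


for bias_percent in (0.0, 5.0, 10.0, 15.0, 20.0):
    cv = target_cv(1.0 + bias_percent / 100.0)
    print(f"bias {bias_percent:4.1f}%  target CV {100.0 * cv:5.2f}%")
```

Bisection is adequate here because, for a bias ratio between 0.75 and 1.25, both tail areas grow as the CV grows, so the equation has a single root.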


APPENDIX II

COMPUTATIONAL FORMULAE FOR STATISTICAL ANALYSIS

This appendix gives the formulae and definitions used in the protocol to statistically analyze laboratory data from validation tests.


Definitions and symbols are listed below:

Mean - arithmetic mean or average ($\bar{x}$), defined as the sum of the observations divided by the number of observations (n).

Standard Deviation - the positive square root of the variance, which in turn is defined as the sum of squares of the deviations of the observations from their mean ($\bar{x}$) divided by one less than the number of observations (n - 1):

\[ \text{Std Dev} = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}} \]

CV - coefficient of variation, or relative standard deviation, defined as the standard deviation divided by the mean:

\[ CV = \frac{\text{Std Dev}}{\text{Mean}} \]

$CV_1$ - coefficient of variation (estimated value) for the six analytical samples at each of the 0.5, 1, and 2X OSHA PEL's for the recommended sample volume.

$CV_2$ - coefficient of variation (estimated value) for the six generated samples at each of the 0.5, 1, and 2X OSHA PEL's.

$\overline{CV}$ - pooled coefficient of variation: the value derived from the coefficients of variation (of a given type, e.g. $CV_1$ or $CV_2$) obtained from the analysis of 6 samples at each of the three test levels. The mathematical equation is expressed as:


CV

Validation

Ji

f

i