2 Testing Basic Assumptions in the Measurement Process J A M E S J. F I L L I B E N
Downloaded by KTH ROYAL INST OF TECHNOLOGY on August 31, 2015 | http://pubs.acs.org Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0063.ch002
National Bureau of Standards, Washington, DC 20234
The purpose of this paper is to discuss various statistical techniques for the testing of the basic underlying assumptions in a measurement process. In general, the measurement process refers to the act of collecting quantified information about some phenomenon of interest under well-defined conditions. Among the various components of the measurement process are the experimentalists themselves. The end objective in a measurement process is predictability (1,2) that is, the ability to make probability statements about measurements already taken and yet to be taken. If this predictability is not present, then the process will yield conclusions which are only temporal and local in nature, and which will lack the generality typical of scientific experimentation. To achieve such predictability, the measurement process must be "in control (3)." The term "in control" is a statistical term—having nothing to do per se with whatever physical science area or phenomenon that the experiment involves, but rather with the properties of sequences of measurements. A broad definition of "in control" is as follows: a measurement process is in control if the resulting observations from the process, when collected under any fixed experimental condition within the scope of the a priori well-defined conditions of the measurement process, behave like random drawings from some fixed probability distribution with fixed location and fixed variation parameters. The essential components implied by the above definition are: 1. randomness
30 In Validation of the Measurement Process; DeVoe, J.; ACS Symposium Series; American Chemical Society: Washington, DC, 1977.
Downloaded by KTH ROYAL INST OF TECHNOLOGY on August 31, 2015 | http://pubs.acs.org Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0063.ch002
2.
FiLLiBEN
Testing Basic Assumptions
2.
fixed location
3.
fixed variation
4.
fixed distribution
31
I t i s t o be noted t h a t " f i x e d l o c a t i o n " as used above and throughout t h i s paper i s an abbreviated way o f s t a t i n g t h a t the measurement process has a s i n g l e l i m i t i n g mean ( i . e . , as the measurement process continues i n time, i t i s conceived t o have a unique l i m i t i n g " t y p i c a l v a l u e " ) . S i m i l a r l y , " f i x e d v a r i a t i o n " i s an abbreviated way of s t a t i n g t h a t the measurement process has a s t a b l e degree o f v a r i a t i o n . I t i s important of course t o note t h a t the above components i n the d e f i n i t i o n o f " i n c o n t r o l " are i d e n t i c a l l y those u n d e r l y i n g assumptions which are t y p i c a l l y made, e i t h e r knowingly or unknowingly, i n a measurement process. The consequences f o r i n v a l i d i t y o f these assumptions are the a r r i v a l a t i n c o r r e c t c o n c l u s i o n s and the l o s s of the d e s i r e d - f o r p r e d i c t a b i l i t y t h a t the s c i e n t i s t i n v a r i a b l y seeks. In t h i s l i g h t , the t e s t i n g and checking o f b a s i c assumpt i o n s L$l measurement process takes on i t s r i g h t f u l importance. The t e s t i n g o f assumptions i s a "necessary e v i l , " t a n g e n t i a l i n a sense t o the main a n a l y s i s , but which r a r e l y can be s h o r t - c i r c u i t e d . In order t o t e s t such assumptions a f t e r the f a c t , we have only the raw data r e s u l t i n g from the experiment. F o r t u n a t e l y , however, much i n f o r m a t i o n about the v a l i d i t y o r i n v a l i d i t y o f the u n d e r l y i n g assumptions i s s t i l l l a t e n t i n the data and c o n s i d e r a b l e progress ( o f both a t h e o r e t i c a l and p r a c t i c a l nature) has been made i n the l a s t decade i n the development o f s t a t i s t i c a l techniques f o r the e x t r a c t i o n of such information. The remainder o f t h i s paper w i l l deal w i t h an enumeration and d i s c u s s i o n of such techniques. i n
a
I t w i l l be noted t h a t almost a l l o f the techniques t o be presented are g r a p h i c a l i n nature. There are many reasons f o r such a heavy dependence on graphics: 1. P l o t s take f u l l advantage o f the p a t t e r n r e c o g n i t i o n c a p a b i l i t i e s of the human (e.g., l i n e a r i t y i s easy t o d e t e c t ) . 2. P l o t s make use o f a minimal number o f assumptions. Thus, the use o f p l o t s increases the l i k e l i h o o d t h a t the conclusions w i l l not be approach-dependent. 3. From a communications p o i n t of view, a p l o t i s g e n e r a l l y a much more understandable and e f f i c i e n t way o f conveying information t o another a n a l y s t o r e x p e r i m e n t a l i s t than i s a set of s t a t i s t i c s .
In Validation of the Measurement Process; DeVoe, J.; ACS Symposium Series; American Chemical Society: Washington, DC, 1977.
VALIDATION OF T H E M E A S U R E M E N T PROCESS
Downloaded by KTH ROYAL INST OF TECHNOLOGY on August 31, 2015 | http://pubs.acs.org Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0063.ch002
32
In Validation of the Measurement Process; DeVoe, J.; ACS Symposium Series; American Chemical Society: Washington, DC, 1977.
FiLLiBEN
Testing Basic Assumptions
Downloaded by KTH ROYAL INST OF TECHNOLOGY on August 31, 2015 | http://pubs.acs.org Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0063.ch002
2.
33
Où co
too
In Validation of the Measurement Process; DeVoe, J.; ACS Symposium Series; American Chemical Society: Washington, DC, 1977.
Downloaded by KTH ROYAL INST OF TECHNOLOGY on August 31, 2015 | http://pubs.acs.org Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0063.ch002
VALIDATION OF T H E M E A S U R E M E N T PROCESS
In Validation of the Measurement Process; DeVoe, J.; ACS Symposium Series; American Chemical Society: Washington, DC, 1977.
In Validation of the Measurement Process; DeVoe, J.; ACS Symposium Series; American Chemical Society: Washington, DC, 1977.
THE
IS A P L O T
579.CCCCC00=MIN-
-469.1250000
-359.25CC00O
-2 4 9 . 3 7 5 C C C C
- 129. 5CC0COO=MID-
-29.6250000
8O.25C0OOO
190. 1 250CO0
I 1-0C00O0O
j
ι
X
X
XX
X
ι
I
X
XX
X X
X
X
X
100.5CC0000
X
X
X
X
XX
X
Run sequence plot. Beam deflection data.
X
X X X
ι
(HORIΖCNTALLY)
XX
I
I 50.7500000
VERSUS
Figure Id.
I
X X
OF X ( I ) ( V E R T I C A L L Y )
3C0.0CCCC00=MAX-
FOLLOWING
X
X
X
150.2500000
XX
Downloaded by KTH ROYAL INST OF TECHNOLOGY on August 31, 2015 | http://pubs.acs.org Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0063.ch002
X
X
X
I
XX
X
X
X X
X
I 200.0000000
X
VALIDATION OF T H E M E A S U R E M E N T PROCESS
36
Downloaded by KTH ROYAL INST OF TECHNOLOGY on August 31, 2015 | http://pubs.acs.org Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0063.ch002
4. A p l o t allows the a n a l y s t t o see and use a l l o f the data--thus no information i s l o s t i n the t y p i c a l e x e r c i s e of forming s t a t i s t i c s which ( i n essence) summarize and map information l a t e n t i n the e n t i r e data set i n t o a s i n g l e number—a number which w i l l u s u a l l y only be s e n s i t i v e t o one p a r t i c u l a r a n a l y s i s aspect of the data. 5. A p l o t allows the a n a l y s t t o check many d i f f e r e n t aspects of the data simultaneously--and so information w i l l be relayed not only about what i s being i n v e s t i g a t e d , but a l s o about unsuspected anomalies (e.g., o u t l i e r s ) i n the data. RUN SEQUENCE PLOT We assume t h a t the e x p e r i m e n t a l i s t has c o l l e c t e d η observations Y Y , ... , Y under the u n i v a r i a t e model: 1 $
2
n
response Ϋ. = constant c + e r r o r e.
(1)
Almost a l l data have a "time" run sequence ( i = l , 2, n) a s s o c i a t e d w i t h i t . Although the c o l l e c t i o n o f data p o i n t s may or may not have been equispaced i n time, the o r d e r i n g of the data i n time ( i . e . , the run sequence) i s u s u a l l y w e l l - d e f i n e d (unless observations are simultaneously c o l l e c t e d ) . In cases where the data a c q u i s i t i o n r a t e i s such t h a t there i s an equal time-spacing between c o l l e c t e d data p o i n t s , the run sequence has a natural analogue t o a p o s s i b l y r e l e v a n t f a c t o r (time) i n the experiment; i n other cases, when the data a c q u i s i t i o n r a t e i s v a r i a b l e or random, no such analogue e x i s t s - - y e t the run sequence " f a c t o r " i s s t i l l f r e q u e n t l y of i n t e r e s t . The run sequence p l o t ( d e f i n e d as a p l o t of Y j versus i ) i s the s i m p l e s t p o s s i b l e data p l o t and y e t i s almost i n v a r i a b l y i n f o r m a t i v e . This run sequence p l o t i s the recommended f i r s t step i n assessing whether the b a s i c assumptions o f the measurement process are tenable. In p a r t i c u l a r , t h i s p l o t y i e l d s information about the assumption o f f i x e d l o c a t i o n , f i x e d v a r i a t i o n and the i m p l i c i t c o r o l l a r y assumptions t h a t the data s e t i s o u t l i e r - f r e e . Figures l a through Id i l l u s t r a t e t h i s on three p a r t i c u l a r data s e t s . The two upper p l o t s present data (voltage counts) from a Josephson J u n c t i o n cryothermometry experiment. The upper l e f t p l o t (data a l l on the lower margin w i t h the exception o f an i s o l a t e d p o i n t on the upper margin) i s i n d i c a t i v e of an o u t l i e r ( i n t h i s case due t o a keypunch e r r o r i n the l e a d i n g d i g i t ) . The upper r i g h t p l o t i s the c o r r e c t e d data s e t ; note the absence of s h i f t s or v a r i a t i o n a l changes as one proceeds l e f t t o r i g h t ( i n time) across the p l o t . (Note a l s o how t h i s p l o t gives the a n a l y s t a c l e a r and i n i t i a l " f e e l " f o r the d i s c r e t e character o f these data.) The lower l e f t f i g u r e i s a run sequence p l o t f o r wind v e l o c i t y data. Note the apparent s h i f t (up) i n l o c a t i o n i n the second h a l f o f the d a t a — a c l e a r i n d i c a t i o n o f a process apparently not i n s t a t i s t i c a l c o n t r o l . The lower r i g h t f i g u r e
In Validation of the Measurement Process; DeVoe, J.; ACS Symposium Series; American Chemical Society: Washington, DC, 1977.
2.
37
Testing Basic Assumptions
FiLLiBEN
Downloaded by KTH ROYAL INST OF TECHNOLOGY on August 31, 2015 | http://pubs.acs.org Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0063.ch002
gives a run sequence p l o t f o r d e f l e c t i o n s o f a s t e e l - c o n c r e t e beam when a p e r i o d i c force i s a p p l i e d t o i t . No apparent s h i f t i n l o c a t i o n o r v a r i a t i o n and no apparent o u t l i e r s a r e evident from t h i s p l o t , and so t h i s p a r t i c u l a r data s e t "passes" the s c r u t i n y of t h i s f i r s t s t a t i s t i c a l technique. I t i s t o be noted i n passing t h a t t h e FORTRAN c a l l i n g sequence i n the upper l e f t corner o f t h i s f i g u r e (and most other f i g u r e s i n t h i s paper) r e f e r t o c a l l s t o subroutines i n the DATAPAC ( 6 , 7 ) data a n a l y s i s package which produced the computerized output t h a t comprises the f i g u r e s . An important g e n e r a l i z a t i o n o f the run sequence p l o t i s the c o n t r o l chart. Rather than p l o t t i n g XJ vs i (as above), the sample mean c o n t r o l chart p l o t s x-j vs i when some a r b i t r a r y (but f i x e d ) numbers o f observations i n _ t h e o r i g i n a l sequence have been grouped together t o form each x-j. I f the o r i g i n a l process i s normal, o r i f the number o f observations grouped t o form a s i n g l e mean i s l a r g e , then normal p r o b a b i l i t y l i m i t s can be i n s e r t e d onto the chart so as t o define a t y p i c a l band o f v a r i a t i o n f o r the process. Large ( o r frequent) excursions outside the band i s an i n d i c a t i o n o f a process t h a t i s no longer "in control." In a d d i t i o n t o χ c o n t r o l c h a r t s , other useful c o n t r o l charts are s (standard d e v i a t i o n ) c h a r t s , r (range) charts and CUSUM (cummulative sum) charts. C o l l e c t i v e l y , they are e x c e l l e n t d i a g n o s t i c t o o l s f o r determining whether i n p a r t i c u l a r an o u t l i e r has occurred and i n general whether a s h i f t i n l o c a t i o n o r v a r i a t i o n has occurred i n the process. For f u r t h e r information on c o n t r o l c h a r t s , the reader i s r e f e r r e d t o Himmelblau (8). LAG-1
AUTOCORRELATION PLOT
The lag-1 a u t o c o r r e l a t i o n p l o t i s defined as a p l o t o f Y versus Υ· -j^over the e n t i r e data s e t ; that i s , the f o l l o w i n g n-1 points are p l o t t e d : ( Υ , Υ ) , ( Y , Y ) , (Y >Yâ)>. · · (Y > V l ) lag-1 autocorrelation p l o t i s s e n s i t i v e t o the randomness assumption i n a measurement process. I f the data are random, then adjacent observations w i l l be u n c o r r e c t e d and the p l o t o f Y. versus Y - j ^ w i l l appear as a data cloud with no apparent structure. However, i f the data are not random and i f adjacent observations do have some a u t o c o r r e l a t i o n t h i s s t r u c t u r e w i l l f r e q u e n t l y manifest i t s e l f i n the a u t o c o r r e l a t i o n p l o t . Figure 2 gives three examples o f the a u t o c o r r e l a t i o n p l o t . The upper l e f t p l o t i s an a u t o c o r r e l a t i o n p l o t o f the aforementioned voltage c o u n t s — n o apparent s t r u c t u r e i s n o t i c e a b l e (aside from the l a t t i c e e f f e c t due t o the discreteness o f the data). The upper r i g h t p l o t i s f o r the wind v e l o c i t y data--note the pronounced l i n e a r s t r u c t u r e i n t h i s p l o t which i m p l i e s t h a t the randomness assumption i s untenable f o r these data. The lower p l o t i s f o r the beam d e f l e c t i o n data--note the w e l l - d e f i n e d Ί
T h e
2
Χ
3
2
4
n
In Validation of the Measurement Process; DeVoe, J.; ACS Symposium Series; American Chemical Society: Washington, DC, 1977.
In Validation of the Measurement Process; DeVoe, J.; ACS Symposium Series; American Chemical Society: Washington, DC, 1977.
THF
IS
A
PLCT
CF
X(I)
Figure 2a.
(VERTICALLY)
2895.O0CCO00
2895.0000000=MIN
2895.8750000
2896.750CC00
2897.625CCO0
2898.50C0CC0=MID
2899.375CC00
2900.2500000
2901 . 1 250COO
29C2.0CCCC00=MAX
FOLLCWING
X(I-l)
t
( FΟκIZGNTALL
2898.5CC00OO
Y)
29C0.2500000
Lag-1 (Y vs. Yi-i) autocorrelation plot. Voltage counts.
2896.75C0OOO
VERSUS
Downloaded by KTH ROYAL INST OF TECHNOLOGY on August 31, 2015 | http://pubs.acs.org Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0063.ch002
2902.0000000
Ο Ο H
H
M
M
CO 00
Downloaded by KTH ROYAL INST OF TECHNOLOGY on August 31, 2015 | http://pubs.acs.org Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0063.ch002
FiLLiBEN
Testing Basic Assumptions
WIND VELOCITIES
CALL PL0TXX(X,1200)
Π
295.14 295.14
1
1
1
l
ι
T"
Γ
I 654.20
I
I
1
L 1013.26
Yi—ι
Figure 2b.
Lag-1 (Yi vs. Y< - J autocorrelation plot. Wind velocities.
In Validation of the Measurement Process; DeVoe, J.; ACS Symposium Series; American Chemical Society: Washington, DC, 1977.
In Validation of the Measurement Process; DeVoe, J.; ACS Symposium Series; American Chemical Society: Washington, DC, 1977.
THE
IS A PLOT
X X X XX
X
X XXX
X
XX XX
X X X X X
xxxxx
X
X
X
X X
XX
X X XX
- 139.5CCC000
X
X XXX X
X XX
XX
X XX
X X X X
XX XX
X
XXX X X XX X
X XX X XX
80.2500000
xxxx XX
Lag-1 (Yi vs. Yi-i) autocorrelation plot. Beam deflections.
-359.25CO00O
XX
>
X ( I - l ) (HORIZONTALLY)
X X X XXXXX
VERSUS
Figure 2c.
X X X
XX
(VERTICALLY)
-579.0CCC0OO
-579.CCCCCCO=MIN-
-469.1250000
-359.25CC0OO
-249.375CCC0
- 129.5CCCCCC=MID~
-29.6250000
80.2SC0CC0
190.125CO00
X
ΓΡ X(I>
300-0CC0C0O-MAX-
FOLLOWING
Downloaded by KTH ROYAL INST OF TECHNOLOGY on August 31, 2015 | http://pubs.acs.org Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0063.ch002
300.0000000
Ο Ο Μ
M
Ο
Downloaded by KTH ROYAL INST OF TECHNOLOGY on August 31, 2015 | http://pubs.acs.org Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0063.ch002
2.
FiLLiBEN
Testing Basic Assumptions
41
e l l i p t i c a l s t r u c t u r e o f the a u t o c o r r e l a t i o n p l o t which i s a l s o i n d i c a t i v e o f the untenableness o f the randomness assumption. In t h i s l a s t case the l a c k o f randomness was, as i t turned out, due to an underlying c y c l i c s t r u c t u r e i n the data ( i . e . , the t r u e model was Y-j = c + a* s i η (ôi + φ) + e i (where i i s time) r a t h e r than the assumed Y = c + e . The reader should note the two p o i n t s i n the upper r i g h t p o r t i o n o f the p l o t which are o f f the ellipse. This i s due t o a s i n g l e o u t l i e r i n the data and demonstrates the secondary sensitivity o f the lag-1 a u t o c o r r e l a t i o n p l o t to o u t l i e r s . RUNS TEST The runs t e s t i s a technique t h a t i s s p e c i f i c a l l y used f o r testing randomness. The assumed underlying model f o r a p p l i c a t i o n o f t h i s technique i s expressed by eq. ( 1 ) . To i l l u s t r a t e the technique, consider the run sequence p l o t o f 50 spectrophotometry transmittance data p o i n t s i n f i g u r e 3. I t i s apparent from the p l o t t h a t the data are not random (note how observations 35 to 45 are not random but r a t h e r near-monotonic i n nature). To s c r u t i n i z e the c o r r e l a t i o n s t r u c t u r e i n t h i s data set, consider the runs a n a l y s i s given i n f i g u r e 4. A run up of length i means t h a t there are e x a c t l y (i+1) successive observations such t h a t each observation i s g r e a t e r than (or a t l e a s t equal t o ) the previous observation. The underlying theory behind the runs t e s t i s t h a t i f the data are random and i f the sample s i z e i s known ( i n t h i s case, n=50), the number of runs up of length 1, o f length 2, etc. , may be considered as random v a r i a b l e s whose expected values and standard d e v i a t i o n s can be c a l c u l a t e d from t h e o r e t i c a l c o n s i d e r a t i o n s (9) and these c a l c u l a t i o n s w i l l not depend on the (unknown) d i s t r i b u t i o n o f the data but only on i t s assumed randomness. Having computed such t h e o r e t i c a l v a l u e s , the f i n a l step i n the t e s t i s to compute from the data the observed number o f runs (up) o f length 1, o f length 2, e t c . , and then determine how many t h e o r e t i c a l standard deviations that t h i s observed statistic falls from the t h e o r e t i c a l l y expected value. This i s most e a s i l y done by formation of the standardized v a r i a b l e : N. - E(N.) SD(N.) where Nj i s the observed number o f runs (up) o f length i , E(N-j) i s the t h e o r e t i c a l expected number o f runs up o f length i and SD(N-j) i s the t h e o r e t i c a l standard d e v i a t i o n o f the number o f runs up o f length i . This standardized v a r i a t e i s given i n the right-most column of f i g u r e 4. For random data, one would expect values o f , say, ±1, ±2, ±3 i n t h i s column, i . e . , the observed number o f runs o f length i should be only a few ( a t most) standard d e v i a t i o n s away from the t h e o r e t i c a l expected value f o r the number o f runs o f length i . For nonrandom data, the
In Validation of the Measurement Process; DeVoe, J.; ACS Symposium Series; American Chemical Society: Washington, DC, 1977.
In Validation of the Measurement Process; DeVoe, J.; ACS Symposium Series; American Chemical Society: Washington, DC, 1977.
THE
I S A PLGT
X
X
X
X
XX
X ( l ) (VERTICALLY) I I
I (HORI Ζ C N T A L L Y ) I
1 3 . 25 C0C 0 0
VERSOS
XX
X
X
X
X
25.5CC00CO
X
X
X
37 . 7 5 C 0 0 O 0
Figure 3. Run sequence plot for spectrophotometric measurement of transmittance
2.00 13000 = MIN-
2.0014750
2.00 16500
2.0018250
2.0020000=MIO-
2.002 1750
2.CC22500
2.0025250
I
1 .cccccoo
OF
2.OC27000=MAX-
FOLLOWING
Downloaded by KTH ROYAL INST OF TECHNOLOGY on August 31, 2015 | http://pubs.acs.org Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0063.ch002
X X X
50.CCOOOOO
X
I
Ο η w
In Validation of the Measurement Process; DeVoe, J.; ACS Symposium Series; American Chemical Society: Washington, DC, 1977.
I
I
«
.0
1 3
Figure 4.
14
A
NUMBER
T
NUMBER
STAT
•
S T
•
OF
OF
RUNS EXACTLY
I
OR
6.0417 1.5750 . 3208 .0538 .007 7 .0010 .0001 .nooo .0000 .0000 .0000 .0000 .0000 .0000
1 . 3962 1.0622 .5433 . 2299 .0874 . 0308 . 0 102 . 0032 .0010 ,0003 .000 1 . 0000 .0000 .0000
I
MORE
2.0696
LENGTH
16.5000
OF
SD(STAT)
UP
1.6539 .9997 .5003 .2132 .08 l θ .029 1 .0097 . 003 1 .0009 . 0003 .0001 .0000 .0000 .0000
EXP(STAT)
RUNS
4.4667 I .2542 . 267 1 .046 1 .0067 . 0008 .0001 .0000 . 0000 .0000 .0000 .0000 . 0000 .0000
3.2170
LENGTH
10.4583
OF
EXP(STAJ)
P
UP U
SD(STAT )
RUNS
-.03 1 .3*4 3.09 8. M7 22.79 64.85 195.70 31 1 . 6 4 1042.19 -.00 -.00 -.00 -.00 -.00
-H.59
( S Τ Α Τ - Ε X P ( S Τ Α Τ> ) / S O ( S T A T
-2.9*4 -.89 -.25 -.53 -.22 -.08 -.03 103.06 -.00 1087.63 -.00 -.00 -.00 -.00 -.00
A
X p 2
ΐΊ
+
e r r o r
e n
*j
where f i s an unknown f u n c t i o n r e l a t i n g the nonrandom v a r i a b l e s X] and X t o the response v a r i a b l e Y, and where Xn" and Xo-iindicate d i f f e r e n t d i s c r e t e l i m i t s w i t h i n the v a r i a b l e s )( ana X , r e s p e c t i v e l y . Although the p l o t o f Y versus K w i l l c e r t a i n l y give the a n a l y s t some i n d i c a t i o n of the nature o f the f u n c t i o n f , the main p o i n t i s whether and how the response i s a d d i t i o n a l l y a f f e c t e d by the second v a r i a b l e (say v a r i a b l e X ) f o r some (but not a l l ) values o f the v a r i a b l e X . I f the v a r i a b l e X a f f e c t s the response f o r only some (but not a l l ) of the values o f the v a r i a b l e X i n s t a t i s t i c a l terms t h i s i s r e f e r r e d t o as an " i n t e r a c t i o n " e x i s t i n g between X and X --thus the e f f e c t o f X on the response i s dependent on the value of X . Rather than apply the usual 2-factor a n a l y s i s of variance 2
1
±
2
2
x
2
l 5
x
2
In Validation of the Measurement Process; DeVoe, J.; ACS Symposium Series; American Chemical Society: Washington, DC, 1977.
2
x
In Validation of the Measurement Process; DeVoe, J.; ACS Symposium Series; American Chemical Society: Washington, DC, 1977.
CALL PLOTBD(Y,X,N,XDELTA) SPECIMAN HOMOGENERITY
Downloaded by KTH ROYAL INST OF TECHNOLOGY on August 31, 2015 | http://pubs.acs.org Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0063.ch002
2.
FiLLiBEN
Testing Basic Assumptions
47
Downloaded by KTH ROYAL INST OF TECHNOLOGY on August 31, 2015 | http://pubs.acs.org Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0063.ch002
(ANOVA) t o data from t h i s model, we apply the g r a p h i c a l procedure i l l u s t r a t e d i n f i g u r e 6. This g r a p h i c a l a n a l y s i s o f variance (GANOVA) (10,11) i s simply a p l o t of the response v a r i a b l e versus one f a c t o r , w i t h d i f f e r e n t l e v e l s of the second f a c t o r i n d i c a t e d by d i f f e r e n t types of p l o t characters w i t h i n the p l o t . The GANOVA procedure i s very r e v e a l i n g i n t h a t i t communicates a l l o f the l a t e n t r e l e v a n t information i n t h i s 2-factor system. This technique i s i l l u s t r a t e d i n f i g u r e 6 which p l o t s the r e s i d u a l s from a f i t o f the response v a r i a b l e (days t o f a i l u r e ) from a s t r e s s f a t i g u e experiment versus l a b (9 l a b s ) with the value o f t h e p l o t c h a r a c t e r representing v a r i o u s l e v e l s (3 l e v e l s ) o f a second v a r i a b l e (experiment c o n f i g u r a t i o n ) which could (but h y p o t h e t i c a l l y should not) a f f e c t the response. A p l o t c h a r a c t e r value o f 3, e.g., i n d i c a t e s t h a t t h i s p a r t i c u l a r data p o i n t was generated from c o n f i g u r a t i o n 3 of the experiment. The two independent v a r i a b l e s are: l a b o r a t o r y (9 l e v e l s — p l o t t e d h o r i z o n t a l l y ) , configuration (3 levels--denoted by d i f f e r e n t characters).
plot
Making reference t o f i g u r e 6, i t i s seen t h a t the assumption t h a t a l l l e v e l s o f the c o n f i g u r a t i o n f a c t o r a f f e c t s the response i n a uniform fashion i s untenable. I t i s c l e a r from the p l o t t h a t a lab-configuration interaction exists. For example, c o n f i g u r a t i o n s 2 and 3 y i e l d c o n s i s t e n t l y low values f o r l a b 4 w h i l e c o n f i g u r a t i o n 1 y i e l d s low and r a t h e r v a r i a b l e values f o r labs 5 and 7. A suspicious low observation a l s o i s seen t o e x i s t f o r lab 3, c o n f i g u r a t i o n 2. Such an augmented p l o t - - a s described above--is a useful technique f o r examining the assumption t h a t the response i s not dependent on some p a r t i c u l a r v a r i a b l e . I t i s t o be noted i n passing t h a t although the p l o t c h a r a c t e r i s the recommended procedure f o r conveying information about the second v a r i a b l e , one could a l s o j u s t as w e l l use the type o f l i n e f o r conveying the second-variable information. The former i s recommended when generating computer p r i n t e r p l o t s - which are by nature d i s c r e t e . The l a t t e r i s recommended when a continuous p r i n t i n g device ( i . e . , one capable o f drawing d i f f e r e n t types o f continuous l i n e s ) i s a v a i l a b l e . Figure 7 i l l u s t r a t e s the above l i n e - t y p e a l t e r n a t i v e with an example based on measured voltages from electrical connectors. The two independent v a r i a b l e s here are: time i n days ( p l o t t e d h o r i z o n t a l l y ) connector type (3 levels--denoted by d i f f e r e n t l i n e types) The m u l t i p l e replication.
l i n e s o f t h e same type a r e due t o experimental Although much can be s a i d about the p l o t and about
American Chemical Society In Validation of theLibrary Measurement Process; DeVoe, J.;
ACS Symposium Series; American Society: Washington, DC, 1977. 1155 16thChemical St., N.W.
VALIDATION OF T H E M E A S U R E M E N T PROCESS
Downloaded by KTH ROYAL INST OF TECHNOLOGY on August 31, 2015 | http://pubs.acs.org Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0063.ch002
48
CO
.ο SX
CUD
Ο
Ο
fi
·
2 — [0 Jj H
0)
+3 W
Ρ bû -P -H -H «M
Un
In Validation of the Measurement Process; DeVoe, J.; ACS Symposium Series; American Chemical Society: Washington, DC, 1977.
2.
Testing Basic Assumptions
FiLLiBEN
49
Downloaded by KTH ROYAL INST OF TECHNOLOGY on August 31, 2015 | http://pubs.acs.org Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0063.ch002
the r e l a t i v e e f f e c t s o f the two f a c t o r s on the response, we concentrate herein on v i o l a t i o n s o f b a s i c assumptions and note t h a t an apparent v i o l a t i o n i n the form of an o u t l i e r i s evident from the p l o t — n o t e how the f o u r t h data p o i n t of the bottom l i n e of the p l o t i s i n c o n s i s t e n t w i t h the other data l i n e s i n t h i s bottom group. This f o u r t h p o i n t i s c l e a r l y an o u t l i e r , and y e t i t s d e t e c t i o n may very e a s i l y have been l o s t i n the numerical mechanics o f a standard ANOVA. 3-VARIABLE GANOVA This 3 - v a r i a b l e GANOVA technique i s a p p l i c a b l e where a multifactor model i s a p p r o p r i a t e , i . e . , the u n d e r l y i n g hypothesized model i s o f the form (e.g., f o r three f a c t o r s ) : response Y.
j R
= constant c + B^.X^ +
B ^ . X ^ ^
+ e r r o r e. w i t h an a l t e r n a t i v e general model of the f o l l o w i n g : Y
= f
X
X
X
ijk ( li» 2j- 3k>
+
™
e
ijk
w i t h f unknown, where the doubly-subscripted B's r e f e r t o f a c t o r e f f e c t s and the doubly-subscripted X's r e f e r t o coded dummy l e v e l s o f each f a c t o r . Again, r a t h e r than apply the standard 3f a c t o r a n a l y s i s of variance (ANOVA) t o data from t h i s model, we apply the g r a p h i c a l procedure i l l u s t r a t e d i n f i g u r e 7. This graphical a n a l y s i s o f variance (GANOVA) (10,11) i s a g e n e r a l i z a t i o n o f the type of p l o t discussed i n s e c t i o n 6 and i s defined simply as a p l o t o f the response v a r i a b l e versus one f a c t o r , w i t h d i f f e r e n t l e v e l s of the second f a c t o r i n d i c a t e d by d i f f e r e n t types o f p l o t characters w i t h i n the p l o t , and w i t h the d i f f e r e n t l e v e l s of the t h i r d f a c t o r i n d i c a t e d by d i f f e r e n t types o f l i n e s w i t h i n the p l o t . The 3 - v a r i a b l e GANOVA conveniently communicates a t a glance a l l o f the r e l e v a n t information i n t h i s 3 - f a c t o r system. Figure 8 i l l u s t r a t e s the a p p l i c a t i o n of the 3 - v a r i a b l e GANOVA t o d r i l l t h r u s t f o r c e . These data are drawn from the e x c e l l e n t a r t i c l e by Hamaker (10) whose main p o i n t was e x a c t l y as we are emphasizing h e r e — v i z . , t h a t there e x i s t v a l u a b l e graphical alternatives t o the usual ANOVA. The three independent v a r i a b l e s are: d r i l l speed (5 l e v e l s — p l o t t e d h o r i z o n t a l l y ) m a t e r i a l (2 levels--denoted by d i f f e r e n t p l o t c h a r a c t e r s ) feed r a t e (3 l e v e l s — d e n o t e d by d i f f e r e n t l i n e types) R e f e r r i n g t o f i g u r e 8, one concludes t h a t v a r i a b l e s 2 ( m a t e r i a l ) and 3 (feed r a t e ) both a f f e c t the response i n a n o n - n e g l i g i b l e
In Validation of the Measurement Process; DeVoe, J.; ACS Symposium Series; American Chemical Society: Washington, DC, 1977.
VALIDATION
Downloaded by KTH ROYAL INST OF TECHNOLOGY on August 31, 2015 | http://pubs.acs.org Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0063.ch002
CALL GAN0V3(Y,F1,F2,F3,N)
O F T H E M E A S U R E M E N T PROCESS
ELECTRIC CONNECTORS
20-)—,—ι—,—ι—,—ι—,—ι—,—ι—,—ι ure 7.
0 20 40 60 80 100 120 UOLTAGE DROP UERSUS ELAPSED DAYS 6 / 2 3 / 7 5 JJF6.CLARK1 HOT/COLD COPPER OPEN PLUG 20 AMP BRASS » SOLID, STEEL = SHALL DASH , INNOUATIUE «= LARGE DASH
Three graphic analyses of variance for electrical connectors data
CALL GAN0V3(Y,F1,F2,F3,K)
1500 ι
HAMAKER ( 1 9 7 1 PIRT) DRILL THRUST FORCE
—
Review of the International Statistical Institute
Figure 8. Three graphic analyses of variance for Hamaker (10). Drill thrust force data; thrust force (v) vs. drill speed (x); plot character = material (2 levels); type of line = feed rate (3 levels).
In Validation of the Measurement Process; DeVoe, J.; ACS Symposium Series; American Chemical Society: Washington, DC, 1977.
2.
51
Testing Basic Assumptions
FiLLiBEN
way. The apparent e x i s t e n c e o f an o u t l i e r i s a l s o evident from the p l o t .
Downloaded by KTH ROYAL INST OF TECHNOLOGY on August 31, 2015 | http://pubs.acs.org Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0063.ch002
I t i s t o be noted t h a t due t o the necessary use o f l i n e types, the 3-variable GANOVA can be done only with the continuous p r i n t i n g devices. YOUDEN PLOT The Youden p l o t (12,13) i s a useful graphical technique most commonly a p p l i c a b l e t o i n t e r l a b o r a t o r y experiments when there e x i s t s e x a c t l y 2 runs ( o r 2 specimens, or 2 l e v e l s of some p a r t i c u l a r f a c t o r , e t c . ) t o be t e s t e d f o r a c e r t a i n property o f interest. I d e a l l y , i f t h e 2 runs are " a l i k e " and i f a l l l a b o r a t o r i e s are " a l i k e , " the response model should be as i n eq. ( 1 ) ; whereas i f a l a b o r a t o r y e f f e c t e x i s t e d , an appropriate model might be: response Y. = constant c + L. + e r r o r e. (where 1_· represents an e f f e c t due t o laboratory i - - i = 1, 2,...,k) and a d d i t i o n a l l y , i f both a laboratory e f f e c t and a run e f f e c t e x i s t e d , an a p p r o p r i a t e model might be: Ί
response Y. = constant c + L. + R. + e r r o r e^. (where R^ represents a run e f f e c t f o r run j — j = 1 , 2 ) . To t e s t which model i s appropriate, a Youden p l o t i s a p p l i e d which i s defined as a p l o t o f the k (where k = the number of l a b o r a t o r i e s i n the experiment) coordinate p a i r s : Y , Y ) (Y2i> ^ 2 2 ) » · · · ( k l > k 2 ) > where Y-jj represents the measured values obtained from l a b o r a t o r y i ( i = 1, 2, ... , k) on run j ( j = 1, 2). 13L
Y
i 2
5
Y
To f a c i l i t a t e the graphical a n a l y s i s , the p l o t c h a r a c t e r i s again used t o "pack" i n e x t r a i n f o r m a t i o n — i n t h i s case, about the laboratory f a c t o r . Thus, e.g., a p l o t character o f 4 i n d i c a t e s t h a t the measurement i n question came from l a b o r a t o r y 4. The Youden p l o t i s i l l u s t r a t e d i n f i g u r e 9 as a p p l i e d t o data from an ASTM s t r e s s c o r r o s i o n experiment where 7 ( k ) l a b o r a t o r i e s were being t e s t e d . I f no l a b o r a t o r y o r run e f f e c t s e x i s t e d , the r e s u l t i n g Youden p l o t w i l l appear as a random 2-dimensional s c a t t e r o f points. A l t e r n a t i v e l y , i f l a b o r a t o r y and/or run e f f e c t s do e x i s t , much useful information about the nature o f such e f f e c t s can be gleaned from the r e s u l t i n g p l o t . The p l o t i n f i g u r e 9 a c t u a l l y i s based on 7 χ 5 = 35 p l o t points (not a l l o f which In Validation of the Measurement Process; DeVoe, J.; ACS Symposium Series; American Chemical Society: Washington, DC, 1977.
VALIDATION O F T H E
MEASUREMENT
Downloaded by KTH ROYAL INST OF TECHNOLOGY on August 31, 2015 | http://pubs.acs.org Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0063.ch002
52
In Validation of the Measurement Process; DeVoe, J.; ACS Symposium Series; American Chemical Society: Washington, DC, 1977.
PROCESS
2.
FiLLiBEN
53
Testing Basic Assumptions
Downloaded by KTH ROYAL INST OF TECHNOLOGY on August 31, 2015 | http://pubs.acs.org Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0063.ch002
appear due t o computer p r i n t e r o v e r s t r i k i n g ) . The m u l t i p l i c i t y of 5 i s due t o the e x i s t e n c e o f 5 r e p l i c a t i o n s per lab--such r e p l i c a t i o n s pose no problems i n u t i l i z i n g the Youden p l o t . With respect t o how t o i n t e r p r e t a Youden p l o t , several c h a r a c t e r i s t i c s are t o be noted. A displacement o f p o i n t s from the same l a b o r a t o r y along the 45° diagonal i s i n d i c a t i v e t h a t t h i s l a b o r a t o r y i s c o n s i s t e n t l y generating low ( o r high) readings r e l a t i v e t o the other l a b o r a t o r i e s ( t h e c l u s t e r o f l a b o r a t o r y 4 p o i n t s i n f i g u r e 9 i s i l l u s t r a t i v e of t h i s negative l a b o r a t o r y b i a s ) . On the other hand, a c l u s t e r o f p o i n t s from the same l a b o r a t o r y d i s p l a c e d o f f the diagonal represents i n c o n s i s t e n t readings by t h a t l a b o r a t o r y from one run t o the next. Figure 9 i n d i c a t e s t h a t l a b o r a t o r y 3 has such an i n t e r - r u n v a r i a b i l i t y problem. For l a b o r a t o r y 3, the readings f o r run 1 are c o n s i s t e n t l y higher than those f o r run 2. The Youden p l o t i s a s i m p l e — y e t extremely method f o r a n a l y z i n g i n t e r l a b o r a t o r y data.
effective--
EXAMINING DISTRIBUTIONAL INFORMATION The d i s c u s s i o n has already touched on three (randomness, f i x e d l o c a t i o n , f i x e d v a r i a t i o n ) o f the four assumptions t y p i c a l l y made about a measurement process. The f o u r t h assumption ( f i x e d d i s t r i b u t i o n ) w i l l now be addressed. From a s t a t i s t i c a l p o i n t of view, there are f i v e reasons why d i s t r i b u t i o n a l i n f o r m a t i o n should be r o u t i n e l y checked: 1. optimal parameters ;
estimators
f o r location
and
variation
2. v a l i d i t y of c r i t i c a l values used i n s t a t i s t i c a l t e s t s o f significance; 3.
assessment of goodness o f f i t i n r e g r e s s i o n ;
4. e x i s t e n c e of o u t l i e r s ; 5. assessment o f whether the measurement process control.
is in
The l a s t o f these reasons (assessment o f whether a measurement process i s i n s t a t i s t i c a l c o n t r o l ) i s the main one w i t h respect to the o v e r a l l purpose o f t h i s paper. The f i r s t four reasons provide a d d i t i o n a l m o t i v a t i o n f o r checking distributional assumptions, and w i l l be i n d i v i d u a l l y touched on a t t h i s time.
In Validation of the Measurement Process; DeVoe, J.; ACS Symposium Series; American Chemical Society: Washington, DC, 1977.
Downloaded by KTH ROYAL INST OF TECHNOLOGY on August 31, 2015 | http://pubs.acs.org Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0063.ch002
54
VALIDATION OF T H E M E A S U R E M E N T PROCESS
The f i r s t (optimal estimators) p o i n t r e f e r s to the case where one i s i n t e r e s t e d i n e s t i m a t i n g from a given data s e t the l o c a t i o n parameter c and v a r i a t i o n ( d i s p e r s i o n or s c a l e ) parameter σ i n the model described i n eq. (1). ( I t i s assumed t h a t the e r r o r , e^ i s a random v a r i a b l e w i t h mean 0 and (unknown) standard d e v i a t i o n , σ. ) Various estimators of c would, f o r example, include the usual sample mean of η observations c = SY|/n, the sample median (c = the middle observation i n the ordered s e t of o b s e r v a t i o n s ) , or the sample midrange (c = the average of the s m a l l e s t and l a r g e s t o b s e r v a t i o n s ) . It is a statistical "fact-of-life" t h a t i n e s t i m a t i n g l o c a t i o n and v a r i a t i o n parameters, the goodness (accuracy) of a p a r t i c u l a r estimator and the choice of an optimal estimator are dependent on the underlying d i s t r i b u t i o n . For example, i f the underlying d i s t r i b u t i o n which generated the data set were the bell-shaped (normal or Gaussian), the best estimator of c would be the sample mean. However, i f the underlying d i s t r i b u t i o n were uniform ( i . e . , i t had a f l a t - - r a t h e r than bell-shaped p r o b a b i l i t y f u n c t i o n ) , it can be t h e o r e t i c a l l y demonstrated that the sample midrange, c = ( s m a l l e s t + l a r g e s t ) / 2 i s a much more accurate estimator of c than the sample mean. A l t e r n a t i v e l y , i f the underlying d i s t r i b u t i o n f o r the data were, e.g., very "longt a i l e d " l i k e the Cauchy ( i ^ j e . , the p r o b a b i l i t y f u n c t i o n i s b e l l shaped but higher valued i n the t a i l s than the normal), then theory d i c t a t e s and p r a c t i c e confirms t h a t the sample median i s a much more accurate estimator of c than e i t h e r the sample mean or the simple midrange. Thus, i t i s seen t h a t f o r e s t i m a t i n g the constant c i n the s i m p l e s t p o s s i b l e response model (Y = c + e ) , a necessary p r e l i m i n a r y step i s t o "estimate" the underlying distribution. Although the c e n t r a l l i m i t theorem provides a t h e o r e t i c a l b a s i s f o r suggesting t h a t f o r many p h y s i c a l science experiments, the normal d i s t r i b u t i o n "should" be the underlying d i s t r i b u t i o n , such normality should never be a u t o m a t i c a l l y assumed. As w i l l be seen i n the remaining s e c t i o n s , s t a t i s t i c a l techniques do e x i s t which a l l o w the a n a l y s t t o e a s i l y and r o u t i n e l y check such d i s t r i b u t i o n a l models. The second reason why d i s t r i b u t i o n a l information should be checked deals w i t h the v a l i d i t y o f t e s t s t a t i s t i c s . In the m u l t i f a c t o r s t a t i s t i c a l techniques r e f e r r e d t o as r e g r e s s i o n and a n a l y s i s of v a r i a n c e , there are a v a r i e t y of t e s t s t a t i s t i c s (mostly t and F s t a t i s t i c s ) which are a p p l i e d t o t e s t the s i g n i f i c a n c e of various f a c t o r s i n the m u l t i - f a c t o r model. I t i s an important s t a t i s t i c a l f a c t t h a t the v a l i d i t y of these t e s t s t a t i s t i c s holds only i f the r e s i d u a l s ( d e v i a t i o n s ) a f t e r the f i t are normally d i s t r i b u t e d . That i s t o say, i t i s the d i s t r i b u t i o n a l c h a r a c t e r i s t i c of the r e s i d u a l s a f t e r the f i t t h a t d i c t a t e the v a l i d i t y of the t and F s t a t i s t i c s . I f the t r u e underlying d i s t r i b u t i o n of the r e s i d u a l s i s non-normal, t h i s w i l l a f f e c t the t r u e s i g n i f i c a n c e l e v e l s of the t e s t s t a t i s t i c s . The net r e s u l t i s t h a t u l t i m a t e l y the conclusions about the
In Validation of the Measurement Process; DeVoe, J.; ACS Symposium Series; American Chemical Society: Washington, DC, 1977.
2.
FiLLiBEN
Testing Basic Assumptions
55
Downloaded by KTH ROYAL INST OF TECHNOLOGY on August 31, 2015 | http://pubs.acs.org Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0063.ch002
s i g n i f i c a n c e o f various f a c t o r s i n r e g r e s s i o n and ANOVA may be incorrect. Again, as emphasized before, no b l i n d assumptions need be made about the d i s t r i b u t i o n o f such residuals. Techniques w i l l be demonstrated t o a l l o w the d i s t r i b u t i o n t o be r o u t i n e l y checked. The t h i r d reason f o r checking d i s t r i b u t i o n a l i n f o r m a t i o n i s r e l a t e d t o the aforementioned r e g r e s s i o n and ANOVA. The p o i n t t o be emphasized i s t h a t an a d d i t i o n a l important reason f o r examining the d i s t r i b u t i o n o f r e s i d u a l s a f t e r t h e f i t i s t o determine whether o r not one has a r r i v e d a t a reasonable d e t e r m i n i s t i c o r f u n c t i o n a l model f o r the data. I f the f i t t e d r e g r e s s i o n or ANOVA model i s c o r r e c t , the r e s i d u a l s a f t e r the f i t should i d e a l l y have the same four p r o p e r t i e s as has been p r e v i o u s l y discussed w i t h respect t o the u n i v a r i a t e response variable, viz. : random fixed location fixed variation fixed distribution In a large m a j o r i t y of cases, the r e s i d u a l s a f t e r the c o r r e c t f i t w i l l not only f o l l o w some f i x e d d i s t r i b u t i o n , but w i l l a l s o rather specifically follow a normal distribution. The i m p l i c a t i o n o f course i s t h a t i n order t o assess whether o r not one has a c o r r e c t f i t , one ought t o examine the d i s t r i b u t i o n o f the r e s i d u a l s t o check f o r such normality. Though not a s u f f i c i e n t c o n d i t i o n i n i t s e l f f o r adequate f i t , the normality of the r e s i d u a l s serves as a p r a c t i c a l necessary c o n d i t i o n which may p r o f i t a b l y be used i n determining model adequacy. From a pragmatic p o i n t o f view, t h i s t h i r d reason f o r examining d i s t r i b u t i o n a l information i s an extremely important one. The f o u r t h reason f o r checking d i s t r i b u t i o n a l information deals w i t h the o u t l i e r problem. How does one t e l l i f a suspicious-looking observation i s i n fact an o u t l i e r ? ( " O u t l i e r " as here used r e f e r s t o an observation t h a t was generated from a d i f f e r e n t model o r a d i f f e r e n t d i s t r i b u t i o n than was the main "body" o f the data.) Frequently, an o u t l i e r w i l l manifest i t s e l f i n one o r another o f the p l o t s already discussed i n previous s e c t i o n s . However, an a d d i t i o n a l and a t times more s e n s i t i v e check i s given by a d e t a i l e d examination o f the d i s t r i b u t i o n of the data. An observation which appears t o be a b o r d e r l i n e o u t l i e r i n some previous p l o t s f r e q u e n t l y turns out to be a w e l l - d e f i n e d o u t l i e r when examined r e l a t i v e t o the d i s t r i b u t i o n o f the r e s t o f the data. The same numerical observation may very well be a " t y p i c a l " extreme observation r e l a t i v e t o one d i s t r i b u t i o n but an o u t l i e r r e l a t i v e t o another d i s t r i b u t i o n . By examining the d i s t r i b u t i o n of the data (and/or the r e s i d u a l s a f t e r a f i t ) , the a n a l y s t gives himself a much more s e n s i t i v e t o o l f o r o u t l i e r d e t e c t i o n and i d e n t i f i c a t i o n . In Validation of the Measurement Process; DeVoe, J.; ACS Symposium Series; American Chemical Society: Washington, DC, 1977.
Downloaded by KTH ROYAL INST OF TECHNOLOGY on August 31, 2015 | http://pubs.acs.org Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0063.ch002
56
VALIDATION OF T H E M E A S U R E M E N T PROCESS
The f i f t h and f i n a l p o i n t w i t h respect t o the importance of checking f o r d i s t r i b u t i o n a l i n f o r m a t i o n deals w i t h the main p o i n t of t h i s p a p e r — p r e d i c t a b i l i t y and the determination of whether a process i s " i n c o n t r o l . " P r e d i c t a b i l i t y means being able t o make p r o b a b i l i t y statements about f u t u r e output from the process. These p r o b a b i l i t y statements w i l l most commonly r e f e r to expected v a r i a t i o n (about some t y p i c a l value) of output from the process. The main p o i n t i s t h a t such p r o b a b i l i t y statements w i l l change depending on the t r u e u n d e r l y i n g d i s t r i b u t i o n of the process. A statement such as: "97-1/2% of the f u t u r e observations from t h i s measurement process should f a l l w i t h i n (approximately) 3 standard d e v i a t i o n s of the mean"
w i l l of course be t r u e i f the u n d e r l y i n g generating d i s t r i b u t i o n i s normal but on the other hand w i l l be f a l s e i f the u n d e r l y i n g d i s t r i b u t i o n i s ( f o r example) uniform, Cauchy or e x p o n e n t i a l . I t i s important f o r a n a l y s t s t o keep i n mind t h a t f o r non-normal d i s t r i b u t i o n s , a p r o b a b i l i t y statement about expected f u t u r e occurrences (e.g., w i t h i n two standard d e v i a t i o n s of the mean) w i l l change from d i s t r i b u t i o n t o d i s t r i b u t i o n . The exact proba b i l i t y value (= 97-1/2% f o r the normal) must be (and can be) determined once the u n d e r l y i n g d i s t r i b u t i o n i s determined. I t is a r e c u r r i n g requirement t o "estimate" the u n d e r l y i n g distribution. With these motivations and j u s t i f i c a t i o n s f o r examining d i s t r i b u t i o n a l i n f o r m a t i o n , the next two s e c t i o n s w i l l present various data a n a l y s i s techniques t o c a r r y out such examinations. PROBABILITY PLOTS A p r o b a b i l i t y p l o t (14,15,16,17,18,19,20,21) i s a g r a p h i c a l t o o l f o r a s s e s s i n g the goodness of f i t of some hypothesized d i s t r i b u t i o n (e.g., normal, uniform, Poisson, e t c . ) t o an observed data set. In d e s c r i b i n g a p r o b a b i l i t y p l o t , i t w i l l be assumed t h a t the model i s as i n d i c a t e d i n eq. ( 1 ) . However, i t i s t o be kept i n mind t h a t the p r o b a b i l i t y p l o t technique has much g r e a t e r g e n e r a l i t y inasmuch as i t can be a p p l i e d t o the r e s i d u a l s a f t e r any m u l t i f a c t o r f i t as w e l l as t o the raw observations from the simple Y. = c + e.. model. A p r o b a b i l i t y p l o t i s ( i n general) simply a p l o t of the observed ordered ( s m a l l e s t t o l a r g e s t ) observations Y j on the vertical axis versus the corresponding typical ordered observations based on whatever d i s t r i b u t i o n i s being hypothesized. Thus, f o r example, i f one were forming a normal p r o b a b i l i t y p l o t , the f o l l o w i n g η coordinate p l o t p o i n t s would
In Validation of the Measurement Process; DeVoe, J.; ACS Symposium Series; American Chemical Society: Washington, DC, 1977.
2.
Testing Basic Assumptions
FiLLiBEN
57
be formed: ( Y M J , Y , M ) , . . . ( Y , M ) where Y i s the observed s m a l l e s t data p o i n t , and M i s the t h e o r e t i c a l "expected" value o f the s m a l l e s t data p o i n t from a sample o f s i z e η normally d i s t r i b u t e d p o i n t s . S i m i l a r l y , Y would be the second s m a l l e s t observed value and M would be the "expected value" o f the second s m a l l e s t o b s e r v a t i o n i n a sample o f s i z e η normally d i s t r i b u t e d p o i n t s . This proceeds up t o Y which would be the l a r g e s t observed data value and M would be the "expected value" of the l a r g e s t observation i n a sample o f s i z e η from a normal d i s t r i b u t i o n . Thus, i n forming a normal p r o b a b i l i t y p l o t , the v e r t i c a l a x i s values depend only on the observed data, w h i l e the horizontal a x i s values a r e generated independently o f the observed data and depend only on the t h e o r e t i c a l d i s t r i b u t i o n being t e s t e d o r hypothesized ( n o r m a l i t y i n t h i s case) and a l s o the value o f the sample s i z e n. A p r o b a b i l i t y p l o t i s thus i n s i m p l e s t terms a p l o t o f the observed versus the t h e o r e t i c a l o r "expected." i $
2
2
n
1
n
x
2
2
Downloaded by KTH ROYAL INST OF TECHNOLOGY on August 31, 2015 | http://pubs.acs.org Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0063.ch002
n
t
h
The crux o f the p r o b a b i l i t y p l o t i s t h a t the i ordered observation i n a sample o f s i z e η from some d i s t r i b u t i o n i s i t s e l f a random v a r i a b l e which has a d i s t r i b u t i o n unto i t s e l f . This d i s t r i b u t i o n o f the i^h ordered o b s e r v a t i o n can be t h e o r e t i c a l l y d e r i v e d and summarized ( i . e . , mapped i n t o a s i n g l e " t y p i c a l value") as can any other random v a r i a b l e . One can then pose the r e l e v a n t question as t o what s i n g l e number best t y p i f i e s the d i s t r i b u t i o n a s s o c i a t e d w i t h a given ordered o b s e r v a t i o n i n a sample o f s i z e , n. A computational disadvantage t o the use o f the mean i s t h a t d i f f e r e n t i n t e g r a t i o n techniques may be needed f o r d i f f e r e n t types o f d i s t r i b u t i o n . For some d i s t r i b u t i o n s the mathematical i n t e g r a t i o n does not e x i s t . These c o n s i d e r a t i o n s d i c t a t e t h a t the median i s s u p e r i o r t o the mean i n terms o f forming a t h e o r e t i c a l "expected" o r " t y p i c a l " value t o summarize the e n t i r e d i s t r i b u t i o n o f the l'th ordered o b s e r v a t i o n i n a sample o f s i z e η from the d i s t r i b u t i o n being t e s t e d . Thus, t o be p r e c i s e , the M-j on the h o r i z o n t a l a x i s of the p r o b a b i l i t y p l o t i s taken t o be the median o f the d i s t r i b u t i o n o f the i ™ ordered observation i n a sample o f s i z e η from whatever u n d e r l y i n g d i s t r i b u t i o n i s being t e s t e d . I t i s t o be noted t h a t the s e t o f Mj as a whole w i l l change from one hypothesized d i s t r i b u t i o n t o another--and t h e r e i n l i e s the distributional s e n s i t i v i t y o f the p r o b a b i l i t y plot technique. For example, i f the hypothesized d i s t r i b u t i o n i s uniform, then a uniform p r o b a b i l i t y p l o t would be formed and the w i l l be approximately equi-spaced t o r e f l e c t the f l a t nature of the uniform p r o b a b i l i t y d e n s i t y f u n c t i o n . On the other hand, i f the hypothesized d i s t r i b u t i o n i s normal, then the Mj w i l l have a r a t h e r sparse spacing f o r the f i r s t few ( M , M , M ,...) and l a s t few (..., M __ , M , M ) values but w i l l become more densely spaced as one proceeds xoward the middle of the s e t (... , ^-1/2» n/2> n + l / 2 behavior f o r the M-j i s of x
n
M
M
2
2
3
n - 1
S u c h
In Validation of the Measurement Process; DeVoe, J.; ACS Symposium Series; American Chemical Society: Washington, DC, 1977.
VALIDATION OF
58
Downloaded by KTH ROYAL INST OF TECHNOLOGY on August 31, 2015 | http://pubs.acs.org Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0063.ch002
course r e f l e c t i n g density function.
the
bell-shape
of
THE
the
MEASUREMENT
normal
PROCESS
probability
In summary, f o r a s p e c i f i c hypothesized d i s t r i b u t i o n , D Q the l'th value Mj i n the corresponding p r o b a b i l i t y p l o t i s a t h e o r e t i c a l (but computable) value c l o s e to what one t y p i c a l l y would "expect" f o r the value of the i order observation i f i n f a c t one had taken a random sample of s i z e η from the distribution D . ο How does one use and i n t e r p r e t p r o b a b i l i t y p l o t s ? In l i g h t of the above, i t i s seen t h a t i f i n f a c t the observed data do have a d i s t r i b u t i o n t h a t the a n a l y s t has hypothesized, then (except f o r an unimportant l o c a t i o n and scale f a c t o r which can be determined a f t e r the f a c t ) the Y-j and Μ· w i l l be n e a r - i d e n t i c a l f o r a l l i , t h a t i s , over the e n t i r e set. Consequently, the p l o t of Y j versus Μ · w i l l be n e a r - l i n e a r . This l i n e a r i t y i s the dominant f e a t u r e to be checked f o r i n any p r o b a b i l i t y p l o t . A linear probability plot indicates t h a t the hypothesized d i s t r i b u t i o n , D gives a good d i s t r i b u t i o n a l f i t to the observed data set. This combination of s i m p l i c i t y of use along w i t h distributional s e n s i t i v i t y makes the probability plot an extremely powerful t o o l f o r data a n a l y s i s . η
η
Q
The next l o g i c a l question to be examined i s what w i l l the p r o b a b i l i t y p l o t look l i k e i f the hypothesized d i s t r i b u t i o n , D i s not c o r r e c t - - ! . e . , i f the u n d e r l y i n g d i s t r i b u t i o n t h a t generated the data i s not the same as the d i s t r i b u t i o n , D hypothesized by the a n a l y s t . In t h i s case, the Υ· and will not match over the e n t i r e set and so the r e s u l t i n g p r o b a b i l i t y p l o t w i l l be nonlinear. A very useful aspect of the p r o b a b i l i t y p l o t i s t h a t the type of n o n l i n e a r i t y e x h i b i t e d by a given p r o b a b i l i t y p l o t w i l l give the a n a l y s t useful i n f o r m a t i o n as to how the d i s t r i b u t i o n a l hypothesis, D should be adjusted so as to a r r i v e at a b e t t e r d i s t r i b u t i o n a l f i t to the data. This l a s t p o i n t i s an important asset of the p r o b a b i l i t y p l o t technique f o r t e s t i n g assumptions i n d i s t r i b u t i o n . For example, i f the a n a l y s t b e l i e v e s t h a t the true underlying d i s t r i b u t i o n i s i n general a symmetric d i s t r i b u t i o n ( i . e . , a d i s t r i b u t i o n which has a p r o b a b i l i t y f u n c t i o n as i l l u s t r a t e d i n f i g . 10) as opposed to a skewed d i s t r i b u t i o n (e.g., w i t h a p r o b a b i l i t y f u n c t i o n as i l l u s t r a t e d i n f i g . 11), then the p r o b a b i l i t y p l o t a n a l y s i s to be p r e s e n t l y d e s c r i b e d i s r a t h e r t y p i c a l . The f i r s t step i n such a n a l y s i s i s u s u a l l y to t e s t the normal d i s t r i b u t i o n hypothesis (the normal being the most commonly-employed symmetric d i s t r i b u t i o n ) by forming a normal p r o b a b i l i t y p l o t . In forming such a p l o t , l e t us c o n s i d e r the f o l l o w i n g f i v e types of most commonly-encountered appearances of the normal p r o b a b i l i t y p l o t : l i n e a r , S-shaped, N-shaped, nonsymmetric c r o s s - o v e r , and convex (see f i g . 12). 0
0
Ί
Q
In Validation of the Measurement Process; DeVoe, J.; ACS Symposium Series; American Chemical Society: Washington, DC, 1977.
Testing Basic Assumptions
Downloaded by KTH ROYAL INST OF TECHNOLOGY on August 31, 2015 | http://pubs.acs.org Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0063.ch002
FiLLiBEN
In Validation of the Measurement Process; DeVoe, J.; ACS Symposium Series; American Chemical Society: Washington, DC, 1977.
VALIDATION OF T H E M E A S U R E M E N T PROCESS
60
I f the normal p r o b a b i l i t y p l o t has the l i n e a r appearance of f i g u r e 12a, t h i s i n d i c a t e s t h a t the normal d i s t r i b u t i o n y i e l d s an acceptably good f i t t o the data; so no f u r t h e r p r o b a b i l i t y p l o t s need be formed and the d i s t r i b u t i o n a n a l y s i s i s completed. I f the normal p r o b a b i l i t y p l o t has the S-shaped appearance of f i g u r e 12b, t h i s i n d i c a t e s t h a t the D = normal hypothesis i s i n c o r r e c t , and t h a t the t r u e underlying d i s t r i b u t i o n f o r the data i s symmetric but i s s h o r t e r - t a i l e d than normal. Examples of such symmetric d i s t r i b u t i o n s , s h o r t e r - t a i l e d than normal, would be a U-shaped d i s t r i b u t i o n , a uniform d i s t r i b u t i o n , or a truncated bell-shaped d i s t r i b u t i o n . (These three d i s t r i b u t i o n s have p r o b a b i l i t y f u n c t i o n s as i l l u s t r a t e d i n f i g . 13.) In such a case, the second i t e r a t i o n by the a n a l y s t would be t o form an a d d i t i o n a l p r o b a b i l i t y p l o t f o r some s h o r t e r - t a i l e d d i s t r i b u t i o n (e.g., from a uniform p r o b a b i l i t y p l o t ) . I f the r e s u l t i n g uniform p r o b a b i l i t y p l o t i s s t i l l S-shaped, the t h i r d i t e r a t i o n i s t o form a p r o b a b i l i t y p l o t f o r a d i s t r i b u t i o n t h a t i s even s h o r t e r - t a i l e d than uniform (e.g., some U-shaped d i s t r i b u t i o n ) . On the other hand, i f the uniform p r o b a b i l i t y p l o t has a form as i n f i g u r e 12c (and which w i l l be represented very crudely as an "N shape"), the t h i r d i t e r a t i o n would be t o form a p r o b a b i l i t y p l o t f o r some d i s t r i b u t i o n s h o r t e r - t a i l e d than normal but l o n g e r - t a i l e d than uniform. Such i t e r a t i o n i s continued u n t i l there i s convergence t o an acceptable l i n e a r p r o b a b i l i t y p l o t . In p r a c t i c e , the a n a l y s i s w i l l u s u a l l y converge t o an acceptable d i s t r i b u t i o n i n a r e l a t i v e l y small number of i t e r a t i o n s .
Downloaded by KTH ROYAL INST OF TECHNOLOGY on August 31, 2015 | http://pubs.acs.org Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0063.ch002
0
To consider another p o s s i b i l i t y , i f the o r i g i n a l normal p r o b a b i l i t y p l o t has the "N-shaped" appearance o f f i g u r e 12c, t h i s suggests t h a t the D = normal hypothesis i s i n c o r r e c t , and t h a t the true underlying d i s t r i b u t i o n f o r the data i s s t i l l symmetric but i s l o n g e r - t a i l e d than normal. An example would be the Cauchy ( a l s o known as the Lorentzian) d i s t r i b u t i o n which i s a bell-shaped d i s t r i b u t i o n whose " t a i l s " are "longer" or " f a t t e r " than the normal. Figure lOd i l l u s t r a t e s the p r o b a b i l i t y d e n s i t y f u n c t i o n f o r the Cauchy d i s t r i b u t i o n . The t y p i c a l nature of l o n g - t a i l e d d i s t r i b u t i o n s l i k e the Cauchy i s t h a t i f the measurement process i s generating data from such a d i s t r i b u t i o n , i t i s more l i k e l y t o generate some observations which are c o n s i d e r a b l y removed from the "body" of the data than i n sampling from a more m o d e r a t e - t a i l e d d i s t r i b u t i o n (such as the normal). As b e f o r e , since the o r i g i n a l normal p r o b a b i l i t y p l o t was not l i n e a r , the a n a l y s t should perform the i t e r a t i v e a n a l y s i s t o produce a l o n g e r - t a i l e d probability plot ( l i k e a Cauchy p r o b a b i l i t y p l o t ) . I f t h i s second p l o t i s l i n e a r , t h i s i m p l i e s t h a t the Cauchy y i e l d s an acceptable d i s t r i b u t i o n . I f t h i s second p l o t i s not l i n e a r , other i t e r a t i o n s on D must be made based on the S-shaped or N-shaped appearance of the Cauchy p r o b a b i l i t y p l o t . A r o u t i n e computerized procedure t o c a r r y out such i t e r a t i o n s f o r the symmetric f a m i l y of d i s t r i b u t i o n s w i l l be presented i n s e c t i o n 11. 0
0
In Validation of the Measurement Process; DeVoe, J.; ACS Symposium Series; American Chemical Society: Washington, DC, 1977.
Downloaded by KTH ROYAL INST OF TECHNOLOGY on August 31, 2015 | http://pubs.acs.org Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0063.ch002
2.
FiLLiBEN
Testing Basic Assumptions
χ
χχχχχχχΧ
61
χχΧΧΧ
Figure 12. Typical shapes of probability plots, (a.) Linear; (b.) s-shaped; (d.) nonsymmetric crossover; (e.) convex.
Figure 13. Distributions shorter-tailed than normal, a. Tukey λ = 1.5 distribution (very short-tailed); b. uniform distribution (shorttailed); c. truncated normal distribution (moderate/short-tailed); d. normal distribution (moderate-tailed).
In Validation of the Measurement Process; DeVoe, J.; ACS Symposium Series; American Chemical Society: Washington, DC, 1977.
VALIDATION OF
62
THE
M E A S U R E M E N T PROCESS
I f the o r i g i n a l normal p r o b a b i l i t y p l o t has the appearance of f i g u r e 12d where the diagonal l i n e d i v i d e d the data p o i n t s on e i t h e r side unequally or as i n 12e where the diagonal l i n e does not d i v i d e the data at a l l , t h i s i s i n d i c a t i v e t h a t not only may the s p e c i f i c hypothesis t h a t D = normal, be i n c o r r e c t , but a l s o t h a t the hypothesis of a symmetric d i s t r i b u t i o n may be i n c o r r e c t . In such a case, the t r u e u n d e r l y i n g d i s t r i b u t i o n f o r the data would then be some type of skewed d i s t r i b u t i o n (e.g. , of the types w i t h p r o b a b i l i t y d e n s i t y f u n c t i o n s as i l l u s t r a t e d i n f i g u r e 11). In forming a d d i t i o n a l p r o b a b i l i t y p l o t s to f i t the data, the a n a l y s t should consequently consider d i s t r i b u t i o n s which are skewed. To enumerate but a few of the skewed d i s t r i b u t i o n s t h a t might be considered i n subsequent i t e r a t i o n s , one would i n c l u d e the log-normal d i s t r i b u t i o n , the half-normal d i s t r i b u t i o n , the exponential d i s t r i b u t i o n , the Weibull f a m i l y of d i s t r i b u t i o n s , the extreme value f a m i l y of d i s t r i b u t i o n s , and the Pareto f a m i l y of d i s t r i b u t i o n s . For an e x c e l l e n t general d e s c r i p t i o n of various d i s t r i b u t i o n s and d i s t r i b u t i o n a l f a m i l i e s (both skewed and symmetric) the reader i s r e f e r r e d to the comprehensive t e x t s by Johnson and Kotz (22,23).
Downloaded by KTH ROYAL INST OF TECHNOLOGY on August 31, 2015 | http://pubs.acs.org Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0063.ch002
0
One f i n a l p o i n t regarding o u t l i e r - d e t e c t i o n i s noteworthy. I f i n forming, f o r example, a normal p r o b a b i l i t y p l o t , the p l o t turns out to be l i n e a r w i t h the exception of one or two p o i n t s (see f i g . 14), how i s t h i s to be i n t e r p r e t e d ? This type of p l o t i s i n d i c a t i n g t h a t the normal f i t i s acceptable f o r most of the data but t h a t one or two p o i n t s are o u t l i e r s and do not seem to agree w i t h the normality assumption. The p r o b a b i l i t y p l o t i s thus seen to be usable f o r d e t e c t i n g o u t l i e r s . The next step i n the a n a l y s i s i s f o r the a n a l y s t to d e l e t e the one or two o f f e n d i n g p o i n t s and to form a p r o b a b i l i t y p l o t w i t h the remaining p o i n t s . I f t h i s second p l o t i s s t i l l s t r o n g l y l i n e a r , t h i s gives a d d i t i o n a l support to the hypothesis t h a t the data are n o r m a l l y - d i s t r i b u t e d and t h a t the one or two questionable p o i n t s are i n f a c t o u t l i e r s . The use of the p r o b a b i l i t y p l o t as a t o o l f o r o u t l i e r d e t e c t i o n i s g e n e r a l l y more s e n s i t i v e than any of the techniques discussed i n previous s e c t i o n s . The experimenter i s a l s o reminded t h a t although such o u t l i e r s may be deleted from f u r t h e r a n a l y s i s , these o u t l i e r s e x i s t " f o r a reason" and the experimenter ought to s a t i s f y himself t h a t he has determined what set of experimental circumstances had l e d to them. The examination of o u t l i e r s almost i n v a r i a b l y leads to improved design of the experiment and ultimately to an improved understanding of the experimental f a c t o r s which prevent a measurement process from being " i n c o n t r o l . " Having discussed what a p r o b a b i l i t y p l o t i s and how one i s to be interpreted, we now enumerate b r i e f l y some of the advantages of using a p r o b a b i l i t y p l o t as opposed to other methods of checking f o r d i s t r i b u t i o n a l information (e.g., histogram, χ s t a t i s t i c , f i t to p r o b a b i l i t y d e n s i t y f u n c t i o n ) . 2
In Validation of the Measurement Process; DeVoe, J.; ACS Symposium Series; American Chemical Society: Washington, DC, 1977.
Downloaded by KTH ROYAL INST OF TECHNOLOGY on August 31, 2015 | http://pubs.acs.org Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0063.ch002
FiLLiBEN
Testing Basic Assumptions
iI I
Ο
Ο
Ν
Ί
0>
ο ro
(Λ W HI U
V) W
I
I I I ι ι ι
Ο
^5
V
ζ
00 < ω Η _ι «Η • UJ ο _) α -· α α π
Ν
"δ.
>- υ
*-Η 5
m >ο Η α ·α -ΐ < Ζ et • ζ
< (D Ο α α
In Validation of the Measurement Process; DeVoe, J.; ACS Symposium Series; American Chemical Society: Washington, DC, 1977.
64
V A L I D A T I O N OF
THE
MEASUREMENT
PROCESS
Although i t w i l l be shown t h a t the p r o b a b i l i t y p l o t technique i s to be highly recommended, the various techniques are complementary. An o u t l i n e of the advantages of the p r o b a b i l i t y p l o t approach i s as f o l l o w s :
Downloaded by KTH ROYAL INST OF TECHNOLOGY on August 31, 2015 | http://pubs.acs.org Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0063.ch002
Graphical Technique The p r o b a b i l i t y p l o t i s a graphical technique and b e n e f i t s from a l l of the advantages of graphics as o u t l i n e d the end of s e c t i o n 1. Easy to
so at
Use
The dominant f e a t u r e to be l i n e a r i t y . This i s the s i m p l e s t is easily detectable. In p r o b a b i l i t y p l o t s i s no longer a
checked i n a p r o b a b i l i t y p l o t i s s t r u c t u r e p o s s i b l e i n a p l o t and addition, the generation of problem.
A p p l i c a b l e to a Wide Range of D i s t r i b u t i o n The p r o b a b i l i t y p l o t technique can be a p p l i e d to a wide range of d i s t r i b u t i o n s — c e r t a i n l y f o r a l l d i s t r i b u t i o n s commonly encountered i n p r a c t i c e . These d i s t r i b u t i o n s would cover those of both the continuous (e.g., normal) and the d i s c r e t e (e.g., Poisson) types. Such d i s t r i b u t i o n s would i n c l u d e the normal (Gaussian), uniform, v a r i o u s U-shaped d i s t r i b u t i o n s , Cauchy, L o g i s t i c , h a l f - n o r m a l , log-normal, e x p o n e n t i a l , gamma, beta, Wei b u l l , extreme v a l u e , Pareto, b i n o m i a l , Poisson, geometric, and negative binomial. For each such d i s t r i b u t i o n D, there nonetheless remains the same uniform approach i n i n t e r p r e t i n g the r e s u l t i n g p r o b a b i l i t y p l o t ; v i z . , to check f o r l i n e a r i t y and if nonlinear to make adjustments to the hypothesized d i s t r i b u t i o n s D a c c o r d i n g l y — b a s e d on the type of n o n l i n e a r i t y encountered. 0
No a p r i o r i Location and V a r i a t i o n Estimates Needed 2
One problem a s s o c i a t e d w i t h the χ goodness of f i t techniques and w i t h the e m p i r i c a l technique of superimposing a f i t t e d p r o b a b i l i t y d e n s i t y f u n c t i o n over a histogram of the data i s t h a t a p r i o r i values of the parameter ( u s u a l l y l o c a t i o n and v a r i a t i o n ) are needed before the technique can a c t u a l l y be a p p l i e d . This i s f r e q u e n t l y i m p r a c t i c a l f o r two reasons: 1. Such available.
known
values
for
the
parameters
are
rarely
2. Accurate estimates f o r the parameters can only be obtained a f t e r the d i s t r i b u t i o n has been "estimated" r a t h e r than before.
In Validation of the Measurement Process; DeVoe, J.; ACS Symposium Series; American Chemical Society: Washington, DC, 1977.
2.
FiLLiBEN
Testing Basic Assumptions
65
Since the p r o b a b i l i t y p l o t technique does not need a p r i o r i values t o be a p p l i e d , i t i s s u p e r i o r and d e f i n i t e l y f a r more p r a c t i c a l than the χ and f i t t e d p r o b a b i l i t y d e n s i t y f u n c t i o n methods f o r d i s t r i b u t i o n a l t e s t i n g . 2
Downloaded by KTH ROYAL INST OF TECHNOLOGY on August 31, 2015 | http://pubs.acs.org Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0063.ch002
Automatic Estimate of Location and V a r i a t i o n Obtained An a d d i t i o n a l advantage o f a p p l y i n g the technique i s t h a t estimates o f l o c a t i o n and s c a l e parameters a r e a u t o m a t i c a l l y produced as a secondary output. These l o c a t i o n and v a r i a t i o n estimates a r e d e r i v a b l e , r e s p e c t i v e l y , from the v e r t i c a l a x i s i n t e r c e p t and t h e slope o f the r e s u l t i n g p r o b a b i l i t y p l o t . Although the a n a l y s t i s reminded t h a t such l o c a t i o n and v a r i a t i o n estimates a r e not t o be considered as the optimal (minimum v a r i a n c e ) estimates, they nevertheless serve as useful p r a c t i c a l i n d i c a t i o n s o f what the appropriate parameter values should be. No Grouping o f Data Need be Done A problem a s s o c i a t e d w i t h the histogram technique (whereby the a n a l y s t simply forms a histogram o f the data and notes i t s general shape without applying or f i t t i n g a specific d i s t r i b u t i o n t o i t ) f o r gathering d i s t r i b u t i o n a l i n f o r m a t i o n i s t h a t o f choosing the grouping i n t e r v a l (the c l a s s width) f o r the histogram. The appearance o f the r e s u l t i n g histogram i s r a t h e r s t r o n g l y a f f e c t e d by the choice o f t h i s c l a s s width. A c l a s s width which i s "too narrow" w i l l r e s u l t i n a histogram i n which the true d i s t r i b u t i o n a l shape i s obscured by excessive v a r i a b i l i t y i n the height o f the bar a s s o c i a t e d w i t h each c l a s s , a c l a s s width which i s "top wide" w i l l r e s u l t i n a histogram i n which the t r u e d i s t r i b u t i o n a l shape i s obscured by "leakage" across neighboring c l a s s e s so t h a t the d i s t r i b u t i o n a l content f o r a given c l a s s w i l l be "smeared" out over several c l a s s e s . Although r u l e s o f thumb do e x i s t f o r choosing a reasonable c l a s s w i d t h , t h i s nevertheless c a l l s f o r an intermediate judgment t o be made by the a n a l y s t . The use o f the p r o b a b i l i t y p l o t technique e l i m i n a t e s the need f o r such a choice. Inasmuch as a p r o b a b i l i t y p l o t uses each observation i n d i v i d u a l l y and r e q u i r e s no grouping, t h i s f r e e s t h e a n a l y s t from making choices about c l a s s widths and e l i m i n a t e s ( i f the wrong c l a s s width happens t o have been chosen) a p o s s i b l e undesirable approach-dependency on the u l t i m a t e c o n c l u s i o n s . The net p o s i t i v e e f f e c t o f the p r o b a b i l i t y p l o t i s t h a t i t allows a d i s t r i b u t i o n a l a n a l y s i s t o be performed i n a completely d i r e c t and automatic f a s h i o n w i t h no intermediate d e c i s i o n s (such as c l a s s width) t o be made by the a n a l y s t . Thus, the conclusions from the d i s t r i b u t i o n a l a n a l y s i s w i l l r e f l e c t only the content o f the data and w i l l avoid p o s s i b l e biases introduced by the a n a l y s i s .
In Validation of the Measurement Process; DeVoe, J.; ACS Symposium Series; American Chemical Society: Washington, DC, 1977.
Downloaded by KTH ROYAL INST OF TECHNOLOGY on August 31, 2015 | http://pubs.acs.org Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0063.ch002
VALIDATION OF T H E M E A S U R E M E N T PROCESS
tuo
In Validation of the Measurement Process; DeVoe, J.; ACS Symposium Series; American Chemical Society: Washington, DC, 1977.
Downloaded by KTH ROYAL INST OF TECHNOLOGY on August 31, 2015 | http://pubs.acs.org Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0063.ch002
FiLLiBEN
Testing Basic Assumptions
X X X X X
g
Μ
Χ w
Η
ο
ο
Η
> >
ROBABILl Y LOT CORRELATION COEFFICIENT =
Downloaded by KTH ROYAL INST OF TECHNOLOGY on August 31, 2015 | http://pubs.acs.org Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0063.ch002
Downloaded by KTH ROYAL INST OF TECHNOLOGY on August 31, 2015 | http://pubs.acs.org Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0063.ch002
VALIDATION OF T H E M E A S U R E M E N T
χ x X
X
χ χ
In Validation of the Measurement Process; DeVoe, J.; ACS Symposium Series; American Chemical Society: Washington, DC, 1977.
PROCESS
Downloaded by KTH ROYAL INST OF TECHNOLOGY on August 31, 2015 | http://pubs.acs.org Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0063.ch002
FiLLiBEN
Testing Basic Assumptions
ο ω ο ο
χ χ
X X X X X X X X X X X X X X X X X X X X X X X X X X X X
Ο Ν
ί
αϊ J α Σ
SX
Ο
δ.
Ο
ι
α
«
I I
w
I I ι 1
X
X X X X X X X X X X X
α m
Ο < U- CD
« Ο ζ a. ο α
X
In Validation of the Measurement Process; DeVoe, J.; ACS Symposium Series; American Chemical Society: Washington, DC, 1977.
Testing Basic Assumptions
Downloaded by KTH ROYAL INST OF TECHNOLOGY on August 31, 2015 | http://pubs.acs.org Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0063.ch002
FiLLiBEN
In Validation of the Measurement Process; DeVoe, J.; ACS Symposium Series; American Chemical Society: Washington, DC, 1977.
Downloaded by KTH ROYAL INST OF TECHNOLOGY on August 31, 2015 | http://pubs.acs.org Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0063.ch002
VALIDATION O F T H E M E A S U R E M E N T
PROCESS
Γ
^2
χ χ X
^5 X X X X X X X X X X X X X X X
SX
X
χ χ X
X
χ χ
Σ H < ζ V) UJ
ι oo in α \ f- < Ο I ο ω υ
ο Ο ω Σ
' Ji ; ω ω « ω* en w ο* σ' ω u? w ω Jl * w ω ω ιή* J «ο* ω «? en ω en w «τ! ω J! w « ë ëgëgggggëggëgëëëëëgg ggggëgggëëëggëggëëëëgg ë w
w
C O C C D C C D D Q O C C Û C C O C O C û D D O O O O O O O D C û O C C O C Û O O O D D
S a t S t S'a- S ι ï ï " O
O
O
O
O
Q
O
C
O
C
O
C
S £ £ ο f 'i' £ S a β t a' g & a S α S S Ï Î ' ï S S t S ' Î J I C
O
O
O
O
O
O
O
O
O
O
O
O
C
O
D
O
O
O
O
O
O
C
O
O
C
C
O
O
O
α t O
D
Ο Ο ^ OOOcO°OOOoOcoOOoOoOOOCOuOOOOOoOoOOOcOoOOc cιο if.mc c uc") irο mο ο ι·ο- inο mc if° in inΟ in mΟ ο ο inc inο inο inο irc mο ο inο ir.ο inο inο ο ο inο inο inο inοinο toοinο inο inc οinc inc c
a
C>
if
ID
C
CJ LP IT IT*
IT
IT IT
u a a a a: a a. u a a a a a a, u: a a; u a a UJ UJ a. ui ui a, u. u.i u- ni u.» u.' :
IT
in
UJ
u UJ a. a: a a ui UJ
ΙΧΧΙΧΧΧΧΙΙΧχΧΧΙΙΧΙΧΧΙΙΙΙΧΙΙΧΙΙΧΧΧΧΙΙΙΧΧΧΙΧΙΙ Ui h-Hhhhl-Hhl-KI-hK HKHKI-HHKKHHI-l-t-h-l-h KHhK-t-HHKhr-Kt-HK z z z z z z z z z z z z z z z z z z z z z z z z z z z z z z z z z z z z z z z z z z z z
u: a; a< a a. a^ a a a. a: aj UJ a a UJ a. aj ai a> a a a ai a a> UJ a a UJ a> a. a. a a- a α.· ai aj a> UJ UJ a, a a a: UJ U a a. a a> a a a a a a u a^ a. a; a a a) a a a a a: a a a< a. a ai a' a a a> a a a. a. a a a a 1
:
1
:
1
a a IL a ai a. a a: a ai a a a' a a a a a UJ a a a a a a' a a a a a a a a a a a a a a a a a a a a a a a a a a œ a a a r a . c c a c c a a a c G a a a . a a c c a a a a , a a t a a c c c D û G a , c c a c c a a a . 1
z z z z z z z z z z z z z z z z z z z z z z z z z z z z z z z z z z z z z z z z z z z z
u u u u.1 ui a a a. u u a u a a a a> a: a ai a a ai u U ' a ai a' a a a a a a a a a a a a a a a a a a a a a a a a a a a c r a a a a a a a a a a a a a a a a a a a c r a a a a a a a a a a c r a a a a a a a a a a u a . a a a a a a a a a a a a a a a a u a C
ο
a
z z z z z z z z z z z z z z z z z z z z z z z z z z z z1 z z z z z z z z z z z z z z z z a a a a a a ai a a a a a a a a a a a a a a a u: u a a> a u u a a a a u a a a a a a a: a; ai a a a a ai u a, a a a a ui a ai a: a a a a a a a. a G G a a : a a : G G a G c L a c c œ œ a G a
ο
g
Η
ai a aa a au . a a. a. a: a a -a a a
c
ο
r
a; a a a. a u: a
:
1
ο ο
c i r R r PÎ r: π Ρ r, r η η n r m r> r r m η m ' m r
aa
a.
O
U G C U U U G G G U U U G L - i G U U G G G U
Ui a a' a a. a a ui a a a a U, ai ΧΤΧΧΧΧΧΧΙΧΖΧΤΧΙΧΧΤΤΧΧΤΧΧΙΧΧΧΧΤΧΙΤΧΧΤΙΧΧΧΧΧΙΧ I Κ μ- Khi- — t — t t- Κ H t- Ht-- K— t -t>— — l — t H> —— l — J K— l h- K H t- h H I a, lu a
a u-
a
a. in UJ a. UJ a a
u. a. a a
UJ
a. a u a
u. a u
a u,
In Validation of the Measurement Process; DeVoe, J.; ACS Symposium Series; American Chemical Society: Washington, DC, 1977.
u UJ
IL
Downloaded by KTH ROYAL INST OF TECHNOLOGY on August 31, 2015 | http://pubs.acs.org Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0063.ch002
2.
FiLLiBEN
Testing Basic Assumptions
97
For the Josephson J u n c t i o n v o l t a g e counts data (see f i g . 24), the MPPCC c r i t e r i o n i n d i c a t e d t h a t the normal d i s t r i b u t i o n was the best f i t which i s i n agreement w i t h t h e already-seen l i n e a r i t y of the ( f i g . 17) normal p r o b a b i l i t y p l o t . For the wind v e l o c i t y data (see f i g . 25), the MPPCC c r i t e r i o n i n d i c a t e d t h a t the normal d i s t r i b u t i o n (though not p r e c i s e l y optimal) was nevertheless near-optimal and i s i n agreement w i t h the nearl i n e a r i t y o f the ( f i g . 18) normal p r o b a b i l i t y p l o t . The MPPCC c r i t e r i a as a p p l i e d t o the beam d e f l e c t i o n data i s presented i n f i g u r e 26. As expected, the b e s t - f i t symmetric d i s t r i b u t i o n t o t h i s data s e t i s i n the U-shaped d i s t r i b u t i o n region. The a n a l y s i s o f the x-ray c r y s t a l l o g r a p h y r e s i d u a l s (see f i g . 27) a l s o confirms what was p r e v i o u s l y s u s p e c t e d — v i z . , t h a t the best d i s t r i b u t i o n i s i n the moderate to moderate l o n g - t a i l e d region. The f i n a l example (see f i g . 28) i l l u s t r a t e s the use o f the MPPCC f o r a skewed d i s t r i b u t i o n a l f a m i l y - - i n t h i s case the extreme value f a m i l y . The data set considered i s annual maximum wind speeds a t Walla W a l l a , Washington. In a f a s h i o n s i m i l a r t o the symmetric f a m i l y a n a l y s i s , a r e p r e s e n t a t i v e s e t o f 46 members o f the extreme value d i s t r i b u t i o n a l f a m i l y were s e l e c t e d and the p r o b a b i l i t y p l o t c o r r e l a t i o n c o e f f i c i e n t was computed for each. The r e s u l t s o f the a n a l y s i s i n d i c a t e d t h a t an extreme value d i s t r i b u t i o n w i t h shape parameter γ = 7 y i e l d s the best fit. The use o f the MPPCC c r i t e r i a i s recommended not as a replacement f o r examination of i n d i v i d u a l p r o b a b i l i t y p l o t s , but r a t h e r as an important complement t o such analyses. The automated procedures presented above a l l o w the a n a l y s t t o q u i c k l y "converge" t o a neighborhood o f d i s t r i b u t i o n s which provide good f i t s to the data set under examination. 4-PLOT ANALYSIS The general question posed i n t h i s s e c t i o n i s as f o l l o w s : Given t h a t one would l i k e t o prepare a completely automated (computerized) f i r s t - p a s s a n a l y s i s t h a t would be a p p l i c a b l e to a wide v a r i e t y o f data sets and which would take no more than one computer page, what would one i n c l u d e on t h a t s i n g l e page? As has been s t r e s s e d throughout t h i s paper, the b a s i c assumptions which must be t e s t e d t o assure t h a t a measurement process i s i n c o n t r o l i n c l u d e randomness, f i x e d l o c a t i o n , f i x e d v a r i a t i o n , and f i x e d d i s t r i b u t i o n . The value o f the run sequence p l o t and the lag-1 a u t o c o r r e l a t i o n p l o t s have already been amply discussed w i t h respect to the f i r s t three p o i n t s . The p r o b a b i l i t y p l o t has been discussed a t length w i t h respect t o the l a s t p o i n t . And so the following 4-plot a n a l y s i s i s presented as an automated
In Validation of the Measurement Process; DeVoe, J.; ACS Symposium Series; American Chemical Society: Washington, DC, 1977.
In Validation of the Measurement Process; DeVoe, J.; ACS Symposium Series; American Chemical Society: Washington, DC, 1977.
BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN
THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE
700 700 700 700 700 700 700 700 700 700 700 700 700 700 700 700 700 70 0 700 700 700 700 700 700 700 700 700 700 700 700 700 700 700 700 700 700 700 700 700 700 700 700 700 700
ORDERED ORDERED ORDERED ORDERED ORDERED ORDERED ORDERED ORDERED ORDERED ORDERED ORDERED ORDERED ORDERED ORDERED ORDERED ORDERED ORDERED ORDERED ORDERED CRDEREO ORDERED ORDERED ORDERED OROERED ORDERED ORDERED ORDERED ORDERED ORDEREO ORDERED ORDERED ORDERED ORDERED ORDERED ORDERED ORDERED ORDERED ORDERED ORDERED ORDERED ORDERED ORDERED ORDERED ORDERED
OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS · OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS.
AND AND AND AND AND AND AND AND AND AND AND AND AND AND AND AND AND AND AND AND AND AND AND AND AND AND AND AND AND AND AND AND AND AND AND AND AND AND AND AND AND AND AND AND
THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE
ORDER ORDER ORDER ORDER ORDER ORDER ORDER ORDER OROER ORDER ORDER ORDER ORDER ORDER ORDER ORDER ORDER ORDER ORDER OROER OROER ORDER ORDER ORDER ORDER ORDER ORDER ORDER ORDER ORDER ORDER ORDER ORDER ORDER ORDER ORDER ORDER OROER ORDER ORDER ORDER ORDER ORDER ORDER
STAT. STAT. STAT. STAT. STAT. STAT. STAT. STAT. STAT. STAT. STAT. STAT. STAT. STAT. STAT. STAT. STAT. STAT. STAT. STAT. STAT. STAT. STAT. STAT. STAT. STAT. STAT. STAT. STAT. STAT. STAT. STAT. STAT. STAT. STAT. STAT. STAT · STAT. STAT. STAT. STAT. STAT. STAT . STAT. MEDIANS MEDI ANS MEDIANS MEDIANS MEDIANS MEDIANS MEDIANS MEDIANS MEDIANS MEDIANS MEDIANS MEDIANS MEDIANS MEDIANS MEDIANS MEDIANS MEDIANS MEDIANS MEDIANS MEDIANS MEDIANS MEDIANS MEDIANS MEDIANS MEDIANS MEDIANS MEDIANS MEDIANS MEDIANS MEDIANS MEDIANS MEDIANS MEDIANS MEDIANS MEDIANS MEDIANS MEDIANS MEDIANS MEDIANS MEDIANS MEDIANS MEDIANS MEDIANS MEDIANS
FROM FROM FROM FROM FROM FROM FROM FRCM FROM FROM FROM FROM FROM FROM FROM FROM FROM FROM FRCM FROM FROM FROM FROM FROM FRCM FROM FROM FRCM FROM FROM FROM FROM FROM FROM FROM FROM FROM FROM FROM FROM FROM FROM FROM FROM THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE
LAMBDA = 2 . 0 DI S T . LAMBDA = 1 .9 DIST. LAMBDA 1.8 D I S T . LAMBDA = I . 7 DI S T . LAMBDA = 1 .6 DIST. LAMBDA = 1.5 DIST. LAMBDA = 1 .4 DIST. LAMBDA 1 .3 DIST. LAMBDA = 1 .2 DIST . LAMBDA = 1 .1 DI S T . LAMBDA = 1.0 D I S T . LAMBDA = • 9 DIST. LAMBDA = • 8 DI S T . LAMBDA .7 DIST. LAMBOA = • 6 DIST. LAMBDA = . 5 DIST . • 4 DIST. LAMBDA = LAMBDA = . 3 DI S T . LAMBDA = • 2 DIST. NORMAL D I S T R I B U T ION LAMBDA = • 1 DI S T . LOGISTIC DIST. DIST. DOUBLE E X P . DIST. LAMBDA -.1 LAMBDA - . 2 DIST. LAMBDA = - . 3 DIST. LAMBDA = - . 4 DIST. LAMBDA = - . 5 DIST. LAMBOA = - . 6 DI S T . - .7 D I S T . LAMBDA = LAMBDA = - . 8 OIST. LAMBDA = - . 9 DIST. CAUCHY D I S T R I B U T I O N DIST. LAMBDA = - 1 . 0 DI S T . LAMBDA -1.1 LAMBDA = -1 . 2 D I S T . LAMBDA = - 1 . 3 D I S T . LAMBDA - 1 . 4 DI S T . DIST. LAMBDA = - 1 . 5 LAMBDA = -1 . 6 D I S T . LAMBDA = - 1 . 7 DI S T . LAMBDA -1 . 8 D I S T . L A M B D A = - 1 . 9 D I ST . L A M B D A = - 2 . 0 DI S T . I s IS IS IS
I s IS IS IS IS IS IS IS IS
I s IS IS IS IS IS
I s IS IS
IS IS IS IS IS IS IS IS IS IS IS IS IS IS IS IS IS IS IS IS IS IS
.95543 .95447 .95374 .95308 .95265 .95244 .95249 .95267 .95329 .95407 .95544 .95706 .95911 .96161 .96446 .96737 .97048 .97315 .97492 .97484 .97434 .97045 .95613 .96028 .94076 .90818 .85924 .79357 .71518 .63128 .55017 .47743 .42313 .41619 .36553 .32518 .29315 .26772 .24749 .23128 .21820 .20752 . 19871 .19139
Figure 24. Printout plot correlation coefficient analysis for Josephson Junction cryothermometry voltage counts
T H E CORR E L A T I C N T H E CORR E L A T I C N T H E CORR E L A T I O N T H E CORR E L A T I C N T H E CORR E L A T I O N T H E CORR E L A T I O N T H E CORR E L A T I C N T H E CORR E L A T I O N T H E CORR E L A T I O N T H E CORR E L A T I C N T H E CORR E L A T I O N T H E CORR E L A T I O N T H E CORR E L A T I C N T H E CORR E L A T I O N T H E CORR E L A T I O N T H E CORR E L A T I C N T H E CORR E L A T I O N T H E CORR E L A T I C N T H E CORR E L A T I O N T H E CORR E L A T I O N T H E CORR E L A T I O N T H E CORR E L A T I O N T H E CORR E L A T I O N T H E CORR E L A T I O N T H E CORR E L A T I O N T H E CORR E L A T I O N T H E CORR E L A T I O N T H E CORR E L A T I O N T H E CORR E L A T I C N T H E CORR E L A T I C N T H E CORR E L A T I O N T H E CORR E L A T I O N T H E CORR E L A T I O N T H E CORR E L A T I O N T H E CORR E L A T I O N T H E CORR E L A T I O N T H E CORR E L A T I O N T H E CORR E L A T I O N T H E CORR E L A T I O N T H E CORR E L A T I O N T H E CORR E L A T I C N T H E CORR E L A T I C N T H E CORR E L A T I O N T H E CORR E L A T I C N
Downloaded by KTH ROYAL INST OF TECHNOLOGY on August 31, 2015 | http://pubs.acs.org Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0063.ch002
53 Ο ο w
*
H
c! 53 M £
>
Χ
Η Μ £ M
*ι
Ο
Ο
Η
>
Θ
ZD
00
2.
Testing Basic Assumptions
FiLLiBEN
Vf ι Γ l ί t f ι κ κ • κ • r t f t f l Γ ^ f f c r ί ^ f ^ M r J ^ ^ r f f ^ ^ f f ι f • σ a ^ ι f C • c κ ' d • ^ ι f • ^ P ι er cc cr a cr a cr. cr a a a a or cr σ cr σ σ σ cr σ σ se et- c; ;* \c hc « - ι τ ^ ο Λ ί ο ^ σ cc νβ If If. Γ
r
;
Downloaded by KTH ROYAL INST OF TECHNOLOGY on August 31, 2015 | http://pubs.acs.org Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0063.ch002
" IT IT' U"IT Ι/Ί1/IT IT: 1/1 IT IT.
IT' LT. œ Ι/î IT. IT 1/1 IT' l/l IT IT) IT ΙΛ IT
»I- C·I - · H - l - t - l - l - K - h O I - l - . . · . l - K - l - h h K I - t : i/
en
· \s. irif.
c ο ο ο c τ o c c r r r c r en r; r r- r n r
i n i/ι/ι m in in »- mι/Ί^ι/ΐΐ/ΐι/ιι/ΐιΧί/ιι/ιι/ι
on c c c c c e r, r c D r. r c ο c. c ο ο ο ο ο μα c
cr— · μ» or · m · rr h»-> · LT c o ir ι ι ι ι ι ι ι ι ι ι ι *> μI C II Ο L l II II II II II i l II II II C II II II II II II II II II II II
IIIIIIII I
« ' « < < : « « « « « « < « « r < : « c . « « < Î j « K U J < < « « e « < L « « > ~ < i « « < i : < « f f < « < x « « . Ci c. c c ο c. c. r c c ο c c. ο c c c, c. ο «=r c in _J ο c η ο c c c η «τ τ α ο ο ο ο ο ο c ο ο ο cr rr cr en ΓΓΙ en ce, cr cr, cr cr, en cr cr• er, en en en cr 5 en μ-> c r r r er cr. rr rr rr cr cr cri u en en rr, en m cr cr, cr, m cr. rr 5 5 5 5 5 ^ ^ 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 55 5 5 55 c r 5 < £ ;. ^ 5 5 f5 5 5 5?^?5 5 5 5 5 55:! _ j 5 5 5 5 5 5 5 ^ 5 2 r 5 5 5 55555 5 { τ ,
5
ι
!
=
J J J J J - i J j J J J J j j _ i j j j j ; j j c j J j J j J J J . UJ X μ5 C et U
J _ J C J _ I _ ) - J _ I _ I -
UJ lit UJ LU UJ UJ U LL! UJ UI Lf U UJ U LU UI L! L i U.' Ui UI U UJ UJ UJ UJ UJ U' LL UJ UJ U' UJ UJ UJ LU UJ Lu Ui Ul UI UJ UJ X X X X X X X X X X X X X X X X X X X X I X X X X X X X X X X X X X X X X I X I X X X μ- μ μ κ μ - μ ^ μ - μ - μ μ κ μ κ μ- μ- μ μ μ- κ μ μ μ μ μ μ κ μ μ - μ μ μ μ κ μ- μ μ ι - μ ι - κ κ ι - κ 5 5 5 5 5 5 5 5 5^ 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5" 5- 5 5 5 5 5" 5 5 C C C C C C C C C C C . C C C C C C C C . c c c c c c c c c c c c c c c c c c c c cr c c c c r a erex cr er ο or or or or or cr cr cr or or. ctc^o^aaccorctcr(i'a.cr or. aaoctrer a. ocoro^cr:Cr. a. a L L L L U L L U L L L L L U L U U U L U L i L L U L L U L L U U U U U U L U U u U U L L ty LT u~ 1/ 1/ I T IT. I T y y y y y y y y y y y y y —y ~ —y y y y y y y y y y y —Ζ ? 7 77 7 7? 7 Ζ7 Ζ 7 < r < c < f < < i < c < r c r < < : < r < i < < r < ! < r « i < t f < < r < < i c « i c i t f < r c < < < r < « r < i < r < < f < t f < r < 1
!
1
1
c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c e e c c U U ι Li li I U' U ι li Li-U.IL L L It ' U li L U U U U LI l i li : L U U U li li L L U U Ll IJ ' L L ' li L U li U U U 55 5 55^55^ 5 ^ 5 5 5 5 5 5 5 5 5 ^ 5 5 5 5 5 5 5 5 5 ^ 5 5 ^ 5 5 5 5 5 5 5 5 55^5 ;
:
1
LT&crLrtyLrcjrcrcrcrcrc5fcrcr L U U L L U L : L > U L L ' L ' L L L L L L L L L L L L U L L IJ L L L L L L L L L L L L J L L L L c e e e e c e e e c c e c e c c c c e c c e e c c e c e c c e c c cc c r r r r r r r r eter. ererererereraererexera
o o o o o o o o o o o o o o o o o o o c o o o o o o o o c o o o o o o o o c o o o o o o
L L ' L ! ϋ L L L L L» L U L L L L L L L L L L L L L L L L L L L L L L L L L L L L L i L L L L. X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X μ-μμκμ-μ-μ-μ-μ-μ
μ^μμ-μ-μ-μμ-κμ-μμ-μ-μΗμ-μμ-μ-μ-μ-μ
μμ-Ημ-μ-μμ-μ-μ-μ-μ-μ-
c c e e ec e c c t c e c erc c c c c c c e r r e ce ce c ce c t . e e ce c ce ce ~Z Ζ T y ~ y ~ Ζ Ζ y Ζ y Ζ y y Ζ Ζ Ζ Τ - Τ Τ Ζ Ζ Z 7 7 7 7 Z 7 Z 7 7 7 7 7 7 7 Z 7 7 7 7 L/>-l/M/lU";l/lLfil/>l/^inLnLOl^ cr cr cr er CÎ ex cr cr cr cr v er π rr r rr rr cr cr rr rr rr e r a rererrrrrrrr-ererererereerrererrrerr. c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c. c ce c c. c c: c: c c. c
c o n o o . n o o o o n o n a
L i L l L ' lu L l L l l i L l lt< L l L) L.I L ! L i L l Li I L ! Ii' L! L! h.! L! Lr L.i L' Li I L L' L l L' L.! L - L» L l L i b ' L l Ul L l L ! LJ Ul L' L rr n cr rr ύ rr rr rr et rr ύ et y cr rr rr rr ύ ncrr>Ύrîrrrrrγΐrrrrγcrrrr:yeyΎrΎ'y^cγ^rryrrcγcrcr Ιι · L ! L I U LU L' L' L l L • L ι L l lu L l LJ L lil L ' L : L L L L l L' L: L > L L ! tu L i L ' L : L L L l L ' li : L ' L i L ι L L L I L l i 1
1
η c η ο οη ηηηη ηηοc η ηη ηc ηη ο n n n n n n o n c o n n o n n c .
ηοοηοη
1
r r r r r r r r c r f r f r f r r r r r f r r r r r f r r r f r f r r r r r r r f r c r f r r r r r r r r r r r r r r r r r r r r r r r r r r r r r r v r r rr rr rr r r rr C C C C C C C: c c, c c c c. c c c c c c c c c c c c c c c c c c c c ο c. c c c c c c c c c o o o c o c o o c c o o o o o o o o o o o o o o o o o c o o o o o o o o o o c c o c o o
L ' L • L ' LU L ' L L. L L l. L' L! L L L LU L' l u ti L i L t. L U L L 1
L U L ι L L L L L ' L ' L L LU L i L •• L > L U' L 1
1
x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x
Γι
fil
L i til III III LJ L l IU til l i ! !.«. 1 II UJ 111 III lit til L l III hi L! L l Ul ( j I til L l L l tit L l ll) IJ I LI til IJ I III ll! ll) LI III 111 Ll I L l III L L L L L L' L In li L, L L L L U L. L L i L ! L L L Li L L, L L lu L' L l L L < L ι L ' LI L : L J L l u U li li L L : 1
1
(-»-»- ι - μ t~ D- \~ κ ι-- μ μ· μ. y- y- t- h y~ v_ μ- μ ^ ι_ μ. μ- j- κ μt- *κ ^ t- ν- t- μ μ- μ μ μΙιΙ L l LJ 111 LJ LJ L l til L l UJ LJ LJ LJ til 1,1 !>J L i LJ 1,1 1,1 Ul LJ LJ L l Ul L) L l til LJ LtJ I.J LJ L.I LJ LJ UJ LJ L) LJ L l L l til L l L l r r r r r n r n r r r r rnrrfTirrirrr^ μ
• > ~ Γ'•
> z -y ζ > -y
u
y τ •? -y y y y > y y y - y -•••-y y y y y y y y y y y y y y y y y y y y
O O O O O O O O O O O O O O O O v O O O O O O O O O O O O O O O O O O O O O O O O O O O O μ-μ-μ-μ-μ-μ-μ-μ-μ-μ-μ-μ-μ-μ-μ-μ-μ-μ-t—μ-μ-μ-μ-μ-μ-μ-^μ-μ-μ^ til Ul L l L l LJ L l LJ L l U L l L l L l Ul L l UJ Ul L l L l LI LJ Ul L l Ut lit Ul Ul L l L l L l L l L l Ul LJ IJ L l L l til lit IJ L l L l tiJ L l U rrr~rrrr.rrcrrrrrcrcrr^ rrrrrrrrrrrrrrrrrrrrr^CYrrrrrrrYr-rrrr c c c c c c c c c c c c c c c c c c c c c c c c c c c C C C C C C . C CC C C C C C C C c CJ t j t> ο ο ο ο ο ο ο ο ο ο ο υ υ υ υ υ ο υ ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο t) ο L l L l Lit L l L l
111
LJ LJ til L l L l til L l til
111
LJ L l til tJ lit L l L l L i t j L l til U L l LJ L l U IJ L l L l lit L> L l L l L l L l IJ L l LJ L l
In Validation of the Measurement Process; DeVoe, J.; ACS Symposium Series; American Chemical Society: Washington, DC, 1977.
In Validation of the Measurement Process; DeVoe, J.; ACS Symposium Series; American Chemical Society: Washington, DC, 1977.
THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE
CORRELATION CORR EL AT I C N CORRELATION CORRELATICN CORRELATION CORRELATION CORRELATION CORRELATICN CORR E L A T I O N CORRELATION CORRELATION CORRELATION CORRELATION CORRELATICN CORRELATION CORRELATICN CORRELATICN CORRELATION CORRELATION CORRELATICN CORRELATION CORRELATICN CORRELATION CORRELATION CORRELATICN CORRELATION CORRELATICN CORRELATICN CORRELATION CORRELATION CORRELATICN CORRELATION CORRELATICN CORR ELAT I C N CORRELATION CORRELATION CORRELATION CORRELATION CORRELATICN CORRELATICN CORRELATICN CORRELATICN CORRELATION C O R R E L A T I ON
BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN EETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN EETWEEN BETWEEN BETWEEN BETWEEN BETWEEN
200 200 200 200 200 200 200 200 200 200 200 200 200 200 200 200 200 200 200 200 200 200 200 200 200 200 200 200 200 200 200 200 200 200 200 20 0 200 200 200 200 200 200 200 200
OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS.
AND AND AND ANO AND ANO AND AND AND AND AND AND AND AND AND AND AND AND AND AND ANO AND AND AND AND AND AND AND AND AND AND AND ANO AND AND AND AND AND AND AND AND AND AND AND
THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE
ORDER ORDER ORDER ORDER ORDER ORDER ORDER ORDER ORDER ORDER ORDER ORDER ORDER ORDER ORDER ORDER ORDER ORDER ORDER ORDER ORDER ORDER ORDER ORDER ORDER ORDER ORDER ORDER ORDER ORDER ORDER ORDER ORDER ORDER ORDER ORDER ORDER ORDER ORDER ORDER ORDER ORDE R ORDER ORDER
STAT. STAT. STAT. STAT. STAT. STAT. STAT. STAT. STAT. STAT. STAT. STAT. STAT. STAT. STAT. STAT . STAT. STAT. STAT. STAT. STAT . STAT. STAT. STAT. STAT. STAT. STAT. STAT. STAT. STAT. STAT. STAT. STAT. STAT. STAT . STAT. STAT. STAT . STAT. STAT. STAT . STAT. STAT. STAT .
MEDIANS MEDIANS MEDIANS MEDIANS MEDIANS MEDIANS MED I ANS MEDIANS MEDIANS MEDIANS MEDIANS MEDIANS MEDIANS MEDIANS MEDIANS MEDIANS MEDIANS MEDIANS MED IANS MEDIANS MEDIANS MEDIANS MEDIANS MEDIANS MEDIANS MEDIANS MEDIANS MEDIANS MEDIANS MEDIANS MEDIANS MEDIANS MEDIANS MEDIANS MEDIANS MEDIANS MEDIANS MEDIA NS MFD I ANS MEDIANS MEDIANS MEDIANS MED I ANS MEDIANS
FROM FROM FROM FROM FROM FROM FROM FRCM FROM FROM FROM FROM FRCM FRCM FROM FROM FRCM FROM FROM FRCM FROM FROM FRCM FROM FROM FRCM FROM FROM FRCM FROM FROM FRCM FROM FROM FROM FROM FRCM FROM FRCM FRCM FROM FROM FRCM FROM
THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE
-
LAMBDA = 2.0 D I S T . LAMBDA = 1 .9 DI S T . LAMBDA = 1.8 D I S T . 1.7 D I S T . LAMBDA = LAMBDA = 1 .6 DI S T . LAMBDA = 1 .5 D I S T . 1 .4 D I S T . LAMBDA = LAMBDA 1 .3 D I S T . LAMBDA = 1 .2 D I S T . LAMBDA = 1.1 D I S T . 1 .0 D I S T . LAMBDA LAMBDA • 9 DIST. LAMBDA • 8 DIST. LAMBDA .7 D I S T . LAMBDA = • 6 DIST. LAMBDA .5 D I S T . .4 D I S T . LAMBDA = .3 D I S T . LAMBDA LAMBDA = . 2 DIST. NORMAL DI: ST RIBUT ION LAMBDA = • 1 DIST. L O G I S T I C DI ST. DOUBLE E X P . D I S T . LAMBOA - - . 1 DI S T . -.2 DI S T . LAMBDA = LAMBDA = -.3 DIST. LAMBDA = - .4 DI S T . LAMBDA = - .5 DI S T . LAMBDA = -.6 D I S T . - . 7 DI S T . LAMBDA = LAMBDA - .8 D I S T . LAMBDA = -.9 DIST. CAUCHY DI[ S T R I B U T I O N LAMBDA = - I .0 D I S T . LAMBDA - - 1 . 1 D I S T . LAMBDA = -1 .2 DI S T . LAMBDA = - 1 . 3 D I S T . LAMBDA -1 .4 D I S T . LAMBDA = -1 .5 DI S T . LAMBDA -1.6 DIST. LAMBDA = - 1 . 7 D I S T . LAMBDA = -1 .8 DI S T . LAMBDA = - 1 . 9 D I S T . LAMBDA -2.0 OIST.
Printout plot correlation coefficient analysis for beam deflection
ORDERED ORDERED ORDERED CRDE RED ORDERED OROEREO ORDERED ORDERED ORDERED CRDERED ORDERED OROERED CRDERED ORDERED ORDERED ORDERED ORDERED CRDERED ORDERED ORDERED CRDERED OROERED ORDERED ORDERED OROERED ORDERED CRDERED ORDERED ORDERED OROERED ORDERED CRDERED ORDERED ORDERED CRDERED ORDERED ORDERED CRDERED ORDERED ORDERED CRDERED ORDERED ORDERED CRDERED
Figure 26.
THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE
Downloaded by KTH ROYAL INST OF TECHNOLOGY on August 31, 2015 | http://pubs.acs.org Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0063.ch002
IS
I s IS IS IS IS IS
Is IS IS IS IS IS
I s IS IS
Is Is IS
Is IS IS
I s IS
I s IS IS
I s IS IS IS IS IS IS IS IS IS IS
Is IS IS IS IS IS
.99255 .99308 .99351 .99383 .99405 .99417 .99416 .99403 .99374 .99326 .99255 • 99155 .99015 .98822 .98558 .98198 .97703 .97026 .96097 .95408 .94825 .93092 .88949 .90760 .87 67 7 .83721 .78846 .73138 .66839 .60309 .53933 .48027 .44084 .42864 .38292 .34520 .31404 .28852 .26771 .25074 .23687 .22548 .2160 8 .20828
Ο n w
*ϋ
H
td M £ M
C
C/J
>
Μ
Η Μ
*1
Η Ο Ο
ο
>
— II
< >
ο
Ο
In Validation of the Measurement Process; DeVoe, J.; ACS Symposium Series; American Chemical Society: Washington, DC, 1977.
THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE
CORRELATICN CORRELATICN CORRELATION CORRELATION CORRELATICN CORRELATION CORRELATICN CORRELATICN CORRELATION CORRELATICN CORRELATION CORRELATION CORRELATICN CORRELATICN CORRELATION CORR EL AT I C Ν CORRELATION CORRELATION CORRELATICN CORRELATION CORRELATICN CORRELATION CORRELATION CORRELATICN CORRELATICN CORRELATION CORRELATICN CORRELATICN CORRELATION CORRELATION CORRELATION CORRELATION CORRELATICN CORRELATION CORRELATICN CORRELATION CORRELATION CORRELATION CORRELATION CORRELATION CORRELATICN CORRELATICN CORRELATION CORRELATICN
THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE
24 19 2419 2419 2419 2419 24 19 2419 2419 2419 2419 2419 2419 2419 2419 2419 2419 2419 2419 2419 24 19 24 19 2419 2419 2419 24 19 2419 24 19 2419 2419 2419 2419 2419 2419 2419 2419 2419 2419 2419 2419 2419 2419 2419 2419 2419
ORDERED ORDERED ORDERED ORDERED ORDERED ORDERED CRDERED ORDERED ORDERED ORDERED ORDERED OROERED ORDERED ORDERED CRDERED ORDERED ORDERED CRDERED ORDERED ORDERED ORDERED ORDERED CRDERED ORDERED ORDERED CRDERED ORDERED ORDERED CRDERED ORDERED ORDERED ORDERED ORDERED CRDERED CRDERED ORDERED ORDERED ORDERED ORDERED OROERED CRDERED ORDERED CRDERED ORDERED
OBS. OBS. OBS. OBS. OBS. OBS · OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS · OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS. OBS.
AND THE AND THE AND THE AND THE AND THE AND THE AND THE AND THE AND THE AND THE AND THE AND THE AND THE AND THE AND THE AND THE AND THE AND THE AND THE AND THE AND THE AND THE AND THE AND THE AND THE AND THE AND THE AND THE AND T H E AND THE AND THE AND T H E AND THE AND THE ANO THE AND THE AND THE AND THE AND THE ANO THE AND THE AND THE AND T H E AND THE
ORDER ORDER ORDER ORDER ORDER ORDER ORDER ORDER ORDER ORDE R ORDER ORDER ORDER ORDER ORDER ORDER ORDER ORDER ORDER ORDER ORDER ORDER ORDER ORDER ORDER ORDER ORDER ORDER ORDER ORDER ORDER ORDER ORDER ORDER ORDER ORDER ORDER ORDER ORDER ORDER ORDER ORDER ORDER ORDER
STAT. STAT . STAT. STAT. STAT. STAT . STAT. STAT. STAT . STAT. STAT. STAT. STAT. STAT. STAT. STAT. STAT. STAT. STAT. STAT . STAT. STAT. STAT. STAT. STAT. STAT. STAT. STAT. STAT. STAT. STAT. STAT. STAT. STAT. STAT. STAT. STAT. STAT. STAT. STAT. STAT. STAT. STAT. STAT. MEDIANS MEDIANS MEDIANS MEDIANS MEDIANS MEDIANS MEDIANS MEDIANS MED I ANS MEDIANS MEDIANS MEDIANS MED I ANS MEDIANS MEDIANS MEDIANS MEDIANS MFD IANS MEDIANS MEDIANS MEDIANS MEDIANS MEOIANS MEDIANS MEDIANS MEDIANS MEDIANS MEDIANS MEOIANS MEDIANS MEDIANS MEDIANS MEDIANS MED I ANS MEDIANS MEDIANS MEDIANS MEDIANS MEDIANS MEDIANS MEDIANS MEDIANS MED IANS MED I ANS
FROM FROM FROM FROM FROM FRCM FROM FRCM FROM FROM FROM FROM FROM FROM FROM FROM FROM FROM FROM FROM FROM FROM FROM FROM FRCM FROM FROM FRCM FROM FROM FRCM FROM FROM FRCM FROM FROM FRCM FROM FROM FRCM FROM FROM FRCM FROM THE THE THE THE THE THE THE THE THE THF THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE THE
LAMBDA = 2.0 DI S T . LAMBDA = 1 .9 D I S T . LAMBDA = 1 .8 D I S T . LAMBDA = 1.7 DI S T . LAMBDA = 1 .6 D I S T . LAMBDA = 1 · 5 DIST. LAMBDA = 1 .4 DI S T . LAMBDA 1.3 D I S T . LAMBDA = 1.2 D I S T . LAMBOA 1 .1 DI S T . LAMBDA = 1 · 0 DIST. LAMBDA .9 DI S T . .8 D I S T . LAMBDA = LAMBDA = .7 D I S T . LAMBDA • 6 DIST. = LAMBDA = .5 DI S T . LAMBDA • 4 DIST. = LAMBDA .3 D I S T . = LAMBDA = • 2 DI S T . NORMAL D I S T R I B U T ION LAMBDA • 1 DIST. = L O G I S T I C 01 S T . DOUBLE E X P . D I S T . LAMBDA = - .1 D I S T . LAMBDA = -.2 D I S T . LAMBDA = -.3 D I S T . LAMBDA = -.4 D I S T . LAMBDA - - . 5 D I S T . LAMBDA -.6 DIST. LAMBDA = -.7 DI S T . LAMBDA = -.8 D I S T . LAMBDA = - . 9 DI S T . CAUCHY D I S T R I B U T ION LAMBDA = - 1 . 0 D I S T . LAMBDA = -1 . 1 DI S T . LAMBDA = -1 .2 D I S T . LAMBDA = - 1 . 3 D I S T . LAMBOA = - 1 . 4 D I S T . LAMBDA = -1 .5 D I S T . LAMBDA = -1 .6 D I S T . LAMBDA = -1 .7 DI S T . LAMBDA = -1 .8 D I S T . LAMBDA = - 1 . 9 DI S T . LAMBDA = - 2 . 0 D I S T . I
s
I s IS IS IS IS IS IS IS IS IS IS IS IS IS IS
Is IS IS IS IS IS IS IS IS IS IS IS IS IS IS IS IS IS IS IS
I S IS IS IS IS IS IS IS
.89619 .89466 .89334 .89226 .89146 .89100 .89095 .89136 .89231 .89389 •89619 .89932 .90339 .90854 .91490 .92259 .93173 .94233 .95428 .96258 .96710 .97965 .98785 .98960 . 9 9 2 6 5 MAX .98206 .94981 .89075 .80825 .71434 .62288 .54283 .48098 .47729 .42510 .38433 .35243 .32730 .30730 .29118 • 27805 .26722 .25820 .25061
Figure 27. Printout plot correlation coefficient analysis for x-ray crystallography residuals
BETWEEN EETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN EETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN EETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN
Downloaded by KTH ROYAL INST OF TECHNOLOGY on August 31, 2015 | http://pubs.acs.org Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0063.ch002
Ο
"S.
3
Co Co
ce Ci*
2
w M
Downloaded by KTH ROYAL INST OF TECHNOLOGY on August 31, 2015 | http://pubs.acs.org Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0063.ch002
VALIDATION OF T H E M E A S U R E M E N T
In Validation of the Measurement Process; DeVoe, J.; ACS Symposium Series; American Chemical Society: Washington, DC, 1977.
PROCESS
2.
FiLLiBEN
Testing Basic Assumptions
Downloaded by KTH ROYAL INST OF TECHNOLOGY on August 31, 2015 | http://pubs.acs.org Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0063.ch002
comprehensive f i r s t - p a s s t o o l f o r data a n a l y s i s . are as f o l l o w s : 1.
run sequence p l o t
2.
lag-1 a u t o c o r r e l a t i o n
3.
histogram
4.
normal p r o b a b i l i t y p l o t
103 The four p l o t s
plot
normal p r o b a b i l i t y p l o t was chosen because the normality assumption i s most commonly employed. The histogram i s included as an a d d i t i o n a l g r a p h i c a l p o i n t o f reference i n case the normality assumption i s untenable. In a d d i t i o n , a s e l e c t e d set of about a dozen useful l o c a t i o n s , v a r i a t i o n and a u t o c o r r e l a t i o n summarizing s t a t i s t i c s i s included on t h i s s i n g l e page. This p a r t i c u l a r 4-plot a n a l y s i s has proved to be i n v a r i a b l y informative i n terms o f a s s e s s i n g the v a l i d i t y of the underlying assumptions i n a measurement process. The a n a l y s i s can be a p p l i e d to both raw response data and to r e s i d u a l s a f t e r a m u l t i factor (e.g., r e g r e s s i o n , ANOVA) f i t . The technique i s recommended only as a f i r s t pass i n a data a n a l y s i s and should be' complemented by more d e t a i l e d a n a l y s i s . The a p p l i c a t i o n o f t h i s technique t o several examples i s now discussed. The f i r s t example ( f i g . 29) i s the 500 Rand (24) normal random numbers o f f i g u r e s 15 and 22. The run sequence plot indicates f i x e d l o c a t i o n and v a r i a t i o n . The lag-1 autocorrelation plot indicates randomness. The histogram i n d i c a t e s a bell-shaped symmetric d i s t r i b u t i o n . The normal p r o b a b i l i t y p l o t i n d i c a t e s normality. The second example ( f i g . 30) i s the 700 Josephson J u n c t i o n cryothermometry voltage counts o f f i g u r e s 17 and 24. The run sequence p l o t i n d i c a t e s f i x e d l o c a t i o n and v a r i a t i o n and a l s o the rather discrete nature o f the data. The lag-1 a u t o c o r r e l a t i o n p l o t i n d i c a t e s randomness and r e i n f o r c e s the d i s c r e t e aspect o f the data; the histogram i n d i c a t e s symmetry and a bell-shape; the normal p r o b a b i l i t y p l o t indicates normality. The t h i r d example ( f i g . 31) i s the 200 beam d e f l e c t i o n s data on f i g u r e s 19 and 26. The run sequence p l o t i n d i c a t e s f i x e d l o c a t i o n and v a r i a t i o n and perhaps a s i n g l e o u t l i e r ( h i g h ) . The lag-1 a u t o c o r r e l a t i o n p l o t i n d i c a t e s w e l l - d e f i n e d nonrandomness and a d d i t i o n a l evidence f o r an o u t l i e r . The histogram i n d i c a t e s symmetry and a U-shaped i n d i c a t e s t h a t a d i s t r i b u t i o n s h o r t e r t a i l e d than normal i s needed) and a d d i t i o n a l o u t l i e r evidence. A remodeling t o take i n t o account the dominant a u t o c o r r e l a t i o n s t r u c t u r e of the data i s c l e a r l y c a l l e d f o r i n t h i s case.
In Validation of the Measurement Process; DeVoe, J.; ACS Symposium Series; American Chemical Society: Washington, DC, 1977.
104
VALIDATION O F T H E M E A S U R E M E N T
PROCESS
Downloaded by KTH ROYAL INST OF TECHNOLOGY on August 31, 2015 | http://pubs.acs.org Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0063.ch002
κ ο X κ Ο Ο * * W ΙΌ # * • 00 * * Ο # *
) 3 > t)
Ο Ο Ο ΙΌ Ο Ο
ζ
ο
υ
α
ί Ο ) ΙΌ Ο « Il II · Ο Ο Ο b CM Ν£> ·*Ο
Χ Χ Χ > c χ χ χ
cr « ο ζ < Η (Λ
ΙΌ Φ u.
Χ Χ Χ Χ Χ Χ Χ Χ Χ Χ Χ Χ χ χ χ χ χ χ χ χ χ χ χ χ χ χ x x x x x x x x x x x x x x x
Χ Χ Χ Χ Χ Χ >
ζ ια α Κ Ο Ζ
α.
Ό Η Ζ Ο
o o * > ο ο ο ο ο ο ο ο ο σ > ο σ MfiOlflOinoldOlflCMOC* n « N w œ i O a * o < o o i O i ^ Î O I O r t Î N I B n O Ï C M O N h h
Il
σ> ο q> a σ> ο Γ>-ν0ΦΟ(Γ N o CM II )-*inCMONr»mcMoroio- > > α ω ω ω ο ο ο ο υ
Σ Σ Σ W Σ Σ