Soft Independent Method of Class Analogy - ACS Symposium Series

Nov 6, 1985 - Soft Independent Method of Class Analogy ... National Fisheries Research Laboratory, U.S. Fish and Wildlife Service, Columbia, MO 65201...
0 downloads 0 Views 1MB Size
1 Soft Independent Method of Class Analogy Use in Characterizing Complex Mixtures and Environmental Residues of Polychlorinated Biphenyls 1

1

2

1

D. L. Stalling , T. R. Schwartz , W. J. Dunn III , and J. D. Petty 1

Downloaded by 146.185.202.78 on May 29, 2016 | http://pubs.acs.org Publication Date: November 6, 1985 | doi: 10.1021/bk-1985-0292.ch001

2

Columbia National Fisheries Research Laboratory, U.S. Fish and Wildlife Service, Columbia, MO 65201 Health Sciences Research Center, Department of Medicinal Chemistry and Pharmacognosy, University of Illinois at Chicago, Chicago, IL 60612

Pattern recognition studies on complex data from capillary gas chromatographic analyses were conducted with a series of micro­ computer programs based on principal components (SIMCA-3B). Principal components sample score plots provide a means to assess sample similarity. The behavior of analytes in samples can be evaluated from variable loading plots derived from principal components calculations. A complex data set was derived from isomer specific polychlorinated biphenyl (PCBS) analyses of samples from laboratory and field studies. The application of chemometrics to these problems includes three segments: analytical quality control; method and data base development; and modeling Aroclor composition and PCB residues in bird eggs. Chemometrics, as d e f i n e d by Kowalski (1) , i n c l u d e s the a p p l i c a t i o n o f m u l t i v a r i a t e s t a t i s t i c a l methods t o the study o f chemical problems. SIMCA (Soft Independent Method of C l a s s Analogy) and other m u l t i v a r i a t e statistical methods have been used as tools i n chemometric investigations. SIMCA, based on p r i n c i p a l components, i s a m u l t i v a r i a t e chemometric method t h a t has been a p p l i e d t o a v a r i e t y o f chemical problems o f v a r y i n g complexity. The SIMCA-3Β program i s s u i t a b l e f o r use with 8- and 16b i t microcomputers. Four l e v e l s o f p a t t e r n r e c o g n i t i o n have been d e f i n e d by Albano (2) . L e v e l s I and I I a r e most f r e q u e n t l y used t o determine t h e s i m i l a r i t y o f o b j e c t s , o r t o c h a r a c t e r i z e c l u s t e r s o f samples and t o c l a s s i f y unknown o b j e c t s . Level I I I takes advantage o f t h e r e d u c t i o n o f data dimensions r e s u l t i n g from SIMCA and seeks t o e s t a b l i s h a c o r r e l a t i o n o f sample scores with independent v a r i a b l e s 0097-6156/ 85/ 0292-0001 $06.00/ 0 © 1985 American Chemical Society Breen and Robinson; Environmental Applications of Chemometrics ACS Symposium Series; American Chemical Society: Washington, DC, 1985.

Downloaded by 146.185.202.78 on May 29, 2016 | http://pubs.acs.org Publication Date: November 6, 1985 | doi: 10.1021/bk-1985-0292.ch001

2

ENVIRONMENTAL APPLICATIONS OF CHEMOMETRICS

such as chemical f u n c t i o n s or v a r i a b l e s , s p e c t r o s c o p i c data or chemical t o x i c i t y . T h i s approach i s o f t e n used i n quantitative structure-activity relationships (3-5). Level IV is most frequently applied to^ complex s p e c t r o s c o p i c c a l i b r a t i o n problems and i n s i t u a t i o n s where composition p r e d i c t i o n or e s t i m a t i o n i s t o be made from s p e c t r o s c o p i c data. The SIMCA approach can be a p p l i e d i n a l l of the four l e v e l s of p a t t e r n r e c o g n i t i o n . We focus on i t s use t o d e s c r i b e complex mixtures g r a p h i c a l l y , and on i t s u t i l i t y in quality control. T h i s approach was s e l e c t e d f o r the tasks of developing a quality control program and evaluating similarities i n samples of v a r i o u s types. P r i n c i p a l components a n a l y s i s has proven t o be w e l l s u i t e d f o r e v a l u a t i n g data from c a p i l l a r y gas chromatographic (GC) analyses (6-8). A n a l y t i c a l q u a l i t y c o n t r o l (QC) e f f o r t s u s u a l l y are at l e v e l I or I I . S t a t i s t i c a l e v a l u a t i o n of m u l t i v a r i a t e l a b o r a t o r y data i s o f t e n complicated because the number of dependent v a r i a b l e s i s g r e a t e r than the number of samples. In e v a l u a t i n g q u a l i t y c o n t r o l , the analyst seeks t o establish that r e p l i c a t e analyses made on reference m a t e r i a l of known composition do not c o n t a i n excessive systematic or random e r r o r s of measurement. In a d d i t i o n , when such problems are detected, i t i s h e l p f u l i f remedial measures can be i n f e r r e d from the QC data. Our progress i n the a p p l i c a t i o n of chemometrics t o c a p i l l a r y GC data was advanced by the development of a l a b o r a t o r y chromatography data base (9). T h i s development followed from our d e c i s i o n t o use c a p i l l a r y GC i n most of our l a b o r a t o r y analyses f o r environmental contaminants. A data base was considered necessary because l a r g e amounts of data were being generated from the analysis of laboratory and field s t u d i e s on complex mixtures of organochlorine contaminants. A data base i s an important, but not e s s e n t i a l , f a c t o r i n u s i n g p a t t e r n r e c o g n i t i o n f o r quality control. The most advanced a p p l i c a t i o n of p a t t e r n r e c o g n i t i o n (Level IV) offers the possibility of predicting independent v a r i a b l e s by u s i n g l a t e n t v a r i a b l e s d e r i v e d from examining t r a i n i n g s e t s of dependent and independent v a r i a b l e s (10). The a p p l i c a t i o n of p a r t i a l l e a s t squares i n the p r e d i c t i o n of the composition of mixtures of A r o c l o r s was p r e v i o u s l y explored (6) by u s i n g the program, PLS-2 provided by the SIMCA-3Β programs (11-12). The f i r s t r e s u l t s from the use of PLS were reported by Dunn e t a l (6) who estimated the composition of PCB contaminated waste o i l i n terms of A r o c l o r mixtures. S t a l l i n g et a l (13), who reported on the c h a r a c t e r i z a t i o n of PCB mixtures and the use of three-dimensional plots d e r i v e d from p r i n c i p a l components, demonstrated t h a t the f r a c t i o n a l composition of TCDD and other PCDD residues were r e l a t e d t o t h e i r geographical o r i g i n s . These two r e p o r t s (6,13) d e s c r i b e d the a p p l i c a t i o n of an advanced chemometric t o o l i n r e s i d u e s t u d i e s and illustrated the

Breen and Robinson; Environmental Applications of Chemometrics ACS Symposium Series; American Chemical Society: Washington, DC, 1985.

1.

Laboratory Mixtures and Environmental PCBs

STALLING ET A L .

3

use of p a t t e r n recognition to extract quantitative i n f o r m a t i o n about sample s i m i l a r i t y . In our present i n v e s t i g a t i o n s , we encountered a p r e s s i n g need f o r an o b j e c t i v e , s t a t i s t i c a l l y based way of e v a l u a t i n g concentrations of as many as 105 i n d i v i d u a l PCB isomers i n each sample analyzed by c a p i l l a r y GC. We summarize here some of the experience obtained i n our l a b o r a t o r i e s from the use of SIMCA t o c h a r a c t e r i z e A r o c l o r mixtures and environmental PCB r e s i d u e s i n a s e r i e s of b i r d eggs.

Downloaded by 146.185.202.78 on May 29, 2016 | http://pubs.acs.org Publication Date: November 6, 1985 | doi: 10.1021/bk-1985-0292.ch001

METHODS

Sampling. Eggs of F o r s t e r ' s t e r n (Sterna f o r s t e r i ) were collected i n 1983 from nests i n two colonies in W i s c o n s i n — o n e on Lake Poygan and the other on Oconto Marsh, Green Bay—as part of a study on impaired r e p r o d u c t i o n . Lake Poygan i s a r e l a t i v e l y c l e a n lake whereas Green Bay i s h e a v i l y contaminated from the Fox R i v e r with many i n d u s t r i a l c h e m i c a l s — p a r t i c u l a r l y PCBs and chlorophenols, which are known sources o f PCDFs and PCDDs. Reproductive success has d e c l i n e d and the incidence of deformed young has increased i n the Green Bay colony (14). A n a l y s i s of PCBs. PCB r e s i d u e s i n e x t r a c t s of egg samples were enriched by u s i n g a combination of g e l permeation chromatography on BioBeads S-X3 and 1:1 (v/v) cyclohexane:methylene chloride. Adsorption column chromatography on s i l i c i c a c i d was used t o separate PCBs from other c o - e x t r a c t i v e s and contaminants (15). The PCB congeners were separated by u s i n g a g l a s s c a p i l l a r y chromatographic column (30 Μ χ .25 mm i.d.) coated w i t h C - h y d r o c a r b o n s t a t i o n a r y phase (Quadrex Corp., New Haven, CT 06525); a 60-cm uncoated fused silica retention gap connected the i n j e c t o r t o the a n a l y t i c a l column and a 15 cm uncoated fused s i l i c a column connected a n a l y t i c a l column t o the d e t e c t o r . The data sampling and gas chromatography program was c o n t r o l l e d by a V a r i a n Autosampler Model 8000, which a l s o delivered a calibrated amount of sample t o the GC i n j e c t i o n p o r t . Chromatographic c o n d i t i o n s were s i m i l a r for a l l of the analyses: i n i t i a l temperature, 80 °C, programmed a t 3 °C/min t o a f i n a l temperature of 265 °C; d e t e c t o r temperature, 320 °C; and i n j e c t o r temperature ( d i r e c t i n j e c t ) 220 °C. An IBM CS9000 microcomputer was i n t e r f a c e d w i t h the GC which acquired data generated by the e l e c t r o n capture detector. In p r o c e s s i n g the data, we used the CS9000 and a software package designed f o r l a b o r a t o r y data c o l l e c t i o n ( C a p i l l a r y A p p l i c a t i o n s Program [CAP], IBM Instrument d i v i s i o n , Danbury, CT 06810). We organized the processed peak data, u s i n g a b a s i c program, i n t o a s e r i e s o f f i l e s on hard d i s k media and t r a n s f e r r e d these f i l e s o f f - l i n e t o a D i g i t a l Equipment Corp. (DEC) PDP-11/34 minicomputer. We 87

Breen and Robinson; Environmental Applications of Chemometrics ACS Symposium Series; American Chemical Society: Washington, DC, 1985.

ENVIRONMENTAL APPLICATIONS OF CHEMOMETRICS

Downloaded by 146.185.202.78 on May 29, 2016 | http://pubs.acs.org Publication Date: November 6, 1985 | doi: 10.1021/bk-1985-0292.ch001

4

then organized t h e data i n t o t r e e - s t r u c t u r e d d i s k f i l e s , u s i n g our s p e c i a l i z e d l a b o r a t o r y data base management computer programs w r i t t e n i n DSM-11 ( D i g i t a l Standard MUMPS) f o r the PDP-11 f a m i l y o f computers. We separated 105 c o n s t i t u e n t s and achieved c a l i b r a t i o n by u s i n g a 1:1:1:1 (w/w/w/w) mixture o f A r o c l o r s 1242, 1248, 1254 and 1260. The l a s t two d i g i t s o f the A r o c l o r number designates t h e percentage c h l o r i n e i n the A r o c l o r . A chromatogram o f t h i s mixed A r o c l o r standard i s shown i n Figure 1. The method o f peak i d e n t i f i c a t i o n was a r e t e n t i o n index system u t i l i z i n g na l k y l t r i c h l o r o a c e t a t e s (16). Molar response f a c t o r s were determined from a flame i o n i z a t i o n d e t e c t o r by u s i n g t h e computer-based c a l c u l a t i o n methods d e s c r i b e d by Schwartz et aTj. (16) . A f t e r we determined the concentrations o f i n d i v i d u a l isomers, we r e t r i e v e d t h e data from t h e MUMPs based l a b o r a t o r y data base, and t r a n s f e r r e d them t o an IBM-XT (IBM Corporation, Boco Raton, FL 33432) by way o f a RS-232 l i n k , u s i n g t h e program Cyber (Department o f L i n g u i s t i c s , U n i v e r s i t y o f I l l i n o i s a t Champaign-Urbana, Urbana, I L 61820). In performing p r i n c i p a l components analyses, we used SIMCA-3Β f o r MS-DOS based microcomputers ( P r i n c i p a l Data Components, 2505 Shepard Blvd., Columbia, MO 65201). A s e r i e s o f A r o c l o r s and known A r o c l o r mixtures were analyzed by these techniques t o provide a t r a i n i n g data set f o r SIMCA-3B. These standards i n c l u d e d r e p l i c a t e analyses, a 1:1 (w/w) mixture o f each A r o c l o r i n combination with one other A r o c l o r , and a 1:1:1:1 mixture of each A r o c l o r (Table I ) . P r i n c i p a l Components A n a l y s i s We examined t h e data by c a l c u l a t i n g p r i n c i p a l components sample scores (Thetas) and v a r i a b l e l o a d i n g terms (Betas), u s i n g the program CPRIN from the SIMCA-3Β programs. A f t e r c a l c u l a t i n g two o r three p r i n c i p a l components f o r a c l a s s model, one can prepare a p l o t o f sample s i m i l a r i t y , by u s i n g t h e sample scores (Theta-1 v s Theta-2) , as w e l l as v a r i a b l e loadings (Beta-1 v s Beta-2). Sample s i m i l a r i t y was determined by c a l c u l a t i n g sample scores (θ-values, Equation [ 1 ] ) .

X

i k

= Xi +

e

ka*Bai + E

i

k

[1]

a=l The l i k e n e s s o f samples w i t h i n t h e c l a s s can be assessed by t h e proximity o f samples t o each other i n p l o t s derived from p r i n c i p a l components models. The s t a t i s t i c a l technique o f c r o s s - v a l i d a t i o n (17) was used t o

Breen and Robinson; Environmental Applications of Chemometrics ACS Symposium Series; American Chemical Society: Washington, DC, 1985.

Laboratory Mixtures and Environmental PCBs

Downloaded by 146.185.202.78 on May 29, 2016 | http://pubs.acs.org Publication Date: November 6, 1985 | doi: 10.1021/bk-1985-0292.ch001

1. STALLING ET AL.

Breen and Robinson; Environmental Applications of Chemometrics ACS Symposium Series; American Chemical Society: Washington, DC, 1985.

5

6

ENVIRONMENTAL APPLICATIONS OF CHEMOMETRICS

determine the number of components t h a t were s t a t i s t i c a l l y significant.

Downloaded by 146.185.202.78 on May 29, 2016 | http://pubs.acs.org Publication Date: November 6, 1985 | doi: 10.1021/bk-1985-0292.ch001

Table I .

Aroclors

Sample # 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15

Samples Composing the T r a i n i n g Data S e t .

Aroclor Composition 1242 1248 1254 1260 0 0 0 1 0 0 0 1 0 0 0 1 0 0 1 0 1 0 0 0 0 0 0 1 0 0 1 0 1 1 1 1 1 1 1 1 0 1 0 1 1 1 0 0 0 1 1 0 0 0 1 1 0 1 0 1 1 0 0 1

Replicate # 1254-1 1248-1 1260-1 1242-1 1242-2 1260-2 1254-2 1:1:1:1-1 1:1:1:1-2

-

P r i n c i p a l Components P l o t s By using SIMCA-3Β program, FPLOT.EXE, one can plot numerous v a r i a b l e s d e r i v e d from the p r i n c i p a l components calculations. Because a p r i n t e r i n the character mode i s used with t h i s program t o p l o t v a r i a b l e s , the p l o t s are r e s t r i c t e d t o two-dimensional p r e s e n t a t i o n s . The program 3DPC.BAS ( P r i n c i p a l Data Components) p r o v i d e s a means t o p l o t sample scores i n 3-D and c o l o r i f t h r e e p r i n c i p a l components are calculated. The 3-D display derived from the sample score v a l u e s may be transferred to a disk file by using the program, FRIEZE.COM, s u p p l i e d as p a r t of PC-PAINT BRUSH or 4-Point Graphics (International Microcomputer Software, Inc., [IMSII]), San R a f e l , CA 94991). The image i s s t o r e d on d i s k and can be e d i t e d , enhanced, or l a b e l e d with a commercial software package such as PC Paint Brush (IMSII). The screen image can a l s o be p r i n t e d on a color or black/white p r i n t e r . RESULTS

and

DISCUSSION

Analyses of PCBs can create l a r g e data s e t s t h a t are d i f f i c u l t t o i n t e r p r e t , s i n c e there are 209 PCB isomers. Isomer compositions may vary widely due t o d i f f e r e n t i a l p a r t i t i o n i n g or metabolism of compounds. In a d d i t i o n , wide d i f f e r e n c e s i n r e s i d u e p r o f i l e s may e x i s t i n the b i o t a l o c a l l y because of v a r i a t i o n s i n e f f l u e n t s , combustion, or other source of r e s i d u e s . Chemometric methods can

Breen and Robinson; Environmental Applications of Chemometrics ACS Symposium Series; American Chemical Society: Washington, DC, 1985.

Downloaded by 146.185.202.78 on May 29, 2016 | http://pubs.acs.org Publication Date: November 6, 1985 | doi: 10.1021/bk-1985-0292.ch001

1. STALLING ET AL.

Laboratory Mixtures and Environmental PCBs

7

g r e a t l y improve t h e a n a l y s t ' s a b i l i t y t o d e s c r i b e and model residues i n these d i v e r s e samples. The u t i l i t y o f p r i n c i p a l components modeling o f m u l t i v a r i a t e data l i k e those encountered i n these complex mixtures, originates from g r a p h i c a l presentations of sample s i m i l a r i t y , as w e l l as from s t a t i s t i c a l r e s u l t s c a l c u l a t e d by t h e SIMCA-3Β programs (3). Sample data are treated as p o i n t s i n higher dimensional space, and p r o j e c t i o n s o f these data a r e made i n two- o r t h r e e dimensional space i n a way t h a t preserves most o f the e x i s t i n g r e l a t i o n s among samples and v a r i a b l e s (3). This f e a t u r e i s e s p e c i a l l y h e l p f u l i n v i s u a l i z i n g data o f more than t h r e e dimensions. The c a l c u l a t i o n s i n v o l v e d i n p r i n c i p a l components are summarized i n Equation [1]. The o b j e c t i v e was t o d e r i v e a model o f a data s e t having k samples and i v a r i a b l e s i n which t h e concentration o r value o f any measured value, ik' c a l c u l a t e d . The p r i n c i p a l component term i s the product o f e and B-^, where e (Theta) i s designated t h e a component '^score f o r sample k, and B ^ (Beta) i s designated as the " l o a d i n g ^ f o r v a r i a b l e i i n p r i n c i p a l component a. The term i s the mean o f variable X^ i n a l l samples. The r e s i d u a l term (or unexplained p a r t o f t h e measurement not modeled) i s designated E j w and "A" d e s c r i b e s t h e number o f p r i n c i p a l components e x t r a c t e d from t h e data. A more d e t a i l e d d i s c u s s i o n o f t h i s approach was given by Dunn e t a l . (6,18). The concentration data obtained from each sample a n a l y s i s were expressed as f r a c t i o n a l p a r t s and normalized t o sum t o 100. The normalized data were s t a t i s t i c a l l y analyzed, and t h r e e p r i n c i p a l components (A=3, Equation [1]) were c a l c u l a t e d . The PCB c o n s t i t u e n t s ( v a r i b l e s ) are numbered s e q u e n t i a l l y and correspond t o peak #1, peak #2, ... t o peak #105. The s t r u c t u r e and r e t e n t i o n index o f each c o n s t i t u e n t i n the mixture were reported by Schwartz e t a l . (9). The t a b u l a r l i s t i n g o f t h e data i s a v a i l a b l e from the present authors. The A r o c l o r samples l i s t e d i n Table ι were modeled by p r i n c i p a l components t o i l l u s t r a t e how t h e r e s u l t from principal components calculations can be used i n d e s c r i b i n g PCB data. The sample scores (Figure 2, A.Theta-1 v s . Theta-2; B. Theta-1 v s . Theta-2; and C Theta-2 vs. Theta-3) are p l o t t e d f o r t h e samples. Results obtained from t h e p l o t s o f t h e v a r i a b l e loadings (Figure 2, A'- C ) f o r t h e t h r e e components p r o v i d e i n s i g h t i n t o the importance o f t h e GC Peaks i n s e p a r a t i n g the v a r i o u s A r o c l o r s and t h e i r mixtures (Figure 2, A - C) . The l o a d i n g p l o t s show a separation o f variables that are t i g h t l y c l u s t e r e d , t h e groups o f v a r i a b l e s r a d i a t i n g outward from t h e center. They are c l u s t e r e d i n groups t h a t r e f l e c t t h e v a r i a b l e s t h a t are c h a r a c t e r i s t i c of the i n d i v i d u a l Aroclors. The sample scores (Theta-1, Theta-2, and Theta-3) i n each component were used t o represent t h e samples i n a 3-D x

c

o

u

l

d

b

e

k a

t

k

h

a

11

1

Breen and Robinson; Environmental Applications of Chemometrics ACS Symposium Series; American Chemical Society: Washington, DC, 1985.

8

ENVIRONMENTAL APPLICATIONS OF C H E M O M E T R I C S

Downloaded by 146.185.202.78 on May 29, 2016 | http://pubs.acs.org Publication Date: November 6, 1985 | doi: 10.1021/bk-1985-0292.ch001

SAMPLE SCORES

VARIABLE LOADINGS

A. A R O C L O R . T 9 0 X - P C 1 , Y - P C 2

A*. A R O C L O R . C 9 0 X - B E T 1 , Y - B E T 2

B. A R O C L O R . T 9 0 X - P C 1 , Y - P C 3

B*. A R O C L O R . C 9 0 X - B E T 1 , Y - B E T 3

C. A R O C L Q R . T 9 0 X - P C 2 , Y - P C 3

C . AROCLOR.C90 X-BET2, Y-BET3

Figure 2. Principal

Components Plots for Aroclor

(ref. Table 1 for Sample

Samples

i.d.)

Breen and Robinson; Environmental Applications of Chemometrics ACS Symposium Series; American Chemical Society: Washington, DC, 1985.

1.

STALLING ET A L .

Laboratory Mixtures and Environmental PCBs

9

Downloaded by 146.185.202.78 on May 29, 2016 | http://pubs.acs.org Publication Date: November 6, 1985 | doi: 10.1021/bk-1985-0292.ch001

graph (Figure 3). T h i s p l o t shows t h a t while much o f t h e sample information may be d i s c e r n e d from a two component model, i t i s impossible t o t e l l i f an A r o c l o r mixture i s composed o f more than a mixture o f three A r o c l o r s . The 3D p r e s e n t a t i o n i l l u s t r a t e s more c l e a r l y than three 2-D p l o t s , how complex data may be viewed and r e l a t i o n s among the samples more c l e a r l y comprehended than when data a r e presented i n t a b u l a r form.

Figure 3. 3-D Plot of Principal Components Scores (Theta-1,-2,-3) Representing Normalized Isomer Composition Data for Aroclors 1242, 1248, 1254, 1260, and their mixtures. The points for each Aroclor represent individual sample analyses. The plot in the upper right quadrant is the view parallel to the Z-axis.

The p r i n c i p a l components model o f the A r o c l o r samples (Table I ) preserves g r e a t e r than 95% o f t h e sample v a r i a n c e o f t h e e n t i r e data s e t . From t h e 3-D sample score p l o t (Figure 3) one can make these observations: PCB mixtures o f two A r o c l o r s form a s t r a i g h t l i n e ; three A r o c l o r mixtures form a plane; and t h a t p o s s i b l e mixtures of t h e f o u r A r o c l o r s a r e bounded by t h e i n t e r s e c t i o n o f the f o u r planes. Samples not bounded by o r i n s i d e t h e volume formed by t h e i n t e r s e c t i o n o f t h e f o u r planes may

Breen and Robinson; Environmental Applications of Chemometrics ACS Symposium Series; American Chemical Society: Washington, DC, 1985.

Downloaded by 146.185.202.78 on May 29, 2016 | http://pubs.acs.org Publication Date: November 6, 1985 | doi: 10.1021/bk-1985-0292.ch001

10

ENVIRONMENTAL APPLICATIONS OF CHEMOMETRICS

be d e r i v e d from, but are not i d e n t i c a l t o , mixtures of Aroclors. In SIMCA-3B, modeling power i s d e f i n e d t o be a measure of the importance of each v a r i a b l e i n a p r i n c i p a l component term of the c l a s s model (18). The modeling power has a maximum value of one (1.0) i f the v a r i a b l e i s well described by the principal components model. V a r i a b l e s with modeling power of l e s s than 0.2 can be eliminated from the data without a major loss of i n f o r m a t i o n (18). For the PCB mixtures we analyzed (Table I ) , the modeling power was determined on the b a s i s of a three component model (A=3). These data r e v e a l e d t h a t most of the 105 GC-peaks p l a y an important r o l e i n the c l a s s model f o r the f o u r A r o c l o r s and t h e i r mixtures. The modeling power of each v a r i a b l e i s p l o t t e d (Figure 4) f o r each component term along with a p l o t of the c o n c e n t r a t i o n p r o f i l e of sample 9 (Table I ) ; t h i s sample contains 1242:1248:1254:1260 i n a 1:1:1:1 r a t i o and the plot represents i t s f r a c t i o n a l composition. R e p r o d u c a b i l i t y aspects of the a n a l y s i s i s r e f l e c t e d i n the n e a r l y i d e n t i c a l p r o x i m i t y of each of the r e p l i c a t e analyses (Table I ) . Use of t h i s s t a t i s t i c a l technique t o examine sample r e s i d u e p r o f i l e s from d i f f e r e n t l o c a t i o n s has l e a d t o an improved understanding of complex mixtures of contaminants and r e l a t e d problems. Residues i n F o r s t e r ' s Tern Egcrs A d e c l i n e i n r e p r o d u c t i v e success and i n c r e a s e d i n c i d e n c e of deformed young were observed i n c o l o n i e s of F o r s t e r ' s t e r n s , common t e r n s , cormorants, and herons i n and near Green Bay (14, 20). The samples from two Wisconsin l o c a t i o n s (Lake Poygan and Green Bay) were analyzed f o r i n d i v i d u a l PCBs u s i n g the method d e s c r i b e d by Schwartz et a l . (9) . Residue l e v e l s f o r the t o t a l PCB content (Table i i ) represent the sum of the i n d i v i d u a l PCBs present i n the sample. For the SIMCA analyses, the i n d i v i d u a l PCB isomer c o n c e n t r a t i o n s were normalized t o sum 100. We examined the data by u s i n g the SIMCA-3Β program t o c a l c u l a t e p r i n c i p a l components and t o p l o t sample scores i n a manner i d e n t i c a l t o t h a t d i s c u s s e d f o r the A r o c l o r mixtures. The p l o t of sample data i l l u s t r a t e s t h a t the geographic l o c a t i o n s have d i f f e r e n t residue profiles (Figure 5). The PCBs present i n eggs c o l l e c t e d from Green Bay b i r d s were s i m i l a r t o A r o c l o r 1254, and the t o t a l PCB c o n c e n t r a t i o n s were about 6 times g r e a t e r than those i n Lake Poygan. The composition of PCBs i n eggs c o l l e c t e d from Lake Poygan b i r d s were l e s s c o n s i s t e n t and tended t o l i e f a r t h e r from a l i n e between A r o c l o r s 1254 and 1260 (Figure 5) . We found t h a t the geographical o r i g i n of the samples (Lake Poygan or Green Bay) c o u l d be a s c e r t a i n e d with a p r o b a b i l i t y of 0.85 by u s i n g a class

Breen and Robinson; Environmental Applications of Chemometrics ACS Symposium Series; American Chemical Society: Washington, DC, 1985.

STALLING ET A L .

Laboratory Mixtures and Environmental PCBs

11

Downloaded by 146.185.202.78 on May 29, 2016 | http://pubs.acs.org Publication Date: November 6, 1985 | doi: 10.1021/bk-1985-0292.ch001

1.

Figure 4. Plots of the Fractional Composition of an Aroclor Mixture (A, Sample 9, Table 1) and the Modeling Power for a Three Component Model of the Samples in Table 1: PC-1 (B); PC-2 (C); and PC-3 (D).

model o f each group based on normalized r e s i d u e data. T h i s was determined u s i n g the program the SIMCA-3Β program "CLASSI" t o c l a s s i f y the samples.

Breen and Robinson; Environmental Applications of Chemometrics ACS Symposium Series; American Chemical Society: Washington, DC, 1985.

12

ENVIRONMENTAL APPLICATIONS OF CHEMOMETRICS

Downloaded by 146.185.202.78 on May 29, 2016 | http://pubs.acs.org Publication Date: November 6, 1985 | doi: 10.1021/bk-1985-0292.ch001

Table I I .

R e s i d u e s o f PCB i n Tern Eggs C o l l e c t e d and Green B a y l

Collection

Site

Lake Poygan Green Bay

from Lake Poyga

Mean PCB (ug/g wet

Residue weight)

20.4

(7.8)

3.7

(1.5)

2

^each composite sample contained 6 eggs ^standard deviations in parentheses

Figure 5. Principal Components Plot Derived from Analysis of Aroclor Standards, Their Equal Mixture, and Eggs of Forster's Tern.

Breen and Robinson; Environmental Applications of Chemometrics ACS Symposium Series; American Chemical Society: Washington, DC, 1985.

1.

STALLING ET AL.

13

Laboratory Mixtures and Environmental PCBs

Two A r o c l o r 1260 standards (A^^ and A ) were i n c l u d e d i n these analyses. One standard was from the Columbia N a t i o n a l F i s h e r i e s Research Laboratory, and the other from the Patuxent W i l d l i f e Research Center (U.S. F i s h and W i l d l i f e S e r v i c e , L a u r e l , MD.) A d i f f e r e n c e i n the concentration of one constituent of about 30% was r e s p o n s i b l e f o r the small d i f f e r e n c e observed between the two Aroclor 1260 standards (Figure 5.) Use of a q u a n t i t a t i v e chemometric method t o d e s c r i b e compositional r e s i d u e d i f f e r e n c e s measured i n environmental samples may prove helpful in c o r r e l a t i n g residue profiles and concentrations with observed b i o l o g i c a l e f f e c t s , such as decreased s u r v i v a l of young b i r d s . 2

Downloaded by 146.185.202.78 on May 29, 2016 | http://pubs.acs.org Publication Date: November 6, 1985 | doi: 10.1021/bk-1985-0292.ch001

SUMMARY

These a p p l i c a t i o n s demonstrate t h a t p a t t e r n r e c o g n i t i o n techniques based on principal components may be e f f e c t i v e l y used t o c h a r a c t e r i z a t e complex environmental r e s i d u e s . In comparisons of PCBs i n b i r d eggs c o l l e c t e d from d i f f e r e n t regions, we demonstrated through the use of SIMCA t h a t the p r o f i l e s i n samples from a r e l a t i v e l y c l e a n area differed i n concentration and composition from profiles i n samples from a more h i g h l y contaminated r e g i o n . Q u a l i t y c o n t r o l can be evaluated by the proximity of r e p l i e ? c e a n a l y s i s of samples i n p r i n c i p a l components plots. More extensive use of isomer s p e c i f i c a n a l y s i s , when combined with chemometric techniques, should improve i n s i g h t i n t o how r e s i d u e s i n the environment r e l a t e t o t h e i r sources. T h i s approach could l e a d t o a q u a n t i t a t i v e description of changes i n the composition of these chemicals as they pass through the food chain and are d i s t r i b u t e d i n the environment. Acknowledgment

We thank Michael Koehler ( U n i v e r s i t y of I l l i n o i s at Chicago) f o r developing the combined twoand threedimensional plotting program to display the sample component scores. D i s c i aimer. R e f e r e n c e s t o t r a d e names or m a n u f a c t u r e r s of commercial p r o d u c t s do not imply Government endorsement of commercial p r o d u c t s .

Literature Cited 1.

Kowalski, B.; 884.

Chemistry and Industry, 1978,

18,

2.

Albanc, C.; Dunn, W.J., III; Edlund, U.; Johansson, E.; Nroden, B; Sjostrom, M.; Wold, S. Anal. Chim. Acta. Comp. Tech. Optim. 1978, 103. 429-443.

3.

Wold, S.; Sjostrom, M. In "Chemometrics, Theory and Application," ACS Symp. Ser. 1977, No. 52, 243-282. Breen and Robinson; Environmental Applications of Chemometrics ACS Symposium Series; American Chemical Society: Washington, DC, 1985.

882-

Downloaded by 146.185.202.78 on May 29, 2016 | http://pubs.acs.org Publication Date: November 6, 1985 | doi: 10.1021/bk-1985-0292.ch001

14

ENVIRONMENTAL APPLICATIONS OF CHEMOMETRICS

4.

Dunn, W. J., 1001.

III; Wold, S. L . Med. Chem. 1979, 21,

5.

Dunn, W. J., III; Wold, S. J. Chem. Inf. Comput. Sci. 1981, 21, 8-13.

6.

Dunn, W. J., III; Stalling, D.L.; Schwartz, T.R.; Hogan, J.W.; Petty, J.D.; Anal. Chem. 1984, 56, 1308-1313.

7.

Stalling, D.L.; Smith, L.M.; Petty, J.D.; Dunn, W.J., III, "Dioxins and Furans in the Environment -- A Problem for Chemometrics," in "Dioxins in the Environment," Limno-Tech, Inc., Ann Arbor, MI, in press.

8.

Stalling, D.L.; Dunn, W. J., III; Schwartz, T. R.; Hogan, J. W.; Petty, J. D.; Johansson, E . ; Wold, S. In "Application of SMICA, A Principal Components Method, in Isomer Specific Analysis of PCB's," ACS Symp. Ser. 1985, in press.

9.

Schwartz, T. R.; Campbell, R.D.; Stalling, D.L.; Little, R.L.; Petty, J.D.; Hogan, J.W.; Kaiser, E.M.; Anal. Chem. 1984, 56, 1303-1308.

10. Wold, S. and Dunn, W. J., III.; J. Chem. Inf. Comput. Sci., 1983, 23, 6-13. 11. Wold, S.; Pattern Recognition, 1976. 8, 127-134. 12. Wold, S., Albano, C., Dunn, W.J., III, Edlund, U . , Esbensen, K. Geladi, P., Hellberg, S., Johansson, E . , Lindberg, W., and Sjostrom, Μ., "Multivariate Data Analysis in Chemistry," in Chemometrics. Mathematics and Statistics in Chemistry, Kowalski, B.R., Ed., D. Reidel Publishing Company, 1984, 17-95. 13. Stalling, D.L., Norstrom, R.J., Smith, L.M., Simon, Μ., Chemosphere, 1985. in press. 14. Memorandum from T.J. Kubiak, Fish and Wildlife Service, Green Bay Field Office, University of Wisconsin-Green Bay, Green Bay, WI, to Members Fox River/Green Bay Toxics Task Force, January 27, 1983. 15. Stalling, D.L.; Tindle, R.C.; J. Assoc. Off. Anal. Chem., 1972, 55, 32-38. 16. Schwartz, T. R.; Petty, J. D.; Kaiser, Ε. M. Anal. Chem. 1983, 55, 1839. 17. Wold, S.,Technometrics 1978, 20, 397-406. 18. Wold, S., Technometrics 1978, 20, 397-406.

Breen and Robinson; Environmental Applications of Chemometrics ACS Symposium Series; American Chemical Society: Washington, DC, 1985.

1. STALLING ET AL.

Laboratory Mixtures and Environmental

PCBs

15

19. Dunn, III, W.J., Wold, S., and Stalling, D.L., "How SIMCA Pattern Recognition Works," Proceedings of Symposium on Chemometrics, Division of Environmental Chemistry, 188 t h National ACS Meeting, Philadelphia, PA, August 26-31, 1985. in press. 20. Harris, H.J., and J. A. Trick, 1979, Annual Performance Report. Wisconsin Department of Natural Resources, Office of Endangered and Non-Game Species, Madison, Wisconsin.

Downloaded by 146.185.202.78 on May 29, 2016 | http://pubs.acs.org Publication Date: November 6, 1985 | doi: 10.1021/bk-1985-0292.ch001

RECEIVED July 17, 1985

Breen and Robinson; Environmental Applications of Chemometrics ACS Symposium Series; American Chemical Society: Washington, DC, 1985.