Chapter 7

An Expert System for Prediction of Aquatic Toxicity of Contaminants

James P. Hickey, Andrew J. Aldridge, Dora R. May Passino, and Anthony M. Frank

Downloaded by PURDUE UNIV on July 5, 2016 | http://pubs.acs.org Publication Date: July 5, 1990 | doi: 10.1021/bk-1990-0431.ch007

U.S. Fish and Wildlife Service, National Fisheries Research Center-Great Lakes, 1451 Green Road, Ann Arbor, MI 48105

The National Fisheries Research Center-Great Lakes has developed an interactive computer program in muLISP that runs on an IBM-compatible microcomputer and uses a linear solvation energy relationship (LSER) to predict acute toxicity to four representative aquatic species from the detailed structure of an organic molecule. Using the SMILES formalism for a chemical structure, the expert system identifies all structural components and uses a knowledge base of rules based on an LSER to generate four structure-related parameter values. A separate module then relates these values to toxicity. The system is designed for rapid screening of potential chemical hazards before laboratory or field investigations are conducted and can be operated by users with little toxicological background. This is the first expert system based on LSER, relying on the first comprehensive compilation of rules and values for the estimation of LSER parameters.

This chapter not subject to U.S. copyright. Published 1990 American Chemical Society.
Hushon; Expert Systems for Environmental Applications ACS Symposium Series; American Chemical Society: Washington, DC, 1990.

The National Fisheries Research Center-Great Lakes, U.S. Fish and Wildlife Service, has tentatively identified more than 500 contaminants in the tissues of walleyes (Stizostedion vitreum vitreum) and lake trout (Salvelinus namaycush) from the Great Lakes basin (1), and 362 substances have been verified in the Great Lakes system (2). A systematic assessment of the biological hazards of these compounds is underway (3-5). It is, however, physically and economically impossible to run bioassays on every compound, especially since hundreds of new compounds are being introduced into the environment each year.

By using mathematical models based on quantitative structure-activity relationships (QSAR), one can quickly determine the compounds that merit further investigation (5-9). Use of these predictive models serves as a rapid, inexpensive screening technique, especially for compounds not commercially available. Such models, however, are complex and much of the knowledge required to apply them is qualitative or not well defined. One approach to making these models usable as an assessment tool for toxicologists, in lieu of a bioassay, or managers who lack technical expertise in toxicology and chemistry is to create a computer program capable of using qualitative information for decision-making (a so-called expert system).

Expert systems are special-purpose programs that imitate the performance of human experts in solving problems on specialized, often highly technical subjects (10); they do so using the heuristic knowledge of the human expert, with the expert's qualitative reasoning to solve the problem (11). Expert systems have potential for recognizing and managing environmental problems. Prototype expert systems being developed include site assessment systems used to identify and quantify hazards at "superfund" sites. Also included are predictive systems such as the Hazardous Waste and Management Expert System designed to provide advice about reducing health and environmental risks at hazardous waste sites (12).

We developed the first expert system that incorporates a working set of rules for a type of QSAR referred to as a linear solvation energy relationship or LSER (13-17) to predict LSER variable values from SMILES string formalism.
The program also uses these LSER results and information about toxicity to predict acute toxicity to four representative organisms: the fathead minnow (Pimephales promelas), the crustaceans Daphnia magna and Daphnia pulex, and Photobacterium phosphoreum, the luminescent agent in the Microtox test.

System Overview

The expert system is designed to predict LSER variable values from a chemical structure formalism, and an addition to the software uses these values to predict acute toxicity (Figure 1). The program is written in the muLISP computer language and runs on an IBM-compatible personal computer. It can be used by anyone who has even a limited background in toxicology or chemistry. It adheres to a basic doctrine of expert system methodology: the separation of the knowledge base from the methods of processing the knowledge makes the system relatively easy to modify and debug (10). Expert systems are designed to be used by non-technical personnel (18); in our example the ultimate user could be a natural resource manager. To operate our system, the user enters a representation (structure or suitable identification number) of a chemical compound into the computer. No other interaction is required.
The program can be divided into three main sections: the first section determines the structural elements of the chemical compound; the second estimates the values for the LSER model variables; and the third predicts the toxicity of the compound. All three sections are controlled by the main reasoning section, called the inference engine. To identify the compound the inference engine queries the user for either a CAS (Chemical Abstracts Service) identification number or a SMILES representation of the compound structure (defined later). Next, the string is decomposed by a fragmentation process and two rule bases are consulted. The inference engine constructs a secondary internal knowledge base using these fragments by forward chaining of the rule bases and stores the information in a dynamic, global, intrinsic data type characterized by LISP programmers as a "property list". Values for the four LSER variables (V_I/100, π*, β, α; vide infra) are assigned to the individual fragments of the compound by the second rule base and accumulated. The inference engine then invokes the regression and confidence interval calculations, utilizing the LSER variables as inputs. Finally, the estimated LSER variables and the predicted toxicities with their confidence intervals are displayed to the user.

Figure 1. Physical Data Flow Diagram. (Labels recoverable from the diagram: user input; ring lookup file; ring parameter values; confidence; output screen and printer.)

SMILES Strings

The expert system determines the structure of an organic molecule from a standard chemical notation known as a SMILES string. The Simplified Molecular Input Line Entry System was developed by the U.S. Environmental Protection Agency (19, 20) at the Environmental Research Laboratory, Duluth, Minnesota, for the QSAR Research Program (20-23) and was based on work with the Medicinal Chemistry Project at Pomona College, Claremont, California (24). A SMILES string is a linearization of the three-dimensional representation of an organic compound. The notation has four basic syntax rules that allow the user to represent a molecule in a form that can be easily used by the muLISP language (19-21).
A synopsis of the rules will be given here for the prospective user; in the future, the use of the SMILES formalism will not be so necessary.

The first rule designates a set of basic symbols (Table I). All molecules are represented as hydrogen-suppressed, and single bonds are assumed by default. For example, CO assumes a single bond between the carbon and oxygen, and C=O indicates a double bond. The bond between two lowercase symbols is aromatic.

Table I. Basic Symbols Used in the Formulation of SMILES Strings

Symbol      Designation
C, c        Normal and aromatic carbon
N, n        Normal and aromatic nitrogen
O, o        Normal and aromatic oxygen
S, s        Normal and aromatic sulfur
P, p        Normal and aromatic phosphorus
BR, Br      Bromine
CL, Cl      Chlorine
I, F        Iodine and fluorine
=, #        Double, triple bonds
*           Aromatic bond
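As a rough illustration of how the symbols in Table I might be recognized in an input string, the following Python sketch splits a SMILES string into tokens. It is not the authors' muLISP code. The function name `tokenize_smiles` and the decision to also accept parentheses and ring-closure digits (used by the branch and ring rules) are assumptions made for this example.

```python
import re

# Token pattern for the basic symbols of Table I, plus parentheses and
# ring-closure digits. Two-letter halogens come first so that "Cl" is not
# read as "C" followed by an unknown "l".
_TOKEN = re.compile(r"Cl|CL|Br|BR|[CNOSPIF]|[cnosp]|[=#*]|[()]|[0-9]")


def tokenize_smiles(smiles):
    """Split a SMILES string into symbol tokens (hypothetical helper)."""
    tokens = _TOKEN.findall(smiles)
    # findall silently skips characters it cannot match, so compare lengths
    # of the rejoined tokens against the input to detect unknown symbols.
    if "".join(tokens) != smiles:
        raise ValueError(f"unrecognized symbol in SMILES string: {smiles!r}")
    return tokens


example_tokens = tokenize_smiles("ClCC=O")
```

Rejecting unknown symbols outright is a design choice of this sketch; the chapter does not say how the original program treats malformed input.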

The second rule defines simple chains in molecules. A simple chain of atoms is represented by atomic symbols interspersed with their respective bond symbols. For example, CC represents ethane; C=C, ethene; and CCCCCO, n-pentanol.

The third rule defines simple branches in molecules. A branch from the main chain is enclosed in parentheses. The string in parentheses is placed directly after the symbol for the atom to which the branch is connected. If it is connected by a multiple bond, the bond symbol immediately follows the left parenthesis. More than one branch is indicated by using more than one set of parentheses: ( )( ) and (( )) are simple forms that may be used. For example, IC(I)O represents diiodomethanol, and CCCCC(C(C)C)CC=C represents 3-isopropyloctene.

The fourth rule defines ring structures. A ring is closed by using a pair of ring closure numbers. In C1CCCCC1, a single bond connects the carbon followed by the first "1" with the other carbon followed by a "1". For multiple rings, we use different ring numbers (e.g., 1, 2, 3, 4), and pairs of carbons with like numbers are connected to close the rings. For example, c1cc2ccccc2cc1 represents naphthalene, where c1 is joined with c1 and c2 with c2 to form the two fused rings.

Inference Mechanics

The problem-solving system is built around rules that consist of an antecedent "if" part and a conclusion "then" part. These rules are processed by concentrating on the rules' antecedents, a process referred to as forward chaining (11). When all of the antecedents in a rule test true, the rule is said to be triggered. If an action is performed (i.e., a conclusion is added to the secondary knowledge base), the rule is said to be fired. Several rules may be triggered at once, requiring conflict resolution strategies to determine which of them should be fired (25). The conflict resolution strategies we use are termed context limiting and rule ordering.
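The triggering and firing cycle can be sketched in a few lines of Python. This is an illustration only, not the muLISP inference engine. The rule contents, the fact strings, and the names `forward_chain` and `EXAMPLE_GROUPS` are hypothetical. Each rule group stands for one context, and within a group only the first triggered rule in priority order is fired.

```python
def forward_chain(facts, rule_groups):
    """Forward chaining with context limiting and rule ordering (sketch).

    facts       -- iterable of known facts (the secondary knowledge base)
    rule_groups -- list of groups; each group is a priority-ordered list of
                   (antecedents, conclusion) pairs, antecedents being a set
    """
    known = set(facts)
    for group in rule_groups:                  # context limiting: one group active
        for antecedents, conclusion in group:  # rule ordering: scan by priority
            if antecedents <= known:           # all antecedents true: triggered
                known.add(conclusion)          # conclusion stored: fired
                break                          # lower-priority rules are ignored
    return known


# Hypothetical rules: the first group names a fragment, the second group
# draws a conclusion from that name.
EXAMPLE_GROUPS = [
    [
        (frozenset({"C=O", "O-H on carbonyl carbon"}), "carboxylic acid"),
        (frozenset({"C=O"}), "ketone"),
    ],
    [
        (frozenset({"carboxylic acid"}), "hydrogen-bond donor"),
        (frozenset({"ketone"}), "hydrogen-bond acceptor only"),
    ],
]

acid_facts = forward_chain({"C=O", "O-H on carbonyl carbon"}, EXAMPLE_GROUPS)
```

In the example both rules of the first group are triggered by the acid's facts, but only the higher-priority one fires, so the fragment is not also labeled a ketone.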
In the context-limiting strategy, rules are grouped in such a way that few rules are active at the same time. The resulting groups are disjoint subsets of the set of all rules. Because the inference engine activates and deactivates rule groups, it needs only to handle conflicts within a group. Within each subset, rule ordering is used to resolve conflicts. Rule ordering requires that the rules be organized in a single priority list in which the first triggering rule in the list has the highest priority and the others are ignored (25).

The inference engine isolates the basic skeletal structures (rings or chains) and all other functional groups composing the compound, and passes these to the first knowledge base that identifies them. Rings are the most difficult structures to isolate because one compound may contain many rings with some attached to each other. The SMILES string is therefore put in a tree structure, and the shortest path to the numbers in the chain is identified by using a branch and bound search (25). For example, Figure 2a shows a tree for decahydro-2,3-dimethylnaphthalene, for which the SMILES string is (C(C(CCC1)CC(C2C)C)(C1)C2). The inference engine first finds the "1" ring and then proceeds to the "2" ring. Once the shortest path to one of the numbers of the numbered ring is found, the system balances the tree and runs the search again. The first number found becomes the root (Figure 2b).
This shortest path then becomes the actual ring substituent, which is stored for later identification by the first rule base in the secondary knowledge base. The inference engine then begins another search for another ring, repeating the search sequence. All non-ring fragments are also isolated and stored in the knowledge base of the secondary property list. Only after all fragments of the chemical compound have been isolated and stored in a property list does the program progress to the first knowledge base for fragment identification.

Figure 2a. SMILES string tree representation for decahydro-2,3-dimethylnaphthalene, C(C(CCC1)CC(C2C)C)(C1)C2.

Figure 2b. SMILES string tree representation for the tree in Figure 2a with C1 as root.

The software is coded to recognize a certain large but finite number of fragments (or functional groups). If a particular functional group is not recognized, the system will identify parts of it and proceed. If a functional group is recognized and no values are available, the system skips it, effectively assigning default values for the atoms making up the unit. Once the substituents have been identified and their names stored in the property list, the program progresses to the second knowledge base used to generate the four LSER variable values.
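The ring isolation step can be illustrated with a short Python sketch. It is a simplification, not the authors' method: it builds a plain atom graph instead of the balanced tree described above, and it uses a breadth-first search in place of the branch and bound search to find the shortest path between the two atoms that share a ring-closure number. The function names `parse_smiles` and `ring_atoms` are hypothetical.

```python
from collections import deque


def parse_smiles(smiles):
    """Parse a simplified SMILES string into atoms, bonds, and ring closures.

    Handles single-letter atoms plus Cl and Br, branches in parentheses, and
    single-digit ring-closure numbers. Bond symbols are accepted, but bond
    order is not recorded.
    """
    atoms, bonds, closures = [], set(), []
    branch_stack, open_rings = [], {}
    prev = None                      # index of the atom the next atom bonds to
    i = 0
    while i < len(smiles):
        ch = smiles[i]
        if ch == "(":
            branch_stack.append(prev)
        elif ch == ")":
            prev = branch_stack.pop()
        elif ch.isdigit():
            if ch in open_rings:     # second occurrence closes the ring
                partner = open_rings.pop(ch)
                bonds.add(frozenset((partner, prev)))
                closures.append((partner, prev))
            else:                    # first occurrence opens the ring
                open_rings[ch] = prev
        elif ch.isalpha():
            symbol = smiles[i:i + 2] if smiles[i:i + 2] in ("Cl", "Br") else ch
            atoms.append(symbol)
            if prev is not None:
                bonds.add(frozenset((prev, len(atoms) - 1)))
            prev = len(atoms) - 1
            i += len(symbol) - 1
        elif ch not in "=#":
            raise ValueError(f"unsupported character {ch!r}")
        i += 1
    return atoms, bonds, closures


def ring_atoms(bonds, closure):
    """Return the shortest atom path joining a ring-closure pair.

    The closure bond itself is left out of the graph, so the path found is
    the rest of the ring.
    """
    start, goal = closure
    neighbors = {}
    for bond in bonds:
        if bond == frozenset(closure):
            continue
        u, v = tuple(bond)
        neighbors.setdefault(u, []).append(v)
        neighbors.setdefault(v, []).append(u)
    parent = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in neighbors.get(node, []):
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return []


atoms, bonds, closures = parse_smiles("C(C(CCC1)CC(C2C)C)(C1)C2")
rings = [ring_atoms(bonds, closure) for closure in closures]
```

For the decahydro-2,3-dimethylnaphthalene string used in the text, each of the two rings comes back with six atoms, and the two atoms common to both lists mark the ring fusion.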

Linear Solvation Energy Relationship

In the LSER model (13-17) the toxicity of a contaminant is related to its structure (26). The generalized LSER equation contains three simple and conceptually explicit types of terms:

toxicity = cavity term + dipolar term + hydrogen bonding terms

In this system each fragment of the contaminant molecule contributes both to the energy required to order solvent molecules (water or biosystem medium) around the molecule and to the energies gained or lost through formation of electrostatic and hydrogen bonds between the contaminant and the medium. The general form of the equation used in our expert system is

$$\log_{10}(\text{toxicity}) = m V_I/100 + s\pi^* + b\beta + a\alpha$$

where m, s, b, and a are constants. Numerical values of four LSER variables are generated for each fragment: mV_I/100 is an endoergic energy term that measures the free energy required to separate solvent molecules and provide a suitably sized cavity for the contaminant molecule, and V_I/100 is the intrinsic (van der Waals) molecular volume scaled by a factor of 100, to give magnitudes comparable to the other three variables. The dipolarity/polarizability term, sπ*, measures the generally exoergic effects of solute-solvent, dipole-dipole, and dipole-induced dipole interactions, and π* is a measure of the molecule's ability to stabilize a neighboring charge or dipole by nature of non-specific dielectric interactions. The hydrogen bonding terms bβ and aα measure the exoergic effects of hydrogen bonding, involving the solvent as hydrogen bond donor acid and the solute as hydrogen bond acceptor base (β), and the solute as hydrogen bond donor acid and the solvent as hydrogen bond acceptor base (α).

Essential to the program are the complete sets of variable values for each fundamental structure and fragment we have encountered or that we anticipate will exist in an environmental sample. Some of these values have been formulated by the few published rules (27, 28), but most were computed primarily by extrapolation from other values taken from the literature (6, 7, 13-17, 27-41) and codified into rules. A manuscript listing the comprehensive set of rules is in preparation.

Several conventions of the volume term are available (39-42). The present system uses the convention V_I and makes use of contributions for all fragments and fundamental structures based on extrapolations from previously reported values (6, 7, 13-17, 27-41). The same process was used to devise value contributions of the π*, β, and α variables. The present complete set of variable estimation rules allows prediction of the LSER variables for almost any organic compound.

With regard to the accuracy of these estimated values, predictions for V_I/100 are generally within ±0.02 of literature values, as volumes are strictly additive. For α and β, the limited experimental data available from M. H. Abraham et al. (43, 44) show that the predicted values generally agree within ±0.03 of the experimentally determined data. For π*, there are no experimental data available, but predicted values agree within ±0.03 of published values (6, 7, 13-17, 27-41).
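The additivity of the volume term can be written compactly. In the sketch below, $n_k$ (the number of times fragment $k$ occurs in the molecule) and $v_k$ (the volume contribution assigned to that fragment by the rule base) are notation introduced here for illustration and do not appear in the original:

$$\frac{V_I}{100} \approx \sum_k n_k\, v_k$$

The same kind of summation over fragment contributions is the starting point for π*, β, and α, although for those variables a simple sum is not always adequate.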

Bioassay Data Sets and Multiple Linear Regression Equations

The expert system predicts the acute toxicity of a chemical to four representative aquatic organisms and reports toxicity as either an EC50 or an LC50. The EC50 is the effective concentration at which either 50% of the animals (Daphnia pulex or D. magna) were immobilized or 50% of the luminescence (the Microtox test) was diminished; the LC50 is the lethal concentration for 50% of the fish (fathead minnows) in the study. The regression equations were derived by using one set of toxicity data collected under controlled conditions for each species. The data for Daphnia pulex (7) were obtained at the National Fisheries Research Center-Great Lakes, under the test conditions of a 48-hour exposure at 20°C in reconstituted hard water. The EC50 values were determined by probit analysis. The data sets for the Microtox test, Photobacterium phosphoreum (36), Daphnia magna (7), and fathead minnows, Pimephales promelas (45-48), were taken from the literature. All data sets were examined closely for continuity and applicability of test conditions, close adherence to rigid quality assurance and quality control schemes (49-52), and representation of a wide variety of chemical classes and structural subunits.

Each LSER model is best developed by using a data set containing the widest selection of chemical classes and structures, which are generally representative of environmental samples. The regression equations used in the present system (Table II) were previously developed and discussed (7, 9, 36).

Results and Discussion

Our expert system is the only predictive software available based on the LSER model. This system also represents the first total codification of the rules and fragment contribution values for synthesis of the four parameter values. Previous publications have either listed partial guidelines for specific classes (27, 28) or have alluded to them (7, 13-17, 27-41). Table III demonstrates the predictive ability of the software for both the LSER parameters and contaminant toxicity. The LSER parameter values are composed from a sum of the contributions from each of the fragments, and the predicted values are generally close to the values composed by hand. For some compounds (such as acenaphthene), the values for π*, β, and α may not be accurately represented by a simple sum of fragment variable

Table II. Expert System Multiple Linear Regression Equations

General Equation (a)
$$\log_{10}(\text{toxicity}) = (\text{intercept}) - m V_I/100 - s\pi^* + b\beta - a\alpha$$

1) The Microtox Test (Photobacterium phosphoreum) (b), μM/L
$$\log(\mathrm{EC50}) = 7.49 - 7.39\,V_I/100 - 1.38\,\pi^* + 3.70\,\beta - 1.66\,\alpha$$
N = 40, R² = 0.966, sd = 0.319

2) Daphnia pulex (c), μM/L
$$\log(\mathrm{EC50}) = 4.09 - 4.33\,V_I/100 - 0.05\,\pi^* - 0.13\,\beta - 0.22\,\alpha$$
N = 38, R² = 0.868, sd = 0.418

3) Daphnia magna (c), mM/L
$$\log(\mathrm{EC50}) = 4.18 - 4.73\,V_I/100 - 1.67\,\pi^* + 1.48\,\beta - 0.93\,\alpha$$
N = 53, R² = 0.948, sd = 0.221

4) Fathead minnow (Pimephales promelas) (d), M/L
$$\log(\mathrm{LC50}) = -0.34 - 5.26\,V_I/100 - 0.80\,\pi^* + 3.98\,\beta - 0.80\,\alpha$$
N = 76, R² = 0.970, sd = 0.218

a. See text for explanation of symbols.
b. (36)
c. (7)
d. (9)
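To show how the regression step uses the four LSER variables, the following Python sketch evaluates three of the Table II equations for a single compound. It is an illustration, not the program's regression module. The names `EQUATIONS` and `predict_log_toxicity` are hypothetical, the Daphnia pulex equation is left out for brevity, and the benzene inputs are the predicted LSER values listed in Table III.

```python
# Coefficients transcribed from Table II, in the order
# (intercept, V_I/100, pi*, beta, alpha).
EQUATIONS = {
    "Microtox log EC50 (uM)":      (7.49, -7.39, -1.38, 3.70, -1.66),
    "Daphnia magna log EC50 (mM)": (4.18, -4.73, -1.67, 1.48, -0.93),
    "Fathead minnow log LC50 (M)": (-0.34, -5.26, -0.80, 3.98, -0.80),
}


def predict_log_toxicity(v100, pi_star, beta, alpha):
    """Evaluate each regression equation for one set of LSER values."""
    return {
        name: c0 + cv * v100 + cp * pi_star + cb * beta + ca * alpha
        for name, (c0, cv, cp, cb, ca) in EQUATIONS.items()
    }


# Benzene: V_I/100 = 0.491, pi* = 0.59, beta = 0.10, alpha = 0.00
benzene = predict_log_toxicity(0.491, 0.59, 0.10, 0.00)
```

For benzene the sketch gives roughly 3.42, 1.02, and -3.00, close to the predicted entries of 3.42, 1.01, and -3.01 in Table III. The small differences presumably reflect rounding of the published coefficients.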

EXPERT SYSTEMS FOR E N V I R O N M E N T A L APPLICATIONS

100

T a b l e I I I . P r e d i c t e d (Ρ) v s E x p e r i m e n t a l (E) LSER Parameter V a l u e s and Acute T o x i c i t i e s f o r t h e M i c r o t o x T e s t (MT), Daphnia p u l e x (DP), Daphnia magna (DM), and t h e Fathead Minnow (FM) LSER v a l u e s log(Acute T o x i c i t y ) 2

Compound

P/E

c

V,

π

0

β

a

100

MT EC50 (/IM)

Well-behaved compounds

DM

FM

EC50 (mM)

LC50 (M)

c

Ρ Ε

0.,633 0. 40 0..42 0.,35 0.,690 0. 40 0..45 0.,35

3.,23 2.,71

2. 12

0.,81 0. 32

-2.,63 -3.,02

n-Heptanol Ρ Ε

0.,731 0.,40 0,.42 0.,35 0.,789 0.,40 0,.45 0.,33

2.,51 1.,93

1.,69

0.,35 -0.,09

-3.,15 -2.,53

2-Butanone Ρ Ε

0.,478 0.,65 0,.48 0..00 0.,477 0.,67 0,.48 0..00

4.,84 4.,85

2.,85

1.,54 2.,09

-1. ,51 -1. .35

4-Methyl2-pentanone

Ρ Ε

0.,674 0.,65 0,.48 0..00 0.,663 0.,63 0,.48 0..00

3.,39 2..90

2.,00

0.,61 1..17

-2..54 -2..30

1,2-Dichlo roethane Ε

Ρ0.,376 0.,70 0,.20 0..00 0..442 0.,81 0,.10 0,.00

4..49 4..05

3.,26

1..52 1.,13

-2..10 -2..92

Iodocyclo- Ρ hexane Ε

0..779 0..32 0,.05 0,.00 0..779 0.,32 0,.05 0,.00

Cyclohexane

Ρ Ε

0..598 0.,00 0,.00 0,.00 0..598 0..00 0,.00 0..00

1..35 0..61

-3,.48 -4..27

0MTA

Ρ Ε

1..544 0..06 0 .00 0,.00 1,.444 0,.04 0 .00 0..00

Benzene

Ρ Ε

0,.491 0..59 0 .10 0,.00 0,.491 0,.59 0 .10 0,.00

3,.42 3,.31

2..75

1..01 1..16

-3..01 -3,.40

ο-Xylene

Ρ Ε

0..687 0..59 0 .12 0,.00 0..671 0..51 0 .12 0,.00

2..04 1,.94

1..90

0..12 0..15

-3,.96 -3,.82

Chlorobenzene

Ρ Ε

0..581 0..71 0 .07 0,.00 0..581 0..71 0 .07 0,.00

2,.48 2,.12

2..35

0..34 0..44

-3,.69 -3..77

Naphtha­ lene

Ρ Ε

0..753 0..70 0,.15 0,.00 0..753 0..70 0,.15 0,.00

1,.51

1..61 1..58

-0..33

-4,.28 -4,.32

9HFluorene

Ρ Ε

0..958 0..66 0,.21 0,.00 0..960 0.,66 0 .20 0,.00

n-Hexanol Downloaded by PURDUE UNIV on July 5, 2016 | http://pubs.acs.org Publication Date: July 5, 1990 | doi: 10.1021/bk-1990-0431.ch007

DP EC50 (μΜ)

d

1.,50 1.,53 3..07

2..30

-1. .80 -1. .74

0..68 0..11

Hushon; Expert Systems for Environmental Applications ACS Symposium Series; American Chemical Society: Washington, DC, 1990.

7.

HICKEY E T AJL

Prediction ofAquatic Toxicity ofContaminants Table III. Continued LSER v a l u e s *

Compound

P/E

c

V,

π

log(Acute T o x i c i t y ) β

a

100

0

MT

DP

DM

FM

EC50 (/IM)

EC50 (μΜ)

EC50 (mM)

LC50 (M)

Downloaded by PURDUE UNIV on July 5, 2016 | http://pubs.acs.org Publication Date: July 5, 1990 | doi: 10.1021/bk-1990-0431.ch007

Troublesome compounds

Acenaphthene
P 0.855 1.83 0.30 0.00
E 0.896 0.62 0.17 0.00
-0.24
1.13
-2.49
-5.10 -4.95

Camphor
P 1.594 0.68 0.48 0.00
E 1.106 0.68 0.59 0.00
-3.11 -1.79
-3.58
-7.16 *-3.92

Diethylphthalate
P 1.177 0.93 0.82 0.00
E 1.153 0.90 0.70 0.00
0.10 -0.16
-1.91 *-0.52
-4.55 -3.87

2,4-Pentanedione
P 0.662 1.30 0.96 0.00
E 0.595 0.90 0.90 0.00
4.36
2.08

Phenol
P 0.536 0.72 0.33 0.61
E 0.536 0.72 0.33 0.60
2.74 2.63
2.46
0.36 -0.48
-2.94 -3.94

2-Methylphenol
P 0.634 0.72 0.34 0.61
E 0.634 0.70 0.33 0.57
2.06 2.28
2.03
-0.09 -0.75
-3.42 -3.77

Aniline
P 0.562 0.71 0.50 0.23
E 0.562 0.73 0.50 0.26
3.83
2.44
0.86 *-2.27
-2.10 -2.84

4-Chloroaniline
P 0.652 0.83 0.47 0.23
E 0.652 0.73 0.40 0.31
2.88
2.04
0.19 *-1.59
-2.79 -3.62

4-Fluoroaniline
P 0.591 0.74 0.45 0.28
E 0.591 0.73 0.47 0.23*
3.46
2.31
0.62
-2.40 *-3.82

4-Nitroaniline
P 0.702 1.13 0.70 0.23
E 0.702 1.25 0.48 0.42
-0.21 -0.76
-2.40 -3.04

Butylamine
P 0.472 0.25 0.69 0.00
E 0.535 0.32 0.69 0.00
6.21
2.93
2.55 -0.33 *0.02 *-2.44

Triethanolamine
P 0.803 1.35 1.95 1.05
E 0.840 1.35 2.00 0.85
5.16
1.39
0.03 1.10 0.97 *-1.10

Pyridine
P 0.470 0.87 0.44 0.00
E 0.470 0.87 0.44 0.00
4.44 4.51
2.87
1.15 0.48

Nicotine
P 1.041 1.01 1.14 0.00
E 0.975 1.01 1.17 0.00
0.29 -1.13 0.00 *-2.98
-1.80 -2.93
0.18 *1.34

Table III. Continued

a. See text for definitions of LSER symbols.
b. * Designates actual toxicity values that differ by more than 2σ ("sigma") from the predicted toxicity.
c. P: predicted values when the expert system is used. E: hand-calculated LSER parameter values or experimentally determined toxicity for a species. For hand-calculated LSER variable values, the contributions from all fragments were summed. Some of the values for π*, β, and α were adjusted to reflect either some predominant contribution or a vector sum, which can account for most of the discrepancies between predicted and experimental values. Blank toxicity entries (P or E) indicate no data were available. "Well Behaved Compounds" have LSER system-predicted values that agree within ±0.01 with hand-calculated values and predicted toxicities within ±1 log unit of the actual value. "Troublesome Compounds" can have LSER values that differ by more than ±0.03 from hand-calculated values but generally have actual toxicities ±1 to 3 log units different from the predicted values.
d. OTMA: octahydro-1,4,9,9-tetramethyl-1H-3a,7-methanoazulene
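The "Well Behaved" and "Troublesome" criteria in the footnotes can be illustrated with a short Python sketch. This is not the authors' muLISP code. The function name `classify_compound`, the single cutoff at ±0.01 (the footnotes leave the range between ±0.01 and ±0.03 undefined), and the reading of the last two log-toxicity entries of a table row as a (predicted, observed) pair are all assumptions made for illustration.

```python
# Minimal sketch of the Table III footnote criteria (assumptions noted below).
# Assumption: a compound is "well behaved" only if every LSER parameter agrees
# within lser_tol AND the toxicity gap is within tox_tol log units; everything
# else is labeled "troublesome". The footnotes do not define the middle ground.

def classify_compound(lser_pred, lser_hand, log_tox_pred, log_tox_obs,
                      lser_tol=0.01, tox_tol=1.0):
    """Compare predicted vs hand-calculated LSER values and log toxicities."""
    eps = 1e-9  # guards against floating-point noise at the tolerance boundary
    max_lser_gap = max(abs(p - e) for p, e in zip(lser_pred, lser_hand))
    tox_gap = abs(log_tox_pred - log_tox_obs)
    well_behaved = max_lser_gap <= lser_tol + eps and tox_gap <= tox_tol + eps
    return {
        "label": "well behaved" if well_behaved else "troublesome",
        "max_lser_gap": max_lser_gap,
        "tox_gap_log_units": tox_gap,
        # A gap of n log units corresponds to a factor of 10**n.
        "fold_difference": 10 ** tox_gap,
    }


# LSER rows (V_i/100, pi*, beta, alpha) copied from Table III; the toxicity
# pairs are ASSUMED to be (predicted, observed) for the same species.
benzene = classify_compound((0.491, 0.59, 0.10, 0.00),
                            (0.491, 0.59, 0.10, 0.00), -3.01, -3.40)
camphor = classify_compound((1.594, 0.68, 0.48, 0.00),
                            (1.106, 0.68, 0.59, 0.00), -7.16, -3.92)

for name, result in (("Benzene", benzene), ("Camphor", camphor)):
    print(name, result["label"], round(result["tox_gap_log_units"], 2))
```

Under these assumptions, benzene falls in the first group and camphor in the second, which matches where the table places them.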

contributions (which is the present situation); a vector sum or a sum with a fragment hierarchy of importance (use or not use, and to what degree) may give a more accurate value. We occasionally used this approach to compute the hand-calculated values in Table III, which accounts for the discrepancies occasionally seen between predicted and experimental values for LSER variables.

Even when the expert system's estimates of LSER parameter values match our hand calculations, predicted toxicities can differ from the observed toxicity by one to three orders of magnitude (see "Troublesome Compounds" in Table III). Our expert system was developed to predict toxicities according to one mode of action (nonpolar, non-reactive narcosis) for neutral organic molecules with no special physical property considerations. Toxicities for such compounds have been predicted by our system within one order of magnitude of the observed value and generally within a factor of 5. Compounds that react with the biosystem (e.g., aldehydes and amines) generally are 10 to 1000 times more toxic than predicted by the present system; and compounds that ionize at test pH conditions (organic acids), that are highly volatile (e.g., camphor), that have low water solubility, or that do not diffuse across cell membranes (very long chain alcohols) will have an observed toxicity of only 1/10 to 1/100 of that predicted here. Toxicity would also be difficult to estimate for organic esters (like phthalates) and amides because hydrolysis under the test conditions would produce at least two different, possibly toxic, molecules (19).

Expert System Capability, Utility, and Limitations

The expert system is capable of evaluating compounds with either ring or chain skeletons, double and triple bonds, and all common functional groups (e.g., acids, alcohols, amines, esters, and halides). The ring structures include cyclohexane and benzene derivatives; multiple and condensed ring structures such as polychlorinated biphenyls (PCBs), naphthalenes, and higher polynuclear aromatic hydrocarbons and their heterocyclic analogs such as nicotine and pyrrole; and the corresponding saturated ring systems.

The LSER models provide consistently better correlations with toxicity than do other widely used QSAR models that depend simply on the partitioning of contaminants into octanol and water (28). The LSER models have thus far been applied to only a few data sets for aquatic organisms (7, 9, 36-38), although they could be used to predict toxicity to a wide variety of aquatic organisms, as well as to model specific mechanisms of toxicity (37, 38).

We are currently using the expert system to help estimate toxicity of chemicals before we begin bioassays, to shorten the time spent on range-finding tests. The system also may be used as part of a hazard assessment scheme, to evaluate the toxicity of compounds detected in environmental samples by gas chromatography/mass spectrometry (GC/MS). The system could act as a screening tool in an initial evaluation of contaminants detected at a site of concern.

Some limitations of the system in chemical recognition and in the estimation of values of LSER variables must be addressed. As in any viable, evolving expert system, there are still software problems and limitations being studied. For example, certain SMILES designations cannot be analyzed correctly, and certain obscure chemical fragments (class types) are identified incorrectly. Also, there is no provision to differentiate between possible geometric (cis vs trans) or optical isomers of compounds. The different forms have different observed toxicities, but no applicable weighting scheme or input designation has been developed for the present system.

Future

The software now uses structurally intrinsic parameters for only one QSAR model (LSER), and the results are used to predict one property (acute toxicity) to four aquatic species by one mechanism (nonreactive, nonpolar narcosis); however, we intend to continue to refine our equations as databases grow, incorporate other models, predict other properties, and include other organisms. We will attempt to differentiate between modes of toxic action and improve our estimates accordingly. For the widely divergent classes of chemicals and types of environmental behavior, no one model will best describe every situation and no one species is the optimal organism to monitor. As the software evolves, the expert system should choose the best model based on the contaminant, the species, and the property to be predicted (e.g., toxicity or bioaccumulation).
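As a rough illustration of what adjusting estimates by mode of toxic action could look like, the Python sketch below attaches the deviation ranges quoted earlier in this chapter to a baseline narcosis prediction. It is not part of the expert system. The category names and the function `observed_log_conc_range` are hypothetical, and the sketch assumes that "k times more toxic" means an effect concentration (EC50 or LC50) k times lower, that is, log10(k) units lower on the log scale.

```python
import math

# Observed-vs-predicted toxicity factors quoted in the chapter text.
# A factor above 1 means "more toxic than predicted". Category names are
# hypothetical labels introduced only for this sketch.
TOXICITY_FACTOR_RANGE = {
    # neutral, nonreactive compounds: generally within a factor of 5
    "nonpolar_narcosis": (1 / 5, 5),
    # compounds that react with the biosystem: 10 to 1000 times more toxic
    "reactive": (10, 1000),
    # ionized, highly volatile, poorly soluble, or non-diffusing compounds:
    # observed toxicity only 1/10 to 1/100 of that predicted
    "limited_exposure": (1 / 100, 1 / 10),
}


def observed_log_conc_range(predicted_log_conc, category):
    """Return (low, high) bounds on the expected observed log EC50/LC50.

    Assumption: higher toxicity by a factor k lowers the effect
    concentration by log10(k) log units.
    """
    low_factor, high_factor = TOXICITY_FACTOR_RANGE[category]
    return (predicted_log_conc - math.log10(high_factor),
            predicted_log_conc - math.log10(low_factor))


# Example: a baseline prediction of log EC50 = 2.0 under each category.
for category in TOXICITY_FACTOR_RANGE:
    low, high = observed_log_conc_range(2.0, category)
    print(f"{category}: {low:.2f} to {high:.2f}")
```

A lookup of this kind only widens or shifts the expected range; it does not replace a model fitted to compounds that act by the other mode.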
In addition, we envision an interactive screen system for data entry that will bypass the SMILES notation and let the user describe the molecule by answering a series of questions about the compound's backbone and functional groups. The responses will translate directly into values of LSER variables. A preliminary version of this system is now available. Our ultimate objective is to produce a user-friendly expert system for use in the evaluation of contaminants at specific sites.

Acknowledgments

We dedicate this work to the memory of the late Mortimer J. Kamlet, whose inspiration and drive helped bring his LSER concept to a workable hypothesis that made this work possible. We thank Dr. Amjad Umar (University of Michigan) for concepts and guidance in artificial intelligence and Dr. Michael H. Abraham (University College London) for his consultation and encouragement in LSER theory and practice. Use of trade names does not constitute Government endorsement of commercial products.

Literature Cited

1. Hesselberg, R. J.; Seelye, J. G. Identification of organic compounds in Great Lakes fishes by gas chromatography/mass spectrometry: 1977. Administrative Report 82-1. Great Lakes Fishery Laboratory, Ann Arbor, Michigan. 1982.
2. Great Lakes Water Quality Board. 1987 Report on Great Lakes Water Quality. International Joint Commission, Windsor, Ontario. 1987. 236 pp.
3. Passino, D. R. M.; Smith, S. B. Environ. Toxicol. Chem., 1987, 6, 901-907.

4. Savino, J. F.; Tanabe, L. L. Bull. Environ. Contam. Toxicol., 1989, 42, 778-784.
5. Passino, D. R. M.; Smith, S. B. In QSAR in Environmental Toxicology; K. L. E. Kaiser, ed.; D. Reidel, Dordrecht, Holland, 1986, pp 261-270.
6. Passino, D. R. M. Proceedings of the Technology Transfer Conference, Ontario Ministry of the Environment, Toronto, Ontario, 1986, Part B, pp 1-26.
7. Passino, D. R. M.; Hickey, J. P.; Frank, A. M. Proceedings of QSAR 88: Third International Workshop on Quantitative Structure-Activity Relationships in Environmental Toxicology, 1988, pp 131-146.
8. Hickey, J. P.; Passino, D. R. M.; Frank, A. M. Preprint of Papers, 3rd Chemical Congress of North America and 195th ACS National Meeting, 1988, 28, 521-523.
9. Hickey, J. P.; Passino, D. R. M.; Kamlet, M. J. Program and Abstracts. 22nd Great Lakes Regional Meeting, American Chemical Society. 1989. Abstract No. 6.
10. Waterman, D. A. A Guide to Expert Systems. Addison-Wesley, Reading, MA, 1986.
11. Brundick, F.; Dumar, J.; Hanratty, T.; Tanenbaum, P. In Second Conference on Artificial Intelligence Applications: The Engineering of Knowledge-Based Systems. 1985.
12. Hushon, J. M. Environ. Sci. Technol., 1987, 21, 838-841.
13. Taft, R. W.; Abboud, J.-L. M.; Kamlet, M. J.; Abraham, M. H. J. Solution Chem., 1985, 14, 153-175.
14. Kamlet, M. J.; Doherty, R. M.; Abboud, J.-L. M.; Taft, R. W. Chemtech, 1986, 16, 566-576.
15. Kamlet, M. J.; Taft, R. W. Acta Chem. Scand., 1985, B39, 611-628.
16. Kamlet, M. J.; Abboud, J.-L. M.; Taft, R. W. Prog. Phys. Org. Chem., 1981, 13, 485-630.
17. Abraham, M. H.; Doherty, R. M.; Kamlet, M. J.; Taft, R. W. Chem. Britain, 1986, 22, 551-554.
18. Burns, N. A.; Ashford, T. J.; Iwaskiw, C. T.; Starbird, R. P.; Flagg, R. L. IBM Systems Journal, 1986, 25, 2.
19. Leo, A.; Weininger, D. CLOGP Version 3.2 User Reference Manual. Medicinal Chemistry Project, Pomona College, Claremont, CA, 1984.
20. SMILES: A line notation and computerized interpreter for chemical structures. Environmental Research Brief. U.S. Environmental Protection Agency, Environmental Research Laboratory, Duluth, MN, 1987.
21. Weininger, D. J. Chem. Information Computer Sci., 1988, 28, 31-36.
22. SMILES User Manual. Institute for Biological and Chemical Process Analysis (IPA), Montana State University, Bozeman, MT. 1987.
23. QSAR System User Manual, Montana State University, Bozeman, MT. 1987.
24. MedChem Software Manual. Medicinal Chemistry Project, Pomona College, Claremont, CA, 1985.
25. Winston, P. H. Artificial Intelligence, 2nd ed. Addison-Wesley, Reading, MA, 1984.

26. Casarett, L. J.; Doull, J. Toxicology, 3rd ed. C. D. Klaassen, M. O. Amdur, J. Doull, eds. MacMillan, New York, 1986.
27. Kamlet, M. J.; Doherty, R. M.; Abboud, J.-L. M.; Abraham, M. H.; Taft, R. W. J. Pharm. Sci., 1986, 75, 338-349.
28. Kamlet, M. J.; Doherty, R. M.; Carr, P. W.; Mackay, D.; Abraham, M. H.; Taft, R. W. Environ. Sci. Technol., 1988, 22, 503-509.
29. Kamlet, M. J.; Doherty, R. M.; Abraham, M. H.; Carr, P. W.; Doherty, R. F.; Taft, R. W. J. Phys. Chem., 1987, 91, 1996-2004.
30. Taft, R. W.; Abraham, M. H.; Famini, G. R.; Doherty, R. M.; Kamlet, M. J. J. Pharm. Sci., 1985, 74, 807-814.
31. Kamlet, M. J.; Doherty, R. M.; Abraham, D. J.; Taft, R. W.; Abraham, M. H. J. Pharm. Sci., 1986, 75, 350-355.
32. Kamlet, M. J.; Doherty, R. M.; Fiserova-Bergerova, V.; Carr, P. W.; Abraham, M. H.; Taft, R. W. J. Pharm. Sci., 1987, 76, 14-17.
33. Leahy, D. E.; Carr, P. W.; Pearlman, R. S.; Taft, R. W.; Kamlet, M. J. Chromatographia, 1986, 21, 473-478.
34. Sadek, P. C.; Carr, P. W.; Doherty, R. M.; Kamlet, M. J.; Taft, R. W.; Abraham, M. H. Anal. Chem., 1985, 57, 2971-2978.
35. Carr, P. W.; Doherty, R. M.; Kamlet, M. J.; Taft, R. W.; Melander, M.; Horvath, C. Anal. Chem., 1986, 58, 2674-2680.
36. Kamlet, M. J.; Doherty, R. M.; Veith, G. D.; Taft, R. W.; Abraham, M. H. Environ. Sci. Technol., 1986, 20, 690-695.
37. Kamlet, M. J.; Doherty, R. M.; Taft, R. W.; Abraham, M. H.; Veith, G.; Abraham, D. J. Environ. Sci. Technol., 1987, 21, 149-155.
38. Kamlet, M. J.; Doherty, R. M.; Abraham, D. J.; Taft, R. W. Quant. Struct. Activ. Relat., 1988, 7, 71-78.
39. Leahy, D. E. J. Pharm. Sci., 1986, 75, 629-636.
40. Pearlman, R. S. Partition Coefficient Determination and Estimation; W. J. Dunn, J. H. Block, and R. S. Pearlman, eds.; Pergamon Press, New York, 1986, pp 3-20.
41. Bondi, A. J. Phys. Chem., 1964, 68, 441-449.
42. Abraham, M. H.; McGowan, J. C. Chromatographia, 1987, 23, 243-246.
43. Abraham, M. H.; Buist, G. J.; Grellier, P. L.; McGill, R. A.; Prior, D. V.; Oliver, S.; Turner, E.; Morris, J. J.; Taylor, P. J.; Nicolet, P.; Maria, P.-C.; Gal, J.-F.; Abboud, J.-L. M.; Doherty, R. M.; Kamlet, M. J.; Shuely, W. J.; Taft, R. W. J. Phys. Org. Chem., 1989, 2, 540-552.
44. Abraham, M. H.; Grellier, P. L.; Prior, D. V.; Duce, P. P.; Morris, J. J.; Taylor, P. J. J. Chem. Soc. Perkin Trans. II, 1989, 699-711.
45. Acute Toxicities of Organic Chemicals to Fathead Minnows (Pimephales promelas). Brooke, L. T.; Geiger, D. L.; Call, D. J.; Northcott, C. E., eds. Center for Lake Superior Environmental Studies, University of Wisconsin, Superior, WI, 1984-1987, Vol. 1-4.
46. Veith, G. D.; Call, D. J.; Brooke, L. T. Can. J. Fish. Aquat. Sci., 1983, 40, 743-748.
47. Hall, L. H.; Kier, L. B.; Phipps, G. Environ. Toxicol. Chem., 1984, 3, 355-365.
48. Hall, L. H.; Kier, L. B. Environ. Toxicol. Chem., 1985, 5, 333-337.

49. Hamilton, M. A.; Russo, R. C.; Thurston, R. V. Environ. Sci. Technol., 1977, 7, 714-719. [Correction 1978, 12, 147].
50. American Public Health Association, American Water Works Association, and Water Pollution Control Federation. Standard methods for the examination of water and wastewater, 16th ed., New York, 1985.
51. Committee on Methods for Toxicity Tests with Aquatic Organisms. 1975. EPA-660/3-75-009. U.S. Environmental Protection Agency, Duluth, MN.
52. American Society for Testing and Materials. Annual Book of ASTM Standards. E729-80. American Society for Testing and Materials, Philadelphia, PA, 1980.

RECEIVED January 9, 1990
