Molecular Modeling - ACS Publications - American Chemical Society

Chapter 1

Molecular Modeling Transferring Technology to Solutions

Downloaded by IMPERIAL COLL LONDON on June 16, 2014 | http://pubs.acs.org Publication Date: December 14, 1994 | doi: 10.1021/bk-1994-0576.ch001

Michael N. Liebman Bioinformatics Program, Amoco Technology Company, Mail Code F-2, 150 West Warrenville Road, Naperville, IL 60563-8460

The changes which are being realized in the food and agricultural product areas suggest that the time is appropriate to consider transfer of technology to assist in the rational design of new products and assessment of potential benefit and risk. In this overview we present an introduction to where molecular modeling is today and where its evolution is headed in terms of hardware and software capabilities. The difficult issues involved in complex problem-solving are identified in terms of relevant research goals to reveal value in identifying and solving the correct problem to reach these goals. The value in looking to new methodologies and applications is also described.

Man has long attempted to effect control over the plants and animals which have shared his environment and provided food, fuel, clothing and early forms of transportation. This control has involved development and utilization of crop science, soil science and animal husbandry to effect enhanced food quality and production, resistance to disease and enhancement of other consumer-driven characteristics, e.g. color, taste, texture. Historically, an advantage in applying these technologies to agricultural products, whether as chemical additives or modulators or cross-breeding in plants or animals, has been the ease of access to the experimental system. Recent focus by both

0097-6156/94/0576-0001$08.00/0 © 1994 American Chemical Society In Molecular Modeling; Kumosinski, T., et al.; ACS Symposium Series; American Chemical Society: Washington, DC, 1994.


regulatory and consumer groups indicate the need to improve methods for characterizing experimental systems and to establish more accurate risk assessment capabilities. Access to information concerning the genetics, physiology and the structure and function of molecular targets in agriculture has added to the complexity of resolving these issues but also provides the opportunity to develop applications using more rational approaches. The requirements for developing rational approaches to agricultural applications include both the integration of computational and experimental methodologies and the informatics and database technologies necessary to support the integration. Many of these component technologies have been developed for use in pharmaceutical research, where the target organism is not suitable for experimental development and model organisms only serve as candidate screens during drug development. It is thus of benefit to agricultural researchers to learn what technologies in rational drug design and protein engineering might be suitable for transfer to agricultural applications.

Common Paths, Common Needs and Technology Transfer

Product Development Path- The proposal for the potential transfer of technology can be better understood when one examines the analogy between agricultural and pharmaceutical needs. This is described in terms of the customers which comprise each area's respective product development path (Figures 1 and 2). Although the commonly held view might be that drug development proceeds along a different path and with different priorities, the analogous representation of pharmaceutical product development can also be viewed in terms of a sequence of producers and consumers.

Product Development Needs- While these paths may appear to involve different priorities, both have a fundamental need to develop an accurate analysis of the relationship between structure and function of the (macro)molecules which serve as targets for the products or as the products themselves. Within the pharmaceutical industry, regulatory and economic considerations have resulted in stronger emphasis on the development and implementation of computational and experimental tools to aid in the determination of structural identities and features relevant to product development. The focus of the papers presented here is on providing an overview of the experimental and computational tools which represent current state-of-the-art, along with examples of their application to actual problems, to serve as a vehicle to support the transfer of appropriate technologies to problem solving in the agriculture domain.

Potential for Technology Transfer- The methods and


PRODUCER/FARMER (crop enhancers, herbicides, pesticides, temperature, yield) --> PROCESSOR (stability, transport, cost, regulatory) --> CONSUMER (taste, color, flavor, cost)

Figure 1. Agricultural Product Development Pathway showing the progression from source to processor to consumer, with their associated priorities and goals. This is constructed analogous to the Pharmaceutical Development Path in Figure 2.

PRODUCT IDENTIFICATION (reactivity, specificity, screening, lead compounds) --> PRODUCT DEVELOPMENT (regulatory, cost, delivery, stability, patent issues) --> CONSUMER (cost, side-effects, efficacy)

Figure 2. Pharmaceutical Development Path showing the progression from product or lead identification to product development to consumer, with their associated priorities and goals.


problems which are presented represent most of the continuum of problem areas which confront biological research, incorporating both today's existing information as well as in anticipation of access to genome sequences. This continuum includes gene location (either for target selection or as desired product), structural organization of the gene product (i.e. protein or enzyme), structural response to environmental factors (i.e. conformational), definition of specificity (in vivo versus in vitro), participation in higher order organizations (e.g. biological pathways), secondary effects produced by inter-pathway interaction (i.e. side-effects), and clinical observations (e.g. existing disease or risk assessment). This extension, from gene location to clinical observation, is viewed as the basis for defining the "new" medicine which will follow access to the sequence of the human genome and high speed sequencing tools. It can be readily seen to have a parallel in agriculture, e.g. development or selection of crop for particular disease resistance, etc. The commonality of these two broad applications resides in their utilization of the hierarchical processing of biological information from the genome through the gene product (Figure 3) to its biological function or dysfunction (Figure 4). The key to understanding the impact of this information processing is in the analysis of the structure and resulting function of the protein.

Common Goals

Sequence to Structure- A longstanding goal of structural biology has been the accurate prediction of the secondary and tertiary structure of a protein from its constituent amino acid sequence. The tacit assumption has been that adequate information exists within the amino acid sequence to enable the correct three-dimensional structure to be assembled during protein synthesis. This interpretation is based on the observed fidelity of protein conformation associated with a specified polypeptide sequence, and the ability of the sequence to return to its native conformation following denaturation in an environment devoid of other protein synthesis components, e.g. ribosomes. Significant computational and experimental research efforts have examined both the correlation of the amino acid sequence, or sequence-derived properties, with the resultant protein structure, and the mechanism by which the folding occurs. Most notable in the more than 20 year period of activity in this area has been the growth of the database of atomic resolution protein structures, initially observed by x-ray crystallography and more recently by NMR; the evolution of the methods used to analyze this data; and evolution of those questions which scientists hope to answer through such studies. Since the first determination of the three-dimensional structure of a protein in a crystalline environment, the


Chromosome
Gene (DNA)
Gene sequence: AGC UUC GAU CCG UUA UUG CAC AUC GGA
Protein (amino acid): Ser - Phe - Asp - Pro - Leu - Leu - His - Ile - Gly
Protein (three-dimensional structure)

Figure 3. The "Holy Grail" of molecular biology as represented by the transformation of information from nucleus to chromosome to gene to gene sequence to amino acid sequence to three-dimensional structure of the gene product (protein).
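
The step from gene sequence to amino acid sequence in Figure 3 is a table lookup, unlike the later step from sequence to three-dimensional structure. A minimal sketch in Python, assuming the standard genetic code and covering only the nine codons legible in the figure; the names `CODON_TO_RESIDUE`, `translate` and `FIGURE_3_CODONS` are illustrative and do not come from the chapter.

```python
# Standard genetic code, restricted to the codons shown in Figure 3.
# (Illustrative subset; a full table has 64 entries.)
CODON_TO_RESIDUE = {
    "AGC": "Ser", "UUC": "Phe", "GAU": "Asp",
    "CCG": "Pro", "UUA": "Leu", "UUG": "Leu",
    "CAC": "His", "AUC": "Ile", "GGA": "Gly",
}


def translate(codons):
    """Map a whitespace-separated codon string to three-letter residue names."""
    return [CODON_TO_RESIDUE[codon] for codon in codons.split()]


FIGURE_3_CODONS = "AGC UUC GAU CCG UUA UUG CAC AUC GGA"
peptide = translate(FIGURE_3_CODONS)
print(" - ".join(peptide))
```

The lookup reproduces the residue string printed in the figure; nothing comparable exists for the final arrow, which is the prediction problem the chapter goes on to discuss.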

Protein (three-dimensional structure)
Protein Function: Enzyme Pathways, Catalysts, Diagnostics
Protein Modification: Site-directed Mutagenesis, Chemical Modification
Protein Dysfunction: Disease States, Genetic Abnormalities, Diagnostics
Molecular Design to Protein Target: Inhibitors, Activators, Therapeutics/Diagnostics

Figure 4. The targeted values associated with understanding the relationship between the three-dimensional structure of a protein and its biological function.


information base has grown to more than 1000 structures, most of which are publicly available through the Protein Data Bank. This growth has been nonlinear and is indicative of technological advances which permit the higher resolution, greater than 2.0 Å resolution, of many of the new entrants. Structures, or sets of conformations which are consistent with observed distance constraints, as revealed by two-dimensional NMR, have become recent additions to this database. As this database has grown, the appropriateness for application of statistical methods has increased, and studies have ranged from the correlation of local, secondary structural features, to longer range, tertiary structural interactions. This structural distinction is typically maintained through the selection of a viewing window onto the amino acid sequence, where the window boundaries are defined in terms of distance apart along the polypeptide chain. Such correlations have attempted to utilize both the first order data, i.e. amino acid sequence, and physicochemical parameters that may be derived from the sequence, e.g. hydrophobicity. More recent approaches have utilized the methodologies developed in computer science to explore knowledge bases and pattern recognition, as well as the connectionist approach.
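
The viewing window described above can be sketched as a moving average of a sequence-derived property. The example below is a hedged illustration in Python: the chapter names hydrophobicity but no particular scale, so the Kyte-Doolittle hydropathy values used here are an assumption, as are the function name `window_profile` and the window length. The test sequence is the nine-residue peptide of Figure 3 in one-letter code.

```python
# Kyte-Doolittle hydropathy values (an assumed choice of scale;
# the chapter does not specify one).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_profile(sequence, window=7):
    """Mean hydropathy over each run of `window` consecutive residues."""
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    values = [KYTE_DOOLITTLE[residue] for residue in sequence.upper()]
    return [
        sum(values[start:start + window]) / window
        for start in range(len(values) - window + 1)
    ]


# Ser-Phe-Asp-Pro-Leu-Leu-His-Ile-Gly from Figure 3, one-letter code.
profile = window_profile("SFDPLLHIG", window=3)
for start, score in enumerate(profile, start=1):
    print(f"residues {start}-{start + 2}: {score:+.2f}")
```

A short window emphasizes local, secondary structural features; widening it trades that resolution for sensitivity to longer range trends, which is the distinction the text draws.
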
The most striking evolution in the process of protein structure prediction is reflected in the underlying questions and goals which continue to drive the conversion of information to knowledge, and reflects the more rapid technological advances and needs for the application of protein engineering and biotechnology. The difficult conversion of information into knowledge both derives benefit and suffers from increased access to computers and databases. The goal of determining the three-dimensional structure of a protein from its constituent amino acid sequence stems from the universal acceptance that the function, in vivo and in vitro, of an enzyme (catalyst protein) is a direct consequence of its folded shape and its physical properties. The capability of describing and understanding the relationship between structure and function in these biological macromolecules is essential to success in modern medical applications including therapeutics and diagnostics, as well as in developing biotechnology-based chemical synthesis and dealing with impending environmental issues (Figure 4). Research in this area has been a significant focus of both the biotechnology and pharmaceutical industries. Public domain sequence databases are growing almost exponentially, fueled in part by the increasing interest in the Human Genome Sequencing Project at the international level. Structural information lags behind the sequence information by orders of magnitude with no obvious path to reduce this gap. The problem, from a structure prediction basis, can be viewed as the transfer of information along a pathway:

Gene Sequence --> Protein Sequence --> Protein Structure --> Protein Function

Structure to Function- This sequence-to-structure pathway actually crosses only the first major barrier confronting researchers in this area. The most significant utilization of the structural information for proteins and enzymes derives from determination of their functional characteristics. This is of particular relevance to the sequence information provided in the Human Genome Initiative (or similarly from plant genomes) depicting several areas of near-term and long-term commercial potential. As in much of research, we ask questions which may not always address the information we need to know, but rather are bounded by our perception of what we believe can be answered. Although frequently unstated, we would like to accurately predict the structural features of an enzyme necessary to
1. understand the structural and evolutionary source of its specificity and reactivity;
2. understand adaptation through tissue, organism or even phylum differences;
3. design more specific inhibitors, substrates or activity modulators;
4. design and perform site-directed mutations which can predictably enhance thermal stability, response to pH, alter specificity or reactivity parameters;
5. identify/classify a protein as to its functionality based on analysis of its amino acid/nucleic acid sequence alone;
6. address potential issues of patentability, patent protection or patent avoidance in defining new materials through these processes; and
7. acquire and maintain such knowledge in a form which can be applied in a potentially enhanced and proprietary manner, i.e. commercialize the product, the process and/or the knowledge.
We can summarize the value of the analysis of protein structure and its relation to function by representing the role of a protein in function and dysfunction, and as a target for designing a molecule, or as the molecule to be produced in a modified or unmodified form as shown in Figure 4.

Virtual Tools of Molecular Modeling

Successfully analyzing the complex relationship between structure and function requires data and information from both experimental and computational technologies and the integration of fundamental principles from a range of disciplines. The rapidly evolving (and improving) cost-performance curve is placing high


performance computer technology on the desktop. Access to high resolution, interactive computer graphics, conformational energy minimization, molecular dynamics, quantum mechanical calculations, homologous protein structure generation and sequence database searching are examples of some tools which can be routinely accessed through commercial molecular modeling and molecular biology software packages on these desktop workstations. The issue is no longer what can we integrate, but rather what should we integrate to solve a problem. In Figure 5, we show the range of tools which are commonly explored, both computationally and experimentally, when starting at the top with information about the gene product in terms of sequence, structure and/or concerning inhibitors or modulators of its function. The rational progression towards the application goals at the bottom can follow a multitude of paths, with no single path presently able to guarantee success, even for a specific class of problem, e.g. designing an in vivo inhibitor for a known enzyme structure. Access to the component tools has been significantly enhanced through use of commercial packages for molecular modeling, e.g. Biosym, Cambridge Scientific, ChemDesign, MDL Info Systems, Molecular Simulations, Oxford Molecular and Tripos, as well as for molecular biology, e.g. DNAStar, GCG, IntelliGenetics, MacVector, etc. These provide access to both the computational tools and user-friendly interfaces as well as linkages to the rapidly expanding databases of structure and sequence data. The problem is not how to access relevant data and information, but rather how to extract knowledge to make the information useful. Years of evolution have enabled biology to develop the appropriate rules and biological/biochemical/biophysical tools essential to process most of this information successfully, although mistakes do occur. The challenge to our achieving some degree of success lies in developing the ability to identify the information necessary to solve the problem from the wide band of information which is available, and develop the tools necessary to identify, select and apply that information efficiently. This challenge generalizes to any of the information-intense areas which confront us.

Defining Real Problems

The key to addressing this involves 1) learning how to identify the right question; 2) being able to evaluate if adequate information exists to answer that question and what that information is; and 3) evaluating the impact of the inherent bias present in either the answer or method we used to find the answer.

Asking the Right Question- The most difficult, and yet the most significant step in solving a problem involves establishing what are the actual goals to accomplish to


[Figure 5 panel labels: COMPUTATION; EXPERIMENT; structure prediction; database searching; molecular dynamics; homology building; ab initio calculations; computer graphics/3-D visualization; database generation; spectroscopy (NMR, UV, IR, fluorescence); active analog/lead compound; energy minimization; structure modification/stabilization; van der Waals surfaces; electrostatic potential maps; enzyme-inhibitor docking; diffraction (x-ray, neutron); site-directed mutagenesis; quantitative structure-activity relationships]

Figure 5. The "real-world" situation involving taking proprietary information and selecting the correct "pathway" (i.e. combination of experimental and computational molecular modeling approaches) to produce the desired knowledge or application.


complete the task. A general observation is that many questions are frequently stated which do not adequately reflect these goals, but rather reflect an anticipation of the solution or approach to solving the problem based on the existing knowledge of the person presenting the problem to be solved. As a result, the same question may be presented by different investigators, yet their actual needs may be very different. We can examine this distinction by expanding the question to examine underlying issues, and readily recognize that this is not a unique characteristic of problem-solving in the pharmaceutical or biological domains. Thus the first stage of problem solving involves an identification and analysis of the true goals. An example of a question, and one frequently asked, concerns the ability to predict the three-dimensional structure of a protein from its amino acid sequence. Can we predict the structure of a protein from its amino acid sequence? Initially confronted with this, some of the underlying questions include:
- do we want to predict protein structure or identify protein family?
- can we predict tertiary structure or, at least, secondary structure composition?
- how accurate a prediction is needed to be of value?
- how do we assess the accuracy of the predicted structure? by an overall comparison or by correlation of structural or functional features?
- an accuracy of 90-95% is a significant improvement over the current 65-70% barrier, but is it adequate to enable accurate determination of structure and function?
- what accuracy do we need for rational design of an active site inhibitor, given that a full x-ray structure, i.e. 100% accuracy, does not guarantee success?
- what accuracy do we need to rationally design a site-directed mutation?
- can we understand differences in physiological function among homologous enzymes?
- is it necessary to predict structure from computational approaches without the benefit of using available experimental data, given that first principle calculations are not possible?
- do we want to inhibit an enzyme or the physiological process in which it functions?
- do we really need to predict structure, or predict and identify function within a molecule?

Do We Have Enough Information- In addition, it is necessary to evaluate whether sufficient information is available to answer the question presented, or the underlying questions:
- does an amino acid sequence correlate with a single structure, or a family of structures?
- is this correlation based on the amino acid identity, or


on the physicochemical properties that an amino acid is capable of providing in a given environment?
- does experimental data, e.g. spectroscopy, physicochemical properties, provide additional information?
- are there homologous proteins, either in the traditional sense of evolutionary homology, or as structural or functional homologues?
- are the specificity profiles developed for substrates and inhibitors using in vitro assays for a given enzyme adequate to establish the in vivo structure-function role which governed its evolution?
- if the goal is inhibition of a process, is the enzyme being studied the best target, or the target of convenience?
- can information about natural mutations and their functional differences provide insight?
- do we adequately understand how the molecule works, either in vitro or in vivo, even if we can accurately predict or describe its structure?
- can we use the modeling to rationally design experimental approaches which can assist in enhancing the prediction method?
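
The accuracy percentages quoted among the questions on asking the right question (65-70% versus 90-95%) are not defined in the text. One plausible reading, offered here only as an assumption, is per-residue agreement between predicted and observed secondary structure states. A minimal Python sketch of that measure follows; the function name `per_residue_accuracy`, the three-state alphabet (H helix, E strand, C coil) and both example strings are hypothetical and are not data from the chapter.

```python
def per_residue_accuracy(predicted, observed):
    """Fraction of positions where the predicted state equals the observed one."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


# Hypothetical 16-residue assignment: H = helix, E = strand, C = coil.
observed = "CCHHHHHHCCEEEECC"
predicted = "CCHHHHCCCCEEECCC"

accuracy = per_residue_accuracy(predicted, observed)
print(f"per-residue accuracy: {accuracy:.1%}")
```

In this made-up case every mismatch falls at the end of a helix or strand, so a single overall percentage hides where the prediction fails. That is the distinction the list draws between an overall comparison and a correlation of structural or functional features.
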
Evaluating Bias in Data or Methodology. Also of importance is the need to examine the possibility of introducing or overlooking biases which may be present in either the experimental or computational methods of analysis, or in selecting the data used for the analysis:
-can any basis set of proteins adequately be constructed as a model for the world of proteins without introducing some bias?
-do specificities towards active-site directed inhibitors and substrates adequately assay the natural specificities of enzymes which interact specifically with macromolecules?
-are environmental biases present, e.g. pH, temperature, compartmentalization?
-does enzyme mechanism adequately provide an objective means for classification, e.g. Enzyme Commission (E.C.) numbers?
-do methods of analysis, which commonly emphasize similarity, enable detection of discrete but potentially significant structural and functional differences?

Dealing With Bias. By addressing each of the three challenges noted above, we can optimize the potential success of our approach to problem-solving. The most difficult challenge involves identifying and overcoming inherent bias in either the available or selected data to be analyzed, or the method of analysis itself. The effect of these biases can be readily seen in their impact on the analysis of the structure-function relationship in proteins.
If we use the definition that all proteins can
be classified as either soluble (i.e. water soluble) or membrane bound (i.e. non-water soluble), a representation of our current knowledge can be compared to the complete universe of proteins. The bias present in our existing knowledge is then the difference between those proteins which are known and the remainder of the universe of all proteins (Figure 6). Because we cannot quantitatively or qualitatively evaluate this bias, it has the potential to impact our ability to produce fully generalizable results from the data set being analyzed, and in a manner which remains unknown. As shown in Figure 6, the size and therefore the potential impact of the unknown bias can be significant in defining the limit of our existing knowledge. As an example of how this bias might exist, if the soluble proteins which we studied included only hemoglobins, myoglobins, cytochrome b562, hemerythrins and erythrocruorins, and the non-soluble proteins included rhodopsin and other seven-bundle membrane-spanning proteins, we might assume that all proteins contained alpha helices as their main secondary structural feature.
Because we know that other structural classes of proteins do exist, we can readily see the bias in this data set, but in examining the set of proteins whose structures we do know, we are not able to identify any bias which might similarly exist. A similar bias can exist in any set of proteins which we select, prejudiced not only by structure but also by function (e.g. enzymes may not adequately represent structural proteins) and by physicochemical properties (e.g. molecular weight, subunit organization). Biases may also have an historical origin. Serine proteases were first isolated from digestive organs because of convenience of access to material. Their proteolytic activity classified them as digestive enzymes with limited specificity towards macromolecular substrates, rather than recognizing their highly refined role in limited proteolysis as observed in blood coagulation, complement activation, etc. Even today, newly discovered serine proteases are described as being chymotrypsin-like (i.e. hydrophobic residue substrate) or trypsin-like (i.e. charged residue substrate), independent of the physiological process in which they participate and of the greater range of substrate reactivity which has been observed.
Amino acid sequences of serine proteases are numbered according to an alignment with the sequence in chymotrypsin, not because of its established role in the evolution of this family but because of the chronology of its observation.
Bias can also exist in the methodologies used or developed for problem-solving and limit our ability to adequately describe the systems under analysis. Interest in examining the secondary structure of a protein can be significantly limited by the common bias which focuses on the existence of regular structures including alpha helices and beta sheets. Both the ease of recognizing structural
regularity and the ability to focus on locating a limited set of structural features lead an observer, using a computer graphics display, to "see" the helices and sheets at the expense of other structural features which may appear less regular. In addition, the molecule will appear different as it is viewed from the different orientations which result from rotation and translation of the atomic coordinates, thus hiding or exposing different structural regions during its graphical manipulation. By contrast, the information content of the molecular conformation remains constant throughout its graphical manipulation, meaning that the secondary structural features are not changing, only the ability to perceive them through this form of representation. It is also difficult to visually process structural regions of non-regular conformation for comparison with potentially similar regions in the same or other proteins; this further limits the ability to "see" other similar structural features in different proteins. These biases are compounded further when attempting to evaluate the similarity between two complete protein molecules, either visually using computer graphics as noted above, or using computer algorithms such as root-mean-square superposition.
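
The two points above, that rotation and translation change the view but not the conformational information, and that whole-molecule similarity is often scored by root-mean-square superposition, can be illustrated with a short sketch. It is not taken from the chapter: it assumes NumPy, uses randomly generated coordinates in place of a real structure, and uses the widely known Kabsch (SVD-based) procedure as one possible way to perform the superposition. The function names are hypothetical.

```python
import numpy as np


def pairwise_distances(coords):
    """Return the N x N matrix of inter-atomic distances for N x 3 coordinates."""
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def kabsch_rmsd(p, q):
    """RMSD between two N x 3 coordinate sets after optimal rigid superposition."""
    p_c = p - p.mean(axis=0)              # remove translation
    q_c = q - q.mean(axis=0)
    h = p_c.T @ q_c                       # 3 x 3 covariance matrix
    u, _, vt = np.linalg.svd(h)
    d = np.sign(np.linalg.det(vt.T @ u.T))
    correction = np.diag([1.0, 1.0, d])   # avoid an improper rotation (reflection)
    rot = vt.T @ correction @ u.T         # rotation that maps p onto q
    p_rot = p_c @ rot.T
    return float(np.sqrt(((p_rot - q_c) ** 2).sum() / len(p)))


# Assumption: random points stand in for atomic coordinates.
rng = np.random.default_rng(0)
coords = rng.normal(size=(20, 3))

# "Graphical manipulation": rotate about z and translate.
theta = np.pi / 3
rot_z = np.array([
    [np.cos(theta), -np.sin(theta), 0.0],
    [np.sin(theta), np.cos(theta), 0.0],
    [0.0, 0.0, 1.0],
])
moved = coords @ rot_z.T + np.array([5.0, -2.0, 1.0])

# A genuinely different conformation: every atom displaced at random.
perturbed = coords + rng.normal(scale=0.5, size=coords.shape)

print("distance matrix unchanged by the rigid move:",
      np.allclose(pairwise_distances(coords), pairwise_distances(moved)))
print("RMSD after superposition, rigid move:", kabsch_rmsd(coords, moved))
print("RMSD after superposition, perturbed:", kabsch_rmsd(coords, perturbed))
```

The inter-atomic distance matrix is one orientation-independent description of the conformation. Note also that a single RMSD value averages over all atoms, so it emphasizes overall similarity and can hide a localized but significant difference, which is the methodological bias discussed above.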
Compensation for bias in data selection or in methodology is most difficult when the bias cannot be adequately measured. This limitation bears directly on the ability to generalize the results of the analysis, and we cannot expect to completely eliminate these biases in either the data or methodologies. Thus our inability to develop general rules for protein structure prediction from amino acid sequence may be indicative of such a bias because critical information is both missing and not assessable from the existing data. Biases can be used constructively when they are used to focus on the particular aspects of a specific problem and generalization is not the overall goal. We can intentionally include a particular bias and evaluate the significance of the results and the limitation in their generalizability by careful analysis and interpretation of each stage of the analysis. We can construct, for example, a set of structurally and/or functionally related proteins using characteristics which might relate homologous families, proteins from the same tissue within an organism, operating at the same pH maximum, etc.
We should expect that analysis of such sets would be able to best predict characteristics of the next member of this class, and failure to be accurate in predictions about the next member of a set would suggest an inability to generalize. Thus, if analysis of the serine proteases cannot accurately predict the structure and function of the next member of this class, the potential for accurate prediction of non-serine proteases is significantly limited. The characteristics which are actually learned from such protein set analyses are clearly
class-associated, although some subset of these characteristics may be generally applicable. The difficulty and the value come in developing the ability to recognize and separate these characteristics. The analysis is carried out in a parallel manner for multiple protein classes as depicted in Figure 7. This is conceptually analogous to the computational approach termed "divide and conquer".
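
The "next member of the class" test described above can be sketched as a leave-one-out check run separately for each class. The sketch below is only an illustration under stated assumptions: the classes, the single numeric feature and its values are invented toy numbers (not measurements), the predictor is simply the mean of the other members, and all function names are hypothetical.

```python
from statistics import mean


def leave_one_out_error(values):
    """Mean absolute error when each member is predicted from the mean of the others."""
    errors = []
    for i, held_out in enumerate(values):
        rest = values[:i] + values[i + 1:]
        errors.append(abs(held_out - mean(rest)))
    return mean(errors)


def per_class_errors(classes):
    """Run the leave-one-out check independently (in parallel) for each class."""
    return {name: leave_one_out_error(vals) for name, vals in classes.items()}


def cross_class_error(classes, source, target):
    """Error when the mean of the source class is used to predict the target class."""
    prediction = mean(classes[source])
    return mean(abs(v - prediction) for v in classes[target])


# Assumption: invented toy values of one feature for two hypothetical classes.
toy_classes = {
    "class_A": [0.70, 0.75, 0.72, 0.78],
    "class_B": [0.10, 0.12, 0.08, 0.11],
}

within = per_class_errors(toy_classes)
print("within-class leave-one-out errors:", within)
print("class_A model applied to class_B:",
      cross_class_error(toy_classes, "class_A", "class_B"))
```

In this toy case the within-class error is small and the cross-class error is large, so the learned characteristic is class-associated. A feature for which the cross-class error stayed small would be a candidate for general applicability.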

Evolving New Methods for Problem Solving

As we have noted, conventional molecular modeling has progressed from the computational tools maintained and applied by specialists on high-performance computers and graphics workstations, to the delivery of hands-on tool sets on personal computers and workstations. These technological advances have helped to integrate modeling into the general problem-solving process by lowering the barriers to entry and significantly enlarging the community of its practitioners. A primary benefit of this evolution has been the enhanced communication between those who need to solve a particular problem and those who conceive of and develop the modeling tools. A secondary benefit, which is only beginning to be realized, is that this broader community of those using modeling has led to the introduction of new approaches based on the transfer of technology from other disciplines. Pattern recognition, neural networks, parallel processing, array processing, database architecture and searching, distributed computing, Petri Net analysis, artificial intelligence techniques and fuzzy logic are some of the newly incorporated methodologies, many of which rely on the improvement in computing access on the desktop.
The opportunity for the future resides in evolving the techniques for problem-solving beyond conventional boundaries to incorporate the wide range of available technologies and disciplines. Examination of the functional requirements for defining and solving a specific problem can lead to the development of methodologies which can address a wide range of seemingly disparate problems. Our interest in analyzing the blood coagulation pathway has led to development of tools for: pathway representation, comparison and simulation; determination of potential control points for direct pathway intervention, e.g. enzyme inhibitor or drug; analogy to evolutionarily related pathways, e.g. complement or fibrinolysis; genetic origin of pathway evolution; risk assessment of secondary effects through linked pathways; assessment of genetic differences, e.g. mutated but non-lethal enzymes, in terms of functional differences in pathway function and response; and identification of genetic control elements to potentially express pathways in alternative organisms. These tools are limited in their application neither to the coagulation pathway nor to biological processes. Thus the potential
for technology transfer out of, as well as into, the application should not be overlooked.

[Figure panels: Case 1, Case 2]

Figure 6. The bias which represents the influence of unknown or unattainable information remains unassessable in terms of that information which is known and presents a formidable barrier to modeling the whole world.

Figure 7. Utilizing a known bias to create parallel data models which can reveal information which contains the predetermined bias, and information which may be of general applicability.
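
As one illustration of the kind of pathway representation and simulation mentioned above, and of Petri Net analysis as a transferred technology, a minimal place/transition net can be sketched as follows. This is not the author's tool set. The two-step activation cascade, its place names and its token counts are hypothetical and do not model the actual stoichiometry of coagulation.

```python
from dataclasses import dataclass


@dataclass
class Transition:
    name: str
    inputs: dict    # place -> tokens consumed when the transition fires
    outputs: dict   # place -> tokens produced when the transition fires


def enabled(marking, t):
    """A transition is enabled when every input place holds enough tokens."""
    return all(marking.get(p, 0) >= n for p, n in t.inputs.items())


def fire(marking, t):
    """Return the new marking after firing t; the old marking is left untouched."""
    if not enabled(marking, t):
        raise ValueError(f"transition {t.name} is not enabled")
    new = dict(marking)
    for p, n in t.inputs.items():
        new[p] = new.get(p, 0) - n
    for p, n in t.outputs.items():
        new[p] = new.get(p, 0) + n
    return new


def run(marking, transitions, max_steps=100):
    """Repeatedly fire the first enabled transition until none is enabled."""
    trace = []
    for _ in range(max_steps):
        ready = [t for t in transitions if enabled(marking, t)]
        if not ready:
            break
        marking = fire(marking, ready[0])
        trace.append(ready[0].name)
    return marking, trace


# Hypothetical cascade: a trigger activates zymogen A; enzyme A activates zymogen B.
# Catalysts are returned to their place, so they are not used up.
activate_a = Transition("activate_A",
                        {"trigger": 1, "zymogen_A": 1},
                        {"trigger": 1, "enzyme_A": 1})
activate_b = Transition("activate_B",
                        {"enzyme_A": 1, "zymogen_B": 1},
                        {"enzyme_A": 1, "enzyme_B": 1})

start = {"trigger": 1, "zymogen_A": 2, "zymogen_B": 1}
final, trace = run(start, [activate_a, activate_b])
print(trace)
print(final)

# Removing a transition mimics blocking that step, e.g. with an inhibitor.
blocked, blocked_trace = run(start, [activate_b])
print(blocked.get("enzyme_B", 0))
```

Comparing the final markings with and without a given transition is a simple way to ask whether that step is a potential control point for the downstream product.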

Conclusion

The overview presented here addresses the nature of problem-solving with a focus on issues arising in the prediction and analysis of protein structure and function. The goal has been to show the potential value in applying the methodologies developed and deployed within the pharmaceutical area to functionally analogous problems in agricultural and food chemistry. This discussion has been
directed at identifying underlying problems and questions which are present but not always addressed. This approach was used to show the potential which can still be realized by the rational development and utilization of computational modeling in conjunction with experimental verification to tackle the complex problems of today and tomorrow. The papers which follow in this volume serve as examples of moving conventional approaches to their successful application.

RECEIVED August 2, 1994

In Molecular Modeling; Kumosinski, T., et al.; ACS Symposium Series; American Chemical Society: Washington, DC, 1994.