Computer Modeling of Carbohydrate Molecules - American Chemical

Chapter 12

Conformational Analysis of a Disaccharide (Cellobiose) with the Molecular Mechanics Program (MM2)

Alfred D. French¹, V. H. Tran², and Serge Pérez²

¹ Southern Regional Research Center, U.S. Department of Agriculture, P.O. Box 19687, New Orleans, LA 70179
² Institut National de la Recherche Agronomique, B.P. 527, 44026 Nantes, France

Downloaded by UNIV OF SYDNEY on December 24, 2017 | http://pubs.acs.org Publication Date: July 6, 1990 | doi: 10.1021/bk-1990-0430.ch012

A strategy for automated, flexible-residue conformational analysis of disaccharides is presented with examples from a study of cellobiose. The strategy includes modifications of the MM2 program to give a rigid dihedral driver option that starts with the same intra-residue geometry at each increment of the driven torsion angles. This avoids the propagation of residue distortions from one conformation to the next. In analyzing cellobiose, the use of four starting models with different combinations of side group orientations provided at least one satisfactory optimization for each linkage conformation. Each starting model contributed to a table of lowest energy values, but the low-energy region of the resulting map was similar to earlier work based on a single starting model.

Many monosaccharides have a single, well-established, preferred ring conformation, such as ⁴C₁. Therefore, the objective of a typical conformational analysis (CA) of disaccharides is the understanding of the varying energetic relationship between the two residues as they are rotated about their bonds to the oxygen atom of the glycosidic linkage. These rotations are described by the torsion angles φ and Ψ, shown in Figure 1.

One might (naively) employ CA to answer the question, "What is the most likely shape of a molecule?" However, crystallographic and other experimental evidence shows that the conformations of individual residues (1,2), disaccharides (3) and polysaccharides (4) vary, often substantially. Perhaps then, it is more appropriate to think of CA as a tool for predicting the range or ranges of attainable conformations. Among these attainable conformations, observed values of φ and Ψ will vary, depending on crystal packing in the solid state or the type of solvent in solutions.

Although the main variables of disaccharide CA are φ and Ψ, an objective treatment requires finding the least energetic combination of all the other conformational variables at each φ,Ψ point. In a practical sense, this requires computer models of sugar residues that are flexible. All bond lengths, bond angles and torsion angles other than φ and Ψ must be adjusted at each increment of φ and Ψ in order to obtain the lowest possible potential energy.

This chapter not subject to U.S. copyright Published 1990 American Chemical Society

French and Brady; Computer Modeling of Carbohydrate Molecules ACS Symposium Series; American Chemical Society: Washington, DC, 1990.


Figure 1. A (1→4) disaccharide showing φ and Ψ, based on the torsion angles H1-C1-O4'-C4' and C1-O4'-C4'-H4', respectively.
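A torsion angle defined by four atoms, such as H1-C1-O4'-C4', can be computed from Cartesian coordinates. The following is a minimal sketch in plain Python, not code from MM2; the function name `torsion_deg` and the test coordinates are hypothetical. It assumes the usual sign convention in which the eclipsed (cis) arrangement is 0° and a clockwise rotation of the front bond onto the back bond, viewed along the central bond, is positive.

```python
import math

def _sub(a, b):
    # Component-wise difference a - b of two 3-vectors.
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _cross(a, b):
    # Cross product of two 3-vectors.
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _dot(a, b):
    # Dot product of two 3-vectors.
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def torsion_deg(p1, p2, p3, p4):
    """Torsion angle p1-p2-p3-p4 in degrees, in the range (-180, 180]."""
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)   # central bond
    b3 = _sub(p4, p3)
    n1 = _cross(b1, b2)  # normal to the plane of atoms 1, 2, 3
    n2 = _cross(b2, b3)  # normal to the plane of atoms 2, 3, 4
    x = _dot(n1, n2)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    return math.degrees(math.atan2(y, x))

# Hypothetical coordinates: a quarter turn about the z axis.
quarter_turn = torsion_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
eclipsed = torsion_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1))
```

Applied twice, once per set of four linkage atoms, such a function gives the coordinates of a structure on the φ,Ψ map.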


While pioneering work with flexible carbohydrate residues was done a decade ago (5), CA with flexible residues over all of φ,Ψ space is a recent development (6-9). The molecular mechanics program used in the present work was the 1985 version of MMP2 (10,11). Despite some successes with MM2, and its predecessor, MM1, on carbohydrates (12-14), its application to CA of disaccharides is not straightforward.

The major difficulties with CA of disaccharides using MM2, or any other program, arise from the multiple minimum problem. A strategy for surmounting this classic obstacle is presented in the following paper by Tran and Brady. That laborious strategy depends on the availability of a flexible definition of the pattern for conformational searching in the CHARMM (15) program. MM2 has no such facility for a semi-automated, pseudo-radial conformational search, resulting in an additional challenge.

This paper gives an alternative to the strategy described in the Tran-Brady paper for performing CA of disaccharides. The method herein is not as elegant, but is perhaps better suited to automation. Another advantage is that it is easier to describe the construction of a given map of conformational energy over φ,Ψ space so that other workers could reproduce it. In order to automate this simpler approach, it was still necessary to modify the MM2 program, and the modifications are described. Preliminary modeling work on cellobiose (7,16) is confirmed by examples that use the more complete treatment permitted by the modified program.

Both the Tran-Brady paper and this one describe initial attempts to develop methods and the underlying philosophy for CA through models of complicated structures that can deform inelastically. (Here, an inelastic deformation means that an alternate conformation for one or more structural features was adopted during energy minimization. Examples include the rotation of a hydroxyl group through an energy barrier to an alternate staggered position or the changing of a pyranoid ring from the ⁴C₁ shape.) Although we want the molecular model to deform during CA, we must cope with the inelastic deformations that occur when analyzing combinations of φ and Ψ that have high energies. This is a problem when using the standard facilities for CA within MM2 because the starting geometry for each optimization is the previously optimized structure. Any inelastic deformation is thus likely to be transmitted to the next structure and the conformation and energy will not, in general, be the same before and after 360° of rotation. This difficulty is in addition to the more classic aspect of the multiple minima problem where an overwhelming number of possible structures must be tested to ascertain the least energetic structure. The strategy presented thus must overcome both types of problem.

The problem of inelastic deformations is in addition to other problems associated with trying to assess the potential energies at various rotations about bonds. Burkert and Allinger (17) have discussed several aspects of these calculations, including the problem that the rotations are usually defined by only one of several torsion angles associated with a given bond. Typically, there is an artifactual "lag" in the torsion angles that are not used by the modeling program to define the rotation about the bond.

Flexible-Residue Justification

In light of the difficulties just discussed, one might wonder whether the incorporation of residue flexibility is worthwhile. "Rigid-residue" methods such as HSEA (18) require far less computer time than flexible-residue methods. We cite two practical advantages of allowing internal adjustments besides the basic appeal of incorporating a known aspect of the molecule in the model:

1. Since the residue can flex, detailed aspects of the starting geometry of the residue are not critical. With rigid-residue analysis, starting geometries taken from various crystal structures give minima in different positions (19). Rigid-residue analyses starting from disaccharide crystal structures will almost inevitably favor the starting conformation if the potential functions are reasonable.

2. If the various φ,Ψ combinations found in single-crystal diffraction studies are plotted on CA maps, the energies corresponding to these combinations are often lower on maps prepared with flexible residues than on maps made with rigid residues (3,20,21). The energies calculated with flexible-residue methods for experimentally determined conformations are in accord with energies that could be expected from hydrogen bonding and van der Waals forces.

While it is difficult to verify experimentally the calculated heights of conformational barriers, it seems that flexible-residue methods can give better results. Energies based on rigid residues increase to artificially high values at large distances from the starting φ,Ψ conformation (22).

The MM2 Program

The computer program used herein, MM2, is one of many (23) that adjust ("optimize") the atomic coordinates of a molecule to produce a structure at a local minimum on a multidimensional hypersurface of potential energy. Such programs require predefined equations and constants for the calculation of the energy of every type of interaction, i.e., bond stretching, bond angle bending, torsions and non-bonded van der Waals forces. In its academic versions, MM2 (and MMP2 versions that include delocalized pi electrons) does not provide graphic display and is best considered a tool for structure optimization (energy minimization) and for CA. Neither version includes facilities for molecular dynamics or Monte Carlo techniques.

Attractive Attributes. MM2 is attractive for several reasons:

1. It is a general-purpose program that is carefully parameterized for a wide variety of molecular types.

2. Two recent versions of MM2, MMP2(85) and MM2(87), automatically compensate for the anomeric effects that are important for sugars. Accommodations for carbohydrates are discussed further in the chapter in this book by French, Rowland and Allinger.

3. MM2 is available (except to Communist countries) for a copying fee through the Quantum Chemistry Program Exchange (QCPE), Department of Chemistry, Indiana University, Bloomington, IN 47901. There are several versions for several kinds of computers. Only academic workers can obtain the newest version, MM2(87) (or MMP2(85), on which this work is based) through the QCPE. Other users may get those versions from Molecular Design, Ltd, 2132 Farallon Drive, San Leandro, California 94577. The commercial versions use the same methods for energy and structure calculations, but are enhanced for easier preparation of input files, etc.


The manual provided by QCPE for MM2 is useful, as are journal articles (10,11); two books are recommended to prospective users of MM2 (24,25). Also, the QCPE sponsors training courses.

Limitations. Experience shows that there are some limitations when working with MM2.

1. The task of creating input files is tedious for molecules as large as disaccharides and additional support is advisable for users of the academic versions. Several programs from the QCPE provide this capability, as do a number of commercial programs. The best of such programs create a standard MM2 input file after the user draws the structure on a terminal screen.

2. Like other programs for determining least energetic conformations, MM2 only finds local minima. Alternate structures separated by energy barriers must be explicitly tested and their energies compared. It is especially difficult to cover all possible alternate structures for carbohydrates. This is due both to the nature of carbohydrates and to a limitation in MM2. Only two torsion angles can be varied systematically in the standard program.

3. MM2 is slow compared to programs with simpler potential functions, although it is rapid compared to quantum mechanical methods. MM2 requires lone pairs of electrons on all ether and hydroxyl oxygen atoms and nitrogen atoms. These lone pairs are treated as if they are atoms and thus the number of "atoms" is increased by as much as 50% for carbohydrates. This can double the required computer time compared to calculations not using lone pairs.

4. The complexity of the potential functions inhibits the extent of parameterization, although many structures can be modeled.

5. The modifications described below are necessary for automated CA of molecules that can deform inelastically. New releases of MM3, the successor to MM2, should incorporate some of these changes. (See the chapter by French, Rowland and Allinger for information on MM3.)
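The lone-pair overhead mentioned in the list of limitations can be checked with simple arithmetic for cellobiose (C12H22O11). The sketch below rests on two labeled assumptions that do not come from the chapter: each oxygen carries two lone-pair pseudo-atoms, and computer time grows roughly with the number of pairwise interactions. The function name `count_pairs` is hypothetical.

```python
def count_pairs(n):
    # Number of distinct pairs among n interaction sites.
    return n * (n - 1) // 2

# Cellobiose, C12H22O11.
carbons, hydrogens, oxygens = 12, 22, 11
lone_pairs_per_oxygen = 2  # assumption: two pseudo-atoms per oxygen

real_atoms = carbons + hydrogens + oxygens
with_lone_pairs = real_atoms + lone_pairs_per_oxygen * oxygens

# Growth in the number of "atoms" and in the number of pairs.
size_ratio = with_lone_pairs / real_atoms
pair_ratio = count_pairs(with_lone_pairs) / count_pairs(real_atoms)

print(real_atoms, with_lone_pairs, round(size_ratio, 2), round(pair_ratio, 2))
```

Under these assumptions the site count rises by roughly half and the pair count slightly more than doubles, which is consistent with the figures quoted in the list.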

Problems with Modeling Carbohydrates

Two aspects of carbohydrate structure are especially problematic for modeling because of the multiple minimum problem:

Ring Geometry. The number of possible ring conformers (⁴C₁, ¹C₄, ¹S₅, etc.) is potentially large. That number is squared to give the number of starting models that might require consideration for a disaccharide, since the two rings in disaccharides could possibly have two different forms. In some cases (26), one must test several ring forms, increasing the complexity of the study. During the optimization of very flexible rings such as fructofuranoses (French, A. D.; Tran, V. H. Biopolymers, in press), several different conformations can be visited en route to the least energetic structure.

Rotating Side Groups. The positions of rotating side groups on sugars affect the calculated energy values. Primary alcohol groups usually exist in staggered positions (gg, gt, and tg) (27) that correspond to local minima. Primary alcohol groups of pyranoses occur mostly in one of two positions, avoiding interactions such as between O4 and O6 in glucose if O6 has a tg position. In both solids and solutions, gt and gg positions are preferred for glucose, while the tg and gt positions are preferred for galactose (28).

Even hydrogen atoms in secondary hydroxyl groups are problematic, with different arrangements giving a range of energy values. However, one need not usually consider all three staggered positions for each hydroxyl group. The lowest energies for models of pyranose rings occur when the secondary hydroxyl groups all have similar relative orientations. This enables the formation of cooperative rings of intramolecular hydrogen bonds. These similar orientations are described as clockwise (C) or anticlockwise (R) (8). A paper by Tvaroška, Kozar and Hricovini in this book describes an alternate procedure for coping with variable side group positions.

In the present case (cellobiose), four different models were tested. They were gtgtRR, gtgtCC, ggggRR and ggggRC, shown in Figure 2. More combinations were not used as starting models because the number of changes in the energy map seemed to diminish with each successive trial. Unless all possibilities are tried, of course, there is no way to know with certainty that the lowest energy has been attained at each φ,Ψ point. While more structures can be tested, it is not reasonable to test all possibilities. About one week is required to test each starting model on a MicroVax II and there are about [...] possibilities. Instead, we seek a result that will have an error no greater than 1 kcal/mol at each φ,Ψ point, at least in the interesting, low-energy zones. This error is in addition to the overall deficiencies in the force field, such as the underestimation of hydrogen bonding energy in MMP2(85) (29) and neglect of any environmental interactions.
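The scale of an exhaustive search over side-group positions can be estimated with a short calculation. This is a rough sketch, not the authors' count. It assumes three staggered positions per rotatable group and, purely for illustration, ten such groups in cellobiose (eight hydroxyl groups and two primary alcohol C5-C6 bonds), ignoring alternative ring forms. The function names are hypothetical.

```python
def exhaustive_models(n_rotors, positions_per_rotor=3):
    # Every rotor can independently take any of its staggered positions.
    return positions_per_rotor ** n_rotors

def weeks_to_years(weeks):
    # Convert machine-weeks to years, taking 52 weeks per year.
    return weeks / 52.0

# Assumption for illustration only: ten rotatable side-group bonds.
n_models = exhaustive_models(10)

# The text quotes about one week of computing per starting model.
years = weeks_to_years(n_models * 1)
print(n_models, round(years))
```

Even with this simplified count, a full enumeration at one week per model would take far longer than any practical study, which is why only a few representative starting models are used.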
Most molecular mechanics studies do not indicate that one position of the primary alcohol group has an energy prohibitively higher than the others. This is not consistent with the experimental data, so we conclude that the model is not complete. Also, inter-residue hydrogen bonds are often observed under experimental conditions but intra-molecular hydrogen bonds are favored in our models because the molecule is isolated. Therefore, the purpose in using a variety of different starting models is not to determine the preferred side group orientations. Instead, alternate starting arrangements were used to assure attainment of low energies for φ,Ψ values that otherwise might have higher energy values caused by positions of side groups that cause interference.

Problems with Flexible-Residue Analysis

Because the internal geometry of each residue responds to forces arising from the proximity of the other half of the disaccharide, an apparent conflict arises between two desirable goals of CA. On one hand, we hope that model residues deform during changes in φ and Ψ in a manner similar to real molecules that undergo similar motions. One might expect that the structure and energy values of real molecules would be different before and immediately after 360° rotations about φ and Ψ. On the other hand, a φ,Ψ map must have the same energies at +180° and -180° in order to show the minimal energy at each φ,Ψ conformation. This conflict is a typical example of the difference between kinetically determined results and thermodynamically determined ones. Energy minimization algorithms, however, cannot generally overcome false minima, so inelastically deformed models are not brought to the thermodynamically best structure during CA.
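The requirement that a map have the same energies at +180° and -180° can be checked once a map has been computed. The following is a minimal sketch, assuming the map is stored as a Python dictionary keyed by (φ, Ψ) in degrees. The function name `edge_mismatches`, the tolerance, and the energy values are all hypothetical.

```python
def edge_mismatches(energy_map, tol=0.1):
    """List the points where the -180 and +180 edges of a map disagree.

    energy_map maps (phi, psi) in degrees to an energy in kcal/mol.
    Each entry of the result is (driven_angle, other_angle_value, e_minus, e_plus).
    """
    problems = []
    for (phi, psi), e in energy_map.items():
        if phi == -180:
            other = energy_map.get((180, psi))
            if other is not None and abs(e - other) > tol:
                problems.append(("phi", psi, e, other))
        if psi == -180:
            other = energy_map.get((phi, 180))
            if other is not None and abs(e - other) > tol:
                problems.append(("psi", phi, e, other))
    return problems

# Made-up energies: one inconsistent pair of edge points, one consistent pair.
demo_map = {
    (-180, 140): 40.0, (180, 140): 45.0,
    (0, -180): 38.0, (0, 180): 38.0,
}
found = edge_mismatches(demo_map)
```

A non-empty result indicates that at least one structure along the scan was deformed and not restored.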
A modeling study can avoid inelastic deformations by only searching conformation space close to the minima, as in the pseudo-radial search method described in the preceding paper by Tran and Brady. That type of search mimics the thermal motion of a molecule, which mostly stays within the low-energy areas. Only after the low-energy regions are established does one attempt to determine the energies of linkage conformations that might deform the model inelastically. The approach used in the present paper simply tries several different starting models at each point.
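Trying several starting models at each point implies a final step in which the lowest energy found at each φ,Ψ point is kept. The sketch below shows one way this bookkeeping might be done; it is not the authors' code. The function name `merge_lowest`, the use of `None` for a failed optimization, and the energy values are illustrative assumptions.

```python
def merge_lowest(maps):
    """Combine per-model energy maps into one map of lowest energies.

    maps: {model_name: {(phi, psi): energy or None}}
    Returns {(phi, psi): (lowest_energy, model_name)}.
    """
    best = {}
    for name, grid in maps.items():
        for point, energy in grid.items():
            if energy is None:
                # Optimization failed for this starting model; skip it.
                continue
            if point not in best or energy < best[point][0]:
                best[point] = (energy, name)
    return best

# Illustrative values only (kcal/mol).
demo_maps = {
    "gtgtRR": {(20, -60): 31.4, (-100, -80): None},
    "ggggRC": {(20, -60): 33.0, (-100, -80): 52.0},
}
lowest = merge_lowest(demo_maps)
```

Recording the model name alongside each energy makes it possible to see which starting model supplied each point of the final map.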

Dihedral Drivers

During CA of a disaccharide, the two residues are rotated about their bonds to the linking oxygen. MM2 has a "dihedral driver" facility that accepts the initial, final and increment size values of two torsion angles. At each increment of these torsion angles, the energy is minimized, providing a value for a point on the energy map. The two torsion angles of the molecular model are held at the specified values by assigning a large potential energy to changes of the two torsion angles. This approach allows optimization of all other structural characteristics for all atoms including those that define the torsion angles. After optimization is complete, the energy is recalculated with the usual torsional potential.

Two types of dihedral drivers are available in standard MM2. One option provides for changes of torsion angles within rings. It functions slowly according to the program manual and will not be discussed further. The other available option is for use with side groups, and therefore would be better suited for changing the values of φ and Ψ. With this option (the -1 option in the MM2 manual), the residues of the starting model are rotated rigidly (without internal change) to the first φ,Ψ combination to be considered. After the first optimization finishes, the first torsion angle specified (e.g., φ) is changed by its increment, rigidly rotating one of the newly optimized residues. This new structure is optimized, and the process continues until φ has undergone all the specified increments. Then, the second torsion angle, Ψ, is changed by its specified increment and all values of φ are again tested. This scheme is shown in Figure 3 (Option -1), with each arrowhead representing a point where the structure would be optimized.

The Problem with the Standard Driver. For ease of use, it is desirable to step both φ and Ψ through 360° in an automated procedure. However, this will cause the model to pass through some conformations that result in inelastic deformations. Since the standard driver begins each optimization with the internal residue geometries of the preceding conformation, reorientations of side groups and other deformations are often carried forward. Although it is possible that optimizations at subsequent conformations would "repair" the residue geometry, it does not happen often.

The effects of propagated distortions of the residue are shown in Figure 4, a CA map without contouring that was prepared with the standard driver. The gtgtRR starting model of cellobiose had an energy of 31.4 kcal/mol (its conformation was φ = 20, Ψ = -60). After rigidly rotating to φ = -180, Ψ = -180 and optimizing at increments of 20° over 360°, the smallest energy found was 32.8 kcal/mol. The secondary hydroxyl group orientations were changed at an early φ,Ψ conformation and not restored. Another manifestation of the deformation is that the energy values at φ = -180, Ψ = 140 and at φ = +180, Ψ = 140 differ by 5 kcal/mol. As the conformational search proceeded between these two points, a side group changed to a different (but not the initial) position.

Figure 4 might represent well the energies that would be found immediately after a real molecule was forced to change conformations

along the path given by the standard driver. However, such a path of conformational change is improbable. Real molecules would avoid high-energy conformations and deformed species would eventually revert to lower-energy conformations regardless of how inelastically deformed a model might be. The faults in this map (failure to attain the energy value of the starting conformation and the differences in energies at each side) result from the continuous application of the standard dihedral driver in MM2.

Figure 2. The four starting models used for the study of cellobiose (lone pairs of electrons are not shown). Convention defines the R and C notation when the residue is in a conventional orientation and is viewed from above. The least energetic structure observed in this study is gtgtRR. This Figure and Figure 5 were drawn with CHEMX, developed and distributed by Chemical Design Ltd, Oxford, England.

Figure 3. A comparison of two methods of producing starting conformations (panels: Standard Option -1; New Option -2). With standard option -1, the conformations are generated from the preceding structure. With our -2 option, all conformations within a run are generated from the same, single starting point.

A New Driver. In our strategy, we analyze each φ,Ψ conformation independently. Each optimization starts with the same residue geometries, which are rotated rigidly from the initial conformation directly to the φ,Ψ point in question. MM2 was modified so this task can be automated through a new dihedral driver option that we have designated as -2. The relations of starting models to the optimized points are also shown in Figure 3 for the new driver option. This approach maintains control over the starting geometry, and directly overcomes the two faults described for results from the standard driver.

A New Problem. While our new driver solves some important problems, it creates a new one, i.e., structures at several φ,Ψ points fail to optimize properly.
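To make the contrast between the two driver options concrete, here is a toy sketch in Python. It is not MM2 and uses no real force field: a single driven angle stands in for φ and Ψ, a two-state "side group" stands in for the internal residue geometry, and the energies, clash window and barrier are invented. The "minimizer" leaves its starting state only when strain exceeds the barrier, mimicking a local minimizer that cannot cross barriers on its own.

```python
BARRIER = 5.0  # hypothetical barrier between the two side-group states

def energy(phi, state):
    # State 0 is preferred but clashes when |phi| <= 40; state 1 costs 1.5.
    base = 0.0 if state == 0 else 1.5
    clash = 10.0 if (state == 0 and abs(phi) <= 40) else 0.0
    return base + clash

def local_minimize(phi, state):
    # Switch state only if the current one is strained above the barrier
    # and the other state is lower in energy (an inelastic deformation).
    if energy(phi, state) > BARRIER and energy(phi, 1 - state) < energy(phi, state):
        state = 1 - state
    return state, energy(phi, state)

def scan_sequential(phis, start_state=0):
    # Like option -1: each optimization starts from the previous result.
    state, result = start_state, {}
    for phi in phis:
        state, result[phi] = local_minimize(phi, state)
    return result

def scan_independent(phis, start_state=0):
    # Like option -2: every optimization starts from the same geometry.
    return {phi: local_minimize(phi, start_state)[1] for phi in phis}

phis = list(range(-180, 181, 20))
seq = scan_sequential(phis)
ind = scan_independent(phis)
print(seq[-180], seq[180], ind[-180], ind[180])
```

In the sequential scan the flipped state is carried forward, so the two ends of the scan disagree; starting every point from the same geometry removes that history dependence.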
An example is shown in Figure 5 for cellobiose with φ of -100 and Ψ of -80. Initially, this conformation, when imposed on a gtgtRR model, places the centers of the O2 and O3' atoms only 0.488 Å apart (Figure 5a). (In a rigid-residue analysis, this conflict would cause a very high energy to be calculated.) Some of the bonds to the lone pairs overlap and a contact of 0.119 Å occurs between one of the lone electron pairs and the other oxygen atom. Severe distortions occurred when MM2 moved the atoms to try to reduce the energy of the tangled model in Figure 5a. The optimization did not proceed correctly because movement to resolve the inter-residue conflicts would have initially increased the severity of the van der Waals repulsions. Instead, some of the bond lengths and other features assumed highly improbable values. The resulting structure (Figure 5b) has a reported energy of -6469 kcal/mol. (A suitable warning was issued by MM2 that non-standard bond lengths had occurred and that optimization was terminated.) This wrong value of the energy results from the cubic term in the bond-stretching component of the calculated energy. As noted in Ref. 11, "When energy minimization is done with a very poor starting geometry, [the cubic function] may lead to disaster—with the molecule flying apart."
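The effect of a cubic stretching term can be seen in a short numerical sketch. The functional form below, 71.94·ks·Δl²·(1 + cs·Δl) with cs = -2.0 Å⁻¹, is the form commonly quoted for MM2 and is used here as an assumption. The force constant `ks = 5.0` is a hypothetical value chosen only for illustration, and the quartic variant is a generic correction of the kind mentioned later for MM3, not that program's actual parameters.

```python
def stretch_energy_cubic(delta_l: float, ks: float = 5.0, cs: float = -2.0) -> float:
    """Bond-stretch energy in kcal/mol with a cubic correction.

    delta_l: bond length minus its reference value, in angstroms.
    ks:      stretching force constant in mdyn/A (hypothetical default).
    cs:      cubic coefficient in 1/A (assumed value).
    71.94 converts (ks/2) * delta_l**2 from mdyn*A to kcal/mol.
    """
    return 71.94 * ks * delta_l ** 2 * (1.0 + cs * delta_l)


def stretch_energy_quartic(delta_l: float, ks: float = 5.0, cs: float = -2.0) -> float:
    """Same term with an added quartic correction that keeps the energy positive."""
    correction = 1.0 + cs * delta_l + (7.0 / 12.0) * cs ** 2 * delta_l ** 2
    return 71.94 * ks * delta_l ** 2 * correction


# Energies for a bond stretched by 0.1, 0.5, 1.0 and 3.0 angstroms.
stretches = [0.1, 0.5, 1.0, 3.0]
cubic = [stretch_energy_cubic(d) for d in stretches]
quartic = [stretch_energy_quartic(d) for d in stretches]
```

With these assumed constants the cubic form is positive for small stretches, reaches zero at Δl = 0.5 Å, and then becomes more and more negative as the bond lengthens, so a badly tangled start can slide toward a large negative "energy". The quartic form stays positive at every stretch.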
Since bond lengths were initially in the correct range, the cubic contribution to bond stretching was not suppressed and the large negative energy was obtained. The third structure (Figure 5c) is an optimized result with the same values of φ and Ψ and an energy of 54.2 kcal/mol, taken from the work depicted in Figure 4 that used the standard driver. The starting geometry was gtgtRR, but earlier optimizations had reoriented the hydroxyl groups on the non-reducing residue and adjusted the residue geometries. This preconditioning eliminated the tangling and allowed MM2 to successfully optimize the structure. However, optimizations at following φ,Ψ values failed to return the hydroxyl groups to the gtgtRR position. This is why, when the CA came to the point with φ = 20 and Ψ = -60 (as in the starting geometry), the calculated energy was 32.8 kcal/mol instead of 31.4 kcal/mol. Although the individual residues are distorted in Figure 5c, they still would be classed as ⁴C₁ shapes. Together, Figures 4 and 5b contrast the problems of the two types of automated analysis. The gradual approach to high-energy

French and Brady; Computer Modeling of Carbohydrate Molecules ACS Symposium Series; American Chemical Society: Washington, DC, 1990.


Figure 4. Energy values (kcal/mol) for a gtgtRR starting model produced with MM2 and its standard option -1 dihedral driver. Each row below is one value of φ; the entries in a row run over Ψ.

Ψ, left to right: 180 160 140 120 100 80 60 40 20 0 -20 -40 -60 -80 -100 -120 -140 -160 -180

φ = -180: 45 49 55 57 49 45 47 44 39 37 37 41 45 44 45 48 43 42 44
φ = -160: 43 51 62 64 53 47 43 43 41 40 40 42 46 50 48 48 46 42 44
φ = -140: 46 52 62 73 61 51 45 43 43 43 44 46 49 53 55 52 50 47 47
φ = -120: 51 53 61 64 69 59 50 45 45 42 44 46 51 54 62 57 52 49 52
φ = -100: 46 50 60 60 70 64 54 48 48 42 41 43 49 54 56 62 56 49 46
φ = -80: 45 45 53 57 59 65 56 47 42 40 40 39 42 49 55 55 54 50 45
φ = -60: 45 44 49 50 57 53 57 48 42 38 37 37 38 43 48 52 49 47 45
φ = -40: 42 44 48 46 49 45 56 49 42 37 35 35 38 39 45 48 46 43 42
φ = -20: 38 42 49 45 46 42 41 42 42 38 34 34 36 40 42 45 44 40 38
φ = 0: 36 40 48 44 44 42 39 38 39 37 35 33 34 39 42 42 42 39 36
φ = 20: 35 39 46 44 43 43 39 36 35 35 35 34 33 35 40 41 40 37 35
φ = 40: 36 37 45 45 42 41 40 36 34 34 34 36 34 34 38 39 38 37 36
φ = 60: 37 39 48 48 43 39 39 37 35 34 34 37 39 38 39 40 37 36 37
φ = 80: 39 43 52 55 48 41 39 39 38 37 36 38 43 44 43 43 40 38 39
φ = 100: 41 45 54 56 55 47 42 40 40 40 39 39 43 48 50 48 46 41 41
φ = 120: 43 46 50 56 62 53 46 42 41 41 40 39 41 47 51 52 49 45 43
φ = 140: 44 45 51 54 53 58 49 43 40 40 41 39 39 43 49 50 49 46 44
φ = 160: 44 44 49 53 51 52 50 43 38 37 39 42 40 41 45 47 45 45 44
φ = 180: 45 47 50 57 51 46 47 44 40 37 38 41 45 44 45 48 43 42 45

Figure 5. a) The starting model of cellobiose (gtgtRR) after rigid rotation to Ψ = -80, φ = -100. b) The result of attempted optimization by MM2. c) The same linkage conformation, but the structure was taken from the study that produced the map in Figure 4.


conformations with the standard driver is more likely to provide successful optimizations. On the other hand, energy values and residue geometries depend on which conformations preceded the φ,Ψ point in question. With the new driver, in which the starting residue geometries are rigidly rotated to the desired φ and Ψ values, bad starting geometries are more likely and structures may not optimize properly. The extent of the problems inherent in conducting CA of disaccharides with the standard driver option is, if anything, understated in this demonstration because of the equatorial linkages in cellobiose. During such automated CA, models with axial linkages encounter more severe inter-residue contacts and, hence, residue deformations.

Working Around the New Problem. Since structures such as the one shown in Figure 5b are computational artifacts, their energies should be discarded. There are at least three ways to minimize the impact of the missing energy values. If there is no energy value for a φ,Ψ point, one can be extrapolated from neighboring values. The SURFER program (Golden Software, Golden, Colorado) produces contour plots from grids with missing data through extrapolation.
Conformations affected by this problem have energies so high that the conformations are improbable, and a reasonable error in the extrapolated value will have little effect on the important, low-energy regions of the φ,Ψ map. A second way depends on the use of several starting models with different hydroxyl and primary alcohol group orientations for calculation of the energy at each φ,Ψ point. Since our goal is to determine the lowest energy value at each point, the energies that are clearly in error can be discarded and the best remaining energy values can be used. If several starting structures are used, it will be rare if none of them produces a reasonable value. At the φ and Ψ values of the models in Figure 5, three of the four starting models failed to optimize properly. The fourth, however, gave an energy of 52.4 kcal/mol, 1.8 kcal/mol lower than the value obtained with the standard driver option. A third approach is to use a satisfactorily optimized geometry from a neighboring point as a starting geometry. If that is done, one will probably find that conformations and energies depend on the direction of approach. The best remedy is to prevent the entanglement that results in the incorrect structures.
As shown in the chapter by Brant and Christ, one way to minimize inter-residue contacts is to increase the bond angle at the oxygen atom that links the two residues together to about 125°. While the optimization routine will return the value of the glycosidic angle to about 117°, the residue geometries will simultaneously adjust to avoid tangling. Such a modification to the above strategy has been fairly successful in preliminary testing. After the structures to be used as starting geometries have been initially optimized, their linkage bond angles are increased to the larger value. These new structures are then used as the starting models with the new driver option. The MM3 program may often avoid this problem because of two changes. Explicit lone pairs will not be used, and the cubic bond stretching function of MM2 will be replaced by a quartic equation (30).

Clues from Optimization Reports

Testing for Valid Optimizations by Energy Value. With hundreds or thousands of data points to examine, detailed inspection for successfully optimized structures is tedious. Limited experience has shown that successfully optimized disaccharides can be detected by their energy values. Their "Final Steric Energies" should be between about 25 and 75 kcal/mol with MMP2(85). Other software, including MM2(87), will have rather different ranges of energies, as will other molecular structures. The least energetic MMP2(85) values for permethylated cellobiose, for example, are about 80 kcal/mol (French, A. D. Unpublished data). Values outside this range indicate that the structure has not been properly optimized, as discussed above. Problems may exist even when MM2's energy is within the above range. Large, unreasonable values for individual terms may fortuitously balance each other in a way that their total appears to be reasonable.

Evidence for Transitions. Standard MM2 gives a record of the energy values and the average atomic movement as the structure adjusts to provide lower energy values. Initially, the movement is often large (several hundredths of an Angstrom). It becomes progressively smaller as the energy approaches the final value. The rate of change in these values indicates the extent and type of difference between the initial and final structures. Two clues can be gained from examining the average atomic movement values.
While the large atomic movements often cease almost immediately, they may stay at a nearly constant, moderate value for an extended time before dropping off further. Initial large movements correspond to changes in initial atomic positions for most atoms. Movement values that remain nearly constant indicate a change (probably a rotation) of one group relative to another while the relative positions of the atoms within the groups are nearly unchanged. In the special case where an energy barrier to rotation is overcome, the average atomic movement may increase temporarily and then resume its downward trend.

Modifications to MM2

We changed the MM2 program for more efficient use on disaccharides. The modified version gives the same results as the original, unless the new options are selected. Our version reports that a transition may have occurred if the average atomic movement increases. This eliminates the need to report the history of the average atomic movement during CA. Megabytes of disk space per CA run are saved by omitting redundant information and the reports of average atomic movement. We have implemented IPRINT options 5 and 6 in addition to the options 1-4 of the standard program. Both 5 and 6 eliminate the same information from the standard output, but option 6 does not produce the secondary output files (FOR009.DAT) that roughly correspond to MM2 input files, further saving disk space. Besides conserving disk space, the briefer output files can be more quickly scanned for the important results. Another change was to place the energy result in the FOR009.DAT files as well as in the main system output. This stores the final energy values on disk even if the main output is sent to the video display when using options 1-5. A greater understanding of the MM2 program is needed to implement the new -2 option for the dihedral driver. In MM2, a temporary file stores the coordinates at the end of each optimization for use as starting positions for the next optimization.
The procedures that create these files had to be changed. An alternative to modification of MM2, used previously (3), was to create separate input files for each φ,Ψ conformation of interest. This allowed use of the standard driver with a 0° increment size. Special programs could be used to prepare all of the input files. The check of whether the new driver has been properly implemented is whether it gives the same results at a variety of conformations as the standard driver, used with an increment size of zero.
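The transition report described in this section (flagging a run when the average atomic movement increases) can be sketched in a few lines of Python. This is an illustration only, not the modified MM2 code. The function name `find_transitions`, the `tolerance` parameter, and the sample movement values are hypothetical.

```python
from typing import List, Sequence


def find_transitions(movements: Sequence[float], tolerance: float = 0.0) -> List[int]:
    """Return the iteration indices at which the average atomic movement
    rises above the previous iteration's value by more than `tolerance`."""
    return [
        i
        for i in range(1, len(movements))
        if movements[i] > movements[i - 1] + tolerance
    ]


# Made-up history: movement falls, plateaus, rises once, then falls again.
history = [0.05, 0.03, 0.02, 0.02, 0.04, 0.02, 0.01]
flagged = find_transitions(history)
```

Reporting only the flagged iterations, rather than the full movement history, is what allows the output to be shortened while still signalling that a barrier to rotation may have been crossed.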


Application to Cellobiose

A preliminary energy map for cellobiose has been published (7,16) but it was based on only one combination of rotating group positions. Similar maps based on other starting models were needed to confirm the initial work. The optimal gtgtRR structure from the earlier work was altered to give three additional starting structures (all shown in Figure 2). φ and Ψ were stepped in 20° increments from -180° to +160°. In the previous work, an irregular grid was used, with 10° increments in the low-energy regions. The default dielectric constant of 1.5 was used, appropriate for an isolated molecule. The MM2 calculations were carried out on VAX computers. The energy values were managed with a program given in the Appendix that was written in GWBASIC for IBM-PC compatibles.

Results

Energy values for the gtgtRR model were the same as computed earlier for the same φ and Ψ values, confirming that the program modifications had not altered the calculated energy values. From the results for all four starting models, the utility program in the Appendix selected the 324 lowest energy values shown in Figure 6a. Figure 6b shows that 220 of those 324 energies arose from the gtgtRR starting structure used in the earlier work. Another 57 preferred conformations started as ggggRR, 31 points arose from ggggRC, and 16 came from gtgtCC structures.
Figure 6c shows the ranges of energy values at each φ,Ψ point. These ranges are based only on structures that optimized properly. Therefore, some ranges are based on fewer than four energies. The magnitudes of the ranges show the importance of the rotating groups, although the difference between the gtgtRR model and the one with the lowest energy at the conformation in question was usually small, as shown in Figure 6d. In the four sets of 324 points calculated, structures failed to optimize properly 37 times. Figure 6e shows the locations and the numbers of those models. All φ,Ψ points that corresponded to improperly optimized conformations, when tested with other starting models, gave optimized energies at least 10 kcal/mol above the minimum. Figure 7 is a contour plot of the data in Figure 6a. It is almost identical to the plot published earlier that embodied 497 points based on one starting model instead of 324 points for each of 4 starting models. Implications of the various minima and barriers are discussed in Refs. 7 and 16. Figure 8a shows the differences between maps made with driver options -1 and -2 for the gtgtRR starting model. Figure 8b is a SURFER contour plot of the aperiodic data in Figure 4. These plots exhibit the various options of the utility program in the Appendix, the effects of the new driver, and the addition of extra starting geometries.
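The per-point bookkeeping behind Figures 6a, 6b, 6c and 6e can be sketched in Python as follows. This is not the authors' GWBASIC utility. The function name `summarize_point` is hypothetical, and the acceptance window is left as a parameter because the text quotes 25 to 75 kcal/mol while the Appendix quotes 20 to 75 kcal/mol. In the example, -6469 and 52.4 kcal/mol come from the text; the two other failed energies are invented placeholders.

```python
from typing import Optional, Sequence, Tuple

Summary = Tuple[Optional[float], Optional[int], Optional[float], int]


def summarize_point(
    energies: Sequence[float], low: float = 25.0, high: float = 75.0
) -> Summary:
    """Summarize one phi,psi point from the energies of several starting models.

    Returns (lowest valid energy, 1-based index of the model that gave it,
    range of the valid energies, number of discarded energies). The first
    three entries are None if no model gave an energy inside [low, high].
    """
    valid = [(e, i + 1) for i, e in enumerate(energies) if low <= e <= high]
    n_failed = len(energies) - len(valid)
    if not valid:
        return None, None, None, n_failed
    best_energy, best_model = min(valid)
    spread = max(e for e, _ in valid) - best_energy
    return best_energy, best_model, spread, n_failed


# Point of Figure 5: three of four starting models failed to optimize.
# The second and third energies are made-up stand-ins for failed runs.
figure5_point = summarize_point([-6469.0, 900.0, -120.0, 52.4])
```

Applied to every row of a merged four-column energy file, the four returned values would correspond to one entry each in a lowest-energy grid, a best-model grid, a range grid and a failure-count grid.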
Conclusions

The strategy conveyed in this paper permits coherent results from an automated CA while using flexible residues. By testing all the starting models over the entire range of φ and Ψ, parallel sets of data were obtained that were submitted to a simple program for final analysis. This minimizes the personal time required to produce a


Figure 6a. MM2 "Final Steric Energy" values for all tested values of Phi and Psi. The value at each point is the lowest of the energies calculated for the four starting models. (The largest φ and Ψ values are 160°.) Each row below is one value of φ; the entries in a row run over Ψ.

Ψ, left to right: 160 140 120 100 80 60 40 20 0 -20 -40 -60 -80 -100 -120 -140 -160 -180

φ = -180: 47 49 53 47 44 42 40 36 34 35 39 45 44 45 47 43 42 44
φ = -160: 49 56 56 52 45 42 41 38 37 37 41 46 50 49 49 46 43 44
φ = -140: 51 57 63 60 51 45 42 40 40 42 45 49 53 56 54 49 49 47
φ = -120: 53 57 63 66 58 51 45 43 42 41 44 49 56 59 60 52 49 51
φ = -100: 48 54 57 66 65 53 47 44 42 41 41 43 52 58 60 59 49 46
φ = -80: 43 47 54 55 58 57 49 43 40 39 39 40 43 56 55 54 51 45
φ = -60: 42 44 49 53 50 53 50 43 39 36 37 38 40 44 51 50 47 45
φ = -40: 43 43 45 48 46 46 51 44 38 35 35 37 39 41 44 45 42 41
φ = -20: 39 42 44 46 43 41 42 43 38 34 33 35 39 41 41 41 37 37
φ = 0: 36 40 43 44 42 39 38 38 37 34 32 33 37 40 40 39 35 33
φ = 20: 34 40 41 42 42 38 35 34 34 34 32 31 34 38 39 37 34 32
φ = 40: 34 39 42 40 39 38 34 32 32 33 34 32 33 36 37 36 34 32
φ = 60: 36 42 45 41 38 37 35 33 32 33 36 37 36 37 38 35 34 34
φ = 80: 40 44 49 45 40 37 37 36 35 35 37 41 40 41 41 38 36 37
φ = 100: 42 47 50 51 44 39 37 38 38 37 38 42 46 46 45 44 39 39
φ = 120: 44 48 51 52 49 42 39 38 39 38 38 40 46 49 50 47 43 41
φ = 140: 42 48 50 50 50 45 39 36 37 39 38 38 42 47 48 47 44 42
φ = 160: 43 46 50 47 45 44 39 35 34 36 41 39 40 45 46 44 44 43

Figure 6b. The starting model at each point that gave the lowest energy (1 = gtgtRR, 2 = gtgtCC, 3 = ggggRC and 4 = ggggRR). Each row below is one value of φ; the entries in a row run over Ψ.

Ψ, left to right: 160 140 120 100 80 60 40 20 0 -20 -40 -60 -80 -100 -120 -140 -160 -180

φ = -180: 1 1 4 3 3 1 3 1 1 4 1 1 1 1 1 1 1 1
φ = -160: 4 2 1 3 3 3 4 3 3 4 1 1 1 1 1 1 1 2
φ = -140: 2 4 2 3 1 1 1 3 3 3 3 1 3 1 1 3 1 2
φ = -120: 2 2 2 2 4 4 4 1 3 1 1 1 1 3 1 3 1 3
φ = -100: 1 4 4 1 2 3 4 3 3 1 1 1 4 1 4 1 1 1
φ = -80: 1 1 4 4 4 2 4 3 3 3 3 1 1 4 1 1 3 1
φ = -60: 3 3 3 4 4 1 2 4 3 3 3 1 1 1 1 1 1 3
φ = -40: 1 1 4 4 4 4 4 2 1 3 3 3 1 1 1 1 1 1
φ = -20: 1 1 1 1 4 1 1 2 1 3 3 3 1 1 1 1 1 1
φ = 0: 1 1 1 1 3 4 1 1 1 1 3 1 1 1 1 1 1 1
φ = 20: 1 1 1 1 1 4 1 1 1 1 1 1 1 1 1 1 1 1
φ = 40: 1 1 3 1 1 3 1 1 1 1 1 1 1 1 1 1 1 1
φ = 60: 1 2 1 1 1 1 3 1 1 1 1 1 1 1 1 1 1 1
φ = 80: 1 1 3 3 1 1 1 3 1 1 1 1 1 1 1 1 1 1
φ = 100: 1 1 1 1 1 1 1 1 1 1 1 3 1 1 1 1 3 2
φ = 120: 1 1 1 3 1 1 3 3 1 1 1 3 3 1 1 1 4 1
φ = 140: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
φ = 160: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


Figure 6c. The ranges of energy values resulting from the 4 starting geometries. Each row below is one value of φ; the entries in a row run over Ψ.

Ψ, left to right: 160 140 120 100 80 60 40 20 0 -20 -40 -60 -80 -100 -120 -140 -160 -180

φ = -180: 1 1 3 3 3 3 4 4 4 5 5 5 5 6 4 3 2 3
φ = -160: 3 1 1 4 2 2 3 4 3 4 4 5 1 4 3 2 2 3
φ = -140: 2 2 1 1 1 1 2 3 4 4 0 5 4 3 3 4 2 3
φ = -120: 2 2 2 1 2 3 1 2 3 3 4 5 3 5 5 4 3 2
φ = -100: 5 2 2 0 1 5 2 3 3 3 4 5 0 4 4 4 4 3
φ = -80: 3 2 3 3 5 3 1 1 3 4 4 4 5 0 5 6 4 3
φ = -60: 2 2 2 3 3 4 3 1 2 4 5 5 5 4 8 4 4 3
φ = -40: 4 2 2 3 1 1 1 2 2 3 5 5 4 6 6 10 3 4
φ = -20: 4 2 2 2 2 2 2 1 1 3 5 6 6 5 6 6 5 3
φ = 0: 3 3 2 2 3 3 2 3 2 2 3 5 6 6 6 6 5 4
φ = 20: 5 4 3 3 3 3 3 3 3 2 3 3 4 6 6 6 6 5
φ = 40: 5 4 3 3 3 3 3 4 4 3 2 4 3 7 6 6 6 6
φ = 60: 6 2 2 2 3 3 3 4 4 5 4 3 3 3 6 6 5 5
φ = 80: 5 4 2 5 2 3 3 4 4 4 5 4 5 3 3 6 5 4
φ = 100: 3 3 5 1 3 3 3 4 4 5 5 5 5 5 5 5 4 4
φ = 120: 3 3 3 5 5 3 3 4 4 6 6 6 5 5 3 3 4 4
φ = 140: 3 3 3 3 2 3 4 4 4 5 7 7 6 5 5 3 3 3
φ = 160: 2 2 3 3 3 5 4 4 5 5 5 5 7 6 4 3 2 2

Figure 6d (grid; horizontal axis φ, vertical axis Ψ). The difference between gtgtRR and the best value of energy. If the gtgtRR model was best or the difference was less than 0.5 kcal/mol, a "." is shown. If a structure failed to optimize correctly (energy outside the range 25 to 75 kcal/mol), XX is shown.




Figure 6e (grid; horizontal axis φ, vertical axis Ψ). Location and number of models that failed to optimize (energy values were outside the range of 25 to 75 kcal/mol).

Figure 7. The contoured map equivalent to the energy grid in Figure 6a. Contours are drawn at 1 kcal/mol intervals.


Figure 8a (grid; horizontal axis φ, vertical axis Ψ, each from -180° to 180°). Grid of energy differences between results from the standard MM2 option -1 driver and the modified option -2 driver. Positive values indicate that the -2 driver gave a lower value (gtgtRR starting structures only).

Figure 8b. Contour map based on the standard option -1 driver. Contours are drawn at 1 kcal/mol levels.


flexible-residue analysis. Although still computationally expensive (each starting model required about 2.5 cpu days on a MicroVax III), that factor will diminish to relative insignificance as economical computers become faster. An advantage to the new driver option is that the calculated energy for any given φ,Ψ point depends only on the starting geometry and not the preceding points. This not only avoids the faults discussed above, but it permits combination of the results with other results. For example, the energies at even values of φ and Ψ could be interspersed with energies at odd values in limited areas to produce a higher-resolution analysis. Overall, the new driver option is a step forward in CA of molecules that can deform inelastically. Calculations of energy take longer because most starting geometries are not as close to the final result as they are with the standard -1 driver option. However, it is not necessary to calculate energies at both -180° and +180°, saving some time, for a net loss in speed of about 10%. The problems solved by the new driver option are critical, while the new problem of occasional improper optimization can readily be worked around.
Acknowledgments

Calculations were performed at the Institut National de la Recherche Agronomique (INRA), Nantes, France and at Louisiana State University as well as at the Southern Regional Research Center. Some of this effort was inspired by discussions with Professor John Brady, Cornell University. Mary An Godshall, Sugar Processing Research Inc.; Dr. William E. Franklin, Southern Regional Research Center; Professor Andrew Waterhouse, Tulane University; Dr. Massimo Ragazzi, Milan; Dr. Igor Tvaroska, Slovak Academy of Sciences; and Professor N. L. Allinger, U. Georgia, provided useful comments on the manuscript. The use of brand names for products is for descriptive purposes and is not an endorsement. This collaboration was made possible by a USDA Agricultural Research Fellowship.

APPENDIX

The procedure used to prepare the data for the various tables (Figures 3, 6, and 8) follows. After completing the MM2 runs, the main output files were each processed with the VMS editor, ED. The command to write all lines with "FINAL STERIC ENERGY" to a file was given (WR FILENAME.NRG ALL "FINAL S"). After QUITting the editor, the NRG file was loaded into the editor and the text was stripped off, using the command string, S /FINAL STERIC ENERGY // whole. That was followed with S /KCAL.// whole. Using a communications program, the 4 NRG files were transferred to an IBM-PC/AT compatible computer and merged with each other, in such a way that there were 4 columns of energy values. The resulting file was then input to the following program, available from the author on disk.

The following program is written in GWBASIC for IBM-PC compatibles. It assumes that energy values are in a single column for all Ψ,φ points for a given starting model. Energies for additional starting models must be in additional columns, with all values in each row corresponding to the same Ψ and φ. Energy values outside the range of 20 to 75 kcal/mol are discarded by the program. They should not be discarded manually. Besides the input files for SURFER, the program produced the uncontoured energy maps in Figures 4, 6 and 8.
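For readers without access to the VMS editor, the extraction and merging steps described above can be approximated with a short script. The Python sketch below is separate from the GWBASIC program and is not the authors' procedure. The function names are hypothetical, and the line format it expects (the text FINAL STERIC ENERGY followed by a number and KCAL.) is an assumption inferred from the editor commands quoted above. The default limits of 25 and 75 kcal/mol copy the LO and UPLIM values set in the listing.

```python
import re

# Assumed line format: " FINAL STERIC ENERGY IS   31.2500 KCAL."
ENERGY_LINE = re.compile(r"FINAL STERIC ENERGY\D*?(-?\d+(?:\.\d+)?)")


def extract_energies(lines):
    """Return the energies found on lines containing FINAL STERIC ENERGY."""
    energies = []
    for line in lines:
        match = ENERGY_LINE.search(line)
        if match:
            energies.append(float(match.group(1)))
    return energies


def merge_columns(columns):
    """Turn one energy list per starting model into rows, one row per grid point."""
    if len({len(column) for column in columns}) != 1:
        raise ValueError("every starting model must cover the same grid points")
    return [list(row) for row in zip(*columns)]


def lowest_in_range(row, lo=25.0, hi=75.0):
    """Return (energy, 1-based column) of the lowest in-range value, or None."""
    usable = [(energy, index + 1) for index, energy in enumerate(row)
              if lo <= energy <= hi]
    return min(usable) if usable else None


# Hypothetical output fragments from two starting models.
model_a = ["header text",
           " FINAL STERIC ENERGY IS   31.2500 KCAL.",
           " FINAL STERIC ENERGY IS   80.1000 KCAL."]
model_b = [" FINAL STERIC ENERGY IS   29.7500 KCAL.",
           "other text",
           " FINAL STERIC ENERGY IS   40.0000 KCAL."]
rows = merge_columns([extract_energies(model_a), extract_energies(model_b)])
best = [lowest_in_range(row) for row in rows]
```

Filtering inside `lowest_in_range`, rather than editing the energy files by hand, keeps every row aligned with its Ψ,φ point, which is why out-of-range values should not be discarded manually.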

French and Brady; Computer Modeling of Carbohydrate Molecules ACS Symposium Series; American Chemical Society: Washington, DC, 1990.

10 ' PROGRAM TO TAKE ONE OR MORE LISTS OF MM2 ENERGIES, PREPARE FILE FOR
20 'SURFER MAPS, ETC.
30 'WRITTEN BY A. D. FRENCH - VERSION 2.0 MARCH 2, 1989
40 DIM Z(10),M(30,30)
50 UPLIM=75:' Values above this generally result from malformed structures
60 LO=25:' Values below this are erroneous for MMP2(85), cellobiose
70 PRINT "Energy analysis utility for MM2 output, SURFER input."
80 PRINT "MAPREP V. 2.0 - March 2, 1989": PRINT:PRINT
90 'End of preliminaries, start of file handling
100 PRINT "Current allowed energy range is ";LO;" - ";UPLIM
110 INPUT "NAME OF INPUT FILE";FI$
120 OPEN "I",1,FI$
130 INPUT "NAME OF OUTPUT FILE FOR SURFER INPUT";FO$
140 OPEN "O",2,FO$
150 'Set up ranges and increments for Phi, Psi
200 INPUT "IS THIS A STANDARD -180 TO +160 STEP 20 MAP? (Y/N)[Y]";A$
210 IF LEFT$(A$,1)="Y" OR LEFT$(A$,1)="y" OR LEFT$(A$,1)="" GOTO 290
220 INPUT "STARTING PHI VALUE";PHBEG
230 INPUT "ENDING PHI VALUE";PHEND
240 INPUT "INCREMENT OF PHI";PHDEL
250 INPUT "STARTING PSI VALUE";PSBEG
260 INPUT "ENDING PSI VALUE";PSEND
270 INPUT "INCREMENT OF PSI";PSDEL
280 GOTO 340
290 PHBEG=-180:PHEND=160:PHDEL=20
300 PSBEG=-180:PSEND=160:PSDEL=20
310 '
320 'Rest of input, set up type of map desired.
340 INPUT "HOW MANY COLUMNS OF ENERGIES ARE THERE";NCOL
350 PRINT "WHAT TYPE OF ANALYSIS IS DESIRED?"
360 PRINT " 1. Usual Phi, Psi and Lowest Energy Values"
370 PRINT " 2. Phi, Psi and Number of Column with Lowest Energy Value"
380 PRINT " 3. Range of Energy Values at Each Phi, Psi"
390 PRINT " 4. Bad Values on Phi, Psi Grid"
400 PRINT " 5. Difference Between any Column and Best Value"
410 PRINT " 6. Phi, Psi and Energy from only One of Several Columns"
420 INPUT "YOUR CHOICE (1-6)";OUTVAL
430 IF OUTVAL<1 OR OUTVAL>6 THEN GOTO 350
440 IF OUTVAL=1 THEN OUTVAL$="E": GOTO 720
450 IF OUTVAL=6 THEN OUTVAL$="E":INPUT "Column Number ";ICOL:GOTO 540
460 IF OUTVAL=2 AND NCOL>1 THEN OUTVAL$="CC": GOTO 720
470 IF OUTVAL=2 THEN PRINT CHR$(7);"1 Column, no choice!": GOTO 360
480 IF OUTVAL=3 AND NCOL>1 THEN OUTVAL$="R": GOTO 720
490 IF OUTVAL=3 THEN PRINT CHR$(7);"1 Column, no range!": GOTO 360
500 IF OUTVAL=4 THEN OUTVAL$="B": GOTO 720
510 IF OUTVAL=5 THEN OUTVAL$="D"
520 IF OUTVAL=5 THEN INPUT "Number of Column for Comparison";ICOL
530 GOTO 720
540 'Routine to extract just one column of energy values
550 FOR Y=PSBEG TO PSEND STEP PSDEL
560 ROW=ROW+1:COL=0

570 FOR X=PHBEG TO PHEND STEP PHDEL
580 COL=COL+1
590 FOR K=1 TO NCOL
600 IF K=ICOL THEN INPUT #1,ENG ELSE INPUT #1,DUM
610 NEXT K
620 IF ENG