Parametrization and application of CHEAT95, and extended atom

Carine Baraguey, Dirk Mertens, and Andreas Dölle ... C. M. Timmers , J. J. Turner , C. M. Ward , G. A. van der Marel , M. L. C. E. Kouwijzer , P. D. ...
0 downloads 0 Views 2MB Size
13426

J. Phys. Chem. 1995, 99, 13426-13436

Parametrization and Application of CHEAT95, an Extended Atom Force Field for Hydrated (0ligo)saccharides Milou L. C. E. Kouwijzer Department of Crystal and Structural Chemistry, Bijvoet Center for Biomolecular Research, Utrecht University, Padualaan 8, 3584 CH Utrecht, The Netherlands

Peter D. J. Grootenhuis" Department of Computational Medicinal Chemistry, N. V. Organon, P.O.Box 20, 5340 BH Oss, The Netherlands Received: May 17, 1995; In Final Form: June 30, 1995@

A completely revised version of the CHEAT force field (carbohydrate hydroxyl groups represented by extended

atoms) is presented. This force field is meant to allow simulation of oligosaccharides in aqueous solution without the explicit inclusion of water molecules. Thus, the average effect of intra- and intermolecular hydrogen bonding has been introduced into the potential energy function by adding a new (extended) atom type to the QuantdCHARMm 4.0 force field. The additional force field parameters for the new atom type have been optimized by using a set of 56 estimated interaction energy differences for solvated cyclohexanols and aldopyranoses. The resulting CHEAT95 force field, which is computationally very efficient, has been applied to the disaccharides cellobiose, maltose, and sucrose. By comparing calculated potential energy maps with experimentally determined conformational data from X-ray, NMR, and optical rotation studies, it is shown that CHEAT95 results closely agree with the disaccharide conformations observed in solution. All but one conformation of the disaccharides in crystal structures are within 4 kcaVmol of the calculated global minima. In particular sucrose, which conformational behavior constitutes a major problem to force field calculations, could be simulated in full agreement with the experimental data. Finally, a complete conformational search has been carried out for the glycosidic and exocyclic dihedrals of GlcNAca( 1-4)[Fuca( 1+3)]GlcNAcP, a structural variant of Lewis X. The CHEAT95 results are in better agreement with N M R data and molecular dynamics simulations that explicitly include water molecules rather than with in vacuo calculations.

Introduction Because of their vital role in biological systems, carbohydrates form an interesting group of compounds in chemistry. Their conformational behavior in aqueous solution is of particular interest. However, it is not easy to study this experimentally, because typically oligosaccharides can adopt a number of conformations between which transitions may 0ccur.I To study the behavior of carbohydratesin water theoretically, empirical force field calculations can be performed. With molecular dynamics (MD)the time-dependent behavior in water can be simulated, although such studies should be interpreted with care. Apart from the limitations of an empirical force field and inaccuracies due to cutoff radii or other assumptions, sampling the configurational space is known to be a problem. A simulation of a few nanoseconds already takes a long computing time for an explicitly hydrated oligosaccharide. However, the nanosecond time scale may very well be too short to observe major conformational transitions. Therefore, it may be advantageous to perform a systematic conformational search in uacuo in order to generate promising starting conformations for subsequent MD simulations. In uacuo simulations are computationally efficient compared to MD simulations of explicitly hydrated carbohydrates, but typically the results are affected by the formation of unrealistic intramolecular hydrogen bonds. NMR experiments have shown that the hydroxyl groups of some monosaccharides are more or less free rotors' and that intramolecular hydrogen bonds are

* Author for correspondence.E-mail: [email protected]. 'Abstract published in Advance ACS Absfracts, August 15, 1995. 0022-3654/95/2099-13426$09.00/0

rarely found. For example, there appears to be a strong similarity between NMR data for sucrose and its 2-deoxy a n a l ~ g u e . Recently, ~ an NMR study of the hydroxyl protons of sucrose in aqueous solution showed that the chemical shift temperature coefficients, the coupling constants, and the exchange rates were virtually the same for all of them. No persistent hydrogen bonds were found, which suggests that the hydrogen bonds to water pred~minate.~Another illustration comes from analysis of the seven known crystal structures of pure aldohexopyranoses (where each molecule can be considered to be surrounded by a polar environment): just one of the 40 independent hydroxyl groups in these compounds (viz., in a-Dtalose) is not involved in an intermolecular hydrogen bond but forms an intramolecular bond. All other hydroxyl groups are involved in at least one intermolecular hydrogen bond, sometimes bifurcated with one intramolecular b ~ n d . ~ . ~ The energy associated with the formation of the intramolecular hydrogen bonds in vacuo dominates the total energy of the compound. Therefore, a conformational search in uacuo is very much a search for the most favorable intramolecular hydrogen-bonding scheme. This is illustrated by energy minimized conformations of methyl-P-D-glucopyranose in vacuo and MD simulations of the same compound in water.7 In vacuo the tg conformation is found to be the most stable one (06 trans with respect to 0 5 and gauche with respect to C4, shown in Figure l), which is supported by the 06.. 0 4 hydrogen bond. However, it is known from NMR data that this conformation is hardly present in aqueous ~ o l u t i o n .The ~ simulation in water agrees with this; the intramolecular hydrogen bonds are replaced by intermolecular ones. 0 1995 American Chemical Society

Extended Atom Force Field for (0ligo)saccharides

Figure 1. Minimum energy conformation in vacuo of methyl-B-oglucopyranoside. The dashed lines indicate hydrogen bonds (PLUTON plots).

Figure 2. Crystal structure conformation of sucrose ( 2 - 0 - ( a - ~ glucopyranosyl)-/?-D-fructofuranoside),’O with the glycosidic dihedrals q5 (05-Cl-Ol-C2’) and ly (Cl-Ol-C2’-05’).

Of course, the above-mentioned problems, viz., the accuracy of the calculated energies and the sampling of the configurational space, exist for in zmcuo calculations too. An example of these problems is the conformational analysis of the glycosidic link in the disaccharide sucrose (see Figure 2). This molecule contains three exocyclic hydroxymethyl groups and eight hydroxyl groups. When the lowest energy for a given conformation of the glycosidic link is calculated systematically, a minimum of three staggered positions has to be taken into account for each group, meaning that a total of 3” (-177 000) energy calculations have to be made. In addition, the results are likely to be dominated by unrealistic intramolecular hydrogen bonds. To circumvent the problems of sampling the configurational space and the intramolecular hydrogen bonds in in vacuo simulations, we introduced in 1993 the CHEAT force field:” a force field derived from CHARMmI2in which the carbohydrate hydroxyl groups are represented by extended atoms (also referred to as “united atoms”). Thus, in the CHEAT approach, the hydrogen-bond-donating capacity of hydroxyl groups is disabled and the average effect of intra- and intermolecular hydrogen bonds is mimicked. In this way, the sampling of the conformational space by a systematic search or an MD simulation in vacuo is faster than all-atom calculations, and more importantly, the results are not dominated anymore by unrealistic intramolecular hydrogen bonds. The corresponding force field parameters were optimized on a set of steric interaction energies between hydroxyl and/or methyl groups of six-membered ring compounds in aqueous solution at room temperature, as derived by Angyal.I3 We have applied the force field successfully in a study of di- and trim an nose^.'^ Bergwerff et al. used an implementation of the CHEAT force field in the molecular modeling package InsightDiscover to study the difference between the trisaccharides Galp( l-4)[Fuca( 1--3)]GlcNAc/3OEt and GalNAc@(1-4)[Fuca( 1-3)]GlcNAcp-OMe (where Gal is D-galactopyranose, Fuc is L-fucopyranose, GlcNAc is 2-acetamido-2-deoxy-~-glucopyranose, Et is ethyl, GalNAc is 2-acetamido-2-deoxy-~-galactopyranose, and Me is methyl). l 5 Their MD results agree well with their NMR results.

J. Phys. Chem., Vol. 99, No. 36, 1995 13427 The original CHEAT force field has a few drawbacks. First, the performance of CHEAT for the different subgroups of compounds ((methyl)cyclohexanols, inositols, aldohexopyranoses, and aldopentopyranoses) varies significantly. Second, the force field is not a true superset of CHARMm, since two parameters of the CHARMm force field had to be adapted in order to improve the conformational energy prediction of two basic molecules of the training set, ethane and methylcyclohexane. The CHARMm force field has recently been improved in terms of its performance on small molecule^.'^^'^ This was done by the inclusion of all dihedral terms explicitly instead of the use of generalized dihedrals. Therefore, we decided to reparametrize CHEAT on the basis of the new CHARMm force field. In order to distinguish between the published and the currently derived version of CHEAT, they will be referred to as CHEAT93 and CHEAT95, respectively. In this paper it will be shown that CHEAT95 which now is a true superset of CHARMm, offers a substantial improvement over the old version and that it allows high-quality conformational analyses of oligosaccharides. Computational Methods Parametrization of CHEAT95 Since the CHEAT force field should preferably be a superset of the underlying CHARMm force field, only parameters for a new extended atom type representing a hydrated hydroxyl group, named OTIE, were optimized. The force constants and equilibrium values of bond lengths and angles were copied from the corresponding CHARMm parameters for a hydroxyl oxygen atom. The same was done for most of the dihedral parameters. However, for the OTlE-CT-CT-OT1E and CT-CT-CT-OT1E dihedrals, force constants were optimized (for the atom types, see the Appendix). One additional dihedral term was added and optimized to improve the description of the anomeric effect for monosaccharides (viz., CT-OE-CT-OTlE). The partial charge of the OTlE atom (for reasons of electroneutrality, together with the charge of the adjacent carbon atom) and the van der Waals parameters were optimized. However, the van der Waals parameters for the 1-4 interactions of OTlE were taken to be equal to those of the hydroxyl oxygen atom in CHARMm, similar to the treatment in CHARMm of the extension of C to CH, CH2, or CH3 atom types. Some tests were done by varying the 1-4 van der Waals parameters, with other multiplicities or phases of the various dihedral terms, and with extra dihedral terms, but none of these tests resulted in improvement. Thus, seven new force field parameters had to be optimized. The same procedure as for CHEAT93 was used to optimize the parameter set: a grid search was performed in which the parameters were varied systematically between reasonable values. For every combination the energies of the compounds of a training set in several conformations were calculated. Energy differences between different conformations of the same molecule were compared to corresponding values estimated by Angyal, who derived a table of interaction energies for cyclic compounds in aqueous solution at room temperature, given in Table l.13 In this way, the root mean square (rms) deviation between the CHEAT95 and Angyal energy differences was calculated for thousands of parameter combinations. Training Set. The CHEAT93 force field had been optimized on 64 energy differences between conformations with equatorial and axial substituents, estimated from Angyal’s scheme: 10 hydroxy and/or methyl substituted cyclohexanols, 5 inositols, and 12 aldopyranoses. For every aldopyranose four energy

13428 J. Phys. Chem., Vol. 99, No. 36, 1995

Kouwijzer and Grootenhuis

TABLE 1: Energies (kcaVmol) of Interactions between Substituents in Cyclitols in Aqueous Solution (25 OC)13J8 between OH ax CH3 or CHzOH ax OH equat H ax 0.45 0.9 OH ax 1.5 2.5 0.35 OH equat 0.35 0.45 0.35 CH3 or CHzOH equat 0.45 0.45 (I

anomeric effect 0 2 ax and 0 3 ax 0 2 ax and 0 3 equat 0 2 equat 0 1 equat 0.85 1.o 0.55 Interactions between axial groups refer to 1-3, and the others to 1-2 interactions.

TABLE 2: Conformational Energy Differences (kcaymol) between Axially and Equatorially Substituted Cyclohexane Derivatives (in the Chair Conformation) As Estimated from Angyal's Scheme13and Calculated by CHEAT

- --

compound

Angyal CHEAT95 CHEAT93

cyclohexanol (a e) 1,2-cyclohexadiol(a,a, e, e) 1,3-~yclohexadiol (a,a e, e) 1,4-~yclohexadiol (a, a e, e) 1,2,3-cyclohexatriol (a, a, a, -e, e, e) (a, a, a, e, e, e) 1,2,4-cyclohexatriol 1,3,5-cyclohexatriol (a,a, a, -e, e , e)

-

0.90 1.45 2.40 1.80 2.60 2.95 4.50

0.96 1.47 2.53 1.79 2.61 2.95 4.69

0.83 1.72 2.23 1.63 2.92

0.09

0.24

1.10 2.32 3.95 1.01 0.95 2.92 3.81 3.92 4.20 3.23 4.53 6.51

0.79 1.93 3.60

0.13

0.40

2.24 2.18 4.44 2.36 6.55

2.32 2.45 4.77 2.68 6.79

0.19

0.20

4.17

a

differences were calculated between the a- and P-anomers in both the 4C1 and the IC4 conformation (but note that only three are independent). Next to this, the energy difference between the anti and the gauche conformation of 1,2-ethanediol was used.' For the optimization of CHEAT95 three changes to this training set were made. (1) The set was extended with 9 more substituted cyclohexanes. (2) 1,ZEthanediol was removed from the set because of the uncertainty of the experimental energy. We decided to focus on cyclic systems only. (3) The 16 energy differences between the 4Cl and IC4 conformations of both anomers of the aldohexopyranoses were removed from the set. The energy differences between the 4Cl and IC4 conformations for each anomer calculated with CHEAT95 were systematically too large, and it turned out that these differences deviated significantly more from Angyal's estimated values than all other energy differences, including the corresponding differences for the aldopentopyranoses. A reason for this may be that the interaction energy of the hydroxymethyl group is, according to Angyal's scheme, independent of its conformation with respect to the ring oxygen atom (in fact, no difference is made between this group and a methyl group), which is known from NMR experiments not to be true.I9 Furthermore, very little is known about the preferred conformation of this hydroxymethyl group in the IC4 ring conformation for each compound, so quantitative checks can hardly be made. Since Angyal predicts with his scheme the relative amount of each anomer of the aldopyranoses with reasonable accuracy, compared to NMR results, we do consider all other values in his scheme to be reliable for our purposes. For the aldohexopyranoses, the conformation of the hydroxymethyl group in the 4Cl ring conformation was taken as gg for those that have 0 4 equatorially (as in glucose); gt was taken for those that have it axially (like galactose), since these are the dominating conformations in aqueous s o l ~ t i o n ' ~ (for the atomic numbering, see Figure 1). In the IC4 ring conformation, where experimental data are hardly available, all hydroxymethyl groups were placed in the gt conformation (the gg conformation for the gluco-like monosaccharides is highly unlikely because of the axial position of 0 4 and, in the a anomer, 01). Altogether, a set of 56 energy differences (of which 4 are redundant) has been used to optimize CHEATBS, compared to 64 (of which 12 are redundant) for CHEAT93. Energy Calculations. All energy calculations described here were carried out using CHARMm version 22, in the Quanta/ CHARMm molecular modeling package version 4.0.16 No cutoff radius was used for the nonbonding interactions, and a distance-dependent dielectric constant (er = r/A; for 1-4 interactions cr = 2rIA) was used, as in CHEAT93. The energy was minimized by the adopted basis Newton-Raphson algorithm until the rms energy gradient was less than 0.001 kcal/ (mol-A).

rms diff cyclohexanols

---

1-methyl-2-cyclohexanol (a,e e, a) I-methyl-2-cyclohexanol (a,a e, e) 1-methyl-3-cyclohexanol (a,e -e, a) I-methyl-3-cyclohexanol (a,a e, e) I-methyl-4-cyclohexanol (a,e e, a) I-methyl-4-cyclohexanol (a, a e, e) l-methyl-2,3-cyclohexadiol(a,a, a e, e, e) l-methyl-2,4-cyclohexadiol (a, a, a e, e, e) l-methyl-2,5-cyclohexadiol(a,a, a e, e, e) I-methyl-2,6-cyclohexadiol (a,a, a e, e, e ) l-methyl-3,4-cyclohexadiol(a, a, a e, e, e) l-methyl-3,5-cyclohexadiol(a,a, a e, e, e)

----

0.90 2.25 3.85 0.90 0.90 2.70 3.95 3.75 4.30 3.30 4.40 6.50

rms diff methylcyclohexanols

--

inositol(e,e, a, a, a, a a, a, e, e, e, e) (epi)inositol( e , a, a, a, e, a a, e, e, e, a, e) (myo)inositol (e,a, a, a, a, a -a, e, e, e, e, e) (neo)inositol(e,a, a, e, a, a a, e, e, a, e, e) (scyl1o)inositol(a, a, a, a, a, a e, e, e, e, e, e) (+)

--

rms diff inositols

2.30 2.30 4.60 2.30 6.90

5.81

For the calculation of conformational energy maps as a function of the glycosidic dihedrals I$ and ly in three disaccharides, these angles were varied from -180" to +180" with a grid size of 10". For every molecule a set of starting structures was generated with every hydroxymethyl group in one of the three staggered positions (for cellobiose and maltose 9 starting structures and for sucrose 27 starting structures). At every grid point the energy of all starting structures was minimized with restraints on 4 and ly of 1000 kcaI/(molrad*). The minimization as described above was preceded by 50 steps of steepest descent. Then, the restraints were removed, and the energy was recalculated. Isoenergy contour maps were based on the lowest energy calculated at each grid point and generated using a contouring program.20 In the trisaccharide GlcNAcP( 1+4)[Fuca( 1--3)]GlcNAc/3 two glycosidic links are present. At first, all angles were varied from -180" to +180" with a grid size of 30", and in each of the points of this four dimensional grid, the energy was minimized (with the hydroxymethyl groups in the gt position). Next, the low-energy regions of each angle were determined. New energy minimizations were then performed for a smaller range for one link and the full range for the other, now with a grid size of 20" and nine starting structures in each grid point (both hydroxymethyl groups in one of the three staggered positions). All calculations were performed on either a Silicon Graphics Challenge or on Indigo 2 Extremes, all with R4400 processors.

Results and Discussion Parametrization. The final parameter set is given in the Appendix, as well as the topologies of cyclohexanol and glucose [The CHEAT95 parameters and topologies in digital form are available by request to Molecular Simulations,Burlington, MA]. The resulting energy differences calculated with this parameter set and the comparison with the Angyal values are given in Tables 2-4. The final rms difference between the calculated

Extended Atom Force Field for (0ligo)saccharides

J. Phys. Chem., Vol. 99,No. 36, 1995 13429

TABLE 3: Conformational Energy Differences (kcaumol) between the a and /3 Anomers of the 4C1and lC4 Conformations of the D-Aldopyranoses As Estimated from Anmal’s Scheme13 and Calculated by CHEAT 4 C ~ ( a-) 4C~cC) 1C4(a)- ICaCc) compound Angyal CHEAT95 CHEAT93 Angyal CHEAT95 CHEAT93 allose altrose galactose glucose gulose idose mannose talose rms diff

0.95 0.30 0.35 0.35 0.95 0.30 -0.45 -0.45

0.95 0.36 0.25 0.25 0.86 0.25 -0.33 -0.38 0.08

0.89 0.42 0.06 0.21 0.73 0.21 -0.12 -0.28 0.20

-0.70 - 1.50 - 1.45 - 1.45 -0.70 -1.50 -2.10 -2.10

-1.13 -1.66 -1.75 - 1.73 -1.17 - 1.77 -2.24 -2.31 0.30

-1.17 -1.47 - 1.82 -1.76 -1.17 -1.62 -2.19 -2.27 0.30

arabinose lyxose ribose xylose rms diff

0.30 -0.45 0.95 0.35

0.32 -0.37 0.91 0.22 0.08

0.27 -0.25 0.74 0.09 0.20

-0.35 -0.95 0.45 -0.30

-0.14 -0.83 0.43 -0.22 0.13

0.03 0.64 0.36 -0.14 0.26

TABLE 4: Conformational Energy Differences (kcavmol) between the 4C1and lC4 Conformations of the a and /3 Anomers of the D-Aldopentopyranoses As Estimated from Anmal’s Scheme13 and Calculated by CHEAT U a ) - 4Cl(a) IC4IR) - 4CIcC) compound Annyal CHEAT95 CHEAT93 Angyal CHEAT95 CHEAT93 arabinose lyxose ribose xylose rms diff

-1.15 0.55 0.10 1.65

-1.09 0.82 0.29 2.13

-1.04 0.74 -0.50 1.74

0.29

0.32

-0.50 1.05 0.60 2.30

-0.63 1.28 0.77 2.56

-0.80 1.12 -0.38 1.96

0.20

0.54

and Angyal energy differences is 0.18 kcaymol ( N = 56). When only the energy differences that were calculated with both versions of CHEAT are taken into account, the total rms (N = 47) for CHEAT93 is 0.30 kcal/mol and for CHEAT95 0.19 kcaymol. The imbalances between the different subsets that were reported with CHEAT93 are, to a lesser extent, present with CHEAT95 too, but in a different way: apparently, the IC4 conformations do not agree very well with the estimated Angyal values, about which we have already expressed our concern (see the subsection “Training Set”). Of course, the optimization of the parameter set on energy differences should preferably be based on experimental data and not on the Angyal set. However, only a limited number of useful experimental data is available. Therefore, we have chosen to use the Angyal set of steric interaction energies, which seems to approximate the experimental differences rather well and results in a consistent set of energy differences. Ring Conformations. The conformations of the sixmembered rings were satisfactory in that no unexpected features were observed and no conformational changes took place. Next to these rings, five ketofuranose molecules and P-D-fructofuranose were studied to calculate the energy difference between the N and the S conformations of the ring. Unfortunately, for all but fructose, both conformations appeared to be unstable (especially the S conformations). Anomeric, Hassel-Ottar, and Gauche Effects. Three important factors for the conformation of (hydrated) saccharides are the anomeric effect,21the Hassel-Ottar also called 1,3-syndiaxial or pen interaction, and the gauche effect.23 The anomeric effect for monosaccharides has been introduced qualitatively in the CHEAT95 force field, as can be seen in Table 3 (the prediction of the anomeric proportions by the Angyal energies is in good agreement with NMR datal3) because of the new dihedral term CT-OE-CT-OTlE. The anomeric effect for saccharides that consist of more than one monomer

TABLE 5: Statistical Distribution of the Conformations of the Hydroxymethyl Group in Aldohexopyranoses (4C1Ring Conformation) with the Standard Deviation in Parentheses? gluco-like galacto-like

nn(%)

nr(%) ?e(%) .en(%)

nt(%)

m(%)

46(4) 2 ( 2 ) 22(1) 56(4) 23 (3) 43 (3) 7 (3) 61 (9) 32(6) l(1) 26(2) 61 (1) 13 (1) 5(1) 56(1) 39(1) The NMR data are based on both anomers of D-glUCOSe, D-mannose, and D-galactose, all with and without a methyl group at Ol.9-29For the CSD data 370 gluco- and 84 galacto-like aldopyranoses were found and e~amined.~.~ The CHEAT95 data are based on the hexopyranoses mentioned in Table 3. NMR

CSD CHEAT95

53(4) 56(4)

is present in the underlying CHARMm force field (through a dihedral term CT-OE-CT-OE with a multiplicity of one and a shift of 180”).’6 The typically different bond lengths around the anomeric center are not parametrized. According to the Hassel-Ottar effect, conformations of parallel C-OH bonds on C(n) and C(n+2) are energetically unfavorable.22 For our training set, this means that the conformation of the exocyclic hydroxymethyl group in the 4C1 conformation of the gluco-like monosaccharides should not be tg, and for the galacto-like monosaccharides it should not be gg. This effect is seen both in aqueous ~ o l u t i o nand ’ ~ in crystal structure^.^^ We extended the latter database search, since the Cambridge structural database (CSD)S has increased substantially in the past 15 years (the number of independent, useful gluco-like aldohexopyranoses has increased since then from 101 to 370 and for the galacto-like ones from 24 to 84). With CHEAT95 the potential energy of all three conformers was calculated and converted to statistical distributions. These percentages, together with NMR and CSD data, are given in Table 5. This table shows that in CHEAT95 the “forbidden” conformations are indeed less populated than the other two. In contrast with this, in vacuo calculations with other force fields, ab initio or semiempirical calculations often prefer the tg conformation in glucose, because of the intramolecularhydrogen bonds (as shown in Figure l).7.25 Nevertheless, the percentages for the gluco-like saccharides calculated with CHEAT95 are not in good agreement with the experimental data; the gt conformation is favored instead of the gg. This might be improved by the addition of a dihedral term in the force field with a low-energy contribution for the gg conformation and higher contribution for the gt (and also the tg) conformation. This could be a term for the OE-CT-CT-OT1E dihedral with a multiplicity of one and a phase of 120°, but, at present, CHARMm only allows phases of 0 or 180”. Besides, such a term would interfere with the ring conformation too, so the improvement in the conformations of the hydroxymethyl group could introduce new problems with the ring conformations. The same was seen in attempts to improve our calculated percentages via a term for the CT-CT-CT-OT1E dihedral with a perodicity of 2. In calculations with other force fields in uacuo, better agreement is reached when solvent effects are inbut only with a quantum chemical solvation model of water based on a continuum treatment of dielectric polarization and solvent accessible surface area were the populations reproduced correctly.28 Most 0-C-C-0 dihedrals in our training set are part of the ring, thus not freely rotatable. Therefore, CHEAT95 has not been optimized on the gauche effect explicitly. The final dihedral term OTlE-CT-CT-OT1E with a multiplicity of 1 (see the Appendix) favors the trans conformation but gives better overall results than a term with a multiplicity of two (which

13430 J. Phys. Chem., Vol. 99, No. 36, 1995

Kouwijzer and Grootenhuis

optical rotation s t ~ d i e s ,are ~ ~marked. . ~ ~ The calculations with a 10” grid size and all starting structures took about 13 h for cellobiose and maltose and 52 h for sucrose (with a 20” grid size, as French and Dowd used; this decreases to about 4.5 and 14 h, respectively). None of the CHEAT95 maps shows regions with unusually high barriers, and all but one of the crystal structures are found Figure 3. Crystal structure conformation of cellobiose (~-O-@-D- within 4 kcdmol from the minimum (the exception being within 5 kcaVmo1). According to an optical rotation study of cellobiose glucopyranosyl)-/?-~-glucopyranose),~~ with the glycosidic dihedrals 4 (Hl-Cl-Ol-C4’) and v (Cl-Ol-C4’-H4’). in aqueous solution, the regions denoted A-C in Figure 5 account for 85% of the entire populated region and D for another For maltose a difference exists between the region where most crystal structures are found and the lowest energy conformation in aqueous solution, where region A is the most populated Regions B and C are populated to a lesser extent, and region D is populated only slightly. For sucrose, the predominant solution conformer, in region A, does not differ much from the crystal structure conformation, but there is also a significant representation of conformers in region B.36 In our calculations starting structures with the hydroxymethyl group in the tg conformation were included. When these had Figure 4. Crystal structure conformation of a-maltose (4-O-(a-~glucopyranosyl)-a-~-glucopyranose),~~ with the glycosidic dihedrals 4 been neglected, the maps of cellobiose and maltose, as shown (HI-Cl-01-C4’) and v (Cl-Ol-C4’-H4’). in Figures 5 and 6, would have remained unaltered. The map of sucrose, however, would not. When all 90 grid points with would favor the gauche conformation). Nevertheless, the energies less than 4 kcaVmol above the minimum are scrutinized, gauche effect is apparent in the rotatable 05-C5-C6-06 it is found that the 05-C5-C6-06, the 05’-C5’-C6’-06’, dihedral in the galacto-like monosaccharides. Table 5 shows and the 05’-C2’-C1’-01’ dihedrals occupy the tg conformathat the energy of the tg conformation is indeed higher than tion in 7, 29, and 25 grid points, respectively. These conformathat of the gt conformation (the higher energy of the gg tions appear already at 2.1, 1.0, and 2.1 kcallmol from the conformation was already explained by the Hassel-Ottar effect). minimum, respectively. This tg conformation is found for the In the gluco-like monosaccharides, the high energy of the tg 05’-C2’-C1’-01’ dihedral in some of the crystal structures conformation can be explained both by the Hassel-Ottar effect too. Obviously, the neglect of these starting structures will and by the gauche effect. change the map. Application 1: Conformational Analysis of Cellobiose, The overall shape of the maps of cellobiose and maltose agree Maltose, and Sucrose. In 1993, French and Dowd published with those of French and Dowd, but the shape of the minima a conformational study of disaccharides with relaxed residue differ in some respects. French and Dowd calculated only one energy maps of cellobiose (Figure 3), maltose (Figure 4), and for maltose at 4, t/,~ = -16”, -24”; approximately sucrose (Figure 2).30 The maps were calculated with ~ ~ 3 ( 9 2 ) , ~ ’ . ~minimum * between the two we calculate, Le., -70”, -44” and - 1lo, 37”. picturing conformational energy as a function of the glycosidic The conformations found in crystal structures are in better dihedrals. For each compound a number of starting conformaagreement with their minimum than with our minimum. tions was constructed with the rotatable hydroxyl groups in However, region A (most populated in aqueous solution) either a clockwise or an anticlockwise (as in Figure 1) coincides with our global minimum, while in the map of French orientation and the hydroxymethyl groups in either gg or gt and Dowd the global minimum falls in region B. To find the positions. This resulted in 16, 24, and 48 starting structures origin of this difference, another map of maltose in the allfor cellobiose, maltose, and sucrose, respectively (it was noted atom representation was calculated with CHARMm. The map that this set of starting structures might not result in the lowest was calculated with a 20” grid size and 16 starting structures possible energy at each map point). Conformations found by (the hydroxymethyl groups in either the gg or the gt position diffraction studies in crystals of relevant molecules were not and the hydroxyl groups either in a clockwise or in an more than 3 kcdmol higher in energy than the global minimum anticlockwise orientation). The map is shown in Figure 8; it found by ~ ~ 3 ( 9 2except ) , for sucrose, where differences up to resembles more the MM3 map than the CHEAT95 map. 8 kcaVmo1 were found. Analysis of intramolecular hydrogen bonds in the crystal For these three compounds we calculated the potential energy structures with a conformation in the different minima is not map using CHEAT95. To facilitate the comparison, the same easily done since the positions of the hydrogen atoms are not definitions of the glycosidic dihedrals were used as those of always known. Only three of the eight conformations of crystal French and Dowd. The ring conformations were 4Cl for the structures which are found in or near region A in Figure 6 can glucopyranose and N (4T3) for the fructofuranose ring (which be analyzed, and they do not have an intramolecular hydrogen is a stable conformation in CHEAT95, see the subsection “Ring bond. Some of the conformations in region B, however, do Conformations”). To extend the set of crystal structures used have one, viz., between 03’ and 0 2 . Therefore, the difference by French and Dowd for comparison, a database search was in the position of the global minimum in Figures 6 and 8 appears perf~rmed.~The conformations found in crystal structures to be caused by intramolecular hydrogen bonds. which contain p( 1-4) or a(1-4) linked ddohexopyranoses (the reducing saccharide gluco-like) are marked in the maps of The difference between the shape of the minima is probably cellobiose (Figure 5) and maltose (Figure 6), respectively. In not only caused by intramolecular hydrogen bonds, but by other the map of sucrose (Figure 7), the conformations of crystal differences between the force fields as well. Glennon et al.,25 structures that contain a sucrose residue are marked. Next to who published a new AMBER based force field for monosacthat, low-energy regions in aqueous solution, determined with charides and (1-4) linked polysaccharides, made a conforma-

J. Phys. Chem., Vol. 99, No. 36, 1995 13431

Extended Atom Force Field for (0ligo)saccharides

180

60

120

40 20

60

*

*

0

(O)

0

("-20 -40

-60

-60

-120

-80

-180 -180-120

- 100 -60

60

0

4

120 180

-40-20

0

20 40 60 80 100

4

(O)

(O)

Figure 5. Overall (left) and detailed (right) conformational energy maps of cellobiose as a function of the glycosidic dihedrals $ and (for the definition see Figure 3) calculated with CHEAT95 Isoenergy contours are 1 kcaUmol apart. The marks correspond to the conformations found in crystal structures from the CSD5 with the refcodes accelll0, acello, acglpr, achitml0, aclact, bchittl0, beljad, berkisO1, bikwohl0, blacto, cellob02, civfeslo, cofmepl0, dicmeh, digoxnl0, dihtuj, duvgikl0, fensum, fusxia, gitxinl0, kamhov, lactccl0, lactosl0, mcelob, and olgose. The letters

correspond to minima in aqueous solution, found by optical rotation.35

180 120 60

*

(O)

0

-60 -120 -180 -180 -120 -60

0

60

120 180

4 (")

-90

-60

-30

4

0

10

("1

Figure 6. Conformational energy maps of a-maltose as a function of the glycosidic dihedrals 4 and li, (for the definition see Figure 4) calculated with CHEAT95 Isoenergy contours are 1 kcaYmol apart. The marks correspond to the conformations found in crystal structures from the CSDS with the refcodes amicetl0, bihjeh, chxamh03, cytsac, dudxop, foxsug20, hahxuj, ipmalt, koyzaz, maltosl 1, maltot, mmalts, phmalt, turansOl, vecvoo, and zzztuc0l. The letters correspond to minima in aqueous solution, found by optical rotation studies.35

tional energy map of maltose in uacuo too and calculated a map more similar to ours (with two minima, one around 4, q = -40', -40' and the other around 20', 20'). Brady and Schmidt37 performed MD calculations of explicitly hydrated maltose with a C H M m based all-atom force field for carbohydrates and found two minima, at -62', -11' and at 30', 20'. However, in a second simulation with lower charges on the hydroxyl oxygen atoms, they found that the region between these minima was sampled frequently too, indicating one large low-energy region. Our map of sucrose is rather different from the one calculated by French and D o ~ d The . ~ glycosidic ~ dihedrals found in the crystal structures are much closer to the calculated minimum than in their map, which is shown in Figure 9. The structures of nystose (4, = +101', -19') and raffinose (4, W = 82", 12"), which are about 5 and 8'kcaYmol above their minimum, respectively, are close to our minimum (within 3 kcdmol). The conformation of crystalline 1,4',6'-trichloro- 1,4',6'-trideoxygalactosucrose (TGS, 4, = +91°, -162') differs relatively much from the others and is found about 5 kcaYmol above our minimum. This may be explained by the presence of an

intramolecular hydrogen bond between the two monomeric units in the crystal structure which stabilizes the conformation. NMR experiments of TGS in D20 have shown that the molecule displays a variety of conformations and that ly has a much greater freedom of rotation than 4;38both maps agree with this. Recently, a number of papers about sucrose was published to which our results can be compared. Again, NMR experiments indicate conformational flexibility of the glycosidic linkage in aqueous s o l ~ t i o n . ~ HervC . ~ ~ , ~du~ Penhoat et aL3 combined an NMR study with molecular modeling, using the PFOS program. The sugar residues remained rigid (in conformations minimized by MMP2(85)), and the energy was the sum of terms from van der Waals interactions, torsional rotations, and exoanomeric effects, both with and without hydrogenbonding interactions. They calculated 8 maps with different combinations of the conformations of the hydroxymethyl group (the 05-C5-C6-06 and the 05'-C5'-C6-06' dihedral either gg or gt and the 05'-C2'-C1'-01' dihedral either gg or tg). In this way, they calculated several minima for sucrose. None of these minima predicted NMR coupling constants and nuclear Overhauser enhancements (NOES) comparable to the

Kouwijzer and Grootenhuis

13432 J. Phys. Chem., Vol. 99, No. 36, 1995 180

1

I

I

I

I

I

120

60 180 120

II, ( O )

I 1

60

-

0

-

-60

-

-120

-

-60

- 120

-180 1 I -180 -120 -60

- 180

1

0

60

120 180

4 ("1

10 40 70 100 130 160

d

(O)

Figure 7. Conformational energy maps of sucrose as a function of the glycosidic dihedrals 4 and q (for the definition see Figure 2) calculated with CHEAT95. Isoenergy contours are 1 kcal/mol apart. The marks correspond to the conformations found in crystal structures from the CSD5 with the refcodes celgij, dinyool0, hahxuj, kanjoy, kestos, kscosf, meleztol, melezt02, pekhes, plantel0, rafino01, stachy01, sucros04, and zzzsti01. The letters correspond to minima in aqueous solution, found by optical rotation studies.36

180

1

I

0

120 60

II, ("1 0 -60 -120

-180 -180-120

-60

0

60

120 180

4 ("1 Figure 8. Conformational energy map of maltose as a function of the glycosidic dihedrals 4 and (for the definition see Figure 4) as calculated with CHARMm. Isoenergy contours are 1 kcal/mol apart. For the marks see the caption of Figure 6 . experimentally determined data; only an average over all the maps corresponded reasonably well with the experimental results. Four of the minima they calculated agree reasonably with the maps calculated with both CHEAT95 and ~ ~ 3 ( 9 2 ) , but the fifth minimum fits the CHEAT95 map better than the ~ ~ 3 ( 9 map. 2) An explanation for the poor performance of ~ ~ 3 ( 9 on 2) sucrose could initially not be given by French and Dowd. Recently, van Alsenoy et aL4' investigated possible causes as an overlapping anomeric effect, intramolecular forces (the orientations of the primary and secondary hydroxyl groups), and intermolecular forces. This was done by conformational analysis using ab initio quantum mechanics for a tetrahydro-

10 40

70 100 130 160

4 (O) Figure 9. Conformational energy map of sucrose as a function of the glycosidic dihedrals @ and (for the definition see Figure 2) as calculated with ~ ~ 3 ( 9 by 2 )French and Dowd.lo Isoenergy contours are 1 kcal/mol apart. For the marks see the caption of Figure 7.

pyran-tetrahydrofuran analogue of sucrose, ~ ~ 3 ( 9calcula2) tions of a miniature model of crystalline raffinose pentahydrate, and additional calculations with ~ ~ 3 ( 9 2 One ) . of the results was that about 6 of the 8 kcaVmol energy difference between

J. Phys. Chem., Vol. 99, No. 36, 1995 13433

Extended Atom Force Field for (0ligo)saccharides

TABLE 6: Minimum Energy Conformations of the Glycosidic Dihedrals $fg (OS’-Cl”-Ol”-C3’) and (Cl”-Ol”-C3’-C2’) in Fuca(1-3)GlcNAc Found by Different Authors (See Text for Methods Used) compound

ref

Gal/3( 1-4)[Fuca( 1-3)]GlcNAc Galp( 1-4)[Fuca( 1--3)]GlcNAc/3-OMe

44

GalNAcP( 1-4)[Fuca( 1-3)]GlcNAc/3-OMe NeuNAca(2-3)Gal/?( 1-4)[Fuca( 1-3)]GlcNAc R-GlcNAcP( 1-4)[Fuca( 1-3)IGlcNAcP-R’ (see text)

15 45 42

Fuca( 1-3)lGlcNAc

46

GlcNAcP( 1-4)[Fuca( 1--3)]GlcNAc~

present work

43

the crystalline raffinose conformation and the minimum energy conformation in the sucrose map, originated from torsion angles involving the anomeric carbon atom on the furanose ring (C2’). They concluded that the overlapping anomeric effect was the main cause for the relatively high energy differences between conformations found in crystal structures and the minimum energy conformation of sucrose calculated with molecular mechanics. Thus, they suggest that if improved parameters, based on molecules with an overlapping anomeric sequence, cannot overcome the discrepancies, the potential energy function might have to be expanded with terms for strings of atoms longer than the four than define a torsion angle. Since the conformations found in crystal structures are found in low-energy regions in the CHEAT95 calculations of sucrose, the overlapping anomeric sequence is apparently not a problem in CHARMm (which provides the parameters for the anomeric effect in disaccharides; see the subsection “Anomeric, HasselOttar, and Gauche Effects”). The main difference between the potential energy maps of sucrose by CHEAT95 (Figure 7) and by ~ ~ 3 ( 9 (Figure 2) 9) is the minimum energy conformation about the torsion angle $J. This is one of the torsions that cause the high relative energy for the raffinose conformation of sucrose in ~ ~ 3 ( 9 2 )Therefore, . the torsional contributions to the dihedral $J in ~ ~ 3 ( 9 and 2 ) CHARMm were compared by the calculation of the total torsional energy around the 0 1-C2’ bond (the central bond in $J), shown in Figure 10 (no contributions from electrostatic or van der Waals interactions are taken into account). The torsional energies for the dihedral Q, = OE CT - OE - CT are

MM3: , E ( q )= 0.625(1 CHARMm:

+

COS q) -

+

1.5[1 - COS(^^)] 0.425[1 cos(3q)I

+

E ( q ) = 1.30(1 - cos q)

and for the dihedral q = CT - CT - OE - CT are

MM3: E ( 9 ) = 0.225(1

+

COS

q)

+ 0.025[1 - COS(^^)] + 0.3785[1 icos(3q)]

CHARMm:

E ( q ) = 0.82(1

+ cos q) i- 0.25[1 + cos(3q)I

The most important difference between the energy curves from the two force fields is the barrier between the two gauche conformations. In ~ ~ 3 ( 9 the 2 ) barrier for $J is 5.2 kcaVmol (the largest part comes from the second term of the OE-CTOE-CT dihedral), whereas in CHARMm it is only 0.3 kcaV mol. The value of $J in the raffinose conformation is 12”, so

-2

0

vfg

d

?I)

-86.8 -45 -83 -56 (10) -77 (17) -90 - 140 -70 - 145 - 100 -45 - 140 -65 -95

-98.1 -96 -97 -84 (9) -94 (10) -100 - 135 -95 - 145 -160 -80 - 140 90 - 165

120

*

240 (e)

360

Figure 10. Contribution of torsional terms to the energy in sucrose (see Figure 7) according to ~ ~ 3 ( 9 and 2 ) CHEAT95 (CHARMm) as a function of 11, (C 1-0 1-C2’-05’).

Figure 11.

Low energy conformation from CHEAT95 of GlcNAcP( 1-4)[Fuca( 1+3)]GlcNAcP, with the glycosidic dihedrals Wgg is Cl-Ol-C4’-C3’, @fg is of interest: @ is 05-Cl-Ol-C4’, 05”-Cl”-O”l’”-C3’, and ll,fgis C 1”-01”-C3’-C2’.

the nature of this barrier is of great importance when comparing the crystal structure conformation of raffinose with the potential energy map of sucrose. Therefore, maybe not the overlapping anomeric effect is the main problem in ~ ~ 3 ( 9 calculations 2) of sucrose but only the (too) high barrier in the OE-CT-OECT torsion angle term. Application 2: Conformational Analysis of GlcNAcP(1-4)[Fuca(1--3)]GlcNAc~. The last application is the conformational analysis of the glycosidic link Fuc-GlcNAc in the trisaccharide GlcNAcP ( 144)[Fuca( 1-3)]GlcNAcP (see Figure ll), which is a part of a glycopeptide from bromelain.42 It is a structural variant of the Lewis X (or Lex) structure Galp( 1-4)[Fuca( 1-3)]GlcNAcP, which is present in a variety of human ti~sues.4~ The Fuc-GlcNAc link is of particular interest because of contradictory results published on this type of compound: some studies predict a rather rigid conformation with essentially only one minimum, and others predict two distinct minima. Wormald et a1.44 found a single conformation of the FucGlcNAc link in the Le“ structure, both by NMR and by semiempirical quantum mechanical calculations. Miller et al.43 investigated a methylated LeX structure by NMR and MD in vacuo and found essentially also one minimum (actually, the MD predicted two minima close together). Bergwerff et aLi5 performed MD simulations with CHEAT93 on a methylated GalNAc variant of the Le“ structure and also found one

Kouwijzer and Grootenhuis

13434 J. Phys. Chem., Vol. 99, No. 36, 1995 180 120 60 (O)

0

-60 -120 -180 -180-120

-60

0

60

4gg

-180 -180 -120 -60

120 180

0

60

120 180

4%- ("1

(O)

180

120

60

-I

LJB%-

-180 -180

-120

-60

4gg

0

(O)

60

-180

-120

-60 4fg

0

(O)

Figure 12. Conformational energy maps of GlcNAcp(1--4)GlcNAcP (left)and Fuca(1--3)GlcNAc/3 (right), as a function of the glycosidic dihedrals (grid size upper two 30°, lower two 20"). Isoenergy contours are 1 kcal/mol apart. On the right, the marks correspond to the conformations found by (a) Wormald et al.,44 (b) Miller et al.,43(c) Bergwerff et al.,I5 (d) Rutherford et al.,45(e) Imberty et a1.,46 and (0 Lommerse et aL4*

minimum, in agreement with Rutherford et who performed restrained simulated annealing and restrained MD in uacuo, combined with NMR, on a sialyl-le' structure (NeuNAca(2-3)Lex, where NeuNAc is D-neuraminic acid). In contrast with this, Lommerse et a1.4* included water molecules explicitly in MD simulations of the complete glycopeptide from bromelain (consisting of a hexasaccharide and a tetrapeptide: Mana(146)[XylP( 1-2)]Ma$( 1-4) GlcNAc@(1-4)[Fuca(l-3)]GlcNAc/3(1-N) Asn-Glu-Ser-Ser, where Man is D-mannopyranose, Xyl is D-XylOSe, Asn is L-asparagine, Glu is L-glutamic acid, and Ser is L-serine). They found two distinct minima for the glycosidic link Fuc-GlcNAc, which is in agreement with their NMR results. Finally, Imberty et al.46 predicted three minima for the Fuca( 1-3)GlcNAc disaccharide by MM calculations with the PFOS program; energies included interresidue van der Waals interactions, torsional rotations, and exoanomeric effects. No specific hydrogen-bonding interactions

were included. All the minima listed by these authors are summarized in Table 6. In summary, essentially one minimum is found with force field calculations in vacuo (and NMR); two distinct minima are found with NMR and MD in water; and PFOS predicts three minima. The energy as a function of the four glycosidic dihedrals was calculated with CHEAT95, and four minima were found for the Fuca( 1-3)GlcNAc link, as can be seen in Table 6 and Figure 12. The discrepancy concerning the rigidity of the Fuca( 1-3)GlcNAc link prompted us to look more closely at the studies that predict a single conformation. The preferred conformation of Wormald et aLU has a hydrogen bond between 02"of fucose and the carbonyl group of GlcNAc, which is not possible with CHEAT (Figure 12 shows that here the carbonyl group is even turned away from the fucose residue). From stereo pictures given by Miller et al.43and Rutherford et al.,45it is easily seen

Extended Atom Force Field for (0ligo)saccharides

J. Phys. Chem., Vol. 99, No. 36, 1995 13435

TABLE 7: Atom Types

HA

aliphatic carbon (tetrahedral) aliphatic or aromatic hydrogen

CT HA

OE OTlE

HA 0.05

ether oxygedacetal oxygen extended hydroxyl atom

that the same hydrogen bond can be formed in their low-energy conformation too, although the authors do not mention anything about hydrogen bonds. A second point of interest is the remark made by Miller et al.:43the methodology employed in their study identifies a single structure whose calculated NOE intensities are in agreement with the experimental NOE intensities, which does not allow for the presence of multiple solution conformations. Because of the fact that the time-averaged NMR values can result either from a single rigid conformation or a conformational equilibrium, Rutherford et al.45 used rotating-frame Overhauser enhancement (ROE) derived distance constraints in computer simulations and calculated from these simulations long-range interglycosidic spin-coupling constants and ROE data. They found close agreement between the back-calculated and experimental NMR data, but a discrepancy for the distance between HS’ of fucose and H3‘ of GlcNAc remained. Altogether, these results do not provide convincing evidence that the Fuca( 1--3)GlcNAc link is as rigid as these authors claim. Therefore, we consider the agreement on flexibility of this link between the CHEAT95 results and the results of MD in water together with NMR data42 as a confirmation of the quality of CHEAT95. Conclusions The CHEAT95 force field is an improvement over the 1993 version: the agreement with the training set has been enhanced, and the force field is now a true superset of CHARMm.

1

0.05\

HA OiO07 CT-OTlE -0.075 0.046

OTlE -0.075\

cT-~ E

HA

0.076-0.3

CT

\TN0.031

HA/o.025 j . l g A O ~ lE 0.071 \CT-CT -0.075 OTIE/0.025 0.025\

OTIE I HA 0.071

-0.070

0.071 -0.075

Figure 13. Atomic partial charges for cyclohexanol and the aldohexopyranoses (Eel = &iqiqj/4neoc,rij). Conformation defining effects as the anomeric and HasselOttar effect are present in the force field, albeit that for the hydroxymethyl group in the gluco-like aldohexopyranoses the gt conformation is favored too much over the gg. Nevertheless, the agreement with the situation in a polar environment is better than with other force fields in vacuo because of the absence of unrealistic intramolecular hydrogen bonds. CHEAT95 has been optimized for aldopyranoses in particular and not for furanoses or acyclic polyols. It is clear that CHEAT95 should not be used for simulating protein-carbohydrate interactions since in these interactions the hydrogen-bonding capacity of the carbohydrate molecule is essential. For further improvement of the force field, especially on the intraresidue interactions, more quantitative experimental data about the conformational behavior of smaller carbohydrates in water is needed. The CHEAT95 force field has been applied to the conformational analysis of cellobiose, maltose, and sucrose. This resulted in potential energy maps as a function of the glycosidic

TABLE 8: CHEAT95 Force Field Parameters Used for the Carbohydrates Described in This Article, Partially Directly from CHARMm Bond Lengths ( E b = Ckb(r - ro)2)

kh (kcaV(mol.A*)) CT-CT CT-OE

268.0 390.0

rn

(A)

kh

1.529 1.407

CT-OTlE HA-CT

Bond Angles (Eo = D e ( B CT-CT-CT CT-CT-OE CT-CT-OTlE HA-CT-CT HA-CT-HA

ks (kcal/(moliad2)) 58.35 57.0 45.0 37.5 33.0

112.70 110.00 110.50 110.70 107.80

rn

375.0 340.0

1.405 1.090

eo

ke (kcal/(moliad2)) 55.5 55.0

HA-CT-OE HA-CT-OT1E OE-CT-OE OE-CT-OTl E CT-OE-CT

109.47 108.00 109.00 107.80 112.40

60.0 48.0 58.0

- kd cos(nq5 - 6), Where n = 1 , 2 , 3 , 4 , 6 )

kb (kcaUmol)

n

6 (den)

0.70

1 3 1 1 1 3 1 1 3 1 3 3

0 0 0 0 0 0 180 0 0 180 0 0

CT-CT-CT-CT CT-CT-CT-CT CT-CT-CT-OE CT-CT-CT-OT1E CT-CT-OE-CT CT-CT-OE-CT OE-CT-OE-CT OT1 E-CT- CT- OT1 E OT1 E-CT- CT- OT1 E OT1 E-CT-OE-CT X-CT-CT-X X-CT-OE-X

(A)

- Bo)2)

eo(deg)

Dihedral Angles (Em= Elkdl

(kcaU(mol.AZ))

0.13 0.55 0.20 0.82 0.25 1.30 0.25

0.50 0.50 0.15 0.27

van der Waals Parameters (EvdW = &,[(AJrLj”) - (B,j/r6)1with Aij = [Emin(i) Emin(j)]”* [Rmin(i) + Rmin(j)]” and Bij = 2[Emln(i) Emln(j)]I/2 [Rmin(i) Rmi,(j)I6; For Atoms with No Explicit 1-4 Parameters Emin(l-4) = 0.5Emin and Rmm(1-4) = Rmin)

+

E,,,,, (kcaVmol)

CT HA

OE OTlE

-0.0903 -0.0420 -0.1591 -0.0 100

Rmin

(A)

1.800 1.330 1.600 2.100

Emin( 1-4) (kcaUmol) -0.10 -0.20 -0.20

Rmin(1-4) 1.75 1.36 1S O

(A)

13436 J. Phys. Chem., Vol. 99, No. 36, 1995 torsion angles that agree with conformational preferences indicated by NMR and optical rotation studies. Since conformations found in crystal structures occupied positions mostly within 4 kcaVmo1 from the global minimum, we feel that X-ray structures of disaccharides approximate the conformation observed in solution closely. For sucrose, we note that all relevant crystal structures are within 5 kcaVmol within the global minimum, correspondence that has not been achieved yet with other force fields (including exhaustive ~ ~ 3 ( 9 calculation^^^). 2) The glycosidic link between fucose and glucose in the trisaccharide GlcNAcP( l+4)[Fuca( 1+3)]GlcNAc/3 was also studied with CHEAT95. The results are in better agreement with NMR data and MD calculations in which water molecules were explicitly included, than with other in uacuo calculations. The extended hydroxyl atom type enables a much faster sampling of the conformational space of carbohydrates in aqueous solution than calculations which include explicit water molecules. Compared to calculations in uucuo with traditional force fields, the calculations with CHEAT95 are somewhat faster too, because of the decreased number of atoms and rotatable hydroxyl groups. Next to that, the calculations are more relevant to the state of the molecule in an aqueous environment than to the isolated molecule. Although CHEAT95 is certainly not “a panacea for the inclusion of solvent effects”,47 it seems a pragmatic and computationally efficient approach toward the conformational analysis of oligosaccharides. Acknowledgment. We would like to thank Dr. C. A. G. Haasnoot, Prof. Dr. J. Kroon, Dr. L. M. J. Kroon-Batenburg, and Dr. B. P. van Eijck for the helpful discussions and their valuable suggestions, and Dr. A. D. French for kindly providing us the data for Figure 9. Appendix In this Appendix the atom types (Table 7), the force field parameters (Table 8), and the atomic partial charges for cyclohexanol and P-D-glucose (Figure 13) are given. Since CHEAT95 is a superset of CHARMm, all parameters that do not include an OTlE atom type are taken directly from CHARMm version 22 (the Quanta 4.0 PARM.PRM file). The energy is calculated according to the potential energy function of this force field. References and Notes (1) Bush, C. A. Curr. Opin. Struct. Biol. 1992, 2, 655-660. (2) Poppe, L.; van Halbeek, H. Nut. Srruct. B i d . 1994, 1, 215-216. (3) Hervt du Penhoat, C.; Imberty, A,; Roques, N.; Michon, V.; Mentech, J.; Descotes, G.; Ptrez, S. J. Am. Chem. SOC. 1991, 113, 37203727. (4) Adams, B.; Lemer, L. J . Am. Chem. SOC.1992, 114,4827-4829. (5) Allen, F. H.; Davies, J. E.; Galloy, J. J.; Johnson, 0.;Kennard, 0.; Macrae, C. F.; Mitchell, E. M.; Mitchell, G. F.; Smith, J. M.; Watson, D. G. J . Chem. In$ Comput. Sci. 1991, 31, 187-204. ( 6 ) Kouwijzer, M. L. C. E. Unpublished results. (7) Kroon-Batenburg, L. M. J.; Kroon, J. Biopolymers 1990,29, 12431248. (8) Spek, A. L. Acta Crystallogr. 1990, A46, C34.

Kouwijzer and Grootenhuis (9) Nishida, Y.; Hori, H.; Ohrui, H.: Meguro, H. J. Carbohydr. Chem. 1988, 7, 239-250. (10) Brown, G. M.; Levy, H. A. Acta Crystallogr. 1973, B29, 790797. (11) Grootenhuis, P. D. J.; Haasnoot, C. A. G. Mol. Simul. 1993, 10, 75-95. (12) Brooks, B. R.; Bruccoleri, R. E.; Olafson, B. D.; States, D. J.; Swaminathan, S . ; Karplus, M. J. Comput. Chem. 1983, 4, 187-217. (13) Angyal, S. J. Angew. Chem. 1969, 81, 172-182. (14) Kouwijzer, M. L. C. E.: Schrijvers, R.; Grootenhuis, P. D. J. To be published. (15) Bergwerff, A. A.; van Kuik, J. A,; Schiphorst, W. E. C. M.: Koeleman, C. A. M.; van der Eijnden, D. H.; Kamerling, J. P.; Vliegenthart, J. F. G. FEBS Lett. 1993, 334, 133-138. (16) CHARMm, version 22.0, shipped with QuantdCHARMm version 4.0; Molecular Simulations Inc.: Waltham, MA, 1994. (17) Momany, F. A,; Rone, R. J. Comput. Chem. 1992, 13, 888-900. (18) Angyal, S . J. Aust. J . Chem. 1968, 21, 2737-2746. (19) Bock, K.; Duus, J. 0. J . Carbohydr. Chem. 1994, 4, 513-543. (20) Schreurs, A. M. M. program CONTOUR. Crystal and Structural Chemistry; Bijvoet Center for Biomolecular Research, Utrecht University: Utrecht The Netherlands. (21) Jeffrey, G. A. Acta Crystallogr. 1990, 846, 89-103. (22) Hassel, 0.;Ottar, B. Acta Chem. Scand. 1947, 1, 929-942. (23) Wolfe, S. Acc. Chem. Res. 1972, 5, 102-111. (24) Marchessault, R. H.; Perez, S . Biopolymers 1979, 18, 2369-2374. (25) Glennon, T. M.; Zheng, Y.-J.; Le Grand, S. M.; Schutzberg, B. A.: Mertz, K. M., Jr. J . Comput. Chem. 1994, 15, 1019-1040. (26) van Eijck, B. P.; Hooft, R. W. W.; Kroon, J. J . Phys. Chem. 1993, 97, 12093-12099. (27) TvaroSka, I.; Ko29r, T. Theor. Chim. Acta 1986, 70, 99-1 14. (28) Cramer, C. J.; Truhlar, D. G. J. Am. Chem. SOC.1993,115,57455753. (29) Hori, H.; Nishida, Y.; Ohrui, H.; Meguro, H. J. Carbohydr. Chem. 1990, 9, 601-618. (30) French, A. D.; Dowd, M. K. J . Mol. Struct. 1993, 286, 183-201. (31) Allinger, N. L.; Yuh, Y. H.; Lii, J.-H. J. Am. Chem. Soc. 1989, 111 855 1-8566. (32) Allinger, N. L.: Rahman, M.; Lii, J.-H. J . Am. Chem. Soc. 1990, 112, 8293-8307. (33) Chu, S. S . C.; Jeffrey, G. A. Acta Crystallogr. 1968, 824, 830838. (34) Takusagawa, F.; Jacobson, R. A. Acta Crystallogr. 1978,B34,213218. (35) Stevens, E. S.; Sathyanarayana, B. K. J . Am. Chem. SOC. 1989, 111, 4149-4154. (36) Stevens, E. S.; Duda, C. A. J . Am. Chem. SOC.1991, 113, 86228627. (37) Brady, J. W.; Schmidt, R. K. J. Phys. Chem. 1993, 97, 958-966. (38) Meyer, C.; PCrez. S . ; Hervt du Penhoat, C.; Michon, V. J . Am. Chem. SOC. 1993, 115, 10300-10310. (39) Poppe, L.; van Halbeek, H. J . Am. Chem. SOC. 1992, 114, 10921094. (40) Duker, J. M.; Serianni, A. S. Carbohydr. Res. 1993, 249, 281303. (41) van Alsenoy, C.; French, A. D.; Cao, M.; Newton, S . Q.; Schafer, L. J . Am. Chem. SOC. 1994, 116, 9590-9595. (42) Lommerse, J. P. M.; Kroon-Batenburg, L. M. J.: Kroon, J.; Kamerling, J. P.; Vliegenthart, J. F. G. Accepted for publication in J . Biomol. NMR . (43) Miller, K. E.; Mukhopadhyay, C.; Cagas, P.; Bush, C. A. Biochemistry 1992, 31, 6703-6709. (44) Wormald, M. R.; Edge, C. J.; Dwek, R. A. Biochem. Biophys. Res. Commun. 1991, 180, 1214-1221. (45) Rutherford, T. J.; Spackman, D. G.; Simpson, P. J.; Homans, S . W. Glycobiology 1994, 4, 59-68. (46) Imberty, A.; Delage, M.-M.; Bourne, Y.; Cambillau, C.; PCrez, S . Glycoconjugate J. 1991, 8, 456-483. (47) van Halbeek, H. Curr. Opin. Struct. Biol. 1994, 4. 697-709.

JP95 1365M