J. Phys. Chem. 1994,98, 6241-6243
6241
A Simple and Effective Procedure for Conformational Search of Macromolecules: Application to Met- and Leu-Enkephalin Hagai Meirovitch’ and Eva Meirovitch Supercomputer Computations Research Institute, Florida State University, Tallahassee, Florida 32306-4052
Andre G. Michel Institut de Recherches Serrier, 1 1 , Rue des Moulineaux, F9215 Suresnes, France
Maximiliano Vdsquez Protein Design Labs, 2375 Garcia Avenue, Mountain View, California 94043 Received: March 21, 1994’
A simple and efficient method for searching the conformational space of macromolecules is presented. With this method an initial set of relatively low-energy structures is generated, and their energies are further minimized with a procedure that enables escaping from local energy minima. Illustrative calculations are described for Met- and Leu-enkephalin.
Theoretical studies of the low-energy (and free energy) structures of biological macromolecules are essential for understanding their biological activity.’ However, an exhaustive conformational search is unfeasible even for a small peptide; hence, the currently available methods are stochastic in nature and in practice cannot guarantee the identification of all the low-energy conformations of interest, including the ground state.2-24 It is, therefore, important to develop a variety of independent, hence complementary, techniques that are also efficient and simple; such methods are particularly useful in the context of rational drug design. We have devised such a method as an ingredient of a new thermodynamic approach for analyzing nuclear magnetic resonance (NMR) data obtained from solutions offlexible peptides. Initially, a set of low-energy structures is generated using the techniquedescribed in this report, and its members are then used as “seeds” for microstates spanned by Monte Carlo simulations. The free energy (hence the relative populations) of the microstates is calculated from the Monte Carlo samples using the local states method.25-2* Finally, the experimental nuclear Overhauser enhancement parametersare reproduced theoretically as weighted averages over the various microstates. An application to Leuenkephalin, a pentapeptide with the sequence H-Tyr-Gly-GlyPhe-Leu-OH, for which experimental NMR data are available,*32 will be reported elsewhere. The conformationalsearch method described here is based on a procedure called SADA (systematic alteration of dihedral angles), proposed by Meirovitch and S~heraga,~3 that was used originallyfor minimizing theenergy of a 58-residue protein, bovine pancreatic trypsin inhibitor. With the usual local minimizers, the molecule becomes trapped in the local minimum closest to the starting conformation. The present method enables one to relieve the molecule from such a minimum in a systematic way, and to find lower energy structures. We chose to use ECEPP,’C36 which is a relatively simple force field based on rigid geometry (i.e., constant bond angles and lengths), with conformationsthus defined solely by the backbone and side chain dihedral angles. The calculationsare carried out without explicit considerationof solvent effects. Even though our main interest lies in Leu-
* To whom correspondence should be addressed.
Abstract published in Advance ACS Abstracts, June 1, 1994.
0022-3654/94/2098-6241 S04.5OlO
enkephalin, we have also applied the method, as a test, to the extensively studied molecule Met-enkephalin (H-Tyr-Gly-GlyPhe-Met-OH).2.3,8-23The method will be described as applied to this molecule, which can be described by 24 variables (15 backbone dihedral angles 4, J/ and w and the 9 side chain dihedral angles x). Our method consists of two stages. In stage I, ninit optimized structures are obtained from ninit randomly selected conformations, by pursuing steps 1-4 as described below. (1) An initial conformation is defined by assigning to each variable a random value within the range [-180°,1800] (for o angles, restricted ranges of [ 180’ - 20°, 1 80° + 20°] are used). (2) The energy is minimized with Powell’s conjugate gradient method3’ using numerical derivativesof the energy; this minimizer is implemented in the package PEPSEA described in ref 21. (3) SADA is applied; Le., eachvariable is changed sequentially, d within the corone at a time, in increments of n ~ degrees responding range, and the energy is calculated. If a lower energy is obtained for an altered dihedral angle, the new values of the energy and the angle replace the current ones; otherwise, the current values are retained. One carries out nititerations of this procedure; Le., each angle is visited nit times. (4) The best structure obtained in (3) is energy minimized again and stored. In this way nidtenergy optimizedconformations are obtained in stage I. Since SADA treats the angles independently, only a limited part of phase space is searched, and therefore if ninitis relatively small, the change of finding the global energy minimum structure is slim. In order to increase this chance, correlationsbetween the angles are taken into account in stage 11. Here, a systematic refinement procedure is applied only to the conformations of lowest energy generated in stage I. (A typical number is 10.) One would expect that certain parts of the latter are already optimized energetically, and can therefore serve as templates around which the other parts will be optimized as well. Stage I1 consists of the following steps. ( 5 ) Side Chain Optimization. A trial conformation for refinement is defined by assigning random values to the nine side chain angles x (see step l ) , while the backbone angles are preserved. This conformationis optimized by applying steps 2-4 (where, however, all the angles can vary). The random selection
-
0 1994 American Chemical Societv
Letters
6242 The Journal of Physical Chemistry, Vol. 98, No. 25, 1994
of side chains is carried out h i d e times, and the conformations of low energy are retained, to be further treated in the next step. (6) Backbone Optimization. A trial conformation is defined by assigning random values to 41and $1 of the first residue, while the other 22 dihedral angles are preserved; this conformation is energy minimized by steps 2-4, and the process is repeated for nz random selectios of 41and $1. The best structure obtained becomes a template for the next step, in which 42 and $2 are changed randomly n2 times, etc. A systematic refinement should take into account correlations of any length scale along the backbone. Therefore, we performed similar calculations for groups of four (rather than two) backbone angles (41,$i, &+I, $i+l, i = 1, 4), n4 times for each group. The results show that, for the pentapeptide investigated in this study, increasing the group size further is unnecessary. We have first sought to examine the efficiency of steps 2 4 (which include SADA) vs that of energy minimization alone (Le., only step 2). Each of the two procedures were applied to ninit = 3000 randomly chosen conformations. Using steps 2-4, we obtained 7, 17, and 43 structures with energy within the ranges [-10,-111, [-9,-lo], and [-8,-91 kcal/mol, respectively,whereas energy minimization alone generated only a single conformation within the range [-8,-91 kcal/mol. Thus, steps 2 4 makea more efficient procedure even though they are four times slower than step 2. However, this test is incomplete since computer time can be dramatically decreased (1-2 orders of magnitude) by calculating the energy gradients analytically, and employing a more efficient energy computation (which takes advantage of the fact that the angles are changed one at a time); optimizing the values of the parameters ngrid and nil should also save considerable computer time (see a later discussion). Notice that the lowest energy (of -12.9 kcal/mol) found in other studies2JlJ3J6has not been reached with steps 2-4. The six conformations of lowest energy obtained in stage I were subjected to the stage I1 procedures. Thus, the energies were reduced, from -10.017, -8.97, -10.185, -10.906, -10.935, and -10.725 to -10.018, -10.018, -10.192, -10.907, -11.697, and -10.994 (in kcal/mol), respectively, by applying a side chain optimization (step 5, stage 11) based on h i d e = 50. However, further backbone optimization (step 6) based on changing four angles at a time (we have skipped here the two-angleoptimization) did not improve the energiesof the first four structures, but reduced the energies of the last two to -12.906 (n4 = 50) and -1 1.405 (n4 = 5). The last structure was subjected again to a backbone optimization by changing groups of two angles (n2 = 40) leading to -12.906 kcal/mol. Thus, the energy considered as the global minimum has been reached twice, and the correspondingstructure (see Table 1) is identical to that obtained before.2Jl.I3J6 We have also located several of the low-energy conformations identified previously;16J0however, structures with energies lower than -12.9 kcal/mol were not found by a further application of the procedures of stage I1 to the best structure. We have not yet optimized the various parameters entering the calculations. Thus, the computations described above were performed using increments of ngrid = l o and nit = 3, for which an optimization of steps 2-4 requires on average 2.5 minutes of CPU time on an Alpha workstation, DEC 3000/400. As an illustration of the potential inherent in parameter optimization, we have applied steps 2 4 to &,,it = 500 random structures (step 1) using ngrid = 2' and nit = 2, to find a 2-fold reduction in CPU time with only a slight decrease in the effectiveness to generate low-energy conformations. Thus, in the range [-10,-111 we obtained about twice as many structures now than before (2 vs l.l,on average) whilein therange [-8,-101 halfasmany structures (5 vs 9.3). Also, identical results were obtained by including the o angles in the SADA search (step 3) or by excluding them, implying that it is unnecessary to vary these angles. (Notice that the a's still can change in the energy minimization applied in
TABLE 1: Dihedral Angles (in degrees) of the Four Optimized Structures of Met- and Leu-Enkephrrlia and Their Enerw (in kcalhol) Met Leu residue angle Met ( w = 180°) Leu ( w = 180') ~~
-86.2 156.1 -177.0 -172.6 +78.8 -166.0 -154.5 83.9 168.6 83.7 -73.9 -170.1 -137.0 19.3 -174.1 58.8 -85.4 -163.6 160.3 -179.7 52.8 175.3 -179.8 -58.6 -12.904
-86.4 153.7 180.0 -179.9 -111.3 145.3 -161.6 71.2 180.0 64.1 -93.5 180.0 -8 1.7 -29.2 180.0 179.8 80.0 -80.8 143.9 180.0 -65.1 -179.2 -179.3 -60.0 -10.714
-78.9 137.4 -168.8 -165.2 70.6 -160.4 -87.6 65.0 176.1 65.8 -92.7 177.2 -83.6 -33.6 -177.4 178.6 -103.0 -155.3 -76.2 -177.8 178.1 61.8 51.9 59.1 -10.095
-86.9 152.9 180.0 179.5 -109.7 144.9 -161.0 72.1 180.0 64.5 -92.2 180.0 -84.0 -25.0 180.0 72.7 -95.4 -78.6 130.6 180.0 179.6 65.5 52.8 59.6 -9.706
steps 2 and 4). This is important since it may also apply to bond angles in force fields based on flexiblegeometry. Computer time can be further reduced by taking into account symmetry. For example, x 4 of Met can be varied within the limited range of &60° around its current value, instead of the full range of 360O. However, the most significantreduction in computer time would be gained in ni,,jt could be decreased considerably,say from 3000 to -20. Indeed, it turns out that the optimization procedures of stage I1 are very effective even when applied to conformations with relatively high energy. To illustrate this, we selected six conformations with energies in the range [-3,487-8.0521 kcal/ mol obtained in stage I, and subjected them to a stage I1 refinement. On purpose we included in this group three conformations (with energiesof -8.05 2, -7.8 3 1, and -5.3 7 1 kcal/ mol which displayed some structural similarity to the global minimum conformation; indeed, two of these three (the latter ones) have led to the global minimum structure while the energy of the other four conformations was decreased to -10.018 and -10.347 kcal/mol. Theseresultsareencouragingalthough further work is required in order to optimize all the parameters, in particular n2 and n4. We have also applied the method to a model of Met-enkephalin in which the backbone dihedral angles o were fixed at 180°. As discussed above, a small value of ninit should suffice. However, in order to check the effectiveness of stage I alone (steps 2-4), we again used a relatively large number of starting conformations, ninil = 3200. From this group eight different conformations with energies in the range [-8.6,-10.7131 kcal/mol have been obtained. Side chain optimization (step 5 , stage 11) lowered the energy of three of them to -10.713 kcal/mol, whereas no further decrease in energy was achieved by additional backbone optimization (step 6,stageII). Thissuggeststhat-10.713 kcal/mol,alreadyobtained in stage I, might be the global minimum (see Table 1). It should be pointed out that this value is lower than the lowest energy of -10.3 kcal/mol found in ref 15. (However, the dihedral angles for the two structures are close.) In refs 12 and 14 lowest energy values of -1 1.9 and -1 2.1 kcal/mol, respectively, with structures very similar to ours, have been reported; however, they appear to have been obtained with a somewhat different set of energy parameters than those used by ECEPP/2. The method has been also applied to Leu-enkephalin,for which
-
Letters only few calculations have been performed previously, all with an old19 or set of ECEPP/2 parameters (Le., the improved parameters of ref 36 have not been used and thus, for Met-enkephalin, for example, the lowest energy found in ref 19 is -15.8 as compared to -12.9 kcal/mol obtained in ref 2 for approximatelythe same structure). Using the model that allows the backbone w angles to vary, we optimized in stage I mnit = 2 100random conformations, to obtain seven different low-energy structures within therange [-8.204,-9.8371 kcal/mol. Sidechain and backbone optimization lowered the energy of one of these structures to -10.095 kcal/mol, which is thus considered to be the global minimum (see Table 1 for the corresponding angles). A similar treatment for fixed o angles with mnit = 2400 led in stage1toaminimumof-9.33 kcal/mol, which was further reduced to-9.707 kcal/mol by applying sidechainoptimization. Backbone optimization did not decrease the energy further. In summary, the SADA-based method described in this Letter is simple, independent of other techniques, and most effective in locating low-energy structures. In its present form it is suited to treat linear molecules and to optimize side chain conformations of cyclic molecules and large proteins. As mentioned previously, the parameters of the method (ndtlnbd, nit, hi&,n2, and n,) should be further optimized and computer time also reduced considerably by using analytical derivatives and more efficient energy calculation; then the efficiency of the method will be compared with those of other techniques. Acknowledgment. We are grateful to Professor G. V. Nikiforovich for valuablediscussions. We acknowledge support from the Florida State University Supercomputer Computations Research Institute, which is partially funded by the U.S. Department of Energy under Contract DE-FC05-85ER250000. References and Notes ( 1) Scheraga, H. A. In Reviewsin Computational Chemistry, Lipkowitz. K. D., Boyd, D. B., US.; VCH Publishers: New Ywk, 1992; Vol. 3, pp 73-
142. (2) Li, Z.;Schcraga, H. A. Proc. Natl. AccadSci. U S A 1987,84,66116615. (3) Li, Z.; Scheraga, H. A. J. Mol. Strucr. (THEWHEM) 1988,179, 333-3 5 2. (4) Chang, G.; Guida, W. C.; Still, W. C. J . Am. Chem. Soc. 1989,111, 43794386. ( 5 ) Ferguson, D. M.; Raber, D. J. J . Am. Chem. Soc. 1989,111,43714378.
The Journal of Physical Chemistry, Vol. 98, No. 25, 1994 6243 (6) Saunders, M. J. Am. Chem.Soc. 1987,109,3150-3152. (7) Goodman, J. M.; Still, W. C. J. Compur. Chem. 1991, 12, 11101117. (8) Isogai, Y.;Ntmethy, G.; Scheraga, H. A. Proc. Natl. Accad Sci. U.S.A. 1977, 74,414-418. (9) Painc, G . H.; Scheraga, H. A. Biopolymers 1985, 24, 1391-1436. (10) V4squez. M.; Schcraga, H. A. Biopolymers 1985, 24, 1437-1447. (11) Ripoll, D. R.;Scherap, H. A. J. Protein Chem. 1989,8,263-287. (12) Okamoto, Y.;Kikuchi, T.; Kawai, H. Chem. k t r . 1992, 3, 12751278. (13) Nayeem, A.; Vila, J.; Scheraga, H. A. J . Compur. Chem. 1991,12, 594-605. (14) Hansmann, U. H. E.;Okamoto, Y . J. Comput. Chem. 1993, 14, 1333-1338. (15) Shaumann, T. S.;Braun, W.; Wiltrich, K. Biopolymers 1990, 29, 679694. (16) Von Freyberg, B.; Braun, W. J. Compur. Chem. 1991, 12, 10651076. (17) Dorofeyev, V. E.; Mazur, A. K. J. Biomol. Strucr. Dyn. 1993, 10, 143-167. (18) Abagyan, R. A.; Argos, P. J. Mol. Biol. 1992, 225, 519-532. (19) Moral-, L. B.; Garduflc-JuBrez, R.; Romero, D. J . Biomol. Strucr. Dyn. 1992, 9,951-957. (20) Vajda, S.; Delisi, C. Biopolymers 1990, 29, 1755-1772. (21) Michel, A. G.; Ameziane-Hassani, C.; Bredin, N. Can. J . Chem. 1992, 70, 506-603. (22) Maigret, B.; Premilat, S . Biochem. Biophys. Res. Commun. 1982, 104,971-976. (23) Premilat, S.;Maigret, B. J. Phys. Chem. 1980,84, 293-299. (24) Bruccoleri, R. E.; Karplus, M. Biopolymers 1987, 26, 137-168. (25) Meirovitch, H. Chem. Phys. Lett. 1977,21, 389-392. (26) Meirovitch, H.; Vbquez, M.; Scheraga, H. A. Biopolymers 1987, 26.651671. (27) Meirovitch, H.; Kitson, D. H.; Hagler, A. T. J. Am. Chem. Soc. 1992, 114, 53865399, (28) Meirovitch, H.; Koerber, S.C.; Rivier, J.; Hagler, A. T. Biopolymers,
in Dress. (29) Betines, J.; Nikiforovich, G. V.; Chipens, G. J. Mol. Strucr. (THEWHEM) 1986,137, 129-132. (30) Picone, D.; Dtrsi, A.; Motta, A.; Tancredi, T.; Temussi, A. Eur. J. Biochem. 1990, 192, 433439. (31) Picone, D.; Dtrsi, A,; Motta, A,; Tancredi, T.; Temussi, A. Terrahedron 1988,44,975-990. (32) Vesterman, B.; Saulitis, J.; &tines, J.; Liepins, E. J.; Nikiforovich, G. V. Biochim. Biophys. Acta 1989, 998, 204-209. (33) Meirovitch, H.; Scheraga, H. A. Macromolecules 1981, 14, 12501259. (34) Momany, F. A.; McGuire, R. F.; Burgess, A. W.;Scheraga, H. A. J. Phys. Chem. 1975, 79,2361-2381. (35) NCmethv. _ .G.:. Pottle. M. S.:Scheraaa. - . H. A. J . Phvs. Chem. 1983. 87,’1833-1887. (36) Sippl, M. J.; NCmethy, G.; Scheraga, H. A. J. Phys. Chem. 1984, 88,62316233. (37) Powell, M. J. D. Math. Program. 1977, 12, 241-254.