J. Phys. Chem. 1993,97, 261-210
267
Mean Field Theory as a Tool for Intramolecular Conformational Optimization. 3. Test on Melittin K. A. Olszewski,+*tL. Piela,+*$ and H.A. Scheraga'vt Quantum Chemistry Laboratory, Department of Chemistry, Warsaw University, Pasteura 1, 02-093 Warsaw, Poland, and Baker Laboratory of Chemistry, Cornell University, Ithaca, New York 14853- 1301
Received: September 18, 1992
The self-consistent multitorsional field (SCMTF) method for global optimization of the conformational energy of polypeptides has been tested with the 20-residue membrane-bound portion of melittin. A new rapid method for calculating the effective potential is presented, together with an algorithm that accelerates convergence in the vicinity of the global minimum. The effective average potential energy for each dihedral angle was obtained by interpolating the Monte Carlo average ECEPP/2 (empirical conformational energy program for peptides) potential, calculated on an equally spaced wide grid supplemented by points drawn from the related probability density distribution. When the system started to converge, the effective potential was evaluated in a limited portion of the dihedral-angle space; this helped to accelerate the convergence. Starting from four different initial conformations, related to the C, D. E, and F regions for all residues, the S C M T F method was able to locate the global minimum in all cases.
ki = 0 describes the ground state of the ith dihedral angle). The effective potential heff(ei) is given by
1. Introduction
The self-consistent multitorsional field (SCMTF) method for global optimization, presented in previous papers,'**is applied here to a larger molecule, the 20-residue membrane-bound portion of melittin, as a further test of the method. Its amino acid sequence where the summation extends over Mc lowest-energy trial is Gly-Ile-Gly-Ala-Val-Leu-Lys-Val-Leu-Thr-Thr-Gly-Leu-Proconformations taken from the set of hf locally minimized Ala-Leu-Ile-Ser-Trp-Ile. conformationsgenerated by drawing4 from the probability density distributions pi of each dihedral angle given by 2. SCMTF Metbod-Algorithm Improvements Since the algorithm has already been described,'J only its essential aspects related to the goal of the present paper will be given here. In the SCMTF method, the SchrMinger equation
H* = E*
(1)
for the motion of the nuclei is solved, in the hope that its "smoothing" properties3 will enable us to simplify the structure of the conformational energy hypersurface. In this approach, the Hamiltonian operator describes the system with fixed bond lengths and bond angles; therefore, dihedral angles are the only variables. After an approximate separation of the dihedral-angle motion, application of a mean field functional form leads to a set of N coupled equations:
hi&= &('
i = 1, ...,N
(2)
where N is the number of dihedral angles and k i i s the Hamiltonian for the ith dihedral angle given by
(3) where Ii is an averaged moment of inertia.' ki describes the variation of Bi in the averaged field of the other dihedral angles; 47 and e: are eigensolutions (eigenfunction and eigenvalue, respectively) of this Hamiltonian (kidesignatesdistinct solutions;
' Warsaw University. 1 Cornell
University.
* To whom correspondence should be addressed. 0022-3654/58/2097-0267$04.00/0
where 2 is a normalization factor and /3 = ( k 0 - I with k and T being the Boltzmann constant and the absolute temperature, respectively. This introduces temperature into our considerations and allows us to regulate the accessibility of the dihedral angle space for sampling with the single dihedral angle probability density distribution. Solutions of eqs 2 give a new set of 47 and t7 and, therefore, a new set of p i . The procedure is repeated iteratively until self-consistency of the pidistributions is achieved. When convergencewas slow or when oscillationsappeared, usually a few, so-called macroiterations, i.e., consecutive runs with different values of the parameters and with starting conformation defined by the previous run, were carried out. 2.1. Computation of Qff.Since we solve the one-dimensional SchrMinger equations numeri~ally,~,~ we must compute ticrffor each i at a large number (KJof equally spaced points for 0, E (-*,*). We found that a reasonable number of points is of order of 1000; in fact, we have used Kg = 640. For medium-size molecules, the cost of the method (see ref 1) turned out to be given approximately by KgMcN (number of single-point evaluations of the potential energy in one iteration); therefore, for a molecule as complicated as melittin (compared to our earlier applications'-*) the obvious way to reduce the cost of the method is to reduce the number of computations of the ECEPP potential by decreasing Kg. On the other hand, this should not influence the quality of the effective potential. We have proceeded in the following way: we chose a number K ; of the order of KJ10 as a number of equally-spaced points in (-*,*). To the above set of points, common to all dihedral angles, we added another set of N (of the order of 10) points generated according to the 0 1993 American Chemical Society
268 The Journal of Physical Chemistry, Vol. 97, No. 1, 1993
Olszewski et al.
TABLE I: SCMTF First Macroiteration for Melittin Starting from the ALLC Conformation. iteration Glyl Ile2 Gly, Alar Val5 Leu6 Lysl Val8 Leu9 Thrlo Thrtl Glyt2 Leu13 Pro14 Ala15 Leu16 Ile17 Serls Trp19 Ilezo 1 2-3
4-6 7 8-11 12 13 14 15-16 17-19 20-21 0
C A F A A A A A A A A
D A A A A A A A A A A
C A A A A H A A A A A
C A A A A A A A A A A
C A A A A A A A A A A
C A A A A A A A A A A
C
B A A A A A A A A A
C A A A A A A A A A A
D A A A A A A A A A A
C A A A A A A A A A A
C A A A A A A A A A A
C D’ A ’ A* A A A A A A A
C C C C D C C C C C D
A A A A A A C C A A A
C C A A A A A A A A A
C A A A A A A A A A A
C C A A A A A A A A A
D D A A A A A A A A A
C C A A A A A A A A A
C A A A A A F A F A A
These conformational regions pertain only to BOid. The starting conformation is considered as a zeroth iteration.
TABLE II: SCMTF First Macroiteration for Melittin Starting from the ALLF Conformation’ ~
iteration Glyl Ile2 Glyj Ala4 Val5 Leu6 Lys7 Val8 Leu9 Thrlo ThrlI Gly12 Leu13 Pro14 Ala15 Leu16 Hell Serls Trpl9 He20 1 2-3 4-7 8 I1 12-24 25-26 a
F F F F H A A H A A A H I O A A A A A A H A A A A A A A A
F A A A A A A
F F A A A A A A A A A A A A A
F A A A A A A
F A A A A A A
F A A A A A A
F A A A A A A
F A A A A A A
C C
C D D D D
F C C C C A A
F A A A A A A
F A A A A A A
F F F F F A A
F F F F F C A
F A A C C C C
F A A F F A A
These conformational regions pertain only to PId. The starting conformation is considered as a zeroth iteration.
probability density distribution computed for this dihedral angle. The first set of K; points assures that the whole space available for the dihedral angle is explored, while the second_set of I\” points enhances the quality of the representation of in the vicinity of the minima of t i C f f from the previous iteration (one of which may be related to the global minimum of the potential energy V). Therefore, the effective potential is calculated at K; N points, and these val!es are used to calculate the spline function that interpolates yicff. Then, we recalculate vieffon a grid of K, equally spaced points. The above procedure reduces the cost of the most computationally intensive step of the method (the evaluation of the ECEPP energy) by a factor of almost 10. 2.2. Improving Convergence near the Global Minimum While theSCMTFmethod proved to be rapidlyconvergent when locating the neighborhood of the global minimum, it slows down as the minimum is approached, Le., when the changes of conformation decrease in subsequent iterations. This is caused by two facts: the first is that we neglected the correlation of dihedral angles, when applying the mean field theory (Le., by approximating the conformational energy by the sum of effective potentials that depend on only a single dihedral angle), and the second is that the quality of the representation of Viefr may become too poor (because of the Monte Carlo approximation) to provide a detailed structure of the minima and maxima of the potential energy V. On the other hand, the problem arises only when the system is close to the global minimum and starts to converge; hence, the conformational changes should not be large. Since correlation effects cannot be introduced into the SCMTF method in a simple way (although local minimization of the trial conformations in a sense takes part of the correlation effects into account), we decided to improve the representation of yicffinthe neighborhood of the global minimum. This can be done in two ways: the first is to decrease Kh,at the same time increasing N, which leads to more extensive sampling in the vicinity of the global minimum. Yet we found that if certain regions are oversampled by using a large value of N, the quality of the representation of the effective potential became worse becauseof inadequacies of the spline interpolation. The second way to solve the problem is to assume that, if the method has successfully located the pattern of the global minimum, in terms of the conformational code of Zimmermann et al.,’ we can limit the accessible dihedral angle space (in subsequent macroiterations) by using the 8 conformations that constitute the convergence criteria. Specifically, we used a
vrf
+
Figure 1. Global minimum conformation for melittin found in this work.
multidimensional rectilinear box centered on Omin by conceptually applying an infinite repulsive potential at the walls of the box; Le., sampling was limited to conformations within the box. The size of the box was varied in different tests, as will be reported in the next section. Inside the box, the effective potential for each dihedral angle was calculated according to the procedure described in the previous section. It should be noted that the position of the center of the box is not fixed but changes as Bmin changes; therefore, its position is adjusted in consecutive iterations. In test calculations, we observed convergenceeven when theglobal minimum lay outside the initial box. Also, the lengths of the edges of the box may vary, e.g., they may be large for the backbone dihedral angles, thereby allowing for large amplitude motion, and small for side-chain dihedral angles, allowing only for small adjustments of the sidechains to the new backboneconformation. This procedure turned out to lead to very rapid convergence, even for relatively large lengths of the edges of the box. This may easily be understood, since the ratio of the volume of the box to the volume of the whole dihedral angle space, for a molecule with 39 dihedral angles (see next section) with an edge length of the box of K , is equal to 1.82 X 10-12;i.e., in general, this ratio behaves as a geometric sequence with a ratio less than 1 and hence has a limit equal to 0. Therefore, a t the end of the whole procedure, the accessible space is drastically reduced, although still any point in the dihedral angle space may be accessed because the center of the box is not fixed.
The Journal of Physical Chemistry, Vol. 97, No. I, 1993 269
Intramolecular Conformational Optimization. 3
TABLE III: Global Minimum Conformation for Melittin from This Work' dihedral angle (deg) residue Gly-l
4
-61 (-62 Ile-2 -65 (-65 Gly-3 -64 (-64 Ala-4 -65 (-65 Val-5 -63 (-63 -65 Leu-6 (-65 LYS-7 -64 (-64 Val-8 -66 (-66 Leu-9 -64 (-64 Thr-10 -70 (-70 0
$ -42 -43 -40
-40 -41 -41 -47
-47 -40 -41 -41 -41 -45 -45 -39 -39 -42 -42 -40 -40
w
180 180) 179 179 -179 179) -179 -179 -179 -179 -178 -178 -179 -179 -179 -179 -177 -177 -178 180
XI
-66 -65
x2 153 148
dihedral angle (deg)
x3 53 57
x4
x5
61 58)
residue Thr-ll Gly-12 Leu-I3
61 61) 165 165 177 177 -177 -178 166 166 177 177 -63 -63
Pro-14
51 51 62 62 176 175 51 51 63 63 91 91
53 52) 53 52 179 179 54 53) 52 52 55 55)
Ala-I5 59 59) 180 180
Leu-I6 177 177)
IIc-17
Ser-18 179 179)
Trp-19 Ile-20
4 -70 (-70 -78 (-78 -150 (-150 -75 (-75 -61 (-61 -66 (-66 -66 (-66 -72 (-71 -67 (-67 -67 (-67
$ -38 -38 -37 -37 79 78 -15 -15 -38 -38 -43 -43 -33 -33 -37 -37 -42 -42 -41 -41
w
XI
x=
-62 -63
91 91
x3 56 56)
x4
-177 -177 -173 -173) 180 180 180 180) -178 -178 180 -179 -179 -179 -179 -179 179 179 179 179
-174 -173
67 67
51 51
59 59)
61 61 68 57 67 67) -22 -22) 158 158
52 52 173 174
58 58) -72 -70)
60 54
65 66)
64 63) 176 176 -84 -86 -60 -60 -60 -60 -68 -68
xs
Values in parentheses are from ref 1 1.
TABLE I V SCMTF First Macroiteration for Melittin Starting from the A&D*DA, Conformation' iteration Glyl Ilez Gly3 Ala4 Val5 Leu6 Lys7 Val8 Leu9 Thrlo ThrlI Gly12 Leu13 Pro14 Ala15 Leu16 Ile17 Serlg Trpls Ilezo 1-4 5-6 7-9 10 11 12-13 14-15 1 6 1 9 21 22-23 a
A A A A A A A
A A A A A A A - 1 8 - 2 0 A A A A
A A A A A A A A A A A
A A A A A A A A A A A A A A A A A A A A A A A A
A A A A A A A A A A A
These conformational regions pertain only to
A A A A A A A A A A A A A A A A A A A A A A A A Bold.
A A A A A A A A A A A
A A A A A A A A A A A
C C C C C D C A A A A
D ' D A G D A H D A G D A D* A* A D* A* A D* A* A A A ' A A A * A A D A A D A
A A A A A A A A A A A
A A A A A A A A A A A
A A A A A A A A A A A
A A A A A A A A C C A
A A A A A A A A A A A
A A A A A A A A A A A
The starting conformation i s considered as a zeroth iteration.
3. Results and Discussion The 20-residue segment of the melittin molecule was constructed using the ECEPP (empirical conformational energy program for peptides) algorithm8-I0 with an acetyl group on the N-terminus and a methylamide group on the C-terminus. All backbone dihedral angles, 4's and were varied, except 4 of Pro-14, which was fixed a t -75'. The initial values of the 0's and x's were taken from ref 11 and allowed to vary only after all convergence criteria were satisfied. Four global minimizations were carried out by starting from arbitrary chosen points of the conformational space, described in terms of the conformational codes of Zimmermann et al.7 as
vs,
1.
ALLC, Le., I$ = -80' and rC, = 80'
2.
ALLD, Le., 9 = -150' and $ = 80'
A L L D and ALLE, before convergence to the global minimum in a subsequent macroiteration.
3.
ALLE, Le., 9 = -150' and $ = 150'
4.
ALLF, Le., 9 = -80' and $ = 170'
macroiteration did not converge and was restarted using Bavfrom the 8th iteration (before oscillation occurred) as a starting conformation. This also yielded the structure A&D*DA7 with an energy of -40.9 kcal/mol. Since the convergence criteria related to Oav and Omax were not fulfilled, the next macroiteration was restarted with one of the conformations related to the unfulfilled convergence criterion; specifically, we used the B.' vector as a startingconformation. This led to the global minimum pattern with the energy -70.5 kcal/mol. The computation time for the first macroiteration was comparable to the time reported in ref 2 for icosalanine. After locating the global minimum pattern in all cases, according to the fulfillment of all the convergence criteria, the resulting conformations from each run were used as starting points
Figure 2. PersistentlyoccurringconformationAloCD*DA, found in runs
For the ALLC starting conformation, the global minimum patternwasattainedin thefirstmacroiterationinthe 17thiteration (see Table I) giving -55.7 kcal/mol in the 19th iteration. For the ALLF starting conformation, the conformation attained in the first macroiteration had almost the global minimum pattern (with only one defect: Trp-19 was in the C conformation instead of the A conformation) with energy -59.8 kcal/mol (see Table 11). For the ALLD starting conformation, convergence was slower producing structure AloCD'DA7 in the 48th iteration with energy -35.5 kcal/mol. For the ALLE starting conformation, the first
270
The Journal of Physical Chemistry, Vol. 97, No. 1, 1993
TABLE V Persistently Occurring Conformation AIoCD'DA, for Melittin Found in Runs ALLD and ALLE dihedral angle (deg) residue
Gly- 1 Ile-2 Gly-3 Ala-4 Val-5 Leu-6 Lys-7 Val-8 Leu-9 Thr-IO Thr-11 Gly-I2 Leu- 13 Pro- 14 Ala- I5 Leu- 16 Ile- 17 Ser- 18 Trp- 19 Ile-20
4 -6 I -60 -61 -57 -59 -62 -64
-59 -55 -56 -8 0 163 -144 -7 5 -57 -58 -59 -57 -59 -56
$
w
-46 -43 -49 -49 -46 -39 -51 -47 -57 -37 66
180 180 -179 -178 -177 -178 -178
-44
75 -21 -43 -53 -42 -46 -53 -42
180
-178 177 -173 176 180 -178 -178 -177 -179 -178 -178 179
x'
x2
-68
152
52
60 164 179 -177 165 175 -58 -56
50 62 177 50 59 108 88
54 51 179 55 52 58 56
-174
67
53
59
63 177 -84 -59 -58 -92
62 72 77 -24 69
53 52
57 48
53
176
x3
x4
x5
59
59 180
177
57
to continue for several more macroiterations. In these macroiterations, all backbone and side-chain dihedral angles of the molecule, except 4 of Pro-14 were varied. The algorithm for improving convergence was applied with box edge lengths equal to 120° for the dihedral angles 9 and and 15O and 60° for w and x's, respectively. In all cases, the structure reported in ref 11 was obtained, with an energy of -87.0 kcal/mol (see Figure 1) [the slight differences in energy (from -85.9 in ref 11) and in conformation are caused by minor changes in the ECEPP/2 parametrization introduced in the mean time]. Table 111contains the structures at the global minima from this work and from ref 11. The total computation time varied from approximately 6 to 18 h. The low-lying minimum A&D*DA, in runs ALLD and ALLE occurred persistently. Attempts to minimize its energy with the dihedral angles w and x varied, but maintaining the backbone pattern [by applying a small box length related to the backbone dihedral angles ( S O 0 ) ] led to an energy of -72.0 kcal/mol (see
+
Olszewski et al. Figure 2). Table IV contains the dihedral angles related to this structure. To obtain convergence of the density distributions and to escape from this local minimum, a higher temperature had to be applied. Consequently, the next macroiteration starting from this conformation successfully located the global minimum pattern, as one that fulfilled all probability density distribution convergence criteria, with an energy of -32.0 kcal/mol; the resulting energy is higher than the starting energy because, in the first iterations, convergence of the energy is not forced (by adding the0 conformations that constitute the convergencecriteria) until the probability density distribution converges (see Table V). In this case, the global minimum was obtained directly as described above.
Acknowledgment. This work was supported by research grants from the National Institutes of Health (GM-14312), from the National Science Foundation (DMB 90-15815), and from the National Research Council (KBN) of Poland. The computations were carried out a t the Cornell Supercomputer Facility, a resource of the Cornell Center for Theory and Simulation in Science and Engineering, which receives major funding from the National Science Foundation and the IBM Corp., with additional support from New York State and members of its Corporate Research Institute. References and Notes (1) Olszewski, K. A.; Piela, L.; Scheraga, H. A. J . Phys. Chem. 1992, 96, 4672. (2) Olszewski, K. A.; Piela, L.; Scheraga, H. A. J.Phys. Chem.,preceding paper in this issue. (3) Cycon, H. L.; Froese, R. G.; Kirsh, W.; Simon, B. Schrodinger Operators; Springer-Verlag: Berlin, 1987. (4) von Neumann, J. Not. Bur. Stand. Appl. Math. Ser. 1951, 12, 36. (5) Cooley, J. W. Math. Comput. 1961, 15, 363. (6) Johnson, B. R. J. Chem. Phys. 1977, 67, 4086. (7) Zimmermann, S. S.; Pottle, M. S.; Ntmethy, G.; Scheraga, H. A. Macromolecules 1977, I O . 1 , (8) Momany, F. A.; McGuire, R. F.; Burgess, A. W.; Scheraga, H. A. J . Phys. Chem. 1975, 79, 2361. (9) Ntmethy, G.; Pottle, M. S.; Scheraga, H. A. J . Phys. Chem. 1983, 87, 1883. (10) Sippl, M. J.; Ntmethy, G.; Scheraga, H. A. J. Phys. Chem. 1984, 88, 623 1. ( 1 1) Ripoll, D. R.; Scheraga, H. A. Biopolymers 1990, 30, 165.