The Multiple-Minima Problem in the Conformational Analysis of

rate-determining step is the formation of water in the bulk, Le., eq 22. The fact that .... 0. X. Figure 1. Original double minimum potential energy c...
0 downloads 0 Views 908KB Size
J . Phys. Chem. 1989, 93, 3339-3346 Mechanism of the Catalytic Oxidation of H2. The catalytic oxidation, which proceeds by the repetitive reduction and reoxidation of the catalyst, is called "redox" mechanism. In this mechanism the rate of catalytic oxidation (denoted by r(cat)) agrees, to the first approximation, with the rate of reduction of the catalyst, r ( H 2 ) ,and also with the rate of reoxidation, r ( 0 2 ) , when they are compared at the same degree of reduction. r ( H 2 ) at the degree of reduction of 0.015 electrons/anion (the stationary state, see Figure 5b) was estimated from the P(H2) vs time curves mol.min-'.g-'. This value almost agreed to be (1.7-2.1) X with the stationary rate of H2 consumption (1.7 X 10" molmin-'.g-') during the catalytic oxidation. This result is good evidence of a redox mechanism. As is discussed previously for PMoI2,l2when r ( H 2 ) is independent of the surface area, the dependency of r(cat) on the

3339

surface area becomes very small due to the nature of the r ( H 2 ) vs t curve, even though r(Oz)was proportional to the surface area. Therefore, it may be concluded also in the present case that r(cat) little depended on the surface area because r(H2)in the redox cycle did not depend on the surface area. As discussed above, the reason r ( H 2 ) did not depend on the surface area is that the rate-determining step is the formation of water in the bulk, Le., eq 22. The fact that in the presence of O2the rate of exchange between hydrogen in the gas phase and the protons in the bulk of the catalyst was also 20 times greater than r(cat) shows that protons can migrate rapidly in the bulk of the catalyst also during the catalytic oxidation of H2. Registry No. PW,,, 1343-93-7; H2,1333-74-0; D2,7782-39-0.

The Multiple-Minima Problem in the Conformational Analysis of Molecules. Deformation of the Potential Energy Hypersurface by the Diffusion Equation Method Lucjan Piela,* Quantum Chemistry Laboratory, Department of Chemistry, University of Warsaw, Pasteura 1 , 02-093 Warsaw, Poland

Jaroslaw Kostrowicki, Institute of Inorganic Chemistry and Technology, Technical University of Gdansk, Majakowskiego 11/12, 80-952 Gdansk, Poland

and Harold A. Scheraga Baker Laboratory of Chemistry, Cornell University, Ithaca, New York 14853- 1301 (Received: August 15, 1988)

A deterministic algorithm designed to search for the global minimum of a potential energy function in the conformational analysis of molecules is proposed. The algorithm is based on the deformation of the original potential energy hypersurface in such a way as to obtain only a single minimum which, in most cases, is related to the global one. This single minimum can easily be attained from any starting point of the modified hypersurface by standard local minimization procedures. The position of this minimum with respect to the global one in the original hypersurface may have been changed during deformation; therefore, a reversing procedure is applied in which the global minimum is usually attained by gradually reversing the deformation. The hypersurface is deformed with the aid of the diffusion (or heat conduction) equation, with the original shape of the hypersurface having the meaning of the initial concentration (or temperature) distribution. The algorithm functions efficiently in one- and two-dimensionalproblems of chemical interest as well as in many dimensions for some test functions. Suggestions for extending it to higher dimensions for systems of chemical interest are provided, and a possible application to molecular dynamics is indicated. The significance of the proposed method extends beyond its application in chemistry.

1. Introduction The multiple-minima problem is the most formidable one in the conformational analysis of macromo1ecules.l The number of minima appearing in the energy hypersurface typically varies as 3m for hydrocarbon-like molecules or as lom for polypeptides, m being the number of monomer units. This means that a study based on some systematic2-I2 or r a n d ~ m ' ~exploration -~~ of the whole energy hypersurface in order to locate the global minimum quickly becomes impossible as the number of minima increases, even when applying rapidly computable atom-atom potentials and using the most powerful computers available today. Actually, at present, m = 5 is a practical upper limit for polypeptides, whereas proteins contain of the order of 100 residues. Some other methods proposed in the literature do not search the whole space, but their application is limited to polypeptides,21-22and then only to chains containing not more than 20 amino acid residues. *To whom correspondence should be addressed.

0022-3654/89/2093-3339$01.50/0

In the present paper, we recognize that it is impossible to search the whole conformational space of a large molecule, and we (1) Gibson. K. D.: Scheraga. H. A. In Sfructure and Exoression: Vol. I. From Proteins to Ribosome: Sarma, M. H., Sarma, R. H., Eds.; Adenine

Press: Guilderland, NY, 1988; p 67. (2) Crippen, G. M.; Scheraga, H. A. Proc. N a f l .Acad. Sd.U.S.A. 1969, .. 64, 42. (3) Gibson, K. D.; Scheraga, H. A. Compuf.Biochem. Res. 1970,3, 375. (4) Goldstein, A. A.; Price, J. F. Math. Compuf. 1971, 25, 569. (5) Crippen, G.M.; Scheraga, H. A. Arch. Biochem. Biophys. 1971, 144, 453, 462. (6) Shubert, B. 0. SIAM J . Numer. Anal. 1972, 9, 379. (7) Crippen, G. M.; Scheraga, H. A. J. Compuf. Phys. 1973, 12, 491. (8) Shor, N. Z . Cybernetics 1977, 12, 94. (9) Treccani, G. In Towards Global Minimization 2; Dixon, L. C. W., Szego, G. P., Eds.; North-Holland: Amsterdam and Elsevier: New York, 1978; pp 165-177. (10) Levy, A. V.; Montalvo, A. SIAM J. Sci. Statist. Compuf. 1985, 6, 15. (1 1) Evtushenko, G. Numerical Optimization Techniques; Optimization Software, Inc.: New York, 1985; pp 465-509.

0 1989 American Chemical Society

3340 The Journal of Physical Chemistry, Vol. 93, No. 8, 1989 therefore design an approach whose philosophy is similar to that proposed by Stephenson and Binsch (SB)23in the analysis of NMR spectra. These authors23 transform the original function to be minimized into another one with a smaller number of minima. The aim of the present paper is to report a new method for transforming the original energy function and to test the proposed technique for locating the global minimum for the potential energy surface of small model molecules.

Piela et al. 0.21

-1.2

-0.8

0

-0.4 X

2. Method

2.1. Basic Idea. The basic idea behind our approach is to deform the potential energy hypersurface smoothly in such a way as to make shallow potential wells disappear gradually, while other potential wells grow at their expense (we define the depth of a potential well as the difference between the minimum value of the function on the boundary and the minimum value inside the well). As the potential wells evolve, they change their position, size and shape and become shallower. The assumption of the present algorithm is that shallower potential wells disappear more easily than do deep ones. Then one eventually ends up with a single potential well that has absorbed all the others and, hopefully, will have arisen from that one which contained the original global minimum. In this new situation, any local minimization method would, of course, lead to the same, unique minimum independent of the starting point. Its position with respect to the original global one may have changed and, therefore, the deformation is gradually reversed, as will be described in section 2.5; one then obtains a trajectory through conformational space leading to the global minimum. When the reversal is complete, this trajectory ends up at the position of the global minimum of the original function. The deformation of the hypersurface, without exploring myriads of local minima or knowing their positions beforehand, is an extremely important feature of our method. To our knowledge, this feature is shared only with the SB procedure.23 The methods of Crippen and Scheraga* and of Goldstein and Price,4 which are based on a similar idea, eliminate mainly those minima that were already found previously and, therefore, are inefficient for problems with a very large number of minima. In a sense, the methods of C r i p ~ e n ~and ~ q of ~ ~Purisima and S ~ h e r a g a ~are ~ ? complementary ~’ to the method of the present paper; however, they deal with the unchanged energy hypersurface in a multidimensional space, where there is a unique minimum, and then relax back to three dimensions to locate the global minimum. Although the SB method23and our approach have the same general philosophy in common, they differ mathematically, because we use a fundamentally different method of deformation of the hypersurface than is used in the SB approach. 2.2. Transformation Operator. To begin with, we first consider a one-dimensional problem, and will then extend it to higher dimensions in sections 2.4, 4, and 5. One may define a transformation of the original functionf(x) that destabilizes any potential well of f ( x ) by decreasing its depth; here, f is the con-

(12) Ge, R. P.; Qin, Y. F. J . Opt. Theor. Appl. 1987, 54, 241. (1 3) Khachaturyan, A.; Semenovskaya, S.; Vainshtein, B. Acta Crystollogr. 1981, A37, 742. (14) Kirkpatrick, S.; Gelatt, C. D., Jr.; Vecchi, M . P. Science 1983, 220, 671. (15) Vanderbilt, D.; Louie, S. G. J . Comput. Phys. 1984, 56, 259. (16) Bohachevsky, I. 0.;Johnson, M. E.; Stein, M. L. Technometrics 1986, 28, 209. (17) Khachaturyan, A. J . Math. Phys. 1986, 27, 1834. (18) Snyman, J. A.; Fatti, L. P. J . O p f . Theory Appl. 1987, 54, 121. (19) Li, Z.; Scheraga, H. A. Proc. Natl. Acad. Sei. U.S.A.1987,84,6611. (20) Saunders, M. J . A m . Chem. SOC.1987, 109, 3150. (21) Piela, L.; Scheraga, H. A. Biopolymers 1987, 26, S33. (22) Ripoll, D. R.; Scheraga, H. A. Biopolymers 1988, 27, 1283. (23) Stephenson, D. S.; Binsch, G. J . Magn. Reson. 1980, 37, 395. (24) Crippen, G. M. J . Comput. Chem. 1982, 3, 471. (25) Crippen, G. M. J . Comput. Chem. 1984, 5 , 548. (26) Purisima, E. 0.;Scheraga, H. A. Proc. Natl. Acad. Sci. U.S.A.1986, 83, 2782. (27) Purisima, E. 0.; Scheraga, H . A. J . Mol. Biol. 1987, 196, 697.

Figure 1. Original double minimum potential energy curvef(so1id line) transformed, according to ( I ) , into a curve with only a single minimum (dashed line). The values of the transformed function a t the inflection points do not change. The particular function used in this figure isflx) = x4 + ax3 + 6x2, with a and 6 equal to 2 and 0.9, respectively. When (3 = 0.02 one obtains/+ pf”, which exhibits only one minimum.

formational energy and x is a geometrical variable such as a dihedral angle: f’l](x)= f ( x )

+ pf”(x)

for p

>O

(1)

Under the transformation of (l), the inflection points do not undergo any change since they correspond tof” = 0, while the regions of the curve where the function is convex (concave) go up (down), thus destabilizing the existing extrema of the curve (for small 8’s) (see Figure 1). The procedure may be repeated for the new curve and, in the Nth iteration, we obtain

In this form, this procedure may be applied directly to conformational analysis in n dimensions by simply replacing the second derivative operator by the n-dimensional Laplacian, but the destabilization procedure is most effective when p and N tend to zero and infinity, respectively.28 In particular, taking p = t / N , with t > 0 being a parameter, ( 2 ) may be transformed into

where an operator A has been defined such that exp(A) denotes the operator which is represented by the Taylor series 1

1 1 + A + --A2 + -A3 + ... 2! 3!

The operator

(dd:J

T ( t ) = exp t

(4)

which appears in (3), has some useful properties. First, it is linear and its eigenfunctions are sin w x and cos w x , w being a real constant:

T ( t ) sin w x = a(w,t) sin w x

(5)

T ( t ) cos w x = a(w,t) cos w x

(6)

(28) To illustrate this point, let us take a nonzero value of (3 in (1) and consider whether the extrema ofp’](x), ( I ) , are going to be destabilized. One may writenx) + fif”(x) = Lf(x) + (fi/M)f”(x)] + ( M -l)(j3/M)J”(x) 2 F(x) + ( M - I)(B/M)f”(x). Let us assume for the moment that the function F(x) =Ax) + (p/M)f”(x) has been obtained according to the philosophy behind ( I ) (is., a certain destabilization of the extrema offhas been achieved). Now, the addition, F ( x ) + ( M - l ) ( f i / M ) f ” ( x ) ,is not a step in the best direction, becauseyis not the second derivative of F ( x ) (as it should be if a further destabilization is to take place). Thus, in order to obtain the destabilization as effectively as possible, the (3 parameter should be infinitesimally small, e.g., fi = t / N , N - -, and the transformation of (1) should be iterated, as in (3). As to the value of N , repetition of transformation (1) many times [see (2)] accelerates the destabilization process shown in Figure 1, more and more, because, in the ( N + 1)th iteration, one destabilizes the already destabilized (in the first N steps) potential wells.

The Journal of Physical Chemistry, Vol. 93, No. 8,1989 3341

Conformational Analysis of Molecules where the eigenvalues a(w,t) are expressed as a ( w ) = exp(-w2t)

(7)

From (5)-(7), it is clear that the operator T ( t )flattens sines and cosines, because of the factor a(w,t) of (7). The important point is that this flattening is more pronounced for high-frequency functions (large w ) . As a consequence, high-frequency Fourier components of f(x) will vanish first when the operator T ( t ) is applied, usually leading to T ( t )f(x) with many fewer minima. Another property of T ( t ) is that this operator transforms polynomials into polynomials of the same degree, e.g.

studies. It should be noted, however, that a logarithmic dependence1*29v30 of the “local free energy” on the determinant of second derivatives suggests that the present method probably underestimates the tendency for the entropy-preferred potential wells. 2.4. Extension to Higher Dimensions. The fact that F(x,t) of (3) satisfies the diffusion equation can be used to extent the procedure to higher dimensions. In a multidimensional diffusion equation, the operator d2/dx2 must then be replaced by the Laplacian

T(t)x = x

+ 2t T(t)x3 = x3 + 6tx T(t)X4 = x4 + 122x2 + 12t2 T(t)x2 = x2

(8)

To prove these relations, it is sufficient to apply the operator T(t) explicitly in the form of a Taylor series. 2.3. Diffusion Equation. Generally, the Taylor series representation of eq 3 may be divergent for certain functionsflx) and for sufficiently large values of the parameter t . However, as one may easily verify by differentiating this series term by term, when it does converge, its sum is a solution of the diffusion or heat conduction equation

where n denotes the dimension of the problem. Thus, by analogy with (l), in the multidimensional case one destabilizes the function f by adding the trace of the Hessian (the mixed derivatives do not appear in the Laplacian A). The trace is invariant with respect to any orthogonal transformation of the axes, including that leading to the diagonal form of the Hessian. For a minimum (maximum), where all eigenvalues of the Hessian are positive (negative), the trace is positive (negative). A saddle point corresponds to the presence of both positive and negative eigenvalues. Depending on the sign of the trace, the value off’] in a multidimensional analogue of (1) will increase (a dominance of the positive eigenvalues; trace > 0), decrease (a dominance of the negative eigenvalues; trace C 0), or remain the same (cancellation of both types of eigenvalues; trace = 0). The operator T ( t ) is equal to

(9)

with t representing time and with the initial condition F(x,O) = f(x). The function F represents a concentration or temperature distribution, respectively. Whenf(x) is bounded, a solution of (9) exists for any positive value oft, even for such a large value o f t for some functionsflx) for which the series of (3) diverges. In the latter case, a solution of (9) for any small deviation At in the time to is consistent with the series in (3), i.e. T(At)F(x,to) = F(x,to+At)

(10)

This means that, when the series does not converge, the operator T , defined by the series of (3), produces a function F for the near future only, but the action of the operator always matches the solution of the diffusion equation. In the following, we will search for the solution of (9) rather than the application of the operator T ( t ) for large t’s (when the series converges these two procedures are equivalent). Thus, from a physical point of view, the procedure described here represents a spontaneous mass transport (or flow of heat) in a medium for an initial distribution of concentration (or temperature) according to the functionflx) (which, in our example, represents the conformational energy). The diffusion equation provides a physically appealing picture. Indeed, independent of the initial conditions, the concentration (or temperature) is a constant at t = m. However, just before that (Le., at large t ) the concentration (or temperature) may exhibit only one minimum. Hopefully, this minimum represents the last trace of the potential well of the global minimum of the original hypersurface. This physical interpretation suggests that, in some cases, the global minimum may not be found. For example, when the global minimum belongs to a narrow potential well of large depth, the potential well may disappear as the time t increases, while a wider, originally shallower, potential well may survive. However, one can legitimately ask whether narrow potential wells are of any interest from a chemical point of view. Indeed, wide potential wells (with a large number of vibrational states) will generally be preferred over narrow ones (a small number of vibrational states) of comparable depth, because of the entropy factor. Thus, what one is really interested in is a set of low-lying minima with some preference for the wide ones. The question whether the present algorithm under- or overestimates this tendency is not addressed in the present paper and is a subject of our current

T ( t ) = T,(t)T,(t)

...Tn(t)

with

having the property that for j

Ti(t)f(x,) =f(xj)

+i

Equation 9 then becomes AF =

aF

at

2.5. Reversing Procedure. If time t = to is sufficiently large, one may obtain a function F(x,to) that exhibits only one minimum. Its position, in general, is different from that of the global one of the original potential f (the latter minimum represents our target). The unique minimum of F(x,to), when used as a starting point in a minimization off(x), may in principle lead to a local minimum and not the global one. In order to avoid such a situation, a reversing procedure (Figure 2) is used in the present method. The procedure consists of gradually diminishing to by setting t = to, to - At, to - 2At, etc., where At is a small time interval, and obtaining the corresponding set of deformed surfaces [by using the operator T(t)or by solving (9)]: F(x,t,), F(x,to-At), F(x,to-2At), etc. Then, using the unique minimum of F(x,to) as the starting point in the minimization of F(x,to-At), and the resulting minimum position as the starting point in the minimization of F(x,to-ZAt), etc., one finally attains a minimum of F(x,O) =f(x), which may be regarded as genealogically related to the unique minimum of F(x,t), t I tO. Hopefully, the minimum of F(x,O) represents the global one of the original potentialflx). 2.6. Summary of Method. Summing up, the method of the present paper consists of the following steps: (1) Solve (9) for F(x,O) = f(x) as the initial condition or apply the operator T ( t ) for a sufficiently large value o f t = to,and use a local minimization procedure to locate the position x* of the unique minimum of the deformed potential. This position is then used as the starting point (29) Gibson, K. D.; Scheraga, H. A. Physiol. Chem. Phys. 1969, I , 109. (30) G6, N.; Scheraga, H. A. (a) J . Chem. Phys. 1969, 51, 4751; (b) Macromolecules 1916, 9, 5 3 5 .

3342 7he Journal of Physical Chemistry, Vol. 93, No. 8, 1989

Piela et al.

TABLE I: The Pseudoetbane Moleculea of Figure 3 Evolution in Time t (Arbitary Units) of the Dihedral Angles (in Degrees) Corresponding to the ith Minimum, el”.! and to the ith Maximum, Pqi,after Transformation by the Operator T ( t ) , (4), or by Solving (9) t 0.0 0.1 0.2 0.3 0.3 1 0.312c O.32Oc 0.4 OSd 61.6 63.6 63.9 64.0 64.3 69.9 e 60.4 60.8 $?I 119.3 117.2 111.8 110.9 110.8 110.0 98.1 e ,pi 119.9 208.6 210.9 186.7 193.7 211.4 213.5 232.1 235 1 ey.2 183.5b 240.8 245.7 248.7 249.9 e e e 240.0 240.2 $ p 2 8y.3 296.1 292.5 284.5 264.0 258.7 256.9 e e e qax.3 0.1 0.6 2.3 6.3 6.7 7.0 7.5 14.3 28.0 “Potential energy function, Figure 4A. For the parameters used for the Lennard-Jones atom-atom potentials” see the text. bThis row shows how the osition of the global minimum evolves in time t. C A t a time between t = 0.312 and t = 0.320 the more shallow of the two competing minima, fly’,disappears (see Figure 4). d A t this time the transformed potential, (3), exhibits only one minimum. ‘This extremum disappeared after transformation T(r). 3’

A

deformation

Figure 3. The pseudoethane molecule, Le., the ethane molecule substituted by various atoms instead of hydrogens. This is a one-dimensional model with short-range interactions included, the only variable determining the conformation being the dihedral angle 8. The atoms interact by the Lennard-Jones potential. The geometry assumed for this model molecule was as follows: the C-X (X being atoms 1, 2, 3; see text) and C-C bond lengths were taken as R = 1.54 A, and the X C C and XCX bond angles were taken as @ = 109.5’. Atom number with a prime denotes the same kind of atom as that with the same number without a prime.

t =0.02

original potential

Figure 2. Illustration of the deformation of the original potentialf(x) (the same as in Figure 1) by the operator T ( t ) ,and of the reversing procedure. The deformation applied by the operator T(t, = 0.25) leads to the curve with the unique minimum that is achievable from any point of the space by a simple minimization. Then, the reversing procedure (shown by the arrows directed downward) is applied by considering a sequence of the deformed curves: T(O.lS)f(x),T(O.lO)f(x), T(O.O5)f(x), T(O.OZ)f(x),and finally T(O)f(x) = f ( x ) . Each step of the procedureis followed by a minimization symbolized in the figure by a ball moving downhill from the minimum position of the upper curve and always reaching the position of the minimum in the lower curve. In the final step, the global minimum is found.

in the reversing procedure. (2) Apply the reversing procedure described in section 2.5. (3) The position obtained by minimizing F(x,to=O) from the starting point obtained in the reversing procedure is, hopefully, the position of the global minimum.

3. One-Dimensional Models A simple example of a one-dimensional conformational problem is a substituted ethane molecule hereafter called the pseudoethane molecule (Figure 3). In this model, only short-range interactions are encountered. The atoms interact by the Lennard-Jones potential and the conformational energy depends on the dihedral angle 8 shown in Figure 3 and has the form 3

c(0) =

3

The values of A , and B , were taken3’ for various sets of atoms 1, 2, 3 of Figure 3. The results, corresponding to the following choice-1 = Csp3,2 = Nsp3,3 = OSpz;A , , = 372.5, B , , = 28.58, A12 = 348.7, B12 = 16.82, A13 = 361.2, B13 = 20.52, A22 = 344.1, Bz2 = 17.82, A23 = 349.5, Bz3 = 15.75, A33= 367.2, and B3, = 14.49 (A’s in kcal A6/mol, B s in lo4 kcal A12/mol)-are collected in Table I and Figure 4. Another simple model, hereafter called the pendulum (Figure 5), mimics medium- and long-range interactions. In the pendulum model, an atom (which we refer to as atom number 0) moves in a plane at a constant radius ro from a given point. The atom interacts with several other immobile atoms, 1,2, ..., i, ..., k located in fixed positions in the same place at radius r > r,,. The potential energy, c(Bo), is given by

C [Aijriyl2(0)- Bijrij4(0)] j=lj=l

k

400) =

where

$(e) and

where L = R( 1 - 2 cos o), I = R sin 4 with R = 1.54 A and 4 = 109.5’ (see Figure 3 for R and 4), and rijis an element of the matrix

=

+ p, COS (e + r,)

CY,

(17)

c [Ao,rol-12(~o) - Bolro,”(~O)l

,=I

(21)

where 80 is the position of atom 0 (correspondingly 0, for fixed atom i ) ; rol(Bo)is the distance between atoms 0 and i, and Ao, and

(3 1 ) Hopfinger, A. J. Conformational Properties of Macromolecules; Academic Press: London, 1973; pp 38-49.

The Journal of Physical Chemistry, Vol. 93, No. 8, 1989 3343

Conformational Analysis of Molecules

-0. I 5

-0.25 -0.15

c\

t 0.03

B

1.0.06

c

n

!WP \I

1

-0 E

\ -

zU w I

-

C m v

c

m:

v

W

0

120

240

Dihedral Angle, 8 (deg) Figure 4. Conformational energy (kcal/mol) for the pseudoethane molecule, Figure 3, as a function of the dihedral angle 0 (in degrees). The original curve, c(0) of (16), corresponds to t = 0. The Figures for t > 0 correspond to the time evolution according to (9). The Lennard-Jones potential energy parameters are given in the text; Table I collects the positions of the extrema during the time evolution. At t 5: 0.5, the

transformed curve exhibits a single minimum.

_-

-0.21

lLd d l240L A I20 360

0

8,(deg) Figure 6. Conformational energy, ~ ( 0 , )(kcal/mol), for the pendulum versus angle Bo of Figure 5 and (21). Time evolution leads to a single minimum related to the global one in the original potential energy curve ( t = 0). The particular choice of the atoms used here is given in the text.

Figure 5. "Pendulum" model mimicking medium- and long-range interactions. An atom denoted by 0 is at a constant radius ro from a point on a plane; other atoms are located in the same plane at radius r > ro. The positions of all atoms are determined by the angles e,, i = 0, 1, 2 .... The values of Or for i = 1, 2, 3, ... are fixed, the only variable being Bo.

Bo,are pair interaction parameters. From the geometry of the problem roi2(eo)= aP + pp COS where a p and

-o'2

- 0 . 2202

360

(e,

-

e,)

(22)

pP are a p = ro2

+ rz

pp = -2ror

(23) (24)

In the example presented here, atom 0 was chosen as a hydrogen atom, while atoms 1-6 ( k = 6) have been taken as C,d, N,d, Os+ F, CI, and P, respectively. The corresponding Lennard-Jones parameters for pairs of these atoms3' are as follows: Aol = 127.4, Bo, = 3.743,,402 = 128.3,Bo2 = 1.995,,403 = 123.9,Bo, = 2.503, AM = 79.8,804 = 0.9361,A05 = 272.8,Bo5 = 8.074,AM = 264.8, BM = 7.825. (A's in kcal A6/mol; B s in lo4 kcal Al2/mol). The diffusion or heat conduction equation has been solved by expanding the initial potential energy curve of (16)or (21) in a Fourier series (see the Appendix). Then, the time evolution of

the potential energy curve has been calculated by applying (5) and (6)to individual terms of the expansion (cutting them off when numerical convergence is attained). It should be noted that, from these equations, it follows that, at large times t , the only component of interest in the Fourier series is the one corresponding to the lowest frequency w. The reason is that all other components disappear very rapidly, because of the exponential with respect to the product of time and the square of the frequency w in (7). The method presented here has been tested for different choices of the interacting atoms in both models under study (Figures 3 and 5). The results for the first model are given in Figure 4 (they correspond to the data collected in Table I). The choice of atoms was made in order to obtain an almost degenerate energy curve (minima of similar depth; see Figure 4A) in order to provide a stringent test of the method. The proposed method extracted the proper (Le., the global) minimum, and removed its nearest competitor; this happened at a time t between 0.312and 0.320;see Table I and Figure 4. Similarly, Figure 6 and Table I1 show the results obtained for the pendulum model of Figure 5. In this model [e, = ( i - 1)60°, i = 1, ..., 61, the number of minima was greater than in the previous model, but only one minimum was left after solving (9) for sufficiently large t . This single minimum originates from the global one of the minimized function. The calculations (not reported here) have also been carried out for several other types of atoms and their positions with similar final results. Only in one model (the same sequence of atoms as in the previous model; O1 = Oo, O2 = 40°,e3 = SO0, O4 = 160°,O5 = 240°, 8, = 300O) was the extracted minimum not the global one; the originally minimized and transformed functions are shown in Figure 7. It should be noted, however, that the minimum located

Piela et al.

3344 The Journal of Physical Chemistry, Vol. 93, No. 8, 1989 u0

v,

0

"0 W

x

0 W

f 0

0

v:

f 0

0

0

x

0

2 0 P

? 0

Bo ( d e g )

0

Figure 7. A negative example. Conformational energy (kcal/mol) for the pendulum versus angle 6, of Figure 5 and (21). Time evolution leads to a single minimum that is nor related to the global one in the original potential energy curve ( t = 0). Instead the procedure gives a wide low-energy potential well close to 6 = 60'.

0 ' I )

2 0

0

e 0 0

0.5

W

8

7

1

I

I

I

I\

x

0

I

0.4

W

9 0

I I

0

I I

-

m

8 a

1 -

i

.-

5

I

I

I

I

I

0.3

I

I

I

I

I

I

I I I

I

I

I

z .-

I

c

m

W d - w P o o m P

'*

I

n

I

I I I

L

E 0.2 c

I

0

1

I

4

W

0

0.I

II

I

I

I

I

I

r-

0

I

I

I

I I

0 \o

m

1

I

8

x x

I

I

I

I I

L

3

8

I l l

globall i minimum \ trajectory

%

W

x x

I

In c

x

m

I

jlobal naximum / \ ,rajectory /

r0

4

3

240

I20

0

0 W

C

I

I I

1.

IO0

I

\

\# I

I I I i 1

I

\

I

200

I:i

I

I I

I

300

0

8

x

x 0 m

8 N 0

9 0

3

0

x

0

8 c

Figure 8. Positions of the critical points of the potential energy curve for the pendulum model, Figure 5 , transformed according to (9) for various times t (see Table 11). Dashed (solid) lines correspond to positions of minima (maxima).

in this case by the present method corresponds to a wide and fairly deep (although not the deepest) double minimum that might be preferred because of the entropy factor. The important point is that, in all cases tested, the minima which move closer do not meet; instead, one of them (together with the hump that previously separated two minima) disappears. This is essential during the reversing procedure described in section 2.5, because in such a case the result of this procedure is uniquely defined. As shown in Figure 8 (the pendulum model, Figure 5 and Table 11), for sufficiently small time steps A f , the reversing procedure remains on a separate line ("trajectory of the critical point"). This is important if one recalls that the present method

Conformational Analysis of Molecules

The Journal of Physical Chemistry, Vol. 93, No. 8, 1989 3345

+

+

T ( t ) f = T , ( t ) T , ( t ) f =F(x,y,t) = a(x2 + 2 t ) bb2 + 2t) c exp(-y2t) cos yx + d exp(-S2t) cos Sy (26)

0c '

When t tends to infinity, the third and fourth terms on the right-hand side of (26) vanish, while the remainder has the unique and immobile minimum at (0,O) at any time t. This is, therefore, what our method gives as the position of the global minimum, the same as Bohachevsky et al. obtained by applying their stochastic procedure.I6 Figure 9. The pseudopropane molecule. The conformation shown in the figure corresponds to the values of the dihedral angles 8, and O2 equal to zero (the three carbon atoms as well as the fluorine and the terminal chlorine atoms are in the same plane). TABLE 111: Evolution in Time t of the Dihedral Angles (in Degrees) Corresponding to Minima for a Pseudopropane Derivative Molecule (Figure 9); 0: and 4, Dihedral Angles for the ith Minimum 1

0;

e:

0: 9; 8:

9: 0: 0:

e; 8:

e: 0; 8: 0: 8: 88,

9; 8;

value of the orig function at the min

0.00

0.09

0.18

0.31

0.32

58 202 58 295 61 292 86 61 187 198 191 288 202 58 288 288 313 61

58 216 a

65 230 a

90 227 a

a

-2.98

a

-3.19

65 266 101 72 194 220 194 270 198 65 292 216 292 16

a

a

a

-3.20

a

a

a

-1.61

a

a

a

-2.98

202 241 198 76 279 229 a

216 234 a

216 234 a

a

a

-3.1 1

a

a

-2.54

5. Toward Multidimensional Models The one- and two-dimensional cases studied in the preceding sections have only theoretical, not practical, value, because the number of minima is small [except for the function of (25)]. Sometimes, application of the method of the present paper to many-dimensional models with many minima is easy as shown below. The Griewank functions studied by Snyman and Fattils in their multistart global minimization algorithm are easily treated by the present method. The functions have the form

The authors investigated their method using two functions: G1 with n = 2, d = 200 and G2 with n = 10, d = 4000, the first having hundreds and the second thousands of minima. The present method treats this case as follows. The operator T ( t ) applied to the function gives [see (5)-(8) and (12)-(14)]

T(tlf = T,(t)Tz(t)...Tn(tlf

-3.25 the global min -2.64

Minimum disappeared after transformation T ( t ) is applied.

When t is large

T(t)fz

c" 1d

-(xk2

k=l

is based on the assumption that the single minimum at t = to is genealogically related to the global one at t = 0. As can be seen from Figures 4 and 6 in the examples described above, the transformation T ( t ) ,(3), did not alter the position of the global minimum very much. During the reversing procedure, the largest shift of the positions of the minima from those at time t = 0 to those at time t = to (see section 2.5) is 52'.

4. Two-Dimensional Models As an example of a two-dimensional problem, a substituted propane hereafter called a pseudopropane molecule, Figure 9, has been used; this molecule involves two dihedral angles. This choice represents a natural step toward molecules of interest in conformational analysis. The results, being a solution of (15), are given in Table 111. Again, as in the one-dimensional models, a single minimum was obtained after transformation T ( t )had been applied (for sufficiently large t ) to the potential energy surface, and it did not change very much in the reversing procedure compared to the position of the global minimum in the original potential energy surface. Interestingly (cf. the one-dimensional models), when the deformation parameter t increases, the minima disappear because of a pairing between a minimum and a maximum or a minimum (maximum) and a saddle point. The method of the present paper has also been tested in another two-dimensional example, the one studied by Bohachevsky et a1.I6 in their recently published annealing method for searching for the global minimum. These authors studied the function f(X,Y) = axz

+ by2 + c cos yx + d cos Sy

a, b

> 0; c, d < 0 ( 2 5 )

In this two-dimensional case, the operator T ( t ) acting on the function in (25) [see (5)-(8) and (12)-(14)] produces

+ 2t) + 1

(29)

which means that T(t)frepresents a multidimensional parabola with the immobile minimum position (0, 0, 0, ..., 0). This is what our method indicates as the unique minimum that becomes the global one after the reversing procedure. Synman and Fatti found this point 6 times out of 20 starts for G1 and 6 times out of 60 starts for G2. Difficulties begin at larger dimensions with more complicated functions, e.g., those encountered in conformational analysis, when an exponential growth of the number of minima occurs. Let us assume that the potential function used depends only on the dihedral angles (conformational analysis based on rigid geometry). In such a case, a potential energy function (if bounded) could be expanded into a Fourier series. If the function is not bounded (e.g., usually the Lennard-Jones potential), then it can easily be modified by assigning some high but finite energies for close atom-atom contacts. At large values of t , the most essential component of the Fourier representation of the potential energy function becomes that of the lowest frequency. The number of the lowest frequency terms depends linearly32on the number of dihedral angles. This enables one to expect to treat molecules of real chemical interest. Work on this problem is under way in our laboratory. 6. Conclusions The main conclusions of the present paper may be. summarized as follows: (32) The lowest frequencies corresponding to (A2) of the Appendix are associated with the following m's [m = (m,, m2,..., mJ]:( f l , 0, 0, ..., 0), (0, * I , 0, ..., 0). (0, 0, f l , 0, ..., 0). (0, 0, ..., *l). The number of the corresponding Fourier coefficients a, therefore, increases linearly with the space dimension n.

3346 The Journal of Physical Chemistry, Vol. 93, No. 8, 1989

Piela et al. By direct insertion into (Al), it can easily be verified that the solution of ( A l ) is of the form

%

E(81,82, ...On,?) = Camexp(-llml12t) exp[i(m,8)1

a2

I

m

r

I

where

I=

Figure 10. Definition of the geometry for 1-4 interactions used in (AS). It should be noted that dl and d, do not need to be identical with some chemical bonds; Le., the dihedral angle 0 can correspond to the rotation of one large rigid segment relative to another large rigid segment of a

given molecule. (a) An algorithm for finding the global minimum in the conformational analysis of molecules has been proposed. The method is based on the evolution in "timen of the initial potential energy hypersurface ending up with a hypersurface with only a single minimum that, hopefully, is related to the global one of the original potential. The evolution proceeds according to the physics of diffusion (or heat conduction), with the initial distribution of concentration (or temperature) corresponding exactly to the original potential. (b) The method has been tested successfully in several one- and two-dimensional examples of some chemical significance, as well as in multidimensional cases treated by other authors. Even in cases close to degeneracy, the method finds the global minimum correctly. The p r d u r e failed only seldomly. When this happens, instead of the global minimum, a wide and low-lying minimum is found that would probably be favored by the entropy factor. (c) In all models studied, the position of the single minimum in the transformed potential has been located close to the position of the global minimum of the original potential. This result suggests that the position of the global minimum may be estimated from the lowest frequency term of the Fourier series for the full potential hypersurface. (d) The deformation of the potential energy hypersurface may be particularly useful in molecular dynamics calculations. Even a moderate deformation (small t ) , being equivalent to lowering of potential energy barriers, might be able to produce large conformational changes that are difficult to achieve at t = 0 (no deformation). Then, the reversing procedure may be applied to obtain the behavior with the original potential. (e) Although the present approach is formulated for the purposes of conformational analysis (with the variables being dihedral angles in the range of 0 to 2 r ) , its significance extends beyond this particular application. Acknowledgment. We are very indebted to Dr. L. Z. Stolarczyk and Dr. M. Frankowicz for helpful discussions. This work was supported partly by the Polish Academy of Sciences (CPBP 01.12 and CPBP 01.15) and partly by the National Science Foundation (DMB84-0181 l), the National Institutes of Health (GM-14312), and the National Foundation for Cancer Research.

Appendix: Solution of the Diffusion Equation We present here a solution of the diffusion equation (15) n d2E(0,,...,8,,,t) - aE(8,,...,8,,,t) C (All ,=I

(A3)

at

ao,2

for the conformational energy function t ( O l , ...,8,) as a boundary condition: E(8,,...,8,,0) = e(81, ...,e,) where Bi are the dihedral angles defining the molecular geometry, and the symbol E for the energy now replaces F of (1 5). Because c ( 8 , , ...,On) is a periodic function with period 27r in each of the variables 81, ..., On, it can be expanded in a Fourier series (provided c is bounded): t ( 8 i 3 . . . d n )= L a m exp[i(m,QI m

(A2)

where m = (ml, ..., m,) is a series of integers and (m$) denotes a scalar product n

(mJ)= Cm,8, 1=I

n

llm1I2 = CmI2 1=1

The Fourier coefficients a, can be computed from (A2) by integration.

c(81,82,...8n) exp[-i(m,8)] de, de2 ... d8,

am = (2r)-" 10,211"

(A4) When we make use of the Lennard-Jones pair interaction potential, c(Bl,02, ...On) contains the terms rd and r-I2, where r is an interatomic distance. Since 9 is a trigonometric polynomial of dihedral angles 8, the coefficients a, can be computed in the vast majority of cases without calculation of the integral in (A4). Consider, for example, the one-dimensional case ( n = 1) r2 = d I 2+ d2' + d32+ 2dld2 cos a I + 2d2d3cos a2 + 2dld3(cos a 1cos a2 + sin a I sin a2cos 8 ) (A5) where the lengths d,, and the angles a, and 8, are shown in Figure 10. In complex form

+

r2 = cI exp(i8) co + c - ~exp(-i8) ('46) where cl, co, and c - ~are constants that can easily be related to those of (A5). The Fourier series for r-2 may be expressed as [c, exp(i8)

+ co + c-,

Cam exp(im8) m

exp(-iO)]-'

After multiplying both sides by [cl exp(i8) we obtain

+ co + c - ~exp(-iO)],

C(c-lam+l+ cOam+ clam-l)exp(im8)

1

('47)

m

(A8)

Comparison of coefficients on the left- and the right-hand sides of (A8) gives c_lal cooO cia-, = 1 for m = 0

+

+

+

+

and ~-~a,+, cOam clam-, = 0

for m # 0

(A9)

Therefore, if we know two coefficients, for example, a. and a,, we can compute all the other coefficients recursively from (A9). These two initial coefficients can be found by integration [(A4), where e is replaced by 1/r2], and the result is a0 = (Co2 -

4C-lCl)-1i2

a, = [ l - co(co2 - 4c,c-I)-1/2]/(2c-I)

(A101

What one really needs, however, are similar coefficients but for r-6 and r-I2 rather than for f 2 .The required formula can be

obtained by the following differentiation: a, for r-21-2=

(2n)-1~21[exp(-im8)[c-lexp(-iO)

+ co + cl exp(i8)]-/-']

de

+ co + cl exp(i8)l-l

de)

[ c - ~exp(-i8)

a/

= (-l)/(I!)-l -(am ac;

for f 2 , from (A10))

(A1 1)

for 1 = 2 and 1 = 5, respectively. Generalization of the recursive formula (A9) for n dimensions is very easy. It is more difficult^ to find analytic expressions for the initial coefficients. The problem has been solved for n = 2 (not shown here), and work is in progress in our laboratory for n > 2.