Protein molecular dynamics constrained to slow modes: theoretical

DOI: 10.1021/j100157a059. Publication Date: February 1991. ACS Legacy Archive. Cite this:J. Phys. Chem. 95, 4, 1817-1829. Note: In lieu of an abstract...
0 downloads 0 Views 2MB Size
1817

J. Phys. Chem. 1991, 95, 1817-1829 membrane structure at low salt (0-20 mM), namely, the surface hydration. Surface hydration is quite different at the pK, or at pH 9-10.’ When it is high (at pH 9-10), the T,is low and the vesicle is small. When it is low (at the pK,), the T, is high and the vesicle is large. Electrostatics seems to display a secondary role in determining the membrane structure at low salt because upon an increase in surface charge the T, can increase (before the pK,) or decrease (after the pK,).

Conclusions At low salt (0-20 mM NaCl), pH and salt effects on DHP vesicle properties such as size, phase behavior, and colloidal stability can be associated with a variable degree of surface hydration at the DHP bilayer. At half-dissociation of the phosphate polar head in water (pK, = 5.8-6.3), the degree of surface hydration is lowest due to the Occurrence of the maximal number of hydrogen bridges between phosphate polar heads. As a consequence, the surface area exposed to the water solution is lowest, generating the largest vesicle size and the highest phase transition temperature (T,) over the entire range of pH tested (3-10). At pK,, monovalent salt (3-20 mM) can disrupt hydrogen bridges, increasing surface hydration, generating smaller vesicles, and extensively decreasing vesicle aggregation at 70-73 OC.

+

At pH > pK, 1, hydrated Na+ ion adsorption (0.1-20 mM) plays an important role in providing surface hydration, reducing vesicle size, T,, and the extent of intervesicle aggregation at room temperature. At pH < pK, - 1, the presence of multilamellar structures in water (C = 0.1 mM) is strongly suggested by phase behavior data. Precisely when the DHP membrane possesses its highest degree of surface hydrophobicity (at the pK,), water permeability is maximal. With increasing C, at the pK,, maximal water permeability decreases due to a decrease in vesicle size. Surface hydration is apparently not important in determining the colloidal stability of LV at moderate monovalent salt concentration (0.16425 M), but ion adsorption significantly reduces vesicles aggregation, increasing colloidal stability at very low salt (0-20 mM).

Acknowledgment. Financial support from the Fundaslo de Amparo fi Pesquisa do Estado de Slo Paulo (FAPESP), the Conselho Nacional de Desenvolvimento Cientifico e Tecnol6gico (CNPq), and the Third World Academy of Sciences (TWAS) is gratefully acknowledged. S. Hix is the recipient of a CNPq undergraduate fellowship. We thank one of the referees for deep comments and valuable criticism. Registry No. Na(DHP), 60285-46-3; NaCI, 7647-14-5.

Protein Molecular Dynamics Constrained to Slow Modes. Theoretical Approach Based on a Hierarchy of Local Modes with a Set of Holonomic Constraints: The Method and Its Tests on Citrate Synthase Jean Durup Laboratoire de Physique Quantique (UA 505 of the CNRS), Universiti Paul Sabatier, 118, route de Narbonne, 31062 Toulouse, France (Received: April 23, 1990)

We present a new approach for theoretical simulation of protein dynamics constrained to slow modes in order to allow for an increase of the integration time step by an order of magnitude. It consists in building a hierarchy of internal vectors, according to the tree principle (generalization of Jacobi coordinates), and in using their polar coordinates, referred for each vector to the two vectors immediately lower in the tree, as local modes. Then the higher frequency modes are either fixed to their equilibrium values (holonomic constraints) or, as regards the cyclic coordinates which have no equilibrium value, affected with a friction coefficient. Conditions in the design of the tree are derived which allow the holonomic constraints to be applied as a correction after each step without needing an iteration. The method is tested on citrate synthase, a dimeric enzyme with 2 X 437 residues. The CHARMM-20 programs were used for topology, energy calculation, and basic dynamics. Comparisons between exact dynamics, 1-fs time step, and constrained dynamics, 8-fs time step, are presented in terms of the variations of some randomly chosen unconstrained coordinates along a 2.8-ps run, and in terms of statistics on average values and RMS fluctuations of all unconstrained coordinates in the same time lag. Finally, prospects for improvement and extension of this method are discussed.

I. Introduction Although the functional properties of proteins are now fairly well understood from the structural viewpoint, the molecular specificities responsible for their remarkable dynamical behaviors are still largely a matter of speculation. The fact that catalytic proteins, i.e., enzymes, in particular, have been developed during primeval evolution through a gradual selection among 101oo-lO1m possible nonfunctional polypeptides should warn us that very abnormal and specific structural or dynamical features might be of decisive importance in the control of biochemical processes. These considerations explain why the theoretical approach of protein dynamics pioneered and developed by the group of Karplus and by his followers,’” resting upon a description as accurate as 0022-365419112095-1817$02.50/0

possible of the macromolecule at the atomic level, is now accepted by most researchers in the field. Whenever comparisons were (1) McCammon, J. A.; Gelin, B. R.; Karplus, M.Nature 1977, 267, 585. McCammon, J. A.; Wolynes, P. G.; Karplus, M.Biochemistry 1979,18,927. van Gunsteren, W. F.; Karplus, M.Biochemistry 1982, 21,2259. Swaminathan, S.; Ichiye, T.; van Gunsteren, W.; Karplus, M.Biochemistry 1982,21, 5230. (2) Brooks, B. R.; Bruccoleri, R. E.; Olafson, B. D.; States, D. J.; Swaminathan, S.; Karplus, M. J . Comput. Chem. 1983, 4, 187. (3) van Gunsteren, W. F.;Berendsen, H. J. C.; Hermans, J.; Hol, W. G. J.; Postma, J. P. M.Proc. Natl. Acad. Sci. U.S.A. 1983, 80, 4315. (4) Levitt, M.J . Mol. Biol. 1983, 168, 595. Warshel, A. Proc. Natl. Acad. Sri. U.S.A. 1984, 81, 444. Levitt, M.;Sharon, R. Proc. Natl. Acad. Sci. U.S.A. 1988, 85, 7557.

0 1991 American Chemical Society

1818 The Journal of Physical Chemistry, Vol. 95, No. 4, 1991

possible between experiment and such treatments, a fair agreement was achieved and the theoretical work has brought more insight than experiment alone.’ The major problem in such a theory is the immense gap between the time scale of atomic motions, such as the vibration of a hydrogen bridge whose period lies in the femtosecond range, and the characteristic times of biochemical events, which extend from picoseconds to minutes.6 Therefore, extension of computer simulations of protein dynamics to times larger than a nanosecond requires artificial freezing of the fastest vibrational modes, not so much for reducing the number of degrees of freedom to be handled as for allowing larger time steps in the numerical integration of the equation of motion. This direction of research, which is central in the present work, prompts two kinds of questions, which will be discussed in sections I1 and 111: (i) Which kinematic parameters in fast motions may be considered as immaterial from the biophysical viewpoint, and (ii) how can fast modes be frozen without an alteration of the behavior of the slow modes? We then shall introduce in section IV a new method for constraining protein dynamics to slow modes, and show in section V how this method allowed us in a first step to increase by 1 order of magnitude the integration time step. The model system chosen for this study was an enzyme of the Krebs cycle, citrate synthase. Finally in section VI we shall discuss our results and the prospects of the extension of the present method to larger time scales. A comprehensive description of the various kinds of motions in proteins may be found in the review paper recently published by Brooks et aL9 A useful simple classification was that of Ringe and Petsko6 into atomic fluctuations, collective motions (either fast or slow, and either frequent or infrequent), and triggered conformational changes. Another important distinction, introduced among others by Alberding et a1.,I0 is that between stochastic and deterministic processes, more precisely between those processes where the available phase space is scanned in an ergodic way, and those, well exemplified by protein folding, where a final state is reached after the system has traversed only an extremely small portion of that space.”*12 A deterministic behavior has as an important consequence that one may expect a distribution of durations of the process not of the Poisson type but instead peaked above some threshold value, and this was shown to be one possible condition for high selectivity in enzyme-catalyzed reaction^.'^*'^ 11. How Do Fast Degrees of Freedom Bear on Slower Conformational Modes? The possibility of neglecting fast vibrational motions rests on the assumptions that the details of their time pattern bear no consequences on conformational changes taking place in times much longer than their periods, since these details are irreproducible and anyway averaged out on the time scale of theslow motions. This contention may be wrong. Biochemists often have speculated that charge migration through networks of hydrogen ( 5 ) Elber, R.; Karplus, M. Science 1987, 235, 318. (6) Brooks, C. L.; Karplus, M.; Pettitt, M. B. Ado. Chem. Phys. 1988, 71, 1 and references therein. (7) Case, D. A.; Karplus, M. J . Mol. Biol. 1979, 132, 343. Northrup, S. H.; Pear, M. R.; Morgon, J. D.; McCammon, J. A.; Karplus, M. J . Mol. Biol. 1981,153,1087. McCammon, J. A.; Northrup, S. H. Nature 1981,293,316. Mao, B.; Pear, M. R.; McCammon, J. A.; Quiccho, F. A. J. Biol. Chem. 1982, 257, 1131. McCammon, J . A.; Lee, Y.-C.; Northrup, S.H. J . Am. Chem. Soc. 1983, 105, 2232. Aqvist, J.; van Gpnsteren, W. F.;Leijonmarck, M.; Tapia, 0.J. Mol. Biol. 1985,183, 461. Aqvist, J.; Leijonmarck, M.; Tapia, 0. Eur. Biophys. J . 1989, 16, 327. Treutlein, H.; Niedermeier, C.; Schulten, K.; Diesenhofer, J.; Michel, H.; Bruenger, A.; Karplus, M. Jerusalem Symp. Quantum Chem. Biochem. 1988, 21, 513. (8) Ringe, D.; Petsko, G . A. Prog. Biophys. Mol. Biol. 1985, 45, 197. (9) Reference 6, Chapters VI-VIII. (IO) Alberding, N.; Frauenfelder, H.; HHnggi, P. Proc. Natl. Acad. Sci. U.S.A. 1978, 75, 26. (1 I ) Nemethy, G.; Scheraga, H. A. Q. Reu. Biophys. 1977, 10, 239. ( I 2) Reference 6, section VIIIB. (13) Ninio, J. Biochimie 1975, 57, 587; 1977, 59, 759. (14) Durup, J. Comments Atom. Mol. Phys. 1983, 12, 255.

Durup b o n d ~ , I ~ -the ~ ’ effect of giant dipole vibration,I8 or other electromechanochemical devices,I9 the propagation of vibrational solitons,2w22coupling of some vibrational degrees of freedom into collective motions exceptionally uncoupled from the heat bath,23*z4 or the development of high electric fieldsz5” might be responsible for energy or information transfer within a protein. As long as theory has not been able to reproduce such features in a given system, whether because these features actually are absent or because the theory is not accurate enough, there is no further loss of information in ignoring the detailed patterns of the fast motions. Fast motions, however, contribute energy and entropy to the slow modes even on the time scale of the periods or pseudoperiods of the latter: energy because the zero-point vibrational energy of the fast modes depends on the phase of the slow modes, entropy because the coupling between fast and slow modes induces phase changes in the latter. The first effect might be treated either with the help of algorithms of formal calculus of the third derivatives of the potential, or by using a continuous modelz7 of the fast oscillators. The second effect requires explicit entropy calculatiomZ8 Both effects were ignored in our calculations described in section V. 111. How To ‘Freeze” Fast Modes in a Dynamical Simulation of Proteins

In this section we are not dealing with the description of a transition between two states (either a reactive process or a conformational change), but with the simulation of conformational fluctuations in a macromolecule, which indeed is a prerequisite before the study of any transition. There has been developed two ways of getting rid of the fastest modes in protein dynamical simulations: either to run the dynamics with constraints or to replace the effect of fast modes with friction coefficients and random forces acting on the slow modes. A detailed description of the latter method, named stochastic dynamics, may be found in ref.29 We shall now concentrate on dynamics with holonomic constraints, which we used in the present work. A holonomic constraint is by definition30 any fixed relation between general coordinates and time. Here time will never have to be involved, and the constraints simply are relations between coordinates. Two kinds of constraints have been used up to now: either to fix the fastest normal modes of vibration to zero

(15) Nagle, J. F.; Morowitz, H. J. Proc. Natl. Acad, Sci. U.S.A. 1978, 75, 298. Nagle, J. F.; Mille, M.; Morowitz, H. J. J. Chem. Phys. 1980, 72, 3959. Nagle, J. F.; Tristram-Nagle, S.J . Membr. Biol. 1983, 74, I. (16) Metzler, D. E. Ado. Enzymol. 1979, 50, 1. (17) Burres, N.; Dunken, A. K. J. Theor. Biol. 1980.87, 723. (18) Frohlich, H. Proc. Natl. Acad. Sci. U.S.A. 1975, 72, 421 I . (19) Green, D. E. Ann. N.Y. Acad. Sci. 1974, 227, 6. (20) Davydov, A. S. J. Theor. Biol. 1977, 66. 319. (21) Hyman, J. M.; McLaughlin, D. W.; Scott, A. C. Physica D 1981,3, 23. Scott, A. C. Phys. Rev. A 1982, 26, 578; Phys. Scr. 1982, 25, 651. (22) Careri, G. In Structure and Dynamics of Nucleotides, Proteins and

Membranes; Clementi, E., Sarma, R. H., Eds.; Adenine Press: Guilderland, NY, 1984; p 63. (23) Somogyi, B.; Welch, G. R.; Dajmanovitch,S. Biochem. Biophys. Acta 1974, 768, 8 1. (24) Careri, G.; Gratton,

E.Biosystem 1977,8, 185. Careri, G.; Fasella, P.; Gratton, E. Annu. Rev. Biophys. Bioenerg. 1979, 8, 69. (25) Frohlich, H. Int. J. Quantum. Chem. 1968, 2, 641. (26) Hol, W. G. J.; van Duynen, P. T.; Berendsen, H. J. C. Nature 1978, 273, 443. (27) Thacher, T.; Ganesan, S.; Askar, A,; Rabitz, H. J . Chem. Phys. 1986, 85, 3655. (28) Reference 6, Chapter V. (29) Reference 6, Chapter IV. (30) Goldstein, H. CIassical Mechanics; Addison-Wesley: Reading, MA, 1950; section 1-3. (31) Brooks, B.; Karplus, M. Proc. Narl. Acad. Sci. U.S.A. 1983,80,6571. Reference 6, section IVF.

Protein Molecular Dynamics

The Journal of Physical Chemistry, Vol. 95, No. 4, 1991 1819

to lead to erroneous results as soon as one fixes not only distances or to fix local internal coordinates like bond distances and/or bond but also angles. angles and dihedrals to their equilibrium value^.^^"^ Also, both The reason for it is easy to understand mathematically. Let together have been used.38q39 us assume that our biomolecule is described by a set of n internal Both methods meet serious difficulties. coordinates, out of which a subset (qj,j = nf + 1, n) will subseNormal-mode analysis requires40 the diagonalization of the quently be fixed to constant values (the holonomic constraints), Hessian matrix, i.e., the matrix of second derivatives of the powhereas the remaining subset (qi,i = 1, nf)remain free and will tential energy function with respect to the coordinates. Since the method rests upon the assumption of harmonic motion around be used in the dynamical calculation. a minimum with respect to all coordinates, in protein dynamics We first let all coordinates behave freely. Let the part i iof the Lagrangian 1: depending upon the time the diagonalization of this huge matrix should be repeated in principle each time the system leaves the neighborhood of a given derivative qi of any qi be expanded as minimum for that of another one, an event which according to Elber and K a r p l ~ scalculations '~ takes place about each 0.15 ps. Besides, the assumed uncoupling of normal modes will not hold when the system is in the process of passing from one minimum where the X coefficients will in general depend on the coordinates. to another. Finally, the importance of nonbonding (van der Waals The momentum conjugated to qi is given by46 and electrostatic) interactions, which are far from harmonic, seems to make normal modes a priori inadequate in protein dynamics. Comparison of experimental and normal-mode spectra for some protein secondary structures indeed showed poor agreement.41 In spite of these intrinsic difficulties, dynamics constrained to and Lagrange equation will read slow normal modes was shown by Brooks and Karplus3' to give good results for main-chain fluctuations in bovine pancreatic (3) trypsine inhibitor (BPTI), as compared with exact dynamics. Levitt et aL3* and Noguti and Gd9 circumvented the size Thus the second derivatives (qi,i = 1, n) will be the solutions of problem by working in the space of the (cp, +, x) dihedral angles, a system of n linear equations. after freezing bond distances and angles to their equilibrium values. In a numerical algorithm such as the Verlet algorithm used in The latter authors39 simply used a Monte Carlo-Metropolis the CHARMM programs,2 the coordinates at step (p + 1) are method for describing fluctuations in BPTI, whereas the former determined from their second derivatives at step (p + 1) and from authors3* actually developed methods for running classical dythe variables and possibly their first derivatives at steps p and namics and studied not only the small BPTI (58 residues) but earlier. These second derivatives are given by n equations simpler various other proteins with up to 129 residues. Again very enthan (3) since usually Cartesian coordinates are used, but nevcouraging results were obtained by Levitt et a1?* especially in their ertheless the two systems of equations are rigorously equivalent. comparisons of experimental and theoretical RMS (root mean Let us now fix the holonomic constraints, Le., set each q) 0' = square) fluctuations, and their work provided a lot of insight into nf 1, n) to its predetermined value and each corresponding qj the collective motions. However, the range of problems to which to zero. The corresponding qjs of course should also be zero; the method is applicable is still an open question. In our opinion, however, their values obtained from the n equations (3) will always the (cp, +) angles are an excellent description of structure but their be slightly different from 0. Therefore, ignoring the last (n - nf) use in dynamics is quite paradoxical since the slightest change equations (3) and solving only the first nf equations for the qis in any of them produces large displacements far from the bond (n = 1, nf) after setting the q i s to 0 f o r i = nf + 1, n would be involved and raises opposing forces which have nothing to do with erroneous. In other words, simply setting the q[f to zero would the local torsion and are very far from harmonic. Undoubtedly leave us with n equations for only nf unknown qi s. Corrections the mixing of these coordinates into normal modes derived from will have to be brought to the 9:s (entering the X coefficients and the use of effective potentials largely corrects this deficiency but their derivatives, as well as a i / a q i ) and, implicitly, to the 4;s it seems hazardous to predict a priori to what extent it does. More for making the whole system of linear equations compatible. precisely, the issue for each normal mode is whether or not Various methods have been proposed and developed for solving harmonicity holds up to energies in the order of kT. In some this difficulty: computation of a correction potentia1,4'@' iterative instances the answer clearly is Another, related question correction of the 4;s (i = 1, nf) at each integration step (the SHAKE is how frequently in the dynamics calculation the effective poapplication of Gauss' principle of least constraint." tentials have to be updated. Finally, a new way of parametrization of protein coordinates Dynamics with holonomic constraints, e.g., on bond lengths or was recently introduced by Sklenar et aLso but not yet applied angles, would be a very practical and convenient method if one to dynamical calculations. could simply set the constraints, by freezing some internal coordinates to predetermined values, and then run the dynamics only IV. Method with the nonconstrained coordinates. Such a procedure, which is correct for normal coordinates at the level of the harmonic The method we propose for protein dynamics constrained to approximation, is incorrect in general and was ~ h o ~ n ~ ~ slow , ~modes ~ ~is based ~ ~ on, the ~ construction - ~ ~ of a hierarchy of local modes and the application of holonomic constraints fixing the fastest modes. We shall present in this section the successive steps (32) Sanejouand, Y.-H. T h k de Doctorat, Universite de Paris-Sud, Orsay, of the method and discuss its a priori advantages and disadvanFrance, 1990. tages. The tests that we performed will be described in section (33) Ryckaert, J. P.; Ciccotti, G.; Berendsen, H. J. C. J . Comput. Chem. 1977, 23, 327. V. (34) van Gunsteren, W. F.; Berendsen, H. J. C. Mol. Phys. 1977,34, 131I . In whatever concerns topology, energy calculations, and basic (35) van Gunsteren, W. F.; Karplus, M. Macromolecules 1982, 15, 1528. dynamics, we made a large use of the CHARMM* programs (36) Ryckaert, J. P. Mol. Phys. 1985, 55, 549.

+

(37) Tobias, D. J.; Brooks, C. L. J . Chem. Phys. 1988,89, 5115. (38) Levitt, M.; Sander, C.; Stern, P. S. J . Mol. Biol. 1985, 181, 423. (39) Noguti, T.; GO,N . Biopolymers 1985,24, 527. Nishikawa, T.; GO, N . Proreins: Srrucr., Funcr., Gener. 1987,2,308. Noguti, T.; GO, N. Proteins: Srrucr., Funcr., Gener. 1989, 5, 97, 104, 113, 125, 132. (40) Reference 30, Chapter IO. (41) Chou, K.-C. Biophys. Chem. 1986, 25, 105. (42) Reference 32, p 56. (43) Gottlieb, M.; Bird, R. B. J . Chem. Phys. 1976, 65, 2467. (44) Chandler, D.; Berne, B. J. J . Chem. Phys. 1979, 71, 5386.

(45) Toxvaerd, S. J . Chem. Phys. 1987, 87, 6140. (46) Reference 32, section 1-4. (47) Fixman, M. Proc. Natl. Acad. Sci. U.S.A. 1974, 71, 3050. (48) Pear, M. R.; Weiner, J. H. J . Chem. Phys. 1979, 71, 212. (49) Edberg, R.; Evans, D. J.; Morriss, G. P. J . Chem. Phys. 1986, 84, 6933. (50) Sklenar, H.; Etchebest, C.; Lavery, R. Proteins: Srnrcr., Funcr. Gener. 1989, 6, 46.

1820 The Journal of Physical Chemistry, Vol. 95, No. 4, 1991 TABLE I: An Example 01 Generalized Jacobi Vectors: Building tbe Arginine Residue

tree level

vector origin“

vector origin’

vector extremity

14 14

NH2 N

13

y

12

CNH

12 I1

CONH

NH2 H CaNtH3 NH2NH2 NH CrNzH4’ CNH CSNjHIZ

vector extremity

tree level

16

Hnll

Hn12

16

4 2 1

4 2 2

15 15 15 15 14 14

C,H N, N,I Nn2 C,HC,H, CbH,

C,HZ H, HH HH C,H2 N,H*

;; 2

5

o~~~

“Each origin or extremity is either-an atom or the center of mass of a group of atoms. The CH and CH2 groups are handled as ‘extended atoms”. (CHARMM-20 version), which Professor Martin Karplus kindly allowed us to use when we started this work. The list of atoms included all but the H a t o m nonconcemed with hydrogen bonding (in other words, groups CH, CH2, CH3 and S H were treated as “extended atoms”). 1 . Internal Vectors. We first build a set of internal vectors according to the tree principle, Le., each vector joins the centers of mass of two vectors immediately higher in the tree. Part of the upper vectors join two atoms, or an atom and the center of mass of a vector lower in the tree. This is a generalization of the Jacobi coordinates for triatomic systemss1 (for description and simple extensions, see refs 52 and 53). It should be mentioned that such a tree is very different from the “tree” recently introduced by Mazur and AbagyanS4 which rather is a chain of atoms and rigid bodies; however, the premises and the aims of these authors are similar to ours. [We are grateful to the referee who brought this work to our attention.] An example of our tree principle is given in Table I, which shows the way the arginine residues were built. For obvious reasons the same algorithms were used for building all residues of each kind. The tree includes four levels for each of the smallest residues, five levels for His and Phe, and six levels for Arg, Asn, Gln, Lys, Trp, and Tyr. Four levels were needed for building the secondary structures and five levels for the tertiary one, whereas the lowest level (for our dimeric enzyme) controlled subunit relative positions. Ligands entered the secondary structure in the same way as individual residues, except for “bound” water molecules which we associated to specific residues. Such a tree may a priori be built in an infinite number of ways. Our initial guidelines were simplicity and transferability. As we shall show later, when constraints are set their compatibility with molecular mechanics will impose stringent conditions on the design of the tree: however, when first building the tree one can only try to meet these conditions by use of intuition. We were careful to let appear explicitly in the tree the structural elements that are important from the chemical and biochemical viewpoints: aromatic rings, residues, a-helices and @-sheets,domains, subunits, and the ligand molecules. Otherwise, we simply tried to have each vector join two groups with masses in the same order of magnitude. The advantages of using this set of internal coordinates are the following. (i) The transformation of the atomic Cartesian coordinates (xi, y,, zj,i = I , N , N being the number of atoms) into the Cartesian coordinates of our internal vectors (Xi, yi, Zi, i = 1, N - 1) and vice versa, as well as the transformations of the velocities and of the forces (potential gradients), requires only two constant, purely topological matrices, inverse from each other, both very hollow (in our model case, less than 65 000 nonzero matrix elements out of N2 = 2.2 X IO6), and therefore extremely fast to operate. (51) Jacobi, C. G. J. Vorlesungen Gber Dynamik; Reimer: Berlin, 1866. (52) Johnson, B. R. J . Chem. Phys. 1980, 73, 5051. (53) Aquilanti, V.; Cavalli, S . J . Chem. Phys. 1%6, 85, 1355. (54) Mazur. A. K.;Abagyan, R.A. J. Biomol. Strucr. Dyn. 1989,6,815. Abagyan, R. A.; Mazur, A. K. Ibid. 1989, 6, 833.

Durup (ii) Owing to the tree principle, the kinetic energy operator remains diagonal in terms of the new velocities. Therefore, Hamiltonian dynamics reduces to simple Newtonian dynamics. (iii) There is a wide freedom in the design of the tree. (iv) Our internal vectors appear intuitively as a good representation of a hierarchy of local modes, since they take into account both chemical connectivity and spatial arrangement. (v) The most critical part of the algorithm, as will appear later, concerns the building of each residue. Thus once the algorithm has been optimized on any test protein it is completely transferable. The main but unavoidable disadvantage of the method is that half the bond distances and most of the bond angles do not appear explicitly in the internal coordinate set. At a further step of the procedure this will raise problems, e.g., for the backbone bond connecting two neighboring residues, which here is handled only indirectly through the vector joining the centers of mass of these residues. This problem will be treated in section V. 2. Internal Polar Coordinates. Conditions on the Holonomic Constraints. The second step consists in transforming the Xi, 6, Zicoordinates of the internal vectors in the external reference frame into polar coordinates, R,, 8 , ai, where for each vector 9, is defined with respect to the vector immediately lower in the tree, and ai with respect to the two vectors immediately lower. These polar coordinates actually are our local modes. They will be classified into slower and faster modes according to their frequencies, and our holonomic constraints will be the fixation of the fastest ones to their equilibrium values. At this point two conditions of compatibility with molecular mechanics already appear. From now on we shall designate by R’ (polar coordinates R‘, W,V)the vector connected with a vector R(R,8,9) and immediately lower in the tree; in other words, the center of mass of R is either the origin or the extremity of R’. These conditions are the following. Condition 1 . No 9’ coordinate should be able to take values close to 0 or A if either the 9’coordinate of the same vector R’ or the 9 coordinate of a vector R connected to R’ and immediately higher in the tree is constrained. This condition simply ensures that no dihedral angle to be set by a constraint might be an indefinite coordinate. Condition 2. No set of constraints should ever be conflicting with the equilibrium value of any bond distance, bond angle, or dihedral (improper torsion in the CHARMM language). In other words, once the constraints have been applied, the ensemble of configurations compatible with the canonical bond lengths, angles, and dihedrals should be a nonempty, as large as possible, subspace of the configuration space scanned by varying the unconstrained (free) coordinates. Now the dynamics under constraints was run in the following way (“projected dynamicsn4*). Each step of the integration of Hamilton equations was performed with the Xi, 6,ZiCartesian internal coordinates, and therefore was exact. The CHARMM d e s , using Verlet algorithm, could be applied with only very slight modifications. The energy and the forces of course were calculated from the usual atomic coordinates, which required the use of the above-mentioned topological transformation matrices. Then the Cartesian internal coordinates were transformed into the polar ones, and the latter were corrected by application of the constraints and then transformed back into the Cartesian coordinates of the internal vectors. Coming back to our discussion in section 111, we now shall look whether the tree may be built in such a way that these corrections might need only a single step, without any iteration. Let us first consider the representation in terms of the Cartesian coordinates X , U,Z of our internal vectors, with which actually each step of the dynamics calculation was performed. According to the tree principle, the &’s in eq 1 reduce to diagonal elements, so that for any value of index i eq 3 includes only the qi second derivative. Thus any qi ( i = 1, n f ) which is affected by no constraint will need no correction. Now the Cartesian coordinates of any vector in the tree depend only on the polar (R,9,9) coordinates of this vector and of those lower in the tree. Therefore, corrections to an unconstrained

The Journal of Physical Chemistry, Vol. 95, No. 4, 1991

Protein Molecular Dynamics

coordinate may occur only if there is a constrained coordinate in the same vector or in a vector connected to it and lower in the tree. One immediately sees that the latter case should be unfrequent since the lower vectors in the tree correspond to the slower modes. However, there are cases, e.g., where there is an unconstrained Q coordinate at some level and a constrained R coordinate at a lower level. Let us then turn to eq 3 in terms of our polar coordinates, keeping in mind that the two sets of differential equations are mathematically equivalent. Simple algebra provides the following momenta associated with any polar coordinate: (4)

Pe

I- $1 +

pR20 + p R R R.-

(

po = pR2 R R ’

terms of higher rank (5)

- + terms of higher rank

(6)

1:)

where is the reduced mass associated with the internal vector involved, the pripe designates the vector immediately lower in the tree, R and R’ indicate unit vectors, and the parentheses in (6)designate a mixed product of three vectors. From what was discussed above we may ignore the contributions in ( 5 ) and (6)from higher vectors in the tree. Now if we come back to eq 2 and 3 and inspect which second derivatives will be coupled in our case, we first see from eq 4 that R will be coupled to no other second derivative. From (5) it appears that 8 will be coupled to the second derivatives of the 8‘ and Q’ coordinates of the just lower vector in the tree, as well as to the second derivatives of the angular coordinates of the next lower vectors, through the unit vector R’. Finally, from eq 6, through simple vector algebra it turns out that Q will be coupled to the second derivative of 8’ (unless 8 is constrained to ?r/2), of 9’ (unless 8’ is constrained to 7r/2), and of the angular coordinates of the lower vectors in the tree. Thus, the conditions that allow us to set the constraints in a single step, after each integration step, are the following. Conditions 3. (i) No vector R should have its 8 coordinate unconstrained if there is any constrained angular coordinate of any vector connected to R in the tree hierarchy and of lower rank. (ii) No vector R should have its Q coordinate unconstrained either if the 8’coordinate of the vector R’just lower in the tree is constrained (unless the 8 coordinate of R is constrained to ?r/2), or if the Q’ coordinate of R’is constrained (unless the 8’coordinate of R’is constrained to 7r/2), or if there is any constrained angular coordinate in any vector connected to R’in the tree hierarchy and of lower rank. It must be noticed that none of our three conditions can be checked when building the tree, since the choice of the constraints is made afterwards, from the observed frequencies of the modes corresponding to the polar coordinates and possibly according to other later decisions. We shall come back to that point in section V. A final difficult is that some fast modes, which have to be frozen in order to allow for an increase in the integration time step, have no definite equilibrium value. This is true for the almost freely rotating groups, such as the NH3+group of lysine, the carboxylate and the amide groups, and to a large extent for the NH2 groups of arginine. These groups undergo fast librations, due to their involvement in hydrogen bonds or salt bridges, around mean orientations which depend on the specific lacation of their partners in these interactions. In the present first step of the method we handled this difficulty by setting on such modes, instead of a fixed constraint, a damping by a friction coefficient. We thus were led to use for these modes the Langevin dynamics which is included in the CHARMM programs. In a later step of the method we intend to use another strategy, adapted to integration steps in the order of 0.1 ps. This will be

1821

to remove the polar hydrogen atoms from explicit dynamics and to include directly a hydrogen-bond potential function, depending solely on the coordinates of the donor and acceptors atoms (all acceptor atoms present in some neighborhood) and of their antecedents. This potential function should be a mean free energy, determined from the allowed motions of the polar hydrogens. Such a technique could even be extended to include a logical variable describing proton transfer, and possibly taking into account quantum tunneling.55 3. Setting the Constraints. Dynamics with a Larger Time Step. For the usual oscillatory functions, simple integration algorithms (without a predictor-corrector) require an integration step allowing a minimum of 7 steps per period and preferably 10. Once an integration step has been chosen, these figures yield the limiting frequency vo over which all modes should be frozen. The frequency of each of our polar modes is easily determined by running an exact dynamical calculation during a time of about 100 vo (which is enough for the frequencies in the critical region to be determined within 10% uncertainty). The frequency distribution of all modes of the same kind (i.e., a specified polar coordinate of the vectors connecting the same atoms or atom groups in identical residues) provides an estimate of the changes in frequency which any of these modes may experience in a longer actual history. Since this history is not known beforehand, the constraints have to be applied either to all modes of the same kind or to none of them. These considerations, along with the three conditions given in section IV.2, are the basis for setting our holonomic constraints. Then we are ready for running the dynamics with the increased time step.

V. A First Test on Citrate Synthase I . General Choices, and an Incidental Observation. Citrate synthase, which catalyzes the reaction acetyl coenzyme A + oxalacetate * coenzyme A + citrate a part of the Krebs cycle in aerobic cells, was chosen as a typical enzyme of medium size (two identical subunits of 437 residues each), which undergoes conformational changes triggered by substrate binding, but has no known allosteric effector, at least as regards the pig heart enzyme which we considered in the present A comprehensive review of the behavior of this enzyme may be found in ref 57. Chemical evidence on its conformational dynamics is a current subject of research in the group of C. Ghtlis in Orsay, among others. Radiocrystallographic data for three conformations of pig heart citrate synthase (two “closed” forms and an “open” form) were published by Remington et aLS8 The enzyme sequence had been determined by Bloxham et alas9 The present work used the data on the closed monoclinic form, which were given at a 1.7-A resolution, and which include the citrate and coenzyme A substrates, as well as 95 “bound” water molecules. We first performed an energy minimization of the structure, using the Powell conjugate-gradient method,6O in its standard CHARMM version with the SHAKE algorithm.61 The options we used in the calculation of the van der Waals and electrostatic interactions, in the minimization as well as later in the dynamics, were a unity dielectric constant with shifted potential and special 1-4 interactions with reduced electrostatic forces.2 The bound water molecules were treated according to the ST2 model, which till recently was shown to provide a reasonable description of the electrostatic interactions.62 We are conscious that we may not (55) Zheng, C.; Wong, C. F.; McCammon, J. A.; Wolynes, P. G. Chew. Scr. 1989, 29A, 17 1. ( 5 6 ) Kurz, L. C.; Drysdale, G . R.; Duckworth, H. W. Gordon Conf. Enzymes, Coenzymes Metab. Pathways 1989. (57) Wiegand, J.; Remington, S.J. Annu. Rev. Biophys. Biophys. Chew. 1986, 15, 97. (58) Remington, S.;Wiegand, G.;Huber, R. J . Mol. Biol. 1982,158, 1 1 1 . Wiegand, G.; Remington, S.;Deisenhofer, J.; Huber, R. J . Mol. Biol. 1984, 174, 205. (59) Bloxham, D. P.; Parmelee, D. C.; Kumar, S.; Wade, R. D.; Ericsson, L. H.; Neurath, H.; Walsh, K. A.; Titani, K. Proc. Natl. Acad. Sci. U.S.A. 1981, 78, 5381. (60) Powell, M. J. D. Math. Programming 1977, 12, 241. (61) van Gunsteren, W. F.; Karplus, M. J . Compur. Chem. 1980, I , 266.

1822 The Journal of Physical Chemistry, Vol. 95, No. 4, I991

have used all best options in the potential energy cal~ulation,6~ but this should not affect our present work which is essentially methodological. Also, we delayed to a later step the inclusion of the solvent in the calculations. At the end of the minimization process the rms deviation of the atomic positions with respect to the original crystallographic data was 1.4 A. We took into account the subunit-subunit interaction but imposed the preservation of the symmetry with respect to the twofold axis, in the energy minimization and as well in the dynamics. This assumption avoids a doubling of the number of degrees of freedom, which would affect the size of the data arrays and double the computer time. The latter is mainly controlled by the number of nonbonding interactions, which under the assumption of symmetry conservation increased by only 12% on inclusion of the subunit-subunit interactions. From the biochemical viewpoint this assumption holds or not according to whether the subunit behavior is concerted or alternate, which to our knowledge is not yet known for citrate synthase. An interesting incidental observation that we made when computing the subunit-subunit interaction, and which does not seem to have been mentioned earlier, is that two atomic groups (and only two) are located almost exactly on the symmetry axis, according to the crystallographic data, and therefore we had to move them slightly to avoid a coincidence of their coordinates with those of the identical groups of the other subunit. Now both these atomic groups are C, methyl groups, from Met 45 and Met 127, respectively, which sit at both ends of the intersubunit interaction area. Since there are only 15 methionines out of 437 residues in each subunit, it seems extremely unlikely that the above observation is a pure coincidence. More precisely, in terms of cause probabilities, the 5 X IO-) probability that two randomly chosen atoms (in the sense including the actual atoms and the "extended atoms" in CHARMM designation) would be identical atoms of identical residues has to be compared with the a priori probability of a functional origin of that coincidence. It thus does not seem unlikely that methioninemethionine interactions would play some role in dimer formation, and this has to be tested experimentally. 2. Determination of the Constraints. As indicated in section IV.3, in order to allow an increase by an order of magnitude of the time step in the integration of the classical equations of dynamics, we first had to determine the frequencies associated with the internal polar coordinates defined in section IV.2, which we did simply by counting the number of sign changes of the deviation of each coordinate with respect to its mean value, in a dynamics calculation including 1000 I-fs steps. The validity of the method was immediately apparent from the fact that for most of the coordinates the frequencies obtained for the same internal vectors in all residues of the same kind were close to each other. We then selected for freezing (or damping, see section IV.2) essentially all modes of frequencies larger that 1.4 X 1013s-' and a number of modes in the range of (1.0-1.4) X 1013 s-', the ensemble of decisions being adjusted to meet all three conditions described in section IV.2. Throughout this work the correctness of the choice of the constraints was verified by running the constrained dynamics, as will be described in sections V.3 and V.4: each error, in particular any violation of the three above-described a priori conditions, led to divergence of the Verlet algorithm, resulting in unacceptable local heatings, which we carefully monitored. The most difficult choices occurred in cases where the behavior of a couple of residues of a given kind differed from that of all others; this happened typically for basic residues, where the hydrogen-bonding interactions vary strongly among identical residues placed in differing environments. Keeping in mind that these different behaviors are (62) Kochanski, E.; Rahmouni, A.; Wiest, R. Modelling Molecular Structure and Properties in Physical Chemistry and Biophysics, Presented at the 44th International Meeting, Division de Chimie Physique, Societe Frangise de Chimie, Nancy, France 1989, 13. (63) Loncharich, R. J.; Brooks, B. R. Proteins: Struct., Funcf., Genet. 1989. 6, 32.

Durup indicative of the events which a given individual residue may experience in its actual history which is much longer than the duration of our simulations, we adjusted our choices so as to satisfy all individual cases, while applying each decision to all identical vectors of identical residues. Our final choices, some of which will be further explained in section V.4, are reproduced in the Appendix. 3. Freezing the Constrained Modes. In the present methodological study it was important to make comparisons between an exact 1-fs time step dynamics and the constrained 8-fs time step dynamics (the choice of an increase by a factor of 8, rather than 10, was dictated by the simplification in developing the algorithm with successive doublings of the step). For such comparisons to be possible, it was necessary to freeze the constrained modes progressively, with the hope of not perturbing too much the trajectories. This was obtained by setting successively, along 106 1-fs steps of the dynamics, the constraints which consisted of 5481 internal coordinates fixed to their median values (for the ensemble of identical vectors of identical residues), as obtained from a foregoing 1000 1-fs step exact dynamics, and 327 internal coordinates affected with a friction coefficient, out of a total of 13 687 internal polar coordinates (we deal with 4563 internal vectors, but the lowest one in the tree, which joints the centers of mass of the two subunits, remains perpendicular to the twofold symmetry axis, and its direction is fixed since no overall rotation of the macromolecules is allowed in the dynamics, so that only the length of this vector is a variable). The frozen coordinates were not set brutally to their present values; rather, at each step the algorithm selected only those variations of these coordinates which brought them closer to these values. Finally, the friction coefficients (Langevin dynamics) were given a uniform value corresponding to a damping rate constant of 8 X l o t 3s-l. In a regular study of conformational fluctuations the significant dynamical simulation is preceded by a thermal equilibration stage of ca. 30 ps. In the present methodological study, we did not feel it necessary, inasmuch as we had to save computer time since we wanted to check the validity of our approach before applying for vectorial computer facilities. Thus we simply performed a 1 IO-fs heating, starting from random velocities corresponding to 80 K, up to a final temperature of 278 K, and thereafter a 1-ps thermal equilibration. Then we started the freezing as explained above. To display the behavior observed in this freezing stage we decided to compare trajectories for randomly chosen unconstrained coordinates in a regular 128-fs dynamics and in the dynamics run across the same time lag with the freezing process carried out in the first 106 fs. The regular dynamics used the Cartesian coordinates of the atoms in the standard way, whereas the constrained dynamics used the Cartesian coordinates of our internal vectors as explained in sections IV.1 and IV.2. In both dynamics the integration time step was 1 fs. The randomly chosen unconstrained polar internal coordinates were classified according to whether they belonged to vectors with 0, 1, or 2 frozen polar coordinates. In each of these groups, consisting of five to nine vectors after elimination of those where the overall variation of the coordinates was too small for the comparisons to be interesting, we chose and displayed in Figures 1-3 some extreme cases (best and worst comparisons) and some typical cases (median quality of comparison). From Figures 1-3 it appears that the freezing process alters the phases and amplitudes of the fluctuations in the unconstrained coordinates without changing the overall trends. Typical deviations are well examplified in Figure 3, where we plotted only variations in order to allow for an expanded scale. The fact that the initial parts of the curves generally coincide (except in Figure 3, because the freezing of the coordinates of the arginine terminal groups happened to take place in the very beginning of the freezing process) simply shows that the mathematical identity of the Newtonian dynamics performed either with the atomic coordinates or with the Cartesian coordinates of the internal vectors is well preserved by the algorithm. The peculiar behavior of the Tyr 167 C,C,HC,,HC~HC,HC, Q, coordinate in Figure 2 deserves some comments. This angle

The Journal of Physical Chemistry, Vol. 95, No. 4, 1991 1823

Protein Molecular Dynamics

ior

Degrees 240.

rad

A 0.2

I

PHI

I80 *

120.

0.1

60.

-0.0

-0.1

-0.2

-0.3 0.00

0.05

0.10

0 . I5

0.00

Time, ps

Figure 1. Freezing the constrained coordinates. Comparison between the exact (full line) and constrained (dashed line) dynamics in two extreme cases and a median one: the variations of the unconstrained @ coordinate (Rand 8 being constrained)of the Lys 112 C-NH (label I ) , Asp 237 C-02- (label 2), and Thr 247 0-H (label 3) vectors. The Asp 237 C-0; case shows the effect of a friction coefficient, which was applied at time 0.049 ps (see the arrow).

0.05

0.10

0.15

Time, p s

Figure 3. Freezing the constrained coordinates. Comparison between the exact (full line) and constrained (dashed line) dynamics in a typical case: the variations AR, Ae, A@ of the unconstrained coordinates of the

CH(CH2)3NH-Ct(NH2)2vector with respect to their initial values (3.81 A, 0.39 rad, 0.97 rad, respectively). Units: A or radians.

Arg 121

is the dihedral between the plane of a tyrosine ring and the C,C& plane. It moves from -60" to +60° in both simulations but in the exact dynamics by a +240° rotation whereas in the constrained dynamics it is by a -120" rotation. The fact that in spite of this big difference in the directions of motion the final orientation is the same seems to indicate that the overall orientation change was the result of a fluctuation in the environment of this tyrosine ring and that this change was essentially the same in both dynamics. The same observation applies to the effect of damping (curve 2 of Figure 1): the damped curve follows the trend of the unconstrained dynamics, clearly as a result of environmental strains. 4. Constrained Dynamics with an 8-fs Integration Step. Our first tests rapidly showed that our constraints were not sufficient to allow this 8-fold increase in the integration time step, so that -.--------------I we had to increase the total number of constraints by almost 50%. PHI 1 There were two reasons for this. \ (i) In some cases the relative motion of two residues, or of two moieties of a large side chain with respect to each other, although slow, led a t some step to a slight compression or expansion of the -60. atomic bond between them (a bond among those which were not explicitly present in the tree), and by the action of the opposing forces this effect was progressively enhanced in the further steps in a resonant way. No such effect had occurred in the 1-fs constrained dynamics. We then decided to make use of the SHAKE algorithm6' (see section 111) on the limited set of bond lengths joining two heavy groups and not fixed directly or indirectly by the constraints. As SHAKE constraints we used the bond equilibrium lengths. The -180. I I I I amount of such additional constraints was 1114 out of 4743 0.00 0.05 0.10 0.15 atom-atom bonds in the macromolecule. (ii) For some vector kinds a couple of individual vectors whose Time, PS frequencies were significantly higher than the median value for Figure 2. Freezing the constrained coordinates. Comparison between their kind could not adapt to the increased time step. This was the exact (full line) and constrained (dashed line) dynamics in two exto be expected, as earlier discussed (section V.2), but in some cases treme cam: the variations of the unconstrained 8 and @ coordinates (R we waited until the present test before making the decision to add being constrained) of the Ile 200 C,,H2-CIH3 (label 1) and Tyr 167 a new constraint to the ensemble of vectors of the kind involved. C,C,,HC,,H-C,,HC,,HC, (label 2) vectors. Degrees

A

1824 The Journal of Physical Chemistry, Vol. 95, No. 4, 1991

Durup

Degrees

1

Degrees

360

240

I

I PHI 2

PHI 3 120

-120.

4

-180.

-

PHI 1

-240 I

0.00

I

0.50

I

1.00

I

1.50

2.00

I

2.50

-360

1 1

I

0.00

0.50

In all such cases the relevant polar coordinates had no specific equilibrium values and therefore we applied to them the friction coefficient (see section V.3). These constraints, as well as the above-mentioned SHAKE constraints, are listed in the Appendix. We shall now compare the 8-fs constrained dynamics with the regular I-fs dynamics, along a time lag of 2.8 ps. In Figures 4-6 we show the trajectories for the same polar coordinates as in Figures 1-3. Time 0 in Figures 4-6 is identical with the final time 0.15 ps in Figures 1-3, so that the curves are the following of the preceding ones but with a time scale contracted by a factor of 20. Comparisons for slower motions, randomly chmen to represent the successive steps in the building of the secondary structure of the protein, are shown in Figures 7-10. In all cases in Figures 4-10, the phase coincidence between exact and constrained dynamics disappears within at most 1 ps, as might have been expected. The question is how much the other characters of the trajectories, namely, the amplitudes, frequencies, and degrees of irregularity of the fluctuations are preserved. First of all, the high-frequency oscillations in the exact dynamics disappear in the constrained one, especially in the cases (Figures 4 and 5) where part of the coordinates of the involved vector have been frozen. These oscillations clearly are due to mechanical coupling between the polar coordinates and would not appear in the exact dynamics if the 0 coordinates had been purely conformational modes, which they are not in general in our method. On the other hand, it is remarkable that the behavior of a damped coordinate (Figure 4, label PHI 2) on the expanded time scale differs no more from that of undamped similar 0 coordinates. Regarding the gross structure it is often comparable in both dynamics, sometimes with a higher fluctuation amplitude in the exact one and sometimes with the opposite trend.

1 00

1.50

I

2.00

I

2.50

Time, ps

Time, ps Figure 4. Comparison between exact, 1-fs time step (full line), and constrained,8-fs time step (dotted line), dynamics in the same examples as in Figure 1: the variations of the unconstrained @ coordinate (R and 8 being constrained) of the Lys 112 C-NH (label I), Asp 237 C-0; (label 2), and Thr 247 0-H (label 3) vectors. The Asp 237 C-O,-vector was affected with a friction coefficient.

,

Figure 5. Comparison between exact, I-fs time step (full line), and constrained, 8-fs time step (dotted line), dynamics in the same samples as in Figure 2: the variations of the unconstrained 0 and 8 coordinates (R being constrained)of the Ile 200 C,,H,-C:,H, (label 1) and Tyr 167 C,C,,HC,,H-C,,HC,,HCr (label 2) vectors.

The largest differences are observed with Thr 247 0-H (Figure 4), Tyr 167 C,C,,HC,,H-C,HC, HC, again (Figure 5), and Arg 42 1 CH(CH2)3NH-C+(NH2)2(hgure 6) Q coordinates. There occurs in the first case a 120' rotation of the threonine OH group only in the constrained dynamics, in the second case two forthand-back 180' rotations and a 360' rotation only in the exact dynamics (with the same final orientation in both dynamics, as earlier in Figure 2), and in the third case a slight contraction of the arginine side chain accompanied by a 60' rotation of this side chain with respect to the backbone only in the constrained dynamics. It is interesting to compare in these three cases the rms (root mean square) fluctuations of the involved coordinates, in the exact and in the constrained dynamics, when averaged over all identical vectors of identical residues in our protein (23 threonines, 19 tyrosines, and 19 arginines). These average rms fluctuations are for Thr 0-H 0, Tyr C3H2-C3H20,and Arg C4NH -CN2H4+ R and 0 coordinates, respectively, 46.8', 34.9', 0.156 and 23.3' in the exact dynamics, and 35.0°, 46.6', 0.188 A and 25.2O in the constrained dynamics. Thus the big differences which appear in our samples of Figures 4-6 seem to be purely incidental. However, to confirm this conclusion we should compare trajectories for these individual motions across much longer histories. We now turn to overall statistical comparisons between exact, 1-fs time step, and constrained, 8-fs time step, dynamics over the same 2.8-ps reference time lag, as in Figures 4-10. Figure 11 shows on a logarithmic scale the histograms of the relative deviations of the constrained dynamics with respect to the exact dynamics, for the average values and for the rms fluctuations of each of the unconstrained polar coordinates. The pattern appears to be roughly symmetrical, but with noticeably more modes with an increased rmsj7uctuation (3320 modes) than with a decreased one (2722 modes). The numbers of coordinates with increased (3077) or decreased (2965) average values are not

1,

The Journal of Physical Chemistry, Vol. 95, No. 4, 1991

Protein Molecular Dynamics 4.5

ior

rad

A

4.0

I

1825

or rad F.

4.0 3.5 3.5

3.0

3.0

2.5 2.5

2.0 2.0 1.5

1.0

i

fi

PHI

-

PHI

1.5

THETA 0.5 1.0

0.0

-0.5

PHI I

I

0.00

I

0.50

1

1

1.00

1.M

I

I

0.5

0.00

2.50

2.00

constrained, 8-fs time step (dashed line), dynamics in the same example as in Figure 3: the variations of the unconstrained R, 8, @ coordinates of the Arg 421 CH(CH2)3NH-Ct(NH2)2vector. TABLE 11: Relative Deviations between Average Values of Unconstrained R Coordinates in the 8-fs Time Step Constrained Dynamics with Respect to the 1-fs Time Step Exact Dynamics, for Each Kind of Vectors: Numbers of Kinds in Each Interval of Deviation Values -6%

-2%

-1%

-1%

+2%

1.00

1.50

2.00

2.50

Time, ps

Time, p s Figure 6. Comparison between exact, 1-fs time step (full line), and

0.50

Figure 7. Comparison between exact, 1-fs time step (full line), and constrained, 8-fs time step (dotted line), dynamics: the variations of the unconstrained R, 0, @ coordinates of the Val 357-Pro 358 vector. TABLE 111: Deviations between Average Values of Unconstrained 8 and @ Coordinates in the 8-fs Time Step Constrained Dynamics with Respect to the 1-fs Exact Dynamics, for Each Kind of Vectors: Number of Kinds in Each Interval of Deviation Values --x

--x2

x - x x x ---4 ---16 --3 2 + 5-x + 3 +X + I 8 +L - + -x4 + x x2 8

+6%

“Vectors Asp, Gln, lle CONH-side chain. bVectors Lys C3H5-C2H4NtH3,Trp CONH-side chain, two residues - three residues. significantly different from each other within statistical uncertainty. The dissymmetry in the rms fluctuation pattern is probably due to incomplete thermalization after the perturbations created in the slow motions by the changes in the terminal groups upon their freezing. Again, comparisons on longer time scales should ’be performed for definite conclusions to be drawn. The next tables and figures provide comparisons of mean values and rms fluctuations of the unconstrained coordinates when averaged over all modes of each kind. We distinguished 110 kinds of vectors, namely (i) each kind of vector in a given residue type, (ii) the ensemble of all vectors belonging to the coenzyme A and citrate substrates, and (iii) five sets of vectors involved in the secondary and upper structures and therefore joining groups of p and q residues, with @,q) = (1,l). (1,2), (2,2), (2,3), and higher. We disregarded all vectors involving the bound water molecules. This is because these molecules often came to the surface of the protein and then evaporated in the simulation, either in the exact dynamics or in the constrained one or in both, which gave rise to erratic behavior in the vectors connecting them to specific residues. Tables I1 and 111 give an overview of the deviations encountered in the mean values of the unconstrained R,8, and CP coordinates

a Vectors Ser 0-H, Glu CH-CH2, and C3H5-CO;, Arg C4H7NHCtN2H4, Phe C,C6,HC,IH-C62HC,2HCrHand C2H3-C6H5,and His CONH-side chain. *Vectors Met SH-CH3, Asp C2H3-C0;, Asn C2H3-CONH2, Gln C,H5-CONH2, Gly CONH-CHI, and Cys CONH-side chain.

when averaged Over all vectors of the same kind, as defined above. Large deviations occur only for the CP coordinates. In most cases they are cyclic coordinates, whose averages are not very meaningful. The only puzzling cases among those listed in the footnotes of Table 111are the variations in the average orientations of the histidine and cysteine side chains as referred to the locations of the neighboring residues. Here again a longer duration dynamics would be necessary for understanding the reasons of these variations. Finally, Figures 12-14 show the rms fluctuations of the unconstrained R,8,and CP coordinates when averaged over all vectors of the same kind, in the exact dynamics in abscissa and in the constrained dynamics in ordinate. The best correlation is observed with the R coordinates. The small negative intercept of the least-squares fit on the ordinate axis (-0.03 A) is significant and arises from small-amplitude oscillations in the exact dynamics, due to mechanical coupling with the other modes upper in the tree, which are frozen in the constrained dynamics. This effect would be invisible with the 8 and CP coordinates and therefore the straight lines we drew in Figures 13 and 14 are least-squares fits through the origin.

)126

io r 4.5

Durup

The Journal of Physical Chemistry, Vol. 95, No. 4, 1991 rad

1o r

22.5

rad

22.0

4.0

3.0 =.5

2.5

21.5

11

21.0

THETA

20.5

1 ~

THETA 1.5

1.0

-

0.5

-

0.0

I

20.0

I

0.00

0.50

I

I

I

1.00

1.50

19.0

I

2.00

2.50

Time, ps Wre 8. As in Figure 7, for the Ile 189 Tyr 190-Arg 191 Asn 192 v$ctor.

A

7.5

1

or rad

0.00

1.00

0.50

1.50

2.00

2.50

Time, p s Figure 10. As in Figure 7, for the vector joining the Met 390-Arg 401 and Ala 402-Gly 416 moieties of helix S. For the sake of a common scale in R (A) and 8 and CP (radians), 6 r were arbitrarily added to 8 and a, whose values actually lie around 90° and Oo, respectively. Modes f r a c t i o n

7.0

!7 ! i ! i

6.5

6.0 9.5 5.0 4,5 4.0

*I

.I

.I

I.

.I

I'

! i ! i

0 . 2. 3 :

.I

P.5

l-

3.0 2,s

THETA

2.0

I*

! i .I

I

.

0

-2

.I

- . !. i

1.5 1.0

0.5 0.0

-0.5 0.0

-1.0 -1.5

PHI I

0.00

I

0.50

I

I

1.00

1.50

8

2.00

-6

-6

-4

-2

0

Ln I R e l . d e v . 1

-2.0

-8.5

-4

I

2.50

T i m e , ps

wre 9. As in Figure 7, for the coenzyme A-OP loop vector. This loop &tween helices 0 and P includes 14 residues from Arg 3 13 to Thr 326. The CoA substrate actually is noncovalently bound to residues 46, 164, 273, 315, 317,318,368, and to residue 324 of the symmetrical subunit.5*

Figure 11. Histograms of the relative deviations, on a logarithmic scale, In I(M- M-J/M,I, in the 8-fs time step constrained dynamics with respect to the I-fs time step exact dynamics, where M is the average (full line) or the RMS fluctuation (dashed line) of each of the 6042 unconstrained polar coordinates. The left part of the abscissa axis corresponds to negative deviations, the right part to positive ones. The ordinate is the fraction of the 6042 modes.

The slopes of the straight lines in Figures 12-1 4 are 1.32, 1.1 1, and 1.24 for the R, 0, and CP coordinates, respectively. This slight

The Journal of Physical Chemistry, Vol. 95, No. 4, 1991 1827

Protein Molecular Dynamics

TABLE IV residue alanine" (33) arginine ( I 9j

~

vector or bond CH-CH, H-Hb

C C C C C

N=H,~

NHrNH2 C+-N2H4 Ne-H CdH,-N,H CH-CHI CIH3-C,H2 N.-C+

e

.

C C C D1

c c C c c

C S S C C C C C S C C

asparagine (1 8)

aspartate (21)

C C C C C

glutamine (17)

glutamate (24)

D8 DI

isoleucine ( 19) leucine (53)

lysineC(24)

C C C S C

D8 D8 D8 D1

D8

D8

C C

D8 D1

c c

D8 D8

C

valine (28)

D8

c

c c DI c c

DI

C

tyrosine (19)

C C

e

c c c c

c c

c

c c

c c

c C c

c c

c c c c c c c

c c c c c c c

c c c c c c c

c c c c c

c c

C

D8

C C C S

peptide bond" (436)

c c C C

c c c c C

C C

H20 (ST2 model) (95)

c c C C C C S

serine (29)

tryptophan (9)

C C C D1

C D8

~ C

C S threonine (23)

c c c c c c

vector or bond S-CH,

C C S

C C

C S C

IH

phenylalanine (12)

proline (22)

c c

glycine (33) histidine ( 1 4)

I

c

D8 D8 D8 DI

c c C S C C

DI D8 DI DI

C C C DI

c c C c c

cysteine (4)

+ residue methionine (15)

c c c c c C C C S C

C D8

c c D8 D8 c c C C C C C Sh S

C C D89

D8 D8

D8 D8 f D8

D8 D1

C S "Ala N-terminal NH3+group was handled like Lys NH3+. bThesevectors enter twice in each residue. cLys C-terminal COf group was handled like Asp and Glu CO;. "In the CONH peptide bond (or CON in Pro) the C = O was that of the preceding residue. 'These Trp vectors were built in our code in a slightly different way, where there was a 6 angle too close to zero. The present data are a better choice, but the difference is a minof one and was not worth running all our programs again. fD1 in Gly. KC in Gly. hExcept in Gly. systematic increase of the rms's in the constrained dynamics is in agreement with the histogram of Figure 1 1, and its constancy confirms our suspicion that it arises from an insufficient thermalization.

VI. Discussion Our philosophy in developing this method and applying it to citrate synthase as described in sections IV and V was to design the minimal algorithm which would be able to work for the expansion of the integration time step by an order of magnitude. This algorithm was in no way optimized and in this sense the first results appear as encouraging. We shall now discuss the possible

improvements. First of all the design of the tree could certainly be improved. Apart from the three conditions given in section IV which afe obeyed by the present algorithm, we have not worked out what could be the optimization principles with respect to molecular mechanics. For example, in our construction we included the a-carbon in the side chain, and then the latter was connected to the peptide bond for building the whole residue. We might as well have first built the full NH-C,H-CO backbone segment and then joined it to the side chain. We cannot as yet decide whether we made the better choice or not. Second, simple improvements, some of which exist in the

+

1828 The Journal of Physical Chemistry, Vol. 95, No. 4, 1991

Durup Bo.

C R M S (constr) 0.5

< R M S lconstr) HI. 0.4

x x x

40.

x

X

0.3 X

x

30.

x /

0.2

X

x x x x

20.

0.i

10.

0.0 0.1

0.0

0.3

0.2

0.

Figure 12. (Rms) fluctuations of the unconstrained R coordinates, after averaging over each kind of vectors: 8-fs time step constrained dynamics vs 1 -fs time step exact dynamics. The straight line is a least-squares fit among aII points. Units, A. C R M S (constr) 10.

4

I

X

xx 8.

x /

"/

7.

I

xx X

6.

5.

/ XJ ;

4.

%XX

3.

2.

0.

I

0

.

1

.

2

I

O

I

cx

0.4

(exact)

I

I

3 . 4 . 5 . 6 . 7 . 6 . 9 ,

CRMS

(exact)

Figure 13. (Rms) fluctuations of the unconstrained 8 coordinates, after

averaging over each kind of vectors: 8-fs time step constrained dynamics vs I-fs time step exact dynamics. The straight line is a least-squares fit among all points and through the origin. Units, degrees. CHARMM package, could be brought to the algorithm of con-

strained dynamics, at the expense of a comparatively small increase in computer time: (i) to set explicitly to zero the forces on the constrained coordinates, (ii) to do the same for the corresponding velocities (which enter the reinitialization steps of the Verlet algorithm), (iii) to proceed to an iteration between our constraints and the set of SHAKE constraints which we added to them, and (iv) to apply the friction coefficient selectively to a given polar coordinate rather than to the three Cartesian coordinates of the vector it belongs to. Third, as we mentioned in section 11, tests should be performed to check whether the neglect of the zero-point energy and of the entropic part in the free energy of the frozen modes are acceptable approximations or not. Apart from a purely theoretical investigation, careful inspection on a longer time lag of the fluctuation amplitudes of the slow modes and of the frequencies of occurrence of phase changes in their motion, in comparison with the exact dynamics, should provide more insight about it.

0.

I

1

I

1

10.

20.

30

40.

c R M S (exact)

Figure 14. Same as Figure 13, for the @ coordinates.

We hope that further discussions will shed light on which kind of improvement, either among the above-mentioned ones or not, should receive a priority. Our present prospects are first to introduce an implicit treatment of the hydrogen bonds, as sketched above in paragraph IV.2, then to introduce the solvent into the calculation, and thereafter to run comparison tests on longer time scales and after full thermalization. A further increase of the integration time step up to ca. 50 fs seems possible by damping another class of modes. One of the encouraging aspects of this work is that the introduction of a friction coefficient on those fast degrees of freedom which cannot be assigned an equilibrium value, although we did not like it partly because of the arbitrariness in the choice of the damping factor and partly because of its intricate intrusion into the thermodynamics of our system, seemed to affect very little the dynamics of the unconstrained modes. Later prospects, with an integration time step around 50 fs, would be to run constrained dynamics for as long a time as computer development will then allow, and from such runs performed in different points along a predetermined path in a conformational transition to derive the sets of actual collective conformational modes of the protein motion.

Acknowledgment. We are extremely grateful to Professor Martin Karplus who allowed us to use the CHARMM programs and who showed a friendly interest in this work. Much help and encouragement also came from my Orsay colleagues Dr. David Ptrahia, Professor Jeannine Yon, and Dr. Charis Ghtlis. All calculations were performed on the Sperry 1100-90 computer of Paris-Sud Informtique in Orsay, and we are much indebted to the whole P S I . team and especially to Mr. Yves Daigneaux for their constant help and attention. The terminal and connections that we used from Toulouse were built and maintained by Dr. Jean-Yves Grandpeix and his colleagues in the Laboratoire d'Energie Solaire headed by Professor Alain Cordier and we owe a great deal to their kindness. Finally, the author is grateful to Dr.Jean-Paul Malrieu and his colleagues who welcomed him in the Laboratoire de Physique Quantique and provided him with a warm atmosphere as well as with the necessary financial support from the C.N.R.S. and the Universiti5 Paul Sabatier. Appendix

This Appendix gives the list of the constraints that were applied to the various vectors in each kind of residue (Table IV). C: constraints on R, 8,or @.

J . Phys. Chem. 1991,95, 1829-1836 D1, D8: damping, applied already in the 1-fs time step dynamics or only in the 8-fs one, respectively. S: SHAKE constraint. The figure in parentheses after each residue name is the number of occurrences of the residue in citrate synthase. The constraints on the vectors belonging to the substrates

1829

(coenzyme A, citrate) and to the unique trimethyllysine (TMK 368) residue, which are specific to citrate synthase, are not listed here. Those on peptide bonds and on bound water molecules appear at the end. Registry No. Citrate synthase, 9027-96-7.

Micelles and Aggregates of Fluorinated Surfactants Wen Guo, Tommy A. Brown, and B. M. Fung* Department of Chemistry and Institute for Applied Surfactant Research, University of Oklahoma, Norman, Oklahoma 73019-0370 (Received: January 19, 1990; In Final Form: August 20, 1990)

The formation of micelles and aggregates by three kinds of fluorinated surfactants has been studied by F-19 NMR, light scattering, microscopy, centrifugation, conductance measurement, and pH titration. The surfactants studied are perfluoroheptanoic acid, its salts, and its ethoxylated amides. By the use of F-19 NMR, the mean lifetime of the surfactant molecules exchanging between the monomeric state and the aggregated state was found to be about lod s for the salts, 10-4 s for the acid, and lo-’ s for the amides. The reason for the increase in the mean lifetime for the acid and the amides is the formation of aggregates above the critical micelle concentration. The aggregates have an average diameter of about 100-200 nm, which is much larger than ordinary micelles. They can be distinguished under a phase-contrast microscope and can be separated from the surfactant dispersion by centrifugation. Conductance measurements of perfluoroheptanoic acid indicate strong counterion binding in the aggregates, the formation of which is favored by a reduction of the net charge in the hydrophilic head groups. It was found that large aggregates start to form when about 20% of the sodium ions in the salt are replaced by protons. Judging from the characteristics of the F-19 NMR spectra of the nonionic fluorinated surfactants (ethoxylated amides) and the optical isotropy of the droplets viewed under a polarizing microscope, it is suggested that the aggregates may exist in a cubic phase.

Introduction Surfactant molecules in an aqueous solution above a certain concentration (the critical micelle concentration, cmc) aggregate to form micelles, which are thermodynamically stable molecular assemblies with hydrophilic exterior and hydrophobic interior. The exchange of surfactant molecules between monomers and micelles is rapid and often thought to be diffusion controlled.’.2 The mean lifetime of the surfactant molecules is usually of the order of IO4 S.2

Fluorinated surfactants have some distinct properties different from those of hydrocarbon surfactants. Because of the size of the fluorine atom, a fluorocarbon chain is more “rigid” than a hydrocarbon hai in.^,^ It has been estimated that the effect of each CF, group toward micelle formation is roughly equivalent to 1.6 CH2 group^.^ Solutions of fluorocarbon surfactants above the cmc have much lower surface tensions than those of hydrocarbon surfactants6 While hydrocarbon surfactants usually form spherical micelles under normal conditions, fluorocarbon surfactants have a tendency to form rodlike mi~el1es.l~ Since F-19 chemical shifts are very sensitive to the environment, F-I9 NMR is an excellent tool for the study of fluorinated surfactant~.’-’~ In a recent study of some fluorinated nonionic ( I ) Lang, J.; Tondra, C.; Zana, R.;Hoffmann, H.; Ulbricht, W.J. Phys. Chem. 1975, 79, 276. (2) Aniansson, E. A.; Wall, S.N.; Almgren, M.; Hoffmann, H.; Kielmann, 1.; Ulbricht, W.;Zana, R.;Lang, J.; Trondre, C. J . Phys. Chem. 1976, 80, 905. (3) Asakawa. T.; Mouri, M.; Miyagishi, S . Lmgmuir 1989, 5, 343. (4) Hoffmann, H.; Kalus, J.; Thurn, H. Prog. Colloid Polym. Sci. 1983, 2 6 / , 1043. (5) Shincda, K.;Hato, M.; Hayshi, T. J . Phys. Chem. 1972, 76, 909. (6) Shinoda, K.; Kunieda, H. J . Phys. Chem. 1976, 80, 2468. (7) Muller, N.; Birkhahn, R. H. J . Phys. Chem. 1967, 71, 957. (8) Muller, N.; Birkhahn, R. H. J . Phys. Chem. 1968, 72, 583. (9) Muller, N.; Timothy, W . J. J. Phys. Chem. 1969, 73, 2024. (IO) Muller, N.; Platko, F. E. J . Phys. Chem. 1971, 75, 547. ( I 1 ) Muller, N.; Simsohn. H. J . Phys. Chem. 1971, 75, 942.

0022-3654/91/2095-1829$02.50/0

surfactants, we observed two separate sets of F-19 N M R signals for the surfactant molecules above the cmc.I4 From an analysis of the line widths, it was found that the mean lifetime of the surfactant molecules is of the order of 0.1-1 s. The slow rate of exchange was unprecedented, and the reason of this unusual behavior was unclear. Since then, we have carried out more experiments trying to determine factors controlling the rate of exchange of surfactant molecules in solution. In addition to the nonionic perfluoroalkyl polyethoxylated amides, perfluoroheptanoic acid and its salts were also studied. CF3(CF2)5COOH ( 1 : perfluoroheptanoic acid) CF3(CF2)5COO-M+ (2: perfluoroheptanoate salts)

CF3(CF2)SC(0)NH(CH2CH20),H (3: m = 2, FEA 6-2; 4: m = 3, FEA 6-3) These compounds were chosen because they have the same fluorocarbon tail, so that effects of the hydrophilic head group charge and the nature of the counterions on the aggregation and exchange of surfactant molecules can be compared. Corresponding cationic and zwitterionic surfactants were not available in pure form and therefore not studied. The major technique used in this investigation was F-19 NMR. Light scattering, microscopy, centrifugation, conductance measurement, and pH titration were used to obtain complementary information. A combination of these techniques enables us to gain a rather comprehensive understanding in the nature of aggregation of these systems. The aggregation of perfluorocctanoic acid and its salts was investigated in some detail by Hoffmann and co-workers by several classical techniques.I6-’* F- 19 N M R of sodium perfluoro(12) Muller, N. J . Phys. Chem. 1979, 83, 1388. (13) Muller, N. J . Phys. Chem. 1982, 86, 2047. (14) Fung, B. M.; Mamrosh, D. L.; ORear, E. A.; Frech, C. B.; Afzal, J. J . Phys. Chem. 1988, 92,4405. (1 5) Tiddy, G. L.T. Mordern Trends of Colloid Science in Chemistry and Biology; Eicke, H. S.,Eds.; Birkhauser: Basel, 1985; p 148.

0 1991 American Chemical Society