J. Phys. Chem. 1990, 94, 8091-8097
8091
Harmonic and Quasiharmonic Descriptions of Crambin Martha M. Teeter* Department of Chemistry, Boston College, Chestnut Hill, Massachusetts 02167
and David A. Case* Department of Molecular Biology, Research Institute of Scripps Clinic, La Jolla, California 92037 (Received: March 5, 1990)
We report a normal-mode analysis of crambin using various versions of the CHARMM, AMBER, DISCOVER, and AMBER/OPLS force fields, comparing static and dynamic features derived from these commonly used potential energy functions. We have also carried out a quasiharmonic analysis of crambin using a 100-ps dynamics simulation with the CHARMM potential energy function. Although the detailed character of the lowest frequency modes is sensitive to the force field used (and is different for the quasiharmonic and harmonic results), many general features of the frequency distributions and computed fluctuations are remarkably similar in the various calculations. The density of states in the C--20acm-l region agrees well in the quasiharmonic and harmonic calculations, but only for sampling times greater than about 50 ps; for shorter times, the density of certain low-energy vibrations is significantly underestimated. Trial calculations with increased dielectric constants or in the presence of a spherical shell of water atoms show fluctations markedly different from any of the “standard” vacuum calculations. Overall, these results provide new evidence about the insensitivity of computed dynamical properties to the assumed potential function and about the convergence of quasiharmonic molecular dynamics simulations.
I. Introduction Molecular dynamics studies of proteins and nucleic acids have been useful in understanding the nature of fluctuations about equilibrium structures and of transitions between such structures. The results of such studies depend upon the empirical potential functions used, and it is clearly of interest to know how various functions differ in their descriptions of macromolecular motions, even when one may not be able to determine which function is “best” in agreeing with experimental data. A number of studies have compared the structural consequences of various potential models for peptides and proteins,I-j but there have been few studies of the dynamical consequences of various potential models. In part this is because the results of molecular dynamics simulations depend upon the details of the equilibrium and sampling periods, so that it is often difficult to make meaningful comparisons between different simulations. One approach to this comparison problem is to compute vibrational normal modes, which gives an approximate picture of the dynamics of the molecule that can be characterized in terms of a relatively few parameters, Le., the character and frequency of the individual normal modes. Only a few examples of protein normal modes have been reported to date,4-12but advances in computing power, along with the emergence of machines with large amounts of physical memory, now make normal-mode calculations on small proteins a relatively routine calculation. Here we compare normal-mode calculations on the 46-residue protein crambin using eight variants of commonly used potential functions. Molecular dynamics simulations on anharmonic potential surfaces can also be analyzed in terms of normal modes by de( I ) Hall, D.; Pavitt, N . J . Comput. Chem. 1984, 5, 441. (2) Whitlow, M.; Teeter, M. M. J . Am. Chem. Soc. 1986, 108, 7163. (3) Roteman, 1. K.; Gibson, K. D.; Scheraga, H.A. J . Biomol. Strucr. Dynam. 1989, 7 , 391. Roterman, 1. K.; Lambert, M . H.; Gibson, K. D.; Scheraga, H. A. Ibid. 1989, 7, 421. (4) Brooks. B.: Karplus, M . froc. Narl. Acad. Sci. U S A 1983, 80, 6571-6575. ( 5 ) Yoshioki, S.; Abe, H.; Noguti, T.; Go, N.; Nagayama, K. J . Mol. Biol. 1983, 170. 1031-1036. (6) Go, N.; Noguti, T.; Nishikawa, T. froc. Narl. Acad. Sci. U S A 1983, 80, 3696-3700. (7) Nishikawa, T.; Go. N. froteinr: Srruct., Funcr., Genet. 1987, 2, 308. (8) Levy, R. M.; Karplus, M. Biopolymers 1979, 18, 2465-2495. (9) Levitt, M.; Sander, C.; Stern, P. S . J . Mol. Biol. 1985, 181,423-447, (IO) Roux, B.; Karplus, M. Biophys. J . 1988, 53, 297-309. ( I I ) Cusack, S.; Smith, J.; Finney, J.; Tidor, B.; Karplus, M. J . Mol. Biol. 1988, 202, 903. (12) Kottalam, J.: Case, D. A . Biopolymers 1990, 29, 1409-1421.
0022-3654/90/2094-8091$02.50/0
TABLE I: Potential Functions Used no. function comments 1
2 3 4 5 6 7 8
CHARMM CHARMM AMBER AMBER AMBER AMBER
parameters from versions 17/18, united atom, e = 1 parameters from version 19, united atom, c = 1 united atom, “large” van der Waals radii, c = r
all atom, c = r united atom, with e = 4r all atom, with c given by eq 1 united atom, c = 1 OPLS DISCOVER all atom, c = 1
termining “quasiharmonic” modes that give the same fluctuations as those determined in the dynamical ~ i m u l a t i o n . ~Levy ~ ~ ’ ~et al.Is have reported that such simulations can give results very different from true normal modes, but these simulations were fit to a simplified C a carbon model, and little is known about the convergence characteristics of more conventional calculations. Accordingly, we report quasiharmonic calculations based on a 100-ps vacuum simulation, which shows that the frequency spectrum and root mean square (rms) atomic fluctuations in a dynamics simulation can be remarkably close to those determined from a conventional normal-mode calculation using the same potential function. Section I1 gives the details of the potential functions we have used, and of the minimization and molecular dynamics simulations carried out. In section I11 we compare the frequency spectra and atomic fluctuations predicted by various potentials with each other and with experiment and in section IV discuss results of our quasiharmonic calculations. Some conclusions about the use of normal-mode analyses are given in the final section. 11. Method of Calculation A. Potential Functions. A large number of empirical potential functions have been proposed for peptides and proteins, and our choice of examples has been dictated in part by our familiarity with programs that can use these potentials to compute normal modes of a fairly large system. The eight functions used here are summarized in Table I. They include older and more recent parametrizations from the CHARMM programs,I6*”the united-atom (13) Karplus, M.; Kushick, J. N. Macromolecules 1981, 14, 325. (14) Levy, R. M.; Karplus, M.; Kushick, J.; Perahia, D. Macromolecules
..
__
1984 - - - -17 , 1370 -.
(IS) Levy, R. M.; Srinivasan, A. R.; Olson, W.K.; McCammon, J . A. Biopolymers 1984. 23, 1099. (16) Brooks, B. R.; Bruccoleri, R. E.; Olafson, B. D.; States, D. J.; Swaminathan, S.; Karplus, M. J . Comput. Chem. 1983, 4 , 187.
0 1990 American Chemical Society
8092
The Journal of Physical Chemistry, Vol. 94, No. 21, 1990
and all-atom representations of the AMBER potential^,^*^'^ the AMBER/OPLS potential*O (referred to hereafter as “OPLS” for brevity), and the all-atom potentials in the DISCOVER package.2’ In addition, Levitt et al.8*22have reported normal-mode results for crambin to which the present results can be compared. Details of the optional parameters are given in the following paragraphs, but in each case we attempted to use parameter options consistent with the common applications of these programs. The CHARMM parameters used come from both version 17/ 18 and version 19. Harmonic normal modes were calculated with CHARMM 1 7 / 1 8 and CHARMM 19, and quasiharmonic normal modes were from a dynamics simulation with CHARMM 1 7 / 1 8 . Three major differences between the parameters in CHARMM 1 7 / 1 8 and CHARMM 1 9 are ( I ) hydrogen-bonding parameters are adjusted based on ab initio calculations of formamide:formamide, formamide:water, and water:water dimers (Reiher, W.; Karplus, M., unpublished work); (2) the angular dependence of hydrogen bonds is eliminated; and (3) the Slater-Kirkwood treatment of van der Waals radii is replaced with simple mixing rules (Briinger, A., unpublished work). For all CHARMM calculations, a constant dielectric of 1.O was used. Nonbonding interactions were cut off at 8 A and had a switching function from 6.5 to 7.5 A. The nonbonded lists were updated every 50 cycles. For CHARMM 1 7 / 1 8 , hydrogen-bond interactions were computed if the distance was less than 5 A (with a switching function between 3.5 and 4.5 A) and if the A-H-D angle was less than 90” (with a switching function between 50” and 70”; 0” is linear); antecedent angle terms were included; and the cosine exponents for angular dependence of the hydrogen-bond energy on the A-H-D angle and AA-A-H angle were 4 and 2, respectively. (The CHARMM functions can also be used with an r-dependent dielectric in vacuo; with such a choice, minima have been found that are closer to the X-ray structure than those reported here [ Karplus, M., personal communication] .) The united-atom AMBER calculations used the “large” radii for aliphatic carbons2j and divided 1-4 nonbonded interactions by 8.0, as recommended. A distance-dependent dielectric with t = r (where r is the interatomic distance in A) was used, with a 9-h; cutoff for nonbonded and electrostatic interactions. The nonbonded lists were updated every 100 cycles and “frozen” during the final stages of the minimization. The all-atom calculations used the same dielectric and cutoffs. In addition to these “standard” calculations, two electrostatic variants were considered. The first used a dielectric of 4r; this reduces electrostatic effects and can yield minimized structures that are in closer agreement to the observed X-ray structure than do “standard” parameters2 The second variant used a sigmoidal d i e l e c t r i ~ ~ ~ , ~ ~ t
= 1
+ d - 1/Zd exp[-sr](s2r2 + 2rs + 2)
(1)
with d = 78 and s = 0.3. This function yields a small dielectric constant for short distances (less than about 5 A), which is important for the description of hydrogen bonds, but considerably attenuates longer range electrostatic interactions. The OPLS calculations are similar to the AMBER united-atom ones, but with a dielectric of unity. It should be noted that these potentials were primarily developed for use in solution studies and not for the gas-phase models used here: the OPLS potential is (17) Reiher, W. E. Ph.D. Thesis, Harvard University, 1985. (18) Weiner, S. J.: Kollman, P. A.; Case, D. A,; Singh, U.C.; Ghio, C.; Alagona, G.; Weiner, P. J. A m . Chem. SOC.1984, 106, 765. (19) Weiner, S. J.; Kollman, P. A.; Nguyen, D. T.: Case, D. A. J. Comput. Chem. 1986, 7, 230. (20) Jorgensen, W. L.: Tirado-Rives, J . J. Am. Chem. SOC.1988. 110, 1657. (21) Hagler, A. T. In The Peptides; Hruby, V., Ed.; Academic Press: Orlando, 1985; Vol. 7, pp 213-299. Most of the parameters of the current DISCOVER force field are listed in: Dauber-Osguthorpe, P.; Roberts, V . A,; Osguthorpe, D. J.; Wolff. J.; Genest, M.; Hagler, A. T. Proreins: Sirucr., Funci., Genet. 1988, 4, 3 I . (22) Levitt, M. J. Mol. Biol. 1983, 170, 723. (23) Jorgensen, W. L. J . Am. Chem. SOC.1976, 98, 4600. (24) Hingerty, B. E.; Ritchie, R. H.; Ferrell, T. L.; Turner, J . E. Biopolymers 1985, 24, 427. ( 2 5 ) Ramstein. J.: Lavery. R. Proc. Narl. Acad. Sei. USA 1985. 85, 7231
Teeter and Case included here, nevertheless, in order to gain an appreciation for the extent to which computed dynamics depends upon the details of the assumed potentials. Energy minimizations on the crambin crystal with the OPE potential have been shown to agree well with the X-ray structure (rms deviation of 0.17 A for the all protein heavy atoms).20 An additional “nonstandard” calculation was carried out with the OPLS force field, in which the protein was immersed in water so that a 5-A shell of solvent surrounded the molecule. (This was accomplished in a standard way, by placing the protein in a box of preequilibrated water molecules and removing those waters that overlapped the protein or which were more than 5 A away from any protein atom.) The complete system was minimized, and the protein fluctuations were computed. This is not a realistic liquid simulation, since the completely minimized state shows none of the thermal disorder characteristic of a liquid. It may have some features in common with low-temperature experiments on proteins in glassy solvents, where there is little thermal energy, but with no overall order such as would be found in ice. For our purposes, these results help delineate the extent to which computed fluctuations are sensitive to the ways in which the calculations are carried out. The DISCOVER dlculations were carried out with programs from Biosym Inc., modified to allow the computation of normal modes in systems as large as crambin. These used a nonbonded cutoff of 15 A with a switching function between 10 and 12 A. DISCOVER is the only potential we consider that includes “cross-terms” describing stretch-bend and stretch-torsion interactions. These are known to be important for the detailed description of vibrational properties but, as shown below, have little apparent effect on large-scale fluctuations. B. Energy-Minimization Calculations. The starting point for our minimizations in every case were the X-ray coordinates relined from 0.945-A data (Teeter and Hendrickson, unpublished work). Crambin is microheterogeneous with at least two sequences found in the crystals; we used the sequence with proline at position 22 and leucine at position 25. For each potential, adopted basis Newton-Raphson minimizations16 (for CHARMM) or conjugate gradients (for all other potentials) were used to obtain a local minimum in which the root mean square of the elements of the gradient vector was less than 2 X IO4 kcal/(mol A). In all cases, the initial steps of minimization used a constraining function to limit motion away from the crystal structure; these constraints were gradually removed, and the final structures reported here have no such constraints. It is worth noting that structures that are nearly completely minimized in vacuum simulations (like the ones discussed here) are frequently much further away from starting X-ray structures than are structures where the minimization is arbitrarily terminated at some preset mean gradient, such as 0.1 kcal/(mol A). Although it is clear that a large number of local minima exist for any protein, we have been unable to find fully minimized vacuum structures whose superposition with the starting X-ray structure is less than about 0.7 A (see below). C. Normal-Mode Analyses. The normal-mode frequencies were determined in the normal fashion by diagonalization of a mass-weighted second-derivative matrix of the potential energy. We carried out the calculations in Cartesian coordinates and in double precision, using Givens-Householder routines from the EISPACK library. These calculations are now nearly routine on computers with enough physical memory (ca. 6 Mbytes for crambin) to hold the second-derivative matrix entirely in core. For the united-atom calculations, determination of the frequencies and the lowest 500 eigenvectors required about 20 min of CPU time on a Convex C1 computer, once the minimized structures had been obtained. Since the normal modes evolve independently of each other, thermal fluctuations in molecular properties (such as Cartesian atomic fluctuations) can be determined as a sum over normal modes
The Journal of Physical Chemistry, Vol. 94, No. 21, 1990 8093
Harmonic and Quasiharmonic Descriptions of Crambin
TABLE 11: Superposition of Structures' 1 2 3
4 5 6 7 8
2
3
4
5
6
7
8
X-ray
0.55
0.66 0.92
0.97 1.19 0.80
0.86 1.13 0.59 0.88
1.14 1.37 0.91 0.36 0.90
0.73 0.85
1.05 1.23 1.09 0.90 0.97 1.04 1.16
1.02 1.20 1.00 0.72 0.83 0.76 1.07 0.77
0.70 0.99 0.97 1.15
'Values are rms difference in A of backbone heavy atoms (N, Ca, C, 0) for each pair of structures, following a rigid rotation that minimizes this difference.
Figure 1. Secondary structure cartoon of crambin, derived from a drawing by Jane Richardson, as modified by Marc Whitlow. Arrows represent &sheets and lightning bolts the disulfide cross-links.
where Axj is the deviation of the Cartesian coordinate of atom j from its average value, aijis the component of the ith normal mode along that coordinate, wi is a normal-mode frequency, and
kB is the Boltzmann constant. Fluctuations in internal coordinates, such a torsion angles, can be determined by replacing a with the appropriate projection on the desired internal coordinate. Because of the inverse frequency dependence in eq 2, most of the fluctuations arise from normal modes below 100 cm-I; fluctuations reported below are based on 500 normal modes, but we have verified for the AMBER united-atom case that these are identical (to the number of significant figures quoted) with results based on the complete spectrum. D. Quasiharmonic Calculations. For Cartesian atomic fluctuations, the sum in eq 2 can be simplified to yield'3*14,26 (AxiAxj) = k,T(FI),
(3)
where F is a second-derivative matrix of the potential
F,
= (a2V/ax,dxj)
(4)
Hence the matrix of fluctuations and the force matrix are inverses of each other. This relationship can be inverted to obtain an effective force matrix if the fluctuations are known independently, e.g., from a molecular dynamics simulation. The eigenvalues of the mass-weighted form of this effective force matrix determine the quasiharmonic frequencies of the system, which reflect the presence of anharmonicities sampled during the molecular dynamics simulation, but which can also be manipulated and discussed in the language of normal modes.I4 The quasiharmonic method has been used in a few cases to develop estimates of molecular e n t r ~ p i e s , ' ~but , ' ~not much is known about the convergence of such calculations or of the relation of quasiharmonic to harmonic frequencies for macromolecular systems. Our quasiharmonic calculations were based on a 100-ps vacuum trajectory, calculated with CHARMM version 17/18. For this, the temperature of the energy-minimized structure was raised to 300 K in 8 ps with steps of 0.001 ps. The structure was equilibrated for 22 ps at 300 K and followed by 100 ps of molecular dynamics using the Verlet algorithm. Shake was used to fix the hydrogen atoms to the heavy atoms. Coordinates were saved every fifth step (0.005 ps). For quasiharmonic analysis, calculation of 1188 frequencies required 25 h of CPU time on a Microvax 11. We compared both the frequencies derived from the fluctuations of the full run and those from various "windows" of shorter length contained within the full run. (26) Lamm, G.;Szabo,A. J . Chem. Phys. 1986,85, 7334.
111. Results A. Structural Comparisons. Crambin is a small, compact, globular protein further stitched together by three disulfide bonds (see Figure 1). Its regular secondary structure consists of a short stretch of @-sheetat residues 1-4 and 32-35 and helical regions from 7-18 and 22-30.27 It has five turn regions (5-6, 19-21, 31, 36-39, and 40-44) three of which have the sequence Pro-Gly and the others have Pro or Gly. The helical regions of the protein are most rigid, followed by the &sheet. The most mobile region of the structure, 36-45, contains two turns and has no secondary structure. Residues 39 to 41 are roughly parallel to the @-sheet region, but have no hydrogen bonds to it. The only bond that connects this to the rest of the structure is the disulfide between residues 3 and 40. This section of the protein can be called the "floppy loop" region. In order to compare the minimized crambin structures discussed here, we report the average root mean square (rms) distance of the backbones from each other once the structures have been brought into maximum coincidence by rigid translations and rotations (Table 11). A detailed analysis of the effects of the choice of force field parameters on energy minimized structures of crambin has been presented by Teeter and Whitlow,2 and it is not our goal here to repeat such an analysis. Nevertheless, it is worth pointing out that none of the potential functions used here found vacuum-minimized structures that were closer than 0.7 8, from the crystal structure, and that the minimized structures were no closer to each other than to the crystal structure. They differ from the X-ray structure (and from each other) by amounts that far exceed the uncertainties in the experimental structure (0.1 1 A). In part this reflects important differences between the vacuum and crystal environments, but in part it also reflects the significant range of configuration space explored even by energy minimizations from a common starting structure. In general, there appears to be some correlation between the vacuum-minimized structure and the potential used. For example, the two CHARMM potentials (no. 1 and 2) give structures that are very similar (see Table 11). Likewise, the two united-atom AMBER potentials (no. 3 and 5 ) and the two all-atom AMBER potentials (no. 4 and 6) show small relative differences. United-atom versions of CHARMM and AMBER (no. 1 and 3) also yield similar results. The all-atom force fields (no. 4, 6 and 8 ) have minimized structures that are closest to the crystal structure, although this does not imply that these are "better" potentials. Figure 2 shows the crambin backbone for the X-ray structure (heavy lines) and for the minimized structures (light lines). The most significant differences occur in the floppy loop region from residues 35 to 42 (top right of the figure). In this region, the three all-atom force fields follow fairly closely the X-ray chain tracing, whereas the remaining structures collapse inward somewhat toward the core of the molecule. Noticeable variation from the X-ray structure is also seen in the loop between the helices (residues 17-23), the amino terminus, and the second @-sheet strand. B. Frequency Distributions. As discussed above, low-frequency modes dominate calculations of overall fluctuations, and the 20 lowest frequencies for each harmonic calculation are collected in (27) Hendrickson, W . A,; Teeter, M. M. Nuture 1981,290, 107.
8094 The Journal of Physical Chemistry, Vol. 94, No. 21, 1 5190
Teeter and Case
vi
8
d
150
c 0
50
0 irequency
Figure 3. Cumulative harmonic frequency distributions. Code from Table I: no. I , heavy solid line; no. 2, heavy dashed line; no. 3, light dahsed line; no. 4, light dots; no. 5 , heavy dots; no. 6, squares; no. 7,light solid line; no. 8, triangles. Filled circles connected by a light solid line are from ref 9. Figure 2. Superposition of minimized and X-ray structures. Backbone for the X-ray structure is shown as a bold line; calculated structures are depicted with light lines. The floppy loop region around residue 40 is at the upper right; in this region, the three calculated structures that are closest to the X-ray results are from potentials no. 4 . 6 , and 8 (see Table
1200
; c
1000
1).
,I
TABLE 111: Lowest Harmonic Frequencies' 1
2
3
4
5
6
6.9 8.7 10.2 12.8 13.5 13.7 15.0 15.4 16.9 17.9 19.5 21.2 21.4 22.2 23.0 23.2 23.5 24.1 24.9 25.5
6.6 8.4 9.9 12.2 14.1 15.0 15.7 16.4 17.6 18.2 18.5 19.7 20.2 20.9 21.3 22.2 23.3 24.4 25.0 25.8
4.4 7.2 9.2 11.5 13.6 13.8 14.3 15.4 16.1 17.7 18.0 18.9 19.7 20.4 20.9 21.8 22.8 23.6 23.8 24.8
5.6 7.2 8.2 10.5 10.6 11.8 14.5 14.8 15.4 16.5 17.6 18.3 19.3 20.3 21.3 21.5 21.9 22.9 23.3 23.6
4.8 7.3 8.2 9.0 9.4 10.0 10.4 12.5 13.6 14.5 15.3 15.7 16.5 18.0 18.3 18.9 19.1 19.6 20.5 21.3
6.9 8.4 9.9 11.6 12.5 13.4 14.5 15.8 16.0 17.2 18.9 19.6 20.2 20.5 21.1 21.7 22.2 23.0 23.8 24.1
7 7.5 8.3 10.2 11.8 12.7 13.9 14.9 15.6 17.1 17.5 18.0 19.5 19.8 20.3 21.5 22.3 23.0 23.9 24.3 24.6
8 4.7
5.8 6.3 9.3 9.9 10.6 12.6 13.3 15.8 16.1 16.4 16.9 18.1 18.1 19.4 20.7 21.2 21.8 22.8 23.2
"Values in cm-I; potential numbers are identified in Table 1 Table 111. The lowest frequency varies somewhat from 4.4 cm-' in AMBER united atom to 7.5 cm-' in OPLS. It is not only the lowest mode that determines overall fluctuations, but also the distribution of low-lying modes, and these values are plotted in Figure 3. The results are nearly the same for all calculations, to within a small uncertainty. For example, all force fields (except for the nonstandard no. 5, with e = 4r) predict between 179 and 194 modes below 140 cm-I, with AMBER united atom at the low end of this range and CHARMM 19 at the upper end. The cumulative distribution is roughly linear in the range 20-140 cm-l; i.e., the density of states in this region is approximately constant at about 1.5 modes/cm-I. Figure 3 shows that two additional frequency distributions fall outside the range of these "standard" calculations. First, results from a calculation by Levitt et ai. are shown (connected filled circles in Figure 3) where only torsional angles were allowed to vary. This distribution rises more rapidly than the others in the 10-60-cm-' region and then becomes essentially flat above 120
0
500
1000
2500 3000 3500 4000 frequency
1500 2000
Figure 4. Cumulative frequency distribution for the 100-ps molecular dynamics run (solid line) and the harmonic calculation with the OPLS potential (dashed line).
cm-I, a region where bond and angle modes are continuing to contribute to the Cartesian results. Hence the number of modes below 140 cm-' in this calculation is only 136, and for frequencies above 90 cm-I there is a significant discrepancy between the dihedral and Cartesian results. A second example arises from AMBER united-atom calculations (heavy dotted lines) in which the dielectric constant was taken to be 4r, where r is the interatomic distance in angstroms. This has the effect of reducing long-range electrostatic effects (which may be advantageous for structural studies) but also greatly reduces the strengths of hydrogen bonds that hold secondary structure together, since these are largely electrostatic in origin. As seen in Figure 3, these frequencies are consistently lower than other calculations, having 202 frequencies below 140 cm-I. The overall difference may not seem too great, but the extra frequencies at the very low end of the spectrum have important consequences, as we show below. The complete distribution of quasiharmonic frequencies, plotted in Figure 4, is very much the same for the harmonic and quasiharmonic calculations, except in the 1700-2600-cm-'region; there are no harmonic frequencies in this region (and hence a plateau i n the cumulative distribution plotted in Figure 4), but the quasiharmonic calculation has less structure and places some modes in this region.
Harmonic and Quasiharmonic Descriptions of Crambin TABLE I V Lowest Ouasiharmonic Freauencies’ 1-5 10-25 1-50 25-100 3.0 3.1 3.5 7.3 5.8 7.4 9.6 10.3 10.8 12.5 14.1 15.1 15.4 17.3 18.0 20.1 22.2 22.6 25.2 25.8 28.0 29.2 29.8
7.9 10.4 10.6 12.1 13.7 14.7 15.1 16.2 17.0 17.7 18.5 18.9 19.7 20.8 21.7 22.0 23.1 24.1 24.6
6.6 7.4 9.3
10.1 10.3 11.3 13.0 13.9 14.3 16.0 16.1 16.8 17.7 18.6 19.4 20.3 20.4 21.8 22.4
6.6 7.2 9.1 9.3 10.0 11.0 12.5 13.6 14.2 14.8 15.7 16.2 17.0 17.3 18.1 18.7 19.8 20.1 20.4
The Journal of Physical Chemistry, Vol. 94, No. 21, 1990 8095 0.8
I
1-100
2.6 7.7 8.3 9.0 9.9 10.3 11.6 12.0 12.4 13.8 14.2 15.8 16.2 17.0 17.7 17.9 18.2 18.9 19.3 20.3
frequency
Figure 5. Cumulative quasiharmonic frequency distributions. Various lengths of simulation are shown: solid lines show results for 5-ps “windows”(1-5 and 50-55 ps); dot-dash for 10 ps (1-10 ps); dashed lines for 15 ps (1-1 5 and 25-40 ps); dotted lines for 50 ps (1-50 and 25-75 ps); squares for 75 ps (25-100) and triangles for the full 100-ps period.
Solid circles give the harmonic results for the CHARMM 1 7 / 1 8 force field. Table 1V and Figure 5 show low-frequency quasiharmonic modes computed for the full 100-ps simulation, as well as for a variety of shorter periods taken from this simulation. The very lowest frequency modes (below about 20 cm-I) converge quickly, so that the nature of the gross fluctuations (which depend most strongly on the modes with lowest frequency) is reasonably well established after about 10 ps of sampling (see subsections C and D, below). However, the distribution of higher frequencies (between 20 and 140 cm-l) is seriously underestimated for sampling times less than 50 ps. These distributions depend upon the amount of time sampled but are not sensitive to which portion of the trajectory is used; e.g., all of the distributions for 25-ps periods are nearly superimposable. Further, the distribution for long sampling periods is close to that computed for the true normal modes with the same potential function, although it is not clear that the mode distribution has converged even after 100 ps. Longer simulations should prove instructive on this point. The way in which the quasiharmonic frequency spectrum converges with sampling time may seem counterintuitive, since the low-frequency modes appear even with short sampling times, while higher frequency modes require longer sampling times. It should be remembered that this quasiharmonic procedure is purely
0.7
02
I
1
0
5
10
15
20 25 30 35 residue number
40
45
50
Figure 6. Root-mean-square fluctuations in atomic position as a function of residue number. Values plotted are the average for the backbone heavy atoms (N, Ca, C, and 0),coded as in Figure 3, except for the circles connected by a solid line, which here give values obtained from the crystallographic B factors.
a time-independent model that measures a mean fluctuation matrix and converts it to a frequency representation; we are not undertaking a Fourier transform of the time-dependent motions where the length of the sampling time might be more easily related to frequencies in the transformed data. Improved understanding may also come from an analysis of the convergence of particular elements in the correlation matrix; such studies are planned. The implications of the distributions seen in Figure 5 are not entirely clear, but they suggest that short dynamics simulations (of 10-20 ps) are “missing” some aspects of the fluctuation behavior seen by longer simulations. Such fluctuations may be important, for example, in free energy perturbation calculations that rely on obtaining a good sampling of the locally available configuration space. The fact that longer simulations approach distributions very close to that predicted by normal-mode calculations lends support to the hope that the latter may serve as useful substitutes for the former in certain situations. An earlier quasiharmonic simulation has been reported by Levy et aI.l5 for the pancreatic trypsin inhibitor, a protein of about the same size as crambin. These authors used a simplified model for the frequency analysis in which each residue was represented by a single interaction site and off-diagonal elements of the force matrix in internal coordinates were ignored. They recovered results that were quite different from the standard normal modes, finding 50 frequencies below 25 cm-’ and 7 less than 1 cm-I. We find no evidence in our quasiharmonic results for abnormally low frequencies, and, indeed, find that the fluctuations predicted by molecular dynamics results are quite close to those of a standard harmonic analysis. It is not clear why the earlier simplified model gave such unusual results, but it is likely that the neglect of cross terms had an important effect. The present calculations show that, at the atomic level, harmonic and quasiharmonic frequency distributions can be comparable. C. Atomic Fluctuations. Figure 6 compares the rms atomic fluctuations for the backbone atoms of crambin to the refined temperature factors from the 0.945-A crystal structure. This is plotted as an “absolute” comparison, even though some fraction of the observed temperature factors almost certainly arises from factors other than internal motion:prominent candidates are crystal packing defects and overall librational motion. As with most earlier comparisons of this sort, the general pattern of fluctuations is the same in the calculated and observed plots, although the mean magnitudes and extent of variation from one residue to the next differ. With only a few exceptions (e.g., residue 9 for CHARMM 19, residue 43 for OPLS), the calculated results are in remarkable agreement with each other. The pattern in side-chain fluctuations (not shown) resembles that for the backbone, but with larger
8096 The Journal of Physical Chemistry, Vol. 94, No. 21, 1990
Teeter and Case 20 1
08
07
16
c
O6
4 * L
0 05
2 vi
E
04
03
02
1
1
0
5
10
15
20 25 30 35 residue number
40
45
50
Figure 7. Same as Figure 6 . Solid line: OPLS potential; dashed line, united-atom potential with e = 4r; dotted h e , OPLS potential including a shell of water molecules. AMBER
25 30 35 40 45 50 Residue Figure 9. Root-mean-square fluctuations in backbone angle for various potentials, coded as in Figure 3. 0
20 1.2 1.1 1.o
3 1 1
5
10
15
20
,
I
I
I
0.9
c
4
$
0.6
L.
n
t
J
0.5
0.4
0.3 0.2
‘
0
1
5
10
15
80
85
30
residue number
35
,
I
I
40
45
50
Figure 8. Same as Figure 6, but for the quasiharmonic results for various windows. Light solid line, 1-10 ps; dashed line, 10-25 ps; dotted line, 25-40 ps; dot-dash, 40-55 ps; heavy solid line, 1-55 ps.
variability from residue to residue. As expected, the helical regions (residues 7-18 and 22-30) show low amplitudes of motion, whereas the turn regions (residues 5-6, 19-21, and 36-44) show the largest variation in all potentials. In Figure 7 we show fluctuations for the two nonstandard calculations described above. The calculation with t = 4r, where hydrogen-bond interactions have been reduced, shows generally larger amplitude motion, and very noticeably larger motion in certain residues that appear to be constrained primarily by H bonds in the standard calculations. The calculation that includes a solvent shell departs from the standard calculation in the opposite direction, giving much smaller amplitude fluctuations. This is also to be expected, since this fully minimized “solvent” has no thermal disorder, so that any fluctuation must increase the local strain energy. Together, these two nonstandard calculations show that Cartesian fluctuations are indeed sensitive to major changes in the force field model, although they are much less sensitive to smaller differences that exist between various popular parametrizations. Figure 8 shows backbone fluctuations for various parts of the molecular dynamics simulation. There is some variation depending upon which part of the trajectory was sampled, but the general pattern is for the most part independent of this. For residues 1-35 these are very close to those seen in the harmonic calculations, but the fluctuations in the floppy loop in the 36-44 region are
2 ’ 0
I
20 25 30 35 40 Residue Figure 10. Same as Figure 8, but for the backbone angle $. 10
5
10
5
10
15
45
I
I
01
25 30 35 40 Residue Figure 11. Same as Figure 8, but for the peptide torsion w 0
15
20
45
about 50% larger than the harmonic results. In this region the molecular dynamics trajectory is apparently exploring a variety of local conformations, leading to larger root-mean-square fluctuations. Even here, though, the differences between harmonic and MD results are not overwhelming, and both types of calculation show this to be the most mobile section of protein (in
Harmonic and Quasiharmonic Descriptions of Crambin agreement with a straightforward interpretation of the crystallographic B factors.) D. Torsion Angle Fluctuations. Figures 9-1 1 show fluctuations in the backbone dihedral angles 4, $, and o for some of the potential functions. The mean fluctuations are quite close for the various potentials, but there are significant local differences; note especially the turn region around residues 19 and 20: the CHARMM potential (heavy solid line) shows a large amplitude of motion where the AMBER all-atom potential (dotted line) has a very small fluctuation. The large fluctuation in 4(20) in the CHARMM result is closely connected to a similar fluctuation in $(19) (see Figure 9). Such motions allow the peptide bond to move with minimal disruption of the configuration of the nearby residues and are commonly seen in molecular dynamics simulations. The large difference between the various force fields here probably arise from small differences in the local geometries rather than from anything intrinsic to the potentials; note that for other angles (such as 4(33)) the AMBER fluctuations are above those arising from the CHARMM potential.
IV. Discussion The calculations reported here are not intended to help decide which protein potential function is the “best” one to use but to provide more information about the extent to which some aspects of protein dynamics depend upon the force field. The principal limitations of the current calculations include the harmonic approximation and the neglect of solvent. Although these limitations are indeed severe for describing some aspects of protein dynamics, there is continued interest in harmonic descriptions as starting points for understanding protein dynamics. As used here, they provide a convenient description of the global flucutuations that in many aspects is close to that found in (vacuum) molecular dynamics simulations. More complex theories based on a harmonic description incorporate the effects of solvent friction,’2,26 analyze anharmonic effects in a way different from that used here,28or allow for thermal disorder by expanding about positions that are not local minima.29 Both the harmonic and quasiharmonk descriptions of the fluctuations use a Gaussian form, whereas the “true” distributions are expected in many cases to have more than one peak, arising from transitions between minima.m In spite of these limitations, normal-mode analyses provide (28) Bialek, W.; Goldstein, R. F. Biophys. J . 1985, 48, 1027. (29) Seeley, G.; Keyes, T. J . Chem. Phys. 1989, 91, 5581.
The Journal of Physical Chemistry, Vol. 94, No. 21, 1990 8097 a simple way to compare the gross features of protein potential energy surfaces and their dependence upon parametrization of the molecular force field. The results reported here allow several interesting conclusions to be drawn about the normal-mode description of protein motion. First, most of the standard potential parameterizations give nearly equivalent results, as long as attention is limited to large-amplitude, low-frequency motions. Greatly reducing the strengths of hydrogen bonds by increasing e in an otherwise standard potential yields larger motions (and lower frequencies) than do more standard potentials. On the other hand, use of a sigmoidal dielectric that reduces the impact of long-range electrostatic interactions while maintaining hydrogen-bond strengths gives results little different from more standard potentials. Second, significant contributions of stretches and bends to frequencies above 60 cm-’ are found, so that the density of states in torsion space (see Figure 3) is significantly lower than that in Cartesian coordinates. Third, the amplitudes and frequencies of vacuum molecular dynamics simulations are close to those of the normal-mode calculations, so that many gross features of the former can be estimated by the latter. This correspondence only holds for MD sampling times longer than 30 or 40 ps; for shorter simulation times there is a significant loss of density of states in the 40-200-cm-’ region. Although the overall fluctuations are comparable, significant variations at particular sites are often seen when comparing one potential to another. It should be noted that all of the fluctuations considered here are equal-time correlations; Le., they show the amplitudes to be expected for various motions but not the rates at which excursions decay back to their equilibrium values, The latter can also be calculated from both normal modes and molecular dynamics, and some preliminary studies of such time correlation functions (which arise in spin relaxation and in fluorescence anisotropy decay) have been reported elsewhere.’* Further comparisons of molecular dynamics and normal-mode results are in progress.
Acknowledgment. We thank Bernard Brooks for help in setting up the molecular dynamics simulation and Cindy Fisher for help in using the DISCOVER package. This work was supported by N I H grants GM39266 (D.A.C.) and DMB86-06636 and DMB8904337 (M.M.T.). (30) Ichiye, T.; Karplus, M. Proteins 1987, 2,236; Biochemistry 1988, 27, 3487.