feature article - American Chemical Society

Jun 18, 1994 - difference in free energy (or adiabatic work) between two points along the coordinate is given .... usually involve hydrogen stretching...
0 downloads 0 Views 2MB Size
J. Phys. Chem. 1994, 98, 9700-9711

9700

FEATURE ARTICLE Modeling Solvent in Biomolecular Systems Paul E. Smith and B. Montgomery Pettitt' Department of Chemistry, University of Houston, 4800 Calhoun Road, Houston, Texas 77204-5641 Received: March 21, 1994; In Final Form: June 18, 1994@

Currently applied models for the treatment of solvent in biomolecular systems are reviewed. Solvent models ranging from purely continuum to quantum mechanical in nature are discussed, together with their ranges of validity and the approximations inherent to the various methods. As a potential energy surface interpretation of thermodynamics and kinetics is a useful and familiar tool to the physical chemist; we use the generalization to free energy surfaces (or potentials of mean force) to unify the discussion where possible. An example of how theory and simulations can aid in the interpretation of experimental data for the solvation of myoglobin is presented. It is argued that the advent of better theories and increasingly faster computers will provide the opportunity for the application of more rigorous solvent models for the study of complex biomolecular solutions with increasingly more accurate results.

1. Introduction The solvent environment usually associated with biological molecules in their active (or native) state has profound influences on the structure, dynamics, and thermodynamics of such systems. This is directly evident from physical experiments performed on proteins and nucleic acids as a function of the degree of hydration.'-' Different experiments have probed a variety of physical characteristics associated with the degree of hydration and have yielded different views concerning the effects of successive layers of solvent.'-'* A substantial amount of theoretical effort has been directed toward providing models and theories capable of describing the solvation effects observed experimentally. 12-14 The range of models and methodologies is, however, not as broad as the range of physical phenomena to which they apply. In this article we will examine some of the widely used solvent models of current interest for biological molecules. Each class of model has its own strengths and weaknesses, dictated by the domains of temporal and spatial phenomena for which they are valid. The spatial domains of applicability range from the extent of individual atomic librations, characteristic of short time fluctuations, to the (essentially) infinite extent characteristic of macroscopic phase boundaries. Time scales similarly range from intramolecular vibrations to scales associated with net mass and heat transport. A completely unifying physical picture has yet to emerge, but several reliable reductions are in common use. The use of potential energy surfaces in chemical reasoning is prevalent for both structural and kinetic problems. We use an appropriate extension in the condensed phase, namely, the potential of mean force. We present the various models of solvation as they are currently used, illustrated by recent examples from our own work and that of others. The classic review of Friedman considered the various levels of theory or methods used to describe s~lvation,'~ while van Gunsteren et al. have recently reviewed the treatment of solvation in biomolecular systems.l6 Here, we shall try to outline the different models commonly @

Abstract published in Advance ACS Absrrucrs, September 1, 1994.

adopted for the description of the solvent in biomolecular systems, with little discussion of the range of methods used to apply them. However, we note that certain models are less readily divorced from the corresponding methods than others. The physical characteristics of the time scales and spatial extents covered are often loosely correlated with both accuracy (comparison with experiment) and computational convenience. Generally, the finer the time scales and spatial resolution required, the more demanding the model and the corresponding methods become. We have arranged the models covered here in roughly increasing order of complexity or, correspondingly, increasing resolution in space and/or time. We follow the current section with a brief overview of some of the physical phenomena of interest as related to biomolecules. Our presentation of the solvent models then follows, together with an overview of current simulation results for the solvation of biomolecules, and ends with a discussion and outlook for this area.

2. Solvent Effects on Biomolecules At the microscopic level, the forces acting on a system determine the response times and the equilibrium behavior of the system. For temperatures and pressures of interest for biological systems, the solvent has little effect on the distribution of bond lengths or bond angles of the biomolecule. The remaining forces, both intra- and intermolecular, are of similar energies (frequencies). Thus, especially in hydrogen-bonding solvents like water, one may expect the modification to the intramolecular potential governing the remaining degrees of freedom, such as dihedral torsions or intermolecular separations, to be significant. It is the combined effects of the averaged intra- and intermolecular forces that give rise to an effective potential or a potential of mean force (PMF). Such free energy surfaces represent a useful intellectual reduction but are dependent on the time scale over which the averages are generated. Most often the time scale for averaging is infinite (Le., thermodynamic equilibrium), although the very use of classical models without explicit electronic degrees of freedom implies an effective molecular model based on a time scale which is

0022-3654/94/2098-97~~04.5QIQ0 1994 American Chemical Society

Feature Article

J. Phys. Chem., Vol. 98, No. 39, 1994 9701

long for electronic relaxation but short compared with atomic motions. Such preaveraged models must be used with care to avoid interpreting details on a time or length scale which is comparable to that over which the effective forces were generated. In this section we briefly review the formal pictures of equilibrium and dynamical effects of relevance to biomolecular systems. 2.1. Equilibrium Effects. One can investigate the effects of a solvent environment using a potential of mean force approach. Consider a system with a constant number of particles N , in a volume V at a temperature T, and described by a Hamiltonian H({p},{r}), where p and r are the momenta and coordinates of the particles, respectively. Then, the Helmholtz free energy of the system is given by

A = -kBT In J J

e-H({p},{r})/kBT

dp dr

approach toward the treatment of solvation effects, it is convenient to express the solvent effect in terms of the modification of atom-atom (painvise) correlations. In this case, the average of any property X which depends on the positions of atoms 1 and 2 is given by

JX(rI,r2) e-V({r’)’kBT dr (A‘) =

Je-v({r})’kBT

dr

(6)

The pair correlation function between atoms 1 and 2 is given by

(1)

where k~ is the Boltzmann constant and T is the absolute temperature. We now consider a general coordinate 6, involving n atoms, for which we would like to know the free energy profile. Typically, 6 corresponds to the separation of two atoms or to the rotation around a dihedral, etc. The free energy at a particular point along the coordinate 6 may be written as

and represents the normalized probability of finding atoms 1 and 2 in a volume element d r at positions rl and r2, respectively. We may rewrite eq 6 such that

or

A ( ( ) = -kBTln

JJ

d(r, - r’l(6))...d(r,,-

r’,(6)) e-H({p},{r))lkBT

dp dr (2)

(A‘) = JX(r 1’ r2 )e-W(rl,rz)’kBT dr, dr,

(9)

where {r’(e)} represents the set of position vectors corresponding to a particular value of 6. As we have restricted the integration over the positions of atoms 1 to n, the free energy is now explicitly dependent on the coordinate 6. Hence, the difference in free energy (or adiabatic work) between two points along the coordinate is given by

where W(rl,r2) is the effective potential between atoms 1 and 2 after averaging over all the other (solvent) atoms in the system. Therefore, the relationship between the potential of mean force, or adiabatic work surface W, and the pair correlation function g(,) is given by (switching to scalar notation)

The above derivative may be obtained from eq 2 and is given by

wherefis the total force acting between the two particles. This represents the adiabatic work required to bring two particles to positions 1 and 2 in the liquid. Part of this work is the direct interaction of the molecules U,and part is the indirect effect of solvent A W

which is simply the ensemble average of aHla6 obtained at a particular value of 6. Assuming the Hamiltonian may be factored into a potential and a kinetic term H({p},{r}) = V({r}) K({p}), one can perform the integration over momenta in the numerator and denominator of eq 4. Hence, eq 3 becomes

+

The ensemble average in the above equation is simply the negative of the mean force acting at t. (This is most easily seen if one considers t = rl - r2 = 112, Le., the distance between two atoms.) Therefore, the effective potential, or free energy surface, for the coordinate 6 is just that representative of the mean force along the coordinate, hence the name potential of mean force. In the thermodynamic limit we can consider the PMF as an average over forces acting on tagged degrees of freedom or with respect to the condition functions in the system.17 For a general

It is this indirect contribution from the environment that plays such a profound role in determining the conformational distribution in biological molecules. In polypeptides or polynucleotides there are a number of conformationally interesting single bond torsions. While for each torsion there may be a possible direct steric clash with the rest of the molecule, in general there are usually several rotamers possible. The number of accessible conformational states is roughly the number of rotamers to the power of the number of bonds, a significant number even for short oligopeptides or nucleotides. Much of the focus on the shape or population distribution of conformers for a particular biomolecule is directed toward these single bond torsions. The dihedral potentials about noncyclicly constrained single bonds frequently possess intrinsic barriers of a few kcaymol or less. Thus, solvent contributions to AW on the order of a few ~ B can T be as large as the direct potential interaction for those degrees of freedom. This can have a profound effect on the shape and number of minima for an energy (or free energy) surface.’* The potential of mean force is an equilibrium concept. It assumes averaging over all possible short time events. There-

Smith and Pettitt

9702 J. Phys. Chem., Vol. 98, No. 39, 1994 fore, one cannot usually interpret the kinetics of the system in terms of simple mechanistic events, as these are preaveraged out of this equilibrium picture. A simple single barrier crossing for instance, when viewed in detail on short time scales, may not involve a single large (multi k ~ r event ) but rather a large ) frequently in the solvent number of small ( k ~ rrearrangements, (bath) components. Clearly, defining the time scales of the events of interest is important for assessing which motions may be averaged out of the picture. 2.2. Dynamical Effects. At the interface between equilibrium and dynamic concepts are the time scale definitions of dominance. In averaging over time scales clear separations in degrees of freedom become apparent. It is customary to start with the Bom-Oppenheimer separation,15 averaging over all fast electronic motions on the ground state potential energy surface. This leaves one with the familiar electronically adiabatic potential surface for the nuclei or the motions of whole atoms. To extend this to the bath of solvent molecules, we augment this separation with the time scales defined by Eisenberg and Kauzmann.I9 The previous electronic separation generates the I-structure or the so-called instantaneous structure of the atoms. Next, one can average over the fast vibrations in the fluid yielding the V-structure. Various definitions from tens of femtoseconds to a picosecond are possible for this domain. Rossky and co-workers have quantified the structure and accompanying dynamics in this range.z0 The final EisenbergKauzmann classification is the diffusive, or D-~tructure.’~ Here, the diffusive motions that are often used to characterize the liquid state are averaged over to obtain the required long time average. For each of these ranges different techniques are useful. For the problems of interest in this review article the motion or momentum distribution function of the electrons is of little interest. The first relevant time scale for much of the discussion in the next section is the vibrational time scale. The highest frequency vibrations commonly found in biomolecular systems usually involve hydrogen stretching such as N-H, 0-H, or even C-H oscillators. The classical equipartition population is, in general, wrong for these modes, and they should be treated quantum mechanically. Other lower frequency modes are correspondingly closer to the equipartition theorem limit and may be treated classically for many applications. In fact, the highest frequency modes are often approximated as rigid via a constraint (or holonomic constraints for the equations of motion21). Molecular dynamics in its most common form thus assumes an average over electronic motions and often over the highest frequency vibrational modes. To average over different physical partitions of the system, a different set of methods is used. In the next section we will outline the use of Langevin methods and how one can “preintegrate” certain degrees of freedom out of the underlying partition function. Formally, this transforms a many-body problem to a few-body problem, which potentially results in a great intellectual and computational advantage. Whether any practical advantage is obtained is quite sensitive to both the physical approximations made and the system being studied. Due to the constant collisions between the solute atoms and water molecules, the solvent will act as a “bath” for the transfer of energy (or momentum) to and from the solute. This energy transfer has a strong influence on the dynamical behavior of the solute.2z For a pure solute (no solvent), many conformational transitions of interest would normally require long activation times during which enough energy is accumulated along the transition coordinate to cross the energy barrier

between the two conformations. The presence of the solvent provides a more efficient mechanism to obtain the required activation energy. However, in addition to the energy gain from the solvent bath, energy is also lost to the bath. This is a direct consequence of the viscosity of the solvent medium which tends to dampen the motion of the solute.23 Hence, the solvent both accelerates and retards the motion of the solute, and the exact behavior for a given conformational rearrangement will depend on the size of the rearrangement and the time scale involved. The frequency response of the solvent viscosity is an important factor for the latter. The intricate relationship between solvent activation and deactivation has been investigated using some simple model system^.^^^^^

3. Solvent Models 3.1. Primitive Models. The simplest approximation for the treatment of solvent is to represent the solvent environment as an infinite continuum with a permittivity corresponding to that of the bulk solvent. Hence, the normal Coulomb potential becomes

where EO is the permittivity of free space, E , is the relative permittivity of the solvent, and qi and q, are atomic charges situated at sites i and j separated by a distance r i j . Here, each pair of charges are assumed to interact through bulk solvent. Simple scaling of the Coulomb interactions is very computationally efficient and is therefore particularly well suited for long simulations where a thorough search of conformation space is important. The method is commonly used for studying small flexible peptides which may adopt many different, significantly populated, conformations in solution.25,26In combination with simulated annealing or quenching:’ the uniform continuum approach is a valuable first approximation. For situations where the first solvation shell waters are tightly bound to a central ion, the technique has been extended to treat an ion and the first solvation shell as a single spherical particle or “solventberg”.28 The major disadvantage of the uniform continuum approach involves the loss of atomic detail in the model. Any specific solute-water interactions are essentially lost, which may lead to a significantly different potential energy surface than that provided by a model with atomic detail. This is most easily seen if one considers the solvation energy of ions in a dielectric continuum as modeled by the Bom equationz9

where AA is the change in Helmholtz free energy of solvation (continuum - vacuum) for an ion of charge q and radius a in a continuum of relative permittivity E,. One can see from this equation that the free energy of solvation will be the same for anions and cations of the same size and absolute charge. However, for solvents with an asymmetric charge distribution (e.g., water), this is not observed for computer simulations and more advanced theories where an explicit solvent model is The continuum model implies that anions and cations with the same charge magnitude and size will perturb the surrounding water structure in the same way, which is not usually the case. In addition, electrostriction and dielectric saturation are difficult to model with a continuum approach.33 Simple continuum based theories can be extended to include

Feature Article molecular however, when a careful comparison of conformational or configurational energies is required, this type of approach should be used with caution. 3.2. Effective Dielectric Constants. As the uniform dielectric continuum approach is so inexpensive, there have been several attempts to improve the model while maintaining a high degree of computational efficiency. (a) Distance-Dependent Dielectric. One such approach is to use a distance-dependent dielectric constant where the relative permittivity of the solvent is related to the distance (in angstroms) between the charge Therefore, chargecharge interactions become

where d = 1 or 4 A-I. The idea is that charges separated by large distances probably interact through bulk solvent and should therefore be more heavily screened than corresponding charges with a smaller separation. An added advantage is that chargecharge interactions now decay as 1/?, and therefore artifacts associated with the truncation of the Coulomb potential at some cutoff radius are correspondingly reduced, although not elimi~ ~ a t e It d .is,~ in ~ general, an expression which is incorrect in detail for any physical system. (b, Sigmoidal Dielectric. Many other effective permittivities have been suggested which are explicitly dependent on the interatomic separation. They all display the general characteristics of a low effective permittivity at short distances which approaches the bulk permittivity at large distances.41 Ramstein and Lavery have suggested the following analytical form for the dependence of the relative permittivity on charge separation r,42

Here the permittivity ranges from unity at zero separation to approximately the bulk value (D) at distances greater than 15 or so, depending on the value of theovariable parameter S. Typically, S ranges from 0.15 to 0.30 A-1.38342 This type of approach is an improvement over the uniform or simple r-dielectric model but still lacks atomic detail. Furthermore, the application of such functional forms to peptides and proteins is not trivial. For instance, two charges may actually interact through the protein (as opposed to through the solvent), and therefore the choice of bulk permittivity becomes unclear. (c) Ionic Screening. In an attempt to include the effects of an ionic atmosphere on the electrostatic interactions, it is possible to include Debye-Huckel s ~ r e e n i n g . Hence, ~ ~ , ~ we have

where K is the inverse Debye-Huckel screening length and teff is any of the effective permittivities (distance-dependent, sigmoidal, etc.). The approximation is good for small ionic strengths but often breaks down for ionic strengths greater than 0.001 M45 and situations where specific interactions are important. 3.3. Reaction Fields. The previous continuum methods have involved treating the charge-charge interactions as isolated interactions which are considered independently from the rest of the molecule to which they may be attached. However, the

J. Phys. Chem., Vol. 98, No. 39, 1994 9703 true effect of placing a molecule in a dielectric continuum is more complicated. If we place a molecule with a net charge in a dielectric continuum, it will be solvated by the continuum. However, as the system as a whole must be neutral, the dielectric must also neutralize the net charge. If the molecule has a net dipole moment, this will induce a dipole in the continuum such that the overall system has no net dipole (assuming it is not paramagnetic). The induced dipole will then interact with the permanent molecular dipole. This effect is known as the reaction field due to the dielectric continuum. Higher multipoles of the charge distribution will induce higher multipoles in the dielectric which will then interact with the original charge distribution to generate a general reaction field.46 The general reaction field may be determined analytically for symmetrical cavities by solving for the potential inside and outside the cavity with the appropriate boundary conditions.46 For a molecule in a spherical cavity of radius a and relative permittivity of 1 situated in an infinite uniform dielectric of relative permittivity E , the total potential within the cavity is given by46

with

where P, are Legendre polynomials and 0 is the angle between the vectors r, and r,, which are defined with respect to the center of the sphere. The first term corresponds to the normal Coulombic potential due to the explicit charge distribution within the cavity, while the second term is the reaction field contribution generated by the interaction of that charge distribution with the infinite dielectric continuum. The n = 0 and n = 1 terms correspond to the Born solvation energy29and the Onsager reaction field:7s48 respectively. (The latter is now commonly employed in quantum mechanical calculations to approximate solvent effects; for a review see ref 49 and references therein.) The method can also be extended to include an ionic strength d e ~ e n d e n c e .The ~ ~ interaction of higher moments of the charge distribution with the continuum is particularly important for proteins with many ionizable side chains. Tanford and Kirkwood first applied the general reaction field approach to the calculation of protein titration curves.5oIt can also be used for the rough prediction or rationalization of pK, ~ h i f t s . ~ lThe - ~ ~major disadvantage of the method is the requirement of a spherical46 or e l l i p t i ~ a lcavity. ~ ~ , ~ ~Although many proteins can be described as roughly spherical or elliptical, the functionally important region of a protein is seldom so geometrically simple. Concern has also been expressed as to the high sensitivity of the results on the proximity of the charge groups to the surface of the cavity.53 The technique is usually restricted to the case of a rigid solute for which the dielectric permittivity of the solute must be known. However, the method has recently been extended to simulations in which the charge sites within the cavity are mobile.55 In addition, the reaction field approach is usually implemented such that the solute feels an instantaneous response from the dielectric continuum, corresponding to the zero-frequency permittivity of the solvent. In reality, the response is frequency-dependent and determined by the dynamics of the solvent and how it responds

9704 J. Phys. Chem., Vol. 98, No. 39, 1994

Smith and Pettitt

to changes in the individual moments of the solute. In this is that the equilibrium (average) properties of the solvent are obtained very efficiently. The major problems are the paramrespect, fast processes will not produce the same response as etrization dependence on an explicit simulation, which can make slow processes, and the various moments may also illicit a the results dependent on the quality of the explicit force field different time-dependent response. Inclusion of a delayed adopted, and the lack of an explicit ionic strength dependence. Onsager reaction field has been implemented and tested for The effect of the shape and extent of the dipole grid on the liquid water,56although the original formulation requires some fields around the solute is not, as yet, fully understood. modification!0 Results are intermediate between no reaction field and an instantaneous reaction field. 3.6. Atomic Solvation Parameters. A different approach to the modeling of solvatioddesolvation is that of atomic 3.4. Poisson-Boltzmann Calculations. The general reacsolvation energy parameter^.^^ This approach assumes that the tion field technique can be applied to arbitrarily shaped solutes free energy of solvation of a solute can be decomposed into a by adopting various numerical techniques. The most common sum of atomic free energy components. Therefore, we have technique is the finite difference Poisson-Boltzmann techn i q ~ e although ,~~ finite element58,59and boundary elementm approaches are also feasible. In the finite difference PoissonBoltzmann approach the solute is placed on a three-dimensional grid, and the relevant field equations, Le., the Poisson and where Awi represents the degree of solvent exposure of atom i. Poisson-Boltzmann equations inside and outside of the solute In this respect, solvent exposure can refer to either the degree cavity, respectively, are solved by assigning charges and of accessible surface area84-86 or the degree of coordination permittivities to the grid point^.^^,^^ The method thus determines (atomic occupancy)87 of the atom, in which case Agi is the the electrostatic contribution to the solvation free energy. atomic solvation free energy per unit surface area or unit volume, In this way the irregular boundary between the solute and respectively. The Agi’s are parametrized on experimental free the continuum can be treated quite accurately. Ionic strength energies of transfer of small amino acids between water and effects (via Debye-Huckel screening45) can also be included. o c t a n ~ l or ~ ~small . ~ ~ organic molecules between water and Most applications to date have used the linearized form of the ~ a p o r . ~ OThe - ~ l solvation term is then used in conjunction with Poisson-Boltzmann equation although recent numerical adthe standard force field terms. During a simulation or molecular vances have seen the emergence of nonlinear Poisson-Boltzmechanics minimization A o i will vary as the atoms move, mann calculations.62-M The method can be used in Brownian causing a change in AGsolv. This variation is such that dynamics simulation^^^-^^ and for the prediction of pK, shifts upon protonatioddeprotonation of ionizable side ~ h a i n s . ~ ~ - ~ Ohydrophilic groups will tend to become more heavily solvated while hydrophobic groups will become less solvent exposed. One of the most promising uses of this solvation model is the Such an empirical method is computationally efficient as long rapid estimation of free energies of solvation for the design of as bo can be calculated easily. Fortunately, there are now modified ligand^.^^.^^ The method is computationally inexpenapproximate analytical methods available for the traditionally sive, and although it is usually applied to a single rigid expensive calculation of solvent accessible surface areas and conformation of the solute, it has been combined with molecular their derivative^.^^^^^ The largest difficulty with this type of dynamics.73 However, the direct combination of both techniques approach is obtaining a complementary set of force field and is not ~traightforward.’~The recent use of multigrid techniques solvation parameters. In addition, the experimental waterhas enabled the treatment of larger systems with greater octanol and water-vapor transfer free energies produce sigp r e ~ i s i o n . ~The ~ . ~method ~ and its applications are discussed nificantly different atomic solvation parameters and are often in more detail e l s e ~ h e r e . ~ ~ < ~ ~ - ~ ~ inadequate measures of the contribution of the solvation free 3.5. Langevin Dipoles. Warshel and Levitt have developed energy of transfer from the solvent to the protein interior.94There an approach which is intermediate between that of an explicit also seems to be some disagreement as to whether the solvation solvent and a continuum representation.80-82 In this method, term should depend on the accessible surface area83.85,86 or the the solvent is represented by a set of polarizable and rotatable volume of an atom.87s95Although the relative hydrophobicity dipoles fixed in space on a grid surrounding the solute, usually of various groups may be modeled effectively, dielectric a protein. The field at each grid point, Ei, is then given by the screening between charged groups is absent, and ideally a protein charges and the remaining solvent dipoles. The average screening function should also be incorporated into the model. orientation of each dipole is described by a Langevin type Finally, it should be noted that the decomposition of the formula solvation free energy into atomic or group terms is not unique from a statistical mechanical basis.32-96-98However, simulations employing atomic solvation parameters are typically 20-30 times faster than an equivalent explicit solvent simulation and are therefore quite appealing. where X,is given by 3.7. Integral Equations. Theoretical structural properties in condensed phase are obtainable from a variety of methods, including integral equations. Integral equation methods have p~ is the unpolarized solvent dipole, and C is an adjustable as their fundamental goal the calculation of the n-body distribuparameter which determines the resistance to reorientation of tions within the system.99 For practical reasons, however, one the solvent dipoles. C is usually determined by comparison usually assumes that only pair interactions (real or effective) with an explicit solvent simulation. The field and average are important and for which the pair distribution function orientation of the induced dipole at each grid point are then between atom pairs (g(2)) is given by eq 7 . Distribution obtained by an iterative procedure. functions obtained from integral equation theories can, in As the solvent dynamics is not treated explicitly, the principle, yield all equilibrium properties for the system of calculation is quite inexpensive (only a few iterations are interest, in particular, the PMF between all atom pairs within required to achieve convergence), and yet it is more realistic the system. than a purely continuum approach. The beauty of the method Since the integral in eq 7 is intractable, the problem is recast

Feature Article

J. Phys. Chem., Vol. 98, No. 39, 1994 9705

into an integral equation relation between g and an intermediate function, together with a nonlinear algebraic equation which also relates g and the intermediate function. This is formally exact but remains just as impractical in terms of a solution. By making various approximations, one can obtain a set of equations which are computationally convenient and show reasonable qualitative accuracy. Even though all practical integral equation approaches are approximate, they have a useful practical range of utility and are an important tool for theoretical work in the area of modeling solvation phenomena. The application of integral equation methods for the treatment of solvent effects in biomolecular systems is, at present, limited. The reason for this lies in the need to determine the solvent contribution to the free energy for each distinct conformation of the solute. For small solutes with only one or two interesting degrees of freedom this is relatively easy.18 However, for larger solutes the number of possible conformations increases rapidly. The problem is then analogous to that of a typical conformational search where a systematic search of the total number of possible conformations quickly becomes prohibitive. Another application of integral equations is to approximate the total PMF energy surface as a sum of painvise PMF’s.lo0 To accomplish this, one needs only to obtain PMF’s between all solute atom pairs. This is equivalent to assuming that all triplet and higher correlations between solute sites are contained in the solute interaction potential and that the solvent effects are simply related to products of the relevant pair correlations; i.e., the triplet correlation is given by g(3)(r1,r~,r3) = g(*)(rl,rd g(2)(r2,r3) g(*)(r1,r3).lo1The neglect of explicit three-body correlations within the solvent correction (AW) significantly decreases the computational expense involved. However, for most biomolecules the presence of relatively rigid secondary structural elements implies significant many-body correlations; i.e., they have a well-defined interior. Hence, the approximation will be good for small peptides1* and nucleotidesLo2but will become increasingly less accurate for large biomolecules with a well-defined interior. 3.8. Stochastic Dynamics. The methods described so far have concentrated on the effects of the solvent environment on the energetics of the system. In this respect, they have tried to generate a potential of mean force surface from the corresponding vacuum potential surface. However, as mentioned in section 2 , the solvent has a dynamical effect, as well as a thermodynamic effect, on the properties of the solute. The dynamical effects of the solvent can be modeled very simply using the generalized Langevin e q ~ a t i o n ’ ~ ~ . ~ ~ ~

m;ii = -VU({r})

- mihrdt’Ki({r},t’)

ti(?- t’)

+ A,({r},t) (22)

where P, and ti are the acceleration and velocity of atom i of mass mi and U ( { r } )is the solute potential energy or free energy function. &({r},t) is a position- and time-dependent random force that represents the energy transfer between solute and solvent upon collision, and the integral characterizes the dissipation of energy due to the damping of solute motions by the solvent. The factor Ki({r},t’) in the integral is the generalized friction kernel, where {r} represents the set of position vectors of all particles. The generalized friction kernel and the random force are related by the fluctuation-dissipation theorem. lo5 The full generalized Langevin equation requires prior knowledge of the dynamics of the system we are trying to simulate and so does not directly simplify the problem. To avoid this difficulty, various approximations for the friction kemal have been introduced (see ref 106 and references therein).

The simplest approach is to model the solvent effect on the kinetics of the solute by means of the Langevin equation,

+

m;i, = -VU({r}) - miyitj A,(t)

(23)

where y1 is the friction constant (i.e., the generalized friction kernel is taken to be delta correlated in time and independent of atomic positions) which is related to the zero frequency viscosity of the s o l ~ e n t , ~ and ~ ’ -Ai(t) ~ ~ ~is the random force on a given particle (also assumed independent of atomic positions). In our case, Ai(t) is assumed to have a Gaussian distribution with momentsll’

and (Ai(t) A$’))

= 6yik,T6(t - t’)6,

(25)

where i and j label different solute sites and !QT is the Boltzmann constant multiplied by the absolute temperature. Thus, the dynamical equations of motion are no longer deterministic but rather involve random forces for which we only know their average properties, hence the name stochastic dynamics.lo3 This approach represents a considerable simplification of the dynamical effects of the solvent and is only slightly more expensive than a corresponding vacuum simulation. The combination of any potential of mean force approach (particularly integral equations) with stochastic dynamics simulations represents the reduction of an expensive many-body problem to an efficient effective few-body problem.’00~’L2 However, large solutes still present some problems. The assignment of a friction coefficient must account for the fact that not all atoms of a large solute are exposed to the solvent. Atoms in the core of the solute will experience no direct interactions with solvent, and their friction coefficients should be set to zero. While fully exposed or fully buried atoms are easily modeled by stochastic dynamics techniques, it is difficult to account for partially exposed atoms, although some methods to achieve this do exist.L13s114 3.9. Explicit Solvent. The most generally dependable way of including both the equilibrium and dynamical effects of a solvent is to include the solvent atoms explicitly. This requires a solvent-solvent potential that is sufficiently accurate to describe the major physical properties of the solvent (such as internal energy, radial distribution functions, self-diffusion constant, density, viscosity, etc.) and a method to determine an effective solute-solvent potential given the individual solutesolute and solvent-solvent parameters. The standard way to achieve this is to use combination rules for the Lennard-Jones parameters. Hence, we have for the total solute-solvent potential

where u and v refer to solute and solvent, respectively.

Smith and Pettitt

9706 J. Phys. Chem., Vol. 98, No. 39, 1994 Explicit solvent may be included in two ways. The solute can be solvated by a large box (usually cubic, rectangular, or a truncated octahedron) of solvent molecules and the whole system replicated in 3 dimensions and treated with periodic boundary conditions.118Alternatively, the solute can be partially solvated, the explicit solvent beyond a certain distance neglected, and the solvent-vacuum boundary treated using stochastic dynamics t e ~ h n i q u e s ' l ~ Jor' ~continuum methods. Periodic boundary conditions are, in general, more expensive. If the whole solute needs to be solvated, and is fairly spherical in nature, the truncated octahedron periodic boundary conditions requires approximately half the number of atoms of the equivalent cubic box, producing savings of 3-4 in computer time. The major disadvantage of explicit solvent methods is the large computational expense involved. Explicit particle methods can be orders of magnitude more expensive than a purely continuum approach, depending on the number of solvation layers employed for the simulation. Therefore, one should only resort to large explicit solvent simulations if the properties one is interested in, or the desired accuracy, require the inclusion of many-body correlations such as those provided by an atomic model. For very large systems (>>lo 000 atoms) there are several techniques (twin range cutoff,'** link cell lists,122cell can be used to increase the m ~ l t i p o l e s , ' * ~etc.) ~ ~ * which ~ efficiency of normal simulations. Another consequence of the expense of performing such explicit solvent simulations is that the results obtained by using either of the combination rules (eqs 27 and 28) have not been tested thoroughly. There is no guarantee that the effective solute-solute interactions and the effective solvent-solvent interactions when combined will produce the required effective solute-solvent interactions. There is some evidence to the contrary,Iz5 although more work has to be done before force field corrections are justified. Even if one has a reliable force field, the long-range nature of the Coulomb potential is such that a cutoff distance, beyond which all interactions are neglected, has to be invoked.40 Unfortunately, this can result in a multitude of problems, especially for very polar or charged systems, which can severely affect the results of the s i m ~ l a t i o n . ~ ~ For ~ ' * this ~ - ~reason, ~~ the method of truncation and the truncation distance are to be considered an integral part of the force field. 3.10. Polarizability. All the currently available force fields for the simulation of biomolecular systems involve effective force field parameters. Hence, enhanced atomic charges are used to model the average electronic polarization of an atom/ molecule by its surroundings. For instance, the dipole moment of a water molecule in the gas and ice phases are 1.85134and a2.6 D,135 respectively, the difference resulting from the mutual polarization between water molecules in the solid phase. Inclusion of polarizability may be particularly important for very polar or ionic systems and systems containing highly polarizable atoms, e.g., chlorine or bromine. It has been suggested that polarization of the amino acid and nucleotide bases in water may contribute 10-20% of the electrostatic contribution to the free energy of solvation.'36 Polarization may be explicitly included in the force field by the use of induced point dipoles (and point quadrupoles, octupoles, etc.; although their inclusion becomes increasingly more tedious). Several polarizable water models exist using this type of a p p r ~ a c h . ' ~ ~Usually, - l ~ ~ each oxygen of a water molecule is assigned an isotropic point polarizability a which interacts with the field at the oxygen to give an induced dipole p such that

where E; is the total field at site i. The total field is composed of two terms, one from the permanent charges of the system and one from the other induced dipoles. Hence,

and T is the dipole field tensor given by

The above set of coupled equations can be solved by matrix inversion or iterative methods. The latter is more common as the induced dipoles from the previous molecular dynamics or Monte Carlo step provide good starting points for the iterative procedure which is also less computationally demanding. Two or three iterations are usually required to achieve convergence, which means the calculations are 2-3 times more expensive than regular explicit solvent simulations. A more efficient approach for the treatment of polarization has been proposed by Sprik and Klein'44 and in a simplified ~ ~ the induced dipoles located form by van Belle et ~ z 1 . I Here, at each polarizable site are treated as extra dynamical variables. An extended L a g r a r ~ g i a n ' is ~ ~proposed , ~ ~ ~ which includes a potential and kinetic energy term associated with the induced point dipoles. From this Lagrangian the equations of motion for the induced dipoles can be derived and integrated using standard molecular dynamics techniques. By dynamically coupling the induced dipoles to the rest of the system, the need for iteration at each time step is removed and the algorithm is therefore more efficient than iterative methods, although it is still more expensive than standard simulations without explicit polarization. The only drawback with this type of approach is that the kinetic energy of the induced dipoles (used in the Lagrangian formulation) requires a "mass" associated with the extra degrees of freedom. This mass determines the response of the dipoles to the fluctuating local electric fields. Too large a mass will result in too slow a response, and the induced dipoles will experience an effectively zero average electric field. Too small a mass will cause problems when integrating the equations of motion. In reality, the induced dipoles model the polarization of the electronic cloud surrounding each atomic center, and their response should be, within the Bom-Oppenheimer approximation, intantaneous. A compromise is a small mass which ensures that the induced dipoles are, on average, aligned with the local electric field. The required mass can also be obtained by comparison with a corresponding simulation using one of the iterative, instantaneous response, methods. Another approach to the problem of polarization is that of the shell model. In the shell model each atom is represented as a charged core connected to a charged shell by a harmonic spring.148 The motion of the shell is such that the physical charge separation between the core and the shell behaves as an induced dipole. Originally, the shell was taken to be massless, and hence the shell positions were determined by minimization at each time step. A more efficient method is to assign the shell a mass and integrate their equations of motion along with the atom cores.148 The shell model represents a more efficient method than the extended Lagrangian approach as it does not involve the dipole-dipole tensor. However, it does require a mass and a force constant associated with the spring, together

Feature Article

J. Phys. Chem., Vof. 98, No. 39, 1994 9707

with a charge assigned to the core (and a corresponding charge terms which correspond to the coefficients of the wave functions. on the shell). By assigning a “mass” to these coefficients and an orthonorThe use of explicit polarization models is, at present, rather mality constraint, their equations of motion may be propagated limited and usually restricted to water and small ions. The along with the atomic nuclei. Under the appropriate conditions, inclusion of polarization into a protein or DNA force field is a the trajectory generated corresponds to that of the ground state significant undertaking which requires reparametrization of wave function. The method has recently been applied to the existing force fields. In addition, polarizability is a many-body simulation of liquid water,165 although the short simulation effect, and hence a decomposition of polarization effects performed (0.5 ps equilibration, 1.5 ps production) illustrate that the method is relatively expensive. between groups may be more difficult than for the more usual effective pair potentials. For this reason it may be a while before Finally, Vaidehi et al. have described a method for the reliable polarizable force fields exist. determination of solvation free energies which has many 3.11. Quantum Mechanical. As a result of the small mass advantages over other quantum mechanical methods.166 One of hydrogen atoms and the quantization of solvent vibrational can perform a classical simulation of a system and then apply modes, the properties of hydrogen-containing solvents can statistical mechanical perturbation theory167 to obtain the possess significant quantum mechanical contributions. Ideally, corresponding equilibrium properties of the equivalent quantum one would like to propagate the time-dependent Schrodinger mechanical system via equation for the system of interest including all electronic degrees of freedom. Unfortunately, using a sufficiently accurate (33) basis set (and possibly a post-Hartree-Fock calculation) to fully account for polarization effects means that this is impractical, where X is the property of interest, AH = Hqm- Hcl, and the even for small condensed phase model systems. Hence, other angular brackets denote an ensemblehime average over a system quantum mechanical techniques have evolved. with the appropriate Hamiltonian. As long as the overlap Classical simulations may be corrected for quantum mechanibetween the quantum mechanical and classical distributions is cal effects in a number of ways.149-152 By expanding the significant, the perturbation treatment should be applicable. The quantum mechanical phase space distribution in powers of major advantage of this approach is that the majority of the Plank’s constant, Wigner and Kirkwood obtained the first-order correction to the classical phase space d i s t r i b ~ t i o n . In ~ ~this ~ . ~ ~ ~ work is purely classical, and only the energy of a small set of representative (uncorrelated) configurations needs to be detercase, one can define a “quantum” potential V,, such that mined. However, the method is limited to the determination of equilibrium quantum mechanical properties.

4. Simulation Results on Solvation Effects where VCl is the classical potential and fi = ( I c ~ 0 - l . Hence, one can either simulate the system using V,, or simulate using VCland treat the quantum mechanical potential as a perturbation to the classical potential. Quantum corrections can also be obtained by calculating the velocity spectrum (Fourier transform of the velocity autocorrelation function) from a classical simulation and reweighting accordingly. Berens et al. have given the corrections for the energy, heat capacity, free energy, and entropy of the system using this approach.15* Instead of performing a full semiempirical or ab initio calculation for the whole system, one may define a particular region of interest which will be treated quantum mechanically while the remaining degrees of freedom are treated classically.154 This type of mixed quantum mechanical and molecular mechanical (QM/MM) approach has been used to study the active site of lysozyme,81solvation free e n e r g i e ~ , ’ ~and ~ . ’the ~ ~effects of solvent on vibrational spectra157and simple reactions.158Even for these reduced systems one is limited to mainly semiempirical calculations. Unfortunately, the commonly used semiempirical Hamiltonians provide a poor description of water-water interactions. 154~159 Feynman path integral methods, which exploit an isomorphism between quantum and classical statistical mechanics, have also been used to investigate quantum effects in the condensed phase.160.161The method has been applied to a whole protein162 and liquid water.161s163 In terms of the structure of liquid water, differences between the quantum and classical simulations were found to be largest for the hydrogen-hydrogen radial distribution function, with the quantum simulation producing less structure. 161 An application of the extended system Lagrangian approach has been proposed by Car and Parrinello for the treatment of quantum systems.164 Here, the extended Lagrangian includes

In this section we will review some of the simulation results describing the effect of solvent on biomolecules. It does not constitute an exhaustive search of the current literature but does provide a representative view of our present understanding of solvation effects. Unfortunately, most of the studies to date have concentrated on proteins or small peptides, with relatively fewer studies of DNA or lipids. Hence, the results are slightly biased toward effects observed for proteins. Rossky and Karplus performed the first simulation to study the effect of solvent on the structure and dynamics of a b i o m ~ l e c u l e They . ~ ~ ~studied ~ ~ ~ the ~ alanine dipeptide in explicit water and concluded that, in comparison with equivalent in vacuo simulations, certain aspects of the local fluctuations and structure are not significantly affected by the presence of the solvent, especially for motions on the order of a few picoseconds or less. Subsequently, simulations of small biomolecules in explicit solution and integral equation results have shown that, although the local dynamics and structure are retained on addition of solvent, the effect of solvent on the overall free energy surface of the solute (achieved in the long time limit) is s i g n i f i ~ a n t . ~ Although ~ , ~ ~ ~ -the ~ ~exact ~ results are dependent on the force field parameters, the general trend is that of a flattening of the corresponding vacuum potential energy surface on addition of solvent. The relative free energies of different conformations are modified due to competition between solutesolute and solute-solvent hydrogen bond formation.177 The barriers separating the various conformations are also reduced, facilitating barrier crossing and changing the dynamical properties of the solute. The same general solvent effects are observed with Langevin dynamics on a PMF surface.loO Some effects are recovered by coupling molecular dynamics and the finite difference Poisson-Boltzmann technique for the calculation of the reaction field contribution to the energy of a solute in

9708 J. Phys. Chem., Voi. 98, No. 39, 1994

Figure 1. A comparison of the refined water molecule positions obtained from three diffraction experiments on two different myoglobin crystal forms. The watem from two X-ray experiments, one on a P6 crystal and one an a P21 crystal. In addition, the waters obtained from a neutron diffraction experiment on the P21 crystal form are also displayed. Notice the general lack of correlation between water molecules refined from the different experiments. In comparison, the agreement an the protein structure was excellent, Molecular dynamics simulations find density maxima at essentially all the water refinement sites from these experiments with a range of equilibrium populations (occupancies)! The experiments are color-coded blue, P6-Phillips. Bmokhaven listing 1MBW red, PZI-Kuriyan, Bmokhaven listing IMBC; purple, PZI-Schoenbom. (Neutrun) Brookhaven listing ZMB5; green, ribbon protein-heme structure from P6 form.

solution;73however, such a model includes no kinetic effects from the solvent. Studies of isolated protein secondary structural elements (usually helices) have also been used to probe the influence of solvent and solvent models on the structures, as well as investigating the structure of the solvent around the solutes?8.178.L7qIt is observed that specific interaction between helices and water molecules, resulting in water insertion into the helix, promotes the destabilization of polyalanine helices.178 The use of a sigmoidal dielectric function to represent implicit solvent screening was found to reproduce some effects of a full explicit simulation?8 Comparisons between in vacuo and explicit solvent simulations have also been used to probe the effect of water on protein^.^^^-^^^ For interior residues. the size of fluctuations is the same in both environments.lm However, for exterior atoms the magnitude of the fluctuations are significantly affected, resulting in larger displacements in solutiou. In vacuo simulations tend to promote the formation of incorrect hydrogen bonds,'81 many of which result from the "collapse" of charged side chains on the surface of the protein. (In fact, prolonged simulation without accounting for solvation effects results in the formation of a spherically deformed ~ r 0 t e i n . l ~For ) realistic properties of proteins at least a couple of solvation layers of ~ . ~ ~ the ~ inclusion of an explicit water are r e q ~ i r e d . ~In~general, solvent environment leads to average structures which are closer to that of the crystal.l8' Another theoretical approach to determine the effects of solvent on biomolecules is to compare simulations in different solvent environments.l86-188 Comparative simulations of a

small peptide and BF'TI have been performed in both aqueous solution and chloroform. Chloroform, lacking the hydrogenbonding capability of water, promotes the formation of a larger solute-solute hydrogen-bonding uetwork.lS8 The increased number of hydrogen bonds arise from changes in the orientation of amino acid side chains,lg7 while the conformation and flexibility of the backbone are essentially the same in both envirouments.186J87 The presence or absence of solvent also has a pronounced effect on the dynamical properties of a protein.1L2,180Most noticeable are the longer relaxation times observed for side chains in an explicit solvent. The longer relaxation times are a direct result of the viscosity of the solvent which imparts significant diffusive character to the motion of protein atoms.L8o Using Langevin dynamics, Lonchariacb et ai. have investigated the effect of solvent viscosity on the isomerization rates of the alanine dipeptide in an implicit solvent.2z The isomerization rate is observed to be intricately linked to the solvent viscosity. Justification for the need of explicit simulations is provided by the observation of specific biomolecule-water interactions and momentum transfers at the atomic level. These important interactions have been observed both from experiment1°~"~lg9 and from explicit simulatious.l9O-lM Both DNA18qJq1.'93.195 and p r ~ t e i n s ~ ~ J *possess ~ " ' ~ waters of hydration which display reduced mobility compared to bulk water, suggesting strong and specific interaction with the biomolecule. In addition, biomolecules also influence the dynamics of water molecules in the vicinity of the molecular surface, again resulting in reduced mobility.119J96 These effects suggest that solvent models employing just the bulk properties of the solvent may be in

Feature Article

J. Phys. Chem., Vol. 98, No. 39, 1994 9709

error. Recent simulation evidence also suggests that the protein-water interface is not as well defined as the picture obtained from crystallographic studies.I9’ In fact, significant solvent density is observed inside of what would be classically considered the protein surface. Again, these effects cannot be reproduced by many methods. Finally, several attempts have been made to investigate the Although effects of solvent on chemical reactions.81~198-202 simple reactions may be parametrized using molecular mechanics methods,199one usually has to resort to more expensive ab initio calculations.200 This has severely limited the number of systems studied. For the systems studied so far the solvent (usually water) has been found to have a significant effect on the reaction profile compared to that in the gas phase. Solvent has been shown to both destabilize199and stabilize200~201 the intermediate transition state.

opportunity to study biomolecular systems with increasingly more realistic models. Hence, we expect to see the emergence of solvent models which include polarization effects, and an increase in the use of quantum mechanical calculations, together with the extension of current methods to probe solvent effects on the thermodynamics and long time motions of biomolecules. Hopefully, the problems associated with the truncation of electrostatic interactions in explicit solvent simulations will be substantially reduced as the study of larger systems (and longer cutoffs) become routine. In principle, many of the solvation models merely require the development of sufficiently fast computers that they may be applied to systems involving biomolecular solvation. For most models the theory leads the application which suggests the use of computer simulations for the study of solvation of biomolecules has yet to reach its full potential.

5. Discussion and Outlook

Acknowledgment. The authors thank Wilfred F. van Gunsteren, Herman J. C. Berendsen, Mike Gilson, John s. Perkyns, Gillian Lynch, and Ilario G. Tironi for many valuable discussions concerning the aspects of solvation and solvent models. We also thank the Robert A. Welch Foundation and the NIH for financial support.

In this article we have tried to outline many of the commonly used methods for the treatment of solvent in biomolecular systems. Although the list is by no means complete, it does represent the broad range of techniques available at present. Obviously, the exact method of choice will always depend on what one is trying to calculate. In this respect, a balance between accuracy and computational efficiency is usually desired. However, as the methods become more and more efficient, the corresponding loss of detail will tend to obscure the mechanistic details surrounding the solvation process. As ab initio calculations become more accessible, there is a growing trend toward using such results to parametrize many of the models already discussed (especially explicit solvent simulations). This is largely due to the difficulty in obtaining the necessary experimental data for the heterogeneous systems of current interest. Many of the commonly used techniques for developing parameters to study such systems are not easily applied to proteins (radial distribution functions, internal energy, density, erc.),hence the recourse to small model systems studied quantum mechanically. This has also led to an increase in the comparison of the results from simple models with comparable explicit simulations. In many cases subsequent comparison with the experimental data may neglect important counterion eff e c t ~viscosity ,~~ and dielectric changes, etc., which may also have a profound effect on the structure and dynamics of the biomolecule. Indeed, even in well-studied systems where solvent structure is experimentally accessible theoretical studies can make an important contribution leading to improvements in the interpretation of experimental results. Shown in Figure 1 is the X-ray-determined structure of myoglobin, a small oxygen storage protein. Proximal water molecules of hydration determined from three different experiments are displayed. One can see that the experiments only agree for relatively few water molecules. Recently, computer simulations have provided a unifying view of the solvation of myoglobin accounting for nearly all the experimentally placed solvent molecules as local density maxima.8 The key is the interpretation of water occupancy in terms of probability distributions rather than distinct, fully occupied, hydration sites. Such information is readily available from explicit solvent simulations. The derivation of appropriate solvent distribution functions has allowed such information to be used in the refinement of experimental data.203 Advances in theory will permit the study of more complex systems with increasing accuracy. As computer power increases and becomes more generally available, it will provide the

References and Notes (1) (2) (3) (4) (5) 39-92.

Wlodawer, A. Prog. Biophys. Mol. Biol. 1982, 40, 115-157. Kossiakoff, A. A. Ann. Rev. Biophys. Bioeng. 1983, 12, 159-182. Parak, F. Methods Enzymol. 1986, 127, 196-206. Schoenbom, B. P. J. Mol. Biol. 1988, 201, 741-749. Goldanskii, V. I.; Krupyanskii, Y. F. Q. Rev. Biophys. 1989, 22, Cheng, X.; Schoenbom. B. P. Acta Crystallogr. 1990, 846, 195Rupley, J. A,; Careri, G. Adv. Protein Chem. 1991, 41, 37-172. Lounnas, V.; Pettitt, B. M. Proteins 1994, 18, 133-147. Lounnas, V.; Pettitt, B. M. Proteins 1994, 18, 148-160. Saenger, W. Annu. Rev. Biophys. Biophys. Chem. 1987, 16, 93Otting, G.; Liepinsh, E.; Wiithrich, K. Science 1991, 254, 974Teeter, M. M. Annu. Rev. Biophys. Biophys. Chem. 1991,20,577Brooks 111, C. L.; Karplus. M. Methods Enzymol. 1986, 127, 369-

(14) Marlow, G. E.; Perkyns, J. S.; Pettitt, B. M. Chem. Rev. 1993, 93, 2503-2521. (15) Friedman, H. L. Annu. Rev. Phys. Chem. 1981, 32, 179-204. (16) van Gunsteren, W. F.; Luque, F. J.; Timms, D.; Torda, A. E. Annu. Rev. Biophys. Biomol. Struct. 1994, 23, 847-863. (17) Chandler, D. Equilibrium Theory of Polyatomic Fluids. In Montroll, E. W., Lebowitz, J. L., Eds. The Liquid State ofMatter: Fluids, Simple and Complex; North-Holland: New York, 1982; pp 275-340. (18) Pettitt, B. M.; Karplus, M. Chem. Phys. Lett. 1985, 121, 194-201. (19) Eisenberg, D.; Kauzmann, W. The Structure and Properties of Water; Oxford University Press: New York, 1969. (20) Hirata, F.; Rosskv, P. J. J . Chem. Phvs. 1981. 74, 6867-6874. (21) Ryckaert, J.-P.; Ciccotti, G.; Berendsen, H. J. C. J . Comput. Phys. 1977, 23, 327-341. (22) Loncharich, R. J.; Brooks, B. R.; Pastor, R. W. Biopolymers 1992, 32, 523-535. (23) Kitao, A,; Hirata, F.; Go, N. Chem. Phys. 1991, 158, 447-472. (24) Zhou, H. X. Chem. Phys. Lett. 1989, 164, 285-290. (25) Mierke, D. F.; Said-Nejad, 0. E.; Schiller, P. W.; Goodman, M. Biopolymers 1990, 29, 179- 196. (26j O’Connor, S . D.; Smith, P. E.; AI-Obeidi, F.: Pettitt, B. M. J . Med. Chem. 1992, 35, 2870-2881. (27) Kirkpatrick, S . ; Gelatt, C. D.; Vecchi, M. P. Science 1983, 220, 671-680. (28) Singh, U. C.; Weiner, S. J.; Kollman, P. Proc. Nail. Acad. Sci. U S A . 1985, 82, 755-759. (29) Bom, M. Z. Phys. 1920, 1, 45-48. (30) Roux, B.; Yu, H.-A.; Karplus, M. J. Phys. Chem. 1990,94,46834688. (31) Levy, R. M.; Belhadj, M.: Kitchen, D. B. J . Chem. Phys. 1991, 95. 3621-3633. (32) Perkyns, J.; Pettitt, B. M. Biophys. Chem. 1994, 51, 129-146.

9710 J . Phys. Cliem., Vol. 98, No. 39, 1994 (33) Jayaram, B.; Fine, R.; Sharp. K.; Honig, B. J . Phys. Chem. 1989, 93, 4320-4327. (34) Rashin, A. A. J . Phys. Chem. 1989, 93, 4664-4669. (35) Rashin, A. A. J . Phys. Chem. 1990, 94, 1725-1733. (36) Sitkoff, D.; Sharp, K. A.; Honig, B. J . Phys. Chem. 1994,98,19781988. (37) McCammon, J. A,; Wolynes, P. G.; Karplus. M. Biochemistry 1979, 18, 927-942. (38) Daggett, V.; Kollman, P. A,; Kuntz, I. D. Biopolymers 1991, 31. 285-304. (39) Guenot, J.; Kollman, P. A. J . Comput. Chem. 1993,14, 295-311. (40) Smith, P. E.; van Gunsteren, W. F. Methods for the Evaluation of Long Range Electrostatic Forces in Computer Simulations of Molecular Systems. In van Gunsteren, W. F., Weiner, P. K., Wilkinson, A. J.. Eds. Computer Simulation of Biomolecular Systems: Theoretical and Experimental Applications; ESCOM: Leiden, 1993; Vol. 2, pp 182-212. (41) Hingerty, B. E.; Ritchie, R. H.; Ferrell, T. L.; Turner. J. E. Biopolymers 1985, 24, 427-439. (42) Ramstein, J . ; Lavery, R. Proc. Natl. Acad. Sci. U.S.A. 1988, 85, 723 1-7235. (43) McQuanie, D. A. Statistical Mechanics; Harper & Row: New York, 1976. (44) Hill, T. L. J . Phys. Chem. 1956, 60, 253-255. (45) Berry, R. S.; Rice, S. A.; Ross, J. Physical Chemistry; John Wiley & Sons: New York, 1980. (46) Kxkwood, J. G. J. Chem. Phys. 1934, 2, 351-361. (47) Onsager, L. J . Am. Chem. SOC. 1936, 58, 1486-1493. (48) Kirkwood, J. G. J . Chem. Phys. 1939, 7. 911-919. (49) Cramer, C. J.; Truhlar, D. G. Rev. Comput. Chem., in press. (50) Tanford, C.; Kirkwood, J. G. J . Am. Chem. Soc. 1957, 79, 53335339. (51) Kirkwood, J. G.; Westheimer, F. H. J . Chem. Phys. 1938,6, 506512. (52) Westheimer, F. H.; Kirkwood, J. G. J . Chem. Phys. 1938, 6, 513517. (53) Tanford, C.; Roxby, R. Biochemistq 1972, 11. 2192-2198. (54) Felder, C. E. J . Chem. Phys. 1981, 75, 4679. (55) Alper, H.; Levy, R. M. J . Chem. Phys. 1993, 99. 9847-9852. (56) van Gunsteren, W. F.; Berendsen, H. J. C.; Rullmann, J. A. C. Faraday Discuss. Chem. SOC.1978, 66, 58-70. (57) Wanvicker, J.; Watson, H. C. J . Mol. Biol. 1982, 157, 671-679. (58) Reddy, J. N. An Introduction to the Finite Element Method: McGraw-Hill: New York, 1984. (59) YOU,T. J.; Harvey, S. C. J . Comput. Chem. 1993, 14, 484-501. (60) Zauhar, R. J.; Morgan, R. S. J . Mol. B i d . 1985, 186, 815-820. (61) Davis, M. E.; McCammon, J. A. Chem. Rev. 1990, 90, 509-521. (62) Jayaram, B.; Sharp, K.; Honig, B. Biopolymers 1989, 28. 975993. (63) Sharp, K. A.; Honig, B. J. Phys. Chem. 1990, 94, 7684. (64) Luty, B. R.; Davis, M. E.; McCammon, J. A. J . Comput. Chem. 1992, 13, 1114-1118. (65) Sharp, K.; Fine, R.; Schulten, K.; Honig, B. J . Phys. Chem. 1987. 91, 3624-3631. (66) Allison, S. A.; Bacquet, R. J.; McCammon, J. A. Biopolymers 1988, 27, 251-269. (67) Davis, M. E.; Madura, J. D.; Sines, J.; Luty, B. A,; Allison, S . A.; McCammon, J. A. Methods Enzymol. 1991, 202, 473-498. (68) Stemberg, M. J. E.; Hayes, F. R. F.; Russell, A. I.; Thomas, P. G.; Fersht, A. R. Nature 1987, 330, 86-88. (69) Bashford, D.; Karplus, M. Biochemistry 1990, 29, 10219-10225. (70) Yang, A.; Gunner, M. R.; Sampogna, R.; Sharp, K.; Honig, B. Proteins 1993, 15, 252-265. (71) Mohan, V.; Davis, M. E.; McCammon, J. A,; Pettitt, B. M. J . Phys. Chem. 1992, 96, 6428-6431. (72) Mohan, V.; Cheng, Y.-K.; Marlow, G. E.; Pettitt, B. M. Biopolymers 1993, 33, 1317-1325. (73) Sharp, K. J . Comput. Chem. 1991, 12,454-468. (74) Gilson, M. K.; Davis, M. E.; Luty, B. A,; McCammon, J. A. J . Phys. Chem. 1993, 97, 3591-3600. (75) Oberoi, H.; Allewell, N. M. Biophys. J. 1993, 65, 48-55. (76) Holst, M.; Kozack, R. E.; Saied, F.; Subramaniam, S. Proteins 1994, 18, 231-245. (77) Harvey, S. C. Proteins 1989, 5, 78-92. (78) Bashford, D. Curr. Opin. Struct. B i d . 1991, I , 175-184. (79) Moult, J. Curr. Opin. Struct. Biol. 1992, 2, 223-229. (80) Warshel, A,; Russell, S. T. Q. Rev. Biophys. 1984, 17, 283-422. (81) Warshel, A.; Levitt, M. J . Mol. Biol. 1976, 103, 227-249. (82) Russell, S. T.; Warshel, A. J . Mol. Biol. 1985, 185, 389-404. (83) Eisenberg, D.; McLachlan, A. D. Nature 1986, 319, 199-203. (84) Eisenberg, D.; Wesson, M.; Yamashita, M. Chem. Scr. 1989, 29A, 217-221. (85) Still. W. C.: TemDczvk. A,: Hawlev. R. C.: Hendrickson. T. J. Am. Chem. SOC.1990, 112, 6i27-6129. (86) Schiffer, C. A ; Caldwell, J. W.; Kollman, P. A,, Stroud, R. M. Mol. Simul. 1993, 10, 121-150.

Smith and Pettitt (87) Stouten, P. F. W.; Frommel, C.; Nakamura, H.; Sander. C. Mol. Simul. 1993, 10, 97-120. (88) Fauchere, J. L.; Pliska, V. Eur. J . Med. Chem.-Chim. Ther. 1983, 18. 369-375. (89) Kim, A,; Szoka, F. C. Pharm. Res. 1992, 9, 504-514. (90) Hine, J.; Mookerjee, P. K. J. Org. Chem. 1975, 40. 292-298. (91) Wolfenden, R. V.; Cullis, P. M.; Southgate, C. C. F. Science 1979. 206, 575-577. (92) Hasel, W.; Hendrikson, T.; Still, W. C. Tetrahedron Comp. Methods 1988, I , 103. (93) Perrot, G.; Cheng, B.; Gibson, K. D.; Vila, J.; Palmer, K. A,; Nayeem, A,; Maigret, B.; Scheraga, H. A. J . Comput. Chem. 1992. 13, 1-11. (94) Ben-Naim, A. Biopolymers 1990, 29, 567-596. (95) Ben-Naim, A,; Mazo, R. M. J. Phys. Chem. 1993, 97, 1082910834. (96) Ben-Naim, A. Solvation Thermodynamics: Plenum Press: New York, 1987. (97) van Gunsteren, W. F.; Beutler, T. C.; Fraternali, F.; King, P. M.; Mark, A. E.; Smith, P. E. Computation of Free Energy in Practice: Choice of Approximations and Accuracy Limiting Factors. In van Gunsteren, W. F., Weiner, P. K., Wilkinson, A. J., Eds. Computer Simulation of Biomolecular Systems: Theoretical and Experimental Applications; ESCOM: Leiden, 1993; Vol. 2, pp 315-348. (98) Smith, P. E.; van Gunsteren. W. F. J . Phys. Chem., in press. (99) Hansen, J. P.; McDonald, I. R. Theory of Simple Liquids, 2nd ed.; Academic Press: London, 1986. (100) Smith, P. E.; Pettitt. B. M.; Karplus, M. J . Phys. Chem. 1993, 97. 6907-69 13. (101) Kirkwood, J. G. J. Chem. Phys. 1935, 3, 300-313. (102) Mitra, R.; Pettitt, B. M.; R a m , G. L.; Blake, R. D. Nucl. Acids Res. 1993. 21, 6028-6037. (103) Chandrasekhar, S . Rev. Mod. Phys. 1943, 15, 1-89. (104) Adelman. S. A. Adv. Chem. Phys. 1980, 44, 143-253. (105) Chandler, D. Introduction to Modem Statistical Mechanics; Oxford University Press: New York, 1987. (106) Pastor, R. W. Topics in Stochastic Dynamics of Polymers. PhD Thesis, Harvard University, 1984. (107) Pastor, R. W.; Karplus, M. J . Phys. Chem. 1988,92, 2636-2641. (108) Pear, M. R.; McCammon, J. A. J . Chem. Phys. 1981, 74, 69226925. (109) Ladanyi, B. M.; Evans, G. T. J . Chem. Phys. 1983, 79,944-952. (110) Venable, R. M.; Pastor, R. W. Biopolymers 1988,27, 1001-1014. (111) van Gunsteren. W. F.; Berendsen, H. J. C. Mol. Simul. 1988, I , 173- 185. (112) Brooks 111, C. L.; Karplus, M.; Pettitt, B. M. Proteins. A Theoretical Perspective of Dynamics, Structure, and Thermodynamics. In Prigogine, I., Rice, S. A., Eds. Advances in Chemical Physics; John Wiley & Sons: New York, 1988; Vol. 71. (113) van Gunsteren, W. F.; Berendsen, H. J. C. Groningen Molecular Simulation (GROMOS) Library Manual; Biomos: Groningen, 1987. (114) Yun-Yu, S.; Lu, W.; van Gunsteren, W. F. Mol. Simul. 1988, 1, 369- 388. (115) Brooks, B. R.; Bruccoleri, R. E.; Olafson, B. D.; States, D. J.; Swaminathan, S.; Karplus, M. J . Comput. Chem. 1983, 4, 187-217. (1 16) Weiner, S. J . : Kollman, P. A.; Nguyen, D. T.; Case, D. A. J . Compitr. Chem. 1986, 7, 230-252. (117) Jorgensen, W. L.; Tirado-Rives, J. J . Am. Chem. Soc. 1988, 110, 1657-1666. (118) Allen, M. P.; Tildesley, D. J. Computer Simulation of Liquids; Oxford University Press: Oxford, 1987. (119) Brooks 111, C. L.; Karplus, M. J . Mol. Biol. 1989,208, 159-181. (120) Wallqvist, A. Mol. Simul. 1993, 10, 13-17. (121) van Gunsteren, W. F. Berendsen, H. J. C. Angew. Chem., Int. Ed. Engl. 1990, 29, 992-1023. (122) Quentrec, B.; Brot, C. J . Comput. Phys. 1975, 13, 430-432. (123) Greengard, L.; Rokhlin, V. J . Comput. Phys. 1987, 73*325-348. (124) Ding, H.;Karasawa, N.; Goddard 111, W. A. J . Chem. Phys. 1992, 97. 4309-4315. (125) Smith, P. E.; van Gunsteren, W. F. J . Mol. Biol. 1994, 236, 629636. (126) Brooks 111, C. L.; Pettitt, B. M.; Karplus, M. J . Chem. Phys. 1985, 83, 5897-5908. (127) Brooks 111, C. L. J . Chem. Phys. 1987, 86, 5156-5162. (128) Madura, J. D.; Pettitt. B. M. Chem. Phys. Lett. 1988, 150. 105108. (129) Alper, H. E.; Bassolino. D.; Stouch, T. R. J . Chem. Phys. 1993, 98, 9798-9807. (130) Schreiber, H.; Steinhauser, 0. J. Mol. Biol. 1992,228, 909-923. (131) Schreiber, H.; Steinhauser, 0.Biochemistry 1992,31, 5856-5860. (132) Schreiber, H.; Steinhauser, 0. Chem. Phys. 1992, 168, 75-89. (133) Smith, P. E.; Pettitt, B. M. J. Chem. Phys. 1991, 95, 8430-8441. (134) Clough, S. A.; Beers, Y.; Klein, G. P.; Rothman, L. S. J. Chem. Phys. 1973, 59, 2254-2259. (135) Whalley, E. Chem. Phys. Lett. 1978, 53, 449-451.

Feature Article

J. Phys. Chem., Vol. 98, No. 39, 1994 9711

(136) Gao, I.; Xia, X. Science 1992, 258, 631-635. (137) Stillinger, F. H.; David, C. W. J . Chem. Phys. 1978, 69, 14731484. (138) Caldwell, J.; Dang, L. X.; Kollman, P. A. J. Am. Chem. SOC. 1990, 112, 9144-9147. (139) Straatsma, T. P.; McCammon, J. A. Mol. Simul. 1990, 5, 181192. (140) Dang, L. X. J. Chem. Phys. 1992, 97, 2659-2660. (141) Halley, J. W.; Rustad, J. R.; Rahman, A. J . Chem. Phys. 1993, 98, 4110-4119. (142) Wallqvist, A.; Beme, B. J. J . Phys. Chem. 1993, 97, 1384113851. (143) Bemardo, D. N.; Ding, Y.;Krogh-Jespersen, K.; Levy, R. M. J . Phys. Chem. 1994, 98, 4180-4187. (144) Sprik, M.; Klein, M. L. J . Chem. Phys. 1988, 89, 7556-7560. (145) van Belle, D.; Froeyen, M.; Lippens, G.; Wodak, S.J. Mol. Phys. 1992, 77, 239-255. (146) Andersen, H. C. J . Chem. Phys. 1980, 72, 2384-2393. (147) Nos& S . J . Chem. Phys. 1984, 81, 511-519. (148) Mitchell, P. J.; Fincham, D. J . Phys. Condens. Matter 1993, 5, 1031-1038. (149) Kirkwood, J. G. Phys. Rev. 1933, 44, 31-37. (150) Gibson, W. G. Mol. Phys. 1974, 28, 793-800. (151) Isaacson, A.; Truhlar, D. G. J. Chem. Phys. 1981, 75, 40904094. (152) Berens, P. H.; Mackay, D. H. J.; White, G. M.; Wilson, K. R. J . Chem. Phys. 1983, 79, 2375-2389. (153) Wigner, E. Phys. Rev. 1932, 40, 749-759. (154) Field, M. J. The Simulation of Chemical Reactions. In van Gunsteren, W. F., Weiner, P. K., Wilkinson, A. J., Eds.Computer Simulation of Biomolecular Systems: Theoretical and Experimental Applications; ESCOM: Leiden, 1993; Vol. 2, pp 82-123. (155) Gao, J. J. Phys. Chem. 1992, 96, 537-540. (156) Gao, J. J . Phys. Chem. 1992, 96, 6432-6439. (157) Herman, M. F.; Beme, B. J. J . Chem. Phys. 1983, 78, 410341 17. (158) Bash, P. A.; Field, M. J.; Karplus, M. J . Am. Chem. SOC.1987, 109, 8092-8094. (159) Hemdon, W. C.; Radhakrishnan, T.P. Chem. Phys. Lett. 1988, 148, 492-496. (160) Chandler, D.; Wolynes, P. G. J . Chem. Phys. 1981, 74, 40784095. (161) Kuharski, R. A.; Rossky, P. J. J . Chem. Phys. 1985, 82, 51645177. (162) Zheng, C.; Wong, C. F.; McCammon, J. A,; Wolynes, P. G. Chem. Scr. 1989, 29A, 171-179. (163) Billeter., S . R.: Kine. P. M.: van Gunsteren. W. F. J . Chem. Phvs. 1994, Ibi 6692-6699. (164) Car. R.: Parinello. M. Phvs. Rev. Lett. 1985, 55, 2471-2474. (165) Laasonen, K.; Sprik, M.;Parrinello, M.; Car, R. J . Chem. Phys. 1993, 99. 9080-9089. (166) Vaidehi, N.; Wesolowski, T. A.; Warshel, A. J. Chem. Phys. 1992, 97, 4264-4271. (167) Zwanzig, R. W. J . Chem. Phys. 1954, 22, 1420-1426, (168) Rossky, P. J.; Karplus, M. J . Am. Chem. SOC. 1979, 101, 19131937. (169) Rossky, P.; Karplus, M.; Rahman, A. Biopolymers 1979,18, 825854. I

I

(170) Mezei, M.; Mehrotra, P. K.; Beveridge, D. L. J . Am. Chem. SOC. 1985,107, 2239-2245. (171) Alagona, G.; Ghio, C.; Kollman, P. A. J . Am. Chem. SOC. 1985, 107, 2229-2229. (172) Pettitt, B. M.; Karplus, M.; Rossky, P. J. J. Phys. Chem. 1986, 90, 6335-6345. (173) Lau, W. F.; Pettitt, B. M. Biopolymers 1987, 26, 1817-1831. (174) Anderson, A. G.; Hermans, J. Proteins 1988, 3, 262-265. (175) Tobias, D. J.; Brooks III, C. L. J . Phys. Chem. 1992, 96, 38643870. (176) Brooks III,C. L.; Case, D. A. Chem. Rev. 1993,93,2487-2502. (177) Hagler, A. T.; Osguthorpe, D. J.; Robson, B. Science 1980, 208, 599-601. (178) DiCapua, F. M.; Swaminathan, S.; Beveridge, D. L. J. Am. Chem. SOC. 1991, 113, 6145-6155. (179) Gerstein, M.; Lynden-Bell, R. M. J. Phys. Chem. 1993,97,29822990. (180) van Gunsteren, W. F.; Karplus, M. Biochemistry 1982,21,22592274. (181) Levitt, M.; Sharon, R. Proc. Natl. Acad. Sci. U S A . 1988, 85, 7557-7561. (182) Levitt, M. Chem. Scr. 1989, 29A, 197-203. (183) Amold, G. E.; Omstein, R. L. Proteins 1994, 18, 19-33. (184) Brooks, B. R. Personal communication. (185) Guenot, J.; Kollman, P. A. Protein Sci. 1992, I, 1185-1205. (186) Lautz, J.; Kessler, H.; van Gunsteren, W. F.; Weber, H.-P.; Wenger, R. M. Biopolymers 1990, 29, 1669-1687. (187) Hartsough, D. S.; M e n , Jr., K. M. J . Am. Chem. SOC. 1992,114, 10113- 10116. (188) Hartsough, D. S.;Men, Jr., K. M. J. Am. Chem. SOC. 1993,115, 6529-6537. (189) Neidle, S.; Berman, H. M.; Shieh, H. S. Nature 1980,288, 129133. (190) Hagler, A. T.; Moult, J. Nature 1978, 272, 222-226. (191) Ahlstrom, P.; Teleman, 0.; Jonsson, B. Chem. Scr. 1989, 29A, 97-101. (192) de Vlieg, J.; Berendsen, H. J. C.; van Gunsteren, W. F. Proteins 1989, 6, 104-127. (193) Mohan, V.; Smith, P. E.; Pettitt, B. M. J . Am. Chem. SOC. 1993, 115, 9297-9298. (194) Brume, R. M.; Liepinsh, E.; Otting, G.; Wiithrich, K.; van Gunsteren, W. F. J . Mol. Biol. 1993, 231, 1040-1048. (195) Forester, T. R.; McDonald, I. R. Mol. Phys. 1991, 72, 643-660. (196) Knapp, E. W.; Muegge, I. J . Phys. Chem. 1993,97,11339-11343. (197) Louhhas, V.; PettittB. M.; Findsen, L.; Subramanian, S . J . Phys. Chem. 1992, 96, 7157-7159. (198) Warshel, A. Biochemistry 1981, 20, 3167-3177. (199) Chandrasekhar, J.; Smith, S . F.; Jorgensen, W. L. J . Am. Chem. SOC.1985, 107, 154-163. (200) Weiner, S . J.; Singh, U. C . ; Kollman, P. A. J . Am. Chem. SOC. 1985, 107, 2219-2229. (201) Blake, J. F.; Jorgensen, W. L. J. Am. Chem. SOC. 1991,113,74307432. (202) Aqvist, J.; Warshel, A. Chem. Rev. 1993, 93, 2523-2544. (203) Lounnas, V.; Pettitt, B. M.; Phillips, Jr., G. N. Biophys. J. 1994, 66, 601-614.