J. Phys. Chem. 1986, 90, 6335-6345
6335
Integral Equation Model for Aqueous Solvation of Polyatomlc Solutes: Application to the Determination of the Free Energy Surface for the Internal Motion of Blomolecules B. Montgomery PettittJ Martin Karplus,* Department of Chemistry, Harvard University, Cambridge, Massachusetts 021 38
and Peter J. Rossky Department of Chemistry, University of Texas, Austin, Texas 78712 (Received: February 20, 1986)
A model is presented for determining the intramolecular potential of mean force for flexible polyatomic molecules in aqueous solution. This is an essential step in developing a reduced simulation technique for studying solvated biopolymers. The Ornstein-Zemike-like integral equation theory within a superposition formalism is shown to lead to a convenient and efficient method for calculating the solvent-modified intramolecular potential. Solute-solvent distribution functions for the atoms (sites) composing the polyatomic molecule are evaluated individually and introduced into the appropriate integral equations to obtain the site-site potentials of mean force for all distinct atom pairs in the molecule. The superposition approximation plus an empirical energy function for the internal degrees of freedom can then be employed to determine the total solventmodified potential of mean force surface for the molecular system. An application to the evaluation of the intramolecular free energy surface for the alanine dipeptide (N-methylalanylacetamide) under vacuum and in aqueous solution is given to illustrate the method.
I. Introduction A full theory of biopolymers requires a method for treating the effects of solvent on their thermodynamic, structural, and dynamic properties.' This is of fundamental importance because biopolymers, such as proteins and nucleic acids, contain many charged and polar groups whose interactions are strongly perturbed by a polar solvent like water. The behavior of their nonpolar (hydrophobic) groups in aqueous solution is of great interest as well. It has been suggested:*3 in fact, that the partitioning of polar and nonpolar groups between the exterior and interior of globular proteins is an essential determinant of their structure and stability in water. Most studies of macromolecules of biological interest have neglected the solvent4or used continuum models to introduce its effects5-' Recently, a number of molecular dynamics and Monte Carlo simulation studies of biopolymers including explicit treatment of the aqueous solvent molecules have been made.*-14 Although the molecular representation is very important for obtaining a realistic picture, it is also very time consuming to include solvent in simulations, even on supercomputers; e.g., a recent 80-ps simulation of a DNA oligomer composed of 8 base pairs in an environment of 14 Na+ and 1231 water molecules with periodic boundary conditions required 40 h on a Cyber 20S.14 The multitude of studies that need to be done to characterize the essential properties of biomolecules make it desirable to have available methods for treating the solvent that take into account its molecular structure and yet do not require arduous computations. In this paper we develop such a method that makes use of recent advances in the statistical mechanical theory of solut i o n ~ . ' ~The method is based on Ornstein-Zernike type integral equation theories for determining the solvent structure and its ~ integral equations are used effect on solute b e h a ~ i o r . ' ~ - *The first to determine the distribution functions of the solvent relative to the solute atoms. From these distribution functions the potentials of mean force between the different types of atom pairs composing the polyatomic biomolecule can be calculated. Given the potentials of mean force, an intramolecular superposition approximation is used to construct the solvent-modified potential surface for the flexible internal degrees of freedom of the biomolecule. Since this can be done once and for all for a given solvent a t a specified density and temperature, the resulting potential surface can be used for stochastic molecular dynamical or Monte Carlo simulations at a relatively modest cost in computer Current address: Department of Chemistry, University of Houston, Houston, TX 77004.
0022-3654/86/2090-6335$01.50/0
time, on the order of that required for treating the system under vacuum. This makes it possible to study interesting systems on the nanosecond to microsecond time scale. The objective of this paper is to present the theoretical basis of the approach outlined above, to apply it to the essential constituents of peptides and proteins, and to show how their interactions are modified by the presence of solvent. As a model system, the alanine dipeptide, N-methylalanylacetamide (Figure l), is considered. It is of interest because it contains the polypeptide backbone polar groups (C=O and NH) as well as aliphatic CH3sidechains. Moreover, the dipeptide has been studied in a number of solution simulations.*J'J* Calculation of the potentials of mean force in aqueous solution for the component atoms makes it possible to construct a solvent-modified free energy surface for the internal degrees of freedom of the dipeptide. Section I1 reviews the Ornstein-Zernike integral equation theory and its application in the extended RISM form to the determination of solute site-solvent site and solute site-solute site distribution functions. The introduction of a superposition approximation that permits one to calculate the solvent-modified free energy surface for a polyatomic molecule from the site-site distribution functions is outlined. In section 111, details of the (1) Brooks, C.; Karplus, M. Methods Enzymol. 1986, 127, 369. (2) Kauzmann, W. Adv. Protein Chem. 1959, 14. (3) Dill, K. A. Biochemistry 1985, 24, 1501. (4) Karplus, M.; McCammon, J. A. Annu. Rev.Biochem. 1983,53,263. (5) Tanford, C.; Kirkwood, J. G. J . Am. Chem. SOC.1957, 79, 5333. (6) Shire, S.J.; Hanania, G. I. H.; Gurd, F. R. N. Biochemistry 1974, 13, 2967. (7) Warshel, A.; Russell, S. T.; Churg, A. K. Proc. Natl. Acad. Sci. U.S.A. 1984,81, 4785. (8) Rossky, P.; Karplus, M. J . Am. Chem. SOC.1979, 101, 1913. (9) van Gunsteren, W.; Karplus, M. Biochemistry 1982, 21, 2259. (10) Chandrasekhar, J.; Jorgensen, W. L. J . Am. Chem. SOC.1985,107, 2974. (11) Mezei, M.; Mehrotra, P. K.; Beveridge, D. L. J . Am. Chem. SOC. 1985. 107. 2239. (12) Hagler, A. T.; Osguthorpe, D. J.; Robson, B. Science (Washington, D.C.)1980, 208, 599. (13) van Gunsteren, W.; Berendsen, H. J. C. J. Mol. Biol. 1984, 176,559. (14) van Gusteren, W.; Berendsen, W. J. C.; Gentsen, R. G.; Zwindermann, H. R. J. Ann. N . Y. Acad. Sci. 1986, in press. (15) Chandler, D. In The Liquid Stare of Matter: Montroll, E. W., Lebowitz, J. L., Eds.; North-Holland: Amsterdam, 1982; p 275. (16) Chandler, D.; Pratt, L. J . Chem. Phys. 1976, 65,2925. (17) Pratt, L.; Chandler, D.J . Chem. Phys. 1977, 76,3683. (18) Hirata, F.; Pettitt, B. M.; Rcssky, P. J. J . Chem. Phys. 1982, 77,509. (19) Pettitt, B. M.; Rossky, P. J. J. Chem. Phys. 1982, 77,1451. (20) Leveque, D.;Weiss, J. J.; Patey, G. J . Chem. Phys. 1980, 72,1887.
0 1986 American Chemical Society
6336 The Journal of Physical Chemistry, Vol. 90, No. 23, 1986
H
Figure 1. Alanine dipeptide (N-methylalanylacetamide) model system to which the solute atoms chosen for study correspond;the C,, conformation, which is the energy minimum under vacuum, is shown.
model and the parameters used to represent the solute and solvent are given. The results obtained for distribution functions of the aqueous solvent sites (H, 0)relative to the solute atomic sites are described in section IV. Section V describes the solute site-site potentials of mean force for the distinct pairs required to represent the dipeptide. The origin of some of the structural features of the potentials are analyzed. The resulting solvent-modified potential for the internal degrees of freedom of the alanine dipeptide is given in section VI and compared with the vacuum surface. Section VI1 presents a concluding discussion with the prospects for development and applications of the present approach. 11. Theory
The solution structure and internal thermodynamics of a polyatomic solute molecule composed of nMatoms (sites) can be determined from the nM-point singlet intramolecular density distribution function, pM(l)(iM)in the presence of the molecular solvent. We use the conventional notation that lMis a shorthand for the set of vectors (r(a)iM,..., r(Y)iM)where r(a)iMdenotes the position of the ath atom in the lth molecule of type M. Evaluation of pM(l)(iM) is equivalent to finding the solvent-modified intramolecular potentials of mean force from which the desired dynamic and thermodynamic properties can be obtained. In general, the potentials responsible for chemical bonding act over a relatively short distance and are 2-4 orders of magnitude larger than those involved in intermolecular interactions. To deal with covalent bonding distributions and their associated Boltzmann factors we use a classical (on, off) definition of bonding.16 If u(iM)is defined to be the intramolecular potential, the classical chemical bonding Boltzmann factor for a molecule of type M can be written
b ~ ( i =~H) M ( ~ M ~ x) P ( + ~ M ( ~ M ) )
(1)
where HM is 0 if any chemical bonds of M are broken and 1 otherwise; i.e., the factor HMessentially defines the molecule M. If we assume rigid covalent bonds, then the remaining terms of u(iM)are associated with relatively soft degrees of freedom such as bond angles and particularly dihedral angles. These degrees of freedom, which are expected to be most influenced by the condensed phase environment, govern the conformation of the polyatomic biomolecule. The one- (in general N-) molecule density distribution functions that characterize the system can be obtained by taking logarithmic functional derivatives of the grand canonical partition function with respect to the activity. The resulting equation can be recast in the form of a density cluster expansion16 by the techniques of topological reduction.z1 For the one-molecule distribution function of interest here this yieldsL6
or for the isotropic (uniform) case (21) Stell, G. In Theory of Classical Equilibrium Fluids;Frisch, H. L., Lebowitz, J. L., Eds.; Benjamin: New York, 1964.
Pettitt et al. PM(’)(~M) =P M S ~ ~ M ) (2b) where paM(da)iM) is the single-particle density distribution function for the site CY in molecule M, yM*(iM) is the singlet intramolecular cavity distribution functionz2for the sites iM,vM is the symmetry number, sMis the intramolecular distribution function for molecule M, and pMis the number density of molecule M in the solution. Physically, the solute “cavities” interact with the solvent in the same way as normal sites; however, they do not interact with each other. It has been noted” that the cavity distribution function for two particles in a fluid is qualitatively similar to the cavity distribution function for two particles in a molecule that do not overlap (Le., two particles separated by three bonds, the so-called 1,4 and higher pairs). We approximate the one-molecule nM-point cavity distribution function by a superposition of two-site, atomic particle, cavity distribution functions;L6p22 that is YM*(M
=
n yaMYM(r)
(3)
UMYM
where yaMyM(r) is the cavity distribution function matrix element for sites a and y in molecule M. We assume that all such pairs interact in the absence of solvent by spherically symmetric site-site potentials. This is similar in spirit to Kirkwood’s superposition approximationz3for the triplet distribution function. At the two-particle level we may define the cavity distribution function in a simple fluid as
y ( r ) = g(r)esu(‘) (4) where g(r) is the radial pair distribution function and V(r)is the pair potential for the sites (assumed spherical). The function g(r) may be written in terms of the potential of mean force (PMF), W W ,as g ( r ) = e-8W) (5) where W(r)can be written as W(r) = V(r)+ AW(r) (6) and AW(r) is the correction to the effective pair potential due to the presence of solvent. Comparing with eq 4, we see that the AW(r) term is the cavity potential; that is y ( r ) = e-8AWr) (7) The approximation made in using eq 7 on the right-hand side of eq 3 is that the solvent effect on two atoms is the same as that for two atoms separated by several bonds in the same polyatomic molecule (biopolymer) or free in the solvent. Obviously, this approximation is best for the case of extended polymers, such as peptides, where all of the atoms are exposed to solvent; the present approach would not be suitable for the treatment of atoms in the interior of a globular protein. An indication of the magnitude of the effect of neighboring groups on solute correlation functions is found in the work of Pratt and Chandler,z4 who compared nonpolar solutes, such as methane and ethane, in water. Analysis of the superposition approximation for the case of a pair of diatomic molecules in solution indicates that there is some cancellation of errors between those introduced by the superposition approximation and those due to the use of the free atom pair potential of mean force when spherically averaged (Pettitt and Karplus, unpublished calculations). With the problem reduced to determining the cavity distribution functions yaMTM(r) for charged spherical atomic sites a and y, a choice of method for evaluating the yam&) functions has to be made. One approach would be based on umbrella sampling simulation techniques, such as those used by Berne et al?5 for two apolar solutes in water and more recently by Berkowitz et a1.z6 for Na+ and C1-. For the present model calculation, which consists of the polypeptide backbone atoms plus the simple side chain (alanine), there are seven unique atom types, which requires the (22) Merron, E.; Seigert, A. J. F. J . Chem. Phys. 1968, 48, 3139. (23) Kirkwood, J. G. J . Chem. Phys. 1935, 3, 300. (24) Pratt, L.; Chandler, D. J . Chem. Phys. 1980, 73, 3430. (25) Pangali, C.; Rao, M.; Berne, B. J. J . Chem. Phys. 1979, 71, 2975. (26) Berkowitz, M. L.; Karim, 0. A.; McCammon, J. A.; Rossky, P. J. Chem. Phys. Left. 1984, 105, 571.
Aqueous Solvation of Polyatomic Solutes
The Journal of Physical Chemistry, Vol. 90, No. 23, 1986 6337
calculation of 28 such potentials for pairs of charged solute sites over a wide range of distances. Such a set of simulations would be extremely time consuming. However, there exists the alternative of using analytic techniques based on integral equations that are readily solved by elementary numerical methods. The equivalent problem of charged spherical atoms at infinite dilution in aqueous solution has been treated by use of the extended form of the RISM integral equations with renormalization of Coulomb
W.."" = v i j U U + U j j u u + kT(cciU''- hij"u) = Y u* .PU+ uCjjuu - kT(CUV*phvU)ij (1 5 ) IJ
or, with the definition of AW given in eq 6
w.. JJ = uij * 4-
vi/.+ A W j
(16)
where we have dropped the uu superscript. From eq 7 and 16 we may identify AW, with the cavity potential for charged sited interaction^.^^ i The general approach has been derived in the l i t e r a t ~ r e , ' ~ , ' ~ and j at infinite dilution in a polar solvent, which corresponds to a water model in the present case. so that we outline only the particular method used here. We start As in previous work,18*30 we make use of the fact that the solvent with the RISM e q ~ a t i o n ~ ' * ~ ~ dielectric information in the form of the static dielectric constant php = w*c*w + o*c*php (8) may be removed by simply subtracting the extended RISM screened Coulomb potential from AW and replacing it with a where all matrices are site-site labeled and * denotes a matrix phenomenological value from experiment; that is defining convolution product. Here h = g - 1 is the matrix of intermolecule pair correlation functions, c is the matrix of direct correlation functions, p is a diagonal matrix of site densities, and w is an intramolecular correlation function matrix defined in real space as WaMyM'
=
PM.6MM'[6aMyM '(raM
- ryM) + ( l
a'WM(',M
- ',My,)
- 'YM)]
(9)
for site CY in molecule M and site y in molecule M'. The symbol sfMyM represents the intramolecular pair distribution function for distinct spherical sites in a given molecule M. It is convenient to consider the correlation matrices site ordered according to species that are identified with solvent or solute. Equation 9 then separates into three equations for hW,h", and huu,the soluentsoluent, soluent-solute, and solute-solute correlation function matrices, respectively. At infinite dilution (Le., pu 0), eq 8 can be rewritten as a set of coupled integral equations of the form
-
+ w*cW*phWp hUVp= cUv*o+ cuv*phwp
phwp = w*cW*w
huu =
c'JU
+
CUV+phV'J
(10) (11) (12)
where now p and w refer only to the solvent since with the superposition model only monoatomic solutes need to be considered. Because we are dealing with charged sites (partial or full atomic charges),29 it is convenient to cast these equations in renormalized form and then use a closure that takes advantage of this renormalization. The details of this procedure have been given elsewhere.30 The closure used is an analogue of the familiar hypernetted chain ( H N C ) closure; i.e. cij = ,-udkT+htfCij - h.. 11 + cij - 1 (13) where h, and cij may be the w, vu, or uu pair correlation function matrix elements. The quantity Uijis the pair potential function matrix element between the ij pair of interacting sites (see eq 4). Ujjcan be written as the sum of a specifically short-ranged potential U,* (e.g., a Lennard-Jones 6-1 2 potential) and a long-range Coulomb part, Vi;, that goes as l/rij With the closure relation, eq 13, we can solve eq 10-12 sequentially. W e first solve eq 10 for the pure solventsolvent correlations. We then use the solution of eq 10 as input for eq 1 1 and solve for the solute-solvent correlations. Finally, we could, in principle, solve eq 12 for the solute-solute correlations with the solution of eq 11. However, what we are interested in for this particular application is the solvent-mediated potential (potential of mean force) between two solute sites i and j , W j , which has the form W, = -kT In (hij 1 ) (14)
+
For the present choice of an HNC-like closure (eq 13) this is simply (see eq 12) (27) Anderson, H. C.; Chandler, C. J . Chem. Phys. 1972, 57, 1918. (28) Chandler, D.; Anderson, H. C. J. Chem. Phys. 1972, 72, 1930. (29) Pettitt, B. M.; Karplus, M. J . Am. Chem. Soc. 1985, 197, 1166. (30) Hirata, F.; Rossky, P. J.; Pettitt, B. M. J. Chem. Phys. 1983,78,4133.
AWjj' = AWij -
vi;
ERISM
u-c 11
+-+v fp
(17)
where cp is the phenomenological dielectric constant of the solvent. Equation 16 becomes
Wj =
Uij*
+ AWjj'
(18)
The decomposition in eq 17 and 18 is important for obtaining meaningful results, in light of the fact that.al1 RISM theories yield an incorrect ab initio dielectric constant due to the implicit constraint in the closure that c is asymptotically equal to
[email protected] This has been shown to result in ~ R I S M=
1+(
4 (/.~,'))/3kT ~
(19)
where ( k 2is) the average of the dipole moment squared of the solvent molecules. It should, however, be noted that this dielectric replacement concerns only the continuum contribution to the potential of mean force and does not alter the contribution from ~ explicitly molecular effects. It has been d e m ~ n s t r a t e din~several cases that the fact that RISM theories yield an underestimated dielectric constant does not change the quality of the short-ranged correlations. Thus, the use of eq 17 is justified for improving the net Coulombic continuum contribution to the potential of mean force between solute sites. The present approximation to the intramolecular pair potential of mean force when coupled with an effective dynamical theory, such as Langevin dynamics, provides a reduction of a many body Hamiltonian problem to an effective few body problem. We examine some of the structural and thermodynamic consequences of these approximations in sections IV and V. In section VI the results of a model application of the potential of mean force determination method to the alanine dipeptide are presented. 111. The Model System As a limited objective for the present study, the atoms composing the blocked alanine dipeptide (N-methylalanylacetamide) were selected (see Figure 1). This dipeptide contains the atoms of the peptide backbone and the simplest (nonglycine) side chain. Specifically, there are three types of carbon atoms (the carbonyl carbon CO,the a carbon Ca, and the methyl group carbon CMc), the nitrogen atom N of the peptide bond, the carbonyl oxygen 0, and two types of hydrogen atoms (the polar amide hydrogen HNand the aliphatic hydrogen Hc of the C,H and CH3 groups). The parameters used for these atoms in the calculation are listed in Table I. They are very similar to those employed in a simu* ~ only ~ lation of the alanine dipeptide in aqueous s o l ~ t i o n . ~The change is that here there is a single charge for the methyl group carbons, and all aliphatic hydrogens have the same charge; in the previous dipeptide potential the three methyl groups had slightly different charges based on quantum mechanical calculations for a specific geometry. This modification, which enforces the (31) Cummings, P. T.; Stell, G. Mol. Phys. 1981, 44, 529. (32) Rossky, P. J.; Pettitt, B. M.; Stell, G. Mol. Phys. 1983, 50, 1263. (33) Rossky, P. J.; Karplus, M.; Rahman, A. Biopolymers 1979, 28, 825.
Pettitt et al.
6338 The Journal of Physical Chemistry, Vol. 90, No. 23, 1986 TABLE I: Leonard-Jones and Charge Parameters"
atom
U
t
4
3.208 2.616 3.208 2.770 1.604 3.208 2.640
0.0900 0.0045 0.0900 0.1600 0.0045 0.0900 0.2300
-0.0480 0.0246 0.0700 -0.2740 0.1830 0.4330 -0.42 80
0.400 3.215
0.0180 0.1188
0.4000 -0.8000
solute
CMS HC
c, N
HN
CO 0 solvent
H 0
, u in A, z in kcal/mol, and q in units of the electronic charge.
chemical equivalence of the hydrogens on a given methyl group, reduces the number of unique solute-solvent potentials and correlation Functions that have to be calculated. The original and modified parameter sets were examined in a comparative study of Coulomb interactions in the dipeptide system,29where it was found that there were very small differences in the vacuum structural, dynamic, and thermodynamic properties. The water model used is the simple three-site TIPS model34 of Jorgensen as adapted for analytic integral equation theories by Pettitt and Rossky;lg the nonbonded parameters for the TIPS model are also listed i n Table I, The pair correlations and properties for this potential model were the focus of a previous study,I9 and that analysis will not be repeated here. It has been demonstrated that the pair correlations obtained from integral equation theories yield good agreement with simulation results for the peak placements of the radial probability distribution function and for the coordination numbers.I9 The coordination number of sitesj about a site i within a given radial shell (volume) is defined as
TABLE 11: hoperties of Solute-Solvent Radial Distribution Functionsa
atom CM~ Hc C, N,
HN Co 0
max
rXHb
3.60 3.13 3.52 1.35 2.62 3.46 1.27
Uij)/2
and
4.85 4.93 4.85 1.86 3.82 4.66 1.82
3.06 2.56 3.00 2.83 2.01 2.89 2.77
I
=
[€ii€jj]
were used. These rules are consistent with those used in the molecular dynamics simulations of the dipeptide model system in aqueous solution and of pure To employ the solute-solute potentials of mean force obtained from the integral equations to construct the solvent-modified potential energy surface for the dipeptide, we need an expression for the vacuum energy of this system as a function of its internal degrees of freedom. For this purpose we use the empirical potential given p r e v i o ~ s l y .It~ ~is composed of bond length, bond angle,3* and dihedral angle terms, as well as van der Waals, electrostatic, and an explicit hydrogen bonding interaction; the latter is of the form ( A / @ - B/rlO) where A and B are suitable parameters and r is the hydrogen bond length. All parameters used in the potentials are identical with those listed in ref 33 except for the averaging of the carbon and hydrogen charges already described.
IV. Results for the Radial Distribution Functions The radial distribution functions for the aqueous solvent 0 and H around each of the biomolecule-solute atoms at infinite dilution are shown in Figures 2-5. To designate these correlation functions, we use a two-subscript notation with the first subscript denoting (34) Jorgensen, W. J . Am. Chem. Soc. 1981, 103, 335. (35) Hansen, J. P.; McDonald, I. R. Theory of Simple Fluids: Academic: New York, 1976. (36) Rahman, A.; Stillinger, F. J . Chem. Phys. 1974, 60, 1945. (37) Pettitt, B. M.; Rossky, P. J. J . Chem. Phys. 1983, 78, 7296. (38) Urey, J. C.: Bradley, C . A. Phys. Rea. 1931, 38, 1969.
(rm$
NXH
NXO
4.66 4.31 4.76 3.98 2.78 3.60 3.67
15.1 11.4 15.0 0.6 7.4 12.7 1.0
(rmin) 13.3 10.7 14.0 8.7 2.7 5.6 7.0
I
I
I
I L' I
0.
0.
,
I
I
2.
4.
6.
0.
rinA Figure 2. Solutesolvent radial pair distribution functions for the nitrogen; the solid line is gNo(r), and the dashed line is g N H ( r ) . 0 Distribution Functions 3.
I
I
I
I'
,'
l.
t
1(.
0. tij
min
N Distribution Functions
where r is the distance to the first (or any) minimum of gij(r)and pi is the number density of sites j . To obtain the parameters for the calculation of the solutesolvent interactions, the standard mixing rules35
+
max
"All distances in 8, and energies in kcal/mol. bPosition of first maximum and minimum in the radial distribution function, where X corresponds to the atom labeling the row. CCalculatedper single hydrogen.
Nv(r) = 47rpjx'gij(r? r R dr'
uij = (aii
rxob
min
0.
2.
,
I
. .?. . . . . . . , .
I
I
4.
6.
0.
rinA Figure 3. Solutesolvent radial pair distribution functions for the carbonyl oxygen; the solid line is goo@), and the dashed line is go&).
the solute and the second the solvent; e.g., gHNo(r)denotes the radial distribution function of the water oxygen relative to the amide hydrogen. Table I1 lists the maxima, minima, and coordination numbers for each solute atom. It should be noted that the coordination numbers correspond to a solute atom that does not have other solute neighbors and so may be larger than the actual coordination number for the site in a biomolecule. Amide Nitrogen. The N-H distribution in Figure 2 clearly displays the strong orientational correlation characteristic of a hydrogen bond. The first peak in gNH(r)is relatively sharp and narrow, which is similar to the features observed in pure water and in methanol between a highly charged acceptor site and a polar hydrogen site.I9v3' The N-H maximum occurs at 1.35 8, and is associated with a coordination number (eq 20) near unity; that is, on the average we would expect to see one water hydrogen which is hydrogen-bonded to such a N atom. There is also a rather well-defined second maximum arising from the other hydrogen of the hydrogen-bonded water. The N-0 distribution function is similar to that expected from the simple packing of nonpolar spheres. However, the first peak is somewhat narrower due to the orientation of the hydrogen-bonded water. The coordination number within the present approximation for an uncharged atom with the same van der Waals parameters is 12.7 compared to 8.7
Aqueous Solvation of Polyatomic Solutes Hydrogen 3.
The Journal of Physical Chemistry, Vol. 90, No. 23, 1986 6339
- Water Dirtribution Functions
% Hll
o.
A
Hc
I\ 0.
2.
4.
8.
- Water Distribution Functions
Carbon 3.
I
I
I
I
I
i
.
,
.,,
..., .
.I.,. .L, /
\
,
-i
.,
m c
i
8.
rinA
Figure 4. Solutesolvent radial pair distribution functions for the hydrogen atoms. The top panel is for the amide hydrogen, HN,and the bottom panel is for the carbon hydrogen, Hc. The solid line is gHxo(r), and the dashed line is gH,H(r).
for the charged nitrogen moiety used here. Carbonyl Oxygen. The carbonyl oxygen results are displayed in Figure 3. Compared to Figure 2, goH(r)displays more profound hydrogen bonding effects. In part, this is a result of the higher charge on the peptide oxygen than on the amide nitrogen. The first peak of gOH(r) is considerably sharper than in the case of nitrogen, indicating a stronger orientational correlation. This is also substantiated by the higher coordination number for 0-H. In the first solvation shell we obtain a coordination number of nearly 2. This suggests that, on the average, the carbonyl oxygen is nearly twice as good a hydrogen bond acceptor as is the amide nitrogen. Corresponding trends were obtained by computer simulations for amide and carbonyl groups in aqueous solutions.8 It should be noted, however, that extended RISM theory tends to yield attractive orientational features, such as hydrogen bonds, that are displaced by about 0.2 A to smaller r values and are somewhat more intense than those obtained from computer simulations for the same models. Since these characteristics tend to offset each other, the extended RISM theory provides a good measure of coordination and, therefore, of s t r u c t ~ r e . ~ ~ J ~ The carbonyl-oxygen, water-oxygen correlations demonstrate the packing behavior of water around such a charged atom. There is good agreement on the placement of the first peak in goo(r) with the previous dipeptide The first peak in goo(r) is similar to, though slightly broader than, that for pure water.19,34y36The small resolved peak near 4.5 A indicates that the water in the second solvation shell has some degree of tetrahedral orientation with respect to the carbonyl oxygen and the oxygens in the first solvation layer. The total coordination in the first two shells is greater in this work than that found in the simulation (see Table I1 and ref 8.and 33). as expected. Thus, although the packing in the first solvation layer is slightly different from pure water, the next layer appears to have some characteristics of the bulk solvent. Again, this is in accord with computer simulations. Hydrogen Atoms. In Figure 4, we present the pair distribution functions for both the nonpolar aliphatic and the polar amide hydrogen atoms. The most prominent difference between the two types of hydrogens is the position and width of the first peak in the distributions. That the peak is a t smaller r for HNthan for Hcresults from the difference in hydrogen radii (see Table I). The gH H(r) distribution is broader than for gHNH(r)and is associatki with a larger coordination number. In the gHxO(r)distributions, the first peaks have qualitatively different shapes. That for gHco(r)corresponds to a simple (uncharged) solute in water while that for gH&) is considerably sharper. There is a resolved shoulder after the first maximum in the latter. This indicates a different second solvation layer for this species. Also there is very little long-range order, as both the Hfl and HrH correlation
0.
L 0.
/I
2.
I
I
1
I
4.
8.
8.
r in A
Figure 5. Solutesolvent radial pair distribution functions for the carbon atoms. From top to bottom are shown the carbonyl carbon, the (Y carbon, and the methyl carbon distributions. The solid line is gcXo(r)and the dashed line is gCXH(r).
functions are nearly flat after 5 A. These differences for the two hydrogen atom types arise from a competition of the effects of charge vs. size. We consider this point in more detail in section V. Carbon Atoms. Figure 5 displays the correlation functions for the three different types of carbon atoms. In the sequence Cot C,, CMerwhich corresponds to the sequence from positive to negative charge, the first peaks of the gCH(r)distributions increase in width. For the methyl carbon, there is a significant increase in probability at small distances. The gcxo(r) distributions appear to exhibit two distinct types of behavior. The a carbon and the methyl carbon are quite similar in appearance and coordination number (see Table 11). The function gc,,o(r) agrees in the distance from peak to trough with the simulation of a similar quantity computed between entire methyl groups and the solvent.8 The carbonyl carbon, with the highest positive charge, displays a reordered second solvation shell. This is not unlike the behavior of increasing the charge while decreasing the radius in the hydrogen atom models where a similar water oxygen-solute atom correlation exists (see below). Analysis of Effect of Charge. Since the charge of an atom is the dominant parameter differentiating the hydrophobic (nonpolar) from the hydrophilic (polar) type, it is useful to consider the change in the correlation function as the charge on the solute atom is altered continuously, keeping the van der Waals parameters constant. An example of this type is provided by the three types of carbon atoms appearing in the model system. The results of such a charging calculation for solute, water oxygen and solute and water hydrogen correlation functions, are presented in Figures 6 and 7, respectively. The solute water oxygen distribution functions display different types of behavior for increasing negative charge and for increasing positive charge (Figure 6). For the positive cases (top of Figure 6 ) , the first peak in guo(r) moves inward. Starting at low charge, this peak first decreases in height and then starts to increase as the solute charge increases. Also the first peak gets broader at the base until finally a shoulder resolves itself into a new second peak. This is qualitatively different from what happens as the negative charge of the same-sized solute is increased (bottom of Figure 6). In this case, the first peak steadily increases in height and decreases in width without changing its position. The first minimum decreases in concavity until a new feature starts to emerge. This feature resembles the second peak in the highly charged carbonyl oxygen-water oxygen
6340 The Journal of Physical Chemistry, Vol. 90, No. 23, 1986 C-0 vs. Charge
Pettitt et al. TABLE III: Effects of Charge on Solvation at a Constant DiameteP rco
rCH
3.
~
q
min
max
0.50 0.25 0.10 0.00
3.5 3.5 3.5 3.6 3.6 1.6 1.5
4.7 4.8 4.8 4.8 4.8 2.0 2.0
-0.10 -0.25 -0.50
min
max
2.8 (4.1)b 3.6 (5.1)b 2.9 5.0 3.0 4.8 3.1 4.7 3.1 4.3 3.0 4.3 2.9 3.9
NCH
NCO
(rmin)'
(rmin)
13.5 13.8 14.9 15.1 14.4 0.4 1.1
6.4 (16.8)b 15.7 14.0 13.3 12.6 12.6 8.1
-
All distances in A. For this case the second maximum and minimum and the coordination number at the second minimum are given. CCalculatedper single hydrogen. 0.
'
I 1 2.
0.
I
I
I
4.
8.
8.
30. 25.
r in A Figure 6. Effect of changing solute charge on solute (U)-water distribution function. At the top are guo(r)distribution functions for positive charges: at the bottom are the guo(r) distribution functions for negative charges. The arrows indicate the direction of the change with respect to increasing charge magnitude. The zero charge case is presented for
both charge signs, and the other curves represent the range corresponding to Table 111; the van der Waals parameters used for U correspond to those for carbon.
10.
15. 5.
30* 25.
C-H I
Charge
VS.
'
I
I
1
A
/i
It
P'
i A
I
1.
2.
3.
4.
5.
6.
r Figure 8. Coordination number Nux@)for the charging of a carbon-like solute in TIPS water. The solute-water oxygen curves are shown in the
top panel, and the solutewater hydrogen curves are shown in the bottom panel. The highest positive charge example is given as the dashed line, the highest negative example as the solid line (see Table 111). 0.
2.
4.
8.
8.
r in A Figure 7. Effect of changing solute charge on solute (U)-water hydrogen distribution functions: the same set of results for gUH(r)are shown here as in Figure 6 for guo(r).
correlation function (see Figure 5). The changes with rap, :t to solute charge are even more striking in the solutewater hydrogen distribution functions given in Figure 7. For the positively charged case (top of Figure 7), the main peak in gUH(r)becomes narrower primarily by losing intensity at small distance values. Also, the peak increases in height with increasing charge. This is a manifestation of the water hydrogen orientation as the positive charges repel each other. As the negative charge on the solute is increased (bottom of Figure 7), the solvent hydrogens become increasingly localized (oriented) with respect to the solute. A new inner solvation shell is defined. This sharp feature is generally associated with either hydrogen bonding or ionic solvation in aqueous solution. In either case it is due to the strong Coulomb forces of the solute that orient the water molecules. Since the hydrogens in the water model used are completely enveloped in the van der Waals radius of the oxygen, the packing forces between the solute and the solvent are spherical and, therefore, do not inhibit the orientation due to electrostatic effects. Further, because the off-center hydrogens of the water model can get very close to the solute, an extreme localization can result from highly charged solutes. In Table 111 we have tabulated the maxima, minima, and coordination numbers for the charging experiment. The table
demonstrates the difficulty in comparing solvation numbers due to a given definition of solvation shell. There are obvious structural changes/complications evident in NUH(r)as a function of charge. This measure steadily increases until the onset of hydrogen bonding. At this point the coordination number drops precipitously and then begins to rise again. This is due to the formation of a new inner coordination sphere. For the extreme example of positive charge we have included the calculation of the charged solutewater oxygen coordination number at the second minimum for comparison. As positive charge is increased we notice that a new solvation shell becomes resolved at the highest positive charge, such that the coordination number appears to drop after a steady increase. Considering the first plus the new second shell, we conclude that the first minimum for the uncharged case was moving monotonically outward with increasing charge and that the coordination was steadily increasing. This is the type of behavior one might have expected from increasing the number of nearest neighbors due to electrostriction. The cumulative coordination curves, Nux(r), for the extreme charge case out to 6 A are displayed iniFigure 8. For hydrogen coordination, NUH(r),the behavior for the positive and negative case is very different. In the positively charged case the hydrogen's preference for pointing away from the solute manifests itself by having little number density at distances less than 3 A. This is in sharp contrast to the negative solute case where the presence of hydrogen bonding leads to a constant (saturation) in the NUH(r) curve from around the hydrogen bond distance (1.4 A) out to about 2.7 A. In the curves for the oxygen coordination number (NUo(r)) there is a very slight enhancement in coordination at
Aqueous Solvation of Polyatomic Solutes
0.0 0.5
F 1
-0.5
v
1
3
8
0.5
\
I
0.0
I
1 1
-
0.0
-
I
I
I
I
HEN
-
-0.5
al
I
I
1
I
Hc Hc
\
a
s
1
I
I
0.5
-
B
-
2-0.5
4a
I
0.5
1
1
I
I
I
,c,c
I 0.0 -
-OJ
t
1
V I
-1.0
0.
2.
4.
I
I
6.
8.
10.
r in A Figure 9. Potentials of mean force for low charge product to size ratio.
The solid line is the P M F the dashed line is the gas-phase pair potential with a dielectric constant of 80. contact for the positively charged species relative to the negative one. V. Potentials of Mean Force for Biomolecule Components All possible unique pair potentials of mean force (PMF) between the seven sites of interest (see Table I) can be generated from a knowledge of the correlation functions for each solute site with respect to the solvent sites by use of eq 12 and 15. There result a total of 28 pair potentials which are stored in tabulated form with their derivatives for the analysis and simulations of the biomolecules of interest.39 Rather than a display of all the individual PMF's, it appears more useful here to group them according to size and charge and discuss selected examples. We use the ratio of the charge product to the Lennard-Jones separation distance, R, = qgj/uil;this quantity corresponds to the electrostatic potential energy at the Lennard-Jones contact distance in vacuum. We have found that the qualitative features of the P M F s correlate roughly with the magnitude of R,. Engel and Hertz40 have used an analogous quantity, the "surface charge" q1/4ar;,to relate the ratio of water molecule reorientation times, as measured by NMR, to the size and charge of the solute. Three classes corresponding to different values of R , (in units of e2/A) were found. The first class is made up of the lowest values of R,IR,I < 0.001). These are essentially the hydrophobic pairs, such as Hc-Hc and C,-He. These low field cases are physically characterized by their inability to electrostatically orient a water molecule between them. Some examples of the PMF's are shown in Figure 9; for comparison the gas-phase potential with a dielectric constant of 80 is included in the figure. The effects of the intervening water molecules are manifested by the undulations in the P M F s corresponding to the solvation layers around the atoms. The distance between major minima is one water diameter. This is consistent with the interpretation of these features given in other studies of solvation of hydrophobic molecules by polar solvents. 17,30,43 The Helmholtz free energy is the thermodynamic potential for the canonical ensemble (N,V,T). Thus, the curves in Figure 9 and succeeding figures are a representation of the reversible work required to move two sites from infinite separation to a distance r. The barriers in the figure may be interpreted as the free energy (39) Pettitt, B. M.; Karplus, M. "A Stochastic Simulation of the Dynamics of MAA on the Potential of Mean Force", to be submitted. (40) Engel, G.; Hertz, H. G. Ber. Bunsen-Ges. Phys. Chem. 1968,72,808.
-Oa5 -1.0
t'
0.
I
I
I
I
2.
4.
e.
8.
10.
rink Figure 10. Potentials of mean for intermediate charge product to size ratio. The solid line is the PMF; the dashed line is the gas-phase pair potential with a dielectric constant of 80.
penalty for the collision (or formation) of two solute cages around the "cavities" in which the particles are situated. The minima are indicative of solvent stabilization. In the case of an aqueous solvent these features are more pronounced in terms of absolute energy differences than for simple or nonassociated polar liquids. For a fluid such as water, where an extensive hydrogen-bonded network exists, the effective forces can be more complicated. These complications are more evident for the higher charged species discussed below. The PMF's in Figure 10 correspond to pairs of sites with IR,I < 0.01 R , is about an order of magnitude greater than for the first case. The curves for Hc-O, C,-HN, and Hc-N are shown. These examples correspond to one rather highly charged site (e.g., 0, HN, N) and one site of low radius (Hc, HN). The PMF's that result from a competition between charge and packing effects have a more complex structure than did the relatively apolar systems of Figure 9. They display remnants of the effects of the packing structure of the solvent superimposed on the averaged orientational correlations. Due to charge effects the solvent cages apparently merge with a smaller free energy penalty that is exemplified by the relatively flat curves. Also, the solvent stabilization at contact due to packing is smaller than in the previous case. This is qualitatively consistent with consistent with simulation results.41 It has been shown that for intermediate charges at a fixed diameter there is a loss of water order (negative hydration) in the vicinity of the solute (the so-called Frank-Wien A region).42 The site-site cases with the highest values of R, are displayed in Figures 1 1-1 3; they correspond to values in the range IR,I 10.1. This category can be described in terms of three subgroups, positively (+ +), negatively (--) and oppositely (+-) charged pairs, for which the distribution functions are shown in Figures 11, 12, and 13, respectively. For the high-charge (+ +) pairs, the minima are somewhat closer to each other than for the lower charge examples, indicating an orientational preference. However, the barriers are small, as in the intermediate ratio case. Also, the steep repulsion of the Lennard-Jones wall is both softened and moved to larger r values, in contrast to the intermediate charge case (compare with Figure 10). These (++) pairs display features consistent with lower dielectric screening at contact than the asymptotic value. This is due to the explicitly molecular nature of the present theory for the liquid structure. With no solvent (41) Gieger, A. Ber. Bunsen-Ges. Phys. Chem. 1981, 85, 52. (42) Frank, H. S.; Wien, W. Y . Discuss. Faraday SOC.1957, 24, 133.
Pettitt et al.
6342 The Journal of Physical Chemistry, Vol. 90, No. 23, 1986
0.0 Oa5
0.5
0.0 -0.5
f
8
K t ii
1.0
\
a 3
0.5
.B
3
0.0
d
3
2
1.0 0.5 0.0 -0.5
'
0.
1
I
I
2.
4.
6.
t
8.
I."
10.
for like (+ +) charged examples. The solid line is the P M F the dashed line is the gas-phase pair potential with a dielectric constant of 80.
:
.
S
1
1
V
I
I
-.." 0.
2.
4.
6.
2.
4.
8.
8.
10.
rinA
rind,
Figure 11. Potentials of mean force for high charge product to size ratio
-0.5
0.
8.
10.
rhA Figure 12. Potentials of mean force for high charge product to size ratio potential of mean force for like (--) charged examples. The solid line is the PMF; the dashed line is the pair potential with a dielectric constant of 80.
between the two positively charged atoms, the effective screening due to the classical solvent is reduced. This can cause a larger repulsive field at contact for the strongly repulsive sites and, therefore, a larger apparent size (effective radius). The high-charge (--) pairs have substantially different behavior (Figure 12). These PMF's are reminiscent of negative ions in aqueous s o l u t i ~ n .There ~ ~ ~ is ~ a~ sizable stabilization a t contact, which is due only in small part to the effects of the large van der Waals well depth (see the similar behavior in Figure 9),since a corresponding stabilization is found in aqueous solvation studies of hard-sphere ions with the linearized HNC equationsz0 where there is no dispersion well. One source of the attractive well is the strong interaction of both negative solutes with the solvent (43)Pettitt, B. M.; Rossky,P. J. J . Chem. Phys. 1986, 84, 5836.
Figure 13. Potentials of mean force for high charge product to size ratio potential of mean force for unlike (+ -) charged examples. The solid line is the PMF; the dashed line is the pair potential with a dielectric constant of 80.
hydrogens at solute contact. The subtle effects of the aqueous model charge distribution are also apparent in the complex form of the first barrier (see below). Results for the relatively high-charged (+-) cases are shown in Figure 13. There is clearly a stabilization at contact as well as some structure in the barrier between the solvent separated and contact minima. This structure, which manifests itself as a depression a t the top of the barrier, is a result of the orientational correlations between the polar solvent and a charged solute. The dip occurs near the distance expected for tetrahedral orientations of the two solutes and the solvent. Thus, we see an aspect of the averaged threebody orientation projected onto the pair correlations or equivalently their potential of mean force. The size of the contact stabilization for these pairs is due to the reduced screening of the electrostatic potential a t small distances (see above). Intramolecular Hydrogen Bond. An essential interaction in biopolymers is the internal hydrogen bond; e.g., peptides, proteins, and nucleic acids are stabilized in their low-energy structures by such hydrogen bonds. For the alanine dipeptide model considered here (see section VI) there is a hydrogen bond between OLand HRin the C,* configuration shown in Figure 1. In the top panel of Figure 14 the P M F curve that corresponds to AWoH plus the hydrogen bonding 10-12 term is displayed. Also shown are the hydrogen bonding term and the shielded Coulomb interaction, VIE,plus the 6-12 term; the latter is the usual nonbonded plus shielded Coulomb interaction used to determine the PMF. The PMF minimum at antact is stabilized only slightly relative to the 1-10-12 contributions. In addition there is a very shallow minimum near 3.4 A with a slight barrier a t about 2.5 A. These features are associated with the solvent-stabilized minimum for near tetrahedral geometries. The magnitude of these effects is between that for an intermediate charge product to size ratio, corresponding to competition between the packing forces and orientational forces, and that for the strongly attractive case. One might have expected that there would have been a larger free energy barrier for breaking two solute water hydrogen bonds as the solutes approaches the stable contact minimum. However, the relatively small charges on N and HN in the present model (see Table I) lead to small perturbations of the solvent. The addition of the 10-12 term to the gas-phase molecular mechanics potential compensates for the small charges. In alternative reparameterizations of the potentials for peplarger charges have been introduced and no additional
-
Aqueous Solvation of Polyatomic Solutes
Hydrogen bonding PYFs
2.0
0.0
The Journal of Physical Chemistry, Vol. 90, No. 23, 1986 6343 180
I
90
-2.0 I
I
I’
I
I 1
I I
3 0
I .Ej
gd -2.0
-90
4
2
2.0
-180 -180
0.0
-2’o -4.0
t‘
0.
I
I
I
I
2.
4.
6.
8.
1
I 10.
r in A Figure 14. Hydrogen bonding potential of mean force. In the top panel, the full curve is the potential of mean force, the dashed line is the 1-10-12 potential, and the dot-dashed curve is the 1-6-12 potential used to obtain A W for the PMF curve; in all cases a shielded Coulomb interaction (e = 80) is used. The other panels show the effect of increasing the charge on the form of the potential of mean force; (center) qH = 0.3, qo = -0.6; (bottom) qH = 0.4, qo = -0.8. The dashed curve is the 1-6-12 potential with a shielded Coulomb interaction (c = 80).
10-12 term is used. With such an increase in the magnitude of the charges on the amide hydrogen and carbonyl oxygen, the solute-water and solutewatersolute orientations become more important. The resulting P M F s for approximately 50% and 100% higher solute charges are given in the two lower panels of Figure 14 where no 10-12 term is included. Relative to the 6-12 plus v C / t term (dashed line in the figure), the contact stabilization increases significantly for the larger charges. Also, the second minimum is more pronounced and there is a sizable free energy barrier between the two minima. Thus, the larger charge parameters, which put the solutesolvent and solutesolute potential on an equal footing with the solventsolvent interaction potential, yield an 0-H P M F that has a hydrogen bonding structure with the anticipated free energy barrier.
-90
0
180
90
b
180
90
-90
-180 -180
-90
0
90
180
$ Figure 15. (a) Vacuum surface of the alanine dipeptide showing a plot of the energy as a function of the dihedral angles @ and +. Solid lines
are the five lowest energy contours at 1-kcal intervals (the bottom contour is indicated with a heavy line); the dashed lines are the higher contours at 1-kcal intervals. (b) Solvent-modified surface for the alanine dipeptide showing a plot of the free energy as a function of the dihedral angles @ and +. See a for a description of contour levels.
VI. An Example: Solvent-ModifiedFree Energy Surface for the Alanine Dipeptide degrees of freedom to a given value and allowing the rest of the To illustrate the application of the present method, we present‘ molecule to relax to the energy minimum, subject to this constraint results for the relative free energies of different geometries of the (adiabatic mapping). The vacuum and solvent-modified potential alanine dipeptide (Figure 1) in aqueous solution. This provides surfaces for the dipeptide obtained in this way are shown in parts a potential of mean force surface that can be compared with the a and b of Figure 15, respectively. corresponding vacuum surface. Further, it can be used in a Most of the vacuum potential surface (Figure 15a) is at a high stochastic dynamic simulation of the dipeptide for comparison with energy relative to the two prominent wells, referred to as C7q existing full periodic boundary condition molecular dynamics (-60°,60’) and C7ax(60°, -60’). They are due primarily to an simulation of the dipeptide and its aqueous e n ~ i r o n m e n t . ~ * ~ internal ~ hydrogen bond which is present in pseudo seven-membered Results of the latter comparison will be reported in a separate (C,) rings, one of which, C7q, is shown in Figure 1. paper.39 The potential surface in solution (Figure 15b) is very different To obtain the full solvent-modified potential surface, we use in character. Many of the barriers and high-energy regions of the previously described vacuum surface (see section 111) and add the vacuum surface are now considerably lower in energy, although in the Awi,, as calculated in section V, for each of the 28 unique the hydrogen-bonded ring conformations are still of lowest energy. pairs (ij)of nonbonded interactions. Since the change between One region that is not changed much is the barrier at (4, # = vacuum potential and the solvent-modified potential is expected Oo, OO), which arises from the close (repulsive) contact between to be most pronounced for the soft dihedral angles, we show the amide hydrogen HRand the carbonyl oxygen OL(see Figure two-dimensional cuts through the full surfaces corresponding to 1). This type of repulsive interaction is little altered by the the variation of the free energy with the dihedral angle coordinates presence of solvent. d(CL’-NL-C,-CR’) and #(NL-C,-CR’-NR); see Figure 1. Such a The solvated surface along 4 = 180’ shows a pronounced ‘Ramachandran plot” is produced by constraining these two stabilization relative to the vacuum surface. Also, there are several significant local minima that were not present on the vacuum surface. These include the polyproline structure (PIT:-60°, 180’), (44) Pettitt, B. M.; Karplus, M. G e m . Phys. Lett. 1985, 121, 194.
6344
The Journal of Physical Chemistry, Vol. 90, No. 23, 1986
TABLE I V Solvent Contribution to Free Energy Differences of Alanine Dipeptide
Monte Carlo simulation" $9
*b
-70, -50 -80, 150
'YR
PI1
AAC
-3.6 -3.2
present theory 6, fl AA' -59, -89 -58, -173
-5.2 -4.7
" From ref 11. bAssumedangles in degrees; no attempt was made to find the local free energy minimum. In kcal/mol, relative to free energy of C7q,assumed to have the structure (@,+: -90°, 90'). dAngles in degrees of minimum free energy structure. eIn kcal/mol, relative to C7q minimum (4, $: -67.1°, 67.5O). TABLE V Dihedral Andes of C? Conformation"
dihedral angle
vacuum surface
solvated surface
Wl
178.3 -67.6 64.2 180.0
179.9 -67.1 67.5 -178.5
6 $ W
the right-handed a helix (aR:6 O 0 ,-60°), and the left-handed a helix (aL:60°,60'). These conformations, all of which are observed in the structures of peptides and proteins, may play a role in initiating folding in solution to form secondary structural elements. The vacuum potential reveals a region of favorable electrostatic interaction along the line connecting the two C7 minima. This region is significantly destabilized in the solvated surface by dielectric shielding. However, the modification of the potential surface that occurs in solution also involves the stabilization of pairs of solute atoms that share one or more solvent molecules (so-called solvent-separated pairs) as well as contact pair stabilization. These effects, which are clearly seen in the individual potentials of mean force (e.g., Figures 9 and lo), are due to the packing of an associated solvent like water around a give solute. They would not be present in a continuum model of the solvent. The trends predicted by the present theory for the solvation free energy differences between the C7'Q,aR,and PIIconformers can be compared with a recent computer simulation by Mezei et al." They have performed an umbrella sampled Monte Carlo calculation to obtain the difference in the free energy of aqueous solvation for these three conformations of the dipeptide. Since they used a somewhat different model for the peptide and water potentials, a quantitative comparison is not possible. However, their simulation and the present theory show similar effects of solvent stabilization of aRand PI*relative to C7'Q(see Table IV). The dihedral angles that minimize the free energy of the C7'Q structure on the potential of mean force surface are compared with those for the vacuum potential in Table V; the peptide group dihedral angles w1 and w2 are defined by CL-CL?NL-Crr and Crr-CRrNR-CR, respectively (see Figure 1). It can be seen that there are only relatively modest shifts in the dihedral angle minima. The dihedral angle $ has the largest shift; the minimum changes from 64' on the vacuum surface to 68' on the solvent-modified surface. This is in accord with the results of molecular dynamics simulations for the alanine dipeptide in aqueous solution.* Since the long-ranged Coulomb force modification is coupled to the other degrees of freedom, small shifts in bond angles and bond lengths also occur. On the solvated surface, both of the C7wells are elongated in the $ direction relative to the vacuum map. Thus, one expects that the population density is also elongated in this direction. This is confirmed by a preliminary calculation of the dynamics in the C7q well using a stochastic representation of the kinetic effects of the solvent bath.3g It is found that the degree of change in the 3. dihedral angle root mean square fluctuations on going from vacuum to solution are in good agreement with earlier explicit molecular dynamics simulation results.* Also, a full molecular dynamics simulation33has been performed in the region near (4, $ = 180°, -60'). This is a small well at high relative energy on
Pettitt et al. the vacuum surface (see Figure 15a), yet in the simulation the molecule remained in the vicinity of the starting coordinates. The solvated surface obtained by the present theory indicates that this region should be relatively stable in solution (see Figure 15b).
VII. Discussion and Conclusion The goal of this work was to develop and implement a method for dealing with the equilibrium properties of a nonrigid polar polyatomic molecule in aqueous solution. The essential idea of the approach is the introduction of a potential of mean force to take into account the modification of the vacuum potential energy surface by the surrounding solvent. Determination of such a free energy surface for the degrees of freedom that specify the conformation of a polyatomic molecule by means of computer simulations is a very time-consuming task. To avoid this, we have made use of methods based on integral equation theories for evaluation of the solution properties. The solvation behavior of each of the constituent atoms is calculated and then used to construct potentials of mean force for all distinct atom pairs in the polyatomic molecule. With neglect of solvent effect on bond length and bond angles and the use of the superposition approximation for atoms separated by three or more bonds, the complete free energy surface for the dihedral angle degrees of freedom of the polyatomic molecule is constructed. In this way a very complex problem is reduced to a stepwise, computationally feasible, buildup procedure. The solute-site, solvent-site distribution functions for the atom types that make up biomolecules have been shown to cover a wide range of behavior. They go from the nonpolar limit (e.g., aliphatic carbons and hydrogens) that is dominated by packing effects to relatively polar species (carbonyl C and 0, amide N and H) where the electrostatic interactions with the polar solvent (0and H sites in water) lead to strong orientational effects. To make clear the importance of site charges in solution we have compared a series of solute atoms with continuously varying charge for a fixed van der Waals parameter. Given the solute-solvent distribution functions, the solute site-site potentials of mean force can be obtained directly. It was found that the results can be classified in terms of a parameter R, (R, = qiqj/u,,) that is the ratio of the product of the charges to the van der Waals separation parameter. Three classes of potentials of mean force were identified: very small R, corresponds to hydrophobic pairs that have contact and solvent-separated minima; intermediate R , values, involving a polar and a nonpolar atom, apparently produce a cancellation of the hydrophobic and charge-ordering effects that results in a rather flat interaction curve; finally, the high R, case associated with two polar atoms is dominated by the charge effects and yields behavior similar to those found previously for ionic solutes. The method outlined here is best for molecules or portions of molecules with considerable solvent exposure. It is just these systems for which modifications of the effective potential due to interactions with the solvent are expected to be most important. Of particular interest are a wide range of peptides of which the simplest, N-methylalanylacetamide,has been examined in the present paper. Incorporation of the site-site potentials of mean force required for this molecule or more complex systems into a general purpose program for macromolecular calculations, such provides a basis for a variety of studies. as the CHARMM Of importance for testing the results obtained here is the comparison with simulation data for corresponding systems with explicit treatment of the aqueous solvent. Since such a molecular dynamics simulation exists for N-methylalanylacetamide,a comparison will be made of those results with the ones obtained from a stochastic dynamics simulation with the potential of mean force surface calculated here.39 Although the comparison of the full simulation with a stochastic dynamics study provides a test of the present methodology, it cannot verify the accuracy of the molecular mechanics potential (45) Brooks, B.
R.;Bruccoleri, R. E.;Olafson, B. D.; States, D. J.; Sw-
aminathan, S.; Karplus, M. J . Comput. Chem. 1983, 4, 187.
J. Phys. Chem. 1986,90,6345-6349 function itself. Here experimental studies are essential. One point of considerable importance that could be investigated by measurement of the equilibrium distribution of the conformers of N-methylalanylacetamide in aqueous solution is the origin of the hydrogen bonding interaction. Two types of models are in current use, and the integral equation method of this paper has been applied to both of them. The first, considered here, has an important contribution to the hydrogen bond energy (approximately 50%) that is nonelectrostatic (the so-called 10-12 term in potential function). The second, considered in a previous report,44dispenses with a special hydrogen bond term and obtains corresponding energies by the use of larger charges; i.e., the dominant contribution to the hydrogen bond is electrostatic. It is clear from the theory developed here, as can be seen by comparing Figure 3 of ref 44 with Figure 15 of the present paper, that the solvent effect is much more important in the second model than in the first; i.e., because the dominant effect of the aqueous solvent is to produce dielectric shielding, the electrostatic interaction is strongly decreased, while the explicit hydrogen bonding potential is unaffected. A measurement of the relative populations of hydrogen-bonded and non-hydrogen-bonded species would make it possible to determine which representation of the hydrogen bond is more accurate. The interplay of structure and dynamics plays an essential role in the function of biological m a c r o m ~ l e c u l e s . ~Much ~ ~ of the (46) Alber, T.;Gilber, William A.; Ponzi, Dagmar R.; Petsko, G.A. CZBA
Found.Symp. 1982, 93, 4. Karplus, M.; Swaminathan, S.; Ichiye, T.; van Gunsteren, W. F. Ibid., p 271 and references therein.
6345
theoretical progress in this area has come from the analysis of dynamical simulations of proteins and nucleic acids based on potential functions that are constructed from classical site-site models similar to those used here. However, most of the simulations have been restricted to treating the system in vacuo, although explicit solvent has been included in a few cases. The present approach to the potential of mean force when coupled with a reduced description of the dynamics, such as a Langevin algorithm, makes possible a fuller examination of the low-frequency dynamical behavior of polyatomic solutes in condensed-phase environments. It should be possible, for example, to perform simulations for systems like oligopeptides in an implicit solution environment that extend over nanoseconds or even microseconds. These could be used to obtain a detailed picture of conformation transitions and rate constants and would make possible a quantitative comparison with N M R relaxation data and other measurements of the intramolecular dynamics.
Acknowledgment. B.M.P. would like to thank the Robert A. Welch Foundation for partial support of this work. P.J,R. acknowledges the support of the National Institutes of Health, and M.K.acknowledges support from the National Science Foundation. P.J.R. is an Alfred P. Sloan Foundation fellow and the recipient of an N S F Presidential Young Investigator Award and an NIH Research Career Development Award from the National Cancer Institute, DHHS. Registry No. N-Methylalanylacetamide, 19701-83-8.
Microspeciation of Polypeptides
Department of Inorganic and Analytical Chemistry, L. Edtvos University, Budapest, H- 1443, Pf.123, Hungary (Received: March 13, 1986)
A new principle to estimate the concentrations of differently protonated polypeptide species (microspecies) has been elaborated. This method uses a recent type of thermodynamic parameter: group constant, which is a submolecular measure of basicity. The applicability of the principle is discussed, and the relevant relations are deduced. To study the main characteristics of this type of microspeciation, a comparative analysis has been made on glutathione, using either its group constants or its previously published microconstants. The full efficiency of the method is exemplified by the 32 amino acid containing microspecies concentrations and their distributions.
Microspecies are multidentate (bi0)ligands of distinctive number and site(s) of the bound protons. This d e f i t e stage of protonation (and the inherent fine structure) is a sine qua non of their specific biochemical interactions. Speciation is a recent tendency of environmental and clinical analytical chemistry. It provides not only the analytical concentration of a component in a system but also its distribution among the different species. Microspeciation in addition distinguishes between the different isomeric products of protonation, complexation, and other associations. Due to its principal significance, this paper focuses on the basic questions of protonation microspeciation. To date the only way for microspeciation (determination of concentrations of microscopic protonation isomers) was their calculation in the knowledge of the pH, the total concentration, and the equilibrium macro- and microconstants. Two species, the zero and the maximum proton containing ones, do not require the knowledge of microconstants. All others (where isomeric protonation occurs) do. For example, in the solution of a dibasic unsymmetrical acid (the simplest type of compound for microequilibria), four microspecies exist: H2A, HA-, AH-, and A2-. H2A is the acid, A20022-3654/86/2090-6345$01.50/0
is its anion, and HA- and AH- are nonidentical protonation isomers with bound protons at different sites. H2A can be malic acid, amino acids, etc. In this system the relations 1-7 are valid
kA = [HA-]([H+] [A2-])-' kB = [AH-]([H+][A2-])-'
(1)
kBA= [H2A]([H+][AH-])-'
(3)
kAB= [H,A]([H+] [HA-])-' + [HA-] + [AH-] + [H2S]
(4)
C, = [A2-]
= kA
+ ke
(5)
(6)
(7) where superscripts of k microconstants indicate the group (site) protonating in the process under consideration, subscripts (if at all) denote the other group@)bound to proton during the process, CAis the total concentration of the ligand, and 8, and O2 are the complex products. From eq 1-7 any microspecies concentration can be expressed, e.g., [AH-]. If CA= 1, eq 8 gives the proportion 0 1986 American Chemical Society