An Accuracy Assessment

Apr 1, 1994 - described using scaled particle theories14Js or theories of .... built and minimized with x-PL0Rs7 using OPLS parameter^.^^ ...... Ermr ...
0 downloads 0 Views 3MB Size
J. Phys. Chem. 1994,98, 4683-4694

4683

Solvation Free Energies Estimated from Macroscopic Continuum Theory: An Accuracy Assessment Thomas Simonson’J**and Axel T. Bringer’** Laboratoire de Biologie Structurale (C.N.R.S.),I.B.M.C., Universitg Louis Pasteur, I5 rue Rend Descartes, 67084 Strasbourg, France, and Howard Hughes Medical Institute and Department of Molecular Biophysics and Biochemistry, Yale University, New Haven Connecticut 06520 Received: September 22, 1993; In Final Form: February 25, 1994”

The solvation thermodynamics of a set of small organic molecules is studied using a simple continuum model. Transfer from vapor or cyclohexane to water is decomposed into three steps: first the atomic partial charges of the solute are removed, then it is transferred into aqueous solution, and then its partial charges are restored. The electrostatic steps are treated using continuum electrostatics; the hydrophobic transfer step is treated using atomic “solvation parameters” or “surface tensions”. Vapor-to-water transfer free energies of 35 neutral compounds were estimated, including analogues of 17 amino acid side chains. Variants of the model were tested, using from one to three atom types, each with its own surface tension. With a single surface tension, taken from literature data on linear alkanes, the data were reproduced with a root-mean-square (rms) error of 1.3 kcal/mol. Using up to three atom types did not improve the fit significantly. A model with up to eight atom types, but no electrostatic term, behaved no better. The model was also applied to cyclohexane-to-water transfer of the 17 amino acid side chain analogues considered above. Using a single, adjustable, atomic surface tension, the data were fit with an rms error of 1.2 kcal/mol, giving a microscopic surface tension of 19 f 6 cal/(mol A2), in agreement with experimental estimates for n-alkanes. Models with up to three atom types did not behave significantly better. Thus parametrized, the model was used to calculate the potential of mean force between two N-methylacetamide molecules in solvents of dielectric constant 80, mimicking water, and 2.2, mimicking carbon tetrachloride.

1. Introduction

Understanding the solvation thermodynamics of small organic molecules is an important step toward predicting the structure and stability of large biomolecules. Continuum models have often been proposed for this purpose. Electrostatic interactions in solution can be described by classical continuum electrostatics. Applications to small charged and polar molecules go back to Kirkwood and Westheimer and include recent calculations of ptrs and solvation free e n e r g i e ~ l - ~This approach views the solvent as a homogeneous dielectric continuum and the solute as a cavity containing a permanent charge distribution that polarizes the solvent. The free energy can be calculated by a Born charging process. The continuum treatment of solvent polarization has sometimes beencombined with a semiempirical molecular orbital model of the solute.10-12 Solvent polarization has also been described by Langevin dipoles,13which are closely related to the continuum model. Hydrophobic interactions can also be described by continuum models. Hydrophobic solutes and cavities in solution have been described using scaled particle theories14Js or theories of microscopic surface tension.1G20These relate the free energy of cavity formation, as well as solute-solvent dispersion interactions, semiempirically to solute surface or v0lume.~~-~3 For neutral, polar solutes, the continuum treatments of electrostatic interactions and hydrophobic cavity formation can be used in combination.2k27 Integral equation methods are another possible approach, suitable for both hydrophobic28 and charged or polar2’31 solutes. These approaches can be viewed as semiempirical. In contrast, various purely empirical continuum approaches have been proposed. These introduce atomic or molecular solvation

* To whom correspondence should be addressed.

t Universitd Louis Pasteur.

t Yale University. *Abstract published in Advance ACS Absrracrs, April 1, 1994.

parameters obtained by fitting partition coefficients of model compounds. One variant assumes that the total solvation free energy isa sumof atomiccontributions, proportional to the solventaccessible atomic surface area and dependent on the atom’s chemical type and e n ~ i r o n m e n t .A ~ ~second ~ ~ ~ variant is based on linear solvation energy relationships between large numbers of organic solutes, partitioning between different solvents. Transfer free energies are empirically related to molecular properties of the solute and the solvent, such as polarity, polarizability, or measures of hydrogen-bonding ability.34-36This second variant is rather powerful but cannot be easily transferred to macromolecules. Compared to the sophisticated methods of molecular dynamics and Monte Carlo ~ i m u l a t i o nall , ~these ~ ~ ~ continuum ~ approaches are fairly inexpensive and simple to use. As a result, there is hope to apply some of them systematically to large problems of biological interest. Many such applications already exist. Applications of continuum electrostatics to proteins have included numerous titration and redox calculations.3g-42 More recently, the implicit treatment of solvent in molecular dynamics simulations of proteins has been a subject of intense interest. This can bedone using continuum electrostaticsand the Poisson-Boltzmann equati~n~~~~-particularly since the correct form of the electrostatic boundary forces was recently derived.44 It can also be done using atomic solvation parameters4sv46and possibly integral equation method~.~’.~s In this article, we apply a simple, classical, continuum model to the solvation thermodynamicsof a set of small neutral molecules in an attempt to assess the accuracy and usefulness of the model. This analysis should give a firmer basis to implicit treatments of solvent in macromolecular simulations and to many empirical predictions of protein ~ t a b i l i t y . ~ ~The - ~ lmodel is essentially that of Huron and Claverie and many successive authors,2’26 which combines a treatment of solvent polarization based on continuum electrostatics with a treatment of the hydrophobic cavity free energy based on one or more microscopic surface tensions. Thus,

0022-3654/94/2098-4683$04.50/0 0 1994 American Chemical Society

4684

The Journal of Physical Chemistry, Vol. 98, No. 17, 1994

the focus of this work is not the novelty of the model but the systematic investigation of its accuracy in various situations. Consider the following three-step solvation process. The solute is initially in vacuo. The first step-following Born and many other authors-is to remove its atomic partial charges, giving a completely nonpolar analogue. The second step is to transfer this nonpolar analogue into dilute aqueous solution. The last step is to restore the atomic partial charges. The charging and uncharging steps will be modeled using continuum electrostatics and solving the Poisson-Boltzmann equation numerically. The transfer of the nonpolar analogue (step 2) involves creating a cavity in solution and incorporating the solute-solvent van der Waals potential. This step will be modeled using one or more microscopic surface tensions. For the first 10-20 n-alkanes, the vapor-to-water and organic solvent-to-water transfer free energies are approximately proportional to the alkane solute’s surface area. For the transfer of an alkane solute between cyclohexane and water, for example, assuming equal, infinitely dilute, mole fractions of solute in each phase, the proportionality constant for the transfer freeenergyis about y = 25 cal/(mol Az).Therefore, a single microscopic surface tension of this magnitude should in principle suffice for our purposes. We apply the model to a data base of 35 small neutral organic solutes, including the analogues of 17 amino acid side chains (Table 1). They range in size from methane to cyclohexane. Standard transfer free energies from vacuum and cyclohexane to water arecalculated and compared toexperiment. Wecompare several different implementations of the model, using different sets of atomic partial charges and radii and varying numbers of microscopic surface tensions. In each case, the surface tensions are either derived from semiempirical arguments or least-squares fit to our data base of transfer free energies. These calculations will allow us to determine the best implementation of the model and its optimal numerical parametrization. Different implementations may possibly be required for describing vapor-towater and cyclohexane-to-water transfer, and of course different values of the surface tension@) will be appropriate in the two cases. Thus parametrized, the model should be applicable to many other biomolecular problems. As a simple illustration, the potential of mean force between two N-methylacetamide molecules is calculated in a low dielectric solvent, mimicking cyclohexane or carbon tetrachloride, and in water. The success or failure of this kind of application has obvious implications for implicit solvent models in molecular dynamics simulations of proteins. The next section describes the model in more detail as well as the computational methods. The results are grouped in section 3 and discussed in section 4. In an appendix we discuss a recently proposed hydrophobicity scale for the amino acids, based on standard transfer free energies augmented by a Flory-Huggins mixing entropy.52 This hydrophobicity scale gives a measure of “contact” free energy per unit surface area of the solute.53 2. Materials and Methods

2.1. A Three-StepTransfer Pathway: SeparatingHydrophobic Transfer and Electrostatic Charging. Consider the three-step solvation process described in the Introduction. Starting from a solute in one solvent (e.g.,vacuum or cyclohexane), remove its partial charges, transfer it into aqueous solution, and then restore its partial charges. Assume equal, infinitely dilute, concentrations in both gas and solution. The transfer free energy AG can be written

each term corresponding to one step of the transfer process.

Simonson and Brunger

,

35

L

151 I

10-

A

A

c

d

l 05

125

150

175

200

225

250

1 A

A

a

b

275

300

area (ii2)’ 325

350

375

Figure 1. Vapor-to-water transfer data for saturated hydrocarbons as a function of accessible surface area (from ref 5 5 ) . Standard states are 1 M ideal gas and solution phases. Linear alkanes (dots) are labeled by the number of carbons. Cycliccompounds(triangles) are (a) cyclooctane, (b) cycloheptane, (c) cyclopentane, (d) cyclohexane, (e) methylcyclopentane, (f) methylcyclohexane, and (8) cis-1,2-dimethylcyclohexane. Branchedcompounds(X) are (h) isobutane, (i) neopentane,e)isopentane, (k) neohexane, (I) isohexane, (m)3methylpentane, (n) 2,4-dimethylpentane, ( 0 ) isooctane, and (p) 2,2,5-trimethylhexane. Compounds were built and minimized with x-PL0Rs7 using OPLS parameter^.^^ The electrostatic contribution

is estimated by solving the Poisson-Boltzmann equation numerically We neglect the structural relaxation that must occur when the solute partial charges are removed or restored. We must assume either that this relaxation is small or that we can apply restraining forces to maintain the initial structure in such a way that the restraint free energy is the same in all three phases. The restraining forces then do not contribute to the transfer free energies. The hydrophobic contribution AGh* is related empirically to the atomic accessible surface areas A, and one or more “atomic surface tensions” rj:

The sum is over all solute atoms. We neglect changes in the solute three-dimensional structure upon transfer between vacuum, cyclohexane, and water, so that the atomic accessible surface areas are constants. The surface tensions Tiand the constant c will be either estimated semiempirically or treated as adjustable parameters. We expect that our nonpolar analogue should behave somewhat like an alkane molecule. Thus, for cyclohexane-towater transfer, for example, a single surface tension of y, y = 25 cal/(mol A*) for all atom types should give reasonable results. 2.2. Surface Tensionsfor Hydrocarbons. How realistic is this hybrid, Poisson-Boltzmann + surface tension, continuum model? Although continuum electrostatics are very simplistic at the molecular level, Poisson-Boltzmann calculations have been shown to compare fairly well with high-level free energy simulations for a number of small polar and charged compounds.6.54 The hydrophobic, microscopic surface tension model embodied in eq 2 may be more problematic. The vapor-to-water standard transfer free energies of several saturated hydrocarbons are shown in Figure 1 as a function of accessible surface area. The standard states are 1 M concentrations of solute in the ideal vapor and solution phases, at room temperature and p r e s s ~ r e . ~These s compounds are approximately uncharged, which allows us to directly evalute the validity of eq 2 without being affected by the accuracy of the electrostatic terms. For linear and moderately branched alkanes, the free energy is indeed nearly proportional to surface area, with a proportionality constant of about 6 cal/(mol A2). The cyclic alkanes behave differently, however. Cyclopentane and cyclohexane and their derivatives fall about 1 kcal/mol below the n-alkane curve; cycloheptane and cyclooctane fall about 2.5 kcal/

The Journal of Physical Chemistry, Vol. 98, No. 17, 1994 4685

Continuum Solvation Free Energies mol below it, It appears that a single surface tension is a good approximation for moderately branched alkanes but may be a very rough one when a variety of different geometries, e.g., cyclic or highly branched, are involved. Clearly, a rigorous, general expression for the hydrophobic free energy AGh+ must contain terms that depend not only on the solute’s surface area but also on its radius (or curvature), its volume, and its detailed shape. For the small solutes considered here, area and volume are nearly proportional, so that the dependence of AGho on solute volume is implicitly contained in eq 2. The dependence of AGh* on solute radius (or curvature) is probably weak for the range of solutes considered here. Theories of cavities and interfaces predict that, for a spherical cavity of radius r, the surface tension depends on cavity size through a multiplying factor 1/(1 26/19, where 6 is a constant on the order of a molecular diameter.14Jc20 The solutes considered here range in size from methane to cyclohexane. If we estimate the solute radii r from their areas A, using 47r2 = A , we find that for these solutes r ranges from 2.5 to 3.5 A. If 6 = 4 A,21 the multiplying factor 1/( 1 28/r) only varies from 0.25 to 0.30. Therefore, the dependence of the free energy on the solute radius is weak and implicitly contained in eq 2. The more complex dependence of AGb* on details of the solute shape can only be included in the model empirically, for example, by assigning different surface tensions to different chemical atom types. This leads to different implementations of the model, with slightly different functional forms for 2.3. Variants of the Hydrophobic Transfer Model. Several functional forms for the hydrophobic transfer term AGh* were tried in this work. The simplest variant assumes a single surface tension y for all atom types. y and the constant term c (eq 2) can be least-squares fit to the data base of transfer free energies. Alternatively, y can be taken from alkane transfer data and c estimated from scaled particle theory, in which case the model has no adjustable parameters. Variants using two surface tensions were also tried. One distinguished atoms in cyclic and acyclic compounds. Another distinguished polar and nonpolar atoms. All partial charges are removed in the hydrophobic analogues, so that all atoms are alkane-like. Nevertheless, there are probably systematic errors in the electrostatic treatment, which affect different atoms in different ways. The use of a polar and a nonpolar surface tension attempts to compensate for these. Variants using three surface tensions were also tried. A first onedistinguished positive, negative, and neutral atoms. A second one distinguished carbons, oxygens, and nitrogens. Finally, a variant using eight surface tensions was tried, which simply omits the electrostatic term calculated from the Poisson-Boltzmann equation. This last variant is essentially the empirical EisenbergMcClachlan m ~ d e l . ~ ~ J ~ For each functional form of Act,*, the y, and c are either derived from semiempirical arguments or least-squares fit to the data base of transfer free energies. The net result is that we determine the optimal functional form for AGh* and the optimal numerical parameters to use. 2.4. ComputationalDetails of the SolvationCalculations. Table 1 lists the solutes used for the vapor-to-water calculations, along with the references to the experimental solvation data. Ionizable solutes were considered in their neutral form. The list includes side chain analogues of the standard amino acids, except for glycine, proline, and arginine. The glycine analogue, hydrogen, cannot be modeled confidently as a classical mechanical particle (let alone a macroscopic dielectric continuum); proline cannot be subdivided into side chain and backbone; and for arginine, atomic partial charges were not available for the neutral form. For cyclohexane-to-water transfer calculations, we considered just the 17 amino acid analogues listed in Table 3. For both data sets, experimental data were partition coefficients expressed in concentration units. Thus, the standard states can be taken as 1 M in the ideal gas phase and in the ideal solution phase. The logarithm of the partition coefficient times kTgives the free energy to transfer a single solute molecule from the gas

+

+

phase to the solution phase, assuming equal, infinitely dilute, solute concentrations in both phases.56 The conformation of each solute was fully optimized in vacuo using the program X-PLOR~’ and the “OPLS” energy parameters of Jorgensen and co-workers,s8 including explicit hydrogen atoms for all aromatic rings (Tirado-Rives, J., personal communication). Accessible surface areas were calculated using X-PLOR,with a solvent probe radius of 1.4 A, and OPLS van der Waals radii (a/2). The dielectric boundary between solute and solvent was defined as the locus of the center of a probe sphere, rolling over the envelope of the atomic Born radii. The probe sphere radius was 1.4 8, unless otherwise noted. The Born radii are taken equal to Rbrn= R* = 2-%. This definition of the dielectric boundary is commonly used and known to give reasonable results for small polar and charged molecules.6 In the OPLS force field, the radius of almost all polar hydrogen atoms is zero. The exceptions are mainly the weakly polar aromatic hydrogens. For the hydrogens with zero radius, we increased the Born radius to 1 A to improve agreement with the free energy perturbation results of Still et al.5 As specified below, a few of the calculations were done with the original, “small”, OPLS hydrogens for purposes of comparison. When several stable conformations with similar energies existed, we performed calculations for each conformer. For cyclohexane, the accessible area is an average over the boat and chair conformations. For ethanol, cis and trans conformers gave results within a few percent of each other; only the trans results are reported. 4-Methyl and 5-methylimidazolegave results within 5% of each other; only the 4-methyl results are reported. The trans form of N-methylacetamide, considered to be the more stable, was used. Poisson-Boltzmann calculations were done with the program Delphi.59 Model conditions were designed in part to give good agreement with microscopic free energy perturbation results in the l i t e r a t ~ r e .These ~ perturbation results correspond to thesame charging/uncharging processes, in vacuum and aqueous solution, as considered here. The internal solute dielectric constant was taken equal to one. The dielectric constants of cyclohexane and water were 2 and 80, respectively. Zero ionic strength was assumed, which is reasonably close (though not identical) to the experimental condition~.~W3 (The data of Radzicka et al. for example correspond in some cases to ionic strengths on the order of 0.05 M [concentration of monovalent counterions]. For six test compounds, AG:Fw calculated at this ionic strength differed from the result at zero ionic strength by less than 2%) Each atomic partial charge is distributed over the eight closest grid points in such a way as to preserve the first few moments of the charge distribution in the corresponding grid cube. The multigrid, “focusing” method provided by Delphi was used to solve the Poisson-Boltzmann equation.@ An initial cubic grid was defined, whose length was typically 10 times the maximum extent of the solute. This grid had 65 nodes per axis. The initial boundary potential was approximated by the Coulombic potential of the solute divided by the solvent dielectric constant. The Poisson equation was solved using a finite-difference algorithm.65 Then a smaller grid was defined, the boundary potential was taken from the previous step, and the Poisson equation was solved again. This was repeated four or five times, until the final grid extended 1 or 2 A beyond the edge of the solute’s Born envelope. The successive grids all have 65 nodes per axis. Thus, each successive grid is finer than the last, and the description of the potential becomes more detailed at each step; hence one speaks of a “focusing” calculation. The sensitivity of the results to model parameters was tested in detail for six of the solutes, by varying the dielectric constants, the grid sizes, the grid centering, and the solvent probe radius and by calculating with and without focusing. Close agreement with Jean-Charles et a1.6 was obtained for the six solutes when their published model conditions were used. Once the electrostatic contribution was obtained, it was subtracted from the experimental transfer free energy, leaving

Simonson and Briinger

4686 The Journal of Physical Chemistry, Vo1. 98, No. 17, 1994

to these quantities by

an “inferred” estimate of the hydrophobic term AGh,(inferred) = AGoV,,(exp)

- AGtZw

(3)

This term was least-squares fit using the surface tension model of eq 2, with c and 7)as adjustable parameters. This gives a “fitted” estimate AGhQ(fit)and a “fitted” transfer free energy AGov,(fit). 2.5. Potential of Mean Force Calculation. If the above fits are reasonably successful, we can apply the resulting model to related problems. As one such application, the potential of mean force between two N-methylacetamide molecules was calculated along a certain reaction coordinate, in solvents of dielectric constants 2.2 (mimicking CC4) and 80. The reaction coordinate was set to the linear displacement of one molecule along the CO( 1)-NH(2) direction, with the molecules coplanar (see Figure 5). Formation of the dimer in solution at a given separation can be accomplished as follows: take the dimer in solution at infinite separation, remove to vacuum, move along the reaction coordinate to the desired separation, and transfer back into the solvent. The energy of the vacuum step is calculated as a sum of Coulomb terms using X-PLOR.The free energy of the two transfer steps is calculated as above, with one slight difference: for the potential of mean force in the low dielectric solvent, we simply neglect the nonelectrostatic interactions (AGhQ). It can be shown that they nearly cancel out in the process of forming the dimer in the low dielectric solvent. Given the modest accuracy of the microscopic surface tension approach, we can then neglect them without significantly decreasing the precision of the calculation. 2.6. Comparisons with a Fully Empirical Continuum Model. In addition to the semiempirical model discussed up to now (based on the Poisson-Boltzmann equation plus one or more microscopic surface tensions), we also performed some calculations with a different, fully empirical model.36 Cyclohexane-to-water transfer free energies were calculated for a subset of our data base compounds. The fully empirical model36usescorrelations between transfer free energies and molecular properties, such as volume, hydrogen-bonding ability, polarizability, and heat of vaporization, to predict partitioning of 180solutes between water and 25 organic solvents, with mean errors on the order of 1 kcal/mol. This approach does not have an obvious extension to proteins and is considered here simply as a benchmark for the magnitude of error of empirical solvation models. This model expresses the partition coefficient of a solute between water and an organic solvent as

where V2 is the solute volume, am2and Pm2 are solute solvatochromic parameters, related to hydrogen-bonding ability, and the other parameters are characteristic of the solvent.

3. Results 3.1. Precision and Accuracy of Electrostatic Calculations. Electrostatic contributions to the vapor-to-water transfer free energies, AGZw,were calculated under various conditions for six test compounds: acetic acid, acetamide, methanol, ethanol, toluene, and phenol. We investigated in particular the effect of grid fineness and probe radius (used to define the dielectric boundary). Figure 2 shows results for methanol and acetamide, which are typical of all the solutes. Act!!& is shown as a function of grid fineness. The number of grid nodes was 65 X 65 X 65 in all calculations. Thus, the grid fineness is 64/L grid units/& where L denotes the edge length of the cubic grid. Solvent probe radii of 0,0.5, 1,1.5, and 2 A are compared, each of which leads to a different definition of the solute-solvent dielectric boundary. For selected atoms, we have also shown the difference Vwat,iVv,c,ibetween the electrostatic potential at the atomic center i in water and in vacuum. The total electrostatic free energy is related

where the sum is over all solute atoms i , of partial charge qi. As the grid length becomes smaller (increasing fineness), the solvation free energy decreases systematically in magnitude. For acetamide, going from 3 to 6 grid units/i( decreases the free energy by over 1 kcal/mol. The same qualitative effect is seen for all 35 solutes. As grid length increases (decreasing fineness), the convergence of the energy is slow and large fluctuations are seen. These are presumably due to the different ways the partial charges are split up over the grid nodes and their changing interaction with solvent as a result. The vertical dotted line to the right of each plot corresponds to a grid size that “exactly fits” the solute. The length of the grid edge in this case is exactly equal to the greatest dimension of the solute, including its Born envelope, plus the diameter of one water molecule (2.8 A). For this grid, the calculations have still not fully converged. The calculations with zero probe radius converge the fastest and the calculations with the larger probes more slowly. In the case of methanol, the slow convergence of the free energy is entirely due to the contribution of the hydrogen atom, whose potential fluctuates with changing grid fineness (Figure 2). The potentials on the oxygen and carbon converge smoothly. For acetamide, the free energy convergence is limited by the oxygen potential, with the other atomic potentials converging moresmoothly. Davis and M ~ C a m m o n have ~ ~ , ~proposed ~ a dielectric boundary smoothing technique that appears to improve the convergence of this type of calculation. A broader, Gaussian, distribution of each atomic partial charge over the nearby grid points might also improve the models6’ The choice of probe radius has a large effect on the free energy for a given grid fineness. Physically, with a smaller probe we expect a larger free energy magnitude. In practice, the opposite trend is seen more often, though not systematically. The equipotentials in the solvent region do indeed move closer to the solute as probe radius is decreased (not shown). However, the free energy magnitude, and the magnitude of the Vwat,i- Vvas,i, decrease in many cases when the probe radius is decreased. The curves corresponding todifferent probe radii cross over in several places (Figure 2). It appears that although the direct partial charge-solvent interaction will always increase with the smaller probe size, there are other effects involving the distribution of induced surface charge and its self-energy, as well as changes in the screening of partial charge-partial charge interactions. These effects depend on the details of the distribution of partial charge and induced surface charge on each particular grid and are not easy to predict. Results with and without focusing were found to be similar for a given grid length, and slight changes of the grid origin had a weak effect (results not shown). Microscopic free energy perturbation results for AG& are available for 12 of the compounds in our set, based on Monte Carlo simulations in aqueous solution.5 Table 1 compares the macroscopic and microscopic results. As noted by previous authors,5.6q8 the agreement is rather good, provided a solute dielectric constant of one is used. The rms difference between the two data sets was 1.0 kcal/mol. The largest deviations were 1.8 kcal/mol for benzene and acetic acid and 1.7 kcal/mol for acetamide. Judging from these comparisons, the incomplete convergence with respect to grid length and the uncertainty as to the best definition of the dielectric boundary led to an uncertainty of at least 1-2 kcal/mol in the calculated transfer free energies. These errors should be reduced by the empirical fit to experimental transfer free energies discussed below. Thus, any reasonable choice of parameters (grid size, probe radius, and Born radii)

The Journal of Physical Chemistry, Vol. 98, No. 17, 1994 4687

Continuum Solvation Free Energies

acetamide

methanol

6'

4

4

23.'.r.

22

'

...

5

6

I

8

11 I

3

5

6

I

\

j

..

4

I -26.

:

t

-I1 -12

Figure 2. Electrostatic results for methanol (left-hand panels) and acetamide (right-hand panels). Top panels: - A@Z (kcal/mol) vs grid fineness (grid units per A). Each panel shows results for five different probe radii: 0, 0.5, 1, 1.5, and 2 A (see legend at lower left). Lower panels: atomic contributions Vwat,,- VvacJ(kcal/(mol au)) for selected atoms (methanol 0,H; acetamide 0,N) vs grid fineness. V,,.~J- Vmais the difference between

the electrostatic potential on atom i in solution and in vacuum. The grid contains 65 plot shows the grid fineness that "exactly fits" the solute (see text). could be used as long as the same parameters are used for both the fit and the prediction of transfer free energies. The electrostatic free energies reported in the next sections (Tables 1 and 3)are all the result of four or five focusing steps, such that the initial grid is about 10 times the greatest solute dimension,while the final grid is the "exactly fitting" grid described above. The urobe radius is 1.4 A. and the Born radii are taken from the OPLS force field, with enlarged hydrogens, as discussed above.

X 65 X 65

nodes in all cases. The dashed vertical line on each

Each electrostatic potential calculation took about 30s of CPU time on a Hewlett-Packard HP730 workstation. Since focusing was used and three solvents were compared, overall each transfer free energy required . about 5-10 min of C p u time. -I

3.2. Parametrization of the Model Using Vapor to Water Transfer Data. The electrostatic free energies A@.$& were calculated for the 35 solutes listed in Table 1 and then subtracted from the experimental transfer freeenergies to give the "inferred"

4688

The Journal of Physical Chemistry, Vol. 98, No. 17, 1994

4 I

-10

-8

36

, '5 8

-6

-4

-2--s/l

, I

17,

,

I

Fignre 3. Vapor-t+water transfer data. (a) Transfer free energy of hydrophobicanalogues,AG*(inferred)vs acccssiblasurfaaareaofsolutc (dots): unrestrained least-squares linear tit (solid line); linear tit to first few n-alkanes (dotted line). (b) AG',(fit) = AG&t) + A=, estimated using the semiempirical model VIfor the hydrophobic term, Each point is labeled with the number of the wmpared to AGO,(exp). wrresponding solute in Table I . The dashed diagonal lines are 1 kcall mol above and below the main diagonal. values of the hydrophobic free energies AGh*(inferred) (eq 3). These were fit using the microscopic surface tension model (eq 2). Several functional forms were tried for AGhh using from one to eight surface tensions. Definitions and results are shown in Table 2. The simplest variant of the model uses a single surface tension y in AGh*. We can assume that y = 0.006 kcal/(mol.&'), the value appropriate for small linear alkanes (see Figure 1). For theconstant term, c, wecan makea theoreticalestimate. Consider the free energy to create a hard point particle in water. Scaled particle theory for example predicts this free energy to be about 0.2-0.3 kcal/mo1.'4.6*.69 Inour model, thearea of a point particle is the locus of the center of a probe sphere in contact with this particle, 4n X 1.4' = 24.6 A'. W i t h y = 0.006 kcal/(mol A'), we find c = 0.2-0.006 X 24.6 = 0.05 5 0 kcal/mol. Thus, in this variant of the model, we have AGh*(fit) = O.O06A, with no adjustment of parameters. This form fits the data reasonably well. For our 35 solutes, the rms deviation between the experimental and model transfer free energies is 1.3 kcal/mol. The correlation coefficient between the experimental and model transfer freeenergiesisO.93. Thecorrelationcoefficient &tween AGh*(inferred) and AGh,(fit) is only -0.28: the proportionality of AGh*(inferred)to solute area is poor. The theoretical transfer free energies estimated with this model are plotted in Figure 3 and tabulated in Table 1. Thesamevalueofy = 0.006kcal/(molA*)canalsobeobtained by least-squares fitting AGh* against the data, with the constant term c fixed at zero. The same value of c = 0 can be obtained hy fitting AGhi against the data, with the surface tension fixed at 0.006. The uncertainty in y and c estimated from the leastsquares variances is large: 1.3 for the constant term and 0.005 for the surface tension. Unrestrained least-squares fitting of y and c together, on the other hand, led to unphysical results, i.e., a negative surface tension or a large value of c, without reducing the rms error significantly.

Simonson and Briinger Variants of the model using two or three surface tensions y, in Ache were also tried. The surface tensions were treated as adjustable parameters. Onevariant containedonesurface tension for atoms in cyclic compounds and one for atoms in acyclic compounds. The surface tensions, r m s errors, and correlation coefficients resulting from a least-squares fit to the data were almost the same as before (not shown). The distinction between cyclic and acyclic compounds is apparently not justified at this empirical level. Othervariantsthatused twoor threedifferent surfacetensions. based on atom polarity or chemical type, gave only marginally better results (Table 2). All these variants lead to figures very similar to Figure 3. Finally, a variant using eight atom types, and omitting the macroscopicelectrostatic term, behaved little better (not shown). This variant is essentially the empirical Eisenberg-McClachlan model."^'' 3.3. Paramehimtionof theModelUsingCyclohexsneto Water Transfer Data. The electrostatic free energies A@& were calculated for the 17 amino acid analogues listed in Table 3 and then subtracted from the experimental transfer free energies to give the 'inferred"va1uesof the hydrophohicfreeenergies. These were fit using several variants of the microscopic surface tension model of eq 2. Variants and result are summarized in Table 4. The first variant of the model used a single surface tension y and treated y and c a s adjustable parameters. An unrestrained least-squares fit to the data gave AGh*(fit) = 4 . 5 6 0.019A. Therms deviation between AGo,(exp) and AGo,(fit) was 1.2 kcal/mol. Thecorrelationcoefficient between AGo,(exp) and AGo,(fit) was 0.94. The correlation coefficient between AG*(inferred) and AGh4(fit) was 0.59: the proportionality of AGh*(inferred) to solute area is better than in the vacuum case. The surface tension of 19 cal/(mol A2) is not very different from previous estimates for hydrocarbons (based on thesame standard state), and the constant term is reasonably small. Table 3 and Figure4givethetransfer freeenergiesobtained with thisvariant. Avariantofthemodel with twosurfacetensions,distinguishing polar and nonpolar atoms, gave nearly identical results (Table 4). FM purposesofcomparison, we also repeated thesecalculations using different sets of charges and Born radii for the electrostatic term. AG& was calculated with the hydrogen Born radii deriveddirectly from theOPLSforcefield @.e., without increasing the polar hydrogen radiifromO to 1 A). Thisgavesimilarquality fits, with a slightly smaller constant term c = -1.1 kcal/mol. A@& was also calculated using the Charmm22 charges and Born radii (Karplus, M., personal communication). These gave significantly smaller electrostatic free energies and a slightly poorer fit to the overall free energies. No attempt was made to tune thechargesor Born radii to improve the fit. Theelectrostatic free energies are reported in Table 3. 3.4. Application to N-Methylacetamide Dimerization. The macroscopic model thus parametrized can be applied, at least in principle, to many biomolecular problems. As a simple illustration, we calculate the potential of mean force between two N-methylacetamide molecules, in a solvent with a dielectric constant of 2.2, mimicking carbon tetrachloride, and in water. The success or failure of this kind of application has obvious implications for implicit solvent models in molecular dynamics simulations of proteins. The molecules were in the trans conformation, coplanar, and initially arranged so that the intermolecular carbonylamide hydrogen bond was linear. The reaction coordinate was the displacement along the direction of this hydrogen bond (Figure Sa). The electrostaticcomponent of the freeenergy was calculated as above, starting from a large grid and using six focusing steps. The hydrophobic term, in the case of water, was modeled using a surface tension of 19 cal/(molA'), based on the cyclohexaneto-water fits (Table 4). In the case of the low dielectric solvent,

+

Continuum Solvation Free Energies

The Journal of Physical Chemistry, Vol. 98, No. 17, 1994 4689

TABLE 1: Vapor to Water Solvation Free Energy Calculations solute

area' volumeb AGo,(exp)C AGo,(fit)d FEP 146 39 2.WJ 0.87 0 2. propane 223 81 1.98ia 1.33 0 2.121s 3. n-butane 100 250 1S O 0 2.30,a 4. isobutane 250 100 1S O 0 5. acetamide -9.7W*h -1 1.23 59 216 -12.5 -10.8 57 -6.7 3fash 6. acetic acid -7.28 212 -8.6 -6.8 7. methanethiol -1.24, -2.99 192 55 -4.1 8. propionamide -9.401.h 247 79 -11.41 -12.9 9. propionic acid -6.471s -7.17 245 79 -8.6 10. 4-methylimidazole 259 -10.29h 79 -8.13 -9.7 11. 3-methylindole -5.90, 122 -4.19 331 -6.2 12.p-cresol -6.13fa 106 -6.01 303 -7.9 13. ethylmethyl sulfide -1.491 90 -1.86 246 -3.3 14. methanol -5.1W 174 40 -6.18 -7.2 -7.1 15. ethanol -4.9YJ 206 sa -5.63 -6.9 -7.4 16. toluene 297 106 -0.76f -0.05 -1.8 -0.7 272 -4.38f -1.77 98 17. butylamine -3.4 18. ethanethiol 220 -1.30, 74 -3.04 -4.4 19. dimethylsulfide -1.54f -2.17 73 222 -3.5 20. trans-N-methylacetamide -10.07h 254 -8.08 76 -9.6 -10.1 21. cis-N-methylacetamide 250 -6.48 76 -8.0 -7.4 22. acetone 74 233 -3.80, -2.62 -4.0 -4.5 272 23. butanoic acid 92 -6.3Y -7.51 -9.2 75 -4.84fJ 24. 1-propanol 236 -7.0 -5.62 71 -4.7Ya 25.2-propanol 235 -5.20 -6.6 -7.1 94 -4.521 26.2-methyl-2-propanol -4.28 260 -5.8 80 -3.31fa 27. methyl acetate 246 -2.8 -1.35 -3.1 280 -3.OVa 28. ethyl acetate 103 -3.7 -2.03 178 -4.57f 44 29. methylamine -3.92 -5.0 209 -4.56fa 30. ethylamine 66 -4.7 -3.46 3 1. propylamine 24 1 a2 -4.4Y4 -3.27 -4.7 -0.88fJ 275 32. benzene -2.07 89 -3.7 -1.9 33. phenol 253 -6.Ya 89 -6.84 -8.4 -8.0 182 1.09 34. ethane 1.8YJ 65 0 1.56 35. cyclohexane 260 1.23fa 108 0 -4.81 -8.54h 36. dimethylacetamide 272 93 -6.4 -7.5 In A2.b In cm3/mol. kcal/mol. Using 0.006A for the hydrophobic term. e Free energy perturbation results.3/Reference 61. g Reference 62. Reference 63. Standard states are 1 M in ideal solution and in ideal gas phase. 1. methane

TABLE 2 Results of Various Model Fits for Vapor to Water Transfer _ _ _ _ _ ~ ~ no. of adjustable model no. fitting function' rmsdb P r i , d parameters 0 vle 0.006A 1.35 0.93 -0.28 v2f 0.004Ap+ 0.007AnP 1.34 0.93 2 ~ 3 8 -0.004A+ + 0.018A- + 1.21 0.94 3 0.005An, ~4~ 0.005Ac + 0.013Ao 1.35 0.93 3 0.005AN * Areas in A*, energies in kcal/mol. A = total acctssible surface area. b Rms deviation between AGo,(exp) and AGo,(fit). e Correlation coefficient between AGov,(exp) and AGo,,(fit). Correlation coefficient between AGh,(fit) and AGh*(inferred). Surfacetension was fixed at alkanevalue. f A,and A,, = polar and nonpolar surfaceareas. Nonpolar atoms were defined as those with partial charges between -0.2 and +0.2 au in the empirical force field. Using atoms with partial charges between -0.1 and +0.1 au gave very similar results. g A- and A+ = surface areas corresponding to atoms with positive and negative partial charges. Ac, AN,and A 0 are surfaces areas corresponding to C, N, and 0. ~~

~

the hydrophobic term was neglected, as discussed in the Methods section. The zero of free energy was chosen to be at infinite separation in vacuum. The electrostatic contributions are averages over five grid lengths and two positions of the grid origin; the scatter of these 10 results was used to estimate an error bar a t each point along the potential of mean force. Results were only weakly sensitive to probe radius. In order to extrapolate the results to infinite separation, we could in principle use the monomer solvation free energies calculated in the two previous sections. However, the grid used in the present calculation is by necessity much larger (less fine) than above. This introduces a systematic error and makes it difficult to compare the monomer and dimer calculations. Indeed,

the electrostatic solvation free energy of an acetamide monomer in water was estimated earlier to be -12.5 kcal/mol; the corresponding quantity for the dimer at 9-A separation is found here to be -21.2 kcal/mol. We assume that the systematic error introduced by the larger grid size is approximately independent of the separation between the molecules. Therefore, we extrapolated the dimer results to infinite separation by subtracting out the dipoledipole interaction energy between the monomers at 9-A separation, which is -0.2 kcal/mol in the low dielectric medium and approximately zero in water. The potentials of mean force are shown in Figure 5b. The free energy curves are somewhat irregular, with error bars of up to 1 kcal/mol a t large separations in water. These appear to be due to lack of convergence of the electrostatic term with respect to grid length, as noted in previous sections (Figure 2). This could lead to important artifacts in molecular dynamics simulations that use a continuum description of the solvent. The hydrogenbonded conformation is found to be stable in the low dielectric solvent, with a free energy of -3.4 kcal/mol compared to that of the separated dimers. The transfer of the hydrogen-bonded dimer from the low dielectric to water costs about 0.7 kcal/mol. For this transfer, the gain in electrostatic free energy is -6.5 kcal/ mol, while the loss in hydrophobic free energy is 7.2 kcal/mol. In water, the hydrogen-bonded conformation is 0.6 kcal/mol less stable than the separated dimers. The activation barrier to separate the monomers in water is about 3 kcal/mol. The hydrophobic term AGh* at infinite separation in water is 9.1 kcal/ mol. These results can be compared with high-level free energy simulations on a similar system, the formamide dimer."J Results for this system taken from ref 70 are reported in Figure 5b. These simulations gave -8.4 kcal/mol for forming the hydrogen bond

4690

Simonson and Briinger

The Journal of Physical Chemistry, Vol. 98, No. 17, 1994

TABLE 3: Cyclohexane to Water Solvation Free Energy Calculations solute aa area0 volumeb AGo,(exp). AG',(fit) AC& A@& 0 0 39 2.00 2.19 I . methane Ala I46 81 4.04 3.67 0 0 2. propane Val 223 4.17 0 0 LOO 4.92 3. n-butane Ile 250 0 0 LOO 4.92 4.17 4. isobutane Leu 250 -5.81 -9.3 -5.4 Asn 216 59 4.64 5. acetamide 4.49 -7.9 4.6 6. acetic acid ASP 212 57 4.46 -3.9 -1.1 55 1.28 4.85 7. methane thiol CYS 192 -9.4 -6.2 79 -5.54 -5.34 8. propionamide Gln 247 9. propionic acid GI" 245 79 -3.07 4.60 -8.7 4.6 -2.34 4.7 -8.0 IO. 4-methylimidazole His 259 79 4.55 1.15 -4.5 4.2 I I . 3-methylindole TIP 331 I22 2.33 4.4 -4.9 I06 4.14 -1.20 I2.p-cresol TY~ 303 -2.0 -0.5 90 2.35 2.07 13. ethylmethyl sulfide Met 246 40 -3.40 -2.98 -5.1 -5.0 14. methanol S.X I74 -5.6 -5.0 58 -2.57 -2.23 15. ethanol Thr 206 3.90 -1.2 -1.4 Phe 297 I06 2.98 16. toluene 2.40 -2.2 -2.0 98 4.40 17. butylamine LYS 272 'In A'. In cm'/mol.