Langevin Dipoles Model for ab Initio Calculations of Chemical

LD models have been established.26,27 All these studies have shown that .... Vector µb0 with the magnitude of .... where R is the radius of the Langev...
0 downloads 0 Views 273KB Size
J. Phys. Chem. B 1997, 101, 5583-5595

5583

Langevin Dipoles Model for ab Initio Calculations of Chemical Processes in Solution: Parametrization and Application to Hydration Free Energies of Neutral and Ionic Solutes and Conformational Analysis in Aqueous Solution Jan Floria´ n and Arieh Warshel* Department of Chemistry, UniVersity of Southern California, Los Angeles, California 90089-1062 ReceiVed: February 10, 1997; In Final Form: May 2, 1997X

A new parametrization of the Langevin dipole (LD) model is developed for ab initio calculations of chemical processes in aqueous solution. This parametrization is implemented in both the iterative (ILD) and noniterative (NLD) versions of the LD model. The training set for the new parametrization encompassed solvation free energies of 44 neutral and 39 ionic solutes that contained the C, O, N, P, S, F, and Cl atoms. The performance of the model is assessed by examining its ability to represent the overall training set, pKa differences of structurally related compounds, and conformation-related changes in the solvation energy of 1,2-ethanediol (glycol). The effects of solute polarization and electron correlation are also discussed. The overall performance of the model is found to be comparable or slightly better than the PCM continuum model of Tomasi and co-workers. However, the simplified explicit representation of solvent molecules of the LD model may allow one to gain a somewhat clearer insight into the molecular origin of different solvent effects than that obtained by continuum models. The present version of the model can be used as a stand alone program with the standard output from the Gaussian or related programs.

1. Introduction Quantum mechanical studies of chemical processes in solutions and proteins should take into account the effect of the environment around the reacting fragments.1 A number of computational approaches were developed to accomplish this task, ranging from continuum to all-atom models (for review see, for example, ref 2). One of these methods is the Langevin dipoles (LD) model,3-6 which was introduced soon after the emergence of the early combined quantum mechanical/ continuum dielectric approaches.7-10 In fact, the LD model represents probably the first attempt to obtain quantitative estimates of solvation energies of molecules in solutions and enzymes. The term “quantitative” needs to be stressed here, because continuum methods available at that time involved the use of a spherical solute-solvent boundary. As the uncertain radius of this sphere was explicitly included in solvation free energy calculations, the predictive capabilities of early continuum methods were quite limited. In contrast, the LD model used transferable atomic parameters (van der Waals radii) calibrated using observed solvation energies. This approach has provided an effective way of treating solutes of arbitrary shape without assuming any idealized cavity boundary. The same type of atom-parametrized boundary was later applied in new generations of continuum dielectric models.11-17 In addition, the seemingly simple idea of using van der Waals radii as adjustable parameters has been extensively used in subsequent all-atom solvent models (for review see ref 18) and represents one of the main reasons for their reliability. The development of the LD model was motivated by the realization that dipolar models can capture the main physics of polar solvents and that reproducing the average polarization of the solvent should suffice for a reasonable evaluation of solvation effects. This assumption has been later confirmed by the success of modern continuum models. The LD model and its protein dipoles Langevin dipoles (PDLD) version has been applied extensively to calculations of chemical processes X

Abstract published in AdVance ACS Abstracts, June 15, 1997.

S1089-5647(97)00507-5 CCC: $14.00

in solution (see for example refs 19-24). The close relationship between the LD model and more rigorous microscopic models has been demonstrated.6,25 In addition, the formal relations between the continuum models, explicit dipoles models, and LD models have been established.26,27 All these studies have shown that dipolar models provide very effective ways for evaluating solvation energies. In addition, they showed that many different models can perform reasonably well and that what really counts is the computational efficiency and the consistency of their parametrization. Naturally, a general reliability regardless of the exact details of the model is characteristic also of continuum models provided that they use atom-based parametrization. Apparently this was not understood by some scientists who criticized the minor details of the LD implementation while having no difficulties in accepting the oversimplifications of continuum models.28,29 The explicit dipoles present in the LD model allow one to form a direct bridge to molecular solvent models and to address problems that are difficult to formulate in a unique way in the framework of continuum treatments (e.g. the treatment of induced dipoles in excited state calculations30). As far as quantum chemical calculations are concerned, the LD model and its protein dipoles Langevin dipoles (PDLD) version can be considered as the earliest hybrid quantum mechanical/molecular mechanical (QM/MM) solvation models.3 This early QM/LD model has considered consistently the coupling between the solute semiempirical QCFF/ALL or MINDO/2 Hamiltonian and the solvent. Subsequent study has parametrized the LD model with the semiempirical MNDO approach.30 Unfortunately, current semiempirical approaches do not always provide correct dipole moments. Consequently, it is difficult to obtain consistent solvation energies from semiempirical quantum mechanical models without using unrealistic van der Waals radii. Therefore we continued to use empirically adjusted charges in LD and PDLD calculations,31 and this is in fact the approach used in refining the LD parameters in the POLARIS program20 and in related parametrizations of continuum models (e.g. ref 32). However, recent © 1997 American Chemical Society

5584 J. Phys. Chem. B, Vol. 101, No. 28, 1997 progress in methodology and computer power made it feasible to obtain reliable charges from ab initio calculations. Such atomic charges may enable one to use more physically justified van der Waals radii in QM/MM models and to assess accuracy limits of simplified solvent models. An attempt to implement the LD model in ab initio calculations was reported recently by Malcolm and McDouall.33 This work used Mulliken atomic charges and solute cavities represented by Bondi’s atomic radii34 plus a value of 0.75 and 0.54 Å for neutral and ionic solutes, respectively. Increased computational demands related with the use of ab initio technique was compensated by neglecting gridaveraging procedures and interactions among solvent dipoles. Despite the resulting grid-related uncertainty, their calculations correctly reproduced solvent-induced change of the reaction profile for the CH3Cl + Cl- SN2 reaction. Nevertheless, for more quantitative studies of biochemical processes, a new parametrization of the LD solvent model for solutes described by ab initio quantum mechanical methods is needed. In this paper we develop and parametrize an LD model for ab initio calculations of solvation free energies and thoroughly analyze the performance of this model. Both the noniterative and iterative versions of the LD model are considered. For comparison purposes, solvation free energies are evaluated by using the polarized continuum model (PCM) of Miertus, Scrocco, and Tomasi11,12 implemented in the Gaussian 94 program.35 For the LD calculations we use atomic charges fitted to the ab initio molecular electrostatic potential (MEP) of the gas phase solute. Because the charges are not considered as adjustable parameters, the correlation between them and the van der Waals parameters is largely removed, which enables a consistent evaluation of solvation properties for different solute structures. Since we are primarily concerned with establishing a simple way of using the LD model with standard ab initio programs, we evaluate solute polarization by using ab initio MEP charges of hydrated solutes that were obtained during the foregoing PCM calculations. Such a use of the PCM model is reasonable because the increase in solvation free energy due to the changes in solute charge distribution upon solvation represents in many cases only a second-order effect.36 This approach merely reflects a pragmatic decision based on the interest in having a convenient and easily portable model. The paper is arranged into three parts. First, we describe main features of the new version of the LD model and its parametrization for aqueous solutions. Next, we present solvation free energies of a representative set of 44 neutral and 39 ionic molecules calculated by the LD and PCM methods. In addition, the corresponding experimental solvation energies of ionic solutes are carefully reevaluated. Finally, the conformational dependence of the solvation free energy of 1,2-ethanediol (glycol) is examined. 2. Ab Initio Methods Solvation free energies were calculated by using the Langevin dipoles (LD) method (see below), and the solute charge distribution was approximated by the atom-centered point charges. Although the integration of the HF wave function is a more accurate and straightforward approach for generating the electric field, which determines the orientation of Langevin dipoles, we abandoned this approach after an initial testing period. During this period it was found that the changes in the solvation energy due to the use of wave-function-generated fields are negligible compared to the differences between the calculated and experimental solvation energies. In addition, the more accurate solute charge distribution did not always result in improved agreement with experiment. Obviously, the errors

Floria´n and Warshel in solvation energy associated with replacing solvent molecules by point dipoles are comparable or larger than errors originating from the multipole expansion of the solute charge distribution. Thus, no major improvements are expected from having an exact electric field at the positions of these dipoles. In addition, the use of the exact solute charge distribution results in a significant increase of the CPU time. Finally, the use of atomic charges simplifies the interpretation of the calculated solvation energies in terms of group contributions and also the separation of the ab initio and solvation calculations without neglecting the solute polarization term (see section 3.6). The later feature is especially useful for the developement of a new parametrization of our solvation model. Atomic charges were evaluated by fitting to the molecular electrostatic potential (MEP) obtained from the HF/6-31G* and MP2/6-31+G** electron densities and HF/6-31G* molecular geometries using the Gaussian 94 program.35 For the MEP calculations, the default Merz-Kollman atomic radii and the grid construction procedure were used. Gas phase proton affinities and charge distributions needed for the evaluation of electron correlation effects were calculated at the MP2/6-31+G**//HF/6-31G* level. Zero-point energies (ZPE) were determined from HF/6-31G* harmonic vibrational frequencies scaled by 0.9. For comparison and as a source of atomic charges of hydrated solutes we used the polarized continuum method (PCM)11 implemented in the Gaussian 94 program.35 HF/6-31G* wave functions and default Pauling’s (Merz-Kollman’s) van der Waals radii scaled by 1.2 were used for the PCM calculations. 3. Description and Parametrization of the Langevin Dipole Model The Langevin dipole (LD) solvation model is based on the evaluation of interactions between the electrostatic field of the solute and point dipoles placed on a grid surrounding the solute. This grid of dipoles is surrounded by a dielectric continuum. The solute electrostatic field is generated from the point charges placed at the atomic nuclei. The total electrostatic field at a given dipole and the magnitude of this dipole are related via the Langevin function (see below) that exhibits saturation for large fields. The resulting electrostatic part of the solvation energy (∆GES) is augmented by the solvation contribution from the outer continuum dielectric (∆Gbulk), van der Waals (∆GvdW), hydrophobic (∆Gphob) and solute-polarization (∆Grelax) terms to give the solvation free energy, ∆Gsolv:

∆Gsolv ) ∆GES + ∆Gbulk + ∆GvdW + ∆Gphob + ∆Grelax (1) In the current implementation, atomic charges are determined by fitting to the ab initio electrostatic potential. Because the atomic charges are not empirically adjustable (in contrast to the previous versions of the model), also other parameters of the LD solvation model had to be reoptimized. These changes include introduction of the field-dependent grid points and surface-constrained dipoles on outer solvent surface, the new relation between magnitudes of dipoles placed at the inner and outer grid points, reparametrized vdW and hydrophobic terms, increased extent of dipole-dipole interactions, and a newly added solute-polarization term. The details of this new parametrization for aqueous solution are presented below. 3.1. Grid Construction. The LD model represents solvent molecules by a grid of point dipoles. Our previous experience has shown that the exact nature of this lattice does not affect the calculated solvation energy in a major way. Thus, the

Langevin Dipoles Model

J. Phys. Chem. B, Vol. 101, No. 28, 1997 5585

computational efficiency rather than the reproduction of the actual solvent structure is the decisive factor in choosing the best grid. In the present model, point dipoles are centered at the 3 Å simple cubic grid that is transformed into the denser 1 Å cubic grid near the van der Waals (vdW) surface of the solute (defined by the vdW radii of Table 1). The boundary between the inner (1 Å) and outer (3 Å) grids is formed by points that lie at the distance of 2 Å from the vdW surface of the solute. The outer grid points are constructed up to the distance of 20 Å from the vdW surface of the solute, but those outer grid points that are located in the regions of very small field are subsequently discarded. This selection is done by using the screened electric field,

B ξ Dj )

∑1

Qib r ij

d(rij) rij3

, i ∈ solute, j ∈ grid points

(2)

Figure 1. Plot of the Langevin function (eq 8).

(3)

that determines the magnitude of the Langevin dipoles is given by eqs 2 and 3. The extent of dipole polarization is given by the Langevin function, L(x), which exhibits linear behavior for smaller fields and reaches saturation for larger fields (Figure 1):

where

d(rij) )

x2 + rij 1.7

bij| ) |(r bj - b ri)| denote atomic and the symbols Qi and rij ) |r charges and the distance between the ith atom and the jth grid point, respectively. The screened electrostatic field defined by eqs 2 and 3 was also used in the noniterative LD calculations (see below). The criterion |ξ BDj | < 0.0015 e/Å2 was used for the selection procedure described above. In addition, Langevin dipoles at grid points with magnitude of electrostatic field in the 0.0015 < |ξ BDj | < 0.0021 e/Å2 range were constrained at the noniterative values. These dipoles were kept constant during the iterative Langevin dipole calculation. Using this fielddependent selection of grid points, there are typically about 1400 grid points for ions treated in this paper, whereas for neutral solutes only several hundred grid points are used. This procedure improves computational efficiency of the LD model, since Langevin dipoles are placed only in locations where they can really contribute to the electrostatic part of the solvation free energy. Since the values of ∆Gsolv depend upon the position of the center of the grid, it is essential to carry out LD calculations for several different grids to obtain a stable mean value of ∆Gsolv. In this study, ∆Gsolv was averaged over 22 grids. This procedure decreased the grid-related uncertainty of our results below 1.5%. 3.2. Electrostatic Contribution to Solvation Free Energies. For a given grid, the magnitudes and orientations of Langevin dipoles are calculated in two ways. In the iteratiVe calculation, further denoted as ILD, Langevin dipoles are allowed to interact with each other. The jth dipole, b µj, becomes polarized (i.e. changes its size and orientation) along the vector of the total electrostatic field, B ξj, evaluated as a sum of the unscreened contributions from solute charges Qi and from Langevin dipoles determined in the preceding, (n-1)th, iteration:

ξ 0j + B ξj ) B

∑ k*j

[rjk2b µ (n-1) k

rjk5

∑i

L(x) ) coth(x) -

x)

1 x

BLj | µ0|ξ kT

(µ0,in)2 (µ0,out)2 ) ϑin ϑout

In the noniteratiVe Langevin dipole calculation (NLD), the field

(8)

(9)

where ϑin ) 1 Å3 and ϑout ) 27 Å3 denote volumes of the inner and outer grid cells, respectively. This relationship can be derived from the requirement that the macroscopic dielectric constant () of the inner and outer grid dipoles is the same, assuming the Claussius-Mosotti relationship between  and µ. The electrostatic part of the solvation energy, ∆GES, is evaluated as energy of Langevin dipoles b µj in the electrostatic field generated by the solute atoms: N

B0j ) ∑j (µbILD j ‚ξ

(10)

N

‚ξ BDj ) ∑j (µbNLD j

(5)

(7)

ξj (eq 4) in the ILD approach and and the field B ξ Lj is taken as B D as B ξ j (eq 2) in the NLD approach. Vector b µ0 with the magnitude of 0.26 and 0.05 e Å for outer and inner grid dipoles, respectively, points in the direction of B ξ Lj . Furthermore, symbols k and T in eq 8 denote the Boltzman constant and the thermodynamic temperature, respectively. The magnitudes of µ0 for inner and outer grid centered dipoles are not chosen arbitrarily, but they are related by the formula

NLD NLD ∆GES ) 332kES

Qib r ij rij3

where

(4)

where B ξ0 is the field of the solute in vacuum,

B ξ 0j )

(6)

ILD ILD ∆GES ) 332kES

(b r jk‚µ b(n-1) )b r jk] k

-

µ 0 L(x) b µj ) b

(11)

The parameter kES implicitly includes energy needed to polarize the solvent molecules. It should be 1/2 within the linear response approximation,6,37,38 but recent simulations utilizing all-atom solvent models indicated that its magnitude can vary for different solutes.39 In our LD implementation, kES is

5586 J. Phys. Chem. B, Vol. 101, No. 28, 1997

Floria´n and Warshel

assumed to be solute-independent and amounts to 0.52 and 0.47 for ILD and NLD models, respectively. 3.3. Calculation of Dipole-Dipole Interactions. As mentioned above the vector of the local electrostatic field, B ξj, at the jth Langevin dipole is calculated iteratively in the ILD model. The iterative cycle is terminated when oscillations in ∆GES fall below 0.1% for 10 successive steps. Usually, about 60 iterations are needed before this criterion is met. The converged ∆GES is independent of starting configuration of the Langevin dipoles. However, for faster convergence, this configuration is generated so that part of the outer grid dipoles are oriented randomly, whereas all inner grid dipoles and remaining outer grid dipoles are constructed using a screening function as in the NLD method. Subsequently, the jth Langevin dipole at the nth iteration is calculated as

b µ (n) b(n-1) + (1 - ω)µ b0L(x(n-1)) j ) ωµ j

(12)

TABLE 1: vdW Radii of the LD Model atom typea

r* [Å]

atom typea

r* [Å]

C(sp3) C(sp2) C(sp) O(sp3) O(sp2) O(inorg)b N

2.65 3.0 3.25 2.2 2.65 2.8 2.65

S P F Cl H H(inorg)c

3.2 3.1 2.45 3.15 d 2.0

a C(sp) ) C(sp2), O(sp) ) O(sp2). b Inorganic vdW radius is used for oxygen atoms that are not directly bonded to carbon. Exceptions include oxygen atoms in nitro groups attached to carbon, H2O, and their ions. c H covalently bound to O(inorg). d vdW radius of hydrogen is assumed to be linearly dependent on the r* of the closest heavy atom X: r*(H) ) kHr*(X). The constant kH equals 0.88 and 0.78 for atom X from the first and second row of the periodic table, respectively.

according to the formula

{

1,

where the dumping factor ω is gradually decreased from 0.9 to 0.5 as n increases from 1 to 80. The value of x(n-1) is calculated according to eq 8, using the local electrostatic field, B ξj, evaluated from the values of b µ(n-1) by using eq 4. In the latter formula, a cutoff distance of 20 Å is used between the first and ninth, and in the 45th iteration. In other iterative steps, the cutoff distance is decreased to 6 Å and augmented by the local reaction field approximation.40 In addition, interactions between dipoles separated by less than 2.5 Å are excluded. 3.4. Bulk Correction. The contributions to the solvation free energy originating from the solvent region that is outside the solvent region filled with Langevin dipoles are evaluated by a continuum approximation. This is done by using Born’s and Onsager’s formulas for charges and dipoles, respectively, and expressing the relevant bulk free energies of eq 1 as

f(Vj) ) 1 χ,

for |Vj| e Vmin

|Vj|-Vmin (1 - χ), for Vmax > |Vj| > Vmin Vmax - Vmin for |Vj| g Vmax

}

(15)

The empirical parameters kphob ) 0.012 kcal/mol, Vmin ) 0.002 e/Å, Vmax ) 0.0015 e/Å, and χ ) 0.08 were adjusted to obtain correct solute-size dependence of solvation free energies of hydrocarbons and alkyl alcohols. The van der Waals energy, ∆GvdW, was calculated as the sum of r-9 repulsion and London dispersion terms,

∆GvdW ) kvdW

[ ( ) ( )] r*i 9

CiNj 2 ∑ r i,j

r*i 6

-3

, rij i ∈ solute atoms, j ∈ grid points (16) ij

∆Gbulk(ionic solute) ) -166(1 - 1/)Q2/R

(13a)

∆Gbulk(neutral solute) ) -166[(2 - 2))/(2 + 1)]µ2/R3 (13b) where R is the radius of the Langevin dipoles region, while Q and µ are the charge and the dipole of ionic and neutral solutes, respectively, and  is the dielectric constant of the solvent ( ) 80 for water). The cavity radius was calculated from the estimated solute radius plus 2 Å for the inner grid layer plus a term determined from the number and spacing of outer grid dipoles. The resulting bulk correction, ∆Gbulk, is typically about -10 kcal/mol for ionic solutes and about -0.1 kcal/mol for polar neutral solutes. 3.5. Hydrophobic and van der Waals Interactions. To reproduce observed solvation free energies of nonpolar solutes, the electrostatic part of the solvation energy must be augmented by additional terms approximating hydrophobic and van der Waals (vdW) contributions. The hydrophobic free energy, ∆Gphob, is assumed to be related to the magnitude of the nonpolar molecular surface, which is proportional to the number of Langevin dipoles (grid points) that lie within 1.5 Å from the vdW surface of the solute:

∑j f(Vj),

∆Gphob ) kphob

j ∈ surface grid points

(14)

Here, the magnitude of electrostatic potential, Vi, at the ith grid point is used as the criterion for the local surface polarity,

In this formula, ri* and Ci denote atomic vdW radii and London coefficients, respectively, and rij is the distance between the ith atom and the jth grid point. The normalization factor Nj, which equals 1 and 1/27 for outer and inner grid points, respectively, was introduced in eq 16 because the densities of Langevin dipoles placed on inner and outer grid points, respectively, differ by a factor of 27 (see above). The parameter kvdW is equal to 0.84 kcal/mol. The values of atomic radii (r*) are given in Table 1. Note that these parameters represent the distance from the given solute to the center of the nearest solvent molecule, and thus they are not directly comparable to vdW atomic radii obtained from molecular and ionic crystals (e.g. Bondi’s or Pauling’s sets), molecular mechanic force fields, or those obtained by fitting solvation energies by using Born’s formula. However, the overall trend in the magnitudes of LD vdW radii is similar to those obtained by using other methods. The London coefficients (the C’s) were chosen to be the same for the atoms from the same row of the periodic table. Namely, C ) 0.7, 1.5, and 2.0 for hydrogen and atoms located in the first and second row of the periodic table of elements, respectively. The increase in the London coefficients reflects the fact that atomic polarizability increases with the atom size. 3.6. Solute Polarization. Because ∆GES is evaluated from the gas phase charges, some correction is required to account for the increase in the solvation free energy due to the polarization of the solute electron density as it interacts with Langevin dipoles. This correction, which is denoted here as ∆Grelax, is evaluated in the framework of the linear response

Langevin Dipoles Model

J. Phys. Chem. B, Vol. 101, No. 28, 1997 5587

approximation (LRA).6 Within this approximation, the energy invested in polarizing the solute is negative half of the energy gained by the interaction between the polarizing potential and the induced changes in solute atomic charges (∆qi). This leads in the case of combined QM/MM treatments to the expression41

∆GrelaxLRA )

∑Vi∆qi - 0.5∑Vi∆qi ) 0.5∑Vi∆qi

(17)

where the values of solvent-induced electrostatic potentials at solute atoms, Vi, are evaluated as

∑j

Vi ) -332

(µ b j‚ b r ij) drij3

(18)

and the screening factor d is given by eq 3 for the NLD method, and d ) 1 for the ILD method. Here we decided to replace the factor of 1/2 in eq 17 by an adjustable constant krelax,

∑Vi∆qi

∆Grelax ) krelax

(19)

To determine the constant krelax, we calculated ∆Grelax for H2O, OH-, and CH3O- molecules, which represent systems with small, medium, and large polarization contributions, and compared the resulting ∆Grelax with the values obtained previously by a self-consistent quantum mechanical approach.36 Accordingly, we adjusted krelax to the values 0.8 and 0.6 for the NLD and ILD methods, respectively. Note that the value 0.6 is close to 0.5 predicted by the LRA. The values of ∆qi were obtained by evaluating the change in potential-derived atomic charges when going from the gas phase solute to the solute solvated using the polarized continuum method (PCM) of Tomasi. Because the values of ∆qi obtained in this way are not dependent on the LD vdW radii, we found this approach to be most convenient for the purpose of adjusting parameters of the LD model. In the future we intend to evaluate ∆Grelax of larger systems empirically, whereas for smaller solutes we will incorporate the potential from Langevin dipoles into the solute quantum mechanical Hamiltonian. The strategy for the implementation of the QM/LD coupling into the semiempirical MNDO Hamiltonian has been described previously.30 This approach assumes that differential overlap of atomic orbitals can be partly neglected and that the SCF Hamiltonian is Lo¨wdin ortogonalized. Such a procedure is entirely consistent within the framework of the semiempirical model.42 The same procedure can be considered only approximate for ab initio methods, so that its performance needs to be carefuly assessed. A succesful implementation of this QM/LD coupling was reported by Malcolm et al.33 A more rigorous approach involves the conversion of the Langevin dipoles into point charges, which are subsequently introduced as external charges in the ab initio Hamiltonian (see also the related treatment on the all-atom water model43). These and other effective approaches are being developed in our laboratory, but they are beyond the scope of the present work. It is important to point out in this respect that if the solvent is treated classically (including, of course, by continuum models), there is no rigorous way of obtaining the solute polarization energy (see section 2.2 and the appendix of ref 30), and the LRA is one of the most reasonable approximations. This applies also to the effect of the electron correlation part of ∆Grelax (see section 3.7). The ∆Grelax contributions evaluated by eq 19 are compared with the results of Chen et al.36 in Table 2. Although there are apparent method-dependent variations in calculated values, overall trends are predicted in a consistent way. Such a

TABLE 2: Solute Polarization Term (∆Grelax (kcal/mol)) Evaluated by Various Computational Methods molecule

ILD

SCRF-Pa

SCRF-P+Da

H2O OHCH3OH CH3OCH3NH2 CH3NH3+ Im ImH+ HCOOH HCOOCH3COOH CH3COO-

-0.8 -2.8 -0.6 -6.8 -0.3 -0.3 -0.8 -0.3 -0.6 -1.8 -0.6 -3.5

-1.6 -3.7 -1.4 -5.9 -1.2 -1.1 -3.4 -0.6 -1.5 -1.0 -1.8 -4.2

-0.9 -1.9 -1.1 -6.1 -0.6 -0.6 -2.4 -0.2 -1.6 -1.2 -1.5 -3.4

a Results obtained by self-consistent coupling of the reaction potential of the solvent, obtained by solving the Poisson-Boltzmann equation, with the DFT quantum mechanical description of the solvent.36 The molecular charge distribution was represented by a multicenter multipole expansion formed either by atomic charges (SCRF-P) or by atomic charges and dipoles (SCRF-P+D).

performance seems to be sufficient in standard solvation calculations since ∆Grelax usually does not exceed 15% of the total solvation energy. The polarization of anionic solutes was found to be significantly larger than for their neutral or positively charged counterparts. The largest ∆Grelax value of -8.6 kcal/ mol was obtained by us for the phenolate anion. These findings can be rationalized by a more diffuse character of the wave function of anions and consequently their larger static electronic polarizability. Therefore it is quite surprising that ∆Grelax values reported by Chen et al.36 for formic acid/formate system do not conform to the mentioned trend. 3.7. Higher Order Corrections. The essential feature of the presented parametrization of the LD solvation model is the use of the solute point charges that approximate the molecular electrostatic potential determined from the Hartree-Fock (HF) wave function expressed in terms of the 6-31G* set of atomic orbitals. This quantum chemical method is frequently considered as a method of choice for various chemical and biochemical applications involving medium to large molecules. However, the HF approximation neglects electron correlation effects that are known to play an important role in systems such as transition states for breaking or forming covalent bonds. Also, the use of more extensive basis sets may be necessary for the proper description of solute charge distribution in some anionic systems. Thus, in these cases additional corrections may be required to amend these deficiencies. This is accomplished here as in the case of the HF solute polarization by using the LRA and adding to ∆Gsolv a term

∑Vi∆qicor

∆Gcor ) kES

(20)

where ∆qicor represents the additional solute polarization due to the higher order effects, and kES ) 0.52 and 0.47 for the ILD and NLD methods, respectively. (Note that the LRA constant kES is identical to that used in eqs 10 and 11.) Equation 20 is analogous to eq 19 in that it involves the electric potentials Vi at the positions of solute atoms and differences in atomic charges. Here, the values of ∆qicor were obtained by subtracting charges on the ith solute atom calculated from the HF and MP2 electrostatic potential, i.e. ∆qi ) QiHF - QiMP2. By MP2 we denote the many-body perturbation theory of second order that treats the electron correlation as a perturbation to the HF Hamiltonian. Consequently, MP2 electron density and electrostatic potential include part of the electron correlation effects. The MP2/6-31+G** atomic charges are used in this paper to evaluate the higher order corrections according to eq 20, but

5588 J. Phys. Chem. B, Vol. 101, No. 28, 1997

Floria´n and Warshel

TABLE 3: Solvation Free Energies [kcal/mol] of Hydrocarbons and Neutral Alkyl Alcohols, Ethers, and Amines solutea

∆GESb ∆GvdW ∆Gphob ∆Grelax ∆Gcalc solv

methane -0.1 ethane 0.0 propane 0.0 butane 0.0 pentane 0.0 hexane 0.0 cyclohexane 0.0 ethene -0.4 cyclopentene -0.2 cyclopentadiene -0.8 cyclopentadiene-c -50 benzene -1.1 naphthalene -1.7 ethyne -0.4 methanol -4.8 ethanol -4.5 propanol -4.3 butanol -4.4 ethanediol -7.3 tetrahydrofuran -2.7 1,4-dioxane -3.9 dimethyl ether -2.1 diethyl ether -1.3 ammonia -3.1 methylamine -2.5 ethylamine -2.4 dimethylamine -1.5 trimethylamine -1.0

-1.5 -2.2 -2.9 -3.5 -4.1 -4.7 -4.1 -2.3 -3.8 -3.7 -4.5 -4.4 -6.2 -2.3 -1.9 -2.6 -3.2 -3.8 -2.9 -3.6 -3.8 -2.6 -3.9 -1.4 -2.2 -2.8 -2.8 -3.3

3.4 4.2 4.9 5.5 6.1 6.8 5.9 4.0 5.4 4.7 0.5 4.9 5.2 4.0 1.5 2.2 3.1 3.6 1.5 3.0 3.3 2.6 4.3 1.1 2.1 2.9 3.3 4.4

0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 -0.6 -0.1 0.0 0.0 -0.6 -0.6 -0.6 -0.6 -0.9 -0.6 -0.6 -0.4 -0.2 -0.4 -0.3 -0.3 -0.2 -0.1

∆Gexp solv

1.8 1.9 2.0 1.8 2.0 2.0 2.0 2.2 2.0 2.3 2.1 2.6 1.8 1.2 1.3 1.3 1.4 0.6 0.2 -65 -66 ( 5 -0.7 -0.9 -2.8 -2.4 0.3 0.0 -6.0 -5.1 -5.3 -5.0 -5.0 -4.8 -5.1 -4.7 -9.7 -9.6 -3.9 -3.5 -5.1 -5.1 -2.6 -1.9 -1.1 -1.6 -4.0 -4.3 -2.9 -4.6 -2.6 -4.5 -1.2 -4.3 -0.1 -3.2

a All-trans conformers of alkanes, C (trans) conformers of alkyl s alcohols, all-gauche conformer of ethanediol. b Obtained by the ILD c Deprotonated at the sp3 carbon atom. method.

other extended basis sets or other correlated methods, including density functional theory (DFT), could be used as a reasonable alternative. We use the notation MP2-LD and LD to distinguish between ∆Gsolv results calculated by using the LD model with and without the ∆Gcor term, respectively. Effects that cannot be treated by the LRA, for example the variation of the electron correlation energy upon solvation, were neglected. This approach is partly justified by the small values of ∆Gcor obtained for the molecules studied by us and a small effect of the electric field on the electron correlation contribution to the dipole moment of the OH- ion (see section 4.4). Also, we would like to emphasize that there is no rigorous way of obtaining the correlation contribution to the solute polarization energy for classical solvent models (see section 2.2 and the appendix of ref 30). 4. Hydration Free Energies of Neutral and Ionic Solutes In Tables 3 and 4, we present an overview of the calculated and experimental hydration free energies (∆Gsolv) for a number of neutral and ionic solutes. The first attempt to obtain such a quantitative comparison was reported in early studies of solvent models.4,6 More extensive compilations were reported by Cramer and Truhlar15,16 as a part of the development of the SM family of solvation models based on the semiempirical AM1 and PM3 calculations, by Sitkoff et al., who developed atomic parameters for the macroscopic solvent models,32 and by Stefanovich and Truong44 for ab initio PCM calculations. Other computational studies of aqueous solvation thermodynamics were focused on implementation of various solvation models into quantum mechanical programs30,36,45-49 or on detailed analysis of solvation phenomena for a smaller range of molecules, e.g. neutral molecules.50-53 Table 3 lists individual components of the calculated solvation free energy for selected solutes. In Table 4, results obtained

by using iterative (ILD) and noniterative (NLD) Langevin dipole models described in the preceding section are compared with experimental data. In addition, LD solvation free energies can be compared in Table 4 with results obtained by the PCM model. In the PCM calculations we used Pauling’s hybridizationindependent atomic radii that were parametrized only through a single multiplicative scale factor. The experimental data presented in Tables 3 and 4 vary in their reliability, depending on the way in which they were derived. For neutral solutes, experimental ∆Gsolv can be obtained directly by measuring partition coefficients of solutes between gas phase and dilute aqueous solutions in equilibrium. Following the suggestion of Ben-Naim,54 the experimental ∆Gsolv presented in this paper corresponds to a transfer of the single solute molecule from gas to dilute aqueous solution of the same molarity. We used the data compiled by Cabani et al.55 and by Ben-Naim and Marcus56 for the temperature of 298 K and atmospheric pressure. The uncertainty of measured partition coefficients increases with increasing |∆Gsolv|.57 Eventually, the concentration of the solute molecules in the gas phase falls below the experimental detection limit. This usually occurs for solutes with |∆Gsolv| above 12 kcal/mol. Thus, ∆Gsolv for all charged solutes cannot be obtained by direct experiments. Instead, for a positively charged (protonated) solute, experimental ∆Gsolv can be obtained indirectly from the values of gas phase basicity, Bg(A) ) G°(A) + G°(H+) - G°(AH+), the solvation free energy of the proton, ∆Gsolv(H+), ∆Gsolv of its conjugate base, and its pKa constant:6

∆Gsolv(AH+) ) Bg(A) + ∆Gsolv(H+) + ∆Gsolv(A) 2.303RT pKa(AH+) (21a) By using an analogous thermodynamic cycle, ∆Gsolv of negatively charged base B- can be evaluated as

∆Gsolv(B-) ) -Bg(B-) - ∆Gsolv(H+) + ∆Gsolv(BH) + 2.303RT pKa(BH) (21b) In cases where the gas phase basicity is unknown, it can be obtained from the value of gas phase proton affinity, ∆Hg [kcal/ mol], by using the approximate formula58,59

Bg(A) ) ∆Hg(A) - 7.5 kcal/mol

(22)

The value of ∆Gsolv(H+) can be derived from the absolute potential of the hydrogen electrode.60 Because the value of this potential has been determined with ∼1% accuracy,61 the values for solvation energy of the proton that are used by different authors vary considerably. Here we assume that ∆Gsolv(H+) is -259.5 kcal/mol and that the uncertainty of this value is (2.5 kcal/mol. ∆Gsolv(H+) cancels out when relative solvation energies of equally charged solutes are evaluated, but its magnitude does affect relative solvation energies of solutes carrying different charges. Therefore, the uncertainty in ∆Gsolv(H+) must be considered when LD and experimental ∆Gsolv are compared. The accuracy of other terms entering eq 21 was determined as follows. Experimental pKa values are highly accurate if they fall in the 0-14 range, but their uncertainty increases beyond this region. Moreover, solvation free energies determined by using eq 21 for acids and bases with pKa constants below -3 and above 16 refer to solvation by solvents other than neutral water. The relative gas phase basicities are determined directly from the equilibrium constants for proton transfer reactions, and as such they are usually accurate within 0.5 kcal/mol. However,

Langevin Dipoles Model

J. Phys. Chem. B, Vol. 101, No. 28, 1997 5589

TABLE 4: Comparison of Calculated and Experimental Solvation Free Energies of Neutral and Ionic Solutes in Aqueous Solutiona calculated b

solute

H2O H3O+ OHMeOH2+ MeOEtOH2+ EtOPhOH PhO-d acetaldehyde formic acid formate acetic acid acetate CHF2COOH CHF2COOCHCl2COOH CHCl2COONH4+ MeNH3+ Me2NH2+ Me3NH+ aniline anilineH+m pyridine pyridineH+m imidazole imidazoleH+m formamide formamideH+k acetamide acetamideH+k cytosine cytosineH+f HCN

NLD

ILD

experimental PCM

c

this work

d

Pearson

calculated e

solute

b

CN acetonitrile CH3CNH+ nitromethane CH2NO2HNO2 NO2HNO3 NO3H2S HSMeSH MeSEtSH EtSPhSH PhSPH3 PH4+ MePH2 MePH3+ Me2PH Me2PH2+ Me3P Me3PH+ H3PO4 H2PO4HPO4(2-) PO4(3-) CH3F HF FCH3Cl ClH Cl-

experimental c

NLD

ILD

PCM

-77 -6.7 -72 -7.4i -83i -4.1 -77 -5.2 -72 -0.3 -77 -1.3 -74 -1.4 -71 -1.6 -66 0.8 -75 0.2 -69 -0.1 -64 -0.4 -61 -10 -71 -225 -439 -2.6 -4.5 -109 -2.0 -0.6 -81

-75 -6.9 -68 -7.9i -82i -4.5 -74 -6.0 -68 -0.4 -75 -1.5 -74 -1.6 -72 -1.9 -69 0.8 -72 0.1 -68 -0.2 -63 -0.6 -59 -12 -70 -247 -536 -2.8 -4.9 -104 -2.1 -0.8 -78

-89 -4.1 -74 -5.2 -84 -2.5 -85 -2.7 -77 -0.2 -87 -0.5 -84 -0.6 -81 -0.3 -68.0 -0.1 -80 -0.2 -72 -0.4 -64 -0.6 -60 -13 -81 -275 -594 -2.6 -4.7

-

-7.8 -108 -118 -88 -99 -80 -93 -4.7 -77 -4.7 -5.3 -85 -6.1 -83 -5.9 -78 -4.2 -71 -85 -78 -70 -65 -4.3 -64 -3.7 -60 -7.3 -65 -8.2 -80 -8.2 -72 -17 -67 -4.3

-8.8 -101 -115 -83 -97 -77 -93 -5.7 -80 -5.1 -6.2 -81 -7.0 -82 -6.8 -75 -5.0 -71 -81 -74 -67 -62 -5.4 -64 -4.3 -59 -8.3 -62 -8.7 -76 -8.9 -69 -18 -66 -4.7

-6.0 -94 -121 -80 -105 -73 -102 -2.9 -83 -4.8 -6.5 -88 -7.0 -89 -5.7 -77 -4.8 -77 -87 -76 -68 -64 -2.2 -62 -3.4 -61 -5.6 -67 -7.9 -77 -9.4 -70 -15 -67 -3.5

-6.4 -105 ( 5 -110 ( 5 -87 ( 5j -98 ( 5 -81 ( 6j -94 ( 5 -6.6 -75 ( 5 -3.5 -80 ( 5 -6.7 -82 ( 5

-104 -106 -83 -95

-72

-77

-70 ( 6 -66 ( 6 -81 ( 5 -73 ( 5 -66 ( 5 -59 ( 5 -4.9 -68 ( 6 -4.7 -58 ( 5 -10.3g -64 ( 5

-79 -70 -63 -59 -68 -59 -62

-78 ( 5 -9.7 -70 ( 5 -67 ( 6

0.0 0.0

this workd -75 ( 5 -3.9 -69 ( 5 -3.7h -80 ( 6 -73 ( 7 -66 ( 5 -0.7 -76 ( 5 -1.2 -76 ( 5 -1.2 -74 ( 5 -2.6 -65 ( 7l 0.6

Pearsone -77 -69

-72 -65 -76

-67 -73

-63 ( 5

-66

-57 ( 5

-57

-53 ( 5

-53

8l

-68 ( -245 ( 15l -536 ( 20l -0.2 -107 ( 6 -0.6

-107

-78 ( 7

-77

a

Solvation free energies refer to dilute aqueous solutions, standard state of 298 K, and equal molar concentrations of solute in both the gas phase and solution. b Me, Et, and Ph denote methyl, ethyl, and phenyl residues, respectively. c PCM HF/6-31G* method using Merz-Kollman/Pauling vdW radii (Å) of 1.2 (H), 1.50 (C), 1.50 (N), 1.40 (O), 1.75 (S), 1.80 (P), 1.8 (Cl) scaled by a factor of 1.2. d Experimental solvation free energies of neutral molecules were taken from Cabani et al.55 For the determination of “experimental” data for ionic solutes and their uncertainties see the text. e Reference 58. f Cytosine protonated at N3. g 4-Methylimidazole.84 h Nitroethane. i Calculated by assuming H3C(sp3)NO(sp2)O(sp2) and H2C(sp2)NO(sp2)O(-)(sp3) structures for the neutral and deprotonated nitromethane, respectively. j pKa ) -2.1 and -1.9 for MeOH2+ and EtOH2+, respectively.85 k Protonated at the oxygen atom. l The ab initio MP2/6-31+G**//HF/6-31G* method was used to determine the gas phase proton affinity. m Protonated at the nitrogen atom.

the error range of absolute free energies of the molecular standards for these measurements is significantly larger.62 Therefore, gas phase basicities were assumed to be accurate to within (2.5 and (1.5 kcal/mol for molecules with proton affinities above and below 202 kcal/mol, respectively.59,62 For basicities of neutral molecules we used the scale of Lias et al.,62 and for basicities of anions (acidities of neutral molecules) we used a more recent compilation by the same group.59 In a few cases when experimental gas phase basicities were not available, we substituted them by ab initio MP2/6-31+G**//HF/6-31G* proton affinities using eq 22. We assumed that these ab initio results were accurate within (4 kcal/mol. Similarly, if experimental ∆Gsolv of neutral compounds were not available, we used in the right-hand side of eqs 21a and 21b values of ∆Gsolv that were calculated by the ILD method. This replacement is reasonable since the accuracy of calculated solvation energies of neutral solutes is significantly better than for ionic solutes. Experimental solvation energies for a number of ionic solutes were reported previously by Pearson,58 who used the same formula (eq 21a,b) for their derivation. These data differ from ours in that Pearson’s values of |∆Gsolv| refer to the transfer of the solute from the ideal gas (1/22.4 M) to 1 M aqueous solution. For example, for methanol ∆Gsolv of -3.2 and -5.1 kcal/mol was used in Pearson’s and our compilations, respectively. There

are also other nonsystematic differences between our and Pearson’s sets of experimental data, especially in the choice of gas phase free energies. Therefore, we converted Pearson’s values to our standard state and listed them in Table 4 along with the experimental solvation energies derived by us and their uncertainties. Generally large error ranges of experimental solvation energies of ionic solutes suggest a question whether one should include these data in the training set for the parametrization of solvation models. We believe that despite their uncertainty, these data are valuable for determination of atomic vdW radii. This is because calculated solvation energies of ionic solutes are highly sensitive to atomic radii. In addition, variations in ∆Gsolv due to the substituent effects are usually determined experimentally with substantially better accuracy since gas phase data refer to the same standard and ∆Gsolv(H+) cancels out. 4.1. Carbohydrates. The electrostatic contributions to the ∆Gsolv of neutral carbohydrates are negligible. Because the experimental ∆Gsolv of linear alkanes increases only slowly with the number of methylene groups (Table 3), van der Waals (∆GvdW) and hydrophobic (∆Gphobic) terms that have opposite signs must nearly compensate each other. The magnitude of ∆GvdW per -CH3 group was adjusted to be close to -0.9 kcal/ mol determined from the measured substituent effects on

5590 J. Phys. Chem. B, Vol. 101, No. 28, 1997 enthalpies of transfer of benzene derivatives from vapor to methanol.63 The resulting calculated ∆Gsolv slightly underestimated experimental ∆Gsolv of linear alkanes and overestimated experimental ∆Gsolv of cyclic alkanes, which are about 1.4 kcal/ mol less positive compared to linear alkanes.55 This finding could be explained neither by continuum dielectric theory50 nor by the current LD model. Because the ∆GvdW term is quite insensitive to the magnitudes of atomic van der Waals (vdW) radii, these radii have to be determined from polar or ionic solutes, the solvation energy of which is dominated by electrostatic interactions. With increasing aromatic character, ∆GES contributes more significantly to the total solvation energy, but this is still insufficient for adjusting vdW radii of carbon atoms. Therefore, these parameters were mainly determined from solvation properties of alkyl alcohols and amines (see below). Except for the cyclopentadiene anion (Table 3), experimental data from carbocations and carboanions were not used because the relevant pKa constants fall in the highly acidic or basic regions. The cyclopentadiene anion could be included in our training set since it is strongly stabilized by resonance. Indeed, the pKa value for cyclopentadiene (16 ( 0.2) is considered to be the most reliable pKa for a hydrocarbon in the aqueous standard state.64 4.2. Compounds Containing Oxygen or Sulfur Atoms. Oxygen, as the highly electronegative element, largely determines solvation properties of organic molecules. Because the calculated solvation energy is strongly dependent on the magnitudes of vdW radii, we used three different atom types for oxygen in our current LD model. For organic sp3 oxygen, a small 2.2 Å vdW radius was used to account for large |∆Gsolv| of aliphatic alcohols and their ions (Tables 3 and 4). By analogy, the same oxygen vdW radius was used for H3O+, H2O, and OH- molecules, which resulted in underestimated |∆Gsolv| of H3O+ and overestimated |∆Gsolv| for OH- and H2O. Therefore, for obtaining more accurate results, we suggest oxygen vdW radii in H2O and OH- to be increased to 2.35 Å, as in our study of phosphate ester hydrolysis.65 Unfortunately, the use of such hybridization- or group-dependent vdW radii may result in discontinuities on the free energy surface of a studied chemical reaction. To our knowledge, little attention was paid to this complication before, as many solvation models tend to introduce different vdW parameters for the same atoms in neutral and ionic solutes, or use many group-dependent parameters. On the other hand, the use of hybridization- and group-independent atomic parameters as in the PCM model results in large deviations from experimental data (Table 4). Because the current LD model uses vdW radii that are independent of the total solute charge, the number of such discontinuities is largely limited in the current LD model. Thus, this LD model and a linear interpolation of vdW radii along the reaction path could be recently used to obtain smooth and quantitatively correct free energy surfaces for phosphate ester hydrolysis in aqueous solution.65 Low accuracy of ∆Gsolv of ions containing sp3 oxygen atoms not only is characteristic of the LD model but is even more pronounced in continuum-based models. For example, the PCM model with generic atomic radii largely overestimates |∆Gsolv| of the OH- ion and underestimates |∆Gsolv| for protonated water, methanol, and ethanol molecules, respectively (Table 4). Alternatively, by using the PCM model with density functional and MP2 electron densities and refined atomic radii, Stefanovich and Truong obtained reasonable value of ∆Gsolv for the OHion, but ∆Gsolv reported by them for CH3O- fell in the -78 to -83 kcal/mol range.44 Similarly, a solvation energy of -85 kcal/mol was obtained for CH3O- by solving the Poisson-

Floria´n and Warshel Boltzman equation for density functional charge distribution on the solute.36 Lim et al.66 obtained a correct solvation energy of CH3O- (-97 kcal/mol), but only at the expense of different vdW radii of the oxygen atom in methanol and methoxide anion and freely adjustable atomic charges. The agreement with experimental data improves for solutes containing sp2 oxygen, such as organic acids. However, the hybridization-dependent vdW radii for O and C atoms require the knowledge of the prevalent resonance form for a given solute. This is straightforward, for example, for R-COOmolecules, in which two resonance structures have equal “weight” and consequently either oxygen can be defined as sp2 (and the other as sp3). However, the situation becomes more difficult in phenoxide anion,

for which contribution of the minor resonance forms 2 and 3 is unknown. The solvation energy of -80 kcal/mol reported in Table 4 was calculated using the resonance form 1. It is overestimated by 5 kcal/mol, whereas ∆Gsolv calculated with the sp2 radius on the O atom and the smaller sp3 radius on the opposite C atom, as in the resonance form 2, amounts to -71 kcal/mol at the ILD level. Thus, averaging over energies obtained for the resonance structures 1 and 2 results in a significant improvement of the calculated result. Alternatively, averaged vdW radii of 2.43 and 2.83 Å, respectively, could be used for the O and C atoms affected by the resonance. The same overestimation of the calculated |∆Gsolv| occurs for the thiophenolate anion. Thus, the vdW parameter of sp2-hybridized sulfur atom should be somewhat larger than for the sp3 one, in analogy with the trend occurring for C and O atoms. Otherwise, LD results for thiols are in an excellent agreement with available experimental data. 4.3. Compounds Containing Nitrogen or Phosphorus Atoms. Unlike solvation of cations of aliphatic alcohols, ∆Gsolv of protonated methyl amines tend to be overestimated by PCM (Table 3) and other continuum dielectric methods.36,46 Therefore, we have chosen a rather large (2.65 Å) vdW radius for nitrogen. This however did not lead to sufficient agreement with the experiment, as large contributions to ∆Gsolv in these molecules have their origin in charged hydrogen atoms. To keep the number of adjustable parameters small, we have chosen the vdW radii of H atoms to be a linear function of the radii of adjacent heavy atoms (with the exception of the H atom bonded to inorganic oxygen). The resulting ILD values of ∆Gsolv agree well with their experimental counterparts and also reproduce decreasing |∆Gsolv| upon going from NH4+ to the trimethylammonium ion. The solvation of protonated methyl phosphines turned out to be an even more difficult case. Here, the experimental solvation energies are systematically overestimated by the ILD method by about 6 kcal/mol, but relative energies of protonated mono-, di-, and trimethyl phosphine are predicted correctly. A similar decrease of |∆Gsolv| upon methylation on an electronegative center was observed and calculated for neutral and ionized water and alkyl alcohols. However, for neutral methyl amines, the experimental ∆Gsolv is approximately constant,56,57 whereas LD (Table 3), continuum dielectric,68,69 and MD FEP calculations70,71 provide a decrease in |∆Gsolv| of about 1 kcal/mol for each methylation upon going from

Langevin Dipoles Model ammonia to trimethylamine. Several computational studies68-71 were devoted to this topic without solving this enigma. Recently, Marten et al.50 attributed this discrepancy to the incorrect description of short-range hydrogen-bonding effects by methods that use electrostatic models. In contrast to amines, |∆Gsolv| of neutral acetonitrile calculated by using the same vdW radius on nitrogen is overestimated by 3 kcal/mol. This deviation is not critical, given that the very limited parameter set in our model is capable of describing consistently many experimental results for both neutral and ionic solutes. It cannot be corrected by a better choice of vdW radius on the nitrogen, as this would increase the deviation between LD and experimental solvation energy for protonated acetonitrile. In contrast, the SCRF model, which was designed for neutral molecules by Marten et al.,50 provided the solvation energy of acetonitrile within 0.4 kcal/mol from its experimental value. However, this model involved a separate vdW parameter for the sp-hybridized nitrogen and for two types of sp carbon, one of them for the CC and the other for the CN type of triple bonds. The vdW radius for N(sp) was chosen by these authors to be 0.5 Å larger than for N(sp2). Obviously, if this method and these parameters were applied to the prediction of the solvation energy of protonated acetonitrile, they would result in a largely underestimated |∆Gsolv|. Thus, in addition to amines, solvation properties of nitriles seem to be the case, in which proper treatment of both the quantum and statistical nature of the solute-solvent interactions becomes important. Solvation free energies of aromatic N-containing solutes could be predicted accurately enough without any additional parameter adjustments. This is especially important for formamide and acetamide, which represent model systems for peptidic linkages, and also for cytosine. Various aspects of aqueous solvation of these important biomolecules have been recently scrutinized by several groups.23,69,70,72-75 Compounds containing oxygen atoms bonded to nitrogen or phosphorus have rather small |∆Gsolv|. Thus, the value of the oxygen vdW radius in these compounds was adjusted to 2.8 Å, which is 0.6 Å larger than for O(sp3). The ILD model with this parametrization reproduces correctly solvation energies of NO2-, NO3-, and phosphate mono-, di-, and trianion. The noniterative LD (NLD) and the PCM models seem to fail for highly charged phosphates, providing too small and too large values of |∆Gsolv|, respectively. Considerable improvement in the PCM results for phosphate anions could be expected if a larger vdW radius for inorganic oxygen was used. In addition, overestimated PCM hydration energies of 2- and 3- ions can be rationalized by the lack of saturation in continuum methods. In contrast, the nonlinear dependence of induced dipoles upon the magnitude of the solute electrostatic field is expressed in terms of the Langevin function (Figure 1) in LD-based approaches. 4.4. Electron Correlation and Basis Set Effects. As described in section 3.7, the current LD implementation enables one to easily evaluate electron correlation and basis set effects upon the calculated solvation energies from the differences between the HF and MP2 atomic charges, and the values of Langevin dipoles calculated from the HF/6-31G* charges. (Note that the MP2-LD results were not used for the parametrization of the model nor were they included in Tables 3 and 4.) In this section we will discuss the trends in ∆Gcor values calculated by using eqs 19 and 20 and MP2/6-31+G** atomic charges. The HF/6-31G* method is generally believed to overestimate the gas phase molecular dipole moment by about 10-20%, so that it effectively mimics the solute polarization in polar

J. Phys. Chem. B, Vol. 101, No. 28, 1997 5591 solvents. This advantage of HF/6-31G* charges was utilized in a recent refinement of the AMBER molecular mechanics force field.76 Thus, the more accurate solute charge distribution obtained by using correlated quantum chemical methods can be expected to provide less negative solvation energies. We found this expectation to apply for the majority of conjugated or partly conjugated systems, regardless of the total solute charge. However, the magnitude of the MP2/6-31+G** corrections in these solutes was found to be generally small. For example, the largest ∆Gcor of 0.6 kcal/mol was calculated for neutral nitromethane. The combined electron correlation and basis set corrections are small also for methyl amines, where they amount to only few tens of kcal/mol. Interestingly, these contributions help to slightly improve the agreement with the experimental solvation energies. On the other hand, for methyl alcohols, the calculated ∆Gcor corrections tend to increase their already too high |∆Gsolv| values. This is noticeable especially for OH- and CH3- ions, for which ∆Gcor amounts to -1.2 and -1.0 kcal/mol. However, due to the small size of these ions, the electronic electron correlation effects might be partly offset by the correlationrelated increase in the CO bond length. To estimate the importance of such geometrical effects we calculated ∆Gsolv of the OH- ion by using both the HF and MP2 geometries with the same HF set of atomic charges. The resulting change in ∆Gsolv corresponding to the 0.008 Å increase in the OH bond length turned out to be only -0.2 kcal/mol, which is below the grid-related uncertainty. In addition, we evaluated (at the MP2/ 6-311++G(2d,p) level) how the correlation contribution to the dipole moment of the OH- ion changes when OH- is placed into the homogeneous electrostatic field. We applied a field of 0.01 au, which induced the dipole moment increase from the 1.77 to 2.04 D at the HF level and from 1.76 to 2.06 D at the MP2 level. Thus, in this case the electron correlation contribution to the induced dipole moment is negligible. 4.5. Substituent-Dependent Changes in Solvation Free Energies. As illustrated in Table 4, the absolute accuracy of both the experimental and LD solvation free energies of ionic solutes is low. This is rather unfortunate since the importance of solvation contributions from ionized groups for the kinetics and equilibrium energetics of heterolytic chemical reactions in solutions is enormous. However, experimental information often becomes more reliable (up to (0.5 kcal/mol) for solvation properties within a group of chemically related compounds. The substituent-dependent changes in ∆Gsolv of charged solutes usually amount to several kcal/mol, so the proper prediction of these trends by the theory is in fact a more meaningful test of the quality of the given computational method than the magnitude of the mean deviation of absolute experimental and calculated solvation energies. Even more importantly, if the solvation-dependent changes in equilibrium constants and kinetics of chemical reactions are correctly reproduced by a solvation model, it is usually easy to change a few parameters of the model in such a way that both the absolute and relative energetics will agree with the experimental data. In Table 5, we present such a comparison for several groups of related ionic compounds. Members of each group have common total charge and type of central atom. For the same molecules, variations in gas phase basicity calculated at the ab initio MP2/6-31+G**//HF/6-31G* level are compared with the corresponding experimental results. The ab initio method chosen includes a significant amount of correlation energy contributions while applicable to relatively large solutes (∼40 atoms), so it represents a reliable counterpart of the LD model for a wide range of chemical processes occurring in aqueous

5592 J. Phys. Chem. B, Vol. 101, No. 28, 1997

Floria´n and Warshel

TABLE 5: Comparison of Calculated and Experimental Substituent Effects on Gas Phase Basicities, Solvation Free Energies of Charged Solutes, and Corresponding pKa Differences substituents (R)(R′)(R′′)

∆∆Bga (kcal/mol) exptd MP2c

∆∆Gsolvb (kcal/mol) ILD PCM

NLD

10.4 4.8 17.3 21.5

Amines, N(R)(R′)(R′′)(H) , Relative to NH4 10.1 7 7 6.9 21 17 17.2 15 14 21.7 20 19

(-H)(-CH3)2 (-CH3)3

14.5 24.2

(-H)2(-CH3) (-H)2(-C2H5) (-CH3) (-C2H5) (-Ph) (-CHO) (-CCH3O) (-CCHF2O) (-CCHCl2O)

/NH3 11 25 19 23

calce

exptf

+

+

(-H)2(-CH3) (-H)2(-Ph) (-H)(-CH3)2 (-CH3)3

∆pKa expt 8 13 15 22

3 -10 4 4

14 -4.6 1.5 0.6

Phosphines, P(R)(R′)(R′′)(H)+, Relative to MePH3+/MePH2 12.3 5 5 8 23.0 8 9 12

6 10

7 11

3.9 8.7

15.9 20.2

Alcohols, ROH2+, Relative to H3O+/H2O 15.1 20 18 14 21.1 28 24 21

18 26

0 0

-5.2 -9.0 -39.5 -45.9 -42.0 -61.4 -62.1

Oxoanions, RO-, Relative to HO-/H2O -10.1 19 18 16 -13.4 25 22 19 -41.8 41 35 38 -45.9 33 34 33 -42.6 35 33 32 -60.5 40 40 44 -62.2 47 44 44

12 16 35 30 28 40 44

-7 -7 -6 -11 -8 -17 -16

0 2

5 5

(-CH3) (-C2H5)

4.6 3.0

mean deviationg

1.4

5.7 4.0

Thiols, RS-, Relative to HS-/H2S 3 1 6 3 3.5

2.1

3 6 3.6

0 0 -0.1 0.2 -5.7 -12.0 -10.9 -14.4 -14.4 5 5

2.5

a

Free energy change for the gas phase proton transfer reaction AH+ + B f A + BH+ or AH + B- f A- + BH, where B or B- is the reference base. b For absolute solvation energies and method descriptions see Table 4. c Determined from MP2/6-31+G**//HF/6-31G* proton affinities at 298 K, using eq 22. Calculated ∆Bg values for the reference bases: NH3, 198.0; MePH2, 195.0; H2O, 157.1; OH-, 379.7; SH-, 344.5 kcal/mol. d Experimental ∆B values for the reference bases: NH , 195.6; MePH , 196.3; H O, 159.0; OH-, 384.1; SH-, 344.9 kcal/mol (refs 59, 62). e MP2 g 3 2 2 + ILD method. Absolute pKa constants for acids calculated from MP2 gas phase energies and ILD solvation free energies using eq 21: NH4+, 11; + + f MePH3 , 3; H3O , -7; H2O, 10; H2S, 8. Reference 86 for phosphines, ref 85 for water and alcohols, ref 87 for others. Experimental pKa constants for reference systems: NH4+, 9.2; MePH3+, 0; H3O+, -1.5; H2O, 15.7; H2S, 7.0. g Arithmetic average of the absolute values of the differences between calculated and experimental data.

solution. The data presented in Table 5 enable one to compare the accuracy of such standard ab initio methods with the performance of LD and PCM models. Finally, a most critical check of the predictive capabilities of theoretical approaches for the treatment of solution energetics is presented in the two rightmost columns of Table 5. Here, the equilibrium constants for proton transfer reactions (characterized by pKa differences) calculated by using the ILD method are compared with corresponding experimental quantities. 4.6. Overall Performance. Overall, the ILD method slightly outperforms both the NLD and PCM methods. Such a good performance indicates that the ILD method can be fully competitive with the continuum dielectric methods in studies of reactivity and equilibrium properties of medium-sized systems in aqueous solution. Here, the existence of several alternative theoretical approaches should enable one to critically evaluate the inherent accuracy of calculated solvation free energies. An example of such a comparison will be given in the following section of our paper. The results provided by the NLD method have comparable accuracy as those obtained from the PCM calculations. The NLD method performs especially well for neutral and monoanionic solutes, whereas for solutes with larger localized charges NLD hydration energies are significantly underestimated. However, one should realize that the NLD method is about 2 orders of magnitude faster than the ILD and PCM methods. Therefore, a good performance of the noniterative approach is promising for the calculations of large systems. For such systems, the HF/6-31G* atomic charges needed for the LD calculations may not be directly obtainable, as the whole system may be too big for ab initio treatment. However, the necessary charge distribution could be built from separate ab initio

calculations of individual constituents, for example amino acids. In fact, the potential-derived HF/6-31G* charges and this buildup strategy have been used for a long time in the AMBER force field.76 Thus, if the free energy contribution corresponding to solute polarization (∆Grelax) could be determined empirically, e.g. from the parametrization of atomic polarizabilities, one could use, for example, the AMBER library of atomic charges to calculate hydration energies of macromolecules within the framework of the LD parametrization presented in this paper. Such calculations are currently feasible only by using the ENZYMIX library of group charges and the previous LD model implemented in the POLARIS program.20 5. Solvation Effects on Conformational Equilibria in 1,2-Ethanediol Rotations around single bonds constitute important degrees of freedom of biological macromolecules. The corresponding energy differences are usually small, but because of a large number of such degrees of freedom, they add up to significant driving forces influencing macromolecular secondary structures. Thus, it would be very useful if simple theoretical solvation models such as the LD model were able to predict correctly how hydration affects these conformational equilibria. To assess the performance of our LD model in evaluating conformationrelated solvation energy differences, we studied the g+G+g-, tG+g- and tTt rotamers of 1,2-ethanediol (Figure 2). We use the spectroscopic notation of the rotational isomers that denotes torsional angles near 60°, -60°, and 180° as gauche+ (g+), gauche- (g-), and trans (t), respectively. The g+G+g- and tG+g- conformers enable intramolecular hydrogen bonding. Consequently, they represent preferred conformations in the gas

Langevin Dipoles Model

J. Phys. Chem. B, Vol. 101, No. 28, 1997 5593

Figure 2. Benchmark conformers of 1,2-ethanediol. The calculated gas phase HF/6-31G* torsional angles (deg) are given below each conformer in the order (H1O1C1C2, O1C1C2O2, C1C2O2H2). The intramolecular hydrogen bond is indicated by the dashed line.

TABLE 6: Relativea Solvation Free Energies of the g+G+gand tTt Conformers of 1,2-Ethanediol in Aqueous Solution conformer method

g+G+g-

tTt

NLD ILD MP2-NLDb MP2-ILDb PCMc PCM(HF/4-31G)d PCM(HF/6-31G*)e SM1af SM2f SM3f MC (OPLS/TIP4P)g MD-FEP (GROMOS/SPC)h

-0.7 -0.9 -0.6 -0.8 -0.4 -1.1 -0.7 -0.4 -0.1 -0.1 -1.0 -0.7

-0.2 -0.5 -0.3 -0.6 -0.5 -0.2 -0.2 -0.1 -1.0 -1.2 -1.2 -1.1

a With respect to the tG+g- conformer. b LD solvation free energies corrected for electron correlation effects (see section 3.7). c This work. d Reference 78. e HF/6-31G*//HF/4-31G PCM calculation.78 f AM1SM1a, AM1-SM2, and PM3-SM3 relative solvation free energies, evaluated at the MP2/cc-pVDZ geometry.77 g Monte Carlo free energy perturbation method.79 h Molecular dynamics restrained dihedral sampling.80

phase. When only the electronic, vibrational, and rotational contributions to the free energies are considered, the tG+gconformer is preferred by 0.5 kcal/mol to the tG+g- conformer and by 2.2 kcal/mol to the tTt one.77 Except for the tTt conformer, seven other minima that lack internal hydrogen bonds were located on the ab initio MP2/cc-pVDZ potential energy surface.77 For the investigation of all stereochemically unique conformers, which exceeds the scope of the present paper, we refer the reader to the work of Cramer and Truhlar77 and references cited therein. The relative LD solvation free energies of the different conformers are presented in Table 6 along with the results obtained by other computational methods. These include PCM and Monte Carlo calculations of Alagona and co-workers,78,79 results obtained with AM1-SM1a, AM1-SM2, and PM3-SM3 solvation models,77 and classical molecular dynamics simulations.80 All the mentioned theoretical methods agree that both the g+G+g- and tTt conformers are stabilized upon solvation. However, the predicted magnitudes of solvation effects considerably vary among different methods. The SM2 and SM3 solvation models were claimed by Cramer and Truhlar77 to be more appropriate for the calculation of conformational equilibria in aqueous solution than the PCM model because the SM2 and SM3 models are parametrized to include first solvation shell effects. Indeed, the major part of the relative stabilization of the tTt conformer in the SM2 and SM3 calculations was found to have a nonelectrostatic origin.77 In contrast, the more negative ∆Gsolv for g+G+g- and tTt than for the tG+gconformer originates mainly from the ∆GES term in both the iterative and noniterative LD solvation models. This electro-

static stabilization of the all-trans conformer is increased upon inclusion of the electron correlation. On the other hand, it is partly compensated by the larger (positive) hydrophobic term. The effects of repulsion-dispersion (∆GvdW), solute polarization (∆Grelax), and Born (∆GBorn) terms were found to be negligible. The above mentioned competition of different components of solvation free energy results in slightly larger stabilization of the g+G+g- conformer than the tTt conformer. At the highest MP2-ILD level, this stabilization with respect to tG+g- amounts to 0.8 and 0.6 kcal/mol, respectively. This result is in a reasonable agreement with calculations using all-atom solvent models and also with results of our PCM calculation. Interestingly, the PCM calculations of Alagona and Ghio that were carried out at the similar ab initio HF/6-31G*//HF/4-31G level resulted in notably different relative energies. Most of this difference probably originates from the differences in atomic vdW radii. Unfortunately, information about vdW radii was not included in ref 78. 6. Concluding Remarks Approximate treatments of solute-solvent interactions provide at present the only practical way for quantitative evaluation of solvation energies. Several strategies are currently used including continuum treatments, LD models, and all-atom QM/ MM approaches. In this paper, a new parametrization of the LD model is presented, and the resulting solvation energies are compared to the results obtained by using current continuum models. The performance of both methods is comparable and depends to a large extent on the effort invested in the parametrization.81 However, since the LD model is based on an explicit representation of the solvent molecules, it provides a clearer connection to the underlying molecular picture. For example, it is not entirely clear how to represent the electronic polarizability of solvent molecules in continuum approaches. On the other hand, the dipolar models enable one to deduce the direct analogy between the behavior of the inductive components of these dipoles and the behavior of the electrons in real solvent molecules. This can be done by considering the actual solvent molecules in a quantum mechanical supermolecular treatment and then asking what is the corresponding classical dipolar analog.41 To realize this point, one can start with a solute molecule and a single helium atom and treat them as a supermolecular quantum mechanical system within a double CI treatment41 and then reproduce the effect of the helium atom by an induced dipole. The same can be done with many helium atoms or any other system of polarizable atoms.3 In this case, it is possible to form a clear bridge between a quantum mechanical description of the solvent molecules and our practical classical model. An identical philosophy can be used in modeling polar solvent molecules. The solvent molecules can of course be approximated by continuum approaches, and there are even powerful formulations for the treatment of the electronic polarization within the continuum theory.82 However, it is not clear yet how to relate such treatments to the quantum treatment of the solvent, since the basic model does not reflect a quantum mechanical representation of the solvent. This is perhaps the reason for controversies about the correct implementations of such models. No such problems exist in dipolar models due to their clear connection to the microscopic world. Therefore, we believe that explicit dipolar models may be useful in future attempts to improve solvent models, such as modeling charge transfer to and from the solvent or explicit hydrogen bonds. In cases like this, it is quite straightforward to study the properties of real molecular models and to construct the corresponding dipolar models. Finally, the advantage of dipolar

5594 J. Phys. Chem. B, Vol. 101, No. 28, 1997 models becomes more apparent when one moves to heterogeneous environments such as proteins, where the correct use of continuum representation involves major conceptual traps.24 Those who do not share our belief in the advantage of using dipolar models, which is in part a matter of taste, can simply consider our LD model as a tool in the arsenal of modern computational chemistry that augments the more widespread continuum dielectric and all-atom solvation models. The current version of our LD program83 can be used along with any standard ab initio program and can be obtained upon request from the authors. Acknowledgment. We thank Dr. Arno Papazyan for suggesting the Clausius-Mosotti scaling and for valuable discussions. This work was supported by the NIH Grant GM24492 and by the Tobacco Grant 4RT-0002. References and Notes (1) Warshel, A. Computer Modeling of Chemical Reactions in Enzymes and Solutions; John Wiley & Sons: New York, 1991. (2) (a) Cramer, C. J.; Truhlar, D. G. Structure and ReactiVity in Aqueous Solution; American Chemical Society: Washington, 1994; Vol. 568. (b) Tomasi, J.; Mennucci, B.; Cammi, R.; Cossi, M. In Computational Approaches to Biochemical ReactiVity; Na´ray-Szabo´, G., Warshel, A., Eds.; Kluwer Academic Publishers: Dordrecht, 1997; pp 1-102. (3) Warshel, A.; Levitt, M. J. Mol. Biol. 1976, 103, 227. (4) Warshel, A. J. Phys. Chem. 1979, 83, 1640. (5) Russell, S. T.; Warshel, A. J. Mol. Biol. 1985, 185, 389. (6) Warshel, A.; Russell, S. T. Q. ReV. Biol. 1984, 17, 283. (7) Rinaldi, D.; Rivail, J.-L. Theor. Chim. Acta 1973, 32, 57. (8) Tapia, O.; Goscinski, O. Mol. Phys. 1975, 29, 1653. (9) Rivail, J.-L.; Rinaldi, D. Chem. Phys. 1976, 18, 223. (10) McCreery, J. H. C., R. E.; Hall, G. G. J. Am. Chem. Soc. 1976, 98, 7191. (11) Miertus, S.; Scrocco, E.; Tomasi, J. Chem. Phys. 1981, 55, 117. (12) Miertus, S.; Tomasi, J. Chem. Phys. 1982, 65, 239. (13) Rashin, A. A. J. Phys. Chem. 1990, 94, 1725. (14) Sharp, K. A.; Honig, B. J. Phys. Chem. 1990, 94, 7684. (15) Cramer, C. J.; Truhlar, D. G. J. Am. Chem. Soc. 1991, 113, 8305. (16) Cramer, C. J.; Truhlar, D. G. J. Comput.-Aided Mol. Des. 1992, 6, 629. (17) Dillet, V.; Rinaldi, D.; Rivail, J. L. J. Phys. Chem. 1994, 98, 5034. (18) Kollman, P. Chem. ReV. 1993, 93, 2395. (19) Warshel, A.; A° qvist, J. Annu. ReV. Biophys. Biophys. Chem. 1991, 20, 267. (20) Lee, F. S.; Chu, Z. T.; Warshel, A. J. Comput. Chem. 1993, 14, 161. (21) Alden, R. G.; Parson, W. W.; Chu, Z. T.; Warshel, A. J. Am. Chem. Soc. 1995, 117, 12284. (22) Stephens, P. J.; Jollie, D. R.; Warshel, A. Chem. ReV. 1996, 96, 2491. (23) Floria´n, J.; Baumruk, V.; Leszczynski, J. J. Phys. Chem. 1996, 100, 5578. (24) Sham, Y. Y.; Chu, Z. T.; Warshel, A. J. Phys. Chem. B 1997, 101, 4458. (25) King, G.; Warshel, A. J. Chem. Phys. 1989, 91, 3647. (26) Papazyan, A.; Warshel, A. In preparation. (27) Coalson, R. D.; Duncan, A. J. Phys. Chem. 1996, 100, 2612. (28) Rogers, N. K. Prog. Biophys. Mol. Biol. 1986, 48, 37. (29) It is important to point out that past criticisms of the LD model were either irrelevant or incorrect. This point was partially clarified recently.24 However, in the interest of those who are unfamiliar with the model and find it computationally appealing we will further clarify this issue by considering the specific criticism of Rogers28 point by point. (i) It was argued28 that the LD model has only permanent dipoles, while it is well-known that electronic polarizability is very important for a proper dielectric constant. However, many LD versions include explicitly induced dipoles (e.g. ref 4). Also, the effect of induced dipoles is rather unimportant when one deals with ground state solvation properties and adjusts the permanent dipoles to account for the missing electronic polarizability. Moreover, as far as the dielectric constant () is concerned it is well-known that the proper  can be obtained with a properly adjusted permanent dipole without any induced dipole. (ii) Rogers pointed out the obvious fact that the cubic grid of the LD model does not have the correct structure of real water, which he assumed to be very important for obtaining the dielectric constant of water. However, the LD and other simple dipolar models (or their simplification, the continuum models) have no pretense of reproducing the details of the solvent structure, and their use is justified by the virtue of capturing solvation effects. (iii) It was pointed out that the reaction field

Floria´n and Warshel from the bulk is very important and it is neglected in the LD model. This criticism is rather puzzling since the LD model does include the reaction field effect on solvation energy (see e.g. ref 6), and a simple version of this correction is the bulk correction presented in section 3.4. Finally, it was implied that the model is problematic because it was criticized by Krishtalik and Topolev88 in particular with respect to ion pairs. The problem with such a second-hand assessment is that the original criticism was baseless in any single point (see a discussion of some of the points in footnote 56 of ref 89). The specific criticism of the energetics of ion pairs was based on the incorrect assumption that |∆Gsolv| of ion pairs in solution should stay constant with distance rather than increase with distance. (This obvious point is demonstrated for example in Figure 2 of ref 4.) (30) Luzhkov, V.; Warshel, A. J. Comput. Chem. 1992, 13, 199. (31) Warshel, A.; Chu, Z. T. In ACS Symposium Series: Structure and ReactiVity in Aqueous Solution. Characterization of Chemical and Biological Systems; Cramer, D. G., Ed.; American Chemical Society: Washington, DC, 1994. (32) Sitkoff, D.; Sharp, A. A.; Honig, B. J. Phys. Chem. 1994, 98, 1978. (33) Malcolm, N. O. J.; McDouall, J. J. W. J. Mol. Struct. (THEOCHEM) 1996, 366, 1. (34) Bondi, A. J. Phys. Chem. 1964, 68, 441. (35) Frisch, M. J.; Trucks, G. W.; Schlegel, H. B.; Gill, P. M. W.; Johnson, B. G.; Robb, M. A.; Cheeseman, J. R.; Keith, T.; Petersson, G. A.; Montgomery, J. A.; Raghavachari, K.; Al-Laham, M. A.; Zakrzewski, V. G.; Ortiz, J. V.; Foresman, J. B.; Cioslowski, J.; Stefanov, B. B.; Nanayakkara, A.; Challacombe, M.; Peng, C. Y.; Ayala, P. Y.; Chen, W.; Wong, M. W.; Andres, J. L.; Replogle, E. S.; Gomperts, R.; Martin, R. L.; Fox, D. J.; Binkley, J. S.; Defrees, D. J.; Baker, J.; Stewart, J. P.; HeadGordon, M.; Gonzalez, C.; Pople, J. A. Gaussian 94, ReVision D.2; Gaussian, Inc.: Pittsburgh, PA, 1995. (36) Chen, J. L.; Noodleman, L.; Case, D. A.; Bashford, D. J. Phys. Chem. 1994, 98, 11059. (37) Levy, R. M.; Belhadj, M.; Kitchen, D. B. J. Chem. Phys. 1991, 95, 3627. (38) Lee, F. S.; Chu, Z. T.; Bolger, M. B.; Warshel, A. Protein Eng. 1992, 5, 215. (39) A° qvist, J.; Hansson, T. J. Phys. Chem. 1996, 100, 9512. (40) Lee, F. S.; Warshel, A. J. Chem. Phys. 1992, 97, 3100. (41) Luzhkov, V.; Warshel, A. J. Am. Chem. Soc. 1991, 113, 4491. (42) In a recent study90 it was argued that the LD approach of ref 30 and, in fact, ref 4 is inconsistent in its implementation with QM calculations since it uses point charges for the solute and that this leads to problems in MD simulations (since the force is presumably not consistent with the energy). These assertions involve several misunderstandings which are clarified below: (i) The LD method has been developed and implemented within the Lo¨wdin orthogonalized orbitals formulation (where the overlap is treated implicitly) for the solute-solvent coupling, and this approximation is fully consistently and rigorously obtained using the solute point charges (see e.g. ref 30). (ii) Dealing with quantum/classical models, one has to define the interaction potential semiempirically, and thus there is no correct or “incorrect” approximation. This is, already taking coupling of 1/R rather than an electron repulsion integral is an approximation or a model. The LD model defines fully and consistently a solute-solvent electrostatic coupling. Obviously this coupling is not identical to the coupling obtained using ab initio wave functions or electrostatic potential, but it does not involve any inconsistency and its first derivatives are the exact derivative of the assumed potential. In fact, referring to rigorous QM coupling when the solvent is treated classically as dipoles or as a continuum is an oxymoron. (iii) The LD is not an MD model, and it does not involve any analytical gradients. Consistent energy minimization with semiempirical QM forces and induced dipoles on the solvent (or protein) have been implemented by us as early as 1976.3 Again, using solute point charges within semiempirical models provides a completely consistent and a valid model for MD simulations with semiempirical models. This is particularly true when one uses approaches such as the QCFF/PI model in MD studies (see e.g. ref 41). (iv) Finally as far as moving to more rigorous solute-solvent coupling on the ab initio level is concerned, obviously the LD model does not represent the solvent quantum mechanically and the same is true for continuum models. However, we have already introduced approaches that treat the solvent quantum mechanically.91,92 (43) Muller, R. P.; Warshel, A. J. Phys. Chem. 1995, 99, 17516. (44) Stefanovich, E. V.; Truong, T. N. Chem. Phys. Lett. 1995, 244, 65. (45) Ford, G. P.; Wang, B. J. Am. Chem. Soc. 1992, 114, 10563. (46) Andzelm, J.; Klamt, A. J. Chem. Phys. 1995, 103, 9312. (47) Truong, T. N.; Stefanovich, E. V. Chem. Phys. Lett. 1995, 240, 253. (48) Tawa, G. J.; Martin, R. L.; Pratt, L. R.; Russo, T. V. J. Phys. Chem. 1996, 100, 1515. (49) (a) Cossi, M.; Barone, V.; Cammi, R.; Tomasi, J. Chem. Phys. Lett. 1996, 255, 327. (b) York, D. M.; Lee, T.-S.; Yang, W. Chem. Phys. Lett. 1996, 263, 297. (50) Marten, B.; Kim, K.; Cortis, C.; Friesner, R. A.; Murphy, R. B.; Ringnalda, M. N.; Sitkoff, D.; Honig, B. J. Phys. Chem. 1996, 100, 11775.

Langevin Dipoles Model (51) Tuno´n, I.; Ruiz-Lo´pez, M. F.; Rinaldi, D.; Bertra´n, J. J. Comput. Chem. 1996, 17, 148. (52) Rashin, A. A.; Young, L.; Topol, I. A. Biophys. Chem. 1994, 51, 359. (53) Bachs, M.; Luque, F. J.; Orozco, M. J. Comput. Chem. 1994, 15, 446. (54) Ben-Naim, A. J. Phys. Chem. 1978, 82, 792. (55) Cabani, S.; Gianni, P.; Mollica, V.; Lepori, L. J. Solution Chem. 1981, 10, 563. (56) Ben-Naim, A.; Marcus, Y. J. Chem. Phys. 1984, 81, 2016. (57) Wolfenden, R. Science 1983, 222, 1087. (58) Pearson, R. G. J. Am. Chem. Soc. 1986, 108, 6109. (59) Lias, S. G.; Bartmess, J. E.; Liebman, J. F.; Holmes, J. L.; Levin, R. D.; Mallard, W. G. J. Phys. Chem. Ref. Data 1988, 17, Suppl. 1. (60) Farrell, J. F.; McTigue, P. J. Electroanal. Chem. 1982, 139, 37. (61) Reiss, H.; Heller, A. J. Phys. Chem. 1985, 89, 4207. (62) Lias, S. G.; Liebman, J. F.; Levin, R. D. J. Phys. Chem. Ref. Data 1984, 13, 695. (63) Fuchs, R.; Young, T. M.; Rodewald, R. F. J. Am. Chem. Soc. 1974, 96, 4705. (64) Streitwieser, A., Jr.; Nebenzahl, L. L. J. Am. Chem. Soc. 1976, 98, 2188. (65) Floria´n, J.; Warshel, A. J. Am. Chem. Soc. 1997, 119, 5473. (66) Lim, C.; Bashford, D.; Karplus, M. J. Phys. Chem. 1991, 95, 5610. (67) Wolfenden, R. Biochemistry 1978, 17, 201. (68) Orozco, M.; Jorgensen, W. L.; Luque, F. J. J. Comput. Chem. 1993, 14, 1498. (69) Tannor, D. J.; Marten, B.; Murphy, R.; Friesner, R. A.; Nicholls, A.; Honig, B. J. Am. Chem. Soc. 1994, 116, 11875. (70) Morgantini, P. Y.; Kollman, P. A. J. Am. Chem. Soc. 1995, 117, 6057. (71) Meng, E. C.; Caldwell, J. W.; Kollman, P. A. J. Phys. Chem. 1996, 100, 2367. (72) Rick, S. W.; Berne, B. J. J. Am. Chem. Soc. 1996, 118, 672. (73) Cramer, C. J.; Truhlar, D. G. Chem. Phys. Lett. 1992, 198, 74. (74) Miller, J. L.; Kollman, P. A. J. Phys. Chem. 1996, 100, 8587. (75) Colominas, C.; Luque, F. J.; Orozco, M. J. Am. Chem. Soc. 1996, 118, 6811.

J. Phys. Chem. B, Vol. 101, No. 28, 1997 5595 (76) Cornell, W. D.; Cieplak, P.; Bayly, C. I.; Gould, I. R.; Merz, K. M.; Ferguson, D. M.; Spellmeyer, D. C.; Fox, T.; Caldwell, J. W.; Kollman, P. A. J. Am. Chem. Soc. 1995, 117, 5179. (77) Cramer, C. J.; Truhlar, D. G. J. Am. Chem. Soc. 1994, 116, 3892. (78) Alagona, G.; Ghio, C. J. Mol. Struct. (THEOCHEM) 1992, 254, 287. (79) Nagy, P. I.; Dunn, W. J., III; Alagona, G.; Ghio, C. J. Am. Chem. Soc. 1991, 113, 6719. (80) Hooft, R. W. W.; van Eijck, B. P.; Kroon, J. J. Chem. Phys. 1992, 97, 3639. (81) It had been sometimes assumed that continuum models are more general than the LD model since they involve fewer parameters. Of course, early continuum models used only two parameters: the dielectric constant and the cavity size. Therefore, they were nonquantitative. Eventually, it was realized that it is essential to have different parameters (Born or van der Waals radii) for different atom types. This is given by the physics of solvation and does not need any justification now. The fact that continuum models contain the experimental dielectric constant () does not make them more (or less) general since the interest here is in solvation energy that mainly depends on the representation of the solute/solvent boundaries whose microscopic origin is not related to . (82) McRae, E. G. J. Phys. Chem. 1957, 61, 562. (83) (a) Floria´n, J.; Warshel, A. ChemSol, Version 1.0; University of Southern California: Los Angeles, 1997. (b) Program ChemSol can be downloaded from the anonymous ftp server usc.edu, directory /pub/warshel/ cs. (84) Wolfenden, R. Biochemistry 1981, 20, 849. (85) Lowry, T. H.; Richardson, K. S. Mechanism and Theory in Organic Chemistry; HarperCollins Publishers: New York, 1987. (86) Kirby, J. A.; Warren, S. G. The Organic Chemistry of Phosphorus; Elsevier: Amsterdam, 1967. (87) Dean, J. A. Lange’s Handbook of Chemistry; McGraw-Hill, Inc.: New York, 1992. (88) Krishtalik, V. I.; Topolev, V. V. Mol. Biol. 1984, 18, 892. (89) Yadav, A.; Jacksom, R. M.; Holbrook, J. J.; Warshel, A. J. Am. Chem. Soc. 1991, 113, 4800. (90) Thompson, M. A. J. Phys. Chem. 1996, 14492. (91) Wesolowski, T.; Warshel, A. J. Phys. Chem. 1994, 98, 5183. (92) Wesolowski, T.; Muller, R. P.; Warshel, A. J. Phys. Chem. 1996, 100, 15444.