Letter pubs.acs.org/JPCL
Free-Energy Calculations of Ionic Hydration Consistent with the Experimental Hydration Free Energy of the Proton Haiyang Zhang,† Yang Jiang,‡ Hai Yan,*,† Chunhua Yin,† Tianwei Tan,*,‡ and David van der Spoel*,§ †
Department of Biological Science and Engineering, School of Chemistry and Biological Engineering, University of Science and Technology Beijing, 100083 Beijing, China ‡ Beijing Key Lab of Bioprocess, College of Life Science and Technology, Beijing University of Chemical Technology, Box 53, 100029 Beijing, China § Uppsala Center for Computational Chemistry, Science for Life Laboratory, Department of Cell and Molecular Biology, Uppsala University, Husargatan 3, Box 596, SE-75124 Uppsala, Sweden S Supporting Information *
ABSTRACT: Computational free-energy correction strategies and the choice of experimental proton hydration free energy, ΔGs*(H+), are analyzed to investigate the apparent controversy in experimental thermodynamics of ionic hydration. Without corrections, the hydration free-energy (ΔGhyd) calculations match experiments with ΔGs*(H+) = −1064 kJ/mol as reference. Using the Galvani surface potential the resulting (real) ΔGhyd are consistent with ΔG*s (H+) = −1098 kJ/mol. When applying, in an ad hoc manner, the discrete solvent correction, ΔGhyd matching the “consensus” ΔGs*(H+) of −1112 kJ/mol are obtained. This analysis rationalizes reports on ΔGhyd calculations for ions using different experimental references. For neutral amino acid side chains ΔGhyd are independent of the water model, whereas there are large differences in ΔGhyd due to the water model for charged species, suggesting that long-range ordering of water around ions yields an important contribution to the ΔGhyd. These differences are reduced significantly when applying consistent corrections, but to obtain the most accurate results it is recommended to use the water model belonging to the force field. 54A8 force field.12 Truhlar and coworkers adopted Tissandier et al.’s prediction of −1112 kJ/mol,13 a value often claimed to be the best estimate of ΔG*s (H+),14 and this value was used for the Minnesota solvation database.15,16 All four values for ΔGs*(H+) have been used as references for force-field development efforts,12,17−24 potentially leading to incompatible biomolecular models. This work intends to rationalize the confusion by an extensive simulation benchmark of amino acid (AA) side-chain analogues and water models with different free-energy corrections. For reference, we compiled a list of experimental hydration free energy of charged AA side chain analogues (Table 1) using
T
itratable amino acids are of crucial importance for protein folding and function. Proton transfer between titratable side chains and surroundings may be involved, for instance, in enzyme catalysis, like in the well-known Ser-His-Asp triad of lipases,1 and it is responsible for photophysical properties of green fluorescent protein.2 A change in the titration states of one residue may trigger a cascade of reactions, but such complex titration behavior is very difficult to capture experimentally.3 Even with computational methods, it is still very challenging to predict such reactions.2,4 Applicability of protein force fields has improved substantially, but their accuracy for modeling titratable amino acids is largely unknown due to the controversial experimental thermodynamics of ionic hydration and the complexity in free-energy corrections for theoretical modeling of molecular ions.5,6 The absolute hydration free energy of the proton ΔGs*(H+) is still debated, leading to inconsistent experimental references for ions that depend on it. Marcus chose a ΔG*s (H+) of −1064 kJ/mol for the derivation of ion hydration free energies,7 while Noyes and Rosseinsky utilized −1098 kJ/mol8 for the same purpose.9,10 Hünenberger et al. proposed a value of −1108 kJ/ mol11 and then used it as a reference to develop the GROMOS © 2017 American Chemical Society
ΔGs*(BH+/A−) = ΔGs*,conv (BH+/A−) + qΔGs*(H+)
(1)
ΔGs*,conv (BH+/A−) = ΔGs*(B/AH) ± (ΔGg0(BH+/AH) − 2.303RT pK a + ΔG 0 →*)
(2)
Received: May 8, 2017 Accepted: May 31, 2017 Published: May 31, 2017 2705
DOI: 10.1021/acs.jpclett.7b01125 J. Phys. Chem. Lett. 2017, 8, 2705−2712
Letter
The Journal of Physical Chemistry Letters
Table 1. Experimental Conventional (ΔGs*,conv) and Absolute (ΔGs*) Hydration Free Energies (kJ/mol) of Charged Amino Acid Side-Chain Analoguesa ΔGs* (B/AH)b
analogues ΔGs* (H ) Arg Arg Asp Cys Glu His Lys Tyr
ΔG0gc
ΔGs* (BH+/A−)
ΔGs*,conv
pKa
−1064
+
−45.7 −48.1j −28.0 −5.2 −27.1 −43.0 −18.3 −25.6
n-propylguanidine guanidinei acetic acid methanethiol propionic acid 4-methylimidazole n-butylamine 4-methylphenol
d
−1098e
−1108f
−1112g
h
13.65 13.7k 4.8l 10.3l 4.9l 6h 10.79h 10.3l
949.4 1428.7 1467 1424 920.9 886.6 1438.7
831.0 −1437.3 −1421.3 −1431.1 851.6 814.6 −1413.5
−232.9 −373.3 −357.4 −367.1 −212.3 −249.3 −349.5
−266.9 −339.4 −323.5 −333.2 −246.3 −283.3 −315.5
−276.9 −329.3 −313.4 −323.1 −256.3 −293.3 −305.5
−281.4 −324.8 −308.9 −318.6 −260.8 −297.8 −301.0
Relevant reactions are BH+ → B + H+ for cation analogues and AH+ → A− + H+ for anions. bHydration free energies of neutral species from refs 25 and 26. cGas-phase basicity (BH+) and gas-phase acidity (AH) from the National Institute of Standards and Technology (NIST) with estimated errors of up to 8 kJ/mol.27,28 dRef 7. eRef 8. fRef 11. gRef 13. hTaken from ref 25. iPartial analogue to arginine. jTaken from ref 25 for methylguanidine and decreased by 1.3 kJ/mol for removal of the methyl group.12 kRef 12. lRef 15. a
where ΔG*S ,conv(BH+/A−) represents the conventional aqueous free energies of cationic (BH+) and anionic (A−) compounds, and the upper sign applies to cations and the lower to anions. ΔG*s (B/AH) is the hydration free energy of the neutral species, ΔG0g is the gas-phase basicity/acidity for the corresponding base/acid of titratable species, ΔG0→* (7.95 kJ/mol) is freeenergy change from 1 atm gas phase to 1 M liquid phase, and q is the net charge.15 Note that the absolute solvation free energy of a single ion, ΔG*s (BH+/A−) cannot be measured directly; instead, solvation free energies of ions are often tabulated as relative or conventional free energies by arbitrarily setting solvation free energy of the proton to zero.15 This results in a shift by qΔG s* (H + ) between ΔG s* (BH + /A − ) and ΔGs*,conv(BH+/A−), as given in eq 1. Following Shirts’ protocol,29 raw (uncorrected) hydration free energies (ΔGraw) of titratable AAs were then computed using thermodynamic integration and compared to the experimental observations. Artifacts in the modeling of charged species must be corrected by adding a term ΔGcor to ΔGraw (ΔGcor = 0 for neutral species), resulting in the final hydration free energy (ΔGhyd). Hünenberger et al. proposed a free-energy correction (ΔGANA) for handling finite size artifacts when computing the solvation energy of monatomic ions and binding free energy of charged species.30,31 Here we reformulated these corrections to treat solvation of polyatomic ions like AAs. A semianalytical formula of ΔGANA is ΔGANA
ξ q2 q⎛ = − LS + 3⎜ 8πε0εsL L ⎝ −
electric potentials for a heterogeneous dielectric environment (interior permittivity set to 1, exterior to εs) and a homogeneous unity permittivity (both interior and exterior permittivity set to 1), respectively, derived from Poisson− Boltzmann (PB) calculations with the AMBER 11 package,32 and Lr is the cubic box length in the PB calculations. Potential inaccuracies in the dielectric permittivity of solvent models (ΔGD) were estimated based on the difference in the hydration free energy between PB-based calculations setting the dielectric constants to 78.46 (corresponding to water at room temperature) and to εs. A further addition or correction to the free energy of solvation is necessary depending on the techniques used in simulations (see ref 33 for a complete explanation). Solvation free-energy calculations as done typically under periodic boundary conditions (PBC) do not have any boundary between air and liquid and thus yield so-called intrinsic free energies. Here we used full PBC and Ewald summation (close to particle-scheme, P-scheme),33 and therefore a term should be added corresponding to crossing the vacuum−bulk solvent interface, producing the physical (real) solvation free energy.34 For most polar solvents interfaces are ordered, giving rise to a surface potential. This leads to the Galvani potential correction of the solvent (eq 4) by ΔG⌀ = q⌀G = qFχ
where F is the Faraday constant (96.48 kJ/mol/V) and χ is the phase potential (V).35,36 In conjunction with the so-called nonperiodic molecularscheme (M-scheme) for simulation a different correction term is needed, called the discrete solvent correction (ΔGDSC) accounting for the error in the potential evaluation at the ionic site due to an improper summation scheme with an unordered interface. This correction arises because the M-summation cutoff sphere acts like an incorrectly organized water/vapor boundary.33 ΔGDSC is also known as the classical exclusion potential of the solvent model6 and relates to the quadrupolemoment trace (γs) of the solvent model and the number of solvent molecules (Ns) by30
⎞ ξ ϕHETd r + CB qLr2⎟ Lr 4πε0εs ⎠
∫
3
2 1 16π 2 ⎛ 1 ⎞q ⎜1 − ⎟ 6 εs ⎠ L 8πε0 45 ⎝
−1 ⎧⎡ q 4π ⎛ 1 ⎞⎤ ⎡ ⎥ ⎨⎢ ⎜1 − ⎟ · ⎢ ⎪⎢ 8πε 3 ⎝ εs ⎠⎥⎦ ⎢⎣ ⎣ 0 ⎩ ⎪
∫L ϕHET(r)d3r r
⎫5/2 ξCB 2⎛ 1 ⎞⎤⎪ ϕHOM(r)d r − − qLr ⎜1 − ⎟⎥⎬ Lr εs ⎠⎥⎦⎪ 4πε0 ⎝ ⎭
∫
(4)
3
ΔG DSC = −qγsNs/(6ε0L3)
(3)
where L is the length of the simulated cubic box, εs is the solvent model permittivity, ξLS is the cubic lattice-sum integration constant (≈ −2.837297), ξCB is the cubic Coulomb integration constant (≈ −2.380077), ϕHET and ϕHOM are
(5)
For the calculation of intrinsic solvation free energies in atomistic (AT) simulations under PBC, the results from the particle-scheme (P-scheme) and molecule-scheme (M-scheme) always differ by an offset of approximately ΔGDSC (eq 6).6,33,37 2706
DOI: 10.1021/acs.jpclett.7b01125 J. Phys. Chem. Lett. 2017, 8, 2705−2712
Letter
The Journal of Physical Chemistry Letters ΔGAT,P = ΔGAT,M − ΔG DSC
mean-square error (RMSE) of ∼3 kJ/mol, while AMBER03 and GROMOS 54A7/A8 show larger RMSEs up to 9 kJ/mol (Figure 1b and Table S6). The hydration free energies were computed as well for the other 13 neutral AAs (all except for Pro and Gly), denoted as nontitratable AAs in this work. We find that AMBER 99SB, AMBER 03, CHARMM 27, GROMOS 54A7/A8, and OPLSAA/L with their respective preferred water models describe the hydration of neutral AA side chains with RMSEs from experiment of 3.6, 7.7, 5.2, 4.2, and 4.3 kJ/mol, respectively. All force fields tend to underestimate the solvation of the sidechain analogues except for GROMOS 54A7/A8, which instead overestimates the solvation in most cases (Table S2). AMBER 99SB yields a good prediction for hydration free energy of the nontitratable AAs (Figure 1b, RMSE ≈ 4 kJ/mol). AMBER 03 shows large deficiencies in modeling neutral states of either nontitratable or titratable AAs, indicating that it should not be used to predict absolute free energies (Figure 1b). GROMOS 54A7/A8 outperforms the other FFs for the nontitratable AAs with a small RMSE of 1.9 kJ/mol (Figure 1b), while it gives a large RMSE of ∼7 kJ/mol for the titratable AAs due to the overestimation of His0 mentioned above (Figure 1a). The tautomeric preference of His0 depends highly on the surrounding groups;42−44 for instance, NH3+−His0−COO− shows a ratio of Hie0 to Hid0 of 0.8:0.2, whereas NH2− His0−COO− gives a roughly equal population with a ratio of 0.53:0.47.42 It therefore makes sense that the computed ΔGhyd for both tautomers (Hid0 and Hie0) are close to the experimental hydration energy of 4-methylimidazole. When considering only the tautomer that has the ΔGhyd closest to the experiment (that is, Hid0 for AMBER 03 and GROMOS 54A7/ 54A8; Hie0 for AMBER 99SB, CHARMM 27, and OPLS-AA/ L), the RMSEs of the titratable AAs from experiment for the six FFs decrease to some extent (Table S6). On the basis of such considerations, GROMOS 54A7/A8 shows an RMSE of 3.6 kJ/ mol for the titratable AAs and an RMSE of 2.5 kJ/mol for all of the neutral AA side chains (Figure S2). Note that good performance of the GROMOS 54A7/54A8 force field is as expected because it was parametrized on ΔGhyd.46 All of the FFs appear to reproduce the relative solvation of all neutral AA side chains well, as revealed by a high Spearman rank order coefficient larger than 0.95 (Table S2). The raw hydration free energies (ΔGraw) for charged states of titratable AA side chains are strongly influenced by water models (Table S7), and standard deviations between different water models are up to 8 kJ/mol, ten times that for nontitratable AAs (Table 2 and Table S5), suggesting that long-range ordering of water around ions yields an important contribution to the hydration free energy. Upon including the contribution from the air/water interfacial potential jump (ΔG⌀) or discrete solvent effect (ΔGDSC), the differences between water models are reduced significantly to within 4 kJ/ mol (Table S5), but to obtain the most accurate results it is recommended to use the water model belonging to the force field. In addition, a large difference in FF performance for predicting hydration free energy on full (Arg+) and partial (Gua+) arginine side chains is observed; the all-atom FFs (AMBER, CHARMM, and OPLS-AA/L) predict a more negative ΔGhyd (more solvated) for Gua+ than Arg+ by 31 kJ/mol, while the united-atom FFs (GROMOS) give a prediction by 4 kJ/mol (Figure S4). Finite-size effects (ΔGANA) and inaccurate solvent model permittivity (ΔGD) contribute little to ΔGhyd, and they tend to cancel out each
(6)
The physical (real) free energies of ionic solvation can be therefore computed then by ΔGreal = ΔGAT,P + ΔG⌀ = ΔGAT,M − ΔG DSC + ΔG⌀ (7)
Hydration free-energy calculations for solvated titratable AA (Arg, Asp, Glu, His, and Lys) side-chain analogues in ∼500 water molecules were performed at 300 K with thermodynamic integration using the GROMACS 4.5.5 package.38−41 Six popular biomolecular force fields (FFs) and six water models were examined, resulting in 19 FF/water combinations in total, of which only the SPC/E water model was used with all of the FFs, as summarized in Table S1 in the Supporting Information (SI). The differences between calculated and experimental ΔGhyd (Table 1) for neutral AAs are presented in Figure 1a and
Figure 1. Differences between calculated ΔGhyd and experiments for neutral states of amino acid side chains modeled by six biomolecular force fields with their respective preferred water model. (a) Errors for titratable AAs. (b) Root-mean-square errors (RMSEs) for nontitratable, titratable, and all AAs. FF indices are AMBER 99SB (#1), AMBER 03 (#2), CHARMM 27 (#3), GROMOS 54A7 (#4), GROMOS 54A8 (#5), and OPLS-AA/L (#6). His0 here is a weighted value computed as 0.2*Hid0 + 0.8*Hie0. The blank region in panel a indicates that FF parameters are unavailable for neutral arginine.
Table S2. The neutral side chain of histidine (His0) exists in two tautomeric forms of Hid0 (proton attached to the δ-N) and Hie0 (proton to ε-N). Hie0 is energetically preferred over Hid0 with the population ratio of Hie0 to Hid0 being ∼8:2.42−44 On the basis of this tautomeric preference, a weighted value is given for His0 in Figure 1a, and this is used for subsequent data analysis unless stated otherwise. A deficiency in the modeling of Hie0 leads to an underestimated hydration of His0 by 15 kJ/mol with AMBER 03 and to an overestimation of His0 by 15 kJ/mol with the GROMOS 54A7/A8 force fields, although these three FFs model Hid0 reasonably well. CHARMM 27 reproduces the hydration of His0 best, while AMBER 99SB and OPLS-AA/L underestimate it by ∼6 kJ/mol. Given that the uncertainty of the experimental hydration free energies was up to 8.1 kJ/ mol,45 the six FFs appear to model Asp0, Glu0, Lys0, and Arg0 reasonably well (Figure 1 and Table S2). Table 2 lists the ΔGhyd results of neutral AAs for the six FFs with preferred water models, that is, TIP3P for AMBER and CHARMM, SPC for GROMOS, and TIP4P for OPLS-AA/L. The influence of the water model on the ΔGhyd for neutral states of titratable AA side chain analogues (Table S4) appears negligible, as indicated by the small standard deviations of