Force Field Benchmark of Amino Acids: I. Hydration and Diffusion in

Apr 12, 2018 - Thermodynamic and kinetic properties are of critical importance for the applicability of computational models to biomolecules such as p...
0 downloads 0 Views 3MB Size
Subscriber access provided by UNIV OF DURHAM

Computational Biochemistry

Force Field Benchmark of Amino acids: I. Hydration and Diffusion in Different Water Models Haiyang Zhang, Chunhua Yin, Yang Jiang, and David van der Spoel J. Chem. Inf. Model., Just Accepted Manuscript • DOI: 10.1021/acs.jcim.8b00026 • Publication Date (Web): 12 Apr 2018 Downloaded from http://pubs.acs.org on April 13, 2018

Just Accepted “Just Accepted” manuscripts have been peer-reviewed and accepted for publication. They are posted online prior to technical editing, formatting for publication and author proofing. The American Chemical Society provides “Just Accepted” as a service to the research community to expedite the dissemination of scientific material as soon as possible after acceptance. “Just Accepted” manuscripts appear in full in PDF format accompanied by an HTML abstract. “Just Accepted” manuscripts have been fully peer reviewed, but should not be considered the official version of record. They are citable by the Digital Object Identifier (DOI®). “Just Accepted” is an optional service offered to authors. Therefore, the “Just Accepted” Web site may not include all articles that will be published in the journal. After a manuscript is technically edited and formatted, it will be removed from the “Just Accepted” Web site and published as an ASAP article. Note that technical editing may introduce minor changes to the manuscript text and/or graphics which could affect content, and all legal disclaimers and ethical guidelines that apply to the journal pertain. ACS cannot be held responsible for errors or consequences arising from the use of information contained in these “Just Accepted” manuscripts.

is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.

Page 1 of 54 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Information and Modeling

Force Field Benchmark of Amino Acids: I. Hydration and Diffusion in Different Water Models Haiyang Zhang,† Chunhua Yin,† Yang Jiang, ‡ and David van der Spoel*§ †

Department of Biological Science and Engineering, School of Chemistry and Biological

Engineering, University of Science and Technology Beijing, 100083 Beijing, China ‡

Beijing Key Lab of Bioprocess, College of Life Science and Technology, Beijing University of

Chemical Technology, Box 53, 100029 Beijing, China §

Uppsala Center for Computational Chemistry, Science for Life Laboratory, Department of Cell

and Molecular Biology, Uppsala University, Husargatan 3, Box 596, SE-75124 Uppsala, Sweden Corresponding Author * [email protected]

ACS Paragon Plus Environment

1

Journal of Chemical Information and Modeling 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 2 of 54

ABSTRACT

Thermodynamic and kinetic properties are of critical importance for the applicability of computational models to biomolecules such as proteins. Here we present an extensive evaluation of the Amber ff99SB-ILDN force field for modeling hydration and diffusion of amino acids with three-site (SPC, SPC/E, SPC/Eb, and TIP3P), four-site (TIP4P, TIP4P-Ew, and TIP4P/2005), and five-site (TIP5P and TIP5P-Ew) water models. Hydration free energies (HFEs) of neutral amino acid side chain analogues have little dependence on the water models with a root-mean-square error (RMSE) of ~1 kcal/mol from experimental observations. Based on the number of interacting sites in water models, HFEs of charged side chains can be putatively classified into three groups, of which three-site models lies between that of four- and five-site water models and for each group the water model dependence is greatly eliminated with the solvent Galvani potential considered. Some discrepancies in the location (RRDF) of the first hydration peak of ion−water radial distribution functions between experimental and calculated observations were detected, such as a systematic underestimation of the acetate (Asp) ion. The RMSE of calculated diffusion coefficients of amino acids from experiment increases linearly with the diffusion coefficients of the solvent water models at infinite dilution. TIP3P has the fastest diffusivity, in line with literature findings, while the FB and OPC water model families as well as TIP4P/2005 perform well within a relative error of 5% and TIP4P/2005 yields the most accurate estimate for the water diffusion coefficient. All the tested water models overestimate amino acid diffusion coefficients by approximately between 40% (TIP4P/2005) and 200% (TIP3P). Scaling proteinwater interactions with TIP4P/2005 in the Amber ff99SBws and ff03ws force fields leads to more negative HFEs but has little influence on the diffusion of amino acids. The most recent FF/water combinations of ff14SB/OPC3, ff15ipq/SPC/Eb, and fb15/TIP3P-FB do not show obvious improvements in accuracy for the tested quantities. These findings here establish a benchmark that may aid in the development and improvement of classical force fields to accurately model protein dynamics and thermodynamics.

ACS Paragon Plus Environment

2

Page 3 of 54 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Information and Modeling

1. INTRODUCTION Extensive attention has been paid to the study of molecular interactions in water, both using experimental and computational tools, attempting to explore the role of water in biology.1, 2 In the past decades, a large number of theoretical potentials have been proposed to capture the physical properties of water molecules as accurately as possible.3-8 A special website with information on the science of water is maintained by Martin Chaplin.9 To date, however, many of the popular models reproduce water properties with limited accuracy only, and thus the choice of water models in molecular dynamics (MD) simulations commonly hinges on the research purpose and the compatibility with the used force fields. For instance, the native water models of Amber,10 GROMOS,11 and OPLS12 force fields (FFs) are TIP3P (transferable intermolecular potential 3P),13 SPC (simple point charge),14 and TIP4P,13 respectively; more water models are supported as well for Amber such as SPC/E (extended SPC),15 TIP4P,13 and TIP5P.16 Of these, SPC/E appeared to be compatible with varied biomolecular FFs.17 It could be argued however that SPC/E may be incompatible with protein force fields due to the polarization energy having been built into its parameterization.15 No biomolecular force fields were designed with an equivalent approach, although the recent Amber force field ff15ipq18 was tuned with reference to the SPC/Eb19 cousin water model. Most of classical additive force fields like Amber largely followed historical precedent and were developed and tested with an earlier generation of water models, such as the TIP3P model developed 35 years ago.13 Clearly, one of the weaknesses of these FFs is the reliance on the water model used.20 A well-known drawback of TIP3P is its fast self-diffusivity and low viscosity,21-23 indicating that TIP3P is not suitable for predicting absolute kinetic properties of biomolecules in water. For this reason, diffusion/viscosity corrections are required for a meaningful comparison with experimental kinetic properties in studies of for instance metal ions

ACS Paragon Plus Environment

3

Journal of Chemical Information and Modeling 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 4 of 54

and protein folding dynamics.23, 24 Refined water models have been proposed a decade ago, that were carefully parameterized against extensive experimental data, for instance TIP4P-Ew (adaption of TIP4P for use with Ewald techniques),25 TIP5P-Ew (adaption of TIP5P for Ewald),26 and TIP4P/2005.27 The SPC/Eb model, with a slight increase of 1% in the O-H bond length compared to the original SPC/E was proposed recently in order to obtain an improved reproduction of rotational diffusion and NMR spectral density of pure water and proteins.19 Two water model families of the latest generation, “FB” (like TIP3P-FB and TIP4P-FB)28 and “OPC” (like OPC3 and OPC),29, 30 were developed using different strategies such as Force Balance and global optimization, respectively. These models were shown to reproduce a comprehensive set of liquid bulk properties accurately. However, it is difficult to choose a water model for biomolecular simulations without considering the biomolecular force field.31 Using an improved water model should, in principle, result in an enhanced accuracy of simulated properties of biomolecules, since the hydrophobic effect that is the driving force for biomolecular associations and folding is largely determined by solvent properties.32 By introducing modifications for protein FFs, several groups have made progress for application of these FFs with the recently optimized water models.20, 33, 34 With minimal back bone changes, Best and Mittal combined Amber ff03 with TIP4P/2005 (the resulting FF was denoted as ff03w) for reproducing helix propensity;33 Nerenberg and Head-Gordon adapted Amber ff99SB35 to TIP4P-Ew in order to obtain improved conformational ensembles of small peptides.34 Both of these adapted FFs show a better performance than the native water model TIP3P in combination with the unchanged Amber FF. Similarly, for a faithful reproduction of helix propensity by ff99SB-ILDN,10 Best et al. then adjusted back bone torsions and refitted atomic charges of charged amino acids and side chain torsions of Asp for use with TIP4P/2005 (ff99SBw).20, 36

ACS Paragon Plus Environment

4

Page 5 of 54 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Information and Modeling

These adaptions have been shown advantageous for modeling properties of disordered proteins.37-40 The latest Amber protein FFs of ff15ipq18 and fb1541 have been built for consistency with recent high-quality water models of SPC/Eb19 and TIP3P-FB,28 showing improved performances for diffusion and temperature dependent properties of proteins, respectively. Despite the quality of water models, a key observation in recent years has been the imbalance between protein-water and protein-protein interactions. The simulated disordered peptides were found to be structurally too compact relative to the experiment,20,

42, 43

and artificial protein

aggregation was observed under self-crowding conditions in simulations.44-46 These observations indicate that protein-water interactions are likely too weak compared to protein-protein interactions, as confirmed by a slightly more positive hydration free energy of amino acid side chain analogues.20, 47, 48 To restore the balance, some researchers have taken a first step and proposed a simple strategy to tune short-range solute-water Lennard-Jones interactions, allowing for predictions being more consistent with experimental solvation free energies49 or small-angle X-ray scattering and Förster resonance energy transfer measurements.20 To capture thermodynamic and kinetic properties of biomolecules like proteins in water accurately, a highquality water model and balanced interactions between interacting molecules are therefore required, and more importantly, the origin of force field deficiencies such as hydration and diffusion properties is highly desirable to be identified thoroughly. Here we presented an extensive benchmark of hydration and diffusivity of amino acid analogues for the Amber protein force field ff99SB-ILDN10 paired with nine water models, namely, SPC,14 SPC/E,15 SPC/Eb,19 TIP3P,13 TIP4P,13 TIP4P-Ew,25 TIP4P/2005,27 TIP5P,16 and TIP5P-Ew.26 For TIP4P/2005, additional variants of Amber FFs, including ff99SBw,20

ACS Paragon Plus Environment

5

Journal of Chemical Information and Modeling 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 6 of 54

ff99SBws,20 ff03w,33 and ff03ws20 were examined as well. The latest Amber FFs of ff14SB,50 ff15ipq,18 and fb1541 were tested with recent water models of OPC3,30 SPC/Eb,19 and TIP3PFB,28 respectively. The similarities, differences, and/or deficiencies of force field/water combinations are discussed in detail. This work can be useful for further efforts in protein force field development. 2. COMPUTATIONAL METHODS 2.1 Structural Model. All amino acid residues (except for Gly and Pro) available in the Amber force fields (FFs) were examined in this work. Construction of side chain analogues followed Shirts and coworkers,48 and the net charge for neutral and charged analogues was forced to be zero and unity (-1 for Asp and Glu and +1 for Arg, His and Lys), respectively, via adjusting the atomic charge of β-carbon (referred to as the method of Cβ modification here).51 Two tautomeric states of neutral His, namely, Hid and Hie with a proton connected to δ- and εnitrogens, respectively, were constructed separately. Similarly, three partial analogues of Gud+ (guanidinium), Imd+ (imidazolium), and NH4+ (ammonium) ions were also built on the basis of Arg+, His+, and Lys+, respectively, for direct comparison with experimental observations. These side chain analogues were used to calculate hydration properties of amino acids. Experimental diffusion constants of amino acids at infinite dilution used here for reference were determined in a buffer of pH~3.5,52 at which amino acids usually exist in a zwitterionic form (NH3+-X-COO-). Original Amber FFs do not have parameters for zwitterionic amino acid residues and we constructed them based on the available N- and C-terminal amino acids manually. The carbonyl group of N-terminal amino acid building blocks was changed to be a carboxylate (-COO-), of which atom types were assigned in accordance with the charged Asp residue, resulting in zwitterionic amino acids. Atomic charges for carboxylate oxygen atoms

ACS Paragon Plus Environment

6

Page 7 of 54 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Information and Modeling

were set to the averaged value over all the carboxylate groups of C-terminal amino acids. Finally, -

if necessary, the charge of the carbon atom of -COO was adjusted to ensure an integral net charge. This manual assignment was referred to as the method of C=O modification. These zwitterionic forms were used to evaluate the diffusion properties of amino acids. 2.2 Amino Acid Force Field and Water Models. The Amber ff99SB-ILDN force field10 was used to model side chain analogues and zwitterionic forms of amino acids in combination with three catalogues of molecular mechanics water models, namely, three-site (SPC,14 SPC/E,15 SPC/Eb,19 and TIP3P13), four-site (TIP4P,13 TIP4P-Ew,25 and TIP4P/200527), and five-site (TIP5P16 and TIP5P-Ew26) models. Four other variants of the Amber protein FFs, ff99SBw,20 ff99SBws,20 ff03w,33 and ff03ws20 were examined as well. These FFs were adapted for use with the optimized TIP4P/2005 water model27 by Best and coworkers and the resulting FFs showed good performances for the properties of helix propensity, disordered proteins, and non-specific protein associations in protein simulations.20, 33, 36, 53 Both ff99SBw and ff99SBws were based on ff99SB*-ILDN-Q,36 while the latter used a scaling of protein-water Lennard-Jones (LJ) interactions, indicated by a suffix “s”, as computed by  = ε  =   ε

(1)

where εO and εi are LJ parameters (ε) for water oxygen and protein atoms i, εOi is the result for mixed atom types generated by the Lorentz-Berthelot (LB) combination rule, and α is the scaling factor (α= 1.1 for ff99SBws and ff03ws).20 Similarly, both ff03w and ff03ws were based on ff0354 with a slight backbone modification,53 and the latter implemented scaled protein-water interactions, similar to ff99SBws. Note that atomic charges of neutral, N-, and C-terminal amino acids remained unchanged in ff99SBw(s)20 and were identical to that in ff99SB-ILDN,10 while ff03w(s)20, 33 shared identical atomic charges with ff0354 for all amino acid residues. FF/water

ACS Paragon Plus Environment

7

Journal of Chemical Information and Modeling 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 8 of 54

combinations of latest Amber ff14SB/OPC3, ff15ipq/SPC/Eb, and fb15/TIP3P-FB extracted from AmberTools 1655 were used to model side chain analogues and zwitterionic amino acid residues as well. Note that the Amber ff14SB is not built around OPC3, and due to the too fast selfdiffusivity of TIP3P,21-23 we choose OPC3 to examine whether the diffusion properties of amino acids can be improved by latest water models. Water properties for two more four-site models of TIP4P-FB28 and OPC29 were also calculated for a comparison with the other models. 2.3 Estimation of Water Properties. 2.3.1 Diffusion Constants. All the MD simulations were carried out with full periodic boundary conditions using the GROMACS package (version 5.1).56-59 For estimation of purewater prosperities, seven cubic boxes containing 512, 768, 1024, 1536, 2048, 3072, and 4096 pure water molecules with box lengths of about 2.5, 2.9, 3.2, 3.6, 4.0, 4.6, and 5.0 nm were simulated separately. The Leap-Frog algorithm was used to integrate classical Newtown’s equations of motion using a time step of 0.002 ps. Electrostatics interactions were handled by the particle-mesh Ewald (PME) method60, 61 with a real-space cutoff of 1.0 nm, and Lennard−Jones interactions were truncated at 1.0 nm. The velocity rescaling thermostat62 and Parrinello-Rahman barostat63,

64

were applied to maintain temperate and pressure at 298.15 K and 1 bar with a

coupling time constant of 1 ps and 5 ps, respectively. The SETTLE algorithm65 was used to maintain a rigid water model. The calculation of diffusion coefficients should ideally be done based on simulations in the NVE ensemble. Due to the practical issues of NVE simulations (see more discussion in the work by York et al. where they concluded that the computed diffusion coefficients of water and Mg2+ and temperature distribution in the NVE simulations are very close to NVT),23 we choose the NVT ensemble for the calculations then. After an energy

ACS Paragon Plus Environment

8

Page 9 of 54 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Information and Modeling

minimization and 1-ns NVT, 1-ns NPT, and 3-ns NVT equilibrations, production trajectories were performed at NVT for 20 ns with atomic coordinates saved to disk every 2 ps. Finite-size effects under periodic boundary conditions (PBC) have been shown to influence the translational diffusion constant determination from simulations for molecules such as water and metal ions.21,

23, 66, 67

In order to correct such effects, the box size dependent diffusion

constants (DPBC) were used to extrapolate the diffusion constants in the infinite dilution limit (D0), as in eq 2.21, 68  = −

.  π

(2)

where kB is Boltzmann’s constant, T is the absolute temperature (298.15 K), η is the shear viscosity of the solvent, and L is the box length. In practice, the y-intercept for a linear fit of DPBC versus 1/L gives D0, and the corresponding slope can be used to calculate η. Note that the viscosity was shown to be system-size independent; Given a solvent viscosity, eq 2 can also be used to correct the system-size dependences of diffusion coefficient calculations under PBC.21, 69 Similar to the method by York and coworkers,23 each 20-ns NVT production trajectory was divided into 100 blocks equally. For each block, mean squared displacements (MSD) of the diffusing particle were calculated and averaged over all molecules, and the box size dependent diffusion constant was then calculated from MSD using the Einstein relation.70 The final DPBC for each box was obtained by averaging over all the blocks and error estimates of DPBC were evaluated by block averaging.69 With the aid of R program,71 D0 at infinite dilution and solvent viscosity of η and their corresponding errors were determined from linear regression of DPBC versus 1/L (eq 2) by generating 1,000 data sets from a normal distribution of DPBC randomly and obtaining the corresponding averages and standard deviations.

ACS Paragon Plus Environment

9

Journal of Chemical Information and Modeling 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 10 of 54

2.3.2 Dielectric Permittivity and Multipole Moment. Static relative dielectric constants (ε0) at 298.15 K and 1 bar were computed from the fluctuations of the total dipole moment of the simulation box.72,

73

The molecular dipole (µ) and dielectric constant (ε0) of water were

calculated with the GROMACS analysis tool of “gmx dipoles”.56-59 The quadrupole moments (QT) of three- and four-site water models were calculated following the work by Abascal and Vega (eq 5 in the Ref.74), and that of the five-site models was from the Ref.16 2.3.3 Surface Potential and Surface Tension. After 1-ns NPT equilibration, a slab of 3.1 x 3.1 x 9.3 nm3 with 1000 pure water molecules was simulated at NVT for 10 ns with a protocol close to that in the Section 2.3.1. The last 8 ns was used for the calculation of surface potential (χ) and surface tension (γ). The GROMACS tool of “gmx potential” was used to compute the potential across the box where periodic boundaries are taken into account to make the potential periodic as well. The potential was computed by integrating twice the charge density per slab and it was set to zero in the gas phase. A script was used to compute the average potential in the water phase. Using the same simulations, the surface tension for the water models was computed from the difference in the lateral and normal pressure, see Ref.75 for details. 2.4 Estimation of Amino Acid Hydration. Hydration free energies (HFEs) of amino acid side chain analogues were calculated by thermodynamic integration (TI). A Langevin (stochastic dynamics) integrator76 was used for TI simulations, with a friction coefficient of 1 ps−1. The LINCS algorithm77 was used to constrain all bonds of the solute molecules. A two-stage TI was performed along an alchemical reaction coordinate of λ = 0, 0.25, 0.5, 0.75, and 1 for decoupling Coulombic interactions, and then λ = 0, 0.05, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.65, 0.7, 0.75, 0.8, 0.85, 0.9, 0.95, and 1 for decoupling van der Waals interactions. Each λ was simulated for 2 ns at 298.15 K and 1 bar, and the first 100 ps was removed for system equilibration. Each system

ACS Paragon Plus Environment

10

Page 11 of 54 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Information and Modeling

contains only one side chain analogue and 640 water molecules in a cubic box. The protocol for TI simulations was similar to the above pure-water calculations and more details have been presented in Ref.78 HFE calculations that are typically done under PBC (as in this work) do not have any boundary between air and liquid and are known as intrinsic free energies (∆Gintr). A surface contribution (∆Gsurf) from crossing the vacuum/bulk interface is required to be added, giving rise to the physical (real) free energy of solvation (eq 3).79-81 ∆ = ∆!"# + ∆%&' = ∆!"# + (∅*

(3)

where q is the net charge of the solute and ∅* is the solvent Galvani potential jump in air to liquid direction (∅+ = Fχ, where F is the Faraday constant of 23.061 kcal/mol/V). For charged side chains, finite-size artifacts and potential inaccuracies in solvent model permittivity need to be considered with care as well.82 It was noted that these two corrections are quite small and negligible for PME/PBC simulations (as we did in this work).78, 83 Experimental observations for the HFEs of ions are based on a debated hydration free energy ∗ 0H2 3 ranging from -265 to -250 kcal/mol,84 and to date there is no consensus of the proton ∆,-.

on the true value. Our previous studies on charged amino acids and molecular ions78, 83 indicated that the calculated real free energy of ionic hydration (i.e, taking the solvent Galvani potential into account) with classical molecular mechanism force fields appeared to best match the ∗ 0H 2 3 of -262.4 kcal/mol85 for reference. This value was therefore used experiments using a ∆,-.

here for comparison with the calculations. Besides the solvation free energies, the water structure around amino acids is of crucial importance for protein dynamics in water. Charged side chain analogues of Arg+, His+, Imd+, Lys+, NH4+, Asp-, and Glu- were simulated for 10 ns separately without counter ions in a cubic

ACS Paragon Plus Environment

11

Journal of Chemical Information and Modeling 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 12 of 54

box of 640 water molecules, and the last 8-ns was used to compute radial distribution functions (RDFs) of a given ion around water molecules. Here an unnormalized RDF, ρ(r) = n(r)/(4πr2∆r), was used to monitor water structures around amino acids for clarity, where r is the distance between the reference site of an ion and the oxygen atoms of water molecules and n(r) is the average number of the given particle in a spherical shell with a thickness of ∆r at a distance r.86 ρ(r) is also known as the water number density (nm-3) around ions at r. The location of the first hydration peak (RRDF) in ρ(r) can be used for comparison with experimental estimates if available.87-91 Nitrogen and oxygen atoms were chosen as the reference site for cationic and anionic ions, respectively.11 2.5 Estimation of Amino Acid Diffusion. Similar to water diffusion calculations, each zwitterionic amino acid was immersed into a cubic box of 512, 768, 1024, 2048, and 4096 water molecules separately. After an energy minimization and equilibrations, 30 separate simulations were performed for each box and each amino acid, and were used for the determination of the box size dependent diffusion constants (DPBC). A linear fit of DPBC versus 1/L (eq 2) gives the 6 6 calculated translational diffusion ( ,55 ) of amino acids at infinite dilution by extrapolation. In %6 . order to correct for potential errors from the diffusion of water models, a scaled quantity ( ,55 )

can be obtained by eq 4.23 %6 . 6 6 ,55 = ,55

:;
?= 78,9

(4)

BC

6 6 where ,@ and ,@ are experimental and calculated water diffusion coefficients, respectively.

This correction allows a more meaningful comparison with experimental observations.

ACS Paragon Plus Environment

12

Page 13 of 54 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Information and Modeling

3. RESULTS 3.1 Water Properties. Simulated self-diffusion coefficients (DPBC) of water molecules as a function of the inverse box length (1/L) examined are presented in Figure 1. As expected, water diffusion constants depend on the simulated system size linearly and increase as the box length increases. The extrapolated diffusion constant (D0) at infinite dilution and the corresponding viscosity computed by eq 2 are summarized in Table 1. Out of 13 water models, TIP3P gives the largest diffusion coefficient of D0 = 6.14 ± 0.05 x 10-5 cm2/s and significantly overestimates the experimental determination (D0 = 2.299 x 10-5 cm2/s),92 and TIP4P/2005 predicts a value of D0 =

Figure 1. Box dependent diffusion constants of water calculated from MD simulations with varied box lengths (L) under periodic boundary conditions (PBC) for early generation (a) and latest (b) water models. TIP4P/2005 is given in both panels for comparison. The solid lines indicate linear fits of data points; the intercept (i.e., L approaches to infinity) gives the diffusion constant at infinite dilution and the slope is

used to estimate the viscosity of water models (eq 2).

ACS Paragon Plus Environment

13

Journal of Chemical Information and Modeling 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 14 of 54

Table 1. Calculated Properties of the Water Models Tested at 298.15 Ka µ

QT

,@

ε0

-1

-5

2

η

χ

∅+

γ

OPC3 SPC SPC/E SPC/Eb TIP3P

(D) 2.430 2.274 2.351 2.374 2.347

(10 D nm) 2.060 1.969 2.036 2.076 1.721

77.8 (1.9) 65.1 (1.3) 70.6 (1.6) 70.5 (1.3) 96.2 (1.8)

(10 cm /s) 2.41 (0.02) 4.52 (0.03) 2.76 (0.02) 2.10 (0.01) 6.14 (0.05)

(mPa•s) 0.87 (0.07) 0.45 (0.03) 0.78 (0.07) 0.91 (0.06) 0.32 (0.04)

(V) -0.624 -0.605 -0.589 -0.590 -0.534

(kcal/mol) -14.4 -14.0 -13.6 -13.6 -12.3

(mN/m) 60.0 (0.2) 49.5 (0.3) 56.8 (0.4) 60.4 (0.3) 45.9 (0.7)

TIP3P-FB

2.419

2.052

78.6 (1.8)

2.20 (0.01)

1.00 (0.08)

-0.569

-13.1

59.8 (0.3)

OPC TIP4P TIP4P-Ew

2.480 2.178 2.321

2.300 2.147 2.164

76.3 (1.9) 51.1 (0.8) 63.4 (1.4)

2.37 (0.02) 3.86 (0.03) 2.63 (0.04)

1.01 (0.09) 0.55 (0.07) 0.75 (0.07)

-0.544 -0.523 -0.513

-12.5 -12.1 -11.8

68.2 (0.2) 51.4 (0.6) 58.0 (0.2)

TIP4P-FB TIP4P/2005 TIP5P TIP5P-Ew exp.

2.429 2.305 2.292 2.292 2.9(0.6)b

2.151 2.297 1.565 1.565

76.6 (2.1) 56.7 (1.6) 89.9 (3.7) 94.4 (3.6) 78.375c

2.21 (0.01) 2.31 (0.02) 2.93 (0.03) 3.09 (0.03) 2.299d

0.90 (0.06) 0.84 (0.08) 0.82 (0.09) 0.66 (0.07) 0.89e

-0.517 -0.486 -0.104 -0.102

-11.9 -11.2 -2.4 -2.4

63.7 (0.3) 62.6 (0.6) 47.9 (0.6) 51.5 (0.4) 71.99f

Dipole (µ) and quadrupole (QT) moment, static dielectric constant (ε0), diffusion constant ( ,@ ) at infinite dilution, viscosity (η), surface potential (χ) at water-air interface in air to liquid direction, solvent Galvani potential (∅+ ), and surface tension (γ) are given with standard deviations in parentheses; Standard deviations for χ and ∅+ amount to 5 mV and 0.1 kcal/mol; bDipole moment of liquid water from x-ray diffraction measurements;93 cForm Ref;94 dFrom Ref;92 eFrom Ref;95, 96 f From Ref.97

a

2.31 ± 0.02 x 10-5 cm2/s and reproduces the experiment perfectly (Figure 1a). The latest two water families of FB and OPC models yields a close prediction to the experiment, yet with a slight underestimation and overestimation by about 5%, respectively (Figure 1b and Table 1). SPC/Eb gives the smallest diffusion coefficient (D0) at infinite dilution, and D0 of the water models are in the order of TIP3P (6.14) > SPC (4.52) > TIP4P (3.86) > TIP5P-Ew (3.09) > TIP5P (2.93) > SPC/E (2.76) > TIP4P-Ew (2.63) > OPC3 (2.41) > OPC (2.37) > TIP4P/2005 (2.31) > TIP4P-FB (2.21) > TIP3P-FB (2.20) > SPC/Eb (2.10), where D0 are given in units of 105

cm2/s (Figure 1 and Table 1). Our D0 calculations are in excellent agreement with the results

published by others (with the finite-size effects under PBC corrected) for TIP3P (6.05 x 10-5 cm2/s) and SPC/E (2.79 x 10-5 cm2/s),21,

23

and are slightly smaller than the predictions for

TIP4P-Ew (2.72 x 10-5 cm2/s),23 SPC/E (2.97 x 10-5 cm2/s), and TIP4P/2005 (2.49 x 10-5

ACS Paragon Plus Environment

14

Page 15 of 54 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Information and Modeling

cm2/s).67 The original SPC/Eb work reported a value of 2.34 x 10-5 cm2/s in NPT ensemble,19 11% larger than our prediction in NVT ensemble. Linear fits of DPBC vs 1/L (Figure 1 and eq 2) produce estimates for the shear viscosities (η) of the water models are listed in Table 1. Compared to the experiment (η = 0.89 mPa•s),95, 96 the TIP3P, SPC, and TIP4P models severely underestimate the water viscosity and of these TIP3P yields the worst (smallest) viscosity of 0.32 ± 0.04 mPa•s, in line with reports by others.21, 98-102 TIP4P/2005 reproduces the viscosity well (η = 0.84 ± 0.08 mPa•s), as confirmed by several groups.67, 98, 100, 101 OPC3, SPC/Eb, and TIP4P-FB show a good performance for η as well, and the other water models show more or less acceptable performances for the water viscosity (Table 2). Note that the reverse non-equilibrium molecular dynamics (RNEMD), a method close to the real experiment technique, produced a dramatically small estimation of η = 0.33 mPa•s for SPC/E and 0.58 mPa•s for TIP4P/2005, whereas for SPC, TIP3P, TIP4P, TIP4P-Ew, TIP5P, and TIP5P-Ew comparable results were obtained by RNEMD.99 The viscosity (η) of water models examined are found to correlate strongly (R = -0.96) with their self-diffusion coefficients (D0), and η deceases linearly with the D0 increasing, as indicated by the blue circles in Figure 2.

Figure 2. Correlation of calculated water diffusion coefficients with the corresponding shear viscosities of the solvent water (blue circles) and with the root-mean-square error (RMSE, red triangles) of calculated amino acid diffusion coefficients from experimental observations at infinite dilution. Solid lines indicate linear fits of data points, R is correlation coefficient, and RMSE is for the calculations with ff99SB-ILDN.

ACS Paragon Plus Environment

15

Journal of Chemical Information and Modeling 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 16 of 54

Surface tensions are underestimated to some extent for all water models, of which the four-site OPC model performs best and predicts a surface tension of 68.2 mN/m, very close to the experiment of 71.99 mN/m.97 This underestimation can be ascribed to the lack of polarization in the models and omission of long-range dispersion contribute to this.103,

104

Compared to the

experimental dipole moment of liquid water (µ = 2.9),105 all the water models have (by design, since these are empirical models) a smaller µ, and no obvious differences are observed between water models except for TIP4P having the smallest µ = 2.178 D (Table 1). The FB and OPC water models reproduce the experimental dielectric constants (ε0) of water accurately, in perfect agreement with the original reports.28-30 TIP3P, TIP5P, and TIP5P-Ew exhibit larger ε0 than the experiment, whereas the other models produce an underestimation and TIP4P has the lowest ε0. Note that the ε0 prediction of water models depends highly on the simulation protocol,106 and there is a large diversity in the literate. For the solvent Galvani potential (∅+ ) that needs to be considered in the calculation of physical (real) solvation free energies (SFEs) for molecular ions, the three- and four-site water models give an ∅+ ranging from -11.2 to -14.4 kcal/mol. This leads to a deviation in the calculated ionic SFE in water from typical MD/PBC simulations by 11-14 kcal/mol (underestimation for cations and overestimation for anions). The five-site water models of TIP5P and TIP5P-Ew have a small contribution (∅+ ~ -2 kcal/mol) from the surface potential, which could lead to the conclusion that the water-vacuum interface is not very structured. These models also have a low surface tension, although there is no correlation between surface potentials and surface tension in the models tested. The predictions of the surface potential (χ) of SPC, SPC/E, and TIP3P in this work are close to the available reports,79, 107 while χ for TIP4P (-0.523 V) is

ACS Paragon Plus Environment

16

Page 17 of 54 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Information and Modeling

less negative than that in the literature (χ ~ -0.6 V),

108, 109

leading to a difference in ∅+ by ~2

kcal/mol. 3.2 Hydration of Amino Acid Side Chains. Hydration free energies (HFEs, ∆Ghyd) of amino acid side chain analogues are corrected by adding a term based on the solvent’s Galvani potential (q∅+ , eq 3), and the corrected real HFEs are given in Table 2. The correction amounts to zero for neutral analogues (q = 0). For the neutral amino acid side chain analogues with ff99SB-ILDN, no significant differences are observed between three- and four-site water models with the root-mean-square error (RMSE) from experimental observations amounting to ~1 kcal/mol, in line with our previous work for Amber ff99SB, CHARMM 27, GROMOS 54A7/A8, and OPLS-AA/L.83 A similar RMSE (~1 kcal/mol) for five-site water models of TIP5P and TIP5P-Ew is observed (Table 2); however, these two water models yield a more negative ∆Ghyd for Thr (T), Trp (W), and Tyr (Y) than the three- and four-site water models by 1~2 kcal/mol resulting in a good match with the experiments (Table 2 and Figure 3a). The TIP5P and TIP5PEw models overestimate the hydration (more negative ∆Ghyd) of neutral Asp and Glu residues by ~2 kcal/mol, whereas the three- and four-site water models reproduced the experimental ∆Ghyd of these two residues accurately. For the three- and four-site water models, in general, ff99SB-ILDN appears to have a systematic bias in that the hydration of neutral side chains is on average less favorable than experimental values, similar to that with ff99SBw/TIP4P/2005, ff14SB/TIP3P, and ff15ipq/SPC/Eb, as indicated by a positive mean signed error (MSE) of 0.5~0.8 kcal/mol (Tables 2 and S1). These results are in agreement with the work on a partial set of neutral side chain analogues (i.e, most neutral forms of titratable AAs such as Asp, Glu, His, and Lys were not examined) with Amber ff99 and ff03*.17, 20 However, TIP5P and TIP5P-Ew show a reasonable

ACS Paragon Plus Environment

17

Journal of Chemical Information and Modeling 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 18 of 54

representation of amino acid hydration with a small MSE of 0.1 and -0.1 kcal/mol, respectively (Table 2).The latest ff15ipq underestimates ∆Ghyd of Asp, Glu, Met, Ser, Thr, and Tyr by about 2 kcal/mol, and overestimates Asn, Gln, and His by 1~3 kcal/mol, resulting in a large RMSE of 1.7 kcal/mol

from

experiment.

ACS Paragon Plus Environment

18

Page 19 of 54 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Information and Modeling

Table 2. Real Hydration Free Energies (kcal/mol) of Amino Acid Side Chain Analoguesa exp.b FF

SPC

SPC/E

c

c

SPC/Eb c

d

OPC3

TIP3P

TIP3P -FB

TIP4P

TIP4P -Ew

e

c

f

c

c

c

g

h

i

TIP5P

TIP5P -Ew

j

c

c

TIP4P/2005

neutral Ala

1.94

2.4

2.6

2.6

2.6

2.6

2.4

2.6

2.5

2.6

2.5

2.6

2.2

2.6

2.2

2.3

2.3

Asn

-9.68

-9.7

-9.5

-9.5

-11.7

-9.5

-9.7

-9.6

-9.9

-9.7

-10.1

-10.0

-10.5

-7.3

-7.9

-10.2

-10.3

Asp

-6.7

-6.7

-6.6

-6.5

-4.6

-6.7

-6.8

-6.7

-6.5

-6.6

-6.7

-6.8

-7.1

-5.8

-6.3

-8.7

-8.9

Cys

-1.24

0.0

0.2

0.3

0.4

0.2

-0.1

0.2

0.1

0.2

0.1

0.2

-0.4

0.1

-0.4

-0.1

-0.2

Gln

-9.38

-10.3

-10.0

-10.1

-12.2

-10.0

-10.1

-10.4

-10.3

-10.3

-10.6

-10.5

-11.2

-11.4

-12.1

-10.5

-10.7

Glu

-6.47

-6.9

-6.7

-6.6

-4.5

-6.8

-6.8

-6.8

-6.7

-6.7

-7.0

-6.9

-7.6

-6.1

-6.8

-8.3

-8.5

Hid

-8.6

-8.4

-8.3

-12.7

-8.3

-8.5

-8.2

-8.6

-8.4

-8.8

-8.8

-9.7

-12.5

-13.5

-8.4

-8.7

Hie

-9.3

-9.1

-9.0

-11.4

-9.0

-9.1

-9.0

-9.1

-9.1

-9.4

-9.3

-10.3

-5.1

-6.2

-8.8

-9.0

-10.27

-9.1

-8.9

-8.8

-11.7

-8.8

-9.0

-8.8

-9.0

-9.0

-9.2

-9.2

-10.2

-6.6

-7.6

-8.7

-9.0

Ile

2.15

2.5

2.8

2.9

2.9

2.8

2.4

2.7

2.7

2.8

2.7

2.7

1.6

2.6

1.5

2.4

2.2

Leu

2.28

2.4

2.6

2.8

2.8

2.6

2.3

2.5

2.6

2.6

2.5

2.6

1.5

2.6

1.5

2.2

2.2

Lys

-4.38

-4.8

-4.8

-4.7

-3.9

-4.7

-4.7

-5.4

-5.0

-5.0

-5.4

-5.4

-6.3

-3.1

-4.1

-3.8

-4.0

Met

-1.48

0.6

1.0

1.0

0.8

1.0

0.5

0.9

0.8

1.0

0.9

0.8

-0.3

0.5

-0.7

0.6

0.6

Phe

-0.76

-0.2

0.1

0.2

-0.2

0.1

-0.3

-0.1

0.1

0.1

-0.1

0.0

-1.4

0.8

-0.5

0.0

-0.2

Ser

-5.06

-4.5

-4.5

-4.4

-2.9

-4.5

-4.5

-4.6

-4.6

-4.5

-4.6

-4.6

-4.8

-4.8

-4.9

-5.5

-5.6

Thr

-4.88

-4.2

-4.1

-4.3

-2.9

-4.1

-4.1

-4.4

-4.2

-4.2

-4.4

-4.4

-4.9

-3.6

-4.1

-4.9

-5.0

Trp

-5.88

-5.2

-4.6

-4.4

-5.6

-4.6

-5.2

-4.8

-4.6

-4.4

-4.8

-4.9

-6.5

-3.2

-4.9

-5.7

-6.1

Tyr

-6.11

-4.5

-4.1

-3.8

-3.8

-4.1

-4.6

-4.2

-4.0

-4.0

-4.2

-4.1

-5.5

-1.5

-3.1

-6.0

-6.1

Val

1.99

2.4

2.6

2.7

2.7

2.6

2.4

2.6

2.6

2.6

2.5

2.5

1.7

2.5

1.5

2.3

2.2

RMSE

0.9

1.1

1.1

1.7

1.1

0.8

1.1

1.1

1.1

1.0

1.0

0.9

2.0

1.4

1.0

1.1

MSE

0.5

0.7

0.8

0.7

0.7

0.5

0.6

0.6

0.7

0.5

0.5

-0.3

1.3

0.4

0.1

-0.1

His

k

charged Arg

-61.1

-60.3

-60.0

-58.6

-61.4

-60.5

-59.7

-57.1

-57.4

-56.4

-57.0

-57.9

-53.6

-55.0

-65.2

-65.6

-81.1

-80.6

-81.7

-82.5

-87.8

-80.7

-80.1

-83.0

-85.3

-85.7

-87.7

-85.1

-84.7

-80.7

-80.3

-73.2

-73.7

-79.6

-80.1

-81.4

-82.1

-88.4

-80.3

-79.7

-82.7

-85.0

-85.2

-87.4

-88.5

-88.3

-91.2

-91.2

-72.8

-73.1

Gud

-63.8

-68.8

-68.3

-67.9

-65.4

-69.4

-68.5

-67.2

-64.6

-64.8

-63.9

-65.4

-65.6

-61.6

-61.9

-75.5

-75.8

His

-58.9

-59.7

-58.9

-58.6

-58.6

-59.9

-59.0

-58.5

-56.0

-55.9

-55.0

-54.5

-55.6

-53.9

-54.9

-61.5

-61.9

Lys

-67.7

-66.3

-65.5

-65.4

-66.0

-66.6

-65.8

-65.3

-62.7

-62.8

-61.9

-61.9

-62.8

-59.5

-60.4

-69.4

-69.5

RMSE

2.4

2.4

2.5

5.0

2.6

2.3

2.5

4.0

4.1

5.5

5.5

5.0

6.8

6.4

7.2

7.1

-0.9

-0.9

-1.1

-3.0

-1.2

-0.4

-1.1

-0.5

-0.6

-0.9

-0.8

-1.2

0.9

0.5

-0.2

-0.6

Asp Glu l

MSE a

Standard deviations for neutral and charged analogues amount approximately to 0.05 and 0.14 kcal/mol, respectively; RMSE means root-mean-square error and MSE is mean signed error; bExperimental observations for neutral analogues were taken from Refs110, 111; ∗ 0H 2 3 of -262.4 kcal/mol; cff99SB-ILDN; dff15ipq; eff14SB; ffb15; gff99SBw; for charged ones from Refs11, 83, 111 based on a ∆,-. h i j k ff99SBws; ff03w; ff03ws; A weighted value computed as 0.2*Hid + 0.8* Hie;83, 112 lGuanidinium (Gud), partial analogue of Arg+.

ACS Paragon Plus Environment

19

Journal of Chemical Information and Modeling 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 20 of 54

Figure 3. Comparison of calculated real hydration free energies using ff99SB-ILDN for (a) neutral and (b) charged (b) amino acid side chain analogues in different water models with experimental observations. The amino acid with a large discrepancy in calculated hydration free energies with varied water models are marked by its one-letter-code (R stands for Gud+ here).

It should be noted that Amber ff99SB-ILDN is identical to ff99SBw for neutral side chain analogues. In order to validate the Amber ff99SBws, HFEs of all the side chain analogues were calculated with (ff99SBws) and without (ff99SBw) scaling of protein-water interactions (Table 2). The scaling in ff99SBws helps to strengthen the hydration of the neutral analogues by lowering ∆Ghyd for all residues, yielding an overall more favorable hydration with a MSE of -0.3 kcal/mol. Compared to ff99SBw and the experimental observations, ff03w (sharing identical parameters for neutral residues to ff03) displays a worse performance with a larger RMSE (~2 kcal/mol) and a more positive MSE (1.3 kcal/mol). For instance, HFEs of Asn, His, Lys, Trp, and Tyr are underestimated by 2~4 kcal/mol in ff03w, as indicated by the neutral analogues

ACS Paragon Plus Environment

20

Page 21 of 54 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Information and Modeling

(Tables 2 and S2). Scaling the water interactions in ff03ws produces more negative ∆Ghyd and leads to an overall RMSE (1.4 kcal/mol) and MSE (0.4 kcal/mol) relative to ff03w, which is an improvement compared to the experiments. However, the performance of ff03ws for predicting the hydration of neutral amino acids is still slightly worse than ff99SB-ILDN and ff99SBw(s). The intrinsic HFEs (∆Gintr, i.e., without the air/liquid surface contribution in eq 3) of charged amino acids depend critically on the type of water models,83, 113 and the difference in ∆Gintr between different water models is even larger than 20 kcal/mol. For Glu-, for instance, SPC/E, TIP3P, TIP4P/2005, and TIP5P-Ew with ff99SB-ILDN predict a ∆Gintr of -95.0, -92.0, -98.6, and -75.5 kcal/mol, respectively. With the surface contribution, the difference in the real HFEs (∆Ghyd) resulting from the water models are reduced significantly, as indicated by the charged analogues in Table 2. Interestingly, ∆Ghyd with ff99SB-ILD for the nine examined water models can be putatively classified into three groups, based on the number of interacting sites in the water models. The magnitude of ∆Ghyd with three-site water models lies between that of four-site water models and five-site water models (Figure 3b). The four-site models overestimate the hydration of negatively charged amino acid side chains, but underestimate the hydration of the positively charged side chains; the opposite behavior is observed for the five-site water models. For each group, the difference in ∆Ghyd with varied water models for the same amino acid side chain is less than 2 kcal/mol. If the solvent Galvani potential (∅+ ~ -14 kcal/mol) for TIP4P from the literature108, 109 was used instead of -12.1 kcal/mol in Table 1, the distinctions between TIP4P and the three-site water models would be not as obvious as in Figure 3b. The debated experimental reference for ∆Ghyd of charged residues,83 however, precludes any clear conclusion about the performance of varied water models.

ACS Paragon Plus Environment

21

Journal of Chemical Information and Modeling 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 22 of 54

Atomic charges of charged amino acids in ff99SB-ILDN, ff99SBw(s), and ff03w(s) are different and thus cannot be compared with each other directly. For the latter two force fields, it should be noted that there is a large difference in the hydration of Asp- and Glu- residues, whereas the difference between the two residues is tiny in the experiments and in the calculations with ff99SB-ILDN and ff15ipq (Table 2). Refitted atomic charges for charged residues in ff99SBw20,

36

gives a comparable accuracy for ∆Ghyd to that in ff99SB-ILDN, except for an

underestimation of Asp by ~3 kcal/mol. Scaling protein-water interactions in ff99SBws and ff03ws gives a slightly strengthened HFE, and real HFEs of charged side chains using ff99SBw(s) and ff03w(s) with TIP4P/2005 appear to fall within the group of four-site water models as classified in Figure 3b. Effects of water models on the hydration of charged amino acids were further investigated in terms of structural ion−water interacting properties. Table 3 lists the locations (RRDF) of the first peaks in the ion−water radial distribution functions (RDFs) and the water number density (ρ) at RRDF, along with, if available, the corresponding experimental estimates. A quick look suggests that the three- and four-site water models reproduce the experimental RRDF of Gud+ and NH4+ ions well, whereas the five-site water models give an underestimation. The five-site water models perfectly reproduce the experiment RRDF of Imd+, while an overestimation by 0.01 nm is found for the three- and four-site water models. All the tested water models underestimate the RRDF location of Asp



by 0.01~0.02 nm. A closer inspection on the relationship with water

parameters (Table S3) and water-related properties (Tables 1 and 2) indicates that RRDF has a stronger correlation with the Lennard-Jones (LJ) repulsive term (C12, R = 0.6 ~ 0.9) of water models than the dispersion one (C6, R = 0.2 ~ 0.6); for the σ-ε form of LJ potential, RRDF correlates to σ more strongly (R > 0.86) than ε (R < 0.5), as given in Table S4 and Figure S1.

ACS Paragon Plus Environment

22

Page 23 of 54 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Information and Modeling

RRDF does not show any correlation with the water dipole, whereas a strong correlation with the quadrupole moment (QT) and hydration free

ACS Paragon Plus Environment

23

Journal of Chemical Information and Modeling 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 24 of 54

Table 3. Characterization for the First Peak of Ion-Water Radial Distribution Functionsa SPC

SPC/E

b

b

b

0.290

0.290

0.290

His+ Imd+

TIP3PFB e

OPC3

TIP3P

c

d

b

0.290

0.302

0.292

0.290

0.292

0.294

0.286

0.290

0.306

0.286

0.288

0.290

0.288

0.284

0.288

0.290

0.288

0.284

0.290

0.284

0.286

0.290

0.286

Lys+

0.286

0.284

0.288

0.288

NH4+

0.284

0.284

0.286

0.266

0.266

0.266

0.264

FF

SPC/Eb

TIP4P b

TIP4PEw b

TIP4P/2005

TIP5P

TIP5PEw b

exp.

b

f

g

b

0.292

0.294

0.294

0.300

0.278

0.278

0.294

0.292

0.294

0.294

0.300

0.278

0.278

0.288

0.286

0.286

0.292

0.292

0.296

0.276

0.276

0.282

0.290

0.288

0.288

0.290

0.292

0.292

0.274

0.276

0.284

0.286

0.288

0.286

0.288

0.290

0.292

0.294

0.282

0.278

0.290

0.284

0.280

0.286

0.286

0.286

0.290

0.288

0.290

0.278

0.278

0.285j

0.264

0.262

0.266

0.268

0.264

0.264

0.264

0.264

0.266

0.270

0.274

0.274

0.285k

0.264

0.264

0.266

0.266

0.266

0.262

0.262

0.264

0.266

0.266

0.274

0.272

RRDF (nm) Arg+ Gud

Asp Glu

+





~0.3h 0.273i

-3

ρ (nm ) +

41.6

42.6

43.5

44.5

43.4

43.5

42.9

41.9

42.0

41.9

41.9

40.2

57.2

58.7

+

33.1

33.0

34.5

36.9

35.5

34.5

35.5

33.6

35.5

34.1

35.5

33.6

46.0

44.9

+

94.4

94.7

96.4

99.3

92.2

104.1

97.4

93.4

99.8

99.3

95.7

89.8

116.1

124.6

103.5

108.5

110.8

115.3

104.6

100.3

107.3

109.6

111.3

113.0

100.7

96.1

88.5

89.6

102.9

108.5

113.0

118.8

106.8

104.6

111.9

110.1

110.7

114.8

114.7

117.0

85.9

88.7

Arg His

Lys

Asp Glu





a

Position (RRDF) of the first peak in the radial distribution functions (RDFs) of water (oxygen atoms) around the ions (nitrogen atoms for cations and oxygen for anions) and water number density (ρ) at RRDF are given; for pure water, ρ≈33.4 nm-3 corresponding to a density of 1 g/mL. Gud+(guandinium), Imd+ (imidazolium), and NH4+ (ammonium) are partial analogues of Arg+, His+, and Lys+ side chains, respectively; bff99SB-ILDN; cff15ipq; dff14SB; efb15; fff99SBws; gff03ws; hEstimate based on neutron diffraction experiments;87 iAverage hydrogen-bond length with water observed in a solid crystal;88 jFrom Ref.;89, 90 kFrom x-ray and NMR experiments.91

energies of charged amino acid sidechains is observed. A stronger hydration means a smaller RRDF and a higher water number density (ρ) at RRDF, as shown in Figure S1 and Table 3. It is found that a reduction in C12, σ, or QT could yield a smaller and larger RRDF for positively and charged side chains, respectively (Figure S1). Compared to the “native” water model of TIP3P with the Amber ff99SB-ILDN force field, TIP4P/2005 slightly overestimates the RRDF between water and positively charged amino acid side chains by 0.004~0.008 nm, and underestimates the RRDF between water and negatively charged amino acid side chains by 0.002~0.004 nm. In contrasts, TIP5P and TIP5P-Ew yield a large underestimation (which is much further away from the experiments) for positively charged amino acids by 0.006~0.012 nm and an overestimation for negatively charged amino acids by

ACS Paragon Plus Environment

24

Page 25 of 54 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Information and Modeling

- ~0.007 nm (Table 3). RDF plots of Arg+ and Asp , for instance, are given in Figure 4,

pinpointing the discrepancies in water-ion binding structures with different water models, which indicates that the five-site models likely produce different hydrated structures from three- and four-site models. With different atomic charges and the scaling of protein-water interactions, ff99SBws and ff03ws with TIP4P/2005 produce results comparable to that of ff99SB-ILDN (Table 3). No significant differences for the lasted Amber FFs of ff14SB, ff15ipq, and fb15, except ff15ipq overestimating the RRDF of Gud+ by 0.006 nm.



Figure 4. Radial distribution functions (RDFs) for (a) nitrogen of Arg+ and (b) oxygen atoms of Asp around water (HOH) oxygen atoms computed from 10-ns MD simulations with ff99SB-ILDN in TIP3P (solid lines), TIP4P/2005 (dotted), and TIP5P-Ew (dash-dotted) water. Unnormalized RDFs are presented here describing water number density (ρ) around charged amino acids; ρ approaches to 33.4 nm-3 corresponding to a pure water density of 1 g/mL. Inserts are enlarged view of water number density at r = 0.25~0.35 nm.

ACS Paragon Plus Environment

25

Journal of Chemical Information and Modeling 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 26 of 54

3.3 Diffusion of Amino Acids. Similar to water diffusion constant calculations, translational diffusion coefficients of zwitterionic amino acids at infinite dilution were determined by extrapolation (eq 2), see Table 4. With the Amber ff99SB-ILDN, TIP3P yields the largest root-mean-square error (RMSE) of 1.44 x 10-5 cm2/s from experimental observations, while SPC/Eb gives the smallest RMSE of 0.15 x 10-5 cm2/s, despite SPC/Eb underestimates the diffusion constant by ~9% (Table 1). The RMSEs with varied water models are found to be correlated strongly (R = 0.99) with the water self-diffusion constants (D0), as shown in Figure 2; the faster the water Table 4. Comparison of Calculated Translational Diffusion Constants (10

-5

cm2/s) of Amino Acids with

Experimental Observations at Infinite Dilutiona exp.b

SPC

SPC/E

FF Ala 0.74 Arg 0.55 Asn 0.66 Asp 0.65 Cys 0.69 Gln Glu 0.62 Gly His 0.56 Ile 0.62 Leu Lys 0.53 Met 0.64 Phe 0.60 Ser 0.71 Thr Trp Tyr Val RMSEcalci RMSEscaledi R

c 1.87 1.63 1.56 1.52 1.71 1.52 1.46 2.17 1.38 1.52 1.65 1.56 1.54 1.40 1.68 1.66 1.59 1.43 1.73 0.94 0.18 0.66

c 1.09 0.99 0.87 1.16 1.13 1.05 0.91 1.44 1.13 0.97 1.04 0.95 1.16 0.94 1.04 1.32 0.84 0.99 0.99 0.41 0.24 0.29

SPC/Eb c 0.79 0.74 0.82 0.63 0.92 0.69 0.69 0.92 0.74 0.78 0.78 0.71 0.75 0.80 0.78 0.79 0.74 0.78 0.83 0.15 0.21 0.41

d 0.75 0.66 0.77 0.69 0.85 0.83 0.70 0.93 0.75 0.84 0.72 0.75 0.78 0.83 0.92 0.76 0.70 0.64 0.81 0.16 0.23 0.44

OPC3

TIP3P

e

c 2.40 1.96 2.16 1.89 2.33 1.98 1.83 2.76 2.00 2.07 2.17 1.75 2.22 1.89 2.20 2.30 1.83 1.82 1.87 1.44 0.14 0.80

TIP3P -FB f 0.87 0.73 0.87 0.70 0.76 0.78 0.71 1.13 0.87 0.71 0.84 0.74 0.78 0.83 0.95 0.75 0.67 0.74 0.72 0.18 0.22 0.43

TIP4P c 1.70 1.25 1.32 1.43 1.51 1.39 1.39 1.82 1.48 1.34 1.47 1.48 1.53 1.30 1.56 1.41 1.31 1.32 1.35 0.82 0.24 0.59

TIP4PEw c 1.07 0.97 0.95 0.93 1.00 0.92 1.10 1.36 1.07 0.95 0.99 0.91 1.01 0.88 1.07 1.07 0.83 0.84 0.93 0.37 0.25 0.39

TIP4P/2005 c 0.96 0.82 0.88 0.84 0.93 0.89 0.72 1.01 0.86 0.92 0.78 0.88 0.87 0.80 1.01 0.95 0.79 0.86 0.85 0.25 0.25 0.57

g 0.93 0.83 0.81 0.74 1.04 0.91 0.75 1.16 0.84 0.78 0.78 0.91 0.96 0.79 1.02 0.84 0.78 0.81 0.81 0.25 0.25 0.43

h 0.98 0.80 0.81 0.80 0.99 0.68 0.67 1.10 0.86 0.78 0.87 0.76 0.75 0.88 0.80 0.85 0.81 0.82 0.90 0.21 0.21 0.45

TIP5P c 0.97 0.87 0.93 0.95 0.98 0.78 0.87 1.16 0.96 1.08 0.75 0.87 0.93 1.06 0.99 0.64 0.96 0.85 1.01 0.33 0.13 0.33

TIP5PEw c 1.24 0.96 0.94 0.94 1.00 1.10 0.92 1.23 1.05 1.01 1.00 0.88 1.03 0.88 0.95 0.84 0.85 0.94 1.01 0.36 0.12 0.52

0.99 0.72 0.86 0.78 0.92 0.84 0.81 1.05 0.87 0.84 0.80 0.84 0.90 0.92 0.94 0.89 0.83 0.88 0.82 0.24 0.20 0.65 a Amino acids are in zwitterionic forms for consistency with experimental determinations at pH~3.5; relative errors for the calculations are ~7%; bTaken from Ref.;52 Standard deviations for the experiments are smaller than 0.01 x 10-5 cm2/s and not

ACS Paragon Plus Environment

26

Page 27 of 54 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Information and Modeling

given here; cff99SB-ILDN; dff15ipq; ef14SB; ffb15; gff99SBws; hff03ws; iRoot-mean-square error (RMSE) of the calculated 6 6 %6 . ,55 and scaled ,55 (eq 4) from experiment.

ACS Paragon Plus Environment

27

Journal of Chemical Information and Modeling 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 28 of 54

model diffuses, the larger the RMSE is. Relationship of calculated and experimental amino acid diffusion constants with their molar masses is given in Figure 5. All the water models examined overestimate the diffusion constants of zwitterionic amino acids with ff99SB-ILDN. The fast diffusion of TIP3P greatly accelerates the diffusion of amino acids and produces the overall largest diffusion constants that are two times larger than the experimental observations. TIP4P/2005, which performs best for the water diffusivity, yet still gives an RMSE of 0.25 x 10-5 cm2/s, an overestimation by 40% approximately. As expected, diffusion of amino acids decreases, in a power way probably, as their molar masses increase (Figure 5). Due to the similar result for water diffusivity, ff15ipq/SPC/Eb performances similarly to ff99SB-ILDN/SPC/Eb and fb15/TIP3P-FB, and that of ff14SB/OPC is similar to ff99SB-ILDN/TIP4P/2005 (Table 4).

Figure 5. Calculated and experimental diffusion constants of amino acids at infinite dilution as a

function of molar mass of the amino acids with ff99SB-ILDN. Solid lines are fits to a power function indicative of a relationship with the molar mass. Data points for TIP4P-Ew, TIP5P, and TIP5P-Ew are close to that of SPC/E and TIP4P/2005 and not given here for clarity. Note that ff99SB-ILDN and ff99SBw share identical force field parameters of zwitterionic amino acids, as stated in the model construction in Section 2.1. Unexpectedly, scaling the water interactions in ff99SBws has little influence on the diffusion properties of amino acids (Table 4), although such scaling leads to a more favorable hydration of amino acid side chains (Table 2).

ACS Paragon Plus Environment

28

Page 29 of 54 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Information and Modeling

Also, Amber ff03ws yields comparable diffusion results to ff99SB-ILDN and ff99SBws (Table BC

BC

6 6 6 4). With a scaling factor of ,@ /D6 ,@ , where D ,@ and D ,@ are experimental and calculated

water diffusion coefficients (eq 4), comparable results for all the water models tested can be obtained. DISCUSSION A systematic benchmark of protein force fields (FFs) for reproducing hydration properties and diffusion of amino acids is presented to assess the performance of FF/water combinations. Eight Amber force fields, namely, ff99SB-ILDN,10 ff99SBw,20 ff99SBws,20 ff03w,33 ff03ws,20 ff14SB,50 ff15ipq,18 and fb15,41 were examined with 11 water models belonging to three categories, namely, three-site (OPC3,30 SPC,14 SPC/E,15 SPC/Eb,19 and TIP3P,13 and TIP3PFB28), four-site (TIP4P,13 TIP4P-Ew,25 and TIP4P/200527), and five-site (TIP5P16 and TIP5PEw26) models. The widely used variants of Amber type fixed-charge force fields, as examined in this work, share the potential form and most parameters from the original ff94 model by Cornell et al.,114 where atomic charges were obtained by fitting the calculated electrostatic potential (ESP) at the HF/6-31G* level of theory in gas phase with the RESP method.115 Subsequent adaptations were made focusing primarily on the improved representation of protein secondary structures by successive tuning of the ff94 dihedral torsions, such as ff99,116 ff99SB,35 and side chain torsions (ff99SB-ILDN10). The latest Amber ff14SB, for instance, completely refitted all amino acid side chain dihedral parameters that were carried over from ff94,50 while Amber fb15 started from ff99SB, modified the bonded parameters by fitting to high-quality QM calculations with the Force Balance (FB) method,28 and left the other parameters unmodified.41 Inheriting from the same parent ff94, ff99SB-ILDN, ff14SB, and fb15 therefore share identical atomic charges and nonbonded parameters for all amino acids, of which charged residues have the same backbone

ACS Paragon Plus Environment

29

Journal of Chemical Information and Modeling 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 30 of 54

charges and the neutral ones share a different set of backbone charges. In order to eliminate the difference in backbone charges for a faithful reproduction of helix propensities, Best et al. refitted atomic charges of charged residues in ff99SB-ILDN and side-chain torsions for Asp in conjunction with the new partial charges and, together with a fine-tuning of backbone dihedral potentials,53 the resulting force field is referred to as ff99SB*-ILDN-Q.36 Based on this, the ff99SBws was developed in order to get strengthened protein-water interactions.20 HF/6-31G* has been reported to overestimate the gas-phase dipole moment and to produce slightly larger charges leading to an overestimated gas-phase dipole, which is somewhat condensed phase-like.54 For a more accurate representation of polarization effects, the Amber ff03 lineage applied the electrostatic potential (ESP) for RESP fitting with an increased quantum mechanical (QM) level of theory (B3LYP/cc-pVTZ) using the IEFPCM continuum solvent model (with an effective dielectric constant of 4, mimicking organic media or protein interior).54 Unlike ff94, each amino acid in ff03, on which ff03ws was based, has its own backbone charges, although bonded- (except main-chain torsion) and van der Waals parameters were still taken from ff94 directly. The third charge model in the evolution of Amber protein models was the implicitly polarized charge (IPolQ) scheme,117 which assigns polarized atomic charges implicitly in the presence of explicit solvent, as did in the latest Amber ff15ipq with SPC/Eb water model.18 The ff15ipq is an original protein FF with its own charge model and new angle and torsion terms as well as new non-bonded parameters for the polar hydrogens designed to improve amino acid salt bridge associations.18 The consistency used when treating backbone charges on the peptide group automatically compartmentalizes the charge representations and thereby lends credence to a manual charge assignment. The procedure introduced by Shirts et al. for taping together representations of

ACS Paragon Plus Environment

30

Page 31 of 54 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Information and Modeling

amino acid side chain analogues48 was therefore used in this work as well and is referred to as the method of Cβ modification. However, ff03 does not have any true consistency in backbone charges, which makes manual charge assignment somewhat arbitrary. Following the original papers,54, 114 a full ESP charge reassignment by QM calculations (referred to as the method of QM full ESP) was attempted to construct the side chain analogues and zwitterionic amino acids with the Gaussian 09 software.118 Compared to the QM full ESP, the Cβ modification slightly overestimates the dipole of neutral side chain analogues (Figure 6a-b) and underestimates the corresponding atomic charges (in absolute value) by ~20% (Figure 6c-d). Interestingly, the QM full ESP method does not seem

Figure 6. Comparison of dipole (a and b), atomic charge (c and d), and hydration free energy in TIP3P (e and f) for the constructed neutral amino acid side chain analogues by the method of Cβ modification and QM full ESP calculations for Amber ff99SB-ILDN (top) and ff03 (bottom).

to influence hydration free energies (∆Ghyd) of the analogues significantly for ff99SB-ILDN (Figure 6e and Table S5), but shows a better prediction for ff03 than the manual assignment by Cβ modification, as indicated by a reduced MSE from 1.3 to 0.8 kcal/mol and a reduced RMSE from 1.8 to 1.0 kcal/mol (Figure 6f and Table S2), a similar performance to that with f99SBILDN. That is, the ∆Ghyd performance of ff03 is better than that observed in Table 2.

ACS Paragon Plus Environment

31

Journal of Chemical Information and Modeling 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 32 of 54

Discrepancies in dipole and atomic charges were observed as well for the zwitterionic amino acids between the manual assignment and QM full ESP calculations (Figure S2). These findings indicate the charge distribution could be a source of error for such calculations. Our benchmark results for neutral amino acid side chains and neutral zwitterionic amino acids are summarized in Figure 7. The “native” model TIP3P in combination with ff99SB-ILDN shows

Figure 7. Overall performance of the examined force field/water combination as indicated by (a) root-meansquare error (RMSE) and (b) mean signed error (MSE) for hydration free energies (∆Ghyd) of neutral amino acid side chains as well as by the MSE for diffusion constants (D0,AA) of neutral zwitterionic amino acids at infinite dilution compared to the experiment. Because all the water models overestimate amino acid diffusions, the RMSE for D0,AA is almost identical to the corresponding MSE.

the smallest RMSE of 0.8 kcal/mol from experiment for hydration free energies (∆Ghyd) of the neutral analogues. The RMSE amounts to about 1 kcal/mol for all the FFs that were inherited from ff94 and shared an identical charge model, whereas ff03w and ff15ipq yields large RMSEs of 1.4 and 1.7 kcal/mol, respectively (Figure 7a). Note that the QM full ESP charge reassignment

ACS Paragon Plus Environment

32

Page 33 of 54 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Information and Modeling

leads to a reduced RMSE for ff03w, as mentioned above in Figure 6f. ∆Ghyd of the neutral analogues for ff99SB-ILDN paired with three- and four-site models is almost independent on the used water models, with an MSE of 0.5~0.8 kcal/mol, which is less favorable compared to experiment. However, TIP5P and TIP5P-Ew tend to give a more negative ∆Ghyd with ff99SBILDN for certain amino acid residues, which is better for some but worse for others compared to the other water models and experiments, resulting in an overall small MSE of 0.1 and -0.1 kcal/mol (Table 2 and Figures 3a and 7b). Interestingly, ∆Ghyd for the charged analogues with ff99SB-ILDN is water model dependent and can be classified putatively into three groups, depending on the number of interacting sites in water models (Figure 3b), of which the three-site models perform best with a RMSE of 2 kcal/mol. It should be noted that such classification is based on the comparison with experiments ∗ 0H 2 3 of -262.4 kcal/mol.85 For each group, the difference in ∆Ghyd with with a reference ∆,-.

different water models for the same amino acid side chain is less than 2 kcal/mol and appears to be negligible, and no obvious differences for the location of first hydration peaks (RRDF) in the radial distribution functions between charged amino acids and water molecules are observed (Table 3). Even so, however, the choice of water model was reported to show a strong influence on the accuracy of other properties of amino acid side chains such as hydration entropies, enthalpies, and heat capacities.17 If other molecules such as metal ions were involved in the simulations, force field parameters for these molecules should be developed and optimized for use with the chosen water model.119-121 Despite the good performance on ∆Ghyd, TIP3P overestimates amino acid diffusions largely, which is likely due to the inherent deficiency of the TIP3P water model, in particular, the significantly overestimated self-diffusivity (D0 = 6.14 ± 0.05 x 10-5 cm2/s) and the corresponding

ACS Paragon Plus Environment

33

Journal of Chemical Information and Modeling 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 34 of 54

low viscosity (η = 0.32 ± 0.04 mPa•s), compared to experimental observations of 2.299 x 10-5 cm2/s and 0.89 mPa•s, respectively. With the optimized TIP4P/2005 model, which accurately captures many properties of liquid water including diffusion (D0 = 2.31 ± 0.02 x 10-5 cm2/s) and viscosity (η = 0.84 ± 0.08 mPa•s), a better performance with ff99SB-ILDN (much closer to the experiments) for absolute diffusion constants of amino acids is obtained, similar to the predictions by the latest OPC3 water model with Amber ff14SB (Table 4). Using a water model with a lower D0 than experiment, like SPC/Eb, reduces the diffusivity of amino acids further, but a tendency of overestimation still exists for TIP4P/2005 and SPC/Eb as well as the other water models tested, even with the lasted FFs of ff14SB, ff15ipq, and fb15 (Table 4 and Figure 5). With an ad-hoc correction for diffusion properties of the water models (eq 4), the calculated results are comparable between different water models (Table 4). These findings indicate an improvement in the simulated diffusivity of amino acids with an accurate model for liquid water and a necessity of improving the accuracy of force fields themselves. Seeing that, despite very accurate diffusion properties of TIP4P/2005, the diffusion of amino acids is still overestimated by about 40%, even when using the scaled water interactions in ff99SBws and ff03ws,20 there seems to be still something missing from the force fields. A modification of charges in solvated side-chains (large absolute values of the charges) would probably reduce the diffusion rates, but most likely at the cost of overestimated HFEs. It has been suggested, based on extensive simulations of molecular liquids, that the dispersion interactions in biomolecular force fields are overestimated,104 due to force fields being optimized without long-range dispersion corrections. Although a reduction of the dispersion constant (C6 in the Lennard-Jones potential) could have an effect on protein properties, it is unlikely to affect diffusion of isolated amino acids.

ACS Paragon Plus Environment

34

Page 35 of 54 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Information and Modeling

It has been suggested that the balance between protein-water interactions and protein-protein interactions for current biomolecular force fields is not correct for the modeling of disordered proteins and protein aggregations.20, 42-46 The slightly too weak protein-water interactions are confirmed here by positive MSEs for the hydration of neutral side chain analogues with ff99SBILDN, ff99w, and ff03w, ff14SB, ff15ipq, and fb15 (Table 2 and Figure 7). Tuning the solutewater van der Waals interactions appears to be a partial solution to this issue,20, 46, 49 as was done in ff99SBws and ff03ws force fields by Best et al.20 For these two force fields, scaling proteinwater interactions was proposed to modulate (strengthen, more precisely) protein-water interactions and generate, likely without highly residue-specific effects, more negative hydration free energies of amino acids, as confirmed by a reduced MSE from 0.5 to -0.3 kcal/mol for ∆Ghyd of neutral side chains with ff99SBws and from 1.3 to 0.4 kcal/mol with ff03ws (Table 2 and Figure 7). More importantly, such scaling shows improved properties of simulated disordered proteins and non-specific protein association.20 The strengthened interaction between solute and water molecules is expected to slow down the diffusion of the solute; however, this scaling does not influence the amino acid diffusion significantly (Table 4), see discussion above. The location (RRDF) of first hydration peak also needs to be considered for further improvements of the protein force fields. The three- and four-site water models with Amber force fields largely overestimate the location (RRDF) of first hydration peak of Imd+ by 0.01 nm, and underestimate the RRDF of Asp - by 0.01~0.02 nm. Except for Imd+ that is reproduced accurately, the five-site water model appears to predict a systematic underestimation of RRDF for charged amino acids (Table 3). Such deficiencies likely lead to a deviation in the simulated biomolecular structures from the experiments. In practice, the water-ion distance could be fixed probably by modifying protein fields, such as the efforts made in the GROMOS 54A8 parameter

ACS Paragon Plus Environment

35

Journal of Chemical Information and Modeling 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 36 of 54

set,11 or by the refinement of water models. A reduction in the repulsion term (C12), σ in the σ-ε form of Lennard-Jones potential, or quadrupole moment (QT) of water models, for instance, could help somewhat to obtain better hydrated structures (Figure S1), but with necessary adjustments for reproducing key properties of water molecules. The benchmark results highlight the complexity of modeling amino acid hydration with different water models, in particular for charged residues. Atomic charges in the Amber ff94 and ff03 lineages appear to perform better for hydration free energies of side chains than that in the latest ff15ipq, although the implicitly polarized charge (IPolQ) method in ff15ipq made an important leap forward in terms of solvent-induced polarization for condensed phase simulations. The exaggerated self-diffusivity of water models is shown to be largely responsible for the observed discrepancy in the predicted diffusion constants of amino acids by varied water models. Due to the accurate diffusion properties, TIP4P/2005 and the “FB” and “OPC” families of the most recent generation are recommended for use with applications where the absolute diffusivity or kinetic properties are of interest. The comparison between different water models not only indicates the use of accurate water models could result in, to some extent, an improvement of the protein force fields, but also provides a source of discrepancies/errors in the simulation studies. The limited success (Tables 1 and 4) of more complex four- and five-site water models (note that the latest OPC and TIP4P-FB were not tested here) underscores the difficulty of building an accurate water model that simultaneously is a good solvent model for proteins. With simple atom-based models, parameter development for water and proteins will probably lead to compromises in accuracy. Improvement of accuracy will be limited until fast codes and parameter development pipelines can support more advanced representations. Some promising work on polarizable models is underway.122-124 This work establishes a fundamental assessment

ACS Paragon Plus Environment

36

Page 37 of 54 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Information and Modeling

for the hydration and diffusion of amino acids and provides implications for further improvements and applications of protein force fields. A reasonable modeling of the surrounding environments (other than water) around biomolecules is of critical importance and needs to be checked carefully before adaption and re-optimization of the biomolecular force fields. As a follow-up, properties of amino acids in organic solvents other than water will be presented in a subsequent article. ASSOCIATED CONTENT Supporting Information Supplementary tables and figures. The Supporting Information is available free of charge on the ACS Publications website. AUTHOR INFORMATION Corresponding Author [email protected] (D.v.d.S.) Notes The authors declare no competing financial interests. ACKNOWLEDGMENT We thank Prof. Tianwei Tan for a grant of computer time through ChemCloudComputing of Beijing University of Chemical Technology. This work was supported by the National Natural Science Foundation of China (21606016), Beijing Natural Science Foundation (5174036) and the

ACS Paragon Plus Environment

37

Journal of Chemical Information and Modeling 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 38 of 54

Fundamental Research Funds for the Central Universities (FRF-TP-17-009A2) as well as by the Swedish Research Council (grant 2013-5947). REFERENCES (1) Bellissent-Funel, M.-C.; Hassanali, A.; Havenith, M.; Henchman, R.; Pohl, P.; Sterpone, F.; van der Spoel, D.; Xu, Y.; Garcia, A. E. Water Determines the Structure and Dynamics of Proteins. Chem. Rev. 2016, 116, 7673-7697. (2) Jungwirth, P. Biological Water or Rather Water in Biology? J. Phys. Chem. Lett. 2015, 6, 2449-2451. (3) Finney, J. L. The Water Molecule and Its Interactions: The Interaction between Theory, Modelling, and Experiment. J. Mol. Liq. 2001, 90, 303-312. (4) Guillot, B. A Reappraisal of What We Have Learnt During Three Decades of Computer Simulations on Water. J. Mol. Liq. 2002, 101, 219-260. (5) Jorgensen, W. L.; Tirado-Rives, J. Potential Energy Functions for Atomic-Level Simulations of Water and Organic and Biomolecular Systems. Proc. Natl. Acad. Sci. U.S.A. 2005, 102, 6665-6670. (6) Wallqvist, A.; Mountain, R. D. Molecular Models of Water: Derivation and Description. Rev. Comput. Chem. 2007, 13, 183-247. (7) Ouyang, J. F.; Bettens, R. Modelling Water: A Lifetime Enigma. Chimia 2015, 69, 104111. (8) Cisneros, G. A.; Wikfeldt, K. T.; Ojamäe, L.; Lu, J.; Xu, Y.; Torabifard, H.; Bartok, A. P.; Csanyi, G.; Molinero, V.; Paesani, F. Modeling Molecular Interactions in Water: From Pairwise to Many-Body Potential Energy Functions. Chem. Rev. 2016, 116, 7501-7528. (9) Chaplin, M. www.lsbu.ac.uk/water/water_models.html (Accessed Apr-12, 2018) (10) Lindorff-Larsen, K.; Piana, S.; Palmo, K.; Maragakis, P.; Klepeis, J. L.; Dror, R. O.; Shaw, D. E. Improved Side-Chain Torsion Potentials for the Amber ff99SB Protein Force Field. Proteins: Struct., Funct., Genet. 2010, 78, 1950-1958. (11) Reif, M. M.; Hünenberger, P. H.; Oostenbrink, C. New Interaction Parameters for Charged Amino Acid Side Chains in the Gromos Force Field. J. Chem. Theory Comput. 2012, 8, 3705-3723. (12) Kaminski, G. A.; Friesner, R. A.; Tirado-Rives, J.; Jorgensen, W. L. Evaluation and Reparametrization of the OPLS-AA Force Field for Proteins Via Comparison with Accurate Quantum Chemical Calculations on Peptides. J. Phys. Chem. B 2001, 105, 6474-6487. (13) Jorgensen, W. L.; Chandrasekhar, J.; Madura, J. D.; Impey, R. W.; Klein, M. L. Comparison of Simple Potential Functions for Simulating Liquid Water. J. Chem. Phys. 1983, 79, 926-935. (14) Berendsen, H. J. C.; Postma, J. P. M.; van Gunsteren, W. F.; Hermans, J. Interaction Models for Water in Relation to Protein Hydration. In Intermolecular Forces, Pullman, B., Ed.; Springer Netherlands: Dordrecht, 1981, pp 331-342. (15) Berendsen, H. J. C.; Grigera, J. R.; Straatsma, T. P. The Missing Term in Effective Pair Potentials. J. Phys. Chem. 1987, 91, 6269-6271. (16) Mahoney, M. W.; Jorgensen, W. L. A Five-Site Model for Liquid Water and the Reproduction of the Density Anomaly by Rigid, Nonpolarizable Potential Functions. J. Chem. Phys. 2000, 112, 8910-8922.

ACS Paragon Plus Environment

38

Page 39 of 54 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Information and Modeling

(17) Hess, B.; van der Vegt, N. F. Hydration Thermodynamic Properties of Amino Acid Analogues: A Systematic Comparison of Biomolecular Force Fields and Water Models. J. Phys. Chem. B 2006, 110, 17616-17626. (18) Debiec, K. T.; Cerutti, D. S.; Baker, L. R.; Gronenborn, A. M.; Case, D. A.; Chong, L. T. Further Along the Road Less Traveled: Amber ff15ipq, an Original Protein Force Field Built on a Self-Consistent Physical Model. J. Chem. Theory Comput. 2016, 12, 3926-3947. (19) Takemura, K.; Kitao, A. Water Model Tuning for Improved Reproduction of Rotational Diffusion and Nmr Spectral Density. J. Phys. Chem. B 2012, 116, 6279-6287. (20) Best, R. B.; Zheng, W.; Mittal, J. Balanced Protein–Water Interactions Improve Properties of Disordered Proteins and Non-Specific Protein Association. J. Chem. Theory Comput. 2014, 10, 5113-5124. (21) Yeh, I.-C.; Hummer, G. System-Size Dependence of Diffusion Coefficients and Viscosities from Molecular Dynamics Simulations with Periodic Boundary Conditions. J. Phys. Chem. B 2004, 108, 15873-15879. (22) van der Spoel, D.; van Maaren, P. J.; Berendsen, H. J. A Systematic Study of Water Models for Molecular Simulation: Derivation of Water Models Optimized for Use with a Reaction Field. J. Chem. Phys. 1998, 108, 10220-10230. (23) Panteva, M. T.; Giambaşu, G. M.; York, D. M. Comparison of Structural, Thermodynamic, Kinetic and Mass Transport Properties of Mg2+ Ion Models Commonly Used in Biomolecular Simulations. J. Comput. Chem. 2015, 36, 970-982. (24) Rhee, Y. M.; Sorin, E. J.; Jayachandran, G.; Lindahl, E.; Pande, V. S. Simulations of the Role of Water in the Protein-Folding Mechanism. Proc. Natl. Acad. Sci. U.S.A. 2004, 101, 6456-6461. (25) Horn, H. W.; Swope, W. C.; Pitera, J. W.; Madura, J. D.; Dick, T. J.; Hura, G. L.; HeadGordon, T. Development of an Improved Four-Site Water Model for Biomolecular Simulations: TIP4P-Ew. J. Chem. Phys. 2004, 120, 9665-9678. (26) Rick, S. W. A Reoptimization of the Five-Site Water Potential (TIP5P) for Use with Ewald Sums. J. Chem. Phys. 2004, 120, 6085-6093. (27) Abascal, J. L. F.; Vega, C. A General Purpose Model for the Condensed Phases of Water: TIP4P/2005. J. Chem. Phys. 2005, 123, 234505. (28) Wang, L.-P.; Martinez, T. J.; Pande, V. S. Building Force Fields: An Automatic, Systematic, and Reproducible Approach. J. Phys. Chem. Lett. 2014, 5, 1885-1891. (29) Izadi, S.; Anandakrishnan, R.; Onufriev, A. V. Building Water Models: A Different Approach. J. Phys. Chem. Lett. 2014, 5, 3863-3871. (30) Izadi, S.; Onufriev, A. V. Accuracy Limit of Rigid 3-Point Water Models. J. Chem. Phys. 2016, 145, 074501. (31) Beauchamp, K. A.; Lin, Y.-S.; Das, R.; Pande, V. S. Are Protein Force Fields Getting Better? A Systematic Benchmark on 524 Diverse Nmr Measurements. J. Chem. Theory Comput. 2012, 8, 1409-1414. (32) Chandler, D. Interfaces and the Driving Force of Hydrophobic Assembly. Nature 2005, 437, 640-647. (33) Best, R. B.; Mittal, J. Protein Simulations with an Optimized Water Model: Cooperative Helix Formation and Temperature-Induced Unfolded State Collapse. J. Phys. Chem. B 2010, 114, 14916-14923.

ACS Paragon Plus Environment

39

Journal of Chemical Information and Modeling 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 40 of 54

(34) Nerenberg, P. S.; Head-Gordon, T. Optimizing Protein-Solvent Force Fields to Reproduce Intrinsic Conformational Preferences of Model Peptides. J. Chem. Theory Comput. 2011, 7, 1220-1230. (35) Hornak, V.; Abel, R.; Okur, A.; Strockbine, B.; Roitberg, A.; Simmerling, C. Comparison of Multiple Amber Force Fields and Development of Improved Protein Backbone Parameters. Proteins: Struct., Funct., Genet. 2006, 65, 712-725. (36) Best, Robert B.; de Sancho, D.; Mittal, J. Residue-Specific Α-Helix Propensities from Molecular Simulation. Biophys. J. 2012, 102, 1462-1467. (37) Knott, M.; Best, R. B. A Preformed Binding Interface in the Unbound Ensemble of an Intrinsically Disordered Protein: Evidence from Molecular Simulations. PLoS Comput. Biol. 2012, 8, e1002605. (38) Zerze, G. H.; Mittal, J.; Best, R. B. Diffusive Dynamics of Contact Formation in Disordered Polypeptides. Phys. Rev. Lett. 2016, 116, 068102. (39) Miller, C.; Zerze, G. l. H.; Mittal, J. Molecular Simulations Indicate Marked Differences in the Structure of Amylin Mutants, Correlated with Known Aggregation Propensity. J. Phys. Chem. B 2013, 117, 16066-16075. (40) Mittal, J.; Yoo, T. H.; Georgiou, G.; Truskett, T. M. Structural Ensemble of an Intrinsically Disordered Polypeptide. J. Phys. Chem. B 2012, 117, 118-124. (41) Wang, L.-P.; McKiernan, K. A.; Gomes, J.; Beauchamp, K. A.; Head-Gordon, T.; Rice, J. E.; Swope, W. C.; Martínez, T. J.; Pande, V. S. Building a More Predictive Protein Force Field: A Systematic and Reproducible Route to Amber-Fb15. J. Phys. Chem. B 2017, 121, 4023-4039. (42) Piana, S.; Donchev, A. G.; Robustelli, P.; Shaw, D. E. Water Dispersion Interactions Strongly Influence Simulated Structural Properties of Disordered Protein States. J. Phys. Chem. B 2015, 119, 5113-5123. (43) Rauscher, S.; Gapsys, V.; Gajda, M. J.; Zweckstetter, M.; de Groot, B. L.; Grubmüller, H. Structural Ensembles of Intrinsically Disordered Proteins Depend Strongly on Force Field: A Comparison to Experiment. J. Chem. Theory Comput. 2015, 11, 5513-5524. (44) Abriata, L. A.; Dal Peraro, M. Assessing the Potential of Atomistic Molecular Dynamics Simulations to Probe Reversible Protein-Protein Recognition and Binding. Sci. Rep. 2015, 5, 10549 (45) Petrov, D.; Zagrovic, B. Are Current Atomistic Force Fields Accurate Enough to Study Proteins in Crowded Environments? PLoS Comput. Biol. 2014, 10, e1003638. (46) Nawrocki, G.; Wang, P.-h.; Yu, I.; Sugita, Y.; Feig, M. Slow-Down in Diffusion in Crowded Protein Solutions Correlates with Transient Cluster Formation. J. Phys. Chem. B 2017, 121, 11072-11084. (47) Shirts, M. R.; Pande, V. S. Solvation Free Energies of Amino Acid Side Chain Analogs for Common Molecular Mechanics Water Models. J. Chem. Phys. 2005, 122, 134508. (48) Shirts, M. R.; Pitera, J. W.; Swope, W. C.; Pande, V. S. Extremely Precise Free Energy Calculations of Amino Acid Side Chain Analogs: Comparison of Common Molecular Mechanics Force Fields for Proteins. J. Chem. Phys. 2003, 119, 5740-5761. (49) Nerenberg, P. S.; Jo, B.; So, C.; Tripathy, A.; Head-Gordon, T. Optimizing Solute–Water Van Der Waals Interactions to Reproduce Solvation Free Energies. J. Phys. Chem. B 2012, 116, 4524-4534. (50) Maier, J. A.; Martinez, C.; Kasavajhala, K.; Wickstrom, L.; Hauser, K. E.; Simmerling, C. Ff14sb: Improving the Accuracy of Protein Side Chain and Backbone Parameters from Ff99sb. J. Chem. Theory Comput. 2015, 11, 3696-3713.

ACS Paragon Plus Environment

40

Page 41 of 54 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Information and Modeling

(51) Zhang, H.; Lv, Y.; Tan, T.; van der Spoel, D. Atomistic Simulation of Protein Encapsulation in Metal–Organic Frameworks. J. Phys. Chem. B 2016, 120, 477-484. (52) Germann, M. W.; Turner, T.; Allison, S. A. Translational Diffusion Constants of the Amino Acids: Measurement by Nmr and Their Use in Modeling the Transport of Peptides. J. Phys. Chem. A 2007, 111, 1452-1455. (53) Best, R. B.; Hummer, G. Optimized Molecular Dynamics Force Fields Applied to the Helix−Coil Transition of Polypeptides. J. Phys. Chem. B 2009, 113, 9004-9015. (54) Duan, Y.; Wu, C.; Chowdhury, S.; Lee, M. C.; Xiong, G.; Zhang, W.; Yang, R.; Cieplak, P.; Luo, R.; Lee, T. A Point-Charge Force Field for Molecular Mechanics Simulations of Proteins Based on Condensed-Phase Quantum Mechanical Calculations. J. Comput. Chem. 2003, 24, 1999-2012. (55) Case, D. A.; Cerutti, D. S.; Cheatham, T. E. I.; Darden, T. A.; Duke, R. E.; Giese, T. J.; Gohlke, H.; Goetz, A. W.; Izadi, S.; Janowski, P.; Kaus, J.; Kovalenko, A.; Lee, T. S.; LeGrand, S.; Li, P.; Lin, C.; Luo, R.; Madej, B.; Mermelstein, D.; Merz, K. M.; Monard, G.; Nguyen, H.; Nguyen, H. T.; Omelyan, A. O.; Roe, D. R.; Roitberg, A.; Sagui, C.; Simmerling, C. L.; BotelloSmith, W. M.; Swails, J.; Walker, R. C.; Wolf, R. M.; Wu, X.; Xiao, L.; Kollman, P. Ambertools 16, University of California, San Francisco, 2016. (56) van der Spoel, D.; Lindahl, E.; Hess, B.; Groenhof, G.; Mark, A. E.; Berendsen, H. J. C. GROMACS: Fast, Flexible, and Free. J. Comput. Chem. 2005, 26, 1701-1718. (57) Pronk, S.; Páll, S.; Schulz, R.; Larsson, P.; Bjelkmar, P.; Apostolov, R.; Shirts, M. R.; Smith, J. C.; Kasson, P. M.; van der Spoel, D.; Hess, B.; Lindahl, E. GROMACS 4.5: A HighThroughput and Highly Parallel Open Source Molecular Simulation Toolkit. Bioinformatics 2013, 29, 845-854. (58) Hess, B.; Kutzner, C.; van der Spoel, D.; Lindahl, E. GROMACS 4: Algorithms for Highly Efficient, Load-Balanced, and Scalable Molecular Simulation. J. Chem. Theory Comput. 2008, 4, 435-447. (59) Abraham, M. J.; Murtola, T.; Schulz, R.; Páll, S.; Smith, J. C.; Hess, B.; Lindahl, E. GROMACS: High Performance Molecular Simulations through Multi-Level Parallelism from Laptops to Supercomputers. SoftwareX 2015, 1, 19-25. (60) Essmann, U.; Perera, L.; Berkowitz, M. L.; Darden, T.; Lee, H.; Pedersen, L. G. A Smooth Particle Mesh Ewald Method. J. Chem. Phys. 1995, 103, 8577-8593. (61) Darden, T.; York, D.; Pedersen, L. Particle Mesh Ewald: An N.Log(N) Method for Ewald Sums in Large Systems. J. Chem. Phys. 1993, 98, 10089-10092. (62) Bussi, G.; Donadio, D.; Parrinello, M. Canonical Sampling through Velocity Rescaling. J Chem Phys 2007, 126. (63) Parrinello, M.; Rahman, A. Polymorphic Transitions in Single Crystals: A New Molecular Dynamics Method. J. Appl. Phys. 1981, 52, 7182-7190. (64) Nose, S.; Klein, M. L. Constant Pressure Molecular Dynamics for Molecular Systems. Mol. Phys. 1983, 50, 1055-1076. (65) Miyamoto, S.; Kollman, P. A. Settle–an Analytical Version of the Shake and Rattle Algorithm for Rigid Water Models. J. Comput. Chem. 1992, 13, 952-962. (66) Spångberg, D.; Hermansson, K. Effective Three-Body Potentials for Li+(Aq) and Mg2+(Aq). J. Chem. Phys. 2003, 119, 7263-7281. (67) Tazi, S.; Boţan, A.; Salanne, M.; Marry, V.; Turq, P.; Rotenberg, B. Diffusion Coefficient and Shear Viscosity of Rigid Water Models. J. Phys.: Condens. Matter 2012, 24, 284117.

ACS Paragon Plus Environment

41

Journal of Chemical Information and Modeling 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 42 of 54

(68) Dünweg, B.; Kremer, K. Molecular Dynamics Simulation of a Polymer Chain in Solution. J. Chem. Phys. 1993, 99, 6983-6997. (69) Hess, B. Determining the Shear Viscosity of Model Liquids from Molecular Dynamics Simulations. J. Chem. Phys. 2002, 116, 209-217. (70) Einstein, A. On the Motion of Small Particles Suspended in Liquids at Rest Required by the Molecular-Kinetic Theory of Heat. Annalen der physik 1905, 17, 549-560. (71) R Core Team. R: A Language and Environment for Statistical Computing, version 3.2.3; R Foundation for Statistical Computing: Vienna, Austria, 2015. (72) Neumann, M.; Steinhauser, O.; Pawley, G. S. Consistent Calculation of the Static and Frequency-Dependent Dielectric Constant in Computer Simulations. Mol. Phys. 1984, 52, 97113. (73) Neumann, M. Dipole Moment Fluctuation Formulas in Computer Simulations of Polar Systems. Mol. Phys. 1983, 50, 841-858. (74) Abascal, J. L. F.; Vega, C. The Water Forcefield:  Importance of Dipolar and Quadrupolar Interactions. J. Phys. Chem. C 2007, 111, 15811-15822. (75) van der Spoel, D.; Wensink, E. J.; Hoffmann, A. C. Lifting a Wet Glass from a Table: A Microscopic Picture. Langmuir 2006, 22, 5666-5672. (76) Van Gunsteren, W.; Berendsen, H. A Leap-Frog Algorithm for Stochastic Dynamics. Mol. Simulat. 1988, 1, 173-185. (77) Hess, B.; Bekker, H.; Berendsen, H. J. C.; Fraaije, J. G. E. M. Lincs: A Linear Constraint Solver for Molecular Simulations. J. Comput. Chem. 1997, 18, 1463-1472. (78) Zhang, H.; Jiang, Y.; Yan, H.; Cui, Z.; Yin, C. Comparative Assessment of Computational Methods for Free Energy Calculations of Ionic Hydration. J. Chem. Inf. Model. 2017, 57, 27632775. (79) Lamoureux, G.; Roux, B. Absolute Hydration Free Energy Scale for Alkali and Halide Ions Established from Simulations with a Polarizable Force Field. J. Phys. Chem. B 2006, 110, 3308-3322. (80) Misin, M.; Fedorov, M. V.; Palmer, D. S. Hydration Free Energies of Molecular Ions from Theory and Simulation. J. Phys. Chem. B 2016, 120, 975-983. (81) Simonson, T.; Hummer, G.; Roux, B. Equivalence of M- and P-Summation in Calculations of Ionic Solvation Free Energies. J. Phys. Chem. A 2017, 121, 1525-1530. (82) Hünenberger, P.; Reif, M., Single-Ion Solvation: Experimental and Theoretical Approaches to Elusive Thermodynamic Quantities. Royal Society of Chemistry: Cambridge, 2011. (83) Zhang, H.; Jiang, Y.; Yan, H.; Yin, C.; Tan, T.; van der Spoel, D. Free-Energy Calculations of Ionic Hydration Consistent with the Experimental Hydration Free Energy of the Proton. J. Phys. Chem. Lett. 2017, 8, 2705-2712. (84) Li, P.; Song, L. F.; Merz Jr, K. M. Systematic Parameterization of Monovalent Ions Employing the Nonbonded Model. J. Chem. Theory Comput. 2015, 11, 1645-1657. (85) Randles, J. The Real Hydration Energies of Ions. Trans. Faraday Soc. 1956, 52, 15731581. (86) Zhang, H.; Ge, C.; van der Spoel, D.; Feng, W.; Tan, T. Insight into the Structural Deformations of Beta-Cyclodextrin Caused by Alcohol Cosolvents and Guest Molecules. J. Phys. Chem. B 2012, 116, 3880-3889.

ACS Paragon Plus Environment

42

Page 43 of 54 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Information and Modeling

(87) Mason, P.; Neilson, G.; Dempsey, C.; Barnes, A.; Cruickshank, J. The Hydration Structure of Guanidinium and Thiocyanate Ions: Implications for Protein Stability in Aqueous Solution. Proc. Natl. Acad. Sci. U.S.A. 2003, 100, 4557-4561. (88) Steiner, T. The Hydrogen Bond in the Solid State. Angew. Chem. Int. Ed. 2002, 41, 4876. (89) Jensen, K. P.; Jorgensen, W. L. Halide, Ammonium, and Alkali Metal Ion Parameters for Modeling Aqueous Solutions. J. Chem. Theory Comput. 2006, 2, 1499-1509. (90) Marcus, Y. Ionic Radii in Aqueous Solutions. Chem. Rev. 1988, 88, 1475-1498. (91) Caminiti, R.; Cucca, P.; Monduzzi, M.; Saba, G.; Crisponi, G. Divalent Metal–Acetate Complexes in Concentrated Aqueous Solutions. An X-Ray Diffraction and Nmr Spectroscopy Study. J. Chem. Phys. 1984, 81, 543-551. (92) Mills, R. Self-Diffusion in Normal and Heavy Water in the Range 1-45. Deg. J. Phys. Chem. 1973, 77, 685-688. (93) Badyal, Y.; Saboungi, M.-L.; Price, D.; Shastri, S.; Haeffner, D.; Soper, A. Electron Distribution in Water. J. Chem. Phys. 2000, 112, 9206-9208. (94) Pátek, J.; Hrubý, J.; Klomfar, J.; Součková, M.; Harvey, A. H. Reference Correlations for Thermophysical Properties of Liquid Water at 0.1 Mpa. J. Phys. Chem. Ref. Data 2009, 38, 2129. (95) Harris, K. R.; Woolf, L. A. Temperature and Volume Dependence of the Viscosity of Water and Heavy Water at Low Temperatures. J. Chem. Eng. Data 2004, 49, 1064-1069. (96) Harris, K. R.; Woolf, L. A. Temperature and Volume Dependence of the Viscosity of Water and Heavy Water at Low Temperatures. J. Chem. Eng. Data 2004, 49, 1851-1851. (97) Vargaftik, N.; Volkov, B.; Voljak, L. International Tables of the Surface Tension of Water. J. Phys. Chem. Ref. Data 1983, 12, 817-820. (98) Guevara-Carrion, G.; Vrabec, J.; Hasse, H. Prediction of Self-Diffusion Coefficient and Shear Viscosity of Water and Its Binary Mixtures with Methanol and Ethanol by Molecular Simulation. J. Chem. Phys. 2011, 134, 074508. (99) Mao, Y.; Zhang, Y. Thermal Conductivity, Shear Viscosity and Specific Heat of Rigid Water Models. Chem. Phys. Lett. 2012, 542, 37-41. (100) González, M. A.; Abascal, J. L. The Shear Viscosity of Rigid Water Models. J. Chem. Phys. 2010, 132, 096101. (101) Fanourgakis, G. S.; Medina, J.; Prosmiti, R. Determining the Bulk Viscosity of Rigid Water Models. J. Phys. Chem. A 2012, 116, 2564-2570. (102) Song, Y.; Dai, L. L. The Shear Viscosities of Common Water Models by NonEquilibrium Molecular Dynamics Simulations. Mol. Simulat. 2010, 36, 560-567. (103) Zubillaga, R. A.; Labastida, A.; Cruz, B.; Martínez, J. C.; Sánchez, E.; Alejandre, J. Surface Tension of Organic Liquids Using the OPLS/AA Force Field. J. Chem. Theory Comput. 2013, 9, 1611-1615. (104) Fischer, N. M.; van Maaren, P. J.; Ditz, J. C.; Yildirim, A.; van der Spoel, D. Properties of Organic Liquids When Simulated with Long-Range Lennard-Jones Interactions. J. Chem. Theory Comput. 2015, 11, 2938-2944. (105) Clough, S. A.; Beers, Y.; Klein, G. P.; Rothman, L. S. Dipole Moment of Water from Stark Measurements of H2o, Hdo, and D2o. J. Chem. Phys. 1973, 59, 2254-2259. (106) van der Spoel, D.; van Maaren, P. J. The Origin of Layer Structure Artifacts in Simulations of Liquid Water. J. Chem. Theory Comput. 2006, 2, 1-11.

ACS Paragon Plus Environment

43

Journal of Chemical Information and Modeling 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 44 of 54

(107) Paluch, M. Surface Potential at the Water-Air Interface. Annales Universitatis Mariae Curie-Sklodowska, sectio AA–Chemia 2016, 70, 1. (108) Brodskaya, E.; Zakharov, V. Computer Simulation Study of the Surface Polarization of Pure Polar Liquids. J. Chem. Phys. 1995, 102, 4595-4599. (109) Wilson, M. A.; Pohorille, A.; Pratt, L. R. Comment on ‘‘Study on the Liquid–Vapor Interface of Water. I. Simulation Results of Thermodynamic Properties and Orientational Structure’’. J. Chem. Phys. 1989, 90, 5211-5213. (110) Wolfenden, R.; Andersson, L.; Cullis, P. M.; Southgate, C. C. B. Affinities of Amino Acid Side Chains for Solvent Water. Biochemistry 1981, 20, 849-855. (111) Sitkoff, D.; Sharp, K. A.; Honig, B. Accurate Calculation of Hydration Free Energies Using Macroscopic Solvent Models. J. Phys. Chem. 1994, 98, 1978-1988. (112) Chakrabarti, P. Geometry of Interaction of Metal Ions with Histidine Residues in Protein Structures. Protein Eng., Des. Sel. 1990, 4, 57-63. (113) Jensen, K. P. Improved Interaction Potentials for Charged Residues in Proteins. J. Phys. Chem. B 2008, 112, 1820-1827. (114) Cornell, W. D.; Cieplak, P.; Bayly, C. I.; Gould, I. R.; Merz, K. M.; Ferguson, D. M.; Spellmeyer, D. C.; Fox, T.; Caldwell, J. W.; Kollman, P. A. A Second Generation Force Field for the Simulation of Proteins, Nucleic Acids, and Organic Molecules. J. Am. Chem. Soc. 1995, 117, 5179-5197. (115) Bayly, C. I.; Cieplak, P.; Cornell, W.; Kollman, P. A. A Well-Behaved Electrostatic Potential Based Method Using Charge Restraints for Deriving Atomic Charges: The Resp Model. J. Phys. Chem. 1993, 97, 10269-10280. (116) Wang, J.; Cieplak, P.; Kollman, P. A. How Well Does a Restrained Electrostatic Potential (Resp) Model Perform in Calculating Conformational Energies of Organic and Biological Molecules? J. Comput. Chem. 2000, 21, 1049-1074. (117) Cerutti, D. S.; Rice, J. E.; Swope, W. C.; Case, D. A. Derivation of Fixed Partial Charges for Amino Acids Accommodating a Specific Water Model and Implicit Polarization. J. Phys. Chem. B 2013, 117, 2328-2338. (118) Frisch, M. J.; Trucks, G. W.; Schlegel, H. B.; Scuseria, G. E.; Robb, M. A.; Cheeseman, J. R.; Scalmani, G.; Barone, V.; Mennucci, B.; Petersson, G. A.; Nakatsuji, H.; Caricato, M.; Li, X.; Hratchian, H. P.; Izmaylov, A. F.; Bloino, J.; Zheng, G.; Sonnenberg, J. L.; Hada, M.; Ehara, M.; Toyota, K.; Fukuda, R.; Hasegawa, J.; Ishida, M.; Nakajima, T.; Honda, Y.; Kitao, O.; Nakai, H.; Vreven, T.; Montgomery, J. A.; Peralta, J. E.; Ogliaro, F.; Bearpark, M.; Heyd, J. J.; Brothers, E.; Kudin, K. N.; Staroverov, V. N.; Kobayashi, R.; Normand, J.; Raghavachari, K.; Rendell, A.; Burant, J. C.; Iyengar, S. S.; Tomasi, J.; Cossi, M.; Rega, N.; Millam, J. M.; Klene, M.; Knox, J. E.; Cross, J. B.; Bakken, V.; Adamo, C.; Jaramillo, J.; Gomperts, R.; Stratmann, R. E.; Yazyev, O.; Austin, A. J.; Cammi, R.; Pomelli, C.; Ochterski, J. W.; Martin, R. L.; Morokuma, K.; Zakrzewski, V. G.; Voth, G. A.; Salvador, P.; Dannenberg, J. J.; Dapprich, S.; Daniels, A. D.; Farkas, O.; Foresman, J. B.; Ortiz, J. V.; Cioslowski, J.; Fox, D. J., In; Gaussian 09, Revision D.01. Wallingford CT, 2013. (119) Jiang, Y.; Zhang, H.; Tan, T. Rational Design of Methodology-Independent Metal Parameters Using a Nonbonded Dummy Model. J. Chem. Theory Comput. 2016, 12, 3250-3260. (120) Li, P.; Merz Jr, K. M. Taking into Account the Ion-Induced Dipole Interaction in the Nonbonded Model of Ions. J. Chem. Theory Comput. 2014, 10, 289-297.

ACS Paragon Plus Environment

44

Page 45 of 54 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Information and Modeling

(121) Duarte, F.; Bauer, P.; Barrozo, A.; Amrein, B. A.; Purg, M.; Åqvist, J.; Kamerlin, S. C. L. Force Field Independent Metal Parameters Using a Nonbonded Dummy Model. J. Phys. Chem. B 2014, 118, 4351-4362. (122) Ponder, J. W.; Wu, C.; Ren, P.; Pande, V. S.; Chodera, J. D.; Schnieders, M. J.; Haque, I.; Mobley, D. L.; Lambrecht, D. S.; DiStasio Jr, R. A. Current Status of the Amoeba Polarizable Force Field. J. Phys. Chem. B 2010, 114, 2549-2564. (123) Lopes, P. E.; Huang, J.; Shim, J.; Luo, Y.; Li, H.; Roux, B.; MacKerell Jr, A. D. Polarizable Force Field for Peptides and Proteins Based on the Classical Drude Oscillator. J. Chem. Theory Comput. 2013, 9, 5430-5449. (124) Lemkul, J. A.; Huang, J.; Roux, B.; MacKerell Jr, A. D. An Empirical Polarizable Force Field Based on the Classical Drude Oscillator Model: Development History and Recent Applications. Chem. Rev. 2016, 116, 4983-5013.

ACS Paragon Plus Environment

45

Journal of Chemical Information and Modeling 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 46 of 54

Table of Contents Image Used Only Force Field Benchmark of Amino Acids: I. Hydration and Diffusion in Different Water Models Haiyang Zhang, Chunhua Yin, Yang Jiang, and David van der Spoel*

ACS Paragon Plus Environment

46

Page 47 of 54 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Information and Modeling

Box dependent diffusion constants of water calculated from MD simulations with varied box lengths (L) under periodic boundary conditions (PBC) for early generation (a) and latest (b) water models. TIP4P/2005 is given in both panels for comparison. The solid lines indicate linear fits of data points; the intercept (i.e., L approaches to infinity) gives the diffusion constant at infinite dilution and the slope is used to estimate the viscosity of water models (eq 2). 118x168mm (300 x 300 DPI)

ACS Paragon Plus Environment

Journal of Chemical Information and Modeling 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Correlation of calculated water diffusion coefficients with the corresponding shear viscosities of the solvent water (blue circles) and with the root-mean-square error (RMSE, red triangles) of calculated amino acid diffusion coefficients from experimental observations at infinite dilution. Solid lines indicate linear fits of data points, R is correlation coefficient, and RMSE is for the calculations with ff99SB-ILDN. 58x40mm (600 x 600 DPI)

ACS Paragon Plus Environment

Page 48 of 54

Page 49 of 54 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Information and Modeling

Comparison of calculated real hydration free energies using ff99SB-ILDN for (a) neutral and (b) charged (b) amino acid side chain analogues in different water models with experimental observations. The amino acid with a large discrepancy in calculated hydration free energies with varied water models are marked by its one-letter-code (R stands for Gud+ here). 83x116mm (300 x 300 DPI)

ACS Paragon Plus Environment

Journal of Chemical Information and Modeling 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Radial distribution functions (RDFs) for (a) nitrogen of Arg+ and (b) oxygen atoms of Asp- around water (HOH) oxygen atoms computed from 10-ns MD simulations with ff99SB-ILDN in TIP3P (solid lines), TIP4P/2005 (dotted), and TIP5P-Ew (dash-dotted) water. Unnormalized RDFs are presented here describing water number density (ρ) around charged amino acids; ρ approaches to 33.4 nm-3 corresponding to a pure water density of 1 g/mL. Inserts are enlarged view of water number density at r = 0.25~0.35 nm. 116x161mm (300 x 300 DPI)

ACS Paragon Plus Environment

Page 50 of 54

Page 51 of 54 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Information and Modeling

Calculated and experimental diffusion constants of amino acids at infinite dilution as a function of molar mass of the amino acids with ff99SB-ILDN. Solid lines are fits to a power function indicative of a relationship with the molar mass. Data points for TIP4P-Ew, TIP5P, and TIP5P-Ew are close to that of SPC/E and TIP4P/2005 and not given here for clarity. 61x44mm (300 x 300 DPI)

ACS Paragon Plus Environment

Journal of Chemical Information and Modeling 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Comparison of dipole (a and b), atomic charge (c and d), and hydration free energy in TIP3P (e and f) for the constructed neutral amino acid side chain analogues by the method of Cβ modification and QM full ESP calculations for Amber ff99SB-ILDN (top) and ff03 (bottom). 66x29mm (300 x 300 DPI)

ACS Paragon Plus Environment

Page 52 of 54

Page 53 of 54 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Information and Modeling

Overall performance of the examined force field/water combination as indicated by (a) root-mean-square error (RMSE) and (b) mean signed error (MSE) for hydration free energies (∆Ghyd) of neutral amino acid side chains as well as by the MSE for diffusion constants (D0,AA) of neutral zwitterionic amino acids at infinite dilution compared to the experiment. Because all the water models overestimate amino acid diffusions, the RMSE for D0,AA is almost identical to the corresponding MSE. 101x122mm (300 x 300 DPI)

ACS Paragon Plus Environment

Journal of Chemical Information and Modeling 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Table of content graphic 39x21mm (300 x 300 DPI)

ACS Paragon Plus Environment

Page 54 of 54