Fixed-Charge Atomistic Force Fields for Molecular ... - ACS Publications

The empirical functional form of classical fixed-charge force fields dates back to 1969 and remains essentially unchanged. In a fixed-charge force fie...
1 downloads 0 Views 1MB Size
Review Cite This: J. Chem. Inf. Model. 2018, 58, 565−578

pubs.acs.org/jcim

Fixed-Charge Atomistic Force Fields for Molecular Dynamics Simulations in the Condensed Phase: An Overview Sereina Riniker* Laboratory of Physical Chemistry, ETH Zürich, Vladimir-Prelog-Weg 2, 8093 Zürich, Switzerland ABSTRACT: In molecular dynamics or Monte Carlo simulations, the interactions between the particles (atoms) in the system are described by a so-called force field. The empirical functional form of classical fixed-charge force fields dates back to 1969 and remains essentially unchanged. In a fixed-charge force field, the polarization is not modeled explicitly, i.e. the effective partial charges do not change depending on conformation and environment. This simplification allows, however, a dramatic reduction in computational cost compared to polarizable force fields and in particular quantum-chemical modeling. The past decades have shown that simulations employing carefully parametrized fixed-charge force fields can provide useful insights into biological and chemical questions. This overview focuses on the four major force-field families, i.e. AMBER, CHARMM, GROMOS, and OPLS, which are based on the same classical functional form and are continuously improved to the present day. The overview is aimed at readers entering the field of (bio)molecular simulations. More experienced users may find the comparison and historical development of the force-field families interesting. KEYWORDS: Molecular dynamics simulations, Force fields, AMBER, CHARMM, GROMOS, OPLS



INTRODUCTION Classical molecular dynamics (MD) simulations can be used to study processes involving proteins, lipids, nucleic acids, carbohydrates and small organic molecules. The size range of the molecules involved and the time scales of the processes of interest make it necessary to treat the molecules at the atomistic level of resolution (or even lower), i.e. each atom in a molecule is represented by a point in space with mass, (partial) charge, and van der Waals (vdW) parameters. Thus, the electrons are not considered explicitly and the system is assumed to be in the electronic ground state. In this case, the dynamics of the system can be described by Newton’s equations of motion. The interactions between the atoms in the system are defined via a potential energy function of the atomic coordinates also called a force f ield. The classical (bio)molecular force fields still use essentially the same empirical expression for the potential energy as proposed by Levitt and Lifson in 1969:1

bonds and bond angles. The force-field terms are shown schematically in Figure 1. In the early versions of the force fields, additional hydrogen-bonding interaction terms were employed,2−4 which were dropped, however, soon afterward. Despite the general importance of polarization effects, historically,5 the partial charges of the atoms were fixed during simulations, due to the high computational cost and added complexity associated with polarizable force fields. As fixedcharge force fields have shown to be surprisingly successful, we will focus in the following on this class of force fields and give an overview about the major force-field families, including parametrization strategies and historical development. For other reviews on force fields, see e.g. refs 6−11. Each interaction term consists of a sum over terms, which involve one to four atoms and a few parameters. This means that for example a system with 10 000 atoms would require in principle the definition of a multiple of 10 000 parameters. Classical fixed-charge force fields are effective force fields, i.e. the parameters are fitted empirically to experimental data and/ or data from higher-level quantum-mechanical (QM) calculations. To simplify the problem and to limit the number of parameters, the assumption of transferability is invoked, i.e. similar substructures may share some of their parameters. Based on this assumption, the parameters are generally determined for small molecules and subsequently used in larger ones. For organic liquids, experimental data for thermodynamic proper-

Epot( r ⃗) = Ebond( r ⃗) + Eangle( r ⃗) + Etorsion( r ⃗) + Eimproper( r ⃗) + E ele( r ⃗) + EvdW ( r ⃗)

(1)

where Epot is the potential-energy function of the system and r ⃗ are the coordinates of all atoms in the system. Ebond, Eangle, Etorsion, and Eimproper are so-called bonded or covalent terms for bond stretching, bond-angle bending, dihedral-angle torsion, and improper dihedral-angle bending (or out-of-plane distortions) in the molecules. The nonbonded terms Eele and EvdW describe the Coulomb (electrostatic) and van der Waals (vdW) interactions between atoms not directly connected via covalent © 2018 American Chemical Society

Received: January 25, 2018 Published: March 6, 2018 565

DOI: 10.1021/acs.jcim.8b00042 J. Chem. Inf. Model. 2018, 58, 565−578

Review

Journal of Chemical Information and Modeling

Figure 1. Schematic illustration of the terms in a classical fixed-charge force field, i.e. bond stretching (Ebond), bond-angle bending (Eangle), dihedralangle torsion (Etorsion), and improper dihedral-angle bending (Eimproper) as well as van der Waals (EvdW) and electrostatic (Eele) interactions.

Table 1. List of Fixed-Charge Atomistic Force Fields Used in MD Simulations of Proteins, Nucleic Acids, Lipids, Carbohydrates, and Small Organic Molecules Including the Main Publicationsa family AMBER

molecule type amino acids, nucleic acids

small molecules lipids carbohydrates

CHARMM

amino acids

amino acids, nucleic acids, lipids, carbohydrates nucleic acids

GROMOS

lipids small molecules carbohydrates amino acids amino acids, nucleic acids amino acids, nucleic acids, lipids

lipids small molecules carbohydrates

OPLS

small molecules, amino acids

carbohydrates

version

year

particle type

main water model

ff84 ff8615 ff9413 ff9916 ff99SB17 ff14SB18 GAFF19 GAFFlipid20 Lipid1421 GLYCAM_9322 GLYCAM200023 GLYCAM0624 CHARMM43,25 CHARMM1926 CHARMM2214 CHARMM3627

1984 1986 1994 1999 2006 2014 2004 2012 2014 1995 2001 2007 1983 1985 1992 2012

UA AA AA AA AA AA AA AA AA AA AA AA UA UA AA AA

implicit implicit TIP3P TIP3P TIP3P TIP3P TIP3P TIP3P TIP3P TIP3P TIP3P TIP3P ST2 TIP3P TIP3P TIP3P

CHARMM2228 CHARMM2729,30 (in CHARMM36)31 CGenFF32−34 (in CHARMM36)35 26C12,25 37C436−38 43A139 45A340,41 53A5/53A642 54A743 54A844 (in 54A7)45 ATB1.046 ATB2.047 53A6CARBO48 53A6GLYC49 56A6CARBO_R50 OPLS12 OPLS-AA51 OPLS-AA/L52 OPLS_200553 OPLS2.054 OPLS355 OPLS-AA56

1995 2000 2010 2010 2011 1982 1984 1996 2001 2004 2011 2012 2009 2011 2014 2010 2012 2016 1988 1996 2001 2005 2012 2016 1997

AA AA AA AA AA UA UA UA UA UA UA UA UA UA or AA UA or AA UA UA UA UA AA AA AA AA AA AA

TIP3P TIP3P TIP3P TIP3P TIP3P SPC SPC SPC SPC SPC SPC SPC SPC SPC SPC SPC SPC SPC TIP3P, TIP4P, SPC TIP3P, TIP4P − SPC, TIP3P, TIP4P SPC SPC −

4

For each force field version, the year in which the community started to use it is given (this is often before the corresponding paper was published). The particle type is either all-atom (AA), where all atoms are explicitly present, or united-atom (UA), where the hydrogens of aliphatic CHn groups are not explicitly present. a

566

DOI: 10.1021/acs.jcim.8b00042 J. Chem. Inf. Model. 2018, 58, 565−578

Review

Journal of Chemical Information and Modeling

43A1, a quartic functional form is used in GROMOS by default for computational efficiency

ties as a function of temperature (and less so pressure) are available and high-quality QM calculations are possible. However, even with these simplifications, force-field parametrization remains a heavily underdetermined problem, which means that different combinations of parameters may be found that fulfill the set of target properties equally well. The four major families of (bio)molecular force fields AMBER, CHARMM, GROMOS, and OPLS all share the same functional form shown in eq 1 and are summarized in Table 1. The majority of these force fields have started historically with amino acids (and nucleic acids) and were later expanded to other types of molecules such as lipids, carbohydrates, and organic molecules. The exception is OPLS, which started with organic liquids. Biomolecules (amino acids, nucleic acids, lipids, and carbohydrates) consist of a relatively small and closed number of building blocks, which simplifies parametrization. More challenging is the vast chemical space of organic molecules, requiring flexible parametrization strategies. AMBER, CHARMM, and GROMOS are therefore divided historically into a biomolecular force field and a force field for small molecules. In general, the small-molecule versions of the force fields (i.e., GAFF, CGenFF, GROMOS ATB) should not be used for the simulation of proteins and other biomolecules. Especially in the beginning of force-field development, the different forcefield families influenced each other substantially. For example, the original OPLS force field took over the bonded terms from AMBER.12 In turn, the vdW parameters of OPLS were incorporated in the ff94 version of AMBER.13 In CHARMM22, the parameters for the aromatic side chains were largely taken from OPLS.14 Historically, all force-field families started with a united-atom (UA) representation for the aliphatic CHn groups for simplicity and reduction in computational cost. During the late 1980s and early 1990s, the developers of all force-field families except GROMOS decided to switch to an all-atom (AA) representation. Historically, the parameters of a force field were adapted to a chosen water model. Also in later versions, some of the parameters may depend on the water model used as parametrization and/or validation generally involve the comparison of calculated hydration free energies of small organic molecules with experimental values. Switching from the simple three-site models to more sophisticated water models is therefore not trivial. Note that the OPLS force fields OPLS_2005, OPLS2.0, and OPLS3 are part of commercial software, while OPLS-AA is still actively maintained and freely distributed by the Jorgensen group. The other force-field families are publicly available. The review is organized as follows. First, the different bonded and nonbonded force-field terms are discussed in detail. The vdW and electrostatic interactions are thereby discussed in separate sections. Second, future developments are highlighted, followed by conclusions and guidelines.

Vibond(di) =

V iangle(θi) =

1 a,harm Ki [θi − θ0, i]2 2

(4)

where θi is the current value of bond angle i and θ0,i is the reference bond angle. Since version 43A1, a computationally more efficient form is used in GROMOS by default V iangle(θi) =

1 a,cos K i [cos(θi) − cos(θ0, i)]2 2

(5)

with a defined relation between Ka,cos and Ka,harm (see ref 59). i i Note that this functional form is a poor choice for linear molecules as the derivative is close to zero for θ ≈ 0 or π. In order to better fit the vibrational spectra, the CHARMM nucleic acids force field involves additional bond-angle terms using the Urey−Bradley (UB) form14,28 ViUB(Si) = K iUB[Si − S0, i]2

(6)

where Si is the 1,3-distance of the UB term i and S0,i is the reference 1,3-distance. Bond-angle bending terms also belong to the hard degrees of freedom, although they are less “hard” than bond-stretching terms. Bond-angle constraints should not be used in practice (with the exception of fully rigid molecules) as they affect torsional dihedral-angle distributions and barriers and introduce dynamic and thermodynamic artifacts.2,60,61 Atoms that are connected via bonds or bond angles (first and second neighbors) are excluded from nonbonded interactions. Parametrization. Bond stretching and bond-angle bending terms can be parametrized in a relatively straightforward manner. Historically, the parametrization was based on crystallographic and microwave data as well as vibrational frequencies.3,4,25,62 For example, the bond-length and bondangle parameters in GROMOS 26C1 and CHARMM4 were

BOND STRETCHING AND BOND-ANGLE BENDING Functional Form. The harmonic functional form to represent bond stretching is used in AMBER, CHARMM, OPLS, and the GROMOS 37C4 force field 1 b,harm Ki [di − d0, i]2 2

(3)

which is equivalent to the harmonic form close to d0 and with Kb,quartic = Kb,harm(2d20)−1. Use of a harmonic (or quartic) functional form prohibits bond dissociation. More complex functional forms to represent the anharmonicity of bond stretching have been investigated, however, as these are computationally more expensive and the high-frequency bond-stretching vibrations have little influence on the dynamics of biomolecules, the simple forms are commonly used. Bondstretching terms are so-called “hard” degrees of freedom, i.e. a small deviation from r0 will result in a large increase in the potential energy compared to kBT (where kB is the Boltzmann constant and T the temperature). These terms thus dictate the maximum time step that can be chosen for stable integration of the equations of motion (Δt = 0.5 fs). Therefore, in practice, either the bonds involving hydrogens or all bonds are constrained to their ideal values using algorithms such as SHAKE57 or LINCS58 to increase the time step to Δt = 1 or 2 fs, respectively. Use of bond-length constraints is justified because bond-stretching vibrations are quantum-mechanically not excited at room temperature and thus uninteresting for the systems studied at room temperature or physiological temperature. Bond-angle bending is typically described by a harmonic functional form such as in AMBER, CHARMM, OPLS, and the GROMOS 37C4 force fields



Vibond(di) =

1 b,quartic 2 Ki [di − d0,2 i]2 4

(2)

where Kb,harm is the force constant of bond i, di is the current i bond length, and d0,i is the reference bond length. Since version 567

DOI: 10.1021/acs.jcim.8b00042 J. Chem. Inf. Model. 2018, 58, 565−578

Review

Journal of Chemical Information and Modeling

back to 1/2.0 for the protein force field after a higher value was used in intermediate versions, whereas for nonprotein dihedralangle torsion terms the scaling factor could vary between 1/2.0 and 1.0.55 In the GROMOS force fields, only vdW 1,4interactions are reduced, more specifically the repulsive part for atom types occurring in biomolecules.36,38,39,42 The electrostatic 1,4-interactions are generally not reduced, with the exception of hydrogens in aromatic rings, which are excluded (from version 43A1 onward). In CHARMM22, reduced vdW 1,4-interactions were used only for the nitrogen and oxygen of the peptide backbone and for aliphatic carbons, the electrostatic 1,4-interactions are generally not reduced.14 Parametrization. The parametrization of the torsional dihedral-angle terms is less trivial than that of the other bonded terms. The first torsional parameters were determined based on experimental data concerning conformational equilibria of small molecules, e.g. the cis/trans energy barrier of the peptide bond in N-methylacetamide.4 In the 1990s, QM methods became sufficiently fast that torsional energy profiles of model compounds could be calculated and used for fitting. However, these were generally performed in the gas phase and not in the condensed phase. In more recent years, validation and/or fitting to experimental data such as NMR scalar 3J-couplings or propensities for secondary-structure motifs emerged. A problem with 3J-couplings is the known ambiguity in the parametrization of the Karplus relation that is used to convert dihedral angles into 3J-coupling constants that results in uncertainties of 1−2 Hz.64 In addition, 3J-values around 7 Hz may be the result of conformational averaging, which makes them unsuitable for force-field parametrization where a one-toone relation between a 3J-value and conformer is desired. Thus, the use of 3J-couplings for fitting needs to be done with care to avoid overfitting. Torsional preferences obtained from protein crystal structures may reflect the particular crystal environment. As the quality of torsional dihedral-angle terms (or the lack of it) is most evident in the protein backbone (and side chains), the largest effort by far has been put into the development and refinement of backbone (and side chain) dihedral-angle terms. Unfortunately, it is not precisely known from experiment how much flexibility the secondary and tertiary structures of proteins should have (and what the corresponding time scales are). The rigidity and secondary-structure preferences can therefore vary between different force-field families (and between versions in the same family).65−69 In the following, a brief summary of the evolution of the backbone (and side chain) torsional parameters is given for each force-field family. As mentioned above, the first torsional parameters in AMBER were derived from experimental data concerning conformational equilibria of small molecules.4 In ff94, a few torsional parameters were added, which were derived by fitting to QM energy profiles for model compounds in the gas phase.13 In subsequent years, it was found that ff94 overstabilized αR-helices. Reference 17 contains a nice summary of the various attempts that were proposed in the AMBER community to address this issue. In 2006, the ff99SB version was published, where the backbone torsional parameters were improved by fitting the QM energy profiles of alanine and glycine dipeptides and tetrapeptides.17 Glycine had thereby only one set of ϕ and ψ parameters, whereas the other amino acids had two for each angle. In the following years, limitations of ff99SB were found with respect to side-chain rotamers and reproduction of NMR scalar 3J-couplings. An attempt of improving the side-chain torsions of isoleucine, leucine,

taken from a set of amino-acid crystal structures,63 while their force constants were estimated based on an analysis of vibrational spectra.2,25 More recently, ab initio QM calculations are additionally used to derive parameters.19,32,46 Note that, in AMBER, CHARMM, and OPLS, the concept of “atom types” is used to define all parameters, bonded and nonbonded. The underlying assumption is that the information about the atom types and connectivity is sufficient to determine all parameters. Therefore, many atom types in these force fields has the same vdW parameters but infer different bonded parameters. When broadening the chemical space that is covered by the force field, an increasing number of atom types have to be defined to account for the diversity found in small organic molecules. In GROMOS, on the other hand, the bonded parameters are determined independently of the atom types that define the vdW parameters.



DIHEDRAL-ANGLE TORSION Functional Form. The basic functional form for the torsional dihedral-angle term used in all discussed force fields is Vitorsion(θi) = K it[1 + cos(mθi − γ )]

(7)

Vitorsion(θi) = K it[1 + cos(γ )cos(mθi)]

(8)

or where is the force constant of torsion i, θi is the current torsional angle, m is the multiplicity, and γ is the phase angle (typically 0 or π). If γ is 0 or π, the two equations are equivalent. In the CHARMM and GROMOS force fields, m can be up to six. In older AMBER and OPLS versions, only multiplicities up to three were considered. In AMBER ff14SB18 and OPLS3,55 the maximum possible value for m was increased to four. In GAFF, some special terms with m = 4 were used already earlier.19 In AMBER, CHARMM, and GROMOS, a cosine series for a specific torsion is obtained by applying multiple torsion terms per bond, whereas in OPLS a fixed cosine series with defined values for γ is used51 Kti

Vitorsion(θi) = K it,1[1 + cos(θi)] + K it,2[1 − cos(2θi)] + K it,3[1 + cos(3θi)]

(9)

In OPLS3, the cosine series was extended by the term Kti,4[1 − cos(4θi)].55 It is important to note that the choice of the torsional dihedral-angle term and its parameters are related to the choice of the nonbonded interactions between third neighbors in all force-field families discussed here. The energy profile of a torsional degree of freedom is a combination of the torsional dihedral-angle term(s), and the electrostatic and vdW 1,4interactions. The nonbonded 1,4-interactions are commonly reduced in classical force fields in order to avoid too large repulsion in gauche conformations,36 although to a varying degree. In AMBER, both electrostatic and vdW 1,4-interactions were scaled by a factor 1/2.0 (1/8.0 in some special vdW interactions) in the UA version of the force field (ff84).4 In the second generation AA force field (ff94), the scaling factor for the electrostatic 1,4-interactions was changed to 1/1.2, while 1/ 2.0 was used for all vdW 1,4-interactions.13 OPLS adopted the approach of AMBER, with scaling factors 1/2.0 for the electrostatic and 1/8.0 for the vdW 1,4-interactions in the first UA version.12 In 1996, both scaling factors were set to 1/ 2.0 for the AA version.51 In OPLS3, the scaling factor was set 568

DOI: 10.1021/acs.jcim.8b00042 J. Chem. Inf. Model. 2018, 58, 565−578

Review

Journal of Chemical Information and Modeling

limitations in the protein force field were addressed.55 Proline, glycine and alanine backbone torsional parameters were fitted separately based on QM energy profiles of the corresponding dipeptides, and tetrapeptide in the case of alanine. For all other amino acids, the parameters of alanine were used. Side-chain torsional parameters were refined based on distributions observed in the protein data bank (PDB).

aspartate, and asparagine was contained in ff99SB-ILDN using the strategy of fitting QM energy profiles of complete amino acids.70 Later, the issues were addressed in a more complete fashion in ff14SB, where torsional parameters for side chains were fitted based on QM energy profiles and the backbone dihedral-angle term for ϕ was empirically adjusted to better reproduce NMR data for a alanine hexapeptide.18 The torsional parameters in GROMOS 26C12 were taken from earlier force-field versions used at Harvard.25,71 The torsional parameters for the furanose ring of nucleotides in GROMOS 37C4 were adopted from ref 72. In subsequent force-field versions, only selected torsional parameters, e.g. for lipid tails,40 were (re)parametrized. In the GROMOS 53A6 force field, the nonbonded parameters were generally reparametrized, while the bonded parameters were left unchanged.42 In subsequent years, it was found that 53A6 is biased toward β-sheets at the expense of αR-helices. In GROMOS 54A7, improved backbone torsional parameters (two torsion terms per angle instead of one) were proposed in combination with reduced vdW interactions between backbone nitrogen and oxygen forming hydrogen bonds. 43 The parameters were inspired by a QM study of alanine peptides and refined based on long simulations of five peptides, whose secondary-structure preferences are well characterized experimentally.73 Recently, a set of refined backbone torsional parameters have been proposed, grouping the amino acids into four residue classes based on experimental secondary-structure propensities and 3J-couplings.74 In 2004, CHARMM developers embraced a different strategy to improve torsional parameters than the other force-field families by introducing grid-based energy correction maps termed CMAP.75 CMAPs were obtained by grid-based interpolation (using cubic splines) to minimize the difference between the classical and the target QM energy surface of dipeptides. The initial use of CMAPs led to a substantial overstabilization of the αR-helical conformation in proteins. This could be seen for example in a force-field comparison using the villin headpiece, where the temperature in the simulation with CHARMM22-CMAP had to be increased by 50 K compared to AMBER ff99SB-ILDN to obtain unfolding and folding events of this small protein.66 To overcome this limitation, the backbone CMAPs were refitted in 2012 (CHARMM36) based on higher-quality QM energy surfaces of glycine and proline, and based on NMR scalar 3J-couplings of weakly structured peptides for the other amino acids.27 In addition, side-chain torsional parameters were improved by fitting to QM energy profiles and 3J-couplings. This refitting resolved the bias toward αR but introduced a bias toward αL. Thus, a revised version CHARMM36m was proposed in 2016,76 where the backbone CMAPs were refitted using experimentally derived propensities of secondary-structure motifs and small-angle X-ray scattering (SAXS) data for intrinsically disordered peptides. In OPLS, the first torsional dihedral-angle parameters were taken over from AMBER.12 For OPLS-AA, all torsional parameters were newly fitted to QM energy profiles of more than 50 small molecules.51 In 2001, a refined version of the protein backbone torsional parameters was proposed (OPLSAA/L) based on higher-quality QM energy profiles of alanine dipeptide.52 Alanine tetrapeptide and dipeptides of the other amino acids were used for validation. In OPLS2.0, additional torsional parameters were introduced for new chemical substructures fitted to QM energy profiles.54 With OPLS3,



IMPROPER DIHEDRAL-ANGLE BENDING Functional Form. The improper dihedral-angle bending terms are needed to treat more accurately the out-of-plane distortions of planar groups and to enforce the correct geometry and chirality of tetrahedral centers (especially important for UA aliphatic carbons). The basic harmonic functional form used in CHARMM and GROMOS is2,3 Viimproper(ξi) = K iim[ξi − ξ0, i]2

(10)

where Kim i is the force constant of improper dihedral angle i, ξi is the current angle, and ξ0,i is the reference angle. The improper dihedral angle is defined by four atoms: the central atom and its three neighbors. For planar groups, ξ0 = 0°, whereas for tetrahedral groups ξ0 = 35.26°. In AMBER and OPLS, the standard dihedral-angle torsion term is used for the out-of-plane distortions.4 Parametrization. There are only a few improper dihedral parameters needed (e.g., a set for planar groups and a set for tetrahedral groups) and the parametrization is comparably trivial. Force constants were obtained by fitting to vibrational frequencies,13 or choosing arbitrary values.



VAN DER WAALS (VDW) INTERACTIONS Functional Form. In all force-field families discussed, the vdW interactions are described using a 12−6 Lennard-Jones (LJ) functional form VijvdW(rij) =

C12(i , j) rij12



C6(i , j) rij6

(11)

with

C12(i , j) = 4ϵij(σij)12

(12)

C6(i , j) = 4ϵij(σij)6

(13)

where rij is the distance between atoms i and j, and ϵij and σij are the LJ parameters. All of the discussed force fields group the atoms in the system into atom types, which have specific LJ parameters and typically represent atoms in a specific chemical environment (e.g., oxygen in a carboxylic acid group, oxygen in a hydroxy group, etc.). If the concept of atom types is used globally to define bonded and nonbonded parameters (i.e., in AMBER, CHARMM, and OPLS), several atom types may have the same LJ parameters but infer different bonded parameters. The number of atom types therefore varies between the forcefield families and has generally increased over time, i.e., from 65 in the united atom version of OPLS12 to 124 in OPLS3,55 from 40 in AMBER ff844 to 57 in AMBER/GAFF,19 from 26 in GROMOS 26C12,25 to 54 in GROMOS 54A7/ATB,43 and from 29 in CHARMM43 to 139 in CHARMM/CGenFF.32 To obtain the LJ parameters between different atom types, combination rules are used (see refs 77 and 78 for an overview). In AMBER and CHARMM, the Lorentz−Berthelot 569

DOI: 10.1021/acs.jcim.8b00042 J. Chem. Inf. Model. 2018, 58, 565−578

Review

Journal of Chemical Information and Modeling

Table 2. Summary of the Settings and Parametrization Strategies Used for the LJ Interactions in the Main Force-Field Versionsa force field version

comb. rule

AMBER ff844

LB

AMBER ff9413 AMBER ff9916 GAFF19 CHARMM43

1,4- interactions

cutoff RLJ [nm]

long- range treatment

parametrization

0.8−0.9

truncation

taken from ref 95, plus adjustments

LB LB LB LB

scaled by 1/2 (or 1/8) scaled by 1/2 scaled by 1/2 scaled by 1/2 no reduction

0.8−0.9 0.9 0.9b 0.75

taken from OPLS-AA taken from ff94 taken from ff94 or ff99 crystal packing, Slater−Kirkwood

CHARMM2214

LB

special pairs

1.2

CGenFF32−34

LB

no reduction

1.2

GROMOS 26C12,25 GROMOS 37C437 GROMOS 43A139,84 GROMOS 53A5/ 53A642 GROMOS ATB46 OPLS12

geometric

no reduction

0.8

geometric geometric

special pairs special pairs

0.8 1.4

truncation truncation correction eq 20b switching function, truncation switching function, truncation switching function; correction eq 20 switching function, truncation truncation truncation

geometric

special pairs

1.4

truncation

fitting liquid and hydration properties

geometric geometric

special pairs scaled by 1/8

1.4 1.0−1.5

OPLS-AA51

geometric

scaled by 1/2

1.1−1.5

geometric

scaled by 1/2 to 1

1.0

truncation switching function; correction eq 20 switching function; correction eq 20 correction eq 20

taken from 53A6 fitting liquid properties, gas-phase geometries, complexation energies fitting liquid and hydration properties, gas-phase geometries, complexation energies taken from OPLS-AA

OPLS2.0

54

fitting liquid properties fitting liquid properties taken from ref 96 fitting liquid properties taken from 37C4

a

The combination rule for the LJ parameters is either geometric or the Lorentz−Berthelot (LB) scheme. bThe settings for GAFF were taken from ref 94, because in ref 19, only gas-phase QM calculations were performed and therefore no long-range settings were reported.

combination rule is used with the arithmetic mean for σ and the geometric mean for ϵ σij =

ϵij =

σii + σjj 2

(14)

ϵiiϵjj

(15)

3⎛ 1 ⎞ ⎜ ⎟ 2 ⎝ 4π ϵ0 ⎠

1/2

C6(i , j) =

eℏme−1/2αiαj (αi /Ni)1/2 + (αj/Nj)1/2

(16)

where ϵ0 is the permittivity in vacuum, e the electron charge, ℏ the reduced Planck’s constant (ℏ = h/2π), me the electron rest mass, αi the polarizability of atom i, and Ni the effective number of outer shell electrons. This approximation was used to determine the C6 parameters in the first versions of CHARMM and GROMOS.2,3 In OPLS, on the other hand, the LJ parameters (both C6 and C12) were obtained by fitting experimental densities and heats of vaporization of 36 organic liquids.12,51 The set of liquids was chosen to represent the functional groups in the natural amino acids. The OPLS atom types were taken over by AMBER ff94, with some adjustments and new types for all-atom aliphatic carbons using the same fitting procedure.13 Recently, alternative approaches were proposed to derive dispersion parameters from QM calculations using an atoms-in-molecule (AIM) method82 or the exhange-hole dipole moment (XDM) model.83 C6 parameters obtained in this manner were found to be generally smaller than those used in (bio)molecular force fields and showed more variation.83 Such QM-based approaches would also allow the inclusion of higher dispersion terms. The repulsive C12 term, on the other hand, is a classical concept to describe the repulsion between atoms at short distances. It is difficult to obtain C12 parameters from experiment or QM calculations, and these are usually determined by fitting to thermodynamic properties of liquids and crystal structures. In the first versions of CHARMM and GROMOS, C12 parameters between atom types i and j were estimated using,2,3

Note that the original publication of AMBER ff84 states that the geometric combination rule was used for σ too,4 but this was apparently a mistake (see Table 13 in ref 13 and footnote 42 in ref 78). CHARMM allows the use of pairwise-specific LJ parameters, which do not obey the combination rules (e.g., to correct the interactions with ions79). OPLS and GROMOS, on the other hand, use the geometric mean for both σ and ϵ. GROMOS allows furthermore up to three different C12 parameters per atom type together with a combination matrix to provide some flexibility for parametrization.39,42 When comparing with experimentally determined minimum-energy distances of rare gases, it was found that the geometric mean gives too small σij values if the atom types i and j are substantially different in size.78 The arithmetic mean shows the same deficiency but to a lesser degree. Despite this shortcoming, the combination rules were chosen for their simplicity and ease of computation. As mentioned above, the LJ interactions between third neighbors (1,4-interactions) are generally reduced in most force fields, although to varying degree and extent (Table 2). Parametrization. The attractive C6 term in the LJ interaction describes dispersion and can contribute significantly to the intermolecular potential energy.80 It has a direct relation to experiment and can be estimated for atom types i and j using the Slater−Kirkwood approximation81

C12(i , j) = 570

1 C6(i , j)(R i + R j)6 2

(17) DOI: 10.1021/acs.jcim.8b00042 J. Chem. Inf. Model. 2018, 58, 565−578

Review

Journal of Chemical Information and Modeling

error at RLJ, at the expense of another (hopefully smaller) inaccuracy introduced between Rsw and RLJ. An example for such a switching function is

where Ri is the vdW radius of atom type i (i.e., the standard vdW radius of the element, 0.005 nm was added per attached hydrogen). With eqs 12 and 13, this means that Rij = Ri + Rj. In GROMOS, the vdW parameters were modified in order to employ the SPC water model as solvent for proteins.37 The parameters were later refined by fitting to densities and heats of vaporization of organic liquids (side-chain analogues) as well as heats of sublimation and unit cell parameters of crystals.14,84,85 A major refitting of the LJ parameters in GROMOS was done for the 53A5/53A6 force fields, including solvation free energies in water and cyclohexane as additional target quantities.42 Interestingly, it was not possible by varying C12 parameters and partial charges to obtain a single parameter set that reproduced pure liquid properties and solvation free energies, resulting in two force-field versions 53A5 and 53A6. In the GROMOS-compatible force field 2016H66, it could be achieved by allowing additionally the C6 parameters to vary.86 As mentioned above, the OPLS (and AMBER) C12 parameters were obtained together with the C6 parameters by fitting the thermodynamic properties of organic liquids.12,51 Note that LJ parameters and partial charges are inherently connected and can in principle not be chosen independently from each other. Most of the force fields for small molecules, however, build on this assumption with a relatively low number of atom types (especially AMBER and GROMOS) and general schemes to obtain partial charges (see below). Long-Range vdW Interactions. Calculating all pairwise vdW (and electrostatic, see below) interactions in a system would scale with N2, where N is the number of particles (atoms). In order to reduce the computational effort, a cutoff RLJ was introduced early on. Only interactions within this cutoff distance from an atom are calculated explicitly, while the interactions outside the cutoff are either neglected (i.e., truncation) or approximated with a long-range approach. The C6 part of the LJ interaction is always attractive and can be calculated analytically for the range RLJ < r < ∞ (for a system with a single atom type), E RvdW = 2π LJ

N2 V

∫R



⎧1 rij ≤ R sw ⎪ ⎪ ⎪ (RLJ − rij)2 (RLJ + 2rij − 3R sw ) S(rij) = ⎨ R sw < rij ≤ RLJ (RLJ − R sw )3 ⎪ ⎪ ⎪0 rij > RLJ ⎩ (22)

as implemented in CHARMM.3 Another strategy is to use a large RLJ, where the cutoff error becomes sufficiently small such that it can be neglected. Alternatively, the long-range LJ interactions can be accounted for by applying a long-range correction using eq 20 with an average ⟨C6⟩ value over all atoms in the system. However, as the long-range correction affects only the energy and the pressure but not the forces, it is important that the correction was already applied during parametrization of the LJ parameters. Otherwise the truncation error has already been accounted for by the resulting effective LJ parameters. More sophisticated strategies to account for long-range LJ interactions are for example the isotropic periodic sum (IPS) method,87,88 or a lattice-sum approach for dispersion.89,90 In the IPS method, the long-range part is represented by a sum over isotropic periodic images, whereas in lattice-sum approaches the sum is over discrete lattice images. Both approaches can also be used to account for the long-range electrostatic interactions (see below). The cutoff RLJ (and the long-range treatment) can in principle be chosen independently from the rest of the force field. It is important to note, however, that LJ parameters are ef fective parameters, i.e. they are fitted to reproduce experimental data. The cutoff and long-range treatment used during parametrization therefore influence the resulting LJ parameters, and thus become themselves parameters of the force field. For this reason, a force field should be used with the same cutoff (and long-range treatment) as employed during its parametrization to ensure a consistent behavior. If a different setup is used, the resulting accuracy of the force fields should be carefully checked, e.g. with simulations of organic liquids and comparison to experimental data. In the following, the setups used for parametrization are summarized for the different force-field families and versions (Table 2). The OPLS-AA force field was parametrized using a general cutoff RLJ = 1.1 nm (in some special cases up to 1.5 nm) together with a long-range correction using eq 20.12,91 In AMBER ff94, the OPLS LJ parameters were largely taken over.13 It is not clearly described in ref 13 but can be assumed that the parameters were adjusted to the combination rules and long-range settings used with AMBER, i.e. RLJ = 0.8 and 0.9 nm and direct truncation. In AMBER ff99, a cutoff RLJ = 0.9 nm with no long-range correction or switching function was used.16 As mentioned above, CHARMM uses a switching function for the LJ interactions. In CHARMM4, the cutoff RLJ was set to 0.75 nm3 and later increased to 0.95 nm in CHARMM22.14 For the GROMOS force field, the strategy with the longer cutoff was pursued in the 1990s, increasing RLJ from 0.8 to 1.4 nm, which reduced the LJ cutoff error (eq 18) to less than 1%.39,85 In order to keep the computational cost reasonable, a twinrange approach92 with a short-range cutoff Rsr = 0.8 nm was used. The LJ parameters of the 53A5/53A6 force fields were

r 2V vdW(r )g (r ) dr (18)

LJ

where V is the volume of the system and g(r) is the atom−atom radial distribution function. The corresponding pressure contribution is PRLJ = −

2π ⎛⎜ N ⎞⎟ 3 ⎝V ⎠

2

∫R



r3

LJ

dV vdW(r ) g (r ) d r dr

(19)

For large values of RLJ, g(r) can be approximated by 1 and VvdW(r) by −C6/r6, which gives the following simplified expressions for EvdW RLJ and PRLJ, E RvdW =− LJ

PRLJ

2π N 2 C6 3 V (RLJ )3

2 2E RvdW 4π ⎛⎜ N ⎞⎟ C6 LJ =− = 3 ⎝ ⎠ 3 V (RLJ ) V

(20)

(21)

Different strategies have been developed to treat cutoff effects with LJ interactions. One approach is to damp them by using a switching function starting at a distance 0 < Rsw < RLJ. The switching function modifies the LJ energy and force smoothly to become zero at RLJ. In this way, there is no cutoff 571

DOI: 10.1021/acs.jcim.8b00042 J. Chem. Inf. Model. 2018, 58, 565−578

Review

Journal of Chemical Information and Modeling

Table 3. Summary of the Settings and Parametrization Strategies Used for the Electrostatic Interactions in the Main Force-Field Versions charge groups

1,4interactions

cutoff Rele [nm]

AMBER ff844

no

0.8−0.9

truncation

STO-3G in vacuum with ESP

AMBER ff9413

no

0.8−0.9

truncation

HF/6-31G* in vacuum with RESP

AMBER ff9916

no

0.9

PME

HF/6-31G* in vacuum with RESP

GAFF19

no

0.9a

PMEa

HF/6-31G* in vacuum with RESP or AM1-BCC

CHARMM43

yes

scaled by 1/1.2 scaled by 1/1.2 scaled by 1/1.2 scaled by 1/1.2 no reduction

0.75

fitting QM interaction energies of dimers

CHARMM2214

yes

no reduction

1.2

CGenFF32−34 GROMOS 26C12,25

no yes

no reduction no reduction

1.2 0.8

GROMOS 37C437,92,125 GROMOS 43A139,84

yes

no reduction

0.8−1.5

switching function; truncation switching function, truncation PME switching function; truncation truncation

yes

1.4

truncation

fitting liquid properties

GROMOS 53A5/53 A642 GROMOS ATB46

yes

1.4

reaction field

fitting liquid and hydration properties

1.4

reaction field

B3LYP/6-31G* in implicit water with ESP

OPLS12

yes

no reductionb no reductionb no reductionb scaled by 1/2

1.0−1.5

yes

scaled by 1/2

1.1−1.5

no

scaled by 1/2

1.0

switching function; truncation switching function; truncation PME

fitting liquid properties, gas-phase geometries, complexation energies fitting liquid and hydration properties, gas-phase geometries, complexation energies CM1A-BCC

force field version

OPLS-AA

51

OPLS2.054

yes

long-range treatment

parametrization

HF/6-31G* in vacuum with ESP, scaling by 1.16 trained BCI taken from ref 96 taken from refs 107−109

a

The settings for GAFF were taken from ref 94, because in ref 19, only gas-phase QM calculations were performed and therefore no long-range settings were reported. bWith the exception of 1,4-interactions between hydrogens in aromatic rings, which are excluded.

parametrized with this setup.42 In the twin-range scheme, the interactions within Rsr are evaluated every step, while the intermediate-range interactions between Rsr and RLJ are calculated only every nth step (typically n = 5) and kept constant in between. This procedure can be viewed as a type of mult-timestepping approach. However, it was recently found that it leads to artifacts at interfaces between media with different dielectric permittivity.93

Parametrization. As polarization is not treated explicitly in fixed-charge force fields, the “mean-field” polarization is modeled by treating partial charges as effective parameters. The partial charges are generally the part of the force fields that has seen most changes over the past decades. There are broadly two strategies to derive partial charges, which are also used in combination: (i) fitting to experimental data and (ii) extraction from QM calculations. It is important to note that the latter is an underdetermined mathematical problem with no unique solution (and potentially no solution at all), as there exists an infinite number of charge distributions, which can reproduce the electrostatic potential (ESP) outside a surface encapsulating all charges. Partial charges extracted from QM calculations usually require some postprocessing such as symmetrization, i.e. averaging the charges of symmetric atoms in the molecule. The two force-field families GROMOS (except ATB) and OPLS (before OPLS_2005) employ the first strategy, while the other two force-fields families AMBER and CHARMM make use of the latter (Table 3). Ideally, the partial charges of all parts of the force field should be obtained in the same manner to ensure consistency and compatibility. In AMBER ff84, the partial charges were derived by fitting the ESP from QM calculations in vacuum using the STO-3G functional and the Kollman−Singh97 approach.4 Note that this force-field version still contained a hydrogen-bonding term. As the QM calculations could only be done on very small molecules, the partial charges of larger molecules were obtained by “patching together” fragments. The partial charges were not scaled and their polarization was later found to be too small compared to the polarization of the empirically derived water models.13 The ESP fit approach is furthermore known to be



ELECTROSTATIC INTERACTIONS Functional Form. In classical fixed-charge force fields, only pairwise Coulomb interactions between the point charges of atoms i and j are considered V ijele(rij) =

qiqj

1 4π ϵ0ϵ1 rij

(23)

where q is the partial charge, ϵ0 the electric constant, ϵ1 the background dielectric permittivity (typically set to 1 for atomistic systems), and rij the distance between atoms i and j. The classical electrostatic interactions can be viewed as the remaining long-range part (1/r) of the QM electrostatic interactions, after all short-range contributions like bonds, repulsion and dispersion are removed. As mentioned above, the electrostatic interactions between third neighbors (1,4-interactions) are scaled in AMBER and OPLS. In CHARMM, third neighbors have full electrostatic interactions. The same applies to GROMOS with the exception of 1,4-interactions between hydrogens in aromatic rings, which are excluded. 572

DOI: 10.1021/acs.jcim.8b00042 J. Chem. Inf. Model. 2018, 58, 565−578

Review

Journal of Chemical Information and Modeling

groups for the model compounds of CGenFF, whereas a single charge group is used for new molecules. The GROMOS biomolecular force fields use empirically fitted partial charges. The partial charges in GROMOS 26C1 were taken from ref 96 and modified together with the LJ parameters to match the SPC water model.2,25 In GROMOS 37C4,37 the partial charges were replaced by those from Lifson and co-workers,107,108 who had used crystal structures, sublimation energies, dimerization energies, and dipole moments in the gas phase to fit partial charges. For charged residues, the partial charges from ref 109 were adopted. GROMOS uses charge groups, i.e. groups of five atoms or less with an integer total charge. For GROMOS 53A5/53A6, the partial charges were fitted to reproduce the density and heat of vaporization as well as solvation free energies in water and cyclohexane of organic liquids (side-chain derivatives).42 For organic molecules, the ATB web server can be used to obtain parameters for the GROMOS force field.46,47 For molecules with less than 40 atoms, initial partial charges are extracted from B3LYP/6-31G* QM calculations with implicit solvent (mimicking water) by fitting the ESP using the Kollman− Singh97 scheme. The partial charges are further modified by averaging of charges of atoms in equivalent functional groups and assignment of charge groups.46 Note that the partial charges obtained by ATB can differ from those of the GROMOS biomolecular force field for the same functional groups. OPLS was the first force field to fit the partial charges to reproduce liquid properties such as density and heat of vaporization for a set of 36 organic liquids.12 Partial charges for charged residues were obtained by fitting to results from QM calculations (with the 6-31G* basis set) of water−solute complexes. Upon changing from a UA to an AA force field, initial charges for the aliphatic carbons and hydrogens were obtained by assigning +0.06 e to the hydrogens and then further adjusting them to reproduce liquid properties.51 Hydration free energies were added as target properties for parametrization. In OPLS-UA and OPLS-AA, the concept of charge groups was used as in GROMOS and CHARMM. In OPLS_2005, which expanded the range of functional groups, BCI fitted to QM results were introduced to obtain partial charges for new functional groups.53 In OPLS2.0 and OPLS3,54,55 the charge model was changed to CM1A-BCC, which is a combination of the semiempirical CM1A110 charges and BCC fitted to reproduce QM electrostatic potentials and experimental hydration free energies. The partial charges of the protein force field were thereby kept unchanged.55 Recently, an alternative approach to derive partial charges has been proposed based on machine learning (ML).111,112 Training ML models on reference electron densities for a large set of molecules, partial charges can be predicted with high accuracy for new molecules and combined with existing force fields (e.g., OPLS-AA or GAFF). The advantage of the ML-based approaches is an inherent averaging over similar substructures in different molecules, which is important for transferability and a fast computation. The computational cost is limited to initial construction of the set of reference electron densities and training of the ML models, which has to be done only once. Long-Range Electrostatic Interactions. We will give here only a short summary of the different methods to account for long-range electrostatic interactions and what is standardly used

dependent on the conformation and the basis set used. However, for classical fixed-charge force fields only one set of partial charges is used for all conformations. To address this issue, the “restrained electrostatic potential” (RESP)98 based method was developed. Alternatively, simultaneous fitting of a single set of partial charges to multiple conformations can also reduce the conformational dependency.99,100 The RESP approach was used to derive the partial charges for the AMBER ff94 version based on HF/6-31G* QM calculations in vacuum.13 The hydrogen-bonding term was found no longer necessary. The HF/6-31G* setup was known to uniformly overestimate polarity. As molecules are more polarized in the condensed phase (in water) compared to the gas phase, this overpolarization was considered beneficial and the resulting partial charges were used without scaling. In AMBER ff03, an implicit solvent with a dielectric permittivity of ϵ = 4 was introduced for the QM calculations to further increase the polarization.101 Later force fields kept either the ff99 or ff03 procedure for partial charges. In GAFF, the partial charges for organic molecules are ideally obtained with RESP in the same way as the protein charges. As this requires a QM calculation for each molecule, an alternative charge model AM1BCC102,103 has been proposed.19 This model uses a faster semiempirical AM1 calculation to obtain Mulliken charges, which are then modified by bond charge corrections (BCC) that were fitted to reproduce QM electrostatic potentials in order to be compatible with RESP charges. The first partial charges in CHARMM (CHARMM4) were obtained by empirical fitting to match the QM interaction energies of dimers of small molecules in vacuum (at that time the force field contained a hydrogen-bonding term).3 In CHARMM19, this strategy was retained, using HF/6-31G* for the QM calculations together with an empirical calibration based on the TIP3P water model.26 The water dimer energy was found to be a factor of 1.16 higher with TIP3P than with HF/6-31G*, thus all partial charges obtained by fitting the QM interaction energies were scaled by a factor of 1.16. As with AMBER, the hydrogen-bonding term was found no longer necessary at this point. A more detailed description of the empirical fitting procedure for the partial charges is given in ref 14 for CHARMM22. A major difference between the earlier force-field version was that in CHARMM22 the TIP3P LJ parameters were used for both the solute−solvent and solvent− solvent interactions, which affects the fitting (matching of water−solute interaction energies to QM data). The scaling factor of 1.16 was retained (except for charged residues).14 Partial charges were furthermore selected such that they could be grouped into so-called “charge groups“104 with five atoms or less and an integer total charge (0, ±1, etc.). This simplifies transferability to larger molecules and the treatment of longrange electrostatic interactions (see below). Later force-field versions did not change the strategy to obtain partial charges. For organic molecules, CGenFF uses bond charge increments (BCI) to obtain partial charges of new functional groups.34 The BCI were introduced for the Merck molecular force field (MMFF94),105,106 fitting them to reproduce scaled HF/6-31G* dipole moments, dimer interaction energies, and hydrogenbond geometries in the gas phase. The BCI in CGenFF are conceptually similar but fitted to reproduce the partial charges of model compounds that were included in the parametrization of CGenFF. In addition to BCI, angle and torsion charge increments were introduced.34 Atoms were grouped into charge 573

DOI: 10.1021/acs.jcim.8b00042 J. Chem. Inf. Model. 2018, 58, 565−578

Review

Journal of Chemical Information and Modeling

ELS,r comprises all pairwise interactions up to the cutoff Rele calculated explicitly. Next to Ewald summation, the particle− particle particle-mesh (P3M)122 method and the particle-mesh Ewald (PME)123,124 approach belong to the LS methods. They reduce the computational effort compared to Ewald summation by calculating the reciprocal part on a mesh, i.e., by solving Poisson’s equation of a smooth charge distribution on a grid. Note that this computational speed-up comes at the expense of giving up the strictly pairwise nature of the interactions. P3M and PME are still considerably slower than RF. In the following, the setups used for parametrization are summarized for the different force-field families and versions (Table 3). As for the LJ interactions, it is advisable to apply the same settings. In AMBER ff94, truncation was used without a switching function or a reaction field.13 AMBER ff99 was parametrized with PME to account for the long-range electrostatic interactions. Later versions retained the same setup. Early versions of CHARMM used a switching function for the electrostatic interactions, e.g. CHARMM4 3 or CHARMM22.14 Later on, PME was used to account for the long-range electrostatic interactions.27,32 In GROMOS, initially a switching function was used, e.g. 26C1.2 In the 1980s, the cutoff radius was occasionally extended up to 1.5 nm and the electrostatic interactions truncated at this cutoff distance.92,125 In GROMOS 45A3 and subsequent versions, the RF method was used standardly.40,42 For OPLS-AA, the electrostatic interactions were brought to zero at the cutoff using a switching function and then truncated.126 In ref 52, a switching function was used for OPLS-AA/L.52 In OPLS_2005, PME was used to account for the long-range electrostatic interactions.53 For OPLS2.0 and OPLS3, no description could be found in refs 54 and 55, but it is assumed that PME was used. Future Developments. Employing fixed partial charges in force fields is clearly a substantial simplification, whereby changes due to changes in the surroundings are ignored. Nevertheless, if partial charges are used, which represent the molecule on average well, the past decades have shown that such fixed-charge force fields can be used to study (bio)molecular systems in the condensed phase and to gain insights not accessible from experiment. Furthermore, there is still room for improvement for fixed-charge force fields with relatively simple modifications and without introducing the additional complexity of explicit inclusion of polarization effects. For example, by using off-site charges the σ-hole of aryl halides or the anisotropy in the charge distribution caused by lone pairs can be reproduced by fixed-charge models. This idea has been proposed early on (see e.g. ref 127) but has regained interest recently with the increases in computer power.55,82,128 Similarly, multisite representations of multivalent ions improve the description of the interaction with water, proteins and nucleic acids.129−132 New parametrization approaches will be required in the future to recover the connection between the partial charges and vdW parameters in the context of organic molecules, which was given up in current general force fields in order to increase the coverage of the chemical space. ML methods for example may offer new opportunities in force-field parametrization, as already shown for partial charges.111,112 Such ML-based approaches have also been used to construct interaction potentials, which do not follow the functional form in eq 1 (see e.g. ref 133 and references therein); however, a detailed discussion of this field is outside the scope of the present overview.

with the discussed force-field families. For a more detailed review, see e.g. refs 113 and 114. Like for the LJ interactions, a cutoff Rele was introduced within which the pairwise electrostatic interactions are calculated explicitly. Generally, the cutoffs are set to the same value, i.e. Rele = RLJ. Note that the commonly used cutoffs are much shorter than the range of the electrostatic interactions, which decay with 1/r. Early on it was recognized that truncation of the electrostatic interactions results in substantial errors. By introducing charge groups and considering them for the cutoff such that no charge group is cut into parts, distortions in pair properties could be avoided. However, other artifacts remained. For example, an accumulation of ions just outside the cutoff could be observed in NaCl solutions.115 A simple solution is the use of either a shifting function as available for CHARMM43 V ijele,shift(rij) =

⎛ 2rij2 rij4 ⎞ 1⎜ ⎟ − + 1 4 ⎟ 2 4π ϵ0ϵ1 rij ⎜⎝ R ele R ele ⎠ qiqj

(24)

or a switching function (see e.g., eq 22 or ref 116) as for the LJ interactions. While the electrostatic interactions are no longer Coulombic when using a shifting function, the short-range component (rij < Rsw) remains Coulombic with a switching function. On the other hand, if Rsw and Rele are too close, large artificial forces can occur for atom pairs with a distance Rsw < rij ≤ Rele. In both cases, the partial charges need to be parametrized consistently. A computationally efficient alternative is the use of a reaction field (RF) where a dielectric continuum with a permittivity ϵRF is assumed outside the cutoff Rele. The RF approach was first introduced by Barker and Watts117 and later generalized for ionic systems115 V ijele,RF(rij) =

2 qiqj ⎛ 1 ϵ − ϵ1 rij ⎜ + RF 3 4π ϵ0ϵ1 ⎜⎝ rij 2ϵRF + ϵ1 R ele



⎞ 3ϵRF 1 ⎟ 2ϵRF + ϵ1 R ele ⎟⎠

(25)

For large ϵRF, the RF approach can the viewed as a physically based shifting function. The use of a RF resolves the issue with the accumulation of ions outside the cutoff,115 although not entirely.118 The advantages of RF are speed and the strictly pairwise nature. The main disadvantage is the homogeneous nature of the dielectric continuum outside the cutoff, which is not a good assumption for anisotropic systems such as membranes. Recently, we have developed an anisotropic RF method to address this issue.119 In lattice-sum (LS) methods, the system is modeled as fully periodic. They are therefore formally correct only for exactly periodic systems. For (bio)molecular systems in the condensed phase, the periodicity introduced by the periodic boundary conditions is utilized by the LS methods. This may artificially enforce the periodicity.120 One of the first LS methods to calculate the electrostatic potential exactly was Ewald summation published in 1921.121 The Coulomb interaction is split into a short-range direct sum ELS,r (calculated in real space) and a long-range term, which varies slowly and is typically handled in “reciprocal” space (k-space) via Fourier transforms E ele,LS = E LS,r + ELS,k

(26) 574

DOI: 10.1021/acs.jcim.8b00042 J. Chem. Inf. Model. 2018, 58, 565−578

Review

Journal of Chemical Information and Modeling

(2) van Gunsteren, W. F.; Karplus, M. Effect of Constraints on the Dynamics of Macromolecules. Macromolecules 1982, 15, 1528−1544. (3) Brooks, B. R.; Bruccoleri, R. E.; Olafson, B.; States, D. J.; Swaminathan, S.; Karplus, M. CHARMM: A Program for Macromolecular Energy, Minimization, and Dynamics Calculations. J. Comput. Chem. 1983, 4, 187−217. (4) Weiner, S. J.; Kollman, P. A.; Case, U. C. S.; Ghio, C.; Alagona, G.; Profeta, S.; Weiner, P.; Singh, U. C. A New Force Field for Molecular Mechanical Simulation of Nucleic Acids and Proteins. J. Am. Chem. Soc. 1984, 106, 765−784. (5) Berendsen, H. J. C., Ed. Report of the CECAM Workshop on Models for Protein Dynamics; CECAM: Orsay, France, 1976. (6) Hünenberger, P. H.; van Gunsteren, W. F. Computer Simulation of Biomolecular Systems; Springer: Dordrecht, 1997; pp 3−82. (7) MacKerell, A. D. Computational Biochemistry and Biophysics; Marcel Dekker: New York, 2001; pp 7−38. (8) Ponder, J. W.; Case, D. A. Force Fields for Protein Simulations. Adv. Protein Chem. 2003, 66, 27−85. (9) MacKerell, A. D. Empirical Force Fields for Biological Macromolecules: Overview and Issues. J. Comput. Chem. 2004, 25, 1584−1604. (10) Jorgensen, W. L.; Tirado-Rives, J. Potential Energy Functions for Atomic-Level Simulations of Water and Organic and Biomolecular Systems. Proc. Natl. Acad. Sci. U. S. A. 2005, 102, 6665−6670. (11) van Gunsteren, W. F.; Dolenc, J. Thirty-Five Years of Biomolecular Simulation: Development of Methodology, Force Fields and Software. Mol. Simul. 2012, 38, 1271−1281. (12) Jorgensen, W. L.; Tirado-Rives, J. The OPLS Potential Functions for Proteins. Energy Minimisations for Crystals of Cyclic Peptides and Crambin. J. Am. Chem. Soc. 1988, 110, 1657−1666. (13) Cornell, W. D.; Cieplak, P.; Bayly, C. I.; Gould, I. R.; Merz, K. M.; Ferguson, D. M.; Spellmeyer, D. C.; Fox, T.; Caldwell, J. W.; Kollman, P. A. A Second Generation Force Field for the Simulation of Proteins, Nucleic Acids, and Organic Molecules. J. Am. Chem. Soc. 1995, 117, 5179−5197. (14) MacKerell, A. D.; Bashford, D.; Bellott, M.; Dunbrack, R. L.; Evanseck, J. D.; Field, M. J.; Fischer, S.; Gao, J.; Guo, H.; Ha, S.; Joseph-McCarthy, D.; Kuchnir, L.; Kuczera, K.; Lau, F. T. K.; Mattos, C.; Michnick, S.; Ngo, T.; Nguyen, D. T.; Prodhom, B.; Reiher, W. E.; Roux, B.; Schlenkrich, M.; Smith, J. C.; Stote, R.; Straub, J.; Watanabe, M.; Wiórkiewicz-Kuczera, J.; Yin, D.; Karplus, M. All-Atom Empirical Potential for Molecular Modeling and Dynamics Studies of Proteins. J. Phys. Chem. B 1998, 102, 3586−3616. (15) Weiner, S. J.; Kollman, P. A.; Nguyen, D. T.; Case, D. A. An AllAtom Force Field for Simulations of Proteins and Nucleic Acids. J. Comput. Chem. 1986, 7, 230−252. (16) Wang, J.; Cieplak, P.; Kollman, P. A. How Well Does a Restrained Electrostatic Potential (RESP) Model Perform in Calculating Conformational Energies of Organic and Biological Molecules? J. Comput. Chem. 2000, 21, 1049−1074. (17) Hornak, V.; Abel, R.; Okur, A.; Strockbine, B.; Roitberg, A.; Simmerling, C. Comparison of Multiple Amber Force Fields and Development of Improved Protein Backbone Parameters. Proteins: Struct., Funct., Genet. 2006, 65, 712−725. (18) Maier, J. A.; Martinez, C.; Kasavajhala, K.; Wickstrom, L.; Hauser, K. E.; Simmerling, C. ff14SB: Improving the Accuracy of Protein Side Chain and Backbone Parameters from ff99SB. J. Chem. Theory Comput. 2015, 11, 3696−37132. (19) Wang, J.; Wolf, R. M.; Kollman, P. A.; Case, D. A.; Caldwell, J. W. Development and Testing of a General Amber Force Field. J. Comput. Chem. 2004, 25, 1157−1174. (20) Dickson, C. J.; Rosso, L.; Betz, R. M.; Walker, R. C.; Gould, I. R. GAFFlipid: A General Amber Force Field for the Accurate Molecular Dynamics Simulation of Phospholipid. Soft Matter 2012, 8, 9617− 9627. (21) Dickson, C. J.; Madej, B. D.; Skjevik, A. A.; Betz, R. M.; Teigen, K.; Gould, I. R.; Walker, R. C. Lipid14: The Amber Lipid Force Field. J. Chem. Theory Comput. 2014, 10, 865−879.

Another possible improvement lies in the definition or mapping of the bonded parameters. In AMBER, the bonded parameters are assigned based on the atom types, which hampers flexibility. To address this, a SMIRKS134 based format called SMIRNOFF has recently been developed,135 which may also be interesting for other force-field families. A more fundamental issue regarding force fields in general is the description of the kinetics of (bio)molecular systems. Force fields have been parametrized to a large extent against thermodynamic data, not kinetic data, because much more, and more precise experimental data are available for thermodynamic quantities. With the advance of MD simulations into the millisecond range, the deficiencies of the force fields with respect to kinetics has become apparent.65−69 To address this issue, more and more precise experimental kinetic information must become available such that force fields can be parametrized and validated accordingly.



CONCLUSIONS AND GUIDELINES Over the past decades, a large effort has gone into the development and testing of fixed-charge force fields for the simulation of (bio)molecular systems in the condensed phase. Four major families of force fields have been established in the 1980s, AMBER, CHARMM, GROMOS, and OPLS, which are actively maintained and continuously improved. The basic functional form of these force-field families is very similar, with differences in the technical details and parametrization strategies. In order to ensure that a force field behaves as described, users are advised to employ the same settings for the simulation as used during the parametrization. This is particularly important with respect to the cutoff for the long-range interactions and the methods used to treat the interactions beyond the cutoff. As the force-field parameters are effective (calibrated to reproduce experimental data), the nonbonded force-field parameters (mainly LJ parameters) inherently depend on the settings used during parametrization. The advances in computation power allow us nowadays to increase the size of the training sets substantially (e.g., number of molecules and length of simulations). Extra care must thereby be taken to avoid overfitting during the parametrization and to account for uncertainties in the experimental data.



AUTHOR INFORMATION

Corresponding Author

*E-mail: [email protected]. ORCID

Sereina Riniker: 0000-0003-1893-4031 Notes

The author declares no competing financial interest.



ACKNOWLEDGMENTS The author thanks Philippe Hünenberger and Wilfred van Gunsteren for helpful discussions. The author thanks Wilfred van Gunsteren, William Jorgensen, Alexander MacKerell, and David Case for checking some of the details of the force fields.



REFERENCES

(1) Levitt, M.; Lifson, S. Refinement of Protein Conformations Using a Macromolecular Energy Minimization Procedure. J. Mol. Biol. 1969, 46, 269−279. 575

DOI: 10.1021/acs.jcim.8b00042 J. Chem. Inf. Model. 2018, 58, 565−578

Review

Journal of Chemical Information and Modeling (22) Woods, R. J.; Dwek, R. A.; Edge, C. J.; Fraser-Reid, B. Molecular Mechanical and Molecular Dynamical Simulations of Glycoproteins and Oligosaccharides. 1. GLYCAM_93 Parameter Development. J. Phys. Chem. 1995, 99, 3832−3846. (23) Kirschner, K. N.; Woods, R. J. Solvent Interactions Determine Carbohydrate Conformation. Proc. Natl. Acad. Sci. U. S. A. 2001, 98, 10541−10545. (24) Kirschner, K. N.; Yongye, A. B.; Tschampel, S. M.; GonzálezOuteirino, J.; Daniels, C. R.; Foley, B. L.; Woods, R. J. GLYCAM06: A Generalizable Biomolecular Force Field. Carbohydrates. J. Comput. Chem. 2008, 29, 622−655. (25) van Gunsteren, W. F.; Olafson, B.; Swaminathan, S. Inventory of Extended Atom Sets, Residue Topologies and Force Parameter Sets; March 1979. (26) Neria, E.; Fischer, S.; Karplus, M. Simulation of Activation Free Energies in Molecular Systems. J. Chem. Phys. 1996, 105, 1902−1921. (27) Best, R. B.; Zhu, X.; Shim, J.; Lopes, P. E. M.; Mittal, J.; Feig, M.; MacKerell, A. D. Optimization of the Additive CHARMM AllAtom Protein Force Field Targeting Improved Sampling of the Backbone Φ, Ψ and Side-Chain χ1 and χ2 Dihedral Angles. J. Chem. Theory Comput. 2012, 8, 3257−3273. (28) MacKerell, A. D.; Wiórkiewicz-Kuczera, J.; Karplus, M. An AllAtom Empirical Energy Function for the Simulation of Nucleic Acids. J. Am. Chem. Soc. 1995, 117, 11946−11975. (29) Foloppe, N.; MacKerell, A. D. All-Atom Empirical Force Field for Nucleic Acids: I. Parameter Optimization Based on Small Molecule and Condensed Phase Macromolecular Target Data. J. Comput. Chem. 2000, 21, 86−104. (30) MacKerell, A. D.; Banavali, N. K. All-Atom Empirical Force Field for Nucleic Acids: II. Application to Molecular Dynamics Simulations of DNA and RNA in Solution. J. Comput. Chem. 2000, 21, 105−120. (31) Klauda, J. B.; Venable, R. M.; Freites, J. A.; O’Connor, J. W.; Tobias, D. J.; Mondragon-Ramirez, C.; Vorobyov, I.; MacKerell, A. D.; Pastor, R. W. Update of the CHARMM All-Atom Additive Force Field for Lipids: Validation on Six Lipid Types. J. Phys. Chem. B 2010, 114, 7830−7843. (32) Vanommeslaeghe, K.; Hatcher, E.; Acharya, C.; Kundu, S.; Zhong, S.; Shim, J.; Darian, E.; Guvench, O.; Lopes, P.; Vorobyov, I.; MacKerell, A. D. CHARMM General Force Field: A Force Field for Drug-Like Molecules Compatible with the CHARMM All-Atom Additive Biological Force Fields. J. Comput. Chem. 2009, 31, 671−690. (33) Vanommeslaeghe, K.; MacKerell, A. D. Automation of the CHARMM General Force Field (CGenFF) I: Bond Perception and Atom Typing. J. Chem. Inf. Model. 2012, 52, 3144−3154. (34) Vanommeslaeghe, K.; MacKerell, A. D.; Raman, E. P. Automation of the CHARMM General Force Field (CGenFF) II: Assignment of Bonded Parameters and Partial Atomic Charges. J. Chem. Inf. Model. 2012, 52, 3155−3168. (35) Guvench, O.; Mallajosyula, S. S.; Raman, E. P.; Hatcher, E.; Vanommeslaeghe, K.; Foster, T. J.; Jamison, F. W.; MacKerell, A. D. CHARMM Additive All-Atom Force Field for Carbohydrate Derivatives and Its Utility in Polysaccharide and Carbohydrate Protein Modeling. J. Chem. Theory Comput. 2011, 7, 3162−3180. (36) van Gunsteren, W. F. Van der Waals Parameters and Torsion Potentials; 1983. (37) Hermans, J.; Berendsen, H. J. C.; van Gunsteren, W. F.; Postma, J. P. M. A Consistent Empirical Potential for Water-Protein Interactions. Biopolymers 1984, 23, 1513−1518. (38) van Gunsteren, W. F.; Berendsen, H. J. C. GROningen MOlecular Simulation (GROMOS) Library Manual; Biomos b. v.: Groningen, The Netherlands, 1987. (39) van Gunsteren, W. F.; Billeter, S. R.; Eising, A. A.; Hünenberger, P. H.; Krüger, P.; Mark, A. E.; Scott, W. R. P.; Tironi, I. G. Biomolecular Simulation: The GROMOS96 Manual and User Guide; Vdf Hochschulverlag AG an der ETH Zürich: Zürich, Switzerland, 1996. (40) Schuler, L. D.; Daura, X.; van Gunsteren, W. F. An Improved GROMOS96 Force Field for Aliphatic Hydrocarbons in the Condensed Phase. J. Comput. Chem. 2001, 22, 1205−1218.

(41) Chandrasekhar, I.; Kastenholz, M.; Lins, R. D.; Oostenbrink, C.; Schuler, L. D.; Tieleman, D. P.; van Gunsteren, W. F. A Consistent Potential Energy Parameter Set for Lipids: Dipalmitoylphosphatidylcholine as a Benchmark of the GROMOS96 45A3 Force Field. Eur. Biophys. J. 2003, 32, 67−77. (42) Oostenbrink, C.; Villa, A.; Mark, A. E.; van Gunsteren, W. F. A Biomolecular Force Field Based on the Free Enthalpy of Hydration and Solvation: The GROMOS Force-Field Parameter Sets 53A5 and 53A6. J. Comput. Chem. 2004, 25, 1656−1676. (43) Schmid, N.; Eichenberger, A. P.; Choutko, A.; Riniker, S.; Winger, M.; Mark, A. E.; van Gunsteren, W. F. Definition and Testing of the GROMOS Force-Field Versions: 54A7 and 54B7. Eur. Biophys. J. 2011, 40, 843−856. (44) Reif, M. M.; Hünenberger, P. H.; Oostenbrink, C. New Interaction Parameters for Charged Amino Acid Side Chains in the GROMOS Force Field. J. Chem. Theory Comput. 2012, 8, 3705−3723. (45) Poger, D.; van Gunsteren, W. F.; Mark, A. E. A New Force Field for Simulating Phosphatidylcholine Bilayers. J. Comput. Chem. 2010, 31, 1117−1125. (46) Malde, A. K.; Zuo, L.; Breeze, M.; Stroet, M.; Poger, D.; Nair, P. C.; Oostenbrink, C.; Mark, A. E. An Automated Force Field Topology Builder (ATB) and Repository: Version 1.0. J. Chem. Theory Comput. 2011, 7, 4026−4037. (47) Koziara, K. B.; Stroet, M.; Malde, A. K.; Mark, A. E. Testing and Validation of the Automated Topology Builder (ATB) Version 2.0: Prediction of Hydration Free Enthalpies. J. Comput.-Aided Mol. Des. 2014, 28, 221−233. (48) Hansen, H. S.; Hünenberger, P. H. A Reoptimized GROMOS Force Field for Hexopyranose-Based Carbohydrates Accounting for the Relative Free Energies of Ring Conformers, Anomers, Epimers, Hydroxymethyl Rotamers, and Glycosidic Linkage Conformers. J. Comput. Chem. 2011, 32, 998−1032. (49) Pol-Fachin, L.; Rusu, V. H.; Verli, H.; Lins, R. D. GROMOS 53A6GLYC, an Improved GROMOS Force Field for HexopyranoseBased Carbohydrates. J. Chem. Theory Comput. 2012, 8, 4681−4690. (50) Plazinski, W.; Lonardi, A.; Hünenberger, P. H. Revision of the GROMOS 56A6CARBO Force Field: Improving the Description of Ring-Conformational Equilibria in Hexopyranose-Based Carbohydrates Chains. J. Comput. Chem. 2016, 37, 354−365. (51) Jorgensen, W. L.; Maxwell, D. S.; Tirado-Rives, J. Development and Testing of the OPLS All-Atom Force Field on Conformational Energetics and Properties of Organic Liquids. J. Am. Chem. Soc. 1996, 118, 11225−11236. (52) Kaminski, G. A.; Friesner, R. A.; Tirado-Rives, J.; Jorgensen, W. L. Evaluation and Reparametrization of the OPLS-AA Force Field for Proteins via Comparison with Accurate Quantum Chemical Calculations on Peptides. J. Phys. Chem. B 2001, 105, 6474−6487. (53) Banks, J. L.; Beard, H. S.; Cao, Y.; Cho, A. E.; Damm, W.; Farid, R.; Felts, A. K.; Halgren, T. A.; Mainz, D. T.; Maple, J. R.; Murphy, R.; Philipp, D. M.; Repasky, M. P.; Zhang, L. Y.; Berne, B. J.; Friesner, R. A.; Gallicchio, E.; Levy, R. M. Integrated Modeling Program, Applied Chemical Theory (IMPACT). J. Comput. Chem. 2005, 26, 1752−1780. (54) Shivakumar, D.; Harder, E.; Friesner, R. A.; Damm, W.; Sherman, W. Improving the Prediction of Absolute Solvation Free Energies Using the Next Generation OPLS Force Field. J. Chem. Theory Comput. 2012, 8, 2553−2558. (55) Harder, E.; Damm, W.; Maple, J.; Wu, C.; Reboul, M.; Xiang, J. Y.; Wang, L.; Lupyan, D.; Dahlgren, M. K.; Knight, J. L.; Kaus, J. W.; Cerutti, D. S.; Krilov, G.; Jorgensen, W. L.; Abel, R.; Friesner, R. A. OPLS3: A Force Field Providing Broad Coverage of Drug-like Small Molecules and Proteins. J. Chem. Theory Comput. 2016, 12, 281−296. (56) Damm, W.; Frontera, A.; Tirado-Rives, J.; Jorgensen, W. L. OPLS All-Atom Force Field for Carbohydrates. J. Comput. Chem. 1997, 18, 1955−1970. (57) Ryckaert, J.-P.; Ciccotti, G.; Berendsen, H. J. C. Numerical Integration of the Cartesian Equations of Motion of a System with Constraints: Molecular Dynamics of n-Alkanes. J. Comput. Phys. 1977, 23, 327−341. 576

DOI: 10.1021/acs.jcim.8b00042 J. Chem. Inf. Model. 2018, 58, 565−578

Review

Journal of Chemical Information and Modeling (58) Hess, B.; Bekker, H.; Berendsen, H. J. C.; Fraaije, J. G. E. M. LINCS: A Linear Constraint Solver for Molecular Simulations. J. Comput. Chem. 1997, 18, 1463−1472. (59) Scott, W. R. P.; Hünenberger, P. H.; Tironi, I. G.; Mark, A. E.; Billeter, S. R.; Fennen, J.; Torda, A. E.; Huber, T.; Krüger, P.; van Gunsteren, W. F. The GROMOS Biomolecular Simulation Program Package. J. Phys. Chem. A 1999, 103, 3596−3607. (60) van Gunsteren, W. F.; Berendsen, H. J. C. Algorithms for Macromolecular Dynamics and Constraint Dynamics. Mol. Phys. 1977, 34, 1311−1327. (61) van Gunsteren, W. F. Constrained Dynamics of Flexible Molecules. Mol. Phys. 1980, 40, 1015−1019. (62) Warshel, A.; Lifson, S. Consistent Force Field Calculations. II. Crystal Structures, Sublimation Energies, Molecular and Lattice Vibrations, Molecular Conformations, and Enthalpies of Alkanes. J. Chem. Phys. 1970, 53, 582−594. (63) Gurskaya, G. V. The Molecular Structure of Amino Acids: Determination by X-ray Diffraction Analysis; Consultants Bureau; Plenum Publishing Corporation: New York, USA, 1968. (64) Steiner, D.; Allison, J. R.; Eichenberger, A. P.; van Gunsteren, W. F. On the Calculation of 3Jαβ-Coupling Constants for Side Chains in Proteins. J. Biomol. NMR 2012, 53, 223−246. (65) Best, R. B.; Buchete, N.-V.; Hummer, G. Are Current Molecular Dynamics Force Fields too Helical? Biophys. J. 2008, 95, L07−L09. (66) Piana, S.; Lindorff-Larsen, K.; Shaw, D. E. How Robust Are Protein Folding Simulations with Respect to Force Field Parameterization? Biophys. J. 2011, 100, L47−L49. (67) Beauchamp, K. A.; Lin, Y.-S.; Das, R.; Pande, V. S. Are Protein Force Fields Getting Better? A Systematic Benchmark on 524 Diverse NMR Measurements. J. Chem. Theory Comput. 2012, 8, 1409−1414. (68) Martín-García, F.; Papaleo, E.; Gomez-Puertas, P.; Boomsma, W.; Lindorff-Larsen, K. Comparing Molecular Dynamics Force Fields in the Essential Subspace. PLoS One 2015, 10, e0121114. (69) Vitalini, F.; Mey, A. S. J. S.; Noé, F.; Keller, B. G. Dynamic Properties of Force Fields. J. Chem. Phys. 2015, 142, 084101. (70) Lindorff-Larsen, K.; Piana, S.; Palmo, K.; Maragakis, P.; Klepeis, J. L.; Dror, R. O.; Shaw, D. E. Improved Side-Chain Torsion Potentials for the Amber ff99SB Protein Force Field. Proteins: Struct., Funct., Genet. 2010, 78, 1950−1958. (71) Gelin, B. R.; Karplus, M. Side-Chain Torsional Potentials: Effect of Dipeptide, Protein, and Solvent Environment. Biochemistry 1979, 18, 1256−1268. (72) Olson, W. K. How Flexible is the Furanose Ring? 2. An Updated Potential Energy Estimate. J. Am. Chem. Soc. 1982, 104, 278−286. (73) Xie, Y.; Zuo, L.; Hao, F.; Zagrovic, B.; Mark, A. E. An Improved Peptide Backbone Torsion Angle Potential for GROMOS 53A6 Force Field; unpublished report, 2011. (74) Margreitter, C.; Oostenbrink, C. Optimization of Protein Backbone Dihedral Angles by Means of Hamiltonian Reweighting. J. Chem. Inf. Model. 2016, 56, 1823−1834. (75) MacKerell, A. D.; Feig, M.; Brooks, C. L. Extending the Treatment of Backbone Energetics in Protein Force Fields: Limitations of Gas-Phase Quantum Mechanics in Reproducing Protein Conformational Distributions in Molecular Dynamics Simulations. J. Comput. Chem. 2004, 25, 1400−1415. (76) Huang, J.; Rauscher, S.; Nawrocki, G.; Ran, T.; Feig, M.; de Groot, B. L.; Grubmüller, H.; MacKerell, A. D. CHARMM36m: An Improved Force Field for Folded and Intrinsically Disordered Proteins. Nat. Methods 2017, 14, 71−73. (77) Kramer, H. L.; Herschbach, D. R. Combination Rules for van der Waals Force Constants. J. Chem. Phys. 1970, 53, 2792−2800. (78) Halgren, T. A. Representation of van der Waals (vdW) Interactions in Molecular Mechanics Force Fields: Potential Form, Combination Rules, and vdW Parameters. J. Am. Chem. Soc. 1992, 114, 7827−7843. (79) Luo, Y.; Roux, B. Simulation of Osmotic Pressure in Concentrated Aqueous Salt Solutions. J. Phys. Chem. Lett. 2010, 1, 183−189.

(80) MacKerell, A. D.; Karplus, M. Importance of Attractive van der Waals Contribution in Empirical Energy Function Models for the Heat of Vaporization of Polar Liquidst. J. Phys. Chem. 1991, 95, 10559− 10560. (81) Slater, J. C.; Kirkwood, J. G. The van der Waals Forces in Gases. Phys. Rev. 1931, 37, 682−697. (82) Cole, D. J.; Vilseck, J. Z.; Tirado-Rives, J.; Payne, M. C.; Jorgensen, W. L. Biomolecular Force Field Parameterization via Atoms-in-Molecule Electron Density Partitioning. J. Chem. Theory Comput. 2016, 12, 2312−2323. (83) Mohebifar, M.; Johnson, E. R.; Rowley, C. R. Evaluating ForceField London Dispersion Coefficients Using the Exchange-Hole Dipole Moment Model. J. Chem. Theory Comput. 2017, 13, 6146− 6157. (84) Smith, L. J.; Mark, A. E.; Dobson, C. M.; van Gunsteren, W. F. Comparison of MD Simulations and NMR Experiments for Hen Lysozyme. Analysis of Local Fluctuations, Cooperative Motions, and Global Changes. Biochemistry 1995, 34, 10918−10931. (85) Daura, X.; Mark, A. E.; van Gunsteren, W. F. Parametrization of Aliphatic CHn United Atoms of GROMOS96 Force Field. J. Comput. Chem. 1998, 19, 535−547. (86) Horta, B. A. C.; Merz, P. T.; Fuchs, P. F. J.; Dolenc, J.; Riniker, S.; Hünenberger, P. H. GROMOS-Compatible Force Field for Small Organic Molecules in the Condensed Phase: The 2016H66 Parameter Set. J. Chem. Theory Comput. 2016, 12, 3825−3850. (87) Wu, X.; Brooks, B. R. Isotropic Periodic Sum: A Method for the Calculation of Long-Range Interactions. J. Chem. Phys. 2005, 122, 044107. (88) Klauda, J. B.; Wu, X.; Pastor, R. W.; Brooks, B. R. Long-Range Lennard-Jones and Electrostatic Interactions in Interfaces: Application of the Isotropic Periodic Sum Method. J. Phys. Chem. B 2007, 111, 4393−4400. (89) Wennberg, C. L.; Murtola, T.; hess, B.; Lindahl, E. LennardJones Lattice Summation in Bilayer Simulations Has Critical Effects on Surface Tension and Lipid Properties. J. Chem. Theory Comput. 2013, 9, 3527−3537. (90) Leonard, A. N.; Simmonett, A. C.; Pickard, F. C.; Huang, J.; Venable, R. M.; Klauda, J. B.; Brooks, B. R.; Pastor, R. W. Comparison of Additive and Polarizable Models with Explicit Treatment of LongRange Lennard-Jones Interactions Using Alkane Simulations. J. Chem. Theory Comput. 2018, 14, 948−958. (91) Jorgensen, W. L.; Madura, J. D.; Swenson, C. J. Optimized Intermolecular Potential Functions for Liquid Hydrocarbons. J. Am. Chem. Soc. 1984, 106, 6638−6646. (92) van Gunsteren, W. F.; Berendsen, H. J. C.; Geurtsen, R. G.; Zwinderman, H. R. J. A Molecular Dynamics Computer Simulation of an Eight Base-Pair DNA Fragment in Aqueous Solution: Comparison with Experimental Two-Dimensional NMR Data. Ann. N. Y. Acad. Sci. 1986, 482, 287−303. (93) Sidler, D.; Frasch, S.; Cristòfol-Clough, M.; Riniker, S. Density Artifacts at Water-Chloroform Interface Caused by Multiple TimeStep Resonance Effects. J. Comput. Chem. 2018, submitted for publication. (94) Wang, J.; Hou, T. Application of Molecular Dynamics Simulations in Molecular Property Prediction. 1. Density and Heat of Vaporization. J. Chem. Theory Comput. 2011, 7, 2151−2165. (95) Hagler, A. T.; Huler, E.; Lifson, S. Energy Functions for Peptides and Proteins. I. Derivation of a Consistent Force Field Including the Hydrogen Bond from Amide Crystals. J. Am. Chem. Soc. 1974, 96, 5319−5327. (96) McCammon, J. A.; Wolynes, P. G.; Karplus, M. Picosecond Dynamics of Tyrosine Side Chains in Proteins. Biochemistry 1979, 18, 927−942. (97) Singh, U. C.; Kollman, P. A. An Approach to Computing Electrostatic Charges for Molecules. J. Comput. Chem. 1984, 5, 129− 145. (98) Bayly, C. I.; Cieplak, P.; Cornell, W. D.; Kollman, P. A. A WellBehaved Electrostatic Potential Based Method Using Charge 577

DOI: 10.1021/acs.jcim.8b00042 J. Chem. Inf. Model. 2018, 58, 565−578

Review

Journal of Chemical Information and Modeling Restraints for Deriving Atomic Charges: The RESP Model. J. Phys. Chem. 1993, 97, 10269−10280. (99) Reynolds, C. A.; Essex, J. W.; Richards, W. G. Atomic Charges for Variable Molecular Conformations. J. Am. Chem. Soc. 1992, 114, 9075−9079. (100) Cornell, W. D.; Cieplak, P.; Bayly, C. I.; Kollmann, P. A. Application of RESP Charges To Calculate Conformational Energies, Hydrogen Bond Energies, and Free Energies of Solvation. J. Am. Chem. Soc. 1993, 115, 9620−9631. (101) Duan, Y.; Wu, C.; Chowdhury, S.; Lee, M. C.; Xiong, G.; Zhang, W.; Yang, R.; Cieplak, P.; Luo, R.; Lee, T.; Caldwell, J. W.; Wang, J.; Kollman, P. A. A Point-Charge Force Field for Molecular Mechanics Simulations of Proteins Based on Condensed-Phase Quantum Mechanical Calculations. J. Comput. Chem. 2003, 24, 1999−2012. (102) Jakalian, A.; Bush, B. L.; Jack, D. B.; Bayly, C. I. Fast, Efficient Generation of High-Quality Atomic Charges. AM1-BCC Model: I. Method. J. Comput. Chem. 2000, 21, 132−146. (103) Jakalian, A.; Jack, D. B.; Bayly, C. I. Fast, Efficient Generation of High-Quality Atomic Charges. AM1-BCC: II. Parametrization and Validation. J. Comput. Chem. 2002, 23, 1623−1641. (104) van Gunsteren, W. F.; Berendsen, H. J. C.; Colonna, F.; Perahia, D.; Hollenberg, J. P.; Lellouch, D. On Searching Neighbors in Computer Simulations of Macromolecular Systems. J. Comput. Chem. 1984, 5, 272−279. (105) Halgren, T. A. Merck Molecular Force Field. I. Basis, Form, Scope, Parametrisation, and Performance of MMFF94. J. Comput. Chem. 1996, 17, 490−519. (106) Halgren, T. A. Merck Molecular Force Field. II. MMFF94 van der Waals and Electrostatic Parameters for Intermolecular Interactions. J. Comput. Chem. 1996, 17, 520−552. (107) Lifson, S.; Hagler, A. T.; Dauber, P. Consistent Force Field Studies of Intermolecular Forces in Hydrogen-Bonded Crystals. 1. Carboxylic Acids, Amides, and the C = O-H − Hydrogen Bonds. J. Am. Chem. Soc. 1979, 101, 5111−5121. (108) Hagler, A. T.; Lifson, S.; Dauber, P. Consistent Force Field Studies of Intermolecular Forces in Hydrogen-Bonded Crystals. 2. A Benchmark for the Objective Comparison of Alternative Force Fields. J. Am. Chem. Soc. 1979, 101, 5122−5130. (109) Poland, D.; Scheraga, H. A. Energy Parameters in Polypeptides. I. Charge Distributions and the Hydrogen Bond. Biochemistry 1967, 6, 3791−3800. (110) Storer, J. W.; Giesen, D. J.; Cramer, C. J.; Truhlar, D. G. Class IV Charge Models: A New Semiempirical Approach in Quantum Chemistry. J. Comput.-Aided Mol. Des. 1995, 9, 87−110. (111) Rai, B. K.; Bakken, G. A. Fast and Accurate Generation of Ab Initio Quality Atomic Charges Using Nonparametric Statistical Regression. J. Comput. Chem. 2013, 34, 1661−1671. (112) Bleiziffer, P.; Schaller, K.; Riniker, S. Machine Learning of Partial Charges Derived From High-Quality Quantum-Mechanical Calculations. J. Chem. Inf. Model. 2018, DOI: 10.1021/acs.jcim.7b00663. (113) Cisneros, G. A.; Karttunen, M.; Ren, P.; Sagui, C. Classical Electrostatics for Biomolecular Simulations. Chem. Rev. 2014, 114, 779−814. (114) Koehl, P. Electrostatics Calculations: Latest Methodological Advances. Curr. Opin. Struct. Biol. 2006, 16, 142−151. (115) Tironi, I.; Sperb, R.; Smith, P. E.; van Gunsteren, W. F. A Generalized Reaction Field Method for Molecular Dynamics Simulations. J. Chem. Phys. 1995, 102, 5451−5459. (116) Christen, M.; Hünenberger, P. H.; Bakowies, D.; Baron, R.; Bürgi, R.; Geerke, D. P.; Heinz, T. N.; Kastenholz, M. A.; Kräutler, V.; Oostenbrink, C.; Peter, C.; Trzesniak, D.; van Gunsteren, W. F. The GROMOS Software for Biomolecular Simulation: GROMOS05. J. Comput. Chem. 2005, 26, 1719−1751. (117) Barker, J. A.; Watts, R. O. Monte Carlo Studies of the Dielectric Properties of Water-Like Models. Mol. Phys. 1973, 26, 789− 792.

(118) Kräutler, V.; Hünenberger, P. H. Explicit-Solvent Molecular Dynamics Simulations of a DNA Tetradecanucleotide Duplex: LatticeSum Versus Reaction-Field Electrostatics. Mol. Simul. 2008, 34, 491− 499. (119) Sidler, D.; Frasch, S.; Cristòfol-Clough, M.; Riniker, S. Anisotropic Reaction Field Correction for Long-Range Electrostatic Interactions in Molecular Dynamics Simulations. J. Chem. Phys. 2018, submitted for publication. (120) Luty, B. A.; van Gunsteren, W. F. Calculating Electrostatic Interactions Using the Particle-Particle Particle-Mesh Method with Nonperiodic Long-Range Interactions. J. Phys. Chem. 1996, 100, 2581−2587. (121) Ewald, P. P. Die Berechnung optischer und elekrostatischer Gitterpotentiale. Ann. Phys. 1921, 369, 253−287. (122) Hockney, R. W.; Eastwood, J. W. Computer Simulation Using Particles; McGraw-Hill: New York, 1981. (123) Darden, T.; York, D.; Pedersen, L. An N · log(N) Method for Ewald Sums in Large Systems. J. Chem. Phys. 1993, 98, 10089−10092. (124) Essmann, U.; Perera, L.; Berkowitz, M. L.; Darden, T.; Lee, H.; Pedersen, L. A Smooth Particle Mesh Ewald Method. J. Chem. Phys. 1995, 103, 8577. (125) Berendsen, H. J. C.; van Gunsteren, W. F.; Zwinderman, H. R. J.; Geurtsen, R. G. Simulations of Proteins in Water. Ann. N. Y. Acad. Sci. 1986, 482, 269−285. (126) Personal communication by William Jorgensen. (127) Dixon, R. W.; Kollman, P. A. Advancing Beyond the Atom? Centered model in Additive and Nonadditive Molecular Mechanics. J. Comput. Chem. 1997, 18, 1632−1646. (128) Kramer, C.; Spinn, A.; Liedl, K. R. Charge Anisotropy: Where Atomic Multipoles Matter Most. J. Chem. Theory Comput. 2014, 10, 4488−4496. (129) Åqvist, J.; Warshel, A. Free Energy Relationships in Metalloenzyme-Catalyzed Reactions. Calculations of the Effects of Metal Ion Substitutions in Staphylococcal Nuclease. J. Am. Chem. Soc. 1990, 112, 2860−2868. (130) Åqvist, J.; Warshel, A. Computer Simulation of the Initial Proton Transfer Step in Human Carbonic Anhydrase I. J. Mol. Biol. 1992, 224, 7−14. (131) Oelschlaeger, P.; Klahn, M.; Beard, W. A.; Wilson, S. H.; Warshel, A. Magnesium-Cationic Dummy Atom Molecules Enhance Representation of DNA Polymerase β in Molecular Dynamics Simulations: Improved Accuracy in Studies of Structural Features and Mutational Effects. J. Mol. Biol. 2007, 366, 687−701. (132) Saxena, A.; Sept, D. Multisite Ion Models that Improve Coordination and Free Energy Calculations in Molecular Dynamics Simulations. J. Chem. Theory Comput. 2013, 9, 3538−3542. (133) Smith, J. S.; Isayev, O.; Roitberg, A. E. ANI-1: An Extensible Neural Network Potential with DFT Accuracy at Force Field Computational Cost. Chem. Sci. 2017, 8, 3192−3203. (134) Daylight Chemical Information Systems Inc. Daylight Theory Manual; 2011; http://www.daylight.com. (135) Mobley, D. L.; Chodera, J. D.; Bannan, C.; Zanette, C.; Bayly, C. I.; Lim, N. M. SMIRks Native Open Force Field (SMIRNOFF). https://github.com/openforcefield/smirnoff (accessed January 20, 2018).

578

DOI: 10.1021/acs.jcim.8b00042 J. Chem. Inf. Model. 2018, 58, 565−578