Review pubs.acs.org/CR
Cite This: Chem. Rev. XXXX, XXX, XXX−XXX
Developing and Testing of Lipid Force Fields with Applications to Modeling Cellular Membranes Alison N. Leonard,‡,§ Eric Wang,†,§ Viviana Monje-Galvan,† and Jeffery B. Klauda*,†,‡ Department of Chemical and Biomolecular Engineering and ‡Biophysics Graduate Program, University of Maryland, College Park, Maryland 20742, United States
Downloaded via UNIV OF GLASGOW on February 21, 2019 at 02:44:10 (UTC). See https://pubs.acs.org/sharingguidelines for options on how to legitimately share published articles.
†
ABSTRACT: The amphipathic nature of the lipid molecule (hydrophilic head and hydrophobic tails) enables it to act as a barrier between fluids with various properties and to sustain an environment where the processes critical to life may proceed. While computer simulations of biomolecules primarily investigate protein conformation and binding to drug-like molecules, these interactions often occur in the context of a lipid membrane. Chemical specificity of lipid models is essential to accurately represent the complex environment of the lipid membrane. This review discusses the development and performance of currently used chemically specific lipid force fields (FF) such as the CHARMM, AMBER, GROMOS, OPLS, and MARTINI families. Considerations in lipid FF development including lipid diversity, temperature dependence, phase behavior, and effects of atomic polarizability are considered, as well as methods and goals of parametrization. Applications of these FFs to complex and diverse models for cellular membranes are summarized. Lastly, areas for future development, such as efficient inclusion of long-range Lennard−Jones interactions (significant in transitions from polar to apolar media), accurate transmembrane dipole potential, and diffusion under periodic boundary conditions are considered.
CONTENTS 1. Introduction 1.1. Importance of Lipids in Cellular Membranes 1.2. Complexity of Lipid Head Groups and Membrane Structure 1.3. Molecular Simulations of Lipid Bilayers 1.4. Summary of Past Lipid FF and Model Membrane Reviews 1.5. Scope and Limitations of this Review 2. Introduction to Chemical-Specific Lipid Force Fields 3. All-Atom Lipid Force Fields 3.1. CHARMM 3.2. Slipids 3.3. Lipid14 (AMBER) 3.4. OPLS-AA 4. United-Atom (UA) Lipid Force Fields 4.1. GROMOS 4.2. OPLS-UA 4.3. TraPPE 4.4. C36-UA 5. Coarse-Grained Lipid Force Fields 5.1. MARTINI 5.2. SDK (Shinoda−DeVane−Klein) Force Field 5.3. Multiscale-CG with Force Matching 6. Polarizable Force Fields and Hybrid Models 7. Strengths and Weaknesses of Chemical-Specific Force Fields 7.1. Common Structural Properties 7.2. Other Important Bilayer Properties © XXXX American Chemical Society
7.2.1. Mechanical Properties 7.2.2. Monolayer Surface Tensions 7.2.3. Bilayer Dipole Potential Drop 7.2.4. Lipid Diffusion 7.2.5. Small Molecule Permeation in Lipid Bilayers 7.2.6. Interactions with Water and Ions 7.3. Lipid Diversity in Force Fields 8. Modeling Cellular Membrane Complexity 8.1. Initial Cellular Membrane Modeling with UA/AA Force Fields 8.2. MARTINI CG Models for Large Lipid Bilayers 8.3. CHARMM-GUI: A Tool to Build Complex Lipid Bilayer Systems 8.4. Considerations in Complex Membrane Modeling 9. Conclusions and Perspectives Author Information Corresponding Author ORCID Author Contributions Notes Biographies Acknowledgments References
B B B C D E E G G I J J J K L L L M M N O P R R T
T U V V V W W X X Y Z AA AB AD AD AD AD AD AD AE AE
Special Issue: Biomembrane Structure, Dynamics, and Reactions Received: June 18, 2018
A
DOI: 10.1021/acs.chemrev.8b00384 Chem. Rev. XXXX, XXX, XXX−XXX
Chemical Reviews
Review
(see past reviews14−16 for more in-depth overview). The focus in the Introduction will be variety of the lipid head groups. Glycerophospholipids are the major constituents of eukaryotic membranes, possessing a diacylglycerol backbone,14 and include phosphatidycholine (PC), phosphatidylethanolamine (PE), phosphatidylserine (PS), phosphatidylinositol (PI), phosphatidic acid (PA), and phosphatidylglycerol (PG) (Figure 1). Out of these, PC is the most abundant lipid,
1. INTRODUCTION 1.1. Importance of Lipids in Cellular Membranes
The lipid bilayer is the primary structural component of cellular membranes. Historical experiments such as Overton’s permeability experiments,1 which demonstrated the greater permeability of apolar molecules through membranes, suggested a significant lipidic component. Although flawed assumptions were used, Gorter and Grendel2 suggested a lipid bilayer structure for the membrane. Eventually the membrane’s structure was conclusively proven when electron microscopy experiments directly observed the membrane. Cellular membranes define the capability to support life. These structures, around 3−5 nm in thickness, enable compartmentalization and the separation of life processes from nonliving processes. Current abiogenetic models suggest that entropically driven self-assembly of lipids into vesicles formed the first protocells. Although these first cells were quite permeable to small charged ions such as protons,3 development of these structures over time led to membranes capable of establishing pronounced ion gradients which are harnessed to produce energy. Maintenance of ion gradients is therefore essential to the viability of a cell or organism. Membrane pores, which dismantle the established electrochemical gradients, can be detrimental to cell processes and have been exploited in immune defense systems to eliminate bacterial cells. The poreforming mechanisms of the complement membrane attack complex4 and antimicrobial peptides5 (AMPs) are active areas of research for their potential therapeutic applications. In 1935, Danielli and Davson6 proposed that membrane proteins lay on the exterior of the lipid bilayer, which was updated in 1972 to the fluid mosaic model of Singer and Nicholson,7 incorporating transmembrane proteins as well. The fluid mosaic model, which reconciled a large amount of experimental data, enjoyed success for many years and retains some relevance today. The 1997 proposal of lipid rafts,8 ordered microdomains within cell membranes containing mainly sphingomyelin (SM) and cholesterol (Chol) lipid species, emerged as a challenge to the fluid mosaic model, although these structures are sporadic throughout membranes. Furthermore, estimates have found that proteins compose around one-half of the mass of a plasma membrane,9 requiring an update of the fluid mosaic model to incorporate protein crowding by emphasizing the “mosaic” aspect of membranes. Membrane proteins perform a myriad of cellular functions. An annulus of lipids surrounds each membrane protein similarly to water layers that surround soluble proteins, and these lipids bind to the protein surface through acyl chains.10 These lipid−protein interactions are critical for protein function, especially considering lipid chain length because the protein will adapt its geometry to minimize hydrophobic mismatch. Other influencing factors include the presence of anionic or polar groups in the surrounding phospholipids.11 This is known to affect the gating activity of ion channels12 and also the folding of peptides due to the hydrogen bonding necessary for secondary structures. Depending on membrane composition, the interfacial region has been termed a catalyst for secondary structure folding.13
Figure 1. Chemical structures of phospholipid/sphingolipid headgroups and Chol. Headgroups represented are phosphatidylcholine (PC), phosphatidylethanolamine (PE), phosphatidylserine (PS), phosphatidylinositol (PI), phosphatidic acid (PA), phosphatidylglycerol (PG), sphingomyelin (SM), and ceramide (Cer).
making PC membranes (and in particular 1-palmitoyl-2oleoylphosphatidylcholine or POPC) the most common models when compositional information is unavailable. PE lipids have a relatively small headgroup compared to PC lipids and possess an additional hydrogen-bond donor. PS lipids possess an additional carboxyl compared to PE which gives them an overall negative charge in the carboxylate form, and PI lipids possess a ring in the headgroup with varying phosphorylations that make PI lipids useful in lipid signaling.17 Sphingolipids make up another major class of membrane lipids which differ from glycerophospholipids primarily in the ceramide (Cer) backbone, which also serves as a signaling molecule involved in apoptosis.18−21 Cer tends to form segregated domains and is a structural and regulatory component of lipid rafts.22 It is also essential to the structure of the lipid matrix in the stratum corneum23−25 likely due to its highly hydrophobic nature.26 Sphingolipids have significant hydrogen-bonding capacity compared to glycerophospholipids, particularly due to the amide donor. Sphingosine is the most
1.2. Complexity of Lipid Head Groups and Membrane Structure
Lipid species are widely diverse across organisms, all varying in factors such as chain length, chain unsaturation, and headgroup B
DOI: 10.1021/acs.chemrev.8b00384 Chem. Rev. XXXX, XXX, XXX−XXX
Chemical Reviews
Review
common backbone in sphingolipids, but alternate backbones such as phytosphingosine possess greater hydroxylation, leading to even greater hydrogen bonding which has been suggested to be instrumental in the formation of hydrogenbond bridges between lipid domains.27 Sterols are nonpolar lipids possessing an inflexible ringed structure. Chol is dominant in mammals, while ergosterol is dominant in yeast. When present in fluid membranes, the inflexibility of the sterol ring rigidifies the overall membrane and increases the order of fluid lipids, and an increase in bilayer thickness often accompanies the increase in rigidity and order.28 Sterols have complex interaction dynamics with lipids. Chol, for example, possesses a rough face due to methylation, which contributes to its ordering effect by reducing sterol tilt.29 This methylation also contributes to its orientation with respect to acyl chains: the rough face preferentially orients toward unsaturated tails, while the smooth face orients toward saturated tails.30 The phase behavior of lipid membranes is intimately related to chain order, which is customarily defined using the amount of acyl hydrocarbons existing in the trans state and is quantified by the order parameter obtained from NMR. The liquid-disordered phase is the most fluid and is common in physiological membranes. In contrast, the solid gel phase is the least fluid and is characterized by high chain order and a noticeable chain tilt. Gel phase bilayers exhibit minimal lateral diffusion and have been shown to possess increased pore sealing times.31 The liquid-ordered phase exists in the presence of high sterol concentration and is characterized by high chain order but significant translational mobility and lack of chain tilt. Both fluid and gel phase bilayers can be transitioned into the liquid-ordered phase by the introduction of sterol. The phase of a membrane is strongly influenced by the melting temperature (TM) of the component lipids. For example, physiological membranes require high fluidity and are often composed of POPC which is monounsaturated and possesses a low TM, while the stratum corneum requires high rigidity and is composed of high concentrations of Chol and high-TM Cer. Nonlamellar phases such as the hexagonal or cubic phase are not present in traditional membranes but can occur in transient events such as membrane fusion.14 The major geometric consideration for lipid membranes is the curvature. The Helfrich Hamiltonian32 describes the energy density of a fluid sheet in terms of the principal curvatures, bending modulus, saddle splay modulus, and spontaneous curvature and is applicable to both monolayers and bilayers. The spontaneous curvature, which is zero for some well-balanced lipids in a symmetric bilayer, describes the tendency for component lipids to curve leaflets based on their chemical structure and interspecies interactions. Intuitively, one expects lipids with large headgroups and small tails to induce positive curvature and small headgroups and large tails to induce negative curvature (Figure 2). While the concept is true, molecular shape alone is not sufficient to describe curvature. Other factors, such as electrostatic interactions, also play a role.33 Lipid curvature also influences the activity of proteins: the mechanism of membrane fusion by the hemaglutinin influenza virus is modulated by the curvature of the target membrane.34
Figure 2. Influence of lipid structure on leaflet spontaneous curvature.
that were intended to study the hydrophobic interior of the membrane.35 In 1993, the first detailed bilayer simulation (PC lipids) was performed by Richard Pastor and Rick Venable36 on the picosecond time scale. This was computationally expensive at the time, whereas by current standards, initial bilayer equilibration requires at least tens of nanoseconds. Nonetheless, the initial study provided information about picosecond motions and membrane viscosity and led the way for future simulations. In 1997, a simulation of PC at constant surface areas37 was performed and used to corroborate values obtained from experiment. In 2002, further studies analyzed the presence of water in PC bilayers, probing the effect of water wires38 and water crossing events.39 These studies also provided insight into nanosecond dynamics of lipids, particularly wobble motion in a cone-like potential.40 In the following years, the expansion of computational resources allowed for longer simulations with a greater diversity of lipids. Using the GROMACS simulation program41 and accompanying Berger Groningen Molecular Simulation (GROMOS) FF,42 Lindahl and Edholm43 simulated a small bilayer (64 lipids) of 1,2-dipalmitoyl-sn-phosphatidylcholine (DPPC) for 60 ns and a large bilayer (1024 lipids) for 10 ns. Venable et al.44 simulated the gel phase of DPPC and compared the use of Ewald summation over spherical cutoffs, revealing the importance of detailed all-atom models and constant-pressure ensembles. For some time, simulations mainly considered the use of zwitterionic PC and PE lipids in membrane models, but further progress allowed the simulation of anionic lipids such as PS45−47 and PG,48 providing useful models for anionic bacterial membranes. In all of these studies, positive counterions were included to create an overall electroneutral system and stabilize the bilayer by charge screening. After common phospholipids were parametrized, FF development shifted toward sphingolipids, of interest for their prevalence in lipid rafts8 and neuronal membranes.49 The first simulations with sphingomyelin (SM) were performed using the GROMOS FF. In 2003, Chiu et al.50 performed simulations of pure SM bilayers and compared the properties to DPPC bilayers, finding greater ordering and intramolecular hydrogen bonding. Niemala et al.51 similarly found suppressed dynamics in the lateral and rotational motions of SM compared to DPPC. Rog et al.52 simulated SM/Chol and PC/Chol mixtures, finding transitions from the gel to liquidordered phase common at high sterol concentrations. In
1.3. Molecular Simulations of Lipid Bilayers
Efforts to model lipid bilayers with molecular dynamics (MD) began in the 1980s with the simulation of hydrocarbon chains C
DOI: 10.1021/acs.chemrev.8b00384 Chem. Rev. XXXX, XXX, XXX−XXX
Chemical Reviews
Review
Na+ ions bound to DPPC bilayers with greater frequency than K+, generating a membrane potential of −70 mV. Another study68 compared the effect of various chloride salts on the structure of a DPPC bilayer using the OPLS-AA FF. The authors found that, with the exception of K+, cations bind to the lipid head oxygens and impact dipole orientation, lipid order, and charge distribution. However, the authors of the above studies note the unknown effects of FF dependence on these results. A more recent study69 using the C36 FF parametrized lipid−ion interactions using osmotic pressure data, resulting in an improvement of deuterium order parameters for lipids with anionic headgroups. Investigation of ion interactions with membranes in simulation is ongoing. Future efforts include development of lipid−ion interaction parameters for more species of ions and verification of current and new parameters through various experimental methods. Recent development aims to incorporate lipid diversity and achieve long simulation trajectories, although these are computationally expensive. Efforts have been made to model diverse mixtures containing three or more components, and simulations have been extended into microsecond ranges (see section 8 for several examples).
agreement with the previous studies, the hydrogen bonding was found to be more significant in SM over PC. Later, Metcalf et al.53 performed simulations on SM/Cer bilayers to model bilayers after conversion by sphingomyelinase and found structural differences corresponding to liquid-ordered and disordered phases. The sphingolipid FF was unavailable in the Chemistry at Harvard Macromolecular Mechanics (CHARMM) community for some time; leading the McCabe group to develop a custom FF based on analogy with the protein FF and ab initio calculations, which demonstrated better agreement with experiment than GROMOS.54 However, this did not accurately reproduce gel phase transitions in SM bilayers, and a sphingolipid update from Venable et al.55 was later published based on data from X-ray form factors. Today, most sphingolipid simulations in the CHARMM community use C36, although GROMOS remains popular and has been used to simulate the perturbation of Cer bilayers by dimethyl sulfoxide.56 Molecular dynamics simulations can currently model a wide variety of lipids, including saturated and unsaturated hydrocarbon chains, PC, PE, PS, PG, and sphingomyelin (SM) headgroups, ether and ester linkages between the glycerol backbone and tail, methylated tails, cyclopentene rings, and various sterols (see section 7.3 for details). Simulation studies commonly include membrane-bound proteins, peptides, ions, and various water models. With all-atom simulations that can reach the microsecond range, exploration of membrane dynamics is of increasing interest. While mixed lipid bilayers more accurately model biological systems, simulations of single-component bilayers remain useful for FF validation. Such studies compare results from simulation with experimental data,57 particularly order parameters obtained from NMR, or use single-component systems as initial tests for the development of new analyses such as lipid clustering,58 permeability,59 or curvature.55 While the permeability of small apolar molecules such as gases can be analyzed in unbiased MD simulations, sampling of charged or large molecules requires long time scales or enhanced sampling methods such as umbrella sampling. Simple binary mixtures have been used to model physiological systems. In addition to physiological applicability, binary systems generally have more experimental data for comparison. Chol, which is known to modify the properties of membranes, is a frequent component of binary models. Even in the early days of the simulation field, PC/Chol simulations were carried out and analyzed for lipid surface area60 and chain order parameters. Chol was found to induce visually apparent transitions into the liquid-ordered phase for saturated PC at concentrations above 12%.61 Lipids rafts, which are mainly composed of SM and Chol, have been simulated multiple times using binary mixtures of these lipids.62−65 The longest simulations of binary mixtures extend to hundreds of nanoseconds, which adequately capture the wobble dynamics of pure SM bilayers but is not long enough to capture the dynamics of high concentrations of Chol. Binary mixtures of PC and Cer have also been used to study the effect of Cer as models of the stratum corneum.66 Although PC is not present in the stratum corneum, this allowed for comparison with experiment and provided insight into Cer’s role in ordering membranes. Effects of ions on lipid membranes have been explored with MD simulations. A study67 using a derivative of the GROMOS87 FF with Berger lipid parameters67 found that
1.4. Summary of Past Lipid FF and Model Membrane Reviews
Reviews of lipid FFs in recent years have focused on the authors’ areas of expertise or community affiliations. Klauda et al.70 published a short review on important considerations when parametrizing a pairwise additive all-atom/united-atom lipid FF with emphasis on the CHARMM community of FFs. This introduced the importance of using quantum mechanics (QM) and small molecule experimental data in parametrizing the lipid FF, with the aim to accurately represent bilayer properties in the constant pressure and temperature (NPT) ensemble. Several years after this review, the first of many lipid FFs was published that allowed MD simulations in the NPT ensemble, and a review of the strengths and weaknesses of the CHARMM36 lipid FF71 was published by Pastor and MacKerell.72 More recently, a review on the polarizable Drude oscillator model used in CHARMM including that for the lipids was published.73 For a coarse-grained (CG) representation of lipids (multiple heavy atoms in a given site, see details in section 2), Marrink and Tieleman74 reviewed the development of the popular MARTINI CG FF applications in lipid simulations. Each of these reviews focused primarily on a given community, but a recent review by Lyubartsev and Rabinovich75 provided an overall summary of common lipid FFs and how their accuracy was tested. Ollila and Pabst76 compared the accuracy of commonly used all-atom (AA) and united-atom (UA) lipid FFs, noting areas of remaining deficiency. In 2012, Piggot et al. evaluated structural properties of DPPC and POPC bilayers for various versions of GROMOS lipids and C36 lipids, noting deficiency in the Kukol modifications77 for monounsaturated POPC.78 Reviews have also focused on the complexity of in vivo cellular membranes. The importance of cholesterol and lipid organization has been reviewed with emphasis on its contribution to lipid-associated disorders in humans.79 The fluid mosaic model of Singer and Nicholson7 with its isolated proteins in a sea of lipids has been updated and reviewed by Engelman80 to represent the current knowledge of varying membrane regions of composition, thickness, and protein organization. The complexity of lipid composition and D
DOI: 10.1021/acs.chemrev.8b00384 Chem. Rev. XXXX, XXX, XXX−XXX
Chemical Reviews
Review
with multiple molecules in a single site (beyond coarsegraining, as with some dissipative particle dynamics studies) are not reviewed as they are not chemically specific.
resulting membrane biophysical properties and function was nicely reviewed by van Meer, Voelker, and Feigenson.14 The lipid diversity of organelles in yeast and mammals is illustrated in this review along with membrane phase behavior, lipid transport, lipid synthesis, and lipid metabolism. The wide diversity of lipids in biology was classified by Fahy et al.,16 providing an excellent resource for understanding the enormous lipid diversity in nature. Many past reviews have focused on details of simple model membranes (1−3 lipid species) that may contain cholesterol and model interactions with membrane-active compounds.78,81−83 The simulation field has been only recently able to obtain lipid membrane models that aim to incorporate the lipid diversity in vivo. Moreover, with the development of more accurate lipid FFs, reviews and comparisons on global membrane properties relevant to cells, e.g., mechanical properties,84 have provided details in the accuracy of measures not typically targeted in FF optimization. There have been some recent reviews on lipid diversity incorporated into MD simulations85,86 providing background and insight into where the field of in vivo cell membrane modeling is progressing. Reviews have also encompassed multiscale modeling of membranes and membrane-associated proteins such as those by Pluhackova and Böckmann.87 Concurrently, Marrink et al.88 provide a comprehensive review of modeling membrane complexity with emphasis on MD, including computational methods and limitations, properties of multicomponent membranes, lipid−protein interactions, and in vivo models such as bacterial and organelle membranes. Marrink et al. discuss applications of MD modeling to realistic cell membranes, whereas the present review focuses on the capabilities and development of MD lipid FFs and implications for membrane models.
2. INTRODUCTION TO CHEMICAL-SPECIFIC LIPID FORCE FIELDS The lipid molecule is amphipathic with a headgroup that is hydrophilic (“water loving”) and tails that are hydrophobic (“water disliking”) (Figure 3). One key property for
Figure 3. Schematic of a lipid molecule (POPS here) and classifications of the two regions (head and tails).
simulations to model is the self-assembly of these molecules to form various structures depending on temperature and water concentration, e.g., micelles, bicelles, worm-like micelles, and bilayers.96 Accurate balance is needed between the hydrophobicity of the acyl chains and the hydrophilic nature of the lipid head groups to allow for the assembly of these structures. The phase state of the lipid bilayer can also vary depending on conditions, such as lipid composition. Bilayers without sterols typically exist in a fluid-like state known as the liquid-crystalline (Lα) phase that contains disordered lipid acyl chains (mix of trans and gauche conformations).14,97 At low temperatures, a gel phase (Lβ) can form with ordered acyl chains (mainly in the trans conformation), forming a tilt angle with respect to the bilayer normal.97 Depending on the lipid head groups and chain type, an intermediate phase can form known as a ripple phase (Pβ) in which regions of chain order (gel-like) alternate with regions of disorder or chain interdigitation, repeating in a ripple formation97−99 (PC with fully saturated tails is an example that forms this phase). The introduction of sterols can alter phase behavior. Disordered acyl chains with cholesterol in the membrane are known as liquid disordered (Ld), but cholesterol can increase acyl chain order above the gel phase melting temperature and form ordered acyl chains with a phase called liquid ordered (Lo).14,100 Accurate lipid force fields must be able to capture the balance of these chemical-specific interactions to represent phase changes. A molecular FF for lipids must be able to represent the internal molecular structure important for describing the bond distances, angles, and dihedral states along with the intermolecular interactions important for forming selfassembled structures. This review covers three groups of chemical-specific lipid force fields each with varying level of detail to describe the lipid molecule. The most detailed are the AA lipid FFs that explicitly include all hydrogen atoms of the lipid molecule (Figure 4). Although AA FFs can be
1.5. Scope and Limitations of this Review
The scope of this review is to provide an extensive assessment of lipid FFs that include chemical detail. An introduction to these lipid FFs is presented in section 2, which distinguishes between AA, UA, and CG FFs. Section 2 also presents the working mathematical inter/intramolecular potentials used by these FFs. Section 3 reviews the history of AA lipid FF development and how accuracy has improved over the past decade to reproduce bilayer properties at the atonic level. Section 4 contains a similar review of UA lipid FFs, and section 5 presents chemical-specific CG FFs. Section 6 describes methods used to explicitly treat atomic polarizability in lipid FFs. Section 7 provides a description of the strengths and weaknesses of the FFs covered in sections 3−6 and compares benefits and disadvantages for all levels of chemical detail (AA to CG and polarizable FFs), as well as within each level how FFs compare. Section 8 reviews recent studies in modeling complex lipid diversity. Lastly, section 9 provides insight into where improvements are needed for the next-generation FFs and where the field is progressing in cellular membrane modeling. This review focuses on chemically based FFs and standard approaches. It does not include membrane modeling with virtual sites89,90 to allow for longer time steps in AA MD or the use of FFs with discontinuous or nonstandard potentials.91−94 Enhanced sampling methods are not reviewed; details regarding these methods can be found in Mori et al.95 Complex membrane models are broadly discussed as examples for the current state and future direction of the field. Models E
DOI: 10.1021/acs.chemrev.8b00384 Chem. Rev. XXXX, XXX, XXX−XXX
Chemical Reviews
Review
assumed to be constant and not influenced by environmental changes. All chemical-specific lipid force fields follow the Class I form (limited to the harmonic approximation of bonded terms) with varying approaches for some of the terms for N atoms or interaction sites U (r1 , r2 , ···, rN ) =
∑
kb , ij(rij − r0, ij)2
bonds
+
∑
kθ , ijk(f (θijk) − f (θ0, ijk))2
angles
Figure 4. Comparison of AA, UA, and CG force field atoms/particles for a POPC lipid.
+
+
all atom
united atom
coarse grained
polarizable
years published
CHARMM22 (C22)101,102 CHARMM27 (C27)103 CHARMM27r (C27r)104,105 GAFF106,107 CHARMM36 (C36)71,108,109 Slipids110,111 Lipid14112 OPLS-AA113 GROMOS (Berger)42 GROMOS (Chiu)114 OPLS-UA115,116 GROMOS (45 A3)117 GROMOS (53a6)118 TraPPE-UA119 C36-UA120 MARTINI121,122 multiscale-CG123,124
1996, 2000 2005 2007, 2010 2012, 2014 2014 1997 1999 1999, 2003 2004 2013 2014 2004, 2005
SDK125 dry MARTINI126 Drude127 polarizable water MARTINI128
2010 2015 2013 2010
1997
2008 2013
2000
2007
∑
Udih(φijkl) +
dihedrals
+
∑ electrostatic
∑ ULJ(rij) LJ
qiqj 4πε0rij
(1)
The first four terms in eq 1 are bonded terms that describe internal degrees of freedom of neighboring molecules. These approximate the quantum mechanical (QM) description of shared electron orbitals that form chemical bonds. The bonds term in eq 1 is a harmonic approximation of the bonded potential that describes the energy of stretching a bond from its equilibrium distance (r0,ij). The angles term describes the energy for scissoring the angle from its equilibrium value (θ0,ijk) and based triplicates of atoms connected by two bonds. In most FFs, this is just the angle, but the MARTINI lipid FF uses cosine of the angle.122 For conditions typically observed in biological application, the harmonic approximation is valid as changes from the equilibrium distances are minimal and these force fields are not reactive (no bond breaking or making). For all-atom or united-atom force fields, these two terms are obtained using experimental data on r0,ij and θ0,ijk with their associated force constants from vibrational spectra or if unknown the use of QM to provide these values. For complete accuracy, these parameters will vary between molecules of the same bond type (for example, a C−C single bond); however, transferability (using the same parameters to a bond or angle of a single type) is a goal in FF parametrization to easily extend the parameters to other molecules of the same type. The third term in eq 1 (cross, improper) is added to provide a proper description of (1) out-of-plane orientations of an atom relative to a plane made by three other atoms (improper) and (2) 1−3 interactions between atoms that are not bonded but connected by two bonds (cross). For AA FFs these add an additional level of flexibility to properly represent intramolecular structure. For CG FFs, these can be required to avoid any unphysical structures.125 The fourth term in eq 1 is used to describe the dihedrals in a molecule and best represent torsional energies about a bond. The approach used by different FFs results in different functional forms for these terms and will be provided in section 3 for each of the FFs. The last two terms describe key nonbonded interactions that represent electrostatic (charge−charge) interactions and repulsion−attraction interactions with a Lennard−Jones (LJ) potential (details of varying functional forms will be provided in section 3). The electrostatic term follows Coulomb’s Law for charge−charge interaction with vacuum permittivity (ε0). The charge here for UA and AA FFs is a partial charge to represent the QM description of charge distribution on atoms
Table 1. Summary of Lipid Force Fieldsa lipid FF
Ucr,im(r , ϕ)
improper
computationally demanding, they provide chemical detail without lower resolution approximations. The UA FFs lump nonpolar hydrogens into the attached heavy atom as one single interaction site (Figure 4). For example, the methyl (CH3) and methylene (CH2) atoms will each be represented by a single site for each of the two groups, thus reducing the interaction sites from 4 to 1 and 3 to 1, respectively. This reduction of interaction sites allows for less computational demand. If parametrized well, these can be of similar accuracy to the AA FFs. The UA approach can be further reduced by combining several heavy atoms into a single interaction site (Figure 4) known as a CG FF. The level of coarse graining can vary, but typically for chemical-specific FFs, up to 3−4 heavy atoms can be combined to a single site. Table 1 lists commonly used FFs
grouping
∑ cross,
community CHARMM CHARMM CHARMM AMBER CHARMM Self/AMBER AMBER OPLS GROMOS GROMOS OPLS GROMOS GROMOS TraPPE CHARMM MARTINI self self MARTINI CHARMM MARTINI
a
The associated publishing community is also given and those developed independent of other communities are listed as “self”.
of all three types. The associated publishing community is given to emphasize the compatibility of the lipid FF with other biomolecules, i.e., FFs should not be mixed from different communities. The first type of force fields to be described is pairwise additive force fields, in which the atomic charge distribution is F
DOI: 10.1021/acs.chemrev.8b00384 Chem. Rev. XXXX, XXX, XXX−XXX
Chemical Reviews
Review
represents the response of the electron cloud to its local environment. The charge on the Drude particle is represented by its polarizability αi with qD, i = kDαi . Although one could allow the induced polarization to fully relax at every step using an iterative self-consistent approach, this is computationally demanding. Alternatively, an extended Lagrangian method can be used, where the Drude particles have a small mass (0.4 amu) taken from the attached real atom.135,138 In this approach, the simulation proceeds much like a classical pairwise additive MD simulation with an additional Drude particle on non-hydrogen atoms. The additional Drude particles and the need for a 1 fs time step (vs 2 fs) makes this slower than a pairwise additive FF but faster than fluctuating charge approaches. The following three sections will provide details of lipid FFs from the fully atomistic to united atom to coarse-grained force fields. Section 6 will discuss models of atomic polarizability in lipid FFs.
or group of atoms. The methods used to obtain the partial charges vary between FFs. To increase efficiency in computationally expensive pairwise calculations for long-range electrostatics, all-atom and united-atom classes of FFs, as well as some CG FFs, treat the sum over electrostatic interactions with the particle-mesh Ewald (PME) method.129 Within a supplied cutoff distance, interactions between each interaction site are summed directly. Outside the cutoff distance, interactions are approximated with a Fourier transform to provide the long range electrostatic contribution, important for the slowly decaying inverse of distance. In the vicinity of the cutoff, the potential is modulated to maintain continuity.129 The LJ term is used to represent both the repulsion (Pauli exclusion) and the attraction (multipoles) felt between molecules. The AA and UA lipid FFs represent this with a 12−6 LJ potential with the attraction represented by a dipole− dipole dispersion term of Cdisr−6, where Cdis is a constant. In pairwise additive FFs, Cdis is an effective parameter that lumps together higher order multipoles. The repulsion term, which should theoretically be exponential, is represented by Crepr−12 because this is just the square of the term used for the dispersion energy, thus simplifying the need to compute the exponential of a distance. Although this is an approximation, there is minimal loss of accuracy in biological applications that typically do not require an accurate description of unfavorable close contacts of atoms (only seen at very high pressures). Other power series in distances are used by some CG FFs;125 details are provided in section 5. The constant partial charge assumption is the most severe of the assumptions listed above for the Class I pairwise additive FFs. To resolve this issue, there have been varying approaches to account for atomic polarizability and better represent the effect of local environment on molecular electrostatics. Explicit changes of the electrostatics of atoms or interaction sites were the first approaches used to describe this with fluctuating charge or induced dipole schemes. Patel and co-workers130−134 were one of the first to implement a scheme (fluctuating charges) to include polarization in a lipid FF. The fluctuating charge scheme uses a charge equilibration approach,132 allowing charges to adjust to varying local environments. However, this requires a level of iteration that is computationally demanding and reduces its applicability to complex biological systems with current computational resources. More recently, the classical Drude oscillator model has been implemented for the CHARMM family of FFs127,135,136 and will be our focus here for polarizable FFs with atomistic detail. The Drude FF uses a virtual particle, i.e., the Drude particle, that has a charge qD,i and is connected to a polarizable atom i with a harmonic force constant kD. This allows the use of the Class I energy function (eq 1) with an additional term to describe the energy associated with the Drude particle137 i qD, iqj qD, iqD, j 1 jjj jj∑ +∑ 4πε0 jj i < j rDi − rj rDi − rDj i100 ns to reach thermodynamic equilibrium, as was the case with C36 development of sphingolipids.55 Continuing with the development of the intermediate lipid FF from C27r, Högberg et al.177 aimed to improve the lipid FF I
DOI: 10.1021/acs.chemrev.8b00384 Chem. Rev. XXXX, XXX, XXX−XXX
Chemical Reviews
Review
The initial work by the GAFFlipid191 and Lipid11190 allowed for the final and current update for the lipid FF in AMBER, known as Lipid14.112 The approach and testing ultimately followed the same benchmarks as the C36 lipid FF with the same lipids (DLPC, DMPC, DPPC, DOPC, POPC, and POPE). Five 125 ns replicas of pure bilayers for each lipid were used to provide ample equilibration and unbiased averages of the reported properties. The SA/lip, isothermal compressibility, thickness, and NMR order parameters all agree favorably with experiment to the same level of accuracy of C36. However, SCD for the headgroup and order splitting for carbon-2 on the sn-2 chain were not reported or targeted. After extensive optimization, the AMBER community now has a strong lipid FF that works well with the tensionless NPT ensemble.
for charges falls more in line with the AMBER family of FFs. Moreover, tests with a WALP23 peptide (modeled by three versions of the AMBER protein FF) demonstrate Slipids’ compatibility with proteins modeled by the AMBER FF.110 3.3. Lipid14 (AMBER)
The AMBER community has been limited until recently in having an accurate and fully developed FF for lipids. Initial work developed an AMBER-compatible partial charge assignment for DLPE in 1992.185 A bilayer of 48 lipids with a hydration of 11 waters per lipid was simulated at 315 K. This low hydration (below the full-hydration limit of ∼25 waters per lipid)97 MD simulation of 0.2 ns with the SPC/E water model186 resulted in something close to the Lα phase, but the simulation time scale was too short to probe equilibrium structures. Then in 2001 Moore et al.187 developed an AMBER lipid FF for DMPC fully hydrated with SPC/E water. This followed the standard RESP procedure for partial charge determination and was simulated for 10 ns. The resulting SA/ lip was 57.7 Å2, which is 8 Å2 lower than experiment at the same temperature of 333 K (65.7 ± 1.3 Å2).188 At the beginning of the millennium, the CHARMM community had developed the C27FF and was on the verge of developing the C27r lipid FF. During the same time period, the AMBER community developed a general AMBER FF (GAFF) geared toward drug molecules in 2004.189 Since the availability of lipids in the AMBER FF was minimal, a logical idea was to see how well GAFF can reproduce lipid bilayer properties. Jójárt and Martinek were the first to publish the extension of GAFF to fully hydrated POPC membrane lipids in 2007.106 Maintaining reasonable accuracy with experimental properties of the POPC bilayer required 60 mN/m of an applied surface tension based on 20 ns simulations. This was 3fold the surface tension needed for fully saturated DPPC bilayers simulated with the C27r FF.70 Similar condensing of the lipid bilayer was observed for MD simulations with GAFF with DMPC and DOPC lipids.107 Although the GAFF was able to match certain properties of the lipid bilayer, issues similar to C22 and C27/C27r existed in that an applied surface tension or constant lateral area was needed to fully reproduce experiments. Developing a transferrable lipid FF to the accuracy of C36 was an important goal for the AMBER community, so that MD simulations of membrane-associated proteins could be performed. A year after the C36 lipid FF was published, Skjevik et al.190 published a more extensive development of a lipid FF called Lipid11. A careful development of LJ and charge parameters was the focus in Lipid11 that allowed for lipid-specific parameters to improve modeling of lipid bilayers. The authors acknowledged that this was preliminary work to develop a more accurate lipid FF. Tests on bilayers of DOPC, POPE, and POPC resulted in the need of an applied surface tension of 10−26 mN/m to maintain good agreement with experimental scattering profiles and NMR SCD. In the same year, preliminary efforts were published with improved acyl chain FF parameters that obfuscated the need for an applied surface tension named GAFFlipid.191 This resulted in reasonable structural predictions for saturated (di-12:0, di14:0, and di-16:0) and unsaturated (di-18:1, 16:0/18:1) PC lipids and POPE (16:0/18:0). Modifications in the partial charges, LJ, and dihedral potential for the acyl chain helped improve the bilayer properties.
3.4. OPLS-AA
The Optimized Parameters for Liquid Simulations All Atom (OPLS-AA)FF was originally developed by Jorgensen et al.192 for alkanes. OPLS-AA has been parametrized for various biomolecules,193−195 but only recently have lipids been parametrized for this FF.196 The lipid FF development of partial charges followed a similar route as that of the AMBER community, use of QM to obtain partial charges, and found charges based on a continuum solvent model best represented experimental observables. Similar to the C36 lipid FF, small molecules to represent the PC lipid headgroup were used to optimize dihedrals. The focus of parametrization was limited to 200 ns MD simulations of DPPC at 323 K. The resulting calculations for the acyl-chain NMR SCD and X-ray form factors all agreed favorably with experiment. Moreover, the inclusion of dihedral optimization for the lipid headgroup results in good agreement with NMR for this region of the lipid. This has since been extended to lipids with chain unsaturation, cholesterol,197 ether linkages,198 and PE.198 Ultimately, continued development is needed to reach the diversity available in C36 and the AMBER-associated FFs.
4. UNITED-ATOM (UA) LIPID FORCE FIELDS In UA models, nonpolar hydrogens are not independent interaction sites; instead, the nonbonded parameters of the parent atom are optimized to include the steric effect of hydrogens. For example, CH2 and CH3 groups are each modeled with a single particle. This reduces the number of atoms in a bilayer or monolayer simulation by about a factor of 3, increasing efficiency. Polar hydrogens, which can participate in hydrogen bonds and other polar interactions, are separately modeled and may be given a Lennard−Jones potential to prevent an infinite negative energy at the location of the hydrogen atom.199 While early releases of CHARMM,200 AMBER,201 and OPLS202 included united-atom force fields with polar hydrogens, the development of these FFs now favors the increased detail of the AA approach. Of currently available UA lipid FFs, GROMOS203,204 offers the greatest variety and attracts the most users. This section will provide a brief history, parametrization strategy, and validation for each UA lipid FF in Table 1, including GROMOS (its several FF variants), OPLS-UA, TraPEE-UA, and C36-UA. Comparisons in accuracy and diversity will be made in section 7. J
DOI: 10.1021/acs.chemrev.8b00384 Chem. Rev. XXXX, XXX, XXX−XXX
Chemical Reviews
Review
and ε2 and κ are the relative permittivity and inverse Debye screening length of the medium outside the cutoff sphere defined by Rrf, respectively. The sum in eq 9 can also include atom pairs excluded from Coulombic contributions. Even though atom i might be excluded from atom j in Coulombic contributions, it still feels the reaction field due to atom j and is not excluded from the reaction-field sum. Several variants of the GROMOS lipid FF are in current use. Models adapted from GROMOS96207 include the 45A3 parametrization of DPPC by Chandrasekhar et al.,117 adjusted by Oostenbrink et al.206 The final adjustments by Oostenbrink to the nonbonded lipid parameters aimed to reproduce solvation properties of small polar molecules and did not provide validation of bilayer properties. A second branch of development, initially by Chiu et al.,208 includes adjustments to nonbonded parameters by Berger et al.42 for use with the OPLS FF,202 called the “Berger lipid FF”. The most recent version of this FF, 43A1-S3, includes updated partial charges and corresponding dihedral parameters.204 In 2009, the set was expanded to model mixed-lipid bilayers, with saturated lipids (DLPC, DMPC, and DPPC), monounsaturated lipids (DOPC and POPC),204 and SM.209 This set showed excellent agreement with experimental SA/lip for DLPC and DPPC. SA/lip for DMPC was overestimated by 1.5 Å2 and for DOPC underestimated by 1.4 Å2. Following adjustment of torsional and nonbonded parameters by Oostenbrink (parameter sets 53A5 and 53A6), DPPC bilayers failed to reproduce experimental data with satisfactory accuracy,77,210 leading to the trend of using the GROMOS87 FF211 with lipid parameters based on the Berger set discussed above.212,213 While these membrane simulations reproduced bilayer attributes well, the GROMOS87 FF for proteins lacks the accuracy of modern protein FFs. A year before the release of 43A1-S3 (2008), Andreas Kukol defined a new ester− carbonyl atom type with a larger van der Waals radius, 0.664 nm instead of 0.336.77 This parameter set borrowed partial charges from Chiu et al.208 discussed above and included models for DMPC and monounsaturated lipids POPC and 1palmitoyl-2-oleoyl-sn-glycero-3-phosphoglycerol (POPG). Reported SA/lip for DPPC and POPG were within 1 Å2 of experimental values, although DMPC SA/lip was overestimated by 2 Å2. The parameters in this study are compatible with standard GROMOS 53A6 FF. A later study by Piggot et al. reported deficiencies in structural properties when Kukol’s modifications were used to model monounsaturated POPC but showed good agreement with experimental bending constants and deuterium order parameters for DPPC bilayers.78 A final attempt to combine Chiu charges with 53A6 atom types was made by Poger et al. in 2010.203 By varying the interaction between choline methyls and nonester phosphate oxygens, SA/lip of a pure DPPC bilayer was well reproduced, as well as lipid volume and SCD of the palmitoyl chains. A follow-up study modeled DLPC, DMPC, and monounsaturated DOPC and POPC.214 While the authors claim to reproduce experimental bilayer structure and hydration, SA/lip in the four additional lipids is overestimated by an average of 7 Å2. Order parameters for three of the four added lipids agree well with experiment; the decrease in order at C10 in POPC is not present in the model. Overall hydration of the lipid headgroups (around 14.5 waters/lipid) agrees with experimental estimates (11−20 waters/lipid). However, the artificially high SA/lip impacts both headgroup hydration and order parameters; i.e., a more expanded bilayer would
4.1. GROMOS
The Groningen Molecular Simulation (GROMOS) computer program package has an associated diverse UA FF capable of modeling alkanes,205 lipids,77,203,204 sugars, nucleotides, and proteins.206 Parameterization of the GROMOS lipid FFs relies on electronic structure computations for bond parameters and charge distributions. Assignment of van der Waals parameters focuses on reproduction of thermodynamic properties of model compounds, such as enthalpies and free energies of solvation.204 Uniquely, in development of the current protein FF,206 free enthalpies of solvation in both polar and apolar environments were targeted to accurately represent partitioning in anisotropic systems such as lipid membranes. In the GROMOS family of FFs, nearest and next-nearest neighbor (1−2 and 1−3) interactions between bonded atoms are excluded from the sum over nonbonded interactions.206 From 1−4 pairs of atoms that comprise atomic rings such as cyclohexane are also excluded. This enhances the effect of improper dihedral parameters, which maintain planar orientation of aromatic rings. For the same reason, some hydrogen atoms bound to carbons in aromatic rings are also excluded from 1−4 and 1−5 counting. For alkane moieties, 1−4 nonbonded interactions are also excluded from the nonbonded pair list for these atom types. Improper dihedral and nearest-neighbor stretching interactions are modeled with simple harmonic potentials. Proper torsional interactions are treated using a trigonometric sum over all proper torsions with force constants Kn, phase shifts δn, and multiplicities n206 U trig(φijkl) =
∑ ∑ K n[1 + cos δncos nφijkl] dihedrals ∀ n
(7)
where φijkl is the value taken by the dihedral at a given time step. The form of the van der Waals potential in GROMOS is206 U LJ(rij) =
ji C12ij
∑ jjjjj ij
k
rij12
−
C6ij yzz zz z rij6 z {
(8)
where rij is the separation between atoms i and j at a given time step. Interaction parameters C12ij and C6ij are calculated from parameters for individual atoms (C12ii for atom i, for example) using geometric combining rules. van der Waals interactions that fall within a short cutoff length, Rp, are evaluated at each time step from a pair list that is constructed at each Np time step. Also at each Np time step, interactions between atoms separated by distances between Rp and a long-range cutoff Rl are also evaluated. These remain constant between pair list updates. In addition to the direct Coulombic interactions, GROMOS includes a reaction-field contribution VRF representing the interaction of atom i with the induced field of a continuous dielectric medium outside a cutoff (Rrf) due to the presence of atom j206 U RF(q) =
∑ ij
qiqj − 1 Crf rij2 2 4πε0ε1
R rf3
(9)
where Crf =
(2ε1 − 2ε2)(1 + κR rf ) − ε2(κR rf )2 (2ε1 − 2ε2)(1 + κR rf ) + ε2(κR rf )2
(10) K
DOI: 10.1021/acs.chemrev.8b00384 Chem. Rev. XXXX, XXX, XXX−XXX
Chemical Reviews
Review
The United-Atom Optimized Parameters for Liquid Simulations (OPLS-UA) was initially parametrized for DMPC215 by borrowing from various other FFs, including MM3 for hydrocarbons,216 AMBER for phosphate Lennard−Jones parameters (scaled to be compatible with other OPLS-UA LJ parameters),217 and OPLS-UA amino acid FF for choline bond and angle parameters. Partial charges were taken from an ab initio study by Charifson et al.,218 also incorporated into AMBER. This partial charge distribution reproduces experimental estimates of the dipole moment (∼20 D)219 of a DMPC headgroup. This piecemeal model preceded the OPLSAA model of DPPC by 15 years, despite extensive parametrization of other biomolecules for OPLS-AA. SA/lip for a pure DMPC bilayer at 303 K agrees well with experiment (sim, 60.6 Å2/lipid; exp, 60.2). Models for cholesterol220 and epicholesterol221 were added in follow-up studies. Stretching, bending, and torsional parameters were borrowed from OPLS-UA parameters for hydrocarbons, with the exception of hydroxyl groups, which were treated in full atomic detail. Partial charges of cholesterol were fit to minimize error in comparison to QM electrostatic potential maps. With no recent developments and capabilities to model only DMPC and cholesterol, this FF lacks the diversity needed to model biological membranes.
In the extension of TraPPE to phospholipids,230 flexible bonds were used. Stretching and bending constants were taken from the CHARMM C36 FF. Combining C36 torsions and bending constants with TraPPE LJ and partial charge parameters resulted in poor prediction of rotational barriers for model compounds. To remedy this, dihedral parameters were reoptimized to include 1−4 LJ and Coulombic interactions. Choline and dimethyl phosphate were used to model torsions in the lipid headgroup, ethyl acetate the linker, and hexane and hexene the tail regions. Fine tuning of dihedral parameters utilized larger model compounds. Partial charges were determined from ab initio calculations on complete head groups. The SPC/E water model was used.231 Complete models were built for saturated- and unsaturatedchain, zwitterionic, and charged lipids (10 in total): DPPC, DMPC, DLPC, DOPC, POPC, POPE, DOPS, dilauroylphosphatidylglycerol (DLPG), dimyristoylphosphatidylglycerol (DMPG), and dioleoylphosphatidylglycerol (DOPG). The set shows good agreement with experimental SA/lip, with the exception of POPE, which is overestimated by 2.3 Å2. However, area compressibility moduli are systematically overestimated by about a factor of 1.5 for PS and PC lipids. While SCD of the acyl chains are slightly underestimated for DLPC and DMPC, trends are reproduced and close agreement with experimental SCD is exhibited for DPPC and POPC. TraPPE-UA for lipids has not been widely used because compatible FFs for other biomolecules, such as proteins, nucleic acids, and carbohydrates, have not been developed.
4.3. TraPPE
4.4. C36-UA
Originally developed to model vapor−liquid coexistence, the Transferable Potentials for Phase Equilibria (TraPPE) FF specializes in accurate prediction of phase equilibria and other thermophysical properties over a range of conditions and is thus applicable to industrial research. TraPPE-UA models CHn, n = [0,4] each as single interaction sites. Polar atoms and associated hydrogens are treated explicitly. TraPPE-UA FFs exist for alkanes,222−224 alkenes,225 alcohols,226 thiols,227 sulfides,227 amines,228 amides,228 ethers,222 glycols,229 and aldehydes.229 However, the FF cannot currently model larger biomolecules such as proteins or nucleic acids. A recent extension of this highly transferrable FF to lipids proved successful at reproducing key structural properties of a variety of pure lipid bilayers.230 With the exception of lipids and aromatic rings, the TraPPE models listed above are semiflexible with fixed bond lengths. For fully flexible models (including lipids), stretching is modeled with a harmonic potential. In the cases of the lipid FF, force constants associated with nearest-neighbor stretching are taken from the C36 lipid FF. The torsional potential is the standard form in eq 5230 and similarly for the LJ potential (eq 6). LJ potentials are calculated directly using a spherical truncation at either 12 or 14 Å. Long-range LJ contributions to energy, pressure, and chemical potential are evaluated using an analytical tail correction assuming an isotropic and homogeneous medium. Long-range electrostatic interactions are computed using the Ewald summation technique. TraPPEUA lipid FF only includes nonbonded LJ and Coulomb interactions between atoms separated by four or more bonds, and 1−4 nonbonded interactions were used to calculate dihedral parameters. Improper torsions and nearest-neighbor bond stretching (in the lipid FF) are maintained with basic harmonic potentials.
The CHARMM36 United-Atom (C36-UA) chain model232 takes a UA approach to modeling the acyl chains of lipids while using AA C36 for each atom in the lipid headgroup. It differs from the earlier C27-UA release by Hénin et al.233 in that C36UA reintroduces explicit hydrogens to the first methylene group of each lipid tail. The C36-UA chain model is compatible with the AA CHARMM family of FFs and uses the same potential energy function. The earlier parametrization effort by Hénin et al.233 was based on C27r lipids234 and required that simulations be run in the constant area ensemble. As described in section 3.1, C27r did not accurately represent the lateral density and phase behavior of certain lipids in the constant number of molecules, temperature, and pressure (NPT) ensemble.71 In simulations of lipid mixtures or lipids with proteins, for which the average surface area per lipid is unknown, the accuracy of C27-UA was therefore in question. Parameterization of C36-UA began with testing existing UA lipid FFs for alkanes. Because OPLS-UA hydrocarbon parameters235 gave the best agreement with experimental densities and heats of vaporization for heptane and pentadecane, C36-UA adopted OPLS-UA Lennard−Jones parameters. Torsional parameters from C27r-UA were refined to fit the ab initio torsional energy landscapes of the modified lipid tails (with OPLS-UA LJ parameters). Reintroduction of explicit hydrogens at the first methylene group in the acyl chains restored the carbon/hydrogen charge separation on this group and placed the link between the AA and the UA regions at a bond between methylene groups. Moreover, this allowed for explicit representation of the known unique splitting of the SCD for carbon-2 on sn-2.161,162 This required reparameterization of the first methylene group and associated interactions.
allow more water to penetrate, and the palmitoyl tails to exhibit less order. 4.2. OPLS-UA
L
DOI: 10.1021/acs.chemrev.8b00384 Chem. Rev. XXXX, XXX, XXX−XXX
Chemical Reviews
Review
5.1. MARTINI
Resulting structural parameters were evaluated for pure DMPC, DPPC, POPC, and DOPC bilayers. Experimental SA/lip is well matched by C36-UA for DOPC but deviates from experiment by about 1 Å2/lipid for DPPC and DMPC and by 4 Å2/lipid for POPC. However, deuterium order parameters (estimated by reconstructing the deuterium atoms assuming an ideal geometry) of the acyl chains agree well with experiment for POPC. Simulations of the C36-UA DMPC model with three different concentrations of C36-AA cholesterol showed good agreement with X-ray form factors at 3% and 10% cholesterol, although the peak positions were shifted slightly to the right, indicating SA/lip may be too high (like pure DMPC in C36-UA). The aggregation size of dodecylphosphatidylcholine (DPC) micelles was also compared with experiment. Micelles successfully self-assembled from an initially random configuration of DPC and water. Although the sample size was small (4 micelles in total were evaluated), aggregation numbers of equilibrated micelles agreed well with experimental values (exp.,236 56 ± 5 molecules/micelle; sim., 53 ± 11 molecules/ micelle).
The MARTINI FF was first developed for lipids by Marrink et al.121 in 2004, which can be referred to as MARTINI 1.0. This 4-to-1 mapping of heavy atoms results in interaction sites that are subdivided based on types known as polar, nonpolar, apolar, and charged. Polar sites are used to represent compounds that are highly water soluble. The apolar site is for hydrophobic regions of molecules, while nonpolar sites are for polar/apolar, i.e., glycerol interface region in a lipid. The charged site is for groups that have a net unit charge like the phosphate in a lipid. The MARTINI FF follows the same functional form as AA and UA lipid FFs for the bonded term but only has bond and angle harmonic terms in eq 1. For lipids without ringed structures, there are no dihedral terms used in the FF function. As described above, one of the topics of debate and variations between lipid FFs with AA and UA approaches is how to deal with 1−4 interactions via scaling the LJ or electrostatics or not. For MARTINI, only the bonded 1− 2 interactions lack the LJ potential. Both 1−3 and 1−4 each include the LJ and electrostatic terms. The LJ form follows the functional form in eq 5 and the standard Coulombic potential in eq 1. The chemical-based mapping in the MARTINI 1.0 CG FF allows for easy comparison with various structural and thermodynamic properties (molar volume, internal energy, heats of vaporization, chain order, lipid bilayer area compressibility, temperature, and pressure). However, the time scale for motion is altered because CG FFs result in a smoother energy landscape compared to AA FFs. There lacks a unified conversion of integration time step in CG FFs to the real time, but the standard conversion factor used in MARTINI is a factor of 4,121 so if a simulation is run for 25 μs of CG simulation time then the actual time sampled is closer to 100 μs. As with higher resolution FFs, the first test for accuracy is to look at liquid alkane thermodynamic properties. The density for alkanes varying from butane to eicosane are in excellent agreement with experiment at 300 K and 1 bar. The CG FF density profiles along the interface normal for hexadecane and water agree with an AA model.121 Moreover, the free energy associated with moving a water from the aqueous phase to an oil phase is well matched with experiment for water in hexadecane and also butane in water. Although MD simulations of AA and UA lipid FFs typically use preassembled lipid bilayers due to long time scales for spontaneous aggregation, the MARTINI 1.0 FF can quickly go from a randomly mixed system of DPPC lipids in water to the formation of a lipid bilayer in a few 100 ns.121 The overall SA/ lip and thickness of a DPPC bilayer at 323 K and 1 bar agreed well with AA MD and experiment. Moreover, the elastic properties of this bilayer also agree well with experiment, i.e., bending modulus and area compressibility factor.121 This accuracy and availability was not limited to DPPC bilayers; a range of different lipid acyl chain lengths, inclusion of monounsaturated tails, and PE lipids all resulted in reasonable agreement with experimental data. An updated version of the MARTINI lipid FF was published several years later and is known as MARTINI 2.0.122 Three deficiencies were improved with this new version of the MARTINI FF, i.e., CG lipids have a too negative spontaneous curvature that yields a stronger tendency to form nonlamellar phases, the water model tends to freeze, and CG mapping is too coarse in energy levels to properly represent chemical
5. COARSE-GRAINED LIPID FORCE FIELDS Although UA lipid FFs do offer some speed up compared to AA FFs, the largest speed up is with further reduction in chemical detail. The chemical-specific CG FFs typically use a rough 4-to-1 reduction in heavy atom detail in a POPC lipid that would require 52 UA interaction sites be reduced down to 12 (Figure 4). The reduction in detail also allows longer time steps due to the lack of fast vibrational motions, and thus, time steps of 20−50 fs can be used. These CG FF can easily obtain effective real times of microseconds to milliseconds depending on how many molecules are in this simulation system. Three CG FFs will be discussed in the following subsections: MARTINI,121,122,126,128 Shinoda/DeVane/Klein (SDK),125 and Multiscale-CG from the Voth Group.123,124 In addition to explicit CG representations for lipid species, phenomenological membrane models have been developed to reproduce macroscopic or bulk properties of a bilayer, such as its bending rigidity and its ability to deform and remodel.237 Such representation of the membrane can be particle based, in which different degrees of CG resolution are chosen to reproduce a given macroscopic behavior of the bilayer without modeling a specific lipid species or be further simplified to represent the bilayer as an elastic mesh with no explicit lipid or lipid-like molecules (not the focus here as there is a loss of chemical detail). Two of the most common particle-based models are the Cooke−Deserno238 and the Brannigan− Philips−Brown239 lipids. In these models, the lipid is represented as a chain of flexible beads; again, this results in loss of chemical detail, particularly because the two tails of a typical phospholipid are represented with a single chain. Combined with the use of implicant solvent, these models significantly reduce computational cost. The top-down modeling approach used in building phenomenological models reproduces some experimental data but does not reproduce fine-grained or all-atom observables that are needed to accurately represent the physics of a given lipid molecule that may affect macroscopic behavior of the system.240 Despite some of the drawbacks of this approach, phenomenological models have been widely used in the study of protein− membrane dynamics241−243 and continue to evolve to facilitate the study of large-scale membrane dynamics. M
DOI: 10.1021/acs.chemrev.8b00384 Chem. Rev. XXXX, XXX, XXX−XXX
Chemical Reviews
Review
5.2. SDK (Shinoda−DeVane−Klein) Force Field
specific changes. MARTINI 1.0 has 5 types of LJ pair parameters (ε and σ), while MARTINI 2.0 has 10 possible pairs for LJ parameters. This chemical flexibility allow for greater accuracy in thermodynamics of small molecule hydration and octanol/water partition free energies that is within 1 kbT for many molecules studied.122 The issue of water freezing was resolved in an ad hoc manner by adding antifreeze at a mole fraction of 0.1 to water with minimal influence to bilayer properties. The dodecane/water interfacial tension with this updated MARTINI 2.0 model (50 mN/m) is in excellent agreement with experiment (52 mN/m). The common properties for lipid bilayers maintain a similar level of accuracy in the updated MARTINI 2.0.122 One of the main improvements was in the spontaneous curvature that for DPPC should be close to ∼0 nm−1, but MARTINI 1.0 resulted in from −0.1 to −0.2 nm−1, and MARTINI 2.0 is closer to the AA lipid FF result from −0.02 to −0.05 nm−1. MARTINI 2.0 also presented the first version of the cholesterol for this CG lipid FF and its ringed structure. It accurately represented the condensing effect of cholesterol on saturated-tailed lipids. Additional lipid molecules have included glycolipids,244 sphingolipids,122 and inositol-containing lipids.244,245 This lipid FF is diverse enough to represent most of the lipids that occur naturally, and more details for complex models with this FF are provided in section 8. It also offers the ability to simulate other biomolecules including proteins/peptides,246,247 RNA/DNA,248,249 and sugars.250 Lipid hydration can also be modeled implicitly with a FF that does not have explicit water but maintains the chemical detail of the lipid. The cutely named Dry MARTINI lipid FF represents water with an implicit solvent by fine tuning the lipid parameters to result in proper structure of bilayers and other self-assembled structures.126 For large assemblies, such as liposomes, even CG water can prohibit the ability to simulate such structures. Although the effect of water could be represented in some mean-field approximation like Poisson− Boltzmann or others,251 Dry MARTINI does not use these computationally expensive approaches and instead just adjusted the LJ terms in the lipid model to result in accurate hydrated lipid bilayer properties. The POPC lipid bilayer was used as a test, and Dry MARTINI self-assembled into a bilayer that resulted in nearly identical properties to that of standard MARTINI for SA/lip, area compressibility, bilayer thickness, and bending rigity.126 Comparing 242 different lipids formed from 4 head groups and 18 lipid tails result in strong matching of bilayer thickness and SA/lip. However, in general, the Dry MARTINI model tends to result in short-tailed lipid membranes that are slightly too thin and long-tailed membranes that are too thick (dense). Overall, the phase behavior of domain-forming lipids and the formation of a lipid gel state is well reproduced with Dry MARTINI, albeit 10 K higher than experiment for DPPC.126 Since this was tuned more for bilayers, the micelle formation at dilute conditions was not well represented with Dry MARTINI due to the necessity for attractive potentials to aggregate lipids. The main advantage of Dry MARTINI is to go beyond the ∼50 nm limit for liposomes with standard MARTINI. Vesicle fusion and stalk formation can easily be studied using Dry MARTINI. Although some testing has been done with peptides, the accuracy of this for large transmembrane proteins has not been tested.126
The MARTINI lipid FF is an example of a primarily empirical FF that uses experiments and some AA lipid FF results to optimize its model parameters. Another approach is to focus on directly fitting a CG lipid FF to AA MD simulation results. A lipid FF that followed this scheme comes from Shinoda, DeVane, and Klein (SDK).125 The C27r lipid FF104 was used as a target for parametrizing the SDK CG FF. Harmonic terms for the bond stretching and angle bending were used, and those separated by more than two bonds interact by the nonbonded forces (LJ and electrostatics). A different crossterm for 1−3 interactions is used to truncate the nonbonded energies at some distance rs
∑ Ucr(r) = ∑ {Unb(ri(i+ 2)) − Unb(rs)} for ri(i+ 2) < rs cross
cross
(11)
The distance rs in the SDK FF is set to the LJ potential minimum distance for the pair. Equation 11 is used to prevent rare and unphysically bent structures because the harmonic angle constants can be comparable to that of the nonbonded potential depth. Past work by the same authors determined that the typical AA and UA LJ form (12-6) potential did not reproduce structural and interfacial properties well at a CG level.252 Therefore, two LJ potentials were used, i.e., 12-4 potential with pairs involving water and a softer 9-6 potential for other pairs ÅÄÅ ÑÉ l o ÅÅij σij yz12 ij σij yz4 ÑÑÑ o o 3 3 Å Ñ o j z j z o εijÅÅÅjjj zzz − jjj zzz ÑÑÑ o ∑ o Å ÑÑ o j z j z r r 2 Å o nonbonded ÅÅk ij { o k ij { ÑÑÖ o Ç o o o pairs with water ∑ ULJ(rij) = omoo É ÅÄÅ 9 6Ñ o LJ o ij σij yz ÑÑÑÑ ÅÅÅijj σij yzz o 27 o j z o εijÅÅÅjj zz − jjj zzz ÑÑÑÑ o ∑ o o j rij z ÑÑ 4 ÅÅÅjj rij zz o nonbonded o k { ÑÑÖ o ÅÅÇk { o o pairs not with water n (12)
The electrostatic potential followed standard Coulomb’s law in eq 1 and was evaluated without truncation, i.e., using Ewald summation. Water in the SDK FF is modeled with a single LJ site that represents three water molecules.125 This serves as to model hydrodynamic interactions with surfactants. CG water is modeled so that it must be a liquid phase over a temperature range of interest (0−100 °C), and the surface tension and density of water should agree well with experiment. This water model must also agree well with interfacial width and free energies of transfer. Therefore, the alkane− and alkene−water LJ parameters were fit using surface tension and density data, while the ester−water interaction parameters were fit to match experimental hydration free energies.125 These alkane, alkene, and ester parameters were transferred to their respective moieties in the lipid, but the lipid headgroup required fits that were directly based on AA MD simulations with the C27r lipid FF. The SA/lip and distribution functions were used as targets to the CG FF parameter optimization using various lipid bilayers and micelles (those with PC and PE lipid head groups with saturated and monounsaturated tails) in the Lα state at conditions above their respective main transition temperature.125 This model was not designed to represent gel structures of lipid bilayers. N
DOI: 10.1021/acs.chemrev.8b00384 Chem. Rev. XXXX, XXX, XXX−XXX
Chemical Reviews
Review
when it was first developed. Recent publications describe the modifications that allow the method to account for and reproduce dynamical data, such as time correlations, for small molecules using virtual or fictitious sites.258,259 The MS-CG method uses force matching (FM) to build an empirical FF for complex biological molecules and optimizes it to reproduce a potential of mean force (PMF) between CG sites of an AA MD simulation. The method systematically derives CG potential parameters from the atomic-level interactions to describe short-range interactions between CG sites using a variational minimization procedure.123 The resulting CG model uses the PMF as a “potential energy function” that will, in principle, reproduce the equilibrium probability distribution of CG sites consistent with the underlying AA model.254 A mapping operator is used to reduce the resolution of the AA system and render its CG configuration. Typically, the mapping operator uses the center of mass of the atoms grouped in a CG site because the resulting interaction between CG sites models the original AA simulation. However, when a single CG site represents associating atoms of different masses it is suggested to select the center of geometry as the metric. A mixture of both representations (center of mass/geometry) is allowed in the MS-CG approach.123 Latest developments in this methodology have explored the use of the center of charge as the mapping operator to include electrostatic properties from the underlying AA system into the CG model;260 however, this has not yet been applied to lipid bilayers. To ensure consistency between the CG and the AA models, the MS-CG method defines a CG coordinate (R1, ..., RN) and corresponding momentum (P1, ..., PN) as a linear combination of the coordinates and momenta of the subset of atoms the CG site represents, where no atom can be defined twice in the CG sites.254 Additionally, the probability distribution of the positions and momenta of the CG model must be equal to the equilibrium distribution function of those in the AA model. The CG forces must be directly proportional to the equilibrium averages in the atomistic simulation in either the NVT or the NPT ensemble; in the latter case the volume in the CG model must be also the same as in the AA model.254,256 FM is carried out using the following equation (eq 1 in ref 123)
The goals of the SDK model were to achieve results similar to that of a detailed AA lipid FF with minimal fits to experimental data and to be able to represent various selfassembled lipid and surfactant structures, e.g., micelles, bicelles, bilayers, and liposomes.125 Dodecylphosphatidylcholine (DPC) was used as a target for parametrizing the SDK FF to match with micelle AA MD simulations. The probability density of moieties in the DPC molecule in a micelle match well with AA MD. Moreover, the radial distribution functions for headgroup pairs also matches with AA MD. Bilayers simulations were used to optimize headgroup−headgroup, headgroup−water, and headgroup−lipid tail parameters for PC and PE lipids (DMPC, DPPC, POPC, and POPE). This required hundreds of trial CG MD simulations using the NPT ensemble to match structural and interfacial properties of the bilayers with AA MD. The resulting SDK parameter set has excellent agreement with experimental SA/lip and bilayer thickness. Moreover, the density profiles of the lipid groups and segmental order parameters agree well for MD simulations with SDK and AA. MD simulations with SDK also agree well with various properties that were not used in the parametrization scheme. The elastic properties of bilayers for area compressibility and bending moduli each agree favorably within experimental and simulation error.125 As with the MARTINI CG FF, the SDK FF results in a self-assemble bilayer structure within 150 ns starting with 932 DMPC lipids. Monolayer surface pressures are in excellent agreement with experiment.125 Vesicles form spontaneously with the SDK FF as well as bilayer budding. Overall, the SDK FF offers an alternative to the MARTINI CG FF but is currently limited to lipid and surfactant molecules. There has been some development of a compatible protein FF253 but not to the level of the MARTINI CG FF. 5.3. Multiscale-CG with Force Matching
While the SDK FF is based on AA MD simulations and some experimental data fitting for the FF parametrization, the Multiscale-CG (MS-CG) approach developed at the beginning of the millennium by Izvekov and Voth123,124 relies entirely on data extracted from an AA MD simulation of the system of interest. This method is part of the so-called “bottom-up” approach and has been under constant development; its implementation is extensively described in a series of papers.254−256 The method was originally envisioned to provide a systematic procedure to develop and optimize CG potentials to reproduce and predict properties of interest in a system, given that at the time CG FF parametrization was done to match average structural or thermodynamic properties but did not take into account the underlying atomic forces of the system.123 The first system modeled with the MS-CG method was a lipid bilayer of pure DMPC with a rigid TIP3P model for water molecules;123 the method was later modified to allow the study of membranes at longer time and length scales, further reducing the system’s resolution by the removal of water molecules, which could allow the sampling of membrane selfassembly, lipid domain formation, and membrane fusion.257 The method was initially developed in the canonical ensemble (constant number of molecules, volume, and temperature: NVT) and was later extended to the isothermal−isobaric ensemble (NPT), which is most commonly used in the simulation of lipid bilayers. This method reproduced structural properties of the modeled system but not dynamical properties
ji
qαβ zy zzn zz αil , βjl = Fαrefil , rα2il , βjl z { α = 1, ..., K , i = 1, ..., Nα , l = 1, ..., L
∑ ∑ jjjjj−f (rαil ,βjl , {rαβ ,κ}, {fαβ ,κ }, {f αβ″ ,κ }) − K
Nβ
β=1
j=1
k
(13)
Fref αil
where the reference force is that of the AA system and the fitting parameters are fαβ,κ, f″αβ,κ, and qαβ. The first term on the left-hand side of the equation models the short-range forces and the second term the Coulomb portion; the subscript αil identifies the ith atom of kind α in the Ith configuration. Given a large enough L (atomic configurations sampled in the AA MD trajectory), the equation can be solved for the above parameters. To ensure enough sampling for the FM procedure, forces need to be frequently sampled over an equilibrated AA MD trajectory (e.g., every 0.1−1 ps).123,257 A detailed description of the FM scheme is described in ref 261. When modeling bilayers, MS-CG-derived potentials are a sum of basis functions that account for bonded, angular, dihedral, and nonbonded interactions entirely derived from the O
DOI: 10.1021/acs.chemrev.8b00384 Chem. Rev. XXXX, XXX, XXX−XXX
Chemical Reviews
Review
(VCG) lipid model that capture the dynamics of the lipid− solvent interface. Another simulation method that involves coarse graining is dissipative particle dynamics, a method that bridges microscopic and macroscopic phenomena of simple and complex fluids.275 Particles represent groups of atoms or fluid regions. Internal degrees of freedom are integrated out and represented with pairwise forces. Among other uses, DPD has been used to study cell membrane rupture in the presence of a magnetic nanoparticle276 and the thermodynamics of cholesterol mixing with DMPC bilayers.277
atomistic potential function and the mapping operator with an additive constant.254 It takes the form U (R N ) = −kBT ln z(R N ) + (constant)
(14)
where z(RN) = ∫ drn exp(− u(rn)/kBT)δ(MNR (rn) − RN). The force field is defined by the gradients of the CG potential FI (RN ) = − ×
∏ J(≠ I )
kT ∂U (R N ) = BN ∂R I z( R )
δ(MRN (r n) − R N )
∫ dr n exp(−u(r n)/kBT )
i yz ∂ jjj z δ jj∑ cIi ri − R Izzz j zz ∂R I j i ϵ0I k {
6. POLARIZABLE FORCE FIELDS AND HYBRID MODELS The pairwise additive (constant partial charge) assumption of classical FFs can be a poor approximation, especially when a molecule is exposed to differing environments with varying dielectrics. AA FFs can also include virtual sites like those in the TIP4P water models278−281 or Drude138,282,283 and those to represent lone pairs,283 but for lipids these nonatomic sites have only been implemented with the Drude FFs. The most common initial approach to represent polarization in AA FFs incorporated fluctuating charges (FQs). In the CHARMM community, Patel and Brooks used a FQ model to develop an initial FF for proteins130,284 and alkanes.131 The extension to a FQ lipid FF started with more refinement of the electrostatic and LJ parameters of liquid alkanes to match the experimental polarizability, enthalpy of vaporization, and density for hexane.132 This was then developed for PC lipid bilayers (DMPC and DPPC) and compared to the C27r lipid FF. The FQ FF used the TIP4P-FQ for water and resulted in a desired increase in penetration of water into the center of the bilayer. Agreement with NMR SCD was comparable to C27r, but MD simulations required time steps of 0.5 fs, which is smaller than that typically used with C27r (2 fs). Due to the computational cost for FQ models, the CHARMM community, including Drs. Benoit Roux, Alexander MacKerrell, Jr., and other collaborators, decided to use the Drude oscillator approach to polarization. As its name suggests, the CHARMM Drude FF treats atomic polarizability in a classical manner using the Drude oscillator model described at the end of section 2. In the CHARMM implementation,285 all heavy atoms have an attached Drude particle with a mass of 0.4 amu, and additional energy terms account for the added degrees of freedom (eq 2). In the initial parametrization effort by Anizimov et al.,286 which included nucleic acid components, reference values for atomic polarizabilities (α) were based on Miller’s atomic hybrid polarizability values.287 Determination of αi can be reduced to the determination of the partial charges of Drude particles, qD,i, which were assigned with restrained fitting to quantum mechanical electrostatic potentials (ESP). Specifically, qD,i were chosen to reproduce QM results for a series of perturbed ESP maps, each one representing the electronic response of a given molecule in the presence of a background charge placed at different chemically relevant positions. QM calculations of the ESPs were performed on MP2-optimized geometries using the B3LYP hybrid functional288−290 and the correlation-consistent double-ζ Dunning cc-pVDZ and aug-cc-pVDZ basis sets. 291 Drude was subsequently parametrized to include alkanes,137 alcohols,292 aromatic compounds,293 ethers,294,295 nitrogen-containing heteroaromatic compounds,296 lipids,127,136 proteins,297 and DNA.298
(15)
where u(rn) is the interaction potential for the AA system and n MRN (r n) = ∑i = 1 cIi ri the linear mapping operator.254 Given the MS-CG method only computes short-range forces, it is highly efficient compared to the atomistic system of the corresponding size, especially when simulating larger systems.123 Since the MS-CG model for any given system relies entirely on the atomic interactions reproduced in the AA MD simulation of the system, the accuracy of the CG potential is directly related to the accuracy of the FF used for the AA simulation. Furthermore, the CG FF parameters are optimized to reproduce the conformational changes that were sampled during the AA simulation. To ensure accuracy of the CG potential in the region of unsampled configuration space, the MS-CG variational calculation needs to be corrected, which can be done using a postprocessing or fixed part approach.254,255,262 The postprocessing approach is an extrapolation of the calculated potential, while the fixed part adds a trial function to the basis set of functions used in the derivation of the CG potential that is already large and positive in the unsampled region. The MS-CG approach has been extensively applied to the study of small molecules like alcohols and short polymers255,262 and small peptides, especially to address protein folding.263,264 Protein and protein−membrane systems have also been modeled using this approach, mostly helical proteins,265,266 and the issue of transferability has also been addressed for these systems.267,268 Newest developments in the methodology provide a way to incorporate experimental data into the CG model using an experiment-directed simulation to bias the underlying AA MD trajectory prior to mapping the CG sites.269 With respect to lipid bilayers, the MS-CG approach reproduces reliable structural data in terms of radial distribution functions, interspecies distances, and density profiles. The first lipid species to be studied were pure DMPC,123 a DPPC/DPPE mixture,124 a DMPC/cholesterol mixture, and a DOPC/DOPE mixture.257,270,271 A hybrid approach that included the MS-CG method was later applied to derive a highly CG model for DLPC, DOPC, and a mixed bilayer of DOPC/DOPS.272 The derivation of a mixed AA-CG model is also possible, in which the atom-CG and CG-CG interactions are mapped and parametrized using the MS-CG approach273 and allows for two scales of detail in a single MD simulation. Finally, the latest expansion of this approach is to incorporate virtual sites to model the solvation of lipid headgroups in a membrane, a key step in membrane selfassembly and lipid sorting.274 Systematic iteration between the MS-CG and the relative entropy minimization (REM) methods enables the construction of a virtual-site CG P
DOI: 10.1021/acs.chemrev.8b00384 Chem. Rev. XXXX, XXX, XXX−XXX
Chemical Reviews
Review
A Thole damping constant299 was implemented to explicitly include dipole−dipole interactions for atoms within three bonds.137 In additive FFs, interactions for 1−2 and 1−3 pairs are typically subsumed by the bonded potential energy function and excluded from the electrostatic energy. While electrostatic interactions of Drude oscillators with core charges are excluded for 1−2 and 1−3 atom pairs, Coulomb interactions between Drude oscillators (i.e., dipole−dipole interactions) corresponding to 1−2 and 1−3 atom pairs are included but shielded by a damping function. The magnitude of this shielding is governed by the Thole damping constant, which can now be calculated from parameter assignments for individual atoms.300 In addition to inclusion of dipole−dipole interactions, the Drude FF has a second inherent advantage over C36: It uses the polarizable SWM4-NDP water model, optimized for negatively charged Drude particles.301 Relative to TIP3P used with C36, SWM4-NDP shows better agreement with experimental properties of bulk water including vaporization enthalpy, translational diffusion, shear viscosity, and surface tension at the air/water interface. However, it should be noted that SWM4-NDP was optimized near biological temperature and can result in reduced accuracy (water density and likely other properties) outside 290−310 K.283 Until recently, the Drude FF was parametrized for only a single lipid: the saturated lipid DPPC.127 Parameters were first obtained for model compounds including alkanes, dimethylphosphate, tetramethylammonium, and various esters. Partial charges and atomic polarizabilities were fit as previously described. LJ parameters were selected to reproduce both QM gas phase interaction energies and liquid molecular volumes. Bonded term force constants were optimized against QM calculations of the vibrational spectra. Lastly, dihedral parameters were optimized against one-dimensional relaxed potential energy scans as well as rotamer energies to balance the local and global QM target data. The MP2/6-31G(d) level was used to optimize gas phase geometries for neutral molecules and cations and the MP2/6-31+G* level for anions. Minimum energy rotamer configurations of small molecules were identified with CHARMM and optimized as described above. Single-point energies of resulting configurations were performed at the MP2/aug-cc-pVTZ level.291 The resulting model underestimated SA/lip of a DPPC bilayer in the NPT ensemble, yielding 60 ± 2 Å2 at 323 K302 compared to the experimental value of 63.0 Å2,303 which is reproduced by C36.304 Additionally, the area compressibility modulus was high, indicating the bilayer was too stiff. Lipid self-diffusion was slower than in C36 by about a factor of 2/3. This is an improvement; when periodic boundary conditions are considered, estimates of the self-diffusion coefficient of C36 lipids are too high by about a factor of 3.305 Deuterium order parameters for carbon atoms at sites G3S and G2 along the polar headgroup were substantially underestimated. Otherwise, form factors, electron density profiles, and deuterium order parameters of the hydrocarbon tails agreed well with experimental values. A recent parametrization effort by Li et al.136 improved agreement with structural data for a DPPC bilayer and expanded the Drude FF, adding seven lipid types: DMPC, DLPC, POPC, DOPC, DPPE, POPE, and DOPE. This list includes lipids with monounsaturated tails and PE headgroups. The authors began by adjusting parameters for seven torsions located in the polar head and glycerol backbone region to bring
G3S and G2 deuterium order parameters of DPPC in closer agreement with experiment than their first effort.127 This was achieved, and the SA/lip of DPPC was also improved. However, the new FF still underestimates SA/lip for all PC lipids (except DMPC) by an average of 2.3 Å2/lipid. Most notably, the surface area of DOPC, which was simulated at three different temperatures, is underestimated by 5 Å2/lipid on average. Additionally, bilayer area compressibility moduli are substantially overestimated, and lipid self-diffusion coefficients are lower than experimental values. While CG lipid FFs do not explicitly treat polarizability, the CG MARTINI FF introduced a polarizable water model that is compatible with the lipid FF.128 As noted in section 5, one of the issues with MARTINI water is that it freezes well above the normal freezing temperature and antifreeze is added to prevent water from freezing. Moreover, the simple water site that represents four waters by a LJ term cannot represent the natural response to global and local electrostatic fields and polarization effects. Therefore, a polarizable form of water was developed (MARTINI 2.0P) to be used with lipids and other macromolecules.128 In this model, the water particle still represents four water molecules but also contains one positively and one negatively charged particle bonded to the central LJ particle. These charged particles have a mass of 24 amu each with the remaining mass on the LJ particle to total the mass of four waters (72 amu).128 The bonded charge particles have a fixed distance from the LJ particle but are allowed to have a harmonic angle term for the angle formed with the LJ particle at the center. This results in a water model with five adjustable parameters, i.e., the charge, distance to LJ particle, resting angle, harmonic angle force constant, and atom type of the LJ particle. This is somewhat analogous to the Drude oscillator approach discussed above.127 The charge terms are excluded within a particle, and thus, this change only influences interparticle interactions. The optimized polarizable water model for MARTINI results in a density that is higher than experimental measurements and a dielectric constant that is lower.128 The temperature dependence of these two properties is too strong with 4% error in density at 300 K and 5% error for the dielectric constant. The freezing point of water is found to be between 280 and 285 K, which is too high but better than the nonpolarizable water with 290 ± 5 K.128 The goal of this model was to work well with biomacromolecules; the new water model results in similar structural properties for the DPPC lipid bilayer. MD simulations with MARTIN2.0 result in an electrostatic potential drop across the bilayer to be ∼4.3 V, while MARTINI2.0P results in ∼2.3 V. Although this is still larger than pairwise additive FF71 and experiment at 0.23−0.25 V,306 the reduction in the electrostatic drop is in the correct direction. Moreover, the energy barrier to translocate Na+ and Cl− across the DPPC bilayer is improved with the polarizable model when long-range electrostatics is included with PME.128 Although this does improve some lipid properties, the lipids remain nonpolarizable and thus do not respond electrostatically to their own local environment. Polarizability has also been incorporated into the MS-CG approach (section 5) to model dipole interactions with a dummy particle and harmonic bond, similar to the Drude oscillator, and additional dipole moment mapping to better represent the dielectric properties of certain materials.307,308 Hybrid methods combine multiple simulation scales into the same simulations, such as QM with MD or AA with CG. While Q
DOI: 10.1021/acs.chemrev.8b00384 Chem. Rev. XXXX, XXX, XXX−XXX
Chemical Reviews
Review
Figure 5. SA/lip obtained from simulation are compared with experiment for various FFs.42,75,107,109,110,112,113,120,126,181,188,189,191,206,313−318 Solid and empty squares are PC and PE lipids, respectively. Black diagonal lines represent the 95% confidence intervals of the experimental DHH. Outliers are circled for clarity and described in the text. Data from common saturated and monounsaturated lipids are shown.
SCD, based on published values. The second part of this section will focus on key properties of bilayers (mechanical, electrostatic, and diffusion) that are not general targets for lipid FF development. The accuracy of various FFs will be compared for these properties. The section will conclude with a summary of lipid diversity available in 0commonly used FFs.
typically used to quantify protein dynamics, hybrid quantum mechanical/molecular mechanics (QM/MM) models can also be used with compatible lipid FFs. The QM/MM approach enhances chemical detail with the ability to create/destroy chemical bonds while maintaining computational efficiency in nonreactive areas. Enzymes are commonly treated with this method, which allows the active site of the enzyme to be calculated with QM while the periphery is calculated using MD.309 Likewise, all atom/coarse grain (AA/CG) simulations have been used to simulate protein folding with the accuracy of AA and efficiency of CG.310 Boundary regions are particularly important for these methods, including the treatment of covalent bonds in QM/MM and particle resolution in AA/CG.
7.1. Common Structural Properties
First, we consider three key properties to define lipid bilayer structure (SA/lip, thickness, and SCD) with some brief discussion on phase transition temperatures. The comparison is based on published data from the respective FF parametrization papers and manuscripts that faithfully follow the cutoff schemes with which FFs were developed. Using different cutoffs for evaluating nonbonded interactions in real space, e.g., potential based or hard cutoffs in the CHARMM lipid FFs can lead to inaccuracies in lipid properties because the FF was not tuned to these truncation schemes. Work by Skjevik et al.311 present SA/lip that are ∼10 Å2 too low for DPPC due to the use of hard cutoffs with the C36 FF. SA/lip is the most common structural measure of lipid FF accuracy because it can be reliably estimated from
7. STRENGTHS AND WEAKNESSES OF CHEMICAL-SPECIFIC FORCE FIELDS As described in the above sections, all FFs are validated directly against experiment or fit to FFs that are tested with experiment. This section first summarizes some accuracy measures for lipid FFs and how commonly used FFs compare with common structural metrics, i.e., SA/lip, thickness, and R
DOI: 10.1021/acs.chemrev.8b00384 Chem. Rev. XXXX, XXX, XXX−XXX
Chemical Reviews
Review
Figure 6. DHH obtained from simulation are compared with experiment for various FFs.109,112,113,120,126,181,188,191,313,314,319 Solid and empty squares are PC and PE lipids, respectively. Black diagonal lines represent the 95% confidence intervals of the experimental SA/lip. Outliers are circled for clarity and described in the text. Data from common saturated and monounsaturated lipids are shown.
experimental X-ray and neutron scattering factors.160,312 The SA/lip typically comes from a model fit to the form factors and can depend on the assumptions (mathematical function, typically a set of Gaussians). However, for simplicity in this review we will compare with structural measures that are in real space and provide the reader an understanding of overall bilayer structure. Although early versions of the CHARMM and AMBERbased FFs required a fixed area ensemble, modern FFs removed this restriction by ensuring zero net surface tension along the bilayer normal for single-component bilayers. The comparison of experimental and simulated SA/lip is summarized in Figure 5. The C36, OPLS, Slipids, Lipid14, Berger, G53, and C36-UA FFs each conform to the experimental error for the lipids reported in the literature. However, there are some outliers for the other lipid FFs. The GAFF has several outliers (DMPC, POPE, and DOPC), which is to be expected, as this was an intermediate FF before being updated with Lipid14. The GROMOS 45a7 FF results in a severe underprediction for POPE, suggesting that this FF may not accurately model PE lipids. The MARTINI FF has poor agreement (both the explicit and the implicit water versions) with DLPC, predicting too high of a SA/lip (56.0 and 60.6 Å2 for wet and dry, respectively) when this should be in a gel-like area of 49 Å2. While SA/lip provides information on the lateral structure of the lipid bilayer, details of the orthogonal structure are typically summarized by thickness measures. Three of the most common ways to measure the thickness of the bilayer are head-to-head thickness (DHH), which is a measure of the distance between the maximum profile density of the bilayer (∼distance between phosphates of opposing leaflets), bilayer thickness (DB), which is water density thickness at half-
maximum, and hydrophobic thickness (2DC), which is the thickness of the half-maximum of density for atoms in the hydrophobic acyl chains. A comparison between simulated and experimental DHH is given in Figure 6. For the most part, the AA and UA FFs agree with experimental values with the outlier being POPE for both Lipid14 and GROMOS45a7. Although both dry and wet MARTINI FFs report similar thicknesses for DLPC and POPC, these are too large compared to the experimental values. In general, there is a bias toward thicker membranes relative to experiment as all of the outliers lie above the 95% confidence interval and borderline values tend to lie near the top boundary, as demonstrated with C36. Perhaps this recapitulates historical issues with high alkane density and gelling in initial MD simulations, although on a more subtle level. Whereas SA/lip and DHH provide details on the overall structure of a lipid bilayer, SCD provide molecular-level information on the C−D (or C−H in simulation) angle with respect to the bilayer normal. These routine NMR measures offer excellent metrics for lipid FF accuracy in addition to probing the structure of lipids (typically the acyl chain). Ollila and Pabst76 provide a detailed comparison of many lipid FFs on NMR properties including SCD. The acyl chain order parameters agree well with experiment for UA and AA lipid FFs as these have been fit to torsional scans of alkanes and accurate SA/lip.76 If the SA/lip was too high, this would result in underestimation of SCD due to a lower lateral density. Chain order can be also computed at the CG level, and the MARTINI FFs agree well with experiment.121,122,126 A less common parametrization concern has been the SCD values for the headgroup atoms and interfacial carbon, carbon2 (C2), which is below the carbonyl on the two acyl chains. As S
DOI: 10.1021/acs.chemrev.8b00384 Chem. Rev. XXXX, XXX, XXX−XXX
Chemical Reviews
Review
Figure 7. (a) SA/lip for DLPX as a function of temperature (red, simulation; blue, experiment). (b) Comparison of the experimental and C36 FF predicted SA/lip.109
Decreasing the temperature below the conditions studied by Zhuang et al.108,109 will result in phase changes to more condensed phases (ripple and/or gel) for saturated acyl chains. Although this was not a direct target in the parameter optimization in any FF, many rejected parameter sets that would result in gel structures above the gel (or ripple) transition temperature. This was especially true with the SM lipid FF development for C36.55 Pluhackova et al.314 performed heating simulations at a rate of 1 K/ns to estimate the phase transition starting at 300 K. The authors note that this fast heating would likely overestimate the true transition temperature by ∼10 K, but our recent work on long simulations at various temperatures (500−1000 ns)99 suggests that this rate likely overestimates the transition temperature by potentially ∼30 K. On the basis of our work, the main transition temperature for DPPC is 308−313 K, in excellent agreement with the experiment at 314 K.322 If the overestimate by Pluhackova can be scaled equally for simulations with various FFs, Lipid14 has a similar excellent agreement with experiment, whereas Slipids and GROMOS 54A7323 underpredict the transition temperature by nearly 10 K.314 Although the ripple phase for DPPC is found to be stable over a shorter range than experiment with C36, the ripple phase for DMPC is stable for ∼10 K (in agreement with experiment) and the structure based on 1152 lipids agrees favorably with the structural dimensions for the major and minor arm and wavelength.98,99 Transition temperatures for the MARTINI model for DPPC are 10−20 K lower than experiment but within the range of experimental transition temperatures for DMPC and thus within the resolution accuracy of the CG model.324 The properties of the gel phase (tilted acyl chains, SA/lip, and thickness) agree well with experiment for C36. MD simulations of the MARTINI 2.0 lipid FF showed untilted chain structure,325 but work with replica-exchange simulations to enhance conformational sampling has demonstrated that the MARTINI FF can represent tilted gel structures with DPPC.326
mentioned above, experimentally it is known that the C2 on the sn-2 chain exhibits quadrupole splitting that results in two distinct values.161,162 This is the result of the uniqueness of the C−H bond and not that there are two populations sampled.76,320 As this was one focus of the C36 lipid FF, C36 and C36-UA agree reasonably well with experiment for the sn-2 C2 SCD. The SLipid FF also agrees reasonably with experiment for this carbon, but other FFs either lack reports for the two values of SCD or incorrectly predict minimal variations between the C−H bonds.76 The SCD for the C−H bonds in the lipid headgroup also have been measured with NMR159 and provide good measures for lipid headgroup conformations. As with the C2 carbon, the C36 lipid FF used QM on small model compounds to fit torsional potentials with an aim to best represent the lipid headgroup SCD. C36 and C36-UA resulted in the best agreement with experiment compared to the other UA and AA FFs.321 The agreement is not always within experimental error in terms of the magnitude, but as with C2, the unique values for the C−H bonds per carbon are reproduced well.321 Other AA FFs (Slipids and Lipid14) have at least two of the headgroup carbons in major disagreement with experiment, and similar to worse agreement is seen with the various GROMOS UA lipid FFs.321 The Berger42 and Kukol77 FFs result in near complete disagreement with experiment of the lipid headgroup. Consequently, the P−N vector angle with the bilayer normal is overpredicted with the Berger FF.321 The comparison of lipid membrane structural properties and trends discussed above was based on standard lipids (PC and PE) with saturated or monounsaturated tails at temperatures near room temperature or above the lipid phase transition temperature. However, synthesized membranes and membranes from extremophiles exist in a much wider temperature range, with a wide variety found in nature. Therefore, Zhuang et al.108,109 tested the accuracy of the C36 lipid FF to reproduce experimental structure (SA/lip, thickness, and SCD) for PC lipids from 20 to 60 °C108 and PA, PC, PE, PG and PS lipids with 16 different paired acyl chains (saturated, monounsaturated, mixed acyl, and polyunsaturated).109 The temperature dependence of simulations with the C36 lipid FF agrees well with experiment (Figure 7). Although nearly all of the structural properties of these bilayers were found to be in excellent agreement with experiment, this work did demonstrate a deficiency in the prediction of the structural properties of the PS-containing lipids (the outlier of Figure 7b). This remains as a target to improve the accuracy of the C36 FF.
7.2. Other Important Bilayer Properties
7.2.1. Mechanical Properties. The mechanical properties of lipid bilayers are important to describe the response to lateral or normal stresses to the bilayer and can be used to estimate the spontaneous curvature (c0). The reader is encouraged to look at the article by Venable et al.84 that nicely presents associated equations for calculating these properties from simulation. These mechanical properties of T
DOI: 10.1021/acs.chemrev.8b00384 Chem. Rev. XXXX, XXX, XXX−XXX
Chemical Reviews
Review
isolated bilayer, but long-range electrostatics are included, and this would be more stack-like. In any event, the agreement for C36 on KC with experiment is at best within the error between experimental measures. It should also be noted that these C36 results are based on a lipid director spectrum approach (or orientation analysis) and not based on large-scale undulations due to system size constraints.84 To the authors’ knowledge, KC has not been reported for Slipids, Lipid14, or the CHARMM Drude lipid FF. Using GROMOS Berger lipids,42 KC for DPPC is 4.54 ± 0.93 × 10−20 J,333 in good agreement with nonflicker results for similar lipids (DMPC and 1,2-diarachidonoyl-sn-glycero-3phosphocholine, or DAPC), which have KC of around 5 × 10−20 J.334 The reported MARTINI (1.0 and 2.0) KC for DPPC is 4 ± 2 × 10−20 J121,122 and in better agreement with the nonflicker experiments. Recently, Chaurasia et al.335 have taken a systematic evaluation of KC using the undulation and orientation approach. For DPPC, the undulation analysis resulted in 7.0 × 10−20 J and the orientation analysis 16.3 × 10−20 J. This suggests that the potential disagreement seen between C36 and X-ray and aspiration experiments is the result of using local orientation rather than a global undulation approach. Since the spontaneous curvature (c 0 ) requires K C , comparison can only be made where KC is known. The C36 and Martini FFs have been used to find values of c0 in pure bilayers. With the C36 FF, DPPC and POPG show a large spontaneous radius of curvature (R0 = c0−1) whereas SM55 shows relatively moderate positive curvature and DOPC336 negative curvature. A thorough review by Venable, Brown, and Pastor84 provides details for calculating R0 from simulations of planar bilayers and reports results for 12 lipid types, although it has been noted that discrepancies arise depending on how KC is calculated.336 Overall, C36 overestimates the magnitude R0 by 20−35% relative to experiment.84 MARTINI 2.0 agrees with experimental estimates of c0 for unsaturated lipids DOPC and DOPE, but DPPC yields c0 that is slightly too negative.122 Accurate representation of bending properties remains a target for future FF development. 7.2.2. Monolayer Surface Tensions. One might expect the interfacial tension or surface tension (γ) needed to maintain a lipid monolayer at a given SA/lip to be an easy target to match with experiment if bilayer properties match with experiment. However, this has been a known limitation to the C36 FF as it is currently parametrized.71,72 MD simulations using 12 Å force-based cutoffs with no long-range correction with C36 result in monolayer γ that are too low. For example, at a monolayer coverage of 64 Å2/lipid C36 predicts 26.4 ± 1.0 dyn/cm, whereas experiment is 40.9 dyn/cm.71 Including longrange LJ interactions, the agreement with experiment is within statistical error. However, this result is not pleasing because one should not require different methods to represent longrange LJ when simulating a bilayer versus a monolayer. This suggests there is an inherent dependence on cutoff with the C36 lipid FFs (and other FFs similarly parametrized), and potential solutions to this problem will be discussed in section 9. No values directly comparable to experiment are reported for monolayer γ for Slipids, Lipid14, or GROMOS. The MARTINI FF accurately represents the monolayer γ near the natural SA/lip of a bilayer.337 However, the liquid condensed phase (high surface pressure) is too sharp, suggesting a larger barrier to condense the MARTINI lateral area or tightly interacting lipids. Thus, in the liquid expanded
bilayers were not targets in parametrization and represent predictions. The area compressibility (KA) is important for representing the proper lateral forces in response to cell-penetrating peptides and transmembrane proteins, such as mechanosensitive channel proteins. MD simulations with the C36 lipid FF accurately represent KA, e.g., the 230 ± 20 dyn/cm for DPPC,84 and compare well with experiment (234 dyn/cm).327 Similar agreement with experiment is observed for Slipids and Lipid14. Simulations with the UA lipid FFs show some divergence from experiment. The GROMOS54a7 FF predicts KA that is twice larger than experiment.314 GROMOS 45A3 also reports a high KA for DPPC (460 dyn/cm). The GROMOS lipids introduced by Poger et al.214 yield similar results, overestimating KA for lipids with varying degrees of saturation (DMPC, DOPC, and POPC). For example, a pure DOPC bilayer in this FF has a KA of 389 ± 19 dyn/cm, considerably higher than experimental predictions, which range from 171 ± 89328 to 265 ± 18 dyn/cm.327 GROMOS 43A1S3,204 based on Berger lipids, reports overestimation of KA of a DOPC bilayer with the standard calculation based on area fluctuations but a more accurate prediction (277 ± 10 dyn/ cm)329 when, to account for effects of undulations, KA is estimated using a series of simulations by the rate of change of surface tension with respect to SA/lip. Older variants of the Berger lipid FF208,330 overestimate KA for DMPC331 and DPPC.332 CG MD simulations with the MARTINI 1.0 FF (as well as MARTINI 2.0) agree with experiment for KA. However, this agreement is good for patches containing