Assessing Atomistic and Coarse-Grained Force Fields for Protein

Jul 18, 2008 - ... LinPedro E. M. LopesEdward HarderBenoît RouxAlexander D. MacKerell, Jr. .... Vitaly V. Vostrikov , Benjamin A. Hall , Denise V. Gr...
0 downloads 0 Views 3MB Size
9588

J. Phys. Chem. B 2008, 112, 9588–9602

Assessing Atomistic and Coarse-Grained Force Fields for Protein-Lipid Interactions: the Formidable Challenge of an Ionizable Side Chain in a Membrane Igor Vorobyov, Libo Li, and Toby W. Allen* Department of Chemistry, UniVersity of California, One Shields AVenue, DaVis, California 95616 ReceiVed: December 5, 2007; ReVised Manuscript ReceiVed: February 16, 2008; In Final Form: April 3, 2008

Ionizable amino acid side chains play important roles in membrane protein structure and function, including the activation of voltage-gated ion channels, where it has been previously suggested that charged side chains may move through the hydrocarbon core of the membrane. However, all-atom molecular dynamics simulations have demonstrated large free energy barriers for such lipid-exposed motions. These simulations have also revealed that the membrane will deform due to the presence of a charged side chain, leading to a complex solvation microenvironment for which empirical force fields were not specifically parametrized. We have tested the ability of the all-atom CHARMM, Drude polarizable CHARMM, and a recent implementation of a coarse-grained force field to measure the thermodynamics of arginine-membrane interactions as a function of protonation state. We have employed model systems to attempt to match experimental bulk partitioning and quantum mechanical interactions within the membrane and found that free energy profiles from nonpolarizable and polarizable CHARMM simulations are accurate to within 1-2 kcal/mol. In contrast, the coarse-grained simulations failed to reproduce the same membrane deformations, exhibit interactions that are an order of magnitude too small, and thus, have incorrect free energy profiles. These results illustrate the need for careful parametrization of coarse-grained force fields and demonstrate the utility of atomistic molecular dynamics for providing quantitative thermodynamic and mechanistic analysis of protein-lipid interactions. Introduction Quantitative thermodynamic measures and molecular-level interpretations of protein-lipid interactions are essential to understand the insertion, folding, and function of membrane proteins. Although experiment has provided a wealth of insight into protein structure and function in a membrane bilayer, molecular dynamics (MD) simulation can assist in uncovering some hidden microscopic mechanisms. Computer simulations must, however, reproduce certain minimal empirical model benchmarks before they can be trusted to extrapolate to actual membrane-protein systems. In the case of protein-lipid interactions, there are only a few limited such benchmarks. In this study, we explore how well biomolecular force fields reproduce bulk partitioning free energies as well as microscopic interactions at the protein-lipid interface. We chose to do this for the most challenging of situations, for which the original empirical parametrizations had not envisaged: the modeling of a lipid-exposed ionizable side chain. Ionizable amino acid side chains play essential roles in the structure and function of proteins.1 A particularly interesting role is played by basic side chains in the gating process of voltage-gated ion channels, where several arginine (Arg) side chains, contained in voltage sensor domains, must interact closely with the lipid membrane to sense changes in transmembrane electrostatic potential.2 However, the exact mechanisms remain under contention.3–9 One proposed model, inspired by the first crystallographic structure of a voltage-gated potassium channel, KvAP,10 involved the movement of voltage-sensing domains completely across the hydrophobic core of the lipid bilayer upon membrane depolarization, with some Arg side chains necessarily becoming lipid-exposed.3,4 This conforma* Corresponding author. Phone: (530) 754-5968. Fax: (530) 752-8995. E-mail: [email protected].

tional change has been estimated theoretically to be forbidden by costs of hundreds of kilocalories per mole.11 Whereas subsequent crystal structures are significantly different12,13 and several experiments contradict these large motions,14,15 this model challenged the accepted view that membranes act as barriers to all charged and polar species and led to several investigations, experimental and computational, into the thermodynamics of ionizable side chains in the membrane. Recent biological experiments using the translocon/oligosaccharyl transferase machinery of membrane protein synthesis have attempted to estimate apparent free energies for sequencedependent peptide insertion into the endoplasmic reticulum membrane.16,17 For the Arg side chain, these free energy values are surprisingly low. For example, a value of ∼2.5 kcal/mol was found for Arg in the middle of the hydrophobic segment16 and just 0.5 kcal/mol for the KvAP S4 peptide.18 However, due to the complexity of this biological system, the solvation environment of the Arg remains unknown in those experiments. Moreover, we have questioned the ability of the engineered membrane protein to anchor an Arg-containing hydrophobic segment across a membrane, with demonstrated sliding of the charged side chain to the membrane interface limiting the ability to interpret those results in terms of spatially resolved free energies.19–21 Theoretical estimates of the energy barriers faced by a single Arg residue to cross a lipid bilayer, based on continuum electrostatic models, vary from 11 to 40 kcal/mol (see the Supporting Information and refs 20, 22–24). A wide range of estimates can be obtained with these models owing to ambiguities in the definition of the membrane and protein geometry, assignment of dielectric constants, and estimates for nonpolar and membrane interfacial contributions, but a far more fundamental limitation exists. These models have been influenced

10.1021/jp711492h CCC: $40.75  2008 American Chemical Society Published on Web 07/18/2008

Assessing Force Fields for Protein–Lipid Interactions

J. Phys. Chem. B, Vol. 112, No. 32, 2008 9589

Figure 1. Snapshots of the molecular systems from CHARMM (upper panels) and coarse-grained (lower panels) PMF simulations (central windows, z ) 0) of the transfer across DPPC lipid bilayer for the following Arg side chain or analog models: (A) protonated side chain on the TM helix (ArgH+); (B) methyl guanidinium (MguanH+); (C) neutral Arg side chain on the TM helix (Arg0); and (D) methyl guanidine (Mguan0). The transmembrane polyleucine helix is green. Lipid hydrocarbon chains are grey, H atoms are omitted for clarity. The following atom colors are used: H, white; O, red; N, blue; C, grey; P, orange; K, violet; Cl, light blue. Arg side chain or analog as well as water and lipid phosphate atoms inside the membrane core are shown in the spacefill representation.

by the traditional picture of a biological membrane, which is that of a bilayer arrangement of phospholipid molecules in which their hydrocarbon chains are segregated away from bulk water to form a sheetlike, nonpolar core region while the polar lipid head groups remain hydrated at the interface. Several studies have now shown that the membrane is not always a simple sheetlike structure, but is, instead, a complex dynamical arrangement of molecules that will deform in the presence of a charged species1,20,25–27 (see Figure 1A and B for charged Arg side chain and analog methyl guanidinium, described below). The fact that a charged side chain would remain coordinated by water and lipid head groups, even at the center of the bilayer, tells us that the core of the membrane cannot be modeled by a low dielectric hydrocarbon, even as a rough approximation. Fully atomistic explicit solvent MD simulations are the closest models to reality we have and can reveal some unique features of membranes that are hidden from experiment. We have obtained spatially resolved free energy profiles for Arg side chain transfer across the lipid bilayer1,20 using the all-atom CHARMM force field.28 We used simple Arg analog molecules as well as transmembrane (TM) helix models to determine the free energy as a function of position across a fully hydrated dipalmitoylphosphatidylcholine (DPPC) bilayer. The use of the analog molecule is advantageous because it isolates the side chain directly. It does however, tend to exaggerate energetic variations within the bilayer.1 This study demonstrated that a charged Arg residue would face large free energy barriers of 17 and 21 kcal/mol for helix and analog models, respectively. We also determined that the pKa of Arg (on a host helix) is

fairly insensitive to its position in the membrane and changes by only e4.5 units very close to the bilayer center. This preference for the protonated species can be explained by the local membrane deformations that preferentially stabilize the charged form, whereas the neutral species leads to no such deformations. In fact, the neutral species still experiences a large barrier, of 7-10 kcal/mol, but as a result of simple dehydration as it moves from water to a hydrocarbon environment (see Figure 1C and D, described below).1 How accurate are these free energies and pKa profiles derived from empirically derived biomolecular force fields? The situation depicted in Figure 1A and B of a deformable membrane is not one that was imagined when the empirical force fields were originally parametrized. In most biomolecular force fields, the parameters have been developed on the basis of a limited number of gas-phase and condensed-phase properties of small model compounds,29 usually in a homogeneous environment (e.g., gas phase, neat liquids, or aqueous solution). These properties include gas-phase geometries, vibrational frequencies, torsional energy profiles, water interaction energies, molecular volumes, heats of vaporization and sublimation, and hydration free energies.29 But how do we know that the interactions of the side chain with different solvation microenvironments are properly accounted for? We therefore turn to available experimental data and quantum mechanical (QM) calculations that may serve as benchmark tests for the MD force fields. Our observations of membrane deformability tell us that experimental measures of partitioning from bulk water to a bulk nonpolar solvent, such as cyclohexane (cHex30) or n-octanol,31,32

9590 J. Phys. Chem. B, Vol. 112, No. 32, 2008 are not likely to directly correspond to the situation presented by a lipid membrane for a charged side chain. Bulk solvents can, however, provide useful benchmarks for testing MD force field parametrizations. In particular, we shall investigate the solvation free energies of Arg in water and cHex, because these represent extreme case solvation microenvironments. However, because the cost of moving a charged species from water to hydrocarbon is so great (as we shall show in this study), only the neutral species partitions into the nonpolar phase. In fact, this is assumed in the analysis of experimental data, despite adjusting for the equilibrium distribution of charged and neutral species in the aqueous phase.30,33 We are therefore forced to explore indirect experimental and quantum mechanical measures of charged side chain solvation in cHex to compare to our empirical models. Partitioning into the commonly used membrane mimetic n-octanol, often used as a measure of hydrophobicity,34–36 suffers a different limitation owing to its ability to hydrogen bond and the high mole fraction of water (0.20-0.29)36,37 in saturated wet octanol. In this case, partitioning free energies for ionizable side chains of just 1.3-3.0 kcal/mol31,32 indicate that this bulk long-chain alcohol is nearly as hydrophilic as water. While ionizable side chains may remain charged in this medium, the relevance to a lipid membrane is unclear,21,38 and it will not be considered as a suitable benchmark for our membrane studies. In addition to testing solvation in water and hydrocarbon, it is important to check how well the CHARMM force field reproduces the detailed interactions that occur in the membrane deformations. It is especially evident that Arg-lipid headgroup coordination was not a factor during force field development. Interaction energies of Arg side chain analogs with water and lipid phosphate moieties from the ensemble of MD configurations will be tested against QM calculations using methods similar to recent Arg complexation studies.39–41 Furthermore, most MD force fields do not explicitly take into account electronic polarizability, and this can lead to errors, especially for nonpolar solvents. We therefore also simulate with a classical Drude oscillator model42,43 to correctly account for the dielectric response of bulk solvents and the hydrocarbon core of the membrane. Given the challenges of sampling protein-lipid interactions, especially as systems become larger and more complex, it is always tempting to consider more computationally efficient approaches, such as coarse grained (CG) MD simulations. For instance, the CG force field of Marrink et al.44 has been widely used for semiquantitative descriptions of lipid assemblies. It provides, on average, 4-to-1 mapping of non-hydrogen atoms to one interaction center and a gain up to 3-4 orders of magnitude in the computational efficiency while still maintaining good agreement with experimental structural and dynamic properties of lipid assemblies.44 Recently, this model was adapted for simulation of proteins and used to study the problem of insertion of the KvAP S4 helix or entire voltage sensor into model membranes.45 In this model, equal preference to adopt membrane-spanning or interfacially bound orientations was observed, suggesting little penalty for exposing Arg side chains to lipid hydrocarbon, in stark contrast to fully atomistic MD results. Here, we put this CG model to the test by recalculating free energy profiles with the same techniques as used for fully atomistic MD simulations and providing comparison of solvation free energies to available experimental data. We will show that all-atom MD simulations with the CHARMM force field28 provide kilocalorie-per-mole-order accuracy for the thermodynamics of lipid-exposed Arg side

Vorobyov et al. chains. In comparison, the CG model tested here does not provide a correct physical representation of Arg side chain interactions with bulk solvents or lipid membranes. Computational Methods Molecular Systems. In the accompanying manuscript, we focused on the thermodynamics of Arg solvation and protonation in a membrane environment, which entailed calculation of free energy profiles for charged and neutral forms of this side chain attached to a model TM helix (ArgH+ and Arg0, respectively), as well as the methyl guanidinium ion and neutral methyl guanidine analog molecules (MguanH+ and Mguan0, respectively).1 The methyl analog is smaller than the actual Arg side chain but is expected to give similar results based on experimental hydration free energies.46 In this current study, we also consider the full side chain analog, propyl guanidine (neutral Pguan0 or charged PguanH+), to allow further comparisons to simulation and experimental studies. The TM helix model used was an 80-residue polyleucine R-helix with a central charged (ArgH+) or neutral (Arg0) side chain, spanning ∼120 Å in length.20 Arg side chain analogs MguanH+ and Mguan0 were used in all-atom and CG simulations, whereas PguanH+ and Pguan0 were utilized only in CG calculations. The lipid bilayer was constructed from 48 DPPC molecules with the peptide or the analog placed along the z axis, parallel to the membrane normal. DPPC was chosen because it is a reasonable model for a biological membrane and is the best-studied computationally.47–50 The systems were solvated with ∼0.5 M aqueous solution of KCl to provide good sampling of ionic baths and neutralize the positive charge of the protonated Arg side chain or analog. Fully atomistic systems consisted of 31 592 or 31 593 atoms for ArgH+ or Arg0, respectively (including 7895-7896 water molecules, 72 K+, and 73 or 72 Cl- ions), and 12 852 or 12 853 atoms for MguanH+ or Mguan0 side chain analogs, respectively (2186-2187 water molecules, 20 K+, and 21 or 20 Cl- ions). Other details for fully atomistic simulations are provided in the accompanying paper.1 The CG systems were constructed by mapping equilibrated systems from fully atomistic simulations and consisted of 2639 CG particles for the helix system (including 1757-1758 CG water particles and the same number of CG hydrated K + and Cl- ions as in all-atom systems) and 1103-1104 particles for the analog-containing systems (with 485-486 CG water particles). In both all-atom and CG simulations, a planar 5.0 kcal/mol/Å2 constraint was applied to the bilayer center of mass (COM) to prevent drift in the z direction, with no effect on the potential of mean force (PMF). In all ArgH+ or Arg0 MD runs, cylindrical potential functions of 5.0 kcal/mol/Å2 were applied to the COM of each residue backbone to keep the peptide vertical and prevent tilting. Dihedral constraints of 0.030 kcal/ mol/deg2 were applied to the peptide backbone in all-atom simulations,1,20 whereas harmonic constraints of 2.4 kcal/mol/ Å2 between nonbonded CG protein backbone CR particles45 were utilized to maintain R-helical conformation of that peptide in the aqueous solution (a requirement of that model). For solvation free energy calculations using all-atom force fields, the side chain analogs were simulated in cubic boxes filled with 200 water or 128 cHex molecules with the solute molecule held in the center by a harmonic constraint (0.5 kcal/ mol/Å2). In corresponding CG simulations, the condensed phase was modeled by a box of 200 polar (water-like) or apolar (hydrocarbon-like)-type particles with a CG side chain analog, constrained in the center as in all-atom calculations.

Assessing Force Fields for Protein–Lipid Interactions

Figure 2. Clusters of MguanH+ (panel A) or Mguan0 (panel B) with dimethyl phosphate anion and first-shell solvation water molecules, extracted from ArgH+ and Arg0 simulations, used for calculations of interaction energies.

To calculate gas-phase interaction energies using both the CHARMM force field and QM methods, coordinates of the Arg side chain along with the closest lipid dimethyl phosphate anion (DMPA) moiety and all water molecules in the first solvation shell of both Arg and DMPA (distances of e3 Å) were extracted from all-atom PMF simulation trajectories. Around 100 atomic coordinate sets per window were used, spanning several nanoseconds of simulation time. The protonated or neutral Arg side chain was converted to MguanH+ or Mguan0, respectively. Clusters containing the most typical number of water molecules surrounding both the Arg side chain analog and DMPA moiety (6 for charged and 7 for neutral Arg, due to greater solvation of DMPA for the latter) were used for consistency (see Figure 2). We selected 10 clusters for each pair of adjacent windows from PMF simulations with interaction energies closest to the average values from CHARMM calculations. Empirical Force Fields. Empirical force field calculations were carried out using the program CHARMM.51 Standard CHARMM PARAM27 empirical force field,28,52–54 TIP3P water model55 and recently developed neutral Arg side chain parameters56 have been used. The neutral side chain was deprotonated at the secondary () N atom, previously shown to be the most stable tautomer.57 To investigate the role of electronic polarizability, we employed a classical Drude oscillator model in which the polarizability was introduced by adding auxiliary charged particles attached to polarizable atoms via a harmonic spring.42,43 The positions of these so-called Drude particles were adjusted self-consistently in the electric field of other atoms or propagated in MD simulations via extended Lagrangian formalism through the assignment of small mass (0.1 amu) and keeping them at low temperature (1 K) using a separate thermostat.42 For the purpose of this study, Drude particles were added only to nonhydrogen atoms of solvent molecules (e.g., water or cHex) or

J. Phys. Chem. B, Vol. 112, No. 32, 2008 9591 lipid hydrocarbon tails. The recently developed SWM4-NDP polarizable water model58 as well as alkane59(for lipids) and cHex60 electrostatic parameters were used for these calculations. In the case of polarizable membrane simulations, Drudes were added to the lipid hydrocarbon chains only to account for a deficiency explained in the Results Section. The CG model developed for lipids by Marrink et al.44 and extended to proteins by Bond and Sansom45 was ported into the CHARMM program and used in this study. In this model, one CG particle typically corresponds to four non-hydrogen atoms and comprises four different types: polar (P), apolar (C), mixed polar/apolar (N), and charged (Q) with the latter two divided into four subtypes, depending on a hydrogen bonding potential of a CG particle: none (0), donor (d), acceptor (a), or both (da).44 The potential energy function for this model consists of a bond harmonic term, angle periodic cosine term, LennardJones (LJ) term for van der Waals (vdW) interactions, and Coulomb term for electrostatics. LJ interactions are computed between all CG particles not directly bonded to each other, whereas electrostatic interactions are calculated only between those carrying nonzero net charge and are shielded by using the relative dielectric constant of 2044 instead of 1, typically used in all-atom force fields. The same atom types and corresponding parameters as in the original studies44,45 were used for all CG particles. The recent modification of this CG model, known as the MARTINI force field,61 was not tested in this work, since the original implementation was used for the previous studies45 examined here. Equilibrated systems from all-atom CHARMM simulations were transformed into CG systems using the same mapping procedure that was used in previous studies.44,45 For the neutral Arg side chain, not considered in the previous CG study,45 the charged particle of Qd type was substituted by a polar neutral particle of Nda type.44 MguanH+ or Mguan0 analogs were modeled using a single Cz CG particle of Qd or Nda type,44 respectively, whereas for PguanH+ or Pguan0 models (equivalent to charged or neutral Arg side chain models, respectively), an additional Cγ CG particle of C type was attached to Cz particle. Potential of Mean Force Calculations. A similar methodology was used for computing the PMF from fully atomistic and CG simulations and is described in more detail in the accompanying paper.1 The free energy calculations were carried out by employing umbrella sampling through the use of harmonic biasing “window” functions that hold the z component of the COM of the peptide or analog around regular positions with a force constant, K, relative to the COM of the lipid bilayer. Sixty-one equally spaced windows spanning -30 e z e 30 Å were used with K ) 2.5 kcal/mol/Å2, corresponding to root mean square (rms) fluctuations of 0.5 Å (with larger force constants applied in some all-atom ArgH+ simulations20). In Drude polarizable PMF simulations, force constants of 0.6-10 kcal/ mol/Å2 (with rms fluctuations of 1 or 0.25 Å, respectively) were used for 1 Å spaced windows across -16 e z e 16 Å. The PMFs were obtained by unbiasing the resultant distribution using the weighted histogram analysis method (WHAM).62 All PMF simulations were carried out at 330 K, above the gel-phase transition temperature for a pure DPPC membrane.63 Lateral hexagonal periodic boundary condition (PBC) dimensions were based on the experimental area per lipid of ∼64 Å2 64 and calculated protein cross-sectional area of ∼180 Å2. Simulation boxes had an xy-translational length of 44.4 or 42.6 Å and an average height of ∼190 or ∼80 Å for the helix- and analog-containing systems, respectively. For both nonpolarizable CHARMM and CG runs, a constant normal pressure of 1 atm

9592 J. Phys. Chem. B, Vol. 112, No. 32, 2008 was maintained using the Langevin piston method,65 with the pressure coupling applied in the z direction parallel to the bilayer normal. Drude polarizable PMF runs were performed in NVT ensemble using a Nose-Hoover thermostat and the new velocity Verlet integrator (vv2) following an extended Lagrangian formalism.42 All-atom CHARMM nonpolarizable or Drude polarizable molecular dynamics simulations used 2 and 1 fs time steps, respectively. A much longer 40 fs time step was used in all CG MD simulations, as done previously.44 In nonpolarizable and Drude polarizable CHARMM simulations, 5-10 ns and 3-7 ns per window were used in these PMF runs, with the first 1-4 ns utilized for the system equilibration. The special methodology used to ensure proper sampling of relevant molecular configuration and, thus, converged free energy profiles is described in the accompanying paper.1 Substantially longer simulation times (5 ns of equilibration and 50 ns of production run for each window) were used for less computationally demanding CG simulations. In all-atom simulations, electrostatic interactions were treated using particle-mesh Ewald (PME) summation66 with a coupling parameter 0.34 and sixth-order spline for mesh interpolation. Nonbond pair lists were maintained out to 16 Å, and a real space cutoff of 10 Å was used for the Lennard-Jones term truncated via an atom-based energy shift algorithm. Bonds to H atoms were constrained using the SHAKE algorithm.67 For CG simulations, a slightly different cutoff scheme from that in the original implementation was utilized, since GROMACS68 shift functions69 are not available in CHARMM. The successful use of the alternative nonbond cutoff scheme for this CG model was demonstrated previously.70 The nonbond pair lists were maintained out to 14 Å, and a real space cutoff of 12 Å was used for both electrostatic and LJ terms truncated via an atombased force shift and energy shift, respectively. Satisfactory agreement (within a few percent) of energies and forces computed with the original GROMACS and CHARMM implementation of this model was found for a few representative systems. Free Energy Perturbation Calculations. Solvation free energies of Arg side chain analogs in bulk water and cHex, as well as partitioning free energies between these two solvents, were calculated for both all-atom CHARMM and CG force fields by the free energy perturbation (FEP) approach71,72 using the staged protocol developed by Deng and Roux.73 Solvation free energies were calculated as a difference between free energies in the corresponding solvent and vacuum. water-cHex partitioning free energies were calculated as a difference in cHex and water solvation free energies. Electrostatic and dispersive contributions were computed using the standard linear coupling scheme (with a coupling parameter ranging from 0 to 1 in increments of 0.1), whereas the repulsive term was transformed into a soft-core potential and calculated in multiple stages, as described previously.73 All systems used for condensed-phase FEP calculations were first subjected to initial minimization and then at least 60 ps of NPT MD equilibration for all-atom or 6 ns for CG systems. These simulations were performed at 298 K and 1 atm isotropic pressure using cubic PBC, Nose-Hoover thermostat,74 modified Hoover-Anderson barostat,42,75 and the vv2 integrator42 The average translational lengths were ∼18 or ∼29 Å for all-atom or ∼29 and ∼31 Å for CG water or cHex boxes, respectively. The nonbond interaction truncation scheme applied in FEP simulations was slightly different from that used for PMF runs, since the latter is currently not supported by the existing

Vorobyov et al. CHARMM FEP code and resulted in a spontaneous freezing of the whole system for some CG runs. In fully atomistic simulations, the energy switch was used starting at 10 Å, with the atom-based real space cutoff of 12 Å for LJ interactions and maintaining nonbond pair list computed up to 14 Å. For CG FEP simulations, the energy switch and force shift function were used for LJ and electrostatic interactions, respectively, starting at 7 Å, with the real space cutoff of 10 Å and the nonbond pair list computed up to 16 Å. Since the use of different cutoffs might result in discrepancies in the FEP results, the corresponding corrections were calculated as outlined below. Time steps and other simulation parameters were the same as in PMF runs. Gas-phase, all-atom FEP simulations were performed at 298 K using Langevin dynamics with a friction coefficient of 5.0 ps-1 applied to all atoms and infinite cutoffs used for nonbond interactions. Free energies were corrected for finite LJ cutoffs in condensedphase FEP simulations using average interaction energies between a solute and a solvent from several atomic coordinate sets. Long-range LJ corrections to all-atom solvation free energies were computed this way using long energy-switchbased nonbond cutoffs of 30 Å as a reference state.76 The CG model was parametrized to be used with finite cutoffs, and the gas-phase reference is absent in this case.44 However, both CHARMM and CG free energy estimates adjusted for the same nonbond cutoffs as used in PMF simulations were also calculated and labeled as CHARMM(cutoff). MguanH+ and PguanH+ ion solvation free energies were also corrected for the contribution from the corresponding interfacial potential (air-water, air-cHex, or water-cHex) and, thus, represent so-called “real” solvation free energies.77 Since PBCs were imposed during FEP calculations, corresponding terms had to be added a posteriori and amount to -11.5 kcal/mol for the nonpolarizable (additive) TIP3P and -12.5 kcal/mol for SWM4-NDP water models.77 Unfortunately, available experimental data and theoretical calculations lead to contradictory predictions on the magnitude and even sign of the air-water interfacial potential.78 However, it is important that we obtain reliable estimates of interfacial potentials for the chosen solvent models in this study. We have calculated the air–cHex and water–cHex interfacial potentials to be –10.8 and 1.2 kcal/mol, respectively, for non-polarizable CHARMM and –4.8 and 8.3 kcal/mol, respectively, for Drude polarizable CHARMM, using a standard methodology.58,77 Systems of 424 cHex and/or 2704 water molecules were placed in rectangular boxes and simulated for ∼1 ns using PBC, NVT (air–cHex), or NPT (water–cHex) ensembles at 298 K, using the same simulation parameters as used in umbrella sampling runs. Interfacial potential profiles were obtained by integrating charge density distributions twice using Poisson’s equation. In MguanH+ and PguanH+ condensedphase FEP simulations, PME net charge corrections were applied for energy and pressure.79 Quantum Mechanical Calculations. QM ab initio calculations using Gaussian0380 have been used to check interaction energies between Arg side chains and the water and lipid head groups within the membrane. QM interactions energies were calculated at the MP2 level of theory81,82 with the 6-31+G(d) basis set83,84 using a supermolecule approach; that is, as a difference between single-point energies of the cluster and constituent monomers. The basis set superposition error (BSSE) resulting from a different basis set sizes used for the monomer and cluster energy calculations was corrected using the counterpoise correction (CPC) technique.85 100% CPC was found to provide reasonable agreement with calculations using the

Assessing Force Fields for Protein–Lipid Interactions

J. Phys. Chem. B, Vol. 112, No. 32, 2008 9593

Figure 3. Mean first shell solvation numbers from PMF umbrella sampling trajectories using all-atom CHARMM (dashed lines) or coarse-grained (solid) force fields. Solvation numbers are the average number of atomic species within the first shells for water (red), lipid phosphate groups (blue), and lipid carbonyl groups (green) relative to the guanidine or guanidinium Cz atom (see Figure 2). Solvation numbers were calculated for (A) ArgH+, (B) MguanH+, (C) Arg0, and (D) Mguan0. Solvation numbers for CG propyl guanidinium (PguanH+) and guanidine (Pguan0) models are shown in panels B and D as dotted curves.

Figure 4. Average interaction energies from PMF umbrella sampling trajectories using all-atom CHARMM (dashed lines) or coarse-grained (solid lines) force fields. Interaction energies were calculated between Arg side chain or analog and water molecules (red), lipid head groups (blue), K+ and Cl- ions (cyan). Interaction energies were calculated for (A) ArgH+, (B) MguanH+, (C) Arg0, and (D) Mguan0. Interaction energies for coarsegrained PguanH+ and Pguan0 models are shown on panels B and D as dotted curves. On the top panels, interaction energies from both CHARMM and CG simulations are shown, whereas on the bottom panels, only those from CG simulations are shown on a smaller scale.

larger basis set aug-cc-pVDZ86 at the MP2 or hybrid DFT PBE0 level87 with CPC interaction energies within