Potential of Mean Force and pKa Profile Calculation for a Lipid

Jul 18, 2008 - The issue of ionizable protein side chains interacting with lipid membranes has been the focus of much attention since the proposal of ...
0 downloads 11 Views 3MB Size
9574

J. Phys. Chem. B 2008, 112, 9574–9587

Potential of Mean Force and pKa Profile Calculation for a Lipid Membrane-Exposed Arginine Side Chain Libo Li, Igor Vorobyov, and Toby W. Allen* Department of Chemistry, UniVersity of California, One Shields AVenue, DaVis, California 95616 ReceiVed: December 5, 2007; ReVised Manuscript ReceiVed: February 16, 2008

The issue of ionizable protein side chains interacting with lipid membranes has been the focus of much attention since the proposal of the paddle model of voltage-gated ion channels, which suggested multiple arginine (Arg) side chains may move through the hydrocarbon core of a lipid membrane. Recent cell biology experiments have also been interpreted to suggest that these side chains would face only small free energy penalties to cross membranes, challenging a long-standing view in membrane biophysics. Here, we employ side chain analog and transmembrane helix models to determine the free energy of an Arg side chain, as a function of protonation state, across a membrane. We observe high free energy barriers for both the charged and neutral states that would prohibit lipid-exposed movement. The mechanisms for charged and neutral Arg transport are, however, very different, with the neutral state experiencing simple dehydration, whereas the charged state experiences a complex mechanism involving connections to the bilayer interfaces that deform the local membrane structure. We employ special methods to ensure sampling of these interfacial connections and decompose the free energy to shed light on the mechanisms. These deformations are found to preferentially stabilize the protonated form, such that the Arg side chain remains almost exclusively charged inside the membrane, with a pKa shift of e4.5 units. In contrast, the analog models are found to exaggerate the variations in energetics across the membrane and have larger pKa shifts. These results have implications for models of voltage gated ion channels, suggesting that although Arg side chains are ideally suited for carrying charge, the thermodynamics dictate that they must remain sequestered from the lipid bilayer environment. 1. Introduction Ionizable protein side chains play important roles in a wide array of biological phenomena, including protein folding, protein-protein2 and protein-nucleic acid3 interactions, enzyme activity,4 pH activation of proteins,5 nuclear localization,6 peptide-mediated membrane fusion,7 antimicrobial peptide activity,8 and various proton9 and ion10 transport mechanisms. The large hydration free energies of these side chains and their strong electrostatic interactions lead them to be controlling factors in the folding of proteins into three-dimensional structures, each seeking charge-pairing interactions or a favorable polar solvation environment. Membrane proteins, however, are subject to an additional restraint on the protein fold due to the lipids that envelop the transmembrane (TM) domains and provides a seemingly hostile environment for polar and charged groups. It is anticipated that ionizable groups must remain sequestered from the lipid bilayer hydrocarbon core to avoid suffering large dehydration free energy penalties.11 These penalties can be truly great: the hydration free energy of charged arginine (Arg), for example, has been estimated to be -62 kcal/mol, with the associated energetic cost of moving this side chain in its charged state from water to hydrocarbon being around 30 kcal/mol.12 Recently, however, cell biology experiments that utilize the translocon/oligosaccharyl transferase machinery of membrane protein synthesis have been interpreted to suggest small barriers of ∼2.5 kcal/mol for inserting an Arg side chain at the center of a TM helix.13,14 Moreover, it has been suggested by a recent model of voltage-gated ion channel activity that multiple Arg side chains may move completely across the core of a membrane * Corresponding author. Phone: (530) 754-5968. Fax: (530) 752-8995. E-mail: [email protected]

under the driving force of a membrane potential, with some necessarily exposed to lipid hydrocarbon,15,16 being theoretically forbidden by energetic costs of hundreds of kilocalories per mole.17 The suggestion that it may be possible for Arg side chains to become exposed to the lipid hydrocarbon environment opposes a long-held biophysical view and presents us with an unexpected theoretical problem: What are the energetic costs of burying ionizable side chains inside membranes? We have previously carried out fully atomistic molecular dynamics (MD) simulations to reveal that a charged Arg side chain attached to TM R-helix does experience a large free energy barrier of ∼17 kcal/mol when moving across the membrane.18 The discrepancy between MD simulation and the translocon-based results has been explained in terms of the inability to interpret those experiments spatially, given the complexities of the biological system and the difficulty in holding Arg side chains inside the membrane by standard secondary structure restraints.18 In contrast, the MD models to be explored here, while being idealized views of real membrane proteins, can provide spatially resolved free energies and, as we show in the accompanying manuscript, are accurate to the order of 1 kcal/mol.12 In contrast to the conventional continuum picture,12,19 we have observed the lipid bilayer to be able to deform around a charged moiety such that water and lipid head groups would penetrate into the membrane core in an attempt to stabilize the Arg, as reported previously18,20,21 and seen in Figure 1A and B. Thus, the energetics associated with Arg movement in the membrane are not simply associated with dehydration in a hydrocarbon environment but also involve strong interactions with water molecules and zwitterionic lipid head groups drawn into the

10.1021/jp7114912 CCC: $40.75  2008 American Chemical Society Published on Web 07/18/2008

PMF and pKa Profile for Lipid-Exposed Arg

J. Phys. Chem. B, Vol. 112, No. 32, 2008 9575

Figure 1. Equilibrated MD systems for the charged Arg side chain (ArgH+, A), charged methyl guanidinium analog (MguanH+, B), neutral Arg side chain (Arg0, C), and neutral methyl guanidine analog (Mguan0, D) following equilibration with the protein/analog molecule held near the center of the bilayer. The Arg side chain/analog molecule is drawn as blue N, gray C, and white H atoms; the TM helix, as a green ribbon; and the hydrated DPPC bilayer with gray C atoms, orange P, red O, and blue N atoms, and red/white water molecules. Lipid phosphate and water atoms that have been pulled into the bilayer core (z e 13Å) are drawn as balls.

membrane core, as well as indirect energy costs arising from bilayer deformations. These observations demonstrate that to obtain the correct thermodynamics of Arg movement in membranes requires a fully atomistic MD model. Here, we will decompose the energetics of Arg movement in the membrane into components emerging from different direct interactions as well as indirect membrane strain contributions. Arg presents a special case because of its high pKa value of 12.0-13.7 in aqueous solution22–24 which indicates that as little as one side chain in a million may be neutral at pH 7. However, the charge state of ionizable side chains in proteins is pHdependent and is subject to the three-dimensional fold of the protein, influenced by the solvation environment of the side chain and its accessibility to other ionizable groups.25,26 Recent experiments have shown that nonpolar environments within membrane proteins can lead to large (neutralizing) pKa reductions for basic side chains.27,28 However, these experiments did not examine the protein periphery where a side chain may be exposed to lipid. Can we expect the same results for a lipidexposed side chain? Our observations of membrane deformations leading to extensive side chain hydration and lipid headgroup coordination, which likely would not occur in a hydrophobic pocket of a protein, make it difficult to predict Arg’s behavior in the membrane. Here, we investigate the thermodynamics of Arg, in both its charged and neutral forms, by employing well-defined TM helix and analog models within the framework of a fully atomistic MD simulation. Although various situations arise in membrane proteins in which arginine may be stabilized by other polar and charged protein side chains (including in voltage-gated ion channels15,16,29), we sought especially to examine the case of a lipid-exposed Arg: a situation presented by the original paddle model15,16 and proposed by the translocon experiment papers.13,14 We want to know, in the absence of other interactions, can a lipid-exposed Arg side chain remain charged in the membrane, and what are the corresponding thermodynamics for each species, protonated or deprotonated? 2. Theory and Methods 2.1. Molecular Dynamics Models. We have constructed two models for exploring side chain thermodynamics within the membrane. The first of those is a simple analog molecule, methyl guanidine (Mguan), which along with guanidine has been used extensively in experimental studies as a model of the Arg

side chain.25,26,30–33 This is even simpler than propyl guanidine (Pguan), the full side chain analog, but was used because experimental partitioning experiments suggest similar free energies for the two.34 Our calculations have since shown, however, that the difference in their solvation free energies in a hydrocarbon solvent, relative to water, is around 2 kcal/mol.12 It is important to consider the effect of a TM segment when studying the movement of an ionizable side chain across a membrane, and for this purpose, we employ a uniformly hydrophobic poly-Leu helix with a central Arg side chain.18 We recognize that the chosen TM helix is just one model helix and that a variety of sequences could exist in a full membrane protein, just as different 3D structures and orientations could occur. Different sequences with other polar and even charged side chains may also change the extent of interfacial deformations, yet a uniformly hydrophobic TM helix is the obvious first choice for this study. Below, we devise a procedure for free energy and pKa profile calculation that relies on having a well-defined equilibrium path from bulk water to the membrane. The uniformly hydrophobic helix provides a translationally invariant model protein (in the absence of the Arg) that permits tractable simulations and provides a straightforward interpretation of results. We also remark that although an isolated helix may tilt in a membrane,35 this need not be the case in a full membrane protein. Allowing such tilting, therefore, would represent a specific feature of the model, and thus, one could argue a vertical helix is just as suitable, yet is highly advantageous computationally. We therefore maintain our helix in a vertical orientation with harmonic constraints (see below). For both analog and helix models, we have simulated the charged/protonated and the neutral/deprotonated side chains. The R-helical TM helix was initially built in ideal conformation (φ ) -57, Ψ) -47°) and placed at a position of interest, and a new hydrated membrane was then built around it by methods that are extensions of early techniques.36 Helices were made long enough (80 residues, ∼120 Å in length) to avoid terminal-interface interaction for translations of up to (30 Å. The C and N termini were chosen to be neutral, employing acetyl and ethanolamine groups, respectively. The R-helical conformation was approximately maintained by dihedral constraints of 0.030 kcal/mol/deg2, corresponding to a root-meansquare fluctuation of ∼5° in each dihedral angle, to avoid sampling unfolding/folding of the hydrophobic R-helix in bulk

9576 J. Phys. Chem. B, Vol. 112, No. 32, 2008

Li et al.

water, which is not expected to significantly affect the relative energetics for trans-membrane positions (see the accompanying manuscript12 for further discussions). A cylindrical constraint of 5 kcal/mol/Å2 was applied to the centers of mass (COM) of each residue backbone to keep the peptide vertical. The Arg side chain was placed at the 40th residue, or near the center of the helix. A dipalmitoylphosphatidylcholine (DPPC) lipid bilayer was chosen because it is a reasonable model for a biological membrane and it is well-studied computationally.37–39 Membrane patches of 48 DPPC lipid molecules were employed in simulations. The system was long enough to ensure that the protein does not pass through the periodic boundaries and also that interaction between image proteins was negligible. Hexagonal periodic boundaries with an xy-translation length of ∼44 Å and average height of ∼190 Å were used. Pressure coupling was applied in the z direction, parallel to the membrane normal. Lateral periodic box dimensions were based on an experimental area/lipid of 64 Å230 and a protein cross-sectional area of ∼180 Å2. To neutralize the net charge on the protein/analog and to sample the ionic baths thoroughly, a 0.5 M KCl solution was used, which corresponds to 7896 or 7895 water molecules and 72 K+ and 72 or 73 Cl- ions for the neutral or charged Arg side chains, respectively. The system consisted of 31 593 or 31 592 atoms in total for the neutral and charged side chain, respectively. Sample coordinates from a simulation with the Arg near the center of the bilayer are shown in Figure 1A and C for the charged (ArgH+) and neutral (Arg0) species, respectively (see ref 18 to see the full span of the system). The analog systems were smaller than the TM helix systems because of the absence of the long helix: their lateral box dimension is 42.6 Å and average length is ∼80 Å. The systems contained 12 853 or 12 852 atoms: 2187 or 2186 water molecules and 20 K+ and 20 or 21 Cl- ions for neutral and charged Arg side chain analogs, respectively. Sample coordinates from a simulation with the charged (MguanH+) and neutral (Mguan0) analog molecules are shown in Figure 1B and D, respectively. Simulations were carried out using the free energy protocols described below for the four systems: ArgH+, Arg0, MguanH+, and Mguan0. For each system, 61 simulations were carried out, requiring simulation times of between 5 and 10 ns each. Simulations were performed with the program CHARMM,40 version 32b2, using the PARAM27 force field41–43 and TIP3P water model.44 The electrostatics were calculated using the particle-mesh Ewald (PME) method,45 and bonds to H atoms were maintained with the SHAKE algorithm.46 Vertical pressure and temperature were kept constant at 1 atm and 330K, respectively, by the Langevin piston and Nose-Hoover methods.47 The temperature, 330 K, was above the gel-phase transition temperature for a pure DPPC membrane.48 New CHARMM parameters for the neutral Arg side chain49 developed by a standard protocol41 were utilized. These parameters, as well as those for the charged side chain, have been shown in the accompanying manuscript to reproduce experimental solvation energetics.12 2.2. Free Energy Profile Calculation. The Potential of Mean Force (PMF), W, is the thermodynamic work along a coordinate z, relative to some reference value z′

W(z) ) -

∫z′z dζ 〈F(ζ)〉

(1)

where F(ζ) is the instantaneous force acting on the chosen coordinate, ζ. This may be re-expressed in terms of a ratio of configurational integrals,

W(z) ) ∆G(z′ f z) ) -kT ln

〈F(z)〉 ) 〈F(z′)〉 -kT ln

∫ d3N-1R e-U(z,R)⁄kT (2) ∫ d3N-1R e-U(z′,R)⁄kT

where R represents the remaining 3N - 1 coordinates, k is the Boltzmann constant, and T is the temperature. It is this free energy profile that we aim to capture for the Arg-membrane translocation process. To assist in computing this configurational integral, we employ umbrella sampling,50 whereby we add a series of biasing potentials, wi(z) ) (1/2)Ki(z - zi)2, to the system potential, U(z, R), for each “window”, i. To begin, 21 independent starting configurations with the peptide COM, relative to the membrane COM, spanning the membrane were created and equilibrated for 2-5 ns. Simulations were replicated to create 61 separate simulations in 1 Å intervals spanning -30 e z e 30 Å, which were each simulated for an additional 4-7 ns to compute well-converged PMFs (see Supporting Information, Figures S1 and S3). The biasing harmonic force constant used was 2.5 kcal/mol/Å2 for both charged and neutral Arg side chain models (although larger constants have been applied for some simulations18), resulting in good overlap of umbrella sampling windows. The distributions were unbiased with the weighted histogram analysis method (WHAM)51 to get the optimal estimate for the unbiased density, 〈F(z)〉 and the PMF via eq 2. The equilibrated systems in Figure 1 A and B demonstrate that when a charged Arg or analog is held inside the hydrocarbon core of the membrane, the membrane will deform such that water molecules and lipid head groups enter to coordinate the side chain. In contrast, it can be seen from Figure 1 C and D that no such interfacial connections occur for the neutral side chain or its analog. The interfacial connections for the charged species create a difficulty in sampling equilibrium distributions of configurations that must be resolved. During ArgH+ and MguanH+ simulations, once a connection to one interface was established, it was maintained throughout the entire simulation; only in the initial stages when water penetration and no lipid lead group penetration had occurred did the analog change the interface to which it was connected. Furthermore, connections to two faces concurrently were not observed once the deformation was established (although was observed in special simulations described below). Close to the membrane center, it should be equally probable to connect to one interface or the other. However, connection to just one interface throughout the entire simulation will lead to an artificial excess average force and subsequent rise in the free energy profile. To sample the forming and breaking of connections to either interface will likely require unfeasible multi-microsecond simulations for each umbrella sampling window.18 We remark, however, that computing a “flipping rate” is troublesome because it would require calculation of the activation barrier and diffusion coefficient, having sampled the dynamics of interfacial connection breaking and forming that occurs on very long timescales. In the case of the side chain, ArgH+, we have previously overcome this sampling issue in an indirect fashion by biasing the orientation of the side chain.18 This simply led to a fairly even distribution of oppositely connected configurations and eliminated the net force. However, the problem of obtaining an equilibrium distribution of interfacial connections, inherent in both ArgH+ and MguanH+ simulations, requires special attention.

PMF and pKa Profile for Lipid-Exposed Arg

J. Phys. Chem. B, Vol. 112, No. 32, 2008 9577

The probability of having a connection to a “far” interface (beyond the center of the membrane) depends on the relative free energy for such a state as compared to the free energy for connecting to just the “near” interface. Figure 3A (to be described in the Results and Discussion Section) shows a series of configurations for near and far connected interfaces for MguanH+ obtained from the procedure described below. Given the large forces the membrane imposes on a charged group deep inside the membrane,18 it is evident that the free energy for the far connection must be higher than that for the near connection away from the bilayer center. Thus, the balance of near and far interfacial connections will be significant only when the charged side chain or analog is close to the bilayer center where some cancelation of opposite forces will occur. When it is far from the center, the only low free energy state is that of a near interface-connected analog for which an average force will exist that gives rise to a free energy barrier. We consider only the side chain analog MguanH+ to correct for sampling of interfacial connections, given that we have adequately overcome the problems with the actual side chain previously.18 The system’s free energy as a function of MguanH+ position, z, is proportional to the logarithm of a configurational integral over all other 3N - 1 degrees of freedom, R, in the system, as in eq 2. If we were to sample points in only this 3N - 1 dimensional configurational space for where the analog was connected to the near interface, we would obtain the following constrained PMF,

Wnear(z) ) -kT ln

∫ d3N-1R e-U(z,R)⁄kT

(3a)

near

and if we were to sample only configurations in which the analog was connected to the far interface, the constrained PMF would be

Wfar(z) ) -kT ln

∫ d3N-1R e-U(z,R)⁄kT

(3b)

far

Such constrained phase space integration is possible in practice because these are two distinct possibilities that interchange on very long timescales. Equation 2 is equivalent to

{∫

W(z) ) -kT ln

near

d3N-1R e-U(z,R)⁄kT +

}

∫ d3N-1R e-U(z,R)⁄kT far

(4) because the configurational integrals are additive. Using eqs 3a and 3b, this becomes

W(z) ) -kT ln{e-Wnear(z)⁄kT + e-Wfar(z)⁄kT}

(5)

being the correct overall PMF for the charged analog after independently sampling the two distinct states, near and far. We can interpret this PMF combination rule in terms of the mean forces experienced by the side chain in contact with different interfaces. Differentiating both sides of eq 5 with respect to z, we obtain

〈F(z) 〉 )

e-Wnear(z)⁄kT 〈F(z)〉near + e-Wfar(z)⁄kT 〈F(z)〉far e-Wnear(z)⁄kT + e-Wfar(z)⁄kT

(6)

where the subscript, near or far, on the ensemble average brackets indicates averaging over configurations with near or far connections, respectively. That is, the PMF combination rule is equivalent to a weighted average of the mean force. What this means is that the overall PMF coming from eq 5 automatically accounts for the cancelation of forces near the middle of

the bilayer where the free energies of near and far connected analogs become similar and a smooth maximum in the function (with continuous first derivative passing through zero) should occur. As the analog moves away from the middle, the far connection becomes improbable, and the cancelation of opposite forces is lost, leading to a steep slope in the PMF. What are the range and magnitude of this force canceling due to opposite interfacial connections? Rewriting eq 5 in terms of a correction to the near-only PMF, we obtain

W(z) ) Wnear(z) - kT ln{e-[Wfar(z)-Wnear(z)]⁄kT + 1}

(7)

Clearly, when the analog resides at the middle of the membrane and Wfar ) Wnear, there will be an entropic correction of -kT ln 2 due to doubling the accessible phase space. Plotting the correction term as a function of the variable, Wfar - Wnear, reveals a smooth function with greatest negative value of -0.45 kcal/mol at 330 K that goes to zero smoothly and rapidly. Thus, the correction is small and has a well-defined value. We test this prediction by computing the near and far PMFs separately and then combining. To obtain a PMF that includes only the far connected configurations, we first created a series of starting points for these opposite connected configurations after attempting a Hamiltonian replica exchange protocol52 on umbrella sampling windows. In this case, our replicas differ by their window biasing functions. Every 10-20 ps of umbrella sampling simulation, a Monte Carlo exchange was considered for neighboring windows in an attempt to exchange configurations and allow different interfacial connections to occur in neighboring windows near the membrane center. Replicas were exchanged according to the Metropolis transition probability, in which several random pairs of windows were examined each trial. This procedure was not very successful, with only a few exchanges between neighboring windows that had opposite connections, owing to the large opposite forces acting on the protein in those adjacent windows. However, ∼5 ns of this method for a range of windows was enough to provide a few additional starting points where the analog was connected to the far interface. These starting points were then replicated for neighboring windows to obtain equilibrated configurations for far connected analogs up to 5 Å from the center, which were then simulated as a separate set of umbrella sampling windows for 3.5 ns. Equation 5 was then used to obtain the overall result for the MguanH+ PMF. 2.3. Enthalpic, Entropic, and Free Energy Contributions. We analyze energetics that contribute to the Arg-membrane translocation process. To do this, we compute energies from PME-based electrostatics and Lennard-Jones interactions using the same nonbonded interaction scheme as used in MD simulations, as described above. The energy of the system can be broken down into direct and indirect interactions.

U(z, R) ) UD(z, R) + UI(R)

(8)

The direct energy, UD(z, R), leads to interaction forces acting on the reaction coordinate, which are a function of both the protein position (relative to the membrane), z, and the remaining coordinates, R. The indirect energy, UI(R), however, involves interactions among the other atoms and is a function of their coordinates, R, only. We have computed the average direct energy profile,

∫ d3N-1R UD(z, R) e-[U (z,R)+U (R)]⁄kT ∫ d3N-1R e-[U (z,R)+U (R)]⁄kT D

〈UD(z, R) 〉 (z) )

D

I

(9)

I

and its components. Direct interactions between the protein (TM

9578 J. Phys. Chem. B, Vol. 112, No. 32, 2008

Li et al.

helix or analog molecule) and lipid or water were computed. These direct energies were calculated each 0.2 ps of the simulation as the average of Ulipid–protein ) Ulipid+protein - Ulipid - Uprotein, for example, valid for an additive (nonpolarizable) force field. A similar expression was used for protein–water interactions (with the effect of ions being small and were simply added to water energies as an “electrolyte” term). The deformations in local membrane structure seen in Figure 1A and B suggest that indirect or “strain” energies (not involving direct interactions with the protein) will also be very important. Although the indirect energy for a single configuration of the system is independent of z, its ensemble average value does depend on z,

∫ d3N-1R UI(R) e-[U (z,R)+U (R)]⁄kT ∫ d3N-1R e-[U (z,R)+U (R)]⁄kT D

〈UI(R) 〉 (z) )

D

I

(10)

I

and thus, we have computed the mean indirect energy profile as well as that for different contributions to the indirect energy. In particular, we calculated lipid-lipid, lipid-water, and water-water energies. The relevant energies were Ulipid, Uwater, and the interaction energy Ulipid-water) Ulipid+water - Ulipid Uwater. We note that part of the lipid-water interactions is actually involved in the direct interaction because the chosen reaction coordinate was, in fact, the z component of the distance between the protein and the membrane (see below for explanation). A contribution from the protein itself, Uprotein, would also constitute an indirect energy, but it has been shown to be small for these models. Combined, these energetic contributions can tell us much about the causes of the resulting PMF shape and magnitude. However, they are just energies and are lacking important entropic terms. We have attempted to estimate the entropic contributions to the PMF for both the charged and neutral species. The usual practice for extracting the entropy is to analyze temperature variations in free energy.53 This would require a significant undertaking, given the challenges in converging the free energy at just at a single temperature. We instead sought to calculate the average energy and to derive, in an approximate way, the entropy. We have determined that the pressure-volume work contribution to the enthalpy, ∆H(z), is negligible. If we subtract this enthalpy from the total PMF, W(z) ) ∆G(z′ f z), we obtain the entropic contribution, -T∆S(z′ f z) ) ∆G(z′ f z) - ∆H(z′ f z). This procedure is not typically employed due to the large fluctuations in the enthalpy that are transferred to the entropic contribution. We do see significant relative uncertainties in our analysis, but this analysis still reveals some important quantitative observations. Furthermore, we sought to quantify actual free energy contributions to the PMF directly. The relationship between the PMF and the mean force (eq 1) permits a decomposition into components originating from different interactions (R),

WR(z) ) -

∫z′z dζ 〈FR(ζ)〉

(11)

where FR(ζ) is the instantaneous force of the component, R, acting along the chosen coordinate, ζ. This mean force decomposition is meaningful because of the fact that forces acting on the protein are additive, F(z) ) ΣR FR(z), and related to free energy via a simple integration such that W(z) ) ΣR WR(z). Our interest is primarily in how the lipids and water molecules that are pulled into the core of the bilayer contribute to the PMF. We therefore compute contributions from those interactions only. Because our chosen coordinate is the distance,

parallel to the membrane normal, between the protein and lipid bilayer centers of mass, the relevant force is the difference between the instantaneous force acting between component R and the protein and the instantaneous force acting between component R and the lipid bilayer,

∫z′z dζ 〈FR-protein(ζ) - FR-lipids(ζ) 〉 z z ) (- ∫z′ dζ 〈FR-protein(ζ) 〉 - (- ∫z′ dζ 〈FR-lipids(ζ) 〉

WR(z) ) -

(12)

where the mean force between the chosen components and the reference lipid bilayer is typically small (and has little effect on the resulting PMF contributions, as shown in the Supporting Information, Figure S4A). We note that because we consider only contributions to the PMF from molecules in close proximity to the peptide, we can compute instantaneous forces using a large cutoff of 20 Å instead of the PME for the electrostatic contribution. We have tested this approach, with results in the Supporting Information (Figure S4B), demonstrating only small changes with respect to results obtained using the PME approach. We note that although we refer to these direct contributions to the PMF as “direct”, we must recognize that, in fact, indirect strains contribute to each of these via the Boltzmann averaging of the mean force in eq 12,

(

∫ d3N-1R FR(ζ, R) e-[U (ζ,R)+U (R)]⁄kT WR(z) ) - ∫z′ dζ ∫ d3N-1R e-[U (ζ,R)+U (R)]⁄kT z

D

D

I

I

)

(13) where we underline the indirect term that is influencing those results. Moreover, all of those contributions sum to give the total PMF. The question of indirect contributions to the PMFs is a long-standing problem, simply because the PMF arises from the mean forces acting on the reaction coordinate, and indirect interactions do not impart any force for a given instantaneous configuration. A completely different breakdown of the free energy is needed to separate out the direct and indirect contributions. Because the indirect energy consists of a very large number of component terms, its magnitude and fluctuation are very large, hampering efforts to compute contributions via statistical mechanical perturbative approaches. Instead, the answer is likely to come directly from an analysis of membrane coordinates, which is the subject of future exploration. 2.4. pKa Shift Profile Calculation. We have attempted two different approaches to compute the pKa shift profile across the membrane. The usual approach with MD simulation would be to simulate the deprotonation process via free energy perturbation (FEP).55 We seek the free energy change for the process

ArgH+(z) f Arg0(z) + H+(aq)

(14)

at a position inside the membrane (held by a planar harmonic constraint, as in umbrella sampling simulations). This free energy is equal to

∆Gdeprot(z) ) (GArg0(z) + GH+) - GArgH+(z)

(15a)

as can also be written for a position in the bulk, z ) z′

∆Gdeprot(z′) ) (GArg0(z′) + GH+) - GArgH+(z′)

(15b)

The difference between these two free energies leads to a cancelation of the free energy of the unconstrained proton, GH+, and is

PMF and pKa Profile for Lipid-Exposed Arg

J. Phys. Chem. B, Vol. 112, No. 32, 2008 9579

∆∆Gdeprot ) (GArg0(z) - GArg0(z′)) - (GArgH+(z) - GArgH+(z′)) ) (GArg0(z) - GArgH+(z)) - (GArg0(z′) - GArgH+(z′)) ) ∆Gdeprot(z) - ∆Gdeprot(z′)

(16)

Thus, the deprotonation process corresponds simply to a mutational transformation from protonated ArgH+ to neutral Arg0 at two positions. The procedure involves introducing a thermodynamic coupling parameter, λ, varying from 0 to 1, to switch between the protonated U (λ ) 0), and deprotonated U (λ ) 1) potential energy function of the system. The free energy difference, ∆G, can be computed via thermodynamic perturbation.55 To test this approach, we carried out deprotonation calculations in bulk and at the center of the bilayer. We simulated 11 λ values ranging from 0 to 1 concurrently to maximize convergence and then summed the free energy changes for each window. We found that convergence of this procedure is very difficult to achieve owing to the challenge of sampling interfacial connections for partially charged species, as shown in the Supporting Information (Figure S5) and discussed in the Results and Discussion Section. However, a simpler approach was at hand that does not involve sampling turning on and off membrane deformations. We employ a thermodynamic cycle, shown in Figure 8A (this figure is described in detail in the Results and Discussion Section), in which ∆∆G deprot can be seen to equal the difference between the PMFs for neutral and charged Arg across the membrane,

∆∆Gdeprot ) WArg0(z) - WArgH+(z)

(17)

(remembering that the PMF has already been defined relative to the reference position, z′, via eq 2). This change in free energy of deprotonation is related to the pKa shift via

∆pKa ) ∆∆Gdeprot ⁄ 2.3kT

(18)

This approach does not necessitate simulation of partially protonated Arg side chains and, thus, does not suffer from the aforementioned limitations. Converged PMFs for the λ-end points are all that is needed to compute the pKa shift. 3. Results and Discussion 3.1. Membrane Deformations and Side Chain Coordination. The charged states of the side chain or analog inside the bilayer lead to similar local deformations of the membrane where polar water molecules and lipid head groups are pulled into the bilayer (cf., Figure 1A and B). In contrast, no such deformations occurred for the neutral species owing to significantly smaller interaction forces (see below). The formation of local deformations in the membrane due to the presence of charged side chains means that some simulation time was required to converge the solvation environment and interactions of the side chain with water molecules and lipid head groups. Equilibration of each window was judged by approximate convergence of interaction energies for the side chain with its water and lipid environment. This convergence was most challenging at the membrane interface and core. For the charged Arg side chain case, reported previously,18 ∼2 ns of equilibration was needed in the middle of the bilayer, whereas up to 4 ns of equilibration was required for the charged analog. For the neutral Arg systems, interaction energies converged slowly owing to the small forces involved. Figure S1 in the Supporting Information shows time series of interaction energies for two challenging

windows, where interaction energies can be seen to level off within around 4-5 ns. Free energy analysis began after this period. We have previously shown that the sampling of Arg-membrane interactions, using a TM helix model, depended on the rotamers of the charged Arg side chain.18 For the charged species, despite having an inherent preference on the order of 1 kcal/mol for a downward directed rotamer, a fairly uniform distribution of orientations was observed in water, and similar free energies for different rotamers were calculated near the membrane center.18 Figure S2A of the Supporting Information shows the adiabatic rotamer energy map for the neutral Arg side chain, revealing a similar ∼1 kcal/mol preference for this downward directed rotamer. However, the lack of deforming interactions with the membrane suggests that the side chain inside the membrane may have little preference. Figure S2B in the Supporting Information demonstrates that we have sampled a wide distribution of structural isomers in these simulations in different regions of the membrane, even deep inside the hydrophobic core. The lack of preference for one rotamer over another in the membrane core and occurrence of both upward and downward directed Arg side chain orientations in each half of the core are a consequence of much weaker interactions with the membrane for the neutral species. This suggests that the resulting free energy profile will be fairly symmetric about the membrane center. Solvation numbers of Arg side chains and analogs by water molecules as well as lipid phosphate and carbonyl groups are shown on the upper panels of Figure 2. As indicated by Figure 2A and B (upper panels), the charged Arg side chain and its analog are coordinated by at least one lipid headgroup phosphate oxygen and about five water molecules throughout the membrane due to the occurrence of membrane deformations. Although the side-chain ArgH+ and analog MguanH+ have similar coordination inside the core, the analog has increased coordination by lipid head and carbonyl groups at the interface due to the absence of the nearby helix and the greater ability of the analog molecule to reorient. The analog also has greater hydration, by two to three molecules, in bulk water, which is expected to lower the free energy in this bulk reference position and lead to an increased barrier to insert into the membrane. In contrast, the neutral side chain and analog (Figure 2C and D) experience a complete absence of coordination by head groups and a drop to just one to two in the number of hydration waters at the membrane center. This dehydration reaches a limit and remains approximately constant for the innermost ∼10 Å of the core. The interaction energies between Arg side chains or analogs and other components are also shown in Figure 2 (lower panels). Charged Arg interacts strongly with head groups and water throughout the membrane, with typical values of -100 and -50 kcal/mol, respectively. Its analog has similar interaction energies to the side chain in bulk water and in the membrane’s core region, but these are very different at the interface, as expected. The analog experiences far greater variations in interaction energies with water and head groups, changing by up to 100 kcal/mol across the membrane, in contrast to the actual side chain, which are fairly constant in comparison. The interaction energies of neutral species are much weaker. For example, their interactions with the lipid head groups at the interface are typically -10 kcal/mol, compared to -100 kcal/mol for the charged side chain. This explains why no membrane deformations were seen for the neutral species and suggests that its energetics would be dominated by dehydration.

9580 J. Phys. Chem. B, Vol. 112, No. 32, 2008

Li et al.

Figure 2. Solvation analysis. Top panels: mean first-shell solvation numbers of ArgH+ (A), MguanH+ (B), Arg0 (C), and Mguan0 (D) for water (red), phosphate (blue), and lipid carbonyl (green) are shown. Solvation numbers are the average number of atomic species within the first solvation shells defined by radii 4.85, 4.55, and 5.00 Å for water oxygen, phosphate oxygen, and carbonyl oxygen, respectively, relative to the guanidine carbon of the side chain or analog. Bottom panels: Average side chain interaction energies in kilocalories per mole for components: water (red); lipid head groups, including carbonyl moieties (blue) and K+ and Cl- ions (cyan). For the neutral species, we show a reduced energy range as insets.

3.2. Free Energy Profiles. Umbrella sampling simulations, following equilibration, were continued for 4-5 ns per window until convergence and symmetrization were achieved to within ∼1 kcal/mol, as shown in Figure S3 of the Supporting Information. The charged side chain has been shown to have very little asymmetry following equilibration and special sampling methods for obtaining an equilibrium distribution of side chain conformations and interfacial connections.18 The PMF for the charged analog also results from special sampling methods described in the Theory and Methods Section and is also well-converged with little asymmetry. The neutral Arg PMF is approximately symmetric because the side chain has no obvious preferred rotameric states and little interfacial interaction when inside the membrane (a small remaining asymmetry was removed by symmetrizing the biased density prior to WHAM calculation). The neutral analog PMF was symmetric to within 1 kcal/mol. The charged analog MguanH+ tends to pull water and lipid head groups from the closest membrane-water interface, similar to the charged side chain simulations. If MguanH+ is located close to the membrane center, probabilities of contacting opposite interfaces should be similar. However, once such a connection is made, large forces acting on water and head groups prevent breaking and reforming of interfacial connections in multi-ns simulations (although Figure 3A has revealed that a continuous pore can be formed across the membrane if the farconnected analog is moved several angstrom beyond the center of the bilayer). This indicates that on either side of the bilayer center, the charged analog will experience an excess force toward the closest interface, which would result in an artificial increase in the barrier height and an incorrect inverted “V”shaped PMF. Therefore, separate simulations with the connections established to either “near” or “far” interfaces were performed for a range of central windows, with samples shown in Figure 3A, which were then combined via the configurational integral summation procedure of eq 5. Figure 3B and C illustrate samples of the near (green) and far (blue) PMFs over ranges of (5 and (1Å from the membrane center, respectively. The convergence of the near-PMF has been shown in the Supporting Information; the far-PMF was found to be converged to within 0.1 kcal/mol on average (not shown). As expected, the far PMF (equal in free energy to the near PMF only at the membrane center) climbs quickly as the analog moves away from the

middle of the bilayer. The final result for the actual PMF is shown as a red curve in this figure. One can immediately see that the contribution of the far connected configurations to the overall PMF is small (e0.45 kcal/mol, as predicted) and confined to have an effect only very close to the center of the membrane (within (0.5 Å only). Interestingly, because the effect is confined to close to the membrane center, we have found that approximately the same final PMF can be obtained with a simple WHAM unbiasing of a symmetrized biased density from the near-only PMF where the central window has a wide enough spread to provide both near and far connections within 1 Å from the center. However, without this investigation, the range of the effect would not have been known. We note that prior to carrying out this procedure, the PMF obtained was slightly different.56 This was because connection to a “far” interface 1Å from the membrane center led to artificial force cancellation and a barrier that was 2 kcal/mol lower, emphasizing the need for care in accounting for the correct equilibrium distribution of interfacial connections. The final PMFs for all simulations are compared in Figure 4. The PMF of ArgH+ (solid red curve) appears very different in shape from a typical dielectric barrier for a model membrane12 and, in fact, climbs more steeply as the charge enters deeper into the membrane. We have previously explained this in terms of the greater forces of attraction to the interface as the side chain moves deeper and causes greater local deformations in membrane structure (see also the sections to follow). The barrier reaches as high as 17 kcal/mol close to the bilayer center. The PMF for the charged analog, MguanH+ (dashed red curve), also exhibits a large barrier and similar sharpness. However, the charged analog experiences interfacial binding arising from greater coordination with water and lipid head groups in the absence of the host helix. The barrier is fairly similar to that found for the actual side chain, but it is higher by ∼4 kcal/mol, which is a consequence of several contributing factors. First, the model helix influences side chain solvation, which almost eliminates interfacial binding and also reduces the level of hydration in bulk, raising the bulk reference free energy and thus decreasing the barrier to enter the membrane. At the same time, the helix is expected to raise the overall dielectric response of the membrane core (as well as to provide favorable electrostatic interaction with the peptide backbone dipoles), lowering the barrier further. In addition, the side chain is larger

PMF and pKa Profile for Lipid-Exposed Arg

J. Phys. Chem. B, Vol. 112, No. 32, 2008 9581

Figure 4. PMFs for ArgH+ (solid red); MguanH+ (dashed red); Arg0 (solid blue); and Mguan0 (dashed blue). The dotted black curve is the PMF for the ArgH+ side chain obtained with a different choice of reaction coordinate, now in terms of the position of the side chain guanidinium carbon, zCz, relative to the COM of the bilayer.

Figure 3. (A) Sample configurations where the charged side chain analog is connected to the near and far interfaces, taken from simulations to sample the equilibrium distribution of membrane deformations (see text). (B) Far connection PMF (blue line), near connection PMF (green line) for MguanH+. (C) Same as B but plotted over a smaller range with the actual PMF shown as a red line.

(and more nonpolar) than the analog, being better represented by propyl rather than methyl guanidinium. The difference in water–hydrocarbon partitioning free energies between the propyl and methyl analogs has been calculated to be ∼-2 kcal/mol,12 favoring the actual side chain inside the membrane and again lowering the barrier for the helix relative to the analog, but in opposition to these effects is a contribution from the model helix itself. The translating helix also introduces an ∼4 kcal/mol cost for excluding a nonpolar Leu side chain from the membrane

when Arg enters that is expected to raise the barrier for the side chain on the helix21 (with membrane deformations also influencing the free energy to move Leu residues adjacent to Arg from bulk water to the membrane). These factors cancel each other to some extent. However, a more fundamental difference in the helix and analog PMF calculations exists that is related to the choice of reaction coordinates. In the case of the helix model, we have mapped out the free energy as a function of the position of the entire peptide rather than the Arg side chain COM relative to that of the membrane. Therefore, even when the COM of the peptide is near the membrane center, the Arg side chain in the upward or downward directed orientation will attempt to snorkel toward an interface, placing its guanidinium C atom up to (4 Å from the bilayer COM. In the case of the analog, we have mapped out the free energy as a function of the side chain analog molecule COM with respect to the membrane COM directly, thus preventing it from snorkeling toward an interface. The difference in these two cases becomes evident from the analysis of the 2D PMF as a function of protein and relative side chain position presented previously.18 When the peptide is positioned near the maximum in the helix PMF, the side chain obtains an equilibrium distribution of orientations, being mostly down or up, but rarely with the guanidinium group actually at the center of the bilayer. In contrast, the analog molecule is forced to sample all positions across the bilayer center. To better relate the two cases, we have remapped the 2D free energy surface, W(z, s), of ref 18 (where z was the position of the helix COM relative to the membrane and s was the relative side chain guanidinium carbon Cz-helix COM coordinate) to instead report the free energy as a function of side chain position and relative position, W′(zCz, s) ) W(z + s, s), rediscretized on a regular grid in zCz. We then integrated over the relative side chain-helix coordinate, s, via

∫ e-W′(z

W′(zCz) ) -kT ln

Cz,s)kT

ds

(19)

to obtain a new 1D PMF as a function of the position of the side chain and not the position of the peptide COM (though still relative to the membrane COM). The result is shown as a dotted black curve in Figure 4. As expected, the barrier is higher and is not too dissimilar to the analog calculation near the bilayer center. This suggests that the aforementioned solvation effects

9582 J. Phys. Chem. B, Vol. 112, No. 32, 2008

Li et al.

Figure 5. Direct water-protein and lipid-protein (A) and indirect water, lipid, and water-lipid (C) energy contributions for the MguanH+ analog system. B and D are the total direct and indirect components from A and C shown on reduced energy, position ranges, or both. Each component has been symmetrized.

of the helix and related contributions from the nonpolarity of the side chain, analog, and Leu reference roughly cancel, and in the end, the helix and analog calculations are very consistent (at least in the bilayer core). Recently, results for a Pguan analog have been reported by MacCallum et al.21 who used a different model membrane and a different MD force field to also obtain a large barrier consistent with the results reported here. There are, however, differences of several kilocalories per mole between these studies that are likely due to the choice of model and force field, as to be expected. The PMF reported for PguanH+ has very deep binding sites that protrude deep into the core of the membrane, and the barrier is lower by 6 kcal/mol. In particular, the choice of dioleoylphosphatidylcholine lipids in that study, with considerably larger area per lipid, is likely to promote water penetration into the membrane interface (MacCallum, personal communication) that will lead to greater binding near the interfaces. The propyl guanidinium analog is also expected to bind more strongly because it can favorably interact with the interface and also has greater stabilization with its longer nonpolar tail in the lipid hydrocarbon region. In addition, the large carbonyl dipole in the Berger lipid parameters57 used in that study may lead to greater interfacial binding. All of these effects could explain the greater binding and the lower barrier in those studies.

Importantly, however, both studies have consistently observed large barriers to charged Arg analog movement into membranes. The PMF of the neutral side chain (Arg0) is shown as a solid blue curve in Figure 4 and exhibits a surprisingly high barrier of ∼10 kcal/mol. The PMF forms a plateau inside the core instead of climbing to a sharp peak, as charged species do. This plateau roughly matches the region of limited dehydration in Figure 2 and suggests that translocation involves simple dehydration/loss of hydrogen bonds to water and no membrane deformation for the neutral side chain. The neutral analog (Mguan0), shown as a dashed blue curve in Figure 4, is lower by 2-3 kcal/mol, with a barrier of ∼7 kcal/mol, which is roughly consistent with partitioning free energies for this analog.12 This lower barrier for the neutral analog in comparison to the neutral side chain demonstrates that the helix has much less effect on the neutral side chain solvation, and the difference between Arg0 and Mguan0 is dominated by the Leu reference energy (∼4 kcal/mol),12 partially compensated by the more unfavorable (by ∼2 kcal/mol)12 water/hydrocarbon partitioning free energy of Mguan0 with respect to Pguan0. We conclude from these neutral Arg PMFs that although deprotonation does lower the free energy barrier for translocation, it does not provide a “free” passage across the membrane.

PMF and pKa Profile for Lipid-Exposed Arg

Figure 6. Enthalpic (blue), entropic (red), and free energy (black, taken from Figure 4) contribution estimates for MguanH+ (A) and Mguan0 (B) analog simulations. Error bars represent standard errors of mean from the block analysis of the simulation energies.

3.3. Direct and Indirect Energetic Contributions. Energy calculations alone can help to unveil some important contributions to the PMF from components of the membrane system. Figure 5A shows the energetic contributions from direct terms (water-protein and lipid-protein, with ionic interactions included in the water term), for the charged analog MguanH+ as an illustration. The dehydration energy amounts to nearly 70 kcal/mol relative to the bulk. Amazingly, the attractive interaction with lipids counteracts almost all of this loss, appearing almost as a mirror image of the water contribution. The total direct energy is nearly flat compared to those contributions as the analog moves across the membrane. It is not completely flat, however, as shown in Figure 5B with a reduced energy range, revealing that direct enthalpic interactions with solvating water and lipids favor the analog residing at the interface by ∼5 kcal/mol due to a good coordination environment. However, these direct interactions are less strong at the membrane center, by ∼10 kcal/mol, relative to the interface. This reduction in strength of interactions with solvating molecules will contribute to the large barrier in the free energy profile, but still accounts for just a fraction of this barrier. Indirect terms, including water-water, lipid-lipid, and water-lipid interaction energies as a function of the position of the MguanH+ analog, are shown in Figure 5C. The lipid energy rises by ∼60 kcal/mol relative to the bulk when the charged molecule is inside the membrane. This is a significant energetic cost and can be understood in terms of the loss of headgroup - headgroup interactions as lipids are displaced from a perfect bilayer arrangement. At the same time, the water is also perturbed as a result of the side chain movement into the bilayer. In this case, the water energy drops by 20-40 kcal/

J. Phys. Chem. B, Vol. 112, No. 32, 2008 9583 mol near the interface and then increases to 5-15 kcal/mol above the bulk value when the side chain is near the center of the membrane. The interfacial effect can be understood in terms of greater water-water interaction as lipid phosphates become involved in coordinating the side chain. The effect near the bilayer center suggests that water-water interactions become sacrificed as water-Arg and water-headgroup interactions inside the core become stronger. Combined, the lipid and water energies amount to 70 kcal/mol strain energy when MguanH+ is at the membrane center relative to bulk water. However, this is almost completely canceled by the water-lipid interaction, which is ∼70 kcal/mol more negative inside the membrane relative to bulk. This can be understood in terms of the strong interactions between lipid head groups and their coordinating water molecules inside the core of the membrane. The total indirect energy cost is replotted over a smaller range in energy and positions in Figure 5D to highlight its shape as the analog crosses the membrane core. A change of ∼13 kcal/mol implies a large “strain force” is acting across the core that resists deformations extending to the membrane center. Although these are not free energy calculations and are subject to statistical uncertainties, they do reveal that the energetics of membrane perturbations are a balance of very large contributions that do not completely cancel, with the net result helping to explain the rapid climb in the overall PMF. Importantly, the total energy of the system (addition of the black curves in Figure 5A and C, shown as a blue curve in Figure 6A) when the side chain is near the middle of the membrane relative to bulk is only ∼5 kcal/mol according to this energetic analysis, being much smaller than the free energy barrier for MguanH+. What else might be playing an important role? 3.4. Entropic Contributions. There still remain significant changes in the overall free energy profile that have not been accounted for. The total system energy for the Arg side chain analog, MguanH+, as a function of analog position (blue curve of Figure 6A) actually becomes attractive inside the membrane interface and then repulsive at the bilayer center. The variation across the membrane is up to 15 or 20 kcal/mol and is important for the mechanisms of membrane translocation. The result for the entropic contribution, -T∆S (z), is shown as a red curve for comparison. This contribution is unfavorable and reaches as high as +20 kcal/mol and remains above 10 kcal/mol at the bilayer center relative to the bulk. This tells us that entropy is just as important in shaping the overall PMF as is enthalpy for the charged Arg analog. We can understand this because many freely diffusing water molecules and head groups become immobilized as they solvate the Arg analog (and coordinating lipid headgroup) deep inside the hydrophobic core. The results for the neutral side chain analog are shown in Figure 6B for comparison. In this case, the entropy contribution is fairly noisy but shows an opposite effect where the entropy favors insertion. We can understand this because the neutral side chain analog has lost its tightly bound hydration waters, which now experience greater conformational freedom, and thus, a positive entropy change is seen upon movement of Mguan0 to the core of the membrane. This highlights the very different insertion mechanisms for charged and neutral Arg side chains and the importance of entropy to the resulting PMFs. 3.5. Free Energy Contributions. The strong attractive interactions with lipid phosphate moieties and water molecules, on the order of -100kcal/mol, provides a large driving force to pull polar groups into the core of the membrane. However, we have also seen that these strong interactions occur even at the free energy maximum at the bilayer center. This illustrates

9584 J. Phys. Chem. B, Vol. 112, No. 32, 2008

Li et al.

Figure 7. Free energy contributions from core-located water (red), lipid head groups (blue), and ions (cyan) obtained for ArgH+ (A), MguanH+ (B). The black curves are the PMFs from Figure 4, and the remainder curves (green) have had the core contributions subtracted.

that strong attractive interactions do not demonstrate thermodynamic stability. But it raises the question of what has caused these large free energy barriers. We have already shown that direct and indirect energy contributions can account for much of the PMF shape and magnitude but have also shown that entropy plays a major role. Here, we isolate the direct contributions to the actual free energy for individual components of the system via eq 12. Here, we consider only the contributions from polar components (lipid head groups and water molecules) that have entered within the core of the membrane, defined as having lipid P or water O atoms within 13 Å of the bilayer center (K+ and Clion contributions have been previously shown to be small18). These components are what separate the MD model from a typical uniform slab model of a membrane and are the most interesting. This will reveal how the membrane deformations have helped to stabilize the charged side chain inside the membrane. In the case of the protonated side chain, ArgH+, we have previously carried out this analysis (in which we also made use of additional 2D umbrella sampling windows) for averaging the mean force due to particular interactions.18 Figure 7A shows the core-located water and headgroup contributions to the ArgH+ PMF, with each converging to zero outside the membrane core due to the exclusion of interfacial contributions. Core-located water and head groups stabilize the helix by as much as ∼35 and 5 kcal/mol, respectively. The water contribution is large because water is very efficient at stabilizing charges away from the aqueous phase58 and also because more water enters the core as the side chain moves deeper. However, quite surprisingly, despite the side chain’s remaining tightly coordinated by 1 phosphate group at the bilayer center, the core-located head groups contribute ∼10 kcal/mol less near the membrane center relative to the outer core, actually helping to destabilize the protein. Although there is some weakening of side chain-headgroup interactions, a significant contribution is likely to be associated with the indirect energetic and entropic costs to pull those groups into the bilayer (see below). Note, however, that for small displacements of Arg into the membrane core, where the side chains remain within reach of the interface, lipid head groups do help to stabilize the protein by several kilocalories per mole. The remainder curve of Figure 7A, obtained by subtracting the above components from the total PMF, exhibits a barrier of

Figure 8. (A) Thermodynamic cycle for calculating changes in the deprotonation free energy as a function of position, z, relative to bulk, z′. (B) pKa shift profile for Arg side chain (solid) and Mguan analog (dashed).

∼36 kcal/mol and represents the contributions from the bulk electrolyte and unperturbed lipid bilayer (without the penetrating polar groups). This is the Born dehydration energy for a deformation-free bilayer, as well as interactions with the interfacial charge distribution of the membrane (shown in the

PMF and pKa Profile for Lipid-Exposed Arg accompanying manuscript to be consistent with continuum membrane calculations). The fact that the barrier in the Arg PMF is 17 kcal/mol is a result of lipid and water contributions’ being insufficient to overcome this substantial electrostatic barrier presented by the membrane. Figure 7B shows the same decomposition for the charged analog MguanH+ in which the same effects are observed, only with much greater magnitudes (approximately doubled), consistent with the increased variations in solvation and interactions. Here, we see a stabilizing contribution of nearly -90 kcal/mol from core-located water and a clear destabilizing effect of almost +50 kcal/mol for core-located lipid head groups. The remainder term is ∼56 kcal/mol in this case. This result is approximately consistent with continuum membrane as well as MguanH+ water/cyclohexane partitioning free energies12 after taking into account a 14-19 kcal/mol dipolar potential for the lipid bilayer.18 The smaller remainder term for the helix can be explained mainly by a stabilizing effect of the protein backbone (also shown via simple continuum membrane calculations in the accompanying manuscript12). Free energy decompositions for the neutral Arg side chain and analog are not shown because they cause no obvious bilayer deformation and remain coordinated by only one to two water molecules and no lipid head groups at the center of the membrane. As a consequence, the contribution of core components to the PMF is negligible ( 0 (partially charged). This is most obvious for λ ) 0.2, in which no lipid phosphate coordinated the side chain after 5 ns. Consequently, the free energy perturbation inside the membrane did not converge, even after 11 × 5 ns of simulation for that single position (solid black curve of Figure S5B), demonstrating the huge challenge of computing a pKa profile for all positions across a membrane with this approach. In contrast, smaller times of 1-2 ns would suffice in the bulk water (see Figure S5B). We instead turn to the thermodynamic cycle approach (Figure 8A) to obtain the pKa shift profile. The pKa shift for the Arg side chain (Figure 8B) remains close to zero throughout the membrane and drops by just 4.5 units within the central 10 Å. The Arg side chain would therefore remain protonated throughout the membrane at neutral pH because the pKa remains above 7. Depending on the choice of aqueous solution value for the pKa (12 - 13.7),22–24 the pKa of Arg at the center of the membrane would be somewhere between 7.5 and 9.2, corresponding to 70-99% protonation at neutral pH. Even allowing for the lower limit and an uncertainty of ∼1 pKa unit (see accompanying manuscript), a considerable fraction (24-99.9%) of Arg would remain protonated. In acidic environments, Arg would essentially never be uncharged.

J. Phys. Chem. B, Vol. 112, No. 32, 2008 9585 Figure 8B also shows the ∆pKa profile for the analog Mguan. Unlike the side chain, the analog experiences a positive shift at the interface due to a strong binding to lipid head groups. The ∆pKa then is negative within the core, reaching a minimum of -9 units. This tells us that at neutral pH, the analog would experience deprotonation close to the membrane center. However, we have demonstrated that the analog model has far greater energetic variations across the bilayer, and this emphasizes the important role played by the TM segment (and choice of reaction coordinate). We understand the resilience of the Arg side chain against the membrane environment (for the TM helix case) in terms of the different behavior of the lipid membrane for charged and neutral species: membrane deformations are preferentially assisting the charged species through hydration and coordination by lipid head groups. Free energy decomposition has shown that the barrier for charged Arg would be at least doubled without deformations (Figure 7A and B, remainder curves), whereas the neutral PMF would be almost unaffected. Thus, without membrane deformations, Arg would completely deprotonate throughout the membrane core. Conclusions We have employed simple side chain analogs as well as a TM helix model to explore the thermodynamics of lipid-exposed Arg movement across a membrane. We have calculated free energy profiles for both the charged and neutral forms of each model side chain. Our simulations revealed that the charged species create local deformations of membrane structure that do not break in multi-ns simulations. The neutral forms lead to much smaller forces and subsequently do not cause these deformations. The mechanisms of neutral Arg movement are simply associated with dehydration. We have employed special approaches to ensure sampling of the interfacial connections for the charged side chain or its analog to ensure correct free energy profiles. Despite the penetration of water and lipid head groups into the bilayer, the barrier experienced by charged Arg is high, 17-21 kcal/mol, depending on the model. The presence of the protein segment lowered the free energy barrier and essentially eliminated interfacial binding, in agreement with experiment. By remapping the free energy of the TM helix in terms of the free energy as a function of the side chain guanidinium group position, we demonstrated that the two models are, in fact, consistent. The neutral side chain was also seen to have a large free energy barrier of 7-10 kcal/mol, depending on the model, demonstrating that deprotonation does not eliminate the barrier to translocation across the membrane. In the accompanying manuscript,12 we present calculations that demonstrate that these free energy profiles are accurate to within 1-2 kcal/mol for both charged and neutral species and that any errors due to the force field representation of side chain solvation or interactions within the membrane are likely to be fairly constant across the membrane and, thus, have little impact on the shapes of the free energy profiles presented here. We have calculated average energy contributions to help explain the shape of the resultant free energy profiles. We have found that, in both the direct and indirect contributions, there is a sum of large terms that almost cancel. Almost all energy losses due to dehydration are made up for by increased interactions with lipid head groups. The net direct interaction energy is fairly flat in comparison, but with some variation across the bilayer that will contribute to the resulting free energy barrier. We have shown that entropy plays a significant role in membrane partitioning, and thus, it is free energy contributions

9586 J. Phys. Chem. B, Vol. 112, No. 32, 2008 that we must analyze to understand what determines the shape of the PMF. We took advantage of the ability to decompose the mean force to elucidate contributions to the free energy profile. We showed that water pulled into the membrane helps to stabilize the analog or TM helix containing Arg but that the lipid head groups pulled into the membrane only help to stabilize the protein for small displacements into the core, after which they actually have a destabilizing effect due to the strains imposed on the membrane. We found that these free energy contributions were approximately doubled for the analog compared to the actual side chain, with core-located water providing a very strong stabilizing effect and core-located head groups an even more destabilizing role for this small analog molecule. We understand this in terms of the far greater energetic variations seen for the analog and also due to somewhat increased local membrane deformations. The indirect energy terms (i.e., those terms not involving a direct interaction with the protein) were seen to be a cancelation of large terms: strains in the lipids and water essentially are overcome by increased water-lipid interactions in the bilayer. The net effect was again fairly flat in comparison, but with a sharp variation deep inside the membrane core. We interpreted this variation in terms of an apparent “strain force” that increases the energy of the system as Arg moves to the bilayer center. We have demonstrated that the sharp increase in free energy in the bilayer core is due to both weakening of direct interactions and an increase in the indirect, or strain, costs to deform the membrane, and must now seek a formalism by which we can tractably decompose those direct and indirect contributions to the free energy. We have computed pKa shift profiles across the membrane for analog and TM helix models. We have shown that the pKa is almost unaffected by the membrane, except within the innermost 10 Å of the hydrocarbon core, where a shift of just -4.5 units was calculated. For the analog molecule, the shift was larger and occurred over a greater fraction of the membrane core, reaching a minimum of -9 units at the membrane center, signifying the importance of the TM segment in the side chain thermodynamics, partially protecting it from the unfriendly surroundings while also allowing the side chain conformational freedom to attempt to snorkel toward an interface and lower its energy. The relatively small pKa shift for the Arg side chain is a result of membrane deformations’ preferentially stabilizing the charged state, narrowing the gap between charged and neutral Arg free energies. We have shown that without these deformations, the pKa shift would be truly great. This may explain why even larger shifts have been estimated in nonpolar parts of proteins27 where a limited number of water molecules may penetrate and lipid head-groups likely do not venture. We can conclude that an Arg side chain on a TM helix will be governed by thermodynamics that is determined almost exclusively by the protonated state for which the barrier for membrane translocation is 17 kcal/mol, prohibiting any lipid exposed motion for Arg side chains. Interestingly, other basic residues, such as lysine, for example, may suffer similar shifts in pKa but will likely deprotonate inside the membrane due to their lower aqueous phase pKa values (e.g., 10.4 for Lys22). Thus, Arg is a unique amino acid side chain that could even withstand the hostile environment presented by the membrane hydrocarbon core and maintain the ability to carry a charge. Such a situation does, however, appear highly unlikely, with implications for the interpretation of translocon-based assays as well as the mechanisms of voltage-gated ion channels.

Li et al. Acknowledgment. This work was carried out in part using Pittsburgh Supercomputing Center grant MCB-050002 and was supported by NSF CAREER award MCB-0546768. Supporting Information Available: In the Supporting Information, we present five figures demonstrating equilibration, sampling, and convergence of our simulations. This material is available free of charge via the Internet at http://pubs.acs.org. References and Notes (1) Yang, A. S.; Honig, B. J. Mol. Biol. 1993, 231, 459. (2) Woods, A. S.; Ferre, S. J. Proteome Res. 2005, 4, 1397. (3) Luscombe, N. M.; Laskowski, R. A.; Thornton, J. M. Nucleic Acids Res. 2001, 29, 2860. (4) Huang, H. C.; Briggs, J. M. Biopolymers 2002, 63, 247. (5) Cuello, L.; Romero, J.; Cortes, D.; Perozo, E. Biochemistry 1998, 37, 3229. (6) Kalderon, D.; Richardson, W. D.; Markham, A. F.; Smith, A. E. Nature 1984, 311, 33. (7) Han, X.; Bushweller, J. H.; Cafiso, D. S.; Tamm, L. K. Nat. Struct. Biol. 2001, 8, 715. (8) Brown, K. L.; Hancock, R. E. W. Curr. Opin. Immunol. 2006, 18, 24. (9) Mathies, R. A.; Lin, S. W.; Ames, J. B.; Pollard, W. T. Annu. ReV. Biophys. Biophys. Chem. 1991, 20, 491. (10) Armstrong, C. M.; Bezanilla, F. Nature 1973, 242, 459. (11) Strandberg, E.; Killian, J. A. FEBS Lett. 2003, 544, 69. (12) Vorobyov, I.; Li, L. B.; Allen, T. W. J. Phys. Chem. B 2008, 112, 9588. (13) Hessa, T.; Kim, H.; Bihlmaier, K.; Lundin, C.; Boekel, J.; Andersson, H.; Nilsson, I.; White, S. H.; von Heijne, G. Nature 2005, 433, 377. (14) White, S. H.; von Heijne, G. Curr. Opin. Struct. Biol. 2005, 15, 378. (15) Jiang, Y. X.; Ruta, V.; Chen, J. Y.; Lee, A.; MacKinnon, R. Nature 2003, 423, 42. (16) Jiang, Y. X.; Lee, A.; Chen, J. Y.; Ruta, V.; Cadene, M.; Chait, B. T.; MacKinnon, R. Nature 2003, 423, 33. (17) Grabe, M.; Lecar, H.; Jan, Y. N.; Jan, L. Y. Proc. Natl. Acad. Sci., U.S.A. 2004, 101, 17640. (18) Dorairaj, S.; Allen, T. W. Proc. Natl. Acad. Sci., U.S.A. 2007, 104, 4943. (19) Essmann, U.; Berkowitz, M. L. Biophy. J. 1999, 76, 2081. (20) Freites, J. A.; Tobias, D. J.; von Heijne, G.; White, S. H. Proc. Natl. Acad. Sci., U.S.A. 2005, 102, 15059. (21) MacCallum, J. L.; Bennett, W. F. D.; Tieleman, D. P. J. Gen. Physiol. 2007, 129, 371. (22) Nozaki, Y.; Tanford, C. Methods Enzymol. 1967, 11, 715. (23) Angyal, S. J.; Warburton, W. K. J. Chem. Soc. 1951, 2492. (24) Hall, N. F.; Sprinkle, M. R. J. Am. Chem. Soc. 1932, 54, 3469. (25) Raczynska, E. D.; Cyranski, M. K.; Gutowski, M.; Rak, J.; Gal, J. F.; Maria, P. C.; Darowska, M.; Duczmal, K. J. Phys. Org. Chem. 2003, 16, 91. (26) Schlippe, Y. V. G.; Hedstrom, L. Arch. Biochem. Biophys. 2005, 433, 266. (27) Cymes, G. D.; Ni, Y.; Grosman, C. Nature 2005, 438, 975. (28) Niemeyer, M. I.; Gonzalez-Nilo, F. D.; Zuniga, L.; Gonzalez, W.; Cid, L. P.; Sepulveda, F. V. Proc. Natl. Acad. Sci., U.S.A. 2007, 104, 666. (29) Long, S. B.; Campbell, E. B.; MacKinnon, R. Science 2005, 309, 897. (30) Nagle, J. F. Biophys J. 1993, 64, 1476. (31) Sopio, R.; Lederer, M. Z. Lebensm. Unters. Forsch. 1995, 201, 381. (32) Schug, K. A.; Lindner, W. Chem. ReV. 2005, 105, 67. (33) Springs, B.; Haake, P. Bioorg. Chem. 1977, 6, 181. (34) Wolfenden, R.; Andersson, L.; Cullis, P. M.; Southgate, C. C. B. Biochemistry 1981, 20, 849. (35) Strandberg, E.; Ozdirekcan, S.; Rijkers, D. T. S.; van der Wel, P. C. A.; Koeppe, R. E.; Liskamp, R. M. J.; Killian, J. A. Biophy. J. 2004, 86, 3709. (36) Woolf, T. B.; Roux, B. Proteins Struct. Funct. Genet. 1996, 24, 92. (37) Tieleman, D. P.; Marrink, S. J.; Berendsen, H. J. C. Biochim. Biophys. Acta 1997, 1331, 235. (38) Tu, K. C.; Klein, M. L.; Tobias, D. J. Biophys. J. 1998, 75, 2147. (39) Chiu, S. W.; Clark, M. M.; Jakobsson, E.; Subramaniam, S.; Scott, H. L. J. Comput. Chem. 1999, 20, 1153. (40) Brooks, B. R.; Bruccoleri, R. E.; Olafson, B. D.; States, D. J.; Swaminathan, S.; Karplus, M. J. Comput. Chem. 1983, 4, 187.

PMF and pKa Profile for Lipid-Exposed Arg (41) MacKerell, A. D., Jr.; Bashford, D.; Bellot, M.; Dunbrack, R. L.; Evanseck, J. D.; Field, M. J.; Fischer, S.; Gao, J.; Guo, H.; Ha, S.; JosephMcCarthy, D.; Kuchnir, L.; Kuczera, K.; Lau, F. T. K.; Mattos, C.; Michnick, S.; Ngo, T.; Nguyen, D. T.; Prodhom, B.; Reiher, W. E., III; Roux, B.; Schlenkrich, B.; Smith, J.; Stote, R.; Straub, J.; Watanabe, M.; Wiorkiewicz-Kuczera, J.; Karplus, M. J. Phys. Chem. B 1998, 102, 3586. (42) Feller, S. E.; MacKerell, A. D. J. Phys. Chem. B 2000, 104, 7510. (43) Yin, D. X.; Mackerell, A. D. J. Comput. Chem. 1998, 19, 334. (44) Jorgensen, W. L.; Chandrasekhar, J.; Madura, J. D.; Impey, R. W.; Klein, M. L. J. Chem. Phys. 1983, 79, 926. (45) Darden, T.; York, D.; Pedersen, L. J. Chem. Phys. 1993, 98, 10089. (46) Ryckaert, J. P.; Ciccotti, G.; Berendsen, H. J. C. J. Comput. Phys. 1977, 23, 327. (47) Feller, S. E.; Zhang, Y. H.; Pastor, R. W.; Brooks, B. R. J. Chem. Phys. 1995, 103, 4613. (48) Sperotto, M. M.; Mouritsen, O. G. Biophys. J. 1991, 59, 261.

J. Phys. Chem. B, Vol. 112, No. 32, 2008 9587 (49) Li, L. B.; Vorobyov, I.; MacKerell, A. D.; Allen, T. W. Biophys. J., 2008, in press. (50) Torrie, G. M.; Valleau, J. P. J. Comput. Phys. 1977, 23, 187. (51) Kumar, S.; Bouzida, D.; Swendsen, R. H.; Kollman, P. A.; Rosenberg, J. M. J. Comput. Chem. 1992, 13, 1011. (52) Sugita, Y.; Kitao, A.; Okamoto, Y. J. Chem. Phys. 2000, 113, 6042. (53) MacCallum, J. L.; Tieleman, D. P. J. Am. Chem. Soc. 2006, 128, 125. (54) Singh, U. C.; Brown, F. K.; Bash, P. A.; Kollman, P. A. J. Am. Chem. Soc. 1987, 109, 1607. (55) Simonson, T.; Carlsson, J.; Case, D. A. J. Am. Chem. Soc. 2004, 126, 4167. (56) Allen, T. W. J. Gen. Physiol. 2007, 130, 237. (57) Berger, O.; Edholm, O.; Jahnig, F. Biophys. J. 1997, 72, 2002. (58) Allen, T. W.; Andersen, O. S.; Roux, B. Proc. Natl. Acad. Sci. USA. 2004, 101, 117.

JP7114912