Implementation of the Forward–Reverse Method for Calculating the

Nov 5, 2014 - While eq 2 already shows a superior convergence rate when trivially implemented, we argue below that much more can be learned from this ...
0 downloads 9 Views 2MB Size
Article pubs.acs.org/JPCB

Implementation of the Forward−Reverse Method for Calculating the Potential of Mean Force Using a Dynamic Restraining Protocol Mostafa Nategholeslam,*,† C. G. Gray,*,‡ and Bruno Tomberli*,§ †

Department of Physics and Biophysics Interdepartmental Group, University of Guelph, Guelph, Ontario, Canada Guelph-Waterloo Physics Institute and Department of Physics, University of Guelph, Guelph, Ontario, Canada § Department of Physics, Capilano University, North Vancouver, British Columbia, Canada ‡

ABSTRACT: We present a new sampling and analysis scheme for calculating the potential of mean force (PMF) of systems studied by steered molecular dynamics simulations. This scheme, which we call the bin-passing method, is based on the forward−reverse (FR) method (due to I. Kosztin and co-workers, Kosztin et al. J. Chem. Phys. 2006, 124(6), 064106) and arguments based on the second law of thermodynamics. Applying the bin-passing method results in enhanced sampling, better separation of the reversible and irreversible work distributions, and faster convergence to the underlying PMF of the system under study. Post-simulation analysis is performed using a purpose-built software that we have made publicly available at https://github.com/1particle/bin-passing_analyzer under the terms of the GNU General Public License (version 3). Three examples are provided, for systems of varying sizes and complexities, to demonstrate the efficiency of this method and the quality of the results: for the dissociation PMF of NaCl in water, the bin-passing method obtains PMFs in excellent agreement with that obtained for the same system and using the same force-field through static (equilibrium) methods. The bin-passing method gives a very symmetric PMF for passage of a single water molecule through a DPPC bilayer, and the resultant PMF leads to permeability values in better agreement with experiments than those obtained through previous simulation studies. Finally, we consider the interaction of the antimicrobial peptide HHC-36 with two model membranes and employ the bin-passing method to obtain the PMFs for peptide adsorption to the membranes. The characteristics of these PMFs are consistent with the qualities established for the HHC-36 peptide through in vivo and in vitro experiments, as a non-toxic strong antimicrobial agent.



INTRODUCTION The advent of Jarzynski’s equality1,2 and, shortly after that, Crooks’ theorem3 have led to numerous studies on the applicability of these theorems for extracting free energy profiles of many-body systems from the external nonequilibrium work, both in experimental settings4−6 and simulations (e.g., see refs 7−10). These theorems and their implications have also been the subject of considerable theoretical investigation. Several groups have also employed Jarzynski’s equality to obtain free energy profiles in numerous biomolecular systems.11−31 Despite all the insight and utility provided by these works, the original hopes raised by non-equilibrium work (NEW) theorems and their extensions, i.e., to provide fast-converging methods for extracting the potential of mean force (PMF) are not entirely fulfilled for large thermodynamic systems, e.g., biomolecular systems. In particular, while a small ensemble of steered molecular dynamics (SMD)7,9,32 trajectories is often generated and used to obtain the PMF, thorough analysis of the convergence of the PMF and proper error analysis is often absent from these works. Furthermore, the simulation time invested to achieve the PMF from non-equilibrium SMD simulations using a direct application of Jarzynski’s equality, is comparable, if not longer than the time required to achieve a © 2014 American Chemical Society

converged PMF from an equilibrium simulation for the same system,33,34 and thus, some of the anticipated merits of using a NEW theorem are lost. In cases where faster steering speeds are used, higher numbers of trajectories are needed to achieve convergence, to the extent that it is often computationally more convenient to stay as close to the equilibrium limit as possible. Despite being correct in principle (when sampling over infinitely many paths), Jarzynski’s equality was discovered soon after its advent to suffer from slow convergence, due to its reliance on very small, zero, and even negative values of dissipative work, all of which are known to occur rarely, from the second law of thermodynamics.35,36 Various authors (including Jarzynski himself) have suggested that using bidirectional work samplings might partially remedy that difficulty.35,37 The best-known general bidirectional work theorem is that of Crooks3 e−β ΔG =

⟨f ( +W )⟩F ⟨f ( −W )e−βW ⟩R

(1)

Received: May 20, 2014 Revised: November 4, 2014 Published: November 5, 2014 14203

dx.doi.org/10.1021/jp504942t | J. Phys. Chem. B 2014, 118, 14203−14214

The Journal of Physical Chemistry B



where ΔG is change in free energy between the two macrostates under study, and ⟨...⟩F and ⟨...⟩R denote averages taken over ensembles of paths in the forward and reverse directions, respectively. This theorem by itself is not directly applicable to extract PMF profiles, as it contains an arbitrary function ( f) of work W. One needs to make an explicit choice for that function before being able to implement Crooks’ theorem, and some choices may lead to better convergence than others. The forward−reverse (FR) method,33 due to Kosztin and co-workers, is based on Crooks’ fluctuation theorem (the FR method can also be derived from the Jarzynski equality, within the same assumptions used to derive it from Crooks’ fluctuation theorem) and arrives at a remarkably simple result: for a system obeying Brownian dynamics, when we steer the system with a harmonic potential of sufficient stiffness to proceed with uniform speed from initial to final macrostates (corresponding to the initial and final values of the reaction coordinate), the free energy difference between those states is given by half the difference of the averages of forward and reverse works, i.e., ΔG =

⟨W ⟩F − ⟨W ⟩R 2

Article

BIN-CROSSING VERSUS BIN-PASSING SCHEMES Over the course of an SMD simulation, time is discretized into very short time-steps, typically on the order of a femtosecond. We start at xA at t = 0, and proceed xA → xB with a constant velocity of v = (xB − xA)/T, where T is the simulation time we invest in each realization of the xA → xB process. The forward and the reverse drift speeds need to be the same for eq 2 to be applicable, so the system will also be steered with the speed v = (xA − xB)/T along the reverse xA ← xB path. The specifics of the choice of the spring constant, total simulation time, and the steering speed, v, are not the subjects of our direct concern here. Rather, we wish to revisit eq 2 with the notions of reversible and dissipative works in mind, and also with some insight obtained from the second law of thermodynamics. To simplify the discussion, we formally define the notion of bincrossing, which we will later contrast to the notion of binpassing. Bin-crossing. During an SMD simulation (e.g., an FR run), one bin-crossing occurs when the value of the chosen reaction coordinate of the system (e.g., the radial distance between two ions) is steered to enter one boundary of a bin, pass through the bin, and eventually emerge from the opposite boundary of the bin. This marks the completion of one bin-crossing for that specific bin. A schematic view of this process is depicted in Figure 1. The elementary scheme for recording work values

(2)

Like other non-equilibrium work theorems, the convergence rate of the RHS (right-hand side) to LHS in eq 2 differs from system to system, based on the size of the system, its internal dynamics, and the thermodynamic parameters of the simulation. Equation 2 has been used to extract PMFs for several test systems of various levels of complexity,33,38−41 and its convergence rate has been shown to exceed that of Jarzynski’s equality, when tested on the same systems. While eq 2 already shows a superior convergence rate when trivially implemented, we argue below that much more can be learned from this simple equation and demonstrate how that knowledge can be utilized toward significantly improving the convergence by devising a more careful analysis scheme. In the elementary data-collection and analysis protocol for using eq 2 in a steered molecular dynamics (SMD) simulation, it is commonplace to divide the reaction path into the desired number of bins.7,9,33 Higher number of bins corresponds with a higher resolution for the PMF to be calculated. Over the course of the simulation, we record the work done on the system in each bin as it proceeds from initial state A to the final state B (and back), corresponding to initial xA and final xB values of the reaction coordinate. In traveling from A to B, the value of the reaction coordinate will traverse each bin at least one time. If we repeat this process a large number of times in the forward (F) and reverse (R) directions, we obtain the average works on the RHS of eq 2. The change in free energy across each bin calculated this way can then be summed to obtain the complete PMF from A to B. We call this elementary scheme the bincrossing method and compare its accuracy with the bin-passing method, which we present in the next section. We then detail a method for estimating the uncertainties in the PMFs obtained through the bin-passing method, followed by results for the binpassing method applied to three different systems, in increasing order of complexity. These results allow us to make comparisons between the convergence and accuracy of the bin-passing method compared to the bin-crossing method and comment on each method’s applicability in modeling bimolecular processes.

Figure 1. Progress of the system viewed at time-step resolution, for a system that is steered using a harmonic restraining potential in the A → B direction over the [0, T] time interval. In the enlarged view on the right, the horizontal dotted lines represent successive time-steps, each of length δt. The system evolves discretely, one time-step at a time. Although the overall progress of the system is from A to B there are many time-steps during which the progress of the system is in the A ← B direction, mostly as a result of thermal fluctuations.

during an SMD run is based on bin-crossing: we (re)start accumulating the external work done on the system (in either forward or reverse direction) as a bin is entered, accumulate all the work in subsequent time-steps spent within the boundaries of that bin, up to the time-step when the opposite boundary is exited. Although this might seem to be the most intuitive way to measure work for calculating the work averages in eq 2, a closer look suggests otherwise. The derivation of eq 2 gives also a relation for the average dissipative work (in either direction) ⟨Wd⟩33 ⟨W ⟩F + ⟨W ⟩R (3) 2 The meaning of eqs 2 and 3 is quite simple: the average forward work done on the system, in general consists of a ⟨Wd⟩ =

14204

dx.doi.org/10.1021/jp504942t | J. Phys. Chem. B 2014, 118, 14203−14214

The Journal of Physical Chemistry B

Article

reversible part (equal to the free energy difference) and a dissipative part ⟨W ⟩F = WF,reversible + ⟨Wd⟩

(4)

A similar relation can be written for ⟨W⟩R ⟨W ⟩R = WR,reversible + ⟨Wd⟩

(5)

By definition, we have WF,reversible = −WR,reversible = ΔG, and for most systems, ⟨Wd⟩F = ⟨Wd⟩R. Exceptions include asymmetric Brownian objects with a constraint on their orientation along the reaction path. Such exceptions aside, for most Brownian systems, one can show33 that the average dissipative work ⟨Wd⟩ done on the system in forward and reverse directions is the same, when pulls in both direction are done with constant and equal drift speeds, using a harmonic steering restraint of large enough stiffness. The second law of thermodynamics asserts further for the average dissipative work that ⟨Wd⟩ ≥ 0, with the equality holding only for infinitely slow pulls (v → 0), during which system always remains close to equilibrium. This, together with the discrete nature of SMD pulls, causes most of the time-steps to have a positive value of dissipative work, although there is nothing in the second law against any single time-step having zero or negative Wd, even when the system is far from equilibrium.42 From the enlarged section at the right side of Figure 1, it is clear that when system is steered to proceed in the forward (reverse) direction with a stiff spring, it very often happens to go in the reverse (forward) direction. This can be due to thermal fluctuations in the system (e.g., impulses from water molecules in an aqueous medium) or a result of corrective steerings when the system has jumped ahead (overshot) during an earlier time-step. To understand the latter effect, consider that steering is performed using a spring of constant stiffness, which may result in the system proceeding further than its target position by the end of some time-steps. This should be corrected in subsequent time-steps, according to the predefined trajectory we wish the system to follow x − xA xtarget(t ) = xA + v ·i·δt , v = B (6) T

Figure 2. A sample section of the trajectory traversed in an actual SMD simulation. The track shows the normal distance (x) between the center of mass of an HHC-36 peptide molecule (sequence KRWWKWWRR)43 from the center of mass of a POPE−POPG membrane patch. The HHC-36 molecule is steered to go toward the membrane during the shown interval, but it can be seen that during several time-steps the progress of the molecule is toward larger xvalues (marked red on the graph, and denoted further by colored bars beneath them). A harmonic potential with k = 8000 kcal/(mol·Å2) is used to move the peptide, with another harmonic potential of k = 25000 kcal/(mol·Å2) keeping the center of mass of the membrane in place. Each segment of the trajectory shown here corresponds to displacement of the peptide (relative to the membrane) during one time-step of duration 2 fs.

adding to the overall value of the dissipative work in the forward direction. In signal-noise language, and regarding our sampling of the reversible work as signal and that of the dissipative work as noise, the work recorded during each time-step has a contribution to signal (reversible work) and some to the noise (dissipative work). Counting the work from red segments as forward work cancels some of the signal from blue segments, while still contributing to the noise, and neither of these is desirable. In this same language, we can say that on the RHS of eq 2, the average noise (dissipative work) from ⟨W⟩F cancels that in ⟨W⟩R, while the signal (reversible work) in ⟨W⟩F sums with the negative of that in ⟨W⟩R, which is in turn the negative of the signal value in ⟨W⟩F, and then dividing by 2 we obtain the average value of the signal. Adopting a scheme where red-segment time-steps are excluded from our samplings of forward work, does not mean that we discard them. Rather, they can be used in our sampling of the reverse work ⟨W⟩R, as they contain the correct value of the A ← B reversible and dissipative works. The rationale discussed above holds also in the reverse direction: only timesteps during which the system actually proceeds in the reverse direction should be used in our estimation of ⟨W⟩R, and these can occur both during the (overall) forward and reverse portion of the SMD run. Although these assertions are intuitively appealing, implementing them in a practical analysis scheme is not trivial, as they suggest a segmentation of the trajectory, based on the direction of evolution of the system during each time-step. Subsequently, one needs to patch the right segments together to calculate ⟨W⟩F and ⟨W⟩R. Describing one such scheme, together with a method for estimating the associated uncertainties, is what we turn to next. Bin-passing Scheme for Calculation of ⟨W⟩F and ⟨W⟩R. The idea behind the bin-passing scheme for recording ⟨W⟩F and ⟨W⟩R is that forward (reverse) work should be accounted for when system actually proceeds in the forward (reverse) direction at the time-step level. In simulations, when we can

with a similar equation holding in the reverse direction. Here i is the time-step counter and δt is the length of each time-step, such that t = i·δt. An example of this fluctuation phenomenon in an actual SMD run is shown in Figure 2, where the system is restrained (or targeted) to proceed to smaller x values (blue segments), but every now and then it moves in the opposite direction (red segments, denoted also by colored bars beneath them). Our major assertion here is that the external work done on the system over the red segments in Figure 2 should not be counted as forward work in pulling the system toward smaller x values (which we here define as the forward direction). Rather, they should be used in calculating the average reverse work. The rationale for this is that the reversible work portion of the work recorded during each red segment equals the negative of the free energy difference for that segment (in the forward direction), while its dissipative work is (most likely) still a positive value, for the reason discussed above. Adding the work done during the red-segment time-steps in the estimation of the forward work, thus cancels some of the reversible work recorded during the time-steps that actually proceeded in the forward direction (dark blue time-steps), while (most likely) 14205

dx.doi.org/10.1021/jp504942t | J. Phys. Chem. B 2014, 118, 14203−14214

The Journal of Physical Chemistry B

Article

with a value exactly equal to a bin width). Then the temporary work and path-length variables of the bin are reset to zero and the algorithm proceeds to reading in the values for the next time-step. This scheme involves no greater degree of approximation than the bin-crossing method, because the accumulation of work within bins in the bin-crossing method already rests upon the approximation that (for small enough bins) the force due to surroundings is the same everywhere within the bin. Therefore, each bin passing that occurs fully within a bin measures, to within the limits of this approximation, the same reversible work as the bin crossing method. Data from the time-steps when a bin-boundary is crossed are discarded, with no noticeable change to the forward or reverse work averages. The extra complexity required in the algorithm for splitting the contributions from such time-steps is not justifiable, given the small amount of extra data that would be gained. When all the raw position and force data is read this way, for each bin in each direction we have a value for the total work and a counter variable which shows the corresponding number of bin-passings. These two variables are used to find the average forward (reverse) work per bin-passing, ⟨W⟩F (⟨W⟩R). These values are then used in eq 2 to calculate ΔG over each bin, and those values are subsequently summed along the reaction path to give the desired PMF, ΔG(x), over the [xA, xB] range. Using this scheme, one generally achieves much better statistics for calculating average external work values, as the number of bin-passings is usually at least an order of magnitude larger than the corresponding number of bin-crossings, for a given bin and in a given direction. These better statistics are simply the result of exploiting the fluctuating path the system traverses along the reaction path, and assigning each time-step segment to the proper direction. The work average values recorded this way also suffer much less from accumulation of dissipative work and simultaneous cancellation of reversible work values between different time-step segments that would happen in the bin-crossing scheme. As we demonstrate below, the bin-passing scheme leads to much faster convergence of the RHS of eq 2 to its LHS. Nevertheless, we have to consider the possibility that the statistics gathered using a bin-passing scheme might suffer from internal correlations of data: many instances of bin-passing in a given direction might result from a single bin-crossing, and the recorded work values for successive bin-passings might thus be correlated, i.e., statistically dependent on each other. While using such statistics for calculating average work values is safe, we should be cautious in estimating the uncertainty in ⟨W⟩F and ⟨W⟩R values (and thus the uncertainty in the PMF) calculated this way. As discussed next, we use the blockaveraging scheme44 for a more conservative estimation of error values. Dealing with Errors. Suppose we divide the reaction path from xA to xB into N bins, each of length δx = |xA − xB|/N. During a simulation, we collect ni,F(R) instances of forward(reverse) bin-passing work for the ith bin along the reaction path, and we list all those ni instances of bin-passing work in the time-order that they occurred, {W1, W2, ..., Wni}i,F(R). A correlation length r can effectively be defined for these ni samplings, such that the mth and the (m + r)th instances of forward(reverse) bin-passing work along the bin can be considered as statistically independent (one obviously requires m + r < ni). Now if we put our ni samplings into ni/r blocks,

check the direction, the system evolves during each time-step, there is no reason for mixing together all the back and forth fluctuations that over time progress the system in a certain targeted direction. Rather, we can benefit from our knowledge of the direction of progress of the system at the smallest (i.e., time-step) level, and then add the work done during each timestep to the total amount of work (for the corresponding bin) done in the same direction as the system has progressed during that time-step. Simply put, forward work is recorded in a bin only from time-steps during which system actually proceeds forward, and similarly for the reverse work. Based on this idea, we propose the following scheme for calculating ⟨W⟩F and ⟨W⟩R: perform an SMD FR run, between the desired initial and final macrostates of the system, distinguished by values xA and xB of the reaction coordinate. For each time-step, record (perhaps in a text or binary file) the initial and final values of the reaction coordinate, and also the external force applied on the system (for a typical two-body scenario, this involves recording initial and final values of two coordinates and two forces, one for each object). We strongly recommend that the analysis be performed postsimulation, as this analysis scheme usually takes a few orders of magnitude less processing time than the simulation itself, depending on the size of the system and the simulation time invested. There is no extra overhead due to recording the force and position data during the SMD run, as these data have to be recorded for any alternative analysis scheme. Post-simulation analysis also preserves raw data, and various analysis algorithms can then be applied, without the need for repeating the simulation. After the simulation, the analysis algorithm is given the initial and final values of the reaction coordinate, and the number of bins (of equal width, although that can be altered, at the price of further complexity in the algorithm). With this information, an initial mesh of the reaction path can be furnished, with spatial boundaries of each bin determined. Conveniently, the forward direction is then determined as the direction from the given initial to the given final value of the reaction coordinate xA → xB, with the opposite direction xA ← xB defined as the reverse direction. This choice of direction is arbitrary, since reversing it will not affect the shape of the resultant PMF other than, possibly, a constant overall shift. The analysis algorithm then proceeds to read the raw data (recorded with the format described above), one time-step at a time. For each bin, a pathlength variable and a temporary work variable is defined in each direction. When a time-step datum is read, if the progress of the system during that step is in the forward (reverse) direction and its initial and final values of the reaction coordinate fall within the ith bin along the reaction path, the temporary work value of that bin is incremented by the value of the forward (reverse) work done during that time-step, and the forward (reverse) path-length variable of the bin is incremented with the absolute value of the displacement along the reaction path during that time-step. This process is repeated for subsequent time-steps, until the path-length variable of a bin in a given direction reaches or exceeds the size of the bin (|xB − xA|/(number of bins)). This marks the completion of one bin-passing. At this point the temporary work variable of the bin is normalized to the value of that bin’s path-length variable, the total forward (reverse) work variable of the bin is incremented with the value of the normalized forward (reverse) temporary work, and the forward (reverse) work-counter variable is incremented by one (normalization is often required due to the discretization of the motion, which makes it unlikely that any given bin passing ends 14206

dx.doi.org/10.1021/jp504942t | J. Phys. Chem. B 2014, 118, 14203−14214

The Journal of Physical Chemistry B

Article

each of length r, and use these ni/r work averages, rather than the original ni different values to calculate the uncertainty in ⟨W⟩F(R) across that bin, we can be confident that our ni/r samplings are statistically independent and thus we are not underestimating ⟨δW⟩F(R) for the given bin. This is the idea behind block-averaging (Figure 3).

choose it to be zero here. Then the value of the free energy function at the end of the ith bin is given by i

G(xA + i·δx) =

∑ ΔGa ,

δx =

a=1

xB − xA N

(7)

Using the ⟨δsW⟩i,F and ⟨δsW⟩i,R values for each bin, individual ΔGi values calculated from eq 2 will have associated uncertainties δs(ΔGi) =

⟨δsW ⟩i2,F + ⟨δsW ⟩i2,R

(8)

where again the subscript s serves to remind the block-size dependence of our estimation of δ(ΔGi). Using the δs(ΔGi) values found this way along xA → xB, we can estimate the upper bound for the error in our calculated value of the free energy function by the end of the ith bin, δsG(xA + i·δx) Figure 3. Schematic view of block-averaging for work values Wi corresponding to successive bin-passings along a given bin in a given direction. A block size of s = 3 is used to re-group Wi values into Wj′ values, where, e.g., W1′ = (W1 + W2 + W3)/3.

i

δsG(xA + i·δx) =

∑ [δs(ΔGa)]2 a=1

(9)

This equation results in an increasing trend in δsGi values as we proceed from xA to xB. We might now recall that the choice of direction xA → xB for accumulating individual ΔGi values to establish the free energy profile G(x) is arbitrary. Subsequently, the increasing trend of associated δsGi values in the xA → xB direction also follows that arbitrary choice of direction. If we do the calculation in eq 9 twice, once from xA to xB and the other time in the opposite direction, we will obtain two trends of δsG that increase in opposite directions. We can convert these two sets of error values into weight factors for their two associated free energy profiles. Recalling that the two free energy profiles are exactly the same, except for an overall constant difference, we can shift one to coincide with the other. Corresponding to xA → xB and xA ← xB directions for establishing the G(x) profiles, each value of G(x) can now be considered to have two different weight factors,45 ws,xA → xB(x) = (δsG(x)xA → xB)−2 and ws,xA ← xB(x) = (δsG(x)xA ← xB)−2. These weight factors can be combined to give a moderated value for δsG(x):

The only remaining difficulty is finding the correlation length r for the simulation data in hand. That can be simply done by repeating the analysis several times, using different block-sizes. A block size of s simply means that s successive values of recorded bin-passing work in a given direction and for a given bin are averaged together and used as a single value toward finding ⟨W⟩F(R) and its associated uncertainty, ⟨δW⟩F(R). By using successively larger values of s, we reduce the chance of having correlation between each two adjacent block averages of work in a given direction. This generally results in an increase in the calculated ⟨δW⟩F(R), using the ni/s block values, rather than ni values (corresponding to a block size of 1). The maximum ⟨δW⟩F(R) will be obtained when s ≈ r. Using larger block sizes generally results in a decrease in the calculated ⟨δW⟩F(R), because when s > r, some uncorrelated values will be averaged over in a single block, without accounting for their genuine scatter, which will subsequently result in smaller calculated ⟨δW⟩F(R). A practical scheme for finding r can not rely solely on calculating ⟨δW⟩F(R) with different block sizes for one single bin along the reaction path, as each bin generally exhibits a different trend of correlation among successive bin-passing work values. Rather, a method to establish the error profile along the reaction path is needed. The latter should take into account the accumulation of error along the reaction path as we establish the PMF using the estimated PMF difference ΔG values for individual bins. Using a block-size of s, we establish the ⟨W⟩F and ⟨W⟩R values for each bin along this reaction path. We then use these values in eq 2 to obtain ΔG across each bin. With ni,F instances of forward and ni,R instances of reverse bin-passing recorded for the ith bin along the reaction path, we will have ni,F/s values of forward bin-passing work (with block-size s), and ni,R/s such values in the reverse direction. This statistic can be used to calculate ⟨δsW⟩i,F and ⟨δsW⟩i,R for the ith bin, where the subscript s denotes the general dependence of ⟨δsW⟩i,F(R) on the value of the block-size s used. We establish ⟨W⟩F and ⟨W⟩R values with their associated ⟨δsW⟩i,F(R) uncertainties for all the N bins. The initial value of free energy function G at xA is arbitrary and does not affect the accumulation of the error, and for the sake of simplicity we

δsG(x) = =

1 ws , xA → xB(x) + ws , xA ← xB(x) δsG(x)xA → xB . δsG(x)xA ← xB (δsG(x)xA → xB )2 + (δsG(x)xA ← xB )2

(10)

δsG(x) values found this way, generally show a trend of starting from smaller values at both ends of the reaction path (xA and xB) and increasing gradually toward a maximum in the middle. This can be understood from eq 8, and the fact that δsG(x)xA → xB and δsG(x)xA ← xB values show trends of increase in the xA → xB and xA ← xB directions, respectively. Equation 10 combines these two trends into a more symmetric pattern, removing the arbitrariness of unidirectional accumulation of error in both δsG(x)xA → xB and δsG(x)xA ← xB profiles. Error profiles established using eqs 8−10, using successively larger values of block size s, typically have a single maximum in the middle of the reaction path, as we demonstrate in the next section. The value of this maximum increases with s, until s approaches the correlation length r. As explained above, larger block sizes (s > r) usually result in a decrease in values of the error, for averaging over uncorrelated data without accounting 14207

dx.doi.org/10.1021/jp504942t | J. Phys. Chem. B 2014, 118, 14203−14214

The Journal of Physical Chemistry B

Article

Å/ns, equilibrating the system in that configuration for 0.1 ns, moving the Na+ ion back to (2.1 Å, 0, 0) along the x-axis with a steering velocity of −7.9 Å/ns, and equilibrating the system in that configuration for another 0.1 ns. The Cl− ion is constrained at the origin throughout the production run. Each such FR cycle thus takes 2.2 ns, and 10 cycles are performed in each of the two production runs, giving a total of 22 ns of simulation time for each run. The [2.3 Å, 9.5 Å ] range of the resultant trajectory has been used in obtaining the PMFs from both methods. The resultant PMFs are shown in Figure 4, together with Timko’s PMF for NaCl dissociation obtained using equilibrium classical molecular dynamics with CHARMM27 force-field47 at

for their scatter. We can use this behavior as our guide in estimating r, by establishing error profiles with increasingly larger values of s, until a maximum is reached. Although adopting such an error profile as the measure of uncertainty in our calculated free energy profile might be an overestimate, the error-bars obtained this way for the bin-passing method are often consistent with the bin-to-bin variation in the resultant free energy profile. We show examples of this procedure and the resultant G and δsG profiles in the next section. While the convergence rate of the bin-passing algorithm is high, there is no guarantee that for a given system and with a converged PMF, the error profiles obtained using successively larger block sizes would necessarily exhibit a maximum. This can be due to lack of enough sampling to encompass several correlation lengths r worth of work values in each direction, while the sampling has been enough already for calculating the underlying PMF. In such cases, we can exploit our knowledge of the physical characteristics of the system. Assume that we know, e.g., that the xA → xB and xA ← xB paths have each been traversed 10 times in a given FR simulation, and that samplings in each two such successive trips can be regarded as uncorrelated. Then for estimating the error in the PMF it will make no sense to use a block-size so large that it gives less than 10 sampled values for any bin along the reaction path, as this would average uncorrelated samplings of work together as one data point. We have used this intuitive rule of thumb for choosing the proper error profiles in the example of peptidemembrane systems below, while the maximum in error-profiles can be established and used for smaller systems when a sufficiently long sampling has been performed (see next section).



EXAMPLE 1: PMF OF INTERACTION OF Na+ AND Cl− IONS IN BULK WATER The dissociation PMF for an Na+ and a Cl− ion in water has been studied by many authors, and we use this simple system as our first example to demonstrate the efficiency and accuracy of the bin-passing method. We present here the results of two similar SMD simulations, both performed on the same equilibrated simulation box. For both simulations we have used the TIP3P water model,46 together with the CHARMM force field (version 2747), and both simulations have been performed using NAMD molecular dynamics software,48 at NPT conditions, with atmospheric pressure (P = 1.00 atm) and physiological temperature (T = 310 K). A single Cl− ion is put (strictly fixed) at the origin of the coordinate system, with an Na+ ion at (2.3 Å, 0, 0). A total of 3916 water molecules are used to build a (nearly) cubical water box of size ∼49 Å on each side around these two atoms. The two ions are held fixed in place using a constraint, while the energy of the whole system is minimized for 1000 steps and then the system undergoes an equilibration molecular dynamics run for 5 ns, using a time-step of 2 fs and under NPT conditions (P = 1.00 atm, T = 310 K). We use the resultant equilibrated box as the initial configuration for both of the simulations in this section. During the production FR runs, we employ a harmonic guiding potential of stiffness k = 500 kcal/(mol·Å2) to perform the steering. Two similar simulations were performed, using identical steering conditions, but with bin-crossing sampling performed in one simulation and bin-passing during the other. Each FR cycle consists of moving the Na+ ion from (2.1 Å, 0, 0) to (10 Å, 0, 0) along the x-axis, with a steering velocity of 7.9

Figure 4. (a) Dissociation PMFs for NaCl in water at T = 310 K and P = 1.00 atm, using TIP3P46 water model and CHARMM2747 force field, obtained from two SMD simulations, and compared with the classical PMF obtained by Timko et al.,49 using the CHARMM27 force field, at T = 300 K and P = 1 atm. The PMF by Timko et al. has been calculated over about half the range of reaction coordinate as that studied here using the FR method. We have used a harmonic potential of stiffness k = 500 kcal/(mol·Å2) to steer the Na+ ion with a steering speed of 7.9 Å/ns along the trajectory, while the Cl− ion was constrained at the origin. Error-bars are not shown to allow distinguishing the PMF curves clearly. (b) Bin-crossing and binpassing PMFs are given with their corresponding error bars. The error bars for the bin-passing PMF are obtained using a block size of 80, as explained in Figure 5. (c) The forward portion (moving from larger to smaller z-values) of the bin-crossing run presented in (a) and (b) is used to construct an estimation of the PMF using Jarzynski’s equality e−βF = ⟨e−βW⟩. The resultant profile has obviously not converged to the underlying PMF, as a comparison with (a) and (b) reveals. 14208

dx.doi.org/10.1021/jp504942t | J. Phys. Chem. B 2014, 118, 14203−14214

The Journal of Physical Chemistry B

Article

phosphocholine (DPPC) bilayer at atmospheric pressure and T = 323 K. A membrane patch composed of 80 DPPC molecules was built, using the CHARMM-GUI50,51 membrane builder. This patch was then hydrated with a total of 5137 TIP3P46 water molecules on both faces. The approximate dimensions of the resultant simulation box were 65 Å × 68 Å × 103 Å along the x, y, and z directions, respectively, with the normal to the membrane surface along the z-direction, and its center of mass situated at the origin of the coordinate system. The resultant system was then equilibrated for 60 ns using NAMD48 and the CHARMM3652 force field, with a time-step of 2.00 fs, under constant temperature and pressure (P = 1.00 atm and T = 323 K), while all three dimensions of the system were allowed to fluctuate in size to let the system settle to an equilibrated geometry. During the last 5 ns of this equilibration simulation, the area-per-lipid was 60.57 ± 0.84 Å2, exhibiting ∼1.4% fluctuation around the mean value. This value for the average area per lipid headgroup falls within the 57−71.2 Å2 range of experimentally measured values for headgroup area of DPPC at 50 °C.53 Nine frames (microstates) were thus obtained, one each 0.5 ns along this trajectory, which were subsequently used as the initial states for nine parallel sets of FR production runs. The approximate dimensions of these simulation boxes were 60 Å × 67 Å × 105 Å. Each of these boxes was used as the initial configuration for an FR simulation of the maximum duration of 40 ns (totalling to 302.9 ns worth of simulation time). During the FR runs, a single water molecule, which was held fixed at (0, 0, 31 Å) during the equilibrations, was steered to go through the membrane with an average z drift speed of 13.78 Å /ns for a total of 4.5 ns in each direction, and it was held fixed at the two end points (at z = 31 Å and −31 Å) of the trajectory for 0.5 ns between each two successive trips through the membrane. Each FR cycle of the water molecule thus consisted of a 4.5 ns trip through the membrane from z = 31 Å to −31 Å, 0.5 ns of rest at z = −31 Å, a 4.5 ns trip back to z = 31 Å, and a final 0.5 ns of resting at z = 31 Å, before the next cycle began, totaling to 10 ns of simulation time. Our 302.9 ns of simulations thus encompass over 30 whole FR cycles, performed on nine similar but uncorrelated systems. No restraint was put on either the water molecule or the membrane along the x or y directions, thus, allowing the restrained water molecule to move freely (as a result of thermal fluctuations) in the x and y directions, while its motion along the z axis was restrained through the protocol described above. This was to ensure that the each two successive trips through the width of the membrane were along different trajectories, thus avoiding any bias in sampling the configurations of the phospholipid molecules in the membrane. Throughout these FR simulations, the steered water molecule has been restrained using a spring constant of 0.5 kcal/(mol·Å2), while the center of mass of the DPPC membrane patch was restrained to stay at z = 0, using a spring of stiffness 500 kcal/(mol·Å2). The PMF obtained (using the bin-passing method) from these series of simulations, is shown in Figure 6, together with the (symmetrized) PMFs obtained for the same system by various other groups.41,54−57 The uncertainty profiles (using several different block sizes) for this system are also shown in Figure 7 and exhibit a maximum for a block size of 100. These largest uncertainty estimations are thus used as the error-bars on the bin-passing PMF in Figure 6. The PMFs reported by

300 K. It is clear in Figure 4 that, for exactly the same amount of simulation time and under the same simulation conditions, the bin-crossing method gives poor results for the dissociation PMF of Na+ and Cl− ions in water, while the PMF obtained from the bin-passing algorithm, agrees almost perfectly with Timko’s result. In part (c) of Figure 4 we have shown the PMF obtained using Jarzynski’s equality on the work samplings obtained in the bin-crossing simulations. As Jarzynski’s original method is unidirectional, only about half of the data from a given series of FR simulations can be used as input to this method, and it is evident that the resultant profile, shown in Figure 4c, has not converged to the underlying PMF. We should note that using about the same amount of data (∼11 ns worth of simulation time), the bin-passing method obtains the general features of the NaCl dissociation PMF with a much better precision (result not shown) than, e.g., the bin-crossing FR result shown in Figure 4a,b, using 22 ns worth of simulation time with the conditions explained above. The Jarzynski result shown in part (c) of Figure 4 thus only serves the purpose of exhibiting the relative inefficiency of this unidirectional method when applied to a small ensemble of trajectories. In Figure 5 we establish the error profiles, as discussed in the last section, for the bin-passing PMF presented in Figure 4,

Figure 5. PMF error profiles for the FR simulations of NaCl dissociation in water at 310 K, obtained through the bin-passing algorithm with increasing values of block sizes. A block size of 80 is seen to obtain the largest uncertainty values, and the error bars obtained using this value of block size are thus used in Figure 4 on top of the bin-passing PMF.

using various block sizes from 1 to 160, and it can be seen that these curves reach their maximum at a block size of 80 and decrease thereafter for larger block sizes. We have thus used the error bars obtained with the block size of 80 on top of the binpassing PMF in Figure 4b. These error bars are about half the size (or less) of the bin-crossing error bars shown in the same figure. Comparison of the size of the PMFs and their associated error-bars together demonstrate superior performance of the bin-passing method over the traditional bin-crossing method for this system, to the extent that even traces of a third, very shallow potential well can be distinguished on the bin-passing PMF around 7.5 Å.



EXAMPLE 2: PMF FOR PASSAGE OF WATER THROUGH A DPPC MEMBRANE Passage of various molecules through membranes is a classic problem of biological and industrial importance. As the second example of the application of the bin-passing method, we study the passage of water through a 1,2-dipalmitoyl-sn-glycero-314209

dx.doi.org/10.1021/jp504942t | J. Phys. Chem. B 2014, 118, 14203−14214

The Journal of Physical Chemistry B

Article

⎛ P = ⎜⎜ ⎝

∫z

z2

1

−1 e βΔG(z) ⎞ dz⎟⎟ D(z) ⎠

(11)

where D(z) denotes the local diffusion coefficient. We used the diffusion coefficient profile obtained for this system by Holland et al.41 (at P = 1.00 atm and T = 323 K), together with our PMF for this system, shown in Figure 7, to calculate the righthand side of eq 11 and found a permeability value of P = (7.92 ± 0.7) × 10−3 cm/s for water−DPPC at the given pressure and temperature. This value is not in perfect agreement with the experimentally measured value of 2.4 × 10−3 cm/s for this system at T = 319.15 K59 or 3.00 × 10−4 at T = 343.15 K.58 Nevertheless, this result is in closer agreement with the experimental values than those obtained through previous simulation studies, as shown in Figure 8.

Figure 6. PMF for passage of water through a DPPC phospholipid bilayer, as obtained using the bin passing method from FR simulations (at T = 323 K and P = 1 atm), together with various PMFs for the same system obtained by other groups.41,54−57 Only, the bin-passing PMF as given here is not symmetrized, and it exhibits a maximum asymmetry of only ∼0.17 kcal/mol. The value z = 0 corresponds to the center of the membrane.

Figure 8. Comparison of water permeability of the DPPC bilayer, as obtained from various studies. The reported values given by Jansen and Blume (343.15 K)58 and Lawaczeck (319.15 K)59 are obtained experimentally, while the other four results41,55−57 and the value obtained in this work are from simulation studies.

Figure 7. Error profiles for the 302.9 ns long FR simulation of the passage of water through a DPPC bilayer. A block size of 100 is seen to obtain the largest estimated uncertainties in the PMF, and successively higher block sizes result in reduced estimations of the uncertainty. The uncertainty values obtained with a block size of 100 are used on the bin-passing PMF shown in Figure 6.

As can be seen from eq 11, permeability depends exponentially on the PMF and only linearly on the diffusion coefficient. The better agreement with the experimental results of the permeability obtained using the PMF found through the bin-passing method (compared with those reported from other simulation studies) thus supports the validity of the larger barrier height obtained here, shown in Figure 6.

other groups are obtained using various methods, force-fields, and simulation times. The longest simulation among these has been the 30 ns run performed by Holland et al.,41 and the shortest has perhaps been the classic and pioneering studies by Marrink and Berendsen,55 where they could afford only to spend very few nanoseconds of simulation time (total simulation time unclear). With the exception of the PMF reported by Shinoda et al.,56 all other groups calculated the PMF for only one leaflet and mirrored the result to produce the full profile. Nevertheless, Shinoda et al.56 and Sugii et al.57 did not provide error bars on their PMFs, and the degree of convergence of their results is thus undetermined. In Figure 6, the bin-passing PMF shows a deviation from symmetry of no more than ∼0.17 kcal/mol, over the whole reaction path of length 60 Å. The PMFs obtained by the other groups were symmetrized (by amounts not reported), due to incomplete sampling. Our bin-passing PMF for this system exhibits a larger barrier height (by ∼1−2 kcal/mol) compared to those obtained by other groups. However, our PMF results in a better agreement with experimentally measured permeability values when used to obtain the DPPC bilayer permeability to water at the given temperature, using the equation55



EXAMPLE 3: PMF OF INTERACTION OF THE HHC-36 PEPTIDE WITH TWO MODEL MEMBRANES As our third example of the application of the bin-passing method, we establish the adsorption PMF for the antimicrobial peptide HHC-36 (WRWWKWWRR)43 to two different model membranes. HHC-36 has already been shown to perform as a strong antimicrobial peptide, through a series of in vivo and in vitro experiments. Specifically, this peptide has been tested against strains of multidrug-resistant bacteria, such as Pseudomonas aeruginosa and Staphylococcus aureus and has surpassed regularly used antibiotics in antimicrobial activity. HHC-36 has also shown minimal toxicity toward host mammalian cells, specifically toward red blood cells.43 While these experiments can not indicate the precise mechanism of interaction of this peptide against either mammalian or bacterial cell membranes, they do suggest that the peptide adsorbs less strongly to mammalian cell membranes and more so to the bacterial cell membranes. This suggestion is supported by the 14210

dx.doi.org/10.1021/jp504942t | J. Phys. Chem. B 2014, 118, 14203−14214

The Journal of Physical Chemistry B

Article

above the membrane surface, with the z-component of its center of mass at 55 Å, that is, 55 Å along the direction normal to the membrane surface. The peptide was then moved toward the membrane with a drift velocity of 5 Å/ns, along the z-axis, for 7 ns, until the z-coordinate of its center of mass reached 20 Å. We kept the peptide there for half a nanosecond to let the system settle down, and then steered the system along the opposite path, moving peptide back along the z-axis to z = 55 Å, followed by another 0.5 ns of relaxation, which marked the completion of one round of an FR run with |vF| = |vR| = 5 Å/ns. A snapshot of the initial configuration of this cycle, together with a depiction of distances described above, is shown in Figure 9, where the water molecules are shown very faintly by dotted lines to help keep the peptide and the membrane surface visible in the figure.

fact that mammalian cell membranes are rich in zwitterionic phospholipids, while the cytoplasmic membrane of bacteria is rich in anionic phospholipids. With five units of positive electric charge, one would expect this peptide to show more attraction to negatively charged phospholipids like phosphatidylglycerol (PG), compared to the zwitterionic phosphatidylcholine (PC). Strains of methicillin-resistant Staphylococcus aureus (a Grampositive bacterium) contain between 70 to over 85% of the PG in their total phospholipid content.60 PG also comprises 18% of the phospholipid content of the Salmonella typhimurium, and 19% of that of Escherichia coli (both Gram-negative), with the major phospholipid component in both cases being phosphatidylethanolamine (PE), which comprises about 75% of the phospholipid content of the membrane of Salmonella typhimurium, and 69% in the case of Escherichia coli.61 PC is the major phospholipid component of mammalian erythrocytes. While it comprises about 30% of the phospholipid content of the human erythrocyte membrane, 62,63 its distribution is not symmetric between inner and outer leaflets: the outer leaflet is has about 45% PC content among its phospholipids, while the inner leaflet has only 14%.62 To study the interaction of HHC-36 peptide with mammalian and bacterial membranes we prepared two model membrane patches, one composed of 128 molecules of the 1palmitoyl-2-oleoyl-sn-glycero-3-phosphocholine (POPC) molecules, with 64 phospholipid molecules per leaflet, and the other with a 3:1 mixture of 1-palmitoyl-2-oleoyl-sn-glycero-3phosphoethanolamine (POPE) to 1-palmitoyl-2-oleoyl-sn-glycero-3-phosphoglycerol (POPG) molecules, using 32 POPG and 96 POPE molecules, with 48 POPE and 16 POPG molecules in each leaflet. Both membranes were built using the CHARMMGUI membrane builder50,51 and then hydrated in VMD molecular visualization software.64 The resultant simulation boxes were then merged each with a separate box containing a hydrated HHC-36 molecule, obtained from over 600 ns refolding simulation, initiated from a refolded structure of the peptide molecule. All of these simulations (initial equilibration, refolding simulation of the stretched peptide structure, final equilibration, and the production runs) have been done using NAMD48 at T = 310 K and P = 1.00 atm, using TIP3P water model,46 with CHARMM2747 force field for water molecules, ions, and the protein, and CHARMM3652 force field for the membrane patches. Proper numbers of sodium and chloride ions have been added to each simulation box to obtain near physiological NaCl concentration (∼0.15 M), and also to render the simulation boxes electrically neutral. The final POPE/POPG box (containing the peptide) has been equilibrated for 71.7 ns, and the POPC box (also containing the peptide) for 175.6 ns. POPC by itself does not form very stable membranes, and thus, we performed a longer equilibration run for the POPC box, compared to the one containing POPE/POPG. From the last 20 ns of the equilibration run of the POPE/ POPG box, 11 frames have been picked, at 2 ns intervals, which were then used as the starting configurations of the 11 different FR SMD peptide-membrane runs. Similarly, from the last 24 ns of the POPC+peptide box, 13 frames have been extracted at intervals 2 ns apart, and these have been used as the initial configurations for 13 different FR SMD runs. Each membrane has its interfaces parallel to the x−y plane, and the z coordinate of its center of mass has been kept close to zero by means of harmonic restraints. In our FR simulations with both types of membranes, the peptide was initially held

Figure 9. General scheme for the SMD simulations of the HHC-36 adsorption to the POPC and POPE/POPG membrane patches. Center of mass of the peptide is moved along the z-axis from z = 55 to 20 Å and back, with an average steering speed of 5.00 Å/ns, while the center of mass of the membrane patch is restrained at z = 0.

For steering, the HHC-36 molecule against the POPE/ POPG membrane patch in the fashion just described, we use a harmonic restraint of kp = 500 kcal/(mol·Å2), with a harmonic restraint of km = 10000 kcal/(mol·Å2) holding the z-coordinate of the center of mass of the membrane essentially fixed at zero. For the peptide-POPC system, these values were kp = 100 kcal/ (mol·Å2) and km = 5000 kcal/(mol·Å2), respectively. A total of 743.87 ns of production FR run was collected for the peptidePOPE/POPG system, and 611 ns for the peptide-POPC system. The collected force and positions data from all the time-steps for each system were analyzed by the bin-passing algorithm described in the earlier sections, and the resultant adsorption PMFs are shown in Figure 10. One can see in that figure that the PMF well for the peptide’s interaction with the charged membrane (POPE−POPG) is much deeper compared to that of the zwitterionic membranes (POPC). The welldepths are 13.41 and 4.75 kcal/mol for the POPE−POPG and POPC membranes, respectively. Both curves are essentially flat in the 43−47 Å range, which means that the peptide is out of the effective interaction range of the membranes at this distance interval, and we have shifted both curves to have an average of zero in this region. We wish to emphasize the efficiency of the bin-passing method in obtaining these curves, giving satisfactory PMFs within less than a microsecond of simulation time. Applying the bin-crossing method would require much longer simulation times to achieve a converged PMF for either of the two peptide-membrane systems presented here. 14211

dx.doi.org/10.1021/jp504942t | J. Phys. Chem. B 2014, 118, 14203−14214

The Journal of Physical Chemistry B

Article

conveniently chosen fast density functional theory (DFT) methods in between the pure classical steps, which are in turn influenced by the outcome of quantum calculations. This way a practical distinction is made between the classical and quantum time-scales (following the same idea presented in the Born− Oppenheimer approximation67), which makes it possible to apply non-equilibrium work theorems to the classical steps. Due to the fact that the total distance is accumulated to determine the number of bin-passings (as opposed to the total displacement used in the bin-crossing method), a much higher number of work samples is obtained in this method, further improving the statistics. The interrelation among those contributions can be vividly seen in the central formulae of the FR method, and can be effectively implemented when the notion of dissipative work is considered within the framework of the second law of thermodynamics. These considerations are fundamental to the design of the bin-passing method presented here, and contribute to its higher efficiency compared to the more elementary bin-crossing scheme. Our experimentation with the bin-crossing method (results shown for the NaCl dissociation in water) makes it clear that this method performs poorly in obtaining the PMFs for thermodynamic systems. This is to be expected from the arguments presented above: accumulation of the dissipative work and cancellation of reversible work contributions in both the forward and the reverse work averages, when sampling is done through the bincrossing method, results in poor convergence to the PMF, and larger uncertainties in the resultant profiles. In contrast, the binpassing method exploits the thermal fluctuations occurring during SMD simulations, and results in highly improved convergence and smaller error bars. The convergence rate generally decreases with the size of the system under study, as is the case with any other PMF calculation method. Nevertheless, we showed that a submicrosecond long simulation is sufficient for obtaining satisfactory PMFs for systems as large as a hydrated peptidemembrane pair. Furthermore, the block-size variation scheme that we present and discuss puts a tight bound on the maximum uncertainty of the PMFs obtained using the bin-passing method. We suspect that lack of a proper analysis scheme and software has been a major reason for the FR method33 not becoming as widely employed by simulators as (we believe) it should be. The analysis scheme presented here can be performed using the bin-passing analyzer software that we have made available as a free software at https://github.com/ 1particle/bin-passing_analyzer under the terms of the GNU General Public License (version 3). This software has been used for performing the post simulation analyses for all the PMFs presented in this paper. Considering the remarkable convergence rate and precision of the bin-passing method and the advantages it provides (such as manual post-simulation binning of the reaction path), we hope that these methods will be adopted by the community of chemical and biomolecular simulators.

Figure 10. PMFs for adsorption of HHC-36 molecule to a POPC (pink curve) and a POPE/POPG (dark blue curve) membrane patch, both at T = 310 K and P = 1.00 atm. TIP3P water model has been used, together with the CHARMM2747 force field for the peptide, water, and ions, and the CHARMM3652 force field for the membranes. A drift velocity of 5 Å/ns has been used for steering the peptide toward each membrane surface and away from it. Error-bars are obtained using block-sizes of 400 for the POPE/POPG system and 300 for the POPC system, as these where the largest block-size values that for each bin gave at least as many samplings as the bin-crossing scheme would obtain (∼50 for the POPE/POPG system and ∼41 for the POPC system.). Center of mass of each membrane patch is restrained at z = 0.

Furthermore, as we discussed earlier, the inherent deficiencies of the bin-crossing method in obtaining reliable average forward and reverse work values might make it impossible to obtain converged PMFs using that method, even from multimicrosecond simulations of these relatively large systems. The larger well-depth of the PMF for the peptide’s interaction with the POPE−POPG membrane, compared to that with the POPC membrane, supports the higher affinity of the peptide to adsorb to the POPE−POPG membrane. Recalling that POPE−POPG is a memetic of bacterial cytoplasmic membrane and POPC represents a model for mammalian cell membranes, this observation is in agreement with in vivo and in vitro experiments43 and also isothermal titration calorimetry experiments65 that have collectively established HHC-36 as a strong antimicrobial agent with minimal hemolytic activity.



CONCLUSION Based on the FR method,33,38,39 the bin-passing scheme was shown here to effectively and precisely obtain the PMFs for systems of varying degrees of complexity. The key point in devising this scheme and thus to obtain such free energy profiles is to carefully consider the reversible and dissipative work contributions during each time-step of the steered molecular dynamics simulations, and to assign them accordingly to forward and reverse directions. This is in contrast to the bin-crossing scheme that relies on cancellation of undesirable work contributions merely through the averaging process, but accumulates the error of both the forward and reverse dissipative work in each unidirectional bin. This method can be generalized to mixed quantum mechanical-molecular mechanical (QM/MM) simulations, such as those invoking the Car−Parrinello scheme.66 some groups have demonstrated the application of Jarzynski’s equality to calculate PMFs from steered QM/MM simulations.30,31 The quantum mechanical variations of the electronic structure are often handled by



AUTHOR INFORMATION

Corresponding Authors

*E-mail: [email protected]. *E-mail: [email protected]. *E-mail: [email protected] Notes

The authors declare no competing financial interest. 14212

dx.doi.org/10.1021/jp504942t | J. Phys. Chem. B 2014, 118, 14203−14214

The Journal of Physical Chemistry B



Article

(19) Cuendet, M. A.; Michielin, O. Protein−protein interaction investigated by steered molecular dynamics: the TCR−pMHC complex. Biophys. J. 2008, 95, 3575−3590. (20) Boechi, L.; Martí, M. A.; Vergara, A.; Sica, F.; Mazzarella, L.; Estrin, D. A.; Merlino, A. Protonation of histidine 55 affects the oxygen access to heme in the alpha chain of the hemoglobin from the Antarctic fish Trematomus bernacchii. IUBMB Life 2011, 63, 175−182. (21) Morzan, U. N.; Capece, L.; Marti, M. A.; Estrin, D. A. Quaternary structure effects on the hexacoordination equilibrium in rice hemoglobin rHb1: Insights from molecular dynamics simulations. Proteins 2013, 81, 863−873. (22) Patel, J. S.; Berteotti, A.; Ronsisvalle, S.; Rocchia, W.; Cavalli, A. Steered molecular dynamics simulations for studying protein-ligand interaction in cyclin-dependent kinase 5. J. Chem. Inf. Model. 2014, 54, 470−480. (23) Yu, T.; Lee, O. S.; Schatz, G. C. Steered molecular dynamics studies of the potential of mean force for peptide amphiphile selfassembly into cylindrical nanofibers. J. Phys. Chem. A 2013, 117, 7453−7460. (24) Hwang, H.; Schatz, G. C.; Ratner, M. A. Steered molecular dynamics studies of the potential of mean force of a Na+ or K+ ion in a cyclic peptide nanotube. J. Phys. Chem. B 2006, 110, 26448−26460. (25) L. Boechi, L.; Martí, M. A.; Milani, M.; Bolognesi, M.; Luque, F. J.; Estrin, D. A. Structural determinants of ligand migration in Mycobacterium tuberculosis truncated hemoglobin O. Proteins 2008, 73, 372−379. (26) Caballero, J.; Zamora, C.; Aguayo, D.; Yanez, C.; González-Nilo, F. D. Study of the interaction between progesterone and βcyclodextrin by electrochemical techniques and steered molecular dynamics. J. Phys. Chem. B 2008, 112, 10194−10201. (27) Jensen, M. Ø.; Yin, Y.; Tajkhorshid, E.; Schulten, K. Sugar transport across lactose permease probed by steered molecular dynamics. Biophys. J. 2007, 93, 92−102. (28) Boechi, L.; Mañez, P. A.; Luque, F. J.; Marti, M. A.; Estrin, D. A. Unraveling the molecular basis for ligand binding in truncated hemoglobins: The trHbO Bacillus subtilis case. Proteins 2010, 78, 962− 970. (29) Wang, Y.; Schulten, K.; Tajkhorshid, E. What makes an aquaporin a glycerol channel? A comparative study of AqpZ and GlpF. Structure 2005, 13, 1107−1118. (30) Raugei, S.; Cascella, M.; Carloni, P. A proficient enzyme: insights on the mechanism of orotidine monophosphate decarboxylase from computer simulations. J. Am. Chem. Soc. 2004, 126, 15730− 15737. (31) Crespo, A.; Martí, M. A.; Estrin, D. A.; Roitberg, A. E. Multiplesteering QM-MM calculation of the free energy profile in chorismate mutase. J. Am. Chem. Soc. 2005, 127, 6940−6941. (32) Izrailev, S.; Stepaniants, S.; Isralewitz, B.; Kosztin, D.; Lu, H.; Molnar, F.; Wriggers, W.; Schulten, K. Lecture Notes in Computational Science and Engineering. In Computational Molecular Dynamics: Challenges, Methods, Ideas - Proceedings of the 2nd International Symposium on Algorithms for Macromolecular Modelling; Deuflhard, P., Hermans, J., Leimkuhler, B., Mark, A., Reich, S., Skeel, R., Eds.; Springer-Verlag: Berlin, 1998; Vol. 4; Chapter “Steered Molecular Dynamics”, pp 39−65. (33) Kosztin, I.; Barz, B.; Janosi, L. Calculating potentials of mean force and diffusion coefficients from nonequilibrium processes without Jarzynski’s equality. J. Chem. Phys. 2006, 124, 064106. (34) Park, S.; Khalili-Araghi, F.; Tajkhorshid, E.; Schulten, K. Free energy calculation from steered molecular dynamics simulations using Jarzynski’s equality. J. Chem. Phys. 2003, 119, 3559−3566. (35) Jarzynski, C. Rare events and the convergence of exponentially averaged work values. Phys. Rev. E 2006, 73, 046105. (36) Collin, D.; Ritort, F.; Jarzynski, C.; Smith, S. B.; T, I., Jr; Bustamante, C. Verification of the Crooks fluctuation theorem and recovery of RNA folding free energies. Nature 2005, 437, 231−234. (37) Pohorille, A.; Jarzynski, C.; Chipot, C. Good practices in freeenergy calculations. J. Phys. Chem. B 2010, 114, 10235−10253.

ACKNOWLEDGMENTS The authors are grateful to the Natural Sciences and Engineering Research Council of Canada for financial support. Computational resources for the simulations presented in this work were provided by SHARCNET, SciNet,68 and Calcul Québec HPC consortia of Compute Canada.



REFERENCES

(1) Jarzynski, C. Nonequilibrium equality for free energy differences. Phys. Rev. Lett. 1997, 78, 2690−2693. (2) Jarzynski, C. Equilibrium free-energy differences from nonequilibrium measurements: A master-equation approach. Phys. Rev. E 1997, 56, 5018−5035. (3) Crooks, G. Path-ensemble averages in systems driven far from equilibrium. Phys. Rev. E 2000, 61, 2361−2366. (4) Hummer, G.; Szabo, A. Free energy reconstruction from nonequilibrium single-molecule pulling experiments. Proc. Natl. Acad. Sci. U.S.A. 2001, 98, 3658−3661. (5) Liphardt, J.; Dumont, S.; Smith, S. B.; Tinoco, I.; Bustamante, C. Equilibrium information from nonequilibrium measurements in an experimental test of Jarzynski’s equality. Science 2002, 296, 1832− 1835. (6) Yangyuoru, P. M.; Dhakal, S.; Yu, Z.; Koirala, D.; Mwongela, S. M.; Mao, H. Single-molecule measurements of the binding between small molecules and DNA aptamers. Anal. Chem. 2012, 84, 5298− 5303. (7) Park, S.; Schulten, K. Calculating potentials of mean force from steered molecular dynamics simulations. J. Chem. Phys. 2004, 120, 5946−5961. (8) Minh, D.; McCammon, J. Springs and speeds in free energy reconstruction from irreversible single-molecule pulling experiments. J. Phys. Chem. B 2008, 112, 5892−5897. (9) Jensen, M.; Park, S.; Tajkhorshid, E.; Schulten, K. Energetics of glycerol conduction through aquaglyceroporin GlpF. Proc. Natl. Acad. Sci. U.S.A. 2002, 99, 6731−6736. (10) Xiong, H.; Crespo, A.; Marti, M.; Estrin, D.; Roitberg, A. Free energy calculations with non-equilibrium methods: applications of the Jarzynski relationship. Theor. Chem. Acc. 2006, 116, 338−346. (11) Forti, F.; Boechi, L.; Estrin, D. A.; Marti, M. A. Comparing and combining implicit ligand sampling with multiple steered molecular dynamics to study ligand migration processes in heme proteins. J. Comput. Chem. 2011, 32, 2219−2231. (12) Jensen, M. Ø.; Park, S.; Tajkhorshid, E.; Schulten, K. Energetics of glycerol conduction through aquaglyceroporin GlpF. Proc. Natl. Acad. Sci. U.S.A. 2002, 99, 6731−6736. (13) Gupta, A. N.; Vincent, A.; Neupane, K.; Yu, H.; Wang, F.; Woodside, M. T. Experimental validation of free-energy-landscape reconstruction from non-equilibrium single-molecule force spectroscopy measurements. Nat. Phys. 2011, 7, 631−634. (14) Moreno, D. M.; Martí, M. A.; Biase, P. M. D.; Estrin, D. A.; Demicheli, V.; Radi, R.; Boechi, L. Exploring the molecular basis of human manganese superoxide dismutase inactivation mediated by tyrosine 34 nitration. Arch. Biochem. Biophys. 2011, 507, 304−309. (15) Bidon-Chanal, A.; Martí, M. A.; Crespo, A.; Milani, M.; Orozco, M.; Bolognesi, M.; Luque, F. J.; Estrin, D. A. Ligand-induced dynamical regulation of NO conversion in Mycobacterium tuberculosis truncated hemoglobin-N. Proteins: Struct., Funct., Bioinf. 2006, 64, 457−464. (16) Kang, Y.; Liu, Y. C.; Wang, Q.; Shen, J. W.; Wu, T.; Guan, W. J. On the spontaneous encapsulation of proteins in carbon nanotubes. Biomaterials 2009, 30, 2807−2815. (17) Zhang, D.; Gullingsrud, J.; McCammon, J. A. Potentials of mean force for acetylcholine unbinding from the alpha7 nicotinic acetylcholine receptor ligand-binding domain. J. Am. Chem. Soc. 2006, 128, 3019−3026. (18) Martí, M. A.; Lebrero, M. C. G.; Roitberg, A. E.; Estrin, D. A. Bond or cage effect: how nitrophorins transport and release nitric oxide. J. Am. Chem. Soc. 2008, 130, 1611−1618. 14213

dx.doi.org/10.1021/jp504942t | J. Phys. Chem. B 2014, 118, 14203−14214

The Journal of Physical Chemistry B

Article

(38) Forney, M.; Janosi, L.; Kosztin, I. Calculating free-energy profiles in biomolecular systems from fast nonequilibrium processes. Phys. Rev. E 2008, 78, 051913. (39) Fabritiis, G. D.; Coveney, P.; Villa‘-Freixa, J. Energetics of K+ permeability through Gramicidin A by forward-reverse steered molecular dynamics. Proteins 2008, 73, 185−194. (40) NategholEslam, M.; Holland, B. W.; Gray, C. G.; Tomberli, B. Drift-oscillatory steering with the forward-reverse method for calculating the potential of mean force. Phys. Rev. E 2011, 83, 021114. (41) Holland, B. W.; Gray, C. G.; Tomberli, B. Calculating diffusion and permeability coefficients with the oscillating forward-reverse method. Phys. Rev. E 2012, 86, 036707. (42) Jarzynski, C. Progress in Mathematical Physics; IEqualities and Inequalities: Irreversibility and the Second Law of Thermodynamics at the Nanoscale. In Time; Duplantier, B., Ed.; 2013; Vol. 63, pp 145− 172, (43) Cherkasov, A.; Hilpert, K.; Jenssen, H.; Fjell, C. D.; Waldbrook, M.; Mullaly, S. C.; Volkmer, R.; Hancock, R. E. W. Use of artificial intelligence in the design of small peptide antibiotics effective against a broad spectrum of highly antibiotic-resistant superbugs. ACS Chem. Biol. 2008, 4, 65−74. (44) Rapaport, D. C. The Art of Molecular Dynamics Simulation, 2nd ed.; Cambridge University Press: New York, 2004; p 86. (45) Taylor, J. R. An Introduction to Error Analysis: the Study of Uncertainties in Physical Measurements; University Science Books: Herndon, VA, 1997; Chapter 7. (46) Jorgensen, W. L.; Chandrasekhar, J.; Madura, J. D.; Impey, R. W.; Klein, M. L. Comparison of simple potential functions for simulating liquid water. J. Chem. Phys. 1983, 79, 926−935. (47) Foloppe, N.; MacKerell, A. D., Jr All-atom empirical force field for nucleic acids: I. Parameter optimization based on small molecule and condensed phase macromolecular target data. J. Comput. Chem. 2000, 21, 86−104. (48) Phillips, J. .; Braun, R.; Wang, W.; Gumbart, J.; Tajkhorshid, E.; Villa, E.; Chipot, C.; Skeel, R. D.; Kale, L.; Schulten, K. Scalable molecular dynamics with NAMD. J. Comput. Chem. 2005, 26, 1781− 1802. (49) Timko, J.; Bucher, D.; Kuyucak, S. Dissociation of NaCl in water from ab initio molecular dynamics simulations. J. Chem. Phys. 2010, 132, 114510. (50) Jo, S.; Kim, T.; Im, W. Automated builder and database of protein/membrane complexes for molecular dynamics simulations. PLoS One 2007, 2, e880. (51) Jo, S.; Lim, J. B.; Klauda, J. B.; Im, W. CHARMM-GUI membrane builder for mixed bilayers and its application to yeast membranes. Biophys. J. 2009, 97, 50−58. (52) Vanommeslaeghe, K.; Hatcher, E.; Acharya, C.; Kundu, S.; Zhong, S.; Shim, J.; Darian, E.; Guvench, O.; Lopes, P.; Vorobyov, I.; MacKerell, A. D., Jr CHARMM general force field: A force field for drug-like molecules compatible with the CHARMM all-atom additive biological force fields. J. Comput. Chem. 2010, 31, 671−690. (53) Nagle, J. F.; Tristram-Nagle, S. Structure of lipid bilayers. Biochim. Biophys. Acta, Biomembr. 2000, 1469, 159−195. (54) Bemporad, D.; Essex, J. W.; Luttmann, C. Permeation of small molecules through a lipid bilayer: a computer simulation study. J. Phys. Chem. B 2004, 108, 4875−4884. (55) Marrink, S. J.; Berendsen, H. J. C. Simulation of water transport through a lipid membrane. J. Phys. Chem. 1994, 98, 4155−4168. (56) Shinoda, W.; Mikami, M.; Baba, T.; Hato, M. Molecular dynamics study on the effects of chain branching on the physical properties of lipid bilayers: 2. Permeability. J. Phys. Chem. B 2004, 108, 9346−9356. (57) Sugii, T.; Takagi, S.; Matsumoto, Y. A molecular-dynamics study of lipid bilayers: effects of the hydrocarbon chain length on permeability. J. Chem. Phys. 2005, 123, 184714. (58) Jansen, M.; Blume, A. A comparative study of diffusive and osmotic water permeation across bilayers composed of phospholipids with different head groups and fatty acyl chains. Biophys. J. 1995, 68, 997−1008.

(59) Lawaczeck, R. On the permeability of water molecules across vesicular lipid bilayers. J. Membr. Biol. 1979, 51, 229−261. (60) Mishra, N. N.; Yang, S. J.; Sawa, A.; Rubio, A.; Nast, C. C.; Yeaman, M. R.; Bayer, A. S. Analysis of cell membrane characteristics of in vitro-selected daptomycin-resistant strains of methicillin-resistant Staphylococcus aureus. Antimicrob. Agents Chemother. 2009, 53, 2312− 2318. (61) Ames, G. F. Lipids of Salmonella typhimurium and Escherichia coli: structure and metabolism. J. Bacteriol. 1968, 95, 833−843. (62) Dodge, J. T.; Phillips, G. B. Composition of phospholipids and of phospholipid fatty acids and aldehydes in human red cells. J. Lipid Res. 1967, 8, 667−675. (63) Virtanen, J. A.; Cheng, K. H.; Somerharju, P. Phospholipid composition of the mammalian red cell membrane can be rationalized by a superlattice model. Proc. Natl. Acad. Sci. U.S.A. 1998, 95, 4964− 4969. (64) Humphrey, W.; Dalke, A.; Schulten, K. VMD - visual molecular dynamics. J. Mol. Graph. 1996, 14, 33−38. (65) Nichols, M.; Kuljanin, M.; Nategholeslam, M.; Hoang, T.; Vafaei, S.; Tomberli, B.; Gray, C. G.; DeBruin, L.; Jelokhani-Niaraki, M. Dynamic turn conformation of a short tryptophan-rich cationic antimicrobial peptide and its interaction with phospholipid membranes. J. Phys. Chem. B 2013, 117, 14697−14708. (66) Car, R.; Parrinello, M. Unified approach for molecular dynamics and density-functional theory. Phys. Rev. Lett. 1985, 55, 2471−2474. (67) Born, M.; Oppenheimer, R. Zur quantentheorie der molekeln. Ann. Phys. 1927, 389, 457−484. (68) Loken, C.; et al. SciNet: lessons learned from building a powerefficient top-20 system and data centre. J. Phys.: Conf. Ser. 2010, 012026.

14214

dx.doi.org/10.1021/jp504942t | J. Phys. Chem. B 2014, 118, 14203−14214