A Differential Fluctuation Theorem - ACS Publications - American

Mar 11, 2008 - theorem.7-9 The second paper by Crooks summarized very general nonequilibrium thermodynamic identities in a path- integral formalism.5 ...
1 downloads 0 Views 268KB Size
6168

J. Phys. Chem. B 2008, 112, 6168-6174

A Differential Fluctuation Theorem† Paul Maragakis,*,‡,|,⊥ Martin Spichty,*,§,| and Martin Karplus*,‡,§ Department of Chemistry & Chemical Biology, HarVard UniVersity, Cambridge, MA 02138, and Laboratoire de Chimie Biophysique, Institut de Science et d’Inge´ nierie Supramole´ culaires, UniVersite´ Louis Pasteur, 67083 Strasbourg Cedex, France ReceiVed: September 1, 2007; In Final Form: December 11, 2007

We derive a nonequilibrium thermodynamics identity (the “differential fluctuation theorem”) that connects forward and reverse joint probabilities of nonequilibrium work and of arbitrary generalized coordinates corresponding to states of interest. This identity allows us to estimate the free energy difference between domains of these states. Our results follow from a general symmetry relation between averages over nonequilibrium forward and backward path functions derived by Crooks [Crooks, G. E. Phys. ReV. E 2000, 61, 2361-2366]. We show how several existing nonequilibrium thermodynamic identities can be obtained directly from the differential fluctuation theorem. We devise an approach for measuring conformational free energy differences, and we demonstrate its applicability to the analysis of molecular dynamics simulations by estimating the free energy difference between two conformers of the alanine dipeptide model system. We anticipate that these developments can be applied to the analysis of laboratory experiments.

1. Introduction theorem,1,2

Jarzynski developed the nonequilibrium work which relates the free energy difference between two thermodynamic states to the work performed on a system during its nonequilibrium, finite-time transformation between these two states. Starting from the equilibrium ensemble of one state, the system evolves in time from A to B as a continuous parameter λ changes in an externally controlled manner (the switching protocol). Crooks generalized Jarzynski’s proof to Markovian discrete-time dynamics3 and developed two insightful extensions in subsequent papers.4,5 The first paper provided a fluctuation theorem4 that relates the free energy difference to the probability distributions of the nonequilibrium work measured during forward and conjugate reverse protocols. This paper, as well as another by Jarzynski,6 connected the above finite-time thermodynamic concepts to the entropy production fluctuation theorem.7-9 The second paper by Crooks summarized very general nonequilibrium thermodynamic identities in a pathintegral formalism.5 Arguably, a long cycle of theoretical developments has been completed, starting from the realization that the second law of thermodynamics is statistical and therefore can be violated in principle,10-12 and extending to the relationship of microscopic reversibility of the equations of motion to thermodynamics.13 Hummer and Szabo formulated a rigorous connection between some of these developments in nonequilibrium thermodynamics and single-molecule force spectroscopy experiments14,15 With a concise proof and generalization of the Jarzynski identity, they paved the way for experimental measurements of equilibrium †

Part of the “Attila Szabo Festschrift”. * Corresponding authors. (P.M.) Tel: (212) 478-0414; fax: (212) 8451414; e-mail: [email protected]. (M.S.) Tel: +33 390-24-5127; fax: +33 390-24-5126; e-mail: [email protected]. (M.K.) Tel: (617) 495-4018; fax: (617) 496-3204; e-mail: [email protected]. ‡ Harvard University. § Universite ´ Louis Pasteur. | Joint first authors. ⊥ Current address: D. E. Shaw Research, New York, New York 10036.

free energy differences by repeated pulling of molecules with an externally controlled apparatus, such as an atomic force microscope or optical tweezers.16 In a set of single-molecule experiments involving RNA unfolding and folding, Collin et al.17 verified the Crooks fluctuation theorem. They also obtained accurate estimates of the underlying free energy differences using the Bennett acceptance ratio method with the Crooks fluctuation theorem.4,5,18-21 Here we develop a generalization of the Crooks fluctuation theorem. This “differential” fluctuation theorem64 relates the free energy difference between two thermodynamic states to the joint probability distributions of the forward and reverse nonequilibrium work and of the generalized coordinates corresponding to the states of interest. Order parameters, whose values are associated with different conformations, are examples of such generalized coordinates. The differential fluctuation theorem allows us to use only the nonequilibrium work values that accompany transitions between subsets of the complete equilibrium ensemble to recover the relative partition functions of these subsets. In the first part of this paper, we derive the new identity, show how it connects with existing nonequilibrium work theories, and demonstrate how it can be used for obtaining conformational free energy differences. In the second part of the paper, we test the latter approach by applying it to computer simulations of the free energy differences between two conformers of the alanine dipeptide, a model system that has been used in many other studies.22-42 It is of interest because it contains the essential features of the protein backbone and yet is simple enough that the exact free energy surface can be obtained by alternative methods.26,29,31,39,40 2. Theory 2.1. Terminology and Definitions. The Hamiltonians HA and HB describe systems A and B, respectively. Without loss of generality, we consider the vector x that spans the microstates (phase space coordinates) of both Hamiltonians (it is always

10.1021/jp077037r CCC: $40.75 © 2008 American Chemical Society Published on Web 03/11/2008

A Differential Fluctuation Theorem

J. Phys. Chem. B, Vol. 112, No. 19, 2008 6169

possible to extend their phase spaces by dummy variables). The parametric Hamiltonian Hλ, with H0 ) HA and H1 ) HB, describes the transition between systems A and B by switching λ between λ ) 0 and λ ) 1. The symbol λ designates a timedependent parameter such that a forward time switching protocol changes the system described by Hλ from A to B in a time interval τ. The time reversed protocol, which runs over the same sequence of λ values as the forward time protocol, but in reverse order, performs the switch of Hλ from B to A in the same time interval τ. In what follows, we consider that the system Hλ is coupled to a heat bath with inverse temperature β ) 1/kBT (kB is the Boltzmann constant, T is the temperature), and assume that, at time t ) 0, the system is in the equilibrium ensemble of H0 (H1) before the forward (reverse) switching protocol. We follow the standard notation and terminology for probabilities in the Bayesian statistics literature43 and do not discriminate between probability density function and discrete probabilities when such distinctions are irrelevant or clear from the context. We simplify our notation and do not explicitly list all dependencies of these probabilities that are assumed constant in all our equations (e.g., the external conditions). 2.2. Derivation of a Differential Fluctuation Theorem. The central result of this paper is the identity

PF (W,a f b)e

-βW

) PR(-W,b* f a*)e

-β∆F

(1)

where PF(W,a f b) is the joint probability of performing work W with the nonequilibrium forward switch starting from value a of the function a(x) of microstate x(0) at the start of the switch, and taking value b of the function b(x) of microstate x(τ) at the end of the forward switch. The functions a(x), b(x) can be arbitrary generalized coordinates, such as vectors of coordinates, momenta, forces and energy, or characteristic functions that filter for specific criteria (e.g., a range of a dihedral angle, a maximal value of the radius of gyration, or the number of hydrogen bonds), or combinations of the above. The quantity PR(-W,b* f a*) on the right-hand side of eq 1 is the joint probability of performing work -W with the nonequilibrium reverse switch starting from value b* of the function b(x) of microstate x(0) at the start of the switch, and taking value a* of the function a(x) of microstate x(τ) at the end of the reverse switch. (The asterisk (*) denotes a reversal of the momentum components of a or b. With the exception of the end of section 2.3, we will omit this asterisk in the notation that follows since we will consider configurational degrees of freedom.) ∆F is the free energy difference FB - FA between states A and B, so that e-β∆F is the ratio of the partition functions ZB/ZA. To derive eq 1, we start with the identity (eq 15 in ref 5)

〈Fe

-βW+β∆F

〉F ) 〈Fˆ 〉R

F ) δ(W - W [x(0 f τ)])δ(a - a(x(0))) δ(b - b(x(τ))) (3) (where x(0 f τ) denotes a complete trajectory in phase space), and its time-reversed path function Fˆ,

Fˆ ) δ(W + W [x(0 f τ)])δ(b - b(x(0)))δ(a - a(x(τ)))

(4)

into eq 2, (where now x(0 f τ) denotes the complete timereversed trajectory, as per Crooks,5 but we keep the same notation for simplicity), and use the properties of the Dirac delta function to obtain 〈δ(W - W [x(0 f τ)])δ(a - a(x(0)))δ(b - b(x(τ)))〉F e-βW+β∆F ) 〈δ(W + W [x(0 f τ)])δ(b - b(x(0)))δ(a - a(x(τ)))〉R (5)

The ensemble averages over the products of the delta functions correspond to joint probabilities P, that is,

〈δ(W - W [x(0 f τ)])δ(a - a(x(0)))δ(b - b(x(τ)))〉F ) PF(W, a(x(0)) ) a,b(x(τ)) ) b) ) PF (W,a f b) and

〈δ(W + W [x(0 f τ)])δ(b - b(x(0)))δ(a - a(x(τ)))〉R ) PR(-W,b(x(0)) ) b,a(x(τ)) ) a) ) PR(-W,b f a) so that eq 1 follows directly. We can rewrite eq 1 in terms of conditional probabilities by expansion of the joint probabilities to obtain

PF (W|a f b)PF(a f b)e-βW ) PR(-W|b f a)PR(b f a)e-β∆F (6) where PF(W|a f b) is the probability to perform work W given that the forward pathway starts at value a of a(x) from an equilibrium distribution in state A and ends in value b of b(x) in state B; PF(a f b) is the probability to observe the latter pathway in an ensemble of forward pathways that start from the equilibrium of state A. Corresponding definitions apply to the reverse pathway starting from value b of b(x) from an equilibrium distribution in state B and ending at value a of a(x) in state A. We can further expand eq 6 by noting that PF(a f b) is also a joint probability to start with a sample from the equilibrium distribution of A that satisfies a(x(0)) ) a, and to end the nonequilibrium trajectory with b(x(τ)) ) b (similarly for the second probability in the right-hand side of eq 6). We obtain

PF(W|a f b)PF(b|a)PA(a)e-βW )

(2)

PR(-W|b f a)PR(a|b)PB(b)e-β∆F (7)

The left-hand side of eq 2 denotes the average of the path function F over an ensemble of trajectories that start from a thermal equilibrium distribution associated with HA and follow a forward switching protocol; the path function F is an arbitrary functional of the trajectory. The backward path function Fˆ is the same functional of the trajectory with time-reversal symmetry applied. The right-hand side denotes the average of the backward path function Fˆ over an ensemble of trajectories that start from a thermal equilibrium distribution associated with HB and follow a reverse switching protocol. The path function W returns the work along a trajectory, and ∆F is the equilibrium free energy difference between states A and B. To prove eq 1 we insert the functional

where PF(b|a) is the probability to end in b given that the forward pathway started from a in the equilibrium of state A, and PA(a) is the probability to measure a ) a(x) in the equilibrium of state A. The right-hand side of eq 7 contains the corresponding reverse path quantities that start from equilibrium in state B and end in state A. Figure 1 shows a schematic diagram of a set of forward and reverse nonequilibrium simulations described by eq 7. The complete equilibrium distribution of A (top, left, blue circle) transforms, via a forward switching protocol, to a protocoldependent nonequilibrium distribution of state B (top, right, outer red curve). Similarly, the complete equilibrium distribution of B (bottom, right, red circle) transforms, via the conjugate

6170 J. Phys. Chem. B, Vol. 112, No. 19, 2008

Figure 1. A schematic diagram of nonequilibrium processes in the forward (top) and the reverse (bottom) direction connecting the states A and B.

reverse switching protocol, to a protocol-dependent nonequilibrium distribution of state A (bottom, left, outer blue curve). The nonequilibrium work values collected during these two switching processes obey the conventional Crooks fluctuation theorem (see eq 8 below). The domain a of the equilibrium distribution of A (top, left, blue dot) transforms, via the same forward switching protocol, to a protocol-dependent distribution of state B. We sketch a forward pathway ensemble and the shape of a protocoldependent equiprobability surface of the final state B (top, right, inner red curve), which encompasses the region b(x(τ)) ) b at the end of the forward switch. Similarly, for the reverse switching process on the right-hand side of eq 7, we sketch the corresponding initial state (bottom, right, red dot), reverse path ensemble, and final state equiprobability surface (bottom, left, blue inner curve) that encompasses the domain a(x(τ)) ) a. Equation 7 provides the proper reweighting to the Crooks fluctuation theorem (see eq 8 below) when the work data is restricted to measurements between two subsets a and b of the full states A and B with equilibrium probabilities PA(a) and PB(b), respectively. In Figure 1, the conditional probability PF(b|a) is the ratio of the area of dot b to the area enclosed in the top, right, red inner curve. All the conditional probabilities in eq 7 are path-averaged quantities that depend on the switching protocol. The probabilities PA(a) and PB(b), however, are path-independent equilibrium quantities, and thus can be measured with different experiments or simulations. Suppose we want to calculate the free energy difference for an alchemical mutation (e.g., mutation of an amino acid side-chain in a protein). The two end states, A and B, may have high barriers in their free energy landscapes (for example, if amino acids such as threonine or valine are involved20,29), such that the sampling of the entire equilibrium ensembles of A and B through molecular dynamics would be very time-consuming. Applying the joint probability fluctuation theorem, one could sample only the equilibrium ensemble for a small domain a of A and b of B, and subsequently perform nonequilibrium switches between those domains in order to obtain the probabilities PF(W|a f b) and PR(-W|b f a). The quantity PF(b|a) (similarly for PR(a|b)) can be calculated from nonequilibrium forward trajectories that start from an ensemble of state A restricted to a, such that a(x(0)) ) a; the conditional probability is simply the fraction of these trajectories that take the value b for the function b(x) in state B, so that b(x(τ)) ) b. The equilibrium probabilities PA(a) and PB(b) could be obtained, for example, from potential of mean force (PMF) calculations,

Maragakis et al. e.g., via thermodynamic integration,44 adaptive umbrella sampling,25,29 metadynamics,45 adiabatic dynamics,39 or from other enhanced sampling methods such as parallel tempering.46,47 By combining the probabilities from these two separate experiments, we are able to determine the free energy difference between A and B. 2.3. Connection to Existing Thermodynamic Identities. In this section, we discuss the relation between our main result, eq 1, to existing thermodynamic theories, and show how it can be used to derive a variety of known results in a simple way. Equation 1 is a specialization of eq 2, (Crooks’s more general path-function identity5), which retains information about observables of the end states, and which is written explicitly in terms of probabilities. As we show below, eq 1 is a generalization of the Crooks fluctuation theorem4 (see eq 8 below) for joint work probability distributions that subsumes existing nonequilibrium theories, including the Jarzynski identity, the Kawasaki nonlinear response relation,5,48,49 and the formalism of Hummer and Szabo,14 as well as other recently developed identities that use nonequilibrium forward and reverse switching processes to extract conformational free energy differences.50,51 Integrating both sides of eq 1 over all values of a and b, we obtain the Crooks fluctuation theorem,4

PF(W)e-βW ) PR(-W)e-β∆F

(8)

with PF(W) [PR(-W)] being the nonequilibrium work distribution resulting from the forward (reverse) switch of the complete thermal equilibrium distribution of ensemble A (B). From the Crooks fluctuation theorem, the Jarzynski identity (eq 2a of ref 1) follows by integration over all work values (as is shown in ref 5):

〈e-βW〉F ) e-β∆F

(9)

with the angle brackets indicating path ensemble averages. Integrating eq 1 over W and over a, we obtain a relation between the equilibrium distribution of the observable b in the final state B, based on the proper reweighting of the nonequilibrium forward simulations:

PB(b)e-∆F )

∫-∞∞ dW PF(W,b(τ))e-βW

with PF(W,b(τ)) being the joint probability to perform work W and measure b at the end of a forward trajectory. Using a shorthand notation for the integral on the right-hand side, we can write this equation as (see eq 21 in ref 5)

PB(b) ) 〈PF(b(τ))e-βW〉Fe+β∆F

(10)

Hummer and Szabo derived the above equation starting from the Feynman-Kac theorem for path integrals.14 This equation allowed them to obtain estimators of the PMF as a function of the molecule extension from repeated single-molecule pulling experiments. They considered the Hamiltonian HB ) HA + k(z - Vt)2/2, with k being the spring constant of the harmonic trap that constrains the molecule, V being the pulling speed, and z being the extension of the molecule. They obtained the PMF by a generalization of the weighted histogram analysis method.52 Integrating eq 1 over a, we obtain eq 6 of Paramore et al.:51

PF(W,b(τ))e-βW ) PR(-W,b(0))e-β∆F

(11)

This equation is a recent extension of the Crooks fluctuation theorem that includes reaction coordinates in addition to the

A Differential Fluctuation Theorem

J. Phys. Chem. B, Vol. 112, No. 19, 2008 6171

work; it was used by Paramore et al. to generalize the results of Hummer and Szabo and to reconstruct the PMF along a reaction coordinate. Integrating eq 1 over W and over b, we obtain the Kawasaki nonlinear response relation (eq 23 of ref 5):

〈PR(a(τ))〉R ) 〈PA(a)e-βW〉Fe+β∆F

(12)

with PR(a(τ)) being the probability distribution of a at the end of a reverse switch. If a and b are microstates, PA(a) and PB(b) are given by the Boltzmann distribution. We rewrite eq 1 as

PF(W,b|a)e-βEA(a)e-βW ) PR(-W,a*|b*)e-βEB(b) (13) with PF(W,b|a) being the conditional probability to measure work W in a forward path ending on microstate b given that the path started from microstate a [similarly for PR(-W,a*|b*)]. Using Jarzynski’s definitions of generated entropy ∆S and of the heat absorbed by the system in contact with a single heat bath,6 we have T∆S ) W - EB(b) + EA(a), so that eq 13 becomes

PF(W,b|a) ) PR(-W,a*|b*)e∆S/kB

(14)

Since the generated entropy is a linear function of the work, and since the temperature T and energy change EB(b) - EA(a) are constant, we get

PF(+∆S,b|a) PR(-∆S,a*|b* )

) e(∆S/k)B

(15)

Jarzynski derived a more generally applicable form of this “detailed fluctuation theorem” by eliminating the bath degrees of freedom from the Hamiltonian dynamics of a system in contact with multiple baths.6 2.4. Conformational Free Energy Differences. An interesting limit of eq 7 is obtained when the systems A and B are the same. In this case, the free energy difference is zero, so that eq 7 reduces to

PF(W|a f b)PF(b|a)P(a)e-βW ) PR(-W|b f a)PR(a|b)P(b) (16) where P(a) and P(b) are the probabilities to measure a(x) ) a and b(x) ) b, respectively. The conformational free energy difference ∆Fafb is given by -kT ln P(b)/P(a) so that

PF(W|a f b)PF(b|a)e-βW ) PR(-W|b f a)PR(a|b)e-β∆Fafb (17) Any nonequilibrium switching Hamiltonian Hλ that uses the same Hamiltonian for the end states of the switching process (λ ) 0,1) can be used with this simplified form of eq 7; i.e., the Hamiltonian may change during the switching process (0 < λ < 1). If a and b are parameters that define two conformations, the resulting equation gives the conformational free energy differences between the two conformations a and b, ∆Fafb, as a function of nonequilibrium measurements. If no work is performed (nor gained) by the system, i.e., the Hamiltonian is not changed in the course of the nonequilibrium switching processes, we have e-βW ) 1 and PF(W|a f b) ) PR(-W|b f a) (the latter are Dirac delta functions: δ(W)). Equation 17 then simplifies to

PF(b|a) ) PR(a|b)e-β∆Fafb

(18)

Adib derived this equation from the concept of microscopic reversibility and applied it to extract conformational free energy differences from “clamp-and-release” simulations.50 During such in-silico experiments, the system is first fixed either at a(x(0)) ) a or at b(x(0)) ) b, and the remaining degrees of freedom are equilibrated (“clamp” stage); the constraint is then removed, and the coordinates are recorded after time t ) τ (“release” stage); the fraction of trajectories that start in a (b) and end in b (a) yields the conditional probabilities in eq 18. 3. Application to Conformational Change: The Alanine Dipeptide We use eq 17 to obtain the free energy difference between two conformers of capped amino acid alanine, N-acetyl-Lalanine-N′-methylamide (aka alanine dipeptide), a system that is simple enough so that the free energy can be obtained by other means.26,29,31,39,40 3.1. Simulation Methodology. We carried out all calculations with the program CHARMM,53 version c30b2. We used the polar hydrogen topology and the parameter set PARAM19,54,55 with the ACE implicit solvation model56 with smoothing parameter 1.6 and dielectric constant 78.5. In all molecular dynamics (MD) simulations the temperature was controlled by Langevin dynamics with a friction coefficient of 4 ps-1. The time step for the leapfrog integrator was 1 fs. 3.1.1. AdaptiVe Umbrella Sampling. We obtained the free energy map for the (φ,ψ)-space from 100 independent MD simulations at 300 K, each 100 ns in length, using a twodimensional adaptive umbrella potential29 for the dihedral angles φ and ψ; the umbrella potential was constructed from six trigonometric functions in each dimension. Each dihedral angle was divided into 18 bins yielding a total of 324 bins. The PMF value and error are the mean and the standard error, respectively, of ∆Fi ) -kBT ln(Pi(k)/Pi(C7eq)), with i denoting the index of the simulation (from 1 to 100), Pi(k) and Pi(C7eq) being the probability of bin k and C7eq (see below), respectively, as sampled in the ith adaptive umbrella simulation, and ∆Fi being the estimate of the PMF from simulation i. The standard error is estimated as

σ)

x(

1

N

∑( ∆Fi - 〈∆F〉) N i)1

)

2

(19)

with 〈∆F〉 being the mean free energy difference. 3.1.2. Definition of Characteristic Functions. We define the conformational subspaces C7eq and C7ax with the (characteristic) function

a(x) ) δφa × δψa

(20)

1 1 1, if φa - w e φ(x) e φa + w δ φa ) 2 2 0, otherwise

(21)

where

{

and

δ ψa =

{

1 1 1, if ψa - w e ψ(x) e ψa + w 2 2 0, otherwise

(22)

6172 J. Phys. Chem. B, Vol. 112, No. 19, 2008

Maragakis et al.

The functions φ(x) and ψ(x) give the dihedral angles φ and ψ of microstate x in degrees (°). With the reference valuesφa ) - 90°, ψa ) 150°, and the width w ) 20°, the function a(x) takes the value 1 for (φ,ψ) ∈ C7eq and 0 otherwise. Correspondingly, the function b(x) has the reference values φb ) 70°, ψb ) -70°, so that b(x) ) 1 describes the conformational subspace C7ax. The assignments C7eq and C7ax are somewhat arbitrary, but they are adequate for illustrative purposes. 3.1.3. Sampling of Equilibrium Ensembles of C7eq and C7ax. We sampled the canonical ensembles for conformational subspaces C7eq and C7ax by two independent MD simulations, each for 11 ns at 300 K. Sampling was restricted to the (φ,ψ)subspaces by adding the following restraining potential to the Hamiltonian:

Vrest,ref ) (1 - δφ,ref)Vφ,ref + (1 - δψ,ref)Vψ,ref

(23)

1 2 Vφ,ref ) fref |φ(x) - φref| - w 2

(24)

1 2 Vψ,ref ) fref |ψ(x) - ψref| - w 2

(25)

with

(

)

and

(

)

with δφ,ref and δφ,ref being the functions defined in eqs 21 and 22. The force constant fref was set to 0.3 kcal mol-1 degree-2, and the reference values φref and ψref correspond to those of a(x) and b(x), given in the previous section. This restraining potential ensures that the system is confined to the conformational subspace of interest (a ) 1 or b ) 1) and that the restraining potential vanishes if the system is inside this subspace. For the last 10 ns of the MD-trajectory, snapshots were saved every 1 ps; i.e., 1 ns served as equilibration. Structures with Vrest,ref * 0 were dropped, which yielded 8062 equilibrium structures for a(x(0)) ) 1 and 8297 structures for b(x(0)) ) 1. 3.1.4. Nonequilibrium Switches between C7eq and C7ax. We performed nonequilibrium switches with the following Hamiltonian: Hλ )

{

Hsys, if λ ) 0 or λ ) 1 Hsys + (1 - λ)(Vφ,a + Vψ,a) + λ(Vφ,b + Vψ,b), otherwise (26)

with Hsys being the Hamiltonian of the system (CHARMM potential), Vφ,ref and Vψ,ref, with ref ) a or ref ) b, being the potentials given by eqs 25 and 24 with w ) 0, and λ is the usual coupling parameter running from 0 to 1 for the forward switch (C7eq f C7ax) and from 1 to 0 for the reverse switch (C7ax f C7eq). The Hamiltonian in eq 26 is a discontinuous function of λ at λ ) 0,1. Our symmetric switching protocol consists of a sequence of Nλ stepwise changes of λ by ∆λ, each followed by 100 steps of MD simulation, until λ reaches its final value. The switching protocol is designed to move the system between C7eq and C7ax, and it can be modulated by three parameters: ∆λ (the speed of the switch), fa, and fb (the two force constants). (While it is possible to come up with efficient switching protocols for specific problemssa reasonable choice would be to drag the system along its optimal reaction coordinateswe used here a simple, transferable protocol for the purpose of demonstration.) The work is given by Nλ

W)

E(x(i), λ(i + 1)) - E(x(i),λ(i)) ∑ i)1

(27)

Figure 2. Simulated free energy surface of alanine dipeptide (schematic representation shown on top) as a function of dihedral angles φ and ψ.

with E(x(i), λ(i)), E(x(i), λ(i + 1)) being the potential energy of the system before and after the ith change of λ, respectively. These energies are obtained by calculating Hλ(i) and Hλ(i+1) (with λ(i + 1) ) λ(i) + ∆λ) for x(i) (the coordinates of the microstate at the time when the ith change is occurring; no special treatment of the end states is needed). 3.2. Simulation Results. Figure 2 shows a coarse-grained map of the free energy map for the relevant internal degrees of freedom of this system, i.e., the free energy as a function of the dihedral angles φ and ψ. The map was obtained by use of adaptive umbrella sampling.29 The conformational subspace (φ ) -90 ( 10°, ψ ) 150 ( 10°) named C7eq forms the global minimum on this coarse-grained free energy map. The minimum with the largest displacement from the global minimum in terms of (φ,ψ)-coordinates is the subspace C7ax (φ ) 70 ( 10°, ψ ) -70 ( 10°); it is 3.83 ( 0.08 kcal mol-1 higher in free energy, according to the result of 100 adaptive umbrella simulations, each 100 ns in length. We determined the free energy difference between C7eq and C7ax using work data from nonequilibrium forward and reverse switches between their respective conformational subspaces. We define a characteristic function a(x) [b(x)] that takes the value 1 if the system is located in the conformational subspace C7eq [C7ax] and is 0 otherwise. We rewrite eq 20 in a form reminiscent of the Crooks fluctuation theorem:

PF(W|a f b)e-βW ) PR(-W|b f a)e-β∆F′

(28)

with ∆F′ ) ∆Fafb - kT ln PR(a|b) + kT ln PF(b|a). ∆Fafb ) -kT ln P(b)/P(a) is the free energy difference between C7eq and C7ax. Taking only the work values of the forward switches that ended up in C7ax [b(x(τ)) ) 1] (given that they started from C7eq [a(x(0)) ) 1]) and the work values of the reverse switches that ended up in C7eq [a(x(τ)) ) 1] (given that they started from C7ax [b(x(0)) ) 1]), we obtain PF(W|a f b) and PR(-W|b f a). Equation 28 with the given set of nonequilibrium work data is solved by Bennett’s acceptance ratio method.18 It yields the

A Differential Fluctuation Theorem

J. Phys. Chem. B, Vol. 112, No. 19, 2008 6173

TABLE 1: Free Energy Difference between Conformational Subspaces C7eq (a(x) )1) and C7ax (b(x) )1) of Alanine Dipeptide, ∆Fafb, for Various Switching Protocolsa prot. fab fbb I II III IV

∆F′c,d

∆λ

PF(b|a)e

12 12 0.02 3.96(0.11) 0.893(0.003) 12 12 0.002 3.87(0.01) 0.936(0.003) 3 12 0.002 4.19(0.03) 0.939(0.003) 12 3 0.002 3.60(0.02) 0.591(0.005)

PR(a|b)e

∆Fafbc,g

0.828(0.004) 0.920(0.003) 0.530(0.005) 0.921(0.003)

3.92(0.11) 3.86(0.01) 3.85(0.03) 3.87(0.02)

a Standard errors are given in parentheses. b In cal mol-1 degree-2. In kcal mol-1. d Standard error of the maximum likelihood estimate.19 e Standard error of the binomial distribution, i.e., σ ) xp(1-p)/N, with p being the corresponding probability and N being the total number of switching attempts (N ) 8062 for forward and N ) 8297 for reverse c

switching). g σ ) β-1 (σp/p).



2 ∆F′

+σβ-1ln(PF(b|a))2+σβ-1ln(PR(a|b))2 with σβ-1ln(p) )

maximum likelihood estimate of ∆F′ from which kT ln PF(b|a)/PR(a|b) has to then be substracted to obtain ∆Fafb. To determine the robustness of the method we used several switching protocols with different values for the parameters fa, fb, and ∆λ; Table 1 summarizes the results and lists the fraction of the originally launched trajectories that were kept (PF(b|a) and PR(a|b) for the forward and reverse switching protocols, respectively.) It should be emphasized that the aim of this example is not to find an optimal switching protocol but to demonstrate the applicability of eq 17. The calculated free energy difference in our application of the differential fluctuation theorem is in agreement with the reference value obtained from adaptive umbrella sampling (3.83 ( 0.08 kcal mol-1). For the switching Hamiltonians with fa ) fb (protocols I and II) the probabilities PF(b|a) and PR(a|b) are very similar so that ∆F′ directly reflects ∆Fafb. Performing the switches more slowly (being less impatient15) improves the accuracy of the ∆F′ estimate: The decrease of ∆λ by a factor of 10 decreases the error by a factor of 7. Since the contribution to the error from the probabilities PF(b|a) and PR(a|b) is much smaller than the contribution from ∆F′, the error for ∆Fafb decreases by essentially the same amount. The standard error σ for ∆Fafb using ∆λ ) 0.002 (protocol II) is 5.33 times smaller than the one obtained from adaptive umbrella sampling (σ ) 0.08 kcal mol-1; see section 3.1.1). The gain in efficiency of determining ∆Fafb relative to adaptive umbrella sampling can be estimated from the amount of CPU time needed to obtain the same standard error; it is 3.4-fold for protocol II. We note, however, that the adaptive umbrella sampling estimates the free energies in a grid of 18 × 18 ) 324 squares, so that more information is obtained (i.e., the whole free energy surface). Protocols III and IV with fa * fb clearly demonstrate the influence of the “arrival” probabilities PF(b|a) and PR(a|b), i.e., the work probability distributions for the pathways connecting domains a and b have to be corrected by the probabilities of making the transition to these domains. These probabilities differ considerably for protocols III and IV and the correction term kT ln PF(b|a)/PR(a|b) becomes important. For protocol III, ∆F′ is greater than ∆Fafb, while for protocol IV, the opposite is true. After applying the correction term, kT ln PF(b|a)/PR(a|b), to ∆F′, the same results (within the errors) are obtained for ∆Fafb in both cases. The importance of the correction term depends on the switching protocol, but it cannot always be neglected. 4. Conclusions We have derived a new nonequilibrium thermodynamics identity, which we refer to as the “differential fluctuation theorem”. It connects forward and reverse joint probability

distributions of work and generalized coordinates in systems that switch between subsets of the complete ensembles of their thermodynamic states of interest. The identity is a special case of a path-integral thermodynamic identity (eq 15 from ref 5, which was written in terms of functionals of the trajectories.) Its utility is that it is expressed in terms of practical, measurable quantities: work distributions, arrival probabilities, and potentials of mean force. The identity is also a generalization of the Crooks fluctuation theorem;4 it connects forward and reverse joint probability distributions of the work and of the generalized coordinates. It subsumes many other known and practical nonequilibrium thermodynamic identities, including the Jarzynski identity,2 the Kawasaki identity,5,48,49 and the PMF reconstruction scheme of Hummer and Szabo.14 We demonstrated how our results can aid in developing new approaches to measure/simulate free energy differences, by applying it to a problem of conformational change (one of the most difficult problems in computational biochemistry simulations). We derived a specialized equation from the differential fluctuation theorem to connect the relative probabilities of two conformers with measurements of the work during nonequilibrium switches and with the success rate of such nonequilibrium switches. As a specific application of the latter identity, we used it to calculate the conformational free energy difference between two conformers (C7eq and C7ax) of the alanine dipeptide with an implicit solvent model; there is no inherent difficulty in using it with explicit solvent (unlike methods that determine optimal reaction coordinates, such as the string method38), but it would, of course, be more costly in computer time. While performance, per se, is not the concern of the present work, we note that one can influence the performance of the method by the proper choice of the nonequilibrium switching protocol (Table 1). Since this protocol can be optimized at a small cost compared to the total cost of sampling (related ideas have appeared in the context of transition path sampling57-60), one can create practical conformational sampling methods starting from the differential fluctuation theorem. We also expect that the combination of the present work with our previous analysis of multiprotocol, multistate, nonequilibrium switches20,21 will allow for analysis of the underlying conformational free energies in complicated systems with many metastable states. We suggest that one could take advantage of the present developments by extending singlemolecule force spectroscopy17,21,61 with multidimensional experiments.62,63 Acknowledgment. We thank Sergei Krivov for helpful discussions. Some of the computations were done on the Crimson Grid cluster in Harvard. The research at Harvard was supported in part by a grant from the National Institutes of Health. M.S. was supported in part by the CHARMM Development Project, and P.M. acknowledges support by the Marie Curie European Fellowship, Grant Number MEIF-CT-2003501953. Note Added after ASAP Publication. This article was published ASAP on March 11, 2008. Some text at the end of the fifth paragraph of Section 3.2 has been removed to correct a production error. The correct version was published on March 13, 2008. References and Notes (1) Jarzynski, C. Phys. ReV. Lett. 1997, 78, 2690-2693. (2) Jarzynski, C. Phys. ReV. E 1997, 56, 5018-5035. (3) Crooks, G. E. J. Stat. Phys. 1998, 90, 1481-1487.

6174 J. Phys. Chem. B, Vol. 112, No. 19, 2008 (4) Crooks, G. E. Phys. ReV. E 1999, 60, 2721-2726. (5) Crooks, G. E. Phys. ReV. E 2000, 61, 2361-2366. (6) Jarzynski, C. J. Stat. Phys. 2000, 98, 77-102. (7) Evans, D.; Cohen, E.; Morriss, G. Phys. ReV. Lett. 1993, 71, 24012404. (8) Evans, D. J.; Searles, D. J. Phys. ReV. E 1994, 50, 1645-1648. (9) Galavotti, G. J. Stat. Phys. 1995, 78, 1571-1589. (10) Leff, H.; Rex, A. Maxwell’s Demon: Entropy, Information, Computing; Princeton University Press: Princeton, NJ, 1990. (11) Loschmidt, J. Sitzungsber. Kais. Akad. Wiss. Wien, Math.Naturwiss. Kl. 1876, 73, 128-142. (12) Jarzynski, C. J. Stat. Phys. 1999, 96, 415-427. (13) Prigogine, I. Science 1978, 201, 777-785. (14) Hummer, G.; Szabo, A. Proc. Natl. Acad. Sci. U.S.A. 2001, 98, 3658-3661. (15) Hummer, G.; Szabo, A. Acc. Chem. Res. 2005, 38, 504-513. (16) Liphardt, J.; Dumont, S.; Smith, S. B.; Tinoco, I.; Bustamante, C. Science 2002, 296, 1832-1835. (17) Collin, D.; Ritort, F.; Jarzynski, C.; Smith, S. B.; Tinoco, I.; Bustamante, C. Nature 2005, 437, 231-234. (18) Bennett, C. H. J. Comput. Phys. 1976, 22, 245-268. (19) Shirts, M. R.; Bair, E.; Hooker, G.; Pande, V. S. Phys. ReV. Lett. 2003, 91, 140601. (20) Maragakis, P.; Spichty, M.; Karplus, M. Phys. ReV. Lett. 2006, 96, 100602. (21) Maragakis, P.; Ritort, F.; Bustamante, C.; Karplus, M.; Crooks, G. E. ArXiV e-prints 2007, 707, http://arxiv.org/pdf/0707.0089. (22) Rossky, P. J.; Karplus, M. J. Am. Chem. Soc. 1979, 101, 19131937. (23) Brady, J.; Karplus, M. J. Am. Chem. Soc. 1985, 107, 6103-6105. (24) Pettitt, B. M.; Karplus, M. Chem. Phys. Lett. 1985, 121, 194-201. (25) Mezei, M. J. Comput. Phys. 1987, 68, 237-248. (26) Anderson, A. G.; Hermans, J. Proteins 1988, 3, 262-265. (27) Tobias, D.; Brooks, C., III. J. Phys. Chem. 1992, 96, 3864-3870. (28) Olender, R.; Elber, R. J. Chem. Phys. 1996, 105, 9299-9315. (29) Bartels, C.; Karplus, M. J. Comput. Chem. 1997, 18, 1450-1462. (30) MacKerell, A. D., Jr.; Bashford, D.; Bellott, R. L.; Dunbrack, R. L., Jr.; Evanseck, J. D.; Field, M. J.; Fischer, S.; Gao, J.; Guo, H.; Ha, S.; Joseph-McCarthy, D.; Kuchnir, L.; Kuczera, K.; Lau, F. T. K.; Mattos, C.; Michnick, S.; Ngo, T.; Nguyen, D. T.; Prodhom, B.; Reiher, W. E., III; Roux, B.; Schlenkrich, M.; Smith, J. C.; Stote, R.; Straub, J.; Watanabe, M.; Wiorkiewicz-Kuczera, J.; Yin, D.; Karplus, M. J. Phys. Chem. B 1998, 102, 3586-3616. (31) Apostolakis, J.; Ferrara, P.; Caflisch, A. J. Chem. Phys. 1999, 110, 2099-2108. (32) Smith, P. E. J. Chem. Phys. 1999, 111, 5568-5579. (33) Wang, J. M.; Cieplak, P.; Kollman, P. A. J. Comput. Chem. 2000, 21, 1049-1074. (34) Kaminski, G. A.; Friesner, R. A.; Tirado-Rives, J.; Jorgensen, W. L. J. Phys. Chem. B 2001, 105, 6474-6487. (35) Ren, P. Y.; Ponder, J. W. J. Comput. Chem. 2002, 23, 1497-1506. (36) Hu, H.; Elstner, M.; Hermans, J. Proteins 2003, 50, 451-463.

Maragakis et al. (37) MacKerell, A. D.; Feig, M.; Brooks, C. L. J. Comput. Chem. 2004, 25, 1400-1415. (38) Ren, W.; Vanden-Eijnden, E.; Maragakis, P.; Weinan, E. J. Chem. Phys. 2005, 123, 134109. (39) Rosso, L.; Abrams, J. B.; Tuckerman, M. E. J. Phys. Chem. B 2005, 109, 4162-4167. (40) Ensing, B.; De Vivo, M.; Liu, Z.; Moore, P.; Klein, M. L. Acc. Chem. Res. 2006, 39, 73-81. (41) van der Vaart, A.; Karplus, M. J. Chem. Phys. 2007, 126, 164106. (42) Branduardi, D.; Gervasio, F. L.; Parrinello, M. J. Chem. Phys. 2007, 126, 054103. (43) Jaynes, E. T. Probability Theory: The Logic of Science; Cambridge University Press: New York, 2003. (44) Kirkwood, J. G. J. Chem. Phys. 1935, 3, 300-313. (45) Laio, A.; Parrinello, M. Proc. Natl. Acad. Sci. U.S.A. 2002, 99, 12562-12566. (46) Earl, D. J.; Deem, M. W. Phys. Chem. Chem. Phys. 2005, 7, 39103916. (47) Geyer, C. J. Markov chain Monte Carlo maximum likelihood. In Computing Science and Statistics: Proceedings of the 23rd Symposium on the Interface; Keramidas, E. M., Ed.; Interface Foundation: Fairfax Station, 1991. (48) Yamada, T.; Kawasaki, K. Prog. Theor. Phys. 1967, 38, 10311051. (49) Morriss, G.; Evans, D. Phys. ReV. A 1988, 37, 3605-3608. (50) Adib, A. B. J. Chem. Phys. 2006, 124, 144111. (51) Paramore, S.; Ayton, G. S.; Voth, G. A. J. Chem. Phys. 2007, 126, 051102. (52) Ferrenberg, A.; Swendsen, R. Phys. ReV. Lett. 1989, 63, 11951198. (53) Brooks, B. R.; Bruccoleri, R. E.; Olafson, B. D.; States, D. J.; Swaminathan, S.; Karplus, M. J. Comput. Chem. 1983, 4, 187-217. (54) Reiher, W., III. Theoretical studies of hydrogen bonding. Ph.D. Thesis, Department of Chemistry, Harvard University, 1985. (55) Neria, E.; Fischer, S.; Karplus, M. J. Chem. Phys. 1996, 105, 19021921. (56) Schaefer, M.; Karplus, M. J. Phys. Chem. 1996, 100, 1578-1599. (57) Dellago, C.; Bolhuis, P. G.; Geissler, P. L. AdV. Chem. Phys. 2002, 123, 1-78. (58) Sun, S. X. J. Chem. Phys. 2003, 118, 5769-5775. (59) Ytreberg, F. M.; Zuckerman, D. M. J. Chem. Phys. 2004, 120, 10876-10879. (60) Atilgan, E.; Sun, S. X. J. Chem. Phys. 2004, 121, 10392-10400. (61) Ritort, F. J. Phys.: Condens. Matter 2006, 18, R531-R583. (62) Ernst, R.; G., B.; A, W. Principles of Nuclear Magnetic Resonances in One and Two Dimensions; Clarendon: Oxford, 1987. (63) Hochstrasser, R. M. Proc. Natl. Acad. Sci. U.S.A. 2007, 104, 14190-14196. (64) Although, in principle, the term “identity” would be appropriate for both our and the Crooks fluctuation theorem, we use the term “theorem” in anticipation of the close connection between these two relations.