Effective Riemannian Diffusion Model for Conformational Dynamics of

Nov 18, 2016 - The presented formalism can be readily employed to modify the collective variable based enhanced sampling techniques, such as umbrella ...
1 downloads 9 Views 1MB Size
Subscriber access provided by UB + Fachbibliothek Chemie | (FU-Bibliothekssystem)

Letter

Effective Riemannian Diffusion Model for Conformational Dynamics of Biomolecular Systems Ashkan Fakharzadeh, and Mahmoud Moradi J. Phys. Chem. Lett., Just Accepted Manuscript • DOI: 10.1021/acs.jpclett.6b02208 • Publication Date (Web): 18 Nov 2016 Downloaded from http://pubs.acs.org on November 22, 2016

Just Accepted “Just Accepted” manuscripts have been peer-reviewed and accepted for publication. They are posted online prior to technical editing, formatting for publication and author proofing. The American Chemical Society provides “Just Accepted” as a free service to the research community to expedite the dissemination of scientific material as soon as possible after acceptance. “Just Accepted” manuscripts appear in full in PDF format accompanied by an HTML abstract. “Just Accepted” manuscripts have been fully peer reviewed, but should not be considered the official version of record. They are accessible to all readers and citable by the Digital Object Identifier (DOI®). “Just Accepted” is an optional service offered to authors. Therefore, the “Just Accepted” Web site may not include all articles that will be published in the journal. After a manuscript is technically edited and formatted, it will be removed from the “Just Accepted” Web site and published as an ASAP article. Note that technical editing may introduce minor changes to the manuscript text and/or graphics which could affect content, and all legal disclaimers and ethical guidelines that apply to the journal pertain. ACS cannot be held responsible for errors or consequences arising from the use of information contained in these “Just Accepted” manuscripts.

The Journal of Physical Chemistry Letters is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.

Page 1 of 26

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry Letters

Effective Riemannian Diffusion Model for Conformational Dynamics of Biomolecular Systems Ashkan Fakharzadeh† and Mahmoud Moradi∗,‡ †Department of Physics, North Carolina State University, Raleigh, NC 27695 ‡Department of Chemistry and Biochemistry, University of Arkansas, Fayetteville, AR 72701 E-mail: [email protected]

1

ACS Paragon Plus Environment

The Journal of Physical Chemistry Letters

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 ACS Paragon Plus Environment

Page 2 of 26

Page 3 of 26

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry Letters

Dimensionality reduction has become an important part of biomolecular simulation studies to evade the curse of dimensionality. Within the context of molecular dynamics (MD) simulations, in addition to automated algorithms (e.g., principal component analysis 1 and its nonlinear counterparts 2–4 ), many collective variable based enhanced sampling techniques (e.g., US 5,6 and its nonequilibrium counterparts 7–12 ) aim at reducing the dimensionality of the problem in an ad-hoc manner. Collective variables can be defined intuitively to describe the slow degrees of freedom associated with functionally important protein conformational changes, 13 e.g., an interdomain molecular distance 8 or the root-mean-square deviation (RMSD) from a reference structure. 7 Several collective variable suites/modules 14–16 have recently been developed to allow for the system-specific design of collective variables such as handedness, 17,18 orientation quaternion, 16,19 and path collective variables. 20,21 The “reaction coordinate” 22 is an ideal collective variable for enhanced sampling techniques. However, even if a one-dimensional reaction coordinate exists, it is not known a priori. This has led to the development of several path-finding algorithms, which implicitly or explicitly approximate the reaction coordinate by the arc-length of a curve in the multi-dimensional space of atomic coordinates 23,24 or collective variables, 25,26 with several recent applications in biomolecular simulations. 27–29 Our aim here is to provide a robust conceptual framework and a rigorous mathematical formalism for the description of effective diffusion of biomolecules in collective variable spaces. Many path-finding algorithms and free energy calculation methods are formulated within an effective Euclidean diffusion framework. 25,30,31 Here we show that under similar assumptions, a Riemannian diffusion equation can be formulated, which has different conceptual and practical implications. In particular, the Riemannian formalism provides a definition for the PMF and the MFEP that are, unlike their conventional counterparts, invariant under coordinate transformations. We note that the effective Riemannian diffusion introduced here is a reformulation rather than a generalization of conventional effective diffusion in collective variable spaces. The Riemannian reformulation replaces the conventional Euclidean geometry by a Riemannian geometry,

3

ACS Paragon Plus Environment

The Journal of Physical Chemistry Letters

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 4 of 26

whose metric tensor is related to the conventional diffusion tensor, thereby eliminating the need for a position-dependent, anisotropic diffusivity in the model. Consider a system consisting of N atoms described by coordinates x, momenta p, and potential surface V (x) with a canonical equilibrium distribution at temperature β −1 denoted by ρ(x, p). Suppose that we model this “atomic system” using the equations of motion mi x˙i = pi and: dpi = −(

p ∂ V (x) + γi m−1 2γβ −1 dBi , i pi )dt + ∂xi

(1)

where the atomic mass and damping coefficient associated with atom i are denoted by mi and γi = γmi , respectively, and γ is the collision frequency. B(t) is a Wiener process satisfying





Bi (t) = 0 and Bi (t)Bj (t) = δij t, in which . represents an ensemble average. At the

overdamped limit, this equation reduces to:

dxi = −βDi

p ∂ V (x)dt + 2Di dBi , ∂xi

(2)

in which Di = γi−1 β −1 is the diffusion constant associated with atom i. Due to the dimensionality of the atomic system, it is useful to describe the conformational dynamics of the biomolecule using an effective dynamical model in a coarser space of collective variables that represent the slow degrees of freedom. Assuming that the dynamics in the reduced space of collective variables is diffusive, one may use a Euclidean diffusion model to describe the effective dynamics. 25,31 However, as shown here, the reformulation of this model using a Riemannian scheme provides a conceptually and practically more convenient framework not only for describing the dynamics but also for identifying the MFEP and calculating the PMF along this path using biased simulations. Suppose that ξ(x) is a multi-dimensional coarse coordinate describing the “reduced system” where ξ : R3N → M and (M, g) is a complete Riemannian manifold of dimension n, in which g(ξ) is a position-dependent metric. In the overdamped regime of the reduced system,

4

ACS Paragon Plus Environment

Page 5 of 26

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry Letters

the effective potential surface is described by:

G(ζ) = −β −1 log δζ (ξ) ,

(3)

in which ζ is any point on M and the ensemble average is over ξ. Here δζ (ξ) is the Dirac R delta function satisfying dΩξ δζ (ξ)F(ξ) = F(ζ), for any continuous compactly supported

function F. dΩξ is the volume element on M , which is invariant under coordinate transfor√ mations and is expressed in local coordinates as dΩξ = gdn ξ, where g is the determinant

of g. Since the atomic and reduced systems have the same ξ distributions, we can easily see that the effective potential G(ξ) is the same as the PMF of the atomic system defined by:

G(ζ) = −β

−1

log

Z

d3N x d3N p ρ(x, p) δζ (ξ(x)).

(4)

Note that the overdamped assumption for the effective dynamics does not necessarily imply that the dynamics of the atomic system is overdamped. We postulate that the dynamics in the ξ space can be modeled as a Riemannian diffusion described by: √  dξ = − βD∇G(ξ) + b dt + 2DdW ,

(5)

which in local coordinates (using the Einstein notation) can be written as: √  dξ i = − βDg ij ∂j G(ξ) + bi dt + 2DdW i ,

(6)

∂ , ∂ξ i

D is a position-independent, scalar diffusion constant, bi = −Dg jk Γi jk

is a drift rate, and W (t) is a Riemannian Wiener process 32 such that Wi (t) = 0 and

i

i ˙ (t) is ˙ (t1 )W ˙ j (t2 ) = g ij δ(t1 − t2 ), where W W (t)W j (t) = g ij t for any i and j (or W

in which ∂i =

the formal derivative of W (t)). D is an arbitrary constant and may be chosen to be equal

to the diffusion constant of some arbitrary atom with mass m (i.e., D = m−1 γ −1 β −1 ). Γ

5

ACS Paragon Plus Environment

The Journal of Physical Chemistry Letters

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 6 of 26

is the Christoffel symbol, which is the Levi-Civita connection of the metric tensor. b may be termed the geometric drift rate since it is determined by the (effective) geometry only. Fokker-Planck (or Smoluchowsky) equation associated with this process is:  ∂ u(ξ, t) = βD∇ · u(ξ, t)∇G(ξ) + D∆u(ξ, t), ∂t

(7)

  1 1 ∂ √ √ u(ξ, t) = βD √ ∂i u(ξ, t) gg ij ∂j G(ξ) + D √ ∂i gg ij ∂j u(ξ, t) , ∂t g g

(8)

or

in which ∆ is the Laplace-Beltrami operator, u(ξ, t) is the probability of finding the system at ξ at time t and the boundary condition is u(ξ, 0) = u0 (ξ). We note that the covariant Fokker-Planck equation has been previously described 32,33 and Riemannian diffusion without the drag term has also been proposed on a continuous two-dimensional surface, 34 on a d-dimensional constant curvature manifold, 35 and in the mathematics literature. 36,37 In addition, the overdamped model has been treated as a special case of a general dynamical system on Riemannian manifolds, 38 while its oscillatory behavior 39 and its discrete-time counterparts to arbitrary orders beyond the classical second order 40 have been studied. The Riemannian diffusion as an effective model for collective variable dynamics is, however, the main focus of this paper, which to the best of our knowledge has not been formulated before. As shown in Supporting Information, a Riemannian diffusion with a position-independent, scalar diffusivity D and metric g is equivalent to a Euclidean diffusion with a positiondependent diffusion tensor D = Dg −1 . In the Riemannian formalism the position dependence and anisotropy of diffusivity are simply features of the geometry of collective variable space, which is conceptually convenient to consider. However, we note that the two formulations are associated with two distinct definitions for the PMF and only the Riemannian PMF is invariant under coordinate transformations. While the total drift is the same in both formulations (and changes under coordinate transformations), only the Riemannian formulation separates the coordinate-system dependent part of the drift (i.e., the geometric

6

ACS Paragon Plus Environment

Page 7 of 26

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry Letters

term) and the invariant part (i.e., the free-energy dependent term). Generally, quantities such as distance, free energy, and free energy gradient as well as geometrical objects such as the MFEP are invariant under coordinate transformations, only in Riemannian formulation. Therefore, while the two approaches are equally valid to describe the dynamics, the conventional diffusion formulation lacks the conceptual robustness of its Riemannian counterpart. Note that since the free energy is not invariant, the MFEP can hardly be meaningful in the conventional formalism while in the Riemannian formalism, the MFEP is invariant and represents the most likely pathway. We resort to the Riemannian formulation not only for its conceptual convenience but also to take advantage of the utilities available within the Riemannian geometry. Ideally, one may estimate the PMF (i.e., the potential of the effective system) along a given reaction coordinate using Relation (3), assuming enough number of independent and identically distributed (i.i.d.) samples have been collected from MD or Monte Carlo (MC)

simulations of the atomic model to estimate δζ (ξ) . To avoid undersampling high-energy regions of the ξ space, one may add a biasing term to the potential of atomic model, e.g.,

VB (x) = U (ξ(x)) = k2 d(ξ(x), ζ)2 (where k is a constant and d(ξ, ζ) is the geodesic distance between points ξ and ζ), to encourage the system to stay around ζ, a given point on M . We are interested in estimating the PMF G(ξ) on a given curve ζ(s), parametrized by its arc-length s. The partition function of this biased system can be written as:

exp(−βF (s)) =

Z

d3N x d3N p exp(−βUs (ξ(x)) ρ(x, p),

(9)

or exp(−βF (s)) =

Z

 dΩξ exp − β(G(ξ) + Us (ξ)) ,

(10)

in which Us (ξ) = k2 d(ξ, ζ(s))2 is the biasing potential and F (s) is the perturbed free energy, i.e., the Riemannian, invariant counterpart of the conventional, non-invariant perturbed free energy. 41,42 The center of harmonic potential Us (ξ) can be varied along any given curve 7

ACS Paragon Plus Environment

The Journal of Physical Chemistry Letters

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 8 of 26

ζ(s) by varying s either in a time-dependent manner, as in steered MD , 8 or using multiple independent simulations, as in US. 5,6 The perturbed free energy F (s) can then be estimated using either nonequilibrium work measurements 43,44 or biasing potential values, 45 depending on the sampling protocol. We note that for any given point ζ, the perturbed free energy associated with the biasing potential k2 d(ξ, ζ)2 can be defined as: f (ζ) = −β

−1

log

Z

 k dΩξ exp − β(G(ξ) + d(ξ, ζ)2 ) . 2

(11)

Unlike F (s), which is defined on the ζ(s) curve only, f (ξ) is defined everywhere on the manifold; however, the two functions have the same values at any point along the curve, i.e., f (ζ(s)) = F (s). Employing an approach similar to that previously used to relate the onedimensional PMF and perturbed free energy functions within the conventional free energy formalism, 42 the Riemannian adaption of Weierstrass transform can be used to derive a relation between the Riemannian PMF and perturbed free energy, as shown in Supporting Information: exp(−βG(ζ)) = (

1 2π − n ) 2 exp(− ∆) exp(−βf (ξ))|ξ=ζ . βk 2βk

(12)

For large force constant k, Relation (12) reduces to (see Supporting Information):

G(ζ) ≈ f (ζ) +

 1 β∇f (ξ).∇f (ξ) − ∆f (ξ) |ξ=ζ + Constant. 2βk

(13)

Ideally, one may calculate the PMF everywhere in the ξ space. However, the collective variable is likely to be multi-dimensional, for which the full reconstruction of free energy landscape becomes prohibitively expensive computationally. For a multi-dimensional collective variable space, it is more practical to limit the free energy calculations to the MFEP between two given free energy minima. For large k, estimating G(ζ(s)) − F (s) along an arbitrary curve ζ(s) requires the evaluation of the gradient and Laplace-Beltrami operators

8

ACS Paragon Plus Environment

Page 9 of 26

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry Letters

of f (ξ) along ζ(s). We write these quantities in the so-called Frenet frame of the ζ(s) curve, which is a geodesic normal coordinate system:  1 d d2 β( F (s))2 − 2 F (s) + 2βk ds ds  β∇Σs f (ξ).∇Σs f (ξ) − ∆Σs f (ξ) |ξ=ζ(s) + Constant,

G(ζ(s)) ≈ F (s) + 1 2βk

(14)

in which Σs is the submanifold or hypersurface perpendicular to the ζ(s) curve at point ζ(s), while ∇Σs and ∆Σs are the gradient and the Laplace-Beltrami operator in the Σs submanifold. If the ζ(s) curve is the MFEP such that ζ(s)k∇f (ξ)|ξ=ζ(s) , then both ∇Σs f (ξ) and ∆Σs f (ξ) vanish along ζ(s). Therefore, we have an estimate for PMF in terms of F (s) function and its derivatives only:

G(ζ(s)) ≈ F (s) +

 1 d d2 β( F (s))2 − 2 F (s) + Constant, 2βk ds ds

(15)

In other words, while estimating G(ζ(s)) from F (s) is possible within the stiff-spring approximation for any arbitrary curve ζ(s), the correction term can be estimated from the F (s) function, as in Relation (15), only if the curve is parallel to the free energy gradient or expression β∇Σs f (ξ).∇Σs f (ξ) − ∆Σs f (ξ) is negligible along the curve. If an ensemble of trajectories evolve starting from a given point ζ(s) on a MFEP at time t = 0, they will satisfy Relation (5) such that:



δξ



s,δt

 = − βD∇G(ζ(s)) + b(ζ(s)) δt + O(δt2 ),

(16)

in which δξ = ξ(δt) − ξ(0), ξ(0) = ζ(s), and . s,δt denotes an ensemble average over all

trajectories starting at ζ(s), evaluated after a short time δt. If we write Relation (16) in the Frenet frame of the ζ(s) curve, the geometric drift vanishes along the curve since the Frenet frame is a normal coordinate system. On the other hand, ∇G(ζ(s)) is parallel to

9

ACS Paragon Plus Environment

The Journal of Physical Chemistry Letters

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 10 of 26

ζ(s); therefore, Relation (16) simplifies to:



δξ



d = −βD .ˆ e (s) = δs G(ζ(s))δt + O(δt2 ), 0 s,δt s,δt ds

(17)

ˆ i ’s are Frenet vectors, e.g., first Frenet vector and δξ s,δt .ˆ ei (s) = O(δt2 ) for i > 0, in which e ˆ 0 is the unit tangent vector along ζ(s). e

Similar to the string method, 25,26 one may heuristically identify the MFEP by itera tively launching short simulations along a given curve and modifying the curve until δξ s,δt

ˆ 0 is negligible. The main difference between projected onto any vector perpendicular to e

our implementation and other variations of string method is that we consistently use the metric for all calculations such as the biasing potential and reparametrization of the curve. Our proposed algorithm is similar, in spirit, to string method with swarms of trajectories (SMwST), 26 in which the zero-drift path (no drift except along the path, in average) is identified. However, if the drift is estimated locally in normal coordinates, the zero-drift path is the same as the Riemannian MFEP, which is invariant under coordinate transformations. The original SMwST algorithm, 26 however, does not necessarily identify the zero-drift path in the normal coordinates, which implies that it has a non-invariant solution. It has been already noted that neither the conventional MFEP nor the conventional zero-drift path is invariant under coordinate transformations. 31 However, here we have shown that the Riemannian MFEP is invariant and identical to zero-drift path, if the drifts are measured locally in normal coordinates. The Riemannian implementation of SMwST requires estimating both the drift and metric at every iteration. Within the overdamped approximation of atomic system, Relation (2)

P can be used to approximate δξ i δξ j s,δt by using δξ i = l ∂x∂ l ξ i δxl (not using the Einstein

notation). Assuming that

∂ i ξ ∂xl

is a smooth function of ξ i along the curve for all i and j, we

can show:

δξ i δξ j



s,δt

≈2

X l

Dl

∂ i ∂ j ξ ξ δt + O(δt2 ), ∂xl ∂xl s 10

ACS Paragon Plus Environment

(18)

Page 11 of 26

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry Letters

in which . s is an ensemble average conditioned on ξ = ζ(s). On the other hand, Relation

(6) indicates that:



δξ i δξ j



s,δt



= 2D δW i δW j δt + O(δt2 ) = 2Dg ij δt + O(δt2 ),

(19)

P m ∂ i ∂ j m which implies g ij ≈ l m . Conditional ensemble av, where we used DDl = m l ξ ∂xl ξ s l ∂x l

erage ∂x∂ l ξ i ∂x∂ l ξ j s can be approximated from biased simulations by restraining the system at point ζ(s), e.g., using the same biased simulations necessary for SMwST (assuming k is

large). Therefore, at every SMwST iteration, in addition to the drift, the metric is estimated as well. We note that even if the dynamics is not overdamped at the atomic level, it is possible that the above estimator approximates the metric reasonably. The above estimator has been derived previously for the diffusion tensor; 46 however, our approach is more straightforward under the assumptions made. For more accurate metric estimators, which may be necessary at the later stages of SMwST and prior to free energy calculations, one

may use the correlation functions δξ i δξ j s,δt /(2Dδt). However, direct estimators typically

require many more i.i.d. samples as compared to an estimator based on ∂x∂ l ξ i ∂x∂ l ξ j s , which is a simple average rather than a correlation function.

We note that while we have made certain simplifications in our SMwST implementation, our Riemannian formalism can readily be used in more complex contexts. For instance, if the important transition paths are not limited to a unique transition tube, one may locate the free energy landmarks to generate a network of transition tubes as suggested elsewhere; 47 however, the procedures can be modified according to our Riemannian formalism. Generally any free energy calculation method or path-finding algorithm that relies on measuring distances in a collective variable space (e.g., for calculating kernels in various metadynamics methods 10,12,48 ), would require a metric estimation in order to have a solution invariant under coordinate transformations. Our formalism provides a practical scheme for modifying these methods within a Riemannian framework as described in detail in Supporting Information

11

ACS Paragon Plus Environment

The Journal of Physical Chemistry Letters

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

for SMwST and US methods. We use a three-dimensional toy model to illustrate the workings of our algorithm and the improvement of the SMwST and US methods due to relatively simple modifications inspired by our Riemannian formalism. The dynamics of this toy model is described by the Cartesian overdamped Langevin equation, i.e., Relation (2), and the potential function is defined by U (x, y, z) = αUM (x, y) +

K 2 z , 2

in which UM (x, y) is the Muller potential, 49

α = 0.1 is a scaling factor, and K = 10 kcal/mol is a harmonic constant (see Computational Methods). Due to the harmonic term, z stays around 0 and the system can effectively be considered two-dimensional. Combining path-finding algorithms (e.g., SMwST) and free energy calculations (e.g., US) in the (x, y) space, one can find the MFEP and the free energy along this pathway as illustrated in Fig. 1. Transforming the Cartesian coordinates to the spherical coordinates, i.e., (x, y, z) → (r, θ, φ), and ignoring φ, we will have an alternative 2D (collective variable) space described by (r, θ). One may perform path-finding algorithms and free energy calculations in the (r, θ) space. We note that (r, θ) is not an ideal collective variable for this model; however, we will use this suboptimal collective variable in order to make a comparison between the conventional and Riemannian SMwST/US algorithms. Note that in realistic biomolecular simulations, designing ideal reaction coordinates is virtually impossible and the assumption that the collective variables define a Euclidean space is often unjustified. To illustrate the workings of our modified SMwST/US method, which is discussed in detail in Supporting Information, three algorithms, denoted by (i), (ii), and (iii), were employed to identify the MFEP and/or the zero-drift path (Fig. 1(a)) and its free energy profile (Fig. 1(b)). The PMF is estimated using the perturbed free energies (from US) and the stiffspring correction term in Relation (15)—for a comparison between perturbed and corrected free energies, see Fig. S1 in Supporting Information. Algorithms (i) and (ii), both employ the conventional SMwST and US methods by using the Euclidean metric and ignoring the geometric drift term. In Algorithms (i) and (ii), (x, y) and (r, θ) are used as collective variables,

12

ACS Paragon Plus Environment

Page 12 of 26

Page 13 of 26

respectively. Algorithm (i) is accurate since (x, y) has a Euclidean metric by construct and is not associated with any geometric drift term, i.e., the MFEP and the zero-drift path are the same. On the other hand, Algorithm (ii) is not accurate due to the position dependence of the metric, which is neglected. The discrepancy seen in Fig. 1 between the results of Algorithms (i) and (ii) clearly illustrates the failure of the conventional SMwST/US method when applied to a non-Euclidean collective variable space. Algorithm (iii) takes advantage of the formalism developed here to modify both SMwST and US algorithms such that the collective variable (r, θ) can be used by incorporating on-the-fly metric estimation (see Computational Methods). The converged pathway approximates the Riemannian MFEP, which is close to the conventional MFEP found in the Euclidean (x, y) space. Unlike the conventional SMwST and US algorithms, the Riemannian modified algorithms are applicable even

)UHH (QHUJ\ NFDO PRO

if the collective variable space is not flat.

(a) y

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry Letters

í

í

(b)

= [ \ FRQYHQWLRQDO =U FRQYHQWLRQDO =U 5LHPDQQLDQ

,PDJH 1XPEHU

x

Figure 1: Path-finding and free energy calculations for a 3D toy model described by a 2D collective variable. (a) Pathways generated by three alternative Algorithms (i), (ii), and (iii), shown as black, blue, and red curves, respectively. The resulting curves are projected onto the (x, y) space and the scaled-down Muller potential used in the simulations is shown. (b) Estimated PMF along the curves, shown in panel (a), plotted against the image/window number. While the simple example above illustrates that the Riemannian SMwST/US method is more accurate than its conventional counterpart when the collective variable geometry devi13

ACS Paragon Plus Environment

The Journal of Physical Chemistry Letters

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

ates from the Euclidean geometry, the implementation of these methods may be challenging P m ∂ i ∂ j in more complex cases. Assuming g ij ≈ l ml ∂xl ξ ∂xl ξ s , one may minimize the cost

of calculations by defining the collective variables based on non-overlapping sets of atoms such that the off-diagonal elements of metric tensor are negligible. Assuming the position dependence of the metric is smooth enough, the SMwST method would only require metric determination at the end of each iteration, while the US simulations can simply use the metric determined at the end of SMwST simulations. Note that in both SMwST and US methods, the systems stay close to a given image/window center, which justifies the use of an approximate metric tensor based on the closest image/window center to the system. Therefore, even if some of the off-diagonal elements are not negligible, metric elements are updated in SMwST less frequently than the MD integration steps, typically by a few orders of magnitude. We note that the Riemannian modification of path-finding algorithms and free energy calculation methods as discussed here is aimed at improving the accuracy rather than efficiency of the sampling. The metric estimation is an additional calculation to ensure the reliability of the converged solution rather than speeding up the convergence. For certain collective variables, estimating the geodesic distance and other quantities required for Riemannian free energy calculations may be feasible without the need for metric estimations. For instance, as recently employed for orientation-based biasing of transmembrane helices, 13,19,29 the geodesic distance between orientation quaternions P and Q can be estimated by cos−1 (|P · Q|), in which P · Q is their inner product. 16 To illustrate the workings of the Riemannian SMwST/US method within the biomolecular simulation context, we employ this methodology to identify the MFEP and its associated free energy profile between two given minima of met-enkephalin conformational free energy landscape. Met-enkephalin peptide (Tyr-Gly-Gly-Phe-Met) has been extensively studied in the past using various enhanced sampling techniques. 47,50–52 Rather than exploring its entire conformational free energy landscape, however, we have selected two minima (representing an extended and a β-turn structure) and studied their MFEP pathways using both conventional

14

ACS Paragon Plus Environment

Page 14 of 26

Page 15 of 26

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry Letters

and Riemannian implementations of SMwST/US method as an illustrative example. The two minima represent two (out of 27) clusters (Fig. S2) identified using a k-means clustering technique from 4 µs MD trajectories of met-enkephalin solvated in a water box (see Computational Methods and Supporting Information). We note that the extended/elongated conformation is associated with a local free energy minimum as determined by the above clustering technique and the free energy calculations discussed below (within both the conventional and Riemannian frameworks). This is consistent with several other studies (but not all 53 ) employing various sampling methods and force fields. 47,52,54 We use an 8-dimensional collective variable of backbone dihedral angles (ψ1 , φ2 , ψ2 , φ3 , ψ3 , φ4 , ψ4 , φ5 ) for SMwST/US simulations, which is effectively associated with a tridiagonal metric (Fig. S3). The results of SMwST/US simulations (using 15 images/windows, each represented by 20 independent copies) are shown in Fig. 2, illustrating the difference between the MFEP pathways identified using the conventional and Riemannian methods. The 8-dimensional pathways are projected onto the first three dihedral principal components (dPCs) 52 constructed from equilibrium simulations (for better illustration). Although the two intermediate conformations may seem somewhat similar, they have a RMSD of ∼ 2.8 ˚ A (see Table S1), which indicates the two conformations are distinct. It is also easy to see the difference of these two intermediate conformations by directly comparing their backbone dihedral angles as shown in Fig. S4. Figure S4 also shows the free energy profiles associated with individual dihedral angles as obtained from our equilibrium simulations. The four conformations shown in Fig. 3, which represent the reactant/R (Image 1), product/P (Image 15), and two alternative intermediates (A and B) are clearly associated with distinct values of backbone dihedral angles. To determine which intermediate conformation (A or B) is more likely to be visited in the R→P transition, we used our equilibrium trajectories to identify all R→P transitions. We identified twice more such transitions with A as the intermediate than B with a first-passage time that is also twice longer (on average) in R→B→P transitions as compared to R→A→P transitions (Table S2). These observations provide direct evidence from our equilibrium sim-

15

ACS Paragon Plus Environment

The Journal of Physical Chemistry Letters

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

ulations that the pathway identified by our Riemannian method is more “relevant” than the one identified by the conventional method. More importantly, the Riemannian pathway is invariant under coordinate transformations. We repeated the SMwST/US simulations in the entire dPC collective variable space, which represents a nonlinear transformation of dihedral P angles; The ith dPC is defined as 8j=1 (aij cos θj + aij sin θj ), where θj is the j th backbone dihedral angle and (aij , bij ) coefficients are estimated using our extensive equilibrium simu-

lations. The Riemannian SMwST/US method, unlike its conventional counterpart, results in a MFEP pathway (Fig. 3(a)) and free energy profile (Fig. 3(b)) that are similar in both collective variable spaces. The approximations involved in the implementation of the Riemannian SMwST/US method, as described in Supporting Information, have caused some discrepancies between the solutions associated with the two collective variable spaces; however, the solutions are qualitatively similar and identify the same intermediate state (i.e., A), where the conventional SMwST/US method identifies distinct intermediates in the two spaces (i.e., B and C), as illustrated in Fig. 3(a). We have developed a Riemannian diffusion formalism for effective description of diffusive dynamics of biomolecular systems in smooth collective variable spaces. The Riemannian definition of the PMF, proposed here, is invariant under coordinate transformations and provides a robust framework for accurate thermodynamic and kinetic quantification of biomolecular processes using all-atom MD simulations. The MFEP, defined within the Riemannian framework, is also invariant under coordinate transformations and more accurately represents the most probable path, as compared to the alternative definitions. The implication of the Riemannian formalism is that, the path-finding algorithms and free energy calculation methods developed within the Euclidean formalism should be modified in order to consistently take into account the geometry of the collective variable space. The workings of the developed formalism was demonstrated by means of a simple toy model as well as MD simulations of a short peptide to illustrate how the US and SMwST algorithms can be implemented within the proposed Riemannian framework.

16

ACS Paragon Plus Environment

Page 16 of 26

Page 17 of 26

The Journal of Physical Chemistry Letters

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 ACS Paragon Plus Environment

C

ï

dPC3

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Free Energy (kcal/mol)

The Journal of Physical Chemistry Letters

ï

C

ï

(a)

B A

Intermediate

ï ï dPC 1

ï

ï

dPC2

Page 18 of 26

8

Conventional Dihedral-Based

6

dPC-Based

4 2

(b)

Riemannian Dihedral-Based dPC-Based

3

7

9

Image Number

Figure 3: (a) The met-enkephalin MFEP pathways and (b) their corresponding free energies, obtained from conventional (blue) and Riemannian (red) SMwST/US simulations using backbone dihedral angles (solid lines) and dPCs (dashed lines) as collective variables. The MFEP pathways are projected onto the first three dPCs in panel (a).

Computational Methods For our toy model, the overdamped Langevin equation, i.e., Relation (2), was integrated using the Euler-Maruyama method, 55 with parameters T = 300.0 K, m = 1, γ = 1, and time step ∆t = 10−6 . The definition of the Muller potential 49 and other details can be found in Supporting Information. Three sets of SMwST/US simulations were carried out: (i) Using the collective variable ξ = (x, y) with the Euclidean metric; (ii) Using ξ = (r, θ) with the Euclidean metric; and (iii) Using ξ = (r, θ) with a metric estimated on-the-fly using P m ∂ i ∂ j estimator g ij ≈ l m . In all cases, a parallel variation 29 of SMwST method 26 l ξ ∂xl ξ s l ∂x

was used, which is discussed in detail in Supporting Information. Met-enkephalin peptide was simulated in a box of 885 TIP3P water molecules 56 using CHARMM36 force field 57,58 and MD

simulation package NAMD 2.10. 59 A Langevin thermostat with γ = 0.5 ps−1 , T = 300.0 K, and ∆t = 2 fs was used for integration, while the SETTLE 60 and SHAKE 61 algorithms were used to constrain the hydrogen bonds in water and peptide molecules, respectively. The pressure was maintained at 1 atm using the Nos´e-Hoover Langevin piston method. 62,63 The smoothed cutoff distance for non-bonded interactions was set to 10−12 ˚ A, and long-range electrostatic interactions were computed with the particle mesh Ewald (PME) method. 64 The equilibrium and follow-up SMwST/US simulations are described in Supporting information. 18

ACS Paragon Plus Environment

Page 19 of 26

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry Letters

Acknowledgement This research is supported by the University of Arkansas, Fayetteville.

Supporting Information Available In Supporting Information, the equivalence of Riemannian and Euclidean diffusion formalisms is shown, the relation between the PMF and perturbed free energy is derived, the SMwST/US method is described, and the simulation details are provided. Figure S1 shows the corrected versus perturbed free energies for our toy model, while Figs. S2, S3, and S4 illustrate the clustering, metric estimation, and dihedral angle analysis of met-enkephalin, respectively. Table S1 provides the pairwise RMSDs of select met-enkephalin conformations, while Table S2 reports on the first-passage time analysis of met-enkephalin conformational transition.

References (1) Amadei, A.; Linnsen, A. B. M.; Berendsen, H. J. C. Essential Dynamics of Proteins. Proteins: Struct., Func., Gen. 1993, 17, 412–425. (2) Das, P.; Moll, M.; Stamati, H.; Kavraki, L. E.; Clementi, C. Low-Dimensional, FreeEnergy Landscapes of Protein Folding Reactions by Nonlinear Dimensionality Reduction. Proc. Natl. Acad. Sci. USA 2006, 103, 9887–9890. (3) Ferguson, A. L.; Panagiotopoulos, A. Z.; Debenedetti, P. G.; Kevrekidis, I. G. Systematic Determination of Order Parameters for Chain Dynamics using Diffusion Maps. Proc. Natl. Acad. Sci. USA 2010, 107, 13597–13602. (4) Ferguson, A. L.; Panagiotopoulos, A. Z.; Kevrekidis, I. G.; Debenedetti, P. G. Nonlin-

19

ACS Paragon Plus Environment

The Journal of Physical Chemistry Letters

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

ear Dimensionality Reduction in Molecular Simulation: The Diffusion Map Approach. Chem. Phys. Lett. 2011, 509, 1–11. (5) Torrie, G. M.; Valleau, J. P. Nonphysical Sampling Distributions in Monte Carlo FreeEnergy Estimation: Umbrella Sampling. J. Chem. Phys. 1977, 23, 187–199. (6) Northrup, S. H.; Pear, M. R.; Lee, C. Y.; McCammon, J. A.; Karplus, M. Dynamical Theory of Activated Processes in Globular Proteins. Proc. Natl. Acad. Sci. USA 1982, 79, 4035–4039. (7) Schlitter, J.; Engels, M.; Kr¨ uger, P.; Jacoby, E.; Wollmer, A. Targeted Molecular Dynamics Simulation of Conformational Change — Application to the T ↔ R Transition in Insulin. Mol. Sim. 1993, 10, 291–308. (8) Izrailev, S.; Stepaniants, S.; Balsera, M.; Oono, Y.; Schulten, K. Molecular Dynamics Study of Unbinding of the Avidin-Biotin Complex. Biophys. J. 1997, 72, 1568–1581. (9) Huber, T.; Torda, A. E.; van Gunsteren, W. F. Local Elevation: a Method for Improving the Searching Properties of Molecular Dynamics Simulation. J. Comput.-Aided Mol. Des. 1994, 8, 695–708. (10) Laio, A.; Parrinello, M. Escaping Free Energy Minima. Proc. Natl. Acad. Sci. USA 2002, 99, 12562–12566. (11) Hummer, G.; Kevrekidis, I. G. Coarse Molecular Dynamics of a Peptide Fragment: Free Energy, Kinetics, and Long-Time Dynamics Computations. J. Chem. Phys. 2003, 118, 10762–10773. (12) Moradi, M.; Tajkhorshid, E. Driven Metadynamics: Reconstructing Equilibrium Free Energies from Driven Adaptive-Bias Simulations. J. Phys. Chem. Lett 2013, 4, 1882– 1887.

20

ACS Paragon Plus Environment

Page 20 of 26

Page 21 of 26

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry Letters

(13) Moradi, M.; Tajkhorshid, E. Computational Recipe for Efficient Description of LargeScale Conformational Changes in Biomolecular Systems. J. Chem. Theory Comput. 2014, 10, 2866–2880. (14) Bonomi, M.; Branduardi, D.; Bussi, G.; Camilloni, C.; Provasi, D.; Raiteri, P.; Donadio, D.; Marinelli, F.; Pietrucci, F.; Broglia, R. et al. PLUMED: a Portable Plugin for Free Energy Calculations with Molecular Dynamics. Comp. Phys. Comm. 2009, 180, 1961. (15) Babin, V.; Karpusenka, V.; Moradi, M.; Roland, C.; Sagui, C. Adaptively Biased Molecular Dynamics: An Umbrella Sampling Method with a Time-Dependent Potential. Int. J. Quant. Chem. 2009, 109, 3666–3678. (16) Fiorin, G.; Klein, M. L.; H´enin, J. Using Collective Variables to Drive Molecular Dynamics Simulations. Mol. Phys. 2013, 111, 3345–3362. (17) Moradi, M.; Babin, V.; Roland, C.; Darden, T.; Sagui, C. Conformations and Free Energy Landscapes of Polyproline Peptides. Proc. Natl. Acad. Sci. USA 2009, 106, 20746–20751. (18) Moradi, M.; Babin, V.; Roland, C.; Sagui, C. Reaction Path Ensemble of the B−Z-DNA Transition: a Comprehensive Atomistic Study. 2013, 41, 33–43. (19) Moradi, M.; Tajkhorshid, E. Mechanistic Picture for Conformational Transition of a Membrane Transporter at Atomic Resolution. Proc. Natl. Acad. Sci. USA 2013, 110, 18916–18921. (20) Branduardi, D.; Gervasio, F. L.; Parrinello, M. From A to B in Free Energy Space. J. Chem. Phys. 2007, 126, 054103. (21) Berteotti, A.; Cavalli, A.; Branduardi, D.; Gervasio, F. L.; Recanatini, M.; Par-

21

ACS Paragon Plus Environment

The Journal of Physical Chemistry Letters

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

rinello, M. Protein Conformational Transitions: The Closure Mechanism of a Kinase Explored by Atomistic Simulations. J. Am. Chem. Soc 2009, 131, 244–250. (22) Legoll, F.; Leli´evre, T. Effective Dynamics using Conditional Expectations. Nonlinearity 2010, 23, 2131. (23) Czerminski, R.; Elber, R. Reaction Path Study of Conformational Transitions and Helix Formation in a Tetrapeptide. Proc. Natl. Acad. Sci. USA 1989, 86, 6963–6967. (24) Mills, G.; J´onsson, H. Quantum and Thermal Effects in H2 Dissociative Adsorption: Evaluation of Free Energy Barriers in Multidimensional Quantum Systems. Phys. Rev. Lett. 1994, 72, 1124–1127. (25) Maragliano, L.; Fischer, A.; Vanden-Eijnden, E.; Ciccotti, G. String Method in Collective Variables: Minimum Free Energy Paths and Isocommittor Surfaces. J. Chem. Phys. 2006, 125, 024106. (26) Pan, A. C.; Sezer, D.; Roux, B. Finding Transition Pathways Using the String Method with Swarm of Trajectories. J. Phys. Chem. B 2008, 112, 3432–3440. (27) Pontiggia, F.; Pachov, D. V.; Clarkson, M. W.; Villali, J.; Hagan, M. F.; Pande, V. S.; Kern, D. Free Energy Landscape of Activation in a Signalling Protein at Atomic Resolution. Nat. Commun. 2015, 6, 7284. (28) Meng, Y.; lin Lin, Y.; Roux, B. Computational Study of the DFG-Flip Conformational Transition in c-Abl and c-Src Tyrosine Kinases. J. Phys. Chem. B 2015, 119, 1443– 1456. (29) Moradi, M.; Enkavi, G.; Tajkhorshid, E. Atomic-Level Characterization of Transport Cycle Thermodynamics in the Glycerol-3-Phosphate:Phosphate Transporter. Nat. Commun. 2015, 6, 8393.

22

ACS Paragon Plus Environment

Page 22 of 26

Page 23 of 26

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry Letters

(30) E, W.; Vanden-Eijnden, E. Transition-Path Theory and Path-Finding Algorithms for the Study of Rare Events. Annu. Rev. Phys. Chem. 2010, 61, 391–420. (31) Johnson, M. E.; Hummer, G. Characterization of a Dynamic String Method for the Construction of Transition Pathways in Molecular Reactions. J. Phys. Chem. B 2012, 116, 8573–8583. (32) Risken, H. The Fokker-Planck Equation:

Methods of Solution and Applications;

Springer: Berlin, Germany; 1984. (33) Kampen, N. G. Brownian Motion on a Manifold. J. Stat. Phys. 1986, 44, 1–24. (34) Gustafsson, S.; Halle, B. Diffusion on a Flexible Surface. J. Chem. Phys. 1997, 106, 1880–1887. (35) Castro-Villarreal, P. Brownian Motion Meets Riemann Curvature. J. Stat. Mech. Theor. Exp. 2010, 8, 08006. (36) Hsu, E. Stochastic Analysis on Manifolds; American Mathematical Society: Providence, RI; 2002. (37) Stroock, D. An Introduction to the Analysis of Paths on a Riemannian Manifold ; American Mathematical Society: Providence, RI; 2005. (38) Fiori, S. Extended Hamiltonian Learning on Riemannian Manifolds: Theoretical Aspects. IEEE Trans. Neural Networks 2011, 22, 687–700. (39) Fiori, S. Nonlinear Damped Oscillators on Riemannian Manifolds: Fundamentals. J. Sys. Sci. Complex 2016, 29, 22–40. (40) Fiori, S. Auto-Regressive Moving-Average Discrete-Time Dynamical Systems and Autocorrelation Functions on Real-Valued Riemannian Matrix Manifolds. Discrete and Continuous Dyn. Sys. Ser. B 2014, 19, 2785–2808. 23

ACS Paragon Plus Environment

The Journal of Physical Chemistry Letters

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

(41) Hummer, G.; Szabo, A. Free Energy Reconstruction from Nonequilibrium SingleMolecule Pulling Experiments. Proc. Natl. Acad. Sci. USA 2001, 98, 3658–3661. (42) Hummer, G.; Szabo, A. Free Energy Profiles from Single-Molecule Pulling Experiments. Proc. Natl. Acad. Sci. USA 2010, 107, 21441–21446. (43) Jarzynski, C. Nonequilibrium Equality for Free Energy Differences. Phys. Rev. Lett. 1997, 78, 2690–2693. (44) Shirts, M. R.; Bair, E.; Hooker, G.; Pande, V. S. Equilibrium Free Energies from Nonequilibrium Measurements using Maximum-Likelihood Methods. Phys. Rev. Lett. 2003, 91, 140601. (45) Shirts, M. R.; Chodera, J. D. Statistically Optimal Analysis of Samples from Multiple Equilibrium States. J. Chem. Phys. 2008, 129, 124105. (46) Maragliano, L.; Roux, B.; Vanden-Eijnden, E. Comparison between Mean Forces and Swarms-of-Trajectories String Methods. J. Chem. Theory Comput. 2014, 10, 524–533. (47) Chen, M.; Yu, T.-Q.; Tuckerman, M. E. Locating Landmarks on High-Dimensional Free Energy Surfaces. Proc. Natl. Acad. Sci. USA 2015, 112, 3235–3240. (48) Barducci, A.; Bussi, G.; Parrinello, M. Well-Tempered Metadynamics: A Smoothly Converging and Tunable Free-Energy Method. Phys. Rev. Lett. 2008, 100, 020603. (49) M¨ uller, K. Reaktionswege auf Mehrdimensionalen Energiehyperfl¨achen. Angew. Chem. 1980, 92, 1–14. (50) Evans, D. A.; Wales, D. J. The free Energy Landscape and Dynamics of MetEnkephalin. J. Chem. Phys. 2003, 119, 9947–9955. (51) Chen, M.; Cuendet, M. A.; Tuckerman, M. E. Heating and Flooding: A Unified Approach for Rapid Generation of Free Energy Surfaces. J. Chem. Phys. 2012, 137 . 24

ACS Paragon Plus Environment

Page 24 of 26

Page 25 of 26

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry Letters

(52) Sicard, F.; Senet, P. Reconstructing the Free-Energy Landscape of Met-Enkephalin using Dihedral Principal Component Analysis and Well-Tempered Metadynamics. J. Chem. Phys. 2013, 138 . (53) H´enin, J.; Fiorin, G.; Chipot, C.; Klein, M. L. Exploring Multidimensional Free Energy Landscapes using Time-Dependent Biases on Collective Variables. J. Chem. Theory Comput. 2010, 6, 35–47. (54) Sutto, L.; DAbramo, M.; Gervasio, F. L. Comparing the Efficiency of Biased and Unbiased Molecular Dynamics in Reconstructing the Free Energy Landscape of MetEnkephalin. J. Chem. Theory Comput. 2010, 6, 3640–3646. (55) Maruyama, G. Continuous Markov Processes and Stochastic Equations. Rend. Circ. Math. Palermo 1955, 4, 48–90. (56) Jorgensen, W. L.; Chandrasekhar, J.; Madura, J. D.; Impey, R. W.; Klein, M. L. Comparison of Simple Potential Functions for Simulating Liquid Water. J. Chem. Phys. 1983, 79, 926–935. (57) Klauda, J. B.; Venable, R. M.; Freites, J. A.; O’Connor, J. W.; Tobias, D. J.; Mondragon-Ramirez, C.; Vorobyov, I.; MacKerell Jr., A. D.; Pastor, R. W. Update of the CHARMM All-Atom Additive Force Field for Lipids: Validation on Six Lipid Types. J. Phys. Chem. B 2010, 114, 7830–7843. (58) Best, R. B.; Zhu, X.; Shim, J.; Lopes, P. E. M.; Mittal, J.; Feig, M.; MacKerell, A. D. Optimization of the Additive CHARMM All-Atom Protein Force Field Targeting Improved Sampling of the Backbone φ, ψ and Side-Chain χ1 and χ2 Dihedral Angles. J. Chem. Theory Comput. 2012, 8, 3257–3273. (59) Phillips, J. C.; Braun, R.; Wang, W.; Gumbart, J.; Tajkhorshid, E.; Villa, E.; Chipot, C.; Skeel, R. D.; Kale, L.; Schulten, K. Scalable Molecular Dynamics with NAMD. J. Comp. Chem. 2005, 26, 1781–1802. 25

ACS Paragon Plus Environment

The Journal of Physical Chemistry Letters

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

(60) Miyamoto, S.; Kollman, P. A. SETTLE: An Analytical Version of the SHAKE and RATTLE Algorithm for Rigid Water Molecules. J. Comp. Chem. 1992, 13, 952–962. (61) Ryckaert, J.-P.; Ciccotti, G.; Berendsen, H. J. C. Numerical Integration of the Cartesian Equations of Motion of a System with Constraints: Molecular Dynamics of n-Alkanes. J. Comp. Phys. 1977, 23, 327–341. (62) Martyna, G. J.; Tobias, D. J.; Klein, M. L. Constant Pressure Molecular Dynamics Algorithms. J. Chem. Phys. 1994, 101, 4177–4189. (63) Feller, S. E.; Zhang, Y.; Pastor, R. W.; Brooks, B. R. Constant Pressure Molecular Dynamics Simulation: The Langevin Piston Method. J. Chem. Phys. 1995, 103, 4613– 4621. (64) Darden, T.; York, D.; Pedersen, L. G. Particle Mesh Ewald: An N ·log(N ) Method for Ewald Sums in Large Systems. J. Chem. Phys. 1993, 98, 10089–10092.

26

ACS Paragon Plus Environment

Page 26 of 26