Generalized Langevin Equation as a Model for Barrier Crossing


Article

Generalized Langevin Equation as a Model for Barrier Crossing Dynamics in Biomolecular Folding Rohit Satija, and Dmitrii E. Makarov J. Phys. Chem. A, Just Accepted Manuscript • DOI: 10.1021/acs.jpca.8b11137 • Publication Date (Web): 03 Jan 2019 Downloaded from http://pubs.acs.org on January 5, 2019


The Journal of Physical Chemistry

Generalized Langevin Equation as a Model for Barrier Crossing Dynamics in Biomolecular Folding Rohit Satija† and Dmitrii E. Makarov*†‡ †Department of Chemistry and ‡Institute for Computational Engineering and Sciences, University of Texas at Austin, Austin, Texas 78712, U.S.A.

ACS Paragon Plus Environment


ABSTRACT: Conformational memory in single-molecule dynamics has attracted recent attention and, in particular, has been invoked as a possible explanation of some of the intriguing properties of transition paths observed in single-molecule force spectroscopy (SMFS) studies. Here we study one candidate for a non-Markovian model that can account for conformational memory, the generalized Langevin equation with a friction force that depends not only on the instantaneous velocity but also on the velocities in the past. The memory in this model is determined by a time-dependent friction memory kernel. We propose a method for extracting this kernel directly from an experimental signal and illustrate its feasibility by applying it to a generalized Rouse model of an SMFS experiment, where the memory kernel is known exactly. Using the same model, we further study how memory affects various statistical properties of transition paths observed in SMFS experiments and evaluate the performance of recent approximate analytical theories of non-Markovian dynamics of barrier crossing. We argue that the same type of analysis can be applied to recent single-molecule observations of transition paths in protein and DNA folding.

1. INTRODUCTION


Single-molecule measurements of biomolecular folding and dynamics are often interpreted using a model in which the dynamics of the reaction coordinate (usually equated with the experimental observable $x$) is one-dimensional diffusion in a potential of mean force $U(x)$ [1-4]. The latter is related to the equilibrium distribution $p_{eq}(x)$ of the reaction coordinate,

$$U(x) = -k_B T \ln p_{eq}(x), \tag{1}$$

and the equation of motion along this coordinate is the Langevin equation,

$$m\ddot{x} = -U'(x) - \gamma\dot{x} + \zeta(t), \tag{2}$$

where $m$ is an effective mass, $\gamma$ is a friction coefficient and $\zeta(t)$ is a Gaussian random noise with zero mean, which satisfies the fluctuation-dissipation theorem,

$$\langle \zeta(t)\zeta(t') \rangle = 2\gamma k_B T\,\delta(t - t'). \tag{3}$$

The friction coefficient is related to the diffusion coefficient $D$ through the Einstein relationship, $D = k_B T/\gamma$.
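Eq. 1 can be applied directly to a sampled trajectory. The following is a minimal sketch in Python with NumPy, not the authors' code: the function name `pmf_from_samples` and its parameters are assumptions introduced here for illustration. It histograms the samples of $x$, converts the normalized histogram into $U(x)$, and shifts the minimum of the result to zero.

```python
import numpy as np

def pmf_from_samples(x, bins=50, value_range=None, kT=1.0):
    """Estimate U(x) = -kT ln p_eq(x) (Eq. 1) from samples of x.

    Hypothetical helper: returns bin centers and the PMF, shifted so that
    its minimum is zero. Empty bins are dropped (their PMF is undefined).
    """
    density, edges = np.histogram(x, bins=bins, range=value_range, density=True)
    centers = 0.5 * (edges[:-1] + edges[1:])
    populated = density > 0
    pmf = -kT * np.log(density[populated])
    return centers[populated], pmf - pmf.min()

# Toy data: three times as many samples in the left bin as in the right one,
# so the right bin lies kT*ln(3) above the left one.
samples = np.repeat([0.25, 0.75], [300, 100])
centers, pmf = pmf_from_samples(samples, bins=2, value_range=(0.0, 1.0))
```

The estimate is only as good as the sampling: bins that are rarely visited, such as those near a barrier top, carry the largest statistical error.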

Eq. 2 describes a Markovian stochastic process (i.e., dynamics without memory). Biomolecules are, however, polymers, and the dynamics of monomers within polymers is known to be distinctly non-Markov [5-8]. Such non-Markov effects have attracted recent attention in connection with experimental measurements of the properties of molecular transition paths, i.e., short segments of the molecular trajectories spent crossing the folding free energy barrier. Specifically, it was proposed [9,10] that, while Eq. 2 may be a good description of the folding and unfolding rates, various properties of transition paths such as their temporal duration [11-22] (i.e., transition path time) and their average shape [23-28] may be more sensitive to memory effects. Indeed, molecular simulations [29-38], theoretical considerations [39], and a few experimental studies [40-42] indicate significant memory and anomalous diffusion effects in protein dynamics.


Quantifying memory effects in single-molecule dynamics is generally a difficult task [43,44]. Many distinct mathematical models result in anomalous diffusion [45-47], and which one provides the most accurate description of biomolecular dynamics remains an open question. Here, we explore the Generalized Langevin Equation (GLE),

$$m\ddot{x} = -U'(x) - \int_0^t \xi(t - t')\,\dot{x}(t')\,dt' + \zeta(t), \tag{4}$$

as a candidate for a model of the dynamics of reaction coordinates in folding. Here $\xi(t)$ is a friction memory kernel and $\zeta(t)$ is a Gaussian random noise, which has zero mean and satisfies the fluctuation-dissipation relationship

$$\langle \zeta(t)\zeta(t') \rangle = k_B T\,\xi(t - t'). \tag{5}$$

The GLE (Eq. 4) reduces to the ordinary Langevin equation (Eq. 2) when the memory kernel decays faster than other relevant dynamical timescales in the system, such that the kernel can be approximated by a delta function,

$$\xi(t) = 2\gamma\,\delta(t). \tag{6}$$

Establishing whether the GLE is a viable description of an experimental trajectory $x(t)$ (and, in particular, a better one than the ordinary Langevin equation) is a non-trivial task. The Markovianity test proposed recently [48] allows one to assess the significance of memory effects but offers no way to estimate the parameters of the underlying non-Markov model. Because the probability of a particular realization $x(t_1), x(t_2), x(t_3), \dots$ of a non-Markov process cannot be written as a product of pairwise propagators, a maximum likelihood type of approach [49] for estimating the model parameters is not feasible in the case of a GLE. It is possible to estimate the memory kernel directly from simulations [50-53], but, with the exception of recent work from the Netz group [52], such methods usually require modified simulations in which, e.g., the position $x$ is constrained to a fixed value.


They are, therefore, inapplicable to experimental signals. A further complication stems from the limited time resolution of the experimental data and/or the limited sampling rate of the experimental trajectories, which prohibits measuring quantities related to, for example, instantaneous velocities associated with reaction coordinates.

In what follows, we describe a method for extracting the memory kernel directly from unbiased trajectories $x(t)$, which only requires slowly varying (and thus experimentally accessible) quantities, such as the autocorrelation function of $x$. We illustrate this method using a toy model of a single-molecule force spectroscopy experiment. This model, similar to the generalized Rouse model (GRM) proposed by the Thirumalai group [54], (i) captures the essential features of the free energy landscape traversed by a biomolecule in the process of its unfolding and refolding under mechanical stress, (ii) includes non-Markov effects caused by polymer dynamics, and (iii) can be described exactly by a GLE with a memory kernel that can be estimated analytically. By analyzing folding/refolding trajectories obtained from this model, we further show how memory affects experimentally observable properties of transition paths (such as their shapes and transition path times) and test analytical approximations for the distributions of transition path times proposed recently [55,56]. We envisage that a similar analysis can be applied to experimental trajectories obtained from single-molecule force spectroscopy data.

2. METHODS: ESTIMATING THE MEMORY KERNEL FROM DATA

Our method of estimating the friction memory kernel from a trajectory $x(t)$ is closely related to the approaches described by Daldrop et al. [52] and by Debnath et al. [57]. We neglect inertial effects and assume the overdamped limit of the GLE, Eq. 4, which amounts to setting its left-hand side to zero:


$$0 = -U'[x(t)] - \int_0^t \xi(t - t')\,\dot{x}(t')\,dt' + \zeta(t). \tag{7}$$

Multiplying the above equation by $x(0)$ and averaging over the thermal noise, we obtain

$$0 = -\langle U'[x(t)]\,x(0) \rangle - \int_0^t \xi(t - t')\,\frac{d\langle x(t')x(0) \rangle}{dt'}\,dt'. \tag{8}$$

Introducing the position autocorrelation function $C_{xx}(t) = \langle x(t)x(0) \rangle$ and the force-position correlation function $C_{fx}(t) = -\langle U'[x(t)]\,x(0) \rangle$, Eq. 8 can be written as

$$C_{fx}(t) = \int_0^t \xi(t - t')\,\frac{dC_{xx}(t')}{dt'}\,dt', \tag{9}$$

or, in Laplace space,

$$\hat{\xi}(s) = \frac{\hat{C}_{fx}(s)}{s\hat{C}_{xx}(s) - C_{xx}(0)}, \tag{10}$$

where the hat denotes the Laplace transform (e.g., $\hat{f}(s) = \int_0^\infty f(t)e^{-st}\,dt$). This gives the memory kernel in Laplace space; the memory kernel in the time domain can then be recovered by numerically inverting the Laplace transform [58].
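The estimate of Eq. 10 can be sketched numerically as follows. This is an illustrative Python/NumPy outline under stated assumptions, not the authors' implementation: the helper names (`correlation`, `laplace_transform`, `kernel_laplace`) are hypothetical, the correlation functions are assumed to be sampled on a uniform time grid, and the Laplace integrals are approximated by the trapezoidal rule truncated at the last available lag.

```python
import numpy as np

def correlation(a, b, max_lag):
    """Estimate <a(t) b(0)> at lags 0..max_lag-1 from equally spaced samples."""
    n = len(a)
    return np.array([np.mean(a[lag:] * b[:n - lag]) for lag in range(max_lag)])

def laplace_transform(f, dt, s):
    """Trapezoidal estimate of int_0^T f(t) exp(-s t) dt for each value in s."""
    t = dt * np.arange(len(f))
    integrand = f[None, :] * np.exp(-np.outer(s, t))
    return dt * (integrand.sum(axis=1) - 0.5 * (integrand[:, 0] + integrand[:, -1]))

def kernel_laplace(c_xx, c_fx, dt, s):
    """Eq. 10: xi_hat(s) = C_fx_hat(s) / (s * C_xx_hat(s) - C_xx(0))."""
    s = np.atleast_1d(np.asarray(s, dtype=float))
    return laplace_transform(c_fx, dt, s) / (s * laplace_transform(c_xx, dt, s) - c_xx[0])

# Consistency check on an analytic Markovian case (harmonic well, friction gamma):
# C_xx(t) = (kT/k) exp(-k t / gamma) and C_fx(t) = -k C_xx(t), so xi_hat(s) = gamma.
dt, k, gamma, kT = 1e-3, 2.0, 3.0, 1.0
t = dt * np.arange(20000)
c_xx = (kT / k) * np.exp(-k * t / gamma)
xi_hat = kernel_laplace(c_xx, -k * c_xx, dt, s=[0.1, 1.0, 10.0])
```

For a sampled trajectory `x` with a known or estimated force `-U'(x)`, one would first form `dx = x - x.mean()` and then pass `correlation(dx, dx, max_lag)` and `correlation(force, dx, max_lag)` to `kernel_laplace`. Subtracting the mean is an assumption made here so that both correlation functions decay to zero; it leaves Eq. 10 unchanged because the equilibrium average of the force vanishes.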

Although Eq. 10 is exact (in the overdamped GLE case) for any potential $U(x)$, it is instructive to consider the case of a harmonic potential well, $U(x) = \frac{1}{2}kx^2$. This potential is a good model when, for example, $x(t)$ describes fluctuations of a molecule around its equilibrium structure, or fluctuations of the intramolecular distance within an unfolded protein [59]. In this case, Eq. 10 reduces to the known expression [57,60]:


$$\hat{\xi}(s) = \frac{k\,\hat{C}_{xx}(s)}{C_{xx}(0) - s\hat{C}_{xx}(s)}. \tag{11}$$

The limit $s \to 0$ of this expression, if it exists, gives the Markov approximation to the GLE,

$$\hat{\xi}(0) = k\,\hat{C}_{xx}(0)/C_{xx}(0) = k\,\hat{C}_{xx}(0)/\langle x^2 \rangle = k\int_0^\infty c(t)\,dt, \tag{12}$$

where the normalized autocorrelation function,

$$c(t) = \langle x(t)x(0) \rangle / \langle x^2 \rangle, \tag{13}$$

was introduced. Indeed, the Markov approximation, i.e., replacement of Eq. 4 by Eq. 2, amounts to approximating the memory kernel by the delta function,

$$\xi(t) \approx 2\delta(t)\int_0^\infty \xi(t)\,dt = 2\hat{\xi}(0)\,\delta(t) = 2\gamma\,\delta(t), \tag{14}$$

so that $\gamma = \hat{\xi}(0)$ is the friction coefficient in the Markov limit. Note, however, that $\hat{\xi}(0)$ may be infinite, as it is in the case where the memory kernel is a power law; a Markov limit is never attained in this case.

Consider now the limit $s \to \infty$, which, in the time domain, corresponds to short-time dynamics. In particular, consider the short-time behavior of the mean square displacement,

$$\langle \Delta x^2(t) \rangle = \langle [x(t) - x(0)]^2 \rangle = 2\langle x^2 \rangle [1 - c(t)] = \frac{2k_B T}{k}[1 - c(t)]. \tag{15}$$

Using Eq. 11, we have

$$\hat{c}(s) = \frac{\hat{\xi}(s)}{k + s\hat{\xi}(s)} \approx \frac{1}{s} - \frac{k}{\hat{\xi}(s)s^2} + \dots \tag{16}$$

If

$$\hat{\xi}(\infty) \equiv \gamma_\infty \tag{17}$$

is finite, then, taking the inverse Laplace transform of Eq. 16, one recovers the short-time behavior $c(t) \approx 1 - kt/\gamma_\infty$. Using Eq. 15, one then finds that the mean square displacement undergoes diffusive dynamics at short times,

$$\langle \Delta x^2 \rangle \approx 2\frac{k_B T}{\gamma_\infty}t \equiv 2D_\infty t. \tag{18}$$

The plateau value $\hat{\xi}(\infty)$ is, therefore, the effective friction coefficient that would be inferred from the short-time dynamics of the system. In the non-Markov case, this value is different from the Markov limit $\hat{\xi}(0)$ of the friction coefficient. Eq. 18 cannot be used when $\hat{\xi}(\infty) = 0$. Consider, for example, the power-law friction kernel of the form [55]:

$$\xi(t) = \frac{\eta_\alpha t^{-\alpha}}{\Gamma(1 - \alpha)}, \tag{19}$$

where $0 < \alpha < 1$. Taking its Laplace transform, $\hat{\xi}(s) = \eta_\alpha / s^{1-\alpha}$, Eqs. 15 and 16 give

$$\langle \Delta x^2 \rangle \approx \frac{2k_B T}{\eta_\alpha \Gamma(1 + \alpha)}t^\alpha. \tag{20}$$

The short-time dynamics is, therefore, subdiffusive in the case of a power-law memory kernel.
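The two limits discussed above can be read off a normalized autocorrelation function directly. The sketch below is a hedged illustration in Python/NumPy: the function name `friction_limits` and the biexponential test function are assumptions made here, not taken from the paper. It evaluates the Markov friction $k\int_0^\infty c(t)\,dt$ (Eq. 12) and the short-time friction $\gamma_\infty = -k/c'(0)$, which follows from $c(t) \approx 1 - kt/\gamma_\infty$.

```python
import numpy as np

def friction_limits(c, dt, k):
    """Return (xi_hat(0), xi_hat(inf)) from a normalized autocorrelation c(t).

    xi_hat(0)   = k * int_0^inf c(t) dt   (Eq. 12, trapezoidal rule)
    xi_hat(inf) = -k / c'(0)              (from c(t) ~ 1 - k t / gamma_inf)
    """
    markov = k * dt * (c.sum() - 0.5 * (c[0] + c[-1]))
    initial_slope = (c[1] - c[0]) / dt  # forward difference; needs fine sampling
    return markov, -k / initial_slope

# Illustrative (made-up) biexponential c(t), which is not Markovian.
dt, k = 1e-4, 1.0
t = dt * np.arange(500000)
c = 0.5 * np.exp(-1.0 * t) + 0.5 * np.exp(-10.0 * t)
gamma_markov, gamma_inf = friction_limits(c, dt, k)
```

For this test function the two values differ (about 0.55 versus 0.18 in the units of the example), which is the signature of memory described in the text. The forward-difference slope is only meaningful when the sampling interval is much shorter than the fastest decay time of $c(t)$.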

3. RESULTS: APPLICATION TO THE GENERALIZED ROUSE MODEL (GRM)

Here we study the dynamics of a toy model: a one-dimensional Rouse chain whose ends interact with one another via some potential and which may be subjected to an external force. The total potential energy of the chain, which is composed of beads with coordinates $x_1, x_2, \dots, x_N$, is given by


$$U_{GRM}(x_1, \dots, x_N) = \sum_{n=1}^{N-1} k_0 (x_{n+1} - x_n)^2/2 + U_{ee}(x_N - x_1), \tag{21}$$

where $k_0$ is a spring constant that defines the Kuhn length of the chain, and $U_{ee}$ is the additional potential acting on the chain's end beads. In the next Section, we will introduce a specific form of $U_{ee}$ suitable to mimic the unfolding and refolding of a biopolymer under a stretching force, as in single-molecule force spectroscopy studies. Here we will consider two different potentials, along with the $U_{ee} = 0$ case, with the goal of investigating how the method of estimating the memory kernel depends on the system's potential of mean force. The beads of the chain obey the overdamped Langevin equation of the form

$$\gamma_0 \dot{x}_n = -\frac{\partial U_{GRM}}{\partial x_n} + \zeta_n(t), \tag{22}$$

where $\gamma_0$ is a monomer friction coefficient and $\zeta_n(t)$ is a delta-correlated Gaussian random noise with zero mean satisfying the relationship $\langle \zeta_n(t)\zeta_m(t') \rangle = 2k_B T\gamma_0\,\delta(t - t')\,\delta_{nm}$. As was shown previously [6], the dynamics of the end-to-end distance $x = x_N - x_1$ in this chain satisfies a GLE, Eq. 7, exactly, with the potential of mean force given by

$$U(x) = \frac{1}{2}\frac{k_0}{N-1}x^2 + U_{ee}(x), \tag{23}$$

where the first term accounts for the elasticity of the polymer chain. The memory kernel is independent of the potential $U_{ee}$; it thus can be determined by, for example, setting $U_{ee} = 0$ such that the potential of mean force is that of a harmonic oscillator with a spring constant $k = k_0/(N-1)$. In Laplace space, then, the memory kernel is given by Eq. 11, with the autocorrelation function of the end-to-end distance given by the known expression [61,62],

$$C_{xx}(t) = \frac{1}{\beta k_0}\sum_{p=1}^{N-1}\frac{1}{\lambda_p}\exp\left(-\frac{k_0\lambda_p t}{\gamma_0}\right)\left(u_{pN} - u_{p1}\right)^2, \tag{24}$$


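where $\beta = 1/(k_B T)$,

$$u_{pn} = \sqrt{\frac{2}{N}}\cos\left[\frac{p\pi}{N}\left(n - \frac{1}{2}\right)\right] \quad \text{and} \quad \lambda_p = 4\sin^2\left[\frac{p\pi}{2N}\right]. \tag{25}$$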
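For the free chain ($U_{ee} = 0$), the exact Laplace-space kernel follows from combining Eq. 11 with the mode sum for $C_{xx}(t)$. The sketch below is one possible Python/NumPy realization under stated assumptions, not the authors' code: the function names are hypothetical, and the Rouse modes are obtained by numerically diagonalizing the free-end connectivity matrix instead of using closed-form eigenvectors.

```python
import numpy as np

def rouse_modes(n_beads):
    """Eigenvalues/eigenvectors of the free-end Rouse connectivity matrix."""
    a = 2.0 * np.eye(n_beads) - np.eye(n_beads, k=1) - np.eye(n_beads, k=-1)
    a[0, 0] = a[-1, -1] = 1.0          # free chain ends
    lam, u = np.linalg.eigh(a)         # columns of u are normalized modes
    return lam[1:], u[:, 1:]           # drop the zero (center-of-mass) mode

def exact_kernel_laplace(s, n_beads, k0=1.0, gamma0=1.0, kT=1.0):
    """xi_hat(s) of the end-to-end distance from Eqs. 11 and 24 (U_ee = 0)."""
    lam, u = rouse_modes(n_beads)
    amp = kT / (k0 * lam) * (u[-1, :] - u[0, :]) ** 2   # mode amplitudes in C_xx
    rate = k0 * lam / gamma0                            # mode decay rates
    k = k0 / (n_beads - 1)                              # effective spring constant
    s = np.atleast_1d(np.asarray(s, dtype=float))[:, None]
    c_hat = (amp / (s + rate)).sum(axis=1)              # Laplace transform of C_xx
    # C_xx(0) - s*C_hat(s), written in a form that avoids cancellation at large s
    denom = (amp * rate / (s + rate)).sum(axis=1)
    return k * c_hat / denom

# High-frequency plateau for N = 50, k0 = 100 (gamma0 = 1 sets the friction unit)
xi_inf = exact_kernel_laplace([1e7], n_beads=50, k0=100.0, gamma0=1.0)
```

Writing the denominator of Eq. 11 as $\sum_p a_p \mu_p/(s + \mu_p)$, with $a_p$ and $\mu_p$ the mode amplitudes and decay rates, is algebraically identical to $C_{xx}(0) - s\hat{C}_{xx}(s)$ but numerically safer when $s$ is large. In this form the large-$s$ value approaches $\gamma_0/2$.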

Using Eq. 10, we have extracted the memory kernel from Brownian dynamics simulations (i.e., from numerical integration of Eq. 22) for the GRM with different potentials $U_{ee}$ and compared it to the exact result (see Figure 1). In this Figure, as well as in the rest of the paper, $k_B T$ is used as the energy unit, $10\sqrt{k_B T/k_0}$ is the length unit and $\gamma_0/k_0$ is the time unit. The exact memory kernel is independent of the potential $U_{ee}$, but the accuracy of its numerical estimate depends on this potential, on the length of the trajectory used, and on the data sampling rate. This dependence is important to understand if one wishes to apply this method to experimental data.

In particular, consider the low-frequency behavior ($s \to 0$), which corresponds to the long-time ($t \to \infty$) tail of the memory kernel in the time domain. The Laplace-transformed memory kernel reaches a plateau $\hat{\xi}(0)$, which, according to Eq. 10, should be proportional to the integral of the force-position correlation function, $\hat{\xi}(0) = -\int_0^\infty dt\,C_{fx}(t)/C_{xx}(0)$. The numerical noise in estimating the correlation function $C_{fx}(t)$ may shift the estimated integral, and thus alter the estimated plateau value, as is observed in Figure 1 for the cases of nonzero potential $U_{ee}$: in those two cases the dynamics involves barrier crossing, and, as a result, the timescale of the decay of $C_{fx}(t)$ becomes longer than in the case of a free Rouse chain, resulting in larger statistical errors given the same trajectory length.

In the opposite, high-frequency limit, the friction kernel approaches another plateau, $\hat{\xi}(\infty)$. In the time domain, this corresponds to a delta-function contribution, implying that the memory kernel has a strictly Markovian component, and the friction force has a component that is proportional to the velocity. This component simply corresponds to the solvent friction force acting on the end monomers, resulting in an effective friction coefficient of $\gamma_0/2$ [6], as opposed to the time-delayed friction force exerted on the end monomers by the rest of the chain. The high-frequency behavior of $\hat{\xi}(s)$ is determined by the short-time behavior of the correlation functions entering Eq. 10; the accuracy in estimating the plateau value is sensitive to the data sampling rate, which is manifested by relatively small deviations of the estimated $\hat{\xi}(\infty)$ from $\gamma_0/2$ in Figure 1.
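The Brownian dynamics referred to above amounts to integrating the overdamped bead equations numerically. The text does not specify the integrator, so the sketch below assumes a plain Euler-Maruyama scheme. The names `grm_forces` and `simulate_grm`, the starting configuration, and the default parameters are hypothetical choices made for illustration.

```python
import numpy as np

def grm_forces(x, k0, duee_dx):
    """Forces -dU_GRM/dx_n for the chain of Eq. 21."""
    bond = k0 * np.diff(x)            # tension in each spring
    force = np.zeros_like(x)
    force[:-1] += bond                # each spring pulls bead n toward bead n+1
    force[1:] -= bond                 # and bead n+1 toward bead n
    f_ee = duee_dx(x[-1] - x[0])      # derivative of U_ee at the end-to-end distance
    force[0] += f_ee
    force[-1] -= f_ee
    return force

def simulate_grm(n_beads, n_steps, dt, k0=1.0, gamma0=1.0, kT=1.0,
                 duee_dx=lambda x: 0.0, seed=0):
    """Euler-Maruyama integration of Eq. 22; returns the end-to-end distance."""
    rng = np.random.default_rng(seed)
    x = np.arange(n_beads, dtype=float) * np.sqrt(kT / k0)   # arbitrary start
    noise_amp = np.sqrt(2.0 * kT * dt / gamma0)              # from <zeta zeta> = 2 kT gamma0 delta
    end_to_end = np.empty(n_steps)
    for step in range(n_steps):
        x += grm_forces(x, k0, duee_dx) * dt / gamma0
        x += noise_amp * rng.standard_normal(n_beads)
        end_to_end[step] = x[-1] - x[0]
    return end_to_end

# Short free-chain run (U_ee = 0) with three beads
x_t = simulate_grm(n_beads=3, n_steps=50000, dt=0.02)
```

For a free chain the variance of the returned end-to-end distance should approach $(N-1)k_B T/k_0$, which gives a quick sanity check before any kernel analysis. The explicit scheme is only stable for time steps well below $\gamma_0/(2k_0)$.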

Figure 1. Memory kernels extracted from the end-to-end distance of a 1D Rouse chain using Eq. 10 with different interaction potentials $U_{ee}$ acting on the chain ends. Here, the intra-chain stiffness is $k_0 = 100$, and the number of monomers is $N = 50$. All trajectories were sampled using a time step of $0.5\gamma_0/k_0$ with a total simulation time of $10^6\gamma_0/k_0$. The high-frequency (large $s$) errors are due to the finite sampling rate (i.e., these errors are reduced if the time resolution of data collection improves). The low-frequency (small $s$) errors are a result of statistical errors in estimating the correlation functions $C_{xx}(t)$ and $C_{fx}(t)$ (i.e., these errors are reduced as the simulated trajectories become longer). For a free chain, i.e., $U_{ee}(x) = 0$, the correlation functions decay rapidly; as a result, convergence is achieved even with a trajectory of modest length. This decay time is longer when the dynamics involves crossing a barrier, as in the cases of $U_{ee}(x) = 3(x^2 - 1)^2$ and $U_{ee}(x) = 48(x^{-12} - x^{-6}) - 9.2x$, requiring much longer trajectories to achieve the same accuracy. For the same simulation time, the error in estimating the low-frequency limit of the memory kernel increases progressively with the barrier height, which is, respectively, $\sim 2k_B T$ and $\sim 6k_B T$ for the above two potentials. Inset shows the friction kernel in the time domain obtained via numerical inversion of the Laplace transform. The kernel has a delta-function component, which is not shown. The grey dashed-dotted line shows the memory kernel estimated by fitting Eq. 10 with a function of the form $\hat{\xi}(s) = \gamma_\infty + a/(s + \lambda)$, which corresponds to exponentially decaying memory, $\xi(t) = 2\gamma_\infty\delta(t) + ae^{-\lambda t}$.

In principle, the Laplace transform of the memory kernel contains all the physically important information about memory. If the explicit time dependence $\xi(t)$ is desired, the Laplace transform must be inverted numerically, which, in general, is a challenging task. The inset of Figure 1 shows the friction kernel in the time domain obtained via such numerical inversion. The kernel has a delta-function component, which is not shown. Spurious oscillations observed for the high-barrier case, where the statistics is insufficient, illustrate the difficulty in performing an inverse Laplace transform on noisy data. By taking advantage of physical insight into the system, when available, the accuracy of estimating $\xi(t)$ may be improved. For example, if one assumes (correctly, for the Rouse model) that the non-Markov part of $\xi(t)$ is a sum of decaying exponentials with positive amplitudes, $\xi(t) = 2\gamma_\infty\delta(t) + \sum_i a_i e^{-\lambda_i t}$, $a_i > 0$ (cf. Eqs. 11 and 24), then one can estimate $\xi(t)$ by fitting $\hat{\xi}(s)$ to a sum $\hat{\xi}(s) = \gamma_\infty + \sum_i \frac{a_i}{s + \lambda_i}$, thus finding $a_i$ and $\lambda_i$. The resulting kernel would then no longer have artifacts such as oscillations. In the absence of insight, however, “blind” inverse Laplace transform methods are preferred.

Because of experimental constraints (noise, limited time resolution, etc.) it may be desirable to use an empirical memory kernel, such as the well-studied kernel of the form $\xi(t) = 2\gamma_\infty\delta(t) + ae^{-\lambda t}$, which includes a Markovian component and a single term with exponentially decaying memory. Figure 1 illustrates that, in comparison to the Markov model (where $\hat{\xi}(s)$ has the same value for any $s$), such an empirical model provides an improved description of the GRM data; yet it is clearly inadequate, especially considering the behavior of the memory kernel in the time domain.
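The single-exponential empirical form $\hat{\xi}(s) = \gamma_\infty + a/(s + \lambda)$ can be fitted to a Laplace-space estimate in several ways; the paper does not say which procedure was used. The sketch below shows one simple option in Python/NumPy, with the function name `fit_exponential_kernel` and the grid of trial $\lambda$ values being assumptions made here. It scans $\lambda$ and solves a linear least-squares problem for $(\gamma_\infty, a)$ at each trial value.

```python
import numpy as np

def fit_exponential_kernel(s, xi_hat, lam_grid):
    """Fit xi_hat(s) ~ gamma_inf + a / (s + lam) by scanning lam.

    For each trial lam the model is linear in (gamma_inf, a), so those two
    parameters follow from ordinary least squares; the lam with the smallest
    residual is kept. Positivity of a is not enforced in this sketch.
    """
    best = None
    for lam in lam_grid:
        design = np.column_stack([np.ones_like(s), 1.0 / (s + lam)])
        coef, *_ = np.linalg.lstsq(design, xi_hat, rcond=None)
        residual = np.sum((design @ coef - xi_hat) ** 2)
        if best is None or residual < best[0]:
            best = (residual, coef[0], coef[1], lam)
    return best[1], best[2], best[3]   # gamma_inf, a, lam

# Synthetic, noise-free data generated from the model itself
s = np.logspace(-2, 2, 50)
xi_data = 0.5 + 2.0 / (s + 1.0)
gamma_inf, a, lam = fit_exponential_kernel(s, xi_data, np.logspace(-2, 2, 81))
```

The resolution in $\lambda$ is limited by the grid spacing, and a multi-exponential fit would need a genuine nonlinear or constrained solver. With noisy data, weighting the residuals by the estimated uncertainty at each $s$ would be a natural refinement.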

4. RESULTS: PROPERTIES OF TRANSITION PATHS IN SIMULATED MECHANICAL UNFOLDING OF THE GENERALIZED ROUSE MODEL

The model introduced in the previous Section allows us to examine some of the generic features introduced by memory in single-molecule force spectroscopy experiments [13,28,63,64]. To do so, we set the potential acting on the end monomers of the chain (see Eq. 21) to be:


$$U_{ee}(x) = 4\varepsilon\left[\left(\frac{\sigma}{x}\right)^{12} - \left(\frac{\sigma}{x}\right)^{6}\right] - fx. \tag{26}$$

This potential includes a Lennard-Jones interaction between the end monomers and a stretching force $f$ that is pulling these monomers apart. The corresponding potential of mean force (PMF) for the dynamics of the end-to-end distance $x$ is:

$$U(x) = \frac{1}{2}\frac{k_0}{N-1}x^2 + 4\varepsilon\left[\left(\frac{\sigma}{x}\right)^{12} - \left(\frac{\sigma}{x}\right)^{6}\right] - fx. \tag{27}$$

For the value of the force $f$ used in this work, the PMF exhibits two minima: the minimum at the shorter distance $x$ mimics the “folded” state, where the end monomers stick to each other because of the Lennard-Jones attraction, while the minimum at the higher extension of the chain corresponds to the “unfolded” state, where the end monomers do not interact with each other (Figure 2). The shape of the PMF is reminiscent of that found in molecular simulations of folding (using different reaction coordinates) [65], with a broad unfolded basin and a narrow folded basin.
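The two-basin shape of this PMF can be checked with a few lines of code. The sketch below (Python/NumPy, illustrative only) evaluates Eq. 27 with the parameter values quoted in the Figure 2 caption, in the reduced units of the paper, and locates the two minima and the barrier top on a grid. The function name `grm_pmf` and the grid limits are assumptions made here.

```python
import numpy as np

def grm_pmf(x, k0=100.0, n_beads=50, eps=12.0, sigma=1.0, force=9.2):
    """PMF of Eq. 27 in units of k_B T (parameters as quoted for Figure 2)."""
    lj = 4.0 * eps * ((sigma / x) ** 12 - (sigma / x) ** 6)
    return 0.5 * k0 / (n_beads - 1) * x ** 2 + lj - force * x

x = np.linspace(0.9, 7.0, 6101)
u = grm_pmf(x)
# Interior grid points lower than both neighbours are local minima.
is_min = (u[1:-1] < u[:-2]) & (u[1:-1] < u[2:])
i_fold, i_unfold = np.flatnonzero(is_min) + 1
i_top = i_fold + np.argmax(u[i_fold:i_unfold])   # highest point between the minima
barrier_from_folded = u[i_top] - u[i_fold]
barrier_from_unfolded = u[i_top] - u[i_unfold]
```

With these parameters the grid search finds a narrow minimum slightly above $x = 1.1$ and a broad one near $x = 4.5$, separated by a barrier of roughly $6k_B T$ from either side. The unpacking into `i_fold, i_unfold` assumes exactly two minima on the chosen grid and would fail for other parameter choices.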


Figure 2. Potential of mean force (PMF) on the end-to-end distance of the generalized Rouse model, Eq. 21. The parameters used here are $k_0 = 100$, $N = 50$, $\varepsilon = 12$, $\sigma = 1$, and $f = 9.2$ (same as in Figure 1). Blue dots represent the PMF reconstructed from the equilibrium probability distribution in a simulated trajectory using Eq. 1. The dashed black line is the exact PMF given by Eq. 27. $A$, $TS$, and $B$ represent the compact chain (folded) basin, the transition state (barrier top), and the expanded chain (unfolded) basin. Orange dashed lines show the transition region boundaries $x_A$ and $x_B$ used in the trajectory analysis. The dashed blue line is the symmetric harmonic potential used to estimate the transition path times and the transition path velocity using analytical approximations.

Transition paths are defined as pieces of a trajectory $x(t)$ that enter an interval $x_A \le x \le x_B$ (which we call the transition region) through one of its boundaries (say $x_A$) and exit through the other boundary, staying continuously inside this interval between these two events. In what follows we analyze how various experimentally measurable properties of transition paths are affected by the memory effects introduced by the dynamics of the polymer chain.

4.1. Markovianity Test. A recent study [48] proposed a simple test allowing one to establish whether the observed dynamics $x(t)$ is Markovian. Consider the probability $P(x_A \to x_B \mid x)$ that a point $x$ belongs to a transition path from A to B. For a Markov process, the maximum of this probability is always exactly the same,

$$\max P(x_A \to x_B \mid x) = \max P(x_B \to x_A \mid x) = 1/4,$$

x A