Adaptive Finite Temperature String Method in ... - ACS Publications

Nov 30, 2017 - Here we present a modified version of the on-the-fly string method for the ..... The hybridization coordinate was calculated as the sig...
0 downloads 0 Views 2MB Size
Subscriber access provided by University of Florida | Smathers Libraries

Article

Adaptive Finite Temperature String Method in Collective Variables Kirill Zinovjev, and Iñaki Tuñón J. Phys. Chem. A, Just Accepted Manuscript • DOI: 10.1021/acs.jpca.7b10842 • Publication Date (Web): 30 Nov 2017 Downloaded from http://pubs.acs.org on December 1, 2017

Just Accepted “Just Accepted” manuscripts have been peer-reviewed and accepted for publication. They are posted online prior to technical editing, formatting for publication and author proofing. The American Chemical Society provides “Just Accepted” as a free service to the research community to expedite the dissemination of scientific material as soon as possible after acceptance. “Just Accepted” manuscripts appear in full in PDF format accompanied by an HTML abstract. “Just Accepted” manuscripts have been fully peer reviewed, but should not be considered the official version of record. They are accessible to all readers and citable by the Digital Object Identifier (DOI®). “Just Accepted” is an optional service offered to authors. Therefore, the “Just Accepted” Web site may not include all articles that will be published in the journal. After a manuscript is technically edited and formatted, it will be removed from the “Just Accepted” Web site and published as an ASAP article. Note that technical editing may introduce minor changes to the manuscript text and/or graphics which could affect content, and all legal disclaimers and ethical guidelines that apply to the journal pertain. ACS cannot be held responsible for errors or consequences arising from the use of information contained in these “Just Accepted” manuscripts.

The Journal of Physical Chemistry A is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.

Page 1 of 31 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

Adaptive Finite Temperature String Method in Collective Variables Kirill Zinovjev and I˜naki Tu˜no´n∗ Departament de Qu´ımica F´ısica, Universitat de Val`encia, 46100 Burjassot, Spain E-mail: [email protected]

1

ACS Paragon Plus Environment

The Journal of Physical Chemistry 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Abstract Here we present a modified version of the on-the-fly string method for the localization of the minimum free energy path in a space of arbitrary collective variables. In the proposed approach the shape of the biasing potential is controlled by only two force constants, defining the width of the potential along the string and orthogonal to it. The force constants and the distribution of the string nodes are optimized during the simulation improving the convergence. The optimized parameters can be used for umbrella sampling with a path CV along the converged string as the reaction coordinate. We test the new method with three fundamentally different processes: chloride attack to chloromethane in bulk water, alanine dipeptide isomerization and the enzymatic conversion of isochorismate to piruvate. In each case the same set of parameters resulted in a rapidly converging simulation and a precise estimation of the potential of mean force. Therefore, the default settings can be used for a wide range of processes, making the method essentially parameter free and more user-friendly.

2

ACS Paragon Plus Environment

Page 2 of 31

Page 3 of 31 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

Introduction Complex processes in condensed phases typically involve changes in many degrees of freedom. 1,2 In principle, one would like to construct the complex multidimensional free energy surface (FES) to obtain an exhaustive description of the process. However, the computational cost of constructing such a surface scales exponentially with the number of coordinates, limiting practically the number of coordinates to only a few. This problem, often called a “curse of dimensionality”, affects the applicability of common free energy calculation techniques, such as Umbrella Sampling (US) 3 and Metadynamics. 4 It is necessary, therefore, to project the multidimensional FES on a surface of lower number of dimensions, ideally on a single distinguished reaction coordinate (RC) capable to guide the system through the process of interest. However, the definition of such a coordinate is not a trivial problem for processes in condensed phases. 1,2,5 Path-based methods gained lots of popularity in the last decade as an efficient solution to investigate the free energy change in complex processes. These methods were applied to a plethora of problems in computational chemistry, from phase transitions 6 to enzymatic catalysis. 2 At the core of any path-based approach is the assumption that the process under analysis is most likely to proceed along some path in the multidimensional space of reaction coordinates. In other words, the “reaction tube” – a region connecting reactants and products basins that includes most of the reactive trajectories – is narrow enough. In this case this space can be projected, without loosing any essential information, onto some collective variable (CV) that defines the progress along this path. By doing so, the number of reaction coordinates is reduced to just one and the above mentioned dimensionality issue is solved. The path is typically found by the string method or a similar path optimization technique. A plethora of methods with slightly different path definitions exists, 7–14 and in theory the resulting path will depend on the method. For example, the original string method 7,8 converges to the minimum free energy path (MFEP), while the finite-temperature string method 10 and the methods developed by Leines et al. 14 and Dickson et al. 13 provide 3

ACS Paragon Plus Environment

The Journal of Physical Chemistry 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

the principal curve of the reaction. In contrast to the MFEP, which goes through the free energy minimum in directions orthogonal to the path, the principal curve goes through the average of the corresponding thermodynamic ensemble. 10 The difference between these two paths depends on the ruggedness of the FES and the width of the reaction tube. For a narrow reaction tube and a smooth FES the MFEP and other path definitions are expected to be similar. 15 Whatever is the technique used, the potential of mean force (PMF) along the obtained path has to be calculated to estimate the free energy barrier of the process. 16 Many of the path optimization methods provide an estimate of the free energy along the path directly from the data acquired during the path optimization. In the original string method, 7,8 swarms of trajectories, 9 and maximum flux transition paths method 11 the mean force at each string node is projected onto the path and integrated. Though coming at zero additional computational cost, the resulting free energy profile does not include the entropic contribution from the degrees of freedom that are orthogonal to the path in the chosen CV space, thus being only an approximation (although typically a good one) to the true one-dimensional PMF. Other methods combine path optimization with free energy calculation techniques based on the flattening of the underlying free energy. 13,14 While providing approximately true PMFs, these methods inevitably suffer a well-known convergence problem, if an important degree of freedom (DOF) is omitted. 4 Another approach is to use the reaction path as a reference for the definition of a path CV. 17 In this case, the PMF calculation is decoupled from the path optimization itself and then US 3 with Hamiltonian Replica Exchange 18,19 can be used to obtain the sampling along the RC (the path CV). This strategy provides a converged PMF, even with a suboptimal RC, provided that the free energy barriers along the orthogonal DOFs are not too high. This two-step approach was succesfully applied to a wide spectrum of enzymatic reactions. 20–23 However, the main disadvantage is the additional computational cost of the US calculation. Another drawback of this approach is the large number of parameters that have to be chosen by the user for each case: the force constants (one for each

4

ACS Paragon Plus Environment

Page 4 of 31

Page 5 of 31 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

CV) and the damping coefficients needed to define the string and the force constants and reference positions for the US windows. Here we present a modified version of the on-the-fly string method (the Adaptive String Method, ASM), which attempts to solve the two problems mentioned above. The biasing potential of the string nodes is redefined to approximately match the biasing potential in US windows along a path CV. In this way the sampling obtained during the string optimization can be employed to determine the PMF. In addition, we introduce a strategy for on-the-fly optimization of the positions and force constants in the string nodes and of the US windows. The optimized parameters guarantee a uniform sampling of the entire range of the RC, ensuring a high rate of replica exchanges. This significantly improves the convergence of both the string optimization and the calculation of the PMF using US. As shown below, the automatic optimization of the parameters allows the use of the same initial setup in very different problems, therefore minimizing the manual selection of parameters. The paper is organized as follows: in section ”Theory and method” we describe the modifications of the original string method that reduce the number of the adjustable parameters introducing no additional approximations. Next, we give details of the practical implementation of the method, including the on-the-fly optimization of the parameters. Finally, we show the results obtained with the new method for three substantially different processes: the symmetric SN 2 reaction of chloromethane with chloride anion in water, the isomerization of alanine dipeptide and the enzymatic reaction catalyzed by Isochorismate Pyruvate Lyase (IPL).

Theory and method The on-the-fly string method Here we briefly describe the on-the-fly string method. The details can be found in the original paper by Maragliano et al. 8 One starts with a set of D CVs {θ1 , . . . , θD } which assumed to cover all the DOFs relevant in the process under study. A vector function z(α) 5

ACS Paragon Plus Environment

The Journal of Physical Chemistry 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 6 of 31

that represents a path (the “string”) is defined in this space and discretized as a set of n equidistant string nodes {z1 , . . . , zn }:  zi = z(αi ) = z

 i−1 L n−1

(1)

given that the parametrization α corresponds to the arc-length of the string up to the given point and L is the total arc-length of the string. An independent MD simulation is initiated for each node, with a harmonic biasing potential that restrains the system close to the position of the node in the CV space:

Vi (x) =

X Kj j

1 (θj (x) − zij )2 = (θ(x) − zi )T K(θ(x) − zi ) 2 2

(2)

where x are the 3N Cartesian coordinates of the system, zij is the value of the j-th CV in the i-th node, Kj is the force constant associated to the j-th CV and finally K is the diagonal D × D matrix of the force constants. The trajectory xi (t) associated to the node zi governs the motion of the node itself: ˜ i (t))K(θ(xi (t)) − zi ) γ z˙ i (t) = M(x

(3)

Here γ is the damping coefficients that determines the “speed” of the node and

˜ ij (x) = M

3N X ∂θi (x) 1 ∂θj (x) ∂xk mk ∂xk k=1

(4)

is the local distance metric tensor with mk being the atomic mass associated to the k-th Cartesian. Given K and γ large enough, the dynamics of the nodes, governed by Eq. (3) converges to a path with the following property: dz(α) k M(z(α))∇A(z(α)) dα

6

ACS Paragon Plus Environment

(5)

Page 7 of 31 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

which is the definition of MFEP. 7 M is the metric tensor defined as D E ˜ M(z(α)) = M(x)

(6) θ(x)=z(α)

where h· · · iθ(x)=z(α) is the statistical average over the ensemble with θ(x) constrained to z(α). In practice, the dynamics given by (3) must be discretized:

˜ i (t))K(θ(xi (t)) − zi )∆t zi (t + ∆t) = zi (t) + γ −1 M(x

(7)

Obviously, free dynamics would cause the nodes to collapse to the corresponding free energy minima. This is avoided by a reparametrization procedure that keeps the nodes equidistant. After every dynamical step the string is interpolated using its arc-length and picking equispaced points along the interpolated curve:

z0i

= Zz1 ,...,zn



i−1 L n−1

 (8)

where Zz1 ,...,zn (α) is a continuous interpolation of the string nodes, parametrized by the arclength. Z can be a simple linear interpolation, as in the original reference, 7 or a more precise procedure like cubic splines interpolation. The dynamics (7) and the reparametrization (8) ensure the convergence of the string to the vicinity of MFEP, given than K and γ are chosen to be large enough.

The adaptive string method The introduction of a biasing potential (2) with a diagonal matrix K can be seen as restrain of the system to an ellipsoid with the principal axes aligned along the basis coordinates of the selected CV space (Fig. 1a). However, one is free to choose K in (2) and (3) to be any positive-definite matrix with eigenvalues large enough to keep the system close to the string 7

ACS Paragon Plus Environment

The Journal of Physical Chemistry 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

(a)

Page 8 of 31

(b)

Figure 1: The “biasing ellipsoid” (yellow) of a string node in (a) original string method and (b) adaptive string method. The red region represents the region sampled by a US window with the path CV used as a RC. Note that in the case of the adaptive string method the overlap between the string node ensemble and the US window ensemble is significantly better. node. Particularly, it is possible to construct a bias that orients the ellipsoid along the string (Fig. 1b). To do so, we introduce the following biasing matrix instead of K:

ˆn ˆ T M−1 + K ⊥ M−1 B = (K k − K ⊥ )M−1 n

(9)

Here the subindex i in all the terms is omitted for clarity. K k and K ⊥ are the force constants ˆ is a for the harmonic biasing potential along and orthogonal to the string, respectively. n unit vector tangent to the string and M is the distance metric tensor, which for a node i is calculated as: 8 D

˜ Mi = M(x)

E

=Z

−1

i

Z

−β(V (x)+Vi (x)) ˜ M(x)e dx ≈ M(zi )

(10)

R3N

where V (x) is the potential energy function of the system, Vi is defined as in (2) but with ˆ fulfills biasing matrix B and Z is the partition function of the biased ensemble. The vector n the following normalization condition:

ˆ Ti M−1 ˆi |ˆ ni |Mi ≡ n i n

8

1/2

=1

ACS Paragon Plus Environment

(11)

Page 9 of 31 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

Here and below | · · · |M denotes the length of some vector with a distance metric defined by ˆ can be obtained a tensor M. If the arc-length α is calculated using the same metric then n as dz(αi ) dα

ˆi = n

(12)

The geometric meaning of the biasing matrix B becomes more evident when K ⊥ is set equal to 0 or to K k : k

Ki⊥

2 K ˆi = 0 : Vi (x) = i (θ(x) − zi )T M−1 i n 2

(13)

k

k

Ki⊥ = Ki : Vi (x) =

Ki |θ(x) − zi |2Mi 2

(14)

In the first case, the potential vanishes in the directions orthogonal to the string and the system is restrained to a hyperplane orthogonal to the string. In fact, potential (13) is identical to the one used in the generalized hyperplanar transition state optimization approach 24 to restrain the system to a hyperplane in the CV space. In the second case the force constants along and orthogonal to the path are made equal. As a result, the potential becomes spherically symmetric and depends only on the distance of θ to the node z. When dealing with CV spaces, the introduction of the distance metric tensor (6) has several advantages that are described elsewhere. 2,24,25 The feature that is most important here is the rigorous definition of distance and orthogonality when CVs of the different nature (bond lengths, angles etc.) are used. The introduction of the metric tensor in (11) resolves this issue and implicitly ”converts” the different units of the selected coordinates to those of mass-weighted Cartesian coordinates. The usage of the biasing matrix B instead of the diagonal matrix K is the only fundamental difference between the adaptive and the original string methods. The evolution of the string nodes is performed with Eq. (7) with B instead of K. The force constants and the damping coefficient γ have to be chosen as large as possible for the converged string

9

ACS Paragon Plus Environment

The Journal of Physical Chemistry 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 10 of 31

to approximate well the MFEP. However, it is worth discussing the dependence of the converged path on the choice of K ⊥ . When K ⊥ → ∞ the free energy gradient normal to the converged string vanishes, resulting in the MFEP (as in the original string method 7 ). Instead, when K ⊥ → 0 the converged string crosses any hypersurface orthogonal to the path at the ensemble average of the CVs constrained to this hypersurface, which is the definition of the principal curve. 10 Thus, MFEP and the principal curve are limiting results of the adaptive string method, with some interpolation between the two being obtained in practical simulations.

Optimization of αi and K k The convergence of both the string method and US is significantly improved applying the Hamiltonian Replica Exchange 18 approach. This procedure is efficient if the sampling from nearby string nodes (or US windows) overlap significantly. 19 A simple way to achieve this is to ensure that the distributions sampled at each node, assuming normality, have the following mean value and standard deviation:

µi =

i−1 L, n−1

σi =

L 2(n − 1)

(15)

This distribution corresponds to the following effective free energy (intrinsic free energy plus bias) along the string: Aef f (li (x)) = β −1 σi−2 (li (x) − µi )2

(16)

where li is the projection of θ(x) onto the string in the vicinity of the i-th node: ˆ i + αi li (x) = (θ(x) − zi )T M−1 i n

(17)

If n nodes are distributed evenly from 0 to L then αi = µi . Note that when K ⊥ = 0 (Eq. (13)) the string bias becomes simply a harmonic bias along li .

10

ACS Paragon Plus Environment

Page 11 of 31 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

Due to the underlying free energy gradient in the region sampled by the string nodes, the sampling obtained with the string bias will be shifted from the reference position zi (µ(li ) 6= µi ). In the same way, the curvature of the free energy surface alters the width k

of the distribution (σ 2 (li ) 6= σi2 ). Therefore, αi and Ki have to be adjusted to provide the probability distribution defined by (15). This can be achieved with a dynamical optimization of these parameters. The optimal values of αi and K k are reached at the steady state of the following dynamics: k

γα α˙ i = Ki h∆li (x)ii k 2 γK K˙ i = β −1 σi−2 − h∆li2 (x)i−1 i + κh∆li (x)ii

(18) 

(19)

where γα and γK are the damping coefficients. Here and below ∆li (x) = µi − li (x) is introduced for clarity. The averages h· · · ii are calculated over the ensemble corresponding to the string node i as in Eq. (10). The meaning of the third r.h.s. term of (19) is the following: when κ > 0 and αi is not optimal (h∆li (x)i2 > 0), K is overestimated at the steady state of (19). This helps to accelerate the convergence of αi . In practice, instead of calculating the ensemble averages, the dynamics are performed on-the-fly: k

αi (t + ∆t) = αi (t) + γα−1 Ki ∆li (x(t))∆t  −1  2   k k −1 −1 −2 2 + κ ∆li (t)) ∆t Ki (t + ∆t) = Ki (t) + γK β σi − ∆li (t)

(21)

X(t + ∆t) = X(t) · (1 − ) + X(x(t + ∆t)) · 

(22)

(20)

where:

is an exponential moving average with  determining the “width” of the averaging window. The initial values for αi and K k can be taken as those providing an optimal distribution width (Eq. (15)) in case of a flat underlying free energy:

αi,init = µi,init =

11

i−1 Linit n−1

ACS Paragon Plus Environment

(23)

The Journal of Physical Chemistry 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

k

−2 Ki,init = β −1 σi,init =4

(n − 1)2 βL2init

Page 12 of 31

(24)

where Linit is the length of the initial guess. The initial value of the M tensor (which is ˜ over the structures used to needed to calculate B) can be obtained as an average of M initiate the dynamics in each node (see section ). It may seem that the introduced modifications to the original string method increase the number of adjustable parameters, instead of decreasing it. However, as shown in the Results section, all the new parameters are essentially system independent and the same values are expected to be adequate for a wide range of problems. Moreover, since the node parameters (αi and K k ) are optimized during the simulation, any inadequacy in the initial values will be corrected during the simulation. Therefore, one does not need to adjust the parameters manually and if the default values are used, the method is expected to work almost as a ”black box”.

Path CV and PMF along the string The main strength of the new approach comes from the fact that the modified string bias greatly resembles an US bias along a path CV (Fig. 1b). Different, while geometrically similar, definitions of path CVs exist in the literature, 13,14,17 all sharing a central feature: the path CV measures the progress along the path and has its isosurfaces orthogonal to the path. An ensemble obtained with a harmonic bias applied along a path CV based on the MFEP will be restrained in directions orthogonal to the path by the curvature of the free energy surface and along the path by the bias itself. Therefore, the biased sampling is oriented exactly as with the modified string potential described above. Here we will use the coordinates based on the definition by Branduardi et al.: 17,25 Pn s(θ(x)) =

i−1 −λ|θ(x)−zi |Mi i=1 λ e Pn −λ|θ(x)−zi |M i i=1 e

12

ACS Paragon Plus Environment

(25)

Page 13 of 31 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

z(θ(x)) = −λ−1 ln

n X

e−λ|θ(x)−zi |Mi

(26)

i=1

The first coordinate measures the progress along the path while the second one measures the distance from it. The reference path is discretized to a set of equidistant points z with λ being the inverse distance between the two nearby z points. The number of reference points n, however, is not necessarily the same as in the string discretization - more points might be interpolated for the path CV to obtain a better path resolution. Note that the distance metric based on the metric tensor M is also used here. This ensures that the biased ensemble is as similar as possible to the one obtained from the string calculation. Also, the introduction of the metric tensor guarantees that the path CV isosurfaces approximate the isocommittors in the vicinity of the path. 7,25 The similarity of the biasing potentials used in the modified string method and in the US simulations along a path CV (see Figure 1) provides a computational advantage: the optimized node parameters αi and K k can be directly used as reference positions and force constants for the US windows, resulting in the following biasing potential:

Vis (s(x)) =

Kk (s(x) − αi )2 + Vb (z) 2

(27)

where Vb (z) =

K⊥ z(x)2 2

(28)

As described in the previous section, these parameters are expected to provide uniform sampling along the path with high replica exchange rate. The second term corresponds to the bias orthogonal to the path and should be omitted when one is interested in an unbiased estimation of the PMF along the s coordinate. However, in some cases, it could be desirable to explore only those regions close to the path during the US simulations (see Results) and this term should then be kept. Additionally, the PMF along the path CV can be efficiently evaluated by ensemble

13

ACS Paragon Plus Environment

The Journal of Physical Chemistry 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 14 of 31

reweighting during the string optimization. 26 For instance, if the sampling from each string node is used to reweight a corresponding US window along the path CV, the probability distribution of the s values in each window would be: E s eβ [Vi (θ(x))−Vi (s(x))] δ(s0 − s(x)) i E D ρsi (s0 ) = β [Vi (θ(x))−Vis (s(x))] e D

(29)

i

Once the reweighted biased distributions (29) for each window are obtained, the PMF can be calculated using standard techniques, such as WHAM 27 or Umbrella Integration. 28 However, it should be kept in mind that the reference path of the path CV changes during the string optimization, introducing an error source additional to the reweighting scheme. Therefore, the PMF obtained by reweighting should be considered only as a semi-quantitative estimate and used only to check that the simulation is stable and provides meaningful results.

PMF decomposition The decomposition of the free energy change in terms of the different CVs can provide a valuable information for the optimization of the process under study. For example, in the study of a catalyzed reaction the identification of those CVs contributig more to the free energy barrier can assist the design of new and more efficient catalysts. The knowledge of MFEP and the associated free energy profile along it allows to decompose the contributions of each CV to the resulting free energy profile or PMF. The contributions of each CV to the free energy profile can be easily evaluated in the original string method 7 by integrating the components of the free energy gradient projected onto the MFEP:

0

Z

α0

n ˆ i (α) · ∇i A(z(α))dα

Ai (α ) =

(30)

0

ˆ (α) and ∇i A(z(α)) is the i-th Here n ˆ i (α) is the i-th component of the normal vector n component of the free energy gradient at z(α). Essentially the same approach was also used

14

ACS Paragon Plus Environment

Page 15 of 31 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

by Sanchez-Martines et al. in the swarms of trajectories string method. 29 In case of the adaptive string method the derivative of the free energy w.r.t. CVs is not explicitly known, since the bias is no longer oriented along the CVs. However, from Eq. (5) it follows that

∇A(z(α)) = c(α) · M−1 (z(α))ˆ n(α)

(31)

where c(α) a proportionality constant, that can be determined from the normalization condition (11), to be equal to the free energy gradient along the path:

ˆ T (α)∇A(z(α)) = c(α) = n

dA(α) dα

(32)

The free energy gradient can be obtained directly from the US windows when Umbrella Integration 28 is used or calculated numerically from a PMF integrated with WHAM. 27 Plugging Eqs. (31) and (32) into (30) the expression for the contribution of each CV to the PMF is obtained: 0

Z

α0

n ˆ i (α) ·

Ai (α ) = 0

 dA(α) · M−1 (z(α))ˆ n(α) i dα dα

(33)

Where (· · · )i denotes an i-th component of a vector.

The adaptive string method workflow The proposed method can be summarized in the following workflow: 1. Equilibrate the system with long unbiased MD simulation in reactants and products states. Note that the Hamiltonian has to be identical in both systems. Therefore, if a chemical transformation is studied, either a QM potential (or a hybrid QM/MM one) or an empirical model describing correctly bond breaking and formation is required. 2. Choose a set of CVs that are assumed to be necessary and sufficient to describe the process under study. Use the average values of the CVs in the equilibrated reactants and products states as the endpoints of the initial guess for the string. The rule of 15

ACS Paragon Plus Environment

The Journal of Physical Chemistry 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 16 of 31

thumb is to include as CVs the distances of all the bonds that are broken, formed or change their order during the process and the hybridization coordinates for all the atoms that change their hibridization state. For simple conformational transitions one might include all the dihedral angles that change significantly going from one conformation to another. 3. Prepare the initial structures for every node of the string. This can be done either by gradually increasing the string biasing potential (defined by parameters αi,init and k

Ki,init , Eqs. (23) and (24)) or with targeted MD 30 by steering the reactants structure towards the products one or vice versa. The former approach was used in all the exampled provided here. The initial path can be chosen as a straight line connecting reactants and products. However, it is generally a better approach to prepare an initial guess that roughly follows some putative reaction mechanism and to run separate simulations for each proposed mechanism. 4. Run the trajectories for all the string nodes z. During the simulation, calculate the biasing tensor (9) and use it as the biasing matrix to update the positions of the nodes k

with Eq. (7). Simultaneously, update the biasing parameters αi and Ki (Eqs. (20) and (21)). Replica exchange can be used to improve sampling. 5. Optionally, every X steps of simulation calculate a preliminary PMFs. The string nodes are moving towards the MFEP, so only the most recent data should be used to estimate the PMF. Although the statistical error will probably be too large, an approximate shape of the PMF might be helpful to check whether the simulation is giving adequate results and to identify possible problems with the simulation system. 6. Check the convergence of the string calculating the mean distance between the string nodes in the past and their current positions: n

C(t) =

1X |zi (t) − zi (t0 )|M(t0 ) n i=1 16

ACS Paragon Plus Environment

(34)

Page 17 of 31 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

where t0 is the current simulation time. When the string is converged, the convergence curve (34) reaches a plateau close to 0. 7. When the simulation converges, use the average position of the nodes over the plateau k

to obtain a smooth MFEP. Also, calculate average αi and Ki . 8. Finally, fix the string at the obtained MFEP and switch the string potential to the k

harmonic bias along the path CVs (25). Use the average αi and Ki (26) as the reference values and the force constants for the harmonic bias, respectively. Continue the simulation to obtain enough sampling for the final PMF calculation with Umbrella Integration or WHAM. Umbrella Integration was used here, since it is less sensitive to the sampling noise and provides an analytic expression for the statistical error. 31 9. If the string and the parameters converged well, the total sampling obtained from the PMF calculation is roughly uniform, and the structure does not exhibit large conformational “jumps” going from one string node (US window) to another, then the chosen set of CVs is likely adequate and the obtained results are credible. If not, go back to step 2, try to determine the missing essential CVs and repeat the calculations. The missing CVs might be detected by analizing the conformational discontinuities between nearby CV nodes, especially in the TS region(s).

Results and Discussion In the following section we apply the proposed adaptive string method (ASM) to three fundamentally different processes: a conformational transition from C7a to C7eq in alanine dipeptide (ADP), an SN 2 reaction between chlormethane and chlorine ion in water (ClCH3 Cl) and a pericyclic conversion of isochorismate to pyruvate and salicylate catalyzed by an enzyme isochorismate-pyruvate lyase (IPL) (Figure 2).

17

ACS Paragon Plus Environment

The Journal of Physical Chemistry 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

O

H N

H

ψ

ϕ

O

O

H

N

N

N H

O

Page 18 of 31

N N

H

O

H

O

Figure 2: Mechanisms of the three studied processes and the chosen CV spaces. Top to bottom: ClCH3 Cl, ADP, IPL. Distances, hybridization coordinates and dihedrals are represented with yellow rounded rectangles, blue circles and curved arrows, respectively.

MD and ASM setup All the simulations were performed using Amber12 32 package. The Amberff12SB 33 forcefield and TIP3P 34 water model were used. A hybrid QM/MM scheme with the AM1/Amberff12SB 35 Hamiltonian was used in examples 1 and 3. In all cases MD simulations were performed with a Langevin thermostat at 300K, Velocity Verlet integrator 36 and periodic boundary conditions with Particle Mesh Ewald 37 to treat long-range electrostatic interactions. Replica Exchange was applied during both the string optimization and the US with exchange attempts performed every 100 fs. Other parameters and system properties are given in Table 1. The same set of parameters required for the ASM was used in the three examples presented below (see Table 1). The parameters were adjusted to provide a compromise between the speed of convergence and the stability of simulations. US simulations were performed for 30 ps in all three cases. While it is a common approach to maintain the system close to the path 13,14 or to ignore the entropy in orthogonal directions, 7,8 up to our knowledge the effect of such a bias on the 18

ACS Paragon Plus Environment

Page 19 of 31 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

PMF has never been explicitly tested. Since in the adaptive string method the restriction of the system in the directions orthogonal to the string is controlled explicitly by K ⊥ we have also tested the effect of this parameter on the resulting PMF: the US caculations for each system were performed with and without biasing potential along coordinate z. ADP conformational change was described with two CVs, the backbone dihedral angles φ and ψ. For the ClCH3 Cl reaction the breaking and forming distance as well as the hybridization coordinate of the methyl group were used. The hybridization coordinate was calculated as the signed point plane distance between the carbon atom and the plane formed by three hydrogens. The value is negative when the methyl group “umbrella” is pointing to the reactants chlorine and positive when it is inversed. For the IPL system a set of 9 CVs was used, consisting of the 6 distances of the 6-membered cycle formed in the TS (Figure 2) and three hybridization coordinates for the carbon atoms that change their hybridization during the reaction. This set was shown elsewhere to be sufficient to define a good RC. 24,25 Table 1: Simulation parameters. The same set of parameters was used in all ASM simulations. χ = [a.m.u.1/2 ·˚ A]

System ClCH3 Cl ADP IPL

System and MD parameters Box size QM timestep Atoms nodes Atoms (fs) (˚ A) 26x24x23 1407 6 1.0 50 27x29x25 1912 – 1.0 79x79x79 50025 24 0.5

γ (ps−1 ) 2 · 103

ASM parameters γα K⊥ (kcal·mol−1 · χ−2 ) (kcal·mol−1 · χ−2 ps) 2 · 102

2 · 105

γK (ps)

κ (χ−4 )



5

6 · 102

100

Convergence and PMF precision The results for the string convergence and PMF calculations are provided in Table 2. The convergence curves are shown in Figure 3. The string converged after 5, 10 and 15 ps for ClCH3 Cl, IPL and ADP systems, respectively. It is not surprising that the convergence took somewhat longer for the conformational change of ADP, since this process has more diffusive character. During the US stage, in the three cases, 30 ps of sampling with 50 windows was enough to reduce the statistical error to the order of 1 kcal/mol. The sampling along the

19

ACS Paragon Plus Environment

The Journal of Physical Chemistry

MFEP is in all cases reasonably uniform (Figure 4), justifying the parameter optimization procedure introduced in section ”Optimization of αi and K k ”. Table 2: Results of the string optimization and US along the converged string. The convergence time includes 5 ps used to obtain average path and parameters. The PMF and the statistical error were obtained with Umbrella Integration. The error is calculated as 95% confidence interval, assuming normal distribution. *For IPL, complete removal of the potential Vb made simulations unstable, so a much softer potential with K ⊥ = 10 kcal · mol −1 · χ−2 was used instead. Convergence time (ps) US time (ps) ClCH3 Cl ADP IPL

10 25 15

30 30 30

Free Energy Barrier (kcal/mol) With Vb (z) Without Vb (z) Literature 24.0 ± 1.3 23.6 ± 1.3 24.95 38 8.6 ± 0.8 7.9 ± 0.9 8.7 14 38.2 ± 1.2 36.2∗ ± 1.1 37.8 25

1.5

A) C (a.m.u.1 2 ⋅ °

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 20 of 31

ClCH3Cl ADP IPL

1.0

0.5

0.0 0

5

10

15

20

25

t (ps) Figure 3: Convergence curves. The convergence is calculated with Eq. (34).

Impact of the Vb (z) potential The barriers obtained both with and without Vb (z) for the three processes are consistent with previous studies on the same systems, 7,14,24,25,38–40 although direct comparison was not 20

ACS Paragon Plus Environment

Page 21 of 31

0.25

ClCH3Cl ADP IPL

0.20 0.15

ρ

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

0.10 0.05 0.00 0

2

4

s (a.m.u.

1 2

⋅° A)

6

8

Figure 4: Sampling along the MFEP in US calculation. ρ is the probability density to find the system at a given value of s in any window. always possible because of the differences in the potential energy functions used in different works. Restraining the system to the vicinity of the path was crucial to obtain an adequate PMF in the case of ADP (Figure 5). When the restraint was not present, the system explored another minimum during the simulations, C5 (see SI, Figure S7). As a result, a deep well appears in the obtained PMF. The result is not surprising – the same observation was mentioned by Leines et al. 14 to justify the need for a restraint orthogonal to the path. When the additional bias Vb (z) is applied, the simulation does not sample the third minimum and the correct PMF for the conformational transition from C7a to C7eq in ADP is obtained. Moreover, Vb (z) was necessary for the stability during the simulation of IPL. Because of the presence of a very shallow TS 25 this system is able to sample regions far from the path, causing numerical instabilities in the path CV restraint. 2 The introduction of a stronger Vb (z) bias solves this issue and improves the ∆G‡ estimate. These two examples illustrate that the addition of a restraint to keep the system close to the path can be useful to improve the stability and convergence of the simulations without impairing much or even improving the resulting free energy profile. 21

ACS Paragon Plus Environment

The Journal of Physical Chemistry

On the other hand, when Vb (z) is applied the system is forced to the vicinity of the converged path. Therefore it has no chance to sample unexpected regions of the conformational space which, in turn, could be useful to propose alternative reaction mechanisms. For instance, if one were unaware of the existence of the C5 minimum in ADP and performed the study always using Vb (z), the existence of this conformation would be ignored. So, in general, we suggest that a good practice would be to perform simulations first without Vb (z). However, if no extra knowledge is gained and the system becomes unstable, Vb (z) can be then safely applied. 10 Without V b (z ) With V b (z )

8

A (kcal mol)

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 22 of 31

6 4 2 0 −2 −4 0

1

2

3

4

s (a.m.u.

5 1 2

6

7

8

°) ⋅A

Figure 5: PMF for ADP system obtained with and without Vb (z). The shaded areas correspond to 95% confidence interval for the PMFs.

Decomposition of the PMF This feature of the method is illustrated with an analysis of the ClCH3 Cl results. Figure 6 shows the variation of 3 of the CVs employed in this case (the distances associated to the formed and broken bonds and the hybridization coordinate of the transferred methyl group) and the free energy change associated to each one.

22

ACS Paragon Plus Environment

Page 23 of 31

25

total PMF broken bond formed bond hybridization

A (kcal mol)

15

5

−5

−15 4

3

d (° A)

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

2

1

0

−1 0

1

2

3

s (a.m.u.

4

1 2

⋅° A)

5

6

7

Figure 6: Decomposition of the PMF for the nucleophilic attack of chloride to CH3 Cl (top) and variation of the main CVs along the MFEP (bottom). The shaded area corresponds to 95% confidence interval for the PMF. As expected, the SN 2 reaction is described as an associative process. According to the changes observed in main reaction distances along the MFEP (Figure 6, bottom), the nucleophile first approaches the methyl group and afterwards the bond with the leaving group is broken, provoking the lengthening of the corresponding distance. The hybridization coordinate only changes significantly around the TS, where the distances to the nucleophile and leaving groups change simultaneously. 23

ACS Paragon Plus Environment

The Journal of Physical Chemistry 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The PMF decomposition presented here (Figure 6, top) must not be seen as a measure of ‘the “bond energy”. This decomposition provides information regarding along which coordinates the energy is “pumped” to the system by thermal fluctuations to reach the TS and along which ones the excess energy is released afterwards. According to the free energy change associated to each CV when the existing C − Cl bond starts to break, a free energy cost of ≈ 7 kcal/mol is already paid for pushing the nucleophilic chloride closer to CH3 Cl (first vertical line in Figure 6). Next, a synchronous antisymmetric transfer happens where the free energy of the system increases along the two distances associated to the nucleophile and the leaving group until the TS is reached. The energy that would be released forming a new bond is used to compensate the energy required to break the existing one and as a result the two bond distances are energetically coupled. After crossing the TS the behavior of the bond distances to the nucleophile and the leaving group is swapped. The energy released by the newly formed bond (up to the second vertical line in Figure 6) corresponds exactly to the energy accumulated along the broken bond before the TS (from the first vertical line to the TS). And viceversa, the energy released along the broken bond matches the energy acquired when the nucleophile approached the CH3 Cl moiety. Note that the free energy associated to the hybridization coordinate is essentially negligible. This is an indicator that this coordinate is not really necessary to define the MFEP, and it could be safely removed from the chosen CV space.

Conclusions Free energy surfaces are an essential tool for the detailed comprehension of chemical process. However, the multidimensional nature of most chemical problems makes their complete determination an unaffordable problem from the computational point of view. Instead, a reasonable option consists in the exploration of those regions that are of interest and, in particular, the MFEP. We have here presented a modified version of the string method,

24

ACS Paragon Plus Environment

Page 24 of 31

Page 25 of 31 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

named as the adaptive string method (ASM), to determine the MFEP using an arbitrarily large number of collective variables. The main advantadges of this new version are: i) the reduction in the number of parameteres that must be manually selected by the user, ii) an on-the-fly optimization of the parameters to accelerate the convergence and iii) the use of the sampling obtained during the string optimization for the evaluation of the approximate free energy profile along a path CV. The method has been implemented in the Amber package and is available from the authors upon request. The current implementation has been used to study a SN 2 reaction in solution, a conformation transformation of alanine dipeptide and an enzymatic reaction. In all the three cases we used the same set of initial parameters with some of them being optimized during the simulations to accelerate the convergence. This feature reduces significantly the complexity of this method, facilitating its use in a large number of cases without knowing a priori the optimum values for the set of parameters needed. We also discussed different features of the method, such as the use of a biasing potential along the direction orthogonal to the path and the decomposition of the PMF. The improved convergence of the method makes it very attractive from the computational point of view, facilitating the simulation of larger and more complex systems and/or the use of computationaly more expensive (and probably more reliable) potentials. The ease of use also makes this method an interesting option from the point of view of the user, specially for non-experts in the field of free energy simulations.

Acknowledgements The authors gratefully acknowledge financial support from Ministerio de Econom´ıa y Competitividad (project CTQ2015-66223-C2-2-P). K. Z. acknowledges a FPU fellowship from Ministerio de Educaci´on . The authors acknowledge computational facilities of the Servei d’Inform`atica de la Universitat de Val`encia in the ‘Tirant’ supercomputer.

25

ACS Paragon Plus Environment

The Journal of Physical Chemistry 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Supporting Information Figures with the convergence of the strings, decomposition of the PMFs for the ADP and IPL cases, and the conformational space sampled in ADP simulations are provided as Supporting Information. This material is available free of charge via the Internet at http://pubs.acs.org.

References (1) Peters, B. Reaction Rate Theory and Rare Events Simulations; Elsevier, 2017. (2) Zinovjev, K.; Tu˜ n´on, I. Reaction coordinates and transition states in enzymatic catalysis. WIREs Comput. Mol. Sci. e1329–n/a. (3) K¨astner, J. Umbrella sampling. WIREs Comput. Mol. Sci. 2011, 1, 932–942. (4) Barducci, A.; Bonomi, M.; Parrinello, M. Metadynamics. WIREs Comput. Mol. Sci. 2011, 1, 826–843. (5) Li, W.; Ma, A. Recent developments in methods for identifying reaction coordinates. Mol. Simul. 2014, 40, 784–793. (6) Venturoli, M.; Vanden-Eijnden, E.; Ciccotti, G. Kinetics of phase transitions in two dimensional Ising models studied with the string method. J. Math. Chem. 2009, 45, 188–222. (7) Maragliano, L.; Fischer, A.; Vanden-Eijnden, E.; Ciccotti, G. String method in collective variables: Minimum free energy paths and isocommittor surfaces. J. Chem. Phys. 2006, 125, 024106. (8) Maragliano, L.; Vanden-Eijnden, E. On-the-fly string method for minimum free energy paths calculation. Chem. Phys. Lett. 2007, 446, 182 – 190.

26

ACS Paragon Plus Environment

Page 26 of 31

Page 27 of 31 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

(9) Pan, A. C.; Sezer, D.; Roux, B. Finding Transition Pathways Using the String Method with Swarms of Trajectories. J. Phys. Chem. B. 2008, 112, 3432–3440. (10) Vanden-Eijnden, E.; Venturoli, M. Revisiting the finite temperature string method for the calculation of reaction tubes and free energies. J. Chem. Phys. 2009, 130, 194103. (11) Zhao, R.; Shen, J.; Skeel, R. D. Maximum Flux Transition Paths of Conformational Change. J. Chem. Theory Comput. 2010, 6, 2411–2423. (12) Johnson, M. E.; Hummer, G. Characterization of a Dynamic String Method for the Construction of Transition Pathways in Molecular Reactions. J. Phys. Chem. B. 2012, 116, 8573–8583. (13) Dickson, B. M.; Huang, H.; Post, C. B. Unrestrained Computation of Free Energy along a Path. J. Phys. Chem. B. 2012, 116, 11046–11055. (14) D´ıaz Leines, G.; Ensing, B. Path Finding on High-Dimensional Free Energy Landscapes. Phys. Rev. Lett. 2012, 109, 020601. (15) Maragliano, L.; Roux, B.; Vanden-Eijnden, E. Comparison between Mean Forces and Swarms-of-Trajectories String Methods. J. Chem. Theory Comput. 2014, 10, 524–533. (16) Schenter, G. K.; Garrett, B. C.; Truhlar, D. G. Generalized transition state theory in terms of the potential of mean force. J. Chem. Phys. 2003, 119, 5828–5833. (17) Branduardi, D.; Gervasio, F. L.; Parrinello, M. From A to B in free energy space. J. Chem. Phys. 2007, 126 . (18) Sugita, Y.; Kitao, A.; Okamoto, Y. Multidimensional replica-exchange method for freeenergy calculations. J. Chem. Phys. 2000, 113, 6042–6051. (19) Sabri Dashti, D.; Roitberg, A. E. Optimization of Umbrella Sampling Replica Exchange Molecular Dynamics by Replica Positioning. J. Chem. Theory Comput. 2013, 9, 4692– 4699. 27

ACS Paragon Plus Environment

The Journal of Physical Chemistry 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

(20) Zinovjev, K.; Ruiz-Pern´ıa, J. J.; Tu˜ n´on, I. Toward an Automatic Determination of Enzymatic Reaction Mechanisms and Their Activation Free Energies. J. Chem. Theory Comput. 2013, 9, 3740–3749. (21) Aranda, J.; Zinovjev, K.; Roca, M.; Tu˜ n´on, I. Dynamics and Reactivity in Thermus aquaticus N6-Adenine Methyltransferase. J. Am. Chem. Soc. 2014, 136, 16227–16239. (22) Manna, R. N.; Zinovjev, K.; Tu˜ no´n, I.; Dybala-Defratyka, A. Dehydrochlorination of Hexachlorocyclohexanes Catalyzed by the LinA Dehydrohalogenase. A QM/MM Study. J. Phys. Chem. B. 2015, 119, 15100–15109. ´ (23) Aranda, J.; Zinovjev, K.; Swiderek, K.; Roca, M.; Tu˜ n´on, I. Unraveling the Reaction Mechanism of Enzymatic C5-Cytosine Methylation of DNA. A Combined Molecular Dynamics and QM/MM Study of Wild Type and Gln119 Variant. ACS Catal. 2016, 6, 3262–3276. (24) Zinovjev, K.; Tu˜ no´n, I. Transition state ensemble optimization for reactions of arbitrary complexity. J. Chem. Phys. 2015, 143, 134111. (25) Zinovjev, K.; Tu˜ n´on, I. Exploring chemical reactivity of complex systems with pathbased coordinates: Role of the distance metric. J. Comput. Chem. 2014, 35, 1672–1681. (26) Zwanzig, R. W. High-Temperature Equation of State by a Perturbation Method. I. Nonpolar Gases. J. Chem. Phys. 1954, 22, 1420–1426. (27) Kumar, S.; Rosenberg, J. M.; Bouzida, D.; Swendsen, R. H.; Kollman, P. A. THE weighted histogram analysis method for free-energy calculations on biomolecules. I. The method. J. Comput. Chem. 1992, 13, 1011–1021. (28) K¨astner, J.; Thiel, W. Bridging the gap between thermodynamic integration and umbrella sampling provides a novel analysis method: “Umbrella integration”. J. Chem. Phys. 2005, 123, 144104. 28

ACS Paragon Plus Environment

Page 28 of 31

Page 29 of 31 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

(29) Sanchez-Martinez, M.; Field, M.; Crehuet, R. Enzymatic Minimum Free Energy Path Calculations Using Swarms of Trajectories. J. Phys. Chem. B. 2015, 119, 1103–1113. (30) Schlitter, J.; Engels, M.; Kr¨ uger, P. Targeted molecular dynamics: A new approach for searching pathways of conformational transitions. J. Mol. Graphics 1994, 12, 84 – 89. (31) K¨astner, J.; Thiel, W. Analysis of the statistical error in umbrella sampling simulations by umbrella integration. J. Chem. Phys. 2006, 124, 234106. (32) Case, D. A.; Darden, T. A.; Cheatham, T. E.; Simmerling, C. L.; Wang, J.; Duke, R. E.; Luo, R.; Walker, R. C.; Zhang, W.; Merz, K. M. et al. AMBER 12 ; University of California, San Francisco, 2012. (33) Duan, Y.; Wu, C.; Chowdhury, S.; Lee, M. C.; Xiong, G.; Zhang, W.; Yang, R.; Cieplak, P.; Luo, R.; Lee, T. et al. A point-charge force field for molecular mechanics simulations of proteins based on condensed-phase quantum mechanical calculations. J. Comput. Chem. 2003, 24, 1999–2012. (34) Jorgensen, W. L.; Chandrasekhar, J.; Madura, J. D.; Impey, R. W.; Klein, M. L. Comparison of simple potential functions for simulating liquid water. J. Chem. Phys. 1983, 79, 926–935. (35) Dewar, M. J. S.; Zoebisch, E. G.; Healy, E. F.; Stewart, J. J. P. Development and use of quantum mechanical molecular models. 76. AM1: a new general purpose quantum mechanical molecular model. J. Am. Chem. Soc. 1985, 107, 3902–3909. (36) Swope, W. C.; Andersen, H. C.; Berens, P. H.; Wilson, K. R. A computer simulation method for the calculation of equilibrium constants for the formation of physical clusters of molecules: Application to small water clusters. J. Chem. Phys. 1982, 76, 637–649. (37) Darden, T.; York, D.; Pedersen, L. Particle mesh Ewald: An N·log(N) method for Ewald sums in large systems. J. Chem. Phys. 1993, 98, 10089–10092. 29

ACS Paragon Plus Environment

The Journal of Physical Chemistry 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

(38) Bash, P. A.; Field, M. J.; Karplus, M. Free energy perturbation method for chemical reactions in the condensed phase: a dynamic approach based on a combined quantum and molecular mechanics potential. J. Am. Chem. Soc. 1987, 109, 8092–8094. (39) Mart´ı, S.; Andr´es, J.; Moliner, V.; Silla, E.; Tu˜ no´n, I.; Bertr´an, J. Mechanism and Plasticity of Isochorismate Pyruvate Lyase: A Computational Study. J. Am. Chem. Soc. 2009, 131, 16156–16161. (40) Zinovjev, K.; Mart´ı, S.; Tu˜ n´on, I. A Collective Coordinate to Obtain Free Energy Profiles for Complex Reactions in Condensed Phases. J. Chem. Theory Comput. 2012, 8, 1795–1801.

30

ACS Paragon Plus Environment

Page 30 of 31

Page 31 of 31 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

TOC graphic.

31

ACS Paragon Plus Environment