Subscriber access provided by The University of British Columbia Library
Article
Intermediate Thermodynamic States Contribute Equally to Free Energy Convergence: A Demonstration with Replica Exchange Trung Hai Nguyen, and David D. L. Minh J. Chem. Theory Comput., Just Accepted Manuscript • DOI: 10.1021/acs.jctc.6b00060 • Publication Date (Web): 07 Apr 2016 Downloaded from http://pubs.acs.org on April 13, 2016
Just Accepted “Just Accepted” manuscripts have been peer-reviewed and accepted for publication. They are posted online prior to technical editing, formatting for publication and author proofing. The American Chemical Society provides “Just Accepted” as a free service to the research community to expedite the dissemination of scientific material as soon as possible after acceptance. “Just Accepted” manuscripts appear in full in PDF format accompanied by an HTML abstract. “Just Accepted” manuscripts have been fully peer reviewed, but should not be considered the official version of record. They are accessible to all readers and citable by the Digital Object Identifier (DOI®). “Just Accepted” is an optional service offered to authors. Therefore, the “Just Accepted” Web site may not include all articles that will be published in the journal. After a manuscript is technically edited and formatted, it will be removed from the “Just Accepted” Web site and published as an ASAP article. Note that technical editing may introduce minor changes to the manuscript text and/or graphics which could affect content, and all legal disclaimers and ethical guidelines that apply to the journal pertain. ACS cannot be held responsible for errors or consequences arising from the use of information contained in these “Just Accepted” manuscripts.
Journal of Chemical Theory and Computation is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.
Page 1 of 27
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Journal of Chemical Theory and Computation
Intermediate Thermodynamic States Contribute Equally to Free Energy Convergence: A Demonstration with Replica Exchange Trung Hai Nguyen and David D. L. Minh∗ Department of Chemistry, Illinois Institute of Technology, Chicago, IL 60616, USA E-mail:
[email protected] Abstract We investigate the relationship between the number of intermediate thermodynamic states along a pathway and the precision of free energy estimates. With a sufficient number of states, the asymptotic variance collapses as a function of the total sample size. Our analytical result is corroborated by replica exchange molecular dynamics simulations of model systems in which the neighbor exchange rate exceeds 35%. Precision collapse is also observed in heat capacity estimates based on the multi-state Bennett acceptance ratio. In contrast to the relaxation and mean first passage times, the autocorrelation time of state indices is found to be relevant to free energy convergence.
∗
To whom correspondence should be addressed
1
ACS Paragon Plus Environment
Journal of Chemical Theory and Computation
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
1
Page 2 of 27
Introduction
When estimating free energy differences, it is frequently advantageous to simulate a series of intermediate thermodynamic states (or for brevity, states) along a pathway. If parameters specifying the two states are sufficiently different, then the most probable regions of configuration space will also be distinct. Consequently, the variance of potential energy differences will be large and many samples will be required to obtain converged free energy estimates. 1 Because free energy is a state function, the same quantity may be estimated by summing up free energy differences between a series of similar states. Staged free energy calculations often lead to more precise estimates in the same amount of computer time. Simulating multiple states can also accelerate configurational sampling. A common objective of molecular simulation is to sample configurations x from the Boltzmann distribution R π(x) = Z −1 q(x), where Z = dx q(x) is the partition function, q(x) = exp [−u(x)] is the
unnormalized probability density, and u(x) is the reduced potential. 2 While its precise form depends on the ensemble, the reduced potential generally involves functions of x and a set of thermodynamic parameters. The former may include the potential energy U (x), the volume
V (x), and the particle number ni (x) for each component i = 1, ..., M . Parameters often include the inverse temperature β = (kB T )−1 , the pressure p, and chemical potential µi . With the aforementioned terms, the reduced potential is, "
u(x) = β U (x) + pV (x) +
X
#
µi ni (x) .
i
(1)
In systems featuring rugged energy landscapes where local minima are separated by high barriers, however, Markov chain Monte Carlo and molecular dynamics simulations frequently become trapped in local minima. Rather than broadly exploring configuration space, the simulations generate highly correlated samples. Because modifying thermodynamic parameters or the potential energy can make energy landscapes smoother, a random walk in state space (as well as configuration space) facilitates barrier crossing. 2
ACS Paragon Plus Environment
Page 3 of 27
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Journal of Chemical Theory and Computation
State-space random walks are achieved through a broad class of methods known as generalized 3 or extended 4 ensemble algorithms. A prominent example is replica exchange (RE), which is widely used in simulations of disordered spin systems 5–8 and biomolecules. 9–11 The defining feature of RE is a periodic attempt to exchange states between multiple copies, or replicas, of a system. 5 While many schemes are possible, 12 the most common approach is to attempt exchanges between neighboring states with the most similar parameter values. An attempted exchange between replica l with configuration x and replica m with configuration y is accepted with probability, qm (y)ql (x) w = min 1, . qm (x)ql (y)
(2)
Due to these exchanges, each replica is likely to contribute to multiple states. Subsequent analyses, e.g. estimation of thermodynamic expectations, usually involve gathering samples from multiple replicas that correspond to each state of interest. When RE was originally conceived, parameters for each state would only differ in temperature (parallel tempering). High-temperature states facilitate barrier crossing while lowtemperature states emphasize sampling within local minima. RE is also increasingly employed to enhance sampling for states with different Hamiltonians (Hamiltonian RE, or HRE), 13–17 including staged free energy calculations. In a binding free energy calculation, for example, the pathway often includes a decoupled state in which a ligand and receptor are not interacting, alchemical intermediates with physically unrealistic interactions, and a physically realistic fully-coupled state. Finally, while most simulations involve states along a defined pathway, it is possible to perform RE in multiple dimensions. 18,19 The principles underlying the selection of intermediate states remains an active area of research. Efficiency metrics applied to single Markov chains, such as the effective sample size, 20 do not readily apply to multiple Markov chains with complex correlations. Because there is no consensus for how efficiency should be quantified, 21 researchers have selected
3
ACS Paragon Plus Environment
Journal of Chemical Theory and Computation
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
intermediate states by optimizing a number of different objectives. A number of studies, for example, have tuned the temperature distribution to achieve uniform neighbor exchange probabilities. 22–28 Generalizing beyond parallel tempering, Shenfeld et al. 25 showed that the asymptotic variance of free energy differences is minimized and the neighbor exchange probability is maximized when states are equally spaced in thermodynamic length. Building on these results, others have optimized the position and force constants of harmonic restraints in umbrella sampling. 29,30 The present work focuses on selecting the number of states, K. According to several independent lines of thought, there is an optimal number of states. In free energy calculations, it is often suggested (e.g. Chipot and Pohorille 31 ) that the number of states should balance the precision of pairwise free energies and the extra sampling required within intermediates. This suggestion is corroborated by efficiency metrics based on RE indices. The mean square displacement of the state index is maximized with an exchange rate of around 20%, diminishing with larger K. 32 A similar trend occurs with the round-trip time, which is maximized with an exchange rate of around 45%. 33,34 A common interpretation of these results is that sampling from too many states will degrade statistical sampling. 23,29,33,34 Finally, Chandler 35 argued that stratified sampling is most efficient when there are many states with narrow windows (outside of which the potential is infinite), but that windows must be large enough such that Monte Carlo moves have a reasonable acceptance rate. Others have suggested that there is a minimum but no optimal number of states. In the special case of asymptotically large K and sample size, and when states are equally separated in thermodynamic length, Shenfeld et al. 25 showed that the variance of free energy estimates does not depend on the number of states. Rather, it is inversely proportional to the total sample size, N . With more stringent error analysis than Chandler 35 , Virnau and Muller 36 and Thiede et al. 37 determined that umbrella sampling error is independent of the number of windows. As there is no universal standard for quantifying the efficiency of molecular simulation,
4
ACS Paragon Plus Environment
Page 4 of 27
Page 5 of 27
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Journal of Chemical Theory and Computation
an appropriate approach to choosing K depends on the objective being optimized. Because free energy estimates are sensitive to the quality of configurational sampling, 1 we posit that the precision of these calculations is a valuable metric for simulation efficiency. Herein, we extend Shenfeld et al. 25 to show that free energy convergence depends on the total number of samples regardless of thermodynamic length considerations. We also empirically investigate the standard deviation of free energy and heat capacity estimates across a broad range of K. Finally, we consider how different replica exchange efficiency metrics relate to the convergence of these estimates.
2
Theory
We first derive the asymptotic variance for a specific estimator and class of protocols, and then generalize the result. Suppose that we have independent and identically distributed samples xki , in which
k ∈ {1, ..., K} is the state index and i ∈ {1, ..., Nk } is the sample index. Nk is the number PK of samples drawn from state k, and thus the total number of samples is N = k=1 Nk . PK−1 Similarly, ∆fk is the free energy difference between states k and k + 1, and ∆f = k=1 ∆fk is the free energy difference between states 1 and K. Using the exponential average 38 (EXP) for ∆fk , a simple estimator for ∆f is, c =− ∆f
K−1 X
ln exp (−∆uk ),
(3)
k=1
where ∆uk = uk+1 − uk and the overbar denotes the arithmetic average over the samples collected at state k. If there is no covariance between state pairs, then by error propagation
5
ACS Paragon Plus Environment
Journal of Chemical Theory and Computation
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
c is, the asymptotic variance of ∆f c] = σ [∆f 2
K−1 X k=1
σ
2
h
ln exp (−∆uk )
Page 6 of 27
i
i h σ 2 exp (−∆uk ) = 2 k=1 exp (−∆uk ) 2 K−1 X exp −∆uk σ 2 [∆uk ] k , = N k exp (−∆u ) k=1 k K−1 X
(4)
where σk2 [·] is the variance of · in state k. (Note that in RE, the assumption of independence is untrue; there is always some covariance among states.) Consider the special case (which is common in free energy calculations) where the reduced potential energy has the form,
uk (x) = u0 (x) + λαk W (x),
(5)
where u0 (x) is invariant to the state, λk is a scaling factor, and α is a positive integer exponent. We choose λk to be equidistant points on the interval [λ1 , λK ] such that δ = λk+1 − λk is constant. The reduced potential energy difference is then, ∆uk = [(λk + δ)α − λαk ] W (x).
(6)
In the limit of large K, δ and ∆uk are small. The first quotient in the summation of 1 − ∆uk exp(−∆uk ) = 1, and the potential energy difference in ≈ Eq. (4) goes to one, 1 − ∆uk exp(−∆uk ) W (x). 39 Therefore, in the limit where K is Eq. (6) approximately becomes ∆uk ≈ αδλα−1 k large, the variance approaches, c ] = α2 δ 2 σ [∆f 2
K−1 X k=1
6
2(α−1) 2 σk [W ]
λk
Nk
ACS Paragon Plus Environment
.
(7)
Page 7 of 27
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Journal of Chemical Theory and Computation
To consider the effect of total sample size on the variance, let the ratio rk = Nk /N be a constant. This assumption will hold true in RE, where rk = 1/K, but not necessarily in other generalized ensemble calculations. Replacing Nk with rk N allows 1/N to be factored from each term in the summand. We then obtain, K−1 2(α−1) 2 α 2 δ 2 X λk σk [W ] c σ [∆f ] = . N k=1 rk 2
(8)
c ] is inversely Since all other terms are constant with respect to N , then the variance σ 2 [∆f
proportional to the total sample size N .
How is Eq. (7) affected by changing K but keeping N constant? Increasing K by a factor of γ (K ′ = γK) leads to a decrease in δ such that δ ′ = δ/γ. To maintain the total number of samples, then we need to reduce Nk such that Nk′ = Nk /γ. Under these conditions, the variance of the estimated free energy difference is, γK−1 2(α−1) 2 α 2 δ 2 X λk ′ σk′ [W ] c σ [∆f ] = 2 . γ k′ =1 Nk ′ 2
(9)
Comparing Eqs. (7) and (9), the sum in Eq. (9) has γ times more terms. The summation index is k ′ instead of k to indicate that the terms in Eq. (9) are generally different from those in Eq. (7). However, when the states are densely distributed, the numerator of each term indexed by k in Eq. (7) has γ terms in Eq. (9) which are very close to it. Expressed 2(α−1) 2 σk [W ]
mathematically, λk 2(α−1) 2 σk [W ]/Nk
λk
2(α−1) 2 σk′ [W ]
≈ λk ′
2(α−1) 2 σk′ [W ]/Nk .
≈ γλk′
when λk ≈ λk′ . Accounting for sample sizes,
Based on this approximation, the value of the sum
in Eq. (9) is γ 2 times that of Eq. (7), i.e., γK−1
K−1 X λ2(α−1) X λ2(α−1) σ 2 [W ] σk2′ [W ] k k k′ ≈ γ2 , ′ N N k k ′ k=1 k =1
(10)
and thus the two expressions are equal. Therefore, when K is sufficiently large, the variance in estimated free energies no longer depends on K. 7
ACS Paragon Plus Environment
Journal of Chemical Theory and Computation
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Page 8 of 27
Eq. (7) is closely related to results from Shenfeld et al. 25 . By applying the Cramér-Rao lower bound 40 to find the minimum variance of ∆fk , Shenfeld et al. 25 derived a lower bound c ]. Equality with the lower bound is achieved by the minimum variance unbiased for σ 2 [∆f estimator, which is EXP, Eq. (3), when using samples from a single state. 41 Eq. (6) from Shenfeld et al. 25 was derived assuming the Gibbs ensemble and estimators using samples from a single state. If α = 1 and a single thermodynamic parameter is varied, Eq. (7) directly maps onto the lower bound in their expression. Now consider the general case, when the potential energy is not necessarily described by Eq. (5) and δ is not necessarily constant. Taking the Fisher information contained in samples from a pair of states, 41 Shenfeld et al. 25 applied the Cramér-Rao lower bound and made a few simplifying approximations. The approximations were based on series expansions that assume small perturbations to the difference in likelihood between states k and k + 1. Combining their Eqs. (9) and (10), and summing over states, we obtain,
dk ] ≥ σ [∆f 2
K−1 X k=1
2 σk2 [uk+1 − uk ] + σk+1 [uk+1 − uk ] . Nk + Nk+1
(11)
We can analyze Eq. (11) with closely analogous logic to our analysis of Eq. (7). That is, if the ratio rk′ = (Nk +Nk+1 )/N is constant, then the variance is inversely proportional to the total sample size. If K is scaled by γ, then N can be held constant by scaling (Nk + Nk+1 ) by 1/γ. Based on a first-order expansion of the potential energy difference, uk+1 − uk is dk ] is invariant to changes in K. Our results are more also scaled by 1/γ such that σ 2 [∆f
general than Eq. (8) of Shenfeld et al. 25 because we do not require states to be equidistant in thermodynamic length. A priori, it is unclear how the analytical results apply to practical simulations. First of all, the derivation is based on assumptions that are not satisfied in most circumstances. For example, samples are not independent and identically distributed in Markov chain Monte Carlo and molecular dynamics simulations. It is also unclear how K affects correlations
8
ACS Paragon Plus Environment
Page 9 of 27
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Journal of Chemical Theory and Computation
between walkers. Secondly, the analytical results are based on asymptotically large K. It is dk ] is a function of N and invariant to K. unclear what K is sufficiently large such that σ 2 [∆f Finally, the analysis is based on the minimum variance unbiased estimator for a pair of
states, which Shirts et al. 41 showed to be the Bennett Acceptance Ratio (BAR). The BAR c = fbl − fbk , is obtained by estimate of the free energy difference between states k and l, ∆f
finding the zero of, Nk X n=1
1 1 + exp[M +
∆u(xkn )
c] − ∆f
−
Nl X n=1
1 c )] 1 + exp[−(M + ∆u(xln ) − ∆f
= 0,
(12)
where ∆u(x) = ul (x) − uk (x) and M = ln NNkl . While BAR is optimal for pairs of states, we can employ the multi-state Bennett acceptance ratio 2 (MBAR), which is an optimal estimator for many states. The MBAR free energy estimate is given by the solutions to, fbk = − ln
Nl K X X l=1 n=1
PK
m=1
exp[−uk (xln )] , Nm exp[fbm − um (xl )]
(13)
n
where fˆk is the estimate of the dimensionless free energy, up to an additive constant. If the assumptions hold, then a plot of the standard deviation versus N on a logarithmic scale should be linear with a slope of − 12 . Assuming that rk′ = (Nk + Nk+1 )/N is constant and taking the square root and natural logarithm of both sides of Eq. 11 yields, K−1 X 1 2 dk ] = − 1 ln N + ln ln σ[∆f σk2 [uk+1 − uk ] + σk+1 [uk+1 − uk ] . ′ 2 r k=1 k
3
(14)
Numerical Methods
To determine whether the analytical results hold in practical simulations, we conducted RE simulations for two model systems (Fig. 1) and three thermodynamic processes.
9
ACS Paragon Plus Environment
Journal of Chemical Theory and Computation
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 ACS Paragon Plus Environment
Page 10 of 27
Page 11 of 27
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Journal of Chemical Theory and Computation
f (x) was the distance between the center of mass of CB7 and that of B5, κ = 150 kJ mol−1 nm−2 , and λk were equally spaced between 0 to 2 nm. Furthermore, a flat-bottom harmonic potential was applied to restrain the center of mass of B5 within a cylinder of radius 0.2 nm aligned along the principal axis of CB7. In HRE simulations, the temperature of the system was kept at 300 K using the Langevin thermostat. For parallel tempering, we simulated deca-alanine with the temperature Tk geometrically distributed between 300 to 800 K. The number of states, K, was varied over a wide range. For deca-alanine HRE and parallel tempering, the number of states was chosen to be K = 8, 16, 24, 32, 64, or 128. The total simulation length was kept fixed at K · t = 12.8 µs, where t is the simulation time for each replica. For simulations of the CB7-B5 complex, the number of states was chosen to be K = 12, 24, 48, 72, or 96. The total simulation length was K · t = 19.2 µs. For each process and value of K, 100 independent RE simulations were conducted starting from random initial configurations. Neighbor exchange was attempted and samples were collected every 1 ps. Simulations were performed with the Molecular Modeling Toolkit. 48 Free energies were estimated using EXP, 38 BAR, 49 or MBAR. 2 The heat capacity (variance of the potential energy) in state 1 was estimated using samples from state 1 or using MBAR, 2 which is based on samples from all states. The sample standard deviation for each system and each value of K was estimated based on the 100 independent simulations.
3.3
RE efficiency metrics
RE has been assessed by a variety of mixing efficiency metrics. 7,12,32–34,50,51 Of these, we calculated four: 1. the neighbor exchange rate; 2. the autocorrelation time of the replica state index, τac ; 3. the relaxation time from the empirical transition matrix, τ2 ; 4. and the mean first-passage time, τp . 11
ACS Paragon Plus Environment
Journal of Chemical Theory and Computation
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Page 12 of 27
Details are described below. 3.3.1
Neighbor exchange rate
The exchange rate pacc k between state k and k + 1 was estimated as, pacc k =
N acc . N att
(16)
N acc is the number of accepted exchange attempts and N att is the total number of exchange attempts. 3.3.2
Autocorrelation of the replica state index, τac
Let Mk (t) be the index (from 1 to K) of the state visited by replica k at time t. The autocorrelation function of Mk (t) is defined as, 52
Ck (t) =
hMk (s)Mk (s + t)i − hMk (s)i2 . hMk (s)2 i − hMk (s)i2
(17)
When the simulation time is sufficiently long, all replicas perform statistically equivalent walks on the state index ladder. Therefore we can average the autocorrelation function over all the replicas as, C(t) =
K 1 X Ck (t). K k=1
(18)
The integrated autocorrelation time τac is defined as, 52
τac =
T X t=1
3.3.3
t 1− T
C(t).
(19)
Relaxation time from empirical transition matrix, τ2
The empirical transition matrix T is a K × K stochastic matrix whose element Tij is the probability that a replica currently at state i will be at state j the next exchange attempt.
12
ACS Paragon Plus Environment
Page 13 of 27
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Journal of Chemical Theory and Computation
T is estimated as, 12 Nij + Nji Tij ≈ PK , (N + N ) ik ki k=1
(20)
where Nij is the number of times the replica is in state j one update interval after being in state i. T is then diagonalized to compute its eigenvalues. When the eigenvalues are sorted in descending order 1 = µ1 > µ2 > · · · > µK , the relaxation time τ2 is given by, 12
τ2 =
3.3.4
1 . 1 − µ2
(21)
Mean first-passage time, τp
The mean first-passage time τp is the average number of exchanges required for a replica from state 1 to reach state K (or vice versa). To estimate τp we assume that the replica walking through the state space is a Markov chain with the transition matrix T given in Eq. 20. Suppose that the replica starts at state 1. The mean first-passage time is the same as the absorbing time of a Markov chain in which state K is an absorbing state. 53 This absorbing Markov chain has a transition matrix with elements T′ given by, 53
Tij′ =
1
0 Tij
if i = j = K, if i = K, j 6= K,
(22)
otherwise.
Based on this transition matrix, the mean first-passage time (or equivalently the absorbing time) is given by, 53 τp =
X
u1j .
(23)
j6=K
where uij are elements of the fundamental matrix, U = (I − T′ )−1 ,
13
ACS Paragon Plus Environment
(24)
Journal of Chemical Theory and Computation
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
and I is the K × K identity matrix.
4
Numerical Results and Discussion
4.1
RE simulations of model systems
While estimated free energies of deca-alanine (Fig. 2a and 2c) are robust to changing K, free energy estimates of the CB7-B5 complex require larger K to attain a consistent result (Fig. 2b). For host-guest complexes, Velez-Vega et. al. 47 observed that ligand extraction simulations featured sudden jumps over high energy barriers. With a large K, intermediate states are more likely to favor the sampling of both unbound and bound states, leading to more precise free energy estimates. Corroborating our analysis, the standard deviation of estimated free energy differences q c ]) collapses as a function of total sample size. The collapse occurs in all our (σ = σ 2 [∆f
systems for all but the smallest numbers of replicas (Fig. 3). Fig. S3–S4 in the SI show that results are qualitatively the same for EXP and BAR estimators. (Actually, BAR appears indistinguishable from MBAR in the figures, but does yield slightly different numbers.) In our derivation, we made the assumption that the number of states, K, is large. Our simulation results indicate that an average neighbor exchange rate exceeding 35% (see Fig. S6–S8 in the SI) is sufficient to satisfy this assumption. For the smallest values of K, however, the standard deviation is large; this is most noticeable with the EXP estimator (Fig. S3 in the SI). Why do free energy estimates exhibit worse convergence behavior with few replicas? There are several possible explanations. With small K, there may be “bottlenecks” with low acceptance rates between adjacent replicas. Each bottleneck essentially breaks the ladder of replicas into independent sets of simulations, eliminating the sampling benefits of allowing a random walk in state space. There are certainly bottlenecks with K = 8 in the deca-alanine HRE simulations (Fig. S6 in the SI). Bottlenecks may also occur, for example, in discontinuous phase transitions. 14
ACS Paragon Plus Environment
Page 14 of 27
Page 15 of 27
Journal of Chemical Theory and Computation
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 ACS Paragon Plus Environment
Journal of Chemical Theory and Computation
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 ACS Paragon Plus Environment
Page 16 of 27
Page 17 of 27
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Journal of Chemical Theory and Computation
Even without bottlenecks, however, small K can also deteriorate free energy convergence through limited phase space overlap between adjacent stages. 1 The EXP estimator is especially sensitive to large variances in the potential energy difference. Supporting this perspective, for K = 16 with deca-alanine HRE and K = 8 with deca-alanine parallel tempering, standard deviations are substantially larger with EXP (Fig. S3 in the SI) compared to BAR (Fig. S4 in the SI) and MBAR. We are not the first to observe that extreme bottlenecks are unnecessary for small K to deteriorate RE sampling; for example, Roe et al. 54 found that HRE simulations with K = 24 and neighbor exchange rates between 60% and 80% are more converged than similar simulations with K = 8 and neighbor exchange rates between 10% and 35%. In addition to the fact that the convergence trend is the same across a large range of K, the trend of the standard deviation with respect to N is also consistent with analytical results. As N becomes larger and approaches the asymptotic regime, a line with slope fixed at − 21 is a good fit to the plot on a logarithmic scale (Fig. S4 in the SI); the coefficients of determination are R2 = 0.92, 0.94, and 0.75 for the largest K in the three systems. Beyond free energies, another key objective of molecular simulation is the estimation of thermodynamic expectations. We assessed the convergence of a representative thermodynamic expectation, the heat capacity in state 1. (The heat capacity for all states is shown in Fig. S11.) The convergence of this quantity depends on the statistical estimator (Fig. 4 and Fig. S9 and S10 in the SI). The simplest estimator uses only samples from state 1. When the heat capacity is estimated using this straightforward approach, then the standard deviation does not collapse as a function of the total simulation time; increasing K while keeping N constant will reduce the sample size within the state 1. Finding the balance between neighbor exchange rates and reduced sample size is a key reason for optimizing K. 23,32–34 On the other hand, when heat capacity is estimated using MBAR, convergence trends resemble those of the free energy: for sufficiently large K, the standard deviation collapses as a function of total simulation time, but precision deteriorates for small K. The contrast in convergence
17
ACS Paragon Plus Environment
Journal of Chemical Theory and Computation
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
behavior between two estimators is explained by the fact that MBAR makes use of information from all states. Furthermore, because MBAR uses estimated free energies to weigh various samples, its convergence can be expected to be strongly linked with the convergence of the free energy. These results suggest that MBAR may improve the convergence of other thermodynamic expectations when K is large.
4.2
RE efficiency metrics
A key objective of a sampling methods is to generate as many uncorrelated samples as possible in a given amount of CPU time. 11 Generating many uncorrelated samples should facilitate the precise estimation of thermodynamic quantities. Due to the correlation between walkers, however, the efficiency of RE is difficult to quantify. While relatively straightforward to calculate, it is not clear whether RE efficiency metrics are related to the precision of estimated thermodynamic quantities. 51 Grossfield and Zuckerman 21 described a simple thought experiment to show that the mixing of state indicies is not a sufficient condition for convergence. Their experiment involves two wells with equal depth but different widths separated by a high energy barrier. If replicas are seeded in each well, then the exchange probability will be very high even if barrier crossing never occurs and the relative population in each well never equilibrates. Instead of focusing on replica indices, they suggest analyzing sampling quality by comparing results from independent sets of simulations (as was done in this paper.) In our simulations, different mixing efficiency metrics exhibit distinct trends as a function of K (Fig. 5). The autocorrelation time of the replica state index, τac , decreases quickly and levels off, correlating with the collapse in the variance of free energy differences. In the deca-alanine parallel tempering simulations, τ2 and the mean first-passage time monotonically increase with K. In the harmonic bias simulations, both the relaxation time of the empirical transition matrix, τ2 , and the mean first-passage time decrease and then increase. In agreement with previous observations, 33,34 the minimum values coincide with the value of 18
ACS Paragon Plus Environment
Page 18 of 27
Page 19 of 27
Journal of Chemical Theory and Computation
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 ACS Paragon Plus Environment
Journal of Chemical Theory and Computation
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
K where the exchange rate is approximately ∼40%. The increase in relaxation time with K may be a consequence of neighbor exchange. A simple model for neighbor exchange is a random walk on a n-cycle, in which clockwise or counterclockwise motion each have a probability of
1 . 2
In this system, the relaxation
time increases as O(n2 ). 55 An increase of τ2 with the number of states in a Markov chain is common across many other systems. 55 It may be possible, however, to accelerate mixing of state indices and decrease τ2 by attempting exchanges beyond nearest neighbors or by attempting many neighbor exchanges in a row. 12 Because the convergence of free energy estimates collapses for large K, any replica exchange efficiency index that does not level off with K has no relevance on convergence behavior. Hence, with simple neighbor exchange, τ2 and the mean first passage time increase with K and are not relevant to determining adequate K. On the other hand, the observation that τac levels off with K suggests that the autocorrelation time is suitable indicator that K is sufficiently large. It is important to note, however, that the relevance of τac is merely empirical and has no theoretical foundation; there may be cases whether τac levels off but free energy calculations do not follow asymptotic behavior.
5
Conclusions
We have studied the effect of increasing the number of states K on the precision of free energy differences and thermodynamic expectations estimated based on RE simulations. Our analysis shows that when K is sufficiently large, then the asymptotic variance of free energy differences does not depend on K. Corroborating the analytical results, variances of MBAR-based free energy differences and heat capacity estimates for model systems collapse with an increasing number of states. It follows that RE efficiency metrics that do not level off with K are not related to statistical convergence. Comparing several RE efficiency metrics, we empirically observe that the autocorrelation time of state indices τac is the most relevant
20
ACS Paragon Plus Environment
Page 20 of 27
Page 21 of 27
Journal of Chemical Theory and Computation
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 ACS Paragon Plus Environment
Journal of Chemical Theory and Computation
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
to free energy convergence.
Acknowledgement This research was supported by IIT startup funds. Computer resources were provided by the Open Science Grid. 56
Supporting Information Available Supporting figures showing free energies and convergence properties estimated by EXP and BAR, average neighbor exchange rates in different simulations, and additional heat capacity convergence trends.
This material is available free of charge via the Internet at http:
//pubs.acs.org/.
Notes and References (1) Lu, N.; Woolf, T. B. In Free Energy Calculations: Theory and Applications in Chemistry and Biology; Chipot, C., Pohorille, A., Eds.; Springer Berlin: Heidelberg, 2007; Chapter 6, pp 199–247. (2) Shirts, M. R.; Chodera, J. D. J. Chem. Phys. 2008, 129, 124105. (3) Mitsutake, A.; Sugita, Y.; Okamoto, Y. Biopolymers 2001, 60, 96–123. (4) Iba, Y. Int. J. Mod Phys C 2001, 12, 623–656. (5) Swendsen, R. H.; Wang, J.-S. Phys. Rev. Lett. 1986, 57, 2607–2609. (6) Hukushima, K.; Nemota, K. J. Phys. Soc. Jpn. 1996, 65, 1604–1608. (7) Bittner, E.; Nußbaumer, A.; Janke, W. Phys. Rev. Lett. 2008, 101, 1–4. (8) Javanparast, B.; Hao, Z.; Enjalran, M.; Gingras, M. J. P. Phys. Rev. Lett. 2015, 114, 130601. 22
ACS Paragon Plus Environment
Page 22 of 27
Page 23 of 27
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Journal of Chemical Theory and Computation
(9) Sugita, Y.; Okamoto, Y. Chem. Phys. Lett. 1999, 314, 141–151. (10) Sugita, Y.; Kitao, A.; Okamoto, Y. J. Chem. Phys. 2000, 113, 6042. (11) Zuckerman, D. M. Annu. Rev. Biophys. Biomol. Struct. 2011, 40, 41–62. (12) Chodera, J. D.; Shirts, M. R. J. Chem. Phys. 2011, 135, 194110. (13) Jiang, W.; Roux, B. J. Chem. Theory Comput. 2010, 6, 2559–2565. (14) Khavrutskii, I. V.; Wallqvist, A. J. Chem. Theory Comput. 2010, 6, 3427–3441. (15) Meng, Y.; Dashti, D. S.; Roitberg, A. E. J. Chem. Theory Comput. 2011, 7, 2721–2727. (16) Gallicchio, E.; Lapelosa, M.; Levy, R. M. J. Chem. Theory Comput. 2010, 6, 2961–2977. (17) Wang, K.; Chodera, J. D.; Yang, Y.; Shirts, M. R. J. Comput.-Aided Mol. Des. 2013, 27, 989–1007. (18) Jiang, W.; Luo, Y.; Maragliano, L.; Roux, B. J. Chem. Theory Comput. 2012, 8, 4672–4680. (19) Kokubo, H.; Tanaka, T.; Okamoto, Y. J. Comput. Chem. 2013, 34, 2601–2614. (20) Lyman, E.; Zuckerman, D. M. J. Phys. Chem. B 2007, 111, 12876–12882. (21) Grossfield, A.; Zuckerman, D. M. Annu. Rep. Comput. Chem. 2009, 5, 23–48. (22) Kofke, D. A. J. Chem. Phys. 2002, 117, 6911. (23) Rathore, N.; Chopra, M.; de Pablo, J. J. J. Chem. Phys. 2005, 122, 024111. (24) Patriksson, A.; van der Spoel, D. Phys. Chem. Chem. Phys. 2008, 10, 2073–2077. (25) Shenfeld, D. K.; Xu, H.; Eastwood, M. P.; Dror, R. O.; Shaw, D. E. Phys. Rev. E 2009, 80, 46705.
23
ACS Paragon Plus Environment
Journal of Chemical Theory and Computation
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
(26) Fiore, C. E. J. Chem. Phys. 2011, 135, 114107. (27) Malakis, A.; Papakonstantinou, T. Phys. Rev. E 2013, 88, 1–14. (28) Ballard, A. J.; Wales, D. J. J. Chem. Theory Comput. 2014, 10, 5599–5605. (29) Park, S.; Im, W. J. Chem. Theory Comput. 2014, 10, 2719–2728. (30) Sabri Dashti, D.; Roitberg, A. E. J. Chem. Theory Comput. 2013, 9, 4692–4699. (31) Chipot, C.; Pohorille, A. In Free Energy Calculations; Chipot, C., Pohorille, A., Eds.; Springer-Verlag: Berlin, 2007; p 47. (32) Kone, A.; Kofke, D. A. J. Chem. Phys. 2005, 122, 1–2. (33) Denschlag, R.; Lingenheil, M.; Tavan, P. Chem. Phys. Lett. 2009, 473, 193–195. (34) Lingenheil, M.; Denschlag, R.; Mathias, G.; Tavan, P. Chem. Phys. Lett. 2009, 478, 80–84. (35) Chandler, D. Introduction to Modern Statistical Mechanics; Oxford University Press: New York, 1987; pp 171–173. (36) Virnau, P.; Muller, M. J. Chem. Phys. 2004, 120, 10925. (37) Thiede, E.; Van Koten, B.; Weare, J.; Dinner, A. R. Eigenvector method for umbrella sampling enables error analysis. 2016, arXiv:1603.04505. arXiv.org ePrint archive. http://arxiv.org/abs/1603.04505 (accessed March 14, 2016). (38) Zwanzig, R. J. Chem. Phys. 1954, 22, 1420. (39) The approximation does not hold when λk = 0, but this case is negligible in the limit of large K. (40) Rao, C. R. Bull. Calcutta Math. Soc. 1945, 37, 81–91.
24
ACS Paragon Plus Environment
Page 24 of 27
Page 25 of 27
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Journal of Chemical Theory and Computation
(41) Shirts, M. R.; Bair, E.; Hooker, G.; Pande, V. S. Phys. Rev. Lett. 2003, 91, 140601. (42) Levy, Y.; Jortner, J.; Becker, O. M. Proc. Natl. Acad. Sci. USA 2001, 98, 2188–2193. (43) Park, S.; Khalili-Araghi, F.; Tajkhorshid, E.; Schulten, K. J. Chem. Phys. 2003, 119, 3559–3566. (44) Liu, S.; Ruspic, C.; Mukhopadhyay, P.; Chakrabarti, S.; Zavalij, P. Y.; Isaacs, L. J. Am. Chem. Soc. 2005, 127, 15959–15967. (45) Park, S.; Khalili, F.; Strumpfer, J. Stretching Deca-alanine. http://www.ks.uiuc.edu/ Training/Tutorials/science/10Ala-tutorial/tutorial-html/ (accessed March 2, 2015). (46) Wang, J.; Cieplak, P.; Kollman, P. A. J. Comput. Chem. 2000, 21, 1049–1074. (47) Velez-Vega, C.; Gilson, M. K. J. Comput. Chem. 2013, 34, 2360–71. (48) Hinsen, K. J. Comput. Chem. 2000, 21, 79–85. (49) Bennett, C. H. J. Comput. Phys. 1976, 22, 245–268. (50) Nadler, W.; Hansmann, U. H. E. Phys. Rev. E 2007, 76, 065701. (51) Abraham, M. J.; Gready, J. E. J. Chem. Theory Comput. 2008, 4, 1119–1128. (52) Chodera, J. D.; Swope, W. C.; Pitera, J. W.; Seok, C.; Dill, K. A. J. Chem. Theory Comput. 2007, 3, 26–41. (53) Minh, D. L. Applied Probability Models; Brooks/Cole: Pacific Grove, CA, 2001. (54) Roe, D. R.; Bergonzo, C.; Cheatham, T. E. J. Phys. Chem. B 2014, 118, 3543–3552. (55) Levin, D. A.; Peres, Y.; Wilmer, E. L. Markov Chains and Mixing Times; American Mathematical Society: Providence, RI, 2009.
25
ACS Paragon Plus Environment
Journal of Chemical Theory and Computation
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
(56) Pordes, R.; Petravick, D.; Kramer, B.; Olson, D.; Livny, M.; Roy, A.; Avery, P.; Blackburn, K.; Wenaus, T.; Würthwein, F.; Foster, I.; Gardner, R.; Wilde, M.; Blatecky, A.; McGee, J.; Quick, R. J. Phys.: Conf. Ser. 2007, 78, 012057.
26
ACS Paragon Plus Environment
Page 26 of 27
Page 27 of 27
Journal of Chemical Theory and Computation
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 ACS Paragon Plus Environment