J. Phys. Chem. B 2005, 109, 6714-6721
Assessing the Accuracy of Metadynamics†

Alessandro Laio, Antonio Rodriguez-Fortea, Francesco Luigi Gervasio, Matteo Ceccarelli, and Michele Parrinello*

Computational Science, Department of Chemistry and Applied Biosciences, ETH Zurich, c/o USI Campus, Via Buffi 13, CH-6900 Lugano, Switzerland

Received: October 7, 2004; In Final Form: January 4, 2005
Metadynamics is a powerful technique that has been successfully exploited to explore the multidimensional free energy surface of complex polyatomic systems and predict transition mechanisms in very different fields, ranging from chemistry and solid-state physics to biophysics. We here derive an explicit expression for the accuracy of the methodology and provide a way to choose the parameters of the method in order to optimize its performance.
I. Introduction

Computing free energies and accelerating rare events in complex polyatomic systems are of ever-growing interest in computational physics, and several powerful methodologies have been proposed to tackle these problems.1 Although it is obvious that the optimal technique will depend on the system and on what one wishes to observe, many methods are apparently aimed at very similar tasks. For instance, the free energy along a given reaction coordinate can be computed in at least four different manners: thermodynamic integration,2,3 weighted histogram techniques,4,5 Jarzynski's identity based methods,6 and adaptive force bias.7 A quantitative assessment of the performance and limitations of each method would help considerably to ensure the best use of the computer time allocated for a project.1,8

In this work we address this issue for the metadynamics method introduced in ref 9. This algorithm is aimed at reconstructing the multidimensional free energy of complex systems, and it is based on an artificial dynamics performed in the space defined by a few collective coordinates s, which are assumed to provide a coarse-grained description of the system. The dynamics is driven by the free energy of the system and is biased by a history-dependent potential constructed as a sum of Gaussians centered along the trajectory of the collective variables. This potential, in time, fills the minima in the free energy surface, i.e., the sum of the Gaussians and of the free energy becomes approximately constant as a function of the collective variables. As we discussed in ref 10, our methodology can be viewed as a finite temperature extension of the Wang and Landau algorithm,11 and it is also closely related to the adaptive force bias algorithm,7 in which the derivative of the free energy along a one-dimensional reaction coordinate is reconstructed with a history-dependent bias.

The algorithm has been successfully applied in several different fields, ranging from chemistry12-15 and materials science16,17 to crystal structure prediction18,19 and biophysics.20 Its major advantages are its ability to cope with high dimensionality and its flexibility: it can in fact be exploited not only for efficiently computing the free energy, but also for exploring new reaction pathways (i.e., for accelerating rare events), if the collective coordinates are properly chosen.
† Part of the special issue "David Chandler Festschrift".
The algorithm introduced in ref 9 requires, at every step, the evaluation of the derivative of the free energy with respect to the collective variables, which are then evolved in a discrete fashion in the direction of the maximum gradient. In ref 10, we analyzed in detail and improved this algorithm, deriving an explicit expression for the error and showing that, together with weighted histogram techniques,21 it can be used to compute the density of states in a model system with arbitrary accuracy. In ref 12, to facilitate the implementation of the algorithm in conjunction with molecular dynamics schemes, we introduced a formulation in which the collective variables are evolved continuously. This is achieved by introducing, for a system described by a set of coordinates x, an extended Lagrangian in which the collective variables s(x) are coupled to auxiliary variables s̃ by a harmonic restraining potential of the form (1/2)k(s̃ - s(x))². We also assigned to the auxiliary variables a fictitious kinetic energy (1/2)M(ds̃/dt)². If M is very large, the motion of s̃ is adiabatically separated and driven by the derivative of the free energy, which thus does not have to be computed explicitly. This significantly broadened the scope of the methodology, so that it can now be applied, in conjunction with Car-Parrinello molecular dynamics,22 to the prediction of reaction pathways for systems described at the DFT level.12,15

Subsequently we discovered that in practice the adiabatic separation condition, explicitly invoked in ref 12, is not strictly required for the success of the methodology, and that even if the auxiliary masses are small, the history-dependent Gaussian potential FG(s) still tends to flatten the underlying free energy F(s). Furthermore, we checked that the algorithm described in ref 12 remains valid in the limits k → ∞ and M → 0. In this case, since s̃ = s(x), the Gaussian history-dependent potential acts directly on the atomic positions and the auxiliary variables are no longer necessary. As we will discuss in detail in this work, the real core of the metadynamics algorithm (and what makes it work in all the variants) is that if the Gaussians are added sufficiently slowly, the collective variables tend to diffuse toward the closest local minimum of F(s) + FG(s), and the next Gaussian will preferentially be placed in this minimum. This property does not depend on the underlying dynamics of the system, as long as it is able to reach the equilibrium distribution associated with F(s) + FG(s), and the
fictitious mass can only change the efficiency with which equilibrium is reached (e.g., the auxiliary variables can be used as a thermostat for keeping the degrees of freedom associated with the collective variables at the correct temperature).

This observation suggests that the continuous version of the metadynamics algorithm derives in fact from very general principles, but it poses the question of its accuracy, i.e., of its ability to reconstruct a multidimensional free energy efficiently in a realistic case. For example, while it is obvious that for fixed values of the metadynamics parameters the error will be larger in systems that relax slowly toward equilibrium, the exact dependence of the accuracy on the system relaxation times is hard to predict.

In this work, we will show that, along the same lines as ref 10, it is possible to derive an explicit expression (eq 12) for the average deviation of FG(s) from F(s) as a function of the parameters of the metadynamics and of a few parameters characterizing the physical system, such as the diffusion coefficient, the temperature, and the system size. This allows an a priori estimate of the computational time necessary to obtain the required accuracy (eq 14) and an optimal choice of the parameters. Moreover, the explicit knowledge of the error allows the optimal combination of the results of different runs by weighted histogram analysis,4,21 a technique we have already exploited in conjunction with metadynamics in ref 10. The expression for the error is derived from an extensive numerical analysis on a very simple Langevin model (section III) in which F(s) is known analytically, but in section IV we will show that this expression predicts with remarkable precision the error in a real case, in which metadynamics is applied to reconstruct a two-dimensional free energy of a cyclophane-catenane complex in solution.

II. The Continuous Metadynamics Equations

Consider a system described by a set of coordinates x and a potential V(x) evolving under the action of a dynamics (e.g., molecular dynamics, Langevin, or Monte Carlo) whose equilibrium distribution is canonical at temperature 1/β. We are interested in studying the probability distribution for this system as a function of a limited number of collective variables s(x), which are functions of the coordinates x. This probability distribution can be written as
P(s) = \frac{\exp(-\beta F(s))}{\int ds\, \exp(-\beta F(s))}, \qquad F(s) = -\frac{1}{\beta}\,\ln\!\left( \int dx\, \exp(-\beta V(x))\, \delta(s - s(x)) \right)    (1)
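As a concrete illustration of eq 1 (a minimal sketch, not part of the original paper), the following Python snippet evaluates F(s) and P(s) on a grid for an assumed toy two-dimensional potential V(x1, x2), with the collective variable chosen as s(x) = x1; the potential, the grids, and β = 1 are illustrative assumptions.

import numpy as np

# Toy two-dimensional potential (an illustrative assumption, not from the paper):
# a double well along x1 and a harmonic term along x2.
def V(x1, x2):
    return (x1**2 - 1.0)**2 + 0.5 * x2**2

beta = 1.0                                    # inverse temperature (arbitrary units)
x1 = np.linspace(-2.5, 2.5, 401)              # grid for the collective variable s = x1
x2 = np.linspace(-4.0, 4.0, 401)              # grid for the remaining coordinate
X1, X2 = np.meshgrid(x1, x2, indexing="ij")

# With s(x) = x1, the delta function in eq 1 reduces the integral over x
# to a marginalization over x2 at fixed x1.
boltzmann = np.exp(-beta * V(X1, X2))
Z_s = np.trapz(boltzmann, x2, axis=1)         # integral over x2 for each value of s
F = -np.log(Z_s) / beta                       # free energy of eq 1, up to a constant
F -= F.min()                                  # fix the arbitrary additive constant

P = np.exp(-beta * F)
P /= np.trapz(P, x1)                          # normalized probability P(s)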
Consider now a trajectory x(t) of the system at temperature 1/β. If this trajectory could be computed for a very long time, P(s) could be obtained by taking the histogram of the collective variable s along this trajectory, i.e., at time t, P(s) = (1/t) \int_0^t dt'\, \delta(s(x(t')) - s). This can be thought of as the Δs → 0 limit of
P_{\Delta s}(s) = \frac{1}{t\, \Delta s \sqrt{2\pi}} \int_0^t dt'\, \exp\!\left( -\frac{(s - s(x(t')))^2}{2\Delta s^2} \right)    (2)
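The Gaussian-smoothed histogram of eq 2 translates directly into a few lines of code. The sketch below (not from the paper; the sample trajectory and the width Δs are placeholder assumptions) builds P_Δs(s) from a discretely sampled trajectory of the collective variable and converts it into a free energy estimate.

import numpy as np

def p_delta_s(s_grid, s_traj, delta_s):
    # Discrete analogue of eq 2: each visited value s(x(t')) contributes a
    # Gaussian of width delta_s; the prefactor 1/(t*delta_s*sqrt(2*pi))
    # normalizes the estimate (here t is the number of samples).
    t = len(s_traj)
    kernels = np.exp(-(s_grid[:, None] - s_traj[None, :])**2 / (2.0 * delta_s**2))
    return kernels.sum(axis=1) / (t * delta_s * np.sqrt(2.0 * np.pi))

# Placeholder trajectory of the collective variable (assumed example data).
rng = np.random.default_rng(0)
s_traj = rng.normal(loc=0.0, scale=1.0, size=5000)
s_grid = np.linspace(-4.0, 4.0, 200)

beta = 1.0
P = p_delta_s(s_grid, s_traj, delta_s=0.1)
F = -np.log(P + 1e-12) / beta                 # free energy estimate, up to a constant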
If the system displays metastability, the motion of s will be bound in some local minimum of the free energy F(s) (i.e., in a local maximum of P(s)), and it will escape from this minimum with a very low probability on the time scale determined by the potential V(x) alone.
In metadynamics, the history-dependent potential described above is constructed as a sum of Gaussians of height w and width δs, deposited at time intervals τG along the trajectory of the collective variables:

V_G(s(x), t) = w \sum_{\substack{t' = \tau_G, 2\tau_G, 3\tau_G, \ldots \\ t' < t}} \exp\!\left( -\frac{(s(x) - s(x(t')))^2}{2\,\delta s^2} \right)
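To show how the equations above combine in practice, here is a minimal one-dimensional sketch (an illustrative toy under assumed parameter values, not the authors' implementation): an overdamped Langevin walker on an assumed double-well free energy F(s) is biased by the history-dependent potential V_G, with a Gaussian of height w and width δs deposited every τG steps; at long times, -V_G(s) tracks F(s) up to an additive constant.

import numpy as np

# Assumed toy free-energy profile and its derivative (double well).
def F(s):
    return (s**2 - 1.0)**2

def dF(s):
    return 4.0 * s * (s**2 - 1.0)

# Metadynamics parameters (placeholder values chosen for illustration).
w, delta_s, tau_G = 0.05, 0.1, 200        # Gaussian height, width, deposition stride
beta, D, dt = 2.5, 1.0, 1e-3              # inverse temperature, diffusion coeff., time step
n_steps = 200_000

centers = []                               # positions s(t') where Gaussians were added

def dVG(s):
    # Derivative of the history-dependent Gaussian potential V_G at s.
    if not centers:
        return 0.0
    c = np.asarray(centers)
    g = np.exp(-(s - c)**2 / (2.0 * delta_s**2))
    return float(np.sum(-w * (s - c) / delta_s**2 * g))

rng = np.random.default_rng(1)
s = -1.0                                   # start in the left well
for step in range(1, n_steps + 1):
    # Overdamped Langevin step on the biased potential F(s) + V_G(s).
    s += -beta * D * dt * (dF(s) + dVG(s)) + np.sqrt(2.0 * D * dt) * rng.normal()
    if step % tau_G == 0:                  # deposit a new Gaussian at the current s
        centers.append(s)

# At the end, -V_G(s) approximates F(s) up to an additive constant.
s_grid = np.linspace(-2.0, 2.0, 200)
VG = np.array([w * np.sum(np.exp(-(x - np.asarray(centers))**2 / (2.0 * delta_s**2)))
               for x in s_grid])
F_estimate = -(VG - VG.max())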