The Energy Gap as a Universal Reaction Coordinate for the

May 11, 2009 - Citation data is made available by participants in Crossref's Cited-by Linking service. For a more comprehensive list of citations to t...
1 downloads 0 Views 1023KB Size
J. Phys. Chem. B 2009, 113, 7867–7873

7867

The Energy Gap as a Universal Reaction Coordinate for the Simulation of Chemical Reactions Letif Mones,†,‡ Petr Kulha´nek,†,‡ Istva´n Simon,† Alessandro Laio,§ and Monika Fuxreiter*,† Institute of Enzymology, Biological Research Center, Hungarian Academy of Sciences, P.O. Box 7, H-1518 Budapest, Hungary, and International School of AdVanced Studies (SISSA, SAS), Via Beirut 2-4, Trieste, Italy ReceiVed: January 4, 2009; ReVised Manuscript ReceiVed: March 20, 2009

The selection of a proper reaction coordinate is a major bottleneck in simulations of chemical reactions in complex systems. Increasing the number of variables that are used to bias the reaction largely affects the convergence and leads to an unbearable increase in computational price. This problem can be overcome by employing a complex reaction coordinate that depends on many geometrical variables of the system, such as the energy gap (EGAP) in the empirical valence bond (EVB) method. EGAP depends on all of the coordinates of the system, and its robustness has been demonstrated for a variety of enzymatic reactions. In this work, we demonstrate that EGAP, derived from a classical representation, can be used as a reaction coordinate in systems described with any quantum chemistry Hamiltonian. Benefits of using EGAP as a reaction coordinate as compared to a traditional geometrical variable are illustrated in the case of a symmetric nucleophilic substitution reaction in water solution. EGAP is shown to provide a significantly more efficient sampling and allows a better localization of the transition state as compared to a geometrical reaction coordinate. Introduction Computational methods play a distinguished role in unraveling mechanisms of chemical reactions in solutions and enzymes.1-3 In spite of the rapid increase in computational power, high-level techniques still face sampling problems above a few hundred atoms. While in small systems it is possible to search systematically for saddle points on the potential energy surface, in large systems or in solution, it is necessary to drive the system along a suitably chosen reaction coordinate. Potential of mean force (PMF) methods, such as thermodynamic integration,4 umbrella sampling,5 adaptive force-bias,6 constrained dynamics,7 steering dynamics,8 and metadynamics,9 can be used to bias the system and enhance sampling along a suitably chosen reaction coordinate. Application of these methods, however, requires a rather accurate a priori knowledge of the microscopic details of the reaction mechanism, as it is well-known that the transition state is critically influenced by many details. For example, transition path sampling of the ionic dissociation of NaCl in water10 showed that solvent degrees of freedom cannot be neglected for describing the dissociation process. In enzymes, conformational changes in the active site or penetration of water molecules into the active site can critically perturb the position of the transition state, as seen in the Klenow fragment of DNA polymerase I.11 These problems hinder the accuracy of most free energy methods and cannot be simply overcome by a more extensive sampling (e.g., increasing the length of the simulation). Instead, the solution requires a proper choice of the reaction coordinate. Although a large variety of reaction coordinates or collective variables (CVs) have been used for studying chemical reactions, their selection in complex systems is still arbitrary and all of the methods suffer from serious hysteresis if the * To whom correspondence should be addressed. E-mail: monika@ enzim.hu. † Hungarian Academy of Sciences. ‡ Both can be considered as first authors based on equal contributions. § International School of Advanced Studies (SISSA, SAS).

“correct” variable is not biased, as the “hidden” degrees of freedom perform transitions in an uncontrolled manner. A possible direction for designing a universal reaction coordinate for modeling chemical reactions is offered by the empirical valence bond (EVB) formalism, developed almost 30 years ago by Warshel and colleagues.12 Within this approach, the reaction is represented by a series of resonance states. Each resonance state is characterized by an empirical energy function, and the ground state energy of the system is obtained by solving the secular equation of the system similarly as in the valence bond theory.13 The reaction coordinate is defined as a difference between the potential energies of the resonance states (EGAP). EGAP has the advantage of being capable of describing the reorganization of the system toward the transition state, while it requires only the definition of the resonance states and not of the reaction pathway itself. Within the framework of EVB, EGAP has been successfully applied to a wide range of enzymatic reactions.14 For example, in the case of ATP synthetase, EGAP could describe the coupling of the mechanical and chemical steps.15 Recently, EGAP was used to study the catalytic activity of a molten globule enzyme and the relationship between folding and catalysis.16 These applications illustrate the potential and robustness of EGAP as a reaction coordinate. These results also suggest that such a complex reaction coordinate is required to capture all essential aspects of enzymatic reactions, where conformational changes concomitant to the chemical step largely influence the energetics of the reaction. Implementation of EGAP in higher-level Hamiltonians has been attempted by the frozen density functional approach.17 In this case, the eigenvector problem of the solute Hamiltonian was solved in the presence of a frozen electron density of the solvent molecules. Unlike hybrid techniques, this approach does not suffer from boundary problems. On the other hand, the method is computationally expensive. To overcome such a problem, a different approach was suggested based on computing the EVB reference potential and extrapolating to the high-level ab initio surface at reactant,

10.1021/jp9000576 CCC: $40.75  2009 American Chemical Society Published on Web 05/11/2009

7868

J. Phys. Chem. B, Vol. 113, No. 22, 2009

Mones et al.

product, and transition states.18 This approach, however, faces severe convergence problems if the EVB and the high-level potential energy surfaces significantly deviate from each other. This can be solved by systematically refining the EVB Hamiltonian, but this may be time-consuming. In this work, we describe how EGAP can be used as a reaction coordinate to explore free energy surfaces for systems described by a high-level Hamiltonian. The approach can be applied in different methods in which one or more variables are biased. As a model system, we study the symmetric nucleophilic substitution reaction

In the framework of the valence bond method, a reaction can be considered as a transition from the reactant to product state. This transition can be characterized by a reaction coordinate EGAP that is defined by the difference (gap) between the energies of the resonance states:

ClA- + MeClL f ClAMe + ClL-

ξ(x) ) EGAP(x) ) ε11(x) - ε22(x)

(1)

where A and L subscripts denote attacking and leaving chloride ions. Using such a simple system, it is possible to demonstrate the advantages of using EGAP as a collective variable, as compared to a geometrical coordinate, in terms of better sampling and a more accurate localization of the transition state. Theoretical Background Definition of the EGAP Reaction Coordinate. The reaction coordinate has been defined in the framework of the EVB method. A detailed description and applications of the technique can be found in the literature;1,14 therefore, only the main points are emphasized here. The EVB method uses the quantum mechanical concept of valence bond theory,13 describing the system by interacting resonance states. In the EVB method,12 energies of the resonance states are approximated by a classical potential as Nq

εii(x) )



Nb

Diq(1 - e-aq(rq-r0,q))2 + i

i

q)1

i + ∑ Ubond,j j)1

Na



Nt

i Uangle,j +

j)1



Nnb

i Utorsion,j +

j)1

i ∑ Unb,kl

(2)

k,l)1

constructed by the Morse terms for the breaking and forming bonds (first term), the harmonic terms for bonds (Ubond), angles (Uangle), and torsions (Utorsion) for covalently bonded atoms, and the nonbonded terms (Unb) including electrostatic and van der Waals energies. For simplicity, only two resonance states will be considered here, although more can also be handled. The first resonance state is the reactant, and the second state is the product of the reaction. The Hamiltonian of the system is the following:

H)

(

H11 H12 H21 H22

)

(3)

where Η11) ε11 and Η22 ) ε22 + R2, where ε11 and ε22 are the energies of the first and second resonance states and R2 is the energy offset between two resonance states. The off-diagonal terms Η12 and Η21 are interaction energies between resonance states (which are equal to each other in this case). They can be represented by either general Gaussian functions19 or more frequently by a simple exponential function:

H12 ) A12e-µ12rab

(4)

where rab is the representative distance between two atoms whose bonding is changed from first to second resonance structure, whereas A12 and µ12 are parameters. The off-diagonal terms are supposed to be transferable between different environments.20 A modification of the EVB method including solvent effects into off-diagonal elements is also known.21

The lowest-energy solution of secular problem is the ground state energy EEVB of the system:

1 1 EEVB ) (H11 + H22) - √(H11 - H22)2 + 4H122 2 2

(5)

(6)

As EGAP depends on all system coordinates, it can in principle describe the reorganization of the whole system toward a transition state including the reorganization of the solvent and of the protein environment in enzymatic reactions. This feature is missing in reaction coordinates that are based on local geometrical parameters. Implementation of EGAP in Free Energy Calculations. There are two possible manners to evaluate the free energies on a QM surface using EGAP as a reaction coordinate. One possibility is to run the simulation governed by EGAP on the EEVB potential by the EVB method. The differences between the EEVB and the QM potential energy surfaces are taken into account by postprocessing the configurations derived from the classical simulations. Corrections could be applied to stationary points of the reaction (reactant, transition state, and product) employing linear response approximation.18 Alternatively, free energy perturbation/umbrella sampling (FEP/US) can be used to compute the differences between the QM surface and the potential constructed as a combination of energies of resonance states along distinct segments of the free energy profile.22 These approaches are referred to as indirect implementations of the EGAP reaction coordinate, as the simulation is not directly performed on the QM surface. These approaches are extremely efficient if the EEVB and the QM potential energy surface differ by less than 2-3 RT. If this is not the case, the perturbative approach leads to systematic errors and should not be used. In this work, we demonstrate that EGAP can also be used to directly bias the sampling on the QM potential energy surface. In the direct implementation, simulations are performed on the QM potential surface, computing the potential of mean force as a function of EGAP evaluated on the classical potential energy surface. This approach can be used in connection with various techniques out of which metadynamics9 and constrained dynamics7 will be described in detail. FEP/US Approach. To generate configurations along the reaction pathway on the classical surface by EVB, a mapping potential EMAP is defined as the linear combination of resonance state energies ε11 and ε22: w EMAP ) (1 - λw)ε11 + λwε22

(7)

where λw is a mixing factor ranging from 0 (reactants) to 1 (products) that is varied in discrete increments. Free energies are computed for configurations generated on the mapping surface (eq 7) by simulations at each value of λw (referred in the following as windows) using the Zwanzig formula:23 w-1 FEP ∆A0fw

) -RT

∑ i)0

〈 (

i+1 i EMAP - EMAP ln exp RT

)〉

i

(8)

where 〈...〉i denotes averaging over the trajectory for window i. i i+1 and EMAP EMAP are the energies of the same configurations that

A Universal Reaction Coordinate

J. Phys. Chem. B, Vol. 113, No. 22, 2009 7869

i are used to evaluate EMAP , which were computed with EMAP with the subsequent value of λw, R is the gas constant, and T is the absolute temperature. Differences between the QM and mapping surfaces are taken into account by the umbrella sampling formula:24

w FEP ∆AQM (ξ′) ) ∆A0fw - RT lnδ(ξ(x) - ξ′) EQM - EMAP exp RT

(

)

w

(9)

where EMAP is the mapping potential for a given configuration, EQM is the quantum mechanical energy of the same configuration, ξ(x) is the reaction coordinate (that could be EGAP or a geometrical coordinate), and ξ′ is a given value of the reaction coordinate. δ is the Dirac delta function, 〈...〉w denotes averaging over the trajectory of window w. The average in eq 9 can converge in a reasonably short simulation time only if the argument of the exponential, (EQM - EMAP)/RT, fluctuates by less than 2-3 RT. We must emphasize that in this approach the entire free energy profile is evaluated at the QM level. Metadynamics. Within the framework of metadynamics, sampling along a given reaction coordinate(s) (also denoted as collective variables, CVs) is enhanced by biasing the normal evolution of the QM system using a history-dependent penalty potential VMTD.9 The biasing potential is constructed as a sum of Gaussians centered on the trajectory of selected collective variables. The free energy surface is explored by gradually filling it with Gaussians. Atomic forces are given by

fi ) -

∂EQM dVMTD(ξ) ∂ξ ∂xi dξ dxi

(10)

where xi is the position of atom i and EQM is the high-level potential energy surface. The biasing potential which is defined as t/tMTD

VMTD(ξ, t) ) h



[

exp -

i)1

(ξ(x) - ξi)2 2w2

]

(11)

where ξ(x) is the reaction coordinate. The w and h parameters of the Gaussian functions determine the speed and accuracy at which the free energy is reconstructed, and tMTD is the time between depositions of two successive Gaussians. In this manner, the high-level free energy surface is filled up along a reaction coordinate ξ(x). The reaction coordinate is defined using the classical adiabatic potentials (ε11 and ε22) in the case of ξ(x) ) EGAP(x). ξi is the value of the reaction coordinate, where Gaussians are centered. The time average VMTD(ξ,t) estimates the free energy for sufficiently long simulations:25

∆AMTD(ξ) ) -lim tf∞

1 t - tF

∫t t VMTD(ξ, t′) dt′ + C F

(12) where C is an arbitrary constant and tF is the time at which metadynamics has “filled” the entire relevant region in CV space. Constrained Dynamics. This method enhances sampling by imposing a constraint on a reaction coordinate.7 This is usually achieved by solving a modified equation of motion, in which the force is given by

fi ) -

∂EQM ∂σ(x) -λ ∂xi ∂xi

(13)

where σ(x) is the holonomic constraint given by σ(x) ) ξ(x) ξ′ ) 0. λ is a coefficient, which is solved by the Lagrange

multipliers approach in such a way that ξ(x) is equal to ξ′. It was shown that the average of λ over long simulation time is equal to the derivative of the free energy:26 c ∂AQM (ξ) ) 〈-λ〉c ∂ξ ξ′

(14)

where the superscript c means sampling at a constant value of ξ(x) ) ξ′. The free energy of the unbiased system can be obtained by numerical integration of eq 14, including correction27 at the discrete points of ξ spanning the desired range of a reaction coordinate:

∆ACD(ξ) ) )

∂AQM(ξ) dξ + C 1 ∂ξ c ξ2 ∂AQM 1 d - RT ln ξ1 ∂ξ dξ √Z

∫ξξ ∫

2

(

〈 〉)

dξ + C

c

(15)

where C is an arbitrary constant and Z is the Fixman determinant.28 Methods Description of the Model System. The symmetric nucleophilic substitution reaction of methyl chloride with chloride ion was described by two resonance states corresponding to the reactant and product states of the reaction. The energies of these two resonance states were evaluated using classical potentials (eq 2), with parameters derived from the AMBER GAFF force field29 and Morse terms from ref 1. Van der Waals parameters for chlorine atom and ion were defined as the average of the corresponding values to keep the system symmetric and handle both chlorines uniformly in the classical treatment. Values of the reaction coordinate EGAP were computed with a modified version of the Q program.30 The free energy was also computed as a function of a “traditional” geometrical reaction coordinate, the difference between the distances of the entering and leaving chloride atoms from the central carbon atom (DD). For the QM/ MM calculations, the methyl chloride and the chloride ion were described using the semiempirical PM3 Hamiltonian31 and water by the classical TIP3P model32 with the full electrostatic coupling between quantum and classical regions.33 The PM3 QM/MM Hamiltonian is chosen here only because at this simple level of description it was possible to perform extensive tests and to converge the free energies up to statistical accuracy. For the QM/MM simulations, periodic boundary conditions were applied with long-range electrostatic interactions using the particle mesh Ewald method34 with a 9 Å direct interaction cutoff. The neutrality of the system was achieved by a uniform background charge of +1. All simulations were performed using the AMBER 9 program.35 Computational Details FEP/US Calculations. FEP/US calculations were performed on the mapping potential using 51 windows with equidistant λ values. Each window was 100 ps long using a 0.5 fs time step. Then, the PM3 energy of each collected configuration was computed and the free energy difference between the PM3 and the mapping potential was determined according to eq 9. Three free energy profiles were computed using snapshots collected at each 100, 10, and every time step resulting in 2000, 20 000, and 200 000 collected snapshots per window, respectively. Increasing the number of configurations was expected to increase the precision of the method. The reference free energy profile

7870

J. Phys. Chem. B, Vol. 113, No. 22, 2009

Mones et al.

has been obtained by the constrained dynamics method directly on the PM3 potential energy surface using the same number of windows. In both cases, the first 25% of the data was discarded from the analysis. Metadynamics. Metadynamics simulations were performed depositing Gaussians with widths of 15 kcal mol-1 for EGAP and 0.25 Å for DD and heights of 0.1 kcal mol-1 in the gas phase and 0.5 kcal mol-1 in solution. The Gaussians were deposited every 1000 and 100 molecular dynamics time steps in the gas phase and solution, respectively. With these parameters, the time required to escape the reactant free energy minimum using EGAP or DD is comparable. The total lengths of the metadynamics simulations were 1 ns and 250 ps in the gas phase and solution, using a 0.5 fs time step with 2 × 106 or 5 × 105 QM evaluations, respectively. Constrained Dynamics. In the constrained dynamics simulations, the free energy profiles were computed from 51 windows of the reaction coordinate in the range of [-2.5, 2.5] Å for DD and [-200, 200] kcal mol-1 for EGAP in vacuum and [-300, 300] kcal mol-1 for EGAP in solution. In total, constrained dynamics simulations were 5 and 2.5 ns long in the gas phase and solution, respectively. All simulations were performed on QM(PM3) and QM(PM3)/ MM potential energy surfaces in the gas phase and solution, respectively. In all cases, the temperature was maintained at 300 K by a Langevin thermostat with a friction coefficient of 5 ps-1 in the gas phase and 1 ps-1 in solution. The system in solution was maintained at constant volume, which was previously properly equilibrated. Committor Analysis A 500 ps long constrained dynamics simulation was performed on the QM(PM3)/MM potential energy surface while constraining the reaction coordinate to its transition state value and 500 configurations were collected (one in each 1 ps). A constant temperature of 300 K was kept by a Langevin thermostat at 300 K with a friction coefficient of 1 ps-1. From each configuration, 100 unconstrained molecular dynamics simulations were initiated with random initial velocities generated according to a Maxwell-Boltzman distribution with a temperature of 300 K. The lengths of the unconstrained MD simulations were 0.5 ps each (1000 steps). This period was sufficient to reach either the reactant or product state. Sampling Efficiency of Metadynamics. The efficiency of the sampling on the QM potential energy surface using different reaction coordinates was assessed by two measures. Hysteresis of metadynamics calculations was characterized by the standard deviation of the history-dependent potential of metadynamics -VMTD:

χ(t) )

 (

1 VΩ

1 VΩ

δ(t) )

 (

1 VΩ

1 VΩ

∫Ω (AMTD(ξ, t) - ACD(ξ))2 dξ-

)

∫Ω (AMTD(ξ, t) - ACD(ξ)) dξ

(17)

2

Results Implementation of EGAP into Free Energy Calculations. As EGAP is defined on a classical surface its implementation as a reaction coordinate in higher-level methods faces with the problem that the EVB mapping and the quantum potential energy surfaces can significantly deviate from each other. As a consequence, the difference between the EQM and EMAP surfaces heavily influences the convergence of the free energies computed by different methods using EGAP as a reaction coordinate. In Figure 1, the free energy profiles of the symmetric substitution reaction of methyl chloride by chloride ion in a vacuum are shown. These were either obtained directly on the QM(PM3) surface or computed by the FEP/US method (eq 9) using configurations generated on the classical EMAP potential (eq 7). At the transition state, the free energy profile computed by the FEP/US approach exhibits a good agreement with the profile directly obtained on the QM(PM3) surface using constrained dynamics. A considerable difference between the two profiles can be observed near the reactant and product states even upon increasing the number of configurations by 2 orders of magnitude. The discrepancy between the two types of free energy profiles is related to the difficulty in converging the exponential average in eq 9, and may disappear at considerably longer simulation times. In the case of a reasonable agreement between EQM and EMAP surfaces,36,37 however, one can apply the linear response approach,18 where the potential energy differences are only taken into account in stationary points. Although this method is computationally much more efficient (3 × 103 QM evaluations as compared to 106 in our case), it leads to systematic errors whenever EQM and EMAP deviate by more than 2-3 RT, as in these conditions linear response approximation cannot be applied. Free Energy Profiles Obtained by Direct Implementation of EGAP into Metadynamics. As an alternative to the postprocessing FEP/US method, EGAP can be directly used as a reaction coordinate on the high-level potential energy surfaces, as

∫Ω (-VMTD(ξ, t) - ACD(ξ))2 dξ-

)

∫Ω (-VMTD(ξ, t) - ACD(ξ)) dξ

2

(16)

where ACD(ξ) is the fully converged free energy profile, evaluated with the constrained dynamics approach. The regions Ω of CV space on which the ξ-averages are taken are [-2.5, 2.5] Å and [-300, 300] kcal mol-1 for DD and EGAP, respectively. Similarly, the error of the metadynamics free energy estimate (eq 12) is given by

Figure 1. Free energy profiles of the reaction in vacuum obtained by the FEP/US method using 2000 (blue), 20 000 (green), and 200 000 (red) configurations per window of the simulation on the mapping potential (classical trajectory). The PMF free energy profile calculated directly on the QM(PM3) potential energy surface by constrained dynamics is used as a reference (black).

A Universal Reaction Coordinate

J. Phys. Chem. B, Vol. 113, No. 22, 2009 7871

Figure 2. Free energy profiles of the Cl- + MeCl f MeCl + Cl- reaction with EGAP (A) and DD (B) using metadynamics (red) and constrained dynamics (black). Free energy profiles were computed on the QM(PM3) surface (dashed) in a vacuum and on the QM(PM3)/MM surface (solid) in solution.

described in detail for metadynamics (eqs 10-12) and constrained dynamics (eqs 13-15). Free energy profiles of the methyl chloride substitution reaction in a vacuum and in solution computed by metadynamics and constrained dynamics using EGAP and a geometric DD reaction coordinate (defined as the difference between the distances of the entering and leaving chloride atoms from the central carbon atom) are displayed in Figure 2. In vacuum, the free energy profiles obtained with both reaction coordinates are symmetric (∆A0 ≈ 0 kcal mol-1) and provide activation barriers of 11.5 and 11.6 kcal mol-1 with metadynamics for DD and EGAP, respectively, in good agreement with the experimental data of 12.2 kcal mol-1.38 Note that previous DFT calculations resulted in asymmetric profiles with ∆A0 ) 2 kcal mol-1,20,39 likely due to sampling difficulties. In contrast to the results obtained from the simulations in vacuum, in solution, a larger deviation between the EGAP and DD reaction coordinates can be observed. In metadynamics, the ∆A0 value obtained with EGAP is 0.0 kcal mol-1, while using DD results in a slightly asymmetric profile with ∆A0 ) -0.21 kcal mol-1. Activation free energies of the solution reaction deviate by almost 3 kcal/mol, out of which the value of 24.0 kcal mol-1 obtained using EGAP is in reasonable agreement with the experimental data of 26.6 kcal mol-1.40 The statistical errors on the two profiles were estimated as 0.5 and 0.7 kcal mol-1 for EGAP and DD in metadynamics, respectively, with respect to the profile obtained by constrained dynamics. Localization of the Transition State on a High-Level Surface Using EGAP. In order to demonstrate the quality of EGAP as a CV on a high-level surface, we have employed committor analysis10 to probe the proper localization of the transition state. For this purpose, 100 independent and unconstrained molecular dynamics simulations have been initiated from each of the 500 configurations collected with the reaction coordinate constrained to its value at the transition state. For each configuration, we computed the fraction of trajectories PA that falls into the reactant minimum. Then, by taking the average over several configurations, we computed the probability of observing a given PA41 (i.e., probability density function; Figure 3). In an ideal case, the probability to observe a PA value should have a maximum at PA ) 0.5 and should decrease to 0 for PA ) 0 and PA ) 1, which means that starting from each configuration localized on the putative transition state half of the configurations should fall into the reactant region and another half into the product region. The committor analysis obtained using EGAP

Figure 3. Committor analysis of the reaction in solution using EGAP (red), DD (black), and two geometric coordinates (the distance of the entering and leaving chloride from the central carbon) (blue), as a reaction coordinate.

has indeed a clear maximum for PA ) 0.5. Probabilities of configurations that are fully committed to either end state (reactants or products, with PA ) 1 or PA ) 0) are lower. In contrast, in the case of DD, there is a high probability that configurations belonging to the putative transition state are fully committed to the reactants or the product and the maximum of the probability density function is at PA ) 0 (corresponding to the reactant state). Furthermore, in the case of DD, two minima can be observed at approximately PA ) 0.26 and PA ) 0.70 and with increasing probabilities toward the end states. This reflects that the transition state region has not been sampled uniformly. This point is also demonstrated by the relative probability of observing “good” (PA ∼0.5) or “bad” (PA ∼0 or 1) committor probability:

F)

P(PA ∈ [0.48, 0.52]) P(PA ∈ [0, 0.02] ∪ [0.98, 1])

(18)

The value of F obtained using the EGAP coordinate was twice as much as the one obtained using DD. The shape of the probability density function as well as the ratio of the configurations committed to transition vs end states illustrates that, even in the simple system that is considered here, a very natural coordinate, such as DD, is not capable of unambiguously localizing the transition state (TS). In principle, using more

7872

J. Phys. Chem. B, Vol. 113, No. 22, 2009

Mones et al.

Figure 4. Time evolution of EGAP (A) and DD (B) using metadynamics (reaction in solution).

Figure 5. Comparison of DD (black) and EGAP (red) reaction coordinate performances during metadynamics (reaction in solution). (A) Hysteresis χ (eq 16) characterizes the deviation of the history-dependent potential from the exact free energy as a function of time. (B) Efficiency δ (eq 17) reflects convergence of the average free energy as a function of time, with a log-log representation shown in the inset. tF (eq 12) is equal to 26 ps (520 Gaussians) and 29 ps (580 Gaussians) for the DD and EGAP simulations, respectively.

coordinates simultaneously could improve the localization of the transition state. To probe this idea, we computed the free energy as a function of two reaction coordinates, i.e., the distances of the entering and leaving chloride from the central carbon atom. Increasing the number of reaction coordinates, however, does not lead to committor distributions peaked at PA ) 0.5. This corroborates that EGAP provides a better representation of the reaction even at a higher level of theory as compared to geometric reaction coordinates. Sampling Performance Using EGAP on a High-Level Surface. Using EGAP as a reaction coordinate on the high-level energy surface leads to a better transition state localization as compared to geometric coordinates. This however could occur at an extra computational cost, as sampling according to a classical bias (EGAP) might force the system to visit regions of unphysically high QM energy. This possible disadvantage was studied by comparing the sampling performance of metadynamics upon using EGAP and DD as reaction coordinates. The time evolution of DD (Figure 4) shows that the system remains periodically bound in the two states even after t ) 700 hills, when the free energy profile should be filled with Gaussians. Instead, the run performed with EGAP after the free energy profile is filled shows a diffusive behavior in the entire domain with no sign of a residual barrier. This clearly illustrates that a very serious hysteresis is observed with DD, while no sign of residual barrier is found with EGAP.

This observation is also corroborated by computing the energy hysteresis χ (eq 16) (Figure 5A). The deviations of the metadynamics estimate from the correct free energy are significantly larger if DD is used as a collective variable. This indicates the presence of hidden degrees of freedom that are not sampled exhaustively biasing only DD. This is also confirmed by computing the metadynamics error δ at time t (eq 17). Free energies obtained by using EGAP converge faster than those computed using DD (Figure 5B). In practice, the difference in δ implies that in order to achieve the same accuracy one has to invest approximately four times more computational resources if one uses DD rather than EGAP as a reaction coordinate, at least in the case of a substitution reaction. These results demonstrate that EGAP increases the efficiency of the calculations as compared to a geometric coordinate. Conclusions In conclusion, we demonstrated that EGAP defined as the difference between potential energies of resonance states at the classical level can be applied in high-level calculations as a reaction coordinate. In complex systems, where the reaction pathway is influenced by many factors, EGAP corresponds to a combination of a large number of geometrical variables. Hence, it can result in reaction mechanisms that are not biased by the choice of a single geometric variable. This has been demonstrated for a variety of enzymatic reactions. We have shown

A Universal Reaction Coordinate that EGAP can be used as a reaction coordinate to directly bias high-level calculations. Using this variable considerably increases computational efficiency with respect to normal geometrybased coordinates. If EMAP is capable of reproducing EQM within 2-3 RT, the approach based on linear response should be preferred, as it allows computing the free energy barrier with 2 orders of magnitude less computational effort than metadynamics, a method that requires several quantum calls to construct the history-dependent potential. As the simulation is driven on the higher-level energy surface, discrepancies between the lowerand higher-level energy surfaces are taken into account and the proper localization of the TS is ensured. Full implementation of EGAP in conjunction with various free energy methods is under development. Acknowledgment. The authors are grateful to Arieh Warshel, whose work largely motivated the present project and also for the fruitful discussions. This work was funded by grants of the Hungarian Scientific Research Fund (OTKA) K72569, MRTNCT-2005-019566 of European FP6 (M.F.), GVOP-3.2.1.-200404-0195/3.0. We also acknowledge the support of the Bolyai Ja´nos (M.F.) fellowship. References and Notes (1) Warshel, A. Computer modeling of chemical reactions in enzymes and solutions; John Wiley & Sons: New York, 1991. (2) Sherwood, P. Hybrid quantum mechanics/Molecular Mechanics Approaches. Modern methods and Algorithms of Quantum Chemistry; John von Neumann Institute for Computing: Julich, Germany, 2000. (3) Lin, H.; Truhlar, D. G. Theor. Chem. Acc. 2007, 117, 185. (4) Darve, E. Thermodynamic integration using constrained and unconstrained dynamics. In Free energy calculations; Chipot, C., Pohorille, A., Eds.; Springer: 2007; p 119. (5) Kastner, J.; Thiel, W. J. Chem. Phys. 2005, 123, 144104. (6) Darve, E.; Pohorille, A. J. Chem. Phys. 2001, 115, 9169. (7) Carter, E. A.; Ciccotti, G.; Hynes, J. T.; Kapral, R. Chem. Phys. Lett. 1989, 156, 472. (8) Jensen, M. O.; Park, S.; Tajkhorshid, E.; Schulten, K. Proc. Natl. Acad. Sci. U.S.A. 2002, 99, 6731. (9) Laio, A.; Parrinello, M. Proc. Natl. Acad. Sci. U.S.A. 2002, 99, 12562. (10) Geissler, P. L.; Dellago, C.; Chandler, D. J. Phys. Chem. B 1999, 103, 3706. (11) Fothergill, M.; Goodman, M. F.; Petruska, J.; Warshel, A. J. Am. Chem. Soc. 1995, 117, 11619. (12) Warshel, A.; Weiss, R. M. J. Am. Chem. Soc. 1980, 102, 6218.

J. Phys. Chem. B, Vol. 113, No. 22, 2009 7873 (13) Heitler, W.; London, F. Z. Phys. 1927, 44, 425. (14) Warshel, A.; Sharma, P. K.; Kato, M.; Xiang, Y.; Liu, H.; Olsson, M. H. Chem. ReV. 2006, 106, 3210. (15) Strajbl, M.; Shurki, A.; Warshel, A. Proc. Natl. Acad. Sci. U.S.A. 2003, 100, 14834. (16) Roca, M.; Messer, B.; Hilvert, D.; Warshel, A. Proc. Natl. Acad. Sci. U.S.A. 2008, 105, 13877. (17) Wesolowski, T.; Muller, R. P.; Warshel, A. J. Phys. Chem. 1996, 100, 15444. (18) Rosta, E.; Klahn, M.; Warshel, A. J. Phys. Chem. B 2006, 110, 2934. (19) Chang, Y.-T.; Miller, W. H. J. Phys. Chem. 1990, 94, 5884. (20) Hong, G.; Rosta, E.; Warshel, A. J. Phys. Chem. B 2006, 110, 19570. (21) Schmitt, U. V.; Voth, G. A. J. Phys. Chem. 1998, 102, 5547. (22) Warshel, A.; Weiss, R. M. Ann. N.Y. Acad. Sci. 1981, 367, 370. (23) Zwanzig, R. W. J. Chem. Phys. 1954, 22, 1420. (24) Torrie, G. M.; Valleau, J. P. J. Comput. Phys. 1977, 23, 187. (25) Bussi, G.; Laio, A.; Parrinello, M. Phys. ReV. Lett. 2006, 96, 090601. (26) Mulders, T.; Kruger, P.; Swegat, W.; Schlitter, J. J. Chem. Phys. 1996, 104, 4869. (27) Schlitter, J.; Khlan, M. J. Chem. Phys. 2003, 118, 2057. (28) Fixman, M. Proc. Natl. Acad. Sci. U.S.A. 1974, 71, 3050. (29) Wang, J.; Wolf, R. M.; Caldwell, J. W.; Kollman, P. A.; Case, D. A. J. Comput. Chem. 2004, 25, 1157. (30) Marelius, J.; Kolmodin, K.; Feierberg, I.; Aqvist, J. J. Mol. Graphics Modell. 1998, 16, 213. (31) Stewart, J. J. P. J. Comput. Chem. 1989, 10, 209. (32) Jorgensen, W. L.; Chandrasekhar, J.; Madura, J.; Klein, M. L. J. Chem. Phys. 1983, 79, 926. (33) Walker, R. C.; Crowley, M. F.; Case, D. A. J. Comput. Chem. 2008, 29, 1019. (34) Darden, T. A.; York, D. J. Chem. Phys. 1993, 103, 8577. (35) Case, D. A.; Darden, T. A.; Cheatham, T. E., III; Simmerling, C. L.; Wang, J.; Duke, R. E.; Luo, R.; Merz, K. M.; Pearlman, D. A.; Crowley, M.; Walker, R. C.; Zhang, W.; Wang, B.; Hayik, S.; Roitberg, A.; Seabra, G.; Wong, K. F.; Paesani, F.; Wu, X.; Brozell, S.; Tsui, V.; Gohlke, H.; Yang, L.; Tan, C.; Mongan, J.; Hornak, V.; Cui, G.; Beroza, P.; Matthews, D. H.; Schafmeister, C.; Ross, W. S.; Kollman, P. A. AMBER 9, 9th ed.; University of California: San Francisco, CA, 2006. (36) Chang, Y. T.; Miller, W. H. J. Phys. Chem. 1990, 94, 5884. (37) Wu, Y.; Chen, H.; Wang, F.; Paesani, F.; Voth, G. A. J. Phys. Chem. B 2008, 112, 467. (38) Barlow, S. E.; Vandoren, J. M.; Bierbaum, V. M. J. Am. Chem. Soc. 1988, 110, 7240. (39) Ensing, B.; Laio, A.; Parrinello, M.; Klein, M. L. J. Phys. Chem. B 2005, 109, 6676. (40) Albery, J. K. AdV. Phys. Org. Chem. 1978, 16, 87. (41) Berga, B. A.; Harris, R. C. Comput. Phys. Commun. 2008, 179, 443.

JP9000576