Demonstrating an Order-of-Magnitude Sampling Enhancement in

Feb 11, 2016 - ... an Order-of-Magnitude Sampling Enhancement in Molecular Dynamics Simulations of Complex Protein Systems ... Phone: (212) 403-8664...
0 downloads 0 Views 1005KB Size
Subscriber access provided by GAZI UNIV

Article

Demonstrating an order-of-magnitude sampling enhancement in molecular dynamics simulations of complex protein systems Albert C. Pan, Thomas M. Weinreich, Stefano Piana, and David E. Shaw J. Chem. Theory Comput., Just Accepted Manuscript • DOI: 10.1021/acs.jctc.5b00913 • Publication Date (Web): 11 Feb 2016 Downloaded from http://pubs.acs.org on February 13, 2016

Just Accepted “Just Accepted” manuscripts have been peer-reviewed and accepted for publication. They are posted online prior to technical editing, formatting for publication and author proofing. The American Chemical Society provides “Just Accepted” as a free service to the research community to expedite the dissemination of scientific material as soon as possible after acceptance. “Just Accepted” manuscripts appear in full in PDF format accompanied by an HTML abstract. “Just Accepted” manuscripts have been fully peer reviewed, but should not be considered the official version of record. They are accessible to all readers and citable by the Digital Object Identifier (DOI®). “Just Accepted” is an optional service offered to authors. Therefore, the “Just Accepted” Web site may not include all articles that will be published in the journal. After a manuscript is technically edited and formatted, it will be removed from the “Just Accepted” Web site and published as an ASAP article. Note that technical editing may introduce minor changes to the manuscript text and/or graphics which could affect content, and all legal disclaimers and ethical guidelines that apply to the journal pertain. ACS cannot be held responsible for errors or consequences arising from the use of information contained in these “Just Accepted” manuscripts.

Journal of Chemical Theory and Computation is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.

Page 1 of 29

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

Demonstrating an order-of-magnitude sampling enhancement in molecular dynamics simulations of complex protein systems Albert C. Pan,1,† Thomas M. Weinreich,1 Stefano Piana,1 and David E. Shaw1,2,† 1

2

D. E. Shaw Research, New York, NY 10036, USA.

Department of Biochemistry and Molecular Biophysics, Columbia University, New York, NY 10032, USA.

† To whom correspondence should be addressed. Albert C. Pan E-mail:

[email protected]

Phone:

(212) 403-8664

Fax:

(646) 873-2664

David E. Shaw E-mail:

[email protected]

Phone:

(212) 478-0260

Fax:

(212) 845-1286

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 2 of 29

Abstract Molecular dynamics (MD) simulations can describe protein motions in atomic detail, but transitions between protein conformational states sometimes take place on timescales that are infeasible or very expensive to reach by direct simulation. Enhanced sampling methods, the aim of which is to increase the sampling efficiency of MD simulations, have thus been extensively employed. The effectiveness of such methods when applied to complex biological systems like proteins, however, has been difficult to establish, because even enhanced sampling simulations of such systems do not typically reach timescales at which convergence is extensive enough to reliably quantify sampling efficiency. Here, we obtain sufficiently converged simulations of three proteins to evaluate the performance of simulated tempering, a member of a widely used class of enhanced sampling methods that use elevated temperature to accelerate sampling. Simulated tempering simulations with individual lengths of up to 100 µs were compared to (previously published) conventional MD simulations with individual lengths of up to 1 ms. With two proteins, BPTI and ubiquitin, we evaluated the efficiency of sampling of conformational states near the native state, and for the third, the villin headpiece, we examined the rate of folding and unfolding. Our comparisons demonstrate that simulated tempering can consistently achieve a substantial sampling speedup of an order of magnitude or more relative to conventional MD.

2 ACS Paragon Plus Environment

Page 3 of 29

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

Introduction Protein function is intimately tied to transitions among conformational states. Characterizing these states and determining the most populated states at a given temperature is a problem of fundamental importance for the field of biology. Molecular dynamics (MD) simulations sample different protein conformations with atomic detail.1 Such simulations, however, must sequentially resolve motions occurring at the femtosecond timescale, whereas interesting events such as protein folding, major conformational changes, and ligand binding typically occur on timescales many orders of magnitude longer.2 Long and computationally expensive simulations are thus required, and it is often infeasible to adequately sample all significantly populated conformational states.

Numerous variations of conventional MD, collectively referred to as enhanced sampling methods, have been proposed to accelerate the exploration of protein conformational states.3–12 Despite their frequent use, however, their success when applied to complex biological systems remains unclear.10 One (though not the only) key unresolved question about the utility of enhanced sampling methods in these larger systems is the degree of speedup to be expected.

Assessments of enhanced sampling results for complex biological systems have often been based upon comparisons with experimental measurements. Although such comparisons are often useful, complications may arise from errors in the physical model underlying the simulation (the “force field”), discrepancies between experimental and simulation conditions, and other sources.13 These complications can in some cases make it difficult to isolate and assess the performance of the sampling method itself.

A more direct test of the performance of an enhanced sampling method is to compare enhanced sampling results to those from a long conventional MD simulation of the same system done in the same force field under identical conditions. The results of enhanced sampling simulations

3 ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 4 of 29

are often evaluated by comparison with conventional MD simulations of very small model systems like alanine dipeptide, where performing well-converged simulations is computationally tractable. Although these tests are an important first step toward validating any proposed method, such systems likely present qualitatively different sampling challenges than do larger, more biologically interesting systems whose conformational ensembles are explored over longer timescales. Navigating the gap between initial testing and subsequent application to larger, more complex systems can be challenging.14

Directly comparing enhanced sampling and conventional MD simulations of complex biological systems is a promising approach for the evaluation and improvement of enhanced sampling methods.10,15–19 Such testing has historically been difficult because it can be computationally expensive to simulate systems of realistic size and complexity for long enough to obtain well converged quantitative measures of sampling efficiency. With the advent of MD software with improved scalability across computer processors,20–25 and of specialized hardware for MD,26,27 such long simulations are becoming increasingly available, making direct comparisons of this sort computationally feasible.

Here, we evaluate the performance of simulated tempering, an enhanced sampling method that mixes periods of simulation at higher temperatures with periods simulated at a lower temperature.3,4 The strategy of using elevated temperatures to accelerate sampling is common among many enhanced sampling methods,10 including the replica exchange method.5,28 Unlike replica exchange, however, where configurations are exchanged among multiple simulations run in parallel at different temperatures, simulated tempering performs a single simulation where the temperature itself is updated (cf., SI eq. 1). Although widely used, the efficiency of methods that use elevated-temperature strategies has been the subject of much debate.10,15,29–35 We note that, whereas conventional MD generates a continuous trajectory at a single temperature, enabling kinetic properties to be directly extracted, simulated tempering—like many enhanced sampling

4 ACS Paragon Plus Environment

Page 5 of 29

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

methods—does not; the comparison we perform in this work, however, focuses on sampling efficiency and not on kinetic properties.

We compared an aggregate total of over 800 µs of simulated tempering runs, including single trajectories as long as 100 µs, to previously published long-timescale (up to 1 ms in a single run) conventional MD simulations of three proteins:36–38 bovine pancreatic trypsin inhibitor (BPTI), ubiquitin, and villin. With BPTI and ubiquitin, we tested the ability of simulated tempering to efficiently explore states close to the native state. In the case of villin, we quantified the ability of simulated tempering to fold and unfold the protein. We focus here on comparing the conformational sampling efficiency (i.e., how fast simulated tempering and conventional MD transition between different conformations). In principle, different thermodynamic observables, such as free energies and melting temperatures, may converge with different rates, and the efficiency of one enhanced sampling method with respect to another may depend on the observable of interest. The convergence rate for most observables, however, will likely be dominated by the conformational sampling efficiency, which is why we focus on that metric here.

We first performed a structural and thermodynamic analysis to ensure that the conventional MD and enhanced sampling simulations explore similar conformations with roughly the same populations.39 This provided good evidence that the simulations were sufficiently converged to enable a quantitative assessment of sampling efficiency. From this efficiency assessment we found that simulated tempering is able to consistently increase the sampling efficiency across all three systems relative to a conventional MD simulation conducted at 300 K. The speedup ranges from about an order of magnitude for sampling the folded states of BPTI and ubiquitin to an estimated three orders of magnitude for sampling the folding of villin. We end with a discussion of some of the lessons learned from applying tempering to complex biological systems.

5 ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 6 of 29

Results

Dynamics of folded states in BPTI and ubiquitin

Conventional MD and simulated tempering sample the same conformational space

We performed simulated tempering simulations of BPTI and ubiquitin using the same force fields and the same equilibrated starting structures as the previously reported 1-ms conventional MD simulations of these proteins.36,37 In simulated tempering simulations,3,4 the system temperature of a single simulation is updated among a ladder of discrete values, and the update is accepted or rejected according to a criterion that preserves detailed balance (SI eq. 1). Simulation time spent at higher temperatures accelerates the sampling of conformational states, and that sampling is communicated back to the lower rung of interest by way of temperature updates. In the tempering simulations of BPTI and ubiquitin, we used 20 discrete temperatures exponentially distributed from 300 K to 400 K with a temperature update interval of 2 ps (see SI for a detailed discussion of the methods and simulation parameters used).

To begin our assessment of simulated tempering, we quantified the extent of the structural similarity between the protein conformations produced by simulated tempering and conventional MD by clustering frames from both simulations and projecting the results onto the first and second principal component vectors, PC1 and PC2, for visualization. The conventional MD and simulated tempering trajectories were concatenated together before clustering to ensure that there was no ambiguity about whether two structures from different trajectories belonged to the same cluster. In the simulated tempering trajectories, only frames from the 300-K rung were used. In the visualization in Figs. 1A and 2A, the colored contours represent significantly occupied states, which we define as states that were independently visited at least twice in both the conventional MD and the simulated tempering simulations and that had at least 1% occupancy in either

6 ACS Paragon Plus Environment

Page 7 of 29

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

conventional MD or simulated tempering (Table S2, Fig. S1). These significantly occupied states account for over 90% of the frames in each of the trajectories. Approximately 92% of the conventional MD and 93% of the simulated tempering trajectories belonged to clusters meeting these criteria in BPTI, and approximately 95% of the conventional MD and 99% of the simulated tempering trajectories belonged to such clusters in ubiquitin. The agreement between the conventional MD and simulated tempering clusters, both in position and extent, suggests that the same states were being visited with roughly the same populations.

Similarity in PCA space, however, does not necessarily indicate a true structural correspondence, since projections from higher to lower dimensionality may entail a loss of information. To verify that the underlying structures within a cluster were similar, we determined the mean structures defined by each of the clusters and measured the root-mean-square deviation (RMSD) difference between the structures derived from conventional MD and simulated tempering. We obtained extremely close structural agreement. RMSD deviations between the means of these clusters ranged from 0.01 Å to 0.26 Å (Table S2). As a point of reference, the RMSD deviation between distinct clusters ranged from 0.55 Å to 2.4 Å. Overlays of the mean structures from several clusters for BPTI and ubiquitin are shown in Figs. 1C and 2C respectively.

Population estimates from conventional MD and simulated tempering agreed reasonably well. The average unsigned error of the conformational free energy difference between population estimates for the clusters was 0.3 kcal mol−1 for ubiquitin and 0.2 kcal mol−1 for BPTI, both less than the thermal energy. The largest error was 0.54 kcal mol−1 for the blue cluster in ubiquitin (Fig. 2, Table S2). Additional details about the conformations visited by BPTI and ubiquitin shown in Figures 1 and 2 can be found in previous publications describing the conventional MD simulations.36,37

7 ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 8 of 29

Simulated tempering provides at least an order-of-magnitude improvement in sampling efficiency compared to conventional MD

Having confirmed that simulated tempering and conventional MD sample the same conformational space with similar populations, we proceeded to assess the gain in sampling efficiency, if any, due to simulated tempering as compared to conventional MD simulations. We found at least an order-of-magnitude speedup as measured by the number of corresponding states visited per unit time. In BPTI, the number of transitions to the states depicted in Fig. 1A was 0.09 transitions per µs for conventional MD and 1.2 transitions per µs for the 300-K simulated tempering rung, giving a sampling speedup by this metric of approximately 13-fold due to simulated tempering. Similarly, the sampling speedup for ubiquitin by the same metric is around 14-fold. Another, more indirect metric—the decay time of the autocorrelation functions of the backbone RMSDs of BPTI and ubiquitin—gave a similar result (Fig. S2).

In our efficiency analysis for conventional MD versus simulated tempering, we always compare equivalent amounts of total simulation time, which ensures that the computer resources required are properly taken into account. Total simulation times for simulated tempering refer to the simulation time aggregated over all rungs. Enabling simulated tempering on top of conventional MD will always add some computational overhead. Earlier work, however, demonstrated that this overhead is negligible on Anton.40 The computational resources required for a given amount of total simulation time is thus almost the same for conventional MD and simulated tempering.

In the simulated tempering simulations, the average round-trip time—the time for the simulation to travel from the lowest to the highest temperature rung then back again to the lowest rung— was a great deal faster than the rates of conformational changes of interest. In other words, the simulated tempering simulations cycled through temperature space much faster than they cycled through different conformations. This condition ensures that sampling relevant for comparison against the conventional MD simulations at 300 K was efficiently obtained, since every time a 8 ACS Paragon Plus Environment

Page 9 of 29

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

conformational change occurred, it was sampled multiple times at the lowest rung.31 In the simulated tempering simulations of BPTI and ubiquitin, the average round-trip times were about 10 ns—at least an order of magnitude faster than the fastest conformational change of interest. Incidentally, because the average round-trip times were much faster than the conformationalchange times, calculating efficiency over only the conformations sampled at the lowest rung is similar to calculating the efficiency over conformations sampled at all rungs.

Dynamics of villin folding

Simulated tempering significantly increases sampling efficiency in protein folding relative to conventional MD

We performed another test of simulated tempering on the folding of the C-terminal fragment of the Nle/Nle double mutant of the villin headpiece, on which we have previously performed several equilibrium folding simulations with conventional MD at and around the melting temperature, Tm ~ 370 K.38 Similar to the simulations of BPTI and ubiquitin, we ran a tempering simulation using a 20-rung temperature ladder with exponentially distributed rungs from 300 K to 390 K. Plots of Cα RMSD as a function of time from the experimentally determined folded structure (PDB ID 2F4K41) for the conventional MD simulation at the melting temperature and the simulated tempering simulation indicate that all trajectories reversibly explore the folded and unfolded states (Fig. 3A).

To quantify the extent of agreement between the simulated tempering and conventional MD simulations, we classified the trajectories into folded and unfolded states based on a Cα RMSD cutoff from the folded structure. The mean structures of the folded state determined from the simulated tempering and the conventional MD simulations overlap extremely well (∆ RMSD = 0.13 Å) (Fig. 3B), as do population estimates based on computing melting curves 9 ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 10 of 29

(Fig. 3C). We note that one benefit of simulated tempering over conventional MD, in addition to increased sampling efficiency, is automatically obtaining the temperature dependence of various system observables such as the melting curve.

To quantify sampling efficiency, we calculated the average reversible-folding time, defined as the sum of the average folding and unfolding times. This quantity is directly related to the number of states visited per unit time for a two-state system. The reversible-folding time of villin for the simulated tempering simulation was approximately 5.9(9) µs (Table 1), which is comparable to the value of 3.7(3) µs obtained near the melting temperature with a conventional MD simulation. (Standard errors are indicated in parentheses.) The sampling efficiency obtained in the simulated tempering simulation of villin likely represents a significant speedup over conventional MD at 300 K (see SI). In contrast with the ubiquitin and BPTI systems, we do not have a quantitative estimate of the relevant transition times for villin, because obtaining villin equilibrium folding trajectories at 300 K from conventional MD is difficult due to very long unfolding times. We can, however, make a rough estimate by extrapolating the average unfolding times from three long-timescale simulations at T = 360 K, 370 K, and 380 K (Table 1) using an Arrhenius form, while assuming that the folding time remains approximately constant as a function of temperature.42 This calculation gives an unfolding time on the order of several milliseconds at 300 K, implying that simulated tempering speeds up sampling efficiency by approximately three orders of magnitude as compared to conventional MD.

Discussion The mixing of simulations at different temperatures to increase sampling efficiency is a strategy shared by a broad class of enhanced sampling methods that includes annealing, simulated tempering, and replica exchange.10,43 These approaches rely on accessing temperatures where a speedup in sampling is possible, and then communicating that speedup to a lower temperature of 10 ACS Paragon Plus Environment

Page 11 of 29

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

interest. In this section, we discuss the extent to which our simulated tempering simulations have achieved that goal. In particular, we ask whether significant further speedup might be achieved, considering the lessons learned from our application of this method to complex biological systems.

One obvious approach to improve simulated tempering efficiency is the use of rungs with even higher temperatures. Higher temperatures generally result in faster transition rates between conformations, especially for conformational changes whose free energetic barriers are dominated by enthalpy. Higher temperatures, however, also allow access to conformational states that may not be relevant at lower temperatures, a phenomenon that reduces the speed of sampling at a lower temperature of interest. In particular, when sampling an ensemble of folded states, higher temperatures increase the likelihood of unfolding. Since refolding times are long, protein unfolding during a simulated tempering run would severely slow the sampling of folded states. This explosion of states with temperature is particularly pronounced for complex systems with a large number of degrees of freedom, limiting the highest temperature rung of these systems to values near the melting temperature. This limitation on the highest temperature also restricts the potential speedup using simulated tempering and related methods44,45 like replica exchange, since the greatest speedup achievable will always be a fraction of the transition speed at the fastest rung.

Recent conventional MD simulations of ubiquitin, for example, estimated a melting temperature of around 370 K and an average folding time of 3 ms at 390 K, using the same force field as this study.37 Because this average folding time is much longer than the typical transition timescale among folded states in a simulated tempering simulation, the occurrence of an unfolding event during a simulated tempering run aimed at sampling folded states would drastically reduce the sampling efficiency. Although a simulation estimate of BPTI’s melting temperature is unavailable, experimental measurements give a value of around 380 K.46 The maximum temperature of 400 K in our simulated tempering simulations of these systems is thus likely a 11 ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 12 of 29

good compromise between accelerating folded-state dynamics without a high probability of unfolding for both of these systems.

When the goal is to sample protein folding itself, a similar temperature limit applies. Unfolding rates generally have an Arrhenius dependence on temperature, increasing exponentially as temperature increases. The folding rate, however, is more weakly temperature dependent and often has an anti-Arrhenius dependence at high temperatures.47 The different temperature dependences of the folding and unfolding rates thus give rise to a maximally achievable rate of reversible folding near the melting temperature, Tm. As a result, the greatest sampling speed achievable at any other temperature will be some fraction of the sampling speed near Tm. Adding rungs with temperatures much greater than Tm can reduce sampling efficiency.31,35

Notably, the reversible-folding time in our simulated tempering simulation of villin was similar to the speed of a conventional MD simulation near the melting temperature. Since the reversible-folding time is minimized near the melting temperature (i.e., the number of folding and unfolding events per unit time is maximized), this implies that populations at 300 K are obtained at maximal efficiency, with a loss of sampling only due to having to visit other temperatures. Again, the round-trip time among temperature rungs (approximately 20 ns) was much faster than the typical reversible-folding timescale (microseconds), indicating that the accelerated sampling achieved around the melting temperature is being efficiently communicated to the temperature rung at 300 K.

Another way to take advantage of the faster sampling afforded by higher temperatures in simulated tempering simulations is to increase the time spent at existing higher temperature rungs31,35,45 by skewing the rung weights (SI eq. 1 and Fig. S3). Skewing the weights, however, will also lead to longer average round-trip times, which will in turn impact the sampling efficiency at the lowest rung. Finding an optimal balance between the amount of skew and efficiency can be difficult in general. By severely skewing weights toward higher temperature 12 ACS Paragon Plus Environment

Page 13 of 29

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

rungs in a simulated tempering simulation of ubiquitin to be near the limits of acceptable roundtrip times, we did observe a fourfold increase in sampling efficiency (Fig. S3B and SI text). Skewing the rung weights had a similar, but more marginal, effect on the simulated tempering simulations of villin folding, given that these simulations were already close to the efficiency achieved at the melting temperature (see SI). We note that, because the occupancy of different rungs in simulated tempering is determined by the value of the weights, skewing the weights is tantamount to adding more rungs (or more replicas, in the context of replica exchange simulations) in a temperature range of interest.

Our choices of other, more straightforward tempering parameters, such as temperature spacing and interval updating, appear to be close to optimal in the sense that changing these parameters is unlikely to lead to further significant speedup. Further discussion can be found in the SI.

Conclusion and Outlook We have compared the conformational sampling efficiency of an enhanced sampling method with that of conventional MD simulations for sampling the conformational states of three nontrivial biological systems, including a protein with a conformational-change timescale as long as 100 µs. Applying the tempering method using the same force field and conditions as the longtimescale conventional runs makes it possible to assess gains in sampling efficiency due specifically to the enhanced sampling method. When exploring the dynamics of folded states with BPTI and ubiquitin, as well as the folding dynamics of villin, we consistently observed speedups of an order of magnitude or more.

We emphasize that our focus here has been on assessing the conformational sampling efficiency of simulated tempering (i.e., on whether simulated tempering samples different protein conformations faster than conventional MD). In principle, different thermodynamic observables

13 ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 14 of 29

may converge with different rates, and the efficiency of a given enhanced sampling method may depend on the observable of interest. The convergence rate for most observables, however, will likely be dominated by the conformational sampling efficiency. As evidence for this, the efficiency speedup derived from a different property—the decay timescale of the autocorrelation of the backbone RMSD (Fig. S2)—provides a speedup estimate similar to that obtained from looking at conformational sampling efficiency (as described in the Results section).

We believe that the speedup due to simulated tempering of an order of magnitude or more will continue to hold for systems larger than BPTI, ubiquitin, and villin. Larger systems will require additional temperature rungs to ensure adequate exchange between different rungs, and visiting those additional rungs will reduce the fraction of simulation time spent at a lower rung of interest. Adding rungs will only slow down conformational sampling, however, if the typical conformational-transition timescale becomes faster than the average round-trip time during a simulated tempering simulation. In that case, conformations that are sampled at higher rungs may not get communicated down to the rung of interest (e.g., the 300-K rung).

We can use a rough scaling argument to show that the average round-trip time for a relatively large system will continue to be faster than expected conformational-change timescales. In simulated tempering, the number of required rungs, nr, scales approximately as N1/2, where N is the number of atoms in the system,4 and the average round-trip time, tRT, scales with ݊௥ଶ , implying that tRT ~ N. The average round-trip time, at a temperature-update interval of 2 ps, is about 10 ns for ubiquitin, a 70-residue protein system with approximately 20,000 atoms when solvated (Fig. S4A). According to the derived scaling, even a system 10 times the size of ubiquitin (200,000 atoms and several hundred residues) would have an average round-trip time of 100 ns. Since these larger systems will likely have conformational-change timescales longer than those seen in ubiquitin (i.e., longer than hundreds of nanoseconds to hundreds of microseconds), our speedup estimate should hold for these much larger systems.

14 ACS Paragon Plus Environment

Page 15 of 29

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

Simulated tempering and other methods based on changing temperature are some of the more basic enhanced sampling methods. Little a priori knowledge of the system being sampled is required, and the changes in temperature affect all the degrees of freedom uniformly. The speedup due to simulated tempering can thus be considered a baseline to which speedups from more complicated methods can be compared. Such methods include Hamiltonian tempering12,48,49 and collective variable–based approaches like umbrella sampling,50 metadynamics,6 and temperature-accelerated MD.8 These methods often require additional physical intuition, but could potentially result in an even greater speedup. Unlike temperature, specific collective variables and Hamiltonian terms focus on a subset of the degrees of freedom of the entire system. In particular, careful design of a Hamiltonian tempering protocol would suffer less, relative to temperature, from efficiency limitations due to an explosion of states at high ladder rungs.12 Identifying a subset of the degrees of freedom that lead to good sampling of conformational changes of interest, however, is not always straightforward, and remains an active area of research.51–53

We believe that comparing the sampling efficiency of enhanced sampling methods vs. longtimescale MD simulations, as done in this study, could be an important way to guide future development efforts and spur innovation in this field. As noted in a recent review on equilibrium sampling in biomolecular simulations, no enhanced sampling method “has been documented to yield an order of magnitude improvement over conventional MD in the range of systems and conditions of primary interest,” and the “sampling problem cries out for a pragmatic focus on efficiency.”10 This study has shown that simulated tempering can indeed yield a consistent order-of-magnitude speedup for the conformational sampling of three complex biological systems. In addition, we have gained a deeper understanding of the various parameters that affect and limit the efficiency of this method when applied to these systems. We believe that this understanding is applicable to the broader class of methods, including replica exchange, that use elevated temperature to enhance sampling.

15 ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 16 of 29

Methods Simulated tempering simulations were performed on a special-purpose machine, Anton,26,40 and followed the published simulation protocols of the conventional MD simulations of BPTI,36 ubiquitin,37 and villin.38 Trajectories were clustered using a change-point algorithm combined with spectral clustering. Additional methodology details and a complete list of simulations can be found in the Supporting Information.

Acknowledgments The authors thank Michael Eastwood for helpful discussions and a critical reading of the manuscript, Paul Maragakis, Robert Dirks, Adam Lerer, and Kenneth Mackenzie for helpful discussions, Ansgar Philippsen for help with figures, and Rebecca Bish-Cornelissen and Berkman Frank for editorial assistance.

Supporting Information Two tables providing a list of simulations (Table S1) and details about the comparison between the conventional MD and simulated tempering simulations (Table S2), and four figures showing additional analysis (Figures S1–S4). This material is available free of charge via the Internet at http://pubs.acs.org/.

16 ACS Paragon Plus Environment

Page 17 of 29

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

References 1. McCammon, J.A.; Gelin, B.R.; Karplus, M. Dynamics of folded proteins. Nature 1977, 267(5612), 585–590. 2. Klepeis, J.L.; Lindorff-Larsen, K.; Dror, R.O.; Shaw, D.E. Long-timescale molecular dynamics simulations of protein structure and function. Curr. Opin. Struct. Biol. 2009, 19(2), 120–127. 3. Lyubartsev, A.P.; Martsinovski, A.A.; Shevkunov, S.V.; Vorontsov-Velyaminov, P.N. New approach to Monte Carlo calculation of the free energy: Method of expanded ensembles. J. Chem. Phys. 1992, 96(3), 1776–1783. 4. Marinari, E.; Parisi, G. Simulated tempering: A new Monte Carlo scheme. Europhys. Lett. 1992, 19(6), 451–458. 5. Sugita, Y.; Okamoto, Y. Replica-exchange molecular dynamics method for protein folding. Chem. Phys. Lett. 1999, 314(1), 141–151. 6. Laio, A.; Parrinello, M. Escaping free-energy minima. Proc. Natl. Acad. Sci. U.S.A. 2002, 99(20), 12562–12566. 7. Hamelberg, D.; Mongan, J.; McCammon, J.A. Accelerated molecular dynamics: A promising and efficient simulation method for biomolecules. J. Chem. Phys. 2004, 120(24), 11919–11929. 8. Maragliano, L.; Vanden-Eijnden, E. A temperature accelerated method for sampling free energy and determining reaction pathways in rare events simulations. Chem. Phys. Lett. 2006, 426, 168–175. 9. Schlick, T. Molecular dynamics-based approaches for enhanced sampling of long-time, large-scale conformational changes in biomolecules. F1000 Biol. Rep. 2009, 1(51), 1–9.

17 ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 18 of 29

10. Zuckerman, D.M. Equilibrium sampling in biomolecular simulations. Annu. Rev. Biophys. 2011, 40, 41–62. 11. Wu, X.; Damjanovic, A.; Brooks, B.R. Efficient and unbiased sampling of biomolecular systems in the canonical ensemble: A review of self-guided Langevin dynamics. Adv. Chem. Phys. 2012, 150, 255–326. 12. Liu, P.; Kim, B.; Friesner, R.A.; Berne, B.J. Replica exchange with solute tempering: A method for sampling biological systems in explicit water. Proc. Natl. Acad. Sci. U.S.A. 2005, 102(39), 13749–13754. 13. van Gunsteren, W.F.; Dolenc, J.; Mark, A.E. Molecular simulation as an aid to experimentalists. Curr. Opin. Struct. Biol. 2008, 18(2), 149–153. 14. Ovchinnikov, V.; Karplus, M.; Vanden-Eijnden, E. Free energy of conformational transition paths in biomolecules: The string method and its application to myosin VI. J. Chem. Phys. 2011, 134(8), 085103. 15. Periole, X.; Mark, A.E. Convergence and sampling efficiency in replica exchange simulations of peptide folding in explicit solvent. J. Chem. Phys. 2007, 126(1), 014903. 16. Huang, X.; Bowman, G.R.; Pande, V.S. Convergence of folding free energy landscapes via application of enhanced sampling methods in a distributed computing environment. J. Chem. Phys. 2008, 128(1), 205106. 17. Sutto, L.; D’Abramo, M.; Gervasio, F.L. Comparing the efficiency of biased and unbiased molecular dynamics in reconstructing the free energy landscape of metenkephalin. J. Chem. Theory Comput. 2010, 6(12), 3640–3646. 18. Pierce, L.C.T.; Salomon-Ferrer, R.; de Oliveira, C.A.F.; McCammon, J.A.; Walker, R.C. Routine access to millisecond time scale events with accelerated molecular dynamics. J. Chem. Theory Comput. 2012, 8(9), 2997–3002. 19. Miao, Y.; Feixas, F.; Eun, C.; McCammon, J.A. Accelerated molecular dynamics simulations of protein folding. J. Comput. Chem. 2015, 36(20), 1536–1549. 18 ACS Paragon Plus Environment

Page 19 of 29

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

20. Plimpton, S. Fast parallel algorithms for short-range molecular dynamics. J. Comput. Phys. 1995, 117, 1–19. 21. Fitch, B.G.; Rayshubskiy, A.; Eleftheriou, M.; Ward, T.J.C.; Giampapa, M.; Zhestkov, Y.; Pitman, M.C.; Suits, F.; Grossfield, A.; Pitera, J.; Swope, W.; Zhou, R.; Feller, S.; Germain, R.S. Blue Matter: Strong scaling of molecular dynamics on Blue Gene/L. Proc. Intl. Conf. Comput. Sci. (ICCS 2006) 2006, 3992, 846–854. 22. Phillips, J.C.; Braun, R.; Wang, W.; Gumbart, J.; Tajkhorshid, E.; Villa, E.; Chipot, C.; Skeel, R.D.; Kalé, L.; Schulten, K. Scalable molecular dynamics with NAMD. J. Comput. Chem. 2005, 26(16), 1781–1802. 23. Hess, B.; Kutzner, C.; van der Spoel, D.; Lindahl, E. GROMACS 4: Algorithms for highly efficient, load-balanced, and scalable molecular simulation. J. Chem. Theory Comput. 2008, 4(3), 435–447. 24. Case, D.A.; Cheatham, T.E., III; Darden, T.; Gohlke, H.; Luo, R.; Merz, K.M., Jr.; Onufriev, A.; Simmerling, C.; Wang, B.; Woods, R.J. The Amber biomolecular simulation programs. J. Comput. Chem. 2005, 26(16), 1668–1688. 25. Bowers, K.J.; Chow, E.; Xu, H.; Dror, R.O.; Eastwood, M.P.; Gregersen, B.A.; Klepeis, J.L.; Kolossvary, I.; Moraes, M.A.; Sacerdoti, F.D.; Salmon, J.K.; Shan, Y.; Shaw, D.E. Scalable algorithms for molecular dynamics simulations on commodity clusters. Proceedings of the ACM/IEEE Conference on Supercomputing (SC06) 2006, New York, NY: IEEE. 26. Shaw, D.E.; Dror, R.O.; Salmon, J.K.; Grossman, J.P.; Mackenzie, K.M.; Bank, J.A.; Young, C.; Deneroff, M.M.; Batson, B.; Bowers, K.J.; Chow, E.; Eastwood, M.P.; Ierardi, D.J.; Klepeis, J.L.; Kuskin, J.S.; Larson, R.H.; Lindorff-Larsen, K.; Maragakis, P.; Moraes, M.A.; Piana, S.; Shan, S.; Towles, B. Millisecond-Scale Molecular Dynamics Simulations on Anton. Proceedings of the Conference on High Performance Computing, Networking, Storage and Analysis (SC09) 2009, New York, NY: ACM.

19 ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 20 of 29

27. Dror, R.O.; Dirks, R.M.; Grossman, J.P.; Xu, H.; Shaw, D.E. Biomolecular simulation: A computational microscope for molecular biology. Annu. Rev. Biophys. 2012, 41, 429– 452. 28. Swendsen, R.H.; Wang, J.S. 1986 Replica Monte Carlo simulation of spin glasses. Phys. Rev. Lett. 2012, 57(21), 2607–2609. 29. Zuckerman, D.M.; Lyman, E. A second look at canonical sampling of biomolecules using replica exchange simulation. J. Chem. Theory Comput. 2006, 2(4), 1200–1202. 30. Beck, D.A.; White, G.W.; Daggett, V. Exploring the energy landscape of protein folding using replica-exchange and conventional molecular dynamics simulations. J. Struct. Biol. 2006, 157(3), 514–523. 31. Zheng, W.; Andrec, M.; Gallicchio, E.; Levy, R.M. Proc. Natl. Acad. Sci. U.S.A. 2007, 104(39), 15340–15345. 32. Denschlag, R.; Lingenheil, M.; Tavan, P. Efficiency reduction and pseudo-convergence in replica exchange sampling of peptide folding–unfolding equilibria. Chem. Phys. Lett. 2008, 458, 244–248. 33. Machta, J. Strengths and weaknesses of parallel tempering. Phys. Rev. E Stat. Nonlin. Soft Matter Phys. 2009, 80(5), 056706. 34. Rosta, E.; Hummer, G. Error and efficiency of replica exchange molecular dynamics simulations. J. Chem. Phys. 2009, 131(16), 165102. 35. Rosta, E.; Hummer, G. Error and efficiency of simulated tempering simulation. J. Chem. Phys. 2010, 132(3), 034102. 36. Shaw, D.E.; Maragakis, P.; Lindorff-Larsen, K.; Piana, S.; Dror, R.O.; Eastwood, M.P.; Bank, J.A.; Jumper, J.M.; Salmon, J.K.; Shan, Y.; Wriggers, W. Atomic-level characterization of the structural dynamics of proteins. Science 2010, 330(6002), 341– 346. 20 ACS Paragon Plus Environment

Page 21 of 29

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

37. Piana, S.; Lindorff-Larsen, K.; Shaw, D.E. Atomic-level description of ubiquitin folding. Proc. Natl. Acad. Sci. U.S.A. 2013, 110(15), 5915–5920. 38. Piana, S.; Lindorff-Larsen, K.; Shaw, D.E. Protein folding kinetics and thermodynamics from atomistic simulation. Proc. Natl. Acad. Sci. U.S.A. 2012, 109(44), 17845–17850. 39. Grossfield, A.; Zuckerman, D.M. Chapter 2: Quantifying uncertainty and sampling quality in biomolecular simulations. Annu. Rep. Comput. Chem. 2009, 5, 23–48. 40. Scarpazza, D.P.; Ierardi, D.J.; Lerer, A.K.; Mackenzie, K.M.; Pan, A.C.; Bank, J.A.; Chow, E.; Dror, R.O.; Grossman, J.P.; Killebrew, D.; Moraes, M.A.; Predescu, C.; Salmon, J.K.; Shaw, D.E. Extending the generality of molecular dynamics simulations on a special-purpose machine. Proceedings of the 27th IEEE International Parallel and Distributed Processing Symposium (IPDPS '13), 2013, pp. 933–945. Boston, MA: IEEE Computer Society. 41. Kubelka, J.; Chiu, T.K.; Davies, D.R.; Eaton, W.A.; Hofrichter, J. Sub-microsecond protein folding. J. Mol. Biol. 2006, 359(3), 546–553. 42. Fersht, A.R. Structure and Mechanism in Protein Science: A Guide to Enzyme Catalysis and Protein Folding; Freeman: New York, 1999. 43. Rauscher, S.; Neale, C.; Pomès, R. Simulated tempering distributed replica sampling, virtual replica exchange, and other generalized-ensemble methods for conformational sampling. J. Chem. Theory Comput. 2009, 5(10), 2640–2662. 44. Park, S. Comparison of the serial and parallel algorithms of generalized ensemble simulations: An analytical approach. Phys. Rev. E Stat. Nonlin. Soft Matter Phys. 2008, 77, 016709. 45. Zhang, C.; Ma, J. Comparison of sampling efficiency between simulated tempering and replica exchange. J. Chem. Phys. 2008, 129, 134112. 46. Moses, E.; Hinz, H.J. Basic pancreatic trypsin inhibitor has unusual thermodynamic stability parameters. J. Mol. Biol. 1983, 170(3), 765–776. 21 ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 22 of 29

47. Oliveberg, M.; Tan, Y.J.; Fersht, A.R. Negative activation enthalpies in the kinetics of protein folding. Proc. Natl. Acad. Sci. U.S.A. 1995, 92(19), 8926–8929. 48. Fukunishi, H.; Watanabe, O.; Takada, S. On the Hamiltonian replica exchange method for efficient sampling of biomolecular systems. J. Chem. Phys. 2002, 116(20), 9058– 9067. 49. Sugita, Y.; Kitao, A.; Okamoto, Y. Multidimensional replica-exchange method for freeenergy calculations. J. Chem. Phys. 2000, 113, 6042–6051. 50. Torrie, G.M.; Valleau, J.P. Nonphysical sampling distributions in Monte Carlo freeenergy estimation: Umbrella sampling. J. Comput. Phys. 1977, 23, 187–199. 51. Rohrdanz, M.A.; Zheng, W.; Clementi, C. Discovering mountain passes via torchlight: Methods for the definition of reaction coordinates and pathways in complex macromolecular reactions. Annu. Rev. Phys. Chem. 2013, 64, 295–316. 52. Pan, A.C.; Weinreich, T.M.; Shan, Y.; Scarpazza, D.P.; Shaw, D.E. Assessing the accuracy of two enhanced sampling methods using EGFR kinase transition pathways: The influence of collective variable choice. J. Chem. Theory Comput. 2014, 10(7), 2860– 2865. 53. Bolhuis, P.G.; Chandler, D.; Dellago, C.; Geissler, P.L. Transition path sampling: Throwing ropes over rough mountain passes, in the dark. Annu. Rev. Phys. Chem. 2002, 53, 291–318.

22 ACS Paragon Plus Environment

Page 23 of 29

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

Figures and Table

Figure 1. Conventional MD and simulated tempering sample the same conformational space: BPTI. Projections of the (A) conventional MD and (B) simulated tempering simulations onto the top two principal component vectors, PC1 and PC2. The colors indicate clusters of structurally similar states, and the thickness of the contour lines reflects the population density of that cluster (see SI). (C) Comparisons of mean structural snapshots of BPTI from the clusters identified in (A) and (B) between conventional MD (darker colors) and simulated tempering (lighter colors). The colors of the structures correspond to the cluster colors in the PCA projections, (A) and (B). The majority of the conformational variability in BPTI occurred in two adjacent loop regions, Loop 1 and Loop 3. The left-hand panel shows the mean structure of the 23 ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 24 of 29

native-like cluster (orange) and the call-out on the right-hand panel zooms in on the Loop 1 and Loop 3 regions, showing the mean structures of the two clusters with the largest RMSD deviations between conventional MD and simulated tempering (0.26 Å for the purple cluster and 0.25 Å for the cyan cluster). The structures are almost indistinguishable. Here, RMSD was measured over residues 4 to 54 of the protein backbone. Cluster visits as a function of time for (D) 100 µs of the 1-ms conventional MD simulation and (E) the simulated tempering simulation. The rate of cluster visits is approximately an order of magnitude faster in the simulated tempering simulation. The colors of the points correspond to the colors in the earlier panels. Only data from the lowest rung of simulated tempering simulation B1 (T = 300 K) were used.

24 ACS Paragon Plus Environment

Page 25 of 29

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

Figure 2. Conventional MD and simulated tempering sample the same conformational space: ubiquitin. Same as Fig. 1, but for ubiquitin. Conformational changes were primarily observed in two regions of the protein: i) the flipping of a section of Loop 5 composed of residues 50 to 54, and ii) motions involving Loop 1, Loop 3, and the top of the α helix between Loop 2 and Loop 3. The largest RMSD deviation between mean structures from the conventional MD and simulated tempering simulations was 0.13 Å, for the purple cluster measured over residues 2 to 71 of the protein backbone. Data are from simulations U3 through U5.

25 ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 26 of 29

Figure 3. Simulated tempering of villin folding achieves reversible-folding speeds similar to a conventional MD simulation at the melting temperature. (A) Cα RMSD traces of villin to PDB ID 2F4K41 in a conventional MD simulation at the melting temperature (top) and in a simulated tempering simulation (bottom). The simulated tempering simulation achieves reversible-folding speeds similar to those of the conventional MD simulation at the melting temperature. (B) An overlay of the mean folded-state structures determined from the conventional MD (darker orange) and simulated tempering (lighter orange) simulation V2, by averaging structures with Cα RMSD less than 3.5 Å from 2F4K, showing very close agreement; the difference in backbone RMSD is 0.13 Å. (We note that the conventional MD simulation was done at 370 K, whereas the temperature rung over which the simulated structure was averaged was at 369.1 K.) (C) Melting curves determined from simulated tempering simulations V1, V2, and V3 (red circles) and three conventional MD simulations (blue squares) at different temperatures showing good agreement over populations. Error bars represent standard errors of

26 ACS Paragon Plus Environment

Page 27 of 29

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

the mean over five equal blocks of the trajectories, and the dashed black line is a sigmoidal curve fit to the data.

27 ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

System

τf (µs)

τu (µs)

τf + τu (µs)

360 K standard MD

2.6(5)

6(1)

9(1)

370 K standard MD

2.2(2)

1.5(1)

3.7(3)

380 K standard MD

3.7(5)

0.9(1)

4.6(5)

Simulated tempering

3.2(5)

2.7(7)

5.9(9)

Page 28 of 29

Table 1. Average folding, τf, and unfolding, τu, times for villin in conventional MD and simulated tempering simulation V2 are shown. The sum of the average folding and unfolding times in the third column—the average reversible-folding time—is a measure of sampling efficiency (smaller times correspond to more efficient sampling). Only data from the lowest rung of simulated tempering run V2 were used in this calculation. Standard errors are indicated in parentheses.

28 ACS Paragon Plus Environment

Page 29 of 29

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

TOC Graphic

29 ACS Paragon Plus Environment