Driving Structural Transitions in Molecular Simulations Using

space, and provides direct information about the dynamical processes of the ... while maintaining detailed balance, which can be challenging,1,2 and i...
3 downloads 0 Views 5MB Size
Subscriber access provided by Gothenburg University Library

Article

Driving Structural Transitions in Molecular Simulations Using Nonequilibrium Candidate Monte Carlo An#l Kurut, Rasmus Fonseca, and Wouter Boomsma J. Phys. Chem. B, Just Accepted Manuscript • DOI: 10.1021/acs.jpcb.7b11426 • Publication Date (Web): 20 Dec 2017 Downloaded from http://pubs.acs.org on December 20, 2017

Just Accepted “Just Accepted” manuscripts have been peer-reviewed and accepted for publication. They are posted online prior to technical editing, formatting for publication and author proofing. The American Chemical Society provides “Just Accepted” as a free service to the research community to expedite the dissemination of scientific material as soon as possible after acceptance. “Just Accepted” manuscripts appear in full in PDF format accompanied by an HTML abstract. “Just Accepted” manuscripts have been fully peer reviewed, but should not be considered the official version of record. They are accessible to all readers and citable by the Digital Object Identifier (DOI®). “Just Accepted” is an optional service offered to authors. Therefore, the “Just Accepted” Web site may not include all articles that will be published in the journal. After a manuscript is technically edited and formatted, it will be removed from the “Just Accepted” Web site and published as an ASAP article. Note that technical editing may introduce minor changes to the manuscript text and/or graphics which could affect content, and all legal disclaimers and ethical guidelines that apply to the journal pertain. ACS cannot be held responsible for errors or consequences arising from the use of information contained in these “Just Accepted” manuscripts.

The Journal of Physical Chemistry B is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.

Page 1 of 27 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

Driving Structural Transitions in Molecular Simulations Using Nonequilibrium Candidate Monte Carlo Anıl Kurut,∗,† Rasmus Fonseca,‡ and Wouter Boomsma∗,† †Department of Computer Science, University of Copenhagen, Denmark ‡Department of Molecular and Cellular Physiology, Stanford University, United States of America E-mail: [email protected]; [email protected] Phone: +45 51 92 36 00

Abstract

Hybrid simulation procedures which combine molecular dynamics with Monte Carlo are attracting increasing attention as tools for improving the sampling efficiency in molecular simulations. In particular, encouraging results have been reported for nonequilibrium candidate protocols, in which a Monte Carlo move is applied gradually, and interleaved with a process that equilibrates the remaining degrees of freedom. While initial studies have uncovered a substantial potential of the method, its practical applicability for sampling structural transitions in macromolecules remains incompletely understood. Here, we address this issue by systematically investigating the efficiency of nonequilibrium candidate Monte Carlo on the sampling of rotameric distributions of two peptide systems at atomistic resolution both in vacuum and explicit solvent. The

1

ACS Paragon Plus Environment

The Journal of Physical Chemistry 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

studied systems allow us to directly probe the efficiency with which a single or a few slow degrees of freedom can be driven between well-separated free energy minima, and to explore the sensitivity of the method towards the involved free parameters. In line with results on other systems, our study suggests that order-of-magnitude gains can be obtained in certain scenarios, but also identifies challenges that arise when applying the procedure in explicit solvent.

Introduction The Markov chain Monte Carlo (MC) and Molecular Dynamics (MD) approaches to molecular simulation are thermodynamically equivalent, but use fundamentally different sampling strategies. MD exploits gradients (i.e. forces) for efficient local exploration of the phase space, and provides direct information about the dynamical processes of the system, but it is limited to small steps in each iteration of the simulation. In contrast, MC can make dramatic changes to the molecule in each iteration, thus allowing for faster convergence of thermodynamic quantities, but provides no (or little) information about dynamics. While this would suggest that MC be the preferred tool for calculating thermodynamic averages, this is not generally true. In fact, in particular for macromolecular simulations, even long time scale simulations that probe only thermodynamic quantities are frequently conducted using MD-based sampling strategies. The problem with MC is that although it has the potential to improve performance by making large structural updates, it is not trivial to construct such moves in practice. This is especially true for explicit solvent simulations, where any large modification in the solute structure is almost certain to fail due to overlap with solvent molecules. However, even in an implicit solvent setting, the design of efficient moves requires proper handling of correlations between relevant degrees of freedom (DoFs) while maintaining detailed balance, which can be challenging, 1,2 and is not necessarily transferable between systems. 2

ACS Paragon Plus Environment

Page 2 of 27

Page 3 of 27 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

One solution in the search for efficient Monte Carlo moves could be to modify DoFs gradually over a number of interpolation steps, and interleave this process with an integration of the remaining DoFs, thus allowing them to equilibrate. This is the idea behind the Nonequilibrium Candidate Monte Carlo (NCMC) procedure proposed by Chodera and coworkers. 3 Formally, such a Monte Carlo move can be described as a nonequilibrium process, and the NCMC framework specifies the requirements for securing the correct stationary distribution with such moves, building on pioneering work in various fields over the last decades. 4–11 The NCMC approach can be viewed as a way to simplify the design of Monte Carlo moves by focusing on driving particular DoFs of interest, while other DoFs are adjusted automatically in correspondence to the correlations dictated by the force field. Recent years have provided a number of interesting applications of these ideas, including encouraging results on simple two-state models, 3,12–14 but also on fully atomistic settings, for instance for switching between protonation states in constant pH simulations, 15,16 and driving structural transitions in peptide systems. 17 While these results testify to a substantial potential in non-equilibrium-based sampling method, more work remains to be done to fully clarify how well these techniques work in practice. In particular, in the context of macromolecular simulations using standard molecular force fields, it is not clear whether order-of-magnitude efficiency gains can generally be expected, and if so, how difficult it is to choose parameter values to obtain such improvements. In this paper, we systematically investigate how efficiently non-equilibrium simulation procedures can sample transitions in simple peptide systems. Unlike previous work which addressed how to construct suitable candidate structures through global perturbations of a system, 17 we here study non-equilibrium sampling as the basis of a more standard Monte Carlo move, where only one or a few slow DoFs are perturbed at a time. We investigate two peptide systems, chosen to have a few dominant slow DoFs (the sidechain χ angles), with a clear separation of timescales to the remaining DoFs. In each case, the systems have clear free energy minima (the rotameric states of the side chains) and substantial barriers 3

ACS Paragon Plus Environment

The Journal of Physical Chemistry 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

between them, and we probe the sampling efficiency both by counting transitions between these minima and by inspecting the convergence rate to the global stationary distribution. The systems are designed to be simple enough to allow for complete convergence of our simulations, while being complex enough to address some of the challenges found in real biomolecular simulation tasks. Initially, we study the systems in vacuum, to investigate the ability of NCMC to handle coupled internal DoFs, testing the hypothesis that NCMC can simplify the design on Monte Carlo moves by adjusting secondary DoFs automatically. We then address the possibility of conducting Monte Carlo moves in explicit solvent, and finally discuss the opportunities and limitations of the method in this context.

NCMC Framework The idea of incorporating nonequilibrium processes within a simulation to obtain equilibrium properties has appeared in various forms over the last decades. 4–11 In an effort to unify these ideas, Chodera and coworkers introduced the NCMC framework 3 for conducting updates to both structural states and thermodynamic parameters associated with a system. In this paper, we will use the NCMC terminology, but limit ourselves to sampling of structural DoFs in the canonical ensemble. NCMC consists of two phases: 1) An equilibrium phase where all DoFs are sampled using any conventional equilibrium method and 2) a nonequilibrium switching phase where a switching protocol escorts the system from an initial state to a final state (Figure Figure1). The switching protocol (Λ) is uniquely determined by the initial state of the NCMC phase (x0 ) and consists of alternating series of perturbation (E) and propagation kernels (R), Λ = (E1 , R1 , E2 , R2 , . . . RN ). Each perturbation kernel drives one or few DoFs out of equilibrium to an excited state, Ei = (xi−1 → x∗i ). Following that, a propagation kernel is applied to another set of DoFs (typically the set of unperturbed DoFs) and relaxes them conditioned 4

ACS Paragon Plus Environment

Page 4 of 27

Page 5 of 27 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

on the driven ones, Ri = (x∗i → xi ). The main idea behind these alternating kernels is to smoothly switch the state of the driven DoFs while making the necessary adjustments to the rest of the system. Under the protocol Λ, the switching trajectory (X) from an initial configuration (x0 ) to a final configuration (xN ) through a number of NCMC steps N can be written as follows; E

R

E

R

E

R

1 1 2 2 N N X(Λ, x0 ) = x0 −→ x∗1 −→ x1 −→ x∗2 −→ x2 · · · xN −1 −−→ x∗N −−→ xN

(1)

where x represent a configuration on the phase space of the system including both coordinates and momenta. Equilibration with MD

NCMC switching proposal for

1

Initialization: Perturbation 1: n Verlet integration steps Partial MC rotation (all DoFs) (only 1) Randomize Randomize velocities velocities

1

...

Propagation 1: velocity-Verlet integration (all DoFs except 1)

Propagation N Accept or reject NCMC proposal

Figure 1: An iteration of an NCMC simulation including an equilibration phase of n steps of MD integration followed by a nonequilibrium phase with alternating series of perturbation and propagation kernels. Semi-transparency is used to denote parts of the molecule that are static, while the parts that are updated in a given step are depicted in full colour.

Unlike conventional MC moves where the acceptance criterion depends on the potential energy difference, in NCMC, it depends on the work done during the switching trajectory. A strict pairwise detailed balance condition enforces that the accepted candidates sample

5

ACS Paragon Plus Environment

The Journal of Physical Chemistry 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 6 of 27

the Boltzmann distribution: 3 acceptance probability

probability of generating X

z }| { z }| { ˜ N , Λ)P ˜ (Λ|x ˜ N )A(xN → x0 ) π(x0 ) ζ(X|x0 , Λ) P (Λ|x0 ) A(x0 → xN ) = π(xN )ζ(X|x | {z } | {z }

Boltzmann weight

where π(x) =

(2)

probability of proposing Λ

R exp(−βU (x)) exp(−βU (x))dx

is the Boltzmann weight of a configuration x with a Hamiltonian

U (x) and inverse temperature β. ζ(X|x0 , Λ) is the probability of generating trajectory X given the initial configuration x0 and protocol Λ. P (Λ|x0 ) denotes the probability of proposing protocol Λ given the initial configuration x0 and A(x0 → xN ) is the acceptance probability of the NCMC proposal trajectory X. In order to satisfy detailed balance there ˜ = (R ˜ N , E˜N , R ˜ N −1 , E˜N −1 , · · · , R˜1 , E˜1 )) with a non zero proposal must be a reverse protocol (Λ ˜ N ) > 0) given that the initial configuration is xN . Under these conditions, probability (P (Λ|x ˜ can be written as, the probability of generating trajectories X and X

ζ(X|x0 , Λ) =

N Y

Ei Ri

(3a)

E˜i R˜i

(3b)

i=1

˜ N , Λ) ˜ = ζ(X|x

1 Y i=N

where Ei and Ri represent conditional probability of each perturbation and propagation kernels, respectively. Inserting equations 3a and 3b into the detailed balance condition (equation 2), we can obtain an expression for the acceptance probability of an NCMC proposal:  Q1 ˜ ˜ ˜ π(xN ) A(x0 → xN ) i=N Ei Ri P (Λ|xN ) =  QN A(xN → x0 ) π(x0 ) i=1 Ei Ri P (Λ|x0 ) N ˜ ˜ Y Ei Ri

 = exp − β[U (xN ) − U (x0 )]

i=1

|

!

Ei R i {z }

˜ N) P (Λ|x P (Λ|x0 )

(4)

= 1 (in our case)

In our case, the perturbation kernel is chosen to be a deterministic rotation of a dihedral 6

ACS Paragon Plus Environment

Page 7 of 27 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

angle, and the propagation kernel is a symplectic integrator, in which case

QN  E˜i R˜i  i=1

Ei Ri

cancels (see original reference 3 for details). After applying the Metropolis criterion we are left with:

"

 P (Λ|x ˜ N) Acc(x0 → xN ) = min 1, exp − β[U (xN ) − U (x0 )] P (Λ|x0 ) 

# (5)

˜ are directly determined by how The relative proposal probabilities of the protocols Λ and Λ we generate the NCMC proposals (see NCMC proposals section).

Benchmark System The primary goal of this paper is to investigate the efficiency of NCMC in atomistic protein systems and to explore its sensitivity to two key parameters: 1) the number of NCMC steps and 2) the number of equilibration steps. To ensure that full convergence can be achieved within a reasonable simulation time, we limit ourselves to two peptide systems: Valine (VAL) and Methionine (MET) embedded centrally in a helix of eight Alanine residues (AAAAXAAAA). These systems were chosen due to a clear separation of timescales within the dynamics of their DoFs, with the χ angles of the central side chains being particularly slow. Furthermore, these angles have well-defined free energy minima separated by substantial barriers, allowing us to directly assess the sampling efficiency of NCMC by measuring the transition rates between their free energy minima. VAL was specifically chosen for its slow transitioning rotameric states and for its uneven equilibrium populations (Figure 2), which has previously been identified as a challenge for the NCMC method. 3 MET, which has three slow DoFs (χ1 , χ2 , and χ3 ) with increasing rotamer transition rates relative to one another, was chosen to investigate effect of non-trivial coupling between its rotamers (Figure 2). It is also substantially bulkier, and thus expected to be more difficult to rotate through explicit

7

ACS Paragon Plus Environment

The Journal of Physical Chemistry

water. For each of the two systems, two simulation variants were constructed; one in vacuum and one in explicit solvent with the TIP3P water model. 18 Finally, to explicitly rule out any slow dynamics in the backbone, the peptides are constraint to a helical conformation (see Section: Simulation details). VAL

MET 0.06

0.06 χ1

0.05

0.05

0.04

0.04

PDF

PDF

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 8 of 27

0.03

χ3

χ1

0.03

0.02

0.02

0.01

0.01

0.00

χ2

vacuum water

0.00 0

60

120

180

240

300

360

60 120 180 240 300

60 120 180 240 300

60 120 180 240 300

χ1 ( ◦ )

χ2 ( ◦ )

χ3 ( ◦ )

χ1 ( ◦ )

Figure 2: The marginal equilibrium probability density function (PDF) of χ angles of VAL and MET in vacuum and in explicit solvent. (Insets) Structure of studied peptides: The backbone is depicted in stick representation and side chains are depicted in balls & sticks representation. The helix formed by the ALA residues is shown as pink ribbon. The axes of rotation corresponding to the χ1 , χ2 and χ3 angles are highlighted.

NCMC proposals Perhaps the simplest Monte Carlo move for sampling side-chain rotameric states would be a mixture model of two Gaussians, centred at ±120◦ relative to the current state. However, such moves would be subject to high rejection rates when rotameric states are unevenly populated, and thus very costly in an NCMC setting, where each move can involve thousands of energy evaluations. To ensure a fair representation of the NCMC approach, we therefore conduct our experiments in an idealized scenario, where candidate values for the χ angles are sampled directly from their stationary equilibrium probability density functions. While these distributions can in principle be estimated adaptively during the simulation, we here estimate them from initial MD simulations (see Figure 2). In both solvent conditions, we construct 1D proposals for driving the χ1 angle of both VAL and MET by sampling end

8

ACS Paragon Plus Environment

Page 9 of 27 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

points (χN ) for NCMC switching trajectories from Gaussian mixture models which were fitted on the corresponding reference MD simulations. To explicitly drive the additional χ angles of MET, we also construct 2D and 3D proposals where (χ1 , χ2 ) and (χ1 , χ2 , χ3 ) are simultaneously driven, respectively (see Simulation details). The use of these proposal distributions gives rise to a bias in the simulation which must be compensated for in the   ˜ Λ|xN ) in eq 5 to ensure that we sample from the original Boltzmann acceptance criterion PP((Λ|x 0) distribution. During an NCMC move, a selected angle is rotated from its current state to the proposed state through the shortest angular path (∆χi ) by dividing the rotation into N i equally spaced steps, where i is the iteration number. Since moves are proposed independently from the current state, the total angular change, ∆χi , varies between iterations. It is therefore appropriate to dynamically adjust the number of NCMC steps (N i ) based on the size of the proposed move. In this way, a more relevant parameter, the size of the angular perturbation, ∆χi /N i , is kept fixed rather than the number of NCMC steps. For the results in our paper, the reported NCMC step parameter refers to the number of steps for a 120◦ move. To calculate the actual number of NCMC steps for a given move in multiple angular dimensions, we use the angular dimension in which the angular change is the largest.

Computational Details MD simulations MD simulations were run for 2000 ns using OpenMM 19 version 7 (GPU platform) in the Canonical ensemble. A velocity Verlet integrator with 2 fs time step was used together with an Anderson thermostat at 300 K. All hydrogen bonds were constrained with a tolerance of 10−6 . Periodic boundary conditions were applied and a 10 ˚ A cutoff was used for long range interactions including electrostatic interactions in a cubic box of 27 ˚ A. A custom backbone 9

ACS Paragon Plus Environment

The Journal of Physical Chemistry 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 10 of 27

potential is implemented to restrain the peptide to a helical configuration:

Ub = k0 (1 − cos(θ − θ0 ))

(6)

where the force constant k0 is set to 10 kJ/mol and θ0 is set to standard values for an α-helical configuration. NCMC simulations In our study, an iteration of an NCMC simulation consists of two phases: 1) An equilibration phase, which is a conventional MD simulation with 500 steps of velocity Verlet integration together with an Anderson thermostat at 300 K in the Canonical ensemble and 2) an NCMC switching phase, which is a series of steps, each consisting of an MC perturbation followed by a propagation of velocity Verlet integration with a 2 fs time step in the micro-Canonical ensemble (Figure 1). The propagation kernel is applied to all atoms except the ones defining the perturbed χ angles. The atoms that define the χ angles are fixed in space by setting their masses to zero during the propagation. Note that, in addition to the perturbed χ angle, this approach leads to four more frozen DoFs (bond lengths and angles between the massless particles). H-bond constraints are applied as in the MD simulations and iterated till full convergence before proceeding to the next perturbation. Prior to each phase, velocities are sampled from the Maxwell-Boltzmann distribution following a Gibbs sampling procedure. The long range interaction cutoffs are set consistently with the MD simulations. All NCMC simulations were performed using Phaistos simulation package 20 with the QuickStep library (github.com/wouterboomsma/quickstep). Analysis In order to establish the efficiency of NCMC simulations relative to conventional MD simulations we have utilised two measures: 1) The transition rate per an effective simulation time unit and 2) The Jensen-Shannon Divergence (JSD) of sampled statistics at a given effective

10

ACS Paragon Plus Environment

Page 11 of 27 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

simulation time unit from a 1000 ns long MD simulation. In our setup, due to the fact that we apply constraints, an NCMC step contains two force evaluations, one after perturbation and one after propagation. Time-wise, one NCMC step is therefore considered as corresponding to two conventional MD integration steps. We confirmed the validity of our effective time definition by comparing the run time of NCMC and MD steps on the CPU implementation of OpenMM version 7. 19 In addition, each NCMC phase includes two more energy evaluations to calculate its acceptance probability. Consequently, an effective simulation time for an NCMC simulation with n-steps of equilibration and N i NCMC steps in iteration i, run for I iterations is calculated as

T effective =

I  X i=1

1 NCMC =2 MD

z }| {  i 2 ]) · (2 fs) . ( |{z} 500 +[ N · 2 + |{z} | {z } initial + final

equilibration

(7)

time step

For the calculation of rotamer transition rates, we defined three rotamer states for all χ angles using the following ranges [0,120), [120,240), [240,360]. Whenever a χ angle switch between these states is encountered in the trajectory, it is considered as a rotamer transition and the number of rotamer transitions throughout a simulation is counted accordingly. Using these relations, the rotamer transition rate is defined as the number of rotamer transitions per ns of effective simulation time. The χ angles are recorded after each completion of an NCMC iteration. While transition rates give us an insight about sampling efficiency of individual χ angles it contains no information about how well we sample their correlations. To remedy this, Jensen-Shanon divergence is used to investigate how close the NCMC-sampled distribution (P ) is to the target distribution (Q) by measuring the distance between two distributions from their average. 1 1 JSD(P kQ) = DKL (P kM ) + DKL (QkM ) 2 2 11

ACS Paragon Plus Environment

(8)

The Journal of Physical Chemistry 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

where M = 12 (P + Q) and DKL (P kQ) =

P  i

P (i) P (i) ln Q(i)



Page 12 of 27

is Kullback–Leibler divergence.

A 1000 ns long MD simulation is used as the reference target distribution (Q) and 20 ns of NCMC and MD simulations are used as sampled distributions (P ). Averages and standard error of means are calculated over 10 simulations in vacuum and 5 simulations in solvent. The distributions P and Q are determined by fitting a Gaussian mixture model to the χ angle trajectories using the scikit-learn python package 21 with initialisation of the means at (60, 180, 240) for each χ angle. A numerical integration is performed by using a million sample points from the corresponding fits. We chose JSD rather than the more direct comparison of the two distributions, DKL , because JSD is more robust towards the regions that are not sampled by P but present in Q.

Results To understand how efficiently an NCMC move can relax the internal DoFs of peptides, we initially conducted simulations in vacuum, thus eliminating the complications associated with on-pathway water molecules. We first investigated acceptance and efficiency of NCMC sampling for both VAL and MET while driving only the slowest degree of freedom, χ1 . Then, we performed multi-dimensional NCMC simulations for MET in vacuum to determine the impact of explicitly driving multiple slow DoFs. After systematically establishing the efficiency of NCMC in vacuum, we repeated the same experiments in explicit solvent to examine the difficulty of rotating the side-chains through explicit water molecules.

NCMC in Vacuum Acceptance of 1D proposals As an initial test of our simulation setup, we investigated the acceptance rate of NCMC

12

ACS Paragon Plus Environment

Page 13 of 27

proposals for driving a single DoF, χ1 , in vacuum as a function of the number of NCMC steps (Figure 3). Two opposite trends are observed from the acceptance curve: the leftVAL

70

Acceptance of NCMC, %

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

MET

65 60 55 50 45 40 35 MC

1

10 102 103 104 105 MC

NCMC steps

1

10 102 103 104 105

NCMC steps

Figure 3: Acceptance of NCMC proposals which drives χ1 in vacuum as a function of NCMC steps for VAL (left) and MET (right). The semi-transparent dots denote points for which the correct Boltzmann distribution is not obtained, and which should therefore be disregarded (see text). The ‘MC’ point refers to a simulation where the entire NCMC switching phase is replaced by a one-step MC perturbation without any propagation.

most part shows a negative correlation with increasing NCMC steps while the right-most part displays a positive correlation. The right-most part behaves as expected: a longer switching phase generally corresponds to a more subtle update of the system, and thus a higher acceptance rate. An explanation to the left-most part can be found by considering the sampled distributions of the NCMC and MD simulations, which clearly show that the simulations with fewer NCMC steps are not converging to the expected distribution (Figure 4). These results highlight an important requirement for the practical use of NCMC: the equilibration phase must be sufficiently long to ensure ergodic sampling. While explicitly driving only the slowest degree of freedom (χ1 ), we rely on the equilibration phase for sampling the remaining DoFs. When insufficient time is allocated for equilibration, the sampling becomes conditional on the perturbed state. Indeed, increasing the number of equilibration steps resolves this problem and restores ergodicity (Figure 5). This necessary compensation of the equilibration time is inherently associated with a decrease in efficiency, which invalidates the performance increase we observe for the this left-most range of small 13

ACS Paragon Plus Environment

The Journal of Physical Chemistry

VAL

0.010

MET 0.004

10 20

0.008

20 30

0.002

0.006

∆PDF

∆PDF

0.004

0.000 −0.002 −0.004

0.002 0.000 0

60

120

180

240

300

360

−0.006

60 120 180 240 300

60 120 180 240 300

60 120 180 240 300

χ1 ( ◦ )

χ2 ( ◦ )

χ3 ( ◦ )



χ1 ( )

Figure 4: Difference between rotamer probability density functions of MD and NCMC for VAL and MET in vacuum using 1D proposals with labeled NCMC steps and 500 equilibration steps.

NCMC-step values (see below). 0.07

MD 500 5000 10000

0.06 0.05

0.004 0.002

0.04

∆PDF

PDF

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 14 of 27

0.03 0.02

0.000 −0.002 −0.004

0.01

500 5000 10000

−0.006

0.00 0

60

120

180

χ1 ( ◦ )

240

300

360

0

60

120

180

χ1 ( ◦ )

240

300

360

Figure 5: χ1 rotamer probability density function of VAL in vacuum sampled with various lengths of equilibration phase and 5 NCMC steps using 1D proposals (left). The difference between the probability densities of MD and NCMC (right). The black line and shaded area show mean and the standard deviation taken over four 500ns long MD simulations. Labels indicate the number of equilibration steps performed prior to each NCMC switching phase.

Efficiency of 1D sampling In order to evaluate the efficiency of NCMC sampling we have defined two measures: 1) the rotamer transition rate that indicates the number of transitions per effective simulation time unit and 2) Jensen-Shannon Divergence (JSD) that measures the distance between the true equilibrium distribution and a sampled distribution at a given effective simulation time 14

ACS Paragon Plus Environment

Page 15 of 27 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

unit. In both cases, for fair comparisons with MD, the unit of time is based on the total number of energy evaluations conducted (see Simulation details) where one NCMC step is equivalent to 4 fs of effective simulation time unit (two MD steps). Although investing in more NCMC steps elevates the acceptance (Figure 3), at some point, the cost of an NCMC move becomes too high, resulting in an overall reduction of the sampling efficiency. For both VAL-1D

MET-1D

Figure 6: Rotamer transition rate (top) and Jensen-Shannon Divergence of MD and NCMC at 20 ns effective simulation time from an independent 1000 ns of MD simulation as a function of NCMC steps for VAL (left) and MET (right) in vacuum using 1D proposals. The semitransparent dots denote points for which the correct Boltzmann distribution is not obtained, and which should therefore be disregarded (see text). The ‘MC’ point refers to a simulation where the entire NCMC switching phase is replaced by a one-step MC perturbation without any propagation.

VAL and MET, this effect can clearly be seen in the transition rates as well as in the JSD values where the former starts to decrease and the latter starts to increase (Figure 6). The point of maximum efficiency is found as a trade-off between the cost and the acceptance rate. For VAL, this maximum (the minimum in JSD) can directly be related to the peak of the χ1 transition rate which is the only slow DoFs in the system. This relationship is less straightforward for MET due to the remaining rotamer DoFs: Although driving χ1 provides substantial speed up for χ1 transitions, it does not necessarily lead to χ2 transitions due to 15

ACS Paragon Plus Environment

The Journal of Physical Chemistry

weak correlations between χ1 and χ2 (Figure 7). As a consequence, only subtle improvements are observed in the χ2 transition rate, leaving it as the slowest degree of freedom, and thus the bottleneck for the further efficiency improvements. MET 350 300

χ1

250 200 150 100 50 0 350 300

χ2

250 200 150 100 50 0 350 300 250

χ3

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 16 of 27

200 150 100 50 0 60 120 180 240 300

60 120 180 240 300

60 120 180 240 300

χ1

χ2

χ3

Figure 7: Visualisation of the 3D probability density function of MET χ angles sampled in an NCMC simulations (30 NCMC steps) by inspection of its pairwise bivariate distributions (off-diagonal), and the 1D univariate marginal distributions (diagonal). For comparison, the corresponding bivariate plots for the MD simulations are included in red above the diagonal, and the marginals are plotted in red in the diagonal (hardly visible to extensive overlap with the NCMC results). n the off-diagonal plots, lighter colors indicate higher probability densities. The color of the lowest probability density is turned off for clear identifications of the peak locations and their shapes. The axes of the upper diagonal subplots are reversed to facilitate a direct comparison of the peaks. The plots illustrate the free energy barriers in the system, and the similarity between MD and NCMC establishes that the Boltzmann distribution is correctly sampled by our NCMC implementation.

Multi-dimensional NCMC sampling Since χ1 and χ2 demonstrate only a weak coupling and their transitions are comparably slow, further performance improvements can be expected by driving both DoFs simultaneously. We probe this effect by conducting simulations using 2D and 3D NCMC moves, which act on (χ1 , χ2 ) and (χ1 , χ2 , χ3 ) angles, respectively. Indeed, in 2D sampling, we see that directly 16

ACS Paragon Plus Environment

Page 17 of 27 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

driving all slow DoFs boosts the efficiency dramatically (Figure 8). Although it results in similar χ1 transitions as in the 1D sampling, it vastly improves χ2 transitions, thus leading to much lower JSD values. As expected, driving the fast χ3 angle along with the two slow angles does not provide any additional improvements, producing similar JSD levels as in 2D sampling. MET-2D

MET-3D

Figure 8: Top: Acceptance of 2D (left) and 3D (right) NCMC proposals. Middle: Rotamer transition rates. Bottom: Jensen-Shanon Divergence of MD and NCMC from an independent MD simulation of 1000 ns as a function of NCMC steps for MET in vacuum.

NCMC in explicit solvent One of the interesting promises of NCMC is the potential to conduct Monte Carlo updates in an explicit water environment. One would hope that this could broaden the applicability of

17

ACS Paragon Plus Environment

The Journal of Physical Chemistry

Monte Carlo simulations and provide substantial speedups to molecular simulations involving large structural rearrangements. Here, we investigate this potential on our peptide systems, by repeating the experiments in explicit water, and studying the ability of NCMC to relax the water molecules during a structural transition from one rotameric state to another. VAL-1D

Rotamer transition rate, ns−1

Acceptance of NCMC, %

50

MET-1D

MET-2D

MET-3D

40 30 20 10 0 2.0 χ1 χ2 χ3

1.5 1.0 0.5

0.0 0.10 0.08

JSD at 20ns

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 18 of 27

0.06 0.04 0.02

MD

10

102

103

NCMC steps

104

MD

10

102

103

104

MD

NCMC steps

10

102

103

NCMC steps

104

MD

10

102

103

104

NCMC steps

Figure 9: Acceptance of NCMC switching proposals (top), rotamer transition rates (middle) and Jensen-Shannon Divergence of MD and NCMC from an independent MD simulation of 1000 ns as a function of NCMC steps for VAL and MET in explicit solvent.

Overall, our results show that the acceptance of NCMC proposals is lower than for the vacuum simulations, which results in a general reduction in the sampling efficiency both in terms of transition rates and JSD convergence (Figure 9). In contrast to the vacuum behaviour, the acceptance of 1D proposals in solvent demonstrates almost identical behaviour for VAL and MET, which suggests that the bottleneck for acceptance is the propagation of water molecules rather than the internal DoFs of the peptide chains. To investigate this 18

ACS Paragon Plus Environment

Page 19 of 27

effect in detail, we computed the differences in total energy produced by the accepted and rejected NCMC proposals and split them into their kinetic and potential components (Figure 10). In the solvent case, the distribution of the total energy differences in all proposals has a 0.35 µ:4.32 0.49 µ:7.46

0.10

0.05

vacuum water

0.25

0.01 µ:-0.22 0.01 µ:-0.26

0.20

0.00 µ:-0.25 0.00 µ:-0.26

PDF of

PDF of

VAL Rejected proposals

0.15

0.00

Accepted proposals

MET

0.18 µ:2.42 0.53 µ:6.30

0.15 0.10

0.05

Accepted proposals

Rejected proposals

VAL

0.00 0.08 µ:0.87 0.34 µ:3.93

0.15

0.16 µ:1.79 0.28 µ:4.11

All proposals

0.20

All proposals

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

0.10

0.05 0.00 -20

∆Epot ∆Ekin 0.02

MET 0.43 0.49 0.33 0.49

µ:2.73 µ:2.78 µ:-0.31 µ:3.51

vacuum water

0.47 0.50 0.35 0.48

µ:3.94 µ:5.18 µ:0.38 µ:2.28

0.01

0.00 0.03

0.21 0.35 0.49 0.56

0.02

0.21 0.35 0.48 0.57

µ:-4.68 µ:-12.41 µ:4.46 µ:12.15

µ:-4.68 µ:-12.75 µ:4.43 µ:12.49

0.01

0.00 0.30 0.44 0.42 0.51

0.02

0.33 0.44 0.42 0.52

µ:-1.61 µ:-2.71 µ:2.49 µ:6.63

µ:-0.83 µ:-2.61 µ:2.62 µ:6.71

0.01

0.00 0

20

Etot (kT)

40

-20

0

20

Etot (kT)

-100-50

40

0 50 100 150

E (kT)

-100-50

0 50 100 150

E (kT)

Figure 10: Probability density functions of the total energy difference (left) and the potential (∆Epot ) and kinetic energy (∆Ekin ) differences (right) before and after an NCMC move using 4500 NCMC steps in vacuum (red) and explicit water (blue). Labels indicate probability mass larger than 5 kT energy difference that will ultimately lead to rejections of the NCMC move. The mean values (µ) of the corresponding energy differences are given to quantify the shift of the distribution towards higher values.

substantial shoulder towards higher energies, which is what ultimately leads to the reduced acceptance rate (Figure 10, left-bottom). A hint towards the origin of this shoulder can be found by comparing the distribution of the energy components in the accepted proposals with the complete set of proposals (Figure 10, right, bottom and middle). The set of accepted candidates has a tendency towards lower potential energy and higher kinetic energy. In other words, an average increased kinetic energy among proposals means that only those structures with sufficiently low potential energy can be accepted. This effect becomes more 19

ACS Paragon Plus Environment

The Journal of Physical Chemistry

pronounced when dealing with explicit water molecules. An NCMC step with a large step size will create substantial overlaps with the water molecules in its path, leading first to increased potential energies (during perturbation), and subsequently to an elevated kinetic energy during propagation. This effect can also be seen by splitting the potential energy contributions of our proposals into protein-protein (P2P), protein-water (P2W), and waterwater (W2W) components (Figure 11). The narrow distributions of energy differences in P2P and P2W indicate that the protein-protein and protein-water clashes that are generated by the perturbation step can be resolved during the NCMC switching phase. However, in the W2W case the tail towards elevated energy differences in the rejected moves suggests that this is the dominant contribution to the change in total energy and highlights that waterwater clashes are the bottleneck for acceptance. 0.020

Rejected Proposals Accepted Proposals 0.43 µ:2.73 0.38 µ:-0.15

EP2P

0.015

All Proposals

0.21 µ:-4.68 0.24 µ:-5.03

0.30 µ:-1.61 0.33 µ:-1.86

0.010

vacuum water

0.005 0.000 0.50 µ:3.96

0.40 µ:-6.70

0.46 µ:0.24

0.39 µ:-0.26

0.38 µ:-0.28

0.39 µ:-0.27

0.49 µ:3.55

0.35 µ:-12.01

0.44 µ:-1.88

PDF of

EW2W

0.004

0.002 0.000

EP2W

0.010

0.005 0.000 0.006

Eall

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 20 of 27

0.004

0.002 0.000

-100

0

100

200

-100

0

100

E (kT)

200

-100

0

100

200

Figure 11: Probability density functions of the potential energy differences before and after 1D NCMC proposals using 4500 NCMC steps in vacuum (red) and explicit water (blue) for VAL. Labels indicate probability mass larger than 5 kT energy difference that will ultimately lead to rejections of the NCMC move. The mean values (µ) of the corresponding energy differences are given to quantify the shift of the distribution towards higher values.

Multi-dimensional NCMC Sampling 20

ACS Paragon Plus Environment

Page 21 of 27 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

Despite the similarity in the acceptance rates of 1D-proposals for VAL and MET, driving only the χ1 angle of MET is not as efficient as for VAL. This suggests, like in the vacuum case, that the performance for MET could be improved by driving χ2 as well. We thus repeated the multi-dimensional sampling experiments in explicit solvent as well. Somewhat surprisingly, in contrast to considerable gains in vacuum, this led to reductions in the acceptance rates, and thereby even lower sampling efficiencies (Figure 9). Since the only major difference between these two cases is the presence of water, these results indicate that the multi-dimensional sampling proposals interfere more substantially with the surrounding water. One possible explanation is that the paths taken by the multi-dimensional proposals are altered compared to the 1D case, due to the increased number of frozen DoFs. This additional rigidity might restrict the conformational space of the side chain during the propagation through the solvent thus leading to suboptimal switching paths.

Discussion The goal of our study was to investigate the performance of the NCMC method for conducting structural updates in protein systems. As model systems, we chose two peptides consisting of VAL and MET embedded in an Alanine helix and systematically investigated the sampling efficiency of rotamer transitions within these systems. Overall, our results revealed a number of interesting aspects of the NCMC procedure. We obtained the most pronounced improvements in vacuum where the VAL system turned out to be particularly well-suited for simulation with NCMC. Driving its single slow degree of freedom provided a significant speed up in its rotamer transition rate which was directly reflected in a substantial efficiency gain. MET proved to be more difficult due to the similar timescales of the dynamics of the two DoFs, χ1 and χ2 . Despite an orders of magnitude speed up in the driven χ1 angle its coupling with χ2 was insufficient to enforce transitions in χ2 , leaving it

21

ACS Paragon Plus Environment

The Journal of Physical Chemistry 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

as the bottleneck for efficiency gain. As expected, its efficiency benefited dramatically from driving both slow DoFs. Sampling of uneven distributions has previously been reported as a major challenge for the NCMC procedure. 3 This observation raised concerns about the applicability of the method on real proteins systems where uneven distributions are the norm. Our results for the VAL system revealed that NCMC can still offer substantial increases in transition rates in the case of an extreme non-uniformity, although more NCMC steps and thus a longer relaxation phase were necessary to achieve these improvements. Another aspect of the NCMC method that we specifically set out to address with this study was the ability of NCMC to simplify the design of Monte Carlo moves, by allowing move-designers to focus their attention on perturbing a small number of DoFs and letting the remaining DoFs evolve automatically. Focusing our attention on the vacuum case, where Monte Carlo is typically employed, our results suggest that NCMC could indeed be useful in this scenario, at least when there is a clear separation of timescales between the driven and non-driven degrees of freedom. This distinction is clearly seen in the comparison between MET 1D, 2D and 3D sampling, where 1D is substantially worse than 2D (because χ1 and χ2 have dynamics on the same time scale), while going to 3D sampling provides no additional benefit (because χ3 can equilibrate within the time frame of the propagation steps in the NCMC phase). In explicit solvent, performance generally dropped considerably. The acceptance rates were dramatically reduced, pointing to a requirement for smaller NCMC step sizes for such scenarios. In general, increasing the number of NCMC steps is linked to an increased cost (in terms of energy evaluations), which must be counterbalanced with an increased acceptance rate. In solvent, the acceptance level of NCMC moves was insufficient to compensate for this elevated cost. Moreover, in contrast to vacuum results, multi-dimensional sampling resulted in further reductions in solvent implying that the restriction on the conformational space of 22

ACS Paragon Plus Environment

Page 22 of 27

Page 23 of 27 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

the protein might crucially alter the relaxation path of the peptide through the surrounding solvent molecules. Despite our modest improvements in explicit solvent, our results do not preclude that NCMC could be beneficial in this context in the future. Smarter moves could reduce the kinetic energy induced in the surroundings, for instance by guidance from coarse grained simulations. 17 Using different propagation kernels could be another way to improve performance, for instance using a Monte Carlo move that explicitly takes positions of the nearest water molecules into account. 11,22 Finally, using a soft-core potential during the switching phase could reduce the size of the forces of overlapping atoms during the nonequilibrium path. It is also important to note that, unlike standard Monte Carlo moves, the NCMC move defines an entire path. In this paper, we limited ourselves to finding optimal endpoints for such moves, and considering only linear interpolations of the angles involved - other strategies might lead to higher acceptance rates. Overall, our results can be distilled into a few general observations: Compared to standard Monte Carlo, the novel feature in NCMC is the propagation step. But propagation is a mixed blessing: on one side, propagation is critical to avoid the high potential energies at the end of the switching trajectory (in particular in explicit solvent). On the other hand, any kinetic energy transferred to the environment during propagation (i.e. heat) will have to be compensated for by an equally sized reduction of the potential energy in the final state to achieve acceptance. In other words, step sizes must be sufficiently small to avoid large forces on the atoms during propagation. Each NCMC step is associated with at least one energy evaluation, and the efficiency of the NCMC approach will therefore eventually decline as the number of NCMC steps grows. Whether the optimal compromise between these two considerations leads to an overall gain in efficiency will depend on the magnitude of the forces encountered during the switching trajectory, and thus depend on both the force field, the system and the nature of the driven degrees of freedom. In our case, a sweet-spot

23

ACS Paragon Plus Environment

The Journal of Physical Chemistry 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

existed for the vacuum setup, but not for the explicit water case.

Acknowledgement This work was supported by the Villum Foundation (A.K., R.F. and W.B., grant number VKR023445) and a Stanford BioX visiting scholarschip from the Novo Nordisk Foundation (R.F., grant number NNF15OC0015268). We thank Kresten Lindorff-Larsen for valuable discussions.

References (1) Dodd, L.; Boone, T.; Theodorou, D. A Concerted Rotation Algorithm for Atomistic Monte Carlo Simulation of Polymer Melts and Glasses. Mol. Phys. 1993, 78, 961–996. (2) Bottaro, S.; Boomsma, W.; E. Johansson, K.; Andreetta, C.; Hamelryck, T.; Ferkinghoff-Borg, J. Subtle Monte Carlo Updates in Dense Molecular Systems. J. Chem. Theory Comput. 2012, 8, 695–702. (3) Nilmeier, J. P.; Crooks, G. E.; Minh, D. D. L.; Chodera, J. D. Nonequilibrium Candidate Monte Carlo is an Efficient Tool for Equilibrium Simulation. Proc. Natl. Acad. Sci. U.S.A. 2011, 108, 1009–1018. (4) Stern, H. A. Molecular Simulation with Variable Protonation States at Constant pH. J. Chem. Phys. 2007, 126 . (5) Opps, S. B.; Schofield, J. Extended State-Space Monte Carlo Methods. Phys. Rev. E 2001, 63, 056701. (6) Nilmeier, J.; Jacobson, M. P. Monte Carlo Sampling with Hierarchical Move Sets: POSH Monte Carlo. J. Chem. Theory Comput. 2009, 5, 1968–1984. 24

ACS Paragon Plus Environment

Page 24 of 27

Page 25 of 27 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

(7) Brown, S. P.; Head-Gordon, T. Cool Walking: a New Markov Chain Monte Carlo Sampling Method. J. Comput. Chem. 2003, 24, 68–76. (8) Ballard, A. J.; Jarzynski, C. Replica Exchange with Nonequilibrium Switches. Proc. Natl. Acad. Sci. U.S.A. 2009, 106, 12224–12229. (9) B¨ urgi, R.; Kollman, P. A.; Van Gunsteren, W. F. Simulating Proteins at Constant pH: An Approach Combining Molecular Dynamics and Monte Carlo Simulation. Proteins 2002, 47, 469–480. (10) Ath`enes, M. Computation of a Chemical Potential Using a Residence Weight Algorithm. Phys. Rev. E 2002, 66, 1–14. (11) Nicolini, P.; Frezzato, D.; Chelli, R. Exploiting Configurational Freezing in Nonequilibrium Monte Carlo Simulations. J. Chem. Theory Comput. 2011, 7, 582–593. (12) Chen, Y.; Roux, B. Generalized Metropolis Acceptance Criterion for Hybrid Nonequilibrium Molecular Dynamics - Monte Carlo Simulations. J. Chem. Phys. 2015, 142, 024101. (13) Chen, Y.; Roux, B. Efficient Hybrid Non-equilibrium Molecular Dynamics - Monte Carlo Simulations with Symmetric Momentum Reversal. J. Chem. Phys. 2014, 141, 114107. (14) Giovannelli, E.; Gellini, C.; Pietraperzia, G.; Cardini, G.; Chelli, R. Nonequilibrium Candidate Monte Carlo Simulations with Configurational Freezing Schemes. J. Chem. Theory Comput. 2014, 10, 4273–4283. (15) Chen, Y.; Roux, B. Constant-pH Hybrid Nonequilibrium Molecular Dynamics Monte Carlo Simulation Method. J. Chem. Theory Comput. 2015, 11, 3919–3931. (16) Radak, B. K.; Roux, B. Efficiency in Nonequilibrium Molecular Dynamics Monte Carlo Simulations. J. Chem. Phys. 2016, 145, 134109. 25

ACS Paragon Plus Environment

The Journal of Physical Chemistry 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

(17) Chen, Y.; Roux, B. Enhanced Sampling of an Atomic Model with Hybrid Nonequilibrium Molecular Dynamics-Monte Carlo Simulations Guided by a Coarse-Grained Model. J. Chem. Theory Comput. 2015, 11, 3572–3582. (18) Berendsen, H. J. C.; Postma, J. P. M.; van Gunsteren, W. F.; Hermans, J. In Intermolecular Forces; Pullman, B., Ed.; Springer, Dordrecht, 1981; p 331. (19) Eastman, P. et al. OpenMM 4: a Reusable, Extensible, Hardware Independent Library for High Performance Molecular Simulation. J. Chem. Theory Comput. 2013, 9, 461– 469. (20) Boomsma, W. et al. PHAISTOS: A Framework for Markov Chain Monte Carlo Simulation and Inference of Protein Structure. J. Comput. Chem. 2013, 34, 1697–1705. (21) Pedregosa, F. et al. Scikit-learn: Machine Learning in Python. J. Mach. Learn. Res. 2011, 12, 2825–2830. (22) Chelli, R. Local Sampling in Steered Monte Carlo Simulations Decreases Dissipation and Enhances Free Energy Estimates via Nonequilibrium Work Theorems. J. Chem. Theory Comput. 2012, 8, 4040–4052, PMID: 26605571.

26

ACS Paragon Plus Environment

Page 26 of 27

Page 27 of 27 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

For Table of Contents Only Driving structural transitions using non-equilibrium candidate Monte Carlo How efficient is it?

NCMC

27

ACS Paragon Plus Environment