On-the-Fly Specifications of Reaction Coordinates in Parallel Cascade

In the present study, we propose on-the-fly specifications of RCs as .... on-the-fly specification of RC, n types of RCs are provided as candidates a ...
0 downloads 0 Views 2MB Size
Subscriber access provided by University of Pennsylvania Libraries

Biomolecular Systems

On-the-Fly Specifications of Reaction Coordinates in Parallel Cascade Selection Molecular Dynamics Accelerate Conformational Transitions of Proteins Ryuhei Harada, and Yasuteru Shigeta J. Chem. Theory Comput., Just Accepted Manuscript • DOI: 10.1021/acs.jctc.8b00264 • Publication Date (Web): 04 May 2018 Downloaded from http://pubs.acs.org on May 14, 2018

Just Accepted “Just Accepted” manuscripts have been peer-reviewed and accepted for publication. They are posted online prior to technical editing, formatting for publication and author proofing. The American Chemical Society provides “Just Accepted” as a service to the research community to expedite the dissemination of scientific material as soon as possible after acceptance. “Just Accepted” manuscripts appear in full in PDF format accompanied by an HTML abstract. “Just Accepted” manuscripts have been fully peer reviewed, but should not be considered the official version of record. They are citable by the Digital Object Identifier (DOI®). “Just Accepted” is an optional service offered to authors. Therefore, the “Just Accepted” Web site may not include all articles that will be published in the journal. After a manuscript is technically edited and formatted, it will be removed from the “Just Accepted” Web site and published as an ASAP article. Note that technical editing may introduce minor changes to the manuscript text and/or graphics which could affect content, and all legal disclaimers and ethical guidelines that apply to the journal pertain. ACS cannot be held responsible for errors or consequences arising from the use of information contained in these “Just Accepted” manuscripts.

is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.

Page 1 of 27 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

On-the-Fly Specifications of Reaction Coordinates in Parallel Cascade Selection Molecular Dynamics Accelerate Conformational Transitions of Proteins Ryuhei Harada*,† and Yasuteru Shigeta*,† †

Center for Computational Sciences, University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki 305-8577, Japan Corresponding author: Ryuhei Harada, Yasuteru Shigeta E-mail: [email protected], [email protected]

Abstract Parallel Cascade Selection Molecular Dynamics (PaCS-MD) is an efficient conformational sampling method for generating a set of reactive trajectories that connect a given reactant and a product. In PaCS-MD, initial structures relevant to conformational transitions are reasonably selected by referring to a set of reaction coordinates (RCs), and short-time molecular dynamics (MD) simulations are independently launched from them. To efficiently perform PaCS-MD, specifications of RCs are essential but specifying reasonable RCs is generally non-trivial. In the present study, we propose on-the-fly specifications of RCs as an extended PaCS-MD. In the present method, n types of RCs are provided as candidates a priori as follow: RC = (X1, X2, …, Xn), and one of RCs is specified as a cycle-dependent manner, i.e. the reasonable RC is searched at every cycle by evaluating gradients of RCs, i.e. RC with the steepest gradient for cycle is regarded as the reasonable RC, and conformational resampling is proceeded along it, promoting conformational transition of a given protein. For a demonstration, the extended PaCS-MD was applied to reproduce the open-closed conformational transition of T4 lysozyme (T4L). As candidates of possible RCs, (1) root-mean square distance, (2) principal coordinates, (3) accessible surface area, (4) radius of gyration, and end-to-end distance were adopted in the cycle-dependent specifications of RCs. Through the demonstration, the extended PaCS-MD successfully reproduced the conformational transition from the open to closed states of T4L. As a more complicated practice, a dimerization process of di-ubiquitin was efficiently reproduced with the extended PaCS-MD, showing the high conformational sampling efficiency of the present algorithm. In contrast, the conventional PaCS-MD with a fixed RC sometimes failed to generate a set of reactive trajectories when an unreasonable RC was specified, i.e. the conformational sampling efficiency of PaCS-MD might more or less depend on the specified RCs. Judging from the present demonstrations, on-the-fly specifications of RCs might be effective to reproduce/predict essential transitions of a given protein. 1

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

1. Introduction Proteins often undergo anisotropic, large-amplitude structural fluctuations upon the biological functions. Molecular dynamics (MD) simulations with reliable force fields reproduce/predict essential structural fluctuations as atomic-level trajectories with fs time resolution. Therefore, MD simulations provide the temporal/spatial resolution to investigate protein dynamics in silico. However, it might be still difficult to obtain fluctuations relevant to the biological functions because of the accessible time scales of conventional MD (CMD), i.e. the characteristic time scales of biological functions exceed the accessible time scales of CMD. To overcome this issue, a significant speed up of the CMD simulation has been recently achieved with a specialized purpose machine. As a typical example, D. E. Shaw Research has developed a special purpose machine called “Anton” that enables one to simulate ms-order folding processes for small proteins with atomistic model.1-4 In contrast to the extension of the time scale of MD via the hardware development, several enhanced conformational sampling methods have been proposed instead of CMD to reproduce/predict biologically relevant rare events in the long-time dynamics. For typical examples, as statistical approaches, metadynamics (MetaD),5-8 multicanonical MD (McMD),9 replica-exchange MD (REMD),10 and its variants11-18 have been proposed to search a widely conformational space. As our experiences, applications based on MetaD have been reported to estimate free energy profiles for the biological reactions of Nylon-oligomer hydrolase.19, 20 Additionally, as a biased dynamics, Targeted MD21 and steered MD22, 23 might directly reproduce transition pathways with external constraints and forces, in which transition mechanisms are discussed from the generated transition pathways. Generally, external constraints or biases are imposed in the enhanced conformational sampling method introduced before. As an external biases-free approach, parallel cascade selection MD (PaCS-MD)24 has been proposed for generating transition pathways between a given reactant and a product under a condition that the end-point structures (reactant and product) are given a priori. To efficiently promote transitions, PaCS-MD independently launches short-time MD simulations repeatedly from different initial structures selected with given rules, where initial velocities were renewed based on the Maxwell–Boltzmann distribution. As a trick for promoting transitions from the reactant to the product, the reassignments of initial velocities in restarting short-time MD simulations might be simple but relevant processes. In the preceding study,25 Karplus et al. have reported that the conformational sampling efficiency might be enhanced via restarting short-time MD simulations with renewed initial velocities. By repeating cycles of conformational resampling from reasonably selected initial structures, PaCS-MD generates possible transition pathways from the reactant to the product because initial structures selected at the end of each cycle gradually become closer to the product. PaCS-MD displays some conceptual analogies with the ratchet-and-pawl MD26, 27 proposed by Karplus and Paci and their extensions. For instance, in the ratchet-and-pawl MD, a perturbation is added to the potential energy that depends on the time through reaction coordinate (RC). The time-dependent perturbation forces the system to sample a wide conformational space that is separated by higher free energy barriers. Here, the external perturbation promotes transitions to forward and prevents RC from decreasing 2

ACS Paragon Plus Environment

Page 2 of 27

Page 3 of 27 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

significantly to backward. The main difference between PaCS-MD and the ratchet-and-pawl MD might be the absence of an explicit biasing force. PaCS-MD generates a set of unbiased trajectories that highly overlap each other along the generated transition pathways. The trajectories generated by cycles of conformational resampling provide information on sampled local conformational spaces. To quantitatively evaluate the distinct trajectories, PaCS-MD is combined with the umbrella sampling (US) method28, 29 or Markov state model (MSM). These additional calculations are regarded as post-processing treatments to estimate free energies along the presumed transition pathways. For example, the free energy calculation based on the combination of PaCS-MD with MSM have already been reported in the preceding studies.30 The rate-limiting process is to prepare coarse-grained transition pathways based on PaCS-MD. After obtaining the rough transition pathways, their evaluations would be easily performed by the post-processing treatment. In the present formulation, PaCS-MD violates the detailed balance condition. Thus the resulting reactive paths are evaluated with the post-processing treatment to get thermodynamic and kinetic information. To efficiently perform PaCS-MD, it might be essential that how RCs are specified, i.e. reasonable RCs for characterizing transitions of interest are unknown a priori. To specify reasonable RCs, a typical problem might be to set degrees of freedom (DOF) that describe the slowest motion of a given system and to search for the most likely trajectory relevant to the biological functions. In the present study, the reasonable RC was defined as DOF to correctly split the states of proteins without degenerations. To solve this issue, people have developed several techniques, including diffusion maps,31 MSM,32, 33 blue moon sampling,34 milestoning,35 and many others. As the past studies indicated, specifications of reasonable RC are non-trivial, i.e. one has to search optimal RCs that characterize essential dynamics of interest based on preliminary runs or empirical approaches, which might be a time-consuming part. In the conventional PaCS-MD, as a candidate of RC, root-mean-square deviation measured from the product (RMSDproduct) has been adopted because RMSDproduct can characterize how sampled snapshots are close to the product. However, it is not ensured that RMSDproduct always describes essential conformational transitions, i.e. one has to search alternative RCs relevant to essential dynamics of interest. Therefore, we newly propose an efficient strategy based on cycle-dependent specifications of RCs as an extended PaCS-MD, realizing automatically specifications of reasonable RCs when an initial set of RCs are given a priori. Judging from the history-dependent specifications of RCs, the present strategy might have some similarities with the conformational sampling based on the self-learning and machine-learning algorithms5, 7-9, 36-38

because these methods can do self-corrections and reductions of the search in their free energy

landscape explorations. In the present study, RCs in the conventional PaCS-MD are specified on-the-fly (cycle-dependent) manner, i.e. a RC is specified at each cycle by evaluating gradients of RCs. For the on-the-fly specification of RC, n types of RCs are provided as candidates a priori, i.e. RC = (X1, X2, …, Xn). Then, the best RC is searched/specified for cycles of PaCS-MD by referring to the gradients of RCs for cycle. Herein, RC with the steepest gradient for cycle is regarded as a reasonable RC, and conformational 3

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

resampling is performed along the steepest RC. The present paper is organized as follows. In the following section, the concept of conventional PaCS-MD24 is firstly reviewed. Subsequently, PaCS-MD based on the cycle-dependent specifications of RCs is introduced as the extended PaCS-MD. Finally, we discuss the conformational sampling efficiencies of the extended and conventional PaCS-MD simulations. 2. Materials and Methods 2.1 Conventional PaCS-MD with RMSD measured from a given product PaCS-MD24 is an enhanced sampling method for generating reactive trajectories between a given reactant and a product, i.e. it is assumed that the end-point structures are known a priori. To generate the reactive trajectories, PaCS-MD repeats a cycle of (1) selections of initial structures and (2) short-time MD simulations restarting from them with renewed velocities, which is defined as conformational resampling of PaCS-MD. To select the initial structures, we should specify a reasonable RC that describes conformational transitions of a given protein. In the conventional PaCS-MD, RMSDproduct is utilized as the simplest RC because RMSDproduct can identify snapshots that are structurally similar to the product. After selecting initial structures, short-time MD simulations are independently restarted from them, accelerating conformational transitions to the product. By repeating conformational resampling, it would be expected that the value of RMSDproduct gradually becomes smaller as a cycle goes by. For the successive conformational resampling, snapshots with small values of RMSDproduct are specified as the initial structures at the next cycle. In restarting the short-time MD simulations, initial velocities are renewed based on the Maxwell–Boltzmann distribution. The reassignment of initial velocities might also enhance the conformational sampling as the previous studies reported.25 The cycle of (1) and (2) is repeated until the value of RMSDproduct becomes small enough, i.e. RMSDproduct < ε, where ε is a threshold for terminating PaCS-MD. For a flowchart of PaCS-MD, see Fig. 1. Generally, arbitrary RCs can be specified in PaCS-MD a priori, and their specifications might more or less affect the conformational sampling efficiency. As a general RC, scoring functions of RCs might be effective for selecting initial structures. Instead of RMSDproduct, the scoring functions are utilized to rank snapshots sampled by PaCS-MD. In the present study, initial structures are selected by refereeing to values of the scoring functions. For candidates of RCs, (1) principal coordinates (PCs), (2) accessible surface area (sasa), (3) radius of gyration (Rg), and (4) end-to-end distance (dend) were considered to define the scoring functions. 2.2 Scoring functions as a function of sasa, Rg, and dend To simply define a scoring function, a normalized error is estimated for a physical variable between a given structure and a product. In terms of the physical variable for the structure () and the product ( ), the normalized error,  , is defined as follows: 4

ACS Paragon Plus Environment

Page 4 of 27

Page 5 of 27 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

 = | 1 − 

 

|,

(1)

where X = sasa, Rg, and dend. At the end of each cycle, snapshots generated by PaCS-MD are ranked by referring to low values of  . Then, highly ranked (ninitial) snapshots are selected as initial structures at the next cycle, where ninitial is a total number of initial structures selected at each cycle. Finally, PaCS-MD is terminated with the following criterion in terms of a threshold,  :  <  . 2.3 Scoring function as a function of PCs For PCs, the following inner product, , is defined between a given structure and a product, and utilized as the scoring function:

 ! = 

 ∙"#   "#     % %"# %%"#

= ∑()* +,-



"#'"#' ,    %  %"# %%"#

(2)

 = - , / , … , 123 and    represent PCs of the structure and the product, and DOF where  means the dimension of PC subspace. Herein, the accessible range of  is as follows: −1.0 <  < 1.0. At the end of each cycle, (ninitial) snapshots with high  values are selected as the initial structures at the next cycle, and resampled by restarting short-time MD simulations from them. Herein, the selected initial structures are highly corrected with the product. Therefore, the conformational resampling from the relevant initial structures might promote conformational transitions to the product. Finally, PaCS-MD is terminated with the following criterion in terms of a threshold, 6 :  > 6 . 2.4 On-the-fly specifications of RCs in the extended PaCS-MD In the present study, RCs are specified on-the-fly (cycle-dependent) manner, i.e. a reasonable RC is automatically searched by evaluating the gradients of RCs for cycle at the beginning of each cycle. Before

 = (X1, X 2, …, X DOF), where N types of the applications, candidates of RCs are provided a priori, i.e. RC 

RCs are initially provided. Herein, the gradient at the ith cycle for the jth RC (X j), :; , are defined as follows: 

:; = 〈@ 〉+ − 〈@ 〉+B- C = 1, 2, … . , DOF , 〈@ 〉+ =

HIJIIKL

M IJIIKL ∑HM,@,+ ,

(3)

(4)

where ninitial is a total number of initial structures selected at each cycle, and 〈 〉+ in Eqs. (3-4) represents an average of Xj over the selected initial structures. Herein, a derivative value of RCs adjacent (the i-1th to 5

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 6 of 27

the ith) cycles is represented as Eq. (3) (defined as a simple difference of the averages). By referring to Eq. (3), one RC with the steepest gradient is specified on-the-fly manner. To select the initial structures at the (i+1)th cycle, the minimum gradient value, :; 2.5 Å). Additionally, the error defined by Eq. (1) for the conventional PaCS-MD with dendclosed moderately decreased (the red dots in Fig. 3(b)) compared to the others (the profiles of errors for conventional PaCS-MD with on sasaclosed (blue) and Rgclosed (green) in Fig. 3(b)), meaning that dendclosed could not describe the collective motion between the open and closed states of T4L. Actually, the N-/C-terminal residues were initially quite close to each other. Therefore, dendclosed was almost unchanged upon the open-closed transition of T4L, i.e. dendclosed cannot describe the large-amplitude domain motion correctly and split the open and closed states of T4L enough. For PaCS-MD with the normalized inner product (IP) was monitored for cycle. As shown in Fig. 3(c), the values of IP rapidly increased for the early 20 cycles, and reached ~ 1.0 at the 100th cycle. For the conformational transition of T4L, the collective domain motions between the open and closed states might be described by the PCA modes. As an evidence, the highly ranked top ten modes were significantly dominant and covered ~ 80 % of the overall modes, which seems to be a good basis set for expanding a conformational subspace to describe the collective motions. Therefore, from a point of the view of dynamics, IP might be a reasonable RC to describe the collective motions of T4L because the highly ranked top ten modes can sufficiently characterize the large-amplitude domain motion of T4L. Actually, PaCS-MD with IP identified snapshots similar to the closed form at the 100th cycle with a relatively high accuracy (1.0 Å < Cα RMSDclosed < 1.5 Å (the green line in Fig. 3(a)). Judging from the computational results from conventional PaCS-MD, a static RC specified a priori might be insufficient for promoting conformational transitions of a given protein. Therefore, a fixed RC in the conventional PaCS-MD sometimes fails to promote conformational transitions depending on a preliminary specified RC. As an alternative way, it might be better that RCs are dynamically specified, i.e. an optimal RC is search from candidates of a set of RCs, and specified on-the-fly by monitoring the gradients of RCs for the cycles of conformational resampling. 3.2 Convergence to a given product in the extended PaCS-MD To improve the conformational sampling efficiency of the conventional PaCS-MD, the extended 8

ACS Paragon Plus Environment

Page 9 of 27 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

PaCS-MD was considered based on dynamic specifications of RCs. In the extended PaCS-MD, the  = (RMSDclosed, IP, following five RCs (DOF = 5) were specified as the candidates of RCs a priori: RC sasaclosed, Rgclosed, dendclosed). To obtain initial gradients defined by Eqs. (3-4), 100-ps MD simulations were independently launched from the relaxed open form of T4L with different initial velocities as the 1st cycle, generating ten (ninitial = 10) trajectories. Then 100-ps MD simulations were restarted from the final snapshots of each trajectory with renewed initial velocities as the 2nd cycle, generating ten trajectories. Sequentially, 100-ps MD simulations were restarted from the final snapshots of each trajectory as the 3rd cycle. In terms of all the trajectories at the 1st, 2nd, and 3rd cycles, the gradients of RCs defined by Eqs. (3-4) were estimated and evaluated by their minimum values based on Eq. (5), i.e. one RC with the steepest gradient was searched and it was specified as an optimal DOF. Figures 4(a-c) show profiles of specified RCs for the cycles, i.e. how reasonable RCs were dynamically searched by referring to the gradients of RCs. As shown in Figs. 4(a-c), several RCs were dynamically specified as reasonable RCs. Judging from these profiles, RMSDproduct and IP were frequently specified as optimal RCs compared to the other RCs (sasaclosed, Rgclosed, dendclosed). Table 2 shows frequencies for how many times these five RCs were dynamically specified over the 100 cycles. Judging from Table 2, the accumulated frequencies for RMSDproduct and IP exceeded 0.30 in all the extended PaCS-MD simulations, indicating that these two RCs might be suitable for describing the conformational transition of T4L. Herein, the red line in Fig. 5(a) shows a profile of Cα RMSDclosed of initial structures for the extended PaCS-MD with DOF = 5, indicating that on-the-fly specifications of RCs could significantly improve the convergence to the product than the conventional PaCS-MD. In particular, the red line in Fig. 5(a) rapidly converged within 1.0 Å of Cα RMSDclosed around the 10th cycle. Therefore, the extended PaCS-MD might increase a possibility to reach a given product by dynamically specifying RCs because reasonable RCs for describing transitions of interest are searched and specified by evaluating the gradients of RCs for the cycles of conformational resampling. Next, we address that how the conformational sampling efficiencies depend on the specific choice of the n types RCs initially specified. As a furthermore analysis, an extended PaCS-MD with the reduced  = (IP, sasaclosed, Rgclosed, dendclosed). Namely, one of reasonable RCs RCs (DOF = 4) were considered: RC (RMSDproduct) was excluded. Herein, the green line in Fig. 5(a) shows a set of profiles of Cα RMSDclosed of initial structures for the extended PaCS-MD with DOF = 4, showing a good convergence to the product because the final value of Cα RMSDclosed reached 1.2 Å at the 100th cycle. In the extended PaCS-MD with DOF = 4, the frequencies of IP and sasaclosed were about 0.42 and 0.26 (see Table 2), indicating that these RCs might be reasonable RCs. Judging from the computational results from the extend PaCS-MD with DOF = 4, the conformational sampling efficiency was improved compared to the conventional PaCS-MD. As an evidence for the improvement, the value of Cα RMSDclosed fluctuated around 1.0 Å after the 40th cycle (the green line in Fig. 5(b)), showing a high accuracy for identifying the product. As an additional analysis, another extended PaCS-MD with the further reduced RCs (DOF = 3) 9

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 10 of 27

 = (sasaclosed, Rgclosed, dendclosed). In contrast to the previous trials, the reasonable RCs were considered: RC (RMSDproduct and IP) were excluded. As shown in the blue line in Fig. 5(a), the profiles of Cα RMSDclosed for the extended PaCS-MD with DOF = 3 showed a good convergence to the product because the final value of Cα RMSDclosed reached ~ 1.2 Å at the 100th cycle, this is the almost same result with the extended PaCS-MD with DOF = 4. For the dynamic specifications of RCs, the frequencies of these three RCs were not significantly different, i.e. (sasaclosed, Rgclosed, dendclosed) = (0.41, 0.21, 0.38) (see Table 2), indicting that these three RCs might have the equal potential for describing the conformational transition of T4L. Judging from the profile of Cα RMSDclosed, the extend PaCS-MD with DOF = 3 could improved the conformational sampling efficiency compared to the conventional PaCS-MD. Actually, the profile of minimum values of Cα RMSDclosed among the three trials showed a good convergence to the product. Actually, the minimum value of Cα RMSDclosed fluctuated around 1.0 Å after the 40th cycle (the blue line in Fig. 5(b)). As an important evidence for the improvement, the extended PaCS-MD successfully generated reactive trajectories, even if unreasonable RCs (sasaclosed, Rgclosed, dendclosed) were specified as candidates of RCs a priori. By evaluating gradients of RCs for the cycles, the extended PaCS-MD with DOF = 3 dynamically searches and selects an optimal RC among the candidates of RCs, leading to the efficient conformational sampling. 3.3 Demonstration through a dimerization of di-ubiquitin To further validate the extended PaCS-MD, the dimerization of di-ubiquitin was additionally treated. Herein, the dissociated (open) and dimerized (closed) forms of di-ubiquitin were displayed in Fig. 6(a). First, the following five candidates were considered as a set of RCs for the extended PaCS-MD, i.e. (RMSDproduct, IP, sasadimerized, Rgdimerized, denddimerized). Starting from the relaxed open form, the extended PaCS-MD with DOF = 5 was performed for 100 cycles. At every cycle, (ninitial = 10) structures were selected by referring to the gradients of RCs and their conformations were resampled by short-time (100-ps) MD simulations. The red line in Fig. 6(b) shows the minimum values of Cα RMSDdimerized for the 100 cycles. Judging from this profile, the extended PaCS-MD with DOF = 5 promoted the dimerization at the 11th cycle (Cα RMSDdimerized < 2.0 Å). As a summary of the dynamical search of RCs, RMSDproduct was the most frequently specified as the effective RC. In more detail, the five RCs were specified with the following frequencies: (RMSDproduct, IP, sasadimerized, Rgdimerized, denddimerized) = (63.3 %, 8.2 %, 6.1 %, 7.1 %, 15.3 %), indicating that RMSDproduct might be the most effective RC compared to the other RCs. Second, the extended PaCS-MD was performed with the following four RCs, (IP, sasadimerized, Rgdimerized, denddimerized) excluding RMSDproduct. The green line in Fig. 6(b) shows a profile of minimum values of Cα RMSDproduct for the 100 cycles. Judging from this profile, the extend PaCS-MD with DOF = 4 identified a snapshot similar to the product because the minimum value of Cα RMSDdimerized was less than 2.0 Å at the 62th cycle (Cα RMSDproduct ~ 1.7 Å), showing a slower convergence to the product compared to the extended PaCS-MD with DOF = 5. Therefore, it might be better to include RMSDproduct as a candidate of the effective RCs to efficiently perform the extended PaCS-MD. Next, the extended PaCS-MD was performed with the following three RCs, (sasadimerized, Rgdimerized, 10

ACS Paragon Plus Environment

Page 11 of 27 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

denddimerized) excluding RMSDproduct and IP. The blue ling in Fig. 6(b) shows a profile of minimum values of Cα RMSDdimerized as for the 100 cycles, showing a bad convergence to the product (Cα RMSDdimerized ~ 10.0 Å at the 100th cycle). This behavior also supports that RMSDproduct and IP might be candidates of the effective RCs for describing the dimerization. In contrast, the other RCs (Rgclosed, sasaclosed, dendclosed) seem to be inappropriate for efficiently reproducing the dimerization. Therefore, it might be better to reconsider an alternative set of RCs or surely include RMSDproduct and IP as RCs in advance. It is noteworthy that an optimal choice of reasonable RC is non-trivial task for more complicated biological systems. Thus, the present method offers a heuristic way to avoid this problem. Finally, we performed the conventional PaCS-MD with a fixed RC (RMSDproduct, IP, sasadimerized, Rgdimerized, denddimerized) to compare the conformational sampling efficiency with the extended one. Figure 6(c) shows a set of profiles of minimum values of Cα RMSDdimerized for the 100 cycles. As expected, the conventional PaCS-MD with RMSDproduct eproduced the conformational transition to the product (the red line in Fig. 6(c)). In contrast, the conventional PaCS-MD with IP failed to generate the transition pathway (the green line in Fig. 6(c)). For the other RCs (sasadimerized, Rgdimerized, denddimerized), the conventional PaCS-MD failed to reproduce the reactive trajectories (the blue, magenta, and cyan lines in Fig. 6(c)). However, the extended PaCS-MD with DOF = 4 improved the convergence to the product by the dynamic specifications of these four RCs (the green line in Fig. 6(b)). In the extended PaCS-MD with DOF = 4, each RC was dynamically specified for the 100 cycles with the following frequencies: (IP, sasadimerized, Rgdimerized, denddimerized) = (50.0 %, 8.2 %, 7.1 %, 34.7 %). Therefore, the extended PaCS-MD might accelerate the conformational transition to the product due to the dynamics specifications of RCs. 4. Conclusion In the present study, the extended PaCS-MD based on cycle-dependent specifications of RCs was

 = (X1, X2, …, proposed. In the extended PaCS-MD, N types of RCs are provided as candidates a priori: X XN), and an optimal RC among them is specified on-the-fly, i.e. a reasonable RC is searched/specified at every cycle by evaluating gradients of RCs for cycle, and one RC with the steepest gradient is regarded as the best DOF and the conformational resampling is proceeded along the optimal DOF, promoting conformational transitions of a given protein. Due to the dynamic specifications of RCs, the extended PaCS-MD might prevent the system from being trapped into one of the local minima because the system might efficiently escape from the local minima by conformational resampling along the dynamically changing RCs. In the conventional PaCS-MD, a RC is initially specified and fixed (unchanged) for all the cycles. Therefore, the conventional PaCS-MD might show a bad convergence to a given product when the initially specified RC was inappropriate for describing conformational transitions of interest. In contrast, the extended PaCS-MD might bypass this issue due to the dynamic specifications of RCs because the conformational resampling along cycle-dependent RCs might promote biologically relevant rare events efficiently. Generally, reasonable RCs are non-trivial, and it might be difficult to specify optimal RCs depending on target systems. Therefore, it might be better to prepare candidates of different types of RCs a 11

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 12 of 27

priori and automatically search/specify a reasonable set of RCs among them by referring to the gradients of RCs during the cycles of conformational resampling. For comparisons with other methods (meta-dynamics, McMD, targeted MD, steered MD, and temperature-accelerated MD), the present strategy might have an advantage in conformational search because the extended PaCS-MD selects RCs based on the cycle-dependent manner to find valid transition paths and does not use external biases to promote conformational transitions to a given product. On the other hand, the most of enhanced conformational sampling methods fix a RC (or a few of RCs) a priori to search the conformational subspaces and suffer from artifacts owing to the bias perturbation terms. In particular, an artifact from the specification of RC might be improved because optimal RCs would be reasonably selected based on the dynamic RC search for cycles. However, the present formulation cannot directly estimate free energies upon conformational transitions using the generated trajectories. To tackle this problem, an alternative strategy was proposed to calculate the free energies based on a combination of PaCS-MD with a MSM.30, 55, 56 In terms of all the trajectories, transitions among microstates are counted to obtain a transition matrix under the assumption that each transition was a Markov process and stationary distribution is estimated from the transitions matrix to yield the free energy landscapes projected onto a subspace spanned by the specified RCs. Note here that we have to specify a set of RCs to define the microstates and need to choose reasonable RCs to well describe the conformational transition of interest. Therefore, it might be easy to estimate free energy profiles of proteins based on the PaCS-MD/MSM strategy without an additional conformational search such as the umbrella sampling.28, 29 Finally, it is noted that how different definitions/choice of RCs lead to a different relative decrease of RCs. As an example, for protein folding simulations, a decrease in RMSD to native may correspond to a different decrease in the fraction of native contacts (NC), which might be trivial judging from the definitions of RMSD and NC. In this sense, the extended PaCS-MD might systematically select the direction with the steepest variation of NC. Actually, the current RC specifications are based on geometrical properties (RMSD, sasa, Rg, dend, and NC, etc) rather than physical properties. Therefore, for the safety of conformational search, physical properties (for example, the potential energy of system) might be monitored for cycles in addition to decreases in gradients of geometrically defined RCs.

12

ACS Paragon Plus Environment

Page 13 of 27 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

Acknowledgement The present work was supported by the Japan Society for the Promotion of Science (JSPS) KAKENHI Grant-in-Aid for Young Scientists (A) (No. JP16H06164), and by Grant-in-Aid for Scientific Research of the Innovative Areas “Dynamic Ordering and Biofunction”, “Photosynergetics” and “3D Active-Site Science” (Nos. JP26102525, JP26107004, and JP26105012) from JSPS. The computations were performed with the COMA / HA-PACS at the Center for Computational Sciences (CCS), University of Tsukuba. The computations were performed using the COMA/HA-PACS at the Center for Computational Sciences (CCS) of the University of Tsukuba. This research was partially supported by the Initiative on Promotion of Supercomputing for Young or Women Researchers, Information Technology Center, The University of Tokyo. This research partly used computational resources under the Collaborative Research Program for Young/Women Scientists provided by the Academic Center for Computing and Media Studies, Kyoto University. This work was supported by the TSUBAME Encouragement Program for Young/Female Users of the Global Scientific Information and Computing Center at Tokyo Institute of Technology.

13

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Figure 1 (a)

Flowchart of the conventional PaCS-MD with RMSD measured from product. (RMSDproduct).

14

ACS Paragon Plus Environment

Page 14 of 27

Page 15 of 27 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

(b)

Flowchart of the cycle-dependent (on-the-fly) specifications of RCs in the extended PaCS-MD.

15

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 16 of 27

Figure 2

Superimposed structures of the open (red) and closed (blue) X-ray crystal structures of T4L. (Cα RMSD between the open and closed forms was ~ 2.8 Å).

16

ACS Paragon Plus Environment

Page 17 of 27 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

Figure 3

(a) Profiles of Cα RMSDclosed of initial structures for the 100 cycles. The conventional PaCS-MD with (red) RMSDproduct, (green) IP, (magenta) sasaclosed, (blue) Rgclosed, and (cyan) dendclosed. Cα RMSDclosed was averaged over the three trials, and plotted with their standard deviations. The broken line represents a criterion for judging convergences of PaCS-MD. (b) Profiles of the errors defined by Eq. (1) for the conventional PaCS-MD with (red) dendclosed, (green) Rgclosed, and (blue) sasaclosed. The plotted errors were averaged over three trials of PaCS-MD. (c) Profile of IP for the 100 cycles. Each value of IP was averaged over three trials of the conventional PaCS-MD, and plotted with their standard deviations. The broken line represents the upper range of IP.

17

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 18 of 27

Figure 4

Profiles of the dynamically specified RCs for 100 cycles in the extended PaCS-MD with (a)  RC = closed closed closed closed closed closed (RMSDproduct, IP, sasa , Rg , dend ), (b)  RC = (IP, sasa , Rg , dend ), and (c)  RC = (sasaclosed, Rgclosed, dendclosed).

18

ACS Paragon Plus Environment

Page 19 of 27 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

Figure 5

Profiles of Cα RMSDclosed of initial structures for 100 cycles. (a) The extended PaCS-MD with (red)  RC = closed closed closed closed closed closed (RMSDproduct, IP, sasa , Rg , dend ), (green)  RC = (IP, sasa , Rg , dend ), and (blue)  RC = (sasaclosed, Rgclosed, dendclosed). The values of Cα RMSD were averaged over three trials, and plotted with their standard deviations. (b) Profiles of the minimum values of Cα RMSDclosed among three trials. The

 = (RMSDproduct, IP, sasaclosed, Rgclosed, dendclosed), (green)  extended PaCS-MD with (red) RC RC = (IP, closed closed closed closed closed closed sasa , Rg , dend ), and (blue)  RC = (sasa , Rg , dend ).

19

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 20 of 27

Figure 6

(a) Dissociated (open) and dimerized (closed) forms of di-ubiquitin. (b) Profiles of the minimum values of Cα RMSDdimerized as a function of a cycle number. The extended PaCS-MD with (red) DOF = 5, (green) DOF = 4, and (blue) DOF = 3. (c) The conventional PaCS-MD with (red) RMSDproduct, (green) IP, (blue) sasadimerized, (magenta) Rgdimerized, and (cyan) denddimerized. Each value of Cα RMSDdimerized was averaged over (ninitial = 10) initial structures selected at each cycle.

20

ACS Paragon Plus Environment

Page 21 of 27 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

Table 1 Minimum simulation times to sample the closed state of T4L (Cα RMSDclosed < 1.0 Å) and minimum Cα RMSDclosed among three trials of the conventional/extended PaCS-MD using ninitial = 10 for 100 cycles. Type

Trial

(RC)

Minnimum

Minimum

Mininum

Cycle

Simulation Time [ns]

Cα RMSDclosed [Å]

Conventional

1

23

23

0.64

PaCS-MD

2

13

13

0.82

(RMSDproduct)

3

66

66

0.74

Conventional

1

N/A

N/A

1.00

PaCS-MD

2

25

25

0.92

(IP)

3

6

6

0.87

Conventional

1

N/A

N/A

1.54

PaCS-MD

2

N/A

N/A

2.10

(sasaclosed)

3

N/A

N/A

1.10

Conventional

1

N/A

N/A

1.53

PaCS-MD

2

N/A

N/A

1.11

(Rgclosed)

3

N/A

N/A

1.47

Conventional

1

N/A

N/A

1.58

PaCS-MD

2

N/A

N/A

1.50

(dendclosed)

3

N/A

N/A

1.60

Extended

1

6

6

0.61

PaCS-MD

2

8

8

0.62

(DOF = 5)*

3

11

11

0.68

Extended

1

11

11

0.86

PaCS-MD

2

80

80

0.92

(DOF = 4)*

3

27

27

0.87

Extended

1

90

90

0.99

PaCS-MD

2

30

30

0.83

(DOF = 3)*

3

20

20

0.88

 = *DOF = 5, DOF = 4, and DOF = 3 represent the total number of RCs in the extended PaCS-MD, i.e. RC  = (IP, sasaclosed, Rgclosed, dendclosed), and RC  = (sasaclosed, (RMSDproduct, IP, sasaclosed, Rgclosed, dendclosed), RC Rgclosed, dendclosed), respectively.

21

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 22 of 27

Table 2 Frequencies of dynamically specified RCs in the extended PaCS-MD for 100 cycles. Type / RC

RMSDproduct

IP

sasaclosed

Rgclosed

dendclosed

DOF = 5*

0.44

0.36

0.07

0.02

0.11

DOF = 4*

-

0.42

0.26

0.12

0.20

DOF = 3*

-

-

0.41

0.21

0.38

 = *DOF = 5, DOF = 4, and DOF = 3 represent the total number of RCs in the extended PaCS-MD, i.e. RC  = (IP, sasaclosed, Rgclosed, dendclosed), and RC  = (sasaclosed, (RMSDproduct, IP, sasaclosed, Rgclosed, dendclosed), RC Rgclosed, dendclosed), respectively.

22

ACS Paragon Plus Environment

Page 23 of 27 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

TOC

23

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 24 of 27

Reference 1.

Klepeis, J. L.; Lindorff-Larsen, K.; Dror, R. O.; Shaw, D. E. Long-Timescale Molecular Dynamics Simulations of Protein Structure and Function. Curr. Opin. Struct. Biol. 2009, 19, 120-127.

2.

Shaw, D. E.; Maragakis, P.; Lindorff-Larsen, K.; Piana, S.; Dror, R. O.; Eastwood, M. P.; Bank, J. A.; Jumper, J. M.; Salmon, J. K.; Shan, Y. B.; Wriggers, W. Atomic-Level Characterization of the Structural Dynamics of Proteins. Science 2010, 330, 341-346.

3.

Lindorff-Larsen, K.; Piana, S.; Dror, R. O.; Shaw, D. E. How Fast-Folding Proteins Fold. Science 2011, 334, 517-520.

4.

Piana, S.; Lindorff-Larsen, K.; Shaw, D. E. Atomic-Level Description of Ubiquitin Folding. Proc. Natl. Acad. Sci. U. S. A. 2013, 110, 5915-5920.

5.

Laio, A.; Parrinello, M. Escaping Free-Energy Minima. Proc. Natl. Acad. Sci. U. S. A. 2002, 99, 12562-12566.

6.

Raiteri, P.; Laio, A.; Gervasio, F. L.; Micheletti, C.; Parrinello, M. Efficient Reconstruction of Complex Free Energy Landscapes by Multiple Walkers Metadynamics. J. Phys. Chem. B 2006, 110, 3533-3539.

7.

Piana, S.; Laio, A. A Bias-Exchange Approach to Protein Folding. J. Phys. Chem. B 2007, 111, 4553-4559.

8.

Laio, A.; Gervasio, F. L. Metadynamics: A Method to Simulate Rare Events and Reconstruct the Free Energy in Biophysics, Chemistry and Material Science. Rep. Prog. Phys. 2008, 71, 126601/1-126601/22.

9.

Nakajima, N.; Nakamura, H.; Kidera, A. Multicanonical Ensemble Generated by Molecular Dynamics Simulation for Enhanced Conformational Sampling of Peptides. J. Phys. Chem. B 1997, 101, 817-824.

10.

Sugita, Y.; Okamoto, Y. Replica-Exchange Molecular Dynamics Method for Protein Folding. Chem. Phys. Lett. 1999, 314, 141-151.

11.

Sugita, Y.; Kitao, A.; Okamoto, Y. Multidimensional Replica-Exchange Method for Free-Energy Calculations. J. Chem. Phys. 2000, 113, 6042-6051.

12.

Fukunishi, H.; Watanabe, O.; Takada, S. On the Hamiltonian Replica Exchange Method for Efficient Sampling of Biomolecular Systems: Application to Protein Structure Prediction. J. Chem. Phys. 2002, 116, 9058-9067.

13.

Liu, P.; Kim, B.; Friesner, R. A.; Berne, B. J. Replica Exchange with Solute Tempering: A Method for Sampling Biological Systems in Explicit Water. Proc. Natl. Acad. Sci. U. S. A. 2005, 102, 13749-13754.

14.

Okumura, H. Partial Multicanonical Algorithm for Molecular Dynamics and Monte Carlo Simulations. J. Chem. Phys. 2008, 129, 124116/1-124116/9.

15.

Wang, L.; Friesner, R. A.; Berne, B. J. Replica Exchange with Solute Scaling: A More Efficient Version of Replica Exchange with Solute Tempering (Rest2). J. Phys. Chem. B 2011, 115, 24

ACS Paragon Plus Environment

Page 25 of 27 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

9431-9438. 16.

Itoh, S. G.; Okumura, H. Hamiltonian Replica-Permutation Method and Its Applications to an Alanine Dipeptide and Amyloid-Beta(29-42) Peptides. J. Comput. Chem. 2013, 34, 2493-2497.

17.

Itoh, S. G.; Okumura, H. Replica-Permutation Method with the Suwa-Todo Algorithm Beyond the Replica-Exchange Method. J. Chem. Theory Comput. 2013, 9, 570-581.

18.

Yamamori, Y.; Kitao, A. Mustar Md: Multi-Scale Sampling Using Temperature Accelerated and Replica Exchange Molecular Dynamics. J. Chem. Phys. 2013, 139, 145105/1-145105/11.

19.

Kamiya, K.; Baba, T.; Boero, M.; Matsui, T.; Negoro, S.; Shigeta, Y. Nylon-Oligomer Hydrolase Promoting Cleavage Reactions in Unnatural Amide Compounds. J. Phys. Chem. Lett. 2014, 5, 1210-1216.

20.

Baba, T.; Boero, M.; Kamiya, K.; Ando, H.; Negoro, S.; Nakano, M.; Shigeta, Y. Unraveling the Degradation of Artificial Amide Bonds in Nylon Oligomer Hydrolase: From Induced-Fit to Acylation Processes. Phys. Chem. Chem. Phys. 2015, 17, 4492-4504.

21.

Schlitter, J.; Engels, M.; Kruger, P.; Jacoby, E.; Wollmer, A. Targeted Molecular-Dynamics Simulation of Conformational Change - Application to the T[--]R Transition in Insulin. Mol. Simul. 1993, 10, 291-308.

22.

Isralewitz, B.; Baudry, J.; Gullingsrud, J.; Kosztin, D.; Schulten, K. Steered Molecular Dynamics Investigations of Protein Function. J. Mol. Graph. Model. 2001, 19, 13-25.

23.

Isralewitz, B.; Gao, M.; Schulten, K. Steered Molecular Dynamics and Mechanical Functions of Proteins. Curr. Opin. Struct. Biol. 2001, 11, 224-230.

24.

Harada, R.; Kitao, A. Parallel Cascade Selection Molecular Dynamics (Pacs-Md) to Generate Conformational Transition Pathway. J. Chem. Phys. 2013, 139, 035103-1-10.

25.

Caves, L. S. D.; Evanseck, J. D.; Karplus, M. Locally Accessible Conformations of Proteins: Multiple Molecular Dynamics Simulations of Crambin. Protein Sci. 1998, 7, 649-666.

26.

Paci, E.; Karplus, M. Forced Unfolding of Fibronectin Type 3 Modules: An Analysis by Biased Molecular Dynamics Simulations. J. Mol. Biol. 1999, 288, 441-459.

27.

Camilloni, C.; Broglia, R. A.; Tiana, G. Hierarchy of Folding and Unfolding Events of Protein G, Ci2, and Acbp from Explicit-Solvent Simulations. J. Chem. Phys. 2011, 134, 045105.

28.

Torrie, G. M.; Valleau, J. P. Non-Physical Sampling Distributions in Monte-Carlo Free-Energy Estimation - Umbrella Sampling. J. Comput. Phys. 1977, 23, 187-199.

29.

Torrie, G. M.; Valleau, J. P. Monte-Carlo Study of a Phase-Separating Liquid-Mixture by Umbrella Sampling. J. Chem. Phys. 1977, 66, 1402-1408.

30.

Harada, R.; Nishihara, Y.; Wakai, N.; Kitao, A. Conformational Transition Pathway and Free Energy Analyses of Proteins by Parallel Cascade Selection Molecular Dynamics (PaCS-MD). AIP. Conf. Proc. 2014, 1618, 86-89.

31.

Rohrdanz, M. A.; Zheng, W. W.; Clementi, C. Discovering Mountain Passes Via Torchlight: Methods for the Definition of Reaction Coordinates and Pathways in Complex Macromolecular 25

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 26 of 27

Reactions. Annu. Rev. Phys. Chem., Vol 64 2013, 64, 295-316. 32.

Pande, V. S.; Beauchamp, K.; Bowman, G. R. Everything You Wanted to Know About Markov State Models but Were Afraid to Ask. Methods 2010, 52, 99-105.

33.

Chodera, J. D.; Noe, F. Markov State Models of Biomolecular Conformational Dynamics. Curr. Opin. Struct. Biol. 2014, 25, 135-144.

34.

Ciccotti, G.; Kapral, R.; Vanden-Eijnden, E. Blue Moon Sampling, Vectorial Reaction Coordinates, and Unbiased Constrained Dynamics. Chemphyschem 2005, 6, 1809-1814.

35.

Majek, P.; Elber, R. Milestoning without a Reaction Coordinate. J. Chem. Theory Comput. 2010, 6, 1805-1817.

36.

Li, W. F.; Takada, S. Characterizing Protein Energy Landscape by Self-Learning Multiscale Simulations: Application to a Designed Beta-Hairpin. Biophys. J. 2010, 99, 3029-3037.

37.

Li, W. F.; Takada, S. Self-Learning Multiscale Simulation for Achieving High Accuracy and High Efficiency Simultaneously. J. Chem. Phys. 2009, 130, 214108.

38.

Wojtas-Niziurski, W.; Meng, Y. L.; Roux, B.; Berneche, S. Self-Learning Adaptive Umbrella Sampling Method for the Determination of Free Energy Landscapes in Multiple Dimensions. J. Chem. Theory Comput. 2013, 9, 1885-1895.

39.

Zhang, X. J.; Matthews, B. W. Conservation of Solvent-Binding Sites in 10 Crystal Forms of T4-Lysozyme. Protein Sci. 1994, 3, 1031-1039.

40.

Weaver, L. H.; Matthews, B. W. Structure of Bacteriophage-T4 Lysozyme Refined at 1.7 a Resolution. J. Mol. Biol. 1987, 193, 189-199.

41.

Jorgensen, W. L.; Chandrasekhar, J.; Madura, J. D.; Impey, R. W.; Klein, M. L. Comparison of Simple Potential Functions for Simulating Liquid Water. J. Chem. Phys. 1983, 79, 926-935.

42.

Hess, B.; Bekker, H.; Berendsen, H. J. C.; Fraaije, J. G. E. M. Lincs: A Linear Constraint Solver for Molecular Simulations. J. Comput. Chem. 1997, 18, 1463-1472.

43.

Miyamoto, S.; Kollman, P. A. Settle - an Analytical Version of the Shake and Rattle Algorithm for Rigid Water Models. J. Comput. Chem. 1992, 13, 952-962.

44.

Bussi, G.; Donadio, D.; Parrinello, M. Canonical Sampling through Velocity Rescaling. J. Chem. Phys. 2007, 126, 014101/1-014101/7.

45.

Parrinello,

M.;

Rahman,

A.

Polymorphic

Transitions

in

Single-Crystals

-

a

New

Molecular-Dynamics Method. J. Appl. Phys. 1981, 52, 7182-7190. 46.

Nose, S.; Klein, M. L. Constant Pressure Molecular-Dynamics for Molecular-Systems. Mol. Phys. 1983, 50, 1055-1076.

47.

Essmann, U.; Perera, L.; Berkowitz, M. L.; Darden, T.; Lee, H.; Pedersen, L. G. A Smooth Particle Mesh Ewald Method. J. Chem. Phys. 1995, 103, 8577-8593.

48.

Abraham, M. J.; van der Spoel, D.; Lindahl, E.; Hess, B, and the Gromacs Development Team, Gromacs User Manual 2018, http://www.gromacs.org/. 2018.

49.

Duan, Y.; Wu, C.; Chowdhury, S.; Lee, M. C.; Xiong, G. M.; Zhang, W.; Yang, R.; Cieplak, P.; Luo, 26

ACS Paragon Plus Environment

Page 27 of 27 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

R.; Lee, T.; Caldwell, J.; Wang, J. M.; Kollman, P. A Point-Charge Force Field for Molecular Mechanics Simulations of Proteins Based on Condensed-Phase Quantum Mechanical Calculations. J. Comput. Chem. 2003, 24, 1999-2012. 50.

Kitao, A.; Hirata, F.; Go, N. The Effects of Solvent on the Conformation and the Collective Motions of Protein - Normal Mode Analysis and Molecular-Dynamics Simulations of Melittin in Water and in Vacuum. Chem. Phys. 1991, 158, 447-472.

51.

Amadei, A.; Linssen, A. B. M.; Berendsen, H. J. C. Essential Dynamics of Proteins. Protein Struct. Funct. Genet. 1993, 17, 412-425.

52.

Balsera, M. A.; Wriggers, W.; Oono, Y.; Schulten, K. Principal Component Analysis and Long Time Protein Dynamics. J. Phys. Chem. 1996, 100, 2567-2572.

53.

Saeki, Y.; Kudo, T.; Sone, T.; Kikuchi, Y.; Yokosawa, H.; Toh-e, A.; Tanaka, K. Lysine 63-Linked Polyubiquitin Chain May Serve as a Targeting Signal for the 26s Proteasome. EMBO J. 2009, 28, 359-371.

54.

Rohaim, A.; Kawasaki, M.; Kato, R.; Dikic, I.; Wakatsuki, S. Structure of a Compact Conformation of Linear Diubiquitin. Acta Crystal. D 2012, 68, 102-108.

55.

Kitao, A.; Harada, R.; Nishihara, Y.; Tran, D. P. Parallel Cascade Selection Molecular Dynamics for Efficient Conformational Sampling and Free Energy Calculation of Proteins. Aip. Conf. Proc. 2016, 1790.

56.

Tran, D. P.; Takemura, K.; Kuwata, K.; Kitao, A. Protein-Ligand Dissociation Simulated by Parallel Cascade Selection Molecular Dynamics. J. Chem. Theory Comput. 2018, 14, 404-417.

27

ACS Paragon Plus Environment