Subscriber access provided by Gothenburg University Library
Biomolecular Systems
An Integrated Markov State Model and Path Metadynamics Approach to Characterize Drug Binding Processes Mattia Bernetti, Matteo Masetti, Maurizio Recanatini, Rommie E. Amaro, and Andrea Cavalli J. Chem. Theory Comput., Just Accepted Manuscript • DOI: 10.1021/acs.jctc.9b00450 • Publication Date (Web): 22 Aug 2019 Downloaded from pubs.acs.org on August 25, 2019
Just Accepted “Just Accepted” manuscripts have been peer-reviewed and accepted for publication. They are posted online prior to technical editing, formatting for publication and author proofing. The American Chemical Society provides “Just Accepted” as a service to the research community to expedite the dissemination of scientific material as soon as possible after acceptance. “Just Accepted” manuscripts appear in full in PDF format accompanied by an HTML abstract. “Just Accepted” manuscripts have been fully peer reviewed, but should not be considered the official version of record. They are citable by the Digital Object Identifier (DOI®). “Just Accepted” is an optional service offered to authors. Therefore, the “Just Accepted” Web site may not include all articles that will be published in the journal. After a manuscript is technically edited and formatted, it will be removed from the “Just Accepted” Web site and published as an ASAP article. Note that technical editing may introduce minor changes to the manuscript text and/or graphics which could affect content, and all legal disclaimers and ethical guidelines that apply to the journal pertain. ACS cannot be held responsible for errors or consequences arising from the use of information contained in these “Just Accepted” manuscripts.
is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.
Page 1 of 64 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Journal of Chemical Theory and Computation
1
An Integrated Markov State Model and Path
2
Metadynamics Approach to Characterize Drug
3
Binding Processes
4
Mattia Bernetti†,‡, Matteo Masetti*†, Maurizio Recanatini†, Rommie E. Amaro§, and
5
Andrea Cavalli*‖
6
†Department of Pharmacy and Biotechnology, Alma Mater Studiorum – Università di
7
Bologna, Via Belmeloro 6, I-40126 Bologna, Italy
8
§Department of Chemistry and Biochemistry, University of California, San Diego, 9500
9
Gilman Drive, La Jolla, California 92093-0340, United States
10
‖Computational & Chemical Biology, Istituto Italiano di Tecnologia, Via Morego 30, I-
11
16163 Genova, Italy
12
ACS Paragon Plus Environment
1
Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Markov
State
Models;
Transition-Path
Page 2 of 64
1
KEYWORDS.
Theory;
Well-Tempered
2
Metadynamics; Path Collective Variables; Miminum Free Energy Pathway; 2-Adrenergic
3
Receptor.
4
5
ABSTRACT. Unveiling the mechanistic features of drug-target binding is of central
6
interest in biophysics and drug discovery. Herein, we address this challenge by combining
7
two major computational approaches, namely Molecular Dynamics (MD) simulations and
8
Markov State Models (MSM), with a Path Collective Variables (PCVs) description coupled
9
with metadynamics. We apply our methodology to reconstruct the binding process of the
10
antagonist alprenolol to the β2-adrenergic receptor, a well-established pharmaceutical
11
target. The devised protocol allowed us to estimate the binding free energy and identify
12
the minimum free energy path leading to the protein-ligand complex. In summary, we
13
show that MSM and PCVs can be efficiently integrated to shed light upon mechanistic
14
and energetic details underlying complex recognition processes in biological systems.
ACS Paragon Plus Environment
2
Page 3 of 64 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Journal of Chemical Theory and Computation
1
2
1. INTRODUCTION
3
Computational simulations are becoming increasingly effective in complementing experiments
4
and providing atomic-level descriptions of complex biological phenomena.1–5 In the context of
5
drug discovery, Molecular Dynamics (MD) simulations and related methods are widely employed
6
to compute key thermodynamic quantities in an increasingly reliable and routine fashion.6,7 Owing
7
to recent advances in software8–11 and hardware,12,13 predicting drug-target (un)binding
8
mechanisms and associated rates is also now within reach.6,14–22 Still, the timescales gap between
9
simulations and experiments remains an open challenge that potentially undermines any
10
computational investigation in this field.23,24
11
Typically, two classes of MD simulation-based approaches can be adopted to address the
12
challenge of bridging scales.6,14 Biased-MD simulation methods, including metadynamics25
13
among others, rely on the notion of reaction coordinates (or collective variables, CVs) whereby
14
additional forces are exploited to accelerate rare events.26 As the free energy surface as a function
15
of CVs is a natural outcome of these methods, a detailed mechanistic interpretation of the process
16
is also guaranteed. If a proper choice of CVs is made, reliable free energy estimations can be
17
attained at a moderate expense of computational resources.27 Unfortunately, choosing the best CVs
18
is not always an easy task, and time consuming trial-and-error procedures are usually exploited to
19
this aim. When prior knowledge about the investigated process is available, an optimal description
20
of the event can be achieved through the Path-CVs (PCVs) formalism.28 PCVs allow to trace the
21
minimum free energy path (MFEP) with high accuracy and have been successfully employed to
22
describe complex biomolecular processes, including conformational transitions,29 ion
23
conduction,30 and ligand (un)binding.31–33 However, since the so-called “guess path” is required
ACS Paragon Plus Environment
3
Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Page 4 of 64
1
by construction, the practical usage of PCVs is strongly limited in real-life scenarios.34
2
Alternatively to biased-MD simulation methods, spontaneous binding events can also be obtained
3
through extensive conventional MD simulations in the microsecond to millisecond timeframe.35–
4
37
5
to make optimal usage of distinct simulations towards a statistical description of binding.38–43
6
Notably, brute force sampling followed by MSM does not require any prior knowledge about the
7
investigated process, and it allows relevant thermodynamic and kinetic quantities to be extracted.44
8
Unfortunately, the computational resources needed to achieve reliable estimates are still daunting
9
for the vast majority of the community and sampling of high energy states remains quite poor even
10
In this context, Markov State Models (MSM) provide the appropriate mathematical framework
when running remarkably long conventional MD simulations.
11
Here, we combine MSM and PCVs to take advantage of the individual benefits of the two
12
methods. Our protocol rests on the assumption that at least a few binding events can be observed
13
through conventional MD simulations. Subsequently, a preliminary MSM is constructed in a low-
14
data regime, and the outcome is employed as a template to construct the guess path, which is
15
required for PCVs-based simulations. In this way, only the most relevant portion of the phase space
16
is resampled through metadynamics, improving sampling around transition states and high energy
17
regions. In Figure 1 a schematic picture of the whole framework is provided. We show that this
18
procedure leads to a highly detailed mechanistic interpretation of the protein-ligand recognition
19
process while saving valuable computational time. Additionally, the free energy profile associated
20
with the binding mechanism allowed us to pinpoint free energy minima and transition states, and
21
to estimate the binding free energy in a remarkable agreement with experimental measures. As a
22
proof of concept, we apply the methodology to a pharmaceutically relevant problem, namely the
23
association of the drug alprenolol to the β2-adrenergic receptor (β2-AR, see Figure 2).37
ACS Paragon Plus Environment
4
Page 5 of 64 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Journal of Chemical Theory and Computation
1 2
2. RESULTS AND DISCUSSION
3
2.1 From a discrete-states kinetic model to Collective Variables.
4
A subset of eleven previously computed binding trajectories collected on the Anton specialized
5
hardware architecture (Table S1) was employed to build the kinetic model.37 In all the runs, the
6
ligand successfully reached the orthosteric pocket inside the receptor through strikingly similar
7
pathways. Moreover, in six of these trajectories the crystallographic geometry was recapitulated
8
within an RMSD lower than 0.8 Å.37 We built the MSM through the PyEMMA 2 software
9
package,45 using a lag time of 100 ns, and further analyzed the kinetic model with Transition-Path
10
Theory (TPT)46 in order to extract the most relevant transition pathways from the ensemble of
11
microstates.47,48 Provided two endpoints, which corresponded in this case to the protein-ligand
12
encounter complex (EC) and the bound state, TPT computes the net flux over all the possible
13
transitions between each pair of microstates. The network of fluxes is shown in Figure 3, where
14
the A state corresponds to the EC, while B represents the bound state. As the Figure shows, while
15
a dominant pathway could be easily recognized for the initial steps of the association process,
16
several routes were found in the proximity of the B endpoint. Thus, to extract a significant
17
sequence of intermediates, a suitable prioritization scheme was devised allowing us to identify
18
four relevant states (I1-4, Figure 3) highly involved in the total binding flux. The geometry of these
19
intermediates, along with their sequential order, can be considered as hallmarks (or “frames”
20
according to the PCVs jargon) that must be preserved when building PCVs. However, a functional
21
ensemble of frames (or frameset) suitable to the PVCs framework must not only preserve the
22
proper sequence of events, but it must also ensure a smooth and high-resolution transition along
23
the specific PCV controlling the progression along the path, i.e. the S variable.28 These
ACS Paragon Plus Environment
5
Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Page 6 of 64
1
requirements translate into relatively small and homogeneous inter-frame distances, respectively,
2
that were not satisfied by the original ensemble of frames (Table S2). Nevertheless, once the
3
ordered sequence of intermediates is known, a better-suited frameset for PCVs can be easily
4
extracted by either re-processing the conventional-MD simulation trajectories or performing
5
focused biased-MD simulations (see Methods). Here, the latter strategy was undertaken,
6
employing steered-MD simulations to link the six parent stations into the final guess path.31,33
7
Retrospective validation of the obtained path was achieved by projecting the conventional-MD
8
simulation trajectories onto the PCVs space (see Figure 4). Interestingly, all the productive
9
trajectories (i.e. those leading to the crystallographic geometry) were found at low Z values,
10
providing initial clues about the suitability of the procedure. In fact, the Z PCV describes the
11
distance as compared to the guess path, which is ideally found at Z = 0 for all values of S.
12
Conversely, at high Z values alternative paths can be found. Most notably, all the non-productive
13
trajectories were mapped at approximately Z > 16 Å. This information turned out to be
14
instrumental for the subsequent resampling phase, as it provides an upper value for Z that can be
15
exploited to limit the exploration of the phase space on the most relevant region of the binding
16
pathway.
17 18
2.2 Detailed mechanistic interpretation of the drug-target recognition process.
19
The free energy surface (FES) reconstructed after 2 μs of metadynamics sampling is shown in
20
Figure 5, together with relevant configurations found along the binding pathway (M1-12, see also
21
Figure 6). By partitioning the protein-ligand bound/unbound region of the PCVs space at
22
approximately S = 20 (see below), a standard binding free energy of about −11.6 kcal/mol was
23
estimated,32,49 in remarkable agreement with experimental data (−12.2 kcal/mol).37 The error on
ACS Paragon Plus Environment
6
Page 7 of 64 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Journal of Chemical Theory and Computation
1
the free energy profile, as estimated through block analysis (see Methods), amounts to
2
0.2 kcal/mol. This error can be interpreted as the uncertainty affecting the free energy
3
that arises by considering trajectory sections (blocks) of increasing size and averaging
4
over all such sections. The procedure is repeated until the uncertainty becomes
5
independent from the block size. Such condition is only met once the correlation between
6
the trajectory data points, which is by definition associated with sequential MD trajectory
7
frames, is lost. In Figure S1, the time evolution of the free energy difference between
8
distinct regions of the PCVs space is also shown.
9
By looking at the FES, we can distinguish four main regions showing distinct energetic features.
10
In particular, we observe a deep (S ≥ 60) and a shallow (40 > S ≥ 20) energy minima region
11
separated by a transition area where the main barrier of the process is located. Moreover, a high-
12
energy region is also found at S values lower than 20. In this latter portion of the PCVs space, the
13
ligand was fully solvated, explaining the almost structureless features of the FES. Conversely, at
14
the opposite side of the plot, configuration M12 perfectly matched the native crystallographic
15
structure (PDB ID: 3NYA).50 As Figure 4A shows, the MFEP (calculated using the MEPSA
16
algorithm)51 was located at low Z values along more than half of the space spanned by S (red solid
17
line). This implies that, at least from M3 to M12, the frameset devised by our integrated procedure
18
was highly reliable, as an optimal guess path would ideally match the MFEP at Z = 0 for all S
19
values. It is nonetheless not surprising that lower accuracy was found in the remaining regions
ACS Paragon Plus Environment
7
Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Page 8 of 64
1
(MFEP at high Z values). Importantly, this suboptimal behavior was not related to the proposed
2
methodology. Indeed, since several competing routes can be followed by the ligand approaching
3
the receptor, the very concept of the path becomes rather arguable for the early stages of binding.
4
For the same reason, in Figure 5B the free energy profile along the MFEP is only shown for the
5
most relevant regions of the PCVs space, namely the portion corresponding to the M3-M12
6
transition.
7
Taken as a whole, the shallow minima portion of the FES entails an ensemble of states describing
8
the EC (see for example the M1 configuration in Figure 5A and 6B). Here, Alprenolol was still
9
partly solvated and established hydrogen bonds with Thr195 and Phe193 through its polar tail,
10
stationing with its principal axis parallel to ECL2 and ECL3. These two loops delimit a solvent-
11
exposed cavity on the protein surface, named the extracellular vestibule (EV), which is known to
12
accommodate ligands before their advancement towards the orthosteric site. In M3 (Figure 6C),
13
Alprenolol occupied the outermost region of the EV, enclosed between ECL3 and the lower
14
portion of ECL2 (ECL2b). Remarkably, this configuration, with alprenolol almost completely
15
desolvated and forming a hydrogen bond with Thr195 through the charged tail, corresponded to
16
pose 2 reported by Dror et al.37 It is important to note that the unbound to EV-bound state transition
17
was proposed as the major barrier for the whole process, and was mostly attributed to dehydration
18
of the binding partners.37 Our simulations do not support this picture, as no relevant barriers could
19
be observed from the unbound regions of the FES up to the M3 state. Nevertheless, as discussed
20
in the Methods section, we wish to highlight that our simulations were carried out using a different
21
force field, which may indeed be partly responsible for the observed discrepancies. Instead, the
22
transition state region of the FES roughly corresponds to configurational changes of alprenolol
23
within the EV, with the main barrier (≈ 6 kcal/mol) separating the outermost (M3) from the
ACS Paragon Plus Environment
8
Page 9 of 64 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Journal of Chemical Theory and Computation
1
innermost (M6) regions of the vestibule. Here, by pivoting on the phenylethyl moiety and thus
2
accomplishing a half-turn rotation, the ligand underwent a substantial rearrangement to eventually
3
occupy the bottom of the EV (Figure 5C, 6D, and 6E). In configuration M6, alprenolol was
4
completely buried inside the protein and matched pose 3 reported by Dror et al.37
5
The final step of binding is represented by the ultimate transfer of alprenolol from the EV to the
6
orthosteric binding pocket. To do so, the ligand has to move through a narrow crevice between
7
ECL2 and helices H5-7, with the concurrent separation of two hydrophobic residues (Phe193 and
8
Tyr308). Notably, this step was proposed as the second energetic barrier encountered by the ligand
9
during the binding process.37 Interestingly, we did not observe any significant barrier at this stage.
10
Tyr308 was quite stable during the simulations, whereas Phe193 showed a wider degree of
11
flexibility. Indeed, as already observed in the original paper,37 the residues experienced frequent
12
conformational rearrangements regardless of the relative location of alprenolol. To further support
13
this observation, we monitored the distance between the center of mass of the phenyl rings of these
14
residues. In Figure 7A, the latter distance is shown as a function of the metadynamics simulation
15
time (red line) together with the S variable mapping the progress of the binding process (black
16
line). No apparent correlation between the two variables could be identified, as also pointed out
17
by the scatter plot reported in Figure 7B. Overall, three main states could be distinguished: a closed
18
state, where the side chain phenyl rings of these residues were in close contact (distance of about
19
4 Å), an open state, where a separation of about 8 Å was observed, and a wide-open state (distance
20
greater than 14 Å). An inter-residue distance of at least 8 Å was always found in the
21
correspondence of ligand ingress/egress, indicating that ligand passage required a sufficient
22
separation between Phe193 and Tyr308 (Figure 7C). Nevertheless, as shown by the red curve in
23
Figure 7A, the two residues were able to frequently interchange between such conformational
ACS Paragon Plus Environment
9
Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Page 10 of 64
1
states over the entire course of the simulation regardless of the presence of alprenolol in their
2
surroundings. Thus, our simulations are indicating that no relevant barrier was associated with this
3
conformational rearrangement. Notably, configurations with a significantly larger inter-residue
4
separation, due to a further conformational rearrangement of Phe193 (observed in the simulation
5
time period 420-820 ns), were also captured in our simulations.
6
After overcoming the narrow passage through the Phe193 and Tyr308 gate (M8), a flip of the
7
polar tail resulted in a configuration displaying the hydroxyl group pointing upwards, in direction
8
of the previously occupied EV, and the charged nitrogen pointing downwards to the orthosteric
9
pocket (see Figure 6F and 6G). Maintaining this orientation and further advancing along the
10
crevice, a displacement of the tail of about 4 Å was observed in M9, making the positively charged
11
nitrogen approach Asp113 and Asn312, establishing a salt bridge with the former and a hydrogen
12
bond with the latter. Remarkably, this minimum represented the crossroad of two different routes
13
towards the bound state. Still focusing on the MFEP, M10 and M11 (Figure 6H) were
14
progressively visited, where the phenyl ring of alprenolol approached a hydrophobic area
15
comprising Val114 and Phe290. This configuration corresponded to pose 4 reported by Dror et
16
al.,37 indicated as the relevant intermediate prior to the native state. A flip of the hydroxyl towards
17
helices H6 and 7, in M11, was then necessary to adopt the final binding pose (M12), where a
18
network of interactions involved the polar tail and Asp113 and Asn312 and the ligand phenyl ring
19
accommodated in a hydrophobic region including Val114, Val117 and Phe290. The alternative,
20
higher-energy path from M9 went through M10’ and M12’ (dashed red line in Figure 4). The
21
former, quite spread and structurally diverse, was dominated by the flip of the ligand hydroxyl
22
towards H3 in order to interact with Asp113, in a diametrically opposed fashion compared to the
23
major path (Figure 8A). Additionally, the ligand phenyl explored different regions of the site,
ACS Paragon Plus Environment
10
Page 11 of 64 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Journal of Chemical Theory and Computation
1
adopting a configuration closely resembling state 4’’ reported by Dror et al.,37 who suggested it as
2
an alternative metastable pose before adopting the final binding mode. In M12’ (Figure 8B), the
3
phenyl moiety visited different regions of the orthosteric site around the volume occupied in the
4
target M12 pose. Noteworthy was the occupancy of a small hydrophobic cavity including Phe289,
5
Phe290, Phe208, and Trp286, exposed after displacement of Phe290, to which the energy barrier
6
between M10’ and M12’ might be adduced.
7 8
2.3 The integrated MSM-Path Metadynamics approach into perspective.
9
In the last decade, a plethora of diverse computational methods aimed at addressing the
10
timescales problem have been proposed and applied to pharmaceutically relevant issues of
11
different complexity.14,22 Accordingly, it is possible today to take advantage of a rich pool of
12
methods purposely developed to assist the drug design process with tailored tradeoffs between
13
accuracy and efficiency. Some of these methods are designed to provide on- and off-rates of drug
14
binding together with the associated free energy at high accuracy. For example, building on an
15
MSM framework, the adaptive sampling scheme known as High Throughput MD (HTMD) is able
16
to reconstruct the whole drug binding process in a totally unsupervised fashion.39 Through an
17
iterative learning and sampling procedure, HTMD has proven to achieve a converged binding
18
affinity one order of magnitude faster than conventional MD simulations combined with MSM.
19
Another adaptive scheme is the WExplore method developed by Dickson and coworkers.52
20
Relying on a weighed ensemble (WE) algorithm,53 replicas of the system are evolved in parallel
21
and assigned to a statistical weight. Periodically, these replicas are cloned or merged based on the
22
extent of exploration of configurational space in a way to enhance the heterogeneity in under-
23
sampled regions. This cloning and merging operations are governed by the trajectories’ statistical
ACS Paragon Plus Environment
11
Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Page 12 of 64
1
weight, which is kept constant through the process so as to maintain an unbiased ensemble of
2
states. Over conventional WE, WExplore implements a hierarchical definition of sampling regions
3
based on dynamically defined Voronoi tessellations. This approach is most suited to study drug
4
unbinding, for which the protein-bound state is already known, and to provide reliable estimates
5
of drug residence time with a great saving of computational time.19–21 It has been recently argued
6
that, in general, adaptive sampling schemes may offer little advantage over conventional MD
7
simulations in getting faster drug binding processes in a totally unsupervised regime.24 However,
8
it has also been recognized that these methods hold great potential in providing an accurate
9
description of drug binding pathways as well as the ability to locate allosteric sites.24 A convenient
10
tradeoff between accuracy and efficiency is certainly provided by SEEKR,16,17 a hybrid
11
MD/Brownian Dynamics/milestoning methodology15 suited to efficiently reconstruct on- and off-
12
rates of drug binding processes through a multiscale approach. Conversely, concerning methods
13
more focused on the computational speed, we mention here the scaled MD approach54 and the
14
closely related τ-RAMD55 to get fast estimates of drug residence times and unbinding
15
pathways.56,57 Finally, when one is more interested in using MD simulation-based methods to
16
overcome limitations of conventional docking methods (dynamic docking), Supervised MD
17
(SuMD) simulations,58,59 the MD-binding protocol,60 as well as the “binding” variant of scaled
18
MD simulations,61 can be considered.
19
Where does the integrated MSM-Path Metadynamics approach fit within the multitude of
20
emerging methods? Even though a rigorous assessment of computational efficiency is out of the
21
scope of the present work, our results indicate that the proposed methodology can be ideally found
22
halfway between brute force MD-sampling and dynamic docking methods. Indeed, once a
23
functional path is obtained, metadynamics simulation can reach reasonable estimates of binding
ACS Paragon Plus Environment
12
Page 13 of 64 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Journal of Chemical Theory and Computation
1
free energies (and related free energy profiles) in the order of microsecond long simulations. It is
2
however also true that much of the efficiency of the whole scheme depends on the ability of the
3
initial pool of trajectories in capturing the binding process with reasonable accuracy. This can only
4
be achieved once multiple protein-ligand recognition events are sampled via conventional MD
5
simulations. While such initial requirement unavoidably translates into substantial computational
6
costs, it is nevertheless difficult to provide an a priori estimate, and it is also expected to vary
7
considerably depending on the investigated system. In such a scenario, we note that our strategy
8
would probably be more suited to characterize the binding process of fragments and/or low-affinity
9
leads. Indeed, these compounds usually show faster association kinetics over drug-like
10
compounds, making the association process easier to observe with conventional computational
11
resources.62
12
While much of the above discussion was focused on sampling schemes and limitations thereof,
13
one should always be aware that MD simulations are also affected by force fields inaccuracies. In
14
this respect, we already mentioned that the choice of different force fields might be responsible for
15
the observed discrepancies between our results and those previously reported by Dror et al.37 Thus,
16
notwithstanding recent impressive advances in force fields development,63 one should always be
17
cautious in the interpretation of MD simulation results, especially when the goal is to obtain
18
quantitative estimates of free energy profiles and/or when electronic polarization effects are known
19
to play a major role in the process under study. Despite accounting for such contributions during
20
MD simulations is not straightforward, a QM/MM approach for refining the free energy profiles
21
within the framework of path collective variables has been recently proposed by Haldar et al.64 It
22
might be interesting to consider such methodology in future work especially to assess the relative
ACS Paragon Plus Environment
13
Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Page 14 of 64
1
merit of the individual force fields and, when possible, to foster their improvement and
2
development.
3 4
3. CONCLUSIONS
5
In summary, through the protocol presented herein, we achieved a highly informative description
6
of alprenolol association to the 2-AR, unveiling mechanistic features of the process with an
7
unprecedented level of detail. Additionally, by relying on the reconstructed energy profile, we
8
were able to estimate a related binding free energy of -11.6 kcal/mol, in very good agreement with
9
experimental data (-12.2 kcal/mol). In this way, we have demonstrated how MSM together with
10
PCV can be integrated to tackle open issues in biophysics and, more generally, in drug design such
11
as drug-target binding. As such, our study shows that the combination of rigorous computational
12
methods (MSM and PCV) can provide highly reliable structural and thermodynamic predictions,
13
paving the way for innovative methods and protocols in computational drug design and discovery.
14 15
4. METHODS
16
4.1 Construction of the Markov State Model
17
Constructing an MSM can be roughly summarized in two major stages. First, the initial
18
trajectory set, in its conventional view as a series of structures over time, is converted into the so-
19
called discrete trajectories, that is a series of microstates over time. To this end, the high
20
dimensional space of the system configurations has to be mapped into a lower dimensional space
21
defined through the identification of relevant variables that are able to discriminate different states
22
along the binding process. Such procedure is usually referred to as state decomposition. Secondly,
23
jumps of fixed-size in time (or lag time) are carried out over these discrete trajectories, and the
ACS Paragon Plus Environment
14
Page 15 of 64 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Journal of Chemical Theory and Computation
1
observed transitions in the microstate space are stored in a count matrix from which a transition
2
probability matrix is then obtained.22
3
In this context, major challenges are found in the exhaustiveness of the initial trajectory pool
4
and in the state decomposition procedure. As for the first aspect, it is essential that full binding
5
events are spontaneously observed, so as to ensure that all the degrees of freedom involved in the
6
process are reliably captured. On the one hand, a strength of MSMs is the possibility of aggregating
7
multiple trajectory data, so that individual portions of the whole binding process can be gathered
8
from independent trajectories. On the other hand, this may lead to disregarding some slow
9
structural rearrangement along orthogonal degrees of freedom that can possibly take place in the
10
different trajectories. As a consequence, when incorporating in the same microstate configurations
11
harvested from different trajectories, such configurations may in reality be kinetically disjoint and
12
thus unlikely to efficiently interconvert. Indeed, and moving to the second aspect, the aim is to
13
achieve a kinetically relevant definition of microstates through geometric clustering. While no
14
simple means are available to accomplish this task,65 such condition can be met to a certain degree
15
of approximation by grouping geometrically similar configurations, which are likely to rapidly
16
interconvert, into the same microstate. To this end, it is important to uniquely define each
17
microstate through careful identification of appropriate variables on which performing the
18
clustering procedure, so as to distinguish and consider all the relevant degrees of freedom involved
19
in the process under study.
20
Once the above steps have been carried out, a standard procedure to identify the optimal choice
21
of lag times to build the model, and then ensuring the condition of memory loss at the basis of the
22
Markovian assumption, is to monitor the relaxation time scales of the system dynamics as a
23
function of the lag time. When no clear-cut convergence of the relaxation time scales is observed,
ACS Paragon Plus Environment
15
Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Page 16 of 64
1
additional simulations may be required before building the model to improve the statistical
2
uncertainty of undersampled regions.44
3 4
Below, we go through the details of the above aspects in our specific application to alprenolol2-AR binding.
5 6
The trajectory set. To build the MSM, we took advantage of the available simulations produced
7
by Dror et al. via unbiased sampling,37 reported in Table S1. Notably, in only 6 out of 12 runs the
8
authors successfully reproduced the binding pose observed in the crystal structure of the 2-AR in
9
complex with alprenolol (PDB ID: 3NYA)50 with an RMSD < 0.8 Å. Nevertheless, the ligand was
10
able to occupy the orthosteric site on the 2-AR in all of the 12 runs. In other words, in half of the
11
considered simulations, alprenolol was able to access the binding site but remained stuck in in non-
12
native binding modes. We aggregated the unbiased trajectories to build the MSM and in turn
13
extract information about ligand access to the orthosteric site. Remarkably, in the process of
14
constructing the MSM, one trajectory was purposely discarded from the dataset because of striking
15
differences found in the association pathway. This behaviour was already recognized in the
16
original paper by Dror et al.37
17
All of the unbiased simulations reported in the original paper by Dror et al. were started in the
18
unbound state, placing 10 replicas of the ligand in the bulk.37 Notably, a significant amount of
19
simulation time was spent by the ligands to wander in the solvent without accomplishing durable
20
contacts with the solvent-exposed surface of the protein.37 As such, with the aim to reduce as much
21
as possible the amount of data to process, and considering our final goal of achieving a thorough
22
understanding about the behaviour of the ligand from the surface to the orthosteric site, we
23
discarded these initial trajectory portions. To this end, for each trajectory, we defined contacts
ACS Paragon Plus Environment
16
Page 17 of 64 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Journal of Chemical Theory and Computation
1
between alprenolol and 2-AR to have occurred when the distance between heavy atoms from the
2
two species was below a cut-off value of 5 Å in at least one frame. We then monitored the distance
3
values of these contacts over the trajectories and discarded the frames in which the minimum value
4
was above a 30 Å cut-off. By applying this procedure, a total number of 296,659 frames from the
5
initial unbiased trajectory set was selected. All this procedure was carried out through in-house
6
python routines that are available upon request.
7 8
State decomposition and choice of the clustering metric. When constructing an MSM, the
9
first step is the selection of appropriate variables to map the configurations of the system into a
10
low-dimensionality space in which subsequently identifying the microstates of the kinetic model
11
by cluster analysis (state decomposition).44,65,66 By the term appropriate, we mean variables able
12
not only to recognize distinct states of the system along the process under investigation but also to
13
describe the dynamics thereof.44,65 The goal, in our case, was to monitor the advancement of
14
alprenolol along its route from the surface of the protein to the orthosteric site.
15
While using the RMSD of the ligand after alignment on the protein’s frame would be an obvious
16
choice, inspired by previous work67 we selected the minimum distance between protein and ligand
17
heavy atoms as variables to build the model. Another aspect to consider when choosing the
18
variables is the trade-off between their effectiveness and their total amount. As a clustering
19
procedure is applied, the larger the number of variables, the highest the computational effort
20
required. Therefore, we only considered proteins’ -carbons and the heavy atoms for the ligand.
21
Moreover, as the association process involved directly only a fraction of the protein residues, we
22
restricted our selection to the residues belonging to the bulk-exposed surface, the extracellular
23
vestibule, the narrow passage leading to the binding pocket, and the residues comprised in the
ACS Paragon Plus Environment
17
Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Page 18 of 64
1
orthosteric site. As a result, 64 protein residues were included through this selection (Figure S2).
2
Similarly, we only considered 5 heavy atoms that sufficed in describing major rearrangements in
3
the ligand’s configurations (Figure S2): the two oxygen atoms from the phenol ether and the
4
hydroxyl groups, the nitrogen present in the secondary amine group, which is positively charged
5
at the physiological pH, and two carbon atoms. The two oxygens and the nitrogen allowed the
6
ligand to interact with protein residues by means of hydrogen bonds and salt bridges. Differently,
7
the two carbon atoms were required to describe the phenyl ring flipping, which could occur
8
regardless of the position of the other chemical groups. As a result, this led to a set of variables
9
consisting of 320 minimum distances between these two groups of atoms.
10
Once appropriate variables are chosen, and their values computed for each trajectory frame, the
11
clustering procedure can be carried out. The typical procedure is to devise a kinetically relevant
12
geometric clustering, in such a way that similar configurations in terms of geometry are likely to
13
rapidly interconvert. In order to cluster our data, we employed the K-means clustering algorithm,68
14
which has gained great popularity in the MSM community and has been employed with
15
satisfactory results in several previous works.69–71 This clustering algorithm tends to place more
16
clusters in more densely sampled regions, which avoids creating too many groups with small count
17
and thus favouring a more accurate determination of the transition probabilities. By construction,
18
the K-means algorithm requires the user to specify the number of output clusters. Thus, an
19
increasing number of clusters was empirically tested: 50, 100, 150, 200, 300, and 500. We
20
performed the cluster analysis using the 320 protein-ligand minimum distances as input variables.
21
For an easier visualization, we projected the resulting clusters in a more intuitive 2D space. Thus,
22
for each of the obtained clusters, we calculated the RMSD of the ligand with respect to the native
23
state (PDB ID: 3NYA)50 for all of the configurations contained in that cluster and took the average.
ACS Paragon Plus Environment
18
Page 19 of 64 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Journal of Chemical Theory and Computation
1
Similarly, the average distance of the ligand center-of-mass (COM) between the configurations
2
included in a cluster and the crystallographic complex was employed as a second dimension. The
3
result for the different number of clusters is displayed in Figure S3. As shown by the figure, when
4
increasing the number of clusters from 50 to 100 and 150, the inclusion of sampled points was
5
significantly improved. Conversely, a further increase to 200, 300 and 500 did not broaden
6
significantly the distributions of the clusters. Aiming at maintaining the number of clusters as more
7
limited as possible, we thus used 150 clusters to build our model.
8 9
Lag time selection. After defining the microstates, the trajectories are scanned to identify a
10
series of microstates over time, obtaining the so-called discrete trajectories.44,65,66,72 Then, jumps
11
of fixed size are performed along these discrete trajectories, and the transitions observed through
12
such jumps in the microstate space are recorded and stored in a count matrix that is subsequently
13
converted into a transition probability matrix. The size of jumps, referred to as the lag time of the
14
model, is a critical parameter of the whole procedure, as different lag times lead to different count
15
matrices. Under the Markovian assumption, systems are memoryless, that is the probability of
16
being in the current state depends only on the previous state.44,65,66,72 As such, transition counts
17
can be considered statistically independent. Accordingly, the Markov time is defined as the
18
smallest lag time that provides a Markovian behaviour.65 Assessing the lag time dependence of the
19
relaxation time scales implied in the system dynamics has been shown to be a useful approach to
20
determine appropriate lag times to ensure memory loss.65,66 To determine the appropriate lag time
21
for the construction of the MSM, we monitored the dependence of the time scales of the system
22
on increasing values of lag time (Implied TimeScales, or ITS, plot). To assess the reproducibility
23
of the entire procedure, the whole workflow, beginning with the cluster analysis, was repeated
ACS Paragon Plus Environment
19
Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Page 20 of 64
1
three times. The corresponding three ITS plots were thus produced and compared. As a result, very
2
similar outcomes were obtained (Figure S4). As the Figure shows, the relaxation time scales level
3
off at a lag time corresponding to about 600 steps. Considering the trajectory saving frequency of
4
0.180 ns, our choice of 600 steps corresponded to a lag time of about 100 ns. We initially attempted
5
to aggregate all of the 12 productive and non-productive trajectories. However, this did not allow
6
observing clear convergence of the relaxation time scales. This behaviour would typically indicate
7
the presence of states not adequately sampled. Thus, we run through the simulations to identify
8
possible sources and noted that, while in 11 out of the 12 simulations the binding pathways were
9
remarkably similar, one single trajectory displayed the ligand getting access to the pocket through
10
a significantly different route. Therefore, we discarded the trajectory and repeated the analysis,
11
eventually obtaining the converged plots reported in Figure S4.
12 13
4.2 Flux analysis and prioritization scheme
14
MSM provides a discrete-state kinetic model of the process investigated that can be exploited to
15
obtain relevant thermodynamic and kinetic information. Often, however, the model still retains a
16
high level of complexity, such as it might be desirable to derive a coarse grain description thereof
17
providing the same quantitative information in a more compact fashion.44,65 Herein, this coarse-
18
graining step (also known as microstate lumping phase) was intentionally skipped, since a
19
microscopic description of the association process was required by the following PCVs
20
construction. Nevertheless, in order to identify the most relevant transitions between the bound
21
and unbound state of the protein-ligand complex, the model was further analysed with Transition-
22
Path Theory (TPT).46,73 Provided two endpoints (the encounter complex and the native pose in this
23
work), TPT aims at describing all the reactive trajectories in the microstate space provided by
ACS Paragon Plus Environment
20
Page 21 of 64 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Journal of Chemical Theory and Computation
1
MSM in terms of their probability distribution and fluxes.48 A central concept in TPT is the
2
committor probability of each microstate, 𝑞𝑖+ , that is the probability for the ith state to reach state
3
B rather than A.46,47 The forward committor probability can be thus defined as:47,48
4 5
𝑞𝑖+ = ∑𝑗 ∈ 𝐵𝑇𝑖𝑗 + ∑𝑗 ∈ 𝐴 ∪ 𝐵𝑇𝑖𝑗𝑞𝑗+ ,
(1)
6 7
where 𝑇𝑖𝑗 is the transition probability between two given microstates, and 𝐴 ∪ 𝐵 is the
8
complement of the union of A and B. In equilibrium conditions, if the detailed balance is satisfied,
9
the backward committor probability is simply given by:
10 11
𝑞𝑖― = 1 ― 𝑞𝑖+ .
(2)
12 13 14
The effective flux, that is the probability to visit microstate i and j consecutively, is thus defined as:
15 16
𝑓𝑖𝑗 = 𝜌𝑖 𝑞𝑖― 𝑇𝑖𝑗𝑞𝑗+ ,
(3)
17 18 19
in which 𝜌𝑖 is the Boltzmann probability for state i. Then, the net flux from A to B can also be computed as:
20 21
𝑓𝑖𝑗+ = max (𝑓𝑖𝑗 ― 𝑓𝑗𝑖).
(4)
22 23
While the reaction rates can be in principle estimated from the total flux,47,48 we highlight here
24
that we obtained the kinetic model in a data-poor regime, hence attempting to extract any
ACS Paragon Plus Environment
21
Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Page 22 of 64
1
quantitative observables would be unwise. Rather, we exploited TPT to obtain the most
2
representative ordered sequence of intermediates between the A and B endpoints. The network of
3
fluxes was visualized with Graphviz,74 as displayed in Figure 3. As reported in the Results and
4
Discussion section, a dominant pathway could only be detected at the initial stages of binding.
5
Hence, rather than by visual inspection, we extracted the intermediates using a less subjective
6
procedure. Following Noé et al.,47 we first decomposed the total flux in circle-free pathways. Each
7
pathway was assigned a probability based on its flux (𝑓𝐴→𝐵 ):47 𝑖
8 9
𝑓𝐴→𝐵 𝑖
= ∑ 𝑓𝐴→𝐵 𝑝𝐴→𝐵 𝑖
(5)
𝑗 𝑗
10 11
Then, each state was assigned a score based on the number of times it was observed in the
12
ensemble of possible pathways, weighted by the relative probability of the associated pathway.
13
Through this analysis, the top-ranking states I1, I2, I3, and I4 were identified together with their
14
ordered sequence.
15 16
4.3 Construction of the guess path for PCVs
17
As already pointed out, the PCVs formalism requires a frameset representing a guess path for
18
the investigated event.28 A crucial requirement for the frameset is an equal spacing in terms of
19
RMSD between neighbouring snapshots. As such, including more frames results in smaller
20
distances and increases the resolution of the mechanistic interpretation. From this standpoint, the
21
few states obtained from TPT were not well-suited to provide functional PCVs. Specifically, an
22
RMSD of 5.7 Å, 8.6 Å, 7.7 Å, 1.6 Å, and 2.8 Å was obtained for the five transitions (see also Table
23
S2). First, such large distances between some of the states would inevitably affect the resolution
ACS Paragon Plus Environment
22
Page 23 of 64 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Journal of Chemical Theory and Computation
1
of the FES. Moreover, these states would not guarantee the equal spacing required to ensure a
2
smooth progression of the collective variable. Consequently, we devised a multi-step procedure in
3
which first we enriched the number of configurations between each pair of endpoints and
4
subsequently selected equally spaced configurations. As for the first step, we performed a series
5
of steered MD simulations75,76 starting in each one of the identified states. In order to plan the
6
steered MD simulations so as to start in one of the stations identified through the MSM and end in
7
the subsequent one, a series of runs were set using the RSMD as CV and targeting it to zero. In
8
addition, one more run was carried out from the first point (state A) increasing the coordination of
9
the ligand by water molecules in order to obtain additional points at a fully solvated state. We then
10
reverted this first short trajectory and merged it with the other steered MD simulation runs,
11
resulting in a 30 ns-long binding trajectory. For all the 6 runs, 5 ns-long each, force constants
12
ranging from 20 to 50 kcal/(mol Å2) were applied. We monitored the work required, and multiple
13
realizations were produced in order to ensure reproducibility of the outcomes. As we expected,
14
considering the relatively limited configurational space allowed to the ligand within the protein,
15
all the runs produced extremely similar results. Sample work profiles from such runs are shown in
16
Figure S5A. Conversely, as reported by Figure S5B, increasing the solvation of the ligand while
17
detaching from the protein opened the way to several distinct states, which were nevertheless less
18
relevant in light of the purposes of our study. Thus, in line with Favia et al.,31 we selected the run
19
in which the work was lower. Once the enriched trajectory, consisting of 30 ns of the 6 aggregated
20
steered MD simulation runs, was obtained, we proceeded to the selection of equally spaced frames.
21
We aimed at achieving an inter-frame distance of about 1 Å, which would allow distinguishing
22
relatively close configurations along the binding pathway, and in turn, achieving a reasonable
23
resolution of the outcoming FES. This step was non-trivial, since essentially translated into
ACS Paragon Plus Environment
23
Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Page 24 of 64
1
iterating over 13,000 MD simulation snapshots to select a more contained number of frames while
2
achieving at the same time the desired spacing. To accomplish this task, we took advantage of in-
3
house scripts specifically devised for this purpose and involving inter-frames interpolation. This
4
procedure allowed us to select 80 points displaying an average inter-frame separation of 1.25 Å.
5
Finally, the guess path was parameterized using the S and Z variables developed by Branduardi et
6
al.:28
7 𝑃
8
𝑆(𝐱) =
∑𝑖 = 1𝑖 𝑒 𝑃
∑𝑖 = 1𝑒
2 ―𝜆‖𝐱 ― 𝐱𝑖‖
(6)
2 ―𝜆‖𝐱 ― 𝐱𝑖‖
9 10
and
11 12 13
𝑃
‖𝐱 ― 𝐱𝑖‖2
𝑍(𝐱) = ― 𝜆 ―1ln (∑𝑖 = 1𝑒 ―𝜆 (7)
),
14 15
in which x represents the current configuration of the system, and ‖𝐱 ― 𝐱𝑖‖ is the distance
16
between the current configuration and the ith structure of the frameset. Here, we used the RMSD
17
as a metric. In particular, the RMSD was calculated on all alprenolol’s heavy atoms, made
18
exception for the two methyl carbons of the terminal N-isopropyl moiety, after optimal alignment
19
over the alpha carbons from protein residues Cys77, Ala78, Asp79, Leu80, Val81, Met82, Gly83,
20
Leu84, Cys116, Val117, Thr118, Ala 119, Ser120, Ile121, Trp122, Thr123, Tyr209, Val210,
21
Pro211, Leu212, Ile 214, Met215, Phe282, Thr283 and Trp286. These alpha carbons belonged to
22
structured motifs and thus displayed small coordinate displacement during the simulations;
23
specifically, such condition was assessed by measuring the root mean square fluctuation of all the
ACS Paragon Plus Environment
24
Page 25 of 64 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Journal of Chemical Theory and Computation
1
protein’s alpha carbons in preliminary sample MD simulation runs. In Eq. 6 and 7, is a parameter
2
which controls the smoothness of the advancement along the path and was set to 1.48 Å-2.
3
All the data and PLUMED input files required to reproduce the results reported in this paper are
4
available on PLUMED-NEST (www.plumed-nest.org), the public repository of the PLUMED
5
consortium,77 as plumID:19.063.
6 7
4.4 Free energy profile and binding free energy
8
All the simulations were performed starting from selected configurations extracted from the
9
trajectories computed by Dror et al.37 Since, the parameters for protein and ligand could not be
10
transferred from the original topology, the entire system was re-parameterized. Thus, protein and
11
lipids were modelled according to the Amber ff14,78 which includes Lipid 14.79 As for alprenolol,
12
this was parameterized according to the general Amber force field (GAFF).80 Partial charges were
13
assigned to the ligand through the widely employed RESP procedure.81 In particular, geometrical
14
optimization and further calculation of the electrostatic potential were carried out via quantum
15
mechanics using the 6-31G* basis set at the Hartree-Fock level of theory; the NWChem software
16
was employed for this stage.82 The TIP3P model was used for the waters83 and an appropriate
17
number of ions, parameterized as in Joung and Cheatham,84 were added to neutralize the system.
18
All of the simulations were performed using the 4.6.7 version of the GROMACS MD simulation
19
engine85 patched with the 2.1.1 version of PLUMED.86
20
Production runs were performed in the NPT statistical ensemble using a 2 fs time-step. A cut-
21
off of 12 Å was used for non-bonded interactions, while long-range electrostatics was treated with
22
the Particle-Mesh Ewald scheme employing a grid spacing of 1.6 Å and a fourth order spline.87
23
The nominal temperature of 310 K was controlled with the Nosé-Hoover thermostat88,89 separately
ACS Paragon Plus Environment
25
Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Page 26 of 64
1
coupling solutes, membrane and solvent with a time constant of 0.5 ps. The Parrinello-Rahman90,91
2
barostat was used to control the pressure (1 bar) using a semi-isotropic coupling and a time constant
3
of 5 ps. Bond lengths for chemical bonds involving hydrogens were restrained to their equilibrium
4
values with the LINCS algorithm.92
5
Well-tempered metadynamics27 was performed for a total time of about 2 μs. Gaussians with a
6
nominal height of 0.2 kcal/mol were used together with a bias factor of 15 units. The width of the
7
Gaussians was set to 0.1 and 0.01 nm2 along S and Z, respectively, and the space along the Z
8
dimension was limited to 0.16 nm2. The Gaussians deposition time was set to 500 molecular
9
dynamics steps. The statistical uncertainty was assessed through block analysis using PLUMED.93
10
This procedure allows providing an estimate of the error in the free energy as a function of pre-
11
defined CVs. Specifically, using the MD simulation trajectory obtained from the metadynamics
12
simulation, the free energy is computed in the CV space for independent and sequential blocks of
13
trajectory frames of fixed length. As a first step, each trajectory frame is assigned a weight
14
according to an umbrella-sampling-like reweighting approach.94 Subsequently, a binning of the
15
CV space is carried out, where each bin is assigned a free energy value that is computed from the
16
frame weights. Finally, for each bin, the average free energy over all the blocks can be determined
17
along with the associated error. Notably, the error is estimated from the fluctuations of the weights
18
over the block ensemble. The procedure was then repeated using different block sizes and the
19
corresponding average for the free energy error is computed from the errors associated with each
20
bin. As a typical behavior, an increase in the average error is observed as the block size becomes
21
larger. However, such dependence ceases once the correlation between the data, which is an
22
intrinsic attribute of MD simulation trajectories, is lost. As a result, the average error reaches a
23
plateau region, which corresponds to the error associated with the computed free energy. The
ACS Paragon Plus Environment
26
Page 27 of 64 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Journal of Chemical Theory and Computation
1
described protocol was applied to our metadynamics trajectory data employing a 200 x 40 binning
2
scheme along the S and Z CVs, respectively. Blocks of increasing size, up to 7000 frames, were
3
considered. The average error as a function of the block size is reported in Figure S6. As shown in
4
the plot, the error tends to a plateau region when the block size includes about 6000 frames
5
(corresponding of a simulation time of 12 ns). Thus, the corresponding average error value
6
provided a reasonable estimate of the uncertainty associated with the free energy computed from
7
our metadynamics simulations.
8
The standard binding free energy of Alprenolol to the 2-AR was estimated following the
9
approach proposed by Saladino et al.32 In particular, by partitioning the bound/unbound state at
10
approximately S = 20, the corrected binding free energy can be formulated as follows:49
11 12
Δ𝐺°𝑏 = Δ𝐹meta +Δ𝐹°corr≅ ― 𝑅𝑇ln
∫sited𝑆 d𝑍 e ∫bulkd𝑆 d𝑍 e
―
𝐹(𝑆,𝑍) 𝑅𝑇
𝐹(𝑆,𝑍) ― 𝑅𝑇
― 𝑅𝑇ln
𝑉meta bulk 𝑉°
(8)
13 14
The first term of Eq. 8 represents the probability ratio between the bound and unbound state of
15
the ligand as reconstructed by well-tempered metadynamics, while the second term is a correction
16
taking into account the fact that in the unbound state only a fraction of the volume of the box was
17
accessible to the ligand. This was due to the finite length of S and the confinement along Z, and it
18
was numerically estimated to be on the order of about 524 Å3. The second term of Eq. 9 also
19
involves the reference of standard state volume (or 1 M concentration), equal to 1660 Å3. Thus,
20
considering the explicit contribution from metadynamics simulations and the entropic cost due to
21
volumetric confinement in the unbound state, a value of – 11.6 kcal/mol was estimated (Δ𝐹meta =
ACS Paragon Plus Environment
27
Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Page 28 of 64
1
– 12.3 kcal/mol and Δ𝐹°corr = 0.7 kcal/mol), in very good agreement with the experimental binding
2
free energy (−12.2 kcal/mol).37
3 4
4.5 Caveats and recommendations for the use of the MSM-Path Metadynamics approach
5
We conclude this section by briefly summarizing some practical suggestions for future
6
applications of the presented methodology. First and foremost, it should be ensured that at least
7
few spontaneous and complete binding events are observed in the initial trajectory set
8
obtained via conventional MD simulations. Alternatively, several disjoint trajectories can
9
be aggregated and analysed as a whole, provided that the entire process can be
10
reconstructed with sufficient accuracy. While as a proof-of-concept we employed here
11
trajectories that were already available, one could consider limiting the sampling of the
12
unbound state through volumetric confinement closer to the binding site.61 This strategy
13
would improve the sampling efficiency of conventional MD simulations, and it is not
14
expected to be critical considering that subsequent refinement is then achieved through
15
the metadynamics resampling. In case spontaneous binding events are recorded, care
16
must be taken in whether to include all the obtained information in the construction of
17
PCVs. Indeed, while greater sampling of overlapping regions of configurational space
ACS Paragon Plus Environment
28
Page 29 of 64 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Journal of Chemical Theory and Computation
1
certainly improves the statistical significance of the PCVs space under construction, the
2
sampling of highly dissimilar binding pathways might introduce additional uncertainty in
3
the MSM rather than alleviate it. In the present work, the individual trajectory responsible
4
for this behaviour was spotted by visual inspection and previous knowledge,37 but re-
5
computing the ITC plots in a leave-one-out fashion (i.e. discarding each time a different
6
trajectory set) would be a viable option to identify an undesirable outlier. Alternatively,
7
one may also take advantage of recently proposed trajectory analysis tools aimed at
8
automatically classifying the similarity of (un)binding pathways.57 In any case, particular
9
care should be addressed to the selection and assessment of appropriate variables that
10
are used to define the microstates of the MSM. This can for instance be achieved by
11
analysing the outcome of the clustering procedure, by ensuring that no relevant structural
12
changes are taking place within the same cluster in orthogonal degrees of freedom. In
13
general, the number of clusters for the microstate definition should be taken as low as
14
possible to avoid microstates with too few counts, condition which is necessary to build a
15
reliable transition probability matrix,65 but sufficiently large to geometrically discriminate
16
between different microstates. Finally, in the construction of the guess path, whose
ACS Paragon Plus Environment
29
Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Page 30 of 64
1
essential requirement is that the inter-frame separation is as constant as possible, an
2
RMSD difference in the order of 1 Å would be desirable to achieve a reasonable resolution
3
and thus correctly map configurations along the S variable. As for the Z PCV, sampling
4
of values larger than16 Å2 are discouraged, since this would correspond to configurations
5
of about 4 Å RMSD difference from the corresponding S value, leading to highly
6
degenerate states for the same point in the PCV space.
7
8
FIGURES
9
Figure 1. The MSM-Path Metadynamics approach described in this work.
10
Figure 2. Orthogonal views of the the 2-AR embedded in a lipid bilayer. The protein is
11
shown in cartoons color-coded according to the -helix regions. In the top panel, the
12
alprenolol binding is depicted tracing its center-of-mass (orange spheres) along the whole
13
set of reactive trajectories.
ACS Paragon Plus Environment
30
Page 31 of 64 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Journal of Chemical Theory and Computation
1
Figure 3. Network of fluxes for the alprenolol binding to the 2-AR (A) Network of fluxes
2
calculated through TPT. Each node represents a single microstate and its size is
3
proportional to the connections. Conversely, the size of the edges is proportional to the
4
flux. (B) A close-up view of the most relevant regions of the network, showing the four
5
intermediate states mostly involved in the transition from the EC to bound state (endpoints
6
A and B, respectively).
7
Figure 4. Retrospective validation of the MSM-derived path though projection of the
8
conventional-MD simulation trajectories on the PCVs space. Each color of the grayscale,
9
from black to light grey (labels A0-C9 in the upper left corner of the plot), identifies one of
10
the 11 different trajectories used to build the MSM. The larger colored dots, labelled A,
11
I1-4, and B, are the projection of the 6 states used as templates to construct the frameset.
12
These are consistent with the labels used in Figure 2. The corresponding configurations
13
are depicted right below the plot, labeled using the same color code. Conversely, right
14
above the plot, sample structures are shown for dense regions in the PCVs space
15
depicting either alternative binding pathways or non-productive trajectories.
ACS Paragon Plus Environment
31
Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Page 32 of 64
1
Figure 5. Mechanistic interpretation of the drug binding process. (A) Free energy surface
2
for the alprenolol binding to the 2-AR as reconstructed by metadynamics along the guess
3
path. Iso-lines are plotted each 2 kcal/mol. The main metastable states are reported
4
together with selected configurations. The minimum free energy pathway is displayed in
5
red (solid line) along with an alternative route (dashed line). (B) Free energy profile along
6
the MFEP (black line) together with the associated statistical uncertainty (gray bars). (C)
7
Focus on the alprenolol rearrangement upon the M3 to M6 transition (red to blue color
8
scale, nitrogen atom highlighted in yellow).
9
Figure 6. Relevant states along the Minimum Free Energy Path. (A) One representative
10
configuration per relevant minima. The color gradient goes from dark red (EC formed) to
11
blue (crystal pose). (B) Representative configuration for the M1 state. (C) Representative
12
configuration for the M3 state. (D) Half-turn rotation along the ligand main axis to move
13
from M3 (darker pink) to M4 (lighter pink). (E) Representative configuration for the M5
14
and M6 states. (F) Representative configuration for the M7 state. (G) The 4 Å
15
displacement to go from M8 (white) to M9 (azure). (H) Representative configuration for
ACS Paragon Plus Environment
32
Page 33 of 64 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Journal of Chemical Theory and Computation
1
the M11 state. In panels (B) to (H), ligand color code is consistent with panel (A); ligand
2
polar atoms are highlighted as spheres (blue for nitrogen, red for oxygen) and polar
3
hydrogens are also shown.
4
Figure 7. Assessing the response of residues Phe193 and Tyr308 to alprenolol passage
5
through the narrow crevice. (A) Phenyl-phenyl side-chains’ distance between Phe193 and
6
Tyr308 plotted against simulation time (red line, left y-axis). The instantaneous value of
7
S during metadynamics simulation is also shown (black line, right y-axis). (B) Scatter plot
8
of S plotted against the phenyl-phenyl side-chains’ distance. (C) Representative
9
configurations extracted from the metadynamics simulation showing the phenyl-phenyl
10
gate in the closed, open, and wide-open states (left, center, and right, respectively).
11
Figure 8. The alternative binding path. Configurations sampled in the alternative path,
12
namely M10’ (A) and M12’ (B). alprenolol is colored in yellow.
13 14
ASSOCIATED CONTENT
ACS Paragon Plus Environment
33
Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Page 34 of 64
1
Supporting Information. The following files are available free of charge.
2
Supporting Figures: time evolution of free energy differences along metadynamics run;
3
details on the protein residues and ligand atoms used to parameterize PCVs; evaluation
4
of the significance of clusters used to build the MSM; ITS plots; outcome of steered MD
5
simulations; block analysis. Supporting Tables: summary of the trajectory set; RMSD
6
matrix of the intermediates identified through TPT.
7
8
AUTHOR INFORMATION
9
Corresponding Author
10
*E-mail:
[email protected] 11
*E-mail:
[email protected] 12
Present Addresses
13
‡Present Address: Scuola Internazionale Superiore di Studi Avanzati, via Bonomea 265,
14
Trieste, Italy
ACS Paragon Plus Environment
34
Page 35 of 64 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Journal of Chemical Theory and Computation
1
Author Contributions
2
The manuscript was written through contributions of all authors. All authors have given
3
approval to the final version of the manuscript.
4
Notes
5
A.C. is co-founder of BiKi Technologies, a startup company that develops methods based
6
on molecular dynamics simulations and related approaches for investigating protein-
7
ligand (un)binding. R.E.A. has equity interest in, and is a cofounder and on the scientific
8
advisory board of, Actavalon, Inc.
9
ACKNOWLEDGMENT
10
We wish to thank Robert D. Malmstrom, Davide Branduardi, Martina Bertazzo,
11
Mariarosaria Ferraro, and Luca Bellucci for useful discussions and technical support.
12
Andrea Spitaleri is also gratefully acknowledged for providing the initial system topology.
13
14
REFERENCES
(1)
Dror, R. O.; Mildorf, T. J.; Hilger, D.; Manglik, A.; Borhani, D. W.; Arlow, D. H.;
ACS Paragon Plus Environment
35
Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Page 36 of 64
1
Philippsen, A.; Villanueva, N.; Yang, Z.; Lerch, M. T.; Hubbell, W. L.; Kobilka, B. K.;
2
Sunahara, R. K.; Shaw, D. E. Structural Basis for Nucleotide Exchange in
3
Heterotrimeric G Proteins. Science. 2015, 348, 1361–1365.
4
(2)
Malmstrom, R. D.; Kornev, A. P.; Taylor, S. S.; Amaro, R. E. Allostery through the
5
Computational Microscope: CAMP Activation of a Canonical Signalling Domain.
6
Nat. Commun. 2015, 6, 7588.
7
(3)
Masetti, M.; Xie, H.; Krpetić, Ž.; Recanatini, M.; Alvarez-Puebla, R. A.; Guerrini, L.
8
Revealing DNA Interactions with Exogenous Agents by Surface-Enhanced Raman
9
Scattering. J. Am. Chem. Soc. 2015, 137, 469–476.
10
(4)
Bernetti, M.; Masetti, M.; Pietrucci, F.; Blackledge, M.; Jensen, M. R.; Recanatini,
11
M.; Mollica, L.; Cavalli, A. Structural and Kinetic Characterization of the Intrinsically
12
Disordered Protein SeV NTAIL through Enhanced Sampling Simulations. J. Phys.
13
Chem. B 2017, 121, 9572–9582.
14 15
(5)
Amaro, R. E.; Mulholland, A. J. Multiscale Methods in Drug Design Bridge Chemical and Biological Complexity in the Search for Cures. Nat. Rev. Chem. 2018, 2, 148.
ACS Paragon Plus Environment
36
Page 37 of 64 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
1
Journal of Chemical Theory and Computation
(6)
Related Methods in Drug Discovery. J. Med. Chem. 2016, 59, 4035–4061.
2
3
De Vivo, M.; Masetti, M.; Bottegoni, G.; Cavalli, A. Role of Molecular Dynamics and
(7)
Wang, L.; Wu, Y.; Deng, Y.; Kim, B.; Pierce, L.; Krilov, G.; Lupyan, D.; Robinson,
4
S.; Dahlgren, M. K.; Greenwood, J.; Romero, D. L.; Masse, C.; Knight, J. L.;
5
Steinbrecher, T.; Beuming, T.; Damm, W.; Harder, E.; Sherman, W.; Brewer, M.;
6
Wester, R.; Murcko, M.; Frye, L.; Farid, R.; Lin, T.; Mobley, D. L.; Jorgensen, W. L.;
7
Berne, B. J.; Friesner, R. A.; Abel, R. Accurate and Reliable Prediction of Relative
8
Ligand Binding Potency in Prospective Drug Discovery by Way of a Modern Free-
9
Energy Calculation Protocol and Force Field. J. Am. Chem. Soc. 2015, 137, 2695–
10
11
2703.
(8)
Salomon-Ferrer, R.; Götz, A. W.; Poole, D.; Le Grand, S.; Walker, R. C. Routine
12
Microsecond Molecular Dynamics Simulations with AMBER on GPUs. 2. Explicit
13
Solvent Particle Mesh Ewald. J. Chem. Theory Comput. 2013, 9, 3878–3888.
14 15
(9)
Kutzner, C.; Páll, S.; Fechner, M.; Esztermann, A.; de Groot, B. L.; Grubmüller, H. Best Bang for Your Buck: GPU Nodes for GROMACS Biomolecular Simulations. J.
ACS Paragon Plus Environment
37
Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
1
Page 38 of 64
Comput. Chem. 2015, 36, 1990–2008.
2
(10) Harvey, M. J.; Giupponi, G.; Fabritiis, G. De. ACEMD: Accelerating Biomolecular
3
Dynamics in the Microsecond Time Scale. J. Chem. Theory Comput. 2009, 5,
4
1632–1639.
5
(11) Lee, T.-S.; Cerutti, D. S.; Mermelstein, D.; Lin, C.; LeGrand, S.; Giese, T. J.;
6
Roitberg, A.; Case, D. A.; Walker, R. C.; York, D. M. GPU-Accelerated Molecular
7
Dynamics and Free Energy Methods in Amber18: Performance Enhancements and
8
New Features. J. Chem. Inf. Model. 2018, 58, 2043–2050.
9
(12) Shaw, D. E.; Maragakis, P.; Lindorff-Larsen, K.; Piana, S.; Dror, R. O.; Eastwood,
10
M. P.; Bank, J. A.; Jumper, J. M.; Salmon, J. K.; Shan, Y.; Wriggers, W. Atomic-
11
Level Characterization of the Structural Dynamics of Proteins. Science. 2010, 330,
12
341–346.
13 14
(13) Shirts, M.; Pande, V. S. Screen Savers of the World Unite! Science. 2000, 290, 1903–1904.
ACS Paragon Plus Environment
38
Page 39 of 64 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
1 2
Journal of Chemical Theory and Computation
(14) Gioia, D.; Bertazzo, M.; Recanatini, M.; Masetti, M.; Cavalli, A. Dynamic Docking: A Paradigm Shift in Computational Drug Discovery. Molecules . 2017, 22, 1−21..
3
(15) Votapka, L. W.; Amaro, R. E. Multiscale Estimation of Binding Kinetics Using
4
Brownian Dynamics, Molecular Dynamics and Milestoning. PLOS Comput. Biol.
5
2015, 11, e1004381.
6
(16) Votapka, L. W.; Jagger, B. R.; Heyneman, A. L.; Amaro, R. E. SEEKR: Simulation
7
Enabled Estimation of Kinetic Rates, A Computational Tool to Estimate Molecular
8
Kinetics and Its Application to Trypsin–Benzamidine Binding. J. Phys. Chem. B
9
2017, 121, 3597–3606.
10
(17) Jagger, B. R.; Lee, C. T.; Amaro, R. E. Quantitative Ranking of Ligand Binding
11
Kinetics with a Multiscale Milestoning Simulation Approach. J. Phys. Chem. Lett.
12
2018, 9, 4941–4948.
13
(18) Teo, I.; Mayne, C. G.; Schulten, K.; Lelièvre, T. Adaptive Multilevel Splitting Method
14
for Molecular Dynamics Calculation of Benzamidine-Trypsin Dissociation Time. J.
15
Chem. Theory Comput. 2016, 12, 2983–2989.
ACS Paragon Plus Environment
39
Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Page 40 of 64
1
(19) Lotz, S. D.; Dickson, A. Unbiased Molecular Dynamics of 11 Min Timescale Drug
2
Unbinding Reveals Transition State Stabilizing Interactions. J. Am. Chem. Soc.
3
2018, 140, 618–628.
4 5
6 7
8 9
10 11
12 13
14
(20) Dickson, A.; Lotz, S. D. Multiple Ligand Unbinding Pathways and Ligand-Induced Destabilization Revealed by WExplore. Biophys. J. 2017, 112, 620–629.
(21) Dickson, A. Mapping the Ligand Binding Landscape. Biophys. J. 2018, 115, 1707– 1719.
(22) Bernetti, M.; Masetti, M.; Rocchia, W.; Cavalli, A. Kinetics of Drug Binding and Residence Time. Annu. Rev. Phys. Chem. 2019, 70, 143–71.
(23) Borhani, D. W.; Shaw, D. E. The Future of Molecular Dynamics Simulations in Drug Discovery. J. Comput. Aided. Mol. Des. 2012, 26, 15–26.
(24) Betz, R. M.; Dror, R. O. How Effectively Can Adaptive Sampling Methods Capture Spontaneous Ligand Binding? J. Chem. Theory Comput. 2019, 15, 2053–2063.
(25) Laio, A.; Parrinello, M. Escaping Free-Energy Minima. Proc. Natl. Acad. Sci. 2002,
ACS Paragon Plus Environment
40
Page 41 of 64 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
1
Journal of Chemical Theory and Computation
99, 12562–12566.
2
(26) Abrams, C.; Bussi, G. Enhanced Sampling in Molecular Dynamics Using
3
Metadynamics, Replica-Exchange, and Temperature-Acceleration. Entropy . 2014,
4
16, 163–199.
5
(27) Barducci, A.; Bussi, G.; Parrinello, M. Well-Tempered Metadynamics: A Smoothly
6
Converging and Tunable Free-Energy Method. Phys. Rev. Lett. 2008, 100, 020603.
7
(28) Branduardi, D.; Gervasio, F. L.; Parrinello, M. From A to B in Free Energy Space.
8
J. Chem. Phys. 2007, 126, 054103
9
(29) Berteotti, A.; Cavalli, A.; Branduardi, D.; Gervasio, F. L.; Recanatini, M.; Parrinello,
10
M. Protein Conformational Transitions: The Closure Mechanism of a Kinase
11
Explored by Atomistic Simulations. J. Am. Chem. Soc. 2009, 131, 244–250.
12
(30) Ceccarini, L.; Masetti, M.; Cavalli, A.; Recanatini, M. Ion Conduction through the
13
14
HERG Potassium Channel. PLoS One 2012, 7, e49017.
(31) Favia, A. D.; Masetti, M.; Recanatini, M.; Cavalli, A. Substrate Binding Process and
ACS Paragon Plus Environment
41
Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Page 42 of 64
1
Mechanistic Functioning of Type 1 11β-Hydroxysteroid Dehydrogenase from
2
Enhanced Sampling Methods. PLoS One 2011, 6, e25375.
3
(32) Saladino, G.; Gauthier, L.; Bianciotto, M.; Gervasio, F. L. Assessing the
4
Performance of Metadynamics and Path Variables in Predicting the Binding Free
5
Energies of P38 Inhibitors. J. Chem. Theory Comput. 2012, 8, 1165–1170.
6
(33) Colizzi, F.; Masetti, M.; Recanatini, M.; Cavalli, A. Atomic-Level Characterization of
7
the Chain-Flipping Mechanism in Fatty-Acids Biosynthesis. J. Phys. Chem. Lett.
8
2016, 7, 2899–2904.
9 10
(34) Zinovjev, K.; Tuñón, I. Reaction Coordinates and Transition States in Enzymatic Catalysis. Wiley Interdiscip. Rev. Comput. Mol. Sci. 2017, 8, e1329.
11
(35) Shan, Y.; Kim, E. T.; Eastwood, M. P.; Dror, R. O.; Seeliger, M. A.; Shaw, D. E.
12
How Does a Drug Molecule Find Its Target Binding Site? J. Am. Chem. Soc. 2011,
13
133, 9181–9183.
14
(36) Buch, I.; Giorgino, T.; De Fabritiis, G. Complete Reconstruction of an Enzyme-
ACS Paragon Plus Environment
42
Page 43 of 64 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Journal of Chemical Theory and Computation
1
Inhibitor Binding Process by Molecular Dynamics Simulations. Proc. Natl. Acad.
2
Sci. 2011, 108, 10184–10189.
3
(37) Dror, R. O.; Pan, A. C.; Arlow, D. H.; Borhani, D. W.; Maragakis, P.; Shan, Y.; Xu,
4
H.; Shaw, D. E. Pathway and Mechanism of Drug Binding to G-Protein-Coupled
5
Receptors. Proc. Natl. Acad. Sci. 2011, 108, 13118–13123.
6 7
(38) Husic, B. E.; Pande, V. S. Markov State Models: From an Art to a Science. J. Am.
Chem. Soc. 2018, 140, 2386–2396.
8
(39) Doerr, S.; De Fabritiis, G. On-the-Fly Learning and Sampling of Ligand Binding by
9
High-Throughput Molecular Simulations. J. Chem. Theory Comput. 2014, 10,
10
2064–2069.
11
(40) Held, M.; Metzner, P.; Prinz, J.-H.; Noé, F. Mechanisms of Protein-Ligand
12
Association and Its Modulation by Protein Mutations. Biophys. J. 2011, 100, 701–
13
710.
14
(41) Pan, A. C.; Roux, B. Building Markov State Models along Pathways to Determine
ACS Paragon Plus Environment
43
Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Page 44 of 64
1
Free Energies and Rates of Transitions. J. Chem. Phys. 2008, 129, 64107.
2
(42) Phillips, J. C.; Braun, R.; Wang, W.; Gumbart, J.; Tajkhorshid, E.; Villa, E.; Chipot,
3
C.; Skeel, R. D.; Kalé, L.; Schulten, K. Scalable Molecular Dynamics with NAMD.
4
J. Comput. Chem. 2005, 26, 1781–1802.
5
(43) Shukla, D.; Meng, Y.; Roux, B.; Pande, V. S. Activation Pathway of Src Kinase
6
Reveals Intermediate States as Targets for Drug Design. Nat. Commun. 2014, 5,
7
3397.
8 9
(44) Pande, V. S.; Beauchamp, K.; Bowman, G. R. Everything You Wanted to Know about Markov State Models but Were Afraid to Ask. Methods 2010, 52, 99–105.
10
(45) Scherer, M. K.; Trendelkamp-Schroer, B.; Paul, F.; Pérez-Hernández, G.;
11
Hoffmann, M.; Plattner, N.; Wehmeyer, C.; Prinz, J.-H.; Noé, F. PyEMMA 2: A
12
Software Package for Estimation, Validation, and Analysis of Markov Models. J.
13
Chem. Theory Comput. 2015, 11, 5525–5542.
14
(46) E, W.; Vanden-Eijnden, E. Transition-Path Theory and Path-Finding Algorithms for
ACS Paragon Plus Environment
44
Page 45 of 64 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
1
Journal of Chemical Theory and Computation
the Study of Rare Events. Annu. Rev. Phys. Chem. 2010, 61, 391–420.
2
(47) Noé, F.; Schütte, C.; Vanden-Eijnden, E.; Reich, L.; Weikl, T. R. Constructing the
3
Equilibrium Ensemble of Folding Pathways from Short Off-Equilibrium Simulations.
4
Proc. Natl. Acad. Sci. 2009, 106, 19011–19016.
5 6
7 8
(48) Meng, Y.; Shukla, D.; Pande, V. S.; Roux, B. Transition Path Theory Analysis of CSrc Kinase Activation. Proc. Natl. Acad. Sci. 2016, 113, 9193–9198.
(49) General, I. J. A Note on the Standard State’s Binding Free Energy. J. Chem. Theory
Comput. 2010, 6, 2520–2524.
9
(50) Wacker, D.; Fenalti, G.; Brown, M. A.; Katritch, V.; Abagyan, R.; Cherezov, V.;
10
Stevens, R. C. Conserved Binding Mode of Human Β2 Adrenergic Receptor Inverse
11
Agonists and Antagonist Revealed by X-Ray Crystallography. J. Am. Chem. Soc.
12
2010, 132, 11443–11445.
13
(51) Marcos-Alcalde, I.; Setoain, J.; Mendieta-Moreno, J. I.; Mendieta, J.; Gómez-
14
Puertas, P. MEPSA: Minimum Energy Pathway Analysis for Energy Landscapes.
ACS Paragon Plus Environment
45
Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
1
2
Page 46 of 64
Bioinformatics 2015, 31, 3853–3855.
(52) Dickson, A.; Brooks
3rd, C. L. WExplore: Hierarchical Exploration of High-
3
Dimensional Spaces Using the Weighted Ensemble Algorithm. J. Phys. Chem. B
4
2014, 118, 3532–3542.
5 6
(53) Huber, G. A.; Kim, S. Weighted-Ensemble Brownian Dynamics Simulations for Protein Association Reactions. Biophys. J. 1996, 70, 97–110.
7
(54) Mollica, L.; Decherchi, S.; Zia, S. R.; Gaspari, R.; Cavalli, A.; Rocchia, W. Kinetics
8
of Protein-Ligand Unbinding via Smoothed Potential Molecular Dynamics
9
Simulations. Sci. Rep. 2015, 5, 11539.
10
(55) Kokh, D.; Amaral, M.; Bomke, J.; Grädler, U.; Musil, D.; Buchstaller, H.-P.; Dreyer,
11
M.; Frech, M.; Lowinski, M.; Vallee, F.; Bianciotto, M.; Rak, A.; C Wade, R.
12
Estimation of Drug-Target Residence Times by τ-Random Acceleration Molecular
13
Dynamics Simulations. J. Chem. Theory Comput. 2018, 14, 3859−3869.
14
(56) Bernetti, M.; Rosini, E.; Mollica, L.; Masetti, M.; Pollegioni, L.; Recanatini, M.;
ACS Paragon Plus Environment
46
Page 47 of 64 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Journal of Chemical Theory and Computation
1
Cavalli, A. Binding Residence Time through Scaled Molecular Dynamics: A
2
Prospective Application to HDAAO Inhibitors. J. Chem. Inf. Model. 2018, 58, 2255–
3
2265.
4
(57) Schuetz, D. A.; Bernetti, M.; Bertazzo, M.; Musil, D.; Eggenweiler, H.-M.;
5
Recanatini, M.; Masetti, M.; Ecker, G. F.; Cavalli, A. Predicting Residence Time and
6
Drug Unbinding Pathway through Scaled Molecular Dynamics. J. Chem. Inf. Model.
7
2019, 59, 535–549.
8
(58) Sabbadin, D.; Moro, S. Supervised Molecular Dynamics (SuMD) as a Helpful Tool
9
To Depict GPCR-Ligand Recognition Pathway in a Nanosecond Time Scale. J.
10
Chem. Inf. Model. 2014, 54, 372−376.
11
(59) Cuzzolin, A.; Sturlese, M.; Deganutti, G.; Salmaso, V.; Sabbadin, D.; Ciancetta, A.;
12
Moro, S. Deciphering the Complexity of Ligand-Protein Recognition Pathways
13
Using Supervised Molecular Dynamics (SuMD) Simulations. J. Chem. Inf. Model.
14
2016, 56, 4687-4705
15
(60) Spitaleri, A.; Decherchi, S.; Cavalli, A.; Rocchia, W. Fast Dynamic Docking Guided
ACS Paragon Plus Environment
47
Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Page 48 of 64
1
by Adaptive Electrostatic Bias: The MD-Binding Approach. J. Chem. Theory
2
Comput. 2018, 143, 1727-1736.
3
(61) Bertazzo, M.; Bernetti, M.; Recanatini, M.; Masetti, M.; Cavalli, A. Fully Flexible
4
Docking via Reaction-Coordinate-Independent Molecular Dynamics Simulations. J.
5
Chem. Inf. Model. 2018, 58, 490–500.
6
(62) Linker, S. M.; Magarkar, A.; Koefinger, J.; Hummer, G.; Seeliger, D. Fragment
7
Binding Pose Predictions Using Unbiased Simulations and Markov-State-Models.
8
J. Chem. Theory Comput. 2019.
9
(63) Robustelli, P.; Piana, S.; Shaw, D. E. Developing a Molecular Dynamics Force Field
10
for Both Folded and Disordered Protein States. Proc. Natl. Acad. Sci. 2018, 115,
11
4758–4766.
12
(64) Haldar, S.; Comitani, F.; Saladino, G.; Woods, C.; van der Kamp, M. W.;
13
Mulholland, A. J.; Gervasio, F. L. A Multiscale Simulation Approach to Modeling
14
Drug–Protein Binding Kinetics. J. Chem. Theory Comput. 2018, 14 , 6093–6101.
ACS Paragon Plus Environment
48
Page 49 of 64 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
1 2
Journal of Chemical Theory and Computation
(65) Bowman, G. R. An overview and practical guide to buildingMarkov state models.
Adv. Exp. Med. Biol. 2014, 797, 7−22.
3
(66) Prinz, J.-H.; Wu, H.; Sarich, M.; Keller, B.; Senne, M.; Held, M.; Chodera, J. D.;
4
Schütte, C.; Noé, F. Markov Models of Molecular Kinetics: Generation and
5
Validation. J. Chem. Phys. 2011, 134, 174105.
6
(67) Plattner, N.; Noé, F. Protein Conformational Plasticity and Complex Ligand-Binding
7
Kinetics Explored by Atomistic Simulations and Markov Models. Nat. Commun.
8
2015, 6, 7653.
9
(68) MacQueen, J. Some Methods for Classification and Analysis of Multivariate
10
Observations. In Proceedings of the Fifth Berkeley Symposium on Mathematical
11
Statistics and Probability, Volume 1: Statistics; Le Cam, L. M. and Neyman, J., Eds.;
12
University of California Press: Berkeley, Calif., 1967; pp 281–297.
13
(69) Plattner, N.; Doerr, S.; De Fabritiis, G.; Noé, F. Complete Protein–Protein
14
Association Kinetics in Atomic Detail Revealed by Molecular Dynamics Simulations
15
and Markov Modelling. Nat. Chem. 2017, 9, 1005.
ACS Paragon Plus Environment
49
Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Page 50 of 64
1
(70) Pinamonti, G.; Zhao, J.; Condon, D. E.; Paul, F.; Noè, F.; Turner, D. H.; Bussi, G.
2
Predicting the Kinetics of RNA Oligonucleotides Using Markov State Models. J.
3
Chem. Theory Comput. 2017, 13, 926–934.
4
(71) Pérez-Hernández, G.; Paul, F.; Giorgino, T.; De Fabritiis, G.; Noé, F. Identification
5
of Slow Molecular Order Parameters for Markov Model Construction. J. Chem.
6
Phys. 2013, 139, 15102.
7 8
(72) Chodera, J. D.; Noé, F. Markov State Models of Biomolecular Conformational Dynamics. Curr. Opin. Struct. Biol. 2014, 25, 135–144.
9
(73) Da, L.-T.; Sheong, F. K.; Silva, D.-A.; Huang, X. Application of Markov State Models
10
to Simulate Long Timescale Dynamics of Biological Macromolecules BT - Protein
11
Conformational Dynamics; Han, K., Zhang, X., Yang, M., Eds.; Springer
12
International Publishing: Cham, 2014; pp 29–66.
13
(74) Ellson, J.; Gansner, E.; Koutsofios, L.; North, S. C.; Woodhull, G.; Mutzel, P.;
14
Junger, M.; Leipert, S. Graphviz – open source graph drawing tools. Lect. Notes
15
Comput. Sci. 2002, 2265, 483–484.
ACS Paragon Plus Environment
50
Page 51 of 64 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Journal of Chemical Theory and Computation
1 2
(75) Grubmüller, H.; Heymann, B.; Tavan, P. Ligand Binding: Molecular Mechanics
3
Calculation of the Streptavidin-Biotin Rupture Force. Science. 1996, 271, 997–999.
4
(76) Izrailev, S.; Stepaniants, S.; Isralewitz, B.; Kosztin, D.; Lu, H.; Molnar, F.; Wriggers,
5
W.; Schulten, K. Computational Molecular Dynamics: Challenges, Methods, Ideas.
6
In Proceedings of the 2nd International Symposium on Algorithms for
7
Macromolecular Modelling, first edition; Deuflhard, P., Hermans, J., Leimkuhler, B.,
8
Mark, A. E., Reich, S., Skeel, R. D., Eds.; Springer Berlin Heidelberg: Berlin,
9
Heidelberg, 1997; pp 39–65.
10
(77) Bonomi, M.; Bussi, G.; Camilloni, C.; Tribello G. A.; Banáš P.; Barducci, A.; Bernetti,
11
M.; Bolhuis, P. G.; Bottaro, S.; Branduardi, D.; Capelli, R.; Carloni, P.; Ceriotti, M.;
12
Cesari, A.; Chen, H.; Chen, W.; Colizzi, F.; De, S.; De La Pierre, M.; Donadio, D.;
13
Drobot, V.; Ensing, B.; Ferguson, A. L.; Filizola, M.; Fraser, J. S.; Fu, H.;
14
Gasparotto, P.; Gervasio, F. L.; Giberti, F.; Gil-Ley, A.; Giorgino, T.; Heller, G. T.;
15
Hocky, G. M.; Iannuzzi, M.; Invernizzi, M.; Jelfs, K. E.; Jussupow, A.; Kirilin, E.;
ACS Paragon Plus Environment
51
Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Page 52 of 64
1
Laio, A.; Limongelli, V.; Lindorff-Larsen, K.; Löhr, T.; Marinelli, F.; Martin-Samos,
2
L.; Masetti, M.; Meyer, R.; Michaelides, A.; Molteni, C.; Morishita, T.; Nava, M.;
3
Paissoni, C.; Papaleo, E.; Parrinello, M.; Pfaendtner, J.; Piaggi, P.; Piccini, G.M.;
4
Pietropaolo, A.; Pietrucci, F.; Pipolo, S.; Provasi, D.; Quigley, D.; Raiteri, P.;
5
Raniolo, S.; Rydzewski, J.; Salvalaglio, M.; Sosso, G. C.; Spiwok, V.; Šponer, J.;
6
Swenson, D. W. H.; Tiwary, P.; Valsson, O.; Vendruscolo, M.; Voth, G. A.; White,
7
A. . Promoting Transparency and Reproducibility in Enhanced Molecular
8
Simulations. Nat. Methods 2019, 16, 670–673.
9
(78) Maier, J. A.; Martinez, C.; Kasavajhala, K.; Wickstrom, L.; Hauser, K. E.;
10
Simmerling, C. Ff14SB: Improving the Accuracy of Protein Side Chain and
11
Backbone Parameters from Ff99SB. J. Chem. Theory Comput. 2015, 11, 3696–
12
3713.
13
(79) Dickson, C. J.; Madej, B. D.; Skjevik, Å. A.; Betz, R. M.; Teigen, K.; Gould, I. R.;
14
Walker, R. C. Lipid14: The Amber Lipid Force Field. J. Chem. Theory Comput.
15
2014, 10, 865–879.
ACS Paragon Plus Environment
52
Page 53 of 64 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Journal of Chemical Theory and Computation
1
(80) Wang, J.; Wolf, R. M.; Caldwell, J. W.; Kollman, P. A.; Case, D. A. Development
2
and Testing of a General Amber Force Field. J. Comput. Chem. 2004, 25, 1157–
3
1174.
4
(81) Bayly, C. I.; Cieplak, P.; Cornell, W.; Kollman, P. A. A Well-Behaved Electrostatic
5
Potential Based Method Using Charge Restraints for Deriving Atomic Charges: The
6
RESP Model. J. Phys. Chem. 1993, 97, 10269–10280.
7
(82) Valiev, M.; Bylaska, E. J.; Govind, N.; Kowalski, K.; Straatsma, T. P.; Van Dam, H.
8
J. J.; Wang, D.; Nieplocha, J.; Apra, E.; Windus, T. L.; de Jong, W. A. NWChem: A
9
Comprehensive and Scalable Open-Source Solution for Large Scale Molecular
10
Simulations. Comput. Phys. Commun. 2010, 181, 1477–1489.
11
(83) Jorgensen, W. L.; Chandrasekhar, J.; Madura, J. D.; Impey, R. W.; Klein, M. L.
12
Comparison of Simple Potential Functions for Simulating Liquid Water. J. Chem.
13
Phys. 1983, 79, 926–935.
14
(84) Joung, I. S.; Cheatham, T. E. Determination of Alkali and Halide Monovalent Ion
15
Parameters for Use in Explicitly Solvated Biomolecular Simulations. J. Phys. Chem.
ACS Paragon Plus Environment
53
Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
1
Page 54 of 64
B 2008, 112, 9020–9041.
2
(85) Pronk, S.; Páll, S.; Schulz, R.; Larsson, P.; Bjelkmar, P.; Apostolov, R.; Shirts, M.
3
R.; Smith, J. C.; Kasson, P. M.; van der Spoel, D.; Hess, B.; Lindahl, E. GROMACS
4
4.5: A High-Throughput and Highly Parallel Open Source Molecular Simulation
5
Toolkit. Bioinforma. 2013, 29, 845–854.
6
(86) Tribello, G. A.; Bonomi, M.; Branduardi, D.; Camilloni, C.; Bussi, G. PLUMED 2:
7
New Feathers for an Old Bird. Comput. Phys. Commun. 2014, 185, 604–613.
8
(87) Essmann, U.; Perera, L.; Berkowitz, M. L.; Darden, T.; Lee, H.; Pedersen, L. G. A
9
Smooth Particle Mesh Ewald Method. J. Chem. Phys. 1995, 103, 8577–8593.
10
(88) Nosé, S. A Molecular Dynamics Method for Simulations in the Canonical Ensemble.
11
12 13
14
Mol. Phys. 1984, 52, 255–268.
(89) Hoover, W. G. Canonical Dynamics: Equilibrium Phase-Space Distributions. Phys.
Rev. A 1985, 31, 1695–1697.
(90) Parrinello, M.; Rahman, A. Polymorphic Transitions in Single Crystals: A New
ACS Paragon Plus Environment
54
Page 55 of 64 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
1
2 3
Journal of Chemical Theory and Computation
Molecular Dynamics Method. J. Appl. Phys. 1981, 52, 7182–7190.
(91) Nosé, S.; Klein, M. L. Constant Pressure Molecular Dynamics for Molecular Systems. Mol. Phys. 1983, 50, 1055–1076.
4
(92) Hess, B.; Bekker, H.; Berendsen, H. J. C.; Fraaije, J. G. E. M. LINCS: A Linear
5
Constraint Solver for Molecular Simulations. J. Comput. Chem. 1998, 18, 1463–
6
1472.
7
(93) Grossfield, A.; Zuckerman, D. M. Ann. Rep. Comp. Chem. 2009, 5, 23–48. (94)
8
Branduardi, D.; Bussi, G.; Parrinello, M. Metadynamics with Adaptive Gaussians.
9
J. Chem. Theory Comput. 2012, 8, 2247–2254.
10
ACS Paragon Plus Environment
55
Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
1
Page 56 of 64
TOC Graphics
2
ACS Paragon Plus Environment
56
Page 57 of 64 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Journal of Chemical Theory and Computation
Figure 1. The MSM-Path Metadynamics approach described in this work. 171x52mm (300 x 300 DPI)
ACS Paragon Plus Environment
Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Figure 2. Orthogonal views of the the β2-AR embedded in a lipid bilayer. The protein is shown in cartoons color-coded according to the α-helix regions. In the top panel, the Alprenolol binding is depicted tracing its center-of-mass (orange spheres) along the whole set of reactive trajectories.
ACS Paragon Plus Environment
Page 58 of 64
Page 59 of 64 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Journal of Chemical Theory and Computation
Figure 3. Network of fluxes for the Alprenolol binding to the β2-AR (A) Network of fluxes calculated through TPT. Each node represents a single microstate and its size is proportional to the connections. Conversely, the size of the edges is proportional to the flux. (B) A close-up view of the most relevant regions of the network, showing the four intermediate states mostly involved in the transition from the EC to bound state (endpoints A and B, respectively).
ACS Paragon Plus Environment
Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Figure 4. Retrospective validation of the MSM-derived path though projection of the conventional-MD trajectories on the PCVs space. Each color of the grayscale, from black to light grey (labels A0-C9 in the upper left corner of the plot), identifies one of the 11 different trajectories used to build the MSM. The larger colored dots, labelled A, I1-4, and B, are the projection of the 6 states used as templates to construct the frameset. These are consistent with the labels used in Figure 2. The corresponding configurations are depicted right below the plot, labeled using the same color code. Conversely, right above the plot, sample structures are shown for dense regions in the PCVs space depicting either alternative binding pathways or non-productive trajectories.
ACS Paragon Plus Environment
Page 60 of 64
Page 61 of 64 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Journal of Chemical Theory and Computation
Figure 5. Mechanistic interpretation of the drug binding process. (A) Free energy surface for the Alprenolol binding to the β2-AR as reconstructed by metadynamics along the guess path. Iso-lines are plotted each 2 kcal/mol. The main metastable states are reported together with selected configurations. The minimum free energy pathway is displayed in red (solid line) along with an alternative route (dashed line). (B) Free energy profile along the MFEP (black line) together with the associated statistical uncertainty (gray bars). (C) Focus on the Alprenolol rearrangement upon the M3 to M6 transition (red to blue color scale, nitrogen atom highlighted in yellow).
ACS Paragon Plus Environment
Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Figure 6. Relevant states along the Minimum Free Energy Path. (A) One representative configuration per relevant minima. The color gradient goes from dark red (EC formed) to blue (crystal pose). (B) Representative configuration for the M1 state. (C) Representative configuration for the M3 state. (D) Halfturn rotation along the ligand main axis to move from M3 (darker pink) to M4 (lighter pink). (E) Representative configuration for the M5 and M6 states. (F) Representative configuration for the M7 state. (G) The 4 Å displacement to go from M8 (white) to M9 (azure). (H) Representative configuration for the M11 state. In panels (B) to (H), ligand color code is consistent with panel (A); ligand polar atoms are highlighted as spheres (blue for nitrogen, red for oxygen) and polar hydrogens are also shown.
ACS Paragon Plus Environment
Page 62 of 64
Page 63 of 64 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Journal of Chemical Theory and Computation
Figure 7. Assessing the response of residues Phe193 and Tyr308 to alprenolol passage through the narrow crevice. (A) Phenyl-phenyl side-chains’ distance between Phe193 and Tyr308 plotted against simulation time (red line, left y-axis). The instantaneous value of S during metadynamics simulation is also shown (black line, right y-axis). (B) Scatter plot of S plotted against the phenyl-phenyl side-chains’ distance. (C) Representative configurations extracted from the metadynamics simulation showing the phenyl-phenyl gate in the closed, open, and wide-open states (left, center, and right, respectively).
ACS Paragon Plus Environment
Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Figure 8. The alternative binding path. Configurations sampled in the alternative path, namely M10’ (A) and M12’ (B). Alprenolol is colored in yellow.
ACS Paragon Plus Environment
Page 64 of 64