Perturbation Approaches for Exploring Protein Binding Site Flexibility

Jul 11, 2016 - Microsecond time scale trajectories showing the opening of transient pockets or of tunnels along ligand binding pathways have been repo...
0 downloads 8 Views 2MB Size
Subscriber access provided by The University of British Columbia Library

Article

Perturbation Approaches for Exploring Protein Binding Site Flexibility to Predict Transient Binding Pockets Daria B. Kokh, Paul Czodrowski, Friedrich Rippmann, and Rebecca C Wade J. Chem. Theory Comput., Just Accepted Manuscript • DOI: 10.1021/acs.jctc.6b00101 • Publication Date (Web): 11 Jul 2016 Downloaded from http://pubs.acs.org on July 12, 2016

Just Accepted “Just Accepted” manuscripts have been peer-reviewed and accepted for publication. They are posted online prior to technical editing, formatting for publication and author proofing. The American Chemical Society provides “Just Accepted” as a free service to the research community to expedite the dissemination of scientific material as soon as possible after acceptance. “Just Accepted” manuscripts appear in full in PDF format accompanied by an HTML abstract. “Just Accepted” manuscripts have been fully peer reviewed, but should not be considered the official version of record. They are accessible to all readers and citable by the Digital Object Identifier (DOI®). “Just Accepted” is an optional service offered to authors. Therefore, the “Just Accepted” Web site may not include all articles that will be published in the journal. After a manuscript is technically edited and formatted, it will be removed from the “Just Accepted” Web site and published as an ASAP article. Note that technical editing may introduce minor changes to the manuscript text and/or graphics which could affect content, and all legal disclaimers and ethical guidelines that apply to the journal pertain. ACS cannot be held responsible for errors or consequences arising from the use of information contained in these “Just Accepted” manuscripts.

Journal of Chemical Theory and Computation is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.

Page 1 of 48

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

Perturbation Approaches for Exploring Protein Binding Site Flexibility to Predict Transient Binding Pockets Daria B. Kokh1*, Paul Czodrowski2, Friedrich Rippmann2, Rebecca C. Wade1,3,4*. 1

Molecular and Cellular Modeling Group, Heidelberg Institute for Theoretical Studies,

Heidelberg, Germany 2

Global Computational Chemistry, Merck KGaA, Darmstadt, Germany

3

Zentrum für Molekulare Biologie, DKFZ-ZMBH Alliance, Heidelberg University, Heidelberg,

Germany 4

Interdisciplinary Center for Scientific Computing, Heidelberg University, Heidelberg, Germany

Short title: Protein pocket flexibility from L-RIP and RIPlig

KEYWORDS. Protein flexibility; transient pockets; ligand-protein binding; MD simulations.

ACS Paragon Plus Environment

1

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 2 of 48

Abstract Simulations of the long-time scale motions of a ligand binding pocket in a protein may open up new perspectives for the design of compounds with steric or chemical properties differing from those of known binders. However, slow motions of proteins are difficult to access using standard molecular dynamics (MD) simulations and are thus usually neglected in computational drug design. Here, we introduce two non-equilibrium MD approaches to identify conformational changes of a binding site and detect transient pockets associated with these motions. The methods proposed are based on the Rotamerically Induced Perturbation (RIP) MD approach, which employs perturbation of side-chain torsional motion for initiating large-scale protein movement. The first approach, Langevin-RIP (L-RIP), entails a series of short Langevin MD simulations, each starting with perturbation of one of the side-chains lining the binding site of interest. L-RIP provides extensive sampling of conformational changes of the binding site. In less than 1ns of MD simulation with L-RIP, we observed distortions of the -helix in the ATP binding site of HSP90 and flipping of the DFG loop in Src kinase. In the second approach, RIPlig, a perturbation is applied to a pseudo-ligand placed in different parts of a binding pocket, which enables flexible regions of the binding site to be identified in a small number of 10ps MD simulations. The methods were evaluated for four test proteins displaying different types and degrees of binding site flexibility. Both methods reveal all transient pocket regions in less than a total of 10ns of simulations, even though many of these regions remained closed in 100ns conventional MD. The proposed methods provide computationally efficient tools to explore binding site flexibility and can aid in the functional characterization of protein pockets, and the identification of transient pockets for ligand design.

ACS Paragon Plus Environment

2

Page 3 of 48

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

1.

Introduction

Exploiting target flexibility in the discovery of protein inhibitors is an emerging and highly challenging goal in computer-aided drug design. The importance of protein motion in determining the selectivity and specificity of small molecule binding has recently become widely recognized, with an increasing number of publications being dedicated to the detection and analysis of transient or “cryptic” pockets (see, for example, Refs.1,2). Computational prediction of transient binding pockets relies heavily upon the ability of simulation methods to reveal the protein motions required for pocket opening. MD simulation is often the method of choice, but its practical application is hindered by the computational costs of sampling the slow motions of proteins. Routine MD simulations, such as those usually employed in drug design, are generally limited to events on the 1-100 nanosecond time scale, while opening of a binding pocket or sub-pocket often requires longer times. Microsecond timescale trajectories showing the opening of transient pockets or of tunnels along ligand binding pathways have been reported in only a few studies, see e.g. 3. To go beyond the time-scale of standard MD simulations, a number of enhanced sampling MD methods, such as replica-exchange methods 4, 5, temperature accelerated MD 6, and hyper- or meta-dynamics 7, 8

, have been proposed. These methods have not, however, been adopted widely in the computer-

aided drug design field because they are still computationally expensive, their efficiency is strongly dependent on user expertise, and their application often requires prior knowledge of the type of motion in the system studied and adaptations of the computational procedures to the particular system. Several years ago, Ho and Agard 9 proposed a non-equilibrium MD method, named Rotamerically Induced Perturbation (RIP), which enables protein flexibility that normally occurs on the microsecond or longer time scale (for example, distortion of secondary and tertiary structure) to

ACS Paragon Plus Environment

3

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 4 of 48

be explored. RIP employs repeated perturbations of a single residue (specifically, transfer of the residue’s kinetic energy to one side-chain torsion angle), followed by short (0.1ps) implicit solvent MD simulations under constant energy conditions. The final snapshots from each of these MD runs are combined in a RIP trajectory that consists of 100 such perturbation steps (pulses) and a total of 10ps MD simulation in the original implementation reported in Ref.9. The excitation of the torsional rotation of a single protein residue initiates kinetic energy transport through the protein structure, breaking unstable non-covalent bonds and leading to large (up to about 10Å) conformational motions of flexible protein segments. A set of RIP trajectories, generated for each residue in the protein with a rotatable side-chain, provides a global map of the protein’s deformability. A major advantage of the RIP approach in comparison to other enhanced MD methods is that it is completely unsupervised. Besides, RIP simulations are quite fast and can be trivially parallelized since they consist of many short, independent MD runs. Unfortunately, the RIP approach, while successful at identifying flexible elements in proteins, is less suited for the prediction of the accessible conformational variations of a protein. Indeed, because of the very short MD simulation time and the constant energy conditions employed after system perturbation, the system is not equilibrated at the end of each pulse. As a result, the average protein temperature tends to increase from pulse to pulse due to the repeated perturbation (see Figure S1A). The high kinetic energy of the protein facilitates the crossing of energy barriers between different conformations and thus extends the conformational space sampled. On the other hand, heating can lead to considerable distortions of loosely packed elements of a protein structure, including partial unfolding. Since a protein structure generally consists of elements with different degrees of flexibility, it is difficult to find a balance between exciting slow motions and keeping fast moving elements near their

ACS Paragon Plus Environment

4

Page 5 of 48

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

equilibrium state. Moreover, it should be noted that despite the fact that each RIP trajectory requires only about 10ps of simulation, one has to generate a trajectory for each protein residue with a rotatable side-chain to explore protein flexibility. This can make complete RIP simulations relatively time-consuming for large proteins (though much quicker than conventional MD to generate similar levels of protein flexibility). Motivated by the idea that a local protein motion might be initiated by kinetic perturbation of selected residues, we have modified the original RIP procedure with the aim of generating suitable tools for prediction of long-time scale (microsecond or millisecond) conformational variations of a designated binding site in a protein. To improve both the accuracy and the speed compared to the original RIP method, we developed two computational approaches: (1) RIP of a ligand, RIPlig, for fast prediction of flexible regions in a binding pocket, and (2) Langevin-RIP, L-RIP, for enhanced sampling of binding site conformations arising from the large-scale motions. In RIPlig, the perturbation procedure is applied not to each protein residue, but to an artificial molecule placed in the binding pocket with repetitions starting with different positions of the molecule. LRIP is a computationally more expensive approach in which perturbation is applied to the side chains of the binding site residues, but the original RIP procedure is modified by using a Langevin thermostat in the simulations instead of the original constant-energy conditions. This ensures that the average protein temperature is preserved during the MD simulations. Additionally, the friction term enables the elements of the protein structure with fast relaxation times to equilibrate within each perturbation pulse. This makes it possible to increase the number of perturbation pulses and, thus, enhance sampling of slow conformational variations.

ACS Paragon Plus Environment

5

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 6 of 48

Figure 1. The four proteins used for the method validation. Left – crystal structures of the four evaluation proteins, illustrating the variety of conformational changes observed. Right - the most flexible parts of the binding sites and parts often missing in crystal structures are shown in blue and pink, respectively; the rest of the protein structure is shown by gray cartoon. The residue numbering is as in the PDB files: 1UYD, 3U4W, 1M47, 1A28 for HSP90, Src, IL2, and PR, respectively. (A) HSP90: PDB structures 1UYD, 1BYQ,

ACS Paragon Plus Environment

6

Page 7 of 48

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

and 1YER are shown in gray, red, and blue; -helix3 is unstable, its middle part transforms into a loop in some structures. (B) Src kinase: PDB structures 3U4W and 3G6H are shown in blue and red. The main structural changes observed include motion of two beta-sheets (in particular of the K295 side-chain), Ploop, C-helix, DFG loop (F405 moves to the center of the ATP binding pocket opening a hidden subpocket), and A-loop which may adopt a helical form. (C) IL-2: PDB structures 1M48, 1M4b, 1M47, 1PW6, 1NBP, are shown in blue, green, orange, magenta, and gray, respectively, along with a ligand from 1NBP and 1PW6 shown with carbon atoms in black and magenta, respectively; the protein binding site closes/opens by rotation of the side chains of F42, R38, L72. Additionally, an adjacent transient pocket opens due to the motion of the flexible loop 75-79 and the rotation of F78. (D) PR: PDB structures 3ZRA, 2OVH, and 1A28 are shown in blue, red and gray, respectively, along with a ligand from 1A28. Displacement of the α-helix9 in the structure 2OVH does not affect the shape of the binding site, which is deeply buried. Structures were rendered in UCSF Chimera10

ACS Paragon Plus Environment

7

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 8 of 48

For evaluation of these methods, we considered four proteins that demonstrate different types of binding site flexibility: the heat shock protein 90 N-terminal domain (HSP90), Src tyrosine kinase (Src), interleukin-2 (IL2), and the progesterone receptor ligand binding domain (PR) (shown in Fig.1). The binding site of the N-terminal ATP-binding domain of HSP90 is lined by the unstable α-helix3 which, according to available crystal structures, undergoes distortion in the middle, converting into two short helices, connected by a loop, as shown in Fig.1A. It is noteworthy that even in the structure with a complete -helix3 (PDB 2UYD), the hydrogen-bond network is slightly distorted in the region between residues L103 and L107 and in the absence of a ligand, this region adopts a loop conformation (PDB:1YER, 1YES). However, a transition from complete α-helix3 to loop

conformation was not observed in a 0.4s standard MD trajectory (the simulation procedure is given in Section 1 of Supporting Information and the results are shown in Fig.S2) To the best of our knowledge, no simulation has been reported that reveals unwinding of α-helix3 or a transition between different conformations of the HSP90 binding site. On the other hand, the high flexibility of the ATP lid (which includes the C-terminal part of α-helix3 and the following loop up to residue 135) was shown to be associated with the chaperone function of bacterial and yeast HSP90s (reviewed, for example, in Ref.11) The complete α-helix3 is associated with the most open binding pocket, including the ATP binding site and the transient pocket region under α-helix3. In the distorted state (for example, cocrystallized with adenosine-diphosphate, ADP; PDB: 1BYQ shown in Fig.1A), the side chains of the loop part of α-helix3 move into the binding pocket and close its transient part. There are two main conformations of the distorted α-helix3: (i) loop-In, in which residues N106-I110 form a loop that is oriented towards the pocket which has its smallest size; (ii) loop-Out, in which the distorted

ACS Paragon Plus Environment

8

Page 9 of 48

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

part is shifted to residues T109-G114 of α-helix3 and the pocket is slightly larger (structures are shown in Fig.1A in blue and red, respectively). The latter conformation is observed in human HSP90 bound with ATP or ADP (PDB: 3T0Z and 1BYQ, respectively). Crystal structures of the unbound HSP90 reveal that both loop-In and loop-Out conformations are possible, but a conformation with a complete helix has not been observed in an apo-form. Src tyrosine kinase is a two-domain enzyme: the N-terminal lobe (N-lobe) includes a 5-stranded -sheet and the C-helix, which are connected by a flexible linker (P-loop), and the C-terminal lobe (C-lobe) comprises -helices from the D-helix to the I-helix and several short -strands. One of the important flexible elements of the binding site is a catalytically crucial DFG loop (D404-F405-G406) located just before the activation loop (A-loop, shown in pink in Fig.1B). Its backbone motion is accompanied by rotation of F405 by 180 (blue and red structures in Fig.1B) and causes opening of a transient sub-pocket behind the ATP binding pocket (DFG-out), which has been used for inhibiting tyrosine kinases (so-called “type-II” inhibitors). Several computational studies of the DFG loop motion have been reported for different tyrosine kinases. In P38 MAP kinase, the transition between the DFG-Out and -In loop conformations has been observed in a 60 ns trajectory of explicit solvent MD simulations carried out at 1000K1 and using accelerated MD driving the transition from the DFG-In to the DFG-Out orientation 12. In insulin receptor kinase, DFG loop motion initiated by temperature accelerated MD was achieved in a 47 ns trajectory 13. In standard explicit solvent MD simulations on an Anton computer, the DFG-OutIn transition has been observed in the EGF receptor after 7-8 s of simulations 14. Besides the DFG loop motion, displacement and partial distortion of beta-sheet elements including the P-loop, are observed in crystal structures (shown in blue in Fig.1B, right panel) and cause additional changes of the binding pocket. Displacement of the C-helix, that presumably also contributes to the

ACS Paragon Plus Environment

9

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 10 of 48

protein motion required for transition between DFG-In and DFG-Out loop conformations, was also observed in simulations 14. The cytokine hormone interleukin-2 (IL-2) is an example of a target with a solvent-exposed protein-protein binding site that is highly adaptive due to the flexibility of several side chains around R38, F42, and L72 (see Fig.1C). Multiple openings of the binding pocket have been observed in a 10ns MD trajectory2. Additionally, crystal structures show some conformational changes outside the main binding site, including partial unwinding of -helix1 and motion of the loops formed by residues Y31-K35 and S75-R81 (shown in pink in Fig 1C). The latter loop motion together with rotation of the F78 side chain leads to closure/opening of a deeply buried transient pocket (adjacent to the protein-protein binding site) detected in titration experiments in Ref. 15 (the ligand, shown with carbon atoms in black in Fig.1C, is covalently bound to Y31C in the mutant structure PDB 1NBP). In this study, cooperative binding of one ligand to the protein binding site and another ligand to the transient pocket behind the loop D74-R81 was reported. As the pocket shape depends on the loop motion, one should expect that longer MD simulations might be required for its opening. The human progesterone receptor ligand binding domain (PR) is a target for hormone therapy. The structure of the binding site is conserved and has the typical nuclear receptor fold 16. Though largescale displacement of the -helix outside the binding site has been observed in crystal structures (see structure coloured in wheat in Fig.1D), variations of the binding site itself are quite modest and arise mainly due to side-chain rotation. Therefore, we have used PR to examine robustness of the proposed method with respect to the possible overestimation of the binding site distortions initiated by perturbation.

ACS Paragon Plus Environment

10

Page 11 of 48

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

In the present study, we first compared the performance of the L-RIP approach with the original RIP method, using HSP90 and Src as test examples. We also explored the longer timescale dynamics of -helix3 in HSP90, and motion of the DFG motif in the Src tyrosine kinase protein by performing short (up to 10 ns) explicit solvent MD equilibration of excited structures generated in L-RIP simulations. Then, we compared binding site flexibility and transient binding pockets predicted using L-RIP and RIPlig simulations for the four test proteins with the results obtained from conventional 100 ns explicit solvent MD simulations and with the pockets observed in crystal structures. We demonstrate that the L-RIP and RIPlig approaches provide computationally efficient ways to explore binding site flexibility and to identify transient binding pockets.

2.

Methods

2.1 Protein structure preparation and standard MD simulations The reference coordinates for starting the simulations were obtained from the following crystal structures in the Protein Data Bank: HSP90: 1UYD; Src: 3U4W; IL-2: 1M47; and PR: 1A28. Ligands and water molecules were removed. Missing loops (residues 75-76 and 99-102 in IL-2) were built using the Maestro software, version 9.417. Hydrogen atoms were generated in the standard protonation state at pH 7 using the NAMD package 18. Conventional explicit solvent MD simulations were performed using the NAMD software 18 and the CHARMM 27 force field

19,20

. The protein was solvated in a box of TIP3P water molecules

extending at least 10 Å beyond the protein surface with periodic boundary using the VMD program 21

; counter ions were added for system neutralization. A cut-off of 10 Å and the particle-mesh

ACS Paragon Plus Environment

11

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 12 of 48

Ewald (PME) method were used for computing non-electrostatic forces and long-range electrostatics, respectively. A time step of 2 fs was employed and all bonds to hydrogen atoms were constrained using the SHAKE algorithm22. The system was minimized in 1000 steps and then gradually heated up to 300 K with a step of 25K over 200 ps. Simulations were performed under NVT conditions with the Langevin thermostat (friction coefficient of 1 ps-1) employed for temperature control. Trajectory data were written at 1ps intervals in all simulations.

2.2 Definition of the binding site For the proteins HSP90, Src, and PR, the position of the ligand in the reference protein crystal structure was used to define the binding pocket. Although more crystal structures are available for these proteins, we chose not to use them for the pocket definition for validation of the L-RIP and RIPlig procedures as, for many applications, only one reference structure with a ligand bound will be available. For IL2, however, we demonstrate how more than one reference crystal structure can be used by considering three different ligands, each bound to different transient pockets (from PDB

structures 1M4A, 1NBP, and 1M48): the three holo-structures were aligned using their heavy atoms; the ligands were extracted and combined into one artificial molecule that was used later for defining the binding site in IL-2. All residues whose atoms were located within a threshold distance from any ligand atom were assigned to the binding site. The threshold distance (5.5-7.0 Å) was defined to be large enough to ensure that the complete active site is included in the simulation region, with the smallest value of 5.5 Å being used for the buried pocket in PR (the binding site residues for the four test targets are visualized in Figure S3; details of the setup of the L-RIP and RIPlig simulations are given in Table S1 of SI).

ACS Paragon Plus Environment

12

Page 13 of 48

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

2.3 RIP and L-RIP methods The RIP method reported in Ref9 was implemented as a Python script (available at http://boscoh.com/rip/) using standard packages (in particular AMBER 23) for performing implicit solvent MD simulations. Here, we only give a short overview and details related to the implementation and modification of the method in the present study. Specifically, we modified the script in order to employ the freely available NAMD 2.9 software implicit solvent model (GB/SA)

24

18

with the generalized Born

and the CHARMM27 force field

19,20

; all the major MD

simulation parameters and simulation procedure were kept the same. During the preparation step, the starting structures were energy minimized, gradually heated to 300K, and equilibrated first under constant energy conditions and then with Langevin dynamics in NVT ensemble (damping coefficient of 10 ps-1; each equilibration step-wise is of 5ps). The RIP simulation procedure consists of a set of perturbation pulses. In each pulse, the total kinetic energy of a single residue in the binding site with a rotatable side chain was applied to the rotational degrees of freedom of the torsion angle only. Then a short implicit solvent MD run of length 0.1ps was performed to let the excess kinetic energy in the side-chain rotation be transferred to nearby residues. In the RIP method9, the MD simulations were performed under constant energy conditions, while in the LRIP simulations, a Langevin thermostat (damping coefficient of 1 ps-1) is employed. This allows us to increase the MD relaxation step from 0.1ps (employed in the original RIP) to 0.3ps or even longer without protein heating, which enables a larger conformational space to be sampled and short-time fluctuations to be damped out. The number of perturbation pulses can also be extended from the 100 used in RIP to 300 or even 1000, which enables slower motions to be explored. The last snapshots of each pulse were combined to make a RIP/L-RIP trajectory for further analysis. This is different from the original RIP procedure9, where only a single final snapshot from each

ACS Paragon Plus Environment

13

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 14 of 48

trajectory was employed to explore protein flexibility. The L-RIP perturbation procedure was applied to each rotatable residue of a protein binding site, thereby providing us with a set of perturbation trajectories for each test protein.

2.4 RIPlig method In the RIPlig method, the perturbation was applied to an artificial molecule (referred to hereafter as a pseudo-ligand; we employed a phenylalanine, PHE, residue with H atoms capping the backbone atoms). The procedure was repeated several times, each time with the pseudo-ligand initially placed in a different part of the binding pocket. A protocol for selection of the pseudoligand position and orientation was designed to ensure that the perturbation is applied to different regions of the binding site (Sec. S2 of Supplementary Information). Consequently, each RIPlig trajectory reveals the flexibility of a particular region of the binding site (this is demonstrated in Fig. S4), while a full set of RIPlig trajectories enables the complete binding site to be sampled. In the RIPlig procedure, energy minimization and equilibration were carried out for the pseudoligand-protein complex as for the original RIP procedure described above. The backbone atoms of the pseudo-ligand were restrained with a harmonic potential with a force constant of 20 kcal mol-1Å-2 to prevent the pseudo-ligand from moving out of the pocket. If the perturbed pseudo-ligand had close contacts with the atoms of a highly mobile part of the binding site, the pseudo-ligand tended to move away from the protein, resulting in less perturbation in subsequent pulses. For this reason, Langevin dynamics was not used as it would further diminish the perturbation effect. Accordingly, a 0.1ps MD relaxation step under constant energy condition was employed in RIPlig as in the original RIP procedure.

ACS Paragon Plus Environment

14

Page 15 of 48

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

2.5 Computation of the pocket shape and identification of transient pocket regions. For the identification of transient pocket regions, the generated structures were superimposed on the reference structure using only the backbone atoms of the binding site. Then the TRAPP (TRAnsient Pockets in Proteins)

25

protocol was employed to compute the pocket shape and

identify transient regions of the binding pocket. The pocket shape was described by a distribution function mapped onto a 3D cubic grid (a grid spacing of 0.75Å) centered on the pocket and accommodating all the binding site residues. Grid points were assigned either to the pocket or to the protein or to the bulk solvent using the procedure described in Ref. 25. To identify transient pocket regions, we did not use all snapshots of the generated MD or LRIP/RIPlig trajectories, as this would require multiple simulations of very similar pocket conformations. Instead, we employed clustering of the generated structures by similarity of their binding site conformations (see details of the clustering procedure in SI, Section S3). Representative structures for each cluster were combined in an ensemble that was then used for detection of transient pocket regions. For each structure in an ensemble, the transient pocket regions were defined as protein cavities that appear or disappear relative to the position of the binding pocket of a reference structure (i.e. a grid point was assigned a value of 1 if the point belongs to a pocket in a structure, but is inside the protein in a reference structure, and a value of -1 for the opposite situation). Then a transient pocket distribution in an ensemble of structures was obtained by averaging of transient regions over all conformations in an ensemble, which roughly defines the fraction of structures in which a particular pocket region is found, but does not exist in the reference structure (or vice versa).

ACS Paragon Plus Environment

15

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

3.

Page 16 of 48

Results and Discussion

3.1 Effect of Langevin dynamics: Comparison of RIP and L-RIP methods To examine the effect of the Langevin damping on the RIP simulation results, we assessed the RIP and L-RIP procedures using two test targets: Src kinase and HSP90, both characterized by backbone transitions between different conformations of their binding sites. Specifically, we focused on the DFG loop motion in Src kinase starting from the DFG-Out conformation (PDB 3U4W) and the distortion of -helix3 in HSP90 starting with the complete α-helix3 (PDB 1UYD). As noted in the Introduction, in the both cases, conformational transitions are expected to occur on the s time-scale or longer, although there are some structural elements with shorter time scale motions. For clarity, we consider for each target only RIP/L-RIP trajectories generated by perturbing one residue that reveals pronounced mobility of the structural elements analyzed. To explore the effect of the length of the MD relaxation step on the range of conformations sampled, two trajectories (each consisting of 300 perturbation pulses) were generated by each method (RIP/L-RIP). These had MD simulation times of either 0.1ps (as in the original RIP procedure9) or 0.3ps in each pulse, referred to as (L-)RIP-0.1ps and (L-) RIP-0.3ps, respectively, hereafter. The backbone RMSD variations along the corresponding RIP and L-RIP trajectories are illustrated in Figs. 2 and 3 for HSP90 and SRC, respectively.

ACS Paragon Plus Environment

16

Page 17 of 48

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

Figure 2. Comparison of RIP and L-RIP simulations of flexibility in HSP90. (A-F) – Residue-wise backbone RMSD from the starting reference structure (PDB:1UYD) for structures generated using the RIP (A, B) and L-RIP (C, D) approaches, as well as in the 100 ns explicit solvent MD trajectory (E), and for crystal structures (F, denoted as X-ray; the list of structures is given in SI, Table S3). Two different sets of simulations, obtained by perturbation of L107 (the position of this residue is denoted by the red triangle), with the length of the relaxation step in each perturbation pulse of 0.1ps and 0.3ps, are indicated as (L)RIP-0.1ps (A, C) and (L-) RIP-0.3ps (B,D), respectively. The magnitude of the deviation from the reference structure is shown by colors from white (less than 3 Å) to dark-red corresponding to 13 Å. (G, H) - two representative conformations observed in the last snapshots of RIP-0.1ps/0.3ps (G) and L-RIP0.1ps/0.3ps (H) trajectories are shown in gray cartoon with the most flexible region colored in red. The starting structure is shown in blue cartoon. The position of the perturbed residue L107 is indicated by the red triangles.

ACS Paragon Plus Environment

17

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 18 of 48

HSP90. For comparative analysis of the L-RIP and RIP methods, we employed perturbation of the L107 residue which is located in an unstable part of the -helix3 and whose perturbation causes the most pronounced distortion of -helix3 (RMSD plots for L107 and several other residues of the -helix3 are shown in Fig. S5). The protein flexibility pattern observed in the first 100 pulses of the RIP simulations is quite similar to that revealed by MD simulations (Fig.2 A and E, respectively). However, during the next 200 pulses, the backbone fluctuations extend to the whole of -helix3, leading to its unwinding (Fig. 2G ). Besides, the N-terminal part of -helix2 (residues 60-70), as well as the N-terminal region of the protein, undergo notable conformational changes. Increasing the length of the MD relaxation in each RIP pulse from 0.1ps to 0.3ps further extends distorted regions whereby the RMSD of the complete protein structure increases from 4 Å up to 6 Å after 200 pulses (shown in Figure S6). The average temperature of the protein rises from about 280K up to 320K after 300 pulses of the RIP-0.3ps trajectory (Fig. S1A). One can therefore conclude that the RIP simulations reveal the most flexible regions of the protein structure, but the number of perturbation steps must be limited to about 100 for HSP90 in order to keep the protein in a state reasonably close to the native folded one. The use of Langevin dynamics in the L-RIP procedure enables the average protein temperature to be maintained along the perturbation trajectory (Fig. S1B), thus reducing protein distortion as illustrated in Fig. 2C,H . Increasing the relaxation time from 0.1ps to 0.3ps causes the perturbation to spread along -helix3, giving rise to a number of different conformations of the loop part of helix3, while keeping the fluctuations of the rest of the protein within the values observed in standard MD simulations (Fig. 2D). The total RMSD of the protein structure is maintained within 2-2.5 Å (in contrast to 4-6 Å in RIP simulations, see Fig.S5). The pattern of the flexible regions observed in L-RIP and the extent of conformational changes are, in general, very similar to those

ACS Paragon Plus Environment

18

Page 19 of 48

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

revealed by MD simulations, only the mobility of the -helix3, is essentially missing in the MD trajectory. This result demonstrates that, in contrast to RIP, different time-scale motions can be successfully combined in one L-RIP trajectory without extensive artificial distortions of loosely packed elements of the protein structure. Src kinase. In L-RIP simulations with a 0.3ps MD relaxation step, the motion of the F405 sidechain, particularly flipping from DFG-In to DFG-Out conformations, was observed upon perturbation of F405 itself (perturbation trajectories for neighbouring residues show only a minor effect on the F405 orientation as illustrated in Fig.S7). Therefore, we used F405 perturbation trajectories for assessment of the RIP and L-RIP methods. RIP perturbation of F405 with a short 0.1ps relaxation results mostly in distortion of the C-helix and mobility of the P-loop with the 1st and 2nd β-strands (see Fig.1B), which appear to be flexible in MD simulations as well (Fig. 3A, E). Increasing the relaxation time in each pulse to 0.3ps leads to some movement of the F405 side-chain, but at the same time causes significant distortion of the protein structure: unwinding of the C-helix as illustrated in Fig. 3G (coloured in green), a hinge motion of the N- and C-lobes that leads to a displacement of the F-helix (shown in pink in Fig. 3G), and the overestimated mobility of the -strand and P-loop (residues A259-S303, whose RMSD values exceeded 13 Å, Fig.3B). On the other hand, the applied perturbation was not large enough for the generation of the expected flipping of F405, as can be seen from the typical conformations visualized in the inset in Fig. 3G. Employment of the Langevin thermostat in the L-RIP procedure results in reduction of protein conformational distortions for both L-RIP simulations, with 0.1ps and 0.3ps relaxation time (see Fig. 3 A-D; the protein backbone RMSD from the starting structure does not exceed 2.5 Å, Fig.

ACS Paragon Plus Environment

19

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 20 of 48

S8 A, B). Specifically, the hinge motion of the C- and N-lobes of the protein is greatly diminished (i.e. motion of the F-helix is negligible) and the C-helix preserves its structure (Fig.3H). On the other hand, in contrast to RIP and standard MD simulations, the DFG loop reveals motion accompanied by rotation of the F405 side-chain, as illustrated in the inset in Fig. 3H. Thus, similar to HSP90 example, the single L-RIP trajectory of Src reveals both short-time scale fluctuations (i.e. β-sheet of P-loop, as observed in 100ns MD) and slower conformational changes (such as the DFG loop movement).

3.2 Exploring binding site conformations using L-RIP trajectories In the previous section, we demonstrated that using Langevin dynamics in the RIP procedure makes it possible to increase the relaxation time in each pulse and the number of pulses, while simultaneously preserving the protein temperature and maintaining the overall protein structure close to its original state. The longer relaxation step increases the variety of conformations sampled. At the same time, the perturbation of the degrees of freedom that correspond to slow protein motions cannot be equilibrated in such short MD runs, and multiple perturbation steps that gradually lead the system to higher energy states along the slow motion coordinates are required. This may lead the system into alternative locally stable states that are inaccessible on the nano- or micro-second time-scale of standard MD simulations. Thus, generation of multiple L-RIP trajectories enables extensive sampling of protein conformational space, although each conformation generated is not energetically validated and therefore, most likely does not correspond to any local energy minimum.

ACS Paragon Plus Environment

20

Page 21 of 48

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

Figure 3. Comparison of RIP and L-RIP simulations of Src flexibility. (A-F) Residue-wise backbone RMSD from the reference structure (PDB: 3U4W) in RIP (A,B), L-RIP (C,D), and the 100ns standard MD trajectory (E), and for crystal structures (F); the list of structures is given in Table S3 of Supporting Information; the A-loop region, missing in many structures, has been modeled and , therefore, has a high RMSD from the reference structure, which it is partially folded into a short -helix). RIP and L-RIP trajectories were obtained by perturbation of F405 (the position of this residue is denoted by the red triangle) and using 0.1 ps (A, C) and 0.3 ps (B, D) MD relaxation in each pulse. Representative conformations of structures generated by the RIP (G) and L-RIP (H) procedures with 0.1ps and 0.3 ps MD equilibration steps are shown in gray cartoons (three flexible elements, A-loop, F-helix and C-helix, are highlighted in red, pink and green, respectively) and are shown along with the reference structure PDB 3U4W (blue cartoon).

ACS Paragon Plus Environment

21

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 22 of 48

To transfer a protein from a perturbed L-RIP conformation into a nearby locally equilibrated state, standard MD simulations can be performed. Obviously, whether an MD run enables the global or a local energy minimum of the protein to be reached depends on the length of the MD simulations and the complexity of the protein’s energy landscape. One can expect, however, that the less stable, high energy protein conformations will be eliminated faster than the locally stable ones. Therefore, the resultant ensemble of MD trajectories starting from L-RIP conformations may provide insight into the conformational space of locally stable states and the transition pathways between them. We tested this hypothesis using several L-RIP trajectories for HSP90 and Src kinase, consisting of 1000 perturbation pulses each (by perturbation of residues L107 in HSP90 (4 trajectories) and F405 in SRC (2 trajectories). We performed up to 10ns MD simulations starting from about 20 snapshots from each L-RIP trajectory selected with a constant stride. To assess the conformational changes observed in simulations, we mapped the relative density distributions of protein conformations onto the 2D space of ψ dihedral angles (defined by backbone atoms N-Cα-C'-N) of several residues located in the flexible region of interest. Specifically, the dihedral angles of N105, I110, and K112 from the -helix3 of HSP90 and N404 and F405 of the DFG-loop in SRC kinase were used. The distributions generated from 100ns MD trajectories, L-RIP simulations, and after equilibration of the L-RIP structures with standard MD simulation are shown in Figs. 4 and 5, and compared to diverse crystal structures.

ACS Paragon Plus Environment

22

Page 23 of 48

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

Figure 4. Simulation of conformational changes of the -helix3 in HSP90. Variations of the -helix3 conformations are illustrated as a population density distribution of three backbone ψ dihedral angles (I105 vs N105, and I110 vs K112) computed from (A) snapshots (at 20 ps intervals) of the 100 ns MD trajectory;

ACS Paragon Plus Environment

23

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 24 of 48

(B) four L-RIP trajectories of 1000 pulses each (0.3ps relaxation MD step after each pulse); (C) from the 5-10 ns interval of explicit solvent MD equilibration applied to structures generated by L-RIP, which are shown in (B). Triangles indicate positions of crystal structures with loop-In (red), loop-Out (blue), and complete helix (yellow) conformations of the -helix3 (PDB identifiers are given in Table S4). In the insets, the pocket shape is shown for: (A) the starting (reference) crystal structure with co-crystallized ligand shown in ball and sticks, (B) one representative L-RIP structure that is close to a loop-In conformation, and (C) the same structures as in (B), but after 5ns explicit MD equilibration as well as a crystal structure of the loop-In conformation with a co-crystallized ligand. In each plot, computed population density values (in black) are normalized by their maximum magnitude and isocontours of constant density are drawn with a step of 10%.

ACS Paragon Plus Environment

24

Page 25 of 48

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

HSP90. In the crystal structures with an unperturbed -helix3 (indicated by yellow triangles in Fig.4), all three ψ dihedral angles are about -50, while the conformations of the distorted -helix3 are characterized by larger values of two ψ dihedral angles: N105/I110 in the case of loop-In conformations, or I110/K112 for loop-Out conformations (the positions of loop-In and loop-Out conformations are indicated in Fig 4 by red and blue triangles, respectively). As can be seen from Fig4A, the helical orientation of the three selected residues is retained in the 100 ns MD simulations (and even in 0.4 µs MD, see Fig.S2). In contrast, the first 300 pulses of LRIP-0.3ps simulations readily result in an increase of the N105 dihedral angle and then, starting from about 400 pulses, -helix3 undergoes distortions in the region of the N105-I110 residues, forming perturbed states that are similar to the loop-In conformation in the space of the three dihedral angles (Fig.4B). Interestingly, the loop-Out conformation, which is associated with flipping of the K112 side-chain (see Fig.1A), was not observed as K112 seems to be insensitive to the L107 perturbation. Additional analysis, however demonstrated that motion of K112 can be initiated by perturbation of several residues outside the binding pocket, particularly those located at the N-terminus of the -helix3 (including K112 itself) and the rest of the ATP-loop (residues 111-138; see Fig.S9A). This result is consistent with the mobility of HSP90 observed in bacterial and yeast structures, where -helix3 can be divided into two regions whose motion is practically uncorrelated: N-terminal, which remains stable, and C-terminal, which belongs to the ATP loop and undergoes complete unfolding during the ATPase cycle, see Fig.S9B. As expected, a subsequent 10 ns explicit solvent MD equilibration of the L-RIP structures reduces the diversity of conformations, making the conformational distribution more compact (Fig. 4C). Specifically, a high population of conformations that are close to the loop-In structure is formed,

ACS Paragon Plus Environment

25

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 26 of 48

though many perturbed structures tend to be restored back to the helical conformation of -helix3. A possible explanation of the latter effect is that the CHARMM 2.7 20 force field employed may introduce some bias towards the helical conformation26. One of the generated structures that clearly converges to a loop-In conformation is displayed in the inset to Fig.4C (the corresponding per-residue RMSD from the loop-In conformation is shown in Fig. S10). Although the backbone RMSD of the -helix3 between L-RIP and the loop-In crystal structure is reduced from 4.2 Å to 2.6Å, , some deviation from the loop-In crystal structure is still observed after 10ns of the MD equilibration. This might be caused by the roughness of the energy landscape and by additional stabilization of the loop region of -helix3 in the loop-In conformation due to direct crystal packing contacts between -helix3 and -helix2 (particularly between D102 and Y61). Src kinase. Similarly to the HSP90 analysis, we mapped the relative density distribution of the DFG-loop conformations in Src kinase onto the 2D space of the ψ dihedral angles of D404 and F405 (Fig. 5). Points in this space that correspond to 16 crystal structures with DFG-Out and DFGIn conformations are shown by triangles. In the 100ns standard MD trajectory started from the PDB structure 3U4W, the DFG-Out state is quite well preserved during the whole simulation (Fig.5A). In the first hundreds of L-RIP perturbation pulses, however, D404 and F405 rotate (Fig. 5B and S11). Increasing the number of perturbation pulses initiates not only further mobility of the F405 side-chain, but also breaking of the D404 and K295 salt-link, which leads to formation of DFG-In loop conformations, whose population increases with the number of perturbation pulses applied (see Fig.S12).

ACS Paragon Plus Environment

26

Page 27 of 48

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

Figure 5. Simulations of the DFG loop motion in Src kinase. Distribution of DFG loop conformations mapped onto the ψ dihedral angles of D404 and F405 as observed in: (A) 100 ns explicit MD simulations; (B) two L-RIP trajectories of 1000 pulses each (0.3ps relaxation MD step after each pulse); (C)- the interval between 5ns and 10 ns of MD trajectories started from L-RIP conformations. Red triangles indicate positions of crystal structures that represent the transition of the DFG loop from DFG-Out to DFG-In

ACS Paragon Plus Environment

27

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 28 of 48

conformations (the PDB identifiers are given in Table S4). The insets show the orientations of D404, F405 and K295 for a representative L-RIP structure before and after equilibration and the shape of the binding pocket for corresponding conformations obtained from simulations (B, C) and observed in crystal structures with co-crystallized ligands (ball-and-stick representation) (A,C). Population densities are computed and represented as in Fig.4.

ACS Paragon Plus Environment

28

Page 29 of 48

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

Curiously, after 10ns of MD equilibration, the conformational distribution becomes more widely spread over the D404 ψ dihedral angle when F405 is in the DFG-Out orientation (~150). This is due to strong interactions between the K295 and D404 side chains, which keep the K295 orientation restrained while the DFG loop, specifically the F405 side-chain, adopts different orientations (as shown in Fig.5C and Fig. S11). This indicates the complexity of the corresponding energy landscape. In contrast, the orientation of the K295 and D404 side chains in the DFG-In orientation is well-localized, which agrees with the compact distribution of DFG-In dihedral angles observed in crystal structures (Fig. 5).

3.3 Benchmarking of the L-RIP and RIPlig approaches on the four test proteins We performed L-RIP and RIPlig simulations with 0.3ps and 0.1ps MD relaxation steps, respectively, and 300 perturbation pulses in each trajectory for all four test proteins, considering this as a basic framework for fast exploration of the flexible elements in a binding site. The number of trajectories generated for each protein depends on the size of the binding site in L-RIP and on the pocket size in RIPlig (see Table S1 and Figure S2). Additionally, 100 ns standard MD simulations were carried out for each target starting from the same reference structures as used in perturbation MD simulations. All frames in a simulated trajectory were clustered according to the binding site similarity (details of the clustering procedure are given in Section S3 in the SI). Residue-wise backbone RMSD values for each cluster representative relative to the starting structure are shown for each target and simulation method used in Fig.6, along with a similar plot for a set of crystal structures.

ACS Paragon Plus Environment

29

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 30 of 48

Finally, the same set of representative conformations for each target was employed for computation of the binding pocket variations and for identification of transient pocket regions. To visualize the detected transient regions, we selected a plane for each protein that crosses the most important transient regions (see Figs. 7 A-D) and represented pocket occurrence in this plane by color (yellow to red for rarely to most often opening pocket regions, and cyan to blue for rarely to most often closing regions) in Figs. 7 E-T. This view obviously cannot fully represent pocket similarity in the 3D space, but gives an overview of the positions of the transient pocket regions detected. HSP90. RIPlig and L-RIP simulations, starting from a conformation with a complete -helix3 (PDB: 1UYD), yield two main regions of the binding site with high flexibility (see Fig.6A) : (1) The middle part of -helix3 that is partially transformed into a loop in several structures generated by L-RIP, and also strongly distorted in RIPlig simulations, though without a clear helix-loop transition. In the MD trajectory only backbone fluctuations in the neighbourhood of L107 indicate that such motion might be feasible on a longer timescale; (2) The ATP-lid that comprises the Cterminal part of -helix3 and the following loop (residues 112-138). Perturbation MD simulations reveal notably higher flexibility of the ATP-lid than the standard MD simulations and the crystal structures of human HSP90. Notably, unfolding of the ATP-lid has been shown to be a prerequisite for the ability of yeast HSP90s to form a protein binding face for the substrate during the ATPase cycle11. Therefore, the high flexibility of the ATP-lid predicted in the perturbation simulations indicates that the dynamics and structural properties of the human HSP90 might be similar. This suggestion is in agreement with the results of experiment

27

, in which a high similarity in the

underlying enzymatic mechanism of human and yeast HSP90 was observed. The regions (3) - Cterminus of -helix2 and (4)- -helix1, do not directly affect the binding pocket; they are flexible

ACS Paragon Plus Environment

30

Page 31 of 48

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

in all simulations and have quite large temperature factors in the crystal structures, although they display little variation in conformation.

Distortion of -helix3 leads to partial closing of the binding pocket in HSP90. Specifically, the transient sub-pocket beneath -helix3 presented in the reference structure was identified as a disappearing transient region in experimental as well as in simulated structures (indicated by the arrow in Fig. 7 A,E,I), with the smallest transient region identified in standard MD simulations. In addition to the transient pocket, the ATP pocket is partially closed in some snapshots due to the flexibility of the ATP lid and side-chains of -helix2. In contrast, in crystal structures the shape of the ATP pocket seems to be very stable, possibly because in the majority of structures it is occupied by ligands. Src kinase. One can distinguish at least six mobile regions of Src kinase indicated in Fig. 6B. Four of them belong to the binding site and are observed in crystal structures and in perturbation MD simulations: (1) DFG and A-loop; (2) a part of the beta-strand with a key residue K295 that makes a salt-link with DFG loop residue D404, and P-loop; (3) a hinge loop Y340-G344 between the Cand N-lobes with residue M341 pointing to the pocket; (4) C-helix, residues P304-L317. As has been discussed above, motion of the DFG loop, presumably the slowest motion in the Src structure among those mentioned above, is observed upon L-RIP perturbation of F405, though the complete flipping of F405 from DFG-Out to DFG-In positions is not always achieved in a 300 pulse L-RIP trajectory. In the RIPlig simulations, F405 changed its orientation without notable movement of the backbone when a pseudo-ligand was placed directly behind the DFG loop (see Fig.S4). In standard MD, the mobility of the DFG loop is essentially missing.

ACS Paragon Plus Environment

31

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 32 of 48

Figure 6. Illustration of protein flexibility observed in simulations and in crystal structures. Residue-wise backbone RMSD relative to the reference structure is shown for the four test targets (A) HSP90, (B) Src, (C) IL2, and (D) PR, (reference structure PDB codes are 1UYD, 3U4W, 1M47, 1A28, respectively; they

ACS Paragon Plus Environment

32

Page 33 of 48

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

are shown on the right-hand side in cartoon representation colored by increasing B-factor from blue to red) in the crystal structures (‘X-ray’) and in the simulations: 100nsstandard explicit solvent MD trajectories and L-RIP and RIPlig simulations (see simulation details in the text). For the crystal structures, the RMSD values were computed for each structure (see the list of PDB codes in Table S3), while for the simulation results, all generated structures were clustered according to the similarity of the binding site conformations and only representative structures of each cluster were then analyzed. Accordingly, each column of the RMSD plot represents one cluster representative, while rows represent residues in the protein sequence). The most flexible protein segments observed in simulations or in crystal structures are denoted by numbers in the RMSD plots and in the protein structures.

ACS Paragon Plus Environment

33

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 34 of 48

Figure 7. Transient binding pocket regions detected in the four proteins. In (A-D), a reference structure and a cross-section plane are shown for each target, and in the lower rows, the cross-section of the opening and closing transient regions (color variation from yellow/cyan to red/blue indicates increasing number of structures in which a particular pocket is open/closed) are shown along with the protein cross-section, as observed in crystal structures (E-H), obtained in explicit solvent MD (I-L), RIPlig (M-P), and L-RIP-0.3ps

ACS Paragon Plus Environment

34

Page 35 of 48

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

(Q-T). Arrows indicate positions of transient binding pockets. . The protein van der Waals surface is shown in grey, the structure in cartoon representation, and the ligand bound is shown with carbon atoms colored green. Structures were rendered in UCSF Chimera 10

ACS Paragon Plus Environment

35

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 36 of 48

The largest RMSD values in the region (1) for crystal structures in Fig.6 B correspond to residues of the mobile A-loop that are missing in many crystal structures (and were modelled for the analysis). In the reference structure, (PDB: 3U4W), the A-loop is folded into a short helix (see Fig.1B) that is very weakly coupled with the rest of the protein structure, which explains its low sensitivity to the perturbation of the binding site residues. Accordingly, no clear correlation between the A-loop and the binding site conformations (particularly, the position of the DFG loop) is observed in the crystal structures. Additionally, fluctuation of the regions 5 (G453-L360) and 6 (V467-G476), located outside the binding site, were observed in all simulations. These elements have quite large B-factors but little variation in conformation in the crystal structures. The first three mobile regions of the binding site mentioned above manifest themselves as transient regions in the binding pocket simulations: regions 1 and 3 in Fig. 6 are denoted in Fig.7 B, F by DFG-loop and M341, respectively, while protein motion in region 2 gives rise to two transient pocket regions, labelled K295 and P-loop in Fig.7 B,F. Displacement of the C-helix, (region 4) affects the pocket shape as well, but the corresponding transient region cannot be projected on the plane used for visualization in Fig.7. Analysis of binding pocket fluctuations in the standard MD trajectory shows only two transient regions, arising from the side-chain rotation (particularly K295) and motion of the P-loop (Fig.7J). The same regions are observed in L-RIP and RIPlig trajectories (Fig.7 N, R). Additionally, a part of the transient sub-pocket behind the DFG loop is detected in RIPlig and L-RIP trajectories, though complete opening of the sub-pocket was not observed. Also, two transient regions arising from the motion of the hinge loop and rotation of residue M343 are detected in general agreement with the crystal structures. IL-2. For simulation of the binding site flexibility in IL-2, we used a structure (PDB 1M47) where the adjacent buried binding pocket (denoted as 1 in Fig.7G, see also Fig1C) was partially closed

ACS Paragon Plus Environment

36

Page 37 of 48

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

by the G74-L80 loop and the F78 side chain. Perturbation simulations reveal high flexibility of the loop and opening of the corresponding transient pocket region, while in the standard MD trajectory it is permanently closed by the F78 side-chain (see Fig.7, O, S, and K). The solvent exposed binding site (residues K35-Y45; region 2 in Fig.6 C) changes its conformation mainly due to side-chain rotation. This kind of protein flexibility is not clearly pronounced in the backbone RMSD plot, but manifests itself as transient pocket regions in the X-ray structures (Fig.7G, pocket denoted by number 2) as well as in all simulations (Fig. 7, K, O, S). Both the RMSD plot in Fig.6 and the detected transient pockets illustrated in Fig.7 show that the backbone mobility and therefore transient pocket size is somewhat overestimated in perturbation simulations and underestimated in standard MD simulations relative to the crystal structures. Two further flexible loops that do not belong to the binding site (denoted as 3 and 4 in Fig. 6) are observed in the crystal structures and simulations. PR. The binding pocket in PR is buried and, in the majority of crystal structures, only two flexible regions of the binding site are observed, which arise from rotation of the M909 and F794 sidechains located in -helix9 and -helix5, on opposite sides of the binding pocket (see Fig. 1C). However, there are several mobile elements observed in the PR crystal structures that are adjacent to the binding site of co-crystallized ligands (denoted as regions 1-3 in Fig.6D). Specifically, in some structures a part of -helix5 and -helix6, (residues G778-Y795 , region 1 in Fig.6D), is displaced or distorted (see Fig.1D); -helix9 (R900-A922, region 2) is completely displaced (with its part missing in PDB: 2OVH shown in Fig.1D or PDB: 1A52); -helix2 (region 3) is mobile, showing displacement by several Å. All these regions have high B-factors in the reference PDB structure and are successfully identified by perturbation MD simulations as shown in Fig. 6D.

ACS Paragon Plus Environment

37

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 38 of 48

Their mobility, however, is not observed in 100ns standard MD simulations since it arises from slow back-bone motion and distortion of the secondary structure of the protein. Two flexible regions of the binding site mentioned above (from rotation of the M909 and F794 sidechains), appear as transient sub-pockets (Fig.7H, denoted as regions 1 and 2) with all sampling methods, albeit with some shifting in the position of the sub-pockets because of the slight displacement of -helices during equilibration in both explicit solvent (standard MD) and implicit solvent (L-RIP, RIPlig) MD simulations. Both transient regions identified from the standard MD trajectory (Fig. 7L) arise from side-chain fluctuations and are notably smaller than in the X-ray structures and in perturbation simulations. Interestingly, L-RIP and RIPlig reveal two additional flexible regions merged with the region 2 (denoted by 3 and 4 in Fig.7T). These transient regions do not appear in the crystal structures analyzed. However, this apparent overestimation of the protein flexibility by perturbation MD methods in fact reveals a possible opening of adjacent pockets due to motion of -helix2 and -helix5 that is present in some crystal structures that were not used for the transient pocket analysis because of their incompleteness. Indeed, both flexible regions 3 and 4 can be observed as small transient pockets adjacent to the ligand binding pocket in the crystal structure PDB:2OVH (see Fig.S13; this structure has part of -helix3 missing).

4. Conclusions In the present work, we introduce and evaluate the applicability of two perturbation MD methods that provide computationally efficient and inexpensive ways to explore protein binding site flexibility.

ACS Paragon Plus Environment

38

Page 39 of 48

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

For all test targets, the simulation results demonstrate that the proposed L-RIP and RIPlig procedures reveal all flexible elements of the binding sites observed in experimental structures, although they are not equally well pronounced. The same is true for the positions of the transient pocket regions identified, which demonstrate very good correlations with those identified in crystal structures. For all test proteins, the computational time required to reveal binding site flexibility is less than 7ns of MD simulations if the L-RIP method is used and several times less for RIPlig (see Section S4 of Supporting Information). In contrast, in 100 ns standard explicit solvent MD simulations, only side chain fluctuations and the motion of loosely bound elements in protein structures are observed, while the long-time scale backbone motion or distortions of the secondary structure are not sampled. A further important advantage of both proposed protocols is that they do not require a priori knowledge of system behaviour, which often appears to be a bottleneck for enhanced MD methods 28

. We have demonstrated that the same protocol is applicable for different types of proteins, and

that it can be used to explore pocket variations arising from elements of protein structure characterized by quite different degrees of flexibility- from large-scale backbone motions to side chain rotations. Additionally, we have demonstrated with the examples of HSP90 and Src kinase that the L-RIP procedure can be used to obtain insights into the conformational space of the binding site and not just to find regions of high flexibility as the original RIP method was designed to do. Starting standard MD simulations from perturbed (high-energy) states makes the sampling of the energy landscape much faster than in simulations started from the global energy minimum, although complete sampling of the energy landscape may not be achieved either. However, local thermodynamic minima separated by high energy barriers that are hardly accessible in

ACS Paragon Plus Environment

39

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 40 of 48

conventional MD simulations can be efficiently accessed in this procedure. We demonstrated that micro-second time-scale motions, such as DFG loop motion and unwinding of the -helix, observed in Src and HSP90 proteins, respectively, are elucidated in less than 1ns of L-RIP simulations. One should note here some limitations of the proposed methods. First, the conformational changes observed in RIPlig may depend on the initial placement of the pseudo-ligand and the pocket space available for motion of the pseudo-ligand. Particularly, the pocket flexibility observed in RIPlig may be overestimated for small buried pockets and underestimated for open solvent-exposed pockets. Furthermore, because of the constant energy equilibration procedure used in RIPlig, the average protein temperature may rise with increasing number of perturbations, as well as with the length of the subsequent implicit solvent MD simulations (as in the original RIP method). This makes RIPlig more useful for a rapid, though less accurate, evaluation of pocket flexibility. Then, as both the L-RIP and RIPlig approaches are based on perturbation of protein structures, they are unlikely to be able to predict folding of protein segments (for example, the re-building of the helix in HSP90). For the same reason, they provide only sampling of protein conformations, not the kinetics or thermodynamics of the protein motion. One should also take into account, that although the position of the transient pocket regions can be revealed from a single run, it is advisable to generate several perturbation trajectories for each perturbed residue if better sampling of protein conformational space is required. Finally, an overestimation of the pocket mobility cannot be ruled out because both simulation methods, RIPlig and L-RIP, provide non-equilibrated structures. To obtain the accurate conformations of a binding site required for ligand docking, additional explicit solvent MD equilibration of L-RIP structures is advisable. We demonstrated for two example cases that MD equilibration effectively stabilizes the system after perturbation

ACS Paragon Plus Environment

40

Page 41 of 48

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

simulations, by eliminating the most energetically unfavourable conformations. Importantly, LRIP simulations combined with explicit solvent equilibration are several-fold faster than standard MD simulations to explore such conformational changes.

Supporting Information Available. Figure S1 shows the variation of protein temperature in L-RIP and RIP simulations. The procedure for the 400ns standard MD simulations of HSP90 started from the unliganded structure with a complete -helix3 (PDB: 2UYD) is given in Section S1 and the results are summarized in Fig.S2. The procedure for the placement of the pseudo-ligand used in RIPlig simulations is given in Sec. S2. The clustering procedure is described in Section S3, and computation time required for perturbation simulations in the present work is discussed in Section S4. The sets of PDB structures and parameters of the binding sites for the four test targets are given in Tables S1-S4 and illustrated in Figs. S3-S4. A comparison of protein flexibility generated by perturbation of different residues in HSP90 and Src is shown in Figs. S5 and Fig. S7, respectively. The variation in RMSD of HSP90 along RIP and L-RIP trajectories is shown in Fig.S6. Fig. S8 shows the RMSD of Src kinase as observed in three different L-RIP trajectories upon perturbation of F405. The RMSF of -helix3 and the ATP lid observed in L-RIP trajectories obtained by perturbation of different residues from this region is compared with that of three representative structures of bacterial and yeast HSP90 in Fig.S9. The per-residue RMSD of the L-RIP conformation in HSP90 as observed before and after MD equilibration is plotted in Fig.S10 and the -helix3 RMSD before and after equilibration is given in Table S5. An evaluation of the Src kinase DFG loop conformations along L-RIP trajectories is given in Figs. S11, S12. The opening of an adjacent transient pocket in PR is

ACS Paragon Plus Environment

41

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 42 of 48

illustrated in Fig. S13. This material is available free of charge via the Internet at http://pubs.acs.org.

Availability A modified RIP package including L-RIP and RIPlig functionalities is available in a command line python application that uses the NAMD package for performing MD simulations. The software is licensed under the GNU General Public License v2.0 and is freely available at http://www.mcm.h-its.org/lrip-riplig/. L-RIP and RIPlig simulations are also implemented in a TRAPP (Transient Pockets in Proteins) web server http://www.mcm.h-its.org/trapp2.

Corresponding Author [email protected], [email protected].

Author Contributions Conceived and designed experiments: DBK, RCW; Performed the experiments: DBK; Analysed and interpreted the data: DBK, FR, PG, RCW; Wrote the paper: DBK, RCW. All authors have given approval to the final version of the manuscript.

Funding Sources

ACS Paragon Plus Environment

42

Page 43 of 48

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

Klaus Tschira Foundation, the German Federal Ministry for Education and Research (BMBF) Biotech Cluster Rhein-Neckar (BioRN) (Project INE-TP03), and the EU/EFPIA Innovative Medicines Initiative (IMI) Joint Undertaking, K4DD (grant no. 115366).

Financial Disclosure We gratefully acknowledge the support of the Klaus Tschira Foundation, the German Federal Ministry for Education and Research (BMBF) Biotech Cluster Rhein-Neckar (BioRN) (Project INE-TP03), and the EU/EFPIA Innovative Medicines Initiative (IMI) Joint Undertaking, K4DD (grant no. 115366). This work reflects only the authors’ views and neither the BMBF, the IMI nor the European Commission is liable for any use that may be made of the information contained herein. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Acknowledgment We thank Dr. Stefan Richter for his assistance with the webserver implementation and Dr. Joanna Panecka for careful reading of the manuscript and critical comments. We also thank Dr. Bosco K. Ho for making the RIP software freely available. We gratefully acknowledge the support of the Klaus Tschira Foundation, the German Federal Ministry for Education and Research (BMBF) Biotech Cluster Rhein-Neckar (BioRN) (Project INE-TP03), and the EU/EFPIA Innovative Medicines Initiative (IMI) Joint Undertaking, K4DD (grant no. 115366).

ACS Paragon Plus Environment

43

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 44 of 48

Abbreviations HSP90, human heat-shock protein 90; Src, cellular and sarcoma tyrosine protein kinase; IL-2, interleukin-2; PG, progesterone receptor ligand binding domain; RIP, Rotamerically Induced Perturbation; L-RIP, Langevin Rotametically Induced Perturbation; RIPlig, Rotamerically Induced Perturbation of a ligand.

References (1)

Frembgen-Kesner, T.; Elcock, A. H. Computational Sampling of a Cryptic Drug Binding Site in a Protein Receptor: Explicit Solvent Molecular Dynamics and Inhibitor Docking to p38 MAP Kinase. J. Mol. Biol. 2006, 359 (1), 202–214.

(2)

Eyrisch, S.; Helms, V. Transient Pockets on Protein Surfaces Involved in Protein-Protein Interaction. J. Med. Chem. 2007, 50 (15), 3457–3464.

(3)

Dror, R. O.; Pan, A. C.; Arlow, D. H.; Borhani, D. W.; Maragakis, P.; Shan, Y.; Xu, H.; Shaw, D. E. Pathway and Mechanism of Drug Binding to G-Protein-Coupled Receptors. Proc. Natl. Acad. Sci. U. S. A. 2011, 108 (32), 13118–13123.

(4)

Kubitzki, M. B.; de Groot, B. L. Molecular Dynamics Simulations Using TemperatureEnhanced Essential Dynamics Replica Exchange. Biophys. J. 2007, 92 (12), 4262–4270.

(5)

Voter, A. Parallel Replica Method for Dynamics of Infrequent Events. Phys. Rev. B 1998, 57 (22), R13985–R13988.

(6)

So̸rensen, M. R.; Voter, A. F. Temperature-Accelerated Dynamics for Simulation of Infrequent Events. J. Chem. Phys. 2000, 112 (21), 9599–9606.

ACS Paragon Plus Environment

44

Page 45 of 48

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

(7)

Voter, A. Hyperdynamics: Accelerated Molecular Dynamics of Infrequent Events. Phys. Rev. Lett. 1997, 78 (20), 3908–3911.

(8)

Hamelberg, D.; Mongan, J.; McCammon, J. A. Accelerated Molecular Dynamics: A Promising and Efficient Simulation Method for Biomolecules. J. Chem. Phys. 2004, 120 (24), 11919–11929.

(9)

Ho, B. K.; Agard, D. a. Probing the Flexibility of Large Conformational Changes in Protein Structures through Local Perturbations. PLoS Comput. Biol. 2009, 5 (4), e1000343.

(10)

Pettersen, E. F.; Goddard, T. D.; Huang, C. C.; Couch, G. S.; Greenblatt, D. M.; Meng, E. C.; Ferrin, T. E. UCSF Chimera--a Visualization System for Exploratory Research and Analysis. J. Comput. Chem. 2004, 25 (13), 1605–1612.

(11)

Krukenberg, K. A.; Street, T. O.; Lavery, L.; Agard, D. Conformational Dynamics of the Molecular Chaperone Hsp90. Q. Rev. Biophys. 2011, 44 (2), 229–255.

(12)

Filomia, F.; De Rienzo, F.; Menziani, M. C. Insights into MAPK p38alpha DFG Flip Mechanism by Accelerated Molecular Dynamics. Bioorg. Med. Chem. 2010, 18 (18), 6805– 6812.

(13)

Vashisth, H.; Maragliano, L.; Abrams, C. F. “DFG-Flip” in the Insulin Receptor Kinase Is Facilitated by a Helical Intermediate State of the Activation Loop. Biophys. J. 2012, 102 (8), 1979–1987.

(14)

Shan, Y.; Arkhipov, A.; Kim, E. T.; Pan, A. C.; Shaw, D. E. Transitions to Catalytically Inactive Conformations in EGFR Kinase. Proc. Natl. Acad. Sci. U. S. A. 2013, 110 (18), 7270–7275.

ACS Paragon Plus Environment

45

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

(15)

Page 46 of 48

Hyde, J.; Braisted, A. C.; Randal, M.; Arkin, M. R. Discovery and Characterization of Cooperative Ligand Binding in the Adaptive Region of Interleukin-2. Biochemistry 2003, 42 (21), 6475–6483.

(16)

Zhang, Z.; Olland, A. M.; Zhu, Y.; Cohen, J.; Berrodin, T.; Chippari, S.; Appavu, C.; Li, S.; Wilhem, J.; Chopra, R.; Fensome, A.; Zhang, P.; Wrobel, J.; Unwalla, R. J.; Lyttle, C. R.; Winneker, R. C. Molecular and Pharmacological Properties of a Potent and Selective Novel Nonsteroidal Progesterone Receptor Agonist Tanaproget. J. Biol. Chem. 2005, 280 (31), 28468–28475.

(17)

2013, S. R. Schrödinger Release 2013-1: Maestro, Version 9.4, Schrödinger, LLC, New York, NY, 2013. 2013.

(18)

Phillips, J. C.; Braun, R.; Wang, W.; Gumbart, J.; Tajkhorshid, E.; Villa, E.; Chipot, C.; Skeel, R. D.; Kalé, L.; Schulten, K. Scalable Molecular Dynamics with NAMD. J. Comput. Chem. 2005, 26 (16), 1781–1802.

(19)

MacKerell, A.; Bashford, D. All-Atom Empirical Potential for Molecular Modeling and Dynamics Studies of Proteins. J. Phys. Chem. B 1998, 102, 3586–3616.

(20)

Mackerell, A. D.; Feig, M.; Brooks, C. L. Extending the Treatment of Backbone Energetics in Protein Force Fields: Limitations of Gas-Phase Quantum Mechanics in Reproducing Protein Conformational Distributions in Molecular Dynamics Simulations. J. Comput. Chem. 2004, 25 (11), 1400–1415.

(21)

Humphrey, W.; Dalke, a; Schulten, K. VMD: Visual Molecular Dynamics. J. Mol. Graph. 1996, 14 (1), 33–38, 27–28.

ACS Paragon Plus Environment

46

Page 47 of 48

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

(22)

Ryckaert, J.-P.; Ciccotti, G.; Berendsen, H. J. . Numerical Integration of the Cartesian Equations of Motion of a System with Constraints: Molecular Dynamics of N-Alkanes. J. Comput. Phys. 1977, 23 (3), 327–341.

(23)

Case, D. a; Cheatham, T. E.; Darden, T.; Gohlke, H.; Luo, R.; Merz, K. M.; Onufriev, A.; Simmerling, C.; Wang, B.; Woods, R. J. The Amber Biomolecular Simulation Programs. J. Comput. Chem. 2005, 26 (16), 1668–1688.

(24)

Tanner, D. E.; Chan, K.-Y.; Phillips, J. C.; Schulten, K. Parallel Generalized Born Implicit Solvent Calculations with NAMD. J. Chem. Theory Comput. 2011, 7 (11), 3635–3642.

(25)

Kokh, D. B.; Richter, S.; Henrich, S.; Czodrowski, P.; Rippmann, F.; Wade, R. C. TRAPP: A Tool for Analysis of Transient Binding Pockets in Proteins. J. Chem. Inf. Model. 2013, 53 (5), 1235–1252.

(26)

Patapati, K. K.; Glykos, N. M. Three Force Fields’ Views of the 3(10) Helix. Biophys. J. 2011, 101 (7), 1766–1771.

(27)

Richter, K.; Soroka, J.; Skalniak, L.; Leskovar, A.; Hessling, M.; Reinstein, J.; Buchner, J. Conserved Conformational Changes in the ATPase Cycle of Human Hsp90. J. Biol. Chem. 2008, 283 (26), 17757–17765.

(28)

Romo, T. D.; Grossfield, A. Unknown Unknowns: The Challenge of Systematic and Statistical Error in Molecular Dynamics Simulations. Biophys. J. 2014, 106 (8), 1553–1554.

ACS Paragon Plus Environment

47

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 48 of 48

Table of Contents Graphic

ACS Paragon Plus Environment

48