
Article

Molecular Dynamics Simulations and Structural Network Analysis of c-Abl and c-Src Kinase Core Proteins: Capturing Allosteric Mechanisms and Communication Pathways from Residue Centrality

Amanda Tse and Gennady M. Verkhivker

J. Chem. Inf. Model., Just Accepted Manuscript • DOI: 10.1021/acs.jcim.5b00240 • Publication Date (Web): 03 Aug 2015


Journal of Chemical Information and Modeling is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.


Amanda Tse1 and Gennady M. Verkhivker1,2

1 Graduate Program in Computational and Data Sciences, Department of Computational Sciences, Schmid College of Science and Technology, Chapman University, One University Drive, Orange, CA 92866, USA

2 Chapman University School of Pharmacy, Irvine, CA 92618, USA

‡ Corresponding author. E-mail: [email protected]


Abstract

The Abl and Src tyrosine kinases play a fundamental regulatory role in orchestrating functional processes in cellular networks and represent an important class of therapeutic targets. Crystallographic studies of these kinases have revealed a similar structural organization of multidomain complexes that confers salient features of their regulatory mechanisms. Molecular characterization of the interaction networks and regulatory residues by which the SH3 and SH2 domains act cooperatively with the catalytic domain to suppress or promote kinase activation remains an active area of structural, biochemical and computational investigation. In this work, we combine biophysical simulations with computational modeling of the residue interaction networks to characterize allosteric mechanisms of kinase regulation and gain insight into the differential sensitivity of c-Abl and c-Src kinases to specific drug binding. Using these approaches, we examine the dynamics of cooperative rearrangements in the residue interaction networks and elucidate the structural role of regulatory residues responsible for modulation of kinase activity. We have found that global network parameters such as residue centrality can unambiguously distinguish functional sites that are responsible for mediating allosteric interactions in the regulatory assemblies. This study has revealed mechanistic aspects of allosteric regulation and the communication pathways by which the SH3 and SH2 domains may exert their regulatory influence on the catalytic domain and kinase activity. We have also found that high centrality residues can be linked to each other to form efficient and robust routes that transmit allosteric signals between spatially separated regulatory regions. The presented results demonstrate that global features of the residue interaction networks may serve as transparent and robust indicators of kinase regulatory mechanisms and accurately pinpoint key functional residues.


Introduction

Protein kinase genes are signaling switches that direct functional processes and protein communications in cellular networks and signal transduction pathways1,2. The human protein kinases, which represent one of the largest protein families, are regulated by different mechanisms including phosphorylation of the activation loops, autoinhibition, and allosteric activation by protein binding partners that enable the kinase domain (KD) to adopt a catalytically competent conformation and attain activity3-11. The Abelson (Abl) family of non-receptor protein tyrosine kinases consists of two members, c-Abl and the Abl-related gene (Arg), that are involved in interactions with a multitude of cellular proteins, linking extracellular stimuli to signaling pathways that control cell growth, survival, invasion, adhesion and migration12-14. Each Abl protein contains an SH3-SH2-KD (Src homology 3–Src homology 2–kinase domain) domain cassette, which confers autoregulated kinase activity and is common among non-receptor tyrosine kinases12-14. The Src family of cytoplasmic non-receptor protein tyrosine kinases (c-Src, Lck, Fgr, Fyn, c-Yes, Blk, Hck, and Lyn kinases) can integrate diverse cellular signals and are involved in mediating Abl kinase activity and signal transduction15-17.

Structural and biochemical studies have revealed autoinhibitory mechanisms that can modulate and constrain catalytic activity and substrate specificity of the c-Abl and c-Src kinases. These kinases share the same SH3-SH2-KD modular organization that confers salient features of their regulatory mechanisms. Crystallographic studies have provided a molecular framework of activation mechanisms by detailing a common structural organization of the regulatory complexes for the c-Src18-20 and c-Abl kinases21-24. In the autoinhibitory complexes, the SH3 domain binds to the linker that connects the SH2 domain and the KD, while the SH2 domain interacts with the C-terminal lobe of the KD, playing a negative regulatory role as an intramolecular autoinhibitory “clamp” that maintains the KD in a conformation with low catalytic activity (Figure 1). The disengagement of the SH3-SH2 domains relieves the autoinhibitory constraints in c-Src and c-Abl kinases and yields an activated form. Small-angle X-ray scattering (SAXS) reconstruction of an activated form of c-Abl protein in solution has confirmed that upon kinase activation the SH2 and SH3 domains switch from a compact assembly to an elongated arrangement. The active c-Abl structure yielded a fully extended conformation with the KD, SH2, and SH3 domains in a linear arrangement, yet the position of the SH3 domain could not be determined22.

Recent structural and biochemical investigations of c-Abl kinase25-27 have reported molecular details of the regulatory interactions by which the SH3 and SH2 domains can suppress or promote kinase activation. The SH3 domain of c-Abl plays a major role in stabilizing the autoinhibitory state (Figure 1A) by interacting with the N-terminal lobe of the KD and the SH2-kinase linker. Biophysical studies have demonstrated that the SH3 domain can sustain its interactions with the Abl linker even in the absence of the KD28,29, indicating that the robustness of these autoinhibitory interactions may be dictated by their central role in kinase regulation. Furthermore, the enhanced SH3-linker interactions induced by mutations of the linker prolines could allosterically strengthen the autoinhibitory constraints and suppress the kinase activity, thereby improving c-Abl sensitivity to small-molecule inhibitors30. Conversely, mutations or deletions in the SH3 domain and the SH2-kinase linker can trigger the release of autoinhibitory constraints, leading to the enhanced activation potential of c-Abl. The crystal structure of a downregulated c-Abl core protein comprises a myristoylated (Myr) N-terminal “cap” (NCap) followed by the SH3, SH2, and kinase domains (Figure 1A). In this structure, the NCap fragment can form direct interactions with the SH2 domain and the SH3–SH2 connector, strengthening the autoinhibitory “grip” over the catalytic core31.


Figure 1. Structural organization of the c-Abl and c-Src regulatory complexes. The common domain organization of the c-Abl and c-Src kinases is schematically shown at the top. The core domains in c-Abl are NCap: N-terminal region, SH3: Src homology domain 3, SH2: Src homology domain 2, SH2-kinase linker (SH2 linker), and kinase domain (KD). (A) The crystal structure of the downregulated c-Abl complex (PDB ID 2FO0) is rendered in ribbons and colored by domain: NCap (blue), SH3 (red), SH2 (green), SH2 linker (magenta) and KD (pink). (B) The crystal structure of the autoinhibitory c-Src complex (PDB ID 2SRC) is also shown in ribbons and colored by domain: SH3 (red), SH2 (green), SH2 linker (magenta) and KD (pink). The functional regulatory residues and tyrosine phosphorylation sites are annotated and shown in colored bold sticks.


The disruption of key autoinhibitory interactions by phosphorylation of tyrosine residues is crucial for positive regulation of Abl activity32-34. Biochemical studies have demonstrated that phosphorylation of the SH3 residues Y89 and Y134 can interfere with the SH3-linker binding (Figure 1A), thereby disrupting key negative regulatory interactions and leading to kinase activation34,35.

Phosphorylation of Y245 in the linker can also disrupt the SH3-linker interactions and promote Abl kinase activity35,36. At the same time, a mutation of Y245 that prevents its phosphorylation can reduce maximal activation of c-Abl by ~50% in vitro33. Similarly, mutations of Y158 in the SH2-KD interface have been shown to increase kinase activity13, suggesting that stabilizing SH2 interactions with the C-lobe of the catalytic core may be required for downregulation of kinase activity. The Src family kinases such as Hck, Lyn and Fyn can phosphorylate Y89 in the SH3 domain of Abl (Figure 1B) and exert control over Abl kinase activity by perturbing the autoinhibitory SH3-linker interactions36,37. These studies have shown that phosphorylation and single mutations of functional residues can perturb networks of autoinhibitory interactions and induce large conformational rearrangements between the inactive and active kinase states.

Crystallographic studies of the isolated catalytic domains have revealed that the activity of c-Abl and c-Src kinases may also be supported through conformational changes in the key functional regions of the catalytic core: the glycine-rich P-loop, the Asp-Phe-Gly (DFG) motif, the regulatory αC-helix, and the activation loop (A-loop)38-41. A particularly important control element of kinase regulation is the A-loop, which can leverage both conformational and phosphorylation preferences to modulate kinase activity and determine substrate access. The C-lobe of the catalytic core contributes to the active site through the A-loop with a single tyrosine site (Y412) that undergoes autophosphorylation by other tyrosine kinases25,26. Structural and functional analyses have indicated that the isolated Abl KD has an intrinsically closed DFG-out conformation, where Y412 acts as a pseudo-substrate not accessible for phosphorylation. Phosphorylation of Y412 in the A-loop can stabilize an active kinase conformation and is coupled with the increased catalytic activity of c-Abl33,42. Biochemical studies have also discovered that an intramolecular SH2-KD interaction in Abl can be both necessary and sufficient for high catalytic activity of the enzyme, confirming a positive regulatory role of this interaction in kinase function43. A recent “tour de force” investigation of Abl regulation44 has revealed that the SH2-KD interactions can allosterically induce kinase activation by converting the KD from an intrinsically inactive conformation to an active form, which is then stabilized via phosphorylation of Y412 in the A-loop. The most recent work from the Kuriyan lab has presented a new crystal structure of an SH2-KD construct, revealing that the SH2-KD interactions in the active complex can promote processive phosphorylation and moderately enhance kinase activity45.

NMR and SAXS studies of the modified c-Abl core protein lacking the myristoylated NCap have shown that the apo c-Abl form can adopt a ‘closed’ conformation, which is similar to the downregulated Abl core crystal structure46. Surprisingly, the presence of Imatinib could lead to an ‘open’ conformation, where the SH3 and SH2 domains are displaced from the back of the KD. Remarkably, the combination of Imatinib with the allosteric inhibitor GNF-5 can restore the closed, inactivated state. These structural studies of c-Abl complexes with inhibitors have also established that the SH2 and SH3 domains can adopt a range of different positions with respect to the KD, confirming that the SH3-linker interactions may be sufficient to maintain c-Abl in a downregulated conformation46.

The discovery of Imatinib, a highly selective inhibitor that targets the inactive, downregulated form of c-Abl, has marked a historical breakthrough in the development of the tyrosine kinase inhibitors combating chronic myeloid leukemia (CML)38,39. A large number of Imatinib-resistant Abl mutants emerging at the advanced disease stages are associated not only with the binding site and catalytic domain residues47, but could also arise in the SH3 and SH2 domains, the SH2 linker and the SH3-SH2 connectors48. While Imatinib is a potent inhibitor of c-Abl, it does not inhibit c-Src even though Imatinib has been crystallized with c-Abl and c-Src in virtually identical inactive conformations49. The high selectivity of Imatinib for c-Abl over the closely related c-Src kinase was proposed to result from the higher energetic price for adopting the Imatinib-bound conformation by c-Src. However, this mechanism was challenged in a subsequent study, where a series of inhibitors derived from the Imatinib scaffold appeared to bind with equally high potency to both c-Src and c-Abl kinase domains, while adopting a similar inactive DFG-out conformation in the crystal structures50. Computational studies have characterized free energy landscapes in c-Abl and c-Src kinases and showed that the stability difference between the DFG-out and DFG-in conformations of c-Src may be larger than in c-Abl, and that conformational selection may underlie the mechanism of Imatinib specificity51,52. Molecular dynamics (MD) simulations and free energy calculations combined with isothermal titration calorimetry have quantified the energetics of conformational transitions in c-Abl and c-Src kinases53, confirming that a more favorable kinetic accessibility and thermodynamic stability of the DFG-out conformation in c-Abl may determine Imatinib selectivity. Subsequent investigations using absolute free energy computations have reconciled computational and experimental data by showing that Imatinib binding specificity may be controlled by a cumulative effect of conformational selection, favoring stabilization of the inactive c-Abl conformation, and broadly distributed variations in the inhibitor-kinase interactions54. MD simulations and free energy simulations of Imatinib binding to inactive structures of c-Abl, c-Kit, Lck, and c-Src kinase domains indicated a crucial role of van der Waals dispersive interactions in determining binding specificity with c-Abl55,56. Recently, a systematic analysis of inhibitors that stabilize an Imatinib-like inactive conformation has revealed that ligands that are equipotent against Abl and Src showed a small difference in inhibition between unphosphorylated and phosphorylated A-loop forms, while inhibitors that are selective for c-Abl over c-Src were sensitive to the phosphorylation state of the A-loop in c-Abl57.

Hence, the high selectivity and strong sensitivity of Imatinib towards the unphosphorylated form of c-Abl are not a universal characteristic of kinase inhibitors that stabilize the DFG-out conformation. It was suggested that allosteric coupling between the P-loop and A-loop regions could modulate differences in thermodynamic stability of c-Abl and c-Src kinase conformations, thus fine-tuning binding preferences of specific inhibitors. Steady-state fluorescence kinetics and NMR spectroscopy have been used to study directly the process of Imatinib binding to the catalytic domain of Abl and Src with millisecond time resolution58, suggesting that the energy landscape of Abl kinase may combine conformational selection with an induced fit to enable specific Imatinib binding. These experiments have also revealed that Imatinib binding may cause chemical-shift perturbations distributed over a large fraction of the protein structure, pointing to a global conformational change that was also seen in the NMR study of full-length Abl46.

Computational approaches have been used to study the atomic details of protein kinase dynamics and regulation at different levels of complexity, from detailed analyses of the catalytic domain and drug resistance51-59 to simulations of the regulatory assemblies60-64. In our previous studies, we analyzed mechanisms of allosteric kinase regulation by integrating multiscale simulations and modeling of long-range communications60,61. A multi-disciplinary approach combining simulations, functional assays and mutagenesis has characterized the inter-domain coupling in the active Abl complex, suggesting that the SH2-KD interactions can allosterically stabilize the catalytically competent position of the αC-helix and thus exert control over kinase activity62. Microsecond all-atom simulations and differential scanning calorimetry have been used to investigate the dynamics of the SH3-SH2 tandem, which operates as a two-state switch, alternating between conformations observed in the autoinhibited and active complexes63.

The biophysical, biochemical and computational studies of Abl and Src regulation have indicated a complex interplay between the SH3 and SH2 domains, the SH2 linker and the catalytic domain. Although substantial progress has been made in understanding the structural basis of kinase regulation and drug sensitivity, the molecular details underlying the inter-domain allosteric communications in the c-Abl and c-Src complexes are yet to be fully understood. The functional significance of conformational states can be described through global rearrangements in the residue interaction networks that contribute to large conformational changes.

These networks are often controlled by critical functional residues that determine relative conformational populations and allosteric communications in the inactive and active kinase states64,65. A graph-based representation of protein structures yields a convenient description of residue interaction networks66-69, providing a robust framework for understanding allosteric communications in protein systems. Topology-based network parameters describing node centrality (degree, closeness, and betweenness) have been exploited to predict protein-protein interactions70,71, protein-DNA interfaces72, ligand binding sites73,74, and catalytic residues in enzymes75. These studies have linked the organization of protein structure networks with structural stability and high connectivity of functional residues, particularly indicating that residues involved in short path length communications could mediate signaling76. Graph-based protein networks that incorporated topology-based residue connectivity and contact maps of residue cross-correlations obtained from MD simulations77,78 have provided important insights into structural mechanisms underlying allosteric interactions and communication pathways in various protein systems79-85.

By using connectivity networks of interacting residues, computational approaches have characterized rigid and flexible regions in protein structures, explaining thermal stability and activity of various proteins86-88.

In this work, we present a multi-faceted network-based analysis integrated with biophysical simulations to model allosteric mechanisms of kinase regulation and characterize molecular interactions underlying the differential sensitivity of c-Abl and c-Src kinases to specific drug binding. MD simulations of the catalytic core and regulatory complexes in c-Abl and c-Src kinases are combined with structure-based network modeling to determine the organization of the residue interaction networks and allosteric communication pathways across all functional states. By using global network parameters such as residue centrality, we determine cooperative rearrangements in the residue interaction networks and identify global mediating sites that coordinate kinase activity. This study reveals that residue-based centrality can unambiguously distinguish functional sites responsible for regulatory interactions in the autoinhibitory and active complexes. Modeling of the residue interaction networks is leveraged in the reconstruction of communication pathways in the kinase assemblies, providing a plausible view of the allosteric mechanisms by which the SH3 and SH2 domains may exert their regulatory influence on kinase activity. The structure-based network approach provides a simple and transparent view of structural stability and function, showing how the efficiency and robustness of the allosteric interaction networks in c-Abl and c-Src kinases may be linked with their binding preferences and selectivity profiles.


Materials and Methods

MD Simulations and Analysis of Collective Motions

MD simulations of the c-Abl and c-Src kinase crystal structures (500 ns for each structure) were performed for different forms of the catalytic domain and for the regulatory complexes. The crystal structures of the c-Abl and c-Src kinases were obtained from the Protein Data Bank89. The simulated KD crystal structures included the inactive conformations (PDB ID 1IEP and 1OPJ for c-Abl and 2OIQ for c-Src), the Cdk/Src-like inactive forms (PDB ID 2G1T for c-Abl and 2SRC for c-Src), and the active conformations (PDB ID 2GQG for c-Abl and 3G5D for c-Src). The simulated crystal structures of the regulatory complexes included the autoinhibitory complexes (PDB ID 2FO0 for c-Abl and 2SRC for c-Src) and the active complexes (PDB ID 1OPL, chain B for c-Abl and 1Y57 for c-Src). The retrieved structures were examined for missing and disordered segments. The missing residues, unresolved structural segments and disordered loops were modeled with the ArchPRED server90. MD simulations were carried out using the NAMD 2.6 package91 with the CHARMM27 force field92 and the explicit TIP3P water model. The employed MD protocol is consistent with the overall setup that was described in detail in our earlier studies93. An NPT production simulation was run on the equilibrated structures for 500 ns, keeping the temperature at 300 K and the pressure constant (1 atm) using the Langevin piston coupling algorithm. Principal component analysis (PCA) of the MD conformational ensembles was performed using the CARMA package94. The frames were saved every 5 ps, and a total of 10,000 frames were used to compute the correlation matrices for each simulation.
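As an illustration of how a residue cross-correlation matrix can be obtained from saved trajectory frames, the following is a minimal NumPy sketch. It is not the CARMA implementation used in this work; the function name `cross_correlation_matrix` and the input layout (one representative atom per residue, frames already superimposed on a reference) are assumptions made for the example.

```python
import numpy as np


def cross_correlation_matrix(coords):
    """Normalized cross-correlation of residue fluctuations.

    coords: array of shape (n_frames, n_residues, 3) holding one
    representative atom (e.g. C-alpha) per residue. The frames are
    assumed to be superimposed on a common reference beforehand.
    """
    coords = np.asarray(coords, dtype=float)
    # Displacement of each residue from its time-averaged position.
    disp = coords - coords.mean(axis=0)
    # Covariance <dr_i . dr_j>, averaged over all frames.
    cov = np.einsum("fid,fjd->ij", disp, disp) / coords.shape[0]
    # Normalize by the root-mean-square fluctuation of each residue.
    rms = np.sqrt(np.diag(cov))
    return cov / np.outer(rms, rms)


if __name__ == "__main__":
    # Synthetic stand-in for a trajectory: 1000 frames, 5 residues.
    rng = np.random.default_rng(42)
    trajectory = rng.normal(size=(1000, 5, 3))
    corr = cross_correlation_matrix(trajectory)
    print(corr.shape)
```

Each element of the returned matrix lies between -1 (fully anti-correlated motion) and 1 (fully correlated motion), with ones on the diagonal.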


Local Structural Parameters: Relative Solvent Accessibility and Residue Depth

We have computed the relative solvent accessibility (RSA) parameter, defined as the ratio of the absolute solvent accessible surface area (SASA) of a residue observed in a given structure to the maximum attainable value of the solvent-exposed surface area for this residue95. According to this model, residues are considered to be solvent exposed if the ratio exceeds 50% and buried if the ratio is less than 20%. SASA was estimated using analytical equations and their first and second derivatives, as implemented in the GetArea web server95. Residue depth measures the closest distance of the residue to bulk solvent96. In the first step, the protein molecule is solvated, and water molecules that clash with atoms of the protein are removed from the box. Solvent dynamics is mimicked by repeated solvation, each time in a different orientation. At each iteration, the residue depth is computed as the distance from a residue to the closest molecule of bulk water. The reported parameter is the average depth over all solvation iterations. The total number of applied solvation cycles was 100. Both parameters are known to correlate with the effects of mutations on protein stability and protein interactions.
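The two local parameters reduce to simple arithmetic once the SASA values and per-cycle depths are available. The sketch below illustrates the 50% and 20% thresholds and the averaging over solvation cycles. The function names are hypothetical, the "intermediate" label for the 20-50% range is our own placeholder, and the numbers in the usage lines are made-up inputs, not GetArea output.

```python
def relative_solvent_accessibility(sasa, max_sasa):
    """RSA in percent: observed SASA over the maximum attainable SASA."""
    if max_sasa <= 0:
        raise ValueError("max_sasa must be positive")
    return 100.0 * sasa / max_sasa


def classify_exposure(rsa_percent):
    """Apply the thresholds from the text: >50% exposed, <20% buried."""
    if rsa_percent > 50.0:
        return "exposed"
    if rsa_percent < 20.0:
        return "buried"
    # The text does not name the 20-50% range; this label is an assumption.
    return "intermediate"


def mean_residue_depth(depths_per_cycle):
    """Average depth of one residue over all solvation cycles."""
    if not depths_per_cycle:
        raise ValueError("at least one solvation cycle is required")
    return sum(depths_per_cycle) / len(depths_per_cycle)


if __name__ == "__main__":
    # Illustrative values only (square angstroms and angstroms).
    rsa = relative_solvent_accessibility(sasa=30.0, max_sasa=200.0)
    print(rsa, classify_exposure(rsa))
    print(mean_residue_depth([4.1, 4.4, 3.9]))
```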

Protein Structure Network Construction

In the protein structure network analysis, a graph-based representation of proteins was used in which amino acid residues were considered as nodes connected by edges corresponding to the nonbonding residue-residue interactions. The details of the construction of such a graph at a particular interaction cut-off ($I_{min}$) have been extensively discussed77,78. Here, we describe the main steps in the construction of protein structure networks adopted in our study. The interactions between side chain atoms of amino acid residues (nodes) define the edges of the protein structure network and are evaluated from the normalized number of contacts between nodes. The noncovalent interactions between sequence neighbors are ignored in the graph construction. The interaction between two residues $i$ and $j$ is measured as

$$I_{ij} = \frac{n_{ij}}{\sqrt{N_i \times N_j}} \times 100 \tag{1}$$

In the original formulation of the graph construction procedure77,78, the interaction parameter was also defined as a percentage given by:

$$I_{ij} = \frac{n_{ij}}{\sqrt{N_i \times N_j}} \times 100 \tag{2}$$

where $n_{ij}$ is the number of distinct atom pairs between the side chains of amino acid residues $i$ and $j$ that lie within a distance of 4.5 Å, and $N_i$ and $N_j$ are the normalization factors for residues $i$ and $j$, respectively. We have determined the normalization factors $N_i$ for all 20 residue types as described in previous studies77. The number of interaction pairs, including main-chain and side-chain, made by residue type $i$ with all its surrounding residues is also evaluated. The normalization factors take into account the differences in the sizes of the side chains of the different residue types and their propensity to make the maximum number of contacts with other amino acid residues in protein structures. Pairs of residues with an interaction $I_{ij}$ greater than a user-defined cut-off ($I_{min}$) are connected by edges and produce a protein structure network graph for a given interaction cut-off $I_{min}$. According to the analysis of a large number of protein structures, $I_{min}$ values could vary from 1% to 15%, where the lower the $I_{min}$, the higher the graph connectivity. The optimal interaction cut-off, which can produce adequate graph representations for a wide range of protein structures, was determined as the transition point for the largest connected cluster77. According to this definition, the $I_{min}$ value often lies in the range 2-4% for a diverse spectrum of protein systems and molecular complexes77-82. A similar analysis was conducted in our study. In the graph-based analysis of the protein kinase structures performed in the present study, at $I_{min}$ = 1% all residue nodes are connected by edges, while at $I_{min}$ = 10% there are typically very few residue nodes connected by non-covalent edges (interactions). We found that the appropriate transition value for the cut-off was $I_{min}$ = 2.5-3%. Hence, in the present study, any pair of residues is connected in the protein structure graph if their interaction $I_{ij}$ exceeds $I_{min}$ = 3.0%.
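The construction steps described above can be sketched with NetworkX as follows. This is a simplified illustration rather than the code used in this study: the function names, the input dictionaries, and the toy numbers are hypothetical. The sketch assumes that the interaction strength is normalized by the square root of the product of the two normalization factors, as in the cited original formulation, and that "sequence neighbors" means residues adjacent in sequence.

```python
import math

import networkx as nx


def interaction_strength(n_ij, norm_i, norm_j):
    """Percentage interaction strength between two residues (assumed
    square-root normalization, see the lead-in)."""
    return 100.0 * n_ij / math.sqrt(norm_i * norm_j)


def build_residue_graph(contacts, norm, i_min=3.0, min_seq_sep=2):
    """Build an unweighted residue interaction graph.

    contacts: {(i, j): n_ij}, number of side-chain atom pairs within 4.5 A
    norm:     {i: N_i}, normalization factor of each residue
    i_min:    interaction cut-off in percent
    """
    graph = nx.Graph()
    graph.add_nodes_from(norm)
    for (i, j), n_ij in contacts.items():
        if abs(i - j) < min_seq_sep:
            continue  # ignore interactions between sequence neighbors
        strength = interaction_strength(n_ij, norm[i], norm[j])
        if strength > i_min:
            graph.add_edge(i, j, strength=strength)
    return graph


def largest_cluster_size(graph):
    """Number of residues in the largest connected cluster."""
    return max((len(c) for c in nx.connected_components(graph)), default=0)


def scan_cutoffs(contacts, norm, cutoffs):
    """Largest cluster size at each cut-off, used to locate the transition."""
    return {
        c: largest_cluster_size(build_residue_graph(contacts, norm, i_min=c))
        for c in cutoffs
    }


if __name__ == "__main__":
    # Toy input: five residues with made-up contact counts.
    norm = {1: 100.0, 2: 100.0, 3: 100.0, 4: 100.0, 5: 100.0}
    contacts = {(1, 2): 50, (1, 3): 5, (3, 5): 2, (2, 5): 3.5}
    print(scan_cutoffs(contacts, norm, cutoffs=[1.0, 3.0, 10.0]))
```

Scanning the cut-off and watching where the largest connected cluster collapses mirrors the transition-point criterion described in the text; with real data the scan would cover the 1-15% range in finer steps.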

Global Network Parameters

A weighted network representation of the protein structure is adopted that includes the non-covalent connectivity of side chains and the residue cross-correlation fluctuation matrix83. In this model of a protein network, the weight $w_{ij}$ of an edge between nodes $i$ and $j$ is determined by the dynamic information flow through that edge as measured by the correlation between the respective residues. The weight is defined as $w_{ij} = -\log(|C_{ij}|)$, where $C_{ij}$ is the element of the covariance matrix measuring the cross-correlation between fluctuations of residues $i$ and $j$ obtained from MD simulations. The shortest paths between two residues are determined using the Floyd–Warshall algorithm97, which compares all possible paths through the graph between each pair of residue nodes. Network calculations were performed using the Python module NetworkX (http://networkx.github.io/). To select the shortest paths that consist of dynamically correlated intermediate residues, we considered the short paths that included sufficiently correlated ($C_{ij}$ = 0.5–1.0) intermediate residues. Using the constructed protein structure networks, we computed the residue-based betweenness parameter. The betweenness of residue $i$ is defined as the sum of the fraction of shortest paths between all pairs of residues that pass through residue $i$:

N

Cb (ni ) = ∑ j