
Article

A General Mechanism for the Propagation of Mutational Effects in Proteins
Nandakumar Rajasekaran, Swaathiratna Suresh, Soundhararajan Gopi, Karthik Raman, and Athi N. Naganathan
Biochemistry, Just Accepted Manuscript • DOI: 10.1021/acs.biochem.6b00798 • Publication Date (Web): 13 Dec 2016
Downloaded from http://pubs.acs.org on December 17, 2016


Biochemistry is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.


Biochemistry

A General Mechanism for the Propagation of Mutational Effects in Proteins

Nandakumar Rajasekaran,1† Swaathiratna Suresh,2† Soundhararajan Gopi,1 Karthik Raman1 & Athi N. Naganathan*1

1 Department of Biotechnology, Bhupat & Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600036, India.
2 Center for Biotechnology, Anna University, Chennai 600025, India.

e-mail: [email protected]

ACS Paragon Plus Environment


ABBREVIATIONS

WT, wild-type; MD, molecular dynamics; CB, betweenness centrality; WSME, Wako-Saitô-Muñoz-Eaton; NMR, nuclear magnetic resonance; vdW, van der Waals; PTMs, post-translational modifications


ABSTRACT

Mutations in the hydrophobic interior of proteins are generally thought to weaken the interactions only in their immediate neighborhood. This forms the basis of protein-engineering-based studies of folding mechanism and function. However, mutational work on diverse proteins has shown that distant residues are thermodynamically coupled, with the network of interactions within the protein acting as signal conduits, thus raising an intriguing paradox. Are mutational effects localized and, if not, is there a general rule for the extent of percolation and the functional form of this propagation? We explore these questions from multiple perspectives in the current work. Perturbation analysis of interaction networks within proteins and microsecond-long molecular dynamics simulations of several aliphatic mutants of ubiquitin reveal strong evidence for distinct alteration of distal residue-residue communication networks. We find that mutational effects consistently propagate into the second shell of the altered site (even up to 15–20 Å) in proportion to the perturbation magnitude and dissipate exponentially with a decay distance-constant of ~4–5 Å. We also report evidence for this phenomenon from published experimental NMR data that strikingly resemble predictions from network theory and MD simulations. Reformulating these observations into a statistical mechanical model, we reproduce the stability changes of 375 mutations from 19 single-domain proteins. Our work thus reveals a robust energy dissipation-cum-signaling mechanism in the interaction network within proteins, quantifies the partitioning of destabilization energetics around the mutation neighborhood and presents a simple theoretical framework for modeling the allosteric effects of point mutations.


Introduction

The hydrophobic interior of proteins is highly organized, with the packing fraction approaching that of randomly packed spheres1 or even non-spherical particles.2 The well-ordered buried residues are generally thought of as a scaffold for the more mobile polar and charged residues located on the surface that determine the binding specificity and affinity to ligands and other proteins. However, a large body of recent work has revealed that the interior of proteins has a liquid-like character, displaying extensive conformational variability in side-chain packing despite the large packing fraction.3-5 The interactions mediated by backbone hydrogen bonds are dynamically connected and distinctly correlated to even distal sites in single-domain proteins.6, 7 The residue interaction networks are also influenced and modulated by ligand binding to various extents and form the basis of the frequently observed allosteric responses and long-range thermodynamic coupling.8-11 Remarkably, the effect of binding can propagate to distant sites even in the absence of a conformational change, as originally shown from theory12 and later by experiments.13, 14 These detailed studies reveal that the nature, strength and dynamics of interactions within the core of proteins have a definite functional consequence5, 15-20 and provide mechanistic insights into the long-range coupling implicit in the classic allosteric models.21, 22

Random mutations in DNA, and hence the protein, are the effectors that drive the mechanism of natural selection. Point mutations in proteins therefore play a pivotal role, not just in the acquisition of new function, but also in disease and drug resistance. At the molecular level, mutations can alter the intrinsic conformational propensity of amino acids (say a glycine substitution), the charged or polar status on the protein surface or the network of packing interactions within the protein interior. A large number of proteins and enzymes have been re-engineered to enhance their stability and activity through mutations involving charged residues on the protein surface.23-25 The underlying physico-chemical origins of these mutations are well understood and algorithms can routinely and accurately predict stability-enhancing mutations.25, 26

However, the mechanism of destabilization arising from mutations in the hydrophobic interior is less clear.27 A large body of work has attempted to address the question of the effects of mutations on protein structure and the physical principles that determine the loss in stability upon mutations involving side-chain truncations. Many different parameters including increase in cavity volume, changes in accessible surface area, residue depth, occluded surface, hydrophobic solvation, packing density and the number of neighboring methyl groups surrounding the mutated site have been invoked to explain the loss in stability observed in mutations involving aliphatics.27-34 While the above factors are physically reasonable and correlate well with the experimental changes in stability in each of those proteins, the underlying physical origin has generally been challenging to extricate, as correlation does not always point to causation. The situation is also complicated by the specifics of the protein structure and the nature of mutation.

Given that protein residues by themselves exhibit long-range correlations through a complex network of interactions, it is natural to expect that such mutations in the well-packed hydrophobic core will have as much influence, if not more, on the resulting dynamics, correlations and the allosteric output. In fact, even a single bond-flip in ubiquitin (NH-in and NH-out) has been shown to modulate the functional response at a distant active site.35 Apart from specific allosteric mutations in select proteins that result in a change in conformation or the oligomeric status,16 it is generally assumed that mutations affect only the immediate neighborhood of residues and that they do not contribute to structural changes distal from the altered site, i.e. there is no modulation of long-range allosteric coupling. This assumption has been the cornerstone of mutation-based exploration of protein folding,36, 37 binding and enzymatic mechanisms.38

However, there are several experimental observations that challenge this assumption while clearly pointing out that mutational effects are not localized even in the absence of a dramatic conformational change. An array of mutational studies on the PDZ domains and dihydrofolate reductase have revealed that protein residues are organized into co-evolving regions called ‘sectors’ that are thermodynamically coupled to the binding site pocket to varying levels.9, 39-42 Works on lysozyme and Sso7d reveal that mutations induce distinct but minor shifts in backbone positions throughout the protein or large-scale structural rearrangement in a context-dependent manner.30, 43 The L69S mutation in ubiquitin, the L99A mutation in T4 lysozyme and multiple cavity-creating mutations in staphylococcal nuclease lead to significant changes in the chemical shift pattern and hydrogen-exchange protection factors across nearly the entire structure.44-46 NMR studies probing methyl-bearing side-chains in protein L report changes in the dynamics of residues far from the mutated site F22L.47 It is interesting to note that similar changes have been reported even for surface modifications in ubiquitin48 and upon phosphorylation of a tyrosine (Y397) in the PDZ3 domain.49 These changes have been hypothesized to arise from weakening of interactions and an increase in folded-state entropy that in turn allows for the population of excited-state-like conformations with distinct functionalities. No major structural rearrangements have been noted in these cases, with the average structure of the modified protein being nearly identical to the WT.

The relevant question is then, do mutational effects, particularly of those residues buried within the protein interior, consistently propagate beyond the immediate mutational neighborhood? If yes, how far do they propagate, what is the functional form and what is the dependence on the perturbation magnitude? If observed, can these features be interwoven together to reproduce the changes in stability induced by mutations? We explore these questions in the current work at three levels (graph theory, all-atom MD simulations and a statistical mechanical model), together with a re-analysis of published experimental data. We find a universal behavior that impinges on the mechanism, magnitude and functional form of the allosteric phenomena mediated by point mutations involving side-chain truncations and the molecular origins of the associated protein destabilization.

Methods

Calculation of Betweenness Centrality

The contact maps of ubiquitin (PDB id: 1UBQ) and other proteins (SH3, 1SHG; BsCspB, 1CSP; bACBP, 2ABD) were generated with a heavy-atom cut-off of 6 Å. The interactions were weighted by a pair-wise interaction energy of -46 J mol-1 and any two residues were assumed to be interacting if the interaction energy was lower than four times this value (i.e. a minimum of four atom-atom interactions). This was then used to generate an interaction network and hence the degree distribution (number of neighbors). The results are insensitive to the magnitude of the pair-wise interaction energy or vdW interaction cutoff. In a graph G(V, E) comprising vertices V and edges E, the betweenness centrality (CB) of any node v is defined as the sum of the fractions of shortest paths between all pairs of nodes that pass through the particular node50:

$$C_B(v) = \sum_{s \neq v \neq t \in V} \frac{\sigma_{st}(v)}{\sigma_{st}}$$

where σst is the number of shortest paths from s to t, and σst(v) is the number of shortest paths from s to t that pass through a vertex v. Betweenness centrality was computed using the Boost Graph Library interface (http://dgleich.github.io/matlab-bgl/) implemented in MATLAB (Mathworks Inc., USA).

Perturbation of the Protein Residue Network

After generating the protein residue network, we proceeded to perturb every residue in the network by removing varying fractions of its interactions (edges connecting it to other residues in the network). For every residue, we created mutants, i.e. new networks with a specified fraction of edges removed. We mimicked the effects of small, medium and large mutations by removing 25%, 50% and 75% of the edges connected to a residue, respectively. We proceeded to compute betweenness centrality for every node in each of the new mutant networks, as described above.

All-Atom Explicit Solvent MD Simulations

All simulations were performed in GROMACS employing the AMBER99SB*-ILDN force field. WT ubiquitin and each of its seven mutants (I13V, V17A, I23A, I30A, L43A, L67A, L69A) were placed in a periodic dodecahedral box with a distance of at least 12 Å from the box edge, solvated with ~7200 TIP3P water molecules and finally supplemented with ions to maintain charge neutrality. The structures were energy minimized for 5000 steps using the steepest descent algorithm and equilibrated for 2 nanoseconds, and the resultant output structures were employed to run 1 microsecond of simulation (for each variant) with a 2 femtosecond time-step at 300 K. The frames were stored every 5 picoseconds, thus amounting to 200,001 frames for a single variant. A Langevin thermostat with a damping coefficient of 1/(1 picosecond) was employed for maintaining the temperature. Long-range electrostatics was calculated using the particle mesh Ewald (PME) procedure with a grid spacing of 1.2 Å and with a 10 Å cutoff for non-bonded interactions. The dynamic cross-correlation index (DCC) between pairs of residues was calculated from the covariance matrix. For ascertaining the strength of the packing interactions, a three-dimensional 200001 x 76 x 76 matrix was generated for every variant that included the pairwise van der Waals interaction energies. The overall van der Waals interaction energy of a particular residue is calculated by summing up the pairwise interaction energies of that residue with all the other residues of the protein. This involved mainly the nearest neighbors in the structure, as the vdW energy functional is short-ranged. The distribution of the interaction energy was typically Gaussian-like and was distinct in the mutants when compared to the WT. The change in the backbone chemical shifts from atomic-level simulations was predicted using SPARTA+.51

Wako-Saitô-Muñoz-Eaton (WSME) Model

The WSME model is a structure-based statistical mechanical model that assumes an ensemble of 2^N microstates, where N is the protein length, arising from a binary description of the residue folded status (binary variables 1 and 0 for folded- and unfolded-like conformations, respectively, for a residue).52, 53 We employed a version of the model that includes contributions from mean-field van der Waals interactions (EvdW; or packing), simplified solvation free energy (∆Gsolv) and Debye-Hückel (DH) electrostatics (Eelec).26, 54 The free energy of each microstate (∆F) with a string of folded residues between and including m and n (i.e. a string of ones) can be written as:

$$\Delta F = \sum \Delta G^{stab}_{m,n} - T \sum_{m}^{n} \Delta S$$

where ∆S is the entropic penalty for fixing a residue in the native-like conformation. The effective stabilization free energy ∆G^stab_{m,n} at a temperature T is the sum of the three energy terms:

$$\Delta G^{stab}_{m,n} = E_{vdW} + E_{elec} + \Delta G_{solv}$$

A distance cut-off of rc1 between heavy atoms identifies the interacting residue pairs, excluding the nearest neighbors (i.e. a contact map). For a given pair of interacting residues, the effective van der Waals interaction energy is the product of the number of inter-residue heavy-atom interactions and the mean-field vdW interaction energy per atomic contact, ξ. The DH electrostatic potential term can be written as:

$$E_{elec} = \sum_{m,n} K_{Coulomb} \frac{q_i q_j}{\varepsilon_{eff} \, r_{ij}} \exp\left(-r_{ij} \kappa\right)$$

where KCoulomb is the Coulomb constant (1389 kJ Å mol-1), qi is the charge on atom i, rij is the distance between charge centers i and j, and εeff is the effective dielectric constant. εeff is fixed to 29, the value estimated from previous calibrations comparing four different homologous protein families54 and 138 single point mutations involving charged residues.26, 54 The Debye screening length (1/κ) depends on εeff, solvent ionic strength (I) and temperature (T). The solvation free energy is treated as proportional to the number of native contacts (x^{m,n}_cont) in that microstate, with the proportionality constant being ∆Cp^cont, which is the temperature-independent heat capacity change upon fixing a native contact. Therefore,

$$\Delta G_{solv} = x^{m,n}_{cont} \, \Delta C_p^{cont} \left[ \left(T - T_{ref}\right) - T \ln\left(\frac{T}{T_{ref}}\right) \right]$$

where Tref is the reference temperature, which is fixed to 385 K.55 The effect of denaturant D is introduced following the phenomenological linear free energy relation:

$$\Delta G^{stab}_{m,n}([D]) = \Delta G^{stab}_{m,n}([0]) - m_{res} \, k \, [D]$$

where k = m – n + 1 and mres is the per-residue empirical destabilization constant that quantifies the sharpness of the chemical unfolding transition. The total partition function (Z) can be calculated as a function of temperature or denaturant from the transfer matrix formalism of Wako and Saitô52 as follows:

$$Z = v_l \left( \prod_{i=1}^{N} X_i \right) v_r^{tr}$$

where vl and vr are the left and right vectors, and Xi is an N-by-N matrix that accounts for the interactions of the ith residue with the residues following it.52 The probability of finding a residue i folded (χi) can be calculated from the derivative of the matrix Xi:

$$\chi_i = Z^{-1} \, v_l \left( \prod_{j=1}^{i-1} X_j \right) \left( \frac{\partial X_i}{\partial \ln z} \right) \left( \prod_{j=i+1}^{N} X_j \right) v_r^{tr}$$

The folding landscapes plotted in Figure 5 are generated by accumulating the partial partition functions of specific N-terminal and C-terminal residues. A detailed description of the method can be found in the classical work of Wako and Saitô52 and also in the recent descriptions of the model.26, 54

Predicting Mutant Unfolding Curves

The WSME model was employed to exactly reproduce the unfolding thermodynamic parameters of each of the 19 WT proteins. This involved varying ξ and mres for a given WT protein, with a fixed entropic cost (∆Sconf = -16.5 J mol-1 K-1 per residue), heat capacity change per native contact (∆Cp^cont = -0.358 J mol-1 K-1), pH (= 7.0) and ionic strength (I = 0.05 M). The generated mean residue-unfolding curve (i.e. the average folded probability of all residues in the protein) was then fit to a two-state model with free-floating baselines to estimate the stability of the WT at zero denaturant (∆G; blue in Figure 4c, 4d). A heavy-atom distance cut-off of either 6 Å or 5 Å (rc1) was employed to identify the interacting residues in the native PDB structure. Because the experimental unfolding parameters are available, we have varied ξ for each of the WT proteins; however, changes in free energy (∆∆G) can also be predicted by assuming a single value of ξ across all proteins, as done before.26, 63
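As an illustration of the WSME energy terms defined in this section, the following minimal Python sketch evaluates the solvation term and the Debye-Hückel energy of a single charge pair using the parameter values quoted in the text. The function names are hypothetical, and κ is passed in directly because its dependence on ionic strength and temperature is not spelled out here; this is a sketch under those assumptions, not the authors' implementation.

```python
import math

K_COULOMB = 1389.0  # kJ Å mol^-1, Coulomb constant quoted in the text
EPS_EFF = 29.0      # effective dielectric constant quoted in the text
T_REF = 385.0       # K, reference temperature quoted in the text
DCP_CONT = -0.358   # J mol^-1 K^-1 per native contact, quoted in the text


def solvation_free_energy(n_contacts, temperature,
                          dcp_cont=DCP_CONT, t_ref=T_REF):
    """Delta G_solv = x_cont * dCp_cont * [(T - T_ref) - T ln(T / T_ref)].

    Returns J mol^-1 when dcp_cont is given in J mol^-1 K^-1.
    """
    bracket = (temperature - t_ref) - temperature * math.log(temperature / t_ref)
    return n_contacts * dcp_cont * bracket


def debye_huckel_pair_energy(q_i, q_j, r_ij, kappa, eps_eff=EPS_EFF):
    """Screened Coulomb energy of one charge pair in kJ mol^-1.

    r_ij is in Å and kappa (inverse Debye length) in Å^-1; kappa is an
    input here, which is a simplification made for this sketch.
    """
    return K_COULOMB * q_i * q_j / (eps_eff * r_ij) * math.exp(-r_ij * kappa)


if __name__ == "__main__":
    # Hypothetical microstate with 100 native contacts at 298 K.
    print(solvation_free_energy(100, 298.0))
    # Hypothetical salt bridge: opposite unit charges 5 Å apart.
    print(debye_huckel_pair_energy(+1.0, -1.0, 5.0, kappa=0.1))
```

By construction the solvation term vanishes at T = Tref, and increasing κ (stronger screening) reduces the magnitude of the pair energy.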


Once the WT unfolding thermodynamics was reproduced, the mutant effects were directly predicted employing identical parameters to the WT and from the empirical relations detailed in equations 1 and 2 (see Results section). The following combinations of first-/second-shell (rc1 / rc2) interaction cutoffs (in Å) were attempted: 6/6, 6/5, 6/4, 5/5 and 5/4. For each of these cases the relative destabilization energy of the first (x1) and the second shell (x2) was modulated in steps of 0.1; x1 therefore goes from 0.1 to 1 while x2 goes from 0 to 1 (55 combinations). This required the generation of 20,625 unfolding curves (55 combinations of x1/x2 × 375 mutants) per combination of interaction cut-offs, resulting in a grand total of 103,125 unfolding curves. The combination (across x1, x2, rc1 and rc2) that resulted in the best correlation coefficient, slope of the linear regression line and mean-absolute error was then chosen. This was found to be rc1 = rc2 = 6 Å, with (x1, x2) = (0.5, 0.2). Note that a value of x1 = 0.5 does not mean a 50% reduction in the interaction energy. As an example, consider a mutation from isoleucine to alanine and the parameter combination above. Isoleucine has 8 heavy atoms and alanine has 5 heavy atoms. Following the empirical relation in the section on the WSME model, the van der Waals interaction energy in the first shell will decrease to 0.8125 times the original energy, i.e. an 18.75% reduction of interaction energy in the first shell. The destabilization in the second shell will depend on the amino acid residues present in the first shell. If the isoleucine has a valine and a glutamine in the first shell, the interaction energy between them and their neighboring residues within 6 Å will be decreased by 7.5%.
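The arithmetic in the preceding paragraph can be checked with a short Python sketch. Two assumptions are made here that the text does not state explicitly: the count of 55 combinations is reproduced by requiring x2 < x1, and the scaling relation (retained fraction = 1 − x × atoms lost / WT heavy atoms) is inferred from the isoleucine-to-alanine example rather than copied from equations 1 and 2. The function names are hypothetical.

```python
def shell_weight_grid():
    """(x1, x2) pairs in steps of 0.1, x1 in 0.1..1, keeping only x2 < x1.

    The x2 < x1 restriction is an assumption that reproduces the 55
    combinations quoted in the text.
    """
    return [(i / 10, j / 10) for i in range(1, 11) for j in range(i)]


def retained_energy_fraction(n_heavy_wt, n_heavy_mut, weight):
    """Fraction of the vdW interaction energy kept after a truncation.

    Inferred form: 1 - weight * (heavy atoms lost / heavy atoms in WT).
    """
    return 1.0 - weight * (n_heavy_wt - n_heavy_mut) / n_heavy_wt


if __name__ == "__main__":
    grid = shell_weight_grid()
    print(len(grid))          # number of (x1, x2) combinations
    print(len(grid) * 375)    # unfolding curves per cut-off combination
    # Isoleucine (8 heavy atoms) to alanine (5 heavy atoms):
    print(retained_energy_fraction(8, 5, 0.5))  # first shell, x1 = 0.5
    print(retained_energy_fraction(8, 5, 0.2))  # second shell, x2 = 0.2
```

With x1 = 0.5 the first-shell energy is scaled to 0.8125 of its original value, and with x2 = 0.2 the second-shell energy is reduced by 7.5%, matching the worked example.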


Figure 1. Graph theory predicts long-range modulation of communication network upon perturbations. (a) Cartoon- (left panel) and graph-representation (right panel) of ubiquitin.56 In the latter, the nodes are shown in blue and the edges in gray. (b-d) Absolute changes in the betweenness centrality (CB) between the mutant and the wild-type plotted as a function of the Cα–Cα distance from the perturbed site for different perturbation magnitudes (25%, 50% and 75%). (e) Effective changes in betweenness centrality arising from different perturbations across the entire interaction network (cyan) and a single-exponential fit to the same (red). (Inset) A plot of the standard deviation of the cyan points in the main panel. (f) Magnitude of the exponent (dc) from a similar interaction-network perturbation analysis on four different proteins.
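The network perturbation analysis summarized in Figure 1 can be sketched in a few lines of Python. This is not the MATLAB/Boost Graph Library workflow used in the paper: it is a self-contained illustration that computes unweighted betweenness centrality with Brandes' algorithm, counting each unordered node pair once. The function names, the random choice of which edges to remove and the rounding of the number of removed edges are all assumptions made for this sketch.

```python
import random
from collections import deque


def betweenness_centrality(adj):
    """Unweighted betweenness (Brandes' algorithm); adj maps node -> set of neighbours."""
    cb = {v: 0.0 for v in adj}
    for s in adj:
        order = []                            # nodes in non-decreasing distance from s
        preds = {v: [] for v in adj}          # predecessors on shortest paths
        sigma = {v: 0 for v in adj}           # number of shortest paths from s
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):             # accumulate pair dependencies
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                cb[w] += delta[w]
    # Each unordered pair (s, t) was visited from both ends.
    return {v: c / 2.0 for v, c in cb.items()}


def remove_edges(adj, node, fraction, seed=0):
    """Return a copy of the network with a fraction of node's edges removed."""
    rng = random.Random(seed)
    neighbours = sorted(adj[node])
    n_remove = int(fraction * len(neighbours) + 0.5)  # rounding rule is an assumption
    mutant = {v: set(nb) for v, nb in adj.items()}
    for w in rng.sample(neighbours, n_remove):
        mutant[node].discard(w)
        mutant[w].discard(node)
    return mutant


def delta_cb(wt_adj, mut_adj):
    """Per-node change in betweenness centrality, CB(mut) - CB(WT)."""
    wt, mut = betweenness_centrality(wt_adj), betweenness_centrality(mut_adj)
    return {v: mut[v] - wt[v] for v in wt_adj}


if __name__ == "__main__":
    # Toy four-residue ring standing in for a protein contact network.
    ring = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {2, 0}}
    print(delta_cb(ring, remove_edges(ring, node=0, fraction=0.5)))
```

In a real analysis the adjacency dictionary would be built from the contact map, and the resulting ∆CB values would be binned against the Cα–Cα distance from the perturbed residue.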

Results

Graph Theory Predicts Long-Range Effects of Mutational Perturbations

It is well established that proteins are best seen as systems with a distinct network of interactions (Figure 1a).57 We therefore chose to first explore the effect of perturbations on the interaction network of ubiquitin (Ubq, α/β), SH3 (all-β), B. subtilis CspB (all-β) and bovine ACBP (all-α) from the perspective of graph theory, which views the protein structure as a collection of nodes (residues) and edges (interactions; Figure 1a for Ubq). On identifying the nearest neighbors and hence the degree of connectivity from the contact map, we systematically remove varying fractions of the edges (25%, 50% or 75%) for all nodes, to mimic the effect of small, medium and large mutations, respectively. We then calculate the global network communication parameter, betweenness centrality (CB), for every node in the network before and after the “mutation”. CB is a measure of how influential a particular node is in a network, and is commonly used to identify critical nodes (hubs) in social and protein-interaction networks.58 It is defined as the sum of the fractions of shortest paths between all pairs of nodes that pass through a particular node. The parameter ∆CB = CB (mut) – CB (WT) is therefore employed to quantify the effect of the in silico mutation.

We find that the perturbations induced on the network of interactions influence the betweenness centrality at distances even up to 15 Å away from the originating site (Figure 1b-1d). The effect is more dramatic for larger ‘mutations’ (Figure 1d) compared to smaller mutations (Figure 1b for example) despite a similar extent of percolation into the network. There is also a trend wherein the perturbation magnitude diminishes when moving further away from the mutated site, as intuitively expected. A few residues do not follow this general trend. For example, the absolute change in betweenness centrality is predicted to be >120 at ~12 Å away from the mutated site even for a 25% perturbation. This arises from the perturbation of a long-range interaction between I3 and L56, which pack strongly against each other and hence result in large changes in betweenness centrality. For a better quantification of this phenomenon and to eliminate any such idiosyncratic structural effects, we pool the changes in betweenness centrality for all nodes and for the three different perturbation magnitudes and bin them based on the distance from the mutated site. Here, we find a clear trend wherein the effect of mutations on the global network parameter decays exponentially from the mutated site with a distance constant (dc) of about 4.1 Å (Figure 1e). The variation of this parameter also exhibits a similar behavior, decaying to zero (i.e. no effect on the network property) only at >15 Å (inset to Figure 1e). The magnitude of the decay constants does not seem to vary significantly on analyzing proteins with different topologies and secondary structure content (Figure 1f). Since the dynamic range on the abscissa is small (distance varies just over one order of magnitude), we cannot rule out a power-law behavior. However, a similar exponential propagation has been noted before in perturbation studies of protein–protein interaction networks, indicating that such a functional form is expected when there is dissipation.59
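To make the notion of a distance constant concrete, the sketch below estimates dc from binned (distance, |∆CB|) pairs by a straight-line fit to ln(value) versus distance. The fitting procedure used in the paper is not described in this section, so the log-linear least-squares approach, the function name and the synthetic data are all assumptions chosen for a dependency-free illustration.

```python
import math


def fit_exponential_decay(distances, values):
    """Fit values ~ A * exp(-d / d_c) via least squares on ln(values).

    All values must be positive. Returns (A, d_c).
    """
    logs = [math.log(v) for v in values]
    n = len(distances)
    mean_d = sum(distances) / n
    mean_l = sum(logs) / n
    sxx = sum((d - mean_d) ** 2 for d in distances)
    sxy = sum((d - mean_d) * (l - mean_l) for d, l in zip(distances, logs))
    slope = sxy / sxx                       # equals -1 / d_c
    amplitude = math.exp(mean_l - slope * mean_d)
    return amplitude, -1.0 / slope


if __name__ == "__main__":
    # Synthetic, noise-free decay with d_c = 4.1 Å (illustrative values only).
    bins = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 14.0]
    signal = [30.0 * math.exp(-d / 4.1) for d in bins]
    print(fit_exponential_decay(bins, signal))
```

A log-linear fit weights small values more heavily than a direct nonlinear fit would, so on noisy data the two approaches can give somewhat different distance constants.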

Functional Form of the Propagation of Mutational Perturbation

While the changes in CB are helpful in identifying general physical principles associated with interaction network perturbation, it is difficult to extend or compare them to real-world mutations and energies. To explore this issue further and to understand the extent of propagation of mutational effects in the folded ensemble of a protein, we performed one-microsecond-long explicit-water all-atom MD simulations on WT Ubq and seven aliphatic mutants (I13V, V17A, I23A, I30A, L43A, L67A, L69A). The mutated residues are completely buried within the hydrophobic protein interior, with a mean relative solvent-accessible surface area of just 1.3% (±1.4%), and therefore perturb purely the internal packing and dynamics. The experimental destabilization free energies are also available for these mutants,60 thus enabling a direct comparison. The Cα RMSD with respect to the starting structure is less than ~3 Å at most times for all the trajectories, with a time-dependence that is markedly different from the WT (Supporting Figure S1). The RMSF (root-mean-square fluctuations) is clearly higher than the WT in all the mutants, with the variants involving large truncations (I23A and I30A) displaying a significantly larger RMSF across the entire structure (Figure S2).


As a first step towards understanding these structural changes, we calculated dynamic cross-correlations (DCC) between pairs of residues across the structure, which provide information on inter-residue correlated motions averaged over time. Since the average Cα-positions do not vary significantly (Figure S3), we employ the Cα–Cα distance as a metric to quantify the distance dependence of perturbation. As expected, the magnitude of the correlation coefficient decays rapidly as we move away from the residue of interest (Figure S4). However, we find that some distant residues (>15 Å) are mutually correlated though the magnitude is small (cyan in Figure S4). For example, we find the motions of residues 5-6, 13-15, 42-45 and 66-68 (see cyan in panels corresponding to L43A, L67A, L69A in Figure S4) to be mutually correlated. This is essentially a generalization of the previous observations of long-range correlations and allosteric coupling in ubiquitin7 (and also other proteins6) from constrained or millisecond time-scale simulations.61 The fact that we observe the same from relatively shorter unconstrained simulations is evidence that the time-scale of the current simulations is long enough to perform a detailed analysis of the energetics. Interestingly, mutations shift the inter-residue correlations distinctly; smaller mutations (V17A, I13V) have a marginal effect on the correlations while large mutations (I23A, I30A) have more dramatic effects. For example, the inter-residue cross-correlations go from being positive and significant at residues around 40 and 52 to near-zero in I23A, suggestive of subtle structural disruption.
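For readers unfamiliar with the quantity, the dynamic cross-correlation between two atoms is commonly defined as the covariance of their displacement vectors normalized by the root of the product of their variances. The minimal sketch below implements that standard definition in plain Python; the paper states only that the DCC was calculated from the covariance matrix, so the exact normalization, the function name and the assumption that frames are already superimposed on a reference structure are illustrative.

```python
import math


def dynamic_cross_correlation(traj_i, traj_j):
    """DCC between two atoms from per-frame (x, y, z) positions.

    DCC = <dr_i . dr_j> / sqrt(<dr_i^2> <dr_j^2>), with dr the displacement
    from the time-averaged position. Frames are assumed to be pre-aligned.
    """
    n = len(traj_i)
    mean_i = [sum(frame[k] for frame in traj_i) / n for k in range(3)]
    mean_j = [sum(frame[k] for frame in traj_j) / n for k in range(3)]
    cov = var_i = var_j = 0.0
    for fi, fj in zip(traj_i, traj_j):
        di = [fi[k] - mean_i[k] for k in range(3)]
        dj = [fj[k] - mean_j[k] for k in range(3)]
        cov += sum(a * b for a, b in zip(di, dj))
        var_i += sum(a * a for a in di)
        var_j += sum(b * b for b in dj)
    return cov / math.sqrt(var_i * var_j)


if __name__ == "__main__":
    # Toy trajectories: atom b follows atom a, atom c moves the opposite way.
    a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
    b = [(5.0, 5.0, 5.0), (6.0, 5.0, 5.0), (7.0, 5.0, 5.0)]
    c = [(2.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
    print(dynamic_cross_correlation(a, b), dynamic_cross_correlation(a, c))
```

Values near +1 indicate motion in the same direction, values near -1 anti-correlated motion, and values near 0 uncorrelated or perpendicular motion.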


Figure 2. The effect of point mutations is felt around the entire structure. All energy units are in kJ mol⁻¹. (a-c) Distributions of packing interaction energy (EvdW) for the residues R42, Y59 and A28 before (i.e. WT Ubq, black) and after the specified mutation (gray) from all-atom explicit water MD simulations. The Cα–Cα distance of the three residues from the mutated position is also indicated. (d-i) Plots of mutation-induced changes in the mean packing interaction energy (circles and left axis; mean EvdW, mut – mean EvdW, WT) and the ratio of the standard deviation of the EvdW distributions (cyan line and right axis). The residues are color-coded according to the respective Cα–Cα distance from the mutated residue (d in Å): d4 heavy atom difference).


the combination that provides a physically reasonable estimate of all the three parameters was chosen (see Figures S7-S11). We find a very good agreement between the experimental and predicted ∆∆G at a uniform distance cutoff of 6 Å for both the first shell and the second shell (rc1 = rc2 = 6 Å), and with (x1, x2) = (0.5, 0.2). The effective correlation coefficient of the entire data set is ~0.59 with a global slope of 0.98. The correlation coefficient is on the lower side mainly because of mutations involving large truncations, i.e. changes in the number of side-chain atoms >4 (aromatic-to-alanine; magenta dots in Figure 4g). It has been shown before that such mutations promote protein chain collapse and large structural rearrangements around the altered site to eliminate the large cavity;30, 43 clearly, this alters the native structure and hence cannot be captured by our approach. Moreover, the aromatic mutations show up as outliers even in the protein mutant structure-based analysis, suggesting that the large deviations do not originate from the second-shell considerations (Figure S6). The correlation coefficient increases to ~0.69 and the slope changes to 0.88 for mutations with small truncations (≤3 heavy atoms change). Protein-wise analysis reveals that the mutational perturbations are robustly captured across most proteins that span a range of sizes, secondary structure composition and mutation type (Figure S12, Table S6). A few outliers in each of the proteins therefore confound the global analysis (i.e. when employing all of the data), resulting in the lower correlation. In fact, the majority of the outliers are from Fyn- and src-SH3, proteins in which multiple aromatic-to-alanine substitutions have been performed (Figure S12).67, 68 How reasonable is this prediction? To evaluate this question, we employ FOLDEF (FOLDX energy function65), a gold standard in mutational analysis, and predict the ∆∆G for the same set of mutations used in this study. 
The resulting slope/correlation coefficient values for the entire dataset and for small truncations are 0.94/0.60 and 0.96/0.64, respectively, very similar to our predictions (Figure S13). However, FOLDEF is the more complex model and employs an array of atomic- and residue-based parameters (sequence-based entropy, polar-, apolar-solvation terms etc.) apart from relative weighting terms for the various energy functionals.
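The agreement metrics quoted above (slope, correlation coefficient and mean absolute error) can be computed as in the following minimal sketch. It assumes that the predicted ∆∆G is regressed on the experimental ∆∆G by ordinary least squares, which is a choice made here for illustration; the function name `regression_summary` and the numbers in the example are hypothetical and are not data from this work.

```python
import math


def regression_summary(expt, pred):
    """Return (slope, Pearson r, mean absolute error) for paired values.

    Assumption: pred is regressed on expt by ordinary least squares.
    """
    n = len(expt)
    mean_x = sum(expt) / n
    mean_y = sum(pred) / n
    sxx = sum((x - mean_x) ** 2 for x in expt)
    syy = sum((y - mean_y) ** 2 for y in pred)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(expt, pred))
    slope = sxy / sxx
    pearson_r = sxy / math.sqrt(sxx * syy)
    mae = sum(abs(y - x) for x, y in zip(expt, pred)) / n
    return slope, pearson_r, mae


# Hypothetical illustration values (kJ/mol), not taken from the mutant database.
expt_ddg = [1.0, 2.0, 3.0, 4.0]
pred_ddg = [1.5, 1.5, 3.5, 3.5]
print(regression_summary(expt_ddg, pred_ddg))
```

Reporting all three numbers together is useful because a slope near unity can coexist with a modest correlation coefficient when a few outliers dominate the scatter, as discussed for the aromatic-to-alanine mutations.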

Discussion

In an attempt to unify the disparate and seemingly conflicting observations and assumptions involving side-chain truncation mutations in protein structures, we dissect the mutational effects in detail using different approaches. Because proteins are conventionally seen as molecules with a well-defined network of interactions, we first employed a simple “toy model” of the ubiquitin interaction network to explore the possible effects of perturbations. We find that even minor perturbations in the intra-molecular interaction networks can influence distal residue–residue communication networks, measured using betweenness centrality, in a distinct manner (Figure 1b). The extent of propagation in the network can be approximately modeled as an exponential function, with the actual magnitude dependent on the originating perturbation (Figure 1c). Both these are predictions, and are expected to hold true for interaction networks within proteins that are very different from random graphs. On analyzing detailed molecular dynamics simulations of several mutants of ubiquitin (Figures 2-3), we find that mutations involving side-chain truncations have a domino-like effect on the protein structure. First, these mutations increase the flexibility of the neighbors (blue in Figure 4a, 4b) around a mutated site (red in Figure 4a, 4b), as there is a significant loss of packing interactions in the immediate vicinity (Figure 2). Second, this higher conformational flexibility of the first-shell residues weakens the packing interactions holding the second shell together (i.e. white edges in Figure 4a), but to a lesser extent as they are farther away from the mutated site.


This is in fact very similar to the mechanism of destabilization observed in repeat proteins that happens at a larger length scale (i.e. at the level of the repeat domains)69 and the ‘cascade effect’ of mutations that has been recently proposed based on analysis of network properties.70 Remarkably, we do not observe any change in the structure of the protein or a dramatic conformational rearrangement. This is additional evidence that allosteric effects can be mediated without structural changes and can be channeled merely through subtle changes in the packing interaction network.9, 14 More specifically, we find evidence that modulations in the mean and variance of residue-residue packing interaction energies can percolate to large distances within the protein interaction network, exactly as predicted by the classical work of Cooper and Dryden.12 This in turn translates to changes in the inter-residue communication paths that modify the existing allosteric coupling observed in the WT (for example, see Figure S4). The dissipation in energy, or the effect of mutation on the network of interactions, decays approximately exponentially, with a decay distance constant of 4.7 Å, as we move away from the mutated site. The implication is that the effect of mutation can be ‘felt’ as far as 15-20 Å from the mutational site. Some residues do pack better on average (negative ∆EvdW,m; points below the zero line in Figure 3b and Figure S5), but a majority of them display weaker packing interactions (positive ∆EvdW,m), thus destabilizing the structure significantly. The amplitude of the perturbation, however, depends on the original perturbation magnitude. Mutations involving large truncations can therefore exert more effect at distal sites (i.e. significant weakening of packing interactions) than smaller truncations. These results are entirely consistent with the perturbation analysis of the interaction network in ubiquitin. 
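For orientation, the reported decay constant can be turned into rough numbers. The worked example below assumes a pure single-exponential form with no offset; the amplitude A (set by the size of the original perturbation) and the symbol names are introduced here for illustration only.

```latex
\[
\Delta E(d) \approx A\, e^{-d/d_c}, \qquad d_c = 4.7\ \text{\AA}
\]
\[
\frac{\Delta E(d)}{A} \approx
\begin{cases}
e^{-1} \approx 0.37, & d = 4.7\ \text{\AA} \\
e^{-15/4.7} \approx 0.041, & d = 15\ \text{\AA} \\
e^{-20/4.7} \approx 0.014, & d = 20\ \text{\AA}
\end{cases}
\]
```

Under this assumption, roughly 1-4% of the amplitude remains at 15-20 Å, which is small but nonzero, and a larger amplitude A (a larger truncation) leaves a proportionally larger residual effect at the same distance.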
We then recast the principles gleaned from the analysis of MD simulations onto a structure-based statistical mechanical Wako-Saitô-Muñoz-Eaton (WSME) model. We decouple the magnitude of the destabilization energy in the first shell from that propagated into the second shell, and describe its dependence on the nature of the mutation, through a simple empirical relation. Then, through a detailed yet systematic approach, we estimate the first/second-shell partitioning to be 0.5:0.2 (ratio of 2.5) that consistently predicts the changes in stabilities for 375 mutations from 19 different proteins in terms of the slope of the linear regression line, the correlation coefficient and the mean absolute error (Figure 4). This 50:20 partitioning of the destabilization energy is in good agreement with that estimated from MD simulations of specific mutants of ubiquitin, which point to a value of 1.95 (± 0.98) (by taking the ratio of the average first- and second-shell perturbations shown in Figure 3c). More detailed simulations on multiple mutant types, structural classes and mutation positions, which are also simultaneously consistent with experiments, are required to identify the exact magnitude of this number. The correlation coefficient from our method is on the lower side, possibly due to a combination of reasons. First, mutations also affect the intrinsic conformational preferences of amino acids in secondary structure elements, which are not currently incorporated in the WSME model as they significantly increase the parameter space. Second, proteins with large side-chain truncations, especially those involving aromatics, are known to dramatically rearrange the backbone and relative side-chain orientations of nearby residues to eliminate the large cavity that is created. Despite these caveats, our minimalistic approach works as well as the FOLDX method, attesting to its robustness while revealing the molecular origins of mutation-induced destabilization. A larger database with additional mutations is needed to provide an independent test set for exploring the generality of our conclusions. 
FOLDX is adept at capturing the mutational effects involving small-to-large substitutions and changes in solvation induced by polar(apolar) to apolar(polar) substitutions on the protein surface. These mutation types cannot be addressed by our current protocol. However, this should be possible in the near future given the minimal parameter set required by the WSME model and its variants.
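The first- and second-shell picture used in the analysis above can be illustrated with a simple distance-based sketch. It makes two assumptions that are not taken from this section: each residue is represented by a single point, and a residue belongs to the second shell if it lies within the cutoff of any first-shell residue but not of the mutated residue itself. The actual shell definition and the way the weights (x1, x2) enter the WSME energy function are not reproduced here; the function name `shells` and the toy coordinates are hypothetical.

```python
import math


def shells(coords, mutated, cutoff=6.0):
    """Split residues into first and second shells around a mutated residue.

    coords: dict mapping residue id -> (x, y, z) representative point (Angstrom).
    Assumption: one point per residue; the same cutoff is used for both shells.
    """
    origin = coords[mutated]
    # First shell: residues within the cutoff of the mutated residue.
    first = {
        r for r, xyz in coords.items()
        if r != mutated and math.dist(xyz, origin) <= cutoff
    }
    # Second shell: remaining residues within the cutoff of any first-shell residue.
    second = {
        r for r, xyz in coords.items()
        if r != mutated and r not in first
        and any(math.dist(xyz, coords[f]) <= cutoff for f in first)
    }
    return first, second


# Hypothetical residues placed on a line, 0, 5, 10 and 20 Angstrom from "A".
toy_coords = {
    "A": (0.0, 0.0, 0.0),
    "B": (5.0, 0.0, 0.0),
    "C": (10.0, 0.0, 0.0),
    "D": (20.0, 0.0, 0.0),
}
print(shells(toy_coords, "A"))
```

In a scheme of this kind, interactions in the two shells would then be weakened by different fractions, with the first shell affected more strongly than the second.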

Figure 5. Direct experimental evidence for the modulation of allosteric coupling through mutations. The top, middle and bottom rows represent ubiquitin, SNase and T4 Lysozyme (T4L), respectively. (a, d, g) Effective changes in backbone chemical shifts (from 15N and 1Hα nuclei; panels a and d), and between the ground and excited state for T4L (from 15N, 1HN, 13Cα, 1Hα and 13C’ nuclei; panel g) on mutations in three proteins and as a function of distance from the mutated site. The red curves are fits to single-exponential functions. (b, e, h) Same as in panels a, d, and g but with the changes mapped onto the structure for L43A in Ubq (panel b), L125A in SNase (panel e) and L99A in T4L (panel h). (c, f, i) Predicted folding landscapes at 298 K for the WT and indicated mutant as a function of the number of residues structured in the N- or C-termini. U, N and I represent unfolded, native and intermediate ensembles, respectively. A spectral color-coding is employed going from blue (low in free energy) to red (high in free energy).


Our observations, however, are not without precedent. Simplified lattice simulations suggest that the energetic coupling between residues should decay exponentially as a function of distance between the residues.71 This expectation is also evident in the plot of the degree of energetic coupling as a function of distance from the mutated site (estimated from double-mutant cycles) in four different proteins.39, 72 Despite this, it is surprising that such a distance-dependence of mutational effects has not been measured directly. The ideal techniques that could report on this are NMR experiments that are sensitive to small changes in the electronic environment of residues arising from packing rearrangements. However, there are only a few works (to our knowledge), on ubiquitin,44, 73 T4 lysozyme (T4L),45 and SNase,46 that report on the changes in chemical shift of backbone atoms upon mutational perturbations of buried residues. A plot of changes in chemical shift (chemical shift perturbation) upon mutation in these proteins as a function of distance from the mutated site bears remarkable resemblance to the expectation from network analysis and MD simulations (Figure 5a, 5d, 5g). The distance constants (dc) estimated by fitting the chemical shift data to exponential functions are 7.7 ± 1.0 Å for Ubq, 9.0 ± 1.0 Å for SNase and 16.6 ± 3.1 Å for T4L. The larger magnitude of the mutational percolation in experiments suggests that our estimates from network theory or MD-based approaches could just be a lower limit. Mapping the chemical shift changes onto the respective structures, we find that nearly the entire structure is perturbed, with only a handful of unaffected residues (Figure 5b, 5e, 5h). We are able to extract a similar distance dependence on calculating the changes in backbone chemical shifts from the simulations reported here (Figure S14), indicating that the molecular origins of these observations are the weaker packing of the side-chains due to mutations that spread nearly throughout the structure.
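A distance constant of the kind quoted above can be estimated from perturbation-versus-distance data as in the following minimal sketch. It assumes a single exponential with no offset and fits a straight line to the logarithm of the values, which is a simplification chosen here so that the example needs only the standard library; the fitting procedure actually used for Figure 5 is not specified in this section. The function name `fit_decay` and the synthetic data are hypothetical.

```python
import math


def fit_decay(distances, values):
    """Fit values ~ A * exp(-d / dc) by least squares on log(values).

    Assumptions: all values are positive and there is no constant offset.
    Returns (A, dc).
    """
    logs = [math.log(v) for v in values]
    n = len(distances)
    mean_d = sum(distances) / n
    mean_l = sum(logs) / n
    sxx = sum((d - mean_d) ** 2 for d in distances)
    sxy = sum((d - mean_d) * (l - mean_l) for d, l in zip(distances, logs))
    slope = sxy / sxx                      # equals -1 / dc
    amplitude = math.exp(mean_l - slope * mean_d)
    return amplitude, -1.0 / slope


# Synthetic, noise-free data generated with A = 2.0 and dc = 7.7 Angstrom,
# used only to check that the fit recovers its inputs.
d_toy = [2.0, 5.0, 10.0, 15.0, 20.0]
v_toy = [2.0 * math.exp(-d / 7.7) for d in d_toy]
print(fit_decay(d_toy, v_toy))
```

With noisy experimental data, a log-linear fit weights small values more heavily than a direct nonlinear fit would, so the two approaches can give somewhat different distance constants.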


The strikingly similar distance dependence of the mutational perturbations from different approaches (interaction network analysis, MD simulations, experiments) in various mutants is therefore strong evidence that our observation is a universal feature of truncation mutations in proteins. The implication of the above observation is that destabilizing mutations will affect not just the relative population of the unfolded state, but the entire folding landscape (the two columns to the right in Figure 5), possibly resulting in non-intuitive effects on function. Many partially structured states (also termed ‘excited states’ or ‘invisible states’) display an increased probability of occurrence, consistent with the interpretation of mutational effects in several proteins.45, 46, 74 While ligand binding and unbinding events or post-translational modifications (PTMs) can modulate the allosteric response in a time-dependent manner (protein ‘quakes’75), we find that the ‘information’ generated by the mutation is strongly imprinted on the ensemble dynamics or the landscape of proteins, which in turn determines the functional response. Our work reveals that mutations in proteins can have far-reaching consequences. First, the distinct network of interactions arising from the unique amino acid sequence in a protein structure could also be seen as an evolutionarily selected feature to dissipate the adverse effects of mutations on the functioning of a protein. Second, while mutational effects consistently propagate into the second shell of interactions in proportion to the perturbation, the weak exponential dependence suggests that the region around the mutated residue is not the only neighborhood that is affected. The packing of residues as far as 15–20 Å from the mutated site is weakened, contributing to the observed experimental destabilization. 
This phenomenon therefore distorts the original communication network of the WT, thus acting as a distinct signaling mechanism, while simultaneously providing a simple avenue to fix conditionally neutral mutations that are critical for evolutionary adaptation.41 Third, our observations possibly explain why the regions around the catalytic site of enzymes (up to a remarkable 27 Å from the functional sites) display a lower evolutionary rate compared to distant residues.76 Fourth, we expect any molecular event such as ligand (un-)binding or PTMs to have a similar long-range effect purely from the modulation of internal dynamics, thereby contributing to allostery. Last, folding mechanisms of proteins are generally inferred from mutation-based Brønsted analysis under the assumption that mutations affect only the first shell of interactions.36 Our observations challenge this assumption and call for caution in the interpretation of protein-engineering-based studies of folding mechanisms.


ASSOCIATED CONTENT

SUPPORTING INFORMATION

The Supporting Information is available free of charge on the ACS Publications website. WT and mutant RMSD, RMSF and vdW energy distribution plots from MD simulations, dynamic inter-residue cross correlation analysis from MD simulations, plots of mean absolute error, slope and correlation coefficient from the predictions of the WSME model for different first- and second-shell dimensions, supporting tables including the mutant database and the associated parameters. (PDF)

AUTHOR INFORMATION

Corresponding Author

* [email protected]

Author Contributions

† The authors contributed equally to this work.

The manuscript was written through contributions of all authors.

ACKNOWLEDGEMENT

A. N. N. is a Wellcome Trust / DBT India Alliance Intermediate Fellow. We thank the P. G. Senapathy Center for Computing Resource at the Indian Institute of Technology Madras, Chennai, India for the high-performance computational facilities.


References

1. Richards, F. M. (1974) The interpretation of protein structures: total volume, group volume distributions and packing density, J. Mol. Biol. 82, 1-14.
2. Gaines, J. C., Smith, W. W., Regan, L., and O'Hern, C. S. (2016) Random close packing in protein cores, Phys. Rev. E 93, 032415.
3. Bowman, G. R., and Geissler, P. L. (2012) Equilibrium fluctuations of a single folded protein reveal a multitude of potential cryptic allosteric sites, Proc. Natl. Acad. Sci. U. S. A. 109, 11681-11686.
4. Bowman, G. R., and Geissler, P. L. (2014) Extensive conformational heterogeneity within protein cores, J. Phys. Chem. B 118, 6417-6423.
5. DuBay, K. H., Bowman, G. R., and Geissler, P. L. (2015) Fluctuations within folded proteins: implications for thermodynamic and allosteric regulation, Acc. Chem. Res. 48, 1098-1105.
6. Fenwick, R. B., Orellana, L., Esteban-Martin, S., Orozco, M., and Salvatella, X. (2014) Correlated motions are a fundamental property of beta-sheets, Nat. Commun. 5, 4070.
7. Fenwick, R. B., Esteban-Martin, S., Richter, B., Lee, D., Walter, K. F., Milovanovic, D., Becker, S., Lakomek, N. A., Griesinger, C., and Salvatella, X. (2011) Weak long-range correlated motions in a surface patch of ubiquitin involved in molecular recognition, J. Am. Chem. Soc. 133, 10336-10339.
8. Lockless, S. W., and Ranganathan, R. (1999) Evolutionarily conserved pathways of energetic connectivity in protein families, Science 286, 295-299.
9. Suel, G. M., Lockless, S. W., Wall, M. A., and Ranganathan, R. (2003) Evolutionarily conserved networks of residues mediate allosteric communication in proteins, Nat. Struct. Biol. 10, 59-69.
10. Volkman, B. F., Lipson, D., Wemmer, D. E., and Kern, D. (2001) Two-state allosteric behavior in a single-domain signaling protein, Science 291, 2429-2433.
11. Fuentes, E. J., Der, C. J., and Lee, A. L. (2004) Ligand-dependent dynamics and intramolecular signaling in a PDZ domain, J. Mol. Biol. 335, 1105-1115.
12. Cooper, A., and Dryden, D. T. (1984) Allostery without conformational change. A plausible model, Eur. Biophys. J. 11, 103-109.
13. Tzeng, S. R., and Kalodimos, C. G. (2009) Dynamic activation of an allosteric regulatory protein, Nature 462, 368-372.
14. Popovych, N., Sun, S., Ebright, R. H., and Kalodimos, C. G. (2006) Dynamically driven protein allostery, Nat. Struct. Mol. Biol. 13, 831-838.
15. Swain, J. F., and Gierasch, L. M. (2006) The changing landscape of protein allostery, Curr. Opin. Struct. Biol. 16, 102-108.
16. Cui, Q., and Karplus, M. (2008) Allostery and cooperativity revisited, Prot. Sci. 17, 1295-1307.
17. Tzeng, S. R., and Kalodimos, C. G. (2011) Protein dynamics and allostery: an NMR view, Curr. Opin. Struct. Biol. 21, 62-67.
18. Fenwick, R. B., Esteban-Martin, S., and Salvatella, X. (2011) Understanding biomolecular motion, recognition, and allostery by use of conformational ensembles, Eur. Biophys. J. 40, 1339-1355.


19. Wand, A. J. (2013) The dark energy of proteins comes to light: conformational entropy and its role in protein function revealed by NMR relaxation, Curr. Opin. Struct. Biol. 23, 75-81.
20. Nussinov, R., and Tsai, C. J. (2015) Allostery without a conformational change? Revisiting the paradigm, Curr. Opin. Struct. Biol. 30, 17-24.
21. Monod, J., Wyman, J., and Changeux, J. P. (1965) On Nature of Allosteric Transitions - a Plausible Model, J. Mol. Biol. 12, 88-118.
22. Koshland, D. E., Jr., Nemethy, G., and Filmer, D. (1966) Comparison of experimental binding data and theoretical models in proteins containing subunits, Biochemistry 5, 365-385.
23. Sanchez-Ruiz, J. M., and Makhatadze, G. I. (2001) To charge or not to charge?, Trends Biotech. 19, 132-135.
24. Strickler, S. S., Gribenko, A. V., Gribenko, A. V., Keiffer, T. R., Tomlinson, J., Reihle, T., Loladze, V. V., and Makhatadze, G. I. (2006) Protein stability and surface electrostatics: A charged relationship, Biochemistry 45, 2761-2766.
25. Gribenko, A. V., Patel, M. M., Liu, J., McCallum, S. A., Wang, C. Y., and Makhatadze, G. I. (2009) Rational stabilization of enzymes by computational redesign of surface charge-charge interactions, Proc. Natl. Acad. Sci. U. S. A. 106, 2601-2606.
26. Naganathan, A. N. (2013) A Rapid, Ensemble and Free Energy Based Method for Engineering Protein Stabilities, J. Phys. Chem. B 117, 4956-4964.
27. Ratnaparkhi, G. S., and Varadarajan, R. (2000) Thermodynamic and structural studies of cavity formation in proteins suggest that loss of packing interactions rather than the hydrophobic effect dominates the observed energetics, Biochemistry 39, 12365-12374.
28. Eriksson, A. E., Baase, W. A., Zhang, X. J., Heinz, D. W., Blaber, M., Baldwin, E. P., and Matthews, B. W. (1992) Response of a protein structure to cavity-creating mutations and its relation to the hydrophobic effect, Science 255, 178-183.
29. Jackson, S. E., Moracci, M., elMasry, N., Johnson, C. M., and Fersht, A. R. (1993) Effect of cavity-creating mutations in the hydrophobic core of chymotrypsin inhibitor 2, Biochemistry 32, 11259-11269.
30. Xu, J., Baase, W. A., Baldwin, E., and Matthews, B. W. (1998) The response of T4 lysozyme to large-to-small substitutions within the core and its relation to the hydrophobic effect, Protein Sci. 7, 158-177.
31. Main, E. R. G., Fulton, K. F., and Jackson, S. E. (1998) Context-dependent nature of destabilizing mutations on the stability of FKBP12, Biochemistry 37, 6145-6153.
32. Zhou, H., and Zhou, Y. (2004) Quantifying the effect of burial of amino acid residues on protein stability, Proteins 54, 315-322.
33. Naganathan, A. N., and Muñoz, V. (2010) Insights into protein folding mechanisms from large scale analysis of mutational effects, Proc. Natl. Acad. Sci. U. S. A. 107, 8611-8616.
34. Naganathan, A. N., and Orozco, M. (2011) The protein folding transition-state ensemble from a Gō-like model, Phys. Chem. Chem. Phys. 13, 15166-15174.
35. Smith, C. A., Ban, D., Pratihar, S., Giller, K., Paulat, M., Becker, S., Griesinger, C., Lee, D., and de Groot, B. L. (2016) Allosteric switch regulates protein-protein binding through collective motion, Proc. Natl. Acad. Sci. U. S. A. 113, 3269-3274.
36. Fersht, A. R., Matouschek, A., and Serrano, L. (1992) The Folding of an Enzyme. 1. Theory of Protein Engineering Analysis of Stability and Pathway of Protein Folding, J. Mol. Biol. 224, 771-782.


37. Itzhaki, L. S., Otzen, D. E., and Fersht, A. R. (1995) The Structure of the Transition-State for Folding of Chymotrypsin Inhibitor-2 Analyzed by Protein Engineering Methods: Evidence for a Nucleation-Condensation Mechanism for Protein-Folding, J. Mol. Biol. 254, 260-288.
38. Horovitz, A. (1996) Double-mutant cycles: a powerful tool for analyzing protein structure and function, Fold. Des. 1, R121-R126.
39. Chi, C. N., Elfstrom, L., Shi, Y., Snall, T., Engstrom, A., and Jemth, P. (2008) Reassessing a sparse energetic network within a single protein domain, Proc. Natl. Acad. Sci. U. S. A. 105, 4679-4684.
40. Reynolds, K. A., McLaughlin, R. N., and Ranganathan, R. (2011) Hotspots for allosteric regulation on protein surfaces, Cell 147, 1564-1575.
41. McLaughlin Jr., R. N., Poelwijk, F. J., Raman, A., Gosal, W. S., and Ranganathan, R. (2012) The spatial architecture of protein function and adaptation, Nature 491, 138-142.
42. Murciano-Calles, J., Mclaughlin, M. E., Erijman, A., Hooda, Y., Chakravorty, N., Martinez, J. C., Shifman, J. M., and Sidhu, S. S. (2014) Alteration of the C-Terminal Ligand Specificity of the Erbin PDZ Domain by Allosteric Mutational Effects, J. Mol. Biol. 426, 3500-3508.
43. Consonni, R., Santomo, L., Fusi, P., Tortora, P., and Zetta, L. (1999) A single-point mutation in the extreme heat- and pressure-resistant sso7d protein from Sulfolobus solfataricus leads to a major rearrangement of the hydrophobic core, Biochemistry 38, 12709-12717.
44. Haririnia, A., Verma, R., Purohit, N., Twarog, M. Z., Deshaies, R. J., Bolon, D., and Fushman, D. (2008) Mutations in the hydrophobic core of ubiquitin differentially affect its recognition by receptor proteins, J. Mol. Biol. 375, 979-996.
45. Bouvignies, G., Vallurupalli, P., Hansen, D. F., Correia, B. E., Lange, O., Bah, A., Vernon, R. M., Dahlquist, F. W., Baker, D., and Kay, L. E. (2011) Solution structure of a minor and transiently formed state of a T4 lysozyme mutant, Nature 477, 111-114.
46. Roche, J., Caro, J. A., Dellarole, M., Guca, E., Royer, C. A., Garcia-Moreno, B. E., Garcia, A. E., and Roumestand, C. (2013) Structural, energetic, and dynamic responses of the native state ensemble of staphylococcal nuclease to cavity-creating mutations, Proteins 81, 1069-1080.
47. Millet, O., Mittermaier, A., Baker, D., and Kay, L. E. (2003) The effects of mutations on motions of side-chains in protein L studied by 2H NMR dynamics and scalar couplings, J. Mol. Biol. 329, 551-563.
48. Castaneda, C. A., Chaturvedi, A., Camara, C. M., Curtis, J. E., Krueger, S., and Fushman, D. (2016) Linkage-specific conformational ensembles of non-canonical polyubiquitin chains, Phys. Chem. Chem. Phys. 18, 5771-5788.
49. Zhang, J., Petit, C. M., King, D. S., and Lee, A. L. (2011) Phosphorylation of a PDZ Domain Extension Modulates Binding Affinity and Interdomain Interactions in Postsynaptic Density-95 (PSD-95) Protein, a Membrane-associated Guanylate Kinase (MAGUK), J. Biol. Chem. 286, 41776-41785.
50. Freeman, L. C. (1977) A set of measures of centrality based on betweenness, Sociometry 40, 35-41.
51. Shen, Y., and Bax, A. (2010) SPARTA+: a modest improvement in empirical NMR chemical shift prediction by means of an artificial neural network, J. Biomol. NMR 48, 13-22.


52. Wako, H., and Saito, N. (1978) Statistical Mechanical Theory of Protein Conformation. 2. Folding Pathway for Protein, J. Phys. Soc. Japan 44, 1939-1945.
53. Muñoz, V., and Eaton, W. A. (1999) A simple model for calculating the kinetics of protein folding from three-dimensional structures, Proc. Natl. Acad. Sci. U. S. A. 96, 11311-11316.
54. Naganathan, A. N. (2012) Predictions from an Ising-like Statistical Mechanical Model on the Dynamic and Thermodynamic Effects of Protein Surface Electrostatics, J. Chem. Theory Comput. 8, 4646-4656.
55. Robertson, A. D., and Murphy, K. P. (1997) Protein structure and the energetics of protein stability, Chem. Rev. 97, 1251-1267.
56. Chakrabarty, B., and Parekh, N. (2016) NAPS: Network Analysis of Protein Structures, Nucleic Acids Res., 10.1093/nar/gkw1383.
57. Vijayabaskar, M. S., and Vishveshwara, S. (2010) Interaction energy based protein structure networks, Biophys. J. 99, 3704-3715.
58. Holme, P., Kim, B. J., Yoon, C. N., and Han, S. K. (2002) Attack vulnerability of complex networks, Phys. Rev. E Stat. Nonlin. Soft Matter Phys. 65, 056109.
59. Maslov, S., and Ispolatov, I. (2007) Propagation of large concentration changes in reversible protein-binding networks, Proc. Natl. Acad. Sci. U. S. A. 104, 13655-13660.
60. Went, H. M., and Jackson, S. E. (2005) Ubiquitin folds through a highly polarized transition state, Prot. Eng. Des. Sel. 18, 229-237.
61. Lindorff-Larsen, K., Maragakis, P., Piana, S., and Shaw, D. E. (2016) Picosecond to Millisecond Structural Dynamics in Human Ubiquitin, J. Phys. Chem. B 120, 8313-8320.
62. Best, R. B. (2012) Atomistic molecular simulations of protein folding, Curr. Opin. Struct. Biol. 22, 52-61.
63. Lane, T. J., Shukla, D., Beauchamp, K., and Pande, V. S. (2013) To milliseconds and beyond: challenges in the simulation of protein folding, Curr. Opin. Struct. Biol. 23, 58-65.
64. Gromiha, M. M. (2007) Prediction of protein stability upon point mutations, Biochem. Soc. Transac. 35, 1569-1573.
65. Guerois, R., Nielsen, J. E., and Serrano, L. (2002) Predicting changes in the stability of proteins and protein complexes: A study of more than 1000 mutations, J. Mol. Biol. 320, 369-387.
66. Dehouck, Y., Kwasigroch, J. M., Gilis, D., and Rooman, M. (2011) PoPMuSiC 2.1: a web server for the estimation of protein stability changes upon mutation and sequence optimality, BMC Bioinfo. 12, 151.
67. Northey, J. G. B., Di Nardo, A. A., and Davidson, A. R. (2002) Hydrophobic core packing in the SH3 domain folding transition state, Nat. Struc. Biol. 9, 126-130.
68. Riddle, D. S., Grantcharova, V. P., Santiago, J. V., Alm, E., Ruczinski, I., and Baker, D. (1999) Experiment and theory highlight role of native state topology in SH3 folding, Nat. Struc. Biol. 6, 1016-1024.
69. Sivanandan, S., and Naganathan, A. N. (2013) A Disorder-Induced Domino-Like Destabilization Mechanism Governs the Folding and Functional Dynamics of the Repeat Protein IκBα, PLOS Comput. Biol. 9, e1003403.
70. Achoch, M., Dorantes-Gilardi, R., Wymant, C., Feverati, G., Salamatian, K., Vuillon, L., and Lesieur, C. (2016) Protein structural robustness to mutations: an in silico investigation, Phys. Chem. Chem. Phys. 18, 13770-13780.


71. Liu, Z., Chen, J., and Thirumalai, D. (2009) On the accuracy of inferring energetic coupling between distant sites in protein families from evolutionary imprints: Illustrations using lattice model, Proteins 77, 823-831.
72. Fodor, A. A., and Aldrich, R. W. (2004) On Evolutionary Conservation of Thermodynamic Coupling in Proteins, Proc. Natl. Acad. Sci. U. S. A. 279, 19046-19050.
73. Lee, S. Y., Pullen, L., Virgil, D. J., Castaneda, C. A., Abeykoon, D., Bolon, D. N., and Fushman, D. (2014) Alanine scan of core positions in ubiquitin reveals links between dynamics, stability, and function, J. Mol. Biol. 426, 1377-1389.
74. Korzhnev, D. M., Salvatella, X., Vendruscolo, M., Di Nardo, A. A., Davidson, A. A., Dobson, C. M., and Kay, L. E. (2005) Low-populated folding intermediates of Fyn SH3 characterized by relaxation dispersion NMR, Nature 430, 586-590.
75. Ansari, A., Berendzen, J., Bowne, S. F., Frauenfelder, H., Iben, I. E., Sauke, T. B., Shyamsunder, E., and Young, R. D. (1985) Protein states and proteinquakes, Proc. Natl. Acad. Sci. U. S. A. 82, 5000-5004.
76. Jack, B. R., Meyer, A. G., Echave, J., and Wilke, C. O. (2016) Functional Sites Induce Long-Range Evolutionary Constraints in Enzymes, PLoS Biol. 14, e1002452.

