Quantifying Protein Disorder through Measures of Excess

Article pubs.acs.org/JPCB

Quantifying Protein Disorder through Measures of Excess Conformational Entropy

Nandakumar Rajasekaran, Soundhararajan Gopi, Abhishek Narayan, and Athi N. Naganathan*

Department of Biotechnology, Bhupat & Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600036, India

Supporting Information

ABSTRACT: Intrinsically disordered proteins (IDPs) and proteins with a large degree of disorder are abundant in the proteomes of eukaryotes and viruses, and play a vital role in cellular homeostasis and disease. One fundamental question that has been raised on IDPs is the process by which they offset the entropic penalty involved in transitioning from a heterogeneous ensemble of conformations to a much smaller collection of binding-competent states. However, this has been a difficult problem to address, as the effective entropic cost of fixing residues in a folded-like conformation from disordered amino acid neighborhoods is itself not known. Moreover, there are several examples where the sequence complexity of disordered regions is as high as that of well-folded regions. Disorder in such cases therefore arises from excess conformational entropy determined entirely by correlated sequence effects, an entropic code that is yet to be identified. Here, we explore these issues by exploiting the order−disorder transitions of a helix in Pbx-Homeodomain together with a dual entropy statistical mechanical model to estimate the magnitude and sign of the excess conformational entropy of residues in disordered regions. We find that a mere 2.1-fold increase in the number of allowed conformations per residue (∼0.7kBT favoring the unfolded state) relative to a well-folded sequence, or ∼2^N additional conformations for an N-residue sequence, is sufficient to promote disorder under physiological conditions. We show that this estimate is quite robust and helps in rationalizing the thermodynamic signatures of disordered regions in important regulatory proteins, modeling the conformational folding-binding landscapes of IDPs, quantifying the stability effects characteristic of disordered protein loops and their subtle roles in determining the partitioning of folding flux in ordered domains.
In effect, the dual entropy model we propose provides a statistical thermodynamic basis for the relative conformational propensities of amino acids in folded and disordered environments in proteins. Our work thus lays the foundation for understanding and quantifying protein disorder through measures of excess conformational entropy.



INTRODUCTION

Intrinsically disordered proteins (IDPs) and disordered regions in proteins play critical roles in a variety of cellular processes in organisms ranging from viruses to eukaryotes. They display higher evolutionary rates than ordered proteins, occupy vital positions in regulatory metabolic networks, are focal points for post-translational modifications in eukaryotes, and have been implicated in a number of diseases.1−5 Since they are unstructured by definition, a number of approaches involving NMR,6,7 single-molecule measurements,8 and ensemble-based methods7,9,10 have been used in conjunction with each other to explore the nature of the conformational space sampled by IDPs. These methods generally point to the fact that the IDP landscape is structurally heterogeneous, which in turn presents the possibility of a variety of binding mechanisms to its partner proteins, effectively regulating the downstream functional response. The binding of an IDP to an ordered domain or otherwise raises several important questions, one of the more interesting ones being, how is the loss in conformational entropy upon the binding to a partner domain compensated?11 In other words, in

going from a disordered domain sampling multitudes of conformations to a reasonably well-structured complex, there is a large loss in conformational entropy. This transition needs to be offset by a sufficient gain in stabilization free energy with its partner for the complex to be stable. Several mechanisms have been proposed in this regard with the popular ones being the large buried surface area associated with IDP complexes12 and their “fuzzy” nature.13 The more relevant question in this regard is how large the loss in backbone conformational entropy is for residues in a disordered environment when they fold. Is it of the same magnitude as in going from an unfolded ensemble of conformations to a well-folded structure, as has been observed in the folding of ordered domains?14−17 Even a measure of the latter has been fraught with challenges despite decades of work. The important distinction here, however, is that IDPs are by definition unstructured under physiological conditions, the

Received: January 21, 2016 Revised: April 25, 2016


DOI: 10.1021/acs.jpcb.6b00658 J. Phys. Chem. B XXXX, XXX, XXX−XXX


The Journal of Physical Chemistry B

same environment in which unfolded states of ordered domains are only minimally populated. A large body of work has shown that, in general, the disorder tendency of IDPs originates from a combination of relatively large charge density and low sequence hydrophobicity18 and, as shown more recently, from specific but variable patterning of charges.19,20 From a physicochemical perspective, the disordered nature can therefore have simple energetic origins from electrostatic repulsion (from similarly charged groups close in sequence),20 from stretches of polar tracts (Q, N, etc.) that promote better interactions of the protein chain with solvent than with itself, and from appropriate positioning of highly flexible glycine or structure-breaking proline residues. The former two cases correspond to repeats or low-complexity sequences, say poly-Q or poly-K, that are easy to identify purely from elementary sequence analysis. However, there are many cases where sequence analysis alone is insufficient to capture the disordered nature of proteins or regions within them;3 notable examples are the disordered DNA-binding domains of Brk21 and CytR22 that have well-folded counterparts in Tc3/CENP-B and LacR/PurR, respectively (also, see below). This suggests that the origins of disorder are subtle in non-low-complexity sequences and are indicative of a hidden amino acid code originating most likely from correlated conformational entropic effects of neighboring residues. In other words, the specific patterning of amino acid residues in such sequences results in excess backbone conformational flexibility, thus promoting disorder; excess, as the disorder tendency is more than that of amino acids that are from ordered neighborhoods. These issues highlight the fact that a critical estimate lacking in the current literature on IDPs is the magnitude of this excess conformational entropy ΔΔS, potentially an easier estimate than measures of absolute entropy changes.
However, how different does the entropic penalty of a residue from a disordered sequence (ΔSd) need to be in relation to the same estimated from several different approaches for folding−unfolding transitions of ordered domains16,23 (ΔSo ∼ −16 to −18 J mol−1 K−1 res−1) to promote disorder under physiological conditions? Partitioning the residue neighborhood into two distinct classes, ordered (o) and disordered (d), the excess conformational entropy of a residue in a disordered environment can be written as

Figure 1. Partial disorder in PbxHD. (A) Structure of PbxHD (PDB id: 1LFU) highlighting the low sequence complexity (blue) and high sequence complexity disordered regions (red; helix 4). (B) Predicted disorder tendency from IUPRED (short disorder; magenta) and PONDR-FIT (cyan). The shaded area indicates helix 4 that is experimentally identified to be partially structured in the apo-form.

and hence disordered. NMR experiments reveal that two conformations of the protein coexist, a fully folded state (N) and an intermediate with the C-terminal fourth helix unstructured (N*), and that they exchange populations in a temperature-dependent manner even in the absence of DNA.25 The unique feature of this C-terminal helix is that it possesses both order-promoting (F, Y, I) and disorder-promoting (G, Q, E) residues (as per conventional definition3). This means that neither the successful physics-based predictor IUPRED26 nor the neural-network-based meta-predictor PONDR-FIT27 is able to make a distinction, and both predict this region to be ordered (Figure 1B). Here, we provide an estimate of the magnitude and sign of the excess conformational entropy of disordered residues from a detailed analysis of the folding thermodynamics of PbxHD with a simple dual-entropy statistical mechanical model. We highlight the applicability and usefulness of this term through multiple examples ranging from modeling of IDP conformational landscapes to partitioning of flux in ordered domains with small stretches of disordered regions.



METHODS

Wako−Saitô−Muñoz−Eaton (WSME) Model. The WSME model28−30 is an Ising-like statistical mechanical model with Gō-like energetics.31 In this binary description, each residue in an N-residue protein is assumed to sample two sets of conformations, folded (native) and unfolded (non-native), that are assigned the binary variables 1 and 0, respectively, thus leading to a total of 2^N structured microstates or conformations. Interactions between two residues are allowed only if all of the intervening residues are also structured. This allows for an exact solution formulation29,32 that is detailed below and in the Supporting Information. Since the number of possible unfolded-like conformations for every residue (i.e., with non-native dihedral angles) is large, the model incorporates a fundamental parameter ΔS that accounts for the entropic cost or penalty for fixing a residue in the native conformation (see below). In the current version of the WSME model with electrostatics and simplified solvation terms,33 the free energy of each microstate is

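$$\Delta\Delta S = \Delta S_{o} - \Delta S_{d} = k_B \ln(\Omega_{o}^{f}/\Omega_{o}^{u}) - k_B \ln(\Omega_{d}^{f}/\Omega_{d}^{u}) \simeq k_B \ln(\Omega_{d}^{u}/\Omega_{o}^{u})$$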

where kB is the Boltzmann constant and Ω is the multiplicity of states with the superscript denoting the specific folded-like (f) or unfolded-like (u) conformations. The term to the right directly measures the relative propensity of a residue to be unfolded given a particular environment. A procedure to extract this excess conformational flexibility of residues in disordered regions (or relative conformational entropy with respect to sequences of ordered regions) would be to characterize a protein that undergoes a partial order−disorder transition. Importantly, its structure along with the heat capacity thermogram and temperature dependent populations should also be available, as they serve as reliable controls to calibrate any model. One such protein is the 82-residue all-helical DNA-binding domain, Pbx-Homeodomain (PbxHD). The structure of PbxHD is composed of four helices with a long unstructured N-terminal tail (Figure 1A).24 The N-terminal tail is composed of a stretch of basic residues (i.e., a low-complexity sequence)

$$\Delta F = \Delta G^{\mathrm{stab}}_{m,n} - T \sum_{m}^{n} \Delta S$$

where the effective stabilization free energy of a particular microstate (m, n) (i.e., a string of 1’s between and including m and n) at a temperature T is represented as a sum of van der


performed for the WW domain starting from the unfolded state (a string of 0’s) for both of the scenarios (with and without correction for the excess conformational entropy of disordered residues) under conditions of iso-stability and continued until the progress variable, the number of structured residues, reaches 29. In calculating the order of formation of secondary structures and flux from the transition path (Supporting Information), if 70% of the residues within a secondary structure element forms and stays above this threshold for the remainder of the MC run, it is considered to be formed. Thresholds of 50−80% reveal little changes in the order of secondary structure formation. Far-UV Circular Dichroism. The peptide, Ac-KKNIGKFQEEANIYAAKTA-NH2 (acetylated and amidated at the N- and C-termini, respectively), corresponding to the fourth helix of PbxHD, was chemically synthesized through the standard Fmoc-chemistry by Genscript, USA. Temperature dependent far-UV circular dichroism (CD) spectra were measured at pH 6.0 and 30/150 mM ionic strength conditions on a JASCO J-815 spectrophotometer in a 1 mm path length cuvette.
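The microstate counts quoted for the Monte Carlo kinetics can be checked with a short counting argument. The sketch below assumes that a microstate with k folded stretches is fixed by choosing its 2k start/end boundaries among the N + 1 gaps of an N-residue chain; the function name `n_microstates` is ours and does not come from the original work.

```python
from math import comb

def n_microstates(n_residues, max_islands):
    """Count binary microstates with 1..max_islands contiguous folded stretches.

    Assumption: a state with k stretches of 1's corresponds to choosing
    2k boundaries out of the n_residues + 1 gaps along the chain.
    """
    return sum(comb(n_residues + 1, 2 * k) for k in range(1, max_islands + 1))

# Single + double + triple sequence approximations for a 30-residue chain
restricted = n_microstates(30, 3)

# Unrestricted space: every residue is either 0 or 1
unrestricted = 2 ** 30

print(restricted, unrestricted)
```

Restricting the chain to at most three nucleation points therefore shrinks the space by more than three orders of magnitude, which is what makes exhaustive bookkeeping of the kinetics tractable.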

Waals interactions (EvdW), electrostatic potential (Eelec), and solvation free energy (ΔGsolv):

$$\Delta G^{\mathrm{stab}}_{m,n} = E_{\mathrm{vdW}} + E_{\mathrm{elec}} + \Delta G_{\mathrm{solv}}$$

The interacting partners that contribute to the van der Waals energy (EvdW) are identified by setting a distance cutoff (rcut) to the pairwise heavy-atom partners (i, j) calculated from the PDB file

$$E_{\mathrm{vdW}} = \sum_{m,n} \xi_{i,j}\,\rho$$

where ρ = 1 if rij ≤ rcut and ρ = 0 otherwise. In the current work, a uniform heavy-atom cutoff of 5 Å is employed for all proteins with a mean-field vdW interaction energy ξ. A Debye−Hückel (DH) treatment is employed for the electrostatic potential term

$$E_{\mathrm{elec}} = \sum_{m,n} K_{\mathrm{Coulomb}}\, \frac{q_i q_j}{\varepsilon_{\mathrm{eff}}\, r_{ij}} \exp(-r_{ij}\kappa)$$



where KCoulomb is the Coulomb constant (1389 kJ·Å/mol), qi is the charge on atom i, rij is the distance between charge centers i and j, and εeff is the effective dielectric constant. The magnitude of the effective dielectric constant is fixed to 29 from previous calibrations that involved comparing four different homologous protein families33 and 138 single point mutations involving charged residues.34 1/κ is the Debye screening length that depends on εeff, solvent ionic strength (I), and temperature (T). The solvation free energy is taken to be proportional to the number of native contacts ($x^{m,n}_{\mathrm{cont}}$) in that microstate, with the proportionality constant being $\Delta C_p^{\mathrm{cont}}$, which is the temperature-independent heat capacity change upon fixing a native contact. Therefore,

RESULTS AND DISCUSSION

Estimating the Magnitude of ΔΔS. To estimate the relative backbone conformational entropy (ΔΔS) of the C-terminal disordered-like region of PbxHD, we characterize the heat capacity profile employing the latest version of the Wako−Saitô−Muñoz−Eaton (WSME) statistical mechanical model29,30 that incorporates energetic terms for van der Waals interactions, Debye−Hückel electrostatics, and solvation.33,34 The only difference between the current analysis and the previous works10,11 is that we employ two values for the backbone conformational entropic cost: ΔSo for the ordered region (residues 1−63) and ΔSd for the disordered region (residues 64−82; see Methods and the Supporting Information). To exactly fit the experimental heat capacity profile25 (Figure 2A), we modulate four thermodynamic parameters: two for the entropic penalty, one for the strength of vdW interactions, and one for the heat capacity change (see Table S1 for the parameters). We simultaneously estimate the relative populations of both N and N* at 283 K and compare against the experimental estimates. We do not include any excess entropy correction for the low sequence complexity N-terminal region (blue in Figure 1), as the string of positively charged residues provides sufficient electrostatic repulsion to disallow structure formation. Panels B and C of Figure 2 indicate that, when both the backbone conformational entropy terms are identical, N is the most populated state; this is contrary to the experimental observation that N is just 10% of the total population with N* accounting for the remaining 90% at 283 K.25 However, as we systematically increase the magnitude of ΔSd with respect to ΔSo, the relative population of N decreases in a sigmoidal manner (Figure 2C).
The best agreement between the experimental population of N at 283 K and the model is obtained at a ΔΔS value of ∼6.1 J mol−1 K−1 (ΔSd = −24.3 and ΔSo = −18.2 J mol−1 K−1), which corresponds to a 2.1-fold increase in the number of microstates sampled per residue for amino acids in disordered high-complexity regions. This value reproduces not only the shape of the thermogram but also the temperature dependence of the N−N* populations, thus validating the estimated parameters (Figure 2D). It is important to note that a contact-map

$$\Delta G_{\mathrm{solv}} = x^{m,n}_{\mathrm{cont}}\, \Delta C_p^{\mathrm{cont}} \left[(T - T_{\mathrm{ref}}) - T \ln(T/T_{\mathrm{ref}})\right]$$

where Tref is the reference temperature which is fixed to 385 K.16 The total partition function (Z) is calculated from the transfer matrix formalism of Wako and Saitô as described in the earlier works29 and in the Supporting Information. The various experimental observables are calculated from the derivatives of the partition function (Supporting Information). The only difference in the current work is that apart from the uniform entropy costs we perform an additional calculation that invokes two different entropy costs: one ΔSo for ordered regions or regions that are identified by STRIDE35 to populate the known secondary structure basins in the Ramachandran map and a ΔSd for disordered regions as shown from experiments or identified as coil by STRIDE.

Monte Carlo Kinetics. Single bond-flip kinetics was performed in the same manner as before36,37 but by restricting the number of independent folding nucleation points to three for the 30-residue WW domain. This effectively reduced the conformational space to a more amenable 768 211 microstates (single + double + triple sequence approximations) compared to the 2^30 microstates when not restricting the number of nucleation points. In the MC kinetics, a residue is randomly chosen and flipped (0 → 1 or 1 → 0) and the move is accepted or rejected according to the Metropolis criterion. The free energy of each microstate is calculated from the parameters in Table S1 (Supporting Information). 1000 independent runs were


exhaustive enumeration studies.40 The magnitude is also in agreement with an amino acid triad based study that points to a near 2-fold increased sampling of disordered residues relative to residues in the ordered region,41 and the sign is consistent with predictions from analytical treatments.42 In effect, the magnitude and sign of ΔΔS is within the range expected from different approaches and we provide a first thermodynamic estimate of the same. We further validate the magnitude of the excess conformational entropy while simultaneously highlighting its application in different scenarios below. Modeling the Conformational Landscapes of Supertertiary Structures. Apart from PbxHD, a large number of systems have now been shown to exhibit two conformational phases wherein a part of the protein is folded and another region exhibits disorder. These proteins have also been referred to as possessing a “supertertiary structure”.43 The disordered region then folds upon binding to its cognate partner. Modeling or understanding the conformational landscape of such systems are difficult, as only the completely folded conformation in the presence of the partner proteins is available. In some cases, however, there is strong experimental evidence for the regions of the protein that are disordered in the absence of its binding partner. One such example is the tumor suppressor protein IκBα that together with NF-κB maintains cellular homeostasis. IκBα is a 6-ankyrin repeat (AR) domain in which the sixth AR is fully disordered and the fifth AR is partially disordered (Figure 3A).44 The disorder tendency is again not captured by disorder prediction servers (Figure S3) signaling that they are high complexity regions.45 In fact, the conformational behavior of IκBα is similar to that of PbxHD but at a much larger length scale, as the entire repeat or parts of the repeat domains are disordered. 
Since the average protection factor for each repeat is known,45 we use this experimental data as a control to systematically increase the entropic penalty for different repeats or parts of them (i.e., we use a ΔΔS value of 6.1 J mol−1 K−1 while maintaining the experimental Tm of 315 K) and calculate the residue probabilities that can be used as a proxy for local stability (red in Figure 3B, calculated using eq 4 in the Supporting Information). The best agreement with the experimental protection factors is obtained for the following combination of disordered regions: the entire sixth AR, the N-terminal half of the first AR, and the C-terminal half of the fifth AR (Table S2 and Figure 3C). The excess entropy correction explains ∼92% of the variance in the data (r = 0.96, r2 ≈ 0.92), which is better than the contact-map correction based approach (r2 = 0.76) that relied on tuning the strength of native interactions.46 This exercise also reveals a distinct step-like trend in the order of stability of the different ankyrin repeats that extends beyond the perturbed repeat and that has been termed the “domino-like” destabilization mechanism;46 in other words, the disorder in one repeat results in the loss of one inter-repeat interface that in turn destabilizes the next repeat and so on (red in Figure 3B). The calculated 1D free energy profile and free energy surface in the presence of disorder (red in Figure 3D and Figure 3E) are dramatically different from those predicted from the fully folded structure (blue in Figure 3D), with a broad ensemble involving 110−160 structured residues and with ARs 1−4 the most structured (Figure 3B). This is in accordance with single-molecule FRET experiments that point to large changes in end-to-end distances with temperature when the donor−acceptor pair is labeled in ARs 2 and 6 or ARs 3 and 6

Figure 2. Estimating the magnitude of ΔΔS. (A) An illustrative example of the WSME model fit (line) to the DSC curve of PbxHD (circles). (B and C) The free energy profile at 283 K (in kJ mol−1) as a function of the reaction coordinate (RC), the number of structured residues. The color-coding corresponds to different values of relative entropy (ΔΔS), as shown in panel C. (D) Correlation between experimental and predicted populations of the fully folded state (N) at the indicated temperatures.
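As a rough numerical check of the estimate discussed in this section, the snippet below converts a per-residue ΔΔS into a fold increase in accessible conformations and into units of kBT, and then compounds the per-residue factor over a helix. It is a minimal sketch that assumes ΔΔS ≃ kB ln(Ωdu/Ωou) per residue and treats residues as independent; the function names are illustrative and are not taken from the original work.

```python
import math

R = 8.314  # molar gas constant in J mol^-1 K^-1 (k_B per mole)

def fold_increase(dds_j_per_mol_k):
    """Per-residue ratio of unfolded-like multiplicities, exp(ddS / R)."""
    return math.exp(dds_j_per_mol_k / R)

def compounded_states(per_residue_ratio, n_residues):
    """Multiplicity factor for n_residues, assuming independent residues."""
    return per_residue_ratio ** n_residues

dds = 6.1                                   # J mol^-1 K^-1 per residue (PbxHD estimate)
ratio = fold_increase(dds)                  # close to the quoted 2.1-fold value
dds_in_kb = dds / R                         # T*ddS in units of kB*T; T cancels out
helix_factor = compounded_states(2.1, 13)   # 13-residue helix, rounded 2.1-fold factor

print(f"fold increase per residue: {ratio:.2f}")
print(f"T*ddS in units of kBT:     {dds_in_kb:.2f}")
print(f"13-residue helix factor:   {helix_factor:.0f}")
```

Using the unrounded per-residue ratio instead of 2.1 gives a somewhat smaller helix factor, so the ∼15 500 figure is best read as an order-of-magnitude statement.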

correction eliminating long-range contacts formed by the residues of the fourth helix does not identify an intermediate or accurately reproduce the experimental populations of N (Figure S1). Moreover, helix 4 is unstructured in isolation (NMR signatures in refs 24 and 38 and far-UV CD data in Figure S2), pointing to the fact that disorder is intrinsic to the sequence. At the outset, the magnitude of this relative backbone conformational entropy term seems very small and is just about ∼0.73kBT (at 298 K). With entropy being an extensive property, however, the total number of additional accessible microstates for the entire helix will be the product of the individual residue terms. For the 13-residue C-terminal helix, this value translates to an increased sampling of ∼15 500 microstates. This is highly significant and dominates over other stabilization energetic terms coming from interactions with the first and third helix, thus promoting disorder. The magnitude of the per-residue term is also lower than the ∼3.4-fold increased conformational sampling of an Ala → Gly substitution in an α-helix (that can be thought of as an upper limit),15 possibly as a result of the breakdown of Flory’s isolated pair hypothesis.39 In other words, steric effects restrict the accessibility to all possible conformational states, as shown for short helices from near-


the protein CBP.48,49 In the bound form, ACTR displays three short helices. Mutational experiments indicate that the rate of binding of ACTR to its partner increases, and hence that binding becomes stronger, upon systematically increasing the helicity of helix 1, thus providing evidence for the mechanism of conformational selection;50 helix 1 is therefore expected to be partially structured in the unbound ensemble with helices 2 and 3 disordered. How does the conformational landscape of ACTR respond to the changing propensities of helix 1? To answer this question, we directly reproduce the temperature dependence of helicity and the mean helical fraction in helix 151 employing the dual entropy model with a tunable ΔΔS (Figure 4A,B). We find

Figure 3. Modeling disorder in IκBα. (A) Structure of the ankyrin repeat protein IκBα. (B) Predicted residue probabilities of IκBα at 298 K employing a uniform entropic cost for all residues, i.e., from the NFκB bound conformation (blue). The second and fourth repeats are shaded for reference. (B and C) Predicted residue probabilities employing the entropic penalty for disordered residues for the sixth repeat and the N- and C-terminal halves of the first and fifth repeats, respectively, that reproduce the experimental amide exchange data (panel C). (D) 1D free energy profiles (in kJ mol−1) for the examples discussed above as a function of the number of structured residues as the reaction coordinate (RC). (E) Free energy surface employing a single-sequence-approximation (SSA) representation of the folding landscape (in kJ mol−1); n stands for the number of structured residues, while m stands for the starting residue. The structural identity of any microstate on this landscape can simply be obtained from the coordinate (m, n). N and U indicate the native and unfolded states, respectively.

Figure 4. Folding-binding landscape of ACTR. The colors blue, red, and black represent the conformational behavior of ACTR in the absence of its binding partner, upon structure enhancing mutations in helix 1 and a near-fully folded structure, respectively. (A) Temperature dependent helical fractions (circles) and the WSME model predicted unfolding curves. The arrow indicates the effect of mutations in helix 1 that enhance helical propensity. (B) The residue probabilities as a function of sequence index; helical regions are shaded in blue. (C and D) The 1D free energy profiles (in kJ mol−1) and the corresponding probability densities at 278 K as a function of the reaction coordinate (RC), the number of structured residues. The subensembles corresponding to the fully unfolded state, partial structured helix 1 and near-fully “folded” state are labeled as a, b, and c, respectively.

but only minor changes when labeled in ARs 1 and 4.47 We note that our approach is extendable to any other system as long as the disordered tendency from prediction servers or experiments together with the equilibrium unfolding data are available. Folding-Binding Landscape of ACTR. The binding mechanisms of IDPs to ordered partners are varied and multiple scenarios, ranging from induced fit to conformational selection or even a combination of both, have been proposed.12 It will therefore be advantageous to have a simple method to model the conformational landscape of the numerous short IDPs in the absence of their binding partners using simple thermodynamic observations. In this regard, we employ the protein ACTR as a model system below together with the binary entropy WSME model. The transactivation domain of the mammalian ACTR (referred to as ACTR from hereon), a protein implicated in breast cancer and antibiotic resistance, is completely disordered in the absence of its partner, the nuclear coactivator domain of

a very good agreement between these two experimental variables and model calculations at a relative entropic penalty of 3.26 J mol−1 K−1 per residue (ΔSd = −24.3 and ΔSo = −21.0 J mol−1 K−1; blue in Figure 4A and see also Figure S4). The corresponding 1D free energy profile and probability density point to a distribution with a large degree of disorder but with a small shoulder at ∼10 structured residues (blue in Figure 4C,D). Modulating ΔSo of helix 1 residues to range from −21.0 to −18.2 J mol−1 K−1 (i.e., reducing the entropic penalty), we can systematically increase its helicity, which in turn changes the nature of the free energy profile. The free energy profile of ACTR when helix 1 is near fully folded now resembles a one-state-like system with a small bump separating the completely unfolded state (subensemble a) and the ensemble in which helix 1 is structured (subensemble b; Figure 4C,D). One-state-like behavior has been previously predicted for systems with low nonlocal contact density from different approaches,52−54 indicating that this observation might be a generic feature of IDPs and molten-globule-like proteins as originally proposed.55


interactions determining the large differences in thermodynamic barriers (and hence rates) between these two systems but failed to account for the similar unfolding midpoints.33 To explore if this disagreement arises from small differences in the number of disordered residues between these two proteins, we first reproduce the EnHD experimental Tm of 325 K as before by modulating the basic thermodynamic parameters and with a uniform entropic cost for all residues (green in Figure 5B). Using identical parameters and replacing the contact map of EnHD with that of hTRF1, we predict a Tm of 340 K, which is 15 K higher than that of EnHD and in disagreement with experimental estimates56 (blue in Figure 5B). A closer look at the structure, however, reveals that a secondary difference between the two proteins is the length of the loop connecting helices 2 and 3: in EnHD, it is 4 residues in length, while it is 8 residues in hTRF1 (Figure 5A). To account for this difference, we repeat the above procedure but explicitly introduce a higher intrinsic conformational entropy for the loop regions (ΔΔS value of 6.1 J mol−1 K−1) in hTRF1. This calculation now predicts a Tm of 330 K that is closer to the expectation from experiments (red in Figure 5B). A related application of the ΔΔS estimate is the ability to model the changes in stability induced by loop truncation or elongation in a given protein through elementary thermodynamic considerations. For example, recent experiments on mAcP reveal large differences in stability upon such loop modifications.57 Since the structure of the mutants is not available, we assume that the effect of introducing (or eliminating) a residue in the loop would be proportional to the entropic component of the free energy, i.e., 1.8 kJ mol−1 per residue, while fixing every other model parameter.
With such a calculation, not only is the predicted change in Tm’s in qualitative agreement with experimental observations57 (Figure 5C), but the agreement can also be extended to a quantitative level by directly comparing the induced changes in stabilities (ΔΔG) from experiments that results in a correlation of 0.84 (Figure 5D). The only outlier (arrow in Figure 5D) is a loop extension mutant with 12 extra residues than the WT, indicating a possible breakdown of our approach or potentially arising from complicated structural effects that are not taken into consideration in our structure-based model (mutant structures are not available). It is however important to note that our predictions are better than estimates from polymer models for loop entropy. For example, when the loop has six extra residues than the WT, our model predicts a change in stability of ∼11 kJ mol−1 which compares reasonably well with experimental estimates of ∼8 kJ mol−1, while the Chan−Dill model for loop entropy58 points to a modest 3−4 kJ mol−1 change. Because disordered regions generally display higher evolutionary rates,59,60 and as even small differences in the number of disordered residues can translate to significant stability changes (evident from examples above), it is tempting to suggest that Nature has exploited this strategy to tune the stability of similar proteins. In fact, there is subtle evidence to this assertion in the structural comparisons of proteins from thermophilic and mesophilic organisms. Proteins from thermophiles are generally thought to be more stable than their mesophilic counterparts due to a larger network of charge−charge interactions and better packing.61 However, large-scale analysis of homologous protein structures in the two classes has also revealed an interesting trend in which thermophilic proteins have been shown to possess a smaller fraction of disordered residues

We extend the calculations to check for the nature of the population distribution when all three helices are completely folded (i.e., ΔSo = −18.2 J mol−1 K−1 for all residues in the helices; black in Figure 4B). This results in a broad free energy profile with three distinct subensembles corresponding to the fully unfolded state, partially structured helix 1, and fully structured helices 1 and 2 (a, b, and c, respectively, in Figure 4D and the distribution in black). Helix 3 is only weakly folded even upon the dramatic change in the entropic penalty to that of a fully folded state. This suggests that the interactions within helix 3 are not sufficient to promote folding and that folding could be “induced” by interactions with the partner NCBD. In fact, the third helix displays a slightly higher RMSD and standard deviation within the NMR models (0.34 ± 0.15 Å) compared to helices 1 and 2 (0.24 ± 0.08 Å), providing indirect evidence for our statement. Effectively, we show that it is possible to obtain physical insights into the tunable conformational landscape of IDPs with this dual entropy WSME model in conjunction with coarse but important equilibrium thermodynamic observables.

Toward a Better Understanding of the Origins of Differences in Thermodynamic Stabilities. The DNA-binding domains of Engrailed homeodomain (EnHD) and hTRF1 serve as an interesting example. These structurally homologous proteins play critical roles in eukaryotic cellular processes, with EnHD controlling the embryonic development in Drosophila and hTRF1 determining the length of telomeric repeats. Despite possessing similar structures (Figure 5A), EnHD folds up to 3 orders of magnitude faster than hTRF1 while displaying a near-identical chemical unfolding midpoint.56 A previous analysis using the WSME model with uniform conformational entropy pointed to specific electrostatic

Figure 5. The effect of short disordered stretches on thermodynamic stability. (A) Structural alignment of EnHD and hTRF1 with short (magenta) and long (cyan) loops, respectively. (B) Predicted heat capacity profiles (in kJ mol−1 K−1) of EnHD (green) and hTRF1 with (red) and without (blue) the correction for the excess entropy of disordered residues. (C) Predicted changes in melting temperatures for the loop truncation (red) and extension (blue) mutants of mAcP. (D) Experimental and predicted changes in stability in kJ mol−1 with respect to the WT mAcP (empty circle) at 298 K following the color code in panel C. The arrow indicates an outlier that corresponds to a mutant with 12 extra residues in the loop compared to the WT.

DOI: 10.1021/acs.jpcb.6b00658 J. Phys. Chem. B XXXX, XXX, XXX−XXX


(∼16%) compared to mesophilic proteins (∼19%) on average.62 In other words, with every other interaction type (packing, electrostatics, etc.) being identical between the two classes of proteins, the larger number of disordered residues in mesophilic proteins alone is expected to destabilize them much more than their thermophilic cousins. These results also highlight one of the possible reasons for the challenges faced in predicting protein stabilities (even from structure-based approaches63), and our proposed entropic corrections could be one of the steps toward addressing them.

Role of Disordered Regions in the Partitioning of Folding Flux. All of the calculations above pertain to either changes in equilibrium populations or changes in conformational stability induced by disordered residues. It is natural to expect that disordered residues could also have an effect on the folding mechanisms. This is because many proteins have disordered stretches between secondary structure elements that will reduce the local stability; this is expected to lower the statistical weight of microstates with disordered regions and hence their probability and, by extension, the relative folding flux through them. In this regard, the folding of the three-stranded Fip35 WW domain presents an interesting case (Figure 6A). Experiments reveal that folding of this WW domain involves two alternative pathways (or intermediate states) with near-identical time constants: one in which hairpin 1 is formed first (H1; τ ∼ 10 μs) and the other in which hairpin 2 is formed first (H2; τ ∼ 15 μs).64 The exact solution to the WSME model with a uniform conformational entropy cost and assuming a Tm of 333 K predicts a large local stability for strand 1 (ΔG < 0) with strands 2 and 3 only ∼50% folded (ΔG ∼ 0; blue in Figure 6B). This suggests a strongly biased folding through strand 1 from purely thermodynamic expectations.
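The link between a local stability and a folded fraction used in the preceding sentences (ΔG ∼ 0 corresponding to ∼50% folded) can be sketched with a simple two-state Boltzmann weight. This is a minimal illustration, assuming each residue or strand behaves as an independent two-state unit, which is a simplification of the WSME treatment; the function name `folded_fraction` and the example value of −5 kJ mol−1 are hypothetical.

```python
import math

R = 8.314  # gas constant in J mol^-1 K^-1


def folded_fraction(delta_g_kj, temperature=333.0):
    """Two-state folded population for a local stability delta_g (kJ/mol).

    delta_g_kj is the free energy of the folded state relative to the
    unfolded state, so negative values favor the folded state.
    """
    return 1.0 / (1.0 + math.exp(delta_g_kj * 1000.0 / (R * temperature)))


if __name__ == "__main__":
    print(folded_fraction(0.0))             # 0.5, i.e. half folded
    print(round(folded_fraction(-5.0), 2))  # about 0.86 for a stable strand
```

A locally stable element (ΔG < 0) is thus populated well above 50% at the midpoint temperature, which is why a thermodynamic bias toward strand 1 is expected in the uniform-entropy model.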
To further confirm this, we perform 1000 single bond-flip Monte Carlo (MC) kinetic simulations on a landscape of 768 211 microstates at the midpoint temperature to extract the partitioning of the folding flux (if any; see Figure 6C, Figure S5, and Methods). We find that hairpin 1 is formed first in ∼68% of the runs with a smaller fraction (∼32%) folding through hairpin 2 (Figure 6D). Though the overall results are in agreement with experiments, the magnitudes of the flux are expected to be near equal and this is not captured in the original WSME model (that predicts 68:32 partitioning for H1:H2) or in all-atom MD simulations65 (that predict 100% flux through hairpin 1). One reason for this observation could be the large thermodynamic stability of hairpin 1 predicted by the WSME model (blue in Figure 6B) and that is also observed in MD simulations.65 Structurally, however, hairpins 1 and 2 are very different, as strands 1 and 2 are connected by a longer disordered loop that is functionally critical,66,67 while a tight turn connects strands 2 and 3 (Figure 6A). To account for this, we incorporate the estimated ΔΔS term for the loop residues connecting strands 1 and 2 and the disordered C-terminal residues in the WSME model, while maintaining a Tm of 333 K (see Supporting Information Table S1). This correction does not change the thermodynamic barrier in the one-dimensional projection though subtle changes in the free-energy profile are evident (Figure S6). Single bond-flip MC kinetics on the landscape generated from this more realistic model now reveals a similar free-energy surface but with a 54:46 partitioning of the folding flux in good agreement with published estimates64 (Figures 6E, S7, and S8). Thus, though strand 1 exhibits large local stability, hairpin 1 is now unable to fold significantly faster

Figure 6. The effect of short disordered stretches on folding flux. (A) Structure of the Fip35 WW domain. (B) Local stability profile without (blue) and with (red) correction for the excess conformational entropy of disordered residues at iso-stability conditions. Shaded areas highlight the β-strands. (C) A representative example of a long MC run displaying multiple folding−unfolding transitions (gray) near the thermal unfolding midpoint. A running average is shown in blue for visual clarity. (D) Free energy surface (units of kJ mol−1) constructed by lumping together the partial partition functions up to the triple-sequence approximation of the WSME model with a uniform entropic cost. The reaction coordinates are the first 15 structured residues of the N-terminus (nN-term) and C-terminus (nC-term). The white curves are the projection of the MC simulation starting from the unfolded state (U) and moving toward either hairpin 1 first (H1; 68%) or hairpin 2 first (H2; 32%) and finally to the fully folded state (N). (E) Same as panel D but with a free-energy surface generated by correction for the excess conformational entropy of disordered residues.

than hairpin 2 simply because of the additional destabilization provided by the longer loop connecting strands 1 and 2. The corollary is that shortening of this loop would increase the folding rate while also increasing the thermodynamic stability. This is exactly what is observed in mutational studies on the homologous Pin WW domain66 supporting the results of our calculations.
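The flux partitioning reported above amounts to counting, over many kinetic runs, which hairpin becomes structured first. The sketch below shows one way such bookkeeping could be done; it is not the authors' analysis code. The trajectory format (a sequence of (nN-term, nC-term) pairs), the threshold of 12 structured residues, and the names `first_hairpin` and `flux_partition` are all assumptions made for illustration.

```python
from collections import Counter


def first_hairpin(trajectory, threshold=12):
    """Classify one run by the hairpin that forms first.

    trajectory: iterable of (n_nterm, n_cterm) pairs, the number of structured
    residues on each reaction coordinate at successive MC steps.
    Returns "H1", "H2", or None if neither coordinate crosses the threshold
    alone (including the ambiguous case where both cross on the same step).
    """
    for n_nterm, n_cterm in trajectory:
        h1_formed = n_nterm >= threshold
        h2_formed = n_cterm >= threshold
        if h1_formed and h2_formed:
            return None
        if h1_formed:
            return "H1"
        if h2_formed:
            return "H2"
    return None


def flux_partition(trajectories, threshold=12):
    """Return the fractions of classified runs going through H1 and H2."""
    counts = Counter(first_hairpin(t, threshold) for t in trajectories)
    classified = counts["H1"] + counts["H2"]
    if classified == 0:
        raise ValueError("no trajectory formed a hairpin")
    return counts["H1"] / classified, counts["H2"] / classified


if __name__ == "__main__":
    # Three made-up runs, used only to exercise the functions.
    toy_runs = [
        [(0, 0), (5, 2), (12, 3), (15, 15)],  # hairpin 1 first
        [(0, 0), (3, 8), (4, 13), (15, 15)],  # hairpin 2 first
        [(0, 0), (13, 1), (15, 15)],          # hairpin 1 first
    ]
    print(flux_partition(toy_runs))
```

With real simulation output, the same counting over 1000 runs would give ratios of the kind quoted in the text (68:32 or 54:46), though the exact numbers depend on how "formed first" is defined.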



CONCLUSION

We provide a first thermodynamic estimate of the excess conformational entropy of disordered residues in proteins (ΔΔS ∼ 6.1 J mol−1 K−1 per residue); this corresponds to a 2-fold increase in the number of microstates sampled per residue for amino acids in disordered high-complexity regions compared to the residue environment in the ordered region.
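The conversion between the excess entropy and the fold increase in microstates follows from the Boltzmann relation ΔΔS = R ln(Ω_disordered/Ω_ordered). The short sketch below carries out this arithmetic; the function names are hypothetical and the temperature of 298 K is an assumption used only for the free-energy figure.

```python
import math

R = 8.314  # gas constant in J mol^-1 K^-1


def fold_increase(excess_entropy):
    """Ratio of accessible microstates per residue for a given excess entropy."""
    return math.exp(excess_entropy / R)


def entropic_free_energy(excess_entropy, temperature=298.0):
    """T * excess entropy, returned in kJ/mol per residue."""
    return temperature * excess_entropy / 1000.0


if __name__ == "__main__":
    dds = 6.1  # J mol^-1 K^-1 per residue
    print(round(fold_increase(dds), 2))         # about 2.08-fold more microstates
    print(round(dds / R, 2))                    # about 0.73 in units of k_B
    print(round(entropic_free_energy(dds), 2))  # about 1.82 kJ/mol at 298 K
```

For an N-residue disordered stretch, the per-residue ratio compounds multiplicatively, which is the origin of the ∼2^N scaling in the number of additional conformations.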


We extract this number through a detailed analysis of the unfolding thermodynamics of PbxHD, a protein for which both the heat capacity profile and temperature-dependent populations of the partially structured C-terminal helix are available, enabling an experimentally constrained quantification. This excess entropy term should be seen as an average measure with fluctuations around this mean value representing specific sequence, size (length of disordered region), or solvent effects. It is important to note that our method does not predict or identify regions that could be disordered but provides a simple yet physical mean-field approach to model the subtle effects of disordered residues once they are identified by experimental measures, structure, or sequence-based tools. The estimate should work well for short high-complexity disordered regions (4−20 residues) beyond which correlated sequence effects can strongly dominate through confounding excluded volume effects. For regions of low complexity dominated by charged residues, it is not necessary to include an excess conformational entropy term, as the disorder will be an emergent property of these systems arising simply from charge−charge repulsion. However, such charge−charge interactions can also be favorable and offset the disorder tendency to promote folding through specific post-translational modifications (particularly phosphorylation) frequently observed in the disordered regions. This has been highlighted in the recent detailed experimental study on the protein 4E-BP2,68 whose conformational behavior is also captured very well by the current model,69 attesting to the power of the thermodynamic approach we propose here.

Given the ever-increasing interest in IDPs, we show that the ΔΔS estimate can have potential applications in more accurate prediction of protein stabilities, modeling of disorder in domains that undergo folding upon binding, and in estimating the origins and magnitude of the partitioning of folding flux associated with ordered domains with disordered segments. Our estimate could be employed as a possible correction to, or constraint on, the backbone torsional potentials used in molecular simulations, particularly in native-centric coarse-grained approaches. It further presents an alternative statistical thermodynamic estimate, in terms of the microstate density, of the flexibility associated with loops that are conventionally modeled through polymer physics theories. We believe that additional thermodynamic measurements of microscopic populations (as in PbxHD) and scanning calorimetry measurements upon systematic modulation of sequence pattern, length, and composition could help in quantifying fluctuations expected in the magnitude of the ΔΔS term for high-complexity disordered regions, thus potentially revealing the elusive sequence-structure code, not at the residue level but in terms of the relative amino acid neighborhoods.

Author Contributions

All authors have given approval to the final version of the manuscript.

Notes

The authors declare no competing financial interest.



ACKNOWLEDGMENTS

A.N.N. is a Wellcome Trust/DBT India Alliance Intermediate Fellow.



ABBREVIATIONS

WSME, Wako−Saitô−Muñoz−Eaton; DH, Debye−Hückel; RC, reaction coordinate; NMR, nuclear magnetic resonance; CD, circular dichroism; MC, Monte Carlo



(1) Dyson, H. J.; Wright, P. E. Intrinsically Unstructured Proteins and Their Functions. Nat. Rev. Mol. Cell Biol. 2005, 6, 197−208. (2) Babu, M. M.; van der Lee, R.; de Groot, N. S.; Gsponer, J. Intrinsically Disordered Proteins: Regulation and Disease. Curr. Opin. Struct. Biol. 2011, 21, 432−440. (3) Uversky, V. N. A Decade and a Half of Protein Intrinsic Disorder: Biology Still Waits for Physics. Protein Sci. 2013, 22, 693−724. (4) van der Lee, R.; Buljan, M.; Lang, B.; Weatheritt, R. J.; Daughdrill, G. W.; Dunker, A. K.; Fuxreiter, M.; Gough, J.; Gsponer, J.; Jones, D. T.; et al. Classification of Intrinsically Disordered Regions and Proteins. Chem. Rev. 2014, 114, 6589−6631. (5) Das, R. K.; Ruff, K. M.; Pappu, R. V. Relating Sequence Encoded Information to Form and Function of Intrinsically Disordered Proteins. Curr. Opin. Struct. Biol. 2015, 32, 102−112. (6) Mittag, T.; Forman-Kay, J. D. Atomic-Level Characterization of Disordered Protein Ensembles. Curr. Opin. Struct. Biol. 2007, 17, 3− 14. (7) Jensen, M. R.; Ruigrok, R. W.; Blackledge, M. Describing Intrinsically Disordered Proteins at Atomic Resolution by NMR. Curr. Opin. Struct. Biol. 2013, 23, 426−435. (8) Brucale, M.; Schuler, B.; Samorì, B. Single-Molecule Studies of Intrinsically Disordered Proteins. Chem. Rev. 2014, 114, 3281−3317. (9) Ganguly, D.; Chen, J. H. Topology-Based Modeling of Intrinsically Disordered Proteins: Balancing Intrinsic Folding and Intermolecular Interactions. Proteins: Struct., Funct., Genet. 2011, 79, 1251−1266. (10) Baker, C. M.; Best, R. B. Insights into the Binding of Intrinsically Disordered Proteins from Molecular Dynamics Simulation. WIREs Comput. Mol. Sci. 2014, 4, 182−198. (11) Flock, T.; Weatheritt, R. J.; Latysheva, N. S.; Babu, M. M. Controlling Entropy to Tune the Functions of Intrinsically Disordered Regions. Curr. Opin. Struct. Biol. 2014, 26C, 62−72. (12) Wright, P. E.; Dyson, H. J. 
Intrinsically Unstructured Proteins: Re-Assessing the Protein Structure-Function Paradigm. J. Mol. Biol. 1999, 293, 321−331. (13) Fuxreiter, M.; Tompa, P. Fuzzy Complexes: A More Stochastic View of Protein Function. Adv. Exp. Med. Biol. 2012, 725, 1−14. (14) Muñoz, V.; Serrano, L. Intrinsic Secondary Structure Propensities of the Amino-Acids, Using Statistical Phi-Psi Matrices Comparison with Experimental Scales. Proteins: Struct., Funct., Genet. 1994, 20, 301−311. (15) D'Aquino, J. A.; Gomez, J.; Hilser, V. J.; Lee, K. H.; Amzel, L. M.; Freire, E. The Magnitude of the Backbone Conformational Entropy Change in Protein Folding. Proteins: Struct., Funct., Genet. 1996, 25, 143−156. (16) Robertson, A. D.; Murphy, K. P. Protein Structure and the Energetics of Protein Stability. Chem. Rev. 1997, 97, 1251−1267. (17) Baxa, M. C.; Haddadian, E. J.; Jumper, J. M.; Freed, K. F.; Sosnick, T. R. Loss of Conformational Entropy in Protein Folding Calculated Using Realistic Ensembles and Its Implications for NMR-

ASSOCIATED CONTENT

Supporting Information

The Supporting Information is available free of charge on the ACS Publications website at DOI: 10.1021/acs.jpcb.6b00658.



REFERENCES

WSME model equations, parameters, and supplementary figures supporting the conclusions (PDF)

AUTHOR INFORMATION

Corresponding Author

*E-mail: [email protected]. Phone: +91-44-2257 4140.


Based Calculations. Proc. Natl. Acad. Sci. U. S. A. 2014, 111, 15396−15401. (18) Uversky, V. N.; Gillespie, J. R.; Fink, A. L. Why Are ″Natively Unfolded″ Proteins Unstructured under Physiologic Conditions? Proteins: Struct., Funct., Genet. 2000, 41, 415−427. (19) Müller-Späth, S.; Soranno, A.; Hirschfeld, V.; Hofmann, H.; Ruegger, S.; Reymond, L.; Nettels, D.; Schuler, B. Charge Interactions Can Dominate the Dimensions of Intrinsically Disordered Proteins. Proc. Natl. Acad. Sci. U. S. A. 2010, 107, 14609−14614. (20) Das, R. K.; Pappu, R. V. Conformations of Intrinsically Disordered Proteins Are Influenced by Linear Sequence Distributions of Oppositely Charged Residues. Proc. Natl. Acad. Sci. U. S. A. 2013, 110, 13392−13397. (21) Cordier, F.; Hartmann, B.; Rogowski, M.; Affolter, M.; Grzesiek, S. DNA Recognition by the Brinker Repressor − an Extreme Case of Coupling between Binding and Folding. J. Mol. Biol. 2006, 361, 659−672. (22) Moody, C. L.; Tretyachenko-Ladokhina, V.; Laue, T. M.; Senear, D. F.; Cocco, M. J. Multiple Conformations of the Cytidine Repressor DNA-Binding Domain Coalesce to One Upon Recognition of a Specific DNA Surface. Biochemistry 2011, 50, 6622−6632. (23) Baldwin, R. L. Energetics of Protein Folding. J. Mol. Biol. 2007, 371, 283−301. (24) Sprules, T.; Green, N.; Featherstone, M.; Gehring, K. Lock and Key Binding of the HOX YPWM Peptide to the PBX Homeodomain. J. Biol. Chem. 2003, 278, 1053−1058. (25) Farber, P. J.; Mittermaier, A. Concerted Dynamics Link Allosteric Sites in the PBX Homeodomain. J. Mol. Biol. 2011, 405, 819−830. (26) Dosztanyi, Z.; Csizmok, V.; Tompa, P.; Simon, I. The Pairwise Energy Content Estimated from Amino Acid Composition Discriminates between Folded and Intrinsically Unstructured Proteins. J. Mol. Biol. 2005, 347, 827−839. (27) Xue, B.; Dunbrack, R. L.; Williams, R. W.; Dunker, A. K.; Uversky, V. N. PONDR-FIT: A Meta-Predictor of Intrinsically Disordered Amino Acids. Biochim.
Biophys. Acta, Proteins Proteomics 2010, 1804, 996−1010. (28) Wako, H.; Saito, N. Statistical Mechanical Theory of Protein Conformation 0.1. General Considerations and Application to Homopolymers. J. Phys. Soc. Jpn. 1978, 44, 1931−1938. (29) Wako, H.; Saito, N. Statistical Mechanical Theory of Protein Conformation 0.2. Folding Pathway for Protein. J. Phys. Soc. Jpn. 1978, 44, 1939−1945. (30) Muñoz, V.; Eaton, W. A. A Simple Model for Calculating the Kinetics of Protein Folding from Three-Dimensional Structures. Proc. Natl. Acad. Sci. U. S. A. 1999, 96, 11311−11316. (31) Taketomi, H.; Ueda, Y.; Go, N. Studies on Protein Folding, Unfolding and Fluctuations by Computer-Simulation 0.1. Effect of Specific Amino-Acid Sequence Represented by Specific Inter-Unit Interactions. Int. J. Pept. Protein Res. 1975, 7, 445−459. (32) Bruscolini, P.; Pelizzola, A. Exact Solution of the Muñoz-Eaton Model for Protein Folding. Phys. Rev. Lett. 2002, 88, 258101. (33) Naganathan, A. N. Predictions from an Ising-Like Statistical Mechanical Model on the Dynamic and Thermodynamic Effects of Protein Surface Electrostatics. J. Chem. Theory Comput. 2012, 8, 4646− 4656. (34) Naganathan, A. N. A Rapid, Ensemble and Free Energy Based Method for Engineering Protein Stabilities. J. Phys. Chem. B 2013, 117, 4956−4964. (35) Heinig, M.; Frishman, D. STRIDE: A Web Server for Secondary Structure Assignment from Known Atomic Coordinates of Proteins. Nucleic Acids Res. 2004, 32, W500−W502. (36) Faccin, M.; Bruscolini, P.; Pelizzola, A. Analysis of the Equilibrium and Kinetics of the Ankyrin Repeat Protein Myotrophin. J. Chem. Phys. 2011, 134, 075102. (37) Bruscolini, P.; Naganathan, A. N. Quantitative Prediction of Protein Folding Behaviors from a Simple Statistical Model. J. Am. Chem. Soc. 2011, 133, 5372−5379.

(38) Jabet, C.; Gitti, J.; Summers, M. F.; Wolberger, C. NMR Studies of the PBX1 TALE Homeodomain Protein Free in Solution and Bound to DNA: Proposal for a Mechanism of HOXB1-PBX1-DNA Complex Assembly. J. Mol. Biol. 1999, 291, 521−530. (39) Flory, P. J. Statistical Mechanics of Chain Molecules; Wiley: New York, 1969. (40) Pappu, R. V.; Srinivasan, R.; Rose, G. D. The Flory Isolated-Pair Hypothesis Is Not Valid for Polypeptide Chains: Implications for Protein Folding. Proc. Natl. Acad. Sci. U. S. A. 2000, 97, 12565−12570. (41) Baruah, A.; Rani, P.; Biswas, P. Conformational Entropy of Intrinsically Disordered Proteins from Amino Acid Triads. Sci. Rep. 2015, 5, 11740. (42) Badasyan, A.; Mamasakhlisov, Y. S.; Podgornik, R.; Parsegian, V. A. Solvent Effects in the Helix-Coil Transition Model Can Explain the Unusual Biophysics of Intrinsically Disordered Proteins. J. Chem. Phys. 2015, 143, 014102. (43) Tompa, P. On the Supertertiary Structure of Proteins. Nat. Chem. Biol. 2012, 8, 597−600. (44) Croy, C. H.; Bergqvist, S.; Huxford, T.; Ghosh, G.; Komives, E. A. Biophysical Characterization of the Free I Kappa B Alpha Ankyrin Repeat Domain in Solution. Protein Sci. 2004, 13, 1767−1777. (45) Truhlar, S. M. E.; Torpey, J. W.; Komives, E. A. Regions of I Kappa B Alpha That Are Critical for Its Inhibition of Nf-Kappa B DNA Interaction Fold Upon Binding to Nf-Kappa B. Proc. Natl. Acad. Sci. U. S. A. 2006, 103, 18951−18956. (46) Sivanandan, S.; Naganathan, A. N. A Disorder-Induced Domino-Like Destabilization Mechanism Governs the Folding and Functional Dynamics of the Repeat Protein IκBα. PLoS Comput. Biol. 2013, 9, e1003403. (47) Lamboy, J. A.; Kim, H.; Lee, K. S.; Ha, T.; Komives, E. A. Visualization of the Nanospring Dynamics of the I Kappa B Alpha Ankyrin Repeat Domain in Real Time. Proc. Natl. Acad. Sci. U. S. A. 2011, 108, 10178−10183. (48) Demarest, S. J.; Martinez-Yamout, M.; Chung, J.; Chen, H. W.; Xu, W.; Dyson, H. J.; Evans, R. M.; Wright, P. E. 
Mutual Synergistic Folding in Recruitment of CBP/P300 by P160 Nuclear Receptor Coactivators. Nature 2002, 415, 549−553. (49) Ebert, M.-O.; Bae, S.-H.; Dyson, J. H.; Wright, P. E. NMR Relaxation Study of the Complex Formed between CBP and the Activation Domain of the Nuclear Hormone Receptor Coactivator ACTR. Biochemistry 2008, 47, 1299−1308. (50) Iesmantavicius, V.; Dogan, J.; Jemth, P.; Teilum, K.; Kjaergaard, M. Helical Propensity in an Intrinsically Disordered Protein Accelerates Ligand Binding. Angew. Chem., Int. Ed. 2014, 53, 1548−1551. (51) Kjaergaard, M.; Norholm, A. B.; Hendus-Altenburger, R.; Pedersen, S. F.; Poulsen, F. M.; Kragelund, B. B. Temperature-Dependent Structural Changes in Intrinsically Disordered Proteins: Formation of Alpha-Helices or Loss of Polyproline II? Protein Sci. 2010, 19, 1555−1564. (52) Abkevich, V. I.; Gutin, A. M.; Shakhnovich, E. Impact of Local and Non-Local Interactions on Thermodynamics and Kinetics of Protein Folding. J. Mol. Biol. 1995, 252, 460−471. (53) Garcia-Mira, M. M.; Sadqi, M.; Fischer, N.; Sanchez-Ruiz, J. M.; Muñoz, V. Experimental Identification of Downhill Protein Folding. Science 2002, 298, 2191−2195. (54) Naganathan, A. N.; Orozco, M. The Native Ensemble and Folding of a Protein Molten-Globule: Functional Consequence of Downhill Folding. J. Am. Chem. Soc. 2011, 133, 12154−12161. (55) Griko, Y. V.; Privalov, P. L. Thermodynamic Puzzle of Apomyoglobin Unfolding. J. Mol. Biol. 1994, 235, 1318−1325. (56) Gianni, S.; Guydosh, N. R.; Khan, F.; Caldas, T. D.; Mayor, U.; White, G. W. N.; DeMarco, M. L.; Daggett, V.; Fersht, A. R. Unifying Features in Protein-Folding Mechanisms. Proc. Natl. Acad. Sci. U. S. A. 2003, 100, 13286−13291. (57) Dagan, S.; Hagai, T.; Gavrilov, Y.; Kapon, R.; Levy, Y.; Reich, Z. Stabilization of a Protein Conferred by an Increase in Folded State Entropy. Proc. Natl. Acad. Sci. U. S. A. 2013, 110, 10628−10633.


(58) Chan, H. S.; Dill, K. A. Intrachain Loops in Polymers - Effects of Excluded Volume. J. Chem. Phys. 1989, 90, 492−509. (59) Brown, C. J.; Takayama, S.; Campen, A. M.; Vise, P.; Marshall, T. W.; Oldfield, C. J.; Williams, C. J.; Dunker, A. K. Evolutionary Rate Heterogeneity in Proteins with Long Disordered Regions. J. Mol. Evol. 2002, 55, 104. (60) Lin, Y.-S.; Hsu, W.-L.; Hwang, J.-K.; Li, W.-H. Proportion of Solvent-Exposed Amino Acids in a Protein and Rate of Protein Evolution. Mol. Biol. Evol. 2007, 24, 1005−1011. (61) Kumar, S.; Tsai, C. J.; Nussinov, R. Factors Enhancing Protein Thermostability. Protein Eng., Des. Sel. 2000, 13, 179−191. (62) Tokuriki, N.; Oldfield, C. J.; Uversky, V. N.; Berezovksy, I. N.; Tawfik, D. S. Do Viral Proteins Possess Unique Biophysical Features? Trends Biochem. Sci. 2009, 34, 53−59. (63) Guerois, R.; Nielsen, J. E.; Serrano, L. Predicting Changes in the Stability of Proteins and Protein Complexes: A Study of More Than 1000 Mutations. J. Mol. Biol. 2002, 320, 369−387. (64) Wirth, A. J.; Liu, Y.; Prigozhin, M. B.; Schulten, K.; Gruebele, M. Comparing Fast Pressure Jump and Temperature Jump Protein Folding Experiments and Simulations. J. Am. Chem. Soc. 2015, 137, 7152−7159. (65) Shaw, D. E.; Maragakis, P.; Lindorff-Larsen, K.; Piana, S.; Dror, R. O.; Eastwood, M. P.; Bank, J. A.; Jumper, J. M.; Salmon, J. K.; Shan, Y. B.; et al. Atomic-Level Characterization of the Structural Dynamics of Proteins. Science 2010, 330, 341−346. (66) Jager, M.; Zhang, Y.; Bieschke, J.; Nguyen, H.; Dendle, M.; Bowman, M. E.; Noel, J. P.; Gruebele, M.; Kelly, J. W. Structure-Function-Folding Relationship in a WW Domain. Proc. Natl. Acad. Sci. U. S. A. 2006, 103, 10648−10653. (67) Dave, K.; Jager, M.; Nguyen, H.; Kelly, J. W.; Gruebele, M. High-Resolution Mapping of the Folding Transition State of a WW Domain. J. Mol. Biol. 2016, 428, 1617−1636. (68) Bah, A.; Vernon, R. M.; Siddiqui, Z.; Krzeminski, M.; Muhandiram, R.; Zhao, C.; Sonenberg, N.; Kay, L. E.; Forman-Kay, J. D. Folding of an Intrinsically Disordered Protein by Phosphorylation as a Regulatory Switch. Nature 2015, 519, 106−109. (69) Gopi, S.; Rajasekaran, N.; Singh, A.; Ranu, S.; Naganathan, A. N. Energetic and Topological Determinants of a Phosphorylation-Induced Disorder-to-Order Protein Conformational Switch. Phys. Chem. Chem. Phys. 2015, 17, 27264−27269.
