Correlation Analysis of Trp-Cage Dynamics in Folded and Unfolded

Article pubs.acs.org/JPCB

Correlation Analysis of Trp-Cage Dynamics in Folded and Unfolded States

Luigi L. Palese*

Department of Basic Medical Sciences, Neurosciences and Sense Organs (SMBNOS), University of Bari “Aldo Moro”, Piazza G. Cesare − Policlinico, 70124 Bari, Italy

ABSTRACT: A fundamental and still debated problem is how folded structures of proteins are related to their unfolded state. Besides the classical view, in which a large number of conformations characterize the unfolded state while the folded one is dominated by a single structure, recently a reassessment of the denatured state has been suggested. A growing amount of evidence indicates that not only the folded but also the unfolded state is at least partially organized. Here, we try to answer the question of how different protein dynamics is in folded and unfolded states by performing all-atom molecular dynamics simulations on the model protein Trp-cage. Random matrix theory inspired analysis of the correlation matrices has been carried out. The spectra of these correlation matrices show that the low rank modes of Trp-cage dynamics are outside of the limit expected for a random system both in folded and in unfolded conditions. These findings shed light on the nature of the unfolded state of the proteins, suggesting that it is much less random than previously thought.



INTRODUCTION

Proteins are linear polymers of amino acid residues that undergo reversible folding/unfolding transitions. In most cases, all the information needed to attain the folded form from the unfolded one is entirely encoded in their linear sequence. In fact, under suitable conditions, such as in water at physiological ionic strength and below the melting temperature, most proteins spontaneously assume the folded conformation. While an unfolded protein can assume an enormous number of different conformations, upon shifting to folding conditions the entire protein population is spontaneously converted to the native structure. This folded conformation has been classically considered to be an ordered one, typically the only ordered one. Anfinsen’s own explanation of the folding phenomenon was a thermodynamics-based hypothesis, which postulates that in folding conditions the protein population attains a minimum in Gibbs free energy.1−3 In the denatured (here used as synonymous with unfolded) state, proteins are generally described as self-avoiding random coils according to the Flory rotational isomeric state model.4 Considering the constraints on bond rotations imposed by the monomer’s chemistry, the number of possible conformations for a protein of 100 residues is of the order of $10^{30}$, each separated by small energy barriers of the magnitude of kT. This implies that talking about individual structures in unfolded conditions is meaningless and that only a statistical description of this state can be considered. This view has obtained considerable experimental and theoretical support.5,6 But unavoidably, this same idea of the unfolded state leads to a search problem, which is universally known as Levinthal’s paradox.7 The folding funnel paradigm is one of the most successful solutions to this problem.8,9

© 2015 American Chemical Society

Recently, a number of theoretical and experimental works have challenged the classical view of the denatured ensemble as unstructured. The presence of polyproline II, the residual structure flickering in the denatured state ensemble, and the demonstration that structured chains with flexible links also obey Flory-type statistics suggest that the nature of the unfolded state, and its importance in the folding phenomenon, should be profoundly reconsidered.10−18 Here we address the question of the similarities and the differences between the folded and unfolded states from a dynamical point of view. All-atom molecular dynamics (MD) simulations of the Trp-cage mini-protein19 have been carried out at temperatures above and below the experimentally determined melting temperature (Tm), and the MD-derived time series have been analyzed in order to assess the randomness of motions in both conditions.



METHODS

The starting structure of Trp-cage was obtained from the PDB (ref 20) entry 1L2Y (ref 19) and modeled at neutral pH with VMD (ref 21) to obtain an all-atom system in the setup phase. MD simulations were performed with NAMD2 (ref 22) using the CHARMM22 force field with CMAP correction.23,24 The starting Trp-cage structure was solvated in boxes of TIP3P (ref 25) water molecules using 15 Å padding to prevent image interactions. The system was brought to electroneutrality with 150 mM KCl. The simulation temperature was kept constant by means of a Langevin thermostat, and pressure was maintained at 1.01325 bar by means of the Langevin piston. Before the production runs, each simulation was subjected to minimization and equilibration protocols: first, the system was subjected to 1000 steps of conjugate gradient energy minimization, followed by a 100 ps equilibration in which protein atoms were kept fixed. The same protocol was then applied without constraints, after which the system was treated as detailed below (see Results and Discussion section). Periodic boundary conditions and the particle mesh Ewald method were applied. The time step was set to 2 fs, and frames were saved every 10 ps. Before proceeding with data analysis, protein configurations were aligned to a common reference structure. Cartesian coordinates of the Trp-cage backbone atoms were then extracted at different sampling intervals Δt (note that this sampling frequency is neither the time step used during the simulations nor the frequency at which configurations were saved along the trajectory) and arranged in an empirical matrix (whose rows contained the different time-dependent conformations) by Tcl and Vim scripting. Principal component analysis (PCA) was based on the eigenvector decomposition of the correlation matrix P whose elements are

Received: October 3, 2015
Revised: November 27, 2015
Published: November 30, 2015

DOI: 10.1021/acs.jpcb.5b09678 J. Phys. Chem. B 2015, 119, 15568−15573

The Journal of Physical Chemistry B

which describes how many components of an eigenvector k (of dimension N) significantly contribute to its length.26 The inverse participation ratio has been related to the information entropy, but its meaning can be more easily interpreted. In fact, for an entirely delocalized eigenvector with equal components

$$v_\alpha^{(k)} = \frac{1}{\sqrt{N}}$$

the participation ratio (to which we refer in the main text) is $1/I_k = N$, while for a fully localized eigenvector with a single nonzero component $1/I_k = 1$. For Wishart matrices the participation ratio converges to the value of N/3. Numerical analysis was carried out with the NumPy (ref 28) Python package, used within IPython (ref 29). A demo IPython notebook is reported in the Supporting Information.
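The limiting values of the participation ratio quoted above can be checked numerically. The following is a minimal sketch, not the demo notebook from the Supporting Information: the function name `participation_ratio`, the dimension N = 300, and the use of pure Gaussian noise as a stand-in for a Wishart-type matrix are all assumptions made for illustration, with $I_k$ taken as the sum of the fourth powers of the unit-normalized eigenvector components.

```python
import numpy as np

def participation_ratio(v):
    """Return 1/I_k, where I_k is the sum of fourth powers of the unit-normalized components."""
    v = np.asarray(v, dtype=float)
    v = v / np.linalg.norm(v)      # enforce unit length before taking fourth powers
    return 1.0 / np.sum(v ** 4)

N = 300                                        # illustrative dimension, not the paper's
delocalized = np.full(N, 1.0 / np.sqrt(N))     # equal components
localized = np.zeros(N)
localized[0] = 1.0                             # single nonzero component

# Eigenvectors of the correlation matrix of pure Gaussian noise (a Wishart-type matrix)
rng = np.random.default_rng(3)
noise = rng.normal(size=(3000, N))
_, vecs = np.linalg.eigh(np.corrcoef(noise, rowvar=False))
mean_pr = np.mean([participation_ratio(vecs[:, k]) for k in range(N)])

print(participation_ratio(delocalized))   # N, up to rounding
print(participation_ratio(localized))     # 1
print(mean_pr, N / 3)                     # finite-size estimate next to the N/3 limit
```

At finite N the noise average is only expected to lie near N/3, so the last line prints both numbers for comparison rather than asserting equality.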



RESULTS AND DISCUSSION

Trp-cage (NLYIQWLKDGGPSSGRPPPS) is a 20-residue polymer that contains a short α-helix at residues 2−8, a 3₁₀-helix at residues 11−14, and a polyproline II helix at the C-terminus.19 The NMR structure of this mini-protein shows a globular-like fold, where three proline residues (Pro-12, Pro-18, and Pro-19) and a glycine (Gly-11) pack against the aromatic side chains of the central tryptophan (Trp-6) and Tyr-3. This three-dimensional structure is stable below 315 K, which is the Tm of this mini-protein.19,30 The small size of the Trp-cage, along with its folding time of only a few microseconds,31 has made this mini-protein a widely used model system for protein folding. Its folding/unfolding transitions have been extensively studied both experimentally and by all-atom MD, with a large number of techniques and in different conditions,32−38 including the pressure effects on its stability diagram and the impact of the presence of protein crowders or organic surfaces.39−41 All-atom MD has been performed at 300 K for the folded state and at 335 K for the unfolded state. Large structural fluctuations were monitored by the all-atom root-mean-square deviation (RMSD) of the sampled conformations aligned to the reference structure. In order to obtain the unfolded conformations, the protein was first subjected to a 500 K simulation; after unfolding (which was monitored by the RMSD jump) the system was further equilibrated for 150 ns and thereafter cooled to 335 K at a rate of 0.01 K/step. Before the production run, a 150 ns equilibration step at 335 K was carried out. After the equilibration steps, production runs were performed. What is reported below refers to 0.5 μs long simulations, but a series of shorter simulations (150 ns each) was carried out in both conditions, as well as a 1.0 μs simulation at both temperatures, leading essentially to the same results (not shown).
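The RMSD monitoring mentioned above relies on superimposing each sampled conformation onto the reference structure. The source does not say which tool or algorithm was used for this step; the sketch below assumes the standard Kabsch superposition, and the function name `kabsch_rmsd` and the random test coordinates are hypothetical.

```python
import numpy as np

def kabsch_rmsd(coords, ref):
    """RMSD between two (n_atoms, 3) arrays after optimal rigid-body superposition."""
    a = coords - coords.mean(axis=0)            # remove translation
    b = ref - ref.mean(axis=0)
    U, _, Vt = np.linalg.svd(a.T @ b)           # SVD of the 3x3 cross-covariance matrix
    d = np.sign(np.linalg.det(U @ Vt))          # guard against improper rotations (reflections)
    rot = U @ np.diag([1.0, 1.0, d]) @ Vt       # optimal rotation acting on row vectors
    return np.sqrt(np.mean(np.sum((a @ rot - b) ** 2, axis=1)))

# Synthetic check: a rotated and translated copy of the reference should give RMSD ~ 0
theta = 0.7
rot_z = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                  [np.sin(theta),  np.cos(theta), 0.0],
                  [0.0, 0.0, 1.0]])
rng = np.random.default_rng(5)
ref = rng.normal(size=(20, 3))                  # stand-in for reference atom positions
moved = ref @ rot_z.T + np.array([1.0, -2.0, 0.5])
noisy = moved + 0.3 * rng.normal(size=ref.shape)

rmsd_moved = kabsch_rmsd(moved, ref)
rmsd_noisy = kabsch_rmsd(noisy, ref)
```

Applied frame by frame to a trajectory, a function of this kind would give an RMSD time series like the one used here to detect the unfolding jump.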
The reported simulation carried out at 300 K confirms that Trp-cage is stable at this temperature. All-atom RMSD analysis shows that the mean deviation from the reference structure, with its standard deviation, is 2.54 ± 0.39 Å (see also Figure S1 in Supporting Information). All the secondary structure motifs and the global tertiary structure remain essentially unchanged during the entire simulation run. The calculated root-mean-square fluctuation (RMSF) of the α-carbon atoms shows that the region of Trp-cage which presents the major structural changes (besides the extreme N- and C-terminal residues) is its C-terminal half, and particularly the residues delimiting the 3₁₀-helix (see Figure S2). In fact, one of the most noticeable changes that can be observed in the 300 K

$$P_{ij} = \frac{C_{ij}}{\sqrt{C_{ii} C_{jj}}}$$

where C is the covariance matrix calculated from the empirical matrix, after centroid subtraction, as

$$C_{ij} = \langle (x_i - \langle x_i \rangle)(x_j - \langle x_j \rangle) \rangle$$
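The two definitions above translate directly into a few lines of NumPy. This is a minimal sketch under stated assumptions: the function name `correlation_matrix` is hypothetical, and the random matrix `D` is only a synthetic stand-in for the empirical matrix of aligned backbone coordinates (rows are sampled frames, columns are coordinates).

```python
import numpy as np

def correlation_matrix(D):
    """Correlation matrix P of an empirical matrix D (rows = frames, columns = coordinates)."""
    centered = D - D.mean(axis=0)              # centroid subtraction
    C = centered.T @ centered / D.shape[0]     # C_ij = <(x_i - <x_i>)(x_j - <x_j>)>
    sd = np.sqrt(np.diag(C))                   # sqrt(C_ii)
    return C / np.outer(sd, sd)                # P_ij = C_ij / sqrt(C_ii C_jj)

rng = np.random.default_rng(0)
D = rng.normal(size=(500, 6))   # synthetic data, not MD coordinates
D[:, 1] += 0.8 * D[:, 0]        # introduce one correlated pair of columns
P = correlation_matrix(D)
```

The result coincides with `np.corrcoef(D, rowvar=False)`, since the normalization of the covariance cancels in the ratio.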

where ⟨...⟩ means the average over all instantaneous sampled structures. The square symmetric correlation matrix P is diagonalized as

$$R^{T} P R = \Lambda$$

where R is an orthonormal transformation matrix, the superscript T means transposition, and Λ is a diagonal matrix whose elements are the eigenvalues. The empirical matrix is projected onto the eigenvectors (which are the columns of R) to give the principal components $p_i(t)$:

$$p = R^{T}(x(t) - \langle x \rangle)$$
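The diagonalization and projection steps can be sketched as follows. The synthetic matrix `x` is an assumption standing in for the empirical matrix, and the choice of `np.linalg.eigh` is ours (it is appropriate for a symmetric matrix); the source only states that NumPy was used.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=(400, 5)) @ rng.normal(size=(5, 5))   # synthetic trajectory, rows = frames

P = np.corrcoef(x, rowvar=False)       # correlation matrix of the coordinates
evals, R = np.linalg.eigh(P)           # eigh returns ascending eigenvalues; columns of R are eigenvectors
order = np.argsort(evals)[::-1]        # reorder so that the first component has the largest eigenvalue
evals, R = evals[order], R[:, order]

Lambda = R.T @ P @ R                   # R^T P R, diagonal up to rounding
pcs = (x - x.mean(axis=0)) @ R         # row-wise form of p = R^T (x(t) - <x>)
explained = evals[:2].sum() / evals.sum()   # variance fraction carried by the first two components
```

Plotting `pcs[:, 0]` against `pcs[:, 1]` as a two-dimensional histogram would give a landscape of the kind shown in Figures 1 and 2.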

with obvious meaning of symbols. Spectral analysis was carried out on the difference matrix X obtained from the empirical matrix D as

$$X_a(t) = D_a(t + 1) - D_a(t)$$

The correlation matrix of X was then calculated, and the set of eigenvalues of this matrix, obtained as reported above, is its so-called spectrum. To evaluate how many eigenvalues are above the Wishart range26 (that is, outside the range expected for a random system), the shuffling method was used. As shown previously,27 a reliable estimate of the expected maximum eigenvalue for the random matrix counterpart of X can be obtained by randomly shuffling it a suitable number of times; from this upper estimate of the Wishart range, the number of large nonrandom eigenvalues can be calculated. The inverse participation ratio is a quantity related to the distribution of the eigenvector components:

$$I_k = \sum_{\alpha=1}^{N} \left(v_\alpha^{(k)}\right)^4$$
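The difference matrix and the shuffling estimate can be illustrated with a short sketch. The details of the shuffling procedure are not given in the text, so the version below is an assumption: each column of X is permuted independently in time, which destroys cross-correlations while preserving each column's distribution. The function names, the number of shuffles, and the toy data (a shared random walk plus independent ones) are likewise hypothetical.

```python
import numpy as np

def spectrum(M):
    """Eigenvalues (descending) of the correlation matrix of M (rows = time points)."""
    return np.linalg.eigvalsh(np.corrcoef(M, rowvar=False))[::-1]

def shuffled_max_eigenvalue(X, n_shuffles=20, seed=0):
    """Largest eigenvalue found after independently permuting each column of X in time."""
    rng = np.random.default_rng(seed)
    best = 0.0
    for _ in range(n_shuffles):
        Xs = np.column_stack([rng.permutation(col) for col in X.T])
        best = max(best, spectrum(Xs)[0])
    return best

rng = np.random.default_rng(2)
common = rng.normal(size=(2001, 1)).cumsum(axis=0)        # hypothetical shared collective motion
D = common + rng.normal(size=(2001, 12)).cumsum(axis=0)   # synthetic empirical matrix
X = np.diff(D, axis=0)                                     # X_a(t) = D_a(t+1) - D_a(t)

lam = spectrum(X)
lam_random = shuffled_max_eigenvalue(X)
n_nonrandom = int(np.sum(lam > lam_random))   # eigenvalues above the shuffled (random) bound
```

In this toy system only the eigenvalue associated with the shared motion is expected to exceed the shuffled bound; for real trajectories the count depends on the data and on the sampling interval.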


simulation(s) is a sort of switch of this central 3₁₀-helix relative to the α-helix and the polyproline II regions. As expected, the simulation carried out at 335 K shows that the RMSD is higher (7.98 ± 1.89 Å) than in the 300 K simulation throughout the production run. This means that at this temperature the Trp-cage explores conformations that are completely different from the folded starting one. The α-carbon atom RMSF also shows significant differences between the two simulation temperatures. At 335 K all the residues show a high mobility, above 4 Å. As in the 300 K simulation, in the 335 K case the highest mobility is observed in the N- and C-terminal regions (see Figures S1 and S2). Collectively, the RMSD and RMSF analyses suggest that Trp-cage simulations performed below and above the Tm can be considered as satisfactory, even if minimal, models for the folded and the unfolded protein dynamics, respectively. It is therefore interesting to evaluate also the overall configuration dynamics in these two conditions. For this purpose, PCA has been performed on both simulations. PCA shows that the Trp-cage explores four different regions in the dynamics performed at a temperature below the Tm, as reported in Figure 1. These

Figure 2. PCA of the 335 K simulation. First (horizontal axis) vs second (vertical axis) principal components are displayed. These two components account for 39.8% of the total variance. Colors from blue (zero) to dark red correspond to an increasing number of observed configurations in each bin (100 bins for each component).

unfolded protein such as the thermally denatured Trp-cage, which should be described essentially as a self-avoiding random coil. The conformations explored at this temperature constitute a very heterogeneous ensemble, ranging from completely extended structures to more compact, turn-like ones (see Figure S4). However, Figure 2 shows that some conformations are explored more frequently than average, even if they are not traps in the conformational landscape because they can be observed only transiently. Moreover, these relatively stable transients also differ from the folded structure of Trp-cage, as confirmed by the RMSD analysis. It should be stressed that in this thermally denatured ensemble some fluctuating native-like structures can be (more or less) transiently observed,10−18,32 particularly in the polyproline C-terminal region and in the residues spanning the 3₁₀-helix, but native-like global structures cannot be observed on the reported time scale (see Figures S1 and S4). So this simulation represents purely unfolded dynamics (and this is an important point for what will be discussed later). Jointly, PCA and RMSD analyses of the unfolded state of Trp-cage are in perfect agreement with the classical view of the unfolded state of a protein: a single, large potential well with no distinct conformational clusters (Figure 2), unlike those that can be observed in the simulation below the Tm reported in Figure 1. Furthermore, these features of the PCA of the unfolded state are reminiscent of diffusive, Brownian-like motion. Considering only the PCA data, it is not possible to exclude that the presence of minima (Figure 2) in the dynamical landscape of the unfolded protein can be attributed only to chance. But it is also possible that the observed landscape is the consequence of systematically organized motions in the unfolded dynamics, too.
In other words, we need a method able to discriminate random dynamics from nonrandom dynamics in order to answer the questions raised by the presence of minima in the unfolded dynamical landscape. As demonstrated by a large corpus of theoretical and experimental evidence, random matrix theory (RMT) is a collection of useful tools capable of doing this.26,27,43 RMT predicts a universal behavior for the eigenvalue distribution of some symmetric random matrix ensembles, including their expected maximum eigenvalue distribution. In order to analyze the correlated motions inside the protein and

Figure 1. PCA of the 300 K simulation. First (horizontal axis) vs second (vertical axis) principal components are displayed. These two components account for 41.9% of the total variance. Colors from blue (zero) to dark red correspond to an increasing number of observed configurations in each bin (100 bins for each component).

regions in the conformational landscape are stable and recurrently visited, so they must be considered well-defined and separated structural clusters. As mentioned above, they substantially correspond to small changes in the relative orientation of secondary structure motifs. Interestingly, a change in the position of the proline residues with respect to Trp-6 can be observed. The more populated central cluster in Figure 1 corresponds to a well-packed cage, in which both faces of the indole ring are protected by the proline residues (see Figure S3). Less populated clusters show some features that are similar to the previously described “proline detached” states.33,42 In particular, the lower right cluster in Figure 1 (see also Figure S3) is characterized by a switched position of the Pro-12-containing 3₁₀-helix with respect to the indole ring of Trp-6, along with a more detached position of the polyproline region. On the other hand, PCA performed on the simulation at a temperature above the Tm shows that the unfolded form of the Trp-cage explores a single large region, with no separate clusters of conformations (see Figure 2). This is expected for an


zero) eigenvalues. These spectral features indicate excess information26 in the correlation and covariance matrices. The trace of these variance−covariance matrices, which is a measure of the total variation, shows a remarkable (but somewhat expected) difference between the two explored temperatures. As reported in Table S1, the traces obtained from the simulation above Tm are larger by 1 order of magnitude than those obtained from the simulation below Tm. Altogether, these observations allow us to state that even if the unfolded state can be described by the self-avoiding random coil model (for ensemble-averaged measures of global physical properties such as the radius of gyration), its dynamics is not random at all. It is a correlated system, similar to what is observed in the case of folded protein dynamics,27,45 although with a greater data dispersion with respect to the folded condition. By extension, the fact that some structures are more persistent in the conformational landscape of the unfolded peptide must be ascribed to the system’s intrinsic properties, exactly as in the folded state, and not to pure chance. Turning again to the sampling frequency behavior of the correlation matrix spectra, some interesting differences between the two simulation conditions become noticeable (Figure 3). In the 300 K simulation the largest eigenvalue (indicated as λ1 hereafter) shows a strong sampling frequency dependence: it increases until Δt equals 100 ps, whereupon it reaches a steady level. This can be interpreted as the time scale necessary for a complete coupling of the system, so two regimes of temporal coupling can be distinguished. Moreover, a significant gap can be observed between λ1 and the rest of the spectrum, which increases with increasing Δt.
This means that at temperatures below Tm the protein behaves as a strongly correlated system, in agreement with what was recently reported for other protein models27,43,44,46 and similarly to the previously observed behavior of stock markets.26,47,48 In principle, the growth of λ1(Δt) (and the relative gap with the rest of the spectrum) can occur in two different ways: the gradual increase of the global coupling strength or the quick formation of a collective core which is then joined by the system’s other degrees of freedom. The participation index analysis of the eigenvector associated with λ1 (see Figure 4; see also Figure S5) suggests that the latter model is correct. A large time-dependent spectral gap, such as that observed in the folded Trp-cage, typically occurs in complex systems characterized by a rigid dynamics, where a single delocalized eigenvector dominates. But a note of caution is necessary here. The values of the participation index show that the first eigenvector is only partially delocalized, and not completely delocalized as expected for a truly rigid system: in this regard Trp-cage is soft also below the Tm. Instead, in the simulation at a temperature above Tm two eigenvalues are considerably larger than the rest of the spectrum, and the gap progressively decreases with increasing Δt. This behavior is such that at Δt of 100 and 200 ps no clear gaps are evident between λ1 and the other eigenvalues (see Figure 3). This λi(Δt) pattern of the unfolded dynamics at short sampling intervals (with two large eigenvalues separated from the bulk of the spectrum) has been observed in other (very) complex systems.26 Moreover, the participation index analysis reported in Figure 4 shows that, contrary to what occurs in the folded condition, the eigenvector associated with the largest eigenvalue is more delocalized at short sampling times than at longer ones (see also Figure S6).
This is also true for the eigenvector associated with the second largest eigenvalue λ2 of the unfolded

the time scale in which such correlations develop, a series of sampling frequency-dependent correlation matrices has been constructed and their eigensystems have been analyzed, as detailed in the Methods section (see also sample software in Supporting Information). Both simulations were sampled (ref 44) at Δt of 10, 20, 50, 100, and 200 ps. These correlation matrices were numerically diagonalized, and the eigenvalues, as well as the corresponding eigenvectors, were analyzed. Figure 3 shows

Figure 3. Spectral analysis of the 300 K simulation (left panel, blue symbols) and of the 335 K simulation (right panel, red symbols) as a function of the sampling interval. Note that both panels share the same vertical scale.
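How the largest eigenvalue can depend on the sampling interval, as plotted in Figure 3, may be illustrated with a toy calculation. Everything in this sketch is an assumption made for illustration: the trajectory is a slow shared random walk plus fast independent noise, the function name `largest_eigenvalue` is hypothetical, and only the 10 ps frame spacing and the list of Δt values are taken from the text. It is not meant to reproduce the plateau reported for Trp-cage, only to show that a slow collective mode stands out more clearly at longer Δt.

```python
import numpy as np

def largest_eigenvalue(frames, stride):
    """Largest eigenvalue of the correlation matrix of differences sampled every `stride` frames."""
    X = np.diff(frames[::stride], axis=0)      # difference matrix at the chosen sampling interval
    return np.linalg.eigvalsh(np.corrcoef(X, rowvar=False))[-1]

rng = np.random.default_rng(4)
n_frames, n_coords = 20000, 10
slow = rng.normal(size=(n_frames, 1)).cumsum(axis=0)    # hypothetical slow collective motion
fast = 5.0 * rng.normal(size=(n_frames, n_coords))      # fast, uncorrelated fluctuations
frames = slow + fast

frame_spacing_ps = 10                                    # frames saved every 10 ps
lambda1 = {}
for dt_ps in (10, 20, 50, 100, 200):
    lambda1[dt_ps] = largest_eigenvalue(frames, dt_ps // frame_spacing_ps)
    print(dt_ps, round(lambda1[dt_ps], 2))
```

At short intervals the fast noise dominates the differences and the largest eigenvalue stays close to the random level; at longer intervals the shared motion accumulates and the eigenvalue grows.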

the eigenvalue spectra of the correlation matrices for the two reference simulations. It must be stated at once that the spectra of these correlation matrices look completely different from those expected for the Wishart ensemble,26,27 which is the pertinent one considering the specific features of the correlation matrices from MD experiments.27,43,44 A significant number of eigenvalues lie above the expected maximum eigenvalue predicted by RMT. Correlation matrices obtained from the simulation below the Tm have 21, 21, 20, 19, and 17 eigenvalues above their random counterparts at Δt = 10, 20, 50, 100, and 200 ps, respectively, while for the simulation above the Tm there are 18, 17, 16, 16, and 15 eigenvalues above the Wishart range, in the same order of sampling intervals. The fact that Trp-cage is a correlated system both at temperatures below and above Tm is highlighted by the determinants and traces of the covariance and correlation matrices obtained from the empirical data sets (see Table S1). In the simulations discussed here the determinant of all correlation matrices is equal to zero, so they appear as correlated (while their traces are all, obviously, equal to the correlation matrix dimensions). Considering that the determinant of a matrix can be geometrically interpreted as the volume of the parallelepiped spanned by the matrix vectors, a zero-valued determinant means that the swarming space available for the data points is much less than an n-dimensional hypersphere of unit radius, and it is limited to a hyperplane of at most n − 1 dimensions (so as to justify the zero volume). The determinant of the variance−covariance matrices, which can be called the generalized variance, is also zero (or at least zero for all practical purposes). These zero-valued determinants are due to the presence of a large number of very small (but not


therefore their associated eigenvectors) are well above the RMT predicted boundaries. This means that the observed dynamics of Trp-cage in the unfolded state is the result of the intrinsic properties of the protein, exactly as observed in the folded state. It is interesting to note that a symmetrical relation between folded and unfolded dynamics can be highlighted in the reported simulation setup: not only are native-like structures present in the unfolded state dynamics, but it is also possible to detect, in the folded Trp-cage dynamics, structures for which a clear role as precursors of the unfolding path has been previously shown. Although folded and unfolded states of Trp-cage show some similarities, it should be noted that the eigensystem analysis of the correlation matrices points out some significant differences between these two conditions. The folded protein dynamics appears more rigid and dominated by a single mode, which starts as a large collective core and whose size increases on a 100 ps time scale. Conversely, the dynamics in the unfolded case is softer and without a single dominant mode. This less rigid dynamics of the main modes in the unfolded state could be one of the reasons that allow the protein to rapidly explore its conformational landscape.

Figure 4. Participation index of the first eigenvector as a function of the sampling interval. Blue circles, 300 K simulation; red circles, 335 K simulation.



simulation, indeed with higher participation index values (not shown). If we consider the participation index as a criterion to discriminate between protein softness and rigidity, it must be concluded that unfolded Trp-cage is more rigid over short time scales than over longer ones. In the long time-scale limit, the picture that emerges for the unfolded dynamics is that of a much softer system in which different components of similar importance spread and combine progressively. Along with this, it should be emphasized that a number of low rank eigenvectors describing the unfolded dynamics show a high participation index, similar to what can be observed for the low rank eigenvectors of the folded dynamics (see Figures S6 and S5, respectively). It is the presence of these low rank eigenvectors, lying outside the RMT allowed range and characterized by a high participation index, that renders the Trp-cage a nonrandom system both above and below the Tm. They are always involved in the correlated dynamics, from the persistence of the three-dimensional structure of the folded Trp-cage to the native-like secondary structure observed in the unfolded dynamics. From a structural point of view, these correlations in the unfolded dynamics of Trp-cage lead to the proline-centered native-like structures (see above), in accordance with what was previously observed for this and other model systems of the folding process.49

ASSOCIATED CONTENT

Supporting Information

The Supporting Information is available free of charge on the ACS Publications website at DOI: 10.1021/acs.jpcb.5b09678.

Figures S1−S6 and Table S1 (PDF)
Numerical calculations details (ZIP)



AUTHOR INFORMATION

Corresponding Author

*E-mail: [email protected].

Notes

The authors declare no competing financial interest.

ACKNOWLEDGMENTS

This work was supported by a grant from the University of Bari (CUP H93G13000170005).

REFERENCES

(1) Wu, H. Studies on Denaturation of Proteins. XII. A Theory of Denaturation. Chin. J. Physiol. 1931, 5, 321−344.
(2) Mirsky, A. E.; Pauling, L. On the Structure of Native, Denatured, and Coagulated Proteins. Proc. Natl. Acad. Sci. U. S. A. 1936, 22, 439−447.
(3) Anfinsen, C. B. Principles that Govern the Folding of Protein Chains. Science 1973, 181, 223−230.
(4) Flory, P. J. Statistical Mechanics of Chain Molecules; Interscience: New York, 1969.
(5) Tanford, C. Protein Denaturation. Adv. Protein Chem. 1968, 23, 121−282.
(6) Dill, K. A.; Shortle, D. Denatured States of Proteins. Annu. Rev. Biochem. 1991, 60, 795−825.
(7) Levinthal, C. How to Fold Graciously. In Mössbauer Spectroscopy in Biological Systems; Univ. of Illinois Press: 1969.
(8) Frauenfelder, H.; Sligar, S. G.; Wolynes, P. G. The Energy Landscapes and Motions of Proteins. Science 1991, 254, 1598−1603.
(9) Bryngelson, J. D.; Onuchic, J. N.; Socci, N. D.; Wolynes, P. G. Funnels, Pathways, and the Energy Landscape of Protein Folding: A Synthesis. Proteins: Struct., Funct., Genet. 1995, 21, 167−195.
(10) Fitzkee, N. C.; Fleming, P. J.; Gong, H.; Panasik, N., Jr.; Street, T. O.; Rose, G. D. Are Proteins Made from a Limited Parts List? Trends Biochem. Sci. 2005, 30, 73−80.



CONCLUSIONS

In this work it has been demonstrated that Trp-cage shows a dynamics completely different from what is expected for a truly random system at temperatures both below and above Tm. The obtained results show that Trp-cage is partially organized also in the unfolded state, in agreement with a large corpus of literature data (see the above-mentioned references). From a structural point of view, this partial organization of the unfolded state is related to the presence of native-like secondary structures and more frequently visited conformations in the dynamical landscape. An important finding of the present work is that all these features observed in the unfolded state cannot be ascribed to pure chance. In fact, the spectra of the correlation matrices obtained from the unfolded Trp-cage dynamics show that a large number of eigenvalues (and


The Journal of Physical Chemistry B (11) Rose, G. D.; Fleming, P. J.; Banavar, J. R.; Maritan, A. A Backbone-based Theory of Protein Folding. Proc. Natl. Acad. Sci. U. S. A. 2006, 103, 16623−16633. (12) Mezei, M.; Fleming, P. J.; Srinivasan, R.; Rose, G. D. Polyproline II Helix Is the Preferred Conformation for Unfolded Polyalanine in Water. Proteins: Struct., Funct., Genet. 2004, 55, 502−507. (13) Hamburger, J. B.; Ferreon, J. C.; Whitten, S. T.; Hilser, V. J. Thermodynamic Mechanism and Consequences of the Polyproline II Structural Bias in the Denatured States of Proteins. Biochemistry 2004, 43, 9790−9799. (14) Pappu, R. V.; Srinivasan, R.; Rose, G. D. The Flory Isolated-Pair Hypothesis Is not Valid for Polypeptide Chains: Implications for Protein Folding. Proc. Natl. Acad. Sci. U. S. A. 2000, 97, 12565−12570. (15) van Gunsteren, W. F.; Bürgi, R.; Peter, C.; Daura, X. The Key to Solving the Protein-Folding Problem Lies in an Accurate Description of the Denatured State. Angew. Chem., Int. Ed. 2001, 40, 351−355. (16) Shortle, D.; Ackerman, M. S. Persistence of Native-Like Topology in a Denatured Protein in 8 M Urea. Science 2001, 293, 487−489. (17) Cho, J.-H.; Meng, W.; Sato, S.; Kim, E. Y.; Schindelin, H.; Raleigh, D. P. Energetically Significant Networks of Coupled Interactions within an Unfolded Protein. Proc. Natl. Acad. Sci. U. S. A. 2014, 111, 12079−12084. (18) Thukral, L.; Schwarze, S.; Daidone, I.; Neuweiler, H. β-Structure within the Denatured State of the Helical Protein Domain BBL. J. Mol. Biol. 2015, 427, 3166−3176. (19) Neidigh, J. W.; Fesinmeyer, R. M.; Andersen, N. H. Designing a 20-residue Protein. Nat. Struct. Biol. 2002, 9, 425−430. (20) Berman, H. M.; Westbrook, J.; Feng, Z.; Gilliland, G.; Bhat, T.; Weissig, H.; Shindyalov, I. N.; Bourne, P. E. The Protein Data Bank. Nucleic Acids Res. 2000, 28, 235−242. (21) Humphrey, W.; Dalke, A.; Schulten, K. Vmd: Visual Molecular Dynamics. J. Mol. Graphics 1996, 14, 33−38. (22) Phillips, J. 
C.; Braun, R.; Wang, W.; Gumbart, J.; Tajkhorshid, E.; Villa, E.; Chipot, C.; Skeel, R. D.; Kale, L.; Schulten, K. Scalable Molecular Dynamics with NAMD. J. Comput. Chem. 2005, 26, 1781− 1802. (23) MacKerell, A. D.; Bashford, D.; Bellott, M.; Dunbrack, R.; Evanseck, J.; Field, M. J.; Fischer, S.; Gao, J.; Guo, H.; Ha, S.; et al. Allatom Empirical Potential for Molecular Modeling and Dynamics Studies of Proteins. J. Phys. Chem. B 1998, 102, 3586−3616. (24) MacKerell, A. D.; Feig, M.; Brooks, C. L. Extending the Treatment of Backbone Energetics in Protein Force Fields: Limitations of Gas-Phase Quantum Mechanics in Reproducing Protein Conformational Distributions in Molecular Dynamics Simulations. J. Comput. Chem. 2004, 25, 1400−1415. (25) Jorgensen, W. L.; Chandrasekhar, J.; Madura, J. D.; Impey, R. W.; Klein, M. L. Comparison of Simple Potential Functions for Simulating Liquid Water. J. Chem. Phys. 1983, 79, 926−935. (26) Kwapień, J.; Drożdż, S. Physical Approach to Complex Systems. Phys. Rep. 2012, 515, 115−226. (27) Palese, L. L. Random Matrix Theory in Molecular Dynamics Analysis. Biophys. Chem. 2015, 196, 1−9. (28) van der Walt, S.; Colbert, S. C.; Varoquaux, G. The NumPy Array: A Structure for Efficient Numerical Computation. Comput. Sci. Eng. 2011, 13, 22−30. (29) Pérez, F.; Granger, B. E. IPython: A System for Interactive Scientific Computing. Comput. Sci. Eng. 2007, 9, 21−29. (30) Barua, B.; Lin, J. C.; Williams, V. D.; Kummler, P.; Neidigh, J. W.; Andersen, N. H. The Trp-cage: Optimizing the Stability of a Globular Miniprotein. Protein Eng., Des. Sel. 2008, 21, 171−185. (31) Qiu, L.; Pabit, S. A.; Roitberg, A. E.; Hagen, S. J. Smaller and Faster: the 20-Residue Trp-cage Protein Folds in 4 μs. J. Am. Chem. Soc. 2002, 124, 12952−12953. (32) Snow, C. D.; Zagrovic, B.; Pande, V. S. The Trp Cage: Folding Kinetics and Unfolded State Topology via Molecular Dynamics Simulations. J. Am. Chem. Soc. 2002, 124, 14548−14549.

(33) Juraszek, J.; Bolhuis, P. G. Sampling the Multiple Folding Mechanism of Trp-cage in Explicit Solvent. Proc. Natl. Acad. Sci. U. S. A. 2006, 103, 15859−15864.
(34) Hałabis, A.; Żmudzińska, W.; Liwo, A.; Ołdziej, S. Conformational Dynamics of the Trp-cage Miniprotein at its Folding Temperature. J. Phys. Chem. B 2012, 116, 6898−6907.
(35) Kim, S. B.; Dsilva, C. J.; Kevrekidis, I. G.; Debenedetti, P. G. Systematic Characterization of Protein Folding Pathways Using Diffusion Maps: Application to Trp-cage Miniprotein. J. Chem. Phys. 2015, 142, 085101.
(36) Byrne, A.; Williams, D. V.; Barua, B.; Hagen, S. J.; Kier, B. L.; Andersen, N. H. Folding Dynamics and Pathways of the Trp-cage Miniproteins. Biochemistry 2014, 53, 6011−6021.
(37) Abaskharon, R. M.; Culik, R. M.; Woolley, G. A.; Gai, F. Tuning the Attempt Frequency of Protein Folding Dynamics via Transition-State Rigidification: Application to Trp-cage. J. Phys. Chem. Lett. 2015, 6, 521−526.
(38) Cote, Y.; Maisuradze, G. G.; Delarue, P.; Scheraga, H. A.; Senet, P. New Insights into Protein (Un)Folding Dynamics. J. Phys. Chem. Lett. 2015, 6, 1082−1086.
(39) Paschek, D.; Hempel, S.; García, A. E. Computing the Stability Diagram of the Trp-cage Miniprotein. Proc. Natl. Acad. Sci. U. S. A. 2008, 105, 17754−17759.
(40) Bille, A.; Linse, B.; Mohanty, S.; Irbäck, A. Equilibrium Simulation of Trp-cage in the Presence of Protein Crowders. J. Chem. Phys. 2015, 143, 175102.
(41) Levine, Z. A.; Fischer, S. A.; Shea, J.-E.; Pfaendtner, J. Trp-cage Folding on Organic Surfaces. J. Phys. Chem. B 2015, 119, 10417−10425.
(42) Juraszek, J.; Saladino, G.; van Erp, T. S.; Gervasio, F. L. Efficient Numerical Reconstruction of Protein Folding Kinetics with Partial Path Sampling and Pathlike Variables. Phys. Rev. Lett. 2013, 110, 108106.
(43) Potestio, R.; Caccioli, F.; Vivo, P. Random Matrix Approach to Collective Behavior and Bulk Universality in Protein Dynamics. Phys. Rev. Lett. 2009, 103, 268101.
(44) Yamanaka, M. Random Matrix Theory of Rigidity in Soft Matter. J. Phys. Soc. Jpn. 2015, 84, 063801.
(45) Palese, L. L. Protein Dynamics: Complex by Itself. Complexity 2013, 18, 48−56.
(46) Yamanaka, M. Random Matrix Theory Analysis of Cross Correlations in Molecular Dynamics Simulations of Macro-Biomolecules. J. Phys. Soc. Jpn. 2013, 82, 083801.
(47) Laloux, L.; Cizeau, P.; Bouchaud, J.-P.; Potters, M. Noise Dressing of Financial Correlation Matrices. Phys. Rev. Lett. 1999, 83, 1467.
(48) Plerou, V.; Gopikrishnan, P.; Rosenow, B.; Nunes Amaral, L. A.; Stanley, H. E. Universal and Nonuniversal Properties of Cross Correlations in Financial Time Series. Phys. Rev. Lett. 1999, 83, 1471.
(49) Lindorff-Larsen, K.; Piana, S.; Dror, R. O.; Shaw, D. E. How Fast-Folding Proteins Fold. Science 2011, 334, 517−520.

