Inherent Structure Analysis of Protein Folding - American Chemical

Inherent Structure Analysis of Protein Folding. Jaegil Kim* and Thomas Keyes*. Department of Chemistry, Boston UniVersity, Boston, Massachusetts 02215...
0 downloads 0 Views 481KB Size
J. Phys. Chem. B 2007, 111, 2647-2657

2647

Inherent Structure Analysis of Protein Folding Jaegil Kim* and Thomas Keyes* Department of Chemistry, Boston UniVersity, Boston, Massachusetts 02215 ReceiVed: October 6, 2006; In Final Form: December 15, 2006

An analysis in terms of the inherent structures (IS, local minima) of the multidimensional potential energy landscape is applied to proteins. Detailed calculations are performed for the 46 bead BLN model, which folds into a four-stranded β-barrel. Enhanced sampling has allowed determination of 239 199 IS states, believed to encompass nearly all the compact, low-energy states, and of well-averaged thermodynamic quantities at low temperature. The density of states shows distinct lobes for compact and extended states, and entropic barriers for the collapse and local ordering transitions. A two-dimensional scatterplot or density of states clearly shows the multifunnel structure of the energy landscape. The anharmonic vibrational free energy is found to play a crucial role in protein folding. The problem of determining the folding transition in a multifunnel system is discussed, and novel indicators of folding are introduced. A particularly clear picture is obtained through the occupation probabilities, pi, of individual low-lying IS, which become finite below the collapse temperature; it is suggested that poor foldability corresponds to a large “misfolding interval” where the excited state pi>0 exceeds that of the native state p0.

I. Introduction Stillinger and Weber1,2 have formulated an exact version of thermodynamics based on partitioning the multidimensional potential energy surface, or landscape, into basins of attraction of the local minima, which they named inherent structures (IS). Each configuration is mapped to an IS via steepest descent minimization and, correspondingly, every function of the coordinates has a counterpart evaluated at the minimum and denoted with subscript “is”. The partition function is then rewritten in terms of the energy distribution of IS, Ω(Uis), called the density of states, and the vibrational Helmholtz free energy, Avib(Uis,T), determined by the configurational integral for a system confined to a basin. More generally, the IS mapping suggests thinking about complex systems in terms of the topography of the landscape and allows an intuitively appealing description of continuous systems in terms of discrete states and3 the transitions connecting them. Inherent-structure methods have4 been extensively applied to liquids. Properties of liquids are calculated in the thermodynamic limit, N f ∞, and the number of IS increases as eRN, where R is a constant. Then, fluctuations in Uis are small in the usual sense, simplifying any average, and use of a continuous distribution, Ω(Uis), is justified. The IS are5 composites of weakly interacting local regions, which leads to a Gaussian Ω(Uis) away from the very lowest energies. The IS mapping is equally applicable to proteins, but relatively little6 has been done. In contrast to the liquid case, there is no thermodynamic limit. Fluctuations can be large, and we will show that low-lying IS can have finite occupation probabilities, indicating that they should be treated individually. The different types of interactions yield a natural heterogeneity, and the density of states can have a rich structure. Two significant temperatures for protein folding7-10 are the folding temperature Tf, and the collapse temperature Tθ, with * E-mail: [email protected], [email protected].

Tθ g Tf. Cooling below the collapse temperature, the protein changes from an extended form to a compact form, signaled by a peak in the heat capacity. Tf should mark the midpoint of the broad transition from compact non-native states to the native state. The ratio, Tθ/Tf g 1, is7-9 an indicator of foldability. Good foldability is expected if Tθ/Tf ≈ 1, since then the protein approaches the native state almost immediately after collapsing, and is less likely to misfold. For a simple good folder, the fluctuations, 〈(δQ)2〉, in the order parameter Q, will have a peak at Tf; 〈 〉 denotes an equilibrium average. Determination of Tf for frustrated, intermediate-poor folders is problematic and will be discussed herein. In the following we present a broad application of the IS method to the11 46 bead β-barrel forming version, denoted β BLN, of the “minimalist” BLN model. The 46-mer is an intermediate folder, with a rough energy landscape capable of producing misfolding and quasiergodicity, and can thus be a computational challenge. Using an enhanced sampling algorithm we obtain 239 199 IS including, we believe, all those of importance below Tθ. With such knowledge an extremely detailed description of the system is possible, revealing the power of IS thermodynamics for proteins. II. Basic Concepts 1. Inherent Structure Thermodynamics. Thermodynamics is governed by the canonical partition function

Z(T) )

∫ dUisΩ(Uis)e-(U +A is

vib(Uis,T))/kBT

(1)

with Avib defined as

exp(-Avib(Uis,T)/kBT) ) 1 Λ3N

∑′ ∫V

basin

drN exp(-(U(rN) - Uis)/kBT) Ω(Uis)∆Uis

10.1021/jp0665776 CCC: $37.00 © 2007 American Chemical Society Published on Web 02/21/2007

(2)

2648 J. Phys. Chem. B, Vol. 111, No. 10, 2007

Kim and Keyes

where the prime indicates a sum over all IS with energy in the range (Uis,Uis + ∆Uis), Λ is the de Broglie wavelength, and the integral is over the configuration space volume of the basin. The probability distribution of IS energies obeys

P(Uis,T) ) Ω(Uis)e

-(Uis+Avib(Uis,T))/kBT

/Z(T)

1 N-1 1 N-2 U ) Kr (ri,i+1 - re)2 + Kθ (θi,i+1 - θe)2 + 2 i)1 2 i)1





N-3



[Ai(1 + cos(φi)) + Bi(1 + cos(3φi))] + ∑ i)1

(3)

N-2

4 Boltzmann’s constant, kB, does not appear explicitly after changing to appropriate units. A variable with temperature as an argument denotes an equilibrium ensemble average. For a Uis-dependent quantity the average may be obtained by multiplying by P(Uis,T) and integrating over Uis. With the IS corresponding to discrete states of the system, the configurational entropy obeys in units of kB,

Sc(Uis) ) ln(Ω(Uis))

(4)

and

Sc(T) )

∫ dUisSc(Uis)P(Uis,T)

(5)

The total free energy is a sum of configurational and vibrational contributions,

A(Uis,T) ) Ac(Uis,T) + Avib(Uis,T)

(6)

Ac(Uis,T) ) Uis - T ln(Ω(Uis))

(7)

with

and the usual thermodynamic free energy is A(T) ) A(Uis(T),T). The theory outlined above is exact if the contributions of individual IS are infinitesimal, justifying the use of a continuum description. Finite occupation probabilities for the low-lying protein states are indeed encountered below, but we will proceed with the continuum approach, discussing the matter as it arises, and present a modified theory in future work. 2. The 46-Bead β BLN Model. The 46-mer of the β-barrel forming version of the BLN model, introduced by Honeycutt and Thirumali,11 has been studied extensively10-18 in terms of its thermodynamics, kinetics, and potential energy landscape. The B (hydrophobic) L (hydrophilic) N (neutral) model, while lacking atomistic detail, includes stretches, bends, torsions, and nonbonded interactions. The original11 version had rigid bonds but we use the modification by Berry19 to include harmonic stretches, since it is more convenient for minimizations. The sequence of beads is B9N3(LB)4N3B9N3(LB)5L. Since the landscape exhibits a high degree of energetic frustration in the low-energy regions, this system also has been used as a benchmark to test various global optimizations algorithms.20,21,22 By energetic frustration we mean that the system is unable to sample states which it should visit according to the equilibrium distribution. “State” has a precise meaning in this work, namely, all configurations that map to a specific IS. Energetic frustration may arise from traps or multiple funnels, groups of states with good intragroup mobility and poor intergroup mobility. The explicit form of the potential, which determines the landscape, is

N

∑ ∑ Ci,j i)1 j)i+2

[( ) ( ) ] σ

12

rij

- Dij

σ

rij

6

(8)

Harmonic stretching and bending constants, and the equilibrium bond length and angle, are Kr ) 231.2/σ2, Kθ ) 20/ (rad)2, re ) σ, and θe ) 105°. Energy and length units are defined in terms of the characteristic parameters of the nonbonded interactions,  and σ, and  is used to define a dimensionless temperature. The third term in eq 8 is the contribution of the torsions, each defined by a quartet of successive beads. If zero or one beads are neutral then Ai ) Bi ) 1.2, favoring the trans configuration, while otherwise Ai ) 0, Bi ) 0.2, promoting flexibility. Thus, the chain is able to turn at a neutral segment. The global energy minimumsthe “ground state” ISsis a four-strand β barrel, with energy 0 ) -49.4072, which is consistent with the previous study.21 The properties of individual IS will be labeled with a subscript indicating its order in energy, e.g., i. The nonbonded coefficients are the following: (i ) N, all j), C ) 1, D ) 0; (i ) L, j ) L, B), C ) 2/3, D ) -1; (i, j ) B); C ) D ) 1. Neutral beads have no attractions, while hydrophobic beads attract each other and hydrophilic beads have extra repulsions with B and L sites to mimic the solvation effect. 3. Sampling Algorithm. An enhanced sampling algorithm is required to obtain equilibrium properties, and to reliably find all of the lowest lying IS, of a frustrated intermediate folder in the vicinity of Tf. We employ the “average energy guided simulated tempering” (AST) method.23,24 Simulated tempering (ST) accelerates the convergence of simulations by combining an additional Markov chain in temperature with GibbsBoltzmann sampling in the configuration space.25,26 Besides conventional canonical Monte Carlo (MC) or molecular dynamics (MD) simulations at the temperatures Ti, the system infrequently attempts to make a transition from Ti to Tj among a set of canonical ensembles, with the acceptance probability obeying

A(TifTj;x) ) min[1,e(βi-βj)Uw(Tj)/w(Ti)] ) min[1,e(βi-βj)Ue∆Fij]

(9)

where the temperature weight, w(T) ) eF(T), has been introduced to promote transitions in T space. Here we use 20 canonical ensembles with 1.2 g T g0.2. The principal liability of conventional ST is the unknown weight w(T), which must be iteratively determined by using some preliminary simulations.27 AST automates the weight determining process by using the average energy guide, U h (T).23 By taking an average of U at different Tj, the weight is adaptively determined during the simulation as

∫ Uh (T)/T dT

w(T) ) eF(T) ) e-

T

2

(10)

Once U h (T) is determined the effective free energy F(T) is calculated analytically by using the interpolation for U h (T), which connects successive grid points between [Tl-1, U h l-1] and [Tl, U h l] as

Inherent Structure Analysis of Protein Folding

∆Fij ) F(βj) - F(βi) ) j-1

)-

∫T ∑ l)i

Tl+1 i

J. Phys. Chem. B, Vol. 111, No. 10, 2007 2649

∫TT Uh (T)/T2 dT j

i

U h i + γi(T - Tl)

j-1

)

T2

δFl ∑ l)1

(11)

where γl ) (U h l+1 - U h l)/(Tl+1 - Tl) and δFl ) γl ln(Tl/Tl+1) + (U h l - γlTl)(1/Tl+1 - 1/Tl). The analytical evaluation of the integral in eq 11 through linear interpolation in T enables an accurate estimation of ∆Fij, even in phase transition regions. The canonical sampling at each Tj is generated by scaling the potential and fixing the average kinetic energy according to the reference temperature T0 ) β0-1 ) 1.2 with a NoseHoover thermostat.28 The Hamiltonian for the jth subensemble is given by

Hj ) K(pN) + λjU(rN)

(12)

where K is the kinetic energy and U is the potential energy, and λj ) T0/Tj is the scaling parameter. The generated trajectory gives a correct canonical sampling, W(rN;βj) ∝ e-β0λjU(rN) ) e-βjU(rN). The average guide U h (T) has been updated every 106 MD steps, and converges at the 10th iteration. Through a 4 × 108 MD steps simulation the system shows vigorous subensemble transitions. The IS have been determined for 4 × 105 equilibrium configurations saved every 103 MD steps, by applying conjugategradient minimization until the energy variation is smaller than 10-5 upon further minimization. We believe that our database of 239 199 IS contains all the low-lying states important below Tθ, including some previously unknown, which play a significant role in folding. On the other hand, more higher energy states can be found by going to higher temperature, and the form of the density of states at high Uis remains uncertain.

Figure 1. Simulated (a) thermodynamic quantities U(T) and CUU(T) from 4 × 105 selected configurations, and Utot(T) and CUU,tot by joining 4 × 108 canonical simulation data via the weighted histogram method, and (b) IS thermodynamic quantities Uis(T) and CUisUis from the 4 × 105 saved equilibrium configurations.

III. Simulated Quantities and Analysis 1. Thermodynamic Signatures of Collapse and Folding. Protein folding may be viewed through the T-dependence of equilibrium or inherent structure thermodynamic quantities. Here we define the heat capacities, CXY(T) ≡ ΣXY/kBT2, where Σ denotes the fluctuations, ΣXY ≡ 〈δXδY〉, and δX ) X - 〈X〉. Previous studies10,15,14,18 have estimated the collapse temperature as Tθ ≈ 0.65, via the peak of the potential energy, constant volume heat capacity, CV ) CUU(T). The folding transition temperature was originally estimated10,14 as Tf ≈ 0.35, by large fluctuations in the order parameter

Q)

1 MQ

N

θ(ζ - |rij - rij0|) ∑ i,j>i+4

Figure 2. Simulated (a) thermodynamic quantities Q(T) and ΣQQ(T) and (b) IS thermodynamic quantities Qis(T) and ΣQisQis(T).

(13)

where θ(x) is the Heaviside step function, rij and rij0 are the separations of beads i and j in the instantaneous configuration and the ground state, respectively, MQ is a normalization constant, and ζ ) 0.2 is the threshold value to take into account thermal fluctuations around the ground state. More recent estimates for Tf are18 0.27 and, by estimating the midpoint of the transition from the free energy A(Q),15 0.24. The value of Tθ/Tf g 1.86 from the various estimates is appropriate for an intermediate folder. In parts a and b of Figure 1, various equilibrium and IS thermodynamic quantities, respectively, are plotted as a function of T. The collapse transition is seen through a peak in CUU, as usual, and also in its IS-energy counterpart, CUisUis, at the same

Tθ ≈ 0.65. In Figure 1a, Utot and CUU,tot are calculated by joining all the canonical simulation data of 4 × 108 MD steps via the weighted histogram method (WHM).29,30 On the other hand, U, CUU, and the corresponding IS quantities are determined by taking an average of selected 4 × 105 equilibrium and IS configurations, respectively. Except for a small difference at low temperatures near Tf, in which CUU,tot shows a smooth variation, but CUU displays a small ruggedness, the overall coincidence confirms that the data set is well equilibrated. In addition to the main peak at Tθ, a small shoulder is observed in CUisUis, which is not seen in CUU, around the “local ordering” transition temperature, To ≈ 1.18 This transition is associated with early formation of secondary structures of the second and

2650 J. Phys. Chem. B, Vol. 111, No. 10, 2007

Kim and Keyes

Ωbi(Uis; Nr, U0is, δU) ) Nr!δU -1 (Nr - (Uis - U0is)/δU)!((Uis - U0is)/δU)!

Figure 3. Simulated density of states and (Table 1) Fit 1 (solid) and Fit 3 (dashed) to two binomials.

fourth β strands. Thus we realize one of the original aims in introducing the IS,1 bringing out features obscured by thermal averaging, In contrast to the sharp collapse transition, the folding transition is not so easy to detect with our enhanced sampling. Both Q(T) and Qis(T) in Figures 2 a and 2b are monotonically increasing crossing Tθ and Tf, and do not show any saturation behavior associated with dominant occupation of the native state, i.e., a folding transition. Correspondingly, the conventional order parameter fluctuations and their IS counterpart, ΣQQ(T) and ΣQisQis(T), also increase monotonically, with a small shoulder at Tθ. The folding transition also has been identified by a shoulder in CUU(T).10,18 However, this thermodynamic signature is also significantly reduced and can only be seen in ∂CUU,tot/∂T. The dependence of thermal features upon the quality of the sampling has also been reported in the LJ-38 cluster. The phase change between the truncated octahedral and icosahedral basins was associated with a small peak in CUU in the harmonic superposition approximation for the density of states,31 but only a slope change, without any peak, appears in more sophisticated parallel tempering simulations.32 Spurious thermodynamic features due to an incomplete sampling have also been pointed out in enhanced sampling simulations of the β BLN 69-mer,33 and using our statistical temperature molecular dynamics algorithm34 on the 46-mer. 2. The Density of States from Enumeration of the IS. Assuming that we have a complete database of IS, the density of states Ω(Uis) can be determined by counting the number in each energy bin. The result, shown in Figure 3, is bimodal with a high-energy shoulder. The low- and high-energy lobes are physically distinct groups of states, corresponding to compact (co) and extended (ex) configurations, respectively. More states are present in the highenergy group, corresponding to higher configurational entropy. Cooling below the collapse temperature Tθ, the system moves from the high-energy group to the low-energy group. This is not surprising, but the nature of the collapse is particularly clear and explicit in the IS picture. By contrast, in liquids a Gaussian is a good first approximation.5 Even so it is recognized35 that some lower energy cutoff must be invoked, corresponding to the minimum energy, disordered, liquid-like IS. A natural way to do this is34 with a binomial distribution, which becomes a Gaussian at higher energy. To justify the binomial distribution, suppose that the protein energy is approximately a composite of local contributions.8 If the local excitation energy is δU and the ground state energy is U0is, the number of excited regions is (Uis - U 0is )/δU, and the statistical weight is the binomial coefficient for selecting that number out of the total number of local regions, Nr; thus

(14)

and the number of IS in the distribution is 2Nr. For a continuous distribution the factorial is replaced by the Γ function, x! f Γ(x + 1). Here we make a fit to two binomials, denoted Fit 1 in Table 1 and shown in Figure 3, with the idea that two disjoint types of states are generated by two distinct types of excitation, and ignoring the high-energy shoulder for simplicity. Unphysically, the ground state of the high energy, extended-state lobe is U0is,ex ) -94.0 , 0. In the physical region of Uis g0 it is in its Gaussian limit. By contrast, U0is,co ) -50.1.0 ≈ 0, indicating that the model of weakly interacting two-level local regions8 might hold some truth for the compact states. The compactstate lobe is not in its Gaussian limit near 0 and provides a good representation of the low-energy data. The excess density of states in the vicinity of 0 comes from the extended lobe. In an attempt to maintain zero density of states below 0, Fit 2 (not shown) was made to four parameters with U0is,co and U0is,ex set equal to 0. Now the high-energy lobe is too narrow, and the low-energy lobe moves up somewhat to compensate. Finally we set U0is,co ) 0 and varied the other five parameters, but modified the high-energy binomial by multiplying by (Uis - 0)Θ(Uis - 0), attaining a good representation for all energies. Fit 3, shown in Figure 3, has no physical justification but is a useful empirical representation. All the values of Nr indicate rearranging groups of about 3 beads. The minima in Ω(Uis) are entropic barriers to the collapse and local ordering transitions. Figure 1b shows that Uis(Tθ ) 0.65) ≈ -35, at the minimum between the principal two lobes. No detailed calculation is necessary to conclude that Tθ is the transition temperature between the compact and extended states, accompanied by large energy fluctuations. Similarly, Uis(To) is near the minimum associated with the high-energy shoulder. 3. Two-Dimensional Density of States: IS Scatterplot in (Uis, Qis). Disconnectivity graph analysis13,17,36 previously demonstrated a multifunnel structure of the energy landscape of the 46-mer. This is transparently shown in the twodimensional scatterplot of individual IS vs (Uis, Qis) in Figure 4, which represents a two-dimensional density of states, and which we propose as an alternative to36 disconnectivity graphs. For Uis < -45 the folding funnel leading to the global minimum, containing configurations with Qis > 0.65, is separated by a strong barrier from the misfolding funnel, comprised of configurations with Qis < 0.65. A large majority of the collapsed states with Uis < -35 belong to the misfolding funnel. Folding implies a dominant occupation of the native state, corresponding to a set of IS near the bottom of the folding funnel having similar structural motifs. However, around Tf the 46mer will exhibit non-negligible populations for the structurally dissimilar low-lying IS near the termini of the two funnels. The associated incoherent average of the order parameter significantly reduces the folding signature in ΣQisQis(T) or Qis(T). Most differences of the lowest lying IS come from hydrophobic mismatches between β strands associated with the variation of the loop regions, Figure 4b. 4. The Vibrational Free Energy. The second component of IS thermodynamics is the vibrational free energy. If Avib were independent of Uis, Tθ for (extended) f (compact) could be estimated from the condition ∆Ac ) 0. From the peak positions and amplitudes of the two lobes in the density of states

Inherent Structure Analysis of Protein Folding

J. Phys. Chem. B, Vol. 111, No. 10, 2007 2651

TABLE 1: Density of States Fit Parameters for Two Binomialsa Fit 1 Fit 2 Fit 3 a

0 Uis,c

Nr,co

0 Uis,co

δUco

Nr,ex

0 Uis,ex

∆Uex

14.4 15.1 14.4

-51.1 -49.4 -49.4

1.38 1.32 1.14

17.8 17.1 13.5

-98.7 -49.4 -143.

10.0 5.28 15.8

Fit 1, all six parameters varied. Fit 2, both U0is fixed at 0. Fit 3, fixed at 0 and high-energy lobe smoothly cut off at 0.

Figure 5. (a) Uis-dependent, T-independent (from harmonic sum) part of exact harmonic Ahvib/kBT (squares) and small-fluctuation approximation in eq 18 (circles), and fit to a quadratic function of a1 + a2Uis + a3U2is, and (b) approximate anharmonic vibrational free energy and fit to a quadratic function b1 + b2Uis + b3U2is..

dependent parts as M

βAhvib

) M ln(βpω0) +

∑i ln(ωi/ω0)

(16)

where ωi is the ith normal-mode frequency for the selected IS, M ) 3N - 6, and ω0 is a reference frequency. The IS-thermodynamic quantity is given by the relation with

e-βAvib(Uis,T) ) 〈e-βAvib〉Uis h

Figure 4. (a) Multifunnel energy landscape represented by scatterplot of IS in (Uis,Qis), with a magnified view of the low-energy region in the lower panel, and (b) low-lying inherent structures. In part b, from left the native state (IS0), first (IS1), second (IS2), and third (IS3) excited IS.

one easily finds ∆Uis ) -31.7, ∆Sc ) -2.36, and thus Tθ ) 13.4.The enormous deviation of this estimate from the true Tθ ) 0.65 strongly suggests that the vibrational free energy plays an important role in protein folding. The same conclusion follows from the observation (Figures 1b and 3) that for modest temperatures T J 1, Uis(T) substantially exceeds the energy, Uis ≈ -9.4, corresponding to the peak of the high-energy lobe. This is impossible based on configurational entropy alone. Increasing entropy is what drives the protein to higher Uis with increasing T, but Sc decreases as Uis moves above the peak. The harmonic approximation is simple and has been very useful at the lowest temperatures in liquids, where the system occupies the bottoms of the basins. For a single IS, M

βAhvib )

∑i ln(βpωi)

(15)

which can be rewritten to separate out the Uis dependent and T

h

(17)

the Uis-subscript denoting an average over the appropriate energy bin. A further simplification is often obtained by averaging the harmonic sum,

βAhvib(Uis,T) ) M ln(βpω0) + H(Uis)

(18)

where H(Uis) ) 〈∑M i ln(ωi/ω0)〉Uis. The approximation is valid if fluctuations of the individual Avib, eq 16, are small; such is the case in liquids. The Uis-dependent, T-independent parts of the exact and approximate (eq 18) βAhvib are shown in Figure 5a. The data set of the small fluctuation approximation has been fitted to a quadratic function of Uis as

H(Uis) ) a1 + a2Uis + a3U2is

(19)

giving a1 ) 272.8, a2 ) -0.913, and a3 ) -0.00236. The nontrivial part of the harmonic approximation is the entropy, so the negative of the quantity in the figure is a vibrational entropy that increases with increasing Uis. Consequently, Uis(T) is larger than would be expected from the configurational free energy alone, and may exceed the energy at the peak in the density of states.37 Substantial differences are observed between the exact and small-fluctuation form of βAhvib(Uis), demonstrating the impor-

2652 J. Phys. Chem. B, Vol. 111, No. 10, 2007

Kim and Keyes

tance of fluctuations in the protein, as opposed to the liquid. Note the sharp drop in the exact βAhvib, or sharp increase in vibrational entropy, upon moving from collapsed to extended states above Uis ≈ -35. This confirms our hypothesis that the two lobes in Ω(Uis) correspond to physically distinct states. There is no reason a priori to believe that the harmonic approximation is accurate in proteins, except at the very lowest T. We have also estimated the anharmonic contribution, following the method of ref 38. Considering the series expanj+δ, the deviation of the potension, βAah vib(Uis,T) ) ∑jAj(Uis)T tial energy from the harmonic approximation can be used to estimate Aah vib as

Uah(Uis,T) ) U(Uis,T) - Uis - MkBT/2 )-

(j + δ)Aj(Uis)T j+δ ∑ j)1

(20)

By calculating the conditional average potential energy U(Uis,T) and retaining only the first term in eq 20, the simulation data of -Uah(Uis,T)/((1 + δ)T1+δ) have been fitted to a quadratic form,

A1(Uis) ) b1 + b2Uis + b3U2is

(21)

yielding b1 ) -4.115, b2 ) 0.332, and b3 ) -0.00001 (Figure 5b). We find that the best approximation is obtained with 1+δ in the δ ) 1.5, that is, the factor of βAah vib(Uis) ) A1(Uis)T Boltzmann factor has a T1.5 dependence, as well as a roughly linear Uis dependence. The anharmonic contribution opposes the occupation of highenergy states, partially counteracting the aforementioned effect of Ahvib. The opposing tendencies of harmonic and anharmonic Avib are not surprising. Low vibrational frequencies lead to large accessible configuration-space volumes in the harmonic approximation, increasing the vibrational entropy and decreasing the free energy. However, following a low-frequency normal coordinate from the IS usually leads to a transition state marking the true boundary of the basin before the harmonic estimate of the thermally allowed displacement is reached. Thus the harmonic approximation overestimates the vibrational entropy and underestimates Avib. It is corrected by an opposing Aah vib, which contains the effects of the transition states. 5. The Configurational Entropy Derived from IS Distributions. Having determined Ω(Uis) by enumeration of the IS, the configurational entropy, Sc(Uis), follows from eq 4. However, it will be seen in the following that an alternative approach is desirable, due to the difficulty of obtaining a complete enumeration. Once we have sampled a large number of IS at different T and determined Avib(Uis,T), Sc(Uis) can be estimated from the IS energy distributions P(Uis,T).39 Taking the logarithm of eq 3 determines the configurational entropy up to a T-dependent term as

Sc(Uis) ) ln P(Uis,T) + βUis + βAvib(Uis,T) + ln Z(T) (22) In supercooled liquids, βAvib(Uis,T) often shows a weak Uis dependence, allowing curves ln P(Uis,T) + βUis at different T to be superimposed on the same master curve by adding a temperature-dependent constant. However, this is not the case for the present system, since the dynamic range of Avib is not small compared to the sampled IS energy range, and Avib shows a very strong Uis dependence when the states change from extended to compact around Uis(Tθ). We found that ∆Avib ) Avib(Uis ) 20) - Avib(Uis ) -50) ≈ -70.73 is comparable

Figure 6. Configurational entropy Sc(Uis) calculated from simulated P(Uis,T) according to eq 22 with (a) and without (b) the anharmonic contribution at the simulated temperatures. In both cases, the unknown temperature-dependent terms have been determined to maximize overlap between the curves by using the weighted histogram method.

to ∆Uis ) 70 at T ) 1.2 and the ratio ∆Avib/∆Uis ≈ 0.5 even at the collapse temperature Tθ. The essential Uis-dependence of Avib indicates that a strong coupling between configurational and vibrational degrees of freedom is an intrinsic feature of proteins, as distinguished from supercooled liquids. In Figure 6, simulated data points of the RHS of eq 22 at different T (except for ln Z(T)), from simulated P(Uis,T) and calculated Avib, are plotted to maximize overlap by determining a proper temperature-dependent constant, using the weighted histogram method (WHM).28,29 Both a calculation (a) using the full anharmonic Avib and a harmonic approximation (b) with Ahvib from eq 18 are given. All data superimpose very well on a master curve Sc(Uis) for the whole sampled Uis range when anharmonicity is included. On the other hand, the harmonic approximation yields a good master curve only at low energies (temperatures). The harmonic configurational entropy Sc,har(Uis) shows a marked difference from Sc(Uis) at high Uis. The configurational entropy exhibits (Figure 7) inflections at Uis ≈ -35 and ≈ 5, and irregular behavior at Uis ≈ -46.5 (inset). The associated features in Ω(Uis) are the minima associated with the collapse and local ordering transitions, and an extended low-energy tail below Uis ≈ -46.5, respectively. Since the fitted βAvib is a monotonic function of Uis the origin of this behavior of Sc(Uis) in the current calculation is P(Uis,T) in eq 22. Indeed, examination of the inherent structure free energy, A(Uis,T) ) -(1/β) ln P(Uis,T), in Figure 7b, reveals that the irregularity at Uis ≈ -46.5 is related to the crossover from a delocalized P(Uis,T) to an exponential localized about the ground state at T ≈ 0.3. The features discussed in Sc(Uis) and P(Uis,T) are intrinsic to the IS approach, and are smoothed out or disappear in ordinary thermodynamics. In this sense, Uis, like Q, can be considered as a good order parameter for protein folding. IV. Calculations in the Inherent Structure Approach Having shown and analyzed a number of simulated quantities from the inherent structure formalism, we now turn to calculations. The T-dependent, averaged, inherent structure energy,

Inherent Structure Analysis of Protein Folding

Figure 7. (a) Configurational entropy, Sc(Uis), and harmonic approximation, Sc,har(Uis), with arrows indicating inflection points, and (inset) magnified view of Sc(Uis) around the global minimum; Sc(-49.5) has been chosen as zero. (b) Simulated inherent structure free energy, A(Uis,T) ) - (1/β) ln P(Uis,T), at various temperatures.

Uis(T), and configurational entropy, Sc(T), are central to our approach, and we will introduce several new quantities as well. 1. Using Sc,Ω(Uis) from Direct Enumeration of the IS. We first tried to reproduce Uis(T) using Sc,Ω(Uis), indicating the quantity obtained from eq 4 and direct enumeration of the density of states. The configurational entropy from the derivation using P(Uis,T) in Section III.4 will simply be denoted Sc(Uis) in the following. Evaluating Uis(T) without consideration of the Uis-dependence of Avib fails completely (Figure 8), as previously h indicated. Using only either Ahvib + Aah vib or Avib leads to good agreement with simulation up to T ≈ 0.5. A failure in reproducing the higher values of Uis(T) could be a consequence of missing high-energy IS, or indicate the need for a better treatment of anharmonicity. The comparison of Sc,Ω(Uis) with Sc(Uis) in Figure 8b illustrates the former scenario. We find that Sc,Ω starts to deviate from Sc at Uis* ≈ -46, where the consecutive IS energy difference ∆Uis ) i+1 - i falls to zero. This means that IS distribution becomes quasicontinuum for Uis > Uis*, making it difficult to calculate Ω(Uis) by direct enumeration. Another possibility is an error in βAah vib; Uis(T) will be extremely sensitive to the form of Avib. In the model of37,40 a Gaussian Ω(Uis) and a linear Uis-dependence of Avib, the high-T limit of Uis(T) is shifted from the energy at the peak of the density of states by an amount proportional to the product of the slope of Avib(Uis) and σ2, where σ is the width of the Gaussian. Above Tθ the extended-state lobe should dominate, and it has σ2 ) 467. Thus, small errors in Avib(Uis) will make enormous changes in the high-T value of Uis(T). 2. Using Sc(Uis) from P(Uis,T). In Section III.4 we demonstrated the procedure of determining Sc(Uis) without enumerating the IS. The function Sc(Uis) is that which gives the simulated

J. Phys. Chem. B, Vol. 111, No. 10, 2007 2653

Figure 8. (a) Simulated inherent structure energy (points) vs T and calculation using no vibrational free energy (lowest curve), harmonic Ahvib only (upper curve), and harmonic and anharmonic Ahvib, and (b) configurational entropy Sc,Ω(Uis) ) ln Ω(Uis) and Sc(Uis) determined from P(Uis,T), and consecutive IS energy difference ∆Uis ) i+1 - i, i being the IS index in ascending order.

P(Uis,T) for an assumed Avib(Uis,T), and Ω(Uis) ) exp(Sc(Uis)). Any error in Avib(Uis,T) leads to a compensating error in Sc(Uis) and in Ω(Uis). Reproducing the simulated Uis(T) is thus not a prediction. However, given one’s best estimate of Avib(Uis,T), the method allows an alternative framework for calculations, with a lower computational burden than direct enumeration. As expected, a good reproduction of simulated P(Uis,T) is obtained in Figure 9. At T < 0.28, the distributions are exponentials peaked around the global minimum. Both Uis(T) and its fluctuations/kBT2, CUisUis in Figure 9b, also show perfect agreement with simulation, and continuously extend to zero temperature. On the other hand, the same quantities within the harmonic approximation show noticeable deviations from simulation at high temperature. The calculation benefits from an inherent computational advantage of the IS approach: the construction of Sc(Uis) based on IS sampled from a particular temperature range allows an accurate thermodynamics for lower temperatures, including those where the sampling is hampered by ergodicity breaking. The behavior of the averaged configurational entropy provides insights into the folding process. In Figure 10, we have calculated ∆Sc(T) ) Sc(T) - Sc(T0), Sc(T) ) dUisSc(Uis)P(Uis,T). Here we assume that the protein remains in the native basin at the very low T0 ) 0.025. ∆Sc(T) shows a rapid drop crossing Tθ and a slow decay after collapse. The gradual reduction of ∆Sc(T) for 0.6 g T g 0.2 represents a slow relaxation to the native state, which is characteristic of an intermediate folder.

2654 J. Phys. Chem. B, Vol. 111, No. 10, 2007

Kim and Keyes

Figure 9. (a) Calculateded P(Uis,T) from Sc(Uis) (solid lines) and simulated IS distributions (lines points), high and intermediate temperatures (upper panel), and low temperatures (lower panel), and (b) Uis(T) and CUisUis(T) calculated from Sc(Uis) and Sc,har(Uis), and simulation results (squares and circles). In the lower panel of part a, solid lines correspond to P(Uis,T) at T ) 0.11, 0.15, 0.18, 0.2, 0.25, 0.28, 0.34, 0.43, and 0.5.

With a good folder having a concomitant folding and collapse transitions, the configurational entropy rapidly falls to zero after collapse. Extrapolating the 0.3 g T g 0.2 data to zero with linear and quadratic fits yields values just below and above 0.15, respectively, so our estimate for the Kauzmann temperature is TK ≈ 0.15. While TK is sometimes associated with an ideal glass transition in liquids, here it represents something more like “freezing” into the ground state. From the relation of Svib(Uis,T) ) -∂Avib/∂T, and eqs 20 and 21, the harmonic and anharmonic contributions for the entropy are obtained

Sh(Uis,T) ) M(1 - ln βpω0) - H(Uis)

(23)

Sah(Uis,T) ) - (δ + 1)Tδ(b1 + b2Uis + b3U2is).

(24)

and

Multiplying each entropy term by P(Uis,T) and integrating gives a decomposition of the total entropy ∆Stot ) ∆Sc(T) + ∆Sh(T) + ∆Sah(T) in Figure 10, which shows that the reduction of the total entropy around Tθ mainly originates in the configurational term, Sc(T), and the Uis-dependent part of the harmonic term, Sh,Uis(T) ) ∫dUis( - H(Uis))P(Uis,T) (eq 18), while the anharmonic entropy Sah(T) gives an opposing effect. The accuracy of ∆Stot(T) has been verified by comparison with the entropy ∆StotTI(T) obtained by thermodynamic integration.

Figure 10. (a) Decomposition of the total entropy ∆Stot(T) into the configurational ∆Sc(T), the harmonic ∆Sh(T), and the anharmonic ∆Sah(T). The harmonic entropy has been further separated into the T-dependent part and the Uis-dependent part ∆Sh,Uis(T). The configurational and Uis-dependent harmonic and anharmonic contributions are magnified five times for comparison. The circles correspond to ∆ STI tot(T) obtained by thermodynamic integration, and (b) entropy fluctuations/(kBT2), total CStotStot, and decomposition into CScSc, CShSh, and CSahSah.

The decomposed entropy fluctuations (/kBT2) in Figure 10b illuminate the three transitions under discussion. All components show a strong peak around the collapse temperature, Tθ. The configurational entropy fluctuations exhibit a broad shoulder around T ≈ 0.25, corresponding to the slowing of the configurational entropy reduction with decreasing T, due to frustration from the entropic barrier at the first minimum in Ω(Uis) (Figure 3). We suggest that the shoulder is identifying Tf, and our our proposal is Tf ) 0.25. Our choice of Tf also coincides with the crossover from a delocalized P(Uis,T) to an exponential localization into IS near the global minimum in Figure 9, associated with a strong reduction of the entropy fluctuations upon a further decrease in T. The local ordering transition appears as a peak in the harmonic vibrational entropy fluctuations. A peak in the order parameter fluctuations, ΣQQ, is a standard way to identify Tf, but, with our enhanced sampling, no peak is observed in it or its IS counterpart, ΣQisQis, in Figure 2b. As discussed earlier, the incoherent average of Qis over dissimilar IS in different funnels necessarily leads to large fluctuations even at low temperatures, suppressing the peak that might be seen with poor sampling. To examine the order parameter fluctuations we first simulated the nonthermal average, Q h is(Uis,T)) ) 〈Qis〉Uis, over the IS in a narrow energy bin (Figure 11a), and then calculated Q h is(T) ) Q h is(Uis)P(Uis,T)dUis and its fluctuations, ΣQh isQh is (Figure

Inherent Structure Analysis of Protein Folding

Figure 11. (a) IS average of order parameter Q h is(Uis) and (b) Q h is(T) and its fluctuations, ΣQh isQh is, calculated from Sc(Uis). For comparison, ΣQh isQh is has been magnified 20 times.

J. Phys. Chem. B, Vol. 111, No. 10, 2007 2655

Figure 12. (a) Simulated IS occupation probability pi(T) vs temperature and (b) harmonic IS occupation probability phi (T) vs temperature for the ground state and the first nine excited states, in order i ) 0-9.

11b). The calculated Q h is(T) reproduces the simulation data very well down to T ) 0.2 and predicts saturation behavior below TK. In contrast to the monotonic T-dependence of ΣQisQis, the fluctuations, ΣQh isQh is, show characteristic peaks at Tθ and at our proposed Tf ≈ 0.25. This discrepancy is due to the averaging process determining Q h is, in which the large fluctuations in Qis, represented by the scattered points in Figure 11a, have been suppressed. The peaks in ΣQh isQh is correspond exactly to sharp increases in Q h is(T). Thus it is possible to construct order parameter fluctuations that give a folding signal for a multifunnel landscape, but the physical relevance needs to be evaluated. V. Folding and Misfolding: Occupation Probabilities of Individual States Despite the success achieved so far, the averaging of structurally dissimilar states having a similar Uis makes it difficult to isolate unambiguous thermodynamic or IS-thermodynamic indicators of folding in a multifunneled landscape. More fundamentally, one must question the use of a continuum description of the energy landscape at low temperature in proteins. We will now show that the occupation probabilities of individual IS states, pi(T), become finite below Tθ, and illuminate the folding process very clearly. In Figure 12a we show the simulated occupation probabilities for the ground state and the first nine excited states, p0-p9, over the low-temperature region. Individual pi become finite below Tθ, indicating that the formalism should be modified to include discrete states plus a continuum; this work is in progress. The onset of finite pi is a significant “waypoint” in protein folding. From near Tθ down to T ≈ 0.32, p1 and p2 exceed p0, giving a transparent mechanism for misfolding if escape from the excited-state basins is slow. We suggest that the presence of a temperature range with pi > p0, terminating by definition at a new characteristic temperature denoted Tp0, plays a key role in misfolding. On the basis of Uis only, p0 must exceed every other pi, so we are seeing the action of the vibrational free energy. Below Tp0 ≈ 0.32, p0 rises sharply above the other pi. This dominant occupation of the ground state, IS0, is strongly

Figure 13. Plots of IS energy vs time in conventional MD, starting from different initial configurations of IS0 (black) and IS1 (gray) at temperatures (a) T ) 0.3 and (b) T ) 0.4.

correlated with the crossover of P(Uis,T) from a broad distribution to one localized at the bottom of the landscape around Uis ≈ -49, in Figure 9. To continue the investigation of low-lying IS at T < 0.2, below the lowest simulated T, we calculated the harmonic occupation probabilities pih in Figure 12b, taking into account only the harmonic vibrational free energy, pih ) e-βi-βAis,ih/ Z(Nis,β), where Z(Nis,β) is the harmonic partition function computed with the Nis lowest IS. The convergence of pih has been checked as a function of Nis and the low-lying pih do not change with a further increase of Nis g103. In contrast to the simulated pi(T), p0h > pih for all temperatures, and p0h reaches a value of 0.5 at T ≈ 0.15, perhaps suggesting the condition p0(TK) ) 0.5. It follows that having pi > p0 requires, more specifically, vibrational anharmonicity. Figure 10 shows that the anharmonic term dominates the Uis-dependence of the vibrational free energy below Tθ. We believe that this delicate anharmonic effect on the folding thermodynamics is one of intrinsic properties of proteins absent in supercooled liquids.

2656 J. Phys. Chem. B, Vol. 111, No. 10, 2007 The order of the pih is roughly consistent with the largest simulated pi for T < Tp0 as p0h . p1h ≈ p2h > p3h > p5h ≈ p7h.... While apprehensive of introducing yet another characteristic temperature, Tp0 seems to be significant. So too does the temperature at which the first pi becomes finite, but for now we will assume that it coincides with Tθ. The protein is most susceptible to trapping in the “misfolding interval” between Tθ and Tp0, and Tθ/Tp0 ≈ 1 is a condition for good foldability. Vibrational anharmonicity controls the foldability of the protein through the structure of the misfolding interval. The role of the misfolding interval in folding dynamics and ergodicity breaking is demonstrated in Figures 13a and 13b, where we compare IS plots of canonical MD starting from IS0 and IS1 at temperatures T ) 0.3 and 0.4, respectively. At T ) 0.3, the trajectory from IS0 only visits a set of restricted basins Ba ) {IS0, IS8, IS22, ...} repeatedly, while the trajectory from IS1 only samples another set of basins Bb ) {IS1, IS2, IS11, ...} and does not touch Ba basins during 3 × 106 MD steps. Note that the basins of the Ba and Bb sets belong to the folding and misfolding funnels, respectively, in Figure 4, which are separated by a high-energy barrier. This means that conventional canonical MD does not sample the thermally accessible configurations, i.e., ergodicity breaking and slow folding near Tf. On the other hand, at the elevated temperature T ) 0.4 both trajectories show a short transient trapping and begin, infrequently, to visit the other set of basins. VI. Discussion and Summary Our broad goal is to efficiently characterize an arbitrary protein with equilibrium simulations. Here we have tried to make the case that the IS approach is a powerful tool in this regard. Three strengths are the following: (1) the introduction of welldefined discrete states; (2) the availability of IS variables in which thermal fluctuations do not obscure structural information; and (3) computational sampling advantages. We also argue that the method should be applied in concert with enhanced sampling simulations. The density of states has two lobes, corresponding to extended states at high energy and compact states at low energy, and a shoulder on the high-energy lobe, and in contrast to the undifferentiated Ω(Uis) found in liquids. At Tθ ) 0.65, the averaged IS energy, Uis(T), is at the minimum between the two lobes, for a particularly clear picture of the collapse. Similarly, the fluctuating Uis values fall within the “well” around the upper minimum at the local ordering transition, To ≈ 1. The IS scatterplot in (Uis,Qis), or two-dimensional density of states, was introduced and shown to easily reveal funnel structure in the energy landscape. Without the decrease of the vibrational free energy, or increase of the vibrational entropy, with increasing Uis, the protein would remain folded until a very high T > 13. In addition to the overall increase of Svib with increasing Uis, the entropy of extended states exceeds that of the compact states by ∼6 entropy units at fixed Uis. The behavior of Avib appears to be more important for proteins than for liquids. A comprehensive analysis of the issues involved in the calculation of the configurational entropy and the vibrational free energy and entropy has been presented. A peak in the heat capacity gives Tθ ) 0.65, in agreement with the early work on this system. Regarding the original Tf ) 0.35, we have demonstrated the absence of characteristic features in CV, 〈(δQ)2〉 and 〈(δQis)2〉, and explained how they are washed out by incoherent averaging over multiple funnels.

Kim and Keyes The entropy fluctuations were introduced as a folding indicator, and seen to have a peak at Tθ and a shoulder at T ≈ 0.25, which we identify with Tf, in agreement with more recent estimates of18 0.27 and15 0.24. The fluctuations of the partially averaged order parameter, Q h , confirm our values of Tθ and Tf, and illustrate the nature of averaging in a multifunnel system. Furthermore, the distribution P(Uis,T) exhibits localization to an exponential decay about the ground state energy below T ) 0.25. With enhanced sampling it is possible to determine the averaged T-dependent occupation probabilities of individual IS, pi(T). In most applications of statistical mechanics such probabilities are infinitesimal, but we have discovered another feature of the collapse transition in proteins, namely, that the pi for the low-lying states attain finite values below Tθ. Near Tf, it may be possible to perform calculations with a relatively small number of IS, a significant opportunity and another difference from the liquid-state case. Furthermore, p1 and p2 of first two excited states exceed the ground state p0 until T ) 0.32, providing a transparent mechanism for misfolding. It is suggested that frustration and a significant pi>0 > p0 “misfolding interval” go hand in hand, and the temperature Tp0 was defined to mark the low end of the range. Since it is shown that pi>0 > p0 behavior is controlled by Aah vib, anharmonicity, frustration and misfolding are interconnected in protein folding. Acknowledgment. This material is based upon work supported by the National Science Foundation under Grant Nos. CHE-0352026 and CHE-0312959 (ITR). References and Notes (1) Stillinger, F. H.; Weber, T. A. Phys. ReV. A 1983, 28, 2408. Stillinger, F. H.; Weber, T. A. Science 1984, 225, 983. (2) Stillinger, F. H.; Weber, T. A. J. Chem. Phys. 1984, 81, 5095. (3) Goldstein, M. J. Chem. Phys. 1969, 51, 3728. Keyes, T.; Chowdhary, J. Phys. ReV. E 2001, 64, 032201. (4) Stillinger, F. H. Science 1995, 267, 1935. (5) Madan, B.; Keyes, T. J. Chem. Phys. 1993, 98, 3342. Keyes, T. Phys. ReV. E 2000, 62, 7905. Keyes, T.; Chowdhary, J.; Kim, J. Phys. ReV. E 2002, 66, 051110. (6) Baumketner, A.; Shea, J. E.; Hiwatari, Y. Phys. ReV. E 2003, 67, 011912. (7) Onuchic, J.; Luthey-Schulten, Z.; Wolynes, P. G. Annu. ReV. Phys. Chem. 1997, 48, 545. (8) Bryngelson, J.; Wolynes, P. G. J. Phys. Chem. 1989, 93 6902. (9) Thirumalai, D.; Klimov, D. K. Curr. Opin. Struct. Biol. 1999, 9, 197. (10) Guo, Z.; Brooks, C. L. Biopolymers 1997, 42, 745. (11) Honeycutt, J. D.; Thirumalai, D. Proc. Natl. Acad. Sci. U.S.A. 1990, 87, 3526. Honeycutt, J. D.; Thirumalai, D. Biopolymers 1992, 32, 695. (12) Nymeyer, H.; Garcia, A. E.; Onuchic, J. N. Proc. Natl. Acad. Sci. U.S.A. 1998, 95, 5921. (13) Miller, M. A.; Wales, D. J. J. Chem. Phys. 1999, 111, 6610. (14) Calvo, F.; Doye, J. P. K. Phys. ReV. E 2000, 63, 010902. (15) Evans, D.; Wales, D. J. J. Chem. Phys. 2003, 118, 3891. (16) Gordon, H. L.; Kwan, W. K.; Gong, C.; Larrass, S.; Rothstein, S. M. J. Chem. Phys. 2003, 118, 1533. (17) Komatsuzaki, T.; Hoshino, K.; Matsunaga, Y.; Rylance, G. J.; Johnston, R. L.; Wales, D. J. J. Chem. Phys. 2005, 122, 084714. (18) Pan, P.; Gordon, H.; Rothstein, S. J. Chem. Phys. 2006, 124, 024905. (19) Berry, R. S.; Elmaci, N.; Rose, J. P.; Vekhter, B. Proc. Natl. Acad. Sci. U.S.A. 1997, 94, 9520. (20) Lee, Y.-H.; Berne, B. J. J. Phys. Chem. A 2000, 104, 86. (21) Kim, S. Y.; Lee, S. J.; Lee, J. J. Chem. Phys. 2003, 119, 10274. (22) Wales, D. J.; Dewsbury, P. E. J. J. Chem. Phys. 2004, 121, 10284. (23) Kim, J. G.; Fukunishi, Y.; Kidera, A.; Nakamura, H. Phys. ReV. E 2004, 69, 021101. (24) Kim, J. G.; Fukunishi, Y.; Kidera, A.; Nakamura, H. J. Chem. Phys. 2004, 121, 5590. (25) Marinari, E.; Parisi, G. Europhys. Lett. 1992, 19, 451. (26) Lyubartsev, A. P.; Martsinovski, A. A.; Shevkunov, S. V.; Vorontsov-Velyaminov, P. N. J. Chem. Phys. 1992, 96, 1776.

Inherent Structure Analysis of Protein Folding (27) Hansmann, U. H. E.; Okamoto, Y. Phys. ReV. E 1996, 54, 5863. (28) Hoover, W. G. Phys. ReV. A 1985, 31, 1695. (29) Ferrenberg, A. M.; Swendsen, R. H. Phys. ReV. Lett. 1992, 69, 2292. (30) Newman, M. E. J.; Barkema, G. T. Monte Carlo Methods in Statistical Physics; Clarendon: Oxford, UK, 1999. (31) Doye, J. P. K.; Wales, D. J.; Miller, M. A. J. Chem. Phys. 1998, 109, 8143. (32) Neirotti, J. P.; Calvo, F.; Freeman, D. L.; Doll, J. D. J. Chem. Phys. 2000, 112, 10340. (33) Larrass, S. A.; Pegram, L. M.; Gordon, H. L.; Rothstein, S. M. J. Chem. Phys. 2003, 119, 13149.

J. Phys. Chem. B, Vol. 111, No. 10, 2007 2657 (34) Kim, J.; Straub, J. E.; Keyes, T. Phys. ReV. Lett. 2006, 97, 050601. (35) Debenedetti, P. G.; Stillinger, F. H.; Shell, M. S. J. Phys. Chem. B 2003, 107, 14434. Shell, M. S.; Debenedetti, P. G.; La Nave, E.; Sciortino, F. J. Chem. Phys. 2003, 118, 8821. (36) Becker, O. M.; Kaplus, M. J. Chem. Phys. 1997, 106, 1495. (37) Sastry, S. Nature 2001, 409, 164. la Nave, E.; Mossa, S.; Sciortino, F. Phys. ReV. Lett. 2002, 88, 225701. (38) Chowdhary, J.; Keyes, T. J. Phys. Chem. B 2004, 108, 19786. (39) Sciortino, F.; Kob, W.; Tartaglia, P. Phys. ReV. Lett. 1999, 83, 3214. (40) Chowdhary, J.; Keyes, T. Phys. ReV. E 2004, 69, 041104.