J. Phys. Chem. B 2008, 112, 6221-6227


Solvent Viscosity Dependence of the Protein Folding Dynamics†

Young Min Rhee‡ and Vijay S. Pande*

Department of Chemistry, Stanford University, Stanford, California 94305

Received: August 6, 2007; In Final Form: November 7, 2007

Solvent viscosity has been frequently adopted as an adjustable parameter in various computational studies (e.g., protein folding simulations) with implicit solvent models. A common approach is to use low viscosities to expedite simulations. While using viscosities lower than that of aqueous solution is unphysical, such treatment is based on observations that the viscosity affects the kinetics (rates) in a well-defined manner as described by Kramers’ theory. Here, we investigate the effect of viscosity on the detailed dynamics (mechanism) of protein folding. On the basis of a simple mathematical model, we first show that viscosity may indeed affect the dynamics in a complex way. By applying the model to the folding of a small protein, we demonstrate that the detailed dynamics is affected quite pronouncedly, especially at unphysically low viscosities, which cautions against using such viscosities. In this regard, our model may also serve as a diagnostic tool for validating low-viscosity simulations. It is also suggested that the viscosity dependence can be further exploited to gain information about the protein folding mechanism.

1. Introduction

Solvent viscosity in the medium for protein folding is an important factor that governs the details of this complicated process. The effect of changing viscosity on the folding kinetics has been studied with many different approaches.1-4 Interestingly, experimental and computational investigations have used solvent viscosity in somewhat different contexts. Because lowering the viscosity of an aqueous solution in experiment is extremely difficult, if not impossible, experimental approaches have mainly considered changes in kinetics at increased viscosities, obtained by adding viscogens to the solution.

In simulations of protein folding with an implicit solvent model, the viscosity has been widely used as an adjustable parameter.4-6 The use of such a modification dates back to the early work by Honeycutt and Thirumalai,5 and in usual cases, the purpose is to expedite simulations and thus to enhance the sampling of the vast available phase space. Unlike experimental situations, where added viscogens may perturb the stability of folded proteins and where subsequent compensation needs to be considered, such a computational trick of expedition can be applied with relative ease, because the stability of a folded protein (or mathematically, the system Hamiltonian) is not a function of solvent viscosity. More specifically, changes of viscosity in such models do not affect the definitions of folded and unfolded states but only alter the “speed” of motion along the folding path, because the viscosity enters only through the equations of motion. In fact, a good linear relationship between the folding rate and the solvent viscosity has been observed, at least for folding of small proteins, over about an order of magnitude range of solvent viscosity.4

At first glance, different solvent viscosities with the implicit solvent model may seem unlikely to change the folding dynamics because the system Hamiltonian is not a function of the solvent viscosity, as explained above. However, the viscosity is a property related to the dynamics of the system, and its change can easily alter other dynamic properties of the system such as the folding mechanism. Such an aspect of polymeric reactions can be easily understood in terms of the diffusive property of the phenomenon. Protein folding is a diffusive polymeric reaction, and the overall diffusion property is governed by both intrinsic chain diffusion and extrinsic solvent diffusion.1-3,7 By altering the solvent viscosity, only the extrinsic part is changed, and the overall system characteristics may change in a nonsimple manner at different viscosities. For example, Jas et al. reported very different viscosity dependences of folding rates of helical and hairpin proteins with added viscogens.2 This complexity will be most prominent if the change of solvent viscosity has an inhomogeneous effect along the progress of folding. In fact, an early investigation by Klimov and Thirumalai using an off-lattice coarse-grained model of proteins predicted the existence of various regimes for diffusive motion of proteins.8 Plotkin and Wolynes also reported a similar complexity in the changes of folding at different solvent frictions.9 More recently, Best and Hummer presented a theoretical description of such inhomogeneity by simulating the folding of a helix-bundle protein.10

In this article, we investigate the dependence of folding dynamics on the solvent viscosity in the implicit solvent model. First, through the use of a one-dimensional model11-17 with the conformation-specific folding probability (pfold), it is shown that the inhomogeneous effect of viscosity may actually alter the dynamics. The alteration may range from continuous changes in the original path to abrupt changes in the dominant pathway (mechanism switch). By application of the method to the folding of a small protein, it is suggested that the use of excessively small viscosities may lead to a switch in the mechanism. Finally, it is suggested that such alteration in dynamics can be exploited to elucidate the character of the protein folding mechanism. Implications of these findings are also discussed in the context of future directions for protein folding studies.

† Part of the “Attila Szabo Festschrift”.
* To whom correspondence should be addressed. E-mail: pande@stanford.edu.
‡ Present address: Department of Chemistry, University of California, Berkeley, CA 94720.

10.1021/jp076301d CCC: $40.75 © 2008 American Chemical Society Published on Web 01/30/2008

6222 J. Phys. Chem. B, Vol. 112, No. 19, 2008

Rhee and Pande

2. Theory

A. The Dependence of the Commitment Probability on Diffusion: One-Dimensional Case. In the diffusive limit, it is well-known that the commitment probability, p(x), from a given conformation x satisfies18

$$\nabla_x \cdot \left[ D(x)\, e^{-\beta U(x)}\, \nabla_x p(x) \right] = 0 \tag{1}$$

where D(x) and U(x) denote the position-dependent diffusion constant and the system potential, respectively.18 In general, D(x) is not isotropic and should be considered as a matrix.13 For simplicity, let us first suppose that we have a one-dimensional system described by a potential U(x). Let us also suppose that the system possesses a nonuniform diffusion property, characterized by a one-dimensional function D(x). In this simple case, the analytical solution of eq 1 can be obtained as18



$$p(x) = \frac{\displaystyle\int_a^x \frac{1}{D(y)}\, e^{\beta U(y)}\, dy}{\displaystyle\int_a^b \frac{1}{D(y)}\, e^{\beta U(y)}\, dy} \tag{2}$$

Here a and b are the boundary positions at which eq 2 gives p = 0 and p = 1, respectively. If the diffusion property of the system can be “perturbed” to D′(x) without changing the potential, and if we define a relative diffusion coefficient as R′(x) = D′(x)/D(x), the commitment probability at this new viscosity will be similarly obtained as

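$$p'(x) = \frac{\displaystyle\int_a^x \frac{1}{R'(y)\, D(y)}\, e^{\beta U(y)}\, dy}{\displaystyle\int_a^b \frac{1}{R'(y)\, D(y)}\, e^{\beta U(y)}\, dy} \tag{3}$$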
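As a quick numerical check of eqs 2 and 3, the sketch below evaluates both integrals with the trapezoidal rule. The double-well potential, the uniform D(x) = 1, the choice β = 1, and the two example forms of R′(x) are illustrative assumptions and are not taken from the simulations in this work; the function and variable names are likewise hypothetical.

```python
import math

def committor(U, D, a, b, beta=1.0, n=2000):
    """Evaluate eq 2 on a uniform grid with the trapezoidal rule.

    Returns (xs, ps), where ps[k] approximates p(xs[k]).
    """
    h = (b - a) / n
    xs = [a + k * h for k in range(n + 1)]
    w = [math.exp(beta * U(x)) / D(x) for x in xs]  # integrand exp(beta U) / D
    cum = [0.0]
    for k in range(1, n + 1):
        cum.append(cum[-1] + 0.5 * h * (w[k - 1] + w[k]))
    total = cum[-1]
    return xs, [c / total for c in cum]

# Assumed model: symmetric double well with its barrier top at x = 0.
U = lambda x: 5.0 * (x * x - 1.0) ** 2
D = lambda x: 1.0

# Eq 3 uses D'(x) = R'(x) D(x). Two hypothetical perturbations:
R_uniform = lambda x: 0.1                          # same factor everywhere
R_nonuniform = lambda x: 0.1 if x > 0.0 else 1.0   # slower only past the barrier

xs, p = committor(U, D, -1.0, 1.0)
_, p_uni = committor(U, lambda x: R_uniform(x) * D(x), -1.0, 1.0)
_, p_non = committor(U, lambda x: R_nonuniform(x) * D(x), -1.0, 1.0)

mid = len(xs) // 2  # grid point at the barrier top
print(p[mid], p_uni[mid], p_non[mid])
```

With these assumed inputs, a uniform R′ leaves the committor at the barrier top at 0.5, whereas the nonuniform R′ moves it to roughly 1/11, because the integral over the slowed half is weighted ten times more heavily.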

From eq 3, it is clear that a change in the diffusion property will lead to a change in the commitment probability at any given conformation unless the diffusion property is affected homogeneously over the entire configuration space (namely, unless R′(x) is constant for all x). In the protein folding case, the commitment probability is given as the conformation-specific folding probability pfold, which can be readily obtained with simulations. In general, any modification in the solvent viscosity will lead to a change in the diffusion property of the protein molecule, and such a change will strongly depend on the conformation of the protein, as explained in the Introduction. Consequently, by using pfold values at different viscosities, one can obtain the conformation-dependent diffusion property for the given protein molecule.

B. Complication from Multidimensionality. When the reaction occurs in a multidimensional space, a complication appears as a result of path diversity. This complication can be easily demonstrated with a two-dimensional model surface given as19

$$\begin{aligned} U(x, y) = {} & [1 - 0.5 \tanh(y - x)](x + y - 5)^2 + 0.2\left[((y - x)^2 - 9)^2 + 3(y - x)\right] \\ & + 15 \exp[-(x - 2.5)^2 - (y - 2.5)^2] - 20 \exp[-(x - 4)^2 - (y - 4)^2] \end{aligned} \tag{4}$$

The shape of this surface is presented in Figure 1 as a contour plot. This potential has two reaction paths (also shown in the figure), both of which are thermally accessible at moderate to high temperatures.19 Let us assume that the system has an inhomogeneous isotropic diffusion property

$$D(x, y) = \begin{cases} 1 & (x \le 2.5) \\ 1 + d\,(x - 2.5)^2 & (x > 2.5) \end{cases} \tag{5}$$

Figure 1. Contour plot of the sample two-dimensional surface and its reaction paths (gray, Path 1; black, Path 2). Dashed circles represent the reactant (upper left) and the product (lower right) state definitions used in the solution of the Smoluchowski equation. The paths are obtained by taking the steepest descent step with a step size of 0.01 starting from appropriate saddle points.

with an adjustable diffusion parameter d. At first glance, this assumption of inhomogeneity may appear to be unphysical. However, if we consider the two variables in this surface as representing not spatial coordinates but other conceptual degrees of freedom, the inhomogeneity may readily appear. A typical example will be the frequently adopted fraction of native contact formation, Q20: when a protein forms the contacts within helices prior to the contacts for β-turns, the effect of solvent diffusion will likely be nonuniform over different Q. With the assumed description, the committor values in the two-dimensional space can be obtained by numerically solving the modified Smoluchowski’s equation, eq 1.18,19 Thus, the correlations of committors at various d values can be straightforwardly obtained. Figure 2a,b presents such correlations along the two paths at three different diffusion parameters. As can be clearly seen in the figure, the correlations have different patterns on the two paths. Of course, when the correlation is obtained at various conformations including off-pathway points, it will appear as scattered points. An additional, but important point to note in Figure 2 is the fact that committors along Path 2 have negligible dependence on the change of the diffusion parameter (i.e., the committor correlation is nearly along the diagonal line). In fact, we further note that the diffusion function D(x, y) changes only after the transition state on this path. Near the first transition state of Path 1, on the contrary, the diffusion function is not homogeneous and changes drastically as a function of the diffusion parameter. 
This fact again supports the view that the committor can be used as a sensitive indicator of the changes around the transition state region, because it is a sharply varying function in that region.21 Also, this is in accord with the mathematical suggestion in the above, regarding the use of the committor together with varying diffusion in the elucidation of the mechanistic aspect of the reaction. How can we devise a metric for measuring the degree of the changes in the dynamics based on this committor correlation in the multidimensional protein folding case? Again, we note that the correlation will lie on a specific curve (not necessarily a straight line) when there is a dominant folding path that does not vary at different viscosities. Because the off-pathway points will actually spread the correlation as explained above and because the degree of this spread will become larger when the

change in the pathway is larger, the scatter of the committor distribution in the correlation can be used as the metric for quantifying the degree of dynamics change as a function of viscosity. In the limit of complete sampling, the committor correlation can be expressed as a two-dimensional probability distribution function f(p1, p2). The scatter of the correlation can be described with

$$\sigma^2 = \int dp_1 \left[ \int p_2^2\, f(p_1, p_2)\, dp_2 - \left( \bar{p}_2(p_1) \right)^2 \right] \tag{6}$$

where p̄2(p1) is the conditional average of p2 at a given p1 value:

$$\bar{p}_2(p_1) = \int dp'_1 \int dp_2 \left[ \delta(p'_1 - p_1)\, p_2\, f(p_1, p_2) \right] \tag{7}$$

In practice, only a limited number of points will be available on the correlation plot (recall that each point in the correlation plot requires ∼100 simulations). We propose the following practical treatment for this actual case of incomplete data of the committor distribution. First, the domain of p1 is divided into NB bins, so that an equal number of data points belong to each bin. Let us denote the boundaries of these bins as p1,0 = 0, p1,1, ..., p1,NB-1, p1,NB = 1. Now, at each bin boundary p1,i, let us assume that the corresponding “optimal” p2,i value is assigned with a certain criterion. When this criterion is properly chosen, the set of (p1,i, p2,i) points connected with straight line segments will be a close approximation to eq 7, and the root-mean-squared scatter σ from this approximate p̄2 can be used as a scatter metric for the dynamics change. With these definitions, the only remaining issue is to define the method to obtain the set of p2,i values. In fact, we can define this set so that σ is minimized with respect to {p2,i}:

$$\sigma = \min_{\{p_{2,i}\}} \left[ \sigma(p_{2,i}) \right] \tag{8}$$

with a constraint of 0 ≤ p2,i ≤ 1. Because ∂σ/∂p2,i can be easily obtained with the above definitions, this minimization can be attained iteratively in a conventional manner.22

Figure 2. Correlations of committors at different diffusion parameters d for the two-dimensional surface along Path 1 (○) and Path 2 (×). Insets represent mean first passage times (MFPTs) for the same conformations. From MFPTs, it is clear that the dominant path (mechanism) will switch from Path 2 at high viscosity to Path 1 at low viscosity. Both committor and MFPT were obtained by solving appropriate Smoluchowski equations numerically at temperature kT = 15. The two paths are along the steepest descent paths shown in Figure 1.

3. Demonstration

To demonstrate the usefulness of the suggested pfold comparisons at different diffusion conditions, we have applied the method to the folding of the small protein BBA5, a 23-residue miniprotein designed and characterized by the Imperiali group.23,24 Its folding has been well characterized computationally with both implicit and explicit solvent models.25,26 In this work, Langevin dynamics with varying viscosity (γ) was used together with Still’s GB/SA model of solvation.27 The protein molecule was described with Garcia and Sanbonmatsu’s modified AMBER force field.28 The folding probabilities were calculated at 80 conformations sampled along various folding trajectories. To measure the folding probability of any given conformation, 100 independent molecular dynamics simulations were performed with randomly chosen initial velocities. With 100 samples, the standard deviation in the calculated pfold is 0.05 or less in all cases. Results at waterlike viscosity (γ0 = 91 ps⁻¹) were used as the reference for pfold comparisons. Each trajectory was continued for 5 ns, after which more than 90% of the trajectories had committed either to the folded (native) or to the unfolded state. The commitment was checked every 100 ps at the reference viscosity. For simulations with different viscosities, the simulation length and the check interval were linearly scaled (for example, at γ = γ0/2, the commitment was checked every 50 ps). The simulations were performed on the Folding@Home supercluster29,30 with the TINKER molecular dynamics simulation package.31 Other details of the simulation protocols can be found elsewhere.21

Figure 3 presents the variations of pfold for the conformation set at various viscosities. It is interesting to note that the degree of agreement (scatter of the points around the fit lines) varies widely depending on the change of viscosities.
In one sense, this is quite understandable considering that the change of diffusive characters from different viscosities will definitely change the detailed dynamics as suggested in the previous section. However, it is still important to note that relatively small changes in the viscosities (changes by a factor of 2) have a noticeable scatter in the correlation or perturbations on the dynamics. This is in fact in qualitative agreement with a previous finding by Zagrovic et al.4 regarding the kinetics change near the waterlike viscosity region: even though the protein folding rate showed a good linear correlation with the solvent viscosity, the slope of the folding rate versus viscosity relation was found to be less than unity (0.89 in TrpCage case4). This can be explained in terms of a continuous change of the “dominant path” of folding with different viscosities as schematically illustrated in Figure 4. Because the folding paths will change continuously with varying viscosities, the scatter of the pfold correlation will continuously increase and the change in the folding rate will slightly deviate from the expectation of the simple linear viscosity relation. With this observation, it will be pertinent to address a question regarding the behavior of the pfold correlation at extremely low viscosities. The dependence of the folding rate in this condition

(γ < 0.1γ0) was reported to deviate severely from the high-viscosity (γ ∼ γ0) trend, with a much smaller sensitivity of the folding rate to viscosity (slope of 0.21 for TrpCage).4 Figure 5a presents the degree of deviation in the pfold correlations, as measured by the root-mean-squared fit error (σ), at different viscosities. Even though one must be careful about analyzing the data based on the limited number of points shown in the figure, the trend of change is in exact agreement with the previous observation of the kinetics dependence.4

Figure 3. Correlations of pfold for BBA5 at different viscosities. Lines represent third-order polynomial fits to the data (with constraints on (0,0) and (1,1) points).

Figure 4. Schematic explanation of different viscosity dependence of different folding mechanisms. When solvent viscosity is small, the long-range contact formation will be fast (Path A). When the solvent viscosity is large, the short-range contact formation will become fast (Path B). The actual folding mechanism will be determined by the nature of both protein and solvent. BBA5 folds via Path B at normal viscosity. However, as the viscosity is lowered, the dominant path will continuously switch toward Path A as indicated by Path B′.

Figure 5. (a) Scatters of pfold correlations at different viscosities with NB = 5 (○), NB = 10 (□), and NB = 20 (×). Dotted lines represent simple trends based on data points in high- and low-viscosity regions for NB = 5. (b) Scatters of pfold correlations obtained with cubic polynomial fits. The statistical uncertainty in the scatter is smaller than the size of the circles.
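The model-free scatter σ discussed here follows the binning procedure of Section 2 (eqs 6-8). The sketch below is one possible reading of that procedure and should not be taken as the authors' implementation: it assumes distinct p1 values, places bin boundaries midway between neighboring sorted points, and uses projected gradient descent as the iterative minimizer, whereas the text only states that the minimization is done "in a conventional manner". All function names and the synthetic data are hypothetical.

```python
import random

def equal_population_edges(p1, n_bins):
    """Bin boundaries 0 = e_0 < ... < e_NB = 1 with equal counts per bin."""
    s = sorted(p1)
    n = len(s)
    inner = [0.5 * (s[k * n // n_bins - 1] + s[k * n // n_bins])
             for k in range(1, n_bins)]
    return [0.0] + inner + [1.0]

def node_weights(x, edges):
    """Index i and weight w so the interpolant is (1 - w) q[i] + w q[i + 1]."""
    i = 0
    while i < len(edges) - 2 and x > edges[i + 1]:
        i += 1
    return i, (x - edges[i]) / (edges[i + 1] - edges[i])

def binned_scatter(p1, p2, n_bins=5, n_iter=3000):
    """RMS scatter of p2 about a piecewise-linear curve through (edges[i], q[i]),
    minimized over q subject to 0 <= q[i] <= 1 (eq 8)."""
    edges = equal_population_edges(p1, n_bins)
    terms = [node_weights(x, edges) for x in p1]
    q = [0.5] * (n_bins + 1)
    step = 1.0 / (2.0 * len(p1))  # safe step: Hessian norm is at most 2N
    for _ in range(n_iter):
        grad = [0.0] * len(q)
        for (i, w), y in zip(terms, p2):
            r = (1.0 - w) * q[i] + w * q[i + 1] - y
            grad[i] += 2.0 * (1.0 - w) * r
            grad[i + 1] += 2.0 * w * r
        # Gradient step followed by projection onto the box [0, 1].
        q = [min(1.0, max(0.0, qj - step * gj)) for qj, gj in zip(q, grad)]
    sq = sum(((1.0 - w) * q[i] + w * q[i + 1] - y) ** 2
             for (i, w), y in zip(terms, p2))
    return (sq / len(p1)) ** 0.5, edges, q

# Synthetic (made-up) data: 80 points, as in the conformation set size used here.
random.seed(0)
p1 = [random.random() for _ in range(80)]
p2_same = list(p1)  # identical committors: no scatter expected
p2_noisy = [min(1.0, max(0.0, x + random.gauss(0.0, 0.1))) for x in p1]

sigma_same, _, _ = binned_scatter(p1, p2_same)
sigma_noisy, _, _ = binned_scatter(p1, p2_noisy)
print(sigma_same, sigma_noisy)
```

Because the residual is linear in the node values, the objective is a convex quadratic with box constraints, so a simple projected gradient iteration is sufficient for a sketch; any bounded least-squares solver would serve equally well.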

Interestingly, the crossing points of the low-viscosity and high-viscosity trend lines are in a quantitative agreement (∼0.17γ0 in this work versus ∼0.13γ0 in ref 4) for the widely different pairs of methods (pfold correlation versus folding rate comparison) and proteins (BBA5 versus TrpCage). Such an agreement suggests an intrinsic nature of solvent viscosity in regard to its effect on the protein folding kinetics and dynamics. We again stress that the drastic deviation of pfold correlation can be considered as a direct evidence of such mechanistic (dynamics) change based on the sensitivity of pfold in dynamics.11,21 One potential concern that one may have in this analysis may be in regard to the statistical uncertainty of σ that we have obtained from the limited number of correlation data. In fact, because each pfold measurement involves a certain level of statistical uncertainty (0.05 or less when 100 trajectories were used to determine the folding probability at each conformation) and because a similar level of uncertainty in σ will significantly weaken the above conclusion regarding the pronounced deviation at an extremely low viscosity, this assessment of statistical uncertainty in σ will be an important diagnostic. However, the iterative nature of the approach for obtaining σ described in the previous section renders this assessment rather difficult. To circumvent this difficulty, we have adopted a cubic polynomial fitting, where the fitting parameters and their explicit functional dependence on each pfold value can be straightforwardly defined. (See Appendix for the detailed mathematical expressions.) Figure 3 presents the polynomial fit lines thus obtained in each pfold correlation. The variation of the scatter metric at different viscosities is presented in Figure 5b together with the associated statistical uncertainty within the cubic polynomial model. 
One can see that the rather drastic change in the low-viscosity region is consistently observed, and the effect of the involved statistical uncertainty is quite negligible. Also, the very weak dependence of the crossover point (∼0.17γ0) on the fitting model, whether it is from the model-free scheme with different number of bins or from the more ad hoc cubic fitting, more strongly supports our suggestion described in the above. If pfold is sensitive to the change in dynamics as explained in the above, can it be used as a tool to gather more information of the given folding reaction? To approach an answer to this question, the pattern of the pfold correlation observed in Figure

3 has been further considered. Because the minimum-energy path on the effective folding free energy surface32 has a curvature as illustrated in Figure 4, the effective viscosity and thus the effective diffusion constant along the path will be inhomogeneous. Starting from eq 3, it is trivial to show that

$$\frac{dp'}{dp} = \frac{dp'/dx}{dp/dx} = \frac{C}{R'(x(p))} \tag{9}$$

where C denotes a constant given as

$$C = \frac{\displaystyle\int_a^b \frac{1}{D(y)}\, e^{\beta U(y)}\, dy}{\displaystyle\int_a^b \frac{1}{R'(y)\, D(y)}\, e^{\beta U(y)}\, dy} \tag{10}$$

This constant is difficult to obtain in realistic calculations unless the effective diffusion parameter (D(x)) and the potential of mean force (U(x)) are accurately known. However, this coefficient is only a constant and can be safely ignored in the following analysis. Assuming that the one-dimensional effective viscosity and effective diffusion are inversely proportional to each other, we can define the one-dimensional effective viscosity coefficient as η(x) = 1/R′(x), which is directly proportional to the slope of the pfold correlation, dp′/dp. Physically, this coefficient represents the sensitivity of the one-dimensional effective viscosity to changes in the solvent viscosity. Namely, when η(x) is larger at x1 than at x0, we can consider that the effects of the solvent viscosity change are more important at x1. Figure 6 shows the slopes of this correlation at selected solvent viscosities. We can see that the solvent viscosity change has a larger effect at the later stage of folding for this protein. This is in accord with the folding mechanism of this protein: because BBA5 folds with the diffusion-collision mechanism25,26 where the local structure is formed first and long-range tertiary contact is formed later, the solvent viscosity will have a larger effect in the later stage of folding (namely, the rate of the later stage is mostly determined by the solvent diffusion). With a small solvent viscosity, the effect on the post-TS region becomes not as prominent. This again suggests that the folding mechanism is affected differently by the use of low solvent viscosity.

Indeed, this finding may imply an interesting usage of the proposed method. By comparison of the folding probabilities of various conformations at different viscosities, mechanistic information may be deduced about the location where the formations of long-range contacts become important. Numerically, such an analysis will have an additional advantage: even though one-dimensional models of protein folding are useful and many approaches for obtaining such models have been proposed, it is still not trivial to obtain the effective potential U(x) along a “folding coordinate” x (at least in terms of computational cost). In the proposed analysis, however, the explicit form of U(x) is not needed in obtaining the pfold correlation. Thus, any difficulty related to the definition of an effective potential U(x) can be naturally avoided. Of course, the position along the folding coordinate (TS, pre-TS, or post-TS) of a given conformation can be directly deduced from the pfold value itself.

Figure 6. Dynamical sensitivity for solvent viscosity measured by the slope of pfold correlation. Slopes were obtained from the fit lines shown in Figure 3.

4. Conclusion

In summary, we have shown that the use of low solvent viscosities may result in complex changes in the dynamics of protein folding. By using the conformation specific folding probabilities (pfold)19,21 in a realistic simulation of protein folding with BBA5, we have observed that the change in dynamics became drastic as the viscosity was altered by more than an order of magnitude. It was also observed that the folding dynamics depended on the solvent viscosity in a markedly similar fashion to the case of the folding kinetics.4 This finding is interesting in that the two observations (dynamics versus kinetics) have used widely different measurables (pfold versus rate) and suggests the existence of two different types of (solvent dependent versus solvent independent) variables that govern protein folding. This is also in accord with various experimental findings with increased solvent viscosities. In fact, low-solvent viscosity simulations have been widely used in protein folding studies.4,33-35 As discussed in the introduction, the rationale of using such potentially unphysical viscosities is the simple relationship between the viscosity and the folding rate, as predicted by Kramers’s theory.36 One important usage of the low viscosity simulation is to obtain direct comparisons of the results from long-time and short-time simulations, practically because the long-time ones can only be performed with expedited simulations5 at the moment. For example, Caflisch and co-workers have used the fast solvent accessible surface area model37 in conjunction with zero solvent viscosity (no random collisions with solvent molecules) for the comparisons.38 By applying the simulation protocol to a small β-sheet protein (GS peptide), they reported significant differences in folding rate (kinetics) estimations from data sets based on the two different timescales. 
In fact, our results suggest that such a difference could be caused by a non-single-exponential behavior in the long-time dynamics (as can be deduced from the simulation-length-dependent estimates of the folding time constant when a single exponential behavior was assumed; see Figure 5 of ref 38). Such a non-single-exponential behavior was analyzed by Head-Gordon and co-workers39 using a simplified model40 of protein L at various viscosities: when the time scale of the equilibration within the unfolded state is not fast enough compared to the actual folding time scale, the kinetics from many short trajectories cannot represent the behavior of the actual system unless they are started from well equilibrated


conformations.39,41 The present result in conjunction with these reports offers a potential caution regarding the use of viscosity modifications. Because the changes in solvent viscosity have an inhomogeneous effect (smaller effect on the intrinsic chain diffusion), the kinetics will deviate from single exponentials when the intrinsic chain diffusion and solvent-driven diffusion fall into a competing regime as a result of such a modification. If the use of excessively low viscosity induces such a competition between the barrier crossing and the fluctuations inside the folded or unfolded basin, both the kinetics and dynamics of the simulated folding will likely deviate from the actual behavior in a more physical situation. Comparisons with experimental results will be important in the validation of such results. Because many experimental techniques such as temperature-jump kinetics measurement use unfolded states in an unequilibrated situation, apparent single exponential behavior from such an experiment is evidence of fast equilibration within the unfolded basin. If the low-viscosity simulation deviates from a single exponential for such a system, one may deduce that the use of a low viscosity may have introduced an artifact in the folding kinetics/dynamics of the given protein. It will also be interesting to investigate the relationship between the kinetics/dynamics changes at different viscosities for proteins with varying contact orders.42 Clearly, the change in the solvent viscosity will have a different effect on long-range contact formation than on short-range contact formation. Because the solvent viscosity change is likely to have a larger effect on long-range contact formation, we expect that the mechanism switch at an extreme viscosity predicted in our simulation will be more pronounced for proteins with large contact orders.
For proteins with large contact orders, one may potentially even design a more provocative experiment, where some of the solvent-exposed charged/hydrogen-bonding residues not directly involved in the contact formation are mutated into noncharged/non-hydrogen-bonding residues. The reasoning behind this proposal is the possibility of experimentally mimicking the low-viscosity regime: charged/hydrogen-bonding residues will have tighter interactions with the hydrogen-bonding network of water, effectively experiencing more viscous drag from the solvent. Finally, it is noted that the detailed dynamics of equilibration within the unfolded ensemble, as one of the characteristics of the unfolded state, has not been fully understood yet.39 Because diffusion is an important process that interconnects various states within the unfolded ensemble, investigation of the solvent viscosity dependence will play an important role in such studies. Ultimately, long-time dynamics with realistic modeling will present absolute comparisons with various simplified models and will be of help in gaining more insight into the nature of the unfolded state. Obviously, the ongoing developments in both experiment and theory for protein folding43 will make advances in these aspects and will enable a more comprehensive understanding of the complicated but interesting phenomena of protein folding.

Appendix

Measurement of Scatter in pfold Correlation and Its Statistical Uncertainty. When two sets of pfold values, {xi} and {yi} (i = 1, 2, ..., N), are correlated, their cubic polynomial fit can be formulated by optimizing the coefficients a and b via minimizing the following:

$$\Delta = \sum_i \left( y_i - a(x_i^3 - x_i) - b(x_i^2 - x_i) - x_i \right)^2 \tag{A1}$$

Because the two sets of pfold are measured with the same definitions of the folded and unfolded state, the fitting function must satisfy yx)0 ) 0 and yx)1 ) 1. The target function y ) a(x3 - x) + b(x2 - x) + x is designed so that these boundary conditions are satisfied. It is trivial to show that the two fitting parameters can be obtained as a ) (CX - BY)/(AC - B2) and b ) (AY - BX)/(AC - B2) with

$$A = \sum_i (x_i^3 - x_i)^2 \tag{A2}$$

$$B = \sum_i (x_i^2 - x_i)(x_i^3 - x_i) \tag{A3}$$

$$C = \sum_i (x_i^2 - x_i)^2 \tag{A4}$$

$$X = \sum_i (y_i - x_i)(x_i^3 - x_i) \tag{A5}$$

$$Y = \sum_i (y_i - x_i)(x_i^2 - x_i) \tag{A6}$$
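The closed-form expressions for a and b follow from the stationarity conditions of eq A1. The short derivation below is only a sketch of that step; the shorthand $u_i$, $v_i$, and $d_i$ is introduced here for convenience and is not part of the original notation.

```latex
\begin{align*}
u_i &\equiv x_i^3 - x_i, \qquad v_i \equiv x_i^2 - x_i, \qquad d_i \equiv y_i - x_i, \\
\Delta &= \sum_i \left( d_i - a u_i - b v_i \right)^2, \\
\frac{\partial \Delta}{\partial a} = 0 \;&\Rightarrow\; aA + bB = X, \\
\frac{\partial \Delta}{\partial b} = 0 \;&\Rightarrow\; aB + bC = Y, \\
a &= \frac{CX - BY}{AC - B^2}, \qquad b = \frac{AY - BX}{AC - B^2}.
\end{align*}
```

The last line is Cramer's rule applied to the two linear equations, and it requires $AC - B^2 \neq 0$, which holds whenever the $x_i$ take at least two distinct values other than 0 and 1.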

When these parameters are obtained, the scatter from the correlation can be simply represented as

$$s = \sqrt{\frac{\Delta}{N}} \tag{A7}$$
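The fit and scatter of eqs A1-A7 can be sketched in a few lines of Python. This is a minimal illustration, not the code used in this work; the function names `constrained_cubic_fit` and `scatter` are hypothetical, and only the standard library is assumed.

```python
import math


def constrained_cubic_fit(x, y):
    """Least-squares fit of y = a(x^3 - x) + b(x^2 - x) + x (eqs A1-A6).

    The functional form passes through (0, 0) and (1, 1) by construction.
    Returns the coefficients (a, b).
    """
    u = [xi**3 - xi for xi in x]            # cubic basis term
    v = [xi**2 - xi for xi in x]            # quadratic basis term
    d = [yi - xi for xi, yi in zip(x, y)]   # deviation from the diagonal

    A = sum(ui * ui for ui in u)                    # eq A2
    B = sum(vi * ui for ui, vi in zip(u, v))        # eq A3
    C = sum(vi * vi for vi in v)                    # eq A4
    X = sum(di * ui for di, ui in zip(d, u))        # eq A5
    Y = sum(di * vi for di, vi in zip(d, v))        # eq A6

    det = A * C - B * B   # zero only for degenerate input
    a = (C * X - B * Y) / det
    b = (A * Y - B * X) / det
    return a, b


def scatter(x, y, a, b):
    """Root-mean-square deviation from the fitted curve (eq A7)."""
    delta = sum(
        (yi - a * (xi**3 - xi) - b * (xi**2 - xi) - xi) ** 2
        for xi, yi in zip(x, y)
    )
    return math.sqrt(delta / len(x))
```

For example, `a, b = constrained_cubic_fit(x, y)` followed by `scatter(x, y, a, b)` gives s for two lists of pfold values of equal length.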

Because the measurement errors of each pfold, $\delta x_i$ and $\delta y_i$, can be obtained from a simple rule of the binomial distribution, and because these errors are independent of each other, the statistical uncertainty in s can be formulated based on these measurement errors. After a little algebra, it can be shown that the following is satisfied:

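$$\delta s = \frac{1}{2sN} \sqrt{\sum_i \left[ \left( \frac{\partial \Delta}{\partial x_i} \right)^2 \delta x_i^2 + \left( \frac{\partial \Delta}{\partial y_i} \right)^2 \delta y_i^2 \right]} \tag{A8}$$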
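One possible way to evaluate eq A8 numerically is sketched below. It rests on several assumptions that are not stated in the text: the function name `scatter_with_uncertainty` and the parameter `n_trials` (the number of independent trial trajectories behind each pfold value) are hypothetical; the binomial error is taken as $\sqrt{p(1-p)/n_{\mathrm{trials}}}$; and the partial derivatives of Δ are taken with a and b held fixed, which is justified at the optimum because ∂Δ/∂a = ∂Δ/∂b = 0 there.

```python
import math


def scatter_with_uncertainty(x, y, n_trials):
    """Return (a, b, s, ds) for paired pfold values x and y.

    n_trials is a hypothetical parameter: the assumed number of
    independent trajectories used to estimate each pfold value.
    """
    n = len(x)
    u = [xi**3 - xi for xi in x]
    v = [xi**2 - xi for xi in x]
    d = [yi - xi for xi, yi in zip(x, y)]

    # Closed-form least-squares solution (eqs A2-A6).
    A = sum(ui * ui for ui in u)
    B = sum(ui * vi for ui, vi in zip(u, v))
    C = sum(vi * vi for vi in v)
    X = sum(di * ui for di, ui in zip(d, u))
    Y = sum(di * vi for di, vi in zip(d, v))
    det = A * C - B * B
    a = (C * X - B * Y) / det
    b = (A * Y - B * X) / det

    # Residuals, Delta (eq A1), and scatter (eq A7).
    r = [di - a * ui - b * vi for di, ui, vi in zip(d, u, v)]
    s = math.sqrt(sum(ri * ri for ri in r) / n)

    # Error propagation (eq A8) with a and b held fixed (assumption).
    total = 0.0
    for xi, yi, ri in zip(x, y, r):
        var_x = xi * (1.0 - xi) / n_trials   # assumed binomial variance
        var_y = yi * (1.0 - yi) / n_trials
        slope = a * (3 * xi**2 - 1) + b * (2 * xi - 1) + 1  # dy_fit/dx
        d_delta_dx = -2.0 * ri * slope
        d_delta_dy = 2.0 * ri
        total += d_delta_dx**2 * var_x + d_delta_dy**2 * var_y

    ds = math.sqrt(total) / (2.0 * s * n)   # undefined if s == 0
    return a, b, s, ds
```

The sketch divides by s, so it fails for a perfect fit (s = 0); a production version would need to handle that case explicitly.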

Because the formulas for the partial derivatives of Δ can be trivially obtained, they will not be elaborated here. One important aspect of this uncertainty is its dependence on the number of conformations N used in generating the correlation. Because the summation runs over i = 1, 2, ..., N, the uncertainty δs scales as ∼1/√N. Because we have adopted N = 80 throughout this work, it is natural that this uncertainty is approximately 1 order of magnitude smaller than the errors in the pfold measurements (∼0.05).

Acknowledgment. It is our pleasure to contribute to this issue in honor of Attila Szabo, whose work has inspired us on countless occasions. This work was supported by NSF Molecular Biophysics (Grant MCB-0317072) and NIH (Grant R01GM062868). Y.M.R. acknowledges support from the William Nichols Graduate Fellowship through the Department of Chemistry, Stanford University.

References and Notes

(1) Ansari, A.; Jones, C. M.; Henry, E. R.; Hofrichter, J.; Eaton, W. A. Science 1992, 256, 1796.
(2) Jas, G. S.; Eaton, W. A.; Hofrichter, J. J. Phys. Chem. B 2001, 105, 261.
(3) Qiu, L. L.; Hagen, S. J. J. Am. Chem. Soc. 2004, 126, 3398.
(4) Zagrovic, B.; Pande, V. S. J. Comput. Chem. 2003, 24, 1432.
(5) Honeycutt, J. D.; Thirumalai, D. Biopolymers 1992, 32, 695.
(6) Ponder, J. W.; Case, D. A. Adv. Protein Chem. 2003, 66, 27.
(7) Qiu, L. L.; Hagen, S. J. Chem. Phys. 2004, 307, 243.
(8) Klimov, D. K.; Thirumalai, D. Phys. Rev. Lett. 1997, 79, 317.
(9) Plotkin, S. S.; Wolynes, P. G. Phys. Rev. Lett. 1998, 80, 5015.
(10) Best, R. B.; Hummer, G. Phys. Rev. Lett. 2006, 96, 228104.
(11) Du, R.; Pande, V. S.; Grosberg, A. Y.; Tanaka, T.; Shakhnovich, E. S. J. Chem. Phys. 1998, 108, 334.
(12) Ma, A.; Dinner, A. R. J. Phys. Chem. B 2005, 109, 6769.
(13) Berezhkovskii, A.; Szabo, A. J. Chem. Phys. 2005, 122, 014503.
(14) Best, R. B.; Hummer, G. Proc. Natl. Acad. Sci. U.S.A. 2005, 102, 6732.
(15) Krivov, S. V.; Karplus, M. J. Phys. Chem. B 2006, 110, 12689.
(16) Bolhuis, P. G. Proc. Natl. Acad. Sci. U.S.A. 2003, 100, 12129.
(17) Park, S.; Sener, M. K.; Lu, D.; Schulten, K. J. Chem. Phys. 2003, 119, 1313.
(18) Gardiner, C. W. Handbook of Stochastic Methods; Springer: Berlin, Germany, 1985.
(19) Rhee, Y. M.; Pande, V. S. J. Phys. Chem. B 2005, 109, 6780.
(20) Clementi, C.; Garcia, A. E.; Onuchic, J. N. J. Mol. Biol. 2003, 326, 933.
(21) Rhee, Y. M.; Pande, V. S. Chem. Phys. 2006, 323, 66.
(22) Press, W. H.; Teukolsky, S. A.; Vetterling, W. T.; Flannery, B. P. Numerical Recipes in C, 2nd ed.; Cambridge University Press: New York, 1995.
(23) Struthers, M.; Cheng, R.; Imperiali, B. Science 1996, 271, 342.
(24) Struthers, M.; Ottesen, J. J.; Imperiali, B. Folding Des. 1998, 3, 95.
(25) Rhee, Y. M.; Sorin, E. J.; Jayachandran, G.; Lindahl, E.; Pande, V. S. Proc. Natl. Acad. Sci. U.S.A. 2004, 101, 6456.
(26) Snow, C. D.; Nguyen, N.; Pande, V. S.; Gruebele, M. Nature 2002, 420, 102.
(27) Qiu, D.; Shenkin, P. S.; Hollinger, F. P.; Still, W. C. J. Phys. Chem. A 1997, 101, 3005.
(28) Garcia, A. E.; Sanbonmatsu, K. Y. Proc. Natl. Acad. Sci. U.S.A. 2002, 99, 2782.
(29) Pande, V. S.; Baker, I.; Chapman, J.; Elmer, S. P.; Khaliq, S.; Larson, S. M.; Rhee, Y. M.; Shirts, M. R.; Snow, C. D.; Sorin, E. J.; Zagrovic, B. Biopolymers 2003, 68, 91.
(30) Shirts, M.; Pande, V. S. Science 2000, 290, 1903.
(31) Ponder, J. W. TINKER, Software Tools for Molecular Design; Department of Biochemistry and Molecular Biophysics, Washington University: St. Louis, MO, 2000.
(32) Fleurat-Lessard, P.; Ziegler, T. J. Chem. Phys. 2005, 123, 084101.
(33) Snow, C. D.; Qiu, L.; Du, D.; Gai, F.; Hagen, S. J.; Pande, V. S. Proc. Natl. Acad. Sci. U.S.A. 2004, 101, 4077.
(34) Ferrara, P.; Caflisch, A. Proc. Natl. Acad. Sci. U.S.A. 2000, 97, 10780.
(35) Simmerling, C.; Strockbine, B.; Roitberg, A. E. J. Am. Chem. Soc. 2002, 124, 11258.
(36) Kramers, H. A. Physica 1940, 7, 284.
(37) Ferrara, P.; Apostolakis, J.; Caflisch, A. Proteins 2002, 46, 24.
(38) Paci, E.; Cavalli, A.; Vendruscolo, M.; Caflisch, A. Proc. Natl. Acad. Sci. U.S.A. 2003, 100, 8217.
(39) Marianayagam, N. J.; Fawzi, N. L.; Head-Gordon, T. Proc. Natl. Acad. Sci. U.S.A. 2005, 102, 16684.
(40) Brown, S.; Fawzi, N. L.; Head-Gordon, T. Proc. Natl. Acad. Sci. U.S.A. 2003, 100, 10712.
(41) Fersht, A. R. Proc. Natl. Acad. Sci. U.S.A. 2002, 99, 14122.
(42) Plaxco, K. W.; Simons, K. T.; Baker, D. J. Mol. Biol. 1998, 277, 985.
(43) Wand, J. Chem. Rev. 2006, 106, 1543.