Force-Induced Change in Protein Unfolding Mechanism: Discrete or

Jan 27, 2011 - Although the approximately log−linear behavior for these pulling directions at low force could be fit by a Bell model, significant de...
0 downloads 0 Views 6MB Size
ARTICLE pubs.acs.org/JPCB

Force-Induced Change in Protein Unfolding Mechanism: Discrete or Continuous Switch? Thomas G. W. Graham and Robert B. Best* Department of Chemistry, Cambridge University, Lensfield Road, Cambridge CB2 1EW, U.K.

bS Supporting Information ABSTRACT: Mechanical stretching of proteins modifies their folding kinetics and may also cause a switch of folding mechanism from that at zero force. It is not clear from the kinetics alone whether the change is a continuous distortion of the zero force pathway or it occurs via a “discrete switch” to an alternative pathway. We use molecular simulations to dissect this switch of mechanism as a pulling force is applied to protein G via four different pairs of residues, or “pulling coordinates”. Using a statistical clustering approach based on the pattern of native contact formation, we find distinct unfolding mechanisms at low and high force. For pulling coordinates for which the protein is resistant to the applied force, a marked “turnover” in the force-dependent unfolding kinetics is associated with an abrupt switch to a novel mechanical unfolding pathway. In contrast, pulling along coordinates where the protein has low resistance to force induces a smoother acceleration in the unfolding rate and a more gradual shift in the unfolding mechanism. The switch in folding pathway is captured by projection onto appropriate two-dimensional free energy surfaces, which separate the low and high force transition states. Remarkably, we find for a weak coordinate that the high force transition state is already accessible in the absence of force. Brownian dynamics simulations on these surfaces capture the force dependence of the kinetics, supporting the use of simplified low-dimensional models for interpreting mechanical unfolding experiments. We discuss the implications of the switch in pathway for the mechanical strength of proteins, and how such a switch may be experimentally tested.

’ INTRODUCTION Ensemble experiments, together with theory, have yielded many insights into how proteins fold and have led to a general understanding of the underlying principles.1-5 For example, the cooperative (often, two-state) nature of folding may be understood as arising from the funneled nature of the folding energy landscape (reviewed in ref 6). The recent development of singlemolecule techniques promises to advance our knowledge further by allowing protein folding to be studied at an unprecedented level of detail (reviewed in refs 7 and 8). Tracking individual molecules has numerous advantages; for example, the properties of folded and unfolded subpopulations may be resolved and studied separately;9-11 it can be demonstrated directly that intermediates are “on pathway”,12 and in the future, it may be possible to observe directly events along folding “transition paths”, given sufficient experimental resolution.13 The powerful class of single-molecule experiments considered here uses atomic force microscopy (AFM),14-16 or optical or electromagnetic tweezers,12,17-19 to probe the free energy landscapes of proteins using precise molecular-scale mechanical perturbations20-39 In these protein pulling experiments, the AFM or optical/ electromagnetic trap is used to apply a mechanical force to the r 2011 American Chemical Society

protein via two attachment points, usually the N- and C-termini. Unfolding (and, in some cases, folding12,40,41) events in the presence of the applied force can be identified by tracking the molecular extension, also known as the “pulling coordinate” (in practice, the effects of molecular handles linking the protein and the pulling device must also be accounted for). The pulling coordinate naturally suggests itself as a progress variable, since it is closely related to the experimental observable and the effect of force on the energy landscape is simply the projection of the pulling force onto this coordinate. Consequently, several onedimensional models of folding using the pulling coordinate as a reaction coordinate have been derived for interpreting experiments: a very simple model, devised by Bell42 and commonly used to fit experimental data,43,44 supposes that the positions of the native state and transition state along the pulling coordinate remain approximately the same in the presence of different applied forces, allowing the effect of force on the unfolding energy barrier, and rate, to be calculated straightforwardly. Received: November 10, 2010 Revised: December 31, 2010 Published: January 27, 2011 1546

dx.doi.org/10.1021/jp110738m | J. Phys. Chem. B 2011, 115, 1546–1561

The Journal of Physical Chemistry B A more sophisticated description based on Kramers’ rate theory,45 also accounts for the shape of the barrier and the shift in transition state position along the reaction coordinate caused by the applied force. While these one-dimensional descriptions have been successfully applied to fit experimental data in many cases,16,18,25,46,47 the extension along the pulling direction is not in general an ideal reaction coordinate,48 and a multidimensional description may be required in some cases to capture the dependence of folding and unfolding kinetics on applied forces.49-51 In particular, various lines of experimental and theoretical evidence suggest a switch between distinct unfolding pathways in the absence or presence of force.27,52 Disagreement between Φ values2,53 measured in mechanical and chemical denaturation experiments suggests that chemical and mechanical unfolding occurs along different pathways characterized by structurally distinct transition states, and that mechanical unfolding may even proceed via intermediates that do not occur during denaturant-induced unfolding;15,24,39,54-56 on the other hand, there are some broad similarities in the overall pattern of Φ-values. Simulations of coarse-grained models have shown a “turnover” in protein unfolding kinetics between low-force and high-force regimes, which has been interpreted as evidence of parallel pathways that differ in their sensitivity to an applied force.48,57 Constant-velocity pulling experiments have likewise revealed a turnover in mean rupture force at very low pulling velocities.51 Here, we use coarse grained simulations to study in detail the effect of pulling in different directions on the unfolding kinetics and mechanism of GB1, the B1 domain of streptoccocal protein G (GB1). This protein is a widely used model in ensemble protein folding experiments,58-64 and has been studied more recently by single-molecule force spectroscopy.65-68 Previous experimental and simulation results for GB1 and for the topologically similar B1 domain of streptoccocal protein L have suggested that the mechanical strength of these domains arises from the presence of a parallel β-sheet “force clamp” motif between strands 1 and 4, and that rupture of these contacts is the principal barrier to mechanical unfolding.30,67,69-74 GB1 has also recently been used to develop novel materials designed to mimic the mechanical properties of muscles;75 therefore, understanding the force dependence of its mechanical strength may have practical applications. We find a marked anisotropy in the mechanical resistance of GB1, similar to that observed in simulations of other proteins,70 and in AFM pulling experiments.25,26 When pulled along coordinates parallel to the β-sheet, GB1 exhibits high mechanical resistance, with a marked “turnover” from a low force regime in which the unfolding rate is relatively insensitive to the applied force to a high force regime where the unfolding rate is strongly force-dependent. We refer to such coordinates as “mechanically strong”, since the protein will tend to resist unfolding when pulled along them. Pulling perpendicular to the β-sheet results in dramatically lower mechanical resistance, with negligible kinetic turnover. By applying a clustering analysis to unfolding trajectories, we show the kinetic turnover for pulling along two mechanically strong directions is accompanied by a sudden, discrete shift in unfolding mechanism from a native-like pathway at low force to a novel rupture pathway at high force. In contrast, pulling along two mechanically weak directions accelerates unfolding gradually while causing a more continuous shift in the unfolding mechanism. By analyzing low-dimensional projections onto the pulling coordinate, r, and the fraction of native

ARTICLE

Figure 1. Equilibrium folding of the Go model of protein G. Left: The fraction of native contacts (Q) as a function of time along a sample trajectory segment. Right: Free energy, G(Q), along the Q coordinate.

contacts, Q, we identify alternate transition state structures at intermediate force along the strong 1-56 pulling coordinate that are associated with intrinsic-like or mechanical transition pathways. Collectively, these results provide a detailed portrait of force-induced shifts between parallel unfolding pathways and suggest that interesting behavior may be observed experimentally through the use of methods that probe mechanical unfolding over a broader range of forces.

’ RESULTS AND DISCUSSION Coarse Grained Folding Model for GB1. We use a coarse grained Gooh-like model to study the folding of the GB1 domain of protein G, in which each residue is represented by a single bead. Such a coarse grained model allows us to sample relevant time scales that would be inaccessible to all-atom simulations. By construction, Goh-like models result in a funneled energy landscape, and they have been very successful in predicting mechanisms of folding, binding, and domain swapping,5,76-78 as well as for studying mechanical unfolding.70,79,80 It has been previously found for a number of proteins that the overall fraction of native contacts formed, Q (see Methods), serves as a useful coordinate to describe the progress of the folding/unfolding reaction.81 Monitoring Q over the course of long simulation trajectories of the GB1 Go model revealed back-and-forth transitions between a folded (high Q) state and an unfolded (low Q) state (Figure 1), as has been previously observed for similar models by other authors.82,83 The temperature for these simulations, 300 K, was chosen such that the folded state was favored at zero force, with an equilibrium constant of Keq = [U]eq/[F]eq ≈ 0.03 based on long equilibrium simulations. We reasoned that choosing a temperature that favors the folded state more closely reflects typical experimental conditions for mechanical unfolding experiments in which the folded state is much more stable than the unfolded state in the absence of force, and the “intrinsic’’ barrier to unfolding is high. By comparison, the experimentally measured Keq of GB1 in the absence of denaturant at 298 K is ∼0.000360 (we did not attempt to match this stability in the simulation, as it would have resulted in a “downhill” folding scenario which is unrealistic for this protein). Also shown in Figure 1 is the potential of mean force (PMF), G(Q), as a function of Q, which is related to the equilibrium probability density on Q, P(Q), by P(Q) = e-βG(Q), where β = 1/(kBT). This one-dimensional projection of the protein folding free energy landscape onto Q reveals distinct folded and unfolded basins separated by an energy barrier. In addition, the presence of an intermediate species is shown by a third local minimum in the PMF centered at Q ∼ 0.35. Inspection of frames with Q near this value shows that this intermediate contains a core β-sheet 1547

dx.doi.org/10.1021/jp110738m |J. Phys. Chem. B 2011, 115, 1546–1561

The Journal of Physical Chemistry B

Figure 2. Force-dependent unfolding kinetics for four different pulling coordinates. (A) Residue pairs defining the pulling directions used in the simulations. (B) Computed waiting times for pulling at different forces in different directions. Fits are colored by pulling direction using blue for 1-56, red for 10-48, green for 32-56, and purple for 10-32. Solid lines represent fits to a single-Bell (10-32 and 32-56 directions) or double-Bell model (1-56 and 10-48 directions). Dashed lines represent fits to the DHS model45 with ν = 1/2 (10-32 and 32-56) or to the DHS model with ν = 1/2 plus a force-independent constant (1-56 and 10-48). Fitted models and parameters are given in Table 1.

nucleus composed of strands 1, 4, and 3, with the helix and strand 2 mostly unfolded and detached from this folded core. The intermediate becomes less prominent (and folding appears more two-state) at higher temperature84 and in the presence of applied force (see below). There is conflicting evidence in the experimental literature regarding whether GB1 unfolds via an intermediate.61,63,64 Results from continuous flow fluorescence experiments indicate the presence of an intermediate in which tryptophan residue 43 in strand 3 has native-like fluorescence.61,63 In contrast to the results of our simulations, the barrier between the unfolded and intermediate states was predicted experimentally to be the principal barrier in the absence of denaturant.63 On the basis of the comparison of our simulated Keq with that from experiments, however, our simulations may correspond more closely to mildly denaturing conditions, for which the barrier between the native and intermediate states becomes higher than that between the intermediate and the unfolded state.63 Our observations are qualitatively consistent with results from NMR experiments85 and Φ value analysis,64 which suggest initial loss of structure in the helix and N-terminal hairpin along the unfolding pathway in solution, even though Baker and colleagues64 do not detect evidence of a metastable intermediate. Calculation of Unfolding Kinetics from First Passage Times. We investigated how the rate of unfolding of the GB1 Go model varies when pulling forces are applied to different pairs of amino acids (Figure 2A). Experimentally, this might be realized by disulfide cross-linking of different protein molecules via surface cysteine residues placed at these sites,34 or by more specialized chemistry,25 since, unlike ubiquitin, protein G does not naturally occur in cross-linked form.26 Given the relatively low computational cost of Go model simulations, it was possible to compute the rate of unfolding from the mean first passage time

ARTICLE

using many independently initialized simulations. Simulations were initiated from randomly selected conformations at the predicted center of the folded basin (Q = 0.9 ( 0.01), and were said to have unfolded when Q dropped below 0.13, the predicted center of the unfolded basin (cf. Figure 1). By running many such randomly initialized simulations, we accumulated histograms of unfolding times at a range of different forces along different pulling directions. We attempted to choose pulling coordinates with different mechanical strengths by comparing average inter-residue distances between the folded state and the predicted intrinsic transition state. To obtain an initial structure of the intrinsic transition state, a collection of putative transition state configurations was selected from the predicted barrier region at Q ∼ 0.55 and verified to have a splitting probability for folding (Pfold) between 0.4 and 0.6 by shooting simulations with randomized initial velocities (data not shown). One-dimensional models would predict that pulling along coordinates that have a large increase in distance between the folded state and the intrinsic transition state (Δxq) will accelerate unfolding kinetics substantially since a force F lowers the barrier height by approximately FΔx‡; such pulling directions, in other words, will be mechanically “weak”. In constrast, pulling directions with a small or negative change in extension between the folded and transition state are predicted to be mechanically “strong”. On the basis of this reasoning, we successfully identified two mechanically weak and two mechanically strong pulling coordinates (Figure 2A). Pulling along the weak coordinates between residues 10 and 32 and between residues 32 and 56 causes an immediate acceleration in the rate of unfolding, even at very low forces (Figure 2B). In contrast, pulling along the mechanically strong directions between residues 1 and 56 and between residues 10 and 48 accelerates unfolding only at relatively high forces (Figure 2B), consistent with earlier experimental and simulation results showing anisotropy in the mechanical resistance of other proteins.25,26,34,48,70,86,87 Counterintuitively, pulling at low forces along the 1-56 coordinate even slows down unfolding, an effect previously observed in Go model simulations of the topologically equivalent protein L and of titin I27,57 as well as in pulling simulations of lattice models.52 Interestingly, both strong pulling directions lie along the axis of the β-sheet, while both weak pulling directions are between the β-sheet and the R-helix (Figure 2A). The initial force dependence of the unfolding rate for the weak 10-32 and 32-56 pulling directions was well fit by onedimensional models, which is consistent with pulling in these directions acting primarily to accelerate unfolding via an intrinsiclike pathway. Although the approximately log-linear behavior for these pulling directions at low force could be fit by a Bell model, significant deviations were observed at high force, likely reflecting a Hammond-like shift of the transition state toward the folded basin at very high forces. This effect could be captured by fitting to the model of Dudko, Hummer, and Szabo (DHS).45 Assuming a cusp-like (ν = 1/2) or linear-cubic (ν = 2/3) barrier gave similar fits, and results from the former are shown here (Table 1). In contrast to the weak pulling directions, the kinetic trends along the strong pulling directions showed a nonmonotonic dependence on the pulling force which could not be explained by the one-dimensional models. Indeed, the unfolding rate initially slows down for pulling along 1-56 at low force. However, we were able to obtain reasonable fits by modeling the rate as a 1548

dx.doi.org/10.1021/jp110738m |J. Phys. Chem. B 2011, 115, 1546–1561

The Journal of Physical Chemistry B

ARTICLE

Table 1. Parameters for the DHS and Bell Models Fitted to the Unfolding Times in Figure 2Ba coordinate (model)

k1(0) (ms-1)

k2(0) (ms-1)

Δxq1 (nm)

Δxq2 (nm)

ΔGq (kcal/mol)

1-56

48

-0.058

1.4

(double Bell)

29,67

-0.131,0.016

0.3,2.5

0.16,0.18

1-56

41

0.19

0.25

14.1

(DHSþconst.)

30,56

0.07,5.0

0.11,0.39

4.3,23.9

0.24

0.17

10-48

57

-0.10

8

(double Bell)

31,83

-0.30,0.11

1,14

0.21,0.27

10-48

58

1.5

0.37

10.1

(DHSþconst.) 32-56

50,66 53

1.0,2.3

0.35,0.39

9.9,10.3

0.35

(Bell)

40,69

0.33,0.38

32-56

38

0.49

7.1

(DHS)

25,59

0.41,0.56

6.5,7.8

10-32

56

0.46

(Bell)

47,68

0.43,0.49

10-32

51

0.56

6.9

(DHS)

37,71

0.51,0.61

6.5,7.2

a

95% confidence intervals are shown in italics below each fit. The data were fit to the following four models: Bell: k(F) = k1(F)exp(βFΔx‡1); DHS: k(F) = k1(0)[1-vFΔx‡1/βΔG‡]1/v-1exp(βΔG‡[1-(1-vFΔx‡1/βΔG‡)1/v]); Double-Bell (see eq. 1); DHSþconst: k(F) = k1(0) þ k2(0)[1-vFΔx‡2/ βΔG‡]1/v-1 exp(βΔG‡[1-(1-vFΔx‡2/βΔG‡)1/v])n.

sum over two processes. Writing the overall rate as the sum of a force-independent “intrinsic” rate and a force-dependent rate given by the DHS model gave reasonable fits for both the 10-48 and 1-56 directions (ν = 1/2).48 We note that the fitted barrier heights for the DHS model for the strong pulling directions are roughly double those for the weak pulling directions, consistent with unfolding along the strong coordinates being very unfavorable at low force. Although the use of a constant intrinsic rate of unfolding misses the decrease in unfolding rate at low forces in the 1-56 direction, using the sum of two DHS models results in more parameters (six) than can be determined by the data. As an alternative, we have fitted a model with a sum of two Bell terms, which requires four parameters: q

q

kðFÞ ¼ k1 ð0ÞeβFΔx1 þ k2 ð0ÞeβFΔx2

ð1Þ

Here, the two terms are intended to represent the flux over distinct barriers with different rate constants in the absence of force (k(0)) and at different displacements from the native state along the pulling coordinate (xq). The resulting fits for 1-56 pulling predict the existence of one unfolding pathway that is faster than the other in the absence of force but whose rate decreases in the presence of force due to a negative displacement to the transition state, xq1. In contrast, unfolding through the second pathway accelerates as the force increases, due to xq2 being positive. Thus, with increasing force, the initially faster pathway becomes slower and the initially slower pathway becomes faster, giving rise to the turnover observed in the kinetics. An analogous kinetic turnover has been observed in chemical denaturation experiments as a function of denaturant concentration, and was similarly interpreted as evidence of parallel folding pathways.88,89 The data at high force were not well fit by the Bell model (as for the weaker pulling directions) due to the shift of the barrier with force. We find that, for pulling at 300 pN along 10-48, we even obtain a non-exponential waiting time distribution (not shown), suggesting barrierless folding. An alternative to considering a sum of two one-dimensional models is an explicitly two-dimensional model.50 Such a model

captures the scenario in which a single saddle is distorted by force, giving rise to a turnover in the kinetics. We give the results of the fit to such a model in Figure 1 of the Supporting Information. The Unfolding Mechanism of GB1 Changes in Response to External Forces. To study how the observed trends in the kinetics relate to changes in the mechanism of unfolding, we sought to develop a measure of the similarity between different unfolding trajectories. Due to the importance of native contacts in the Go model, we have used the order of native contact formation as a basis for comparing trajectories.90 Since real unfolding trajectories do not proceed unidirectionally but involve numerous recrossings, there are several possible ways to compare the order of events between different trajectories.91 We obtained reasonable results by considering the average formation of the different contacts as a function of the overall fraction of native contacts Q, an approach that we will refer to as “average contact formation” or ACF. ACF arrays were constructed by first sorting frames from each transition path trajectory by total Q into six bins of width 0.1 between Q = 0.2 and Q = 0.8, and then averaging the degree of formation, qij, of each contact over all frames within each bin. Such an array is expected to reflect, on average, which structural elements are folded and which are unfolded at each stage of the unfolding process. We obtained similar results by comparing trajectories based on the order in which contacts broke for the first time during unfolding (analogous to “contact appearance order” [CAO] in ref 90), although we found this to be less robust at distinguishing different classes of folding mechanisms. In order to cluster groups of similar trajectories, we defined the “distance” or “dissimilarity” between two trajectories as the Euclidian distance between their ACF arrays (eq 8 in Methods). The array plots in Figure 3 show all pairwise ACF distances between trajectories at every force. For all pulling coordinates, a shift in mechanism occurs with increasing force, as is evidenced by the fact that trajectories at similar forces tend to have shorter distances between them than trajectories at different forces. 1549

dx.doi.org/10.1021/jp110738m |J. Phys. Chem. B 2011, 115, 1546–1561

The Journal of Physical Chemistry B

Figure 3. Continuous versus discrete switch of mechanism. ACF distances between pairs of trajectories at different forces are shown in matrix form, for each of the four pulling directions. Several hundred individual trajectories are arranged by pulling force (colored bars) along the horizontal and vertical axes. Darker shades correspond to larger intertrajectory distances (i.e., less similar trajectories), and lighter shades correspond to smaller distances.

However, differences in the exact shape and magnitude of these shifts imply that the unfolding mechanism is affected differently by pulling in different directions. For the mechanically strong 1-56 and 10-48 directions, a discrete switch is observed between two major classes of trajectories that dominate at low and high force. Trajectories within each group are more similar to other trajectories within the same group than to trajectories within the opposite group. The force at which the switch between “low-force” and “high-force” trajectories occurs, ∼100 and ∼50 pN for 1-56 and 10-48 pulling, respectively, coincides with the point at which the turnover in the kinetics is observed. In contrast, acceleration of the unfolding rate for the weak 10-32 and 32-56 pulling directions is associated with a more gradual shift in mechanism, while more dramatic shifts in mechanism occur only at relatively high forces, where the unfolding barrier is very small or vanishing. Clustering Analysis Reveals Force-Induced Shifts between Unfolding Mechanisms. In order to study in more detail how the mechanism of unfolding changes in response to pulling, we used the intertrajectory ACF distances plotted in Figure 3 to identify groups of similar trajectories by k-means clustering using 10 clusters (see Methods); choosing different numbers of clusters gave qualitatively similar results. Parts A-D of Figure 4 summarize the results of clustering hundreds of unfolding simulations at different pulling forces in the four different directions. In the left panel in parts A-D of Figure 4 are grayscale plots of the average ACF arrays of all trajectories within each cluster. Each plot shows the average order of events along the unfolding pathway, characteristic of trajectories within that cluster. For instance, cluster 1 in Figure 4A shows that, when Q is between 0.7 and 0.8, the contacts that tend to be broken are those between the helix and strands 2, 3, and 4; those in the center of the helix; and those between the N-terminus of strand 3 (loosely defined) and the C-terminus of strand 4. As Q drops to between

ARTICLE

0.5 and 0.6 for this cluster, contacts begin breaking between strands 1 and 2 and between the helix and strand 1. By the time that Q drops to between 0.3 and 0.4, these contacts are almost always broken. A very different pattern of unfolding occurs for cluster 10 in Figure 4A. Here, contacts between strands 1 and 4 begin breaking between Q = 0.7 and Q = 0.8 and are almost completely broken by the time that Q drops to between 0.5 and 0.6. Contacts are also progressively lost between strands 3 and 4, starting at the N-terminal end of strand 3 (C-terminal end of strand 4), and between strands 1 and 2, starting at the N-terminal end of strand 1 (C-terminal end of strand 2). To illustrate the relationship (i.e., ACF distances) between different clusters, we have used two-dimensional graph projections, shown in the right panels in Figure 4A-D. These plots show that pulling in the strong directions parallel to the β-sheet (1-56 and 10-48; Figure 4A,B) causes a dramatic change in the order of events along the unfolding pathway. One set of nodes, representing low force unfolding events, forms a distinct group separate from those representing high force unfolding events. Examining the ACF matrices shows that, for both 1-56 and 10-48 pulling directions, the contacts between strands 1 and 4 tend to break much earlier at high force than at low force. This makes sense, given that both the 1-56 and 10-48 pulling directions are positioned to apply a shearing force between these two strands. This is consistent with previous simulation and experimental work which has shown that rupture of a “force clamp” composed of parallel beta-sheets is often the initial event in mechanical unfolding of proteins that contain this motif.30,70-74 Conversely, while rupture of intrahelical and helix-strand contacts is one of the earliest events in low-force unfolding trajectories, this tends to happen later in high-force unfolding trajectories. Significantly, this switch between “intrinsic”-like unfolding mechanisms at low force and a new “mechanical” unfolding mechanism at high force occurs over a similar range of forces to the turnover in the kinetics. For the 1-56 direction (Figure 4A), trajectories obtained at forces below 50 pN are found primarily in the “low-force” clusters 1-4. Trajectories simulated at forces of 200 pN or higher are found primarily in the “high-force” clusters 5-10. However, 100 and 150 pN trajectories are distributed between the low-force and high-force halves of the diagram, consistent with these forces being near a crossover point at which unfolding via the intrinsic pathway and unfolding via the mechanical pathway become equally probable. Paralleling the trends in the kinetics, the switch of mechanism in the 10-48 direction occurs at a lower force (∼50 pN) than that in the 1-56 direction (Figure 4B). Pulling along the mechanically weak 10-32 direction causes more subtle shifts in mechanism than pulling along the mechanically strong directions (Figure 4C), and there is a weaker separation between low-force-dominated and high-force-dominated clusters. One change that does occur, however, is a reversal of the order in which the N-terminal and C-terminal hairpins tend to unfold. This trend is also observed for pulling in the 1-56 and 32-56 directions. Looking more closely at the ACF plots reveals that accelerated unfolding of the C-terminal hairpin occurs by an unzipping mechanism in which contacts first detach at the N-terminus of strand 3 (the C-terminus of strand 4), and then strand 3 peels away from strand 4 as the helix is pulled apart from the sheet. Though pulling at relatively low forces in the 32-56 direction accelerates unfolding while causing only a modest change in 1550

dx.doi.org/10.1021/jp110738m |J. Phys. Chem. B 2011, 115, 1546–1561

The Journal of Physical Chemistry B

ARTICLE

Figure 4. Cluster analysis of unfolding trajectories. For each of the four pulling directions (shown separately in parts A-D), unfolding trajectories from all pulling forces have been globally clustered using the ACF metric. On the right in each of parts A-D is shown a graph in which each cluster node is represented as a pie chart with wedges showing the fraction of members obtained from each pulling force. The area of each node is proportional to the number of members in the cluster, while the spatial separation and thickness of the edges (see legend) represent the ACF distances between the clusters. Multidimensional scaling120 was used to arrange the clusters in two dimensions in a way that best approximates the distances between them; however, it should be emphasized that the two-dimensional distances between clusters in these plots are only approximate, while the distance ranges represented by the connecting lines are exact. On the left of each part is shown the order of contact formation matrix for each cluster, as used in calculating the ACF. On the vertical axis is the fraction of native contacts, Q, divided into windows of 0.1 between Q = 0.3 and Q = 0.8. Along the horizontal axis are the 120 individual contacts within the model, grouped (E) by the secondary structural elements they connect. The average degree of formation of each contact is shown in grayscale, where white indicates complete formation of the contact and black indicates complete disruption of the contact.

mechanism, a dramatically different unfolding pathway emerges at sufficiently high forces. This new mechanism is represented by clusters 6 and 10 in Figure 4D, which are highly dissimilar to the others. In addition to detachment of strands 3 and 4, this new

pathway involves detachment of strand 1 from strand 4, similar to what is observed for pulling along the 1-56 coordinate. Unlike 1-56 pulling trajectories, however, in which all of the contacts between strands 1 and 4 tend to break simultaneously, pulling at 1551

dx.doi.org/10.1021/jp110738m |J. Phys. Chem. B 2011, 115, 1546–1561

The Journal of Physical Chemistry B

Figure 5. Bayesian analysis of transition states along the 1-56 coordinate: the potential of mean force G(Q,r) (left column), probability distribution for transition paths P(Q,r|TP) (center column), and conditional probability of being on a transition path P(TP|Q,r) (right column) along Q and r for pulling at different forces. In the color scheme of the center and right columns, black represents regions of highest probability and yellow/white represents regions of lowest probability, on a logarithmic scale.

high forces in the 32-56 direction causes sequential disruption of strand 1-4 contacts, starting at the C-terminal ends of these two strands. Inspecting individual trajectories reveals that, at high forces, the C-terminal end of strand 4 is peeled away from strands 1 and 3. Peeling apart of the protein from the C-terminus then continues until most of the protein has unfolded. However, we note that this pathway appears at forces where the barrier is very small (close to being “downhill”), in contrast to the switch which occurs for the strong coordinates, where the unfolding barrier is still high after the switch. Collectively, these clustering results reveal that the turnover in the kinetics for pulling along mechanically strong coordinates is

ARTICLE

Figure 6. Bayesian analysis of transition states along the 10-32 coordinate at different forces. See Figure 5 legend for details.

associated with a discrete switch between unfolding via an intrinsic-like mechanism and unfolding via a novel “mechanical” mechanism. In contrast, pulling along mechanically weak directions appears to accelerate unfolding via mostly intrinsic-like pathways whose order of events gradually changes with increasing force. For pulling at extreme enough forces along mechanically weak directions (i.e., 32-56), more dramatic shifts in the unfolding mechanism may in some cases be observed. Two-Dimensional Projections Also Reveal Shifts in Mechanism. The highly cooperative nature of protein folding suggests it may be possible to describe key features of the overall reaction using a relatively small number of global progress variables. To gain an alternative perspective on shifts in unfolding 1552

dx.doi.org/10.1021/jp110738m |J. Phys. Chem. B 2011, 115, 1546–1561

The Journal of Physical Chemistry B mechanism in response to external forces, we sought to describe the process using a projection onto the fraction of native contacts, Q, and the pulling coordinate, r. Q is often found to serve as a useful reaction coordinate for protein folding at zero force, but it is not clear that Q remains an optimal coordinate at higher pulling forces. In contrast, the pulling coordinate (r) is expected to become, by construction, a good reaction coordinate in the limit of high force, though it is not in general a useful reaction coordinate at low force.43,48 We therefore investigated whether the combination of Q and r would provide an improved low-dimensional description that could be used to predict the kinetics of the unfolding reaction as a function of applied force and to identify transition states. For this analysis, we focus on one mechanically weak pulling direction, 10-32, and one mechanically strong pulling direction, 1-56. To determine whether projections onto Q and r would reveal shifts in the unfolding mechanism as seen in the clustering results, we projected unfolding transition paths onto Q and r and compiled this information into two-dimensional histograms (center column of Figures 5 and 6). We focused on trajectory segments connecting the folded state at Q = 0.89 with the intermediate state at Q = 0.35, given that the barrier between these two states is predicted to be the principal barrier to unfolding (Figure 1). The probability of a configuration from a transition path being in a particular (Q,r) bin is given by nðQ ;rÞ N

ð2Þ P P where N= Q rn(Q,r) is the total number of frames in all transition path trajectories and n(Q,r) is the number of frames in the specified (Q,r) bin. These projections reveal obvious differences between lowforce and high-force unfolding mechanisms. In the presence of low pulling forces along the 1-56 coordinate, Q decreases during unfolding with no corresponding increase in r (Figure 5, middle column). Forces of up to 50 pN cause no change in this basic mechanism, except to shift the probability distribution in the r1-56 direction toward the maximum extension of β-strands 1 and 4. By about 100 pN, however, a new unfolding mechanism emerges in which extension along r1-56 (i.e., rupture of contacts between strands 1 and 4) accompanies the decrease in Q. As the force increases, the point at which this rupture occurs on average shifts to higher Q. In other words, increasing force causes the shearing apart of strands 1 and 4 to occur progressively earlier along the unfolding pathway. This rupture becomes so rapid by 300 pN that it begins while the rest of the protein remains mostly folded. In contrast, projections of 10-32 pulling trajectories onto Q and r (Figure 6, middle column) reveal a more gradual shift in unfolding mechanism, in agreement with trajectory clustering results. In the absence of force, a decrease in Q is modestly correlated with an increase in r10-32. As the force along the 10-32 coordinate increases, however, the decrease in total Q becomes more strongly correlated with an increase in r10-32. Unlike the case for the 1-56 pulling trajectories, however, at no force along the 10-32 coordinate does there appear to be an abrupt transition between low-force and high-force mechanisms. Projection of the Free Energy Landscape onto Q and r. To better understand the above trends for transition path trajectories, we calculated two-dimensional projections of the folding free energy landscape onto Q and either r1-56 or r10-32. To gather data about extended conformations that are largely unsampled at zero force, we used two-dimensional umbrella PðQ ;rjTPÞ ¼

ARTICLE

sampling with harmonic bias potentials in terms of both Q and r. Because of the steepness of the free energy landscape in some areas, particularly along the 1-56 pulling coordinate, many overlapping umbrella windows were needed to achieve adequate coverage of the (Q,r) surface. The centers and force constants of these windows are provided in Tables 1 and 2 of the Supporting Information. These data were analyzed using the weighted histogram analysis method (WHAM) to obtain two-dimensional potential of mean force (PMF) surfaces.92 These are shown in the upper left-hand panels of Figures 5 and 6 (“0 pN PMF”). The choice of r as a reaction coordinate has the advantage that the PMF at any pulling force can also be calculated by adding the linear term -Fr to the zero-force PMF. Calculated PMFs for the various pulling forces used in the kinetics simulations are shown in the leftmost columns of Figures 5 and 6. At zero force, two-dimensional PMFs along Q and r1-56 (Figure 5, upper left panel) or Q and r10-32 (Figure 6, upper left panel) show a folded basin at Q ≈ 0.9, an intermediate state at Q ≈ 0.35, and an unfolded basin at Q ≈ 0.15, which is consistent with the one-dimensional PMF calculated in terms of Q alone (Figure 1). As expected, increasing the force along either pulling direction is predicted to substantially lower the free energy of the unfolded state relative to the folded state. Additionally, pulling in either direction is predicted to reduce the stability of the intermediate. The PMF in Q and r1-56 reveals an extremely steep “rupture” barrier along the r1-56 pulling coordinate, which corresponds physically to the maximum extension of the force clamp between β-strands 1 and 4. The height of the rupture barrier is predicted to decrease with Q, implying that additional contacts within the protein other than those between strands 1 and 4 contribute to the stability of the force clamp. At zero force, the principal barrier between the folded state and the Q ≈ 0.35 intermediate is predicted to occur at approximately the same extension along the 1-56 coordinate as the folded state. Moreover, this barrier between the folded and intermediate states is largely inextensible along r1-56 due to the persistence within the intermediate of contacts between β-strands 1 and 4. Thus, pulling along the 1-56 coordinate at low forces is not predicted to accelerate the rate of unfolding significantly. However, the (Q,r1-56) PMF suggests that, at high enough forces along the 1-56 pulling direction, it should be possible to bypass the native unfolding pathway and instead unfold by crossing the strand 1-4 rupture barrier. Because the height of this barrier increases with Q, contacts between strands 1 and 4 are expected to break first at low Q and then at increasing Q with increasing force, consistent with what is observed for actual transition paths (Figure 5, left column). In contrast, the barrier between folded and intermediate states is significantly lower and broader along the 10-32 pulling coordinate, r10-32. Even small forces along the 10-32 direction are therefore predicted to stabilize and distort the barrier. This will accelerate unfolding and shift the unfolding pathway toward a concerted decrease in Q and increase in r10-32, consistent with what is observed for actual unfolding trajectories. Predicting Transition States from Low-Dimensional Projections. If Q and r serve collectively as a good pair of reaction coordinates, then it should be possible to predict the location of transition states using the Bayesian formula (see Methods): PðQ ;rjTPÞ ð3Þ PðTPjQ ;rÞ  PðQ ;rÞ 1553

dx.doi.org/10.1021/jp110738m |J. Phys. Chem. B 2011, 115, 1546–1561

The Journal of Physical Chemistry B

ARTICLE

Figure 7. Comparison of Bayesian transition state predictions and results from transition path sampling (TPS). The plots in the left column show the P(TP|Q,r) predictions from Bayesian analysis. Calculated values from TPS of P(TP|x) (center column) and Pfold(x) (right column) for each chosen initial configuration x are plotted using the color scale shown in the legend for starting frames at different positions along Q and r. Parts A and B show the same data in the absence of force projected onto Q and either r1-56 (A) or r10-32 (B). Part C shows results for pulling at 100 pN in the 1-56 direction.

This formula can be understood to mean that the probability that a conformation with a given combination of Q and r will lie along a transition path, P(TP|Q,r), is proportional to the enrichment of frames with that particular combination of Q and r among transition paths relative to the equilibrium distribution. We used eq 3 to predict the positions of transition states along Q and r at each pulling force. The predicted equilibrium distribution P(Q,r)|F at pulling force F was calculated (up to a normalization constant) using the formula PðQ ;rÞjF  e - βðGðQ ;rÞ - FrÞ

ð4Þ

where G(Q,r) is the PMF in Q and r in the absence of force and -Fr is the additional contribution of the pulling force to the free energy. To reduce statistical noise, we considered only (Q,r) bins that contained at least 0.01% of the total number of transition path frames (i.e., n(Q,r) g 0.0001N in equation eq 2). The results of this analysis, shown in the right columns of Figures 5 and 6, predict a pronounced shift of the position of the transition state along Q and r as a function of the applied force. In the case of the 1-56 pulling coordinate (Figure 5, right column), there is relatively little shift in the predicted transition state at low forces, though the P(TP|Q,r1-56) distribution, following P(Q,r|TP), is compressed toward the maximum r1-56 extension of the β-sheet.

Two distinct modes in P(TP|Q,r1-56) appear by 100 pN, suggesting the possibility of two parallel transition states at this intermediate force. The position of the second transition state at the β-sheet rupture barrier suggests that it is associated with shearing apart of strands 1 and 4. The crossover between these putative transition states appears to occur at the same force as the turnover in unfolding kinetics and the switch in mechanism based on clustering analysis. Moreover, the turnover in the kinetics can be rationalized on the basis of the different positions of these two transition states along the r1-56 coordinate: While the predicted low-force transition state occurs initially at lower r1-56 than the native state, the predicted high-force transition state occurs at higher r1-56 than the native state. This correctly implies that the rate of unfolding will first decrease in response to an external force along the 1-56 coordinate and then increase as the unfolding flux shifts to the second transition state at sufficiently high forces. Remarkably, the projection onto Q and r10-32 predicts two local maxima in P(TP|Q,r10,32) that differ in both Q and r10-32, even in the absence of force. Inspection of the two-dimensional PMF (Figure 6, top-left panel) reveals that the maximum at lower Q is located near the predicted saddle point between the folded and intermediate/unfolded states. The maximum at higher Q, however, is located at a local free energy minimum, implying that 1554

dx.doi.org/10.1021/jp110738m |J. Phys. Chem. B 2011, 115, 1546–1561

The Journal of Physical Chemistry B

ARTICLE

Table 2. Definitions of Regions of Interest for Analysis of Transition Statesa Q boundaries

r boundaries

A

0.48 < Q < 0.57

1.5 nm < r10-32 < 2.3 nm

B C

0.7 < Q < 0.83

2.5 nm < r10-32 < 2.75 nm r1-56 e 2.85 nm

region

D a

r1-56 > 2.85 nm

Region names correspond to subfigures in Figure 8 (see also Supplemental Figures 4 and 5 in the Supporting Information).

this mode may be associated with a second on-pathway metastable intermediate. To test these transition state predictions, we ran transition path sampling (TPS) shooting simulations starting from a randomly-selected group of several hundred frames that covered large areas on both the (Q,r1-56) and (Q,r10-32) planes (Figure 7). To increase the likelihood of identifying transition states, we drew starting configurations x from windows of Q(x) and r(x) in high-P(TP|Q,r) regions, using frames from both transition path trajectories as well as the equilibrium distribution. Thus, it should be emphasized that the resulting distribution does not provide an unbiased view of the distribution of transition states that is equivalent to P(TP|Q,r). Instead, it shows in which regions transition states were successfully located by a directed search guided by P(TP|Q,r) predictions. For TPS runs, initial velocities for each residue were randomly chosen from a Maxwell-Boltzmann distribution at 300 K. Fifty such sets of starting velocities were chosen for each starting frame, and simulations were run in the forward and reverse (i.e., velocity-reversed) directions until the protein folded (Q > 0.89) or reached the intermediate (Q < 0.35). For each starting configuration x, the probability of folding, Pfold(x), was calculated as the fraction of all simulations (forward and reverse) that reached the folded state. The probability of lying on a transition path, P(TP|x), was calculated for each starting frame x as the fraction of forward/reverse pairs of starting velocities for which the forward and reverse simulations reached opposite (folded or intermediate) states. Note that the basic premise that P(TP|x) is maximized when Pfold(x) = 0.5 rests on the assumption of overdamped dynamics, in which case the predicted relationship P(TP|x) = 2Pfold(x)(1 - Pfold(x)) should hold.93 As shown in Figure 2 in the Supporting Information, this assumption holds for our data to within the predicted sampling error given the total number of simulations. The results of our simulations show that transition state structures can be found in the predicted regions of the (Q,r) plane. Plotting Pfold(x) or P(TP|x) for individual frames as a function of Q and r10-32 shows a number of high-P(TP|x) points with Pfold(x) ∼ 0.5 between about Q = 0.5 and Q = 0.6 (Figure 7A), as predicted by the plot of P(TP|Q,r10-32) at zero force. Strikingly, points with high P(TP|x) and Pfold(x) ∼ 0.5 can also be found at a position that corresponds to the low-Q edge of the second mode in P(TP|Q,r10-32). This suggests that there may indeed be a second transition state at higher Q and r10-32, as predicted by the Bayesian analysis. Projection of all TPS data points onto Q and r1-56 shows that the probability of a frame being a good transition state appears to vary only as a function of Q and not as a function of r1-56 (Figure 7B), implying that r1-56 is a poor reaction coordinate at low force. A similar result was obtained when considering only starting frames drawn from an equilibrium distribution (not shown).

Figure 8. Transition states identified from Pfold calculations (Pfold between 0.4 and 0.6): (A) Lower-Q region, no force; (B) higher-Q, high-r10-32 region, no force; (C) lower-r1-56 region, 100 pN pulling along 1-56; (D) higher-r1-56 region, 100 pN pulling along 1-56. The upper and middle rows show different views of the structures, while the last row shows the φ-values calculated from these structures. In each case, the solid black symbols give the φ-values for the respective transition state, with the shaded region representing the low force transition state (A). The experimental (zero force) φ-values are shown as red symbols in part A.

Though Q is a reasonably good reaction coordinate, it is clearly not perfect, as is shown by the considerable spread in the Pfold values at any given value of Q (Figure 7A,B and Figure 3 in the Supporting Information). This variability in Pfold at a given value of Q is much greater than can be explained by sampling error alone (see Figure 3 in the Supporting Information). To get a better idea of what “good” transition state conformations looked like, we examined the structures of starting frames with 0.4 < Pfold(x) < 0.6 from different regions of the (Q,r10-32) plane (see Table 2). Conformations from region A were characterized by partial to near-complete unfolding of the helix and detachment of the helix from the β-sheet (Figure 8A). Additionally, β-strand 2 was in some cases detached from strand 1, leaving only a structured β-sheet core composed of strands 1, 4, and 3, similar to the intermediate. Transition state conformations from region B were markedly different (Figure 8B). In these conformations, the two β-hairpins were splayed apart like a pair of scissors. The N-terminal hairpin, in particular, tended to be bent at a sharp angle from its position in the native structure. As residue 10 is at the tip of this N-terminal hairpin, these results explain why this splayed transition state would be favored by pulling along the 10-32 direction. In contrast to transition state structures from region A, the helix tends to be largely intact in transition state structures from region B, and contacts between the helix and the sheet (with the exception of the N-terminal hairpin) are largely preserved. We additionally wished to test the prediction that parallel intrinsic-like and mechanical transition states occur at intermediate pulling forces along the 1-56 coordinate. Shown in Figure 7B are the results of TPS simulations in the presence of a 100 pN pulling force for selected points within the region of the 1555

dx.doi.org/10.1021/jp110738m |J. Phys. Chem. B 2011, 115, 1546–1561

The Journal of Physical Chemistry B (Q,r1-56) plane that is predicted on the basis of P(TP|Q,r1-56) to encompass both transition states. As before, these starting points were chosen in a “biased” manner from transition paths. As predicted, there are indeed a number of frames within this region with Pfold(x) ∼ 0.5 and P(TP|x) ∼ 0.5. In line with what was predicted, transition states were found in both predicted high-P(TP) regions, at both low and high values of r1-56. Due to the limitations of the present sampling, it is unclear whether lowr1-56 and high-r1-56 regions represent two distinct clusters of transition states or whether they are instead two ends of a single extended transition state region. We examined structures of transition states (with 0.4 < Pfold(x) < 0.6) for r1-56 e 2.85 nm (region C) and r1-56 > 2.85 nm (region D). Low-r1-56 transition state structures are similar to the two groups of 0 pN transition states discussed above (Figure 8C). In some cases, the helix is partially to completely unfolded and detached from the β-sheet. Alternatively, the helix may remain largely structured while the β-sheet adopts a splayed conformation. In contrast, the second cluster of transition states at higher r1-56 (Figure 8D) differs markedly from the other transition state structures shown in Figure 8A-C. This cluster is characterized by hyperextension and separation of strands 1 and 4 along the pulling direction, which suggests that these conformations tend to unfold not through a global structural transition but instead by the local disruption of contacts between strands 1 and 4. We were interested in determining whether there is a correspondence between transition states taken from different regions of the (Q,r) plane and different unfolding mechanisms identified by the clustering approaches described above. After combining forward and reverse trajectories from successful TPS runs, we assigned each trajectory to the nearest cluster from Figure 4. The results support the idea that projection onto Q and r is able to distinguish transition states associated with different mechanisms. Transition paths launched from region A of the (Q,r10-32) plane (structures in Figure 8A) were mostly classified into cluster 1 from Figure 4C (Figure 4 in the Supporting Information). In contrast, transition paths launched from starting frames in region B were primarily classified into cluster 2 from Figure 4C (Figure 4 in the Supporting Information). The fact that cluster 2 contains trajectories at higher forces on average than cluster 1 (Figure 4C) is consistent with the prediction that the higherr10-32 set of transition states in region B will be favored in the presence of pulling forces along 10-32. Likewise, we tested whether the two different transition state regions predicted for intermediate pulling forces in the 1-56 direction might correspond to distinct intrinsic (low-r1-56) or mechanical (high-r1-56) unfolding pathways. Indeed, transition path trajectories launched from transition states with r1-56 e 2.85 nm (region C; Figure 8C) were primarily assigned to lowforce clusters (Figure 5 in the Supporting Information), while those launched from transition states with r1-56 > 2.85 nm (region D; Figure 8D) were primarily assigned to high-force clusters (Figure 5 in the Supporting Information). Collectively, these results suggest that, at intermediate pulling forces along the r1-56 direction, unfolding can occur via very different transition states, associated with intrinsic or mechanical unfolding pathways. The difference in the displacement of these two transition states along the pulling direction r1-56 explains the switch from the intrinsic to the mechanical transition state at intermediate force. To characterize further the transition states obtained by pulling at different forces on the r1-56 and r10-32 coordinates, we

ARTICLE

have calculated folding φ-values94 for the transition-state ensembles shown in Figure 8. Here, we obtain φi for residue i as φi = ÆNiæ/N0i , where N0i is the number of contacts formed by residue i in the native state and ÆNiæ is the average number of native contacts formed by residue i in the transition-state ensemble. This definition results in φ-values consistent with the experimental interpretation: a φ-value of 0 results from a residue forming no native contacts in the transition state, while φ = 1 indicates a residue which is forming all its native contacts. For the intrinsic transition state in Figure 8A, our estimated φ-values are in good agreement with the experimental data from ensemble experiments in the absence of force:64 the main feature of the transition state is the formation of the C-terminal hairpin. The alternate, low-force, transition state on the r10-32 coordinate, Figure 8B, is more structured in the N-terminal hairpin and helix relative to the intrinsic transition state. The low-r transition state for r1-56 pulling, Figure 8C, shows a pattern of φ-values very similar to Figure 8A, showing that, even in the presence of a 100 pN pulling force, the structure of the intrinsic transition state is remarkably similar to that at low force. The high-r transition state obtained at 100 pN, Figure 8D, is more structured than the intrinsic transition state, as expected. The φ-values for the transition state ensemble in Figure 8D are most similar to those determined from a previous all-atom simulation study of protein G unfolding with a pulling force applied to the termini.72 Since unfolding occurs at large forces (∼1 nN) in the all-atom studies, it is entirely reasonable that the “high-force” pathway should be favored in this case. A key feature of this transition state is the loss of structure in the N- and C-terminal strands due to their separation. This is also the main difference between the φ-values for the transition state favored by pulling on r10-32, Figure 8B, and that favored by pulling on r1-56, Figure 8D. Simulations on Two-Dimensional Energy Landscapes Partially Recapitulate Kinetic Trends in Pulling Simulations. Given the success of two-dimensional projections in locating unfolding transition states, we sought to investigate how well a simple two-dimensional description of the dynamics could capture the unfolding kinetics of the full model. Although some studies have found simple diffusion models to be insufficient to describe protein folding95 and polymer dynamics,96 in many cases low-dimensional diffusion models have successfully captured folding dynamics.81,84,97-100 As described in the Methods, we ran Brownian dynamics (BD) simulations on the above (Q, r1-56) and (Q,r10-32) PMF surfaces after first smoothing each surface, adding steep boundaries at the edges of sampling, and then interpolating the surface with a differentiable cubic spline. We searched a range of ratios between the diffusion coefficients in the Q and r directions to try to identify a value that would give reasonable qualitative agreement with the results from the full model, finding an optimal ratio of Dr/DQ = 10 nm2 between the diffusion coefficients in the r and Q directions. This ratio of diffusion coefficients is consistent with diffusion coefficients estimated directly from simulations using a Bayesian approach.84,97,101 In previous work, we estimated a positiondependent diffusion coefficient on Q, D(Q), ranging between 5 and 20 μs-1 for this model of protein G, from equilibrium simulations. Here, we applied the same method to estimate a diffusion coefficient D(r) along each pulling coordinate r. Since r in general does not clearly separate folded and unfolded states, the analysis was done separately for the folded and unfolded states and the results reported in Figure 6 of the Supporting Information. There is a large difference between the diffusion 1556

dx.doi.org/10.1021/jp110738m |J. Phys. Chem. B 2011, 115, 1546–1561

The Journal of Physical Chemistry B

Figure 9. Unfolding times from Brownian dynamics (BD) simulations on PMF surfaces along Q and r10-32 or r1-56 (solid lines). For comparison, results from the full model are also shown (dotted lines). The unfolding rate in the absence of force is normalized to 1 in each case.

coefficients for the folded and unfolded states; however, because the folding transition state is native-like (see above), we base our estimate of Dr on the native-state results, giving Dr ≈ 100 nm2 3 μs-1, in accord with the empirically chosen Dr/DQ ratio. The results of BD simulations using a ratio of Dr/DQ = 10 nm2 are shown in Figure 9. For clarity, both curves have been normalized such that the unfolding rate at 0 pN is equal to 1. In reality, the mean unfolding rate of simulations on the (Q, r10-32) surface was 1.4 times greater at 0 pN than that of simulations on the (Q,r1-56) surface. Consistent with the full model, pulling along the 10-32 direction is expected to accelerate unfolding much more rapidly than pulling along the 1-56 direction. In addition, while pulling along the 10-32 direction accelerates unfolding even at very low forces, pulling in the 1-56 direction at 10 pN causes a very slight decrease in the rate of unfolding compared to that in the absence of force. However, this is much less than the >2-fold decrease in unfolding rate caused by pulling between residues 1 and 56 in the full Go model. BD simulations also greatly underestimate the turnover force along the 1-56 direction at which the rate of unfolding begins to increase rather than decrease, and the kinetics at high forces are not described well for either pulling direction. These discrepancies most likely arise from imperfections in the chosen pair of reaction coordinates, as well as fine features of the energy surface which may be lost in the discretized potential of mean force; other limitations of the model are the neglect of position dependence of the diffusion coefficients and the absence of cross-terms in the diffusion tensor, although these are likely to be less important. Nonetheless, the results show that a simple two-dimensional description of the system obtained from equilibrium sampling is able to capture much of the complexity in the unfolding kinetics observed in the full simulations. This suggests that two-dimensional models may be sufficient for describing the switch in mechanism in experiments.50

’ CONCLUDING REMARKS Pulling a protein along multiple directions can provide useful information about its free energy landscape.19 Interpreting such experiments, however, requires careful consideration of how pulling influences the mechanism of unfolding and the heights of the relevant free energy barriers. Here, we have used simulations of a coarse-grained model of the GB1 domain to address the question of how pulling along different coordinates influences the kinetics as well as the basic mechanisms of unfolding. Our unfolding kinetics calculations reveal the presence of substantial anisotropy in the mechanical resistance of GB1.

ARTICLE

Consistent with previous theoretical and experimental work, the protein is most mechanically resistant along the parallel β-sheet force-clamp motif (pulling directions 1-56 and 10-48). For these two directions, the unfolding rate as a function of force undergoes a biphasic turnover between weakly force-dependent and strongly force-dependent regimes. Results from trajectory clustering and two-dimensional projections of the dynamics imply that this kinetic turnover is associated with a switch-like transition in unfolding mechanism as a function of the applied force. In principle, both pathways are always available for unfolding, but at most forces, the difference in unfolding barrier height means that either one or the other will dominate. However, it would be expected that, at some low to intermediate force, where the barrier heights are similar, both pathways will be accessible. This is indeed what we find, with distinct “intrinsic” and “mechanical” transition states coexisting at intermediate forces along the 1-56 direction, as predicted by a Bayesian analysis of equilibrium and transition path data, and confirmed by direct simulations. Taken together, these results support a “parallel-pathway” model in which force induces a discrete switch from the barrier for solution unfolding to a novel mechanical barrier that is ordinarily too high in energy to be crossed in the absence of force. In contrast, pulling along the mechanically weak 10-32 and 32-56 coordinates leads to an immediate acceleration of the rate of unfolding that is accompanied by a gradual shift of the unfolding mechanism and of the locations of predicted transition states. This is consistent with a “single-pathway” model in which pulling along mechanically weak directions accelerates unfolding through the intrinsic pathway by lowering the height of the intrinsic barrier. Trajectory clustering and projection analyses suggest that the imperfect overlap between the pulling coordinate and the intrinsic reaction coordinate additionally leads to a gradual distortion of this intrinsic barrier. At sufficiently high forces (especially for the 32-56 direction), this gradual distortion may give way to an outright switch to a new pathway, similar to what is observed for the strong pulling directions. In contrast to the strong pulling directions, however, this switch occurs only after unfolding through the intrinsic pathway has been accelerated to the extent that it becomes nearly diffusion-limited. Our results thus predict a rich repertoire of mechanical behaviors for a simple single-domain protein, the consequences of which should be detectable experimentally. A counterintuitive prediction of our simulations is that pulling in the 1-56 direction at low forces will initially decrease the rate of unfolding. A similar “catch bond” effect has been observed experimentally for intermolecular interactions,102 and while similar kinetic trends have not yet been observed for simple globular domains to our knowledge, there is evidence that pulling on certain domains may promote the formation of mechanically stabilized intermediates via non-native interactions.39,56 The absence of such attractive non-native interactions in our Go model suggests that force-induced stabilization might in some cases occur by other mechanisms, such as a reduction in the conformational freedom of the transition state in the presence of force. It will be important to test the validity of such predictions, especially as recent constant-force unfolding kinetics measurements of the B1 domain of protein L did not detect evidence of such a turnover in the rate of unfolding as a function of force.18 Studying shifts in mechanism experimentally will be challenging; for example, the distribution of unfolding times is expected to remain single exponential even in the presence of two parallel 1557

dx.doi.org/10.1021/jp110738m |J. Phys. Chem. B 2011, 115, 1546–1561

The Journal of Physical Chemistry B unfolding pathways. Distinguishing alternative pathways will no doubt benefit from approaches that combine force spectroscopy with other detection methods, in particular single-molecule fluorescence spectroscopy.103 It may also be possible to measure force-dependent Φ-values94 for folding to determine whether the structure of the transition state changes at different forces.35,54,104 Alternatively, the temperature dependence of the unfolding rate at different pulling forces may give important clues about a change of mechanism. Such experiments will benefit from the use of improved electronic feedback systems for performing experiments in constantforce mode, which will simplify the interpretation of the results and facilitate comparison with simulations12,41,47 (see the text and Figure 7 in the Supporting Information). From a computational perspective, a more detailed picture of changes in the unfolding mechanism in response to force might be obtained by the use of Markov state models that describe the kinetics of transitioning between different metastable conformers.105-108 In particular, it will be useful to develop methods that describe, and perhaps even predict, the evolution of such kinetic transition networks as a function of applied force. Ultimately, understanding the relationships between the kinetics and mechanisms of protein unfolding under force will facilitate the rational design of mechanical proteins and proteinbased materials.109,110 Of particular interest from the perspective of our work, Hongbin Li and colleagues recently reported a novel protein GB1-based elastomeric material whose distinctive mechanical properties have been attributed to the reversible unfolding of GB1 domains.75 A better understanding of mechanical unfolding of single domains, derived from joint efforts in singlemolecule experimentation, theory, and simulation, will inform the design of such materials, perhaps revealing new ways in which force-dependent shifts in the mechanisms and kinetics of unfolding can be exploited to produce novel material properties.

’ METHODS Coarse-Grained Simulations. A coarse-grained CR Go model was constructed from the X-ray structure of protein G111 following Karanicolas and Brooks.82,112 Briefly, this model includes harmonic terms for bonds and angles and a statistical potential for the pseudotorsion angles. Nonbonded interactions between pairs of residues in contact in the native state are represented by an attractive potential, and all other pairs interact via a repulsive potential. Using this simplified model, we were able to compute protein unfolding rates by direct simulation, starting from the predicted center of the folded basin and terminating at the predicted center of the unfolded basin. Simulations were run using Langevin dynamics propagated using the stochastic dynamics integrator in Gromacs version 4.0.5113,114 at a constant temperature of 300 K with an integration time step of 10 fs. A low external friction coefficient of 0.1 ps-1 was chosen in order to accelerate the dynamics. Bond constraints were imposed using the LINCS algorithm, and frames were written out every 10 steps. Folding was monitored using the fraction of native contacts, defined as X 1 qij ð5Þ Q ¼ N ði, jÞ∈fcontactsg

where the sum is taken over all pairs of residues that form contacts in the Go model and where qij is the degree of contact

ARTICLE

formation between residues i and j, defined by the sigmoidal function  -1 ð0Þ ð6Þ qij ¼ 1 þ eβðj Brij j - γrij Þ Here, B r ij is the displacement between residues i and j, r(0) ij is the distance between residues i and j in the native state, and β and γ are parameters that can be adjusted to vary the steepness and cutoff distance in the definition of contact formation. For all the analyses reported below, we used β = 50/nm and γ = 1.2. Pulling Trajectories. Starting configurations with an overall fraction of native contact formation (Q) between 0.89 and 0.91 were selected, with weights proportional to eβVumb, from a previously collected umbrella sampling trajectory (see below). Each simulation was terminated when the fraction of overall backbone contact formation dropped below 0.13 or after a maximum length of 500 million steps (5 μs). The total simulation time required to reach Q = 0.13 was recorded, and the final segment of each trajectory with Q between 0.89 and 0.13 was extracted and saved. Average waiting times were calculated, assuming single-exponential kinetics, using the maximum likelihood formula n P

Ætæ ¼

i¼1

n P

ti þ ðN - nÞT n

(

i¼1

ti þ ðN - nÞT n3=2

ð7Þ

where ti is the waiting time for the ith trajectory, N is the total number of trajectories, n is the number of trajectories in which unfolding was observed, and T is the maximum allowed simulation time per trajectory.48,70 Applied pulling forces between residues i and j were simulated r j|, to the potential energy by adding an extra linear term, -F| B r i -B r j are the positions function, where F is the applied force and B r i and B of the residues i and j upon which the pulling force is exerted. Umbrella Sampling. To calculate the potential of mean force (PMF) along reduced coordinates, we performed simulations with additional harmonic bias potentials either in terms of Q alone (i.e., Vumb = 1/2kQ(Q - Q0)2) or in terms of both Q and the distance r between the pair of residues on which the pulling force was applied (Vumb = 1/2kQ(Q - Q0)2 þ 1/2kr(r - r0)2).115 A complete list of the umbrella centers (Q0 and r0) and force constants (kQ and kr) used in our simulations is provided in Tables 1 and 2 in the Supporting Information. Other simulation parameters were the same as those listed above. PMFs in terms of Q and r were obtained using the weighted histogram analysis method (WHAM).92,116 PMFs in the presence of different forces, F, could then be calculated by adding the linear term -Fr to the zero-force PMF. Trajectory Clustering. In order to organize and interpret the large volume of data from our unfolding simulations, we developed a clustering approach to identify groups of transition path trajectories with similar orders of events. Such clustering methods are often employed, for instance, to group similar molecular configurations from molecular dynamics simulations as the basis for constructing simplified Markov state models.105,107,117,118 In addition, an rmsd-based “sequence alignment” approach was previously used to cluster entire trajectories.119 Two major challenges associated with clustering whole trajectories are the inherent high dimensionality of the data and the potential for recrossings along transition paths.90 To simplify the problem, we chose to compare trajectories based on the order in which contacts broke, on average, as a 1558

dx.doi.org/10.1021/jp110738m |J. Phys. Chem. B 2011, 115, 1546–1561

The Journal of Physical Chemistry B function of the overall fraction of native contacts, Q. An average contact formation (“ACF”) matrix was calculated for each trajectory by first sorting frames into six bins of width 0.1 between Q = 0.2 and Q = 0.8. The degree of formation of each contact, qij, was averaged over the frames within each bin, giving a two-dimensional array of average qij for individual contacts as a function of total Q. In order to identify clusters of similar trajectories, we first defined the “distance” between two trajectories as the Euclidian norm of the difference between their respective ACF arrays: sffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi X X ð1Þ ð2Þ ðxij - xij Þ2 ð8Þ d12 ¼ i

j

Here, the x(1) ij are the elements in the ACF array of the first trajectory and the x(2) ij are the elements in the ACF array of the second trajectory. The distance, d12, serves as a measure of how dissimilar trajectories 1 and 2 are in terms of order of events. Trajectories in which structural elements unfold in the same order are expected to have more similar ACF arrays and a smaller distance between them than trajectories in which structural elements unfold in a different order. After calculating intertrajectory distances, the following kmeans clustering algorithm was used to classify trajectories into a prespecified number of clusters: (i) For k desired clusters, select k initial trajectories to serve as cluster centers. (ii) Calculate the distance between each trajectory and each cluster center. Assign each trajectory to the cluster corresponding to the cluster center to which it is closest. (iii) Update the center of each cluster to be the average of the trajectories within that cluster. (iv) If the cluster centers are unchanged from the previous iteration, then stop. Otherwise, go to (ii). The kmeans() function from the Python scipy.cluster.vq library, which implements this algorithm, was used to perform k-means clustering. To visualize the resulting cluster centers, two-dimensional projections of the centers that optimally preserved the distances between them were calculated by multidimensional scaling (MDS). This was done using the neato program in the GraphViz package (www.graphviz.org), which implements the algorithm of Kamada and Kawai.120 Cluster diagrams were drawn using Mathematica 7.0 (Wolfram Research, Champaign, IL). Calculation of P(TP|Q,r). In order to determine how the ratelimiting step for unfolding changes at different forces, we predicted transition state configurations using data from both equilibrium sampling and nonequilibrium unfolding trajectories. A transition state, x, can be defined as a state at which the conditional probability of being on a transition path, P(TP|x), is maximized, up to a theoretical upper bound of P(TP|x) = 0.5 for stochastic kinetics in the overdamped limit.93,121 To locate predicted transition states on the reduced (Q,r) surface, we applied the Bayesian formula: PðTP;Q ;rÞ ¼ PðTPjQ ;rÞPðQ ;rÞ ¼ PðQ ;rjTPÞPðTPÞ ð9Þ where P(Q,r) denotes the equilibrium probability of finding a conformation with particular values of Q and r, P(TP) denotes the equilibrium probability of being along a transition path, P(Q, r|TP) represents the probability distribution of Q and r among conformations taken from transition paths, and P(TP|Q,r)

ARTICLE

represents the probability that a given conformation occurs along a transition path given that it has the specified values of Q and r. Because P(TP) is constant for any given force, we can write PðTPjQ ;rÞ 

PðQ ;rjTPÞ PðQ ;rÞ

ð10Þ

In other words, transition states are predicted to occur in regions of the (Q,r) plane that are over-represented among transition paths compared to the equilibrium distribution. We thus identified prospective transition states by taking the ratio of transition path and equilibrium probability distributions over Q and r and finding where this ratio was maximized. Brownian Dynamics (BD) Simulations. Brownian dynamics simulations were performed on two-dimensional potential of mean force (PMF) surfaces calculated by WHAM (see above). Outlying points at the boundaries of sampling were first cropped, and the energy of all points outside the sampled region was set equal to the maximum energy of any point within the sampled region. In effect, this introduced reflecting boundaries that constrained the BD simulations to those regions of the PMF that had actually been sampled by the umbrella simulations. The PMFs were then smoothed and interpolated using a Gaussian kernel with widths in each direction equal to half the distance between grid points in the WHAM output. A differentiable cubic spline interpolating function was used to compute forces for Brownian dynamics using the Ermak-McCammon algorithm.122 The equation for propagating the dynamics forward one time step (Δt) in each dimension was xi ðt þ ΔtÞ ¼ xi ðtÞ -

dU Di Δt þ Ri ðΔtÞ dxi

ð11Þ

where it is assumed that the off-diagonal elements in the diffusion tensor are equal to zero. Ri(Δt) is a Gaussian random number with mean equal to zero and standard deviation equal to (2DiΔt)1/2, and U is the potential energy in units of kBT. Here, we chose for simplicity to use a diagonal diffusion tensor, since our other approximations such as position-independent diffusion coefficients are likely to be more significant; however, we note that in general it is possible to estimate the off-diagonal elements by an analogous procedure.100 Similar to Go model kinetics simulations, BD simulations were started from the native state basin with Q = 0.90 and r(1,56) = 2.5 and terminated when Q dropped below 0.13.

’ ASSOCIATED CONTENT

bS

Supporting Information. Figures showing the fit of the two-dimensional theoretical model to simulation data, the relation between P(TP|Q) and Pfold and between Pfold and Q, assignment of transition paths obtained from shooting simulations to ACF clusters, and position-dependent diffusion coefficients computed for each pulling coordinate. Supporting text and figure giving predictions for constant velocity experiments. Tables of the parameters used for two-dimensional umbrellasampling simulations. This material is available free of charge via the Internet at http://pubs.acs.org.

’ AUTHOR INFORMATION Corresponding Author

*E-mail: [email protected]. Phone: þ44-1223-336470. Fax: þ44-1223-336362. 1559

dx.doi.org/10.1021/jp110738m |J. Phys. Chem. B 2011, 115, 1546–1561

The Journal of Physical Chemistry B

’ ACKNOWLEDGMENT T.G.W.G. was supported by a scholarship from the Winston Churchill Foundation and R.B.B. by a Royal Society University Research Fellowship. We would like to thank David de Sancho, Jane Clarke, and Emanuele Paci for helpful discussions. ’ REFERENCES (1) Bryngelson, J. D.; Wolynes, P. G. Proc. Natl. Acad. Sci. U.S.A. 1987, 84, 7524–8. (2) Matouschek, A.; Kellis, J. T.; Serrano, L.; Fersht, A. R. Nature 1989, 340, 122–126. (3) Dill, K. A.; Chan, H. S. Nat. Struct. Biol. 1997, 4, 10–19. (4) Oliveberg, M.; Tan, Y. J.; Silow, M.; Fersht, A. R. J. Mol. Biol. 1998, 277, 933–43. (5) Wolynes, P. G. Q. Rev. Biophys. 2005, 38, 405–410. (6) Onuchic, J. N.; Luthey-Schulten, Z.; Wolynes, P. G. Annu. Rev. Phys. Chem. 1997, 48, 545–600. (7) Borgia, A.; Williams, P. M.; Clarke, J. Annu. Rev. Biochem. 2008, 77, 101–125. (8) Galera-Prat, A.; Gomez-Sicilia, A.; Oberhauser, A. F.; Cieplak, M.; Carrion-Vazquez, M. Curr. Opin. Struct. Biol. 2010, 20, 63–69. (9) Schuler, B.; Lipman, E.; Eaton, W. Nature 2002, 419, 743–747. (10) Rhoades, E.; Cohen, M.; Schuler, B.; Haran, G. J. Am. Chem. Soc. 2004, 126, 14686–14687. (11) Nettels, D.; M€uller-Sp€ath, S.; K€uster, F.; Hofmann, H.; Haenni, D.; R€uegger, S.; Reymond, L.; Hoffmann, A.; Kubelka, J.; Heinz, B.; Gast, K.; Best, R. B.; Schuler, B. Proc. Natl. Acad. Sci. U.S.A. 2009, 106, 20740–20745. (12) Cecconi, C.; Shank, E. A.; Bustamante, C.; Marqusee, S. Science 2005, 309, 2057–2060. (13) Chung, H. S.; Louis, J. M.; Eaton, W. A. Proc. Nat. Acad. Sci. U.S.A. 2009, 106, 11837–11844. (14) Rief, M.; Gautel, M.; Oesterhelt, F.; Fernandez, J. M.; Gaub, H. E. Science 1997, 276, 1109–1112. (15) Marszalek, P. E.; Lu, H.; Li, H.; Carrion-Vazquez, M.; Oberhauser, A. F.; Schulten, K.; Fernandez, J. M. Nature 1999, 402, 100–103. (16) Carrion-Vazquez, M.; Oberhauser, A. F.; Fowler, S. B.; Marszalek, P. E.; Broedel, S. E.; Clarke, J.; Fernandez, J. M. Proc. Natl. Acad. Sci. U.S.A. 1999, 96, 3694–3699. (17) Tskhovrebova, L.; Trinick, J.; Sleep, J. A.; Simmons, R. M. Nature 1997, 387, 308–312. (18) Liu, R.; Garcia-Manyes, S.; Sarkar, A.; Badilla, C. L.; Fernandez, J. M. Biophys. J. 2009, 96, 3810–3821. (19) Shank, E. A.; Cecconi, C.; Dill, J. W.; Marqusee, S.; Bustamante, C. Nature 2010, 465, 637–640. (20) Rief, M.; Fernandez, J. M.; Gaub, H. E. Phys. Rev. Lett. 1998, 81, 4764–4767. (21) Li, H.; Carrion-Vazquez, M.; Oberhauser, A. F.; Marszalek, P. E.; Fernandez, J. M. Nat. Struct. Biol. 2000, 7, 1117–1120. (22) Li, H.; Oberhauser, A. F.; Fowler, S. B.; Clarke, J.; Fernandez, J. M. Proc. Natl. Acad. Sci. U.S.A. 2000, 97, 6527–6531. (23) Brockwell, D.; Beddard, G.; Clarkson, J.; Zinober, R.; Blake, a.; Trinick, J.; Olmsted, P.; Smith, D.; Radford, S. Biophys. J. 2002, 83, 458– 472. (24) Fowler, S. B.; Best, R. B.; Toca Herrera, J. L.; Rutherford, T. J.; Steward, A.; Paci, E.; Karplus, M.; Clarke, J. J. Mol. Biol. 2002, 322, 841– 849. (25) Brockwell, D. J.; Paci, E.; Zinober, R. C.; Beddard, G. S.; Olmsted, P. D.; Smith, D. A.; Perham, R. N.; Radford, S. E. Nat. Struct. Biol. 2003, 10, 731–737. (26) Carrion-Vazquez, M.; Li, H.; Lu, H.; Marszalek, P. E.; Oberhauser, A. F.; Fernandez, J. M. Nat. Struct. Biol. 2003, 10, 738–43. (27) Williams, P. M.; Fowler, S. B.; Best, R. B.; Toca-Herrera, J. L.; Scott, K. A.; Steward, A.; Clarke, J. Nature 2003, 422, 446–449. (28) Dietz, H.; Rief, M. Proc. Natl. Acad. Sci. U.S.A. 2004, 101, 16192–16197.

ARTICLE

(29) Schwaiger, I.; Kardinal, A.; Schleicher, M.; Noegel, A. A.; Rief, M. Nat. Struct. Mol. Biol. 2004, 11, 81–85. (30) Brockwell, D. J.; Beddard, G. S.; Paci, E.; West, D. K.; Olmsted, P. D.; Smith, D. A.; Radford, S. E. Biophys. J. 2005, 89, 506–519. (31) Forman, J. R.; Qamar, S.; Paci, E.; Sandford, R. N.; Clarke, J. J. Mol. Biol. 2005, 349, 861–871. (32) Li, L.; Huang, H. H.-L.; Badilla, C. L.; Fernandez, J. M. J. Mol. Biol. 2005, 345, 817–826. (33) Ng, S. P.; Rounsevell, R. W. S.; Steward, A.; Geierhaas, C. D.; Williams, P. M.; Paci, E.; Clarke, J. J. Mol. Biol. 2005, 350, 776– 789. (34) Dietz, H.; Berkemeier, F.; Bertz, M.; Rief, M. Proc. Natl. Acad. Sci. U.S.A. 2006, 103, 12724–12728. (35) Brown, A. E. X.; Litvinov, R. I.; Discher, D. E.; Weisel, J. W. Biophys. J. 2007, 92, L39–41. (36) Cao, Y.; Li, H. Nat. Mater. 2007, 6, 109–114. (37) Borgia, A.; Steward, A.; Clarke, J. Angew. Chem., Int. Ed. 2008, 47, 6900–6903. (38) Sadler, D. P.; Petrik, E.; Taniguchi, Y.; Pullen, J. R.; Kawakami, M.; Radford, S. E.; Brockwell, D. J. J. Mol. Biol. 2009, 393, 237–248. (39) Serquera, D.; Lee, W.; Settanni, G.; Marszalek, P. E.; Paci, E.; Itzhaki, L. S. Biophys. J. 2010, 98, 1294–1301. (40) Fernandez, J. M.; Li, H. Science 2004, 303, 1674–1678. (41) Schlierf, M.; Berkemeier, F.; Rief, M. Biophys. J. 2007, 93, 3989– 3998. (42) Bell, G. Science 1978, 200, 618–627. (43) Evans, E.; Ritchie, K. Biophys. J. 1997, 72, 1541–1555. (44) Merkel, R.; Nassoy, P.; Leung, A.; Ritchie, K.; Evans, E. Nature 1999, 397, 50–3. (45) Dudko, O.; Hummer, G.; Szabo, A. Phys. Rev. Lett. 2006, 96, 108101. (46) Rief, M.; Gautel, M.; Schemmel, A.; Gaub, H. Biophys. J. 1998, 75, 3008–3014. (47) Cao, Y.; Kuske, R.; Li, H. Biophys. J. 2008, 95, 782–788. (48) Best, R. B.; Paci, E.; Hummer, G.; Dudko, O. K. J. Phys. Chem. B 2008, 112, 5968–5976. (49) Kirmizialtin, S.; Huang, L.; Makarov, D. J. Chem. Phys. 2005, 122, 234915. (50) Suzuki, Y.; Dudko, O. K. Phys. Rev. Lett. 2010, 104, 048101. (51) Yew, Z. T.; Schlierf, M.; Rief, M.; Paci, E. Phys. Rev. E 2010, 81, 031923. (52) Socci, N. D.; Onuchic, J. N.; Wolynes, P. G. Proc. Natl. Acad. Sci. U.S.A. 1999, 96, 2031–2035. (53) Itzhaki, L. S.; Otzen, D. E.; Fersht, A. R. J. Mol. Biol. 1995, 254, 260–88. (54) Best, R. B.; Fowler, S. B.; Herrera, J. L. T.; Steward, A.; Paci, E.; Clarke, J. J. Mol. Biol. 2003, 330, 867–877. (55) Best, R. B.; Bin, L.; Steward, A.; Daggett, V.; Clarke, J. Biophys. J. 2001, 81, 2344–2356. (56) Forman, J. R.; Yew, Z. T.; Qamar, S.; Sandford, R. N.; Paci, E.; Clarke, J. Structure 2009, 17, 1582–90. (57) West, D. K.; Olmsted, P. D.; Paci, E. J. Chem. Phys. 2006, 124, 154909. (58) Alexander, P.; Fahnestock, S.; Lee, T.; Orban, J.; Bryan, P. Biochemistry 1992, 31, 3597–3603. (59) Alexander, P.; Orban, J.; Bryan, P. Biochemistry 1992, 31, 7243– 7248. (60) Kuszewski, J.; Clore, G. M.; Gronenborn, A. M. Protein Sci. 1994, 3, 1945–52. (61) Park, S. H.; O’Neil, K. T.; Roder, H. Biochemistry 1997, 36, 14277–83. (62) Sheinerman, F. B.; Brooks, C. L. J. Mol. Biol. 1998, 278, 439–56. (63) Park, S. H.; Shastry, M. C.; Roder, H. Nat. Struct. Biol. 1999, 6, 943–7. (64) McCallister, E. L.; Alm, E.; Baker, D. Nat. Struct. Biol. 2000, 7, 669–73. (65) Cao, Y.; Balamurali, M. M.; Sharma, D.; Li, H. Proc. Natl. Acad. Sci. U.S.A. 2007, 104, 15677–81. 1560

dx.doi.org/10.1021/jp110738m |J. Phys. Chem. B 2011, 115, 1546–1561

The Journal of Physical Chemistry B (66) Cao, Y.; Yoo, T.; Zhuang, S.; Li, H. J. Mol. Biol. 2008, 378, 1132–41. (67) Cao, Y.; Li, H. Nat. Nanotechnol. 2008, 3, 512–6. (68) Li, H.; Wang, H.-C.; Cao, Y.; Sharma, D.; Wang, M. J. Mol. Biol. 2008, 379, 871–80. (69) Li, P.-C.; Makarov, D. J. Phys. Chem. B 2004, 108, 745–749. (70) West, D. K.; Brockwell, D. J.; Olmsted, P. D.; Radford, S. E.; Paci, E. Biophys. J. 2006, 90, 287–297. (71) Li, P.-C.; Huang, L.; Makarov, D. E. J. Phys. Chem. B 2006, 110, 14469–74. (72) Glyakina, A. V.; Balabaev, N. K.; Galzitskaya, O. V. J. Chem. Phys. 2009, 131, 045102. (73) Glyakina, A. V.; Balabaev, N. K.; Galzitskaya, O. V. Biochemistry (Moscow) 2009, 74, 316–328. (74) Glyakina, A. V.; Balabaev, N. K.; Galzitskaya, O. V. Open Biochem. J. 2009, 3, 66–77. (75) Lv, S.; Dudek, D. M.; Cao, Y.; Balamurali, M. M.; Gosline, J.; Li, H. Nature 2010, 465, 69–73. (76) Levy, Y.; Wolynes, P. G.; Onuchic, J. N. Proc. Natl. Acad. Sci. U.S.A. 2004, 101, 511–516. (77) Levy, Y.; Cho, S. S.; Onuchic, J. N.; Wolynes, P. G. J. Mol. Biol. 2005, 346, 1121–1145. (78) Yang, S.; Cho, S. S.; Levy, Y.; Cheung, M. S.; Levine, H.; Wolynes, P. G.; Onuchic, J. N. Proc. Natl. Acad. Sci. U.S.A. 2004, 101, 13786–13791. (79) Cieplak, M.; Szymczak, P. J. Chem. Phys. 2006, 124, 194901. (80) Mickler, M.; Dima, R. I.; Dietz, H.; Hyeon, C.; Thirumalai, D.; Rief, M. Proc. Natl. Acad. Sci. U.S.A. 2007, 104, 20268–20273. (81) Socci, N. D.; Onuchic, J. N.; Wolynes, P. G. J. Chem. Phys. 1996, 104, 5860–5868. (82) Karanicolas, J.; Brooks, C. L. Protein Sci. 2002, 11, 2351–2361. (83) Mittal, J.; Best, R. B. Biophys. J. 2010, 98, 315–320. (84) Best, R. B.; Hummer, G. Proc. Natl. Acad. Sci. U.S.A. 2010, 107, 1088–1093. (85) Ding, K.; Louis, J. M.; Gronenborn, A. M. J. Mol. Biol. 2004, 335, 1299–1307. (86) Li, P.-C.; Makarov, D. J. Chem. Phys. 2004, 121, 4826–4832. (87) Dietz, H.; Rief, M. Phys. Rev. Lett. 2008, 100, 098101. (88) Wright, C. F.; Lindorff-Larsen, K.; Randles, L. G.; Clarke, J. Nat. Struct. Biol. 2003, 10, 658–662. (89) Wright, C. F.; Steward, A.; Clarke, J. J. Mol. Biol. 2004, 338, 445–451. (90) Gin, B. C.; Garrahan, J. P.; Geissler, P. L. J. Mol. Biol. 2009, 392, 1303–1314. (91) Kazmirski, S. L.; Li, A.; Daggett, V. J. Mol. Biol. 1999, 290, 283–304. (92) Kumar, S.; Rosenberg, J. M.; Bouzida, D.; Swendsen, R. H.; Kollman, P. A. J. Comput. Chem. 1992, 13, 1011–1021. (93) Hummer, G. J. Chem. Phys. 2004, 120, 516–523. (94) Matouschek, A.; Kellis, J. T.; Serrano, L.; Fersht, A. R. Nature 1989, 340, 122–126. (95) Sangha, A. K.; Keyes, T. J. Phys. Chem. B 2009, 113, 15886– 15894. (96) Huang, L.; Makarov, D. E. J. Chem. Phys. 2008, 128, 114903. (97) Best, R. B.; Hummer, G. Phys. Rev. Lett. 2006, 96, 228104. (98) Yang, S.; Onuchic, J. N.; Levine, H. J. Chem. Phys. 2006, 125, 054910. (99) Chahine, J.; Oliveira, R. J.; Leite, V. B. P.; Wang, J. Proc. Natl. Acad. Sci. U.S.A. 2007, 104, 14646–14651. (100) Marinelli, F.; Pietrucci, F.; Laio, A.; Piana, S. PLoS Comput. Biol. 2009, 5, e1000452. (101) Hummer, G. New J. Phys. 2005, 7, 516–523. (102) Evans, E.; Leung, A.; Heinrich, V.; Zhu, C. Proc. Natl. Acad. Sci. U.S.A. 2004, 101, 11281–11286. (103) Lang, M. J.; Fordyce, P. M.; Engh, A. M.; Neuman, K. C.; Block, S. M. Nat. Methods 2004, 1, 133–139. (104) Best, R. B.; Fowler, S. B.; Toca-Herrera, J. L.; Clarke, J. Proc. Natl. Acad. Sci. U.S.A. 2002, 99, 12143–8.

ARTICLE

(105) Chodera, J. D.; Singhal, N.; Pande, V. S.; Dill, K. A.; Swope, W. C. J. Chem. Phys. 2007, 126, 155101. (106) Noe, F.; Horenko, I.; Sch€utte, C.; Smith, J. C. J. Chem. Phys. 2007, 126, 155102. (107) Noe, F.; Sch€utte, C.; Vanden-Eijnden, E.; Reich, L.; Weikl, T. R. Proc. Natl. Acad. Sci. U.S.A. 2009, 106, 19011–19016. (108) Berezhkovskii, A.; Hummer, G.; Szabo, A. J. Chem. Phys. 2009, 130, 205102. (109) Li, H. Org. Biomol. Chem. 2007, 5, 3399–3406. (110) Li, H. Adv. Funct. Mater. 2008, 18, 2643–2657. (111) Gallagher, T.; Alexander, P.; Bryan, P.; Gilliland, G. L. Biochemistry 1994, 33, 4721–4729. (112) Karanicolas, J.; Brooks, C. L., III. J. Mol. Biol. 2003, 334, 309– 325. (113) Hess, B.; Kutzner, C.; van Der Spoel, D.; Lindahl, E. J. Chem. Theor. Comput. 2008, 4, 435–447. (114) Van Gunsteren, W. F.; Berendsen, H. J. C. Mol. Simul. 1988, 1, 173–185. (115) Torrie, G.; Valleau, J. J. Comput. Phys. 1977, 23, 187–199. (116) Chodera, J. D.; Swope, W. C.; Pitera, J. W.; Seok, C.; Dill, K. A. J. Chem. Theor. Comput. 2007, 3, 26–41. (117) Bowman, G. R.; Beauchamp, K. A.; Boxer, G.; Pande, V. S. J. Chem. Phys. 2009, 131, 124101. (118) Voelz, V. A.; Bowman, G. R.; Beauchamp, K.; Pande, V. S. J. Am. Chem. Soc. 2010, 132, 1526–1528. (119) Ota, M.; Ikeguchi, M.; Kidera, A. Proc. Natl. Acad. Sci. U.S.A. 2004, 101, 17658–17663 . (120) Kamada, T.; Kawai, S. Inf. Process. Lett. 1989, 31, 7–15. (121) Best, R. B.; Hummer, G. Proc. Natl. Acad. Sci. U.S.A. 2005, 102, 6732–6737. (122) Ermak, D. L.; McCammon, J. A. J. Chem. Phys. 1978, 69, 1352–1360.

1561

dx.doi.org/10.1021/jp110738m |J. Phys. Chem. B 2011, 115, 1546–1561