Article pubs.acs.org/JPCB
Bridging Experiments and Native-Centric Simulations of a Downhill Folding Protein
Athi N. Naganathan*,† and David De Sancho*,‡,¶
† Department of Biotechnology, Bhupat & Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600036, India
‡ CIC nanoGUNE, Tolosa Hiribidea, 76, E-20018 Donostia-San Sebastián, Spain
¶ IKERBASQUE, Basque Foundation for Science, María Díaz de Haro 3, 48013 Bilbao, Spain
Supporting Information
ABSTRACT: Experiments and atomistic simulations have independently contributed to the mechanistic understanding of protein folding. However, a coherent detailed picture explicitly combining both is currently lacking, a problem that seriously limits the amount of information that can be extracted. An alternative to atomistic models with physics-based potentials is provided by native-centric (i.e., Go̅-type) coarse-grained models, which for many years have been successfully employed to qualitatively understand features of protein folding energy landscapes. Again, quantitative validation of Go̅ models against experimental equilibrium unfolding curves is often not attempted. Here we use an atomistic topology-based model to study the folding mechanism of PDD, a protein that folds over a marginal thermodynamic barrier of ∼0.5 kBT at midpoint conditions. We find that the simulations are in exquisite agreement with several equilibrium experimental measurements including differential scanning calorimetry (DSC), an observable that is possibly the most challenging to reproduce from explicit-chain models. The dynamics, inferred using a detailed Markov state model, display a classical Chevron-like trend with a continuum of relaxation times under both folding and unfolding conditions, a signature feature of downhill folding. The number of populated microstates and the connectivity between them are shown to be temperature dependent with a maximum near the thermal denaturation midpoint, thus linking the macroscopic observation of a peak in the DSC profile of downhill folding proteins and the underlying microstate dynamics. The mechanistic picture derived from our analysis thus sheds light on the intricate and tunable nature of the downhill protein folding ensembles. In parallel, our work highlights the power of coarse-grained models to reproduce experiments at a quantitative level while also pointing at specific directions for their improvement.
■
INTRODUCTION
However, for various reasons discussed below, trying to match experiments and molecular dynamics (MD) simulations has been challenging. The time-scales that atomistic simulations are able to access have been continuously increasing22 due to the development of new hardware like graphics processing units or dedicated computer architectures for running MD (e.g., the Anton supercomputer23). Though multiple folding-unfolding transitions can be observed for microsecond-folding proteins24 (and in some cases even for millisecond folders25−27), how much we can read from the resulting terabytes of data is limited by the accuracy of the models themselves,28 which are in a constant state of evolution. Despite these issues, the atomic-level insights gleaned from these long time-scale simulations have shed light on the underlying physical processes that govern folding.24 Although efforts are under way for optimizing physics-based force fields,29,30 the inability to reproduce
One of the predictions of energy landscape theory is the existence of different scenarios for folding.1 While many single-domain proteins have been found to fold to their native conformations over sizable free energy barriers resulting in a simple two-state transition (type I scenario),2 it is still possible for some proteins to fold over marginal barriers or in the absence of a macroscopic free energy barrier (type 0 scenario, downhill folding). The exponential weighting of the free energy in the partition function makes it highly challenging to experimentally extract the details of conformational substates populated en route to folding, particularly when the barrier is larger than 3kBT. However, this becomes possible in the case of downhill folding,3,4 thus opening up the possibility of exploring the nature of partially structured states even from simple equilibrium experiments. This has triggered the interest of many researchers in understanding downhill folding,5 via experiments,6−13 simulations,14−19 or, more rarely, a combination of the two.20,21
© 2015 American Chemical Society
Received: October 1, 2015 Revised: November 2, 2015 Published: November 2, 2015
DOI: 10.1021/acs.jpcb.5b09568 J. Phys. Chem. B 2015, 119, 14925−14933
the melting temperature from varied spectroscopic probes. We have run simulations on this protein using an atomic-level topology-based model.50 Our results show that, for this protein, the Go̅ model simulations can be quantitatively compared with equilibrium experiments. The model recovers the temperature dependence of the heat capacity and free energy accurately, at least up to the folding midpoint temperature. Also, by introducing a simple temperature dependence we are able to map the simulated rates, derived from a Markov state model, onto the experimental temperature-jump (T-jump) data. The excellent level of agreement between simulations and experiments allows for a detailed description of the ensembles populated by PDD at different conditions and of the underlying unfolding mechanism, and points at specific directions for the improvement of explicit-chain Go̅ models.
equilibrium experimental observables directly remains a critical limitation of atomistic MD. This is an important issue, as reproducing equilibrium experiments validates the underlying energetic scale, which can otherwise be too stabilizing or destabilizing with varying degrees of cooperativity.31 Moreover, the broadness of an equilibrium unfolding transition, particularly that from the heat capacity profile, carries information on the thermodynamic barrier height32−34 and hence on the population of the partially structured states, thus serving as a stringent test of a model’s performance. One alternative possibility is to use coarse-grained Go̅ model simulations,35−37 whose energy functions are based on the topology of the experimental 3D structure of the protein. These models have proven to be successful for decades in the study of protein folding36 and binding38 and more recently in the characterization of intrinsically disordered proteins.39 This success is based on the funnel-like nature of the Go̅-model energy landscape, which is compliant with the minimum frustration principle of protein folding.1 Further support for a Go̅ model description stems from the successes of the Ising-like Muñoz−Eaton (ME) model for protein folding,40−42 wherein it is possible to directly access the total and partial partition functions by making suitable assumptions on structural features of microstates. To our knowledge, the ME model faithfully reproduces more experimental observables than any other simulation model.
Despite these successes, results from most explicit-chain coarse-grained simulation models are seldom directly compared against equilibrium experimental data (with a few notable exceptions43−45) leaving open the question of whether they can recapitulate the experimental observables that atomistic simulations usually miss, and hence be used as a tool to resolve the nature of the underlying conformational ensembles.46 We address these outstanding issues here by exploring the distribution of microstates that are populated in downhill protein folding. We focus on the protein PDD (see Figure 1A),
■
METHODS
Coarse-Grained Molecular Dynamics Simulations. The all-heavy-atom coarse-grained Go̅ model of Onuchic and co-workers50 was employed to study the unfolding of PDD (PDB id: 2PDD, Figure 1A). Parameters were obtained from the SMOG Web server.51 In this model the energy function contains harmonic terms for bonds, angles and improper dihedrals, a Lennard-Jones term for nonbonded atom pairs that form native contacts, and an excluded volume term for nonnative interactions. The contact map was constructed with a contact cutoff of 0.5 nm for each atom pair, excluding up to i−i + 3 sequential neighbors (see Figure 1B). Molecular dynamics simulations were run using Gromacs.52 A leapfrog stochastic dynamics integrator with a time-step of 0.5 fs (as the carbon mass = 1 in reduced units) was employed to simulate unfolding in the temperature range 100−150 (in reduced units) for $10^{8}$ integration steps each. A Berendsen thermostat was used to maintain temperature with a time constant of 1 ps.
Estimation of Equilibrium Observables. Free energy profiles, probability densities and heat capacity thermograms were generated by the weighted histogram analysis method (WHAM).53 PDD has two parallel helices that contribute to a strong far-UV CD signal. The helical fractions were calculated as before,45 using the following empirical expression for the mean-residue ellipticity:
\[ [\theta_\lambda] = f_H\,[\theta_\lambda^{\infty}]\left(1 - \frac{k_\lambda}{\langle l_H \rangle}\right) + (1 - f_H)\,[\theta_\lambda^{\mathrm{coil}}] \tag{1} \]
Here $f_H$ is the fraction of residues in the helical conformation, $\langle l_H \rangle$ is the mean length of a helical stretch, and $[\theta_\lambda^{\infty}]$ and $[\theta_\lambda^{\mathrm{coil}}]$ are the basis spectra for an infinite-length helix and coil, respectively. These basis spectra were obtained from Chen and co-workers.54 The parameters $k_\lambda$ account for the wavelength dependence of the helical spectrum, and they were modified (mean 2.7) to reproduce the 268 K far-UV CD spectrum of PDD at pH 7.0 employing the helical assignments from the PDB file. PDD has a single tyrosine residue (Y10) in the middle of the first helix that is the sole contributor to the near-UV CD signal. We model the near-UV CD unfolding curve as arising from the temperature-dependent changes in interactions made by the side chain of tyrosine within an interaction shell of 0.5 nm; this primarily includes residues V7 and K14 in helix 1, and residues K32, I35, and D36 in helix 2. The data are normalized between 0 and 1 to enable a direct comparison (i.e., without fitting to
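As an illustration of eq 1, the following minimal Python sketch evaluates the mean-residue ellipticity at a single wavelength from a helical fraction and a mean helix length, and rescales a set of values to the 0 to 1 range used for comparison. The function names and the basis-spectrum values in the example are hypothetical placeholders, not the values of Chen and co-workers; only the mean value of 2.7 for $k_\lambda$ is taken from the text.

```python
def mean_residue_ellipticity(f_helix, mean_helix_length, theta_inf, theta_coil, k_lambda):
    """Evaluate eq 1 at one wavelength.

    f_helix           : fraction of residues in helical conformation (0..1)
    mean_helix_length : mean length of a helical stretch, <l_H>
    theta_inf         : basis ellipticity of an infinite-length helix
    theta_coil        : basis ellipticity of the coil
    k_lambda          : wavelength-dependent helix length correction
    """
    if not 0.0 <= f_helix <= 1.0:
        raise ValueError("f_helix must lie between 0 and 1")
    if f_helix == 0.0:
        # No helical residues: <l_H> is undefined and only the coil term remains.
        return theta_coil
    if mean_helix_length <= 0.0:
        raise ValueError("mean_helix_length must be positive when f_helix > 0")
    helix_term = f_helix * theta_inf * (1.0 - k_lambda / mean_helix_length)
    coil_term = (1.0 - f_helix) * theta_coil
    return helix_term + coil_term


def normalize(values):
    """Linearly rescale a sequence of values to the range 0..1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot normalize a constant signal")
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical basis values (illustrative only), with k_lambda = 2.7 as in the text.
signal = mean_residue_ellipticity(1.0, 10.0, -40000.0, 0.0, 2.7)
```

In practice $f_H$ and $\langle l_H \rangle$ would be computed per simulation frame or per temperature and the resulting signal averaged; that bookkeeping is omitted here.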
Figure 1. Native structure of PDD (A) and the corresponding contact map (B). The upper triangular matrix displays the contact map and the lower triangular matrix shows the same colored according to the various interaction types: (blue) intra-helix 1 interactions, (red) intra-helix 2 interactions, (green) intra-loop interactions, (magenta) inter-helical interactions, and (yellow) helix-loop interactions.
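The contact-map construction described in Methods (heavy-atom pairs within 0.5 nm, excluding sequence neighbors up to i, i + 3) can be sketched as follows. This is a minimal illustration assuming a plain distance cutoff applied to coordinates already loaded as arrays; it is not the SMOG server's implementation, and the function names and toy coordinates are hypothetical.

```python
import numpy as np


def native_contacts(coords_nm, residue_index, cutoff_nm=0.5, min_seq_sep=4):
    """Return atom pairs (i, j), i < j, that count as native contacts.

    coords_nm     : (N, 3) heavy-atom coordinates in nm
    residue_index : length-N residue number of each atom
    cutoff_nm     : distance cutoff for a contact (0.5 nm in the text)
    min_seq_sep   : minimum residue separation; 4 excludes neighbors up to i, i+3
    """
    coords = np.asarray(coords_nm, dtype=float)
    res = np.asarray(residue_index, dtype=int)
    # All pairwise atom-atom distances via broadcasting.
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Sequence separation between the residues that own each atom.
    sep = np.abs(res[:, None] - res[None, :])
    mask = (dist <= cutoff_nm) & (sep >= min_seq_sep)
    # Keep only the upper triangle so each pair is reported once.
    i, j = np.nonzero(np.triu(mask, k=1))
    return list(zip(i.tolist(), j.tolist()))


def residue_contact_map(pairs, residue_index, n_residues):
    """Collapse atom-level contacts into a symmetric residue-level boolean map."""
    cmap = np.zeros((n_residues, n_residues), dtype=bool)
    for a, b in pairs:
        ri, rj = residue_index[a], residue_index[b]
        cmap[ri, rj] = cmap[rj, ri] = True
    return cmap


# Toy example: four atoms on a line, belonging to residues 0, 2, 5 and 9.
toy_coords = [[0.0, 0.0, 0.0], [0.3, 0.0, 0.0], [0.4, 0.0, 0.0], [2.0, 0.0, 0.0]]
toy_residues = [0, 2, 5, 9]
toy_pairs = native_contacts(toy_coords, toy_residues)
toy_map = residue_contact_map(toy_pairs, toy_residues, n_residues=10)
```

In the toy example only atoms 0 and 2 qualify: the other close pairs belong to residues fewer than four positions apart, and atom 3 lies beyond the cutoff. The all-pairs distance matrix is adequate for a protein the size of PDD, but a neighbor search would be preferable for large systems.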
which has been characterized as a downhill folder via an array of experimental approaches. Kinetic studies indicate that the folding relaxation rate is slower than its homologue and one-state downhill folder BBL;47 global and independent analysis of the thermodynamic data reveal a marginal barrier of