
Article

Structural and Kinetic Characterization of the Intrinsically Disordered Protein SeV NTAIL Through Enhanced Sampling Simulations

Mattia Bernetti, Matteo Masetti, Fabio Pietrucci, Martin Blackledge, Malene Ringkjøbing Jensen, Maurizio Recanatini, Luca Mollica, and Andrea Cavalli

J. Phys. Chem. B, Just Accepted Manuscript • DOI: 10.1021/acs.jpcb.7b08925 • Publication Date (Web): 19 Sep 2017


The Journal of Physical Chemistry B is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.


The Journal of Physical Chemistry

Structural and Kinetic Characterization of the Intrinsically Disordered Protein SeV NTAIL Through Enhanced Sampling Simulations

Mattia Bernetti,†,‡ Matteo Masetti,†,* Fabio Pietrucci,§ Martin Blackledge,┴ Malene Ringkjøbing Jensen,┴ Maurizio Recanatini,† Luca Mollica,‡,* and Andrea Cavalli†,‡,*



† Department of Pharmacy and Biotechnology, Alma Mater Studiorum – Università di Bologna, Via Belmeloro 6, 40126, Bologna, Italy

‡ CompuNet, Istituto Italiano di Tecnologia, Via Morego 30, 16163, Genova, Italy

§ Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie, Sorbonne Universités–Université Pierre et Marie Curie Paris 6, CNRS UMR 7590, IRD UMR 206, Museum national d’Histoire naturelle, F-75005 Paris, France

┴ Protein Dynamics and Flexibility by NMR Group, Institut de Biologie Structurale, 38044 Grenoble, France

Corresponding Author *[email protected] *[email protected] *[email protected]


Abstract

Intrinsically disordered proteins (IDPs) are emerging as an important class of the proteome. Being able to interact with different molecular targets, they participate in many physiological and pathological activities. However, due to their intrinsically heterogeneous nature, determining the equilibrium properties of IDPs is still a challenge for biophysics. Herein, we applied state-of-the-art enhanced sampling methods to SeV NTAIL, a test case of IDPs, and constructed a bin-based kinetic model to unveil the underlying kinetics. To validate our simulation strategy, we compared the predicted NMR properties against available experimental data. Our simulations reveal a rough free-energy surface comprising multiple local minima, which are separated by low energy barriers. Moreover, we identified interconversion rates between the main kinetic states, which lie in the sub-µs timescale, as suggested in previous works for SeV NTAIL. Therefore, the emerging picture is in agreement with the atomic-level properties possessed by the free IDP in solution. By providing both a thermodynamic and kinetic characterization of this IDP test case, our study demonstrates how computational methods can be effective tools for studying this challenging class of proteins.

Introduction

In the last decade, it has become clear that a significant proportion of eukaryotic proteins are unstructured under physiological conditions.1-2 The lack of a stable secondary and tertiary structure is the defining feature of this class, hence the term Intrinsically Disordered Proteins (IDPs). This intrinsic disorder may involve the entire amino acid sequence or just be limited to specific domains. Notably, it is responsible for the highly dynamic nature of these proteins. Indeed, rather than possessing a well-defined, most probable state at equilibrium, IDPs typically exist as a dynamic conformational ensemble of states, among which they can easily interconvert. This feature accounts for the peculiar ability of IDPs to interact with multiple targets and thus to take on many physiological roles related to, for example, signaling or regulation.3-4 Likewise, dysfunctions of these proteins can lead to several pathological conditions, such as neurodegenerative disorders and cancer.5-6 To gain a comprehensive picture of their functionality and implication in diseases, it is thus essential to produce a detailed characterization of the equilibrium properties of free IDPs in solution, in terms of both thermodynamics and kinetics.

Exploration of this part of the proteome is still in its infancy. Effective experimental tools for studying IDPs include nuclear magnetic resonance (NMR) spectroscopy,7 small angle X-ray scattering (SAXS),8 and fluorescence Förster resonance energy transfer (FRET),9 which are well-suited to dealing with the structural dynamics of proteins. However, characterizing the equilibrium properties of IDPs remains challenging due to their highly dynamic nature, as well as the heterogeneity of the above techniques in terms of space and time resolution. From a structural standpoint, NMR spectroscopy has been particularly successful in characterizing the ensemble of conformations that collectively describe these peptides. A two-step procedure called “sample and select” (SAS) is the most common approach for identifying a representative ensemble of structures sampled at equilibrium. First, a pool of configurations is generated, usually by exploiting conformer libraries. Then, through an iterative process, the ensemble is filtered to achieve the best agreement between experimental and back-calculated NMR observables, such as scalar couplings (J), chemical shifts (CSs), and residual dipolar couplings (RDCs). Notwithstanding the power and elegance of this approach, SAS does not provide any kinetic information on the timescales regulating the interconversion between conformations. Yet this information is instrumental to understanding the folding of IDPs and the recognition mechanisms with molecular partners.
The only source of dynamic and time-resolved information by NMR is the study of spin dynamics: a method for IDPs was recently proposed,10 based on multiple temperature measurements to discriminate between motions on different timescales. However, even within the framework of this new family of experimental techniques, the nature of 15N spin relaxation only allows the identification of motions on the timescale of tens of nanoseconds, i.e., within times that are shorter than those governing the interconversion between different local structures present along a disordered chain (e.g., considering that the folding of a helix occurs within the µs timescale).

In this context, all-atom molecular dynamics (MD) is emerging as a key tool,11-15 as it provides an ensemble of conformations in equilibrium conditions. Additionally, MD can describe the kinetics of the observed events,16 allowing a comprehensive picture of IDPs at a fully atomistic level. However, when applying MD to study slow processes such as folding and unfolding, extensive simulations (in the microsecond-to-millisecond timescale) are typically required, even for relatively short amino acid sequences.17 MD-based enhanced sampling methods are particularly promising in this regard, as they allow an efficient exploration of the configurational space, while preserving the Boltzmann distribution of states in the given statistical ensemble.18 In these methods, the sampling is accelerated by either exploiting high temperature replicas, as in replica exchange (or Parallel Tempering MD, PT-MD), or through bias potentials or forces acting on selected degrees of freedom, which are known to describe the event one wishes to accelerate (reaction coordinates or collective variables, CVs). Metadynamics19 is one such CV-based method that can ultimately be combined with PT-MD20 to further improve sampling effectiveness (PTMetaD).21 The ever-increasing number of computational studies focusing on IDPs22-23 demonstrates the strong need for robust techniques to simulate these systems. Such techniques range from force field optimization to the use of implicit solvent models.
Interestingly, NMR can be directly combined with MD, leading to ensemble-restrained MD simulations.24 Within these approaches, chemical shifts are implemented as additional terms of the molecular mechanics force field.25 It has been shown that using replica-averaged restraints improves the description of the protein dynamics around the native structure.26 Conversely, the so-called NMR-guided metadynamics incorporates experimental information in the form of a purposely designed CV. It has been successfully used to characterize the complex free-energy landscape of the Aβ1-40 peptide.27


The C-terminal domain (NTAIL) of the nucleoprotein of the Sendai virus (Figure 1A), a member of the Paramyxoviridae family, is intrinsically disordered.28 Experimental studies on NTAIL and similar peptides from the same family have revealed a high helical propensity in the central region of the sequence, giving rise to differently folded states that are accessible to this class of domains.29

Herein, we investigate the ability of state-of-the-art computational methods to tackle both the thermodynamic and kinetic aspects of IDPs. In particular, we first used the enhanced sampling method PTMetaD in the Well-Tempered Ensemble (PTMetaD-WTE, for clarity hereafter simply referred to as metadynamics)30-31 to explore the conformational space of NTAIL and to characterize its free-energy landscape without any bias towards experimental observables. The reliability of the calculations was then validated by comparing the back-calculated CSs with previously collected NMR data. Finally, a discrete-state kinetic model was built to describe the interconversion between states populated by the peptide, and to calculate the corresponding rates. Although there were marginal discrepancies with structural observables, our approach succeeded in identifying all the relevant conformational states of NTAIL, and can be useful in characterizing interconversion rates that are otherwise elusive to direct experimental investigation. In addition, the rather good agreement between theoretical and experimental NMR observables provides, to the best of our knowledge, the first quantitative assessment of the ability of metadynamics to fully explore the free energy surface (FES) of a disordered polypeptide.

Computational methods

Sampling strategy. In metadynamics,19 sampling is enhanced by adding a history-dependent bias potential along selected degrees of freedom of the system, referred to as CVs. At time intervals τ, a Gaussian-shaped potential V(S,t) is deposited in the CV space:

$$V(S,t) = \omega \cdot \exp\left[-\frac{\left(S - S(t)\right)^{2}}{2\sigma^{2}}\right] \qquad (1)$$
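The deposition rule can be illustrated numerically. The following Python sketch is a toy, one-dimensional example rather than the setup used in this work: it assumes the usual Gaussian form of the deposited potential, V(S,t) = ω·exp[−(S − S(t))²/(2σ²)], sums the Gaussians deposited at a list of previously visited CV values, and takes the negative of that sum as the free-energy estimate. The function names and the numerical values of ω, σ, and the deposition centers are hypothetical.

```python
import math


def accumulated_bias(s, centers, omega, sigma):
    """Total bias at CV value s: one Gaussian (height omega, width sigma)
    per previously visited CV value listed in `centers`."""
    return sum(
        omega * math.exp(-((s - c) ** 2) / (2.0 * sigma ** 2))
        for c in centers
    )


def free_energy_estimate(s, centers, omega, sigma):
    """Plain-metadynamics estimate: the negative of the accumulated bias."""
    return -accumulated_bias(s, centers, omega, sigma)


# Hypothetical toy values (arbitrary units), chosen only for illustration.
centers = [0.0, 0.1, 0.0, -0.2]
omega, sigma = 2.5, 0.5
for s in (-1.0, 0.0, 1.0):
    print(s, round(free_energy_estimate(s, centers, omega, sigma), 3))
```

The sketch covers plain metadynamics only. In well-tempered metadynamics the Gaussian height is additionally rescaled as the bias grows, which is not modeled here.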

In Equation 1, S denotes a CV, and σ and ω are the width and height of the Gaussian, respectively. As a result, not only is the exploration of the CV space improved by the forces arising from the bias, but the system is also discouraged from revisiting regions that have already been sampled. Once all the regions have been adequately sampled, an unbiased estimate of the underlying free energy is given by the negative of the overall bias potential, which in turn is the sum of all the Gaussians deposited.

A major challenge when dealing with metadynamics is the choice of the CVs. Indeed, significant energy barriers may exist along degrees of freedom that are not explicitly accounted for by the chosen CVs, thus undermining the efficiency of the method. To attenuate this drawback, metadynamics can be combined with PT, resulting in the PTMetaD framework.21 In PTMetaD, multiple replicas of the system are simulated at increasing temperatures and, at constant intervals of time, exchanges between adjacent replicas i and j are allowed according to a Metropolis scheme with probability:

$$P_{ij} = \min\left\{1, \exp\left[\left(\beta_i - \beta_j\right)\left(U(R_i) - U(R_j)\right) + \beta_i\left(V_i(R_i) - V_i(R_j)\right) + \beta_j\left(V_j(R_j) - V_j(R_i)\right)\right]\right\} \qquad (2)$$
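As an illustration of the exchange criterion, the sketch below evaluates the acceptance probability for a swap between two replicas. It assumes the standard parallel-tempering metadynamics form, in which the exponent is (β_i − β_j)[U(R_i) − U(R_j)] + β_i[V_i(R_i) − V_i(R_j)] + β_j[V_j(R_j) − V_j(R_i)] with β = 1/(k_B T). The function name, the argument names, and the energies in the example are hypothetical; energies are taken in kJ/mol.

```python
import math

KB = 0.0083144626  # Boltzmann constant in kJ/(mol K)


def exchange_probability(t_i, t_j, u_i, u_j, vi_ri, vi_rj, vj_rj, vj_ri):
    """Metropolis acceptance probability for swapping replicas i and j.

    u_i, u_j : potential energies U(R_i), U(R_j)
    vi_ri    : bias of replica i evaluated on its own configuration R_i
    vi_rj    : bias of replica i evaluated on configuration R_j
    vj_rj, vj_ri : the same for the bias of replica j
    """
    beta_i = 1.0 / (KB * t_i)
    beta_j = 1.0 / (KB * t_j)
    delta = (
        (beta_i - beta_j) * (u_i - u_j)
        + beta_i * (vi_ri - vi_rj)
        + beta_j * (vj_rj - vj_ri)
    )
    # A non-negative exponent means the swap is always accepted;
    # returning early also avoids overflow in exp().
    if delta >= 0.0:
        return 1.0
    return math.exp(delta)


# Hypothetical example without bias: the colder replica (298 K) holds the
# lower-energy configuration, so the swap is accepted only occasionally.
p = exchange_probability(298.0, 310.8, -1000.0, -900.0, 0.0, 0.0, 0.0, 0.0)
print(f"acceptance probability: {p:.3f}")
```

When the colder replica holds the higher-energy configuration instead, the exponent is positive and the swap is always accepted, which is how low-energy structures found at high temperature migrate down the ladder.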

In Equation 2, U is the potential energy of a given replica and V is the bias potential added by metadynamics to that replica; β = 1/(kBT) denotes the inverse temperature and R the atomic coordinates of the corresponding replica. At the higher temperatures, crossing of relatively low energy barriers along all degrees of freedom is facilitated, while the canonical distribution is maintained at the lowest temperature replica. A crucial point of PT is that the energy distributions of adjacent replicas need to overlap in order for the exchanges to occur. Thus, a significant number of replicas of the system is typically required to cover a given temperature range, which may be counter-productive in terms of computational cost. To improve efficiency, several PT variants can be exploited, ranging from the use of hybrid-solvent models32 to the increase of the effective temperature on specific degrees of freedom.33 Among these, the Well-Tempered Ensemble (WTE)30 approach makes it possible to achieve the required overlap with a reduced number of replicas by enhancing the energy fluctuations of the system (Figure S1). To combine the approaches and perform metadynamics in the WTE, a preliminary PTMetaD run is performed using the potential energy as the only CV and storing the bias. Then, in the subsequent PTMetaD-WTE production phase, the previously obtained bias in the space of the potential energy is maintained while the CVs chosen to guide the sampling along slow degrees of freedom are biased through well-tempered metadynamics.34

System setup. The initial system for our simulations was prepared from conformation H1 of the peptide (sequence: NDEDVSDIERRIAMRLAERRQEDSAT), determined via NMR in a previous work. The termini were capped by adding acetyl (ACE) and N-methyl amide (NME) groups at the N- and C-terminus, respectively. Solvation was accomplished by adding 15,622 water molecules in a cubic box with a 78 Å edge. Na+ counterions were subsequently added to neutralize the system. A latest-generation force field was used to treat the peptide. In particular, Amber ff99SB*-ILDN,35 resulting from the original ff99SB36 corrected with the “ILDN” side-chain torsion parameters37 and the helix-coil transition balance optimizations,38 was adopted. While other computationally more expensive simulation setups could be envisioned (e.g., Amber ff03w39 combined with TIP4P/200540 or Amber ff99SB-ILDN combined with TIP4P-D),41 Amber ff99SB*-ILDN was recently used in several studies together with computationally inexpensive water models, and worked well for a large variety of proteins including IDPs.11, 35, 42-44 Accordingly, the TIP3P45 model was applied for the water molecules, and the ions were described according to a recent reparametrization by Joung and Cheatham.46

After an initial minimization consisting of 5,000 steps of steepest descent, the system was equilibrated in two stages. First, we performed a 500 ps long simulation at 298 K in the NVT ensemble using the velocity-rescaling thermostat,47 with a 0.1 ps time constant for coupling. This was followed by 500 ps in the NPT ensemble using the Parrinello-Rahman barostat,48 with a 2.0 ps time constant for coupling. Long-range electrostatics were treated via the particle mesh Ewald method.49 A cutoff of 12 Å was used for short-range non-bonded interactions. All simulations were performed applying a 2 fs time step. For the PT, the 298-400 K temperature range (Tmin-Tmax) was covered with N = 8 replicas, geometrically distributed50 according to:

$$T_i = T_{\mathrm{min}} \left(\frac{T_{\mathrm{max}}}{T_{\mathrm{min}}}\right)^{\frac{i-1}{N-1}} \qquad (3)$$
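The geometric ladder can be checked numerically. The Python sketch below assumes that the distribution is the usual geometric progression, T_i = T_min·(T_max/T_min)^((i−1)/(N−1)), with T_min = 298 K, T_max = 400 K, and N = 8 taken from the text; the function name is hypothetical. With these inputs, the computed temperatures agree with the values reported for the eight replicas to within about 0.01 K.

```python
def geometric_temperatures(t_min, t_max, n_replicas):
    """Temperatures T_i = t_min * (t_max / t_min) ** ((i - 1) / (n_replicas - 1))
    for i = 1..n_replicas (a geometric progression from t_min to t_max)."""
    ratio = t_max / t_min
    return [
        t_min * ratio ** (k / (n_replicas - 1))
        for k in range(n_replicas)  # k = i - 1
    ]


ladder = geometric_temperatures(298.0, 400.0, 8)
print(", ".join(f"{t:.2f}" for t in ladder))
```

In a geometric ladder the ratio between neighboring temperatures is constant, so the spacing in kelvin widens toward the hot end of the range.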

Here, Ti is the temperature of replica i. Specifically, the obtained values were: 298.00, 310.80, 324.15, 338.07, 352.60, 367.73, 383.53, and 400.00 K. The system equilibrated at 298 K was heated up to 400 K in 7 successive steps in order to obtain the starting configurations for each replica. All simulations were performed with version 4.6.7 of the GROMACS MD engine,51 patched with the PLUMED 2.1 software.52

To run in the WTE, a preliminary 5 ns long PTMetaD run was carried out, using the potential energy as the only CV. Gaussians with a height (ω) of 2.5 kJ/mol and a width (σ) of 500 kJ/mol were added every 250 steps. A bias factor of 50 was applied to ensure a sufficient overlap between the energy distributions of neighboring replicas, and exchanges were attempted every 100 steps. The bias gathered in this preliminary run was then kept fixed during the subsequent PTMetaD production run, thus allowing simulation in the WTE. For the production phase, the alpha-helical content (αcont)53 and the radius of gyration (Rgyr)54 were chosen as CVs, respectively defined as:

789-: = ∑J