
Article

A Flexible Domain-domain Hinge Promotes an Induced-fit Dominant Mechanism for the Loading of Guide-DNA into Argonaute Protein in Thermus thermophilus

Lizhe Zhu, Hanlun Jiang, Fu Kit Sheong, Xuefeng Cui, Xin Gao, Yanli Wang, and Xuhui Huang

J. Phys. Chem. B, Just Accepted Manuscript • DOI: 10.1021/acs.jpcb.5b12426 • Publication Date (Web): 24 Feb 2016. Downloaded from http://pubs.acs.org on February 26, 2016


The Journal of Physical Chemistry B is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.


The Journal of Physical Chemistry

A Flexible Domain-domain Hinge Promotes an Induced-fit Dominant Mechanism for the Loading of Guide-DNA into Argonaute Protein in Thermus thermophilus

Lizhe Zhu1,2, Hanlun Jiang2,3, Fu Kit Sheong1, Xuefeng Cui4, Xin Gao4, Yanli Wang5 and Xuhui Huang1,2,3*

1 Department of Chemistry, 2 Center of Systems Biology and Human Health, School of Science and Institute for Advanced Study, 3 Bioengineering Graduate Program, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong.
4 Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology, Thuwal 23955-6900, Saudi Arabia.
5 Laboratory of Non-Coding RNA, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China.

ACS Paragon Plus Environment


ABSTRACT

Argonaute proteins (Ago) are core components of the RNA Induced Silencing Complex (RISC) that load and utilize small guide nucleic acids to silence mRNAs or cleave foreign DNAs. Despite the essential role of Ago in gene regulation and defense against viruses, the molecular mechanism of guide-strand loading into Ago remains unclear. We explore this mechanism in the Ago of the bacterium Thermus thermophilus (TtAgo), via a computational approach combining molecular dynamics, bias-exchange metadynamics and protein-DNA docking. We show that apo TtAgo adopts multiple closed states that are unable to accommodate guide-DNA. Conformations able to accommodate the guide are beyond the reach of thermal fluctuations from the closed states. These results suggest an induced-fit dominant mechanism for guide-strand loading in TtAgo, drastically different from the two-step mechanism for human Ago 2 (hAgo2) identified in our previous study. This difference between TtAgo and hAgo2 is found to originate mainly from the distinct rigidity of their L1-PAZ hinges. Further comparison among known Ago structures from various species indicates that the L1-PAZ hinge may be flexible in general for prokaryotic Agos but rigid for eukaryotic Agos.

KEYWORDS: molecular recognition; Argonaute; DNA; guide strand loading; Molecular Dynamics; enhanced sampling; protein-DNA docking

I. Introduction

RNA interference (RNAi) is a key cellular mechanism involved in the regulation of gene expression and the defense against viruses throughout all kingdoms of life1,2. As the central component of RNAi, the RNA Induced Silencing Complex (RISC) mediates the sequence-specific recognition and inhibition of target mRNA3 and foreign nucleic acids4,5. As the core component of RISC, Argonaute protein (Ago) loads and utilizes a small guide nucleic acid to selectively capture the target nucleic acid for inhibition or degradation3,6-9. Because of its promising pharmaceutical potential in regulating specific gene silencing10-12, Ago has received considerable attention in the past decades6,13.

Despite intensive studies, the mechanism of the very first step of RISC formation, guide nucleic acid loading into Ago, remains largely elusive. In vitro14,15 and under crystallographic conditions9,16-18, spontaneous loading of a single-stranded guide has been observed for various Ago proteins. In vivo, it is generally accepted that guide-strand loading consists of two steps: loading of a duplex DNA or RNA from the Dicer protein19,20, and release of one (passenger) strand so that only the guide strand remains in Ago21,22. It remains unknown whether apo Ago loads the single-stranded guide in a selective manner.

Guide-strand loading is a molecular recognition process between Ago and the guide nucleic acid. Two extreme mechanistic models are often used to understand such molecular recognition: conformational selection (CS) and induced fit (IF)23-26. In CS, the protein is in equilibrium among a number of metastable conformational states, and the nucleic acid can readily bind to one of these states26,27. In IF, the final bound conformation is not accessible to the apo protein through thermal fluctuations; the nucleic acid must first bind to, and form weak interactions with, a different conformation, and then induce a conformational change of the protein towards the final bound form25. Both models have found abundant examples in nature. For instance, CS has been suggested in the recognition between lac repressor headpiece and DNA28,29.
IF seems to play a major role in the recognition between pancreatic and duodenal homeobox 1 (Pdx1) and DNA30, and between p53 and DNA31. Because a conformational change is required after the initial binding, IF appears to offer higher specificity and fidelity of recognition than CS, especially when similar binding competitors are present32. In practice, the recognition mechanism often lies between CS and IF33-36.

To understand the molecular mechanism of guide-strand loading, one must consider chemical details and dynamics using techniques such as molecular simulations34, because of the intrinsic complexity of Ago and the guide nucleic acid and of the interactions between them. Along this direction, we have previously studied miRNA loading into human Ago 2 (hAgo2)37 via a computational protocol combining all-atom Molecular Dynamics (MD), Markov State Models (MSMs)38-52 and protein-RNA docking53,54. In that work37, we proposed a two-step model for guide-strand loading into hAgo2: first, the guide strand selectively binds to an open conformation of apo hAgo2 via CS; subsequently, small-scale structural rearrangements of both molecules lead to the final bound complex.

In the present study, we focus on the Ago of the bacterium Thermus thermophilus (TtAgo). TtAgo is among the most extensively investigated Ago proteins55-57. In Fig.1A, we display the crystal structure of TtAgo (PDB 3DLH) in its binary form with bound single-stranded DNA (ssDNA). Similar to other Ago proteins, TtAgo consists of 6 functional domains, namely the N (N-terminal) domain, L1 (Linker 1), the PAZ (PIWI-Argonaute-Zwille) domain, L2 (Linker 2), the MID (middle) domain and the C-terminal PIWI (P element–induced wimpy testis) domain. MID and PAZ bind to the 5' and 3' termini of the guide respectively and define the two ends of the nucleic-acid binding channel.
Driven mainly by the electrostatic attraction between the positively charged protein residues and the negatively charged backbone atoms of the DNA58,59, the 5' segment of the channel binds nucleotides 2-8 of the guide, known as the seed region (top right of Fig.1A)8. Correct positioning of the seed region in the channel is crucial for subsequent recognition of the target nucleic acid in both TtAgo8 and human Ago60,61. The diameter of the nucleic-acid binding channel is regulated by the relative positions of PIWI, L1 and PAZ. Notably, in this binary crystal structure, direct contacts between the PAZ and PIWI domains exist and bury the guide DNA in the binding channel (see SI Fig.1).

Figure 1. Argonaute protein of the bacterium Thermus thermophilus (TtAgo). (A) Domain composition (left) and the nucleic-acid binding channel (right) of TtAgo (structure PDB 3DLH). The seed region (defined in the main text) is enlarged in the inset for better visualization. The guide DNA is shown in licorice representation, while TtAgo atoms in contact with the DNA are shown as surfaces. Atoms with negative and positive charges are in red and blue respectively. (B) Collective variables (CVs) describing TtAgo conformations: dCH1 and dCH2 measure the diameter of the nucleic-acid binding channel; θPAZ describes the torsion angle between PAZ and L2; dN-MID and dN-PIWI measure the distance between the N and MID domains and between the N and PIWI domains respectively. These CVs are used in the enhanced sampling scheme (Methods). Residues defining the CVs are colored as in (A).
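The distance-type CVs in panel (B) can be illustrated with a minimal sketch. It assumes that each CV is the distance between the centers of mass of two groups of α-carbon atoms, so that with equal masses the center of mass reduces to the plain centroid. The function names `centroid` and `com_distance` and the toy coordinates are hypothetical and are not taken from any TtAgo structure; the residue ranges are those given for dCH1 in the Methods.

```python
import math


def centroid(points):
    """Mean position of a list of (x, y, z) tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def com_distance(ca_coords, group_a, group_b):
    """Distance between the centroids of two residue groups.

    ca_coords maps residue number -> (x, y, z) of its alpha-carbon in angstrom.
    All atoms are alpha-carbons, so the center of mass equals the centroid.
    """
    ca = centroid([ca_coords[r] for r in group_a])
    cb = centroid([ca_coords[r] for r in group_b])
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(ca, cb)))


# Residue ranges quoted in the Methods for dCH1 (inclusive ranges).
L1_PART = range(165, 175)      # residues 165-174 on L1
PIWI_HELIX = range(654, 669)   # residues 654-668 on PIWI

# Toy coordinates for illustration only: two straight segments 30 angstrom apart.
toy = {r: (float(r - 165), 0.0, 0.0) for r in L1_PART}
toy.update({r: (float(r - 654), 30.0, 0.0) for r in PIWI_HELIX})

d_ch1 = com_distance(toy, L1_PART, PIWI_HELIX)
print(f"toy dCH1 = {d_ch1:.2f} angstrom")
```

With real data, `ca_coords` would be filled from the α-carbon records of a structure file or a trajectory frame; the same function would then apply to dCH2, dN-MID and dN-PIWI with their respective residue groups.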


Because both TtAgo and the guide DNA are large biopolymers, directly simulating the loading process via unbiased MD simulations is impractical. Therefore we adopt a step-wise protocol. We first perform all-atom MD simulations and bias-exchange metadynamics (BEMD), an enhanced sampling scheme, to extensively explore the conformational space of apo TtAgo. Large-scale protein-DNA docking simulations are then performed to assess how exposed the guide-DNA binding channel is in these TtAgo conformations. Our results show that apo TtAgo adopts multiple closed conformations that are unable to accommodate a guide DNA, pointing to an IF dominant mechanism for guide-strand loading. Moreover, the rigidity of the L1-PAZ hinge is shown to be critical in regulating the exposure of the nucleic-acid binding channel and therefore the guide-strand loading mechanism. The biological implications of these findings are discussed below.

II. Methods

1. Molecular Dynamics simulation

The initial apo TtAgo conformation was taken from the crystal structures7-9, and the missing residues were added using MODELLER62 (version 9.11). The apo TtAgo protein was solvated in a dodecahedral box containing 36,349 TIP3P water molecules with 109 Na+ ions and 126 Cl- ions to neutralize the charge. All the MD simulations were performed using the GROMACS 4.5.4 package63 and the Amber99SB-ILDN force field64. Long-range electrostatic interactions were treated with the Particle-Mesh Ewald method65,66. Both short-range electrostatic interactions and van der Waals interactions used a cut-off of 10 Å. All bonds were constrained by the LINCS algorithm67. The velocity-rescaling thermostat68 and the Parrinello-Rahman barostat69 were used for temperature and pressure coupling respectively. Because Thermus thermophilus has its optimal growth rate at 65 °C, all MD simulations were performed at 340 K. The system was first energy minimized with the steepest descent algorithm and equilibrated for 1 ns in the NPT ensemble (T=340 K, P=1 atm) with the positions of all the heavy atoms restrained. Next, we performed a 5 ns NPT simulation (T=340 K, P=1 atm) to further equilibrate the system. Finally, all the production MD simulations were performed in the NVT ensemble at 340 K.
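As a quick consistency check on the numbers quoted in this setup, the following sketch does the bookkeeping explicitly. It uses only values stated in the text; the derived quantities are inferences under the stated assumptions (monovalent ions, neutral water, an exactly neutral box) rather than reported results, and the names `PROTOCOL`, `implied_solute_charge` and `celsius_to_kelvin` are illustrative.

```python
# Illustrative bookkeeping for the simulation setup. Only values quoted in the
# text are used; everything derived from them is an inference, not a reported
# result.

N_NA = 109                   # Na+ ions in the box (from the text)
N_CL = 126                   # Cl- ions in the box (from the text)
GROWTH_T_CELSIUS = 65.0      # optimal growth temperature (from the text)
SIMULATION_T_KELVIN = 340.0  # simulation temperature (from the text)


def celsius_to_kelvin(t_celsius):
    """Convert a temperature from degrees Celsius to kelvin."""
    return t_celsius + 273.15


def implied_solute_charge(n_cations, n_anions):
    """Net solute charge (units of e) that monovalent ions would neutralize."""
    return n_anions - n_cations


# (stage, ensemble, length in ns); None marks values the text does not give.
PROTOCOL = [
    ("energy minimization (steepest descent)", None, None),
    ("equilibration, heavy atoms restrained", "NPT", 1.0),
    ("further equilibration", "NPT", 5.0),
    ("production", "NVT", None),
]

protein_charge = implied_solute_charge(N_NA, N_CL)
growth_t_kelvin = celsius_to_kelvin(GROWTH_T_CELSIUS)
equilibration_ns = sum(ns for _, _, ns in PROTOCOL if ns is not None)

print(f"implied net protein charge: {protein_charge:+d} e")
print(f"65 C = {growth_t_kelvin:.2f} K; simulations run at {SIMULATION_T_KELVIN:.0f} K")
print(f"total equilibration time: {equilibration_ns:.1f} ns")
```

Under these assumptions the excess of 17 chloride ions implies a net protein charge of +17 e, and the 340 K simulation temperature sits about 2 K above the 65 °C growth optimum.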

2. Enhanced sampling of conformational landscape

BEMD calculations were carried out with PLUMED70 (version 1.3) to enhance the sampling of the conformational space of apo TtAgo. In BEMD, several replicas of the same protein system were simulated, each experiencing a different history-dependent bias potential that discourages sampling of conformations that have already been visited71-73. Trial exchanges of the bias potentials accumulated in each replica were attempted at a fixed frequency, using a Metropolis Monte Carlo acceptance criterion73. Effectively, this procedure flattens the free energy landscape in a set of collective variables (CVs) describing the conformational space of the protein and therefore encourages the protein to explore a large volume of the state space of interest in all replicas. After convergence, the accumulated bias potentials were combined and transformed into an unbiased probability distribution via the Weighted Histogram Analysis Method (WHAM), resulting in the unbiased free energy landscape in terms of these CVs74.

We defined five CVs based on geometrical considerations and physical intuition to describe TtAgo conformations (Fig.1B). All five CVs were defined using the α-carbon atoms of the corresponding residues. dCH1 measures the center-of-mass (c.o.m.) distance between part of L1 (residues 165-174) and one long helix on PIWI (residues 654-668). dCH2 is the c.o.m. distance between part of PAZ (residues 192-202) and part of PIWI (residues 572-581). These two distances directly measure the diameter of the nucleic-acid binding channel, i.e. the ability of TtAgo to accommodate DNA/RNA. θPAZ describes the torsional motion between a helix on PAZ (residues 178-185 for TtAgo and 229-236 for hAgo2) and a helix on L2 (residues 283-290 for TtAgo and 371-378 for hAgo2) that has been suggested in previous studies56. dN-MID measures the distance between N (residues 42-53, 105-122) and MID (residues 416-427, 449-456) and encodes the maximum possible length of the nucleic-acid binding channel. Since dN-MID alone cannot fully describe the position of N, an additional distance dN-PIWI between the same N residues and part of PIWI (residues 541-574, 618-624) was defined. For direct comparison between TtAgo and hAgo2, dPAZ-PIWI was defined as the c.o.m. distance between PAZ and a helical turn on PIWI (residues 664-667 for TtAgo, 812-815 for hAgo2).

Our BEMD scheme included 8 replicas: 5 replicas with 1-dimensional metadynamics, each biasing one of the 5 CVs, and 3 replicas with 2-dimensional metadynamics biasing 3 pairs of CVs: dCH1 and dCH2, dCH1 and θPAZ, and dCH2 and θPAZ. These 3 pairs were chosen because dCH1, dCH2 and θPAZ are more directly relevant to exposure of the nucleic-acid binding channel. All replicas were initiated from the same relaxed crystal structure (PDB 3HK2). Boundary quadratic potentials were used to restrict the range of exploration: 22 Å