
Article

Structure, dynamics and interaction of p54nrb/ NonO RRM1 with 5' Splice Site RNA sequence Jean-Baptiste Duvignaud, Mikael Bédard, Takashi Nagata, Yutaka Muto, Shigeyuki Yokoyama, Stéphane M. Gagné, and Michel Vincent Biochemistry, Just Accepted Manuscript • DOI: 10.1021/acs.biochem.5b01240 • Publication Date (Web): 11 Apr 2016 Downloaded from http://pubs.acs.org on April 15, 2016


Biochemistry is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.


Structure, dynamics and interaction of p54nrb/NonO RRM1 with 5' Splice Site RNA sequence.

Jean-Baptiste Duvignaud1,2,3,¶, Mikaël Bédard1,3,¶, Takashi Nagata4,5,6, Yutaka Muto6,7, Shigeyuki Yokoyama8,9, Stéphane M. Gagné2,3 and Michel Vincent1,3*

1 Département de Biologie Moléculaire, Biochimie Médicale et Pathologie, Université Laval, Québec, G1V 0A6, Canada.
2 Département de Biochimie, Microbiologie et Bio-informatique, Université Laval, Québec, G1V 0A6, Canada.
3 PROTEO and IBIS, Université Laval, Québec, G1V 0A6, Canada.
4 Institute of Advanced Energy, Kyoto University, Gokasho, Uji, Kyoto 611-0011, Japan
5 Graduate School of Energy Science, Kyoto University, Gokasho, Uji, Kyoto 611-0011, Japan
6 RIKEN Center for Life Science Technologies, Suehiro-cho, Tsurumi-ku, Yokohama 230-0045, Japan
7 Faculty of Pharmacy and Research Institute of Pharmaceutical Science, Musashino University, Nishitokyo-shi, Tokyo 202-8585, Japan
8 RIKEN Systems and Structural Biology Center, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama 230-0045, Japan
9 RIKEN Structural Biology Laboratory, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama 230-0045, Japan

¶ These authors contributed equally to this work


Abstract

p54nrb/NonO is a nuclear RNA-binding protein involved in many cellular events such as pre-mRNA processing, transcription and nuclear retention of hyper-edited RNAs. In particular, it participates in the splicing process by directly binding the 5' splice site of pre-mRNAs. The protein also concentrates in a nuclear body called the paraspeckle by binding a G-rich segment of the ncRNA NEAT1. The N-terminal section of p54nrb/NonO contains tandem RNA recognition motifs (RRMs) preceded by an HQ-rich region including a threonine residue (Thr15) whose phosphorylation inhibits the RNA binding ability of the protein, except for G-rich RNAs. In this work, our goal was to understand the rules that characterize the binding of the p54nrb/NonO RRMs to their RNA target. In vitro RNA binding experiments revealed that only the first RRM of p54nrb/NonO binds to the 5' splice site RNA. We then determined the structure of the p54nrb/NonO RRM1 by liquid-state NMR, which revealed a canonical fold (β1α1β2β3α2β4) and the conservation of aromatic amino acids at the protein surface. We also investigated the dynamics of this domain by NMR. The p54nrb/NonO RRM1 displays motional properties typical of a well-folded protein, with some regions (loops and β-strands) exhibiting more flexibility. Furthermore, we determined the affinity of the p54nrb/NonO RRM1 for the 5' splice site RNA by NMR and fluorescence quenching and mapped its binding interface by NMR, which points to a classical nucleic acid interaction. This study provides an improved understanding of the molecular basis (structure and dynamics) that governs the binding of the p54nrb/NonO RRM1 to one of its target RNAs.


Introduction

Since its discovery in the early 1990s, the p54nrb/NonO protein has been shown to be involved in many nuclear functions, earning it the label of “multifunctional” protein 1. Indeed, p54nrb/NonO is implicated in pre-mRNA splicing, regulation of transcription, transcription termination and DNA unwinding and pairing 1-5. The role of p54nrb/NonO in the pre-mRNA splicing process has been studied extensively. Different reports 5-7 showed that this multifunctional protein can directly bind the pre-mRNA 5' splice site (5'SS) as well as the CTD of RNA polymerase II and the U5 snRNP. The capacity of p54nrb/NonO to bridge these different molecular components suggests that this protein is critical in the spatiotemporal linkage of transcription to the splicing process. In addition, p54nrb/NonO is a major component of a nuclear compartment called the paraspeckle through its interaction with the non-coding RNA NEAT-1/Vinc1/Men ε/β 8-11. Paraspeckles are involved in intra-nuclear retention of hyper-edited RNA 12, 13. Most p54nrb/NonO functions known to date involve its binding to various nucleic acids. Interestingly, a recent study 14 demonstrated that the binding efficiency of p54nrb/NonO to its nucleic acid targets is altered by the phosphorylation of its threonine residue at position 15. The in vitro binding of p54nrb/NonO to poly(A)/(C)/(U) and 5' splice site RNA (5'SS) sequences 5 was impaired by Thr15 phosphorylation but, surprisingly, the binding capacity of the Thr15-phosphorylated p54nrb/NonO to poly(G) sequences was preserved 14. Interestingly, the non-coding NEAT-1/Vinc1/Men ε/β RNA, which is the principal scaffold of the paraspeckle, contains G-rich sequences at its 5' extremity which are bound by p54nrb/NonO 15. This observation suggests that p54nrb/NonO can discriminate between different RNA binding sites.

p54nrb/NonO is a protein of 471 amino acids in its human form (p54nrb) and of 473 amino acids in its murine form (NonO). p54nrb/NonO harbours a DBHS (Drosophila behaviour, human splicing) motif, which is shared with the PSPC1 and SFPQ proteins, also found in the paraspeckle 16-18. The DBHS domain comprises an RNA binding domain (RBD), a conserved region called NOPS (NonO/paraspeckle) and a coiled-coil region (Figure 1A). The RBD of p54nrb/NonO consists of two RNA recognition motifs (RRMs), preceded by an N-terminal 73 amino acid-long section rich in histidine and glutamine (HQ domain) which is not found in the PSPC1 and SFPQ sequences. This HQ-rich domain was recently predicted to be a prion-like domain (PLD) essential for the building of the paraspeckle 19. In addition to the coiled-coil domain, the C-terminal region of p54nrb/NonO includes a charged domain (+/-) and a proline-rich section at the extreme C-terminus. This C-terminal part is involved in homo- and/or heterodimerization of the protein with its main protein partners (SFPQ and PSPC1) 18, 20 and in double-strand DNA binding 21.

RNA-binding surfaces of the canonical RRM fold (β1α1β2β3α2β4) are generally formed by the four-stranded β-sheet, the β2-β3 loop and the β1-α1 loop 22. The two central β-strands of the motif (β3 and β1) contain well-conserved consensus sequences named RNP1 and RNP2 (ribonucleoprotein consensus sequence 1 and 2, respectively). Both RNP sequences usually include aromatic residues with solvent-exposed side chains that participate in nucleic acid recognition by mediating stacking interactions with RNA bases 23. Notably, the sequences of the three human DBHS proteins (p54nrb/NonO, SFPQ and PSPC1) show a high degree of identity between their RRMs (Figure 1B). Whereas their RRM1 motif is basic and possesses the conserved aromatic residues generally involved in RNA binding, their RRM2 is acidic and lacks these specific amino acids (Figure 1B). These observations suggest that the RRM1 of the DBHS proteins is likely involved in a more classical nucleic acid interaction than the RRM2 motif and raise the possibility of functional conservation between the RRM motifs of the DBHS protein family. Besides, recent structural studies revealed that the spatial arrangement of the RRMs, although unusual, represents a common feature of the human DBHS proteins 24-26. The RRM2 motif, the NOPS domain and the coiled-coil region are involved in the dimerization of those proteins, where the NOPS domain of one subunit interacts with the RRM2 of the other subunit via a supernumerary β-strand 25. For the RRM1 motif, the comparison of the homodimer 26 and the heterodimers 24, 25 revealed a shift in its position, suggesting an inherent domain flexibility for this motif.

The RRM is one of the most common protein domains found in eukaryotes and, accordingly, many structural and binding studies have been reported since the late 1980s. Remarkably, only little variation in the 3D structures has been observed, mostly due to the size of the loops 23. Despite this abundant literature, relatively few studies have been devoted to RRM dynamics, which could help to explain how such a common domain, involved in so many different functions, binds and interacts with such a variety of nucleic acids.


In an attempt to better understand the binding modulation of p54nrb/NonO to its RNA partner, we performed a molecular dissection of the N-terminal RBD of p54nrb/NonO. We demonstrated that the RNA binding capacity of p54nrb/NonO resides in its first RRM domain. The solution structure of the RRM1 domain was determined by NMR. It adopts a fold similar to other known RRM structures despite its small size (70 amino acids) and is comparable to the one obtained in the heterodimer complex with PSPC1 25. An RNA binding study was performed by NMR and fluorescence to map the binding interface and evaluate the affinity of the RRM1 construct towards the short 5'SS RNA sequence (5'-AAAAAGGUAAG-3') shown to interact with p54nrb/NonO in large transcription complexes 5. This interaction appears to be specific since the complexes were not detected using 5'SS RNAs with single point mutations at either intron position +1 or +2 5. Our results indicate that p54nrb/NonO RRM1 interacts with the 5'SS RNA using its typical β-sheet surface with a Kd of about 70 µM. This weak affinity was confirmed by intrinsic fluorescence quenching experiments. Finally, the dynamics of the monomeric RRM1 domain was also evaluated by NMR. These combined efforts were undertaken in order to decipher the general rules that guide the specificity and the affinity of this RRM for a specific RNA.


Materials and methods

Molecular cloning

A construct of mouse p54nrb/NonO in pGEX-4T1 was kindly provided by F. Moreau-Gachelin 27. The N-RRM1/2 construct (aa 1-236) was generated by PCR on p54nrb/NonO cDNA using primers F (5'-CAT ATG CAG AGC AAT AAA GCC TTT AAC TTG-3') and R (5'-GAA TTC ACT CTT CAT CAT CTA ACT GGT CCA-3'). The PCR product was inserted into pCR 2.1-TOPO (Invitrogen), digested with NdeI and EcoRI, and inserted into pET-30a (Novagen) using the same restriction sites. The same procedure was used to generate N-RRM1 (aa 1-146) with specific primers F (5'-CAT ATG CAG AGC AAT AAA GCC TTT AAC TTG-3') and R (5'-GAA TTC ATG CAC TGT GAC AGG CAA AGC GCA-3'). A PCR on p54nrb/NonO cDNA with primers F (5'-CAT ATG CAC AGT GCA TCC CTT ACA GTC CGC-3') and R (5'-CTC GAG CTC TTC ATC ATC TAA CTG GTC CAT-3') was performed to generate the RRM2 construct (aa 148-236). The PCR product was inserted into pCR 2.1-TOPO, digested with NdeI and XhoI, and inserted into pET-30a using the same restriction sites. Finally, the RRM1 construct (aa 68-153), including a 20 amino acid-long cleavable N-terminal His-tag (MHHHHHHSSGLVPRGSGGSM), was synthesized by Bio Basic Inc. and then inserted into pET-30a.
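As a quick consistency check on the cloning strategy described above, the short sketch below confirms that each primer begins with the recognition sequence of the restriction enzyme used for its construct. The primer sequences are copied from the text with spaces removed; the recognition sequences (CATATG for NdeI, GAATTC for EcoRI, CTCGAG for XhoI) are the standard ones for these enzymes. The names `SITES`, `PRIMERS` and `five_prime_site` are illustrative and are not part of the original protocol.

```python
# Standard recognition sequences of the enzymes named in the text.
SITES = {"NdeI": "CATATG", "EcoRI": "GAATTC", "XhoI": "CTCGAG"}

# Primers copied from the text (spaces removed), keyed by (construct, direction).
PRIMERS = {
    ("N-RRM1/2", "F"): "CATATGCAGAGCAATAAAGCCTTTAACTTG",
    ("N-RRM1/2", "R"): "GAATTCACTCTTCATCATCTAACTGGTCCA",
    ("N-RRM1", "F"): "CATATGCAGAGCAATAAAGCCTTTAACTTG",
    ("N-RRM1", "R"): "GAATTCATGCACTGTGACAGGCAAAGCGCA",
    ("RRM2", "F"): "CATATGCACAGTGCATCCCTTACAGTCCGC",
    ("RRM2", "R"): "CTCGAGCTCTTCATCATCTAACTGGTCCAT",
}


def five_prime_site(primer):
    """Return the name of the enzyme whose site starts the primer, or None."""
    for name, site in SITES.items():
        if primer.startswith(site):
            return name
    return None


# Map every primer to the restriction site found at its 5' end.
SITE_MAP = {key: five_prime_site(seq) for key, seq in PRIMERS.items()}

for (construct, direction), enzyme in sorted(SITE_MAP.items()):
    print(f"{construct:9s} {direction}: {enzyme}")
```

Under these assumptions, all forward primers carry an NdeI site, the N-RRM1/2 and N-RRM1 reverse primers carry an EcoRI site, and the RRM2 reverse primer carries an XhoI site, in agreement with the enzyme pairs named for each construct.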


Protein expression and purification

For the structure determination, p54nrb/NonO RRM1 (residues 68-153) (see Figure S4 for the amino acid sequence) was synthesized by a cell-free protein synthesis system 28 and purified as described previously 28. A single 1.2 mM uniformly 13C- and 15N-labeled protein sample was prepared in 20 mM 2H-Tris-HCl buffer (pH 7.0), 100 mM NaCl, 1 mM dithiothreitol (DTT), and 0.02% (w/v) NaN3, with the addition of D2O to 10% v/v. The engineered protein sample used for the NMR measurements comprised 98 amino acid residues with extra sequences: (Gly-Ser-Ser)2-Gly and Gly-Pro-Ser-Ser-Gly at the N- and C-terminus, respectively. For the in vitro RNA binding assays, titration experiments and NMR relaxation studies, the p54nrb/NonO N-terminus plasmid constructs (N-RRM1/2, N-RRM1, RRM1 and RRM2) were transformed into Rosetta (DE3) pLysS competent E. coli (Novagen) or BL21 Star™ (DE3) pLysS chemically competent E. coli (Invitrogen). For unlabeled protein expression, bacteria were incubated in a shaker (150 rpm), grown in LB to an O.D. of 0.4 at 600 nm and induced for 16 hours at 29 °C with 0.4 mM IPTG. For 15N-labeled protein expression, bacteria were incubated in a shaker (250 rpm), grown in M9 minimal medium containing 15NH4Cl to an O.D. of 0.7 at 600 nm and induced for 4 hours at 29 °C with 0.4 mM IPTG. Bacteria were lysed in binding buffer (bb: 20 mM phosphate buffer, pH 7.4, 500 mM NaCl, 5 mM imidazole) supplemented with complete protease inhibitor (Roche) using an Emulsiflex C3 (Avestin). Soluble fractions were filtered (0.45 µm) and purified by FPLC using a His Trap column (GE Healthcare). Proteins were eluted by various step gradients of the elution buffer (EB: 20 mM phosphate buffer, pH 7.4, 500 mM NaCl, 500 mM imidazole). Samples were finally dialysed against the desired experimental buffer (incubating or NMR buffers, see below). The endogenous N-terminal histidine stretch was used to purify the N-RRM1/2 and N-RRM1 constructs, whereas a cleavable histidine-tag was added to the N-terminus of the RRM1 construct and a non-cleavable one to the C-terminus of the RRM2 construct.

RNA binding assay

For the 5'SS binding assay, streptavidin-agarose beads were blocked for 1 h at 4 °C in blocking buffer (BB: 20 mM Hepes, pH 7.4, 10 % glycerol, 100 mM KCl, 0.2 mM EDTA, 0.5 mM DTT, 5 % BSA, complete protease inhibitor (Roche)). Then, 1.5 nmol of biotinylated synthetic 5'SS RNA was incubated with 20 µL of beads for 30 minutes at 4 °C in the incubating buffer (IB: 20 mM Hepes, pH 7.4, 10 % glycerol, 100 mM KCl, 0.2 mM EDTA, 0.5 mM DTT, 0.5 % BSA, complete protease inhibitor (Roche)). Beads were washed and incubated with 5 µg of the desired protein for 30 minutes at 4 °C in IB. Bound proteins were solubilised in SDS sample buffer and analyzed by Western blot with the α-polyHis mAb (Sigma-Aldrich).

NMR spectroscopy: resonance assignments

For the RRM1 3D structure determination, NMR experiments were performed at 25 °C on a 700 MHz spectrometer (Bruker AVANCE 700) equipped with a cryogenic probe. Backbone and side chain assignments were obtained by standard triple resonance experiments 29, 30. 2D [1H,15N]-HSQC, and 3D HNCO, HN(CA)CO, HNCA, HN(CO)CA, HNCACB, and CBCA(CO)NH spectra were used for the 1H, 15N, and 13C assignments of the protein backbone. Side chain 1H and 13C assignments were obtained using 2D [1H,13C]-HSQC, and 3D HBHA(CO)NH, H(CCCO)NH, (H)CC(CO)NH, HCCH-COSY, HCCH-TOCSY, and (H)CCH-TOCSY spectra. The 1H and 13C spin systems of the aromatic rings of Phe, Tyr and His were identified using 3D HCCH-COSY and HCCH-TOCSY experiments, and 3D 13C-edited NOESY-HSQC was used for the sequence-specific resonance assignment of the aromatic side chains. All the assignments were checked for consistency with 3D 15N-edited NOESY-HSQC and 13C-edited NOESY-HSQC spectra. 3D NOESY spectra were recorded with mixing times of 80 ms. Chemical shift referencing was based on IUPAC recommendations using 2,2-dimethyl-2-silapentane-5-sulfonate sodium salt (DSS) 31. The NMR data were processed with the program NMRPipe 32. Spectra were analyzed with the programs NMRView 33 and KUJIRA 34.

Structure calculations

Peak lists for the NOESY spectra were generated by interactive peak picking, and peak intensities were determined by the automatic integration function of NMRView 35. The three-dimensional structure was determined by combined automated NOESY cross-peak assignment and structure calculation with torsion angle dynamics, as implemented in the CYANA program 36-38. Restraints for the backbone torsion angles ψ and φ were determined by a chemical shift database analysis with the program TALOS 39. The standard CYANA protocol was applied 27. Several hydrogen bonds derived from the NOE network for the β-sheet were added for structure refinement 40. The 20 structures from the CYANA calculation were subjected to restrained energy-refinement with the program AMBER9 (http://amber.scripps.edu) 41, using the Generalized Born model. During the AMBER calculations, distance and dihedral angle restraints were applied with force constants of 32 kcal mol-1 Å-2 and 250 kcal mol-1 rad-2, respectively. PROCHECK_NMR 42 was used to validate the final structures. Structure figures were prepared with the program PyMOL (The PyMOL Molecular Graphics System, Version 1.7.4 Schrödinger, LLC.).

NMR spectroscopy: general information for titration and relaxation studies

All experiments were carried out at 25 °C on a 600 MHz VARIAN INOVA spectrometer equipped with an XYZ-axis pulsed field gradient and triple resonance probe. Chemical shift referencing was based on IUPAC recommendations using 2,2-dimethyl-2-silapentane-5-sulfonate sodium salt (DSS) 31. All NMR data were processed using NMRPipe 32 and spectra were analyzed with the program NMRView 35.

NMR spectroscopy: relaxation data analysis

For relaxation studies, three experiments were recorded: 15N-T1, 15N-T2 and {1H}-15N NOE (NMR pulse sequences from VnmrJ BioPack). Spectra were recorded on a 15N-labeled RRM1 sample at a concentration of 0.15 mM in NMR buffer (10 mM phosphate, pH 7, 100 mM NaCl, 0.02 % NaN3, 20 mM DTT, 3 mM imidazole, 10 % D2O). 15N-T1 delay times of 10, 20, 40, 80, 160, 320, 640, and 1280 ms were used. 15N-T2 delay times of 10, 30, 50, 70, 90, 130, 170, 210, and 250 ms were used. For {1H}-15N NOE experiments, one spectrum was acquired with a 5 s 1H saturation time for the NOE to build up and another spectrum was acquired with a 5 s recycle delay. Determination of 15N-R1 and 15N-R2 relaxation rates was accomplished using CURVEFIT (A. G. Palmer, Columbia University, New York, NY). {1H}-15N NOE values were obtained directly from the HetNOE analysis function in NMRView. Noise floors of 3.2%, 2.3% and 3.0% were used in the calculation of 15N-R1, 15N-R2 and {1H}-15N NOE, respectively. Analysis of relaxation data was performed using the extended Model-Free formalism, using the statistical approach of Mandel et al. 43. Values for the 15N gyromagnetic ratio, H-N bond length, and chemical shift anisotropy were -2.712 × 10^7 rad T-1 s-1, 1.02 Å, and -172 ppm, respectively. An initial overall tumbling time (τm) was estimated from the R2/R1 ratio using only residues in secondary structure elements (31 residues). The center of mass of the PDB structure was translated to the coordinate origin using the PDBINERTIA program (A. G. Palmer, Columbia University) and the R2R1_DIFFUSION program (A. G. Palmer, Columbia University) was used to estimate the diffusion tensors for spherical and axially symmetric diffusion models. The relative moments of inertia for the NMR structure 2RS8 of RRM1 are 1.0, 0.97, and 0.52. We used the axially symmetric model, based on statistics, and performed Model-Free analysis as described by Mandel et al. 43. After optimization of the dynamics parameters, we obtained a D∥/D⊥ value of 1.373 ± 0.044 and a τm value of 6.61 ± 0.04 ns, which is in accordance with what is expected for a protein of this size. Residues with significant overlap (K70, S76, K98, E130, E104), poor signal-to-noise ratio or bad fits (G68, T71, F72, T73, L152), or no model selected (L78, E90, K101, V105, T120, T122) were discarded, therefore allowing for the characterization of 52 residues.
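To illustrate how a relaxation rate is extracted from a delay series such as the one above, the following minimal sketch fits a mono-exponential decay, I(t) = I0 exp(-R t), to peak intensities measured at the 15N-T2 delays listed in the text. It is not the CURVEFIT procedure used in this work: it uses a simple log-linear least-squares fit, which weights the points differently from a non-linear fit. The peak intensities are synthetic and noise-free, and the rate of 12.5 s-1, the initial intensity and all function and variable names are hypothetical.

```python
import math

# 15N-T2 relaxation delays from the text, converted from ms to seconds.
T2_DELAYS_S = [d / 1000.0 for d in (10, 30, 50, 70, 90, 130, 170, 210, 250)]


def fit_rate(delays, intensities):
    """Return (R, I0) from a least-squares line through ln(I) versus delay.

    Model: I(t) = I0 * exp(-R * t), hence ln(I) = ln(I0) - R * t.
    """
    n = len(delays)
    logs = [math.log(i) for i in intensities]
    mean_t = sum(delays) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in delays)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(delays, logs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    return -slope, math.exp(intercept)


# Synthetic, noise-free peak heights for a hypothetical residue (not real data).
TRUE_R2 = 12.5   # s^-1, assumed for illustration only
TRUE_I0 = 1.0e6  # arbitrary intensity units
synthetic = [TRUE_I0 * math.exp(-TRUE_R2 * t) for t in T2_DELAYS_S]

r2_fit, i0_fit = fit_rate(T2_DELAYS_S, synthetic)
print(f"R2 = {r2_fit:.3f} s^-1, T2 = {1000.0 / r2_fit:.1f} ms")
```

With experimental data, the uncertainty on each rate would additionally have to be propagated from the spectral noise, which this sketch does not attempt.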

NMR spectroscopy: titration experiments

For the titration study, a series of 2D [1H,15N]-HSQC spectra were recorded at 25 °C on a 15N-labeled RRM1 sample at a concentration of 0.0833 mM in NMR buffer (10 mM phosphate, pH 7, 100 mM NaCl, 0.02 % NaN3, 20 mM DTT, 3 mM imidazole, 10 % D2O), without or with the desired RNA ligand. The 5' splice site RNA (5'SS: 5'-AAAAAGGUAAG-3') was synthesized, HPLC purified and lyophilised (Sigma-Aldrich) before titration experiments. Characterization of interactions between p54nrb/NonO RRM1 and the 5'SS RNA ligand was performed by monitoring chemical shift perturbations (CSP) in a series of eight 2D [1H,15N]-HSQC spectra upon titration. 5'SS RNA binding characterization was achieved by adding ligand to 15N-labeled RRM1 samples up to an RNA/protein ratio of 3.5.
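The CSP monitoring described above can be sketched in a few lines. The example below computes the normalized CSP used in this study, the square root of (δ1H)^2 + (δ15N/6.5)^2, and applies the selection criterion of the following section (CSP greater than the mean plus one standard deviation). The residue labels and shift differences are hypothetical and are not data from this work, and the use of the sample standard deviation is an assumption since the text does not specify it.

```python
import math

N_SCALE = 6.5  # 15N scaling factor from the CSP formula used in this study


def normalized_csp(delta_h, delta_n, n_scale=N_SCALE):
    """Combine 1H and 15N shift changes (ppm) into one normalized CSP (ppm)."""
    return math.sqrt(delta_h ** 2 + (delta_n / n_scale) ** 2)


def select_perturbed(csp_by_residue):
    """Return (cutoff, residues) for CSP values above mean + 1 SD.

    Assumption: the sample standard deviation (n - 1 denominator) is used.
    """
    values = list(csp_by_residue.values())
    mean = sum(values) / len(values)
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (len(values) - 1))
    cutoff = mean + sd
    selected = sorted(r for r, v in csp_by_residue.items() if v > cutoff)
    return cutoff, selected


# Hypothetical (delta 1H, delta 15N) shift changes in ppm, free versus bound.
shifts = {
    "A": (0.01, 0.05),
    "B": (0.02, 0.10),
    "C": (0.12, 0.65),
    "D": (0.00, 0.13),
    "E": (0.03, 0.00),
}
csp = {res: normalized_csp(dh, dn) for res, (dh, dn) in shifts.items()}
cutoff, selected = select_perturbed(csp)
print(f"cutoff = {cutoff:.4f} ppm, selected = {selected}")
```

In this toy data set only residue "C" exceeds the cutoff; with real spectra, overlapped resonances would also have to be excluded before the selection.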

Dissociation constant evaluation (Kd)

For each titration, normalized CSP of p54nrb/NonO RRM1 amide resonances were calculated using the equation $\Delta\delta(^{15}\mathrm{N}+{}^{1}\mathrm{H}) = \left[(\delta_{^{1}\mathrm{H}})^{2} + (\delta_{^{15}\mathrm{N}}/6.5)^{2}\right]^{1/2}$. Resonances with a CSP greater than the average of the RRM1 amide resonance CSP plus one standard deviation, and that did not present any overlap after addition of ligand, were used to determine Kd values. CSP of the selected resonances were fitted to a two-state model equation:

$$\Delta\delta_{obs} = \Delta\delta_{max}\,\frac{\left(K_d + (1+r)[P]_0\right) - \sqrt{\left(K_d + (1+r)[P]_0\right)^{2} - 4[P]_0^{2}\,r}}{2[P]_0} \qquad \text{(Equation 1)}$$

∆δobs, ∆δmax and [P]0 denote the observed chemical shift perturbation, the maximum chemical shift perturbation (at saturation) and the total protein concentration, respectively. In addition, r and Kd denote the RNA/RRM1 molar ratio and the dissociation constant of the complex. Calculated Kd values were submitted to a Q test in order to reject outliers with 95 % confidence 44. Twelve residues were selected for the binding of 5'SS RNA to p54nrb/NonO RRM1. A Kd was obtained for each residue by best fit, and we also performed 10000 Monte-Carlo re-sampling simulations for each selected residue based on the two-state model of Equation 1. A standard error of 0.002 ppm was set for the ∆δobs axis. The final Kd value was derived from the mean of the Monte-Carlo re-sampling distribution of the selected residues. Errors on the Kd were described by a 90% confidence interval.
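The fitting and re-sampling procedure can be illustrated with the following minimal sketch of Equation 1, written for a single residue. It is not the software used in this work. The protein concentration (0.0833 mM) and the noise level (0.002 ppm) are taken from the text, whereas the eight RNA/protein ratios, the "true" Kd of 70 µM and ∆δmax of 0.10 ppm used to generate synthetic data, the search range for Kd and the reduced number of Monte-Carlo draws (200 instead of 10000) are assumptions made for illustration. For a fixed Kd the model is linear in ∆δmax, so the sketch solves ∆δmax in closed form and searches Kd on a logarithmic grid followed by a ternary-search refinement.

```python
import math
import random


def bound_fraction(kd, r, p0):
    """Equation 1 without the delta-max prefactor: fraction of protein bound."""
    s = kd + (1.0 + r) * p0
    return (s - math.sqrt(s * s - 4.0 * p0 * p0 * r)) / (2.0 * p0)


def fit_kd(ratios, csps, p0, log_lo=-7.0, log_hi=-2.0, n_grid=200):
    """Least-squares (Kd, delta-max); Kd in mol/L, searched on a log10 scale."""
    def sse_and_dmax(log_kd):
        frac = [bound_fraction(10.0 ** log_kd, r, p0) for r in ratios]
        # For a fixed Kd, the best delta-max has a closed-form solution.
        dmax = sum(f * y for f, y in zip(frac, csps)) / sum(f * f for f in frac)
        sse = sum((dmax * f - y) ** 2 for f, y in zip(frac, csps))
        return sse, dmax

    step = (log_hi - log_lo) / n_grid
    grid = [log_lo + i * step for i in range(n_grid + 1)]
    best = min(grid, key=lambda x: sse_and_dmax(x)[0])
    a, b = best - step, best + step
    for _ in range(60):  # ternary search around the best grid point
        m1, m2 = a + (b - a) / 3.0, b - (b - a) / 3.0
        if sse_and_dmax(m1)[0] < sse_and_dmax(m2)[0]:
            b = m2
        else:
            a = m1
    log_kd = 0.5 * (a + b)
    return 10.0 ** log_kd, sse_and_dmax(log_kd)[1]


def monte_carlo_kd(ratios, p0, kd, dmax, sigma=0.002, n=200, seed=0):
    """Mean Kd and approximate 90% interval from refits of noisy synthetic curves."""
    rng = random.Random(seed)
    ideal = [dmax * bound_fraction(kd, r, p0) for r in ratios]
    kds = sorted(
        fit_kd(ratios, [y + rng.gauss(0.0, sigma) for y in ideal], p0)[0]
        for _ in range(n)
    )
    return sum(kds) / n, kds[int(0.05 * n)], kds[int(0.95 * n) - 1]


P0 = 83.3e-6                                        # 0.0833 mM protein, in mol/L
RATIOS = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5]   # assumed titration points
TRUE_KD, TRUE_DMAX = 70e-6, 0.10                    # assumed values for synthetic data

clean = [TRUE_DMAX * bound_fraction(TRUE_KD, r, P0) for r in RATIOS]
kd_fit, dmax_fit = fit_kd(RATIOS, clean, P0)
kd_mean, ci_low, ci_high = monte_carlo_kd(RATIOS, P0, kd_fit, dmax_fit)
print(f"Kd = {kd_fit * 1e6:.1f} uM, Monte-Carlo mean = {kd_mean * 1e6:.1f} uM")
```

The closed-form treatment of ∆δmax reduces the problem to a one-dimensional search, which keeps the sketch free of external libraries; a general non-linear least-squares routine would be an equally valid choice.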

Intrinsic fluorescence quenching experiments

The lyophilized 5' splice site RNA (5'SS: 5'-AAAAAGGUAAG-3'), purchased from IDT (Integrated DNA Technologies), was dissolved in experimental buffer (10 mM phosphate, 100 mM NaCl, pH 7 and 20 mM DTT). For the intrinsic fluorescence quenching experiments, the purified RRM1 domain was diluted to final concentrations of 20 µM and 40 µM, and the fluorescence emission of the protein was monitored upon addition of 5'SS RNA (RNA/protein ratios used: 0, 0.25, 0.5, 0.75, 1, 1.5, 2, 2.5, 3 and 3.5). All the fluorescence experiments were carried out on a Varian Cary Eclipse fluorescence spectrophotometer at 20 °C. The fluorophore (RRM1 protein) was excited at 280 nm and the emission was recorded from 290 to 350 nm, with a maximum emission at 310 nm. The maximum emission signal decreased upon addition of the RNA. The fluorescence change was fitted using Equation 1 to determine the binding affinity, and a Monte-Carlo analysis was performed.


Results and Discussion

Initial RNA binding characterization of p54nrb/NonO

It is known that p54nrb/NonO can bind to a wide spectrum of RNA sequences 1, 21, 27. To explore p54nrb/NonO-RNA interactions, we performed a molecular dissection of the N-terminal moiety of the protein containing both RRMs. In vitro RNA binding experiments were carried out using the 5' splice site RNA sequence (5'SS: 5'-AAAAAGGUAAG-3') 5. Constructs possessing either both RRMs or only the RRM1 were able to bind the RNA (Figure 1C). In contrast, a construct containing only the RRM2 domain failed to bind the 5'SS RNA. Similar results were observed by NMR, using amide proton chemical shift perturbation (CSP) analysis of 15N-labeled N-RRM1, RRM1 and RRM2 constructs. Indeed, the addition of the 5'SS RNA induced significant perturbation of the N-RRM1 and RRM1 2D [1H,15N]-HSQC spectra while no significant changes were observed for the RRM2 2D [1H,15N]-HSQC (Figures S1, S2, and S5). Unfortunately, we were unable to perform CSP analysis with the N-RRM1/2 construct due to insolubility and aggregation problems. However, the addition of unlabeled N-RRM1 to 15N-labeled RRM2 during the CSP analysis, in the presence or absence of RNA, did not affect amide resonances of the RRM2 construct, suggesting no interaction or cooperativity between these RRMs for RNA binding (Figure S3). Since the results presented above demonstrated a direct interaction of p54nrb/NonO RRM1 with the 5'SS RNA, and given the observed inherent flexibility of the RRM1 motif in the homo- and heterodimer structures 24-26, we determined the solution structure of this motif and explored its dynamics and its affinity for the 5'SS RNA by NMR.

Structure of p54nrb/NonO RRM1

For the NMR characterization of p54nrb/NonO RRM1 structure, dynamics and binding to RNA, the N-terminal tail of the protein RNA binding domain (residues 1-67) was discarded, as no peaks from this segment presented chemical shift perturbations following the addition of 5'SS RNA. Removal of those residues, which show low dispersion and extensive overlap in the [1H,15N]-HSQC, was therefore desirable for the structural characterization of RRM1. Moreover, it is worth noting that, in agreement with the low dispersion observed for those residues in the [1H,15N]-HSQC spectrum, secondary structure and disorder predictions indicate that the tail is mostly disordered with a small stretch of low complexity sequence (Figure S4).

To determine the solution structure of the p54nrb/NonO RRM1, we recorded a complete set of standard 2D and 3D NMR spectra 29, 30, as described in Materials and methods. The 2D [1H,15N]-HSQC spectrum of the RRM1 showed well-dispersed and sharp H-N resonances with uniform intensity, which are characteristic of a folded protein (Figure 2A). The resonance assignment of the polypeptide backbone (1HN, 13C (Cα, Cβ, C') and 15N) was completed to 83% for 15N, 90% for 1HN, 95% for Cα and Cβ and, finally, 87% for C'. Fourteen residues were not observed in the 2D [1H,15N]-HSQC spectrum (Gln74, Arg75, Leu83, His108, Lys109, Asp110, Lys111, Gly112, Arg121, Cys147, His148, Ser149, Ala150 and Ser151) (BMRB ID: 11458). A total of 820 inter-residue and 305 intra-residue distance restraints, 23 hydrogen bond restraints, and 104 dihedral angle restraints were used in the final structure calculations with the program CYANA 2.0.17 36-38. The final structures were energy-refined with the program AMBER9 using the same set of restraints except for the hydrogen bond restraints, which were excluded. The final 20 energy-minimized conformers that represent the solution structure of the p54nrb/NonO RRM1 are well defined and show excellent agreement with the experimental data (Table 1). Each member of the final ensemble of 20 calculated structures agreed well with the experimental data, with no upper limit violation greater than 0.05 Å and no dihedral angle violation greater than 2°. The final ensemble of structures was deposited in the PDB under the identification code 2RS8.

The quality of the structure ensemble was also shown to be good. The Ramachandran statistics for all the residues in the ensemble were in acceptable space (89% in the most favored regions, 11% within additionally allowed regions, and 0.2% within generously allowed regions), according to the program PROCHECK_NMR 42. The final set of 20 calculated structures in the ensemble converged well, as shown by the statistics in Table 1 and by the conformer superposition in Figure 2B. The RMSDs of the structured core region (Ser76-Ala146) in the ensemble were 0.31 Å for the backbone and 1.19 Å for all heavy atoms.

The structured core region of the p54nrb/NonO RRM1 adopts the canonical RNP-type RRM fold, composed of a four-stranded anti-parallel β-sheet and two α-helices with the β1α1β2β3α2β4 topology (Figure 2C). In addition, α2 and β4 are connected by a loop containing a β-turn (Leu136-Lys139, hereafter called β4’), which can be classified as Type-I’ 45. Although this overall architecture is consistent with the typical RRM fold, its length of 70 residues is rather short compared with most RRMs, which are 80-90 residues long 23.

As expected, the three conserved aromatic residues contained in the RNP1 and RNP2 consensus sequences of p54nrb/NonO (Phe79, Phe113 and Phe115) have solvent-exposed side chains 46. However, some important and unusual differences are found in the conformational arrangement of these conserved phenylalanines. The χ1-angle of Phe113 in the RNP1 turned out to be about -60° (average of -66° ± 3° over the 20 conformers). This specific orientation is supported by the presence of nine long-range NOEs between Phe113 and residues Gly81, Asn82 and Leu83 from the β1 strand and ten short-range NOEs with residues Gly112 and Gly114 (see Table S1). Among those NOEs, three long-range and six short-range NOEs directly involve the side chain of Phe113. This is distinct from the typical χ1-angles of the corresponding phenylalanines in canonical RRMs, which are approximately +60°. The positive χ1-angle brings the aromatic side chain toward either the β2 strand or the β2-β3 loop. In the case of the p54nrb/NonO RRM1, these directions are occupied by the bulky side chains of His108, Lys111 and Phe115 and are therefore sterically hindered (Figure 2C). If canonical RRM nucleic acid interactions are to be achieved, a conformational rearrangement should be expected. Otherwise, Phe113 might interact with nucleic acid in some novel manner. Interestingly, the β-sheet of p54nrb/NonO also contains two additional solvent-exposed aromatic residues (Phe106 and His108) that could also be involved in RNA interactions (Figure 2C).
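For readers who wish to inspect side-chain rotamers in deposited coordinates, the χ1 angle discussed above is the dihedral defined by the N, Cα, Cβ and Cγ atoms of a residue. The following Python sketch is illustrative only and is not the procedure used in this study. The function names `dihedral` and `nearest_staggered` and the example coordinates are hypothetical; in practice the four atom positions would be read from each conformer of the ensemble.

```python
import math


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees for four points (x, y, z).

    Sign convention: looking along the central bond p1 -> p2, a clockwise
    rotation from the near bond to the far bond is positive.
    """
    def sub(a, b):
        return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

    def dot(a, b):
        return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

    def cross(a, b):
        return (a[1] * b[2] - a[2] * b[1],
                a[2] * b[0] - a[0] * b[2],
                a[0] * b[1] - a[1] * b[0])

    u1, u2, u3 = sub(p1, p0), sub(p2, p1), sub(p3, p2)
    n1, n2 = cross(u1, u2), cross(u2, u3)   # normals of the two planes
    y = math.sqrt(dot(u2, u2)) * dot(u1, n2)
    x = dot(n1, n2)
    return math.degrees(math.atan2(y, x))


def nearest_staggered(angle_deg):
    """Return the closest staggered rotamer centre (-60, 60 or 180 degrees)."""
    def angular_distance(center):
        return abs((angle_deg - center + 180.0) % 360.0 - 180.0)
    return min((-60, 60, 180), key=angular_distance)


# Hypothetical coordinates standing in for N, CA, CB and CG of one residue.
n, ca, cb, cg = (1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 1.0, 1.0)
chi1 = dihedral(n, ca, cb, cg)
print(f"chi1 = {chi1:.1f} deg, nearest staggered rotamer = {nearest_staggered(chi1)}")
```

With this binning, the ensemble average of -66° reported here falls in the -60° well, whereas values near +60° correspond to the orientation described for canonical RRMs.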

Another distinctive feature of the putative RNA-binding surface of the p54nrb/NonO RRM1 is that it contains six positively charged amino acid residues: Arg77 (β1), His108 (β2), Lys111 (β2-β3 loop), Arg117 (β3), Arg142 (β4), and Arg144 (β4) (Figure 2C). The structures of RRMs in complex with their cognate nucleic acids indicate that most of these RRMs contain 2 to 4 basic amino acid residues on their RNA-binding surface. The positively charged side chains could form electrostatic interactions and hydrogen bonds providing specificity for nucleic acid binding.

NMR relaxation data

The backbone dynamics of p54nrb/NonO RRM1 were characterized through NMR measurements of the 15N relaxation parameters R1 and R2 and of the {1H}-15N steady-state NOE of the backbone amide resonances. The good dispersion of the NMR data in the 2D [1H,15N]-HSQC allowed the characterization of the dynamics of 52 amides (Figure 3A, 3B and 3C). Resonances that overlap, that have a low signal-to-noise ratio or that could not be well fitted for R1 and R2 determination were not considered for the study (see Materials and methods for details). The average values for R1, R2, NOE and the R2/R1 ratio are summarized in Table 2. The measured values are in the range expected for a folded protein of that size (~10 kDa). In agreement with the NMR structure, β strands and α helices are more ordered, with the expected higher NOE values (0.737 ± 0.014), than the loop regions, which are less ordered and have lower NOE values (0.689 ± 0.020). The R2/R1 ratio is generally used to estimate the global correlation time (τm) of the protein at the beginning of the refinement of initial data for Model-Free analysis. An initial overall tumbling time (τm) was estimated from the R2/R1 ratio using only residues in secondary structure elements (31 residues). After optimization of the dynamics parameters, we obtained a D∥/D⊥ value of 1.373 ± 0.044 and a τm value of 6.61 ± 0.04 ns, which is in accordance with what is expected for a protein of this size. The model selection allowed us to fit 15 residues to model 1 (S2), 22 residues to model 2 (S2, τe), 1 residue to model 3 (S2, Rex), 11 residues to model 4 (S2, τe, Rex) and 3 residues to model 5 (S2, τe, S2f). The optimized dynamics parameters, i.e. S2, τe and Rex, are plotted in panels E, F and G of Figure 3. The average S2 is 0.880 ± 0.007, which is typical of a well-structured protein. We noticed that 12 residues need an Rex contribution, which describes slow motions on the micro- to millisecond timescale. These results confirm that the RRM1 domain of the p54nrb/NonO protein is well ordered and presents typical dynamics on the pico- to nanosecond timescale. In addition, they show that some residues undergo slower motions on the micro- to millisecond timescale that could be involved in the regulation of p54nrb/NonO binding to RNA.
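To make the meaning of the fitted parameters more concrete, the sketch below evaluates the original Lipari-Szabo spectral density, J(ω) = (2/5)[S2 τm/(1 + ω²τm²) + (1 - S2) τ/(1 + ω²τ²)] with 1/τ = 1/τm + 1/τe, which underlies models 1 and 2. It is a simplified illustration and not the analysis pipeline used here: it assumes isotropic tumbling (the fitted diffusion tensor is slightly anisotropic), the 15N Larmor frequency of 60.8 MHz and the τe of 50 ps are hypothetical values chosen for the example, and the function name `j_model_free` is ours.

```python
import math


def j_model_free(omega, tau_m, s2, tau_e=0.0):
    """Lipari-Szabo spectral density J(omega), in seconds.

    omega : angular frequency in rad/s
    tau_m : global correlation time in s (isotropic tumbling assumed)
    s2    : generalized order parameter S2 (0 to 1)
    tau_e : effective internal correlation time in s (0 gives model 1)
    """
    j = s2 * tau_m / (1.0 + (omega * tau_m) ** 2)
    if tau_e > 0.0:
        tau = tau_m * tau_e / (tau_m + tau_e)  # 1/tau = 1/tau_m + 1/tau_e
        j += (1.0 - s2) * tau / (1.0 + (omega * tau) ** 2)
    return 0.4 * j


# tau_m and the average S2 are the values reported in the text;
# the Larmor frequency and tau_e below are assumptions for illustration.
TAU_M = 6.61e-9
OMEGA_N = 2.0 * math.pi * 60.8e6

for label, s2, tau_e in [("rigid limit", 1.0, 0.0),
                         ("model 1", 0.88, 0.0),
                         ("model 2", 0.88, 50e-12)]:
    print(f"{label:12s} J(0) = {j_model_free(0.0, TAU_M, s2, tau_e):.3e} s, "
          f"J(wN) = {j_model_free(OMEGA_N, TAU_M, s2, tau_e):.3e} s")
```

In models 3 and 4, the Rex term is not part of J(ω); it is added to the R2 predicted from the spectral density to account for exchange broadening.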

NMR chemical shift perturbation analysis of p54nrb/NonO RRM1 with the 5' splice site RNA

p54nrb/NonO is part of the splicing machinery and participates in this process by directly binding the 5'SS of the pre-mRNA 5. To determine the 5'SS binding interface of p54nrb/NonO RRM1, we recorded a set of 2D [1H,15N]-HSQC spectra on the 15N-labeled RRM1 construct upon titration of the ligand and analysed the amide CSPs. As the concentration of the RNA increased (RNA:protein ratio from 0:1 to 3.5:1), some of the 1H-15N resonances shifted continuously, indicating fast exchange between the bound and the free states of the motif on the NMR time scale (Figure 4A and Figure S5). All the resonances significantly perturbed by the addition of the ligand (above the mean RRM1 CSP plus one standard deviation) originated from residues located at the β-sheet surface of the domain (Figures 4B, 4E and 4F). Half of them (Asn82, Glu104, Phe113, Phe115, Arg142 and Arg144) have solvent-exposed side chains that could thus directly interact with the RNA ligand (Figure 4F). The two conserved Phe of the RNP1 consensus sequence (Phe113 and Phe115), the conserved Asn (Asn82) of the RNP2 and, to a lesser extent, Phe79 (CSP just under the threshold) seem to be involved in the interaction of p54nrb/NonO RRM1 with the 5'SS RNA. Interestingly, two Arg of the β4 strand and a Glu of the β2 strand are perturbed by the ligand, suggesting that they could play a role in the binding specificity by mediating electrostatic interactions or hydrogen bonding with the 5'SS RNA bases or backbone phosphates. Altogether, the residues perturbed by the addition of the 5'SS RNA compose a basic interface suitable for the binding of negatively charged RNA.
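The significance criterion used above (mean CSP plus one standard deviation) can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the exact procedure of this study: the 15N weighting factor of 0.2 in the combined shift is a commonly used choice that we assume here, the population standard deviation is an arbitrary choice, and the residue labels and CSP values are invented for the example and are not measured data.

```python
import math
from statistics import mean, pstdev


def combined_csp(delta_h, delta_n, n_weight=0.2):
    """Combined 1H/15N chemical shift perturbation in ppm.

    n_weight scales the 15N shift difference; 0.2 is an assumed value.
    """
    return math.sqrt(delta_h ** 2 + (n_weight * delta_n) ** 2)


def significant_residues(csp_by_residue):
    """Return (threshold, residues) with CSP above mean + one standard deviation."""
    values = list(csp_by_residue.values())
    threshold = mean(values) + pstdev(values)
    hits = sorted(res for res, csp in csp_by_residue.items() if csp > threshold)
    return threshold, hits


# Invented shift differences (delta 1H, delta 15N) in ppm, for illustration only.
example_shifts = {
    "res_a": (0.010, 0.05),
    "res_b": (0.005, 0.02),
    "res_c": (0.150, 0.90),
    "res_d": (0.012, 0.04),
    "res_e": (0.008, 0.06),
}
csps = {res: combined_csp(dh, dn) for res, (dh, dn) in example_shifts.items()}
threshold, hits = significant_residues(csps)
print(f"threshold = {threshold:.3f} ppm, significant residues: {hits}")
```

A residue such as Phe79, whose CSP lies just under the threshold, would not be returned by this filter, which is why it is discussed separately in the text.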

In addition, six other amide resonances (Leu78, Val80, Gly81, Ile107, Gly114 and Val143) are significantly perturbed by the addition of the ligand, although all six residues have their side chains on the opposite side of the interacting β-sheet surface. This suggests that these residues are implicated in the binding via their backbone amides and/or that the RRM1 of p54nrb/NonO undergoes a structural rearrangement upon RNA binding that could affect the symmetry or the dynamics of the β-sheet H-bond network. This rearrangement seems likely in view of the CSPs obtained for residues Val80 and Gly114, since the N-H and C=O groups of Val80 form hydrogen bonds with the C=O and N-H groups of Gly114 to stabilize the β-sheet. Altogether, the significant CSPs observed upon RNA addition for Val80, Gly81, Asn82, Phe113, Gly114 and Phe115, all located in the same region of the RRM1 3D structure, confirm their structural linkage and their global adaptation to the 5'SS RNA ligand. Surprisingly, the additional solvent-exposed aromatic residue Phe106, located in the β-sheet, seems to be less implicated in ligand binding or does not undergo a large structural rearrangement, since only a small CSP (under the threshold) is observed for that residue. The implication of the other additional solvent-exposed aromatic residue, His108, cannot be determined because no NMR signal is available for this residue. This suggests that these two solvent-exposed residues are less affected by the binding of the 5'SS RNA ligand and possibly less relevant to this interaction.

In order to determine the dissociation constant (Kd) of the complex, we used the twelve residues with the most significant CSP data (above the mean CSP plus one standard deviation, i.e. Leu78, Val80, Gly81, Asn82, Glu104, Ile107, Phe113, Gly114, Phe115, Arg142, Val143 and Arg144). The CSPs of each residue were best fitted to a two-state model. The Kd was estimated at 71 ± 9 µM (Figures 4C and 4F). 10000 Monte-Carlo simulations were performed for each residue to refine the Kd measurements, and the results were in accordance with the value obtained by best fit, which reinforces the quality of the measured CSP data (Figure 4G). The relatively low affinity of the motif for the 5'SS sequence is consistent with the fact that the binding is in fast exchange. In parallel, the intrinsic fluorescence of the RRM1 construct upon binding to the 5'SS RNA was used to confirm the affinity. As determined by our NMR binding study, the RRM1 protein/RNA interface involves aromatic residues. More precisely, five of the eight aromatic residues contained in the RRM1 construct (7 Phe and 1 Tyr) are solvent-accessible and located on the β-sheet surface. The RRM1 aromatic residues were excited at 280 nm and emission scans were recorded between 290 and 350 nm upon addition of the RNA ligand to an RRM1 sample. The maximum emission was observed at 310 nm. Thus, we used the quenching of the fluorescence (F0-F) at this wavelength upon addition of the RNA ligand to estimate the Kd (Figure 4D). The Kd was estimated at an average of 23 µM, and confirmed by 10000 Monte-Carlo simulations with an average value of 23.5 ± 6 µM. Although this value is somewhat lower than the one determined by the NMR CSP study, it confirms an affinity in the low µM range for the interaction of p54nrb/NonO RRM1 with the 5' splice site RNA. As the structure of the complex is unknown, the results of the CSP analysis provide a general overview of the residues involved in RNA binding, but the residues undergoing an indirect conformational change cannot be discriminated from the critical residues involved in the direct interaction. Our data suggest that p54nrb/NonO RRM1 binds to the 5'SS RNA sequence in a classical manner, with low affinity, using essentially some of the aromatic and basic residues on its β-sheet surface.
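The two-state fit and the Monte-Carlo check described above can be illustrated with a short, self-contained Python sketch. It is not the software used in this study. It assumes a 1:1 fast-exchange binding isotherm, a constant total protein concentration (dilution is ignored), a hypothetical protein concentration of 200 µM, and a simple log-spaced grid search in place of a nonlinear least-squares routine. The titration data are synthetic, generated from a Kd of 71 µM only to check that the fit recovers it.

```python
import math
import random


def fraction_bound(p_tot, l_tot, kd):
    """Fraction of protein bound for 1:1 binding (same concentration unit throughout)."""
    b = p_tot + l_tot + kd
    return (b - math.sqrt(b * b - 4.0 * p_tot * l_tot)) / (2.0 * p_tot)


def fit_kd(p_tot, l_tots, csps, kd_min=0.1, kd_max=1.0e4, steps=2000):
    """Fit CSP = dmax * fraction_bound by grid search over log-spaced Kd values.

    For each trial Kd, dmax is obtained in closed form by linear least squares.
    Returns (kd, dmax) with the smallest sum of squared residuals.
    """
    best = None
    for i in range(steps + 1):
        kd = kd_min * (kd_max / kd_min) ** (i / steps)
        f = [fraction_bound(p_tot, l, kd) for l in l_tots]
        ff = sum(x * x for x in f)
        if ff == 0.0:
            continue
        dmax = sum(x * d for x, d in zip(f, csps)) / ff
        sse = sum((d - dmax * x) ** 2 for x, d in zip(f, csps))
        if best is None or sse < best[0]:
            best = (sse, kd, dmax)
    return best[1], best[2]


def monte_carlo_kd(p_tot, l_tots, csps, sigma, n_sim=200, seed=0):
    """Refit noisy copies of the best-fit curve; return mean and SD of Kd."""
    rng = random.Random(seed)
    kd0, dmax0 = fit_kd(p_tot, l_tots, csps)
    model = [dmax0 * fraction_bound(p_tot, l, kd0) for l in l_tots]
    kds = []
    for _ in range(n_sim):
        noisy = [m + rng.gauss(0.0, sigma) for m in model]
        kds.append(fit_kd(p_tot, l_tots, noisy, steps=400)[0])
    mean_kd = sum(kds) / n_sim
    sd_kd = math.sqrt(sum((k - mean_kd) ** 2 for k in kds) / (n_sim - 1))
    return mean_kd, sd_kd


# Synthetic titration: hypothetical 200 uM protein, RNA:protein ratios 0 to 3.5.
P_TOT = 200.0
L_TOTS = [r * P_TOT for r in (0.0, 0.25, 0.5, 1.0, 1.5, 2.0, 2.5, 3.5)]
CSPS = [0.25 * fraction_bound(P_TOT, l, 71.0) for l in L_TOTS]

if __name__ == "__main__":
    kd, dmax = fit_kd(P_TOT, L_TOTS, CSPS)
    print(f"best fit: Kd = {kd:.1f} uM, dmax = {dmax:.3f} ppm")
    mean_kd, sd_kd = monte_carlo_kd(P_TOT, L_TOTS, CSPS, sigma=0.004)
    print(f"Monte-Carlo: Kd = {mean_kd:.1f} +/- {sd_kd:.1f} uM")
```

The same isotherm can in principle be applied to the fluorescence quenching signal (F0-F) by replacing the CSP values with the quenching amplitudes at 310 nm.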

Relation between structure, dynamics and function

The RRM is one of the most abundant protein domains found in eukaryotes 47 and has been studied extensively 47. The present study further deepens our knowledge of the function, the structure and the dynamics of this motif. We have shown that, of the two RRM domains of p54nrb/NonO, only RRM1 binds RNA, using conserved residues from the RNP1 and RNP2 consensus sequences and others from distinct β strands and loops. Hence, we have determined the structure of the RRM1 domain by NMR. The structure of the N-terminal part of p54nrb/NonO (residues 68 to 306) complexed with the N-terminal part of PSPC1 was recently obtained by X-ray crystallography (3SDE) 25.

Comparing the structure of the RRM1 domain from the X-ray study with that of the current NMR study, the folds of the two structures are very similar, with an RMSD of 0.99 Å (284 backbone atoms superposed with PyMOL). Apart from differences in the loops, the most important differences between the two structures involve the side chain of Phe113 of the RNP1 and the side chains of two residues of the β2 strand (Phe106 and His108). The χ1-angle of Phe113 in the NMR structure ensemble (20 structures) is -66° ± 3°, compared to +73° in the X-ray structure, which seems to be a more typical angle for RRM domains (Figure 5A). This surprising difference between the NMR and the X-ray structures is not unique among the RRM studies published to date. A Phe residue at the same position in the RNP1 of the NAB3 RRM domain also shows this variation between the NMR and X-ray structures in the absence of RNA (Figure 5B) 48, 49. Interestingly, structures of NAB3-RRM complexed with RNAs obtained by NMR and X-ray crystallography show that, for both methods, the Phe undergoes a structural rearrangement upon RNA binding (Figure 5C). Moreover, the Phe side chains obtained by the two methods are almost superimposed in the presence of RNA (Figure 5C). This suggests that the β-sheet surface and the RRM domain present some plasticity linked to their RNA-binding function. In agreement with those observations, our data show that the backbone amide of Phe113 is dynamically active, as underlined by the presence of fast (τe) and slow (Rex) motions for that residue (Figures 3F and 3G). Since the perturbation analysis showed that this residue is involved in p54nrb/NonO RRM1 RNA binding, it is tempting to suggest that the dynamics of the backbone amide of Phe113 could be diagnostic of a role of this residue in the regulation of the protein's RNA binding kinetics. However, this assertion is hypothetical, as neither the structure of the alternate free state (involved in the conformational exchange) nor the structure of the RRM1 bound to RNA is known. Phe106 and His108 are also oriented differently in the NMR and the X-ray structures. Interestingly, the backbone amide of Phe106 also presents τe and Rex contributions; unfortunately, no dynamics data are available for His108.

Previous studies on RRMs have revealed that residues from the β-sheet surface and residues from various loops are involved in the binding of RNA 22. The present NMR binding study agrees with these findings, as seven residues from the RNP consensus sequences (β1 and β3 strands) and five from the β2 and β4 strands and loops were shown to be implicated in 5'SS RNA binding. The RRM1 of p54nrb/NonO is one of the shortest RRM domains known to date, owing to the presence of short loops between the β-strands, especially the β2-β3 loop. The absence of NMR signal in the 2D [1H,15N]-HSQC spectrum and the high B-factor values derived from the X-ray structure (aa 108-112, mean B-factor: 58.6) for that loop suggest that this region is very flexible. In the Model-Free analysis presented in Figure 3, residues 100 to 120 (α1-β2 loop to β3-α2 loop), which comprise this short loop, present evident dynamics, with some τe contributions and seven of the twelve Rex contributions obtained from the Model-Free analysis. The S2 data also show that two regions of the protein are less restricted on the pico- to nanosecond timescale. The α1-β2 loop and the β2 strand have a mean S2 of 0.838 and, on the other side of the β-sheet, the α2-β4’ loop and the β4’ and β4 strands have a mean S2 of 0.857. The other segments of the motif have higher S2 values, with a mean of 0.906 (including data from loops and from structured sections) (Figure 3E). Five residues (Glu104, Ile107, Arg142, Val143 and Arg144) from those two more flexible regions present large CSPs (Figure 4B), but Ile107 is the only one with an Rex contribution. As the side chain of this residue is not solvent accessible, its amide resonance perturbation following RNA addition might result from a conformational change or adaptation of the RRM1 to embrace the ligand. On the other side of the β-sheet, the three residues of the β4 strand with large CSPs (Arg142, Val143 and Arg144) have the highest S2 values of this region, and they are part of the β-sheet, with evident backbone hydrogen bonds that restrain their position. These results suggest that the charges brought by the two Arg residues could be important for the binding to the target RNA. It is known from the literature that, even if no consensus sequences are defined on the β4 strand, this region is essential to confer affinity and specificity for RRM/RNA binding 23.

We show here that β2, β4 and their respective loops are dynamically active and hypothesize that this flexibility could facilitate the binding of the RNA to the RNP sequences. In the middle of the β-sheet, the β1 and β3 strands are less dynamic, with S2 > 0.91 (Figure 3E). However, all the residues have fast internal motions described by a τe contribution, and two residues (Asn82 and Phe113) have Rex contributions, meaning that they undergo conformational exchange. We also noticed that the β3-α2 loop residues involved in the binding have a conformational exchange contribution (Rex) that could help to fit the RNA ligand.

To conclude, we showed that p54nrb/NonO RRM1 is the RRM motif responsible for the binding of the protein to RNA. We showed that the RRM1 adopts a classical RRM fold in solution that allows its binding to the 5'SS RNA, mainly via some solvent-exposed basic and aromatic residues located on the β-sheet surface of the motif. Even though RRM domains are structurally well characterized and their consensus sequences are well known, their dynamics remain poorly studied. We report here that many of the residues of p54nrb/NonO RRM1 potentially involved in direct RNA binding present interesting amide dynamics on the pico- to nanosecond and the micro- to millisecond timescales. Comparing these observations with other published studies highlights some similarities. The β2-β3 loop, regardless of its size, seems to be consistently highly dynamic in the free form 50-52 but not necessarily in the bound form 53. For example, the dynamics study of the U1A RBD1 revealed that upon binding, the β2-β3 loop is stabilized, resulting in the formation of a one-turn helix in the complex 53, 54. Some studies point out that the dynamics of the β4’, β4 region (including loops) seem to be generally disturbed by the binding of the RNA 50. In the case of the hnRNP RBD2, the dynamics of the β4’ region (with Rex) are quenched when bound to the RNA, but those of the β2-β3 loop are not. The flexibility of these two regions seems to be involved in the recognition process, allowing a proper fit to the RNA upon binding, but the hypothesis of a protein-protein interaction is not excluded. Based on what has been published to date, the dynamics of RRM domains present some similarities in their free form, but stabilization seems to occur in different regions in their bound form. Regarding the p54nrb/NonO RRM1, in addition to the backbone dynamics detected for these two regions, we showed that the α1-β2 loop and the β2 strand are less restricted than in other RRMs and are not necessarily implicated in the direct binding of the RNA. On the other hand, the CstF-64 RBD displays an inverse dynamic pattern, with a gain in mobility when bound to its target 55, adding more complexity to the understanding of the role of RRM dynamics in the recognition of specific RNA sequences. Finally, the backbone dynamics of the p54nrb/NonO RRM1 could be an important factor in allowing optimal binding of this motif to its target RNA. These results not only contribute to a better characterization of the multifunctional protein p54nrb/NonO, but also deepen our knowledge of the ubiquitous RRM motifs.

Associated Content

Supporting information

The Supporting Information is available free of charge on the ACS Publications website at DOI: xxx

[1H,15N]-HSQC spectra probing the binding of 5'SS RNA (N-RRM1, RRM1, RRM2), secondary structure and disorder prediction, and PLAAC prediction.

Accession Codes

The structure of the p54nrb/NonO RRM1 is available as PDB entry 2RS8. The resonance assignment for p54nrb/NonO RRM1 has been deposited in the Biological Magnetic Resonance Bank as accession number 11458.

Author Information

Corresponding Author

* Telephone: +1-418-656-2131 #2872. Fax: +1-418-656-7176. E-mail: [email protected].

Present Address

† M.V.: IBIS, 1030 avenue de la Médecine, Université Laval, Québec, Qc, Canada, G1V 0A6

Funding

This work was supported by the Natural Sciences and Engineering Research Council (NSERC) of Canada (grant No. 122625-2011 to M.V. and studentships to M.B.) and by the Regroupement stratégique sur la fonction, la structure et l’ingénierie des protéines (PROTEO). This work was also supported by the RIKEN Structural Genomics/Proteomics Initiative (RSGI) and the National Project on Protein Structural and Functional Analyses of the Ministry of Education, Culture, Sports, Science and Technology of Japan (MEXT). T.N. acknowledges financial support by JSPS KAKENHI 15H01634 and 26440026.

Notes

The authors declare no competing financial interest.

Acknowledgments

We thank Pierre Audet for the maintenance of the NMR infrastructure at Laval University, Celine Bruelle for help with molecular cloning and Sébastien Morin for help with NMR spectroscopy. Mikael Bédard would like to thank the Fonds Québécois de la Recherche sur la Nature et les Technologies (FQRNT) for the award of a graduate studentship. Jean-Baptiste Duvignaud would like to thank PROTEO for the award of a postdoctoral fellowship.

Abbreviations

5'SS RNA: 5' splice site RNA sequence
BB: blocking buffer
BSA: bovine serum albumin
CTD: C-terminal domain
CSP: chemical shift perturbation
DBHS: Drosophila behaviour human splicing
DTT: dithiothreitol
EDTA: ethylenediaminetetraacetic acid
FPLC: fast protein liquid chromatography
NOE: nuclear Overhauser effect
hnRNP: heterogeneous nuclear ribonucleoprotein
HSQC: heteronuclear single quantum coherence
HPLC: high pressure liquid chromatography
IB: incubating buffer
IPTG: isopropyl β-D-1-thiogalactopyranoside
LB: lysogeny broth or Luria-Bertani medium
NMR: nuclear magnetic resonance
NOPS: NonO/paraspeckle domain
PDB: Protein Data Bank
PLD: prion-like domain
PSPC1: paraspeckle component 1 protein
R1: spin-lattice relaxation rate constant
R2: spin-spin relaxation rate constant
RBD: RNA-binding domain
Rex: conformational exchange term
RNP-1 or 2: ribonucleoprotein consensus sequence 1 or 2
RMSD: root mean square deviation
RRM: RNA recognition motif
S2: order parameter
SDS: sodium dodecyl sulfate
SFPQ: splicing factor, proline- and glutamine-rich
τe: local correlation time
τm: global correlation time

References

[1] Shav-Tal, Y., and Zipori, D. (2002) PSF and p54(nrb)/NonO--multi-functional nuclear proteins, FEBS Lett 531, 109-114.
[2] Basu, A., Dong, B., Krainer, A. R., and Howe, C. C. (1997) The intracisternal A-particle proximal enhancer-binding protein activates transcription and is identical to the RNA- and DNA-binding protein p54nrb/NonO, Mol Cell Biol 17, 677-686.
[3] Danckwardt, S., Kaufmann, I., Gentzel, M., Foerstner, K. U., Gantzert, A. S., Gehring, N. H., Neu-Yilik, G., Bork, P., Keller, W., Wilm, M., Hentze, M. W., and Kulozik, A. E. (2007) Splicing factors stimulate polyadenylation via USEs at non-canonical 3' end formation signals, EMBO J 26, 2658-2669.
[4] Hallier, M., Lerga, A., Barnache, S., Tavitian, A., and Moreau-Gachelin, F. (1998) The transcription factor Spi-1/PU.1 interacts with the potential splicing factor TLS, J Biol Chem 273, 4838-4842.
[5] Kameoka, S., Duque, P., and Konarska, M. M. (2004) p54(nrb) associates with the 5' splice site within large transcription/splicing complexes, EMBO J 23, 1782-1791.
[6] Emili, A., Shales, M., McCracken, S., Xie, W., Tucker, P. W., Kobayashi, R., Blencowe, B. J., and Ingles, C. J. (2002) Splicing and transcription-associated proteins PSF and p54nrb/nonO bind to the RNA polymerase II CTD, RNA 8, 1102-1111.
[7] Peng, R., Dye, B. T., Perez, I., Barnard, D. C., Thompson, A. B., and Patton, J. G. (2002) PSF and p54nrb bind a conserved stem in U5 snRNA, RNA 8, 1334-1347.
[8] Clemson, C. M., Hutchinson, J. N., Sara, S. A., Ensminger, A. W., Fox, A. H., Chess, A., and Lawrence, J. B. (2009) An architectural role for a nuclear noncoding RNA: NEAT1 RNA is essential for the structure of paraspeckles, Mol Cell 33, 717-726.
[9] Hutchinson, J. N., Ensminger, A. W., Clemson, C. M., Lynch, C. R., Lawrence, J. B., and Chess, A. (2007) A screen for nuclear transcripts identifies two linked noncoding RNAs associated with SC35 splicing domains, BMC Genomics 8, 39.
[10] Sasaki, Y. T., Ideue, T., Sano, M., Mituyama, T., and Hirose, T. (2009) MENepsilon/beta noncoding RNAs are essential for structural integrity of nuclear paraspeckles, Proc Natl Acad Sci U S A 106, 2525-2530.
[11] Sunwoo, H., Dinger, M. E., Wilusz, J. E., Amaral, P. P., Mattick, J. S., and Spector, D. L. (2009) MEN epsilon/beta nuclear-retained non-coding RNAs are upregulated upon muscle differentiation and are essential components of paraspeckles, Genome Res 19, 347-359.
[12] Chen, L. L., DeCerbo, J. N., and Carmichael, G. G. (2008) Alu element-mediated gene silencing, EMBO J 27, 1694-1705.
[13] Zhang, Z., and Carmichael, G. G. (2001) The fate of dsRNA in the nucleus: a p54(nrb)-containing complex mediates the nuclear retention of promiscuously A-to-I edited RNAs, Cell 106, 465-475.
[14] Bruelle, C., Bedard, M., Blier, S., Gauthier, M., Traish, A. M., and Vincent, M. (2011) The mitotic phosphorylation of p54(nrb) modulates its RNA binding activity, Biochem Cell Biol 89, 423-433.

[15] Murthy, U. M., and Rangarajan, P. N. (2010) Identification of protein interaction regions of VINC/NEAT1/Men epsilon RNA, FEBS Lett 584, 1531-1535.
[16] Bond, C. S., and Fox, A. H. (2009) Paraspeckles: nuclear bodies built on long noncoding RNA, J Cell Biol 186, 637-644.
[17] Fox, A. H., Lam, Y. W., Leung, A. K., Lyon, C. E., Andersen, J., Mann, M., and Lamond, A. I. (2002) Paraspeckles: a novel nuclear domain, Curr Biol 12, 13-25.
[18] Myojin, R., Kuwahara, S., Yasaki, T., Matsunaga, T., Sakurai, T., Kimura, M., Uesugi, S., and Kurihara, Y. (2004) Expression and functional significance of mouse paraspeckle protein 1 on spermatogenesis, Biol Reprod 71, 926-932.
[19] Hennig, S., Kong, G., Mannen, T., Sadowska, A., Kobelke, S., Blythe, A., Knott, G. J., Iyer, K. S., Ho, D., Newcombe, E. A., Hosoki, K., Goshima, N., Kawaguchi, T., Hatters, D., Trinkle-Mulcahy, L., Hirose, T., Bond, C. S., and Fox, A. H. (2015) Prion-like domains in RNA binding proteins are essential for building subnuclear paraspeckles, J Cell Biol 210, 529-539.
[20] Fox, A. H., Bond, C. S., and Lamond, A. I. (2005) P54nrb forms a heterodimer with PSP1 that localizes to paraspeckles in an RNA-dependent manner, Mol Biol Cell 16, 5304-5315.
[21] Yang, Y. S., Hanke, J. H., Carayannopoulos, L., Craft, C. M., Capra, J. D., and Tucker, P. W. (1993) NonO, a non-POU-domain-containing, octamer-binding protein, is the mammalian homolog of Drosophila nonAdiss, Mol Cell Biol 13, 5593-5603.
[22] Clery, A., Blatter, M., and Allain, F. H. (2008) RNA recognition motifs: boring? Not quite, Curr Opin Struct Biol 18, 290-298.
[23] Maris, C., Dominguez, C., and Allain, F. H. (2005) The RNA recognition motif, a plastic RNA-binding platform to regulate post-transcriptional gene expression, FEBS J 272, 2118-2131.
[24] Lee, M., Sadowska, A., Bekere, I., Ho, D., Gully, B. S., Lu, Y., Iyer, K. S., Trewhella, J., Fox, A. H., and Bond, C. S. (2015) The structure of human SFPQ reveals a coiled-coil mediated polymer essential for functional aggregation in gene regulation, Nucleic Acids Res 43, 3826-3840.
[25] Passon, D. M., Lee, M., Rackham, O., Stanley, W. A., Sadowska, A., Filipovska, A., Fox, A. H., and Bond, C. S. (2012) Structure of the heterodimer of human NONO and paraspeckle protein component 1 and analysis of its role in subnuclear body formation, Proc Natl Acad Sci U S A 109, 4846-4850.
[26] Knott, G. J., Lee, M., Passon, D. M., Fox, A. H., and Bond, C. S. Caenorhabditis elegans NONO-1: Insights into DBHS protein structure, architecture, and function, Protein Sci 24, 2033-2043.
[27] Hallier, M., Tavitian, A., and Moreau-Gachelin, F. (1996) The transcription factor Spi-1/PU.1 binds RNA and interferes with the RNA-binding protein p54nrb, J Biol Chem 271, 11177-11181.
[28] Kigawa, T., Yabuki, T., Matsuda, N., Matsuda, T., Nakajima, R., Tanaka, A., and Yokoyama, S. (2004) Preparation of Escherichia coli cell extract for highly productive cell-free protein expression, J Struct Funct Genomics 5, 63-68.
[29] Cavanagh, J., Fairbrother, W. J., Palmer, A. G., III, and Skelton, N. J. (1996) Protein NMR spectroscopy, principles and practice, Academic Press.


[30] Clore, G. M., and Gronenborn, A. M. (1998) Determining the structures of large proteins and protein complexes by NMR, Trends Biotechnol 16, 22-34.
[31] Markley, J. L., Bax, A., Arata, Y., Hilbers, C. W., Kaptein, R., Sykes, B. D., Wright, P. E., and Wuthrich, K. (1998) Recommendations for the presentation of NMR structures of proteins and nucleic acids. IUPAC-IUBMB-IUPAB Inter-Union Task Group on the Standardization of Data Bases of Protein and Nucleic Acid Structures Determined by NMR Spectroscopy, J Biomol NMR 12, 1-23.
[32] Delaglio, F., Grzesiek, S., Vuister, G. W., Zhu, G., Pfeifer, J., and Bax, A. (1995) NMRPipe: a multidimensional spectral processing system based on UNIX pipes, J Biomol NMR 6, 277-293.
[33] Johnson, B. A., and Blevins, R. A. (1994) NMR View: a computer program for the visualization and analysis of NMR data, J Biomol NMR 4, 603-614.
[34] Kobayashi, N., Iwahara, J., Koshiba, S., Tomizawa, T., Tochio, N., Guntert, P., Kigawa, T., and Yokoyama, S. (2007) KUJIRA, a package of integrated modules for systematic and interactive analysis of NMR data directed to high-throughput NMR structure studies, J Biomol NMR 39, 31-52.
[35] Johnson, B. A. (2004) Using NMRView to visualize and analyze the NMR spectra of macromolecules, Methods Mol Biol 278, 313-352.
[36] Guntert, P. (2004) Automated NMR structure calculation with CYANA, Methods Mol Biol 278, 353-378.
[37] Guntert, P., Mumenthaler, C., and Wuthrich, K. (1997) Torsion angle dynamics for NMR structure calculation with the new program DYANA, J Mol Biol 273, 283-298.
[38] Herrmann, T., Guntert, P., and Wuthrich, K. (2002) Protein NMR structure determination with automated NOE assignment using the new software CANDID and the torsion angle dynamics algorithm DYANA, J Mol Biol 319, 209-227.
[39] Cornilescu, G., Delaglio, F., and Bax, A. (1999) Protein backbone angle restraints from searching a database for chemical shift and sequence homology, J Biomol NMR 13, 289-302.
[40] Nagata, T., Suzuki, S., Endo, R., Shirouzu, M., Terada, T., Inoue, M., Kigawa, T., Kobayashi, N., Guntert, P., Tanaka, A., Hayashizaki, Y., Muto, Y., and Yokoyama, S. (2008) The RRM domain of poly(A)-specific ribonuclease has a noncanonical binding site for mRNA cap analog recognition, Nucleic Acids Res 36, 4754-4767.
[41] Case, D. A., Cheatham, T. E., 3rd, Darden, T., Gohlke, H., Luo, R., Merz, K. M., Jr., Onufriev, A., Simmerling, C., Wang, B., and Woods, R. J. (2005) The Amber biomolecular simulation programs, J Comput Chem 26, 1668-1688.
[42] Laskowski, R. A., Rullmann, J. A., MacArthur, M. W., Kaptein, R., and Thornton, J. M. (1996) AQUA and PROCHECK-NMR: programs for checking the quality of protein structures solved by NMR, J Biomol NMR 8, 477-486.
[43] Mandel, A. M., Akke, M., and Palmer, A. G., 3rd. (1995) Backbone dynamics of Escherichia coli ribonuclease HI: correlations with structure and function in an active enzyme, J Mol Biol 246, 144-163.
[44] Dean, R. B., and Dixon, W. J. (1951) Simplified statistics for small numbers of observations, Anal Chem 23, 636-638.


[45] Hutchinson, E. G., and Thornton, J. M. (1994) A revised set of potentials for beta-turn formation in proteins, Protein Sci 3, 2207-2216.
[46] Muto, Y., and Yokoyama, S. (2012) Structural insight into RNA recognition motifs: versatile molecular Lego building blocks for biological systems, WIREs RNA, 229-243.
[47] Venter, J. C., Adams, M. D., Myers, E. W., Li, P. W., Mural, R. J., Sutton, G. G., Smith, H. O., Yandell, M., Evans, C. A., Holt, R. A., et al. (2001) The sequence of the human genome, Science 291, 1304-1351.
[48] Hobor, F., Pergoli, R., Kubicek, K., Hrossova, D., Bacikova, V., Zimmermann, M., Pasulka, J., Hofr, C., Vanacova, S., and Stefl, R. (2010) Recognition of transcription termination signal by the nuclear polyadenylated RNA-binding (NAB) 3 protein, J Biol Chem 286, 3645-3657.
[49] Lunde, B. M., Horner, M., and Meinhart, A. (2010) Structural insights into cis element recognition of non-polyadenylated RNAs by the Nab3-RRM, Nucleic Acids Res 39, 337-346.
[50] Katahira, M., Miyanoiri, Y., Enokizono, Y., Matsuda, G., Nagata, T., Ishikawa, F., and Uesugi, S. (2001) Structure of the C-terminal RNA-binding domain of hnRNP D0 (AUF1), its interactions with RNA and DNA, and change in backbone dynamics upon complex formation with DNA, J Mol Biol 311, 973-988.
[51] Maynard, C. M., and Hall, K. B. (2010) Interactions between PTB RRMs induce slow motions and increase RNA binding affinity, J Mol Biol 397, 260-277.
[52] Nagata, T., Kanno, R., Kurihara, Y., Uesugi, S., Imai, T., Sakakibara, S., Okano, H., and Katahira, M. (1999) Structure, backbone dynamics and interactions with RNA of the C-terminal RNA-binding domain of a mouse neural RNA-binding protein, Musashi1, J Mol Biol 287, 315-330.
[53] Mittermaier, A., Varani, L., Muhandiram, D. R., Kay, L. E., and Varani, G. (1999) Changes in side-chain and backbone dynamics identify determinants of specificity in RNA recognition by human U1A protein, J Mol Biol 294, 967-979.
[54] Allain, F. H., Howe, P. W., Neuhaus, D., and Varani, G. (1997) Structural basis of the RNA-binding specificity of human U1A protein, EMBO J 16, 5764-5772.
[55] Deka, P., Rajan, P. K., Perez-Canadillas, J. M., and Varani, G. (2005) Protein and RNA dynamics play key roles in determining the specific recognition of GU-rich polyadenylation regulatory elements by human Cstf-64 protein, J Mol Biol 347, 719-733.


Tables

Table 1. Structural statistics for the RBD1 of p54nrb/NonO

NMR restraints
  Distance restraints
    Total NOE                                        1125
    Intra-residue                                     305
    Inter-residue                                     820
      Sequential (|i-j| = 1)                          268
      Medium-range (1 < |i-j| < 5)                    174
      Long-range (|i-j| ≥ 5)                          378
    Hydrogen bond restraints (a)                       23
  Dihedral angle restraints
    φ and ψ                                            73
    χ angle                                            31

Structure statistics (20 structures)
  CYANA target function (Å²)                         0.02
  Residual NOE violations
    Number > 0.10 Å                                     0
    Maximum (Å)                                      0.05
  Residual dihedral angle violations
    Number > 5.0°                                       0
    Maximum (°)                                      1.57
  AMBER energies (kcal/mol)
    Mean AMBER energy                               -3261
    Mean restraints violation energy                 4.81
  Ramachandran plot statistics (%)
    Residues in most favored regions                 88.9
    Residues in additionally allowed regions         10.9
    Residues in generously allowed regions            0.2
    Residues in disallowed regions                    0.0
  Average R.M.S.D. to mean structure (Å) (b)
    Protein backbone                                 0.31
    Protein heavy atoms                              1.19

(a) Used only in the CYANA calculation.
(b) For residues Ser73-Ala143 of the p54nrb/NonO RRM1.
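The distance-restraint categories in Table 1 are defined by the sequence separation |i-j| between the two residues involved. As a minimal sketch of that bookkeeping (the function names and the example residue pairs are hypothetical, not taken from the authors' restraint files), the classification and an arithmetic check of the reported totals could look like this:

```python
from collections import Counter

def noe_range(i, j):
    """Classify an NOE restraint by the sequence separation of residues i and j."""
    sep = abs(i - j)
    if sep == 0:
        return "intra-residue"
    if sep == 1:
        return "sequential"
    if sep < 5:
        return "medium-range"
    return "long-range"

def tally(pairs):
    """Count restraints per category for a list of (i, j) residue pairs."""
    return Counter(noe_range(i, j) for i, j in pairs)

# Hypothetical residue pairs, for illustration only.
example_pairs = [(73, 73), (73, 74), (80, 83), (75, 120), (90, 89)]
example_counts = tally(example_pairs)

# Totals as reported in Table 1.
reported = {
    "total": 1125,
    "intra-residue": 305,
    "inter-residue": 820,
    "sequential": 268,
    "medium-range": 174,
    "long-range": 378,
}
inter_sum = reported["sequential"] + reported["medium-range"] + reported["long-range"]
total_sum = reported["intra-residue"] + reported["inter-residue"]
print(example_counts)
print(inter_sum == reported["inter-residue"], total_sum == reported["total"])
```

The reported sub-totals are internally consistent: the three inter-residue classes sum to 820, and intra- plus inter-residue restraints sum to 1125.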


Table 2: Average values of relaxation parameters measured for p54nrb/NonO RRM1

Parameter      Average value
R1 (s⁻¹)       1.808 ± 0.019
R2 (s⁻¹)       9.666 ± 0.189
NOE            0.723 ± 0.012
R2/R1          5.37 ± 0.12
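For a rough sense of scale, an average 15N R2/R1 ratio is often converted into an overall rotational correlation time with the approximation τc ≈ sqrt(6·R2/R1 − 7) / (4π·νN), valid for slow tumbling with no significant internal motion or exchange. The sketch below applies it to the Table 2 averages. The 15N frequency is an assumption (60.8 MHz, i.e. a 600 MHz spectrometer), since the field strength is not stated in this section, and this is not necessarily how the authors derived their correlation time.

```python
import math

def tau_c_from_ratio(r2_over_r1, nu_n_hz):
    """Approximate overall correlation time (s) from the 15N R2/R1 ratio."""
    return math.sqrt(6.0 * r2_over_r1 - 7.0) / (4.0 * math.pi * nu_n_hz)

# Average values from Table 2.
R1 = 1.808      # s^-1
R2 = 9.666      # s^-1
RATIO = 5.37    # reported average R2/R1

# ASSUMPTION: 15N resonance frequency for a 600 MHz (1H) spectrometer.
NU_N_HZ = 60.8e6

ratio_of_means = R2 / R1                    # ratio of the two averages
tau_c = tau_c_from_ratio(RATIO, NU_N_HZ)    # seconds

print(f"R2/R1 from averages: {ratio_of_means:.2f}")
print(f"tau_c estimate: {tau_c * 1e9:.1f} ns")
```

The ratio of the two averages differs slightly from the reported 5.37, presumably because the reported value is an average of per-residue ratios; it falls within the stated ± 0.12.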


Figure legends

Figure 1: Structural dissection of the p54nrb/NonO RNA-binding domain. (A) Schematic representation of the p54nrb/NonO RBD. (B) Amino acid sequence alignment of the RRM1s and RRM2s of human DBHS proteins. Identical residues between RRM1s, RRM2s and both RRMs are coloured blue, cyan and green, respectively. RNP1 and RNP2 residues are boxed and their conserved aromatic and charged residues are indicated by asterisks. (C) In vitro 5' splice site RNA binding assays. N-terminal constructs of p54nrb/NonO were incubated with agarose beads alone (lane 2) or with beads coated with the 5'SS RNA (lane 4).

Figure 2: Solution structure of p54nrb/NonO RRM1. (A) 2D [1H,15N]-HSQC of the p54nrb/NonO RRM1 (aa 68-153) with assignments. (B) Overlay of the 20 final conformers of p54nrb/NonO RRM1 (Thr71-Ser149) for the best fit of the backbone atoms of residues Gly68-Thr153. (C) Ribbon representation of p54nrb/NonO RRM1. Solvent-exposed aromatic and basic residues located at the β-sheet surface are coloured in green and in blue, respectively.

Figure 3: Sequential NMR dynamic data for p54nrb/NonO RRM1. (A) The spin-lattice relaxation rate constant (R1), (B) the spin-spin relaxation rate constant (R2), (C) the heteronuclear steady-state {1H}-15N nuclear Overhauser effect (NOE), (D) the R2/R1 ratio, (E) the order parameter (S²) for internal motions (S²f for the fast timescale (black) and S²s for the slow timescale (grey)), (F) the effective correlation time for internal motions (τe), (G) the chemical exchange term Rex.

Figure 4: NMR chemical shift perturbation analysis of p54nrb/NonO RRM1 upon addition of the 5' splice site RNA. (A) Close-up view of the superimposed 2D [1H,15N]-HSQC spectra of 15N-labeled RRM1 collected as the 5'SS RNA was gradually added. The colour code for the RNA/protein ratio is presented at the top of the set of spectra, and some of the residues exhibiting significant chemical shift perturbations are labeled. (B) Normalized chemical shift perturbations (CSPs; $\Delta\delta = [(\delta_{HN})^2 + (\delta_N/6.5)^2]^{1/2}$) of p54nrb/NonO RRM1 upon addition of a 3.5-fold excess of the 5'SS RNA, plotted as a function of the RRM1 residue number. CSPs greater than the average RRM1 CSP + 2 standard deviations (0.082 ppm) are coloured in red and those greater than the average + 1 standard deviation (0.061 ppm) are coloured in orange. Unassigned resonances, or resonances that could not be followed after the addition of the RNA, are indicated by an "X" on the histogram, and proline residues are indicated by a "P". (C) Normalized chemical shift perturbations plotted against the RNA/protein ratio for each of the selected amino acids used in the estimation and Monte-Carlo refinement of the Kd (see Materials and methods). (D) Intrinsic fluorescence quenching experiments at 20 µM (red) and 40 µM (blue) RRM1 upon addition of the 5'SS RNA, resulting in Kd estimates of 24 µM and 22 µM, respectively. Residues exhibiting significant perturbations are coloured on the ribbon diagram (E) and on the surface (F) of the p54nrb/NonO RRM1 motif. (G) Table of the individual Kd values obtained by best fit and by Monte-Carlo refinement for each studied residue, and the average Kd obtained for the binding of the 5'SS RNA to p54nrb/NonO RRM1.

Figure 5: Phenylalanine orientation discrepancies and ligand adaptation. (A) Superposition of the NMR (green) and X-ray crystallography (cyan) structures of p54nrb/NonO RRM1 (2RS8 and 3SDE, respectively); the red circle emphasises the two different orientations of F113. (B) Superposition of the NMR (grey) and X-ray crystallography (gold) structures of the NAB3 RRM (2KVI and 2XNQ, respectively). (C) Structures of the NAB3 RRM bound to a short RNA sequence (UCUU for NMR and UUCUU for X-ray; 2L41 and 2XNR, respectively). This figure was generated with the program PyMOL.
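The Figure 4 legend describes three calculations: the normalized CSP, the mean + 1 SD and mean + 2 SD colour thresholds, and a Kd estimate from the CSP titration curves. The following is a minimal sketch of each, not the authors' procedure. It assumes a population standard deviation, a 1:1 binding model with ligand depletion, and a simple grid search in place of the Monte-Carlo refinement; all function names, the protein concentration and the titration data are hypothetical.

```python
import math
import statistics

def csp(d_hn, d_n, n_scale=6.5):
    """Normalized CSP (ppm) from the 1H and 15N shift changes (ppm)."""
    return math.sqrt(d_hn ** 2 + (d_n / n_scale) ** 2)

def colour_codes(csps):
    """Colour each residue using the mean + 1 SD and mean + 2 SD cut-offs."""
    mean = statistics.mean(csps)
    sd = statistics.pstdev(csps)  # assumption: population SD (legend does not say)
    codes = []
    for value in csps:
        if value > mean + 2 * sd:
            codes.append("red")
        elif value > mean + sd:
            codes.append("orange")
        else:
            codes.append("none")
    return codes

def fraction_bound(p_tot, l_tot, kd):
    """Fraction of protein bound for 1:1 binding, allowing for ligand depletion."""
    b = p_tot + l_tot + kd
    return (b - math.sqrt(b * b - 4.0 * p_tot * l_tot)) / (2.0 * p_tot)

def fit_kd(p_tot, ratios, observed, kd_grid):
    """Grid search over Kd; the plateau CSP is solved by linear least squares."""
    best = None
    for kd in kd_grid:
        f = [fraction_bound(p_tot, r * p_tot, kd) for r in ratios]
        d_max = sum(y * x for y, x in zip(observed, f)) / sum(x * x for x in f)
        sse = sum((y - d_max * x) ** 2 for y, x in zip(observed, f))
        if best is None or sse < best[0]:
            best = (sse, kd, d_max)
    return best[1], best[2]

# Hypothetical titration: 100 uM protein, noise-free CSPs simulated with Kd = 20 uM.
ratios = [0.25, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5]   # RNA/protein ratios
observed = [0.12 * fraction_bound(100.0, r * 100.0, 20.0) for r in ratios]
kd_grid = [0.5 * k for k in range(2, 401)]           # 1.0 to 200.0 uM
kd_fit, d_max_fit = fit_kd(100.0, ratios, observed, kd_grid)
print(f"Kd = {kd_fit:.1f} uM, plateau CSP = {d_max_fit:.3f} ppm")
```

With real data, each residue's curve would be fitted separately and the resulting Kd values averaged, as in panel (G); resampling the observed CSPs with noise and refitting would give a Monte-Carlo style error estimate.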


Figure 1:


Figure 2:


Figure 3:


Figure 4:


Figure 5:


Graphic for the Table of Contents
