
Biochemistry 2004, 43, 9401-9412


Electrostatic Contribution of Serine Phosphorylation to the Drosophila SLBP-Histone mRNA Complex†

Roopa Thapar,‡ William F. Marzluff,*,‡,§ and Matthew R. Redinbo‡,§,⊥

Department of Biochemistry and Biophysics, Program in Molecular Biology and Biotechnology, and Department of Chemistry, University of North Carolina, Chapel Hill, North Carolina 27599

Received December 24, 2003; Revised Manuscript Received April 22, 2004

ABSTRACT: Unlike all other metazoan mRNAs, mRNAs encoding the replication-dependent histones are not polyadenylated but end in a unique 26 nucleotide stem-loop structure. The protein that binds the 3′ end of histone mRNA, the stem-loop binding protein (SLBP), is essential for histone pre-mRNA processing, mRNA translation, and mRNA degradation. Using biochemical, biophysical, and nuclear magnetic resonance (NMR) experiments, we report the first structural insight into the mechanism of SLBP-RNA recognition. In the absence of RNA, phosphorylated and unphosphorylated forms of the RNA binding and processing domain (RPD) of Drosophila SLBP (dSLBP) possess helical secondary structure but no well-defined tertiary fold. Drosophila SLBP is phosphorylated at four out of five potential serine or threonine sites in the sequence DTAKDSNSDSDSD at the extreme C-terminus, and phosphorylation at these sites is necessary for histone pre-mRNA processing. Here, we provide NMR evidence for serine phosphorylation of the C-terminus using 31P direct-detect experiments and show that both serine phosphorylation and RNA binding are necessary for proper folding of the RPD. The electrostatic effect of protein phosphorylation can be partially mimicked by a mutant form of SLBP wherein four C-terminal serines are replaced with glutamic acids. Hence, both RNA binding and protein phosphorylation are necessary for stabilization of the SLBP RPD.

† Supported by National Institutes of Health (NIH) Research Grant GM58961 to W.F.M., an NIH Supplement Grant GM29832 to W.F.M. and R.T., NIH Research Grant CA90604 to M.R.R., and a Burroughs Wellcome Career Award to M.R.R.
* To whom correspondence should be addressed. E-mail address: [email protected]. Phone: (919) 962-2141. Fax: (919) 962-1274.
‡ Department of Biochemistry and Biophysics.
§ Program in Molecular Biology and Biotechnology.
⊥ Department of Chemistry.
¹ Abbreviations: SLBP, stem-loop binding protein; dSLBP, Drosophila stem-loop binding protein; RPD, RNA binding and processing domain; RBD, RNA binding domain; HSQC, heteronuclear single quantum coherence; PABP, poly(A)-binding protein; 3′-UTR, 3′ untranslated region; NMR, nuclear magnetic resonance; NOE, nuclear Overhauser enhancement; MALDI, matrix-assisted laser desorption ionization; PFG, pulsed-field gradient; PG-SLED, pulsed gradient stimulated echo longitudinal encode-decode; EMSA, electrophoretic mobility shift assay; CD, circular dichroism; CREB, cyclic AMP-response element-binding protein; CBP, CREB-binding protein; KIX, CREB-binding domain of CBP; KID, kinase-inducible activation domain; c-Myb, c-myb protooncogene product.

10.1021/bi036315j CCC: $27.50 © 2004 American Chemical Society. Published on Web 07/02/2004. Biochemistry, Vol. 43, No. 29, 2004.

Most eukaryotic mRNAs are posttranscriptionally modified by the addition of a 5′ 7-methyl-guanine cap and a 3′ poly(A) tail. These modifications are necessary for the proper control of mRNA metabolism, which requires the assembly of large complexes that regulate translation initiation, mRNA stability, and mRNA degradation. Replication-dependent histone genes encode the only known family of mRNAs that are not polyadenylated. Unlike other eukaryotic mRNAs that end in a polyA sequence, replication-dependent histone mRNAs end in a highly conserved, 26 nucleotide stem-loop structure. The stem-loop of replication-dependent histone mRNAs is involved in pre-mRNA processing, translation, and stability (1, 2). Mature histone mRNAs are generated by endonucleolytic cleavage four or five nucleotides downstream of the stem-loop to form the 3′ end. The first step in this processing reaction involves the binding of a 32 kDa stem-loop binding protein (SLBP)¹ to the stem-loop and recruitment of the U7 small nuclear ribonucleoprotein (snRNP) to a site downstream of the stem-loop in the pre-mRNA (3). The pre-mRNA is subsequently cleaved, and the mature mRNA is exported to the cytoplasm. SLBP is also a component of the cytoplasmic histone messenger ribonucleoprotein (mRNP) complex (4). The high-affinity (KD ≈ 1 nM) interaction (5, 6) between SLBP and the mRNA stem-loop is crucial for efficient processing, translation, and regulation of degradation of histone mRNA in mammalian cells.

Recent biochemical studies (3-6) on the SLBP-RNA complex and solution NMR studies (7, 8) on the free RNA demonstrate that the mode of interaction of SLBP with histone stem-loop RNA is likely to be unique. The solution NMR structures of 24-nt and 28-nt stem-loops confirm that the RNA forms a canonical six base-pair A-form stem that starts with the highly conserved GC base pair at the base and ends in the UA base pair. The UUUC tetraloop is stabilized by base stacking interactions. Few NOEs are observed for bases flanking the stem, and NMR relaxation measurements indicate that these bases are disordered in solution. Mutagenesis experiments (3, 5, 6) on the RNA reveal that the SLBP-RNA interaction involves sequence-specific contacts among the mRNA stem, the loop, and the flexible flanking sequences, particularly the 5′ flanking sequence. Mutation of both the first and third uridine in the loop and base transversions in the stem at the second GC base pair decrease binding affinity >200-fold, whereas mutation of the sixth UA base pair to a UG base pair reduces affinity by ∼15-fold (9). In contrast to known RNA binding domains, such as the KH domain and the RRM domain that recognize non-Watson-Crick or exposed bases in single-stranded loops and bulges or the dsRBD that recognizes the double-stranded stem in the minor groove without much sequence specificity, SLBP appears to make base-specific contacts with the single-stranded flanking and loop regions, as well as the double-stranded stem in a sequence-specific manner. It is not clear how this small 70 amino acid RNA binding domain (RBD) that bears no homology to any other known protein achieves this extensive mode of RNA interaction.

Drosophila SLBP (dSLBP) is phosphorylated at multiple sites in the N- and C-terminal domains (10, 11). Several serine and threonine phosphorylation sites have been identified in the RNA binding and processing domain. Electrospray ionization mass spectrometric analysis of C-terminal deletion mutants also suggests that there are four (out of five possible) phosphorylation sites in the extreme C-terminus of dSLBP in the sequence DTAKDSNSDSDSD (10). Phosphorylation of the extreme C-terminus is necessary for efficient processing of the histone pre-mRNA to the mature mRNA but not for RNA binding in electrophoretic mobility shift assays (EMSA).

We report herein our first insight into the structural changes that occur when both the phosphorylated (P-RBD) and unphosphorylated (RBD) SLBP RBDs bind histone mRNA. Our data suggest that both the SLBP RBD and P-RBD lack a well-defined tertiary fold and fluctuate between an ensemble of partly folded helical conformations in solution.
Although spectral changes are observed in response to RNA binding, the presence of RNA is not sufficient to stabilize a single conformation in the unphosphorylated protein. We provide NMR evidence that the C-terminal serines are phosphorylated and show that the role of serine phosphorylation is to increase further the stability of the SLBP-RNA complex by a favorable electrostatic interaction, most likely with one or more basic residues in this domain. The effect of phosphorylation can be mimicked partially by mutation of the four C-terminal serines to glutamic acids in a mutant that we term 4E-RPD. Intriguingly, sulfate anions can also induce folding of the RPD, and the presence of both sulfate anions and RNA results in complete folding of the RPD. Our results are consistent with previous biological studies reported for the SLBP-RNA complex and form the basis for ongoing structural studies.

EXPERIMENTAL PROCEDURES

Limited Proteolysis of Baculovirus dSLBP. Full-length, baculovirus-expressed human and Drosophila SLBPs were incubated with the proteases Endo-Lys C, trypsin, Asp-N, and Glu-C at 37 °C at a molar ratio of SLBP/protease of 1:100 for varying times up to 2 h. Human SLBP was completely degraded within 15 min, whereas the Drosophila isoform gave distinct bands in all reactions. All reactions were stopped at varying time points by the addition of Pefabloc (Boehringer Mannheim) and freezing the reaction on dry ice. SDS-PAGE loading buffer was added to the reactions, which were boiled for 10 min, and then the samples were analyzed by 14% SDS-PAGE. The Endo-Lys C digest gave only two bands, which were subjected to detailed analysis by MALDI mass spectrometry. The fragments were gel-eluted, and a precise whole mass was obtained on each fragment. In addition, the two fragments were subjected to total tryptic digest, the resulting peptides were isolated by HPLC, and masses were obtained on the individual peptides. The total coverage of peptides mapped to the C-terminal RPD domain was 60%, whereas that for the N-terminal domain was 40%. Peptides for which masses were not obtained lay between sequence segments that were unambiguously identified using this approach, thereby increasing our confidence in the assigned domain boundaries. Truncation of the fragments identified by mass spectrometry resulted in severe solubility and aggregation problems, providing further evidence that our fragments represented structural domains in the intact protein. These analyses were performed by Dr. Christoph Borchers’ group (Department of Biochemistry and Biophysics, UNC-CH), and the analysis of the phosphorylated state of the baculovirus-expressed RPD has been reported elsewhere (10).

Sample Preparation. The C-terminal 105 amino acids of the dSLBP, termed the RPD (residues 172-276), were subcloned into the NdeI and EcoRI restriction sites of the bacterial and baculoviral expression vectors pET21a (Novagen) and pFAST BacHTa (Invitrogen), respectively. Unlabeled and uniformly 15N-labeled SLBP RPD was expressed from the vector pET21a by growing BL21(DE3) RIL Codon Plus cells (Stratagene) transformed with pET21a-RPD at 20 °C in the presence of 1 mM IPTG in Luria broth or minimal media, respectively. Under these conditions, about 50% of the protein was recovered from the soluble fraction.
The RPD was purified using cation-exchange chromatography over an S-sepharose column (Amersham Pharmacia Biotech) followed by gel filtration chromatography using an S100 column (Amersham Pharmacia Biotech). The proteins were >95% pure as judged by SDS-PAGE. Phosphorylated forms of the RPD and full-length SLBP proteins were obtained by infecting Sf9 cells with RPD-baculovirus and growing the Sf9 cells in Grace’s medium for 48 h. Baculovirus-expressed, full-length RPD proteins had a 20 residue N-terminal tag that included a His tag and were hence purified using nickel affinity chromatography. The dSLBP RPD-4E mutant, in which all four C-terminal serines were mutated to glutamic acids using Quickchange mutagenesis, was expressed in the vector pET28a (Novagen) with a C-terminal His tag and purified using immobilized metal affinity chromatography (IMAC) over a Ni2+ column. Unlike the RPD, which was mostly insoluble when expressed with a His tag, the His-tagged 4E mutant was quite soluble when expressed at 20 °C. The identity of all constructs was confirmed by DNA sequencing as well as electrospray mass spectrometry. The 28-mer RNA (5′-GGCCAAAGGCCCUUUUCAGGGCCACCCA-3′) used in our studies corresponds to the sequence of the mammalian histone H4 hairpin for which NMR structures have previously been described (7, 8). The RNA was custom synthesized, deprotected, and PAGE-purified by Dharmacon, Inc. To ensure that >95% of the RNA was in the stem-loop form, the lyophilized RNA was taken up in NMR buffer, heated to 95 °C for 10 min, and then snap-cooled on ice.
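As an illustration of how the 28-mer maps onto the hairpin described in the introduction (a six base-pair stem closed by a UUUC tetraloop), the following minimal sketch scans the sequence for the first perfectly Watson-Crick-paired stem of the given length. The function name `find_hairpin`, the restriction to Watson-Crick pairs only, and the default stem and loop lengths are assumptions made for this example; this is not a folding algorithm and is not part of the original study.

```python
# Watson-Crick pairs only; G-U wobble pairs are deliberately excluded (assumption).
WC_PAIRS = {("G", "C"), ("C", "G"), ("A", "U"), ("U", "A")}


def find_hairpin(seq, stem_len=6, loop_len=4):
    """Return (start, stem5, loop, stem3) for the first perfect hairpin, or None.

    `start` is the 0-based index of the first nucleotide of the 5' stem strand.
    """
    span = 2 * stem_len + loop_len
    for i in range(len(seq) - span + 1):
        stem5 = seq[i:i + stem_len]
        loop = seq[i + stem_len:i + stem_len + loop_len]
        stem3 = seq[i + stem_len + loop_len:i + span]
        # Pair the 5' strand (read 5'->3') against the 3' strand read backwards.
        if all((a, b) in WC_PAIRS for a, b in zip(stem5, reversed(stem3))):
            return i, stem5, loop, stem3
    return None


# Sequence of the 28-mer RNA as given in the text.
RNA_28MER = "GGCCAAAGGCCCUUUUCAGGGCCACCCA"

hairpin = find_hairpin(RNA_28MER)
print(hairpin)
```

Under these assumptions the search places the stem after a seven-nucleotide 5′ flank, with a GC pair at the base of the stem and a UA pair closing the loop, and leaves a five-nucleotide 3′ flank.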

The protein-RNA complex used for NMR studies was formed by first exchanging the protein into 20 mM Tris, pH 7.0, 50 mM KCl, 0.2 mM EDTA, 10% D2O, and 0.1% sodium azide and then adding a suitable volume of a highly concentrated RNA solution to the protein sample to form the SLBP RPD-RNA complex.

NMR Measurements. NMR data were collected at 25 °C and pH 7.0 on an Inova 600 MHz spectrometer equipped with either a 5 mm z-gradient triple resonance probe for collection of HSQC or NOESY data or a 5 mm broadband probe used for 31P studies. 2D (15N,1H) HSQC spectra were collected on the free and RNA-bound forms of 15N-labeled RPD. A 3D (15N,1H) NOESY-HSQC ($\tau_m$ = 200 ms) was collected with a 1 mM sample of dSLBP RPD with 16 transients per scan and 200 increments in t1 and 64 increments in t2. Spectra were processed using the program Felix 98.2 (Accelrys).

Pulsed-Field Gradient (PFG) NMR. Diffusion experiments were performed using a PG-SLED (pulsed gradient stimulated echo longitudinal encode-decode) sequence that has been previously described (12). Typically, a series of 15-20 1D proton spectra were acquired by varying the strength of the diffusion gradient between 100% and 10% of the maximum peak intensity using 64 transients per spectrum and a 1 mM unlabeled protein sample. A concentration of 1 mM is routinely used in these experiments to obtain sufficient S/N at higher gradient strengths. NMR spectra were processed and the methyl proton peak intensities were integrated using the software package VNMR (Varian, Inc.). Peak intensities, $s(g)$, were fit as a function of gradient strength ($g$) using the equation $s(g) = A e^{-d g^{2}}$ to obtain the observed decay rate ($d$). The value of the protein hydrodynamic radius, $R_h^{prot}$, was calculated using dioxane as a reference molecule, such that the ratio $R_h^{prot}/R_h^{ref} = d_{ref}/d_{prot}$. The effective hydrodynamic radius used for dioxane ($R_h^{ref}$) was 2.12 Å (12).
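The decay-rate fit and the radius calculation described above can be sketched as follows. This is a minimal illustration, not the authors' VNMR workflow: the function names `fit_decay_rate` and `hydrodynamic_radius` are hypothetical, the intensities are synthetic noise-free values invented for the example, and the fit is done by linear regression of ln s(g) on g², which is an assumed simplification of a direct nonlinear fit.

```python
import numpy as np

R_H_DIOXANE = 2.12  # Å, effective hydrodynamic radius of dioxane quoted in the text


def fit_decay_rate(g, s):
    """Fit s(g) = A * exp(-d * g**2) and return (A, d).

    Taking logs gives ln s = ln A - d * g**2, a straight line in g**2,
    so an ordinary first-degree polynomial fit recovers both parameters.
    """
    g = np.asarray(g, dtype=float)
    s = np.asarray(s, dtype=float)
    slope, intercept = np.polyfit(g**2, np.log(s), 1)
    return float(np.exp(intercept)), float(-slope)


def hydrodynamic_radius(d_prot, d_ref, r_ref=R_H_DIOXANE):
    """R_h(prot) = (d_ref / d_prot) * R_h(ref), in the units of r_ref."""
    return (d_ref / d_prot) * r_ref


# Synthetic example data (NOT measured values): 15 relative gradient strengths.
g = np.linspace(0.1, 1.0, 15)
s_prot = 1.0 * np.exp(-0.5 * g**2)  # slowly decaying protein signal
s_ref = 1.0 * np.exp(-5.0 * g**2)   # faster decaying dioxane reference signal

_, d_prot = fit_decay_rate(g, s_prot)
_, d_ref = fit_decay_rate(g, s_ref)
r_h = hydrodynamic_radius(d_prot, d_ref)
print(f"d_prot = {d_prot:.3f}, d_ref = {d_ref:.3f}, R_h = {r_h:.1f} Å")
```

With real data the log-linear fit weights noisy low-intensity points more heavily than a nonlinear least-squares fit would, so the two approaches may give slightly different decay rates.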
The theoretically predicted values of the hydrodynamic radii were calculated by the method of Uversky (13).

Circular Dichroism. All experiments were performed in 10 mM potassium phosphate buffer, pH 7.0, at 20 °C. The conformational stability of the RNA binding domains in the presence and absence of the RNA was determined using thermal denaturation. Thermal unfolding was monitored by observing the change in ellipticity at 222 nm as a function of increasing temperature. The heating rate was 30 °C per hour in a cuvette with a path length of 1 mm.

Sedimentation Equilibrium. Sedimentation equilibrium measurements for RPD and P-RPD were performed on 0.3 mM samples in NMR buffer described above in a Beckman XL-A analytical ultracentrifuge using a Ti60 four-hole rotor. The rotor speed was set at 20 000 rpm for 18 h and then set to 45 000 rpm for an additional 8 h. At the end of this period, the meniscus was depleted as could be ascertained by comparison of the last absorption profiles. Absorbance scans were recorded at 2 h intervals at 295 nm and 25 °C. Data analysis was performed using the software program Origin XL-A.

Calculation of Phosphate pKa Values. A series of 1D 31P NMR spectra were collected at different pH values on an Inova 500 MHz spectrometer at 25 °C. The pKa values were determined from a least-squares fit of the 31P chemical shift as a function of pH to the equation $\delta = [\delta_2 (10^{pH - pK_a}) + \delta_1] / [1 + 10^{pH - pK_a}]$, where $\delta_2$ and $\delta_1$ represent the chemical shifts of the dianionic and monoanionic forms of the phosphate group, respectively. The random-coil 31P-serine, 31P-threonine, and 31P-tyrosine chemical shifts were obtained from previous studies as mentioned in Table 1 (14, 15).

Measurement of Off Rates. Since the off rate of the protein-RNA complex is very slow, that is