Anal. Chem. 2010, 82, 7795–7803
In-Gel Digestion for Mass Spectrometric Characterization of RNA from Fluorescently Stained Polyacrylamide Gels

Masato Taoka,*,† Maki Ikumi,† Hiroshi Nakayama,‡,§ Shunpei Masaki,‡ Ryozo Matsuda,† Yuko Nobe,‡ Yoshio Yamauchi,† Jun Takeda,‡ Nobuhiro Takahashi,‡,| and Toshiaki Isobe†,‡

Department of Chemistry, Graduate School of Sciences and Engineering, Tokyo Metropolitan University, Minamiosawa 1-1, Hachioji-shi, Tokyo 192-0397, Japan, Core Research for Evolutional Science and Technology (CREST), Japan Science and Technology Agency, Sanbancho 5, Chiyoda-ku, Tokyo 102-0075, Japan, Biomolecular Characterization Team, RIKEN Advanced Science Institute, Hirosawa 2-1, Wako, Saitama 351-0198, Japan, and Department of Biotechnology, United Graduate School of Agriculture, Tokyo University of Agriculture and Technology, Saiwai-cho 3-5-8, Fuchu-shi, Tokyo 183-8509, Japan

Although current mass spectrometry-based proteomics technology allows for high-throughput analysis of protein components in functional ribonucleoprotein complexes, this technology has had limited application to studies of RNA itself. Here we present a protocol for RNA analysis using polyacrylamide gel electrophoresis coupled with liquid chromatography-tandem mass spectrometry. Specifically, RNAs of interest are subjected to polyacrylamide gel electrophoresis and stained with a fluorescent dye, and RNAs in gel bands are digested with nuclease and then analyzed directly by liquid chromatography-mass spectrometry, resulting in highly accurate mass values and reliable information on post-transcriptional modifications. We demonstrate that the method can be applied to the identification and chemical analysis of small RNAs in mouse embryonic stem cell extracts and of small RNAs in the spliceosomal ribonucleoprotein complex pulled down from yeast cells using a tagged protein cofactor as bait.
The protocol is relatively simple and allowed us to identify not only three novel methylated nucleotide residues of RNase P RNA, U6 snRNA, and 7SL RNA prepared from mouse ES cells but also various 3′-end forms of U4, U5S, and U6 snRNAs isolated from the yeast spliceosome at the femtomole level. The method is thus a convenient tool for direct analysis of RNAs in various cellular ribonucleoprotein complexes, particularly for the analysis of post-transcriptional modifications and metabolic processing of RNA.

* To whom correspondence should be addressed. E-mail: [email protected].
† Tokyo Metropolitan University.
‡ Japan Science and Technology Agency.
§ RIKEN Advanced Science Institute.
| Tokyo University of Agriculture and Technology.
10.1021/ac101623j © 2010 American Chemical Society. Published on Web 08/26/2010

RNAs are involved in nearly all aspects of gene expression, including ribosome-mediated peptide bond formation. Recent genetic and biochemical evidence has revealed, however, that diverse types of noncoding RNAs (ncRNAs) also play pivotal roles in a variety of cellular processes, such as precursor mRNA processing, chromatin remodeling, transcriptional regulation, gene
silencing, centromere function, and translational regulation.1-5 Through these processes, ncRNAs participate in the regulation of differentiation, cell proliferation, and programmed cell death.6 In general, ncRNAs can be identified by various hybridization-based techniques, including Northern blotting and microarrays, and by sequencing-based techniques after reverse transcription. Hybridization techniques require prior knowledge of the sequence, whereas sequencing does not need such information to identify RNA molecules with high sensitivity. However, sequencing procedures can be time-consuming and labor intensive (e.g., cDNA library construction), and they cannot assess qualitative aspects of RNA such as type or position of modified nucleosides.7 Thus, development of efficient methods to identify RNAs and their post-transcriptional modifications is necessary to characterize the functional aspects of large numbers of unknown RNA molecules. Mass spectrometry (MS)-based techniques offer sensitive methods for the direct chemical analysis of RNA and therefore are well suited as complementary methods to conventional RNA analysis techniques. Numerous reports have described the development and the application of MS for the analysis of modifications on RNA transcripts and synthetic RNAs,8,9 as well as for the identification of RNAs by MS-based fingerprinting after endonuclease treatment.10 RNA sequences also have been determined by generating tandem MS (MS/MS) profiles by collision-induced dissociation, followed by electrospray ionization11-14 with

(1) Cullen, B. R. Mol. Cell 2004, 16, 861–865.
(2) Fischer, S. E.; Butler, M. D.; Pan, Q.; Ruvkun, G. Nature 2008, 455, 491–496.
(3) Hirota, K.; Miyoshi, T.; Kugou, K.; Hoffman, C. S.; Shibata, T.; Ohta, K. Nature 2008, 456, 130–134.
(4) Wang, X.; Arai, S.; Song, X.; Reichart, D.; Du, K.; Pascual, G.; Tempst, P.; Rosenfeld, M. G.; Glass, C. K.; Kurokawa, R. Nature 2008, 454, 126–130.
(5) Zilberman, D.; Cao, X.; Jacobsen, S. E. Science 2003, 299, 716–719.
(6) Schickel, R.; Boyerinas, B.; Park, S. M.; Peter, M. E. Oncogene 2008, 27, 5959–5974.
(7) Motorin, Y.; Muller, S.; Behm-Ansmant, I.; Branlant, C. Methods Enzymol. 2007, 425, 21–53.
(8) Beverly, M. B. Mass Spectrom Rev. DOI: 10.1002/mas.20260.
(9) Thomas, B.; Akoulitchev, A. V. Trends Biochem. Sci. 2006, 31, 173–181.
(10) Hossain, M.; Limbach, P. A. RNA 2007, 13, 295–303.
(11) McLuckey, S. A.; Van Berkel, G. J.; Glish, G. L. J. Am. Soc. Mass Spectrom. 1992, 3, 60–70.
Analytical Chemistry, Vol. 82, No. 18, September 15, 2010
subsequent structure analysis assisted by computer software.15,16 However, although these previous studies demonstrated the potential of MS in various aspects of nucleic acids research, there is no method that allows for determination of the MS/MS spectrum of RNAs in small amounts of biological samples and correlates a tandem mass spectrum with nucleotide sequences from a database to identify the RNA. We recently developed a sensitive liquid chromatography (LC)-MS system to measure accurate masses of RNAs and obtain MS/MS spectra at the subfemtomole level.17 To improve the efficiency of interpretation of MS/MS spectra, we also developed the software “Ariadne” that compares MS/MS data for RNA nucleolytic fragments against a DNA/RNA sequence database to automate identification of RNAs in biological samples18 and that yields results equivalent to those generated by the many sequence search engines widely used for proteomics, such as SEQUEST and Mascot.19 At present, LC-MS coupled with Ariadne-based RNA identification is difficult to adapt for direct analysis of the characteristics of RNAs in complex mixtures, and thus it is necessary to isolate the RNA from the mixture using techniques such as LC or polyacrylamide gel electrophoresis (PAGE). Although PAGE affords an easy means to obtain highly purified RNA, a method of in-gel digestion with subsequent extraction of RNA fragments has never been established. To address this technological deficit, we developed a method for in-gel digestion of RNA followed by identification of the resulting fragments via LC-MS. The method was applied to the analysis of commercial phenylalanine tRNA, small RNAs purified from mouse embryonic stem (ES) cells, and small nuclear RNAs (snRNAs) isolated from the prespliceosomal ribonucleoprotein (RNP) complex. The method is an efficient tool with high sensitivity for the analysis of small RNAs in biological samples, particularly RNAs pulled down by affinity purification from cells. 
EXPERIMENTAL SECTION Chemicals. Standard laboratory chemicals were obtained from Wako Pure Chemical Industries (Osaka, Japan). Yeast tRNAphe-1 and RNase A were obtained from Sigma-Aldrich (St. Louis, MO). RNase T1 was purchased from Worthington (Lakewood, NJ) and further purified by reverse-phase LC before use. MazF was obtained from TAKARA Bio Inc. (Shiga, Japan). In-Solution RNA Digestion. RNase T1 digestion of RNA was performed in 10 mM sodium acetate (pH 5.3) at 37 °C for 30 min at an enzyme/substrate ratio of 1:500 (w/w). The digests were analyzed immediately by nanoflow LC-MS or stored at -20 °C until use. (12) Ni, J.; Pomerantz, C.; Rozenski, J.; Zhang, Y.; McCloskey, J. A. Anal. Chem. 1996, 68, 1989–1999. (13) Oberacher, H.; Wellenzohn, B.; Huber, C. G. Anal. Chem. 2002, 74, 211– 218. (14) Schurch, S.; Bernal-Mendez, E.; Leumann, C. J. J. Am. Soc. Mass Spectrom. 2002, 13, 936–945. (15) Oberacher, H.; Mayr, B. M.; Huber, C. G. J. Am. Soc. Mass Spectrom. 2004, 15, 32–42. (16) Rozenski, J.; McCloskey, J. A. J. Am. Soc. Mass Spectrom. 2002, 13, 200– 203. (17) Taoka, M.; Yamauchi, Y.; Nobe, Y.; Masaki, S.; Nakayama, H.; Ishikawa, H.; Takahashi, N.; Isobe, T. Nucleic Acids Res. 2009, 37, e140. (18) Nakayama, H.; Akiyama, M.; Taoka, M.; Yamauchi, Y.; Nobe, Y.; Ishikawa, H.; Takahashi, N.; Isobe, T. Nucleic Acids Res. 2009, 37, e47. (19) Panchaud, A.; Affolter, M.; Moreillon, P.; Kussmann, M. J. Proteomics 2008, 71, 19–33.
Electrophoresis of RNA. RNA was subjected to PAGE essentially as described.20 Briefly, RNA in solution was denatured prior to electrophoresis by mixing with 9 volumes of a loading buffer containing 2 mM EDTA (pH 8.0) and 95% formamide, heating at 95 °C for 5 min, and cooling immediately on ice. RNA was separated on 10% (w/v) polyacrylamide gels containing 8 M urea and 0.5× TBE (45 mM Tris, 32.3 mM boric acid, 1.25 mM EDTA, pH 8.3) with 0.5× TBE as a running buffer, and gels were stained with SYBR Gold (Invitrogen, Carlsbad, CA) for 5 min. In-Gel RNA Digestion. Gel pieces containing RNA bands were excised from gels, cut into small pieces, and dried under vacuum. The gel pieces were digested with 15 µL of 2 ng/µL RNase T1 or RNase A, with incubation at 37 °C for 1 h. The nucleolytic fragments were extracted from the gel using 100 µL of RNase-free water, passed through a centrifugal filter unit with a polyvinylidene fluoride membrane (Ultrafree-MC, Millipore, Billerica, MA), and then 5 µL of 2 M triethylammonium acetate (pH 7.0) was added before LC-MS analysis. LC of Oligonucleotides with Ultraviolet Detection. RNA fragments from in-gel digestion were applied to an LC system (LC20A, Shimadzu, Kyoto, Japan) equipped with a capillary Develosil C30 column (2.1 mm ×150 mm), eluted by a gradient of methanol in 10 mM triethylammonium acetate (pH 7.0), and monitored by ultraviolet absorption at 260 nm. The sum of the peak areas was used to calculate the recovery of RNA produced by in-gel digestion. Preparation of Mouse ES Small RNA. A mouse embryonic stem cell line, D3 (American Type Culture Collection, Manassas, VA), was maintained as described.21 Small RNAs from ES cells were purified using the mirVana miRNA Isolation kit (Ambion, Austin, TX) according to the manufacturer’s instructions. Preparation of Yeast Brr2-Associated Spliceosome Complex. 
The Brr2-associated spliceosomal complex was purified from the yeast Saccharomyces cerevisiae strain S288C expressing tandem-affinity-purification (TAP)-tagged Brr2, essentially as described.17 Briefly, yeast cells were grown in 4 L of YPD medium to A600 of 2.0, and the cells were disrupted by a Multibeads Shocker (Yasui Kikai Co., Osaka, Japan) at 2000 rpm for 4 min in 10 mM HEPES-KOH (pH 7.9), 200 mM NaCl, 10 mM KCl, 1.5 mM MgCl2, 0.5 mM dithiothreitol (DTT), and 0.5 mM phenylmethylsulfonyl fluoride. The extract (∼1.2 g protein) was mixed with 1.8 mL IgG-Sepharose beads (50% slurry, GE Healthcare UK Ltd., Little Chalfont, Buckinghamshire, U.K.), incubated for 60 min at 4 °C, and centrifuged at 100 000 × g for 30 min at 4 °C. After the beads were washed, the Brr2-associated complex was incubated with 100 U/mL AcTEV (Invitrogen) protease for 60 min at room temperature. The eluate was then supplemented with CaCl2 to 3 µM and incubated with 200 µL Calmodulin Affinity Resin beads (50% slurry, Stratagene, La Jolla, CA) at 4 °C for 60 min. After the beads were washed with 10 mM Tris-HCl (pH 8.0), 300 mM NaCl, 0.1% NP-40, 1 mM magnesium acetate, 1 mM imidazole, 10 mM DTT, and 3 µM CaCl2, the Brr2-associated complex was recovered by incubation with the buffer with 20 mM EGTA

(20) Sambrook, J.; Russell, D. Molecular Cloning: A Laboratory Manual, 3rd ed.; Cold Spring Harbor Laboratory Press: Cold Spring Harbor, NY, 2001.
(21) Nunomura, K.; Nagano, K.; Itagaki, C.; Taoka, M.; Okamura, N.; Yamauchi, Y.; Sugano, S.; Takahashi, N.; Izumi, T.; Isobe, T. Mol Cell Proteomics 2005, 4, 1968–1976.
instead of 3 µM CaCl2. RNA and protein components in the complex were separated by phenol-chloroform extraction.17 LC-MS Apparatus for RNA Analysis. The LC system used was essentially as described,22 consisting of a nanoflow pump (LC Assist, Tokyo, Japan) that delivers solvent to the fritless spray tip electrospray ionization column and a ReNCon gradient device. The column was prepared with a fused-silica capillary (150 µm i.d. × 375 µm o.d.) using a laser puller (Sutter Instruments Co., Novato, CA) and was slurry-packed with reversed-phase material (Develosil C30-UG-3, particle size 3 µm, Nomura Chemical, Aichi, Japan) to a length of 50 mm. High voltage for ionization in negative mode was applied, and the LC eluate was sprayed online to an LTQ-Orbitrap hybrid mass spectrometer (Thermo Fisher Scientific, San Jose, CA). LC was performed at a flow rate of 100 nL/ min using a 60-min linear gradient from 5% to 40% methanol in 20 mM triethylammonium acetate (pH 7.0). The mass spectrometer was operated in a mode to automatically switch between OrbitrapMS and linear ion trap-MS/MS acquisition as described.17 Database Search and Interpretation of MS/MS Spectra of RNA. We used the software, Ariadne,18 for database searches for RNA. Ariadne searched the masses of product ions generated by MS/MS of an RNase T1 digest of sample RNA from a candidate nucleotide sequence in a database. Databases used were the mouse small RNA database, which was constructed from the cDNA of less than 1000 bases in NCBI GenBank (http:// www.ncbi.nlm.nih.gov/genbank/), and EMBL sequence database (http://www.ebi.ac.uk/embl/), which contained more than 1500 entries, and the genome database of S. cerevisiae (http:// downloads.yeastgenome.org/). The following search parameters were used: the maximum number of missed cleavages was set at 1; the variable modification parameters were one (for mouse small RNA) or two (for the S. 
cerevisiae genome) methylations per RNA fragment for any residue; an RNA mass tolerance of ±50 ppm and an MS/MS tolerance of ±0.5 Da were allowed. We selected the RNAs highly ranked by Ariadne as candidates and applied stricter criteria for RNA identification to eliminate ambiguous identifications: the original MS/MS spectrum was carefully inspected to confirm that the assignment was based on two or more y- or c-series ions; RNAs with less than 50% coverage were eliminated from the identifications; and the nucleotide length determined by PAGE and that from the database were used to validate RNA identification.

Proteomics Procedures. SDS-PAGE of protein and in-gel digestion,23 digestion of protein without gel separation,22 and LC-MS/MS analysis of the resulting peptides22 were performed as described. The LC-MS apparatus used for proteomics analyses was the direct nanoflow LC-MS system equipped with a quadrupole-time-of-flight hybrid mass spectrometer (Q-Tof Ultima, Waters, Bedford, MA).22 Database searches were performed using Mascot software (version 2.2.1, Matrix Science Ltd., London, U.K.) with the SGD sequence database (release 20060506, http://downloads.yeastgenome.org/) under the search parameters as described.24 The criteria for protein identification were based on

(22) Natsume, T.; Yamauchi, Y.; Nakayama, H.; Shinkawa, T.; Yanagida, M.; Takahashi, N.; Isobe, T. Anal. Chem. 2002, 74, 4725–4733.
(23) Taoka, M.; Ichimura, T.; Wakamiya-Tsuruta, A.; Kubota, Y.; Araki, T.; Obinata, T.; Isobe, T. J. Biol. Chem. 2003, 278, 5864–5870.
(24) Taoka, M.; Yamauchi, Y.; Shinkawa, T.; Kaji, H.; Motohashi, W.; Nakayama, H.; Takahashi, N.; Isobe, T. Mol. Cell Proteomics 2004, 3, 780–787.
the vendor’s definitions (Matrix Science, Ltd.), and we used the stricter criteria for protein assignment, as reported.17

RESULTS AND DISCUSSION

Optimization of Digestion and Elution Conditions. We combined the reliability of gel electrophoresis with the nanoflow LC-MS analytical system, followed by a sequence database search with Ariadne, which has proven to be a powerful tool in RNA analysis,18 to develop an in-gel digestion procedure as an interfacing technique. We designed the procedure by referring to the in-gel digestion method used for MS-based identification of proteins.25 As shown in Figure 1, the method includes four steps: (i) after separation by PAGE, the entire gel is fluorescently stained and RNA bands are excised, (ii) the RNA in each band is completely digested with a nuclease, (iii) each digest is extracted away from the gel matrix, and (iv) the solution containing the RNA fragments is filtered to remove gel particles prior to LC-MS analysis. Each step was optimized to maximize the recovery of RNA fragments, which was measured by LC with UV detection; yeast tRNAPhe-1 (10 pmol) stained with SYBR Gold served as a standard/positive control (80% recovery on average). The method was also applicable to gel bands stained with ethidium bromide, SYBR Green II, or SYBR Safe, which are less sensitive dyes than SYBR Gold for visualization of RNA bands. The band size used for the analysis was 2 mm high × 8 mm wide × 1 mm thick. For step 1, washing the band (e.g., to remove excess chemicals) should be avoided: washing with water or with 70% ethanol containing 0.1 M NaCl, conditions that are commonly used for ethanol precipitation of oligonucleotides, decreased the recovery of RNA fragments to below 40%. Crushing the gel into small pieces only marginally improved RNA recovery. For step 2, the RNase T1 concentration tested was between 0.2 and 200 ng/µL.
More than 2 ng/µL of enzyme allowed sufficient recovery. Incubation times for digestion between 0.5 and 8 h were tested. Extending the digestion beyond 1 h did not increase the recovery. RNase A and MazF were also applicable. For step 3, the extraction time was varied from 0.5 to 8 h. Extending the extraction beyond 1 h did not increase the recovery. For step 4, filtration of the reaction through a polytetrafluoroethylene membrane was necessary because residual gel plugged the LC line. Cellulose acetate or a Durapore membrane unit was also suitable for the filtration.

(25) Granvogl, B.; Ploscher, M.; Eichacker, L. A. Anal. Bioanal. Chem. 2007, 389, 991–1002.
(26) Egami, F.; Takahashi, K.; Uchida, T. Prog. Nucleic Acid Res. Mol. Biol. 1964, 3, 59–101.
(27) Holzl, G.; Oberacher, H.; Pitsch, S.; Stutz, A.; Huber, C. G. Anal. Chem. 2005, 77, 673–680.
(28) Premstaller, A.; Oberacher, H.; Huber, C. G. Anal. Chem. 2000, 72, 4386–4393.
(29) Czerwoniec, A.; Dunin-Horkawicz, S.; Purta, E.; Kaminska, K. H.; Kasprzak, J. M.; Bujnicki, J. M.; Grosjean, H.; Rother, K. Nucleic Acids Res. 2009, 37, D118–121.

Figure 1. Overview of the procedure to identify RNAs using the in-gel digestion method. In-gel digestion connects RNA preparation by any procedure to LC-MS analysis followed by a database search. See text for details.

Compatibility with RNA Microcharacterization. To establish the compatibility of PAGE-separated RNAs with characterization by in-gel digestion coupled with LC-MS, the following issues needed to be addressed: (a) whether any RNA fragments can be recovered after digestion of PAGE-separated RNAs and (b) whether the recovered RNA fragments retain the same chemical structures, including post-transcriptional modifications (without additional experimental modification), as for RNAs digested in buffer solution as established in the literature.26 To address these issues, aliquots of 100 fmol of yeast tRNAPhe-1 were electrophoresed and processed in-gel as described in Figure 1. After extraction of RNA from the mixture, LC-MS spectra were obtained. Figure 2a and b shows the base peak chromatograms of the tRNAPhe-1 preparation that had been digested in the gel or in solution (note that the preparation was contaminated with a trace amount of tRNAPhe-2,17 a structural homologue of tRNAPhe-1). Table 1 lists the sequence and the molecular mass of each nucleotide fragment. There were no detectable differences in the separation and retention times of major peaks in the base peak chromatograms, indicating that in-gel digestion did not affect the LC analysis. The mass values for all of the fragments were easily obtained for each gel band with no sodium adduct ions (Figure 3a and b), which are frequently seen in MS spectra of RNA,27,28 and thus could be assigned to the hypothetical fragments from the original tRNAPhe-1 sequence with mass accuracy within 5 ppm (Table 1); RNase T1 generated more than 10 fragments of yeast tRNAPhe-1 having a 3′-phosphate or 2′,3′-cyclic phosphate. Importantly, all of the mass values estimated were consistent with the post-transcriptional modifications reported for yeast tRNAPhe-1, including nine methylated nucleotides and a single wybutosine,29 although we could not distinguish pseudouridine from uridine as this modification is mass-silent. The fragments from the gel band covered the entire tRNAPhe-1 sequence except for mono-, di-, and trinucleotides released by RNase T1 cleavage, as was also seen in the in-solution digest. In addition, there
was no evidence of artificial modification of tRNAPhe-1 during the in-gel digestion experiment. Using the signal intensity of the mass spectra obtained above, we calculated the recovery of RNA fragments extracted from a gel band at the femtomole level. Our in-gel digestion method showed RNA recovery from 30% to 100% relative to in-solution digestion at the level of 2000, 500, and 100 fmol of tRNAPhe-1 (50, 12.5, and 2.5 ng, respectively, Supporting Information Table 1), which just exceeded the detection limit of SYBR Gold staining for PAGE-separated RNA bands.

MS/MS Analysis and Identification of RNAs Using the Database Search Engine Ariadne. In our previous study, the sequences of major RNA fragments of yeast tRNAPhe-1 cleaved in solution by RNase T1 were assigned by analyzing MS/MS spectra collected automatically by data-dependent collision-induced dissociation.17 For in-gel digestion, the RNA fragments generated a series of product ions in the collision-induced dissociation-MS/MS analysis (Figure 3c), just as in the in-solution digest. The product ions included the major c/y and a/w series ions with minor derivatives (hydrated or dehydrated ions and those ions that lost nucleotide bases), as reported earlier.12,27,30,31 To establish the feasibility of using the in-gel digestion method to identify RNAs by Ariadne, we searched the S. cerevisiae genome database using the MS and MS/MS spectra obtained from in-gel digested tRNAPhe-1 as a query. Even at 100 fmol, Ariadne identified the eight genes for tRNAPhe-1 by matching six oligonucleotide fragments harboring four methylated nucleotides from the in-gel-digested sample (Supporting Information

(30) Tromp, J. M.; Schurch, S. J. Am. Soc. Mass Spectrom. 2005, 16, 1262–1268.
(31) Wu, J.; McLuckey, S. A. Int. J. Mass Spectrom. 2004, 237, 197–241.
Figure 2. Base peak chromatogram for yeast tRNAPhe-1 digested with RNase T1 in gel (a) or in solution (b). Both the in-gel and solution-phase RNase T1 digests began with 100 fmol of the tRNA, with subsequent analysis by LC-MS. (a) Analysis of in-gel digested tRNA; (b) analysis of tRNA digested in solution. Seventeen major oligoribonucleotide peaks, indicated by arrows with peak numbers, were assigned to the fragments of yeast tRNA (see Table 1). The count corresponding to 100% of the y-axis is shown at the upper left of the figure.

Table 1. Assignments of Oligonucleotide Fragments Found in the RNase Digest of the Yeast tRNAPhe-1 Preparation by In-Gel Digestion

| peak number | observed m/z | observed charge | observed molecular mass (Da) | theoretical molecular mass (Da) | ∆ppm | tRNA [a] | residue numbers [a] | sequence | Ariadne search, 100 fmol band |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 639.0663 | -2 | 1280.1484 | 1280.1487 | 0.2 | Phe2 | 68-71 | UUCGp | identified |
| 2 | 646.0743 | -2 | 1294.1645 | 1294.1643 | -0.1 | Phe1, Phe2 | 54-57 | TΨCGp | identified |
| 3 | 803.1004 | -2 | 1608.2166 | 1608.2170 | 0.2 | Phe1, Phe2 | 11-15 | CUCAGp | identified |
| 4 | 691.1020 | -1 | 692.1099 | 692.1100 | 0.2 | Phe1, Phe2 | 44-45 | AGp | out of setting [b] |
| 5 | 590.1023 | -2 | 1182.2203 | 1182.2195 | -0.7 | Phe1, Phe2 | 72-75 | CACC-OH | out of setting [c] |
| 6 | 978.1258 | -2 | 1958.2674 | 1958.2681 | 0.4 | Phe1, Phe2 | 46-51 | m7GUCm5CUGp | unidentified [d] |
| 7 | 637.0692 | -2 | 1276.1542 | 1276.1538 | -0.3 | Phe1, Phe2 | 54-57 | TΨCG>p | identified |
| 8 | 754.6278 | -2 | 1511.2713 | 1511.2718 | 0.3 | Phe1, Phe2 | 72-76 | CACCA-OH | out of setting [c] |
| 9 | 969.1208 | -2 | 1940.2575 | 1940.2576 | 0.1 | Phe1, Phe2 | 46-51 | m7GUCm5CUG>p | unidentified [d] |
| 10 | 860.7845 | -3 | 2585.3772 | 2585.3782 | 0.4 | Phe1, Phe2 | 58-65 | m1AUCCACAGp | identified |
| 11 | 989.1476 | -2 | 1980.3111 | 1980.3114 | 0.2 | Phe1, Phe2 | 25-30 | Cm2,2GCCAGp | identified |
| 12 | 854.7813 | -3 | 2567.3675 | 2567.3677 | 0.1 | Phe1, Phe2 | 58-65 | m1AUCCACAG>p | unidentified [e] |
| 13 | 968.1190 | -2 | 1938.2539 | 1938.2533 | -0.3 | Phe1 | 66-71 | AAUUCGp | identified |
| 14 | 975.6185 | -2 | 1953.2528 | 1953.2529 | 0.1 | Phe1 | 5-10 | AUUUAm2Gp | identified |
| 15 | 966.6144 | -2 | 1935.2447 | 1935.2424 | -1.2 | Phe1 | 5-10 | AUUUAm2G>p | identified |
| 16 | 959.1141 | -2 | 1920.2440 | 1920.2428 | -0.6 | Phe1 | 66-71 | AAUUCG>p | identified |
| 17 | 1381.2095 | -3 | 4146.6522 | 4146.6543 | 0.5 | Phe1, Phe2 | 31-42 | ACmUGmAAyWAΨm5CUG>p | out of setting [c] |

[a] According to the MODOMICS database (http://genesilico.pl/modomics).
[b] Fragments longer than four nucleotides were assigned.
[c] The search was not constructed to detect complex post-transcriptional modifications such as wybutosine and post-transcriptional nucleotide addition.
[d] The MS/MS assignment could not be made because of inconsistent dissociation of 7-methylguanine.
[e] The MS/MS assignment could not be made because of weak signal.
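The relationship between the m/z, charge, molecular mass, and ∆ppm columns of Table 1 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' software: the function names are hypothetical, the per-charge correction of 1.007825 Da (the hydrogen atom mass) is an assumption chosen because it reproduces the tabulated masses to within about 0.0003 Da, and the sign convention (theoretical minus observed) is inferred from the table rather than stated in the text. The 50 ppm default mirrors the RNA mass tolerance given in the Experimental Section.

```python
# Assumed per-charge mass correction in Da (hydrogen atom mass); this value
# reproduces the "observed molecular mass" column of Table 1 and is not
# stated in the paper.
H_MASS = 1.007825


def neutral_mass(mz, charge, h_mass=H_MASS):
    """Neutral molecular mass from the m/z of an [M - nH]^n- ion."""
    n = abs(charge)
    return n * (mz + h_mass)


def delta_ppm(observed, theoretical):
    """Mass error in ppm, using the sign convention inferred from Table 1."""
    return (theoretical - observed) / theoretical * 1e6


def within_tolerance(observed, theoretical, tol_ppm=50.0):
    """True if the mass error lies inside a symmetric ppm window."""
    return abs(delta_ppm(observed, theoretical)) <= tol_ppm


if __name__ == "__main__":
    # Peak 1 of Table 1: m/z 639.0663, charge -2, theoretical mass 1280.1487 Da.
    mass = neutral_mass(639.0663, -2)
    print(round(mass, 4), round(delta_ppm(1280.1484, 1280.1487), 1))
```

Keeping the correction mass as a parameter makes it easy to substitute the proton mass if a different convention is preferred.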
Figure 1 and Table 1). These results showed that the in-gel digestion method could easily identify an RNA, even one containing post-transcriptional modifications, at the femtomole level by LC-MS followed by an Ariadne-assisted genome database search.
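The fragment lists that such a search relies on follow from the cleavage rule of RNase T1, which cuts on the 3′ side of guanosine. The sketch below is a simplified in-silico digest under stated assumptions: it handles plain one-letter sequences only, ignores modified residues and the 3′-phosphate versus 2′,3′-cyclic phosphate end chemistry, and is not how Ariadne itself is implemented. The function name and parameters are hypothetical; `max_missed` mirrors the missed-cleavage setting in the search parameters, and `min_len=4` reflects the observation that mono-, di-, and trinucleotides were not covered. The example sequence is the concatenation of residues 66-71 and 72-76 from Table 1.

```python
def rnase_t1_digest(seq, max_missed=0, min_len=1):
    """Cut an RNA sequence 3' of every G; optionally allow missed cleavages."""
    pieces = []
    start = 0
    for i, base in enumerate(seq):
        if base == "G":
            pieces.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        # 3'-terminal piece that does not end in G
        pieces.append(seq[start:])

    fragments = []
    for missed in range(max_missed + 1):
        # Join (missed + 1) adjacent pieces to model uncut G sites.
        for j in range(len(pieces) - missed):
            frag = "".join(pieces[j:j + missed + 1])
            if len(frag) >= min_len:
                fragments.append(frag)
    return fragments


if __name__ == "__main__":
    # 3'-terminal region of yeast tRNAPhe-1 (residues 66-76, unmodified).
    print(rnase_t1_digest("AAUUCGCACCA"))
    print(rnase_t1_digest("AAUUCGCACCA", max_missed=1))
```

Fragments are returned in order of increasing missed-cleavage count, so the fully cleaved products always come first.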
Practical Applications of the In-Gel Digestion Method. Small RNAs Isolated from Mouse ES Cells. The characterization of several different RNAs by the in-gel digestion method would establish the general applicability and robustness of the approach
Figure 3. LC-MS analysis of a 100-femtomole band of yeast tRNAPhe-1 digested in gel. Typical mass spectra of the RNase T1 digestion products (a) [CACCA-OH]2- and (b) [ACmUGmAAyWAΨm5CUG>p]3- of tRNAPhe-1. Abbreviations: Cm, 2′-O-methylcytidine; Gm, 2′-O-methylguanosine; yW, wybutosine; Ψ, pseudouridine; m5C, 5-methylcytidine. Chromatography was performed as described in the Experimental Section. Note that no sodium adduct ion (arrow) was observed in the MS spectrum under the solvent conditions employed. (c) MS/MS spectrum of [CACCA-OH]2-. The parent ion that has lost cytosine [P-B(C)]2- was a doubly charged product. All other assigned signals were singly charged products, unless indicated otherwise.
and demonstrate that the method does not introduce artificial modifications into the RNA. Hence, we prepared a fraction of small RNAs from mouse ES cells and subjected it to PAGE. The gel image (Figure 4a) revealed the fraction to be a complex mixture of RNAs in the range of 100-500 bases, each in quantities below several hundred nanograms in 3 µg of the mixture. All 11 bands were analyzed by LC-MS via the in-gel digestion method. Sets of oligonucleotides generated by MS and MS/MS analyses of each band were subjected to a search using Ariadne and the mouse small RNA database. Figure 4b illustrates a typical result, which assigned U3 snRNA with a significantly high score. Figure 4c shows the base peak chromatogram of the nucleolytic fragments derived from the U3 snRNA band, together with the assignments of each fragment determined by the MS analyses. Except for mono-, di-, and trinucleotides and a relatively large fragment, UUCUUCCCUCCUUUG, released by RNase T1 cleavage, the nucleotide sequences of the assigned fragments coincided with that of trimethylguanosine-capped U3 snRNA and covered the entire U3 sequence. Each RNA match found in the database was invariably verified in this way, and none was accepted based solely on a single retrieval from the automatic database search. The 11 bands were directly identified from the gel in which 3 µg of ES small RNA fraction was loaded (Figure 4a and Supporting Information Table 2). This RNA fraction included many common
small RNAs such as snRNAs and 5S and 5.8S rRNAs. The identified RNAs were based on multiple matches to gene families and splicing variants in the column of “accession numbers” (Supporting Information Table 2). Supporting Information Table 2 lists the oligonucleotide sequences we identified, and we estimated that one or more RNAs within the family or the variant were in the gel band. In total, the 1279 nucleotides identified in the 11 RNAs comprised 1261 unmodified and 30 post-transcriptionally modified nucleotides, including 12 that were partially modified (Supporting Information Table 2). The sequences covered 66.0% of the 1938 nucleotides of the 11 RNA sequences and were compatible with reported sequences.32 In addition, we identified three novel partially methylated nucleotides, namely, residue A32 of RNase P RNA, residue U4 of U6 snRNA, and residue U102 of 7SL RNA; however, the method could not determine whether a methyl group was bonded to the base or the sugar.

Identification of RNAs in an RNP Complex Pulled down from Yeast Cells. For the next validation of the in-gel digestion method, RNAs in an RNP complex pulled down by a tagged bait protein were chosen because this technique is indispensable for analysis of samples, such as this one, that contain small amounts of RNAs.

(32) Reddy, R. Nucleic Acids Res. 1988, 16 (Suppl), r71–85.
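The nucleotide totals quoted for the mouse ES small RNAs can be checked with simple arithmetic. The sketch below assumes that the 12 partially modified positions are counted in both the unmodified and the modified class; that reading is an interpretation that makes the quoted numbers add up, not something the text states explicitly. Variable and function names are illustrative.

```python
def coverage_percent(identified, total):
    """Sequence coverage as a percentage of all nucleotides."""
    return 100.0 * identified / total


unmodified = 1261          # nucleotides observed in unmodified form
modified = 30              # nucleotides observed in modified form
partially_modified = 12    # assumed to appear in both counts above

# Each partially modified position would otherwise be counted twice.
identified = unmodified + modified - partially_modified
total_nucleotides = 1938   # combined length of the 11 RNA sequences

if __name__ == "__main__":
    print(identified, round(coverage_percent(identified, total_nucleotides), 1))
```

Under this assumption the sum gives the 1279 identified nucleotides and the 66.0% coverage reported in the text.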
Figure 4. Identification of mouse ES small RNAs using in-gel digestion. (a) PAGE profile of RNA components in the ES small RNA fraction. The bands 1-8 (1, RNaseP RNA; 2, 7SL RNA; 3, U3 snRNA; 4, U2 snRNA; 5.1, U1 snRNA; 5.2, 5.8S rRNA; 6.1, U4 snRNA; 6.2, U8 snoRNA; 7.1, 5S rRNA; 7.2, U5 snRNA; 8, U6 snRNA), visualized by SYBR Gold staining, were subjected to LC-MS analysis. RNA size markers are indicated on the left. (b) Mapping score histogram of the band 3 (U3 snRNA) search result, which presents scores for all entries in the mouse small RNA database. Frequencies of entries within a 10-point scoring range were counted, converted to common logarithm of frequency +1, and plotted. A “hit” for the query is indicated by an arrow. (c) Base peak chromatogram of the RNase T1 digest of mouse U3 snRNA. A gel piece containing U3 snRNA was in-gel digested with RNase T1 and subjected to LC-MS analysis. Major oligoribonucleotide peaks assigned as RNase T1 fragments are indicated by arrows with the corresponding sequence. Detailed data for MS/MS-based assignment of each fragment are given in Supporting Information Table 2.
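The score histogram described for panel (b) bins the scores of all database entries in 10-point ranges and plots the common logarithm of (frequency + 1). A minimal sketch of that transformation is given below; the function name and the example scores are hypothetical, and the code only illustrates the binning described in the caption, not Ariadne's scoring.

```python
import math
from collections import Counter


def score_histogram(scores, bin_width=10):
    """Map each bin's lower edge to log10(frequency + 1)."""
    if not scores:
        return {}
    # Floor division assigns every score to its bin index.
    counts = Counter(int(s // bin_width) for s in scores)
    lo, hi = min(counts), max(counts)
    # Empty bins between the extremes are kept; log10(0 + 1) is 0.
    return {
        b * bin_width: math.log10(counts.get(b, 0) + 1)
        for b in range(lo, hi + 1)
    }


if __name__ == "__main__":
    # Hypothetical scores: many low-scoring entries and one clear "hit".
    example = [3, 7, 12, 95]
    for edge, value in score_histogram(example).items():
        print(edge, round(value, 3))
```

Adding 1 before taking the logarithm keeps empty bins at zero, so a single high-scoring entry remains visible next to the large population of low-scoring entries.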
We next chose the spliceosomal complex pulled down by tagged Brr2 as an example to demonstrate that the procedure is suitable for the analysis of samples containing a limited amount of RNA. The spliceosomal complex of S. cerevisiae comprises the U4/U6.U5 tri-snRNP, which contains the major RNA components U4, U5S, U5L, and U6, and small nuclear ribonucleoproteins (snRNPs).33 We purified the spliceosomal RNP complex, separated the proteins from the RNAs by phenol/chloroform extraction, and analyzed the protein composition by LC-MS/MS-based proteomics (Figure 5a and Supporting Information Table 3). In total, this analysis identified 44 proteins: all 26 components of the yeast Brr2-associated RNP complex,34-36 9 potential Brr2-binding proteins reported in the Saccharomyces Genome Database (SGD; http://www.yeastgenome.org/), and 9 previously unreported proteins identified on the basis of only one or two peptides. This suggests that our preparation mainly contained the Brr2-associated spliceosomal complex. The RNAs in the same Brr2-associated RNP complex, on the other hand, gave rise to four major bands with an approximate size of 100-200 nucleotides, as determined by 8 M urea-10% PAGE and SYBR Gold staining (Figure 5b). These bands were excised from the gel and subjected to in-gel digestion with RNase T1. The RNA fragments from the gel were analyzed with a nanoflow LC-MS system, followed by a search of the S. cerevisiae genome database using Ariadne. The analysis clearly identified the RNA bands as U4, U5L, U5S, and U6 snRNAs (Figure 5b, Supporting Information Tables 4 and 5). The nucleotide sequences of the assigned fragments coincided with those of the yeast snRNAs and covered their entire sequences except for the mono-, di-, and trinucleotides released by RNase T1 cleavage (Figure 5c and Supporting Information Table 5). RNase A cleavage could also be used for in-gel digestion. The fragments of the U6 snRNA band produced by RNase A cleavage were analyzed by LC-MS and assigned to
(33) Hacker, I.; Sander, B.; Golas, M. M.; Wolf, E.; Karagoz, E.; Kastner, B.; Stark, H.; Fabrizio, P.; Luhrmann, R. Nat. Struct. Mol. Biol. 2008, 15, 1206–1212.
(34) Gottschalk, A.; Neubauer, G.; Banroques, J.; Mann, M.; Luhrmann, R.; Fabrizio, P. EMBO J. 1999, 18, 4535–4548.
(35) Nash, R.; Weng, S.; Hitz, B.; Balakrishnan, R.; Christie, K. R.; Costanzo, M. C.; Dwight, S. S.; Engel, S. R.; Fisk, D. G.; Hirschman, J. E.; Hong, E. L.; Livstone, M. S.; Oughtred, R.; Park, J.; Skrzypek, M.; Theesfeld, C. L.; Binkley, G.; Dong, Q.; Lane, C.; Miyasato, S.; Sethuraman, A.; Schroeder, M.; Dolinski, K.; Botstein, D.; Cherry, J. M. Nucleic Acids Res. 2007, 35, D468–471.
(36) Stevens, S. W.; Barta, I.; Ge, H. Y.; Moore, R. E.; Young, M. K.; Lee, T. D.; Abelson, J. RNA 2001, 7, 1543–1553.
Analytical Chemistry, Vol. 82, No. 18, September 15, 2010
Figure 5. Protein and RNA composition of the purified yeast Brr2-associated snRNP complex. (a) SDS-PAGE profile of the control (Ctrl) and the Brr2-associated RNP complex (Brr2) pulled down with TAP-tagged Brr2 as the bait (visualized by Coomassie brilliant blue staining). The control experiment was performed using TAP-tagged Ist3, which forms a prespliceosomal RNP complex that shares no protein/RNA components with the Brr2-associated complex. Molecular mass markers are indicated on the left, and the proteins identified by LC-MS/MS analysis of individual bands excised from the gel are shown on the right. The star indicates the position of TAP-tagged Brr2. Proteins that were also identified by LC-MS analysis from the control Ist3 complex (indicated in parentheses) were considered as contaminants in our Brr2 complex preparation. (b) PAGE profile of RNA components in the Ctrl and the Brr2-associated complex. The bands, visualized by SYBR Gold staining and containing 3-11 ng of RNA (40-300 fmol), were subjected to LC-MS analysis. RNA size markers (in bases) are indicated on the left. (c) RNA sequence of yeast U6 snRNA and summary of RNA identification. Dashed arrows denote RNase T1-digested RNA fragments identified by this analysis, and solid arrows indicate those identified after digestion with RNase A.
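The fragment maps summarized in panel c of Figure 5 can be reasoned about with a simple in-silico digestion. The sketch below assumes the textbook specificities of the two enzymes (RNase T1 cleaves on the 3′ side of G; RNase A cleaves on the 3′ side of C and U) and treats fragments shorter than four nucleotides as unassignable, in line with the mono-, di-, and trinucleotides that were not covered. The sequence is an invented example, not yeast U6 snRNA, and the function names are hypothetical; modified residues, missed cleavages, and terminal chemistry are ignored.

```python
RNASE_T1 = "G"   # assumed specificity: cleaves 3' of guanosine
RNASE_A = "CU"   # assumed specificity: cleaves 3' of pyrimidines


def digest(seq, cut_after):
    """Split seq after every residue in cut_after; return (start, fragment) pairs."""
    fragments, start = [], 0
    for i, base in enumerate(seq):
        if base in cut_after:
            fragments.append((start, seq[start:i + 1]))
            start = i + 1
    if start < len(seq):
        fragments.append((start, seq[start:]))  # 3'-terminal fragment
    return fragments


def covered_positions(seq, cut_after, min_len=4):
    """Positions covered by fragments long enough to be mapped uniquely."""
    covered = set()
    for start, frag in digest(seq, cut_after):
        if len(frag) >= min_len:
            covered.update(range(start, start + len(frag)))
    return covered


seq = "AUCCAGGAAAAGCAUUUU"  # hypothetical 18-nt RNA, for illustration only

t1 = covered_positions(seq, RNASE_T1)
a = covered_positions(seq, RNASE_A)

print("RNase T1 fragments:", [f for _, f in digest(seq, RNASE_T1)])
print("RNase A fragments: ", [f for _, f in digest(seq, RNASE_A)])
print("covered by T1 only:", len(t1), "of", len(seq))
print("covered by A only: ", len(a), "of", len(seq))
print("covered by both:   ", len(t1 | a), "of", len(seq))
```

In this toy sequence, RNase T1 releases an isolated G that is too short to assign, whereas the RNase A fragment AGGAAAAGC spans that position, so the union of the two digests covers more of the sequence than either digest alone.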
the U6 sequence (Figure 5c). This analysis recovered the portion of the U6 snRNA sequence that had been missed by RNase T1 digestion, indicating that combining these two RNases of different specificity is an effective means of obtaining higher sequence coverage by this method. The in-gel digestion method readily determined the sequences of the 3′ and 5′ ends of a limited amount of RNA. The analysis of U4 and U5S snRNAs identified multiple 3′-terminal fragments containing 1, 2, or 3 uridines (AAUACCU(1-3)-OH; Supporting Information Table 5) and 3 or 4 uridines or 4 uridines with 1 cytidine at the 3′-end (AACU(3-4)-OH and AACU(4)C-OH; Supporting Information Table 5), respectively. Likewise, the analysis of U6 identified multiple 3′-terminal fragments consisting of a stretch of 4 to 8 uridines (U(4-8)p; Figure 5c and Supporting Information Table 5). These 3′-ends had previously been reported either as homogeneous or as heterogeneous with ambiguous structure, because of the low resolution of the data obtained by S1 nuclease mapping37,38 or by Northern blot analysis including sodium periodate or phosphatase treatment.39 The analysis also identified the 5′-terminal fragments AUCCUUAUG of U4 snRNA and AAG of U5S and U5L snRNAs (both with a 5′-trimethylguanosine cap), although the 5′-terminal fragment of U6 was not detected. These results showed unequivocally that our in-gel digestion method is capable of identifying RNAs in a limited amount of biological sample, such as key components of the spliceosomal RNP complex, and is also applicable to determining 5′-terminal post-transcriptional modifications and 3′-terminal microheterogeneity of RNA molecules. We have used different procedures to purify distinct complexes pulled down by tagged Brr2 (this study) and by tagged Lsm3,17 both containing the U4/U6.U5 tri-snRNP complex.33 We found that all the RNA sequences that we determined to have post-transcriptional modifications, including 5′ capping and 3′ microheterogeneity, were essentially the same in both complexes, suggesting that our in-gel digestion method is unaffected by the purification procedure and that similarities and differences in post-transcriptional modifications of RNA can be determined.
(37) Patterson, B.; Guthrie, C. Cell 1987, 49, 613–624.
(38) Siliciano, P. G.; Brow, D. A.; Roiha, H.; Guthrie, C. Cell 1987, 50, 585–592.
(39) Lund, E.; Dahlberg, J. E. Science 1992, 255, 327–330.
CONCLUSION
We describe here a novel method based on mass spectrometric characterization of RNAs via in-gel digestion of distinct bands resolved by PAGE. Using the method, RNAs separated by PAGE were analyzed by nanoflow reversed-phase LC with high-resolution MS, followed by DNA/RNA sequence database searches using Ariadne. The method recovered over 30% of the RNA from a gel piece as RNase-digested fragments, identified the RNA, and afforded the identification of post-transcriptional modifications from RNA samples at the hundred-femtomole level. Even smaller amounts of RNA (e.g.,