
RNAModMapper: RNA modification mapping software for analysis of liquid chromatography tandem mass spectrometry data
Ningxi Yu, Peter A. Lobue, Xiaoyu Cao, and Patrick A. Limbach
Anal. Chem., Just Accepted Manuscript • DOI: 10.1021/acs.analchem.7b01780 • Publication Date (Web): 24 Sep 2017


Analytical Chemistry is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.


Analytical Chemistry

RNAModMapper: RNA modification mapping software for analysis of liquid chromatography tandem mass spectrometry data

Ningxi Yu, Peter A. Lobue, Xiaoyu Cao and Patrick A. Limbach*

Rieveschl Laboratories for Mass Spectrometry, Department of Chemistry, University of Cincinnati, PO Box 210172, Cincinnati, Ohio 45221-0172, United States

*To whom correspondence should be addressed. Phone (513) 556-1871 Fax (513) 556-9239 Email [email protected]

Key words: modified nucleosides, RNA sequencing, tRNA, tandem mass spectrometry, LC-MS/MS

Running Title: RNAModMapper Sequencing Software

ACS Paragon Plus Environment


ABSTRACT

Liquid chromatography tandem mass spectrometry (LC-MS/MS) has proven to be a powerful analytical tool for the characterization of modified ribonucleic acids (RNAs). The typical approach for analyzing modified nucleosides within RNA sequences by mass spectrometry involves ribonuclease digestion followed by LC-MS/MS analysis and data interpretation. Here we describe a new software tool, RNAModMapper (RAMM), to assist in the interpretation of LC-MS/MS data. RAMM is a stand-alone package that requires user-submitted DNA or RNA sequences to create a local database against which collision-induced dissociation (CID) data of modified oligonucleotides can be compared. RAMM can interpret MS/MS data containing modified nucleosides in two modes: fixed and variable. In addition, RAMM can also utilize interpreted MS/MS data for RNA modification mapping back against the input sequence(s). The applicability of RAMM was first tested using total tRNA isolated from Escherichia coli. It was then applied to map modifications found in 16S and 23S ribosomal RNA from Streptomyces griseus.


Introduction

Ribonucleic acid (RNA) is a polymeric biomolecule that plays crucial roles in many biological processes. Once transcribed, RNA can undergo post-transcriptional modification, in which chemical functionalities are added enzymatically to the canonical nucleosides within the RNA sequence. A wide variety of chemical modifications to RNA have been identified, ranging from simple methylations to more complex modifications composed of multiple functional groups.1-3 Among all types of RNA, transfer RNA (tRNA) has been found to contain the greatest density of modifications,4 many of which have been associated with structural stability, assisting in codon/anticodon recognition during translation, or acting as determinants for aminoacyl synthetase recognition.5-9 The biological relevance of many of these modifications is of interest, and recent studies show that tRNA modification profiles (the complete pattern of modified nucleosides in a total tRNA sample) may correlate with the cellular stress response and human diseases.10-12

Understanding the functional importance of tRNA modification profiles can be a challenge due to the complexity of a total tRNA sample, given that cells can contain anywhere from 30 to 300 unique tRNA sequences depending on the organism.13 At present, the most useful approach for measuring tRNA modifications has been based on liquid chromatography tandem mass spectrometry (LC-MS/MS), wherein the total tRNA pool is digested to nucleosides and modified nucleoside levels are measured quantitatively.14 Total tRNA nucleoside profiling is relatively easy to implement with any sample,15,16 is based on well-established protocols,17 and offers multiple options for quantitative measurements.18-20 Moreover, total tRNA nucleoside profiling can reveal all modified nucleosides in the sample in a single experiment.

However, one limitation of examining tRNA modification profiles at the nucleoside level is the lack of information relating back to changes in individual tRNA sequence modification patterns. Thus, most nucleoside-based examinations focus on specific modifications, such as those found in the anticodon loop, that can be assigned to only a few specific tRNA sequences. Modern approaches that seek to identify specific tRNA sequence locations that are modified are based on either RNA-seq technology or mass spectrometry.21 RNA-seq methods have an advantage in throughput;22,23 however, special sample treatment is typically required to recognize sites of chemical modification.24,25 Mass spectrometry enjoys the advantage of being sensitive to every chemical modification within tRNA, although analysis is typically time-consuming and requires more sample than RNA-seq. Even with the advent of high-throughput genomic sequencing technologies, there remains an interest in mass spectrometry as a platform for tRNA modification mapping, with particular attention to developing new tools and methods that reduce sample consumption and analysis time.26-28

The most common platform for tRNA modification mapping by mass spectrometry is LC-MS/MS.29 RNA modification mapping by mass spectrometry was adapted from prior biochemical approaches to RNase mapping of RNA.30 A tRNA nucleoside profile is obtained and these modifications are placed onto the correct tRNA sequence context by digesting an individual tRNA or total tRNA pools with a base-specific ribonuclease that generates oligoribonucleotide digestion products amenable to LC-MS/MS. Collision-induced dissociation (CID) of oligonucleotides generates product ions that can be assigned using the McLuckey nomenclature as c-, y-, w- and a-B-type product ions.31-33 Mass differences between sequential c- and y-type fragment ions reveal the identity of canonical and modified nucleosides as well as their location within the oligonucleotide sequence. While this RNA modification mapping by mass spectrometry approach is quite effective, overall experimental throughput is most often limited by the data interpretation (assigning fragment ions to each MS/MS spectrum) and sequence annotation (mapping interpreted MS/MS data back onto the original RNA sequence) steps.

Given the low throughput of manual MS/MS data interpretation for oligonucleotides, a number of computational tools have been created to improve the automation of this step. The first tool for analyzing oligonucleotide MS/MS data was Simple Oligonucleotide Sequencer (SOS),34 an interactive program developed by Rozenski and McCloskey for ab initio oligonucleotide sequencing by mass spectrometry. Nyakas et al.35 developed the OMA and OPA software toolbox, which can analyze precursor and product ion spectra of oligonucleotides, oligonucleotide derivatives and oligonucleotide adducts with metal ions or drugs. While each of these tools has utility for simplifying MS/MS data interpretation, neither is applicable to large-scale tRNA modification mapping due to the inability to batch process large LC-MS/MS files.

The first computational platform applicable to RNA modification mapping at scale was Ariadne, developed by Nakayama et al.36 Ariadne is a web-based database search engine that uses MS/MS data of RNA digestion products to search against the sequence database to identify particular RNAs in biological samples. RMM is another database search program, which can search whole prokaryotic genomes or RNA FASTA sequence databases37 to identify the specific RNA within a sample. Unlike RMM, Ariadne does offer functionalities that provide the user control over the types of chemical modifications that may be present within the MS/MS datasets, although the database of organisms present in Ariadne is limited. More recently, the standalone program RoboOligo was developed to allow both automated de novo and manual analysis of modified oligonucleotide MS/MS spectra.38 RoboOligo was explicitly created to handle tRNAs and the large diversity of modifications present in those samples. However, RoboOligo only allows annotation of a single tRNA sequence at a time.


While these newer tools are a major improvement over manual interpretation of MS/MS data, our efforts at improving RNA modification mapping throughput remain hindered by the lack of software that can automate both the MS/MS data interpretation and tRNA sequence annotation steps. Thus, here we introduce a new tool for oligonucleotide MS/MS data interpretation built specifically for RNA modification mapping. RNAModMapper (RAMM) is able to interpret CID data from oligonucleotides, and then map interpreted MS/MS sequences onto RNA sequences. The capabilities of RAMM were evaluated using multiple tRNA samples. Further, to test its capabilities with other RNA types, bacterial 16S and 23S rRNAs were also analyzed and processed. As a standalone program built in an open source code environment, RAMM enables higher throughput RNA modification mapping by LC-MS/MS.

Materials and Methods

Materials

Escherichia coli MRE 600 total transfer ribonucleic acid, RNase T1, TRI-Reagent, calcium chloride, 1,1,1,3,3,3-hexafluoro-2-propanol (HFIP) and triethylamine (TEA) were purchased from Sigma Aldrich (St. Louis, MO). Ammonium acetate was purchased from Fisher Scientific (Fair Lawn, NJ). A Nucleobond AX 500 column was purchased from Macherey-Nagel (Düren, Germany). The RNAID™ kit with SPIN™ was purchased from MP Biomedicals (Solon, OH). LC-MS grade water and methanol were purchased from Honeywell B&J (Morristown, NJ).

Ribosomal RNA Isolation

Total RNA was obtained from Streptomyces griseus cultured in house as described previously.39 An S. griseus total RNA sample (88 µg) was run on a 1% low melting point agarose gel. The 16S and 23S rRNA bands were excised and individually treated with 600 µL of RNA binding salt solution, which was heated to 50 °C for 10 min. To each sample was added 2.2 µL of 10% acetic acid and 10 µL of RNAMATRIX™. The solutions were suspended for 10 min and centrifuged for 1 min to pellet the RNA/RNA matrix complex. The pellets were washed twice with RNA washing solution and suspended with sterile water. The RNA was eluted from each sample by incubating at 50 °C for 5 min, and then the solution was centrifuged for 2 min and the liquid with RNA was transferred to a spin filter before final centrifugation and storage.

Ribonuclease digestion

For RNase T1 digestion, 10 µg of RNA sample and 500 U of RNase T1 were combined in 220 mM ammonium acetate and incubated for 2 h at 37 °C. Samples were lyophilized and then rehydrated in mobile phase A (MPA: 200 mM HFIP, 8.15 mM TEA, pH = 7.0).

LC-MS/MS analysis

For low-resolution LC-MS/MS, RNase T1 digestion products were separated on a Waters XBridge C18 column (3.5 µm, 1 mm × 150 mm) with a gradient of 5% B to 20% B in 5 min; 20% B to 30% B to 95% B in 43 min; hold at 95% B for 5 min, followed by re-equilibration for another 15 min at 5% B, where mobile phase B is composed of 50:50 v:v MPA:methanol, pH = 7.0. Low-resolution data were acquired on a Thermo LTQ-XL mass spectrometer with a capillary temperature of 275 °C, spray voltage of 4 kV, sheath gas, auxiliary gas and sweep gas at 40, 10, and 10 arbitrary units, respectively, and a capillary voltage of -100 V. Samples were analyzed in negative polarity over an m/z range of 500 to 2000. Data dependent acquisition was used to collect MS/MS data at a normalized collision energy of 42% with an activation time of 30 ms. A full range mass scan was followed by four scans of the most abundant precursors from the full scan, with m/z values selected for CID analyzed for up to 10 scans before they were placed on a dynamic exclusion list for 30 s. Precursor ion selection was performed with an isolation width of 2.

For high-resolution LC-MS/MS, RNase T1 digestion products were separated on an Agilent Poroshell 120 EC-C18 column (2.7 µm, 1 mm × 75 mm) thermostatted at 50 °C at a flow rate of 80 µL/min with a gradient of 5% B for 5 min; 5% to 73% B in 60 min; 100% B for 5 min, followed by re-equilibration at 5% B for 20 min. The same mobile phase A and B compositions used in the low-resolution LC-MS/MS experiments were used for all high-resolution LC-MS/MS experiments. High-resolution data were acquired on a Waters Synapt G2-S HDMS mass spectrometer with a source temperature of 120 °C, desolvation temperature of 400 °C, capillary voltage of 2.5 kV, sampling cone of 55 V, source offset of 80 V, cone gas of 50 L/hr and desolvation gas of 700 L/hr. For all measurements, the mass spectrometer was operated in V-mode (sensitivity) with a typical resolving power of 15,000 FWHM (full width at half maximum). Samples were analyzed in negative-mode ESI over an m/z range of 500 to 2000 for MS and 300 to 2000 for MS/MS. Data dependent acquisition was used to collect MS/MS data (1 s scan) for a maximum of 3 ions per MS scan (0.2 s scan) using a collision energy ramp from 18 V to 38 V before being added to a dynamic exclusion list for 15 s. Precursor ion selection was performed with an isolation width of 2. Lockspray calibration was performed using a solution of leucine enkephalin (200 pg/µL) infused at 5 µL/min. Lockspray scans were collected for 1 s every 30 s with setmass at m/z 554.2615.


Data analysis

RNAModMapper (RAMM) was used for automated MS/MS data analysis and sequence annotation. The program and user manual are available as a free download from http://bearcatms.uc.edu/. RAMM was developed in Java and tested on Windows 10, Windows 8 and Windows 7. Data analysis was performed on a Dell Precision Tower 7910 running Windows 10 with an Intel Xeon E5-2630 CPU at 2.4 GHz and 128 GB RAM. E. coli tRNA sequences with modifications were obtained from the MODOMICS database.3 The 16S and 23S rRNA sequences of S. griseus were obtained from NCBI (http://www.ncbi.nlm.nih.gov/).

Results and Discussion

RAMM was developed as a tool to enable local processing of MS/MS spectra obtained from oligoribonucleotides, with an emphasis on handling chemically modified RNase digestion products. Unlike other RNA mass spectrometry software, RAMM uses user-generated sequence input files as the basis for mapping the interpreted MS/MS data, with full flexibility to handle more than 100 post-transcriptionally modified nucleosides. A schematic workflow of the process used for data analysis and annotation is shown in Figure 1. A typical RNA modification mapping experiment involves RNA sample preparation and enzymatic digestion followed by LC-MS/MS analysis of the RNase digestion products. This experimental workflow will generate the data file that is used by RAMM to interpret MS/MS data of interest, from which modified nucleosides can then be mapped onto RNA sequences.


Figure 1. The workflow of RNA modification mapping by LC-MS/MS.

The RAMM home screen is shown in Supplemental Figure 1. Two menus with options are available: Actions and Functions. The Actions menu allows the user to select between fixed and variable sequence position modification mapping and to identify the input files, modifications and tolerance parameters. The Functions menu allows the user to load an output file and export any output files as a .CSV formatted file. The type of mapping experiment (fixed or variable) that is desired is identified in the main user interface by selecting the Actions menu. The MS/MS data file (MGF) and the FASTA file containing the modified or unmodified sequences to be annotated are then chosen in the mapping window (Figure 2). The user also defines the enzyme used during sample preparation, precursor and product ion mass tolerances, mass type (average or monoisotopic), and other experimental processing parameters. Once the processing parameters have been chosen, the RNA sequences contained in the FASTA file are processed in silico to generate a local database of RNase digestion products against which the experimental MS/MS data is compared and scored. These interpreted MS/MS sequences can then be mapped back onto the full-length input sequence(s) to annotate the modifications in a location-specific manner. The key design and development features of RAMM will be described first, followed by selected demonstrations of the applicability of the software.

Figure 2. Screenshot of fixed sequence position modification mapping window. As denoted in the text, the user has full control over the input files and data processing parameters.

Input files


Input files are required in FASTA format. RNA modified nucleosides can be annotated in the input file by using the MODOMICS nomenclature.3 RNA gene (i.e., DNA) sequences can also be used as RAMM will convert directly to the predicted RNA sequence. All MS/MS raw data must be converted to the MGF (Mascot generic format) data format for input into RAMM. For Thermo LTQ-XL data, both the Mass Matrix File Conversion Tool40 and MSConvert41 were capable of converting the original RAW data file format to an MGF format that could be processed with RAMM. For Waters Synapt G2-S data, the RAW data file was first pre-processed in PLGS (ProteinLynx Global Server, Waters Corp.) for MS/MS spectra noise reduction and then exported in mzML format. MSConvert was then used to convert the mzML to an MGF format that could be processed with RAMM.
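The two input steps described above can be illustrated with a short Python sketch. It is not RAMM code (RAMM is written in Java and its internals are not shown here); it assumes that the DNA-to-RNA conversion is a plain T-to-U substitution on the coding strand, and it reads only the basic MGF layout (BEGIN IONS / KEY=value / "m/z intensity" / END IONS). The function names `dna_to_rna`, `read_mgf` and `precursor_mz` are hypothetical.

```python
from typing import Iterable, Iterator


def dna_to_rna(sequence: str) -> str:
    """Predict the RNA sequence from a DNA coding strand (assumed T -> U only)."""
    return sequence.upper().replace("T", "U")


def read_mgf(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per spectrum with its header parameters and peak list."""
    spectrum = None
    for raw in lines:
        line = raw.strip()
        if line == "BEGIN IONS":
            spectrum = {"params": {}, "peaks": []}
        elif line == "END IONS":
            if spectrum is not None:
                yield spectrum
            spectrum = None
        elif spectrum is None or not line:
            continue  # ignore anything outside a spectrum block
        elif line[0].isdigit():
            fields = line.split()
            if len(fields) >= 2:  # peak line: m/z followed by intensity
                spectrum["peaks"].append((float(fields[0]), float(fields[1])))
        elif "=" in line:
            key, _, value = line.partition("=")
            spectrum["params"][key] = value


def precursor_mz(spectrum: dict) -> float:
    """Return the precursor m/z, the first field of the PEPMASS header."""
    return float(spectrum["params"]["PEPMASS"].split()[0])
```

A file converted with MSConvert could then be read with `list(read_mgf(open(path)))`, where `path` is the user's own MGF file.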

In silico digestion

RAMM supports five different ribonucleases for in silico digestion: RNase T1, RNase U2, RNase A, RNase MC1 and Cusativin. The programmed selectivity of each enzyme is as follows: RNase T1 cleaves at the 3'-end of guanosine and N2-methylguanosine (m2G).42 RNase U2 cleaves at the 3'-end of guanosine and adenosine.43,44 RNase A cleaves at the 3'-end of unmodified pyrimidines. Cusativin cleaves at the 3'-end of cytidine45 and RNase MC1 cleaves at the 5'-end of uridine.46 The user can select among different 3'-termini products: linear phosphate, cyclic phosphate, and hydroxyl. The user also has the option to select up to 5 missed cleavages for any enzyme.
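As a rough illustration of in silico digestion with missed cleavages, the following Python sketch cuts a sequence on the 3' side of a chosen set of residues. It is a simplified assumption-laden example, not the RAMM implementation: sequences are plain single-letter strings, modified nucleosides such as m2G are not handled, 3'-terminus chemistry is ignored, and enzymes that cut on the 5' side (RNase MC1) would need a different rule. The function name `digest` and its parameters are hypothetical.

```python
from typing import List


def digest(sequence: str, cleave_after: str = "G", max_missed: int = 0) -> List[str]:
    """Return digestion products for an enzyme cutting 3' of `cleave_after` residues.

    `max_missed` is the largest number of missed cleavage sites allowed
    within a single product.
    """
    # First split the sequence into complete-digestion pieces.
    pieces = []
    start = 0
    for i, base in enumerate(sequence):
        if base in cleave_after:
            pieces.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        pieces.append(sequence[start:])  # 3'-terminal piece without a cut site

    # Then join neighbouring pieces to model 0..max_missed missed cleavages.
    products = []
    for i in range(len(pieces)):
        for missed in range(max_missed + 1):
            if i + missed < len(pieces):
                products.append("".join(pieces[i:i + missed + 1]))
    return products
```

Under these assumptions, RNase T1 corresponds to `cleave_after="G"`, RNase U2 to `"GA"`, RNase A to `"CU"` and Cusativin to `"C"`.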

MS/MS spectra interpretation


RAMM is designed such that the set of user FASTA files serves as a locally generated database of potential enzymatic digestion products against which the MS/MS data (in the MGF file) are searched and ranked. For each in silico generated digestion product, a molecular weight is calculated and serves as the initial comparison with the parent mass of the MS/MS query spectrum. If the mass difference is within the precursor mass tolerance, the predicted product ions of the in silico digestion product are calculated and compared against the peaks in the query spectrum with any matches determined by the product ion mass tolerance. The MS/MS data is then scored and ranked as described below. RAMM supports 120 chemical modifications/motifs and allows the user to designate single methylation and thiolation motifs (Supplemental Table S1). Because typical oligonucleotide MS/MS data cannot differentiate nucleobase ring positions that are modified (e.g., m1A versus m6A), methylations or thiolations to canonical nucleobases that cannot be differentiated by MS/MS data can be represented by only one modification motif (e.g., mA represents any methylated adenosine) in RAMM if the user desires. Pseudouridine is an isomer of uridine, thus it cannot be identified by RAMM in the MS/MS data. However, RAMM allows for up to five user-defined modifications to be added to the modification database, thus derivatization of pseudouridine47,48 or other modifications can be accounted for depending on experimental needs. RAMM was developed to account for two common situations in RNA modifications: fixed and variable sequence position modifications. For example, for tRNAs a significant number of modifications are known to only occur at the wobble base (position 34) of the tRNA sequence. In contrast, other modifications can occur at a variety of sequence locations in any individual tRNA. 
Fixed and variable sequence position modification searches are user-selected features of the program, although only one option is available at a time.
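The two-stage comparison described in this section (a precursor mass filter followed by product ion matching) can be sketched in Python as follows. This is an illustrative outline only: the candidate record layout (`sequence`, `mass`, `ions`), the function names, and the choice to keep the most intense peak when several fall within tolerance are all assumptions, and the masses in any usage would come from the user's own in silico database.

```python
from typing import Dict, List, Tuple

Peak = Tuple[float, float]  # (m/z, intensity)


def match_product_ions(theoretical: Dict[str, float], peaks: List[Peak],
                       tol: float) -> Dict[str, Peak]:
    """Map each theoretical ion label to the most intense peak within `tol`."""
    matched = {}
    for label, mz in theoretical.items():
        hits = [pk for pk in peaks if abs(pk[0] - mz) <= tol]
        if hits:
            matched[label] = max(hits, key=lambda pk: pk[1])
    return matched


def search_spectrum(candidates: List[dict], precursor_mass: float,
                    peaks: List[Peak], precursor_tol: float,
                    product_tol: float) -> List[Tuple[str, Dict[str, Peak]]]:
    """Keep candidates within the precursor tolerance, then match their ions."""
    results = []
    for cand in candidates:
        if abs(cand["mass"] - precursor_mass) > precursor_tol:
            continue  # fails the initial precursor mass comparison
        results.append((cand["sequence"],
                        match_product_ions(cand["ions"], peaks, product_tol)))
    return results
```

The matched ions returned for each surviving candidate are what a scoring step would then consume.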


Scoring function

The scoring function of RAMM is built upon both a normalized binomial distribution probability score49 and a dot product score,50 which have been used previously in the interpretation of peptide MS/MS data in proteomics. RAMM incorporates both approaches, adapted for oligonucleotide fragmentation schemes, with the final reported score being the product of the two scores. The cumulative binomial distribution probability49 (Equation 1) is generated from the c-, y-, w-, and a-B-type ions that arise from oligonucleotide fragmentation in CID.

$$P(n, p, N) = \sum_{k=n}^{N} \binom{N}{k}\, p^{k} (1 - p)^{N-k} \qquad \text{(Eq. 1)}$$

where N is the total number of theoretical c-, y-, w-, and a-B-type ions, n is the total number of matched theoretical ions, and p is the probability of matching one theoretical product ion. P(n,p,N) is a calculated P-value of matching at least n out of N theoretical ions by chance. The P-value will be very small for a true positive. As the P-value depends on the length of the oligonucleotide and the number of matched ions, it is more useful to convert this to a P-score, which is normalized to the oligonucleotide length as in Equation 2. If all the theoretical ions are found in the MS/MS spectrum, the value S(P) will be 100, regardless of the length of the oligonucleotide.

$$S(P) = \frac{100 \cdot \log P(n, p, N)}{N \cdot \log p} \qquad \text{(Eq. 2)}$$
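The cumulative binomial P-value and the length-normalized P-score described above can be checked numerically with a few lines of Python. This is a minimal sketch rather than RAMM's code: the function names are hypothetical, and the per-ion match probability `p` is treated as a user-supplied number because the text does not state how it is estimated.

```python
import math


def binomial_p_value(n: int, N: int, p: float) -> float:
    """Probability of matching at least n of N theoretical ions by chance."""
    return sum(math.comb(N, k) * p ** k * (1.0 - p) ** (N - k)
               for k in range(n, N + 1))


def p_score(n: int, N: int, p: float) -> float:
    """Length-normalized score: 100 * log(P) / (N * log(p))."""
    return 100.0 * math.log(binomial_p_value(n, N, p)) / (N * math.log(p))
```

When every theoretical ion is matched (n = N), the P-value reduces to p^N, so the ratio of logarithms is 1 and the score is 100 whatever the oligonucleotide length, which agrees with the statement in the text.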

RAMM allows the P-score to be weighted by the relative abundance differences between c/y-type ions and a-B/w-type ions for oligoribonucleotides. McLuckey and co-workers found that c/y-type ions have higher relative abundance as compared to a-B/w-type ions for samples containing a 2'-hydroxyl.33 To account for these fragmentation channel differences, the program default P-score calculation is as given in Equation 3.

$$S(P) = \left[0.7 \cdot S(P)_{c/y}\right] + \left[0.3 \cdot S(P)_{a\text{-}B/w}\right] \qquad \text{(Eq. 3)}$$

The two weighting factors were estimated from the relative contribution of c/y (70%) and a-B/w (30%) dissociation channels at low and high excitation amplitudes from typical MS/MS data obtained during these studies. These default values can be adjusted by the user to match the experimental data obtained from the user's own mass spectrometer.

A limitation of scoring MS/MS data using only the P-score is that multiple sequences could match the data, resulting in similar or identical P-score values with no easy means of differentiating the true sequence from false positives. To improve the accuracy of MS/MS data assignments, RAMM also incorporates a dot product approach to calculate the similarity between observed and reconstructed spectra (Equation 4).50

$$\text{Dot product} = \frac{\sum I_{observed} \times I_{reconstructed}}{\sqrt{\sum I_{observed}^{2} \times \sum I_{reconstructed}^{2}}} \qquad \text{(Eq. 4)}$$

$I_{observed}$ and $I_{reconstructed}$ are the ion abundances of the observed and reconstructed spectra, respectively. Figure 3 provides an example of spectrum reconstruction in RAMM. In this example, the representative experimental data is separated into those m/z values that match, within the user-specified product ion tolerance, the predicted product ion m/z values for the particular sequence (generated by the in silico database spectrum) and those experimental m/z values that do not match. The ion abundances for the matching m/z values are retained at their original values while ion abundances for all non-matching m/z values are averaged. A dot product of 1 would signify that all of the experimentally most abundant ions arise from only those m/z values generated by the database spectrum. Low dot product scores arise when the most abundant ions in the experimental data set do not match the predicted m/z values of the database spectrum.
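The spectrum reconstruction and dot product steps can be illustrated with the Python sketch below. It follows one literal reading of the description (matched peaks keep their observed abundance, every unmatched peak is replaced by the mean abundance of the unmatched peaks) and is not taken from the RAMM source; the function names and the index-based way of marking matched peaks are assumptions made for the example.

```python
import math
from typing import List, Set


def reconstruct_spectrum(intensities: List[float], matched: Set[int]) -> List[float]:
    """Keep matched peak abundances; replace unmatched ones by their average."""
    unmatched = [v for i, v in enumerate(intensities) if i not in matched]
    average = sum(unmatched) / len(unmatched) if unmatched else 0.0
    return [v if i in matched else average for i, v in enumerate(intensities)]


def dot_product(observed: List[float], reconstructed: List[float]) -> float:
    """Normalized dot product between observed and reconstructed abundances."""
    numerator = sum(o * r for o, r in zip(observed, reconstructed))
    denominator = math.sqrt(sum(o * o for o in observed)
                            * sum(r * r for r in reconstructed))
    return numerator / denominator if denominator else 0.0
```

For example, with observed abundances `[100, 10, 30, 50]` and the first and last peaks matched, the reconstructed spectrum is `[100, 20, 20, 50]`. As stated in the scoring section, the final reported score is the product of the P-score and this dot product.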

Figure 3. Spectrum reconstruction. The observed MS/MS spectrum is compared against m/z values generated by in silico CID of two oligonucleotide sequences (A and B). In this example, A and B have the same number of matched product ions, although at different overall ion abundance. To reconstruct spectral data for calculating the dot product, the matched ions (red) are kept, and the intensity of unmatched ions (grey) is the average intensity of all the unmatched ions in the observed spectrum.

Although scoring is built primarily around conventional c- and y-type fragment ions, one unique feature of some modified nucleosides is their propensity to fragment through direct nucleobase loss. These labile nucleobases include 7-methylguanosine (m7G), lysidine (k2C), queuosine (Q), epoxyqueuosine (oQ), and N6-threonylcarbamoyladenosine (t6A). When these bases are present in the sample, nucleobase loss is the primary fragmentation channel during MS/MS resulting in CID data with few or very low abundance c- and y-type ions. Another similar example is that RNase digestion products ending in a 3'-phosphate often dissociate during CID by loss of phosphoric acid, which again can lead to reduced c- and y-type ions in the MS/MS spectrum.51 To account for this

ACS Paragon Plus Environment

Page 17 of 28

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Analytical Chemistry

behavior, the RAMM scoring function supports neutral loss events (Supplemental Table S2) during the dot product calculation step. The type of mass spectrometer (e.g. low-resolution LTQ-XL versus high-resolution Synapt G2-S) used during the LC-MS/MS experiment can have an impact on the MS/MS data interpretation output. Since the unit-resolution of the LTQ-XL is unable to sufficiently resolve the mass difference between consecutive U and C nucleotides (mass difference = 1 Da), there is the potential for RAMM to interpret the spectrum associated with a theoretical digestion product, for example UCCCGp, with both sequence isomers (e.g. CUCCGp, CCUCGp, etc.) and C-to-U substitutions (e.g. UUCCGp, UCUCGp, etc.). In these instances, the analyst must choose the correctly interpreted spectrum through examination of the precursor and product ion mass errors, P-score, and dot product for each of the possible interpretations generated by RAMM. Generally, the interpretation that provides the lowest mass errors, highest P-score, and highest dot product represents the correct assignment. LC-MS/MS mapping experiments performed on the Synapt provided sufficient resolution for differentiation of oligonucleotides containing consecutive C and U nucleotides. Therefore, by setting narrower precursor and product ion mass tolerances (0.02 Da and 0.1 Da, respectively) in RAMM, which are consistent with the resolution and mass accuracy of the Synapt, significantly fewer potential interpretations will be produced for a given digestion product. The overall quality and characteristics of the MS/MS spectrum can also impact RAMM output. During the course of running mapping experiments on both the LTQ-XL and Synapt, it was observed that the MS/MS spectra generated on the Synapt generally result in lower dot product scores than those generated on the LTQ-XL for the same digestion product. 
It is believed that this is due to the inherently different nature of how CID spectra for oligonucleotides are produced on an ion-beam type (Synapt) and trapping type (LTQ-XL) mass spectrometer. The instrument parameters that provide the highest relative abundance of sequence-specific product ions on the Synapt also tend to yield a higher relative abundance of precursor ion in the MS/MS spectrum than that observed on the LTQ-XL (Supplemental Figure S2). As the RAMM scoring function does not currently take the precursor ion into account, this leads to a bias in the dot product score for MS/MS spectra containing a high abundance of precursor ion.

Interpreted and scored MS/MS results are shown on four main panels (Figure 4). Panel A lists interpreted MS/MS spectra in retention time order. Each interpreted spectrum can be selected by the user. Once selected, detailed information is shown within Panels B, C and D. Panel B shows the mass error between the observed and theoretical precursor ion masses, the P-score, the dot product, and the number of matched ions. Panel C presents the interpreted MS/MS spectrum; matched product ions are labeled and highlighted. Panel D gives the value and mass error for each matched ion from the 5'-end to the 3'-end.
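The effect of the mass tolerance on the number of candidate interpretations, discussed above for UCCCGp, can be illustrated with a minimal Python sketch. This is not RAMM code: the `Candidate` class, the `within_tolerance` helper, and the observed offset of 0.005 Da are hypothetical, and the example uses the monoisotopic C-to-U shift (O replacing NH, about 0.984 Da) for both tolerance settings as a simplification.

```python
from dataclasses import dataclass

# Monoisotopic mass shift for replacing one C with one U (O replaces NH).
C_TO_U_SHIFT = 15.994915 - (14.003074 + 1.007825)  # about 0.984 Da


@dataclass
class Candidate:
    sequence: str
    mass_offset: float  # theoretical mass minus the UCCCGp mass, in Da


def within_tolerance(observed_offset, candidates, tol):
    """Return sequences whose theoretical mass is within tol (Da) of the observed mass."""
    return [c.sequence for c in candidates
            if abs(c.mass_offset - observed_offset) <= tol]


# Hypothetical candidate list: two sequence isomers and two C-to-U substitutions.
candidates = [
    Candidate("UCCCGp", 0.0),
    Candidate("CUCCGp", 0.0),
    Candidate("UUCCGp", C_TO_U_SHIFT),
    Candidate("UCUCGp", C_TO_U_SHIFT),
]

# Assumed observed precursor offset of 0.005 Da relative to UCCCGp.
wide = within_tolerance(0.005, candidates, tol=1.0)     # low-resolution setting
narrow = within_tolerance(0.005, candidates, tol=0.02)  # high-resolution setting
```

In this sketch the 1.0 Da tolerance retains all four candidates, whereas the 0.02 Da tolerance removes the C-to-U substitutions. The sequence isomers share the same precursor mass and can only be separated by their product ions.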

Figure 4.

Screenshot of user interface. A) Interpreted MS/MS spectra listed by retention time. B) MS/MS data assignment features and statistics including precursor mass error, outputs of scoring functions, and number of detected fragment ions. C) For the selected data from A, the interpreted MS/MS spectrum is shown with assigned fragment ions highlighted. D) Tabular output of the data in C including mass error for each fragment ion assignment.

Scoring Function Performance

As described above, the scoring function has two components: the P-score and the dot product. RAMM uses the product of these two components to rank-order MS/MS assignments. To characterize the ability of RAMM to correctly identify the true MS/MS sequence assignment, a FASTA file containing only E. coli tRNA gene sequences (Supplemental Table S3) and the 27 known post-transcriptional modifications in E. coli tRNAs (excluding pseudouridine; Supplemental Table S4) were used as inputs for spectral interpretation. For these evaluations, both the variable sequence and fixed position functions were used. For low-resolution Thermo LTQ-XL experiments, an average mass calculation was used, and the precursor and product ion mass tolerances for MS/MS data interpretation were set to 1.0 Da. For high-resolution Waters Synapt G2-S experiments, a monoisotopic mass calculation was used; the precursor mass tolerance was set to 0.02 Da and the product ion mass tolerance was set to 0.1 Da.

RAMM spectral interpretation results were verified manually and then classified into two groups: MS/MS spectra known to be present in the sample that were correctly interpreted by RAMM, and MS/MS spectra known to be present that were incorrectly interpreted by RAMM. Based on the results, receiver operating characteristic (ROC) curves (ref 52) were created by plotting the sensitivity and specificity, as shown in Figure 5. Sensitivity is defined as true positive/(true positive + false negative) and specificity is defined as true negative/(true negative + false positive). Here, true positives are correct MS/MS interpretations with scores above the user-defined scoring threshold; false negatives are correct MS/MS interpretations with scores below the scoring threshold; true negatives are incorrect MS/MS interpretations with scores below the scoring threshold; and false positives are incorrect MS/MS interpretations with scores above the scoring threshold.
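The sensitivity and specificity definitions above can be sketched in Python as follows. This is an illustrative sketch and not the authors' evaluation code: the function names, the toy scores, and the choice to count a score equal to the threshold as "above" are all assumptions. The combined score is taken as the product of the P-score and dot product, as stated in the text.

```python
def combined_score(p_score, dot_product):
    """Ranking score: product of the P-score and the dot product."""
    return p_score * dot_product


def sensitivity_specificity(results, threshold):
    """results is a list of (score, is_correct) pairs from manual verification."""
    tp = sum(1 for s, ok in results if ok and s >= threshold)
    fn = sum(1 for s, ok in results if ok and s < threshold)
    tn = sum(1 for s, ok in results if not ok and s < threshold)
    fp = sum(1 for s, ok in results if not ok and s >= threshold)
    sens = tp / (tp + fn) if tp + fn else 0.0
    spec = tn / (tn + fp) if tn + fp else 0.0
    return sens, spec


def roc_auc(results, thresholds):
    """Trapezoidal area under the (1 - specificity, sensitivity) curve."""
    points = []
    for t in thresholds:
        sens, spec = sensitivity_specificity(results, t)
        points.append((1.0 - spec, sens))
    # Anchor the curve at (0, 0) and (1, 1), then integrate left to right.
    points = [(0.0, 0.0)] + sorted(points) + [(1.0, 1.0)]
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Toy, made-up interpretations: (P-score, dot product, manually verified as correct?)
toy = [(90, 0.9, True), (80, 0.8, True), (75, 0.5, False),
       (60, 0.4, True), (50, 0.3, False)]
results = [(combined_score(p, d), ok) for p, d, ok in toy]

sens, spec = sensitivity_specificity(results, threshold=50)
auc = roc_auc(results, thresholds=range(1, 101))
```

Sweeping the threshold from 1 to 100 mirrors the range used for Figure 5. The toy data are invented for illustration only and do not reproduce the reported AUC values.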

Figure 5.

The receiver operating characteristic (ROC) curve for sequence mapping results of fixed (red) and variable (blue) sequence position modifications over the range of scoring thresholds from 1 to 100 for low resolution MS/MS data.

The areas under the ROC curves (AUC) were 0.94 and 0.89 for spectral interpretation of fixed sequence position modifications and variable sequence position modifications, respectively. These values were obtained using low-resolution data from the LTQ-XL, comprising 486 MS/MS spectra and 177 RNase T1 digestion product sequences, where the scoring threshold is applied to the product of the P-score and dot product.

Overall, RAMM was able to correctly characterize MS/MS spectra containing modified nucleosides from the data set tested for either fixed or variable modification options. The relatively lower AUC for variable sequence position modifications is mainly caused by false positives, as a larger number of incorrect modified MS/MS spectra were scored. These results reinforce that, while RAMM can be a powerful and effective tool for the automated interpretation of MS/MS data and subsequent RNA sequence mapping, the results are not completely without error and user interaction with the interpreted data remains warranted.

The program was tested for its ability to correctly interpret LC-MS/MS data containing modified nucleosides from a typical untargeted analysis by using an RNase T1 digestion of total tRNA from E. coli. This sample has been extensively characterized (ref 39), and the modification status of each E. coli tRNA is known (ref 3). There are a total of 73 unique theoretical RNase T1 digestion products containing at least one post-transcriptionally modified nucleoside (excluding pseudouridine), as listed in Supplemental Table S5. RAMM was able to accurately interpret the MS/MS spectra for 58 of these digestion products using both low-resolution (LTQ-XL) and high-resolution (Synapt G2-S) instruments, consistent with past manual interpretation outcomes (ref 39).

A P-score of 70 can be considered significant for LC-MS/MS data generated on both a low-resolution (LTQ-XL) and high-resolution (Synapt) mass spectrometer. However, due to the inherently different character of the MS/MS spectra for oligonucleotides generated on these two mass spectrometers, a dot product score of 0.85 was considered significant on the LTQ-XL and a dot product score of 0.65 was considered significant on the Synapt. Although the use of appropriate scoring thresholds provides a way to minimize the number of incorrect interpretations, RAMM does not eliminate the need for manual inspection of the data interpretations. While RAMM can improve the workflow and throughput, manual inspection of data will remain a significant limitation when working with a large number of RNase digestion products.
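The instrument-specific significance thresholds quoted above can be written as a small lookup, sketched here in Python. The `THRESHOLDS` table and `is_significant` function are hypothetical names, and requiring both scores to be at or above their thresholds is an assumption about how the two criteria would be combined; it is not a description of RAMM's internal logic.

```python
# Thresholds taken from the text; the structure and names are illustrative.
THRESHOLDS = {
    "LTQ-XL": {"p_score": 70.0, "dot_product": 0.85},
    "Synapt": {"p_score": 70.0, "dot_product": 0.65},
}


def is_significant(p_score, dot_product, instrument):
    """Return True if both scores meet the thresholds for the given instrument."""
    limits = THRESHOLDS[instrument]
    return (p_score >= limits["p_score"]
            and dot_product >= limits["dot_product"])


# The same pair of scores can pass on one instrument and fail on the other.
on_synapt = is_significant(85.0, 0.70, "Synapt")
on_ltq = is_significant(85.0, 0.70, "LTQ-XL")
```

Interpretations that pass such a filter would still need the manual inspection described above.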

RNA Modification Mapping

To examine the capabilities of the program to map MS/MS data onto RNA sequences, a FASTA file containing 40 unique tRNA sequences with 27 known post-transcriptional modifications (Supplemental Table S6) was used. MS/MS interpretation was performed using only fixed sequence position modifications. The length of matched oligonucleotides varied from dimers to a 16-mer, and all matched digestion products are aligned under the input sequences. An example of the mapping result is shown in Figure 6, where matched digestion products are aligned under the tRNASer(UGA) sequence.

Figure 6.

Representative example of sequence mapping with fixed modifications. The modifications are listed above the sequence, and matched digestion products, which arise from interpreted MS/MS data, are aligned under the sequence. The depth of each digestion product indicates how many times an MS/MS spectrum was found and interpreted in the data file.

When unmodified RNA sequence files are input, RAMM will align both unmodified and modified digestion products with differences highlighted. In addition, the number of MS/MS spectra that are interpreted and then mapped onto a particular sequence region is reported to provide additional information on the MS/MS data quality. It should be noted that interpreted MS/MS spectra can be aligned to more than one sequence when available. For example, using the E. coli total tRNA RNase T1 digest sample, the MS/MS spectrum interpreted as CACCGp can align with four of the 40 tRNA sequences used here (tRNAGln(UUG), tRNAGln(CUG), tRNASer(UGA) or tRNATrp(CCA)). The mapping function will not discriminate among the four sequences. Additional confidence in the quality of the mapping results can be achieved by exploring one or more of the experimental options that exist to increase sequence coverage (ref 29). Overall, it is important for the user to be aware that this software cannot overcome known limitations of the RNA modification mapping protocol when only a single RNase is used. Future iterations of the software are envisioned that could compile mapping information from experiments using multiple RNases to better classify the confidence in the mapping/sequence annotation step for mixtures of RNAs.
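The mapping ambiguity described above, where one digestion product such as CACCGp aligns with several input sequences, can be sketched in Python. This is an illustrative sketch only: `rnase_t1_digest` and `map_product` are hypothetical helpers, the digest assumes complete cleavage after every G with no missed cleavages, modifications are ignored, and the three toy sequences are invented rather than real tRNA sequences.

```python
import re


def rnase_t1_digest(sequence):
    """Split an RNA sequence after every G and return (product, start) pairs.

    Products ending in G carry a 'p' suffix for the 3'-phosphate, matching
    the notation used in the text (e.g. CACCGp).
    """
    products = []
    for match in re.finditer(r"[^G]*G|[^G]+$", sequence):
        fragment = match.group()
        label = fragment + "p" if fragment.endswith("G") else fragment
        products.append((label, match.start()))
    return products


def map_product(product, sequences):
    """Return {sequence name: [start positions]} for every sequence yielding the product."""
    hits = {}
    for name, seq in sequences.items():
        positions = [start for label, start in rnase_t1_digest(seq)
                     if label == product]
        if positions:
            hits[name] = positions
    return hits


# Toy sequences, invented for illustration only.
toy_sequences = {
    "toy_RNA_1": "AGCACCGUUGA",
    "toy_RNA_2": "UUGCACCGCACCGA",
    "toy_RNA_3": "AUCGGUCA",
}

hits = map_product("CACCGp", toy_sequences)
```

Here the same product maps to two of the three toy sequences, and nothing in the product itself indicates which parent it came from. This is the situation in which additional RNases or longer digestion products would be needed.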

S. griseus 16S rRNA and 23S rRNA sequence mapping

RAMM was next used to annotate modifications from rRNA digests. Unlike tRNAs, rRNAs have fewer types of modifications, with pseudouridine and methylations being the most common. To test the ability of RAMM to identify methylations, S. griseus 16S (1570 nt) and 23S (3208 nt) rRNAs (Supplemental Table S7) were digested with RNase T1 and analyzed by LC-MS/MS. For data interpretation, the following modifications (or motifs) were selected: mA, Am, mCm, Cm, mC, mG, Gm, mU, Um and mUm.

Supplemental Table S8 lists the modified oligonucleotides that were identified in the S. griseus 16S and 23S rRNA samples. A methyl group on the base or sugar increases the nucleoside mass by 14 Da, and RAMM is able to distinguish whether the methyl group is on the base or the sugar by using a-B ions and base loss ions. As an example of data interpretation quality, the RNase T1 digestion product UC[mA]CGp from 16S rRNA was identified with a P-score of 100.0, which indicates that all the a-B/w-type and c/y-type ions were found in the spectral data. Another example is the digestion product UC[Am]CGp, which was interpreted with a P-score of 92.99, reflecting the missing a3-B and Am base loss from the molecular ion assignments. Supplemental Figure S3 provides detailed information on these interpretations.
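The reasoning behind using base loss to locate a methyl group can be sketched in Python. This is a simplified illustration and not RAMM's algorithm: it assumes an oligonucleotide with a single adenosine that is already known to carry one methyl group (as in UC[mA]CGp versus UC[Am]CGp), it considers only the neutral base loss from the precursor, and the function name and 0.1 Da tolerance are hypothetical. A methyl on the base leaves with the base, while a methyl on the ribose stays with the backbone.

```python
# Monoisotopic masses in Da.
CH2 = 12.0 + 2 * 1.007825                            # mass added by methylation
ADENINE = 5 * 12.0 + 5 * 1.007825 + 5 * 14.003074    # neutral adenine, C5H5N5


def methyl_location(observed_base_loss, tol=0.1):
    """Classify a methylated adenosine from the mass of the neutral base lost.

    Loss of methyladenine implies base methylation (mA); loss of unmodified
    adenine implies the methyl group is on the sugar (Am).
    """
    if abs(observed_base_loss - (ADENINE + CH2)) <= tol:
        return "base"
    if abs(observed_base_loss - ADENINE) <= tol:
        return "sugar"
    return "undetermined"


location_ma = methyl_location(149.07)   # hypothetical observed loss
location_am = methyl_location(135.05)   # hypothetical observed loss
```

When the diagnostic base loss is not observed, as noted above for UC[Am]CGp, a sketch like this would return "undetermined" and the assignment would have to rest on other ions such as the a-B series.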

Conclusions

In this paper, we reported a new program, RAMM, for the stand-alone analysis of LC-MS/MS data of oligonucleotides. RAMM was developed with the particular goals of allowing modified nucleosides to be identified within MS/MS data and of enabling RNA modification mapping. For the former, RAMM uses a two-stage scoring function to improve the accuracy of MS/MS sequence interpretation. Besides c-, y-, w-, and a-B-type ions, RAMM can also account for neutral and base loss ions that are often encountered in oligoribonucleotide MS/MS data. Although the program is preconfigured for the known RNA modification motifs, a user-controlled feature allows custom modifications (e.g., synthetic modified RNA) to be entered to handle other modifications of interest.

Supporting Information.

Modifications supported by RAMM, neutral loss values, E. coli gene and tRNA sequences, E. coli tRNA RNase T1 digestion products, S. griseus 16S and 23S rRNA sequences and identified modified RNase T1 digestion products, RAMM main panel, comparison of low and high resolution data and representative example of scoring differentiation (PDF)

Acknowledgements

The authors would like to thank Drs. Collin Wetzel and Robert L. Ross for helpful discussions and all lab members for program testing. Financial support of this work was provided by the National Science Foundation (NSF CHE1507357). The generous support of the University of Cincinnati and the Rieveschl Eminent Scholar Endowment for these studies is also appreciated.

References

(1) Limbach, P. A.; Crain, P. F.; McCloskey, J. A. Nucleic Acids Res 1994, 22, 2183-2196.
(2) Cantara, W. A.; Crain, P. F.; Rozenski, J.; McCloskey, J. A.; Harris, K. A.; Zhang, X.; Vendeix, F. A.; Fabris, D.; Agris, P. F. Nucleic Acids Res 2011, 39, D195-D201.
(3) Machnicka, M. A.; Milanowska, K.; Oglou, O. O.; Purta, E.; Kurkowska, M.; Olchowik, A.; Januszewski, W.; Kalinowski, S.; Dunin-Horkawicz, S.; Rother, K. M.; Helm, M.; Bujnicki, J.; Grosjean, H. Nucleic Acids Res 2013, 41, D262-D267.
(4) Bjork, G. R.; Ericson, J. U.; Gustafsson, C. E.; Hagervall, T. G.; Jonsson, Y. H.; Wikstrom, P. M. Ann Rev Biochem 1987, 56, 263-285.
(5) Persson, B. C. Mol Microbiol 1993, 8, 1011-1016.
(6) Hagervall, T. G.; Ericson, J. U.; Esberg, K. B.; Ji-nong, L.; Björk, G. R. Biochim Biophys Acta 1990, 1050, 263-266.
(7) Muramatsu, T.; Nishikawa, K.; Nemoto, F.; Kuchino, Y.; Nishimura, S.; Miyazawa, T.; Yokoyama, S. Nature 1988, 336, 179-181.
(8) Helm, M.; Alfonzo, J. D. Chem Biol 2014, 21, 174-185.
(9) Phizicky, E. M.; Alfonzo, J. D. FEBS Lett 2010, 584, 265-271.
(10) Chan, C. T.; Pang, Y. L. J.; Deng, W.; Babu, I. R.; Dyavaiah, M.; Begley, T. J.; Dedon, P. C. Nature Commun 2012, 3, 937.
(11) Kirino, Y.; Yasukawa, T.; Ohta, S.; Akira, S.; Ishihara, K.; Watanabe, K.; Suzuki, T. Proc Natl Acad Sci USA 2004, 101, 15070-15075.
(12) Esteller, M. Nature Rev Genet 2011, 12, 861-874.

(13) Chan, P. P.; Lowe, T. M. Nucleic Acids Res 2016, 44, D184-D189.
(14) Su, D.; Chan, C. T.; Gu, C.; Lim, K. S.; Chionh, Y. H.; McBee, M. E.; Russell, B. S.; Babu, I. R.; Begley, T. J.; Dedon, P. C. Nature Prot 2014, 9, 828-841.
(15) Rose, R. E.; Quinn, R.; Sayre, J. L.; Fabris, D. RNA 2015, 21, 1361-1374.
(16) Thüring, K.; Schmid, K.; Keller, P.; Helm, M. Methods 2016, 107, 48-56.
(17) Pomerantz, S. C.; McCloskey, J. A. Methods Enzymol 1990, 193, 796-824.
(18) Brückl, T.; Globisch, D.; Wagner, M.; Müller, M.; Carell, T. Angew Chem Int Ed 2009, 48, 7932-7934.
(19) Kellner, S.; Ochel, A.; Thüring, K.; Spenkuch, F.; Neumann, J.; Sharma, S.; Entian, K.-D.; Schneider, D.; Helm, M. Nucleic Acids Res 2014, 42, e142-e142.
(20) Russell, S. P.; Limbach, P. A. J Chromatogr B 2013, 923–924, 74-82.
(21) Limbach, P. A.; Paulines, M. J. WIREs: RNA 2016, 8, e1367.
(22) Wang, Z.; Gerstein, M.; Snyder, M. Nature Rev Genet 2009, 10, 57-63.
(23) Tserovski, L.; Marchand, V.; Hauenschild, R.; Blanloeil-Oillo, F.; Helm, M.; Motorin, Y. Methods 2016, 107, 110-121.
(24) Cozen, A. E.; Quartley, E.; Holmes, A. D.; Hrabeta-Robinson, E.; Phizicky, E. M.; Lowe, T. M. Nat Methods 2015, 12, 879-884.
(25) Zheng, G.; Qin, Y.; Clark, W. C.; Dai, Q.; Yi, C.; He, C.; Lambowitz, A. M.; Pan, T. Nat Methods 2015, 12, 835-837.
(26) Suzuki, T.; Suzuki, T. In Methods Enzymol, Gott, J. M., Ed., 2007, pp 231-239.
(27) Suzuki, T.; Ikeuchi, Y.; Noma, A.; Suzuki, T.; Sakaguchi, Y. In Methods Enzymol; Academic Press, 2007, pp 211-229.
(28) Wetzel, C.; Limbach, P. A. Analyst 2016, 141, 16-23.
(29) Ross, R.; Cao, X.; Yu, N.; Limbach, P. A. Methods 2016, 107, 73-78.
(30) Kowalak, J. A.; Pomerantz, S. C.; Crain, P. F.; McCloskey, J. A. Nucleic Acids Res 1993, 21, 4577-4585.
(31) McLuckey, S. A.; Habibi-Goudarzi, S. J Am Chem Soc 1993, 115, 12085-12095.
(32) McLuckey, S. A.; Van Berkel, G. J.; Glish, G. L. J Am Soc Mass Spectrom 1992, 3, 60-70.
(33) Huang, T.-y.; Kharlamova, A.; Liu, J.; McLuckey, S. A. J Am Soc Mass Spectrom 2008, 19, 1832-1840.
(34) Rozenski, J.; McCloskey, J. A. J Am Soc Mass Spectrom 2002, 13, 200-203.
(35) Nyakas, A.; Blum, L. C.; Stucki, S. R.; Reymond, J.-L.; Schürch, S. J Am Soc Mass Spectrom 2013, 24, 249-256.
(36) Nakayama, H.; Akiyama, M.; Taoka, M.; Yamauchi, Y.; Nobe, Y.; Ishikawa, H.; Takahashi, N.; Isobe, T. Nucleic Acids Res 2009, 37, e47.
(37) Matthiesen, R.; Kirpekar, F. Nucleic Acids Res 2009, 37, e48.
(38) Sample, P. J.; Gaston, K. W.; Alfonzo, J. D.; Limbach, P. A. Nucleic Acids Res 2015, 43, e64.
(39) Cao, X.; Limbach, P. A. Anal Chem 2015, 87, 8433-8440.
(40) Xu, H.; Freitas, M. A. Proteomics 2009, 9, 1548-1555.
(41) Kessner, D.; Chambers, M.; Burke, R.; Agus, D.; Mallick, P. Bioinformatics 2008, 24, 2534-2536.
(42) Steyaert, J. Eur J Biochem 1997, 247, 1-11.
(43) Uchida, T.; Arima, T.; Egami, F. J Biochem 1970, 67, 91-102.
(44) Houser, W. M.; Butterer, A.; Addepalli, B.; Limbach, P. A. Anal Biochem 2015, 478, 52-58.
(45) Rojo, M. A.; Arias, F. J.; Iglesias, R.; Ferreras, J. M.; Muñoz, R.; Escarmís, C.; Soriano, F.; López-Fando, J.; Méndez, E.; Girbés, T. Planta 1994, 194, 328-338.
(46) Addepalli, B.; Lesner, N. P.; Limbach, P. A. RNA 2015, 21, 1746-1756.

(47) Mengel-Jørgensen, J.; Kirpekar, F. Nucleic Acids Res 2002, 30, e135.
(48) Addepalli, B.; Limbach, P. A. J Biol Chem 2016, 291, 22327-22337.
(49) Beausoleil, S. A.; Villén, J.; Gerber, S. A.; Rush, J.; Gygi, S. P. Nature Biotech 2006, 24, 1285-1292.
(50) Yen, C.-Y.; Houel, S.; Ahn, N. G.; Old, W. M. Mol Cell Proteomics 2011, 10, M111.007666.
(51) Krivos, K. L.; Addepalli, B.; Limbach, P. A. Rapid Commun Mass Spectrom 2011, 25, 3609-3616.
(52) Kapp, E. A.; Schütz, F.; Connolly, L. M.; Chakel, J. A.; Meza, J. E.; Miller, C. A.; Fenyo, D.; Eng, J. K.; Adkins, J. N.; Omenn, G. S. Proteomics 2005, 5, 3475-3490.

Table of Contents Graphic