
Article

Interconversion of Peptide Mass Spectral Libraries Derivatized with iTRAQ or TMT Labels
Zheng Zhang, Xiaoyu Yang, Yuri A. Mirokhin, Dmitrii V. Tchekhovskoi, Weihua Ji, Sanford P. Markey, Jeri Roth, Pedatsur Neta, Deniz Baycin Hizal, Michael A. Bowen, and Stephen E. Stein
J. Proteome Res., Just Accepted Manuscript • DOI: 10.1021/acs.jproteome.6b00406 • Publication Date (Web): 07 Jul 2016


Journal of Proteome Research

Interconversion of Peptide Mass Spectral Libraries Derivatized with iTRAQ or TMT Labels

Zheng Zhang¹, Xiaoyu Yang¹, Yuri A. Mirokhin¹, Dmitrii V. Tchekhovskoi¹, Weihua Ji¹, Sanford P. Markey¹, Jeri Roth¹, Pedatsur Neta¹, Deniz Baycin Hizal², Michael A. Bowen², Stephen E. Stein¹*

¹ Mass Spectrometry Data Center, National Institute of Standards and Technology, 100 Bureau Drive, Gaithersburg, Maryland 20899 USA
² Antibody Discovery and Protein Engineering Department, MedImmune LLC, One MedImmune Way, Gaithersburg, Maryland 20878 USA

Correspondence to: *E-mail: [email protected]. Tel: +1 301 975 2505.

KEYWORDS: peptide mass spectral library, iTRAQ, TMT, isobaric tag, spectral conversion

ABBREVIATIONS: iTRAQ, isobaric tags for relative and absolute quantitation; TMT, Tandem Mass Tag; FDR, false discovery rate; PSM, peptide spectral match; PD, Proteome Discoverer; NIST, National Institute of Standards and Technology; NCBI, National Center for Biotechnology Information; NIH/NCI, National Institutes of Health/National Cancer Institute; CPTAC, Clinical Proteomic Tumor Analysis Consortium; HCD, higher-energy collisional dissociation; I2T, iTRAQ 4-plex to TMT conversion; T2I, TMT to iTRAQ 4-plex conversion; I2I, iTRAQ 4-plex to iTRAQ 8-plex conversion.


ABSTRACT

Derivatization of peptides with isobaric tags such as iTRAQ and TMT is widely employed in proteomics because these tags are compatible with multiplexed quantitative measurements. We recently made publicly available a large peptide library derived from iTRAQ 4-plex labeled spectra. This resource has not been used for identifying peptides labeled with related tags of different mass, because the m/z values of virtually all precursor ions and most product ions differ between tags, and because each tag produces its own tag-specific peaks. We describe a method for interconverting spectra from iTRAQ 4-plex to TMT (6- and 10-plex) and to iTRAQ 8-plex. Spectra are interconverted by appropriately mass shifting sequence ions and discarding derivative-specific peaks. After an analogous ‘cleaning’ of search spectra, the converted libraries perform well in terms of peptide spectral matches, as shown by comparison with sequence database searches and by comparing search effectiveness for original and converted libraries. At 1% FDR, TMT labeled query spectra yield 97% as many matches against a converted iTRAQ library as against an original TMT library. Overall, this interconversion strategy provides a practical way to extend results from one derivatization method to others that share related chemistry and do not significantly alter fragmentation profiles.

INTRODUCTION

Matching the tandem mass spectrum of a peptide ion to a validated, annotated reference spectrum in a library provides a fast and reliable means of assigning sequences, modifications, and charge states (1, 2). A central requirement for applying this method is the availability of sufficiently complete, appropriate, high-quality libraries of tandem mass spectra of peptide ions. As new data are collected and processed, libraries have been increasing in proteome coverage. However, the recent increase in popularity of beam-type collision cells relative to ion traps has required building new spectral libraries, because the higher resolution of such spectra, their strong energy dependence, and their wider m/z range produce spectra that may differ substantially from ion trap fragmentation spectra. Consequently, we have recently been compiling and validating libraries of spectra created by beam-type collision-cell fragmentation.

Another qualitative variation in peptide spectra arises from labeling the peptides with different tags. In principle, a separate library is required for each labeling method, since different label types generate different mass shifts. In this work, we describe a method for interconverting spectra from the two most popular isobaric labeling strategies, iTRAQ (4- or 8-plex) (3, 4) and TMT (6- or 10-plex) (5) labels (Figure 1 shows their structural formulas). These conversions are aided by the fact that both methods use labels of similar chemical basicity that add to amino groups at the N-terminus and at lysine residues. The success of this interconversion allows the publicly available annotated human and mouse peptide libraries of iTRAQ 4-plex spectra (peptide.nist.gov) to be employed for assignment of spectra from TMT and iTRAQ 8-plex studies.

METHODS

We employ three distinct steps to interconvert iTRAQ and TMT peptide spectral libraries. The process is presented below for the case of searching TMT 6-plex spectra against an iTRAQ 4-plex library. Other interconversions follow similar paths and are discussed later.

(1) Shift Library Spectrum Peaks: The same chemistry is used to add TMT and iTRAQ tags (6), so a tag attaches to each lysine and to the N-terminus of each peptide. Peak list interconversion therefore simply involves shifting the m/z values of the precursor and product ions that contain tags. This step starts with a spectral library in which peaks have been reliably annotated. For each annotated ion, the number of tags is counted and multiplied by the known mass shift for converting an iTRAQ tag to a TMT tag (85.06087 Da) (6); this product is added to the mass of the ion (obtained by multiplying its m/z by its charge, z), and the sum is divided by the charge to give the shifted m/z.

(2) Discard Library Spectrum Peaks: Three varieties of iTRAQ 4-plex peaks contain no sequence information and, since they cannot be present in TMT spectra, were discarded (see Table 1 for details):
a. iTRAQ reporter ions
b. iTRAQ fragments
c. iTRAQ ion losses from the precursor
We sought but did not observe any significant iTRAQ losses from fragment ions, so no such corrections were needed.

(3) Discard Search Spectrum Peaks: As with iTRAQ labels, peaks generated solely by TMT fragmentation contain no sequence information and are readily identified even in an otherwise unidentified spectrum. In principle, these could be removed by the search software, but we chose to discard them prior to searching (see Table 1):
a. TMT reporter ions
b. TMT fragments
c. TMT ion losses from the precursor
As for iTRAQ, we found no significant TMT losses from peptide fragment ions, so we expect such pathways to have minimal effect on search performance.
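Steps (1) and (2) can be illustrated with a minimal sketch. It is written in Python for illustration only; it is not the authors' R code. The `Peak` structure, its field names, and the `tag_specific` flag are hypothetical stand-ins for whatever the annotated msp peak list provides. Only the 85.06087 Da shift and the arithmetic (mass = m/z × z, add the per-tag shift, divide by z) come from the text.

```python
from dataclasses import dataclass
from typing import List

# Per-tag mass difference for converting iTRAQ 4-plex to TMT, as quoted in the text (Da).
ITRAQ4_TO_TMT_SHIFT = 85.06087


@dataclass
class Peak:
    """One annotated library peak (hypothetical structure, not the msp schema)."""
    mz: float
    intensity: float
    charge: int                 # charge state z of the annotated ion
    n_tags: int                 # number of isobaric tags carried by the ion
    tag_specific: bool = False  # reporter, tag fragment, or precursor-loss peak


def shift_mz(mz: float, charge: int, n_tags: int,
             shift: float = ITRAQ4_TO_TMT_SHIFT) -> float:
    """Step (1): multiply m/z by z, add n_tags * shift, divide by z."""
    mass = mz * charge
    return (mass + n_tags * shift) / charge


def convert_peaks(peaks: List[Peak]) -> List[Peak]:
    """Steps (1) and (2): drop tag-specific peaks, shift the remaining ones."""
    converted = []
    for p in peaks:
        if p.tag_specific:
            continue  # step (2): carries no sequence information
        new_mz = shift_mz(p.mz, p.charge, p.n_tags)
        converted.append(Peak(new_mz, p.intensity, p.charge, p.n_tags))
    return converted
```

For example, a doubly charged ion at m/z 500 carrying two tags moves by 2 × 85.06087 / 2 = 85.06087 m/z units, whereas an untagged ion is left unchanged.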
Spectral Library Search: Testing and refinement of search performance against the derived TMT library used the freely available MSPepSearch software (peptide.nist.gov). Search tolerances were 10 ppm for precursor ions and 0.1 Da for fragment ions. The maximum number of output hits was set to 1. The library was built from the ASCII msp format using the program Lib2NIST (peptide.nist.gov). Proteome Discoverer 2.1 (referred to as PD 2.1, Thermo Scientific) was employed to perform spectral library searches using its built-in MSPepSearch node. The fragment ion tolerance was set to 0.1 Da because this was the lowest setting allowed by PD 2.1.

Determining the Cutoff Score at 1% FDR: A target-decoy strategy was employed to establish score cutoffs for achieving 1% false discovery rates (FDRs). A score cutoff was selected so that the number of peptide spectral matches (PSMs) from searching a decoy library was approximately 1% of the number of PSMs from searching the target library. For simplicity, the original unconverted spectral library served as the decoy library for a converted library, and a converted library served as the decoy library for an unconverted library. Equivalent results were obtained using a precursor-shifted library as a decoy, as described by Cheng et al. (7)

Sequence Database Search: To further monitor relative performance changes of converted libraries, we used the Sequest HT search (8) as implemented in PD 2.1. Mass tolerance settings were the same as in the library search. Only the top-ranked match (search engine rank 1) was retained and the FDR was set at 1%, using the NCBI Aug. 2014 human fasta file with 55 926 sequences.

Data Sources: The iTRAQ4 libraries of human and mouse tryptic peptides are available online (peptide.nist.gov). The version used in the present work was created in Nov. 2014 and is an output of the NIH/NCI CPTAC program (proteomics.cancer.gov), which conducted highly fractionated shotgun proteomic studies of human tumors (9). Spectra in this library version originated from HCD fragmentation and are the best-quality spectra (e.g., highest MSGF+ score) for each identified peptide ion at each reported fragmentation energy. The Human iTRAQ4 library contains about 1.2 million spectra and covers approximately one-third of all identifiable tryptic peptides in the fasta sequence file used in its creation (peptide.nist.gov). TMT 6- or 10-plex and iTRAQ 8-plex labeled spectra were obtained from MedImmune LLC (test data sets and libraries are summarized in Table 2).

Conversion Code: Spectra were interconverted between tags starting from data in the ASCII msp format (peptide.nist.gov) using code written in R. The code is available upon request.
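The cutoff selection described under "Determining the Cutoff Score at 1% FDR" can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the procedure implemented in MSPepSearch or PD 2.1: the function name and the rule of returning the lowest qualifying cutoff are assumptions, and the inputs are taken to be the top-hit scores from the target-library and decoy-library searches.

```python
from typing import Optional, Sequence


def cutoff_at_fdr(target_scores: Sequence[float],
                  decoy_scores: Sequence[float],
                  fdr: float = 0.01) -> Optional[float]:
    """Lowest score cutoff at which decoy PSMs / target PSMs <= fdr.

    Returns None if no cutoff among the observed target scores qualifies.
    """
    for cutoff in sorted(set(target_scores)):  # ascending: first hit is the lowest
        n_target = sum(s >= cutoff for s in target_scores)
        n_decoy = sum(s >= cutoff for s in decoy_scores)
        if n_target > 0 and n_decoy / n_target <= fdr:
            return cutoff
    return None
```

The scan over candidate cutoffs is quadratic in the number of PSMs, which keeps the sketch short; for data sets of the size in Table 2, sorting both score lists once and sweeping them together would be preferable.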

RESULTS AND DISCUSSION

Proof of Concept

Figure 2 illustrates the alignment of fragment ion peaks, before and after peak shifting as described above, for selected peptide ions containing 2 tags, 1 tag and 3 tags. These examples demonstrate the high degree of similarity in the fragmentation patterns of TMT and iTRAQ labeled peptides.


To test and optimize the library interconversion process, we started with a relatively small peptide spectral library, mouse I4 (Table 2), and converted it to simulate TMT spectra (referred to as the mouse I2T library). We did not need to differentiate between the 6- and 10-plex tags because their tag masses and the conversion steps are the same. To evaluate the effectiveness of this trial converted mouse I2T library, we searched the TMT-DS1 data set (Table 2) against it with the MSPepSearch tool. After reporter ions, TMT fragments (tag ions), and TMT ion losses from the precursor (parent ions) were discarded from the input TMT-DS1 spectra, spectral matching improved, as evidenced by higher scores and by increased numbers of identifications above the score cutoff. This observation is consistent with other reports of improved peptide identification after removing some tag-related ions from the raw data in conventional sequence database searches (10).

To examine more closely the effects of removing reporter, tag, and parent ions, we applied various combinations of these peak removal steps. The numbers of spectral matches with a score above 200 for the different cases are given in Table 3. Removing reporter ions led to an increase of about 30% in peptide matches, while removing the tag or parent ions had smaller positive effects. As expected, removing any two types of ions gave a larger improvement than removing either one alone, and removing all three types of ions from search spectra had the greatest impact, increasing the number of identifications by more than 50%. In all following analyses, these three ion types were removed from input query spectra before library searching, unless specified otherwise. In practice, tag-specific peak removal could also be accomplished by a preprocessing step in the search program, since all removed peaks depend only on the tag and on the precursor mass and charge.
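Because the removed peaks depend only on the tag and on the precursor m/z and charge, such a preprocessing step can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the authors' procedure. The reporter and Tag+H+ values are the TMT 6-plex entries of Table 1. The 0.05 m/z matching window and the function names are hypothetical. Only the precursor and its H2O/NH3 losses are handled; the Prec-Rep, Prec-Tag and Prec-Rep-CO rows of Table 1 are omitted because the text does not spell out their charge bookkeeping.

```python
from typing import List, Tuple

# TMT 6-plex reporter ions and Tag+H+ from Table 1 (singly charged m/z).
TMT6_REPORTERS = [126.127725, 127.124760, 128.134433,
                  129.131468, 130.141141, 131.138176]
TMT_TAG_PLUS_H = 230.170208
H2O = 18.010565  # monoisotopic mass of water (Da)
NH3 = 17.026549  # monoisotopic mass of ammonia (Da)


def tag_specific_mzs(precursor_mz: float, charge: int) -> List[float]:
    """m/z values to discard: reporters, Tag+H+, precursor and its neutral losses."""
    losses = [0.0, H2O, NH3, H2O + NH3]
    precursor_related = [precursor_mz - loss / charge for loss in losses]
    return TMT6_REPORTERS + [TMT_TAG_PLUS_H] + precursor_related


def clean_spectrum(peaks: List[Tuple[float, float]], precursor_mz: float,
                   charge: int, tol: float = 0.05) -> List[Tuple[float, float]]:
    """Keep only (m/z, intensity) peaks farther than tol from every unwanted m/z."""
    unwanted = tag_specific_mzs(precursor_mz, charge)
    return [(mz, inten) for mz, inten in peaks
            if all(abs(mz - u) > tol for u in unwanted)]
```

A fixed absolute window is the simplest choice here; a ppm-based window would be a natural alternative for high-resolution data.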
Performance of Converted Spectral Libraries

Our pilot study demonstrated the feasibility of converting existing peptide spectral libraries to other types of target libraries and showed that results improve further when conversion is combined with tag-specific peak removal. To ensure that the performance of converted spectral libraries is comparable to that of unconverted libraries, we report several types of comparisons. First, we compare the results of searching spectra carrying one label (TMT or iTRAQ) against a library converted from spectra carrying the other label with the results of searching an unconverted library. Second, we use a set of unconverted and converted spectra as input for Sequest database searches. Third, we use an experimental data set as input for both library and Sequest database searches. These are individually described below.

1) Comparing Performance of Converted Libraries and Unconverted Libraries. We first examined the relative performance of converted and unconverted libraries for TMT labeled query spectra. For this purpose, we converted the NIST human I4 library to a TMT target spectral library (referred to as human I2T) and created a conventional (unconverted) TMT library (human T) directly from a TMT proteomic study. To eliminate any differences in results due to the contents of these libraries, only the 24 612 peptide ions present in both libraries (human I2T and T) were used to make “common subsets” of these libraries (Table 4). As query spectra, we used the 25 808 selected spectra in the TMT-DS2 data set. Further details of these libraries and query spectra are given in Table 2. Cutoffs were determined using the decoy libraries described in the Methods section, and the cutoff scores are presented in Table 4. At 1% FDR, 14 106 peptide ions were identified by searching the human T library and 13 735 by searching the I2T library. Of those 13 735 peptide ions, 99% were also identified with the unconverted human T library. These results indicate that, for the same query spectra, searches against the human T and I2T libraries give nearly the same results.

We then examined performance in the reverse direction, by searching iTRAQ query spectra against a library derived from TMT labeled peptides. For this purpose, using the general rules presented above, the 25 808 TMT query spectra were converted to iTRAQ 4-plex spectra and searched against both a converted library (human T2I, derived from the human T library) and our unconverted human I4 library (Table 4). At 1% FDR, 13 963 peptide ions were identified by searching the human T2I library and 13 877 by searching the I4 library. Of those 13 877 peptide ions, 99% were also identified with the T2I library, again confirming the near equivalence in quality of the spectra in converted and unconverted libraries.

2) Using Unconverted and Converted Spectra as Input to Perform Sequest Searches. To further examine the comparability of unconverted and converted spectra, we used library spectra as input data for (re)identification by Sequest. Specifically, we used the “common subset” described above of 24 612 unconverted and converted spectra (all with tag-specific ions removed) as input query spectra for Sequest database searches.
At 1% FDR, Sequest yielded PSMs for 98.3% (24 182) of the 24 612 unconverted TMT spectra and for 97.6% (24 034) of the 24 612 converted I2T spectra. About 99% of the ions identified from the converted spectra were also identified from the unconverted spectra. In addition, the distributions of Sequest XCorr values were nearly identical for the two searches.

3) Comparing Results of Sequest Searches and Library Searches. To compare the effectiveness of the converted library with that of conventional sequence searching, results of library searching were compared to results from Sequest. For this purpose, the 441 511 MS/MS spectra from TMT-DS2 (Table 2) were searched against a human fasta sequence database (NCBI Aug. 2014 version, 55 926 sequences) using the Sequest HT node in PD 2.1, and the results were compared with those from searching the human I2T library using the MSPepSearch node in PD 2.1. Table 5 shows the numbers of PSMs and peptides identified at 1% FDR. The two approaches led to comparable levels of identification, with the library search yielding more PSMs than Sequest (143 680 vs 134 898) and nearly the same number of peptides (45 819 vs 45 644). While the majority of peptide matches were common to the two search methods (36 073), both identified significant numbers of unique sequences (9746 vs 9571, respectively), suggesting that spectral library and sequence database searches are complementary and can lead to better coverage if combined. Of the 9571 peptides uniquely identified by Sequest, 6046 were not present in the library. The other 3525 were present in the library, but their library spectra were derived from different charge states or had different modifications. Considering that peptides in the I2T library represent only about 40% of the tryptic sequences in the human sequence database (unpublished data), it is apparent that, at their present state of development, converted spectral libraries have overall performance similar to that of sequence database search methods. Of course, of the approximately 10 000 peptide ions uniquely identified by Sequest, those with good-quality spectra could be added to the library to further improve results.

Interconversion of Spectral Libraries to Other Target Libraries

Data shown so far were obtained by converting iTRAQ 4-plex to TMT spectra. To further evaluate the feasibility of interconverting spectral libraries, we converted the human I4 library to mimic an iTRAQ 8-plex target library (referred to as human I2I). Input MS/MS query spectra (326 673) were from iTRAQ8-DS (Table 2). In the low m/z region, from about 145 to 223, we found 4 tag-related clusters (Table 1). Since they do not contribute to sequence identification, these low-m/z clusters were removed. Results from searching the I2I library and from the Sequest database search are presented in Table 5. The I2I library search identified more peptides (23 285 vs 15 171) and PSMs (59 817 vs 34 707) than the Sequest search. In addition, both approaches identified unique peptides (9708 vs 1594), again suggesting their complementary nature. Of the 1594 peptides uniquely identified by Sequest, 806 were not included in the library.
The other 788 were present in the library, but their library spectra were derived from different charge states or had different modifications.

Summary

Adding a new modification to a conventional sequence search engine is, in principle, straightforward, involving only the shifting of peaks containing the modification. This process is inherently more difficult for spectral libraries, where mass shifting requires that all peaks be reliably annotated with the correct sequence and that any changes in the fragmentation profile be considered. In this study we present an example where such modification switching should work very well, since the modifications appear at the same locations (N-terminus and lysine residues) and are chemically very similar. We find that this strategy does in fact work without loss of search performance, enabling libraries of the most popular isobaric tags to be interconverted.

Work is underway to generalize the in-silico addition of modifications to spectral libraries. Such spectrum transformations require accounting for any differences in fragmentation caused by the modification, as for example in methionine oxidation, where loss of CH3SOH is often observed in the spectrum. We also find that differences between spectra containing isobaric tags and label-free spectra can include significant shifts in the relative intensities of sequence ions. We are now working on developing methods for making such transformations using mass-shifted libraries of annotated spectra and peak scoring methods that incorporate modification-specific effects.

ACKNOWLEDGMENTS

We thank Raghothama Chaerkady and Wen Yu from MedImmune for their comments on this manuscript. We also acknowledge support from the NIH/NCI CPTAC program through an Interagency Agreement, ACO15005, with NIST.

NIST COMMERCIAL DISCLAIMER

Certain commercial equipment, instruments, or materials are identified in this paper in order to specify the experimental procedure adequately. Such identification is not intended to imply recommendation or endorsement by the National Institute of Standards and Technology, nor is it intended to imply that the materials or equipment identified are necessarily the best available for the purpose.

REFERENCES

(1) Lam, H.; Deutsch, E. W.; Eddes, J. S.; Eng, J. K.; Stein, S. E.; Aebersold, R. Building consensus spectral libraries for peptide identification in proteomics. Nat. Methods 2008, 5, 873-875.
(2) Zhang, X.; Li, Y.; Shao, W.; Lam, H. Understanding the improved sensitivity of spectral library searching over sequence database searching in proteomics data analysis. Proteomics 2011, 11, 1075-1085.
(3) Ross, P. L.; Huang, Y. N.; Marchese, J. N.; Williamson, B.; Parker, K.; Hattan, S.; Khainovski, N.; Pillai, S.; Dey, S.; Daniels, S.; Purkayastha, S.; Juhasz, P.; Martin, S.; Bartlet-Jones, M.; He, F.; Jacobson, A.; Pappin, D. J. Multiplexed protein quantitation in Saccharomyces cerevisiae using amine-reactive isobaric tagging reagents. Mol. Cell. Proteomics 2004, 3, 1154-1169.
(4) Choe, L.; D'Ascenzo, M.; Relkin, N. R.; Pappin, D.; Ross, P.; Williamson, B.; Guertin, S.; Pribil, P.; Lee, K. H. 8-plex quantitation of changes in cerebrospinal fluid protein expression in subjects undergoing intravenous immunoglobulin treatment for Alzheimer's disease. Proteomics 2007, 7, 3651-3660.
(5) Thompson, A.; Schäfer, J.; Kuhn, K.; Kienle, S.; Schwarz, J.; Schmidt, G.; Neumann, T.; Johnstone, R.; Mohammed, A. K.; Hamon, C. Tandem mass tags: a novel quantification strategy for comparative analysis of complex protein mixtures by MS/MS. Anal. Chem. 2003, 75, 1895-1904.
(6) Pichler, P.; Köcher, T.; Holzmann, J.; Mazanek, M.; Taus, T.; Ammerer, G.; Mechtler, K. Peptide labeling with isobaric tags yields higher identification rates using iTRAQ 4-plex compared to TMT 6-plex and iTRAQ 8-plex on LTQ Orbitrap. Anal. Chem. 2010, 82, 6549-6558.
(7) Cheng, C. Y.; Tsai, C. F.; Chen, Y. J.; Sung, T. Y.; Hsu, W. L. Spectrum-based method to generate good decoy libraries for spectral library searching in peptide identifications. J. Proteome Res. 2013, 12, 2305-2310.
(8) Eng, J. K.; McCormack, A. L.; Yates, J. R. An approach to correlate tandem mass spectral data of peptides with amino acid sequences in a protein database. J. Am. Soc. Mass Spectrom. 1994, 5, 976-989.
(9) Mertins, P.; Yang, F.; Liu, T.; Mani, D. R.; Petyuk, V. A.; Gillette, M. A.; Clauser, K. R.; Qiao, J. W.; Gritsenko, M. A.; Moore, R. J.; Levine, D. A.; Townsend, R.; Erdmann-Gilmore, P.; Snider, J. E.; Davies, S. A.; Ruggles, K. V.; Fenyo, D.; Kitchens, R. T.; Li, S.; Olvera, N.; Dao, F.; Rodriguez, H.; Chan, D. W.; Liebler, D.; White, F.; Rodland, K. D.; Mills, G. B.; Smith, R. D.; Paulovich, A. G.; Ellis, M.; Carr, S. A. Ischemia in tumors induces early and sustained phosphorylation changes in stress kinase pathways but does not affect global protein levels. Mol. Cell. Proteomics 2014, 13, 1690-1704.
(10) Sheng, Q.; Li, R.; Dai, J.; Li, Q.; Su, Z.; Guo, Y.; Li, C.; Shyr, Y.; Zeng, R. Preprocessing significantly improves the peptide/protein identification sensitivity of high-resolution isobarically labeled tandem mass spectrometry data. Mol. Cell. Proteomics 2015, 14, 405-417.


Table 1. Masses and Mass Shifts Used for Building Libraries and ‘Cleaning’ Search Spectra

Type of ions | iTRAQ (4-plex) | iTRAQ (8-plex) | TMT (6-plex) | TMT (10-plex)
Reporter (Rep) | 114.110680, 115.107715, 116.111070, 117.114424 | 113.107873, 114.111228, 115.108263, 116.111618, 117.114973, 118.112008, 119.115363, 121.122072 | 126.127725, 127.124760, 128.134433, 129.131468, 130.141141, 131.138176 | 126.127725, 127.124760, 127.131079, 128.128115, 128.134433, 129.131468, 129.137790, 130.134825, 130.141141, 131.138176
Tag+H+ | 145.109339 | 305.212636 | 230.170208 | 230.170208
Tag-related clusters (m/z < 223) | | X | |
Precursor (Prec) | X | X | X | X
Prec-Rep | X | X | X | X
Prec-Tag | X | X | X | X
Prec-H2O | X | X | X | X
Prec-NH3 | X | X | X | X
Prec-H2O-NH3 | X | X | X | X
Prec-Rep-CO | | | X | X
Prec-Rep-CO-H2O | | | X | X
Prec-Rep-CO-Tag | | | X | X

“X” in the table indicates that type is removed.
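The Tag+H+ row of Table 1 can be tied back to the mass shift used in the Methods with a short worked calculation, sketched here in Python. The added proton cancels in the difference, so the per-tag shift between two labels is taken to be the difference of their Tag+H+ values. For iTRAQ 4-plex to TMT this reproduces the 85.06087 Da quoted in the text. The iTRAQ 4-plex to 8-plex value is derived here from the table on the same assumption and is not stated in the text. The dictionary keys and function name are illustrative.

```python
# Tag+H+ values from Table 1 (Da); TMT 6- and 10-plex share the same value.
TAG_PLUS_H = {
    "iTRAQ 4-plex": 145.109339,
    "iTRAQ 8-plex": 305.212636,
    "TMT": 230.170208,
}


def per_tag_shift(source: str, target: str) -> float:
    """Mass added per tag when converting from the source label to the target label."""
    return TAG_PLUS_H[target] - TAG_PLUS_H[source]


i2t = per_tag_shift("iTRAQ 4-plex", "TMT")           # about 85.06087 Da
t2i = per_tag_shift("TMT", "iTRAQ 4-plex")           # same magnitude, opposite sign
i2i = per_tag_shift("iTRAQ 4-plex", "iTRAQ 8-plex")  # derived from Table 1 only
print(f"I2T {i2t:.5f}  T2I {t2i:.5f}  I2I {i2i:.5f}")
```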


Table 2. Description of Test Data Sets (DS) and Libraries (All Digested with Trypsin/Lys-C and Fragmented by HCD)

Name | Tag | Organism | Type | Instrument | Spectra | Source
TMT-DS1 | TMT 6-plex | Mouse | Exp (a) | LTQ Orbitrap Velos | 14 575 (selected as query) for Table 3 | MedImmune
TMT-DS2 | TMT 10-plex | Human | Exp (a) | Q Exactive | 25 808 (selected as query: TMT-DS2-Select) for Table 4; 441 511 (all as query: TMT-DS2-All) for Table 5 | MedImmune
iTRAQ8-DS | iTRAQ 8-plex | Human | Exp (a) | LTQ Orbitrap Velos | 326 673 (all as query) for Table 5 | MedImmune
Mouse I4 | iTRAQ 4-plex | Mouse | Lib (b) | Q Exactive, LTQ Orbitrap Velos | 91 068 | peptide.nist.gov (Nov 24, 2014)
Human I4 (c) | iTRAQ 4-plex | Human | Lib (b) | Q Exactive, LTQ Orbitrap Velos | 1 201 632 | peptide.nist.gov (Nov 26, 2014)
Human T (c) | TMT 10-plex | Human | Exp (a) / Lib (b) | Q Exactive | 31 428 | MedImmune

(a) Experimental data set.
(b) Spectral library.
(c) A subset library that contains 24 612 spectra in both original Human I4 and Human T was used in Table 4.


Table 3. Numbers of Peptide Spectral Matches above Threshold (a) for Different Query Spectrum Processing Methods

Treatment of input data | PSMs (b)
No clean up | 3263
Removal of reporter | 4237
Removal of parent | 3560
Removal of tag | 3371
Removal of reporter and parent | 4600
Removal of reporter and tag | 4490
Removal of tag and parent | 3627
Removal of all of three | 4923

(a) MSPepSearch score threshold of 200.
(b) Library was the mouse I2T library, which contains 91 068 spectra; input data set was the 14 575 selected TMT-DS1 spectra.
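The percentage improvements quoted in the text (about 30% for reporter removal, more than 50% for removing all three ion types) follow directly from the Table 3 counts. A minimal Python sketch of that arithmetic is given below; the dictionary keys and function name are illustrative.

```python
# PSM counts from Table 3 (MSPepSearch score above 200).
BASELINE = 3263  # no clean up
PSMS = {
    "reporter": 4237,
    "parent": 3560,
    "tag": 3371,
    "reporter + parent": 4600,
    "reporter + tag": 4490,
    "tag + parent": 3627,
    "all three": 4923,
}


def percent_gain(psms: int, baseline: int = BASELINE) -> float:
    """Relative increase in PSMs over the uncleaned query spectra, in percent."""
    return 100.0 * (psms - baseline) / baseline


GAINS = {treatment: percent_gain(n) for treatment, n in PSMS.items()}
for treatment, gain in GAINS.items():
    print(f"{treatment}: +{gain:.1f}%")
```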


Table 4. Performance of Interconverted Libraries Containing Same Peptide Ions

Input spectra (a, b) | Libraries (b) | Cutoffs at 1% FDR | PSMs
TMT-DS2 | Human T | 334 | 14 106
TMT-DS2 | Human I2T | 294 | 13 735
T2I | Human T2I | 338 | 13 963
T2I | Human I4 | 291 | 13 877

(a) Input spectra were the 25 808 TMT-DS2-Select spectra and the same spectra converted to iTRAQ 4-plex (Table 2).
(b) Tag-specific ions have been removed from input spectra and libraries; libraries contain only the 24 612 peptide ion spectra contained in both original T and I4 libraries.
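The "97% as many spectra" figure in the Abstract corresponds to the ratio of the two TMT-DS2 rows of Table 4. The short Python sketch below reproduces that ratio and the analogous one for the reverse (T2I) direction; the variable and function names are illustrative.

```python
# PSM counts at 1% FDR from Table 4, keyed by (input spectra, library).
TABLE4_PSMS = {
    ("TMT-DS2", "Human T"): 14106,    # unconverted TMT library
    ("TMT-DS2", "Human I2T"): 13735,  # library converted from iTRAQ 4-plex
    ("T2I", "Human T2I"): 13963,      # library converted from TMT
    ("T2I", "Human I4"): 13877,       # unconverted iTRAQ 4-plex library
}


def relative_yield(converted: int, unconverted: int) -> float:
    """PSMs from the converted library as a fraction of those from the unconverted one."""
    return converted / unconverted


tmt_ratio = relative_yield(TABLE4_PSMS[("TMT-DS2", "Human I2T")],
                           TABLE4_PSMS[("TMT-DS2", "Human T")])
itraq_ratio = relative_yield(TABLE4_PSMS[("T2I", "Human T2I")],
                             TABLE4_PSMS[("T2I", "Human I4")])
print(f"I2T vs T: {tmt_ratio:.1%}; T2I vs I4: {itraq_ratio:.1%}")
```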


Table 5. Results of Searching Human I2T and I2I Libraries Compared to Sequest at 1% FDR

Input spectra | Search against | Peptides | Common/Unique peptides | PSMs
TMT-DS2 (a) | Human I2T library | 45 819 | 36 073/9746 | 143 680
TMT-DS2 (a) | Sequest database | 45 644 | 36 073/9571 | 134 898
iTRAQ8-DS (b) | Human I2I library | 23 285 | 13 577/9708 | 59 817
iTRAQ8-DS (b) | Sequest database | 15 171 | 13 577/1594 | 34 707

(a) 441 511 ‘cleaned’ TMT-DS2-All spectra.
(b) 326 673 ‘cleaned’ iTRAQ8-DS spectra.
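The common and unique peptide counts in Table 5 are set comparisons between the peptides identified by the library search and by Sequest. The Python sketch below shows one way such counts could be computed from two peptide lists, and checks that the published numbers are internally consistent (common + unique = total for each search). The function and key names are hypothetical; how peptides were matched between searches, for example the handling of modifications, is not specified in the text.

```python
from typing import Dict, Iterable


def compare_identifications(library_peptides: Iterable[str],
                            sequest_peptides: Iterable[str]) -> Dict[str, int]:
    """Count peptides found by both searches and by each search alone."""
    lib, seq = set(library_peptides), set(sequest_peptides)
    return {
        "common": len(lib & seq),
        "library_only": len(lib - seq),
        "sequest_only": len(seq - lib),
    }


# Peptide counts from Table 5.
TABLE5 = {
    "TMT-DS2": {"library": 45819, "sequest": 45644, "common": 36073,
                "library_only": 9746, "sequest_only": 9571},
    "iTRAQ8-DS": {"library": 23285, "sequest": 15171, "common": 13577,
                  "library_only": 9708, "sequest_only": 1594},
}


def is_consistent(row: Dict[str, int]) -> bool:
    """Check that common + unique equals the total for both searches."""
    return (row["common"] + row["library_only"] == row["library"]
            and row["common"] + row["sequest_only"] == row["sequest"])
```

Combining both searches would give the union of the three groups, for example 36 073 + 9746 + 9571 peptides for TMT-DS2.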


FIGURE LEGENDS

Figure 1. Structural formulas of three isobaric tagging reagents: (A) iTRAQ 4-plex, (B) TMT 6- and 10-plex, (C) iTRAQ 8-plex. A blue line indicates the site of the modification moiety. Green and red lines indicate the fragmentation sites. The structure of the iTRAQ 8-plex balancer group has not been published.

Figure 2. iTRAQ 4-plex labeled spectra before (A, C, E) and after (B, D, F) conversion to TMT for example peptides (lower panels, in blue): DQAVENILLSPLVVASSLGLVSLGGK (charge: 3; D1-iTRAQ 4-plex; K26-iTRAQ 4-plex); WAAVVVPSGEEQR (charge: 3; W1-iTRAQ 4-plex); ETKDTDIVDEAIYYFK (charge: 2; E1-iTRAQ 4-plex; K3-iTRAQ 4-plex; K16-iTRAQ 4-plex). Experimentally derived target TMT spectra for the same peptides are shown in the upper panels (in red). For clarity, only major peaks are labeled.


Figure 1. The Structural Formula of Isobaric Tagging Reagents. Panels: (A) iTRAQ 4-plex; (B) TMT 6- and 10-plex; (C) iTRAQ 8-plex. [Structure drawings not reproduced.]


Figure 2. iTRAQ 4-plex Labeled Spectra Before and After Conversion to TMT. Panels: (A) before and (B) after; (C) before and (D) after; (E) before and (F) after. [Spectra not reproduced.]
