Anal. Chem. 2009, 81, 7170–7180
Evaluation of Several MS/MS Search Algorithms for Analysis of Spectra Derived from Electron Transfer Dissociation Experiments

Kumaran Kandasamy,†,‡ Akhilesh Pandey,*,† and Henrik Molina*,†,§

McKusick-Nathans Institute for Genetic Medicine and Departments of Biological Chemistry, Pathology and Oncology, Johns Hopkins University, Baltimore, Maryland 21205, Institute of Bioinformatics, International Tech Park Ltd., Bangalore 560 066, India, and Centre de Regulacio Genomica, Dr. Aiguader 88, 08003 Barcelona, Spain

* To whom correspondence should be addressed. † Johns Hopkins University. ‡ Institute of Bioinformatics. § Centre de Regulacio Genomica.

Electron transfer dissociation (ETD) is becoming increasingly popular for high-throughput experiments, especially for the identification of labile post-translational modifications. Most search algorithms currently used for querying MS/MS data against protein databases have been optimized on the basis of matching fragment ions derived from collision induced dissociation of peptides, which are dominated by b and y ions. However, electron transfer dissociation of peptides generates completely different types of fragments: c and z ions. The goal of our study was to test the ability of different search algorithms to handle data from this fragmentation method. We compared four MS/MS search algorithms (OMSSA, Mascot, Spectrum Mill, and X!Tandem) using ∼170 000 spectra generated from a standard protein mix, as well as from complex proteomic samples which included a large number of phosphopeptides. Our analysis revealed (1) greater differences between algorithms than have been previously reported for CID data, (2) a significant charge state bias resulting in a >60-fold difference in the numbers of matched doubly charged peptides, and (3) identification of 70% more peptides by the best performing algorithm than by the algorithm identifying the fewest peptides. Our results indicate that the search engines for analyzing ETD derived MS/MS spectra are still in their early days and that multiple search engines could be used to reduce individual biases of algorithms.

Tandem mass spectrometry provides a means to sequence peptides. Introducing energy into peptide ions causes dissociation of the peptide backbone, resulting in sets of fragment ions. Manual interpretation of fragment ion patterns can provide a complete or partial amino acid sequence of a peptide. Although manual interpretation is accurate and sensitive, this approach is time-consuming and, therefore, not compatible with the large number of spectra that can be generated when tandem mass spectrometry is combined with liquid chromatography (LC-MS/MS).
Analytical Chemistry, Vol. 81, No. 17, September 1, 2009
The need to analyze large numbers of MS/MS spectra led, in the mid and late 1990s, to the appearance of algorithms that analyze MS/MS spectra with minimal or no manual intervention. Initially, two peptide tandem MS search algorithms (SEQUEST1 and Mascot2) were practically the only ones available, but today the number of search engines has increased. They are available either commercially, for example, Spectrum Mill (Agilent), PEAKS (BIS), Phenyx (GeneBio), and ProteinLynx Global Server (Waters), or as open source, exemplified by InsPecT,3 OMSSA,4 Prospector,5 VEMS,6 and X!Tandem7 (a more comprehensive list has been published by Nesvizhskii et al.8). When the number of times that OMSSA, Mascot, Spectrum Mill, X!Tandem, and SEQUEST have been mentioned in the literature (PubMed abstract search) during the past five years is counted, Mascot and SEQUEST together account for 90%, suggesting that these two algorithms are still the dominant choices of the proteomics community. Several studies in the past have described and compared different search engines9-12 in the context of collision induced dissociation (CID), which is the most widely used method for gas phase sequencing of peptides. Such comparisons are based on the b-ions and y-ions13 that dominate CID spectra. While the rupture of a peptide’s backbone is the predominant reaction path in CID, exceptions exist for peptides such as those carrying serine/threonine phosphorylation and O-linked glycosylation. For this group of modified peptides, the dominating fragmentation pathway is dissociation of the modification rather than of the sequence revealing backbone; these modifications are often referred to as labile modifications. With only limited amino acid sequence information, confident identification of these peptides and their modified residues is challenging. Two consecutive stages of mass spectrometric fragmentation have been applied to address this problem for serine and threonine phosphorylation events,14,15 and algorithms such as AScore16 and SLoMo17 are available to facilitate localization of the modified residues.

However, recently alternative tandem MS based dissociation techniques have been applied to solve these problems. These newer fragmentation techniques convert the peptide ions into labile radicals by reacting them with low energy electrons, either by electron capture dissociation (ECD)18 or by electron transfer dissociation (ETD).19 The terms “capture” and “transfer” refer to how the radical peptide ions are generated. ECD and ETD are referred to as “soft ionization” methods, since the labile modifications are rather stable when subjected to these techniques. Both ECD and ETD have proven useful for studying several modifications that are observed to be labile in CID.20-24 Because of the commercial availability and improving robustness of ETD and ECD instrumentation, recent studies include thousands of ETD spectra.24-29 With such large numbers of spectra, search engines are required. However, whereas search algorithms have generally been optimized to match the b- and y-ions of CID experiments, ETD and ECD experiments result in different types of fragment ions and in MS/MS spectra that look very different. The fragment ions, being the results of N-Cα bond dissociation in the peptide backbone, are known as c- and z-type ions.13 An added complexity in handling ETD data is that, since ETD’s potential lies in pinpointing modifications, a search engine must not only match the correct amino acid sequence but also assign a given modification to the correct residue.

Most open-source and some commercial search algorithms have already been described in some detail. The goal of this study is therefore not to dissect the search algorithms, but rather to compare how search algorithms handle ETD data. As our test set for comparing four tandem MS search algorithms (OMSSA, Mascot, Spectrum Mill, and X!Tandem), we used data that included phosphopeptides, nonphosphorylated peptides, and known standard peptides fragmented by ETD.

(1) Eng, J. K.; McCormack, A. L.; Yates, J. R. I. J. Am. Soc. Mass Spectrom. 1994, 5, 976–989.
(2) Perkins, D. N.; Pappin, D. J.; Creasy, D. M.; Cottrell, J. S. Electrophoresis 1999, 20, 3551–3567.
(3) Tanner, S.; Shu, H.; Frank, A.; Wang, L. C.; Zandi, E.; Mumby, M.; Pevzner, P. A.; Bafna, V. Anal. Chem. 2005, 77, 4626–4639.
(4) Geer, L. Y.; Markey, S. P.; Kowalak, J. A.; Wagner, L.; Xu, M.; Maynard, D. M.; Yang, X.; Shi, W.; Bryant, S. H. J. Proteome Res. 2004, 3, 958–964.
(5) Clauser, K. R.; Baker, P.; Burlingame, A. L. Anal. Chem. 1999, 71, 2871–2882.
(6) Matthiesen, R.; Lundsgaard, M.; Welinder, K. G.; Bauw, G. Bioinformatics 2003, 19, 792–793.
(7) Craig, R.; Beavis, R. C. Bioinformatics 2004, 20, 1466–1467.
(8) Nesvizhskii, A. I.; Vitek, O.; Aebersold, R. Nat. Methods 2007, 4, 787–797.
(9) Chamrad, D. C.; Korting, G.; Stuhler, K.; Meyer, H. E.; Klose, J.; Bluggel, M. Proteomics 2004, 4, 619–628.
(10) Sadygov, R. G.; Cociorva, D.; Yates, J. R., 3rd. Nat. Methods 2004, 1, 195–202.
(11) Kapp, E. A.; Schutz, F.; Connolly, L. M.; Chakel, J. A.; Meza, J. E.; Miller, C. A.; Fenyo, D.; Eng, J. K.; Adkins, J. N.; Omenn, G. S.; Simpson, R. J. Proteomics 2005, 5, 3475–3490.
(12) Balgley, B. M.; Laudeman, T.; Yang, L.; Song, T.; Lee, C. S. Mol. Cell Proteomics 2007, 6, 1599–1608.
(13) Roepstorff, P.; Fohlman, J. Biomed. Mass Spectrom. 1984, 11, 601.
(14) Beausoleil, S. A.; Jedrychowski, M.; Schwartz, D.; Elias, J. E.; Villen, J.; Li, J.; Cohn, M. A.; Cantley, L. C.; Gygi, S. P. Proc. Natl. Acad. Sci. U. S. A. 2004, 101, 12130–12135.
(15) Gruhler, A.; Olsen, J. V.; Mohammed, S.; Mortensen, P.; Faergeman, N. J.; Mann, M.; Jensen, O. N. Mol. Cell Proteomics 2005, 4, 310–327.
(16) Beausoleil, S. A.; Villen, J.; Gerber, S. A.; Rush, J.; Gygi, S. P. Nat. Biotechnol. 2006, 24, 1285–1292.
(17) Bailey, C. M.; Sweet, S. M.; Cunningham, D. L.; Zeller, M.; Heath, J. K.; Cooper, H. J. J. Proteome Res. 2009, 8, 1965–1971.
(18) Zubarev, R. A.; Kelleher, N. L.; McLafferty, F. W. J. Am. Chem. Soc. 1998, 120, 3265–3266.
(19) Syka, J. E.; Coon, J. J.; Schroeder, M. J.; Shabanowitz, J.; Hunt, D. F. Proc. Natl. Acad. Sci. U. S. A. 2004, 101, 9528–9533.
(20) Kelleher, N. L.; Zubarev, R. A.; Bush, K.; Furie, B.; Furie, B. C.; McLafferty, F. W.; Walsh, C. T. Anal. Chem. 1999, 71, 4250–4253.
(21) Mirgorodskaya, E.; Roepstorff, P.; Zubarev, R. A. Anal. Chem. 1999, 71, 4431–4436.
(22) Stensballe, A.; Jensen, O. N.; Olsen, J. V.; Haselmann, K. F.; Zubarev, R. A. Rapid Commun. Mass Spectrom. 2000, 14, 1793–1800.
(23) Hogan, J. M.; Pitteri, S. J.; Chrisman, P. A.; McLuckey, S. A. J. Proteome Res. 2005, 4, 628–632.
(24) Swaney, D. L.; Wenger, C. D.; Thomson, J. A.; Coon, J. J. Proc. Natl. Acad. Sci. U. S. A. 2009, 106, 995–1000.
(25) Chi, A.; Huttenhower, C.; Geer, L. Y.; Coon, J. J.; Syka, J. E.; Bai, D. L.; Shabanowitz, J.; Burke, D. J.; Troyanskaya, O. G.; Hunt, D. F. Proc. Natl. Acad. Sci. U. S. A. 2007, 104, 2193–2198.
(26) Molina, H.; Horn, D. M.; Tang, N.; Mathivanan, S.; Pandey, A. Proc. Natl. Acad. Sci. U. S. A. 2007, 104, 2199–2204.
(27) Zhang, Q.; Tang, N.; Brock, J. W.; Mottaz, H. M.; Ames, J. M.; Baynes, J. W.; Smith, R. D.; Metz, T. O. J. Proteome Res. 2007, 6, 2323–2330.
(28) Molina, H.; Matthiesen, R.; Kandasamy, K.; Pandey, A. Anal. Chem. 2008, 80, 4825–4835.
(29) Good, D. M.; Wirtala, M.; McAlister, G. C.; Coon, J. J. Mol. Cell Proteomics 2007, 6, 1942–1951.

10.1021/ac9006107 CCC: $40.75 2009 American Chemical Society. Published on Web 07/29/2009

EXPERIMENTAL SECTION

Generation of Samples. We used peptides from three types of human protein samples: (A) standard proteins from the Universal Proteins Standard, UPS1 (Sigma), prepared as described previously,28 (B) phosphopeptides, enriched using either titanium dioxide (human embryonic kidney 293T cells)26 or IMAC30 (p196, human pancreas cancer cell line), and (C) SCX separated trypsinated lysate of p196, a primary human pancreatic cancer cell line. All proteins were reduced with DTT and alkylated with iodoacetamide prior to digestion.

ETD Analysis. All samples were analyzed using an ETD equipped Paul-type ion trap mass spectrometer coupled with liquid chromatography and an automated chip based nanoelectrospray source (6340 ion trap, Agilent Technologies, Santa Clara, California, USA). Supplemental activation was used for all ETD experiments.

Generation of Peak Lists. The peak lists to be searched by the algorithms were created as follows: (1) Fragment ion information and, when possible, charge states of the fragmented peptides were extracted from the raw data using the MS/MS extractor function of Spectrum Mill (Agilent Technologies, Santa Clara, California). This function creates a single uniquely named .pkl file for each MS/MS spectrum.
A total of 166 931 spectra were extracted from the raw data. (2) To prepare peak lists for search algorithms other than Spectrum Mill, we created a copy of the generated .pkl files and deisotoped them using an in-house deisotoping script identical to the one used by Spectrum Mill’s presearch deisotoping. (3) Deisotoped .pkl files were sorted according to known or unknown charge state. Two Mascot Generic Format (.mgf) peak lists were generated containing the name of each .pkl file as well as charge state information. This division was needed since X!Tandem in our hands did not recognize the .mgf charge state information “CHARGE=2+, 3+, 4+, 5+” or “CHARGE=2 3 4 5” used for peptides of unknown charge state. Spectra marked this way were searched only as 2+ peptides. To overcome this problem, each spectrum from precursors of unknown charge state was split into 4 entries, differing only in charge state, and searched independently. After searching these spectra, only the best match per “charge state split” set was extracted. Examples of the .mgf format are shown in Supporting Information Figure 2. Using OMSSA, we experienced problems searching large .mgf files (>100 000 spectra or ∼270 MB), and it was necessary to split such peak lists into multiple smaller files. The largest .mgf file successfully searched with OMSSA contained ∼78 000 spectra (∼210 MB). The results from multiple OMSSA searches were later merged.

(30) Zhai, B.; Villen, J.; Beausoleil, S. A.; Mintseris, J.; Gygi, S. P. J. Proteome Res. 2008, 7, 1675–1682.

Figure 1. FDR/score plots for OMSSA (A), Mascot (B), Spectrum Mill (C), and X!Tandem (D). For Spectrum Mill, plots had to be created separately for charge states 2+, 3+, 4+, and 5+.

Database Searching. The ETD spectra were searched against the human subset of RefSeq (March 5, 2007). Separate searches against the reversed sequence RefSeq database were conducted. Trypsin and Lys-C samples were searched independently. The following search engine versions were used: OMSSA (version 2.1.0), Mascot (version 2.2), Spectrum Mill (version 3.03.078), and X!Tandem (version 2007.04.01.1). Prior to submission of this manuscript, we repeated the search of the known standard peptides using the most current versions of X!Tandem and OMSSA and did not observe significant differences.

Search Parameters. For conducting the database searches, we used the web interfaces of the search engines. Using the available options in these web based interfaces, we designed a set of parameters that would be as uniform as possible: All masses were chosen as monoisotopic. A mass tolerance of 2.5 Da was used for precursor ions, while the fragment ion mass tolerance was 0.7 Da. Up to three missed proteolytic events were allowed. All cysteines were considered carbamidomethylated. Variable modifications included phosphorylation of serine, threonine, and tyrosine and oxidation of methionine residues. Different search algorithms match different types of fragment ions. Thus, for OMSSA, the web interface includes c and z ions, while, e.g., Spectrum Mill and Mascot have defined ETD fragment ions to also include y-ions. For X!Tandem, we chose c, z, and y ions. The types of c and z ions matched differ between the search engines, and for some algorithms, we were not able to find documentation for the type used. For peptides with unknown charge states, 2-5 positive charges were allowed. OMSSA offers an option to select a charge state for which multiply charged fragment ions would be allowed: this charge state was set to 3. For all search engines, the maximum number of matched fragment ions was set to its default value. Supporting Information Table 1 summarizes the search parameters used for the search engines, including available information on the fragment ions being matched.

Parsing of Search Output. Using in-house Python scripts, the best and second best match for each spectrum were extracted and deposited into a database for analysis and comparisons. The extraction was conducted using the following set of rules: (1) One spectrum can only be matched to one peptide, unless another peptide has been matched with an identical score/E-value. (2) When a spectrum has been searched with different precursor charge states, only the best matching peptide of the four charge states (2+, 3+, 4+, and 5+) is extracted. (3) If a spectrum has been matched to a peptide which can be mapped to multiple proteins, only one entry is extracted. (4) Using the three abovementioned criteria, the next best peptide matched to a spectrum was also extracted and marked “2nd best”.

RESULTS AND DISCUSSION

In a comparison such as ours, it is important to ensure that the test set (a list of m/z values and intensities of fragment ions) used to test the performance of various algorithms is identical. However, different search engines handle peak lists differently.
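The four extraction rules listed under Parsing of Search Output can be sketched in Python as follows. This is only an illustration under stated assumptions, not the in-house script used in the study: the `Match` record, its field names, and the convention that a higher score is better (E-values would first be converted, for example to -log10) are hypothetical.

```python
from collections import defaultdict
from typing import NamedTuple


class Match(NamedTuple):
    # Hypothetical record layout; not the schema used in the study.
    spectrum: str   # identifier shared by all charge-state copies of a spectrum
    charge: int     # precursor charge assumed for this search entry
    peptide: str    # matched peptide sequence (including modifications)
    protein: str    # protein accession reported with the match
    score: float    # assumed: higher is better


def best_and_second(matches):
    """Return {spectrum: {"best": [...], "2nd best": [...]}}."""
    by_spectrum = defaultdict(dict)
    for m in matches:
        # Rules 2 and 3: charge-state copies and multi-protein mappings
        # collapse to one entry per peptide (the highest-scoring one).
        kept = by_spectrum[m.spectrum].get(m.peptide)
        if kept is None or m.score > kept.score:
            by_spectrum[m.spectrum][m.peptide] = m

    result = {}
    for spectrum, peptides in by_spectrum.items():
        ranked = sorted(peptides.values(), key=lambda m: m.score, reverse=True)
        top = ranked[0].score
        # Rule 1: one peptide per spectrum unless others tie on score.
        best = [m for m in ranked if m.score == top]
        rest = [m for m in ranked if m.score < top]
        # Rule 4: the same criteria applied to the next score level.
        second = [m for m in rest if m.score == rest[0].score] if rest else []
        result[spectrum] = {"best": best, "2nd best": second}
    return result
```

Collapsing by peptide before ranking keeps the tie rule simple: two different peptides with identical scores are both reported, while the same peptide seen under several charge states or proteins is reported once.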
Table 1. Compilation of Numbers Discussed in Text and in Figuresa

A. Lower score thresholds
FDR | OMSSA [e-value] | Mascot [score] | Spectrum Mill [score] (2+ / 3+ / 4+ / 5+) | X!Tandem [log 10 E-value]
0.1% | 1.5 × 10^-05 | 47.8 | 11.1 / 12.1 / 16.7 / 18.1 | -4.964
1% | 3.7 × 10^-03 | 41.2 | 10.1 / 10.3 / 12.5 / 12.5 | -2.797
5% | 2.7 × 10^-01 | 32.5 | 8.6 / 8.6 / 10.5 / 11.4 | -1.924

B. Calculated false positive rates using known standard peptides
FDR | OMSSA | Mascot | Spectrum Mill | X!Tandem
0.1% | 0.0% (300)b | 0.9% (533)b | 0.0% (477)b | 0.4% (232)b
1% | 0.0% (495)b | 1.4% (751)b | 0.1% (682)b | 0.8% (592)b
5% | 3.5% (662)b | 7.0% (1196)b | 2.8% (983)b | 6.7% (814)b

C. Total number of peptide matches at three different FDRs (total number of unique spectra given in parentheses)
FDR | OMSSA | Mascot | Spectrum Mill | X!Tandem
0.1% | 2487 | 3928 | 5126 | 2010
1.0% | 4491 | 5529 | 7824 | 5130
5.0% | 6676 (6596) | 8683 (8683) | 11439 (10905) | 7164 (6945)

D. Score distribution of matched peptides (5.0% FDR)
 | OMSSA | Mascot | Spectrum Mill | X!Tandem
maximum score | 2 × 10300 [e-value] | 139.7 [score] | 18.3 [2+], 23.4 [3+], 23.5 [4+], 23.6 [5+] [score] | -16.7 [log 10 E-value]
25th percentile [FDR%] | 4.8 × 10^-02 | 2.6 × 10^-02 | 3.1 × 10^-02 | 1.1 × 10^-01
median [FDR%] | 3.9 × 10^-01 | 2.3 × 10^-01 | 2.4 × 10^-01 | 3.6 × 10^-01
75th percentile [FDR%] | 1.5 | 1.6 | 1.7 | 1.6

E. Doubly charged peptides identified by all four algorithms (FDR% in parentheses)c
peptide | OMSSA | Mascot | Spectrum Mill | X!Tandem
DGAGDVAFVK (2+) | 2.642-1 (5.0) | 62.3 (2.5 × 10^-02) | 13.8 (4.3 × 10^-02) | -3.4 (4.5 × 10^-01)
KTVTAMDVVYALK (2+) | 5.058 × 10^-4 (6.1 × 10^-01) | 55.7 (2.7 × 10^-02) | 16.8 (1.4 × 10^-02) | -3.0 (7.5 × 10^-01)
VLVEPDAGAGVAVMoxK (2+) | 3.417 × 10^-02 (2.3) | 51.8 (2.8 × 10^-02) | 15.8 (2.4 × 10^-02) | -2.7 (1.2)
TVTAMDVVYALK (2+) | 8.208 × 10^-01 (7.9) | 50.2 (2.8 × 10^-02) | 13.1 (5.0 × 10^-02) | -1.9 (5.2)
IINEPTAAAIAYGLDR (2+) | 5.388 × 10^-01 (6.5) | 60.6 (2.5 × 10^-02) | 10.7 (3.6 × 10^-01) | -1.3 (1.4 × 10^-01)

F. Number of matched phosphopeptides (5% FDR)
 | OMSSA | Mascot | Spectrum Mill | X!Tandem
phosphopeptides | 1552 | 993 | 1810 | 1019
a Table 1A. Cut-off thresholds: Estimation of score/expectation thresholds for FDRs of 0.1%, 1%, and 5%. These values are calculated using linear interpolation. The values can be visualized in Figure 1. For Spectrum Mill, these values are charge state dependent, and therefore thresholds for each charge state were calculated. Table 1B. False positive rates: Spectra that could be traced to the analysis of known standard peptides (Sigma) were extracted, and the matched peptides were queried against the Sigma standard protein sequences. Only matches that fulfilled the thresholds in Table 1A were used. The numbers of validated matches for each FDR data set are listed in parentheses. Table 1C. Number of peptides matched: The number of peptides matched by each algorithm at the three FDR values (0.1%, 1%, and 5%). For 5% FDR, the number of spectra is shown in parentheses. Table 1D. Score distribution of matched peptides: All scores and expectation values were converted to FDR values using linear interpolation. Using the FDR values, the distribution for each algorithm was calculated: 25th percentile, median, and 75th percentile are shown. The value (native score) of the best matched peptide is shown for each algorithm. Table 1E. Doubly charged peptides identified by all four algorithms: Scores or expectation values are shown for the five MS/MS spectra of doubly charged peptides shown in Figure 3. Calculated FDR values are shown in parentheses. Table 1F. Phosphopeptides: The numbers of phosphopeptides matched (5% FDR) by each of the four algorithms are shown. b Number of spectra used. c Scores (Mascot and Spectrum Mill), e-values (OMSSA), and log 10 E-values (X!Tandem) converted into FDR percentages are shown in parentheses.
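The conversion described for Table 1D (native scores mapped to FDR values by linear interpolation, then summarized as quartiles) can be sketched as below. This is a minimal illustration, not the authors' code: the function names, the representation of the FDR/score curve as a list of `(score, fdr_percent)` points sorted by descending score, and the choice of the "inclusive" quantile method are all assumptions.

```python
import statistics


def score_to_fdr(curve, score):
    """Interpolate an FDR (%) for a score.

    curve: (score, fdr_percent) points sorted by descending score,
    with distinct scores (an assumed representation of Figure 1).
    """
    # Clamp scores outside the range covered by the curve.
    if score >= curve[0][0]:
        return curve[0][1]
    if score <= curve[-1][0]:
        return curve[-1][1]
    for (s_hi, f_hi), (s_lo, f_lo) in zip(curve, curve[1:]):
        if s_lo <= score <= s_hi:
            frac = (s_hi - score) / (s_hi - s_lo)
            return f_hi + frac * (f_lo - f_hi)


def fdr_quartiles(curve, scores):
    """25th percentile, median, and 75th percentile of the converted FDRs."""
    fdrs = [score_to_fdr(curve, s) for s in scores]
    q1, median, q3 = statistics.quantiles(fdrs, n=4, method="inclusive")
    return q1, median, q3
```

Expressing every engine's matches on a common FDR scale is what makes the quartiles comparable across engines whose native scores have different units and directions.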
As an example, Spectrum Mill deisotopes extracted MS/MS data prior to the search, whereas this does not seem to be the case for, e.g., Mascot. For Mascot (Matrix Science), deisotoping is available through a stand-alone program (Mascot Distiller). Prior to our comparison, we tested Mascot and other algorithms with nondeisotoped and deisotoped data. Generally, peptides scored higher when the data were deisotoped (data not shown), and we conclude that deisotoping is favorable. To remove the bias originating from using different forms of a peak list (deisotoped versus nondeisotoped) and for uniformity, we decided to deisotope all data prior to analysis using the deisotoping script used by Spectrum Mill. Further, in PTM analysis, a large fraction of identified peptides are often of the “one-hit wonder”31 type. Therefore, in our comparison we chose to treat all peptide identifications as independent events.

(31) Veenstra, T. D.; Conrads, T. P.; Issaq, H. J. Electrophoresis 2004, 25, 1278–1279.

Establishing Thresholds for Peptide Identifications Using False Discovery Rates. One of the most intuitive parameters to assess the performance of search algorithms is the number of identified peptides, or rather, the number of peptides matched to fragmentation patterns fulfilling defined criteria. Because MS/MS search algorithms rely on scoring systems to “assign” spectra as “identified” and because different search algorithms use different scoring systems, it is difficult to ensure that no bias is introduced when using this measure. One approach to such a comparison was shown by Kapp et al.,11 who compared five algorithms by allowing different researchers, each highly experienced with a given search engine, to conduct searches using the most optimal parameters. More recently, Balgley et al.12 used estimated false discovery rate (FDR) to set the threshold for counting the number of peptides “identified” by a given algorithm. This latter approach, which we decided to use, can be summarized as follows: MS/MS spectra are searched against a normal and a reversed database. For a given score, a cumulative FDR value is calculated by dividing the number of spectra matched in the reversed database by the summed number of matches in the normal and the reversed database. Figure 1 shows a plot of the cumulative FDR values versus the respective scores for the four search algorithms. We decided to compare the algorithms using FDR values of 0.1%, 1%, and 5%. Linear interpolation was used to calculate the respective cutoff scores. For Mascot (Mascot score), OMSSA (-log 2[e-value]), and X!Tandem (-log 10[E-value]), this was done without differentiating between charge states of assigned peptides. However, for Spectrum Mill (score), we generated separate plots for peptides of charge 2+, 3+, 4+, and 5+ since the significance of this score is charge state dependent. With regard to the calculated thresholds, it should be noted that for X!Tandem we chose to use the E-value over the Hyper Score. Both scores have previously been used,12 but for our data set the use of the E-value resulted in significantly more matches, especially at an FDR value of 0.1% (see Supporting Information Figure 2). For Spectrum Mill, our calculated score thresholds are higher than in our previous study.28 We attribute this to the fact that in our previous study we permitted lower scoring peptides that could be assigned to a protein that was confidently identified. In this comparison, we treated each peptide as an independent event, which we find is a valid approach with respect to PTM modified peptides, where a high number of one peptide, one protein type identifications is expected. The calculated thresholds for the four algorithms at FDR values of 0.1%, 1%, and 5% are shown in Table 1A.
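The cumulative FDR calculation and the interpolated cutoff scores described above can be sketched as follows. This is a minimal sketch under stated assumptions rather than the authors' implementation: the function names are hypothetical, scores are assumed to be "higher is better" (E-values would be transformed first), and the cutoff is taken at the first crossing of the target FDR when walking from high to low scores.

```python
def cumulative_fdr(target_scores, decoy_scores):
    """Return [(score, fdr)] by descending score.

    fdr = reversed-database matches / (normal + reversed matches),
    counted over all matches scoring at or above the given score.
    """
    labelled = sorted(
        [(s, False) for s in target_scores] + [(s, True) for s in decoy_scores],
        key=lambda pair: pair[0],
        reverse=True,
    )
    curve = []
    n_target = n_decoy = 0
    i = 0
    while i < len(labelled):
        score = labelled[i][0]
        # Consume every match that shares this score before emitting a point.
        while i < len(labelled) and labelled[i][0] == score:
            if labelled[i][1]:
                n_decoy += 1
            else:
                n_target += 1
            i += 1
        curve.append((score, n_decoy / (n_target + n_decoy)))
    return curve


def threshold_at(curve, fdr_target):
    """Linearly interpolate the score at which the FDR first reaches fdr_target.

    Returns None if the curve never crosses the target.
    """
    for (s_hi, f_hi), (s_lo, f_lo) in zip(curve, curve[1:]):
        if f_hi < fdr_target <= f_lo:
            frac = (fdr_target - f_hi) / (f_lo - f_hi)
            return s_hi + frac * (s_lo - s_hi)
    return None
```

For a charge-state-dependent score such as Spectrum Mill's, the same two functions would simply be applied once per charge state, as the text describes.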
To validate the thresholds obtained by the FDR approach, we decided to calculate the corresponding false positive rates. This was possible using the known standard peptides in our probe set. Spectra that fulfilled the FDR thresholds and originated from the analysis of the Sigma standard were extracted, and the matches were queried against the known protein sequences. For the 0.1% and 1% data sets, the false positive rates were quite different from the FDR values and also differed significantly between the algorithms. However, for the 5% FDR data set the agreement was better, ranging from 3% (Spectrum Mill) to 7% (Mascot), and we decided to use this data set for the comparison and discussion. The false positive values are shown in Table 1B, where numbers in parentheses indicate the number of matched known standard peptides.

Peptide Identifications. Having established the basis for the comparison by calculating the appropriate score thresholds, we counted the number of peptides matched by each of the algorithms. In the 5% FDR data set, the algorithm matching the most peptides (11 439) was Spectrum Mill. The corresponding numbers for Mascot, X!Tandem, and OMSSA were 8683, 7164, and 6676, respectively (score, charge state, peptide sequence, and FDR value for each peptide are available in Supporting Information Table 2). We observed a similar pattern for the 1% and 0.1% FDR data sets, with the exception that OMSSA matched slightly more spectra than X!Tandem at 0.1% FDR (2487 versus 2010). The numbers of matched peptides per algorithm for the 0.1%, 1%, and 5% FDR data sets are listed in Table 1C. Table 1D shows best scores, medians, and 25th and 75th percentiles for the four 5% FDR data sets. For easy comparison between the algorithms, the three latter values were also converted into FDR values using linear interpolation. Dividing the peptides matched by each algorithm into charge states (2+, 3+, 4+, and 5+), we observed that OMSSA, and to some extent X!Tandem, identified very few doubly charged peptides. Of all peptides identified by OMSSA, doubly charged peptides accounted for less than 1%. The corresponding percentages for Spectrum Mill and Mascot were in the range of 40-50%, whereas for X!Tandem this number ranged from 1% (FDR of 0.1%) to 12% (FDR of 5%).
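The validation against known standard proteins described above (Table 1B) amounts to checking whether each accepted peptide occurs in any of the standard protein sequences. A minimal sketch follows; the function name is hypothetical, peptides are assumed to be plain sequences with modification annotations already stripped, and details such as isoleucine/leucine equivalence are deliberately ignored.

```python
def false_positive_rate(peptides, standard_sequences):
    """Fraction of matched peptides absent from every standard protein.

    peptides: sequences matched to spectra acquired from the standard mix
              and passing the score threshold for a given FDR level.
    standard_sequences: protein sequences of the standard mix.
    Returns (rate, list_of_unsupported_peptides).
    """
    if not peptides:
        raise ValueError("no peptides to validate")
    wrong = [
        p for p in peptides
        if not any(p in seq for seq in standard_sequences)
    ]
    return len(wrong) / len(peptides), wrong
```

Because every spectrum from the standard mix should originate from a known protein, any accepted match outside those sequences can be counted as a false positive, which gives an independent check on the decoy-based estimate.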
To validate this observation, we limited the analysis to only the known standard peptides (10% of the extracted matches) and observed the same pattern: OMSSA and partly X!Tandem showed a clear bias against doubly charged peptides. The charge state distributions for all peptides (upper panel) and only known standard peptides (lower panel) are shown in Figure 2.

Figure 2. Charge state distributions for the matched peptides by (from left to right) OMSSA, Mascot, Spectrum Mill, and X!Tandem. Panel (A) shows the distributions of charge states for all peptides matched by a given algorithm, while panel (B) shows the distributions for known standard peptides. The data from the 5% FDR group are used. Numbers used for each analysis are shown in Table 1C, 5% FDR (all peptides) and 1B, 5% FDR (only known standard peptides).

Our probe set used for this comparison was primarily composed of tryptic peptides, and we expected a large number of doubly charged peptides: the percentage of doubly charged peptides deposited in Human Proteinpedia32 is ∼60%, and ∼3/4 of the tryptic peptides in a comprehensive study by Cox and Mann carried two positive charges.33 Seen in that light, the percentages of doubly charged peptides matched by OMSSA, and to some extent X!Tandem, are very low, and this charge state bias is very likely the main reason for the lower number of matches by OMSSA and X!Tandem. To study this observation in detail, we extracted some doubly charged peptides identified by all four algorithms (1%) and aligned the fragment ions matched by each of the algorithms. Figure 3 shows five such ETD spectra. Below each spectrum, a chess board diagram indicates which of the theoretical fragment ions (b, c, c′, z, z′, z′′, and y) each algorithm has matched. Matched fragment ions are indicated with a filled field. The spectra A-C in Figure 3 were identified by all four algorithms within the 5% FDR group. Spectra presented in panels D and E were only contained in the 5% FDR group for Mascot and Spectrum Mill, but by ignoring any score thresholds, we could also retrieve these spectra and their matches from OMSSA and X!Tandem. For these cases, OMSSA and X!Tandem had indeed matched the spectra to the same peptides as Mascot and Spectrum Mill, but with lower confidence. Though not all algorithms included y-ions in ETD searching (see Supporting Information Table 1), we do not think that the observed bias can be attributed to this. An examination of the MS/MS spectra (Figure 3) showed that y-ions were of lower intensity and were not often matched by the algorithms that use y-ions to match ETD spectra (see Supporting Information Table 1).
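The "chess board" comparison of matched fragment ions can be illustrated with a small sketch that computes theoretical singly charged fragment m/z values for an unmodified peptide and marks which of them fall within the 0.7 Da fragment tolerance of an observed peak. This is an illustration only and does not reproduce any search engine's matching: the function names are hypothetical, only b, c, y, and the radical z ion (taken here as y minus NH3 plus H) are modeled, and the hydrogen-shifted c and z variants that the engines define differently are left out.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.007276
HYDROGEN = 1.007825
WATER = 18.010565
AMMONIA = 17.026549


def fragment_mz(peptide):
    """Singly charged b, c, y, and radical z m/z values, keyed like 'c3'."""
    masses = [MONO[aa] for aa in peptide]
    n = len(peptide)
    ions = {}
    for i in range(1, n):
        b = sum(masses[:i]) + PROTON
        y = sum(masses[i:]) + WATER + PROTON
        ions[f"b{i}"] = b
        ions[f"c{i}"] = b + AMMONIA                 # c = b + NH3
        ions[f"y{n - i}"] = y
        ions[f"z{n - i}"] = y - AMMONIA + HYDROGEN  # assumed radical z ion
    return ions


def matched_ions(peptide, peaks, tol=0.7):
    """Map each theoretical ion to True if any observed peak is within tol Da."""
    return {
        name: any(abs(mz - peak) <= tol for peak in peaks)
        for name, mz in fragment_mz(peptide).items()
    }
```

Running `matched_ions` once per engine-specific ion definition and laying the resulting booleans out in rows would give a grid comparable in spirit to the one described for Figure 3.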
Surprisingly, we found adequate agreement in the matched fragment ions between algorithms that had matched a peptide to a spectrum with high confidence and algorithms that had matched the identical peptide to the same spectrum with lower confidence. For the five doubly charged peptides in Figure 3, Mascot and Spectrum Mill matched the spectra 10-200-fold better than OMSSA and X!Tandem (fold difference based on scores converted to FDR values). Although less prominent, this difference is also reflected in the median scores for the peptides matched by all four algorithms. Scores and respective FDR values for the five peptides in Figure 3 are shown in Table 1E, and the medians and 25th and 75th percentile scores for peptides identified by all algorithms are shown in Table 1D. The general conclusion from this analysis is that the four algorithms agree well with regard to matching of the observed fragment ions.

Overlap and Correlation of Identifications between Search Algorithms. From the apparently different handling of charge states, we expected the overlap between the search algorithms to be low. Indeed, when we calculated the overlap of more than 17 000 spectra (FDR 5%), only 1/6 were identified by all 4 algorithms. In agreement with the low overlap, we observed that 20-40% of the matches by an algorithm were specific to that particular algorithm. Both these observations are in sharp contrast to CID based comparisons of algorithms, where an overlap by all algorithms of ∼50% has been reported11,12 and where only 4-7% algorithm-specific matches have been reported (Supporting Information Table 3 in work by Balgley et al.12). The Venn diagrams in Figure 4 depict the overlap for the 0.1% FDR (A), 1% FDR (B), and 5% FDR (C) data sets. For all three data sets, approximately half of all matches were identified by two or more algorithms. Although the above analysis describes similarities and differences between the four algorithms, this comparison is influenced by the observed charge state bias. In a second comparison, we therefore focused on the 2717 (5% FDR) spectra that were matched by all four algorithms (89% of these matched triply charged peptides). For each of these matches, we checked the FDR group to which they belonged (0.1%, 1%, or 5% FDR) and mapped this information in Figure 5 using color codes: ≤0.1% FDR, green; 0.1-1% FDR, yellow; and 1-5% FDR, red. From this figure it is evident that approximately 1/3 of the shared identifications were matched with similar confidence. The majority of these matches belonged to the most confidently identified group (0.1% FDR). However, nearly the same number of matches (27%) were assigned to the most confident group (FDR ≤0.1%) by one algorithm and to the least confident group (FDR 1-5%) by another. From Figure 5, it is also seen that, generally, Mascot and Spectrum Mill match with higher confidence than X!Tandem and OMSSA. The medians calculated for the algorithms (using scores/e-values converted to FDR values) confirm this, with values for Mascot and Spectrum Mill that are ∼4 times better than for X!Tandem and OMSSA. Finally, in contrast to all spectra matched by a specific algorithm, the medians for the group of peptides matched identically by all four algorithms were 4-10 times better (Table 1D), indicating that spectra matched identically by several algorithms are generally more confident matches.

Agreement of Peptides Matched to Spectra.
In the above section, we focused on spectra matched to the identical peptide

(32) Mathivanan, S.; Ahmed, M.; Ahn, N. G.; Alexandre, H.; Amanchy, R.; Andrews, P. C.; Bader, J. S.; Balgley, B. M.; Bantscheff, M.; Bennett, K. L.; Bjorling, E.; Blagoev, B.; Bose, R.; Brahmachari, S. K.; Burlingame, A. S.; Bustelo, X. R.; Cagney, G.; Cantin, G. T.; Cardasis, H. L.; Celis, J. E.; Chaerkady, R.; Chu, F.; Cole, P. A.; Costello, C. E.; Cotter, R. J.; Crockett, D.; DeLany, J. P.; De Marzo, A. M.; DeSouza, L. V.; Deutsch, E. W.; Dransfield, E.; Drewes, G.; Droit, A.; Dunn, M. J.; Elenitoba-Johnson, K.; Ewing, R. M.; Van Eyk, J.; Faca, V.; Falkner, J.; Fang, X.; Fenselau, C.; Figeys, D.; Gagne, P.; Gelfi, C.; Gevaert, K.; Gimble, J. M.; Gnad, F.; Goel, R.; Gromov, P.; Hanash, S. M.; Hancock, W. S.; Harsha, H. C.; Hart, G.; Hays, F.; He, F.; Hebbar, P.; Helsens, K.; Hermeking, H.; Hide, W.; Hjerno, K.; Hochstrasser, D. F.; Hofmann, O.; Horn, D. M.; Hruban, R. H.; Ibarrola, N.; James, P.; Jensen, O. N.; Jensen, P. H.; Jung, P.; Kandasamy, K.; Kheterpal, I.; Kikuno, R. F.; Korf, U.; Korner, R.; Kuster, B.; Kwon, M. S.; Lee, H. J.; Lee, Y. J.; Lefevre, M.; Lehvaslaiho, M.; Lescuyer, P.; Levander, F.; Lim, M. S.; Lobke, C.; Loo, J. A.; Mann, M.; Martens, L.; Martinez-Heredia, J.; McComb, M.; McRedmond, J.; Mehrle, A.; Menon, R.; Miller, C. A.; Mischak, H.; Mohan, S. S.; Mohmood, R.; Molina, H.; Moran, M. F.; Morgan, J. D.; Moritz, R.; Morzel, M.; Muddiman, D. C.; Nalli, A.; Navarro, J. D.; Neubert, T. A.; Ohara, O.; Oliva, R.; Omenn, G. S.; Oyama, M.; Paik, Y. K.; Pennington, K.; Pepperkok, R.; Periaswamy, B.; Petricoin, E. F.; Poirier, G. G.; Prasad, T. S.; Purvine, S. O.; Rahiman, B. A.; Ramachandran, P.; Ramachandra, Y. L.; Rice, R. H.; Rick, J.; Ronnholm, R. H.; Salonen, J.; Sanchez, J. C.; Sayd, T.; Seshi, B.; Shankari, K.; Sheng, S. J.; Shetty, V.; Shivakumar, K.; Simpson, R. J.; Sirdeshmukh, R.; Siu, K. W.; Smith, J. C.; Smith, R. D.; States, D. J.; Sugano, S.; Sullivan, M.; Superti-Furga, G.; Takatalo, M.; Thongboonkerd, V.; Trinidad, J. C.; Uhlen, M.; Vandekerckhove, J.; Vasilescu, J.; Veenstra, T. D.; Vidal-Taboada, J. M.; Vihinen, M.; Wait, R.; Wang, X.; Wiemann, S.; Wu, B.; Xu, T.; Yates, J. R.; Zhong, J.; Zhou, M.; Zhu, Y.; Zurbig, P.; Pandey, A. Nat. Biotechnol. 2008, 26, 164–167.
(33) Cox, J.; Hubner, N. C.; Mann, M. J. Am. Soc. Mass Spectrom. 2008, 19, 1813–1820.
Figure 3. Five MS/MS spectra that were each matched to an identical doubly charged peptide by all four algorithms. The spectra shown are the actual spectra searched after extraction and deisotoping. Below each spectrum is a “chess board” depiction representing the theoretical c, c′, z, z′, z′′, and y fragment ions for the matched peptides. A filled square indicates a matched fragment ion. The peptides matched to the spectra in panels A-C were identified by all four algorithms within the 5% FDR data set. The spectra in panels D and E were in the 5% FDR data set for Mascot and Spectrum Mill and were also matched by both OMSSA and X!Tandem, although at FDRs higher than 5%. Scores and calculated FDR values for the five matched peptides are shown in Table 1E.
by all four algorithms. A different question is how well the algorithms agree in their assignments of the spectra. To address it, we extracted all spectra from the 5% FDR data set that were assigned by two or more of the algorithms. Of the 16 479 matched spectra, 8913 met this criterion. We observed that the large majority of those spectra were assigned to the identical peptide by different
algorithms. Only for 5% of the spectra did we see different assignments for the same spectrum. For ∼20% of these alternatively assigned spectra, the charge states of the peptides were unknown, but only in very few cases did the algorithms disagree with regard to the charge state of the peptides matched to such a spectrum. More than half of the alternatively assigned spectra
Figure 4. Venn diagrams depicting the overlap between the four algorithms at (A) 0.1% FDR, (B) 1% FDR, and (C) 5% FDR. The areas representing each algorithm are drawn to scale, but the overlaps are not. The color codes are identical to those used in Supporting Information Table 1.
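The overlap figures summarized in the Venn diagrams (matches shared by all algorithms, matches shared by two or more, and algorithm-specific matches) can be illustrated with a minimal Python sketch. It is not the procedure used in this study: the function name `overlap_summary`, the spectrum identifiers, and the toy peptide strings are all hypothetical, and a match is assumed to be a (spectrum, peptide) pair that passed the chosen FDR cutoff.

```python
from collections import Counter


def overlap_summary(matches_by_algorithm):
    """Summarise how many spectrum-to-peptide matches are shared between algorithms.

    matches_by_algorithm maps an algorithm name to the set of
    (spectrum_id, peptide) pairs it reported at the chosen FDR cutoff.
    """
    # For every distinct match, count how many algorithms reported it.
    support = Counter()
    for matches in matches_by_algorithm.values():
        support.update(matches)

    n_algorithms = len(matches_by_algorithm)
    return {
        "total": len(support),
        "shared_by_all": sum(1 for n in support.values() if n == n_algorithms),
        "shared_by_two_or_more": sum(1 for n in support.values() if n >= 2),
        # Fraction of each algorithm's matches that no other algorithm reported.
        "specific_fraction": {
            name: sum(1 for m in matches if support[m] == 1) / len(matches)
            for name, matches in matches_by_algorithm.items()
            if matches
        },
    }


# Toy data (hypothetical identifiers and peptides, for illustration only).
toy = {
    "Mascot": {("s1", "PEPTIDEK"), ("s2", "ELVISK"), ("s3", "SAMPLER")},
    "Spectrum Mill": {("s1", "PEPTIDEK"), ("s2", "ELVISK"), ("s4", "ANQTHERK")},
    "OMSSA": {("s1", "PEPTIDEK"), ("s5", "TESTPEPK")},
    "X!Tandem": {("s1", "PEPTIDEK"), ("s2", "ELVISK")},
}
summary = overlap_summary(toy)
print(summary)
```

Keying each match on both the spectrum and the peptide means that two algorithms assigning different peptides to the same spectrum are counted as separate, algorithm-specific matches rather than as agreement.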
were assigned to phosphopeptides, and most of the differences were related to the assignment of phosphorylated residues. More than half of the differently assigned spectra were assigned by only two algorithms. Four examples of such alternatively matched spectra are shown in Figure 6. In the panels, diamonds and triangles are used to mark fragment ions that can be matched to the peptides that different algorithms assigned to the shown spectrum.

Phosphopeptides: Numbers. One very important application of ETD is the analysis of labile modifications exemplified by O-linked phosphorylation and O-linked N-acetylglucosamine
Figure 5. Comparison of confidence levels for matches between algorithms. At 5% FDR, 2717 spectra were matched to identical peptides by all four algorithms. Agreements and differences with respect to the three FDR groups are shown using three color codes: green, 0.1% FDR group; yellow, 1% FDR group; red, 5% FDR group. The division of the matches into the four vertical sections is based on the agreement between the algorithms. The best agreement is shown at the top. The numbers on the right side of the map represent the total number of spectra in each section.
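One way to quantify the kind of agreement mapped in Figure 5 is to bin each algorithm's FDR value for a shared match into the three FDR groups and measure how far apart the groups lie. The following Python sketch does this under stated assumptions: the function names and the example FDR values are hypothetical, and the "spread" measure is an illustration rather than the criterion used to define the sections of the figure.

```python
def fdr_group(fdr_percent):
    """Return 0, 1, or 2 for the <=0.1%, 0.1-1%, and 1-5% FDR groups."""
    if fdr_percent <= 0.1:
        return 0
    if fdr_percent <= 1.0:
        return 1
    if fdr_percent <= 5.0:
        return 2
    raise ValueError("match lies outside the 5% FDR data set")


def agreement_spread(fdr_by_algorithm):
    """Distance between the most and least confident group for one shared match.

    0 means all algorithms placed the match in the same FDR group;
    2 means one algorithm placed it in the <=0.1% group and another in 1-5%.
    """
    groups = [fdr_group(value) for value in fdr_by_algorithm.values()]
    return max(groups) - min(groups)


# Hypothetical FDR values (in percent) for two spectra matched by all four algorithms.
consistent = {"Mascot": 0.05, "Spectrum Mill": 0.02, "OMSSA": 0.08, "X!Tandem": 0.1}
divergent = {"Mascot": 0.05, "Spectrum Mill": 0.02, "OMSSA": 3.8, "X!Tandem": 0.6}

print(agreement_spread(consistent))
print(agreement_spread(divergent))
```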
Figure 6. Four cases where different algorithms matched different peptides to the same spectrum. Two different matches are shown (upper left and right corners) for each spectrum. Diamonds and triangles indicate fragment ions that can be matched to each of the peptides. Sp indicates a phosphorylated serine residue. FDR values are listed for each assignment: (A) DAYSSpFGSRSDSRGK (OMSSA, 3.8% FDR) vs DAYSSFGSRSDSpRGK (Spectrum Mill, 8.4 × 10⁻²% FDR), (B) FPGQLNADLR (Spectrum Mill, 4.3% FDR) vs GRGGGGGGGGGGGGGR (Mascot, 1.4% FDR), (C) DPSSYVKLSpVGKK (X!Tandem, 4.5% FDR) vs TTPSYVAFTDTER (Mascot, 3.0% FDR), and (D) GDSSpAEELK (Spectrum Mill, 3.8 × 10⁻¹% FDR) vs KEELARLR (Mascot, 1.7% FDR).
(O-GlcNAc).34,35 Focusing on spectra matched to phosphopeptides, we tabulated these numbers for the four algorithms (Table 1F). In the 5% FDR group, Mascot and X!Tandem both identified ∼1000 phosphorylated peptides, corresponding to 11% and 14%, respectively, of their total matches. The corresponding numbers for Spectrum Mill and OMSSA were 1810 (16%) and 1552 (23%), respectively. OMSSA identified a relatively larger proportion of phosphopeptides than the three other algorithms, and we speculate that this might be due to OMSSA’s bias against doubly charged peptides. We have previously observed many phosphorylation motifs embedded around arginine and lysine residues, which can trigger a missed tryptic cleavage event and result in more highly charged peptide ions.26

(34) Li, X.; Molina, H.; Huang, H.; Zhang, Y. Y.; Liu, M.; Qian, S. W.; Slawson, C.; Dias, W. B.; Pandey, A.; Hart, G. W.; Lane, M. D.; Tang, Q. Q. J. Biol. Chem. 2009, 284, 19248–19254.
(35) Chalkley, R. J.; Thalhammer, A.; Schoepfer, R.; Burlingame, A. L. Proc. Natl. Acad. Sci. U. S. A. 2009, 106, 8894–8899.
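The distinction drawn for the alternatively assigned spectra, between a disagreement about which residue is phosphorylated and a disagreement about the peptide sequence itself, can be sketched in Python using the four peptide pairs listed in the Figure 6 caption. This is only an illustration: the function names are hypothetical, and treating a lowercase "p" after S, T, or Y as the phosphorylation marker extends the "Sp" notation of the caption by assumption.

```python
import re


def backbone(peptide):
    """Remove the lowercase 'p' that marks a phosphorylated S, T, or Y residue."""
    return re.sub(r"(?<=[STY])p", "", peptide)


def compare_assignments(peptide_a, peptide_b):
    """Classify how two peptide assignments for the same spectrum differ."""
    if peptide_a == peptide_b:
        return "identical"
    if backbone(peptide_a) == backbone(peptide_b):
        return "same sequence, different phosphorylation"
    return "different sequence"


# Peptide pairs taken from the Figure 6 caption (panels A-D).
figure6_pairs = {
    "A": ("DAYSSpFGSRSDSRGK", "DAYSSFGSRSDSpRGK"),
    "B": ("FPGQLNADLR", "GRGGGGGGGGGGGGGR"),
    "C": ("DPSSYVKLSpVGKK", "TTPSYVAFTDTER"),
    "D": ("GDSSpAEELK", "KEELARLR"),
}
results = {panel: compare_assignments(a, b) for panel, (a, b) in figure6_pairs.items()}
for panel, outcome in results.items():
    print(panel, outcome)
```

Under this classification only panel A is a pure localization disagreement; the other three pairs differ in amino acid sequence.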
Phosphopeptides: Ambiguity in Assignment? Combining the phosphopeptides matched by all four algorithms led to 3164 peptides. Filtering for redundancy reduced this number to 1918 unique peptide matches originating from a total of 1353 unique amino acid sequences. The majority of these sequences (90%) contained more than one modifiable residue (serine, threonine, and tyrosine), averaging 4.1 such residues per peptide. This number is significantly higher than the average of 1.6 phosphorylation events per peptide for the entire data set. With nearly 3 potential sites per phosphorylation event, we decided to compare how the search algorithms had handled the localization of phosphorylated residues. In theory, a phosphorylated serine or threonine residue should not behave differently in ETD from unmodified amino acids, although other factors could influence the assignments. Phosphopeptides are often of lower abundance, which can result in MS/MS spectra of lower quality and compromise the certainty of assignments. On
Figure 7. Box plots showing differences between the best matched and second best matched peptide for a given spectrum. Box plots are shown only for Mascot and Spectrum Mill. To simplify the comparison, the differences are shown as FDR values. Whiskers indicate the minimum and maximum. The 25th and 75th percentiles are shown by the bottom and top of each box, and the medians are indicated by the line dividing each box. Smaller values indicate small differences between the best and second best match (less ideal), whereas larger values indicate larger differences (preferred). Box plot (A) shows the distributions of differences for all MS/MS spectra, while (B) shows those for spectra that were matched only to phosphopeptides.
the other hand, peptide sequences with multiple modifiable residues could indeed exist in several isomeric forms, but not necessarily at identical stoichiometric ratios. Such isomeric phosphopeptides are expected to behave similarly in reversed-phase chromatography, the method of choice for LC-MS/MS experiments, and would therefore very likely be fragmented simultaneously. When one spectrum is assigned to more than one phosphopeptide, it is therefore difficult to assess whether this is due to ambiguous assignments or to isomeric peptides, which would be a biologically relevant observation. To assess our data with respect to the above, we (1) counted the number of cases where more than one phosphopeptide was assigned equally well to the same spectrum and (2) compared the score of the best match to that of the second best match for each spectrum. For Mascot, none of the 993 spectra assigned to phosphopeptides had more than one highest scoring match. For OMSSA, 49 of the 1552 matched phosphopeptides (3%) had one or more additional phosphopeptides assigned equally well to the identical spectrum. These numbers were 84 (8%) for X!Tandem and 365 (20%) for Spectrum Mill. The noticeably larger percentage for Spectrum Mill could be the result of its scoring system. Whereas OMSSA and Mascot rely on probabilistic scoring systems that might be capable of distinguishing more subtle differences, Spectrum Mill relies on a more empirical scoring system. Regarding the differences between the best and second best matched peptide for a spectrum, a second best matched peptide existed for nearly all spectra for Mascot and Spectrum Mill, but for OMSSA and X!Tandem, such matches were less often available. For the cases where a second best match was available, we calculated its difference from the best match and, to allow for a comparison between the algorithms, we used the scores/e-values converted into FDR values.
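The comparison of best and second best matches described above can be sketched as follows. This Python example is a sketch under stated assumptions, not the authors' implementation: the difference is taken as the second best FDR minus the best FDR (so that larger values mean better differentiation), the function names are hypothetical, the FDR pairs are invented toy values, and the inclusive quartile method is one of several possible conventions for the 25th and 75th percentiles.

```python
from statistics import quantiles


def delta_fdr(best_fdr, second_fdr):
    """Difference between the second best and best match, both expressed as FDR values."""
    return second_fdr - best_fdr


def box_summary(values):
    """Minimum, quartiles, and maximum of a list of differences (needs >= 2 values)."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    return {"min": min(values), "q1": q1, "median": median, "q3": q3, "max": max(values)}


# Hypothetical (best FDR, second best FDR) pairs in percent, one pair per spectrum.
pairs = [(0.5, 1.5), (0.25, 2.25), (1.0, 4.0), (0.5, 4.5), (0.0, 5.0)]
deltas = [delta_fdr(best, second) for best, second in pairs]
summary = box_summary(deltas)
print(summary)
```

Spectra for which an algorithm reports no second best match would have to be excluded before calling `box_summary`, mirroring the restriction to cases where a second best match was available.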
The differences between the best and second best match (delta FDR) for the Spectrum Mill and Mascot results are presented as box plots (Figure 7). Here, large numbers indicate better differentiation (i.e., more favorable). The data have been divided into phosphorylated (A) and nonphosphorylated (B) subsets, and it is easily seen that the differences between the best and second best matched peptide are lower for phosphopeptides than for nonphosphorylated peptides. Also, the plot indicates that Mascot, compared to Spectrum Mill, is better at differentiating the best match from the next best match. The smaller differences observed for Spectrum Mill are consistent with the observation that 20% of spectra matched by Spectrum Mill were matched equally well to multiple peptides. All of those matches were to different isomeric forms of the phosphopeptide. The difference between the phosphorylated and nonphosphorylated peptides illustrates that, even when using ETD, localizing a modified residue among several potential residues still poses a challenge. The added complexity in localizing phosphorylated residues has previously been recognized and addressed experimentally15,36,37 and by developing special scoring systems.15,16,37,38

(36) Olsen, J. V.; Blagoev, B.; Gnad, F.; Macek, B.; Kumar, C.; Mortensen, P.; Mann, M. Cell 2006, 127, 635–648.
(37) Palumbo, A. M.; Reid, G. E. Anal. Chem. 2008, 80, 5727–5735.
(38) Schroeder, M. J.; Shabanowitz, J.; Schwartz, J. C.; Hunt, D. F.; Coon, J. J. Anal. Chem. 2004, 76, 3590–3598.

CONCLUSIONS

The most recent CID-based MS/MS search algorithm comparison12 found that, among five algorithms, OMSSA and X!Tandem identified the most peptides. This CID study also showed that few peptide matches were specific to only one algorithm and that ∼50% of all matches were identified by all algorithms. Our conclusions from ETD data are in sharp contrast to this. We observed that Spectrum Mill and Mascot performed best for ETD data. We also found that a relatively large number of matches were specific to only one algorithm, and the overlap across the four algorithms was lower than noted in the above-mentioned CID-based comparison. Even for spectra identified by all four algorithms, we observed large differences in the confidence level of the assigned matches. Though 30% of the matches that were identified by all algorithms agreed well with regard to confidence,
nearly the same number disagreed to such an extent that, where one algorithm assigned a match at 1% to 5% FDR, another algorithm matched the same peptide to the same spectrum ≥10 times better (0.1% FDR group). This indicates that the field of automated ETD database searching is in its early days and, because of the large differences, users of the ETD methodology must be very cautious in their conclusions, especially if they are using a single search algorithm to analyze ETD spectra. One illustration of this danger is the charge state bias that we observed in this study, which has led to a common misconception that the ETD methodology is far less suited for doubly charged peptide ions than it really is.39 One way to deal with known and unknown biases of algorithms would be to use multiple algorithms. However, the best solution is for global ETD data sets to be made available to algorithm developers, who can then optimize algorithms by incorporating observed fragmentation patterns specific to ETD and ECD,40-43 just as has been done for CID over more than a decade. In this study, Spectrum Mill performed the best.

(39) Swaney, D. L.; McAlister, G. C.; Coon, J. J. Nat. Methods 2008, 5, 959–964.
(40) Savitski, M. M.; Nielsen, M. L.; Zubarev, R. A. Anal. Chem. 2007, 79, 2296–2302.
(41) Falth, M.; Savitski, M. M.; Nielsen, M. L.; Kjeldsen, F.; Andren, P. E.; Zubarev, R. A. Anal. Chem. 2008, 80, 8089–8094.
(42) Cooper, H. J.; Hudgins, R. R.; Hakansson, K.; Marshall, A. G. J. Am. Soc. Mass Spectrom. 2002, 13, 241–249.
(43) Fung, Y. M.; Chan, T. W. J. Am. Soc. Mass Spectrom. 2005, 16, 1523–1535.
However, developers of open-source tandem MS search algorithms have contributed greatly to the proteomics community in recent years, and we expect that the optimization curve for ETD search algorithm development should be steep. What is best today might easily change tomorrow, and this subject should therefore be revisited in the not too distant future.

SUPPORTING INFORMATION AVAILABLE

Schematic of the analysis, examples of the .mgf peak list used when submitting the tandem MS data to the algorithms, FDR/Hyperscore and FDR/E-value plots created for X!Tandem, summary of the search parameters used to compare the algorithms, summary of the Venn diagrams depicted in Figure 4, and summary of matched spectra (5% FDR) by all four MS/MS search algorithms. This material is available free of charge via the Internet at http://pubs.acs.org.

ACKNOWLEDGMENT

We thank Jakob Bunkenborg, Rune Matthiesen, and David M. Horn for their input and suggestions. This work was supported in part by NIH Roadmap Grant U54 RR020839 and a Department of Defense Era of Hope Scholar award (W81XWH-06-1-0428).
Received for review March 24, 2009. Accepted July 7, 2009. AC9006107