Anal. Chem. 2005, 77, 7816-7825

Technical Notes

Enhanced Characterization of Complex Proteomic Samples Using LC-MALDI MS/MS: Exclusion of Redundant Peptides from MS/MS Analysis in Replicate Runs

Hsuan-shen Chen, Tomas Rejtar, Victor Andreev, Eugene Moskovets, and Barry L. Karger*

Barnett Institute and Department of Chemistry and Chemical Biology, Northeastern University, 341 Mugar, 360 Huntington Avenue, Boston, Massachusetts 02115

Due to the complexity of proteome samples, only a portion of peptides and thus proteins can be identified in a single LC-MS/MS analysis in current shotgun proteomics methodologies. It has been shown that replicate runs can be used to improve the comprehensiveness of the proteome analysis; however, high-intensity peptides tend to be analyzed repeatedly in different runs, thus reducing the chance of identifying low-intensity peptides. In contrast to commonly used online ESI-MS, offline MALDI decouples the separation from MS acquisition, thus allowing in-depth selection for specific precursor ions. Accordingly, we extended a strategy for offline LC-MALDI MS/MS analysis using a precursor ion exclusion list consisting of all identified peptides in preceding runs. The exclusion list eliminated redundant MS/MS acquisitions in subsequent runs, thus reducing MALDI sample depletion and allowing a larger number of peptide identifications in the cumulative dataset. In the analysis of the digest of an Escherichia coli lysate, the exclusion list strategy resulted in a 25% increase in the number of unique peptide identifications in the second run, in contrast to simply pooling MS/MS data from two replicate runs. To reduce the increased LC analysis time for repeat runs, a four-column multiplexed LC system was developed to carry out the separations simultaneously. The multiplexed LC-MALDI MS provides a high-throughput platform to utilize the exclusion list strategy in proteome analysis.

High performance liquid chromatography (HPLC) coupled with tandem mass spectrometry (MS/MS) has been widely used in shotgun proteomics to identify and quantitate peptides and, thus, proteins.1 Such analyses can be employed in profiling biological states of cells and organelles as well as in applications

* Corresponding author. Phone: (617) 373-2867. Fax: (617) 373-2855. E-mail: [email protected].
(1) McCormack, A. L.; Schieltz, D. M.; Goode, B.; Yang, S.; Barnes, G.; Drubin, D.; Yates, J. R., 3rd. Anal. Chem. 1997, 69, 767-776.

7816 Analytical Chemistry, Vol. 77, No. 23, December 1, 2005

in biomarker discovery.2,3 However, due to the great complexity of the proteome, comprehensive analysis to characterize all components is still not possible with present methodologies.

Currently, electrospray ionization (ESI) is generally used to couple LC separation to mass spectrometry in proteomics applications due to the high sensitivity and complete automation of the ESI-MS instruments available.4,5 In LC-ESI MS analysis, MS/MS acquisitions are triggered by “data-dependent analysis” of the most intense peaks in the MS spectrum. As a result, only a few MS/MS spectra can be acquired in one data-dependent analysis cycle; however, due to the complexity of proteomic samples, coelution of peptides is prevalent with LC separation. If the number of coeluting peptides exceeds the number of MS/MS spectra that can be acquired in one data-dependent analysis cycle, low-intensity peptides will not undergo MS/MS analysis. Dynamic exclusion is generally applied in LC-ESI MS/MS analysis to mitigate this problem. New advances in MS instrumentation, such as the linear ion trap, have shown the ability to acquire more MS/MS spectra in one data-dependent analysis cycle, thus improving the capability to analyze coeluting peptides; however, even with dynamic exclusion and new instrumentation, LC-ESI MS still has intrinsic limitations when analyzing very complex samples, due to the finite time that eluting peaks are available for online MS/MS acquisitions. In addition, MS/MS acquisitions can be triggered at positions off the apex of the peak, resulting in a loss in sensitivity and, thus, peptide identifications. Redundant MS/MS acquisitions also occur due to multiple charge states of peptides. As a result of these intrinsic limitations, a single LC-ESI MS/MS analysis of a complex sample identifies far fewer peptides than the sample contains.

(2) McDonald, W. H.; Yates, J. R., 3rd. Dis. Markers 2002, 18, 99-105.
(3) Pan, S.; Zhang, H.; Rush, J.; Eng, J.; Zhang, N.; Patterson, D.; Comb, M. J.; Aebersold, R. Mol. Cell. Proteomics 2005, 4, 182-190.
(4) Wolters, D. A.; Washburn, M. P.; Yates, J. R., 3rd. Anal. Chem. 2001, 73, 5683-5690.
(5) Link, A. J.; Eng, J.; Schieltz, D. M.; Carmack, E.; Mize, G. J.; Morris, D. R.; Garvik, B. M.; Yates, J. R., 3rd. Nat. Biotechnol. 1999, 17, 676-682.

10.1021/ac050956y CCC: $30.25

© 2005 American Chemical Society Published on Web 10/22/2005
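The sampling bias of data-dependent acquisition described in the Introduction can be illustrated with a toy model of a single acquisition cycle. This is only a minimal sketch and not instrument control code: the function name `select_top_n`, the `(m/z, intensity)` tuple layout, and all numerical values are hypothetical and chosen purely for illustration.

```python
from typing import List, Tuple

# (m/z, intensity); a hypothetical layout used only for this illustration
Peak = Tuple[float, float]


def select_top_n(peaks: List[Peak], n: int) -> Tuple[List[Peak], List[Peak]]:
    """Split one MS survey spectrum into precursors selected for MS/MS
    and precursors that are skipped.

    Mimics a single data-dependent cycle: only the n most intense
    precursors are fragmented, so weaker coeluting peptides are missed.
    """
    ranked = sorted(peaks, key=lambda p: p[1], reverse=True)
    return ranked[:n], ranked[n:]


# Five coeluting peptides, but the cycle has time for only three MS/MS spectra.
survey = [
    (800.40, 15.0),
    (1200.60, 50.0),
    (1123.60, 220.0),
    (1345.71, 310.0),
    (1601.80, 95.0),
]
selected, skipped = select_top_n(survey, n=3)
```

In this toy cycle the two weakest precursors are never fragmented, and repeating the run without any exclusion would select the same three intense precursors again.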

To improve the comprehensiveness of proteome analysis, several groups have used replicate LC-MS runs.6,7 In one study, it was demonstrated that the total number of protein identifications increased by 70% when accumulating data from nine repeated LC/LC/MS/MS experiments using an LCQ Deca XP ion trap instrument.8 However, the redundancy of protein identifications from run to run was found to be high, reflecting the fact that the sampling process in LC-ESI MS analysis favors the identification of high-abundance proteins. Recently, it has been demonstrated that this protein redundancy can be reduced using a precursor exclusion list strategy (extended dynamic exclusion) in the repeated runs, resulting in an increase in the number of identified proteins.9

One obvious drawback of replicate experiments is the prolonged analysis time. Using a multicolumn separation system to perform multiple LC-ESI MS analyses in parallel could greatly increase the throughput. The commercially available MUX system couples four separation columns to MS by switching the ESI interface between columns and sharing MS instrumental time;10 however, the column switching in the MUX system discards a portion of the separations, which can result in a loss in peptide identifications. To perform MS/MS analysis, coupling of multiple ESI interfaces requires multiple mass analyzers (such as micro ion-trap arrays11,12) in a multiplexed LC-ESI MS system, which is still under development.

MALDI provides an alternative approach to coupling LC separations to mass spectrometry. Various interfaces have been developed for both online and off-line coupling of LC to MALDI MS.13-22 The off-line MALDI interfaces allow eluents from the LC column to be deposited and archived on the MALDI target prior

(6) Lipton, M. S.; Pasa-Tolic, L.; Anderson, G. A.; Anderson, D. J.; Auberry, D. L.; Battista, J. R.; Daly, M. J.; Fredrickson, J.; Hixson, K. K.; Kostandarithes, H.; Masselon, C.; Markillie, L. M.; Moore, R. J.; Romine, M. F.; Shen, Y.; Stritmatter, E.; Tolic, N.; Udseth, H. R.; Venkateswaran, A.; Wong, K. K.; Zhao, R.; Smith, R. D. Proc. Natl. Acad. Sci. U.S.A. 2002, 99, 11049-11054.
(7) Wu, C. C.; MacCoss, M. J.; Howell, K. E.; Yates, J. R., 3rd. Nat. Biotechnol. 2003, 21, 532-538.
(8) Liu, H.; Sadygov, R. G.; Yates, J. R., 3rd. Anal. Chem. 2004, 76, 4193-4201.
(9) Hui, J. P. M.; Tessier, S.; Butler, H.; Badger, J.; Kearney, P.; Carrier, A.; Thibault, P. Proceedings of the 51st ASMS Conference on Mass Spectrometry and Allied Topics, Montreal, Canada, June 8-12, 2003.
(10) Morrison, D.; Davies, A. E.; Watt, A. P. Anal. Chem. 2002, 74, 1896-1902.
(11) Misharin, A. S.; Laughlin, B. C.; Vilkov, A.; Takats, Z.; Ouyang, Z.; Cooks, R. G. Anal. Chem. 2005, 77, 459-470.
(12) Tabert, A. M.; Griep-Raming, J.; Guymon, A. J.; Cooks, R. G. Anal. Chem. 2003, 75, 5656-5664.
(13) Miliotis, T.; Kjellstrom, S.; Nilsson, J.; Laurell, T.; Edholm, L. E.; Marko-Varga, G. J. Mass Spectrom. 2000, 35, 369-377.
(14) Preisler, J.; Hu, P.; Rejtar, T.; Karger, B. L. Anal. Chem. 2000, 72, 4785-4795.
(15) Rejtar, T.; Hu, P.; Juhasz, P.; Campbell, J. M.; Vestal, M. L.; Preisler, J.; Karger, B. L. J. Proteome Res. 2002, 1, 171-179.
(16) Wall, D. B.; Berger, S. J.; Finch, J. W.; Cohen, S. A.; Richardson, K.; Chapman, R.; Drabble, D.; Brown, J.; Gostick, D. Electrophoresis 2002, 23, 3193-3204.
(17) Preisler, J.; Hu, P.; Rejtar, T.; Moskovets, E.; Karger, B. L. Anal. Chem. 2002, 74, 17-25.
(18) Ericson, C.; Phung, Q. T.; Horn, D. M.; Peters, E. C.; Fitchett, J. R.; Ficarro, S. B.; Salomon, A. R.; Brill, L. M.; Brock, A. Anal. Chem. 2003, 75, 2309-2315.
(19) Tegeler, T. J.; Mechref, Y.; Boraas, K.; Reilly, J. P.; Novotny, M. V. Anal. Chem. 2004, 76, 6698-6706.
(20) Zhang, B.; McDonald, C.; Li, L. Anal. Chem. 2004, 76, 992-1001.
(21) Chen, H. S.; Rejtar, T.; Andreev, V.; Moskovets, E.; Karger, B. L. Anal. Chem. 2005, 77, 2323-2331.

to MS and MS/MS analysis. The decoupling of separation and MS analysis removes time constraints on MS/MS acquisition, permitting a more detailed strategy for selecting precursor ions for MS/MS analysis.3,23,24 Moreover, peptide ions in MALDI are predominantly singly charged, thus resulting in less complex MS spectra than those in ESI MS. MS/MS spectra can be acquired with optimum sensitivity at the maximum of the chromatographic peak, and redundant acquisitions can be minimized. In addition, off-line coupling also allows parallel separations to be carried out simultaneously and subsequently analyzed using a single mass spectrometer.25

In this paper, we have adopted and extended the precursor exclusion list strategy used for LC-ESI MS9 and applied it to LC-MALDI MS. Our strategy utilizes a precursor exclusion list from peptides identified with high confidence in repeat LC-MS/MS runs, thus minimizing redundant MS/MS acquisitions and, as will be shown, facilitating analysis of low-intensity peptides. Importantly, only precursors of positively identified peptides were excluded from subsequent MS/MS acquisitions, allowing reanalysis of peptides with marginal scores. Furthermore, MS/MS spectra could be acquired for additional precursors that could not be analyzed in the first run because of sample depletion. In this strategy, multiplexed LC-MALDI MS provides a high-throughput platform to analyze complex proteomes. A tryptic digest of an Escherichia coli cell lysate was used as a model sample to illustrate the advantages of this new strategy. The results demonstrate that the approach offers the potential to increase the number of protein identifications in LC-MALDI MS analysis by 25% compared to simply pooling the results from repeat runs.

EXPERIMENTAL SECTION

Chemicals. Acetonitrile, acetone, trifluoroacetic acid (TFA), ammonium citrate, ammonium bicarbonate, dithiothreitol (DTT), iodoacetamide, and α-cyano-4-hydroxycinnamic acid (CHCA) were purchased from Sigma (St. Louis, MO). The CHCA matrix was recrystallized before use. Sequencing-grade trypsin was obtained from Promega (Madison, WI).

Sample Preparation. E. coli strain HM22 was grown in rich medium [Luria-Bertani (LB) broth supplemented with diaminopimelic acid (75 µg/mL)]. The overnight culture was diluted 1000-fold in 200 mL of fresh media, and the bacteria were incubated with aeration at 37 °C for 3 h to a density of ∼10⁸ cfu/mL. After 3 h of growth, the bacteria were harvested by cooling on ice for 5 min and centrifugation at 5000g, 4 °C, for 10 min. After removal of the supernatant, cells from 25 mL of culture were quickly resuspended in 1 mL of lysis buffer (50 mM Tris HCl, 1.6% (w/v) SDS, 4% (v/v) mercaptoethanol, pH 7.4) and heated for 4 min at 90 °C. The lysate was centrifuged in a tabletop centrifuge at room temperature and 10000g for 10 min. The supernatant was carefully removed to a clean tube, and the protein concentration was measured by the Bradford assay. It should be

(22) Chen, V. C.; Cheng, K.; Ens, W.; Standing, K. G.; Nagy, J. I.; Perreault, H. Anal. Chem. 2004, 76, 1189-1196.
(23) Andreev, V. P.; Rejtar, T.; Chen, H. S.; Moskovets, E. V.; Ivanov, A. R.; Karger, B. L. Anal. Chem. 2003, 75, 6314-6326.
(24) Andreev, V.; Chen, H.-S.; Rejtar, T.; Moskovets, E.; Karger, B. L. Proceedings of the 52nd ASMS Conference on Mass Spectrometry and Allied Topics, Nashville, TN, May 23-27, 2004.
(25) Lee, H.; Griffin, T. J.; Gygi, S. P.; Rist, B.; Aebersold, R. Anal. Chem. 2002, 74, 4353-4360.


noted that the sample was originally prepared for protein fractionation by gel electrophoresis; hence, the high concentration of SDS and mercaptoethanol in the lysis buffer.26 To mimic a highly complex mixture, no such gel electrophoretic prefractionation was utilized in our study; however, the excess mercaptoethanol in the lysis buffer had to be removed. The lysate was purified by SDS-PAGE separation, carried out on a NuPAGE electrophoresis system (Invitrogen, Carlsbad, CA), using a NuPAGE Novex 4-12% Bis-Tris gel and NuPAGE MOPS SDS running buffer. Approximately 40 µg of the lysate was loaded onto the gel, followed by electrophoresis at 200 V for 5 min to separate proteins from excess mercaptoethanol. It should be noted that all proteins in the lysate were focused in a small band on the top of the gel without discernible resolution. The gel was then stained, and the entire band containing proteins was excised, destained, and in-gel-digested with trypsin, as described previously.21 Briefly, the gel was incubated with 200 µL of 25 mM ammonium bicarbonate and 10 mM DTT solution and heated at 56 °C for 1 h. The supernatant was removed from the vial and replaced with 200 µL of 55 mM iodoacetamide, and the gel was then incubated in the dark for 30 min at room temperature. After alkylation, the gel was washed sequentially with 200 µL of 50 mM ammonium bicarbonate solution, 50% ACN/water (v/v) solution, and pure ACN, then dried using a CentriVap concentrator (Labconco, Kansas City, MO). The dried gel was incubated in 200 µL of 5 ng/µL of trypsin overnight at 37 °C. To extract the peptides from the gel, 100 µL of extraction buffer (1% TFA, 50% ACN/water (v/v)) was added to the vial and the mixture was sonicated for 10 min. The supernatant was collected and dried in the CentriVap concentrator to remove ACN in the extraction buffer. The extract was resuspended in 100 µL of 0.1% TFA/H2O (v/v) and stored at -20 °C until the LC-MS analysis.

MALDI MS Direct Deposition. Cytochrome c digest (Michrom BioResources, Auburn, CA) was mixed with matrix solution (7 mg/mL CHCA, 50% ACN/water) and then deposited directly on a MALDI plate as an array of spots. It should be noted that no LC separation was performed prior to the deposition. Each spot contained roughly 200 nL of solution (equivalent to 5 s of spotting in LC-MALDI at a deposition flow rate of 2.4 µL/min) and 5 fmol of digested cytochrome c.

LC-MALDI MS Analysis. The separation in LC-MALDI MS was carried out on a 15 cm × 100 µm i.d. analytical column packed with 5-µm Magic C18 particles (Michrom BioResources). The samples were directly injected into the analytical column using a manual injection valve (Valco, Houston, TX) equipped with a 2-µL sample loop. The mobile phases were delivered using an UltiMate HPLC system (Dionex, Sunnyvale, CA). The composition of solvent A was 2% (v/v) ACN and 0.1% (v/v) TFA, and that of B was 85% (v/v) ACN, 5% (v/v) 2-propanol, and 0.1% (v/v) TFA. The separation was carried out at a flow rate of 1 µL/min using a 45-min gradient (0-30% solvent B in 40 min, 30-95% solvent B in 1 min, then constant at 95% B for 4 min). The eluent from the LC column was mixed with matrix solution (7 mg/mL CHCA, 0.1% (v/v) TFA in 50% (v/v) ACN/water) at a ratio of 2:3, resulting in a total flow of 2.5 µL/min, and the mixture was then deposited onto a stainless steel MALDI plate as an array of discrete spots

(26) SDS PAGE Gel Electrophroesis; http://www.ciwemb.edu/labs/koshland/Protocols/Protein/sdspage.html; last accessed on 08/22/05.


at the rate of 5 s/spot, using a lab-built deposition interface described previously.21 To obtain high mass accuracy in the MS mode, 50 nM of Glu-1-fibrinopeptide B (m/z 1570.667) was added in matrix solution as an internal standard. After the separation was completed, the MALDI plate was stored at 4 °C until MS analysis.

All MS and MS/MS spectra were acquired on an AB 4700 Proteomic Analyzer (Applied Biosystems, Framingham, MA). Individual MS spectra for each spot were acquired using 50 laser shots at each of 20 random positions within the spot, resulting in a total of 1000 laser shots across the entire spot. Then, the spectra were calibrated using a single internal standard with the 4700 Explorer software (Applied Biosystems), resulting in roughly ±30 ppm mass accuracy across the entire MALDI plate. All MS spectra were combined into a single binary file containing both MS and chromatographic information, and then denoised and processed, using lab-developed algorithms, MEND23 and PRESEL,24 to generate a list of deisotoped candidate precursors for the MS/MS analysis. Up to a maximum of 15 MS/MS spectra were acquired from each spot (equivalent to 3 MS/MS spectra per chromatographic second). Each MS/MS spectrum was acquired using 1000 laser shots in the same fashion as for the MS acquisition. The resulting MS/MS spectra were processed by the PeakToMascot tool (Applied Biosystems) using an S/N cutoff of 10 and sequentially submitted to MASCOT database searching software (ver. 2.0, Matrix Science, Boston, MA) for peptide and protein identification using the SwissProt database (release 45) with E. coli taxonomy. Each identified protein was required to have at least one unique positive peptide identification that is not shared with any other protein. To estimate the level of false positive matches, the same dataset was searched against the SwissProt database with protein sequences reversed using a lab-developed Perl script.

Multiplexed LC-MALDI MS Analysis. The multiplexed 4-channel HPLC system was off-line-coupled to MALDI-MS using a lab-built deposition interface based on a previous design.21 Each separation channel consisted of an individual manual injection valve (Valco) with a 2-µL sample loop and a 15 cm × 100 µm i.d. analytical column packed with 5-µm Magic C18 particles, allowing simultaneous separation of multiple samples using an identical gradient. The mobile phase was delivered by a single LC pump at the rate of 4 µL/min, then split equally into four separation channels, i.e., 1 µL/min for each channel. Capillary tubes with 20-µm i.d. were used to transfer the flow and compensate for the variation in back pressure from the hand-packed columns. The eluents from four individual LC columns were mixed in separate micro-Tees with MALDI matrix solution in a ratio of 2:3 (total flow rate of 2.5 µL/min) and simultaneously transferred onto a standard 2-in. × 2-in. MALDI plate via four identical, fused-silica capillaries (10-cm length, 280-µm o.d., 20-µm i.d.). The MALDI plate was moved onto an X-Y stage controlled by a LabVIEW program (National Instruments, Austin, TX). The deposition capillaries were held perpendicular to the MALDI plate and aligned with a 12-mm gap between adjacent capillaries. During the deposition, the probe was moved up and down by a stepper motor, thus allowing simultaneous deposition of eluents from the four separation channels as discrete spots at the rate of 5 s/spot. A standard 2-in. × 2-in. MALDI plate could accommodate four

Figure 1. The effect of variation of MALDI MS/MS spectra on Mascot ion scores. A tryptic digest of cytochrome c was mixed with matrix solution and directly deposited on the MALDI target to mimic coeluted peptides in LC separations, and two relatively low intensity peptides were chosen for MS/MS acquisitions and Mascot database search. The distribution of the Mascot score for 100 replicate MS/MS measurements of peptide A (TGQAPGFTYTDANK, m/z 1470.66, black bars) and peptide B (MIFAGIKK, m/z 907.52, gray bars) is shown as histograms. All MS/MS spectra were acquired using fresh spots; i.e., all spectra were obtained from the first acquisition on individual spots. The significance threshold of the Mascot score was 28 at the 95% confidence level.
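The deposition figures quoted in the Experimental Section (1 µL/min per column, 2:3 eluent-to-matrix mixing, 5 s/spot, 7 × 28 spots per array, four arrays per plate, a 45-min gradient, four channels) can be cross-checked with a few lines of arithmetic. The sketch below uses only those reported values. The variable names are hypothetical, and the assumption that each channel's run occupies whole spot arrays is ours rather than something the text states.

```python
import math

# Values reported in the Experimental Section
ELUENT_FLOW_UL_MIN = 1.0    # LC flow per column, µL/min
MATRIX_PER_ELUENT = 3 / 2   # eluent:matrix mixed at a ratio of 2:3
SPOT_TIME_S = 5.0           # deposition time per spot, s
SPOTS_PER_ARRAY = 7 * 28    # 196 spots per array
ARRAYS_PER_PLATE = 4
GRADIENT_MIN = 45.0         # gradient length, min
CHANNELS = 4                # multiplexed LC columns

# Total flow after mixing eluent with matrix (µL/min)
total_flow_ul_min = ELUENT_FLOW_UL_MIN * (1 + MATRIX_PER_ELUENT)

# Volume deposited in one spot (nL)
spot_volume_nl = total_flow_ul_min * SPOT_TIME_S / 60 * 1000

# Chromatographic time covered by one spot array (min)
array_minutes = SPOTS_PER_ARRAY * SPOT_TIME_S / 60

# Assumption: each channel's run is laid out on whole arrays
arrays_per_run = math.ceil(GRADIENT_MIN / array_minutes)
plates_needed = math.ceil(CHANNELS * arrays_per_run / ARRAYS_PER_PLATE)
```

Under these assumptions the numbers agree with the text: about 208 nL per spot (the "roughly 200 nL" used for direct deposition), about 16 min per array, and three plates for four complete 45-min separations.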

spot arrays, each containing a total of 196 spots (7 × 28) that represented approximately 16 min of separation at 5 s/spot. The four complete separations could be spotted onto three plates. This design of the multiplexed LC system can be easily expanded to eight or even more columns to increase the throughput of separation. The MALDI samples were then analyzed using an AB 4700 Proteomic Analyzer, as described above.

RESULTS AND DISCUSSION

Our goal in this study was to develop a strategy of obtaining as much information as possible in multiple LC-MALDI MS/MS analyses of a complex proteome sample; that is, identifying low-intensity peptides in the presence of high-intensity peptides to maximize the number of identifications. The strategy of a precursor exclusion list will be demonstrated using both a single channel and a multiplexed LC-MALDI MS platform in the analysis of an E. coli cell lysate.

The Effect of Variation of MALDI MS/MS Spectra on Mascot Search Results. Due to the limited instrument time and sample amount, the MS/MS spectrum of each precursor ion is usually acquired only once in a single LC-MS/MS run, to maximize the number of peptides that can be analyzed. However, numerous sources of variation can occur during the data acquisition, such as fluctuations in the ionization process itself, ion counting statistics, and fragmentation statistics, and these errors can propagate in the MS/MS spectrum and affect peptide identification. In ESI, it has been shown that these variations can result in significant differences (up to 25% RSD) in Sequest Xcorr scores of MS/MS spectra acquired using direct infusion.27 However, no such study has yet been reported in MALDI MS/MS analysis.

(27) Venable, J. D.; Yates, J. R., 3rd. Anal. Chem. 2004, 76, 2928-2937.

It is well-known that the reproducibility of measured intensities is relatively low in MALDI MS and MS/MS analysis, largely due to the heterogeneous crystal morphology of the deposited samples and fluctuation of laser intensity. In addition, the change in solvent composition during gradient elution can affect crystal morphology. The coefficient of variation of the peak intensities in MALDI MS has been reported to be as high as 42%.28 To examine the variation in MALDI MS/MS analysis, a tryptic digest of cytochrome c was chosen as a model sample and directly deposited at multiple spots on the MALDI plate without separation to represent a complex mixture. Two peptides, TGQAPGFTYTDANK (peptide A, m/z 1470.66) and MIFAGIKK (peptide B, m/z 907.52), were selected for MS/MS acquisition. The S/N ratios of these two peptides in the MS spectrum were 50 and 15, respectively (data not shown). For peptide A, a total of 100 sample spots was used for MS/MS acquisitions. On each spot, 25 MS/MS spectra were sequentially acquired using 1000 laser shots for each spectrum. The data were then submitted to a Mascot search using the SwissProt database with a mass tolerance of ±50 ppm. Peptide B was also analyzed with the same procedure (i.e., 1000 laser shots/spectrum, 25 spectra/spot) using an additional 100 spots.

Figure 1 shows the distribution of Mascot scores for 100 replicate MS/MS spectra for the two peptides, obtained from the first acquisition of the particular sample spot; i.e., each spectrum was acquired on a fresh sample. The averages of Mascot scores for peptides A and B were 60.4 (RSD 25.4%) and 23.9 (RSD 25.1%), respectively. The sizable variation in the Mascot scores shows that the fluctuation in MALDI MS/MS analysis can greatly affect the database search result, similar to ESI MS/MS, as reported.27 Interestingly, the RSDs of the Mascot scores in MALDI MS/MS

(28) Dekker, L. J.; Dalebout, J. C.; Siccama, I.; Jenster, G.; Sillevis Smitt, P. A.; Luider, T. M. Rapid Commun. Mass Spectrom. 2005, 19, 865-870.


were similar to the RSDs of SEQUEST Xcorr scores in ESI MS/MS analysis (25%).27 Although the variations in Mascot scores of these two peptides were comparable, the numbers of identifications were quite different. The average Mascot score of peptide A (60.4) was roughly twice the significance threshold (28 at the 95% confidence level), and almost all spectra of peptide A were identified; thus, the likelihood of identifying this peptide in a single MALDI MS/MS acquisition would be high. On the other hand, the average Mascot score of the low S/N peptide B (23.9) was less than the significance threshold, and only 25 of the 100 spectra of peptide B resulted in positive identifications (Figure 1). The likelihood of identifying peptide B in a single MALDI MS/MS acquisition was therefore only 25%. This example illustrates that for a low-intensity precursor ion, such as peptide B, multiple MS/MS acquisitions can increase the likelihood of identification and, thus, increase the number of peptide/protein identifications from multiple replicate LC-MS runs.6,7

Among the sources contributing to the variation of MS/MS spectra, sample consumption is a distinct difference between ESI MS and MALDI MS. In ESI, the sample is completely sprayed during the data acquisition; in MALDI, only a portion of the spotted sample is consumed in a single MS (or MS/MS) acquisition. Nevertheless, MALDI sample spots are depleted during spectral acquisition, resulting in signal decay. This factor is especially pronounced in the MS/MS acquisition, mainly due to the high laser fluence used in the process. In a previous report, it was found that the intensity of the base peak in the MS/MS spectrum decreased by a factor of 2 after roughly 15 acquisitions.24 Sample depletion thus introduces an additional factor in MALDI-MS/MS acquisition that can influence the spectral quality and, thus, the results of database searching.

Figure 2 shows the effect of sample depletion on the Mascot score. The acquisition value in this figure indicates the number of MS/MS spectra previously acquired on a particular spot. For example, acquisition number 10 represents the 10th spectrum acquired on the same sample spot. The spread of Mascot scores obtained from 100 spots is illustrated by the box-and-whisker plots; the boxes represent 25-75 percentiles, and the whiskers represent 5-95 percentiles. It can be seen that MS/MS spectra of peptide A (m/z 1470.66, Figure 2A) with an acquisition number lower than 7 could be assigned to the correct peptide sequence; however, the median ion score decreased steadily after the first five acquisitions. Sample depletion resulted in a reduced signal in the MS/MS spectra, thus decreasing the chance of positive identification. At acquisition number 15, 30% of the MS/MS spectra had an ion score lower than the significance threshold, so that this peptide had only a 70% chance of being identified.

The effect of sample depletion is even more critical for low-intensity peptides. As illustrated in Figure 2B, roughly 30% of the MS/MS spectra of peptide B (m/z 907.52) obtained from the first five acquisitions could be identified; however, less than 10% of the MS/MS spectra could be identified at acquisition number 15. After 20 acquisitions, none of the MS/MS spectra had an ion score above the significance threshold. Hence, the likelihood of identifying peptide B decreased from 30% to 0 after 20 acquisitions on the same spot. This example illustrates that for low-intensity peptides, acquiring MS/MS spectra before significant sample depletion


results in a much higher likelihood of positive identifications. In summary, in LC-MALDI MS/MS analysis, a precursor selection strategy allowing low-intensity peptides to be analyzed before sample depletion could greatly improve the number of peptide identifications. Strategy for Enhanced Proteome Analysis. As noted in the Introduction, it has been shown that accumulating datasets from multiple runs can significantly increase the number of peptide identifications in LC-MS/MS analysis.8 However, high-intensity peptides often are repeatedly analyzed in replicate runs, resulting in redundant MS/MS spectra in the cumulative dataset.8 The offline LC-MALDI MS system provides the opportunity to conduct in-depth selection of precursor ions prior to MS/MS acquisition, in which chromatographic retention and MS information can be used for decision making. This precursor ion selection can be employed in the cumulative dataset, as well. Accordingly, we have developed a strategy to improve comprehensiveness of proteomic analysis by introducing a precursor exclusion list, based on high confidence identification, in consecutive LC-MALDI MS analyses. Figure 3 illustrates the workflow of the exclusion list strategy for comprehensive proteomic analysis. First, the sample is separated by LC and deposited on a MALDI plate, followed by MS analysis, as described in the Experimental Section. The same LC-MS procedure is repeated several times to obtain replicate LC-MALDI MS data. The MS data of the first separation are then processed by MEND and PRESEL for denoising and peak list generation for MS/MS acquisition. The resulting MS/MS spectra are then generated and submitted to Mascot database searching to obtain peptide identifications. All peptides identified at a given confidence level (e.g., at 95%) now serve as a precursor exclusion list (in terms of m/z and retention time) in the subsequent run. 
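The exclusion step of this workflow can be sketched as a simple filter over the candidate precursor list. The code below is a minimal illustration and not the authors' MEND/PRESEL implementation: the `Peak` class and the function names are hypothetical, and the example peaks are invented. The default tolerances (±50 ppm in m/z, ±0.5 min in retention time) are the values reported for this study, and a peak is excluded only when it falls inside both windows.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Peak:
    mz: float      # precursor m/z
    rt_min: float  # retention time, min


def is_excluded(peak: Peak, exclusion: List[Peak],
                ppm_tol: float = 50.0, rt_tol_min: float = 0.5) -> bool:
    """Return True if the peak matches any excluded entry within BOTH
    the m/z (ppm) window and the retention-time window."""
    for ex in exclusion:
        ppm_error = abs(peak.mz - ex.mz) / ex.mz * 1e6
        if ppm_error <= ppm_tol and abs(peak.rt_min - ex.rt_min) <= rt_tol_min:
            return True
    return False


def filter_peak_list(peaks: List[Peak], exclusion: List[Peak],
                     ppm_tol: float = 50.0,
                     rt_tol_min: float = 0.5) -> List[Peak]:
    """Remove previously identified precursors before MS/MS acquisition."""
    return [p for p in peaks
            if not is_excluded(p, exclusion, ppm_tol, rt_tol_min)]


# Hypothetical example: run 1 identified a peptide at m/z 1470.66, 21.3 min.
exclusion_list = [Peak(1470.66, 21.3)]
run2_peaks = [
    Peak(1470.70, 21.5),  # within both windows: excluded
    Peak(1470.66, 25.0),  # same m/z, different retention time: kept
    Peak(907.52, 21.3),   # same retention time, different m/z: kept
]
remaining = filter_peak_list(run2_peaks, exclusion_list)
```

After each run, the newly identified peptides would be appended to `exclusion_list` before the next peak list is filtered, so the list grows cumulatively across separations.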
After the peak list of the second separation is generated, the peaks matched to peptides previously identified in the exclusion list are removed from the peak list before MS/MS acquisition. It should be noted that the peak would have to fall within the tolerance windows of both the retention time and m/z value to be excluded. The ranges of tolerance windows are typically determined by mass accuracy in the MS and peak widths in the chromatogram. The peptides positively identified in the second separation can then be added to the exclusion list for MS/MS analysis in the third separation. The same procedure can then be applied to all subsequent separations. Employing an exclusion list enables additional, less intense peptides to be analyzed and identified, thus increasing the number, coverage, and confidence level of protein identifications. It should be noted that the criteria used to establish the exclusion list can be easily adjusted. Raising the confidence level of the Mascot search from 95 to 99% results in a smaller number of high-confidence peptides to exclude, thus potentially reducing false positive identifications. Alternatively, high-intensity peaks which yield low ion scores could be added in the exclusion list, thus reducing sample consumption further and potentially leading to a larger number of identifications. To demonstrate the principle of the exclusion list strategy, only peptides identified with confidence level at 95% were excluded in the present study. To demonstrate the advantages of the strategy of Figure 3, the LC-MALDI MS analysis of the tryptic digest of the E. coli cell lysate was performed with and without the implementation

Figure 2. The decay of the Mascot score caused by sample depletion. The MS/MS spectra of peptide A (panel A, TGQAPGFTYTDANK, m/z 1470.66, white box) and peptide B (panel B, MIFAGIKK, m/z 907.52, gray box) were sequentially acquired 25 times on the same spot. The dashed line represents the significance threshold of the ion score in the Mascot database search. The distributions of Mascot scores from 100 spots are illustrated by box-and-whisker plots, in which the whiskers and the box represent the 5-95 and 25-75 percentiles, respectively. The central line of the box is the median of the ion scores.
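The identification rates read from Figures 1 and 2 can be turned into a rough estimate of how repeated acquisitions help a low-intensity precursor. The sketch below assumes that acquisitions are statistically independent, which is a simplification not tested in the text; the function name is hypothetical. The per-acquisition probabilities are inputs, so constant values model fresh spots in replicate runs and decreasing values model depletion on a single spot.

```python
import math
from typing import Sequence


def prob_any_identification(per_acquisition_probs: Sequence[float]) -> float:
    """Probability that at least one acquisition yields a positive
    identification, assuming independent acquisitions."""
    prob_all_fail = math.prod(1.0 - p for p in per_acquisition_probs)
    return 1.0 - prob_all_fail


# Peptide B on fresh spots: about 25% per acquisition (25 of 100 spectra).
one_run = prob_any_identification([0.25])
three_runs = prob_any_identification([0.25, 0.25, 0.25])

# A fully depleted spot (0% chance) adds nothing to the cumulative probability.
with_depleted = prob_any_identification([0.25, 0.0])
```

Under this independence assumption, three fresh acquisitions of peptide B raise the chance of at least one identification from 25% to roughly 58%, whereas further acquisitions on a depleted spot contribute little.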

of the exclusion list using the same LC column. Four and five replicate LC-MALDI MS analyses were performed with and without the exclusion list, respectively. A mass tolerance of ±50 ppm and an elution time tolerance of ±0.5 min (roughly 2σ of a nanoLC peak width) were used in matching peaks to the exclusion list. It should be noted that the difference in retention time among runs on the same column was