Anal. Chem. 2001, 73, 5725-5731

Tandem Mass Spectrometry of Intact Proteins for Characterization of Biomarkers from Bacillus cereus T Spores

Plamen A. Demirev,* Javier Ramirez,† and Catherine Fenselau

Department of Chemistry, University of Maryland, College Park, Maryland 20742

Intact protein biomarkers from Bacillus cereus T spores have been analyzed by high-resolution tandem Fourier transform ion cyclotron resonance mass spectrometry. Two techniques have been applied for excitation of the isolated multiply charged precursor ion species: sustained off-resonance irradiation/collisionally activated dissociation and electron capture dissociation. Fragmentation-derived sequence tags and BLAST sequence similarity proteome database searches allow unequivocal identification of the major biomarker protein with unprecedented specificity. Sequence-specific fragmentation patterns further confirm protein identification. Moreover, a methodology combining accurate mass measurements of intact proteins with additional information contained in a proteome database permits tentative assignment of several other protein biomarkers isolated from the B. cereus T spores. We argue that approaches involving tandem MS of protein biomarkers, combined with bioinformatics, can drastically improve the specificity of individual microorganism identification, particularly in complex environments.

A consensus exists that in the post-genomic era bioanalytical mass spectrometry, coupled with bioinformatics, will play a pivotal role in characterizing cellular proteomes and in outlining the function and dynamics of the living cell at the molecular level.1-5 Currently, the major application of this emerging technology is identification of individual proteins in complex mixtures.
It is based on proteome database searches using the masses of tryptic peptides (peptide maps or “fingerprints”) generated after proteolysis of intact proteins (often separated/fractionated by a variety of multidimensional chromatographic and electrophoretic techniques).4,5 Peptide sequence tag information obtained by tandem mass spectrometry, complementary to or concurrently with the peptide maps, increases the specificity and improves the capability for unequivocal protein identification.6-8

Reducing the number of steps in individual protein characterization, for example, by obviating the need for tryptic protein digestion, can be achieved by tandem mass spectrometry of intact protein precursor ions. Several years ago, Mortz et al. demonstrated that sequence tag data could be obtained from electrosprayed multiply charged protein ions by tandem mass spectrometry in a 6 T Fourier transform mass spectrometer.9 The sequence tags were derived from internal y- or b-fragment ions and were typically four amino acids or shorter. Sustained off-resonance irradiation/collisionally activated dissociation (SORI/CAD) or CW infrared multiphoton dissociation (IRMPD) were the ion excitation/fragmentation techniques used in those studies. The tag data, obtained by tandem FTICR-MS and combined with precursor ion masses, were exploited in search algorithms to query a protein sequence database, allowing unequivocal protein identification for all four proteins studied.9

* Corresponding author. Phone: (301) 405-8618. Fax: (301) 405-8615. E-mail: [email protected].
† Present address: Bruker Daltonics Inc., 15 Fortune Dr., Billerica, MA 01821.
(1) Proteome Research: New Frontiers in Functional Genomics; Wilkins, M., Williams, K. L., Appel, R. D., Hochstrasser, D. F., Eds.; Springer: Berlin, 1998.
(2) Pandey, A.; Mann, M. Nature 2000, 405, 837.
(3) Proteome Research: Mass Spectrometry; James, P., Ed.; Springer: Berlin, 2001.
(4) Aebersold, R.; Goodlett, D. Chem. Rev. 2001, 101, 269.
(5) Godovac-Zimmermann, J.; Brown, L. Mass Spectrom. Rev. 2001, 20, 1.
(6) Mann, M.; Wilm, M. Anal. Chem. 1994, 66, 4390.
(7) Washburn, M.; Wolters, D.; Yates, J. R. Nature Biotechnol. 2001, 19, 242.
(8) Conrads, T.; Anderson, G.; Veenstra, T.; Pasa-Tolic, L.; Smith, R. D. Anal. Chem. 2000, 72, 3349.
(9) Mortz, E.; O’Connor, P.; Roepstorff, P.; Kelleher, N.; Wood, T.; McLafferty, F. W.; Mann, M. Proc. Natl. Acad. Sci. U.S.A. 1996, 93, 8264.
(10) Marshall, A.; Hendrickson, Ch.; Jackson, G. Mass Spectrom. Rev. 1998, 17, 1.
(11) Marshall, A. Int. J. Mass Spectrom. 2000, 200, 331.
(12) March, R. E. Int. J. Mass Spectrom. 2000, 200, 285.
(13) Williams, E. Anal. Chem. 1998, 70, 179A.
(14) Zubarev, R.; Kelleher, N.; McLafferty, F. W. J. Am. Chem. Soc. 1998, 120, 3265.
(15) Zubarev, R.; Horn, D.; Fridriksson, E.; Kelleher, N.; Kruger, N.; Lewis, M.; Carpenter, B.; McLafferty, F. W. Anal. Chem. 2000, 72, 563.
(16) Horn, D.; Ge, Y.; McLafferty, F. W. Anal. Chem. 2000, 72, 4778.
(17) Horn, D.; Zubarev, R.; McLafferty, F. W. Proc. Natl. Acad. Sci. U.S.A. 2000, 97, 10313.
(18) Li, W.; Hendrickson, C.; Emmett, M.; Marshall, A. G. Anal. Chem. 1999, 71, 4397.
(19) McLuckey, S. A.; Stephenson, J. L. Mass Spectrom. Rev. 1998, 17, 369.
(20) Xiang, F.; Anderson, G. A.; Veenstra, T.; Lipton, M.; Smith, R. D. Anal. Chem. 2000, 72, 2475.
(21) Cargile, B.; McLuckey, S. A.; Stephenson, J. L. Anal. Chem. 2001, 73, 1277.
10.1021/ac010672n CCC: $20.00. Published on Web 10/26/2001. © 2001 American Chemical Society. Analytical Chemistry, Vol. 73, No. 23, December 1, 2001.

The advantages of trapping instruments, FTICR10,11 as well as the quadrupole trap,12 for tandem mass spectrometry13 of intact proteins have been illustrated in several recent studies.14-21 McLafferty and co-workers have developed electron capture dissociation (ECD) into an efficient tool for fragmentation of individual proteins in an FTICR instrument.14-17 In ECD, low-energy (<0.1 eV) electrons are captured by multiply charged protein cations, resulting in extensive c- and z-type peptide backbone fragmentation. Marshall and co-workers have used fragment data from IRMPD of intact protein cations to search a database of 1000 protein entries (restricted by the molecular weights of the intact precursor ions).18 In addition, methods for identification of microorganisms based upon tandem mass spectrometry of intact proteins in quadrupole ion traps are being developed.20,21 Crude lysates of E. coli cells have been electrosprayed directly in an ion trap after on-line cleanup through a dual-microdialysis device.20 Automatic precursor ion selection and subsequent fragmentation produce “global” MS/MS surveys. The distinctive MS/MS spectral patterns of proteins in the range below 20 kDa can provide a basis for confident microorganism identification by selecting specific biomarkers. Alternatively, tandem quadrupole ion trap mass spectrometry of the intact capsid protein from bacteriophage MS2 has been employed for its identification in E. coli lysates.21 Collisionally activated dissociation of the multiply charged precursor ion corresponding to the MS2 biomarker was followed by ion/ion reactions in order to provide sequence tag data.
Database searches employing both the sequence tags and molecular weight data resulted in the coat protein identification and, consequently, in direct confirmation of the presence of the phage in the mixture of the two microorganisms.21

Recently, we began developing bioinformatics tools to incorporate proteome database search algorithms for microorganism identification by mass spectrometry.22-24 Determining the masses, Mr, of a set of protein biomarkers from intact unknown organisms and correlating these data with sequence-derived theoretical Mr of proteins (retrieved together with their organismic sources from Internet-accessible databases) permits microorganism ranking and identification.22 Extensions of the approach include statistical analysis of proteome uniqueness as a function of mass accuracy and proteome size,23 and an iterative procedure to account for posttranslational modifications not reflected directly in the proteome database.24 Software incorporating statistical algorithms is being developed and is accessible for on-line queries.25 In parallel, research has been aimed at elucidating the structures and properties of individual biomarkers observed by MALDI-TOF MS from intact bacterial cells,26-29 spores,30 and viruses.31

In this paper, we report the identification by tandem MS of protein biomarkers observed in spectra of intact spores of Bacillus cereus T. B. cereus T, a Gram-positive bacterial strain found in the soil, is an opportunistic pathogen, inducing mild food poisoning in humans. It is a member of the group of bacteria to which Bacillus anthracis also belongs. Despite phenotypic differences, genetic analysis demonstrates that B. anthracis should be considered a lineage of B. cereus.32,33 The similarity between biomarkers observed in MALDI-TOF mass spectra from different strains of refractive B. cereus and B. anthracis (Sterne) spores has already been noted.34,35 Here, we isolate the biomarkers from intact B. cereus T spores in a simple procedure prior to analysis by electrospray tandem Fourier transform ion cyclotron resonance mass spectrometry. We apply two techniques for excitation of the selected precursor ion species trapped in the cell: SORI/CAD and ECD. Fragmentation-derived sequence tags and BLAST sequence similarity proteome database searches allow unequivocal identification of the major biomarker protein with unprecedented specificity. Sequence-specific fragmentation additionally confirms the protein identification. Furthermore, a methodology combining accurate mass measurements of intact proteins by FTICR with additional information contained in a proteome database permits tentative assignment of several other potential protein biomarkers isolated from the B. cereus T spores. We also argue that approaches involving tandem mass spectrometry of biomarkers combined with bioinformatics can drastically improve the specificity of individual microorganism identification, particularly in complex environments.

(22) Demirev, P.; Ho, Y. P.; Ryzhov, V.; Fenselau, C. Anal. Chem. 1999, 71, 2732.
(23) Pineda, F.; Lin, J.; Fenselau, C.; Demirev, P. Anal. Chem. 2000, 72, 3739.
(24) Demirev, P.; Lin, J.; Pineda, F.; Fenselau, C. Anal. Chem. 2001, 73, in press.
(25) http://infobacter.jhuapl.edu
(26) Holland, R. D.; Duffy, C. R.; Rafii, F.; Sutherland, J. B.; Heinze, T.; Holder, C. L.; Voorhees, K. J.; Lay, J. O. Anal. Chem. 1999, 71, 3226.
(27) Dai, Y.; Li, L.; Roser, D. C.; Long, S. R. Rapid Commun. Mass Spectrom. 1999, 13, 73.
(28) Arnold, R. J.; Reilly, J. P. Anal. Biochem. 1999, 269, 105.
(29) Ryzhov, V.; Fenselau, C. Anal. Chem. 2001, 73, 746.
(30) Hathout, Y.; Ho, Y. P.; Ryzhov, V.; Demirev, P. A.; Fenselau, C. J. Nat. Prod. 2000, 63, 1492.
(31) Kim, Y. J.; Freas, A.; Fenselau, C. Anal. Chem. 2001, 73, 1544.
(32) Helgason, E.; Okstad, O.; Caugant, D.; Johansen, H.; Fouet, A.; Mock, M.; Hegna, I.; Kolsto, A. Appl. Environ. Microbiol. 2000, 66, 2627.

EXPERIMENTAL SECTION

Microorganism Growth and Biomarker Isolation. Bacillus cereus T cell lines were obtained from the U.S. Army Medical Research Institute for Infectious Disease (USAMRIID, Frederick, MD). The spores were grown in-house, using NSM (New Sporulation Medium) growth medium and following procedures already described.34 The spores were harvested by centrifugation at 10000g for 10 min, and the pellet was washed with deionized water several times. The intact cells were lyophilized and stored at -20 °C prior to sample preparation and analysis. The samples contained more than 99% refractive spores, as revealed by phase-contrast microscopy inspection.
For isolation of the biomarkers, lyophilized spores (around 50 mg) were suspended in 5 mL of AcN/H2O (5% TFA) for 15 min at room temperature. The suspension was then centrifuged at 10000g for 15 min. The supernatant was further cleaned up by elution through a C18 disposable extraction cartridge (J. T. Baker, Phillipsburg, NJ) in order to remove inorganic salts and lipid-containing compounds. The fraction eluted in 90/10 AcN/H2O (0.1% TFA) contained the biomarkers of interest, as verified by MALDI-TOF mass spectrometry. The biomarker solution was concentrated on a SpeedVac SC110A (Savant Instruments, Holbrook, NY) and lyophilized overnight prior to ES-FTICR mass spectrometry.

(33) Daffonchio, D.; Cherif, A.; Borin, S. Appl. Environ. Microbiol. 2000, 66, 5460.
(34) Hathout, Y.; Demirev, P. A.; Ho, Y. P.; Bundy, J. L.; Ryzhov, V.; Sapp, L.; Stutler, J.; Jackman, J.; Fenselau, C. Appl. Environ. Microbiol. 1999, 65, 4313.
(35) Ryzhov, V.; Hathout, Y.; Fenselau, C. Appl. Environ. Microbiol. 2000, 66, 3828.

Mass Spectrometry. Positive ion MALDI mass spectra of the intact spores, as well as the eluted fractions, were obtained on a “Kompact MALDI 4” (Kratos Analytical Instruments, Chestnut Ridge, NY) time-of-flight instrument in the linear mode. A detailed description of the MALDI-TOF experimental procedure has already been presented.34,35 A “HiResESI” (IonSpec Co., Irvine, CA) FTICR mass spectrometer equipped with a 4.7 T actively shielded superconducting magnet was used in these studies. The instrument was equipped with an Analytica (Branford, CT) electrospray ion source with an indirectly heated glass transfer capillary. The source was modified by adding an x,y,z manipulator for low-flow (micro-) electrospray. The microelectrospray setup was further modified in-house. Sample solution was infused by a syringe pump at a rate of 200 nL/min through a 35-cm-long fused-silica capillary (150-µm o.d., 25-µm i.d., Polymicro Technologies, Phoenix, AZ). The end of the silica capillary, hand-pulled using a lab torch and covered with a conducting coating, was grounded, and the source end plate was at 1.4 kV. The biomarker extracts were electrosprayed in a solution of 1:1 MeOH/H2O (0.1% AcA). Ions were accumulated in a hexapole device for 1500 ms prior to trapping in the ICR cylindrical cell by standard pulse sequence scripts.

For FTICR tandem MS measurements, ions of a particular mass/charge were first isolated in the ICR cell by ejection of all other species. SORI/CAD of the trapped ions was performed for 650 ms by a rf burst with an amplitude of 2.85 V at m/z 972 (i.e., at a frequency offset of around 800 Hz from the ICR frequency of the isolated ions36). Ions were “heated” (internally excited) as a result of multiple collisions with pulsed neutral gas. Two pulses of Ar with a duration of 2 ms each were introduced immediately prior to and at the middle of the irradiation period (at a maximum pressure read-out on the ion gauge of 5 × 10⁻⁷ Torr and an estimated pressure in the cell of 5 × 10⁻⁸ Torr). The average translational energy of the precursor ions in the FTICR cell is estimated at around 6 eV. After a delay of about 2 s, the resulting fragment ions were accelerated for detection by a rf sweep excitation waveform (60 V p-p from 50 to 700 kHz). The image current was amplified and digitized at an acquisition rate of 1 MHz and 512 k data points before Fourier transform (two zero fills and Blackman apodization) to yield a mass spectrum.
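The off-resonance condition described above can be illustrated with a short calculation. The sketch below is not the authors' procedure: it assumes the ideal (unperturbed) cyclotron relation f = eB/(2πm), ignores the shift caused by the trapping field, and treats the precursors as [M + 7H]7+ ions built from the most abundant isotopic masses quoted in this paper. All function and variable names are hypothetical.

```python
import math

# CODATA constants in SI units.
ELEMENTARY_CHARGE_C = 1.602176634e-19
DALTON_KG = 1.66053906660e-27
PROTON_MASS_DA = 1.007276


def mz_of_protonated_ion(neutral_mass_da: float, charge: int) -> float:
    """m/z (in Th) of an [M + zH]z+ ion."""
    return (neutral_mass_da + charge * PROTON_MASS_DA) / charge


def cyclotron_frequency_hz(mz_th: float, field_tesla: float) -> float:
    """Ideal cyclotron frequency f = e*B / (2*pi*m) for a given m/z."""
    return ELEMENTARY_CHARGE_C * field_tesla / (2.0 * math.pi * mz_th * DALTON_KG)


FIELD_T = 4.7        # magnet field quoted in the text
SORI_MZ_TH = 972.0   # m/z at which the rf burst was applied
CHARGE = 7           # assumed charge state of the isolated precursors

f_sori = cyclotron_frequency_hz(SORI_MZ_TH, FIELD_T)

# Offset between each precursor's ideal ICR frequency and the rf burst.
offsets_hz = {}
for label, mass_da in (("6710.55 Da", 6710.55), ("6726.54 Da", 6726.54)):
    f_ion = cyclotron_frequency_hz(mz_of_protonated_ion(mass_da, CHARGE), FIELD_T)
    offsets_hz[label] = f_ion - f_sori
    print(f"{label}: f_ion = {f_ion:.0f} Hz, offset = {offsets_hz[label]:.0f} Hz")
```

Under these assumptions the offsets come out at several hundred hertz to about 1 kHz, the same order as the "around 800 Hz" quoted in the text; the exact value depends on which species of the doublet is taken as reference and on trapping-field corrections that this sketch omits.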
ECD was performed following the procedures described by McLafferty and co-workers15,16 and adapted to the current setup. Electrons were produced by passing a 4.2 A current through the heated filament supplied with the instrument. Electrons were continuously injected into the cell for 10 s by biasing the filament at around -1.5 V. After ion trapping and isolation, the potential at the two cell-trapping plates was dropped to -1.5 V during electron injection. After ECD and prior to ion detection, a gas pulse was used to axialize the fragment ions in order to increase the sensitivity. Deconvolution of the multiply charged spectra to zero charge state was performed with the software provided with the instrument. The average mass accuracy in broad-band mode for all mass measurements reported in this work was around 15 ppm for the monoisotopic (or the most abundant isotopic) ion peaks.

(36) Gauthier, J.; Trautman, T.; Jacobson, D. Anal. Chim. Acta 1991, 246, 211.
(37) http://www.expasy.ch
(38) Altschul, S. F.; Madden, T. L.; Schäffer, A. A.; Zhang, J.; Zhang, Z.; Miller, W.; Lipman, D. J. Nucleic Acids Res. 1997, 25, 3389.
(39) Bairoch, A.; Apweiler, R. Nucleic Acids Res. 2000, 28, 45.

Database Search. BLAST sequence homology searches were performed on-line at the Web site37 of the Swiss Institute of Bioinformatics using the BLAST network service with the NCBI BLAST 2 software.38 The sequence tag query was executed against all protein entries contained in the combined “Swiss-PROT + TrEMBL + TrEMBL_New” database (total of 550 993 sequences).39 The comparison matrix chosen was “PAM30” with an E value threshold set at 100 (neither the “gapped alignment” nor the “low complexity region filtering” option was selected).
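The zero-charge deconvolution and ppm accuracy figures quoted under Mass Spectrometry reduce to simple arithmetic, sketched below. This is only an illustration and not the instrument vendor's algorithm: it assumes protonated [M + zH]z+ ions, infers the charge from the spacing of adjacent isotopic peaks (about 1.00335/z Th), and uses observed and predicted masses reported in this paper as example inputs. The function names and the 0.167 Th spacing are hypothetical.

```python
PROTON_MASS_DA = 1.007276      # mass of a proton, Da
C13_C12_SPACING_DA = 1.00335   # 13C - 12C mass difference, Da


def charge_from_isotope_spacing(spacing_th: float) -> int:
    """Charge state inferred from the m/z spacing of adjacent isotopic peaks."""
    return round(C13_C12_SPACING_DA / spacing_th)


def neutral_mass_da(mz_th: float, charge: int) -> float:
    """Zero-charge mass of an [M + zH]z+ ion observed at mz_th."""
    return charge * (mz_th - PROTON_MASS_DA)


def mz_for_charge(neutral_mass: float, charge: int) -> float:
    """Inverse of neutral_mass_da: expected m/z for a given charge state."""
    return neutral_mass / charge + PROTON_MASS_DA


def ppm_error(observed_da: float, predicted_da: float) -> float:
    """Relative mass error in parts per million."""
    return (observed_da - predicted_da) / predicted_da * 1e6


# Hypothetical isotope spacing of 0.167 Th corresponds to a 6+ ion.
z = charge_from_isotope_spacing(0.167)

# Expected position of the 6+ charge state of the 6710.55 Da biomarker.
mz_6plus = mz_for_charge(6710.55, z)

# Observed vs predicted masses (Met-cleaved) reported in this paper.
for observed, predicted in ((6710.55, 6710.46), (7095.60, 7095.52), (7215.69, 7215.61)):
    print(f"{observed:.2f} Da: {ppm_error(observed, predicted):.1f} ppm")
```

With these inputs the 6+ ion falls near 1119.4 Th, and the three mass errors lie between roughly 11 and 14 ppm, in line with the stated average accuracy of around 15 ppm.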

RESULTS AND DISCUSSION

Protein Biomarker Spectra. The ES-FTICR spectrum of biomarkers isolated from B. cereus T is shown in Figure 1. The major biomarker consists of a doublet, close in mass to that of the biomarker (6711 Da) observed in MALDI-TOF mass spectrometry of intact spores.34,35 The high-resolution FTICR instrument allows determination of the masses of the individual peaks in the isotopic distribution with an accuracy of 10 ppm or better. The most intense isotopic peaks in the doublet are at 6710.55 (Figure 1b) and 6726.54 Da, respectively. Several other biomarker peaks observed in the FTICR spectrum (Figure 1) can be correlated to biomarker peaks observed in the MALDI-TOF mass spectra from the intact microorganisms.

Data from SORI/CAD tandem mass spectrometry of the isolated major biomarker doublet (at 6710.5 and 6726.5 Da) are presented in Figure 2. Both precursor ions are completely fragmented, with the loss of 17 Da resulting in the major peaks at 6693.49 and 6709.51 Da. Accurate mass measurements and reproducible calculations of the mass differences between the various isotopes of observed fragment ions allow determination of the amino acid residue losses. We are thus able to discern between Lys (theor. MW 128.10) and Gln (theor. MW 128.06) loss based on the experimentally determined mass differences (128.13 and 128.02 Da, respectively). A contiguous sequence tag, LGGYQK, was easily obtained from the SORI/CAD spectrum. The fact that the first amino acid residue loss occurs from the M-17 precursor ions (at 6693.49 Da) suggests that we are observing the b-ion series.

Database Search and Biomarker Identification. Using the BLAST similarity search software available on the Web site,37 a search using the sequence tag determined from the SORI/CAD mass spectrum was performed. The query was unconstrained; that is, the entire Swiss-PROT/TrEMBL database containing more than 5 × 10⁵ entries was searched. Only two hit sequences were returned as a result of the query.
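The residue assignment used to build the sequence tag, including the Lys/Gln distinction, can be sketched as a nearest-mass lookup. The code below is an illustration under stated assumptions rather than the authors' software: it compares an observed mass difference between neighboring fragment peaks with standard monoisotopic residue masses and accepts the closest one within a hypothetical tolerance of 0.05 Da. Leu and Ile are isobaric and are reported together.

```python
# Standard monoisotopic amino acid residue masses (Da).
MONOISOTOPIC_RESIDUE_MASS_DA = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L/I": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "M": 131.04049, "H": 137.05891, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def assign_residue(delta_da: float, tolerance_da: float = 0.05):
    """Return the residue whose mass is closest to delta_da, or None.

    tolerance_da is a hypothetical acceptance window; it must stay below
    the 0.036 Da Lys/Gln spacing plus measurement error to be useful.
    """
    label, mass = min(
        MONOISOTOPIC_RESIDUE_MASS_DA.items(),
        key=lambda item: abs(item[1] - delta_da),
    )
    return label if abs(mass - delta_da) <= tolerance_da else None


# The two experimentally determined differences quoted in the text.
for delta in (128.13, 128.02):
    print(delta, "->", assign_residue(delta))
```

Lys and Gln differ by only about 0.036 Da, so the assignment is feasible only when fragment mass differences are measured to a few hundredths of a dalton, which is the role of the high-resolution FTICR measurement here.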
The first was for SASP-2 protein (P06554), a “small acid-soluble spore protein” from B. cereus containing 65 amino acids with an average molecular weight of 6841.5 Da. The second protein hit was EX5B_HAEIN (P45157), “exodeoxyribonuclease V, B chain” from Haemophilus influenzae, containing 1211 amino acids with an average molecular weight of 139 857 Da. A comparison with the precursor ion mass immediately rules out the second hit (P45157). Moreover, the sequence tag coincides with the six C-terminal amino acids (60-65) of P06554. In contrast, the tag is found between amino acid residues 232-237 for P45157. The possibility that we may observe an endogenous proteolytic fragment from P45157 can also be convincingly ruled out by comparing calculated masses of possible fragments (based on the P45157 amino acid sequence) with the observed precursor ion mass. The predicted average molecular weights of fragments 177-237 and 178-237 are 6798.8 and 6651.7 Da, respectively. The 131 Da discrepancy between the mass calculated from the sequence of P06554 and the experimentally observed mass can be explained by cleavage of the N-terminal Met, the most common posttranslational modification in prokaryotes.41 Posttranslational modifications (e.g., N-terminal Met cleavage) are usually not reflected in proteome databases generated by translation of the DNA open reading frames.24 The activity of the N-terminal bacterial aminopeptidases depends strongly on the penultimate amino acid.42 Empirically established cleavage rules suggest that Met preceding Ser should always be cleaved, which is the case for P06554. Small acid-soluble spore proteins bind to both strands of spore DNA and protect the DNA from damage (including UV-light-induced cleavages).

Figure 1. FTICR mass spectrum, deconvoluted to zero-charge state, obtained by ES of the biomarker extract: (a) mass range from 6600 to 7600 Da; (b) expanded view of the spectrum in (a) with the isotope distribution of the 6710.55 biomarker; inset, “raw” spectrum showing the isotope distribution of the 6+ charge state at 1119.5 Th (for a definition of the unit Th, Thomson, see ref 40).

Figure 2. Mass spectrum of the fragments obtained by SORI/CAD after isolation of the 7+ charge state of the 6710.5/6726.5 precursor biomarker doublet: (a) mass/charge spectrum (in Th) with different charge states denoted; (b) same spectrum, deconvoluted to zero-charge state, in the range between 5600 and 6800 Da (amino acid losses are assigned on the basis of observed mass differences between fragment peaks).

The second species in the doublet at 6726.5 Da, which is 16 Da higher than the observed protein P06554, can be assigned as an oxidation product. The oxidation is most probably occurring at the Met(19) residue. This oxidation site is also confirmed by comparing the observed SORI/CAD and ECD fragmentation patterns (Figure 3). The observed fragment ion masses are compared to predicted cleavages (forming the b-, y-, c-, and z-ion series) calculated from the amino acid sequence of the protein P06554 using FragPro.43 Most fragment ions that contain Met(19) are observed as doublets 16 Da apart. Whether Met oxidation occurs in vivo44 or, most probably, as an artifact after the protein has been isolated45,46 remains an open question.

Comparison of sequence-specific fragmentation patterns, obtained by either ion activation technique, with theoretically predicted ion masses from the sequence additionally confirms the P06554 protein identity. As already noted,15,16 SORI/CAD generates predominantly b- and y-peptide fragments (Figure 3a), in contrast to ECD, where c- and z-ions are exclusively observed (Figure 3b). No contiguous tag sequence could be discerned in the tandem mass spectrum obtained by ECD (Figure 3b).

Figure 3. Comparison between tandem mass spectra (in the range from 3000 to 7000 Da, deconvoluted to zero-charge state) obtained after isolation of the 7+ charge state of the major biomarker doublet; the asterisk indicates fragments from the oxidized species: (a) SORI/CAD spectrum; (b) ECD spectrum.

Accurate mass measurements of intact proteins combined with sequence information contained in a proteome database allow us to tentatively assign (Table 1) the identity of two other protein biomarkers observed in the spectrum (Figure 1). They all correspond to small acid-soluble spore proteins, albeit from different Bacillus species. In all cases, the N-terminal amino acid sequence indicates that the respective proteins will be posttranslationally modified by cleavage of the initiation Met amino acid.24,42 The B. cereus T genome is not completely sequenced, and not all expressible protein sequences from that microorganism strain are available in proteome databases. That is the reason the current assignment for the biomarkers at 7095.60 and 7215.69 Da is only provisional, and other observed biomarkers (e.g., at 6850.68 and 7280.52 Da) cannot as yet be mapped to known proteins. Attempts to obtain sequence tags from any of these biomarker ions were unsuccessful, perhaps reflecting their lower abundances in the sample.

Table 1. Tentative Assignment of Isolated B. cereus T Spore Biomarker Peaks (a)

obs mass (Da) (b)   protein in Swiss-PROT   predicted mass (Da) (c)   description
6710.55             P06554 (d)              6710.46                   small acid-soluble spore protein
7095.60             P06552                  7095.52                   small acid-soluble spore protein
7215.69             P04834                  7215.61                   small acid-soluble spore protein

Amino acid sequences:
P06554: MSRSTNKLAV PGAESALDQM KYEIAQEFGV QLGADATARA NGSVGGEITK RLVSLAEQQL GGYQK
P06552: MPNQSGSNSS NQLLVPGAAQ VIDQMKFEIA SEFGVNLGAE TTSRANGSVG GEITKRLVSF AQQQMGGGVQ
P04834: MANNKSSNNN ELLVYGAEQA IDQMKYEIAS EFGVNLGADT TARANGSVGG EITKRLVQLA EQQLGGGRF

(a) See Figure 1a. (b) Value for the most abundant isotope is listed. (c) Listed mass without N-terminal Met. (d) Confirmed by tandem FTICR.

Finally, information contained in the proteome database, namely the organismic source for each sequence, would have allowed us to identify the microorganism in this particular case (if it were unknown). We argue that with the exponential increase in the number of microorganisms with sequenced genomes,47 rapid methods for microorganism identification combining bioinformatics and mass spectrometry will become more widespread.22-24 Therefore, tandem mass spectrometry of intact protein biomarkers can provide valuable orthogonal information that will drastically improve the specificity of individual microorganism identification, particularly in complex environments.

(40) Cooks, R. G.; Rockwood, A. L. Rapid Commun. Mass Spectrom. 1991, 5, 93.
(41) Hirel, P. H.; Schmitter, J. M.; Dessen, P.; Fayat, G.; Blanquet, S. Proc. Natl. Acad. Sci. U.S.A. 1989, 86, 8247.
(42) Gonzales, T.; Baudouy, J. FEMS Microbiol. Rev. 1996, 18, 319.
(43) Senko, M. FragPro 1.0, peptide fragmentation analysis software for Windows, download: http://members.aol.com/msmssoft/
(44) Panzenbock, U.; Kritharides, L.; Raftery, M.; Rye, K.; Stocker, R. J. Biol. Chem. 2000, 275, 19536.
(45) Wilkins, M.; Gasteiger, E.; Gooley, A.; Herbert, B.; Molloy, M.; Binz, P.; Ou, K.; Sanchez, J.; Bairoch, A.; Williams, K.; Hochstrasser, D. F. J. Mol. Biol. 1999, 289, 645.
(46) Larsen, M.; Roepstorff, P. Fresenius’ J. Anal. Chem. 2000, 366, 677.
(47) http://www.tigr.org

CONCLUSIONS

We have demonstrated by high-resolution FTICR tandem mass spectrometry that the major high-mass biomarker peak observed in MALDI-TOF spectra of intact B. cereus T spores is, indeed, a protein. Fragmentation by SORI/CAD of the isolated biomarker molecular ion generates a long (six amino acids) sequence tag. Combining a database sequence similarity search based on the amino acid tag with information about the molecular weight of the intact precursor ion has allowed identification of the unique sequence of that DNA-binding protein (out of 500 000 different sequences). The protein has been posttranslationally modified by cleavage of the N-terminal Met amino acid. Accurate determination of the masses of observed biomarkers combined with empirical rules for N-terminal Met cleavage has also allowed us to tentatively identify other protein biomarkers extracted from B. cereus T spores.

ACKNOWLEDGMENT

We thank Amy Freas for growing the microorganisms used in this study, and Yetrib Hathout for helpful suggestions regarding protein extraction and purification. Joany Jackman (USAMRIID) is gratefully acknowledged for providing the B. cereus T cell lines. This work was supported in part by a grant from DARPA.

SUPPORTING INFORMATION AVAILABLE

Raw (mass/charge) spectrum of the biomarker extract. This material is available free of charge via the Internet at http://pubs.acs.org.

Received for review June 19, 2001. Accepted September 11, 2001. AC010672N
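The N-terminal Met argument made in Results and Discussion for P06554 can be checked numerically. The sketch below is an illustration, not the authors' workflow: it assumes standard average residue masses, takes the P06554 sequence as printed in Table 1, computes the average molecular weight with and without the initiator Met, and locates the LGGYQK tag using 1-based residue numbering. All names are hypothetical.

```python
# Standard average amino acid residue masses (Da) and water.
AVERAGE_RESIDUE_MASS_DA = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_DA = 18.0153

# P06554 sequence as listed in Table 1 (spaces removed).
P06554 = (
    "MSRSTNKLAV" "PGAESALDQM" "KYEIAQEFGV" "QLGADATARA"
    "NGSVGGEITK" "RLVSLAEQQL" "GGYQK"
)
TAG = "LGGYQK"


def average_mass_da(sequence: str) -> float:
    """Average molecular weight of an unmodified linear peptide chain."""
    return sum(AVERAGE_RESIDUE_MASS_DA[aa] for aa in sequence) + WATER_DA


def tag_span(sequence: str, tag: str):
    """1-based (first, last) residue numbers of the tag, or None if absent."""
    start = sequence.find(tag)
    return None if start < 0 else (start + 1, start + len(tag))


full_mass = average_mass_da(P06554)
mature_mass = average_mass_da(P06554[1:])   # initiator Met removed
span = tag_span(P06554, TAG)

print(f"full-length: {full_mass:.1f} Da")
print(f"without N-terminal Met: {mature_mass:.1f} Da")
print(f"tag residues: {span}")
```

Under these assumptions the full-length value lands within a few tenths of a dalton of the 6841.5 Da database figure, and removing the initiator Met brings it close to the observed biomarker mass near 6710.5 Da. Note that the sketch yields average masses, whereas Table 1 lists the most abundant isotope, so small differences between the two are expected.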
