Anal. Chem. 2003, 75, 6886-6893
Bacillus Spore Identification via Proteolytic Peptide Mapping with a Miniaturized MALDI TOF Mass Spectrometer
Robert D. English,† Bettina Warscheid,‡ Catherine Fenselau,‡ and Robert J. Cotter*,†
Middle Atlantic Mass Spectrometry Laboratory, Department of Pharmacology and Molecular Sciences, The Johns Hopkins University School of Medicine, Baltimore, Maryland 21205, and Department of Chemistry, University of Maryland, College Park, Maryland 20704
An approach is tested here as a rapid screening method for Bacillus spore species employing bacterial peptide analysis with a miniaturized MALDI TOF mass spectrometer. A limited set of tryptic peptides was generated in situ following selective solubilization of the small, acid-soluble protein family (SASP) from spore samples on the MALDI sample holder. To facilitate species identification, a compact database was created comprising masses of the tryptic cleavage products generated in silico from all Bacillus and Clostridium SASPs whose sequences are available in public databases. Experimental measurements were matched against the custom-made database, and a published statistical model was then used to evaluate the probability of false identifications.

There is significant interest in smaller and more robust mass spectrometers for chemical or biological agent detection, environmental monitoring, forensics, space applications, and clinical environments.2-11 Ideally, instrumental performance would not be sacrificed with miniaturized instrumentation; however, size reduction intrinsically decreases both ion flight times and resolving power in shorter time-of-flight (TOF) analyzers. The TOF analyzer is an excellent counterpart to pulsed laser sources in matrix-assisted laser desorption/ionization (MALDI), offers the multichannel recording advantage, and in theory has an unlimited mass range.12 The deterioration in resolution caused by temporal and spatial problems associated with ion formation is more easily compensated in larger instruments in which source dimensions are negligible with respect to the size of the mass analyzer. These effects are no longer negligible in smaller instruments and can drastically reduce instrumental performance. Although TOF mass spectrometers can provide the exception, another disadvantage for many miniaturized mass spectrometers is reduction in their mass range. We have previously shown, however, that high-mass range and excellent resolving power can be preserved in a miniaturized MALDI TOF mass spectrometer built in-house.13

MALDI14,15 is a very powerful tool for identification of intact microorganisms.16,17 Advantages of MALDI for the analysis of such complex biological samples include high sensitivity, formation of mostly singly charged ions, and tolerance to contaminants. Among microorganisms, rapid analysis of spore-forming Bacillus species is of particular interest in the context of bioterrorism/counterterrorism activities, biotechnology, and environmental studies. Rapid screening and discrimination of bacterial species with small mass spectrometers is challenging because the desorbed biomarkers (proteins) have to be analyzed with sufficient resolving power and adequate mass accuracy to provide high confidence levels in microorganism identification. At the same time, miniaturized mass spectrometers should provide high transmission to maintain low detection limits. Small, acid-soluble spore proteins (SASPs), which can be extracted from spores by treatment with acid, have been shown to be reliable biomarkers for the analysis of various Bacilli.18 Recently, proteolytic digests have been generated in situ from SASPs to enable microorganism identification by microsequencing and database searching.19,20 In this paper, we used a bioinformatics strategy1,21 in which a database was constructed that contained only the masses of the peptides formed from trypsin-induced SASP cleavages.19,20 The experimentally observed peptide masses were searched against this database to identify the Bacillus species. This kind of limited database reduces “false matches” that occur with more frequency when entire genomes are included.22 SASPs represent reliable biomarkers for spore-forming microorganisms such as Bacillus and Clostridium species, allowing the identification and discrimination of closely related species.19,20

* To whom correspondence should be addressed. Phone: (410) 955-3022. Fax: (410) 955-3420. E-mail: [email protected].
† The Johns Hopkins University School of Medicine.
‡ University of Maryland.
(1) Pineda, F. J.; Lin, J. S.; Fenselau, C.; Demirev, P. A. Anal. Chem. 2000, 72, 3739-3744.
(2) Badman, E. R.; Cooks, G. J. Mass Spectrom. 2000, 35, 659-671.
(3) Palmer, P. T.; Limero, T. F. J. Am. Soc. Mass Spectrom. 2001, 12, 656-675.
(4) Eckenrode, B. A. J. Am. Soc. Mass Spectrom. 2001, 12, 683-693.
(5) Cotter, R. J.; Fancher, C.; Cornish, T. J. J. Am. Soc. Mass Spectrom. 1999, 34, 1368-1372.
(6) Short, R. T.; Fries, D. P.; Kerr, M. L.; Lembke, C. E.; Toler, S. K.; Wenner, P. G.; Byrne, R. H. J. Am. Soc. Mass Spectrom. 2001, 12, 676-682.
(7) Syage, J. A.; Nies, B. J.; Evans, M. D.; Hanold, K. A. J. Am. Soc. Mass Spectrom. 2001, 12, 648-655.
(8) Diaz, J. A.; Giese, C. F.; Gentry, W. R. J. Am. Soc. Mass Spectrom. 2001, 12, 619-632.
(9) Berkout, V. D.; Cotter, R. J.; Segers, D. P. J. Am. Soc. Mass Spectrom. 2001, 12, 641-647.
(10) Prieto, M. C.; Kovtoun, V. V.; Cotter, R. J. J. Mass Spectrom. 2002, 37, 1158-1162.
(11) Cotter, R. J.; English, R. D.; Hardy, A.; Warscheid, B.; Gardner, B. D. J. Mass Spectrom. Soc. Jpn. 2003, 51, 36-41.
(12) Cotter, R. J. Time-of-Flight Mass Spectrometry: Instrumentation and Applications in Biological Research; American Chemical Society: Washington, DC, 1997.
(13) English, R. D.; Cotter, R. J. J. Mass Spectrom. 2003, 38, 296-304.
(14) Tanaka, K.; Ido, Y.; Akita, S. In Proceedings of the Second Japan-China Joint Symposium on Mass Spectrometry; Matsuda, H., Liang, X.-T., Eds.; Bango: Osaka, 1987; pp 185-188.
(15) Karas, M.; Bachmann, D.; Bahr, U.; Hillenkamp, F. Int. J. Mass Spectrom. Ion Processes 1987, 78, 53-68.
(16) Fenselau, C.; Demirev, P. A. Mass Spectrom. Rev. 2001, 20, 157-171.
(17) Spengler, B. J. Mass Spectrom. 1997, 32, 1019-1036.
(18) Hathout, Y.; Setlow, B.; Cabrera-Martinez, R.-M.; Fenselau, C.; Setlow, P. Appl. Environ. Microbiol. 2003, 69, 1100-1107.
(19) Warscheid, B.; Fenselau, C. Anal. Chem. 2003, 75, 5618-5627.
(20) Warscheid, B.; Jackson, K.; Sutton, C.; Fenselau, C. Anal. Chem. 2003, 75, 5608-5617.
(21) Yao, Z.-P.; Demirev, P. A.; Fenselau, C. Anal. Chem. 2002, 16, 2529-2534.
(22) Pineda, F. J. Submitted.

6886 Analytical Chemistry, Vol. 75, No. 24, December 15, 2003
10.1021/ac034624+ CCC: $25.00 © 2003 American Chemical Society. Published on Web 11/06/2003

Figure 1. Photograph of the MALDI mass spectrometer with a 3-in. TOF analyzer with a sample plate that can accommodate ∼16 spots.

EXPERIMENTAL SECTION

Instrument. The MALDI TOF pulsed extraction mass spectrometer has previously been described in detail.10,11 Briefly, the mass spectrometer has a 3-in. time-of-flight mass analyzer and is equipped with a pulsed nitrogen laser and a sample plate allowing
for multiple spots (∼16 1.0-µL spots) (see Figure 1). The vacuum chamber was pumped via a Pfeiffer vacuum turbopump (Asslar, Germany) with a flow rate of 330 L/s. The operating pressure was ∼2 × 10⁻⁷ Torr. The ion source region consists of two accelerating regions. The sample plate was biased at 6.45 kV. The drift region, ∼3 in. long, was floated at -2 kV up to the front of the microchannel plate detector. Ions were extracted from the source with a 3.3-kV pulse applied to the sample plate following a preset time delay. The total ion acceleration energy was ∼12 keV in a 0.5-in.-long region. A Laser Science (Franklin, MA) 337-nm pulsed nitrogen laser operated at 10 Hz was used in conjunction with a variable neutral density filter (Oriel Corp., Stratford, CT) to attenuate the laser beam. The laser pulse triggered a digital delay generator (Stanford Research Systems, Sunnyvale, CA), which was used to start acquisition on the LeCroy 9374M oscilloscope (500 MHz, 2 GS/s acquisition rate) and to provide the extraction pulse after a preset time delay. (Though the pulsed
extraction method is mass dependent and only focuses ions of a given m/z at a certain time delay,12 a fixed 200-ns delay period optimized for m/z 2000 provided good focusing throughout the mass range of interest.) The microchannel plate detector (model 4655-13 with an 8° bias angle and a 4-µm channel diameter) consisted of two plates in chevron array (Hamamatsu, Bridgewater, NJ). The detector was in contact with the drift tube via a Teflon sleeve. Each spectrum consisted of an average of 200 laser shots and was calibrated with a peptide mixture spotted nearby. Data were transferred and stored in a personal computer using commercial software (TOFWARE, Ilys Software, Pittsburgh, PA).

Materials. Trifluoroacetic acid (TFA) and acetonitrile were purchased from Sigma Chemical Co. (St. Louis, MO). Milli-Q water was purified with a Millipore deionization system (Bedford, MA). Bradykinin fragment 2-9, neurotensin, dynorphin A, insulin chain B oxidized, ACTH 1-39, and bovine insulin were purchased from Sigma Chemical Co. and used as calibration standards. α-Cyano-4-hydroxycinnamic acid (α-HCHA) was purchased from Sigma Chemical Co. Trypsin immobilized on agarose beads in 25 mM NH4HCO3 was purchased from Pierce Biotechnology (Rockford, IL). Bacillus cereus T, Bacillus thuringiensis subs. Kurstaki HD-1, Bacillus subtilis 168, and Bacillus globigii spores were grown by standard techniques described elsewhere.23 Spores of the nonpathogenic strain Bacillus anthracis Sterne were grown on agar plates containing new sporulation media. Spores were harvested, purified by lysozyme treatment and salt detergent washes, and stored at -20 °C before use.24,25

Sample Preparation. Bacillus spores were suspended in water at concentrations ranging from 2.4 to 2.8 mg mL⁻¹. A rough estimate of the number of spores deposited onto the MALDI sample target is ∼35 000 spores. The sample preparation procedure19,20 consisted of depositing 0.6 µL of spore suspension and 1.0 µL of 10% TFA on the sample holder.
After the suspension dried, 1.1 µL of trypsin beads was deposited and allowed to react under humidification for between 5 and 25 min. The sample was dried at room temperature, and 0.4 µL of 0.1% TFA and 0.7 µL of matrix solution (10 mg mL⁻¹ α-HCHA in 50% acetonitrile/50% water/0.1% TFA) were deposited. External calibration was performed with 0.5 µL of a peptide mixture mixed with 0.5 µL of α-HCHA.

(23) Hathout, Y.; Demirev, P. A.; Ho, Y.-P.; Bundy, J.; Ryzhov, V.; Sapp, L.; Stutler, J.; Jackman, J.; Fenselau, C. Appl. Environ. Microbiol. 1999, 65, 4313-4319.
(24) Nicholson, W. L.; Setlow, P. In Molecular biological methods for Bacillus; Harwood, C. R.; Cutting, S. M., Ed.; John Wiley and Sons: Chichester, England, 1990; pp 391-450.
(25) Jenkinson, H. F.; Sawyer, W. D.; Mandelstam, J. J. Gen. Microbiol. 1981, 123, 1-16.

Database Design. A customized database was created that consisted of protonated molecular masses of tryptic peptides generated in silico from all known SASPs in Bacillus and Clostridium species. Forty-five SASPs from nine Bacillus and two Clostridium species were retrieved from Swiss-Prot and NCBI web databases and from a previous study by Hathout et al.18 Some of these SASPs were duplicates, since some Bacillus spores have subsets of SASPs in common. Amino acid sequences were modified by removing the N-terminal methionine residue26 and then theoretically digested with trypsin using commercial software (GPMAW program, Lighthouse Data). Theoretical digests of SASPs were restricted to one missed cleavage site to limit the
number of tryptic peptides theoretically generated. The number of potential database entries was further limited to tryptic peptides with average masses between 1000 and 3000 Da. The lower mass limit was chosen to exclude possible interference by matrix ion signals occurring in this mass range. The upper mass limit is based on the fact that only a few cleavage products generated in silico show molecular masses higher than 3000 Da. The following are the numbers of SASP and tryptic peptide masses, respectively, characterizing each species in the database discussed here: B. subtilis 168 (7, 52), B. anthracis A2012 (7, 65), B. cereus T (5, 41), Bacillus firmus (2, 7), Bacillus stearothermophilus (2, 14), Bacillus halodurans (4, 30), Bacillus megaterium (12, 68), Clostridium bifermentans (2, 15), and Clostridium perfringens (4, 28). Among these, 199 tryptic peptides were found to provide unique molecular masses. Tryptic peptides from SASPs listed for Clostridium species were entered into the peptide database to provide a better test of this strategy. B. anthracis, B. cereus T, and B. thuringiensis represent closely related species and are commonly classified as B. cereus group members.27 Since some identical SASPs have been recently reported for these three species,18-20,28,29 masses of the corresponding tryptic peptides were entered nonredundantly and labeled to characterize B. cereus members as a group against other species. SASP sequences in B. cereus T and B. thuringiensis Kurstaki are identical,18,19 and discrimination is considered most reliable when based on DNA plasmids.30,31 However, some of the SASPs in B. anthracis Sterne have unique sequences, which allow discrimination from B. thuringiensis subs. Kurstaki HD-1 and B. cereus T, based on protein masses or tryptic peptides.18-20 The SASPs have not been sequenced for B. thuringiensis and B. globigii yet. It has been reported previously that B. cereus T and B. 
thuringiensis share the same set of SASPs,18,19 but it is still unclear whether B. cereus, B. thuringiensis, and B. anthracis are different species or varieties of the same species.29

The theoretical masses of protonated tryptic peptides, obtained as described above, were entered into the database along with an indication of the species source. This step was accomplished with commercial software (Microsoft Excel, Microsoft Corp.), in which all calculations for the statistical model were also performed.

(26) Driks, A.; Setlow, P. In Prokaryotic Development; Brun, Y. V., Shimkets, L. J., Eds.; American Society for Microbiology: Washington, DC, 1999; pp 191-218.
(27) Helgason, E.; Okstad, O. A.; Caugant, D. A.; Johansen, H. A.; Fouet, A.; Mock, M.; Hegna, I.; Kolsto, A.-B. Appl. Environ. Microbiol. 2000, 66, 2627-2630.
(28) Read, T. D.; Peterson, S. N.; Tourasse, N.; Bailie, L. W.; Paulsen, I. T.; Nelson, K. E.; Tettelin, H.; Fouts, D. E.; Eisen, J. A.; Gill, S. R.; Holtzapple, E. K.; Okstad, O. A.; Helgason, E.; Rilstone, J.; Wu, M.; Kolonay, J. F.; Beanan, M. J.; Dodson, R.; Brinkac, L. M.; Gwinn, M.; Deboy, R. T.; Madpu, R.; Daugherty, S. C.; Durkin, A. S.; Haft, D. H.; Nelson, W. C.; Peterson, J. D.; Pop, M.; Khouri, H. M.; Radune, D.; Benton, J. L.; Mahamoud, Y.; Jiang, L.; Hance, I. R.; Weidman, J. F.; Berry, K. J.; Plaut, R. D.; Wolf, A. M.; Watkins, K. L.; Nierman, W. C.; Hazen, A.; Cline, R.; Redmond, C.; Thwaite, J. E.; White, O.; Salzberg, S. L.; Cline, R.; Redmond, C.; Thwaite, J. E.; White, O.; Salzber, S.; Thomason, B.; Friedlander, A. M.; Koehler, T. M.; Hanna, P. C.; Kolsto, A.; Fraser, C. M. Nature 2003, 423, 81-86.
(29) Ivanova, N.; Sorokin, A.; Anderson, I.; Galleron, N.; Candelon, B.; Kapatral, V.; Bhattacharyya, A.; Reznik, G.; Mikhailova, N.; Lapidus, A.; Chu, L.; Mazur, M.; Goltsman, E.; Larsen, N.; D’Souza, M.; Walunas, T.; Grechkin, Y.; Pusch, G.; Haselkorn, R.; Fonstein, M.; Ehrlich, S. D.; Overbeek, R.; Kyupides, N. Nature 2003, 423, 87-91.
(30) Thorne, C. B. In Bacillus subtilis and other Gram-positive bacteria; Sonenshein, A. L., Hoch, J. A., Losick, R., Eds.; American Society for Microbiology: Washington, DC, 1993; pp 113-124.
(31) Turnbull, P. C. B.; Hutson, R. A.; Ward, M. J.; Jones, M. N.; Quinn, C. P. J. Appl. Bacteriol. 1992, 72, 21-28.

Table 1. Summary of Parameters Input into Eqs 1-3 To Determine P(k) for Each Bacillus Species(a)
Bacillus | anthracis Sterne | cereus T | subtilis 168 | thuringiensis subs. Kurstaki HD-1 | globigii
peaks matched (k) | 20 | 15 | 10 | 9 | 6
peaks detected (K) | 25 | 31 | 16 | 10 | 20
theoretical peptides (n) | 65 | 41 | 52 | 41 | n/a
(a) For all Bacillus species, m_max = 3000 Da, m_min = 1000 Da, Δm = 2, and n* = 1000.

Statistical Method. The strategy followed in these experiments is based on a model described in detail previously.21 This is the first example of the model applied to the identification of Bacillus spore species via tryptic peptide mapping. The identification algorithm in eq 1 uses a hypothesis test based on a model of
$$P(k) = \frac{K!}{(K-k)!\,k!}\; e^{-(K-k)n/n^{*}} \left(1 - e^{-n/n^{*}}\right)^{k} \qquad (1)$$

false database matches. The probability P(k) that k peaks have matches in the proteome of a given microorganism (i.e., proteolytic peptides from SASPs in Bacilli) is expressed in eq 1, with the caveat that the mass features are uniformly distributed. In the mass range m_min to m_max, K is the number of peaks observed in a mass spectrum, n is the number of theoretical peptides possible from the SASP digest of a spore, k is the number of matches between the observed masses and the theoretical masses, and n* is calculated as given in eq 2 (where Δm is the mass accuracy tolerance, and m_min and m_max are the minimum and maximum masses, respectively, that constitute the mass window).

$$n^{*} = (m_{\max} - m_{\min})/\Delta m \qquad (2)$$

It is of interest to note that it is advantageous for n* ≫ n, as is the case in the experiments reported here. The p-value (α) is the probability of obtaining the observed number of matches or more by chance, or randomly. This calculation is given in eq 3.

$$\alpha = \sum_{k=k_{\mathrm{obs}}}^{k_{\max}} P(k) \qquad (3)$$
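Eqs 1-3 can be evaluated in a few lines. The following is a minimal sketch, not the authors' spreadsheet implementation: the function names are illustrative assumptions, and the sum in eq 3 is stopped at K because the binomial coefficient in eq 1 is zero for k > K.

```python
import math

def false_match_probability(k, K, n, n_star):
    """Eq 1: chance that exactly k of K observed peaks match the database at random."""
    p_miss = math.exp(-n / n_star)  # one peak falls on none of the n database masses
    p_hit = 1.0 - p_miss            # one peak falls on at least one database mass
    return math.comb(K, k) * p_miss ** (K - k) * p_hit ** k

def mass_bins(m_min, m_max, delta_m):
    """Eq 2: n* = (m_max - m_min) / delta_m."""
    return (m_max - m_min) / delta_m

def alpha_value(k_obs, K, n, n_star):
    """Eq 3: probability of k_obs or more random matches."""
    # Terms with k > K vanish, so the sum is taken up to K (an implementation choice).
    return sum(false_match_probability(k, K, n, n_star) for k in range(k_obs, K + 1))

# Inputs taken from Table 1 and its footnote.
n_star = mass_bins(1000.0, 3000.0, 2.0)
alpha_anthracis = alpha_value(k_obs=20, K=25, n=65, n_star=n_star)
print(f"n* = {n_star:.0f}, alpha (B. anthracis Sterne inputs) = {alpha_anthracis:.2e}")
```

With the Table 1 inputs for B. anthracis Sterne, this sketch gives a value close to the 3.70 × 10⁻²⁰ quoted in the Results, which is a useful check that eq 1 has been read correctly.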
Table 1 defines the variables (used in eqs 1-3) for each Bacillus species. P(k) values were calculated from k_obs to k_max (where k_max is the maximum number of matches possible for each Bacillus species, which also is equal to n). The p-value term (α) is then determined by summing the P(k) values from k_obs to k_max. The total number of tryptic peptides entered into the database was 199. In practical terms, a lower p-value signifies that a match is less likely due to random chance; therefore, a p-value of 1.0 means
that the probability is 100% that the match is due to random chance. By comparing each tested Bacillus species across the suite of species put in the database, p-values can be compared and determined based on the randomness of a “match”. If two “known” spore species share many masses, then the probability is higher that the “unknown” spore may be either candidate. It must be noted that even if two p-values are numerically similar, there may be unique peptides formed enabling identification of even closely related species.19,20 Our model does not consider the uniqueness of peaks, nor is it yet applicable to mixtures of spores. Every peak with S/N ≥ 3 was searched against the database. Pineda is investigating statistical models to analyze mixtures,32 and Wahl has tested complex search algorithms for fingerprint library matching.33

RESULTS

B. subtilis 168. Figure 2 shows mass spectra acquired from tryptic digests generated in situ from five different spore species. The spectrum in Figure 2a shows peptides obtained from B. subtilis 168, which serve to test the p-value model, because the genome from this species has been fully sequenced. The labeled peaks (1-10) observed in the mass spectrum were unique to B. subtilis when compared to spectra from the other Bacilli tested in this study. Table 2 lists the peak number, observed [M + H]+ mass, theoretical [M + H]+ mass, peptide sequence, and SASP type from which the tryptic peptide originated based on peptide mass matching. A few ion masses were observed in the mass spectrum of the tryptic digest from B. subtilis that did not match masses in the peptide database used in this work (for B. subtilis). Although these were not further studied, previous workers have pointed out that such mismatches with a pure sample usually reflect posttranslational modifications.34,35 Table 1, described previously in the Statistical Method section, shows that there were 10 (k) matches for the 16 (K) peaks observed with S/N ≥ 3.
Once again, all matched masses were within ±1 Da, and the p-values in Table 3 show that B. subtilis has the lowest (best) score compared to all other Bacilli in the database, with α = 1.95 × 10⁻¹¹. All other values were of the order of 10⁻¹, where it is much more likely that matches were due to chance (i.e., random match). B. subtilis is identified with a high level of confidence using the approach presented here.

B. anthracis Sterne. B. anthracis Sterne is used here as a model bacterium for the pathogenic B. anthracis strains currently of concern in homeland security. Figure 2b shows the MALDI spectrum of a tryptic digest generated in situ from B. anthracis Sterne spores. For peaks numbered 11-30, the experimentally determined m/z values of tryptic peptides match the values of protonated peptides obtained in silico from SASPs in B. anthracis Sterne within ±1 Da as shown in Table 4. Moreover, the masses with asterisks (*) in Table 4 denote protonated tryptic peptides that are unique to B. anthracis in mass and sequence, relative to the other species with SASP sequences available in public databases. These results are consistent with previous observations,19 with minor differences observed for low-abundance ions between 1000 and 1300.

(32) English, R. D. Personal communication with F. J. Pineda.
(33) Wahl, K. L.; Wunschel, S. C.; Jarman, K. H.; Valentine, N. B.; Petersen, C. E.; Kingsley, M. T.; Zartolas, K. A.; Saenz, A. Anal. Chem. 2002, 74, 6191-6199.
Figure 2. MALDI spectra of the tryptic digests generated in situ from (a) B. subtilis 168, (b) B. anthracis Sterne, (c) B. cereus T, (d) B. thuringiensis subs. Kurstaki HD-1, and (e) B. globigii spores and analyzed with the miniaturized TOF mass spectrometer. Peaks that were matched to peptides in the SASP database are numbered 1-39. Peaks that occur in more than one spectrum carry the same number.

Table 2. List of Masses for Tryptic SASP Peptides from B. subtilis 168
peak(a) | [M + H]+ obs | [M + H]+ theor | sequence | SASP type (Da)
1* | 1322.1 | 1322.5 | LVSFAQQQMGGR | -β (6848.6)
2* | 1483.9 | 1484.7 | LEIASEFGVNLGADTTSR | -α (6939.6)
3* | 1640.3 | 1640.9 | RLVSFAQQNMGGGQF | -α (6939.6)
4* | 1880.3 | 1881.0 | LEIASEFGVNLGADTTSR | -α (6939.6)
 | | 1881.0 | LEIASEFGVNLGADTTSR | -β (6848.6)
5* | 2285.6 | 2286.5 | ANQNSSNDLLVPGAAQAIDQMK | -β (6848.6)
6* | 2442.6 | 2442.7 | ANNNSGNSNNLLVPGAAQAIDQMK | -α (6939.6)
7* | 2784.2 | 2783.9 | QNQQSAAGQGQFGTEFASETNAQQVR | -γ (9136.5)
8* | 2842.5 | 2841.9 | QNQQSAGQQGQFGTEFASETDAQQVR | -γ (9136.5)
9* | 2911.5 | 2912.1 | QNQQSAAGQGQFGTEFASETNAQQVRK | -γ (9136.5)
 | | 2912.1 | KQNQQSAAGQGQFGTEFASETNAQQVR | -γ (9136.5)
10* | 2970.6 | 2970.1 | KQNQQSAGQQGQFGTEFASETEDAQQVR | -γ (9136.5)
(a) *, unique to B. subtilis 168.
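The theoretical [M + H]+ column in Table 2 comes from the in silico digest described under Database Design (N-terminal methionine removed, at most one missed cleavage, average masses between 1000 and 3000 Da). The sketch below illustrates that procedure under stated assumptions; it is not the GPMAW implementation used by the authors. The residue masses are standard average values rather than figures from the paper, the rule of not cleaving before proline is an assumption, and `toy_sasp` is a hypothetical sequence built around one Table 2 peptide.

```python
# Average residue masses in Da (standard textbook values, not taken from the paper).
AVG_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.0153   # average mass of H2O added to the residue sum
PROTON = 1.0073   # mass added by protonation

def mh_average(peptide):
    """Average [M + H]+ mass of a peptide."""
    return sum(AVG_RESIDUE_MASS[aa] for aa in peptide) + WATER + PROTON

def tryptic_pieces(protein):
    """Cleave C-terminal to K or R, except before P (assumed cleavage rule)."""
    pieces, start = [], 0
    for i, aa in enumerate(protein):
        next_aa = protein[i + 1] if i + 1 < len(protein) else ""
        if aa in "KR" and next_aa != "P":
            pieces.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        pieces.append(protein[start:])
    return pieces

def database_entries(protein, missed=1, m_min=1000.0, m_max=3000.0):
    """Sorted (mass, peptide) pairs for one protein within the mass window."""
    if protein.startswith("M"):
        protein = protein[1:]  # drop the N-terminal methionine
    pieces = tryptic_pieces(protein)
    peptides = set()
    for i in range(len(pieces)):
        # Join each piece with up to `missed` following pieces.
        for j in range(i, min(i + missed, len(pieces) - 1) + 1):
            peptides.add("".join(pieces[i:j + 1]))
    entries = [(round(mh_average(p), 1), p) for p in peptides]
    return sorted(e for e in entries if m_min <= e[0] <= m_max)

toy_sasp = "MAKLEIASEFGVNLGADTTSRGGK"  # hypothetical sequence, for illustration only
for mass, peptide in database_entries(toy_sasp):
    print(mass, peptide)
```

For the peptide LEIASEFGVNLGADTTSR, `mh_average` returns a value that agrees with the 1881.0 listed for peak 4 in Table 2 to within 0.1 Da, which suggests the table's theoretical masses are average rather than monoisotopic values.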
As Table 1 shows, 25 peaks (K) (with S/N ≥ 3) were observed for B. anthracis Sterne, and 20 (k) of these matched peptides in the SASP database. The α-value is 3.70 × 10⁻²⁰, meaning that the probability of obtaining a match with B. anthracis Sterne by chance is only ∼1 in 10²⁰. The α-values among B. anthracis Sterne, B. cereus T, and B. thuringiensis subs. Kurstaki are similar, as shown in Table 3, which is not surprising given the genomic similarity of B. anthracis Sterne, B. cereus, and B. thuringiensis, which has caused them to be clustered as the B. cereus group.28,29 The identification of B. subtilis, B. firmus, B. stearothermophilus, B. halodurans, B. megaterium, C. bifermentans, and C. perfringens from this spectrum would occur with a higher probability for false identification (see Table 3).

Table 3. p-Values (or α-Values) for Matches of Spectra (Far Left Column) with Bacillus and Clostridium Database Components (Top Row)
spectrum | anthracis A2012 | cereus | subtilis | thuringiensis | globigii(a) | firmus | stearothermophilus | halodurans | megaterium | C. bifermentans | C. perfringens
anthracis Sterne | 3.70 × 10⁻²⁰ | 1.16 × 10⁻⁹ | 3.64 × 10⁻¹ | 1.16 × 10⁻⁹ | n/a | 6.96 × 10⁻⁴ | 4.92 × 10⁻³ | 3.66 × 10⁻² | 2.25 × 10⁻¹ | 1.00 | 1.51
cereus | 1.09 × 10⁻¹⁰ | 1.86 × 10⁻¹³ | 4.70 × 10⁻¹ | 1.86 × 10⁻¹³ | n/a | 1.98 × 10⁻² | 3.94 × 10⁻⁶ | 6.28 × 10⁻² | 1.43 × 10⁻² | 1.00 | 4.45 × 10⁻¹
subtilis | 6.47 × 10⁻¹ | 5.65 × 10⁻¹ | 1.95 × 10⁻¹¹ | 4.81 × 10⁻¹ | n/a | 1.00 | 2.01 × 10⁻¹ | 3.81 × 10⁻¹ | 1.00 | 2.13 × 10⁻¹ | 3.61 × 10⁻¹
thuringiensis | 9.87 × 10⁻⁹ | 2.63 × 10⁻¹² | 1.00 | 2.63 × 10⁻¹² | n/a | 1.00 | 1.00 | 2.59 × 10⁻¹ | 4.93 × 10⁻¹ | 1.00 | 1.00
globigii | 3.61 × 10⁻¹ | 1.91 × 10⁻¹ | 2.73 × 10⁻³ | 1.91 × 10⁻¹ | n/a | 8.50 × 10⁻³ | 6.64 × 10⁻⁹ | 1.17 × 10⁻¹ | 3.87 × 10⁻² | 1.00 | 1.00
(a) n/a, not available.

B. cereus T and B. thuringiensis subs. Kurstaki HD-1. Panels c and d of Figure 2 show the MALDI spectra acquired for the tryptic digest from B. cereus T and B. thuringiensis subs. Kurstaki HD-1 spores, respectively. These are discussed together because their SASPs have been shown to have identical sequences.18,19,29 The spectra recorded in the present study for these two species are not identical; however, all peaks with S/N > 3 in Figure 2d are also present in Figure 2c. The cause of the spectral differences is not yet known. Further, the species cereus T, thuringiensis Kurstaki, and anthracis Sterne express many identical SASPs,17,18,28,29 and spectra in Figure 2b-d have many features in common. As already noted (Table 4), several unique tryptic peptides originate from B. anthracis SASPs. Table 5 contains a mass list for all observed peaks that match peptide masses (±1 Da) in theoretical digests of SASPs catalogued from B. cereus T and B. thuringiensis subs. Kurstaki. Peaks (marked with asterisks) at m/z 1273.5, 1534.6, 2728.8, 2852.0, and 2954.2 Da are unique to B. cereus T/B. thuringiensis subs. Kurstaki HD-1. These peptide masses are also predicted to be unique based on the known sequences of B. cereus SASPs. The α-values for matches of these two spectra with species in the database are shown in Table 3. The calculations demonstrate that the spectrum in Figure 2c is most likely that of B. cereus or B. thuringiensis (with identical α-values). The next closest is B. anthracis. We observed that all the B.
thuringiensis samples consistently showed lower signal intensities compared to all other Bacillus species in this study. Nonetheless, 9 (k) matches were made out of the 10 (K) observed peaks in the mass spectrum, and all of these matched peaks were also observed in spectra acquired from either B. anthracis Sterne or B. cereus T. Table 3 shows that the lowest (best) α-value is obtained for B. thuringiensis and B. cereus T, followed by B. anthracis Sterne. All other values were much higher, and few (if any) matches were made with the other Bacillus species, as evident from the multiple α-values = 1.00.

B. globigii. B. globigii was included because it provides small, acid-soluble proteins whose sequences are not yet known. Figure 2e shows the mass spectrum acquired from an in situ tryptic digest of B. globigii. Fewer peaks are observed than for all the other Bacillus species analyzed, but our results are in accordance with those found previously.20 Table 1 and Figure 2e show that for 20 (K) peaks observed, 6 (k) matches were made to database masses. There are no database entries for predicted proteolytic peptides for B. globigii SASPs (see Experimental Section), and the best match was made to B. stearothermophilus with α = 2.02 × 10⁻⁹ (see Table 3) with six (k) matches. This similarity in SASPs has been noted previously.18 The next nearest match was to B. subtilis with α = 2.73 × 10⁻³. This observation is also not unexpected given that B. globigii was described previously as a B. subtilis strain.36
Table 4. List of Masses for Tryptic SASP Peptides from B. anthracis Sterne
peak(a) | [M + H]+ obs | [M + H]+ theor | sequence | SASP type (Da)
11 | 1033.1 | 1033.1 | ANGSVGGEITK | -α (6939.6)
 | | 1033.1 | ANGSVGGEITK | -β (6848.6)
12* | 1140.1 | 1140.4 | LVALAQQQLR | -α/β (6375.2)
13 | 1189.1 | 1189.3 | ANGSVGGEITKR | -α/β (6834.6)
 | | 1189.3 | ANGSVGGEITKR | -2 (6710.5)
 | | 1189.3 | ANGSVGGEITKR | -α/β (7162.9)
14* | 1296.1 | 1296.5 | GLDGGAVSDMAFR | hypoth protein-1 (9207.3)
15* | 1336.4 | 1336.4 | VGDYLANEVEAR | hypoth protein-1 (9207.3)
16 | 1430.6 | 1430.7 | LAVPGAESALDQMK | -2 (6710.5)
 | | 1430.7 | LAVPGAESALDQMK | -1 (6678.5)
17 | 1488.9 | 1488.8 | LVAMAEQSLGGFHK | -α/β (6834.6)
18* | 1518.9 | 1518.8 | LVSLAEQQLGGFQK | -1 (6678.5)
19* | 1529.4 | 1528.8 | LVSLAEQQLGGGVTR | -α/β (7162.9)
20 | 1595.3 | 1594.8 | LVAMAEQQLGGGYTR | -α/β (7080.8)
21* | 1644.8 | 1644.9 | RLVAMAEQSLGGFHK | -α/β (6834.6)
 | | 1644.9 | AIEIAEQQLMKQNQ | -α/β (6668.6)
22* | 1675.7 | 1674.9 | RLVSLAEQQLGGFQK | -α/β (6678.5)
23* | 1827.7 | 1827.1 | VADEQEQHTIANLMVK | hypoth protein-1 (9207.3)
24 | 1885.6 | 1885.1 | NSNQLASHGAQAALDQMK | -α/β (7080.8)
25 | 1940.7 | 1940.1 | YEIAQEFGVQLGADATAR | -2 (6710.5)
 | | 1940.1 | YEIAQEFGVQLGADATAR | -1 (6678.5)
 | | 1940.1 | YEIAQEFGVQLGADATAR | -α/β (7162.9)
26 | 1956.5 | 1956.1 | YEIAQEFGVQLGADSTAR | -α/β (6834.6)
27 | 1972.8 | 1972.1 | YEIAQEFGVQLGADTSSR | -α/β (7080.8)
28 | 2258.6 | 2258.5 | ANQNSSNQLVVPGATAAIDQMK | -α/β (6834.6)
29* | 2835.9 | 2835.0 | ATSGASIQSTNASYGTEFATETNVQAVK | hypoth protein-2 (9737.3)
30* | 2944.1 | 2943.1 | AQASGASIQSTNASYGTEFATETDVHAVK | hypoth protein-2 (9737.3)
(a) *, unique to B. anthracis Sterne.
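Assignments such as those in Table 4 rest on a simple rule: an observed peak counts as a match when it lies within ±1 Da of a theoretical mass in the database. A minimal sketch of that matching step follows. The function name is illustrative, the first five observed and theoretical values are copied from Table 4, and the last observed value is a made-up unmatched peak added only to show how K and k differ.

```python
def match_peaks(observed, theoretical, tol=1.0):
    """Return the observed masses lying within tol Da of any theoretical mass."""
    return [m for m in observed if any(abs(m - t) <= tol for t in theoretical)]

# First five observed/theoretical values from Table 4, plus one hypothetical unmatched peak.
observed = [1033.1, 1140.1, 1189.1, 1296.1, 1336.4, 1400.0]
theoretical = [1033.1, 1140.4, 1189.3, 1296.5, 1336.4]

matched = match_peaks(observed, theoretical)
K, k = len(observed), len(matched)  # peaks detected and peaks matched, as in Table 1
print(f"K = {K}, k = {k}")
```

The resulting K and k are the quantities that enter eq 1; a linear scan is adequate here because the database described in the text holds only 199 unique masses.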
Table 5. List of Masses for Tryptic SASP Peptides from B. cereus T (Bc) and B. thuringiensis Subs. Kurstaki HD-1 (Bt)
peak(a) | [M + H]+ obs | [M + H]+ theor | sequence | SASP type (Da) | Bacillus
11 | 1032.7 | 1033.1 | ANGSVGGEITK | -α (6939.6) | Bc
 | | 1033.1 | ANGSVGGEITK | -β (6848.6) |
13 | 1189.3 | 1189.3 | ANGSVGGEITKR | -α/β (6834.6) | Bc
 | | 1189.3 | ANGSVGGEITKR | -2 (6710.5) |
 | | 1189.3 | ANGSVGGEITKR | -α/β (7162.9) |
31* | 1273.1 | 1273.5 | LVAMAEQQLGGR | -1 (7335.1) | Bc
16 | 1430.6 | 1430.7 | LAVPGAESALDQMK | -2 (6710.5) | Bc, Bt
 | | 1430.7 | LAVPGAESALDQMK | -1 (6678.5) |
17 | 1488.3 | 1488.8 | LVAMAEQSLGGFHK | -α/β (6834.6) | Bc, Bt
32* | 1534.6 | 1534.8 | LVSLAEQQLGGYQK | -2 (6710.5) | Bc, Bt
20 | 1594.7 | 1594.8 | LVAMAEQQLGGGYTR | -α/β (7080.8) | Bc, Bt
24 | 1885.3 | 1885.1 | NSNQLASHGAQAALDQMK | -α/β (7080.8) | Bc, Bt
25 | 1940.0 | 1940.1 | YEIAQEFGVQLGADATAR | -2 (6710.5) | Bc, Bt
 | | 1940.1 | YEIAQEFGVQLGADATAR | -1 (6678.5) |
 | | 1940.1 | YEIAQEFGVQLGADATAR | -α/β (7162.9) |
26 | 1955.3 | 1956.1 | YEIAQEFGVQLGADSTAR | -α/β (6834.6) | Bc, Bt
27 | 1971.7 | 1972.1 | YEIAQEFGVQLGADTSSR | -α/β (7080.8) | Bc, Bt
28 | 2258.7 | 2258.5 | ANQNSSNQLVVPGATAAIDQMK | -α/β (6834.6) | Bc, Bt
33* | 2728.4 | 2728.8 | AQASGAQSANASYGTEFATETDVHSVK | -γ (9540.1) | Bc
34* | 2851.7 | 2852.0 | ATSGASIQSTNASYGTEFSTETDVQAVK | -γ (9540.1) | Bc
35* | 2954.8 | 2954.2 | YEIAQEFGVQLGADATARANGSVGGEITK | -2 (6710.5) | Bc
(a) *, unique to B. cereus T and/or B. thuringiensis subs. Kurstaki HD-1.
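The asterisks in Tables 4 and 5 mark peptides found for one species group only. One way to flag such entries by mass alone is sketched below; this is an illustration, not the authors' procedure, which also compares sequences. The function name and the 0.05 Da tolerance are assumptions, and the two lists are small subsets of the theoretical masses in Tables 4 and 5 rather than the full database.

```python
def unique_to(own, others, tol=0.05):
    """Masses in `own` with no counterpart within tol Da in `others`."""
    return [m for m in own if all(abs(m - o) > tol for o in others)]

# Subsets of the theoretical [M + H]+ values listed in Tables 4 and 5.
anthracis = [1033.1, 1140.4, 1189.3, 1296.5, 1336.4, 1430.7]
cereus_thuringiensis = [1033.1, 1189.3, 1273.5, 1430.7, 1534.8]

print(unique_to(anthracis, cereus_thuringiensis))
print(unique_to(cereus_thuringiensis, anthracis))
```

On these subsets the flagged masses coincide with the starred peaks 12, 14, and 15 in Table 4 and 31 and 32 in Table 5. A mass-only comparison would miss peptides that differ in sequence but not in mass, so it should be read as a first filter.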
CONCLUSIONS

In this demonstration of concept, we have shown that identification of Bacillus spores is possible through the combination of a miniaturized MALDI TOF mass spectrometer, in situ proteolysis, peptide mass mapping, use of a database limited to peptides derived from SASP families, and probability testing calculations. Spectra of peptides, when compared to intact proteins, offer the advantages of more sensitivity, better reproducibility, and more accurate mass analysis. Selective solubilization of a limited number of proteins for tryptic digestion provides a limited and manipulatable number of potential peptides. This bioinformatics approach does not yet take advantage of specific peptides known to be unique to each species. We also have used the small mass spectrometer, with SASP solubilization and trypsin cleavage, to analyze mixtures of Bacillus spores. Preliminary studies with mixtures have shown promising evidence of organism-specific proteins. Statistical models are under development to determine the components of mixtures of microorganisms,32 which combine limited datasets and significance testing with interrogation for species-specific peptides.

ACKNOWLEDGMENT

The authors thank Dr. Fernando Pineda at The Johns Hopkins University Bloomberg School of Public Health for discussions pertaining to the p-value statistical model and Dr. Ben Gardner at The Johns Hopkins University School of Medicine for initial assistance with instrument modifications. This work was supported by contracts from the Defense Advanced Research Project Agency (DARPA) to R.J.C. and C.F., and by the Deutsche Forschungsgemeinschaft to B.W.

Received for review June 6, 2003. Accepted September 12, 2003. AC034624+

(34) Demirev, P. A.; Lin, Y. S.; Pineda, F. J.; Fenselau, C. Anal. Chem. 2001, 73, 4566-4573.
(35) Demirev, P. A.; Ramirez, J.; Fenselau, C. Anal. Chem. 2001, 73, 5725-5731.
(36) Priest, F. In Bacillus subtilis and other Gram-positive bacteria; Sonenshein, A. L., Hoch, J. A., Losick, R., Eds.; American Society for Microbiology: Washington, DC, 1993; pp 3-16.