A Simple Intact Protein Analysis by MALDI-MS for Characterization of Ribosomal Proteins of Two Genome-Sequenced Lactic Acid Bacteria and Verification of Their Amino Acid Sequences Kanae Teramoto, Hiroaki Sato,* Liwei Sun, Masaki Torimura, and Hiroaki Tao Research Institute for Environmental Management Technology, National Institute of Advanced Industrial Science and Technology (AIST), Tsukuba, Ibaraki 305-8569, Japan Received April 18, 2007
Rapid identification of bacteria by a bioinformatics-based approach, which processes the mass spectra observed by matrix-assisted laser desorption/ionization-mass spectrometry (MALDI-MS), relies on the calculated masses of ribosomal subunit proteins as biomarkers predicted from amino acid sequences found in protein sequence databases. To verify the actual state of the registered sequence information, a simple intact protein analysis by MALDI-MS using cell lysates as samples was applied to the characterization of ribosomal proteins from genome-sequenced Streptococcus thermophilus and Lactobacillus delbrueckii subsp. bulgaricus strains. This method avoided the risk of loss of some subunit proteins and the formation of disulfide bonds during the purification of ribosomal proteins. By comparing this with the MALDI mass spectra of different strains and carrying out manual inspection of sequence information, a total of five errors in N-terminal amino acid sequences were identified. After sequence correction, approximately 40 out of 53 subunit proteins could be assigned, considering N-terminal methionine loss only as a post-translational modification. These show promise for use as practical biomarkers for the rapid identification of S. thermophilus and L. bulgaricus. After verification of these amino acid sequences, mass differences relative to those of genome-sequenced strains have the potential for distinguishing bacteria at the strain level. Keywords: ribosomal proteins • MALDI-MS • bacteria identification • protein sequence database • sequence error
* Author for correspondence. E-mail: [email protected]. 10.1021/pr070218l CCC: $37.00 © 2007 American Chemical Society. Journal of Proteome Research 2007, 6, 3899-3907. Published on Web 09/14/2007
Introduction
Chemotaxonomic characterization of bacteria by matrix-assisted laser desorption/ionization-mass spectrometry (MALDI-MS) has received much attention in numerous fields, including the food industry, public health, and environmental analyses.1-3 To replace conventional mass spectral fingerprint analyses, which are frequently influenced by mass spectral variability,4-6 a bioinformatics-based approach for rapid identification of bacteria has been proposed.7,8 This method compares the observed masses in the MALDI mass spectra with the calculated masses of biomarker proteins predicted from the amino acid sequences found in protein sequence databases such as NCBInr and the UniProt Knowledgebase, which consists of Swiss-Prot and TrEMBL.
The use of ribosomal proteins as suitable biomarkers has been proposed by Pineda et al.9 The ribosome is an organelle found in all cells that coordinates protein synthesis. The bacterial ribosome consists of more than 50 ribosomal subunit proteins and 3 rRNAs. Because most ribosomal subunit proteins are highly basic and have relatively low molecular weights, ranging mainly from 4 to 40 kDa, intense MALDI mass spectral peaks can be observed with an acceptable mass accuracy of within a few Da.10-13 The fact that only a limited number of ribosomal subunit proteins undergo post-translational modifications other than N-terminal methionine loss is also an advantage for their use as biomarker proteins in a bioinformatics-based approach. Moreover, N-terminal methionine loss can easily be predicted from the amino acid sequence information according to an empirical N-terminal rule.14,15
To investigate how many ribosomal subunit proteins could actually be assigned in the MALDI mass spectra using the calculated masses, we have characterized the expressed ribosomal proteins of a genome-sequenced bacterium, Lactobacillus plantarum NCIMB 8826, as a model sample.13 Through comparison of the MALDI mass spectra of the purified ribosomal proteins and cell lysates, 31 ribosomal subunit proteins were selected as reliable biomarkers for rapid identification of L. plantarum. These biomarkers were further used for MALDI-MS characterization of two industrial L. plantarum cultures. Interestingly, the masses of several ribosomal subunit proteins were different from those of L. plantarum NCIMB 8826. This result indicates that variation in the amino acid sequences of ribosomal subunit proteins can be observed at the strain level. We speculate that this sequence variation can provide useful data for discriminating and further classifying bacteria at the strain level by applying a bioinformatics-based approach.
To develop this method, it will be important to understand to what degree ribosomal subunit proteins vary and are conserved within the same species. Mutations in the amino acid sequences of ribosomal proteins can be readily identified using a homology search if the sequence information is registered for different strains within the same species of genome-sequenced bacteria. However, if the registered information contains sequence errors, it will be difficult to judge whether or not such differences result from a genuine sequence mutation. The bioinformatics-based approach for bacterial characterization relies on the amino acid sequences of ribosomal proteins found in the protein sequence databases, and therefore, the actual state of the registered information should be verified. At present, however, only limited numbers of bacterial ribosomal proteins have been characterized.10,12,13,16,17 As a result, to our knowledge, few sequence errors of ribosomal subunit proteins have been identified by mass spectral characterization of whole ribosomal proteins; the errors that were found were due to misannotation of N-termini.12,17
In all the previous studies, purified ribosomal proteins have been subjected to characterization. For detailed structural characterization, especially for determining post-translational modifications, two-dimensional liquid chromatography combined with an electrospray ionization (ESI)-quadrupole time-of-flight (Q-TOF) mass spectrometer system has been employed.17 On the other hand, we have reported in a previous paper that ribosomal subunit proteins of L. plantarum can be successfully observed by a combination of cell disruption and the use of an acidic matrix solution with 1% trifluoroacetic acid (TFA).13 Because this simple intact protein analysis by MALDI-MS requires no purification or chromatographic separation, it is preferable for the exhaustive characterization of various bacterial ribosomal proteins to verify their sequence information.
In this study, the simple intact protein analysis by MALDI-MS using cell lysates as samples was applied to the characterization of the ribosomal proteins of two genome-sequenced strains each of Streptococcus thermophilus and L. delbrueckii subsp. bulgaricus (L. bulgaricus) as model bacteria. Both bacteria are typical Gram-positive lactic acid bacteria, which have been traditionally used as yogurt starter cultures. The amino acid sequences of all ribosomal subunit proteins of the four strains are available from the Swiss-Prot/TrEMBL databases. For several subunit proteins, different amino acid sequences are registered for different strains within the same species. In the present study, we discovered, based on the MALDI characterization of the expressed ribosomal proteins, that some of these differences are actually attributable to sequencing errors.
Materials and Methods
Cell Culture and Preparation of Ribosomal Protein Fraction. S. thermophilus ATCC BAA-250 (same as LMG 18311),18 ATCC BAA-491 (same as LMD-9),19 and L. bulgaricus ATCC BAA-365 (ref 19) were purchased from the American Type Culture Collection (ATCC, Manassas, VA). L. bulgaricus NBRC 13953T (same as ATCC 11842T)20 was purchased from the National Institute of Technology and Evaluation (NITE)-Biological Resource Center (NBRC, Chiba, Japan). Each experimental culture was grown in de Man-Rogosa-Sharpe (MRS) medium at 37 °C for 18 h. Isolation and purification of ribosomal proteins were carried out according to the previous paper,13 which was based on the methods of Kurland21 and Traub22 with slight modifications. Briefly, cells were harvested by centrifugation and washed twice in TMA-I buffer (10 mM Tris-HCl (pH 7.8), 30 mM NH4Cl, 10 mM MgCl2, and 6 mM 2-mercaptoethanol). The cells were ground (four times for 20 s each at 3000 rpm) with 0.1 mm zirconia silica beads in a Biospec mini bead-beater-8 (Bartlesville, OK). This cell disruption procedure was different from that used in our previous study.13 After removing the beads and cell debris by centrifugation, a portion of the cell lysate was subjected to MALDI-MS measurements, and the remainder was used for the isolation of the ribosomal protein fraction. Crude ribosome was isolated from the supernatant of the lysate by ultracentrifugation at 80 000g for 6 h. The precipitate (70S ribosome particles) was further purified by ultracentrifugation through a 30% sucrose cushion at 45 000g for 16 h. To obtain purified ribosomal protein, rRNAs were removed from the 70S ribosome by precipitation using 1 M magnesium acetate and glacial acetic acid. After precipitation using acetone at -20 °C, centrifugation, and dialysis against 2% acetic acid, the solutions of purified ribosomal protein were finally obtained.
MALDI-MS. Each sample solution of cell lysates and purified ribosomal protein was mixed with a sinapinic acid matrix solution (10 mg/mL in 50% acetonitrile with 1% TFA). The concentration of the sample solution and the mixing ratio of the sample/matrix solutions were experimentally adjusted to provide peak intensities comparable to those of 2 pmol of the calibrants described below. About 1.5 µL of the sample/matrix mixture was spotted onto the MALDI target and dried in air. The MALDI-MS measurements were performed using an AXIMA CFR-plus time-of-flight mass spectrometer (Shimadzu/Kratos, Kyoto, Japan) equipped with a pulsed N2 laser (λ = 337 nm, pulse width 3 ns, frequency 10 Hz). The MALDI mass spectra in the range of m/z 2000-40 000 were observed in positive linear mode by averaging 500 individual laser shots over a 450 µm square using a raster scan mode (50 µm step, 10 × 10 × 5 shots).
More than 15 mass spectra for each sample were collected by three separate measurements from more than five sample spots. We confirmed that comparable mass spectra were observed. Of these, three mass spectra, in which the L2 subunit (ca. 30 kDa) was clearly observed, were selected for characterization.
Mass calibration was carried out according to the following steps. Provisional peak assignment was performed by internal calibration using three peaks of ACTH clip 18-39 ([M + H]+, m/z 2465.7) and myoglobin ([M + H]+, m/z 16952.6 and [M + 2H]2+, m/z 8476.8) as references (ca. 2 pmol each). Self-calibration10,11 was further performed using moderately strong peaks assigned to ribosomal subunit proteins as internal reference peaks.
Figure 1. MALDI mass spectra of purified ribosomal proteins (A) and cell lysate (B) of S. thermophilus ATCC BAA-491.
Assignment of Ribosomal Subunit Proteins. The amino acid sequence of each ribosomal subunit protein was obtained from the UniProt Knowledgebase (Swiss-Prot and TrEMBL) (http://ca.expasy.org/sprot/). The calculated mass of each subunit protein was predicted using the Compute pI/Mw tool on the ExPASy proteomics server (http://www.expasy.org/tools/pi_tool.html). Here, only N-terminal methionine loss was considered as a possible post-translational modification. It is empirically known that the N-terminal methionine is cleaved when the penultimate amino acid residue is glycine, alanine, serine, proline, valine, threonine, or cysteine.14,15 Other possible modifications such as acetylation and methylation were not considered because they cannot be predicted from the amino acid sequence information alone. A ribosomal subunit protein was judged to be assigned when the error between the calculated mass of the [M + H]+ ion and the observed mass in the MALDI mass spectra was within 150 ppm.
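The assignment criterion described above can be illustrated with a short Python sketch. It is not the authors' implementation: the function names are hypothetical, the residue masses are standard average values, and the Compute pI/Mw tool used in the study may round slightly differently. The sketch applies the empirical N-terminal methionine rule, computes an average m/z for the [M + H]+ ion, and checks the 150 ppm tolerance.

```python
# Standard average residue masses (Da) of the 20 common amino acids.
AVG_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.0153   # average mass of H2O added to the residue sum
PROTON = 1.0073   # mass of the ionizing proton

# Penultimate residues after which the initiator methionine is cleaved.
MET_LOSS_RESIDUES = set("GASPVTC")


def mature_sequence(seq):
    """Remove the N-terminal Met when the second residue permits cleavage."""
    if len(seq) > 1 and seq[0] == "M" and seq[1] in MET_LOSS_RESIDUES:
        return seq[1:]
    return seq


def mh_plus(seq):
    """Average m/z of the singly protonated ion [M + H]+."""
    return sum(AVG_RESIDUE_MASS[aa] for aa in seq) + WATER + PROTON


def ppm_error(observed, calculated):
    """Signed mass error in parts per million."""
    return (observed - calculated) / calculated * 1e6


def is_assigned(observed, calculated, tol_ppm=150.0):
    """Apply the 150 ppm assignment criterion."""
    return abs(ppm_error(observed, calculated)) <= tol_ppm
```

For a registered sequence `seq` and an observed peak `mz`, the check would read `is_assigned(mz, mh_plus(mature_sequence(seq)))`. At m/z 10 000, the 150 ppm window corresponds to ±1.5 Da.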
Results and Discussion
Comparison of MALDI Mass Spectra of Purified Ribosomal Proteins and Cell Lysates. In the simple intact protein analysis by MALDI-MS, the cell disruption conditions under which ribosomal subunit proteins are effectively extracted from the cells appear to be the most important parameter, because Gram-positive bacteria have a rigid cell wall. In our previous study,13 the cells were ground with 0.1 mm glass beads. Although ribosomal subunit proteins could be clearly observed using this method, prominent unknown peaks were also observed, and the peaks of several subunit proteins that were observed in purified ribosomal protein were not present. In this study, several cell disruption methods such as French press, ultrasonication, cryomilling, and bead-beating were examined. We found that bead-beating with 0.1 mm zirconia silica beads, instead of the previously used glass beads, provides more effective extraction of ribosomal proteins.
Figures 1 and 2 show the MALDI mass spectra observed for purified ribosomal protein (A) and cell lysate (B) of S. thermophilus ATCC BAA-491 and L. bulgaricus NBRC 13953T, respectively, in the range of m/z 4000-32 000. The details of the peak assignments are described in the following sections. Although the intensities of some peaks are slightly different, the same peaks are generally observed independently of sample preparation. A total of 41 ribosomal subunit proteins (including one overlap peak of L28 and S21) were commonly observed for S. thermophilus ATCC BAA-491 (Figure 1). Moreover, the observation of L20 for the cell lysate resulted in a total of 42 subunit proteins. In the case of L. bulgaricus NBRC 13953T (Figure 2), a total of 34 subunit proteins were assigned for purified ribosomal protein. The number of assigned peaks appears to be lower than that of S. thermophilus. For the cell lysate, however, six additional subunit proteins (L23, L29, S3, S15, S20, and S21) are observed. Although L35 disappears, a total of 39 subunit proteins are observed in the cell lysate. Specific subunit proteins are difficult to observe in purified ribosomal protein because they tend to be lost by coprecipitation with rRNA or by binding to the tubes or dialysis membrane during the purification procedure. For example, the observation of L20 in purified ribosomal protein of Escherichia coli is difficult.10,23 In E. coli, L20 and L23 directly bind to 23S rRNA,24 whereas S15 and S20 directly bind to 16S rRNA.25 To avoid the loss of specific subunit proteins during the purification procedure, the use of cell lysate appears to suffice for the observation of ribosomal subunit proteins.
Figure 2. MALDI mass spectra of purified ribosomal proteins (A) and cell lysate (B) of L. bulgaricus NBRC 13953T.
Figure 3 shows expanded MALDI mass spectra around the peaks of S14 for S. thermophilus ATCC BAA-491 (A) and L. bulgaricus NBRC 13953T (B). In each panel, the upper trace shows the MALDI mass spectrum of purified ribosomal protein, whereas the bottom trace shows that of cell lysate. The observed m/z values for cell lysates were in good agreement with the calculated m/z values for the [M + H]+ ion of S14. Interestingly, the peaks observed for purified ribosomal protein shift to a lower mass by approximately 4 Da in both cases. Similarly, a peak shift of approximately -2 Da was observed for purified
ribosomal proteins L33 and L36. These subunit proteins have a C-x-x-C motif (C: cysteine residue, x: any amino acid residue) known as the zinc-finger or zinc-ribbon motif, which ordinarily binds Zn2+ to ensure correct folding of the proteins in vivo.26-30 S14 has two C-x-x-C motifs, whereas L33 and L36 have one each. The peak shifts of -4 or -2 Da for purified ribosomal protein suggest the formation of disulfide bonds within the C-x-x-C motifs upon removal of Zn2+. This is plausible because, during the purification of ribosomal protein, the addition of glacial acetic acid to remove rRNA also removes Zn2+. As demonstrated above, the use of cell lysates was confirmed to be effective for the observation of ribosomal subunit proteins. This simple method avoids the risk of loss of subunit proteins and the formation of disulfide bonds during the purification of ribosomal proteins. We therefore decided to carry out further characterization of both bacteria using cell lysate as the sample.
Figure 3. MALDI mass spectra around the peak of S14 for S. thermophilus ATCC BAA-491 (A) and L. bulgaricus NBRC 13953T (B), together with amino acid sequence of S14. (Upper trace) purified ribosomal protein, (bottom trace) cell lysate. Underlines in the amino acid sequences indicate C-x-x-C motif. The peaks with asterisks would be non-ribosomal proteins.
Characterization of Ribosomal Proteins of S. thermophilus Strains. Ribosomal subunit proteins of S. thermophilus were characterized through the comparison of MALDI mass spectra for two genome-sequenced strains (ATCC BAA-250 and ATCC BAA-491). Figure 4 compares the expanded MALDI mass spectra of both strains in the range of m/z 9500-12 000 as an example (the whole MALDI mass spectra are shown in SI-Figure 1 in Supporting Information). The differences in peak position of some subunit proteins reflect their sequence mutation.
Figure 4. Partial MALDI mass spectra of cell lysates of S. thermophilus ATCC BAA-491 (A) and ATCC BAA-250 (B).
Table 1. Assigned Ribosomal Subunit Proteins for Cell Lysates of S. thermophilus ATCC BAA-250 and BAA-491
             ATCC BAA-250              ATCC BAA-491
protein      accession   calculated    accession   calculated    second    Met.
name a       number b    m/z value c   number b    m/z value c   residue   loss
Large subunit proteins
L2           Q5M2B6      29828.4       Q03IF4      29828.4       G         yes
L5           Q5M2C5      19698.9       Q03IG3      19699.9       A         yes
L6           Q5M2C8      19327.9       Q03IG6      19327.9       S         yes
L9           Q5M252      17172.1       Q03I91      17100.0       K         -
L13          Q5M6E7      16172.9       Q03MU6      16172.9       N         -
L14          Q5M2C3      13080.2       Q03IG1      13080.2       I         -
L15          Q5M2D2      15545.9       Q03IH0      15531.9       K         -
L17          Q5M2E0      14330.4       Q03IH8      14330.4       A         yes
L18          -           -             Q03IG7      12924.9       I         -
L19          Q5M427      13147.4       Q03KD6      13147.4       N         -
L20          Q5M472      13502.9       Q03KJ1      13502.9       A         yes
L22          Q5M2B8      12286.3       Q03IF6      12286.3       A         yes
L23          Q5M2B5      10790.7       Q03IF3      10790.7       N         -
L24          Q5M2C4      10871.8       Q03IG2      10841.8       F         -
L28          Q5M294      6782.1        Q03IC6      6782.1        A         yes
L29          Q5M2C1      7901.2        Q03IF9      7901.2        K         -
L30          Q5M2D1      6268.3        Q03IG9      6268.3        A         yes
L31          Q5M4X0      9326.4        Q03L85      9326.4        K         -
L32          Q5M278      6650.6        Q03IB0      6650.6        A         yes
L33          Q5M277      5926.0        Q03IA9      5926.0        R         -
L34          Q5M2K7      5346.4        Q03IQ3      5346.4        K         -
L35          Q5M471      7680.1        Q03KJ0      7690.1        P         yes
L36          Q5M2D6      4452.4        Q03IH4      4452.4        K         -
Small subunit proteins
S3           Q5M2B9      23942.6       Q03IF7      23964.6       G         yes
S4           Q5M255      22940.3       Q03I94      22940.3       S         yes
S5           Q5M2D0      16920.6       Q03IG8      16920.6       A         yes
S6           Q5M2Q3      10980.4       Q03IV3      10980.4       A         yes
S7           Q5M2M5      17633.3       Q03IS0      17633.3       S         yes
S8           Q5M2C7      14656.0       Q03IG5      14656.0       V         yes
S9           Q5M6E6      14109.2       Q03MU5      14109.2       A         yes
S10          Q5M2B2      11497.4       Q03IF0      11425.3       A         yes
S11          Q5M2D8      13255.2       Q03IH6      13255.2       A         yes
S12          Q5M2M4      14926.4       Q03IR9      14926.4       P         yes
S13          Q5M2D7      13290.4       Q03IH5      13290.4       A         yes
S14          Q5M2C6      6939.2        Q03IG4      6966.3        A         yes
S15          Q5M6A7      10387.0       Q03MP3      10387.0       A         yes
S16          Q5M387      10267.9       Q03JG3      10267.9       A         yes
S17          Q5M2C2      10008.6       Q03IG0      10027.7       E         -
S18          Q5M2Q5      9137.6        Q03IV5      9137.6        A         yes
S19          Q5M2B7      10492.1       Q03IF5      10420.0       G         yes
S20          -           -             Q03L31      8290.5        A         yes
S21          Q5M3E1      6794.0        Q03JL0      6779.9        S         yes
a Subunit proteins denoted by bold type have the same amino acid sequence between two strains. b Accession number of Swiss-Prot/TrEMBL. c Calculated considering N-terminal methionine loss.
The observed peaks were assigned by comparing the observed mass to the calculated mass predicted from each
Table 2. Comparison of Amino Acid Sequences at N-Terminal Side of Ribosomal Protein S20
                              amino acid sequence (N-terminal side)                 accession number
bacteria name                 1-10        11-20       21-30       31-40       at Swiss-Prot
S. thermophilus LMG18311 a    MEVKTLANIK  SAIKRAELNV  KQNEKNSAQK  SALRTVIKAF  Q5M4T9
S. thermophilus LMD-9 b       -----MANIK  SAIKRAELNV  KQNEKNSAQK  SALRTVIKAF  Q03L31
S. thermophilus CNRZ1066      MEVKTLANIK  SAIKRAELNV  KQNEKNSAQK  SALRTVIKAF  Q5M080
S. agalactiae serogroup V     MEVKTLANIK  SAIKRAELNV  KQNEKNSAQK  SAMRTAIKAF  Q8DZZ2
S. pyogenes serotype M2       MEVKTLANIK  SAIKRAELNV  KQNEKNSAQK  SAMRTAIKAF  Q1JGL9 c
S. agalactiae serogroup Ia    -----MANIK  SAIKRAELNV  KQNEKNSAQK  SAMRTAIKAF  Q3K1D7 d
S. pneumoniae                 -----MANIK  SAIKRAELNV  KQNEKNSAQK  SAMRTAIKAF  Q97RH7 e
S. pyogenes serotype M1       -----MANIK  SAIKRAELNV  KANEKNSAQK  SAMRTAIKAF  P66509 f
S. mutans                     -----MANIK  SAIKRAELNV  KQNNRNSAQK  SAMRSAIKAF  Q8DU30
L. lactis subsp. lactis       -----MANIK  SAIKRAELNK  VANERNAQQK  SAMRTLIKKF  Q9CEU5
a Same as S. thermophilus ATCC BAA-250. b Same as S. thermophilus ATCC BAA-491. c Same sequence is registered for S. pyogenes serotype M4 (Q1J6D8), S. pyogenes serotype M12 (Q1JBK1 and Q1JLI3), and S. pyogenes serotype M28 (Q48TC9). d Same sequence for S. agalactiae serotype III (Q8E5P3). e Same sequence for S. pneumoniae ATCC BAA-255 (Q8CWS0). f Same sequence for S. pyogenes serotype M3 (P66510), S. pyogenes serotype M6 (Q5XBZ3), and S. pyogenes serotype M28 (P66511).
Table 3. Comparison of DNA Sequences around the Translation Initiation Site of Ribosomal Protein S20 Gene
strain        DNA sequence and translated amino acid sequence c
LMG18311 a    GTA TAA TGA CAT AGT TAG ATA AAT TTG(M) GAG(E) GTG(V) AAA(K) ACA(T) TTG(L) GCA(A) AAT(N) ATT(I) AAA(K)
LMD-9 b       GTA TAA TGA CAT AGT TAG ATA AAT TTG GAG GTG AAA ACA TTG(M) GCA(A) AAT(N) ATT(I) AAA(K)
a Same as S. thermophilus ATCC BAA-250. b Same as S. thermophilus ATCC BAA-491. c TTG indicates start codon. Underlined section indicates putative Shine-Dalgarno (SD) sequence.
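The two readings of the sequence in Table 3 can be reproduced with a small translation routine. The following Python sketch is only an illustration: the function name is hypothetical, the standard genetic code is assumed, and the initiator codon (ATG, GTG, or TTG) is read as methionine.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
START_CODONS = {"ATG", "GTG", "TTG"}


def translate(orf):
    """Translate a reading frame that begins at its start codon."""
    if orf[:3] not in START_CODONS:
        raise ValueError("sequence does not begin with a start codon")
    protein = ["M"]  # the initiator codon is read as Met
    for pos in range(3, len(orf) - 2, 3):
        aa = CODON_TABLE[orf[pos:pos + 3]]
        if aa == "*":  # stop codon
            break
        protein.append(aa)
    return "".join(protein)


# Region from Table 3, starting at the start codon registered for LMG18311.
region = "TTGGAGGTGAAAACATTGGCAAATATTAAA"
registered = translate(region)       # reading from codon 1
corrected = translate(region[15:])   # reading from the TTG at codon 6
```

Here `registered` is `MEVKTLANIK` and `corrected` is `MANIK`, the two N-terminal patterns listed in Table 2.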
amino acid sequence on the Swiss-Prot/TrEMBL database taking into account the N-terminal methionine loss. Table 1 summarizes the assigned ribosomal proteins of the two S. thermophilus strains, together with calculated masses, accession numbers of Swiss-Prot/TrEMBL, and the status of the N-terminal methionine loss. Detailed characterization results are summarized in SI-Tables 1 and 2 in the Supporting Information. The numbers of assigned ribosomal subunit proteins were 40 (22/32 for 50S subunit proteins and 18/21 for 30S subunit proteins) for ATCC BAA-250 and 42 (23/32 for 50S subunit proteins and 19/21 for 30S subunit proteins) for ATCC BAA-491. The empirical rule of N-terminal methionine loss was confirmed to be completely applicable to the assigned ribosomal subunit proteins of S. thermophilus. Among the ribosomal subunit proteins having the same amino acid sequence registered for both strains, 29 subunit proteins were actually observed at the same m/z values on MALDI mass spectra. These protein names are denoted by bold type in Table 1. These subunit proteins have potential for use in the identification of S. thermophilus species. On the other hand, mass differences between the two strains are clearly observed for 11 subunit proteins (L5, L9, L15, L24, L35, S3, S10, S14, S17, S19, and S21) whose registered amino acid sequences are different. These subunit proteins may be significant biomarkers for discrimination of S. thermophilus strains. No peak was observed for 12 subunit proteins (L1, L3, L4, L7/L12, L10, L11, L16, L21, L25, L27, S1, and S2) of both strains. Large molecular weight proteins such as S1 and S2 are generally difficult to observe. Some of the other subunit proteins might have post-translational modifications other than N-terminal methionine loss. It has been reported that post-translational modifications of L1, L3, L7/L12, L11, and L27 occur in E. coli,10 Rhodopseudomonas palustris,16 Thermus thermophilus,12 and/or Caulobacter crescentus.17 These subunit proteins are unsuitable biomarkers for MALDI characterization of bacteria because their modifications are not clear and therefore their actual molecular weight cannot be unequivocally predicted from their amino acid sequence.
Interestingly, two subunit proteins (L18 and S20) were assigned only for ATCC BAA-491. One might suspect that the registered amino acid sequences found in Swiss-Prot (Q5M2C9 for L18 and Q5M4T9 for S20) of ATCC BAA-250 contain errors. In fact, several annotation errors in the N-terminal methionine residue of ribosomal proteins have been clarified by mass spectral characterization.12,14 To inspect the amino acid sequences of L18 and S20, a homology search was performed using the NCBI BLASTp program (http://www.ncbi.nlm.nih.gov/BLAST/). A multiple alignment of the 40 N-terminal amino acid residues of 10 homologous S20 subunit proteins, produced with the ClustalW program on the DNA Data Bank of Japan (DDBJ) (http://clustalw.ddbj.nig.ac.jp/top-j.html), is shown in Table 2. Two patterns of N-terminal amino acid sequences are listed. One is MEVKTLANIK-, the same as that of ATCC BAA-250 (= LMG18311), and the other is MANIK-, the same as that of ATCC BAA-491 (= LMD-9). For a detailed inspection, the DNA sequences around the translation initiation region of the S20 gene (rpsT) of ATCC BAA-250 and ATCC BAA-491 were compared. These have the same DNA sequence in this region, as shown in Table 3. In prokaryotes, a specific nucleotide sequence (typically AGGAGGT), known as the Shine-Dalgarno (SD) sequence, is located approximately 10 bases upstream of the start codon.31 For S20 in ATCC BAA-491 (= LMD-9), the sequence (GGAGGT) prior to the start codon appears to be the SD sequence, whereas no such sequence was recognized prior to the assigned start codon of ATCC BAA-250 (= LMG18311). Consequently, in the case of ATCC BAA-250, the TTG coding leucine (L) at position 6 would correspond to the true start codon (note that prokaryotes have three types of start codon: ATG, GTG, and TTG). As a result, the corrected sequence becomes the same as that of ATCC BAA-491.
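The manual inspection described above can be approximated by a simple scan. The Python sketch below is an illustration under stated assumptions, not the procedure used in the study: the function name, the SD core motif GGAGG, and the allowed spacing of 4 to 12 nucleotides between the motif and the start codon are all hypothetical choices. It looks for in-frame start codons, at or downstream of the registered start, that are preceded by an SD-like motif.

```python
import re

START_CODONS = {"ATG", "GTG", "TTG"}


def candidate_starts(dna, annotated_start, sd_core="GGAGG",
                     min_gap=4, max_gap=12):
    """Return in-frame start-codon positions preceded by an SD-like motif.

    dna             -- nucleotide string of the coding strand
    annotated_start -- 0-based index of the registered start codon
    A candidate qualifies when sd_core ends min_gap..max_gap nucleotides
    upstream of it (the spacing window is an assumed parameter).
    """
    hits = []
    for pos in range(annotated_start, len(dna) - 2, 3):
        if dna[pos:pos + 3] not in START_CODONS:
            continue
        for match in re.finditer(sd_core, dna[:pos]):
            gap = pos - match.end()
            if min_gap <= gap <= max_gap:
                hits.append(pos)
                break
    return hits


# Sequence around the rpsT start site as listed in Table 3.
upstream = "GTATAATGACATAGTTAGATAAAT"
coding = "TTGGAGGTGAAAACATTGGCAAATATTAAA"
dna = upstream + coding

starts = candidate_starts(dna, annotated_start=len(upstream))
# Convert nucleotide positions to codon numbers of the registered sequence.
codon_numbers = [(p - len(upstream)) // 3 + 1 for p in starts]
```

With these settings the only candidate is the TTG at codon 6 of the registered LMG18311 sequence, in line with the correction proposed in the text. The scan only moves downstream from the registered start, so a truncated entry, where the true start lies upstream, would need the search to begin earlier in the sequence.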
To verify this prediction, the MALDI mass spectra of ATCC BAA-250 were re-examined as to whether the corresponding peak was clearly observed at the correct mass. Figure 5 shows expanded MALDI mass spectra of both strains in the range of m/z 8200-9200. The mass of the [M + H]+ ion calculated from the original amino acid sequence of S20 of ATCC BAA-250 is 8992.4, where no peak is observed. If the sixth leucine residue of S20 of ATCC BAA-250 is actually the N-terminal methionine residue, the amino acid sequence of this subunit protein would be the same as that of ATCC BAA-491. The corresponding peaks are clearly observed at almost the same position. This observation supports the conclusion that the N-terminal amino acid sequence of S20 on Swiss-Prot (Q5M4T9) should be corrected.
Figure 5. Partial MALDI mass spectra in the range of m/z 8200-9200 for S. thermophilus ATCC BAA-491 (A) and ATCC BAA-250 (B).
The amino acid sequence of L18 was also re-examined in the same manner. The multiple alignment of the N-terminal amino acid sequences of 10 homologues is shown in Table 4. Similar to the case of the S20 subunit, there are two types of N-terminal amino acid sequences. One is MKIVISK-, the same as that of ATCC BAA-250 (= LMG18311), and the other is MISK-, the same as that of ATCC BAA-491 (= LMD-9). Examination of the DNA sequence around the translation initiation region of the L18 gene (rplR) of ATCC BAA-250 indicates that GTG coding valine (V) at position 4 is likely to be the true start codon. As expected, the corresponding peak was clearly observed in the MALDI mass spectra of ATCC BAA-250 at the corrected mass (m/z 12 938.9) assuming the N-terminal methionine loss (see SI-Figure 1 in Supporting Information). Thus, it was confirmed that valine (V) at position 4 of L18 of ATCC BAA-250 on Swiss-Prot (Q5M2C9) should be N-terminal methionine, which is further removed by post-translational modification. The results of the corrected L18 and S20 are shown in SI-Table 1 in Supporting Information.
Several lengthy N-terminal amino acid sequences have been discovered during the update process on Swiss-Prot. Before release 52.1 on March 20, 2007, amino acid sequences of eight subunit proteins of S. thermophilus ATCC BAA-250 (L9, L10, L15, L21, L31, S4, S6, and S21) were corrected. We can also confirm that the corrected amino acid sequences for six of these ribosomal subunit proteins (all except L10 and L21) are accurate.
Characterization of Ribosomal Proteins in Cell Lysates of L. bulgaricus Strains. To further investigate the actual state of registered amino acid sequence information, the simple intact protein analysis by MALDI-MS was applied to the characterization of ribosomal proteins of two genome-sequenced L. bulgaricus strains (NBRC 13953T and ATCC BAA-365). MALDI mass spectra of both strains and detailed assignment results are presented in SI-Figure 2 and SI-Tables 3 and 4 in the Supporting Information. A total of 39 subunit proteins for both strains could be assigned by matching observed mass to calculated mass while taking N-terminal methionine loss into consideration. In this case, sequence errors were found in three subunit proteins (L13, L20, and S4) of ATCC BAA-365. Table 5 shows the amino acid sequences for L13, L20, and S4 of both strains and their accession numbers on the TrEMBL database. These amino acid sequences of ATCC BAA-365 are considerably shorter than those of NBRC 13953T (= ATCC 11842T). Since the observed masses of these subunit proteins of NBRC 13953T matched their calculated masses, a lack of N-terminal amino acid residues in the registered sequences of ATCC BAA-365 was inferred. BLAST searches for each subunit protein showed that the amino acid sequences of homologues from other Lactobacillus species such as L. acidophilus, L. johnsonii, L. casei, L. sakei, L. brevis, and L. plantarum support this inference.
To complete the N-terminal amino acid sequences, the corresponding upstream DNA sequences for these subunit proteins of ATCC BAA-365 were obtained from the NCBI Entrez Nucleotide database (http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=Nucleotide) and compared with those of NBRC 13953T. Table 6 compares the DNA sequences of the L13 gene and its upstream region for both strains. Regardless of the difference in the annotated N-termini, both strains have the same DNA sequence at the translation initiation region. The original start codon in the case of ATCC BAA-365 is assigned to TTG, which corresponds to the codon coding leucine at position 18
Table 4. Comparison of Amino Acid Sequences at N-Terminal Side of Ribosomal Protein L18

                              amino acid sequence (N-terminal side)
bacteria name                 1-10        11-20       21-30       31-40       accession number at Swiss-Prot
S. thermophilus LMG18311 (a)  MKIVISKPDK  NKLRQKRHRR  IRGKLSGTAD  RPRLNIFRSN  Q5M2C9
S. thermophilus LMD-9 (b)     ---MISKPDK  NKLRQKRHRR  VRGKLSGTAD  RPRLNIFRSN  Q03IG7
S. thermophilus CNRZ1066      MKIVISKPDK  NKLRQKRHRR  IRGKLSGTAD  RPRLNIFRSN  Q5LXS7
S. pyogenes serotype M2       MKIVISKPDK  NKIRQKRHRR  VRGKLSGTAD  RPRLNVFRSN  Q1JJ46 (c)
S. agalactiae serotype Ia     ---MISKPDK  NKIRQKRHRR  VRGKLSGTAD  RPRLNIFRSN  Q3K3V3 (d)
S. mutans                     ---MISKPDK  NKIRQKRHRR  VRGKLSGTAD  RPRLNIFRSN  Q8DS29
S. pneumoniae                 ---MISKPDK  NKLRQKRHRR  VRGKLSGTAD  RPRLNVFRSN  Q97SU6 (e)
S. pyogenes serotype M1       ---MISKPDK  NKIRQKRHRR  VRGKLSGTAD  RPRLNVFRSN  Q9A1V8 (f)
L. acidophilus                ---MISKPDK  NKLRLKRHRR  IRGKISGTAE  RPRLSIFRSN  Q5FM75
L. sakei subsp. sakei         ---MISKPDK  NKTRQKRHTR  VRGKISGTAD  CPRLNVFRSN  Q38US7

(a) Same as S. thermophilus ATCC BAA-250. (b) Same as S. thermophilus ATCC BAA-491. (c) Same sequences are registered for S. pyogenes serotype M4 (Q1J8Z7), S. pyogenes serotype M12 (Q1JE43 and Q1JP01), and S. pyogenes serotype M28 (Q48VT3). (d) Same sequence for S. agalactiae serotype III (Q8E7S5) and S. agalactiae serotype V (Q8E2B8). (e) Same sequence for S. pneumoniae ATCC BAA-255 (Q8CWV2). (f) Same sequence for S. pyogenes serotype M3 (Q7CFK6), S. pyogenes serotype M6 (Q5XEB9), and S. pyogenes serotype M18 (Q7CNP3).
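The mass check applied to S20 and L18 amounts to recomputing the [M + H]+ value for each candidate N-terminus and looking for a peak there. The following Python sketch illustrates the arithmetic only. The function name `mh_plus`, the table of average residue masses, and the proton mass are assumptions introduced for illustration; they are not taken from the authors' workflow, and the two strings are the short N-terminal fragments from Table 4, not full-length L18.

```python
# Average residue masses in Da (standard textbook values; assumed here).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528   # H2O added for the free N- and C-termini
PROTON = 1.00728   # charge carrier for the singly protonated ion (assumed value)


def mh_plus(sequence: str, met_loss: bool = False) -> float:
    """Return the average-mass [M + H]+ of an unmodified protein chain.

    met_loss=True removes the N-terminal methionine before summing,
    the only post-translational modification considered in the text.
    """
    if met_loss:
        if not sequence.startswith("M"):
            raise ValueError("sequence does not start with methionine")
        sequence = sequence[1:]
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER + PROTON


# Toy comparison: registered vs corrected N-terminal fragments of L18.
registered = mh_plus("MKIVISK")
corrected = mh_plus("MISK")
print(f"extra mass from the 3-residue extension: {registered - corrected:.4f} Da")
```

Because only the N-terminus differs, the difference between the two candidates equals the summed residue masses of the extra residues (K, I, and V here), which is why a mis-annotated start codon moves the expected peak by hundreds of Da.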
Table 5. Comparison of Registered N-Terminal Amino Acid Sequences of L13, L20, and S4 of L. bulgaricus ATCC 11842T and ATCC BAA-365 on the TrEMBL Database

                                    N-terminal amino acid sequences
subunit protein  strain             1-10        11-20       21-30       31-40       accession number at TrEMBL
L13              ATCC 11842T (a)    MRTTPLAKTS  EIERKWYLID  ATDVSLGRLS  TAVATILRGK  Q1GBI6
                 ATCC BAA-365       ----------  -------MID  ATDVSLGRLS  TAVATILRGK  Q04BY3
L20              ATCC 11842T (a)    MPRVKGGTVT  RARRKKVLKL  AKGYRGSKHV  QFKAASTQLF  Q1G9B4
                 ATCC BAA-365       ----------  -------MKL  AKGYRGSKHV  QFKAASTQLF  Q049G3
S4               ATCC 11842T (a)    MSRYTGPSWK  RSRRLGISLS  GTGKELARRS  YIPGQHGPNH  Q1GAV4
                 ATCC BAA-365       ----------  --------MS  GTGKELARRS  YIPGQHGPNH  Q04B92

(a) Same as NBRC 13953T used in this study.
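The discrepancies in Table 5 can be located programmatically by asking where the registered ATCC BAA-365 sequence, minus its first residue, reappears inside the ATCC 11842T sequence. The sketch below is illustrative only: `n_terminal_offset` is a hypothetical helper, and the strings are the 40-residue windows from Table 5 rather than full-length proteins.

```python
def n_terminal_offset(full: str, truncated: str):
    """Locate a truncated N-terminus inside a longer homologous sequence.

    Returns (number of residues missing from `truncated`, residue of `full`
    that occupies the position annotated as the start), or None if the
    truncated sequence, without its first residue, is not found in `full`.
    """
    tail = truncated[1:]          # drop the annotated initiator methionine
    idx = full.find(tail, 1)      # first residue of `full` cannot be the match
    if idx < 1:
        return None
    return idx - 1, full[idx - 1]


# 40-residue N-terminal windows copied from Table 5 (gaps removed).
TABLE5 = {
    "L13": ("MRTTPLAKTSEIERKWYLIDATDVSLGRLSTAVATILRGK",
            "MIDATDVSLGRLSTAVATILRGK"),
    "L20": ("MPRVKGGTVTRARRKKVLKLAKGYRGSKHVQFKAASTQLF",
            "MKLAKGYRGSKHVQFKAASTQLF"),
    "S4":  ("MSRYTGPSWKRSRRLGISLSGTGKELARRSYIPGQHGPNH",
            "MSGTGKELARRSYIPGQHGPNH"),
}

offsets = {name: n_terminal_offset(full, short)
           for name, (full, short) in TABLE5.items()}
for name, result in offsets.items():
    print(name, result)
```

In each case the residue at the annotated start position of the longer sequence is leucine, which is what one would expect if a TTG codon inside the gene had been taken as the start codon in the shorter entry.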
Table 6. Comparison of DNA Sequences around the Translation Initiation Region of the Ribosomal Protein L13 Gene

strains           DNA sequences and translated amino acid sequences (b)
ATCC 11842T (a)   AAT TAA ACG GAG GAA TTT ACA TTG CGT ACT ACA CCA TTA GCA AAG ACT AGT GAA
                                               M   R   T   T   P   L   A   K   T   S   E
ATCC BAA-365      AAT TAA ACG GAG GAA TTT ACA TTG CGT ACT ACA CCA TTA GCA AAG ACT AGT GAA

(continued)
ATCC 11842T (a)   ATT GAA CGT AAG TGG TAC TTG ATT GAC GCT ACT GAT GTT TCA TTG GGT CTG CTT
                   I   E   R   K   W   Y   L   I   D   A   T   D   V   S   L   G   R   L
ATCC BAA-365      ATT GAA CGT AAG TGG TAC TTG ATT GAC GCT ACT GAT GTT TCA TTG GGT CGT CTT
                                           M   I   D   A   T   D   V   S   L   G   R   L

(a) Same as NBRC 13953T used in this study. (b) TTG indicates start codon. Underlined section indicates putative SD sequence.
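The two readings of the L13 gene in Table 6 differ only in which TTG is taken as the start codon. The sketch below translates the region from either TTG, reading the initiator codon as methionine regardless of its usual meaning. It is a minimal illustration under stated assumptions: the codon table is the standard genetic code, `translate_from` is a hypothetical helper, and the DNA string is the ATCC BAA-365 row of Table 6 as printed.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO)}

# ATG plus the two alternative prokaryotic start codons named in the text.
START_CODONS = {"ATG", "GTG", "TTG"}


def translate_from(dna: str, start: int) -> str:
    """Translate `dna` from index `start`, reading the first codon as Met."""
    codons = [dna[i:i + 3] for i in range(start, len(dna) - 2, 3)]
    if not codons or codons[0] not in START_CODONS:
        raise ValueError("no start codon at the requested position")
    protein = ["M"]               # initiator codon always yields methionine
    for codon in codons[1:]:
        aa = CODON_TABLE[codon]
        if aa == "*":             # stop codon ends translation
            break
        protein.append(aa)
    return "".join(protein)


# ATCC BAA-365 row of Table 6 (spaces removed).
L13_REGION = (
    "AAT TAA ACG GAG GAA TTT ACA "
    "TTG CGT ACT ACA CCA TTA GCA AAG ACT AGT GAA "
    "ATT GAA CGT AAG TGG TAC "
    "TTG ATT GAC GCT ACT GAT GTT TCA TTG GGT CGT CTT"
).replace(" ", "")

UPSTREAM_TTG = 7 * 3                    # after the 7 codons shown upstream
ANNOTATED_TTG = UPSTREAM_TTG + 17 * 3   # 17 codons further downstream

long_form = translate_from(L13_REGION, UPSTREAM_TTG)
short_form = translate_from(L13_REGION, ANNOTATED_TTG)
print(long_form)
print(short_form)
```

Starting at the upstream TTG reproduces the N-terminus registered for ATCC 11842T in Table 5, while starting at the downstream TTG reproduces the shorter sequence registered for ATCC BAA-365.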
in the case of NBRC 13953T. Since the SD sequence could be confirmed approximately 10 nucleotides upstream of the L13 gene of NBRC 13953T and the expressed L13 protein was observed in the MALDI mass spectra, the N-terminal amino acid sequence of L13 registered for NBRC 13953T is correct. Therefore, L13 of ATCC BAA-365 has the same amino acid sequence as that of NBRC 13953T, providing the [M + H]+ ion at m/z 16 392.0. The intense peak of the corrected L13 was clearly observed (see SI-Figure 2 in Supporting Information). The amino acid sequences of L20 and S4 of ATCC BAA-365 could be corrected in a similar manner. In these cases, 17 N-terminal amino acid residues (MPRVKGGTVTRARRKKV) for L20 and 18 N-terminal amino acid residues (MSRYTGPSWKRSRRLGIS) for S4 were added, and the original N-terminal methionine (the codon TTG) was corrected to leucine, giving the same amino acid sequences as those of NBRC 13953T. The peaks of the corrected L20 and S4 could be observed at around m/z 13 323.7 and 23 389.7, respectively.

Discussion on Sequence Errors and Sequence Mutations. By comparing the MALDI mass spectra of different strains and carrying out manual inspection of sequence information, several errors in the amino acid sequences registered on the Swiss-Prot/TrEMBL databases could be found: two subunit proteins (L18 and S20) of S. thermophilus ATCC BAA-250 and three (L13, L20, and S4) of L. bulgaricus ATCC BAA-365. All of these errors can be attributed to mis-annotation of the start codon. The start codon that was missed for L18 of S. thermophilus ATCC BAA-250 is GTG, which also codes for valine, and the others are TTG, which also codes for leucine. One reason for this mis-annotation might be that prokaryotes use two alternative start codons (GTG and TTG) in addition to the conventional ATG. Interestingly, the incorrect N-terminal amino acid sequences for S. thermophilus ATCC BAA-250 (L18 and S20) are longer by 3 or 5 amino acid residues, whereas those for L. bulgaricus ATCC BAA-365 (L13, L20, and S4) are shorter by 17 or 18 amino acid residues. This systematic error in the assignment of the start codon might be caused by the characteristics of the program used for each gene annotation project. In the UniProt Knowledgebase, amino acid sequences translated from computer-annotated genes in EMBL nucleotide sequence
entries are first registered on the TrEMBL database. After a detailed inspection of TrEMBL entries by specialists, the entries are progressively updated to the Swiss-Prot database. Swiss-Prot entries are therefore generally thought to be reliable. In the case of the ribosomal proteins of S. thermophilus ATCC BAA-250, the amino acid sequences of eight ribosomal subunit proteins were revised before they were updated to the Swiss-Prot database. In spite of these inspections, however, two subunit proteins (L18 and S20) slipped through the sequence revision. Sequence errors might propagate to other homologues whose sequences are newly annotated by sequence homology search. It appears likely that the errors of L18 and S20 of S. thermophilus ATCC BAA-250 were inherited from the sequences registered for S. thermophilus CNRZ 1066 and other species of Streptococcus, as shown in Tables 2 and 4. Verification of the masses of expressed proteins is thus very important. After the correction of the amino acid sequences, the subunit proteins that are actually mutated could be determined. Among the assigned ribosomal subunit proteins, ten and two subunit proteins were mutated between the two strains of S. thermophilus and L. bulgaricus, respectively. For S. thermophilus strains, among the finally assigned subunit proteins (42 out of 53 subunit proteins), 12 subunit proteins have mutated amino acid residues. In the case of L. bulgaricus strains, only two subunit proteins (L24 and S18) have different amino acid sequences. To confirm this genetic similarity, the DNA sequences of the 16S rRNA gene, the DNA gyrase subunit B gene (gyrB), and the chaperone protein DnaJ gene (dnaJ) were compared between the two strains of each bacterium. Both DNA sequences of the 16S rRNA gene are the same within a variation of nine copies of the genes. 
The DNA sequences of gyrB and dnaJ have recently been proposed for phylogenetic discrimination of the genus Streptococcus at the species and strain levels (32). However, both gene sequences show only 4- to 6-bp differences over approximately 1950 bp (gyrB) and approximately 1130 bp (dnaJ) for both S. thermophilus and L. bulgaricus. These results indicate that the two strains of each bacterium are genetically very similar. The mutations of amino acid residues in the ribosomal protein sequences were also slight: S14 of S. thermophilus contains two mutated residues and the others have only one mutated residue. These are caused by only one or two nucleotide mutations. Such slight mutations, however, give clear mass shifts of 10-72 Da (except for L5 of S. thermophilus and L24 of L. bulgaricus, with a 1 Da difference), which are large enough to be distinguished by a conventional MALDI time-of-flight mass spectrometer. Even for L. bulgaricus strains, the mass shift of S18 provided a reliable guide for discriminating these strains.
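The size of the mass shift produced by a single residue substitution follows directly from the difference in residue masses. The sketch below computes that shift for a few substitutions. The substitutions listed are hypothetical examples chosen for illustration and are not the mutations reported for these strains; the residue masses are standard average values assumed here, and `substitution_shift` is a hypothetical helper.

```python
# Average residue masses in Da (standard textbook values; assumed here).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}


def substitution_shift(old: str, new: str) -> float:
    """Mass change (Da) when residue `old` is replaced by residue `new`."""
    return RESIDUE_MASS[new] - RESIDUE_MASS[old]


# Hypothetical substitutions, for illustration only.
EXAMPLES = [("I", "V"), ("A", "S"), ("N", "D")]
shifts = {f"{old}->{new}": substitution_shift(old, new) for old, new in EXAMPLES}
for label, shift in shifts.items():
    print(f"{label}: {shift:+.4f} Da")
```

Substitutions such as the hypothetical N->D case change the mass by about 1 Da, which illustrates why a 1 Da difference, as noted for L5 and L24, is much harder to resolve on an intact protein than a shift of 10 Da or more.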
Conclusions

In this report, the ribosomal subunit proteins of two genome-sequenced strains each of the lactic acid bacteria S. thermophilus and L. bulgaricus were successfully characterized by simple intact protein analysis by MALDI-MS. The use of cell lysates obtained by simple cell disruption was effective in avoiding the risk of loss of some subunit proteins and the formation of disulfide bonds during the purification of ribosomal proteins. By using this method, approximately 40 out of 53 subunit proteins could be assigned for the four genome-sequenced bacterial strains. Although we did not determine post-translational modifications other than N-terminal methionine loss, this simple intact protein analysis is adequate for our purpose, which is the selection of practical biomarkers whose actual masses can be predicted from their amino acid sequences. The bioinformatics-based approach for the characterization of bacteria relies on the amino acid sequences found in protein sequence databases. Consequently, errors in the sequence information strongly affect the reliability of the obtained results. We were able to discover a total of five errors in the N-terminal amino acid sequences for two strains found in the TrEMBL/Swiss-Prot database. The sequence information has been progressively updated on Swiss-Prot. However, updating is slow, and some errors still remain, for example, L18 and S20 of S. thermophilus ATCC BAA-250. To select reliable ribosomal protein biomarkers for the characterization of bacteria using a bioinformatics-based approach, the simple intact protein analysis by MALDI-MS demonstrated here is expected to play an important role in verifying sequence information on protein sequence databases, even for the provisional TrEMBL database. To build a reliable biomarker library, comprehensive characterization of the ribosomal proteins of many genome-sequenced lactic acid bacteria is currently in progress using this method. 
After verification of the amino acid sequences and selection of reliable biomarkers, mass differences of a given bacterial sample relative to the reference mass list of the genome-sequenced strain would provide a good guide to distinguishing bacteria at the strain level. We have demonstrated here that MALDI-MS sensitively distinguished different strains within the same species, whose DNA sequences of the 16S rRNA, gyrB, and dnaJ genes are the same or very similar. We therefore plan to develop a technique for chemotaxonomic discrimination of bacteria at the strain level by simple intact protein analysis by MALDI-MS employing ribosomal proteins as biomarkers.
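Comparing a sample against a reference mass list, as proposed here, can be sketched as a tolerance-based peak match. The example below is a rough illustration and not the authors' procedure: `match_peaks` is a hypothetical helper, the 500 ppm tolerance is an arbitrary assumption, and the observed peak list is invented. Only the three reference m/z values are taken from the text (corrected L13, L20, and S4 of L. bulgaricus).

```python
def match_peaks(reference: dict, observed: list, tol_ppm: float = 500.0) -> dict:
    """Match each reference biomarker to the nearest observed peak.

    A biomarker maps to None when no observed m/z falls within the
    relative tolerance `tol_ppm` (parts per million) of its reference value.
    """
    matches = {}
    for name, mz_ref in reference.items():
        window = mz_ref * tol_ppm / 1e6
        hits = [mz for mz in observed if abs(mz - mz_ref) <= window]
        matches[name] = min(hits, key=lambda mz: abs(mz - mz_ref)) if hits else None
    return matches


# Reference [M + H]+ values quoted in the text for L. bulgaricus.
REFERENCE = {"L13": 16392.0, "L20": 13323.7, "S4": 23389.7}

# Invented peak list standing in for a measured spectrum.
OBSERVED = [13324.5, 16391.2, 20000.0]

result = match_peaks(REFERENCE, OBSERVED)
for name, mz in result.items():
    print(name, "matched at" if mz is not None else "not matched", mz or "")
```

Biomarkers that fail to match at the reference mass are the candidates for either a strain-specific mutation or a sequence error in the database entry, which is the distinction the verification step is meant to settle.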
Supporting Information Available: Figures for MALDI mass spectra of S. thermophilus ATCC BAA-250 and ATCC BAA-491, and MALDI mass spectra of L. bulgaricus NBRC 13953T and ATCC BAA-365. Tables of corrected ribosomal subunit proteins of S. thermophilus ATCC BAA-250, and assigned ribosomal subunit proteins of S. thermophilus and L. bulgaricus strains. This material is available free at http://pubs.acs.org.

References
(1) Lay, J. O., Jr. Trends Anal. Chem. 2000, 19, 507-516.
(2) Fenselau, C.; Demirev, P. A. Mass Spectrom. Rev. 2001, 20, 157-171.
(3) Lay, J. O., Jr. Mass Spectrom. Rev. 2001, 20, 172-194.
(4) Wang, Z.; Russon, L.; Li, L.; Roser, D.; Long, S. R. Rapid Commun. Mass Spectrom. 1998, 12, 456-464.
(5) Saenz, A. J.; Petersen, C. E.; Valentine, N. B.; Gantt, S. L.; Jarman, K. H.; Kingsley, M. T.; Wahl, K. L. Rapid Commun. Mass Spectrom. 1999, 13, 1580-1585.
(6) Williams, T. L.; Andrzejewski, D.; Lay, J. O., Jr.; Musser, S. M. J. Am. Soc. Mass Spectrom. 2003, 14, 342-351.
(7) Demirev, P. A.; Ho, Y. P.; Ryzhov, V.; Fenselau, C. Anal. Chem. 1999, 71, 2732-2738.
(8) Demirev, P. A.; Lin, J. S.; Pineda, F. J.; Fenselau, C. Anal. Chem. 2001, 73, 4566-4573.
(9) Pineda, F. J.; Antoine, M. D.; Demirev, P. A.; Feldman, A. B.; Jackman, J.; Longenecker, M.; Lin, J. S. Anal. Chem. 2003, 75, 3817-3822.
(10) Arnold, R. J.; Reilly, J. P. Anal. Biochem. 1999, 268, 105-112.
(11) Arnold, R. J.; Polevoda, B.; Reilly, J. P.; Sherman, F. J. J. Biol. Chem. 1999, 274, 37035-37040.
(12) Suh, M. J.; Hamburg, D. M.; Gregory, S. T.; Dahlberg, A. E.; Limbach, P. A. Proteomics 2005, 5, 4818-4831.
(13) Sun, L.; Teramoto, K.; Sato, H.; Torimura, M.; Tao, H.; Shintani, T. Rapid Commun. Mass Spectrom. 2006, 20, 3789-3798.
(14) Sherman, F.; Stewart, J. W.; Tsunasawa, S. BioEssays 1985, 3, 27-31.
(15) Moerschell, R. P.; Hosokawa, Y.; Tsunasawa, S.; Sherman, F. J. Biol. Chem. 1990, 265, 19638-19643.
(16) Strader, M. B.; VerBerkmoes, N. C.; Tabb, D. L.; Connelly, H. M.; Barton, J. W.; Bruce, B. D.; Pelletier, D. A.; Davison, B. H.; Hettich, R. L.; Larimer, F. W.; Hurst, G. B. J. Proteome Res. 2004, 3, 965-978.
(17) Running, W. E.; Ravipaty, S.; Karty, J. A.; Reilly, J. P. J. Proteome Res. 2007, 6, 337-347.
(18) Bolotin, A.; Quinquis, B.; Renault, P.; Sorokin, A.; Ehrlich, S. D.; Kulakauskas, S.; Lapidus, A.; Goltsman, E.; Mazur, M.; Pusch, G. D.; Fonstein, M.; Overbeek, R.; Kyprides, N.; Purnelle, B.; Prozzi, D.; Ngui, K.; Masuy, D.; Hancy, F.; Burteau, S.; Boutry, M.; Delcour, J.; Goffeau, A.; Hols, P. Nat. Biotechnol. 2004, 22, 1554-1558.
(19) Makarova, K.; Slesarev, A.; Wolf, Y.; Sorokin, A.; Mirkin, B.; Koonin, E.; Pavlov, A.; Pavlova, N.; Karamychev, V.; Polouchine, N.; Shakhova, V.; Grigoriev, I.; Lou, Y.; Rohksar, D.; Lucas, S.; Huang, K.; Goldstein, D. M.; Hawkins, T.; Plengvidhya, V.; Welker, D.; Hughes, J.; Goh, Y.; Benson, A.; Baldwin, K.; Lee, J. H.; Diaz-Muniz, I.; Dosti, B.; Smeianov, V.; Wechter, W.; Barabote, R.; Lorca, G.; Altermann, E.; Barrangou, R.; Ganesan, B.; Xie, Y.; Rawsthorne, H.; Tamir, D.; Parker, C.; Breidt, F.; Broadbent, J.; Hutkins, R.; O'Sullivan, D.; Steele, J.; Unlu, G.; Saier, M.; Klaenhammer, T.; Richardson, P.; Kozyavkin, S.; Weimer, B.; Mills, D. Proc. Natl. Acad. Sci. U.S.A. 2006, 103, 15611-15616.
(20) Van de Guchte, M.; Penaud, S.; Grimaldi, C.; Barbe, V.; Bryson, K.; Nicolas, P.; Robert, C.; Oztas, S.; Mangenot, S.; Couloux, A.; Loux, V.; Dervyn, R.; Bossy, R.; Bolotin, A.; Batto, J. M.; Walunas, T.; Gibrat, J. F.; Bessieres, P.; Weissenbach, J.; Ehrlich, S. D.; Maguin, E. Proc. Natl. Acad. Sci. U.S.A. 2006, 103, 9274-9279.
(21) Kurland, C. G. In Methods in Enzymology; Moldave, K., Grossman, L., Eds.; Academic Press: New York, 1971; Vol. 20, pp 379-381.
(22) Traub, P.; Mizushima, S.; Lowry, C. V.; Nomura, M. In Methods in Enzymology; Moldave, K., Grossman, L., Eds.; Academic Press: New York, 1971; Vol. 20, pp 391-407.
(23) Yamamoto, T.; Izumi, S.; Gekko, K. FEBS Lett. 2006, 580, 3638-3642.
(24) Rohl, R.; Nierhaus, K. H. Proc. Natl. Acad. Sci. U.S.A. 1982, 79, 729-733.
(25) Mizushima, S.; Nomura, M. Nature 1970, 226, 1214-1218.
(26) Herfurth, E.; Briesemeister, U.; Wittmann-Liebold, B. FEBS Lett. 1994, 351, 114-118.
(27) Tsiboli, P.; Triantafillidou, D.; Franceschi, F.; Choli-Papadopoulou, T. Eur. J. Biochem. 1998, 256, 136-141.
(28) Härd, T.; Rak, A.; Allard, P.; Kloo, L.; Garber, M. J. Mol. Biol. 2000, 296, 169-180.
(29) Krishna, S. S.; Majumdar, I.; Grishin, N. Nucleic Acids Res. 2003, 31, 532-550.
(30) Panina, E. M.; Mironov, A. A.; Gelfand, M. S. Proc. Natl. Acad. Sci. U.S.A. 2003, 100, 9912-9917.
(31) Shine, J.; Dalgarno, L. Proc. Natl. Acad. Sci. U.S.A. 1974, 71, 1342-1346.
(32) Itoh, Y.; Kawamura, Y.; Kasai, H.; Shah, M. M.; Nhung, P. H.; Yamada, M.; Sun, X.-S.; Koyama, T.; Hayashi, M.; Ohkusu, K.; Ezaki, T. Syst. Appl. Microbiol. 2006, 29, 368-374.
PR070218L Journal of Proteome Research • Vol. 6, No. 10, 2007 3907