Sub-Speciating Campylobacter jejuni by Proteomic Analysis of Its Protein Biomarkers and Their Post-Translational Modifications

Clifton K. Fagerquist,* Anna H. Bates, Sekou Heath, Bryan C. King, Brandon R. Garbus, Leslie A. Harden, and William G. Miller

Western Regional Research Center, Agricultural Research Service, United States Department of Agriculture, 800 Buchanan Street, Albany, California 94710

Received December 27, 2005

We have identified several protein biomarkers of three Campylobacter jejuni strains (RM1221, RM1859, and RM3782) by proteomic techniques. The protein biomarkers identified are prominently observed in the time-of-flight mass spectra (TOF MS) of bacterial cell lysate supernatants ionized by matrix-assisted laser desorption/ionization (MALDI). The protein biomarkers identified were: DNA-binding protein HU, translation initiation factor IF-1, cytochrome c553, a transthyretin-like periplasmic protein, chaperonin GroES, thioredoxin Trx, and ribosomal proteins: L7/L12 (50S), L24 (50S), S16 (30S), L29 (50S), and S15 (30S), and conserved proteins similar to strain NCTC 11168 proteins Cj1164 and Cj1225. The protein biomarkers identified appear to represent high copy, intact proteins. The significant findings are as follows: (1) Biomarker mass shifts between these strains were due to amino acid substitutions of the primary polypeptide sequence and not due to changes in post-translational modifications (PTMs). (2) If present, a PTM of a protein biomarker appeared consistently for all three strains, which supported that the biomarker mass shifts observed between strains were not due to PTM variability. (3) The PTMs observed included N-terminal methionine (N-Met) cleavage as well as a number of other PTMs. (4) It was discovered that protein biomarkers of C. jejuni (as well as other thermophilic Campylobacters) appear to violate the N-Met cleavage rule of bacterial proteins, which predicts N-Met cleavage if the penultimate residue is threonine. Two protein biomarkers (HU and 30S ribosomal protein S16) that have a penultimate threonine residue do not show N-Met cleavage. In all other cases, the rule correctly predicted N-Met cleavage among the biomarkers analyzed. This exception to the N-Met cleavage rule has implications for the development of bioinformatics algorithms for protein/pathogen identification. 
(5) There were fewer biomarker mass shifts between strains RM1221 and RM1859 than between either of these strains and strain RM3782. As the mass shifts were due to amino acid substitutions (and thus underlying genetic variations), this suggested that strains RM1221 and RM1859 were phylogenetically closer to one another than to strain RM3782 (in addition, a protein biomarker prominent in the spectra of RM1221 and RM1859 was absent from the RM3782 spectrum due to a nonsense mutation in the gene of the biomarker). These observations were confirmed by a nitrate reduction test, which showed that RM1221 and RM1859 were C. jejuni subsp. jejuni whereas RM3782 was C. jejuni subsp. doylei. This result suggests that detection/identification of protein biomarkers by pattern recognition and/or bioinformatics algorithms may easily subspeciate bacterial microorganisms. (6) Finally, the number and variation of PTMs detected in this relatively small number of protein biomarkers suggest that bioinformatics algorithms for pathogen identification may need to incorporate many more possible PTMs than suggested previously in the literature.

Keywords: Campylobacter jejuni • doylei • sub-speciation • MALDI-TOF • proteomics • post-translational modification • bacterial classification • food safety • protein biomarkers

* To whom correspondence should be addressed. C. K. Fagerquist, Western Regional Research Center, Agricultural Research Service, U.S. Department of Agriculture, 800 Buchanan Street, Albany, CA 94710, U.S.A. E-mail: [email protected]. 10.1021/pr050485w CCC: $33.50 © 2006 American Chemical Society. Journal of Proteome Research 2006, 5, 2527-2538. Published on Web 08/19/2006.

Introduction

Foodborne pathogens represent a significant public health risk. Efforts to detect, identify, and track foodborne pathogens have always been important with regard to preharvest agricultural practices as well as post-harvest food processing and food safety. The possibility of a deliberate attack on the nation’s food supply has given greater urgency to the development of techniques that rapidly identify and track (the spread and ultimate origin of) pathogens and their toxins. For it to be of practical value, a pathogen identification technique must be sensitive, rapid, highly specific, robust, and reproducible. Ideally, the technique must identify the pathogen not only by genus and species but also by strain so that the origin of the pathogen can be determined quickly during an outbreak or incident. The technique should also be compatible with the detection of pathogens in real-world samples, which are often complex, e.g., food, water, soil, plants, etc. Real-world samples often have a multitude of natural bacterial flora; consequently, it is important to detect and identify the pathogen responsible for illness in a sample that contains other microorganisms, many of which may not cause illness. With regard to this issue, the necessity of culturing an isolate from a sample before analysis has both advantages and disadvantages. Disadvantages are the time required to culture (24-48 h) as well as the problem of “culture bias”: the media used for culturing may competitively favor the growth and detection of certain pathogens over others. One advantage of culturing is that the viability of the pathogen (or pathogens) is confirmed by the appearance of bacterial growth. Whether a pathogen detection technique requires bacterial culturing prior to analysis is one of many factors that must be taken into account when evaluating the range of techniques available.

One pathogen identification technique of increasing popularity involves detection of protein biomarkers by matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS).1-13 Detection of proteins has the advantage that an organism’s proteins are reflective of its underlying genetic identity. Typically, 20-50 protein biomarkers per strain, primarily high copy cytosolic proteins, are detected by this technique.
Analysis of MALDI-TOF MS spectra typically involves using either pattern recognition or bioinformatics algorithms for pathogen identification.14-20 The pattern recognition approach compares an unknown spectrum to the spectra of reference strains, where the reference strains have been identified definitively by some other microbiological or molecular biological technique. Pattern recognition involves analysis of a series of m/z peaks within a limited mass range. Peak intensity (absolute or relative) is not a significant criterion for analysis as there is considerable sample-to-sample variability. However, m/z peaks must meet at least some minimum signal-to-noise (S/N) threshold value.14-15 The bioinformatics approach to pathogen identification takes advantage of the increasing number of sequenced viral and bacterial pathogen genomes.16-20 These genomes can be used to generate a list of hypothetical protein molecular weights (MWs), which can be compared to the m/z values of biomarker ions in a MALDI-TOF MS spectrum. In addition to generating MWs of unmodified proteins, Demirev et al. reported incorporating the possible post-translational modifications (PTMs) common to bacterial proteins, e.g., N-terminal methionine cleavage, in a simple algorithm for identification.18

Among the most important of the food-borne pathogens is Campylobacter, which is suspected of causing 2.4 million known cases of gastrointestinal illness each year in the United States and is also a significant problem worldwide. In 99% of identified campylobacteriosis cases, the species responsible for illness is C. jejuni. In rare cases, campylobacteriosis and C. jejuni have also been linked to the autoimmune-mediated peripheral neuropathy Guillain-Barré syndrome, which is the leading cause of paralytic disease in the U.S.A.21

Very recently, our laboratory reported speciation of C. coli, C. jejuni, C. helveticus, C. lari, C. sputorum, and C. upsaliensis by the analysis of MALDI-TOF MS spectra of cell lysates.22 The analysis relied primarily on visual comparison of MS spectra, grouping biomarker ion “peaks” that were common to certain species and strains (“species-identifying biomarker ions” or “SIBIs”). Some biomarker ions were assigned tentatively to specific proteins by correlating their observed m/z to the hypothetical protein MWs generated from genomic sequencing. One prominent SIBI was identified tentatively by extraction and peptide mass mapping as the DNA-binding protein HU.22 In a separate communication, we also reported definitive identification, by proteomic techniques, of the HU protein biomarker observed in the MALDI-TOF MS spectrum of cell lysates of C. jejuni, C. coli, C. lari, C. upsaliensis, C. helveticus, and C. concisus strains.23 In that study, it was determined that variations in the mass of the HU biomarker among species and strains were due to amino acid substitutions caused by nonsynonymous mutations in the HU gene, hup. We also reported that the prominence of this biomarker in MALDI-TOF spectra was due, in part, to five lysine residues clustered at the carboxy terminus of the protein.23

In the current study, we have continued our identification of C. jejuni protein biomarkers that are prominently observed in the MALDI-TOF MS spectra of bacterial cell lysates. We have selected three strains of C. jejuni for protein biomarker extraction and identification: RM1221, RM1859, and RM3782. The genome of strain RM1221 had been sequenced and annotated previously, and the genome sequence is currently available in public databases.24 The genome sequences of strains RM1859 and RM3782 have not been determined, although RM3782 was identified as C. jejuni by extended multi-locus sequence typing.25 RM1859 was analyzed previously by Lior and Penner serotyping methods and identified as C. jejuni.26 A nitrate reduction test27,28 revealed that strains RM1221 and RM1859 were C. jejuni subsp. jejuni whereas strain RM3782 was C. jejuni subsp. doylei.
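As a rough illustration of the bioinformatics approach outlined in this Introduction, the following Python sketch computes average molecular weights from amino acid sequences, optionally applies N-terminal methionine cleavage, and matches the resulting masses to an observed mass within a tolerance. The function names, the toy sequence, and the set of penultimate residues assumed to permit N-Met cleavage are illustrative assumptions and are not the algorithm of ref 18. As noted in the abstract, some C. jejuni proteins with a penultimate threonine do not follow this rule, so it should be treated as a heuristic only.

```python
# Standard average residue masses (Da) of the 20 common amino acids.
AVG_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.0153  # average mass of H2O added for the free N- and C-termini

# Assumption: small penultimate residues that are commonly said to permit
# N-Met cleavage. The source text only mentions threonine explicitly.
SMALL_PENULTIMATE = set("GASPVTC")


def average_mw(sequence):
    """Average MW (Da) of an unmodified, fully reduced polypeptide."""
    return sum(AVG_RESIDUE_MASS[aa] for aa in sequence) + WATER


def candidate_masses(sequence):
    """Return predicted masses for the unmodified and N-Met-cleaved forms."""
    masses = {"unmodified": average_mw(sequence)}
    if len(sequence) > 1 and sequence[0] == "M" and sequence[1] in SMALL_PENULTIMATE:
        masses["N-Met cleaved"] = average_mw(sequence[1:])
    return masses


def match_biomarker(observed_mass, proteins, tolerance_da):
    """List (protein, form, predicted mass) entries within the tolerance."""
    hits = []
    for name, sequence in proteins.items():
        for form, mass in candidate_masses(sequence).items():
            if abs(mass - observed_mass) <= tolerance_da:
                hits.append((name, form, round(mass, 1)))
    return hits


if __name__ == "__main__":
    toy_proteins = {"toy_protein": "MAKK"}  # hypothetical sequence
    observed = average_mw("AKK") + 1.0      # pretend measurement
    print(match_biomarker(observed, toy_proteins, tolerance_da=5.0))
```

For singly protonated ions the observed m/z is about 1 Da above the neutral mass, which is small compared with the tolerance of several Da typical of linear MALDI-TOF measurements; a stricter implementation would account for the proton explicitly and would include other PTMs.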

Materials and Methods

With the exception of the nitrate reduction test used to distinguish C. j. jejuni from C. j. doylei, a detailed description of the protocol for protein biomarker extraction and analysis was reported previously.23 The following is an abbreviated description of the materials and methods employed.

Materials. All chemicals and solvents were purchased from either Sigma-Aldrich (St. Louis, MO) or Fisher Scientific (Pittsburgh, PA) unless otherwise noted. Gels, buffers, standards, and electrophoresis apparatus for protein separation were purchased from Invitrogen (Carlsbad, CA). Gases were obtained from Praxair (Oakland, CA).

Bacterial Culture. Campylobacter was grown as described previously.22-23 Warning: Campylobacter is a Biosafety Level 2 human pathogen. All appropriate precautions were taken when handling this pathogen. Briefly, all Campylobacter strains were kept frozen until inoculation on nutrient agar. C. jejuni strains were grown at 37-42 °C on Brucella Broth Agar containing 1.5% Bacto Agar (Becton Dickinson) and supplemented with 0.25 g/L of sodium metabisulfite (anhydrous), 0.25 g/L of sodium pyruvate (anhydrous), 0.25 g/L of FeSO4·7H2O, and 5% laked horse blood. The inoculated agar plates were placed in plastic zip-lock freezer bags. The bags were filled/purged three times with a compressed certified gas mixture of 5-10% hydrogen, 10% carbon dioxide, and 80-85% nitrogen and then given a final fill. After 48 h of growth, the bacteria were harvested for MALDI-TOF MS analysis and protein extraction.

Nitrate Test to Distinguish C. j. jejuni from C. j. doylei. To determine nitrate reduction in Campylobacter, nitrate disks and anaerobic nitrate reagents A and B (Remel, Inc, Lenexa,

KS) were used.27-28 Zinc dust was obtained from BioMerieux (Marcy l’Etoile, France). C. j. jejuni strain NCTC11168 and C. j. doylei strain ATCC49350 were used as controls for the procedure. Each strain was grown out of the -80 °C freezer on Anaerobe basal agar (ABA; Oxoid, Basingstoke, U.K.) amended with 5% laked horse blood (Hema Resource & Supply, Aurora, OR) under the conditions described above. After 48 h, each strain was streaked for isolation. The nitrate disk was placed on an area of heavy growth and incubated for an additional 24 h under the same conditions. The nitrate disk was subsequently taken off the plate using a 1 µL loop and placed in a clean tube. One drop each of anaerobic nitrate reagents A and B was added to the tube. A color change (clear to red) indicates that nitrate was reduced to nitrite. If no color change was observed after 3 min, zinc dust was added. If no color change was observed after addition of zinc dust, then nitrate had been reduced completely to nitrogen in the previous step, indicating that the strain tested was C. j. jejuni. If a color change was observed after addition of zinc dust (but not before), then no reduction had taken place in the previous step, indicating that the strain tested was C. j. doylei.

Amplification and Sequencing of the Biomarker Genes. Extraction of Campylobacter DNA was performed as described previously.23,25 Primer pairs used to amplify the biomarker genes of C. j. doylei strain RM3782 and the control strains (C. j. jejuni NCTC 11168 and RM1221 and C. coli RM2228) are listed in Table 1. Primers were purchased from Integrated DNA Technologies, Inc. (Coralville, IA) or from Operon Biotechnologies (Alameda, CA). The PCRs were carried out with a final volume of 50 µL; each reaction contained 50 ng of genomic DNA, 1× MasterAmp PCR buffer (Epicenter, Madison, WI), 2.5 mM MgCl2, 1× MasterAmp Enhancer (Epicenter), 1 µM each primer, 0.25 mM each dNTP, and 0.02 U/µL Taq polymerase (New England Biolabs). Amplifications and cycle sequencing reactions were performed using a Bio-Rad (Hercules, CA) Tetrad thermocycler. The PCRs employed the following thermal parameters: a 2 min initial denaturation; 35 cycles of 30 s at 94 °C, 30 s at 48 °C, and 1 min at 72 °C; and a 5 min final extension at 72 °C. Amplification products were purified on a Qiagen BioRobot 8000 workstation (Qiagen, Santa Clarita, CA).

Table 1. Primer Pairs Used for the Amplification of the Biomarker Genes of C. j. doylei Strain RM3782

gene                                 primer  sequence
30S S15 (rpsO)                       F       5′ CTT TGG TAA ATA GCA CAG ATT CTC C 3′
                                     R       5′ CCT GAT GCC AAA GGA CAA AGG 3′
30S S16 (rpsP)                       F       5′ TGA CAC AAA TGA TGA GTC AAG C 3′
                                     R       5′ CCT GTA TCA ACC TTA TGT GCA A 3′
50S L7/L12 (rplL)                    F       5′ TCA CWA TWG GTT TAA ATG CG 3′
                                     R       5′ CTA ACA TAT YGC AYA TAC CTC RTG G 3′
50S L24 (rplX)                       F       5′ TAG CMA RCT CAC CMG CAC CTA CGC 3′
                                     R       5′ CCT ATC GGA ACT CGT ATC TTT GG 3′
50S L29 (rpmC)                       F       5′ TTM ACA ACA ACG CCT TGW ATT TCT C 3′
                                     R       5′ GTA AAG GTG CWG TTG AGG AAT G 3′
chaperonin GroES                     F       5′ TTT ATA GTA TGA TAT TAT CAC TC 3′
                                     R       5′ CTA CTT CTT TAG CAA CAC TTA C 3′
cons. protein (similar to cj1164)    F       5′ GAT GTT CAT GAG ART GWT CTT CG 3′
                                     R       5′ GCT ATA GGR GTR AGY GTT ACC TTG C 3′
cons. protein (similar to cj1225)    F       5′ ATC TTT ATA CCT GTA ATT GTC CTG G 3′
                                     R       5′ CTC AGC GGT AGA GTA GTC GG 3′
cytochrome c553 (cyf)                F       5′ TCT ATA TTA ATC ACA CCG TCT C 3′
                                     R       5′ GAT GAG GAT TCT TTR CGT GAT G 3′
thioredoxin (trx)                    F       5′ CTA AGT CCA GCA GGA CCT C 3′
                                     R       5′ TTC AGA TTG ATC TGA TTT GC 3′
trans. init. IF1 (infA)              F       5′ TAT AGG GAC TCA TGT GCC 3′
                                     R       5′ GGT CTC ACT TTC ATG AGC 3′
transthyretin-like protein           F       5′ AGA GAT AGA CGY GGT AAA GCA G 3′
                                     R       5′ GAG GAA GTT TCT CAA GTC TTG C 3′
DNA-binding HU (hup)                 F       5′ AAT GAA ATT GAY AAT AGT GGR GGA T 3′
                                     R       5′ TTT CCT ATA AGY TCA YTT ACT TTT T 3′
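Several of the primers in Table 1 contain degenerate positions written with IUPAC nucleotide codes (for example W, Y, R, and M). As a minimal sketch, assuming the standard IUPAC code table, the following Python snippet expands a degenerate primer into the concrete sequences it represents. The function name `expand_primer` is hypothetical and is not part of the published protocol.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "M": "AC", "K": "GT", "S": "CG", "W": "AT",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_primer(primer):
    """Return every concrete sequence encoded by a degenerate primer.

    Spaces (used in the table to group bases in triplets) are ignored.
    """
    bases = primer.replace(" ", "").upper()
    choices = [IUPAC[base] for base in bases]
    return ["".join(combo) for combo in product(*choices)]


if __name__ == "__main__":
    forward = "TCA CWA TWG GTT TAA ATG CG"         # 50S L7/L12 forward primer
    reverse = "CTA ACA TAT YGC AYA TAC CTC RTG G"  # 50S L7/L12 reverse primer
    for sequence in expand_primer(forward):
        print(sequence)
    print(len(expand_primer(reverse)), "variants for the reverse primer")
```

The number of variants grows multiplicatively with each degenerate position, which is one reason degenerate primers are usually kept to a few ambiguous bases.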

Cycle sequencing reactions were carried out with the ABI PRISM BigDye terminator cycle sequencing kit (Version 3.1, Applied Biosystems, Foster City, CA) under standard conditions. Labeled products were purified with DyeEx (Qiagen) 96-well plates. DNA sequencing was done on an ABI PRISM 3130xl genetic analyzer using the POP-7 polymer and the ABI PRISM Genetic Analyzer Data Collection and Sequencing Analysis software modules.

Protein Extraction. A 10 µL inoculating loop was used to transfer cells from the culture plate to four extraction tubes, each containing 0.5 mL of 2:1 water/acetonitrile, 0.1% trifluoroacetic acid (TFA), and zirconia/silica beads. After bead-beating and centrifugation, the supernatants were combined in a 15 mL polypropylene tube, to which was added 3.0 mL of 0.1% TFA. The tube was capped, vortexed briefly, and filter centrifuged. After centrifugation, the filtrate was stored at -80 °C for subsequent HPLC analysis.

HPLC Separation. The cell lysate supernatant solution was separated by high-performance liquid chromatography (HP Series II 1090, Palo Alto, CA). The column used for chromatographic separation was a large capacity protein and peptide C18 Vydac column. Cell lysate analytes were detected by UV absorbance (λ = 210 nm). Mobile phase A was 0.1% TFA in HPLC grade water. Mobile phase B was 1:1 acetonitrile/2-propanol with 0.1% TFA. The flow rate was 1.5 mL/min. The HPLC gradient was as follows: 0-5 min (85% A); 5-65 min (85-20% A); 65-80 min (20% A); 80-90 min (20-90% A). About 90 HPLC fractions were collected. HPLC fractions were analyzed subsequently by MALDI-TOF analysis (or stored at -80 °C for later analysis).

1D SDS-PAGE. HPLC fractions of interest were centrifuge/evaporated to a final volume of 20-30 µL. To this volume was added 10 µL of lithium dodecyl sulfate (LDS) sample buffer (Nupage, Invitrogen, Carlsbad, CA). After incubation, the 30-35 µL sample was loaded onto a 4-12% Bis-Tris 1.5 mm × 10-lane gel cassette (Nupage). Protein standards occupied lanes #1 and #10. Electrophoresis was performed using a constant voltage pre-program for 35-40 min. After electrophoresis, the gel was stained for 48 h with a fluorescent gel-staining solution. After de-staining, gels were photographed with a gel/blot imager at a wavelength of 302 nm.

In-Gel Digestion of Protein Biomarker. Prominent gel bands were excised on a UV light box. Cautionary note: UV radiation is harmful to eyes and skin. Precautions were taken to cover all exposed skin. In addition, both UV-resistant glasses and a UV-resistant shield were worn for eye protection. The excised gel bands were divided further to produce ∼1 mm³ gel cubes. Two or three gel cubes from each band were deposited into a 96-well processing tray. In-gel digestion was performed robotically using an automated protein digester per the manufacturer’s instructions (DigestPro, Intavis, Langenfeld, Germany). Porcine trypsin was used for enzymatic digestion. Digested samples were then analyzed by nano-LC-MS/MS (or stored at -80 °C for later analysis).

MALDI-TOF MS of Cell Lysates. MALDI-TOF protein biomarker analysis of Campylobacter cell lysates has been described in detail elsewhere.22-23 Briefly, Campylobacter cells were transferred using a 1 µL inoculating loop to an extraction tube containing 0.5 mL of 2:1 water/acetonitrile with 0.1% TFA and zirconia/silica beads. The tube was agitated for 60 s on a beadbeater followed by centrifugation. A saturated solution of trans-4-hydroxy-3-methoxy-cinnamic acid (ferulic acid) was prepared in 2:1 water/acetonitrile with 0.1% TFA. This saturated matrix solution was then diluted by 66%, and 0.5 µL aliquots of this diluted matrix solution were deposited in a 7 × 7 array of spots onto a square stainless steel target plate. After drying at room temperature, 0.5 µL of cell lysate supernatant was deposited onto the matrix spots. The spots were allowed to dry at room temperature. The target plate was then introduced into the source of a Bruker Reflex II MALDI-TOF mass spectrometer. Laser desorption/ionization was achieved with a pulsed nitrogen laser.
The instrument was operated in reflectron mode at an ion acceleration voltage of 20 kV, which resulted in a mass accuracy of ±5-10 Da when externally calibrated. Data were processed using Bruker XMASS software.

MALDI-TOF MS of HPLC Fractions of Cell Lysates. The 90 HPLC fractions of a strain were analyzed by MALDI-TOF MS to identify the fractions that contained protein biomarkers that were prominently observed in the MALDI-TOF MS of cell lysates. Analysis of the HPLC fractions by MALDI-TOF MS was similar in procedure to the analysis of cell lysates. A 0.5 µL aliquot of each HPLC fraction was spotted onto a dried spot of the ferulic acid matrix. We observed that the MALDI-TOF MS m/z of a protein biomarker in an HPLC fraction and the MALDI-TOF MS m/z of the protein biomarker in the cell lysate were almost always within ±5 Da. Variations outside the ±5 Da range were ascribed to target characteristics, instrument performance, or external calibration.

HR-ESI-MS of HPLC Fractions of Cell Lysates. Once a biomarker (or biomarkers) was identified in an HPLC fraction, a more exact measurement of its mass was obtained by high-resolution electrospray ionization mass spectrometry (HR-ESI-MS) using a quadrupole/time-of-flight (Qq-TOF) mass spectrometer (Q-STAR Pulsar I, MDS Sciex/ABI, Toronto, Canada). Prior to analysis, the reflectron-TOF of the instrument was calibrated externally, giving a resolution of ∼7000-8000 FWHM. Samples were introduced by spray tip with a backing pressure of 1-8 atm. Spray voltage was 1800 V. The average MW of the protein biomarker was calculated by deconvolution of the charge state envelope using the Bayesian protein reconstruct provided with the instrument software. The average MW of the protein biomarker was calculated from the average of 3-4 separate measurements of a re-calibrated TOF analyzer. The predicted average MW of a protein biomarker was calculated from its genomically derived amino acid sequence (if available) using GPMAW software.

Table 2. Sources and Locations of Three Strains of Campylobacter jejuni

strain    synonym          species      source    location
RM1221    ATCC BAA-1062    C. jejuni    chicken   USA (CA)
RM1859                     C. jejuni    unknown   unknown
RM3782                     C. jejuni    human     South Africa

Nanoflow-HPLC MS/MS. Tryptic peptides from in-gel digested protein biomarkers were separated chromatographically using a nanoflow HPLC system (LC Packings/Dionex, Sunnyvale, CA) interfaced to our Qq-TOF instrument. Samples were ionized by nano-electrospray ionization (nano-ESI) at a flow rate of 200-250 nL/min. A 10-15 µL volume of sample was preconcentrated by loading onto a 1 mm C18 “trap” column at a flow rate of 20 µL/min. After preconcentration, the “trap” column was brought in-line with the eluent of the analytical HPLC pump, which had a flow rate of 0.245 mL/min and a 1000-to-1 split, resulting in a flow of 200-250 nL/min through the “trap” column, analytical column, and spray tip. Mobile phase A was 0.5% glacial acetic acid diluted in HPLC grade water. Mobile phase B was 80% HPLC grade acetonitrile, 20% HPLC grade water, and 0.5% glacial acetic acid. The HPLC gradient was as follows: 0-12 min (92-20% A); 12-16 min (20-92% A); 16-29 min (92% A); 29-29.1 min (92% A). The analytical column used was a C18 monomeric column. Data were acquired using the data-dependent scanning of the instrument software. A survey scan preceding the product ion scan determines “on-the-fly” the m/z and z of the tryptic peptide precursor ion. A software script is used to determine the optimum collision energy for the collision-induced fragmentation of a tryptic peptide during the product ion scan depending on its m/z and z.

Database Searches. The WIFF acquisition files created by the Qq-TOF instrument software were processed to MGF files using a WIFF-to-DTA batch converter.29 The DTA files containing MS/MS data were then used in a search against a flat file containing amino acid sequences of all eubacterial proteins encoded in the NCBI nonredundant database. In-house versions of the MASCOT30 and Global Proteome Machine31 search engines were used for database searching. Given the high number of basic residues found in these proteins, five possible missed cleavages were used in the search.
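To illustrate why allowing several missed cleavages matters for proteins rich in basic residues, the following Python sketch performs an in-silico tryptic digest with a configurable number of missed cleavages. It assumes the commonly used rule that trypsin cleaves C-terminal to lysine or arginine except when the next residue is proline; this rule, the function name, and the toy sequences are assumptions for illustration and do not reproduce the internal behavior of the MASCOT or Global Proteome Machine search engines.

```python
import re


def tryptic_peptides(sequence, max_missed=5):
    """Return tryptic peptides of `sequence` with up to `max_missed` missed cleavages.

    Assumed cleavage rule: cut after K or R unless the next residue is P.
    """
    # Positions (end indices) where trypsin is assumed to cut.
    sites = [m.end() for m in re.finditer(r"[KR](?!P)", sequence)]
    # Fragment boundaries: start, every internal cut site, and the end.
    bounds = [0] + [s for s in sites if s < len(sequence)] + [len(sequence)]
    peptides = []
    last = len(bounds) - 1
    for i in range(last):
        # Joining j - i consecutive fragments skips j - i - 1 cleavage sites.
        for j in range(i + 1, min(i + 1 + max_missed, last) + 1):
            peptides.append(sequence[bounds[i]:bounds[j]])
    return peptides


if __name__ == "__main__":
    toy = "AKKGR"  # hypothetical lysine-rich stretch
    print(tryptic_peptides(toy, max_missed=0))
    print(tryptic_peptides(toy, max_missed=5))
```

With no missed cleavages, a cluster of adjacent lysines yields very short peptides that carry little sequence information; permitting missed cleavages adds the longer peptides that span such clusters to the search space.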

Results

Nitrate Reduction Assay of Three Strains of Campylobacter jejuni. Three C. jejuni strains, RM1221, RM1859, and RM3782, were analyzed by the nitrate reductase assay. The assay indicated that RM1221 and RM1859 were C. jejuni subspecies jejuni whereas RM3782 was C. jejuni subspecies doylei. Table 2 shows the sources and locations of the three strains. Table 3 summarizes the results of the nitrate reductase assay.

MALDI-TOF MS of Three Strains of Campylobacter jejuni. The three C. jejuni strains were also analyzed by MALDI-TOF MS; the spectra are shown in Figure 1. The protein biomarkers observed in the three spectra show similarities as well as differences. In some cases, biomarker mass shifts are observed

Table 3. Results of the Nitrate Reductase Assay for Three Strains of Campylobacter jejuni: RM1221, RM1859, and RM3782 (a)

strain       color change after addition   color change after   positive for        identification
             of reagents A and B           addition of zinc     nitrate reduction
NCTC11168    Yes                           N/A                  Yes                 C. j. jejuni
ATCC49350    No                            Yes                  No                  C. j. doylei
RM1221       Yes                           N/A                  Yes                 C. j. jejuni
RM1859       Yes                           N/A                  Yes                 C. j. jejuni
RM3782       No                            Yes                  No                  C. j. doylei

(a) Results of controls are also provided in the first two rows.
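The interpretation of the nitrate test described in Materials and Methods, applied to the rows of Table 3, can be summarized in a short Python sketch. The function names are hypothetical, and the mapping from nitrate reduction to subspecies simply restates the criterion used in this study; it is not a general diagnostic tool.

```python
def nitrate_reduced(color_after_reagents, color_after_zinc=None):
    """Interpret the nitrate disk test.

    color_after_reagents: True if reagents A and B turned the sample red.
    color_after_zinc: True/False for the zinc step, or None if it was not run.
    """
    if color_after_reagents:
        # Nitrite is present, so nitrate was reduced to nitrite.
        return True
    if color_after_zinc is None:
        raise ValueError("the zinc step is needed when reagents A and B give no color")
    # Zinc chemically reduces any remaining nitrate to nitrite. A color change
    # therefore means the nitrate was still intact (no bacterial reduction);
    # no color change means nitrate had already been reduced past nitrite.
    return not color_after_zinc


def subspecies(color_after_reagents, color_after_zinc=None):
    """Assign the subspecies using the criterion applied in this study."""
    if nitrate_reduced(color_after_reagents, color_after_zinc):
        return "C. j. jejuni"
    return "C. j. doylei"


if __name__ == "__main__":
    # (strain, color after reagents A and B, color after zinc) from Table 3
    table3 = [
        ("NCTC11168", True, None),
        ("ATCC49350", False, True),
        ("RM1221", True, None),
        ("RM1859", True, None),
        ("RM3782", False, True),
    ]
    for strain, reagents, zinc in table3:
        print(strain, subspecies(reagents, zinc))
```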

when comparing spectra. For example, a biomarker ion at m/z ≈ 7035 is present in all three spectra; however, a biomarker ion at m/z ≈ 11 200 is present in the spectra of strains RM1221 and RM1859, but in the spectrum of RM3782 it appears “shifted” to m/z ≈ 11 260. Doubly protonated protein biomarkers are identified by (+2). An asterisk (*) denotes a likely contaminant: the α- and β-chains of horse hemoglobin, from the blood agar media used to grow Campylobacter. The protein biomarker ions at m/z 10 277 (m/z 5139, +2) and m/z 10 305 (m/z 5153, +2) in the spectra of RM1221 and RM1859, respectively, had been identified previously, by proteomic analysis, to be the DNA-binding protein HU. The difference in HU MW between RM1221 and RM1859 was found to be due to a 94Gly → Ser substitution.23 Biomarker Extraction. A similar process of extraction and proteomic identification was performed on the other protein biomarkers. Table 4 summarizes the protein biomarkers that were successfully extracted and identified. Besides the previously reported DNA-binding protein HU of RM1221 and RM1859,23 11 new biomarkers were extracted and identified for RM1221, 7 new biomarkers for RM1859, and 6 new biomarkers for RM3782. Six of the seven new biomarkers extracted and identified for RM1859 were identical to a biomarker extracted and identified for RM1221. Only two of the six biomarkers of RM3782 were identical to a protein biomarker of RM1221. Those biomarkers not recovered (NR) were nonetheless assigned tentatively if they were identified in at least one strain. Many of the RM3782 biomarkers identified appear to have undergone a mass shift with respect to their equivalent RM1221 and RM1859 protein biomarkers. In the calculation of the MW, we have also taken into account the contribution of disulfide bonds (S-S) and reduced cysteines (-SH). Proteomic identification was confirmed by both GPM and Mascot analysis. Sequencing the Biomarker Genes of RM3782. As mentioned previously, of the three C.
jejuni strains analyzed in this study, only the RM1221 genome has been fully sequenced and annotated.24 Because of the predicted high sequence similarity between RM1221 and RM1859, it was possible to identify the protein biomarkers of RM1859 proteomically without having to sequence the putative biomarker genes. However, we sequenced the putative biomarker genes of RM3782 because this strain appeared to have only a few biomarker masses in common with RM1221. Table 5 shows the amino acid sequences of the unmodified protein biomarkers of RM1221, RM1859 (from refs 23 and 24), and RM3782 (this work), as predicted by DNA sequencing. Boxed residues highlight amino acid differences between the biomarker proteins. Of special note is the transthyretin-like periplasmic protein biomarker of

RM3782 whose translation is terminated prematurely after the 21st residue due to a stop codon caused by a guanine to thymine transversion at position 64 in the biomarker gene. From the results presented in Table 5, it is apparent that the mass “shifts” observed in the RM3782 protein biomarkers (Table 4) are the result of amino acid substitutions caused by nonsynonymous point mutations in the biomarker genes. Synonymous mutations would not result in a residue substitution. Post-Translational Modifications of Protein Biomarkers. Several protein biomarkers (identified with an asterisk in Table 4) were found to be post-translationally modified. The predicted MWs of both the unmodified and modified (*) protein biomarkers are shown if they were known (or could be deduced logically from the MS and MS/MS data). The most frequent post-translational modification (PTM) observed among the protein biomarkers extracted was N-terminal methionine (N-Met) cleavage. N-Met cleavage is common among bacterial proteins18,32-34 and was observed for ribosomal proteins L7/L12, L24, and S15, thioredoxin, and the translation initiation factor IF-1. As shown in Table 4, N-Met cleavage was confirmed by the difference in mass between the measured MW and the predicted MW (-131 Da). N-Met cleavage was also confirmed from MS/MS of the tryptic peptides of the extracted protein. Table 6 shows the N-terminal sequences for these biomarkers and the tryptic peptides sequenced by MS/MS. The penultimate residue is boxed because it is known that this residue has an effect on the probability of N-Met cleavage.18,32-34 This issue will be explored further in the Discussion section. We also detected another PTM that involved simple truncation of the polypeptide chain. A biomarker ion at m/z ≈ 13 730 was observed in the MALDI-TOF spectra of RM1221 and RM1859 in Figure 1.
Analysis of the extracted proteins indicated that the biomarker was a transthyretin-like periplasmic protein with a predicted average MW of 15920.2 Da. Cleavage of the first 20 residues of the polypeptide chain (signal peptide) resulted in a truncated protein with a predicted average MW of 13 727.4 Da, which is within the experimental error of the measured MW (Table 4). In addition, our MS/MS results confirmed the 20-residue truncation of this polypeptide, as shown in Table 6. It is also worth noting that truncation of this protein occurred as a result of cleavage between the nonbasic residues 20Ala and 21Thr and, thus, was unlikely to be a result of trypsin digestion. Cleavage of the polypeptide between 20Ala and 21Thr, the absence of MS/MS sequence coverage for residues 1-20, and the MW of the biomarker suggest that the N-terminus of the truncated polypeptide is 21Thr.

Figure 1. MALDI-TOF MS of cell lysates of C. jejuni RM1221, RM1859, and RM3782.

A more complicated PTM was observed for the RM1221 protein biomarker at m/z 9554 (Figure 1). The MALDI-TOF MS of the HPLC fraction gave an m/z of 9552 (data not shown). When this HPLC fraction was analyzed by HR-ESI-MS, it produced a more accurate average MW value of 9548.9 ± 0.2 Da. When this HPLC fraction was processed by 1-D PAGE, in-gel digested, and its tryptic peptides analyzed by LC/MS/MS, the protein identified was found to be cytochrome c553 of RM1221. The predicted average MW of cytochrome c553 for strain RM1221 is 10837.8 Da when the three cysteine residues are in their reduced state (-SH). However, cytochrome c553 is known to undergo significant post-translational modifications consistent with its biological role in electron transport. For instance, cytochrome c553 is known to undergo a 19-residue cleavage (signal peptide) from the N-terminal side of the polypeptide chain.35 Such a 19-residue cleavage was confirmed by our MS/MS data as shown in Tables 6 and 7. Cleavage of the polypeptide occurs between two nonbasic residues (19Ala and 20Ala), suggesting that the N-terminus of the truncated polypeptide is 20Ala. The predicted average MW of this truncated polypeptide would then be 8934.3 Da (with the remaining two cysteines in their reduced state). Further modification of the protein involves covalent attachment of heme to the polypeptide via two thioether bonds formed from reduction of two heme double bonds by the two cysteine thiols of the truncated polypeptide. The average MW of heme (unattached to the polypeptide) is 616.5 Da (C34H32O4N4Fe). When the mass of the heme moiety is added to that of the truncated polypeptide (8934.3 Da + 616.5 Da), we obtain an average mass of 9550.8 Da for the entire complex, which is approximately 2 Da higher than the HR-ESI-MS measurement of 9548.9 ± 0.2 Da. The 2 Da discrepancy could be due to some other, as yet unidentified, PTM. Table 7 also shows that we did not obtain MS/MS tryptic peptide coverage for the part of the c553 protein that includes the two cysteine residues (C) whose


Table 4. Protein Biomarkers of Strains RM1221, RM1859, and RM3782 That Were Successfully Extracted and Identified^a (protein biomarkers marked with * carry a PTM)

HUb

RM1221 MALDI (m/z) 10277

50S* L7/L12

12905

HR-ESI-MS 10273.9 12899.8 Ave MW (Da) ( 0.2 ( 0.2

50S* L24

trans* init. IF-1

CP*c (similar cj1164)

8154

8461

8272

8151.6 ( 0.1

8458.1 8269.1 ( 0.2 ( 0.2

30S S16

8681

cyto.* c553

9554

CPc trans* (similar thyretinlike cj1225)

NOd

8677.8 9548.9 ( 0.2 ( 0.2

predicted 10274.0 13031.0 8283.0 8300.8 10258.6 8678.2 10837.8 ave MW (Da) 12899.8* (-SH) (-SH) (Two S-S) (3 -SH) 8151.8* 9550.8* GPM log(e) mascot

-160.2

RM1859 MALDI (m/z) 10305

-180.4 1518

-28.1 93

-15.7 137

-2.2 65

-9.3 145

-47.3 220

12901

NOd

8460

8270

8679

NOd

NR

8269.3 ( 0.3

NR

HR-ESI-MS 10304.0 12900.0 Ave MW (Da) ( 0.2 ( 0.2 predicted 10304.0 13031.0 ave MW (Da) 12899.8* GPM log(e) mascot

-214.5

RM3782 MALDI (m/z) 10276

10258.6 (Two S-S)

-296.2 2244 12889

HR-ESI-MS 10273.9 12886.0 ave MW (Da) ( 0.4 ( 0.1

-78.2 423

-137.6 1222

10098 10094.7 ( 0.2

15920.2 7033.5 9458.0 13727.4*

11335.0 10226.1 (S-S,-SH) 10094.9* 11203.8*

-60.4 379

-79.4 518

-17.8 248

-60.3 438 9459

-34.1 304

7035

9551.1

15920.2 7033.5 9458.0 13727.4*

11335.0 10226.1 (S-S,-SH) 10094.9* 11203.8*

-2.0 166

-65.8 538

-20.1 322

-59.0 460

-65 474

NOd

7034

9492

8475

8258

8680

9587

9587

NR

NR

NR

NR

NR

NR

NR

11208 11204.0 ( 0.1

7033.1 9458.2 ( 0.2 ( 0.2

8181

NR

9460

13727.6 ( 0.2

NR

NR

7036

7033.2 9458.0 ( 0.0 ( 0.5

30S* S15

13728

NR

NR

13732 13727.4 ( 0.3

thio*redoxin

9551

NR

NR

chaperonin

9551.2 ( 0.1

predicted 10274.0 13016.9 8310.0 8300.8 9794.1 8678.2 10871.8 9583.1 ave MW (Da) 12885.7* (-SH) (-SH) (Two S-S) (3 -SH) 8178.8* 9584.9* GPM log(e) mascot

50S L29

NR

NTe

11205

10096

11204.1 ( 0.2

10094.9 ( 0.5

-20.0 261

11263

10142

7081.4 9490.1 ( 0.3 ( 0.1

11261.9 ( 0.2

No ESI

7033.5 9490.1

11393.0 10269.1 (S-S,-SH) 10137.9* 11261.8*

-15.8 183

-55.9 421

-53.1 424

-20.1 253

^a Those biomarkers not recovered (NR) were nonetheless assigned tentatively if they were identified in one strain. Many of the RM3782 biomarkers appear to have undergone a mass shift with respect to the masses of the equivalent biomarkers for strains RM1221 and RM1859. The oxidation states of protein cysteines are indicated (S-S or -SH or both) and incorporated into the MW calculation. ^b HU for RM1221 and RM1859 was previously reported in ref 23. ^c CP: conserved protein. ^d NO: not observed. ^e NT: not translated.

side-chains bond covalently to the heme moiety. Heme attachment would necessarily complicate peptide backbone fragmentation of those tryptic peptide(s) to which the heme is bound (in addition to increasing the precursor ion mass by ∼600 Da), thus complicating confirmation by database searching. It is interesting to note that cytochrome c553 (heme, polypeptide, and metal atom) appeared to survive intact through protein extraction, chromatographic separation, and ESI analysis. In particular, retention of the Fe atom by the protein suggests it is tightly bound to the porphyrin, perhaps facilitated by significant retention of the polypeptide’s tertiary structure. Figure 2 shows the HR-ESI-MS of the HPLC fraction containing the cytochrome c553 biomarker (and the transthyretin biomarker). The protonated charge states of cytochrome c553 are enumerated. The relatively low charge state maximum and the narrow charge state distribution (especially on the high charge side) suggest a folded (or partially folded) tertiary structure, indicating that the native state of this protein appears to have survived largely intact during extraction, chromatographic separation, and transfer into the gas phase by ESI (ref 36). It is also interesting to note that the RM1859 protein biomarker at m/z 9551 was identified not as cytochrome c553 but as a conserved protein (similar to cj1225). This was the only case of two identified protein biomarkers whose MWs were within the experimental error of our externally calibrated MALDI-TOF MS instrument.

The RM1221 protein biomarker at m/z 8461 (Figure 1) was identified after extraction and analysis as the translation initiation factor IF-1, which has a predicted average MW of 8300.8 Da. MS/MS of the biomarker’s N-terminal peptide confirmed N-Met cleavage of this protein (Table 6), resulting in a MW of 8169.6 Da. The HR-ESI-MS MW of this biomarker was 8458.1 ± 0.2 Da (Table 4). The measured MW thus exceeds the predicted MW by 288.5 Da. Presumably, this protein undergoes further post-translational modification, but we were unable to identify the other PTMs. Until the exact PTMs have been determined, this identification should be considered tentative.

The protein biomarker at m/z 8272 in strain RM1221 was identified as the conserved protein (similar to cj1164), which has a predicted average MW of 10258.6 Da. HR-ESI-MS of the extracted biomarkers gave a MW of 8269.1 ± 0.2 Da for strain RM1221 and a MW of 8269.3 ± 0.3 Da for strain RM1859. MS/MS confirmed that this protein did not undergo N-Met cleavage (Table 8). In consequence, we examined the possibility of simple truncation of the polypeptide from the C-terminal end of the protein. However, cleavage of the polypeptide from the C-terminus did not result in a MW consistent with the observed value. In consequence, an additional PTM must be involved for this biomarker beyond polypeptide truncation.

Table 5. Predicted Protein Biomarker Sequences for Strains RM1221, RM1859 (from Refs 23 and 24), and RM3782 (This Work), as Determined by DNA Sequencing^a

^a Boxed residues highlight biomarker sequence differences between strains. The gene that encodes the protein biomarker (where known) is shown in parentheses next to the protein. The NCBI GenBank accession numbers of the biomarker genes are given in parentheses next to the sequence. Note the translation errors in the transthyretin-like periplasmic protein (black dots indicate stop codons).

Finally, the 50S ribosomal protein L29 biomarker was observed in all three strains at m/z 7036 (RM1221), m/z 7035 (RM1859), and m/z 7034 (RM3782). However, the HR-ESI-MS of this biomarker for strain RM3782 gave a MW of 7081.4 ± 0.3 Da. Unusually rapid oxidation of the three methionine residues of this protein in its HPLC fraction resulted in a mass increase of 48 Da over that of the unoxidized protein. Because of the rapidity of the methionine oxidation in this HPLC fraction, we were unable to obtain HR-ESI-MS of the unoxidized protein. However, methionine oxidation was included as a potential modification in the search parameters for all MS/MS data and, in consequence, MS/MS identification of the biomarker was not affected.
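The mass bookkeeping behind the PTM assignments in this section can be checked with a few lines of arithmetic. The following is a minimal Python sketch, not part of the original analysis. It uses only masses quoted in the text (for example, 8934.3 Da for the truncated cytochrome c553 polypeptide and 616.5 Da for heme). The constant names, the helper `delta`, and the rounded values of 131.2 Da for a methionine residue (implied by 8300.8 Da versus 8169.6 Da) and 16.0 Da per added oxygen are assumptions made for illustration.

```python
# Average masses in Da. MET_RESIDUE and OXYGEN are rounded, illustrative values.
MET_RESIDUE = 131.2   # mass lost on N-terminal Met cleavage (assumed rounding)
HEME = 616.5          # heme, C34H32O4N4Fe, as quoted in the text
OXYGEN = 16.0         # one oxygen per oxidized methionine (assumed rounding)


def delta(measured: float, predicted: float) -> float:
    """Return measured minus predicted mass, rounded to 0.1 Da."""
    return round(measured - predicted, 1)


# Cytochrome c553 (RM1221): truncated polypeptide plus heme vs. HR-ESI-MS.
c553_predicted = round(8934.3 + HEME, 1)
c553_gap = delta(9548.9, c553_predicted)

# IF-1 (RM1221): mass still unexplained after N-Met cleavage.
if1_after_nmet = round(8300.8 - MET_RESIDUE, 1)
if1_gap = delta(8458.1, if1_after_nmet)

# L29 (RM3782): predicted MW plus three oxidized methionines vs. HR-ESI-MS.
l29_oxidized = round(7033.5 + 3 * OXYGEN, 1)
l29_gap = delta(7081.4, l29_oxidized)

for label, gap in [("c553", c553_gap), ("IF-1", if1_gap), ("L29", l29_gap)]:
    print(f"{label}: measured - predicted = {gap:+.1f} Da")
```

Under these assumptions the sketch reproduces the roughly 2 Da shortfall for cytochrome c553 and the 288.5 Da excess for IF-1 reported above, and it shows that three methionine oxidations account for the L29 value to within the stated measurement error.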

Discussion

Relatedness of Strains RM1221, RM1859, and RM3782. The number of protein biomarkers common to RM1221 and RM1859 suggests that these two strains are closer phylogenetically to each other than either is to RM3782. RM3782 shares only a few biomarkers with RM1221 and RM1859. Many of the RM3782 biomarkers (both identified and those not recovered) appear shifted in mass by 15-60 Da compared to their equivalent biomarkers in RM1221 and RM1859. As mentioned previously, these mass shifts correspond to amino acid substitutions in the biomarker that are due to nonsynonymous point mutations in the biomarker gene (synonymous mutations would not result in a residue substitution). These amino acid substitutions (and their underlying genetic differences) suggest a more distant phylogenetic relationship between RM3782 and the other two strains in this study. Consistent with these observations, a nitrate reduction test (refs 27, 28) revealed strain RM3782 to be C. jejuni subsp. doylei, whereas strains RM1221 and RM1859 are C. jejuni subsp. jejuni.

The “Missing” Biomarker in C. jejuni subsp. doylei Strain RM3782. Our laboratory reported previously the absence of a biomarker ion at m/z ≈ 13 700 in two strains of C. j. subsp. doylei: RM2095 and RM2096 (ref 22). This biomarker ion was present in most (but not all) C. j. subsp. jejuni strains in our collection, but it was absent in all C. j. subsp. doylei strains (including RM3782). Subsequent proteomic identification of this biomarker as a transthyretin-like periplasmic protein in C. j. jejuni strains allowed sequencing of its gene in RM3782, revealing many nonsense mutations that would have prevented full translation, which explains its absence from the MALDI-TOF MS spectrum. The fact that some C. j. jejuni strains are also missing this biomarker suggests the following possibilities: (1) the protein is not being translated due to a gene error (as in C. j. doylei); (2) the protein is not highly expressed; (3) the protein is not being ionized efficiently by MALDI. We are currently sequencing the gene that encodes this transthyretin-like periplasmic protein from our C. j. doylei strains and those C. j. jejuni strains that are missing this biomarker in MALDI-TOF spectra. The results will be reported in a subsequent communication.

Table 6. Cleavage of the Polypeptide Chain from the N-terminal Side of Selected Protein Biomarkers, as Confirmed by MS/MS of Biomarker Tryptic Peptides^a

^a The partial N-terminal sequences of proteins of strains RM1221 and RM3782 are in gray script. Penultimate residues or signal peptides are boxed. Tryptic peptides (and their charge states) sequenced by MS/MS and database searching are in black script.

Table 7. Unmodified (Signal Peptide Boxed) and Truncated Polypeptide Sequences for C. jejuni RM1221 Cytochrome c553 and Their Calculated MWs^a

^a Sequence confirmed by MS/MS of tryptic peptides is highlighted in black. Note the 19-residue truncation of the polypeptide between two nonbasic residues: 19Ala and 20Ala. The cysteines that covalently bond to the heme are underlined. The calculated average MW of heme + truncated polypeptide is 9550.8 Da, which is about 2 Da higher than the HR-ESI-MS average MW of the biomarker (9548.9 ± 0.2 Da).

N-terminal Methionine Cleavage Rule. Table 8 summarizes results that confirm the presence of the N-terminal methionine in selected protein biomarkers of RM1221, RM1859, and RM3782. N-Met retention is confirmed by both MS and MS/MS in most cases. In a few cases, N-Met retention could be confirmed only from the overall MW of the protein biomarker. The N-terminal methionine cleavage rule states that whether the N-terminal methionine is post-translationally cleaved depends on the penultimate residue (refs 18, 32-34). Following this rule, N-Met is always cleaved if the penultimate residue is Ala, Gly, Pro, Ser, or Thr. N-Met is not cleaved if the penultimate residue is Arg, Asn, Ile, Leu, Lys, or Phe. N-Met may (or may not) be cleaved if the penultimate residue is Cys, His, Met, Trp, Tyr, Asp, Glu, Gln, or Val. The results summarized in Tables 6 and 8 are completely consistent with this rule, except for the DNA-binding protein HU and the 30S ribosomal protein S16. For these two biomarkers, the penultimate residue is threonine (Thr), which, according to the N-Met cleavage rule, should lead to N-Met cleavage. However, MS and/or MS/MS data clearly show that the N-terminal methionine is present in these two proteins. We previously reported extraction and identification of the DNA-binding protein biomarker HU for C. jejuni, C. coli, C. lari, and C. upsaliensis strains (ref 23). In all of these species and strains, the penultimate residue was threonine, and in all cases the N-terminal methionine was present in HU even though the N-Met cleavage rule would predict post-translational cleavage of the N-terminal methionine. If there is a methionine aminopeptidase that specifically cleaves N-Met when the penultimate residue is threonine, it would seem to be absent from C. jejuni and perhaps from all thermophilic campylobacters. Alternatively, the primary methionine aminopeptidase of Campylobacter may be altered in its specificity and thus will not cleave polypeptides with penultimate N-terminal threonine residues.

Figure 2. HR-ESI-MS of the HPLC fraction containing the cytochrome c553 biomarker of strain RM1221 and the transthyretin-like periplasmic protein (b). Cytochrome c553 is identified by its enumerated charge states and its MW, which is calculated from deconvolution of its charge state envelope. The relatively low charge state maximum and the narrow charge state distribution (along the high charge side) suggest a folded (or partially folded) tertiary structure.

Table 8. MS/MS Confirmation of N-Met Presence in Selected Protein Biomarkers^a

^a Partial N-terminal sequences of proteins of strains RM1221 and RM3782 are in gray script (penultimate residue boxed). Tryptic peptides sequenced by MS/MS and their charge states are in black script. MS/MS of DNA-binding protein HU of RM1221 and RM1859 was reported previously in ref 23.

Post-Translational Modifications. Protein biomarkers that were extracted and identified from more than one strain were found to have the same PTM (or lack of a PTM) for all the strains analyzed in this study. Mass shifts detected among biomarkers across different strains appear to be due to amino acid substitutions of the primary polypeptide chain and not due to variability of the PTMs. Given the extensive PTMs detected in some of the biomarkers, the proteins identified in this study appear to represent high copy, intact proteins reflective of not only the organism’s genetic identity but also its biological processes during normal growth. This conclusion is supported by the detection and analysis of numerous proteins involved in protein synthesis as well as the retention of post-translational modifications in many of the proteins identified.
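The N-Met cleavage rule described in this Discussion, together with the threonine exception observed for HU and S16, can be written as a small lookup. The sketch below is illustrative only: the function name `predict_nmet_cleavage`, the `thr_exception` flag, and the example sequences are hypothetical and are not the actual biomarker sequences; only the three residue groups come from the text.

```python
from typing import Optional

# Residue groups as stated in the text (one-letter codes).
ALWAYS_CLEAVED = set("AGPST")     # Ala, Gly, Pro, Ser, Thr
NEVER_CLEAVED = set("RNILKF")     # Arg, Asn, Ile, Leu, Lys, Phe
VARIABLE = set("CHMWYDEQV")       # Cys, His, Met, Trp, Tyr, Asp, Glu, Gln, Val


def predict_nmet_cleavage(sequence: str, thr_exception: bool = False) -> Optional[bool]:
    """Predict N-Met cleavage from the penultimate residue.

    Returns True (cleaved), False (retained), or None (rule is indeterminate).
    With thr_exception=True, a penultimate Thr is treated as "retained",
    mimicking the behavior reported here for HU and S16.
    """
    if len(sequence) < 2 or sequence[0] != "M":
        raise ValueError("expected a sequence of at least two residues starting with Met")
    penultimate = sequence[1]
    if thr_exception and penultimate == "T":
        return False
    if penultimate in ALWAYS_CLEAVED:
        return True
    if penultimate in NEVER_CLEAVED:
        return False
    if penultimate in VARIABLE:
        return None
    raise ValueError(f"unrecognized residue: {penultimate!r}")


# Hypothetical N-terminal fragments, used only to exercise the three branches.
for seq in ("MTKAD", "MKAAA", "MDAAA"):
    print(seq, predict_nmet_cleavage(seq), predict_nmet_cleavage(seq, thr_exception=True))
```

Making the exception an explicit per-organism switch, rather than editing the residue sets, keeps the general rule intact for organisms where it holds.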


Protein Biomarker “Identification” by Genomic Generated Protein MWs. Table 9 summarizes the tentative assignment of biomarker ions from the MALDI-TOF MS spectrum of RM1221. Protein assignment is based upon comparison of the m/z of the biomarker ion (which we have found to be accurate to within ±5-10 Da for our externally calibrated MALDI-TOF MS) to the protein MWs as derived from the fully sequenced and annotated genome of RM1221. Those proteins whose assignments were confirmed definitively are highlighted in black. In selecting potential candidates for the biomarker ions, we have rigorously applied the N-Met cleavage rule (the only exception to its application is when the penultimate residue is Thr). Several biomarker ions have multiple tentative assignments. It is interesting to find that four of the biomarkers do not conform to either an unmodified protein or one with N-terminal cleavage. For instance, the biomarker ion at m/z 9554 has two potential identifications: the unmodified conserved protein (similar to cj1225) with a MW of 9551 Da and an N-Met cleaved proton/peptide symporter family protein (with a MW after N-Met cleavage of 9681 - 131 = 9550 Da). However, neither of those two assignments was found to be correct. After extraction and identification, the biomarker was determined to be cytochrome c553 with an unmodified MW of 10 838 Da but a post-translationally modified MW of 9549 Da. Previously, our laboratory reported the tentative assignment of a C. jejuni biomarker ion at m/z 13 730 as the 30S ribosomal protein S13 (predicted MW of 13 735 Da) (ref 22). Given the mass accuracy of the externally calibrated MALDI-TOF MS instrument (±5-10 Da) used in those measurements, and given that there were no other proteins close to the observed MW in the database, this assignment appeared reasonable. However, subsequent extraction and analysis of this biomarker, as discussed earlier, revealed it to be a transthyretin-like periplasmic protein with an unmodified MW of 15 921 Da, but which had undergone removal of its 20-residue signal peptide, resulting in a post-translationally modified MW of 13 727 Da. Thus, a simple comparison of the database of protein MWs to the biomarker ion m/z can easily lead to misidentification of the biomarker and lowered confidence in the identification of the pathogen.

Table 9. Prominent Protein Biomarker Ions (m/z) in the MALDI-TOF MS Spectrum of RM1221^a

^a Tentative protein assignments (in gray) of a biomarker ion are based on hypothetical MWs (genomic database) and strict adherence to the N-terminal Met cleavage rule (refs 18, 32-34). Exceptions to the N-Met cleavage rule are highlighted with an asterisk (*). Confirmed identification of the biomarker ion is given in black. The partial N-terminal sequence of the protein is provided with the penultimate residue boxed. CP refers to conserved protein.

Minor Note: In the process of completing this work, we found that the calculation of the nominal mass of as many as ∼50% of the biomarkers identified in the NCBI nr database was incorrect for C. jejuni strain RM1221. However, we found no errors in the amino acid sequences of these proteins.
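The pitfall of assigning a biomarker ion purely by comparing its m/z with database MWs can be illustrated with the m/z 9554 example discussed in this section. The sketch below is a simplified illustration under stated assumptions: the `Candidate` type, the `match` function, and the 10 Da tolerance are hypothetical choices, the MWs are the rounded values quoted in the text, 131 Da is used for N-Met loss as in the text, and the roughly 1 Da difference between a singly protonated ion and the neutral MW is ignored because it is well inside the tolerance.

```python
from typing import Iterable, List, NamedTuple, Optional

MET_LOSS = 131.0  # Da, N-terminal methionine loss as used in the text


class Candidate(NamedTuple):
    name: str
    unmodified_mw: float
    # True/False: N-Met cleavage as predicted by the rule; None: try both forms.
    nmet_cleaved: Optional[bool] = None


def expected_masses(c: Candidate) -> List[float]:
    """Masses considered for a candidate: with, without, or both N-Met forms."""
    if c.nmet_cleaved is True:
        return [c.unmodified_mw - MET_LOSS]
    if c.nmet_cleaved is False:
        return [c.unmodified_mw]
    return [c.unmodified_mw, c.unmodified_mw - MET_LOSS]


def match(mz: float, candidates: Iterable[Candidate], tol: float = 10.0) -> List[str]:
    """Return the names of candidates with any expected mass within tol of mz."""
    return [c.name for c in candidates
            if any(abs(m - mz) <= tol for m in expected_masses(c))]


CANDIDATES = [
    Candidate("conserved protein (similar to cj1225)", 9551.0, False),
    Candidate("proton/peptide symporter family protein", 9681.0, True),
    Candidate("cytochrome c553", 10838.0, None),
]

# MW-only matching proposes two plausible but incorrect assignments.
hits = match(9554.0, CANDIDATES)

# Only when the PTM-adjusted mass (signal peptide removed, heme attached)
# is supplied does the confirmed protein fall within tolerance.
ptm_hits = match(9554.0, [Candidate("cytochrome c553 (processed)", 9549.0, False)])
print(hits, ptm_hits)
```

In this sketch the true identity, cytochrome c553, is never proposed from its genomic MW alone, with or without N-Met loss, which is the misidentification risk that the section describes.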

Conclusions

The conclusions of this study are as follows. Shifts in biomarker mass between subspecies (and strains) of C. jejuni appear to be due to amino acid substitutions caused by nonsynonymous mutations in the biomarker gene and not due to PTM variability. The frequency of biomarker mass shifts between strains appears to correlate with the phylogenetic relatedness of the strains. These mass shifts are currently being exploited for Campylobacter classification using our pattern recognition algorithm. PTMs for a particular protein biomarker appear to be consistently present across species, subspecies, and strains of Campylobacter. Protein biomarkers observed in MALDI-TOF spectra appear to represent intact, post-translationally modified proteins, e.g., cytochrome c553. The N-Met cleavage rule correctly predicted the presence (or absence) of N-Met cleavage for most of the biomarkers. However, the 30S ribosomal protein S16 and the previously reported DNA-binding protein HU do not follow the rule even though the penultimate residue of both biomarkers is threonine, which the rule predicts should result in N-Met cleavage. This exception to the N-Met cleavage rule suggests that the Campylobacter methionine aminopeptidase, responsible for N-Met cleavage, may have altered specificity, resulting in the absence of methionine cleavage when the penultimate residue is threonine. Given the number and variety of PTMs detected in this relatively small number of biomarkers, future bioinformatics algorithms for protein/pathogen identification may need to incorporate many more potential PTMs than previously anticipated. In addition, incorporation of N-Met cleavage into such algorithms will need to account for the methionine aminopeptidase specificity of the particular organism.
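As a rough illustration of how a strain-to-strain mass shift relates to an amino acid substitution, the sketch below lists the single-residue substitutions whose average residue-mass difference matches an observed shift. It is not the authors' method (the substitutions in this study were established by DNA sequencing, Table 5). The function name `substitutions_for_shift`, the 0.2 Da tolerance, and the use of standard average residue masses are assumptions. The example shift of 30.1 Da is the difference between the HR-ESI-MS values for HU as read from Table 4 (10304.0 Da for RM1859 and 10273.9 Da for RM1221).

```python
from itertools import permutations
from typing import List, Tuple

# Standard average residue masses in Da (amino acid minus water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}


def substitutions_for_shift(shift: float, tol: float = 0.2) -> List[Tuple[str, str, float]]:
    """List (from, to, mass difference) for single substitutions matching shift."""
    hits = []
    for old, new in permutations(RESIDUE_MASS, 2):
        diff = RESIDUE_MASS[new] - RESIDUE_MASS[old]
        if abs(diff - shift) <= tol:
            hits.append((old, new, round(diff, 2)))
    # Closest matches first.
    return sorted(hits, key=lambda h: abs(h[2] - shift))


observed_shift = round(10304.0 - 10273.9, 1)  # HU, RM1859 minus RM1221
for old, new, diff in substitutions_for_shift(observed_shift):
    print(f"{old} -> {new}: {diff:+.2f} Da")
```

Several different substitutions fall within the tolerance for the same shift, and combinations of two or more substitutions are not considered at all, so a mass shift by itself narrows the possibilities but does not identify the substitution; sequencing is still required.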

Acknowledgment. We thank Dr. William H. Vensel for assistance with the MASCOT software and Delilah Wood for the SEM image of Campylobacter for the table of contents. Mention of a brand or firm name does not constitute an endorsement by the U.S. Department of Agriculture over others of a similar nature not mentioned. This article is a US Government work and is in the public domain in the U.S.A.

Supporting Information Available: Sub-Speciating Campylobacter jejuni by Proteomic Analysis of Its Protein Biomarkers and Their Post-Translational Modifications. This material is available free of charge via the Internet at http://pubs.acs.org.

References
(1) Cain, T. C.; Lubman, D. M.; Weber, W. J., Jr. Rapid Commun. Mass Spectrom. 1994, 8, 1026-1030.
(2) Krishnamurthy, T.; Ross, P. L.; Rajamani, U. Rapid Commun. Mass Spectrom. 1996, 10, 883-888.
(3) Krishnamurthy, T.; Ross, P. L. Rapid Commun. Mass Spectrom. 1996, 10, 1992-1996.
(4) Holland, R. D.; Wilkes, J. G.; Rafii, F.; Sutherland, J. B.; Persons, C. C.; Voorhees, K. J.; Lay, J. O., Jr. Rapid Commun. Mass Spectrom. 1996, 10, 1227-1232.
(5) Arnold, R.; Reilly, J. Rapid Commun. Mass Spectrom. 1998, 12, 630-636.
(6) Welham, K.; Domin, M.; Scannell, D.; Cohen, E.; Ashton, D. Rapid Commun. Mass Spectrom. 1998, 12, 176-180.
(7) Haag, A.; Taylor, S.; Johnston, K.; Cole, R. J. Mass Spectrom. 1998, 33, 750-756.
(8) Wang, Z.; Russon, L.; Li, L.; Roser, D.; Long, S. R. Rapid Commun. Mass Spectrom. 1998, 12, 456-464.
(9) Dai, Y.; Li, L.; Roser, D.; Long, S. R. Rapid Commun. Mass Spectrom. 1999, 13, 73-78.
(10) Fenselau, C.; Demirev, P. A. Mass Spectrom. Rev. 2001, 20, 157-171.
(11) Lay, J. O., Jr. Mass Spectrom. Rev. 2001, 20, 172-194.
(12) Ramirez, J.; Fenselau, C. J. Mass Spectrom. 2001, 36, 929-936.
(13) Whiteaker, J.; Karns, J.; Fenselau, C.; Perdue, M. L. 2004, 1, 185-194.
(14) Jarman, K. H.; Cebula, S. T.; Saenz, A. J.; Petersen, C. E.; Valentine, N. B.; Kingsley, M. T.; Wahl, K. L. Anal. Chem. 2000, 72, 1217-1223.
(15) Wahl, K. L.; Wunschel, S. C.; Jarman, K. H.; Valentine, N. B.; Petersen, C. E.; Kingsley, M. T.; Zartolas, K. A.; Saenz, A. J. Anal. Chem. 2002, 74, 6191-6199.
(16) Demirev, P. A.; Ho, Y.-P.; Ryzhov, V.; Fenselau, C. Anal. Chem. 1999, 71, 2732-2738.
(17) Peneda, F. J.; Lin, J. S.; Fenselau, C.; Demirev, P. A. Anal. Chem. 2000, 72, 3739-3744.
(18) Demirev, P. A.; Lin, J. S.; Peneda, F. J.; Fenselau, C. Anal. Chem. 2001, 73, 4566-4573.
(19) Yao, Z.-P.; Demirev, P. A.; Fenselau, C. Anal. Chem. 2002, 74, 2529-2534.
(20) Peneda, F. J.; Antoine, M. D.; Demirev, P. A.; Feldman, A. B.; Jackman, J.; Longenecker, M.; Lin, J. S. Anal. Chem. 2003, 75, 3817-3822.
(21) http://www.cdc.gov/ncidod/dbmd/diseaseinfo/campylobacter_t.htm.
(22) Mandrell, R. E.; Harden, L. A.; Bates, A. H.; Miller, W. G.; Haddon, W. F.; Fagerquist, C. K. Appl. Environ. Microbiol. 2005, 71, 6292-6307.
(23) Fagerquist, C. K.; Miller, W. G.; Harden, L. A.; Bates, A. H.; Vensel, W. H.; Wang, G.; Mandrell, R. E. Anal. Chem. 2005, 77, 4897-4907.
(24) Fouts, D. E.; Mongodin, E. F.; Mandrell, R. E.; Miller, W. G.; Rasko, D. A.; Ravel, J.; Brinkac, L. M.; DeBoy, R. T.; Parker, C. T.; Daugherty, S. C.; Dodson, R. J.; Durkin, A. S.; Madupu, R.; Sullivan, S. A.; Shetty, J. U.; Ayodeji, M. A.; Shvartsbeyn, A.; Schatz, M. C.; Badger, J. H.; Fraser, C. M.; Nelson, K. E. PLoS Biology 2005, 3, e15, 72-85.
(25) Miller, W. G.; On, S. L. W.; Wang, G.; Fontanoz, S.; Lastovica, A. J.; Mandrell, R. E. J. Clin. Microbiol. 2005, 43, 2315-2329.
(26) Patton, C. M.; Barrett, T. J.; Morris, G. K. J. Clin. Microbiol. 1985, 22, 558-565.
(27) Wideman, P. A.; Citronbaum, D. M.; Sutter, V. L. J. Clin. Microbiol. 1977, 5, 315-319.
(28) Holdeman, L. V.; Cato, E. P.; Moore, W. E. C. Virginia Polytech. Inst. Anaerobic Manual, 4th ed.; VPI: Blacksburg, VA, 1977.
(29) Boehm, A. M.; Galvin, R. P.; Sickmann, A. BMC Bioinformatics 2004, 5, 162.
(30) Perkins, D. N.; Pappin, D. J.; Creasy, D. M.; Cottrell, J. S. Electrophoresis 1999, 20, 3551-3557.
(31) Craig, R.; Beavis, R. C. Bioinformatics 2004, 20, 1466-1467.
(32) Hirel, P. H.; Schmitter, J. M.; Dessen, P.; Fayet, G.; Blanquet, S. Proc. Natl. Acad. Sci. U.S.A. 1989, 86, 8247-8251.
(33) Gonzalez, T.; Baudouy, J. J. FEMS Microbiol. Rev. 1996, 18, 319-334.
(34) Solbiati, J.; Chapman-Smith, A.; Miller, J.; Miller, Ch.; Cronan, J., Jr. J. Mol. Biol. 1999, 290, 607-614.
(35) Koyanagi, S.; Nagata, K.; Tamura, T.; Tsukita, S.; Sone, N. J. Biochem. (Tokyo) 2000, 128, 371-375.
(36) Grandori, R.; Matecko, I.; Muller, N. J. Mass Spectrom. 2002, 37, 191-196.

PR050485W