
Multidimensional Proteomic Analysis of the Soluble Subproteome of the Emerging Nosocomial Pathogen Ochrobactrum anthropi

Robert Leslie James Graham,*,† Catherine E. Pollock,† S. Naomi O’Loughlin,† Nigel G. Ternan,† D. Brent Weatherly,‡ Philip J. Jackson,§ Rick L. Tarleton,‡ and Geoff McMullan†

School of Biomedical Sciences, University of Ulster, Coleraine, County Londonderry, BT52 1SA, United Kingdom, The Center for Tropical and Emerging Global Diseases, University of Georgia, Athens, Georgia 30605, and Applied Biosystems, Lingley House, 120 Birchwood Boulevard, Warrington, WA3 7QH, United Kingdom

Received June 16, 2006

We report the first large-scale gel-free proteomic analysis of the soluble subproteome of the emerging pathogen Ochrobactrum anthropi. Using our robust offline multidimensional protein identification protocol, we initially identified a total of 57 280 peptides with automated MS/MS analysis software. We describe our investigation of the heuristic protein validation tool PROVALT and demonstrate its ability to increase the speed and accuracy of the curation process of large-scale proteomic datasets. PROVALT reduced our peptide list to 8517 identified peptides, and further manual curation of these peptides led to a final list of 984 uniquely identified peptides that resulted in the positive identification of 249 proteins. These identified proteins were functionally classified and physicochemically characterized. A variety of typical “housekeeping” functions identified within the proteome included nucleic acid, amino and fatty acid anabolism and catabolism, glycolysis, TCA cycle, and pyruvate and selenoamino acid metabolism. In addition, a number of potential virulence factors of relevance to both plant and human disease were identified.

Keywords: proteomics • 2D LC-MS/MS • Ochrobactrum anthropi • pathogen

* To whom correspondence should be addressed. Dr Robert Graham, School of Biomedical Sciences, University of Ulster, Cromore Road, Coleraine, BT52 1SA, U.K.; Email: [email protected]; Telephone: +44(0)2870 323227; Fax: +44(0)2870 324965.
† University of Ulster. ‡ University of Georgia. § Lingley House.

Journal of Proteome Research 2006, 5, 3145-3153. Published on Web 09/26/2006. 10.1021/pr060293g CCC: $33.50 © 2006 American Chemical Society

Introduction

The α-proteobacteria are a biologically interesting group of genetically diverse bacteria. Many members of this subdivision are capable of interaction with eukaryotic cells and can function as intracellular symbionts or as pathogens of plants and animals. It has been shown that some members are important human pathogens (some can establish asymptomatic chronic animal infections), whereas others are agriculturally important, assisting plants with nitrogen fixation (1). The α-2 subgroup of the proteobacteria contains the well-known genera Rhizobacteria, Agrobacterium, Rickettsia, Bartonella, and Brucella, which include species of widespread medical and agricultural importance (2). A lesser known member of this group is the genus Ochrobactrum, which is genetically most closely related to the genus Brucella (3).

Until 1998, Ochrobactrum anthropi was considered to be both the sole and type species of the genus Ochrobactrum, despite the genetic and phenotypic heterogeneity visible within isolates of the species (4). Subsequent analysis by Velasco et al. (5) resulted in the description of O. intermedium as a second species. Further systematic work by Lebuhn et al. (6) led to the identification of two new species, O. grignonense and O. tritici, isolated from soil and wheat rhizoplane. Most recently, a fifth species, O. gallinifaecis, isolated from a chicken fecal sample, was described.

Ochrobactrum species have been described as being environmentally abundant free-living α-proteobacteria. A number of reports exist in the literature describing the use of Ochrobactrum species either as a source of biotechnologically useful enzymes (7-9) or in the detoxification of xenobiotic compounds such as halobenzoates (10-14). The ability of Ochrobactrum species to act as legume endosymbionts in temperate genera such as Lupinus, Musa, and Acacia has also recently been demonstrated (15-17).

O. anthropi has been reported to be of widespread distribution in hospital environments (18). Although O. anthropi is only weakly virulent, it has been found to cause hospital-acquired infections usually, but not always, in immunocompromised hosts (19-22). The organism has been found to adhere, possibly as a result of biofilm formation, to the surface of catheters, pacemakers, intraocular lenses, and silicon tubing, thus representing potential sources of infection in the clinical environment (23, 24). Upon infection, O. anthropi has been shown to cause pancreatic abscess, catheter-related bacteremia, endophthalmitis, urinary tract infection, and endocarditis (19). With the exception of the antibiotic imipenem, O. anthropi strains usually are resistant to all β-lactams, with Nadjar and co-workers showing that in at least one isolate, resistance was due to an extended spectrum β-lactamase (18). The most effective antimicrobial agents for treating human infection have thus far been reported to be imipenem, trimethoprim-sulfamethoxazole, and ciprofloxacin (21).

The genomes of O. intermedium and O. anthropi are complex and composed of two independent circular chromosomes (25). Recent work by Teyssier et al. (26) revealed an exceptionally high level of genomic diversity within Ochrobactrum species, possibly reflecting their adaptability to various ecological niches. Although there is currently no genome sequence data available for any Ochrobactrum species, genome sequence information does exist for 20 α-proteobacteria, including three from species of its closest neighbor Brucella. The availability of such information not only offers an excellent model system to study the forces, mechanisms, and rates by which bacterial genomes evolve (27) but also makes it possible to carry out functional genomic and proteomic investigations of these and closely related organisms.

Identification and characterization of all the proteins expressed under normal growth conditions will be invaluable as a reference point for comparative proteomic analyses of this emerging pathogen. This will, it is hoped, enable the identification of both virulence factors and potential targets for therapeutic strategies to deal with this pathogen. We now report the first large-scale gel-free proteomic analysis of the soluble subproteome of the emerging pathogen O. anthropi. This has allowed the identification of 249 proteins involved in a variety of typical “housekeeping” functions including nucleic acid, amino and fatty acid anabolism and catabolism, glycolysis, TCA cycle, and pyruvate and selenoamino acid metabolism. In addition, a number of potential virulence factors of relevance to both plant and human disease were identified.

Experimental Procedures

Reagents. All reagents were purchased from Sigma-Aldrich (Poole, UK) with the exception of mass spectrometry grade water and acetonitrile, which were purchased from Romil (Cambridge, UK), and trypsin, which was purchased from Promega (Southampton, UK).

Cell Culture and Growth Conditions. O. anthropi UU551 was routinely maintained at 37 °C on nutrient agar. Routine growth of the organism involved the inoculation of nutrient broth (50 mL in 250-mL Erlenmeyer flasks) with a loop of fresh, actively growing (16 h) culture from agar plates. Flasks were incubated aerobically at 37 °C with orbital shaking at 200 rpm in an Innova 4230 refrigerated incubator shaker (New Brunswick Scientific, NJ). Growth was monitored by the increase in culture attenuance at 600 nm.

Protein Extraction and Quantification. O. anthropi cultures were harvested in the mid-log phase (D600 = 1.2) of growth by centrifugation at 9000 × g for 10 min at 3-5 °C. The cell pellet was weighed and resuspended in 10 mM PBS (pH 7.8) at a ratio of 1 g of cells to 2 mL of buffer. The cells were then broken using sonication as described previously by Hayes et al. (28). The soluble proteome fraction was isolated by centrifugation of the homogenate at 25 000 × g for 30 min at 3-5 °C (Beckman J2HS, Beckman Instruments, CA) followed by ultracentrifugation at 150 000 × g for 2 h at 3-5 °C (Beckman L8-M, Beckman Instruments, CA) to sediment the insoluble fraction. The supernatant was decanted and stored frozen in 1 mL aliquots at -70 °C until required. The total soluble protein content was measured using the Bradford assay (29).

Journal of Proteome Research • Vol. 5, No. 11, 2006


Strong Anion Exchange Perfusion Chromatography. An aliquot of the soluble protein fraction (5 mg) was loaded onto a Porous HQ/20 4.6 mm × 100 mm (1.662 mL column volume) strong anion exchange (SAX) column (Applied Biosystems, Foster City, USA) connected to a Biocad Vision workstation (Applied Biosystems/MDS SCIEX, Toronto, Canada). Buffers used for protein elution were Buffer A (50 mM Tris-HCl, pH 8.0) and Buffer B (50 mM Tris-HCl, 1 M NaCl, pH 8.0). Protein elution was performed using a gradient of 0-100% Buffer B over 20 column volumes, at a flow rate of 5 mL min-1, with a further 10 column volumes of 100% Buffer B. This procedure was repeated five times, allowing a total fraction volume of 5 mL to be collected using an AFC 2000 automated fraction collector. Samples were then concentrated and desalted using 3 kDa Amicon Centriprep filters (Millipore Corporation, Bedford, MA) as per the manufacturer’s instructions.

Tryptic Digestion. Trypsin (6 µg, Promega, Southampton, UK) in 50 mM NH4HCO3, pH 7.8, was added to 100 µL of the desalted samples and incubated overnight at 37 °C, following which the reactions were frozen at -70 °C until required.

Liquid Chromatography-Mass Spectrometric (LC-MS) Analysis. Mass spectrometry was performed using a 3200 Q-TRAP Hybrid ESI Quadrupole linear ion trap mass spectrometer, ESI-Q-q-Q(linear ion trap)-MS/MS (Applied Biosystems/MDS SCIEX, Toronto, Canada), with a nanospray interface, coupled with an online Ultimate 3000 nanoflow liquid chromatography system (Dionex/LC Packings, Amsterdam, The Netherlands). A µ-precolumn cartridge (300 µm × 5 mm, 5 µm particle size) was placed before the C18 capillary column (75 µm × 15 cm, 3 µm particle size) to enable desalting and filtering. Both columns were packed with the silica-based reverse phase material PepMAP 100 C18, with a 100 Å pore size (Dionex/LC Packings).
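The SAX column dimensions and elution program quoted above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, assuming the quoted column volume is simply the geometric volume of a 4.6 mm × 100 mm cylinder; the function and constant names are illustrative and are not taken from the instrument software.

```python
import math

# Values quoted in the text for the SAX step.
COLUMN_ID_MM = 4.6        # internal diameter of the column
COLUMN_LENGTH_MM = 100.0  # bed length
FLOW_ML_PER_MIN = 5.0     # elution flow rate
GRADIENT_CV = 20          # column volumes for the 0-100% Buffer B gradient
HOLD_CV = 10              # further column volumes at 100% Buffer B


def column_volume_ml(inner_diameter_mm: float, length_mm: float) -> float:
    """Geometric volume of a cylinder in mL (1 cm^3 = 1 mL)."""
    radius_cm = inner_diameter_mm / 2.0 / 10.0
    length_cm = length_mm / 10.0
    return math.pi * radius_cm ** 2 * length_cm


def elution_summary(cv_ml, flow_ml_per_min, gradient_cv, hold_cv):
    """Return (buffer volume in mL, run time in min) for one elution."""
    total_ml = (gradient_cv + hold_cv) * cv_ml
    return total_ml, total_ml / flow_ml_per_min


cv = column_volume_ml(COLUMN_ID_MM, COLUMN_LENGTH_MM)
volume_ml, minutes = elution_summary(cv, FLOW_ML_PER_MIN, GRADIENT_CV, HOLD_CV)
print(f"column volume: {cv:.3f} mL")
print(f"buffer per elution: {volume_ml:.1f} mL over {minutes:.1f} min")
```

Under this assumption the geometric volume reproduces the 1.662 mL column volume given in the text, and each elution uses about 50 mL of buffer in roughly 10 min.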
The buffers used in the gradient were Buffer A (0.1% formic acid in 2% acetonitrile) and Buffer B (0.1% formic acid in 80% acetonitrile). The nanoLC gradient was 95 min in length: 0-55% B in 70 min, hold at 55% B for 10 min, 10 min at 90% B, followed by 5 min at 100% A. The flow rate of the gradient was 300 nL min-1. The detector mass range was set at 400-1800 m/z. MS data acquisition was performed in positive ion mode. During MS acquisition, peptides with 2+ and 3+ charge states were selected for fragmentation.

Database Searching and Protein Identification. Protein identification was carried out using an internal MASCOT server (version 1.9; Matrix Science, London, UK) searching against the MSDB database (latest version at the time of processing). Peptide tolerance was set at ±1.2 Da with MS/MS tolerance set at ±0.6 Da, and the search was set to allow for 1 missed cleavage.

Manual Curation Protein Identification. Only identifications with a MASCOT MOWSE score > 43 were regarded as significant hits regardless of the number of peptides.

PROVALT (30) Protein Identification. For identification purposes, the minimum peptide length was set at 6 amino acids, the minimum peptide MOWSE score was set at 25, and the minimum high quality peptide MOWSE score was set at 49. Again, only identifications with scores > 43 were regarded as significant hits regardless of the number of peptides.

Bioinformatics. PSORTb version 2.0.4 (31) (http://www.psort.org/psortb/index.html) was used for the prediction of bacterial protein subcellular localization. SignalP 3.0 (32) (http://www.cbs.dtu.dk/services/SignalP/) was used to predict the presence and location of signal peptide cleavage sites in amino acid sequences, for classically secreted proteins. SecretomeP 2.0 (33) (http://www.cbs.dtu.dk/services/SecretomeP/) was used for the prediction of nonclassical, i.e., not signal peptide triggered, protein secretion.

Figure 1. Multiple (three run) analysis of overall proteins identified from the soluble subproteome of Ochrobactrum anthropi.
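The score and length thresholds listed under Manual Curation and PROVALT Protein Identification can be written down as simple cut-offs. The sketch below is only an illustration of those stated thresholds: it does not reproduce how MASCOT or PROVALT apply them internally, the `PeptideHit` class and function names are hypothetical, and the peptide scores (and the short fragment `ELLR`) are invented for the example.

```python
from dataclasses import dataclass

# Thresholds quoted in the text.
MIN_PEPTIDE_LENGTH = 6   # amino acids
MIN_PEPTIDE_SCORE = 25   # minimum peptide MOWSE score
HIGH_QUALITY_SCORE = 49  # minimum high quality peptide MOWSE score
SIGNIFICANT_SCORE = 43   # identifications must score above this value


@dataclass(frozen=True)
class PeptideHit:
    """Hypothetical container for one peptide match."""
    sequence: str
    score: float


def passes_peptide_filter(hit: PeptideHit) -> bool:
    """Apply the minimum length and minimum peptide score cut-offs."""
    return len(hit.sequence) >= MIN_PEPTIDE_LENGTH and hit.score >= MIN_PEPTIDE_SCORE


def is_high_quality(hit: PeptideHit) -> bool:
    """Flag peptides that reach the high quality score."""
    return hit.score >= HIGH_QUALITY_SCORE


def is_significant(identification_score: float) -> bool:
    """Identifications are significant only when the score exceeds 43."""
    return identification_score > SIGNIFICANT_SCORE


# Illustrative hits; every score here is made up for the example.
hits = [
    PeptideHit("SKTSINELLR", 52.0),
    PeptideHit("TSINELLR", 31.0),
    PeptideHit("ELLR", 40.0),                    # rejected: shorter than 6 residues
    PeptideHit("EGDIVNIDVTYLLDGWHGDSSR", 18.0),  # rejected: score below 25
]
kept = [hit for hit in hits if passes_peptide_filter(hit)]
print([hit.sequence for hit in kept])
```

Keeping the cut-offs as named constants makes it easy to see that the same significance threshold (> 43) was applied to both the manual and the PROVALT-assisted lists.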

Results and Discussion

Comprehensive Analysis of O. anthropi Soluble Subproteome. In this study, we report the first gel-free proteomic analysis of the α-proteobacterium O. anthropi. Utilizing our previously reported (34) robust multidimensional protein identification system, 249 proteins from within the soluble subproteome have been identified. This expressed gene product subset represents an estimated 5% of the total O. anthropi proteome, employing data based upon the typical predicted genome size (26). No data are currently available in the literature on the expected distribution of proteins within subproteomic fractions of O. anthropi. As a benchmark, however, a study concentrating mainly on the analysis of the cytosolic proteins of B. melitensis 16M, a phylogenetically closely related organism, identified 187 proteins, equating to 6% of its predicted proteome (35, 36).

To achieve a comprehensive analysis of the soluble subproteome, cell extract obtained from whole bacterial cells underwent initial separation via strong anion exchange chromatography as previously described (34). The collected fractions were tryptically digested, introduced onto the LC-MS system, and subsequently identified using MASCOT. As previously reported (34, 37), the complex nature of the peptide mixtures to be analyzed on LC-MS systems often exceeds their separation capabilities. This, coupled with the inherent restrictions of the automated acquisition of peptides for MS/MS, makes it essential that samples be run more than once to increase overall peptide identifications. In this study, all peptide fractions were analyzed three separate times, increasing our overall protein identification by 19% (Figure 1).

In the current study, 57 280 peptides were initially identified utilizing automated MS/MS analysis software. Automated curation of this initial list of identified peptides by the heuristic bioinformatic tool PROVALT (30) reduced this list to 8517 identified peptides. Manual curation of this list, involving the removal of peptides identified more than once, the removal of redundant peptides from any single protein, and the discarding of obvious false positives, gave a final list of 984 uniquely identified peptides that led to the positive identification of 249 proteins. The average number of peptides per protein was 4 and the average MOWSE score was 229.

Characterization of O. anthropi Soluble Subproteome. The protein subset identified from this soluble subproteome represented proteins with a wide range of physicochemical properties with respect to pI and molecular mass (Mr) (Figure 2). This 2D visualization showed that the smallest protein identified was 50S ribosomal protein L34 (Mr = 5108 Da), and the largest was acetyl coA carboxylase (Mr = 491 912 Da). The most acidic protein identified was hypothetical cytosolic protein BMEI1918 (pI = 3.84), whereas the most basic was the 50S ribosomal protein L34 (pI = 12.70).

Figure 2. Two-dimensional visualization of the predicted molecular mass and pI of proteins identified within the Ochrobactrum anthropi soluble subproteome by multidimensional analysis.

Proteomic analysis of the origin of the identified proteins in this study bears out previous genomic studies showing that, phylogenetically, the genus Ochrobactrum is most closely related to Brucella, with 84% of the proteins identified having closest match to this genus. The majority of the remaining proteins were matched to other members of the α-2 subgroup of the proteobacteria, Rhizobacteria (8.8%), Bartonella (2.4%), and Agrobacterium (1.2%), with the remaining 3.6% distributed among other bacterial species.

Of the 249 proteins detected in this study, functional roles for 229 proteins (90%) were known or could be predicted from database analysis. Proteins within this soluble subproteome were assigned to functional categories utilizing methodologies as previously described by Takami et al. (38) and Wasinger et al. (39). Figure 3 shows that the largest category of identified proteins was involved in protein synthesis (ribosomal proteins) (15.7%), followed by those involved in metabolism of amino acids and related molecules (11.2%), followed by those similar to hypothetical proteins (9.6%). The remaining proteins were distributed among other functional categories.
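The successive reductions of the peptide list reported above (MASCOT output, PROVALT curation, manual curation) can be summarized as retention fractions. The short sketch below uses only the counts stated in the text; the stage labels and helper name are illustrative.

```python
# Peptide counts reported at each stage of curation.
STAGES = [
    ("automated MS/MS identification", 57280),
    ("PROVALT curation", 8517),
    ("manual curation (unique peptides)", 984),
]
PROTEINS_IDENTIFIED = 249


def retention_fractions(stages):
    """Fraction of peptides retained at each stage relative to the previous one."""
    fractions = []
    for (_, previous), (name, current) in zip(stages, stages[1:]):
        fractions.append((name, current / previous))
    return fractions


fractions = retention_fractions(STAGES)
for name, fraction in fractions:
    print(f"{name}: {fraction:.1%} of the previous list retained")

peptides_per_protein = STAGES[-1][1] / PROTEINS_IDENTIFIED
print(f"unique peptides per protein: {peptides_per_protein:.2f}")
```

On these figures PROVALT retains about 15% of the initial peptide list, manual curation retains about 12% of the PROVALT list, and the ratio of unique peptides to proteins is just under 4, in line with the stated average of 4 peptides per protein.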
Figure 3. Functional categorization of proteins identified within the soluble subproteome of Ochrobactrum anthropi.

The rapid increase in genomic data over the past decade has revealed many important aspects of microbial cellular processes; however, there are still a significant number of potential gene products about which we know nothing, save that they are classified as “hypothetical proteins”. Indeed, within the genome sequence of B. melitensis strain 16M, the closest phylogenetic relative of O. anthropi for which genomic data are available, some 716 predicted gene products, equivalent to 22% of the total genome, are predicted to be either hypothetical proteins or proteins of unknown function. In previous work we have underlined the necessity to assign, where possible, an element of biological functionality to such gene products to develop both systems biology and our understanding of cellular processes within these organisms. Within the current study, we have established the presence within the soluble subproteome of O. anthropi of some 24 proteins that had previously been annotated as hypothetical conserved proteins. The identification of such proteins within the cell extract of O. anthropi establishes the biological functionality of these “hypothetical” predicted protein coding sequences and more elegantly demonstrates the potential of proteomics to validate bioinformatics predictions. Having established the presence of such proteins and wishing to understand how they contribute to functional processes, we further examined them using NCBI BLASTp (www.ncbi.nlm.nih.gov/BLAST/). Such an approach allows conserved domains within protein sequences to be identified and thereby enables a degree of inferred functionality. Using this methodology allowed us to assign putative functions to a third of these proteins (Table 1).

Table 1. Functional Analysis of Hypothetical Conserved Proteins from Ochrobactrum anthropi Based on Conserved Domain Analysis

code | NCBI-CDD (Pfam) | possible function
AG3373 | none | conserved protein of unknown function (COG3184)
Q98NN9_RHILO | 01722 | BolA-like protein (COG0271)
Q57C29_BRUAB | 02637 | GatB/YqeY domain (COG1610)
Q57BC9_BRUAB | none | conserved protein of unknown function
AI3289 | none | conserved protein of unknown function
AE3435 | none | conserved protein of unknown function
AC3596 | none | conserved protein of unknown function
Q8FZS3_BRUSU | none | conserved protein of unknown function
Q92R40_RHIME | none | transcriptional accessory protein (COG2183)
AC3305 | none | conserved protein of unknown function
Q9FBM5_STRCO | none | conserved protein of unknown function
AE3289 | none | conserved protein of unknown function
AD3485 | none | conserved protein of unknown function
AH3491 | none | conserved protein of unknown function (COG4530)
AB3560 | 03966 | protein of unknown function (DUF343)
Q8FW32_BRUSU | none | saccharopine dehydrogenase and related proteins (COG1748)
AH3273 | 03795 | YCII-related domain
Q6G2D7_BARHE | 02492 | cobalamin synthesis protein
A96010 | none | conserved protein of unknown function
AI3597 | 03992 | uncharacterized enzyme involved in the biosynthesis of extracellular polysaccharides (COG3239)
AG3273 | 04543 | conserved protein of unknown function (COG2947)
AI3391 | 06821 | predicted esterase of the alpha/beta hydrolase fold (COG3545)
AF3388 | 04391 | protein of unknown function (DUF533), some members may be secreted or integral membrane proteins
AI3415 | none | conserved protein of unknown function

Comparison of Manual Curation of Data with Automated Analysis Utilizing PROVALT. Previously, along with others, we have reported on the limitations of current automated MS data interpretation from large-scale proteomics studies (34). One of the major challenges that we encountered was the elimination of erroneous data that may lead to the identification of “phantom” proteins within any given sample. To this end, we suggested that until a reliable MS data interpretation tool could be found, the only way to proceed, and ensure integrity of data interpretation, was to manually curate the MS data, resulting in many laborious weeks of analysis. For example, when working with the subproteome of Geobacillus thermoleovorans, we reported a total processing time of 40 days, whereas Chong and Wright (37), working with similar datasets from Sulfolobus solfataricus P2, took over 43 days. This represents a bottleneck in the proteomic workflow and a not inconsiderable effort by researchers involved!

PROVALT is a protein validation tool that uses a heuristic method for the analysis of proteins identified by MASCOT, the main MS data interpretation tool currently employed by the proteomic community (30). PROVALT takes large proteomic MS datasets and reorganizes them, combining multiple MASCOT results and identifying peptides that match. Redundant peptides are then removed, and related peptides are grouped together with their predicted matching protein. In addition to its rapidity, the program is also of benefit in that it clusters proteins with no unique peptides into homologous groups, thus eliminating for researchers one common cause of false positive results. Although not investigated in this study, PROVALT also contains an algorithm that allows for the determination of protein false positive detection rates when using MASCOT (30). This is one of the many standard practices now being called for by leading journals in the proteomics field.

PROVALT was specifically written for use in the analysis of the proteome of the causative agent of Chagas disease, the parasite Trypanosoma cruzi. With the obvious benefits that such a program offers, we wanted to evaluate the efficacy of PROVALT in the analysis of our dataset from O. anthropi. This, we envisaged, would help in identifying whether PROVALT would be of benefit to the wider proteomic community and potentially lead to a decrease in the time taken in the analysis of proteomic data. To this end, we compared results generated by PROVALT with the protein list produced after manual curation of results from a subset of O. anthropi proteomic data.

The most immediate benefit conferred by the use of PROVALT was in the time taken to generate a final protein list. Common to both methods was a total of 94 h of MS run time necessary to generate both MS/MS data files and MASCOT result files for 14 triplicate fractions of the O. anthropi cytosolic proteome. The manual curation of these MASCOT result files to produce a protein master list took some 13 days, resulting in the identification of 145 distinct proteins. PROVALT took only 10 h to generate a list of some 720 proteins, with a further 7 h required to identify 154 distinct proteins meeting the level of stringency applied to the manual curation list. When the PROVALT identifications were assessed against the manual curation list, a total of 120 proteins were common to both. Although PROVALT allowed the grouping together of homologous proteins, within the manually generated list such proteins remained as distinct entities and thus served as a source of false positives. When such proteins were removed from the manual curation list, the total protein number identified fell to 123 (Figure 4).

Figure 4. Comparison of proteins identified by PROVALT analysis and manual curation from the soluble subproteome of Ochrobactrum anthropi.

An additional benefit from PROVALT grouping together homologous proteins, based on peptide content, is the removal of our dependence upon others within the scientific community assigning consistent, correct, and meaningful descriptors for gene and protein function. For example, within one of the homologous groupings listed by PROVALT, peptides were described as originating from an acetyl-CoA C-acyltransferase [EC 2.3.1.16] (accession number AG3571). Other descriptors of genes matching this peptide subset included fatty acid oxidation complex, beta subunit, and probable acyl-coA thiolase. In the absence of software to identify homology between these disparately described proteins, false positive identifications are the likely outcome. The presence of such variation is therefore dependent upon the quality, uniformity, and extent of available sequence annotation within public databases.

From an initial analysis of both protein lists, it became apparent that a number of peptides were assigned to proteins with the same predicted function but with different database accession numbers and often from distinct species of Brucella. For example, the presence of the peptides SKTSINELLR and TSINELLR in our samples resulted in MASCOT alone assigning them as belonging to 2,3,4,5-tetrahydropyridine-2,6-dicarboxylate N-succinyltransferase with the accession number AAL53511, whereas PROVALT assigns this protein the code AD3543. This aberration appears to have occurred as a result of the PROVALT software, in this instance, choosing to assign a PIR database code to this protein as a replacement of the EMBL code assigned by MASCOT alone. More problematic are the instances where PROVALT and MASCOT not only assign the protein a different accession number but also describe it as having closest identity with different bacterial species. For example, the peptide EGDIVNIDVTYLLDGWHGDSSR was identified as a methionine aminopeptidase, with PROVALT describing it as originating from Brucella melitensis (strain 16M; code AI3341), whereas MASCOT describes it as originating from Brucella abortus biovar 1 str. 9-941 and gave it an accession number of Q57CL4_BRUAB. Such a situation is a source of potential variation between lists, and it is not clear why it has arisen given that the MASCOT data files used are common, and in many previous cases PROVALT assigns such proteins to a homologous series.

Finally, in analysis of the PROVALT protein list, it was possible to identify a number of proteins with the same predicted function that had remained as distinct groupings due to the presence of an apparently distinct peptide(s). Upon further analysis, it was clear that in the majority of cases these distinct peptides contained single isobaric amino acid changes. These occurrences can be explained by either errors in the translation of these proteins in the public databases or the presence of two distinct copies of the gene encoding these proteins within the O. anthropi genome. In only one case did it appear that the protein found in the proteome of O. anthropi could be a chimeric form, although again the presence of two distinct copies of the gene encoding this protein could not be excluded without further scientific investigation.

Subcellular Protein Localization. Subcellular localization prediction tools have been used for many years to identify those proteins that are retained by and exported from cells. They may also have uses in identifying possible diagnostic and therapeutic targets as well as providing information on the functionality of a protein (31). In the current study, a number of bioinformatics tools including PSortB (31), SignalP (32), and SecretomeP (33) were utilized. These bioinformatics tools endeavor to assign a subcellular location for each protein. They use a set of descriptor rules and a variety of computational algorithms and networks to analyze a protein’s amino acid composition in an attempt to identify known motifs or cleavage sites.

Figure 5. Overview of identified proteins from the soluble subproteome of Ochrobactrum anthropi. Cellular localization was predicted based upon the use of PSortB v.2.0.4 (31), SignalP v.3.0 (32), and SecretomeP.

All 249 proteins identified in this study were initially analyzed using PSortB; 120 were predicted to be cytoplasmic and to contain no helical domains, whereas the remaining 129 proteins were predicted either to contain helical domains or to have localizations other than within the cytoplasm (Figure 5). These protein subsets were further analyzed using SignalP, to predict amino-terminal signal peptides, and SecretomeP, which attempts to identify nonclassically secreted proteins. Of those 120 proteins classified by PSortB as being cytoplasmic and containing no helical domains or signal peptide, 114 were confirmed as nonsecretory, while six were predicted to be potential secretory proteins (one containing a predicted signal peptide and five predicted to be nonclassically secreted). Of the 129 proteins initially predicted by PSortB to contain helical domains or to have localizations other than within the cytoplasm, 9 had predicted signal peptide sequences, 73 were classified as noncytoplasmic with no signal peptide, 33 were identified as nonclassically secreted proteins, and 14 were predicted to have both signal peptides and to be nonclassically secreted. The 63 proteins predicted to be secreted are shown in Table 2. Within this subset, 24 proteins were identified as possessing an N-terminal signal peptide. These proteins were further analyzed for the presence of lipobox, RR-motif, and signal

Table 2. Proteins Identified within the Soluble Subproteome of Ochrobactrum anthropi by PSortB, SignalP, or SecretomeP as Potentially Secreted

code | protein | MOWSE | PSortB(a) | SecretomeP | PSortB SP pred | SignalP SP pred | species(b)
AD3502 | dnaK protein | 715 | C | Yes 0.69 | No | No | Bm
AH3344 | protein translation elongation factor Tu | 644 | C | No | No | AMA-KS 17-18 | Bm
Q8G0I8_BRUSU | DNA-binding protein HU | 566 | U | Yes 0.95 | No | No | Bs
Q57DS5_BRUAB | 30S ribosomal protein S4 | 544 | U | Yes 0.55 | No | No | Ba
Q8G0K0_BRUSU | Peptidyl-prolyl cis-trans isomerase, cyclophilin type | 453 | P | Yes 0.89 | No | No | Bs
AB3347 | 50S ribosomal protein L2 | 394 | U | Yes 0.85 | No | No | Bm
Q98N19_RHILO | Glutamine synthetase I | 361 | C | Yes 0.5 | No | No | Rl
AB3348 | 50S ribosomal protein L24 | 336 | C | Yes 0.51 | No | No | Bm
AF3283 | H+-transporting two-sector ATPase, beta chain | 293 | U | No | Yes | AEA-KP 15-16 | Bm
AD3528 | flagellin | 280 | P | Yes 0.96 | No | TLA-ST 18-19 | Bm
AC3264 | electron transfer flavoprotein | 273 | U | Yes 0.63 | No | No | Bm
AB3550 | NAD(P) transhydrogenase (AB-specific) | 257 | CM | Yes 0.90 | No | No | Bm
AAN29719 | 30S ribosomal protein S9 | 254 | U | Yes 0.64 | No | No | Bs
Q92MM2_RHIME | peptide binding periplasmic ABC transporter | 233 | P | Yes 0.74 | Yes | ASA-AT 28-29 | Rm
AC3358 | CTP synthase | 206 | C | No | No | ALA-AL 25-26 | Bm
AC3347 | 30S ribosomal protein S19 | 183 | U | Yes 0.8 | No | No | Bm
Q5FM23_LACAC | 50S ribosomal protein L11 | 183 | U | Yes 0.92 | No | No | La
AH3455 | acriflavin resistance protein A precursor | 176 | CM | Yes 0.80 | Yes | ALA-QA 24-25 | Bm
AI3289 | hypothetical cytosolic protein BMEI0303 | 158 | U | Yes 0.7 | No | No | Bm
AE3435 | hypothetical protein BMEI1467 | 158 | U | Yes 0.90 | Yes | SAA-QN 25-26 | Bm
Q9F6V1_BRUAB | Organic peroxide resistance protein | 157 | U | Yes 0.94 | No | No | Ba
AC3596 | hypothetical membrane associated protein BMEII0692 | 153 | U | Yes 0.93 | No | ASA-QQ 11-12 | Bm
AH3384 | biotin carboxyl carrier protein of acetyl-CoA carboxylase | 149 | U | Yes 0.75 | Yes | No | Bm
Q57B61_BRUAB | Hypothetical lytic murein transglycosylase | 141 | CM | No | Yes | AEA-AQ 29-30 | Ba
AC3305 | Hypothetical protein BMEI0425 | 135 | CM | No | Yes | ARA-EN 37-38 | Bm
Q8FZT1_BRUSU | Phosphoserine phosphatase | 132 | U | No | Yes | AKA-AL 18-19 | Bs
AI3353 | ATP-dependent clp protease adaptor protein clpS | 132 | U | Yes 0.67 | No | No | Bm
AC3437 | 30S ribosomal protein S18 | 130 | U | Yes 0.55 | No | No |
AF3280 | 30S ribosomal protein S16 | 129 | U | Yes 0.72 | No | ARA-GS 10-11 |
AC3286 | 50S ribosomal protein L32 | 125 | U | Yes 0.82 | No | ADA-LK 21-22 |
AD3292 | 50S ribosomal protein L31 | 125 | U | Yes 0.76 | No | No |
Q57CQ3_BRUAB | 30S ribosomal protein S12 | 122 | U | Yes 0.82 | No | APV-KR 16-17 |
Q9FBM5_STRCO | Hypothetical protein | 120 | U | Yes 0.92 | No | SRA-GR 9-10 |
AH3316 | cold shock protein cspA | 119 | C | Yes 0.64 | No | No |
AH3347 | 30S ribosomal protein S17 | 116 | U | Yes 0.88 | No | No |
AE3320 | phnA | 105 | U | Yes 0.68 | No | No |
AH3260 | outer membrane lipoproteins carrier protein precursor | 98 | U | Yes 0.8 | Yes | No |
Q6FZ18_BARQU | hypothetical protein (50S ribosomal protein L34) | 97 | U | Yes 0.64 | No | ATA-GG 25-26 |
S44982 | flagellin | 96 | E | Yes 0.96 | No | LLT-QN 14-15 |
AE3481 | N-methyl-D-aspartate receptor | 95 | U | No | No | MDV-LR 3-4 |
CAC39251 | peptide methionine sulfoxide reductase | 94 | U | Yes 0.93 | No | No |
AE3374 | Glutamine synthetase I | 92 | C | Yes 0.66 | No | No |
AB3362 | single-stranded DNA-binding protein | 89 | U | Yes 0.91 | No | No |
AD3485 | hypothetical protein BMEI1866 | 89 | U | Yes 0.96 | Yes | ALA-QT 35-36 |
Q98M53_RHILO | Peptidyl prolyl cis-trans isomerase | 88 | C | Yes 0.85 | No | No |
AB3495 | 30S ribosomal protein S20 | 86 | U | Yes 0.59 | No | No |
AE3399 | integration host factor alpha-chain | 77 | U | Yes 0.59 | No | No |
AH3273 | hypothetical cytosolic protein BMEI0173 | 76 | U | Yes 0.89 | No | No |
AD3264 | electron transfer flavoprotein alpha-chain | 68 | U | No | No | SPG-PD 25-26 |
AG3441 | probable pyridoxamine-phosphate oxidase | 68 | U | Yes 0.87 | No | No |
AB3483 | dihydroxy-acid dehydratase | 66 | C | No | No | AGA-RG 20-21 |
AG3328 | proteinase do | 66 | P | No | Yes | ALA-AQ 37-38 |
AG3273 | hypothetical protein | 66 | U | Yes 0.58 | No | No |
Q89KW8_BRAJA | pyruvate dehydrogenase beta subunit | 64 | C | Yes 0.59 | No | No |
AG3267 | amino acid N-acetyltransferase | 59 | C | Yes 0.64 | No | No |
AC3259 | 50S ribosomal protein L28 | 52 | C | Yes 0.51 | No | No |
AI3391 | hypothetical cytosolic protein BMEI1119 | 51 | U | Yes 0.83 | No | No |
AB3503 | 50S ribosomal protein L35 | 50 | U | Yes 0.90 | No | VKA-AA 24-25 |
AC3401 | cell wall degradation protein | 50 | U | Yes 0.92 | No | No |
AB3349 | 50S ribosomal protein L15 | 48 | U | Yes 0.63 | No | No |
AI3415 | hypothetical protein BMEI1311 | 46 | U | Yes 0.89 | Yes | KKA-AK 32-33 |
AC3322 | membrane bound lytic murein transglycosylase B | 44 | CM | Yes 0.56 | No | No |

Bm Bm Bm Bm Ba Sc Bm Bm Bm Bm Bq Ss Bm Oa Bm Bm Bm Rl Bm Bm Bm Bm Bm Bm Bm Bm Bj Bm Bm Bm Bm Bm Bm Bm Bm

a Cellular localizations: C, cytoplasmic; CM, cytoplasmic membrane; U, unknown; P, periplasmic; E, extracellular. b Species: Bm, Brucella melitensis; Bs, Brucella suis; Ba, Brucella abortus; Rl, Rhizobium loti; La, Lactobacillus acidophilus; Sc, Streptomyces coelicolor; Bq, Bartonella quintana; Ss, Shigella sonnei; Oa, Ochrobactrum anthropi; Bj, Bradyrhizobium japonicum.
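The SignalP column packs each predicted cleavage site into a compact string such as "ARA-GS 10-11": the three residues before the cut, the two residues after it, and the two flanking positions. The following is a minimal Python sketch of how such an entry could be parsed for downstream filtering. The function name `parse_signalp_site` and the field names are hypothetical; they are not part of SignalP or of the analysis described here.

```python
import re
from typing import NamedTuple, Optional


class CleavageSite(NamedTuple):
    before: str  # last three residues of the predicted signal peptide
    after: str   # first two residues of the mature protein
    last: int    # position of the last signal-peptide residue
    first: int   # position of the first mature-protein residue


# Accepts "ARA-GS 10-11" as well as the unspaced form "VKA-AA24-25".
_SITE = re.compile(r"^([A-Z]{3})-([A-Z]{2})\s*(\d+)-(\d+)$")


def parse_signalp_site(entry: str) -> Optional[CleavageSite]:
    """Parse one SignalP table cell; return None when no site is predicted."""
    entry = entry.strip()
    if entry == "No":
        return None
    match = _SITE.match(entry)
    if match is None:
        raise ValueError(f"unrecognized SignalP entry: {entry!r}")
    before, after, last, first = match.groups()
    return CleavageSite(before, after, int(last), int(first))


if __name__ == "__main__":
    for cell in ["ARA-GS 10-11", "VKA-AA24-25", "No"]:
        print(cell, "->", parse_signalp_site(cell))
```

Returning `None` for "No" keeps the absence of a prediction distinct from a malformed entry, which raises an error instead.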

Journal of Proteome Research • Vol. 5, No. 11, 2006 3151

research articles

Graham et al.

Table 3. Proteins Identified within the Soluble Subproteome of Ochrobactrum anthropi Containing Predicted Export Signals

| proteins | function | signal peptide^a |
|---|---|---|
| AH3344 | protein translation elongation factor Tu | MCWRLSGSRTKRTTAMA |
| AF3283 | H+-transporting two-sector ATPase, beta chain | MAKAATPKTTAAAEA |
| AD3528 | flagellin | MASILTNSSALTALQTLA |
| Q92MM2_RHIME^b | peptide binding periplasmic ABC transporter | MSRLNRFLISALAAAAIAAPALATSASA |
| AC3358 | CTP synthase | MARYVFITGGVVSSLGKGIAAAALA |
| AH3455^b | acriflavin resistance protein A precursor | MNRTIRCFAAGAAFIVFAAQPALA |
| AE3435^b | hypothetical protein BMEI1467 | MIQRLAALAAGAGLSLAAATLPSAA |
| AC3596 | hypothetical membrane associated protein BMEII0692 | MLASATMPASA |
| Q57B61_BRUAB | Hypothetical lytic murein transglycosylase | MAFVAKGRSWTAAAVIAVGMMAGAGIAEA |
| AC3305 | Hypothetical protein BMEI0425 | MAIPSLSQIRFRNAACFVAFFAALFVSFIALPREARA |
| Q8FZT1_BRUSU | Phosphoserine phosphatase | MSQQVSLVATLIANPAKA |
| Q9FBM5_STRCO | Hypothetical protein | MRPPITSRA |
| S44982 | flagellin | MAQVINTNSLSLLT |
| AE3481 | N-methyl-D-aspartate receptor | MDV |
| AD3485 | hypothetical protein BMEI1866 | MKFEANEREPVKTMLGKSIVAASVFTIAMAGSALA |
| AD3264 | electron transfer flavoprotein alpha-chain | MLRFRCSRTKGTLRSGSRTPPFSPG |
| AB3483 | dihydroxy-acid dehydratase | MKMPPYRSRTTTHGRNMAGA |
| AG3328 | proteinase do | MAIAPKAGFARTLFATVALGAMSVAGTVSMGTPPALA |
| AI3415^b | hypothetical protein BMEI1311 | MKWNDFRKAQCGDDAASAPAAAPAAAPATKKA |
| AF3280 | 30S ribosomal protein S16 | MALKIRLARA |
| AC3286 | 50S ribosomal protein L32 | MAVPKRKTSPSRRGMRRSADA |
| Q57CQ3_BRUAB | 30S ribosomal protein S12 | MPTVNQLIRKPRTAPV |
| Q6FZ18_BARQU | hypothetical protein (50S ribosomal protein L34) | MKRTYQPSKLVRKRRHGFRARMATA |
| AB3503 | 50S ribosomal protein L35 | MPKMKTKSAAKKRFKITGTGKVKA |

a Putative signal peptides were predicted as described by Tjalsma et al.40 and Pugsley.41 The hydrophobic H-domain is bold and italic. The signal peptide cleavage sites are the last three amino acid residues and are in bold. Positively charged amino acids are italic (K, R). b Proteins likely to be secreted via the Sec pathway.
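The features named in the footnote (positively charged residues, a hydrophobic H-domain, and a cleavage site formed by the last three residues) can be summarized for any predicted signal peptide with a few lines of Python. This is only an illustrative sketch: the residue set treated as hydrophobic and the function names are assumptions made here, and the code does not implement the published criteria of Tjalsma et al. or Pugsley, nor does it apply any thresholds.

```python
# Illustrative assumption: residues counted as hydrophobic for the H-domain.
HYDROPHOBIC = frozenset("AILMFVW")
# Positively charged residues, as named in the table footnote (K, R).
POSITIVE = frozenset("KR")


def longest_hydrophobic_run(peptide: str) -> int:
    """Length of the longest uninterrupted stretch of hydrophobic residues."""
    best = current = 0
    for residue in peptide:
        current = current + 1 if residue in HYDROPHOBIC else 0
        best = max(best, current)
    return best


def describe_signal_peptide(peptide: str) -> dict:
    """Summarize simple sequence features of a predicted signal peptide."""
    return {
        "length": len(peptide),
        "cleavage_tripeptide": peptide[-3:],
        "positive_residues": sum(residue in POSITIVE for residue in peptide),
        "longest_hydrophobic_run": longest_hydrophobic_run(peptide),
    }


if __name__ == "__main__":
    # Two peptides taken from Table 3.
    examples = [
        ("AH3455", "MNRTIRCFAAGAAFIVFAAQPALA"),
        ("AF3280", "MALKIRLARA"),
    ]
    for accession, peptide in examples:
        print(accession, describe_signal_peptide(peptide))
```

Under these assumptions, AH3455 shows a long hydrophobic stretch before its ALA cleavage tripeptide, whereas the short AF3280 peptide ends in ARA but has no comparable hydrophobic core. Deciding which peptides qualify for a given secretion pathway still requires the manual interpretation described in the text.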

peptide cleavage sites to allow assignment, where possible, to a particular secretion pathway40,41 (Table 3). Of these 24 proteins, only 4 had the architecture required for assignment to the Sec pathway (Q92MM2_RHIME, AH3455, AE3435, and AI3415). The remaining proteins, although they contained the correct cleavage site for a signal peptide, lacked the full N-terminal architecture required to classify them as secreted proteins.40,41 This once again highlights the limitations of some bioinformatic tools, which at present rely largely on motif-based predictors. However, researchers are now moving toward “smarter” in silico strategies in which a number of predictors based on both structural and experimental data are combined to predict protein localization.42 Until this next generation of bioinformatics tools is widely available, the researcher, as demonstrated here, must manually interpret the results to gain any level of biological significance.

Protein Identification. The entire protein complement of a cell will not be expressed at any given time. Instead, certain subsets of proteins will be expressed at specific times and in response to various stimuli. In the current study, 249 proteins were identified within the soluble subproteome of O. anthropi and assigned a functional classification. Using this information, it was possible to search the Kyoto Encyclopedia of Genes and Genomes (KEGG) database to identify possible biochemical pathways. Previously, Djordjevic et al.43 used such an approach to investigate the proteome of the related α-proteobacterium, S. meliloti, positively identifying active metabolic pathways if a minimum of three different enzymatic activities were identified. Within O. anthropi, a number of pathways involved in “housekeeping” activities could be identified, including nucleic acid, amino and fatty acid anabolism and catabolism, glycolysis, TCA cycle, and pyruvate and selenoamino acid metabolism.

Little is known of the processes involved in the human pathogenesis of O. anthropi. Proteomic investigations of O. anthropi are therefore important for the identification of proteins that may act as candidate virulence factors, in addition to providing potential targets for the development of specific laboratory diagnostic tools for the identification of this emerging pathogen. In this study, we identified numerous proteins with the potential to act as virulence factors. For example, adenylate kinase has been shown to be an important virulence factor of Pseudomonas aeruginosa. Adenylate kinase enhances macrophage death, possibly through the generation of pools of AMP, ADP, and ATP, all of which are cytotoxic.44 It has long been recognized that flagella can play an important role in the virulence of some bacterial pathogens.45 Two different flagellin proteins, the structural component of bacterial flagella, with closest similarity to those of Shigella sonnei and B. melitensis, were detected in this study. Recently, Fretin et al.46 demonstrated the importance of flagella in the persistence of B. melitensis within a murine model of infection for brucellosis. Given that the α-proteobacteria can function as intracellular symbionts or pathogens of plants as well as animals, it was unsurprising to find a number of potential plant virulence factors within the O. anthropi proteome. BolA-like protein expression has been correlated with the degree of pathogenicity of the gram-negative bacterium Xanthomonas campestris.47 Additionally, the chromosomal response regulatory gene chvI has been shown to be essential for virulence in Agrobacterium tumefaciens.48 The presence of a lytic murein transglycosylase within O. anthropi suggests the presence of a type IV secretion system, which is used by a number of gram-negative bacteria to translocate virulence factors into eukaryotic cells or to mediate conjugative transfer of plasmids.49 Such an observation would be consistent with our knowledge of other closely related α-proteobacteria such as A. tumefaciens and B. melitensis.50
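The pathway-calling criterion described earlier in this section (a pathway is treated as active when at least three different enzymatic activities belonging to it are identified) can be expressed as a short Python sketch. The function name `active_pathways`, the input layout, and the example activity labels are hypothetical; the code does not query KEGG and is not the procedure used by Djordjevic et al. or in this study.

```python
from collections import defaultdict
from typing import Iterable, List, Tuple


def active_pathways(
    assignments: Iterable[Tuple[str, str]],
    min_activities: int = 3,
) -> List[str]:
    """Return pathways supported by at least `min_activities` distinct activities.

    `assignments` holds (pathway, enzymatic_activity) pairs, one per
    identified protein that was mapped to a pathway.
    """
    activities_by_pathway = defaultdict(set)
    for pathway, activity in assignments:
        # A set ensures that repeated hits to one activity count only once.
        activities_by_pathway[pathway].add(activity)
    return sorted(
        pathway
        for pathway, activities in activities_by_pathway.items()
        if len(activities) >= min_activities
    )


if __name__ == "__main__":
    # Made-up placeholder data, for illustration only.
    example = [
        ("glycolysis", "activity 1"),
        ("glycolysis", "activity 2"),
        ("glycolysis", "activity 3"),
        ("glycolysis", "activity 3"),  # duplicate hit, counted once
        ("TCA cycle", "activity 4"),
        ("TCA cycle", "activity 5"),
    ]
    print(active_pathways(example))
```

Counting distinct activities rather than proteins prevents several isoforms or subunits of one enzyme from satisfying the threshold on their own.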

Conclusion

In recent years, public concern has grown over the increasing number of nosocomial infections, with molecular analysis revealing the involvement of a wide range of microorganisms that were not previously considered to be pathogenic. As a result, it has become necessary not only to develop an understanding of the pathogenesis of these medically relevant bacteria but also to begin the development of rapid tests for their detection within clinical samples. O. anthropi is one such bacterium of relevance to hospital-acquired infection. As exemplified within this study, large-scale proteomic investigations generate equally large datasets that require analysis to glean functional biological information. Previously, we have shown that the “bottleneck” within such studies occurs at the curation of generated peptide lists, which in some studies takes up to several months of effort to arrive at a final protein list.34 The development of the heuristic tool PROVALT by Weatherly et al.30 for the analysis of the Trypanosoma cruzi proteome, together with other bioinformatics tools of this type, represents, we believe, a significant advance in proteomics. This tool enables the automated, rapid, and precise curation of large proteomic datasets, resulting in a significant reduction in the time taken for protein identification. The free availability of this program will greatly assist the development and accessibility of proteomics within the wider scientific community. Development of proteomic tools such as PROVALT is essential if we are to establish a systems biology approach to the understanding of cellular processes within a given organism.

Acknowledgment. We thank the Wellcome Trust for funding the vacation studentship of C.E.P. (VS/05/ULS/A2). R.L.J.G. was supported by the Northern Ireland Centre of Excellence in Functional Genomics, with funding from the European Union (EU) Program for Peace and Reconciliation, under the Technology Support for the Knowledge-Based Economy.

Supporting Information Available: The master list of proteins identified and other relevant material. This material is available free of charge via the Internet at http://pubs.acs.org.

References

(1) Batut, J.; Anderson, S. G.; O’Callaghan, D. Nat. Rev. Microbiol. 2004, 2, 933-945.
(2) Ugalade, R. A. Microbes Infect. 1999, 1, 1211-1219.
(3) Teyssier, C.; Marchandin, H.; De Buochberg, M. S.; Ramuz, M.; Jumas-Bilak, E. J. Bacteriol. 2003, 185, 2901-2909.
(4) Holmes, B.; Popoff, M.; Kiredjian, M.; Kersters, K. Int. J. Syst. Bacteriol. 1988, 38, 406-416.
(5) Valesco, J.; Romero, C.; López-Goñi, I.; Leiva, J.; Díaz, R.; Moriyón. Int. J. Syst. Bacteriol. 1998, 48, 759-768.
(6) Lebuhn, M.; Achouak, W.; Schloter, M.; Berge, O.; Meier, H.; Barakat, M.; Hartmann, A.; Heulin, T. Int. J. Syst. Evol. Microbiol. 2000, 50, 2207-2223.
(7) Asano, Y.; Nakazawa, A.; Kato, Y.; Kondo. J. Biol. Chem. 1989, 264, 14233-14239.
(8) Fanuel, L.; Thamm, I.; Kostanjevecki, V.; Samyn, B.; Joris, B.; Goffin, C.; Brannigan, J.; Van Beeumen, J.; Frere, J. M. Cell. Mol. Life Sci. 1999, 55, 812-818.
(9) Komeda, K.; Asano, Y. Eur. J. Biochem. 2000, 267, 2028-2035.
(10) Song, B.; Palleroni, N. J.; Haggblom, M. M. Appl. Environ. Microbiol. 2000, 66, 3446-3453.
(11) Favaloro, B.; Tamburro, A.; Trofino, M. A.; Bologna, L.; Rotilio, D.; Heipieper, H. J. Biochem. J. 2000, 346, 553-559.
(12) El-Sayed, W.; Ibrahim, M. K.; Abu-Shady, M.; EL-Beih, F.; Ohmura, N.; Saiki, H.; Ando, A. J. Biosci. Bioeng. 2003, 96, 310-312.
(13) Jo, J.; Won, S. H.; Lee, B. H. Biotechnol. Lett. 2004, 26, 1391-1396.
(14) Branco, R.; Alpoim, M. C.; Morais, P. V. Can. J. Microbiol. 2004, 50, 697-703.
(15) Magalhaes Cruz, L.; De Souza, E. M.; Weber, O. B.; Baldani, J. I.; Dobeireiner, J.; Pedrosa Fde, O. Appl. Environ. Microbiol. 2001, 67, 2375-2379.
(16) Ngom, A.; Nakagawa, Y.; Sawada, H.; Tsukahara, J.; Wakabayashi, S.; Uchiuimi, T.; Nuntagij, A.; Kotepong, S.; Suzuki, A.; Higashi, S.; Abe, M. J. Gen. Appl. Microbiol. 2004, 50, 17-27.
(17) Trujilo, M. E.; Willems, A.; Abril, A.; Planchuelo, A.-M.; Rivas, R.; Lundeña, D.; Mateos, P. F.; Martínez-Molina, E.; Velázquez, E. Appl. Environ. Microbiol. 2005, 71, 1318-1327.
(18) Nadjar, D.; Labia, R.; Cerceau, C.; Bizet, C.; Philippon, A.; Arlet, G. Antimicrob. Agents Chemother. 2001, 45, 2342-2330.
(19) Romero Gómez, M. P.; Esteban, A. M. P.; Daza, J. A. S.; Nieto, J. A. S.; Alvarez, D.; García, P. P. J. Clin. Microbiol. 2004, 42, 3371-3373.
(20) Brivet, F.; Guibert, M.; Kiredjian, M.; Dormont, J. Clin. Infect. Dis. 1993, 17, 516-518.
(21) Cieslak, T. J.; Robb, M. L.; Drabick, C. J.; Fischer, G. W. Clin. Infect. Dis. 1992, 15, 1068-1069.
(22) Haditsch, M.; Binder, L.; Tschurtschenthaler, G.; Watschinger, R.; Zauner, G.; Mittermayer, H. Infection 1994, 21, 306-310.
(23) Earhart, K. C.; Boyce, K.; Bone, W. D.; Wallace, M. R. Clin. Infect. Dis. 1997, 24, 281-282.
(24) Kern, W. V.; Oethinger, M.; Kaufhold, A.; Rozdzinski, E.; Marre, R. Infection 1993, 21, 306-310.
(25) Jumas-Bilak, E.; Michaux-Charachon, S.; Bourg, G.; Ramuz, M.; Allardet-Servent, A. J. Bacteriol. 1998, 180, 2749-2755.
(26) Teyssier, C.; Marchandin, H.; Masnou, A.; Jeannot, J.-L.; De Buochberg, M. S.; Jumas-Bilak, E. Electrophoresis 2005, 26, 2898-2907.
(27) Sällström, B.; Andersson, S. G. Curr. Opin. Microbiol. 2005, 8, 579-585.
(28) Hayes, V. E.; Ternan, N. G.; McMullan, G. FEMS Microbiol. Lett. 2000, 186, 171-175.
(29) Bradford, M. M. Anal. Biochem. 1976, 72, 248-254.
(30) Weatherly, D. B.; Atwood, J. A., III; Minning, T. A.; Cavola, C.; Tarleton, R. L.; Orlando, R. Mol. Cell. Proteomics 2005, 4, 762-772.
(31) Gardy, J. L.; Laird, M. R.; Chen, F.; Rey, S.; Walsh, C. J.; Ester, M.; Brinkman, F. S. L. Bioinformatics 2005, 21, 617-623.
(32) Bendtsen, J. D.; Nielsen, H.; von Heijne, G.; Brunak, S. Mol. Biol. 2004, 340, 783-795.
(33) Bendtsen, J. D.; Kiemer, L.; Fausbøll, A.; Brunak, S. Microbiology 2005, 5, 58-70.
(34) Graham, R. L. J.; Pollock, C. E.; Ternan, N. G.; McMullan, G. J. Proteome Res. 2006, 5, 822-828.
(35) Wagner, M. A.; Eschenbrenner, M.; Horn, T. A.; Kraycer, J. A.; Mujer, C. V.; Hagius, S.; Elzer, P.; DelVecchio, V. G. Proteomics 2002, 2, 1047-1060.
(36) DelVecchio, V. G.; Wagner, M. A.; Eschenbrenner, M.; Horn, T. A.; Kraycer, J. A.; Estock, F.; Elzer, P.; Mujer, C. V. Vet. Microbiol. 2002, 90, 593-603.
(37) Chong, P. K.; Wright, P. C. J. Proteome Res. 2005, 4, 1789-1798.
(38) Takami, H.; Takaki, Y.; Chee, G. J.; Nishi, S.; Shimamura, S.; Suzuki, H.; Matsui, S.; Uchiyama, I. Nucl. Acids Res. 2004, 32, 6292-6303.
(39) Wasinger, V. C.; Urquhart, B. L.; Humphrey-Smith, I. Electrophoresis 1999, 20, 2196-2203.
(40) Tjalsma, H.; Bolhuis, A.; Jongbloed, J. D. H.; Bron, S.; van Dijl, J. M. Microbiol. Mol. Biol. Rev. 2000, 64, 515-547.
(41) Pugsley, A. P. Microbiol. Rev. 1993, 57, 50-108.
(42) Eisenstein, M. Nat. Methods 2006, 3, 420.
(43) Djordjevic, M. A.; Chen, H. C.; Natera, S.; Van Noorden, G.; Menzel, C.; Taylor, S.; Renard, C.; Geiger, O.; Weiller, G. F. Mol. Plant-Microbe Interact. 2003, 16, 508-524.
(44) Markaryan, A.; Zaborina, O.; Punj, V.; Chakrabarty, A. M. J. Bacteriol. 2001, 183, 3345-3352.
(45) Gewirtz, A. T.; Navas, T. A.; Lyons, S.; Godowski, P. J.; Madara, J. L. J. Immunol. 2001, 167, 1882-1885.
(46) Fretin, D.; Fauconnier, A.; Kohler, S.; Halling, S.; Leonard, S.; Nijskens, C.; Ferooz, J.; Lestrate, P.; Delrue, R. M.; Danese, I.; Vandenhaute, J.; Tibor, A.; DeBolle, X.; Letesson, J. J. Cell Microbiol. 2005, 7, 687-698.
(47) Chin, K. H.; Lin, F. Y.; Hu, Y. C.; Sze, K. H.; Lyu, P. C.; Chou, S. H. J. Biomol. NMR 2005, 31, 167-172.
(48) Mantis, N. J.; Winas, S. C. J. Bacteriol. 1993, 175, 6626-6636.
(49) Höppner, C.; Carle, A.; Sivanesan, D.; Hoeppner, S.; Baron, C. Microbiology 2005, 151, 3469-3482.
(50) DelVecchio, V. G.; Kapatral, V.; Redkar, R. J.; Patra, G.; Mujer, C.; Los, T.; Ivanova, N.; Anderson, I.; Bhattacharyya, A.; Lykidis, A.; Reznik, G.; Jablonski, L.; Larse, N.; D’Souza, M.; Bernal, A.; Mazur, M.; Goltsman, E.; Selkov, E.; Elzer, P. H.; Hagius, S.; O’Callaghan, D.; Letesson, J.-J.; Haselkorn, R.; Kyrpides, N.; Overbeek, R. PNAS 2002, 99, 443-448.

PR060293G