Article pubs.acs.org/jpr

Proteogenomic Definition of Biomarkers for the Large Roseobacter Clade and Application for a Quick Screening of New Environmental Isolates

Joseph Alexander Christie-Oleza, Guylaine Miotello, and Jean Armengaud*

DSV, IBEB, Lab Biochim System Perturb, CEA, Parc Technologique Marcel Boiteux, BP17171, Bagnols-sur-Cèze F-30207, France

ABSTRACT: Whole-cell, matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) mass spectrometry has become a routine and reliable method for microbial characterization due to its simplicity, low cost, and high reproducibility. The identification of a microbial isolate relies on the resemblance of its low-molecular-weight protein spectrum to those of isolates already present in the databases. This is a gold standard for clinicians, who deal with a finite number of well-defined pathogenic strains, but it represents a problem for environmental microbiologists, who face an overwhelming number of organisms still to be defined. Here we set a milestone for implementing whole-cell MALDI-TOF mass spectrometry to identify isolates from the biosphere. To make this technique accessible for environmental studies, we propose to (i) define biomarkers that will always show up with an intense m/z signal in the MALDI-TOF spectra and (ii) create a database with all the possible m/z values that these biomarkers can generate to screen new isolates. We tested our method with the relevant marine Roseobacter lineage. The use of shotgun nanoLC-MS/MS proteomics on the small proteome fraction of nine Roseobacter strains and the proteogenomic toolbox helped us to identify potential biomarkers in terms of protein abundance and low variability among strains. We show that the DNA binding protein, HU, and the ribosomal proteins, L29 and L30, are the most robust biomarkers within the Roseobacter clade. The molecular weights of these three biomarkers, as for other conserved homologous proteins, vary due to sequence variation above the genus level. Therefore, we calculated the m/z values expected for each one of the known Roseobacter genera and tested our strategy during an extensive screening of natural marine isolates obtained from coastal waters of the Western Mediterranean Sea. The use of this technique versus standard sequencing methods is discussed.
KEYWORDS: shotgun proteomics, proteogenomics, MALDI-TOF mass spectrometry, marine bacteria, environmental isolates, biomarkers, microorganism identification



INTRODUCTION

Whole-cell, matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) mass spectrometry is becoming a standard technique to identify microorganisms.1,2 In this approach, the profile of m/z values measured for the abundant and ionizable, low-molecular-weight proteins is compared with those recorded in similar conditions for reference strains. The relatively low cost, rapidity, simplicity and reliability of this method have been well documented.3 Clinical microbial diagnostics by whole-cell MALDI-TOF mass spectrometry is becoming a gold standard.1,4 Because of their importance for human health, pathogens are well represented in databases cataloguing MALDI-TOF mass spectra of references. Environmental microorganisms are, however, under-represented in such repositories, although hundreds of millions of prokaryotic species populate the earth.5 The implementation of whole-cell MALDI-TOF mass spectrometry in environmental microbiology should allow a quick identification of bacterial species and should be useful for monitoring microbial groups and their related processes in ecosystems and for screening new isolates.6

Members of the Roseobacter clade are an important component of marine ecosystems, accounting for up to 20% of the total bacterial community. Consequently, this clade is being intensively studied.7,8 Roseobacters are metabolically versatile organisms that play important roles in global carbon and sulfur cycles.9 Their genomes encompass a large diversity of enzymatic information, denoting their generalist/opportunist life behavior.10,11 Their interactions with the microbial marine community are beginning to be uncovered through the identification of a large diversity of secreted proteins and active compounds.12,13 To date, 54 genomes of Roseobacter representatives have been released (www.roseobase.org). Nevertheless, this information is still poor when compared with the high number of known genera that make up this coherent phylogenetic clade.

The massive amount of uncharacterized bacteria in the environment makes the use of whole-cell MALDI-TOF difficult for identifying bacteria belonging to genera that are underrepresented in the MALDI-TOF spectrum databases. This is the case for large clades such as the Roseobacter group. Although culture media or colony age do not strongly influence bacterial identification,5,14 it has been recommended to grow isolates under standard conditions similar to those used to construct the database to avoid misidentifications.15,16 Furthermore, it is impossible to identify organisms that are distantly related to those present in the databases. To address this issue, Wynne and coworkers have already stated the need to define m/z MALDI-TOF mass spectrometry biomarkers to discriminate species rather than using their spectral profile.17 These biomarkers could advantageously be housekeeping proteins present whatever the conditions and similar between closely related species.

We recently proposed a set of reliable bacterial group biomarkers for identifying specific environmental microorganisms.6 Our proteogenomic approach proved that shotgun nanoLC-MS/MS proteomics combined with genome inspection gives higher confidence to the identification of MALDI-TOF biomarkers for a given genus, that is, Ruegeria. By this strategy, we defined five biomarkers for this specific genus (proteins HU, L29, L30, L32, and S17) and predicted their sequence variability among reference strains to screen for new Ruegeria members within a large set of environmental isolates.6

Here we propose the extension of this concept to a larger group of environmental bacteria, that is, the environmentally relevant Roseobacter clade. We define a more restricted set of reliable biomarkers by means of a large proteogenomic survey carried out on 10 representatives of the clade. The use of these biomarkers proved that a cost-effective diagnosis of environmental isolates is possible by the use of MALDI-TOF MS.

© 2013 American Chemical Society

Journal of Proteome Research
Special Issue: Agricultural and Environmental Proteomics
Received: June 13, 2013
Published: September 4, 2013
dx.doi.org/10.1021/pr400554e | J. Proteome Res. 2013, 12, 5331−5339

samples (biological duplicates), each measured twice (technical replicates). A spectrum from each replicate was constructed from 150 consecutive laser shots after external calibration. Using internal standards, spectra were recalibrated posttreatment. The 100 most intense m/z peaks were considered for biomarker analysis.
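The screening idea outlined above (keep the most intense m/z peaks of a spectrum and compare them with the m/z values expected for each biomarker) can be illustrated with a minimal Python sketch. It is not the authors' software: the function names, the `Peak` type, the 500 ppm matching tolerance, and every numeric value in it are assumptions made for illustration, and the m/z values are placeholders rather than the calculated masses of HU, L29, or L30.

```python
from typing import Dict, List, NamedTuple


class Peak(NamedTuple):
    mz: float
    intensity: float


def top_peaks(peaks: List[Peak], n: int = 100) -> List[Peak]:
    """Keep the n most intense peaks (the text considers the 100 most intense)."""
    return sorted(peaks, key=lambda p: p.intensity, reverse=True)[:n]


def match_biomarkers(
    peaks: List[Peak],
    expected: Dict[str, float],
    tol_ppm: float = 500.0,  # assumed tolerance, not taken from the paper
    n: int = 100,
) -> Dict[str, float]:
    """Map each biomarker label to the observed m/z that matches it, if any."""
    candidates = top_peaks(peaks, n)
    hits: Dict[str, float] = {}
    for label, mz_expected in expected.items():
        window = mz_expected * tol_ppm / 1e6  # absolute tolerance in m/z units
        close = [p for p in candidates if abs(p.mz - mz_expected) <= window]
        if close:
            # If several peaks fall inside the window, keep the most intense one.
            hits[label] = max(close, key=lambda p: p.intensity).mz
    return hits


# Placeholder values for illustration only; they are NOT measured or calculated masses.
expected = {"HU": 9500.0, "L29": 7600.0, "L30": 6900.0}
spectrum = [
    Peak(9501.2, 850.0),
    Peak(7600.5, 400.0),
    Peak(5000.0, 900.0),
    Peak(6950.0, 300.0),
]
hits = match_biomarkers(spectrum, expected)
```

With these placeholder numbers, the peaks near 9500 and 7600 fall inside their tolerance windows, whereas the peak at 6950 lies too far from 6900 to count as a match. In practice, one such table of expected values would be needed per genus, since the biomarker masses vary with sequence above the genus level.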



Shotgun nanoLC-MS/MS Analysis

Shotgun proteomics was carried out on the small-protein fraction (