Article pubs.acs.org/ac

Method Development for Metaproteomic Analyses of Marine Biofilms

Dagmar Hajkova Leary,† W. Judson Hervey, IV,† Robert W. Li,‡ Jeffrey R. Deschamps,§ Anne W. Kusterbeck,§ and Gary J. Vora*,§

† National Academy of Sciences, National Research Council, Postdoctoral Research Associate, US Naval Research Laboratory, 4555 Overlook Ave.-SW, Washington, D.C. 20375, United States
‡ Bovine Functional Genomics Laboratory, Animal and Natural Resources Institute, United States Department of Agriculture, Beltsville, Maryland, United States
§ Center for Bio/Molecular Science and Engineering, US Naval Research Laboratory, 4555 Overlook Ave.-SW, Washington, D.C. 20375, United States

Supporting Information

ABSTRACT: The large-scale identification and quantitation of proteins via nanoliquid chromatography (LC)-tandem mass spectrometry (MS/MS) offers a unique opportunity to gain unprecedented insight into the microbial composition and biomolecular activity of true environmental samples. However, in order to realize this potential for marine biofilms, new methods of protein extraction must be developed as many compounds naturally present in biofilms are known to interfere with common proteomic manipulations and LC-MS/MS techniques. In this study, we used amino acid analyses (AAA) and LC-MS/MS to compare the efficacy of three sample preparation methods [6 M guanidine hydrochloride (GuHCl) protein extraction + in-solution digestion + 2D LC; sodium dodecyl sulfate (SDS) protein extraction + 1D gel LC; phenol protein extraction + 1D gel LC] for the metaproteomic analyses of an environmental marine biofilm. The AAA demonstrated that proteins constitute 1.24% of the biofilm wet weight and that the compared methods varied in their protein extraction efficiencies (0.85−15.15%). Subsequent LC-MS/MS analyses revealed that the GuHCl method resulted in the greatest number of proteins identified by one or more peptides whereas the phenol method provided the greatest sequence coverage of identified proteins. As expected, metagenomic sequencing of the same biofilm sample enabled the creation of a searchable database that increased the number of protein identifications by 48.7% (≥1 peptide) or 54.7% (≥2 peptides) when compared to SwissProt database identifications. Taken together, our results provide methods and evidence-based recommendations to consider for qualitative or quantitative biofilm metaproteome experimental design.

Despite decades of research that has focused on understanding the formation and prevention of marine biofilms, relatively little is known about the microbial consortia and biomolecular components that are responsible for biofilm formation on varying substrates and in differing seasonal and geographical environments.1,2 This gap in our understanding is due in large part to the current inability to cultivate the vast majority of marine microbes in the laboratory and the resulting lack of associated genomic, functional genomic, and proteomic information. However, a new suite of innovative molecular techniques (e.g., metagenomics, metatranscriptomics, metaproteomics) now potentially allows us to circumvent this obstacle and offers a unique opportunity to gain unprecedented insight into the composition, biological potential, and biomolecular activity within marine biofilms in a culture-independent manner.3 While accepted methods for environmental metagenomics4 and metatranscriptomics5 have been developed for interrogating marine microbial communities, methods for metaproteomics (defined as “the large-scale characterization of the entire protein complement of environmental microbiota at a given point in time”6) have lagged behind.7 Specifically, the absence of standardized and reproducible methods for protein extraction, purification, quantitation, and processing that result in material amenable for subsequent liquid chromatography−mass spectrometry metaproteome analyses has impeded progress in this field.8,9 These deficiencies have prompted researchers to explore the development of methods for processing complex environmental microbial communities for proteomic and metaproteomic analyses7,10−14 but have also contributed to limiting the number of metaproteomic studies conducted on microbes collected directly from the estuarine or marine environment to size-filtered surface water bacterioplankton communities12,15−17 or uncultured bacterial symbionts from deep-sea tube worms.18

© 2012 American Chemical Society

Received: December 14, 2011 Accepted: April 2, 2012 Published: April 2, 2012

dx.doi.org/10.1021/ac203315n | Anal. Chem. 2012, 84, 4006−4013

Analytical Chemistry

Article

Scheme 1. Experimental Workflow

When compared to these microbial communities, environmental microbial biofilms pose even greater experimental challenges due to their increased community complexity, biomolecular complexity, environmental heterogeneity, and process-interfering contaminants.19 Nevertheless, necessary advances have been made in the proteomic characterization of environmental biofilms and are due largely to the study of low complexity natural acid mine drainage (AMD) microbial biofilms.3,20 The seminal study of AMD biofilms combined mass spectrometry-based proteomics with metagenomic analyses to identify 2033 proteins from the five most abundant biofilm species which included nearly half of the predicted proteins from the dominant biofilm bacterium.20 Subsequent studies on AMD biofilms have included quantitative proteomic,21 strain-resolved community proteomic,22 comparative proteogenomic,23 and extracellular proteome24 analyses, thus developing these microbial communities into a model system for environmental metaproteomic research. Overall, the combined research on AMD biofilms has served to exemplify both the power and challenge of biofilm proteomics. Despite these demonstrations, the absence of accepted methods for the metaproteomic interrogation of highly complex biofilms has restricted the number of studies to only two other environmental biofilm communities: wastewater treatment biofilms14 and groundwater basalt chip biofilms.25 Methodological challenges have clearly limited the number of studies in this emerging field and warrant the development and testing of protein processing methods for metaproteomic analyses.

In this study, we used amino acid analyses (AAA) and nanoliquid chromatography (LC) tandem mass spectrometry (MS/MS) to compare the efficacy of three methods [6 M guanidine hydrochloride (GuHCl) extraction + in-solution digestion + two-dimensional (2D) LC-MS/MS; sodium dodecyl sulfate (SDS) extraction + SDS-polyacrylamide gel electrophoresis (PAGE) and one-dimensional LC-MS/MS (1D gel LC); phenol extraction + 1D gel LC] for the metaproteomic analyses of highly complex marine biofilms. Our results, which are presented strictly in terms of the number of unique peptides and proteins identified on the basis of MS/MS spectra searches against a public database (SwissProt) and a matched sample metagenomic database (Biofilm database), demonstrate reproducible method-specific differences in the number of protein identifications and protein coverage and provide criteria to consider for qualitative or quantitative LC-MS/MS biofilm metaproteome experimental design.

EXPERIMENTAL SECTION

Chemicals. All chemicals used in this study were of analytical or higher grade. The UltraPure Tris buffered phenol solution, immobilized trypsin used for in-solution digestions, and sequencing grade modified trypsin for in-gel digestions were obtained from Invitrogen (Carlsbad, CA), Thermo Fisher Scientific (Rockford, IL), and Promega (Madison, WI), respectively.

Sample Preparation. The marine biofilm samples used in this study (several grams in total) were harvested in August 2010 from the hull of a US Navy ship in Norfolk, VA. Briefly, biofilms with a Fouling Rating 20 (i.e., “advanced slime”; Naval Ships’ Technical Manual Chapter 081, 2006) were scraped from the air−water interface and immediately snap frozen in sterile 50 mL conical tubes using an EtOH-dry ice bath. Upon returning to the laboratory, all environmental biofilm samples were subjected to the experimental workflow depicted in Scheme 1. Prior to processing and manipulation, an aliquot of each thawed and wet biofilm was submitted for AAA to determine the protein amount and amino acid composition of the starting material (Protein Chemistry Laboratory, Texas A&M University, College Station, TX). Sample preparation for LC-MS/MS analyses was initiated by grinding the biofilm with a mortar and pestle in the presence of liquid nitrogen and dividing the ground material into nine sample tubes (400 mg of wet weight/tube, 5 mg of protein/tube) to enable the testing, in triplicate, of the three methods to be compared in this study. Each method utilized one of three protein extraction methods: (1) modified guanidine hydrochloride (GuHCl) lysis10 that was adapted to marine biofilms, (2) sodium dodecyl sulfate (SDS), or (3) modified phenol.26

Modified GuHCl Lysis. Briefly, four volumes of lysis buffer [25 mM Tris-HCl pH 7.4, 150 mM NaCl, 0.5 mM ethylenediaminetetraacetic acid (EDTA), 1 mM MgCl2, 5 mM dithiothreitol (DTT)] were added to one volume of ground biofilm, and the samples were sonicated in a water bath for 10 min. Immediately after sonication, samples were incubated at 60 °C for 1 h and snap frozen once in a MeOH/dry ice mixture in the middle of this incubation. GuHCl (6 M final concentration) and DTT (10 mM final concentration) in 50 mM Tris buffer pH 7.6 were added to the samples which were then incubated for 1 h at 60 °C to facilitate protein denaturation. As each sample contained insoluble debris after this step, the samples were centrifuged at 5000g for 5 min at 4 °C and an aliquot of the resulting supernatant was desalted and analyzed by AAA in order to determine the amount of protein extracted from the biofilm. The remainder of the samples (which included the insoluble debris) were incubated in the dark with 25 mM iodoacetamide (IAA) for 1 h, diluted six times with 100 mM ammonium bicarbonate (ABC), and digested in solution overnight with immobilized trypsin. The samples were then centrifuged at 5000g for 5 min at 4 °C, and the peptide-containing supernatant was concentrated by speed-vac and desalted using C18 Strata columns (Phenomenex, Torrance, CA) according to the manufacturer’s recommendations. The desalted peptides were further concentrated and stored at −20 °C until they were ready for 2D LC-MS/MS analyses.

SDS. One volume of ground biofilm was mixed with four volumes of 4% SDS in 100 mM ABC, and the samples were sonicated in a water bath for 10 min. Immediately after sonication, samples were incubated at 60 °C for 1 h and snap frozen once in a MeOH/dry ice mixture in the middle of this incubation. As the samples contained insoluble debris, they were also centrifuged at 5000g for 5 min at 4 °C and an aliquot of supernatant was analyzed by AAA in order to determine the amount of protein extracted from the biofilm.

Modified Phenol. One volume of ground biofilm was mixed with four volumes of extraction buffer (100 mM Tris-HCl buffer pH 8.8, 10 mM EDTA, 5 mM DTT, 0.9 M sucrose) and sonicated for 10 min in a water bath. Tris buffered phenol was added to the samples in a 1:1 ratio, and samples were then shaken on ice for 30 min and centrifuged at 5000g for 5 min at 4 °C; the phenol phase was carefully removed. An equal volume of fresh phenol was added for a second extraction. After incubation on ice and centrifugation, this second phenol layer was removed and combined with the previously collected phenol extract. The combined phenol phases were then back-extracted with an equal volume of fresh extraction buffer. Extracted proteins were precipitated from the phenol phase by adding five volumes of 100 mM ammonium acetate in 100% MeOH prechilled to −80 °C, incubated overnight at −80 °C, and then collected by centrifugation at 7000g for 30 min at 4 °C. The protein pellets were then resuspended in 100 mM ammonium acetate in ice-cold MeOH, incubated at −20 °C for 30 min, and collected by centrifugation at 12 000g for 30 min at 4 °C. This washing step was repeated twice to remove the remaining phenol, and the last wash was carried out using ice-cold acetone. The remaining pellets were air-dried and dissolved in 2% SDS in water to prepare the samples for SDS-PAGE. An aliquot of each protein extract was taken for AAA.


1D-SDS-PAGE and In-Gel Digestion. To examine the banding patterns that resulted from each protein extraction method, aliquots from extracts prepared by the same method were pooled (total protein = 15 μg), mixed with 4× loading buffer (Invitrogen), and heat denatured. All samples were then loaded onto a 1D-SDS-PAGE gel (4−12% gradient Bis-Tris NuPAGE, Invitrogen) and separated by electrophoresis (106 V, 90 min, 2-(N-morpholino)ethanesulfonic acid buffer). After electrophoresis, the gel was stained with Coomassie blue (BioRad, Hercules, CA) and imaged. Proteins extracted by the SDS and modified phenol methods were separated by 1D-SDS-PAGE prior to digestion as described above with the exception that 20 μg of total protein from each replicate was run in individual lanes. After staining and imaging, the sample lanes were cut into six bands, and proteins were reduced by DTT, alkylated by IAA, and digested in-gel using porcine trypsin (Promega) overnight. The peptides were extracted from the gel pieces by sonication in 0.1% formic acid (FA) in 60% acetonitrile (ACN). The extracts were then collected, and this step was repeated three more times. A final gel dehydration step (i.e., sonication with 100% ACN) was used to minimize peptide loss. Peptide digests corresponding to the same replicate and band were combined and concentrated via speed-vac. All samples (18 per extraction method) were analyzed by reverse phase liquid chromatography (LC) and tandem mass spectrometry (MS/MS). Peptide samples from the modified GuHCl lysis method were separated by 2D-LC.

Liquid Chromatography Tandem Mass Spectrometry (LC-MS/MS) Analyses. One-dimensional (1D-LC) or two-dimensional (2D-LC) separations27,28 were performed for each extraction method as shown in Scheme 1. The chromatographic separations used in this study are described in greater detail in the accompanying Supporting Information.
A Q-STAR Elite (AB Sciex) hybrid quadrupole-time-of-flight (Q-TOF) mass spectrometer equipped with a Nano III ESI source was used to acquire all MS and MS/MS peptide spectra using information dependent acquisition (IDA). A mass range of 350−1600 Da was monitored in TOF MS scan. The three most abundant precursor ions from TOF MS scans with an intensity >20 counts per second were submitted for MS/MS analyses. Former target ions were excluded from MS/MS submission for 15 s.

Protein Identification and Bioinformatics. MS data were acquired using Analyst QS (AB Sciex), and tandem mass spectra were extracted by mascot.dll. Spectra extracted from the six bands or six salt steps of the same sample were merged and analyzed using Mascot (Matrix Science, London, UK; Mascot Server version 2.3.02) and X! Tandem (The GPM, thegpm.org; version 2007.01.01.1). Mascot was set up to search the SwissProt database (version 57.15; 515 203 entries, release date March 24, 2009) or the Biofilm database (developed from the matched biofilm metagenomic DNA sequencing effort; 243 146 entries + 112 common contaminant entries; see Supporting Information for details) assuming the digestion enzyme was trypsin and allowing for four missed cleavages. X! Tandem was set up to search a subset of the SwissProt_57.15 and Biofilm databases with the same settings as Mascot. Fragment ion mass tolerance was set to 0.20 Da and parent ion tolerance to 0.20 Da. Deamidation of asparagine and glutamine, oxidation of methionine, and the iodoacetamide derivative of cysteine were specified in Mascot and X! Tandem as variable modifications. Scaffold software (version Scaffold_3_00_08, Proteome Software Inc., Portland, OR) was used to validate the MS/MS-based peptide and protein identifications. Peptide identifications were accepted if they could be established at >80.0% probability or >95.0% probability as specified by the Peptide Prophet algorithm29 (see text for specific settings), and protein identifications were accepted if they could be established at >95.0% probability and contained at least two identified peptides or >90.0% probability and contained at least one identified peptide as determined by the Protein Prophet algorithm.30 Proteins that contained similar peptides and could not be differentiated on the basis of MS/MS analyses alone were grouped to satisfy the principles of parsimony. All common contaminants (e.g., trypsin, keratins) were removed from the protein lists prior to analyses. All proteomic data from this study have been deposited in ProteomeCommons.org Tranche under the following hash: +ttu0hYjRw26GhVSrYrI89YhhNJPsZcKlPlFZ7pBkd/ 7o4WLpAJ/S05A8yYvE4xHFYb85mzxLs2YJ+5/4MFPc3 × 4W9oAAAAAAAAPHw==.
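The two acceptance settings described above can be written down as a small filter. This is only a sketch of the stated thresholds, not of how Scaffold, Peptide Prophet, or Protein Prophet compute probabilities; the names `Criteria`, `STRICT`, `RELAXED`, and `accept_protein` are hypothetical, and the pairing of thresholds follows the settings quoted with the results (≥2 peptides with >95% protein and >80% peptide probability; ≥1 peptide with >90% protein and >95% peptide probability).

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Criteria:
    """Hypothetical container for one acceptance setting."""
    min_protein_prob: float  # protein probability must exceed this value
    min_peptide_prob: float  # peptide probability must exceed this value
    min_peptides: int        # minimum number of accepted peptides per protein


# The two settings named in the text (probabilities as fractions).
STRICT = Criteria(min_protein_prob=0.95, min_peptide_prob=0.80, min_peptides=2)
RELAXED = Criteria(min_protein_prob=0.90, min_peptide_prob=0.95, min_peptides=1)


def accept_protein(protein_prob: float,
                   peptide_probs: Sequence[float],
                   criteria: Criteria) -> bool:
    """Return True if a protein passes the given acceptance setting."""
    # Keep only peptides whose probability clears the peptide threshold.
    accepted = [p for p in peptide_probs if p > criteria.min_peptide_prob]
    return (protein_prob > criteria.min_protein_prob
            and len(accepted) >= criteria.min_peptides)
```

For example, a protein at 0.92 probability with a single peptide at 0.96 passes `RELAXED` but not `STRICT`, which is why the single-peptide setting admits many more identifications.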



RESULTS AND DISCUSSION

Amino Acid Analyses (AAA). The ability to accurately quantify protein concentrations is a critical aspect of standard protein chemistry procedures and a necessary measurement when attempting to develop repeatable and reproducible methods for the metaproteomic analyses of environmental samples. However, environmental biofilms are often known to contain interfering components that confound traditional protein quantification methods.10 The wet biofilm samples and derivative protein extracts chosen for this study were found to contain green pigments from photosynthetic microorganisms that interfered with the use of traditional protein quantification methods that are based on UV absorption (data not shown). As such, we chose to use acid hydrolysis AAA to provide a precise determination of protein concentration and relative amino acid composition from the original wet marine biofilm sample as well as the method-derived protein extracts (Scheme 1). In contrast to the protein content of surface seawater microorganism cell pellets (estimated to be 10% (weight/weight)),17 AAA of the wet biofilm starting material revealed a total protein content of 1.24% (weight/biofilm wet weight) suggesting a lower cellular density than that found in cell pellets and a greater contribution of the nonproteinaceous components of the hydrated extracellular matrix in marine biofilms. A wet weight of biofilm corresponding to ∼5 mg of total protein was subjected to each protein extraction procedure, and the quantities of the resulting protein extracts are summarized in Table 1S in the Supporting Information. Protein extraction with 4% SDS was found to be the most efficient method tested (yield 782 ± 163 μg of protein, 15.15% extraction efficiency) but also demonstrated large variability.
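The extraction efficiencies quoted in this part of the Results are the extracted protein divided by the protein loaded into each tube. The sketch below reproduces that arithmetic under one stated assumption: the per-tube input is taken as 400 mg wet weight × 1.24% protein (about 4.96 mg). The exact inputs are in Table 1S, which is not reproduced here, so the computed values (roughly 15.8%, 2.7%, and 0.9%) land near, rather than exactly on, the reported 15.15%, 2.6%, and 0.85%. The function and variable names are hypothetical.

```python
WET_WEIGHT_MG = 400.0      # wet biofilm per tube, from the text
PROTEIN_FRACTION = 0.0124  # 1.24% protein by wet weight, from AAA


def input_protein_ug(wet_mg: float = WET_WEIGHT_MG,
                     fraction: float = PROTEIN_FRACTION) -> float:
    """Estimated protein loaded per tube, in micrograms (assumed input)."""
    return wet_mg * fraction * 1000.0


def efficiency_pct(yield_ug: float, input_ug: float) -> float:
    """Extraction efficiency as a percentage of the protein input."""
    return 100.0 * yield_ug / input_ug


# Mean yields per method in micrograms, as reported in the text.
MEAN_YIELD_UG = {"SDS": 782.0, "GuHCl": 134.0, "phenol": 44.0}

if __name__ == "__main__":
    total = input_protein_ug()
    for method, yield_ug in MEAN_YIELD_UG.items():
        print(f"{method}: {efficiency_pct(yield_ug, total):.2f}% of {total:.0f} ug")
```

The ranking (SDS > GuHCl > phenol) does not depend on the assumed input, since the same denominator is used for all three methods.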
The modified GuHCl lysis method yielded protein extracts with the most reproducible protein amounts among the analyzed replicates (134 ± 10 μg) but with a comparatively low extraction efficiency (2.6%). Finally, the modified phenol extraction method was found to be the least efficient extraction method (0.85% extraction efficiency) and resulted in protein amounts that demonstrated the greatest variability among the three replicates (44 ± 14 μg). This result was not entirely surprising as the modified phenol extraction method included a precipitation step that is known to exacerbate method variability. Furthermore, we anticipated that a simple increase in the starting amount of biological material would increase the protein extraction efficiency using this method and indeed that


extracts demonstrated staining in a broad MW range (>250−10 kDa), the SDS method protein extracts demonstrated a population of proteins with a more restricted and smaller MW size distribution (75−5 kDa). The difference in these profiles may reflect the isolation of different protein populations based on the method of extraction or may be due to the presence of charged contaminants in the SDS extracts (e.g., DNA, RNA) which interfere with electrophoresis. As phenol effectively separates proteins from these contaminants, protein extracts derived from the modified phenol method are more compatible with SDS-PAGE analyses and the resulting migration profiles are likely more representative of the true MW size distribution of the extracted protein population. Overall, the combined results of AAA and SDS-PAGE analysis suggested that the protein populations being extracted from the marine biofilm sample varied on the basis of the extraction method used and led to the prediction that a large percentage of the peptides and proteins to be identified by LC-MS/MS would likely be unique to the particular extraction method utilized. In order to prepare the SDS and phenol extracted material for LC-MS/MS, gel bands were cut from SDS-PAGE gels for in-gel protein digestion (Figure 2SB,C in the Supporting Information). Each run replicate produced a similar electrophoretic mobility pattern, and special care was taken to cut the gel bands at the same MW range in each replicate lane to ensure similar peptide complexity in each LC-MS/MS sample and higher reproducibility among replicates. Method Comparison via LC-MS/MS. 
Although the analyses of metaproteomic data sets do not have to be limited to organisms for which matched genomic sequence is available,31 they are most informative in combination with other “omics” approaches (genomics, transcriptomics, etc.).20,32 As such, the LC-MS/MS spectra acquired from the three protein extraction methods were searched against two databases: SwissProt 57.15 and a Biofilm database that was generated directly from a sample-matched biofilm metagenome. The replicates of each extraction method were imported into Scaffold 3 software as individual biological samples (nine total samples) and assigned to an extraction method sample category (three categories: GuHCl, SDS, and phenol). Using the combined results of all three methods, 64 proteins were identified from the SwissProt database using ≥2 peptides/protein (95% protein and 80% peptide probability) and 1504 spectra were assigned to peptide sequences. In comparison, 99 proteins (55% increase) were identified from the Biofilm database using ≥2 peptides/protein and 2131 spectra (42% increase) were assigned to peptide sequences. The same trend was observed when generating protein and peptide assignments using ≥1 peptide/protein (90% protein, 95% peptide probability): the Biofilm database search results demonstrated a 49% increase in protein identifications (568 versus 382) and a 39% increase in peptide assignments (2935 versus 2106). The findings suggest that the gene/protein sequences of many of the marine microorganisms present in this particular biofilm were not present in SwissProt as they likely have not previously been sequenced. As the resolving power of metaproteome analyses relies heavily upon reference databases against which MS data are searched,19 it was predicted that generating a matched metagenome would provide for a more thorough metaproteome analysis, and indeed, that was the case.
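The percentage gains quoted above follow directly from the reported counts. A minimal check, using only numbers stated in the text (the function name and dictionary labels are hypothetical):

```python
def percent_increase(baseline: float, improved: float) -> float:
    """Relative gain of `improved` over `baseline`, in percent."""
    return 100.0 * (improved - baseline) / baseline


# (SwissProt count, Biofilm database count) pairs taken from the text.
COUNTS = {
    "proteins, >=2 peptides/protein": (64, 99),
    "spectra, >=2 peptides/protein": (1504, 2131),
    "proteins, >=1 peptide/protein": (382, 568),
    "peptide assignments, >=1 peptide/protein": (2106, 2935),
}

if __name__ == "__main__":
    for label, (swissprot, biofilm) in COUNTS.items():
        print(f"{label}: {percent_increase(swissprot, biofilm):.1f}%")
```

The two protein-level results round to the 54.7% and 48.7% figures given in the abstract.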
Even still, the total number of proteins identified in our highly complex marine biofilm was far lower than the number of proteins identified in the low complexity AMD biofilm.20 As

was the case with a subsequent study utilizing the same method with 10.6 mg of biofilm resulting in an average extraction efficiency of 7.00% (data not shown). A summary of the relative molar amino acid (AA) content of the original biofilm sample and derived protein extracts prepared by each method is shown in Figure 1S in the Supporting Information. We observed a lower content of the AA Arg (1.7%), Lys (2.0%), and His (0.5%) in protein extracts prepared by the modified GuHCl lysis method (Figure 1SB in the Supporting Information) in comparison to the original wet biofilm (6.5%, 4.0%, and 1.2%, Figure 1SA in the Supporting Information, respectively) from which the extracts were derived. In contrast, the percentage content of these AA in extracts prepared by the other two methods corresponded more closely to the original wet biofilm (SDS extraction, 7.3%, 3.4% and 1.4%, Figure 1SC in the Supporting Information; modified phenol extraction, 5.0%, 4.0%, and 1.3%, Figure 1SD in the Supporting Information, respectively). Since trypsin specifically hydrolyses proteins at the C-terminus of Lys and Arg, this decrease in basic AA in the modified GuHCl lysis method extracts merits some consideration as this introduced skew may potentially result in the loss of trypsin active sites and identified tryptic peptides. Thus, this bias toward uncharged AA may result in the identification of different peptides (and proteins) in a sample thus skewing its proteomic content analysis. Indeed, evidence for this can be seen in our data set when examining the number of identified peptides that did not harbor a terminal Lys or Arg that were unique to each extraction method (e.g., identification using the Biofilm database: modified GuHCl lysis, 50 peptides; modified phenol, 5 peptides; SDS, 0 peptides). SDS-PAGE Protein Separation. 
To obtain a preliminary indication of whether the same proteins were being extracted from the marine biofilm using the three methods of interest, we performed an SDS-PAGE analysis. Rather surprisingly, the electrophoretic mobility patterns of the three protein extracts were markedly different (Figure 2SA; see Figure 2S in the Supporting Information). The greatest staining intensity was observed with the modified phenol extracted protein samples, and no staining was observed with the modified GuHCl lysis extracts even though the same protein amount was loaded into each sample well. Confirmatory staining assays using a more sensitive protein staining agent (SYPRO Ruby) produced the same result for the modified GuHCl lysis extracts (data not shown), suggesting an incompatibility between proteins extracted using GuHCl and SDS-PAGE-based protein separation and gel staining. In contrast, the SDS and modified phenol extraction methods resulted in protein electrophoretic mobility patterns that could be visualized, but neither method produced protein extracts that exhibited characteristic protein banding. The lack of sharp banding patterns from complex protein extracts obtained directly from the environment has been previously observed in soil and wastewater extracts13 as well as Chesapeake Bay surface water microbial community samples.12 One possible explanation for this observation is that, due to the species and biomolecular complexity of environmental samples, the proteins present in these extracts are highly heterogeneous and have varied post-translational modifications which, when combined, do not allow the visualization of individually banded proteins. Nevertheless, an obvious difference in the protein electrophoretic mobility patterns between the SDS and modified phenol method extracts was the molecular weight (MW) size distribution. While modified phenol method protein


Figure 1. Comparison of LC-MS/MS identified proteins and peptides using each protein extraction method. Venn diagrams were generated using 95% protein probability/80% peptide probability/≥2 peptides per protein (left two columns) or 90% protein probability/95% peptide probability/≥1 peptide per protein (right two columns) from searches using the SwissProt 57.15 database (white diagrams) or Biofilm database (gray diagrams). The number of proteins (diagrams A−D) and peptides (diagrams E−H) identified using each protein extraction method at the aforementioned probabilities are shown. A full list of identified proteins can be found in Table 3S in the Supporting Information.

methods. To improve the reproducibility, better separation methods on the protein and peptide level can be employed to lower sample complexity at the time of sampling by LC-MS/ MS. This can be achieved, in part, by cutting sample lanes into more bands for the gel-based methods, introducing a greater number of salt steps for the 2D-LC method, or increasing the length of the HPLC gradient for all methods. As the previous experiments suggested that the protein populations being extracted from the marine biofilm sample varied on the basis of the extraction method used, we sought to determine the number of proteins and peptides identified by LC-MS/MS that were unique and common to each of the three extraction methods utilized. The greatest number of proteins identified by ≥2 peptides/protein (>95% protein and >80% peptide probability, probability settings satisfied in all nine samples) were found using the modified phenol extracted material (SwissProt database = 44 unique proteins, Biofilm database = 72 proteins; Figure 1A,B). At the same probability settings, 33 and 54 proteins were identified in the modified GuHCl lysis extracts and 14 and 17 proteins in the SDS extracts, respectively (Figure 1A,B). This finding demonstrates that the proteins/peptides were identified with higher protein sequence coverage in the modified phenol method extracts than the modified GuHCl lysis method or SDS method extracts. When proteins identified by ≥1 peptide/protein (>90% protein and >95% peptide probability) were investigated, the total number of identified proteins increased by more than five times and the greatest number of proteins identified were found using the modified GuHCl lysis method extracts (Figure 1C,D). As such, the modified GuHCl lysis method yielded the deepest proteome coverage among examined methods. 
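As a rough check on the "more than five times" statement, and assuming it refers to the combined protein totals reported for the two probability settings (64 and 99 proteins at ≥2 peptides/protein; 382 and 568 proteins at ≥1 peptide/protein, for the SwissProt and Biofilm databases, respectively), the ratios work out as:

```latex
\frac{382}{64} \approx 5.97 \quad (\text{SwissProt}), \qquad
\frac{568}{99} \approx 5.74 \quad (\text{Biofilm database})
```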
These results further illustrate the complexity of the biofilm as the probability of obtaining MS/MS spectra of peptides that belong to the same protein is small even though we employed several separation methods prior to mass spectrometry. Similar findings were observed by Froehlich et al.34 when comparing chloroplastic envelope proteins identified by MudPIT analysis with proteins identified by 1D-SDS LC-MS/MS analysis. Again,

there are a number of factors (e.g., sample type, protein amount analyzed, protein modifications, instrument type, run time, search conditions, quality of protein database) that can influence protein identification in environmental samples, it is currently difficult to draw any conclusions to account for the numerical discrepancy. For example, this study is the first to describe a metaproteomic analysis of a complex marine biofilm and the first study in which the protein amount was carefully estimated prior to proteomic analysis. Thus, given our differing sample source and processing method, meaningful comparisons of the total number of proteins identified are problematic as there is no comparable prior study.

In addition to absolute quantities or yield, experimental repeatability and reproducibility are important practical considerations for any analytical technique, particularly when comparing methods. Experimental reproducibility, which represents the variation in sample replicates,33 was calculated by dividing the number of unique proteins identified in all three method replicates by the total number of unique proteins identified in the respective extraction method. In general, the experimental reproducibility was similar for all three methods tested and suggested that the lysis, extraction, and LC-MS/MS analysis procedures were comparable among the methods (Table 2S in the Supporting Information). Barring one exception (SDS extraction, ≥1 peptide/protein), the observed reproducibility was higher when the SwissProt database was used for searches (86−92% for proteins identified by ≥2 peptides/protein, 43−51% for proteins identified by ≥1 peptide/protein) than when the Biofilm database was used (74−87%, 35−37%, respectively). Thus, noticeably lower reproducibility was observed when more proteins were identified.
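The reproducibility measure defined above (proteins found in all three replicates divided by all unique proteins found with that method) is an intersection-over-union of the replicate protein lists. A minimal sketch, with a hypothetical function name and made-up accession labels used purely for illustration:

```python
from typing import Iterable, Sequence


def reproducibility_pct(replicates: Sequence[Iterable[str]]) -> float:
    """Percent of a method's unique proteins that appear in every replicate."""
    sets = [set(r) for r in replicates]
    if not sets:
        raise ValueError("at least one replicate is required")
    union = set().union(*sets)        # all unique proteins for the method
    if not union:
        raise ValueError("no proteins were identified")
    core = set.intersection(*sets)    # proteins seen in all replicates
    return 100.0 * len(core) / len(union)


if __name__ == "__main__":
    # Hypothetical identifications for three replicates of one method.
    rep1 = {"P1", "P2", "P3", "P4"}
    rep2 = {"P1", "P2", "P3"}
    rep3 = {"P1", "P2", "P3", "P5"}
    print(reproducibility_pct([rep1, rep2, rep3]))  # 3 shared of 5 unique
```

Under this definition, every additional protein that is seen in only one replicate enlarges the denominator without touching the numerator, which is consistent with the lower values observed when more proteins are identified.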
This may be due to the inherent complexity of environmental samples and the low probability of peptides from the same protein being sampled in all LC-MS/MS runs. That is, when we improve our protein identification capabilities using the Biofilm database, peptides can be matched to different protein sequences other than those found in the SwissProt database, thus lowering the experimental reproducibility of the

dx.doi.org/10.1021/ac203315n | Anal. Chem. 2012, 84, 4006−4013


Again, in a trend similar to the ≥2 peptides/protein identifications, searching against the Biofilm database provided a greater number of protein identifications (38−59% increase) when compared to SwissProt database identifications. The full list of identified proteins can be found within Table 3S in the Supporting Information. On the peptide level, the most peptides were identified in modified phenol extracts when proteins with ≥2 peptides/protein were examined (Figure 1E,F) while the modified GuHCl lysis method identified the most peptides in proteins with ≥1 peptide/protein (Figure 1G,H). Interestingly, little overlap in identified peptides (Figure 1E−H) was observed among the three tested methods, as only 4.4−10.0% of the peptides were common to all of them (the same is true for identified proteins). Both the modified GuHCl lysis method and modified phenol method demonstrated large numbers of unique peptides. The modified phenol method and SDS method consistently had more peptides in common between them than did the modified GuHCl lysis method and SDS method (Figure 1E−H). This may be due to the fact that the modified phenol method and SDS method were both in-gel digestion methods. Overall, the modified GuHCl lysis method is a simple and convenient method that identified the greatest number of proteins (≥1 peptide/protein, 90% protein and 95% peptide probability). The modified phenol method was more labor intensive, but the samples prepared by this method contained fewer contaminants; thus, the acquired MS/MS spectra were of better quality, leading to higher identification probabilities.
Furthermore, the combination of the modified phenol extraction and in-gel digestion led to higher coverage of the identified proteins, and these are suitable and necessary characteristics for the development of improved approaches for quantitative metaproteomic measurements.9 Although the SDS extraction method yielded the highest protein amounts in the extracts, the fewest peptides and proteins were identified using this method. We suspect that the biochemical complexity of the marine biofilm was too great for this extraction method. As a result, this extraction method failed to specifically extract only proteins out of the complex environmental matrix and instead coextracted contaminants that hindered downstream sample preparation. It is possible that a simple sample cleanup step (e.g., trichloroacetic acid−acetone precipitation) could be introduced in this method to increase the quality of the SDS extracted proteins. Considering the relatively low degree of identified peptide and protein overlap, the three tested methods can be viewed as complementary and could be used in tandem to investigate the same environmental sample to obtain a more accurate characterization of protein expression. For example, while all three proteomic approaches tested identified proteins that were most similar to those found in eukaryotes such as Hydra vulgaris, Aplysia californica, Patella granatina, Paracentrotus lividus, Dictyostelium discoideum, and Caenorhabditis elegans and marine prokaryotes such as Synechococcus spp., Erythrobacter litoralis, Roseobacter denitrificans, Synechocystis spp., and Silicibacter sp., there were obvious differences in the types of proteins identified.
The top five proteins identified using the GuHCl lysis method (tubulin β chain, allophycocyanin α subunit, C-phycocyanin β chain, bacterial adenosine triphosphate (ATP) synthase α and β subunits), SDS method (histone H2B, nonmuscle actin, polyubiquitin, tubulin β chain, nonannotated protein), and phenol method (nonmuscle actin, bacterial ATP synthase α subunit, tubulin β chain, nonannotated protein, C-phycoerythrin class 1 β subunit) varied in protein type, cellular localization, and putative organism of origin (Table 3S in the Supporting Information).
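The overlap comparison among the three methods (peptides common to all, shared between pairs, and unique to one method) amounts to basic set arithmetic. The sketch below uses made-up peptide labels, and it assumes the percentage common to all methods is taken relative to the union of peptides from all three methods; the text does not state which denominator was used, so treat that choice as an assumption.

```python
from itertools import combinations


def overlap_summary(ids_by_method):
    """Summarize overlap of identifications among extraction methods.

    Returns (percent common to all methods, pairwise shared counts,
    counts unique to each method).
    """
    sets = {name: set(ids) for name, ids in ids_by_method.items()}
    union = set().union(*sets.values())
    common = set.intersection(*sets.values())
    # Assumed denominator: every peptide identified by any method.
    pct_common = 100.0 * len(common) / len(union) if union else 0.0
    pairwise = {
        (a, b): len(sets[a] & sets[b])
        for a, b in combinations(sorted(sets), 2)
    }
    unique = {
        name: len(s - set().union(*(t for n, t in sets.items() if n != name)))
        for name, s in sets.items()
    }
    return pct_common, pairwise, unique


# Hypothetical peptide labels, not data from the study.
peptides = {
    "GuHCl": ["a", "b", "c", "d", "e"],
    "SDS": ["a", "f", "g"],
    "phenol": ["a", "f", "h", "i"],
}

pct_common, pairwise, unique = overlap_summary(peptides)
print(f"common to all methods: {pct_common:.1f}%")
print("pairwise shared:", pairwise)
print("unique per method:", unique)
```

A low common percentage together with large unique counts is the pattern that supports treating the methods as complementary.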



CONCLUSIONS

The majority of bacteria on the planet reside in biofilms,35 and our inability to culture more than ∼1% of these species36 suggests that new methods are required to interrogate these communities, not only to foster a better fundamental understanding of community structure and function but also to enable the development of preventative measures that would be of obvious benefit to human health and medicine, the environment, industrial processes, and the materials industry. This study focused on the critical need for efficient and nonbiased protein extraction and sample preparation methods for metaproteomic analyses of complex marine biofilms. However, given the tremendous complexity and variability of environmental biofilms (e.g., chemical and biochemical composition, species composition and concentration, dynamic range of protein expression, membrane composition) and in the absence of prior sample-specific knowledge, we and others believe that it is unlikely that there is a “universal” protein extraction procedure that will work equally well on all biofilm sample types.10,37 While our findings clearly demonstrate that the different biochemical methods tested introduce a protein extraction bias, two of the methods described are immediately suitable for the metaproteomic analyses of biofilms based on the type of information that is desired. For qualitative studies (i.e., cataloging proteins expressed in biofilms), the modified GuHCl lysis followed by in-solution digestion and 2D LC-MS/MS is a suitable method and will result in the identification of a large number of proteins. For relative quantitative studies, the modified phenol method followed by 1D-SDS-PAGE and in-gel digestion is recommended as it provides greater protein coverage and will result in better statistical confidence of the protein/peptide ratios since more ratios will be calculated per protein. Individually or in tandem, the methods described in this study are capable of facilitating metaproteomic studies and, when combined with other “omics” methods, have the potential to greatly enhance our understanding of the in situ microbial activity within marine biofilms.
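The point that greater protein coverage yields better statistical confidence in protein ratios can be illustrated numerically. The sketch below assumes a protein-level ratio is summarized as the mean of log2 peptide ratios with a standard error of s/√n; this statistic is an illustrative choice, not one specified in the study, and the ratio values are invented.

```python
import math
import statistics


def protein_ratio_summary(peptide_ratios):
    """Return (mean log2 ratio, standard error) from peptide-level ratios.

    Requires at least two peptide ratios so that a sample standard
    deviation can be computed.
    """
    logs = [math.log2(r) for r in peptide_ratios]
    mean = statistics.mean(logs)
    se = statistics.stdev(logs) / math.sqrt(len(logs))
    return mean, se


# Hypothetical peptide ratios for the same protein at low and high coverage.
few_peptides = [1.8, 2.3]
many_peptides = [1.8, 2.3, 2.0, 2.1, 1.9, 2.2]

mean_few, se_few = protein_ratio_summary(few_peptides)
mean_many, se_many = protein_ratio_summary(many_peptides)

print(f"2 peptides: log2 ratio = {mean_few:.2f} +/- {se_few:.2f}")
print(f"6 peptides: log2 ratio = {mean_many:.2f} +/- {se_many:.2f}")
```

With comparable peptide-to-peptide scatter, the standard error shrinks as more peptides contribute, which is the practical benefit of the higher sequence coverage obtained with the phenol method.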



ASSOCIATED CONTENT

Supporting Information

Additional information is available as noted in the text. This material is available free of charge via the Internet at http://pubs.acs.org.



AUTHOR INFORMATION

Corresponding Author

*Address: Center for Bio/Molecular Science and Engineering, Naval Research Laboratory, 4555 Overlook Avenue-SW, Bldg. 30/Code 6910, Washington, DC 20375, USA. Tel: 202.767.0394. Fax: 202.767.9594. E-mail: [email protected]. mil.

Notes

The authors declare no competing financial interest.



ACKNOWLEDGMENTS

We would like to thank Dr. Zheng Wang for his bioinformatic contributions. D.H.L. and W.J.H. IV are National Research Council postdoctoral fellows. This work was supported by the Office of Naval Research via U.S. Naval Research Laboratory core funds. The opinions and assertions contained herein are those of the authors and are not to be construed as those of the U.S. Navy, military service at large, or U.S. Government.



REFERENCES

(1) Cooksey, K. E.; Wigglesworth-Cooksey, B. Aquat. Microb. Ecol. 1995, 9, 87−96.
(2) Qian, P. Y.; Lau, S. C.; Dahms, H. U.; Dobretsov, S.; Harder, T. Mar. Biotechnol. (NY) 2007, 9 (4), 399−410.
(3) Verberkmoes, N. C.; Denef, V. J.; Hettich, R. L.; Banfield, J. F. Nat. Rev. Microbiol. 2009, 7 (3), 196−205.
(4) Venter, J. C.; Remington, K.; Heidelberg, J. F.; Halpern, A. L.; Rusch, D.; Eisen, J. A.; Wu, D.; Paulsen, I.; Nelson, K. E.; Nelson, W.; Fouts, D. E.; Levy, S.; Knap, A. H.; Lomas, M. W.; Nealson, K.; White, O.; Peterson, J.; Hoffman, J.; Parsons, R.; Baden-Tillson, H.; Pfannkoch, C.; Rogers, Y. H.; Smith, H. O. Science 2004, 304 (5667), 66−74.
(5) Frias-Lopez, J.; Shi, Y.; Tyson, G. W.; Coleman, M. L.; Schuster, S. C.; Chisholm, S. W.; Delong, E. F. Proc. Natl. Acad. Sci. U. S. A. 2008, 105 (10), 3805−3810.
(6) Wilmes, P.; Bond, P. L. Environ. Microbiol. 2004, 6 (9), 911−920.
(7) Pierre-Alain, M.; Christophe, M.; Severine, S.; Houria, A.; Philippe, L.; Lionel, R. Microb. Ecol. 2007, 53 (3), 426−434.
(8) Lacerda, C. M.; Reardon, K. F. Briefings Funct. Genomics Proteomics 2009, 8 (1), 75−87.
(9) Keller, M.; Hettich, R. Microbiol. Mol. Biol. Rev. 2009, 73 (1), 62−70.
(10) Thompson, M. R.; Chourey, K.; Froelich, J. M.; Erickson, B. K.; Verberkmoes, N. C.; Hettich, R. L. Anal. Chem. 2008, 80 (24), 9517−9525.
(11) Ogunseitan, O. A. J. Microbiol. Methods 1993, 17 (4), 273−281.
(12) Kan, J.; Hanson, T. E.; Ginter, J. M.; Wang, K.; Chen, F. Saline Syst. 2005, 1, 7.
(13) Benndorf, D.; Vogt, C.; Jehmlich, N.; Schmidt, Y.; Thomas, H.; Woffendin, G.; Shevchenko, A.; Richnow, H. H.; Von, B. M. Biodegradation 2009, 20 (6), 737−750.
(14) Abram, F.; Gunnigle, E.; O’Flaherty, V. Electrophoresis 2009, 30 (23), 4149−4151.
(15) Morris, R. M.; Nunn, B. L.; Frazar, C.; Goodlett, D. R.; Ting, Y. S.; Rocap, G. ISME J. 2010, 4 (5), 673−685.
(16) Sowell, S. M.; Wilhelm, L. J.; Norbeck, A. D.; Lipton, M. S.; Nicora, C. D.; Barofsky, D. F.; Carlson, C. A.; Smith, R. D.; Giovannoni, S. J. ISME J. 2009, 3 (1), 93−105.
(17) Sowell, S. M.; Abraham, P. E.; Shah, M.; Verberkmoes, N. C.; Smith, D. P.; Barofsky, D. F.; Giovannoni, S. J. ISME J. 2011, 5 (5), 856−865.
(18) Markert, S.; Arndt, C.; Felbeck, H.; Becher, D.; Sievert, S. M.; Hugler, M.; Albrecht, D.; Robidart, J.; Bench, S.; Feldman, R. A.; Hecker, M.; Schweder, T. Science 2007, 315 (5809), 247−250.
(19) Schneider, T.; Riedel, K. Proteomics 2010, 10 (4), 785−798.
(20) Ram, R. J.; Verberkmoes, N. C.; Thelen, M. P.; Tyson, G. W.; Baker, B. J.; Blake, R. C.; Shah, M.; Hettich, R. L.; Banfield, J. F. Science 2005, 308 (5730), 1915−1920.
(21) Belnap, C. P.; Pan, C.; Denef, V. J.; Samatova, N. F.; Hettich, R. L.; Banfield, J. F. ISME J. 2011, 5 (7), 1152−1161.
(22) Lo, I.; Denef, V. J.; Verberkmoes, N. C.; Shah, M. B.; Goltsman, D.; DiBartolo, G.; Tyson, G. W.; Allen, E. E.; Ram, R. J.; Detter, J. C.; Richardson, P.; Thelen, M. P.; Hettich, R. L.; Banfield, J. F. Nature 2007, 446 (7135), 537−541.
(23) Denef, V. J.; Kalnejais, L. H.; Mueller, R. S.; Wilmes, P.; Baker, B. J.; Thomas, B. C.; Verberkmoes, N. C.; Hettich, R. L.; Banfield, J. F. Proc. Natl. Acad. Sci. U. S. A. 2010, 107 (6), 2383−2390.
(24) Erickson, B. K.; Mueller, R. S.; Verberkmoes, N. C.; Shah, M.; Singer, S. W.; Thelen, M. P.; Banfield, J. F.; Hettich, R. L. J. Proteome Res. 2010, 9 (5), 2148−2159.
(25) Paszczynski, A. J.; Paidisetti, R.; Johnson, A. K.; Crawford, R. L.; Colwell, F. S.; Green, T.; Delwiche, M.; Lee, H.; Newby, D.; Brodie, E. L.; Conrad, M. Biodegradation 2011, 22 (6), 1045−1059.
(26) Hurkman, W. J.; Tanaka, C. K. Plant Physiol. 1986, 81 (3), 802−806.
(27) McDonald, W. H.; Yates, J. R., III Dis. Markers 2002, 18 (2), 99−105.
(28) Hervey, W. J.; Khalsa-Moyers, G.; Lankford, P. K.; Owens, E. T.; McKeown, C. K.; Lu, T. Y.; Foote, L. J.; Asano, K. G.; Morrell-Falvey, J. L.; McDonald, W. H.; Pelletier, D. A.; Hurst, G. B. J. Proteome Res. 2009, 8 (7), 3675−3688.
(29) Keller, A.; Nesvizhskii, A. I.; Kolker, E.; Aebersold, R. Anal. Chem. 2002, 74 (20), 5383−5392.
(30) Nesvizhskii, A. I.; Keller, A.; Kolker, E.; Aebersold, R. Anal. Chem. 2003, 75 (17), 4646−4658.
(31) Turse, J. E.; Marshall, M. J.; Fredrickson, J. K.; Lipton, M. S.; Callister, S. J. PLoS One 2010, 5 (11), e13968.
(32) Banfield, J. F.; Verberkmoes, N. C.; Hettich, R. L.; Thelen, M. P. OMICS 2005, 9 (4), 301−333.
(33) Tabb, D. L.; Vega-Montoto, L.; Rudnick, P. A.; Variyath, A. M.; Ham, A. J.; Bunk, D. M.; Kilpatrick, L. E.; Billheimer, D. D.; Blackman, R. K.; Cardasis, H. L.; Carr, S. A.; Clauser, K. R.; Jaffe, J. D.; Kowalski, K. A.; Neubert, T. A.; Regnier, F. E.; Schilling, B.; Tegeler, T. J.; Wang, M.; Wang, P.; Whiteaker, J. R.; Zimmerman, L. J.; Fisher, S. J.; Gibson, B. W.; Kinsinger, C. R.; Mesri, M.; Rodriguez, H.; Stein, S. E.; Tempst, P.; Paulovich, A. G.; Liebler, D. C.; Spiegelman, C. J. Proteome Res. 2010, 9 (2), 761−776.
(34) Froehlich, J. E.; Wilkerson, C. G.; Ray, W. K.; McAndrew, R. S.; Osteryoung, K. W.; Gage, D. A.; Phinney, B. S. J. Proteome Res. 2003, 2 (4), 413−425.
(35) Costerton, J. W.; Lewandowski, Z.; Caldwell, D. E.; Korber, D. R.; Lappin-Scott, H. M. Annu. Rev. Microbiol. 1995, 49, 711−745.
(36) Woese, C. R. Curr. Biol. 1996, 6 (9), 1060−1063.
(37) Wilmes, P.; Bond, P. L. Trends Microbiol. 2006, 14 (2), 92−97.
