Comparative Proteome Cataloging of Lactobacillus rhamnosus Strains GG and Lc705

ARTICLE pubs.acs.org/jpr

Kirsi Savijoki,*,†,§ Niina Lietzen,† Matti Kankainen,‡ Tapani Alatossava,§ Kerttu Koskenniemi,|| Pekka Varmanen,§ and Tuula A. Nyman†

† Institute of Biotechnology, University of Helsinki, Finland
‡ Valio Ltd, Helsinki, Finland
§ Department of Food and Environmental Sciences, University of Helsinki, Finland
|| Department of Veterinary Biosciences, University of Helsinki, Finland



Supporting Information

ABSTRACT: The present study reports an in-depth proteome analysis of two Lactobacillus rhamnosus strains, the well-known probiotic strain GG and the dairy strain Lc705. We used GeLC-MS/MS, in which proteins are separated using 1-DE and identified using nanoLC-MS/MS, to generate high-quality protein catalogs. To maximize the number of identifications, all data sets were searched against the target databases using two search engines, Mascot and Paragon. As a result, over 1600 high-confidence protein identifications, covering nearly 60% of the predicted proteomes, were obtained from each strain. This approach enabled identification of more than 40% of all predicted surfome proteins, including a high number of lipoproteins, integral membrane proteins, peptidoglycan-associated proteins, and proteins predicted to be released into the extracellular environment. A comparison of both data sets revealed the expression of more than 90 proteins in GG and 150 in Lc705 that lack evolutionary counterparts in the other strain. Differences were noted in proteins with a likely role in biofilm formation, phage-related functions, reshaping the bacterial cell wall, and immunomodulation. The present study provides the most comprehensive catalog of the Lactobacillus proteins to date and holds great promise for the discovery of novel probiotic effector molecules.

KEYWORDS: proteomics, proteome coverage, surfome, probiotic Lactobacillus, 1-DE, LC-MS/MS, Compid, RT-PCR

INTRODUCTION

The Lactobacillus genus is a heterogeneous group of lactic acid bacteria (LAB) with important implications in the manufacture of fermented food and feed products.1 These highly versatile bacteria inhabit a wide variety of environmental niches, and some of them can, at least transiently, colonize the human gastrointestinal tract (GIT).2,3 Furthermore, some Lactobacillus strains are known to prevent a broad range of diseases, conditions, or syndromes in humans.4 Many strains of these bacteria are therefore added to dietary supplements that are marketed as probiotic, health-promoting products, which represent one of the fastest-growing sectors in the functional foods market.5,6 It has become evident that the probiotic characteristics of one strain may not apply to a related strain of the same species, and many of the current claims of health-beneficial properties in commercially available products that include probiotics are based on strain-specific properties.7 While there is rich evidence of the health-promoting effects of certain lactobacilli, the molecular mechanisms by which probiotic lactobacilli achieve these effects have remained largely unknown.

© 2011 American Chemical Society

Strain-dependent characteristics have been assigned to, for example, Lactobacillus rhamnosus strains GG (ATCC 53103) and Lc705 of human and dairy origin, respectively.8,9 Strain GG is considered a prototype probiotic due to its well-established health benefits, whereas Lc705 is a less-studied strain that is routinely used as a starter adjunct in dairy products.10,11 However, probiotic products containing these strains have been reported to act on different sets of disorders. For example, consumption of dietary supplements containing the strain GG has been reported to prevent diarrhea in children, atopic diseases, atopic dermatitis, milk allergies in infants, respiratory infections, as well as dental caries.12–18 Strain Lc705 is able to exert yeast- and mold-inhibiting activities and bioprotective properties when used in fermented dairy products in combination with other probiotic species.10,11 Later studies suggest that these two strains, together with two other probiotic strains, act synergistically in, for example, binding mycotoxins and alleviating symptoms of irritable bowel syndrome.19–21

Received: January 31, 2011
Published: May 27, 2011

dx.doi.org/10.1021/pr2000896 | J. Proteome Res. 2011, 10, 3460–3473

Journal of Proteome Research

A functional genomics study aiming to define strain-specific differences associated with the strains GG and Lc705 was recently completed.22 According to that study, both genomes contained distinct genomic islands. One of these islands, detected only in the GG genome, contained a gene cluster coding for a unique pilus structure (SpaCBA), and SpaC was demonstrated to be essential for binding of the GG cells to human mucus in vitro.22,23 In addition, both genomes are equipped with a significant number of strain-specific genes involved in, for example, carbohydrate metabolism, exopolysaccharide biosynthesis, bacteriophage components, and unknown functions.22 However, the genetic information is only indicative of the cell's potential, and linking the unique and unknown genes to specific phenotypes is the first priority in pursuing the molecular pathways behind the probiotic functions. To meet this goal, proteome-wide analysis is a prerequisite, as proteomic information reflects the actual state of a given cell at a given time. To date, the reported proteome studies of probiotic lactobacilli, including Lactobacillus plantarum, Lactobacillus fermentum, Lactobacillus casei, and L. rhamnosus, have aimed to identify the mechanisms of probiotic functions using 2-DE based applications.24–28 In this study, we applied GeLC-MS/MS, which entails protein separation by 1-DE coupled with in-gel digestion of the proteins and their subsequent identification using nanoLC-MS/MS, to generate high-quality proteome catalogs from the Lactobacillus rhamnosus strains GG and Lc705. This approach has been shown to be one of the most effective ways to maximize the number of protein identifications in several proteome profiling studies.29–34 Here we show that the GeLC-MS/MS approach is a straightforward and efficient way to narrow the long list of proteins to be examined as factors contributing to strain-specific characteristics. In addition, the presented proteomic data provide a group of proteins that can be investigated in greater detail by the necessary in vivo/in vitro experiments to establish their role as probiotic effectors or host-interactive proteins. To the best of our knowledge, the present study is the most extensive survey into the “probiotic proteome” to date, and it is expected to shed further light on proteins explaining the phenotypic differences between the strains GG and Lc705.

MATERIALS AND METHODS

Strains and Culture Media

Lactobacillus rhamnosus GG (ATCC 53103) and Lactobacillus rhamnosus Lc705 (DSM 7061), designated GG and Lc705 here, were grown on MRS agar (Becton & Dickinson) anaerobically (Anaerocult A, Merck KGaA) at 37 °C. Individual colonies were inoculated in duplicate in MRS broth (Becton & Dickinson) and grown at 37 °C overnight. A 1 mL aliquot of the cultures was inoculated in 100 mL of MRS, and the cultures were grown microaerobically at 37 °C. At the onset of the stationary growth phase (OD600 ∼4.0 and ∼3.0 for GG and Lc705, respectively), 1.5 mL samples were removed, and the cells were harvested by centrifugation. The cells were washed with ice-cold 50 mM Tris-HCl pH 8.0 (Sigma-Aldrich), and the cell pellets were stored at −80 °C.

Preparation of Protein Sample and 1-DE

Protein extracts from two biological replicate cultures were obtained by disrupting cells suspended in 30 mM Trizma Base with glass beads (≤106 μm; Sigma) in a homogenizer (Vibrogen-Zellmühle; Edmund Bühler, Tübingen, Germany) for 3 min at 4 °C, which was repeated three times. Homogenates were supplemented with 1× Laemmli buffer, boiled for 15 min, and incubated for 30 min at RT. The cell extracts were centrifuged at 16 000g for 30 min at RT to remove cell debris and glass beads. The resulting cell-free protein extract was collected, and 150 μg of protein from both strains was separated on a Criterion 10–20% linear gradient SDS/polyacrylamide gel (Bio-Rad). After Coomassie Blue staining, whole gel lanes were cut into 25 1 mm cubes for in-gel tryptic digestion as follows. The gel pieces were destained with 50% acetonitrile (ACN) in 200 mM NH4HCO3, reduced with DTT, and alkylated with iodoacetamide prior to digestion with trypsin (sequencing grade modified trypsin V5111, Promega), essentially as described by Shevchenko et al.35 The peptides were recovered by incubating the gel pieces once in 25 mM NH4HCO3 and twice in 5% formic acid. The recovered peptides were in a final volume of ∼130–150 μL.

NanoLC-MS/MS

The LC-MS/MS analysis of the tryptic peptides was performed using an Ultimate 3000 Nano-LC (Dionex, Sunnyvale, CA) and a QSTAR Elite hybrid quadrupole TOF mass spectrometer (Applied Biosystems/MDS Sciex, Foster City, CA) with nano-ESI ionization. The samples were first concentrated and desalted on a C18 trap column (10 mm × 150 μm, 3 μm, 120 Å, PROTECOL; SGE Analytical Science, Griesheim, Germany), and the peptides were separated on a PepMap100 C18 analytical column (15 cm × 75 μm, 5 μm, 100 Å; LC Packings, Sunnyvale, CA) at 200 nL/min. Peptides were eluted with a linear gradient of ACN (0–40% in 50 or 120 min) in 0.1% formic acid at a flow rate of 200 nL/min. MS data were acquired using Analyst QS 2.0 software. The information-dependent acquisition method consisted of a 0.5 s TOF-MS survey scan of m/z 400–1400. From every survey scan, the two most abundant ions with charge states between +2 and +4 were selected for product ion scans. Once an ion was selected for MS/MS fragmentation, it was put on an exclusion list for 60 s. All LC-MS/MS analyses were performed with two biological and two technical replicates.

Protein Identification and Compilation of Search Results

All LC-MS/MS results were analyzed using the ProteinPilot (version 2.0.1, Applied Biosystems) software. The MS/MS data were searched against an in-house database of the published ORF set of the predicted GG (acc. no. FM179322; 2944 entries) or Lc705 (acc. no. FM179323 and pLC; FM179324; 2992 entries) protein-coding sequences22 using Mascot (Matrix Science, version 2.2.03) and Paragon search algorithms through the ProteinPilot interface. The search criteria for Mascot searches were as follows: trypsin digestion with one allowed missed cleavage; carbamidomethyl modification of cysteine as a fixed modification; oxidation of methionine as a variable modification; peptide mass tolerance of 50 ppm; MS/MS fragment tolerance of 0.2 Da; and peptide charges of 1+, 2+, and 3+. The parameters for Paragon searches using the Rapid search mode included the previously mentioned methionine and cysteine modifications. The recently reported Compid tool36 was used to parse significant hits from the compiled Mascot and Paragon output files into tab-delimited data files. Protein identifications that had probability-based Mascot Mowse scores ≥ 50 and p < 0.05 and/or Paragon Unused ProtScores ≥ 1.3 and p < 0.05 were considered reliable high-confidence identifications.
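The acceptance rule for high-confidence identifications can be sketched in a few lines of Python. This is a minimal illustration only: the `Hit` record, its field names, and the example identifiers are hypothetical and do not reflect the Compid file format or the ProteinPilot output. Only the thresholds (Mowse score ≥ 50, Unused ProtScore ≥ 1.3, p < 0.05) come from the text.

```python
from dataclasses import dataclass
from typing import List, Optional

# Thresholds quoted in the text; everything else here is illustrative.
MASCOT_MIN_SCORE = 50.0
PARAGON_MIN_UNUSED = 1.3
P_MAX = 0.05


@dataclass
class Hit:
    """Hypothetical per-protein record merged from both search engines."""
    protein_id: str
    mascot_score: Optional[float] = None    # Mowse score; None if Mascot reported no hit
    mascot_p: Optional[float] = None
    paragon_unused: Optional[float] = None  # Unused ProtScore; None if Paragon reported no hit
    paragon_p: Optional[float] = None


def passes_mascot(hit: Hit) -> bool:
    """True if the Mascot evidence alone meets the stated criteria."""
    if hit.mascot_score is None or hit.mascot_p is None:
        return False
    return hit.mascot_score >= MASCOT_MIN_SCORE and hit.mascot_p < P_MAX


def passes_paragon(hit: Hit) -> bool:
    """True if the Paragon evidence alone meets the stated criteria."""
    if hit.paragon_unused is None or hit.paragon_p is None:
        return False
    return hit.paragon_unused >= PARAGON_MIN_UNUSED and hit.paragon_p < P_MAX


def is_high_confidence(hit: Hit) -> bool:
    """'And/or' rule: either engine meeting its criteria is sufficient."""
    return passes_mascot(hit) or passes_paragon(hit)


def high_confidence_ids(hits: List[Hit]) -> List[str]:
    """Return the identifiers of accepted proteins, preserving input order."""
    return [h.protein_id for h in hits if is_high_confidence(h)]
```

Under this reading of "and/or", a protein is kept when at least one engine supports it at the stated thresholds, which is why the combined list can be longer than either single-engine high-confidence list.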


To estimate the false discovery rates (FDRs), all Mascot and Paragon searches were repeated using identical search parameters and validation criteria against GG or Lc705 decoy databases containing all protein sequences in both forward and reverse orientations. Sequences were reversed using the Perl decoy.pl script provided by Matrix Science (http://www.matrixscience.com/help/decoy_help.html). The FDR percentages were calculated using the formula $2n_{\mathrm{reverse}}/(n_{\mathrm{reverse}} + n_{\mathrm{forward}})$ given by Elias and Gygi.37 The FDRs were below 3.0% for each data set.

Bioinformatic Analyses

The theoretical isoelectric point (pI) and molecular weights (Mw) of proteins were defined by the pepstats program (http://www.hgmp.mrc.ac.uk/Software/EMBOSS).38 Other information related to protein functional annotations, subcellular localization, presence of adhesion-associated domains, and protein evolutionary counterparts (proteins that are homologues/paralogues and unique) were obtained from the original genomic analyses of the GG and Lc705 strains.22
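As a worked illustration of the decoy-based FDR estimate described under Protein Identification and Compilation of Search Results, the sketch below builds a concatenated forward-plus-reversed database and applies the formula 2·n_reverse/(n_reverse + n_forward). It is written in Python and does not reproduce the Matrix Science decoy.pl script; the function names, the `REV_` prefix, the toy sequences, and the example counts are assumptions made for illustration.

```python
from typing import Dict


def make_decoy_database(proteins: Dict[str, str], prefix: str = "REV_") -> Dict[str, str]:
    """Return the forward sequences plus one reversed copy of each sequence.

    The reversed entries are tagged with `prefix` (a hypothetical naming
    choice) so that decoy hits can be counted after the search.
    """
    combined = dict(proteins)
    for protein_id, sequence in proteins.items():
        combined[prefix + protein_id] = sequence[::-1]
    return combined


def false_discovery_rate(n_reverse: int, n_forward: int) -> float:
    """FDR = 2 * n_reverse / (n_reverse + n_forward), as a fraction."""
    total = n_reverse + n_forward
    if total == 0:
        raise ValueError("no identifications to evaluate")
    return 2 * n_reverse / total


if __name__ == "__main__":
    toy_db = make_decoy_database({"prot1": "MKTAYIAK", "prot2": "MSLNE"})
    print(sorted(toy_db))  # forward and REV_ entries side by side

    # Invented counts: 12 accepted decoy hits among 1000 accepted hits in total.
    fdr = false_discovery_rate(n_reverse=12, n_forward=988)
    print(f"estimated FDR: {fdr:.1%}")
```

The factor of 2 reflects the assumption that every accepted decoy hit is matched by roughly one undetected false hit among the forward sequences.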

Figure 1. GeLC-MS/MS workflow for identifying the GG and Lc705 proteins expressed in MRS. An equal amount of protein extract (150 μg/lane) was loaded and separated by 1-DE. After Coomassie blue staining, the lanes were cut into 25 slices for in-gel tryptic digestion, and peptides from each digestion were separated using two linear ACN gradient conditions in LC before the MS/MS analyses. All MS/MS spectra were processed by ProteinPilot software and searched against the in-house GG and Lc705 protein databases using Mascot and Paragon database search algorithms with the threshold set at p < 0.05.

RESULTS AND DISCUSSION

GeLC-MS/MS Analysis of the Expressed and Identified L. rhamnosus Proteins

In the present study, a comparative proteomics approach based on 1-DE and nanoLC-MS/MS (GeLC-MS/MS) was applied to complement the previous comparative genomic findings and to search for details behind the reported phenotypic and probiotic differences of two closely related L. rhamnosus strains, GG and Lc705.22 For this purpose, cell samples from both strains were withdrawn at the onset of stationary phase (Supporting Information Figure 1); extensive immune modulation-related responses have been observed in this growth stage in L. plantarum strain WCFS1.39 Proteins were extracted in alkaline conditions to minimize the loss of proteins prior to 1-DE, since such an approach is known to increase protein solubility with yields close to that of SDS.40 The recovered peptides from in-gel tryptic digestions were separated using two different linear ACN gradients in LC to reduce the sample complexity before MS/MS (Figure 1). For identification, all MS/MS data were searched against the GG and Lc705 protein databases using Mascot41 and Paragon42 search engines through the ProteinPilot interface (Figure 2). Merging the Mascot search results identified a total of 1930 and 2010 proteins (p < 0.05) from the GG and Lc705 strains, respectively (Figure 2A). In the Paragon searches, the number of identified proteins (p < 0.05) was 1587 from the GG and 1634 from the Lc705 strains (Figure 2A). Next, the Mascot and Paragon search results were combined using Mascot Mowse scores (ms) ≥ 50 and p < 0.05 and/or Paragon UnusedProt scores ≥ 1.3 and p < 0.05 as the criteria for high-quality identifications (Figure 2B). Using these stringent criteria, 1664 and 1705 proteins were reliably identified from the GG and Lc705 strains, respectively (Figure 2B, Supporting Information Tables 1 and 2). The protein identification scores and the number of unique peptides assigned to each protein are provided as Supporting Information Table 3.
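The proteome coverage implied by these counts can be checked with simple arithmetic. The sketch below divides the high-confidence identifications (1664 and 1705) by the number of predicted protein-coding sequences in the search databases (2944 and 2992 entries, as given in Materials and Methods); the function and variable names are illustrative.

```python
def coverage_percent(identified: int, predicted: int) -> float:
    """Percentage of the predicted proteome with a high-confidence identification."""
    if predicted <= 0:
        raise ValueError("predicted proteome size must be positive")
    return 100.0 * identified / predicted


# Counts taken from the text: database entries and high-confidence identifications.
PREDICTED = {"GG": 2944, "Lc705": 2992}
HIGH_CONFIDENCE = {"GG": 1664, "Lc705": 1705}

if __name__ == "__main__":
    for strain in PREDICTED:
        pct = coverage_percent(HIGH_CONFIDENCE[strain], PREDICTED[strain])
        print(f"{strain}: {pct:.1f}% of the predicted proteome")
```

Both values come out at roughly 57%, in line with the "nearly 60%" coverage stated in the abstract.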
The quality of both data sets is demonstrated by the identification of 1544 and 1616 proteins from the GG and Lc705 strains, respectively, with five or more unique peptides. From the high-quality identifications, only three proteins in both strains were single peptide matches. The MS/MS spectra, including fragment ion assignments, for the proteins with single peptide matches are shown in Supporting Information Table 4.

To define specifically and commonly expressed proteins, all identified proteins fulfilling the stringent search criteria were divided into strain-specific (unique) proteins, which lack evolutionary counterparts in the other strain tested here, and strain-shared (conserved) protein sets. According to Kankainen et al.,22 the number of conserved genes is 2441 and 2457 in the Lc705 and GG strains, respectively. In this study, 1420 of the predicted conserved genes from both strains were noted to produce an identifiable product (Figure 2C, Supporting Information Table 5). The number of identifications related to unique proteins lacking evolutionary counterparts was 95 and 156 in the GG and Lc705 strains, respectively (Figure 2C, Supporting Information Table 6). The number of protein identifications from the conserved protein set that were exclusively detected from one of the strains was 149 in the GG and 129 in the Lc705 strain (Figure 2C, Supporting Information Table 6), which implies that the proteins in question could be differentially expressed in these strains. Even though the present study does not provide accurate protein quantification data, the relative protein amounts between the two strains can be estimated by comparing identification results; higher sequence coverage and higher identification scores indicate higher protein abundance.43

Figure 2. Protein identification results from Mascot and Paragon. (A) Number of protein identifications (p < 0.05) that are unique and common between the data sets (Mascot and Paragon search data). (B) Uniquely and commonly identified proteins with high-confidence search criteria in Mascot (Mowse score [ms] ≥ 50, p < 0.05) and in Paragon (UnusedProt score [pg] ≥ 1.3, p < 0.05) within the compiled data sets for each strain. (C) Number of common and unique identifications within the GG and Lc705 data sets. The unique identifications are further divided into two groups reflecting the number of proteins with or without evolutionary counterparts (orthologues/paralogues).

Figure 3. Theoretical 2D view of the identified (red and blue) and predicted (gray) proteomes for the GG and Lc705 strains. The calculated Mw values of all proteins were plotted against the predicted pIs. x axis, pI 2–14; y axis, molecular weight (Da) on a log10 scale, 10^2 to 10^7.

Physiochemical Characterization of the Identified Proteins Shows Unbiased Proteome Characterization

The protein identification data were further assessed by creating virtual 2D protein gels (Figure 3), which show an approximate bimodal distribution of the identified and predicted proteins of both L. rhamnosus strains. The Mw distribution is very similar in both strains: the majority of proteins are in the range of 0–120 kDa, with a moderate number of proteins that are larger than 150 kDa (Supporting Information Figure 2A). Proteins with a Mw of over 40 kDa are more prevalent within the identification results, which possibly results from the lower probability of detecting one of the few tryptic peptides from small proteins. Alternatively, the bias can be explained by the fact that small genes are less likely to be genuine genes and are therefore more likely to be false positives of gene prediction.44 Classification based on pI values shows that the identified proteome below pI 7.0 covers >70% of all predicted proteins in each strain, whereas 43–45% of all predicted proteins with pI values above 7.0 were identified (Supporting Information Figure 2B). Thus, the analysis seemed to slightly favor the identification of soluble cytoplasmic proteins, which typically cluster around pI 5–6. Integral membrane proteins (IMPs) are typically under-represented in most proteome studies due to their low abundance and/or hydrophobic nature, and their solubility decreases as the number of transmembrane domains (TMDs) increases. In our study, ∼40% of all predicted IMPs carrying up to 15 TMDs were identified from both strains (346 proteins in the GG strain and 323 proteins in the Lc705 strain) (Supporting Information Tables 1 and 2, Figure 3). The highest number of IMP identifications, which accounted for >50% of all predicted IMPs in both L. rhamnosus strains, was obtained with proteins with one (57% in GG, 55% in Lc705) or two (38% in GG, 40% in Lc705) TMDs. From all predicted IMPs with three or more TMDs, 33% and 34% could be identified from the GG and Lc705 strains, respectively (Supporting Information Figure 3).

Next we examined whether the membrane proteome coverage could be increased by using less-stringent search criteria. For this purpose, all identified proteins with p < 0.05 (one missed trypsin cleavage allowed) and having an evolutionary counterpart were considered true identifications. As a result, 125 and 142 additional proteins were identified from the GG and Lc705 strains, respectively (Supporting Information Tables 1 and 2). Most of these proteins were predicted to be alkaline (Supporting Information Figure 4), and ∼50% (61 proteins in GG and 67 proteins in Lc705) were proteins with TMDs (Supporting Information Tables 1 and 2). After combining all search results, the membrane proteome coverage was increased from 40% to 57% and 54% in the GG and Lc705 strains, respectively. Membrane proteome studies of other Gram-positive bacteria, including Corynebacterium glutamicum, Bacillus subtilis, and Staphylococcus aureus, have involved various membrane enrichment and protein digestion methods to maximize the number of identified IMPs, which at best were reported to cover 50–70% of all predicted IMPs.45–49 The coverage of identified IMPs in our study was within the published range; hence, protein extraction in alkaline conditions coupled with GeLC-MS/MS analysis resulted in proteome coverage comparable with that of other enrichment approaches.

Identified Proteins with Potential Impact on Dairy or Probiotic Traits

To highlight the most relevant findings from the identified proteome data, all proteins were first categorized according to their putative function within the cell using the COG (clusters of orthologous groups of proteins) database.50 According to this classification, all identified proteins seemed to fall into 21 COG functional groups (Figure 4A), with a similar pattern of protein distribution in both strains. In both cases, the best coverage was obtained with proteins involved in amino acid transport and metabolism (COG: E, 66–73%), translation and transcription (COG: J, 95–96%; COG: K, 72–75%), cell wall and membrane biogenesis (COG: M, 81–88%), and general function (COG: R, 74–75%). A substantial portion of the identified proteins (16% and 19% of all identified proteins in strains GG and Lc705, respectively) could not be linked to any COG category. Interestingly, a wide repertoire of transcriptional regulator proteins was identified, in total 134 and 135 in the GG and Lc705 strains, out of 209 and 212 predicted from the respective genome sequences (Supporting Information Tables 1 and 2). The regulator proteins typically show high turnover rates and are expressed at low levels, and identification of such a high number of this group of proteins demonstrates that the GeLC-MS/MS methodology is sensitive enough to uncover relevant gene regulatory networks in bacteria. To gain a better insight into the functional differences between the GG and Lc705 strains, the identifications common to both data sets were next excluded from the comparisons. The remaining identifications related to conserved proteins, which were specifically identified from one strain only (149 in GG and 129 in Lc705), were plotted against their respective COGs (Figure 4B). These data suggest that several proteins involved in carbohydrate transport/metabolism (COG G), transcription (COG K), and replication and DNA repair (COG L) could be differentially expressed.
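The three-way split used in this comparison (identified in both strains, conserved but identified in one strain only, and unique to one strain) can be expressed with a simple lookup over an orthologue map. The sketch below is illustrative: the function name, the `orthologue_of` mapping, and any identifiers passed to it are invented, and only the counts in the final consistency check come from the text and Figure 2C.

```python
from typing import Dict, Optional, Set


def partition_identifications(
    identified_a: Set[str],
    identified_b: Set[str],
    orthologue_of: Dict[str, Optional[str]],
) -> Dict[str, Set[str]]:
    """Classify proteins identified in strain A relative to strain B.

    `orthologue_of` maps each strain-A protein to its evolutionary counterpart
    in strain B, or to None when no counterpart exists (hypothetical input).
    """
    groups: Dict[str, Set[str]] = {"shared": set(), "conserved_a_only": set(), "unique": set()}
    for protein_id in identified_a:
        partner = orthologue_of.get(protein_id)
        if partner is None:
            groups["unique"].add(protein_id)            # no counterpart in strain B
        elif partner in identified_b:
            groups["shared"].add(protein_id)            # counterpart also identified
        else:
            groups["conserved_a_only"].add(protein_id)  # counterpart exists but was not identified
    return groups


# Reported counts per strain: (shared, conserved strain-only, unique, total high-confidence).
REPORTED = {"GG": (1420, 149, 95, 1664), "Lc705": (1420, 129, 156, 1705)}

for strain, (shared, conserved_only, unique, total) in REPORTED.items():
    # The three groups should add up to the total number of identifications.
    assert shared + conserved_only + unique == total, strain
```

The check passes for both strains, which indicates that the three groups in Figure 2C account for all high-confidence identifications.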
A more detailed COG comparison including only the uniquely identified proteins (95 in GG and 156 in Lc705) with no equivalent partner suggests differences in pathways affecting dairy and/or probiotic traits (carbohydrate transport and metabolism and cell wall/membrane biogenesis, COG: M) (Figure 4C). Among the unique identifications, the biological role of 36 unknown proteins in GG and 46 in Lc705 remains to be elucidated (Figure 4C).

Figure 4. COG classification of the L. rhamnosus proteins. (A) All identified and predicted GG and Lc705 proteins plotted as a function of their COG categories. (B) Identification of conserved proteins which were identified from only GG or Lc705. (C) Identification of unique proteins lacking evolutionary counterparts. COGs are as follows: C, energy production and conversion; D, cell cycle control, cell division, chromosome partitioning; E, amino acid transport and metabolism; F, nucleotide transport and metabolism; G, carbohydrate transport and metabolism; H, coenzyme transport and metabolism; I, lipid transport and metabolism; J, translation, ribosomal structure, and biogenesis; K, transcription; L, replication, recombination, and repair; M, cell wall/membrane/envelope biogenesis; N, cell motility; O, post-translational modification and protein turnover, chaperones; P, inorganic ion transport and metabolism; Q, secondary metabolites biosynthesis, transport, and catabolism; R, general function prediction only; S, function unknown; T, signal transduction mechanisms; U, intracellular trafficking, secretion, and vesicular transport; V, defense mechanisms; N/A indicates query proteins that do not belong to any of the currently defined COGs.

The Lactose Operon Components Are Expressed by the GG Strain Despite the Mutated Transcriptional Antiterminator Gene. The most significant differences with relevance to dairy traits were noted in proteins involved in the uptake and hydrolysis of lactose. Lactose is the primary carbon and energy source in milk during dairy fermentations carried out by LAB. In the related L. rhamnosus strain TCELL-1, the lacTEGF, galKETRM (Leloir pathway), and lacRABDC (tagatose 6-phosphate pathway) operons involved in the metabolism of lactose have been shown to be repressed by glucose and independently induced in the presence of lactose or galactose.51 The GG strain is unable to utilize lactose, which is believed to result from constitutive repression of the lac operon due to frameshift mutations in lacT, encoding the antiterminator protein of the lac operon, and in lacG, encoding the phospho-β-galactosidase enzyme.22 However, we were able to identify LacF, the lactose-specific IIA component of the phosphotransferase system (Supporting Information Table 1), and to show that the lac genes are expressed despite the mutations in lacT and the presence of glucose in the culture medium (Supporting Information Figure 5). Other related identifications and RT-PCR analyses suggest that both the Leloir (galKETRM, LGG_00653–00657) and the tagatose 6-phosphate (lacRABDC, LGG_00664–00668) pathways are also actively expressed in the GG strain during growth in MRS with glucose (Supporting Information Table 1 and Figure 5). In the case of the Lc705 strain, which is known for its ability to utilize lactose,22 the corresponding identifications coupled with confirmatory RT-PCR analyses suggest that all three metabolic routes for lactose and galactose metabolism (LC_00638–00642, pLC_00062–00065, LC_00312–00315, LC_00625–00629) are also active despite the presence of glucose in MRS (Supporting Information Figure 5 and Table 2). The LacG (phospho-β-galactosidase) protein could be identified from the Lc705 strain, but not from GG. In summary, our analysis strongly suggests that the frameshift mutation in lacG, but not the one in lacT, is the cause of the apparent inability of the GG strain to metabolize lactose.

Table 1. Presence of Major Essential Protein Components Required for Phage Particle Assembly in the GG and Lc705 Cells as Revealed by GeLC-MS/MS
protein function in the phage assembly | major protein components identified: GG cells | Lc705 cells
head morphogenesis

Figure 5. Schematic distribution of phage-related genes along the GG and Lc705 genomes as revealed by the GeLC-MS/MS analysis for proteins (rectangles above the genome bar) and by comparative genomics for phage genes22 (circles below the bar). The capital letter with the bar indicates the particular region containing the genes expressing the phage proteins discovered in the present study. Only proteins and ORFs equal to or longer than 100 amino acid residues are included in this analysis.

Strain GG Is More Active in Production of Phage Proteins. Proteins with importance for interactions with the host may be encoded by bacteriophages, as recently demonstrated for a phage-encoded lysin that was shown to mediate binding of Streptococcus mitis to human platelets through interaction with fibrinogen.52 Approximately 1.5% of all GG and Lc705 proteins, assigned to several COGs, have been predicted to have phage-related functions. In the present study, a total of 22 and 20 potential phage proteins were identified from the strains GG and Lc705, respectively, which suggests that phage-related functions are activated in both strains (Figure 5, Supplemental Table 7). From the identified proteins, six proteins in GG (GGø10, 11, 18, 20, 22, 25) and eight in Lc705 (LCøP15, 16, 18–21, 23, 24) fulfilled only less-stringent identification criteria (Mascot [ms] scores