Proteome of the Caenorhabditis elegans Oocyte - Journal of Proteome

Apr 1, 2011 - Department of Biochemistry and Molecular Biology, Alberta Children's Hospital Research Institute of Child and Maternal Health, Universit...
ARTICLE pubs.acs.org/jpr

Proteome of the Caenorhabditis elegans Oocyte

John K. Chik,† David C. Schriemer, Sarah J. Childs, and James D. McGhee*

Department of Biochemistry and Molecular Biology, Alberta Children's Hospital Research Institute of Child and Maternal Health, University of Calgary, Calgary, Alberta, Canada T2N 4N1

ABSTRACT: Oocytes were purified from the temperature-sensitive fertilization-defective fer-1(b232ts) mutant of the nematode Caenorhabditis elegans and used for comprehensive mass spectrometric analysis. Using stringent criteria, 1165 C. elegans proteins were identified; at lower stringency, an additional 288 proteins were identified. We validate the high degree of sample purity and evaluate several possible sources of bias in the proteomic data. We compare the classes of proteins identified in the current oocyte proteome with protein classes identified in our previously determined oocyte transcriptome. The oocyte proteome appears enriched in proteins likely to be needed immediately upon fertilization, whereas the transcriptome appears enriched in molecules and processes needed later in embryogenesis. The current study provides fundamental background information for future, more detailed studies of oocyte biology.

KEYWORDS: C. elegans, oocyte, proteome, fer-1

■ INTRODUCTION

The small free-living nematode Caenorhabditis elegans has become one of the best-studied model organisms in biology, providing fundamental insight into developmental biology, neurobiology, lifespan, aging, and aspects of whole-organism physiology such as responses to stress and to infections.1 The experimental advantages of working with C. elegans are well established: superb classical genetics,2 defined cell number (∼1000 somatic cells of precisely known and reproducible lineage3), high-quality well-annotated genomic sequence (both of C. elegans and related nematodes4), optical transparency (an especially important feature with the advent of GFP-based transgenic reporters5), the powerful gene knockout capabilities of RNA-mediated interference,6,7 and a rapidly improving ability to modify the genome.8

Genetic and genomic approaches have traditionally been favored for investigating the biology of C. elegans, but proteomic approaches have rapidly become more prominent, especially in the past several years; for reviews of C. elegans proteomics, see refs 9–11. For example, Schrimpf et al.12 and Merrihew et al.13 have provided comprehensive mass-spectrometry-based inventories of proteins expressed in mixed-stage populations of wild-type C. elegans, thereby extending previous smaller-scale inventories,14–16 significantly enhancing the accuracy of genome annotation, and contributing to the concept of a quantitatively stable eukaryotic core proteome.17 More specialized studies have exploited C. elegans temperature-sensitive mutations to investigate the proteomics of germ line development, especially male–female differences.18,19 Mass spectrometry has also been used to identify C. elegans proteins associated with sperm chromatin,20 insulin signaling targets,21 mitochondria,22 and targets of phosphorylation.23

Application of mass-spectrometry-based proteomics to individual tissues or cells of C. elegans would help to understand how the worm genotype is revealed in the worm phenotype. In the current work, we produce relatively large quantities of purified C. elegans oocytes and then use mass spectrometry to define the oocyte proteome, that is, ∼1100–1400 individual oocyte proteins, the precise number depending on identification criteria. We then compare our oocyte proteome with the previously defined oocyte transcriptome24 with the aim of understanding how the oocyte dowry of maternally provided proteins and mRNAs is able to drive cell divisions, signaling pathways and gene expression in the early embryo, and launch the embryo into its independent life.

© 2011 American Chemical Society

■ MATERIALS AND METHODS

Sample Preparation

For each of four independent experimental replicates, fer-1(b232ts) mutant C. elegans were grown on egg-trays inoculated with E. coli strain N99 as previously described.25,26 (Details of the fer-1 gene and its mutations can be found at http://www.wormbase.org/db/gene/gene?name=WBGene00001414;class=Gene.) After incubation for 3 days at 25.9 °C, adult worms (now full of unfertilized oocytes) were harvested by centrifugation and oocytes were released by brief sonication (2 × 20 s at 70 W power, Braun-Sonic 2000, 4 mm probe). The resulting crude oocyte preparation was first filtered through a 44 μm Nytex mesh to remove adult carcasses and then through a 22 μm Nytex mesh to recover the oocytes. The oocytes were washed off the mesh, repeatedly washed with distilled water containing a protease inhibitor cocktail (Roche Diagnostics), adjusted to a final volume of 1 mL, chilled on ice, and disrupted by sonication (25 s, 70 W). SDS was added to a final concentration of 1%, the extract was clarified by centrifugation, 50 μL was loaded onto a 12% SDS-PAGE gel for separation, and the resulting Coomassie Blue-stained gel was cut into 20–30 roughly equal bands.

Gel slices were destained, reduced, alkylated, and trypsin digested according to a slight modification of a standard protocol.27 Briefly, gel slices were cut into 1 mm³ pieces, rinsed once with 200 μL of HPLC grade water, washed twice with 200 μL of 25 mM ammonium bicarbonate pH 8 in 50% acetonitrile for 15 min, and washed once with acetonitrile to dehydrate the gel pieces. Gel pieces were dried in a SpeedVac, and 12.5 ng/μL trypsin solution (Princeton Separations, New Jersey) in 25 mM ammonium bicarbonate was added to the dried gel pieces (5–10 μL). Once gel fragments were hydrated with trypsin solution, additional 25 mM ammonium bicarbonate was added to cover the fragments and samples were incubated at 37 °C overnight. Extraction of peptides from gel pieces was performed first with 50 μL of 1% formic acid, followed by 50 μL of 1% formic acid in 50% acetonitrile. Pooled supernatants were dried down and reconstituted in 3% acetonitrile/0.2% formic acid for injection onto the LC–MS system.

Received: November 8, 2010
Published: April 01, 2011

Journal of Proteome Research
dx.doi.org/10.1021/pr101124f | J. Proteome Res. 2011, 10, 2300–2305

Mass Spectrometry

Digests were analyzed using an integrated Agilent 1100 LC-ion-Trap-XCT-Ultra system, fitted with an Agilent ChipCube source containing injector components, a precolumn channel (300 μm i.d., 5 mm long), an analytical column channel (75 μm i.d., 150 mm long) and a nanospray emitter on a microfluidic chip. The retention phase was 5 μm Zorbax 300 SB-C18 particles for both precolumn and analytical column. Injected samples were first trapped and desalted in the precolumn channel for 5 min with 3% acetonitrile/0.2% formic acid delivered by the auxiliary pump at 4 μL/min. The peptides were then reverse eluted from the precolumn and separated on the analytical column at 0.3 μL/min. Mobile phase A consisted of 3% acetonitrile and mobile phase B of 97% acetonitrile, both containing 0.2% formic acid. Data-dependent acquisition of collision-induced dissociation MS/MS was used, and parent ion scans were run over the mass range m/z 400–1200.

Data Processing

Collected data were exported to an mzXML formatted file28 with CompassXport (1.3.2, Bruker-Daltonics) for use within the Trans-Proteomic Pipeline (TPP v.4.1) developed by the Institute for Systems Biology (Seattle WA).28 A generic Mascot formatted file (.dta) was extracted from the mzXML file using the MzXML2Search utility (TPP) and searched against a custom database using the X!Tandem search engine (v. 07-07-01-02 with the “kscore plug-in”).29 Our custom database consisted of C. elegans proteins (Wormpep v180) along with human keratins (NCBI RefSeq release 21) and porcine trypsin to account for sample contamination. Assuming monoisotopic resolution, parent ion error was set at ±1.2 m/z and fragment ion error at ±0.8 m/z. The database was searched using “semi-tryptic” constraints, which require only one canonical tryptic cleavage site and allow for one missed cleavage. Fixed carbamidomethyl modifications were added to all cysteines and variable N-terminal pyrrolidone modifications were included in the search parameters.

Figure 1. Comparison of the chromosomal distribution of (A) C. elegans germline-expressed genes (replotted from ref 35) and (B) proteins identified in the C. elegans oocyte (present work). Chromosome identity is indicated on the X-axis. For (A), the Y-axis represents the number of germline-expressed genes on each chromosome detected by SAGE, normalized to the total number of genes on that particular chromosome. For (B), the Y-axis represents the number of oocyte proteins identified on each chromosome, normalized to the total number of genes on that particular chromosome.

Search results were converted into pepXML using Tandem2XML. For each gel, the pepXML files for the individual gel slices were aggregated using InteractParser. This combined data set was processed through PeptideProphet30 to assign a sample-wide probability to the peptide assignments. This process was repeated for all four oocyte samples. The PeptideProphet results from all four samples were combined to form the “final” fer-1 oocyte proteome, using ProteinProphet31 to validate protein assignments.

The processed data associated with this study may be downloaded from Proteome Commons (http://proteomecommons.org/) Tranche using the following hash: [2O2JQWe05yIydNqsBf5y7O7YVT72EIhpUj2Fwc0OfWcbpGDfhTIAGvG3IZxF6E2zBvaRvaJoQ/L5/s/bfhOxmR3rzNIAAAAAAAA9OA==]. Data are in the form of peak lists (.mgf format) generated for each individual gel band, from each of the four replicates.
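The “semi-tryptic” search constraint described under Data Processing can be made concrete with a short sketch. The code below is illustrative only and is not the X!Tandem implementation: it assumes the common trypsin rule (cleavage after K or R unless the next residue is P), treats the protein termini as valid peptide ends, and reads “one missed cleavage” as “at most one internal tryptic site”. The function names and the example sequence are hypothetical.

```python
def tryptic_sites(protein):
    """Return cut positions i (cleavage between protein[i-1] and protein[i]).

    Assumed rule: cleave after K or R unless the next residue is P.
    The protein termini (0 and len(protein)) count as valid peptide ends.
    """
    sites = {0, len(protein)}
    for i in range(1, len(protein)):
        if protein[i - 1] in "KR" and protein[i] != "P":
            sites.add(i)
    return sites


def classify_peptide(protein, start, end):
    """Return (tryptic_ends, missed_cleavages) for the peptide protein[start:end]."""
    sites = tryptic_sites(protein)
    tryptic_ends = int(start in sites) + int(end in sites)
    # Internal tryptic sites that were not cut are missed cleavages.
    missed = sum(1 for s in sites if start < s < end)
    return tryptic_ends, missed


def passes_semi_tryptic(protein, start, end, max_missed=1):
    """True if at least one end is tryptic and missed cleavages <= max_missed."""
    tryptic_ends, missed = classify_peptide(protein, start, end)
    return tryptic_ends >= 1 and missed <= max_missed


# Hypothetical sequence, for illustration only.
PROTEIN = "MAGKLLSRTTEKPVVDK"
```

Under these assumptions, the peptide LLSR (`PROTEIN[4:8]`, both ends tryptic) and the peptide LSR (`PROTEIN[5:8]`, one tryptic end) would be accepted, whereas LS (`PROTEIN[5:7]`, no tryptic end) would be rejected.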


■ RESULTS AND DISCUSSION

General Features of the C. elegans Oocyte Proteome

C. elegans normally reproduces as a self-fertilizing hermaphrodite. The C. elegans fer-1 gene is expressed only in primary spermatocytes and is necessary for sperm to become motile.32 In the fer-1(b232ts) mutant, sperm become fertilization defective at ∼26 °C, causing oocytes to accumulate inside the hermaphrodite and allowing relatively large numbers of unfertilized oocytes to be purified.25,26 Although some of these oocytes have prematurely initiated DNA synthesis, such “endomitosis” should have little influence on the overall oocyte proteome; see also ref 20. For each of four independent oocyte preparations, protein extracts were prepared and separated on one-dimensional SDS-polyacrylamide gels; molecular weight fractions (i.e., gel slices) were digested with trypsin and processed for mass spectrometry, and the overall results were pooled. In the pooled data set, 1453 C. elegans proteins were identified with a probability ≥0.9 according to ProteinProphet,31 of which 1165 proteins were identified by more than one unique peptide (listed in Supplementary Table 1, Supporting Information, and referred to as the “higher confidence” set); 194 of these higher confidence assignments are degenerate because of multiple homologous proteins and/or multiple splice variants. Of the 288 C. elegans proteins identified by only one unique peptide (listed in Supplementary Table 2, Supporting Information), 48 assignments are degenerate.

Figure 2. Exploration of parameters that might introduce or might reveal biases or limitations in the current MS proteome. (A) Normalized frequency distribution of protein hydrophobicity (estimated using GRAVY) for: (○) all proteins identified in WormBase WS180; (crosses) all proteins identified by oocyte transcripts;24 (●) all proteins identified in the current oocyte proteome. The arrow points to a subset of hydrophobic proteins encoded in the C. elegans genome that are completely missing from the current oocyte proteome. (B) Normalized frequency distribution of protein length (in units of thousand amino acid residues) for: (○) all proteins identified in WormBase WS180; (crosses) all proteins identified by oocyte transcripts;24 (●) all proteins identified in the current oocyte proteome. The arrow points to the set of short proteins over-represented in the current oocyte proteome.

Overall Validation of the Experimental Material

Defining the proteome of particular C. elegans tissues, cells or life-stages is necessarily a compromise between the sensitivity of protein detection and levels of possible sample contamination. It is difficult to justify exhaustive analyses to detect rare proteins if these proteins could be contaminants, a dilemma made more serious by the enormous dynamic range of present day LC–MS systems. In this section, we emphasize overall validation of sample quality and assessment of contamination levels.

We have previously estimated that C. elegans oocytes prepared by the present method are ∼95% pure.24 Small numbers of particular proteins can be identified in our lists that must reflect contamination with non-oocyte material, such as major sperm proteins (msp gene products), intestine-restricted intermediate filaments (e.g., IFB-2) and intestinal proteases (e.g., ASP-1 and CPR-6). However, the overall high quality of the starting material is strongly supported by several distinctive features of the C. elegans oocyte.

(1) As will be described in more detail in a later section, 89% (1034/1165) of the higher confidence oocyte protein set were also identified in our previous SAGE (Serial Analysis of Gene Expression) analysis of oocyte transcripts, based on RNA purified from independent preparations of fer-1(b232ts) oocytes.24 In this previous analysis, we had also produced a second list of oocyte transcripts that was, if anything, overcorrected for somatic contamination; 86% (1002/1165) of the higher confidence protein set are still identified in this more stringently defined oocyte transcriptome. Although there is no necessary relation between the presence of oocyte proteins and the presence of oocyte transcripts, we feel that the excellent agreement between the two lists provides strong support for the overall high quality of our oocyte preparation.

(2) Reinke and Cutter (2009)33 have observed that ∼38% of all genes expressed in the C. elegans germline are transcribed as multigene operons, compared to 15% of all C. elegans genes. Consistent with high quality starting material, 34% of the current higher-confidence set of oocyte proteins are encoded in operons.

(3) Genes expressed in the hermaphrodite germline are depleted from the X-chromosome, perhaps because of inactivation of the X chromosome in the germline.34 Figure 1A (recalculated from Wang et al.35) shows the distribution of germline-expressed genes over all the chromosomes including the X. Figure 1B shows that oocyte proteins show a qualitatively similar chromosomal distribution.

The central argument in this section is that any contamination of oocytes with somatic material/molecules would homogenize the distinctive oocyte features. The good agreement between the presence of oocyte proteins and oocyte transcripts, as well as their matching distribution within operons and across the genome, are all consistent with high quality oocyte preparations. It is, of course, more difficult to guarantee the provenance of individual proteins.

Figure 3. Classification of proteins detected in the C. elegans oocyte proteome, compared to the classification of SAGE-identified transcripts from the same tissue source.24 The proportions of proteins (P) and transcripts (T) falling into general protein/gene categories as defined by WormMart COG Codes (http://www.wormbase.org/biomart/martview) are shown on the left, while individual COG Code subcategories are shown on the right. The numbers within the bars represent the fraction of proteins/genes corresponding to each segment; for example, 23% of all identified proteins in the oocyte proteome are assigned to the class of “Information Storage and Processing”. For this analysis, 6002 genes identified as oocyte transcripts and 1078 proteins identified as oocyte proteins were used. For proteins identified ambiguously in the oocyte proteome (i.e., where the identification is high-quality but ambiguous because of a lack of protein sequence variation, e.g., among highly related proteins such as histones), only one example was chosen for each redundant set.

Additional support for sample quality is provided by comparisons with two previous mass spectrometry protein inventories that should at least overlap with the present oocyte proteome. Tops et al.19 used temperature-sensitive mutations (fem-1 and fem-3) combined with differential 14N/15N labeling to identify proteins that should be enriched in either the female or the male germline. Although germline proteins need not end up in mature gametes, by and large, we find good agreement between the oocyte proteome and proteins enriched in the female germline. Tops et al.19 identified ten proteins upregulated >4-fold in the female germline, of which nine are identified in the oocyte proteome. In contrast, the oocyte proteome contains only two of the eight proteins upregulated >4-fold in the male germline (see Table 1 of Tops et al.19). More broadly, the present oocyte proteome contains 61% (26/43) of the proteins identified by Tops et al.19 as upregulated >2-fold in the female germline, compared to 31% (11/36) of the proteins upregulated >2-fold in the male germline (see Supplementary Table 3 of Tops et al.19). Finally, we should mention the study of Chu et al.,20 who performed MudPIT analysis on a crude chromatin fraction from C. elegans oocytes (but which also contained 20–30% fertilized embryos); ∼50% (119/240) of the proteins in their list are also found in the present oocyte proteome.
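The overlap comparisons used throughout this section amount to simple set arithmetic. The following minimal sketch assumes that each data set is available as a list of gene or protein identifiers; the identifiers shown are toy values, and the function name is hypothetical. The second part simply re-expresses three of the counts quoted in the text as percentages.

```python
def overlap_fraction(proteome_ids, reference_ids):
    """Fraction of proteome_ids that also appear in reference_ids."""
    proteome = set(proteome_ids)
    if not proteome:
        raise ValueError("proteome_ids is empty")
    return len(proteome & set(reference_ids)) / len(proteome)


# Toy identifiers (hypothetical, for illustration only).
proteome = ["geneA", "geneB", "geneC", "geneD"]
transcriptome = ["geneA", "geneB", "geneD", "geneE", "geneF"]
toy = overlap_fraction(proteome, transcriptome)  # 3 of the 4 proteins are shared

# Counts quoted in the text, as (shared, total) pairs.
reported = {
    "higher confidence proteins also in the SAGE transcript list": (1034, 1165),
    "same, against the more stringent transcript list": (1002, 1165),
    "Chu et al. chromatin proteins also in the oocyte proteome": (119, 240),
}
percentages = {
    label: round(100 * shared / total)
    for label, (shared, total) in reported.items()
}
```

Working with sets rather than lists means that repeated identifiers, for example from splice variants collapsed to one gene, are counted once.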

Overall Physical Properties of the Oocyte Proteome

In the current section, we inspect the general physical properties of the oocyte proteome. These properties could reflect real features of the oocyte proteome but could also identify possible biases or limitations introduced by the particular methods used for protein handling and mass spectrometry (independently of the sample quality discussed in the previous section). We compare the overall hydropathy profile and protein size distribution between three data sets: (i) the combined higher and lower confidence oocyte protein lists defined in the current paper; (ii) all oocyte proteins identified by our previous analysis of oocyte transcripts;24 and (iii) all C. elegans proteins identified in the WormBase WS180 database.

Overall, the present proteomic analysis appears to have only modest biases for or against particular classes of oocyte proteins, with two possible exceptions.

(i) Figure 2A shows that the distributions of overall protein hydropathy (GRAVY36) for the three data sets listed in the previous paragraph overlap extensively. However, the oocyte proteome lacks the small secondary peak in the distribution corresponding to highly hydrophobic proteins such as trans-membrane receptors (see arrow in Figure 2A). Poor representation of such proteins was also noted in the study of Schrimpf et al.12 Implications for oocyte biology will be discussed below.

(ii) Figure 2B shows that the overall size distributions (number of residues) are highly similar for the three data sets. However, shorter proteins appear over-represented in our present mass-spectrometry-derived proteome (arrow in Figure 2B). In contrast, larger proteins were over-represented in the data set of Schrimpf et al.,12 a difference possibly reflecting different separation methods used in the two studies.

Implications of the Oocyte Proteome
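Because the hydropathy distribution of Figure 2A bears on how the comparisons in this section are interpreted, it may help to recall how a GRAVY score is obtained: it is the sum of the Kyte-Doolittle hydropathy values36 of all residues in a protein divided by the number of residues. The sketch below is a minimal illustration of that definition; the software actually used to compute GRAVY for this study is not specified here, and the function name and test sequences are hypothetical.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Grand average of hydropathy: mean Kyte-Doolittle value per residue."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("empty sequence")
    unknown = set(seq) - set(KYTE_DOOLITTLE)
    if unknown:
        raise ValueError("non-standard residues: " + "".join(sorted(unknown)))
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


# Hypothetical short sequences, for illustration only.
hydrophobic_example = gravy("ILVF")   # positive score
hydrophilic_example = gravy("KRDE")   # negative score
```

Positive scores indicate predominantly hydrophobic proteins, such as the trans-membrane receptors in the secondary peak of Figure 2A, and negative scores indicate predominantly hydrophilic ones.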

As noted above, 89% of the oocyte proteins identified in our higher confidence list have also been identified in our previous SAGE analysis of oocyte transcripts.24 Before discussing individual protein categories, two points should be made about general comparisons between the 1165 oocyte proteins identified by mass spectrometry and the larger set of genes previously identified via analysis of oocyte transcripts. First of all, there is only a modest correlation between the number of SAGE tags (i.e., a measure of transcript abundance) associated with a particular oocyte gene and the level of the corresponding protein, as assessed by the numbers of individual peptide identifications. Such weak correlation between levels of protein and levels of the corresponding transcripts is the rule for most experimental systems (see de Sousa Abreu et al.37 for a recent compilation). On the other hand, the oocyte proteins identified by mass spectrometry tend to correspond to abundant transcripts; that is, they are associated, on average, with higher numbers of SAGE tags than are genes identified via oocyte transcripts. For example, ∼5% of all genes identified as oocyte transcripts have normalized SAGE tag counts ≥50, compared to ∼21% of all genes identified via mass spectrometry.

Figure 3 shows the distribution of COG classes38 of all the uniquely identified oocyte proteins compared to the distribution of COG classes determined in our previous analysis of oocyte transcripts.24 Inspecting the distribution of COG-code categories on the left of Figure 3, the most obvious difference is the underrepresentation of “Poorly Characterized” proteins in the proteome, compared to the oocyte transcriptome. In other words, there appear to be uncharacterized genes expressed in the oocyte and represented as transcripts but not represented (or weakly represented) as proteins. If “Poorly Characterized” proteins/genes are removed from consideration, the proportions of proteins and transcripts in the remaining general categories of “Information Storage and Processing”, “Cellular Processes and Signaling” and “Metabolism” are roughly similar.

When these broad categories are broken down (Figure 3, right side), significant differences in proportions of genes in different categories can be detected between the proteome and the transcriptome. These differences support the general hypothesis that the proteome emphasizes molecular machinery that is needed immediately upon fertilization, whereas the transcriptome emphasizes molecules and processes needed later in embryogenesis. We list three examples:

(i) Within the general category of “Information Storage and Processing”, the proteome is more highly represented within the category of “Translation, Ribosome Structure and Biogenesis”, whereas the transcriptome is more highly represented within the categories of “Transcription” and “Replication, Recombination and Repair”.

(ii) Within the general category of “Cellular Processes and Signaling”, the proteome is more highly represented within the category of “Post-translational Modifications, Protein Turnover and Chaperones”, whereas the transcriptome is more highly represented within the category of “Signal Transduction Mechanisms”.

(iii) Within the general category of “Metabolism”, the oocyte proteome is more highly represented within “Energy Production and Conversion”, as befits a soon-to-be dynamic embryo, but underrepresented in a wide variety of transport categories, as expected for processes that probably do not come into play until later in embryogenesis or even after hatching.

Although the above proteome–transcriptome differences may conform to biological expectations, it is nonetheless important to have concrete evidence for such expectations. How many of these differences in protein class distribution reflect oocyte biology and how many reflect inherent biases in our data? Comparing hydropathy and size distributions of proteins with individual COG codes, it is difficult to claim that proteomic bias is the reason for differences between the oocyte proteome and the oocyte transcriptome. However, there is one possible and important exception, namely the finding that the oocyte proteome appears to be under-represented in proteins involved in signal transduction. As noted in the previous section, the oocyte proteome does not contain the subset of the most highly hydrophobic proteins (arrow in Figure 2A), a secondary peak that is almost entirely due to proteins that fall in the classification of “Signal Transduction Mechanisms”. Because the mixed stage proteomes described by Schrimpf et al.12 were also depleted in such highly hydrophobic proteins, the suggestion that oocytes are depleted in receptors involved in signal transduction may reflect proteomic biases, not oocyte biology. Ultimately, however, such questions will have to be addressed by more focused experiments.

Concluding Remarks

We have identified, with high confidence, 1165 proteins present in the unfertilized C. elegans oocyte, plus a further 288 oocyte proteins identified with lower confidence. These data will prove useful for interpreting critical biological processes, such as signaling and ovulation, egg-shell formation and the establishment of embryo polarity. We trust that the current study will provide a precedent for defining the proteomes of distinct life stages and individual tissues isolated from this accommodating experimental animal.

■ ASSOCIATED CONTENT

Supporting Information

Supplemental Table 1 contains the list of all C. elegans oocyte proteins identified by two or more unique peptides; Supplemental Table 2 contains oocyte proteins identified by only one unique peptide. This material is available free of charge via the Internet at http://pubs.acs.org.

■ AUTHOR INFORMATION

Corresponding Author

*E-mail: [email protected]. Phone: 403-220-4476.

Present Addresses

†Department of Chemical and Biological Sciences, Mount Royal University, Calgary, Alberta, Canada T3E 6K6.

■ ACKNOWLEDGMENT

This work was supported by operating grants from the Canadian Institutes of Health Research (CIHR) to D.C.S., S.J.C., and J.D.M. D.C.S., S.J.C., and J.D.M. are Canada Research Chairs.


■ REFERENCES

(1) Girard, L. R.; Fiedler, T. J.; Harris, T. W.; Carvalho, F.; Antoshechkin, I.; Han, M.; Sternberg, P. W.; Stein, L. D.; Chalfie, M. WormBook: the online review of Caenorhabditis elegans biology. Nucleic Acids Res. 2007, 35 (Database issue), D472–5.
(2) Brenner, S. The genetics of Caenorhabditis elegans. Genetics 1974, 77 (1), 71–94.
(3) Sulston, J. E.; Schierenberg, E.; White, J. G.; Thomson, J. N. The embryonic cell lineage of the nematode Caenorhabditis elegans. Dev. Biol. 1983, 100 (1), 64–119.
(4) Hillier, L. W.; Miller, R. D.; Baird, S. E.; Chinwalla, A.; Fulton, L. A.; Koboldt, D. C.; Waterston, R. H. Comparison of C. elegans and C. briggsae Genome Sequences Reveals Extensive Conservation of Chromosome Organization and Synteny. PLoS Biol. 2007, 5 (7), e167.
(5) Chalfie, M. GFP: Lighting up life. Proc. Natl. Acad. Sci. U.S.A. 2009, 106 (25), 10073–80.
(6) Mello, C. C. Return to the RNAi world: rethinking gene expression and evolution. Cell Death Differ. 2007, 14 (12), 2013–20.
(7) Fire, A. Z. Gene silencing by double-stranded RNA. Cell Death Differ. 2007, 14 (12), 1998–2012.
(8) Frokjaer-Jensen, C.; Davis, M. W.; Hopkins, C. E.; Newman, B. J.; Thummel, J. M.; Olesen, S. P.; Grunnet, M.; Jorgensen, E. M. Single-copy insertion of transgenes in Caenorhabditis elegans. Nat. Genet. 2008, 40 (11), 1375–83.
(9) Schrimpf, S. P.; Hengartner, M. O. A worm rich in protein: Quantitative, differential, and global proteomics in Caenorhabditis elegans. J. Proteomics 2010.
(10) Shim, Y. H.; Paik, Y. K. Caenorhabditis elegans proteomics comes of age. Proteomics 2010, 10 (4), 846–57.
(11) Audhya, A.; Desai, A. Proteomics in Caenorhabditis elegans. Brief Funct. Genomic Proteomic 2008, 7 (3), 205–10.
(12) Schrimpf, S. P.; Weiss, M.; Reiter, L.; Ahrens, C. H.; Jovanovic, M.; Malmstrom, J.; Brunner, E.; Mohanty, S.; Lercher, M. J.; Hunziker, P. E.; Aebersold, R.; von Mering, C.; Hengartner, M. O. Comparative functional analysis of the Caenorhabditis elegans and Drosophila melanogaster proteomes. PLoS Biol. 2009, 7 (3), e48.
(13) Merrihew, G. E.; Davis, C.; Ewing, B.; Williams, G.; Kall, L.; Frewen, B. E.; Noble, W. S.; Green, P.; Thomas, J. H.; MacCoss, M. J. Use of shotgun proteomics for the identification, confirmation, and correction of C. elegans gene annotations. Genome Res. 2008, 18 (10), 1660–9.
(14) Mawuenyega, K. G.; Kaji, H.; Yamuchi, Y.; Shinkawa, T.; Saito, H.; Taoka, M.; Takahashi, N.; Isobe, T. Large-scale identification of Caenorhabditis elegans proteins by multidimensional liquid chromatography-tandem mass spectrometry. J. Proteome Res. 2003, 2 (1), 23–35.
(15) Schrimpf, S. P.; Langen, H.; Gomes, A. V.; Wahlestedt, C. A two-dimensional protein map of Caenorhabditis elegans. Electrophoresis 2001, 22 (6), 1224–32.
(16) Tabuse, Y.; Nabetani, T.; Tsugita, A. Proteomic analysis of protein expression profiles during Caenorhabditis elegans development using two-dimensional difference gel electrophoresis. Proteomics 2005, 5 (11), 2876–91.
(17) Weiss, M.; Schrimpf, S.; Hengartner, M. O.; Lercher, M. J.; von Mering, C. Shotgun proteomics data from multiple organisms reveals remarkable quantitative conservation of the eukaryotic core proteome. Proteomics 2010, 10 (6), 1297–306.
(18) Bantscheff, M.; Ringel, B.; Madi, A.; Schnabel, R.; Glocker, M. O.; Thiesen, H. J. Differential proteome analysis and mass spectrometric characterization of germ line development-related proteins of Caenorhabditis elegans. Proteomics 2004, 4 (8), 2283–95.
(19) Tops, B. B.; Gauci, S.; Heck, A. J.; Krijgsveld, J. Worms from venus and mars: proteomics profiling of sexual differences in Caenorhabditis elegans using in vivo 15N isotope labeling. J. Proteome Res. 2010, 9 (1), 341–51.
(20) Chu, D. S.; Liu, H.; Nix, P.; Wu, T. F.; Ralston, E. J.; Yates, J. R., 3rd; Meyer, B. J. Sperm chromatin proteomics identifies evolutionarily conserved fertility factors. Nature 2006, 443 (7107), 101–5.
(21) Dong, M. Q.; Venable, J. D.; Au, N.; Xu, T.; Park, S. K.; Cociorva, D.; Johnson, J. R.; Dillin, A.; Yates, J. R., 3rd Quantitative mass spectrometry identifies insulin signaling targets in C. elegans. Science 2007, 317 (5838), 660–3.
(22) Li, J.; Cai, T.; Wu, P.; Cui, Z.; Chen, X.; Hou, J.; Xie, Z.; Xue, P.; Shi, L.; Liu, P.; Yates, J. R., 3rd; Yang, F. Proteomic analysis of mitochondria from Caenorhabditis elegans. Proteomics 2009, 9 (19), 4539–53.
(23) Zielinska, D. F.; Gnad, F.; Jedrusik-Bode, M.; Wisniewski, J. R.; Mann, M. Caenorhabditis elegans has a phosphoproteome atypical for metazoans that is enriched in developmental and sex determination proteins. J. Proteome Res. 2009, 8 (8), 4039–49.
(24) McGhee, J. D.; Fukushige, T.; Krause, M. W.; Minnema, S. E.; Goszczynski, B.; Gaudet, J.; Kohara, Y.; Bossinger, O.; Zhao, Y.; Khattra, J.; Hirst, M.; Jones, S. J.; Marra, M. A.; Ruzanov, P.; Warner, A.; Zapf, R.; Moerman, D. G.; Kalb, J. M. ELT-2 is the predominant transcription factor controlling differentiation and function of the C. elegans intestine, from embryo to adult. Dev. Biol. 2009, 327 (2), 551–65.
(25) Stroeher, V. L.; Kennedy, B. P.; Millen, K. J.; Schroeder, D. F.; Hawkins, M. G.; Goszczynski, B.; McGhee, J. D. DNA-protein interactions in the Caenorhabditis elegans embryo: oocyte and embryonic factors that bind to the promoter of the gut-specific ges-1 gene. Dev. Biol. 1994, 163 (2), 367–380.
(26) Mains, P. E.; McGhee, J. D. Biochemistry of C. elegans; Oxford University Press: Eynsham, Oxon., U.K., 1999; pp 227–244.
(27) Shevchenko, A.; Tomas, H.; Havlis, J.; Olsen, J. V.; Mann, M. In-gel digestion for mass spectrometric characterization of proteins and proteomes. Nat. Protoc. 2006, 1 (6), 2856–60.
(28) Keller, A.; Eng, J.; Zhang, N.; Li, X. J.; Aebersold, R. A uniform proteomics MS/MS analysis platform utilizing open XML file formats. Mol. Syst. Biol. 2005, 1, 0017.
(29) Fenyo, D.; Beavis, R. C. A method for assessing the statistical significance of mass spectrometry-based protein identifications using general scoring schemes. Anal. Chem. 2003, 75 (4), 768–74.
(30) Keller, A.; Nesvizhskii, A. I.; Kolker, E.; Aebersold, R. Empirical statistical model to estimate the accuracy of peptide identifications made by MS/MS and database search. Anal. Chem. 2002, 74 (20), 5383–92.
(31) Nesvizhskii, A. I.; Keller, A.; Kolker, E.; Aebersold, R. A statistical model for identifying proteins by tandem mass spectrometry. Anal. Chem. 2003, 75 (17), 4646–58.
(32) Achanzar, W. E.; Ward, S. A nematode gene required for sperm vesicle fusion. J. Cell. Sci. 1997, 110 (Pt 9), 1073–81.
(33) Reinke, V.; Cutter, A. D. Germline expression influences operon organization in the Caenorhabditis elegans genome. Genetics 2009, 181 (4), 1219–28.
(34) Reinke, V.; Gil, I. S.; Ward, S.; Kazmer, K. Genome-wide germline-enriched and sex-biased expression profiles in Caenorhabditis elegans. Development 2004, 131 (2), 311–23.
(35) Wang, X.; Zhao, Y.; Wong, K.; Ehlers, P.; Kohara, Y.; Jones, S. J.; Marra, M. A.; Holt, R. A.; Moerman, D. G.; Hansen, D. Identification of genes expressed in the hermaphrodite germ line of C. elegans using SAGE. BMC Genomics 2009, 10, 213.
(36) Kyte, J.; Doolittle, R. F. A simple method for displaying the hydropathic character of a protein. J. Mol. Biol. 1982, 157 (1), 105–32.
(37) de Sousa Abreu, R.; Penalva, L. O.; Marcotte, E. M.; Vogel, C. Global signatures of protein and mRNA expression levels. Mol. Biosyst. 2009, 5 (12), 1512–26.
(38) Tatusov, R. L.; Fedorova, N. D.; Jackson, J. D.; Jacobs, A. R.; Kiryutin, B.; Koonin, E. V.; Krylov, D. M.; Mazumder, R.; Mekhedov, S. L.; Nikolskaya, A. N.; Rao, B. S.; Smirnov, S.; Sverdlov, A. V.; Vasudevan, S.; Wolf, Y. I.; Yin, J. J.; Natale, D. A. The COG database: an updated version includes eukaryotes. BMC Bioinform. 2003, 4, 41.
