ARTICLE pubs.acs.org/jpr
Proteomic Comparison of Plastids from Developing Embryos and Leaves of Brassica napus
Diogo Ribeiro Demartini,† Renuka Jain,‡ Ganesh Agrawal,§ and Jay J. Thelen*,|| Department of Biochemistry and Interdisciplinary Plant Group, Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, Missouri 65211, United States
Supporting Information
ABSTRACT: Plastids are highly specialized organelles, responsible for photosynthesis and biosynthesis of various phytochemicals. To better understand plastid diversity and metabolism, a quantitative proteomic study of two plastid forms from Brassica napus (oilseed rape) was performed. Plastids were isolated from leaves (chloroplasts) of two-week-old plants and from developing embryos (embryoplasts) three weeks after flowering, using an approach that avoids protein storage vacuole contamination. Proteins from five different plastid preparations were prefractionated by SDS-PAGE and sectioned into multiple bands, and in-gel proteins were subjected to trypsin digestion. Tryptic peptides from each band were eluted and analyzed by liquid chromatography-tandem mass spectrometry (LC-MS/MS), and spectra were searched against a comprehensive plant database. Proteins were quantified based on MS/MS spectral counting of unique, nonhomologous peptides. Functional classification and quantitative comparison of over 2000 redundant proteins (compiled to 675 nonredundant proteins) determined that light reaction proteins are more prominent in chloroplasts, while many Calvin cycle enzymes are more prominent in embryoplasts. Embryoplasts also contain a diversity of other metabolic enzymes undetected in chloroplasts. Many enzymes involved in de novo fatty acid and amino acid biosynthesis were detected in embryoplasts but not chloroplasts. Additionally, protein synthesis-related proteins were prominent in embryoplasts. Collectively, these results indicate that these two plastid types are distinct.
KEYWORDS: spectral-counting, embryoplasts, chloroplasts, label-free, comparative proteomics, rapeseed
Received: October 18, 2010. Published: February 28, 2011.
© 2011 American Chemical Society
dx.doi.org/10.1021/pr101047y | J. Proteome Res. 2011, 10, 2226–2237
Journal of Proteome Research

1. INTRODUCTION
Plastids are diverse and highly specialized, semiautonomous organelles responsible for complex biosynthetic pathways in green algae, higher plants, and apicomplexan parasites (apicoplasts).1 Many processes take place within these organelles, including carbon dioxide reduction and assimilation into carbohydrates, and the biosynthesis of fatty acids, amino acids, and secondary metabolism compounds.2 All mature plastids originate from dedifferentiated proplastids and are capable of multiple, diverse morphological and biochemical forms. By far, the most studied plastid subtype is the chloroplast.1,3 The mature chloroplast is a complex organelle containing three major subcompartments with distinct characteristics: (1) the envelope membrane located between the amorphous plastid stroma and the cellular cytosol, containing both outer and inner membranes; (2) the aqueous stroma; and (3) the thylakoid membrane system. Each of these subcompartments has been characterized at the proteome level using different protein extraction procedures to enrich for recalcitrant membrane proteins or soluble protein complexes to reach deeper inside the plastid proteome.2 Thylakoids and the envelope are complex membrane systems with both hydrophobic and hydrophilic proteins.2 From these and other studies, it is clear that the envelope membrane works as a selective barrier, responsible for metabolite exchange from plastids to the extraplastidial environment and import of proteins to the plastids.4 Also, the recognition of target peptides within a plastid preprotein is performed by the outer-membrane Toc complex and the inner-membrane Tic complex,5 which together form the well-characterized translocation system.6,7 Several transporters are present in the plastid inner envelope membrane, such as the triose phosphate/phosphate, phosphoenolpyruvate/pyruvate, and glucose 6-phosphate/phosphate translocators,8 and translocators for dicarboxylic acids.7,8

Approximately 2100 Arabidopsis thaliana proteins are predicted to be plastidial, while rice plastids are predicted to contain approximately 4800 proteins.9 Plastids also can be very different in form and function, depending on their location in the plant2 or stage of development.10,11 Plastids from nongreen organs such as roots depend on the supply of carbon substrates from the cytosol, such as triose phosphate, malate, glucose-6-phosphate, and pyruvate.12 The ability of plastids to use exogenous carbon substrates is dependent on the presence of specific transporters in the envelope membranes.13 For example, isolated plastids from developing Brassica napus embryos actively take up pyruvate, glucose-6-phosphate, dihydroxyacetone phosphate, malate, and acetate.14 Since B. napus embryo plastids have been described as photoheterotrophic, it is possible that multiple substrates are necessary to support the carbon demand for starch and de novo fatty acid synthesis (FAS) throughout seed development. A quantitative proteomic analysis of developing embryo plastids, particularly in a comparative manner against chloroplasts (a well-characterized plastid form), may shed light on this unusual plastid.

Purification of plastids from developing embryos is difficult due to the prominence of protein storage vacuoles, which readily rupture during mechanical disruption of the cell. A novel, nonmechanical homogenization protocol for isolating plastids from developing embryos of B. napus was recently developed and validated by proteomic analysis and was employed for this study.15 Plastids from three weeks-after-flowering embryos (embryoplasts) and plastids from two-week-old leaves (chloroplasts) of B. napus were quantitatively compared. By coupling SDS-PAGE (as a protein prefractionation technique) to nanospray liquid chromatography-tandem mass spectrometry (nLC-MS/MS) and spectral counting, protein expression in embryoplasts and chloroplasts was comparatively analyzed in a statistically robust manner.

Figure 1. Summarized methodology used in this proteomics study. (a) Source of plastids and their purification. (b-1) Embryoplasts and chloroplasts prefractionated by 12% SDS-PAGE; abbreviations “C1 (2)” and “E1 (2)” refer to biological replicates 1 and 2 for chloroplasts and embryoplasts, respectively. Five biological replicate analyses were performed for each plastid, but only two are depicted due to space limitations. After separation, staining, and destaining, each gel was sliced into 12 segments per lane, according to molecular mass ranges. A specific example from range 8 obtained from chloroplast and embryoplast gels is shown. (b-2) Each individual gel segment was subjected to individual trypsin digestion, and each digested gel segment was analyzed twice by LC-MS/MS. (b-3) Counted peptides that match to one specific protein are represented by x, y, and z. (c) Output files from quantification analyses for one protein from chloroplasts only (due to space limitation), where each square denotes one LC-MS/MS analysis. Each biological replicate (C1 to C5), with two technical replicates (T1 and T2), is represented. The summed counted spectra for each biological replicate were averaged and finally the technical replicates were averaged (xcf for chloroplasts, xef for embryoplasts, not shown in this figure). Annotated data sets contain only nonredundant proteins for each plastid, and numbers xcf and xef represent the average assigned spectra per biological replicate for each protein.
2. EXPERIMENTAL PROCEDURES

Plant Growth and Plastid Purification from Developing Embryos and Leaves
Brassica napus cv. Reston plants were grown in a greenhouse under a light/dark cycle of 18/6 h. Purification of plastids from developing embryos three weeks after flowering (embryoplasts) was performed as described previously.15 Purification of chloroplasts from two-week-old leaves was done according to Perry et al.16

Prefractionation of Plastids by SDS-PAGE
Purified embryoplasts and chloroplasts were resolved by 12% SDS-PAGE as a prefractionation step. For both plastid subtypes, 100 μg of protein from each of five biological replicates was resolved. After Coomassie Blue staining, gels were cut into 12 segments per biological replicate, creating 12 molecular mass (MM) ranges using the molecular mass markers as a reference (Figure 1b-1). Segments were labeled from C1 to C5 for chloroplasts and E1 to E5 for embryoplasts, followed by their specific MM range number. A total of 120 segments were obtained for an entire gel. Each individual segment was finely minced, transferred to a sterile microcentrifuge tube, and immediately processed for trypsin digestion (Figure 1b-2).
Trypsin Digestion and LC-MS/MS Analyses of Gel Segments
High-throughput trypsin digestion was performed as described previously.17 Nanospray-liquid chromatography-tandem mass spectrometry analysis (nESI-LC-MS/MS) was performed with a ProteomeX LTQ mass spectrometer (Thermo Fisher, San Jose, CA; http://www.thermo.com/). Digested peptides were reconstituted in 60 μL of 0.1% formic acid (FA; v/v) and separated by capillary reverse phase chromatography (BioBasic C-18, 100 × 0.18 mm2, 300 Å, 5 μm; http://www.biobasic.com/) in a 90 min run with an acetonitrile (ACN) gradient (0–60%), using 0.1% FA in water as solvent A and 99.9% ACN + 0.1% FA as solvent B (v/v in all cases). Eluted peptides were ionized with a fused-silica Pico-Tip emitter (12 cm, 360 μm od, 75 μm id, 30 μm tip; New Objective, Woburn, MA; http://www.newobjective.com/) at an ion spray voltage of 3.0 kV and a flow rate of 250 nL/min. Five data-dependent MS/MS scans (isolation width 2 amu, 35% normalized collision energy, minimal signal threshold 500 counts, dynamic exclusion (repeat count, 2; repeat duration, 30 s; exclusion duration, 180 s)) of the three most intense parent ions were acquired in positive ion mode following each full scan in a mass range of m/z 400–2000.

Protein Identification and Quantification
Acquired MS/MS spectra were searched using SEQUEST under the BioWorks 3.2 software package (Thermo Fisher, San Jose, CA; www.thermo.com). Searches were performed against a Viridiplantae protein database from the National Center for Biotechnology Information (ftp://ftp.ncbi.nih.gov/blast/, 217 043 entries), containing forward and randomized sequences in a concatenated format to determine the false discovery rate. The decoy database was randomized and concatenated using an in-house developed program, DecoyDB Creator, available at www.oilseedproteomics.missouri.edu. Oxidation of methionine was allowed as a variable modification and carbamidomethylation of cysteine as a static modification. High confidence protein identification was performed by requiring: (1) a minimum of two unique and nonoverlapping peptides for protein assignment; (2) a minimum peptide probability of 95% for each peptide match; and (3) minimum Xcorr values of 1.5, 2.0, and 2.5 for singly, doubly, and triply charged peptides, respectively. Individual output files from the SEQUEST searches were uploaded to Scaffold 2.1.03 (Proteome Software Inc., Portland, OR; www.proteomesoftware.com/) for spectral count analyses. Three output formats available in Scaffold for protein quantification were used for data analysis: number of unique spectra, number of unique peptides, and number of total assigned spectra (AS).

Data Analysis
Spectral count data from Scaffold were exported to a unified file format and manually curated. First, all gel-segmented protein assignment data derived from a single SDS-PAGE lane (i.e., one plastid isolation) were combined to monitor the total number of proteins quantified per plastid. Second, assigned spectra for those proteins detected in more than one MM range in one biological replicate were summed using the Genbank accession number as reference (Figure 1c). After this step, the chloroplast and embryoplast data sets contained only nonredundant (NR) proteins, based upon accession number. Summed AS for the first and second technical replicates in each biological replicate were individually averaged (Figure 1c, bottom). Two technical replicates for each of five biological replicates were averaged and the standard deviation was calculated (Figure 1c, bottom). Final numbers are represented by “average AS per biological replicate” (xf). Total AS were obtained by summing xf from all biological and technical replicates (Figure 1c, bottom).

To determine the predicted subcellular localization for identified proteins, three criteria were used. First, the data sets were compared by Basic Local Alignment Search Tool (BLAST)18 searches against the experimentally characterized proteins available at the Plastid Proteome Database (www.plprot.ethz.ch),19 using the CompletePlastidProteom_20061017 database. The retrieved protein with the highest identity for each particular query was used. Second, data sets were submitted to TargetP 1.1, which predicts subcellular localization based on the N-terminal amino acid sequence (http://www.cbs.dtu.dk/services/TargetP).20 The third step was comparing each plastid data set with the TAIR 8 protein database using BLASTP (TAIR, http://www.arabidopsis.org/index.jsp).21 The homologue with the highest protein identity was retrieved for each entry.

Final subcellular localization was summarized by combining plprot, TargetP, and TAIR information. First, all plastid-encoded (PE) proteins, based on genome information from http://www.ncbi.nlm.nih.gov/genomes/ORGANELLES/plastids_tax.html or TAIR, were annotated appropriately. The same criterion was applied to those proteins for which identity to a PE Arabidopsis thaliana protein was higher than 90%. Proteins that had high sequence identity (>90%) in the plprot BLAST results were considered to be plastid-localized. TargetP localization information was employed for those proteins that did not meet any of the aforementioned criteria.

Protein Functional Classification
Proteins were functionally classified according to Bevan et al. (1998),22 with modifications. Information obtained from TAIR, the Universal Protein Resource (http://www.uniprot.org/),23 the ExPASy Proteomics Server (http://ca.expasy.org/),24 and the scientific literature was primarily used for functional classification. Proteins annotated as “unknown”, “hypothetical”, or “putative protein” were submitted to BLASTP using “higher plants” as the organism refinement (http://blast.ncbi.nlm.nih.gov/). Final annotation for these proteins was assigned according to sequence identity. If sequence identity between query and result was higher than 90%, the annotation from the BLASTP result was used; between 70 and 90%, the term “similar” was appended to the annotation; between 50 and 70%, the term “putative” was appended to the annotation; and if sequence identity was lower than 50%, the entry was annotated simply as “expressed protein.”

Determination of Orthologous Entries
Following the previous steps, each proteomics data set was mined for possible orthologs by BLASTP searching against the original sequence database. Proteins with identity equal to or higher than 90% with one or more proteins in the data set were then mined against the TAIR8 database, as a genome reference. If these proteins mapped to the same Arabidopsis gene and identity between them was higher than 90%, they were considered tentatively orthologous. These entries were then carefully compared for sequence alignment, spectral counts, MM, and subcellular localization. If these parameters were in agreement, redundant orthologous proteins were merged into one entry if the counted peptides for each protein arose from different spectra, to avoid “double counting”; in this case, individual spectra were monitored in the raw file. Genbank accession numbers and organisms for merged proteins are presented in the Supporting Information.
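The merging rule described above can be illustrated with a minimal sketch. It assumes that candidate entries have already passed the identity criteria and the manual comparison, and it counts each spectrum once by taking a set union. The record keys (`accession`, `tair_gene`, `spectra`), the function name, and the example identifiers are hypothetical and are not taken from the original workflow.

```python
from collections import defaultdict


def merge_orthologs(candidates):
    """Merge candidate orthologs that map to the same Arabidopsis gene.

    candidates: list of dicts with hypothetical keys 'accession',
    'tair_gene', and 'spectra' (a set of spectrum identifiers). Entries are
    assumed to have already passed the >90% identity criteria.
    """
    by_gene = defaultdict(list)
    for entry in candidates:
        by_gene[entry["tair_gene"]].append(entry)

    merged = []
    for gene, members in by_gene.items():
        # A set union counts a spectrum shared by two entries only once.
        spectra = set().union(*(m["spectra"] for m in members))
        merged.append({
            "tair_gene": gene,
            "accessions": sorted(m["accession"] for m in members),
            "assigned_spectra": len(spectra),
        })
    return merged


# Made-up identifiers, for illustration only.
example = [
    {"accession": "gi|A", "tair_gene": "GENE_1", "spectra": {"s1", "s2"}},
    {"accession": "gi|B", "tair_gene": "GENE_1", "spectra": {"s2", "s3"}},
    {"accession": "gi|C", "tair_gene": "GENE_2", "spectra": {"s4", "s5"}},
]
merged_example = merge_orthologs(example)
```

In this toy input, spectrum `s2` is shared by two entries and contributes only once to the merged count.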
Table 1. Assigned Spectra (AS) for Total and Plastid-Specific, Nonredundant (NR) Proteins According to Functional Classification (a)

                            chloroplasts NR proteins             embryoplasts NR proteins
functional classification   total AS (%)      specific AS (%)    total AS (%)      specific AS (%)
Cell growth/division        31 (0.09)         21 (0.95)          214 (0.61)        206 (1.45)
Cell structure              332 (0.94)        28 (1.27)          1460 (4.16)       1365 (9.63)
Destination/storage         1758 (4.98)       65 (2.94)          6604 (18.82)      3478 (24.54)
Disease/defense             451 (1.28)        198 (8.95)         1337 (3.81)       859 (6.06)
Energy                      29 961 (84.85)    1410 (63.74)       15 677 (44.68)    1854 (13.08)
Intracellular traffic       6 (0.02)          0 (0.00)           256 (0.73)        158 (1.11)
Metabolism                  893 (2.53)        174 (7.87)         5050 (14.39)      2768 (19.53)
Protein synthesis           104 (0.29)        7 (0.32)           2236 (6.37)       1655 (11.68)
Secondary metabolism        582 (1.65)        0 (0.00)           502 (1.43)        199 (1.40)
Signal transduction         516 (1.46)        34 (1.54)          418 (1.19)        335 (2.36)
Transcription               14 (0.04)         14 (0.63)          350 (1.00)        350 (2.47)
Transporters                357 (1.01)        2 (0.09)           698 (1.99)        718 (5.07)
Unclassified                306 (0.87)        259 (11.71)        246 (0.70)        226 (1.59)
Total                       35 311 (100.00)   2212 (100.00)      35 084 (100.00)   14 171 (100.00)

(a) Total and plastid-specific number (and percentage) of NR proteins detected in particular plastid types are noted.
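As a small worked check of how the percentages in Table 1 relate to the counts, the sketch below recomputes the share of each functional class for the chloroplast “total AS” column. The values are transcribed from Table 1; the variable names are illustrative only.

```python
# Chloroplast "total AS" column of Table 1 (assigned spectra per class).
chloroplast_total_as = {
    "Cell growth/division": 31,
    "Cell structure": 332,
    "Destination/storage": 1758,
    "Disease/defense": 451,
    "Energy": 29961,
    "Intracellular traffic": 6,
    "Metabolism": 893,
    "Protein synthesis": 104,
    "Secondary metabolism": 582,
    "Signal transduction": 516,
    "Transcription": 14,
    "Transporters": 357,
    "Unclassified": 306,
}

# The column total is the sum over all functional classes.
column_total = sum(chloroplast_total_as.values())

# Each percentage is the class count divided by the column total.
percent = {
    name: 100.0 * count / column_total
    for name, count in chloroplast_total_as.items()
}
```

Rounded to two decimals, `percent["Energy"]` reproduces the 84.85% given for chloroplast energy-related proteins.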
3. RESULTS

Simple Workflow for In-Depth Differential Proteomics
Spectral counting has been shown to be a useful quantitative proteomics approach to address biological questions from a variety of systems.25–27 From a technical standpoint it is simpler than other MS-based methods for quantification.27 However, if improperly performed or interpreted it can propagate experimental or biological variation.25,28 To minimize this, we performed 10 replicate analyses (5 biological × 2 technical) for each purified plastid subtype, and samples were prefractionated by SDS-PAGE to reduce sample complexity and enhance the number of proteins that could be detected in the ensuing LC-MS/MS analyses. For consistency, gels were segmented according to molecular mass ranges (MM range) of the protein standard (Figure 1). Since individual SDS-PAGE lanes represent single biological replicates, counted spectra (which we term assigned spectra, AS) attributed to the same protein but detected in more than one MM range (or gel segment) were summed (Figure 1c). In each range, one particular protein could be detected at most 10 times, that is, 5 biological replicates × 2 LC-MS/MS replicates per MM range. Occasionally proteins were detected in only one LC-MS/MS analysis, whereas in several cases proteins were detected in both LC-MS/MS replicates and all biological replicates for that particular segment (Figure 1c). Proteins were classified based upon the frequency with which they were detected, as follows: (a) low abundance proteins (detected in only one MM range, only one LC-MS/MS analysis, and with only two total spectra assigned); (b) medium abundance proteins (detected in between two and seven LC-MS/MS analyses in one or more MM ranges); and (c) high abundance proteins (detected in more than seven LC-MS/MS analyses, also in one or more MM ranges). Assigned spectra (AS) of each protein detected in more than one range in a biological replicate were summed and averaged among all five biological replicates. We also averaged the assigned spectra for the two technical replicates. Therefore, the final AS data reflect both the technical and biological variation in this investigation (Figure 1c, bottom).

Using this strategy, a total of 2104 proteins were detected and analyzed: 1339 from embryoplasts and 765 from chloroplasts. To account for redundant proteins, AS for those proteins detected in more than one range were summed using the Genbank accession number as a reference (Figure 1c, top), resulting in 617 and 240 nonredundant (NR) proteins for embryoplasts and chloroplasts, respectively. Remaining protein redundancy was attributed to the presence of possible orthologs in this data set. Reciprocal BLASTP searches allowed 40 NR proteins from embryoplasts to be reduced to 18, and in chloroplasts 29 NR proteins were reduced to 12. In most cases, two NR proteins were combined, except for ATP synthase CF1 α and β subunits and RuBisCO large subunit, which had multiple entries. The total AS for these “merged” proteins were summed and are presented in the Supporting Information. Each MS/MS spectrum was verified as counted only once. Redundant proteins detected in only one LC-MS/MS analysis with only two AS were considered orphans and removed from the data sets. Using these data quality thresholds, the original embryoplast data set containing 1339 total proteins was filtered to a final data set of 492 NR proteins comprising 35 084 total AS (Table 1). For chloroplasts, the original 765 total proteins were reduced to 183 NR proteins and 35 311 total AS (Table 1). The number of total AS differed by less than 1% between these two large-scale proteomic investigations. Therefore, despite the disparity in the number of identified proteins (due to dynamic range of protein expression), equal amounts of embryoplast and chloroplast proteins were comparatively analyzed. All remaining data analyses were performed with these NR-protein data from embryoplasts and chloroplasts.
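The summing and averaging steps described above can be sketched as follows. This is a minimal illustration rather than the authors' actual scripts: the record layout and function name are hypothetical, and a replicate in which a protein was not detected is assumed to contribute zero spectra. Under that assumption the design is balanced, so averaging technical replicates first or biological replicates first gives the same mean.

```python
from collections import defaultdict

N_BIO = 5   # biological replicates per plastid type (from the text)
N_TECH = 2  # LC-MS/MS technical replicates per gel segment (from the text)


def average_assigned_spectra(records, n_bio=N_BIO, n_tech=N_TECH):
    """Return {accession: average AS per biological replicate}.

    records: iterable of (accession, bio_rep, tech_rep, mm_range, spectra)
    tuples. This layout is an assumption made for illustration.
    """
    # Step 1: sum spectra over MM ranges for each accession, biological
    # replicate, and technical replicate.
    summed = defaultdict(int)
    for accession, bio, tech, _mm_range, spectra in records:
        summed[(accession, bio, tech)] += spectra

    # Step 2: average the technical replicates within each biological
    # replicate (a missing replicate counts as zero spectra).
    per_bio = defaultdict(float)
    for (accession, bio, _tech), total in summed.items():
        per_bio[(accession, bio)] += total / n_tech

    # Step 3: average over the biological replicates.
    result = defaultdict(float)
    for (accession, _bio), tech_mean in per_bio.items():
        result[accession] += tech_mean / n_bio
    return dict(result)


# Toy data: one protein seen in two MM ranges of replicate 1 and one of replicate 2.
toy_records = [
    ("P1", 1, 1, 3, 4),
    ("P1", 1, 1, 4, 2),
    ("P1", 1, 2, 3, 6),
    ("P1", 2, 1, 3, 10),
]
toy_average = average_assigned_spectra(toy_records)
```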
The statistical strategy used to summarize the final quantification data (Figure 1c, bottom) was the best option among the strategies tested. There are a considerable number of low-abundance proteins in the data set. Since these proteins were detected in no more than two analyses, they produced a high standard deviation when five degrees of freedom were used for analysis. By initially averaging the five biological replicates, slightly lower standard deviations were observed for low-abundance proteins.

Plastids are organelles with multiple subcompartments,3,29,30 containing hydrophobic (membrane) and hydrophilic (mostly stromal) proteins. Several extraction methods have previously been used to isolate and characterize these different groups of proteins. Recently, Ferro and co-workers identified and quantified 1323 proteins from Arabidopsis thaliana chloroplasts.31 Identification of greater than 1000 proteins was possible because the authors performed subplastidial fractionation followed by further separation of hydrophobic proteins from hydrophilic proteins. Employing such an approach requires high yields of plastids, which were not attainable for embryoplasts in this study. Therefore, the strategy adopted in the present work was to isolate high quality chloroplasts and embryoplasts and prefractionate using only SDS-PAGE, focusing on comparative analysis of two plastid forms from the same plant.

Figure 2. Functional classification of nonredundant proteins identified and quantified for each plastid subtype. (a) Proteins from embryoplasts and chloroplasts are classified according to function. Number of proteins detected (b, left) only in embryoplasts, (b, right) in chloroplasts, and overlapping proteins (OVPs) detected in both plastids; the number is indicated beside the arrow. (c) Functional classification for the 104 OVPs.

Embryoplasts Have a Large Number of Unique Proteins
Protein classification revealed distinct differences between the proteomes of embryoplasts and chloroplasts (Figure 2a). Proteins related to energy processes were the most prominent class in both plastids. They were 2.3-fold higher in chloroplasts, comprising 53.0% of total NR proteins, compared to 23.2% for embryoplasts. Proteins involved in disease and defense were slightly higher in chloroplasts (1.3-fold) than in embryoplasts, while most of the remaining classes were more prominent in embryoplasts. In particular, proteins involved in protein synthesis (5.0-fold higher), metabolism (1.7-fold higher), and destination and storage (1.4-fold higher) were all more prominent in embryoplasts (Figure 2a).
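The fold differences quoted above are ratios of class shares, not of raw protein counts. A minimal worked example for the energy class is given below, using the counts reported in this Results section (97 of 183 NR proteins in chloroplasts and 114 of 492 in embryoplasts); the function name is illustrative.

```python
def class_share(n_in_class, n_total):
    """Percentage of nonredundant proteins that fall in one functional class."""
    return 100.0 * n_in_class / n_total


# Energy-related NR proteins reported in the text.
chloroplast_energy = class_share(97, 183)
embryoplast_energy = class_share(114, 492)

# Fold difference between the two shares.
energy_fold = chloroplast_energy / embryoplast_energy
```

Here embryoplasts contain more energy-related proteins in absolute terms (114 versus 97), yet the class share is higher in chloroplasts because the chloroplast data set is smaller.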
Although the percentage of total NR proteins involved in energy production was higher in chloroplasts, more energy-related proteins were detected in embryoplasts. Among the 114 proteins related to energy in embryoplasts, 86 were previously characterized as plastid-localized, including 16 plastid-encoded (PE) proteins. In chloroplasts, only 4 of the 97 energy-related proteins were not previously detected in plastid preparations, and 20 PE proteins were observed. Despite current BLASTP and functional queries, it was not possible to unequivocally assign 4.6% of embryoplast proteins and 7.1% of chloroplast proteins to a specific functional class (Figure 2). Proteins involved in electron transport were also detected in both plastids, with 23 entries in chloroplasts and 13 in embryoplasts, including cytochrome b6f complex and rubredoxin-related proteins. Subunits from the photosynthetic ATP synthetase complex were also detected in both plastid subtypes: CF1 hydrophilic subunits α, β, γ, δ, ε and the CF0 transmembrane subunit β. ATP synthetase β subunit levels were 2-fold higher in chloroplasts, and the entire complex was 2.2-fold higher in chloroplasts. Plastid-encoded NADH dehydrogenase subunits I, J, 5, and 7 were detected exclusively in chloroplasts. A partial glycolytic pathway was detected in both plastid types. In embryoplasts, five steps of this pathway could be mapped from the 23 redundant entries detected, while in chloroplasts, only two steps from six redundant entries were detected. In embryoplasts, the glycolytic enzymes detected were fructose-bisphosphate aldolase, glyceraldehyde 3-phosphate dehydrogenase (A, B, and C subunits), triose-phosphate isomerase, phosphoglycerate kinase, and phosphofructokinase β subunit. In chloroplasts, glyceraldehyde 3-phosphate dehydrogenase and aldolase were the only glycolytic activities detected.
Proteins involved in primary metabolism, such as those involved in de novo fatty acid biosynthesis and amino acid metabolism, comprised 15.2% of 492 NR proteins from embryoplasts and 8.7% of 183 NR proteins in chloroplasts (Figure 2). In chloroplasts, six proteins involved in de novo fatty acid synthesis were detected, while in embryoplasts there were 71 proteins related to this pathway. Starch synthase was also detected in both plastids, with 8.5-fold higher abundance in embryoplasts based on number of AS. A higher number of proteins involved in protein synthesis were detected in embryoplasts (67 proteins, 13.6%) versus chloroplasts (5, 2.7%). In embryoplasts there were 11 elongation factors detected, while in chloroplasts only 1 was present (elongation factor TU). Ribosomal proteins encoded by the plastid genome (ribosomal proteins L16, L2, S2, S4, S7, and S8) and other nuclear-encoded ribosomal proteins were detected only in embryoplasts. The translocon at the outer envelope membrane of chloroplasts 75-III (Toc-75) and the translocon at the inner envelope membrane of chloroplasts 110 (Tic110) were also detected only in embryoplasts.

Figure 2b reveals that a considerable number of NR proteins were specifically detected in embryoplasts: out of 492 NR proteins, 388 (78.9%) were embryoplast-specific. In chloroplasts, 183 NR proteins were identified and 79 (43.2%) were chloroplast-specific. The specificity is based on the detection of these proteins within each plastid subtype in this study. It does not imply that these proteins are not expressed in the other plastid type; they were simply undetected there in the current study. Embryoplasts and chloroplasts contain 104 proteins in common, the overlapping proteins (OVPs), which represent 21.1% of embryoplast NR proteins and 56.8% of chloroplast NR proteins. Classification of the 104 OVPs reveals that five classes constitute 88.5% of all OVPs, and energy-related proteins are the most prominent with 63 proteins (60.6%), followed by 12 related to destination and storage (11.5%), 7 related to disease and defense (6.7%), and 6 related to metabolism (5.8%). The remaining classes together sum to 11.5% of the OVPs. In contrast, many NR proteins were detected in embryoplasts but not chloroplasts, and the classification of these proteins is shown in Figure 3. Proteins related to metabolism, destination and storage, protein synthesis, and energy correspond to 62.4% of the embryoplast-specific proteins. Among all chloroplast-specific NR proteins, those related to energy correspond to 43.0%. Out of 388 embryoplast-specific NR proteins, 69 are related to metabolism and represent 17.8% of all embryoplast-specific NR proteins; in chloroplasts, the same class represents 12.7%.

Figure 3. Functional classification of proteins detected in only one plastid type. Proteins detected only in chloroplasts (white bar) or only in embryoplasts (gray bar) are represented as a percentage of the number of nonredundant proteins per plastid.

Peptide Quantification Defines Differences between Embryoplasts and Chloroplasts
One interesting finding from this work is that although the total number of AS for each plastid subtype is nearly identical (35 084 for embryoplasts and 35 311 for chloroplasts, Table 1), the distribution of these AS is remarkably different among the various proteins detected in these two plastids. Based on AS, in embryoplasts there were 76 low abundance proteins (15.5%), 219 medium abundance proteins (44.5%), and 197 high abundance proteins (40.0%). Chloroplast proteins were distributed as follows: 24 low (13.1%), 58 medium (31.7%), and 101 high (55.2%) abundance proteins. While low abundance proteins were found among all 13 functional classes in embryoplasts, in chloroplasts they were found in only 7 functional classes (Figure 4). Chloroplasts contained a preponderance of proteins related to energy production, accounting for 84.9% of total AS (29 961 counts, Table 1). Within this class, 53 proteins were associated with photosystem I and II complexes, comprising 18 968 AS or 63.3% of the energy-related spectral counts for chloroplasts. Proteins related to energy were also more prominent among the chloroplast-specific NR proteins, in terms of both the number of identified proteins (Figure 3) and spectral counts (Supplemental Table 2, Supporting Information). This class was responsible for 1410 AS, or 63.7% of all AS among chloroplast-specific proteins.
Figure 4. Protein abundance based on the number of LC-MS/MS analyses in which one particular protein was detected. Proteins in each plastid data set were classified as low abundance (black bar; maximum of one LC-MS/MS analysis and two total assigned spectra among all analyses), medium abundance (gray bar; protein detected in between 2 and 7 LC-MS/MS analyses), and high abundance (white bar; protein detected in more than seven LC-MS/MS analyses). Each protein could be detected in at most 10 LC-MS/MS analyses in each range, since 5 biological replicates were analyzed twice each. Error bars are not given because these are absolute numbers.
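The abundance classes used in Figure 4 can be expressed as a simple rule on the number of LC-MS/MS analyses in which a protein was detected. The sketch below is one reading of those definitions. The function name is hypothetical, and the "unclassified" fallback for a protein seen in a single analysis with more than two spectra is an assumption, since the text does not state how such a case was handled.

```python
def abundance_class(n_analyses, total_spectra):
    """Classify a protein by detection frequency.

    n_analyses: number of LC-MS/MS analyses in which the protein was detected.
    total_spectra: total assigned spectra (AS) summed over all analyses.
    """
    if n_analyses > 7:
        return "high"
    if 2 <= n_analyses <= 7:
        return "medium"
    if n_analyses == 1 and total_spectra <= 2:
        return "low"
    # Case not covered by the stated definitions (assumption).
    return "unclassified"
```

For example, `abundance_class(1, 2)` gives "low", `abundance_class(5, 30)` gives "medium", and `abundance_class(8, 100)` gives "high".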
In chloroplasts, ATP synthase CF1 β subunit accrued the highest number of spectral counts (3554 AS), representing 11.9% of total assigned spectra for energy-related proteins. In embryoplasts, this protein accounted for 1485 spectra out of 15 677 for energy-related proteins, corresponding to 9.5% within this class. Proteins involved in protein destination represented 10.4% of total NR proteins from chloroplasts, but only 4.9% of total AS for this organelle. In embryoplasts, this class represented 14.4% of NR proteins and was responsible for 18.8% of total AS for this plastid subtype. Differences were also observed with proteins related to primary metabolism. In embryoplasts, 15.2% of NR proteins belong to this class and they represent 14.4% of all assigned spectra for the organelle. In contrast, 7.7% of NR proteins were assigned to primary metabolism in chloroplasts and accounted for only 1.4% of total assigned spectra. For example, a putative starch synthase (gi 15223331, Supporting Information) was detected in both plastids, but was expressed 8.5-fold higher in embryoplasts than chloroplasts based upon AS.
dx.doi.org/10.1021/pr101047y |J. Proteome Res. 2011, 10, 2226–2237
The plastid transcriptionally active chromosome protein 16 (pTAC 16, gi 42565672, AT3G46780.1) was detected in both plastids, but was 32-fold higher in chloroplasts. According to Pfalz and co-workers,32 pTAC 16 might have nucleoside-diphosphate-sugar epimerase activity, based on protein domain prediction. The protein is among 35 polypeptides detected by Pfalz and co-workers that might be involved in plastid gene expression.32 This protein is annotated as an expressed protein in the plprot database, and according to the AT_CHLORO database (http://www.grenoble.prabi.fr/at_chloro/),31 is located in the envelope and thylakoids of chloroplasts and described as an NAD-dependent epimerase/dehydratase family protein.31
Embryoplasts and Chloroplasts Share Mostly Energy-Related Proteins
Proteomic analysis of B. napus embryoplasts and chloroplasts revealed 104 overlapping proteins (OVPs; Figure 2c). By monitoring AS from the biological and technical replicates, their abundances could be compared in a meaningful manner, analogous to 104 quantitative Western blots (Figure 5). As previously shown in Figure 2, 60.6% of OVPs are related to energetic processes, but the abundance of these OVPs based on AS is 2-fold higher in chloroplasts than in embryoplasts. Some proteins, such as the glyceraldehyde-3-phosphate dehydrogenase α and β subunits, were equally abundant in both plastids. A similar situation occurred with the 23 kDa subunit of the oxygen-evolving system of photosystem II (gi 11134091) and another protein related to photosystem II (gi 49359169). Although the latter protein is slightly more abundant in chloroplasts according to the data presented, the number of assigned spectra is very similar for both plastids (467 total AS in chloroplasts and 444 in embryoplasts). Five proteins from the Calvin cycle were present among the OVPs. Interestingly, transketolase (gi 7329685) was 50-fold higher in embryoplasts than in chloroplasts (Supplemental Table 1, Supporting Information). Starch synthase (gi 15223331) was prominently detected in both plastids but was 8.5-fold more abundant in embryoplasts than in chloroplasts (210 assigned spectra in chloroplasts and 1788 in embryoplasts). Based upon the number of assigned spectra, starch synthase was the most prominent protein in embryoplasts, while ATP synthase CF1 β subunit was the most prominent protein in chloroplasts. The protein with the highest differential expression was cobalamin-independent methionine synthase (gi 15238686). This protein was detected in chloroplasts at very low abundance (two total assigned spectra), whereas embryoplasts contained 176 total AS, an 88-fold differential. However, the accuracy of this expression difference is uncertain because of the low counts observed in chloroplasts.
In fact, the high error for many of the proteins was due to the low AS counts for one or both plastid samples. This is clearly a limitation of the spectral counting approach. Among the 40 overlapping proteins related to photosystem I and II, 34 were more prominent in chloroplasts (Figure 5). The plastid ATP synthase complex was also more abundant in chloroplasts (2.2-fold higher). Proteins within the broad class of protein destination and storage, such as ATP-dependent proteases, were detected in both plastids but at higher levels in chloroplasts (2.3-fold higher).
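The fold differences quoted above can be reproduced from the totals of assigned spectra with a short Python sketch. It uses a plain ratio of totals, which is not necessarily the procedure behind the values and error bars in Figure 5. The helper names and the `min_spectra` cutoff of 5 are hypothetical, chosen only to illustrate why ratios built on very few spectra deserve caution.

```python
def fold_change(as_numerator: int, as_denominator: int) -> float:
    """Ratio of total assigned spectra (AS) between two plastid types."""
    if as_numerator <= 0 or as_denominator <= 0:
        raise ValueError("both samples need at least one assigned spectrum")
    return as_numerator / as_denominator


def is_low_count(as_a: int, as_b: int, min_spectra: int = 5) -> bool:
    """Flag comparisons in which either sample has few spectra.

    min_spectra is a hypothetical cutoff, not a threshold from the study.
    """
    return min(as_a, as_b) < min_spectra


# Starch synthase (gi 15223331): 1788 AS in embryoplasts, 210 in chloroplasts.
starch = fold_change(1788, 210)
# Methionine synthase (gi 15238686): 176 AS in embryoplasts, 2 in chloroplasts.
met_synthase = fold_change(176, 2)

print(round(starch, 1), is_low_count(1788, 210))
print(round(met_synthase, 1), is_low_count(176, 2))
```

The methionine synthase comparison is flagged because a change of one or two spectra in the chloroplast sample would shift its ratio by a large factor, whereas the starch synthase ratio rests on hundreds of spectra in both samples.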
4. DISCUSSION
The term embryoplast was recently suggested for semigreen, developing embryo plastids because these organelles strikingly
share characteristics of both photosynthetic chloroplasts and heterotrophic leucoplasts, and therefore cannot be classified as either.15 In the same investigation, a new protocol for purifying plastids from developing embryos was presented, beginning with isolated protoplasts. Using this isolation protocol, we performed a global, quantitative comparison of B. napus embryoplasts and chloroplasts at the proteome level to systematically characterize this unusual plastid type. By utilizing 3 WAF embryos, we could work with tissue that contained only limited amounts of starch and storage protein,15,33 both of which interfere with purification. Quantitation of detected proteins was performed using a relative approach to allow direct comparison of individual proteins between these two plastid subtypes. Therefore, whatever limitations of "bottom-up" quantitation exist for any individual protein (e.g., few proteotypic peptides due to hydrophobicity or low mass), they are intrinsic properties of that protein and are equivalent between the two plastid types. Spectral counting is based on the premise that protein abundance is directly reflected by the number of tandem MS spectra assigned to peptide surrogates for that particular protein.15,25,31 The precision level of this technique is sufficient to determine peptide (as a surrogate for protein) abundance for most MS platforms.34 According to Ferro and collaborators,31 spectral counting is a straightforward, semiquantitative approach; thresholds applied in label-free work must be clearly stated, as is done in the present work. Although spectral counting has been proven to be a quantitative approach, early validation experiments were performed without SDS-PAGE prefractionation. To verify that spectral counting is quantitatively valid when coupled to SDS-PAGE prefractionation, we performed a simple titration experiment by resolving bovine serum albumin (BSA) by SDS-PAGE.
CBB-stained protein was quantified by densitometry and compared to spectral count analysis of sectioned BSA bands (Supplemental Figure 2, Supporting Information). The relationship was linear from 20 to 500 ng protein and was highly reproducible based upon biological triplicate analyses. Since neither the nuclear nor the plastid genome of Brassica napus was available at the time this work was performed, a protein database composed of multiple plant species was queried to ensure maximum proteome coverage. However, this required ortholog verification during postanalysis to eliminate redundant protein assignments. In total, 2104 proteins were identified with high-confidence criteria from embryoplasts and chloroplasts; after assessing redundancy, this number was reduced to 492 and 183 NR proteins, respectively, because of the presence of orthologous proteins in the multiplant database employed for spectral mining. Additionally, to ensure high-confidence protein assignment and quantification, many marginal proteins were discarded, for example, proteins detected with one peptide or with low cross-correlation scores (Xcorr). The false-discovery rates, as determined by mining randomized sequences concatenated to the forward database, were consistently below 0.05 for the entire study. Since this was measured using nonredundant assignments instead of spectral counts, this is likely an overestimate. The observation that most identified proteins were observed or predicted to be plastidial supports this conclusion. Although a recent study with model organisms suggests that single peptide assignments for large-scale studies could result in a lower false positive rate,35 it is unclear whether this rule also holds for nonmodel organisms, and so we relied on two peptides for high-confidence protein assignment.
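As an illustration of how a false-discovery rate can be estimated from a search against a concatenated forward and randomized database, the Python sketch below applies one common convention, FDR = 2 × decoy hits / (target hits + decoy hits). The text does not state which formula was used in this study, so both the convention and the example counts are assumptions, and the function name is hypothetical.

```python
def target_decoy_fdr(n_target: int, n_decoy: int) -> float:
    """Estimate FDR from hits to a concatenated target-decoy database.

    Assumes false matches are equally likely to hit target and decoy
    sequences, so each decoy hit implies one false target hit.
    """
    if n_target < 0 or n_decoy < 0:
        raise ValueError("hit counts cannot be negative")
    total = n_target + n_decoy
    if total == 0:
        raise ValueError("no accepted identifications")
    return 2 * n_decoy / total


# Hypothetical counts for illustration only; not data from the study.
fdr = target_decoy_fdr(n_target=980, n_decoy=20)
print(f"estimated FDR = {fdr:.3f}")
```

Under this convention, a filtered result set is acceptable when the estimate stays below the chosen threshold, here the 0.05 cited in the text.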
Figure 5. Quantification comparison of the abundance based on spectral counting quantification for the overlapping NR proteins. Gray bar indicates the relative abundance for each particular NR-protein. Numbers and error bars were derived as discussed in methods and Supporting Information. Complete quantification information is available in the Supplemental Table 1. Proteins indicated by (**) indicate that few assigned spectra (