ARTICLE pubs.acs.org/jpr
Components of Complex Lipid Biosynthetic Pathways in Developing Castor (Ricinus communis) Seeds Identified by MudPIT Analysis of Enriched Endoplasmic Reticulum
Adrian P. Brown, Johan T. M. Kroon, Jennifer F. Topping, Joanne L. Robson, William J. Simon, and Antoni R. Slabas*
School of Biological and Biomedical Sciences, University of Durham, South Road, Durham, DH1 3LE, United Kingdom
Supporting Information
ABSTRACT: Ricinoleic acid is a feedstock for nylon-11 (N11) synthesis and is currently obtained from castor (Ricinus communis) oil. Production of this fatty acid in a temperate oilseed crop is of great commercial interest, but the highest reported level in transgenic plant oils is 30%, below the 90% observed in castor and insufficient for commercial exploitation. To identify castor oil-biosynthetic enzymes and inform strategies to improve ricinoleic acid yields, we performed MudPIT analysis on endoplasmic reticulum (ER) purified from developing castor bean endosperm. Candidate enzymes for all steps of triacylglycerol synthesis were identified among 72 proteins in the data set related to complex-lipid metabolism. Previously reported proteomic data from oilseeds had not included any membrane-bound enzyme that might incorporate ricinoleic acid into oil. Analysis of enriched ER enabled determination of which protein isoforms for these enzymes were present in developing castor seed. To complement these data, quantitative RT-PCR experiments with castor seed and leaf RNA were performed for orthologues of Arabidopsis oil-synthetic enzymes, determining which were highly expressed in the seed. These data provide important information for further manipulation of ricinoleic acid content in oilseeds and peptide data for future quantification strategies.
KEYWORDS: ricinoleic acid, Ricinus communis, triacylglycerol, endoplasmic reticulum, MudPIT
INTRODUCTION
Plants synthesize a wide variety of fatty acids in developing seeds which are incorporated into triacylglycerols (TAGs) and stored as oils or low melting point fats to provide energy during germination. The proportion of each fatty acid type in seed TAGs is dependent on the plant species.1 Vegetable seed oils are important commodities in the industrial and food sectors and the world consumption in 2009 was over 120 million metric tons.2 Concerns over the long-term supply of mineral oil and carbon emissions are increasing interest in plant-derived oils for both energy use and as a source of raw materials for polymers and lubricants. Considerable advances have been made over the last two decades in understanding fatty acid modification and subsequent TAG biosynthesis in plants.3–5 These reactions are catalyzed by membrane-bound systems found in the endoplasmic reticulum (ER) and incorporation of fatty acids into TAG by acyltransferases is also thought to occur here before formation of storage oil bodies.6 While diverse fatty acids can accumulate in seed TAGs, the fatty acid composition of lipids in other plant tissues, and of membrane phospholipids within seeds, is much more restricted. This is thought to be due to the structural roles which fatty acids play in metabolic processes as components of membrane lipids.7 Modified ‘unusual’ fatty acids which are found in seed TAGs are thought to disorganize membrane structure and enzyme function if they accumulate above a threshold level, a restriction that is not imposed on storage lipids.
© 2011 American Chemical Society
Received: March 3, 2011
Published: June 10, 2011
Journal of Proteome Research
dx.doi.org/10.1021/pr2002066 | J. Proteome Res. 2011, 10, 3565–3577
Figure 1. Enrichment of ER from developing castor endosperm: (A) stages of castor bean used for ER preparation, with endosperm [E] labeled; (B) total fatty acid amount during bean development; (C) electron micrograph of enriched ER sample; (D) quantification of the fold-purification of Δ12-OHase during ER preparation. Western blot analysis with indicated amounts of endosperm homogenate, supernatant after 2000g centrifugation and purified ER, with antibody binding quantified as described in Materials and Methods. (E) Identity of major proteins in castor endosperm homogenate (lane 1) and purified ER (lane 2). The molecular mass of standard proteins is shown in kilodaltons (kDa). Protein samples (12 μg) were resolved on a 12% Novex SDS-PAGE gel (Invitrogen) and the identity of prominent Coomassie stained proteins was determined by MALDI-TOF analysis of tryptic peptides.47
Ricinoleic acid (12-hydroxyoctadec-cis-9-enoic acid; 18:1-OH) is a hydroxylated fatty acid found in some seed oils which is a raw material for the production of Nylon 11. It is commercially obtained from Ricinus communis (castor) seed oil which contains over 90% 18:1-OH in its constituent fatty acids, allowing polymer production from it without additional fractionation, and over 50 000 tons of castor seed are used annually for this purpose. Castor has a number of disadvantages as a crop species, however, as it is nondeterminate (so the crop cannot be harvested at one single time), grows in restricted climatic regions, contains an allergenic 2S albumin as a major storage protein, and the seeds synthesize the potent toxin ricin.8,9 These problems could be overcome if selected steps in the biosynthesis of tri-18:1-OH (TAG with 18:1-OH at all 3 positions) could be introduced into alternative crop species. Synthesis of 18:1-OH in plants is catalyzed by oleate-12-hydroxylase (Δ12-OHase), an enzyme similar to related oleate-12-desaturases but found exclusively in certain seeds. The substrate for Δ12-OHase is phosphatidylcholine (PC) and hydroxylation occurs on oleic acid linked to the sn-2 position of the glycerol backbone. Identification of the castor Δ12-OHase gene was achieved using an EST-sequencing strategy10 and expression of it and other related hydroxylase genes in plant species has been extensively analyzed. To date though, the highest proportion of 18:1-OH reported in transgenic seed oils is 30%,11 below the optimum level required in a commercial alternative to castor oil. Understanding the complexity of castor enzymes involved in tri-18:1-OH synthesis could provide tools to engineer higher 18:1-OH content. The pathways for TAG formation in oilseeds are well delineated but many of the enzymes in castor are encoded by gene families, making identification of key genes more difficult. Selection of the correct gene(s) for transfer to alternative species could be critical since these enzymes might have acyl-group selectivity and preferentially incorporate specific fatty acids into TAG. The castor genome has recently been sequenced12 and the 31 237 reported gene models provide a database for gene expression and protein identification studies. Using a proteomic approach, it should therefore be possible to identify TAG-biosynthetic proteins in
developing castor seeds, limiting the number of potential candidates necessary for engineering of tri-18:1-OH synthesis. Proteomic studies have been reported on developing oilseeds from a number of species including castor, soybean, Arabidopsis thaliana, and Brassica napus using both gel- and non-gel-based technologies.13–18 Despite storage lipid being a major component of these seeds, the Δ12-OHase from castor was the only candidate enzyme identified for oleic acid hydroxylation and incorporation into tri-18:1-OH.17 This was possibly due to abundance issues or difficulty with solubilizing membrane-bound enzymes for gel-based analysis. These results indicate that enrichment of the specific compartment where TAG is synthesized is required for proteomic identification of enzymes in this metabolic pathway. Accordingly, we have performed a multidimensional protein identification technology (MudPIT) type of proteomic analysis19 on highly enriched ER preparations from developing castor bean seeds. This led to the identification of 701 proteins, most of which contain predicted membrane spanning domains. Several functional classes of proteins were identified, including those involved in membrane-bound lipid biosynthesis and transfer, lipid vesicle trafficking, protein folding and processing, as well as glycosylation reactions. We report the first proteomic identification of the major proteins involved in TAG biosynthesis in plants. These studies have been complemented with comparative gene expression analysis also aimed at identification of candidate genes involved in the biosynthesis of tri-18:1-OH from oleic acid in castor seeds. There is a correlation between the candidate genes identified by the gene expression and proteomic approaches. These data should aid in the rational design of transgenic studies aimed at developing a temperate crop with a high level of 18:1-OH in TAG.
MATERIALS AND METHODS
Growth of Castor Plants
A commercial R. communis variety (99N89I) was grown in John Innes No.3 compost with a 16/8 h light/dark cycle at 23 and 18 °C, respectively. Plants were watered daily and inflorescences tagged on emergence of the first female flowers. The whole flowering stem was harvested 25 days later, with fruit cut transversely and endosperm tissue removed for ER purification. Expanding true leaves, appearing after the first cotyledons and leaf-pair, were harvested when they were 10–15 cm in size and stored in liquid nitrogen for RNA extraction.
ER Enrichment and Analysis
ER was enriched using two discontinuous sucrose gradients as described.17 Endosperm from beans at designated stages IV to VI (Figure 1A) was used directly for the preparation and not stored frozen. These stages were equivalent to those previously described20 and their main distinguishing features were the first appearance of a brown testa and the extent of the bean area taken up by the endosperm. Purified ER from the second gradient was collected by centrifugation and resuspended in 10% glycerol before storage at −80 °C. For electron microscopy, enriched ER from the second gradient was collected by centrifugation and resuspended in Karnovsky fixative (2% paraformaldehyde, 2.5% glutaraldehyde in 0.1 M phosphate buffer pH 7.4). After incubation at room temperature for 2 h, the sample was washed in 0.1 M phosphate buffer, postfixed in 1% osmium tetroxide, dehydrated through a series of alcohols, and embedded in LR White resin, polymerized at 60 °C. Ultrathin sections (70 nm) were cut on an ultramicrotome, picked up on copper grids, stained using 1% aqueous uranyl acetate and Reynolds lead citrate (5 min each), and photographed using a Hitachi H7600 transmission electron microscope. Western blot analysis for quantification of ER enrichment was performed with samples of endosperm homogenate, supernatant after the first 2000g clearing-centrifugation (before sucrose-gradient loading), and the final ER preparation. These were resolved on a 12% Novex gel (Invitrogen) and transferred onto Hybond ECL (GE Healthcare). The primary antibody for Western analysis was raised against a Δ12-OHase C-terminal peptide (EGAPTQGVFWYRNKY) and the secondary was Cy3-labeled (Sigma C2306). Hybridization extent was quantified using a Typhoon scanner and ImageQuant software (GE Healthcare) and
calculated band volumes per microgram of protein for each sample type were used to determine fold-purification between the different stages. Loading of different amounts of each sample on the protein gel confirmed that detected binding was proportional to the amount of protein loaded.
Synthesis of cDNA from Castor Endosperm and Leaf
Total RNA was extracted from leaves 10 days post emergence and stage III and IV developing castor seed endosperm. Samples (100 mg) of both leaf and endosperm were harvested from 3 different plants and the RNA was extracted separately, to give three biological replicates for each tissue. The tissue was snap frozen in liquid N2 and ground with a mortar and pestle, and the RNA was extracted using the RNeasy kit (Qiagen) in RLC buffer. Residual DNA was removed by DNase treatment (RQ1, Promega). The RNA samples were then purified through an RNeasy column (Qiagen). For each cDNA synthesis reaction, 15 μg of total RNA was used as a template (3 × 5 μg RNA cDNA synthesis reactions) using oligo-dT as the primer. The cDNA was synthesized using a SuperScript Double-Stranded cDNA Synthesis Kit (Invitrogen) and diluted 1:4 prior to use as a PCR template.
Primer Design and Quantitative Real Time RT-PCR Analysis of Seed versus Leaf Transcripts
Primers were designed with the aid of OLIGO v 4.05 (1992 Wojciech Rychlik) near the 3′ end of the genes and, where possible, to include an intron. The Tm of the primers ranged from 59.3 to 66.4 °C and the products ranged from 154 to 399 bp. Each primer was checked for specificity by BLAST analysis against the castor genomic database (http://castorbean.jcvi.org/index.php). In addition, each primer pair was checked by PCR followed by gel electrophoresis on a 4% agarose gel. Melt analysis following real time PCR was also used to verify single products. The PCR reactions were carried out in a RotorGene 3000 (Qiagen) using the SYBR Green JumpStart Taq ReadyMix (Sigma). In each case, the template was 0.5 μL of the cDNA synthesis reaction and 0.25 mM of each primer in a total volume of 20 μL. The PCR reaction conditions were an initial denaturation step of 95 °C for 3 min, followed by 40 cycles of 95 °C (20 s), 55 °C (20 s), and 72 °C (20 s). Reactions were set up in triplicate (technical replicates) for both seed and leaf samples with each set of gene-specific primers. The RotorGene software calculated the takeoff value and the amplification efficiency for each technical replicate and averaged them. Only reactions with monitored amplification efficiencies between 1.8 and 2 were used by the analysis software for sample comparison. Reactions were analyzed pairwise (seed and leaf) for each gene and the Comparative Quantification Programme (Rotor-Gene 6.0.19) was used to calculate the ratio of seed/leaf transcript (cDNA) abundance. This was repeated with each of the biological replicates. The seed:leaf transcript ratios from each biological replicate were averaged to give the final result.
MudPIT Analysis
Following purification, ER samples were resuspended in 200 μL of lysis buffer (9 M urea, 2 M thiourea, 4% CHAPS) and incubated at room temperature on an orbital shaker for 2 h to solubilize the membrane proteins. After centrifugation at 13 000g for 10 min, the supernatant was transferred to a new tube, 4 vol of ice-cold acetone was added, and proteins were precipitated overnight at −20 °C. The resulting precipitate was collected by centrifugation and washed with fresh ice-cold acetone, and the protein was repelleted. Approximately 100 μg of protein was first fully solubilized in 40 μL of 2% SDS using a sonic water bath for 10 min and then the sample was diluted to 0.1% SDS in 100 mM triethylammonium bicarbonate. Proteins were reduced with tris(2-carboxyethyl)phosphine, alkylated using methyl methanethiosulfonate (MMTS), and digested overnight at 37 °C with trypsin at a 1:10 trypsin to protein ratio. Following digestion, the sample was vacuum-dried and resuspended in 3.0 mL of strong-cation exchange (SCEX) buffer (10 mM K2HPO4/25% acetonitrile (ACN), pH 2.8) using a sonic water bath for 10 min to aid solubilization. The peptide mixture was applied to a pre-equilibrated Poly-LC (200 × 2.1 mm) SCEX column on an Ettan micro LC system (GE Healthcare) using a flow rate of 200 μL/min. Peptides were eluted from the column with a biphasic KCl gradient in SCEX buffer: 0–150 mM KCl in 11.25 column volumes, and 150–500 mM KCl in 3.25 column volumes. Thirty 200 μL fractions were collected across the gradient and each was vacuum-dried and resuspended in 90 μL of 2% ACN and 0.1% formic acid. Thirty microliters of each fraction was analyzed by LC–MS/MS using a nanoflow Ettan MDLC system (GE Healthcare) coupled to a hybrid quadrupole-TOF mass spectrometer (QStar Pulsar i, Applied Biosystems) fitted with a nanospray source (Protana) and a PicoTip silica emitter (New Objective). Samples were loaded via a Zorbax 300SB-C18 5 × 0.3 mm trap column (Agilent) and online chromatographic separation used a Zorbax 300SB-C18 capillary column (150 mm × 75 μm) with a linear gradient of 5–40% ACN, 0.1% formic acid over 2 h at a flow rate of 200 nL/min. Remaining peptides were then eluted in 84% ACN with continuing online MS analysis. MS and MS/MS data were acquired using Applied Biosystems Analyst software version 1.1, continuously switching between a 1 s survey scan and 3 × 3 s product ion scans throughout peptide elution.
Only ions with 2+ to 4+ charge state and with TIC > 10 counts were selected for fragmentation. All MS/MS data files were processed using Protein Pilot software version 2.0.1 (Applied Biosystems). Protein identification was made using the Paragon search algorithm within the software and a downloaded copy of the castor TIGR database (2007; http://castorbean.jcvi.org/index.php) containing 31 221 protein sequences as the search database. MS and MS/MS tolerances were set to 0.15 and 0.1 Da, respectively, and cleavage sites were defined as lysine and arginine with a single missed cleavage. MMTS cysteine and oxidized methionine modifications were also allowed for in the database search. A protein detection threshold score of 1.3 (95% confidence) was set as the minimum for protein identification to be included in the output. Identification data for all of the fractions within an experimental replicate were further processed using the ProGroup algorithm within Protein Pilot to give an output of the minimum set of proteins that can be reported with scores above the 1.3 (95% confidence) threshold. A combined data file containing the results from all three replicates was constructed in Excel and is supplied as Supplementary Table S2. Proteins listed were identified from a minimum of two different peptides in one replicate or from single peptides in at least two of the replicates. Matched peptides for each of the individual proteins are shown in Supplementary Table S4. The false positive rate (percentage number of whole proteins identified) for the data was determined using all of the above parameters in a Paragon search against the TIGR castor database where all of the individual protein sequences were randomized. MS data files have been deposited at PRIDE, accession numbers 14917–14919.
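The inclusion rule described above (two or more different peptides in one replicate, or single peptides in at least two replicates) can be illustrated with a minimal Python sketch. The function names and the input layout of (protein, replicate, peptide) tuples are hypothetical and are not taken from Protein Pilot. The score conversion assumes the relationship score = −log10(1 − confidence), which reproduces the stated pairing of 1.3 with approximately 95% confidence.

```python
from collections import defaultdict


def protscore_to_confidence(score):
    """Fractional confidence for a score, ASSUMING score = -log10(1 - confidence)."""
    return 1.0 - 10.0 ** (-score)


def accepted_proteins(peptide_hits):
    """Apply the two-peptide / two-replicate inclusion rule.

    peptide_hits: iterable of (protein, replicate, peptide) tuples
    (a hypothetical layout chosen for this illustration).
    """
    peptides_by_replicate = defaultdict(lambda: defaultdict(set))
    for protein, replicate, peptide in peptide_hits:
        peptides_by_replicate[protein][replicate].add(peptide)

    accepted = set()
    for protein, replicates in peptides_by_replicate.items():
        # Rule 1: at least two different peptides within a single replicate.
        two_in_one = any(len(peps) >= 2 for peps in replicates.values())
        # Rule 2: at least one peptide in two or more replicates.
        in_two_replicates = len(replicates) >= 2
        if two_in_one or in_two_replicates:
            accepted.add(protein)
    return accepted


# Invented example hits: protA passes rule 1, protC passes rule 2, protB fails both.
hits = [
    ("protA", 1, "PEPTIDEK"), ("protA", 1, "SECONDPEPR"),
    ("protB", 1, "LONELYK"),
    ("protC", 1, "SHAREDK"), ("protC", 3, "SHAREDK"),
]
print(sorted(accepted_proteins(hits)))
print(round(protscore_to_confidence(1.3), 3))
```

With these invented hits, protA and protC are retained and protB is rejected.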
Analysis of Identified Castor Proteins
Sequences of all the identified castor proteins were used in BLASTP homology searches against the TAIR9.0 Arabidopsis amino acid database (http://www.arabidopsis.org/Blast/) to determine their highest scoring Arabidopsis orthologues. They were also aligned with a Saccharomyces cerevisiae ER-associated protein database made from data available at http://www.yeastgenome.org/. Gene ontology (GO) terms were searched with the word ‘endoplasmic’ and a nonredundant protein database was made from entries in the 38 ‘Protein complexes and locations (GO cellular component)’ output sets. This contained 450 sequences and proteins in the castor ER data set were aligned with it using Standalone BLAST.
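Selecting the highest scoring orthologue for each query can be sketched as follows. This is an illustration only: it assumes BLAST results exported in the 12-column tabular layout used by current BLAST+ (query, subject, ..., e-value, bit score), which may differ from the output format used in this work, and the scores in the example rows are invented.

```python
def best_hits(tabular_lines):
    """Return {query: (subject, bitscore)} keeping the highest bit score per query.

    ASSUMES 12 tab-separated columns with the query in column 1,
    the subject in column 2 and the bit score in column 12.
    """
    best = {}
    for line in tabular_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.rstrip("\n").split("\t")
        query, subject, bitscore = fields[0], fields[1], float(fields[11])
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return best


# Invented example rows (identifiers from Table 1, scores made up).
rows = [
    "29908.m006186\tAt1g77590\t78.0\t690\t150\t2\t1\t690\t1\t691\t0.0\t1100",
    "29908.m006186\tAt2g04350\t60.1\t680\t260\t5\t3\t682\t10\t700\t0.0\t820",
    "29732.m000322\tAt2g04350\t75.2\t700\t170\t3\t1\t700\t1\t702\t0.0\t1050",
]
for query, (subject, score) in best_hits(rows).items():
    print(query, subject, score)
```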
RESULTS
Preparation of Enriched ER Membrane Fraction
Castor bean has classically been used as an experimental material for the preparation of intact organelles and membrane fractions from plants. A procedure for ER isolation from germinating castor bean21 involves two sucrose gradient ultracentrifugation steps and avoids formation of membrane pellets, preventing aggregation of membranes from different intracellular origins. We applied this procedure to endosperm of castor bean at developmental stages IV to VI, which are optimal for ER enrichment and are synthesizing storage lipid (Figure 1A, B). The average yield from 32 ER purifications was 3.2 mg of total protein per preparation. Examination of the enriched ER using electron microscopy demonstrated that it consisted of closed vesicles with an approximate diameter ranging from 100 to 200 nm and no contamination was evident from intact peroxisomes, mitochondria, or plastids (Figure 1C). To estimate the enrichment of ER during preparation, we used quantitative Western analysis with antibodies against Δ12-OHase, which is located in the ER of developing seeds (Figure 1D). When expressed on a per microgram protein basis, the preparations were enriched approximately 200-fold for Δ12-OHase with respect to the crude homogenate, with duplicate experiments giving values of 235- and 179-fold. Identification of major proteins in the initial homogenate for ER purification (made by chopping with a razor blade) and final ER sample revealed substantial changes in the protein profile during ER preparation (Figure 1E). Dominant proteins in the homogenate were seed-storage proteins, whereas the ER preparation contained substantial amounts of known ER-localized proteins such as protein disulfide isomerase and calnexin. We considered enrichment of ER to be a necessary step before proteomic investigation into membrane-bound components of TAG biosynthesis, which, owing to their low abundance, could otherwise have been missed in direct analysis of whole cellular homogenates.
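The fold-enrichment calculation can be written out as a short Python sketch: the Western band volume is normalized to the micrograms of protein loaded, and the purified-ER value is divided by the homogenate value. The function names and the band volumes in the usage line are hypothetical; only the duplicate values of 235- and 179-fold come from the text.

```python
def specific_signal(band_volume, protein_ug):
    """Band volume per microgram of protein loaded."""
    return band_volume / protein_ug


def fold_purification(homogenate, purified):
    """Ratio of specific signals; each argument is (band_volume, protein_ug)."""
    return specific_signal(*purified) / specific_signal(*homogenate)


# Hypothetical band volumes: 10 ug of homogenate versus 0.25 ug of purified ER.
print(fold_purification((1000.0, 10.0), (5000.0, 0.25)))

# Mean of the duplicate fold-enrichment values reported in the text.
duplicates = [235, 179]
print(sum(duplicates) / len(duplicates))
```

The mean of the two reported values is 207, in line with the approximately 200-fold enrichment quoted above.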
The Complexity of Enzymes Involved in TAG Biosynthesis in Developing Seeds Is Revealed by Genomic Analysis and Quantitative PCR Analysis of mRNA Abundance
Figure 2. TAG-biosynthetic pathways in developing castor seeds, showing enzyme isoforms with higher expression in seed than leaf and those identified by MudPIT analysis. 18:1-OH is synthesized from oleic acid (18:1) linked to phosphatidylcholine (PC) in the ER. Potential routes for its incorporation into TAG via acyl-Coenzyme A (CoA) dependent (solid arrows) or independent (dashed arrows) pathways are shown. Enzyme isoforms that showed higher RNA expression in castor seed than leaf are listed in green boxes and protein isoforms identified in MudPIT analysis of the ER are in red boxes. Bracketed enzymes are shown for completeness and were not analyzed by quantitative RT-PCR or identified by MudPIT. Enzyme abbreviations are: Δ12-OHase, oleate-12-hydroxylase; PLA2, phospholipase A2; LACS, long chain acyl-CoA synthetase; LPCAT, 1-acylglycerol-3-phosphocholine acyltransferase; GPAT, glycerol-3-phosphate acyltransferase; AGPAT, 1-acyl-glycerol-3-phosphate acyltransferase; LPAT, lysophosphatidic acid acyltransferase; PAP, phosphatidic acid phosphatase; DGAT, diacylglycerol acyltransferase; PDAT, phosphatidylcholine diacylglycerol acyltransferase; PLC, phospholipase C; CPT, CDP-choline:diacylglycerol cholinephosphotransferase; TA, transacylase.
18:1-OH is made by hydroxylation of oleic acid linked to the sn-2 position of PC. Known metabolic pathways for incorporation of newly synthesized 18:1-OH into TAG are shown in Figure 2. Both acyl-CoA dependent (solid arrows) and independent pathways (dashed arrows) have been shown to exist in various oil seeds, but the relative importance of each in developing castor beans is not known. In the acyl-CoA dependent pathway, acyl-CoAs are successively used to acylate glycerol-3-phosphate (G-3-P), lysophosphatidic acid (LPA), and diacylglycerol (DAG) to form TAG. The phosphate group is removed from the intermediate phosphatidic acid (PA) by phosphatidic acid phosphatase (PAP). Acyl-CoA donors for this pathway are generated from PC by the activity of either long chain acyl-CoA synthetase (LACS) or 1-acylglycerol-3-phosphocholine acyltransferase (LPCAT). In the acyl-CoA independent pathway, the terminal step of TAG biosynthesis involves the direct transfer of 18:1-OH from PC18:1-OH to DAG. Transacylase enzymes, which also make TAG in an acyl-CoA independent manner in animals, have not been purified and identified from plants, although their presence in some species has been inferred from labeling experiments.22,23
The expression of castor genes encoding enzymes of TAG biosynthesis was examined by quantitative RT-PCR. Candidate genes were identified by BLASTP alignment of known Arabidopsis TAG-biosynthetic enzymes against predicted gene-model translations in the castor database. Several candidate isoenzymes for each of the steps in the TAG-synthetic pathway were identified (Table 1). For all but two enzymes, the highest scoring castor orthologue for each Arabidopsis gene-family member was chosen for RT-PCR analysis; for Arabidopsis PDAT 1A and PAP 2, two castor orthologues had similar homology scores and both were analyzed. The situation with regard to glycerol-3-phosphate acyltransferase (GPAT) was complicated by database nomenclature at the time of sequence analysis and by subsequent experimental verification of gene function in other laboratories. Acyltransferases involved in the formation of LPA and PA share a common set of motifs and some were assigned to the 1-acylglycerol-3-phosphate-acyltransferase (AGPAT) family. Four mammalian GPATs have been cloned and one of them, the ER-localized GPAT 4, has also been designated as AGPAT 6 or LPAT ζ. Despite inclusion in the AGPAT family, which would imply 1-acyl-glycerol-3-phosphate acyltransferase or lysophosphatidic
acid acyltransferase (LPAT) activity, it has now been experimentally proved that mouse GPAT 4 is a membrane bound GPAT.24 For this reason, we also looked for orthologues of AGPAT 6 in the castor genome database and included them in the expression analysis. Interestingly, interrogation of the Arabidopsis Expression Angler database25 with Arabidopsis diacylglycerol acyltransferase 2 (At3g51520) identified a co-regulated (r-value 0.615) acyltransferase, At5g60620.1, that resembled mouse AGPAT 6, further supporting its possible involvement in storage lipid biosynthesis. Quantitative RT-PCR analysis of relative mRNA abundance in leaf and developing seeds was performed for the candidate castor TAG-biosynthetic genes (Table 1) using sequence-specific primers (Supplementary Table S1). Genes which showed greater than 2-fold higher expression in seed compared to leaf and hence might be involved in castor seed-specific TAG biosynthesis are shown in Figure 2 (green boxes). Notable features of the relative expression results in Table 1 are as follows: [1] Δ12-OHase is much more highly expressed in castor seed than leaf. Transcripts were not detected in castor leaf in these experiments, consistent with reported data demonstrating seed-specific expression of this gene.10 [2] For several TAG-biosynthetic enzymes, there is a clear isoform that is more highly expressed in seed than leaf. LACS 9 is by far the most highly expressed acyl-CoA synthetase in seed and of the GPAT and diacylglycerol acyltransferase (DGAT) genes analyzed, the acyltransferases AGPAT 6.1 and DGAT 2 have the highest relative expression in seed. [3] In the acyl-CoA independent pathway, phosphatidylcholine diacylglycerol acyltransferase (PDAT) 2 is the dominant PDAT and after Δ12-OHase has the highest relative expression of any gene tested. [4] LPAT 3, 2, and 4 had similar expression ratios between seed and leaf. [5] PAP 1-1 showed a modestly higher expression in seed.
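The greater-than-2-fold criterion can be illustrated with a minimal Python sketch applied to the LACS family, using the seed/leaf ratios listed in Table 1. The function name is hypothetical, and the sketch deliberately ignores the standard errors, which a fuller treatment might take into account.

```python
# Seed/leaf mRNA ratios for the LACS family, copied from Table 1.
LACS_RATIOS = {
    "LACS 9": 32.3, "LACS 8": 3.3, "LACS 3": 1.1, "LACS 6": 0.7,
    "LACS 1": 0.5, "LACS 7": 0.3, "LACS 2": 0.1,
}


def seed_enriched(ratios, threshold=2.0):
    """Return gene names whose seed/leaf ratio exceeds the threshold, highest first."""
    hits = [name for name, ratio in ratios.items() if ratio > threshold]
    return sorted(hits, key=lambda name: ratios[name], reverse=True)


print(seed_enriched(LACS_RATIOS))  # ['LACS 9', 'LACS 8']
```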
Table 1. Quantitative RT-PCR Analysis of Candidate Castor Genes Potentially Involved in 18:1-OH Incorporation into TAG from PCᵃ
enzyme       Arabidopsis accessionᵇ   castor accession   ratio mRNA seed/leaf ± SE
Δ12-OHase    Not presentᶜ             28035.m000362      ∞
Δ12-desat    At3g12120                29613.m000358      3.2 ± 0.96
Δ15-desat    At2g29980                29814.m000719      1.8 ± 1.04
LACS 9       At1g77590                29908.m006186      32.3 ± 10.68
LACS 8       At2g04350                29732.m000322      3.3 ± 0.7
LACS 3       At1g64400                30190.m010831      1.1 ± 0.25
LACS 6       At3g05970                29844.m003365      0.7 ± 0.18
LACS 1       At2g47240                30076.m004616      0.5 ± 0.26
LACS 7       At5g27600                30128.m008777      0.3 ± 0.11
LACS 2       At1g49430                29851.m002473      0.1 ± 0.04
GPAT 1       At1g06520.1              28350.m000105      3.0 ± 1.5
GPAT 5       At3g11430                29736.m002070      2.4 ± 0.47
GPAT 3       At4g01950                29908.m005967      0.6 ± 0.18
GPAT 2       At1g02390                30076.m004618      0.6 ± 0.21
GPAT 4       At1g01610                30174.m008615      0.4 ± 0.14
GPAT 6       At2g38110                29822.m003441      0.2 ± 0.09
AGPAT 6.1    At5g60620.1              30122.m000357      15.4 ± 4.02
AGPAT 6.3    Not present              30174.m008937      3.6 ± 0.95
AGPAT 6.2    At1g80950                30170.m014002      1.5 ± 0.25
LPAT 3       At1g51260                30169.m006433      4.7 ± 0.62
LPAT 2       At3g57650                27810.m000646      3.6 ± 0.84
LPAT 4       At1g75020                29851.m002448      3.0 ± 0.38
LPAT 1A      At4g30580                29687.m000571      1.5 ± 0.09
LPAT m2      Not present              29666.m001430      1.0 ± 0.09
LPAT 5       At3g18850                30170.m013990      0.3 ± 0.06
LPAT 6       Not present              30169.m006432      0.2 ± 0.11
PAP 1-1      At3g09560                30170.m013896      2.8 ± 1.33
PAP 1-2      At5g42870                30170.m013897      0.7 ± 0.15
PAP 2-A      At1g15080                29747.m001075      1.4 ± 0.15
PAP 2-B      At1g15080                29660.m000760      1.3 ± 0.20
DGAT 2       At3g51520                29682.m000581      15.5 ± 0.87
DGAT 1       At2g19450                29912.m005373      0.7 ± 0.40
PDAT 2       At3g44830                29991.m000626      90.7 ± 25.75
PDAT 1A      At5g13640                29912.m005286      13.0 ± 1.73
PDAT 3       At3g03310                29929.m004538      3.0 ± 0.48
PDAT 5       At4g19860                30060.m000520      1.4 ± 0.45
PDAT 4       At1g04010                29637.m000766      0.5 ± 0.09
PDAT 1B      At5g13640                29706.m001305      0.3 ± 0.21
PDAT 6       At1g27480                30170.m013594      0.1 ± 0.04
ᵃ Relative mRNA abundance in castor leaf and seed shown as the mean ratio from three independent experiments (for calculation of standard error SE, n = 3). Numbers in bold show the highest ratios observed within different gene families and no Δ12-OHase expression was detected in castor leaf. Enzyme abbreviations are as in Figure 2 and: Δ12-desat, oleate-12-desaturase; Δ15-desat, linoleate-15-desaturase. ᵇ Sequences used in BLAST alignments to identify listed castor genes. ᶜ No clear single orthologue of the castor sequence in Arabidopsis.
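How a seed/leaf ratio and its standard error might be derived can be sketched in Python. The efficiency-based expression below (efficiency raised to the difference in takeoff cycles) is an assumed general form of comparative quantification and is not quoted from the Rotor-Gene software. All function names and the per-replicate ratios are hypothetical.

```python
import math
import statistics


def relative_abundance(efficiency, takeoff_reference, takeoff_sample):
    """Sample/reference abundance, ASSUMING ratio = efficiency ** (reference - sample).

    An efficiency of 2.0 corresponds to perfect doubling per cycle.
    """
    return efficiency ** (takeoff_reference - takeoff_sample)


def mean_and_se(values):
    """Mean and standard error (sample standard deviation / sqrt(n))."""
    mean = statistics.mean(values)
    se = statistics.stdev(values) / math.sqrt(len(values))
    return mean, se


# A seed sample taking off 5 cycles before leaf at perfect efficiency: 2 ** 5.
print(relative_abundance(2.0, takeoff_reference=25.0, takeoff_sample=20.0))

# Hypothetical seed/leaf ratios from three biological replicates.
mean, se = mean_and_se([2.0, 3.0, 4.0])
print(f"{mean:.1f} ± {se:.2f}")
```

Averaging per-replicate ratios, as described in Materials and Methods, gives the mean reported in each row, with the standard error computed for n = 3.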
Gene family members were found with increased expression in seed for all the major TAG-biosynthetic enzymes. While a ranking of the potential importance of individual isoenzymes for tri-18:1-OH biosynthesis can be made from analysis of transcript ratios between seed and leaf, this does not take into account the absolute level of each transcript, which might be more highly correlated with protein abundance. In addition, there may not be a strict correlation between mRNA level and the amount of protein present.14,26 To further investigate which TAG-biosynthetic isoenzymes were present in castor seeds and complement
the expression data described above, we carried out in-depth proteomic analysis using a MudPIT type of approach.27
MudPIT Analysis Identifies Candidate Membrane-Bound Proteins Involved in TAG Biosynthesis
MudPIT analysis was performed to identify proteins present in enriched ER from developing castor bean endosperm, which is a major site of storage lipid synthesis. Three independent ER preparations were used in replicate experiments in which proteins were initially solubilized with urea, thiourea and CHAPS, acetone-precipitated, and resuspended in SDS before digestion with trypsin. An initial peptide fractionation by SCEX produced 30 fractions, which were individually analyzed by reverse-phase LC-MS-MS. MS data from each fraction were separately processed using Protein Pilot and the results were combined to make a final protein list for that replicate. In total, 701 proteins were identified from developing castor bean endoplasmic reticulum using this strategy, with 556 present in all three replicates. The false positive rates (percentage of the number of proteins identified at 95% confidence from searching a randomized-sequence vs true version of the castor database) for the three replicates were 5.5, 5.4, and 4.8%. Combined data from the three experiments are presented in Tables 2 and 3 and Supplementary Tables S2, S3, and S4. The full data set of castor proteins identified is listed in Supplementary Table S2, which is split into major functional groups, and proteins involved with lipid metabolism are listed in Tables 2 and 3. For each protein associated with TAG biosynthesis, the number of spectra acquired during data processing (which reflects the protein abundance in the ER) was determined. These results are included in Table 2 and peptides detected from these proteins are in Supplementary Table S3. Matched peptides for all of the proteins in each of the independent replicates are shown in Supplementary Table S4. At least one protein isoform was identified in the ER for the majority of known TAG-biosynthetic enzymes in oilseeds, with enzymes of both acyl-CoA-dependent and -independent pathways represented (Figure 2, red boxes, Table 2). These proteins have between 4 and 10 predicted transmembrane domains.

Table 2. TAG-Biosynthetic Proteins Identified by MudPIT in Developing Castor Endosperm ER

| identifier^a | castor accession | % peptide coverage | transmembrane domains^b | observed in replicate | RT-PCR rank | spectra detected^c ± SE |
|---|---|---|---|---|---|---|
| **Δ12-OHase**^d | 28035.m000362 | 34.0 | 6 | 1, 2 + 3 | 1st | 1579 ± 156.6 |
| **Δ12-desat** | 29613.m000358 | 15.4 | 5 | 1, 2 + 3 | 1st | 263 ± 21.1 |
| **LACS 9** | 29908.m006186 | 31.7 | 6 | 1, 2 + 3 | 1st | 700 ± 41.2 |
| LACS 8 | 29732.m000322 | 17.9 | 5 | 1, 2 + 3 | 2nd | 248 ± 13.1 |
| LACS 1 | 30076.m004616 | 3.9 | 4 | 1 + 3 | 5th | 127 ± 21.0 |
| **AGPAT 6.1** | 30122.m000357 | 9.2 | 6 | 1, 2 + 3 | 1st | 43 ± 7.2 |
| LPAT 2 | 27810.m000646 | 12.7 | 6 | 1, 2 + 3 | 2nd | 99 ± 9.9 |
| LPAT 9 | 29687.m000572 | 6.3 | 4 | 1, 2 + 3 | e | 17 ± 4.8 |
| MBOAT | 30190.m011126 | 21.0 | 8 | 1, 2 + 3 | | 294 ± 6.3 |
| PAP 2 | 29841.m002865 | 22.0 | 5 | 1, 2 + 3 | | 83 ± 16.3 |
| PAP 2 | 29805.m001476 | 11.7 | 4 | 1, 2 + 3 | | 17. ± 1.2 |
| **DGAT 2** | 29682.m000581 | 22.6 | 4 | 1, 2 + 3 | 1st | 154 ± 28.6 |
| DGAT 1 | 29912.m005373 | 3.5 | 10 | 1 + 3 | 2nd | 50 ± 31.5 |
| PDAT 1A | 29912.m005286 | 14.1 | 5 | 1, 2 + 3 | 2nd | 164 ± 15.2 |

^a In addition to abbreviations in Figure 2 and Table 1, MBOAT is membrane-bound O-acyltransferase.
^b The number of transmembrane domains was predicted using TMpred at the ExPASy Proteomics Server (http://www.expasy.ch/tools/).
^c Average total number of spectra detected for each protein per experimental replicate.
^d Entries in bold were ranked first for the ratio of seed/leaf mRNA abundance within each gene family analyzed by quantitative RT-PCR.
^e Indicates the protein isoform had not been selected for RT-PCR analysis.
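The replicate bookkeeping described above (proteins shared by all replicates, a decoy-based false positive rate, and a mean spectral count with its standard error) can be sketched in a few lines of Python. This is only an illustrative sketch, not the Protein Pilot workflow used in the study: the function names are hypothetical, and the standard error is assumed to be the sample standard deviation divided by the square root of the number of replicates, which the text does not state explicitly.

```python
from math import sqrt
from statistics import mean, stdev


def false_positive_rate(decoy_hits, target_hits):
    """Percentage of proteins identified from the randomized (decoy)
    database relative to the number identified from the true database."""
    if target_hits <= 0:
        raise ValueError("target_hits must be positive")
    return 100.0 * decoy_hits / target_hits


def mean_and_se(counts):
    """Mean and standard error of per-replicate spectral counts.

    Assumption: SE = sample standard deviation / sqrt(n).
    """
    n = len(counts)
    if n < 2:
        raise ValueError("at least two replicates are needed for an SE")
    return mean(counts), stdev(counts) / sqrt(n)


def in_all_replicates(replicates):
    """Proteins (e.g., accession strings) identified in every replicate."""
    return set.intersection(*(set(r) for r in replicates))


if __name__ == "__main__":
    # Hypothetical numbers, chosen only to show the calculation.
    print(false_positive_rate(decoy_hits=11, target_hits=200))
    print(mean_and_se([90, 100, 110]))
    print(in_all_replicates([["A", "B"], ["A", "C"], ["A", "B", "C"]]))
```

Per-replicate spectral counts are not reported in the main text, so the inputs in the demo are placeholders rather than values from the study.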
Important points to note in relation to TAG biosynthesis and the pathways shown in Figure 2 are the following:
[1] From analysis of the total number of spectra detected for each protein (Table 2), Δ12-OHase was clearly the most abundant protein from these pathways in the ER. Despite high sequence homology between Δ12-OHase and Δ12-desaturase, they can be distinguished via mass spectrometry, and the level of the Δ12-desaturase is considerably lower than that of the Δ12-OHase.
[2] A single PDAT protein was identified, PDAT 1A. The absence of any PDAT 2, which had a higher ratio of seed to leaf expression than PDAT 1A in the RT-PCR analysis, was possibly due to protein abundance or localization issues.
[3] Three isoenzymes were detected for long chain acyl-CoA synthetase, LACS 1, 8, and 9, with LACS 9 being the most abundant based on number of spectra.
[4] Only one candidate was seen for GPAT, AGPAT 6.1, which ranked first in the list of expression ratios for these enzymes.
[5] Two LPAT enzymes, LPAT 2 and 9, were present, and LPAT 2 had the second-highest ratio of seed versus leaf expression of the LPAT genes tested. LPAT 9 was not selected for quantitative PCR analysis because it did not rank in the top BLAST P matches of any Arabidopsis LPAT orthologue.
[6] Both DGAT 1 and 2 were identified, with the abundance of DGAT 2 apparently higher than that of DGAT 1, reflecting the expression data for these two enzymes.
[7] Two PAP type 2 proteins were detected in the ER which were different from those selected for transcript expression analysis. In the castor database, these proteins were listed as ‘conserved hypothetical protein’ and ‘dolichyldiphosphatase’, but they were identified as PAP candidates after BLAST P alignment of identified castor ER proteins with the Arabidopsis TAIR database. It is notable that no PAP 1 candidate was in the MudPIT data set. This may be due to either lower abundance or the fact that PAP 1 is loosely associated with the ER. Genes encoding type 1 PAP have recently been cloned from Arabidopsis,28 but it is unclear whether they or the lower molecular weight type 2 proteins play a significant role in TAG synthesis in plant seeds.

Completion of the acyl-CoA-dependent pathway in Figure 2 requires either phospholipase A2 (PLA2) or LPCAT. Both are potentially in the data set, although not designated as such in the castor database. For example, three proteins that align with Arabidopsis lipases (Table 3) may be PLA2 enzymes, as the lipase assignment does not specify the enzyme substrate. PLA2 enzymes are lipases that act at the sn-2 position of phospholipids and therefore inherently have lipase activity. Some of the additional proteins identified as hydrolases might also have lipase activity. A castor orthologue (MBOAT in Table 2) of Arabidopsis LPCAT (AT1g12640) was also identified in the enriched ER sample.
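The abundance comparisons made in points [1], [3], and [6] can be reproduced from the mean spectral counts in Table 2. The sketch below is illustrative only: the dictionary and function names are hypothetical, the SE values are ignored, and the counts are not normalized for protein length or number of detectable peptides, so the fold differences should be read as rough indications rather than absolute abundance ratios.

```python
# Mean spectra detected per replicate, transcribed from Table 2 (SE omitted).
MEAN_SPECTRA = {
    "Δ12-OHase": 1579,
    "Δ12-desat": 263,
    "LACS 9": 700,
    "LACS 8": 248,
    "LACS 1": 127,
    "DGAT 2": 154,
    "DGAT 1": 50,
}


def fold_difference(protein_a, protein_b):
    """Ratio of mean spectral counts, protein_a relative to protein_b."""
    return MEAN_SPECTRA[protein_a] / MEAN_SPECTRA[protein_b]


def rank_family(prefix):
    """Members of one enzyme family, ordered from most to fewest spectra."""
    members = [name for name in MEAN_SPECTRA if name.startswith(prefix)]
    return sorted(members, key=MEAN_SPECTRA.get, reverse=True)


if __name__ == "__main__":
    print(round(fold_difference("Δ12-OHase", "Δ12-desat"), 1))
    print(rank_family("LACS"))
    print(rank_family("DGAT"))
```

On these counts the hydroxylase yields roughly six times as many spectra as the desaturase, and DGAT 2 roughly three times as many as DGAT 1, consistent with the qualitative statements in the list above.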
Table 3. Additional Lipid Biosynthetic Proteins Identified by MudPIT in Castor Endosperm ER^a
Columns: identifier | castor accession | % peptide coverage^b | score^c | transmembrane domains | comments
Desaturases ω-3 fatty acid desaturase
29681.m001360
3.4
2.24
6
29814.m000719
3.0
2.66
5
Electron transport NADH-cytochrome b5 reductase Cytochrome b5
29755.m000441
4.7
2.00
3
29818.m000392
26.4
8.01
2
29904.m002991
47.1
8.02
1
30204.m001761
49.3
8.11
1
30213.m000673
41.0
10.20
1
3-ketoacyl-CoA synthetase
30170.m014048
16.3
10.06
6
29807.m000494
4.4
3.76
6
*
Microsomal beta-keto-reductase
29929.m004548 29929.m004767
20.5 11.9
8.70 5.88
5 4
#e #
Enoyl-CoA reductase
30003.m000326
13.5
5.71
4
#
29791.m000529
19.8
14.84
2
Diacylglycerol kinase
30138.m004002
3.3
3.15
5
Diacylglycerol kinase family
30174.m008655
5.7
2.03
3
29640.m000412
11.1
4.05
6
29842.m003508
4.0
4.29
7
Fatty acid modification: acyl-CoA elongase *d
Fatty acid oxidation Cytochrome P450 Phosphatidic acid metabolism
Phospholipid synthesis Phosphatidate cytidyltransferase Ethanolaminephospho-transferase
30138.m003845
2.5
2.30
9
Phosphatidylinositol synthase
29819.m000078
14.1
6.00
4
Phosphatidylserine synthase
29660.m000781
3.1
2.00
9
*
PE/PC synthesis
Sphingolipid metabolism Serine palmitoyl transferase
29983.m003305
9.3
8.02
6
Sphingosine phosphate lyase
30131.m006861 29929.m004623
10.2 3.5
7.26 2.06
4 6
Sphingolipid Δ4 desaturase
29851.m002395
6.3
2.02
6
29848.m004509
5.0
3.33
3 6
Membrane lipid synthesis Monogalactosyl diacylglycerol synthase
*
Steroid metabolism HMG-CoA reductase
30170.m013805
3.1
2.01
Farnesyl-diphosphate farnesyltransferase
30131.m007117
15.3
13.53
3
Cycloartenol synthase Cycloeucalenol cycloisomerase
29667.m000347 30147.m014529
17.4 10.2
17.8 6.01
5 7
C-14 sterol reductase
29092.m000450
8.9
4.00
6
7-dehydrocholesterol reductase
30169.m006262
6.1
5.32
10
C-4 methylsterol oxidase
30190.m011039
17.7
7.83
4
3 beta-hydroxysteroid dehydrogenase
29739.m003645
12.5
8.15
4
Hydroxysteroid dehydrogenase
30170.m014238
15.5
10.04
6
Sterol isomerase
30128.m008583
6.4
2.78
5
Sterol methyltransferase
29864.m001470 30170.m014288
17.6 40.4
9.63 23.72
1 2
*
Lipases GDSL-motif lipase/hydrolase family
29983.m003318
GLIP5 carboxylesterase/lipase
29739.m003689
GLIP1 carboxylesterase/lipase
28885.m000108
Monoglyceride lipase
27660.m000084
Triacylglycerol lipase
30183.m001305
7.6
2.14
4
*
6.06
7
*
5.5
2.02
3
*
3.7
2.02
2
20.2
16.01
3
13
Protein acylation
29912.m005375
2.5
2.84
1
Acyl-coenzyme A binding domain containing
29753.m000252
20.6
8.27
1
Acyl-CoA-binding protein, ACBP
30170.m013760
17.2
6.02
1
ATS3 lipid binding protein
28166.m001073
29.6
6.26
1
Lipid binding protein
29653.m000295
14.4
2.00
3
29827.m002570
6.7
7.16
3
29681.m001370 27953.m000064
2.1 16.2
2.89 8.99
2 3
N-myristoyl transferase
Lipid-binding proteins
Phosphoinositide binding protein Trigalactosyldiacylglycerol 2 Calcium lipid binding protein
27394.m000357
3.8
3.26
7
29602.m000216
3.8
3.04
3
30174.m008919
8.0
4.09
6
29840.m000618
8.1
2.05
1
30147.m013761
45.5
14.49
1
Carboxyl transferase subunit R Carboxyl transferase subunit β
27798.m000585 28890.m000006
11.9 5.5
11.99 2.06
4 4
Malonyl CoA - ACP transacylase
30113.m001448
4.7
2.13
4
Steroid binding protein
Arabidopsis ACBP2 *
* *
Fatty acid synthesis
^a Proteins in the MudPIT data set that function in lipid metabolic pathways in addition to those in Table 2 are listed, with functional categories in bold.
^b The proportion of the protein sequence covered by identified peptides in the MudPIT analysis.
^c The total protein score generated in the Protein Pilot output.
^d Indicates the protein identity was from the top match in a BLAST P search of the Arabidopsis TAIR database with the castor protein sequence as a query (i.e., not as listed in the TIGR castor database).
^e Indicates protein identity listed was inferred from alignment with a S. cerevisiae ER protein database, constructed as described in Materials and Methods.
It is interesting to note the correlation between MudPIT and quantitative PCR expression analysis. Proteins observed in all three replicates of the MudPIT analysis were ranked highly (first or second) within each gene family in the seed/leaf mRNA transcript level study (where analyzed), further supporting their probable role in seed-specific TAG synthesis. This is the first time major enzymes for TAG biosynthesis have been identified in a proteomic study of plants, demonstrating the power obtained from selective enrichment of a membrane fraction from a suitable tissue. The absolute level of each of the detected proteins in developing ER will be determined using a concatamer approach29 based on the selected reporter ions detected in this MudPIT analysis.

Other Components of Lipid Biosynthesis Identified by MudPIT Analysis
In addition to the enzymes shown in Figure 2 and Table 2, 58 other proteins associated with lipid metabolism were identified in enriched castor ER, all of which had predicted transmembrane domains (Table 3). The database designations of some of these proteins did not link them with lipid metabolism, but BLAST P alignment of all identified castor ER proteins with complete Arabidopsis and S. cerevisiae ER protein databases allowed alternative assignments of function. We found candidate proteins in endosperm ER from several major lipid metabolic pathways, including fatty acid modification, membrane phospholipid and sphingolipid synthesis, and steroid metabolism. The data set contained two proteins with fatty acid desaturase domains and components of the associated cytochrome b5 electron transport chain, which also functions in 18:1-OH synthesis.30 Fatty acid elongation to more than 20 carbons occurs in the ER, and candidates for three members of the acyl-CoA elongase complex were identifiable following the BLAST P alignments described above. Castor oil contains