Modification-Specific Proteomics of Plasma Membrane Proteins

Modification-Specific Proteomics of Plasma Membrane Proteins: Identification and Characterization of Glycosylphosphatidylinositol-Anchored Proteins Released upon Phospholipase D Treatment

Felix Elortza,†,| Shabaz Mohammed,†,⊥ Jakob Bunkenborg,† Leonard J. Foster,†,O Thomas S. Nühse,‡ Urs Brodbeck,§ Scott C. Peck,‡ and Ole N. Jensen*,†

Department of Biochemistry and Molecular Biology, University of Southern Denmark, Campusvej 55, DK-5230 Odense M, Denmark, Sainsbury Laboratory, John Innes Centre, Norwich NR4 7UH, United Kingdom, and Institute of Biochemistry and Molecular Biology, University of Berne, P.O. Box, CH-3000 Bern 9, Switzerland

Received November 24, 2005

Plasma membrane proteins are displayed through diverse mechanisms, including anchoring in the extracellular leaflet via glycosylphosphatidylinositol (GPI) molecules. GPI-anchored membrane proteins (GPI-APs) are a functionally and structurally diverse protein family, and their importance is well-recognized as they are candidate cell surface biomarker molecules with potential diagnostic and therapeutic applications in molecular medicine. GPI-APs have also attracted interest in plant biotechnology because of their role in root development and cell remodeling. Using a shave-and-conquer concept, we demonstrate that phospholipase D (PLD) treatment of human and plant plasma membrane fractions leads to the release of GPI-anchored proteins that were identified and characterized by capillary liquid chromatography and tandem mass spectrometry. In contrast to phospholipase C, the PLD enzyme is not affected by structural heterogeneity of the GPI moiety, making PLD a generally useful reagent for proteomic investigations of GPI-anchored proteins in a variety of cells, tissues, and organisms. A total of 11 human GPI-APs and 35 Arabidopsis thaliana GPI-APs were identified, representing a significant addition to the number of experimentally detected GPI-APs in both species. Computational GPI-AP sequence analysis tools were investigated for the characterization of the identified GPI-APs, and these demonstrated that there is some discrepancy in their efficiency in classification of GPI-APs and the exact assignment of ω-sites. This study highlights the efficiency of an integrative proteomics approach that combines experimental and computational methods to provide the selectivity, specificity, and sensitivity required for characterization of post-translationally modified membrane proteins.

Keywords: post-translational modification • GPI-anchor • membrane protein • subproteome • modification-specific proteomics • mass spectrometry • glycosylphosphatidylinositol-specific phospholipase D

* Corresponding author: Ole Nørregaard Jensen, Ph.D., Protein Research Group, Department of Biochemistry and Molecular Biology, University of Southern Denmark, DK-5230 Odense M, Denmark. Tel., +45 6550 2368; fax, +45 6550 2467; e-mail, [email protected]. URL: www.protein.sdu.dk.
† University of Southern Denmark.
‡ John Innes Centre.
§ University of Berne.
| Current address: Cooperative Research Centre on Biosciences (CIC bioGUNE), Technology park of Bizkaia, 801 A Building, 48160 Derio, Spain.
⊥ Current address: Department of Biomolecular Mass Spectrometry, Utrecht University, Sorbonnelaan 16, 3584 CA Utrecht, The Netherlands.
O Current address: Department of Biochemistry and Molecular Biology, University of British Columbia, 301-2185 East Mall, Vancouver, BC V6T 1Z4, Canada.
10.1021/pr050419u CCC: $33.50 © 2006 American Chemical Society
Journal of Proteome Research 2006, 5, 935-943
Published on Web 02/18/2006

Introduction

Protein function is largely determined by three-dimensional structure. However, protein activity, localization, and interactions are frequently modulated or controlled by covalent post-translational modifications, such as phosphorylation, glycosylation, and acylation. Systematic determination of the temporal and spatial dynamics of post-translational modifications is one of the biggest challenges in current proteomics research.1,2 Post-translationally modified plasma membrane proteins present a special challenge because they can be present in low abundance and they are integrated in the membrane bilayer via transmembrane domains or they are tethered to the membrane by lipid or lipid-like moieties.

Among the membrane proteins, glycosylphosphatidylinositol-anchored proteins (GPI-APs) are unique because they are soluble proteins that are processed post-translationally to carry a C-terminal glycosylphosphatidylinositol anchor that mediates their transport and attachment to the outer leaflet of the plasma membrane. Thus, GPI-APs are found mainly in the plasma membrane of eukaryotic cells, but they may also be present in intracellular compartments, during transport to or from the plasma membrane (e.g., endocytosis).3 GPI-APs constitute a structurally and functionally diverse class of membrane-attached proteins. GPI-APs are enriched in membrane microdomains (lipid rafts) in mammalian cells,4,5 and they also exist in plant cell membranes.6,7 GPI-APs constitute an average of approximately 0.5% of total cell proteins, and they mediate many important cellular functions including immune recognition, cell-cell and host-pathogen interactions, complement regulation, and cell signaling.8-12 Hence, GPI-APs are now recognized as an important class of membrane proteins, particularly in molecular medicine, as they play a role in a variety of disorders, for example, paroxysmal nocturnal hemoglobinuria,13 and they may serve as biomarker candidates for diseases, such as human hepatocellular carcinoma and pancreatic and biliary carcinomas.14,15 GPI-APs are also abundant on the surface of pathogenic microorganisms, including trypanosomes16 and the malaria parasite Plasmodium falciparum,17 and they may serve as therapeutic targets for prevention or treatment of these infectious diseases. In plant biotechnology, GPI-APs have attracted interest because of their key role in root development and cell wall remodeling.18

Figure 1. Structural features of glycosylphosphatidylinositol-anchored proteins. The domain composition of a GPI-anchored pre-protein and a GPI-anchored mature protein after it is processed and linked to the pre-synthesized lipid anchor. Protein sequence features common for GPI-anchored proteins are shown: a cleavable N-terminal hydrophobic secretion signal, an 8-30 amino acid long hydrophobic region in the COOH-terminus, a hydrophilic spacer region that precedes the hydrophobic region in the COOH-terminus, a cleavage site (ω-site) where the pre-synthesized GPI-anchor is attached. With few exceptions, GPI-anchored proteins contain no transmembrane domains in the core of the protein sequence. Man, mannose; GlcN, glucosamine; Ins, inositol; P, phosphate group; PLD and PI-PLC, cleavage sites of phospholipase D and C, respectively.

With few exceptions, GPI-APs share a number of common structural features19 including the complete absence of transmembrane domains and the presence of (i) a cleavable N-terminal hydrophobic secretion signal, (ii) an 8-20 amino acid long hydrophobic region in the COOH-terminus, and (iii) a hydrophilic spacer region that precedes the hydrophobic region in the COOH-terminus (Figure 1). In the ER, a transamidase enzyme recognizes and processes the C-terminal hydrophobic tail of the nascent protein at the so-called “ω-site”. This enzyme also transfers the nascent protein to a pre-synthesized GPI anchor. In effect, the mature GPI-anchored protein is tethered to the plasma membrane via the GPI moiety. In living cells and tissues, the protein can be released from the cell surface by
enzymatic cleavage of the GPI moiety by the action of specific phospholipases. The soluble protein is then shed to the extracellular space. This is a unique cellular mechanism for modulation of protein activities at or near the cell surface, and it presumably plays a fundamental role in cell signaling processes, plasma membrane morphology, and cell-cell recognition.20

Analysis of native GPI-APs and site-directed mutagenesis studies have shown that there are certain sequence constraints for the “ω-site” that is recognized by the transamidase, which attaches the GPI anchor to proteins.21,22 A number of bioinformatic methods for prediction of GPI-anchored proteins by amino acid sequence analysis have been reported, including DGPI,23 Big-Pi,24 GPI-SOM,25 and a plant-specific predictor.26 A computational sequence analysis of the genomes of Arabidopsis thaliana, Caenorhabditis elegans, Drosophila melanogaster, and Homo sapiens suggests that approximately 2-3% of these genomes encode GPI-APs.25 In a separate analysis of the A. thaliana genome, it was suggested that it has the potential to generate at least 248 GPI-APs.26

The advances in sensitive and specific computational methods for predictions of GPI-APs from genomic information have not until recently been complemented by experimental methods for proteomic analysis of GPI-APs. The GPI-APs are typically released by enzymatic treatment of plasma membrane fractions by phosphatidylinositol-phospholipase C (PI-PLC)27 followed by two-phase separation in a detergent/aqueous system.28 This simple method enables recovery of the released protein moiety in the aqueous phase. The soluble proteins are optionally separated by electrophoresis and identified by mass spectrometry. PI-PLC enzymatic treatment was successfully used in proteomics experiments to isolate GPI-APs from plants29-31 and humans.30,32 We have previously demonstrated that human and
plant GPI-APs can be efficiently identified by combining PI-PLC treatment with SDS-PAGE and LC-MS/MS, leading to the identification of six GPI-APs in a human lipid raft-enriched membrane preparation and 44 GPI-APs in a plant plasma membrane fraction.30 PI-PLC is a very efficient and specific enzyme that recognizes the GPI anchor core structure and catalyzes the cleavage of the phosphodiester bond of phosphatidylinositol to yield diacylglycerol and inositol phosphate at the end of the GPI anchor (Figure 1). However, PI-PLC does not cleave GPI anchors that are extensively ‘decorated’ at the core structure. This decoration may include branching of the core structure by the addition of carbohydrate units or acyl moieties.33 Modifications include the introduction of an ether linkage in the sn-1 position of the glycerol backbone or the addition of phosphoethanolamine at a mannose residue of the GPI anchor. Importantly, palmitoylation of the inositol ring renders the GPI anchor resistant to the action of PI-PLC.

Phospholipase D (PLD) exhibits absolute specificity toward the GPI anchor, but in contrast to PI-PLC, the PLD exhibits a broader substrate specificity as it is not hampered by the presence of modifications on the core GPI anchor.34 Thus, all known GPI-APs are potential substrates for GPI-PLD, but this enzyme has not yet been used in proteomic strategies for identification of GPI-APs.

Using the concept of ‘modification-specific proteomics’,1,35 we report for the first time the use of glycosylphosphatidylinositol-specific phospholipase D in combination with mass spectrometry to study the GPI-anchored proteome of plasma membrane preparations. We also tested a series of computational sequence analysis techniques to validate the assignments and characterize the identified proteins. Eleven GPI-APs in detergent-resistant membrane preparations from human cells and 35 GPI-APs in a plasma membrane preparation from A. thaliana were identified and characterized.

Experimental Section

Preparation of Membranes. HeLa cells were serum-starved and lysed in 100 mM Na2CO3, pH 11.0, and mechanically disrupted by 10 strokes in a Dounce homogenizer and three 20 s bursts of a probe sonicator. The lysates were cleared and combined with an equal volume of 90% sucrose in MES-buffered saline (MBS, 150 mM NaCl and 25 mM MES, pH 6.5) for a final sucrose concentration of 45%. This solution was then placed in the bottom of an ultracentrifuge tube as the base of a discontinuous sucrose gradient. Additional layers consisting of 35% and 5% sucrose in MBS were gently placed on top, and the whole gradient was centrifuged at 166 000g for 18 h at 4 °C. The resulting low-density light-scattering band (approximately 18% sucrose) was extracted, diluted 4× in Na2CO3, and centrifuged a further 2 h (166 000g, 4 °C) to pellet the raft-enriched membranes.4,36

Suspension cultures of A. thaliana were maintained as described previously.37 Plasma membranes were prepared as reported,38 using a homogenization buffer containing 250 mM sucrose, 100 mM HEPES/KOH, pH 7.5, 15 mM EGTA, 5% glycerol, 0.5% poly(vinylpyrrolidone) K 25, 3 mM DTT, and 1 mM PMSF at 2 mL/g fresh weight. Microsomal membranes were resuspended in buffer R (250 mM sucrose, 5 mM potassium phosphate, pH 7.5, and 6 mM KCl) and subjected to phase partitioning38 in 6.0% each dextran T-500 and poly(ethylene glycol) 3350 in buffer R. For removal of external soluble proteins, plasma membranes were washed with 100 mM Na2CO3.
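The sucrose concentrations quoted above follow from simple volume-weighted mixing. The short Python sketch below reproduces the two stated values; it assumes that the cleared lysate and the Na2CO3 diluent contain no sucrose and that "diluted 4×" means one part band plus three parts diluent. The function name `mix_concentration` is illustrative only.

```python
def mix_concentration(parts):
    """Return the final concentration after combining (volume, concentration) pairs."""
    total_volume = sum(volume for volume, _ in parts)
    if total_volume <= 0:
        raise ValueError("total volume must be positive")
    return sum(volume * conc for volume, conc in parts) / total_volume


# Equal volumes of cleared lysate (assumed sucrose-free) and 90% sucrose in MBS.
base_layer = mix_concentration([(1.0, 0.0), (1.0, 90.0)])

# Band at approximately 18% sucrose, diluted 4x (assumed 1 part band + 3 parts diluent).
diluted_band = mix_concentration([(1.0, 18.0), (3.0, 0.0)])

print(base_layer, diluted_band)
```

Under these assumptions the base layer comes out at the stated 45% sucrose, and the diluted band is light enough for the membranes to pellet in the second centrifugation.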

Glycosylphosphatidylinositol-Specific Phospholipase D (PLD) Purification. The PLD enzyme was purified from bovine serum as previously described.39 PLD had an activity of 4700 U/mL at 0.28 mg protein/mL. PLD was stored in a buffer of 10 mM triethanolamine/HCl, pH 7.4, containing 0.05% Na-azide and 150 mM NaCl.

Two-Phase Separation and Phospholipase D Treatment. Two-phase separation was performed as described.28 Membranes were equilibrated by resuspending the pellet in buffer A (20 mM HEPES, pH 7.5, 0.2 mM PMSF, and 0.5 tablet of protease inhibitor per mL) and pelleted again at 20 000g for 20 min. The membrane fraction was resuspended in 100 µL of buffer A, and then the same volume of Triton X-114 was added and mixed to homogeneity. The mixture was chilled on ice for 5 min and then transferred to 37 °C for 20 min for phase separation. The aqueous supernatant was discarded, and the procedure was repeated. The detergent phase was recovered, and 100 µL of buffer A with 4 µL of PLD (4700 U/mL and 0.28 mg protein/mL) was added; the mixture was incubated at 37 °C with shaking. After 1 h, phase separation was performed, and the aqueous supernatant was recovered. Buffer and enzyme were added again, and the procedure was repeated. The two resulting supernatants were pooled, and the proteins were recovered by acetone precipitation.

Tryptic Digestion of Protein. 1. In-Gel Digestion. Proteins were separated by SDS-PAGE and visualized by silver staining. Protein bands were cut out and in-gel-digested with trypsin.40

2. In-Solution Digestion. Proteins recovered by acetone precipitation were resuspended in 21 µL of 400 mM NH4HCO3 (pH 7.8) in 8 M urea. Five microliters of 45 mM DTT was added, and the mixture was incubated for 15 min at 50 °C, then chilled, and 5 µL of 100 mM iodoacetamide was added for S-alkylation, followed by incubation in the dark at room temperature for 15 min. Lys C protease was added (15 ng, Princeton, Adelphia, NJ), and the sample was incubated at 37 °C for 6 h. One hundred and forty microliters of H2O and 20 pmol of sequence grade trypsin (Promega, Madison, WI) in 5 µL of 100 mM NH4HCO3 were added and incubated overnight at 37 °C. The digests were then kept at -20 °C. Prior to MS, peptide mixtures were loaded onto custom-made POROS R2 and R3 GELoader microcolumns41 and washed with 5% formic acid (FA). Peptides were eluted in 80% acetonitrile in 5% FA and then dried in a vacuum centrifuge for subsequent mass spectrometric analysis.

Mass Spectrometry. Automated nanoflow liquid chromatography/tandem mass spectrometry analysis was performed using a QTOF Ultima mass spectrometer (Waters/Micromass, Manchester, U.K.) employing automated data-dependent acquisition (DDA). A nanoflow-HPLC system (Ultimate; Switchos2; Famos; LC Packings, Amsterdam, The Netherlands) was used to deliver a flow rate of 175 nL/min. Chromatographic separation was accomplished by loading peptide samples onto a homemade 2 cm fused silica precolumn (75 µm i.d.; 360 µm o.d.; Zorbax SB-C18 5 µm (Agilent, Wilmington, DE)) using an autosampler essentially as described.42 Sequential elution of peptides was accomplished using a linear gradient from Solution A (0% acetonitrile in 1% formic acid/0.6% acetic acid/0.005% heptafluorobutyric acid (HFBA)) to 40% of solution B (90% acetonitrile in 1% formic acid/0.6% acetic acid/0.005% HFBA) in 30 min over the precolumn in-line with a homemade 8 cm resolving column (75 µm i.d.; 360 µm o.d.; Agilent Zorbax SB-C18 3.5 µm). The resolving column was connected using a fused silica transfer line (20 µm i.d.) to a distally coated fused silica emitter (New Objective, Cambridge, MA) (360 µm o.d./20 µm i.d./10 µm tip i.d.) biased to 2.6 kV. The mass spectrometer was operated in the positive ion mode with a resolution of 9000-11 000 full width at half-maximum (fwhm). Data-dependent analysis was employed (three most abundant ions in each cycle were selected for MS/MS): 1 s MS (m/z 350-1500) and max 4 s per MS/MS (m/z 50-2000, continuum mode), 30 s dynamic exclusion. A charge state recognition algorithm was employed to determine optimal collision energy for low energy CID MS/MS of peptide ions.

Raw data was processed using MassLynx 3.5 ProteinLynx (smooth 3/2 Savitzky-Golay and center 4 channels/80% centroid), and the resulting MS/MS data set was exported in the Micromass pkl format. External mass calibration using NaI resulted in mass errors of less than 50 ppm, typically 5-15 ppm in the m/z range 50-2000. To compensate for mass accuracy drift due to temperature fluctuations, the centroided data from each LC-MS run was recalibrated postacquisition using a custom-made Perl script that fitted by linear regression the observed m/z values to the theoretical ones based on the best scoring peptide identifications made by MASCOT.

Automated peptide identification from raw data was performed using an in-house MASCOT server (v. 1.8) (Matrix Science, London, U.K.) using the NCBI nonredundant protein database and the following constraints: only tryptic peptides with up to two missed cleavage sites were allowed; ±0.5 Da tolerance for MS and ±0.2 Da for MS/MS fragment ions; carbamidomethyl cysteine (C) was specified as a fixed modification; deamidation (NQ) and methionine oxidation (M) were specified as variable modifications. All of the GPI-APs from HeLa cells and most of the GPI-APs in Arabidopsis were identified based on two or more different peptide tandem mass spectra matching to each individual protein. A total of 15 A. thaliana GPI-APs were each identified based on one peptide sequence obtained by tandem mass spectrometry. In these cases, the tandem mass spectra were manually inspected to validate the data and the corresponding protein sequence assignments.

Peptides from in-solution digests were analyzed as described above except that a gradient of 5-60% B in 95 min was used. Data-dependent analysis was employed (five most abundant ions in each cycle): 1 s MS (m/z 350-1500) and max 3 s per MS/MS (m/z 50-2000, continuum mode), 60 s dynamic exclusion.
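The postacquisition recalibration described above can be illustrated with a minimal Python sketch. This is not the authors' Perl script, whose details are not given; it only shows an ordinary least-squares fit of theoretical against observed m/z values, followed by correction of the observed values. The m/z values and the uniform 20 ppm drift are hypothetical illustration data.

```python
def fit_linear_calibration(observed, theoretical):
    """Least-squares fit of theoretical = slope * observed + intercept."""
    n = len(observed)
    if n < 2 or n != len(theoretical):
        raise ValueError("need at least two matched observed/theoretical values")
    mean_x = sum(observed) / n
    mean_y = sum(theoretical) / n
    sxx = sum((x - mean_x) ** 2 for x in observed)
    if sxx == 0:
        raise ValueError("observed values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(observed, theoretical))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def recalibrate(mz_values, slope, intercept):
    """Apply the fitted calibration line to a list of m/z values."""
    return [slope * mz + intercept for mz in mz_values]


def ppm_error(observed_mz, theoretical_mz):
    """Relative mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


# Hypothetical calibrant peptides: theoretical m/z and values measured with a +20 ppm drift.
theoretical = [500.0, 800.0, 1200.0]
observed = [mz * (1 + 20e-6) for mz in theoretical]

slope, intercept = fit_linear_calibration(observed, theoretical)
corrected = recalibrate(observed, slope, intercept)
residual_ppm = [ppm_error(c, t) for c, t in zip(corrected, theoretical)]
```

In practice the calibrant list would be the best scoring peptide identifications from a first database search, as stated in the text; how many peptides were used and how they were selected is not specified in the source.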

Results

Release of GPI-Anchored Proteins by Phospholipase D. The concept of modification-specific proteomics encompasses the integration of techniques to provide selectivity, specificity, and sensitivity for detection of a distinct class of post-translationally modified proteins.1 In the present study, we used plasma membrane preparation methods and the enzyme phospholipase D to achieve high selectivity toward membrane-associated GPI-APs from human cell culture (HeLa) and from A. thaliana cell culture. Next, mass spectrometry analysis by capillary LC-MS/MS provided high analytical sensitivity and specificity for protein identification. Finally, computational sequence analysis provided further specificity to validate bona fide GPI-APs and to eliminate false positive assignments. We modified the analytical strategy for proteomic analysis of GPI-anchored proteins that was previously introduced by our laboratory.30 The PLD enzyme was used instead of the PI-PLC enzyme to release GPI-APs from the plasma membrane fractions in the
presence of detergent. PLD enzyme treatment in a Triton X-114 suspension at 37 °C followed by two-phase partitioning of the membrane proteins and soluble proteins at 4 °C was applied to the analysis of raft-enriched membrane (REM) fractions from H. sapiens or microsome preparations from A. thaliana. The PLD enzyme hydrolyzes the glycosylphosphatidylinositol, releasing the soluble GPI-protein from the membrane/detergent phase and enabling its recovery in the aqueous phase. Without a priori knowledge of the anchor forms of GPI-APs present, we expected that PLD would be able to access most of the GPI-APs present in the membranes. Control experiments were conducted using no treatment (negative control) or commercially available PI-PLC, as previously described.30 Protein samples isolated in this way were concentrated by precipitation, and a fraction was separated by SDS-PAGE. Silver staining of the SDS-PAGE gels demonstrated that a range of proteins were selectively recovered after PLD treatment of human raft-enriched membrane protein samples (Figure 2, panel A) and plant plasma membrane preparations (Figure 3, panel A). It was clear that both PLD and PLC treatment generated a range of proteins in the soluble fraction, as compared to the untreated control. In our previous proteomic study of GPI-APs,30 we used Western blotting by anti-cross-reacting determinant (CRD) antibody19 to confirm the release of GPI-APs upon PI-PLC treatment. However, PLD cleaves at a different site than PLC (Figure 1), and the residual GPI anchor is not recognized by the anti-cross-reacting determinant antibody, and so this method was omitted in the present study.

Protein Identification by Tandem Mass Spectrometry and Sequence Database Searching. 1. Identification of GPI-APs in a Human Lipid Raft-Enriched Fraction. Proteins recovered after PLD treatment of the human lipid raft-enriched sample were identified after tryptic digestion of the protein precipitate or after SDS-PAGE separation and in-gel digestion of proteins. Nine protein bands were excised from the SDS-PAGE gel containing the GPI-AP-enriched fraction from human REM fractions as indicated in Figure 2, panel A. Recovered peptides were analyzed by capillary HPLC interfaced to electrospray ionization quadrupole time-of-flight tandem mass spectrometry (LC-MS/MS) and the peptide tandem mass spectra submitted for database searching. For example, Figure 2B shows two tandem mass spectra obtained from protein band 6 that were assigned to the tryptic peptides VENQVLSVR and LQDASAEVER from bone marrow stromal antigen-2. When the criterion of at least two matched peptides was used to accept a protein identification, a total of 42 proteins were identified by LC-MS/MS of peptide samples obtained by in-gel and in-solution digestion of the PLD-treated human membrane protein sample. To further evaluate these 42 protein sequences, we used computational sequence analysis tools to distinguish genuine GPI-APs from proteins that were released or leaked from membranes independent of PLD enzyme treatment. Three GPI-AP prediction algorithms were investigated, including Big-Pi (http://mendel.imp.univie.ac.at/gpi/gpi_server.html), DGPI (http://www.expasy.org/tools/), and GPI-SOM (http://gpi.unibe.ch/). As a result, 11 protein sequences were assigned as GPI-APs by at least two of the three prediction tools (Table 2), whereas 31 of the 42 proteins were disqualified based on the lack of the required amino acid sequence features. These latter proteins are probably present in the soluble fraction due to release from the membrane fraction by residual proteolytic activity or due to leaking as a result of high abundance.
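The two-out-of-three acceptance rule can be written down as a small voting function. The Python sketch below is only an illustration of that rule, not the procedure actually used by the authors; it takes the ω-site cells as reported in Table 2, where a "NO/number" entry means that the predictor did not classify the sequence as a GPI-AP. The entry labeled `hypothetical_contaminant` and its values are invented for contrast.

```python
def predictor_call(cell):
    """A Table 2 cell counts as a positive call unless it is flagged 'NO/...'."""
    return not cell.strip().upper().startswith("NO")


def is_consensus_gpi_ap(cells, min_votes=2):
    """Accept a protein when at least `min_votes` predictors give a positive call."""
    return sum(predictor_call(cell) for cell in cells.values()) >= min_votes


# ω-site cells for four entries of Table 2, plus one invented negative example.
predictions = {
    "P05186": {"GPI-SOM": "500", "DGPI": "NO/501", "Big-Pi": "501"},
    "P08174": {"GPI-SOM": "NO/349", "DGPI": "353", "Big-Pi": "353"},
    "Q7Z3B1": {"GPI-SOM": "321", "DGPI": "324", "Big-Pi": "NO/324"},
    "P13987": {"GPI-SOM": "120", "DGPI": "102", "Big-Pi": "102"},
    "hypothetical_contaminant": {"GPI-SOM": "NO/210", "DGPI": "NO/212", "Big-Pi": "215"},
}

accepted = sorted(name for name, cells in predictions.items() if is_consensus_gpi_ap(cells))
```

Requiring agreement between independent predictors tolerates a single false negative, as seen for alkaline phosphatase (P05186), while a sequence supported by only one tool is rejected.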


Figure 2. Proteomic analysis of GPI-anchored proteins from human cells. (Panel A) Isolation and detection of GPI-APs from human HeLa cells. The band marked with an asterisk contains the PLD enzyme. (Panel B) Mass spectrometry analysis of tryptic peptides obtained by nanoscale liquid chromatography-MS/MS analysis from protein band 6. MS/MS spectra of peptides and the assigned amino acid sequences. To aid clarity, only the y-ions are highlighted. These two sequences matched bone marrow stromal antigen 2 protein, with a MASCOT score of 62 (significance threshold of 54).

These experiments demonstrate for the first time that PLD is a useful and efficient enzyme for the proteomic analysis of GPI-APs in human plasma membrane preparations. Eleven out of 42 identified human proteins (26%) were bona fide GPI-APs, and they currently represent the largest set of human GPI-APs found in a single proteomic experiment. In contrast, a proteomic analysis of lipid rafts revealed only 5 GPI-APs among 241 authentic lipid raft proteins.4 Thus, the present modification-specific strategy for determination of GPI-APs is significantly more efficient and sensitive than standard proteomics methods that do not target particular types of modified membrane proteins. We also conclude that it is preferable to use several independent computational sequence analysis tools for validation of GPI-anchored protein assignments (see below).

Figure 3. Proteomic analysis of GPI-anchored proteins from A. thaliana. (Panel A) Isolation and mass spectrometry analysis of GPI-APs from A. thaliana cells. (Panel B) Mass spectrometry analysis of tryptic peptides obtained by nanoscale liquid chromatography-MS/MS analysis. MS/MS spectra and the assigned amino acid sequences. To aid clarity, only the y-ions are highlighted. This sequence matched the PLC-X domain-containing protein At5g67130, with a MASCOT score of 70 (significance threshold of 54).
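The y-ion series highlighted in the MS/MS spectra of Figures 2 and 3 can be computed directly from a peptide sequence. The following Python sketch calculates singly protonated, monoisotopic y-ion m/z values for the two Bst-2 peptides named in the text. It uses standard monoisotopic residue masses and assumes unmodified residues (neither peptide contains cysteine, so the carbamidomethyl modification does not apply); the observed values in the figures are not reproduced here.

```python
# Standard monoisotopic residue masses (Da) for the 20 common amino acids, unmodified.
MONOISOTOPIC_RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # mass of H2O added to every y-ion
PROTON = 1.007276   # charge carrier for a singly protonated ion


def y_ion_series(peptide):
    """Return singly protonated y-ion m/z values, y1 first, up to the full peptide."""
    series = []
    running = WATER + PROTON
    for residue in reversed(peptide):
        running += MONOISOTOPIC_RESIDUE_MASS[residue]
        series.append(running)
    return series


for sequence in ("VENQVLSVR", "LQDASAEVER"):
    ions = y_ion_series(sequence)
    print(sequence, [round(mz, 3) for mz in ions])
```

Both peptides end in arginine, so their y1 ions coincide; the last value of each series corresponds to the singly protonated intact peptide.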

Several interesting observations were made during data analysis and interpretation. We previously identified mesothelin/megakaryocyte potentiating factor in the PLC-released fraction.30 However, at that time, the algorithms predicted the sequence database entry (Q9UK57) as a clear non-GPI-AP, and we considered the protein a contaminant. The database entry Q9UK57 represents the soluble form of mesothelin/megakaryocyte potentiating factor. In the present study, we identified this protein by sequencing of four peptides in the PLD-released fraction. When the pre-pro-megakaryocyte potentiating factor sequence (Q14859) or the mesothelin sequence (Q9BR17 or Q13421) is submitted, all of which contain the sequence covered by the identified peptides, the protein is recognized as a GPI-AP by all three predictors (Table 2). Bone marrow stromal antigen 2 (Bst-2) was recognized as a GPI-AP by all three predictors, but at the same time, it contained a transmembrane domain close to its NH2-terminus as revealed by the DAS TM filter (http://mendel.imp.univie.ac.at/sat/DAS/DAS.html).43 The observation of both an NH2-terminal transmembrane domain and a GPI anchor in the same protein is in agreement with another report for rat Bst-2,44 and the same features were also found in a minor but pathologically important isoform of the prion protein (PrP) and in a plant protein involved in disease resistance, NDR1.45 The fact that we observed this protein in the PLD-treated sample from the human REM fraction indicates that the transmembrane domain of the Bst-2 protein was proteolytically removed. Otherwise, the protein would remain associated with the membrane fraction of the two-phase system.


Table 1. Proteins Identified upon PLD Enzyme Treatment of Human Lipid Raft-Enriched Membranes (a)

Swiss-Prot | name | no. | sco
P05186 | alkaline phosphatase | 21 | 839
P08174 | decay-acceleration factor, CD55 | 7 | 250
Q9BR17 | mesothelin/megakaryocyte potentiating factor | 5 | 202
Q7Z3B1 | neuronal growth regulator 1 | 5 | 180
P15328 | folate receptor | 2 | 110
P14384 | carboxypeptidase M precursor | 4 | 123
P55290 | cadherin 13 preproprotein | 3 | 105
P19256-2 | antigen CD58/surface glycoprotein LFA-3 | 2 | 87
P13987 | CD59 | 3 | 77
Q03405 | UPAR, CD87 | 2 | 70
Q10589 | bone marrow stromal cell antigen 2 | 2 | 62

NCBI | name | no. | sco
6090615 | dihydropyridine receptor α2 subunit | 14 | 557
21361344 | 4F2, heavy chain | 18 | 865
4507879 | voltage-dependent anion channel 1 | 12 | 600
2506545 | 78 kDa glucose-regulated protein precursor | 12 | 446
30795231 | brain abundant, membrane attached signal protein 1 | 11 | 565
6164848 | transferrin receptor | 9 | 398
19743813 | fibronectin receptor β subunit | 9 | 393
33354077 | L1 cell adhesion molecule | 8 | 340
4336424 | cell surface glycoprotein P1H12 precursor/melanona Sp | 8 | 298
4504085 | glycerol-3-phosphate dehydrogenase 2 | 6 | 231
18999392 | cytochrome c oxidase subunit Va precursor | 6 | 227
29799 | CD44R1 | 5 | 238
2772564 | ADP/ATP carrier protein | 5 | 216
15277577 | voltage-dependent anion channel 2 | 5 | 171
190804 | ubiquinone-binding protein | 4 | 164
20146101 | EMMPRIN | 3 | 130
6138770 | human leucocyte antigen A | 3 | 127
12643412 | 4F2 light chain | 3 | 107
182710 | fibronectin receptor α-subunit precursor | 3 | 106
5453559 | ATP synthase, subunit d | 3 | 81
25188179 | voltage-dependent anion channel 3 | 3 | 77
307132 | membrane glycoprotein | 2 | 116
15680023 | B-cell receptor-associated protein 31 | 2 | 100
114776 | β-2-microglobulin precursor | 2 | 96
5729718 | 5T4-antigen | 2 | 89
13129092 | hypothetical protein MGC5508 | 2 | 81
2144362 | cytochrome-c oxidase | 2 | 79
5453916 | progesterone membrane binding protein | 2 | 79
10716563 | calnexin | 2 | 74
307110 | lysosomal membrane glycoprotein-2 | 2 | 64
885684 | thymopoietin β | 2 | 56

(a) Two independent peptide matches were required for calling a protein. Swiss-Prot, Swiss-Prot database code for the protein; NCBI, database code for the protein; no., number of peptides matched for each protein; sco, score obtained in MASCOT database search (significance threshold 54).

Table 2. GPI-Anchored Proteins from Human Raft-Enriched Membranes (a)

Swiss-Prot | name | no. | sco | ω-site GPI-SOM | ω-site DGPI | ω-site Big-Pi | Big-Pi BSSco | Big-Pi P-v of best site | Big-Pi P-value 6th
P05186 | alkaline phosphatase | 21 | 839 | 500 | NO/501 | 501 | 11.27545 | 6.77 × 10^-3 | 1.76 × 10^-4
P08174 | decay-acceleration factor | 7 | 250 | NO/349 | 353 | 353 | 9.40479 | 7.56 × 10^-3 | 2.73 × 10^-4
Q9BR17 | mesothelin/megakaryocyte potentiating factor | 5 | 202 | 625 | 606 | 606 | 5.77495 | 9.36 × 10^-3 | 5.88 × 10^-4
Q7Z3B1 | neuronal growth regulator 1 | 5 | 180 | 321 | 324 | NO/324 | -9.60 | 2.31 × 10^-2 | 6.66 × 10^-3
P15328 | folate receptor | 2 | 110 | 229 | 234 | 234 | 15.62 | 5.23 × 10^-3 | 5.66 × 10^-3
P14384 | carboxypeptidase M precursor | 4 | 123 | 422 | 423 | NO/421 | -11.19 | 2.53 × 10^-2 | 8.12 × 10^-3
P55290 | cadherin 13 preproprotein | 3 | 105 | 689 | 690 | 893 | 5.48943 | 9.52 × 10^-3 | 6.22 × 10^-4
P19256-2 | antigen CD58/surface glycoprotein LFA-3 | 2 | 87 | 220 | 208 | NO/208 | -9.07 | 2.23 × 10^-2 | 6.22 × 10^-3
P13987 | CD59 | 3 | 77 | 120 | 102 | 102 | 17.51 | 4.68 × 10^-3 | 3.26 × 10^-5
Q03405 | UPAR | 2 | 70 | 304 | 304 | 305 | 17.01581 | 4.82 × 10^-3 | 3.78 × 10^-5
Q10589 | bone marrow stromal cell antigen 2 | 2 | 62 | 160 | 160 | 161 | 11.94909 | 6.50 × 10^-3 | 1.50 × 10^-4

(a) Swiss-Prot, Swiss-Prot database code for the protein; no., number of peptides matched for each protein; sco, score obtained in MASCOT database search (significance threshold 54); GPI-SOM, proteins identified as GPI-AP by http://gpi.unibe.ch/ predictor; DGPI, proteins identified as GPI-AP by http://129.194.185.165/dgpi/index_en.html predictor; Big-Pi, proteins identified as GPI-AP by http://mendel.imp.univie.ac.at/sat/gpi/gpi_server.html predictor. Big-Pi parameters: BSSco, best site score; P-v of best site, P-value of best ω-site, linear fit; P-value 6th, P-value of the best site, sixth degree polynomial fit. Inside the table, NO/number indicates the predictor does not recognize it as GPI-AP but it gives the best possible position in the amino acid listed.

2. Identification of GPI-APs in A. thaliana Plasma Membrane Preparations. The growing importance of GPI-APs in plant biology is reflected in recent proteomic29,30 and genetic studies of GPI-APs from A. thaliana.18,46 We therefore investigated whether PLD is applicable to proteomic analysis of GPI-APs in the plant A. thaliana. SDS-PAGE separation demonstrated specific enrichment of proteins after PLD treatment of A. thaliana plasma membrane protein preparations (Figure 3, panel A). LC-MS/MS analysis of peptide mixtures obtained by in-gel or in-solution digestion of the proteins led to the identification of 27 individual proteins, each supported by a minimum of two peptide matches (Table 3). An additional 15 proteins were identified by only one peptide sequence match (Table 3).

Table 3. Proteins Identified upon PLD Treatment of Arabidopsis thaliana Plasma Membranes^a

a AGI, Arabidopsis Genome Initiative code; no., number of peptides matching each protein; sco, score obtained in MASCOT database search (significance threshold 54); DGPI, proteins identified as GPI-AP by the http://129.194.185.165/dgpi/index_en.html predictor; Big-Pi, proteins identified as GPI-AP by http://mendel.imp.univie.ac.at/gpi/plant_server.html; GPI-SOM, proteins identified as GPI-AP by the http://gpi.unibe.ch/ predictor; Borner03, GPI-APs predicted by computational methods or experimental techniques, respectively, by Borner et al.;29 Elortza03, GPI-APs detected and identified by mass spectrometry by Elortza et al.;30 proteins marked in bold are GPI-APs detected for the first time in a proteomic work.

The tandem mass spectrum corresponding to the tryptic peptide SDGGGVFEILDR is shown in Figure 3, panel B. The high quality of the MS/MS data and the high mass accuracy and mass resolution of the QTOF mass spectrum enabled unambiguous identification of this protein as PLC-X domain-containing protein (At5g67130). We then used the three different GPI-AP prediction tools and published data29,30 to validate the assignment of GPI-APs in the dataset (Table 3). The PLC-X domain-containing protein, At5g67130, was recognized as a GPI-AP by the DGPI, GPI-SOM, and Big-Pi algorithms but not by the predictor by Borner et al.29

Among the 27 proteins identified by at least two MS/MS-based peptide sequence matches, a total of 25 were recognized as bona fide GPI-APs, again demonstrating the high selectivity, specificity, and sensitivity of the modification-specific approach to proteomics. Among the 15 proteins that were identified with high probability by only one MS/MS-based peptide match, an additional 10 GPI-APs were assigned. The set of 35 GPI-APs was validated by high probability scores from at least two of the computational techniques (Table 3). Furthermore, 34 of these GPI-APs were assigned as such by three independent computational methods. The highest amino acid sequence coverage was obtained for the reticuline oxidase-like/FAD binding domain-containing protein and for SKU5 (At4g12420), suggesting that these proteins are the most abundant in the preparation.

Five GPI-APs were detected experimentally for the first time (Table 3), including fasciclin-like protein (FLA9, At1g03870), aspartyl protease-like family protein (At1g65240), glycero-phosphodiesterase-like protein (Atg66970), and two β-1,3 glucanases (At3g58100 and At5g61130). Reticuline oxidase-like protein/FAD binding domain-containing protein (At4g20830) was previously described as a major contaminant protein that co-purifies when using the Triton X-114 two-phase separation in combination with PI-PLC treatment,29,30 and it was detected in plant rafts.6 We now assign this protein as a true GPI-AP since a revised amino acid sequence extended by 30 amino acids has emerged in the database (gi 30685222). As a result, DGPI and GPI-SOM recognize At4g20830 as a GPI-anchored protein.

Discussion

In the present study, 11 out of 42 recovered proteins were identified as GPI-APs in a human raft-enriched membrane protein fraction (Tables 1 and 2). For A. thaliana plasma membranes, we detected 35 GPI-APs among 42 identified proteins (Table 3). This may suggest that the modification-specific proteomic strategy is more efficient for studying GPI-APs from plants than from humans, since fewer 'contaminant proteins' were identified in the former case. However, sample purity, heterogeneity, and the relative abundance and composition of membrane proteins will affect the performance of the methods, and a direct comparison of the outcomes of the two experiments from two different cell types is not appropriate.

Treatment of membrane fractions with either of the two phospholipases, PI-PLC or PLD, yielded very similar protein patterns upon SDS-PAGE analysis, for human and for plant cells. The number and types of GPI-APs that were identified in the proteomic analysis of the plant GPI-AP-enriched fractions were comparable for plasma membrane preparations treated with PI-PLC in our previous study30 and with PLD (this study), suggesting that the most abundant GPI-APs in plant cells carry prototypical GPI anchors. In cases where the GPI anchors are highly modified (branched), we would expect treatment with PI-PLC and with PLD to release more diverse and complementary sets of GPI-APs. In the present study, we identified more human GPI-APs than in our previous study using PI-PLC,30 thereby increasing the number of GPI-APs found in a single proteomic study of human cells from 6 to 11. This is probably due not only to the use of the PLD enzyme but also to refined sample handling methods and improved data interpretation tools.
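The proportions behind the counts quoted above can be made explicit. This is a minimal sketch that uses only the counts stated in the text (11 of 42 and 35 of 42); the names `COUNTS` and `gpi_ap_fraction` are hypothetical, and, as noted above, the two fractions come from different sample types and are not a like-for-like comparison.

```python
# Counts quoted in the text: (GPI-APs, all proteins identified after PLD treatment).
COUNTS = {
    "human (HeLa raft-enriched membranes)": (11, 42),
    "A. thaliana plasma membranes": (35, 42),
}


def gpi_ap_fraction(gpi_aps, identified):
    """Fraction of identified proteins that were assigned as GPI-APs."""
    if identified <= 0:
        raise ValueError("identified must be positive")
    return gpi_aps / identified


fractions = {sample: gpi_ap_fraction(*c) for sample, c in COUNTS.items()}
for sample, fraction in fractions.items():
    print(f"{sample}: {fraction:.0%} of identified proteins are GPI-APs")
```

With these counts the fractions come to roughly 26% for the human sample and 83% for the plant sample.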


We investigated three different computational techniques for amino acid sequence-dependent prediction of GPI-anchored proteins and ω-sites for HeLa plasma membrane proteins. These tools are optimized for or biased toward classification of proteins from certain species, and their application to sequences from other species may give rise to false positive or false negative assignments. Overall, however, the sequence analyses performed by the different predictors are consistent when at least two independent positive calls are required to accept a GPI-AP assignment. Used individually, each of the three methods is less reliable. For example, Big-Pi did not recognize human neuronal growth regulator 1 (Q7Z3B1), carboxypeptidase M precursor (P14384), and antigen CD58 as GPI-APs, whereas both DGPI and GPI-SOM classified them as positives. On the other hand, DGPI failed to recognize alkaline phosphatase (P05186) as a GPI-AP and GPI-SOM failed with decay-acceleration factor, both bona fide human GPI-APs.

In the case of GPI-APs in plants, the Big-Pi predictor exhibited a higher degree of false negative calls among the validated GPI-APs, probably due to its high specificity. Five proteins that were recognized by the three other methods were not called by Big-Pi, including FLA2 (At4g12730, score: -2.83); COBRA protein (At4g16120, -15.15); plastocyanin-like protein (At5g15350, -24.23); β-1,3 glucanase (At3g58100, -15.50); and β-1,3 glucanase (At5g61130, -1.78). Interestingly, false negative calls among the other predictors did not coincide in any case, except for FAD binding domain-containing protein, which was not called by Big-Pi or by Borner et al. Thus, all plant proteins listed as GPI-APs were recognized as such by at least three sequence-based predictors, except for the FAD binding domain-containing protein. In general, our results indicate that the predictors are more likely to produce false negative than false positive GPI-AP calls.
Using a combination of sequence-based predictors of GPI-APs is therefore the most robust and reliable approach to assign these species among proteins identified by mass spectrometry.

The human body is made up of approximately 220 different cell types. Assuming that the human genome contains approximately 23 000 genes and that 2% of the human genome encodes potential GPI-anchored proteins,25 we predict that humans may generate approximately 460 GPI-anchored proteins. In the present work, we have detected 11 GPI-anchored proteins in raft-enriched membrane fractions from HeLa cells. Many GPI-anchored proteins are ubiquitously expressed (e.g., CD59), but some are specific to certain tissues, cell types, or cellular organelles. We assume that the sensitivity and specificity of the modification-specific strategy presented here are sufficient to identify a majority of the GPI-anchored proteome of the HeLa cell DRMs. The plant A. thaliana encodes approximately 248 GPI-APs,29 and we detected 35 GPI-APs in this work and 44 GPI-APs in our previous study,30 providing a total of 50 different GPI-anchored proteins from a plant cell culture. This represents approximately 20% of the total GPI-anchored proteome of the plant. Since we were exclusively working with plant cell culture and not with primary plant tissue, 20% coverage of the GPI-AP proteome is very reasonable.

Neither tryptic peptides containing the ω-sites nor tryptic peptides from domains beyond the ω-site were observed for any of the 11 GPI-APs from HeLa or the 35 GPI-APs from Arabidopsis. This observation supports the assumption that these GPI-APs were C-terminally processed prior to addition of the GPI anchor. We do not expect the current analytical method to be able to detect and sequence the modified C-terminal tryptic peptides from any of the GPI-APs. Observation of these peptides would provide direct evidence for correct classification of GPI-APs and also enable determination of the ω-site. As shown in Table 2, there is a high degree of discrepancy among prediction methods regarding ω-site assignment, and we believe there is a clear need for experimental data to correctly address this issue. We are currently investigating alternative sample handling and processing methods that are suited for the recovery and characterization of these species (Omaetxebarria, M.; Elortza, F.; Jensen, O. N.; et al., submitted).
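The proteome-size and coverage estimates given earlier in the Discussion are simple arithmetic and can be reproduced as follows. This is a minimal sketch using only the figures quoted in the text (23 000 genes, 2%, 11 detected human GPI-APs, 50 of approximately 248 plant GPI-APs); the function names are hypothetical, and the human coverage value is derived here from those figures rather than stated in the text.

```python
def expected_gpi_aps(gene_count, gpi_fraction):
    """Estimated number of GPI-anchored proteins encoded by a genome."""
    return round(gene_count * gpi_fraction)


def coverage(detected, predicted_total):
    """Fraction of the predicted GPI-anchored proteome that was detected."""
    if predicted_total <= 0:
        raise ValueError("predicted_total must be positive")
    return detected / predicted_total


# Human: about 23 000 genes, 2% assumed to encode potential GPI-APs.
human_estimate = expected_gpi_aps(23_000, 0.02)

# Human: 11 GPI-APs detected in HeLa raft-enriched membranes (derived value).
human_coverage = coverage(11, human_estimate)

# Plant: 50 distinct GPI-APs (this and the previous study) of about 248 predicted.
plant_coverage = coverage(50, 248)

print(f"estimated human GPI-APs: {human_estimate}")
print(f"human coverage: {human_coverage:.1%}")
print(f"plant coverage: {plant_coverage:.1%}")
```

The plant value comes out at about 20%, matching the figure in the text, and the human estimate at 460.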

Conclusion

We introduced the enzyme phospholipase D as a tool for modification-specific proteomic analysis of GPI-anchored proteins. PLD facilitated recovery and identification of a large set of GPI-anchored proteins from human and plant plasma membrane preparations by mass spectrometry. As a result, there are now two phospholipase enzymes, PI-PLC and PLD, available for proteomic analyses aimed at the characterization of GPI-anchored proteins. This is particularly relevant for analysis of those cells and organisms that contain nonstandard or heterogeneously decorated versions of the core GPI anchor, such as transformed cells and pathogenic microorganisms, for example, P. falciparum or trypanosomes. Further refinement and integration of experimental techniques and bioinformatic prediction methods will lead to improvements in prediction-based structural and functional classification of proteins.

Abbreviations: GPI, glycosylphosphatidylinositol; GPI-AP, glycosylphosphatidylinositol-anchored protein; LC, liquid chromatography; MS, mass spectrometry; MS/MS, tandem MS; PI-PLC, phosphatidylinositol-phospholipase C; PLD, glycosylphosphatidylinositol-specific phospholipase D; PTM, post-translational modification; QTOF, quadrupole time-of-flight; REM, raft-enriched membrane fraction.

Acknowledgment. The authors thank F. M. Goni and X. Contreras for assistance in obtaining the PLD enzyme. F.E. was supported by a postdoctoral fellowship from the Basque Government, T.S.N. by an EMBO short-term fellowship, and L.J.F. by a Michael Smith Foundation for Health Research Career Investigator award. This project was supported by a research grant from the Danish Natural Sciences Research Council (O.N.J.) and by the Gatsby Charitable Foundation (T.S.N., S.C.P.). O.N.J. is a Lundbeck Foundation Research Professor and the recipient of a Young Investigator Award from the Danish Natural Sciences Research Council.

References

(1) Jensen, O. N. Curr. Opin. Chem. Biol. 2004, 8 (1), 33-41.
(2) Mann, M.; Jensen, O. N. Nat. Biotechnol. 2003, 21 (3), 255-261.
(3) Mayor, S.; Riezman, H. Nat. Rev. Mol. Cell Biol. 2004, 5 (2), 110-120.
(4) Foster, L. J.; De Hoog, C. L.; Mann, M. Proc. Natl. Acad. Sci. U.S.A. 2003, 100 (10), 5813-5818.
(5) Harder, T.; Scheiffele, P.; Verkade, P.; Simons, K. J. Cell Biol. 1998, 141 (4), 929-942.
(6) Borner, G. H.; Sherrier, D. J.; Weimar, T.; Michaelson, L. V.; Hawkins, N. D.; Macaskill, A.; Napier, J. A.; Beale, M. H.; Lilley, K. S.; Dupree, P. Plant Physiol. 2005, 137 (1), 104-116.
(7) Peskan, T.; Westermann, M.; Oelmuller, R. Eur. J. Biochem. 2000, 267 (24), 6989-6995.
(8) Triantafilou, M.; Triantafilou, K. Trends Immunol. 2002, 23 (6), 301-304.
(9) Paratcha, G.; Ledda, F.; Baars, L.; Coulpier, M.; Besset, V.; Anders, J.; Scott, R.; Ibanez, C. F. Neuron 2001, 29 (1), 171-184.
(10) Ghiran, I.; Klickstein, L. B.; Nicholson-Weller, A. J. Biol. Chem. 2003, 278 (23), 21024-21031.
(11) Ahmad, S. R.; Lidington, E. A.; Ohta, R.; Okada, N.; Robson, M. G.; Davies, K. A.; Leitges, M.; Harris, C. L.; Haskard, D. O.; Mason, J. C. Immunology 2003, 110 (2), 258-268.
(12) Wang, K. C.; Kim, J. A.; Sivasankaran, R.; Segal, R.; He, Z. Nature 2002, 420 (6911), 74-78.
(13) Hall, C.; Richards, S. J.; Hillmen, P. Acta Haematol. 2002, 108 (4), 219-230.
(14) Nakatsura, T.; Yoshitake, Y.; Senju, S.; Monji, M.; Komori, H.; Motomura, Y.; Hosaka, S.; Beppu, T.; Ishiko, T.; Kamohara, H.; Ashihara, H.; Katagiri, T.; Furukawa, Y.; Fujiyama, S.; Ogawa, M.; Nakamura, Y.; Nishimura, Y. Biochem. Biophys. Res. Commun. 2003, 306 (1), 16-25.
(15) Swierczynski, S. L.; Maitra, A.; Abraham, S. C.; Iacobuzio-Donahue, C. A.; Ashfaq, R.; Cameron, J. L.; Schulick, R. D.; Yeo, C. J.; Rahman, A.; Hinkle, D. A.; Hruban, R. H.; Argani, P. Hum. Pathol. 2004, 35 (3), 357-366.
(16) Lillico, S.; Field, M. C.; Blundell, P.; Coombs, G. H.; Mottram, J. C. Mol. Biol. Cell 2004, 14 (3), 1182-1194.
(17) Naik, R. S.; Krishnegowda, G.; Gowda, D. C. J. Biol. Chem. 2003, 278 (3), 2036-2042.
(18) Lalanne, E.; Honys, D.; Johnson, A.; Borner, G. H.; Lilley, K. S.; Dupree, P.; Grossniklaus, U.; Twell, D. Plant Cell 2003, 16 (1), 229-240.
(19) Hooper, N. M. Proteomics 2001, 1 (6), 748-755.
(20) Metz, C. N.; Brunner, G.; Choi-Muira, N. H.; Nguyen, H.; Gabrilove, J.; Caras, I. W.; Altszuler, N.; Rifkin, D. B.; Wilson, E. L.; Davitz, M. A. EMBO J. 1994, 13 (7), 1741-1751.
(21) Eisenhaber, B.; Bork, P.; Eisenhaber, F. Protein Eng. 1998, 11 (12), 1155-1161.
(22) Udenfriend, S.; Kodukula, K. Annu. Rev. Biochem. 1995, 64, 563-591.
(23) Kronegg, D.; Buloz, D. http://www.expasy.ch/tools/ (accessed 1999).
(24) Eisenhaber, B.; Bork, P.; Eisenhaber, F. Protein Eng. 2001, 14 (1), 17-25.
(25) Fankhauser, N.; Maser, P. Bioinformatics 2005, 21 (9), 1846-1852.
(26) Borner, G. H.; Sherrier, D. J.; Stevens, T. J.; Arkin, I. T.; Dupree, P. Plant Physiol. 2002, 129 (2), 486-499.
(27) Hooper, N. M.; Low, M. G.; Turner, A. J. Biochem. J. 1987, 244 (2), 465-469.
(28) Bordier, C. J. Biol. Chem. 1981, 256 (4), 1604-1607.
(29) Borner, G. H.; Lilley, K. S.; Stevens, T. J.; Dupree, P. Plant Physiol. 2003, 132 (2), 568-577.
(30) Elortza, F.; Nuhse, T. S.; Foster, L. J.; Stensballe, A.; Peck, S. C.; Jensen, O. N. Mol. Cell. Proteomics 2003, 2 (12), 1261-1270.
(31) Sherrier, D. J.; Prime, T. A.; Dupree, P. Electrophoresis 1999, 20 (10), 2027-2035.
(32) Fivaz, M.; Vilbois, F.; Pasquali, C.; van der Goot, F. G. Electrophoresis 2000, 21 (16), 3351-3356.
(33) Ikezawa, H. Biol. Pharm. Bull. 2002, 25 (4), 409-417.
(34) Deeg, M. A.; Davitz, M. A. Methods Enzymol. 1995, 250, 630-640.
(35) Jensen, O. N. In Proteomics: A Trends Guide; Elsevier: London, 2000; pp 36-42.
(36) Smart, E. J.; Ying, Y. S.; Mineo, C.; Anderson, R. G. Proc. Natl. Acad. Sci. U.S.A. 1995, 92 (22), 10104-10108.
(37) Nuhse, T. S.; Peck, S. C.; Hirt, H.; Boller, T. J. Biol. Chem. 2000, 275 (11), 7521-7526.
(38) Walter, H.; Larsson, C. Methods Enzymol. 1994, 228, 451-469.
(39) Hoener, M. C.; Brodbeck, U. Eur. J. Biochem. 1992, 206 (3), 747-757.
(40) Shevchenko, A.; Wilm, M.; Vorm, O.; Mann, M. Anal. Chem. 1996, 68 (5), 850-858.
(41) Gobom, J.; Nordhoff, E.; Mirgorodskaya, E.; Ekman, R.; Roepstorff, P. J. Mass Spectrom. 1999, 34 (2), 105-116.
(42) Licklider, L. J.; Thoreen, C. C.; Peng, J.; Gygi, S. P. Anal. Chem. 2002, 74 (13), 3076-3083.
(43) Cserzo, M.; Eisenhaber, F.; Eisenhaber, B.; Simon, I. Bioinformatics 2004, 20 (1), 136-137.
(44) Kupzig, S.; Korolchuk, V.; Rollason, R.; Sugden, A.; Wilde, A.; Banting, G. Traffic 2003, 4 (10), 694-709.
(45) Coppinger, P.; Repetti, P. P.; Day, B.; Dahlbeck, D.; Mehlert, A.; Staskawicz, B. J. Plant J. 2004, 40 (2), 225-237.
(46) Gillmor, C. S.; Lukowitz, W.; Brininstool, G.; Sedbrook, J. C.; Hamann, T.; Poindexter, P.; Somerville, C. Plant Cell 2005, 17 (4), 1128-1140.

PR050419U

Journal of Proteome Research • Vol. 5, No. 4, 2006 943