Article pubs.acs.org/jpr
Comprehensive Analysis of Protein N‑Glycosylation Sites by Combining Chemical Deglycosylation with LC−MS
Weixuan Chen, Johanna M. Smeekens, and Ronghu Wu*
School of Chemistry and Biochemistry and the Petit Institute for Bioengineering and Bioscience, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
ABSTRACT: Glycosylation is one of the most important protein modifications in biological systems. It plays a critical role in protein folding, trafficking, and stability as well as cellular events such as immune response and cell-to-cell communication. Aberrant protein glycosylation is correlated with several diseases including diabetes, cancer, and infectious diseases. The heterogeneity of glycans makes comprehensive identification of protein glycosylation sites very difficult by MS because it is challenging to match mass spectra to peptides that contain different types of unknown glycans. We combined a chemical deglycosylation method with LC−MS-based proteomics techniques to comprehensively identify protein N-glycosylation sites in yeast. On the basis of the differences in chemical properties between the amide bond of the N-linkage and the glycosidic bond of the O-linkage of sugars, O-linked sugars were removed and only the innermost N-linked GlcNAc remained, which served as a mass tag for MS analysis. This chemical deglycosylation method allowed for the identification of 555 protein N-glycosylation sites in yeast by LC−MS, which is 46% more than those obtained from the parallel experiments using the Endo H cleavage method. A total of 250 glycoproteins were identified, including 184 membrane proteins. This method can be extensively used for other biological samples. KEYWORDS: glycoproteomics, comprehensive analysis of protein N-glycosylation, chemical deglycosylation, LC−MS
■
INTRODUCTION Glycosylation is one of the most important protein modifications in biological systems.1,2 It plays extremely important roles in a wide range of cellular events such as immune response, cell division, and cell−cell communication.3−5 Aberrant protein glycosylation is directly related to several diseases including diabetes, infectious diseases, and cancer.6−10 A comprehensive analysis of glycosylation sites will assist in better understanding the role protein glycosylation plays under physiological and pathological conditions.11,12 However, the comprehensive characterization of protein glycosylation is beyond the reach of conventional biochemical methods. With the development of mass spectrometry (MS) instrumentation, genome sequencing techniques, and computational methods, MS-based proteomics techniques have grown rapidly in recent years.13−17 These developments offer a unique opportunity to comprehensively identify modified proteins and to pinpoint the corresponding modification sites.18−26 Undoubtedly, diverse glycan structures in glycoproteins contain an abundance of important information.27,28 However, identification of the glycosylation sites represents another important research direction, which can provide other critical information, such as which proteins get glycosylated and at which sites. Despite many years of development, it is still challenging to perform a large-scale analysis of protein glycosylation sites using MS because of the low abundance of glycoproteins and the heterogeneity of glycans. Diverse structures of glycans make it difficult to match mass spectra to peptides that contain different types of unknown glycans. Thus, removing glycans from glycopeptides prior to MS analysis is a very important step in N-glycosylation analysis. To achieve this, enzymatic deglycosylation methods have been widely used. The enzyme peptide-N4-(N-acetyl-β-glucosaminyl)asparagine amidase F (PNGase F) has been employed to remove N-glycans from proteins or peptides. In the process, PNGase F converts Asn residues to Asp.29,30 However, deamidation of Asn also occurs frequently in vivo and in vitro, which can lead to false-positive site assignments. Performing the reaction in heavy-oxygen water can help to more confidently identify the bona fide glycosylation sites,31,32 but this further increases the experimental expenses. Other enzyme-based methods were also introduced to comprehensively profile protein N-glycosylation by MS.33,34 Glycopeptides were treated with endoglycosidases D and H (Endo D and Endo H), which partially cleaved glycans so that only the innermost N-acetylglucosamine (GlcNAc) remained. When combined with 1D gel electrophoresis and lectin chromatography, 62 protein N-glycosylation sites in 37 proteins were identified in human plasma samples.33 However, one of the drawbacks of these enzyme-based methods is that each enzyme has an inherently specific cleavage site and therefore no enzyme can be used to cleave all glycans.35
Received: October 4, 2013
Published: February 3, 2014
© 2014 American Chemical Society
dx.doi.org/10.1021/pr401000c | J. Proteome Res. 2014, 13, 1466−1473
Journal of Proteome Research
Figure 1. (a) Different chemical bonds of O- and N-linkages in N-glycosylated peptides. (b) Basic principle of the chemical deglycosylation method coupled with MS analysis. (c) A sample tandem mass spectrum of an identified N-glycopeptide (HLDQLFHN#STLN#STLDYEIR) from protein Mcd4.
For example, PNGase F has some restrictions for glycan and peptide structures,36,37 including that it cannot cleave the glycan when the first GlcNAc is bound to an α1−3 fucose.35 Besides enzymatic methods, chemical methods have also been studied for completely or partially removing glycans from glycopeptides or glycoproteins;38−40 these were employed to improve peptide and O-glycosylation site identification by MS41−43 and to analyze glycans.44,45 However, they have not been applied to comprehensively map protein N-glycosylation sites in complex biological samples in combination with LC−MS. In this work, we combined a chemical deglycosylation method and MS-based proteomics techniques to identify protein N-glycosylation sites on a large scale in complex biological samples. The chemical deglycosylation method is based on property differences of the covalent bonds involved in protein N-linked glycans. More specifically (as shown in Figure 1a), the amide bond of the innermost N-linked GlcNAc and the glycosidic bonds of the rest of the sugars differ markedly in chemical reactivity. After the chemical deglycosylation process, only the innermost N-linked GlcNAc is retained as a mass tag for MS analysis. Unlike enzymatic methods, which are hindered by the inherent specificity of each enzyme, the chemical deglycosylation method is not affected by structural variation because it is based on the chemical property differences between the amide bond of the N-linkage and the glycosidic bond of the O-linkage of sugars. We chose yeast as a model system and utilized the well-established lectin method to enrich glycopeptides from yeast whole cell lysates. Then, using the method presented here, we successfully identified the yeast protein N-glycosylation sites on a large scale. Comprehensive analysis of protein glycosylation will assist in a better understanding of glycoprotein structures, activities, and functions.
■
MATERIALS AND METHODS
Cell Culture, Cell Lysis, and Protein Extraction and Digestion
BY4742 MAT alpha yeast (Saccharomyces cerevisiae) were grown in YPD overnight. Cells were harvested by centrifugation once the optical density (OD) at 600 nm was about 1.0. The cells were lysed with a MiniBeadbeater (Biospec) in a lysis buffer containing 50 mM Tris (pH = 8.2), 8 M urea, 75 mM NaCl, 50 mM NaF, 2% SDS, and one protease inhibitor cocktail tablet per 10 mL of buffer (Complete mini, EDTA-free, Roche). After centrifuging at 13,000 rpm, the supernatants were transferred to new tubes, and the protein concentration in the lysate was determined by the BCA protein assay (Pierce). Disulfide bonds within proteins were reduced with 5 mM DTT (56 °C, 25 min) and subsequently alkylated with 15 mM iodoacetamide (RT, 30 min in the dark). Excess iodoacetamide was quenched with 5 mM DTT (RT, 15 min in the dark), after which proteins were purified by trichloroacetic acid (TCA) precipitation. The purified proteins were digested with trypsin (Promega) at 37 °C overnight (∼15 h). The protein:trypsin ratio was ∼100:1 in a buffer containing 50 mM Tris (pH = 8.2) and 1.5 M urea.
followed by up to 20 MS/MS in the LTQ for the most intense ions. The selected ions were excluded from further analysis for 90 s. Singly charged and unassigned ions were not sequenced. Maximum ion accumulation times were 1000 ms for each full MS scan and 100 ms for MS/MS scans.
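The data-dependent acquisition settings given above (up to 20 MS/MS events per cycle, a 90 s exclusion of already selected ions, and no sequencing of singly charged or unassigned ions) can be illustrated with a minimal sketch. This is not the instrument firmware logic; the `Precursor` type, the `select_precursors` function, and the `mz_tol` matching window are hypothetical names and values introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Precursor:
    mz: float         # precursor m/z observed in the full MS scan
    charge: int       # assigned charge state; 0 means unassigned
    intensity: float  # signal intensity in the full MS scan


def select_precursors(scan, now_s, excluded_until,
                      top_n=20, exclusion_s=90.0, mz_tol=0.01):
    """Pick up to top_n precursors from one full MS scan.

    excluded_until maps an m/z value to the time (s) at which its
    dynamic exclusion expires; it is updated in place.
    mz_tol is an assumed matching window, not a value from the paper.
    """
    # Drop exclusions that have expired by the time of this scan.
    for mz in [m for m, t in excluded_until.items() if t <= now_s]:
        del excluded_until[mz]

    # Keep multiply charged ions that are not currently excluded.
    eligible = [
        p for p in scan
        if p.charge >= 2
        and not any(abs(p.mz - m) <= mz_tol for m in excluded_until)
    ]

    # Most intense ions first, at most top_n per cycle.
    chosen = sorted(eligible, key=lambda p: p.intensity, reverse=True)[:top_n]

    # Exclude the selected ions from further analysis for exclusion_s seconds.
    for p in chosen:
        excluded_until[p.mz] = now_s + exclusion_s
    return chosen
```

Calling `select_precursors` once per full MS scan with a shared `excluded_until` dictionary mimics how repeated sequencing of the same abundant ion is avoided, which leaves instrument time for lower-abundance glycopeptides.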
Database Searches and Data Filtering
All MS/MS spectra were searched using the SEQUEST algorithm (version 28).53 Precursors selected for MS/MS fragmentation were checked and corrected for incorrect monoisotopic peak assignments.54 Spectra were matched against a database encompassing sequences of all proteins in the yeast ORFs database (S288C 2010) downloaded from the Saccharomyces Genome Web site (http://www.yeastgenome.org/). Each protein sequence was listed in both the forward and reversed orientations to estimate the false discovery rates (FDR) associated with peptide identification. The following parameters were used for the database search: 20 ppm precursor mass tolerance, 1.0 Da product ion mass tolerance, fully tryptic digestion, and up to two missed cleavages. The variable modifications used were oxidation of methionine (+15.9949 Da) and a GlcNAc tag on Asn (+203.0794 Da), and the only fixed modification was carbamidomethylation of cysteine (+57.0214 Da). The target-decoy method was employed to control FDRs of glycopeptide identification.55 Linear discriminant analysis (LDA) was used to distinguish correct and incorrect peptide identifications using parameters such as Xcorr, ΔCn, and precursor mass accuracy.54 This approach is similar to other methods in the literature.56−58 Peptides shorter than six amino acids were discarded, and peptides were filtered so that only those with an FDR of less than 1%, based on the number of decoy sequences, remained in the final data set. When determining FDRs, the data set excluded peptides that were not glycosylated and thus included only glycopeptides. This allowed us to evaluate FDRs specifically for glycopeptide identification.
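The filtering logic described above (glycopeptides only, a minimum length of six residues, and a 1% FDR estimated from reversed-sequence hits) can be sketched as follows. This is a simplified illustration, not the authors' software: the `PSM` record, the single `score` field standing in for the LDA discriminant, and the decoys/targets FDR estimate are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class PSM:
    peptide: str    # peptide sequence; '#' follows a GlcNAc-tagged Asn
    score: float    # discriminant score, higher is better (hypothetical)
    is_decoy: bool  # True if the match is to a reversed sequence


def filter_at_fdr(psms, max_fdr=0.01, min_length=6):
    """Return target glycopeptide matches passing the FDR threshold.

    FDR at a score cutoff is estimated here as decoys / targets among
    all matches scoring at or above the cutoff (an assumed convention).
    """
    # Restrict to glycopeptides of sufficient length before estimating FDR.
    candidates = [
        p for p in psms
        if "#" in p.peptide and len(p.peptide.replace("#", "")) >= min_length
    ]
    candidates.sort(key=lambda p: p.score, reverse=True)

    decoys = targets = 0
    best_cut = 0  # number of top-ranked matches to accept
    for rank, p in enumerate(candidates, start=1):
        if p.is_decoy:
            decoys += 1
        else:
            targets += 1
        fdr = decoys / targets if targets else 1.0
        if fdr <= max_fdr:
            best_cut = rank  # deepest cutoff that still satisfies the FDR

    return [p for p in candidates[:best_cut] if not p.is_decoy]
```

Restricting the candidate list to glycopeptides before counting decoys mirrors the statement that non-glycosylated peptides were excluded when FDRs were determined.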
Glycopeptide Enrichment
Glycopeptides were enriched with lectins, an approach that has been extensively reported in the literature.46−51 Protein digestion was quenched by adding 10% TFA to a final concentration of 0.4%, and the resulting peptides were purified using tC18 Sep-Pak cartridges (Waters). After purification, the peptide mixture was dissolved in a binding buffer (pH = 7.3) containing 20 mM Tris, 0.5 M NaCl, 1 mM CaCl2, and 1 mM MnCl2 and incubated for 10 min with a mixture of agarose beads bound to two types of lectins (WGA and Con A). After incubation, the beads were washed with the binding buffer three times. Finally, glycosylated peptides were eluted with a buffer containing 20 mM ethylenediamine, 200 mM methyl α-D-mannoside, and 200 mM N-acetyl-D-glucosamine.
Chemical Deglycosylation
Enriched peptides were dried in a lyophilizer for at least 24 h prior to chemical deglycosylation. Next, the dry peptides were cooled in an ethanol/dry ice bath for at least 1 min. A 50-μL mixture of trifluoromethanesulfonic acid (TFMS) and toluene was slowly added to the peptides. The concentration of toluene in the mixture varied between 0 and 10%. The reaction temperature (−20, 4, and 25 °C) and time (1, 2, 4, and 8 h) were also varied to determine the optimal reaction conditions. The reaction was quenched with a solution containing pyridine, methanol, and water at a ratio of 3:1:1. The solution was neutralized with 0.5% ammonium bicarbonate, and the resulting deglycosylated peptides were dried and purified using tC18 Sep-Pak cartridges (Waters) for MS analysis.
Glycosylation Site Localization
To assign glycosylation site localizations and measure the assignment confidence, we applied a probabilistic algorithm59 that considers all glycoforms of a peptide and uses the presence or absence of experimental fragment ions unique to each to calculate a Modscore, which is similar to Ascore.60 The Modscore indicates the likelihood that the best site match is correct when compared to the next best match. We considered sites with a score ≥19 (99% confidence) to be confidently localized.
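As a rough illustration of how a threshold of 19 relates to the quoted 99% confidence, the sketch below assumes that the Modscore follows the same −10·log10(p) scaling used by Ascore. That scaling is an assumption made here for illustration; the text above states only that the two scores are similar. The function names are hypothetical.

```python
import math


def score_to_error_probability(score: float) -> float:
    """Probability that the best site assignment arose by chance,
    assuming score = -10 * log10(p)."""
    return 10 ** (-score / 10)


def error_probability_to_score(p: float) -> float:
    """Inverse mapping under the same assumed scaling."""
    return -10 * math.log10(p)


def is_confidently_localized(score: float, threshold: float = 19.0) -> bool:
    """Apply the localization cutoff used in this work (score >= 19)."""
    return score >= threshold
```

Under this assumed scaling, a score of 19 corresponds to p ≈ 0.0126 and a score of 20 to p = 0.01, which is consistent with the roughly 99% confidence associated with the cutoff.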
Endo H Treatment
Enriched peptides were dissolved in 20 μL of 50 mM sodium citrate (pH = 5.5), and 5 μL of Endo H (New England BioLabs) was added to the solution. The reaction mixture was incubated overnight (∼15 h) at 37 °C to remove glycans. The resulting samples were then dried and purified for MS analysis.
■
RESULTS AND DISCUSSION For protein N-glycosylation in eukaryotes, the core glycan usually contains two GlcNAc residues and several mannose residues. The linkage between the first GlcNAc and Asn within the protein is always an amide bond, while the linkages between individual carbohydrates within the glycans are glycosidic bonds. These C−N and C−O bonds have different chemical properties. For example, the glycosidic bond is much more vulnerable to solvolysis than the amide bond. On the basis of this difference, chemical methods can be utilized to selectively cleave all O-linkages while leaving the N-linkages intact (see Figure 1a). Previously, acids such as hydrogen fluoride (HF) and TFMS have been used for the chemical deglycosylation of glycoproteins.38,39,61 Since HF is relatively inefficient for deglycosylation and difficult to handle
LC−MS/MS Analyses
Dried peptide samples were dissolved in a solution of 5% ACN and 4% formic acid (FA), and 2 μL of the resulting solution was loaded onto a microcapillary column packed with C18 beads (Magic C18AQ, 5 μm, 200 Å, 100 μm × 16 cm, Michrom Bioresources) by a Dionex WPS-3000TPLRS autosampler (UltiMate 3000 thermostatted Rapid Separation Pulled Loop Wellplate Sampler). Peptides were separated by reversed-phase chromatography using an UltiMate 3000 binary pump with a 90 min gradient of 4−30% ACN (in 0.125% FA). Peptides were detected with a data-dependent Top20 method52 in a hybrid dual-cell quadrupole linear ion trap−Orbitrap mass spectrometer (LTQ Orbitrap Elite, ThermoFisher, with Xcalibur 2.0.7 SP1 software). For each cycle, one full MS scan (resolution 60,000) in the Orbitrap at 10^6 AGC target was
due to its toxicity, TFMS was used for the experiments presented here. Purified peptides from a yeast whole cell lysate were enriched with a mixture of two lectins: Concanavalin A (ConA), which binds to mannose, and wheat germ agglutinin (WGA), which binds to sialic acid and GlcNAc. This lectin enrichment method has been extensively applied for separating glycopeptides and glycoproteins.20,32,62 After lectin enrichment, the sample was split equally into four samples: two for parallel chemical deglycosylation experiments and two for Endo H experiments. All samples were completely dried in a lyophilizer for at least 24 h. Next, samples were subjected to the chemical deglycosylation reaction in which all sugars through glycosidic bond linkage were removed and only the innermost N-linked GlcNAc remained. Once we finished performing the chemical deglycosylation process, peptides were purified and analyzed using an online LC−MS system, as displayed in Figure 1b. The resulting raw files were searched against the yeast database to find peptides containing Asn tagged with GlcNAc. A sample tandem mass spectrum is displayed in Figure 1c. The glycopeptide HLDQLFHN#STLN#STLDYEIR (# denotes N-glycosylation sites) was identified with a mass accuracy of 1.1 ppm and Xcorr of 5.7. This glycopeptide is from protein Mcd4, which is highly conserved among eukaryotes, and is involved in GPI anchor synthesis. Based on the accurate precursor mass and fragments in the MS2 spectrum, the peptide and glycosylation sites at N198 and N202 could be confidently identified. Due to the complexity of biological samples, optimization of chemical deglycosylation reaction conditions is very necessary for comprehensive identification of protein N-glycosylation sites. First, the effect that the reaction time and temperature had on the comprehensive identification of glycopeptides was investigated. 
Fewer unique glycopeptides were identified when the reaction was carried out at room temperature compared to 4 °C and −20 °C (Figure 2a). A potential reason for this is that TFMS is a strong acid and will cause peptides to decompose at higher temperatures. When the reaction was carried out for longer than 2 h at room temperature, the number of the identified glycopeptides decreased dramatically. At increased reaction times, it is expected that the reaction would be closer to completion and thus more glycopeptides could be identified. However, at increased reaction times, peptide decomposition is also more likely to occur. Therefore, it is important to find the optimal conditions that maximize the O-linked sugar removal and minimize peptide decomposition. Based on the experiments performed, the greatest number of glycopeptides was identified when the reaction was run for either 2 or 4 h at −20 °C. Toluene was previously reported to be useful in preventing glycoprotein decomposition and potential side reactions involving amino acid residue side chains during the removal of O-linked sugars. When used with TFMS during chemical deglycosylation, toluene may serve as a radical scavenger.63 In order to find these optimal reaction conditions, another set of parallel experiments with varying toluene concentrations and reaction times was performed. The results shown in Figure 2b explicitly demonstrate that significantly fewer glycopeptides were identified when no toluene was present across all reaction times. However, there was no significant difference in the number of glycopeptides identified when a different percent of toluene was used in our experiment (2%, 5%, or 10%).
Figure 2. Optimization of the chemical deglycosylation reaction for comprehensively identifying protein N-glycosylation. (a) The effect of temperature and reaction time. (b) The effect of toluene concentration and reaction time.
To test the reproducibility of the new method, two parallel experiments were run, each using 250 μg of purified peptides from the same yeast whole cell lysate. From these experiments, 1327 and 1315 total glycopeptides were identified with a false positive rate of <1% [...] 19, which corresponds to >99% confidence of localization. Among these well-localized sites, the vast majority (397 sites) contained the N-glycosylation consensus motif, NXS/T (X refers to any amino acid except proline), and 4 sites had another well-known protein N-glycosylation motif, NXC. Only a small number of sites did not contain either of the two motifs. As reported previously, the protein N-glycosylation motif is not very strict.64,65 Table 1 lists some of the protein glycosylation sites that were identified in this work. One example is Pho11, which is one of three repressible acid phosphatases. It is a glycoprotein that is transported to the cell surface by the secretory pathway. According to the UniProt Web site,66 Pho11 was predicted to have 10 potential N-glycosylation sites. In this work, 7 of the 10 predicted glycosylation sites were identified: 97, 162, 315, 390, 439, 445, and 461. Only three predicted sites were not identified: 192, 250, and 356. Among the 219 glycoproteins identified, 38% contained a single N-glycosylation site and 22% contained two sites (Figure 3c). Additionally, there were 22 glycoproteins that contained at least 6 N-glycosylation sites.
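The motif check described above can be sketched in a few lines. The function below takes a peptide written in the notation of Table 1 (a `#` after each GlcNAc-tagged Asn, flanking residues removed) and reports whether each tagged site sits in an NXS/T or NXC context with X not proline. The function name and return format are hypothetical choices made for this illustration.

```python
def classify_sites(tagged_peptide: str):
    """Return (1-based position, motif) for every '#'-tagged Asn.

    motif is 'N-X-S/T', 'N-X-C', or 'none'. Sites too close to the
    C-terminus to show the full three-residue window are reported
    as 'none', because the motif cannot be judged from the peptide alone.
    """
    seq = tagged_peptide.replace("#", "")

    # Map each '#' back to the index of the preceding residue in seq.
    positions = []
    tags_seen = 0
    for i, ch in enumerate(tagged_peptide):
        if ch == "#":
            positions.append(i - tags_seen - 1)
            tags_seen += 1

    results = []
    for pos in positions:
        window = seq[pos:pos + 3]
        motif = "none"
        if len(window) == 3 and window[0] == "N" and window[1] != "P":
            if window[2] in "ST":
                motif = "N-X-S/T"
            elif window[2] == "C":
                motif = "N-X-C"
        results.append((pos + 1, motif))
    return results
```

For the doubly glycosylated Axl2 peptide VN#ESFTFQISN#DTYK from Table 1, both tagged residues (peptide positions 2 and 11) fall in NXS/T windows (NES and NDT).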
The current chemical deglycosylation method was also compared with a previously used enzymatic method. Several enzymes, including Endo D and Endo H, can cleave the linkage between the two GlcNAcs in the core of N-glycans. Endo D cleaves paucimannose glycans, while Endo H cleaves high mannose glycans and some hybrid oligosaccharides. For the following experiments Endo H was chosen because high-mannose N-glycans are the predominant type of glycan found in yeast.67 For the two parallel experiments, purified peptides were from the same yeast whole cell lysate and all other experimental parameters were consistent except for the cleavage methods. From the two experiments, 1014 and 1027 total glycopeptides were identified, respectively. A total of 379 glycosylation sites in 180 proteins were found from the two experiments (Supplemental Table 2). Figure 4 shows the comparison of the results from the two methods. Using chemical deglycosylation, 555 N-glycosylation sites were identified, which is 46% more than the 379 sites identified using the Endo H method. The results showed 289 common sites between the two experiments. Altogether, 645 protein N-glycosylation sites were identified in this work (Supplemental Table 3). From the two experiments, 219 and 180 glycoproteins were identified, respectively, and 149 were common between both experiments. In a recent report, two enzymes, trypsin and Glu-C, were individually used to increase the sequence coverage.32 Combining the results from the two enzymes yielded a total of 805 sites, among which 516 sites were considered to be well localized. Between the results obtained by this chemical deglycosylation method and those reported previously, 331 common sites were identified. All 250 glycoproteins identified in this work were clustered using the Database for Annotation, Visualization and Integrated Discovery (DAVID)68 to determine Gene Ontology (GO)
Figure 3. Experimental results of N-glycosylation identification by the chemical deglycosylation method. (a) Comparison of glycopeptides, glycosites, and glycoproteins from the two parallel experiments. (b) Overlap of glycosylation sites identified from the two experiments. (c) Distribution of glycoproteins with single, double, and multiple glycosylation sites.
Figure 4. Comparison of the glycosylation sites and glycoproteins identified from the chemical deglycosylation and Endo H methods. (a) Comparison of N-glycosylation sites. (b) Overlap of N-glycoproteins.
Table 1. Examples of Identified Protein Glycosylation Sites
protein   annotation
Axl2      integral plasma membrane protein required for axial budding in haploid cells
Ecm39     α-1,6-mannosyltransferase
Pbn1      essential component of glycosylphosphatidylinositol-mannosyltransferase I
Pho11     one of three repressible acid phosphatases, a glycoprotein that is transported to the cell surface by the secretory pathway
site   peptide sequence^a                 Xcorr   PPM
41     R.VN#ESFTFQISN#DTYK.S              3.25    1.01
50     R.VN#ESFTFQISN#DTYK.S              3.25    1.01
163    K.LDPNEVFN#VTFDR.S                 2.38    −0.33
403    K.FQSSN#LTLAGEVPK.N                2.59    0.58
483    K.TTNHWELVN#TTK.M                  2.94    0.68
24     R.QN#DTHLTVR.G                     1.71    0.4
85     R.SLSVIENELSAGFSVYSN#SSDVPER.F     3.29    −1.45
212    K.QGHIAYN#HSTTTTSLYLNEPIGLHPK.I    4.34    −0.38
365    R.HLN#STTLLVPIPRPDTK.D             3.80    0.37
97     K.LSN#YTGQFSGALSFLNDDYEFFIR.D      4.19    0.48
162    R.DFLAQYGYM*VEN#QTSFAVFTSNSNR.C    3.69    1.07
315    R.SVGANLFN#ASVK.L                  2.72    −0.64
390    R.VYTEKFQCSN#DTYVR.Y               2.95    −5.24
439    K.VCN#VSSVSN#STELTFFWDWNTK.H       2.61    3.38
445    K.VCN#VSSVSN#STELTFFWDWNTK.H       2.61    3.38
461    K.HYN#DTLLKQ.-                     3.19    0.3
^a # refers to the identified N-glycosylation site.
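The site and protein counts reported for the two cleavage methods can be cross-checked with simple inclusion-exclusion arithmetic. The short sketch below uses only numbers quoted in the text; the helper names are illustrative.

```python
def percent_increase(new: int, old: int) -> float:
    """Relative increase of new over old, in percent."""
    return 100.0 * (new - old) / old


def union_size(a: int, b: int, common: int) -> int:
    """Size of the union of two sets given their sizes and overlap."""
    return a + b - common


# Counts quoted in the text for chemical deglycosylation vs Endo H.
SITES_CHEMICAL, SITES_ENDO_H, SITES_COMMON = 555, 379, 289
PROTEINS_CHEMICAL, PROTEINS_ENDO_H, PROTEINS_COMMON = 219, 180, 149

site_gain = percent_increase(SITES_CHEMICAL, SITES_ENDO_H)
total_sites = union_size(SITES_CHEMICAL, SITES_ENDO_H, SITES_COMMON)
total_proteins = union_size(PROTEINS_CHEMICAL, PROTEINS_ENDO_H, PROTEINS_COMMON)
```

Evaluated as written, `site_gain` rounds to 46%, `total_sites` is 645, and `total_proteins` is 250, matching the figures stated for the combined data set.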
contain high mannose, this method can be extensively applied to many different types of samples for the comprehensive identification of protein N-glycosylation sites by MS.
categories overrepresented in the glycoproteome. As expected, the majority of the identified glycoproteins (184) were membrane proteins, which were significantly enriched, shown by a p-value of 3.5 × 10^−36. There were 43 glycoproteins located in the endoplasmic reticulum (ER) and 34 proteins located in the ER membrane. Glycoproteins in the nuclear envelope-endoplasmic reticulum network, cell wall, Golgi apparatus, and vacuole were also overrepresented, as shown in Table 2. The results from the biological process analysis
■
Protein N-glycosylation sites identified by the chemical deglycosylation method (Supplemental Table 1); protein N-glycosylation sites identified by the Endo H method (Supplemental Table 2); all glycosylation sites identified in this work (Supplemental Table 3). This material is available free of charge via the Internet at http://pubs.acs.org.
Table 2. Clustering of All Identified Glycoproteins Based on Cellular Compartments and Biological Processes
term                                 no. of proteins   %      p-value
cellular compartments
  membrane                           184               73.3   3.50 × 10^−36
  ER                                 43                17.1   6.70 × 10^−20
  ER membrane                        34                13.5   2.70 × 10^−14
  nuclear envelope-ER network        34                13.5   1.10 × 10^−13
  cell wall                          27                10.8   1.30 × 10^−13
  external encapsulating structure   27                10.8   1.30 × 10^−13
  Golgi apparatus                    18                7.2    3.30 × 10^−4
  vacuole                            15                6.0    1.30 × 10^−3
biological processes
  carbohydrate metabolic process     58                23.1   2.00 × 10^−19
  cell wall organization             35                13.9   6.70 × 10^−9
  protein metabolic process          116               46.2   2.00 × 10^−7
  phospholipid metabolic process     16                6.4    2.20 × 10^−4
  transport                          69                27.5   4.60 × 10^−2
■
ASSOCIATED CONTENT
S Supporting Information *
AUTHOR INFORMATION
Corresponding Author
*Phone: 404-385-1515. Fax: 404-894-7452. E-mail: ronghu.[email protected].
Notes
The authors declare no competing financial interest.
■
ACKNOWLEDGMENTS We thank Dr. Steve P. Gygi at Harvard Medical School for allowing us to use their software. This work was supported by a start-up fund from Georgia Institute of Technology.
■
REFERENCES
(1) Varki, A.; Cummings, R. D.; Esko, J. D.; Freeze, H. H.; Stanley, P.; Bertozzi, C. R.; Hart, G. W.; Etzler, M. E. Essentials of Glycobiology, 2nd ed.; Cold Spring Harbor Laboratory Press: Cold Spring Harbor, NY, 2008.
(2) Spiro, R. G. Protein glycosylation: Nature, distribution, enzymatic formation, and disease implications of glycopeptide bonds. Glycobiology 2002, 12, 43R−56R.
(3) van Kooyk, Y.; Rabinovich, G. A. Protein-glycan interactions in the control of innate and adaptive immune responses. Nat. Immunol. 2008, 9, 593−601.
(4) Marth, J. D.; Grewal, P. K. Mammalian glycosylation in immunity. Nat. Rev. Immunol. 2008, 8, 874−887.
(5) Haltiwanger, R. S.; Lowe, J. B. Role of glycosylation in development. Annu. Rev. Biochem. 2004, 73, 491−537.
(6) Gabius, H. The Sugar Code; Wiley-VCH Verlag GmbH & Co.: Weinheim, 2009.
(7) Gilgunn, S.; Conroy, P. J.; Saldova, R.; Rudd, P. M.; O’Kennedy, R. J. Aberrant psa glycosylation-a sweet predictor of prostate cancer. Nat. Rev. Urol. 2013, 10, 99−107.
(8) Ju, T. Z.; Otto, V. I.; Cummings, R. D. The tn antigen-structural simplicity and biological complexity. Angew. Chem., Int. Ed. 2011, 50, 1770−1791.
(9) Apweiler, R.; Hermjakob, H.; Sharon, N. On the frequency of protein glycosylation, as deduced from analysis of the swiss-prot database. Biochim. Biophys. Acta, Gen. Subj. 1999, 1473, 4−8.
(10) Wu, J.; Xie, X. L.; Liu, Y. S.; He, J. T.; Benitez, R.; Buckanovich, R. J.; Lubman, D. M. Identification and confirmation of differentially expressed fucosylated glycoproteins in the serum of ovarian cancer patients using a lectin array and LC−MS/MS. J. Proteome Res. 2012, 11, 4541−4552.
(11) Taylor, A. D.; Hancock, W. S.; Hincapie, M.; Taniguchi, N.; Hanash, S. M. Towards an integrated proteomic and glycomic approach to finding cancer biomarkers. Genome Med. 2009, 1, 57.
(12) Tian, Y.; Zhang, H. Glycoproteomics and clinical applications. Proteomics: Clin. Appl. 2010, 4, 124−132.
(13) Patwa, T.; Li, C.; Simeone, D. M.; Lubman, D. M. Glycoprotein analysis using protein microarrays and mass spectrometry. Mass Spectrom. Rev. 2010, 29, 830−844.
showed that 58 of 251 identified glycoproteins were involved in the carbohydrate metabolic process, and 116 glycoproteins were associated with the protein metabolic process. In addition, glycoproteins related to cell wall organization and transportation were also enriched.
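Enrichment p-values of the kind listed in Table 2 are commonly derived from a hypergeometric (one-sided Fisher exact) test. The sketch below shows that calculation in its plain form. DAVID reports its own modified Fisher exact statistic, so this sketch is not expected to reproduce the tabulated values, and the background counts used in the example call are hypothetical placeholders, not numbers from this work.

```python
from math import comb


def hypergeom_enrichment_p(k: int, n: int, K: int, N: int) -> float:
    """Upper-tail hypergeometric probability P(X >= k).

    N: proteins in the background (e.g., the whole proteome)
    K: background proteins annotated with the term
    n: proteins in the list of interest (e.g., identified glycoproteins)
    k: list proteins annotated with the term
    """
    total = comb(N, n)
    upper = min(n, K)
    # Sum the probabilities of drawing i annotated proteins, i = k..upper.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Example call: 184 of 251 glycoproteins annotated as membrane proteins.
# K=1700 and N=6600 are hypothetical background values for illustration.
example_p = hypergeom_enrichment_p(k=184, n=251, K=1700, N=6600)
```

Exact integer arithmetic via `math.comb` avoids the underflow that floating-point factorials would hit for list sizes in the hundreds.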
■
CONCLUSIONS The heterogeneity of glycans in glycoproteins makes it extraordinarily challenging to comprehensively identify protein glycosylation sites using MS. Several enzymatic methods have been reported to completely or partially cleave glycans and leave a mass tag for MS analysis. However, one of the greatest disadvantages of enzymatic methods is that every enzyme has some restrictions for cleaving glycans. Based on differences in the chemical properties of O- and N-linkages of sugars, a chemical deglycosylation method that selectively cleaves all O-linked sugars and leaves the innermost N-linked GlcNAc as a mass tag for MS analysis was combined with LC−MS-based proteomics techniques for comprehensive analysis of protein N-glycosylation sites in yeast whole cell lysates. With this method, 555 protein N-glycosylation sites were identified from two parallel runs in yeast whole cell lysates, which is 46% more than those identified in two parallel experiments using the Endo H method. Among the 250 glycoproteins identified in this work, there are 184 membrane proteins, which is consistent with the fact that membrane proteins are typically heavily glycosylated. Compared to enzymatic methods, this chemical deglycosylation method is more cost-efficient, generic, and effective. Although we used yeast as a model system in our experiment, in which glycans
(14) Zhang, Z. H.; Tan, M. J.; Xie, Z. Y.; Dai, L. Z.; Chen, Y.; Zhao, Y. M. Identification of lysine succinylation as a new post-translational modification. Nat. Chem. Biol. 2011, 7, 58−63.
(15) Kettenbach, A. N.; Wang, T. B.; Faherty, B. K.; Madden, D. R.; Knapp, S.; Bailey-Kellogg, C.; Gerber, S. A. Rapid determination of multiple linear kinase substrate motifs by mass spectrometry. Chem. Biol. 2012, 19, 608−618.
(16) Yates, J. R.; Ruse, C. I.; Nakorchevsky, A. In Annual Review of Biomedical Engineering; Annual Reviews: Palo Alto, 2009; Vol. 11, pp 49−79.
(17) Zhu, Z.; Su, X.; Clark, D. F.; Go, E. P.; Desaire, H. Characterizing O-linked glycopeptides by electron transfer dissociation: Fragmentation rules and applications in data analysis. Anal. Chem. 2013, 85, 8403−8411.
(18) Wu, R. H.; Haas, W.; Dephoure, N.; Huttlin, E. L.; Zhai, B.; Sowa, M. E.; Gygi, S. P. A large-scale method to measure absolute protein phosphorylation stoichiometries. Nat. Methods 2011, 8, 677−U111.
(19) Wu, R. H.; Dephoure, N.; Haas, W.; Huttlin, E. L.; Zhai, B.; Sowa, M. E.; Gygi, S. P. Correct interpretation of comprehensive phosphorylation dynamics requires normalization by protein expression changes. Mol. Cell. Proteomics 2011, 10, No. M111.009654.
(20) Breidenbach, M. A.; Palaniappan, K. K.; Pitcher, A. A.; Bertozzi, C. R. Mapping yeast N-glycosites with isotopically recoded glycans. Mol. Cell. Proteomics 2012, 11, No. M111.015339.
(21) Witze, E. S.; Old, W. M.; Resing, K. A.; Ahn, N. G. Mapping protein post-translational modifications with mass spectrometry. Nat. Methods 2007, 4, 798−806.
(22) Tao, W. A.; Wollscheid, B.; O’Brien, R.; Eng, J. K.; Li, X. J.; Bodenmiller, B.; Watts, J. D.; Hood, L.; Aebersold, R. Quantitative phosphoproteome analysis using a dendrimer conjugation chemistry and tandem mass spectrometry. Nat. Methods 2005, 2, 591−598.
(23) Alvarez-Manilla, G.; Atwood, J.; Guo, Y.; Warren, N. L.; Orlando, R.; Pierce, M. Tools for glycoproteomic analysis: Size exclusion chromatography facilitates identification of tryptic glycopeptides with N-linked glycosylation sites. J. Proteome Res. 2006, 5, 701−708.
(24) Dalpathado, D. S.; Desaire, H. Glycopeptide analysis by mass spectrometry. Analyst 2008, 133, 731−738.
(25) Mertins, P.; Qiao, J. W.; Patel, J.; Udeshi, N. D.; Clauser, K. R.; Mani, D. R.; Burgess, M. W.; Gillette, M. A.; Jaffe, J. D.; Carr, S. A. Integrated proteomic analysis of post-translational modifications by serial enrichment. Nat. Methods 2013, 10, 634−637.
(26) Zhu, Z. K.; Hua, D.; Clark, D. F.; Go, E. P.; Desaire, H. GlycoPep Detector: A tool for assigning mass spectrometry data of N-linked glycopeptides on the basis of their electron transfer dissociation spectra. Anal. Chem. 2013, 85, 5023−5032.
(27) Zhang, H. Q.; Wang, Z. H.; Stupak, J.; Ghribi, O.; Geiger, J. D.; Liu, Q. Y.; Li, J. J. Targeted glycomics by selected reaction monitoring for highly sensitive glycan compositional analysis. Proteomics 2012, 12, 2510−2522.
(28) Dube, D. H.; Bertozzi, C. R. Glycans in cancer and inflammation. Potential for therapeutics and diagnostics. Nat. Rev. Drug Discovery 2005, 4, 477−488.
(29) Zhang, H.; Li, X. J.; Martin, D. B.; Aebersold, R. Identification and quantification of N-linked glycoproteins using hydrazide chemistry, stable isotope labeling and mass spectrometry. Nat. Biotechnol. 2003, 21, 660−666.
(30) Malerod, H.; Graham, R. L. J.; Sweredoski, M. J.; Hess, S. Comprehensive profiling of N-linked glycosylation sites in HeLa cells using hydrazide enrichment. J. Proteome Res. 2013, 12, 248−259.
(31) Kuster, B.; Mann, M. O-18-labeling of N-glycosylation sites to improve the identification of gel-separated glycoproteins using peptide mass mapping and database searching. Anal. Chem. 1999, 71, 1431−1440.
(32) Zielinska, D. F.; Gnad, F.; Schropp, K.; Wisniewski, J. R.; Mann, M. Mapping N-glycosylation sites across seven evolutionarily distant species reveals a divergent substrate proteome despite a common core machinery. Mol. Cell 2012, 46, 542−548.
(33) Hagglund, P.; Bunkenborg, J.; Elortza, F.; Jensen, O. N.; Roepstorff, P. A new strategy for identification of N-glycosylated proteins and unambiguous assignment of their glycosylation sites using HILIC enrichment and partial deglycosylation. J. Proteome Res. 2004, 3, 556−566.
(34) Segu, Z. M.; Hussein, A.; Novotny, M. V.; Mechref, Y. Assigning N-glycosylation sites of glycoproteins using LC/MSMS in conjunction with endo-m/exoglycosidase mixture. J. Proteome Res. 2010, 9, 3598−3607.
(35) Tretter, V.; Altmann, F.; Marz, L. Peptide-N4-(N-acetyl-beta-glucosaminyl)asparagine amidase F cannot release glycans with fucose attached alpha-1−3 to the asparagine-linked N-acetylglucosamine residue. Eur. J. Biochem. 1991, 199, 647−652.
(36) Tarentino, A. L.; Plummer, T. H. Enzymatic deglycosylation of asparagine-linked glycans - purification, properties, and specificity of oligosaccharide-cleaving enzymes from Flavobacterium meningosepticum. Methods Enzymol. 1994, 230, 44−57.
(37) Kuhn, P.; Guan, C.; Cui, T.; Tarentino, A. L.; Plummer, T. H.; Vanroey, P. Active-site and oligosaccharide recognition residues of peptide-N4-(N-acetyl-beta-D-glucosaminyl)asparagine amidase F. J. Biol. Chem. 1995, 270, 29493−29497.
(38) Edge, A. S. B.; Faltynek, C. R.; Hof, L.; Reichert, L. E.; Weber, P. Deglycosylation of glycoproteins by trifluoromethanesulfonic acid. Anal. Biochem. 1981, 118, 131−137.
(39) Edge, A. S. B. Deglycosylation of glycoproteins with trifluoromethanesulphonic acid: Elucidation of molecular structure and function. Biochem. J. 2003, 376, 339−350.
(40) Alley, W. R.; Mann, B. F.; Novotny, M. V. High-sensitivity analytical approaches for the structural characterization of glycoproteins. Chem. Rev. 2013, 113, 2668−2732.
(41) Fryksdale, B. G.; Jedrzejewski, P. T.; Wong, D. L.; Gaertner, A. L.; Miller, B. S. Impact of deglycosylation methods on two-dimensional gel electrophoresis and matrix assisted laser desorption/ionization-time of flight-mass spectrometry for proteomic analysis. Electrophoresis 2002, 23, 2184−2193.
(42) Bellwied, P.; Staubach, S.; Hanisch, F.-G. Chemical in-gel deglycosylation of O-glycoproteins improves their staining and mass spectrometric identification. Electrophoresis 2013, 34, 2387−2393.
(43) Gerken, T. A. O-glycoprotein biosynthesis: Site localization by Edman degradation and site prediction based on random peptide substrates. Methods Mol. Biol. 2012, 842, 81−108.
(44) Triguero, A.; Cabrera, G.; Royle, L.; Harvey, D. J.; Rudd, P. M.; Dwek, R. A.; Bardor, M.; Lerouge, P.; Cremata, J. A. Chemical and enzymatic N-glycan release comparison for N-glycan profiling of monoclonal antibodies expressed in plants. Anal. Biochem. 2010, 400, 173−183.
(45) Kilz, S.; Budzikiewicz, H.; Waffenschmidt, S. In-gel deglycosylation of sodium dodecyl sulfate polyacrylamide gel electrophoresis-separated glycoproteins for carbohydrate estimation by matrix-assisted laser desorption/ionization time-of-flight mass spectrometry. J. Mass Spectrom. 2002, 37, 331−335.
(46) Yang, Z. P.; Hancock, W. S. Approach to the comprehensive analysis of glycoproteins isolated from human serum using a multilectin affinity column. J. Chromatogr. A 2004, 1053, 79−88.
(47) Fanayan, S.; Hincapie, M.; Hancock, W. S. Using lectins to harvest the plasma/serum glycoproteome. Electrophoresis 2012, 33, 1746−1754.
(48) Wang, Y. H.; Wu, S. L.; Hancock, W. S. Approaches to the study of N-linked glycoproteins in human plasma using lectin affinity chromatography and nano-HPLC coupled to electrospray linear ion trap-Fourier transform mass spectrometry. Glycobiology 2006, 16, 514−523.
(49) Nilsson, C. L. Lectin techniques for glycoproteomics. Curr. Proteomics 2011, 8, 248−256.
(50) Geyer, H.; Geyer, R. Strategies for analysis of glycoprotein glycosylation. Biochim. Biophys. Acta, Proteins Proteomics 2006, 1764, 1853−1869.
(51) Lai, Z. W.; Nice, E. C.; Schilling, O. Glycocapture-based proteomics for secretome analysis. Proteomics 2013, 13, 512−525.
dx.doi.org/10.1021/pr401000c | J. Proteome Res. 2014, 13, 1466−1473
(52) Haas, W.; Faherty, B. K.; Gerber, S. A.; Elias, J. E.; Beausoleil, S. A.; Bakalarski, C. E.; Li, X.; Villen, J.; Gygi, S. P. Optimization and use of peptide mass measurement accuracy in shotgun proteomics. Mol. Cell. Proteomics 2006, 5, 1326−1337.
(53) Eng, J. K.; McCormack, A. L.; Yates, J. R. An approach to correlate tandem mass-spectral data of peptides with amino-acid sequences in a protein database. J. Am. Soc. Mass Spectrom. 1994, 5, 976−989.
(54) Huttlin, E. L.; Jedrychowski, M. P.; Elias, J. E.; Goswami, T.; Rad, R.; Beausoleil, S. A.; Villen, J.; Haas, W.; Sowa, M. E.; Gygi, S. P. A tissue-specific atlas of mouse protein phosphorylation and expression. Cell 2010, 143, 1174−1189.
(55) Elias, J. E.; Gygi, S. P. Target-decoy search strategy for increased confidence in large-scale protein identifications by mass spectrometry. Nat. Methods 2007, 4, 207−214.
(56) Du, X.; Callister, S. J.; Manes, N. P.; Adkins, J. N.; Alexandridis, R. A.; Zeng, X.; Roh, J. H.; Smith, W. E.; Donohue, T. J.; Kaplan, S.; Smith, R. D.; Lipton, M. S. A computational strategy to analyze label-free temporal bottom-up proteomics data. J. Proteome Res. 2008, 7, 2595−2604.
(57) Kall, L.; Canterbury, J. D.; Weston, J.; Noble, W. S.; MacCoss, M. J. Semi-supervised learning for peptide identification from shotgun proteomics datasets. Nat. Methods 2007, 4, 923−925.
(58) Zhang, J. Y.; Ma, J.; Dou, L.; Wu, S. F.; Qian, X. H.; Xie, H. W.; Zhu, Y. P.; He, F. C. Bayesian nonparametric model for the validation of peptide identification in shotgun proteomics. Mol. Cell. Proteomics 2009, 8, 547−557.
(59) Beausoleil, S. A.; Villen, J.; Gerber, S. A.; Rush, J.; Gygi, S. P. A probability-based approach for high-throughput protein phosphorylation analysis and site localization. Nat. Biotechnol. 2006, 24, 1285−1292.
(60) Kim, W.; Bennett, E. J.; Huttlin, E. L.; Guo, A.; Li, J.; Possemato, A.; Sowa, M. E.; Rad, R.; Rush, J.; Comb, M. J.; Harper, J. W.; Gygi, S. P. Systematic and quantitative assessment of the ubiquitin-modified proteome. Mol. Cell 2011, 44, 325−340.
(61) Mort, A. J.; Lamport, D. T. A. Specific deglycosylation via anhydrous hydrogen-fluoride. Plant Physiol. 1976, 57, 57−57.
(62) Mechref, Y.; Madera, M.; Novotny, M. V. In Methods in Molecular Biology; Posch, A., Ed.; Humana Press Inc.: Totowa, NJ, 2008; Vol. 424, pp 373−396.
(63) Forbes, J.; Beeley, J. G. Deglycosylation of glycoproteins with trifluoromethanesulphonic acid - use of water-soluble scavengers. Biochem. Soc. Trans. 1989, 17, 737−737.
(64) Zielinska, D. F.; Gnad, F.; Wisniewski, J. R.; Mann, M. Precision mapping of an in vivo N-glycoproteome reveals rigid topological and sequence constraints. Cell 2010, 141, 897−907.
(65) Schwarz, F.; Aebi, M. Mechanisms and principles of N-linked protein glycosylation. Curr. Opin. Struct. Biol. 2011, 21, 576−582.
(66) http://www.uniprot.org/uniprot/P35842.
(67) Hamilton, S. R.; Gerngross, T. U. Glycosylation engineering in yeast: The advent of fully humanized yeast. Curr. Opin. Biotechnol. 2007, 18, 387−392.
(68) Huang, D. W.; Sherman, B. T.; Lempicki, R. A. Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat. Protoc. 2009, 4, 44−57.