An Enzymatic Deglycosylation Scheme Enabling Identification of Core Fucosylated N-Glycans and O-Glycosylation Site Mapping of Human Plasma Proteins

Per Hägglund,‡ Rune Matthiesen,† Felix Elortza,† Peter Højrup, Peter Roepstorff, Ole Nørregaard Jensen, and Jakob Bunkenborg*

Department of Biochemistry and Molecular Biology, University of Southern Denmark, Campusvej 55, DK-5230 Odense M, Denmark

Received January 31, 2007

Global proteome analysis of protein glycosylation is a major challenge due to the inherently heterogeneous and diverse nature of this post-translational modification. It is therefore common to enzymatically remove glycans attached to protein or peptide chains prior to mass spectrometric analysis, thereby reducing the complexity and facilitating glycosylation site determination. Here, we have used two different enzymatic deglycosylation strategies for N-glycosylation site analysis. (1) Removal of entire N-glycan chains by peptide-N-glycosidase (PNGase) digestion, with concomitant deamidation of the released asparagine residue. The reaction is carried out in H₂¹⁸O to facilitate identification of the formerly glycosylated peptide by incorporation of ¹⁸O into the formed aspartic acid residue. (2) Digestion with two endo-β-N-acetylglucosaminidases (Endo D and Endo H) that cleave the glycosidic bond between the two N-acetylglucosamine (GlcNAc) residues in the conserved N-glycan core structure, leaving single GlcNAc residues with putative fucosyl side chains attached to the peptide. To enable digestion of complex and hybrid type N-glycans, a number of exoglycosidases (β-galactosidase, neuraminidase, and N-acetyl-β-glucosaminidase) are also included. The two strategies were applied here to identify 103 N-glycosylation sites in the Cohn IV fraction of human plasma. In addition, Endo D/H digestion uniquely enabled identification of 23 fucosylated N-glycosylation sites. Several O-glycosylated peptides were also identified with a single N-acetylhexosamine attached, arguably due to partial deglycosylation of O-glycan structures by the exoglycosidases used together with Endo D/H.

Keywords: proteomics • post-translational modifications • mass spectrometry • HILIC • glycosylation • diagnostic ions • plasma proteins • fucosylation

* Author to whom correspondence should be addressed. E-mail: [email protected].
† Present address: Cooperative Research Centre on Biosciences (CIC; bioGUNE), Technology Park of Bizkaia, 801 A Building, 48160 Derio, Spain.
‡ Present address: Enzyme and Protein Chemistry, Biocentrum DTU 224124, Technical University of Denmark, DK-2800 Kgs. Lyngby, Denmark.
10.1021/pr0700605 CCC: $37.00 © 2007 American Chemical Society
Journal of Proteome Research 2007, 6, 3021-3031. Published on Web 07/18/2007

Introduction

Rapid developments in mass spectrometry have provided new opportunities for systematic analysis of post-translational modifications (PTMs) on a proteome scale.1,2 This development has led to an improved understanding of the importance of PTMs in various cellular processes and diseases.3 In the context of PTM analysis, glycosylation is a particular challenge. Glycosylation is one of the most common types of PTMs, comprising a class of structurally diverse modifications, including N-, O-, and C-linked glycans with carbohydrate moieties ranging from monosaccharides to large branched oligosaccharide chains composed of 7-40 monomers attached to the polypeptide chain. N-linked glycans are large structures containing a conserved trimannosyl chitobiose pentasaccharide core attached to asparagine residues in an NX(S/T/C) motif where X can be any amino acid except proline.4,5 O-linked glycans are most often attached to the hydroxyl groups of serine or threonine residues. They lack a defined attachment motif or common core structure and are, in general, more structurally diverse than N-glycans.6 C-glycosylation involves a single mannose residue linked to a tryptophan residue through a carbon-carbon bond to the indole moiety.7

Because of the structural complexity of glycans, comprehensive structural analysis of protein glycosylation is predominantly applied to purified proteins. Proteome-scale glycosylation analysis is typically focused on attachment-site analysis. In glycosylation-directed proteomics protocols, methods for enrichment of glycoproteins and/or glycopeptides include lectin affinity chromatography, periodate oxidation followed by hydrazide coupling, hydrophilic interaction chromatography (HILIC), boronic acid chromatography, or glycosylation-specific antibodies.8-12 Identification of glycan attachment points in peptide chains is then facilitated by complete or partial removal of the glycan chain, either by chemical or enzymatic cleavage,13-15 or by fragmentation in the mass spectrometer.16 There are several analytical benefits of removing the major part of the glycan prior to mass spectrometric analysis: different peptide glycoforms are merged into a single species, yielding decreased glycan heterogeneity and increased sensitivity; labile sugar groups are removed, leading to improved backbone fragmentation in collision-induced dissociation (CID) experiments; the mass of the resulting peptide is reduced, leading to less complex tandem MS spectra; and finally, data interpretation is made easier.

Entire N-glycans can be removed by peptide-N-glycosidase F (PNGase F) with concomitant deamidation of the sugar-linked asparagine to aspartic acid. Thus, deamidated asparagine residues provide indirect evidence for N-glycosylation sites. However, as deamidation of asparagine is a common modification in vivo and spontaneous deamidation occurs in vitro, false-positive identifications are a major concern.17 The level of confidence can be increased by performing the enzymatic deglycosylation in H₂¹⁸O.13,14,18,19

An alternative approach is to perform partial deglycosylation and retain a small glycan moiety or a single monosaccharide residue as direct unambiguous proof of glycosylation. High-mannose type N-glycans can be trimmed with α-mannosidase to retain the pentasaccharide core,20 or with endo-β-N-acetylglucosaminidase H (Endo H) to retain only a single N-acetylglucosamine (GlcNAc) residue attached to the peptide backbone.21,22 The retained GlcNAc residue is a valuable tool for reliable glycosylation site identification and can potentially be used for precursor ion scanning.23 The CID fragmentation of N-acetylhexosamine (HexNAc) gives rise to a signature of diagnostic ions that aids spectral interpretation and can be used for postacquisition data filtering. A more general deglycosylation based on endo-β-N-acetylglucosaminidase digestion is limited by the different substrate specificities and pH optima of the glycosidases necessary for extensive trimming of complex and hybrid type glycans.
We have previously described a method for identification of N-glycosylation sites using a combination of Endo D and Endo H together with exoglycosidases (β-galactosidase, neuraminidase, and N-acetyl-β-glucosaminidase).9 This enzyme combination (referred to as Endo D/H) is applicable to all of the major N-glycan types. A number of fucosyltransferases act on the maturing glycan structures in mammalian cells, and the α1-6 fucosyltransferase FUT8 attaches a fucose group to the conserved chitobiose core of N-linked glycans (core-fucosylation).24,25 Fucosyl residues in N-linked glycans have been implicated in a wide variety of biological processes.25 The fucosyl group attached to the innermost GlcNAc residue in some complex and hybrid type glycans is retained on the peptide with the Endo D/H deglycosylation scheme, thus allowing identification.

Here, the utility of PNGase F and Endo D/H as tools for global N-glycosylation site analysis was tested in a gel-free proteomics strategy. The enzymes were applied to HILIC-enriched glycopeptides from a tryptic digest of Cohn IV fraction plasma proteins as a highly complex glycoprotein model system lacking abundant serum albumin and immunoglobulins. Of the 103 N-glycosylation sites identified in total, 13 and 42 sites were uniquely identified using PNGase F and Endo D/H, respectively. In addition, 23 N-glycosylation sites were identified with fucosyl residues attached to the innermost GlcNAc residue in the Endo D/H deglycosylated peptides. By comparing the glycopeptides resulting from the two deglycosylation strategies, we show that the Endo D/H methodology is also applicable for identification of O-glycosylation sites.
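The site counts quoted above also fix how much the two strategies overlap. The following is a minimal sketch of that arithmetic, assuming only that each of the 103 sites was found by at least one of the two strategies; the function and key names are illustrative and not part of the original work.

```python
def overlap_counts(total: int, unique_a: int, unique_b: int) -> dict:
    """Split the union of two result sets into shared and per-method totals."""
    # Sites that are in the union but unique to neither method were seen by both.
    shared = total - unique_a - unique_b
    return {
        "shared": shared,
        "total_a": unique_a + shared,  # all sites seen with method A
        "total_b": unique_b + shared,  # all sites seen with method B
    }


# Method A = PNGase F (13 unique sites), method B = Endo D/H (42 unique sites).
counts = overlap_counts(total=103, unique_a=13, unique_b=42)
print(counts)
```

Under that assumption, 48 sites were identified with both strategies, 61 with PNGase F in total, and 90 with Endo D/H in total.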

Journal of Proteome Research • Vol. 6, No. 8, 2007


Materials and Methods

Sample Preparation/In-Solution Digestion. Cohn fractionation of human plasma proteins was performed at Statens Serum Institut from a plasma pool from healthy Danish blood donors. The Cohn IV fraction was collected as the precipitate at 40% ethanol and pH 5.9 during the ethanol fractionation.26 A total of 1 mg of lyophilized protein paste (Cohn IV fraction of human plasma) was solubilized with the acid-labile detergent RapiGest SF (Waters Corp., Milford, MA) by vortexing vigorously in 300 µL of 0.2% RapiGest in 50 mM ammonium bicarbonate. Insoluble debris was removed by centrifugation, and the supernatant was transferred to a new Eppendorf tube. A total of 15 µL of DTT (100 mM) was added, and the mixture was incubated for 30 min at 60 °C, followed by addition of 9 µL of iodoacetamide (550 mM) and incubation in the dark at room temperature for 30 min. Twenty micrograms of sequence grade trypsin (Promega, Madison, WI) in 300 µL of milliQ-water was added, and the mixture was incubated overnight at 37 °C. To hydrolyze RapiGest SF, the sample was acidified with 60 µL of 500 mM HCl and incubated for 50 min at 40 °C. To remove the hydrolyzed RapiGest, the digest was centrifuged for 10 min at 14 000 rpm, and the aqueous phase was transferred to new vials and kept at -20 °C.

HILIC Chromatography. Hydrophilic interaction chromatography (HILIC) was performed using a zwitterionic (ZIC) resin as the polar stationary phase. HILIC microcolumns were prepared by packing ZIC-HILIC chromatography media (particle size 10 µm) (Sequant, Umeå, Sweden) into GELoader tips (Eppendorf, Hamburg, Germany). The digested samples were lyophilized, redissolved in 80% acetonitrile (ACN) and 5.0% formic acid (FA), and loaded onto HILIC microcolumns equilibrated with 80% ACN and 5.0% FA. Columns were washed twice in 80% ACN and 5.0% FA, and peptides were eluted in 0.5% FA.

Endoglycosidase Digestion.
Endo-β-N-acetylglucosaminidase digestions of tryptic peptides were carried out using a mixture (referred to as Endo D/H) of the following enzymes: 1 mU endoglycosidase D from Streptococcus pneumoniae (ICN Biomedicals, Aurora, OH); 25 mU endoglycosidase H from Streptomyces plicatus (Roche); 2.5 mU neuraminidase from Arthrobacter ureafaciens (Roche); 0.5 mU β-galactosidase from Bos taurus testes (ProZyme, San Leandro, CA); and 0.5 mU N-acetyl-β-glucosaminidase from Diplococcus pneumoniae (Roche). The original enzyme suspensions were stored according to the manufacturer’s instructions at either 4 or -20 °C, and mixed prior to use. To avoid proteolytic digestion of the glycosidases (and the stabilizing proteins added to the glycosidases such as BSA and other contaminants), the protease inhibitor Pefabloc SC (AEBSF) (Roche) was added to a final concentration of 4 mM. Endo D/H digestions were carried out in 100 mM ammonium acetate buffer (pH 5.5) at 37 °C overnight.

For PNGase F digestion, Pefabloc SC (Roche) was added to the tryptic digest, and the mixture was lyophilized. The activity of trypsin has to be quenched to avoid incorporation of ¹⁸O at C-terminal lysine and arginine residues and to avoid digestion of PNGase F. One unit of PNGase F from Flavobacterium meningosepticum (Roche) was diluted in 30 µL of 100 mM NH4HCO3 buffer (pH 7.4) and lyophilized. PNGase F was resuspended in 30 µL of 99% H₂¹⁸O (Icon Isotopes, Summit, NJ) and added to the lyophilized tryptic digest, and reactions were carried out at 37 °C overnight.

Strong Cation Exchange (SCX) Chromatography. After deglycosylation with PNGase F or Endo D/H, peptides were separated using SCX and an Ettan LC system (Amersham Pharmacia Biotech, Uppsala, Sweden). The peptide samples were diluted 50 times in solvent A (5 mM ammonium formate and 30% ACN adjusted to pH 3 with FA) and loaded onto a 150 mm × 2.1 mm Polysulfoethyl A (PolyLC, Columbia, MD) column. Peptides were separated in a 20 min gradient at a constant flow rate of 0.2 mL/min from 0 to 40% solvent B (1.0 M ammonium formate and 30% ACN adjusted to pH 3 with FA), and 2-min fractions were collected. The volume of each fraction was reduced to approximately 1/5 by vacuum centrifugation, and the fraction was loaded onto a reversed-phase stage tip made of C18 disk material from Empore (3M, Minneapolis, MN),27 washed with 5% FA, and eluted with 5% FA and 80% ACN. The volume was reduced to ∼1 µL in a Speed Vac and diluted to 10 µL with 0.5% acetic acid.

Liquid Chromatography-Tandem Mass Spectrometry (LC-MS/MS). Peptide separation was achieved using an LC-Packings nanoflow LC system (LC Packings, Amsterdam, The Netherlands) equipped with a Famos autosampler. Custom-made precolumns (1 cm, 75 µm i.d. fused silica) and analytical columns (10 cm, 50 µm i.d. fused silica) were used with silicate frits retaining the Reprosil-Pur C18 3 µm beads (Dr. Maisch GmbH, Ammerbuch-Entringen, Germany). The silicate frits were prepared by quickly dipping the end of the fused silica capillary into a mixture of 44 µL of Kasil 1 (PQ Corporation, Berwyn, PA) and 8 µL of formamide (Sigma-Aldrich, St. Louis, MO). The length of the frit was adjusted by cutting with a ceramic scoring wafer after Kasil polymerization for 2 h at 100 °C. The column was then packed using a pressure bomb (Proxeon, Odense, Denmark). Peptides were loaded onto the precolumn with a flow of ∼5 µL/min.
After loading, nanoflow reversed-phase LC was performed using a flow split before the precolumn to achieve a flow of approximately 100 nL/min through the precolumn and the analytical column directly into the ESI source of a Q-TOF Ultima tandem mass spectrometer (Waters Corp., Milford, MA). Peptides were eluted with a linear gradient of 5-40% ACN in 90 min, with 0.5% aqueous acetic acid (Merck) as solvent A and 0.5% aqueous acetic acid and 80% ACN as solvent B.

Spectral Analysis and Database Searching. The tandem mass spectra acquired during LC-MS/MS were smoothed, centroided, and deisotoped using the “Fast” algorithm, where each isotopic cluster of a given charge state is transformed to a single mass measurement, and converted to a peak-list text file using ProteinLynx Global Server 2.0 (Waters Corp., Milford, MA). The spectra were searched against the human nonredundant databases from the International Protein Index (version 3.07), or human Uniprot (version 49 downloaded from ftp://ftp.ebi.ac.uk/pub/databases/SPproteomes/uniprot/proteomes/25.H_sapiens.dat) using Mascot v2.1 software (Matrix Sciences Ltd., Cheshire, U.K.) or VEMS software28,29 (www.yass.sdu.dk). The following search parameters were used: carbamidomethylation of all cysteines, possible oxidation of methionine, possible conversion of terminal glutamines and glutamates into pyroglutamate, possible formation of pyro-carbamidomethylcysteine, possible carbamidomethylation of the N-terminus, at most 3 missed tryptic cleavage sites, and a 0.2 Da error tolerance in both MS and MS/MS. We included three variable modifications for asparagine: PNGase F induced deamidation in H₂¹⁸O (leading to a monoisotopic residue mass of 117.0312 Da), GlcNAc linked (adding a monoisotopic mass of 203.079 Da), and fucosyl GlcNAc (adding a monoisotopic mass of 349.137 Da). Four different definitions were tested for fucosyl GlcNAc: (1) intact glycan, (2) neutral loss of the fucose residue (146 Da), (3) neutral loss of fucosyl GlcNAc (349 Da), and (4) two neutral loss series of 146 and 349 Da. We defined the general modification N-acetylhexosamine (HexNAc) since N-acetylgalactosamine and N-acetylglucosamine have identical mass and cannot be distinguished by MS. The addition of HexNAc (monoisotopic mass of 203.079 Da) to serine and threonine with a neutral loss series of 203 Da was searched as a variable modification. All the reported peptide assignments containing glycosylation sites were validated by manual interpretation of the spectra. The searches were repeated against the Uniprot database modified in a manner similar to that described by Atwood et al.,30 with the letter N in the FASTA database exchanged with J when N appears in the glycosylation sequon NXS/T/C. J was defined with the same monoisotopic mass as asparagine, but the variable modifications (¹⁸O incorporation, GlcNAc, and fucosyl GlcNAc) were only allowed as variable J modifications.

Figure 1. Enzymatic digestion of glycopeptides by glycosidases. By the use of a mixture of exoglycosidases, it is possible to trim a heterogeneous set of N-linked glycans with β-galactosidase, neuraminidase, and N-acetyl-β-glucosaminidase to glycan structures that are substrates for either Endo D or Endo H, which cleave within the chitobiose core leaving only a single GlcNAc residue attached to the asparagine residue. If the glycan is core-fucosylated, the fucose residue remains attached to the innermost GlcNAc residue. Alternatively, the entire N-linked glycan can be removed with PNGase F (or A), which converts the asparagine residue to aspartic acid. If this reaction is performed in H₂¹⁸O, the enzymatic process leads to a mass increase of 3 Da per asparagine glycosylation site, making it possible to distinguish between the two types of asparagine deamidation: spontaneous deamidation prior to enzyme treatment and PNGase F catalyzed deglycosylation. The mixture of exoglycosidases also reduces the complexity of some O-linked glycan structures down to the innermost N-acetylhexosamine.
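The N-to-J database substitution described in Materials and Methods can be illustrated with a short sketch. This is not the script used in the original work; it is a minimal regular-expression version, assuming the sequon is N followed by any residue except proline and then S, T, or C, applied to a single unwrapped sequence string. The name `mark_sequons` is hypothetical.

```python
import re

# Sequon: N, then any residue except P, then S, T, or C.
# The lookahead checks the two following residues without consuming them,
# so an asparagine inside another sequon (e.g. "NNSS") is still marked.
SEQUON = re.compile(r"N(?=[^P][STC])")


def mark_sequons(sequence: str) -> str:
    """Replace N with J wherever N starts an NX(S/T/C) sequon (X != P)."""
    return SEQUON.sub("J", sequence)


# Peptide sequences taken from the text (haptoglobin and Galectin-3-binding protein).
print(mark_sequons("NLFLNHSENATAK"))
print(mark_sequons("AAIPSALDTNSSK"))
```

For a FASTA file whose sequences wrap over several lines, the lines of each record would need to be joined before substitution so that sequons spanning a line break are not missed. In the search engine, J would then be given the monoisotopic residue mass of asparagine, with the glycosylation-related variable modifications restricted to J.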

Results and Discussion

Overall Strategies for Determination of Glycosylation Sites. Two alternative enzymatic strategies for global glycosylation site analysis were utilized here to complement each other. These strategies are based on deglycosylation using either a peptide-N-glycosidase (PNGase F) or two endo-β-N-acetylglucosaminidases (Endo D and Endo H) (Figure 1). PNGase F cleaves the amide bond between asparagine and the innermost GlcNAc residue of the N-glycan. Asparagine is converted to aspartic acid in this reaction, and glycosylation sites can thus be identified through a mass increase of 1 Da. However, in a preliminary study (data not shown), we observed several deamidated asparagines also in samples not exposed to PNGase F treatment (especially asparagines that are followed by G or D in the protein sequence or asparagines at the N-terminus of peptides). To avoid this ambiguity, PNGase F deglycosylation was carried out in H₂¹⁸O, which leads to a mass increase of 3 Da per deglycosylated site in peptides. The ¹⁸O-labeled aspartic acid residue is analytically appealing because of its unique residue mass of 117 Da, which cannot be mistaken for any other amino acid in the tandem mass spectrum. One potential drawback of ¹⁸O labeling is that trypsin catalyzes the incorporation of one or two ¹⁸O atoms into the carboxylic acid group of C-terminal lysine and arginine residues. As recently reported by Angel et al.,31 partial ¹⁸O incorporation in the C-terminus can be problematic for determination of monoisotopic peptide masses and should be avoided. To minimize C-terminal ¹⁸O incorporation from residual trypsin activity, we added a trypsin inhibitor to the glycopeptides prior to incubation with PNGase F in H₂¹⁸O.

Figure 2. Overview of the analytical strategy. A pellet of the Cohn IV fraction of human plasma was solubilized in RapiGest, reduced, and alkylated, followed by in-solution digestion with trypsin. The peptides were diluted in 80% ACN and 5% FA, and the glycopeptides were enriched by hydrophilic interaction chromatography (HILIC). The glycan structures were removed either by PNGase F treatment in H₂¹⁸O or by Endo D/H treatment. The resulting peptides from each deglycosylation scheme were separated by SCX into 10 fractions and analyzed by LC-MS/MS.

In the second strategy, N-linked glycans are removed using the endo-β-N-acetylglucosaminidases Endo D and Endo H (Figure 1). These enzymes hydrolyze the O-glycosidic bond between the two innermost GlcNAc residues attached to asparagine. Thus, a single GlcNAc residue (mass increase of 203 Da) will remain attached to asparagine. In some N-glycans from mammals, an α-1-6-linked fucose residue is attached to the innermost GlcNAc residue, and this residue will also be retained on the peptide after Endo D/H deglycosylation (mass increase of 349 Da).
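The nominal mass increases quoted for the two strategies (1, 3, 203, and 349 Da) follow from elemental compositions. The sketch below recomputes them from standard monoisotopic atomic masses and, as a worked example, derives doubly charged precursor m/z values for the Galectin-3-binding protein peptide AAIPSALDTNSSK that is discussed with Figure 3 later in this article. It is an illustration only: the helper name `precursor_mz` and the small residue-mass table (limited to the amino acids in that peptide) are assumptions made for this example, not part of the original workflow.

```python
# Monoisotopic atomic masses (Da).
H, C, N, O16, O18 = 1.00782503, 12.0, 14.0030740, 15.9949146, 17.9991596
PROTON = 1.00727647
WATER = 2 * H + O16

# Asn -> Asp replaces the side-chain NH2 by OH; the incoming oxygen
# comes from the solvent, so it is 18O when the reaction runs in H2(18)O.
DEAMIDATION_16O = O16 - N - H          # ordinary water, nominal +1 Da
DEAMIDATION_18O = O18 - N - H          # heavy water, nominal +3 Da
HEXNAC = 8 * C + 13 * H + N + 5 * O16  # C8H13NO5 residue, nominal +203 Da
FUCOSE = 6 * C + 10 * H + 4 * O16      # C6H10O4 residue, nominal +146 Da
FUC_HEXNAC = HEXNAC + FUCOSE           # nominal +349 Da

# Monoisotopic residue masses, restricted to the residues used below.
RESIDUE = {
    "A": 71.03711, "I": 113.08406, "L": 113.08406, "P": 97.05276,
    "S": 87.03203, "D": 115.02694, "T": 101.04768, "N": 114.04293,
    "K": 128.09496,
}


def precursor_mz(peptide: str, delta: float = 0.0, charge: int = 2) -> float:
    """m/z of a protonated peptide carrying one modification of mass `delta`."""
    neutral = sum(RESIDUE[aa] for aa in peptide) + WATER + delta
    return (neutral + charge * PROTON) / charge


for label, delta in [("18O-deamidated", DEAMIDATION_18O),
                     ("HexNAc", HEXNAC),
                     ("Fuc-HexNAc", FUC_HEXNAC)]:
    print(f"{label:15s} {precursor_mz('AAIPSALDTNSSK', delta):.2f}")
```

The three values come out at about 639.33, 739.37, and 812.40, in line with the experimentally measured precursor ions reported in the Figure 3 caption (m/z 639.32, 739.37, and 812.39).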
We have previously described a combination of endo- and exoglycosidases (referred to as Endo D/H) to trim down the glycan structure irrespective of the N-glycan type.9 This combination includes Endo H, which is active on high-mannose type glycans, and Endo D, which is active on complex and hybrid type glycans when used together with a set of exoglycosidases.32,33

To investigate the benefits and limitations of the two different deglycosylation strategies, we applied a gel-free shotgun approach to Cohn fractionated human plasma, as outlined in Figure 2. The Cohn process separates human blood plasma proteins into five major fractions by differential solubility under varying temperature, ethanol concentration, and pH conditions, with serum albumin as the end product. Analysis of human plasma proteins is complicated by the presence of highly abundant proteins such as immunoglobulins (fractions II and III) and serum albumin (fraction V). The Cohn IV fraction was thus chosen as test material for this analysis because it is a complex mixture of glycoproteins from human plasma depleted of these abundant plasma proteins. Human plasma proteins from the Cohn IV fraction were digested in solution after solubilization with the acid-labile detergent RapiGest, and the glycopeptides were enriched by HILIC chromatography, essentially as described previously.9 A notable exception, however, was that the concentration of formic acid was raised from 0.5% to 5% to improve the glycopeptide selectivity over nonglycosylated peptides (based on tests with tryptic digests of bovine fetuin, data not shown). Approximately 85% of the peptides in the redundant list identified by LC-MS/MS after enrichment and deglycosylation were identified as glycopeptides through their modified residues (data not shown). The sample was split in two, followed by treatment with either PNGase F or Endo D/H. The deglycosylated peptides were separated by SCX into 10 fractions. Finally, these fractions were separated and analyzed by LC-MS/MS on a quadrupole time-of-flight (Q-TOF) instrument.

Database Searching. The tandem mass spectra were searched against the Uniprot database using VEMS or Mascot software. The rather complicated fragmentation spectra from core-fucosylated peptides (vide infra) prompted us to test four different definitions of fucosyl N-acetylhexosaminyl asparagine (for details see Materials and Methods). The definition with two series of neutral loss, from both the single fucose residue and the entire fucosyl N-acetylhexosamine residue, gave the best description and scores of the data.
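To make the four tested definitions concrete, the sketch below lists the peaks each definition would predict for one glycan-carrying ion. It only illustrates the idea of a neutral-loss series; it does not reproduce how Mascot or VEMS handle neutral losses internally, and the names `DEFINITIONS` and `expected_peaks` are hypothetical.

```python
FUCOSE = 146.0579      # monoisotopic mass of a fucose residue (Da)
FUC_HEXNAC = 349.1373  # monoisotopic mass of fucosyl-HexNAc (Da)

# The four definitions of fucosyl GlcNAc listed in Materials and Methods,
# expressed as the neutral losses each one allows.
DEFINITIONS = {
    1: (),                     # intact glycan only
    2: (FUCOSE,),              # loss of the fucose residue
    3: (FUC_HEXNAC,),          # loss of the whole fucosyl-HexNAc
    4: (FUCOSE, FUC_HEXNAC),   # both neutral-loss series
}


def expected_peaks(mz: float, definition: int, charge: int = 1) -> list:
    """Intact m/z plus the m/z after each neutral loss the definition allows."""
    # A neutral loss leaves the charge unchanged, so m/z drops by loss / charge.
    return [mz] + [mz - loss / charge for loss in DEFINITIONS[definition]]


# Example: a doubly charged ion at m/z 812.40 carrying fucosyl-HexNAc.
for definition in DEFINITIONS:
    peaks = expected_peaks(812.40, definition, charge=2)
    print(definition, [round(p, 2) for p in peaks])
```

For a doubly charged ion, loss of fucose lowers m/z by about 73.03 and loss of the whole fucosyl-HexNAc by about 174.57, so under definition 4 the ion at m/z 812.40 is accompanied by peaks near 739.37 and 637.83.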
Inspired by Atwood et al.,30 we modified the Uniprot database by substituting the letter N with J whenever it appears in the glycosylation sequon NXS/T/C. The letter J in the search engine was defined to have the same mass as asparagine. Defining it with the mass of aspartic acid, as proposed by Atwood et al., requires that all glycosylation site asparagines be modified, and this is not always the case. For example, the peptide NLFLJHSEJATAK from haptoglobin contains two glycosylation sequons, and we find that there is occupation heterogeneity with a form NLFLjHSEJATAK (j denoting the modified residue) glycosylated only in the first sequon. This identification would not have been possible using the strategy proposed by Atwood et al. We then limited glycosylation-associated asparagine variable modifications to J-residues only. The peptide identifications returned by Mascot from these searches had fewer false-positive identifications of glycosylated peptides and were easier to look through, but essentially, the peptide and glycosylation site identifications after manual validation were identical to those from the normal database search.

Identification of N-Glycosylation Sites. LC-MS/MS analysis of N-glycosylated peptides obtained by the two different deglycosylation strategies revealed notable differences in the fragment ion spectra. The PNGase F treatment in H₂¹⁸O yielded the simplest spectra, where the glycosylation site can be pinpointed in the ion series by its mass of 117 Da. As an example, the fragment ion spectrum of a doubly charged precursor matching the peptide AAIPSALDTNSSK from Galectin-3-binding protein is shown in Figure 3A. Endo D/H-treated peptides gave more complex spectra, and panels B and C in Figure 3 display the fragmentation spectra of the same peptide from Galectin-3-binding protein carrying a HexNAc and a fucosyl-HexNAc moiety, respectively.

Figure 3. LC-ESI MS/MS fragment ion spectra of the peptide AAIPSALDTNSSK from Galectin-3 binding protein after (A) PNGase F treatment and (B and C) Endo D/H treatment acquired on a Q-TOF Ultima mass spectrometer. (A) A deconvoluted fragmentation spectrum of the [M + 2H]2+ precursor ion at m/z 639.32 corresponding to the deamidated, ¹⁸O-labeled peptide. The glycosylation site was unambiguously assigned from the y-ion series where a mass increment of 117 Da located the modified asparagine (denoted D*). (B) A deconvoluted fragmentation spectrum of the [M + 2H]2+ precursor ion at m/z 739.37 corresponding to the peptide with a retained HexNAc residue. The glycosylation site was unambiguously assigned from the y-ion series. A series of intense signals corresponding to neutral losses of the HexNAc residue is denoted with a square (□). The oxonium ion at m/z 204 and fragmentation of this ion give rise to a set of diagnostic signals at m/z 186, 168, 144, 138, and 126; the m/z values in the spectrum are the experimentally measured values (see Table 1 for the theoretical masses and composition). (C) A deconvoluted fragment ion spectrum of the [M + 2H]2+ precursor ion at m/z 812.39 corresponding to the peptide with a retained fucosyl HexNAc residue. The glycosylation site was unambiguously assigned from the y-ion series, but the very labile fucose residue (the loss of fucose is denoted with a triangle (△) in the spectrum) makes the spectra quite complex. It is not possible to assign the fucose linkage position, but it seems reasonable to assign it as being linked to the HexNAc. A series of signals corresponding to neutral loss of the entire fucosyl HexNAc residue is denoted with the composite symbol (□△).

Table 1. Diagnostic Ions for HexNAcᵃ

composition   losses     m/z         frequency in annotated spectra (%)   intensity relative to base peak intensity (%)
C8H14NO5+     none       204.08665   96                                   60
C8H12NO4+     -H2O       186.07608   86                                   14
C8H10NO3+     -H4O2      168.06552   97                                   15
C6H10NO3+     -C2H4O2    144.06552   74                                   3
C7H8NO2+      -CH6O3     138.05496   95                                   12
C6H8NO2+      -C2H6O3    126.05496   100                                  53

ᵃ The oxonium ion at m/z 204.087 decomposes and gives rise to a number of diagnostic ions. The statistics are based on the set of spectra assigned to identified peptides containing an asparagine-linked HexNAc. These diagnostic ions appear in most of the annotated spectra and with high intensity relative to the base peak.

In the low m/z range, no immonium ion (m/z 90) from the ¹⁸O-labeled aspartic acid residue was observed in spectra from the PNGase F digestion. However, in spectra from the Endo D/H digestion, the HexNAc oxonium ion at m/z 204 and its fragmentation products give rise to a series of relatively intense diagnostic ions (Figure 3B,C). Table 1 gives a summary of these ions, their composition, and their relative intensity and frequency in the annotated spectra. As described previously, the presence of these low m/z diagnostic ions is a strong indication of the presence of HexNAc.34,35 It should, however, be noted that the same set of ions is produced from GlcNAc and GalNAc residues irrespective of their O- or N-glycan origin. The amide linkage between GlcNAc and asparagine in N-glycans is more stable than most O-glycosidic linkages, and in many cases, it is possible to determine the glycosylation site from an ion series where the HexNAc residue remains attached to the peptide backbone (Figure 3B). In core-fucosylated peptides, the fucose residues are linked to the GlcNAc residue through an O-glycosidic bond, and the primary fragmentation channel is thus represented by neutral loss of the fucose residue (Figure 3C). Therefore, in most of the collision-induced spectra, there is no direct experimental evidence for the N-linked core-fucosylation. Fucose residues can also be situated on serine or threonine residues, and migration of fucose residues has been reported.36 However, we do not observe any peptides with fucosylated serine or threonine residues in the PNGase F treated samples, suggesting that the fucose residues in the Endo D/H-treated samples are indeed situated on N-linked HexNAc residues. Site-specific assignments of fucosyl residues would most likely be facilitated by using complementary fragmentation techniques such as electron capture dissociation (ECD)16,37 or electron-transfer

Table 2. N-Glycosylated Proteins and the Sites Identifiedᵃ

protein name                                  accession
Alpha-1-antitrypsin                           P01009
Alpha-2-HS-glycoprotein                       P02765
Alpha-2-macroglobulin                         P01023
Angiotensinogen                               P01019
Antithrombin III                              P01008
Apolipoprotein B-100                          P04114
Beta-2-glycoprotein I                         P02749
C4b-binding protein alpha                     P04003
C4b-binding protein beta chain                P20851
Carboxypeptidase B2                           Q96IY4
Carboxypeptidase N subunit 2                  P22792
Ceruloplasmin                                 P00450
Clusterin                                     P10909
Coagulation factor V                          P12259
Coagulation factor XII                        P00748
Complement C1q subcomponent, A chain          P02745
Complement C1r subcomponent                   P00736
Complement C2                                 P06681
Complement C3                                 P01024
Complement C4-A                               P0C0L4
Complement component C6                       P13671
Complement component C8 alpha chain           P07357
Complement component C9                       P02748
Complement factor H                           P08603
Complement factor H-related protein 2         P36980
Complement factor I                           P05156
Extracellular matrix protein 1                Q16610
Fibrinogen beta chain                         P02675
Fibrinogen gamma chain                        P02679
Fibronectin                                   P02751
Fibulin-1                                     P23142
Ficolin-3                                     O75636
Galectin-3 binding protein                    Q08380
Haptoglobin                                   P00738
Hemopexin precursor                           P02790
Hepatocyte growth factor-like protein         P26927
Histidine-rich glycoprotein                   P04196
Ig alpha-2 chain C region                     P01877
Ig gamma-1 chain C region                     P01857
Ig gamma-2 chain C region                     P01859
Ig gamma-4 chain C region                     P01861
Ig mu chain C region                          P01871
Immunoglobulin J chain                        P01591
Inter-alpha-trypsin inhibitor heavy chain H1  P19827
Inter-alpha-trypsin inhibitor heavy chain H2  P19823
Inter-alpha-trypsin inhibitor heavy chain H3  Q06033
Inter-alpha-trypsin inhibitor heavy chain H4  Q14624
Kallistatin                                   P29622
Kininogen-1                                   P01042
Phospholipid transfer protein                 P55058
Plasma kallikrein                             P03952
Plasma protease C1 inhibitor                  P05155
Properdin                                     P27918
Protein Z-dependent protease inhibitor        Q9UK55
Proteoglycan-4                                Q92954
Prothrombin                                   P00734
Serum amyloid A-4 protein                     P35542
Serum amyloid P-component                     P02743
Serum paraoxonase/arylesterase 1              P27169
Vitamin K-dependent protein Z                 P22891
Vitronectin                                   P04004

PNGase F 180 271 156 1424 128 1523, 3411

64, 71 74 138, 397, 762 86, 291, 354 2209 249

HexNAc

FucHexNAc

271 176 410 47 128 1523, 2239, 2982, 3336, 3411 162 221 64, 71 85, 108 74 138, 397, 762 86, 374

138, 397, 762

146

85 226, 1328 324 437 415 217, 882, 911, 1029 126 103 444 78 98 189 69,125,551 184, 207, 211 187, 453 296 125 205

514 621 85 226, 862, 1328 324 437 217, 822, 882, 911, 1029, 1095 126 103 530 394 78 1007

69, 125, 551 241 187 296 125

176

176

46, 210, 273, 280 49

46, 441 49 285 118 576 207, 517 157 205, 294

576 517 205 143 127 238 180

252 86, 169

85

127, 453 352 428 180 1159 143, 416 94 51 252 306 86, 169, 242

882

189 69, 551 184, 207

47, 205, 327 180 176 177 46, 210 49 285

25

121

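UniProt annotation (sites by accession)
P01009  271
P02765  156, 176
P01023  410, 1424
P01019  47
P01008  128
P04114  1523, 2239, 2982, 3336, 3411
P02749  162
P04003  221
P20851  64, 71
Q96IY4  85, 108
P22792  74
P00450  138, 397, 762
P10909  86, 291, 354, 374
P12259  2209
P00748  249
P02745  146
P00736  514
P06681  621
P01024  85
P0C0L4  226, 862, 1328
P13671  324
P07357  437
P02748  415
P08603  822, 882, 911, 1029, 1095
P36980  126
P05156  103
Q16610  444, 530
P02675  394
P02679  78
P02751  1007
P23142  98
O75636  189
Q08380  69, 125, 551
P00738  184, 207, 211, 241
P02790  187, 453
P26927  296
P04196  125
P01877  47, 205, 327
P01857  180
P01859  176
P01861  (no annotated site)
P01871  46, 210, 273, 280, 441
P01591  49
P19827  285
P19823  118
Q06033  576
Q14624  207, 517
P29622  157
P01042  205, 294
P55058  143
P03952  127, 453
P05155  25, 238, 352
P27918  428
Q9UK55  180
Q92954  1159
P00734  121, 143, 416
P35542  94
P02743  51
P27169  252
P22891  306
P04004  86, 169, 242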

ᵃ The protein names and accessions refer to UniProt entries. Many of the glycosylation sites were defined by more than one peptide due to tryptic missed cleavages or semi-tryptic cleavages, in-source fragmentation, or post-translational modifications. A list of the identified peptides can be found in Supplementary Table 1 in Supporting Information. Glycosylation site annotation is shown with the sites annotated in UniProt as “known” in bold and as “potential” in italics.

dissociation (ETD)38 where the peptide backbone cleavage occurs without prior destruction of the glycan. After removing redundant nonidentical peptide entries and only retaining the highest scoring species, a total of 275 peptides were identified (see Supplementary Table 1 in Supporting Information), defining 103 N-glycosylation sites (Table 2). The peptide degeneracy was primarily caused by post-translational modifications, semi-tryptic or tryptic missed cleavages, or in-source fragmentation. The most common occurrences were the oxidation of methionine and the formation of pyro-glutamine and pyro-glutamate for peptides with N-terminal Q and E residues. Finally, the reaction of cysteine with iodoacetamide produced side products, and a fairly large number of N-terminal amine groups were found to be carbamidomethylated. In total, 60 N-glycosylation sites were identified from the sample treated with PNGase F in the presence of H₂¹⁸O. From the sample treated with Endo D/H, 76 sites were identified with a HexNAc residue attached and 23 sites were


Table 3. O-Glycosylated Peptides Identified by Tandem Mass Spectrometryᵃ

protein name                                  accession number  glycopeptide                                              UniProt annotation
Coagulation factor X                          P00742            D.P210tENPFDLLDFnQTQPER227.G                              211 (O), 221 (N)
Coagulation factor XII                        P00748            R.L311HVPLmPAQPAPPKPQPttR330.T                            328 (O), 329 (O), 337 (O)
                                                                R.cbmL311HVPLmPAQPAPPKPQPttR330.T
                                                                R.T331PPQsQtPGALPAK344.R
Complement C4                                 P01028            M.A1221QETGDNLYWGSVTGSQSNAVsPtPAPR1248.N + HexNAc
                                                                N.A1240VSPtPAPR1248.N
Complement component C7                       P10643            K.E692NPLtQAVPK701.C
Hemopexin                                     P02790            A.t24PLPPtsAHGNVAEGEtKPDPDVtER49.C + 2 HexNAc             24 (O)
Histidine-rich glycoprotein                   P04196            R.s271sttKPPFKPHGsR284.D + HexNAc
                                                                R.s271sttKPPFKPHGsR284.D + 2 HexNAc
Inter-alpha-trypsin inhibitor heavy chain H1  P19827            R.T642FVLSALQPSPtHsssNtQR661.L + HexNAc                   653 (O)
                                                                R.cbmT642FVLSALQPsPtHsssNtQR661.L + HexNAc
                                                                R.T642FVLSALQPsPtHsssNtQR661.L + 2 HexNAc
                                                                F.V644LSALQPsPtHsssNtQR661.L + HexNAc
                                                                L.S646ALQPSPtHsssNtQR661.L + HexNAc
                                                                L.s646ALQPsPtHsssNtQR661.L + 2 HexNAc
                                                                S.A647LQPSPtHsssNtQR661.L + HexNAc
Inter-alpha-trypsin inhibitor heavy chain H2  P19823            M.L681AQGsQVLEstPPPHVMR698.V + HexNAc                     691 (O)
                                                                A682QGsQVLEstPPPHVmR698.V + HexNAc
Inter-alpha-trypsin inhibitor heavy chain H4  Q14624            R.L690AILPAsAPPAtsNPDPAVsR710.V + 3 HexNAc                701 (O), 702 (O), 709 (O)
                                                                R.cbmL690AILPAsAPPAtsNPDPAVsR710.V + 3 HexNAc
Kininogen                                     P01042            K.t521EHLAsssEDsttPsAQtQEKtEGPtPIPSLAK553.P + 4 HexNAc    533 (O), 542 (O), 546 (O)
Plasma protease C1 inhibitor                  P05155            S.N23PnAtssssQDPEsLQDR40.G + 2 HexNAc                     25 (N)

a The protein name and accession number refer to the Uniprot entries. The O-linked HexNAc residues are quite labile, and it is difficult to unambiguously assign the specific glycosylation site in the peptide sequence. Bold letters denote modified residues, and lowercase letters denote potentially modified residues with the number of observed HexNAc residues appended to the peptide sequence. The lowercase numbers denote the sequence position of the peptide and the relevant Uniprot annotation for the peptides (annotated residues are underlined). Cbm denotes carbamidomethylation of the N-terminal amine, and m denotes an oxidized methionine.

found to be core-fucosylated. Altogether, 13 and 42 glycosylation sites were uniquely identified using PNGase F and Endo D/H, respectively. Ten glycosylation sites were identified with all three versions (¹⁸O, GlcNAc, and fucosyl GlcNAc) of asparagine and 36 sites with two versions. The majority of the sites (91) were annotated in UniProt as “known” and 10 of the sites as “potential”. The site in the immunoglobulin entry P01861 was not annotated, and finally, site 217 in Complement factor H was annotated as “not glycosylated”. In the case of site 217, the peptide defining the site (SPDVIJGSPISQK) was identified in both the PNGase F and Endo D/H-treated samples as three nonidentical versions: ¹⁸O-labeled (with a Mascot score of 70), ¹⁸O-labeled with a carbamidomethylated N-terminal amine (Mascot score 85), and HexNAc-modified (Mascot score 60). Thus, the identification of the peptide as being glycosylated is made with high confidence.

O-Glycosylation. While we were analyzing the data from the Endo D/H-digested sample, it became apparent that some spectra containing diagnostic HexNAc oxonium ions matched peptides that carried a single HexNAc residue but lacked the NXS/T/C consensus sequon for N-glycans. In most cases, however, these peptides contained serine or threonine residues, suggesting that the HexNAc residues may derive from O-glycans. Database searching and manual interpretation of the spectra matching peptides that lack the NXS/T/C sequon but contain serine or threonine resulted in a total of 23 nonidentical peptides deriving from 11 different protein entries that were found to carry one or more HexNAc residues (Table 3). No O-glycosylated peptides were detected in the PNGase F-treated sample, suggesting that the observed HexNAc residues appear as a consequence of the glycosidase treatment and not because the peptides were modified with a single GalNAc or GlcNAc residue.
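The sorting of HexNAc-carrying peptides described above (sequon present or absent, serine/threonine present or absent) can be sketched in a few lines of Python. This is only an illustration, not the search procedure used in the study: the function names are hypothetical, and the pattern assumes the usual restriction that the X in NXS/T/C is any residue except proline, which the text does not spell out.

```python
import re

# N-glycosylation sequon: Asn, any residue except Pro (assumed), then Ser/Thr/Cys.
# The lookahead keeps the match one residue wide so overlapping sequons are all found.
SEQUON = re.compile(r"N(?=[^P][STC])")


def sequon_positions(peptide: str) -> list[int]:
    """Return 1-based positions of Asn residues that sit in an N-X-S/T/C sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(peptide.upper())]


def classify_hexnac_peptide(peptide: str) -> str:
    """Rough triage of a peptide observed with HexNAc attached."""
    if sequon_positions(peptide):
        return "N-glycan candidate"
    if any(residue in "ST" for residue in peptide.upper()):
        return "O-glycan candidate"
    return "unexplained"


# Peptides taken from Table 3 and Figure 4 (flanking residues removed).
print(sequon_positions("PTENPFDLLDFNQTQPER"))           # coagulation factor X, 210-227
print(classify_hexnac_peptide("LHVPLMPAQPAPPKPQPTTR"))  # coagulation factor XII
```

For the coagulation factor X peptide, which starts at residue 210, the sequon asparagine falls at peptide position 12, that is, residue 221, in line with the N-linked site listed for this protein in Table 3. A peptide such as this one contains both a sequon and threonine residues, so a sequence-only triage cannot rule out additional O-glycosylation.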


From the knowledge about O-glycan structures and enzyme specificities, it can be extrapolated that the exoglycosidases used together with Endo D/H have the potential of trimming at least some O-glycans. Neuraminidase from A. ureafaciens removes α-(2-3,6,8,9)-linked sialic acid residues, the bovine β-galactosidase releases nonreducing terminal β-(1-3,4,6)-linked galactose residues, and the diplococcal β-N-acetylhexosaminidase releases GlcNAc from both GlcNAc β1-3Gal and GlcNAc β1-6Gal linkages commonly found in mucin-type O-glycan structures.39 The combined action of these exoglycosidases on two putative mucin-type O-glycan structures is illustrated in Figure 1. As shown, the exoglycosidases have the potential of trimming, for example, a simple sialylated core 1 glycan or a more complex core 2 hexasaccharide down to a single monosaccharide unit. Exoglycosidases have been utilized in several previous studies for determination of O-glycan linkage and structure in released oligosaccharides (for example, see refs 40 and 41). These exoglycosidases have also been applied to intact glycopeptides. For example, the sequential treatment of the O-glycosylated hinge region of immunoglobulin A1 from human plasma with neuraminidase and β-galactosidase trimmed the O-linked sugar chains (predominantly sialylated Gal-β1-3GalNAc) down to the innermost peptide-linked N-acetylgalactosamine.42

Glycans are very labile, and determining O-linked glycosylation sites using standard CID fragmentation is difficult, as the glycosidic bond to S/T is broken as one of the primary fragmentation events and the loss of the HexNAc residue leaves a perfectly normal serine or threonine residue. The fragmentation spectra are rather complex to interpret because of this neutral loss, but in many cases, it is possible to assign the glycosylation site to different stretches delimited by proline residues since the facile N-directed fragmentation of proline


Figure 4. Deconvoluted LC-ESI tandem mass spectrum of a [M + 4H]⁴⁺ precursor ion at m/z 650.36 acquired on a Q-TOF Ultima mass spectrometer. The fragment ion spectrum corresponds to the peptide LHVPLMPAQPAPPKPQPTTR in Coagulation factor XII (P00748) carrying two HexNAc residues tentatively assigned as the two threonine O-glycosylation sites. The y and b ion series are indicated, and the losses of the labile O-linked HexNAc residues are annotated with a square (□) (□□ indicates loss of both sugars). The (*) denotes the loss of ammonia (very intense for the y12-ion where glutamine can eliminate ammonia to form pyroglutamine). All six ions listed in Table 1 that arise from the fragmentation of HexNAc are found as intense ions in the low mass region (the three most intense diagnostic m/z values are highlighted with the experimental values).

generates fragments retaining the HexNAc residue. Furthermore, there is some redundancy from peptide variants arising from different modifications (peptide truncation, oxidation of methionines, and carbamidomethylation of the N-terminal amine) that aids interpretation. An example of an O-glycopeptide spectrum is shown in Figure 4. This spectrum was assigned to the peptide LHVPLMPAQPAPPKPQPTTR in coagulation factor XII (P00748) carrying two HexNAc residues. The most intense ions are formed at proline residues with associated losses of the HexNAc residues, but it is possible to assign the glycosylation site to the two threonine residues, in good accordance with the Swiss-Prot annotation of two potential glycosylation sites. The Mascot score of 18 is not significant, but the confidence in the assignment of the b-ions can be strengthened by comparison with the fragmentation spectrum of the version of the peptide with a carbamidomethylated N-terminal amine (the b-ions are shifted by +57 Da). The other peptide identified from coagulation factor XII was found to be O-glycosylated at residues S335 and T337 (only T337 is annotated as a “potential” glycosylation site in Swiss-Prot). Coagulation factor X and plasma protease C1 inhibitor were found to be both N- and O-glycosylated. A common problem with O-glycosylation site annotation is the frequent clustering of potential sites in peptide sequences. For example, it was not possible to assign the glycosylation sites for the peptides LAILPASAPPATSNPDPAVSR and TEHLASSSEDSTTPSAQTQEKTEGPTPIPSLAK carrying three and four HexNAc residues,

respectively. Both peptides are annotated in Swiss-Prot with three O-glycosylation sites among the potential residues. Global analysis of O-glycosylation poses a challenging analytical problem. O-glycan structures vary considerably; there is no consensus motif for O-glycosylation, and there is no known endo-acting enzyme that is generally applicable for removal of O-glycans. The most commonly used enzyme is O-glycanase (endo-α-N-acetylgalactosaminidase), which displays strict substrate specificity, is only active on the disaccharide galactosyl-β-1-3-GalNAc attached to serine or threonine, and removes the entire glycan.43 Chemical removal of glycans with trifluoromethanesulphonic acid is quite efficient and insensitive to glycan structure,44 but the entire glycan is removed, and thus, information about the site of attachment is lost. O-glycans can be removed by β-elimination under alkaline conditions,15,45,46 thereby converting glycosylated serine and threonine residues to dehydroalanine and amino-dehydrobutyric acid, respectively. This results in a mass decrease of 18 Da relative to the unmodified residue that can be used for identification of glycosylation sites. A drawback with the β-elimination reaction is the low specificity, since both modified (e.g., phosphorylated) and, to a lesser extent, nonmodified serine and threonine residues may also be modified in β-elimination reactions, along with other side products (e.g., β-elimination of carbamidomethylated cysteines also forming dehydroalanine). The β-elimination reaction can be combined with Michael addition for selective tagging and enrichment of modified peptides or for relative quantitation.15,47 The only way

to unambiguously identify glycosylation sites is to leave a part of the glycan attached to the peptide, although an exoglycosidase strategy leaving a single GalNAc can pose a problem because trimmed mucin-type glycans will be indistinguishable from O-GlcNAc modified peptides by mass spectrometry. How successful the exoglycosidase treatment is for global trimming of all the heterogeneous O-linked oligosaccharides is not known to us, but it definitely works on a subset of O-glycan structures. Many O-linked glycans carry unique terminal sugar structures, and a more global exoglycosidase cocktail requires additional enzymes to cleave all glycosidic bonds in O-linked glycan structures. For example, the addition of the newly discovered α-N-acetylgalactosaminidase and α-galactosidase48 would remove the α-1,3-linked N-acetylgalactosamine from the A blood group and the α-1,3-linked galactose from the B blood group.
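The trimming logic discussed in this section can be made concrete with a toy model in which each exoglycosidase removes only terminal residues of one sugar type in certain linkages, and the cocktail is applied until nothing more can be removed. The sketch below is an assumption-laden illustration, not a description of the real enzyme kinetics: the `Sugar` class, the linkage strings, and the three example glycans are hypothetical, the β-N-acetylhexosaminidase is simplified to act on any terminal β1-3- or β1-6-linked GlcNAc, and the last glycan is a simplified terminal α1-3-linked GalNAc cap rather than a complete blood group A structure.

```python
from dataclasses import dataclass, field


@dataclass
class Sugar:
    name: str                      # e.g. "GalNAc", "Gal", "GlcNAc", "Neu5Ac"
    linkage: str = ""              # linkage to the parent residue; "" for the peptide-linked residue
    children: list["Sugar"] = field(default_factory=list)


# Sugar and linkages each enzyme is assumed to release (simplified from the text).
COCKTAIL = {
    "neuraminidase": ("Neu5Ac", {"a2-3", "a2-6", "a2-8", "a2-9"}),
    "beta-galactosidase": ("Gal", {"b1-3", "b1-4", "b1-6"}),
    "beta-N-acetylhexosaminidase": ("GlcNAc", {"b1-3", "b1-6"}),
}


def count(node: Sugar) -> int:
    """Number of monosaccharide residues in the glycan rooted at node."""
    return 1 + sum(count(child) for child in node.children)


def trim(root: Sugar, cocktail=COCKTAIL) -> int:
    """Strip terminal residues until no enzyme can act; return residues left.

    The glycan is modified in place. The peptide-linked root is never removed.
    """
    def cleavable(s: Sugar) -> bool:
        return not s.children and any(
            s.name == sugar and s.linkage in linkages
            for sugar, linkages in cocktail.values()
        )

    changed = True
    while changed:
        changed = False
        stack = [root]
        while stack:
            node = stack.pop()
            kept = [c for c in node.children if not cleavable(c)]
            if len(kept) != len(node.children):
                node.children = kept
                changed = True
            stack.extend(node.children)
    return count(root)


# Sialylated core 1: Neu5Ac(a2-3)Gal(b1-3)GalNAc
core1 = Sugar("GalNAc", children=[Sugar("Gal", "b1-3", [Sugar("Neu5Ac", "a2-3")])])
# Core 2 hexasaccharide with both arms sialylated
core2 = Sugar("GalNAc", children=[
    Sugar("Gal", "b1-3", [Sugar("Neu5Ac", "a2-3")]),
    Sugar("GlcNAc", "b1-6", [Sugar("Gal", "b1-4", [Sugar("Neu5Ac", "a2-3")])]),
])
# Terminal a1-3 GalNAc cap that none of the three enzymes can remove
capped = Sugar("GalNAc", children=[Sugar("Gal", "b1-3", [Sugar("GalNAc", "a1-3")])])

print(trim(core1), trim(core2), trim(capped))  # residues left on the peptide
```

In this model the core 1 and core 2 glycans are reduced to the single peptide-linked GalNAc, while the capped glycan keeps all three residues until an enzyme for the terminal α1-3 GalNAc is added to the cocktail dictionary.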

Conclusions

The daunting endeavor of characterizing the human plasma proteome is complicated by the many post-translational modifications that secreted proteins undergo. Here, we have applied two different deglycosylation strategies to map glycosylation sites after HILIC enrichment of glycopeptides. This study has shown that the Endo D/H partial deglycosylation strategy has several advantages compared with total deglycosylation. Although the complete removal of N-glycans by PNGase F in H₂¹⁸O is a simple pathway that provides indirect evidence for glycosylation sites, Endo D/H provides direct evidence for glycosylation sites via the retained GlcNAc residue and enabled mapping of a higher number of glycosylation sites. Endo D/H also offers the unique possibility to identify N-linked core-fucosylation. By coupling Endo D/H digestion with a lectin enrichment of fucosylated species, this valuable property can be used for screening core-fucosylated proteins and identifying the attachment sites in a more direct manner. Furthermore, we demonstrate that some O-glycosylated peptides were identified in the Endo D/H strategy, presumably due to exoglycosidase activity. The detailed enzymatic mechanisms behind this observation will be investigated on O-glycosylated model proteins. Further developments of the Endo D/H deglycosylation strategy in conjunction with ECD or ETD fragmentation techniques will be explored as tools for more comprehensive glycosylation site analysis.
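The three asparagine variants compared in this study each carry a characteristic mass tag, and the Figure 4 precursor can be checked with the same arithmetic. The sketch below derives the values from elemental compositions. The atomic and residue masses are standard monoisotopic values supplied here rather than taken from the paper, the helper names are hypothetical, and the oxidized methionine in the Figure 4 calculation is an assumption based on the lowercase m in the Table 3 entry for this peptide.

```python
# Monoisotopic atomic masses in Da (standard values, not from the paper).
MASS = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146, "O18": 17.9991610}
PROTON = 1.00727647


def mass(formula: dict[str, int]) -> float:
    """Monoisotopic mass of an elemental composition given as {element: count}."""
    return sum(MASS[element] * n for element, n in formula.items())


WATER = mass({"H": 2, "O": 1})
HEXNAC = mass({"C": 8, "H": 13, "N": 1, "O": 5})   # GlcNAc and GalNAc residues are isobaric
FUCOSE = mass({"C": 6, "H": 10, "O": 4})           # deoxyhexose residue
# Asn -> Asp in 18O water: the side-chain NH2 is replaced by 18OH (net -N -H +18O).
DEAMIDATION_18O = mass({"O18": 1}) - mass({"N": 1, "H": 1})

N_SITE_TAGS = {
    "PNGase F in 18O water (Asp with 18O)": DEAMIDATION_18O,
    "Endo D/H (HexNAc)": HEXNAC,
    "Endo D/H (Fuc-HexNAc)": HEXNAC + FUCOSE,
}

# Residue masses for the amino acids that occur in the Figure 4 peptide.
RESIDUE = {
    "A": 71.03711, "P": 97.05276, "V": 99.06841, "T": 101.04768, "L": 113.08406,
    "Q": 128.05858, "K": 128.09496, "M": 131.04049, "H": 137.05891, "R": 156.10111,
}


def precursor_mz(sequence: str, modifications: float, charge: int) -> float:
    """m/z of the protonated peptide with the given total modification mass."""
    neutral = sum(RESIDUE[aa] for aa in sequence) + WATER + modifications
    return (neutral + charge * PROTON) / charge


for label, delta in N_SITE_TAGS.items():
    print(f"{label}: {delta:+.4f} Da")

# Two HexNAc residues plus one oxidation (assumed) on the methionine.
mz = precursor_mz("LHVPLMPAQPAPPKPQPTTR", 2 * HEXNAC + MASS["O"], charge=4)
print(f"[M + 4H]4+ = {mz:.2f}")
```

Under these assumptions the calculated precursor is about m/z 650.35, within roughly 0.02 of the 650.36 reported for Figure 4, whereas the same peptide without methionine oxidation would come out near m/z 646.35. Because GlcNAc and GalNAc share one elemental composition, the HexNAc tag alone cannot tell a trimmed mucin-type glycan from O-GlcNAc.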

Acknowledgment. J.B. and R.M. acknowledge grants from the Carlsberg Foundation. Shabaz Mohammed is thanked for sharing his knowledge and help. Inga Laursen at Statens Serum Institut is thanked for the generous gift of the Cohn IV fraction of human plasma. Financial support for this study was provided by the Danish Biotechnology Instrument Center.

Supporting Information Available: Tables detailing the N- (Supplementary Table 1) and O-glycosylated (Supplementary Table 2) peptides identified by LC-MS. This material is available free of charge via the Internet at http://pubs.acs.org.

References
(1) Jensen, O. N. Modification-specific proteomics: characterization of post-translational modifications by mass spectrometry. Curr. Opin. Chem. Biol. 2004, 8 (1), 33-41.
(2) Mann, M.; Jensen, O. N. Proteomic analysis of post-translational modifications. Nat. Biotechnol. 2003, 21 (3), 255-261.
(3) Kannagi, R.; Izawa, M.; Koike, T.; Miyazaki, K.; Kimura, N. Carbohydrate-mediated cell adhesion in cancer metastasis and angiogenesis. Cancer Sci. 2004, 95 (5), 377-384.


(4) Bause, E.; Hettkamp, H. Primary structural requirements for N-glycosylation of peptides in rat-liver. FEBS Lett. 1979, 108 (2), 341-344.
(5) Bause, E.; Legler, G. The Role of the hydroxy amino-acid in the triplet sequence Asn-Xaa-Thr(Ser) for the N-glycosylation step during glycoprotein-biosynthesis. Biochem. J. 1981, 195 (3), 639-644.
(6) Van den Steen, P.; Rudd, P. M.; Dwek, R. A.; Opdenakker, G. Concepts and principles of O-linked glycosylation. Crit. Rev. Biochem. Mol. Biol. 1998, 33 (3), 151-208.
(7) Hofsteenge, J.; Muller, D. R.; de Beer, T.; Loffler, A.; Richter, W. J.; Vliegenthart, J. F. New type of linkage between a carbohydrate and a protein: C-glycosylation of a specific tryptophan residue in human RNase Us. Biochemistry 1994, 33 (46), 13524-13530.
(8) Bunkenborg, J.; Pilch, B. J.; Podtelejnikov, A. V.; Wisniewski, J. R. Screening for N-glycosylated proteins by liquid chromatography mass spectrometry. Proteomics 2004, 4 (2), 454-465.
(9) Hagglund, P.; Bunkenborg, J.; Elortza, F.; Jensen, O. N.; Roepstorff, P. A new strategy for identification of N-glycosylated proteins and unambiguous assignment of their glycosylation sites using HILIC enrichment and partial deglycosylation. J. Proteome Res. 2004, 3 (3), 556-566.
(10) Li, Y.; Larsson, E. L.; Jungvid, H.; Galaev, I.; Mattiasson, B. Affinity chromatography of neoglycoproteins. Bioseparation 2000, 9 (5), 315-323.
(11) Peracaula, R.; Royle, L.; Tabares, G.; Mallorqui-Fernandez, G.; Barrabes, S.; Harvey, D. J.; Dwek, R. A.; Rudd, P. M.; de Llorens, R. Glycosylation of human pancreatic ribonuclease: differences between normal and tumor states. Glycobiology 2003, 13 (4), 227-244.
(12) Zhang, H.; Li, X. J.; Martin, D. B.; Aebersold, R. Identification and quantification of N-linked glycoproteins using hydrazide chemistry, stable isotope labeling and mass spectrometry. Nat. Biotechnol. 2003, 21 (6), 660-666.
(13) Gonzalez, J.; Takao, T.; Hori, H.; Besada, V.; Rodriguez, R.; Padron, G.; Shimonishi, Y. A method for determination of N-glycosylation sites in glycoproteins by collision-induced dissociation analysis in fast atom bombardment mass spectrometry: identification of the positions of carbohydrate-linked asparagine in recombinant alpha-amylase by treatment with peptide-N-glycosidase F in 18O-labeled water. Anal. Biochem. 1992, 205 (1), 151-158.
(14) Kuster, B.; Mann, M. 18O-labeling of N-glycosylation sites to improve the identification of gel-separated glycoproteins using peptide mass mapping and database searching. Anal. Chem. 1999, 71 (7), 1431-1440.
(15) Wells, L.; Vosseller, K.; Cole, R. N.; Cronshaw, J. M.; Matunis, M. J.; Hart, G. W. Mapping sites of O-GlcNAc modification using affinity tags for serine and threonine post-translational modifications. Mol. Cell. Proteomics 2002, 1 (10), 791-804.
(16) Mirgorodskaya, E.; Roepstorff, P.; Zubarev, R. A. Localization of O-glycosylation sites in peptides by electron capture dissociation in a Fourier transform mass spectrometer. Anal. Chem. 1999, 71 (20), 4431-4436.
(17) Robinson, N. E.; Robinson, A. B. Deamidation of human proteins. Proc. Natl. Acad. Sci. U.S.A. 2001, 98 (22), 12409-12413.
(18) Kristiansen, T. Z.; Bunkenborg, J.; Gronborg, M.; Molina, H.; Thuluvath, P. J.; Argani, P.; Goggins, M. G.; Maitra, A.; Pandey, A. A proteomic analysis of human bile. Mol. Cell. Proteomics 2004, 3 (7), 715-728.
(19) Kaji, H.; Saito, H.; Yamauchi, Y.; Shinkawa, T.; Taoka, M.; Hirabayashi, J.; Kasai, K.; Takahashi, N.; Isobe, T. Lectin affinity capture, isotope-coded tagging and mass spectrometry to identify N-linked glycoproteins. Nat. Biotechnol. 2003, 21 (6), 667-672.
(20) Liu, T.; Li, J. D.; Zeng, R.; Shao, X. X.; Wang, K. Y.; Xia, Q. C. Capillary electrophoresis-electrospray mass spectrometry for the characterization of high-mannose-type N-glycosylation and differential oxidation in glycoproteins by charge reversal and protease/glycosidase digestion. Anal. Chem. 2001, 73 (24), 5875-5885.
(21) Mills, K.; Johnson, A. W.; Diettrich, O.; Clayton, P. T.; Winchester, B. G. A strategy for the identification of site-specific glycosylation in glycoproteins using MALDI TOF MS. Tetrahedron: Asymmetry 2000, 11 (1), 75-93.
(22) Sleat, D. E.; Zheng, H. Y.; Qian, M. Q.; Lobel, P. Identification of sites of mannose 6-phosphorylation on lysosomal proteins. Mol. Cell. Proteomics 2006, 5 (4), 686-701.
(23) Jebanathirajah, J.; Steen, H.; Roepstorff, P. Using optimized collision energies and high resolution, high accuracy fragment ion selection to improve glycopeptide detection by precursor ion scanning. J. Am. Soc. Mass Spectrom. 2003, 14 (7), 777-784.


(24) Staudacher, E.; Altmann, F.; Wilson, I. B.; Marz, L. Fucose in N-glycans: from plant to man. Biochim. Biophys. Acta 1999, 1473 (1), 216-236.
(25) Ma, B.; Simala-Grant, J. L.; Taylor, D. E. Fucosylation in prokaryotes and eukaryotes. Glycobiology 2006, 16 (12), 158R-184R.
(26) Kistler, P.; Nitschmann, H. Large scale production of human plasma fractions. Vox Sang. 1962, 7 (4), 414-424.
(27) Rappsilber, J.; Ishihama, Y.; Mann, M. Stop and go extraction tips for matrix-assisted laser desorption/ionization, nanoelectrospray, and LC/MS sample pretreatment in proteomics. Anal. Chem. 2003, 75 (3), 663-670.
(28) Matthiesen, R.; Bunkenborg, J.; Stensballe, A.; Jensen, O. N.; Welinder, K. G.; Bauw, G. Database-independent, database-dependent, and extended interpretation of peptide mass spectra in VEMS V2.0. Proteomics 2004, 4 (9), 2583-2593.
(29) Matthiesen, R.; Trelle, M. B.; Hojrup, P.; Bunkenborg, J.; Jensen, O. N. VEMS 3.0: Algorithms and computational tools for tandem mass spectrometry based identification of post-translational modifications in proteins. J. Proteome Res. 2005, 4 (6), 2338-2347.
(30) Atwood, J. A.; Sahoo, S. S.; Alvarez-Manilla, G.; Weatherly, D. B.; Kolli, K.; Orlando, R.; York, W. S. Simple modification of a protein database for mass spectral identification of N-linked glycopeptides. Rapid Commun. Mass Spectrom. 2005, 19 (21), 3002-3006.
(31) Angel, P. M.; Lim, J. M.; Wells, L.; Bergmann, C.; Orlando, R. A potential pitfall in 18O-based N-linked glycosylation site mapping. Rapid Commun. Mass Spectrom. 2007, 21 (5), 674-682.
(32) Koide, N.; Muramatsu, T. Endo-beta-N-acetylglucosaminidase acting on carbohydrate moieties of glycoproteins. Purification and properties of the enzyme from Diplococcus pneumoniae. J. Biol. Chem. 1974, 249 (15), 4897-4904.
(33) Tarentino, A. L.; Plummer, T. H., Jr.; Maley, F. The release of intact oligosaccharides from specific glycoproteins by endo-beta-N-acetylglucosaminidase H. J. Biol. Chem. 1974, 249 (3), 818-824.
(34) Treilhou, M.; Ferro, M.; Monteiro, C.; Poinsot, V.; Jabbouri, S.; Kanony, C.; Prome, D.; Prome, J. C. Differentiation of O-acetyl and O-carbamoyl esters of N-acetyl-glucosamine, by decomposition of their oxonium ions. Application to the structure of the nonreducing terminal residue of Nod factors. J. Am. Soc. Mass Spectrom. 2000, 11 (4), 301-311.
(35) Chalkley, R. J.; Burlingame, A. L. Identification of GlcNAcylation sites of peptides and alpha-crystallin using Q-TOF mass spectrometry. J. Am. Soc. Mass Spectrom. 2001, 12 (10), 1106-1113.
(36) Harvey, D. J.; Mattu, T. S.; Wormald, M. R.; Royle, L.; Dwek, R. A.; Rudd, P. M. “Internal residue loss”: rearrangements occurring during the fragmentation of carbohydrates derivatized at the reducing terminus. Anal. Chem. 2002, 74 (4), 734-740.
(37) Hakansson, K.; Cooper, H. J.; Emmett, M. R.; Costello, C. E.; Marshall, A. G.; Nilsson, C. L. Electron capture dissociation and infrared multiphoton dissociation MS/MS of an N-glycosylated tryptic peptide to yield complementary sequence information. Anal. Chem. 2001, 73 (18), 4530-4536.

(38) Hogan, J. M.; Pitteri, S. J.; Chrisman, P. A.; McLuckey, S. A. Complementary structural information from a tryptic N-linked glycopeptide via electron transfer ion/ion reactions and collision-induced dissociation. J. Proteome Res. 2005, 4 (2), 628-632.
(39) Yamashita, K.; Ohkura, T.; Yoshima, H.; Kobata, A. Substrate-specificity of Diplococcal beta-N-acetylhexosaminidase, a useful enzyme for the structural studies of complex type asparagine-linked sugar chains. Biochem. Biophys. Res. Commun. 1981, 100 (1), 226-232.
(40) Prime, S.; Dearnley, J.; Ventom, A. M.; Parekh, R. B.; Edge, C. J. Oligosaccharide sequencing based on exo- and endoglycosidase digestion and liquid chromatographic analysis of the products. J. Chromatogr., A 1996, 720 (1-2), 263-274.
(41) Xie, Y. M.; Tseng, K.; Lebrilla, C. B.; Hedrick, J. L. Targeted use of exoglycosidase digestion for the structural elucidation of neutral O-linked oligosaccharides. J. Am. Soc. Mass Spectrom. 2001, 12 (8), 877-884.
(42) Iwase, H.; Tanaka, A.; Hiki, Y.; Kokubo, T.; Sano, T.; Ishii-Karakasa, I.; Toma, K.; Kobayashi, Y.; Hotta, K. Mutual separation of hinge-glycopeptide isomers bearing five N-acetylgalactosamine residues from normal human serum immunoglobulin A1 by capillary electrophoresis. J. Chromatogr., B 1999, 728 (2), 175-183.
(43) Iwase, H.; Hotta, K. Release of O-linked glycoprotein glycans by endo-alpha-N-acetylgalactosaminidase. Methods Mol. Biol. 1993, 14, 151-159.
(44) Edge, A. S. B. Deglycosylation of glycoproteins with trifluoromethanesulphonic acid: elucidation of molecular structure and function. Biochem. J. 2003, 376, 339-350.
(45) Greis, K. D.; Hayes, B. K.; Comer, F. I.; Kirk, M.; Barnes, S.; Lowary, T. L.; Hart, G. W. Selective detection and site-analysis of O-GlcNAc-modified glycopeptides by beta-elimination and tandem electrospray mass spectrometry. Anal. Biochem. 1996, 234 (1), 38-49.
(46) Huang, Y.; Konse, T.; Mechref, Y.; Novotny, M. V. Matrix-assisted laser desorption/ionization mass spectrometry compatible beta-elimination of O-linked oligosaccharides. Rapid Commun. Mass Spectrom. 2002, 16 (12), 1199-1204.
(47) Vosseller, K.; Hansen, K. C.; Chalkley, R. J.; Trinidad, J. C.; Wells, L.; Hart, G. W.; Burlingame, A. L. Quantitative analysis of both protein expression and serine/threonine post-translational modifications through stable isotope labeling with dithiothreitol. Proteomics 2005, 5 (2), 388-398.
(48) Liu, Q. P.; Sulzenbacher, G.; Yuan, H.; Bennett, E. P.; Pietz, G.; Saunders, K.; Spence, J.; Nudelman, E.; Levery, S. B.; White, T.; Neveu, J. M.; Lane, W. S.; Bourne, Y.; Olsson, M. L.; Henrissat, B.; Clausen, H. Bacterial glycosidases for the production of universal red blood cells. Nat. Biotechnol. 2007, 25 (4), 454-464.

PR0700605

Journal of Proteome Research • Vol. 6, No. 8, 2007 3031