A New Approach for Mapping Sialylated N-Glycosites in Serum Proteomes
Bart Ghesquière,†,‡ Lien Buyl,†,‡ Hans Demol,†,‡ Jozef Van Damme,†,‡ An Staes,†,‡ Evy Timmerman,†,‡ Joël Vandekerckhove,†,‡ and Kris Gevaert*,†,‡
Department of Medical Protein Research, VIB, B-9000 Ghent, Belgium, and Department of Biochemistry, Ghent University, B-9000 Ghent, Belgium
Received June 14, 2007
A new approach for proteome-wide analysis of sialylated N-glycopeptides based on the diagonal chromatographic COFRADIC technology is presented here. The use of α(2-3,6,8,9) neuraminidase is central to isolating sialylated N-glycopeptides from a complex peptide mixture. Two different COFRADIC techniques are introduced, either without or with post-metabolic oxygen-18 labeling (direct versus indirect sorting); when applied to immuno-depleted mouse serum, they identified 93 sialylated glycosylation sites in 53 serum proteins.
Keywords: diagonal chromatography • sialic acid • serum • COFRADIC
* To whom correspondence should be addressed. Prof. Dr. Kris Gevaert, Department of Medical Protein Research and Biochemistry, VIB and Faculty of Medicine and Health Sciences, Ghent University, A. Baertsoenkaai 3, B-9000 Ghent, Belgium. Tel: +32-92649274. Fax: +32-92649496. E-mail: [email protected]. † VIB. ‡ Ghent University.
Journal of Proteome Research 2007, 6, 4304-4312
Published on Web 10/05/2007
Introduction
Sialic acid is usually found at the nonreducing ends of N-glycans where it plays a crucial role in determining surface characteristics of cells and secreted glycoproteins.1 Sialic acid is an N-acetylated derivative of neuraminic acid bound to glycans through an α-2,3 or an α-2,6 linkage to galactose or N-acetylhexosamine, a transfer that is catalyzed by the family of sialyltransferases.2 The presence of this monosaccharide on terminal glycan epitopes is increasingly used as a marker for aberrant glycosylation. During malignant transformation, for example, an increase of oligosaccharides resulting in augmented branching of glycans and a concomitant rise in the concentration of sialyltransferases can lead to an increase of sialic acid residues at glycan termini.3-8 Using lectins such as SNA from Sambucus nigra or MAA from Maackia amurensis, sialic acid containing glycoproteins can be affinity captured. However, one of the disadvantages of lectins is their rather limited specificity: SNA and MAA are mainly restricted toward, respectively, α-2,6- and α-2,3-bound sialic acid. Combining several lectins in a multi-lectin approach9 or serial lectin affinity chromatography (SLAC)10 may partially solve this problem. Our lab has previously developed a peptide-centric, gel-free proteomics technology based on the principle of diagonal chromatography11 that is called combined fractional diagonal chromatography (COFRADIC).12 COFRADIC isolates a specific set of peptides from proteome digests following two consecutive, identical RP-HPLC runs. Briefly, a peptide mixture is fractionated during a primary RP-HPLC run. Primary peptide
fractions are then treated with a chemical or an enzyme specific for a set of peptides. After this modification (sorting) reaction, each of these primary fractions (or a chosen combination) is reloaded onto the same RP-column and separated under identical conditions. During a series of such secondary runs, peptides whose structure was altered by the sorting reaction now elute differently from unaltered peptides, allowing their specific isolation. Using COFRADIC, protocols for the separation of methionyl,12 cysteinyl,13 amino (N) terminal peptides,14 phosphopeptides,15 N-glycosylated peptides,16 and protein segments containing amino acids structurally involved in ATP-binding17 were developed and have been applied to different biological samples.18-20 We show here that by COFRADIC, sialic acid-containing N-glycopeptides can be directly isolated using α(2-3,6,8,9) neuraminidase from Arthrobacter ureafaciens, which cleaves off terminal sialic acid residues between the primary and secondary RP-HPLC runs. A proof-of-concept study was first performed on a tryptic digest of bovine fetuin and human alpha-1-acid glycoprotein. Using two different, though related, approaches we then identified 93 different sialic acid-containing N-glycosites in 53 proteins in immuno-depleted mouse serum, thus constituting one of the largest hitherto reported catalogues of in vivo sialylation sites of mouse serum proteins.
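The diagonal sorting principle described above can be illustrated with a minimal Python sketch. The `Peak` record, the peptide labels, and the retention times are hypothetical and serve only to show the selection logic: a peptide is sorted when, after the sorting reaction, it no longer elutes inside the collection window of its primary fraction.

```python
from dataclasses import dataclass


@dataclass
class Peak:
    peptide: str         # hypothetical label, for illustration only
    primary_rt: float    # retention time in the primary run (min)
    secondary_rt: float  # retention time in the secondary run (min)


def sort_shifted(peaks, window_start, window_end):
    """Return the peaks that elute outside the primary collection window
    [window_start, window_end) during the secondary run."""
    return [
        p for p in peaks
        if not (window_start <= p.secondary_rt < window_end)
    ]


if __name__ == "__main__":
    peaks = [
        Peak("unaltered peptide", 41.4, 41.5),
        Peak("altered peptide", 41.6, 44.2),
    ]
    for p in sort_shifted(peaks, 41.0, 42.0):
        print(p.peptide, p.secondary_rt)
```

In practice the selection is made physically, by fraction collection, rather than computationally; the sketch only mirrors the decision rule.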
10.1021/pr0703728 CCC: $37.00 © 2007 American Chemical Society
Materials and Methods
MARS Depletion of Mouse Sera. Sera were taken from male C57 black strain mice, and the three most abundant proteins (albumin, Ig’s and transferrin) were affinity removed using a 4.6 mm (I.D.) × 100 mm multiple affinity removal system (MARS, Agilent) operated according to the manufacturer’s protocol. Each MARS depletion step was done on a 90 µL serum sample, and nonretained proteins were collected in volumes
adjusted to a final protein concentration of 1.15 mg/mL (determined by Biorad’s Protein Assay (Biorad Laboratories, Munich, Germany)). Sorting of Sialylated N-Glycopeptides of Fetuin and Alpha-1-Acid Glycoprotein. Bovine fetuin (0.8 mg) and 0.5 mg of human alpha-1-acid glycoprotein (both from Sigma-Aldrich, Steinheim, Germany) were dissolved, reduced, and alkylated in 500 µL of 2 M guanidinium hydrochloride, 25 mM TCEP (Pierce, Rockford, IL), and 100 mM iodoacetamide (Sigma-Aldrich) in 50 mM Tris-HCl (pH 8.0). Alkylation lasted for 90 min at 37 °C, after which excess reagents were removed on a NAP-5 desalting column (Amersham Biosciences, Uppsala, Sweden) and proteins were eluted with 1 mL of 10 mM Tris-HCl (pH 8.7). Prior to digestion, the protein mixture was heated at 95 °C for 10 min and cooled on ice for 15 min; 20 µg of sequencing-grade modified trypsin (Promega, WI) was then added for overnight digestion at 37 °C. One hundred microliters of the generated peptide mixture (corresponding to 0.08 mg of fetuin and 0.05 mg of alpha-1-acid glycoprotein) was separated on a RP-HPLC column (2.1 mm internal diameter (I.D.) × 150 mm (length) 300SB-C18 column, Zorbax, Agilent, Waldbronn, Germany) using an Agilent 1100 Series HPLC system. Following a 10 min wash with solvent A (10 mM ammonium acetate at pH 5.5 in water/acetonitrile, 98/2 (v/v) (water was LC-MS grade from Biosolve, Valkenswaard, The Netherlands and acetonitrile was HPLC grade from Baker, Deventer, The Netherlands)), a linear gradient to 100% solvent B (10 mM ammonium acetate at pH 5.5 in water/acetonitrile, 30/70 (v/v)) was applied over 100 min (the primary COFRADIC run). Using Agilent’s electronic flow controller, a constant flow of 80 µL/min was applied. Peptides eluting between 27 and 72 min after sample injection were collected in 45 fractions of 1 min each in a 96 well plate. Primary fractions separated by 15 min were pooled and dried in a centrifugal vacuum concentrator.
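The pooling scheme for the primary fractions can be written out as a short Python sketch. The function name and its parameters are illustrative choices, not part of the published protocol; the numbers (collection between 27 and 72 min, 1 min fractions, pooling of fractions 15 min apart) are taken from the text.

```python
def pool_primary_fractions(start=27, end=72, width=1, spacing=15):
    """Group 1 min primary fractions that lie `spacing` minutes apart.

    Returns a list of pools; each pool is a list of fraction start
    times in minutes after sample injection.
    """
    starts = list(range(start, end, width))   # 27, 28, ..., 71
    stride = spacing // width                 # fractions between pool members
    return [starts[i::stride] for i in range(stride)]


if __name__ == "__main__":
    pools = pool_primary_fractions()
    print(len(pools), "pools of", len(pools[0]), "fractions each")
    print(pools[-1])   # fractions starting at 41, 56 and 71 min
```

Under these assumptions the 45 primary fractions fall into 15 pools of three, so only 15 secondary runs are needed per sample.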
Prior to an identical secondary RP-HPLC separation, pooled fractions were redissolved in 90 µL of 50 mM NaH2PO4 (pH 5.0) and 8 µL of a 1 mU/µL stock solution of α(2-3,6,8,9) neuraminidase (Sigma-Aldrich) from Arthrobacter ureafaciens was added. Desialylation proceeded for 3 h at 37 °C and was stopped by adding 5 µL of 50% acetic acid. Each neuraminidase-treated pool of primary fractions was separately reloaded onto the same RP-HPLC column, and the same solvent gradient was applied as during the primary separation (the secondary COFRADIC run). Desialylated peptides were collected in an interval from 1 to 12 min following the end of the original collection interval of each primary fraction in secondary fractions of 1 min (80 µL) each. Protocol A: Direct Sorting of Sialylated N-Glycopeptides. One milliliter of MARS-depleted serum was desalted over a NAP-10 column (Amersham Biosciences) in 1.5 mL of 0.4 M guanidinium hydrochloride in 30 mM Tris-HCl (pH 7.6). Cysteine alkylation and trypsin digestion (now in 50 mM NH4HCO3; pH 7.4) were performed essentially as described above. A peptide amount equivalent to 200 µg of depleted serum proteins was lyophilized, subjected to mild oxidation in 100 µL of 10 mM sodium acetate (pH 5.5) buffer containing 0.5% (w/w) H2O2 for 30 min at 30 °C, and immediately loaded for the first separation step. This additional step was necessary to avoid unwanted shifts of methionyl peptides between the primary and secondary runs due to accidental oxidation. The COFRADIC sorting of the sialylated N-glycopeptides was performed as described above.
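The secondary collection windows follow directly from the numbers given for the fetuin/alpha-1-acid glycoprotein sorting. The Python sketch below assumes 1 min primary fractions pooled 15 min apart and a collection interval running from 1 to 12 min after the end of each primary fraction; the function names are hypothetical.

```python
def secondary_windows(pool, width=1, first=1, last=12):
    """For each primary fraction (given by its start time in minutes),
    return the (start, end) of the secondary collection interval."""
    return [(s + width + first, s + width + last) for s in pool]


def overlaps_primary(windows, pool, width=1):
    """True if any secondary window overlaps a primary fraction of the pool."""
    return any(
        w_start < s + width and s < w_end
        for (w_start, w_end) in windows
        for s in pool
    )


if __name__ == "__main__":
    pool = [41, 56, 71]                  # one pool of primary fractions
    windows = secondary_windows(pool)
    print(windows)
    print("overlap with primary fractions:", overlaps_primary(windows, pool))
```

With a 15 min spacing between pooled fractions, each collection window closes before the next primary fraction of the same pool begins, which is why shifted peptides can be assigned to their fraction of origin.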
research articles
Figure 1. Direct and indirect sorting of sialylated N-glycopeptides.
To reduce the number of analytical LC-MS/MS runs, secondary fractions were pooled, typically by combining 2 secondary COFRADIC fractions (e.g., fraction X and X+4) of each secondary interval. Such pooled fractions were dried in a centrifugal vacuum concentrator and redissolved in 85 µL of freshly prepared 50 mM NH4HCO3 (pH 7.4). For deglycosylation, 500 milli-units of PNGase F were added for 3 h at 30 °C, after which the peptides were redried and redissolved in 20 µL of 0.1% formic acid in water. Half of this peptide mixture was used for nanoLC-MS/MS analysis using a Waters nanoAcquity HPLC (Waters Corporation, Milford, MA) in-line coupled to a Q-TOF Premier mass spectrometer (Waters Corporation) as previously described.16 Protocol B: Indirect Sorting of Sialylated N-Glycopeptides. Two milliliters of MARS-depleted mouse serum was divided into two equal parts. Both samples were desalted on a NAP-10 column with 1.5 mL of 50 mM NaH2PO4 (pH 5.0), and the volume of the desalted protein mixture was reduced to 500 µL by centrifugal vacuum drying. To one sample was added 16 mU neuraminidase, and it was incubated for 4 h at 37 °C; 16 µL of 50 mM NaH2PO4 (pH 5.0) was added to the other sample as a control. Both samples were reduced, alkylated, and digested with trypsin as indicated above, and the untreated, sialylated peptides were labeled at their C-termini with oxygen-18 isotopes as previously described.21 Equal amounts of desialylated peptides and sialylated peptides were mixed, the sum of which corresponded to an equivalent of about 500 µg of depleted serum. Then, this peptide mixture was subjected to a COFRADIC protocol directed at sorting N-glycosylated peptides using PNGase F.16 The different steps used in both protocols are schematically represented in Figure 1.
Figure 2. COFRADIC sorting of sialic acid containing N-glycopeptides. A tryptic digest of human alpha-1-acid glycoprotein and bovine fetuin was separated on a RP-HPLC (UV absorbance trace at 214 nm is shown). During this primary run (A), peptides eluting between 27 and 72 min were collected in 1 min intervals and primary fractions separated by 15 min (indicated in dark gray boxes) were pooled. Such pooled fractions were treated with α(2-3,6,8,9) neuraminidase and reseparated in an identical RP-HPLC run (B). Now, peptides not containing sialic acid do not shift (indicated in dark-gray boxes in B), whereas N-glycosylated peptides that contained sialic acid undergo a hydrophobic shift (light-gray boxes in B).
Journal of Proteome Research • Vol. 6, No. 11, 2007
Sorted peptides were pooled into several fractions and separated by capillary RP-HPLC prior to MALDI MS(/MS) analysis as previously described.20 For differential analysis (protocol B), a so-called MALDI compound list was automatically generated by the WarpLC software after first measuring all 384 spots in MS mode. Here, ions with a signal-to-noise ratio of more than 70 were selected for automated MS/MS analysis. Peptide Identification by Mascot. MS/MS spectra were searched using Mascot22 in the Swiss-Prot database (version 2.1.04). The taxonomy was set to mouse, the peptide mass tolerance was set at ±0.2 Da, and the peptide fragment mass tolerance was set at ±0.5 Da, with the ESI-QUAD-TOF or MALDI-TOF/TOF as selected instruments for peptide fragmentation rules. Allowed variable modifications were: methionine oxidation to its sulfoxide derivative, pyrrolidone carboxylic acid formation of amino terminal glutamine, carbamidomethylation of cysteine, and deamidation of asparagine. Only peptides that were ranked first and for which the Mascot score of the matched MS/MS spectrum exceeded Mascot’s identity threshold score set at 95% confidence were withheld for further confirmation; typically, a sufficient number of b and y fragment ions needed to be present to cover (most of) the peptide sequence, and the PNGase F-mediated conversion of Asn to Asp was covered by such fragments.
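The automated part of the acceptance rule (first-ranked match, score above the identity threshold) can be expressed as a simple filter. This is a minimal Python sketch on a hypothetical `PeptideMatch` record; it does not use or reproduce any Mascot output format, and the subsequent manual inspection of b and y ions is not modeled.

```python
from dataclasses import dataclass


@dataclass
class PeptideMatch:
    sequence: str              # matched peptide sequence
    score: float               # ion score of the matched MS/MS spectrum
    identity_threshold: float  # identity threshold at 95% confidence
    rank: int                  # rank of the match for this spectrum


def retained_for_confirmation(matches):
    """Keep first-ranked matches whose score exceeds the identity threshold."""
    return [
        m for m in matches
        if m.rank == 1 and m.score > m.identity_threshold
    ]


if __name__ == "__main__":
    # Illustrative values only; they are not results from the study.
    matches = [
        PeptideMatch("PEPTIDEA", 55.0, 40.0, 1),
        PeptideMatch("PEPTIDEB", 35.0, 40.0, 1),
        PeptideMatch("PEPTIDEC", 60.0, 40.0, 2),
    ]
    for m in retained_for_confirmation(matches):
        print(m.sequence)
```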
Results and Discussion
COFRADIC-Based Characterization of Sialylation Sites in Fetuin and α-1-Acid Glycoprotein. Bovine fetuin contains three N-glycosylation sites23 (N99, N156, and N176) and carries several sialic acid residues.24,25 The human α-1-acid glycoprotein holds five different glycosites26 (N33, N56, N72, N93, and N103) and is also sialylated.27 Our sorting procedure was first tested on a tryptic digest of a mixture of these sialoproteins to monitor the extent of column retention alteration upon desialylation of glycopeptides. A tryptic digest of this protein mixture was separated by RP-HPLC (Figure 2A), and fractions of 1 min each were collected between 27 and 72 min and pooled into groups that were separated by 15 min. Following α(2-3,6,8,9) neuraminidase treatment, each pool of altered primary fractions was rerun on the same column and under identical chromatographic conditions. One example of such a secondary run is shown in Figure 2B. Here, three primary fractions, eluting between 41 and 42, 56 and 57, and 71 and 72 min, were pooled and desialylated. Because of the removal of the negatively charged hydrophilic sialic acid, desialylation was expected to introduce a hydrophobic shift. The primary fraction that eluted between 41 and 42 min clearly contained glycopeptides that shifted 2-4 min outside the primary collection interval (see peaks at 44 and 46 min in Figure 2B). These peptides were then collected for further mass spectrometric analysis.
To identify desialylated peptides by PSD or MS/MS, deglycosylation by PNGase F was introduced. It has been suggested that deglycosylation is necessary for efficient peptide sequencing by MS/MS because attached glycans can impede peptide bond fragmentation and interpretation of CID-MS/MS spectra.16,28-30 PNGase F removes N-glycans by cleaving the bond between the glycan and the conjugated asparagine, converting the latter into an aspartic acid31 and thereby leaving a mass label that aids interpretation of MS/MS spectra. PNGase F treatment was carried out on COFRADIC-sorted desialylated peptides prior to MALDI-MS analysis. MALDI-PSD analysis of sorted peptides collected in all secondary COFRADIC runs led to the identification of three peptides containing a deamidated asparagine in the NXS/T consensus motif. As for bovine fetuin, we identified sialylated N-glycans anchored to N99 (identified peptide: 91VLDPTPLAN*CSVR103 (N* = deamidated asparagine)) and N176 (172NAESN*GSYLQLVEISR187), whereas in the α-1-acid glycoprotein, sialylation was located on N93 (87QDQCIYN*TTYLNVQR101). When subjecting an identical peptide mixture to the COFRADIC procedure to sort N-glycosylated peptides16 (thus irrespective of the actual sialylated or asialylated nature of the glycan chains), five sites were identified that included the here-identified sialylated sites: N99, N156, and N176 of fetuin and N33 and N93 of the α-1-acid glycoprotein (data not shown). Three possible glycosylated sites were thus missed, but as discussed before,16 this is most probably due to bulky glycans hindering proteolysis close to tryptic sites. In conclusion, in this peptide mixture, 5 glycosylated asparagines appeared analyzable using COFRADIC, and when applied to sialo-COFRADIC, 3 sites were found sialylated.
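The mapping from an identified peptide to its glycosite can be checked with a few lines of Python. The sketch below searches the three peptides reported above for the NXS/T consensus motif and converts the position to protein numbering using the stated start residue. Two points are assumptions not spelled out in the text: X is taken to be any residue except proline (the commonly used form of the sequon rule), and the Asn-to-Asp mass increment is the standard monoisotopic value of about 0.984 Da.

```python
import re

# N-glycosylation sequon: Asn, any residue except Pro, then Ser or Thr.
# The lookahead keeps overlapping sequons detectable.
SEQUON = re.compile(r"N(?=[^P][ST])")

# Monoisotopic mass increase when PNGase F converts Asn to Asp (Da).
ASN_TO_ASP_SHIFT = 0.984016


def sequon_sites(sequence, first_residue_number):
    """Return the protein-level positions of Asn residues in a sequon."""
    return [first_residue_number + m.start() for m in SEQUON.finditer(sequence)]


if __name__ == "__main__":
    peptides = {
        "VLDPTPLANCSVR": 91,     # bovine fetuin
        "NAESNGSYLQLVEISR": 172,  # bovine fetuin
        "QDQCIYNTTYLNVQR": 87,    # human alpha-1-acid glycoprotein
    }
    for seq, start in peptides.items():
        print(seq, sequon_sites(seq, start))
```

Each peptide contains exactly one sequon, so the deamidated asparagine can be assigned unambiguously to N99, N176, and N93, respectively.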
Whether the remaining two analyzable sites were not sialylated or desialylation resulted in a too small hydrophobic shift so that their corresponding tryptic peptides were not efficiently sorted remains elusive. Feeling comfortable with the hydrophobic shifts evoked by desialylation, we then used the HPLC setup to analyze more complex protein mixtures. Sialylated Proteins in Mouse Serum: Protocol A (Direct Sorting). In a follow-up study, sialylated proteins in mouse serum immuno-depleted of its three most abundant components (albumin, transferrin, and IgG’s) were identified using the above-described protocol for direct sorting of sialylated peptides. Using 200 µg of protein material from mouse serum affinity-depleted from its three most abundant proteins (this amount corresponds to an equivalent of about 13 µL of crude, nondepleted serum), 423 MS/MS spectra were linked to desialylated peptides using Mascot.22 Together they cover 84 different sialylated N-glycosylation sites in 48 different proteins (Table 1). Sialylated Proteins in Mouse Serum: Protocol B (Indirect Sorting). We exploited a different approach for identifying sialylated N-glycopeptides (Figure 1). Here, MARS-depleted serum was split in two equal parts of which one part was treated with neuraminidase. Both samples were then digested with trypsin, and peptides derived from the nondesialylated part were COOH-terminally labeled with oxygen-18,21 giving rise to a mass increase of 4 Da compared to their desialylated counterparts. In the next step, equal amounts of both protein digests were mixed and subjected to the first RP-HPLC separation. At this stage, we can distinguish four types of peptides with respect to N-glycosylation: 18O-labeled sialylated peptides
and their 16O-labeled desialylated counterparts, which will both elute in different time intervals and are thus collected in different fractions. The third class is composed of peptides that are glycosylated but not sialylated, and these will not segregate into 16O- and 18O-peptides. Finally, all remaining nonglycosylated peptides will also show coelution of their 16O- and 18O-variants. Following fraction collection, peptides are now N-deglycosylated using PNGase F. This treatment will not affect column retention of nonglycosylated peptides in the second run. However, it will produce chromatographic shifts for the first three peptide classes mentioned above, because glycans are removed. Interestingly, whereas the 16O/18O variants of nonsialylated N-glycopeptides (third class) coelute and show similar ion intensities as isotopic couples, in vivo sialylated peptides show up as single isotopic variants at different elution times (Figure 3). This “indirect” sorting process identifies sialylated peptides in retrospect based on three criteria: the chromatographic shift between sialylated and desialylated peptides, their appearance as isotopic singles (16O or 18O), and the Asn-to-Asp conversion in the consensus motif for N-glycosylation as a result of deglycosylation. Thus, an important screening aspect is the search for isotope singles versus doublets. Because this is most easily done by MALDI-MS, we here used this approach: sorted peptides are first assessed in MS mode to determine whether peptides are singles (sialylated peptides, Figure 3) or couples (probably nonsialylated peptides, see below) before they are subjected to PSD analysis. In this way, sampling of nonsialylated peptides is avoided. Using this indirect sorting approach on a total of 500 µg of mouse serum protein material (an equivalent of 33 µL of undepleted serum), 117 glycosylated peptides with a single isotope pattern were identified after MALDI MS/MS analysis.
These peptides cover 27 proteins and 36 glycosylation sites (Table 1). Compared to our first round of analysis, we notice an overlap of 27 sites and 22 proteins, which are thus validated by this second, alternative approach. On the basis of known concentration ranges of mouse serum proteins and on the overlap of the current list of identified serum proteins (Table 1) with the proteins identified by our N-glycoproteomics COFRADIC approach,16 it is fair to state that the current COFRADIC protocol identifies proteins that span a concentration range of 5 or more orders of magnitude. The number of sialylated peptides identified with the “indirect” approach is clearly lower than with the direct protocol, although more than double the amount of material was used. Different reasons may account for this, related to both the nature of the peptides and the MS technique. For instance, we found that glycopeptides from abundant proteins eluted in broad intervals in which sialylated, heavy peptides and desialylated, light peptides are insufficiently separated and therefore appeared as couples in MALDI MS spectra, which were thus not considered for MS/MS analysis. Furthermore, the number of sialic acid residues on a glycan is an important factor for the COFRADIC sorting efficiency: if not enough sialic acid residues are present, the evoked difference in column retention can be small, so that such desialylated peptides elute in the same collection interval as their sialylated counterparts and are again observed as couples rather than singles. Finally, PSD often gives lower quality peptide fragmentation spectra that do not readily lead to unambiguous peptide identification.32 However, this problem can be alleviated when using a different peptide fragmentation mode like CID (as we used in the direct approach), which, when using ESI doubly or triply charged peptides, generally produces higher quality MS/MS spectra containing higher numbers of fragment ions that lead to increased peptide sequence coverage. Such an approach could, for instance, require that the isolated peptides are first analyzed in LC-MS mode and peptides of interest (single 16O or 18O labeled peptides) are then further selected for CID analysis (e.g., using m/z-inclusion lists during a second LC-MS/MS run).
Table 1. Identified Sialic Acid Containing Peptides in Depleted Mouse Serum(a)
identified protein | sialylation site
[The protocol A and protocol B columns of the original table (an x per site) are not reproduced here; their row alignment could not be recovered.]
Secreted proteins
Afamin [O89020] | N402 (known)
Alpha-1-acid glycoprotein 1 [Q60590] | N25 (known), N34 (known), N76 (known), N94 (known)
Alpha-1-antitrypsin 1-1 [P07758] | N64 (potential), N101 (potential), N265 (potential)
Alpha-1-antitrypsin 1-2 [P22599] | N265 (known)
Alpha-1-antitrypsin 1-4 [Q00897] | N101 (potential)
Alpha-2-antiplasmin [Q61247] | N126 (potential)
Alpha-2-HS-glycoprotein [P29699] | N156 (potential), N176 (known)
Alpha-2-macroglobulin [Q61838] | N157 (known), N405 (known), N412 (known), N568 (potential), N1003 (known), N1358 (known)
Antithrombin-III [P32261] | N129 (potential), N188 (known)
Beta-2-glycoprotein 1 [Q01339] | N162 (known)
Biotinidase [Q8CIF4] | N131 (known)
C4b-binding protein [P08607] | N227 (potential)
Carboxypeptidase B2 [Q9JHH6] | N72 (known), N107 (potential)
Carboxypeptidase N catalytic chain [Q9JJN5] | N143 (unknown)
Carboxypeptidase N subunit 2 [Q9DBB9] | N111 (known), N119 (known), N348 (known)
Ceruloplasmin [Q61147] | N138 (known), N757 (known)
Clusterin [Q06890] | N327 (known)
Coagulation factor X [O88947] | N187 (known), N218 (potential)
Coagulation factor XI [Q91Y47] | N449 (potential)
Coagulation factor XIII B chain [Q07968] | N545 (known)
Complement C4-B [P01029] | N224 (potential)
Complement component C8 gamma chain [Q8VCG4] | N158 (known), N173 (known)
Complement component C9 [P06683] | N263 (known)
Complement factor B [P04186] | N282 (known)
Complement factor D [P03953] | N124 (known), N176 (potential)
Complement factor H [P06909] | N773 (known)
Corticosteroid-binding globulin [Q06770] | N89 (known), N217 (known), N232 (known), N253 (potential), N320 (potential)
Fibronectin [P11276] | N1006 (known)
Haptoglobin [Q61646] | N148 (known), N182 (known), N256 (known), N264 (potential)
Heparin cofactor 2 [P49182] | N167 (known)
Insulin-like growth factor-binding protein complex acid labile chain [P70389] | N64 (potential)
Inter-alpha-trypsin inhibitor heavy chain H3 [Q61704] | N577 (known)
Kininogen-1 [O08677] | N168 (known)
Major urinary protein 3 precursor [P04939] | N66 (known)
Mannan-binding lectin serine protease 2 [Q91WP0] | N641 (known)
Murinoglobulin-1 [P28665] | N993 (known), N1142 (known)
Phosphatidylcholine-sterol acyltransferase [P16301] | N108 (potential), N296 (potential), N408 (known)
Phosphatidylinositol-glycan-specific phospholipase D1 [O70362] | N303 (potential), N317 (known), N655 (known)
Plasma kallikrein [P26262] | N127 (known)
Plasma protease C1 inhibitor [P97290] | N83 (potential), N243 (known)
Protein Z-dependent protease inhibitor [Q8R121] | N81 (known), N184 (known), N278 (known), N299 (known)
Serine protease inhibitor A3K [P07759] | N105 (potential), N185 (known), N270 (known)
Serine protease inhibitor A3M [Q03734] | N269 (known)
Sulfated glycoprotein 1 [Q61207] | N334 (potential)
Thyroxine-binding globulin [P61939] | N168 (potential)
Vitamin K-dependent protein C [P33587] | N214 (known)
Plasma membrane proteins
Complement-activating component of Ra-reactive factor [P98064] | N183 (potential)
Epidermal growth factor receptor [Q01279] | N128 (potential), N352 (known)
Leukemia inhibitory factor receptor [P42703] | N385 (known), N402 (known), N675 (known)
Macrophage colony-stimulating factor 1 receptor [P09581] | N491 (known)
Platelet glycoprotein V [O08742] | N51 (potential)
Intracellular proteins
Liver carboxylesterase N [P23953] | N377 (known)
Nuclear factor of activated T-cells 5 [Q9WV30] | N252 (unknown)
(a) All identified proteins that contained sialylated N-glycans are indicated by their name and Swiss-Prot accession number. Identified sites of sialylation in each protein are shown in the right column, and it is indicated whether these sites are known, potential, or unknown N-glycosylation sites according to the information stored in the Swiss-Prot database.
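The screening for isotope singles versus couples used in the indirect protocol can be sketched in Python. This is a deliberately simplified illustration, not the procedure implemented in the instrument software: it assumes singly charged ions, a list of already deisotoped monoisotopic m/z values, complete incorporation of two oxygen-18 atoms, and an arbitrary matching tolerance. The example m/z values are hypothetical.

```python
# Mass difference caused by exchanging two 16O for two 18O atoms (Da).
O18_SHIFT = 2 * (17.9991610 - 15.9949146)   # about 4.0085 Da


def classify_peaks(mono_mzs, tol=0.05):
    """Label each monoisotopic m/z as 'couple' if a partner lies one
    O18_SHIFT away (within tol), otherwise as 'single'."""
    labels = {}
    for mz in mono_mzs:
        has_partner = any(
            abs(abs(mz - other) - O18_SHIFT) <= tol
            for other in mono_mzs
            if other != mz
        )
        labels[mz] = "couple" if has_partner else "single"
    return labels


if __name__ == "__main__":
    spectrum = [1500.70, 1504.71, 1800.90]   # hypothetical m/z values
    for mz, label in classify_peaks(spectrum).items():
        print(mz, label)
```

Only peaks labeled as singles would be passed on for fragmentation; a practical implementation would also compare ion intensities within a couple, which the sketch omits.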
Figure 3. Examples of validated sialic acid containing N-glycopeptides. Following our N-COFRADIC setup depicted in Figure 1, several sialylated N-glycopeptides were validated because only the presence of sialic acid can cause a difference in elution time on a RP-HPLC column. The MS spectra of 2 peptides are shown (A-B) of which we identified both the light and heavy isotopic variant in distinct secondary fractions (ELHHLQEQD*VSNVFLDK from ceruloplasmin (A) and GDEKED*ITAEALDLSLK from inter-alpha-trypsin inhibitor heavy chain H (B)).
Concluding Remarks
In this report, we described two approaches to isolate and identify sialylated N-glycopeptides on a proteome-wide scale. Central is the use of a neuraminidase with broad specificity for removing sialic acid from the glycan backbone, which results in altered retention of affected peptides on a RP column. Sorted, desialylated peptides are further N-deglycosylated with PNGaseF, which makes it possible to identify the corresponding peptide backbone sequence following MS/MS analysis. This also means that, by default, O-linked sialylated peptides will be missed. In both methods, a sialylation event is identified at the peptide level. In the first protocol (direct sorting), the criteria are a shift to later column retention times and the Asn-to-Asp conversion in the NXS/T motif. In the second protocol (indirect sorting), peptide sialylation is concluded based on its appearance as an isotope single (18O), a chromatographic shift, and the Asn-to-Asp conversion. False positives could have been produced if chemical deamidation occurred in peptides between the two consecutive COFRADIC runs, producing “false shifts”. Considering the buffer conditions used for desialylation or PNGaseF treatment, we feel potential partial deamidation may only occur at sites known to be highly sensitive to spontaneous deamidation, such as Asn-Gly and Asn-Ser sequences.33 However, we noticed that less than 10% of sorted peptides contain a deamidated Asn not in the consensus motif for N-glycosylation and further omitted such potential false positives from the reported list of identifications. Alternative procedures for identifying sialylated proteins have been described. These typically used lectin-affinity chromatography enriching for potentially sialylated proteins or peptides. The group of Regnier, for instance, used SNA at one point in their proteomic route to enrich sialylated peptides and identified 34-37 proteins (50 and 52 potential sialylation sites, respectively) after analyzing 1 mL of nondepleted human serum.28,34 Zhao and co-workers reported the identification of approximately 130 sialylated glycoproteins (with no mention of the number of sites actually identified) in human serum depleted of 12 of its most abundant proteins.35 In their work, equivalents of 125 µL of serum samples were loaded on three different lectin columns, two of which are rather specific for sialic acids (SNA and MAA lectin) and one with a rather poor selectivity (wheat germ agglutinin, WGA). Isolated glycoproteins were then further fractionated by LC and SDS-PAGE and finally identified by MS. Although a linear comparison between our results and those mentioned above is not possible, it is tempting to state that, given the amount of analyzed material (equivalents of 13-33 µL of nondepleted serum) and the number of identified proteins (53) and characterized sites (93),
our sialo-COFRADIC is at least as sensitive as, if not more sensitive than, lectin-affinity-based methods. Clearly, in these affinity procedures, selectivity toward the type of sialylation can be incorporated: for instance, MAA lectin recognizes α-2,3 linked sialic acid, while SNA lectin displays a preference for the α-2,6 linkage. This, however, also implies that to isolate the sialome as completely as possible, several sialic acid specific lectins need to be combined. It is well-known that even in such a multi-lectin or serial-lectin affinity setup eventually not all glycosylated peptides or proteins are captured,9 indicating that some sialylated glycoforms might be missed. Furthermore, lectins tend to have a too broad specificity, and nonglycosylated peptides are typically coisolated, which thus contaminates further analysis. Because we did not use lectin-affinity chromatography, for instance to pre-enrich sialylated proteins or peptides, sialo-COFRADIC does not suffer from these inherent drawbacks. Furthermore, in each of these procedures, time-consuming analytical steps need to be undertaken for detailed identification of the sialylated sites, which is not necessary in our COFRADIC approach, which immediately targets sialylated peptides. In addition, like selection by different lectins in affinity-based approaches, our desialylation approach could similarly sort for different linkages by using neuraminidases with selective properties (e.g., Streptococcus pneumoniae neuraminidase specific for the α-2,3 linkage36). Recently, Larsen and co-workers demonstrated that titanium dioxide may be used to enrich for sialylated glycopeptides,37 whereas the group of Sickmann used strong cation exchange to enrich sialylated glycopeptides from a digest of platelet plasma membranes.38 Although both approaches are quite simple and straightforward, they may suffer from the coisolation of sulfated and phosphorylated glycopeptides. Our approach clearly avoids this because it directly puts selectivity for peptide isolation on the sialic acid moiety by its selective removal by neuraminidases. Therefore, one may expect that our approach produces fewer “false positives”. Unfortunately, a direct comparison of our results with those mentioned above37,38 is not possible. Whereas we used mouse serum, Larsen’s and Sickmann’s groups used human plasma and saliva and human platelet plasma membranes, respectively, for their analyses. We here identified in total 93 sites in 53 mouse serum proteins (Table 1); 63 glycosylation sites are annotated in Swiss-Prot as known, 28 as potential, and 2 sites are hitherto unknown. The majority of the identified proteins (46) are plasma residents (Table 1), and a second main group of proteins are soluble forms of membrane proteins (5), whose presence can be explained by membrane/receptor shedding. This indicates that our approach could also be used to study the sialylation pattern of isolated membrane protein fractions. Indeed, most of the expected glycosylation and sialylation sites are located in the extracellular regions of membrane proteins.
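The totals quoted in the text are mutually consistent, which can be verified with a short Python check. All numbers below are taken from the article (sites and proteins per protocol, their overlap, and the Swiss-Prot annotation and localization counts); only the helper name `union_size` is introduced here for illustration.

```python
def union_size(count_a, count_b, overlap):
    """Size of the union of two sets, given their sizes and their overlap."""
    return count_a + count_b - overlap


# Protocol A: 84 sites in 48 proteins; protocol B: 36 sites in 27 proteins;
# overlap between protocols: 27 sites and 22 proteins.
total_sites = union_size(84, 36, 27)
total_proteins = union_size(48, 27, 22)

# Swiss-Prot annotation of the identified sites.
annotation = {"known": 63, "potential": 28, "unknown": 2}

# Localization of the identified proteins (Table 1 groups).
localization = {"secreted": 46, "plasma membrane": 5, "intracellular": 2}

if __name__ == "__main__":
    print(total_sites, total_proteins)
    print(sum(annotation.values()), sum(localization.values()))
```

Both routes lead to 93 sites and 53 proteins, in agreement with the figures reported for the combined data set.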
If we assume that similar sialylation patterns are produced in the human protein orthologues, our study suggests that the proteo-sialylation approach can cover potential biomarkers for cancers and pathophysiological conditions. Haptoglobin, for instance, is an acute-phase protein that is mainly produced by the liver and secreted into the bloodstream. It contains 4 N-glycosylation sites that are known to be sialylated,39 and we here identified all of them. Changes in the sialic acid content of the N-glycans of haptoglobin occur in hepatocellular carcinomas.40 Alpha-1-acid glycoprotein is a highly glycosylated protein; about 45% of its molecular weight consists of carbohydrates.41 Here, 4 N-glycosites containing sialic acid were identified. Desialylation of this protein leads to enhanced inhibition of platelet aggregation,42 while the sialyl Lewis X form induced during inflammation43 ameliorates both complement- and neutrophil-mediated injuries.44,45 Hypersialylation of alpha-1-acid glycoprotein was found to be associated with lymphomas.36 Finally, murinoglobulin is a known inhibitor of hemagglutination; desialylation of this glycoprotein abolishes the inhibitory effect.46 In this respect, it is important to stress that the 18O/16O labeling approach (protocol B) can equally be used in a differential setup in which sera from healthy and diseased states are differently labeled and their ratios measured. Such a strategy need not be restricted to differential oxygen labeling21 but could also use the iTRAQ technology,47,48 opening the way for multiplex analysis strategies.
Abbreviations: COFRADIC, combined fractional diagonal chromatography; iTRAQ, isobaric tags for relative and absolute quantification; MAA, Maackia amurensis lectin; MARS, multiple affinity removal system; PNGaseF, peptide N-glycosidase F; SLAC, serial lectin affinity chromatography; SNA, Sambucus nigra lectin; TCEP, tris-(carboxyethyl)phosphine; WGA, wheat germ agglutinin.
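As an illustration of the differential 18O/16O setup mentioned at the end of the discussion, the following minimal sketch computes a log2 light/heavy ratio per sorted peptide and flags peptides whose ratio exceeds a cutoff. It is not part of the published protocol: the peptide names, intensities, the assignment of the light (16O) channel to one serum and the heavy (18O) channel to the other, and the twofold cutoff are all assumptions made for the example.

```python
import math


def log2_ratio(light_intensity: float, heavy_intensity: float) -> float:
    """Return log2(light/heavy) for one 16O/18O peptide pair."""
    if light_intensity <= 0 or heavy_intensity <= 0:
        raise ValueError("intensities must be positive")
    return math.log2(light_intensity / heavy_intensity)


def flag_changed(pairs: dict, threshold: float = 1.0) -> dict:
    """Keep peptides whose |log2 ratio| is at least `threshold`.

    `pairs` maps a peptide label to (light, heavy) intensities.
    The default threshold of 1.0 (a twofold change) is an assumed cutoff.
    """
    changed = {}
    for peptide, (light, heavy) in pairs.items():
        ratio = log2_ratio(light, heavy)
        if abs(ratio) >= threshold:
            changed[peptide] = ratio
    return changed


# Hypothetical intensities, invented for illustration only.
example = {
    "PEPTIDE_A": (2000.0, 1000.0),  # twofold higher in the light channel
    "PEPTIDE_B": (1000.0, 1100.0),  # essentially unchanged
    "PEPTIDE_C": (250.0, 1000.0),   # fourfold lower in the light channel
}
changed = flag_changed(example)
```

In practice, the intensities would have to be corrected for overlapping isotope envelopes and incomplete 18O incorporation before ratios are interpreted; the sketch omits these steps.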
Acknowledgment. We thank Dr. Koen Kas (ProNota N.V., Zwijnaarde-Ghent, Belgium) for providing us with the mouse serum samples. The lab is supported by research grants from the Fund for Scientific Research-Flanders (Belgium) (G.0280.07), the Concerted Research Actions (GOA) from the Ghent University, the Inter University Attraction Poles (IUAP05), and the European Union Interaction Proteome (6th Framework Program).
References
(1) Schauer, R. Achievements and challenges of sialic acid research. Glycoconj. J. 2000, 17 (7-9), 485-499.
(2) Schauer, R. Biosynthesis and function of N- and O-substituted sialic acids. Glycobiology 1991, 1 (5), 449-452.
(3) Chiricolo, M.; Malagolini, N.; Bonfiglioli, S.; Dall’Olio, F. Phenotypic changes induced by expression of beta-galactoside alpha2,6 sialyltransferase I in the human colon cancer cell line SW948. Glycobiology 2006, 16 (2), 146-154.
(4) Taniguchi, A.; Hasegawa, Y.; Higai, K.; Matsumoto, K. Transcriptional regulation of human beta-galactoside alpha2,6-sialyltransferase (hST6Gal I) gene during differentiation of the HL-60 cell line. Glycobiology 2000, 10 (6), 623-628.
(5) Aas-Eng, D. A.; Asheim, H. C.; Deggerdal, A.; Smeland, E.; Funderud, S. Characterization of a promoter region supporting transcription of a novel human beta-galactoside alpha-2,6-sialyltransferase transcript in HepG2 cells. Biochim. Biophys. Acta 1995, 1261 (1), 166-169.
(6) Petrick, A. T.; Meterissian, S.; Steele, G., Jr.; Thomas, P. Desialylation of metastatic human colorectal carcinoma cells facilitates binding to Kupffer cells. Clin. Exp. Metastasis 1994, 12 (2), 108-116.
(7) Suer, S.; Sonmez, H.; Karaaslan, I.; Baloglu, H.; Kokoglu, E. Tissue sialic acid and fibronectin levels in human prostatic cancer. Cancer Lett. 1996, 99 (2), 135-137.
(8) Yamada, N.; Chung, Y. S.; Takatsuka, S.; Arimoto, Y.; Sawada, T.; Dohi, T.; Sowa, M. Increased sialyl Lewis A expression and fucosyltransferase activity with acquisition of a high metastatic capacity in a colon cancer cell line. Br. J. Cancer 1997, 76 (5), 582-587.
(9) Yang, Z.; Hancock, W. S. Approach to the comprehensive analysis of glycoproteins isolated from human serum using a multi-lectin affinity column. J. Chromatogr. A 2004, 1053 (1-2), 79-88.
(10) Cummings, R. D.; Kornfeld, S. Fractionation of asparagine-linked oligosaccharides by serial lectin-Agarose affinity chromatography. A rapid, sensitive, and specific technique. J. Biol. Chem. 1982, 257 (19), 11235-11240.
(11) Brown, J. R.; Hartley, B. S. Location of disulphide bridges by diagonal paper electrophoresis. The disulphide bridges of bovine chymotrypsinogen A. Biochem. J. 1966, 101 (1), 214-228.
(12) Gevaert, K.; Van Damme, J.; Goethals, M.; Thomas, G. R.; Hoorelbeke, B.; Demol, H.; Martens, L.; Puype, M.; Staes, A.; Vandekerckhove, J. Chromatographic isolation of methionine-containing peptides for gel-free proteome analysis: identification of more than 800 Escherichia coli proteins. Mol. Cell. Proteomics 2002, 1 (11), 896-903.
(13) Gevaert, K.; Ghesquiere, B.; Staes, A.; Martens, L.; Van Damme, J.; Thomas, G. R.; Vandekerckhove, J. Reversible labeling of cysteine-containing peptides allows their specific chromatographic isolation for non-gel proteome studies. Proteomics 2004, 4 (4), 897-908.
(14) Gevaert, K.; Goethals, M.; Martens, L.; Van Damme, J.; Staes, A.; Thomas, G. R.; Vandekerckhove, J. Exploring proteomes and analyzing protein processing by mass spectrometric identification of sorted N-terminal peptides. Nat. Biotechnol. 2003, 21 (5), 566-569.
(15) Gevaert, K.; Staes, A.; Van Damme, J.; De Groot, S.; Hugelier, K.; Demol, H.; Martens, L.; Goethals, M.; Vandekerckhove, J. Global phosphoproteome analysis on human HepG2 hepatocytes using reversed-phase diagonal LC. Proteomics 2005, 5 (14), 3589-3599.
(16) Ghesquiere, B.; Van Damme, J.; Martens, L.; Vandekerckhove, J.; Gevaert, K. Proteome-wide characterization of N-glycosylation events by diagonal chromatography. J. Proteome Res. 2006, 5 (9), 2438-2447.
Journal of Proteome Research • Vol. 6, No. 11, 2007 4311
(17) Hanoulle, X.; Damme, J. V.; Staes, A.; Martens, L.; Goethals, M.; Vandekerckhove, J.; Gevaert, K. A new functional, chemical proteomics technology to identify purine nucleotide binding sites in complex proteomes. J. Proteome Res. 2006, 5 (12), 3438-3445.
(18) Van Damme, P.; Martens, L.; Van Damme, J.; Hugelier, K.; Staes, A.; Vandekerckhove, J.; Gevaert, K. Caspase-specific and nonspecific in vivo protein processing during Fas-induced apoptosis. Nat. Methods 2005, 2 (10), 771-777.
(19) Meuleman, P.; Libbrecht, L.; De Vos, R.; de Hemptinne, B.; Gevaert, K.; Vandekerckhove, J.; Roskams, T.; Leroux-Roels, G. Morphological and biochemical characterization of a human liver in a uPA-SCID mouse chimera. Hepatology 2005, 41 (4), 847-856.
(20) Gevaert, K.; Pinxteren, J.; Demol, H.; Hugelier, K.; Staes, A.; Van Damme, J.; Martens, L.; Vandekerckhove, J. Four stage liquid chromatographic selection of methionyl peptides for peptide-centric proteome analysis: the proteome of human multipotent adult progenitor cells. J. Proteome Res. 2006, 5 (6), 1415-1428.
(21) Staes, A.; Demol, H.; Van Damme, J.; Martens, L.; Vandekerckhove, J.; Gevaert, K. Global differential non-gel proteomics by quantitative and stable labeling of tryptic peptides with oxygen-18. J. Proteome Res. 2004, 3 (4), 786-791.
(22) Perkins, D. N.; Pappin, D. J.; Creasy, D. M.; Cottrell, J. S. Probability-based protein identification by searching sequence databases using mass spectrometry data. Electrophoresis 1999, 20 (18), 3551-3567.
(23) Spiro, R. G. Studies on fetuin, a glycoprotein of fetal serum. II. Nature of the carbohydrate units. J. Biol. Chem. 1962, 237, 382-388.
(24) Bendiak, B.; Harris-Brandts, M.; Michnick, S. W.; Carver, J. P.; Cumming, D. A. Separation of the complex asparagine-linked oligosaccharides of the glycoprotein fetuin and elucidation of three triantennary structures having sialic acids linked only to galactose residues. Biochemistry 1989, 28 (15), 6491-6499.
(25) Cumming, D. A.; Hellerqvist, C. G.; Harris-Brandts, M.; Michnick, S. W.; Carver, J. P.; Bendiak, B. Structures of asparagine-linked oligosaccharides of the glycoprotein fetuin having sialic acid linked to N-acetylglucosamine. Biochemistry 1989, 28 (15), 6500-6512.
(26) Treuheit, M. J.; Costello, C. E.; Halsall, H. B. Analysis of the five glycosylation sites of human alpha 1-acid glycoprotein. Biochem. J. 1992, 283 (Pt 1), 105-112.
(27) Schmid, K.; Nimberg, R. B.; Kimura, A.; Yamaguchi, H.; Binette, J. P. The carbohydrate units of human plasma alpha1-acid glycoprotein. Biochim. Biophys. Acta 1977, 492 (2), 291-302.
(28) Qiu, R.; Regnier, F. E. Comparative glycoproteomics of N-linked complex-type glycoforms containing sialic acid in human serum. Anal. Chem. 2005, 77 (22), 7225-7231.
(29) Lewandrowski, U.; Moebius, J.; Walter, U.; Sickmann, A. Elucidation of N-glycosylation sites on human platelet proteins: a glycoproteomic approach. Mol. Cell. Proteomics 2006, 5 (2), 226-233.
(30) Xiong, L.; Regnier, F. E. Use of a lectin affinity selector in the search for unusual glycosylation in proteomics. J. Chromatogr., B: Anal. Technol. Biomed. Life Sci. 2002, 782 (1-2), 405-418.
(31) Tarentino, A. L.; Gomez, C. M.; Plummer, T. H., Jr. Deglycosylation of asparagine-linked glycans by peptide:N-glycosidase F. Biochemistry 1985, 24 (17), 4665-4671.
(32) Gevaert, K.; Demol, H.; Martens, L.; Hoorelbeke, B.; Puype, M.; Goethals, M.; Van Damme, J.; De Boeck, S.; Vandekerckhove, J. Protein identification based on matrix assisted laser desorption/ionization-post source decay-mass spectrometry. Electrophoresis 2001, 22 (9), 1645-1651.
(33) Krokhin, O. V.; Antonovici, M.; Ens, W.; Wilkins, J. A.; Standing, K. G. Deamidation of -Asn-Gly- sequences during sample preparation for proteomics: Consequences for MALDI and HPLC-MALDI analysis. Anal. Chem. 2006, 78 (18), 6645-6650.
(34) Qiu, R.; Regnier, F. E. Use of multidimensional lectin affinity chromatography in differential glycoproteomics. Anal. Chem. 2005, 77 (9), 2802-2809.
(35) Zhao, J.; Simeone, D. M.; Heidt, D.; Anderson, M. A.; Lubman, D. M. Comparative serum glycoproteomics using lectin selected sialic acid glycoproteins with mass spectrometric analysis: application to pancreatic cancer serum. J. Proteome Res. 2006, 5 (7), 1792-1802.
(36) Cassidy, J. T.; Jourdian, G. W.; Roseman, S. The sialic acids. VI. Purification and properties of sialidase from Clostridium perfringens. J. Biol. Chem. 1965, 240 (9), 3501-3506.
(37) Larsen, M. R.; Jensen, S. S.; Jakobsen, L. A.; Heegaard, N. H. Exploring the sialiome using titanium dioxide chromatography and mass spectrometry. Mol. Cell. Proteomics 2007.
(38) Lewandrowski, U.; Zahedi, R. P.; Moebius, J.; Walter, U.; Sickmann, A. Enhanced N-glycosylation site analysis of sialoglycopeptides by strong cation exchange prefractionation applied to platelet plasma membranes. Mol. Cell. Proteomics 2007.
(39) Turner, G. A. Haptoglobin. A potential reporter molecule for glycosylation changes in disease. Adv. Exp. Med. Biol. 1995, 376, 231-238.
(40) Ang, I. L.; Poon, T. C.; Lai, P. B.; Chan, A. T.; Ngai, S. M.; Hui, A. Y.; Johnson, P. J.; Sung, J. J. Study of serum haptoglobin and its glycoforms in the diagnosis of hepatocellular carcinoma: a glycoproteomic approach. J. Proteome Res. 2006, 5 (10), 2691-2700.
(41) Fournier, T.; Medjoubi, N. N.; Porquet, D. Alpha-1-acid glycoprotein. Biochim. Biophys. Acta 2000, 1482 (1-2), 157-171.
(42) Costello, M.; Fiedel, B. A.; Gewurz, H. Inhibition of platelet aggregation by native and desialised alpha-1 acid glycoprotein. Nature 1979, 281 (5733), 677-678.
(43) Chavan, M. M.; Kawle, P. D.; Mehta, N. G. Increased sialylation and defucosylation of plasma proteins are early events in the acute phase response. Glycobiology 2005, 15 (9), 838-848.
(44) De Graaf, T. W.; Van der Stelt, M. E.; Anbergen, M. G.; van Dijk, W. Inflammation-induced expression of sialyl Lewis X-containing glycan structures on alpha 1-acid glycoprotein (orosomucoid) in human sera. J. Exp. Med. 1993, 177 (3), 657-666.
(45) Williams, J. P.; Weiser, M. R.; Pechet, T. T.; Kobzik, L.; Moore, F. D., Jr.; Hechtman, H. B. alpha 1-Acid glycoprotein reduces local and remote injuries after intestinal ischemia in the rat. Am. J. Physiol. 1997, 273 (5 Pt 1), G1031-G1035.
(46) Kitame, F.; Nakamura, K.; Saito, A.; Sinohara, H.; Homma, M. Isolation and characterization of influenza C virus inhibitor in rat serum. Virus Res. 1985, 3 (3), 231-244.
(47) Ross, P. L.; Huang, Y. N.; Marchese, J. N.; Williamson, B.; Parker, K.; Hattan, S.; Khainovski, N.; Pillai, S.; Dey, S.; Daniels, S.; Purkayastha, S.; Juhasz, P.; Martin, S.; Bartlet-Jones, M.; He, F.; Jacobson, A.; Pappin, D. J. Multiplexed protein quantitation in Saccharomyces cerevisiae using amine-reactive isobaric tagging reagents. Mol. Cell. Proteomics 2004, 3 (12), 1154-1169.
(48) Wiese, S.; Reidegeld, K. A.; Meyer, H. E.; Warscheid, B. Protein labeling by iTRAQ: a new tool for quantitative mass spectrometry in proteome research. Proteomics 2007, 7 (3), 340-350.
PR0703728