Anal. Chem. 2010, 82, 7722–7728
An Approach to Quantifying N-Linked Glycoproteins by Enzyme-Catalyzed 18O3-Labeling of Solid-Phase Enriched Glycopeptides

Quazi Shakey, Brian Bates, and Jiang Wu*

Global Biotherapeutics Technologies, Pfizer, Cambridge, Massachusetts 02140

Global analysis of glycoproteins shows great promise for the discovery of therapeutic targets and clinical biomarkers. Selective capture of glycopeptides by hydrazide resin, followed by mass spectrometric identification of the peptides released by PNGaseF treatment, has been the most widely used approach. However, the majority of the reports using this approach focus on global profiling rather than relative quantitation of glycoprotein alterations in pathological states. We describe an integrated strategy allowing for relative quantitation of glycoproteins in complex biological mixtures using this approach. The strategy includes periodate oxidation of tryptic digests and solid-phase enrichment of glycopeptides via hydrazide-coupled magnetic beads, in conjunction with 18O stable isotope labeling catalyzed by both trypsin and PNGaseF, and subsequent identification and quantitation by LC-MS/MS analysis. Three 18O atoms (18O3) are incorporated into N-linked glycopeptides for samples treated in 18O-water: two at the carboxyl terminus by trypsin during hydrazide coupling and the third at the N-glycosylation site through PNGaseF-mediated deglycosylation. Thus, mass shifts of 6 and 8 Da are indicative of singly and doubly glycosylated peptides, respectively. Experimental conditions were optimized to promote the trypsin-mediated 18O2 incorporation and prevent back-exchange. The accuracy, reproducibility, and linearity of relative quantitation were evaluated by using 15 glycoproteins spiked into mouse serum at different concentration ratios. Using this approach, we were able to identify and quantitate 224 N-glycopeptides representing 130 unique glycoproteins from 20 µL of the undepleted mouse serum samples.
The strategy can be easily adapted to the analysis of glycoproteins in tissues, cell lines, and other sample origins.

Protein glycosylation is one of the most prevalent posttranslational modifications. It plays key roles in many biological processes, such as cell communication, signaling, adhesion, molecular recognition, protein conformation, and folding. Over 50% of mammalian proteins are glycosylated. Alterations in glycosylation pattern and abundance have been indicated in various diseases such as cancer, Alzheimer’s disease, and rheumatoid arthritis. Therefore, the characterization of glycosylation profiles and their changes in pathological states may ultimately lead to the elucidation of their biological significance and help in understanding their roles in the disease process.1-4 Carbohydrates are typically linked to serine or threonine residues (O-linked) or to asparagine residues (N-linked). N-Linked glycosylation has a well-defined amino acid consensus motif of N-X-S/T, where X represents any amino acid except proline. Such N-linked glycans can be released by peptide-N-glycosidase F (PNGaseF) with high specificity. In addition, N-glycosylation is common in extracellular proteins, such as the extracellular domain of plasma membrane proteins and proteins secreted from cells to surrounding tissues, blood, or other body fluids, which play major roles in molecular and cellular recognition and intercellular communication. Such proteins can be attractive therapeutic targets and serve as clinical biomarkers for diagnosis and drug efficacy.1,3-5

Recent advances in mass spectrometric instruments, enrichment techniques, and data analysis tools have made it possible to identify global glycosylation of proteins in various biological samples in an unbiased manner, enabling the monitoring of hundreds of site-specific glycosylations in a single experiment.6-10 A wide variety of strategies have been investigated for this purpose, of which solid-phase capture by hydrazide chemistry has been most widely accepted.6,7,10-21

* To whom correspondence should be addressed. Phone: 617-665-8157. Fax: 617-665-8350. E-mail: [email protected].

10.1021/ac101564t © 2010 American Chemical Society. Published on Web 08/26/2010
Analytical Chemistry, Vol. 82, No. 18, September 15, 2010

(1) An, H. J.; Kronewitter, S. R.; de Leoz, M. L.; Lebrilla, C. B. Curr. Opin. Chem. Biol. 2009, 13 (5-6), 601–7.
(2) Durand, G.; Seta, N. Clin. Chem. 2000, 46 (6 Pt. 1), 795–805.
(3) Peracaula, R.; Barrabes, S.; Sarrats, A.; Rudd, P. M.; de Llorens, R. Dis. Markers 2008, 25 (4-5), 207–18.
(4) Spiro, R. G. Glycobiology 2002, 12 (4), 43R–56R.
(5) Bhat, S.; Czuczman, M. S. Expert Opin. Biol. Ther. 2010, 10 (3), 451–8.
(6) Chen, R.; Jiang, X.; Sun, D.; Han, G.; Wang, F.; Ye, M.; Wang, L.; Zou, H. J. Proteome Res. 2009, 8 (2), 651–61.
(7) McDonald, C. A.; Yang, J. Y.; Marathe, V.; Yen, T. Y.; Macher, B. A. Mol. Cell Proteomics 2009, 8 (2), 287–301.
(8) Schiess, R.; Mueller, L. N.; Schmidt, A.; Mueller, M.; Wollscheid, B.; Aebersold, R. Mol. Cell Proteomics 2009, 8 (4), 624–38.
(9) Wollscheid, B.; Bausch-Fluck, D.; Henderson, C.; O’Brien, R.; Bibel, M.; Schiess, R.; Aebersold, R.; Watts, J. D. Nat. Biotechnol. 2009, 27 (4), 378–86.
(10) Zhang, H.; Li, X. J.; Martin, D. B.; Aebersold, R. Nat. Biotechnol. 2003, 21 (6), 660–6.
(11) Calvano, C. D.; Zambonin, C. G.; Jensen, O. N. J. Proteomics 2008, 71 (3), 304–17.
(12) Dai, Z.; Zhou, J.; Qiu, S. J.; Liu, Y. K.; Fan, J. Electrophoresis 2009, 30 (17), 2957–66.
(13) Drake, R. R.; Schwegler, E. E.; Malik, G.; Diaz, J.; Block, T.; Mehta, A.; Semmes, O. J. Mol. Cell Proteomics 2006, 5 (10), 1957–67.
(14) Hirabayashi, J. J. Biochem. 2008, 144 (2), 139–47.

In this approach, originally developed by Zhang et al.,10 the cis-diol groups in the carbohydrates are oxidized by periodate to aldehydes, which are then captured by covalent
coupling to hydrazide resin. Nonglycosylated proteins/peptides are removed by extensive washing, and the remaining glycopeptides are released by PNGaseF treatment, which cleaves N-linked glycans from glycoproteins/peptides. The glycosylated asparagine residue is thus deamidated to aspartic acid, leading to a mass shift of +0.9840 Da at the site of glycosylation. Although the protocol was first described for global analysis of N-linked glycoproteins in plasma, it has been adapted to glycoprotein profiling of tissues, other body fluids, and secreted glycoproteins in culture medium, among others.6,16,22-25 Recently, the strategy has been extended to selective labeling of live cells, aiming to identify disease-associated cell surface glycoproteins.7-9

Relative quantitation of protein glycosylation in different pathological samples has been achieved by a variety of stable-isotope labeling methods, such as metabolic labeling by SILAC or chemical labeling of the enriched glycopeptides by iTRAQ or deuterium-coded succinic anhydride reagents.10,18,24,26 However, the SILAC approach cannot be applied to body fluids and/or tissues. Chemical labeling using deuterium-coded succinic anhydride was described in the original protocol,10,18 but it is not trivial to achieve quantitative yet highly specific labeling on α-amine groups. Furthermore, the 4 Da mass difference induced by deuterium labeling is insufficient to differentiate the light and heavy analogues of large peptides because of the overlap with natural isotopes.

Enzyme-catalyzed transfer of two 18O atoms from water to a peptide C-terminal carboxylate group is a simple and universal approach to introducing stable-isotopic tags for relative quantification.27-30 The samples to be compared are treated by enzyme in H₂¹⁶O and H₂¹⁸O, respectively, and the paired peptides are uniformly labeled with either 16O2 (labeling by two 16O atoms) or 18O2 (labeling by two 18O atoms), leading to a mass difference of 4 Da for single chain peptides.
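The mass shifts quoted above (+0.9840 Da for deamidation, 4 Da for C-terminal 18O2, and the 6 and 8 Da shifts for 18O3- and 18O4-labeled peptides) follow from monoisotopic atomic masses. The short Python sketch below reproduces that arithmetic; the constant and function names are illustrative assumptions and are not taken from the paper.

```python
# Monoisotopic atomic masses in Da (standard values, rounded).
M_16O = 15.994915
M_18O = 17.999160
M_H = 1.007825
M_N = 14.003074

# Mass gained for every 16O atom replaced by 18O.
DELTA_18O = M_18O - M_16O

# Asn -> Asp conversion: the side chain loses NH and gains O.
deamidation_16o = M_16O - (M_N + M_H)        # oxygen taken from 16O-water
deamidation_18o = deamidation_16o + DELTA_18O  # oxygen taken from 18O-water

# Two carboxyl oxygens exchanged at the peptide C-terminus.
c_terminal_18o2 = 2 * DELTA_18O


def heavy_light_shift(n_glycosites: int) -> float:
    """Mass difference between the 18O- and 16O-labeled forms of a peptide:
    two C-terminal oxygens plus one oxygen per deglycosylated Asn."""
    return (2 + n_glycosites) * DELTA_18O


if __name__ == "__main__":
    print(f"deamidation in 16O-water: {deamidation_16o:+.4f} Da")
    print(f"deamidation in 18O-water: {deamidation_18o:+.4f} Da")
    print(f"C-terminal 18O2:          {c_terminal_18o2:+.4f} Da")
    for n in (1, 2):
        print(f"{n} glycosite(s): heavy-light shift {heavy_light_shift(n):.4f} Da")
```

The computed values agree with the rounded figures used in the text to within about 0.2 mDa.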
(15) Liu, T.; Qian, W. J.; Gritsenko, M. A.; Camp, D. G., II; Monroe, M. E.; Moore, R. J.; Smith, R. D. J. Proteome Res. 2005, 4 (6), 2070–80.
(16) Pan, S.; Wang, Y.; Quinn, J. F.; Peskind, E. R.; Waichunas, D.; Wimberger, J. T.; Jin, J.; Li, J. G.; Zhu, D.; Pan, C.; Zhang, J. J. Proteome Res. 2006, 5 (10), 2769–79.
(17) Sun, B.; Ranish, J. A.; Utleg, A. G.; White, J. T.; Yan, X.; Lin, B.; Hood, L. Mol. Cell Proteomics 2007, 6 (1), 141–9.
(18) Tian, Y.; Zhou, Y.; Elliott, S.; Aebersold, R.; Zhang, H. Nat. Protoc. 2007, 2 (2), 334–9.
(19) Ueda, K.; Fukase, Y.; Katagiri, T.; Ishikawa, N.; Irie, S.; Sato, T. A.; Ito, H.; Nakayama, H.; Miyagi, Y.; Tsuchiya, E.; Kohno, N.; Shiwa, M.; Nakamura, Y.; Daigo, Y. Proteomics 2009, 9 (8), 2182–92.
(20) Zhao, J.; Simeone, D. M.; Heidt, D.; Anderson, M. A.; Lubman, D. M. J. Proteome Res. 2006, 5 (7), 1792–802.
(21) Zhou, Y.; Aebersold, R.; Zhang, H. Anal. Chem. 2007, 79 (15), 5826–37.
(22) Arcinas, A.; Yen, T. Y.; Kebebew, E.; Macher, B. A. J. Proteome Res. 2009, 8 (8), 3958–68.
(23) Hwang, H. J.; Quinn, T.; Zhang, J. Methods Mol. Biol. 2009, 566, 263–76.
(24) Lei, Z.; Beuerman, R. W.; Chew, A. P.; Koh, S. K.; Cafaro, T. A.; Urrets-Zavalia, E. A.; Urrets-Zavalia, J. A.; Li, S. F.; Serra, H. M. J. Proteome Res. 2009, 8 (4), 1992–2003.
(25) Ramachandran, P.; Boontheung, P.; Xie, Y.; Sondej, M.; Wong, D. T.; Loo, J. A. J. Proteome Res. 2006, 5 (6), 1493–503.
(26) Aggelis, V.; Craven, R. A.; Peng, J.; Harnden, P.; Cairns, D. A.; Maher, E. R.; Tonge, R.; Selby, P. J.; Banks, R. E. Proteomics 2009, 9 (8), 2118–30.
(27) Becker, G. W. Briefings Funct. Genomics Proteomics 2008, 7 (5), 371–82.
(28) Capelo, J. L.; Carreira, R. J.; Fernandes, L.; Lodeiro, C.; Santos, H. M.; Simal-Gandara, J. Talanta 2010, 80 (4), 1476–86.
(29) Miyagi, M.; Rao, K. C. Mass Spectrom. Rev. 2007, 26 (1), 121–36.
(30) Ye, X.; Luke, B.; Andresson, T.; Blonder, J. Briefings Funct. Genomics Proteomics 2009, 8 (2), 136–44.
(31) Yao, X.; Freas, A.; Ramirez, J.; Demirev, P. A.; Fenselau, C. Anal. Chem. 2001, 73 (13), 2836–42.

Since the pioneering work published by Yao et al.,31 this approach has been applied for
relative quantitation of different kinds of peptides in various biological samples.27-30,32 It has been demonstrated that sequential digestion with a proteolytic enzyme and PNGaseF in H₂¹⁸O leads to a mass shift of 6 Da for glycopeptides, a unique feature that can be utilized to quantify relative abundances of N-glycosylation and protein expression.33,34 However, there has so far been no report using 18O-tagging for relative quantitation of solid-phase captured glycopeptides.

In this report, we describe an integrated strategy allowing for global profiling of N-linked glycopeptides in complex mixtures by hydrazide magnetic bead capture in conjunction with 18O-based relative quantitation. For this purpose, we optimized conditions for periodate oxidation, hydrazide coupling, and trypsin- and PNGaseF-catalyzed incorporation of 18O in an attempt to minimize sample manipulation and improve accuracy and reproducibility of the quantitation. The protocol was evaluated using undepleted mouse serum spiked with a mixture of 15 glycoproteins at different concentrations. The method is robust, sensitive, and specific. With little change, this approach is also well suited to samples of other origins and thus has the potential to facilitate therapeutic and biomarker discovery.

MATERIALS AND METHODS

Materials. Sequence-grade modified trypsin was purchased from Promega (Madison, WI). BcMag hydrazide-modified magnetic beads (1 µm diameter) were from Bioclone (San Diego). Glycerol-free PNGaseF was from New England BioLabs (Ipswich, MA). 18O-Water (99% pure) was purchased from Cambridge Isotope Laboratories (Andover, MA).
Lyophilized mouse serum, sodium periodate, and all the glycoprotein standards (human α-2-HS-glycoprotein, human haptoglobin, human α-1-antitrypsin, human complement C4, human complement C3, α-1-antichymotrypsin, human α2-macroglobulin, bovine transferrin, bovine α-1-acid glycoprotein, bovine fetuin-B, yeast invertase, pig thyroglobulin, Aspergillus niger glucose oxidase, chicken ovalbumin, and horseradish peroxidase) were from Sigma-Aldrich (St. Louis, MO). Tris(2-carboxyethyl)phosphine (TCEP) was purchased from Fluka (Buchs, Switzerland). The SepPak C18 cartridge was from Waters (Milford, MA). All other chemicals and reagents were of the highest grade.

Sample Preparation and 18O3 Labeling. Lyophilized mouse serum was reconstituted in 50 mM Tris-HCl buffer (pH 8.0) containing 8 M urea. Two identical aliquots, each corresponding to 20 µL of the mouse serum (∼1.5 mg protein), were taken and labeled as samples A and B. Fifteen glycoproteins dissolved in 50 mM Tris-HCl buffer (pH 8.0) were spiked into the samples at various designated amounts (0.5-5.0 µg, as shown in Table 1). The two samples were reduced with 5 mM TCEP for 30 min at room temperature and alkylated with 10 mM iodoacetamide for 30 min in the dark. After 4-fold dilution with 25 mM Tris buffer (pH 8.0), the proteins were digested with trypsin (10 µg) overnight at room temperature. The digestion was subsequently quenched by adding TFA to a final concentration of 0.5%. The two peptide samples were desalted by using 50 mg SepPak C18 cartridges.

(32) Fenselau, C.; Yao, X. J. Proteome Res. 2009, 8 (5), 2140–3.
(33) Reynolds, K. J.; Yao, X.; Fenselau, C. J. Proteome Res. 2002, 1 (1), 27–33.
(34) Liu, Z.; Cao, J.; He, Y.; Qiao, L.; Xu, C.; Lu, H.; Yang, P. J. Proteome Res. 2010, 9 (1), 227–36.
Table 1. Identification and Relative Abundance Measurements of the Glycopeptides Derived from 15 Proteins Spiked into Mouse Serum at Different Concentrationsᵃ

| no. | spiked-in protein | spiked-in amount (sample A:B) | peptide sequence | expected 18O3/16O3 ratio | observed 18O3/16O3 ratio (CV)ᵇ | observed 18O2/18O3 (%) (CV)ᶜ |
|---|---|---|---|---|---|---|
| 1 | human α-2-HS-glycoprotein | 0.50 µg:2.50 µg | AALAAFNAQNN*GSNFQLEEISR | 5.00 | 4.64 (6.68%) | 10.2 (18.6%) |
| | | | VCQDCPLLAPLN*DTR | | 4.85 (5.15%) | 7.60 (12.5%) |
| 2 | yeast invertase | 0.50 µg:2.50 µg | AEPILN*ISNAGPWSR | 5.00 | 4.97 (11.3%) | 8.20 (19.5%) |
| | | | FATN*TTLTK | | 4.88 (9.84%) | 4.60 (12.8%) |
| | | | NPVLAAN*STQFR | | 4.22 (9.00%) | 9.10 (8.13%) |
| 3 | human haptoglobin | 0.50 µg:2.50 µg | M#VSHHN*LTTGATLINEQWLLTTAK | 5.00 | 4.41 (8.84%) | 11.3 (2.30%) |
| | | | NLFLN*HSEN*ATAK | | 4.85 (7.18%) | 6.50 (24.9%) |
| | | | VVLHPN*YSQVDIGLIK | | 4.56 (8.55%) | 6.30 (4.92%) |
| 4 | pig thyroglobulin | 0.50 µg:2.50 µg | DM#QPRPESPEETDLTAELFSPVDLNQVIVSEN*R | 5.00 | 4.96 (9.27%) | 12.3 (8.54%) |
| | | | FLANVGQFN*LSGALGTR | | 6.82 (0%) | N/Dᵈ |
| | | | GTFN*FSHFFQQLGLPGFQK | | 5.72 (3.15%) | 3.90 (3.59%) |
| | | | LCDVDPCCTGFGFLN*VSQLK | | 5.40 (0%) | 5.90 (35.6%) |
| | | | LGVN*VTWTLR | | N/D | N/D |
| 5 | bovine transferrin | 0.50 µg:2.50 µg | N*SSLCALCIGSEK | 5.00 | 4.86 (3.70%) | 6.00 (6.00%) |
| 6 | human α-1-antichymotrypsin | 0.50 µg:1.00 µg | FN*LTETSEAEIHQSFQHLLR | 2.00 | 1.62 (14.8%) | 11.2 (14.5%) |
| | | | TLN*QSSDELQLSM#GNAM#FVK | | 1.99 (10.1%) | 12.3 (8.37%) |
| | | | YTGN*ASALFILPDQDKM#EEVEAM#LLPETLK | | 1.90 (10.5%) | 9.80 (2.04%) |
| 7 | human α-1-antitrypsin | 0.50 µg:1.00 µg | YLGN*ATAIFFLPDEGK | 2.00 | 1.49 (11.4%) | 43.4 (14.7%) |
| 8 | human complement C4 | 0.50 µg:1.00 µg | FSDGLESN*SSTQFEVK | 2.00 | 2.18 (2.75%) | 5.80 (0%) |
| | | | GLN*VTLSSTGR | | 1.83 (2.73%) | 3.80 (0%) |
| 9 | human complement C3 | 0.50 µg:1.00 µg | N/D | 2.00 | N/D | N/D |
| 10 | α2-macroglobulin | 0.50 µg:1.00 µg | N/D | 2.00 | N/D | N/D |
| 11 | bovine α-1-acid glycoprotein | 5.00 µg:0.50 µg | CVYN*CSFIK | 0.10 | 0.088 (9.50%) | 6.70 (22.2%) |
| | | | NPEYN*K | | N/D | N/D |
| | | | TFM#LAASWN*GTK | | 0.087 (4.62%) | N/D |
| 12 | Aspergillus niger glucose oxidase | 5.00 µg:0.50 µg | GGFHN*TTALLIQYENYR | 0.10 | 0.070 (14.3%) | N/D |
| | | | GPIIEDLNAYGDIFGSSVDHAYETVELATNN*QTALIR | | 0.080 (7.13%) | N/D |
| 13 | bovine fetuin-B | 5.00 µg:0.50 µg | GEN*ATVNQRPANPSK | 0.10 | 0.092 (4.76%) | N/D |
| 14 | chicken ovalbumin | 5.00 µg:0.50 µg | YN*LTSVLM#AM#GITDVFSSSANLSGISSAESLK | 0.10 | 0.074 (11.1%) | N/D |
| 15 | horseradish peroxidase | 5.00 µg:0.50 µg | GLCPLNGN*LSALVDFDLR | 0.10 | 0.120 (0%) | N/D |
| | | | GLIQSDQELFSSPN*ATDTIPLVR | | 0.098 (0%) | N/D |
| | | | LYN*FSNTGLPDPTLN*TTYLQTLR | | 0.12 (32.0%) | N/D |
| | | | M#GN*ITPLTGTQGQIR | | N/D | N/D |
| | | | SFAN*STQTFFNAFVEAM#DR | | N/D | N/D |

ᵃ Samples A and B were treated in 16O-water and 18O-water, respectively. The * denotes the N-glycosylation site, and # represents an oxidized Met. ᵇ The average value of abundance differences and CV value measured from three process replicates. ᶜ The average value of the abundance differences of 18O2/18O3 and CV value measured from three process replicates. ᵈ N/D = not detected.
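As a reading aid for Table 1, the following Python sketch computes the percent deviation of the observed 18O3/16O3 ratio from the expected ratio for a handful of rows copied from the table. The 20% flagging limit and all function names are illustrative assumptions, not part of the published data analysis.

```python
# (peptide, expected ratio, observed mean ratio); values copied from Table 1.
ROWS = [
    ("AALAAFNAQNN*GSNFQLEEISR", 5.00, 4.64),
    ("FLANVGQFN*LSGALGTR", 5.00, 6.82),
    ("N*SSLCALCIGSEK", 5.00, 4.86),
    ("YLGN*ATAIFFLPDEGK", 2.00, 1.49),
    ("GLN*VTLSSTGR", 2.00, 1.83),
    ("GGFHN*TTALLIQYENYR", 0.10, 0.070),
    ("GEN*ATVNQRPANPSK", 0.10, 0.092),
]


def percent_deviation(expected: float, observed: float) -> float:
    """Signed deviation of the observed ratio from the expected one, in percent."""
    return 100.0 * (observed - expected) / expected


def flag_outliers(rows, limit: float = 20.0):
    """Return peptides whose absolute deviation exceeds `limit` percent.
    The default limit is an assumed, illustrative threshold."""
    return [pep for pep, exp, obs in rows
            if abs(percent_deviation(exp, obs)) > limit]


if __name__ == "__main__":
    for pep, exp, obs in ROWS:
        print(f"{pep:<26} {percent_deviation(exp, obs):+6.1f}%")
    print("flagged:", flag_outliers(ROWS))
```

With these rows, the flagged peptides are the thyroglobulin, antitrypsin, and glucose oxidase peptides, which have the largest deviations among the rows listed.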
The eluates from the cartridges were concentrated to ∼100 µL in a Speedvac to remove acetonitrile (ACN) and then mixed with the same volume of 20 mM NaIO4. Oxidation of the carbohydrates was allowed to proceed for 1 h in an ice bath. The reaction mixture was then desalted on a SepPak C18 cartridge of the same size as that used above to remove residual oxidant, and the peptides were eluted with 55% ACN-0.05% acetic acid and lyophilized. The hydrazide coupling buffer (100 mM NaCl, 100 mM phosphate buffer, pH 6.0) was prepared in 16O- and 18O-water, respectively. The two lyophilized samples corresponding to samples A and B were reconstituted in 0.2 mL of the 16O- and 18O-coupling buffer, respectively. Trypsin (20 µg) was added to each sample to catalyze the exchange, followed by 10 mg of the hydrazide-modified magnetic beads, which had been prewashed with 2 × 100 µL of 80% ACN and then with 100 µL of the coupling buffer prepared in 16O- or 18O-water, respectively. The coupling and 18O labeling were conducted at 37 °C overnight with shaking. The supernatant was discarded, and the beads were sequentially washed three times each with 100 µL of 8 M urea/0.5 M NaCl, 2 M NaCl, 70% ACN-0.1% TFA, and finally 50 µL of 50 mM ammonium bicarbonate (ABC) buffer prepared in 16O- or 18O-water. The magnetic beads were resuspended in 75 µL of 50 mM ABC buffer prepared in 16O- or 18O-water, matching the previous
labeling. PNGaseF (3 µg) was added to each tube, and the deglycosylation and 18O labeling proceeded for 6 h at 37 °C with shaking. The supernatant was collected, and the beads were further washed with 2 × 75 µL of 0.5% TFA. The 16O- and 18O-labeled samples were then pooled and concentrated to ∼60 µL. The peptides were desalted using C18 Stagetips prior to LC-MS/MS analysis. The above experiments were repeated twice on different days to assess process variations.

Figure 1. Schematic of the strategy for solid-phase capture of glycopeptides and enzyme-catalyzed 18O3-labeling for relative quantitation. The cis-diol groups on carbohydrates are oxidized by periodate to form aldehydes, which are then coupled to hydrazide groups on the magnetic beads in 18O-water in the presence of trypsin to promote the exchange of C-terminal carboxylate oxygen atoms with solvent. The covalently captured N-glycopeptides are released with PNGaseF in 18O-water to incorporate the third 18O atom at the N-glycosylation site.

Peptide Analysis by LC-MS/MS. Approximately one-third of each glycopeptide sample was loaded onto a Pico-frit column (New Objective) packed with reversed-phase Magic C18 material (5 µm, 200 Å, 75 µm × 10 cm) and coupled to an LTQ-Orbitrap XL mass spectrometer (ThermoElectron, Waltham, MA). Chromatographic methods were identical for all the samples analyzed. Peptides were separated at a flow rate of 0.2 µL/min using a 90 min linear gradient ranging from 2% to 40% B (mobile phase A: 0.1% formic acid/2% ACN; mobile phase B: 90% ACN/0.1% formic acid). Electrospray voltage was 1.8 kV. The instrumental method consisted of a full MS scan (scan range 375-1550 m/z, with a resolution of 30 000 fwhm at m/z = 400, target value 2 × 10^6, maximum ion injection time of 500 ms) followed by data-dependent CID scans of the four most intense precursor ions. Peptide precursor ions were selected with an isolation window of 2.5 Da and a target value of 1 × 10^5. Dynamic
exclusion was implemented with a repeat count of 2 and exclusion duration of 75 s. Three LC-MS/MS analyses were performed for each sample to generate a total of nine raw files for three process replicates.

Data Analysis. The mass spectra were searched against a home-built mouse IPI database (updated in March, 2009) plus the sequences of spiked-in glycoproteins, using BioWorks 3.3.1 SP1 with the SEQUEST search algorithm (ThermoElectron, Waltham, MA). The mass accuracy was set to 5 ppm for precursor ions and to 0.5 Da tolerance for fragment ions. The search parameters took into account two missed cleavages for trypsin, static modification of carboxamidomethylation at cysteine (+57.0215 Da), and up to five differential modifications: deamidation in 16O (Asn +0.9840 Da), deamidation in 18O (Asn +2.9882 Da), C-terminal 18O2 labeling (+4.0084 Da), and methionine oxidation (Met +15.9949 Da). The search results were validated using the following filters: Sf ≥ 0.20; charge-dependent Xcorr scores 1.50 (+1), 1.80 (+2), 2.30 (+3), and 3.00 (≥+4). Since N-linked glycosylation has a consensus motif N-X-S/T, the identified peptides were further filtered to remove peptides without the motif. The above criteria yielded a false discovery rate (FDR) of ∼0.1% across all the samples. Relative quantitation was achieved using PepQuan software within BioWorks 3.3.1 using a mass tolerance of 0.0050 Da and a mass shift of 6.0126 Da for singly glycosylated peptides and 8.0168 Da for doubly glycosylated peptides. The abundance ratio is calculated as the intensity ratio of the heavy and light monoisotopic peaks. Each of the three process replicates was analyzed in triplicate; a total of nine LC-MS/MS raw files were processed. To quantify the accuracy of the method, the abundance ratios of a peptide measured in all LC-MS/MS runs were averaged. The CV of the measurement was calculated if a glycopeptide was identified in a minimum of two LC-MS/MS runs.
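The consensus-motif filter mentioned above can be sketched in a few lines of Python. This is only an illustration of the N-X-S/T rule (X is any residue except proline), not the filter implemented in BioWorks; the helper names are hypothetical, and the `*` and `#` markers are the annotation symbols used in Table 1.

```python
import re
from typing import List

# Asn followed by any residue except Pro, then Ser or Thr.
# The lookahead keeps overlapping sequons from being skipped.
SEQUON = re.compile(r"N(?=[^P][ST])")


def strip_marks(annotated: str) -> str:
    """Remove the '*' (glycosite) and '#' (oxidized Met) annotation symbols."""
    return re.sub(r"[*#]", "", annotated)


def sequon_positions(peptide: str) -> List[int]:
    """1-based positions of Asn residues that start an N-X-S/T motif."""
    return [m.start() + 1 for m in SEQUON.finditer(peptide)]


def has_sequon(peptide: str) -> bool:
    return bool(sequon_positions(peptide))


if __name__ == "__main__":
    for annotated in ("VCQDCPLLAPLN*DTR", "NLFLN*HSEN*ATAK", "NPEYN*K"):
        plain = strip_marks(annotated)
        print(plain, sequon_positions(plain))
```

A peptide-level check of this kind misses sites such as the one in NPEYN*K, where the motif continues past the tryptic cleavage site; recognizing those cases requires the flanking residues from the parent protein sequence.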
To estimate the completeness of the 18O3 incorporation, the ratio of the 18O2/18O3 isotopes was also calculated for each peptide using PepQuan. If the most intense isotopic peak of a peptide shifts to a higher mass, then the M + 1 isotopic peak is used for comparison.

RESULTS

We sought to develop an 18O-based quantitative approach compatible with solid-phase hydrazide coupling. The approach should retain the specificity of the solid-phase capture while providing an accurate and reproducible quantitation of the glycopeptides. The overall 18O-labeling strategy is illustrated in Figure 1. Experimentally, two identical aliquots from undepleted mouse serum (20 µL) spiked with 15 glycoproteins were proteolyzed with trypsin to generate a mixture of peptides. The peptides were reconstituted in 16O- or 18O-water and were mixed with trypsin and hydrazide-modified magnetic beads. The enzyme-catalyzed C-terminal labeling by 16O2 or 18O2 and glycopeptide capture by the hydrazide moiety took place simultaneously. Upon deglycosylation, a third 16O or 18O is introduced at the originally N-glycosylated Asn residue. The two samples labeled with 16O3 or 18O3 were mixed and analyzed by LC-MS/MS.

To examine the effectiveness of the proposed strategy, we spiked 15 glycoproteins into samples A and B at three different concentration ratios, i.e., 1:2, 1:5, and 10:1, and measured the 18O3/16O3 ratios of the glycopeptides. The glycopeptides in these proteins have different sequences from their mouse homologues and are therefore discernible from mouse proteins. In addition to measuring 18O3/16O3 ratios, we also calculated ratios of 18O2/18O3, which is a measure of the incompleteness of the 18O3 labeling. The average value of 18O2/18O3 is primarily affected by the 18O2-labeling efficiency during trypsin-catalyzed postdigestion exchange, the purity of 18O-water, and the solvent water in PNGaseF. In the case of quantitative labeling by three 18O atoms, the value of 18O2/18O3 would be approximately 4.5-5%.

As a representative example, the glycosylated peptide N*SSLCALCIGSEK (where * denotes the site of N-glycosylation) derived from bovine transferrin is illustrated in Figure 2. Figure 2A shows a magnified MS spectrum of the doubly charged precursor ions.
The monoisotopic peaks at m/z 720.3277 and m/z 723.3342 exhibit a signature m/z difference of 3 (6 Da for the doubly charged ions), corresponding to the incorporation of three 18O atoms (18O3) and single glycosylation. The abundance ratio calculated from the heavy and light monoisotopes (18O3/16O3) is 5.02, which is in excellent agreement with the expected ratio of 5.00. The ratio of 18O2/18O3 is estimated from the relative ion intensities of either isotopic pair (m/z 722.3354/723.3342) or isotopic pair (m/z 722.8334/723.8357).
Figure 2. Mass spectrometric identification and quantitation of deglycosylated peptide N*SSLCALCIGSEK from bovine transferrin (* represents the previously glycosylated Asn residue). (A) The doubly charged precursor ions of the 16O3- and 18O3-labeled peptides have peaks at m/z 720.3277 and m/z 723.3342, respectively. The intensity ratio of the 18O3/16O3 isotope clusters is 5.02, and the abundance ratio of 18O2/18O3 is 6.1%. (B) MS/MS spectrum of the 16O3-labeled glycopeptide at m/z 720.3277. (C) MS/MS spectrum of the 18O3-labeled glycopeptide at m/z 723.3342. Compared with Figure 2B, all the y-ions of the heavy peptide in Figure 2C show a 4 Da mass shift, indicating that two 18O atoms are attached at the carboxyl terminus, while all the identified b-ions show a mass increase of 2 Da, confirming that the site of N-glycosylation was labeled by an 18O atom.
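The precursor m/z values reported for N*SSLCALCIGSEK can be cross-checked with a short calculation. The Python sketch below is a minimal illustration under stated assumptions: residue masses are standard monoisotopic values, both cysteines carry the carboxamidomethyl modification listed in the search settings, and the function names are hypothetical.

```python
# Monoisotopic residue masses (Da) for the residues in this peptide.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "C": 103.00919,
    "L": 113.08406, "I": 113.08406, "N": 114.04293, "K": 128.09496,
    "E": 129.04259,
}
WATER = 18.01056
PROTON = 1.007276
CARBAMIDOMETHYL = 57.0215  # static Cys modification (search settings)
DEAMIDATION = 0.9840       # Asn -> Asp on PNGaseF release
DELTA_18O = 2.0042         # per 16O -> 18O substitution


def peptide_mass(sequence: str, n_deglycosylated_asn: int, n_18o: int) -> float:
    """Neutral monoisotopic mass of an alkylated, deglycosylated peptide
    carrying `n_18o` 18O atoms."""
    mass = sum(RESIDUE_MASS[aa] for aa in sequence) + WATER
    mass += sequence.count("C") * CARBAMIDOMETHYL
    mass += n_deglycosylated_asn * DEAMIDATION
    mass += n_18o * DELTA_18O
    return mass


def mz(mass: float, charge: int) -> float:
    """m/z of the [M + zH]z+ ion."""
    return (mass + charge * PROTON) / charge


SEQ = "NSSLCALCIGSEK"
mz_light = mz(peptide_mass(SEQ, 1, 0), 2)  # 16O3 form
mz_heavy = mz(peptide_mass(SEQ, 1, 3), 2)  # 18O3 form

if __name__ == "__main__":
    print(f"light [M+2H]2+: {mz_light:.4f}")
    print(f"heavy [M+2H]2+: {mz_heavy:.4f}")
```

Both calculated values fall within a few mDa of the m/z 720.3277 and 723.3342 peaks quoted in the caption, which supports the 16O3/18O3 assignment of the doubly charged pair.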
Figure 3. Venn diagram of the number of glycoproteins identified from three process replicates of the same mouse serum pool.

The ratio (∼6.1%) indicates that the labeling is quantitative. The MS/MS spectra of the light and heavy peptides (Figure 2B and 2C) show characteristic fragment ions corresponding to the labeling of two 18O atoms at the C-terminus and one at the glycosylated
Asn residue. As expected, all the singly charged y-ions of the heavy peptide display a characteristic mass shift of 4 Da, attributed to the labeling of the C-terminus by two 18O atoms. On the other hand, b-ions of the heavy peptide shift by 2 Da, confirming the incorporation of one 18O at the glycosylated Asn residue.

Table 1 shows sequences of the identified glycopeptides from spiked-in proteins, the expected ratio, the observed average ratio of 18O3/16O3, and the average ratio of 18O2/18O3. A total of 32 distinct glycopeptides from 13 proteins were identified with high confidence, of which 28 peptides were also quantitated, including 2 doubly glycosylated peptides. The 18O3/16O3 ratios for most of the identified peptides were in good agreement with the expected fold difference of the spiked-in proteins, with variations less than 20%. However, much higher variations were observed for peptides FLANVGQFN*LSGALGTR, YLGN*ATAIFFLPDEGK, GGFHN*TTALLIQYENYR, and YN*LTSVLM#AM#GITDVFSSSANLSGISSAESLK. Several factors may contribute to the variation, such as measurement error due to low peak intensity, incomplete 18O labeling, and the limited dynamic range of the mass spectrometric measurement. The corresponding CV of the measured abundance ratios varied from 0% to 15%, except for LYN*FSNTGLPDPTLN*TTYLQTLR (CV ∼32%), a doubly glycosylated peptide from horseradish peroxidase.

As shown in Table 1, the 18O2/18O3 ratios for most of the peptides were below 10%, suggesting that the trypsin-catalyzed 18O2 labeling was virtually complete. The exception was the antitrypsin peptide YLGN*ATAIFFLPDEGK, which showed an 18O2/18O3 ratio of 43.4%. The incomplete labeling accounts for the abnormally low 18O3/16O3 ratio observed for this peptide. In this situation, counting the heavy peptide as the sum of both 18O2 and 18O3 clusters would give better accuracy of quantitation.
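The suggestion to count the heavy channel as the sum of the 18O2 and 18O3 clusters can be illustrated with the Table 1 values for the antitrypsin peptide. The Python sketch below is a simplified illustration only: it assumes the reported ratios behave as plain intensity ratios and ignores any overlap between natural isotope envelopes; the function name is hypothetical.

```python
def summed_heavy_ratio(ratio_18o3_to_16o3: float, ratio_18o2_to_18o3: float) -> float:
    """Heavy/light ratio when the heavy channel is taken as 18O2 + 18O3.

    (I_18O3 + I_18O2) / I_16O3 = (I_18O3 / I_16O3) * (1 + I_18O2 / I_18O3)
    """
    return ratio_18o3_to_16o3 * (1.0 + ratio_18o2_to_18o3)


# YLGN*ATAIFFLPDEGK (Table 1): observed 18O3/16O3 = 1.49, 18O2/18O3 = 43.4%,
# expected ratio = 2.00.
observed = 1.49
incomplete_fraction = 0.434
expected = 2.00

corrected = summed_heavy_ratio(observed, incomplete_fraction)

if __name__ == "__main__":
    print(f"uncorrected: {observed:.2f} ({100 * (observed - expected) / expected:+.1f}% vs expected)")
    print(f"summed heavy: {corrected:.2f} ({100 * (corrected - expected) / expected:+.1f}% vs expected)")
```

Under these assumptions, the summed ratio lies closer to the expected value of 2.00 than the uncorrected 18O3/16O3 ratio does.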
The abundance of 18O2 peaks was too low to be accurately measured in many cases; therefore, the CV of the 18O2/18O3 measurement was generally higher. For proteins with an expected abundance ratio of 0.10, the 18O2 abundances were not detectable. For glycopeptides identified in mouse serum, the average ratio of 18O3/16O3 is 0.93 ± 0.13, demonstrating that this approach allows reproducible and accurate quantitation of glycopeptides in serum samples. The Venn diagram (Figure 3) displays the number of glycoproteins identified in three process replicates of the mouse serum pool. In total, we were able to identify 224 distinct peptides from 130 mouse proteins. A majority of the proteins (61.5%) were reproducibly identified from all three replicates; however, each replicate was able to identify approximately 15% more proteins.
DISCUSSION

The objective of the study was to develop a robust and reproducible strategy for global glycopeptide quantitation by integrating hydrazide-based chemical enrichment and 18O-based enzymatic labeling. We explored a different approach to achieve this in a preliminary study. We first conducted hydrazide capture in regular water and deglycosylation in 18O-water, followed by 18O labeling of the released peptides by immobilized trypsin. This simple approach was aimed at eliminating back-exchange of 18O-labeled peptides and minimizing the consumption of 18O-water. However, the resulting mass spectra suffered from significant interference from peptides derived by tryptic digestion of the coexisting PNGaseF. This approach was thus not further pursued.

For the protocol described here, the key is to promote the efficient incorporation of 18O2 by trypsin during the course of solid-phase capture and to minimize 18O-water usage and sample loss caused by multiple steps of manipulation. Incomplete incorporation represents a significant issue and is a bottleneck for the widespread application of the trypsin-catalyzed 18O-labeling approach. Hajkova et al.35 observed that the protonated form of the C-terminal carboxylate groups would bind more strongly to the active sites of trypsin, which have a high local negative charge. Thus, it is expected that slightly acidic pH conditions will increase the protonation of the carboxylate moiety and, in turn, increase the rate of the exchange reaction. We performed 18O-labeling experiments in ABC buffer, pH 8.0, and observed that peptides with either Asp or Glu adjacent to the C-terminal Lys have a much slower exchange rate, leading to incomplete incorporation (i.e., a significant amount of the peptides were labeled by zero or one 18O). This suggests that the negative charge on the residue adjacent to the C-terminal Lys also adversely affects the exchange rate.
Thus, we used pH 6.0 phosphate buffer, which was also the optimal pH for hydrazide coupling, and prolonged incubation to promote the exchange. Under these conditions, complete 18O3 labeling was achieved for most of the peptides, including peptides with a C-terminal Asp/Glu-Lys sequence, as evidenced by the negligible 18O2 isotopic cluster relative to the dominant 18O3 isotopic cluster.

In our experiments, the capture of glycopeptides and trypsin-catalyzed 18O labeling occurred simultaneously. Solid-phase capture has usually been performed with hydrazide-modified agarose beads. Applying that protocol to 18O quantitation would require the consumption of a large amount of 18O-water and more tedious sample manipulation. The high-density magnetic beads offered several unique advantages, including minimal use of 18O-water and ease of sample manipulation. After coupling, the magnetic beads were extensively washed with 8 M urea and other solutions to remove the residual trypsin and unbound peptides, to eliminate back-exchange, and to reduce nonspecific binding.

(35) Hajkova, D.; Rao, K. C.; Miyagi, M. J. Proteome Res. 2006, 5 (7), 1667–73.

Figure 4. MS/MS spectrum of the 18O-labeled peptide CPLLTPFN∧DTN∧VVHTVNTALAAFNTQNN*GTYFK@ derived from mouse α-2-HS-glycoprotein. The ∧ represents the random deamidation of an Asn residue during sample preparation, leading to a mass shift of 0.98 Da. The * denotes the PNGaseF-mediated deamidation of a glycosylated Asn residue in 18O-water, resulting in a 2.98 Da mass shift. The symbol @ represents the labeling of the C-terminal carboxylate group by two 18O atoms.

Potential random deamidation of Asn during sample preparation sometimes affects the unambiguous identification of glycopeptides, as it leads to the same mass shift as deglycosylation in 16O-water. 18O-Labeling provides a unique tool to differentiate the random deamidation during sample preparation from
deamidation by PNGaseF-catalyzed deglycosylation. For example, CPLLTPFN∧DTN∧VVHTVNTALAAFNTQNN*GTYFK@ (∧, *, and @ denote deamidation on Asn 8, 11, 28, and 18O2labeling at the C-terminus), a glycopeptide derived from R-2HS-glycoprotein, was confidently identified (Mr ) 3690.7533; Xcorr ) 6.73; Sf ) 0.92). As shown in Figure 4, the continuous y3-y14+1, y24+2, and y25+2 ions clearly indicated mass shifts of 1 Da on Asn residues 8 and 11; 3 Da on Asn residue 28; and 4 Da on C-terminus. It is obvious that the Asn residue at positions 8 and 11 are not a glycosylation site although it contains the glycosylation motif; rather the mass shifts were caused by deamidation. Development of an analytical methodology generally utilizes simple mixtures of spiked-in proteins to scrutinize accuracy, reproducibility, and dynamic range.34,36 The mixture, however, cannot reflect the complexity of real biological samples and conclusions are less valid. We spiked 15 glycoproteins into two test samples at three concentration ratios (1:2; 1:5; and 10:1) and measured the accuracy and reproducibility of the quantitation by three process replicates. The ratios measured by mass spectrometry were found to be in good agreement with the spiked-in ratios (Table 1). However, the average variation is considerably higher as the fold difference becomes larger (e.g., 10:1 ratio), suggesting that the dynamic range of the internal labeling method is limited. Another source of variation is the incomplete 18O2 labeling by trypsin, as is the case for antitrypsin. Third, low intensity of the peptide precursor ions also contributes to the variation of the quantitative measurement. For low abundance peaks, manual interrogation should always be performed to ensure the accuracy of the ratio measurement. We identified a total of 130 glycoproteins with high confidence from 20 µL of the undepleted mouse serum. 
Interestingly, as shown in Figure 3, the overlap between any two replicates is very limited, presumably because the abundances of these peptides are too low for them to be reproducibly identified. The most abundant glycoproteins are identified in all replicates; the challenge is to identify and quantitate the low-abundance proteins. With the same volume of mouse plasma, Zhou et al. reported the identification of 71 glycoproteins on an LCQ instrument using similar hydrazide chemistry for glycopeptide capture and stable isotope labeling via succinic anhydride for quantitation.21 Sixty-one of the proteins identified by Zhou et al. were also detected in our experiments. Our approach was able to identify many low-abundance proteins, such as EGFR, insulin-like growth factor-binding protein complex acid labile chain, and leukemia inhibitory factor receptor. Although the exact concentrations of these proteins in mouse serum are not known, they are present in human plasma at low abundance (ng/mL).37 Overall, the accuracy, precision, and reproducibility are sufficient to allow this approach to be applied to the discovery of glycoprotein alterations in biological samples.
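The mass shifts used in the Discussion to separate random deamidation from PNGaseF-mediated deglycosylation can be checked with a short calculation. The Python sketch below uses standard monoisotopic masses rounded to six decimals; the function names are hypothetical and chosen only for illustration. It computes the shift for deamidation in 16O-water and in 18O-water, the shift for 18O2 labeling of the C-terminus, and the resulting spacing between unlabeled and fully labeled peptide pairs.

```python
# Standard monoisotopic masses in Da, rounded to six decimals.
H = 1.007825
N = 14.003074
O16 = 15.994915
O18 = 17.999160


def deamidation_shift(oxygen_mass: float) -> float:
    """Asn -> Asp replaces the side-chain NH2 with OH: lose N and H, gain one O."""
    return oxygen_mass - (N + H)


def c_terminal_shift(n_18o: int) -> float:
    """Mass gained when n_18o carboxyl oxygens are exchanged from 16O to 18O."""
    return n_18o * (O18 - O16)


def heavy_light_spacing(n_glycosites: int) -> float:
    """Spacing between 16O and fully 18O-labeled forms of a glycopeptide:
    one 18O per deglycosylated site plus two 18O at the C-terminus."""
    return (n_glycosites + 2) * (O18 - O16)


random_deamidation = deamidation_shift(O16)   # deamidation in regular water
pngasef_in_18o = deamidation_shift(O18)       # deglycosylation in 18O-water
cterm_18o2 = c_terminal_shift(2)              # trypsin-catalyzed 18O2 labeling

print(f"random deamidation:          {random_deamidation:.3f} Da")
print(f"PNGaseF in 18O-water:        {pngasef_in_18o:.3f} Da")
print(f"C-terminal 18O2:             {cterm_18o2:.3f} Da")
print(f"pair spacing, 1 glycosite:   {heavy_light_spacing(1):.3f} Da")
print(f"pair spacing, 2 glycosites:  {heavy_light_spacing(2):.3f} Da")
```

The values correspond to the nominal 1, 3, and 4 Da shifts assigned to Asn 8/11, Asn 28, and the C-terminus of the Figure 4 peptide, and to the approximately 6 and 8 Da spacings expected for singly and doubly glycosylated peptides.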
(36) Qian, W. J.; Liu, T.; Petyuk, V. A.; Gritsenko, M. A.; Petritis, B. O.; Polpitiya, A. D.; Kaushal, A.; Xiao, W.; Finnerty, C. C.; Jeschke, M. G.; Jaitly, N.; Monroe, M. E.; Moore, R. J.; Moldawer, L. L.; Davis, R. W.; Tompkins, R. G.; Herndon, D. N.; Camp, D. G.; Smith, R. D. J. Proteome Res. 2009, 8 (1), 290–9.
AC101564T
Analytical Chemistry, Vol. 82, No. 18, September 15, 2010
SUMMARY
In conclusion, we describe a robust, sensitive, specific, and quantitative method for the unambiguous identification and accurate quantitation of N-linked glycoproteins in complex mixtures. The methods and experimental conditions presented here can be easily adapted to the quantitative glycoproteomic analysis of tissues, body fluids, and cell lines, where metabolic labeling is not readily available. The 18O-labeling strategy is also compatible with solid-phase capture at the protein level to probe cell surface glycoproteins.7-9 This strategy is well suited to quantitatively monitoring glycoprotein changes associated with pathological states to facilitate drug target and biomarker discovery.

ACKNOWLEDGMENT
The authors thank Laura Lin’s group for sharing instrument resources.

Received for review June 11, 2010. Accepted August 17, 2010.
(37) Haab, B. B.; Geierstanger, B. H.; Michailidis, G.; Vitzthum, F.; Forrester, S.; Okon, R.; Saviranta, P.; Brinker, A.; Sorette, M.; Perlee, L.; Suresh, S.; Drwal, G.; Adkins, J. N.; Omenn, G. S. Proteomics 2005, 5 (13), 3278–91.