Convenient and Precise Strategy for Mapping N-Glycosylation Sites

Jul 10, 2015

Analytical Chemistry

Graphic: microwave heating with temperature probe. Condition: 50% v/v TFA; RT, 10 min; temperature, 80 °C; power, 100 W.

ACS Paragon Plus Environment

A convenient and precise strategy for mapping N-glycosylation sites using microwave-assisted acid hydrolysis and characteristic ions recognition

Cheng Ma1#, Jingyao Qu1,3#, Jeffrey Meisner1, Xinyuan Zhao2, Xu Li1, Zhigang Wu1, Hailiang Zhu1, Zaikuan Yu1, Lei Li1, Yuxi Guo1, Jing Song1 and Peng George Wang1,3*

1 Center for Diagnostics & Therapeutics and Department of Chemistry, Georgia State University, Atlanta, Georgia 30303, United States
2 National Institute of Biological Sciences, Beijing 102206, People’s Republic of China
3 National Glycoengineering Research Center and The State Key Laboratory of Microbial Technology, Shandong University, Jinan, Shandong 250100, People’s Republic of China

ABSTRACT
N-glycosylation is one of the most prevalent protein post-translational modifications (PTMs) and is involved in several biological processes. Alteration of N-glycosylation is associated with cellular malfunction and the development of disease. Thus, investigation of protein N-glycosylation is crucial for the diagnosis and treatment of disease. Currently, deglycosylation with peptide N-glycosidase F is the most commonly used technique in N-glycosylation analysis. However, a common error in N-glycosylation site identification, resulting from chemical deamidation of proteins, has largely been ignored. In this study, we developed a convenient and precise approach for mapping N-glycosylation sites that combines optimized TFA hydrolysis, ZIC-HILIC enrichment, and the characteristic ions of N-acetylglucosamine (GlcNAc) from HCD fragmentation. Using this method, we identified a total of 257 N-glycosylation sites and 144 N-glycoproteins from healthy human serum. Compared to deglycosylation with endoglycosidases, this strategy is more convenient and efficient for large-scale identification of N-glycosylation sites, and provides an important alternative approach for the study of N-glycoprotein function.

It is widely recognized that N-glycoproteins are involved in many physiological processes, such as cell−cell interaction 1, molecular recognition 2, and modulation of protein functions 3. Alterations in the amount and type of N-glycosylation can influence protein function and interactions with other molecules. Many useful clinical biomarkers and therapeutic targets are glycoproteins or glycan antigens. For example, alpha-fetoprotein (AFP) and Golgi membrane protein 1 (GP73) can be used for detection of hepatocellular carcinoma (HCC) 4,5, prostate-specific antigen (PSA) for prostate cancer, and CA125 for ovarian cancer 6.
Thus, the investigation of protein N-glycosylation has enormous potential in the discovery of biomarkers for the diagnosis and treatment of disease. Either enzymatic or chemical deglycosylation is generally used for N-glycosylation site analysis. Peptide N-glycosidase F (PNGase F) has been widely applied in deglycosylation. PNGase F cleavage converts asparagine (Asn) to aspartic acid (Asp), with a mass increase of 0.9840 Da 7,8. However, Asn can also undergo spontaneous deamidation to Asp, which generates the same 0.9840 Da mass tag and confounds the correct assignment of glycosites 9. Thus, deglycosylation is preferably performed in 18O-water, which results in a 2.989 Da mass increase and “confident glycosylation site information” 10. Unfortunately, a recent study showed that deamidated peptides that had incorporated 18O from 18O-water were also identified in large-scale analyses, highlighting a problem with glycosylation site assignment based on deamidation of asparagine by PNGase F in 18O-water 11. Other
enzyme-based methods have also been introduced to comprehensively profile protein N-glycosylation by mass spectrometry (MS) 12,13. Endo-β-N-acetylglucosaminidases D and H (Endo D and Endo H) cleave the glycosidic bond between the first two GlcNAc residues, leaving the amide-linked GlcNAc on the asparagine, which can be recognized by LC-MS/MS as a label of the N-glycosylation site. Together with the amino acid sequence, this is considered an unambiguous assignment of the glycosylation site 12. Nevertheless, the use of these endoglycosidases is limited by their relatively narrow glycan structural preferences and high cost 14. In contrast, chemical deglycosylation methods generally show no substrate preference. Strong acids can hydrolyze glycosidic bonds and leave the innermost GlcNAc residue on the asparagine. Chen et al. utilized trifluoromethanesulfonic acid (TFMSA) to hydrolyze glycosidic bonds and identified 250 glycoproteins in yeast 15. Compared to enzymatic deglycosylation, this acid hydrolysis technique is faster and more convenient. However, due to its low reproducibility, further development and optimization are necessary.

Since Gedye published the first report of rapid organic synthesis using a household microwave oven in Tetrahedron Letters in 1986, microwaves have been extensively investigated as energy sources 16. Microwaves have been widely used for carrying out chemical reactions and have become an effective non-conventional energy source for peptide synthesis 17,18, enzymatic digestion of proteins 19-21, and deglycosylation 22. Lee et al. observed 13 trypsin-digested N-glycosylated peptides from horseradish peroxidase (HRP) utilizing a domestic microwave oven 23. However, domestic microwave ovens are not stabilized, and reaction conditions need to be better controlled.

Currently, most complex protein post-translational modifications (PTMs) can be analyzed with various MS/MS dissociation modes 24.
Collision-induced dissociation (CID) is the most commonly used fragmentation technique and rapidly breaks precursor peptides into b and y ions. Higher-energy collisional dissociation (HCD) features a higher activation energy and a shorter activation time than traditional ion-trap CID, and also generates b- and y-type fragment ions. While the higher energy of HCD leads to a predominance of y-ions, b-ions can be further fragmented to a-ions or smaller species 25. Without the low-mass cutoff restriction and with high mass accuracy, HCD fragmentation can provide the more informative characteristic ions that accurate PTM studies require. Our previous study of protein core-fucosylation (CF) by HCD has demonstrated this point 26.

Herein, we report a method for N-glycosylation site identification based on microwave-assisted acid hydrolysis (MAAH). Trifluoroacetic acid (TFA) was used for deglycosylation, followed by zwitterionic hydrophilic interaction chromatography (ZIC-HILIC) enrichment and HCD fragmentation by Orbitrap Elite MS/MS. TFA concentration and microwave reaction time were optimized for high deglycosylation efficiency with minimal peptide backbone cleavage. Spectra containing characteristic fragment ions of GlcNAc were selected as N-glycopeptide spectra for database searching. This microwave-assisted TFA deglycosylation is more convenient and faster than enzymatic deglycosylation for large-scale identification of N-glycosylation sites. In total, 257 unique N-glycosylation sites and 144 glycoproteins were identified in normal human serum. Further, the method provides a convenient and efficient approach for N-glycoprotein function studies and biomarker screening in cancer research.

EXPERIMENTAL SECTION
Materials and Chemicals. Healthy human serum samples were supplied by the Cancer Institute & Hospital of Shanxi Province (Taiyuan, Shanxi, P.R. China).
Prior institutional ethical approval was obtained, and the volunteers in the study provided written informed
consent. Six human serum samples were pooled in equal amounts to generate a mixed sample for this experiment. The ethics committee approved the research protocol. Endo H and Endo F3 were purchased from Sigma-Aldrich (St. Louis, MO, USA). Recombinant Endo M was expressed in Escherichia coli in our laboratory. Sequencing-grade trypsin was purchased from Promega (Madison, WI, USA). Trifluoroacetic acid (TFA), formic acid (FA), and HPLC-grade acetonitrile (ACN) were purchased from ThermoFisher (Waltham, MA, USA). Deionized water was generated with a Millipore Milli-Q A10 system (Bedford, MA, USA). Iodoacetamide (IAA) and dithiothreitol (DTT) were obtained from Acros Organics (Morris Plains, NJ, USA). ZIC-HILIC resin was purchased from Sequant (Merck, Germany). 3M Empore C8 disks were purchased from 3M Bioanalytical Technologies (St. Paul, MN, USA). 3 kDa and 30 kDa Microcon filtration devices were purchased from Millipore. Tris-HCl was purchased from US Biological (Swampscott, MA, USA).

FASP. Approximately 50 µg of standard sample (bovine fetuin) or 200 µg of processed human serum protein was subjected to the filter-aided sample preparation (FASP) procedure as reported 27. Briefly, samples were mixed with 50 µL lysis buffer (0.1 M Tris-HCl, 0.1 M DTT, 4% SDS) and incubated at 90 °C for 10 min. The mixture was then centrifuged at 15,000 × g for 10 min, and the resulting supernatant was mixed with 200 µL 8 M urea in 0.1 M Tris/HCl, pH 8.5 (UA solution). The mixture was loaded into a 30 kDa Microcon filtration device (Millipore, USA) and centrifuged at 14,000 × g for a minimum of 20 min until less than 10 µL remained. The concentrates were then diluted in the devices with 200 µL UA solution and centrifuged twice. After centrifugation, the concentrates were mixed with 100 µL 50 mM iodoacetamide (IAA) in UA solution, incubated in darkness at room temperature for 30 min, and then centrifuged for 20 min. The concentrate was diluted with 200 µL UA solution and concentrated again.
This step was repeated twice. Samples were then diluted with 100 µL 40 mM ammonium bicarbonate (ABC) buffer and concentrated twice. After concentration, 8 µg sequencing-grade trypsin in 100 µL 40 mM ABC (pH 7.8) was mixed with the sample for digestion overnight at 37 °C. Digested peptides were collected with 50 ABC by concentration, and this step was repeated six times. The final concentration of peptides was determined by UV spectrometry (Nanodrop, Thermo) using an extinction coefficient of 1.1 for 0.1% (g/L) solution at 280 nm.

N-glycopeptide Enrichment. Approximately 100 µg of digested serum peptides was added to 10 mg commercial ZIC-HILIC resin and enriched following a previously published procedure 28. The detailed strategy is as follows: a C8 disk was first placed into a 100 µL tip; then, about 20 mg ZIC-HILIC resin was suspended in 300 µL acetonitrile and distributed equally among 3 tips. In-solution digested peptides were re-dissolved in 80% ACN, 0.5% FA and loaded onto the ZIC-HILIC 200 µL tip equilibrated with binding buffer (80% ACN, 0.5% FA). The ZIC-HILIC tip was washed 6 times with 100 µL 80% ACN, 1% FA, 19% H2O, and bound peptides were eluted 3 times with 80 µL elution buffer (99% H2O, 1% FA). The enriched glycopeptides were resuspended in either 100 µL sodium acetate solution (100 mM, pH 4.5) or reaction buffer A (50% TFA) for N-glycopeptide simplification.

Microwave-assisted Acid Hydrolysis of Glycopeptides. Sialylglycopeptide (SGP), bovine fetuin, and human serum samples were used to optimize the microwave-assisted acid hydrolysis method. All samples were processed independently 3 times to assess the reproducibility of acid hydrolysis deglycosylation. In this step, ZIC-HILIC-enriched glycopeptides were mixed with different concentrations of TFA and then incubated in a microwave reactor (CEM Discover bio). The parameters were set as follows: the reaction
power of the microwave was 100 W, and heating experiments were performed at 80 °C with heating times of 1 min, 3 min, 5 min, 10 min, 30 min, and 1 h. All simplified N-glycopeptides carrying a single GlcNAc were termed partially deglycosylated glycopeptides. For comparison, enriched glycopeptides were also deglycosylated in 100 mL ammonium acetate buffer by Endo M, Endo H, or Endo F3 overnight at 37 °C. Samples were desalted with C18 Zip-tips, lyophilized, and stored at -80 °C until analysis by nano LC-MS/MS.

Analysis with Mass Spectrometry. RP-HPLC MS/MS experiments were performed on an LTQ-Orbitrap Elite mass spectrometer (Thermo Fisher) equipped with an EASY-Spray source and a nano-LC UltiMate 3000 high-performance liquid chromatography system (Thermo Fisher). EASY-Spray PepMap C18 columns (50 cm; particle size, 2 µm; pore size, 100 Å; Thermo Fisher, US) were used for separation. Separation was achieved with a linear gradient from 3% to 40% solvent B over 80 min at a flow rate of 300 nL/min (mobile phase A: 2% ACN, 98% H2O, 0.1% FA; mobile phase B: 80% ACN, 20% H2O, 0.1% FA). The LTQ-Orbitrap Elite mass spectrometer was operated in data-dependent mode. A full-scan survey MS experiment (m/z range from 375 to 1500; automatic gain control target, 1,000,000 ions; resolution at m/z 200, 60,000; maximum ion accumulation time, 50 ms) was acquired in the Orbitrap, and the ten most intense ions were fragmented by HCD in the octupole collision cell. HCD fragment ion spectra were acquired in the Orbitrap analyzer with a resolution of 15,000 at m/z 200 (automatic gain control target, 10,000 ions; maximum ion accumulation time, 200 ms). The MS/MS scan mode was set to centroid. The other conditions were: temperature of 200 °C, S-lens RF level of approximately 60%, and ion selection threshold of 50,000 counts for HCD. For comparison, precursor ions were also fragmented by CID and acquired in the ion trap.

Data Filtering and Database Searches.
In this work, the raw data were converted to mgf files with Proteome Discoverer 1.3. Potential N-glycopeptide spectra were selected according to the characteristic ions of GlcNAc and merged into a new file with in-house software. All selected spectra were searched with the pFind (version 2.1) database search software. The human proteome sequence database was extracted from Uniprot_swissprot plus Uniprot_TrEMBL (release 2012–04, human, 65,493 entries) and concatenated with reversed versions of all sequences. The mass tolerance was set to 20 ppm for precursor ions and 25 mmu for fragment ions. A false discovery rate (FDR) of 1% was estimated for each data set and applied at the peptide-spectrum match (PSM) level. The mgf data were searched against the target-decoy human Uniprot database with a static modification of carbamidomethyl (Cys, +57.0214) and dynamic modifications of oxidation (Met, +15.9949), GlcNAc tag (Asn, +203.0794), and acetylation (N-terminal). The enzyme was set to trypsin, with two missed cleavages allowed. Redundant protein entries were removed with pBuild software to form group entries.

RESULTS AND DISCUSSION
Work Flow. We analyzed N-glycosylation sites from 6 serum samples of healthy individuals. Lyophilized serum proteins were digested with trypsin using the FASP method. ZIC-HILIC enrichment was performed on the glycopeptides from the digest. N-glycopeptides were then simplified by microwave-assisted TFA hydrolysis and analyzed on the high-precision Orbitrap Elite. All spectra from simplified N-glycopeptides were combined into an mgf file with our in-house software based on the presence of characteristic ions from GlcNAc. Finally, pFind studio 2.0 was used for database searching (Figure 1). Two steps received the closest attention: 1) to develop an efficient microwave-assisted TFA hydrolysis
deglycosylation method; 2) to optimize glycopeptide selection and simplification for data processing, in order to eliminate interference from false N-glycopeptide spectra.

Optimization of Microwave-assisted Acid Hydrolysis Strategy. Because glycosidic bonds between sugar residues are more acid-labile than the amide bond between the innermost GlcNAc of the N-glycan and asparagine, acid hydrolysis can be used for the deglycosylation of N-glycoproteins. In this work, sialylglycopeptide (SGP) and bovine fetuin were used for method optimization. SGP is a natural N-glycopeptide extracted from egg yolk, which has 6 amino acids and eleven monosaccharide residues. Bovine fetuin is a glycoprotein with 3 N-glycosylation sites. We tested a series of TFA concentrations (30%, 40%, 50%, 60%, 70%, and 80%) and microwave times (1 min, 5 min, 10 min, 20 min, and 30 min) for method optimization. The best deglycosylation was achieved at 50% TFA and 80 °C for 10 min (MS results for deglycosylated SGP are shown in Figure 2a and 2b). Under this condition, the partial deglycosylation product peak is maximized with minimal peptide backbone damage, as compared with the 30 min reaction (Figure 2c). MS and MS/MS spectra of deglycosylated fetuin from the tryptic digest are shown in Supporting Information Figure S-1. Parallel acid hydrolysis experiments were carried out in triplicate, and the reproducibility of acid hydrolysis was good (data not shown). The use of acid hydrolysis for N-glycosylation site mapping has been reported previously 15. Chen et al. carried out an extensive study to optimize the TFMSA hydrolysis conditions (temperature, reaction time, and addition of toluene). As an alternative, our TFA method shows two clear advantages: 1) the reaction is faster with microwave heating (only 10 min compared to 2 h), and 2) the reaction conditions are relatively mild.

Analysis of Protein N-glycosylation.
Benefiting from its high scanning speed and reproducibility, CID fragmentation has traditionally been used for large-scale glycoproteomics studies. However, in recent years, more researchers have preferred HCD for its superior sequencing capability. HCD provides two major advantages, high mass accuracy/resolution and the absence of the one-third cutoff effect for low-mass ions, at the cost of reduced scanning speed and sensitivity 29,30. To achieve accurate identification of N-glycosylation sites, we employed HCD fragmentation to develop a credible strategy based on diagnostic fragment ions of GlcNAc from our simplified glycopeptides. Three synthetic N-GlcNAc peptides (AVLVNnITTGER, EnGTISRY, RWnATYFD) were used as standards for method optimization. A normalized collision energy (NCE) of 27 was chosen from a series of HCD energies because it provides thorough fragmentation of N-GlcNAc peptides into b and y ions together with strong GlcNAc fragment ion peaks, including m/z 204.09, 186.08, 168.07, 144.07, 138.05, and 126.05 (Supporting Information Figure S-2). These diagnostic peaks serve as solid proof of N-glycosylation, and the information can only be retrieved by HCD. We designed an in-house program that filters the simplified N-glycopeptide spectra according to the diagnostic ions of GlcNAc; only spectra containing all of the fragment peaks were selected for later processing and database searching. An mgf file was generated by merging the selected spectra. To validate our program, we mixed the three N-GlcNAc peptides with an E. coli protein digest as a simulated biological system; E. coli serves as a model system lacking N-glycoproteins. The N-glycosylation sites were analyzed by CID and HCD fragmentation with and without filtering by our program. A total of 6 and 8 “glycopeptides” were identified with CID and HCD fragmentation without filtering, respectively, meaning 3 and 5 false-positive identifications.
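
The in-house filtering program is not described further in the text. As a rough illustration of the idea only, the Python sketch below keeps an MGF spectrum when every GlcNAc diagnostic ion is present. The function names, the minimal MGF parser, the four-decimal m/z values, and the reuse of the 25 mmu fragment tolerance as the matching window are assumptions made for this sketch, not details of the authors' software.

```python
from typing import Iterator, List, Sequence, Tuple

# GlcNAc diagnostic ions (m/z). Four-decimal values are an assumption of this sketch.
DIAGNOSTIC_IONS = (204.0866, 186.0760, 168.0655, 144.0655, 138.0549, 126.0549)
# Assumption: match window equal to the 25 mmu fragment tolerance used for searching.
TOLERANCE_DA = 0.025

Peak = Tuple[float, float]                 # (m/z, intensity)
Spectrum = Tuple[List[str], List[Peak]]    # (header lines, peak list)


def read_mgf(text: str) -> Iterator[Spectrum]:
    """Yield (headers, peaks) for each BEGIN IONS ... END IONS block."""
    headers: List[str] = []
    peaks: List[Peak] = []
    inside = False
    for raw in text.splitlines():
        line = raw.strip()
        if line == "BEGIN IONS":
            headers, peaks, inside = [], [], True
        elif line == "END IONS":
            if inside:
                yield headers, peaks
            inside = False
        elif inside and line:
            if line[0].isdigit():
                # Peak line: "m/z intensity"
                fields = line.split()
                intensity = float(fields[1]) if len(fields) > 1 else 0.0
                peaks.append((float(fields[0]), intensity))
            else:
                # Header line such as TITLE=..., PEPMASS=..., CHARGE=...
                headers.append(line)


def has_all_diagnostic_ions(
    peaks: Sequence[Peak],
    ions: Sequence[float] = DIAGNOSTIC_IONS,
    tol: float = TOLERANCE_DA,
) -> bool:
    """True only if every diagnostic ion has a peak within +/- tol."""
    return all(any(abs(mz - ion) <= tol for mz, _ in peaks) for ion in ions)


def filter_glcnac_spectra(text: str) -> List[Spectrum]:
    """Keep only the spectra that contain all diagnostic ions."""
    return [spec for spec in read_mgf(text) if has_all_diagnostic_ions(spec[1])]
```

Requiring all six ions, rather than any one of them, mirrors the statement that only spectra containing all of the fragment peaks were selected; a real implementation would probably also apply an intensity threshold, which is omitted here.
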
In contrast, only the three synthetic N-GlcNAc peptides were identified from software-filtered HCD spectra (Supporting Information Figure S-3a). The results indicate that even with the GlcNAc residue as a mass tag, traditional database
searching of CID and HCD spectra still generates false-positive results unless diagnostic fragment ions of GlcNAc are taken into consideration.

PNGase F deglycosylation followed by direct database searching is the traditional approach to N-glycosylation site identification. Herein, a simple comparison between our method and the traditional method was performed to evaluate the effect of spontaneous deamidation. The glycoprotein bovine fetuin (3 N-sites) was used as a standard sample. With microwave-assisted acid hydrolysis and characteristic ions recognition, only the 3 N-sites were identified. However, a total of 6 “N-sites” were identified with PNGase F deglycosylation and direct database searching, meaning 3 false N-sites were identified (Supporting Information Table S-1).

Optimization of Candidate N-glycopeptide Spectra. Database search software uses different scoring functions to identify peptide sequences by comparing theoretical MS/MS spectra with experimental data 31. The degree of matching between the experimental fragment ions and the theoretical MS/MS spectrum determines the confidence level of a peptide identification. Some PTMs, such as glycosylation and phosphorylation, generate complex MS/MS spectra because the chemical bond at the modification site is weaker than the peptide backbone. Therefore, to cope with the complexity of biological samples and the massive number of spectra generated in large-scale glycopeptide analysis, specialized processing methods are necessary. In this work, we developed a procedure to improve the scoring of N-GlcNAc peptides by deleting non-b/y-type peaks, including the diagnostic ions of GlcNAc, precursor ions, and their isotope peaks. Another mgf file was created by removing the diagnostic ions of GlcNAc and precursor ions from the previously generated mgf file.
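
The peak-removal step described above could look roughly like the following Python sketch. It drops peaks that match a GlcNAc diagnostic ion or the precursor isotope envelope before the spectrum is searched. The function names, the 0.025 Da window, the number of isotope peaks removed, and the restriction to the precursor's own charge state are assumptions for illustration; the text does not give these details.

```python
from typing import List, Sequence, Tuple

Peak = Tuple[float, float]  # (m/z, intensity)

# GlcNAc diagnostic ions (m/z); four-decimal values are an assumption of this sketch.
DIAGNOSTIC_IONS = (204.0866, 186.0760, 168.0655, 144.0655, 138.0549, 126.0549)
C13_SPACING = 1.003355  # mass difference between 13C and 12C, in Da


def precursor_envelope(precursor_mz: float, charge: int, n_isotopes: int = 3) -> List[float]:
    """m/z of the monoisotopic precursor and its first n_isotopes isotope peaks."""
    step = C13_SPACING / charge
    return [precursor_mz + k * step for k in range(n_isotopes + 1)]


def strip_non_by_peaks(
    peaks: Sequence[Peak],
    precursor_mz: float,
    charge: int,
    tol: float = 0.025,      # hypothetical matching window in Da
    n_isotopes: int = 3,     # hypothetical number of isotope peaks to remove
) -> List[Peak]:
    """Remove diagnostic-ion, precursor, and precursor-isotope peaks."""
    excluded = list(DIAGNOSTIC_IONS) + precursor_envelope(precursor_mz, charge, n_isotopes)
    return [
        (mz, intensity)
        for mz, intensity in peaks
        if all(abs(mz - target) > tol for target in excluded)
    ]
```

For a doubly charged precursor at m/z 650.30, for example, peaks near 650.30, 650.80, 651.30, and 651.81 would be removed together with the six diagnostic ions, while all other peaks pass through unchanged.
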
The same simulated biological system described above was analyzed again with this procedure. A total of 30 spectra were identified from the three synthetic GlcNAc-peptides (Supporting Information Figure S-3b), compared to 20 without optimization. This result demonstrates that our spectral processing effectively improved scoring in large-scale analysis of N-glycosylation.

Large Scale Analysis of N-glycosylation Sites in Healthy Human Serum. To illustrate the feasibility of chemical deglycosylation in complex samples, a total of 6 healthy human sera were used for further investigation. In parallel, 3 endoglycosidases, Endo M, Endo H, and Endo F3, were applied in N-glycosylation site analysis as controls. Each endoglycosidase cleaves specific N-glycans between the two GlcNAc residues of the N-glycan core. Endo M has broad substrate specificity for hybrid-type and bi-antennary complex-type oligosaccharides; Endo H cleaves high-mannose-type and some hybrid oligosaccharides; Endo F3 cleaves core-fucose-type oligosaccharides. An overview of the analytical strategy is illustrated in Figure 4a. To investigate the reproducibility of the strategy, serum samples were independently prepared in triplicate and measured by LC-MS/MS runs with 120 min acquisition time. Across the three independent experiments, the overlap between any two sets of identifications was around 77%. The Venn diagram showing the overlap among the three independent experiments on the serum sample is given in Figure 4b. In total, 257 unique N-glycosylation sites and 144 N-glycoproteins were identified with chemical deglycosylation from human serum (detailed information on the identified N-glycoproteins and N-glycopeptides is provided in Supporting Information Table S-2). Of the 257 glycosites, 21 sites were identified only after removing the diagnostic ions of GlcNAc and isotopic peaks.
Meanwhile, in the parallel experiment with enzymatic deglycosylation, all experimental steps were the same except for the method of glycan simplification. For Endo F3 deglycosylation, we used a special method to identify the CF-glycosylation sites
according to our previously published paper 26. Compared with the results of the enzymatic method, a total of 73 N-glycosylation sites were unique to acid hydrolysis (Figure 4c), indicating the advantage of microwave-assisted TFA hydrolysis for efficient deglycosylation. Credible information on N-glycosylation sites was obtained using both the acid hydrolysis and enzymatic methods in healthy human serum. Combining both methods, we identified 205 N-glycoproteins and 436 N-glycosylation sites (Supporting Information Table S-3), providing reliable data for N-glycoproteome analysis. The canonical N-glycosylation motifs are N-X-S/T and, less frequently, N-X-C (X is any amino acid except P). In this study, N-glycosylation sites matching N-X-T (57.6%) occurred more frequently than those matching N-X-S (40.5%). Only 4 of the identified N-glycosylation sites carried the rare N-X-C motif, and only 5 identified sites did not match any of the consensus sequences mentioned above, accounting for 1.1% of all identified N-glycosylation sites. This ratio is lower than that reported for plasma by Liu et al. 32. Of the 205 N-glycoproteins from the top confidence set, approximately 44% were identified with a single N-glycosylation site, 21% with two sites, and 9% with three. The most heavily N-glycosylated protein was alpha-2-macroglobulin, which contains 8 N-glycosylation sites according to our experimental results. This is fully consistent with the literature 33, suggesting the accuracy of our method.

CONCLUSION
In this study, we established a precise and convenient method to investigate N-glycosylation sites in human serum using microwave-assisted TFA hydrolysis and characteristic ions recognition. Compared to the enzymatic method with endoglycosidases, microwave-assisted TFA hydrolysis deglycosylation can identify more N-glycosylation sites and is not specific to particular N-glycan forms.
Meanwhile, recognizing N-glycopeptide spectra by their characteristic ions provides a precise means of assigning N-glycosylation sites. We identified 257 N-glycosylation sites and 144 N-glycoproteins from healthy human serum; the method is convenient and fast for large-scale identification of N-glycosylation sites, and provides an important approach for N-glycoprotein functional studies and biomarker screening in cancer research.

AUTHOR INFORMATION
Corresponding Author
Phone: (404) 413-3591. Fax: (404) 413-3580. Email: [email protected]
Author Contributions
This manuscript was written through contributions of all authors. All authors have given approval to the final version of this manuscript.

ACKNOWLEDGMENTS
We sincerely thank the Georgia Research Alliance (GRA) and Georgia State University for purchasing the analytical instrument used in this research. Our work was partially supported by funding from the Key Grant Project of the Chinese Ministry of Education (No. 313033).

SUPPORTING INFORMATION AVAILABLE
Three supplementary figures (Figures S1−S3) and three tables (Tables S1−S3) contain additional experimental data, which detail the results for the identified
N-glycopeptides and N-glycoproteins by chemical and enzymatic methods. This information is available free of charge via the Internet at http://pubs.acs.org/.

REFERENCES
(1) Liwosz, A.; Lei, T.; Kukuruzinska, M. A. The Journal of biological chemistry 2006, 281, 23138-23149.
(2) Sanchez-Felipe, L.; Villar, E.; Munoz-Barroso, I. Glycoconjugate journal 2012, 29, 539-549.
(3) Varki, A.; Kannagi, R.; Toole, B. P. In Essentials of Glycobiology, Varki, A.; Cummings, R. D.; Esko, J. D.; Freeze, H. H.; Stanley, P.; Bertozzi, C. R.; Hart, G. W.; Etzler, M. E., Eds.; Cold Spring Harbor Laboratory Press. The Consortium of Glycobiology Editors, La Jolla, California: Cold Spring Harbor (NY), 2009.
(4) Wright, L. M.; Kreikemeier, J. T.; Fimmel, C. J. Cancer detection and prevention 2007, 31, 35-44.
(5) Drake, R. R.; Schwegler, E. E.; Malik, G.; Diaz, J.; Block, T.; Mehta, A.; Semmes, O. J. Molecular & cellular proteomics : MCP 2006, 5, 1957-1967.
(6) Takahashi, K.; Kijima, S.; Yoshino, K.; Shibukawa, T.; Moriyama, M.; Iwanari, O.; Sawada, K.; Matsunaga, I.; Murao, F.; Kitao, M. Nihon Sanka Fujinka Gakkai zasshi 1985, 37, 591-595.
(7) Kaji, H.; Saito, H.; Yamauchi, Y.; Shinkawa, T.; Taoka, M.; Hirabayashi, J.; Kasai, K.; Takahashi, N.; Isobe, T. Nature biotechnology 2003, 21, 667-672.
(8) Zhang, H.; Li, X. J.; Martin, D. B.; Aebersold, R. Nature biotechnology 2003, 21, 660-666.
(9) Ye, J.; Xu, Y.; Harris, J.; Sun, H.; Bowman, A. S.; Cunningham, F.; Cardona, C.; Yoon, K. J.; Slemons, R. D.; Wan, X. F. Virology 2013, 446, 225-229.
(10) Kuster, B.; Mann, M. Analytical chemistry 1999, 71, 1431-1440.
(11) Palmisano, G.; Melo-Braga, M. N.; Engholm-Keller, K.; Parker, B. L.; Larsen, M. R. J Proteome Res 2012, 11, 1949-1957.
(12) Hagglund, P.; Bunkenborg, J.; Elortza, F.; Jensen, O. N.; Roepstorff, P. J Proteome Res 2004, 3, 556-566.
(13) Segu, Z. M.; Hussein, A.; Novotny, M. V.; Mechref, Y. Journal of proteome research 2010, 9, 3598-3607.
(14) Tretter, V.; Altmann, F.; Marz, L. European journal of biochemistry / FEBS 1991, 199, 647-652.
(15) Chen, W.; Smeekens, J. M.; Wu, R. Journal of proteome research 2014, 13, 1466-1473.
(16) Patgiri, A.; Menzenski, M. Z.; Mahon, A. B.; Arora, P. S. Nature protocols 2010, 5, 1857-1865.
(17) Stevens, M. Y.; Wieckowski, K.; Wu, P.; Sawant, R. T.; Odell, L. R. Organic & biomolecular chemistry 2014.
(18) Hojo, K.; Shinozaki, N.; Hidaka, K.; Tsuda, Y.; Fukumori, Y.; Ichikawa, H.; Wade, J. D. Amino acids 2014, 46, 2347-2354.
(19) Li, J. F.; Wei, F.; Dong, X. Y.; Guo, L. L.; Yuan, G. Y.; Huang, F. H.; Jiang, M. L.; Zhao, Y. D.; Li, G. M.; Chen, H. Food Sci. Biotechnol. 2010, 19, 463-469.
(20) Chen, Z. Y.; Li, Y. L.; Lin, S. H.; Wei, M. P.; Du, F. Y.; Ruan, G. H. Biochem Bioph Res Co 2014, 445, 491-496.
(21) Damm, M.; Nusshold, C.; Cantillo, D.; Rechberger, G. N.; Gruber, K.; Sattler, W.; Kappe, C. O. J. Proteomics 2012, 75, 5533-5543.
(22) Chen, W. X.; Smeekens, J. M.; Wu, R. H. Journal of Proteome Research 2014, 13, 1466-1473.
(23) Lee, B. S.; Krishnanchettiar, S.; Lateef, S. S.; Gupta, S. Rapid communications in mass spectrometry : RCM 2005, 19, 1545-1550.
(24) Lehmann, W. D.; Kruger, R.; Salek, M.; Hung, C. W.; Wolschin, F.; Weckwerth, W. J Proteome Res 2007, 6, 2866-2873.
(25) Frese, C. K.; Altelaar, A. F.; Hennrich, M. L.; Nolting, D.; Zeller, M.; Griep-Raming, J.; Heck, A. J.; Mohammed, S. J Proteome Res 2011, 10, 2377-2388.
(26) Ma, C.; Zhang, Q.; Qu, J.; Zhao, X.; Li, X.; Liu, Y.; Wang, P. G. J. Proteomics. 2015, 114, 61-70.
(27) Wisniewski, J. R.; Zougman, A.; Nagaraj, N.; Mann, M. Nat Methods 2009, 6, 359-362.
(28) Ma, C.; Zhao, X.; Han, H.; Tong, W.; Zhang, Q.; Qin, P.; Chang, C.; Peng, B.; Ying, W.; Qian, X. Electrophoresis 2013, 34, 2440-2450.
(29) Olsen, J. V.; Macek, B.; Lange, O.; Makarov, A.; Horning, S.; Mann, M. Nat Methods 2007, 4, 709-712.
(30) Olsen, J. V.; Schwartz, J. C.; Griep-Raming, J.; Nielsen, M. L.; Damoc, E.; Denisov, E.; Lange, O.; Remes, P.; Taylor, D.; Splendore, M.; Wouters, E. R.; Senko, M.; Makarov, A.; Mann, M.; Horning, S. Molecular & cellular proteomics : MCP 2009, 8, 2759-2769.
(31) Fu, Y.; Yang, Q.; Sun, R.; Li, D.; Zeng, R.; Ling, C. X.; Gao, W. Bioinformatics (Oxford, England) 2004, 20, 1948-1954.
(32) Liu, T.; Qian, W. J.; Gritsenko, M. A.; Camp, D. G., 2nd; Monroe, M. E.; Moore, R. J.; Smith, R. D. J. Proteome Res. 2005, 4, 2070-2080.
(33) Bunkenborg, J.; Pilch, B. J.; Podtelejnikov, A. V.; Wisniewski, J. R. Proteomics 2004, 3, 454-465.

Figure 1. Large scale mapping of N-glycosylation sites using microwave-assisted acid hydrolysis and characteristic ions recognition.

Figure 2. LC-MS spectra of SGP treated with microwave-assisted TFA hydrolysis. (a) LC-MS spectrum of SGP. (b) LC-MS spectrum of SGP after a 10 min reaction. (c) MS spectrum of SGP after a 30 min reaction.

Figure 3. Diagnostic ions of GlcNAc and precursor ions are deleted from MS/MS spectra. Diagnostic ions include m/z 204.0866, 186.0760, 168.0655, 144.0655, 138.0549, and 126.0549.
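
As a cross-check on the diagnostic ion values, the m/z of each ion can be recomputed from an elemental composition. The compositions below are the ones commonly assigned to the HexNAc oxonium ion and its fragments; they are not stated in this paper and are an assumption of this sketch, as are the function and variable names.

```python
# Monoisotopic masses (Da) of the elements involved, and the electron mass.
MONOISOTOPIC_MASS = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}
ELECTRON_MASS = 0.00054858

# Assumed elemental compositions of the singly charged GlcNAc diagnostic cations.
OXONIUM_COMPOSITIONS = {
    "C8H14NO5+": {"C": 8, "H": 14, "N": 1, "O": 5},  # intact HexNAc oxonium ion
    "C8H12NO4+": {"C": 8, "H": 12, "N": 1, "O": 4},  # loss of H2O
    "C8H10NO3+": {"C": 8, "H": 10, "N": 1, "O": 3},  # loss of 2 H2O
    "C6H10NO3+": {"C": 6, "H": 10, "N": 1, "O": 3},  # loss of C2H4O2
    "C7H8NO2+": {"C": 7, "H": 8, "N": 1, "O": 2},    # loss of CH6O3
    "C6H8NO2+": {"C": 6, "H": 8, "N": 1, "O": 2},    # loss of C2H6O3
}


def singly_charged_mz(composition: dict) -> float:
    """m/z of a 1+ cation: sum of atomic masses minus one electron mass."""
    atoms = sum(MONOISOTOPIC_MASS[el] * n for el, n in composition.items())
    return atoms - ELECTRON_MASS


if __name__ == "__main__":
    for formula, composition in OXONIUM_COMPOSITIONS.items():
        print(f"{formula:10s} {singly_charged_mz(composition):.4f}")
```

Under these assumed compositions, each computed value lies within about 0.0001 of the corresponding four-decimal diagnostic ion, and the smallest ion falls near m/z 126.055.
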

Figure 4. Large scale analysis of N-glycosylation sites in human serum samples. A. Large scale analysis of N-glycosylation sites in human serum samples with microwave-assisted TFA hydrolysis and endoglycosidases (Endo H, Endo M, and Endo F3). B. Repeatability of deglycosylation via microwave-assisted TFA hydrolysis. C. Overlap of N-glycosylation sites mapped with different deglycosylation methods. All Venn diagrams were drawn with the online tool Venny 2.0 (http://bioinfogp.cnb.csic.es/tools/venny/).
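
The overlap figures behind panels B and C can be reproduced from lists of identified sites with plain set operations. The sketch below is a minimal illustration: the site identifiers are made up, the helper names are hypothetical, and defining pairwise overlap as shared sites divided by the union of the two runs is an assumption, since the text does not state which definition was used.

```python
from itertools import combinations
from typing import Dict, Set


def pairwise_overlap(a: Set[str], b: Set[str]) -> float:
    """Shared fraction of two site sets (assumed definition: |A & B| / |A | B|)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def venn3_counts(a: Set[str], b: Set[str], c: Set[str]) -> Dict[str, int]:
    """Sizes of the seven regions of a three-set Venn diagram."""
    return {
        "A only": len(a - b - c),
        "B only": len(b - a - c),
        "C only": len(c - a - b),
        "A&B only": len((a & b) - c),
        "A&C only": len((a & c) - b),
        "B&C only": len((b & c) - a),
        "A&B&C": len(a & b & c),
    }


if __name__ == "__main__":
    # Hypothetical site identifiers in "protein:position" form, for illustration only.
    runs = {
        "run1": {"P1:N55", "P1:N70", "P2:N12", "P3:N140"},
        "run2": {"P1:N55", "P2:N12", "P3:N140", "P4:N33"},
        "run3": {"P1:N55", "P2:N12", "P4:N33", "P5:N8"},
    }
    for (name1, sites1), (name2, sites2) in combinations(runs.items(), 2):
        print(name1, name2, f"{pairwise_overlap(sites1, sites2):.0%}")
    print(venn3_counts(runs["run1"], runs["run2"], runs["run3"]))
```

The seven region counts always sum to the size of the union of the three sets, which gives a quick consistency check against any drawn diagram.
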
