Article pubs.acs.org/jpr
Parallel Comparison of N‑Linked Glycopeptide Enrichment Techniques Reveals Extensive Glycoproteomic Analysis of Plasma Enabled by SAX-ERLIC
Sarah M. Totten,† Christa L. Feasley,‡ Abel Bermudez,† and Sharon J. Pitteri*,†
† Canary Center at Stanford for Cancer Early Detection, Department of Radiology, Stanford University School of Medicine, 3155 Porter Drive MC5483, Palo Alto, California 94304, United States
‡ ThermoFisher Scientific, 1400 Northpoint Parkway Suite 10, West Palm Beach, Florida 33407, United States
Supporting Information
ABSTRACT: Protein glycosylation is of increasing interest due to its important roles in protein function and aberrant expression with disease. Characterizing protein glycosylation remains analytically challenging due to its low abundance, ion suppression issues, and microheterogeneity at glycosylation sites, especially in complex samples such as human plasma. In this study, the utility of three common N-linked glycopeptide enrichment techniques is compared using human plasma. By analysis on an LTQ-Orbitrap Elite mass spectrometer, electrostatic repulsion hydrophilic interaction liquid chromatography using strong anion exchange solid-phase extraction (SAX-ERLIC) provided the most extensive N-linked glycopeptide enrichment when compared with multilectin affinity chromatography (M-LAC) and Sepharose-HILIC enrichments. SAX-ERLIC enrichment yielded 191 unique glycoforms across 72 glycosylation sites from 48 glycoproteins, which is more than double that detected using other enrichment techniques. The greatest glycoform diversity was observed in SAX-ERLIC enrichment, with no apparent bias toward specific glycan types. SAX-ERLIC enrichments were additionally analyzed by an Orbitrap Fusion Lumos mass spectrometer to maximize glycopeptide identifications for a more comprehensive assessment of protein glycosylation. In these experiments, 829 unique glycoforms were identified across 208 glycosylation sites from 95 plasma glycoproteins, a significant improvement from the initial method comparison and one of the most extensive site-specific glycosylation analyses in immunodepleted human plasma to date. Data are available via ProteomeXchange with identifier PXD005655.
KEYWORDS: glycoproteomics, mass spectrometry, plasma, protein glycosylation, N-linked glycopeptide enrichment
■
INTRODUCTION
Glycosylation of proteins is the most abundant post-translational modification, occurring in ∼50% of all eukaryotic proteins.1−3 The accurate and complete characterization of glycoproteins has been of increasing interest due to their biological importance in mediating cancer metastasis, cell−cell interactions, protein stability, and immune response.4−6 Glycosylation of cancer markers and therapeutic proteins (biologics) is now noted as a critical feature that must be monitored.7−11 Glycosylation has also been shown to be aberrant in a number of diseases, including several types of cancer.12 The clinical relevance of glycoproteins is made evident by the observation that the vast majority of FDA approved cancer biomarkers are glycosylated proteins.13−16 Despite recent technological advances in the use of mass spectrometry for glycoproteomic studies, glycosylation remains analytically challenging to monitor directly. This is due in large part to the microheterogeneity at the glycosylation site and the relatively low abundance and poor ionization efficiency of
glycopeptides in the presence of nonglycosylated peptides. Unmodified peptides occur in equimolar amounts, whereas each glycosylated peptide is divided among as many glycoforms as are present at its site. The suppression of glycopeptide ions relative to the nonglycosylated peptide ions can also lower detectability in complex mixtures, especially in enzymatic digests from biological matrices, thus making them much more difficult to observe in a typical LC−MS/MS experiment. For these reasons, enrichment at either the glycoprotein or glycopeptide level, or both, has become an essential part of sample preparation for glycosylation analysis.
To date, several different approaches for N-linked glycopeptide enrichment have been reported in the literature. Hydrazide-functionalized solid supports have been used to capture glycopeptides via their glycans. Although highly selective for glycopeptides, this method involves releasing the glycan from the peptide and subsequent analysis of the deglycosylated peptides for identification of glycosylation sites.17−20 This eliminates information concerning which glycan structures occupied which glycosylation sites. Similarly, titanium dioxide (TiO2)-based approaches, although highly selective for sialylated glycopeptides, often require glycan cleavage from the peptide.21−25 Alternatively, to retain protein- and site-specific information, several other enrichment strategies have been described for the analysis of intact N-linked glycopeptides following protein digestion and are described in detail in a number of reviews.26−29 Some of the commonly reported enrichment strategies include lectin affinity selection,30−33 hydrophilic interaction liquid chromatography (HILIC),34−37 and electrostatic repulsion hydrophilic interaction chromatography (ERLIC).38−46
Lectin affinity chromatography uses carbohydrate binding proteins (lectins) with affinity for particular carbohydrate moieties to capture glycopeptides or glycoproteins.2,3,30,31,33 Lectin affinity columns and resins are commercially available; however, the lectin density tends to be too low for adequate binding of glycopeptides that typically will only have a single binding site per peptide.30,31,33 Alternatively, lectin columns can be synthesized to have higher lectin density and better glycopeptide selectivity and are useful for targeting specific glycan moieties.31,33 Because the selectivity of lectins relies on their affinity for specific carbohydrate moieties (versus exploiting their hydrophilic properties), multiple lectins often need to be used to broaden the types of glycans captured for a more global analysis.
Hydrophilic interaction chromatography (HILIC) has also been widely reported for the enrichment of glycopeptides across a variety of media, including carbohydrate-based matrices (such as cellulose, Sepharose, and cotton), amide-80 chromatography columns, and zwitterionic stationary phases (ZIC-HILIC). HILIC-based enrichment strategies are less biased for specific glycan species compared with lectin affinity and TiO2 approaches, with glycopeptide separation depending largely on the hydrophilicity and size of the glycan moiety. HILIC materials are also widely commercially available and make for a practical and easily accessible option for enrichment.
ERLIC has been recently reported as an enrichment strategy for phosphopeptides and glycopeptides as an alternative to more traditional HILIC enrichment strategies, which demonstrate lower selectivity.38−46 The charged chromatography media may select a wider range of glycopeptides and be less hindered by the relative size of the glycan or peptide.
A limited number of studies have reported a direct, side-by-side comparison of enrichment techniques to reveal any bias in the enriched fraction.48,49 Furthermore, only a few studies have extended beyond single or small mixtures of glycoprotein standards to assess the efficacy and specificity of glycopeptide enrichment techniques that allow for site-specific analysis in real complex samples such as serum or plasma. Here we perform a direct comparison of three techniques, lectin affinity chromatography, Sepharose-HILIC, and SAX-ERLIC, to determine the most appropriate enrichment strategy for N-linked glycoproteomic analysis of human plasma based on ease of use, robustness of enrichment, and least bias toward any particular glycoform or subgroup (that is, the most universal approach). Additionally, enrichment using the most successful method (SAX-ERLIC) was repeated and subsequently analyzed on an Orbitrap Fusion Lumos Tribrid mass spectrometer to maximize N-linked glycopeptide identification on an instrument with increased sensitivity, allowing for more in-depth glycosylation site mapping on a greater number of plasma proteins.
Received: September 23, 2016
Journal of Proteome Research
DOI: 10.1021/acs.jproteome.6b00849 J. Proteome Res. XXXX, XXX, XXX−XXX
■
MATERIALS AND METHODS
Sample Preparation
Pooled normal human EDTA-plasma was purchased from Innovative Research (Novi, MI). Eight 200 μL aliquots of pooled human plasma were each depleted of 14 abundant proteins using CaptureSelect Human 14 material, as previously described.50 All aliquots of depleted plasma were recombined, and the protein concentration was measured using a Bradford Protein Assay (ThermoFisher Scientific). The combined protein sample was then redivided into eight equal aliquots. Samples were buffer exchanged into 50 mM ammonium bicarbonate using 3K spin filters (Millipore) and then thermally denatured for 10 min at 95 °C. Protein disulfide bonds were reduced with 10 mM dithiothreitol and alkylated with 18 mM iodoacetamide (Sigma). Samples were digested with trypsin (Promega) at a 1:25 enzyme-to-protein ratio for 18 h at 37 °C.
C18 Solid-Phase Extraction
C18 solid-phase extraction (SPE) was performed on six of the eight aliquots of depleted plasma. Two of these six were used as subsequent controls for the glycopeptide enrichment. Four samples were C18 desalted and concentrated prior to subsequent M-LAC and Sepharose-HILIC enrichment. SOLA C18 HRP (10 mg/mL) SPE cartridges (ThermoFisher Scientific) were loaded on a vacuum manifold. The cartridges were preconditioned using in-house vacuum during washing. Wash solutions were applied 1 mL at a time: 3 mL of methanol, followed by 3 mL of 5% acetonitrile (ACN) with 0.1% trifluoroacetic acid (TFA) in water, 3 mL of 50% ACN with 0.1% TFA in water, and 6 mL of 5% ACN with 0.1% TFA in water. Peptides were applied to the SPE cartridge and washed with 500 μL of 5% ACN with 0.1% TFA in water. Positive pressure from a pipet bulb was used to assist with flow across the SPE cartridge. The SPE cartridge was then washed with 6 mL of 5% ACN with 0.1% TFA in water. The peptides were next eluted in three 500 μL aliquots of 50% ACN with 0.1% TFA in water. Larger peptides were eluted in two 1 mL aliquots of 80% ACN with 0.1% TFA in water. The 50% and 80% eluates were combined, and samples were dried in a Speedvac.
Multi-Lectin Affinity Chromatography
Lectin affinity chromatography was performed on two of the six aliquots following C18 solid-phase extraction. A multilectin affinity chromatography (M-LAC) column containing Concanavalin A (ConA) and Wheat Germ Agglutinin (WGA) was used for glycopeptide capture. The M-LAC column was constructed using a previously described method.51 In brief, ConA and WGA were covalently immobilized onto a polymeric matrix at a density of 15 mg of lectin per mL of POROS 20-AL beads. The beads were then packed into a 4.6 mm × 100 mm column. After C18 SPE, proteins were reconstituted in 500 μL of lectin loading buffer containing 25 mM Tris, 0.5 mM NaCl, 1 mM CaCl2, and 1 mM MnCl2 and loaded onto the column. The column was washed with loading buffer, and unbound tryptic peptides were collected in the flow-through fraction. The bound tryptic glycopeptides were then eluted at low pH with 100 mM acetic acid. The eluted peptides were then concentrated using a 3K spin filter (Millipore). The concentrated sample was then subjected to another round of C18 SPE for desalting, as described above.
HILIC Using Sepharose Media
Enrichment by HILIC using Sepharose media was performed on two of the six aliquots following C18 solid-phase extraction. This protocol was adapted from a published study by Wada et al.52 Empty Mini Bio-Spin columns were purchased from BioRad and packed with 100 μL of Sepharose CL-4B media (Sigma-Aldrich). Sepharose media was conditioned with 500 μL of a high organic solvent of 1-butanol/ethanol/H2O (4:1:1, v/v/v). Samples were reconstituted in the same organic solvent, loaded onto the Sepharose, and placed on a rocker for 30 min. The loading solution was drained into waste, and the samples were then washed twice with 500 μL of organic solvent. Glycopeptides were eluted by adding 500 μL of elution solvent (1:1 ethanol/water, v/v), rocking for an additional 30 min, and then collecting the eluate. Eluted peptides were dried using a Speedvac.
Strong Anion Exchange Chromatography Using ERLIC (SAX-ERLIC)
Strong anion exchange chromatography using ERLIC (SAX-ERLIC) was performed on two aliquots following immunodepletion, with no prior desalting with C18 SPE. SOLA SAX SPE cartridges (ThermoFisher Scientific) were loaded on a vacuum manifold. The cartridges were preconditioned using in-house vacuum during washing. Wash solutions were applied 1 mL at a time: 3 mL of ACN, followed by 3 mL of 100 mM triethylammonium acetate in water, 3 mL of 1% TFA in water, and 3 mL of 95% ACN with 1% TFA in water. The organic content of the tryptic peptide solution was adjusted to 95% ACN with 1% TFA in water. Peptides were applied to the SPE cartridge; then, the cartridge was washed with 500 μL of 95% ACN with 1% TFA in water. Positive pressure from a pipet bulb was used to assist with flow across the SPE cartridge. The SPE cartridge was then washed with an additional 6 mL of 95% ACN with 1% TFA in water. The peptides were next eluted in three 500 μL aliquots of 50% ACN with 0.1% TFA in water. Larger peptides were eluted in two 1 mL aliquots of 5% ACN with 0.1% TFA in water. Eluates were combined and dried in a Speedvac.
LC−MS/MS Analysis
Samples were reconstituted in 0.3% formic acid in water and loaded onto a C18 trap column (15 μL injection volume) coupled to an UltiMate Rapid Separation LC (Dionex) system at 5 μL/min for 10 min. Samples were then loaded onto a 25 cm length C18 analytical column (Picofrit 75 μm ID, New Objective) packed in-house with Magic C18AQ resin (Michrom Bioresources). Tryptic peptides were eluted using a multistep gradient at a flow rate of 0.6 μL/min from 0.1% formic acid in water to 85%−0.1% formic acid in acetonitrile over 120 min. Peptides were analyzed using an LTQ-Orbitrap Elite mass spectrometer equipped with electron-transfer dissociation (ThermoFisher Scientific). The electrospray ionization voltage was set to 2.25 kV, and the capillary temperature was set to 200 °C. MS1 scans were performed over m/z 400−1800, and the top five most intense ions (+2 or higher charge states) were subjected to higher energy collision-induced dissociation (HCD) with 27 eV, default charge state +4, for 0.1 s. If oxonium product ions (m/z 138.0545, 204.0867, 274.0921, 292.0800, and 366.1396) were observed in the HCD spectra, electron-transfer dissociation (ETD) (200 ms) with supplemental activation (35 eV) was performed in a subsequent scan on the same precursor ion selected for HCD.
Glycopeptide-enriched samples from the SAX-ERLIC enrichment and the C18 control (5 μL injection volume) were analyzed on an Orbitrap Fusion Lumos mass spectrometer (Thermo Scientific, San Jose, CA) in a data-dependent mode. Approximately 0.5 to 1 μg of digest was injected. Liquid chromatography was performed using an Easy nLC-1000 (Thermo Scientific, San Jose, CA) running at 300 nL/min with a 60 min linear gradient of 2−45% acetonitrile/0.1% formic acid, followed by a 1 min ramp to 98% with a 10 min hold. The column was reequilibrated with 6 μL of 0.1% formic acid in water prior to each subsequent injection. The mass spectrometer was scanned from 400−1600 m/z with the wide quadrupole isolation on at a resolution of 120 000 (m/z 200) with a 4e5 AGC target value and ion funnel RF of 30. Ions selected for data-dependent MS2 scans were filtered for a charge state of +2 through +8, monoisotopic precursor selection match, an intensity of at least 5e4, and dynamic exclusion set to 20 s. Data-dependent HCD tandem mass spectra were collected with a resolution of 30 000 in the Orbitrap using a loop count of ten, isolation width of 2.0 Th, a fixed first mass of 195 (to ensure observation of the m/z 204 glycan oxonium ion), 30% normalized collision energy, and a target value of 1e4. Precursors with HCD MS2 spectra that contained at least one glycan oxonium ion (m/z 204.0867, 274.0921, 292.1026, 366.1396, and 657.2347) as one of the top 20 MS2 ions within 5 ppm mass tolerance were targeted for a subsequent EThcD MS2, a hybrid fragmentation method combining electron-transfer/higher-energy collision dissociation.53 EThcD was collected in the Orbitrap at a resolution of 30 000, isolation width of 2.0 Th, reagent ion target value of 2e5, maximum precursor injection time of 250 ms for a target value of 2e4, and an HCD supplemental activation of 20.
Calibrated ETD reaction times were not utilized for this method. Instead, ETD reaction times were varied according to charge state and based on prior experimentally determined reaction times. Precursors with a charge state of +3 had an ETD reaction time of 100 ms, those with a +4 charge state 70 ms, a +5 charge state 50 ms, and +6 to +8 charge states 30 ms. The raw data files were searched against the human database using Proteome Discoverer 2.1 (Thermo Scientific, Bremen, DE) and Preview (Protein Metrics, San Carlos, CA) to verify digestion completeness for nonglycopeptides using Sequest HT.
Data Processing and Analysis
LC−MS data were searched using Byonic (Protein Metrics) against the human Swiss-Prot database. A focused database with decoys was also created for each search. A precursor mass tolerance of 10 ppm was selected, and HCD and ETD tolerance were set to 0.5 Da. The following modifications were selected: carbamidomethyl on cysteine (fixed modification), methionine oxidation (common modification), and asparagine deamidation (rare modification). The N-glycan human database provided in Byonic was used to search for N-glycan compositions. A false discovery rate of 5% and Byonic cutoff score of 100 were used for searches. All samples were prepared and analyzed in two replicates and are reported as averages. Glycan structures were drawn using GlycoWorkbench.54 The mass spectrometry data and search results have been deposited to the ProteomeXchange Consortium via the PRIDE partner repository with the data set identifier PXD005655.
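To illustrate how a parts-per-million tolerance and a score cutoff act as acceptance filters, the following minimal Python sketch applies the 10 ppm precursor tolerance and the score cutoff of 100 stated above. It is only a sketch under stated assumptions: the `Match` record, the function names, and the example masses and scores are hypothetical and are not part of Byonic or of the search workflow used in this study, and the 5% false discovery rate filter is not modeled.

```python
from dataclasses import dataclass

PRECURSOR_TOL_PPM = 10.0   # precursor mass tolerance stated in the text
SCORE_CUTOFF = 100.0       # score cutoff stated in the text


def ppm_error(observed: float, theoretical: float) -> float:
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


@dataclass
class Match:
    """Hypothetical record for one peptide-spectrum match."""
    observed_mass: float
    theoretical_mass: float
    score: float


def passes_filters(m: Match) -> bool:
    """Keep a match only if it is within tolerance and meets the score cutoff."""
    error = abs(ppm_error(m.observed_mass, m.theoretical_mass))
    return error <= PRECURSOR_TOL_PPM and m.score >= SCORE_CUTOFF


# Illustrative values only (not taken from the deposited data set).
matches = [
    Match(3000.0150, 3000.0000, 250.0),  # about 5 ppm error, high score
    Match(3000.0600, 3000.0000, 250.0),  # about 20 ppm error, outside tolerance
    Match(3000.0150, 3000.0000, 80.0),   # within tolerance, score below cutoff
]
kept = [m for m in matches if passes_filters(m)]
```

The same `ppm_error` helper could be reused with a 5 ppm limit to check whether an observed fragment lies close enough to one of the oxonium ion m/z values listed in the LC−MS/MS methods.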
■
RESULTS AND DISCUSSION
The primary goals of this work were to perform a parallel comparison of glycopeptide enrichment methods using depleted human plasma and to identify glycosylation sites and structures on less abundant proteins in human plasma. The overall experimental workflow is shown in Figure 1. Human
sample shows the highest signal intensity for the GlcNAc and sialic acid−H2O oxonium ions (Figure 2A); however, the SAX-ERLIC sample shows more peptides eluting over a larger time range (Figure 2B). The oxonium ion signals for the SAX-ERLIC sample (Figure 2B) show the highest intensity and widest elution time range when compared with the M-LAC (Figure 2C) and Sepharose-HILIC (Figure 2D) methods. Not only is this indicative of the enrichment of glycopeptides in this sample, but it also reflects the efficiency of activating the subsequent product-dependent ETD scans for improved glycopeptide identification. The oxonium ions are less intense in the M-LAC sample (Figure 2C) by nearly an order of magnitude, with a notable decrease in abundance of the sialic acid−H2O ions. The oxonium ion signal in the Sepharose-HILIC sample (Figure 2D) is most similar to the C18 control with respect to elution time range, diversity, and intensity. Interestingly, the base peak chromatogram signal is of lowest intensity in the SAX-ERLIC sample; however, the oxonium EICs have the highest intensity of all of the samples (with the exception of three peaks found in the C18 control). The degree of glycopeptide enrichment from each method in the initial method comparison analysis was further evaluated by counting the N-linked glycopeptide identifications from MS/MS spectra from HCD-pd-ETD experiments on an LTQ-Orbitrap Elite mass spectrometer, as shown in Figure 3. The C18 control sample yielded 4814 total peptides with only 16 peptides identified as glycopeptides (0.32%). SAX-ERLIC, Sepharose-HILIC, and M-LAC strategies all yielded a higher percentage of glycopeptide identifications relative to total peptides (22, 2, and 5%, respectively) and absolute number of peptides identified as glycopeptides (262, 63, and 117, respectively) compared with the C18 control (Figure 3A).
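The enrichment figures quoted above are simple ratios. As a rough illustration, the Python sketch below recomputes the glycopeptide fraction of the C18 control from the quoted counts and estimates an approximate fold-enrichment of each method over the control from the quoted percentages. The helper names are hypothetical, and because the reported values are averages of two replicates, a ratio recomputed from the rounded numbers may differ slightly from the published percentage.

```python
def glyco_fraction_percent(glycopeptides: int, total_peptides: int) -> float:
    """Percentage of identified peptides that are glycopeptides."""
    return 100.0 * glycopeptides / total_peptides


def fold_enrichment(method_percent: float, control_percent: float) -> float:
    """Ratio of a method's glycopeptide percentage to the control's."""
    return method_percent / control_percent


# C18 control: counts quoted in the text (16 glycopeptides of 4814 peptides).
control_percent = glyco_fraction_percent(16, 4814)

# Glycopeptide percentages quoted in the text for each enrichment method.
reported_percent = {"SAX-ERLIC": 22.0, "M-LAC": 5.0, "Sepharose-HILIC": 2.0}

for method, pct in reported_percent.items():
    fold = fold_enrichment(pct, control_percent)
    print(f"{method}: roughly {fold:.0f}-fold over the C18 control")
```

These fold values are derived here only for illustration and are not reported in the study itself.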
Additionally, the SAX-ERLIC-enriched samples resulted in the highest number of unique glycopeptide identifications, both in absolute count and as a percentage of total peptide count (Figure 3B). See Supplementary Table S1 for a complete listing of all of the glycopeptides identified in this study by each enrichment method. The diversity of glycopeptides identified by each method is demonstrated by the number of unique glycopeptides (as defined by unique peptide sequence and glycan composition), as shown in Figure 3B. The LC−MS/MS data from the method comparison analysis yielded 10 unique glycopeptides in the control, representing