Comparative Serum Glycoproteomics Using Lectin Selected Sialic Acid Glycoproteins with Mass Spectrometric Analysis: Application to Pancreatic Cancer Serum
Jia Zhao,† Diane M. Simeone,§ David Heidt,‡ Michelle A. Anderson,§ and David M. Lubman*,†,‡,|
Department of Chemistry, The University of Michigan, Ann Arbor, Michigan 48109-1055, Department of Surgery, The University of Michigan Medical Center, Ann Arbor, Michigan 48109-0656, Department of Surgery and Molecular Integrative Physiology, The University of Michigan Medical Center, Ann Arbor, Michigan 48109-0656, and Comprehensive Cancer Center, University of Michigan Medical Center, Ann Arbor, Michigan
Received February 3, 2006
A strategy is developed in this study for identifying sialylated glycoprotein markers in human cancer serum. The method consists of three steps: lectin affinity selection, a liquid separation, and characterization of the glycoprotein markers using mass spectrometry. In this work, we use three different lectins (wheat germ agglutinin (WGA), elderberry lectin (SNA), and Maackia amurensis lectin (MAL)) to extract sialylated glycoproteins from normal and cancer serum. Twelve highly abundant proteins are depleted from the serum using an IgY-12 antibody column. The use of the different lectin columns allows one to monitor the distribution of α(2,3) and α(2,6) linkage type sialylation in cancer serum vs that in normal samples. Extracted glycoproteins are fractionated using NPS-RP-HPLC followed by SDS-PAGE. Target glycoproteins are characterized further using mass spectrometry to elucidate the carbohydrate structure and glycosylation site. We applied this approach to the analysis of sialylated glycoproteins in pancreatic cancer serum. Approximately 130 sialylated glycoproteins are identified using µLC-MS/MS. Sialylated plasma protease C1 inhibitor is found to be down-regulated in cancer serum. Changes in glycosylation sites in cancer serum are also observed by glycopeptide mapping using µLC-ESI-TOF-MS, where the N83 glycosylation of α1-antitrypsin is down-regulated. In addition, the glycan structures of the altered proteins are assigned using MALDI-QIT-MS. This strategy offers the ability to quantitatively analyze changes in glycoprotein abundance and to detect the extent of glycosylation alteration as well as the carbohydrate structures that correlate with cancer.
Keywords: serum • glycoprotein • pancreatic cancer • lectin • liquid chromatography • mass spectrometry
* To whom correspondence should be addressed. The University of Michigan Medical Center, Department of Surgery, MSRB1, Rm A510B, 1150 West Medical Center Drive, Ann Arbor, MI, 48109-0656. Fax: (734) 615-8108. E-mail: [email protected].
† Department of Chemistry, The University of Michigan.
‡ Department of Surgery, The University of Michigan Medical Center.
§ Department of Surgery and Molecular Integrative Physiology, The University of Michigan Medical Center.
| Comprehensive Cancer Center, University of Michigan Medical Center.
Journal of Proteome Research 2006, 5, 1792-1802
Published on Web 05/26/2006
10.1021/pr060034r CCC: $33.50 © 2006 American Chemical Society

1. Introduction
Pancreatic cancer is a major oncologic challenge where early detection biomarkers are desperately needed. Pancreatic carcinoma is the fourth most frequent cause of cancer death in Europe and the USA.1 It has the worst prognosis of any cancer, with a 5-year survival rate of less than 3%. Despite years of research in this area, there is no reliable method for detection of this disease in asymptomatic patients. The biomarker CA19-9 is currently used clinically in patients with pancreatic cancer; however, the sensitivity and specificity of the biomarker are not high, and serum levels are significantly increased in inflammatory diseases of the pancreas and biliary tract. More recent RNA-based studies have reported overexpression of S100A4, prostate stem cell antigen, osteopontin, mesothelin, hTert, and CEACAM1, with elevations of some of these molecules measured in serum, although the clinical applications of these RNA-based markers have not been widely reported.2,3 There is currently great interest in developing protein-based serum markers for cancer. Given the inaccessible location of the pancreas, a serum test is needed to screen patients for the early detection of this disease, particularly in high-risk populations.
An important target for serum detection involves the presence of glycosylated proteins. Protein glycosylation has long been recognized as a very common posttranslational modification, playing a fundamental role in many biological processes such as immune response and cellular regulation.4,5 The glycoproteome is one of the major subproteomes of human serum, where glycoproteins secreted into the bloodstream comprise a major part of the serum proteome.6 Many clinical biomarkers and therapeutic targets in cancer are glycoproteins, such as CA125 in ovarian cancer, Her2/neu in breast cancer, and prostate-specific antigen in prostate cancer. In addition, the alteration in protein glycosylation which occurs
through varying the heterogeneity of glycosylation sites or changing the glycan structure of proteins on the cell surface and in body fluids has been shown to correlate with the development of cancer and other disease states.7 Therefore, a method that can (1) quantitatively analyze glycoprotein abundance and (2) detect the extent of glycosylation alteration and the carbohydrate structure that correlate with pancreatic cancer will be essential for the discovery of new potential diagnostic markers of this disease.
Sialic acids are generally found at the nonreducing terminus of most glycoproteins and glycolipids, attached via an α-2,3 or α-2,6 linkage to galactose or HexNAc. Sialic acids are important regulators of cellular and molecular interactions. They can either mask recognition sites or serve as recognition determinants.8 Increased sialylation of tumor cell surfaces is well-known and is due either to increased activity of the sialyltransferases or to increased branching of N-linked carbohydrates, leading to termini which can be sialylated.9 Aberrant sialylation in cancer cells is thought to be a characteristic feature associated with malignant properties including invasiveness and metastatic potential.
Various methods have been developed to enrich glycoproteins. Zhang et al. have developed a method to enrich glycoproteins through hydrazide chemistry.10 In this method, the captured glycopeptides were deglycosylated by PNGase F and quantified by isotope labeling. Lectin affinity chromatography has recently been widely used to purify glycoproteins with specific structures. Hancock and co-workers developed a multilectin affinity column which combines ConA, WGA, and Jacalin to capture the majority of glycoproteins present in human serum.11 In related work, Regnier et al. utilized serial lectin affinity chromatography (SLAC) for fractionation and comparison of glycan site heterogeneity on glycoproteins derived from human serum.12,13 Novotny et al.
combined silica-based lectin microcolumns with high-resolution separation techniques for enrichment of glycoproteins and glycopeptides.14
In this work, we analyze pancreatic cancer serum using sialic acid specific lectin affinity chromatography followed by fractionation using RP-HPLC and further separation by SDS-PAGE. This method could be used to identify potential serum marker proteins of human cancer. Herein, we present the results as a proof-of-principle for the workflow introduced. The expression of sialic acid glycoproteins with different substructures is compared between normal and cancer serum based on UV absorption detection. Low and medium abundance glycoproteins are analyzed after the depletion of 12 highly abundant proteins. Altered glycoproteins are digested and identified by LC-MS/MS. The structures of the carbohydrates released from purified serum proteins are studied using a MALDI-quadrupole ion trap-TOF (MALDI-QIT) mass spectrometer. Ultimately, this method is used to detect changes in the isoforms and in the extent of glycosylation of target glycoproteins in cancer serum. Glycopeptide mapping is performed using LC-ESI-TOF-MS to study the difference in glycosylation efficiency at the glycosylation sites of proteins between normal and cancer serum.
2. Experimental Section
2.1. Samples. Human normal serum and pancreatic cancer serum were provided by University Hospital according to IRB guidelines. A 40-cm3 portion of blood was provided by each patient. The samples were permitted to sit at room temperature for a
minimum of 30 min (and a maximum of 60 min) to allow the clot to form in the red top tubes, then centrifuged at 1300 × g at 4 °C for 20 min. The serum was removed, transferred to a polypropylene capped tube, and frozen. The frozen samples were stored at -70 °C until assayed. Six samples (three normal serum and three cancer serum) were studied in this work.
2.2. Removing High Abundant Proteins Using Antibody Column and Protein Assay. A 125-µL portion of human serum was depleted using the ProteomeLab IgY-12 proteome partitioning kit (Beckman Coulter, Fullerton, CA) after brief centrifugation using a 0.45 µm spin filter for 1 min at 9200 × g. The experimental procedure follows the protocol provided by Beckman. This column enables removal of albumin, IgG, α1-antitrypsin, IgA, IgM, transferrin, haptoglobin, α1-acid glycoprotein, α2-macroglobulin, HDL (apolipoproteins A-I and A-II), and fibrinogen in a single step. The final volume of serum sample in elution buffer after depletion is 15-20 mL. This volume was concentrated using 15 mL, 10 kDa Amicon filters (Millipore, Billerica, MA). Protein assays were carried out in a 250 µL transparent 96-well plate (Fisher, Barrington, IL) according to the Bradford assay method, since the plate-based method requires less sample than the cuvette-based assay and enables the simultaneous reading of all the samples and standards.
2.3. MAL, SNA, and WGA Affinity Selection. Agarose-bound lectins, wheat germ agglutinin (WGA), elderberry lectin (SNA), and Maackia amurensis lectin (MAL), were purchased from Vector Laboratories (Burlingame, CA). Agarose-bound WGA was packed into a disposable screw end-cap spin column with filters at both ends. The column was first washed with 500 µL of binding buffer (20 mM Tris, 0.2 M NaCl, pH 7.4) by centrifuging the spin column at 500 rpm for 2 min.
The protease inhibitor stock solution was prepared by dissolving one Complete EDTA-free protease inhibitor cocktail tablet (Roche, Indianapolis, IN) in 1 mL of H2O. The stock solution was added to the binding buffer and elution buffer at a ratio of 1:50. Fifty microliters of depleted or nondepleted serum sample, diluted with 500 µL of binding buffer, was loaded onto the column and incubated for 15 min. The column was centrifuged for 2 min at 500 rpm to remove the nonbinding fraction. The column was then washed twice with 600 µL of binding buffer to remove nonspecifically bound material. The captured glycoproteins were released with 150 µL of elution buffer (0.5 M N-acetyl-glucosamine in 20 mM Tris and 0.5 M NaCl, pH 7.0), and the eluted fraction was collected by centrifugation at 500 rpm for 2 min. This step was repeated twice and the eluate fractions were pooled. SNA and MAL spin columns were purchased from QIAGEN (Valencia, CA), and the elution procedure was similar to that used with the WGA spin column. The elution buffer for these two lectins is 0.3 M lactose in buffered saline.
2.4. RP-HPLC Separation of Lectin Bound Glycoproteins. The enriched glycoprotein fraction was loaded onto nonporous silica reversed-phase high-performance liquid chromatography (NPS-RP-HPLC) for separation. High separation efficiency was achieved by using an ODSIII-E (4.6 × 33 mm) column (Eprogen, Inc., Darien, IL) packed with 1.5 µm nonporous silica. To collect purified proteins from NPS-RP-HPLC, the reversed-phase separation was performed at 0.5 mL/min and monitored at 214 nm using a Beckman 166 model UV detector (Beckman Coulter). Proteins eluting from the column were collected by an automated fraction collector (model SC 100; Beckman Coulter), controlled by an in-house designed DOS-based software program. To enhance the speed, resolution and
reproducibility of the separation, the reversed-phase column was heated to 60 °C by a column heater (Jones Chromatography, model 7971). Both mobile phase A (water) and B (ACN) contained 0.1% v/v TFA. The gradient profile used was as follows: 5% to 15% B in 1 min, 15% to 25% B in 2 min, 25% to 30% B in 3 min, 30% to 41% B in 15 min, 41% to 47% B in 4 min, 47% to 67% B in 5 min, and 67% to 100% B in 2 min. Deionized water was purified using a Millipore RG system (Bedford, MA).
2.5. Gel Electrophoresis and Fluorescence Dye Labeling.
2.5.1. SDS-PAGE. The fractions collected from RP-HPLC were further separated by SDS-PAGE according to Laemmli,15 run in a Mini-PROTEAN Cell (Bio-Rad, Hercules, CA) at 80 V controlled by a Power Pac3000 (Bio-Rad, Hercules, CA). The proteins were visualized by staining with Sypro-Ruby fluorescence dye (Molecular Probes, Carlsbad, CA). The staining was performed according to the protocol suggested by the manufacturer.
2.5.2. 2-D PAGE. 2-D electrophoresis was performed according to “2-D gel electrophoresis principles and methods” (Amersham, Piscataway, NJ). A 5-µL serum sample was loaded on an 11 cm (pH 3-10) IPG gel (Bio-Rad). The first dimension separation was carried out on a Protean IEF Cell (Bio-Rad) with a maximum of 35 000 Vh. A 4-20% polyacrylamide gel (11 × 16 cm) was used for the second dimension separation, which was carried out in a Hoefer SE600 electrophoresis unit (Amersham). The 2-D gel was first stained with Pro-Q glycoprotein dye (Molecular Probes, Carlsbad, CA) followed by Sypro-Ruby fluorescence dye staining. The staining procedure for these two dyes follows the protocol provided.
2.6. Protein Digestion by Trypsin. Fractions obtained from NPS-RP-HPLC were concentrated down to ∼20 µL using a SpeedVac concentrator (Thermo, Milford, MA) operating at 45 °C. A 20-µL portion of 100 mM ammonium bicarbonate (Sigma) was then mixed with each concentrated sample to obtain a pH value of ∼7.8.
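For reference, the multi-segment NPS-RP-HPLC gradient listed in Section 2.4 can be tabulated as cumulative time points. The following is a minimal sketch, not the instrument method file used in this work; the function names `gradient_breakpoints` and `percent_b_at` are hypothetical, and linear interpolation within each segment is an assumption.

```python
# (start %B, end %B, duration in min) for each segment, as listed in Section 2.4.
SEGMENTS = [
    (5, 15, 1), (15, 25, 2), (25, 30, 3), (30, 41, 15),
    (41, 47, 4), (47, 67, 5), (67, 100, 2),
]


def gradient_breakpoints(segments):
    """Return (cumulative time in min, %B) breakpoints for the gradient."""
    t = 0
    points = [(t, segments[0][0])]
    for start, end, duration in segments:
        # Each segment must begin where the previous one ended.
        if start != points[-1][1]:
            raise ValueError("segments must be contiguous in %B")
        t += duration
        points.append((t, end))
    return points


def percent_b_at(points, time_min):
    """Linearly interpolate %B at a given time (assumes linear ramps)."""
    for (t0, b0), (t1, b1) in zip(points, points[1:]):
        if t0 <= time_min <= t1:
            return b0 + (b1 - b0) * (time_min - t0) / (t1 - t0)
    raise ValueError("time outside gradient")


points = gradient_breakpoints(SEGMENTS)
total_time = points[-1][0]  # total gradient length in minutes
for t, b in points:
    print(f"{t:>3} min  {b:>3}% B")
```

Summing the segment durations gives a 32 min gradient, with the shallow 30% to 41% B ramp occupying minutes 6 to 21.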
Then 0.5 µL of TPCK-modified sequencing grade porcine trypsin (Promega, Madison, WI) was added, and the mixture was vortexed prior to a 12-16 h incubation at 37 °C on an agitator. For in-gel digestion, a gel slice was destained in 200 mM NH4HCO3 in 40% ACN and incubated at 37 °C for 30 min. After reduction and alkylation, gel pieces were dried down in a SpeedVac. A 50-µL portion of reaction solution (100 mM NH4HCO3 in 9% ACN) and 1 µL of trypsin (Promega) were added to the gel sample. After 12-16 h of incubation at 37 °C, the liquid was removed from the gel piece and transferred to a new tube.
2.7. Glycan Cleavage by PNGase F and Glycan Purification. For glycan cleavage and purification, the procedure follows that of Y. Q. Yu et al.16 The peaks collected from NPS-RP-HPLC were dried down completely and redissolved in 40 µL of 0.1% (w/v) RapiGest solution (Waters, Milford, MA) prepared in 50 mM NH4HCO3 buffer, pH 7.9, to denature the protein. Protein samples were reduced with 5 mM DTT for 45 min at 56 °C and alkylated with 15 mM iodoacetamide in the dark for 1 h at room temperature. A 2-µL portion of PNGase F (New England Biolabs, Ipswich, MA) was added to each sample and the solution was incubated for 14 h at 37 °C. The released glycans were purified prior to MALDI-MS analysis using SPE microelution plates (Waters) packed with HILIC sorbent (5 mg). Salt, protein, and detergent were removed at this step. The microelution SPE device was operated using a centrifugation device with a plate adaptor (Thermo).
2.8. Mass Spectrometry.
2.8.1. Glycan Structure Analysis. MS and MSn spectra of glycan samples were acquired on a
Figure 1. Strategy used to quantify sialylated glycoprotein differences between normal and cancer serum and characterize the glyco-isoforms and glycan structures.
Shimadzu Axima QIT MALDI quadrupole ion trap-TOF (MALDI-QIT) (Manchester, UK). Acquisition and data processing were controlled by Launch-pad software (Kratos, Manchester, UK). A pulsed N2 laser (337 nm) with a pulse rate of 5 Hz was used for ionization. Each profile results from 2 laser shots. Argon was used as the collision gas for CID and helium was used for cooling the trapped ions. The TOF was externally calibrated using 500 fmol/µL of bradykinin fragment 1-7 (757.40 m/z), angiotensin II (1046.54 m/z), P14R (1533.86 m/z), and ACTH (2465.20 m/z) (Sigma). A 25 mg/mL solution of 2,5-dihydroxybenzoic acid (DHB) (LaserBio Labs, France) was prepared in 50% ACN with 0.1% TFA. A 0.5 µL glycan sample was spotted on the stainless steel target and 0.5 µL of matrix solution was added, followed by air-drying.
2.8.2. Glycopeptide Mapping. Digested peptide mixtures from peaks c1, c2, and c′ in Figure 5 were separated by a capillary RP column (C18, 0.3 × 50 mm) (Michrom, Auburn, CA) on a Paradigm MG4 micropump (Michrom) with a flow rate of 5 µL/min. The gradient started at 5% ACN, was ramped to 60% ACN in 25 min, and was finally ramped to 90% ACN over another 5 min. Both solvent A (water) and B (ACN) contained 0.3% formic acid. The resolved peptides were detected by an ESI-TOF spectrometer (LCT Premier, Micromass/Waters, Milford, MA). The capillary voltage for electrospray was set at 3000 V and the sample cone at 75 V. Desolvation was accelerated by maintaining the desolvation temperature at 150 °C and the source temperature at 100 °C. The desolvation gas flow was 300 L/h. The data were acquired in “V” mode and the TOF was externally calibrated with sodium iodide and cesium iodide mixtures. The instrument was controlled by MassLynx 4.0 software.
2.8.3. Protein Identification. Digested peptide mixtures from NPS-RP-HPLC collection or in-gel digestion were separated in the same manner as described above. The resolved peptides were analyzed on an LTQ mass spectrometer with an ESI ion source (Thermo, San Jose, CA).
The capillary temperature was 175 °C, the spray voltage was 4.2 kV, and the capillary voltage was 30 V. The normalized collision energy was set at 35% for MS/MS. MS/MS spectra were searched using the SEQUEST algorithm incorporated in Bioworks software (Thermo) against the SwissProt human protein database. One missed cleavage was allowed during the database search. A protein identification was accepted as positive for a peptide with Xcorr greater than or equal to 3.0 for triply charged, 2.5 for doubly charged, and 1.9 for singly charged ions.
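The charge-dependent acceptance rule above can be sketched as a small filter. This is an illustration only, not the SEQUEST or Bioworks implementation; the function name `passes_xcorr`, the example records, and the choice to reject charge states not covered by the stated criteria are assumptions.

```python
# Charge-dependent Xcorr cutoffs quoted in the text (inclusive, i.e. >=).
XCORR_CUTOFFS = {1: 1.9, 2: 2.5, 3: 3.0}


def passes_xcorr(charge: int, xcorr: float) -> bool:
    """Return True if a peptide match meets the cutoff for its charge state."""
    cutoff = XCORR_CUTOFFS.get(charge)
    if cutoff is None:
        # Assumption: charge states without a stated cutoff are rejected.
        return False
    return xcorr >= cutoff


# Hypothetical peptide matches: (label, charge, Xcorr). Not data from this study.
example_matches = [
    ("pep_A", 1, 1.95),
    ("pep_B", 2, 2.40),
    ("pep_C", 2, 2.50),
    ("pep_D", 3, 3.10),
    ("pep_E", 4, 4.00),
]

accepted = [name for name, z, score in example_matches if passes_xcorr(z, score)]
print(accepted)
```

Keeping the cutoffs in a single dictionary makes it straightforward to tighten or relax the criteria when comparing identification lists.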
Figure 2. (a) UV chromatogram of three 125-µL serum depletions from three patients by the IgY antibody column to remove 12 highly abundant proteins. During the binding process, the fraction flowing through was collected as the top 12 depleted serum fraction, and the top 12 retained protein fraction was collected during elution. The absorption was monitored at 280 nm. A 20-µg portion of proteins from the top 12 depleted serum fraction (b) and 15 µg of proteins from the top 12 retained protein fraction (c) were further separated by a C18 NPS-RP column. The wavelength is 214 nm.
3. Results and Discussion
The analytical strategy used in this work is outlined in Figure 1. Glycoproteins containing sialic acid are enriched using WGA, SNA, and MAL affinity columns separately. The serum sample is depleted before the lectin extraction step for the detection of medium and low abundance proteins. The lectin-enriched fractions are fractionated by NPS-RP-HPLC and the eluting proteins are detected with UV absorption detection. Protein fractions are collected by peak, and the peaks that differ between normal and cancer samples are further separated by SDS-PAGE followed by in-gel digestion. The altered proteins are identified by peptide sequencing using µLC-MS/MS. N-glycans are
cleaved from target glycoproteins by PNGase F. The structures of the released oligosaccharides are analyzed by a hybrid ion trap TOF mass spectrometer. Glycopeptide mapping is performed using LC-ESI-TOF-MS in order to study the change in the structure of the isoforms and the extent of glycosylation in target glycoproteins in cancer serum. Three normal serum and three pancreatic cancer serum samples are analyzed in this work and reproducible results are obtained.
3.1. Analysis of Depleted Serum Sample. The serum proteome is dominated by a few highly abundant proteins which constitute about 90% of the total protein content of the serum. These proteins severely interfere with the quantification and
Figure 3. SNA(a), MAL(b), and WGA(c) selected glycoproteins from depleted normal (upper chromatogram) and pancreatic cancer (lower chromatogram) serum sample were separated by NPS-RP C18 column. The UV absorption was at 214 nm. (d) peak a and a′ were further separated by SDS-PAGE gel. Lane 1: peak a from normal serum; Lane 2: peak a′ from cancer serum; Lane 3: MW marker.
identification of proteins of lower abundance.17 The 2-D gel image stained with Pro-Q glycoprotein dye (supplementary data) suggests that most of the highly abundant proteins in serum are glycosylated and that the presence of these highly abundant glycoproteins masks the detection of glycoproteins of lower abundance. Although albumin is not a glycoprotein, it binds to other glycoproteins, so that partial binding to lectins occurs and it is stained by the glycoprotein dye. Since many important marker proteins are present at low concentration in biological samples, removing the highly abundant proteins may be a critical strategy for serum biomarker discovery. In this study, 12 highly abundant proteins (albumin, IgG, α1-antitrypsin, IgA, IgM, transferrin, haptoglobin, α1-acid glycoprotein, α2-macroglobulin, HDL (apolipoproteins A-I and A-II), and fibrinogen) are removed using an affinity column based on avian antibody (IgY)-antigen interactions. Figure 2a shows the chromatogram of the binding and washing process for 125 µL of human serum. The protein assay result indicates that around 7% of total protein is retained in the top 12 depleted serum fraction. The UV chromatograms of three depletions (Figure 2a) show that this step is very reproducible. The protein assay is performed to ensure that the same amount of depleted serum sample is subjected to reversed-phase separation in each case, which allows quantitation to be performed in this step. Approximately 20 µg of protein from the top 12 depleted serum fraction (Figure 2b) and 15 µg of protein from the top 12 retained
protein fraction (Figure 2c) are separated using a C18 NPS-RP column. It is observed that most of the 12 proteins have been effectively removed, except for some fraction of the albumin. With removal of the highly abundant proteins, the remaining proteins can be identified over a relatively high dynamic range.
To compare the sialic acid glycoprotein expression between normal and cancer serum, we used three lectins (WGA, MAL, SNA) to enrich sialic acid attached glycoproteins. These three lectins each bind different structural subclasses of these moieties. MAL can select glycoproteins containing NeuAc-Gal-GlcNAc with sialic acid at the 3 position of galactose.18 SNA binds preferentially to sialic acid attached to terminal galactose in (α-2,6) and, to a lesser degree, (α-2,3) linkage.19 WGA can interact with some glycoproteins via sialic acid residues and it also binds oligosaccharides containing terminal N-acetylglucosamine.20 Proteins bound to WGA were eluted with 0.5 M N-acetyl-glucosamine and proteins bound to SNA and MAL were eluted with 0.3 M lactose. The protein assay suggests that around 5-10% of the protein content is extracted by the lectin affinity columns. The parallel application of these three lectins gives us a complete profile of sialylated glycoconjugates with heterogeneous structures, and it could also provide information on the distribution of the sialylated glycoproteins with different substructures. The enriched glycoproteins were further separated using a nonporous reversed-phase (NPS-RP) C18 column where rapid
Figure 4. MAL(a), SNA(b), and WGA(c) selected glycoproteins from nondepleted normal (upper chromatogram) and pancreatic cancer (lower chromatogram) serum sample were separated by NPS-RP C18 column. The UV absorption was at 214 nm. (d) peak b and b′ were further separated by SDS-PAGE gel. Lane 1: MW marker; Lane 2: peak b from normal serum; Lane 3: peak b′ from cancer serum.
separation ( 247 > 83 in mature under-glycosylated forms.24 In this study, the peak in the UV chromatogram (see Figure 4) of α1-antitrypsin enriched by WGA is observed to change in shape in the pancreatic cancer serum. The two peaks of this protein appearing in normal serum are labeled as c1 and c2 and the one peak in cancer serum is labeled as c′. The result of glycopeptide mapping of c1, c2, and c′ suggests that there is a change in the glycosylation site occupancy of α1-antitrypsin in the pancreatic cancer serum. The tryptic digests of these three peaks are analyzed by µLC-ESI-TOF with the LCT mass spectrometer (Figure 5). The glycosylated peptides, which usually have a mass over 3000, were multiply charged by the ESI source and could be detected in the lower mass range (