Identification of N-Linked Glycoproteins in Human Saliva by Glycoprotein Capture and Mass Spectrometry

Prasanna Ramachandran,† Pinmannee Boontheung,† Yongming Xie,† Melissa Sondej,‡ David T. Wong,‡ and Joseph A. Loo*,†,§

Department of Chemistry and Biochemistry, UCLA School of Dentistry and UCLA Dental Research Institute, and Department of Biological Chemistry, David Geffen School of Medicine, University of California-Los Angeles, Los Angeles, California

Received December 31, 2005

Abstract: Glycoproteins make up a major and important part of the salivary proteome and play a vital role in maintaining the health of the oral cavity. Because changes in the physiological state of a person are reflected as changes in the glycoproteome composition, mapping the salivary glycoproteome will provide insights into various processes in the body. Salivary glycoproteins were identified by the hydrazide coupling and release method. In this approach, glycoproteins were coupled onto a hydrazide resin, the proteins were then digested, and formerly N-glycosylated peptides were selectively released with the enzyme PNGase F and analyzed by LC-MS/MS. Employing this method, coupled with in-solution isoelectric focusing separation as an additional means for prefractionation, we identified 84 formerly N-glycosylated peptides from 45 unique N-glycoproteins. Of these, 16 glycoproteins have not been reported previously in saliva. In addition, we identified 44 new sites of N-linked glycosylation on the proteins.

Keywords: glycoproteins • tandem mass spectrometry • salivary proteins • 2D gel electrophoresis • mucin • solution isoelectric focusing

* To whom correspondence should be addressed. Phone: (310) 794-7023. Fax: (310) 206-4038. E-mail: [email protected].
† Department of Chemistry and Biochemistry.
‡ UCLA School of Dentistry and UCLA Dental Research Institute.
§ Department of Biological Chemistry, David Geffen School of Medicine.

10.1021/pr050492k CCC: $33.50 © 2006 American Chemical Society

Introduction

Proteomics permits the qualitative and quantitative assessment of a broad spectrum of proteins that play a role in various cellular responses. The promise of proteomics to deliver information on the differential expression of proteins as it relates to human disease challenges all aspects of the experimental approach. Although technologies such as mass spectrometry (MS) and other sophisticated protein separation tools have been developed in recent years to accurately identify and quantify proteins in complex media, there is no single universal strategy to identify all proteins from a biological sample. Impressive results have been reported for the measurement of the protein components from the human plasma proteome,1 but it is difficult to assess the success of a single method in comprehensively identifying all proteins in a given sample. Significant challenges in analyzing the mammalian proteome arise from both the dynamic range of protein amounts (10^10) and the structural complexity of these proteins. As shown by many researchers, the complexity of the proteome will need reduction and simplification to maximize the dynamic range of the analyses. To improve the applicability of techniques such as two-dimensional polyacrylamide gel electrophoresis (2D-PAGE) and MS for the analysis of low-abundance proteins from complex mixtures, prefractionation is performed to reduce sample complexity.

Plasma or blood has been the accepted biofluid of choice for measuring levels of proteins and other biomolecules and ions for clinical testing. However, there may be advantages to other biomedia that can be more readily obtained for clinical sampling. Saliva is the fluid that irrigates the mouth and oral cavity. It has a rich proteome that is derived from the salivary glands, the linings of the oral cavity, oral microbes, and blood. Saliva is readily available and easier to collect than other body fluids such as blood, cerebrospinal fluid (CSF), tears, and urine. For this reason, the use of saliva for diagnostic purposes presents an attractive option. Studies have been performed in the past to use saliva for the diagnosis of diseases such as cancer, Sjögren's syndrome, infectious diseases, and HIV. Most of these studies have focused on the presence of one or a few selected proteins in saliva.2 Proteome profiling of saliva to search for patterns or biomarkers for diseases has not yet been thoroughly investigated. To fully explore the option of exploiting the salivary proteome for disease detection and monitoring disease progression, a catalog of all salivary proteins and posttranslational modifications would be beneficial.
Several laboratories have reported the identification of a number of proteins in salivary fluid. Some of the common methods used to identify salivary proteins are 1D- and 2D-gel electrophoresis and online HPLC combined with mass spectrometry. More than 500 proteins have been enumerated so far,3-10 but this number is significantly lower than the more than 4000 proteins obtained from other body fluids such as plasma.11 To extend the present salivary protein list, it will be useful to subfractionate the salivary proteome to identify rare and previously unknown proteins and posttranslational modifications. A robust strategy needs to be devised to fractionate and enrich for disease biomarkers.

Journal of Proteome Research 2006, 5, 1493-1503


Published on Web 05/11/2006

technical notes


Many strategies have been devised to subfractionate the human serum/plasma proteome. A common method uses affinity chromatography to deplete high-abundance proteins such as albumin and immunoglobulin, which account for 80% of the total protein content in plasma.12 However, plasma has many other abundant proteins that add to the overall sample complexity. A commercially available Multiple Affinity Removal System (Agilent Technologies) was developed to specifically and simultaneously remove 6 high-abundance human proteins (albumin, transferrin, haptoglobin, alpha-1-antitrypsin, IgA, and IgG, and their fragments) using immunospecific antibodies in an LC-column format. Affinity-purified polyclonal antibodies targeted to specific human proteins are covalently coupled to a column resin. Removal of these high-abundance proteins prior to separation and analysis of the plasma proteome enables deeper probing of the proteome and unmasks many low-abundance proteins. Other subfractionation methods target a subset of proteins and peptides with a particular amino acid or posttranslational modification. In the isotope-coded affinity tagging (ICAT) strategy, the reagent has a reactive group that targets cysteine residues.13 Methods such as immobilized metal affinity chromatography (IMAC), phosphotyrosine antibodies,14 and titanium dioxide columns15 are commonly used to enrich phosphorylated proteins and peptides. Glycosylation is a common posttranslational modification that is important in many cellular processes, including protein targeting, folding, and stability. A widely used technique for large-scale enrichment of glycoproteins is the use of lectin affinity columns.16 However, Zhang et al. published an elegant and more specific technique to pull down asparagine-linked (N-linked) glycoproteins.17 Their method involves oxidizing the carbohydrates on glycoproteins and coupling them onto a hydrazide resin.
N-linked glycopeptides are then released using the enzyme PNGase F, and the formerly modified peptides can be identified by mass spectrometry.17 This method of protein subfractionation has the advantage of providing information on the N-linked glycoprotein components of a complex mixture. This glycosyl component is important because it is found on both extracellular and secreted proteins. Also, several known disease biomarkers such as CA125 and CA15-3 for ovarian cancer,18,19 PSA for prostate cancer,20 and c-erbB-2 for breast cancer21 are glycoproteins. In many cases, the glycosyl group has been shown to change with the progression of cancer and is often associated with cancer metastasis.22-24 The hydrazide method of glycoprotein capture is useful in enriching for potential disease biomarkers. Importantly, this method is excellent at filtering out high-abundance proteins from serum and plasma, such as albumin, and at reducing the complexity arising from other proteins by selecting specifically for glycopeptides. Zhang and co-workers used this method to search for quantitative differences in protein expression between normal mice and mice with skin cancer. They identified 6 proteins whose expression differed in the mice with cancer.25 In addition, this method was combined with immunoaffinity subtraction by Liu and co-workers to identify more than 300 N-linked glycoproteins in human plasma.26

In this paper, we report the identification of salivary glycoproteins using the N-linked glycopeptide capture method. Saliva is known to be rich in O- and N-linked glycosylated proteins. Some of the most well-studied salivary glycoproteins are mucins, such as Muc5B and Muc7, and proline-rich glycoproteins. These proteins are thought to be responsible for the protective functions of saliva in the oral cavity.27 In mucins, the carbohydrates are mostly O-linked, and they constitute nearly 70-80% of the total weight of these proteins.28-33 Lectin probes revealed basic proline-rich proteins to be highly O-glycosylated.34,35 Many other proline-rich proteins were also found to be N-glycosylated.36-38 Salivary amylase was predicted to be primarily N-glycosylated. The sites for N-glycosylation are thought to be residues 427-429 (NGS), 364-368 (QNGKD), and 474-477 (NGNC).39 Other well-known salivary glycoproteins include the immunoglobulins, lactoferrin, lactoperoxidase, defensin,40 salivary agglutinins,41 and carbonic anhydrase.42

The large number of N- and O-linked oligosaccharides in the mucin glycoproteins confers their extended conformation, hydrophilicity, and viscosity.43 These properties help mucins adhere to teeth and protect teeth from microbes, chemicals, and mechanical wear and tear. Muc5B does not attach to many kinds of cells and common strains of oral bacteria. The glycosylation on Muc5B is responsible for its cell attachment inhibiting property.44 This might explain how it protects surfaces in the oral cavity from attack by microbes. On the other hand, mucins such as Muc7 readily attach to and aggregate oral bacteria such as S. mutans.41,45 Muc5B and salivary agglutinin adhere to bacteria such as H. pylori, which infects the gastric mucosa, and are responsible for clearance of this bacterium from the oral cavity.41 Immunoglobulins are thought to prevent microbial colonization in the oral cavity by binding and blocking structures that help microbes adhere to surfaces. Other salivary glycoproteins such as proline-rich glycoprotein, lysozyme, lactoferrin, salivary agglutinin, histatins, and defensins bind and kill bacteria.46 Using the hydrazide glycoprotein capture method, we simplify and enrich for the N-glycoproteome of saliva.

Experimental Section

Chemicals. The chemicals were purchased from Sigma (St. Louis, MO) unless stated otherwise. Affigel Hz Hydrazide gel, Coupling Buffer, and dithiothreitol (DTT) were obtained from Bio-Rad (Hercules, CA). Trifluoroacetic acid (TFA) was obtained from Pierce (Rockford, IL). Glycerol-free PNGase F, Glycoprotein Denaturing Buffer, 10% NP-40 solution, and G7 Reaction Buffer were obtained from New England BioLabs (Ipswich, MA). O-Glycanase was purchased from Prozyme (San Leandro, CA). Sequencing-grade trypsin was procured from Promega (Madison, WI).

Saliva Collection. Whole saliva was collected from healthy nonsmoking adults in the morning, at least 2 h after the last intake of food. The mouth was rinsed with water immediately before the collection. Saliva was collected and placed on ice. Protease Cocktail Inhibitor (1 µL/mL of whole saliva) was added to saliva immediately after collection to minimize protein degradation. Whole saliva was then centrifuged at 12 000 rpm at 4 °C for 10 min. The supernatant was collected and stored at -80 °C. The pellet was discarded.

Enzymatic Deglycosylation and Separation of Salivary Proteins. Whole saliva (48 µL) was mixed with 1X Glycoprotein Denaturing Buffer and boiled at 100 °C for 10 min. The solution was then mixed with 1X G7 Reaction Buffer and 1% NP-40. PNGase F and O-Glycanase were added to the test samples, and the reaction was allowed to proceed at 37 °C overnight. In the control sample, the enzymes were replaced by the same volume of water, and the solution was incubated at 37 °C overnight alongside the test sample. Control and test samples were dialyzed against milliQ water at 4 °C to remove salts and detergents from the samples. The dialyzed samples were dried, resuspended in sample buffer, and analyzed by either 1D- or 2D-gel electrophoresis. The 1D separation was carried out using Nu-PAGE 4-12% Bis-Tris gels (Invitrogen, Carlsbad, CA). In the 2D separation, the first-dimension IEF separation was performed using 11 cm Readystrip IPG strips (pH 3-10 NL) (Bio-Rad) on a Protean IEF cell (Bio-Rad). Proteins were further separated in the second dimension by Criterion Tris-HCl 8-16% precast gels (Bio-Rad) on a Protean Plus Dodeca Cell (Bio-Rad). Approximately 250 µg of protein was loaded onto the IPG strips for the 2D-PAGE analysis. The 1D- and 2D-gels were stained with Sypro Ruby protein stain (Bio-Rad) and Pro-Q Emerald 300 Glycoprotein Stain (Molecular Probes, Eugene, OR). The gels were imaged using the PDQuest image analysis software (Bio-Rad). Proteins of interest were excised by a spot-excision robot (Proteome Works, Bio-Rad), trypsin digested in-gel, and analyzed by liquid chromatography-tandem mass spectrometry (LC-MS/MS) with electrospray ionization (ESI).

Solution IEF Fractionation. Proteins in whole saliva were precipitated by mixing with four times the volume of 100% cold ethanol and then incubated overnight at -20 °C. The mixture was centrifuged at 13 000 rpm for 15 min at 4 °C. The pellet was resuspended in lysis buffer (Zoom 2D protein solubilizer; Invitrogen), Complete Protease Inhibitor (Roche Diagnostic Corporation, Indianapolis, IN), Tris base, DTT, and water and sonicated on ice. The pH of the lysate was adjusted to pH 8.5-8.7 with 1 M Tris base, and the lysate was then incubated for 15 min at room temperature with shaking. Sample lysate was alkylated for 30 min with 99% dimethylacrylamide (DMA) at room temperature. To quench any excess DMA, DTT was added and the sample was incubated for 5 min at room temperature. After centrifuging the sample for 30 min at 13 400 rpm at 4 °C, the supernatant was collected.
The protein concentration was determined by the Non-Interfering Protein Assay (Geno Technology, St. Louis, MO) to be approximately 1.5 mg/mL. Protein lysate (1.5 mg/mL, 400 µL) was diluted to a final concentration of 0.6 mg/mL in dilution buffer consisting of Zoom IEF denaturant, Zoom focusing buffer pH 3-7 (Invitrogen), Zoom focusing buffer pH 7-12, and 5 µL of 2 M DTT. Solution IEF separation with a Zoom IEF Fractionator (Invitrogen) was performed in the standard format (pH 3.0 to pH 10). Diluted sample was loaded into each of the five chambers of the fractionator. Five fractions (pI 3-4.6, 4.6-5.4, 5.4-6.2, 6.2-7.0, and 7.0-10.0) were obtained after fractionation. Proteins from each fraction were precipitated by mixing with 70% acetone, incubating at -20 °C for 3-4 h, and centrifuging at 13 000 rpm for 30 min.

Glycoprotein Pulldown. Proteins from saliva were precipitated using the ethanol method described above. The precipitated proteins were resuspended in coupling buffer (pH 5.5). Sodium periodate was added to a final concentration of 15 mM. The solution was incubated in the dark for 1 h at room temperature. Glycerol was added to a final concentration of 20 mM to quench any excess sodium periodate remaining in the solution. The mixture was incubated for 15 min with mixing at room temperature. To remove any remaining sodium periodate, the solution was dialyzed against 1X coupling buffer (overnight at 4 °C) using a 3.5 kDa dialysis cassette (Pierce). The hydrazide resin was equilibrated by washing with 3 volumes of distilled water and 6 volumes of coupling buffer. The proteins were added to the resin and coupled overnight by incubating at room temperature with shaking. The gel was then allowed to settle, and the supernatant containing uncoupled nonglycoproteins was removed by pipetting. The resin was washed six times with urea buffer A (8 M urea, 200 mM Tris, 0.05% SDS, 5 mM EDTA, pH 8.3). The proteins on the resin were reduced with a solution of 10 mM TCEP in urea buffer A. The reduced proteins were alkylated with 50 mM iodoacetamide in urea buffer A. The resin was then washed 6 times with urea buffer B (1 M urea, 25 mM Tris, pH 8.3). The resin was resuspended in urea buffer B, trypsin was added to the solution, and the proteins attached to the resin were digested overnight at 37 °C with shaking. The nonglycopeptides released by trypsin digestion were removed by washing 3 times with 1.5 M NaCl, 80% acetonitrile (ACN)/0.1% TFA (aq), methanol, and water, and six times with 100 mM ammonium bicarbonate. The resin was then resuspended in 100 mM ammonium bicarbonate. The formerly N-glycosylated peptides were released from the resin-bound carbohydrates by adding PNGase F to the resin and incubating overnight at 37 °C. The resin was washed twice with 80% ACN. The washes were pooled, and the released peptides were dried in a vacuum centrifuge. The peptides were then resuspended in 0.1% TFA and analyzed by LC-MS/MS.

Liquid Chromatography-Mass Spectrometry. LC-MS/MS of peptide mixtures was performed on an Applied Biosystems (Foster City, CA) QSTAR Pulsar XL (QqTOF) mass spectrometer equipped with a nanoelectrospray interface (Protana, Odense, Denmark) and an LC Packings (Sunnyvale, CA) nano-LC system. The nano-LC was equipped with a homemade precolumn (150 µm × 3 mm) and analytical column (75 µm × 150 mm) packed with Jupiter Proteo C12 resin (particle size 4 µm, Phenomenex, Torrance, CA). The released peptides were dried and dissolved in 0.1% formic acid (FA) solution. For each LC-MS/MS run, typically 6 µL of sample solution was loaded onto the precolumn. The precolumn was washed with the loading solvent (0.1% FA) for 4 min before the sample was injected onto the LC column.
The eluents used for the liquid chromatography were 0.1% FA (aq) (solvent A) and 95% ACN containing 0.1% FA (solvent B). The flow was 200 nL/min, and the following gradient was used: 3% B to 35% B in 72 min, 35% B to 80% B in 18 min, and maintained at 80% B for the final 9 min. The column was finally equilibrated with 3% B for 15 min prior to the next run. For online LC-MS/MS analyses, a Proxeon (Odense, Denmark) nano-bore stainless steel online emitter (30 µm i.d.) was used for spraying with the voltage set at 1900 V. Peptide product ion spectra were recorded automatically during the LC-MS/MS runs by the information-dependent analysis (IDA) on the mass spectrometer. Argon was employed as the collision gas. Collision energies for maximum fragmentation efficiencies were calculated using empirical parameters based on the charge and mass-to-charge ratio of the peptide precursor ion.

Protein identification was accomplished utilizing the Mascot database search engine (Matrix Science, London, UK). All searches were performed against the Human IPI database. For the protein sequence searches, the following variable modifications were set: carbamidomethylation of cysteine, cyclization of N-terminal carbamoylmethyl cysteine, oxidation of methionine, conversion of asparagine to aspartic acid at the site of carbohydrate attachment, and cyclization of N-terminal glutamine to pyro-Glu. For saliva samples prefractionated by in-solution IEF fractionation, DMA modification of cysteine was also added to the variable modification list. In all searches, one missed tryptic cleavage was tolerated and a mass tolerance of 0.3 Da was set for the precursor and product ions. A Mascot score of greater than 25 was considered a significant match. However, even for peptides with a Mascot score greater than 25, the MS/MS spectrum of each peptide was manually examined to verify the accuracy of the identification. The validity of a formerly N-glycosylated peptide was confirmed by the presence of a consensus N-X-(S/T) sequence and deamidation of the asparagine residue.

Results and Discussion

Glycoproteins in Saliva. The primary aim of this study was to identify asparagine-linked glycoproteins in human whole saliva by employing the hydrazide capture and release method. To investigate the relative abundance of glycoproteins in saliva and address whether the glycoproteins are primarily in the N- or O-linked glycosylated form, one-dimensional SDS-PAGE of saliva with and without PNGase F/O-Glycanase treatment was compared (Figure 1). PNGase F cleaves the bond between the innermost sugar moiety and asparagine residues on N-linked glycoproteins. O-Glycanase breaks the bond between the sugar and serine/threonine residues on O-linked glycoproteins. Treatment of salivary proteins with PNGase F significantly reduced glycosylation compared to untreated whole saliva, as suggested by the staining with the carbohydrate-specific stain Pro-Q Emerald47 (Figure 1A, lane 3). This indicates the presence of a moderate amount of N-glycosylated proteins. Treatment of the salivary proteins with both PNGase F and O-Glycanase slightly reduced the level of total glycosylation relative to PNGase F alone (Figure 1A, lane 4). However, many bands still appeared with the Pro-Q Emerald stain. The combined treatment therefore failed to remove all glycosylation, which may be expected because O-Glycanase is not as efficient as PNGase F in removing glycosylation from glycoproteins. Restaining the Pro-Q Emerald stained gel with Sypro Ruby yielded protein bands with reduced molecular mass, consistent with the deglycosylation process (Figure 1B, lanes 2-4).

Figure 1. SDS-PAGE of salivary proteins after treatment with PNGase F and O-Glycanase. Gels were stained with (A) Pro-Q Emerald glycoprotein stain and (B) Sypro Ruby. (Lane 1) Candy Cane MW standards; (Lane 2) untreated whole saliva; (Lane 3) PNGase F treated saliva; and (Lane 4) PNGase F and O-Glycanase treated saliva.

To further explore the presence of glycoproteins in salivary fluid, two-dimensional gel electrophoresis of whole saliva followed by staining with Pro-Q Emerald (Figure 2) was performed. From our previous work that reported the identification of salivary proteins from 2D-PAGE,3-10 many of the proteins stained by Pro-Q Emerald can be identified, such as polymeric-immunoglobulin receptor, alpha amylase, prolactin inducible protein, carbonate dehydratase, and zinc-alpha-2-glycoprotein, and many of these proteins are known to be highly glycosylated. Treatment of the salivary proteome with the enzyme PNGase F confirmed the Pro-Q Emerald result (Figure 3). Deglycosylated proteins showed a general pattern of migration toward the acidic end of the gel, and they also tended to migrate to a lower molecular mass; deglycosylation of N-linked glycoproteins by PNGase F converts asparagine to aspartic acid, reducing the pI of the protein. Some of the spots indicated by the labels in Figure 3 from the 2D-gel of deglycosylated salivary proteins were excised, digested in-gel with trypsin, and analyzed by LC-MS/MS. Their identifications confirmed that these are suspected glycoproteins, such as transferrin, amylase, prolactin inducible protein, zinc-alpha-2-glycoprotein, and polymeric immunoglobulin receptor precursor. Overall, these results indicated the presence of a moderate level of N-glycosylated salivary proteins and also suggested that the hydrazide coupling method may be an effective technique to subfractionate the salivary proteome.

Figure 2. 2D-PAGE (pH 3-10 shown) of whole saliva stained with (A) Sypro Ruby and (B) Pro-Q Emerald glycoprotein stain. The glycoproteins marked in gel B are identified as follows: (1) polymeric immunoglobulin receptor precursor, (2) salivary alpha amylase, (3) zinc-alpha-2-glycoprotein, (4) carbonate dehydratase VI precursor, and (5) prolactin inducible protein.

Figure 3. 2D-PAGE of (A) whole saliva and (B) PNGase F/O-Glycanase treated whole saliva. Labeled protein spots identified by trypsin digestion/LC-MS/MS include the following: (1) serotransferrin precursor (TRFE_HUMAN), (2) polymeric immunoglobulin receptor precursor (PIGR_HUMAN), (3 and 4) salivary alpha amylase precursor (AMYS_HUMAN), (5) zinc-alpha-2-glycoprotein precursor (ZA2G_HUMAN), and (6) prolactin inducible protein precursor (PIP_HUMAN).

N-Linked Glycopeptide Enrichment. Prior to being subjected to the hydrazide coupling technique, proteins in saliva were isolated by two different methods. In the first method, proteins were ethanol precipitated. A second approach, involving fractionating proteins by their isoelectric point, was used to further reduce sample complexity. This strategy ensures that only proteins focusing within the targeted pH range are loaded onto narrow-range immobilized pH gradient (IPG) gels. Prefractionating by pI benefits 2D-PAGE and subsequent mass spectrometric analysis by accepting larger protein loads without degrading resolution or spot count; most proteins likely to precipitate near the electrodes are eliminated, and coprecipitation of desired proteins is thus minimized.48 Multichamber devices based on immobilized pH gradient membranes have been fabricated commercially, and the newer, smaller devices show great promise for prefractionation. We investigated this method specifically for this project by using a commercial version of this device (Zoom-IEF Fractionator). The fractionator partitions the complex mixture into up to 7 separate subsamples on the basis of their pI. Each fraction is collected and analyzed separately by 2D-PAGE and/or LC-MS/MS. The advantage of this approach is that many more proteins can be resolved and observed, especially the lower abundance proteins, than without prefractionation. Thus, the salivary proteins were preseparated by in-solution isoelectric focusing into five separate pH fractions: 3.0-4.6, 4.6-5.4, 5.4-6.2, 6.2-7.0, and 7.0-10.0. The proteins in each Zoom fraction were then acetone precipitated.

The proteins obtained by either method were resuspended in coupling buffer (pH 5.5) and subjected to the following steps: (a) oxidation of carbohydrates on glycoproteins with sodium periodate, (b) quenching the excess sodium periodate in the solution with glycerol and removing any remaining sodium periodate by dialysis overnight against coupling buffer at 4 °C, (c) coupling the glycoprotein via the sugar to the hydrazide-agarose resin to form a stable covalent hydrazone bond, (d) washing off the nonglycoproteins, (e) digesting the glycoproteins attached to the hydrazide resin with trypsin, (f) washing off nonglycopeptides, (g) eluting the formerly N-glycosylated peptides with the enzyme PNGase F, and (h) analyzing the formerly glycosylated peptides by LC-MS/MS.

N-linked glycosylation occurs on asparagine residues that fall into a consensus N-X-(S/T) sequence motif, where X represents any amino acid residue except proline.49 N-deglycosylation with PNGase F converts asparagine to aspartic acid with a mass increase of 1 Da.50 In our study, the authenticity of an identified N-linked glycopeptide was based on the presence of the N-X-(S/T) sequence on the peptide and deamidation of the asparagine residue. The accuracy of the peptide identifications was validated by manually inspecting each MS/MS spectrum. Our glycoprotein capture/release procedure is a slight modification of the procedure published by Aebersold and co-workers.17 The main difference is the inclusion of an additional wash step prior to trypsin digestion to remove nonglycoproteins bound nonspecifically to the agarose resin. Although our procedure led to an overall reduction in nonspecific binding as compared to the published hydrazide capture protocol, some nonglycopeptides were identified as well (see Table 1).

Identification of Salivary N-Linked Glycoproteins. The peptides isolated were desalted and analyzed by LC-MS/MS. Figure 4A shows a representative MS/MS spectrum of a deglycosylated glycopeptide [(M+2H)2+ at m/z 593.77] from neutrophil gelatinase associated lipocalin.
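The roughly 1 Da increase that accompanies conversion of asparagine to aspartic acid can be illustrated with a short calculation. The following is a minimal sketch, not the search engine's implementation: the function name `precursor_mz` and the table name `RESIDUE` are hypothetical, the masses are standard monoisotopic residue masses, and cysteine modifications are ignored for simplicity. The example peptide is one of the polymeric immunoglobulin receptor peptides listed in Table 1.

```python
# Standard monoisotopic residue masses in Da (unmodified residues only).
RESIDUE = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # mass of H2O added for the peptide termini
PROTON = 1.00728   # mass of a proton


def precursor_mz(sequence, charge, deamidated_sites=()):
    """Return the m/z of a peptide ion, optionally with Asn -> Asp conversions.

    deamidated_sites holds 0-based positions of asparagines that are treated
    as aspartic acid (the product of PNGase F deglycosylation).
    """
    residues = list(sequence)
    for i in deamidated_sites:
        if residues[i] != "N":
            raise ValueError(f"position {i} is {residues[i]}, not N")
        residues[i] = "D"
    neutral_mass = sum(RESIDUE[r] for r in residues) + WATER
    return (neutral_mass + charge * PROTON) / charge


if __name__ == "__main__":
    peptide = "VPGNVTAVLGETLK"  # N at index 3 sits in the sequon N-V-T
    native = precursor_mz(peptide, 2)
    converted = precursor_mz(peptide, 2, deamidated_sites=[3])
    print(f"shift in m/z for the 2+ ion: {converted - native:.4f}")
```

For a doubly charged ion the shift in m/z is half of the mass difference between the Asp and Asn residues, which is why a search tolerance well under 0.5 Da is needed to distinguish the two forms at this charge state.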
The b- and y-series of product ions clearly show the appropriate mass shift indicative of conversion of asparagine to aspartic acid at the original site of N-glycosylation. For example, the mass difference of 115 Da for aspartic acid is found for both the b3-b2 and y8-y7 product ion pairs at residue 3. Figure 4B shows the MS/MS spectrum of a peptide [(M+2H)2+ at m/z 688.94] from bactericidal/permeability increasing protein like precursor. Again, a shift of 115 Da between the y6 and y7 product ions indicates the presence of aspartic acid instead of asparagine, as a consequence of the enzymatic deglycosylation process. The y10-y9 pair yields a mass difference of 114 Da, indicating that the Asn at position 4 was not N-glycosylated. A total of 84 former glycopeptides from 45 salivary glycoproteins were identified by this method. Of these proteins, 16 are novel salivary proteins that had not been reported in our earlier work5 (see Table 1). These include low-abundance proteins that are associated with cancer development and progression, e.g., carcinoembryonic antigen-related cell adhesion molecule 5, 6, and 8, and thrombospondin. Some of the abundant proteins in saliva include amylase, transferrin, albumin, immunoglobulin, polymeric-immunoglobulin-receptor precursor, prolactin-inducible protein precursor, mucin, and zinc-alpha-2-glycoprotein precursor. Among the 84 formerly N-linked glycopeptides identified, 9 peptides originated from Mucin 5B precursor. This is not surprising given its large molecular weight (590 kDa) and because Muc5B is known to be highly N- and O-glycosylated. Mucin appears to be the most abundant glycoprotein in saliva, as it has 32 potential sites of N-linked glycosylation. Therefore, removing mucins prior to the hydrazide capture method may help reduce sample complexity. This will be explored in future studies. Polymeric-immunoglobulin-receptor precursor and zinc-alpha-2-glyco


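Table 1. Glycoproteins Identified by the Glycoprotein Capture and Release Methodp

IPI acc. no.: IPI00004573
protein name: Polymeric immunoglobulin receptor precursor
MW (Mr kDa): 83.262; pI: 5.58
potential N-gly sites: 7; literature reported N-gly sites: 7; separation method: IS-IEF & EP
formerly glycosylated peptides (Mascot score), formerly glycosylated peptide sequence:
  AN*LTNFPEN*GTFVVNIAQLSQDDSGRYK (70), 82-109
  QpgIGLYPVLVIDSSGYVNPN*YTGR (95), 168-190
  LSLLEEPGN*GTFTVILNQLTSR (60), 413-434
  IIEGEPNLKVPGN*VTAVLGETLK (49), 457-479
  VPGN*VTAVLGETLK (101), 466-479
  WN*NTGCcQALPSQDEGPSK (72), 498-515
non-formerly glycosylated peptides: ADAAPDEKVLDSGFR, AFVNCcDENSR, CnGLGINSR, DVSLAKADAAPDEK, GGCITLISSEGYVSSK, GSVTFHCALGPEVANVAK, ILLNPQDK, LSDAGQYLCQAGDDSNSNKK, LVSLTLNLVTR, NADLQVLKPEPELVYEDLR, NGFPLPLAR, QSSGENCDVVVNTLGK, TVTINCPFK, TVTINCPFKTENAQK, YKCGLGINSR, YWCLWEGAQNGR

IPI acc. no.: IPI00007244a
protein name: Myeloperoxidase precursor
MW (Mr kDa): 83.815; pI: 9.19
potential N-gly sites: 6; literature reported N-gly sites: 3; separation method: EP
formerly glycosylated peptides (Mascot score), formerly glycosylated peptide sequence:
  SCcPACcPGSN*ITIRb (46), 315-327
  SYN*DSVDPR (41), 481-489
non-formerly glycosylated peptides: CpcPLLVDSEGWVK, GVAGSSVAVLCcPYNR, IANVFTNAFR, QALAQISLPR

IPI acc. no.: IPI00012165
protein name: Mucin 5B precursor
MW (Mr kDa): 590.122; pI: 6.24
potential N-gly sites: 32; literature reported N-gly sites: 0; separation method: EP & IS-IEF
formerly glycosylated peptides (Mascot score), formerly glycosylated peptide sequence:
  AQGLVLEASN*GSVLINGQRb (118), 136-154
  LDGPTEQCcPDPLPLPAGN*CcTDEEGICHRb (65), 237-264
  DIECnQAESFPN*WTLAQVGQKb (82), 1549-1568
  AFGQFFSPGEVIYN*KTDRb (49), 4891-4909
  QpgVN*ETWTLEN*CnTVARb (47), 4962-4974
  VVLLDPKPVAN*VTCcVNKb (60), 4981-4997
  GN*CcTYVLMoRb (32), 5039-5047
  FGN*LSLYLDNHYCcTASATAAAARb (49), 5054-5076
  LPYSLFHN*NTEGQCc -GTCcTNNQRb (32), 5153-5172
non-formerly glycosylated peptides: AAGGAVCcEQPLGLECcR, AAYEDFNVQLR, AENYPEVSIDQVGQVLTCcSLETGLTCcK, AQAQPGVPLGELGQVVECcSLDFGLVCcR, AQAQPGVPLR, AVTLSLDGGDTAIR, CcADSSFTVLAELR, CpcGLTDNENCcLK, CcPTCcPCcATFVEYSR, ELGQVVECcSLDFGLVCcR, FKMoCcFNYEIR, GATGGLCcDLTCcPPTK, GPGGDPPYKIR, GRLEVPCcQSLEAYAELCcR, GVQLSDWR, IVTENIPCcGTTGTTCcSK, LEVPCcQSLEAYAELCcR, LFVESYELILQEGTFK, LSCcLGASLQK, LSPSCcPDALAPK, LTDPNSAFSR, LYDLHGDCSYVLSK, MCcFNYEIR, NPSGHCLVDLPGLEGCYPK, NSFEDPCcSLSVENENYAR, NWEQEGVFK, QdCcSILHGPTFAACcR, RGLVGSRPVVTR, SMoDIVLTVTMVHGK, SVVGDALEFGNSWK, TFDGDVFR, TGLLVEQSGDYIK, TWLVPDSR, VCcGLCcGNFDDNAINDFATR, VCcSTWGDFHYK, VHCcDVHFGLVCcR, YAYVVDACcQPTCcR

IPI acc. no.: IPI00013972a
protein name: Carcinoembryonic antigen related cell adhesion molecule 8 precursor
MW (Mr kDa): 38.13; pI: 6.95
potential N-gly sites: 11; literature reported N-gly sites: 0; separation method: IS-IEF
formerly glycosylated peptides (Mascot score), formerly glycosylated peptide sequence:
  LFIPN*ITTKb (47), 284-292
non-formerly glycosylated peptides: none

IPI acc. no.: IPI00019943a
protein name: Afamin precursor
MW (Mr kDa): 69.024; pI: 5.64
potential N-gly sites: 4; literature reported N-gly sites: 1; separation method: EP
formerly glycosylated peptides (Mascot score), formerly glycosylated peptide sequence:
  DIENFN*STQKb (41), 28-37
non-formerly glycosylated peptides: none

IPI acc. no.: IPI00020091a
protein name: Alpha-1-acid glycoprotein 2 precursor
MW (Mr kDa): 23.588; pI: 5.03
potential N-gly sites: 5; literature reported N-gly sites: 5; separation method: IS-IEF
formerly glycosylated peptides (Mascot score), formerly glycosylated peptide sequence:
  QpgNQCnFYNSSYLNVQR (90), 87-101
non-formerly glycosylated peptides: none

IPI acc. no.: IPI00020487a
protein name: Extracellular glycoprotein lacritin precursor
MW (Mr kDa): 14.237; pI: 5.43
potential N-gly sites: 1; literature reported N-gly sites: 0; separation method: EP
formerly glycosylated peptides (Mascot score), formerly glycosylated peptide sequence:
  QpgFIEN*GSEFAQKb (32), 115-126
non-formerly glycosylated peptides: SILLTEQALAK

IPI acc. no.: IPI00022417
protein name: Leucine-rich alpha-2-glycoprotein precursor
MW (Mr kDa): 38.944; pI: 7.88
potential N-gly sites: 5; literature reported N-gly sites: 5; separation method: EP
formerly glycosylated peptides (Mascot score), formerly glycosylated peptide sequence:
  KLPPGLLAN*FTLLR (62), 178-191
non-formerly glycosylated peptides: VAAGAFQGLR

IPI acc. no.: IPI00022429a
protein name: Alpha-1-acid glycoprotein 1 precursor
MW (Mr kDa): 24.770; pI: 4.93
potential N-gly sites: 5; literature reported N-gly sites: 5; separation method: IS-IEF
formerly glycosylated peptides (Mascot score), formerly glycosylated peptide sequence:
  QDQCcIYN*TTYLNVQR (105), 87-101
non-formerly glycosylated peptides: none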

Table 1. (Continued)

Columns: IPI acc. no.; protein name; MW (Mr, kDa); pI; formerly glycosylated peptides (Mascot score); formerly glycosylated peptide sequence (residue range); non-formerly glycosylated peptides; potential N-gly sites; literature-reported N-gly sites; separation method.

IPI00022431 Alpha-2-HSglycoprotein precursor IPI00022463 Serotransferrin precursor

39.300 5.43 VCcQDCcPLLAPLN*DTR (40) AALAAFNAQNN*GSNFQLEEISR (64) 77.00 6.81 QQQHLFGSN*VTDCcSGNFCcLFR (62)

145-159 166-187

CcNLLAEK

2

2

EP & IS-IEF

622-642

2

2

EP

IPI00022488 Hemopexin precursor IPI00022974 Prolactin inducible protein precursor

51.643 6.55 SWPAVGN*CcSSALR (47) ALPQPQN*VTSLLGCcTH (56) 16.562 8.26 TFYWDFYTN*R (50)

181-193 447-462 97-106

SVIPSDGPSVACcVK ASYLDCcIR CpcLKDGAGDVAFVK MoYLGYEYVTAIR TAGWNIPMoGLLYNK FDEFFSEGCcAPGSK CpcLVEKGDVAFVK YLGEEYVK CpcSTSSLLEACcTFR NFPSPVDAAFR

5

5

EP

1

1

EP & IS-IEF

IPI00023673 Galectin-3 binding protein precursor

65.289 5.13 ALGFEN*ATQALGR (83) YKGLN*LTEDTYKPR (50) AAIPSALDTN*SSK (64) TVIRPFYLTN*SSGVD (57) 80.237 8.89 IVGYLNEEGVLDQN*Rb(77)

64-76 394-407 542-554 571-585 199-213

ELGICcPDDAAVIPIKNNR FYTIEILK FYTIEILKVE IIIKNFDIPK YTACcLCcDDNPK IYTSPTWSAFVTD SSWSAR TLQALEFHTVPFQLLAR

7

7

EP & IS-IEF

AGFVCcPTPPYK FGHLEVPSSMFR IHGFDLAAINTQR ISNVFTFAFR NGQVWEESLKR VPCcFLAGDSR VQVNKAFLDSR NGFPLPLAR none

6

3

EP

3

0

EP

5

0

IS-IEF & EP

IPI00025023 Lactoperoxidase precursor

KPSPCcEFIN*TTARb (34)

IPI00025753 Desmoglein 1 precursor IPI00025846 Splice isoform 1 of Desmocollin-2 precursor a IPI00027412 Carcinoembryonic antigen-related cell adhesion molecule 6 precursor a IPI00027486 Carcinoembryonic antigen-related cell adhesion molecule 5 precursor IPI00031019a Cystatin-related epididymal spermatogenic protein precursor IPI00031547 Desmoglein 3 precursor IPI00032258a Complement C4 precursor IPI00032292 Metalloproteinase inhibitor 1 precursor a IPI00060143 Protein FAM3D precursor IPI00166729 Zinc alpha-2glycoprotein precursor

IPI00171411 Golgi membrane protein GP73 IPI00178926 Immunoglobulin J chain

113.644 4.9 TGEIN*ITSIVDRb(36) 99.899 5.19

AN*YTILKb

350-363

106-117

(49) AN*YTILKGNENGNFKb (26) LKAIN*DTAARb (31) 37.214 5.56 LQLSNGN*MoTLTLLSVKb (63)

391-397 391-405 625-634 191-206

VTVEDKDLVNTANWR none

12

0

IS-IEF

76.748 5.43 TLTLFN*VTRb (42)

272-280

none

28

0

EP

34-46

none

2

0

EP

107.44 4.86 DSTFIVN*Kb (45) 453-460 none LPAVWSITTLN*ATSALLRb (39) 535-552 b 192.65 6.65 GLN*VTLSSTGR (59) 1326-1336 none

4

0

EP & IS-IEF

16.265 9.05 KLKPVN*ASNANVKb (43)

23.156 8.46 FVGTPEVN*QTTLYQR (88) SHN*RSEEFLIAGK (29) 24.947 9.41 GLNIALVN*GTTGAVLGQK (56) 34.223 5.71 DIVEYYN*DSN*GSHVLQGRb (91)

46-60 99-101 100-117 100-117

2

2

EP & IS-IEF

LQSGTHCcLWTDQLLQGSEK none

1

1

EP & IS-IEF

3

EP & IS-IEF

3

0

EP & IS-IEF

70-80

CpcYTAVVPLVYGGETK

1

1

EP & IS-IEF

62-80

FVYHLSDLCcK IIRSSEDPNEDIVER IIVPLNNR QpgEDERIVLVDNK SSEDPNEDIVER

FGCcEIENN*RSSGAFWK (58)

118-133

IIVPLNNREN*ISDPTSPLR (56)

IS-IEF

4

118-126

18.087 5.12 EN*ISDPTSPLR (55)

2

AGEVQEPELR AKAYLEEECcPATLR AYLEEECPATLR CcLAYDFYPGKIDVHWTR EIPAWVPFDPAAQITK HVEDVPAFQALGSLNDLQFFR LKCLAYDFYPGK QDPPSVVVTSHQAPGEK QKWEAEPVYVQR QpgVEGMEDWKQDSQLQK WEAEPVYVQR YSLTYIYTGLSK YYYDGKDYIEFNK DTINLLDQR

FGCcEIENN*R (57)

49.768 5.04 AVLVNN*ITTGERb (80)

4

EPGLCcTWQSLR

113-124

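Table 1. (Continued)

IPI00242956 Fc fragment of IgG binding protein
  MW (Mr, kDa): 571.718; pI: 5.14
  Formerly glycosylated peptides (Mascot score), sequence position:
    VITVQVAN*FTLRb (58), 2511-2522
    YLPVN*SSLLTSDCcSERb (98), 5182-5197
  Non-formerly glycosylated peptides: AIGYATAADCcGR, GCcVLDVCcMGGGDHDILCcK, LTYNHGGITGSR, NQNRGNPAVSYVR, YQKEEFCGLLSSPTGPLSSCHK, CpcLANGGIHYITLDGR, FAVLQENVAWGNGR, GATTSPGVYELSSR, GNPAVSYVR, VNGVLTALPVSVADGR, VSYVGLVTVR
  Potential N-gly sites: 4; literature-reported N-gly sites: 0; separation method: EP

IPI00291488 Splice isoform of WAP four-disulfide core domain protein 2 precursor
  MW (Mr, kDa): 12.984; pI: 4.69
  Formerly glycosylated peptides (Mascot score), sequence position:
    TGVCcPELQADQN*CcTQECcVSDSECcADNLKb (94), 33-60
  Non-formerly glycosylated peptides: none
  Potential N-gly sites: 1; literature-reported N-gly sites: 0; separation method: EP & IS-IEF

IPI00296099a Thrombospondin 1 precursor
  MW (Mr, kDa): 129.33; pI: 4.71
  Formerly glycosylated peptides (Mascot score), sequence position:
    VVN*STTGPGEHLRb (44), 1065-1077
  Non-formerly glycosylated peptides: none
  Potential N-gly sites: 4; literature-reported N-gly sites: 1; separation method: EP

IPI00296654 Bactericidal/permeability-increasing protein-like 1 precursor
  MW (Mr, kDa): 49.1; pI: 8.82
  Formerly glycosylated peptides (Mascot score), sequence position:
    LLAAAN*FTFKb (64), 91-100
    SDDNLLN*TSALGRb (81), 287-299
    LGATPVAMoLHTNN*ATLRb (39), 320-336
  Non-formerly glycosylated peptides: AALSYVSEIGKAPLQR, AGALNLDITGQLR, APEPLELTLPVELLADTR, FIAGFGVR, LIPEVAR
  Potential N-gly sites: 4; literature-reported N-gly sites: 0; separation method: EP & IS-IEF

IPI00298082a Calcium activated chloride channel protein 2
  MW (Mr, kDa): 101.283; pI: 5.31
  Formerly glycosylated peptides (Mascot score), sequence position:
    DSFDDALQVN*TTDLSPKb (38), 802-818
  Non-formerly glycosylated peptides: none
  Potential N-gly sites: 12; literature-reported N-gly sites: 0; separation method: IS-IEF

IPI00298828 Beta-2-glycoprotein I precursor
  MW (Mr, kDa): 38.273; pI: 8.34
  Formerly glycosylated peptides (Mascot score), sequence position:
    VYKPSAGN*NSLYR (40), 155-167
    LGN*WSAMPSCK (28), 251-261
  Non-formerly glycosylated peptides: none
  Potential N-gly sites: 4; literature-reported N-gly sites: 4; separation method: EP

IPI00299547 Neutrophil gelatinase associated lipocalin precursor
  MW (Mr, kDa): 22.774; pI: 8.66
  Formerly glycosylated peptides (Mascot score), sequence position:
    EDKSYN*VTSVLFR (28), 64-76
    SYN*VTSVLFR (55), 83-92
  Non-formerly glycosylated peptides: VPLQQNFQDNQFQGK, TFVPGCcQPGEFTLGNIK, SYPGLTSYLVR
  Potential N-gly sites: 1; literature-reported N-gly sites: 1; separation method: EP

IPI00299729 Transcobalamin I precursor
  MW (Mr, kDa): 48.164; pI: 4.96
  Formerly glycosylated peptides (Mascot score), sequence position:
    ADEGSLKN*ISIYTKb (50), 209-222
    AQKMoN*DTIFGFTMoEERb (49), 365-380
    MN*DTIFGFTMEERb (32), 368-380
  Non-formerly glycosylated peptides: LVGIQIQTLMoQK, NGENLEVR
  Potential N-gly sites: 8; literature-reported N-gly sites: 0; separation method: EP & IS-IEF

IPI00300786 Alpha-amylase, salivary precursor
  MW (Mr, kDa): 57.731; pI: 6.47
  Formerly glycosylated peptides (Mascot score), sequence position:
    NVVDGQPFTNWYDN*GSNQVAFGRb (108), 414-436
  Non-formerly glycosylated peptides: AHFSISNSAEDPFIAIHAESK, ALVFVDNHDNQR, DFPAVPYSGWDFNDGK, EVTINPDTTCcGNDWVCcEHR, GHGAGGASILTFWDAR, IAEYMNHLIDIGVAGFR, LSGLLDLALGK, LSGLLDLALGKDYVR, MAVGFMLAHPYGFTR, NMVNFR, NWGEGWGFMPSDR, SSDYFGNGRVTEFK, TGSGDIENYNDATQVR, TSIVHLFEWR, WVDIALECcER, YFENdGKDVNDWVGPPNDNdGVTK
  Potential N-gly sites: 2; literature-reported N-gly sites: 0; separation method: EP

IPI00304557 Short palate, lung and nasal epithelium carcinoma associated protein 2 precursor
  MW (Mr, kDa): 26.995; pI: 5.35
  Formerly glycosylated peptides (Mascot score), sequence position:
    AEPIDDGKGLN*LSFPVTAN*VTVAGPIIGQIINLK (55), 114-147
    GLN*LSFPVTAN*VTVAGPIIGQIINLK (53), 122-147
  Non-formerly glycosylated peptides: none
  Potential N-gly sites: 2; literature-reported N-gly sites: 2; separation method: IS-IEF

IPI00328960a PREDICTED: hypothetical protein XP_085831
  MW (Mr, kDa): 91.568; pI: 7.92
  Formerly glycosylated peptides (Mascot score), sequence position:
    TPASN*ISTQVSHTKb (44), 140-153
  Non-formerly glycosylated peptides: none
  Potential N-gly sites: 3; literature-reported N-gly sites: 0; separation method: IS-IEF

IPI00333140a Delta notch-like EGF repeat containing transmembrane
  MW (Mr, kDa): 78.438; pI: 5.03
  Formerly glycosylated peptides (Mascot score), sequence position:
    LVSFEVPQN*TSVKb (43), 215-256
  Non-formerly glycosylated peptides: VTATGFQQCcSLIDGR
  Potential N-gly sites: 11; literature-reported N-gly sites: 0; separation method: EP & IS-IEF

IPI00374315a HYPOTHETICAL LOC389429
  MW (Mr, kDa): 37.902; pI: 5.78
  Formerly glycosylated peptides (Mascot score), sequence position:
    IILN*QTARb (52), 66-73
    MGMYKIILN*QTARb (72), 61-73
  Non-formerly glycosylated peptides: FCcYDVSSCcR
  Potential N-gly sites: 3; literature-reported N-gly sites: 0; separation method: EP & IS-IEF

IPI00384948 Ig alpha-2 chain C region
  MW (Mr, kDa): 53.868; pI: 6.02
  Formerly glycosylated peptides (Mascot score), sequence position:
    TPLTAN*ITK (57), 342-350
    LAGKPTHVN*VSVVMoAEVDGTCcY (56), 461-482
  Non-formerly glycosylated peptides: DASGATFTWT, GFSPKDVLVR, KGDTFSCcMoVGHEALPLAFTQ, NFPPSQDASGDLYTTSSQLTLPATQCcLAGK, QEPSQGTTTFAVTSILR, SAVQGPPER, TFTCcTAAYPESK, WLQGSQELPR, YLTWASR
  Potential N-gly sites: 5; literature-reported N-gly sites: 5; separation method: EP & IS-IEF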

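Table 1. (Continued)

IPI00400826 Clusterin precursor
  MW (Mr, kDa): 57.796; pI: 6.24
  Formerly glycosylated peptides (Mascot score), sequence position:
    LAN*LTQGEDQYYLR (64), 424-437
  Non-formerly glycosylated peptides: none
  Potential N-gly sites: 6; literature-reported N-gly sites: 6; separation method: EP

IPI00418512 DMBT1
  MW (Mr, kDa): 193.867; pI: 5.18
  Formerly glycosylated peptides (Mascot score), sequence position:
    CpcSGN*ESYLWSCcPHKb (38), 822-835
    LVNLN*SSYGLCcAGRb (95), 998-111
    QpgADN*DTIDYSNFLTAAVSGGIIKb (110), 1298-1320
  Non-formerly glycosylated peptides: FGQGSGPIVLDDVR, QLGCnGWATSAPGNAR, VEVLYR
  Potential N-gly sites: 14; literature-reported N-gly sites: 0; separation method: EP & IS-IEF

IPI00423460 Ig alpha-1 chain C region
  MW (Mr, kDa): 37.631; pI: 6.08
  Formerly glycosylated peptides (Mascot score), sequence position:
    LAGKPTHVN*VSVVMoAEVDGTCcY (56), 467-488
  Non-formerly glycosylated peptides: NFPPSQDASGDLYTTSSQLTLPATQCcLAGK, DASGVTFTWTPSSGK, TFTCcTAAYPESK, TPLTATLSK, WLQGSQELPR, YLTWASR, QEPSQGTTTFAVTSILR, KGDTFSCcMVGHEALPLAFTQK, GDTFSCcMVGHEALPLAFTQK
  Potential N-gly sites: 2; literature-reported N-gly sites: 2; separation method: EP

IPI00431645 Haptoglobin precursor
  MW (Mr, kDa): 45.177; pI: 8.48
  Formerly glycosylated peptides (Mascot score), sequence position:
    NLFLN*HSEN*ATAKb (59), 78-90
  Non-formerly glycosylated peptides: VGYVSGWGR, VTSIQDWVQK, YVMoLPVADQDQCcIR
  Potential N-gly sites: 3; literature-reported N-gly sites: 0; separation method: EP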

a Novel salivary proteins.
b Novel sites of N-linked glycosylation.
c Carbamidomethyl modification of cysteine.
d Deamidation of non-formerly glycosylated N and Q.
pc Cyclization of N-terminal carbamoylmethylcysteine.
pg Cyclization of N-terminal glutamine.
n Acrylamide modification of cysteine.
o Oxidation of methionine.
p The sites of N-linked glycosylation are indicated in bold with an asterisk.
IS-IEF, in-solution IEF pre-fractionation; EP, ethanol precipitation without in-solution IEF pre-fractionation.
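The peptide notation used in Table 1 can be checked mechanically. The following is a minimal sketch, not part of the original analysis: it strips the lowercase modification markers, reads the positions flagged with an asterisk, and tests whether each flagged Asn lies in the commonly cited N-X-S/T consensus sequon (X any residue except Pro). The function names are hypothetical, and the sequon rule is supplied here as general background rather than taken from the table.

```python
import re


def strip_modification_markers(peptide: str) -> str:
    """Drop lowercase footnote markers (c, pc, pg, n, o, d, b); keep residues and '*'."""
    return re.sub(r"[a-z]+", "", peptide)


def parse_marked(peptide: str):
    """Return (plain sequence, 1-based positions of residues followed by '*')."""
    plain, sites = [], []
    for ch in peptide:
        if ch == "*":
            sites.append(len(plain))  # '*' follows the residue it marks
        else:
            plain.append(ch)
    return "".join(plain), sites


# Lookahead so that overlapping candidates such as "NNT" are all examined.
_SEQUON = re.compile(r"(?=N[^P][ST])")


def sequon_positions(seq: str):
    """1-based positions of Asn residues that start an N-X-S/T motif (X != P)."""
    return [m.start() + 1 for m in _SEQUON.finditer(seq)]


def marked_sites_in_sequons(peptide: str) -> bool:
    """True if every asterisk-marked residue sits at the start of a sequon."""
    seq, sites = parse_marked(strip_modification_markers(peptide))
    allowed = set(sequon_positions(seq))
    return all(pos in allowed for pos in sites)


if __name__ == "__main__":
    for pep in ("AN*LTNFPEN*GTFVVNIAQLSQDDSGRYK",
                "WN*NTGCcQALPSQDEGPSK",
                "SDDNLLN*TSALGRb"):
        seq, sites = parse_marked(strip_modification_markers(pep))
        print(seq, "marked:", sites, "sequons:", sequon_positions(seq))
```

One limitation of this sketch: when the marked Asn lies within two residues of the peptide C-terminus (for example TFYWDFYTN*R), the motif extends past the end of the peptide and cannot be confirmed without the full protein sequence.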

Figure 4. LC-MS/MS mass spectra of (A) doubly charged formerly N-glycosylated peptide SYN*VTSVLFR (m/z 593.77) from neutrophil gelatinase associated lipocalin precursor and (B) doubly charged formerly N-glycosylated peptide SDDNLLN*TSALGR (m/z 688.94) from bactericidal/permeability increasing protein like precursor. The asterisk (*) denotes the site of N-glycosylation as determined from the tandem mass spectrum.
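The mass differences used to localize the deglycosylation site in Figure 4 can be reproduced with a short calculation. The sketch below is illustrative only and assumes standard monoisotopic residue masses; the function names are hypothetical. It converts the marked Asn to Asp, computes singly charged b- and y-ion m/z values, and reports the differences between adjacent ions.

```python
# Standard monoisotopic residue masses (Da); values supplied here as an assumption.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "M": 131.04049, "H": 137.05891, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276


def deamidate(seq: str, position: int) -> str:
    """Replace the Asn at a 1-based position with Asp (the PNGase F product)."""
    if seq[position - 1] != "N":
        raise ValueError("residue at this position is not Asn")
    return seq[:position - 1] + "D" + seq[position:]


def b_ions(seq: str):
    """Singly charged b-ion m/z values; element i-1 is b_i."""
    total, out = 0.0, []
    for aa in seq[:-1]:
        total += MONO[aa]
        out.append(total + PROTON)
    return out


def y_ions(seq: str):
    """Singly charged y-ion m/z values; element i-1 is y_i."""
    total, out = 0.0, []
    for aa in reversed(seq[1:]):
        total += MONO[aa]
        out.append(total + WATER + PROTON)
    return out


def precursor_mz(seq: str, charge: int = 2) -> float:
    """m/z of the (M + zH)z+ precursor ion."""
    return (sum(MONO[aa] for aa in seq) + WATER + charge * PROTON) / charge


if __name__ == "__main__":
    pep_a = deamidate("SYNVTSVLFR", 3)      # Figure 4A peptide
    pep_b = deamidate("SDDNLLNTSALGR", 7)   # Figure 4B peptide
    b, y = b_ions(pep_a), y_ions(pep_a)
    print("4A b3-b2:", round(b[2] - b[1], 3), "y8-y7:", round(y[7] - y[6], 3))
    yb = y_ions(pep_b)
    print("4B y7-y6:", round(yb[6] - yb[5], 3), "y10-y9:", round(yb[9] - yb[8], 3))
    print("precursors:", round(precursor_mz(pep_a), 2), round(precursor_mz(pep_b), 2))
```

Under these assumptions the adjacent-ion differences are the residue masses of Asp (about 115 Da) at the converted site and Asn (about 114 Da) at position 4 of the Figure 4B peptide, and the calculated monoisotopic precursor values agree with the reported m/z values to within about 0.1.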

protein precursor have 7 and 4 potential sites of N-linked glycosylation, respectively. The other high-abundance proteins observed in this study have only one or two potential sites of N-glycosylation. As expected, we did not find evidence for glycosylated peptides from albumin, a commonly observed protein in salivary fluid. However, formerly glycosylated peptides from afamin, which belongs to the albumin family and contains several sites of glycosylation,51 were detected in our study.

The glycoprotein capture method helped us identify many sites of N-linked glycosylation that had not been reported previously. Of the 84 sites of N-linked glycosylation, 44 were new sites from 27 different proteins (Table 1). In proteins such as afamin precursor, complement C4 protein, thrombospondin precursor, and zinc alpha-2 glycoprotein precursor, we discovered N-glycosylation sites other than those that had been reported previously. In the other 23 proteins, N-glycosylation sites are being reported for the first time in this paper.

Mucins are known to be highly glycosylated; we detected 9 N-linked sites on mucin 5B. Sugars on mucin are known to be involved in cell recognition. For this role, the glycosylation is expected to be on the outer edges of the molecule.41 Not surprisingly, the majority of the N-glycosylation sites discovered in our study mapped to either the N-terminal or the C-terminal domain: of the 9 sites, 3 mapped to the N-terminal domain of the molecule, 5 to the C-terminal domain, and 1 to the central domain.

In 15 glycoproteins, the number of glycosylation sites identified was lower than that reported in the literature. Several reasons may account for this finding. Glycosylation may be transient and may depend on the conditions that exist in the cell at a given time point. Another possibility is the lability of the carbohydrate moieties during sample handling and analysis.

An additional level of pre-fractionation, in-solution IEF separation, was employed to improve the dynamic range of the measurements. Of the glycoproteins identified, 38 were identified without IEF fractionation, 27 with in-solution IEF preseparation, and 20 by both methods (Table 1). Pre-fractionation strategies have clearly been used successfully to gain access to lower-abundance proteins in biological samples. However, this anticipated success did not translate as well to our glycoprotein cataloguing, as only a handful of new proteins were identified. This method may have suffered because of the high abundance of mucin 5B. With in-solution IEF fractionation, we identified 25 proteins in the pH 3.0-5.4 fractions and 8 proteins in the pH 5.4-10 fractions; 6 proteins were common to both fractions. An examination of the distribution of proteins on a 2D gel (Figures 2A and 3A) indicates that a majority of the salivary proteins lie in the pH 5.4-10.0 range. However, more proteins were identified in the lower pH fractions. The pI of mucin 5B lies near 6.24 and extends into the more basic zone. Given the large number of N-glycopeptides identified from Muc5B, it is likely that its high abundance reduced our sensitivity for other, lower-abundance proteins. Future studies will explore the possibility of Muc5B depletion prior to glycoprotein capture.
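The protein counts quoted above can be cross-checked by inclusion-exclusion. This small sketch assumes that each per-method count already includes the proteins shared between the two groups (an interpretation, not something stated explicitly in the text); the function name is hypothetical.

```python
def union_size(count_a: int, count_b: int, shared: int) -> int:
    """Size of the union of two sets when both counts include the shared items."""
    return count_a + count_b - shared


# 38 proteins without IEF, 27 with in-solution IEF, 20 found by both methods.
total_glycoproteins = union_size(38, 27, 20)

# 25 proteins in the pH 3.0-5.4 fractions, 8 in the pH 5.4-10 fractions, 6 in both.
ief_glycoproteins = union_size(25, 8, 6)

# Proteins seen only with IEF pre-fractionation.
ief_only = 27 - 20

if __name__ == "__main__":
    print(total_glycoproteins, ief_glycoproteins, ief_only)
```

Under this reading the two methods together account for the 45 glycoproteins reported, the two pH ranges together account for the 27 proteins found with in-solution IEF, and 7 proteins were found only after IEF pre-fractionation.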

Conclusion

Many previous attempts toward large-scale identification of salivary glycoproteins have employed lectin probes, with the identity of the protein inferred from its position on a 1D or 2D gel. In some studies, spots from the 1D or 2D gel were excised, the glycosylation removed by enzyme treatment or beta-elimination, and the resulting peptides analyzed by mass spectrometry for protein identification.35,52,53 However, these methods are rather indirect and often do not give accurate information about the identity of the protein or the site and type of glycosylation. The hydrazide capture and release method, on the other hand, is efficient in revealing the identity of the protein and the site of N-glycosylation. We report our initial attempt to characterize salivary N-glycoproteins comprehensively. The method of glycopeptide capture functions as a technique to pre-fractionate highly complex biofluids. It allowed us to identify 16 glycoproteins that have not been identified previously in saliva.

Glycosylation of proteins in the body changes with the onset and progression of cancer.24 Many of the known biomarkers for cancer are glycoproteins. Because saliva is a rich source of glycoproteins, the glycoproteome can be mined for protein markers for cancer detection and treatment. Studies in the past have also revealed saliva as a rich source of protein biomarkers for various cancers. Elevated levels of tumor markers for breast cancer, such as c-erbB-2 and CA15-3, have been found in the saliva of women suffering from this disease, compared to control patients.54 In patients with ovarian cancer, elevated salivary levels of CA125 have been observed.55 These proteins are well-established biomarkers for cancer and are known to be glycoproteins. In diseases such as Sjögren's syndrome, HIV, and cancer, many antibodies to proteins related to the disease are enhanced in saliva.2

In our present study, many of the novel proteins identified may be of special interest for cancer detection and prognosis. The carcinoembryonic antigen-related cell adhesion molecule (CEACAM) family comprises proteins that are deregulated in many human cancers. CEACAM 6, which was identified in our study, has been shown to be upregulated in colorectal cancer and other cancers.56 The gene for short palate, lung, and nasal epithelium carcinoma associated protein 2 precursor is deregulated in nasopharyngeal cancer and lung cancer.57 Thrombospondin-1 precursor is elevated in gastrointestinal, breast, and lung cancer. The protein level also changes drastically with the invasiveness and stage of cancer.58 It will be interesting to observe how the levels of these proteins and their glycosylation vary in saliva from cancer patients. In the future, we hope to utilize the hydrazide capture method to identify variations in protein profiles in human saliva as potential markers for various diseases.

Acknowledgment. The UCLA Mass Spectrometry and Proteomics Technology Center was established with a grant from the W. M. Keck Foundation. We are grateful to Dr. Parag Mallick (Cedars-Sinai) and Drs. Shen Hu and Rachel Loo (UCLA) for helpful advice. This work was supported by a grant from the National Institutes of Health (National Institute of Dental and Craniofacial Research; U01 DE016275) to D.T.W. and J.A.L. and the Ruth L. Kirschstein National Research Service Award (GM07185) to P.R.

Note Added after ASAP Publication: This manuscript was originally published on the Web 5/11/2006, with an incorrect reference. The version appearing on the Web 5/17/2006 and in print is correct.

References

(1) Jacobs, J. M.; Adkins, J. N.; Qian, W. J.; Liu, T.; Shen, Y.; Camp, D. G.; Smith, R. D. J. Proteome Res. 2005, 4, 1073-1085.
(2) Kaufman, E.; Lamster, I. B. Crit. Rev. Oral Biol. Med. 2002, 13, 197-212.
(3) Ghafouri, B.; Tagesson, C.; Lindahl, M. Proteomics 2003, 3, 1003-1015.
(4) Hardt, M.; Thomas, L. R.; Dixon, S. E.; Newport, G.; Agabian, N.; Prakobphol, A.; Hall, S. C.; Witkowska, H. E.; Fisher, S. J. Biochemistry 2005, 44, 2885-2899.
(5) Hu, S.; Xie, Y.; Ramachandran, P.; Ogorzalek Loo, R. R.; Li, Y.; Loo, J. A.; Wong, D. T. Proteomics 2005, 5, 1714-1728.
(6) Huang, C.-M. Arch. Oral Biol. 2004, 49, 951-962.
(7) Vitorino, R.; Lobo, M. J. C.; Ferrer-Correira, A. J.; Dubin, J. R.; Tomer, K. B.; Domingues, P. M.; Amado, F. M. L. Proteomics 2004, 4, 1109-1115.
(8) Wilmarth, P. A.; Riviere, M. A.; Rustvold, D. L.; Lauten, J. D.; Madden, T. E.; David, L. L. J. Proteome Res. 2005, 3, 1017-1023.
(9) Yao, Y.; Berg, E. A.; Costello, C. E.; Troxler, R. F.; Oppenheim, F. G. J. Biol. Chem. 2003, 278, 5300-5308.
(10) Xie, H.; Rhodus, N. L.; Griffin, R. J.; Carlis, J. V.; Griffin, T. J. Mol. Cell. Proteomics 2005, 4, 1826-1830.
(11) Shen, Y.; Kim, J.; Strittmatter, E. F.; Jacobs, J. M.; Camp, D. G., II; Fang, R.; Tolie, N.; Moore, R. J.; Smith, R. D. Proteomics 2005, 5, 4034-4045.
(12) Bjorhall, K.; Miliotis, T.; Davidsson, P. Proteomics 2005, 5, 307-317.
(13) Yan, W.; Lee, H.; Deutsch, E. W.; Lazaro, C. A.; Tang, W.; Chen, F.; Fausto, N.; Katze, M. G.; Aebersold, R. Mol. Cell. Proteomics 2004, 3, 1039-1041.
(14) McLachlin, D. T.; Chait, B. T. Curr. Opin. Chem. Biol. 2002, 5, 591-602.
(15) Pinske, M. W. H.; Uitto, P. M.; Hilhorst, M. J.; Oorns, B.; Heck, A. J. R. Anal. Chem. 2004, 76, 3935-3943.
(16) Yang, Z.; Hancock, W. S.; Chew, T. R.; Bonilla, L. Proteomics 2005, 5, 3353-3366.
(17) Zhang, H.; Li, X. J.; Martin, D. B.; Aebersold, R. Nat. Biotechnol. 2003, 21, 660-666.
(18) Duffy, M. J.; Shering, S.; Sherry, F.; McDermott, E.; O’Higgins, N. Int. J. Biol. Markers 2000, 15, 330-333.
(19) Lloyd, K. O.; Yin, B. W.; Kudryashov, V. Int. J. Cancer 1997, 71, 842-850.
(20) Jankovic, M. M.; Kosanovic, M. M. Clin. Biochem. 2005, 38, 58-65.
(21) Akiyama, T.; Sudo, C.; Ogawara, H.; Toyoshima, K.; Yamamoto, T. Science 1986, 232, 1644-1646.
(22) Dennis, J. W.; Granovsky, M.; Warren, C. E. Biochim. Biophys. Acta 1999, 1473, 21-34.
(23) Kannagi, R.; Izawa, M.; Kolke, T.; Miyazaki, K.; Kimura, N. Cancer Sci. 2004, 95, 377-384.
(24) Kobata, A.; Amano, J. Immunol. Cell Biol. 2005, 83, 429-439.
(25) Zhang, H.; Yi, E. C.; Li, X. J.; Mallick, P.; Kelli-Spratt, K. S.; Masselon, C. D.; Camp, D. G., II; Smith, R. D.; Kemp, C. J.; Aebersold, R. Mol. Cell. Proteomics 2005, 4, 144-155.
(26) Liu, T.; Qian, W. J.; Gritsenko, M. A.; Camp, D. G., II; Monroe, M. E.; Moore, R. J.; Smith, R. D. J. Proteome Res. 2005, 4, 2070-2080.
(27) Levine, M. J.; Reddy, M. S.; Tabak, L. A.; Loomis, R. E.; Bergey, E. J.; Jones, P. C.; Cohen, P. E.; Stinson, M. W.; Al-Hashmini, I. J. Dental Res. 1987, 66, 436-441.
(28) Loomis, R. E.; Prakobphol, A.; Levine, M. J.; Reddy, M. S.; Jones, P. C. Arch. Biochem. Biophys. 1987, 258, 452-464.
(29) Prakobphol, A.; Levine, M. J.; Tabak, L. A.; Reddy, M. S. Carbohydrate Res. 1982, 108, 111-122.
(30) Karlsson, N. G.; Schulz, B. L.; Packer, N. H. J. Am. Soc. Mass Spectrom. 2004, 15, 659-672.
(31) Thomsson, K. A.; Schulz, B. L.; Packer, N. H.; Karlsson, N. G. Glycobiology 2005, 15, 791-804.
(32) Wickstrom, C.; Davies, J. R.; Eriksen, G. V.; Veerman, E. C. I.; Carlstedt, I. Biochem. J. 1998, 334, 685-693.
(33) Wu, A. M.; Csako, G.; Herp, A. Mol. Cell. Biochem. 1994, 137, 39-55.
(34) Carpenter, G. H.; Pankhurst, C. L.; Proctor, G. B. Electrophoresis 1999, 20, 2124-2132.
(35) Carpenter, G. H.; Proctor, G. B. Oral Microbiol. Immunol. 1999, 14, 309-315.
(36) Loomis, R. E.; Bhandary, K. K.; Tseng, C. C.; Bergey, E. J.; Levine, M. J. Biophys. J. 1987, 51, 193-203.
(37) Oho, T.; Rahemtulla, F.; Mansson-Rahentulla, B.; Hjerpe, A. Int. J. Biochem. 1992, 24, 1159-1168.
(38) Guile, G. R.; Harvey, D. J.; O’Donnell, N.; Powell, A. K.; Hunter, A. P.; Zamze, S.; Fernandes, D. L.; Dwek, R. A.; Wing, D. R. Eur. J. Biochem. 1998, 258, 623-656.
(39) Bank, R. A.; Hettema, E. H.; Arwert, F.; Amerongen, A. V.; Pronk, J. C. Electrophoresis 1991, 12, 74-79.
(40) Van Nieuw Amerongen, A.; Bolscher, J. G.; Veerman, E. C. Caries Res. 2004, 38, 247-253.
(41) Prakobphol, A.; Boren, T.; Ma, W.; Zhixiang, P.; Fisher, S. J. Biochemistry 2005, 44, 2216-2224.
(42) Thatcher, B. J.; Doherty, A. E.; Orvisky, E.; Martin, B. M.; Henkin, R. I. Biochem. Biophys. Res. Commun. 1998, 250, 635-641.
(43) van der Reijden, W. A.; Veerman, E. C.; Amerongen, A. V. Biorheol. 1993, 30, 141-152.
(44) Gabriel, M. O.; Grunheid, T.; Zentner, A. J. Periodontology 2005, 76, 1175-1181.
(45) Liu, B.; Rayment, S. A.; Gyurko, C.; Oppenheim, F. G.; Offner, G. D.; Troxler, R. F. Biochem. J. 2000, 345, 557-564.
(46) Van Nieuw Amerongen, A.; Bolscher, J. G.; Veerman, E. C. Caries Res. 2004, 38 (3), 247-253.
(47) Schulenberg, H. C.; Steinberg, T. H.; Leung, W. Y.; Patton, W. F. Electrophoresis 2003, 24, 588-598.
(48) Zuo, X.; Speicher, D. W. Proteomics 2002, 2, 58-68.
(49) Bause, E. Biochemical J. 1983, 209, 331-336.
(50) Gonzalez, J.; Takao, T.; Hori, H.; Besada, V.; Rodriguez, R.; Padron, G.; Shimonishi, Y. Anal. Biochem. 1992, 205, 151-158.
(51) Jerkovic, L.; Voegele, A. F.; Chatwal, S.; Kronenberg, F.; Radcliffe, C. M.; Wormald, M. R.; Lobentanz, E. M.; Ezeh, B.; Eller, P.; Dejori, N.; Dieplinger, B.; Lottspeich, F.; Sattler, W.; Uhr, M.; Mechtler, K.; Dwek, R. A.; Rudd, P. M.; Baier, G.; Dieplinger, H. J. Proteome Res. 2005, 4, 889-899.
(52) Lundy, F. T.; Al-Hashmini, I.; Rees, T. D.; Lamey, P. J. Oral Surg. Oral Med. Oral Pathol. Oral Radiol. Endodontol. 1997, 83, 252-258.
(53) Schulz, B. L.; Packer, N. H.; Karlsson, N. G. Anal. Chem. 2002, 74, 6088-6097.
(54) Streckfus, C.; Bigler, L.; Tucci, M.; Thigpen, J. T. Cancer Investig. 2000, 18, 101-109.
(55) Chien, D. X.; Schwartz, P. E.; Li, F. Q. Obstetric Gynecol. 1990, 75, 701-704.
(56) Jantscheff, P.; Terracciano, L.; Lowy, A.; Glatz-Krieger, K.; Grunert, F.; Micheel, B.; Brummer, J.; Laffer, U.; Metzger, U.; Herrmann, R.; Rochlitz, C. J. Clin. Oncol. 2003, 21, 3638-3646.
(57) Bingle, L.; Cross, S. S.; High, A. S.; Wallace, W. A.; Devine, D. A.; Havard, S.; Campos, M. A.; Bingle, C. D. J. Pathol. 2005, 205, 491-497.
(58) Esemuede, N.; Lee, T.; Pierre-Paul, D.; Sumpio, B. E.; Gahtan, V. J. Surg. Res. 2004, 122, 135-142.

PR050492K

Journal of Proteome Research • Vol. 5, No. 6, 2006 1503