ARTICLE pubs.acs.org/jpr

Global Survey of the Bovine Salivary Proteome: Integrating Multidimensional Prefractionation, Targeted, and Glycocapture Strategies

Ching-Seng Ang,†,‡ Steve Binos,†,‡ Matthew I Knight,†,‡ Peter J Moate,‡,§ Benjamin G Cocks,†,‡ and Matthew B McDonagh*,†

†Biosciences Research Division, ‡Dairy Futures Cooperative Research Centre, and §Future Farming Research Division, Department of Primary Industries, 1 Park Drive, Bundoora, Victoria, Australia

Supporting Information

ABSTRACT: Saliva is easily obtainable from a large number of animals in a noninvasive manner and contains a wide diversity of compounds, including hormones, metabolites, and proteins, that may be a good source of biomarkers of health and disease. Here we used a combination of multidimensional prefractionation, targeted, and glycocapture methodologies to profile the bovine salivary proteome. The nontargeted approach used four different separation methodologies: SDS-PAGE, Off-gel fractionation, RP-HPLC, and SCX-HPLC. In the targeted approach, we employed a hypothesis-based methodology, selecting only extracellular proteins from in silico data. Finally, the hydrazide capture methodology not only enabled us to identify formerly N-linked glycoproteins but also provided a selective enrichment process for the identification of low-abundance proteins. Together, the three approaches identified 402 salivary proteins and 45 N-linked glycoproteins. A large number of these proteins were previously uncharacterized in bovine saliva. To date, this is the largest global survey of the bovine salivary proteome; it expands the potential diagnostic utility of this fluid and can guide the development of experiments seeking biomarkers for health traits (i.e., disease resistance) as well as feed conversion efficiency and productivity traits in dairy and beef cattle.

KEYWORDS: bovine saliva, global proteome, proteomics, N-linked glycoproteins, targeted proteomics, multidimensional fractionation

INTRODUCTION

Cows secrete up to 200 L of saliva a day, which contains a wide diversity of compounds including hormones, metabolites, and proteins that contribute to the digestion process and may influence rumen function.1,2 Being readily available, easily collected at remote sites by unskilled personnel, and stable for long periods when frozen,3,4 saliva is a biological fluid ideal for screening for performance traits and for diagnostic purposes. There have been numerous studies of the human salivary proteome.5,6 The most comprehensive catalogue of the human salivary proteome to date comes from a multisite consortium involving groups from the Scripps Research Institute, UCSF, and UCLA in the U.S.A. (www.hspp.ucla.edu). Clinical assays that use saliva to monitor proteins or post-translationally modified proteins have also been developed, especially in relation to human disease, where a panel of different biomarkers is used to achieve an acceptable level of sensitivity and specificity.7,8 These applications could potentially be extended into the agricultural sector. As for many human diseases, single markers for traits related to livestock performance and productivity are rare. Instead, traits of agronomic importance in livestock are usually explained by a large number of linked markers, owing to the multigenic nature of these traits.9,10 Phenotypic expression of traits such as feed conversion efficiency, methane production, and disease resistance may also be complicated by the influence of variation in rumen function and microbial balance.11–13 Components of the oral cavity (buffering enzymes, immune response, etc.) have been shown to have a pronounced effect on the productivity and performance of ruminants.14,15 Since the rumen has no secretory functions, saliva has a critical role in the transport of ingesta during regurgitation and helps to buffer the rumen fluid in which the rumen microorganisms proliferate. Cataloguing the types of proteins in bovine saliva, and subsequently developing screening assays for them, is therefore important to form a foundation for biomarker discovery studies in bovine saliva and to evaluate the effectiveness of this biological matrix for predicting bovine traits early in the life cycle for production, management, and breeding purposes.

Received: May 30, 2011
Published: September 08, 2011
Published 2011 by the American Chemical Society

dx.doi.org/10.1021/pr200516d | J. Proteome Res. 2011, 10, 5059–5069

Surprisingly, despite the importance and impact of cattle production in the agricultural industries, little is known about the composition of the bovine salivary proteome, with the exception of a few early studies in which some of the abundant salivary proteins were separated on SDS-PAGE gels and characterized.16,17 The main problem faced by comprehensive proteomics studies thus far has been the ability to separate and simplify very complex mixtures of proteins in which individual components may differ in abundance by several orders of magnitude. To date there is no perfect "universal" method for identifying the entire proteome, nor any instrument capable of identifying and quantifying the components of a complex protein sample in a simple one-step procedure.18 For example, in a large-scale identification of proteins using 2D-PAGE, SDS-PAGE prefractionation followed by LC-MS (geLC-MS), and 2D LC-MS, a total of 1218 proteins were identified by the three methods in exponentially growing Bacillus subtilis cells, but only 140 proteins were consistently identified in all three analyses.19 This clearly demonstrates the need for complementary studies that utilize the strengths of each method.

In this paper, we have for the first time investigated a nontargeted multidimensional prefractionation methodology and a targeted approach to profile the salivary proteome of cows, to provide new information on the composition and potential diagnostic utility of this fluid. We have also used hydrazide capture technology to capture N-linked glycoproteins as well as any low-abundance proteins that may be enriched by this selective methodology. This three-step approach (nontargeted, targeted, and glycocapture) has enabled us to identify 402 salivary proteins and 45 N-linked glycoproteins, a large number of them previously uncharacterized, making this by far the largest global survey of the bovine salivary proteome to date.

EXPERIMENTAL PROCEDURES

Materials

Chemicals were purchased from Sigma (St. Louis, MO) unless otherwise stated. Sequencing-grade modified trypsin was purchased from Promega (Madison, WI). Affi-gel Hz hydrazide resins and coupling buffers were purchased from Bio-Rad (Hercules, CA). All LC-MS reagents were of the highest grade.

Saliva Collection

Sampling of saliva from experimental dairy cows was approved by the Department of Primary Industry Animal Ethics Review committee and was conducted in accordance with the Australian Code of Practice for the Care and Use of Animals for Scientific Purposes (2004). Whole saliva was taken from the mouth of Holstein cows from the Department of Primary Industries, Ellinbank Centre (1301 Hazeldean Road, Ellinbank, Victoria 3821). The mouth was examined visually prior to sampling, and the color and clarity of the saliva were assessed to ensure that the animal displayed no visible sign of trauma in the oral cavity and that contamination with nonsalivary proteins (e.g., blood) was minimized. Saliva was collected using gauze swabs placed at the base of the cheek on both sides of the mouth to absorb free whole saliva. Thus, the collected saliva contained a mixture of proteins derived mainly from the parotid, submaxillary, and sublingual glands, with contributions from other minor glands such as the ventral buccal, intermediate buccal, dorsal buccal, and pharyngeal glands.20 Gauze swabs were transferred to 50 mL centrifuge tubes and frozen immediately in liquid nitrogen prior to long-term storage at −80 °C.

Sample Preparation

The collection gauze was cut into 7 smaller sections and immersed in 15 mL of the respective extraction buffer for SDS-PAGE analysis (1× PBS, pH 7.4; 50 mM Tris-HCl, pH 8.0; or 0.15% TFA). The samples were then vortexed and sonicated for 1 min before being placed on ice for 30 min. This process was repeated twice. Whole extracted saliva was then centrifuged at 14 000 × g at 4 °C for 15 min. The supernatant was collected and recentrifuged at 14 000 × g at 4 °C for a further 15 min to remove any particulates. The protein content of the saliva was determined to be ∼1.5 mg/15 mL buffer using the BCA assay (Pierce, IL). The supernatant was then aliquoted into 1 mL fractions, freeze-dried, and stored at −80 °C. The 50 mM Tris-HCl, pH 8.0, extraction buffer was used for all other analyses (RP-HPLC, SCX-HPLC, Off-gel fractionation, and glycocapture).

Sample Prefractionation (Nontargeted Approach)

(A) SDS-PAGE of proteins was carried out using 4–12% NuPAGE gels (Invitrogen, Carlsbad, CA) with MES buffer. Sample preparation and electrophoresis were carried out as per the manufacturer's instructions. Thirty micrograms of saliva protein was loaded onto the gel, and gels were subsequently stained using Coomassie Brilliant Blue G250. The entire gel lane was divided into 12 equal gel slices, followed by in-gel digestion as described previously.21

(B) RP-HPLC separation of proteins was performed using an Agilent 1290 series capillary HPLC (Agilent Technologies, Foster City, CA). Proteins were resolved on an Eclipse XDB-C8 column (Agilent Technologies, 150 × 2.1 mm). The eluents used for the liquid chromatography were 0.1% v/v TFA (solvent A) and 100% CH3CN/0.1% v/v TFA (solvent B). The flow rate was 200 μL/min and the following gradient was used: 0–80% B in 40 min, then maintained at 80% B for the final 5 min. Detection was by UV absorbance at 254 and 280 nm. Samples were collected at 1 min intervals over 45 min, giving a total of 45 fractions. The fractions were freeze-dried, resuspended in 20 μL of 50 mM Tris-HCl, pH 8.0, and digested as described previously.22

(C) For Off-gel electrophoresis of proteins, the freeze-dried saliva was resuspended in 200 μL of water, followed by acetone precipitation and final resuspension in 720 μL of protein OFFGEL solution (Agilent, OFFGEL high Res Kit). The samples were then rehydrated onto a 24 cm IPG strip (pH 3–10, NL, GE Healthcare) and fractionated on a 3100 OFFGEL fractionator (Agilent Technologies, Foster City, CA) into 24 wells as per the manufacturer's instructions. Fractionated samples (24 fractions) were acetone precipitated and resuspended in 100 μL of 50 mM Tris-HCl, pH 8.0. These samples were then digested with trypsin as described previously.22

(D) Strong cation exchange (SCX) HPLC fractionation of peptides (see below for the in-solution digestion procedure) was performed using an Agilent 1290 series capillary HPLC (Agilent Technologies). Peptides were resolved on a BioBasic SCX column (Thermo Scientific, 100 × 2.1 mm). The eluents used for the liquid chromatography were 5 mM KH2PO4/25% CH3CN, pH 3.0 (solvent A) and 5 mM KH2PO4/25% CH3CN/600 mM KCl, pH 3.0 (solvent B). The flow rate was 200 μL/min and the following gradient was used: 0% B for 3 min, 0–30% B in 35 min, 30–100% B in 5 min, and maintained at 80% B for the final 5 min. Detection was by UV absorbance at 214 and 254 nm. Samples were collected at 1 min intervals over 45 min, giving a total of 45 fractions. The peptides were freeze-dried and resuspended in 20 μL of 3% CH3CN/0.1% formic acid.

Targeted Approach

The targeted approach was modified from the Accurate Inclusion Mass Screening (AIMS) procedure described previously.23 In the targeted approach, we employed a hypothesis-based methodology, selecting only extracellular proteins that were predicted in silico to be secreted or for which there was experimental evidence of secretion. Using the Uniprot database (www.uniprot.org), a list of 694 predicted or experimentally identified secreted proteins was generated (Supplementary Table 2, Supporting Information). Proteins were selected by first filtering the database for entries containing "Bos taurus" and then selecting proteins that were described as secreted. Proteotypic peptides were then predicted with PeptideSieve24 using a threshold of >50 and added to the "inclusion list" in the mass spectrometer instrument method, similar to the procedure described earlier.23 The inclusion list consists of the m/z value and the MS normalized collision energy, which was set to 35.

Purification of Formerly N-Linked Glycopeptides from Bovine Saliva

The hydrazide coupling method described previously25 was used in this study. Extracted bovine saliva was first precipitated with 9 volumes of acetone, followed by resuspension to ∼1 mg/mL in the Affi-gel coupling buffer (pH 5.5) with the addition of 0.2% CHAPS.26 Oxidation of the vicinal hydroxyls on the sugar residues of the carbohydrate to form aldehydes was carried out by the addition of 15 mM sodium periodate for 1 h in the dark at room temperature. After removal of sodium periodate by desalting (Econo-Pac 10DG desalting column, Bio-Rad, Hercules, CA), the samples were added to the hydrazide resins and incubated at room temperature for 20 h with end-to-end rotation. Nonglycoproteins were then removed by washing the hydrazide resins 8 times with 10 bed volumes of urea solution (8 M urea, 200 mM Tris, pH 8.3). In the last wash, 10 mM TCEP was added to reduce the samples for 30 min at room temperature, followed by alkylation with 50 mM iodoacetamide for 30 min in the dark at room temperature. The resin was then washed another 8 times with diluted urea solution (1 M urea, 50 mM Tris, pH 8.3). Digestion was carried out at an enzyme to protein ratio of 1:50 overnight at 37 °C. The trypsin-released peptides were removed by three sequential washes with 1.5 M NaCl, 80% ACN/0.1% TFA, 100% methanol, and H2O, followed by six washes with 50 mM ammonium bicarbonate. PNGase F (25 U per mg of protein) was added to release the N-linked glycopeptides overnight at 37 °C. The resin was washed with 80% ACN/0.1% TFA, and the washes were combined before freeze-drying and cleanup by solid-phase extraction (C18 Sep-Pak, Waters).

Liquid Chromatography-Mass Spectrometry (LC-MS/MS)

LC-MS/MS was carried out on an LTQ Orbitrap Velos (Thermo Scientific, West Palm Beach, FL) equipped with a nanoelectrospray interface and coupled to an Ultimate 3000 RSLC nanosystem (Dionex, Sunnyvale, CA). The nanoLC system was equipped with an Acclaim Pepmap nanotrap column (Dionex C18, 100 Å, 75 μm × 2 cm) and an Acclaim Pepmap RSLC analytical column (Dionex C18, 100 Å, 75 μm × 15 cm). Typically, for each LC-MS/MS experiment, 1 μL of the peptide mix was loaded onto the enrichment (trap) column at an isocratic flow of 3 μL/min of 3% CH3CN/0.1% formic acid for 4 min before the enrichment column was switched in-line with the analytical column. The eluents used for the liquid chromatography were 0.1% v/v formic acid (solvent A) and 100% CH3CN/0.1% v/v TFA (solvent B). The flow rate was 0.3 μL/min and the following gradient was used: 3–5% B for 1 min, 5–25% B in 40 min, 25–80% B in 10 min, and maintained at 80% B for the final 5 min. The column was then equilibrated with 3% B for 10 min prior to the next analysis. All sample runs were performed at least in duplicate.

The LTQ-Orbitrap Velos mass spectrometer was operated in data-dependent mode with a nano-ESI spray voltage of +1.6 kV, a capillary temperature of 250 °C, and an S-lens RF value of 60%. All spectra were acquired in positive mode, with full-scan MS spectra scanning from m/z 150–2000 in FT mode at 60 000 resolution after accumulating to a target value of 1.00 × 10^6 with a maximum accumulation time of 500 ms. A lock mass of 445.120024 from ambient air was applied. The 20 most intense peptide ions with charge states ≥2 were isolated at a target value of 5000 and fragmented by low-energy CID with a normalized collision energy of 30, an activation Q of 0.25, and an activation time of 10 ms. Dynamic exclusion was set to 2 repeat counts over 30 s with an exclusion duration of 70 s. Monoisotopic precursor selection was enabled at all times. In the targeted approach, all parameters were similar except that the top 15 most intense peptide ions within the inclusion list were selected for fragmentation.

For the glycopeptide analysis, the mass spectrometer was operated in data-dependent mode as described above and also in electron transfer dissociation (ETD)-only and CID-ETD modes. In CID-ETD mode, the LTQ-Orbitrap Velos performed a full MS scan (60 000 resolution) followed by five alternating CID and ETD scans, with activation times for CID and ETD of 10 ms and 100 ms, respectively. Charge-state-dependent ETD time was enabled for the ETD scan.
For the ETD-only mode, the LTQ-Orbitrap Velos performed a full MS scan (60 000 resolution) followed by 10 ETD scans with an activation time of 100 ms.

Data Analysis

Data analysis was done using the Proteome Discoverer 1.2 software suite (Thermo Scientific) with Mascot v.2.2.04 (Matrix Science, London, UK). An initial precursor mass filter was set between 350 and 3000 Da. For the Mascot and ZCore (ETD-specific) search engines, the peptide mass tolerance was set to 20 ppm, and to 0.8 Da for MS/MS fragment ions. Searches were carried out on the latest version of the bovine International Protein Index (IPI version 3.60, containing 31 577 protein sequences). Enzyme specificity was trypsin with a maximum of 2 missed cleavages. Cysteine carbamidomethylation (+57.0215 Da) and methionine oxidation (+15.9949 Da) were set as the fixed and variable modifications, respectively, for all searches. ESI-FTICR was set as the default instrument search setting. For identification of formerly glycosylated peptides, deamidation of Asn to Asp (+0.9840 Da) was included as a variable modification.

As CID and ETD spectra cannot be searched with the same parameters, they were processed in a workflow within the Proteome Discoverer suite in which the CID and ETD spectra were separated with the scan event filter; CID spectra were then searched with Mascot (instrument setting = ESI-FTICR), and ETD spectra were searched with both Mascot (instrument setting = ETD-TRAP) and ZCore. All spectra were searched against the target/decoy database to achieve a targeted false discovery rate of 1% for all samples. In the case of Mascot scoring, a Mascot significance threshold score was chosen such that the number of spectral identifications in the decoy database was 1% of the number of identifications in the target database. For ETD results using the ZCore search algorithm, the EQuestNode (Probability score) was selected to achieve a targeted false discovery rate of 1%. The information for each protein sequence identified (Mascot ion score, δppm, sequence coverage, etc.) is also included in the various worksheets within the Supplementary Tables (Supplementary Tables 1, 2, and 3, Supporting Information).

Figure 1. Pipeline for the global survey of the bovine salivary proteome. The identification pipeline consists of four nontargeted approaches (SDS-PAGE, Off-gel electrophoresis, RP-HPLC, and SCX-HPLC), a hypothesis-driven targeted approach, and a glycocapture approach for N-linked glycoproteins. Detailed explanations of the individual processes are given in the Experimental Procedures.
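The target/decoy filtering described under Data Analysis can be illustrated with a minimal Python sketch. It is not the Proteome Discoverer or Mascot implementation; the function name `score_threshold_at_fdr` and the simple decoy/target ratio used as the FDR estimate are assumptions made for illustration only.

```python
from typing import Iterable, Optional


def score_threshold_at_fdr(
    target_scores: Iterable[float],
    decoy_scores: Iterable[float],
    max_fdr: float = 0.01,
) -> Optional[float]:
    """Return the lowest score cutoff at which decoy hits / target hits <= max_fdr.

    Hypothetical helper: the FDR is estimated as the number of decoy matches
    scoring at or above the cutoff divided by the number of target matches
    scoring at or above it. Returns None if no cutoff meets the requested FDR.
    """
    targets = list(target_scores)
    decoys = list(decoy_scores)
    # Candidate cutoffs are the observed scores, tried from most permissive up.
    for cutoff in sorted(set(targets) | set(decoys)):
        n_target = sum(1 for s in targets if s >= cutoff)
        n_decoy = sum(1 for s in decoys if s >= cutoff)
        if n_target > 0 and n_decoy / n_target <= max_fdr:
            return cutoff
    return None


if __name__ == "__main__":
    # Toy scores, invented for the example.
    targets = [60, 55, 50, 45, 40, 35, 30, 25, 20, 15]
    decoys = [22, 18, 12]
    print(score_threshold_at_fdr(targets, decoys, max_fdr=0.01))
```

In practice the search engine reports its own significance threshold; the sketch only shows the counting logic behind choosing a cutoff that keeps decoy identifications at about 1% of target identifications.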

RESULTS AND DISCUSSION

To globally survey the bovine saliva proteome, three different strategies (i.e., nontargeted, targeted, and glycocapture) were developed. A schematic representation of this bovine saliva proteome pipeline is outlined in Figure 1. Each of these three strategies had its own advantages and limitations, but used in combination they enabled us to gain a comprehensive understanding of the global profile of the bovine salivary proteome. This pipeline identified a total of 402 proteins from 2666 peptides (62 744 peptide-spectrum matches (PSMs)) together with 45 N-linked glycoproteins and is by far the largest global survey of the bovine salivary proteome.

Nontargeted Approach

The nontargeted approach used four different prefractionation methodologies based on different physicochemical properties of the proteins or peptides: (A) SDS-PAGE (electrophoretic mobility), (B) Off-gel fractionation (protein pI), (C) RP-HPLC (protein hydrophobicity), and (D) SCX-HPLC (peptide charge). Each of these prefractionation methodologies enriched for different classes of proteins in each of the separated fractions. Three prefractionation techniques separated the bovine saliva proteome at the protein level, while the final method, SCX-HPLC, separated it at the peptide level. Every fraction in this study was then digested with trypsin and analyzed by nanoLC-MS/MS on the highly sensitive LTQ-Orbitrap Velos. The combined results from the nontargeted approach identified a total of 396 proteins at a 1% false discovery rate (FDR).

As sample quality is crucial, initial studies determined the optimum buffer for extracting bovine saliva. Proteins were extracted from bovine saliva in 3 different extraction buffers: PBS, a physiological buffer; Tris-HCl, a common digestion buffer used in proteomic studies; and TFA, previously shown to be useful for stopping proteolytic enzymes while being directly compatible with HPLC buffers.27 There was significant overlap between the proteins identified by LC-MS/MS analysis of digests from the three extraction buffers (Figure 2A). The relative proportion of identified proteins that were unique to each buffer was low, ranging from 13% (Tris-HCl) to 16% (TFA). However, of a total of 251 proteins identified from the three extraction buffers, only 25% (64 proteins) were identified in all 3 buffers. ANOVA on only the unique proteins identified from each extraction method showed a significant difference (p < 0.05) in the pI of proteins identified from the TFA and Tris buffers (Figure 2C). The unique proteins identified from the TFA extract appear to span a broader pI range; however, considering the lower total number of proteins identified and the incompatibility with some downstream processing methodologies, TFA was not selected for subsequent sample preparation. With no significant difference in protein molecular weight range or pI range (Figure 2B and C) between the PBS and Tris-HCl buffers, the decision to use Tris-HCl buffer for subsequent sample preparation was based on the suitability of this buffer for most of the downstream sample preparation and digestion procedures.

Figure 2. Comparing the different extraction buffers on protein identification. (A) Identified proteins from each of the three different extraction buffers based on a targeted 1% FDR. (B) Distribution of protein molecular weight of unique proteins identified in each extraction. (C) Distribution of protein pI of unique proteins identified in each extraction. The box and whisker plot represents the lower quartile, median, and upper quartile of the data range. (* significantly different between TFA and Tris extraction buffer (p < 0.05)).

Prefractionation at the protein level was then expanded using RP-HPLC and Off-gel electrophoresis, a liquid-based prefractionation technique based on protein pI.28 From a total of 36 gel slices from the SDS-PAGE gel (12 slices from each of the lanes for the 3 sample extraction buffers described above, totalling 72 LC-MS experiments when run in duplicate), we were able to identify a total of 251 proteins based on a targeted 1% FDR (Supplementary Table 1, Supporting Information). In the RP-HPLC protein prefractionation (90 LC-MS experiments, in duplicate), a total of 110 proteins were identified based on a targeted 1% FDR (Supplementary Table 1, Supporting Information). For the Off-gel electrophoresis prefractionation (48 LC-MS experiments, in duplicate), a total of 150 proteins were identified based on a targeted 1% FDR (Supplementary Table 1, Supporting Information). For the SCX-HPLC approach, prefractionation was carried out "off line" on the digested proteins based on the resultant peptide charge. The fractionated peptides were then subjected to LC-MS (90 LC-MS experiments, in duplicate), where a total of 206 proteins were identified based on a targeted 1% FDR (Supplementary Table 1, Supporting Information).

Figure 3. Breakdown of proteins identified from the four different prefractionation strategies in the nontargeted approach. There are a total of 42 overlapping proteins between the four prefractionation strategies, which included SDS-PAGE (251 proteins), SCX-HPLC (206 proteins), Off-gel (150 proteins), and RP-HPLC (110 proteins). These prefractionation strategies enabled the identification of 97, 64, 38, and 14 unique proteins from the SDS-PAGE, SCX-HPLC, Off-gel, and RP-HPLC prefractionation, respectively.

Of the 396 unique proteins identified through the nontargeted strategy, only ∼10% (42) were identified by all four strategies (Figure 3), with more than 50% (213) being identified in only 1 of the four strategies. The relative peptide counts for the overlapping proteins, such as immunoglobulins, cathelicidins, carbonic anhydrase 6, and lactoperoxidases, were high, as were their respective emPAI scores29 (data not shown). These are likely indications of their relative abundance and potential role as housekeeping proteins in saliva (Supplementary Table 1, Supporting Information). The prefractionation methodology had no influence on the MW and pI profiles of the proteins identified, and the fact that a number of unique proteins were identified by each method suggests that no single existing methodology is adequate to profile the salivary proteome. However, using the SCX and SDS methodologies as detailed would account for 86% of the nontargeted bovine salivary proteome reported here.
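The overlap figures quoted above (proteins shared by all four strategies versus proteins unique to one strategy) amount to simple set operations on the per-method identification lists. The following Python sketch shows one way to compute them; the function name `overlap_summary` and the toy accession labels are hypothetical and are not taken from the supplementary data.

```python
from typing import Dict, Set


def overlap_summary(ids_by_method: Dict[str, Set[str]]) -> Dict[str, object]:
    """Summarize shared and method-specific protein identifications.

    Assumes at least one method is supplied. Keys of the returned dict:
      total          number of distinct proteins across all methods
      shared_by_all  proteins identified by every method
      unique         per-method proteins seen in that method only
    """
    all_ids = set().union(*ids_by_method.values())
    shared = set.intersection(*ids_by_method.values())
    unique = {}
    for name, ids in ids_by_method.items():
        # Proteins found by any of the other methods.
        others = set().union(
            *(v for k, v in ids_by_method.items() if k != name)
        )
        unique[name] = ids - others
    return {"total": len(all_ids), "shared_by_all": shared, "unique": unique}


if __name__ == "__main__":
    # Toy identification lists, invented for the example.
    toy = {
        "SDS-PAGE": {"P1", "P2", "P3", "P4"},
        "SCX-HPLC": {"P1", "P2", "P5"},
        "Off-gel": {"P1", "P3"},
        "RP-HPLC": {"P1", "P6"},
    }
    summary = overlap_summary(toy)
    print(summary["total"], sorted(summary["shared_by_all"]))
    print({k: sorted(v) for k, v in summary["unique"].items()})
```

Applied to the counts reported here, 42 of 396 proteins corresponds to about 10.6% shared by all four strategies, and 97 + 64 + 38 + 14 = 213 proteins (about 53.8%) were seen by a single strategy only.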
The limited capacity for investigating complex proteomes composed of proteins that span a large concentration range has been discussed30,31 and is confirmed again in this study, which highlights the necessity for complementary methodologies.

The most abundant proteins in human saliva are amylase, polymeric immunoglobulin receptor, prolactin-inducible protein, albumins, immunoglobulins, transferrin, zinc-alpha-glycoprotein, cystatins, and carbonic anhydrase.32,33 In cows, the distribution is somewhat similar except for the apparent absence of amylase, while short palate, lung and nasal epithelium carcinoma-associated protein 2A (BSP30) and odorant-binding protein are two of the most abundant proteins.16 Our study confirms this key difference while identifying a large number of other salivary proteins (Supplementary Table 1, Supporting Information). In previous studies investigating the salivary proteomes of sheep and goats,34 24 and 23 proteins were identified, respectively. Our study identified all the salivary proteins expressed in sheep and goat with the exception of a nonspecific dipeptidase in goat. While the previous studies used a different approach (2D-PAGE), resulting in a total of only ∼6% similarity with the proteins identified by our approach, the shared expression of those proteins in the salivary proteomes of sheep, goat, and cow could suggest a similar functional role in ruminants.

Figure 4. Functional classification of the 396 proteins identified from the nontargeted approach (SDS-PAGE, Off-gel, RP-HPLC, and SCX-HPLC prefractionation). Classification is based on the GeneOntology molecular function annotation using the DAVID Functional Annotation Clustering tool (http://david.abcc.ncifcrf.gov/).

Using the DAVID Functional Annotation Clustering tool (http://david.abcc.ncifcrf.gov/)35 incorporating the GeneOntology molecular function annotation (GOTERM_MF2), a total of 161 of the 396 proteins were annotated. The major molecular features included proteins consistent with salivary functions (Figure 4), such as those involved in protein binding (31%).
This list includes proteins such as polymeric immunoglobulin receptor, ovostatin, and prolactin that play an important role in host defense; proteins consistent with hydrolase activity (12.5%), such as lactotransferrin, lysozyme, and cathepsin, that have proteolytic and antimicrobial activities; proteins with known enzyme inhibitor activity (7.5%), such as cystatins and serpin peptidase inhibitors, that are important in host protection and regulatory functions; lipid binding proteins (5.3%), such as secretoglobin and fatty acid binding proteins, that are important in steroid binding and energy metabolism; carbohydrate binding proteins (2.2%), such as galactoside binding protein and peptidoglycan recognition protein, that are involved in innate immunity; and proteins with lyase activity (1.9%), of which carbonic anhydrase 6 is one of the most important for the buffering ability of saliva.

The subcellular localization of all the identified proteins was then predicted using CELLO [http://cello.life.nctu.edu.tw], which makes predictions based on a two-level support vector machine system.36 Of the 396 proteins identified, 157 were predicted to be extracellular or associated with the plasma membrane. As reviewed by Loo et al.,37 saliva has been extensively used in biomarker discovery studies, and extracellular proteins in saliva are exceptionally useful as biomarker proteins. In contrast, the use of similar biomarker proteins in plasma may be problematic because some proteins are highly abundant in plasma, which obscures the detection of less abundant but potentially more interesting proteins. Comparative analysis of the human salivary proteome and the human plasma proteome also suggests that extracellular proteins are more predominant in the salivary proteome.38 The large proportion (∼39.6%) of predicted extracellular or secreted proteins identified in this study further illustrates the potential of proteins in bovine saliva as targets that can be measured by noninvasive assays.

Targeted Approach

The targeted approach essentially uses information available in the public domain to create a list of peptides for identification on the high mass accuracy and resolution Orbitrap mass spectrometer. This provides a rapid way of selecting and identifying a selected subset of proteins and the procedure has been shown to be at least 4-fold more efficient at detecting peptides of interest as compared to the data dependent shotgun approach with the sensitivity of this method similar to MRM on a triple quadrupole based system.23 Using this method, 77 proteins were identified based on a targeted 1% FDR in the first pass. After the first pass screen, peptides that were identified more than 50 times were 5064

dx.doi.org/10.1021/pr200516d |J. Proteome Res. 2011, 10, 5059–5069

Journal of Proteome Research

ARTICLE

Figure 5. Comparing the percentage of unique and overlapping proteins between the targeted approach and each of the four prefractionation methods in the nontargeted approach. Numbers in parentheses are the total number of proteins identified with the associated prefractionation method, and the actual number of number of proteins identified using the targeted approach are reflected at top of bar.

removed. Because of the isolation mass window on the LTQ, masses within ±1 amu of the removed precursor mass were also excluded to prevent retriggering of MS/MS on these abundant peptides. Alternatively, tightening the m/z tolerance required to trigger an MS/MS scan would circumvent this problem but might result in loss of information.23 Through this refiltered inclusion list, we were able to identify an additional 12 proteins, bringing the total number of proteins identified to 89, or a success rate of 12.8%. In the targeted approach, we identified 6 additional proteins that were not previously identified in any of the nontargeted prefractionation approaches described above. The 89 identified proteins constitute 22.4% of the total proteins identified in the nontargeted approach (i.e., multidimensional prefractionation) while using only 4.6% of the total time (i.e., 14 LC-MS experiments vs 300 LC-MS experiments, in duplicate). Comparing the targeted approach with each of the four prefractionation methodologies, a significantly higher number of unique proteins was identified using the targeted approach versus RP-HPLC and Off-gel prefractionation than versus SDS-PAGE and SCX-HPLC prefractionation (Figure 5). This could be a result of the overall number of proteins identified with each prefractionation methodology, but it importantly shows the utility of the targeted approach in identifying proteins that were otherwise not seen with the prefractionation methodologies. At first glance, the success rate appears to be low (12.8%), but it is within expectation since secreted proteins present in the saliva constitute only a small subset of all secreted proteins in Bos taurus.
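The two-pass refinement of the inclusion list described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Target` structure, the function name `refine_inclusion_list`, the peptide names, and the example m/z values are all hypothetical and are not taken from the study. The ±1 window is applied here directly to the stored precursor m/z value, which is also an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    """One inclusion-list entry (hypothetical structure, for illustration)."""
    peptide: str
    mz: float  # precursor m/z placed on the inclusion list


def refine_inclusion_list(targets, hit_counts, max_hits=50, window=1.0):
    """Return (kept, abundant, neighbours) after the second-pass filtering.

    abundant   - peptides identified more than `max_hits` times in pass one
    neighbours - remaining entries within +/- `window` of an abundant precursor
    kept       - entries that stay on the refiltered inclusion list
    """
    abundant = [t for t in targets if hit_counts.get(t.peptide, 0) > max_hits]
    abundant_set = set(abundant)
    kept, neighbours = [], []
    for t in targets:
        if t in abundant_set:
            continue
        # Exclude masses close to a removed precursor to avoid retriggering MS/MS.
        if any(abs(t.mz - a.mz) <= window for a in abundant):
            neighbours.append(t)
        else:
            kept.append(t)
    return kept, abundant, neighbours


# Hypothetical first-pass data: peptide names and m/z values are made up.
targets = [
    Target("PEPTIDEA", 500.25),
    Target("PEPTIDEB", 500.90),
    Target("PEPTIDEC", 742.40),
    Target("PEPTIDED", 612.80),
]
hit_counts = {"PEPTIDEA": 120, "PEPTIDEC": 3}

kept, abundant, neighbours = refine_inclusion_list(targets, hit_counts)
print("kept:", [t.peptide for t in kept])
print("removed (abundant):", [t.peptide for t in abundant])
print("removed (within window):", [t.peptide for t in neighbours])
```

In this toy list, the peptide seen 120 times is removed, its close neighbour at m/z 500.90 is excluded with it, and the other two entries remain for the second pass.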
To increase the success of the targeted approach, publicly available databases for other organisms (e.g., Human Salivary Proteome Knowledge Base, http://www.skb.ucla.edu/cgi-bin/spkbcgi-bin/main.cgi) could be used to select proteotypic peptides of the bovine equivalents, considering that the list of predicted secreted proteins was based on the small Uniprot database (5803 reviewed and 16 205 unreviewed Bos taurus proteins as of Dec 2010). Alternatively, larger databases (e.g., Bovine IPI database) could be used to rapidly screen large lists of candidate proteins, making this a good alternative/complementary approach for protein profiling.

Hydrazide based glycocapture for N-linked glycoproteins

The hydrazide coupling methodology25 enabled us to identify a total of 83 formerly glycosylated peptides from 45 unique N-linked glycoproteins (Table 1, Supplementary Table 3, Supporting Information). These proteins were identified using a combination of CID-only, ETD-only and alternate CID/ETD induced fragmentation, with the last two fragmentation modes applied to identify large peptides that were originally undetected/uncleaved, presumably due to their association with complex glycans that contain the negatively charged sialic acid moiety. We further used two different search algorithms, Mascot39 and ZCore,40 for added confidence of identification. Of the 45 identified N-linked glycoproteins (with consensus NXS/T motif), only 4 have been experimentally determined in the literature.41–45 We further compared the identified N-linked glycoproteins with in silico predictions from NetNGlyc (http://www.cbs.dtu.dk/services/NetNGlyc/) and found 15 N-linked glycoproteins that were not predicted by the in silico prediction tool. For 21 of the identified proteins, the number of identified N-linked sites is less than the number of predicted sites, which could be attributed to the accuracy of the predictor,46 to transient or altered glycosylation,47,48 or simply to the relatively lower levels of these proteins in saliva. The concept of using the complementary nature of ETD and CID is not new, as it provides a higher probability of identification for peptides due to their different modes of fragmentation, thereby increasing MS/MS scan success for precursor ions.49 ETD induced fragmentation works best when peptides are triply or more highly charged,50 as also seen in this study, where 36 peptides (both formerly glycosylated and not) were present with charge states of 4+ or higher (7 peptides carried 5+ charges).
Of the formerly glycosylated peptides, 2 peptides were identified using ETD with 4+ charges (EALHNDQDHFnLTTGVFTcTIPGVYR and LRnLSSPLGLmAVNQEAWDHGLAYLPFNNK) and 2 peptides


Table 1. Identified Glycoproteins Using Glycocapture and PNGaseF Release

IPI acc number

protein

predicted pos - Uniprota

predicted pos - NetNGlyca

identified NxS/T site

IPI00711254

SCGB2A2 protein

None

None

1 (71)

IPI00852509

Putative uncharacterized protein

None

None

1 (74)

IPI00696625

MSMB protein

None

None

1 (30)

IPI00697101

Prolactin-inducible protein homologue

1 (106)

IPI00702243

Similar to Equ c1 isoform 1

4 (52, 67, 114, 125)

2 (67, 114)

IPI00839296

9 kDa protein

1 (69)

1 (69)

IPI00704889 IPI00696714

Kallikrein 1 Isoform Long of Polymeric

3 (102, 108, 254)

2 (102, 254) 3 (83, 420, 468)

IPI00722909

Similar to odorant binding protein

1 (63)

1 (63)

IPI00710664

Lactotransferrin

4 (252, 387, 495, 564)41

3 (252, 495, 564)

IPI00688247

Carbonic anhydrase 6

2 (62, 251)

2 (62, 251)

IPI00708343

Similar to salivary androgen-

1 (44)

1 (44)

IPI00720889

Similar to vitelline membrane outer layer 1

(high homology to IgA) 1 (106)

3 (83, 420, 468)

immunoglobulin receptor

binding protein beta subunit None

1 (116)

IPI00707101

Alpha-2-HS-glycoprotein

3 (99, 156, 176)42,52

IPI00701295

Immunoglobulin J chain

None

1 (71)

IPI00700655

Putative uncharacterized

None

1 (118)

IPI00711612

Bactericidal/permeability-

None

1 (52)

1 (156)

protein MGC137211 increasing protein-like 1 IPI00709683 IPI00718725

Prostaglandin-H2 D-isomerase 52 kDa protein

2 (51, 78)

IPI00697935

Pantetheinase

6 (39, 87, 147, 201, 316, 354)

IPI00842115

Hypothetical protein

IPI00715999

Cysteine-rich secretory protein 3

None

IPI00716157

Lactoperoxidase

5 (106, 212, 322, 358, 449)

IPI00907026

Similar to submaxillary apomucin

4 (77, 248, 423, 468)

2 (332, 468)

IPI00694309

Tetraspanin-1

6 (141, 154, 167, 180, 189, 194)

1 (141)

IPI00867097 IPI00904149

LYPD5 protein 163 kDa protein

6 (55, 70, 859, 983, 1395, 1415)

1 (182) 2 (983, 1415)

IPI00694607

Deoxyribonuclease-1

1 (40)

1 (40)

IPI00707884

Beta-2-glycoprotein 1

5 (92, 162, 183, 193, 253)43,44

1 (92)

IPI00695489

Alpha-1-antiproteinase

4 (68, 105, 143, 269)

1 (105)

IPI00690198

Hemopexin

3 (188, 218, 241)

1 (218)

IPI00705491

Haptoglobin

2 (286, 316)

IPI00825356

Similar to desmoglein 3

IPI00704432 IPI00713693

Ovostatin 2 Isoform 2B of Desmocollin-2 (Fragment)

IPI00701751

Similar to Fc fragment of IgG

5 (70, 236, 283, 305, 466)

2 (51, 78) 2 (283, 466) 2 (87, 201)

3 (93, 252, 302)

2 (93, 373) 1 (65) 2 (322, 449)

None

1 (286) 4 (109, 179, 458, 744)

None 4 (120, 346, 495, 579)

1 (928) 1 (495) 9 (91, 192, 208, 769, 1620,

binding protein

1742, 2113, 2686, 2786)

IPI00717527

Complement factor B (Fragment)

IPI00713502

Protein

IPI00843310

Desmoglein-1

3 (110, 180, 496)45

IPI00698608

LYPD3 protein

None

IPI00840440 IPI00691212

Similar to golgi membrane protein 1 Alpha-1-acid glycoprotein

IPI00716688

1 (109)

4 (122, 142, 285, 378)

3 (192, 1316, 2786) 1 (122)

None

2 (153, 3810) 1 (110) 2 (178, 185)

None

1 (109) 1 (136)

Lu-ECAM-1

None

1 (75)

IPI00715365

92 kDa protein

5 (99, 163, 275, 407, 583)

1 (163)

IPI00711323

Sushi domain containing 2

None

1 (634)

5 (34, 57, 94, 104, 136)

a Annotation based on Uniprot database’s nonexperimental qualifiers (www.uniprot.org); when a protein was not in the database, the NetNGlyc N-glycosylation prediction server was used (http://www.cbs.dtu.dk/services/NetNGlyc/).
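The comparison between predicted and identified NxS/T sites summarized in Table 1 amounts to simple set operations. The sketch below is a hedged illustration: the three entries are transcribed from Table 1 as we read them and should be treated as illustrative, and the names `TABLE1_SUBSET` and `compare_sites` are hypothetical.

```python
# Site positions for three proteins, transcribed from Table 1 as
# (predicted sites, identified sites). Treat the values as illustrative.
TABLE1_SUBSET = {
    "Similar to Equ c1 isoform 1": ({52, 67, 114, 125}, {67, 114}),
    "Kallikrein 1": ({102, 108, 254}, {102, 254}),
    "Carbonic anhydrase 6": ({62, 251}, {62, 251}),
}


def compare_sites(predicted, identified):
    """Split sites into confirmed, predicted-only and identified-only groups."""
    return {
        "confirmed": sorted(predicted & identified),
        "predicted_only": sorted(predicted - identified),
        "identified_only": sorted(identified - predicted),
    }


for protein, (predicted, identified) in TABLE1_SUBSET.items():
    result = compare_sites(predicted, identified)
    print(protein, result)
```

A protein with a non-empty `predicted_only` list is one where fewer sites were observed than predicted, the situation reported for 21 of the identified proteins.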


with 3+ charges (FLRFDPVTGEVnSTYPR and NENFnFTEHLK). A further 41 formerly glycosylated peptides were identified concurrently with CID and ETD induced fragmentation. CID induced fragmentation, on the other hand, is best suited for lower charged peptides, including those with 2+ and 3+ charges, which in our case comprised the bulk of the peptides identified (>85%) with the glycocapture methodology. Eight N-glycoproteins (Tetraspanin-1, Beta-2-glycoprotein 1, Prostaglandin-H2 D-isomerase, Sushi domain containing 2, 92 kDa protein, Lu-ECAM-1, similar to golgi membrane protein 1 and LYPD5 protein) were not identified by either the targeted or the nontargeted approaches. Their homologues were also not seen in the SPKB database (http://hspp.dent.ucla.edu/cgi-bin/spkbcgibin/main.cgi), which is likely a result of selective enrichment, highlighting the potential for using this enrichment strategy to significantly decrease the sample complexity. There has been, to the best of our knowledge, no study on the complete N-glycoprotein profile of bovine saliva. The majority of published N-linked saliva glycoprotein profiling work has been performed on whole human mixed saliva from the parotid, submandibular, and sublingual glandular fluids, where 45 and 77 N-linked glycoproteins were previously identified, respectively.32,51 By comparing the bovine N-linked glycoproteins in this study with the N-linked glycoproteins identified in human whole saliva, our study revealed 16 similar N-linked glycoproteins that may be associated with host-defense or nutrient binding functions. Information pertaining to the salivary proteome of cattle and other agriculturally important animals such as sheep and pigs is currently limited. The results from the present study, when combined with information derived from the numerous human studies, may be useful in drawing important correlations, particularly in areas of disease prediction.
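The NXS/T consensus motif used above to assign formerly glycosylated sites can be checked programmatically. The following is a minimal sketch, assuming that the lowercase n in the reported peptide sequences marks the formerly glycosylated asparagine and using the common convention that X is any residue except proline (the text itself only states NXS/T). The function names are hypothetical.

```python
import re

# N-X-S/T sequon with X != P. The lookahead keeps overlapping sequons detectable.
SEQUON = re.compile(r"N(?=[^P][ST])")


def sequon_positions(sequence):
    """Return 1-based positions of Asn residues that start an N-X-S/T sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(sequence.upper())]


def marked_sites_in_sequon(peptide, mark="n"):
    """True if the peptide has marked residues and each one lies in a sequon."""
    marked = [i + 1 for i, aa in enumerate(peptide) if aa == mark]
    sequons = set(sequon_positions(peptide))
    return bool(marked) and all(pos in sequons for pos in marked)


# The four peptides reported in the text as identified with ETD.
PEPTIDES = [
    "EALHNDQDHFnLTTGVFTcTIPGVYR",
    "LRnLSSPLGLmAVNQEAWDHGLAYLPFNNK",
    "FLRFDPVTGEVnSTYPR",
    "NENFnFTEHLK",
]

for pep in PEPTIDES:
    print(pep, sequon_positions(pep), marked_sites_in_sequon(pep))
```

Positions are counted within each peptide, not within the parent protein, so they are not directly comparable to the site numbers in Table 1.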

CONCLUSION

The primary goal of this study was to use three different strategies (nontargeted multidimensional fractionation, targeted, and glycocapture enrichment) to investigate the salivary proteome and N-linked glycoproteome of the cow. Using these three strategies, we were able to perform a global survey of the proteome and N-linked glycoproteins of bovine saliva. To date, this is the largest discovery and protein profiling study of bovine saliva reported in the literature. Identifying the molecular components in the cow’s saliva gives us a deeper understanding of what is present and also allows us to explore molecular interactions involving these components. Alone, this information is not sufficient to understand the causes of diseases or to identify biomarkers for selection of dairy and beef cattle for important economic traits such as high feed conversion efficiency or low methane production. Quantitative information on the protein response to the various external (environmental, pharmacological) and internal (genetic, physiological) variations is therefore needed and is now the subject of further investigation through multiplex quantitative assays based on multiple reaction monitoring mass spectrometry. This work thus provides the preparation, separation and mass spectrometry detection basis for multiplex quantitative analysis of biomarker proteins from bovine saliva.

ASSOCIATED CONTENT

Supporting Information

Supplementary Table 1: Summary of identified proteins from the nontargeted strategy (SDS-PAGE, Off-gel fractionation, RP-HPLC and SCX-HPLC). The separate worksheets include a breakdown of identified peptides and information pertaining to their positive identification (e.g., Mascot ion scores, exp value, δppm etc.) from each prefractionation strategy. Supplementary Table 2: List of proteins selected for the targeted experiment. The separate worksheets contain the first and second inclusion lists and a breakdown of identified peptides and information pertaining to their positive identification (e.g., Mascot ion scores, exp value, δppm etc.). Supplementary Table 3: Summary of identified proteins from the glycocapture strategy. The separate worksheets include a breakdown of identified peptides and information pertaining to their positive identification (e.g., Mascot ion scores, exp value, ZCore scores, δppm etc.) from CID only data, CID-ETD data and ETD data. LC-MS/MS data (Thermo .raw files) will be made available upon request to the authors. This material is available free of charge via the Internet at http://pubs.acs.org.

AUTHOR INFORMATION

Corresponding Author

*Matthew McDonagh, Department of Primary Industries Victoria, Biosciences Research Division, 1 Park Drive, Bundoora 3083, Australia. Tel: +61 3 9742 0434. Fax: +61 3 9742 8700. E-mail: [email protected].

ACKNOWLEDGMENT

The authors would like to thank the Gardiner Foundation for funding this research under grant INN-10-036 “Improving feed conversion efficiency and lifetime profitability of the Australian dairy herd through genetic markers and biomarkers”. We acknowledge the staff at the DPI Ellinbank Dairy Research facility for their assistance in preparation of animals for saliva collection for this experiment. We would also like to thank Karen Olivia of Mercy Hospital for use of the Off-Gel fractionator and Ben Hayes for critically reading the manuscript.

REFERENCES

(1) Bailey, C. B.; Balch, C. C. Saliva secretion and its relation to feeding in cattle. 2. The composition and rate of secretion of mixed saliva in the cow during rest. Br. J. Nutr. 1961, 15, 383–402.
(2) Bailey, C. B. Saliva secretion and its relation to feeding in cattle. 4. The relationship between the concentrations of sodium, potassium, chloride and inorganic phosphate in mixed saliva and rumen fluid. Br. J. Nutr. 1961, 15, 489–98.
(3) Wu, A. J.; Atkinson, J. C.; Fox, P. C.; Baum, B. J.; Ship, J. A. Cross-sectional and longitudinal analyses of stimulated parotid salivary constituents in healthy, different-aged subjects. J. Gerontol. 1993, 48 (5), M219–24.
(4) George, J. R.; Fitchen, J. H. Future applications of oral fluid specimen technology. Am. J. Med. 1997, 102 (4A), 21–5.
(5) Amado, F. M.; Vitorino, R. M.; Domingues, P. M.; Lobo, M. J.; Duarte, J. A. Analysis of the human saliva proteome. Expert Rev. Proteomics 2005, 2 (4), 521–39.
(6) Hu, S.; Loo, J. A.; Wong, D. T. Human saliva proteome analysis. Ann. N.Y. Acad. Sci. 2007, 1098, 323–9.
(7) Kim, K.; Kim, Y. Preparing multiple-reaction monitoring for quantitative clinical proteomics. Expert Rev. Proteomics 2009, 6 (3), 225–9.
(8) Kaufman, E.; Lamster, I. B. The diagnostic applications of saliva-a review. Crit. Rev. Oral Biol. Med. 2002, 13 (2), 197–212.
(9) Andersson, L. Genetic dissection of phenotypic diversity in farm animals. Nat. Rev. Genet. 2001, 2 (2), 130–8.


(10) Goddard, M. E.; Hayes, B. J. Mapping genes for complex traits in domestic animals and their use in breeding programmes. Nat. Rev. Genet. 2009, 10 (6), 381–91.
(11) Firkins, J. L.; Yu, Z.; Morrison, M. Ruminal nitrogen metabolism: perspectives for integration of microbiology and nutrition for dairy. J. Dairy Sci. 2007, 90 (Suppl 1), E1–16.
(12) Chalupa, W. Manipulating Rumen Fermentation. J. Anim. Sci. 1977, 45, 585–99.
(13) Hegarty, R. Reducing rumen methane emissions through elimination of rumen protozoa. Aust. J. Agric. Res. 1999, 1321–7.
(14) Russell, J. B.; Rychlik, J. L. Factors that alter rumen microbial ecology. Science 2001, 292 (5519), 1119–22.
(15) Wright, A. D.; Kennedy, P.; O’Neill, C. J.; Toovey, A. F.; Popovski, S.; Rea, S. M.; Pimm, C. L.; Klein, L. Reducing methane emissions in sheep by immunization against rumen methanogens. Vaccine 2004, 22 (29–30), 3976–85.
(16) McLaren, R. D.; McIntosh, J. T.; Howe, G. W. The purification and characterization of bovine salivary proteins by electrophoretic procedures. Electrophoresis 1987, 8 (7), 318–24.
(17) Rajan, G. H.; Morris, C. A.; Carruthers, V. R.; Wilkins, R. J.; Wheeler, T. T. The relative abundance of a salivary protein, bSP30, is correlated with susceptibility to bloat in cattle herds selected for high or low bloat susceptibility. Anim. Genet. 1996, 27 (6), 407–14.
(18) Righetti, P. G.; Castagna, A.; Antonioli, P.; Boschetti, E. Prefractionation techniques in proteome analysis: the mining tools of the third millennium. Electrophoresis 2005, 26 (2), 297–319.
(19) Wolff, S.; Otto, A.; Albrecht, D.; Zeng, J. S.; Buttner, K.; Gluckmann, M.; Hecker, M.; Becher, D. Gel-free and gel-based proteomics in Bacillus subtilis: a comparative study. Mol. Cell. Proteomics 2006, 5 (7), 1183–92.
(20) Jones, W. T.; Broadhurst, R. B.; Gurnsey, M. P.; Gurusinghe, C. J.; Birtles, M. J. Identification of gland sources of bovine salivary proteins. N. Z. J. Agric. Res. 1986, 29, 659–66.
(21) Ang, C. S.; Veith, P. D.; Dashper, S. G.; Reynolds, E. C. Application of 16O/18O reverse proteolytic labeling to determine the effect of biofilm culture on the cell envelope proteome of Porphyromonas gingivalis W50. Proteomics 2008, 8 (8), 1645–60.
(22) Ang, C. S.; Rothacker, J.; Patsiouras, H.; Gibbs, P.; Burgess, A. W.; Nice, E. C. Use of multiple reaction monitoring for multiplex analysis of colorectal cancer-associated proteins in human feces. Electrophoresis 2011, 32 (15), 1926–38.
(23) Jaffe, J. D.; Keshishian, H.; Chang, B.; Addona, T. A.; Gillette, M. A.; Carr, S. A. Accurate inclusion mass screening: a bridge from unbiased discovery to targeted assay development for biomarker verification. Mol. Cell. Proteomics 2008, 7 (10), 1952–62.
(24) Mallick, P.; Schirle, M.; Chen, S. S.; Flory, M. R.; Lee, H.; Martin, D.; Ranish, J.; Raught, B.; Schmitt, R.; Werner, T.; Kuster, B.; Aebersold, R. Computational prediction of proteotypic peptides for quantitative proteomics. Nat. Biotechnol. 2007, 25 (1), 125–31.
(25) Zhang, H.; Li, X. J.; Martin, D. B.; Aebersold, R. Identification and quantification of N-linked glycoproteins using hydrazide chemistry, stable isotope labeling and mass spectrometry. Nat. Biotechnol. 2003, 21 (6), 660–6.
(26) Berven, F. S.; Ahmad, R.; Clauser, K. R.; Carr, S. A. Optimizing performance of glycopeptide capture for plasma proteomics. J. Proteome Res. 2010, 9 (4), 1706–15.
(27) Ang, C. S.; Rothacker, J.; Patsiouras, H.; Burgess, A. W.; Nice, E. C. Murine fecal proteomics: A model system for the detection of potential biomarkers for colorectal cancer. J. Chromatogr., A 2010, 1217 (15), 3330–40.
(28) Ros, A.; Faupel, M.; Mees, H.; Oostrum, J.; Ferrigno, R.; Reymond, F.; Michel, P.; Rossier, J. S.; Girault, H. H. Protein purification by Off-Gel electrophoresis. Proteomics 2002, 2 (2), 151–6.
(29) Ishihama, Y.; Oda, Y.; Tabata, T.; Sato, T.; Nagasu, T.; Rappsilber, J.; Mann, M. Exponentially modified protein abundance index (emPAI) for estimation of absolute protein amount in proteomics by the number of sequenced peptides per protein. Mol. Cell. Proteomics 2005, 4 (9), 1265–72.


(30) Corthals, G. L.; Wasinger, V. C.; Hochstrasser, D. F.; Sanchez, J. C. The dynamic range of protein expression: a challenge for proteomic research. Electrophoresis 2000, 21 (6), 1104–15.
(31) Wu, L.; Han, D. K. Overcoming the dynamic range problem in mass spectrometry-based shotgun proteomics. Expert Rev. Proteomics 2006, 3 (6), 611–9.
(32) Ramachandran, P.; Boontheung, P.; Xie, Y.; Sondej, M.; Wong, D. T.; Loo, J. A. Identification of N-linked glycoproteins in human saliva by glycoprotein capture and mass spectrometry. J. Proteome Res. 2006, 5 (6), 1493–503.
(33) Hu, S.; Xie, Y.; Ramachandran, P.; Ogorzalek Loo, R. R.; Li, Y.; Loo, J. A.; Wong, D. T. Large-scale identification of proteins in human salivary proteome by liquid chromatography/mass spectrometry and two-dimensional gel electrophoresis-mass spectrometry. Proteomics 2005, 5 (6), 1714–28.
(34) Gutierrez, A. M.; Miller, I.; Hummel, K.; Nobauer, K.; Martinez-Subiela, S.; Razzazi-Fazeli, E.; Gemeiner, M.; Ceron, J. J. Proteomic analysis of porcine saliva. Vet. J. 2011, 187 (3), 356–62.
(35) Huang, D. W.; Sherman, B. T.; Lempicki, R. A. Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat. Protoc. 2008, 4 (1), 44–57.
(36) Yu, C. S.; Chen, Y. C.; Lu, C. H.; Hwang, J. K. Prediction of protein subcellular localization. Proteins 2006, 64 (3), 643–51.
(37) Loo, J. A.; Yan, W.; Ramachandran, P.; Wong, D. T. Comparative human salivary and plasma proteomes. J. Dent. Res. 2010, 89 (10), 1016–23.
(38) Yan, W.; Apweiler, R.; Balgley, B. M.; Boontheung, P.; Bundy, J. L.; Cargile, B. J.; Cole, S.; Fang, X.; Gonzalez-Begne, M.; Griffin, T. J.; Hagen, F.; Hu, S.; Wolinsky, L. E.; Lee, C. S.; Malamud, D.; Melvin, J. E.; Menon, R.; Mueller, M.; Qiao, R.; Rhodus, N. L.; Sevinsky, J. R.; States, D.; Stephenson, J. L.; Than, S.; Yates, J. R.; Yu, W.; Xie, H.; Xie, Y.; Omenn, G. S.; Loo, J. A.; Wong, D. T. Systematic comparison of the human saliva and plasma proteomes. Proteomics Clin. Appl. 2009, 3 (1), 116–34.
(39) Perkins, D. N.; Pappin, D. J.; Creasy, D. M.; Cottrell, J. S. Probability-based protein identification by searching sequence databases using mass spectrometry data. Electrophoresis 1999, 20 (18), 3551–67.
(40) Sadygov, R.; Zingaretti, G.; Shofstahl, J. Validating Database Search Results of ETD Spectra. Am. Soc. Mass Spectrom. 2007, MPK194.
(41) Lopez, M.; Coddeville, B.; Langridge, J.; Plancke, Y.; Sautiere, P.; Chaabihi, H.; Chirat, F.; Harduin-Lepers, A.; Cerutti, M.; Verbert, A.; Delannoy, P. Microheterogeneity of the oligosaccharides carried by the recombinant bovine lactoferrin expressed in Mamestra brassicae cells. Glycobiology 1997, 7 (5), 635–51.
(42) Yet, M. G.; Chin, C. C.; Wold, F. The covalent structure of individual N-linked glycopeptides from ovomucoid and asialofetuin. J. Biol. Chem. 1988, 263 (1), 111–7.
(43) Bendixen, E.; Halkier, T.; Magnusson, S.; Sottrup-Jensen, L.; Kristensen, T. Complete primary structure of bovine beta 2-glycoprotein I: localization of the disulfide bridges. Biochemistry 1992, 31 (14), 3611–7.
(44) Kato, H.; Enjyoji, K. Amino acid sequence and location of the disulfide bonds in bovine beta 2 glycoprotein I: the presence of five Sushi domains. Biochemistry 1991, 30 (50), 11687–94.
(45) Koch, P. J.; Walsh, M. J.; Schmelz, M.; Goldschmidt, M. D.; Zimbelmann, R.; Franke, W. W. Identification of desmoglein, a constitutive desmosomal glycoprotein, as a member of the cadherin family of cell adhesion molecules. Eur. J. Cell Biol. 1990, 53 (1), 1–12.
(46) Hamby, S. E.; Hirst, J. D. Prediction of glycosylation sites using random forests. BMC Bioinform. 2008, 9, 500.
(47) Reis, C. A.; Osorio, H.; Silva, L.; Gomes, C.; David, L. Alterations in glycosylation as biomarkers for cancer detection. J. Clin. Pathol. 2010, 63 (4), 322–9.
(48) Seo, J.; Lee, K. J. Post-translational modifications and their biological functions: proteomic analysis and systematic approaches. J. Biochem. Mol. Biol. 2004, 37 (1), 35–44.
(49) Swaney, D. L.; McAlister, G. C.; Coon, J. J. Decision tree-driven tandem mass spectrometry for shotgun proteomics. Nat. Methods 2008, 5 (11), 959–64.


(50) Syka, J. E.; Coon, J. J.; Schroeder, M. J.; Shabanowitz, J.; Hunt, D. F. Peptide and protein sequence analysis by electron transfer dissociation mass spectrometry. Proc. Natl. Acad. Sci. U.S.A. 2004, 101 (26), 9528–33.
(51) Sondej, M.; Denny, P. A.; Xie, Y.; Ramachandran, P.; Si, Y.; Takashima, J.; Shi, W.; Wong, D. T.; Loo, J. A.; Denny, P. C. Glycoprofiling of the Human Salivary Proteome. Clin. Proteomics 2009, 5 (1), 52–68.
(52) Hagglund, P.; Bunkenborg, J.; Elortza, F.; Jensen, O. N.; Roepstorff, P. A new strategy for identification of N-glycosylated proteins and unambiguous assignment of their glycosylation sites using HILIC enrichment and partial deglycosylation. J. Proteome Res. 2004, 3 (3), 556–66.
