ARTICLE pubs.acs.org/jpr

Novel Comprehensive Approach for Accessible Biomarker Identification and Absolute Quantification from Precious Human Tissues

Andrei Turtoi,†,‡ Bruno Dumont,† Yannick Greffe,† Arnaud Blomme,† Gabriel Mazzucchelli,‡ Philippe Delvenne,§ Eugene Nzaramba Mutijima,§ Eric Lifrange,|| Edwin De Pauw,‡ and Vincent Castronovo*,†

† Metastasis Research Laboratory, GIGA Cancer, University of Liege, Bat. B23, Liege, B-4000 Liege, Belgium
‡ GIGA Systems Biology and Chemical Biology, Laboratory of Mass Spectrometry, Department of Chemistry, University of Liege, Bat. B6c, B-4000 Liege, Belgium
§ Faculty of Medicine, Department of Anatomy and Pathology, University of Liege, B-4000 Liege, Belgium
|| Department of Senology, University Hospital (CHU), University of Liege, B-4000 Liege, Belgium

Supporting Information

ABSTRACT: The identification of specific biomarkers obtained directly from human pathological lesions remains a major challenge, because the amount of tissue available is often very limited. We have developed a novel, comprehensive, and efficient method permitting the identification and absolute quantification of potentially accessible proteins in such precious samples. This protein subclass comprises cell membrane associated and extracellular proteins, which are reachable by systemically deliverable substances and hence especially suitable for diagnosis and targeted therapy applications. To isolate such proteins, we exploited the ability of chemically modified biotin to label ex vivo accessible proteins and the fact that most of these proteins are glycosylated. The approach consists of three successive steps: first, potentially accessible proteins are linked to biotin molecules and purified; second, the remaining proteins are subjected to glycopeptide isolation; finally, the nonglycosylated peptides are analyzed and used in an in silico method that increases the confident identification of glycoproteins. The value of the technique was demonstrated on human breast cancer tissue samples originating from 5 individuals. Altogether, the method delivered quantitative data on more than 400 potentially accessible proteins (per sample and replicate). In comparison to biotinylation or glycoprotein analysis alone, the sequential method significantly increased the number (≥30% and ≥50%, respectively) of potentially therapeutically and diagnostically valuable proteins. The sequential method led to the identification of 93 differentially modulated proteins, among which several had not been reported to be associated with breast cancer. One of these novel potential biomarkers was CD276, a cell membrane-associated glycoprotein. The immunohistochemistry analysis showed that CD276 is significantly differentially expressed in a series of breast cancer lesions. Because our technology is applicable to any type of tissue biopsy, it has the potential to accelerate the discovery of new relevant biomarkers in a broad spectrum of pathologies.

KEYWORDS: proteomics, accessible proteins, glycoproteins, cancer biomarkers

1. INTRODUCTION

A major step in many aspects of research related to malignant diseases is the identification of specific and sensitive biomarkers suitable for the development of effective diagnostic, prognostic and therapeutic modalities. Nowadays owing to mass spectrometry, shotgun proteomics and DNA/RNA microarray analyses, the list of reported potential tumor biomarkers is increasing rapidly. Despite this abundance, very few of such modulated proteins have found their way into the clinical validation phase and even fewer are used as reliable therapeutic targets or diagnostic markers.1–4 We and others believe that one of the most promising ways of overcoming this difficulty is narrowing down the number of proteins investigated to the essential group of interest. Modulated proteins accessible from the bloodstream are an example of such a group. These proteins are mostly membrane based or embedded in the surrounding extracellular matrix. They are of particular interest because they have increased potential to be reached by systemically delivered monoclonal antibodies loaded with pharmacological compounds.

Received: March 6, 2011
Published: May 03, 2011
© 2011 American Chemical Society

dx.doi.org/10.1021/pr200212r | J. Proteome Res. 2011, 10, 3160–3182

A frequent limitation in the identification of potential biomarkers is the scarcity of the tissues from which they need to be recovered. This is particularly true of human pathological tissues, such as cancer lesions, which are available for research only in very small amounts, making these samples very precious. Currently available proteomic methods that enable the analysis of potentially accessible proteins from such limited quantities leave substantial room for improvement. The available techniques attempt to tackle the problem primarily by exploiting the physical location of the protein of interest5 and are to a lesser extent focused on their chemical properties. To this end, the use of chemically modified biotin that labels accessible proteins through their free amino groups combined with streptavidin affinity chromatography represents a powerful method.6 Nevertheless, accessible proteins that do not bear such free amino groups will escape the inventory. To try to include a more comprehensive set of accessible proteins, we decided to exploit the known fact that most membrane and extracellular proteins are glycoproteins.7 Hence, the analysis of glycoproteins in tandem with the biotinylation method would offer a real possibility of covering a supplementary segment of accessible, but previously unidentified, proteins. 
Enrichment techniques for glycoproteins have been developed by employing lectin column affinity purification combined with concanavalin A.8 An alternative method is the hydrazide capture of preliminarily oxidized glycans, which appears to be more reproducible and less biased toward certain subsets of the glycoproteome in comparison to the lectin technique.9–11 This approach has already been applied to cell lines12 and recently also to normal tissue samples.13 Along these lines, we have developed an original method that combines three sequential steps consisting of protein biotinylation and glycopeptide analysis, followed by the sequencing of peptides that are neither glycosylated nor biotinylated (rest-proteins). The latter helps to assign those glycopeptides that cannot be related with enough confidence to a specific protein (see Figure 2). Additionally, we show that the rest-fraction also contains a number of unique and potentially accessible proteins, making this step an integral part of the approach. The method was designed such that a minimal amount of material or information is wasted. To validate our approach, we included several internal standards and used a suitable MS analysis technique allowing absolute, accurate and repeatable quantification of proteins found in the sample. The proof of concept was demonstrated on human breast cancer tissues, where known and novel differentially expressed antigens were identified. Altogether, our study provides scientists with a new method to increase the number of identified modulated proteins while maximally exploiting their precious samples, yet focusing on candidates that are potentially accessible and hence relevant tumor biomarkers. 
In this context, it is important to mention that ultimate proof of accessibility necessitates further validation experiments using appropriate in vivo tumor models and intravenous injection of affinity ligands (e.g., antibodies) coupled to suitable imaging agents (fluorescent dyes or radioisotopes).

2. MATERIALS AND METHODS

All the individuals involved in the current work were informed in detail regarding the aims of the study and gave their written consent. The project purpose and the undertaken experiments complied with the regulations and ethical guidelines of the University of Liege, Belgium, and were approved by the ethics committee. The study is divided into two parts: (i) technical (demonstrating repeatability, accuracy and the value of the method) and (ii) biological (proof of concept study). The technical part employed one “master sample”, which was prepared as a pool of equal amounts of all the tissue samples involved in the MS analysis (all individuals, both tumoral and normal specimens). The biological part, specifically the identification of modulated proteins, was conducted using each of the individual samples outlined in Table S1 (Supporting Information) separately. Finally, the validation of a specific differentially expressed protein was conducted on a separate group of breast cancer patients (30 tumoral and 10 normal individuals). All patients involved in the IHC validation study were diagnosed with ductal breast adenocarcinoma, had clinical grades of at least 2 and presented no metastasis at the time of surgery. Regarding the number and type of replicates: (i) all analyses conducted in the technical part involved three full technical replicates (at the level of tissue solubilization), performed with the same tools on separate days and using the “master sample”; (ii) the investigations conducted in the biological proof of concept study are single technical/biological replicates of matched tumoral and normal samples originating from five individuals.

2.1. Tissue Sample Preparation

Pieces of fresh human breast cancer biopsies obtained from the Pathology Department of the University Hospital of Liege, Belgium, were immediately sliced and soaked in freshly prepared EZ-link Sulfo NHS-SS biotin (1 mg/mL in PBS [phosphate buffered saline], Pierce, Rockford, IL) solution and incubated for 20 min (37 °C) as described previously.6 The reaction was stopped by addition of Tris-HCl pH 7.4 (final concentration 50 mM). Tissue samples were then snap-frozen in liquid nitrogen and pulverized using a Mikro-Dismembrator U (Braun Biotech, Melsungen, Germany). Approximately 100 mg of tissue powder was dissolved in 500 μL PBS containing protease inhibitor (PI) cocktail (Halt, Pierce, Rockford, IL), 0.5 mM oxidized glutathione (GSSG) and level 1 internal standard mix (IS1) consisting of bovine fetuin, bovine casein and biotinylated chicken ovalbumin (prepared by incubation of ovalbumin with EZ-link Sulfo NHS-SS biotin reagent); ratio spike/sample 1/200. Homogenates were then sonicated (2 × 30 s) with a 2 mm microprobe and centrifuged at 20 000 g for 10 min (4 °C). Human serum albumin (HSA) and immunoglobulins (IgG) were eliminated using the Qproteome HSA and IgG Removal Kit (Qiagen, Valencia, CA). The remaining pellet was suspended in 500 μL of lysis buffer (1% Nonidet P40 (NP40), 0.5% deoxycholic acid (DOC), 0.1% SDS (sodium dodecyl sulfate), 0.5 mM GSSG and PI cocktail in PBS, pH 7.0), sonicated (2 × 30 s) and centrifuged at 20 000 g for 10 min (4 °C). The sample was subjected to HSA and IgG depletion as described above. Five hundred microliters of 2% SDS solution was added to the remaining insoluble pellet for a final resolubilization. The sample was then centrifuged (as mentioned above) and the supernatant collected. All lysates from the three solubilization steps were finally pooled together and boiled for 5 min to ensure complete denaturation of the proteins.

2.2. Isolation of Biotinylated Proteins

The total protein extract was mixed with 100 μL/mg Streptavidin (SA) resin (Pierce, Rockford, IL, USA) for 120 min under rotational conditions (RT). After the incubation, the supernatant was retained for the subsequent glycoproteomic analysis (fraction 1). The streptavidin beads were washed 4 times with 0.5 mL buffer A (1% NP40, 0.1% SDS and 0.5 mM GSSG in PBS buffer), 4 times with 0.5 mL buffer B (0.1% NP40, 1.5 M NaCl and 0.5 mM GSSG in PBS buffer), 2 times with 0.5 mL buffer C (0.1 M Na2CO3 and 0.5 mM GSSG in PBS buffer at pH 11.0) and finally 2 times with 0.5 mL PBS buffer at pH 7 without GSSG. The biotinylated proteins were eluted 2 times with 0.2 mL of 100 mM dithiothreitol (DTT) and incubated at 60 °C for 30 min (fraction 2). Fraction 1 was also reduced in 100 mM DTT. Both fractions were alkylated with 150 mM iodoacetamide for 30 min in the absence of light. At this stage, the level 2 internal standard (IS2), consisting of bovine beta-lactoglobulin (spike 1/200), was added to both fractions. Proteins were then precipitated in the presence of 20% trichloroacetic acid (TCA) overnight (4 °C) and washed two times with ice-cold acetone. The protein pellets (fractions 1 and 2) were solubilized in 50 μL of 50 mM NH4HCO3 (for fraction 2 only, 1% DOC [deoxycholic acid] was also added) and digested (1:50 protease/protein ratio) overnight using trypsin (Promega, Madison, WI) (37 °C). Following this, the digestion was extended for 4 h by addition of fresh trypsin (1:100). The biotinylated peptides (fraction 2) were further processed using MS. Fraction 1 was used for the isolation of glycopeptides as described below (Figure 1).

Figure 1. Sequential extraction of accessible proteins using tissue samples. The first step consists of biotinylation of accessible proteins and their isolation (BIOT fraction). The second step utilizes protein digestion into peptides followed by the isolation of glycopeptides (GLYCO fraction). The final step also collects the nonglycosylated peptides (REST), using them to complement the sequence information of nonassigned glycopeptides (detailed in Figure 2). Additionally, the REST fraction contained a significant number of accessible proteins, which were allowed to supplement the proteins already identified in the BIOT and GLYCO fractions. Three internal standards (IS1, 2 and 3) were added at different manipulation steps in the method in order to monitor recovery, reproducibility and accuracy of quantification. The composition of the respective internal standard is outlined in the Materials and Methods section. All the fractions were analyzed using the 2D-nanoUPLC–MSE system, which consisted essentially of two C18 phases run at pH 10 and pH 3, respectively.

2.3. Isolation of Glycopeptides

The digested protein sample was acidified with HCl (final concentration 1%), transferred onto the C18 Sep-Pak column (Waters, Milford, MA) and washed with 3 × 1 mL of 0.1% formic acid solution. The peptides were eluted using acetonitrile (80%) and evaporated to dryness. The peptide-containing sample was dissolved in 100 μL of oxidation buffer (100 mM sodium acetate, 150 mM NaCl at pH 5.5), complemented with 10 μL of sodium periodate (100 mM stock solution) (Pierce, Rockford, IL) and incubated for 30 min in the dark. Following this, 10 μL of sodium sulfite (120 mM stock solution) was added and incubation was extended for an additional 10 min (quenching). The sample was adjusted to 200 μL (with oxidation buffer, as described above) and loaded onto hydrazide resin (Bio-Rad, Hercules, CA), and the glycopeptides were bound overnight (RT). The glycopeptide-free flow-through as well as the first two washes (of hydrazide resin with water) were collected for the subsequent MS analysis (nonglycosylated proteins). After extensive washing (2 × 500 μL each: water, 1.5 M NaCl, methanol, 80% ACN and 50 mM NH4HCO3), the hydrazide resin was loaded with 100 μL of 50 mM NH4HCO3 solution and incubated overnight with 500 units of PNGase F (New England Biolabs, Ipswich, MA) (37 °C). After the incubation period, the glycopeptide-containing flow-through was collected and desiccated.

2.4. MS Analysis

Five micrograms of peptides originating from the biotinylated, glycosylated and nonglycosylated fractions were desalted using C18 ZipTip pipet tips (Millipore, Billerica, MA). Following this, the peptide-containing samples were first desiccated and then dissolved in 18 μL of 100 mM ammonium formate buffer (pH 10). To the dissolved samples, the level 3 internal standard mix (IS3) was added, composed of MassPREP Digestion Standard Mixture 1 (Waters Corporation), which contains an equimolar mix of yeast alcohol dehydrogenase, rabbit glycogen phosphorylase b, bovine serum albumin and yeast enolase; the final concentration in the 18 μL sample was adjusted to 135 fmol of yeast alcohol dehydrogenase. Of the sample prepared, 9 μL was injected, corresponding to an estimated protein load of 2.5 μg. For the MS analysis, the 2D-nano Acquity UPLC (Waters) was coupled online with the SYNAPT G1 qTOF system (Waters).


The configuration of the 2D-nano UPLC system was the following: first dimension separation column X-Bridge BEH C18 5 μm (300 μm × 50 mm), trap column Symmetry C18 5 μm (180 μm × 20 mm) and analytical column BEH C18 1.7 μm (75 μm × 150 mm) (all Waters). The sample was loaded at 2 μL/min (20 mM ammonium formate, pH 10) on the first column and subsequently eluted in 5 steps (10, 14, 16, 20 and 65% acetonitrile). Each eluted fraction was desalted on the trap column and subsequently separated on the second analytical column; flow rate 300 nL/min, solvent A (0.1% formic acid in water) and solvent B (0.1% formic acid in acetonitrile), gradient 0 min, 97% A; 90 min, 60% A. The MS acquisition parameters were: data independent, alternate scanning (MSE) mode, 50–1500 m/z range, ESI+, V optics, scan time 1 s, cone 30 V and lock mass [Glu1]-Fibrinopeptide B ([M + 2H]2+ 785.8426 m/z). Raw data were processed (deconvoluted, deisotoped, protein identification, absolute and relative quantification) using ProteinLynx Global SERVER (PLGS) v2.4. The processing parameters were: MS TOF resolution and the chromatographic peak width were set to automatic, low-/elevated-energy detection threshold to 250/100 counts, identification intensity threshold to 1500 counts and lock mass window to 785.8426 ± 0.30 Da. For protein identification, the UniProt human database served as the reference (canonical sequence data with 20 280 entries). To this database, the sequences of all spiked proteins (internal standards) of nonhuman origin were manually added. Peptide modification carbamidomethylation was set as fixed and oxidation (M) as variable. In addition, for glycoprotein analysis, deamidation (N) was included as a variable modification as well. A response factor (2200) for the conversion of the peptide intensities into absolute quantities was deduced previously following a repeated injection of the alcohol dehydrogenase (yeast, Swiss-Prot P00330) digest. 
This response factor was kept constant throughout the entire study. Routinely, IS1, IS2 and IS3 were checked for the correct relationship between the spiked and the measured absolute amount as well as for the relative ratio between the compared samples. The IS tolerances for both absolute quantities and relative ratios had to be within ±35% deviation for the data set to be acceptable and included in further analysis. The PLGS software calculated the score, the relative ratio of protein expression (tumoral vs normal), its p-value and the false positive rate (FPR) for each individual protein hit. The p-value output format (PLGS) ranged from 0 to 1, indicating (i) 0–0.05, significant down-regulation and (ii) 0.95–1.0, significant up-regulation of the respective protein; both instances indicate a level of certainty ≥95%. Within the present study, a protein was considered as identified if the FPR was ≥96% and the score ≥80. A protein was considered as modulated if the relative ratio of protein expression was higher than 1.5-fold with a significant p-value.

2.5. Glycoprotein Data Analysis

Regarding the glycoproteins, the processed MS data (deconvoluted spectra) were submitted for the database search, first separately for the fraction obtained from the hydrazide beads and then combined with the flow-through fraction. Following this, all the N-linked glycoproteins originating from the hydrazide beads were filtered out with a homemade program. This program checked for the presence of deamidated asparagines at the consensus sequence site (NXS/T, where X can be replaced by any amino acid except proline) for each of the peptides in question. In this initial step, a certain number of glycopeptides could immediately be assigned to a respective glycoprotein (GLYCO fraction). The remaining glycopeptides were not specific enough or had lower scores so that they could not be unambiguously associated with a protein. In order to help assign these peptides, they were matched with the peptides from the REST fraction analysis where several nonglycosylated peptides in conjunction with the glycosylated peptides (from the GLYCO fraction) permitted the protein identification. This new combined pool of proteomic results was named the GLYCO REST fraction.

2.6. Immunohistochemical Validation of Selected Biomarkers

The expression of CD276 was assessed by immunohistochemistry in formalin-fixed paraffin-embedded breast tissue sections. Samples originating from 30 tumoral and 10 normal breasts were immunostained using a suitable antibody (monoclonal anti-B7H3, R&D Systems, Minneapolis, MN). Tissue sections of 5 μm thickness were deparaffinized by three baths in xylene for 5 min and hydrated in a methanol gradient (100, 90, 70, 50% and H2O). Blocking of endogenous peroxidase was performed by 30 min incubation with 3% H2O2 and 90% methanol. Antigen retrieval was conducted in 10 mM citrate buffer (pH 6) using a 95 °C water bath for 40 min. Following 30 min blocking in PBS–normal serum solution (150 μL of normal rabbit serum [Vector Laboratories, Burlingame, CA] and 20 μL of Tween 20 in 10 mL PBS), the sections were incubated with the primary antibody overnight at 4 °C. Sections were then incubated with the biotinylated secondary antibody for 30 min and further with the avidin–biotin complex kit (ABC kit, Vector Laboratories) for an additional 30 min. 3,3′-Diaminobenzidine tetrahydrochloride dihydrate (DAB) with 5% H2O2 was used for color development. The slides were finally counter-stained with hematoxylin. Immunostaining was assessed by two independent evaluators who examined the samples for percentage of positive cells (four arbitrary units/classes: 1 = 0–25%, 2 = 25–50%, 3 = 50–75% and 4 = 75–100%) and for staining intensity (four arbitrary units/classes: 0 = no staining, 1 = weak, 2 = moderate and 3 = strong). The results obtained by these two scales were then multiplied together, yielding a single value named score (y axis in Figure 9). Statistical analysis was performed using the Mann–Whitney Rank Sum Test for comparison between two groups (Sigma Plot; Systat Software, San Jose, CA).
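The scoring rule described above (percentage class multiplied by intensity class) can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the handling of boundary values (for example, whether exactly 25% falls in class 1 or class 2) is an assumption, since the text gives overlapping class limits.

```python
def percent_class(percent_positive: float) -> int:
    """Map the percentage of positive cells to classes 1-4.

    Assumption: upper limits are inclusive (25% -> class 1); the source
    does not say how boundary values were assigned.
    """
    if not 0 <= percent_positive <= 100:
        raise ValueError("percentage must lie between 0 and 100")
    if percent_positive <= 25:
        return 1
    if percent_positive <= 50:
        return 2
    if percent_positive <= 75:
        return 3
    return 4


def ihc_score(percent_positive: float, intensity: int) -> int:
    """Multiply the percentage class (1-4) by the intensity class (0-3)."""
    if intensity not in (0, 1, 2, 3):
        raise ValueError("intensity must be 0, 1, 2 or 3")
    return percent_class(percent_positive) * intensity
```

With these definitions the score ranges from 0 to 12. The two groups of scores could then be compared with a rank-sum test; `scipy.stats.mannwhitneyu` is one option, although the study itself used Sigma Plot.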

3. RESULTS

3.1. Isolation of Potentially Accessible Proteins from Tissue Samples: The Sequential Method

The schematic overview of the sequential method is displayed in Figure 1. The method is composed essentially of three distinct steps: (i) isolation of biotinylated proteins (BIOT), (ii) purification of the glycopeptides (GLYCO) and (iii) analysis of the remaining peptides (REST). The latter fraction served for in silico complementation of the nonassigned glycopeptides (Figure 2). In addition, as this fraction contained a number of potentially accessible proteins, it was allowed to contribute this group of interest to the overall pool of modulated proteins. Due to the obvious complexity of the method, several rapid tools for monitoring the inter-replicate repeatability were introduced. These consisted of: (i) examining the flow-through of the streptavidin purification step for remaining biotinylated proteins and (ii) measurement of the flow-through of the hydrazide beads for residual glycopeptides. Further process controls were the internal standards, which were spiked at three different steps, amounting to 8 individual proteins of nonhuman origin (detailed in the Materials and Methods). Ideally, the technical variability must be confined to limits below the threshold used to define whether a given protein is differentially expressed. Overall, the capture of both biotinylated proteins and glycopeptides was efficient and allowed for good specificity and minor sample loss. This is evident from the data presented in Figure 3A. Here it is noteworthy that in the BIOT fraction, quantitative recovery of biotinylated ovalbumin and negligible contamination by fetuin were detected (IS1). As far as the other two fractions are concerned (GLYCO and REST), fetuin was accurately quantified in both fractions (owing to both glycosylated and nonglycosylated peptides), whereas casein (IS1), being a phosphoprotein, was only recovered in the REST fraction. Regarding the reproducibility of the protein digestion and subsequent purification steps, good recoveries of beta-lactoglobulin (IS2) were observed in all fractions except the GLYCO one (because it is not a glycosylated protein), indicating that the quantitative and qualitative variability introduced by these steps is within the acceptable limits.

Figure 2. Overview of the in silico combinatory method used to increase the number of identified glycoproteins. Those glycopeptides that were selectively bound on the hydrazide resin but were not specific enough to give protein identification were matched with the flow-through fraction and subjected to a second database search.
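The sequon filter described in Section 2.5 (a deamidated asparagine within an N-X-S/T motif, X being any residue except proline) can be illustrated with a minimal Python sketch. The authors' homemade program is not published in this text, so the function names, the 0-based position convention and the input format below are assumptions made purely for illustration. The sketch also only looks inside the peptide itself; a motif that spans the C-terminal cleavage site would need the parent protein sequence and is not handled here.

```python
import re

# N followed by any residue except P, then S or T.
# The lookahead keeps matches zero-width past N, so overlapping sequons are found.
SEQUON = re.compile(r"N(?=[^P][ST])")


def sequon_positions(peptide: str) -> list[int]:
    """Return 0-based indices of asparagines that sit in an N-X-S/T motif."""
    return [match.start() for match in SEQUON.finditer(peptide)]


def is_formerly_glycosylated(peptide: str, deamidated_positions: set[int]) -> bool:
    """True if at least one deamidated asparagine lies within a sequon.

    `deamidated_positions` holds hypothetical 0-based indices of residues
    reported as deamidated by the database search.
    """
    return any(pos in deamidated_positions for pos in sequon_positions(peptide))
```

Peptides that pass this check would correspond to candidates for the GLYCO fraction; deamidation outside a sequon is treated as unrelated to PNGase F release.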


Figure 3. Quantitative accuracy, recovery and repeatability of the technique. (A) Absolute quantitative evaluation of the internal standards spiked in the sample during the preparation process using the sequential method (see Figure 1). Protein quantity was calculated using the PLGS software, based on a previously calculated response factor (for details refer to Materials and Methods). The error indicates the standard deviation of means, based on three full process technical replicates. (B) Absolute quantification of the internal standards spiked in the sample during the glycopeptides analysis alone. For comparative reasons (as outlined in the Introduction), the isolation of glycopeptides was also conducted as a standalone technique. Quantification and error bars are the same as in A.
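The acceptance rule for internal standards stated in Materials and Methods (measured amounts within ±35% of the spiked amounts) can be expressed as a short check. The sketch below is a hedged illustration: the function names and the dictionary layout are hypothetical, and the amounts used in any call are placeholders rather than values from the study.

```python
def within_tolerance(measured: float, spiked: float, tolerance: float = 0.35) -> bool:
    """True if the measured amount deviates from the spiked amount by at most `tolerance`."""
    if spiked <= 0:
        raise ValueError("spiked amount must be positive")
    return abs(measured - spiked) / spiked <= tolerance


def dataset_acceptable(standards: dict[str, tuple[float, float]]) -> bool:
    """Accept a data set only if every internal standard is within tolerance.

    `standards` maps a standard's name to (measured, spiked) amounts in the
    same unit (for example fmol); the layout is an assumption of this sketch.
    """
    return all(within_tolerance(measured, spiked)
               for measured, spiked in standards.values())
```

The same relative-deviation test could be applied to the tumoral/normal ratio of each standard, with an expected ratio of 1 when equal amounts are spiked into both samples.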

The present study uses MSE technology to perform absolute label-free quantification. The method uses highly reproducible HPLC and data independent alternate scanning. In addition, high mass accuracy measurements are provided by an orthogonal time-of-flight mass spectrometer and “on the fly” acquisition of a lock mass, allowing for subsequent readjustment of the calibration. The data are processed based on the detection and correlation of all detectable precursor and fragment ions sharing the same chromatographic profile. Rapid alternation between low and elevated energy states applied in the collision cell allows for the simultaneous quantification and identification of proteins in a single experiment. In the present work, IS3 is spiked into the sample at the very last step of the sequential method (immediately before the sample is injected into the UPLC), which allows a good estimate of the performance of the quantification method. As outlined in Figure 3A, the quantification of the four IS3 proteins demonstrates that absolute quantification is feasible and accurate. However, a slight overestimation of the protein amounts, especially for albumin (bovine) and glycogen phosphorylase b (rabbit), is evident. This is probably related to the presence of human homologues found in the same sample. Because this variation is below the 1.5-fold ratio chosen as the threshold for claiming differential expression of a protein, the resulting deviation was not considered important. Alongside absolute quantification, the question of repeatability, in particular with respect to the full process technical replicates, becomes relevant. Therefore, Figure 4A shows an inter-replicate comparison of the identified and quantified proteins in each respective fraction of the combinatory method (including the in silico fraction GR). The Pearson correlation coefficients (Pcc) indicate an overall strong correlation between the full process technical replicates (average Pcc for all fractions = 0.92). The Pcc value for the GLYCO fraction appears to be slightly lower (average Pcc = 0.86) and improved to 0.89 when the nonglycosylated peptides (via the in silico approach) were considered. Overall, the weaker Pcc can be explained by the fact that the calculations were performed with fewer peptides (only the glycosylated ones). The performance of the sequential approach was further verified at the qualitative level, with respect to the overlap of protein identifications (Figure 5) and their cellular localization (Figure 6) in the full process replicates. In summary, regarding protein identification with the sequential method, the BIOT fraction displayed the least variability, having on average 75% of proteins present in all three replicates. This value decreased in the REST fraction to just above 65% and in the GLYCO fraction to 55%. From these data, it can be reasonably assumed that enrichment at the protein level causes less variability than enrichment at the peptide level. This is in line with the fact that digestion of proteins generally increases the complexity of the sample. Regarding repeatability at the level of cellular localization of identified proteins, the results for each fraction are detailed in the Supporting Information (Figure S1) and for the entire sequential approach in Figure 6. It is worth noting that there is practically no significant inter-replicate variability. This implies that the method is able to repeatedly isolate the same quality of proteins, originating from identical subcellular localizations.

Figure 4. Repeatability of the method regarding absolute protein quantification in the three full process replicates. The data displayed regard all the proteins identified in the respective fractions of the combinatory method (A) and of the glycopeptide isolation procedure as a standalone technique (B). The Pearson correlation coefficient (Pcc) is indicated for each comparison and fraction.

Figure 5. Repeatability of the protein identification. The percentage overlap of the proteins identified in each respective process replicate and fraction of the method. The BIOT fraction represents only proteins obtained from the biotinylation step; glycoproteins identified are marked as GLYCO, whereas the nonglycosylated proteins are designated REST. The remaining two diagrams, GLYCO ONLY (GO) and REST from GO, indicate the reproducibility of the protein identification for the glycopeptide isolation procedure as a standalone technique.

3.2. Comparison of the Sequential Approach with the Individual Methods

In order to determine the real benefit of combining two previously described procedures in a new method, we have sought to compare the sequential technique with the individual method parts respectively. For this purpose, it was necessary to perform the second part of the method, the isolation of the glycopeptides, as a “standalone” technique. As far as the

technical aspects of the method are concerned (quantification of IS), the direct isolation of glycosylated peptides (GLYCO ONLY [GO]) and the analysis of the remaining REST from GO fraction produced results similar to those of the sequential method. As shown in Figure 3B, the GLYCO ONLY fraction specifically recovered glycosylated fetuin, whereas biotinylated ovalbumin and casein were, in addition to fetuin, present only in the REST from GO fraction. Internal standards 2 and 3 indicate no significant variability with respect to digestion, purification and MS quantification. At the protein level, Figure 4B demonstrates a high correlation between the individual replicates regarding the absolute protein quantities (average Pcc = 0.85). This correlation increased to 0.93 following the in silico association of the nonglycosylated peptides. Concerning the reproducibility of protein identification, analogous to the sequential method, the greatest variability is observed in the GLYCO ONLY fraction, followed by the REST from GO fraction. Finally, both fractions of the glycopeptide enrichment as a standalone technique show reproducible isolation patterns with respect to the subcellular localization of the identified proteins (results displayed in the Supporting Information, Figure S1).
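The replicate comparison summarized by the average Pcc can be illustrated with a minimal sketch. It assumes that each process replicate is available as a mapping from protein identifier to absolute quantity and that only proteins quantified in both replicates of a pair enter the correlation; the identifiers, the quantities and this pairing rule are hypothetical and are not taken from the study.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def mean_pairwise_pcc(replicates):
    """Average Pcc over all replicate pairs (assumption: shared proteins only)."""
    values = []
    for rep_a, rep_b in combinations(replicates, 2):
        shared = sorted(set(rep_a) & set(rep_b))
        values.append(pearson([rep_a[p] for p in shared],
                              [rep_b[p] for p in shared]))
    return sum(values) / len(values)


# Hypothetical absolute quantities for three process replicates.
rep1 = {"A": 10.0, "B": 20.0, "C": 30.0, "D": 5.0}
rep2 = {"A": 11.0, "B": 19.0, "C": 33.0}
rep3 = {"A": 9.0, "B": 22.0, "C": 29.0, "D": 6.0}

avg_pcc = mean_pairwise_pcc([rep1, rep2, rep3])
print(f"average Pcc = {avg_pcc:.2f}")
```

Restricting each pair to shared proteins keeps missing identifications from being treated as zero quantities, which would otherwise mix identification variability into the quantification estimate.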


Figure 6. Qualitative comparison of the percentage of identified proteins in the combinatory method according to their subcellular localization. The figure displays three full technical process replicates. The abbreviation BGR refers to a merged data set of BIOT, GLYCO and REST fractions for the respective replicate.

The goal of the method development presented here was to identify and quantify, repeatedly and accurately, potentially accessible and hence clinically relevant proteins. It was therefore essential to compare the number of potentially accessible proteins obtained with the sequential method with the numbers obtained from biotinylation or glycopeptide isolation as “standalone” techniques. This comparison is shown in Figures 7 and 8. At this stage, it is important to point out that, within the frame of the current work, the BIOT protein fraction contained ∼45%, the GLYCO fraction ∼80% and the REST fraction ∼30% of potentially accessible proteins (membrane, extracellular and secreted; results shown in the Supplemental Data section, Figure S1). These percentages were not significantly different when glycopeptide isolation was performed alone. As far as the absolute numbers of potentially accessible proteins are concerned, the BIOT protein fraction isolated on average 310 proteins (Figure 7A). The number of potentially accessible proteins in the GLYCO fraction (both in the sequential and in the standalone approach) was approximately 80, which increased to 110 following the in silico combination method (Figure 7A). This demonstrates the value of performing this operation, especially for nonassigned glycopeptides (an increase of ∼30%). In the REST fraction, on average 200 potentially accessible proteins were confidently identified. The combination of all the method parts yielded in the sequential setting over 410 proteins (+30% in comparison to

the biotinylation alone) and in the glycopeptide setting alone (including the corresponding REST from GO fraction) 250 proteins (50% in comparison to the sequential method). As to whether the sequential method provides real additional value in comparison to the individual techniques performed alone, Figure 8 summarizes the most important findings. With respect to the sequential method, the following can be said: the removal of previously biotinylated proteins is clearly superior, in terms of the percentage of identified unique proteins (∼53%), to both the glycopeptide analysis and the analysis of the remaining (nonglycosylated and nonbiotinylated) proteins. However, the GLYCO fraction is characterized by ∼36% unique, potentially accessible proteins. The remaining pool of proteins (REST fraction), owing to its relatively high absolute number and high percentage of membrane and extracellular proteins, contributes another ∼25% of unique proteins. From this observation, it can be concluded that each part of the sequential method adds value to the technique as a whole. Considering the repeatability of the observation (Figure 8, upper section, numbers indicated in the brackets), the BIOT fraction has the highest number of potentially accessible proteins detected in all three replicates (101 or ∼60%). This decreases in the GLYCO fraction to ∼40% and further in the REST fraction to ∼20%. Hence, although the REST fraction bears a relatively elevated percentage of


Figure 7. Analysis of the effective value of the combinatory method with respect to the individual components alone (BIOT, GLYCO and REST) and the absolute numbers of accessible proteins identified (accessible proteins: extracellular, secreted and/or membrane). (A) The number of accessible proteins identified in each of the steps of the combinatory method as well as the sum of all the components together. Three full process technical replicates are displayed. Notably, G indicates the average number of proteins obtained using only the data from the trapped glycoprotein fraction (GLYCO, Figure 1). However, using the in silico method outlined in Figure 2, the overall number of glycoproteins was increased; this resulted in the data marked GR. Importantly, the specific dynamic range of proteins characterizing the R (REST, Figure 1) fraction allowed for a significant identification of additional accessible proteins. (B) Same as A, conducted only for the glycopeptide isolation procedure as a standalone technique.

potentially accessible proteins, these are observed with high interreplicate variability. However, when the results of all the steps of the sequential method are combined (BGR, Figure 8, lower section), the repeatability of identification of potentially accessible proteins (observed in all 3 process replicates) increases to ∼75%. Performing the glycoprotein analysis alone (including the in silico matching and the joining of all the potentially accessible proteins found in the resulting REST fraction) results in only ∼18% unique proteins not previously observed in the sequential method.
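The two quantities compared in this section, the share of proteins unique to one fraction and the share of identifications seen in every replicate, reduce to simple set operations. The sketch below is an illustration only: it assumes that uniqueness is expressed relative to the size of the fraction itself and repeatability relative to the union of all replicates, and the protein identifiers are hypothetical.

```python
def unique_fraction(target, *others):
    """Share of proteins in `target` that occur in none of the `others`."""
    unique = set(target).difference(*others)
    return len(unique) / len(target)


def seen_in_all(replicates):
    """Share of all identified proteins that is present in every replicate."""
    sets = [set(r) for r in replicates]
    union = set().union(*sets)
    common = set.intersection(*sets)
    return len(common) / len(union)


# Hypothetical identifications per fraction.
biot = {"P1", "P2", "P3", "P4"}
glyco = {"P3", "P5"}
rest = {"P4", "P6", "P7"}

biot_unique = unique_fraction(biot, glyco, rest)   # P1 and P2 are unique
glyco_unique = unique_fraction(glyco, biot, rest)  # only P5 is unique

# Hypothetical identifications of one fraction in three process replicates.
repeatability = seen_in_all([{"P1", "P2"}, {"P1", "P3"}, {"P1", "P2"}])

print(biot_unique, glyco_unique, round(repeatability, 2))
```

Other denominators (for example the merged BGR data set) are equally defensible; the choice only needs to be kept constant when fractions are compared with one another.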

3.3. Proof of Concept Study—Applying the Sequential Approach to Breast Cancer Samples

Within the frame of the current study, we sought to apply the developed method to clinically relevant breast cancer samples obtained from five patients. The aim of this part of the study was to demonstrate that, using relevant samples, the method is able to discern potentially accessible and previously unreported differentially expressed proteins in human breast cancer. The group of patients selected for this study was homogeneous with respect to the type, grade and clinical stage of the tumor


Figure 8. Comparison of the identification overlap of accessible proteins in different fractions of the combinatory method and in the glycopeptide analysis alone. The Venn diagrams display three full process replicates. The following abbreviations were used: B (BIOT fraction), R (REST fraction) and GR (GLYCO fraction incl. in silico matching with REST, as described in Figure 2). Regarding the combinatory method, the figure shows that there is substantial overlap among all the separate fractions. However, each fraction also contains a certain number of unique proteins. For the unique proteins, the numbers in parentheses indicate how many of them were observed in 3 out of 3, 2 out of 3 and 1 out of 3 replicates, respectively. The lower part of the figure shows a comparison between the combinatory method (BGR) and the glycoprotein analysis alone (GR&R), incl. in silico matching with its corresponding REST fraction. The diagrams display only accessible proteins. For a correct comparison, accessible proteins found in the REST fractions are always included. It is worth noting that glycoprotein analysis as a standalone technique identifies on average 18% unique proteins. In contrast, over 50% of the proteins identified in the combinatory method are unique.

(Table S1, Supporting Information). A given protein was considered up-regulated when the relative abundance ratio (tumoral vs normal) was ≥1.5 with a p-value ≥ 0.95 in at least 3 out of the 5 patients examined. Similarly, a protein found only in the tumoral condition was also considered up-regulated. Overall, the sequential approach identified 93 up-regulated, potentially accessible proteins in the 5 ductal breast carcinoma patients. Biotinylation alone contributed 54 modulated proteins. The analysis of glycosylated and remaining peptides yielded 26 and 41 up-regulated proteins, respectively. Of these, 33 were unique to the BIOT, 19 to the REST and 20 to the GLYCO fraction. The complete list of the up-regulated proteins is shown in Table 1. Details regarding the number of unique peptides, the sequence coverage and the false positive rate (FPR) are provided in Table 2. In the BIOT fraction, the up-regulated proteins were identified on average with ∼13 peptides, had a sequence coverage of ∼20% and an FPR under ∼1%. Similarly, in the REST fraction the proteins were identified with ∼15 unique peptides, the sequence coverage was ∼18% and the average FPR was under 0.5%. As expected, the glycosylated proteins (GLYCO fraction) were identified with fewer unique peptides (∼10), had lower sequence coverage (∼13%) and a higher FPR (∼1.2%). Owing to the internal standards as well as to the specificity of the MS analysis (employing the MSe technology), it was possible to provide an absolute quantity for each protein identified (Table 3). The amounts refer to a total quantity of 2.5 μg of peptides injected on the UPLC column. A protein abundance ratio between the tumoral and the normal sample was calculated as well (absolute quantity in the tumoral vs the normal specimen). These ratios (without any further normalization) correlate well (average Pearson correlation coefficient >0.90; data not shown) with the relative quantification ratios shown in Table 1, demonstrating that the absolute quantification provides reliable values.
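The selection rule stated above (ratio ≥1.5 with a p-value ≥0.95, or presence in the tumor only, in at least 3 of 5 patients) can be written as a small predicate. This is a sketch under stated assumptions, not the authors' software: the encoding of a patient observation as `None` (not detected), the string `"Tumor"` (tumor only) or a `(ratio, p_value)` tuple is hypothetical, as are the example values.

```python
def is_up_in_patient(obs, min_ratio=1.5, min_p=0.95):
    """Return True if a single patient observation counts as up-regulated."""
    if obs is None:          # protein not detected in this patient
        return False
    if obs == "Tumor":       # detected in the tumoral sample only
        return True
    ratio, p_value = obs     # relative T/N ratio and its p-value
    return ratio >= min_ratio and p_value >= min_p


def is_upregulated(observations, min_patients=3):
    """Apply the 'at least 3 out of 5 patients' rule to one protein."""
    hits = sum(is_up_in_patient(obs) for obs in observations)
    return hits >= min_patients


# Hypothetical pattern: 3-fold up in one patient, tumor-only in three,
# not detected in the fifth.
tumor_enriched = [(3.0, 1.00), "Tumor", "Tumor", "Tumor", None]

# Hypothetical pattern that fails: ratio too low in one patient,
# p-value too low in another, tumor-only in just two.
borderline = [(1.4, 1.00), (2.0, 0.50), "Tumor", "Tumor", None]

print(is_upregulated(tumor_enriched), is_upregulated(borderline))
```

Keeping the thresholds as keyword arguments makes it straightforward to check how sensitive the list of modulated proteins is to the chosen cut-offs.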

3.4. Validation of Modulated CD276 using Immunohistochemistry

The CD276 protein was found to be overexpressed in the breast cancer samples using the outlined sequential MS-based method. Specifically, this protein was identified in four out of five patients, being 3-fold up-regulated in one and present only in the tumoral condition of the other three individuals (Table 1, GLYCO fraction). At the time of this work, no published data were available regarding the expression and function of CD276 during breast cancer development and progression. Moreover, this protein attracted interest because it was observed solely in the GLYCO fraction. To validate the ability of the method to identify potentially new proteins, we examined the expression of CD276 using IHC on a collection of samples from 30 patients diagnosed with infiltrating ductal breast cancer. The control group was increased to a total of 10 normal individuals. We found that CD276 was expressed at the surface of most human breast cancer cells, while it was in general absent in adjacent normal mammary epithelial cells. In summary, as detailed in Figure 9, CD276 showed a strong positive and statistically significant staining in human breast cancer tissue.

4. DISCUSSION

Identification of accessible biomarkers remains one of the major limiting factors for the development of new effective diagnostic, prognostic and therapeutic modalities. In this study, we developed a new method that significantly increased the number of potentially reachable proteins compared to previously described approaches that made use of biotinylation alone. The uniqueness of the protocol consists in the sequential combination of three procedures applied to scarce biopsy samples. Exploiting the fact that most of the glycoproteins are inherently found in the extracellular and membrane space, we decided to

Table 1. List of Potentially Accessible Modulated Proteins Obtained from the Analysis of 5 Nontumoral Adjacent and 5 Tumoral Individual Matched Specimens, BIOT, REST and GLYCO Fractions(a)

Columns: accession; gene name; number of patients; score, ratio T/N and p-value for each of the patients BPSCC10/39, BPSCC10/40, BPSCC10/46, BPSCC10/47 and BPSCC10/49; subcellular locations. Entries are marked "Tumor" or "Normal" when the protein was detected in only one of the two conditions and "/" when it was not detected.

BIOT Fraction (accession, gene name): Q01518, CAP1; Q8IUX7, AEBP1; Q8TD06, AGR3; P12830, CDH1; Q99439, CNN2; Q15417, CNN3; P27797, CALR; P49747, COMP; P35221, CTNNA1; P26232, CTNNA2; Q13740, ALCAM; Q9Y696, CLIC4; P10909, CLU; P02452, COL1A1; Q99715, COL12A1; Q96C19, EFHD2; Q9Y6C2, EMILIN1; P02751, FN1; P23142, FBLN1; Q08380, LGALS3BP; P16188, HLA-A; P13747, HLA-E; P17693, HLA-G; P18465, HLA-B; P30499, HLA-C; Q07000, HLA-C; Q29960, HLA-C; P30504, HLA-C; P01906, HLA-DQA2; P01903, HLA-DRA; P05556, ITGB1; Q86UP2, KTN1; P33241, LSP1; P14174, MIF; O15173, PGRMC2; P26038, MSN; P29966, MARCKS; O14745, SLC9A3R1; Q15063, POSTN; P36955, SERPINF1; P07237, P4HB; Q15084, PDIA6; P01893, HLA-H; P62834, RAP1A; P61224, RAP1B; Q6FHJ7, SFRP4; P05026, ATP1B1; P24821, TNC; P10599, TXN; P35442, THBS2; P37802, TAGLN2; P13611, VCAN; Q9P0L0, VAPA; P21796, VDAC1.

REST Fraction (accession, gene name): Q01518, CAP1; Q8IUX7, AEBP1; P80723, BASP1; Q99439, CNN2; Q15417, CNN3; P27797, CALR; O43852, CALU; P13987, CD59; P02452, COL1A1; Q99715, COL12A1; P08123, COL1A2; Q13561, DCTN2; Q9Y6C2, EMILIN1; P02751, FN1; P09382, LGALS1; P13746, HLA-A; P01903, HLA-DRA; P01911, HLA-DRB1; P29966, MARCKS; O14745, SLC9A3R1; Q15063, POSTN; P36955, SERPINF1; P13796, LCP1; Q15084, PDIA6; P07237, P4HB; Q92928, RAB1C; P50395, GDI2; P46940, IQGAP1; P61106, RAB14; P08134, RHOC; P05023, ATP1A1; P24821, TNC; P07996, THBS1; P35442, THBS2; Q15582, TGFBI; P61586, RHOA; P37802, TAGLN2; P13611, VCAN; Q9P0L0, VAPA; P18206, VCL; P21796, VDAC1.

GLYCO Fraction (accession, gene name): P21810, BGN; Q10589, BST2; Q9UBG0, MRC2; Q5ZPR3, CD276; Q9BY67, CADM1; Q99715, COL12A1; P0C0L5, C4B; Q9Y6C2, EMILIN1; Q08380, LGALS3BP; P02790, HPX; P12314, FCGR1A; P01903, HLA-DRA; Q9NX62, IMPAD1; P06756, ITGAV; P05107, ITGB2; P11279, LAMP1; P13473, LAMP2; P01033, TIMP1; Q15063, POSTN; P36955, SERPINF1; P05154, SERPINA5; P24821, TNC; P07996, THBS1; P04216, THY1; P02786, TFRC; Q9HD45, TM9SF3.

(a) The proteins were selected with respect to their presence in tumoral tissue samples and their absence or reduced presence in nontumoral tissue samples. For certain proteins present in both the tumoral and the adjacent normal tissue, quantitative data (relative ratio of expression) are included if they were significantly overexpressed in the tumor (ratio ≥1.5). The p-value ranges from 0 to 1, indicating (i) 0–0.05, significant down-regulation and (ii) 0.95–1.0, significant up-regulation of the respective protein. According to the GO annotation, the proteins are located on the outer side of the cell membrane (secreted, extracellular or membrane). Additional information concerning the number of unique peptides, protein sequence coverage and FPR is provided in Table 2. The corresponding sequence information regarding the glycosylated peptides is provided in Table S2 (Supporting Information). The MS/MS spectra supporting the identification of glycosylated peptides are outlined in the Supporting Information, Figure S2. The following abbreviations were used for the subcellular location: S, secreted; E, extracellular; M, membrane; Cy, cytoplasm; CJ, cell junction; CP, cell projection; CSu, cell surface; N, nucleus; G, Golgi; ER, endoplasmic reticulum; Mel, melanosome; SR, sarcoplasmic reticulum; L, lysosome.

Table 2. Values Indicate the Number of Unique Peptides, Sequence Coverage and the FPR Observed for Each of the Respective Proteins Reported in Table 1 (Average Numbers, n = 5) number of unique peptides

sequence coverage (%) tumor

false positive rate (%)

accession

tumor

normal

normal

tumor

normal

Q01518

14

9

30

16

0

0

Q8IUX7 Q8TD06

25 7

13 /

17 28

2 /

0 0

3 /

P12830

8

/

7

/

1

/

Q99439

9

/

28

/

0

/

Q15417

9

/

28

/

0

/

P27797

19

9

44

25

0

0

P49747

14

/

14

/

0

/

P35221

24

/

24

/

0

/

P26232 Q13740

20 12

/ /

8 17

/ /

1 0

/ /

BIOT Fraction

Q9Y696

7

5

20

9

0

3

P10909

14

11

28

25

0

0

P02452

36

27

22

19

1

0

Q99715

68

/

18

/

0

/

Q96C19

8

/

17

/

0

/

Q9Y6C2

18

17

10

10

0

0

P02751 P23142

56 14

27 7

25 20

8 10

0 1

2 3

Q08380

12

11

17

14

1

3

P16188

11

6

29

13

0

0

P13747

7

4

15

8

0

1

P17693

5

/

14

/

0

/

P18465

11

14

27

13

0

0

P30499

10

7

22

17

0

0

Q07000 Q29960

12 11

9 7

28 22

24 10

0 0

0 0

P30504

12

5

31

17

0

0

P01906

5

/

13

/

0

/

P01903

5

4

18

17

0

1

P05556

18

10

19

9

0

2

Q86UP2

33

/

17

/

0

/

P33241

12

/

36

/

1

/

P14174 O15173

4 10

2 8

28 34

32 29

0 0

0 0

P26038

16

8

16

3

0

3

P29966

6

13

27

20

0

0

O14745

11

/

29

/

0

/

Q15063

29

15

32

17

0

1

P36955

10

6

18

9

0

0

P07237

28

13

49

21

0

0

Q15084 P01893

10 9

/ 7

24 26

/ 17

0 0

/ 0

P62834

6

/

24

/

0

/

P61224

7

5

35

20

0

0

Q6FHJ7

3

/

8

/

1

/

P05026

6

/

18

/

0

/

P24821

39

/

18

/

1

/

P10599

6

4

47

28

0

0

P35442

24

/

17

/

0

/

3175

dx.doi.org/10.1021/pr200212r |J. Proteome Res. 2011, 10, 3160–3182

Journal of Proteome Research

ARTICLE

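Table 2. Continued

fraction | accession | unique peptides, tumor | unique peptides, normal | sequence coverage (%), tumor | sequence coverage (%), normal | false positive rate (%), tumor | false positive rate (%), normal
BIOT | P37802 | 21 | 12 | 73 | 63 | 0 | 0
BIOT | P13611 | 39 | / | 6 | / | 0 | /
BIOT | Q9P0L0 | 5 | / | 17 | / | 0 | /
BIOT | P21796 | 8 | / | 10 | / | 0 | /
REST | Q01518 | 13 | 8 | 23 | 14 | 0 | 0
REST | Q8IUX7 | 23 | / | 16 | / | 0 | /
REST | P80723 | 8 | / | 33 | / | 0 | /
REST | Q99439 | 5 | / | 11 | / | 0 | /
REST | Q15417 | 8 | / | 17 | / | 0 | /
REST | P27797 | 15 | 8 | 34 | 18 | 0 | 0
REST | O43852 | 13 | / | 38 | / | 1 | /
REST | P13987 | 4 | 2 | 16 | 9 | 0 | 0
REST | P02452 | 35 | 28 | 19 | 14 | 0 | 0
REST | Q99715 | 85 | 31 | 21 | 6 | 0 | 0
REST | P08123 | 26 | 23 | 21 | 16 | 0 | 0
REST | Q13561 | 12 | / | 24 | / | 0 | /
REST | Q9Y6C2 | 13 | / | 6 | / | 2 | /
REST | P02751 | 46 | 26 | 19 | 7 | 0 | 2
REST | P09382 | 12 | 7 | 50 | 37 | 0 | 0
REST | P13746 | 10 | / | 21 | / | 0 | /
REST | P01903 | 4 | / | 15 | / | 0 | /
REST | P01911 | 8 | / | 20 | / | 0 | /
REST | P29966 | 8 | / | 22 | / | 0 | /
REST | O14745 | 11 | / | 15 | / | 0 | /
REST | Q15063 | 32 | / | 30 | / | 0 | /
REST | P36955 | 9 | 11 | 14 | 11 | 0 | 0
REST | P13796 | 16 | 16 | 16 | 7 | 0 | 1
REST | Q15084 | 12 | 8 | 33 | 18 | 0 | 0
REST | P07237 | 22 | 12 | 31 | 10 | 0 | 1
REST | Q92928 | 6 | / | 23 | / | 0 | /
REST | P50395 | 12 | 12 | 17 | 16 | 0 | 0
REST | P46940 | 31 | 29 | 11 | 10 | 0 | 1
REST | P61106 | 6 | / | 32 | / | 0 | /
REST | P08134 | 3 | / | 26 | / | 0 | /
REST | P05023 | 20 | 16 | 9 | 10 | 2 | 0
REST | P24821 | 32 | / | 14 | / | 0 | /
REST | P07996 | 32 | / | 20 | / | 0 | /
REST | P35442 | 19 | / | 12 | / | 1 | /
REST | Q15582 | 10 | 6 | 11 | 4 | 2 | 1
REST | P61586 | 4 | 3 | 28 | 24 | 0 | 0
REST | P37802 | 12 | 5 | 41 | 39 | 0 | 0
REST | P13611 | 42 | 26 | 6 | 2 | 1 | 1
REST | Q9P0L0 | 7 | / | 22 | / | 0 | /
REST | P18206 | 27 | 26 | 13 | 13 | 1 | 1
REST | P21796 | 10 | / | 19 | / | 0 | /
GLYCO | P21810 | 7 | 6 | 17 | 15 | 0 | 0
GLYCO | Q10589 | 5 | 5 | 15 | 14 | 1 | 4
GLYCO | Q9UBG0 | 15 | 12 | 6 | 4 | 2 | 2
GLYCO | Q5ZPR3 | 6 | 6 | 13 | 15 | 0 | 5
GLYCO | Q9BY67 | 7 | 6 | 17 | 13 | 0 | 1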


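Table 2. Continued

fraction | accession | unique peptides, tumor | unique peptides, normal | sequence coverage (%), tumor | sequence coverage (%), normal | false positive rate (%), tumor | false positive rate (%), normal
GLYCO | Q99715 | 31 | 30 | 13 | 6 | 0 | 4
GLYCO | P0C0L5 | 24 | 22 | 6 | 7 | 0 | 0
GLYCO | Q9Y6C2 | 11 | 10 | 13 | 8 | 0 | 2
GLYCO | Q08380 | 15 | 11 | 24 | 16 | 0 | 0
GLYCO | P02790 | 11 | 10 | 24 | 18 | 0 | 0
GLYCO | P12314 | 4 | / | 10 | / | 0 | /
GLYCO | P01903 | 3 | 4 | 8 | 17 | 0 | 0
GLYCO | Q9NX62 | 5 | / | 14 | / | 1 | /
GLYCO | P06756 | 17 | 13 | 19 | 5 | 0 | 4
GLYCO | P05107 | 7 | / | 9 | / | 1 | /
GLYCO | P11279 | 14 | 9 | 30 | 21 | 0 | 0
GLYCO | P13473 | 7 | 5 | 15 | 13 | 0 | 1
GLYCO | P01033 | 3 | 4 | 23 | 30 | 0 | 0
GLYCO | Q15063 | 8 | 6 | 10 | 5 | 0 | 0
GLYCO | P36955 | 5 | 3 | 15 | 13 | 0 | 0
GLYCO | P05154 | 6 | 4 | 17 | 8 | 0 | 0
GLYCO | P24821 | 32 | / | 18 | / | 0 | /
GLYCO | P07996 | 14 | / | 8 | / | 2 | /
GLYCO | P04216 | 13 | 5 | 34 | 32 | 0 | 0
GLYCO | P02786 | 13 | / | 7 | / | 2 | /
GLYCO | Q9HD45 | 8 | 5 | 6 | 2 | 1 | 2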

further process the nonbiotinylated fraction and analyze the N-glycosylated proteins. Applying the previously described glycopeptide isolation method to the fraction of nonbiotinylated proteins allowed the recovery and identification of further potentially accessible proteins. The selective covalent binding of glycan residues to hydrazide resin made a highly specific isolation of the glycosylated peptides possible. Recently, a similar strategy was employed to exploit the fact that membrane proteins are largely glycosylated.7 The authors used a biocytin hydrazide reactant to label the membrane glycoproteins in vivo in mammalian cells. The protocol requires exposing the cells to mild oxidative conditions, low temperatures, and slightly acidic conditions over a period of one hour. Because the present study was performed on scarce biopsy samples, we considered that biotinylation under physiological conditions, followed by cell lysis and glycoproteome extraction, might be a more appropriate technique to extract maximal information regarding the membrane and extracellular proteome. The original glycan oxidation and hydrazide capture method has been known for the past 30 years. Zhang et al. and Tian et al. showed for the first time the application of mass spectrometry to characterize N-linked glycoproteins.9,10 However, when applied at the peptide level, the technique is limited by the number of nonassigned glycopeptides. In this study, nonbiotinylated and nonglycosylated peptides (REST fraction) were analyzed using UPLC-MSe as well. In this context, it is important to mention that the REST fraction itself contained approximately 10% of proteins that had an N-glycosylation consensus site with deamidated asparagines. As this fraction was not treated with PNGase F, it is reasonable to assume that the deamidation occurred spontaneously in this case. Hence, care needs to be taken to differentiate these peptides from truly glycosylated ones.
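The consensus-site check described above can be illustrated with a minimal sketch. It assumes the usual N-glycosylation sequon N-X-S/T (X being any residue except proline), which the text refers to only as the "consensus site"; the function names, the example peptide, and the three labels are hypothetical and are not part of the authors' workflow.

```python
import re

# Assumed N-glycosylation sequon: N-X-S/T with X != P.
# A lookahead is used so that overlapping sites are all reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def sequon_positions(sequence):
    """Return 1-based positions of asparagines inside an N-X-S/T sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(sequence)]


def classify_deamidated_sites(sequence, deamidated_positions, pngase_treated):
    """Label each deamidated asparagine (1-based position).

    Without PNGase F treatment (as for the REST fraction), deamidation at a
    sequon is only flagged as possibly spontaneous, not as glycosylation.
    """
    sites = set(sequon_positions(sequence))
    labels = {}
    for pos in deamidated_positions:
        if sequence[pos - 1] != "N":
            raise ValueError(f"position {pos} is not an asparagine")
        if pos not in sites:
            labels[pos] = "outside consensus site"
        elif pngase_treated:
            labels[pos] = "candidate N-glycosylation site"
        else:
            labels[pos] = "possible spontaneous deamidation"
    return labels


# Hypothetical peptide: N3 and N11 sit in sequons, N7 is followed by proline.
peptide = "AANGSKNPTLNVT"
print(sequon_positions(peptide))
print(classify_deamidated_sites(peptide, [3, 7], pngase_treated=False))
```

The point of the sketch is only that a deamidated asparagine at a sequon is not, by itself, evidence of glycosylation when no PNGase F treatment was applied.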
In fact, only 5–10% of the proteins from the REST fraction that had deamidated asparagines at the consensus site are known glycoproteins. In contrast, hydrazide-resin-bound peptides led to the identification of glycoproteins of which over 85% are known to be glycosylated. In summary, specific peptides found in the REST fraction, together with the nonassigned glycopeptides (GLYCO fraction), led to successful glycoprotein identification and an overall increase in the number of identified glycoproteins (+∼30%), their sequence coverage, and score. However, the alternative of performing the glycoprotein enrichment prior to digestion merits discussion. One main concern with that approach is the inability to pinpoint the exact glycosylation site. For example, a protein may have several N-glycosylation consensus sites, but in practice only one may actually be glycosylated. If spontaneous deamidation occurs at one of the other, nonglycosylated asparagines, this would lead to a false positive result. In contrast, if the peptides containing sugars are specifically oxidized and bound to the hydrazide resin, the certainty of identifying the correct consensus site increases significantly. This ability to tell exactly which site is glycosylated is of particular interest when specific antibodies against accessible proteins need to be developed. Interestingly, the REST fraction (both following the sequential method and the glycopeptide isolation alone) contains a relatively high number of potentially accessible proteins (∼30%). This is in contrast to the available data from shotgun proteomics on tissue samples. As recently shown,14 the percentage of potentially accessible proteins in such experiments does not exceed ∼10%. This clearly indicates that the REST fraction in the current experiments cannot be compared with a classical shotgun proteomics approach. At this point, it appears that fractionation cannot be the main explanation for these discrepancies.
Some of the shotgun data were created with up


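Table 3. Absolute Quantification of Modulated Proteins Reported in Table 1^a

fraction | accession | BPSCC10/39 tumor (fmol) | BPSCC10/39 normal (fmol) | BPSCC10/39 ratio T/N | BPSCC10/40 tumor (fmol) | BPSCC10/40 normal (fmol) | BPSCC10/40 ratio T/N | BPSCC10/46 tumor (fmol) | BPSCC10/46 normal (fmol) | BPSCC10/46 ratio T/N | BPSCC10/47 tumor (fmol) | BPSCC10/47 normal (fmol) | BPSCC10/47 ratio T/N | BPSCC10/49 tumor (fmol) | BPSCC10/49 normal (fmol) | BPSCC10/49 ratio T/N
BIOT | Q01518 | 27.25 | 8.42 | 3.24 | 25.01 | 10.67 | 2.34 | 15.21 | / | T | 55.70 | / | T | 42.22 | / | T
BIOT | Q8IUX7 | 23.05 | / | T | 118.25 | 6.32 | 18.70 | 25.39 | / | T | 20.01 | / | T | 26.44 | / | T
BIOT | Q8TD06 | 49.33 | / | T | 20.55 | / | T | 34.55 | / | T | / | / | / | 27.53 | / | T
BIOT | P12830 | / | / | / | 7.93 | / | T | 12.52 | / | T | 31.39 | / | T | 30.29 | / | T
BIOT | Q99439 | 28.12 | / | T | 42.61 | / | T | 2.34 | / | T | 73.15 | / | T | 43.28 | / | T
BIOT | Q15417 | 27.49 | / | T | 28.75 | / | T | 26.41 | / | T | / | / | / | 61.09 | / | T
BIOT | P27797 | 93.04 | 14.00 | 6.65 | 86.89 | 47.01 | 1.85 | 109.98 | / | T | 490.02 | 233.34 | 2.10 | 176.27 | 133.50 | 1.32
BIOT | P49747 | 8.20 | / | T | 43.50 | / | T | 6.60 | / | T | / | / | / | / | / | /
BIOT | P35221 | 17.52 | / | T | 17.17 | / | T | 26.42 | / | T | 10.16 | / | T | 36.70 | / | T
BIOT | P26232 | / | / | / | 1.96 | / | T | / | / | / | / | / | / | 13.98 | / | T
BIOT | Q13740 | 16.72 | / | T | 14.13 | / | T | 28.98 | / | T | / | / | / | / | / | /
BIOT | Q9Y696 | 19.42 | / | T | 6.33 | 11.25 | 0.56 | 3.40 | / | T | / | / | / | 29.54 | / | T
BIOT | P10909 | 33.15 | 21.30 | 1.56 | 105.25 | 32.82 | 3.21 | 102.72 | 28.04 | 3.66 | 42.84 | 53.21 | 0.81 | 227.75 | 294.72 | 0.77
BIOT | P02452 | 598.92 | 552.61 | 1.08 | 402.57 | 578.70 | 0.70 | 239.34 | 50.69 | 4.72 | 116.35 | 109.90 | 1.06 | 1901.01 | 420.83 | 2.59
BIOT | Q99715 | 30.40 | / | T | 69.35 | / | T | 9.44 | / | T | / | / | / | / | / | /
BIOT | Q96C19 | 9.54 | / | T | 0.72 | / | T | 6.22 | / | T | 30.76 | / | T | / | / | /
BIOT | Q9Y6C2 | 18.93 | / | T | 16.91 | 8.24 | 2.05 | 7.32 | / | T | / | / | / | 33.91 | / | T
BIOT | P02751 | 197.97 | / | T | 216.76 | 16.54 | 13.10 | 61.01 | 8.58 | 7.11 | 55.63 | / | T | 73.77 | / | T
BIOT | P23142 | 26.00 | / | T | 17.75 | 5.45 | 3.26 | 12.10 | / | T | / | / | / | 44.99 | / | T
BIOT | Q08380 | 9.15 | / | T | 125.66 | 5.57 | 22.56 | 17.25 | / | T | 70.35 | 12.42 | 5.66 | 49.13 | 91.38 | 0.54
BIOT | P16188 | / | / | / | 1.26 | / | T | / | / | / | 3.81 | / | T | / | / | /
BIOT | P13747 | / | / | / | / | / | / | / | / | / | 27.28 | / | T | / | / | /
BIOT | P17693 | / | / | / | 17.86 | / | T | 11.45 | / | T | / | / | / | / | / | /
BIOT | P18465 | / | / | / | / | / | / | / | / | / | / | / | / | / | / | /
BIOT | P30499 | / | / | / | / | / | / | / | / | / | 7.46 | / | T | 8.16 | / | T
BIOT | Q07000 | / | / | / | / | / | / | 6.50 | / | T | 10.36 | / | T | / | / | /
BIOT | Q29960 | / | / | / | / | / | / | / | / | / | 76.83 | / | T | / | / | /
BIOT | P30504 | / | / | / | 5.26 | / | T | / | / | / | 14.28 | / | T | 4.26 | / | T
BIOT | P01906 | / | / | / | 17.24 | / | T | 20.95 | / | T | 52.14 | / | T | 11.15 | / | T
BIOT | P01903 | / | / | / | 56.70 | / | T | / | / | / | 194.79 | 14.23 | 13.69 | 134.80 | 67.48 | 2.00
BIOT | P05556 | 42.12 | 20.47 | 2.06 | 49.54 | 12.27 | 4.04 | 41.54 | / | T | 109.41 | / | T | 153.28 | 127.13 | 1.21
BIOT | Q86UP2 | 13.15 | / | T | 19.93 | / | T | 36.19 | / | T | / | / | / | 23.59 | / | T
BIOT | P33241 | 16.40 | / | T | 26.78 | / | T | 25.57 | / | T | 498.84 | / | T | / | / | /
BIOT | P14174 | 66.26 | / | T | 83.82 | / | T | 194.74 | / | T | / | / | / | 418.63 | 201.41 | 2.08
BIOT | O15173 | 18.32 | / | T | 17.82 | 10.87 | 1.64 | 23.57 | / | T | / | / | / | / | / | /
BIOT | P26038 | 10.47 | / | T | 12.96 | / | T | 7.81 | / | T | 66.34 | / | T | 26.59 | 13.48 | 1.97
BIOT | P29966 | 37.74 | / | T | 16.83 | / | T | / | / | / | 117.74 | / | T | / | 12.98 | N
BIOT | O14745 | 27.16 | / | T | 15.83 | / | T | 38.33 | / | T | 23.53 | / | T | / | / | /
BIOT | Q15063 | 167.28 | 43.21 | 3.87 | 214.02 | 10.58 | 20.22 | 51.66 | 35.78 | 1.44 | 54.59 | / | T | 376.86 | / | T
BIOT | P36955 | 7.66 | / | T | 24.58 | / | T | 13.69 | 4.77 | 2.87 | / | / | / | / | / | /
BIOT | P07237 | 83.16 | 11.07 | 7.51 | 61.79 | 20.99 | 2.94 | 105.59 | 4.67 | 22.60 | 251.07 | 8.32 | 30.18 | 107.62 | 49.57 | 2.17
BIOT | Q15084 | / | / | / | / | / | / | 13.40 | / | T | 65.23 | / | T | 33.37 | / | T
BIOT | P01893 | / | / | / | 2.99 | / | T | 8.59 | / | T | / | / | / | / | / | /
BIOT | P62834 | / | / | / | 0.94 | / | T | / | / | / | 22.20 | / | T | / | / | /
BIOT | P61224 | 0.54 | / | T | 4.64 | / | T | / | / | / | / | / | / | / | 43.25 | N
BIOT | Q6FHJ7 | 10.90 | / | T | 15.69 | / | T | 8.79 | / | T | / | / | / | / | / | /
BIOT | P05026 | 11.35 | / | T | 12.80 | / | T | 22.65 | / | T | / | / | / | / | / | /
BIOT | P24821 | 36.34 | / | T | 43.50 | / | T | 19.06 | / | T | 15.20 | / | T | / | / | /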


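Table 3. Continued

fraction | accession | BPSCC10/39 tumor (fmol) | BPSCC10/39 normal (fmol) | BPSCC10/39 ratio T/N | BPSCC10/40 tumor (fmol) | BPSCC10/40 normal (fmol) | BPSCC10/40 ratio T/N | BPSCC10/46 tumor (fmol) | BPSCC10/46 normal (fmol) | BPSCC10/46 ratio T/N | BPSCC10/47 tumor (fmol) | BPSCC10/47 normal (fmol) | BPSCC10/47 ratio T/N | BPSCC10/49 tumor (fmol) | BPSCC10/49 normal (fmol) | BPSCC10/49 ratio T/N
BIOT | P10599 | 58.00 | 70.02 | 0.83 | 100.71 | / | T | 136.25 | 19.86 | 6.86 | 349.51 | 24.71 | 14.15 | 178.09 | 92.28 | 1.93
BIOT | P35442 | 31.08 | / | T | 49.40 | / | T | 2.75 | / | T | / | / | / | / | / | /
BIOT | P37802 | 147.79 | 45.28 | 3.26 | 227.26 | 80.51 | 2.82 | 213.89 | 15.89 | 13.46 | 793.49 | 53.56 | 14.81 | 296.44 | 154.23 | 1.92
BIOT | P13611 | 62.24 | / | T | 72.51 | / | T | 38.30 | / | T | / | / | / | 370.69 | / | T
BIOT | Q9P0L0 | 7.04 | / | T | / | / | / | 14.49 | / | T | 16.96 | / | T | / | / | /
BIOT | P21796 | 11.02 | / | T | 13.73 | / | T | 13.85 | / | T | / | / | / | / | / | /
REST | Q01518 | 145.47 | 79.37 | 1.83 | 60.39 | 38.16 | 1.58 | 33.73 | / | T | 29.24 | 9.53 | 3.07 | 162.79 | 79.37 | 2.05
REST | Q8IUX7 | 87.63 | / | T | 269.51 | / | T | 24.21 | / | T | 20.72 | / | T | / | / | /
REST | P80723 | 39.57 | / | T | 32.05 | / | T | 22.73 | / | T | 33.61 | / | T | / | / | /
REST | Q99439 | 69.92 | / | T | 42.00 | / | T | 34.84 | / | T | / | / | / | / | / | /
REST | Q15417 | 76.39 | / | T | 70.41 | / | T | 10.87 | / | T | / | / | / | / | / | /
REST | P27797 | 146.02 | / | T | 171.01 | 44.62 | 3.83 | 186.71 | / | T | 232.37 | / | T | 186.69 | / | T
REST | O43852 | / | / | / | 32.86 | / | T | 21.68 | / | T | 43.13 | / | T | 54.46 | / | T
REST | P13987 | 98.10 | 38.57 | 2.54 | / | / | / | 56.71 | / | T | / | / | / | 106.97 | 38.57 | 2.77
REST | P02452 | 1169.78 | 378.44 | 3.09 | 296.91 | 146.24 | 2.03 | 217.81 | 158.64 | 1.37 | 158.29 | 89.37 | 1.77 | 787.36 | 378.44 | 2.08
REST | Q99715 | 245.36 | / | T | 599.89 | 49.59 | 12.10 | 47.76 | / | T | 37.69 | / | T | 110.03 | / | T
REST | P08123 | 2435.30 | 298.96 | 8.15 | 1347.89 | 759.96 | 1.77 | 450.09 | 492.83 | 0.91 | 669.99 | 389.54 | 1.72 | 2071.91 | 298.96 | 6.93
REST | Q13561 | 26.52 | / | T | 6.71 | / | T | 21.50 | / | T | 23.69 | / | T | / | / | /
REST | Q9Y6C2 | 50.92 | / | T | 20.37 | / | T | 5.32 | / | T | 16.09 | / | T | / | / | /
REST | P02751 | 1032.75 | 35.05 | 29.47 | 503.07 | 29.24 | 17.21 | 42.14 | 15.62 | 2.70 | 64.72 | 21.59 | 3.00 | 193.91 | 35.05 | 5.53
REST | P09382 | 427.75 | 177.17 | 2.41 | 310.39 | 143.35 | 2.17 | 98.01 | / | T | 196.88 | 26.42 | 7.45 | 303.39 | 177.17 | 1.71
REST | P13746 | / | / | / | / | / | / | / | / | / | / | / | / | / | / | /
REST | P01903 | / | / | / | 40.94 | / | T | 23.47 | / | T | 52.85 | / | T | / | / | /
REST | P01911 | / | / | / | / | / | / | / | / | / | 4.08 | / | T | / | / | /
REST | P29966 | 371.72 | / | T | 44.48 | / | T | 38.49 | / | T | 50.18 | / | T | / | / | /
REST | O14745 | 53.47 | / | T | 24.99 | / | T | 20.32 | / | T | 18.63 | / | T | / | / | /
REST | Q15063 | 796.75 | / | T | 386.44 | / | T | 109.63 | / | T | 70.78 | / | T | 345.96 | / | T
REST | P36955 | 51.34 | / | T | 82.79 | / | T | 30.91 | 32.74 | 0.94 | / | / | / | / | / | /
REST | P13796 | 51.16 | / | T | 33.76 | 9.32 | 3.62 | / | / | / | 47.69 | / | T | 108.53 | / | T
REST | Q15084 | 115.34 | 27.25 | 4.23 | 62.03 | 26.86 | 2.31 | 75.40 | / | T | 100.48 | 14.39 | 6.98 | 110.93 | 27.25 | 4.07
REST | P07237 | 246.73 | 44.97 | 5.49 | 114.81 | / | T | 166.75 | / | T | 141.21 | 12.94 | 10.91 | 158.54 | 44.97 | 3.53
REST | Q92928 | / | / | / | 14.64 | / | T | 25.87 | / | T | 5.78 | / | T | / | / | /
REST | P50395 | 46.52 | / | T | 40.81 | 16.57 | 2.46 | 15.22 | / | T | / | / | / | 26.74 | / | T
REST | P46940 | 85.11 | 58.70 | 1.45 | 41.69 | 26.35 | 1.58 | 18.52 | / | T | 22.25 | / | T | / | 58.70 | N
REST | P61106 | / | / | / | 74.27 | / | T | 10.20 | / | T | 28.57 | / | T | / | / | /
REST | P08134 | / | / | / | 15.72 | / | T | 10.19 | / | T | 4.10 | / | T | / | / | /
REST | P05023 | 11.26 | / | T | 27.70 | 14.26 | 1.94 | 1.18 | / | T | / | / | / | 13.18 | / | T
REST | P24821 | 142.76 | / | T | 72.18 | / | T | 19.54 | / | T | 24.65 | / | T | / | / | /
REST | P07996 | 55.15 | / | T | 354.78 | / | T | 42.28 | / | T | / | / | / | / | / | /
REST | P35442 | 25.59 | / | T | 126.87 | / | T | 5.43 | / | T | / | / | / | / | / | /
REST | Q15582 | 31.86 | / | T | 57.99 | 15.84 | 3.66 | / | / | / | 21.65 | / | T | / | / | /
REST | P61586 | 56.39 | / | T | 66.91 | 20.29 | 3.30 | 14.93 | / | T | 52.55 | 13.00 | 4.04 | / | / | /
REST | P37802 | 90.56 | / | T | 246.42 | 28.26 | 8.72 | 135.13 | / | T | 265.33 | / | T | / | / | /
REST | P13611 | 277.48 | / | T | 222.78 | 19.46 | 11.45 | 100.89 | / | T | 72.60 | / | T | 473.07 | / | T
REST | Q9P0L0 | 29.22 | / | T | / | / | / | 13.25 | / | T | 13.65 | / | T | 42.13 | / | T
REST | P18206 | 69.04 | 87.19 | 0.79 | 92.61 | 59.56 | 1.55 | 26.24 | / | T | 18.98 | 12.38 | 1.53 | 92.50 | 87.19 | 1.06
REST | P21796 | / | / | / | 55.36 | / | T | 20.46 | / | T | 26.05 | / | T | 49.19 | / | T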


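Table 3. Continued

fraction | accession | BPSCC10/39 tumor (fmol) | BPSCC10/39 normal (fmol) | BPSCC10/39 ratio T/N | BPSCC10/40 tumor (fmol) | BPSCC10/40 normal (fmol) | BPSCC10/40 ratio T/N | BPSCC10/46 tumor (fmol) | BPSCC10/46 normal (fmol) | BPSCC10/46 ratio T/N | BPSCC10/47 tumor (fmol) | BPSCC10/47 normal (fmol) | BPSCC10/47 ratio T/N | BPSCC10/49 tumor (fmol) | BPSCC10/49 normal (fmol) | BPSCC10/49 ratio T/N
GLYCO | P21810 | 1220.20 | 338.77 | 3.60 | 830.89 | 239.89 | 3.46 | 371.85 | 530.25 | 0.70 | 993.51 | 653.28 | 1.52 | 1366.49 | 336.93 | 4.06
GLYCO | Q10589 | / | 39.76 | N | 135.25 | / | T | 23.32 | / | T | 60.09 | / | T | / | / | /
GLYCO | Q9UBG0 | 70.25 | 97.90 | 0.72 | 117.20 | 90.79 | 1.29 | / | / | / | 41.64 | / | T | 225.49 | / | T
GLYCO | Q5ZPR3 | 316.19 | 19.44 | 16.26 | 51.88 | / | T | 70.59 | / | T | 47.08 | / | T | / | / | /
GLYCO | Q9BY67 | 114.44 | / | T | 105.15 | 24.53 | 4.29 | 63.81 | / | T | 121.79 | / | T | / | / | /
GLYCO | Q99715 | 461.55 | / | T | 1200.98 | 77.78 | 15.44 | 267.53 | 69.74 | 3.84 | 124.05 | 85.92 | 1.44 | 474.59 | / | T
GLYCO | P0C0L5 | / | / | / | / | / | / | 289.56 | / | T | / | / | / | / | / | /
GLYCO | Q9Y6C2 | 176.35 | 35.77 | 4.93 | 115.57 | 63.32 | 1.83 | 113.41 | / | T | 203.40 | / | T | 264.33 | 64.41 | 4.10
GLYCO | Q08380 | 569.32 | 277.24 | 2.05 | 719.91 | 221.16 | 3.26 | 674.45 | 141.05 | 4.78 | 817.14 | 173.77 | 4.70 | 541.86 | 1016.83 | 0.53
GLYCO | P02790 | 1102.59 | 356.78 | 3.09 | 837.70 | 568.98 | 1.47 | 699.04 | / | T | 689.38 | / | T | 1020.71 | 2366.13 | 0.43
GLYCO | P12314 | / | / | / | 29.19 | / | T | 51.46 | / | T | 33.80 | / | T | / | / | /
GLYCO | P01903 | 346.85 | 76.44 | 4.54 | 790.66 | 129.48 | 6.11 | 477.00 | / | T | 857.60 | / | T | 1387.41 | 498.70 | 2.78
GLYCO | Q9NX62 | 29.75 | / | T | 63.35 | / | T | 85.50 | / | T | 109.96 | / | T | 82.31 | / | T
GLYCO | P06756 | 238.76 | 88.51 | 2.70 | 242.64 | 82.09 | 2.96 | 188.93 | 16.44 | 11.49 | 100.74 | 20.26 | 4.97 | / | / | /
GLYCO | P05107 | / | / | / | 76.90 | / | T | 60.82 | / | T | 115.87 | / | T | / | / | /
GLYCO | P11279 | 330.30 | 102.51 | 3.22 | 198.55 | 133.08 | 1.49 | 138.08 | 68.36 | 2.02 | 189.55 | 84.22 | 2.25 | 146.18 | 93.01 | 1.57
GLYCO | P13473 | 302.58 | / | T | 579.10 | 131.88 | 4.39 | 406.32 | 44.79 | 9.07 | 559.75 | 55.18 | 10.14 | 296.56 | 112.96 | 2.63
GLYCO | P01033 | 98.55 | 46.67 | 2.11 | 171.39 | / | T | 253.82 | / | T | 199.87 | / | T | 110.64 | 207.52 | 0.53
GLYCO | Q15063 | 149.69 | 45.08 | 3.32 | 121.32 | / | T | 148.65 | / | T | 73.46 | / | T | 135.54 | / | T
GLYCO | P36955 | 689.37 | 421.77 | 1.63 | 1088.51 | 262.63 | 4.14 | 603.41 | 1219.09 | 0.49 | 437.38 | 1501.94 | 0.29 | 946.37 | 411.71 | 2.30
GLYCO | P05154 | 110.18 | / | T | 61.97 | / | T | 161.06 | 79.15 | 2.03 | 93.06 | 97.52 | 0.95 | 193.19 | 216.20 | 0.89
GLYCO | P24821 | 996.33 | / | T | 620.11 | / | T | 312.18 | / | T | 428.79 | / | T | 223.84 | / | T
GLYCO | P07996 | / | / | / | 618.55 | / | T | 135.30 | / | T | 22.99 | / | T | / | / | /
GLYCO | P04216 | 813.23 | 287.97 | 2.82 | 699.95 | 154.93 | 4.52 | 490.06 | 92.46 | 5.30 | 483.84 | 113.91 | 4.25 | 357.80 | 164.04 | 2.18
GLYCO | P02786 | 72.33 | / | T | / | / | / | 32.71 | / | T | 123.08 | / | T | 66.86 | / | T
GLYCO | Q9HD45 | 327.21 | 90.22 | 3.63 | 196.84 | 94.87 | 2.07 | 292.61 | / | T | / | / | / | / | / | /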

a The quantity is reported in fmol and relates to 2.5 μg of protein digest injected on the HPLC column. Where applicable, the ratio indicates the fold difference between the absolute protein quantities observed in the tumoral vs the normal condition. No further normalization of the ratio values was conducted.
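The ratio column of Table 3 can be reproduced with a short sketch. It assumes, from the pattern of the table entries rather than from an explicit statement in the footnote, that "T" marks a protein quantified only in the tumor sample, "N" one quantified only in the normal sample, and "/" a protein quantified in neither; the function name `tn_ratio` is hypothetical.

```python
def tn_ratio(tumor_fmol, normal_fmol):
    """Return the T/N entry for one protein in one patient.

    Quantities are in fmol; None stands for a "/" cell (not quantified).
    No further normalization is applied, as stated in the table footnote.
    """
    if tumor_fmol is None and normal_fmol is None:
        return "/"  # quantified in neither condition
    if normal_fmol is None:
        return "T"  # assumed meaning: found in tumor only
    if tumor_fmol is None:
        return "N"  # assumed meaning: found in normal tissue only
    return f"{tumor_fmol / normal_fmol:.2f}"


# Q01518, BIOT fraction, patients BPSCC10/39, BPSCC10/40 and BPSCC10/46
print(tn_ratio(27.25, 8.42))
print(tn_ratio(25.01, 10.67))
print(tn_ratio(15.21, None))
```

For the Q01518 entries of the BIOT fraction this gives 3.24, 2.34 and T, matching the tabulated values.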

to 20 fractions based either on the molecular weight or the pI of the protein. The only plausible hypothesis is that the prior enrichment of such hydrophobic and basic proteins, as applied in the current setting, renormalized the remaining proteome of the tissue extract. In other words, owing to the two specific steps of the sequential method, a particular, new composition of the sample has emerged in a repeatable fashion. Altogether, the efficiency of the current approach is limited to the comparison of identified proteins with database information regarding their subcellular localization. However, a protein known to be a membrane protein does not necessarily need to be accessible. Conversely, intracellular proteins may in certain circumstances also be shuttled to the surface (e.g., proteins found in the endoplasmic reticulum). Therefore, further careful assessment of identified proteins (e.g., with regard to their specific protein motifs) and in vivo validation experiments using labeled targeting agents are needed in order to confirm the real accessibility of a given protein. In the frame of the current study, known and novel modulated proteins are reported. Although the data are encouraging, this limited biological study serves only as a proof of concept that the discovery of novel modulated and potentially accessible proteins through the use of the outlined method is possible. One such modulated protein is CD276. Current literature shows that this protein is expressed at the cell surface of various types of tumors. Roth et al.15 showed CD276 expression in normal liver, urothelium, and fetal kidney, but also increased levels in prostate cancer. Recent studies16 indicated that CD276 may be predominantly expressed by the tumor vasculature. However, the exact role of this protein in cancer remains relatively unknown. Some findings do indicate that one possible function of CD276 may lie in the impairment of T-cell-mediated immunity.17 Our finding that CD276 is a potentially targetable protein in breast cancer is novel. The IHC analysis performed with a larger collection of breast cancer lesions validated the differential expression of CD276 in tumoral versus normal tissues. These results warrant further studies directed toward clarifying the role of this protein in tumor cells. Altogether, the method described in this study permits a comprehensive exploitation of scarce pathological material and has shown its ability to extract an interesting group of proteins


Figure 9. Immunohistochemical validation of CD276. (A) Box plot evaluating CD276 antigen positivity in breast ductal adenocarcinoma and normal breast tissue. Details of the scoring and the statistics are given in the Materials and Methods. (B) Representative images of breast ductal adenocarcinoma cells [AC] and normal breast ducts [D] immunostained with anti-CD276.

that have the potential to be used as diagnostic or therapeutic cancer biomarkers. In this context, further (in vivo) studies are needed to validate the true systemic accessibility of the identified biomarkers. The approach is a move away from classical shotgun proteomics and is directed toward a specific group of proteins. The robustness of the analysis opens up new possibilities for using this method in other applications, not strictly related to cancer, that address any biological question requiring insight into the accessible part of the membrane proteome.

ASSOCIATED CONTENT

Supporting Information. Supplemental figures and tables. This material is available free of charge via the Internet at http://pubs.acs.org.


AUTHOR INFORMATION

Corresponding Author

*Vincent Castronovo, MD, PhD, GIGA Cancer, University of Liege, Pathology Building, B23, +4, B-4000 Liege, Belgium. E-mail: [email protected]. Phone: +32 43662479. Fax: +32 43662975.

ACKNOWLEDGMENT

This work was supported by a grant from the Research Concerted Action (IDEA project) of the University of Liege (ULG), Belgium, from the CEE (FP7 network: ADAMANT, Antibody Derivatives As Molecular Agents for Neoplastic Targeting (HEALTH-F2-2007-201342)), from the National Fund for Scientific Research (NFSR, Belgium) and TELEVIE, as well as from the Centre Anti-Cancereux of the ULG. The authors acknowledge the GIGA-Proteomics Platform of the ULG and Pascale Heneaux (LRM) for experimental support.

ABBREVIATIONS

ABC, avidin-biotin complex; BIOT, biotinylated protein fraction; DAB, 3,3′-diaminobenzidine tetrahydrochloride dihydrate; DNA, deoxyribonucleic acid; DOC, deoxycholic acid; DTT, dithiothreitol; FPR, false positive rate; GLYCO, glycosylated peptide/protein fraction; GSSG, oxidized glutathione; HCl, hydrogen chloride; H2O2, hydrogen peroxide; HSA, human serum albumin; IgG, immunoglobulin G; IHC, immunohistochemistry; IS, internal standard; NaCl, sodium chloride; Na2CO3, sodium carbonate; NH4HCO3, ammonium bicarbonate; NP40, Nonidet P40; Pcc, Pearson correlation coefficient; PBS, phosphate buffered saline; PI, protease inhibitor; PNGase F, peptide N-glycosidase F; PLGS, ProteinLynx Global SERVER; REST, rest peptide/protein fraction; RNA, ribonucleic acid; RT, room temperature; SA, streptavidin; SDS, sodium dodecyl sulfate.

REFERENCES

(1) Cancer diagnostics: discovery and clinical applications. Clin. Chem. 2002, 48 (whole issue), 1145–1375.
(2) Recent advances in cancer biomarkers. Clin. Biochem. 2004, 37 (whole issue), 503–647.
(3) Biomarkers and clinical proteomics. Mol. Cell. Proteomics 2006, 5 (whole issue), S1–S402.
(4) Proteomics and biomarkers. J. Proteome Res. 2005, 4 (whole issue), 1053–1456.
(5) Celis, J. E.; Gromov, P.; Cabezon, T.; Moreira, J. M.; Ambartsumian, N.; Sandelin, K.; Rank, F.; Gromova, I. Proteomic characterization of the interstitial fluid perfusing the breast tumor microenvironment: a novel resource for biomarker and therapeutic target discovery. Mol. Cell. Proteomics 2004, 3, 327–344.
(6) Castronovo, V.; Kischel, P.; Guillonneau, F.; de Leval, L.; Defechereux, T.; De Pauw, E.; Neri, D.; Waltregny, D. Identification of specific reachable molecular targets in human breast cancer using a versatile ex vivo proteomic method. Proteomics 2007, 7, 1188–1196.
(7) Wollscheid, B.; Bausch-Fluck, D.; Henderson, C.; O’Brien, R.; Bibel, M.; Schiess, R.; Aebersold, R.; Watts, J. D. Mass-spectrometric identification and relative quantification of N-linked cell surface glycoproteins. Nat. Biotechnol. 2009, 27 (4), 378–386.
(8) Naeem, A.; Saleemuddin, M.; Khan, R. H. Glycoprotein targeting and other applications of lectins in biotechnology. Curr. Protein Pept. Sci. 2007, 8 (3), 261–271.
(9) Zhang, H.; Li, X. J.; Martin, D. B.; Aebersold, R. Identification and quantification of N-linked glycoproteins using hydrazide chemistry, stable isotope labeling and mass spectrometry. Nat. Biotechnol. 2003, 21 (6), 660–666.
(10) Tian, Y.; Zhou, Y.; Elliott, S.; Aebersold, R.; Zhang, H. Solid-phase extraction of N-linked glycopeptides. Nat. Protoc. 2007, 2, 334–339.
(11) Zhang, H. Glycoproteomics using chemical immobilization. Curr. Protoc. Protein Sci. 2007, 24, No. unit 24.3.
(12) Sun, B.; Ranish, J. A.; Utleg, A. G.; White, J. T.; Yan, X.; Lin, B.; Hood, L. Shotgun glycopeptide capture approach coupled with mass spectrometry for comprehensive glycoproteomics. Mol. Cell. Proteomics 2007, 6, 141–149.
(13) Chen, R.; Jiang, X.; Sun, D.; Han, G.; Wang, F.; Ye, M.; Wang, L.; Zou, H. Glycoproteomics analysis of human liver tissue by combination of multiple enzyme digestion and hydrazide chemistry. J. Proteome Res. 2009, 8, 651–661.
(14) Sprung, R. W., Jr.; Brock, J. W.; Tanksley, J. P.; Li, M.; Washington, M. K.; Slebos, R. J.; Liebler, D. C. Equivalence of protein inventories obtained from formalin-fixed paraffin-embedded and frozen tissue in multidimensional liquid chromatography-tandem mass spectrometry shotgun proteomic analysis. Mol. Cell. Proteomics 2009, 8 (8), 1988–1998.
(15) Roth, T. J.; Sheinin, Y.; Lohse, C. M.; Kuntz, S. M.; Frigola, X.; Inman, B. A.; Krambeck, A. E.; McKenney, M. E.; Karnes, R. J.; Blute, M. L.; Cheville, J. C.; Sebo, T. J.; Kwon, E. D. B7-H3 ligand expression by prostate cancer: a novel marker of prognosis and potential target for therapy. Cancer Res. 2007, 67, 7893–7900.
(16) Crispen, P. L.; Sheinin, Y.; Roth, T. J.; Lohse, C. M.; Kuntz, S. M.; Frigola, X.; Thompson, R. H.; Boorjian, S. A.; Dong, H.; Leibovich, B. C.; Blute, M. L.; Kwon, E. D. Tumor cell and tumor vasculature expression of B7-H3 predict survival in clear cell renal cell carcinoma. Clin. Cancer Res. 2008, 14 (16), 5150–5157.
(17) Castriconi, R.; Dondero, A.; Augugliaro, R.; Cantoni, C.; Carnemolla, B.; Sementa, A. R.; Negri, F.; Conte, R.; Corrias, M. V.; Moretta, L.; Moretta, A.; Bottino, C. Identification of 4Ig-B7-H3 as a neuroblastoma-associated molecule that exerts a protective role from an NK cell-mediated lysis. Proc. Natl. Acad. Sci. U.S.A. 2004, 101, 12640–12645.