ARTICLE pubs.acs.org/jpr
Novel Comprehensive Approach for Accessible Biomarker Identification and Absolute Quantification from Precious Human Tissues
Andrei Turtoi,†,‡ Bruno Dumont,† Yannick Greffe,† Arnaud Blomme,† Gabriel Mazzucchelli,‡ Philippe Delvenne,§ Eugene Nzaramba Mutijima,§ Eric Lifrange,|| Edwin De Pauw,‡ and Vincent Castronovo*,†
† Metastasis Research Laboratory, GIGA Cancer, University of Liege, Bat. B23, Liege, B-4000 Liege, Belgium
‡ GIGA Systems Biology and Chemical Biology, Laboratory of Mass Spectrometry, Department of Chemistry, University of Liege, Bat. B6c, B-4000 Liege, Belgium
§ Faculty of Medicine, Department of Anatomy and Pathology, University of Liege, B-4000 Liege, Belgium
|| Department of Senology, University Hospital (CHU), University of Liege, B-4000 Liege, Belgium
Supporting Information
ABSTRACT: The identification of specific biomarkers obtained directly from human pathological lesions remains a major challenge, because the amount of tissue available is often very limited. We have developed a novel, comprehensive, and efficient method permitting the identification and absolute quantification of potentially accessible proteins in such precious samples. This protein subclass comprises cell membrane associated and extracellular proteins, which are reachable by systemically deliverable substances and hence especially suitable for diagnosis and targeted therapy applications. To isolate such proteins, we exploited the ability of chemically modified biotin to label ex vivo accessible proteins and the fact that most of these proteins are glycosylated. The approach consists of three successive steps. First, potentially accessible proteins are linked to biotin molecules and purified. The remaining proteins are then subjected to glycopeptide isolation. Finally, the nonglycosylated peptides are analyzed and used in an in silico method that increases the confident identification of glycoproteins. The value of the technique was demonstrated on human breast cancer tissue samples originating from five individuals. Altogether, the method delivered quantitative data on more than 400 potentially accessible proteins (per sample and replicate). In comparison to biotinylation or glycoprotein analysis alone, the sequential method significantly increased the number (≥30% and ≥50%, respectively) of potentially therapeutically and diagnostically valuable proteins. The sequential method led to the identification of 93 differentially modulated proteins, several of which had not previously been reported to be associated with breast cancer. One of these novel potential biomarkers was CD276, a cell membrane-associated glycoprotein.
Immunohistochemistry analysis showed that CD276 is significantly differentially expressed in a series of breast cancer lesions. Because our technology is applicable to any type of tissue biopsy, it has the potential to accelerate the discovery of new relevant biomarkers in a broad spectrum of pathologies.
KEYWORDS: proteomics, accessible proteins, glycoproteins, cancer biomarkers
1. INTRODUCTION
A major step in many aspects of research related to malignant diseases is the identification of specific and sensitive biomarkers suitable for the development of effective diagnostic, prognostic and therapeutic modalities. Owing to mass spectrometry, shotgun proteomics and DNA/RNA microarray analyses, the list of reported potential tumor biomarkers is increasing rapidly. Despite this abundance, very few of these modulated proteins have found their way into the clinical validation phase, and even fewer are used as reliable therapeutic targets or diagnostic markers.1–4 We and others believe that one of the most promising ways of overcoming this difficulty is narrowing down the number of proteins investigated to the essential group of interest. Modulated proteins accessible from the bloodstream are an example of such a group. These proteins are mostly membrane based or embedded in the surrounding extracellular matrix. They are of particular interest because they have increased potential to be reached by systemically delivered monoclonal antibodies loaded with pharmacological compounds.
Received: March 6, 2011
Published: May 03, 2011
© 2011 American Chemical Society
dx.doi.org/10.1021/pr200212r | J. Proteome Res. 2011, 10, 3160–3182
Journal of Proteome Research A frequent limitation in the identification of potential biomarkers is the scarcity of the tissues from which they need to be recovered. This is particularly true of human pathological tissues, such as cancer lesions, which are available for research only in very small amounts, making these samples very precious. Currently available proteomic methods that enable the analysis of potentially accessible proteins from such limited quantities leave substantial room for improvement. The available techniques attempt to tackle the problem primarily by exploiting the physical location of the protein of interest5 and are to a lesser extent focused on their chemical properties. To this end, the use of chemically modified biotin that labels accessible proteins through their free amino groups combined with streptavidin affinity chromatography represents a powerful method.6 Nevertheless, accessible proteins that do not bear such free amino groups will escape the inventory. To try to include a more comprehensive set of accessible proteins, we decided to exploit the known fact that most membrane and extracellular proteins are glycoproteins.7 Hence, the analysis of glycoproteins in tandem with the biotinylation method would offer a real possibility of covering a supplementary segment of accessible, but previously unidentified, proteins. 
Enrichment techniques for glycoproteins have been developed by employing lectin column affinity purification combined with concanavalin A.8 An alternative method is the hydrazide capture of previously oxidized glycans, which appears to be more reproducible and less biased toward certain subsets of the glycoproteome than the lectin technique.9–11 This approach has already been applied to cell lines12 and recently also to normal tissue samples.13 Along these lines, we have developed an original method that combines three sequential steps: protein biotinylation, glycopeptide analysis, and the sequencing of peptides that are neither glycosylated nor biotinylated (rest-proteins). The last step helps to assign those glycopeptides that cannot be related with enough confidence to a specific protein (see Figure 2). Additionally, we show that the rest-fraction also contains a number of unique and potentially accessible proteins, making this step an integral part of the approach. The method was designed such that a minimal amount of material or information is wasted. To validate our approach, we included several internal standards and used a suitable MS analysis technique allowing absolute, accurate and repeatable quantification of proteins found in the sample. The proof of concept was demonstrated on human breast cancer tissues, where known and novel differentially expressed antigens were identified. Altogether, our study provides scientists with a new method to increase the number of identified modulated proteins while maximally exploiting their precious samples, yet focusing on candidates that are potentially accessible and hence relevant tumor biomarkers.
In this context, it is important to mention that ultimate proof of accessibility necessitates further validation experiments using appropriate in vivo tumor models and intravenous injection of affinity ligands (e.g., antibodies) coupled to suitable imaging agents (fluorescent-dyes or radio-isotopes).
2. MATERIALS AND METHODS
All the individuals involved in the current work were informed in detail regarding the aims of the study and gave their written consent. The project purpose and the undertaken experiments complied with the regulations and ethical guidelines of the University of Liege, Belgium, and were approved by the ethics committee. The study is divided into two parts: (i) technical (demonstrating repeatability, accuracy and the value of the method) and (ii) biological (proof of concept study). The technical part employed one “master sample”, which was prepared as a pool of equal amounts of all the tissue samples involved in the MS analysis (all individuals, both tumoral and normal specimens). The biological part, and specifically the identification of modulated proteins, was conducted using each of the individual samples outlined in Table S1 (Supporting Information) separately. Finally, the validation of a specific differentially expressed protein was conducted on a separate group of breast cancer patients (30 tumoral and 10 normal individuals). All patients involved in the IHC validation study were diagnosed with ductal breast adenocarcinoma, had clinical grades of at least 2 and presented no metastasis at the time of surgery. Regarding the number and type of replicates in each part of the study, note that: (i) all analyses conducted in the technical part involved three full technical replicates (at the level of tissue solubilization), performed with the same tools on separate days and using the “master sample”; (ii) the investigations conducted in the biological proof of concept study are single technical/biological replicates of matched tumoral and normal samples originating from five individuals.
2.1. Tissue Sample Preparation
Pieces of fresh human breast cancer biopsies obtained from the Pathology Department of the University Hospital of Liege, Belgium, were immediately sliced and soaked in freshly prepared EZ-link Sulfo NHS-SS biotin (1 mg/mL in PBS [phosphate buffered saline], Pierce, Rockford, IL) solution and incubated for 20 min (37 °C) as described previously.6 The reaction was stopped by addition of Tris-HCl pH 7.4 (final concentration 50 mM). Tissue samples were then snap-frozen in liquid nitrogen and pulverized using a Mikro-Dismembrator U (Braun Biotech, Melsungen, Germany). Approximately 100 mg of tissue powder was dissolved in 500 μL PBS containing protease inhibitor (PI) cocktail (Halt, Pierce, Rockford, IL), 0.5 mM oxidized glutathione (GSSG) and level 1 internal standard mix (IS1) consisting of bovine fetuin, bovine casein and biotinylated chicken ovalbumin (prepared by incubating ovalbumin with EZ-link Sulfo NHS-SS biotin reagent); ratio spike/sample 1/200. Homogenates were then sonicated (2 × 30 s) with a 2 mm microprobe and centrifuged at 20 000 × g for 10 min (4 °C). Human serum albumin (HSA) and immunoglobulins (IgG) were eliminated using the Qproteome HSA and IgG Removal Kit (Qiagen, Valencia, CA). The remaining pellet was suspended in 500 μL of lysis buffer (1% Nonidet P40 (NP40), 0.5% deoxycholic acid (DOC), 0.1% SDS (sodium dodecyl sulfate), 0.5 mM GSSG and PI cocktail in PBS, pH 7.0), sonicated (2 × 30 s) and centrifuged at 20 000 × g for 10 min (4 °C). The sample was subjected to HSA and IgG depletion as described above. Five hundred microliters of 2% SDS solution was added to the remaining insoluble pellet for a final resolubilization. The sample was then centrifuged (as mentioned above) and the supernatant collected. All lysates from the three solubilization steps were finally pooled together and boiled for 5 min to ensure complete denaturation of the proteins.
2.2. Isolation of Biotinylated Proteins
The total protein extract was mixed with 100 μL/mg Streptavidin (SA) resin (Pierce, Rockford, IL, USA) for 120 min under rotational conditions (RT). After the incubation, the supernatant
Figure 1. Sequential extraction of accessible proteins using tissue samples. The first step consists of biotinylation of accessible proteins and their isolation (BIOT fraction). The second step involves protein digestion into peptides followed by the isolation of glycopeptides (GLYCO fraction). The final step collects the nonglycosylated peptides (REST), which are used to complement the sequence information of nonassigned glycopeptides (detailed in Figure 2). Additionally, the REST fraction contained a significant number of accessible proteins, which were allowed to supplement the proteins already identified in the BIOT and GLYCO fractions. Three internal standards (IS1, 2 and 3) were added at different manipulation steps in the method to monitor recovery, reproducibility and accuracy of quantification. The composition of each internal standard is outlined in the Materials and Methods section. All the fractions were analyzed using the 2D-nanoUPLC-MSE system, which consisted essentially of two C18 phases run at pH 10 and pH 3, respectively.
was retained for the subsequent glycoproteomic analysis (fraction 1). The streptavidin beads were washed 4 times with 0.5 mL buffer A (1% NP40, 0.1% SDS and 0.5 mM GSSG in PBS buffer), 4 times with 0.5 mL buffer B (0.1% NP40, 1.5 M NaCl and 0.5 mM GSSG in PBS buffer), 2 times with 0.5 mL buffer C (0.1 M Na2CO3 and 0.5 mM GSSG in PBS buffer at pH 11.0) and finally 2 times with 0.5 mL PBS buffer at pH 7 without GSSG. The biotinylated proteins were eluted 2 times with 0.2 mL of 100 mM dithiothreitol (DTT) and incubated at 60 °C for 30 min (fraction 2). Fraction 1 was also reduced in 100 mM DTT. Both fractions were alkylated with 150 mM iodoacetamide for 30 min in the absence of light. At this stage, level 2 internal standard (IS2) consisting of bovine beta-lactoglobulin (spike 1/200) was added to both fractions. Proteins were then precipitated in the presence of 20% trichloroacetic acid (TCA) overnight (4 °C) and washed two times with ice-cold acetone. The protein pellets (fractions 1 and 2) were solubilized in 50 μL of 50 mM NH4HCO3 (for fraction 2 only, 1% DOC [deoxycholic acid] was also added) and digested (1:50 protease/protein ratio) overnight using trypsin (Promega, Madison, WI) (37 °C). The digestion was then extended for 4 h by addition of fresh trypsin (1:100). The biotinylated peptides (fraction 2) were then analyzed by MS. Fraction 1 was used for the isolation of glycopeptides as described below (Figure 1).
2.3. Isolation of Glycopeptides
The digested protein sample was acidified with HCl (final concentration 1%), transferred onto the C18 Sep-Pak column (Waters, Milford, MA) and washed with 3 × 1 mL of 0.1% formic acid solution. The peptides were eluted using acetonitrile (80%) and evaporated to dryness. The peptide-containing sample was dissolved in 100 μL of oxidation buffer (100 mM sodium acetate, 150 mM NaCl at pH 5.5), complemented with 10 μL of sodium periodate (100 mM stock solution) (Pierce, Rockford, IL) and incubated for 30 min in the dark. Following this, 10 μL of sodium sulfite (120 mM stock solution) was added and incubation was extended for an additional 10 min (quenching). The sample was adjusted to 200 μL (with oxidation buffer, as described above) and loaded onto hydrazide resin (Bio-Rad, Hercules, CA), and the glycopeptides were bound overnight (RT). The glycopeptide-free flow-through as well as the first two washes (of hydrazide resin with water) were collected for the subsequent MS analysis (nonglycosylated proteins). After extensive washing (2 × 500 μL each: water, 1.5 M NaCl, methanol, 80% ACN and 50 mM NH4HCO3), the hydrazide resin was loaded with 100 μL of 50 mM NH4HCO3 solution and incubated overnight with 500 units of PNGase F (New England Biolabs, Ipswich, MA) (37 °C). After the incubation period, the glycopeptide-containing flow-through was collected and desiccated.
2.4. MS Analysis
Five micrograms of peptides originating from the biotinylated, glycosylated and nonglycopeptide fractions were desalted using C18 ZipTip pipet tips (Millipore, Billerica, MA). Following this, the peptide-containing samples were first desiccated and then dissolved in 18 μL of 100 mM ammonium formate buffer (pH 10). Level 3 internal standard mix (IS3) was added to the dissolved samples; it was composed of MassPREP Digestion Standard Mixture 1 (Waters Corporation) containing an equimolar mix of yeast alcohol dehydrogenase, rabbit glycogen phosphorylase b, bovine serum albumin and yeast enolase. The final concentration in the 18 μL sample was adjusted to 135 fmol of yeast alcohol dehydrogenase. Of the sample prepared, 9 μL was injected, corresponding to an estimated protein load of 2.5 μg. For the MS analysis, the 2D-nano Acquity UPLC (Waters) was coupled online with the SYNAPT G1 qTOF system (Waters).
The configuration of the 2D-nano UPLC system was the following: first dimension separation column X-Bridge BEH C18 5 μm (300 μm × 50 mm), trap column Symmetry C18 5 μm (180 μm × 20 mm) and analytical column BEH C18 1.7 μm (75 μm × 150 mm) (all Waters). The sample was loaded at 2 μL/min (20 mM ammonium formate, pH 10) on the first column and subsequently eluted in 5 steps (10, 14, 16, 20 and 65% acetonitrile). Each eluted fraction was desalted on the trap column and subsequently separated on the second analytical column; flow rate 300 nL/min, solvent A (0.1% formic acid in water) and solvent B (0.1% formic acid in acetonitrile), gradient 0 min, 97% A; 90 min, 60% A. The MS acquisition parameters were: data independent, alternate scanning (MSE) mode, 50–1500 m/z range, ESI+, V optics, scan time 1 s, cone 30 V and lock mass [Glu1]-Fibrinopeptide B ([M + 2H]2+ 785.8426 m/z). Raw data were processed (deconvoluted, deisotoped, protein identification, absolute and relative quantification) using ProteinLynx Global SERVER (PLGS) v2.4. The processing parameters were: MS TOF resolution and the chromatographic peak width were set to automatic, low-/elevated-energy detection threshold to 250/100 counts, identification intensity threshold to 1500 counts and lock mass window to 785.8426 ± 0.30 Da. For protein identification, the UniProt human database served as the reference (canonical sequence data with 20 280 entries). The sequences of all spiked proteins (internal standards) of nonhuman origin were manually added to this database. Carbamidomethylation was set as a fixed peptide modification and oxidation (M) as variable. In addition, for glycoprotein analysis, deamidation (N) was included as a variable modification. A response factor (2200) for the conversion of the peptide intensities into absolute quantities was deduced previously following repeated injection of the alcohol dehydrogenase (yeast, Swiss-Prot P00330) digest.
This response factor was kept constant throughout the entire study. Routinely, IS1, IS2 and IS3 were checked for the correct relationship between the spiked and the measured absolute amount as well as for the relative ratio between the compared samples. The IS tolerances for both absolute quantities and relative ratios had to be within ±35% deviation for the data set to be acceptable and included in further analysis. PLGS software calculated the score, the relative ratio of protein expression (tumoral vs normal), its p-value and the false positive rate (FPR) for each individual protein hit. The p-value output format (PLGS) ranged from 0 to 1, indicating: (i) 0–0.05, significant down-regulation and (ii) 0.95–1.0, significant up-regulation of the respective protein; both instances indicate a level of certainty ≥95%. Within the present study, a protein was considered as identified if the FPR was ≥96% and the score ≥80. A protein was considered as modulated if the relative ratio of protein expression was higher than 1.5 fold with a significant p-value.
2.5. Glycoprotein Data Analysis
Regarding the glycoproteins, the processed MS data (deconvoluted spectra) were submitted for the database search, first separately for the fraction obtained from the hydrazide beads and then combined with the flow-through fraction. Following this, all the N-linked glycoproteins originating from the hydrazide beads were extracted with an in-house program. This program checked for the presence of deamidated asparagines at the consensus sequence site (NXS/T, where X can be any amino acid except proline) for each of the peptides in question. In this initial step, a certain number of glycopeptides could immediately be assigned to a respective glycoprotein (GLYCO fraction). The remaining glycopeptides were not specific enough or had lower scores, so that they could not be unambiguously associated with a protein. To help assign these peptides, they were matched with the peptides from the REST fraction analysis, where several nonglycosylated peptides in conjunction with the glycosylated peptides (from the GLYCO fraction) permitted the protein identification. This new combined pool of proteomic results was named the GLYCO REST fraction.
2.6. Immunohistochemical Validation of Selected Biomarkers
The expression of CD276 was assessed by immunohistochemistry in formalin-fixed paraffin-embedded breast tissue sections. Samples originating from 30 tumoral and 10 normal breasts were immunostained using a suitable antibody (monoclonal anti-B7H3, R&D Systems, Minneapolis, MN). Tissue sections of 5 μm thickness were deparaffinized by three baths in xylene for 5 min and rehydrated through a methanol gradient (100, 90, 70, 50% and H2O). Blocking of endogenous peroxidase was performed by 30 min incubation with 3% H2O2 and 90% methanol. Antigen retrieval was conducted in 10 mM citrate buffer (pH 6) using a 95 °C water bath for 40 min. Following 30 min blocking in PBS–normal serum solution (150 μL of normal rabbit serum [Vector Laboratories, Burlingame, CA] and 20 μL of Tween 20 in 10 mL PBS), the sections were incubated with the primary antibody overnight at 4 °C. Sections were then incubated with the biotinylated secondary antibody for 30 min and further with the avidin–biotin complex kit (ABC kit, Vector Laboratories) for an additional 30 min. 3,3′-Diaminobenzidine tetrahydrochloride dihydrate (DAB) with 5% H2O2 was used for color development. The slides were finally counterstained with hematoxylin. Immunostaining was assessed by two independent evaluators who examined the samples for percentage of positive cells (four arbitrary units/classes: 1 = 0–25%, 2 = 25–50%, 3 = 50–75% and 4 = 75–100%) and for staining intensity (four arbitrary units/classes: 0 = no staining, 1 = weak, 2 = moderate and 3 = strong). The results obtained by these two scales were then multiplied together, yielding a single value named score (y axis in Figure 9). Statistical analysis was performed using the Mann–Whitney Rank Sum Test for comparison between two groups (Sigma Plot; Systat Software, San Jose, CA).
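The scoring rule described above (percentage class multiplied by intensity class) and the rank-based group comparison can be illustrated with a minimal sketch. It is not the evaluators' procedure or the Sigma Plot implementation: the function names are hypothetical, the handling of boundary percentages (25, 50, 75) is an assumption because the stated classes overlap at their limits, and only the U statistic is computed, without a p-value.

```python
from itertools import chain


def percent_class(percent_positive):
    """Map % positive cells to classes 1-4 (1 = 0-25%, ..., 4 = 75-100%)."""
    if not 0 <= percent_positive <= 100:
        raise ValueError("percentage must lie between 0 and 100")
    # Assumption: boundary values (25, 50, 75) fall into the lower class.
    if percent_positive <= 25:
        return 1
    if percent_positive <= 50:
        return 2
    if percent_positive <= 75:
        return 3
    return 4


def ihc_score(percent_positive, intensity):
    """Score = percentage class (1-4) x staining intensity class (0-3)."""
    if intensity not in (0, 1, 2, 3):
        raise ValueError("intensity must be 0, 1, 2 or 3")
    return percent_class(percent_positive) * intensity


def mann_whitney_u(group_a, group_b):
    """Mann-Whitney U statistic of group_a, using midranks for ties."""
    pooled = sorted(chain(group_a, group_b))
    ranks = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        # Tied values share the mean of their 1-based positions i+1 .. j.
        ranks[pooled[i]] = (i + 1 + j) / 2
        i = j
    rank_sum_a = sum(ranks[value] for value in group_a)
    n_a = len(group_a)
    return rank_sum_a - n_a * (n_a + 1) / 2
```

With this convention, a section showing 80% positive cells at strong intensity receives a score of 4 × 3 = 12, and a U of 0 for one group means that every score in that group is lower than every score in the other.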
3. RESULTS 3.1. Isolation of Potentially Accessible Proteins from Tissue Samples—the Sequential Method
The schematic overview of the sequential method is displayed in Figure 1. The method is composed essentially of three distinct steps: (i) isolation of biotinylated proteins (BIOT), (ii) purification of the glycopeptides (GLYCO) and (iii) analysis of the remaining peptides (REST). The latter fraction served for in silico complementation of the nonassigned glycopeptides (Figure 2). In addition, as this fraction contained a number of potentially accessible proteins, it was allowed to contribute this group of interest to the overall pool of modulated proteins. Given the complexity of the method, several rapid tools for monitoring the inter-replicate repeatability were introduced. These consisted of: (i) examining the flow-through of the streptavidin purification step for remaining biotinylated proteins and (ii) measuring the flow-through of the hydrazide beads for residual glycopeptides. Further process controls were the internal standards, which were spiked at three different steps, amounting to 8 individual proteins of nonhuman origin (detailed in the Materials and Methods). Ideally, the technical variability should be confined to limits below the threshold used to define whether a given protein is differentially expressed. Overall, the capture of both biotinylated proteins and glycopeptides was efficient and allowed for good specificity and minor sample loss. This is evident from the data presented in Figure 3A. It is noteworthy that in the BIOT fraction, quantitative recoveries of biotinylated albumin and negligible contamination by fetuin were detected (IS1). As far as the other two fractions are concerned (GLYCO and REST), fetuin was accurately quantified in both fractions (owing to both glycosylated and nonglycosylated peptides), whereas casein (IS1), being a phosphoprotein, was only recovered in the REST fraction. Regarding the reproducibility of the protein digestion and subsequent purification steps, good recoveries of beta-lactoglobulin (IS2) were observed in all fractions except the GLYCO one (because it is not a glycosylated protein), indicating that the quantitative and qualitative variability introduced by these steps is within acceptable limits.
Figure 2. Overview of the in silico combinatory method used to increase the number of identified glycoproteins. Those glycopeptides that were selectively bound on the hydrazide resin but were not specific enough to give protein identification were matched with the flow-through fraction and subjected to a second database search.
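The glycopeptide filter of Section 2.5 and the in silico complementation summarized in Figure 2 can be illustrated with a short sketch. It is not the authors' in-house program: the data layout (peptide strings, 0-based positions of deamidated asparagines, candidate protein lists) and all names are hypothetical, the sequon is checked within the peptide only, and the REST-based assignment is a simplified stand-in for the second database search that the paper actually performs.

```python
import re

# N-X-S/T consensus site with X != proline. The lookahead lets
# overlapping sequons (e.g. "NNTS") be reported individually.
SEQUON = re.compile(r"N(?=[^P][ST])")


def sequon_positions(peptide):
    """Return the 0-based positions of Asn residues in an N-X-S/T motif."""
    return {match.start() for match in SEQUON.finditer(peptide)}


def is_formerly_glycosylated(peptide, deamidated_positions):
    """True if at least one deamidated Asn sits at a consensus site."""
    return bool(sequon_positions(peptide) & set(deamidated_positions))


def assign_with_rest(candidates_by_glycopeptide, rest_proteins):
    """Assign ambiguous glycopeptides using REST-fraction evidence.

    A glycopeptide is assigned only when exactly one of its candidate
    proteins is also supported by nonglycosylated (REST) peptides.
    This rule is an illustrative assumption.
    """
    rest = set(rest_proteins)
    assigned = {}
    for peptide, candidates in candidates_by_glycopeptide.items():
        supported = set(candidates) & rest
        if len(supported) == 1:
            assigned[peptide] = next(iter(supported))
    return assigned
```

In this sketch a peptide such as `ANVSK` with a deamidated asparagine at position 1 passes the filter, whereas `ANPSK` does not because proline occupies the X position. A production tool would also need the flanking protein sequence, since a sequon can extend beyond the C-terminus of a tryptic peptide.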
Figure 3. Quantitative accuracy, recovery and repeatability of the technique. (A) Absolute quantitative evaluation of the internal standards spiked in the sample during the preparation process using the sequential method (see Figure 1). Protein quantity was calculated using the PLGS software, based on a previously calculated response factor (for details refer to Materials and Methods). The error indicates the standard deviation of means, based on three full process technical replicates. (B) Absolute quantification of the internal standards spiked in the sample during the glycopeptides analysis alone. For comparative reasons (as outlined in the Introduction), the isolation of glycopeptides was also conducted as a standalone technique. Quantification and error bars are the same as in A.
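The acceptance rule for the internal standards (within ±35% of the spiked amount or expected ratio) and the rule used to call a protein modulated (Materials and Methods, Section 2.4) can be written down as a minimal sketch. The function names and the data layout are hypothetical. The symmetric cut-off of 1/1.5 for down-regulation is an assumption, since the text states the 1.5-fold threshold without spelling out the down-regulated case.

```python
def within_tolerance(measured, expected, tolerance=0.35):
    """True if `measured` deviates from `expected` by at most `tolerance`."""
    if expected <= 0:
        raise ValueError("expected value must be positive")
    return abs(measured - expected) / expected <= tolerance


def dataset_acceptable(standards, tolerance=0.35):
    """Accept a data set only if every internal standard is within tolerance.

    `standards` is an iterable of (spiked, measured) pairs; the same check
    can be applied to relative ratios by using an expected value of 1.0.
    """
    return all(
        within_tolerance(measured, spiked, tolerance)
        for spiked, measured in standards
    )


def modulation_call(ratio, plgs_p, fold=1.5, alpha=0.05):
    """Classify a protein from its tumoral/normal ratio and PLGS p-value.

    PLGS convention from the text: p near 0 means significant
    down-regulation, p near 1 means significant up-regulation.
    Assumption: down-regulation requires ratio < 1 / fold.
    """
    if ratio > fold and plgs_p >= 1 - alpha:
        return "up"
    if ratio < 1 / fold and plgs_p <= alpha:
        return "down"
    return "unchanged"
```

For example, a standard spiked at 100 fmol and measured at 120 fmol passes the check, while one measured at 140 fmol would cause the data set to be rejected under this rule.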
The present study uses MSE technology to perform absolute label-free quantification. The method uses highly reproducible HPLC and data independent alternate scanning. In addition, high mass accuracy measurements are provided by an orthogonal time-of-flight mass spectrometer and on-the-fly acquisition of a lock mass, allowing for subsequent readjustment of the calibration. The data are processed based on the detection and correlation of all detectable precursor and fragment ions sharing the same chromatographic profile. Rapid alternation between low and elevated energy states applied in the collision cell allows for the simultaneous quantification and identification of proteins in a single experiment. In the present work, IS3 is spiked in the sample at the very last step of the sequential method (immediately before the sample is injected in the UPLC), which allows a good estimate of the performance of the quantification method. As outlined in Figure 3A, the quantification of four proteins (IS3) demonstrates that absolute quantification is feasible and accurate. However, a slight overestimation of the protein amounts, especially for albumin (bovine) and glycogen phosphorylase b (rabbit), is evident. This is probably related to the presence of human homologues found in the same sample. Following the rationale that this variation is below the 1.5-fold ratio chosen as threshold for claiming differential expression of a protein, the resulting
deviation was not considered important. Absolute quantification also raises the question of repeatability, in particular across the full process technical replicates. Therefore, Figure 4A shows an inter-replicate comparison of the identified and quantified proteins in each fraction of the combinatory method (including the in silico fraction GR). The Pearson correlation coefficients (Pcc) indicate an overall strong correlation between the full process technical replicates (average Pcc for all fractions = 0.92). The Pcc value for the GLYCO fraction appears to be slightly lower (average Pcc = 0.86) and improved to 0.89 when the nonglycosylated peptides (via the in silico approach) were considered. The weaker Pcc can be explained by the fact that calculations were performed with fewer peptides (only the glycosylated ones). The performance of the sequential approach was further verified on the qualitative level, especially with respect to the overlap of protein identifications (Figure 5) and their cellular localization (Figure 6) in the full process replicates. In summary, regarding protein identification with the sequential method, the BIOT fraction displayed the least variability, having on average 75% of proteins present in all three replicates. This value decreased in the REST
Figure 4. Repeatability of the method regarding absolute protein quantification in the three full process replicates. The data displayed regard all the proteins identified in the respective fractions of the combinatory method (A) and of the glycopeptides isolation procedure as standalone technique (B). The Pearson correlation coefficient (Pcc) is indicated for each comparison and fraction.
Figure 5. Repeatability of the protein identification. The percentage overlap of the proteins identified in each respective process replicate and fraction of the method. The BIOT fraction represents only proteins obtained from the biotinylation step; glycoproteins identified are marked as GLYCO whereas the nonglycosylated proteins are designated REST. The remaining two diagrams, GLYCO ONLY (GO) and REST from GO indicate the reproducibility of the protein identification for the glycopeptides isolation procedure as standalone technique.
fraction to just above 65% and in the GLYCO fraction to 55%. From these data, it can be reasonably assumed that enrichment at the protein level causes less variability than enrichment at the peptide level. This is in line with the fact that digestion of proteins generally increases the complexity of the sample. Regarding repeatability at the level of cellular localization of identified proteins, the results for each fraction are detailed in the Supporting Information (Figure S1) and for the entire sequential approach in Figure 6. It is worth noting that there is practically no inter-replicate variability. This implies that the method is able to isolate repeatedly the same kinds of proteins, originating from identical subcellular localizations.
3.2. Comparison of the Sequential Approach with the Individual Methods
In order to determine the real benefit of combining two previously described procedures in a new method, we have sought to compare the sequential technique with the individual method parts respectively. For this purpose, it was necessary to perform the second part of the method, the isolation of the glycopeptides, as a “standalone” technique. As far as the
technical aspects of the method are concerned (quantification of IS), the direct isolation of glycosylated peptides (GLYCO ONLY [GO]) and the analysis of the remaining REST from GO fraction produced results similar to those of the sequential method. As shown in Figure 3B, the GLYCO ONLY fraction specifically recovered glycosylated fetuin, whereas biotinylated ovalbumin and casein were, in addition to fetuin, only present in the REST from GO fraction. Internal standards 2 and 3 indicate no significant variability with respect to digestion, purification and MS quantification. On the protein level, Figure 4B demonstrates high correlation between the individual replicates regarding the absolute protein quantities (average Pcc = 0.85). This correlation increased to 0.93 following the in silico association of the nonglycosylated peptides. Concerning the reproducibility of protein identification, analogous to the sequential method, the greatest variability is observed in the GLYCO ONLY fraction, followed by the REST from GO fraction. Finally, both fractions of the glycopeptide enrichment as a standalone technique show reproducible isolation patterns with respect to the subcellular localization of the identified proteins (results displayed in the Supporting Information, Figure S1).
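The inter-replicate Pearson correlation coefficients (Pcc) reported for Figure 4 can be reproduced in principle with a few lines of code. The following is a minimal sketch, not the authors' calculation: the dictionary layout (protein identifier mapped to absolute amount) is hypothetical, and restricting the comparison to proteins quantified in both replicates, without any log transformation, is an assumption.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant data")
    return sxy / sqrt(sxx * syy)


def replicate_pcc(replicate_a, replicate_b):
    """Pcc of absolute amounts for proteins quantified in both replicates.

    Each replicate is a dict mapping a protein identifier to its
    absolute amount; proteins missing from either replicate are ignored.
    """
    shared = sorted(set(replicate_a) & set(replicate_b))
    return pearson(
        [replicate_a[p] for p in shared],
        [replicate_b[p] for p in shared],
    )
```

Because only shared proteins enter the calculation, a fraction with few overlapping identifications yields a coefficient based on few data points, which is consistent with the explanation given in the text for the lower Pcc of the glycopeptide fractions.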
dx.doi.org/10.1021/pr200212r |J. Proteome Res. 2011, 10, 3160–3182
Journal of Proteome Research
ARTICLE
Figure 6. Qualitative comparison of the percentage of identified proteins in the combinatory method according to their subcellular localization. The figure displays three full technical process replicates. The abbreviation BGR refers to a merged data set of BIOT, GLYCO and REST fractions for the respective replicate.
The goal of the method development presented here was to identify and quantify, repeatedly and accurately, potentially accessible and hence clinically relevant proteins. Therefore, it was essential to compare the number of potentially accessible proteins obtained with the sequential method with those from biotinylation or glycopeptide isolation as “standalone” techniques. This comparison is shown in Figures 7 and 8. At this stage, it is important to point out that, within the frame of the current work, the BIOT protein fraction contained ∼45%, the GLYCO fraction ∼80% and the REST fraction ∼30% of potentially accessible proteins (membrane, extracellular and secreted; results shown in the Supporting Information, Figure S1). These percentages were not significantly different when glycopeptide isolation was performed alone. As far as the absolute numbers of potentially accessible proteins are concerned, the BIOT protein fraction yielded on average 310 proteins (Figure 7A). The number of potentially accessible proteins in the GLYCO fraction (both in the sequential and standalone approach) was approximately 80, which increased to 110 following the in silico combination method (Figure 7A). This demonstrates the value of performing this operation, especially for nonassigned glycopeptides (an increase of ∼30%). In the REST fraction, 200 potentially accessible proteins were confidently identified on average. The combination of all the method parts yielded over 410 proteins in the sequential setting (+30% in comparison to the biotinylation alone) and, in the glycopeptide setting alone (including the corresponding REST from GO fraction), 250 proteins (50% in comparison to the sequential method). Concerning the question of whether the sequential method provides real additional value in comparison to the individual techniques performed alone, Figure 8 summarizes the most important findings. With respect to the sequential method, the following can be said: the removal of previously biotinylated proteins is clearly superior, in terms of the percentage of unique proteins identified (∼53%), to both the glycopeptide analysis and the analysis of the remaining (nonglycosylated and nonbiotinylated) proteins. However, the GLYCO fraction is characterized by ∼36% unique, potentially accessible proteins. The remaining pool of proteins (REST fraction), owing to its relatively high absolute numbers and high percentage of membrane and extracellular proteins, contributes another ∼25% of unique proteins. From this observation, it can be concluded that each part of the sequential method adds value to the technique as a whole. Considering the repeatability of the observation (Figure 8, upper section, numbers indicated in parentheses), the BIOT fraction has the highest number of potentially accessible proteins detected in all three replicates (101 or ∼60%). This decreases to ∼40% in the GLYCO fraction and further to ∼20% in the REST fraction. Hence, although the REST fraction bears a relatively elevated percentage of
Figure 7. Analysis of the effective value of the combinatory method with respect to the individual components alone (BIOT, GLYCO and REST) and the absolute numbers of accessible proteins identified (accessible proteins: extracellular, secreted and/or membrane). (A) The number of accessible proteins identified in each of the steps of the combinatory method as well as the sum of all the components together. Three full process technical replicates are displayed. G indicates the average number of proteins obtained using only the data from the trapped glycoprotein fraction (GLYCO, Figure 1). Using the in silico method outlined in Figure 2, the overall number of glycoproteins was increased; this resulted in the data marked GR. Importantly, the specific dynamic range of proteins characterizing the R (REST, Figure 1) fraction allowed the identification of a significant number of additional accessible proteins. (B) Same as A, conducted only for the glycopeptide isolation procedure as a standalone technique.
potentially accessible proteins, they are observed with high interreplicate variability. However, when the results of all the steps of the sequential method are combined (BGR, Figure 8, lower section), the repeatability of identification of potentially accessible proteins (observed in all 3 process replicates) increases to ∼75%. Performing the glycoprotein analysis alone (including the in silico matching and the addition of all the potentially accessible proteins found in the resulting REST fraction) results in only ∼18% unique proteins not previously observed in the sequential method.
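The overlap and repeatability figures discussed above come down to set operations on lists of identified proteins. The sketch below is illustrative only: the accession sets are hypothetical, the function names are ours, and the choice of denominator (the union of all replicates) is an assumption, because the text does not state how the percentages were normalized.

```python
def seen_in_all(replicate_sets):
    """Proteins identified in every replicate."""
    return set.intersection(*(set(s) for s in replicate_sets))


def repeatability(replicate_sets):
    """Share of all identified proteins that were found in every replicate.

    Assumption: normalized to the union of the replicates.
    """
    union = set().union(*replicate_sets)
    return len(seen_in_all(replicate_sets)) / len(union) if union else 0.0


def unique_to(fraction, *others):
    """Proteins found in `fraction` but in none of the other fractions."""
    return set(fraction) - set().union(*others)


# Hypothetical accession sets, for illustration only.
biot = {"A", "B", "C", "D"}
glyco = {"C", "E"}
rest = {"D", "E", "F"}
print(sorted(unique_to(biot, glyco, rest)))

replicates = [{"A", "B", "C"}, {"A", "B", "D"}, {"A", "B", "C"}]
print(repeatability(replicates))
```

Merging the fractions of each replicate before calling `repeatability` corresponds to the combined BGR comparison, in which a protein missed in one fraction can still be recovered in another.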
3.3. Proof of Concept Study—Applying the Sequential Approach to Breast Cancer Samples
Within the frame of the current study, we sought to apply the developed method to clinically relevant breast cancer samples obtained from five patients. The aim of this part of the study was to demonstrate that, using relevant samples, the method is able to discern potentially accessible, previously unreported, differentially expressed proteins in human breast cancer. The group of patients selected for this study was homogeneous with respect to the type, grade and clinical stage of the tumor
Figure 8. Comparison of the identification overlap of accessible proteins in different fractions of the combinatory method and glycopeptide analysis alone. The Venn diagrams display three full process replicates. The following abbreviations were used: B (BIOT fraction), R (REST fraction) and GR (GLYCO fraction incl. in silico matching with REST; as described in Figure 2). Regarding the combinatory method, the figure shows that there is substantial overlap among all the separate fractions. However, each fraction also contains a certain number of unique proteins. Regarding the unique proteins, the numbers in parentheses indicate how many of them were observed in 3 out of 3, 2 out of 3 and 1 out of 3 replicates, respectively. The lower part of the figure shows a comparison between the combinatory method (BGR) and glycoprotein analysis alone (GR&R) incl. in silico matching with its corresponding REST fraction. The diagrams display only accessible proteins. For a correct comparison, accessible proteins found in the REST fractions are always included. It is worth noting that glycoprotein analysis as a standalone technique identifies on average 18% of unique proteins. In contrast, over 50% of the proteins identified in the combinatory method are unique.
(Table S1, Supporting Information). A given protein was considered up-regulated when the relative abundance ratio (tumoral vs normal) was ≥1.5 with a p-value ≥ 0.95 in at least 3 out of the 5 patients examined. Similarly, a protein found only in the tumoral condition was also considered up-regulated. Overall, the sequential approach identified 93 up-regulated, potentially accessible proteins in the 5 ductal breast carcinoma patients. Biotinylation alone contributed 54 modulated proteins. The analysis of glycosylated and remaining peptides resulted in 26 and 41 up-regulated proteins, respectively. Of these, 33 were unique to the BIOT, 19 to the REST and 20 to the GLYCO fraction. The complete list of the up-regulated proteins is shown in Table 1. Details regarding the number of unique peptides, sequence coverage and the FPR are provided in Table 2. In the BIOT fraction, the up-regulated proteins were identified on average with ∼13 peptides, had a sequence coverage of ∼20% and an FPR under ∼1%. Similarly, in the REST fraction the proteins were identified with ∼15 unique peptides, the sequence coverage was ∼18% and the average FPR under 0.5%. As expected, the glycosylated proteins (GLYCO fraction) were identified with fewer unique peptides (∼10), had lower sequence coverage (∼13%) and a higher FPR (∼1.2%). Owing to the internal standards as well as to the specificity of the MS analysis (employing the MSe technology), it was possible to provide an absolute quantity for each protein identified (Table 3). The amounts refer to a total quantity of 2.5 μg of peptides injected on the UPLC column. A protein abundance ratio between the tumoral and the normal sample was calculated as well (absolute quantity in the tumoral vs the normal specimen). These ratios (without any further normalization) correlate well (average Pearson correlation coefficient >0.90; data not shown) with the relative quantification ratios shown in Table 1, demonstrating that the absolute quantification provides reliable values.
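The selection rule stated above (ratio ≥1.5 with p-value ≥0.95 in at least 3 of 5 patients, with tumor-only detection also counted as up-regulation) can be written down as a short sketch. The data layout, the function names and the example values are hypothetical, and we assume that a tumor-only observation counts toward the 3-of-5 threshold, which the text implies but does not spell out.

```python
from typing import Optional, Sequence, Tuple, Union

# One entry per patient: (ratio or "Tumor"/"Normal", p-value), or None when
# the protein was not observed in that patient ("/" in the tables).
Observation = Optional[Tuple[Union[float, str], Optional[float]]]


def patient_is_up(obs: Observation, min_ratio: float = 1.5,
                  min_p: float = 0.95) -> bool:
    """True when one patient shows the protein as up-regulated in the tumor."""
    if obs is None:
        return False
    ratio, p_value = obs
    if ratio == "Tumor":          # detected only in the tumoral condition
        return True
    if isinstance(ratio, str):    # e.g. "Normal": detected only in normal tissue
        return False
    return ratio >= min_ratio and p_value is not None and p_value >= min_p


def is_up_regulated(observations: Sequence[Observation],
                    min_patients: int = 3) -> bool:
    """Apply the at-least-3-of-5 rule to one protein."""
    return sum(patient_is_up(o) for o in observations) >= min_patients


# Hypothetical protein: tumor-only in two patients, 3-fold up in a third.
example = [("Tumor", None), (3.0, 1.0), ("Tumor", None), None, (1.2, 0.60)]
print(is_up_regulated(example))
```

Keeping the thresholds as parameters makes it easy to check how sensitive the list of modulated proteins is to the chosen cutoffs.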
3.4. Validation of Modulated CD276 using Immunohistochemistry
CD276 protein was found to be overexpressed in the breast cancer samples using the outlined sequential MS-based method. Specifically, this protein was identified in four out of five patients, being 3-fold up-regulated in one and present only in the tumoral condition in the other three individuals (Table 1, GLYCO fraction). At the time of this work, no published data were available regarding the expression and function of CD276 during breast cancer development and progression. Moreover, this protein attracted our interest because it was observed solely in the GLYCO fraction. To validate the method as being able to identify potentially new proteins, we examined the expression of CD276 using IHC on a collection of 30 patients diagnosed with infiltrating ductal breast cancer. The control group was increased to a total of 10 normal individuals. We found that CD276 was expressed at the surface of most human breast cancer cells while being generally absent in adjacent normal mammary epithelial cells. Summarizing the results detailed in Figure 9, CD276 showed strong positive and statistically significant staining in human breast cancer tissue.
4. DISCUSSION
Identification of accessible biomarkers remains one of the major limiting factors for the development of new effective diagnostic, prognostic and therapeutic modalities. In this study, we developed a new method that significantly increased the number of potentially reachable proteins compared to previously described approaches that made use of biotinylation alone. The uniqueness of the protocol lies in the sequential combination of three procedures applied to scarce biopsy samples. Exploiting the fact that most glycoproteins are inherently found in the extracellular and membrane space, we decided to
HLA-G
HLA-B
HLA-C HLA-C
P17693
P18465
P30499 Q07000
HLA-DRA
ITGB1
P01903
P05556
DQA2
HLA-
HLA-E
P13747
P01906
HLA-A
P16188
HLA-C
LGALS3BP
Q08380
HLA-C
FBLN1
P23142
P30504
EMILIN1 FN1
Q9Y6C2 P02751
Q29960
EFHD2
ALCAM
Q13740
Q96C19
CTNNA1 CTNNA2
P35221 P26232
COL12A1
COMP
P49747
Q99715
CALR
P27797
COL1A1
CNN3
Q15417
CLU
CNN2
Q99439
P02452
CDH1
P12830
P10909
AGR3
Q8TD06
CLIC4
CAP1 AEBP1
Q01518 Q8IUX7
Q9Y696
gene name
accession
5
3
5
3
3 4
5
3
3
3
5
4
4 5
4
3
5
5
3
3
5 4
3
5
4
5
4
4
5 5
patient
number of
96.44
/
1421.45
839.96
1063.01 /
956.87
/
/
901.04
87.07
358.22
314.03 5242.98
350.67
411.62
1643.94
338.58
180.79
445.81
704.98 156.40
168.00
518.01
1718.77
1387.96
/
1343.17
123.33 493.52
score
ratio
1.51
/
Tumor
Tumor
Tumor /
Tumor
/
/
Tumor
Tumor
Tumor
Tumor Tumor
Tumor
Tumor
0.82
1.43
Tumor
Tumor
Tumor Tumor
Tumor
6.69
Tumor
Tumor
/
Tumor
4.57 Tumor
T/N
0.78
/
Tumor
Tumor
Tumor /
Tumor
/
/
Tumor
Tumor
Tumor
Tumor Tumor
Tumor
Tumor
0.00
0.78
Tumor
Tumor
Tumor Tumor
Tumor
1.00
Tumor
Tumor
/
Tumor
1.00 Tumor
p-value score
100.16
2000.34
1169.80
1549.73
/
/ 1479.85
1595.97
1396.73
1307.41
/
76.06
75.76
246.42 143.16
314.35
1316.38
1208.05
760.49
/
268.27
515.60 242.34
971.83
922.03
558.62
637.03
196.84
580.33
442.43 77.96
BPSCC10/39 ratio
1.54
Tumor
Tumor
Tumor
/
/ Tumor
Tumor
Tumor
Tumor
/
3.90
2.69
1.92 14.73
Tumor
Tumor
1.65
3.97
/
Tumor
Tumor Tumor
Tumor
2.36
Tumor
Tumor
Tumor
Tumor
2.48 15.80
T/N
0.90
Tumor
Tumor
Tumor
/
/ Tumor
Tumor
Tumor
Tumor
/
1.00
1.00
1.00 1.00
Tumor
Tumor
1.00
1.00
/
Tumor
Tumor Tumor
Tumor
1.00
Tumor
Tumor
Tumor
Tumor
1.00 1.00
p-value
ratio T/N
520.57
/
206.05
24.81
844.28
343.31 839.11
650.44
520.58
494.42
841.77
305.30
410.54
110.60 68.48
478.34
89.45
201.28
414.33
349.21
660.15
713.48 140.25
153.46
2106.27
427.84
192.73
215.23
1971.02
510.65 449.56
Tumor
/
Tumor
Tumor
Tumor
Tumor Tumor
Tumor
Tumor
Tumor
Tumor
Tumor
Tumor
Tumor 4.10
Tumor
Tumor
3.60
2.97
Tumor
Tumor
Tumor Tumor
Tumor
Tumor
Tumor
Tumor
Tumor
Tumor
Tumor Tumor
BIOT Fraction
score
BPSCC10/40
Tumor
/
Tumor
Tumor
Tumor
Tumor Tumor
Tumor
Tumor
Tumor
Tumor
Tumor
Tumor
Tumor 1.00
Tumor
Tumor
1.00
1.00
Tumor
Tumor
Tumor Tumor
Tumor
Tumor
Tumor
Tumor
Tumor
Tumor
Tumor Tumor
p-value
433.01
127.13
3384.92
4567.53
4428.69
3718.36 4535.59
2156.76
/
463.91
/
94.10
/
/ 509.91
509.31
/
285.98
1093.33
/
/
267.04 /
/
717.26
/
536.67
225.34
/
1091.64 176.21
score
BPSCC10/46
patient
ratio
Tumor
13.60
Tumor
Tumor
Tumor
Tumor Tumor
Tumor
/
Tumor
/
4.57
/
/ Tumor
Tumor
/
0.88
1.92
/
/
Tumor /
/
14.73
/
Tumor
Tumor
/
Tumor Tumor
T/N
Tumor
1.00
Tumor
Tumor
Tumor
Tumor Tumor
Tumor
/
Tumor
/
1.00
/
/ Tumor
Tumor
/
0.00
1.00
/
/
Tumor /
/
1.00
/
Tumor
Tumor
/
Tumor Tumor
p-value score
361.30
250.56
1019.22
559.88
/
/ 563.98
393.69
404.03
/
337.49
499.01
201.16
263.86 319.76
/
/
340.35
2213.98
515.12
/
496.92 158.71
/
1586.53
800.74
371.75
90.00
191.97
675.44 195.15
BPSCC10/47 ratio
1.46
2.32
Tumor
Tumor
/
/ Tumor
24.05
Tumor
/
Tumor
0.96
Tumor
Tumor Tumor
/
/
2.16
0.86
Tumor
/
Tumor Tumor
/
2.08
Tumor
Tumor
Tumor
Tumor
Tumor Tumor
T/N
1.00
1.00
Tumor
Tumor
/
/ Tumor
1.00
Tumor
/
Tumor
0.43
Tumor
Tumor Tumor
/
/
1.00
0.03
Tumor
/
Tumor Tumor
/
1.00
Tumor
Tumor
Tumor
Tumor
Tumor Tumor
p-value
M, Mel
M, ER, G, E
M, ER, G, E
M
M, S
M M
M
M
M
M
S
S
S S
M
S
S
S
M, Cy, N, Mi, C J
M
M, Cy, CJ M, Cy, C J, C P
S
S, ER, Cy, C Su
CJ
CJ
M, C J
S
M S, Cy, N
locations
subcellular
BPSCC10/49
Table 1. List of Potentially Accessible Modulated Proteins Obtained from the Analysis of 5 Nontumoral Adjacent and 5 Tumoral Individual Matched Specimens, BIOT, REST and GLYCO Fractions a
VAPA
CAP1
AEBP1
BASP1
CNN 2
CNN 3 CALR
CALU
CD59
COL1A1
COL12A1
Q01518
Q8IUX7
P80723
Q99439
Q15417 P27797
O43852
P13987
P02452
Q99715
TAGLN2 VCAN
P37802 P13611
VDAC1
THBS2
P35442
P21796
TXN
P10599
Q9P0L0
TNC
P24821
Rap 1b
P61224
ATP IB 1
HLA-H RAP1A
P01893 P62834
P05026
P4HB
P07237
SFRP4
PDIA6
Q15084
Q6FHJ7
POSTN
SLC9A3R1
O14745
SERPINF1
MARCKS
P29966
P36955
PGRMC2 MSN
O15173 P26038
Q15063
LSP1
MIF
P14174
KTN1
Q86UP2
P33241
gene name
accession
Table 1. Continued
5
5
3
4
3 5
3
4
4
5
3
3
5 4
3
5
4
3
3
4
3 3
5
3
3
5
4
4
3 5
4
3
4
patient
number of
1484.79
214.94
173.72
/
284.31 1742.08
357.89
505.63
1338.28
541.76
244.42
505.94
2178.35 206.26
715.31
730.64
920.42
753.72
109.79
864.99
896.25 780.76
93.49
/
295.05
668.37
870.92
1399.26
1380.57 429.91
2787.42
/
246.23
score
ratio
Tumor
1.51
2.23
/
Tumor Tumor
Tumor
Tumor
Tumor
2.44
Tumor
Tumor
3.71 Tumor
Tumor
2.56
Tumor
Tumor
Tumor
Tumor
Tumor Tumor
7.92
/
Tumor
4.71
Tumor
Tumor
Tumor Tumor
Tumor
/
Tumor
T/N
Tumor
1
0.93
/
Tumor Tumor
Tumor
Tumor
Tumor
1
Tumor
Tumor
1.00 Tumor
Tumor
1.00
Tumor
Tumor
Tumor
Tumor
Tumor Tumor
1.00
/
Tumor
1.00
Tumor
Tumor
Tumor Tumor
Tumor
/
Tumor
p-value score
158.85
770.88
/
821.38
180.36 569.48
475.01
396.71
3667.17
1924.92
117.92
/
2834.74 261.40
994.69
4810.20
898.73
991.77
832.54
191.58
/ 289.07
543.69
/
1045.84
105.02
513.86
474.18
354.35 319.88
7164.52
984.43
466.90
BPSCC10/39 ratio
23.40
2.72
/
Tumor
Tumor 4.02
Tumor
Tumor
Tumor
2.00
Tumor
/
3.56 Tumor
Tumor
Tumor
Tumor
Tumor
Tumor
Tumor
/ Tumor
3.86
/
Tumor
24.78
Tumor
Tumor
1.67 Tumor
Tumor
Tumor
Tumor
T/N
1
0.99
/
Tumor
Tumor 1
Tumor
Tumor
Tumor
0.46
Tumor
/
1.00 Tumor
Tumor
Tumor
Tumor
Tumor
Tumor
Tumor
/ Tumor
1.00
/
Tumor
1.00
Tumor
Tumor
0.97 Tumor
Tumor
Tumor
Tumor
p-value
ratio
Tumor
Tumor
9.21 Tumor
Tumor
11.36
Tumor
Tumor
Tumor
985.12
526.29
747.64
1060.17
519.91
208.32 3715
258.26
186
276.01
/
Tumor /
14.15
Tumor
1.30
5.75
Tumor
/
Tumor Tumor
Tumor
Tumor
Tumor
T/N
Tumor
1.63
Tumor
Tumor
Tumor Tumor
Tumor
Tumor
Tumor
Tumor
REST Fraction
303.15
2242.79
524.31 109.19
149.61
405.32
293.75
1046.78
229.79
/
775.41 /
88.03
821.02
113.46
164.36
1618.65
/
1259.69 270.21
3486.43
672.75
642.16
score
BPSCC10/40
Tumor
1
Tumor
Tumor
Tumor Tumor
Tumor
Tumor
Tumor
Tumor
Tumor
Tumor
1.00 Tumor
Tumor
1.00
Tumor
Tumor
Tumor
/
Tumor /
1.00
Tumor
0.75
1.00
Tumor
/
Tumor Tumor
Tumor
Tumor
Tumor
p-value
191.1
549.6
/
1206.22
/ 4194.96
/
433.56
200.87
947.24
/
1714.57
1908.01 /
/
738.84
156.34
/
/
246.79
2789.76 256.36
222.32
1095.04
/
755.48
732.04
1682.43
/ 629.58
/
4142.59
/
score
BPSCC10/46
patient
ratio
Tumor
2.16
/
Tumor
/ Tumor
/
Tumor
Tumor
3.13
/
Tumor
13.07 /
/
6.11
Tumor
/
/
Tumor
Tumor Tumor
12.94
Tumor
/
Tumor
Tumor
Tumor
/ Tumor
/
Tumor
/
T/N
Tumor
1
/
Tumor
/ Tumor
/
Tumor
Tumor
0.97
/
Tumor
1.00 /
/
1.00
Tumor
/
/
Tumor
Tumor Tumor
1.00
Tumor
/
Tumor
Tumor
Tumor
/ Tumor
/
Tumor
/
p-value score
117.04
214.94
173.72
70.49
/ 251.79
/
/
/
541.76
/
/
1775.15 278.25
/
2259.08
/
/
/
441.17
/ /
718.17
423.84
/
2076.72
/
414.50
/ 102.06
1343.77
/
164.45
BPSCC10/47 ratio
Tumor
1.57
2.08
Tumor
/ Tumor
/
/
/
1.84
/
/
2.10 Tumor
/
2.14
/
/
/
Normal
/ /
2.36
Tumor
/
Tumor
/
Normal
/ 1.06
2.05
/
Tumor
T/N
Tumor
1
0.88
Tumor
/ Tumor
/
/
/
1
/
/
1.00 Tumor
/
1.00
/
/
/
Normal
/ /
1.00
Tumor
/
Tumor
/
Normal
/ 0.55
1.00
/
Tumor
p-value
S
S
M, S
ER, S, Mel, SR
CJ ER, Cy, S, CSu
CJ
M
S
M
Mi
M
M,N S
S
Cy, S
S
M
S
M, Cy
M M
M, ER, Mel
M, ER, Mel
S
S
M, Cy, C P
M, Cy
M M, Cy, C P
S
M
M, ER
locations
subcellular
BPSCC10/49
TNC
THBS1
THBS2
TGFBI RHOA
TAGLN2
VCAN
VAPA
VCL
VDAC1
BGN
BST2
P07996
P35442
Q15582 P61586
P37802
P13611
Q9P0L0
P18206
P21796
P21810
Q10589
ATP1A1
P05023
P24821
RHOC
PDIA6
Q15084
RAB14
LCP1
P13796
P08134
SERPINF1
P36955
P61106
POSTN
Q15O63
GDI2 IQGAP1
MARCKS SLC9A3R1
P29966 O14745
RAB1C
HLA-DRB1
P01911
P50395 P46940
HLA-DRA
P01903
Q92928
HLA-A
P13746
P4HB
LGALS1
P09382
P07237
EMILIN1
FN1
P02751
COL1A2 DCTN2
P08123 Q13561
Q9Y6C2
gene name
accession
Table 1. Continued
4
5
4
5
4
5
4
3 4
3
3
4
4
3
3
4 5
3
5
5
4
3
5
4 4
3
3
3
5
5
4
5 4
patient
number of
218.25
1288.06
/
125.43
913.67
163.48
1060.02
76.58 1027.18
112.68
161.78
554.36
102.17
/
/
383.77 85.69
/
88.76
169.82
399.06
471.03
3640.87
985.06 329.5
/
/
/
514.61
65.33
126.21
282.26 235.06
score
ratio
Normal
1.6
/
0.68
Tumor
Tumor
Tumor
Tumor Tumor
Tumor
Tumor
Tumor
Tumor
/
/
Tumor 2.64
/
4.53
2.92
Tumor
Tumor
Tumor
Tumor Tumor
/
/
/
3.16
18.17
Tumor
2.75 Tumor
T/N
Normal
1.00
/
0.09
Tumor
Tumor
Tumor
Tumor Tumor
Tumor
Tumor
Tumor
Tumor
/
/
Tumor 1
/
1
1
Tumor
Tumor
Tumor
Tumor Tumor
/
/
/
1
1
Tumor
1 Tumor
p-value score
798.95
1393.69
569.57
681.77
/
118.29
773.75
115.79 1117.66
2078.27
2883.19
849.16
184.08
519.83
782.05
289.67 172.53
547.83
1141.14
621.52
168.77
1558.8
4389.11
567.95 287.64
850.84
282.65
320.4
4765.35
323.2
84.86
2055.15 382.04
BPSCC10/39 ratio
Tumor
4.66
Tumor
1.78
/
8.20
7.34
2.70 3.36
Tumor
Tumor
Tumor
1.06
Tumor
Tumor
1.48 3.98
Tumor
Tumor
4.16
1.22
Tumor
Tumor
Tumor Tumor
Tumor
Tumor
Tumor
7.56
18.42
Tumor
2.42 Tumor
T/N
383.74
298.43
3246.94
393.66
2809.76
/ 1268.24
111.52
663.79
242.83
129.98
1191.54
1310.46
325.05 210.1
1489.95
5677.9
6784.53
/
710.29
1875.4
133.61 474.77
448.54
181.34
248.88
2102.82
218.5
67.03
2236.75 1371.62
score
ratio
Tumor
Tumor
Tumor
Tumor
Tumor
/ Tumor
Tumor
Tumor
Tumor
Tumor
Tumor
Tumor
Tumor Tumor
Tumor
Tumor
Tumor
125.40
1721.03
/
6.23
Tumor
Tumor Tumor
Tumor
Tumor
Tumor
Tumor
27
Tumor
0.93 Tumor
T/N
Tumor
7.17
GLYCO Fraction Tumor
1.00
Tumor
0.21
/
1
1
0.77 0.9
Tumor
Tumor
Tumor
0.12
Tumor
Tumor
0.25 1
Tumor
Tumor
0.99
0.23
Tumor
Tumor
Tumor Tumor
Tumor
Tumor
Tumor
1
1
Tumor
1 Tumor
p-value
BPSCC10/40
Tumor
1.00
Tumor
Tumor
Tumor
Tumor
Tumor
/ Tumor
Tumor
Tumor
Tumor
Tumor
Tumor
Tumor
Tumor Tumor
Tumor
Tumor
Tumor
/
1
Tumor
Tumor Tumor
Tumor
Tumor
Tumor
Tumor
1
Tumor
0 Tumor
p-value
510.90
1721.03
301.42
164.47
555.63
175.1
3996.02
146.4 975.84
/
/
342.98
/
706.67
388.38
/ 179.84
166.93
295.89
671.44
801.19
/
1008.52
677.29 280.5
422.19
792.43
716.85
601.1
291.92
185.54
1374.21 900.57
score
BPSCC10/46
patient
ratio
Tumor
2.89
Tumor
2.03
Tumor
Tumor
Tumor
Tumor 3.19
/
/
Tumor
/
Tumor
Tumor
/ Tumor
Tumor
13.20
8.00
Tumor
/
Tumor
Tumor Tumor
Tumor
Tumor
Tumor
15.33
4.39
Tumor
2.18 Tumor
T/N
Tumor
1.00
Tumor
0.89
Tumor
Tumor
Tumor
Tumor 0.97
/
/
Tumor
/
Tumor
Tumor
/ Tumor
Tumor
1
1
Tumor
/
Tumor
Tumor Tumor
Tumor
Tumor
Tumor
1
1
Tumor
1 Tumor
p-value score
/
1012.83
86.8
125.43
326.76
81.27
/
/ /
/
/
/
66.87
/
/
105.58 85.69
/
88.76
169.82
189.95
/
696.09
/ /
/
/
/
514.61
65.33
/
282.26 /
BPSCC10/47 ratio
/
3.42
Tumor
1.09
Tumor
Tumor
/
/ /
/
/
/
Tumor
/
/
Tumor Normal
/
2.27
2.08
Tumor
/
Tumor
/ /
/
/
/
1.42
/
0.96
Tumor
0.62
Tumor
Tumor
/
/ /
/
/
/
Tumor
/
/
Tumor Normal
/
0.99
1
Tumor
/
Tumor
/ /
/
/
/
0.89
/ 1
/
1 /
p-value
1.11
1.88 /
T/N
G, M
S
M, Mi
M, Cy, C J
M
S
S M, Cy
S
S
S
M, Mel
M
M
M, Cy M
M, Cy
M, ER, Mel
M, ER, Mel
Cy, C J, C P
S, Mel
S
M, Cy M, Cy, C P
M, ER, G, E, L
M, ER, G, E, L
M
S
S
S
S M, Cy
locations
subcellular
BPSCC10/49
ITGAV
P06756
LAMP1
LAMP 2
TIMP1
POSTN SERPINF1
SERPINA5
TNC
THBS1
THY1
TFRC
TM9SF3
P11279
P13473
P01033
Q15063 P36955
P05154
P24821
P07996
P04216
P02786
Q9HD45
3
4
5
3
5
5
5 5
5
5
5
3
4
5
4 5
5
5
5
4
5
4
4 4
patient
number of
160.50
454.72
3487.09
/
2311.17
1866.00
229.72 1782.53
487.88
1285.89
317.31
/
111.30
250.90
/ 271.68
1377.61
1849.75
127.75
339.29
637.26
472.67
185.12 105.69
score
ratio
1.73
Tumor
1.23
/
Tumor
Tumor
3.74 1.92
0.25
Tumor
2.46
/
1.62
Tumor
/ 3.63
1.12
0.87
1.82
Tumor
Tumor
Tumor
1.11 3.00
T/N
1.00
Tumor
0.97
/
Tumor
Tumor
0.98 1.00
0.04
Tumor
1.00
/
0.99
Tumor
/ 1.00
0.98
0.04
1.00
Tumor
Tumor
Tumor
0.72 0.98
p-value score
ratio
2.72
136.05
/
1536.99
1799.00
3538.85
1177.84
1937.36 4267.74
5152.63
2.80
/
8.85
Tumor
Tumor
Tumor
Tumor 5.87
Tumor
3.67
’ 957.83 317.51
Tumor
405.10
Tumor
Tumor 15.33
2.51
4.39
2.89
Tumor
22.87
7.77
1.68 Tumor
T/N
851.18
153.67
1512.57
457.54 332.05
1420.54
1354.67
174.27
224.61
96.76
155.78
121.35 603.22
BPSCC10/39
0.99
/
1.00
Tumor
Tumor
Tumor
Tumor 1.00
Tumor
0.99
1.00
Tumor
0.99
Tumor
Tumor 1.00
1.00
1.00
1.00
Tumor
1.00
1.00
1.00 Tumor
p-value
143.12
100.73
1209.30
273.24
576.04
668.16
534.19 3651.86
2031.79
279.61
360.65
146.37
121.40
172.13
240.65 670.54
2045.18
449.11
247.39
458.35
122.83
166.67
/ 223.99
score
BPSCC10/40 ratio
Tumor
Tumor
8.58
Tumor
Tumor
2.23
Tumor 1.58
Tumor
11.13
2.12
Tumor
11.13
Tumor
Tumor Tumor
Tumor
2.03
Tumor
Tumor
2.97
Tumor
/ Tumor
T/N
Tumor
Tumor
1.00
Tumor
Tumor
0.99
Tumor 1.00
Tumor
1.00
1.00
Tumor
1.00
Tumor
Tumor Tumor
Tumor
1.00
Tumor
Tumor
1.00
Tumor
/ Tumor
p-value
/
2391.77
1209.30
163.63
2570.18
668.16
517.33 3651.86
3243.84
279.61
020.5
679.66
121.40
1030.07
535.32 1073.36
4553.91
449.11
2459.18
458.35
122.83
718.61
184.87 313.82
score
BPSCC10/46
patient
ratio
/
Tumor
5.37
Tumor
Tumor
0.94
Tumor 1953.862
Tumor
19.69
4.18
Tumor
5.10
Tumor
Tumor Tumor
Tumor
3.42
Tumor
Tumor
1.42
Tumor
Tumor Tumor
T/N
/
Tumor
1.00
Tumor
Tumor
0.46
Tumor 0.88
Tumor
1.00
1.00
Tumor
1.00
Tumor
Tumor Tumor
Tumor
1.00
Tumor
Tumor
1.00
Tumor
Tumor Tumor
p-value
/
320.6
/
154.4 /
score
/
156.73
1225.47
/
388.97
424.39
619.02 1953.86
1113.61
153.08
382.52
/
/
166.11
150.95 399.78
1417.68
2750.15
152.43
BPSCC10/47 ratio
/
Tumor
2.72
/
Tumor
1.48
Tumor 2.08
0.89
2.77
1.02
/
/
Tumor
Normal 2.36
7.24
0.44
2.86
/
Tumor
/
Tumor /
T/N
/
Tumor
1
/
Tumor
1
Tumor 1
0.02
1
’ 0.54
/
/
Tumor
Normal 1
1
0
1
/
Tumor
/
Tumor /
p-value
locations
M
M, Mel, S
M
S
S
S
S S, Mel
S
M, E, L
M, E, L
M
M
M
M M, ER, G
S
S
S
S
S
M
M M
subcellular
BPSCC10/49
a The proteins were selected with respect to their presence in tumoral tissue samples and their absence or reduced presence in nontumoral tissue samples. For certain proteins present in both the tumoral and the adjacent normal tissue, quantitative data (relative ratio of expression) are included if they were significantly overexpressed in the tumor (ratio ≥1.5). The p-value ranges from 0 to 1, indicating (i) 0–0.05, significant down-regulation and (ii) 0.95–1.0, significant up-regulation of the respective protein. According to the GO annotation, the proteins are located on the outer side of the cell membrane (secreted, extracellular or membrane). Additional information concerning the number of unique peptides, protein sequence coverage and FPR is provided in Table 2. The corresponding sequence information regarding the glycosylated peptides is provided in Table S2 (Supporting Information). The MS/MS spectra supporting the identification of glycosylated peptides are shown in Supporting Information Figure S2. The following abbreviations were used for the subcellular location: S, secreted; E, extracellular; M, membrane; Cy, cytoplasm; CJ, cell junction; CP, cell projection; CSu, cell surface; N, nucleus; G, Golgi; ER, endoplasmic reticulum; Mel, melanosome; SR, sarcoplasmic reticulum; L, lysosome.
IMP AD1I
Q9NX62
ITGB2
FCGR1A HLA-DRA
P12314 P01903
P05107
HPX
P02790
C4B
P0C0L5
EMILIN1
COL12A1
Q99715
LGALS3BP
CADM1
Q9BY67
Q08380
MRC2 CD 276
Q9UBG0 Q5ZPR3
Q9Y6C2
gene name
accession
Table 1. Continued
Table 2. Values Indicate the Number of Unique Peptides, Sequence Coverage and the FPR Observed for Each of the Respective Proteins Reported in Table 1 (Average Numbers, n = 5)
            number of unique peptides sequence coverage (%)     false positive rate (%)
accession   tumor        normal       tumor        normal       tumor        normal
BIOT Fraction
Q01518      14           9            30           16           0            0
Q8IUX7      25           13           17           2            0            3
Q8TD06      7            /            28           /            0            /
P12830      8            /            7            /            1            /
Q99439      9            /            28           /            0            /
Q15417      9            /            28           /            0            /
P27797      19           9            44           25           0            0
P49747      14           /            14           /            0            /
P35221      24           /            24           /            0            /
P26232      20           /            8            /            1            /
Q13740      12           /            17           /            0            /
Q9Y696      7            5            20           9            0            3
P10909      14           11           28           25           0            0
P02452      36           27           22           19           1            0
Q99715      68           /            18           /            0            /
Q96C19      8            /            17           /            0            /
Q9Y6C2      18           17           10           10           0            0
P02751      56           27           25           8            0            2
P23142      14           7            20           10           1            3
Q08380      12           11           17           14           1            3
P16188      11           6            29           13           0            0
P13747      7            4            15           8            0            1
P17693      5            /            14           /            0            /
P18465      11           14           27           13           0            0
P30499      10           7            22           17           0            0
Q07000      12           9            28           24           0            0
Q29960      11           7            22           10           0            0
P30504      12           5            31           17           0            0
P01906      5            /            13           /            0            /
P01903      5            4            18           17           0            1
P05556      18           10           19           9            0            2
Q86UP2      33           /            17           /            0            /
P33241      12           /            36           /            1            /
P14174      4            2            28           32           0            0
O15173      10           8            34           29           0            0
P26038      16           8            16           3            0            3
P29966      6            13           27           20           0            0
O14745      11           /            29           /            0            /
Q15063      29           15           32           17           0            1
P36955      10           6            18           9            0            0
P07237      28           13           49           21           0            0
Q15084      10           /            24           /            0            /
P01893      9            7            26           17           0            0
P62834      6            /            24           /            0            /
P61224      7            5            35           20           0            0
Q6FHJ7      3            /            8            /            1            /
P05026      6            /            18           /            0            /
P24821      39           /            18           /            1            /
P10599      6            4            47           28           0            0
P35442      24           /            17           /            0            /
P37802      21           12           73           63           0            0
P13611      39           /            6            /            0            /
Q9P0L0      5            /            17           /            0            /
P21796      8            /            10           /            0            /
REST Fraction
Q01518      13           8            23           14           0            0
Q8IUX7      23           /            16           /            0            /
P80723      8            /            33           /            0            /
Q99439      5            /            11           /            0            /
Q15417      8            /            17           /            0            /
P27797      15           8            34           18           0            0
O43852      13           /            38           /            1            /
P13987      4            2            16           9            0            0
P02452      35           28           19           14           0            0
Q99715      85           31           21           6            0            0
P08123      26           23           21           16           0            0
Q13561      12           /            24           /            0            /
Q9Y6C2      13           /            6            /            2            /
P02751      46           26           19           7            0            2
P09382      12           7            50           37           0            0
P13746      10           /            21           /            0            /
P01903      4            /            15           /            0            /
P01911      8            /            20           /            0            /
P29966      8            /            22           /            0            /
O14745      11           /            15           /            0            /
Q15063      32           /            30           /            0            /
P36955      9            11           14           11           0            0
P13796      16           16           16           7            0            1
Q15084      12           8            33           18           0            0
P07237      22           12           31           10           0            1
Q92928      6            /            23           /            0            /
P50395      12           12           17           16           0            0
P46940      31           29           11           10           0            1
P61106      6            /            32           /            0            /
P08134      3            /            26           /            0            /
P05023      20           16           9            10           2            0
P24821      32           /            14           /            0            /
P07996      32           /            20           /            0            /
P35442      19           /            12           /            1            /
Q15582      10           6            11           4            2            1
P61586      4            3            28           24           0            0
P37802      12           5            41           39           0            0
P13611      42           26           6            2            1            1
Q9P0L0      7            /            22           /            0            /
P18206      27           26           13           13           1            1
P21796      10           /            19           /            0            /
GLYCO Fraction
P21810      7            6            17           15           0            0
Q10589      5            5            15           14           1            4
Q9UBG0      15           12           6            4            2            2
Q5ZPR3      6            6            13           15           0            5
Q9BY67      7            6            17           13           0            1
Q99715      31           30           13           6            0            4
P0C0L5      24           22           6            7            0            0
Q9Y6C2      11           10           13           8            0            2
Q08380      15           11           24           16           0            0
P02790      11           10           24           18           0            0
P12314      4            /            10           /            0            /
P01903      3            4            8            17           0            0
Q9NX62      5            /            14           /            1            /
P06756      17           13           19           5            0            4
P05107      7            /            9            /            1            /
P11279      14           9            30           21           0            0
P13473      7            5            15           13           0            1
P01033      3            4            23           30           0            0
Q15063      8            6            10           5            0            0
P36955      5            3            15           13           0            0
P05154      6            4            17           8            0            0
P24821      32           /            18           /            0            /
P07996      14           /            8            /            2            /
P04216      13           5            34           32           0            0
P02786      13           /            7            /            2            /
Q9HD45      8            5            6            2            1            2
further process the nonbiotinylated fraction and analyze the N-glycosylated proteins. The application of the previously described glycopeptide isolation method to the fraction of nonbiotinylated proteins allowed the recovery and identification of further potentially accessible proteins. The selective covalent binding of glycan residues to hydrazide resin made a highly specific isolation of the glycosylated peptides possible. Recently, a similar strategy was employed to exploit the fact that membrane proteins are largely glycosylated.7 The authors used a biocytin hydrazide reactant to label the membrane glycoproteins in vivo in mammalian cells. The protocol requires the exposure of the cells to mild oxidative conditions, low temperatures and slightly acidic conditions over a period of one hour. Given that this study was performed on scarce biopsy samples, we considered that the biotinylation reaction under physiological conditions, followed by cell lysis and glycoproteome extraction, might be a more appropriate technique to extract maximal information regarding the membrane and extracellular proteome. The original glycan oxidation hydrazide capture method has been known for the past 30 years. Zhang et al. and Tian et al. showed for the first time the application of mass spectrometry to characterize N-linked glycoproteins.9,10 However, when applied at the peptide level, the technique is limited by the number of nonassigned glycopeptides. In this study, nonbiotinylated and nonglycosylated peptides (REST fraction) were analyzed using UPLC-MSe as well. In this context, it is important to mention that the REST fraction itself contained approximately 10% of proteins that had an N-glycosylation consensus site with deamidated asparagines. As this fraction was not treated with PNGase F, it is reasonable to assume that the deamidation occurred spontaneously in this case. Hence, care needs to be taken to differentiate these peptides from truly glycosylated ones. In fact, only 5–10% of the proteins from the REST fraction that had deamidated asparagines at the consensus site are known glycoproteins. In contrast, hydrazide-resin-bound peptides led to the identification of glycoproteins, of which over 85% are known to be glycosylated. In summary, specific peptides found in the REST fraction together with the nonassigned glycopeptides (GLYCO fraction) led to successful glycoprotein identification and an overall increase in the number of identified glycoproteins (+∼30%), their sequence coverage and score. However, the alternative of performing the glycoprotein enrichment prior to digestion merits discussion. One main concern with this approach is the inability to pinpoint the exact glycosylation site. For example, proteins may have several N-glycosylation consensus sites, but in practice only one may actually be glycosylated. If a spontaneous deamidation occurs at the other, nonglycosylated asparagines, this would lead to a false positive result. In contrast, if the peptides containing sugars are specifically oxidized and bound to the hydrazide resin, the certainty of identifying the correct consensus site increases significantly. This ability to tell exactly which site is glycosylated is of particular interest when specific antibodies against accessible proteins need to be developed. Interestingly, the REST fraction (both following the sequential method and the glycopeptide isolation alone) contains a relatively high number of potentially accessible proteins (∼30%). This is in contrast to the available data employing shotgun proteomics on tissue samples. As recently shown,14 the percentage of potentially accessible proteins in such experiments does not exceed ∼10%. This clearly indicates that the REST fraction in the current experiments cannot be compared with a classical shotgun proteomics approach. At this point, it appears that fractionation cannot be the main explanation for these discrepancies.
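The consensus-site check discussed above can be sketched as follows. The sketch assumes the standard N-glycosylation sequon N-X-S/T with X being any residue except proline; the function names and the example sequence are hypothetical. It only tells whether a deamidated asparagine lies within a sequon. As argued in the text, this alone does not prove glycosylation when the sample was not captured on hydrazide resin.

```python
import re

# N-X-S/T sequon, X != P; the lookahead lets overlapping sequons be found.
SEQUON = re.compile(r"(?=N[^P][ST])")


def sequon_positions(sequence):
    """1-based positions of asparagines that sit in an N-X-S/T sequon."""
    return [match.start() + 1 for match in SEQUON.finditer(sequence)]


def classify_deamidated_sites(sequence, deamidated_positions):
    """Map each deamidated Asn position to True if it lies in a sequon."""
    sequons = set(sequon_positions(sequence))
    return {pos: pos in sequons for pos in deamidated_positions}


# Hypothetical peptide with deamidation reported at positions 2 and 5.
peptide = "ANGTNPSNNST"
print(sequon_positions(peptide))
print(classify_deamidated_sites(peptide, [2, 5]))
```

In this example the asparagine at position 5 is followed by proline and is therefore not reported as a consensus site.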
Some of the shotgun data were created with up
dx.doi.org/10.1021/pr200212r |J. Proteome Res. 2011, 10, 3160–3182
Journal of Proteome Research
Table 3. Absolute Quantification of Modulated Proteins Reported in Table 1 (see footnote a)
For each patient, the three values are: tumor (fmol), normal (fmol), ratio T/N.
accession | BPSCC10/39 | BPSCC10/40 | BPSCC10/46 | BPSCC10/47 | BPSCC10/49
BIOT Fraction
Q01518 | 27.25, 8.42, 3.24 | 25.01, 10.67, 2.34 | 15.21, /, T | 55.70, /, T | 42.22, /, T
Q8IUX7 | 23.05, /, T | 118.25, 6.32, 18.70 | 25.39, /, T | 20.01, /, T | 26.44, /, T
Q8TD06 | 49.33, /, T | 20.55, /, T | 34.55, /, T | /, /, / | 27.53, /, T
P12830 | /, /, / | 7.93, /, T | 12.52, /, T | 31.39, /, T | 30.29, /, T
Q99439 | 28.12, /, T | 42.61, /, T | 2.34, /, T | 73.15, /, T | 43.28, /, T
Q15417 | 27.49, /, T | 28.75, /, T | 26.41, /, T | /, /, / | 61.09, /, T
P27797 | 93.04, 14.00, 6.65 | 86.89, 47.01, 1.85 | 109.98, /, T | 490.02, 233.34, 2.10 | 176.27, 133.50, 1.32
P49747 | 8.20, /, T | 43.50, /, T | 6.60, /, T | /, /, / | /, /, /
P35221 | 17.52, /, T | 17.17, /, T | 26.42, /, T | 10.16, /, T | 36.70, /, T
P26232 | /, /, / | 1.96, /, T | /, /, / | /, /, / | 13.98, /, T
Q13740 | 16.72, /, T | 14.13, /, T | 28.98, /, T | /, /, / | /, /, /
Q9Y696 | 19.42, /, T | 6.33, 11.25, 0.56 | 3.40, /, T | /, /, / | 29.54, /, T
P10909 | 33.15, 21.30, 1.56 | 105.25, 32.82, 3.21 | 102.72, 28.04, 3.66 | 42.84, 53.21, 0.81 | 227.75, 294.72, 0.77
P02452 | 598.92, 552.61, 1.08 | 402.57, 578.70, 0.70 | 239.34, 50.69, 4.72 | 116.35, 109.90, 1.06 | 1901.01, 420.83, 2.59
Q99715 | 30.40, /, T | 69.35, /, T | 9.44, /, T | /, /, / | /, /, /
Q96C19 | 9.54, /, T | 0.72, /, T | 6.22, /, T | 30.76, /, T | /, /, /
Q9Y6C2 | 18.93, /, T | 16.91, 8.24, 2.05 | 7.32, /, T | /, /, / | 33.91, /, T
P02751 | 197.97, /, T | 216.76, 16.54, 13.10 | 61.01, 8.58, 7.11 | 55.63, /, T | 73.77, /, T
P23142 | 26.00, /, T | 17.75, 5.45, 3.26 | 12.10, /, T | /, /, / | 44.99, /, T
Q08380 | 9.15, /, T | 125.66, 5.57, 22.56 | 17.25, /, T | 70.35, 12.42, 5.66 | 49.13, 91.38, 0.54
P16188 | /, /, / | 1.26, /, T | /, /, / | 3.81, /, T | /, /, /
P13747 | /, /, / | /, /, / | /, /, / | 27.28, /, T | /, /, /
P17693 | /, /, / | 17.86, /, T | 11.45, /, T | /, /, / | /, /, /
P18465 | /, /, / | /, /, / | /, /, / | /, /, / | /, /, /
P30499 | /, /, / | /, /, / | /, /, / | 7.46, /, T | 8.16, /, T
Q07000 | /, /, / | /, /, / | 6.50, /, T | 10.36, /, T | /, /, /
Q29960 | /, /, / | /, /, / | /, /, / | 76.83, /, T | /, /, /
P30504 | /, /, / | 5.26, /, T | /, /, / | 14.28, /, T | 4.26, /, T
P01906 | /, /, / | 17.24, /, T | 20.95, /, T | 52.14, /, T | 11.15, /, T
P01903 | /, /, / | 56.70, /, T | /, /, / | 194.79, 14.23, 13.69 | 134.80, 67.48, 2.00
P05556 | 42.12, 20.47, 2.06 | 49.54, 12.27, 4.04 | 41.54, /, T | 109.41, /, T | 153.28, 127.13, 1.21
Q86UP2 | 13.15, /, T | 19.93, /, T | 36.19, /, T | /, /, / | 23.59, /, T
P33241 | 16.40, /, T | 26.78, /, T | 25.57, /, T | 498.84, /, T | /, /, /
P14174 | 66.26, /, T | 83.82, /, T | 194.74, /, T | /, /, / | 418.63, 201.41, 2.08
O15173 | 18.32, /, T | 17.82, 10.87, 1.64 | 23.57, /, T | /, /, / | /, /, /
P26038 | 10.47, /, T | 12.96, /, T | 7.81, /, T | 66.34, /, T | 26.59, 13.48, 1.97
P29966 | 37.74, /, T | 16.83, /, T | /, /, / | 117.74, /, T | /, 12.98, N
O14745 | 27.16, /, T | 15.83, /, T | 38.33, /, T | 23.53, /, T | /, /, /
Q15063 | 167.28, 43.21, 3.87 | 214.02, 10.58, 20.22 | 51.66, 35.78, 1.44 | 54.59, /, T | 376.86, /, T
P36955 | 7.66, /, T | 24.58, /, T | 13.69, 4.77, 2.87 | /, /, / | /, /, /
P07237 | 83.16, 11.07, 7.51 | 61.79, 20.99, 2.94 | 105.59, 4.67, 22.60 | 251.07, 8.32, 30.18 | 107.62, 49.57, 2.17
Q15084 | /, /, / | /, /, / | 13.40, /, T | 65.23, /, T | 33.37, /, T
P01893 | /, /, / | 2.99, /, T | 8.59, /, T | /, /, / | /, /, /
P62834 | /, /, / | 0.94, /, T | /, /, / | 22.20, /, T | /, /, /
P61224 | 0.54, /, T | 4.64, /, T | /, /, / | /, /, / | /, 43.25, N
Q6FHJ7 | 10.90, /, T | 15.69, /, T | 8.79, /, T | /, /, / | /, /, /
P05026 | 11.35, /, T | 12.80, /, T | 22.65, /, T | /, /, / | /, /, /
P24821 | 36.34, /, T | 43.50, /, T | 19.06, /, T | 15.20, /, T | /, /, /
Table 3. Continued
For each patient, the three values are: tumor (fmol), normal (fmol), ratio T/N.
accession | BPSCC10/39 | BPSCC10/40 | BPSCC10/46 | BPSCC10/47 | BPSCC10/49
BIOT Fraction (continued)
P10599 | 58.00, 70.02, 0.83 | 100.71, /, T | 136.25, 19.86, 6.86 | 349.51, 24.71, 14.15 | 178.09, 92.28, 1.93
P35442 | 31.08, /, T | 49.40, /, T | 2.75, /, T | /, /, / | /, /, /
P37802 | 147.79, 45.28, 3.26 | 227.26, 80.51, 2.82 | 213.89, 15.89, 13.46 | 793.49, 53.56, 14.81 | 296.44, 154.23, 1.92
P13611 | 62.24, /, T | 72.51, /, T | 38.30, /, T | /, /, / | 370.69, /, T
Q9P0L0 | 7.04, /, T | /, /, / | 14.49, /, T | 16.96, /, T | /, /, /
P21796 | 11.02, /, T | 13.73, /, T | 13.85, /, T | /, /, / | /, /, /
REST Fraction
Q01518 | 145.47, 79.37, 1.83 | 60.39, 38.16, 1.58 | 33.73, /, T | 29.24, 9.53, 3.07 | 162.79, 79.37, 2.05
Q8IUX7 | 87.63, /, T | 269.51, /, T | 24.21, /, T | 20.72, /, T | /, /, /
P80723 | 39.57, /, T | 32.05, /, T | 22.73, /, T | 33.61, /, T | /, /, /
Q99439 | 69.92, /, T | 42.00, /, T | 34.84, /, T | /, /, / | /, /, /
Q15417 | 76.39, /, T | 70.41, /, T | 10.87, /, T | /, /, / | /, /, /
P27797 | 146.02, /, T | 171.01, 44.62, 3.83 | 186.71, /, T | 232.37, /, T | 186.69, /, T
O43852 | /, /, / | 32.86, /, T | 21.68, /, T | 43.13, /, T | 54.46, /, T
P13987 | 98.10, 38.57, 2.54 | /, /, / | 56.71, /, T | /, /, / | 106.97, 38.57, 2.77
P02452 | 1169.78, 378.44, 3.09 | 296.91, 146.24, 2.03 | 217.81, 158.64, 1.37 | 158.29, 89.37, 1.77 | 787.36, 378.44, 2.08
Q99715 | 245.36, /, T | 599.89, 49.59, 12.10 | 47.76, /, T | 37.69, /, T | 110.03, /, T
P08123 | 2435.30, 298.96, 8.15 | 1347.89, 759.96, 1.77 | 450.09, 492.83, 0.91 | 669.99, 389.54, 1.72 | 2071.91, 298.96, 6.93
Q13561 | 26.52, /, T | 6.71, /, T | 21.50, /, T | 23.69, /, T | /, /, /
Q9Y6C2 | 50.92, /, T | 20.37, /, T | 5.32, /, T | 16.09, /, T | /, /, /
P02751 | 1032.75, 35.05, 29.47 | 503.07, 29.24, 17.21 | 42.14, 15.62, 2.70 | 64.72, 21.59, 3.00 | 193.91, 35.05, 5.53
P09382 | 427.75, 177.17, 2.41 | 310.39, 143.35, 2.17 | 98.01, /, T | 196.88, 26.42, 7.45 | 303.39, 177.17, 1.71
P13746 | /, /, / | /, /, / | /, /, / | /, /, / | /, /, /
P01903 | /, /, / | 40.94, /, T | 23.47, /, T | 52.85, /, T | /, /, /
P01911 | /, /, / | /, /, / | /, /, / | 4.08, /, T | /, /, /
P29966 | 371.72, /, T | 44.48, /, T | 38.49, /, T | 50.18, /, T | /, /, /
O14745 | 53.47, /, T | 24.99, /, T | 20.32, /, T | 18.63, /, T | /, /, /
Q15063 | 796.75, /, T | 386.44, /, T | 109.63, /, T | 70.78, /, T | 345.96, /, T
P36955 | 51.34, /, T | 82.79, /, T | 30.91, 32.74, 0.94 | /, /, / | /, /, /
P13796 | 51.16, /, T | 33.76, 9.32, 3.62 | /, /, / | 47.69, /, T | 108.53, /, T
Q15084 | 115.34, 27.25, 4.23 | 62.03, 26.86, 2.31 | 75.40, /, T | 100.48, 14.39, 6.98 | 110.93, 27.25, 4.07
P07237 | 246.73, 44.97, 5.49 | 114.81, /, T | 166.75, /, T | 141.21, 12.94, 10.91 | 158.54, 44.97, 3.53
Q92928 | /, /, / | 14.64, /, T | 25.87, /, T | 5.78, /, T | /, /, /
P50395 | 46.52, /, T | 40.81, 16.57, 2.46 | 15.22, /, T | /, /, / | 26.74, /, T
P46940 | 85.11, 58.70, 1.45 | 41.69, 26.35, 1.58 | 18.52, /, T | 22.25, /, T | /, 58.70, N
P61106 | /, /, / | 74.27, /, T | 10.20, /, T | 28.57, /, T | /, /, /
P08134 | /, /, / | 15.72, /, T | 10.19, /, T | 4.10, /, T | /, /, /
P05023 | 11.26, /, T | 27.70, 14.26, 1.94 | 1.18, /, T | /, /, / | 13.18, /, T
P24821 | 142.76, /, T | 72.18, /, T | 19.54, /, T | 24.65, /, T | /, /, /
P07996 | 55.15, /, T | 354.78, /, T | 42.28, /, T | /, /, / | /, /, /
P35442 | 25.59, /, T | 126.87, /, T | 5.43, /, T | /, /, / | /, /, /
Q15582 | 31.86, /, T | 57.99, 15.84, 3.66 | /, /, / | 21.65, /, T | /, /, /
P61586 | 56.39, /, T | 66.91, 20.29, 3.30 | 14.93, /, T | 52.55, 13.00, 4.04 | /, /, /
P37802 | 90.56, /, T | 246.42, 28.26, 8.72 | 135.13, /, T | 265.33, /, T | /, /, /
P13611 | 277.48, /, T | 222.78, 19.46, 11.45 | 100.89, /, T | 72.60, /, T | 473.07, /, T
Q9P0L0 | 29.22, /, T | /, /, / | 13.25, /, T | 13.65, /, T | 42.13, /, T
P18206 | 69.04, 87.19, 0.79 | 92.61, 59.56, 1.55 | 26.24, /, T | 18.98, 12.38, 1.53 | 92.50, 87.19, 1.06
P21796 | /, /, / | 55.36, /, T | 20.46, /, T | 26.05, /, T | 49.19, /, T
Table 3. Continued
For each patient, the three values are: tumor (fmol), normal (fmol), ratio T/N.
accession | BPSCC10/39 | BPSCC10/40 | BPSCC10/46 | BPSCC10/47 | BPSCC10/49
GLYCO Fraction
P21810 | 1220.20, 338.77, 3.60 | 830.89, 239.89, 3.46 | 371.85, 530.25, 0.70 | 993.51, 653.28, 1.52 | 1366.49, 336.93, 4.06
Q10589 | /, 39.76, N | 135.25, /, T | 23.32, /, T | 60.09, /, T | /, /, /
Q9UBG0 | 70.25, 97.90, 0.72 | 117.20, 90.79, 1.29 | /, /, / | 41.64, /, T | 225.49, /, T
Q5ZPR3 | 316.19, 19.44, 16.26 | 51.88, /, T | 70.59, /, T | 47.08, /, T | /, /, /
Q9BY67 | 114.44, /, T | 105.15, 24.53, 4.29 | 63.81, /, T | 121.79, /, T | /, /, /
Q99715 | 461.55, /, T | 1200.98, 77.78, 15.44 | 267.53, 69.74, 3.84 | 124.05, 85.92, 1.44 | 474.59, /, T
P0C0L5 | /, /, / | /, /, / | 289.56, /, T | /, /, / | /, /, /
Q9Y6C2 | 176.35, 35.77, 4.93 | 115.57, 63.32, 1.83 | 113.41, /, T | 203.40, /, T | 264.33, 64.41, 4.10
Q08380 | 569.32, 277.24, 2.05 | 719.91, 221.16, 3.26 | 674.45, 141.05, 4.78 | 817.14, 173.77, 4.70 | 541.86, 1016.83, 0.53
P02790 | 1102.59, 356.78, 3.09 | 837.70, 568.98, 1.47 | 699.04, /, T | 689.38, /, T | 1020.71, 2366.13, 0.43
P12314 | /, /, / | 29.19, /, T | 51.46, /, T | 33.80, /, T | /, /, /
P01903 | 346.85, 76.44, 4.54 | 790.66, 129.48, 6.11 | 477.00, /, T | 857.60, /, T | 1387.41, 498.70, 2.78
Q9NX62 | 29.75, /, T | 63.35, /, T | 85.50, /, T | 109.96, /, T | 82.31, /, T
P06756 | 238.76, 88.51, 2.70 | 242.64, 82.09, 2.96 | 188.93, 16.44, 11.49 | 100.74, 20.26, 4.97 | /, /, /
P05107 | /, /, / | 76.90, /, T | 60.82, /, T | 115.87, /, T | /, /, /
P11279 | 330.30, 102.51, 3.22 | 198.55, 133.08, 1.49 | 138.08, 68.36, 2.02 | 189.55, 84.22, 2.25 | 146.18, 93.01, 1.57
P13473 | 302.58, /, T | 579.10, 131.88, 4.39 | 406.32, 44.79, 9.07 | 559.75, 55.18, 10.14 | 296.56, 112.96, 2.63
P01033 | 98.55, 46.67, 2.11 | 171.39, /, T | 253.82, /, T | 199.87, /, T | 110.64, 207.52, 0.53
Q15063 | 149.69, 45.08, 3.32 | 121.32, /, T | 148.65, /, T | 73.46, /, T | 135.54, /, T
P36955 | 689.37, 421.77, 1.63 | 1088.51, 262.63, 4.14 | 603.41, 1219.09, 0.49 | 437.38, 1501.94, 0.29 | 946.37, 411.71, 2.30
P05154 | 110.18, /, T | 61.97, /, T | 161.06, 79.15, 2.03 | 93.06, 97.52, 0.95 | 193.19, 216.20, 0.89
P24821 | 996.33, /, T | 620.11, /, T | 312.18, /, T | 428.79, /, T | 223.84, /, T
P07996 | /, /, / | 618.55, /, T | 135.30, /, T | 22.99, /, T | /, /, /
P04216 | 813.23, 287.97, 2.82 | 699.95, 154.93, 4.52 | 490.06, 92.46, 5.30 | 483.84, 113.91, 4.25 | 357.80, 164.04, 2.18
P02786 | 72.33, /, T | /, /, / | 32.71, /, T | 123.08, /, T | 66.86, /, T
Q9HD45 | 327.21, 90.22, 3.63 | 196.84, 94.87, 2.07 | 292.61, /, T | /, /, / | /, /, /
a The quantity is reported in fmol and relates to 2.5 μg of protein digest injected on the HPLC column. The ratio indicates (where applicable) the fold difference between the absolute protein quantities observed in the tumoral vs. the normal condition. No further normalization of the ratio values was conducted.
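The ratio column of Table 3 can be reproduced with a short sketch. It assumes, from the pattern of the table entries rather than from an explicit statement in the footnote, that "T" marks a protein quantified only in the tumor sample, "N" one quantified only in the normal sample, and "/" one quantified in neither. The function name is hypothetical.

```python
from typing import Optional


def tumor_normal_ratio(tumor_fmol: Optional[float], normal_fmol: Optional[float]) -> str:
    """Format the T/N entry for one patient, using None for a missing quantity.

    Assumed convention (inferred from the table, not stated by the authors):
      '/'  - protein quantified in neither sample
      'T'  - quantified in the tumor sample only
      'N'  - quantified in the normal sample only
      else - tumor/normal fold difference, rounded to two decimals
    """
    if tumor_fmol is None and normal_fmol is None:
        return "/"
    if normal_fmol is None:
        return "T"
    if tumor_fmol is None:
        return "N"
    return f"{tumor_fmol / normal_fmol:.2f}"


# Values taken from the BIOT fraction rows of Table 3.
q01518_patient39 = tumor_normal_ratio(27.25, 8.42)   # both quantified
q01518_patient46 = tumor_normal_ratio(15.21, None)   # tumor only
p29966_patient49 = tumor_normal_ratio(None, 12.98)   # normal only
```

As the footnote states, no further normalization is applied, so the sketch divides the two absolute quantities directly.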
to 20 fractions based either on the molecular weight or the pI of the protein. The only plausible hypothesis is that the prior enrichment of such hydrophobic and basic proteins, as applied in the current setting, renormalized the remaining proteome of the tissue extract. In other words, owing to the two specific steps of the sequential method, a particular, new composition of the sample has emerged in a repeatable fashion. Altogether, the assessment of the efficiency of the current approach is limited to comparing the identified proteins with database information regarding their subcellular localization. However, a protein known to be a membrane protein does not necessarily need to be accessible. Conversely, intracellular proteins may also in certain circumstances be shuttled to the surface (e.g., proteins found in the endoplasmic reticulum). Therefore, further careful assessment of identified proteins (e.g., with regard to their specific protein motifs) and in vivo validation experiments using labeled targeting agents are needed in order to confirm the real accessibility of a given protein. In the frame of the current study, known and novel modulated proteins are reported. Although the data are encouraging, this limited biological study serves only as a proof of concept
that the discovery of novel modulated and potentially accessible proteins through the use of the outlined method is possible. One such modulated protein is CD276. Current literature shows that this protein is expressed at the cell surface of various types of tumors. Roth et al.15 showed CD276 expression in normal liver, urothelium, and fetal kidney, but also increased levels in prostate cancer. Recent studies16 indicated that CD276 may be predominantly expressed by the tumor vasculature. However, the exact role of this protein in the process of cancer remains relatively unknown. Some findings do indicate that one possible function of CD276 may lie in the impairment of T-cell-mediated immunity.17 Our finding that CD276 is a potentially targetable protein in breast cancer is novel. The IHC analysis performed with a larger collection of breast cancer lesions validated the differential expression of CD276 in tumoral versus normal tissues. These results warrant further studies directed toward clarifying the role of this protein in tumor cells. Altogether, the method described in this study permits a comprehensive exploitation of scarce pathological material and has shown its ability to extract an interesting group of proteins
Figure 9. Immunohistochemical validation of CD276. (A) Box-plot evaluation of CD276 antigen positivity in breast ductal adenocarcinoma and normal breast tissue. The details regarding the scoring as well as the statistics are indicated in the Materials and Methods. (B) Representative images of breast ductal adenocarcinoma cells [AC] and normal breast ducts [D] immunostained with anti-CD276.
that have the potential to be used as diagnostic or therapeutic cancer biomarkers. In this context, further (in vivo) studies are needed to validate the true systemic accessibility of the identified biomarkers. The approach is a move away from classical shotgun proteomics and is directed toward a specific group of proteins. The robustness of the analysis opens up new possibilities for using this method in other applications which may not be strictly related to cancer but to all other biological questions
that require an insight into the accessible part of the membrane proteome.
ASSOCIATED CONTENT
Supporting Information
Supplemental figures and tables. This material is available free of charge via the Internet at http://pubs.acs.org.
AUTHOR INFORMATION
Corresponding Author
*Vincent Castronovo, MD, PhD, GIGA Cancer, University of Liege, Pathology Building, B23, +4, B-4000 Liege, Belgium. E-mail: [email protected]. Phone: +32 43662479. Fax: +32 43662975.
ACKNOWLEDGMENT
This work was supported by a grant from the Research Concerted Action (IDEA project) of the University of Liege (ULG), Belgium, from the CEE (FP7 network: ADAMANT, Antibody Derivatives As Molecular Agents for Neoplastic Targeting (HEALTH-F2-2007-201342)), from the National Fund for Scientific Research (NFSR, Belgium) and TELEVIE as well as from the Centre Anti-Cancereux of the ULG. The authors acknowledge the GIGA-Proteomics Platform of the ULG and Pascale Heneaux (LRM) for experimental support.
ABBREVIATIONS
ABC, avidin-biotin complex; BIOT, biotinylated protein fraction; DAB, 3,3′-diaminobenzidine tetrachlorhydrate dihydrate; DNA, deoxyribonucleic acid; DOC, deoxycholic acid; DTT, dithiothreitol; FPR, false positive rate; GLYCO, glycosylated peptide/protein fraction; GSSG, oxidized glutathione; HCl, hydrogen chloride; H2O2, hydrogen peroxide; HSA, human serum albumin; IgG, immunoglobulin G; IHC, immunohistochemistry; IS, internal standard; NaCl, sodium chloride; Na2CO3, sodium carbonate; NH4HCO3, ammonium bicarbonate; NP40, Nonidet P40; Pcc, Pearson correlation coefficient; PBS, phosphate buffered saline; PI, protease inhibitor; PNGase F, peptide N-glycosidase F; PLGS, ProteinLynx Global SERVER; REST, rest peptide/protein fraction; RNA, ribonucleic acid; RT, room temperature; SA, streptavidin; SDS, sodium dodecyl sulfate.
REFERENCES
(1) Cancer diagnostics: discovery and clinical applications. Clin. Chem. 2002, 48 (whole issue), 1145–1375.
(2) Recent advances in cancer biomarkers. Clin. Biochem. 2004, 37 (whole issue), 503–647.
(3) Biomarkers and clinical proteomics. Mol. Cell. Proteomics 2006, 5 (whole issue), S1–S402.
(4) Proteomics and biomarkers. J. Proteome Res. 2005, 4 (whole issue), 1053–1456.
(5) Celis, J. E.; Gromov, P.; Cabezon, T.; Moreira, J. M.; Ambartsumian, N.; Sandelin, K.; Rank, F.; Gromova, I. Proteomic characterization of the interstitial fluid perfusing the breast tumor microenvironment: a novel resource for biomarker and therapeutic target discovery. Mol. Cell. Proteomics 2004, 3, 327–344.
(6) Castronovo, V.; Kischel, P.; Guillonneau, F.; de Leval, L.; Defechereux, T.; De Pauw, E.; Neri, D.; Waltregny, D. Identification of specific reachable molecular targets in human breast cancer using a versatile ex vivo proteomic method. Proteomics 2007, 7, 1188–1196.
(7) Wollscheid, B.; Bausch-Fluck, D.; Henderson, C.; O’Brien, R.; Bibel, M.; Schiess, R.; Aebersold, R.; Watts, J. D. Mass-spectrometric identification and relative quantification of N-linked cell surface glycoproteins. Nat. Biotechnol. 2009, 27 (4), 378–386.
(8) Naeem, A.; Saleemuddin, M.; Khan, R. H. Glycoprotein targeting and other applications of lectins in biotechnology. Curr. Protein Pept. Sci. 2007, 8 (3), 261–271.
(9) Zhang, H.; Li, X. J.; Martin, D. B.; Aebersold, R. Identification and quantification of N-linked glycoproteins using hydrazide chemistry, stable isotope labeling and mass spectrometry. Nat. Biotechnol. 2003, 21 (6), 660–666.
(10) Tian, Y.; Zhou, Y.; Elliott, S.; Aebersold, R.; Zhang, H. Solid-phase extraction of N-linked glycopeptides. Nat. Protoc. 2007, 2, 334–339.
(11) Zhang, H. Glycoproteomics using chemical immobilization. Curr. Protoc. Protein Sci. 2007, 24, Unit 24.3.
(12) Sun, B.; Ranish, J. A.; Utleg, A. G.; White, J. T.; Yan, X.; Lin, B.; Hood, L. Shotgun glycopeptide capture approach coupled with mass spectrometry for comprehensive glycoproteomics. Mol. Cell. Proteomics 2007, 6, 141–149.
(13) Chen, R.; Jiang, X.; Sun, D.; Han, G.; Wang, F.; Ye, M.; Wang, L.; Zou, H. Glycoproteomics analysis of human liver tissue by combination of multiple enzyme digestion and hydrazide chemistry. J. Proteome Res. 2009, 8, 651–661.
(14) Sprung, R. W., Jr.; Brock, J. W.; Tanksley, J. P.; Li, M.; Washington, M. K.; Slebos, R. J.; Liebler, D. C. Equivalence of protein inventories obtained from formalin-fixed paraffin-embedded and frozen tissue in multidimensional liquid chromatography-tandem mass spectrometry shotgun proteomic analysis. Mol. Cell. Proteomics 2009, 8 (8), 1988–1998.
(15) Roth, T. J.; Sheinin, Y.; Lohse, C. M.; Kuntz, S. M.; Frigola, X.; Inman, B. A.; Krambeck, A. E.; McKenney, M. E.; Karnes, R. J.; Blute, M. L.; Cheville, J. C.; Sebo, T. J.; Kwon, E. D. B7-H3 ligand expression by prostate cancer: a novel marker of prognosis and potential target for therapy. Cancer Res. 2007, 67, 7893–7900.
(16) Crispen, P. L.; Sheinin, Y.; Roth, T. J.; Lohse, C. M.; Kuntz, S. M.; Frigola, X.; Thompson, R. H.; Boorjian, S. A.; Dong, H.; Leibovich, B. C.; Blute, M. L.; Kwon, E. D. Tumor cell and tumor vasculature expression of B7-H3 predict survival in clear cell renal cell carcinoma. Clin. Cancer Res. 2008, 14 (16), 5150–5157.
(17) Castriconi, R.; Dondero, A.; Augugliaro, R.; Cantoni, C.; Carnemolla, B.; Sementa, A. R.; Negri, F.; Conte, R.; Corrias, M. V.; Moretta, L.; Moretta, A.; Bottino, C. Identification of 4Ig-B7-H3 as a neuroblastoma-associated molecule that exerts a protective role from an NK cell-mediated lysis. Proc. Natl. Acad. Sci. U.S.A. 2004, 101, 12640–12645.