Quantitative tissue proteomics analysis reveals versican as potential biomarker for early-stage hepatocellular carcinoma

Wael Naboulsi, Dominik A. Megger, Thilo Bracht, Michael Kohl, Michael Turewicz, Martin Eisenacher, Don Marvin Voss, Jörg F. Schlaak, Andreas-Claudius Hoffmann, Frank Weber, Hideo A. Baba, Helmut E. Meyer, and Barbara Sitek

J. Proteome Res., Just Accepted Manuscript • DOI: 10.1021/acs.jproteome.5b00420 • Publication Date (Web): 02 Dec 2015

Downloaded from http://pubs.acs.org on December 7, 2015


Journal of Proteome Research is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.

Journal of Proteome Research

Quantitative tissue proteomics analysis reveals versican as potential biomarker for early-stage hepatocellular carcinoma

Wael Naboulsi¹*, Dominik A. Megger¹, Thilo Bracht¹, Michael Kohl¹, Michael Turewicz¹, Martin Eisenacher¹, Don Marvin Voss¹, Jörg F. Schlaak², Andreas-Claudius Hoffmann³, Frank Weber⁴, Hideo A. Baba⁵, Helmut E. Meyer¹, and Barbara Sitek¹*

¹ Medizinisches Proteom-Center, Ruhr-Universität Bochum, Germany

² Department of Gastroenterology and Hepatology, University Hospital of Essen, Germany

³ Department of Medicine (Cancer Research), Molecular Oncology Risk-Profile Evaluation, University Hospital of Essen, Germany

⁴ Department of General, Visceral and Transplantation Surgery, University Hospital of Essen, Germany

⁵ Department of Pathology, University Hospital of Essen, Germany

Abstract: Hepatocellular carcinoma (HCC) is one of the most aggressive tumors, and the treatment outcome of this disease improves when the cancer is diagnosed at an early stage. This requires biomarkers that allow an accurate and early tumor diagnosis. To identify potential markers for this purpose, we analyzed a cohort of 50 patients (50 HCC and 50 adjacent non-tumorous tissue samples as controls) using two independent proteomic approaches. We performed label-free discovery analysis on 19 HCC and corresponding tissue samples. The data were analyzed with regard to events known to take place early in HCC development, such as abnormal regulation of Wnt/β-catenin and activation of receptor tyrosine kinases (RTKs). Thirty-one proteins were selected for verification experiments. For this analysis, the second set of the patient cohort (31 HCC and corresponding tissue samples) was analyzed using selected (multiple) reaction monitoring (SRM/MRM). We report the overexpression of ATP-dependent RNA helicase (DDX39), Fibulin-5 (FBLN5), Myristoylated alanine-rich C-kinase substrate (MARCKS) and Serpin H1 (SERPINH1) in HCC for the first time. We demonstrate that Versican core protein (VCAN) is significantly associated with well-differentiated and low-stage HCC. This is the first evidence for VCAN as a potential biomarker for early HCC diagnosis.

KEYWORDS: Hepatocellular carcinoma, label-free proteomics, early diagnosis biomarker, targeted proteomics, multiple reaction monitoring, selected reaction monitoring, Versican.

Corresponding authors

*Wael Naboulsi, Medizinisches Proteom-Center, Ruhr-Universität Bochum, 44801 Bochum, Germany, Tel. +49-(0)-234/32-24862, E-mail: [email protected]

*Jun.-Prof. Dr. Barbara Sitek, Medizinisches Proteom-Center, Ruhr-Universität Bochum, 44801 Bochum, Germany, Tel. +49-(0)-234/32-24362, E-mail: [email protected]

Introduction: Hepatocellular carcinoma (HCC) is the fifth most common cancer worldwide and the most common liver cancer. Moreover, it is the third leading cause of cancer-related death.¹ Late detection is one of the major reasons for its high mortality rate, as therapy is then less likely to be curative.² A better prognosis can be achieved when HCC is diagnosed at early stages. This includes the diagnosis of malignancies characterized by histologically well-differentiated cells (G1)³ and clinically low-stage tumors without vascular invasion (T1).⁴ The diagnosis of such early HCCs is histologically problematic, as the tumor is not distinguishable from dysplasia in terms of morphology.³ In such cases, pathologists histologically evaluate resected specimens and look for stromal invasion, which appears as invasive growth of the cancer tissue into the fibrous matrix and/or blood vessels. Yet a final diagnosis of early HCC is very difficult to reach by assessing biopsy specimens.³ Most importantly, the majority of HCC patients with early tumors are clinically asymptomatic and have negative results for serum markers such as alpha-fetoprotein.⁵ This highlights the urgent need for diagnostic biomarkers that can be used to detect early-stage HCCs.

In principle, proteins to be applied as diagnostic markers at an early stage can be nominated based on the functional link between the candidates and the early events taking place in HCC development. Previously, it was reported that one of the most critical pathways involved in the initial stages of HCC progression is the Wnt/β-catenin pathway.⁶ In addition, the activation of receptor tyrosine kinases (RTKs) was shown to take place in early phases of hepatocarcinogenesis. In turn, stimulation of RTKs activates several pathways involved in HCC progression, such as the Ras (Raf/MEK/ERK) and JAK/STAT axes.¹ The activation of transforming growth factor-beta (TGFβ), which is involved in RTK stimulation, has been shown to have an important role in HCC.¹,⁷ Further, many downstream transcription factors associated with the mentioned pathways have been reported to be activated in hepatocellular carcinoma. For example, G1/S-specific cyclin-D1 (CCND1), Myc proto-oncogene protein (MYC) and Transcription factor AP-1 (c-jun) have been shown to be frequently stimulated in HCC.⁸ On the other hand, little is known about the involvement of CCAAT/enhancer-binding protein alpha (CEBPA) in HCC progression.⁹

As proteome analysis makes it possible to analyze a large number of expressed proteins in tissues or cells, this technique has been applied in many studies to identify HCC-related proteins.¹⁰⁻¹² However, altered expression of proteins quantified with label-free proteomic techniques should be further verified on a larger scale.¹³ Within the last few years, targeted proteomics using selected (or multiple) reaction monitoring (SRM or MRM) has been shown to be a powerful method for biomarker verification in samples of differing background complexity, including serum, urine and tissue.¹⁴⁻¹⁶

In this study, we conducted a label-free proteome analysis on HCC tissue samples and adjacent non-tumorous tissue. Differentially expressed proteins associated with TGFβ and with downstream interactors of the Wnt/β-catenin, Ras and JAK/STAT pathways, in particular CCND1, MYC, c-jun and CEBPA, were identified using a knowledge-based approach and manual literature search. Subsequently, a targeted SRM/MRM approach was used to confirm the altered expression of selected proteins in a larger and independent cohort of tissue samples. In this way, we aimed to shed light on some of the molecular players involved in HCC development and to reveal HCC-related proteins valuable for early-stage diagnosis.

Material and Methods:

Tissue samples: HCC and corresponding non-tumorous tissue samples were collected from 50 primary liver cancer patients following hepatic resection or liver transplantation (Essen University Hospital, Germany, permission from local ethics committee 11-4839-BO). The patient cohort was divided into a discovery set (n=19) and a verification set (n=31). The tumors were classified according to the TNM (pTNM) pathologic system (seventh edition). All available pathological data of the patients' tissue samples are summarized in table 1, and detailed patient data are summarized in supplementary table S-1. After collection, the samples were snap-frozen and stored at -80°C until analysis.

Sample preparation and protein digestion: Tissue samples were lysed and homogenized in sample buffer (30 mM Tris, 7 M urea, 2 M thiourea, 0.1% SDS, pH 8.5), and the protein concentration was determined using the Bradford assay (Bio-Rad, Hercules, CA, USA). 10 µg of protein from tissue lysate were run about 1 cm into an 18% Tris-glycine polyacrylamide gel (Anamed Elektrophorese, Germany) (15 min at 100 V). After Coomassie staining, gel pieces were excised and the proteins were digested with trypsin at a 1:50 ratio (enzyme to protein) overnight at 37°C. The resulting peptides were then extracted from the gel twice by 15 min sonication on ice using 20 µl of 50% acetonitrile in 0.1% trifluoroacetic acid (TFA). The peptides were dried via vacuum centrifugation and dissolved in 0.1% TFA. The peptide concentration was determined by amino acid analysis performed on an ACQUITY-UPLC equipped with an AccQ Tag Ultra UPLC column (Waters, Eschborn, Germany) calibrated with Pierce Amino Acid Standard (Thermo Scientific, Bremen, Germany).

Label-free analysis: Label-free analysis was performed using an Orbitrap Elite mass spectrometer coupled to an Ultimate 3000 RSLCnano system (Thermo Scientific, Bremen, Germany). 350 ng of peptides were delivered to the trap column (Acclaim PepMap 100, 300 μm × 5 mm, C18, 5 μm, 100 Å) at a flow rate of 30 μL/min (0.1% TFA). After seven minutes of washing, peptides were transferred to the analytical column (Acclaim PepMap RSLC, 75 μm × 50 cm, nano Viper, C18, 2 μm, 100 Å). Buffer A (0.1% FA) and buffer B (0.1% FA, 84% ACN) were used to elute the peptides from the analytical column using a gradient from 5% to 40% buffer B at a flow rate of 400 nL/min over 98 minutes (column oven temperature 60°C). The MS was operated in data-dependent mode. Full MS scan spectra were acquired in the Orbitrap analyzer at 60,000 resolution in profile mode. The mass range was set to 350-2000 m/z. The 20 most abundant precursor ions in the MS scan were selected for subsequent MS/MS analysis in the linear ion trap following precursor fragmentation by collision-induced dissociation (CID). The mass window for precursor isolation was 1.0 m/z. We used charge screening, and only precursors with charge states of +2, +3 and +4 were fragmented, at a normalized collision energy (CE) of 35.0%. The applied dynamic exclusion settings were an exclusion list size of 500, an exclusion duration of 30 s and a mass width relative to the excluded mass of ± 10 ppm. The time for one duty cycle was 3.6 s if 20 MS/MS spectra were acquired. The data presented in this manuscript are available via ProteomeXchange with the identifier PXD002171.

Protein identification and quantification: The raw data generated in the LC-MS/MS analysis were examined using Proteome Discoverer 1.3 (Thermo Fisher Scientific, Rockford, IL, USA). Database searches were performed with Mascot 2.3.2 (Matrix Science Ltd., London, UK) against UniProtKB/Swiss-Prot (2013_05), which contained 20,330 sequences of Homo sapiens. The search parameters were the following: precursor mass tolerance was 5 ppm, fragment mass tolerance was 0.4 Da, enzyme was trypsin, one missed cleavage site was allowed, and oxidation (M) and propionamide (C) were set as dynamic modifications. The Percolator program implemented in Proteome Discoverer was used to calculate the false discovery rate (FDR) of the identified peptides, and only peptides with FDR < 1% were considered. Ion intensity-based label-free quantification was carried out using Progenesis LC-MS software (ver. 4.1.4832.42146, Nonlinear Dynamics Ltd., Newcastle upon Tyne, UK). The retention times of eluting peptides from all samples within the experiment were aligned to a selected reference run. Only features with charges from +2 to +4 were included for later analysis. Features with two or fewer isotopic peaks were excluded. After alignment and feature filtering, the raw abundances of all features were normalized to correct for experimental variation. Search results from Proteome Discoverer were imported into Progenesis LC-MS in order to combine peptide quantification and identification. Only peptides unique to a corresponding protein were used for quantification. The protein grouping option integrated in Progenesis LC-MS was disabled.

Pathway analysis: The pathway analysis of differentially expressed proteins was performed using Ingenuity Pathway Analysis software (IPA, Ingenuity Systems, Redwood City, CA) for biological function, networks and associated diseases. A core analysis was done via the expertly curated Ingenuity Knowledge Base using both direct and indirect relations.

SRM/MRM assay development: We used SRM/MRM to confirm the differential expression of selected proteins obtained from the label-free experiment. Unique peptides ranging in length from 6 to 20 amino acids and containing K/R tryptic ends with no missed cleavages were chosen for each of the selected candidates. Unique peptides which were observed in our label-free experiment and/or previously reported with a high number of observations in PeptideAtlas¹⁷ were prioritized during the peptide selection. All peptides containing methionine (M) and/or cysteine (C) were excluded, as these residues are prone to chemical modifications. 69 selected peptides representing 31 proteins were chemically synthesized via Fmoc chemistry, based on solid-phase synthesis (INTAVIS peptide services, Cologne, Germany). Stable isotope-labeled peptides (SI peptides) were synthesized with labeled C-terminal [¹³C₆, ¹⁵N₂]-lysine or [¹³C₆, ¹⁵N₄]-arginine. All peptides were purchased at crude purity (approximately >70%), unless otherwise stated (supplementary table S-2). The SI peptides were dissolved in 30% ACN (0.1% formic acid) and the concentration was determined via amino acid analysis. The SI peptides were diluted to 250 fmol/µl in 0.1% FA, and the optimal parameters for the peptides were defined experimentally. Triple quadrupole mass spectrometry was used for the SRM/MRM analysis, using an Agilent 6490 equipped with an iFunnel Technology source coupled to an Agilent 1290 Infinity Binary HPLC standard-flow system (Agilent Technologies, Santa Clara, CA, U.S.A). The MS analysis was conducted in positive ion mode with a capillary voltage of 3500 V and a nozzle voltage of 300 V. The sheath gas temperature was set to 250°C at a flow rate of 11 L/min, and the drying gas temperature was 150°C with a flow rate of 11 L/min. The QQQ-MS was operated using the MassHunter Workstation software (ver. B.06.00 Service Pack 1). The instrumental parameters of the SI peptides were optimized as follows: Transition lists were created in Skyline 2.6 (MacCoss Lab Software, Seattle, WA, USA) from m/z > precursor − 2 to last ion − 2 for all SI peptides at +2 and +3 charge states. Predicted collision energies (CE) were calculated from the default Agilent 6490 CE linear equation (charge state 2: slope = 0.031 and intercept = 1; charge state 3: slope = 0.036 and intercept = -4.8). The CEs of the SI peptides were optimized for each precursor by applying two steps on either side of the predicted CE with a step size of 5 V. The six most intense transitions of each peptide, with their optimized CEs, were selected for a final optimization step. Here, 500 fmol of the SI peptide mixture was spiked into a protein digest of liver tissue (2 µg) and loaded onto the analytical column (ZORBAX Eclipse Plus Rapid Resolution HD, 2.1 × 150 mm, 1.8 µm) held at 50°C. A 54 min multi-step gradient from 3% solvent B (84% ACN, 0.1% FA) and 97% solvent A (0.1% FA) to 90% B and 10% A at a flow rate of 0.4 ml/min was applied for separation. After optimizing a dynamic SRM method with a retention time window of 3 minutes, the three transitions which provided the highest signal intensity and the lowest level of interfering signals were chosen for the final assay. Preference was given to y-ions of higher mass if the abundances were similar.

Standard curves of SI peptides: To determine the linear quantification range of the SI peptides, a dilution series of the heavy-labeled peptide mixture spanning a 10⁵-fold range (25 × 10⁻⁴ to 25 fmol/µl) was spiked into 2 µg of pooled liver tissue digest (HCC and controls), and each concentration was analyzed in four replicates. The lower limit of quantification (LOQ) of each peptide was calculated using an in-house script for the software R (R Foundation for Statistical Computing, Vienna, Austria) by analyzing the peak area of the quantifier transition versus the concentration. The LOQ was defined as the lowest point which had a coefficient of variation (CV%) below 20% of the measured area under the curve (AUC). For the upper and lower average accuracy values (AAV) of each concentration, thresholds of 120% and 80%, respectively, were chosen.¹⁸ The amount of each SI peptide for the verification analysis was adjusted to be within the linear quantification range and, when possible, similar to the endogenous peptide's concentration. The latter was estimated based on the peak area obtained from the SRM/MRM calibration curve analysis.
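Two numerical rules from the assay development and standard-curve paragraphs can be illustrated with a short sketch: the linear collision-energy prediction with its five-point optimization ramp, and the LOQ criterion. This is not the authors' in-house R script. The function names are hypothetical, the form CE = slope × precursor m/z + intercept is an assumption about how the quoted slopes and intercepts are applied, and the accuracy value per concentration is taken as an input because the text does not state how it was computed.

```python
from statistics import mean, stdev

# Slope/intercept pairs quoted in the text, keyed by precursor charge state.
CE_PARAMS = {2: (0.031, 1.0), 3: (0.036, -4.8)}


def predicted_ce(precursor_mz, charge):
    """Predicted CE; the linear form slope * m/z + intercept is an assumption."""
    slope, intercept = CE_PARAMS[charge]
    return slope * precursor_mz + intercept


def ce_ramp(precursor_mz, charge, steps=2, step_size=5.0):
    """Predicted CE plus `steps` values on either side, `step_size` V apart."""
    centre = predicted_ce(precursor_mz, charge)
    return [centre + k * step_size for k in range(-steps, steps + 1)]


def cv_percent(values):
    """Coefficient of variation (sample standard deviation / mean) in percent."""
    return 100.0 * stdev(values) / mean(values)


def lower_loq(points, max_cv=20.0, accuracy_range=(80.0, 120.0)):
    """Lowest concentration whose replicates pass both criteria, else None.

    points: iterable of (concentration, replicate_areas, mean_accuracy_percent).
    Literal reading of the text: CV% below 20% and accuracy within 80-120%.
    """
    low, high = accuracy_range
    for conc, areas, accuracy in sorted(points, key=lambda p: p[0]):
        if cv_percent(areas) < max_cv and low <= accuracy <= high:
            return conc
    return None
```

For a doubly charged precursor at m/z 500, `ce_ramp(500.0, 2)` yields five energies centred on the predicted value, which matches the "two steps on either side" scheme. `lower_loq` expects the four replicate peak areas per dilution point described above.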

SRM/MRM verification analysis: For the SRM/MRM verification analysis, the tissue digest samples (31 HCC and corresponding tissue samples) (table 1) were prepared as described for the label-free analysis, except that the peptides were dissolved in 0.1% FA after extraction. The mixture of SI peptides was added to 2 µg of tissue digest. All data sets were processed using Skyline 2.6. The peaks were automatically integrated and the Savitzky-Golay smoothing algorithm was applied. The data sets were then manually checked to ensure accurate integration and peak detection. The peak area ratios (endogenous/SI peptide) were exported, and the log-transformed ratios were used for statistical analysis. The data have been deposited at PASSEL and are accessible via the identifier PASS00691.

Statistical analysis: The statistical analysis of the label-free experiment was done within Progenesis LC-MS. Paired ANOVA was applied to the 19 HCC and 19 adjacent non-tumorous tissue samples. Proteins were considered to be significantly differentially expressed when both the p-value and the q-value (FDR-adjusted p-value) were ≤ 0.05. For the statistical analysis of the SRM/MRM data, a paired t-test was applied to the log-scale ratios (endogenous/SI peptide), and the resulting p-values were FDR-adjusted according to Benjamini and Hochberg.¹⁹ The fold change was calculated based on the mean difference between the ratios of the 31 HCC and 31 control samples. To assess the power of each protein to separate HCC from controls, receiver operating characteristic (ROC) analysis was performed. Briefly, the following steps were conducted. First, from the 31 HCC and 31 controls, 1000 pairs of training sets (each containing 20 HCC and 20 controls) and complementary test sets (each containing 11 HCC and 11 controls) were randomly drawn. Second, for each protein, 1000 SVM classifiers were trained using the 1000 training sets. Subsequently, each of the 1000 test sets was predicted using its corresponding SVM classifier. Finally, the results of the 1000 predictions were used to construct a protein-specific ROC curve and compute the corresponding area under the curve (AUC). All statistical analyses of the SRM/MRM data were conducted with an in-house developed R script.
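The multiple-testing correction and the fold-change calculation described for the SRM/MRM data can be sketched as follows. This is a minimal illustration, not the authors' R script: the function names are hypothetical, and log base 2 is an assumption, since the text only states that the ratios were log-transformed.

```python
from math import fsum


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def paired_fold_change(tumor_log2, control_log2):
    """Fold change from paired log2 ratios: 2 ** mean(tumor - control).

    Assumes log base 2; each position holds the tumor and control
    log ratio (endogenous/SI peptide) of the same patient.
    """
    diffs = [t - c for t, c in zip(tumor_log2, control_log2)]
    return 2.0 ** (fsum(diffs) / len(diffs))
```

In use, the p-values from the paired t-tests of all quantified peptides or proteins would be passed to `benjamini_hochberg` in one call, because the adjustment depends on the total number of tests.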

Results:

Label-free analysis and selection of proteins for verification experiment

In this study, we carried out a label-free proteome analysis using samples collected from 19 HCC patients (19 HCC and 19 adjacent non-tumorous tissues) to identify proteins associated with interactors of the Wnt/β-catenin pathway and the RTK-activated Ras and JAK/STAT pathways in hepatocellular carcinoma. An outline of the applied workflow is given in Fig. 1. In total, 2,736 proteins were quantified in the label-free analysis. After protein filtration, 547 proteins were found to be differentially expressed between HCC and controls, of which 426 proteins were up-regulated in HCC and 121 proteins were down-regulated (Fig. 2). All the data are summarized in supplementary table S-4. Three proteins which did not match the filtering criteria were still included in the subsequent analysis. Two of these, Glypican-3 (GPC3) and Alpha-fetoprotein (AFP), were included because they are established biomarkers for HCC.²⁰ Besides these two proteins, Myristoylated alanine-rich C-kinase substrate (MARCKS) was also included for further analysis due to its potential novel role in HCC pathway initiation, as will be illustrated in section three of the discussion (Fig. 3). We explored the biological significance of the differentially expressed proteins using Ingenuity Pathway Analysis (IPA) and manual literature research. Eighty proteins were annotated as associated with TGFβ1, with several tumorigenic transcription factors (CEBPA, CCND1, MYC and c-jun) or with their upstream regulators, the Wnt/β-catenin, Ras and JAK/STAT pathways. The annotation of all proteins is summarized in supplementary table S-3. Manual literature research provided additional hints about which proteins could be appropriate for further analysis. These include proteins associated with HCC or other types of cancer, and proteins which might provide a novel rational understanding of disease development. In general, the cost of synthesizing the SI peptides was the limiting factor for developing SRM/MRM assays for all 80 proteins. Hence, only 31 proteins were selected for further verification analysis (Fig. 3). Betaine-homocysteine S-methyltransferase 1 (BHMT) was the only down-regulated protein selected, as it is well known to be a hepatocyte marker and is less expressed in HCC than in non-tumor tissues.²¹

Multiplex SRM/MRM assay development

To perform SRM/MRM verification of the 31 proteins, 69 stable isotope-labeled peptides were used as internal standards in the SRM/MRM targeted quantification (supplementary table S-2). Initially, we constructed a multiplex SRM/MRM assay for the 69 peptides. The accuracy and the analytical performance of the assay were then checked by creating a regression curve for each SI peptide. All poorly ionized SI peptides with a high LOQ and all endogenous peptides with very low signal were excluded from further analysis. Finally, a multiplex SRM/MRM assay consisting of 40 peptides (40-plex) was established for the verification of 27 proteins. Supplementary figure S-1 shows the regression analysis of all 40 peptides together with the amount of spiked SI peptides. All parameters for the SRM/MRM analyses, including precursor ions, the monitored transitions and the applied collision energy, are provided in supplementary table S-5.

SRM/MRM biomarker verification analysis

The optimized 40-plex assay representing the 27 proteins was applied in the verification analysis. None of the 62 samples (31 HCC and 31 adjacent non-tumorous tissues) analyzed in this verification step had been used in the previous label-free experiment (table 1). To assess the analytical variation of the 40-plex analysis, a standard mixture of the tissue digest was measured six times (once every tenth sample) between the 62 measurements. The median CV% (analytical variation) of all peptide area ratios from the six measured standards was 8.1%. Peptides from PEG (LTEENTTLR), DDX39 (VSVFFGGLSIK) and MAP1B (HNLQDFINIK) had a CV% over 20% and were therefore excluded from further statistical analysis (Fig. 4). In accordance with the results from the label-free analysis, 18 proteins were found to be significantly up-regulated in HCC compared to their adjacent non-tumorous tissue (p-value < 0.05) (table 2). After correction for multiple testing, 11 up-regulated candidates had an FDR-adjusted p-value of less than 0.05 (table 2). As expected, BHMT was the only down-regulated protein in the HCC group (p-value and FDR adjusted p-value