Article pubs.acs.org/jpr
Targeted Proteomics Pipeline Reveals Potential Biomarkers for the Diagnosis of Metastatic Lung Cancer in Pleural Effusion

Chi-De Chen,† Chih-Liang Wang,# Chia-Jung Yu,‡ Kun-Yi Chien,‡ Yi-Ting Chen,§ Min-Chi Chen,∥ Yu-Sun Chang,∇ Chih-Ching Wu,*,⊥ and Jau-Song Yu*,‡

†Graduate Institute of Biomedical Sciences, ‡Department of Cell and Molecular Biology, §Department of Biomedical Sciences, ∥Department of Public Health and Biostatistics Consulting Center, and ⊥Department of Medical Biotechnology and Laboratory Science, College of Medicine, Chang Gung University, Taoyuan 33302, Taiwan
#Division of Pulmonary Oncology and Interventional Bronchoscopy, Department of Thoracic Medicine, Chang Gung Memorial Hospital, Taoyuan 33302, Taiwan
∇Molecular Medicine Research Center, Chang Gung University, Taoyuan 33302, Taiwan
ABSTRACT: The ability to discriminate lung cancer malignant pleural effusion (LC-MPE) from benign pleural effusion has profound implications for the therapy and prognosis of lung cancer. Here, we established a pipeline to verify potential biomarkers for this purpose. In the discovery phase, label-free quantification was performed for the proteome profiling of exudative pleural effusion (PE) in order to select 34 candidate biomarkers with significantly elevated levels in LC-MPE. In the verification phase, signature peptides for the 34 candidates were first confirmed by accurate inclusion mass screening (AIMS). To quantify the candidates in PEs, multiple reaction monitoring mass spectrometry (MRM−MS) with stable isotope-labeled standard (SIS) peptides was performed for the 34 candidate biomarkers, using the QconCAT approach to generate the SIS peptides. The results of the MRM assay were used to prioritize candidates based on their discriminatory power in 82 exudative PE samples. Five potential biomarkers (ALCAM, CDH1, MUC1, SPINT1, and THBS4; AUC > 0.7) and one three-marker panel (SPINT1/SVEP1/THBS4; AUC = 0.95) were able to effectively differentiate LC-MPE from benign PE. Collectively, these results demonstrate that our pipeline is a feasible platform for verifying potential biomarkers for human diseases.

KEYWORDS: Targeted proteomics, biomarker verification, lung cancer, malignant pleural effusion, AIMS, QconCAT, MRM−MS with SIS peptides
Received: December 14, 2013
Published: May 1, 2014
© 2014 American Chemical Society
dx.doi.org/10.1021/pr4012377 | J. Proteome Res. 2014, 13, 2818−2829

■ INTRODUCTION

The past decade has seen the widespread use of proteomic technology for the discovery of disease biomarkers. Tandem mass spectral database searching can be used to achieve high-throughput protein identification.1,2 The subsequent development of stable isotope-labeling methods, such as stable isotope-labeling by amino acids in cell culture (SILAC) and isobaric tag for relative and absolute quantification (iTRAQ), has allowed numerous protein marker candidates to be discovered by comparing relative levels of proteins between different physiological conditions or disease states.3−6 In most cases, the potential markers have been identified by comparisons between only a few case and control samples, with the result that a large proportion of the reported marker candidates might actually be due to individual variations. In addition, only a few candidates in the long initial list have been further verified using immunoassays. As a result, despite the remarkable advances in proteomics technology, few candidate biomarkers discovered by proteomics approaches have been introduced into clinical use. This application gap is mainly due to false-positive results in the initial discovery stage as well as a lack of robust tools that can effectively evaluate the potential of candidates to serve as disease biomarkers in larger numbers of clinical samples.7−9 These observations suggest that there is an urgent need for efficient methods to prioritize potential biomarker candidates prior to the establishment of expensive and time-consuming immunoassays.

Targeted mass spectrometry, with multiple reaction monitoring mass spectrometry (MRM−MS) as the core technology, has been widely used for the quantification of small molecules such as drug metabolites and hormones.10,11 MRM−MS has recently been applied to the multiplexed quantification of proteins in body fluids using stable isotope-labeled standards (SIS).12−14 In MRM assays, unique signature peptides have to be determined to represent each protein for quantification. The signature peptides can be selected from MS-based data sets, such as PeptideAtlas, and then verified with additional experiments, such as accurate inclusion mass screening (AIMS), on authentic samples.15−17 The AIMS method is used to confirm that the given peptides are actually detectable by MS in the target peptide mixtures. Furthermore, MS/MS data from AIMS experiments can be directly used to configure MRM assays of the given peptides.15 Thus, the combination of AIMS and MRM assays can facilitate the development of multiplexed quantification assays for candidate proteins as well as the verification of biomarkers from discovery experiments.18 Recently, a targeted proteomics pipeline consisting of AIMS and semiquantitative selected reaction monitoring (SQ-SRM) has been shown to be capable of verifying multiple plasma biomarkers in a mouse cancer model.19 This pipeline allowed more than 1000 candidates to be narrowed to dozens, and subsequently, 80 candidates were verified in mouse plasma using quantitative MRM assays.19 A high-throughput method for MRM assays has also been developed to test the detectability of more than 1000 cancer-associated proteins in human plasma and urine.20 In another application, AIMS and MRM with SIS peptides were used to study plasma biomarkers of human cardiac injury, demonstrating that targeted proteomics can facilitate the development of biomarker candidates.21 These studies clearly demonstrate the capability of targeted proteomics to verify potential biomarkers.
Pleural effusion (PE), abnormal fluid accumulation in the pleural space, usually occurs in patients with congestive heart failure, infectious diseases, or malignancy.22 In clinical practice, PEs are classified as transudates or exudates based on Light’s criteria.23 Transudates are mainly caused by congestive heart failure or cirrhosis, which affects the balance of hydrostatic and oncotic forces. In contrast, most exudates are caused by pulmonary diseases, such as pneumonia, tuberculosis, pulmonary embolism, or malignancy.22 PEs in which cancer cells are detected, called malignant pleural effusion (MPE), imply a more advanced stage of cancer and a poorer prognosis.24 Among various malignancies, lung cancer (LC) accounts for the highest incidence of MPE.25 PE without etiologic evidence of tumor invasion is called paramalignant PE (PMPE) and may result from lung collapse, local or systemic inflammatory effects of cancer, or coexisting pneumonia.26,27 However, only ∼60% of MPEs can be diagnosed with cytology or pleural biopsy.28 False-negative diagnoses of MPE may underestimate disease occurrence and might provide misleading therapeutic guidance. Thus, a more accurate and sensitive method for MPE diagnosis is needed to improve long-term strategies for cancer treatments.

In this study, to identify biomarkers for MPE diagnosis, a targeted proteomics pipeline consisting of AIMS, QconCAT, and MRM with SIS peptides was used to verify the potential biomarkers discovered in shotgun proteomics experiments (Figure 1A). Our results showed that this pipeline is an effective platform for developing biomarker panel(s) for highly heterogeneous human malignant diseases.
■ MATERIALS AND METHODS

Collection and Preparation of Clinical Specimens

All exudative PE samples were collected at the Department of Thoracic Medicine, Chang Gung Memorial Hospital (Tao-Yuan, Taiwan), and the study was approved by the Institutional Review Board. After centrifugation of samples at 1500g for 10 min at 4 °C, the supernatant was collected and stored in aliquots at −80 °C until use. MPE was diagnosed on the basis of a positive result in a cytological examination or pleural biopsy. Paramalignant pleural effusion (PMPE) was defined as a sample with negative results in all diagnostic methods performed regularly over 6 months. Benign exudate was diagnosed as previously described.22 Exudative PE samples for the discovery phase were collected from 13 lung adenocarcinoma (LC) patients (seven males and six females; mean age, 66.7 years; range, 46−81 years), 12 bacterial pneumonia (PN) patients (seven males and five females; mean age, 73.8 years; range, 46−91 years), and 10 tuberculosis (TB) patients (six males and four females; mean age, 53.0 years; range, 32−88 years) (Supporting Information Table S1). The individual PE samples within each group were then pooled into three groups containing equal amounts of protein. To verify the potential biomarker candidates found in the discovery phase, the 82 individual cases for the MRM−MS assay were age- and gender-matched (Table 1).

Table 1. Overview of Patient Data Sets

parameter
no. of patients                 82
age (mean ± SD)                 64.5 ± 12.9
gender (F/M)                    29:53
type of PE (n; %)
  benign                        43 (52.4%)
    tuberculosis                25 (30.5%)
    pneumonia                   18 (22.0%)
  LC-PMPE                       13 (15.9%)
  LC-MPE                        26 (31.7%)
diagnosis of LC-MPE (n; %)
  initial cytology              21 (78.3%)
  repeat cytology               4 (17.4%)
  pleural biopsy                1 (3.8%)

Depletion of Abundant Proteins, Tryptic In-Solution Digestion, and Protein Identification by Online Two-Dimensional Liquid Chromatography Tandem Mass Spectrometry (2D-HPLC−MS/MS)

The pooled PEs were depleted of six high-abundance human plasma proteins (albumin, IgG, IgA, transferrin, α1-antitrypsin, and haptoglobin) using a multiple affinity removal system (MARS) affinity column (Hu-6HC, 4.6 × 100 mm; Agilent Technologies, Wilmington, DE, USA) on an ÄKTA Purifier-10 fast-performance liquid chromatography system (FPLC; GE Healthcare/Amersham Bioscience, UK). The binding and elution buffers were provided with the kit, and the depletion procedures followed the instructions of the manufacturer. Depleted exudates were buffer-exchanged into 50 mM ammonium bicarbonate and concentrated using an Amicon Ultra-4 centrifugal filter unit with an Ultracel-3 membrane (Millipore, Taipei, Taiwan). The three pooled exudates (40 μg each) were prepared for analysis of protein profiles by reducing with 10 mM dithiothreitol (DTT, Merck, Darmstadt, Germany) at 56 °C for 1 h and alkylating by incubating with 55 mM iodoacetamide (IAAM, Sigma, St. Louis, MO, USA) at room temperature for 1 h. After removing excess alkylating agent by incubating with 40 mM DTT at room temperature for 1 h, protein mixtures were digested with 1 μg of sequencing-grade modified porcine trypsin (Promega, Madison, WI, USA) at 37 °C for 16 h. The peptide mixtures from the tryptic digestion of 4 μg of depleted proteins were then reconstituted in 0.1% formic acid, desalted using a homemade microcolumn (Source 15RPC, GE Healthcare), and analyzed using 2D-HPLC coupled with a linear ion trap mass spectrometer (LTQ-Orbitrap MS; Thermo Fisher, San Jose, CA, USA) operated with Xcalibur 2.0.7 software (Thermo Fisher). The detailed instrument settings for the 2D-HPLC and the LTQ-Orbitrap MS are provided in the Supporting Information.

Additional MS instrument settings in AIMS experiments were as described in our discovery experiments, except that “FT preview scan mode” and “dynamic exclusion” were disabled.
Preparation of the QconCAT Protein
A stable isotope-labeled QconCAT protein was prepared as previously described.30,31 Briefly, the sequence of concatenated signature peptides was arranged according to the spectral count values of each candidate protein. The concatenated sequence was divided into two parts, QconCAT-1 (28 kDa) and QconCAT-2 (27.4 kDa), according to the range of spectral count values for each candidate protein (Supporting Information Figure S1). Nucleotide sequences were deduced from the two concatenated amino acid sequences and were synthesized with codon optimization for expression in Escherichia coli by GENEART (GENEART, Regensburg, Germany). The two artificial QconCAT genes were subcloned into the pET-15b vector (Novagen/Merck, Darmstadt, Germany) and transformed into E. coli (BL21 strain). Expression of the two QconCAT proteins was induced by incubating with 0.4 mM isopropyl β-D-thiogalactopyranoside (IPTG) for 2 h at 37 °C in M9 minimal medium supplemented with 15NH4Cl, and the cells were collected by centrifugation at 2300g for 10 min at 4 °C. The insoluble fractions of extracted E. coli proteins were dissolved in binding buffer (8 M urea, 0.5 M NaCl, 20 mM Na2HPO4) and were purified using Ni Sepharose (GE Healthcare) (Supporting Information Figure S2a). The bound fractions were then eluted with elution buffer (200 mM imidazole, 8 M urea, 0.5 M NaCl, 20 mM Na2HPO4) and separated by HPLC (SpectraSYSTEM; Thermo Fisher) on a reversed-phase C4 column (Jupiter 300, 250 × 4.6 mm; Phenomenex, Torrance, CA, USA) using a linear gradient of 20−37% buffer B (99.9% acetonitrile/0.1% formic acid) for 17 min, 37−60% buffer B for 2 min, and 60−100% buffer B for 2 min, with a flow rate of 0.9 mL/min (Supporting Information Figure S2b). Purified proteins, labeled with the 15N stable isotope, were quantified by amino acid analysis, digested with trypsin, and analyzed with LTQ-Orbitrap MS. The resulting spectra were then searched against an E. coli database extracted from SwissProt plus the sequences of the two QconCAT proteins.
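Because the QconCAT proteins are expressed in medium containing 15NH4Cl, every nitrogen atom in a released SIS peptide is expected to be 15N, so the mass offset between a standard peptide and its endogenous counterpart depends on the number of nitrogen atoms in the sequence. The following Python sketch illustrates that arithmetic under stated assumptions: complete 15N incorporation, unmodified residues, and a peptide sequence (LVNEVTEFAK) chosen only for illustration, not taken from the signature peptides of this study. The function names are hypothetical.

```python
# Nitrogen atoms per amino acid residue (one backbone N plus any side-chain N).
NITROGENS_PER_RESIDUE = {
    "G": 1, "A": 1, "S": 1, "P": 1, "V": 1, "T": 1, "C": 1, "L": 1, "I": 1,
    "D": 1, "E": 1, "M": 1, "F": 1, "Y": 1,
    "N": 2, "Q": 2, "K": 2, "W": 2,
    "H": 3,
    "R": 4,
}

# Mass difference between 15N and 14N in Da (monoisotopic atomic masses).
MASS_SHIFT_PER_N = 15.0001088983 - 14.0030740044


def nitrogen_count(peptide: str) -> int:
    """Count nitrogen atoms in an unmodified peptide sequence."""
    return sum(NITROGENS_PER_RESIDUE[residue] for residue in peptide)


def n15_mass_shift(peptide: str) -> float:
    """Mass shift (Da) of a peptide, assuming every N is replaced by 15N."""
    return nitrogen_count(peptide) * MASS_SHIFT_PER_N


def n15_mz_shift(peptide: str, charge: int) -> float:
    """Shift in m/z between light and 15N-labeled precursors at a given charge."""
    return n15_mass_shift(peptide) / charge


if __name__ == "__main__":
    example = "LVNEVTEFAK"  # illustrative sequence, not from the study
    print(nitrogen_count(example))
    print(round(n15_mass_shift(example), 4))
    print(round(n15_mz_shift(example, 2), 4))
```

Unlike standards labeled only at the C-terminal lysine or arginine, uniformly 15N-labeled standards have a sequence-dependent offset, so the heavy precursor and fragment m/z values need to be computed per peptide when MRM transitions are set up.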
Database Searching and Data Processing
The MS/MS spectra obtained from the LTQ-Orbitrap MS were searched against the SwissProt database (version 56.0, selected for Homo sapiens; 20 401 entries) using Mascot (version 2.2.06, Matrix Science). The cleavage enzyme was set to “trypsin”, with up to one missed cleavage. The variable modification was set to “oxidation on methionine residue”, and the fixed modification was set to “carbamidomethyl cysteine”. The maximum mass tolerance was set to 5 ppm for precursor ions and 0.5 Da for fragment ions (MS/MS peaks). The Mascot search results were imported into Scaffold (Proteome Software, Portland, OR, USA) to obtain information about the peptide/protein probabilities and the spectral count values. The protein identities were filtered using the following criteria: (a) at least two unique peptides with peptide probability greater than 90% and (b) a protein probability greater than 95%. The spectral count values were normalized and used to compare the relative protein levels in malignant (LC) and nonmalignant (PN, TB) PE. Differences in spectral count values between malignant (n = 5) and control (n = 10) diseases were determined with the nonparametric Mann−Whitney test using SPSS software (version 12.0; SPSS Inc., Chicago, IL, USA), followed by setting the false discovery rate (FDR) at 0.05 for correction of multiple testing (q value, R package version 1.28.0). Proteins with FDRs of less than 0.05 and a 2-fold increase in spectral counts in LC-MPE were considered to be potential biomarker candidates.

Selection of Signature Peptides for Candidate Proteins Using Accurate Inclusion Mass Screening (AIMS)
HPLC−MRM Analysis
The signature peptides for each candidate protein were selected using the following criteria: (a) unique peptides containing 8−20 residues without any known post-translational modification sites (determined from the human protein reference database, HPRD)29 and no sequential or missed trypsin cleavage sites; (b) peptides without chemically reactive amino acids, such as C, M, and W; (c) peptides without unstable sequences, such as NG, DG, and QG; and (d) peptides without sequences potentially leading to missed cleavage, such as RP and KP. For candidate proteins where no suitable peptide was found in the discovery experiments, AIMS was used to search for potential signature peptides as previously described.15 The target proteins were digested in silico to obtain all possible tryptic peptides and the corresponding mass-to-charge ratio (m/z) values. After the resulting peptides were filtered using the criteria described above, accurate m/z values were first entered as input into the inclusion list (