Article pubs.acs.org/jpr

Searching Missing Proteins Based on the Optimization of Membrane Protein Enrichment and Digestion Process

Mingzhi Zhao,†,▽ Wei Wei,†,▽ Long Cheng,#,▽ Yao Zhang,†,∥ Feilin Wu,†,‡ Fuchu He,*,† and Ping Xu*,†,§,⊥

J. Proteome Res. 2016, 15, 4020−4029.



† State Key Laboratory of Proteomics, National Engineering Research Center for Protein Drugs, Beijing Proteome Research Center, National Center for Protein Sciences Beijing, Beijing Institute of Radiation Medicine, Beijing 102206, P. R. China
‡ Life Science College, Southwest Forestry University, Kunming 650224, P. R. China
§ Key Laboratory of Combinatorial Biosynthesis and Drug Discovery (Wuhan University), Ministry of Education, and Wuhan University School of Pharmaceutical Sciences, Wuhan 430071, P. R. China
∥ Institute of Microbiology, Chinese Academy of Science, Beijing 100101, China
⊥ Anhui Medical University, Hefei 230032, Anhui, P. R. China
# Department of Medical Molecular Biology, Beijing Institute of Biotechnology, 27 Tai-Ping Lu Road, Beijing 100850, China

Supporting Information

ABSTRACT: A membrane protein enrichment method composed of ultracentrifugation and detergent-based extraction was first developed based on the MCF7 cell line. Then, in-solution digestion with detergents and eFASP (enhanced filter-aided sample preparation) with detergents were compared with the time-consuming in-gel digestion method. Among the in-solution digestion strategies, in-solution digestion with RapiGest identified 1125 membrane proteins. Similarly, eFASP combined with sodium deoxycholate identified 1069 membrane proteins, whereas in-gel digestion characterized 1091 membrane proteins. In total, with the five digestion methods, 1390 membrane proteins were identified with ≥1 unique peptide, among which 1345 membrane proteins contained ≥2 unique peptides. This is the largest membrane protein data set for the MCF7 cell line and even for breast cancer tissue samples. Interestingly, we identified 13 unique peptides belonging to 8 missing proteins (MPs). Finally, eight unique peptides were validated by synthesized peptides. Two proteins were confirmed as MPs, and another two proteins were candidate detections.

KEYWORDS: membrane proteomics, ultracentrifugation, detergent, digestion, missing proteins



INTRODUCTION

Breast cancer is one of the leading cancers in women worldwide, accounting for 23% of total cancer cases and 14% of cancer deaths every year. In clinical trials, breast cancers were first divided into two subsets according to estrogen receptor (ER) expression at the very beginning of the 1970s. For a long period, treatment decisions were based solely on clinicopathological variables and on ER, progesterone receptor (PR), and HER2 expression.1 These subsets later matured into four accepted subtypes based on gene expression patterns: Luminal A, Luminal B, HER2-enriched, and basal-like breast cancer.2−4 However, this classification cannot perfectly reflect the clinical subtypes, and more detailed systematic research is required. Aiming to understand the clinical behavior and histopathological features of this disease, high-throughput platforms such as microarrays, whole-transcriptome sequencing, and proteomics have been applied extensively in recent years.2,4−7 Genomic technology alone cannot explain the cause of disease because many genomic variations are not, or are only partially, translated to the protein level or cannot reflect the active cellular functions.8−10 These limitations call for large-scale quantitative proteomics with high confidence in protein identification and quantification to search for the key genes or proteins in cancer.11 The MCF7 cell line is a stable breast cancer cell line established in 1973 by Soule and colleagues. MCF7 is ER- and PR-positive, and the antiestrogen drug tamoxifen can inhibit its growth.12,13 These properties make it a suitable in vitro model for breast cancer research.

© 2016 American Chemical Society

Special Issue: Chromosome-Centric Human Proteome Project 2016
Received: April 30, 2016
Published: August 3, 2016

DOI: 10.1021/acs.jproteome.6b00389 J. Proteome Res. 2016, 15, 4020−4029


Journal of Proteome Research

Figure 1. Overview of the methodology and rough evaluation of MBP enrichment, solubilization, and digestion. Overview of the methodology for MBP enrichment and comparison of digestion methods (A). TCL, Super, and MBP fractions obtained by ultracentrifugation followed by detergent-based extraction, shown on an SDS-PAGE gel (B). Western blot of the same samples as in panel B with antibodies against E-cadherin, β-actin, and α-tubulin (C). Aliquots (5 μg) of the MBP fraction from the solubilization and in-solution digestion steps visualized on an SDS-PAGE gel (D).

Although membrane proteins (MBPs) account for 20−30% of all encoded human proteins and play important roles in various biological processes,14 research on their structure and function remains challenging because of their strong hydrophobicity, low abundance, and other properties. The challenges in membrane proteomics include MBP annotation, enrichment, digestion, and downstream signal pathway analysis. Many methods for MBP enrichment have been optimized and reported; these include cell-surface capture technologies (biotin label, cationic colloidal silica particles, etc.),15,16 differential and density gradient centrifugation,17 and detergent-based extraction.18 In-gel digestion is widely employed in shotgun membrane proteomic research because sodium dodecyl sulfate (SDS) can solubilize MBPs completely and can be removed in later steps; however, SDS cannot be used for in-solution digestion because even a low concentration can severely affect proteolytic enzyme activity.19−22 Furthermore, in-gel digestion is laborious and suffers from limited dynamic range or low cleavage yield for some peptides and proteins because of their large size or high hydrophobicity.23,24 To replace in-gel digestion, FASP and the further optimized eFASP method were developed for proteomic sample preparation.25,26 Many candidate reagents compatible with enzyme activity or mass spectrometry (MS) have been tested for the in-solution digestion of MBPs.27,28 Among these reagents, sodium deoxycholate (SDC)27,29 and RapiGest30 showed not only good solubility and enzyme compatibility but also easy removal in later steps.

In this paper, a complete membrane proteomic workflow, from membrane enrichment to the comparison of different digestion methods, was optimized based on the MCF7 cell line. Ultracentrifugation and detergent-based extraction were combined for MBP preparation. Different digestion strategies were applied and compared. Our MS data showed that in-solution digestion with RapiGest and eFASP with SDC are time-saving and consistent in MBP identification. Besides producing the largest MBP data set of MCF7, four potential MPs that fit the HPP guidelines were identified and validated using synthesized peptides.



MATERIALS AND METHODS

Materials

RapiGest was purchased from Waters (Milford, MA). Dithiothreitol (DTT), iodoacetamide (IAA), sucrose, tetrasodium ethylene glycol tetraacetic acid (EGTA), sodium orthovanadate, and sodium fluoride were purchased from Amresco (Solon, OH). Unstained protein ladder (BenchMark) was purchased from Life Technologies (Waltham, MA). 2-D Quant Kit was purchased from GE Healthcare (Pittsburgh, PA). Protease inhibitor cocktail was purchased from Roche (Mannheim, Germany). r-Ac-trypsin and Lys-C were recombinantly expressed and purified in our lab.31,32 Acetonitrile, trifluoroacetic acid (TFA), formic acid (FA), human insulin, and hydrocortisone hemisuccinate were purchased from Sigma (Darmstadt, Germany). Anti-E-cadherin antibody was purchased from Abcam (Cambridge, MA). Anti-β-actin and anti-α-tubulin antibodies were purchased from Santa Cruz Biotechnology (Dallas, TX). Mem-PER Kit was purchased from Pierce (Waltham, MA). All other reagents were at least analytical grade, and solvents were HPLC grade.

Overall Scheme for Membrane Proteomics Study

A flowchart illustrates the MBP enrichment and digestion process (Figure 1A). In brief, the MBPs of MCF7 cells were enriched by ultracentrifugation and detergent-based extraction. Aliquots of the MBPs were used for in-gel, in-solution, and eFASP digestion separately.

MCF7 Cell Culture

The MCF7 cell line was routinely cultured in Dulbecco’s modified Eagle’s medium (DMEM, Life Technologies, Waltham, MA) containing 10% fetal bovine serum (FBS). The cells were first washed with ice-cold phosphate-buffered saline (PBS) containing protease inhibitor cocktail, then scraped in the same buffer and centrifuged at 60g for 2 min at 4 °C. The cell pellets were collected and frozen at −80 °C until further use.

MBP Enrichment with Ultracentrifugation Combined Detergent-Based Extraction

MCF7 cell pellets (three plates each) were separately suspended in cell lysis buffer A (5 mM Tris, 1 mM EGTA, 1 mM sodium orthovanadate, 2 mM sodium fluoride, and protease inhibitor cocktail)33 and disrupted with a Soniprep sonicator (Scientz, China) on ice. Debris was removed by centrifugation at 9600g for 15 min at 4 °C. The supernatant was centrifuged at 120 000g for 80 min at 4 °C to obtain the crude MBP fraction in the pellet. The pellet was suspended in ice-cold 0.1 M Na2CO3 solution, incubated at 4 °C for 1 h on a vertical mixing apparatus (Scientz), and ultracentrifuged again at 120 000g for 80 min at 4 °C. The MBPs in the pellet were solubilized with 150 μL of Reagent A supplied in the Mem-PER kit and processed further according to the kit protocol. After removal of the upper phase, 100 μL of the lower phase and the insoluble parts were clarified and solubilized with buffer B (20 mM Tris, 20 mM DTT, 4% SDS, pH 8.0). Aliquots of the MBP fraction (50 μg each) were precipitated with acetone (precooled to −20 °C) for 1 h at −20 °C at a ratio of 9:1 (volume ratio, acetone/protein sample) and then centrifuged at 8000g for 15 min at 4 °C to obtain the pellet. The pellet was washed again with 90% acetone and centrifuged at 8000g for 15 min at 4 °C to obtain the final MBP fractions. The protein concentrations of the cell lysate and the enriched MBP fractions were measured with the 2-D Quant Kit. Equal aliquots of the total cell lysate (TCL), the supernatant after ultracentrifugation (Super), and the solubilized MBP fraction were run on a modified 12% SDS-PAGE gel (SDS concentration in gel preparation increased from 0.1 to 0.2%), followed by Coomassie Brilliant Blue G-250 staining. The same amount of protein from each fraction was run on a 12% SDS-PAGE gel, transferred to a polyvinylidene fluoride (PVDF) membrane, blocked with 5% bovine serum albumin (BSA), and incubated overnight at 4 °C with anti-β-actin antibody (1:5000 dilution by volume), anti-E-cadherin antibody (1:200 dilution by volume), or anti-α-tubulin antibody (1:500 dilution by volume). After being washed three times with TBS-T buffer, the membrane was incubated with horseradish peroxidase-conjugated secondary antibody (Santa Cruz, Dallas, TX), followed by chemiluminescent detection with ECL detection reagent.

Comparison of MBP Digestion Methods

For in-gel digestion, a 40 μg aliquot of the MBP fraction was first solubilized with buffer B and incubated at 60 °C for 30 min; iodoacetamide (IAA) was then added to 20 mM, and the sample was incubated at RT in the dark for 30 min. The sample was then run on a 12% SDS-PAGE gel. After Coomassie staining and destaining, the gel was cut into 10 pieces. In-gel digestion was performed first with Lys-C overnight at a mass ratio of 1:100 at 37 °C, followed by trypsin digestion (1:50) at 37 °C for another 4 h. The digested peptides were taken for LC−MS/MS analysis.

Aliquots of 50 μg of the MBP fraction were used for the in-solution and eFASP digestions. The compositions of the protocol-specific key buffers are described in Table 1. For in-solution digestion with SDC (ISD:SDC), the MBP fraction was mixed with buffer C (50 mM NH4HCO3, 0.5% SDC, and 5 mM DTT), pipetted up and down several times with a clean pipet, sonicated for 20 min, and heated to 100 °C for another 10 min to solubilize the MBP fraction. After solubilization, IAA was added to 5 mM and incubated for 30 min at RT. An aliquot of 10% (5 μg) of the solubilized protein was checked by SDS-PAGE. Lys-C was added at a ratio of 1:100 (weight of enzyme to protein) and incubated at 37 °C overnight; trypsin was then added at 1:50, and digestion continued at 37 °C for another 4 h. After digestion, 10% (5 μg) of the digested sample was taken to visualize the digestion efficiency by SDS-PAGE. TFA was added to a final concentration of 0.5% in the digested sample, which was incubated at 37 °C for 45 min and centrifuged at 13 000 rpm for 10 min to pellet and remove the surfactant.

In-solution digestion with RG (ISD:RG) was performed in the same way as the ISD:SDC method, except that the solubilization buffer D was composed of 50 mM NH4HCO3, 0.1% RG, and 5 mM DTT.

For eFASP with SDC (SF-ISD:SDC), the MBP pellet (50 μg) was mixed with buffer B and pipetted up and down several times with a clean pipet. The MBP fraction was solubilized by sonication for 20 min and heating to 100 °C for another 10 min. After solubilization, IAA was added to 5 mM and incubated at RT in the dark for 30 min to fully alkylate the free cysteines. After clarification at 9600g for 15 min, the sample was transferred to a 3 kDa spin filter device (UFC500396, Merck-Millipore) and mixed with 200 μL of digestion buffer E (50 mM NH4HCO3 and 0.5% SDC). The device was centrifuged at 9600g for 15 min at 4 °C, and this step was repeated three times to exchange the buffer and remove SDS. All subsequent centrifugation steps were performed under the same conditions. After buffer exchange, buffer E was added to the sample to a volume of 100 μL, and 10% (5 μg) of the sample was analyzed by SDS-PAGE. The digestion step was performed as described for ISD:SDC. eFASP with RapiGest (SF-ISD:RG) was performed in the same way as SF-ISD:SDC, except that the final digestion buffer F was composed of 50 mM NH4HCO3 and 0.1% RG.

The digested peptides of the four in-solution digestions were fractionated separately using an HPLC system (Agilent 1260 Infinity) with a C18 column (Kinetex 2.6 μm C18, 50 × 4.60 mm, Phenomenex). The mobile phase consisted of buffer A containing 2% acetonitrile (ACN) and 98% H2O (pH 10.0) and buffer B containing 98% ACN and 2% H2O (pH 10.0). The HPLC gradient was set at a flow rate of 0.8 mL/min with a five-step gradient: 0% B for 4 min, 0−60% B in 16 min, 60−100% B in 0.1 min, 100% B for 2 min, 100−0% B in 2 min, and 0% B for 1 min. The eluted samples were collected every minute from 2 to 22 min and then pooled into 10 fractions for LC−MS/MS analysis after vacuum drying. The fraction merging method is shown in Figure S-1.

Table 1. Overview of the Key Buffer Conditions for In-Solution and In-Gel Digestions
Methods compared: ISD:SDC, ISD:RG, SF-ISD:SDC, SF-ISD:RG, and in-gel.
protocol step | condition
solubilization | 0.5% SDC, 5 mM DTT
solubilization | 0.1% RapiGest, 5 mM DTT
solubilization | 4% SDS, 5 mM DTT
cysteine alkylation | 15 mM IAA
SDS removal | 0.5% SDC
SDS removal | 0.1% RapiGest
digestion condition | 0.5% SDC
digestion condition | 0.1% RapiGest
digestion enzymes | Lys-C/trypsin
detergent removal | 0.5% TFA
fraction method | high pH-RPLC
fraction method | gel pieces

LC−MS Analysis and Database Searching

Reverse-phase separation of the digested peptides was performed on a nano-UPLC system (nanoAcquity Ultra Performance LC, Waters), and eluted peptides were ionized under high voltage (1.5 kV) and analyzed by an LTQ-Orbitrap Velos (Thermo Scientific).34 The initial MS spectrum (MS1) was acquired over a mass range of 300−1600 Da with a resolution of 30 000 at m/z 400. The automatic gain control (AGC) was set to 1 × 10^6, and the maximum injection time (MIT) was 150 ms. The subsequent MS spectra (MS2) were acquired in data-dependent mode for the 20 most intense ions, which were fragmented in the linear ion trap. For each scan, the AGC was set at 1 × 10^4, and the MIT was 25 ms. The dynamic exclusion was set at 20−40 s to suppress repeated detection of the same fragment ion peaks.

The MS/MS raw files produced by the LTQ-Orbitrap Velos and the control data sets PXD000442,35 PXD000066,36 and PXD002619,7 were processed with Proteome Discoverer 2.0 (v2.0.0.802) against the overlapping entries (20 055 entries, including 2949 MPs) between the Swiss-Prot database (release 2015.12) and the neXtProt database (release 2016.02). The search engine Mascot Server (2.3.01) was used. Key parameters for database searching were as follows: enzyme specificity was set to trypsin; carbamidomethylation of cysteine was set as a fixed modification; oxidation of methionine was set as a variable modification; the tolerances of peptides and fragment ions were set at 20 ppm and 0.5 Da, respectively; the missed cleavage cutoff for protease digestion was set to two. Only identifications satisfying the following criteria were considered: peptide length ≥7; FDR ≤ 1% at both the peptide and protein levels.37 The 245 common contaminant proteins were excluded from the final protein lists (http://bioinfo.hupo.org.cn/contaminants.html).

The membrane prediction was performed following the method previously described.38 In brief, three online methods (Phobius: http://phobius.sbc.su.se/, SOSUI: http://bp.nuap.nagoya-u.ac.jp/sosui/, and TMHMM: http://www.cbs.dtu.dk/services/TMHMM/) were used for MBP prediction.38−40 A protein was annotated as an MBP only if it was predicted by at least two of the three methods. The subcellular location and biological function annotation of the MBPs was analyzed with UniProt.

Verification of Missing Protein with Synthesized Peptides

To evaluate the authenticity of the unique peptides, especially the unique peptides of MPs (PE = 2, 3, 4) in this study, spectra were manually inspected by examining the base peak intensity and the b/y ion matching with the assistance of the pFind and pBuild software.41,42 To consider alternate explanations of the peptide−spectrum matches (PSMs) that passed the manual check, an open search was performed. Spectra conflicting with the previous search were filtered out. Relatively high quality peptides (unique peptides ≥2, peptide length ≥9 amino acids) were selected and chemically synthesized by BankPeptide (Hefei, China) for further analysis. The peptides were solubilized and analyzed with the same LC−MS/MS parameters described above. Both m/z and ion intensity were used to calculate the cosine similarity score, and the formula was similar to that previously described.34 A peptide was considered a very high confidence peptide only when its similarity score was higher than 0.9. Because isobaric substitutions might change the mapping of a peptide from an MP to a commonly observed protein, isobaric filtering was performed by evaluating whether I = L, Q[Deamidated] = E, or GG = N substitutions matched another entry in the protein database.
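The cosine similarity comparison between the spectrum of a digested peptide and that of its synthesized counterpart can be sketched as follows. This is only a minimal illustration, not the authors' formula (the text states only that it was similar to ref 34): the function name, the greedy nearest-peak pairing, and the reuse of the 0.5 Da fragment tolerance for peak matching are all assumptions made here.

```python
import math

def cosine_similarity(spec_a, spec_b, tol=0.5):
    """Cosine similarity of two spectra given as lists of (m/z, intensity).

    Peaks are paired greedily by nearest m/z within `tol` Da (an assumed
    matching rule); unpaired peaks contribute a zero partner intensity.
    """
    b_sorted = sorted(spec_b)
    used = [False] * len(b_sorted)
    pairs = []
    for mz_a, int_a in sorted(spec_a):
        best = None
        for j, (mz_b, _) in enumerate(b_sorted):
            if used[j]:
                continue
            diff = abs(mz_a - mz_b)
            if diff <= tol and (best is None or diff < abs(mz_a - b_sorted[best][0])):
                best = j
        if best is None:
            pairs.append((int_a, 0.0))
        else:
            used[best] = True
            pairs.append((int_a, b_sorted[best][1]))
    # Peaks present only in spec_b
    pairs.extend((0.0, int_b) for j, (_, int_b) in enumerate(b_sorted) if not used[j])

    dot = sum(a * b for a, b in pairs)
    norm_a = math.sqrt(sum(a * a for a, _ in pairs))
    norm_b = math.sqrt(sum(b * b for _, b in pairs))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Toy peak lists (hypothetical values, not data from this study)
digested = [(175.1, 100.0), (288.2, 50.0)]
synthetic = [(175.1, 200.0), (288.2, 100.0)]
score = cosine_similarity(digested, synthetic)
```

Because the score depends only on the relative intensity pattern, two spectra that differ by a constant intensity factor still score 1, which is why it suits a comparison between an endogenous peptide and a more abundant synthetic standard.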

RESULTS AND DISCUSSION

Ultracentrifugation and Detergent-based Extraction Method Acquired MBP Fraction with High Purity

The mass of the MBP fraction extracted by the two enrichment methods showed a similar yield (about 13% of the TCL proteins based on the 2-D Quant method, data not shown). The MBP fraction obtained by ultracentrifugation followed by detergent-based extraction showed a different profile from the TCL and the supernatant after ultracentrifugation (Figure 1B). E-cadherin was probed as a representative membrane protein, and α-tubulin and β-actin were probed as representative cytoplasmic proteins. As expected, the relative intensity of E-cadherin in the MBP fraction was much higher than in the supernatant, while the relative intensity of β-actin in the MBP fraction was much lower than in the supernatant (Figure 1C). Together, these two results show that the MBPs were specifically enriched with high purity.

RapiGest and SDC Exhibit Different Soluble Abilities and Enzyme-Compatibility

As shown in Figure 1D, only a part of the MBP fraction was solubilized by SDC (lane 3, Figure 1D). RapiGest worked well in both the solubilization and digestion steps (lanes 4 and 8, Figure 1D). In the eFASP method, when the proteins were solubilized with SDS before exchange to SDC or RG by three rounds of ultrafiltration, they were solubilized well (lanes 5 and 6, Figure 1D). The proteins in the SF-ISD:SDC method were digested completely (lane 9, Figure 1D). However, worse digestion efficiency was found in SF-ISD:RG (lane 10, Figure 1D). These results suggest that SDS might not be replaced completely by RapiGest in the SF-ISD:RG method.

The detailed database searching criteria and results are summarized in Supplementary Table 2. The identified proteins with more than one unique peptide are summarized in Table 2 and Figure 2A. The number of identified proteins of MCF7 cells varied from 1201 to 3505, of which 309 to 1125 were MBPs. The in-gel digestion yielded 1091 MBPs (36.6% of total identified proteins). The ISD:RG and SF-ISD:SDC methods identified 1125 and 1069 MBPs, respectively. With the SF-ISD:RG protocol, the MBP fraction could not be digested completely, resulting in the fewest proteins being identified among the five digestion methods. Therefore, this group of data was not used for further analysis.

At the peptide level, in-gel digestion characterized more peptides (average 6.3 peptides per protein) than the other three methods (Table 2). At the protein level, the intensity distribution comparison showed that in-gel digestion gave the highest protein intensity when starting from the same amount of proteome sample (Figure 2B). Among the two in-solution digestions and eFASP with SDC, ISD:RG gave the highest protein intensity.

Table 2. Proteomic Comparison of In-Solution Digestions and In-Gel Digestion
digestion method | protein type | unique peptides | proteins | peptides/protein | MBP (%)
in-gel | MBP | 7361 | 1091 | 6.7 | 36.6
in-gel | total proteins | 18707 | 2984 | 6.3 |
ISD:SDC | MBP | 2364 | 720 | 3.3 | 25.6
ISD:SDC | total proteins | 10393 | 2808 | 3.7 |
ISD:RG | MBP | 5396 | 1125 | 4.8 | 32.1
ISD:RG | total proteins | 16627 | 3505 | 4.7 |
SF-ISD:SDC | MBP | 4542 | 1069 | 4.2 | 33.5
SF-ISD:SDC | total proteins | 13185 | 3194 | 4.1 |
SF-ISD:RG | MBP | 745 | 309 | 2.4 | 25.7
SF-ISD:RG | total proteins | 3149 | 1201 | 2.6 |

Figure 2. Comparison of in-solution and in-gel digestions based on the LC−MS/MS results. Comparison of identified MBPs and non-MBPs of the five digestion methods (A). The intensity distribution of the identified proteins from four digestion methods (not including SF-ISD:RG) (B). Venn diagram comparison of characterized proteins of ISD:SDC, ISD:RG, SF-ISD:SDC, and in-gel digestion (C). Venn diagram comparison of characterized MBPs of ISD:SDC, ISD:RG, SF-ISD:SDC, and in-gel digestion (D).

ISD:RG and SF-ISD:SDC Methods Exhibit the Faithful Identification of MBPs with In-Gel Digestion

ISD:RG and SF-ISD:SDC characterized more proteins than in-gel digestion (Table 2, Figure 2A,C). The proteins identified by ISD:RG and SF-ISD:SDC, especially the MBPs, overlapped to a large extent with those from in-gel digestion (Figure 2D). The molecular weights (MWs) of the MBPs identified by ISD:RG, SF-ISD:SDC, and in-gel digestion also showed comparable distributions. The MW distribution analysis showed that the largest share of the MBPs had MWs between 20 and 60 kDa (Figure 3A). The transmembrane domain prediction analysis showed that the majority of MBPs had one transmembrane domain, although there were still some proteins with a TMHMM number ≥10 (Figure 3B). In the subcellular prediction of the identified MBPs, the top three subcellular locations were plasma membrane, endoplasmic reticulum, and mitochondrion (Figure 3C). In the biofunction prediction of the identified MBPs, the top three biofunctions were binding proteins, enzymes, and receptors (Figure 3D). When comparing the numbers in each group, the MBP numbers of the ISD:RG and SF-ISD:SDC methods, especially the ISD:RG method, are comparable to those of in-gel digestion.

Figure 3. Distribution comparison of the identified MBPs of in-solution and in-gel digestions based on the LC−MS/MS results. MW distribution of the identified MBPs (A). TMHMM distribution of the identified MBPs (B). Subcellular location distribution of the identified MBPs (C). Biofunction distribution of the identified MBPs (D).

Largest Membrane Proteome Data Set of MCF7 in Breast Cancer Study

In our data set, 1125 MBPs were identified with the ISD:RG method. In total, 1390 MBPs (unique peptides ≥1) were identified, among which 1345 MBPs were identified with unique peptides ≥2. At the ≥2 unique peptide level, this identification is larger than all of the previously published membrane proteomic data sets (Table 3).7,35,36 When comparing the identified proteins with the three data sets, the proteins identified in this study contained a large part of uniquely identified proteins (Figure 4A). Interestingly, our data had the biggest overlap with Muraoka’s data set (Figure 4B);36 however, the other two data sets contain only a small portion of shared MBPs. This result implies that the regular proteomic protocol is inefficient for the identification of MBPs. The new method we present here might be helpful in the identification of the membrane proteome in breast cancer.

Table 3. Comparison of Identified MBP Numbers of Our Dataset with Previous Research
sample type | total proteins (unique peptides ≥1) | MBPs (unique peptides ≥1) | total proteins (unique peptides ≥2) | MBPs (unique peptides ≥2) | research | PXD no.
MCF7 cell | 2366 | 163 | 2303 | 160 | Segura35 | PXD000442
MammaPrint breast cancer | 6093 | 1824 | 4484 | 1333 | Muraoka36 | PXD000066
breast cancer | 5161 | 779 | 5008 | 744 | Tyanova7 | PXD002619
MCF7 cell | 4683 | 1390 | 4542 | 1345 | Zhao | PXD004131
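The comparison in Table 3 can be restated as a few lines of arithmetic. The sketch below simply copies the MBP counts from Table 3; the variable and function names are hypothetical and chosen only for illustration.

```python
# MBP counts copied from Table 3:
# (study, PXD no., MBPs with >=1 unique peptide, MBPs with >=2 unique peptides)
TABLE3 = [
    ("Segura", "PXD000442", 163, 160),
    ("Muraoka", "PXD000066", 1824, 1333),
    ("Tyanova", "PXD002619", 779, 744),
    ("Zhao", "PXD004131", 1390, 1345),
]

def largest_by(column):
    """Return the study with the most MBPs in the given column index."""
    return max(TABLE3, key=lambda row: row[column])[0]

top_ge1 = largest_by(2)  # unique peptides >= 1
top_ge2 = largest_by(3)  # unique peptides >= 2

# Margin of the present data set over the next largest at the >=2 level
counts_ge2 = sorted((row[3] for row in TABLE3), reverse=True)
margin_ge2 = counts_ge2[0] - counts_ge2[1]

print(top_ge1, top_ge2, margin_ge2)
```

With these counts, the present data set has the most MBPs at the ≥2 unique peptide level (1345 versus 1333 for the Muraoka data set), whereas at the ≥1 level the Muraoka data set lists more MBPs (1824 versus 1390).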

Figure 4. Comparison of the identified proteins and MBPs with previous research. Venn diagram comparison of the proteins identified in this study with those of previous studies (A). Venn diagram comparison of the MBPs identified in this study with those of previous studies (B).

Four Candidate MBPs Belong to MPs

Missing protein (MP) searching is an important mission of the C-HPP study.43 According to the newly released neXtProt database, there are 2949 protein products that have not been identified.44,45 Among the identified proteins of our data set, there were 13 unique peptides belonging to nine MPs (Table 4). After manual checking, eight high-quality peptides for four MPs (Q6UWH6, Q8IZD6, O75474, and Q3SY17/Q9H1U9) were compared with synthesized peptides and showed consistent MS2 spectra. The eight peptides have high similarity (score ≥0.9) with the MS2 spectra of the synthesized peptides, as judged by the b1+, y1+, b2+, and y2+ ion matches and by the peak intensity pattern. The spectra for the peptide of O75474 (AVAAVAATGPASAPGPGGGR) are shown in Figure 5B. The other MS2 spectra are shown in Figures S-2−S-9. Among the identified MPs, P0DMR1, P0C5Z0, Q12999, and Q7Z769 were identified with only one unique peptide (Table 4). Q6UWH6 and Q8IZD6 met the guideline requirements, and they were considered MPs. Q3SY17 and Q9H1U9 belong to the solute carrier family and have high sequence similarity (95.96%); based on the two identified unique peptides (non-PE1 peptides), the two proteins cannot be distinguished, so they were considered candidate detections. O75474 was also considered a candidate detection because two peptides belonging to O75474 were identified (AVAAVAATGPASAPGPGGGR and LLQQLVLSGNLIK), but the peptide LLQQLVLSGNLIK is shared with Q92837.

Table 4. Identified MPs in This Research
raw file | peptide | score | DeltaM (ppm) | protein
MCF7_inGel_01 | LGILVVFSFIKEAILPSR | 53.73 | 9.32 | Q6UWH6
MCF7_inGel_01 | RLGILVVFSFIKEAILPSR | 69.17 | 5.75 | Q6UWH6
MCF7_ISD_RG_05 | ETGSFLDLFR | 41.08 | −0.48 | Q8IZD6
MCF7_ISD_RG_03 | LSEAEEALYLIAK | 78.7 | −0.62 | Q8IZD6
MCF7_ISD_SDC_07 | AVAAVAATGPASAPGPGGGR | 54.16 | 0.3 | O75474a
MCF7_ISD_SDC_10 | LLQQLVLSGNLIK | 61.79 | 1.22 | O75474a
MCF7_inGel_08 | LQSQIGGEFQSFPK | 38.64 | −2.27 | Q3SY17/Q9H1U9b
MCF7_inGel_08 | NGLSNVLFFGLR | 45.54 | 6.33 | Q3SY17/Q9H1U9b
MCF7_SF_ISD_SDC_05 | MIASQVVDINLAAEPK | 54.51 | 3.43 | see note
MCF7_SF_ISD_SDC_03 | VLELAGNEAQNSGER | 52.5 | 1.09 | see note
MCF7_inGel_04 | GSSGAGGR | 34.21 | 11.17 | see note
MCF7_SF_ISD_SDC_01 | SQSPTCQMCGEK | 60.51 | −2.21 | see note
MCF7_SF_ISD_SDC_10 | AMTTPVIIAIQTFCYQK | 32.37 | 3.71 | see note
Note: the last five peptides belong to P0DMR1c, P0C5Z0d, Q12999d, and Q7Z769d.
a Two peptides of good MS2 quality, but LLQQLVLSGNLIK was reported in the PE = 1 protein Q92837; this protein is a “candidate detection”.
b Q3SY17 and Q9H1U9 belong to the same protein family (very high sequence similarity, 95.96%). They cannot be distinguished based on the two unique peptides. This protein group is a “candidate detection”.
c With only one unique peptide but MS2 spectra of good quality.
d MS2 spectra of bad quality.
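The isobaric filtering described in the Methods (I = L, Q[Deamidated] = E, GG = N) amounts to asking whether a candidate MP peptide could also be explained by another database entry once isobaric residues are treated as equal. A minimal sketch of such a check is given below, under stated assumptions: the function names and the toy sequences are hypothetical, every Q is conservatively treated as potentially deamidated, and tryptic cleavage context is ignored. It is not the authors' implementation.

```python
def canonical(seq: str) -> str:
    """Collapse isobaric alternatives into one representative string.

    I -> L (identical mass); Q -> E (deamidated Q has the mass of E, and every
    Q is conservatively assumed to be deamidated); N -> GG (Asn has the same
    mass as Gly-Gly).
    """
    return seq.upper().replace("I", "L").replace("Q", "E").replace("N", "GG")

def isobaric_hits(peptide: str, proteins: dict) -> list:
    """Accessions whose sequence contains the peptide up to isobaric changes."""
    key = canonical(peptide)
    return [acc for acc, seq in proteins.items() if key in canonical(seq)]

# Toy database with hypothetical accessions and sequences (not real entries)
toy_db = {
    "TOY1": "MKLLQQLVLSGNLIKAA",   # contains the peptide exactly
    "TOY2": "MKLIEQIVLSGGGLLKAA",  # matches only after isobaric collapsing
    "TOY3": "MKAAAA",              # unrelated
}
hits = isobaric_hits("LLQQLVLSGNLIK", toy_db)
```

A peptide whose hit list contains any commonly observed (PE = 1) entry would not count as unique evidence for an MP; in the toy database the peptide matches both TOY1 and TOY2.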

Because the MPs have no or insufficient protein evidence, functional analysis by GO or UniProt provided little useful information. For example, O75474 is predicted to be involved in the Wnt signaling pathway, while Q3SY17, Q9H1U9, and Q8IZD6 are predicted to be involved in transport. P0DMR1 is predicted to be involved in nucleotide binding, and Q6UWH6 is predicted to be involved in ER to Golgi vesicle-mediated transport.



CONCLUSIONS

We optimized a method for MBP enrichment with ultracentrifugation, followed by detergent-based extraction and MS analysis. A total of 1091 MBPs were characterized among the 2984 proteins identified with the in-gel digestion method. A total of 1125 MBPs were identified with the ISD:RG method; 1069 MBPs were characterized among the 3194 proteins characterized with the SF-ISD:SDC method. Further evaluation showed that the ISD:RG and SF-ISD:SDC methods are as faithful as in-gel digestion, but they are time-saving and easy to operate. The number of MBPs in our data set is the largest (unique peptides ≥2) when compared with other data sets of MCF7 and breast cancer tissues. After strict filtering, eight unique peptides belonging to four MPs were selected as highly credible by spectral quality check and synthesized peptide match. Furthermore, Q6UWH6 and Q8IZD6 are the identified MPs, while O75474 and Q3SY17/Q9H1U9 are candidate detections. These results suggest that MBP enrichment-based membrane proteomics can be applied to breast cancer research, in particular to studies of MBP-related molecular mechanisms.

Figure 5. Validation of the identified peptide of O75474 using the synthesized peptide method. The peptide AVAAVAATGPASAPGPGGGR, which represents O75474, from the MCF7 cell digest was compared with the synthesized peptide.



ASSOCIATED CONTENT

Supporting Information

The Supporting Information is available free of charge on the ACS Publications website at DOI: 10.1021/acs.jproteome.6b00389.
Figure S-1. High pH-RPLC separation of in-solution digestion peptide samples. The sample pooling method is described for the ISD:RG condition. The same pooling method was used for the remaining three conditions.
Figures S-2−S-9. The peptide AVAAVAATGPASAPGPGGGR of protein O75474, the peptide LLQQLVLSGNLIK of protein O75474, the peptide LQSQIGGEFQSFPK of protein Q3SY17 or Q9H1U9, the peptide NGLSNVLFFGLR of protein Q3SY17 or Q9H1U9, the peptide LGILVVFSFIKEAILPSR of protein Q6UWH6, the peptide RLGILVVFSFIKEAILPSR of protein Q6UWH6, the peptide ETGSFLDLFR of protein Q8IZD6, and the peptide LSEAEEALYLIAK of protein Q8IZD6, each from both digested and synthesized peptides, respectively. (ZIP)

AUTHOR INFORMATION

Corresponding Authors

*F.H.: Tel/Fax: 86-10-68171208. E-mail: [email protected].
*P.X.: Tel: 8610-61777113. Fax: 8610-61777113. E-mail: [email protected].

Author Contributions

M.Z., W.W., and L.C. contributed equally to this work.

Notes

The authors declare no competing financial interest.



ACKNOWLEDGMENTS

We are grateful to colleagues in the Xu lab for helpful discussions and comments. This work was funded by the Chinese National Basic Research Programs (2013CB911201 and 2015CB910700), the International Collaboration Program (2014DFB30020), the National Natural Science Foundation of China (Grant Nos. 31470809, 31400698, and 31400697), the National High-Tech Research and Development Program of China (2014AA020900 and 2014AA020607), the National Natural Science Foundation of Beijing (Grant No. 5152008), the Beijing Training Project for The Leading Talents in S&T, the National Megaprojects for Key Infectious Diseases (2013ZX10003002 and 2016ZX10003003), the Key Projects in the National Science & Technology Pillar Program (2012BAF14B00), the Unilever 21st Century Toxicity Program (MA-2014-02409), and the Foundation of State Key Lab of Proteomics (SKLPYB201404).



ABBREVIATIONS

MBP, membrane protein; MP, missing protein; eFASP, enhanced filter-aided sample preparation; SDC, sodium deoxycholate; RG, RapiGest; C-HPP, Chromosome-Centric Human Proteome Project; ER, estrogen receptor; PR, progesterone receptor; HER2, human epidermal growth factor receptor-2; high-pH-RPLC, high pH reverse-phase liquid chromatography; DTT, dithiothreitol; IAA, iodoacetamide; DMEM, Dulbecco’s modified Eagle’s medium; TCL, total cell lysate; Super, the supernatant after ultracentrifugation; ISD:SDC, in-solution digestion with SDC; ISD:RG, in-solution digestion with RG; SF-ISD:SDC, eFASP with SDC; SF-ISD:RG, eFASP with RG; AGC, automatic gain control; MIT, maximum injection time


