
Article

Searching Missing Proteins Based on the Optimization of Membrane Protein Enrichment and Digestion Process Mingzhi Zhao, Wei Wei, Long Cheng, Yao Zhang, Feilin Wu, Fuchu He, and Ping Xu J. Proteome Res., Just Accepted Manuscript • DOI: 10.1021/acs.jproteome.6b00389 • Publication Date (Web): 03 Aug 2016 Downloaded from http://pubs.acs.org on August 7, 2016

Just Accepted “Just Accepted” manuscripts have been peer-reviewed and accepted for publication. They are posted online prior to technical editing, formatting for publication and author proofing. The American Chemical Society provides “Just Accepted” as a free service to the research community to expedite the dissemination of scientific material as soon as possible after acceptance. “Just Accepted” manuscripts appear in full in PDF format accompanied by an HTML abstract. “Just Accepted” manuscripts have been fully peer reviewed, but should not be considered the official version of record. They are accessible to all readers and citable by the Digital Object Identifier (DOI®). “Just Accepted” is an optional service offered to authors. Therefore, the “Just Accepted” Web site may not include all articles that will be published in the journal. After a manuscript is technically edited and formatted, it will be removed from the “Just Accepted” Web site and published as an ASAP article. Note that technical editing may introduce minor changes to the manuscript text and/or graphics which could affect content, and all legal disclaimers and ethical guidelines that apply to the journal pertain. ACS cannot be held responsible for errors or consequences arising from the use of information contained in these “Just Accepted” manuscripts.

Journal of Proteome Research is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.


Journal of Proteome Research

Searching Missing Proteins Based on the Optimization of Membrane Protein Enrichment and Digestion Process

Mingzhi Zhao1,#, Wei Wei1,#, Long Cheng6,#, Yao Zhang1,4, Feilin Wu1,2, Fuchu He1,*, Ping Xu1,3,5,*

1 State Key Laboratory of Proteomics, National Engineering Research Center for Protein Drugs, Beijing Proteome Research Center, National Center for Protein Sciences Beijing, Beijing Institute of Radiation Medicine, Beijing 102206, P. R. China

2 Life Science College, Southwest Forestry University, Kunming, 650224, P. R. China

3 Key Laboratory of Combinatorial Biosynthesis and Drug Discovery (Wuhan University), Ministry of Education, and Wuhan University School of Pharmaceutical Sciences, Wuhan, 430071, P. R. China

4 Institute of Microbiology, Chinese Academy of Science, Beijing 100101, China

5 Anhui Medical University, Hefei 230032, Anhui, P. R. China

6 Department of Medical Molecular Biology, Beijing Institute of Biotechnology, 27 Tai-Ping Lu Rd, Beijing 100850, China

#These authors contributed equally to this work.

*Corresponding authors:

Fuchu He, Beijing Proteome Research Center & National Center for Protein Sciences Beijing, 38 Science Park Road, Changping District, Beijing, 102206, P. R. China; Tel and Fax: 86-10-68171208; E-mail: [email protected]

Ping Xu, Beijing Proteome Research Center & National Center for Protein Sciences Beijing, 38 Science Park Road, Changping District, Beijing, 102206, P. R. China; Tel: 8610-61777113; Fax: 8610-61777113; E-mail: [email protected]

ABSTRACT:

ACS Paragon Plus Environment


In this paper, a membrane protein enrichment method combining ultra-centrifugation and detergent-based extraction was first developed using the MCF7 cell line. In-solution digestion with detergents and eFASP (enhanced filter-aided sample preparation) with detergents were then compared with the time-consuming in-gel digestion method. Among the in-solution digestion strategies, in-solution digestion with RapiGest identified 1125 membrane proteins. Similarly, eFASP combined with SDC identified 1069 membrane proteins, whereas in-gel digestion characterized 1091 membrane proteins. In total, across the 5 digestion methods, 1390 membrane proteins were identified with ≥1 unique peptide, among which 1345 were identified with ≥2 unique peptides. This is the largest membrane protein dataset reported for the MCF7 cell line and even for breast cancer tissue samples. Interestingly, we identified 13 unique peptides belonging to 8 missing proteins (MPs). Finally, 8 unique peptides were validated with synthesized peptides; 2 proteins were confirmed as MPs and another 2 proteins were candidate detections.

KEYWORDS: Membrane proteomics, ultra-centrifugation, detergent, digestion, missing proteins

INTRODUCTION

Breast cancer is one of the leading cancers among women worldwide, accounting for 23% of total cancer cases and 14% of cancer deaths every year. Clinically, breast cancers were first divided into two subsets according to oestrogen receptor (ER) expression at the beginning of the 1970s. For a long period, treatment decisions were based solely on clinicopathological variables and on ER, progesterone receptor (PR), and HER2 expression 1. These subsets matured into four accepted subtypes based on gene expression patterns: Luminal A, Luminal B, HER2-enriched, and basal-like breast cancer 2, 3, 4. However, this subtyping cannot perfectly reflect the clinical subtypes, which require more detailed and systematic research. To understand the clinical behavior and histopathological features of this disease, high-throughput platforms such as microarrays, whole-transcriptome sequencing, and proteomics have been applied extensively in recent years 2, 4-7. Genomic technology alone cannot explain the causes of disease, because many genomic variations are not, or are only partially, translated into protein, or do not reflect active cellular functions 8-10. These limitations call for large-scale quantitative proteomics with high confidence in protein identification and quantification to search for the key genes or proteins in cancer 11.

MCF7 is one of the well-established breast cancer cell lines. It was established in 1973 by Soule and colleagues. MCF7 is ER and PR positive, and the anti-estrogen drug tamoxifen can inhibit its growth 12, 13. These properties make it a suitable in vitro model for breast cancer research.

Although membrane proteins (MBPs) account for 20-30 % of all proteins encoded in the human genome and play important roles in various biological processes 14, research on the structure and function of MBPs remains challenging because of their strong hydrophobicity and low abundance. The challenges in membrane proteomics include membrane protein annotation, enrichment, digestion, and downstream signal pathway analysis. Many MBP enrichment methods have been optimized and reported, including cell-surface capture technologies (biotin labeling, cationic colloidal silica particles, etc.) 15, 16, differential and density gradient centrifugation 17, and detergent-based extraction 18.

In-gel digestion is widely employed in shotgun membrane proteomics because SDS can solubilize MBPs completely and can be removed in later steps. SDS cannot be used for in-solution digestion, however, because even a low concentration of SDS can severely impair proteolytic enzyme activity 19-22. Furthermore, in-gel digestion is laborious and suffers from a limited dynamic range or low cleavage yield for some peptides/proteins owing to their large size and/or high hydrophobicity 23, 24. To replace in-gel digestion, FASP and the further optimized eFASP method were developed for proteomic sample preparation 25, 26. Many enzyme- and/or mass spectrometry (MS)-compatible candidate reagents for the in-solution digestion of MBPs have been tested 27, 28. Among these reagents, sodium deoxycholate (SDC) 27, 29 and RapiGest 30 showed not only good solubilizing power and enzyme compatibility but also easy removal in later steps.

In this paper, a complete membrane proteomic workflow, from membrane enrichment to the comparison of different digestion methods, was optimized using the MCF7 cell line. Ultra-centrifugation and detergent-based extraction were combined for MBP preparation, and different digestion strategies were applied and compared. Our MS data showed that in-solution digestion with RapiGest and eFASP with SDC are time-saving and consistent in MBP identification. In addition to yielding the largest MBP dataset for MCF7, the workflow identified 4 potential MPs meeting the HPP guidelines, which were validated using synthesized peptides.

MATERIALS AND METHODS

Materials

RapiGest was purchased from Waters (Milford, MA, USA). Dithiothreitol (DTT), iodoacetamide (IAA), sucrose, tetrasodium ethylene glycol tetraacetic acid (EGTA), sodium orthovanadate, and sodium fluoride were purchased from Amresco (Solon, OH, USA). Unstained protein ladder (BenchMark) was purchased from Life Technologies (Waltham, MA, USA). The 2-D Quant kit was purchased from GE Healthcare (Pittsburgh, PA, USA). Protease inhibitor cocktail was purchased from Roche (Mannheim, Germany). r-Ac-trypsin and Lys-C were recombinantly expressed and purified in our lab 31, 32. Acetonitrile, trifluoroacetic acid (TFA), formic acid (FA), human insulin, and hydrocortisone hemisuccinate were purchased from Sigma (Darmstadt, Germany). Anti-E-cadherin antibody was purchased from Abcam (Cambridge, MA, USA). Anti-β-actin and anti-α-tubulin antibodies were purchased from Santa Cruz Biotechnology (Dallas, TX, USA). The Mem-PER® Kit was purchased from Pierce (Waltham, MA, USA). All other reagents were at least analytical grade, and solvents were HPLC grade.

The Overall Scheme for Membrane Proteomics Study

A flowchart illustrates the MBP enrichment and digestion process (Fig. 1A). Briefly, the MBPs of MCF7 cells were enriched by ultra-centrifugation and detergent-based extraction. Aliquots of the MBPs were then used separately for in-gel, in-solution, and eFASP digestion.

MCF7 Cell Culture

The MCF7 cell line was routinely cultured in Dulbecco's modified Eagle's medium (DMEM, Life Technologies, Waltham, MA, USA) containing 10 % FBS. The cells were first washed with ice-cold PBS containing protease inhibitor cocktail, then scraped in the same buffer and centrifuged at 60×g for 2 min at 4 ℃. The cell pellets were collected and stored at -80 ℃ until use.

MBP Enrichment with Ultra-centrifugation Combined with Detergent-based Extraction

MCF7 cell pellets (3 plates each) were suspended separately in cell lysis buffer A (5 mM Tris, 1 mM EGTA, 1 mM sodium orthovanadate, 2 mM sodium fluoride, and protease inhibitor cocktail) 33 and disrupted on ice with a Soniprep sonicator (Scientz, China). The debris was removed by centrifugation at 9,600×g for 15 min at 4 ℃. The supernatant was centrifuged at 120,000×g for 80 min at 4 ℃ to obtain the crude MBP fraction as a pellet. The pellet was suspended in ice-cold 0.1 M Na2CO3 solution, incubated at 4 ℃ for 1 h on a vertical mixing apparatus (Scientz), and ultracentrifuged again at 120,000×g for 80 min at 4 ℃. The MBPs in the pellet were solubilized with 150 µL of Reagent A supplied in the Mem-PER® kit, and the subsequent steps followed the kit protocol. After removal of the upper phase, 100 µL of the lower phase and the insoluble material were clarified and solubilized with buffer B (20 mM Tris, 20 mM DTT, 4 % SDS, pH 8.0). Aliquots of the MBP fractions (50 µg each) were precipitated with acetone (precooled to -20 ℃) for 1 h at -20 ℃ at a ratio of 9:1 (volume ratio, acetone/protein sample) and then centrifuged at 8,000×g for 15 min at 4 ℃ to collect the pellet. The pellet was washed again with 90 % acetone and centrifuged at 8,000×g for 15 min at 4 ℃ to obtain the final MBP fractions.

The protein concentrations of the cell lysate and the enriched MBP fractions were measured with the 2-D Quant kit. Equal aliquots of the total cell lysate (TCL), the supernatant after ultra-centrifugation (Super), and the solubilized MBPs were run on a modified 12 % SDS-PAGE gel (the SDS concentration in gel preparation was increased from 0.1 % to 0.2 %), followed by Coomassie Brilliant Blue G-250 staining. Equal amounts of protein from each fraction were also run on a 12 % SDS-PAGE gel, transferred to a PVDF membrane, blocked with 5 % BSA, and incubated overnight at 4 ℃ with anti-β-actin antibody (1:5000, v/v), anti-E-cadherin antibody (1:200, v/v), or anti-α-tubulin antibody (1:500, v/v). After three washes with TBS-T buffer, the membrane was incubated with horseradish peroxidase-conjugated secondary antibody (Santa Cruz, Dallas, TX, USA), followed by chemiluminescent detection with ECL detection reagent.
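As a worked illustration of the 9:1 acetone precipitation described above, the following minimal Python sketch computes the acetone volume for a given sample volume and checks that the resulting mixture is 90 % (v/v) acetone, which matches the 90 % acetone used for the wash. The function names and the sample volume in the usage lines are assumptions made for illustration; the text does not state the volume of each 50 µg aliquot.

```python
def acetone_volume_ul(sample_volume_ul: float, ratio: float = 9.0) -> float:
    """Acetone volume (uL) for a given acetone:sample volume ratio (9:1 in the text)."""
    if sample_volume_ul <= 0 or ratio <= 0:
        raise ValueError("sample volume and ratio must be positive")
    return sample_volume_ul * ratio


def acetone_fraction(ratio: float = 9.0) -> float:
    """Final acetone volume fraction of the mixture (volume contraction ignored)."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return ratio / (ratio + 1.0)


if __name__ == "__main__":
    sample_ul = 100.0  # hypothetical aliquot volume, not given in the text
    print(acetone_volume_ul(sample_ul), "uL acetone")
    print(f"{acetone_fraction():.0%} acetone (v/v)")
```

The same helper can be reused for any other acetone-to-sample ratio by changing the `ratio` argument.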


Comparison of MBP Digestion Methods

For in-gel digestion, a 40 µg aliquot of the MBP fraction was first solubilized with buffer B and incubated at 60 ℃ for 30 min; iodoacetamide (IAA) was then added to 20 mM and the sample was incubated at RT in the dark for 30 min. The sample was then run on a 12 % SDS-PAGE gel. After Coomassie staining and destaining, the gel was cut into 10 pieces. In-gel digestion was performed first with Lys-C at a mass ratio of 1:100 overnight at 37 ℃ and then with trypsin (1:50) at 37 ℃ for another 4 h. The digested peptides were taken for LC-MS/MS analysis.

Aliquots of 50 µg of the MBP fraction were used for the in-solution and eFASP digestions. The compositions of the protocol-specific key buffers are described in Table 1. For in-solution digestion with SDC (ISD: SDC), the MBP fraction was mixed with buffer C (50 mM NH4HCO3, 0.5 % SDC, and 5 mM DTT), pipetted up and down several times with a clean pipette, sonicated for 20 min, and heated at 100 ℃ for another 10 min to solubilize the MBP fraction. After solubilization, IAA was added to 5 mM and the sample was incubated for 30 min at RT. 10 % (5 µg) of the solubilized protein was checked by SDS-PAGE. Lys-C was added at a ratio of 1:100 (enzyme to protein, w/w) and the sample was incubated at 37 ℃ overnight; trypsin was then added at 1:50 and digestion continued at 37 ℃ for another 4 h. After digestion, 10 % (5 µg) of the digested sample was taken to visualize digestion efficiency by SDS-PAGE. TFA was added to the digest to a final concentration of 0.5 %, and the sample was incubated at 37 ℃ for 45 min and centrifuged at 13,000 rpm for 10 min to pellet and remove the surfactant. In-solution digestion with RapiGest (ISD: RG) was performed in the same way as ISD: SDC, except that the solubilization buffer D was composed of 50 mM NH4HCO3, 0.1 % RG, and 5 mM DTT.

For eFASP with SDC (SF-ISD: SDC), the MBP pellet (50 µg) was mixed with buffer B and pipetted up and down several times with a clean pipette. The MBP fraction was solubilized by sonication for 20 min and heating at 100 ℃ for another 10 min. After solubilization, IAA was added to 5 mM and the sample was incubated at RT in the dark for 30 min to fully alkylate the free cysteines. After clarification at 9,600×g for 15 min, the sample was transferred to a 3 kD spin filter device (UFC500396, Merck-Millipore) and mixed with 200 µL of digestion buffer E (50 mM NH4HCO3 and 0.5 % SDC). The device was centrifuged at 9,600×g for 15 min at 4 ℃, and this step was repeated three times to exchange the buffer and remove SDS. All subsequent centrifugation steps were performed under the same conditions. After buffer exchange, buffer E was added to bring the sample to 100 µL, and 10 % (5 µg) of the sample was analyzed by SDS-PAGE. Digestion was performed as described for ISD: SDC. eFASP with RapiGest (SF-ISD: RG) was performed in the same way as SF-ISD: SDC, except that the final digestion buffer F was composed of 50 mM NH4HCO3 and 0.1 % RG.

The digested peptides from the 4 in-solution digestions were fractionated separately on an HPLC system (Agilent 1260 Infinity) with a C18 column (Kinetex 2.6 µm C18, 50×4.60 mm, Phenomenex). The mobile phase consisted of buffer A (2 % ACN and 98 % H2O, pH 10.0) and buffer B (98 % ACN and 2 % H2O, pH 10.0). The gradient was run at a flow rate of 0.8 mL/min as follows: 0 % B for 4 min, 0-60 % B in 16 min, 60-100 % B in 0.1 min, 100 % B for 2 min, 100-0 % B in 2 min, and 0 % B for 1 min. Eluted samples were collected every minute from 2 min to 22 min and then pooled into 10 fractions, which were vacuum dried for LC-MS/MS analysis. The fraction pooling scheme is shown in Figure S-1.
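The high-pH gradient listed above can be written down as a piecewise-linear program. The sketch below is only an illustration under that assumption (linear ramps within each step); the names `GRADIENT` and `percent_b` are hypothetical and do not come from the instrument software. It returns the %B at any time point and the total run time implied by the listed steps. The pooling of the 20 one-minute collections into 10 fractions is defined in Figure S-1 and is therefore not reproduced here.

```python
# Each step: (duration in min, %B at start, %B at end), as listed in the text.
GRADIENT = [
    (4.0, 0.0, 0.0),
    (16.0, 0.0, 60.0),
    (0.1, 60.0, 100.0),
    (2.0, 100.0, 100.0),
    (2.0, 100.0, 0.0),
    (1.0, 0.0, 0.0),
]


def total_run_time() -> float:
    """Sum of all step durations in minutes."""
    return sum(duration for duration, _, _ in GRADIENT)


def percent_b(t_min: float) -> float:
    """%B at time t_min, assuming a linear ramp inside each step."""
    if t_min < 0:
        raise ValueError("time must be non-negative")
    start = 0.0
    for duration, b_start, b_end in GRADIENT:
        end = start + duration
        if t_min <= end:
            fraction = (t_min - start) / duration
            return b_start + fraction * (b_end - b_start)
        start = end
    raise ValueError("time is beyond the end of the gradient program")


if __name__ == "__main__":
    for t in (2.0, 12.0, 21.0, 23.1):
        print(f"t = {t:5.1f} min -> {percent_b(t):6.1f} % B")
    print(f"total run time: {total_run_time():.1f} min")
```

Collection from 2 min to 22 min at one-minute intervals gives 20 collected samples, which is consistent with pooling into 10 fractions.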

LC-MS Analysis and Database Searching

Reverse-phase separation of the digested peptides was performed on a nano-UPLC system (nanoAcquity Ultra Performance LC, Waters). Eluted peptides were ionized under high voltage (1.5 kV) and analyzed on an LTQ-Orbitrap Velos (Thermo Scientific) 34. The initial MS spectrum (MS1) was acquired over a mass range of 300-1600 Da at a resolution of 30,000 at m/z 400. The automatic gain control (AGC) was set to 1 × 10^6, and the maximum injection time (MIT) was 150 ms. The subsequent MS spectra (MS2) were acquired in data-dependent mode, with the 20 most intense ions fragmented in the linear ion trap. For each MS2 scan, the AGC was set to 1 × 10^4, and the MIT was 25 ms. The dynamic exclusion was set to 20-40 s to suppress repeated detection of the same ion peaks.

The MS/MS raw files produced by the LTQ-Orbitrap Velos and the control datasets PXD000442 35, PXD000066 36, and PXD002619 7 were processed with Proteome Discoverer 2.0 (v2.0.0.802) against the overlapping entries (20,055 entries, including 2949 MPs) between the Swiss-Prot database (release 2015.12) and the neXtProt database (release 2016.02). Mascot Server (2.3.01) was used as the search engine. The key search parameters were as follows: enzyme specificity was set to trypsin; carbamidomethylation of cysteine was set as a fixed modification and oxidation of methionine as a variable modification; the mass tolerances for peptides and fragment ions were 20 ppm and 0.5 Da, respectively; and up to two missed cleavages were allowed. Only identifications satisfying the following criteria were considered: peptide length ≥7 and FDR ≤ 1 % at both the peptide and protein levels 37. The 245 common contaminant proteins were excluded from the final protein lists (http://www.maxquant.org/contaminants.zip).

Membrane protein prediction was performed as described previously 38. Briefly, three online methods (Phobius: http://phobius.sbc.su.se/, SOSUI: http://bp.nuap.nagoya-u.ac.jp/sosui/, and TMHMM: http://www.cbs.dtu.dk/services/TMHMM/) were used for MBP prediction 38, 39, 40. A protein was annotated as an MBP only if it was predicted by at least two of the three methods. The subcellular location and biological function of the MBPs were annotated with UniProt.
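The acceptance rules above lend themselves to a short sketch. The Python below is a minimal illustration, assuming that the membrane annotation is a simple vote across the three predictors with a threshold of two, and that the 20 ppm tolerance is evaluated as a relative mass error. All function names and the example values are hypothetical; FDR control is handled by the search software and is not modeled here.

```python
PREDICTORS = ("Phobius", "SOSUI", "TMHMM")


def is_membrane_protein(calls: dict, min_votes: int = 2) -> bool:
    """True if at least min_votes of the three predictors call the protein membrane."""
    missing = [name for name in PREDICTORS if name not in calls]
    if missing:
        raise KeyError(f"missing predictor results: {missing}")
    votes = sum(1 for name in PREDICTORS if calls[name])
    return votes >= min_votes


def ppm_error(observed: float, theoretical: float) -> float:
    """Relative mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def passes_filters(peptide: str, observed: float, theoretical: float,
                   tol_ppm: float = 20.0, min_length: int = 7) -> bool:
    """Peptide length >= 7 and precursor error within 20 ppm, as stated in the text."""
    return len(peptide) >= min_length and abs(ppm_error(observed, theoretical)) <= tol_ppm


if __name__ == "__main__":
    # Hypothetical predictor output and masses, for illustration only.
    print(is_membrane_protein({"Phobius": True, "SOSUI": True, "TMHMM": False}))
    print(passes_filters("PEPTIDER", 1000.01, 1000.0))
```

A two-of-three vote tolerates one predictor missing a transmembrane segment while still requiring independent agreement.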

Verification of Missing Protein with Synthesized Peptides

To evaluate the authenticity of the unique peptides, especially those of MPs (PE = 2, 3, 4) in this study, spectra were manually inspected by examining base peak intensity and b/y ion matching with the aid of the pFind and pBuild software 41, 42. To consider alternative explanations for the PSMs that passed the manual check, an open search was performed, and spectra whose open-search results conflicted with the previous identification were filtered out. Relatively high-quality peptides (unique peptides ≥2, peptide length ≥9 AA) were selected and chemically synthesized by BankPeptide (Hefei, China) for further analysis. The synthesized peptides were solubilized and analyzed with the same LC-MS/MS parameters described above. Both m/z and ion intensity were used to calculate a cosine similarity score, using a formula similar to that described previously 34. Only peptides with a similarity score higher than 0.9 were considered very high confidence peptides. Because isobaric substitutions might change the mapping of a peptide from an MP to a commonly observed protein, isobaric filtering was performed by evaluating the substitutions I=L, Q[Deamidated]=E, and GG=N against the protein database.
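To make the two checks above concrete, the following Python sketch shows one possible way to compute a cosine similarity between two MS2 spectra and to screen a peptide for isobaric alternatives. It is not the formula of reference 34, which is not reproduced in this paper: the fixed-width m/z binning, the 0.5 bin width, and the function names are assumptions. The isobaric key maps I to L, Q to E, and N to GG for every residue, so it is deliberately over-inclusive and may flag matches that a residue-level analysis would reject.

```python
import math
from collections import defaultdict


def bin_spectrum(peaks, bin_width=0.5):
    """Sum intensities of (m/z, intensity) peaks into fixed-width m/z bins."""
    binned = defaultdict(float)
    for mz, intensity in peaks:
        binned[round(mz / bin_width)] += intensity
    return binned


def cosine_similarity(peaks_a, peaks_b, bin_width=0.5):
    """Cosine of the angle between two binned intensity vectors (0 to 1)."""
    a = bin_spectrum(peaks_a, bin_width)
    b = bin_spectrum(peaks_b, bin_width)
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def isobaric_key(sequence):
    """Collapse isobaric alternatives: I=L, Q[Deamidated]=E, GG=N."""
    return sequence.replace("I", "L").replace("Q", "E").replace("N", "GG")


def isobaric_hits(peptide, proteins):
    """Accessions whose sequence contains the peptide up to isobaric substitution."""
    key = isobaric_key(peptide)
    return [acc for acc, seq in proteins.items() if key in isobaric_key(seq)]


if __name__ == "__main__":
    # Hypothetical peaks and sequences, for illustration only.
    digested = [(175.1, 100.0), (300.2, 50.0)]
    synthetic = [(175.1, 90.0), (300.2, 55.0)]
    print(round(cosine_similarity(digested, synthetic), 3))
    print(isobaric_hits("AVLK", {"P1": "MAVIKR", "P2": "MAVAKR"}))
```

A peptide would be kept as evidence for an MP only if the spectral score exceeds 0.9 and the isobaric screen returns no commonly observed protein.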

RESULTS AND DISCUSSION

Ultra-centrifugation and Detergent-based Extraction Method Acquired MBP Fraction with High Purity

The mass of the MBP fraction extracted by the two enrichment methods showed a similar yield (about 13 % of the TCL proteins based on the 2-D Quant method, data not shown). The MBP fraction obtained by ultra-centrifugation followed by detergent-based extraction showed a different profile from the TCL and the supernatant after ultra-centrifugation (Fig. 1B). Antibodies against E-cadherin and against α-tubulin and β-actin were used as representative markers of membrane and cytoplasmic proteins, respectively. As expected, the relative intensity of E-cadherin in the MBP fraction was much higher than in the supernatant, while the relative intensities of α-tubulin and β-actin in the MBP fraction were much lower than in the supernatant (Fig. 1C). These two results showed that the MBPs were specifically enriched with high purity.

RapiGest and SDC Exhibit Different Solubilizing Abilities and Enzyme Compatibility

As shown in Fig. 1D, only part of the MBP fraction was solubilized by SDC (lane 3, Fig. 1D). RapiGest worked well in both the solubilization and digestion steps (lanes 4 and 8, Fig. 1D). In the eFASP method, in which the proteins were solubilized with SDS before being exchanged into SDC or RG by three rounds of ultra-filtration, the proteins were well solubilized (lanes 5 and 6, Fig. 1D). The proteins in the SF-ISD: SDC method were digested completely (lane 9, Fig. 1D). However, poorer digestion efficiency was found in SF-ISD: RG (lane 10, Fig. 1D). These results suggest that SDS might not be completely replaced by RapiGest in the SF-ISD: RG method.

The detailed database searching criteria and results are summarized in Supplementary Table 2. The proteins identified with more than 1 unique peptide are summarized in Table 2 and Fig. 2A. The number of proteins identified from MCF7 cells varied from 1201 to 3505 across methods, of which 309 to 1125 were MBPs. In-gel digestion yielded 1091 MBPs (36.6 % of the total identified proteins). The ISD: RG and SF-ISD: SDC methods identified 1125 and 1069 MBPs, respectively. With the SF-ISD: RG protocol, the MBP fraction could not be digested completely, resulting in the smallest number of identified proteins among the five digestion methods. Therefore, this group of data was not used for further analysis. At the peptide level, in-gel digestion characterized more peptides (6.3 peptides per protein on average) than the other three methods (Table 2). At the protein level, the intensity distribution comparison showed that in-gel digestion gave the highest protein intensity when starting from the same amount of proteome sample (Fig. 2B). Among the two in-solution digestions and eFASP with SDC, ISD: RG gave the highest protein intensity.
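The MBP proportion quoted above can be checked with simple arithmetic. The short Python sketch below uses only the counts for which this paper states both the MBP number and the total number of identified proteins (the totals are given in the Conclusion); the dictionary and function names are hypothetical.

```python
# (MBPs, total identified proteins) as reported in this paper.
IDENTIFICATIONS = {
    "in-gel": (1091, 2984),
    "SF-ISD: SDC": (1069, 3194),
}


def mbp_percent(n_mbp: int, n_total: int) -> float:
    """Percentage of identified proteins that are annotated as MBPs."""
    if n_total <= 0 or not 0 <= n_mbp <= n_total:
        raise ValueError("counts must satisfy 0 <= n_mbp <= n_total and n_total > 0")
    return 100.0 * n_mbp / n_total


if __name__ == "__main__":
    for method, (n_mbp, n_total) in IDENTIFICATIONS.items():
        print(f"{method}: {mbp_percent(n_mbp, n_total):.1f} % MBPs")
```

For in-gel digestion this reproduces the 36.6 % stated in the text.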

ISD: RG and SF-ISD: SDC Methods Identify MBPs as Faithfully as In-gel Digestion

ISD: RG and SF-ISD: SDC characterized more proteins than in-gel digestion (Table 2, Fig. 2A & 2C). The proteins, and especially the MBPs, identified by ISD: RG and SF-ISD: SDC overlapped to a large extent with those from in-gel digestion (Fig. 2D). The molecular weight (MW) distributions of the MBPs identified by ISD: RG, SF-ISD: SDC, and in-gel digestion were also comparable. The MW distribution analysis showed that most MBPs had MWs between 20 and 60 kD (Fig. 3A). The transmembrane domain prediction showed that the majority of MBPs had one transmembrane domain, although some proteins had ≥10 TMHMM-predicted transmembrane domains (Fig. 3B). In the subcellular location prediction of the identified MBPs, the top three locations were plasma membrane, endoplasmic reticulum, and mitochondrion (Fig. 3C). In the biological function prediction, the top three functions were binding proteins, enzymes, and receptors (Fig. 3D). When the numbers in each group were compared, the MBP numbers of the ISD: RG and SF-ISD: SDC methods, especially ISD: RG, were comparable with those of in-gel digestion.


Largest Membrane Proteome Dataset of MCF7 in Breast Cancer Study

In our dataset, 1125 MBPs were identified with the ISD: RG method. In total, 1390 MBPs (unique peptides ≥1) were identified, among which 1345 MBPs were identified with ≥2 unique peptides. This is larger than all previously published membrane proteomic datasets 36, 35, 7. When the identified proteins were compared with those three datasets, the proteins identified in this study included a large proportion of uniquely identified proteins (Fig. 4A). Interestingly, our data had the largest overlap with Muraoka's dataset (Fig. 4B) 36. However, the other two datasets contained only a small portion of shared MBPs. This result implies that regular proteomic protocols have difficulty identifying MBPs efficiently. The new method presented here might be helpful for characterizing the membrane proteome in breast cancer.
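A dataset comparison of this kind reduces to set operations on protein accessions. The Python sketch below is a generic illustration, not the procedure used to draw Fig. 4: the function name is hypothetical, the accessions in the usage lines are placeholders rather than identifications from any dataset, and the Jaccard index is an added summary metric that the paper does not report.

```python
def compare_datasets(ours: set, other: set) -> dict:
    """Counts of shared and dataset-specific accessions, plus the Jaccard index."""
    shared = ours & other
    union = ours | other
    return {
        "shared": len(shared),
        "only_ours": len(ours - other),
        "only_other": len(other - ours),
        "jaccard": len(shared) / len(union) if union else 0.0,
    }


if __name__ == "__main__":
    # Placeholder accessions, for illustration only.
    this_study = {"ACC_A", "ACC_B", "ACC_C"}
    published = {"ACC_B", "ACC_C", "ACC_D", "ACC_E"}
    print(compare_datasets(this_study, published))
```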

Four Candidate MBPs Belong to MPs

Searching for missing proteins (MPs) is an important mission of the C-HPP 43. According to the newly released neXtProt database, 2949 protein products have not yet been identified 44, 45. Among the proteins identified in our dataset, 13 unique peptides belonged to 9 MPs (Table 4). After manual checking, eight high-quality peptides from 4 MPs (Q6UWH6, Q8IZD6, O75474, and Q3SY17/Q9H1U9) were compared with synthesized peptides and showed consistent MS2 spectra. All eight peptides had high similarity (score ≥0.9) to the MS2 spectra of the synthesized peptides, based on matching of the b1+, y1+, b2+, and y2+ ions and of the peak intensity pattern. The spectra for the O75474 peptide AVAAVAATGPASAPGPGGGR are shown in Fig. 5B. The other MS2 spectra are shown in Figures S-2 to S-9. Among the identified MPs, P0DMR1, P0C5Z0, Q12999, and Q7Z769 were each identified with only 1 unique peptide (Table 4). Among these proteins, Q6UWH6 and Q8IZD6 meet the guideline requirements and were considered MPs. Q3SY17 and Q9H1U9 belong to the solute carrier family and share high sequence similarity (95.96 %); the two identified unique peptides (non-PE1 peptides) cannot distinguish between the two proteins, so they were considered candidate detections. O75474 was also considered a candidate detection because, of its two identified peptides (AVAAVAATGPASAPGPGGGR and LLQQLVLSGNLIK), LLQQLVLSGNLIK is shared with Q92837. Since the MPs have no or insufficient protein-level evidence, functional analysis by GO or UniProt provided little useful information. For example, O75474 is predicted to be involved in the Wnt signaling pathway, while Q3SY17, Q9H1U9, and Q8IZD6 are predicted to be involved in transport. P0DMR1 is predicted to be involved in nucleotide binding, and Q6UWH6 is predicted to be involved in ER-to-Golgi vesicle-mediated transport.

CONCLUSION

We optimized a method for MBP enrichment by ultra-centrifugation followed by detergent-based extraction and MS analysis. A total of 1091 MBPs were characterized among the 2984 proteins identified with the in-gel digestion method; 1125 MBPs were identified with the ISD: RG method; and 1069 MBPs were characterized among the 3194 proteins identified with the SF-ISD: SDC method. Further evaluation showed that the ISD: RG and SF-ISD: SDC methods are as faithful as in-gel digestion while being time-saving and easy to operate. The number of MBPs in our dataset (unique peptides ≥2) is the largest when compared with other datasets of MCF7 and breast cancer tissues. After strict filtering, 8 unique peptides belonging to 4 MPs were selected as highly credible by spectral quality checking and synthesized peptide matching. Furthermore, Q6UWH6 and Q8IZD6 are identified MPs, while O75474 and Q3SY17/Q9H1U9 are candidate detections. These results suggest that MBP enrichment-based membrane proteomics can be used in breast cancer research, especially for studying molecular mechanisms involving MBPs.

ASSOCIATED CONTENT

Supporting Information

The Supporting Information is available free of charge on the ACS Publications website at DOI:

Figure S-1. High-pH RPLC separation of the in-solution digestion peptide samples; the sample pooling method is shown for the ISD: RG condition, and the same pooling method was used for the other three conditions.

Figures S-2 to S-9. MS2 spectra, from both digested and synthesized peptides, of the peptide AVAAVAATGPASAPGPGGGR of protein O75474, the peptide LLQQLVLSGNLIK of protein O75474, the peptide LQSQIGGEFQSFPK of protein Q3SY17 or Q9H1U9, the peptide NGLSNVLFFGLR of protein Q3SY17 or Q9H1U9, the peptide LGILVVFSFIKEAILPSR of protein Q6UWH6, the peptide RLGILVVFSFIKEAILPSR of protein Q6UWH6, the peptide ETGSFLDLFR of protein Q8IZD6, and the peptide LSEAEEALYLIAK of protein Q8IZD6, respectively.

* In Figures S-2 to S-9, the upper panel is the MS2 spectrum of the digested peptide, the middle panel is the MS2 spectrum of the synthesized peptide, and the lower panel is the MS2 spectrum match result.

Supplementary Table 1. Detailed information on the proteins identified on the Velos for the four methods.

Supplementary Table 2. The searching criteria and results.

Notes

The authors declare no competing financial interest.

ACKNOWLEDGMENT

The authors are grateful to colleagues in the Xu lab for helpful discussions and comments. This work was funded by the Chinese National Basic Research Programs (2013CB911201 & 2015CB910700), the International Collaboration Program (2014DFB30020), the National Natural Science Foundation of China (Grant No. 31470809, 31400698 & 31400697), the National High-Tech Research and Development Program of China (2014AA020900 & 2014AA020607), the National Natural Science Foundation of Beijing (Grant No. 5152008), the Beijing Training Project for the Leading Talents in S&T, the National Megaprojects for Key Infectious Diseases (2013zx10003002 & 2016ZX10003003), the Key Projects in the National Science & Technology Pillar Program (2012BAF14B00), the Unilever 21st Century Toxicity Program (MA-2014-02409), and the Foundation of the State Key Lab of Proteomics (SKLPYB201404).

ABBREVIATIONS
MBP, membrane protein; MP, missing protein; eFASP, enhanced filter-aided sample preparation; SDC, sodium deoxycholate; C-HPP, Chromosome-Centric Human Proteome Project; ER, oestrogen receptor; PR, progesterone receptor; HER2, human epidermal growth factor receptor-2; high-pH-RPLC, high-pH reverse-phase liquid chromatography; DTT, dithiothreitol; IAA, iodoacetamide; DMEM, Dulbecco's modified Eagle's medium; TCL, total cell lysate; Super, the supernatant after ultra-centrifugation; ISD:SDC, in-solution digestion with SDC; ISD:RG, in-solution digestion with RapiGest; SF-ISD:SDC, eFASP with SDC; SF-ISD:RG, eFASP with RapiGest; AGC, automatic gain control; MIT, maximum injection time.

REFERENCES
1. Sotiriou, C.; Wirapati, P.; Loi, S.; Harris, A.; Fox, S.; Smeds, J.; Nordgren, H.; Farmer, P.; Praz, V.; Haibe-Kains, B., Gene expression profiling in breast cancer: understanding the molecular basis of histologic grade to improve prognosis. Journal of the National Cancer Institute 2006, 98, (4), 262-272.
2. Perou, C. M.; Sørlie, T.; Eisen, M. B.; van de Rijn, M.; Jeffrey, S. S.; Rees, C. A.; Pollack, J. R.; Ross, D. T.; Johnsen, H.; Akslen, L. A., Molecular portraits of human breast tumours. Nature 2000, 406, (6797), 747-752.
3. Sørlie, T.; Tibshirani, R.; Parker, J.; Hastie, T.; Marron, J.; Nobel, A.; Deng, S.; Johnsen, H.; Pesich, R.; Geisler, S., Repeated observation of breast tumor subtypes in independent gene expression data sets. PNAS 2003, 100, (14).
4. Network, C. G. A., Comprehensive molecular portraits of human breast tumours. Nature 2012, 490, (7418), 61-70.
5. Gruvberger, S.; Ringnér, M.; Chen, Y.; Panavally, S.; Saal, L. H.; Borg, Å.; Fernö, M.; Peterson, C.; Meltzer, P. S., Estrogen receptor status in breast cancer is associated with remarkably distinct gene expression patterns. Cancer research 2001, 61, (16), 5979-5984.
6. Sørlie, T.; Perou, C. M.; Tibshirani, R.; Aas, T.; Geisler, S.; Johnsen, H.; Hastie, T.; Eisen, M. B.; van de Rijn, M.; Jeffrey, S. S., Gene expression patterns of breast carcinomas distinguish tumor subclasses with clinical implications. Proceedings of the National Academy of Sciences 2001, 98, (19), 10869-10874.
7. Tyanova, S.; Albrechtsen, R.; Kronqvist, P.; Cox, J.; Mann, M.; Geiger, T., Proteomic maps of breast cancer subtypes. Nature communications 2016, 7.
8. Geiger, T.; Cox, J.; Mann, M., Proteomic changes resulting from gene copy number variations in cancer cells. PLoS Genet 2010, 6, (9), e1001090.
9. Zhang, B.; Wang, J.; Wang, X.; Zhu, J.; Liu, Q.; Shi, Z.; Chambers, M. C.; Zimmerman, L. J.; Shaddox, K. F.; Kim, S., Proteogenomic characterization of human colon and rectal cancer. Nature 2014, 513, (7518), 382-387.
10. Nagaraj, N.; Wisniewski, J. R.; Geiger, T.; Cox, J.; Kircher, M.; Kelso, J.; Pääbo, S.; Mann, M., Deep proteome and transcriptome mapping of a human cancer cell line. Molecular systems biology 2011, 7, (1).
11. Mann, M.; Kulak, N. A.; Nagaraj, N.; Cox, J., The coming age of complete, accurate, and ubiquitous proteomes. Molecular cell 2013, 49, (4), 583-590.
12. Soule, H.; Vazquez, J.; Long, A.; Albert, S.; Brennan, M., A human cell line from a pleural effusion derived from a breast carcinoma. Journal of the National Cancer Institute 1973, 51, (5), 1409-1416.
13. Shirazi, F. H.; Zarghi, A.; Ashtarinezhad, A.; Kobarfard, F.; Nakhjavani, M.; Anjidani, N.; Zendehdel, R.; Arfaiee, S.; Shoeibi, S.; Mohebi, S., Remarks in Successful Cellular Investigations for Fighting Breast Cancer Using Novel Synthetic Compounds. INTECH Open Access Publisher: 2011.
14. Luckey, M., Membrane structural biology: with biochemical and biophysical foundations. Cambridge University Press: 2014.
15. Solis, N.; Cordwell, S. J., Current methodologies for proteomics of bacterial surface-exposed and cell envelope proteins. Proteomics 2011, 11, (15), 3169-3189.
16. Elschenbroich, S.; Kim, Y.; Medin, J. A.; Kislinger, T., Isolation of cell surface proteins for mass spectrometry-based proteomics. Expert review of proteomics 2010, 7, (1), 141-154.
17. DePierre, J. W.; Karnovsky, M. L., Plasma membranes of mammalian cells: a review of methods for their characterization and isolation. The Journal of cell biology 1973, 56, (2), 275.
18. Seddon, A. M.; Curnow, P.; Booth, P. J., Membrane proteins, lipids and detergents: not just a soap opera. Biochimica et Biophysica Acta (BBA)-Biomembranes 2004, 1666, (1), 105-117.
19. Lu, B.; McClatchy, D. B.; Kim, J. Y.; Yates, J. R., Strategies for shotgun identification of integral membrane proteins by tandem mass spectrometry. Proteomics 2008, 8, (19), 3947-3955.
20. Zhang, H.; Lin, Q.; Ponnusamy, S.; Kothandaraman, N.; Lim, T. K.; Zhao, C.; Kit, H. S.; Arijit, B.; Rauff, M.; Hew, C. L., Differential recovery of membrane proteins after extraction by aqueous methanol and trifluoroethanol. Proteomics 2007, 7, (10), 1654-1663.
21. Shevchenko, A.; Wilm, M.; Vorm, O.; Mann, M., Mass spectrometric sequencing of proteins from silver-stained polyacrylamide gels. Analytical chemistry 1996, 68, (5), 850-858.
22. Vuckovic, D.; Dagley, L. F.; Purcell, A. W.; Emili, A., Membrane proteomics by high performance liquid chromatography–tandem mass spectrometry: Analytical approaches and challenges. Proteomics 2013, 13, (3-4), 404-423.
23. Rabilloud, T.; Vaezzadeh, A. R.; Potier, N.; Lelong, C.; Leize-Wagner, E.; Chevallet, M., Power and limitations of electrophoretic separations in proteomics strategies. Mass spectrometry reviews 2009, 28, (5), 816-843.
24. Wisniewski, J. R.; Ostasiewicz, P.; Mann, M., High recovery FASP applied to the proteomic analysis of microdissected formalin fixed paraffin embedded cancer tissues retrieves known colon cancer markers. Journal of proteome research 2011, 10, (7), 3040-3049.
25. Raimondo, F.; Corbetta, S.; Savoia, A.; Chinello, C.; Cazzaniga, M.; Rocco, F.; Bosari, S.; Grasso, M.; Bovo, G.; Magni, F.; Pitto, M., Comparative membrane proteomics: a technical advancement in the search of renal cell carcinoma biomarkers. Molecular BioSystems 2015, 11, (6), 1708-1716.
26. Wisniewski, J. R.; Zougman, A.; Nagaraj, N.; Mann, M., Universal sample preparation method for proteome analysis. Nat Methods 2009, 6, (5), 359-62.
27. Lin, Y.; Zhou, J.; Bi, D.; Chen, P.; Wang, X.; Liang, S., Sodium-deoxycholate-assisted tryptic digestion and identification of proteolytically resistant proteins. Analytical biochemistry 2008, 377, (2), 259-266.
28. Kadiyala, C.; Tomechko, S. E.; Miyagi, M., Perfluorooctanoic acid for shotgun proteomics. PLoS One 2010, 5, (12), e15332.
29. Lin, Y.; Liu, Y.; Li, J.; Zhao, Y.; He, Q.; Han, W.; Chen, P.; Wang, X.; Liang, S., Evaluation and optimization of removal of an acid-insoluble surfactant for shotgun analysis of membrane proteome. Electrophoresis 2010, 31, (16), 2705-2713.
30. Yu, Y.-Q.; Gilar, M.; Lee, P. J.; Bouvier, E. S.; Gebler, J. C., Enzyme-friendly, mass spectrometry-compatible surfactant for in-solution enzymatic digestion of proteins. Analytical chemistry 2003, 75, (21), 6023-6028.
31. Zhao, M.; Wu, F.; Xu, P., Development of a rapid high-efficiency scalable process for acetylated Sus scrofa cationic trypsin production from Escherichia coli inclusion bodies. Protein expression and purification 2015.
32. Zhao, M.; Cai, M.; Wu, F.; Zhang, Y.; Xiong, Z.; Xu, P., Recombinant expression, refolding, purification and characterization of Pseudomonas aeruginosa protease IV in Escherichia coli. Protein Expr Purif 2016, 126, 69-76.
33. Peng, L.; Kapp, E. A.; Fenyö, D.; Kwon, M. S.; Jiang, P.; Wu, S.; Jiang, Y.; Aguilar, M. I.; Ahmed, N.; Baker, M. S., The Asia Oceania Human Proteome Organisation Membrane Proteomics Initiative. Preparation and characterisation of the carbonate-washed membrane standard. Proteomics 2010, 10, (22), 4142-4148.
34. Su, N.; Zhang, C.; Zhang, Y.; Wang, Z.; Fan, F.; Zhao, M.; Wu, F.; Gao, Y.; Li, Y.; Chen, L.; Tian, M.; Zhang, T.; Wen, B.; Sensang, N.; Xiong, Z.; Wu, S.; Liu, S.; Yang, P.; Zhen, B.; Zhu, Y.; He, F.; Xu, P., Special Enrichment Strategies Greatly Increase the Efficiency of Missing Proteins Identification from Regular Proteome Samples. J Proteome Res 2015, 14, (9), 3680-92.
35. Segura, V.; Medina-Aunon, J. A.; Mora, M. I.; Martínez-Bartolomé, S.; Abian, J.; Aloria, K.; Antúnez, O.; Arizmendi, J. M.; Azkargorta, M.; Barceló-Batllori, S., Surfing transcriptomic landscapes. A step beyond the annotation of chromosome 16 proteome. Journal of proteome research 2013, 13, (1), 158-172.
36. Muraoka, S.; Kume, H.; Adachi, J.; Shiromizu, T.; Watanabe, S.; Masuda, T.; Ishihama, Y.; Tomonaga, T., In-depth membrane proteomic study of breast cancer tissues for the generation of a chromosome-based protein list. Journal of proteome research 2012, 12, (1), 208-213.
37. Michalski, A.; Damoc, E.; Hauschild, J.-P.; Lange, O.; Wieghaus, A.; Makarov, A.; Nagaraj, N.; Cox, J.; Mann, M.; Horning, S., Mass spectrometry-based proteomics using Q Exactive, a high-performance benchtop quadrupole Orbitrap mass spectrometer. Molecular & Cellular Proteomics 2011, 10, (9), M111.011015.
38. Almén, M. S.; Nordström, K. J.; Fredriksson, R.; Schiöth, H. B., Mapping the human membrane proteome: a majority of the human membrane proteins can be classified according to function and evolutionary origin. BMC biology 2009, 7, (1), 50.
39. Käll, L.; Krogh, A.; Sonnhammer, E. L., A combined transmembrane topology and signal peptide prediction method. Journal of molecular biology 2004, 338, (5), 1027-1036.
40. Krogh, A.; Larsson, B.; Von Heijne, G.; Sonnhammer, E. L., Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes. Journal of molecular biology 2001, 305, (3), 567-580.
41. Wang, L. h.; Li, D. Q.; Fu, Y.; Wang, H. P.; Zhang, J. F.; Yuan, Z. F.; Sun, R. X.; Zeng, R.; He, S. M.; Gao, W., pFind 2.0: a software package for peptide and protein identification via tandem mass spectrometry. Rapid Communications in Mass Spectrometry 2007, 21, (18), 2985-2991.
42. Fu, Y.; Yang, Q.; Sun, R.; Li, D.; Zeng, R.; Ling, C. X.; Gao, W., Exploiting the kernel trick to correlate fragment ions for peptide identification via tandem mass spectrometry. Bioinformatics 2004, 20, (12), 1948-1954.
43. Gaudet, P.; Argoud-Puy, G.; Cusin, I.; Duek, P.; Evalet, O.; Gateau, A.; Gleizes, A.; Pereira, M.; Zahn-Zabal, M.; Zwahlen, C.; Bairoch, A.; Lane, L., neXtProt: organizing protein knowledge in the context of human proteome projects. J Proteome Res 2013, 12, (1), 293-8.
44. Su, N.; Zhang, C.; Zhang, Y.; Wang, Z.; Fan, F.; Zhao, M.; Wu, F.; Gao, Y.; Li, Y.; Chen, L., Special Enrichment Strategies Greatly Increase the Efficiency of Missing Proteins Identification from Regular Proteome Samples. Journal of proteome research 2015.
45. Zhang, Y.; Li, Q.; Wu, F.; Zhou, R.; Qi, Y.; Su, N.; Chen, L.; Xu, S.; Jiang, T.; Zhang, C., The Tissue-Based Proteogenomics Reveals that Human Testis Endows Plentiful Missing Proteins. Journal of proteome research 2015.

Figure Legends
Abstract figure. The MCF7 cell line was used for the systematic evaluation of MBP enrichment by ultra-centrifugation and detergent-based extraction. In-solution digestions, eFASP with RapiGest and SDC, and in-gel digestion were compared systematically. The MBPs identified in this research were compared between different digestion methods and with previous research. The MPs identified in this research were validated by synthesized peptides.
Figure 1. Overview of the methodology and rough evaluation of MBP enrichment, solubilization, and digestion. (A) Overview of the methodology for MBP enrichment and comparison of digestion methods. (B) The TCL, Super, and MBP fractions obtained by ultra-centrifugation followed by detergent-based extraction, shown on an SDS-PAGE gel. (C) Western blot of the same samples as in (B) with antibodies against E-cadherin, β-actin, and α-tubulin. (D) Aliquots (5 µg) of the MBP fraction from the solubilization and in-solution digestion steps, visualized on an SDS-PAGE gel.
Figure 2. Comparison of in-solution and in-gel digestions based on the LC-MS/MS results. (A) Comparison of identified MBPs and non-MBPs for the five digestion methods. (B) Intensity distribution of the proteins identified with four digestion methods (not including SF-ISD:RG). (C) Venn diagram comparison of the proteins characterized with ISD:SDC, ISD:RG, SF-ISD:SDC, and in-gel digestion. (D) Venn diagram comparison of the MBPs characterized with ISD:SDC, ISD:RG, SF-ISD:SDC, and in-gel digestion.
Figure 3. Distribution comparison of the MBPs identified with in-solution and in-gel digestions based on the LC-MS/MS results. (A) MW distribution of the identified MBPs. (B) TMHMM distribution of the identified MBPs. (C) Subcellular location distribution of the identified MBPs. (D) Biological function distribution of the identified MBPs.
Figure 4. Comparison of the identified proteins and MBPs with previous studies. (A) Venn diagram comparison of the proteins identified in this research with previous studies. (B) Venn diagram comparison of the MBPs identified in this research with previous studies.
Figure 5. Validation of the identified peptide of O75474 using the synthesized peptide method. The peptide AVAAVAATGPASAPGPGGGR, which represents O75474, from the MCF7 cell digest was compared with the synthesized peptide.
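The Venn diagram comparisons described in the legends of Figures 2 and 4 amount to partitioning protein identifiers by the exact combination of methods (or datasets) in which they were found. A minimal Python sketch of that bookkeeping is given below. The identifiers P1 to P5 and their assignment to methods are hypothetical and serve only to illustrate the set logic; they are not results from this study.

```python
def venn_regions(groups):
    """Partition items by the exact combination of groups that contain them.

    groups: {group name: set of identifiers}
    returns: {tuple of group names (sorted): set of identifiers found in
              exactly those groups and no others}
    """
    names = sorted(groups)
    regions = {}
    for item in set().union(*groups.values()):
        key = tuple(name for name in names if item in groups[name])
        regions.setdefault(key, set()).add(item)
    return regions


# Hypothetical identifications per digestion method (illustration only).
example = {
    "ISD:SDC": {"P1", "P2", "P3"},
    "ISD:RG": {"P2", "P3", "P4"},
    "In-gel": {"P3", "P5"},
}

regions = venn_regions(example)
for key in sorted(regions, key=lambda k: (len(k), k)):
    print(" & ".join(key), "->", sorted(regions[key]))
```

Each key corresponds to one region of the Venn diagram, so the region sizes can be read off with `len(regions[key])`.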

Table 1. Overview of the key buffer conditions for in-solution and in-gel digestions
Methods compared: ISD:SDC, ISD:RG, SF-ISD:SDC, SF-ISD:RG, In-Gel
Protocol step | Conditions
Solubilization | 0.5 % SDC, 5 mM DTT; 0.1 % RapiGest, 5 mM DTT; 4 % SDS, 5 mM DTT
Cysteine alkylation | 15 mM IAA
SDS removal | 0.5 % SDC; 0.1 % RapiGest
Digestion condition | 0.5 % SDC; 0.1 % RapiGest; gel pieces
Digestion enzymes | Lys-C/Trypsin
Detergent removal | 0.5 % TFA
Fraction method | High pH-RPLC





Table 2. Proteomic comparison of the in-solution digestions and in-gel digestion
Digestion method | Protein type | Unique peptides | Proteins | Peptides/protein | MBP (%)
In-gel | MBP | 7361 | 1091 | 6.7 | 36.6
In-gel | Total proteins | 18707 | 2984 | 6.3 |
ISD:SDC | MBP | 2364 | 720 | 3.3 | 25.6
ISD:SDC | Total proteins | 10393 | 2808 | 3.7 |
ISD:RG | MBP | 5396 | 1125 | 4.8 | 32.1
ISD:RG | Total proteins | 16627 | 3505 | 4.7 |
SF-ISD:SDC | MBP | 4542 | 1069 | 4.2 | 33.5
SF-ISD:SDC | Total proteins | 13185 | 3194 | 4.1 |
SF-ISD:RG | MBP | 745 | 309 | 2.4 | 25.7
SF-ISD:RG | Total proteins | 3149 | 1201 | 2.6 |

Table 3. Comparison of the number of identified MBPs in our dataset with previous studies
Study | PXD No. | Total proteins (unique peptides ≥ 1) | MBPs (unique peptides ≥ 1) | Total proteins (unique peptides ≥ 2) | MBPs (unique peptides ≥ 2)
Segura (35) | PXD000442 | 2366 | 163 | 2303 | 160
Muraoka (36) | PXD000066 | 6093 | 1824 | 4484 | 1333
Tyanova (7) | PXD002619 | 5161 | 779 | 5008 | 744
Zhao | PXD004131 | 4683 | 1390 | 4542 | 1345
Sample types listed: Breast cancer; MCF7 cell; MCF7 cell; MammaPrint breast cancer
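The derived columns of Table 2 (peptides per protein and the MBP percentage) follow directly from the reported counts. The short Python sketch below recomputes them as a consistency check. The counts are copied from Table 2; the dictionary layout and function name are illustrative choices, and the assumption that the percentage is the number of MBPs divided by the total number of identified proteins is inferred from the reported values rather than stated in the text.

```python
# (unique peptides, proteins) for MBPs and for all proteins, from Table 2.
TABLE2 = {
    "In-gel": {"mbp": (7361, 1091), "total": (18707, 2984)},
    "ISD:SDC": {"mbp": (2364, 720), "total": (10393, 2808)},
    "ISD:RG": {"mbp": (5396, 1125), "total": (16627, 3505)},
    "SF-ISD:SDC": {"mbp": (4542, 1069), "total": (13185, 3194)},
    "SF-ISD:RG": {"mbp": (745, 309), "total": (3149, 1201)},
}


def summarize(rows):
    """Recompute peptides/protein and MBP percentage for each method."""
    summary = {}
    for method, counts in rows.items():
        mbp_peptides, mbp_proteins = counts["mbp"]
        total_peptides, total_proteins = counts["total"]
        summary[method] = {
            # Assumed definition: MBPs as a share of all identified proteins.
            "mbp_percent": round(100 * mbp_proteins / total_proteins, 1),
            "mbp_peptides_per_protein": round(mbp_peptides / mbp_proteins, 1),
            "total_peptides_per_protein": round(total_peptides / total_proteins, 1),
        }
    return summary


summary = summarize(TABLE2)
for method, values in summary.items():
    print(method, values)
```

Under this assumed definition, the recomputed values agree with the rounded figures reported in Table 2 for all five methods.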

Table 4. The identified MPs in this research
Raw file | Peptide | Score | DeltaM [ppm] | Protein
MCF7_inGel_01 | LGILVVFSFIKEAILPSR | 53.73 | 9.32 | Q6UWH6
MCF7_inGel_01 | RLGILVVFSFIKEAILPSR | 69.17 | 5.75 | Q6UWH6
MCF7_ISD_RG_05 | ETGSFLDLFR | 41.08 | -0.48 | Q8IZD6
MCF7_ISD_RG_03 | LSEAEEALYLIAK | 78.7 | -0.62 | Q8IZD6
MCF7_ISD_SDC_07 | AVAAVAATGPASAPGPGGGR | 54.16 | 0.3 | O75474##
MCF7_ISD_SDC_10 | LLQQLVLSGNLIK | 61.79 | 1.22 | O75474##
MCF7_inGel_08 | LQSQIGGEFQSFPK | 38.64 | -2.27 | Q3SY17/Q9H1U9**
MCF7_inGel_08 | NGLSNVLFFGLR | 45.54 | 6.33 | Q3SY17/Q9H1U9**
MCF7_SF_ISD_SDC_05 | MIASQVVDINLAAEPK | 54.51 | 3.43 | P0DMR1#
MCF7_SF_ISD_SDC_03 | VLELAGNEAQNSGER | 52.5 | 1.09 | P0C5Z0*
MCF7_inGel_04 | GSSGAGGR | 34.21 | 11.17 | P0C5Z0*
MCF7_SF_ISD_SDC_01 | SQSPTCQMCGEK | 60.51 | -2.21 | Q12999*
MCF7_SF_ISD_SDC_10 | AMTTPVIIAIQTFCYQK | 32.37 | 3.71 | Q7Z769*
## Two peptides with good MS2 quality were identified, but LLQQLVLSGNLIK has been reported in the PE=1 protein Q92837; this protein is therefore a "candidate detection".
** Q3SY17 and Q9H1U9 belong to the same protein family (very high sequence similarity, 95.96%) and cannot be distinguished based on the two unique peptides; this protein group is a "candidate detection".
# Only one unique peptide, but the MS2 spectrum is of good quality.
* The MS2 spectra are of poor quality.
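The DeltaM [ppm] column of Table 4 is assumed here to denote the relative deviation between the observed and theoretical peptide mass, expressed in parts per million, as is conventional in search-engine output. The Python sketch below shows how such a value can be computed. The residue masses are standard monoisotopic values; the function names are illustrative, the observed mass in the example is back-calculated from the tabulated DeltaM rather than taken from the raw data, and fixed modifications (for example, carbamidomethylation of cysteine) are deliberately not handled.

```python
# Standard monoisotopic residue masses (Da); no modifications are applied.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565  # mass of H2O added for the peptide termini
PROTON = 1.007276  # mass of a proton, used for m/z conversion


def peptide_mass(sequence):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


def mass_to_mz(mass, charge):
    """m/z of the protonated ion with the given charge state."""
    return (mass + charge * PROTON) / charge


def ppm_error(observed, theoretical):
    """Relative mass deviation in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


theoretical = peptide_mass("ETGSFLDLFR")
# Illustrative observed mass, back-calculated from DeltaM = -0.48 ppm.
observed = theoretical * (1 - 0.48e-6)
delta = ppm_error(observed, theoretical)
print(f"theoretical mass: {theoretical:.4f} Da")
print(f"m/z (2+): {mass_to_mz(theoretical, 2):.4f}")
print(f"DeltaM: {delta:.2f} ppm")
```

A negative value indicates that the observed mass is lower than the theoretical one; typical acceptance windows depend on the instrument and search settings.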
