Finding Missing Proteins from the Epigenetically Manipulated Human

Jul 23, 2015
Article pubs.acs.org/jpr

Finding Missing Proteins from the Epigenetically Manipulated Human Cell with Stringent Quality Criteria

Lijuan Yang,†,⊥ Xinlei Lian,†,⊥ Wanling Zhang,†,⊥ Jie Guo,†,⊥ Qing Wang,†,⊥ Yaxing Li,† Yang Chen,† Xingfeng Yin,† Pengyuan Yang,§ Fei Lan,*,‡ Qing-Yu He,*,† Gong Zhang,*,† and Tong Wang*,†

† Key Laboratory of Functional Protein Research of Guangdong Higher Education Institutes, Institute of Life and Health Engineering, College of Life Science and Technology, Jinan University, Guangzhou 510632, China
‡ Key Laboratory of Epigenetics, Department of Cell Biology and §Department of Chemistry, School of Basic Medicine, and Institutes of Biomedical Sciences, Fudan University, Shanghai 200032, China

ABSTRACT: The chromosome-centric human proteome project (C-HPP) has made great progress in finding protein evidence (PE) for missing proteins (PE2−4 proteins as defined by neXtProt), a task that is becoming increasingly challenging. Because the majority of samples tested in this field were from adult tissues/cells, developmental stage specific or relevant proteins could be missed due to the limited availability of biological sources. We posit that epigenetic interventions may help to partially bypass this limitation by stimulating the expression of “silenced” genes in adult cells, increasing the chance of finding missing proteins. In this study, we established in vitro human cell models to modify histone acetylation, demethylation, and methylation under near-physiological conditions. With mRNA-seq analysis, we found that histone modifications resulted in overall increases of expressed genes, evenly distributed across the chromosomes. We identified 64 PE2−4 and six PE5 proteins by MaxQuant (FDR < 1% at both protein and peptide levels) and 44 PE2−4 and seven PE5 proteins by Mascot (FDR < 1% at peptide level) searches. However, only 24 PE2−4 and five PE5 proteins in Mascot searches and 12 PE2−4 and one PE5 protein in MaxQuant searches passed our stringent manual spectrum inspection. Collectively, 27 PE2−4 and five PE5 proteins were identified from the epigenetically modified cells; among them, 19 PE2−4 and three PE5 proteins passed FDR < 1% at both peptide and protein levels. Gene ontology analyses revealed that the PE2−4 proteins were significantly involved in development and spermatogenesis, although their chemical−physical features showed no statistical difference from the background. In addition, we present an example of a suspicious PE5 peptide spectrum match involving an unusual AA substitution related to post-translational modification. In conclusion, epigenetically manipulated cell models should be a useful tool for finding missing proteins in C-HPP. The mass spectrometry data have been deposited in the iProx database (accession number: IPX00020200).

KEYWORDS: Chromosome-Centric Human Proteome Project, epigenetic, histone, missing proteins



INTRODUCTION

As initiated by the Human Proteome Organization (HUPO), the chromosome-centric human proteome project (C-HPP) is now in its third year. The primary Phase I goal of C-HPP is to find protein evidence (PE) for all of the human coding genes,1,2 and the most important impact of such an achievement is to allow innovative, systematic interrogation of biology/disease driven questions based on a more accurate human proteome knowledgebase.2−6 Serving as the primary bioinformatics platform for C-HPP, the neXtProt database integrates 14 professional resources including well-reviewed mass spectrometry (MS) databases, UniProtKB/Swiss-Prot, and the Human Protein Atlas (HPA) containing antibody evidence;7 it is now widely accepted as an essential resource for C-HPP.5,8,9 According to the five PE levels defined by neXtProt, PE1 genes are confirmed protein coding genes, and PE2−4 genes are deemed coding genes for missing proteins that lack MS or antibody evidence, while PE5 genes are considered doubtful coding genes. Out of the recently updated denominator of 19 490 total protein coding genes,5 neXtProt (release 2015_01_01) has labeled 16 520 PE1, 2936 PE2−4, and 605 PE5 genes. As compared with the 2013 release, more than 2000 PE2−4 proteins have been identified.7 Numerous factors have been recognized to account for missing MS evidence of PE2−4 proteins. These factors include: (1) proteins are expressed only in unusual organs or cell types, which can be improved by analyzing the plasma,10−12 more organs, and cell types;13−15 (2) proteins are either of low abundance or unsuitable for MS analyses, which can be tackled by improved

Special Issue: The Chromosome-Centric Human Proteome Project 2015 Received: May 28, 2015


DOI: 10.1021/acs.jproteome.5b00480 J. Proteome Res. XXXX, XXX, XXX−XXX


MS instruments, multiple proteolytic digestion,16,17 and focusing on subcellular compartments such as detergent-insoluble proteins;18 and (3) proteins are expressed only in specific or early developmental stages.19,20 The last problem brought epigenetic regulation to our attention, as numerous developmental stage specific genes are often switched off due to the DNA−histone association. In addition, these epigenetic changes have been correlated with aging and cancer phenotypes,21 and epigenetic regulation of the genome has emerged as an attractive new strategy for therapeutic intervention.22 These advances suggest a strategy for missing protein exploration: switching on or stimulating the transcription of missing protein coding genes by epigenetic manipulation in human cells.

There are three major epigenetic mechanisms: DNA methylation, histone modifications, and ncRNAs.23 Histone is a key player in epigenetics, and its acetylation and methylation are the most common post-translational modifications (PTMs). These histone PTMs have important roles in transcriptional regulation, DNA repair, DNA replication, alternative splicing, and chromosome condensation.24 For example, we previously found that histone H3.3 lysine 36 trimethylation (H3.3K36me3) and its reader protein BS69 work together to regulate pre-mRNA processing.25 Therefore, in this study, we established in vitro histone acetylation, demethylation, and methylation models, respectively, by using human lung, liver, and colorectal cancer (CRC) cells. We examined both transcriptomic and proteomic changes after epigenetic interventions to find missing proteins.



(1:5000; ProteinTech) and goat anti-mouse (1:3000; Bioworld) secondary antibodies were used.

Cytotoxicity Assays

The activity of lactate dehydrogenase (LDH) released into the supernatant from damaged cells was measured by using a Roche LDH Cytotoxicity Detection Kit as we previously described.33 In brief, supernatants were collected from the untreated and drug-treated cells and mixed with the LDH substrate. The light absorbance was then measured at 490 and 630 nm. Fresh media and cell lysates were used as the negative and positive controls, respectively.

qRT−PCR

Total RNA was isolated by TRIzol (Life Technologies) as we previously described.29,34 RNA was treated with DNase I (Thermo), followed by reverse transcription with a RevertAid First Strand cDNA Synthesis Kit (Thermo) and qRT-PCR analyses with a real-time PCR system (Applied BioSystems). The primer information is included in Supplementary Table S1.

Protein Digestion and Label-Free MS

Protein digestion and MS analyses were performed as we described previously with minor modifications.18,27,28 Briefly, cells were lysed with 1% SDS buffer and subjected to reduction (8 M urea and 50 mM DTT at 37 °C, 1 h) and alkylation (100 mM IAA, at room temperature, 30 min). We next performed filter-aided sample preparation (FASP) for the in-solution protein digestion.35 In detail, cell lysates were loaded into 30 kDa ultracentrifugal filters (Sartorius Stedim Biotech, Shanghai, China) and centrifuged for 15 min (12 000 × g, 4 °C), followed by two sequential buffer-exchange centrifugations with 8 M urea and a 5-fold volume of 50 mM NH4HCO3, respectively. Trypsin was then added to the protein solution at a mass ratio of 1:30 for in-solution digestion at 37 °C for 8 h. Peptides were collected by centrifugation (12 000 × g, 4 °C, 15 min), freeze-dried, and resuspended in NH4HCO2 (pH = 10). The peptides were fractionated by high-pH RP-LC (10 fractions) and then analyzed with a TripleTOF 5600 MS (5600 MS; AB SCIEX, Framingham, CA, USA). MS parameters: spray voltage, 2.3 kV; interface heater temperature, 120 °C; scan range, 350−1500 m/z; mass tolerance, 50 mDa; resolution, >30 k fwhm; information-dependent acquisition (IDA) MS/MS scans, applied; maximum number of candidate ions per cycle, 40; charge state, 2−4 and >200 cps; dynamic exclusion, applied; cooccurrence, 1; and duration, 20 s.18
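The acquisition window above (scan range 350−1500 m/z, charge states 2−4) determines which precursor ions of a tryptic peptide can be selected at all. The following is a minimal sketch of that arithmetic, not part of the authors' workflow: the function names are hypothetical, the residue masses are standard monoisotopic values rather than values from this paper, and modifications such as carbamidomethylation are ignored.

```python
# Standard monoisotopic residue masses in Da (assumed reference values,
# not taken from the paper). Cys is unmodified here; carbamidomethylation
# would add 57.02146 Da per Cys.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # mass of H2O added for the peptide termini
PROTON = 1.007276  # mass of one proton


def peptide_mz(sequence, charge):
    """Monoisotopic m/z of an unmodified peptide at the given charge."""
    neutral = sum(RESIDUE_MASS[aa] for aa in sequence) + WATER
    return (neutral + charge * PROTON) / charge


def charges_in_scan_range(sequence, charges=(2, 3, 4), low=350.0, high=1500.0):
    """Charge states whose precursor m/z falls inside the scan range."""
    return [z for z in charges if low <= peptide_mz(sequence, z) <= high]


if __name__ == "__main__":
    # SEEMELKLER is one of the peptides listed in Table 1.
    for z in (2, 3, 4):
        print(z, round(peptide_mz("SEEMELKLER", z), 3))
    print(charges_in_scan_range("SEEMELKLER"))
```

For this peptide the 4+ ion falls below 350 m/z, so only the 2+ and 3+ precursors would lie inside the stated scan range.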

MATERIALS AND METHODS

Cell Lines and Reagents

Human lung cancer A549 and CRC HCT116 cells were acquired from American Type Culture Collection (ATCC, Manassas, VA, USA). Human hepatoma MHCC97H (97H) cells were maintained and distributed by Fudan University.26−28 A549 and 97H cells were cultured in complete DMEM medium (Life Technologies, Beijing, China); HCT116 cells were maintained in complete RPMI 1640 medium (Life Technologies). All complete media were supplemented with 10% FBS (Life Technologies), 2 mM L-glutamine, 1 mM sodium pyruvate, 1% penicillin/streptomycin (pen/strep), and 10 μg/mL ciprofloxacin.18,29 TSA (Sigma-Aldrich, Shanghai, China),30 GSK126 (Biovision, Milpitas, CA, USA),31 and S2101 (Merck, Shanghai, China) were used to inhibit histone deacetylase (HDAC), EZH2 methyltransferase, and lysine-specific demethylase 1 (LSD1),32 respectively.

Database Searches

We used both MaxQuant (version 1.5.2.8) and Mascot (version 2.5.1) software to perform the database searches against UniprotSwiss HUMAN.fasta (2015_02 Release, 20 198 entries). The search parameters were as follows: fixed modification, carbamidomethyl (C); variable modifications, oxidation (M), Gln → pyro-Glu (N-terminus), and acetyl (N-terminus); fragment ion mass tolerance, 0.05 Da; parent ion tolerance, 15 ppm. For Mascot searches, we extracted wiff peak lists with AB SCIEX MS Data Converter (version 1.3) and set the peptide level FDR < 1%; we then further analyzed the resulting DAT file with Scaffold (version 4.2.1) to control the protein level FDR to <1% […] >200 m/z should be labeled; (2) consecutive b and/or y ion series should be observed; (3) ion intensity >50; and (4) the distribution of ion mass errors should be generally linear. We […]


Table 1. Detailed Information on the Identified PE2−4 and PE5 Proteins

no. | Uniprot accession number | HGNC gene name | neXtProt PE level | HPA b
1 | Q9P225 a | DNAH2 | 2 | P
2 | Q6UXU6 | TMEM92 | 2 | U
3 | P0C091 a | FREM3 | 2 | U
4 | Q8NEE8 a | TTC16 | 2 | U
5 | C9J7I0 a | UMAD1 | 3 | U
6 | Q6ZRH9 | YB011 | 2 | P
7 | Q9NS26 a | SPANXA1 | 2 | U
8 | F5GYI3 a | UBAP1L | 2 | P
9 | Q8N4W6 | DNAJC22 | 2 | U
10 | Q86WK7 a | AMIGO3 | 2 | U
11 | Q15935 a | ZNF77 | 2 | U
12 | P0C6C1 a | ANKRD34C | 2 | U
13 | P0C221 a | CCDC175 | 2 | U
14 | O75901 a | RASSF9 | 2 | U
15 | Q8NGK0 | OR51G2 | 2 | P
16 | Q6NX49 a | ZNF544 | 2 | U
17 | Q6NT89 a | TRNP1 | 2 | P
18 | O15375 a | SLC16A5 | 2 | U
19 | Q14929 | ZNF169 | 2 | U
20 | Q8IZF3 a | GPR115 | 2 | U
21 | Q8N9W4 | GOLGA6L2 | 2 | U
22 | H3BRN8 | C15 | 2 | U
23 | Q5XKL5 | BTBD8 | 2 | U
24 | Q9UN75 a | PCDHA12 | 2 | P
25 | P08912 a | CHRM5 | 2 | U
26 | Q008S8 a | ECT2L | 2 | U

biological source c (in table order): MHCC97H-TSA; A549-S2101; MHCC97H-S2101; MHCC97H-S2101; MHCC97H-GSK126; A549-GSK126; MHCC97H-S2101; MHCC97H-GSK126; MHCC97H-S2101; MHCC97H-GSK126; MHCC97H-GSK126; MHCC97H-GSK126; MHCC97H; MHCC97H-TSA; HCT116-TSA; HCT116-TSA; MHCC97H; MHCC97H; MHCC97H; MHCC97H-TSA; MHCC97H; MHCC97H-GSK126; MHCC97H-GSK126; HCT116-TSA; HCT116; MHCC97H; MHCC97H-GSK126; HCT116-S2101; A549; A549; A549-GSK126; A549-GSK126; A549-S2101; MHCC97H-GSK126; MHCC97H

peptide evidence (in table order):

unique peptide sequence d | spectra counts | scores
SEEMELKLER (Mc) | 7 | 36
GPLELPSIIPPER (Mc) | 1 | 43
GPIEIPSIIPPER (Mqt) | 1 | 63
CEVTVLDALPR (Mc) | 1 | 33
LQEFDGAVEDFLK (Mc) | 1 | 30
GKTSDIEANQPLETNKENSSSVTVSDPEMENK (Mc) | 3 | 68
TSDIEANQPLETNKENSSSVTVSDPEMENK (Mc) | 1 | 44
TSDIEANQPIETNKENSSSVTVSDPEMENK (Mqt) | 1 | 44
GKTSDIEANQPIETNKENSSSVTVSDPEMENK (Mqt) | 3 | 34
ENSSSVTVSDPEMENK (Mqt) | 2 | 40
GKTSDIEANQPIETNKENSSSVTVSDPEMENK (Mqt) | 1 | 28
TSDIEANQPIETNKENSSSVTVSDPEMENK (Mqt) | 1 | 14
VADLVAGRR (Mc) | 1 | 30
SVPCDSNEANEMMPETPTGDSDPQPAPK (Mc) | 4 | 62
SVPCDSNEANEMMPETPTGDSDPQPAPK (Mqt) | 4 | 47
LCSLDVLRGVRLELAGAR (Mc) | 1 | 36
QLAYQVLGLSEGATNEEIHR (Mc) | 1 | 77
QLAYQVLGLSEGATNEEIHR (Mc) | 2 | 65
QIAYQVIGISEGATNEEIHR (Mqt) | 1 | 64
IAWVSPQQELLR (Mc) | 2 | 65
LLDLSSNTLR (Mc) | 2 | 46
IIDISSNTIR (Mqt) | 1 | 32
IAWVSPQQEIIR (Mqt) | 2 | 98
DVFGNGISNDEEIVK (Mc) | 3 | 47
DVFGNGISNDEEIVK (Mqt) | 1 | 52
TGASALVYAINADDK (Mc) | 3 | 32
ADLLLLENK (Mc) | 1 | 32
ELDLEIEK (Mc) | 1 | 31
EIDIEIEK (Mqt) | 3 | 125
EIDIEIEK (Mqt) | 2 | 99
TVLSIASRAER (Mc) | 1 | 27
SDVISQLEQEEDLCR (Mc) | 1 | 99
SDVISQIEQEEDICR (Mqt) | 1 | 87
SDVISQIEQEEDICR (Mqt) | 1 | 82
SDVISQIEQEEDICR (Mqt) | 1 | 110
VFLAAELR (Mc) | 1 | 35
QAVAADALERDLFLEAK (Mc) | 1 | 50
QAVAADALERDLFLEAK (Mqt) | 1 | 40
VTLIRHQR (Mc) | 1 | 41
SAETCTSLSVEK (Mc) | 1 | 53
MREQEEEMR (Mc) | 2 | 34
TLDFPNIQHTL (Mc) | 1 | 40
CGEGSAAPMVLLGSAGVCSK (Mc) | 1 | 35
EDCLNPPSEPR (Mc) | 1 | 54
STSTTGKPSQATGPSANWAK (Mqt) | 1 | 28
GTSHTPFER (Mqt) | 1 | 64


Table 1. continued

no. | Uniprot accession number | HGNC gene name | neXtProt PE level | HPA b
27 | A6NNA5 a | DRGX | 2 | U
28 | Q58FF6 a | HSP90AB4P | 5 | n/a
29 | Q13670 a | PMS2P11 | 5 | n/a
30 | P01893 | HLAH | 5 | n/a
31 | H7BZ55 | CROL3 | 5 | n/a
32 | Q96L14 a | CEP170P1 | 5 | n/a

biological source c (in table order): MHCC97H-GSK126; MHCC97H-TSA; MHCC97H-GSK126; MHCC97H-GSK126; A549; HCT116-TSA; MHCC97H; MHCC97H-GSK126; MHCC97H; MHCC97H-S2101; A549-S2101; HCT116-TSA; A549; HCT116-TSA; HCT116-S2101; MHCC97H-S2101

peptide evidence (in table order):

unique peptide sequence d | spectra counts | scores
EAIEAQQSIGR (Mqt) | 3 | 75
GFEVIYMSEPIDEYCVQQLK (Mc) | 10 | 61
KGFEVIYMSEPIDEYCVQQLK (Mc) | 1 | 36
SLLSVTKEGLELPEDEEEK (Mc) | 1 | 65
GFEVIYMSEPIDEYCVQQLK (Mc) | 21 | 61
GFEVIYMSEPIDEYCVQQIKEFDGK (Mc) | 2 | 40
GFEVIYMSEPIDEYCVQQIKEFDGK (Mqt) | 1 | 52.898
GFEVIYMSEPIDEYCVQQIK (Mqt) | 3 | 91.317
GFEVIYMSEPIDEYCVQQIK (Mqt) | 1 | 31.402
GFEVIYMSEPIDEYCVQQIK (Mqt) | 2 | 61.477
LSAASGYSDVTDSK (Mc) | 1 | 104
ISAASGYSDVTDSK (Mqt) | 1 | 85.909
ISAASGYSDVTDSK (Mqt) | 1 | 124.62
ISAASGYSDVTDSK (Mqt) | 1 | 115.1
ISAASGYSDVTDSK (Mqt) | 1 | 81.448
ISAASGYSDVTDSK (Mqt) | 1 | 43.861
ISAASGYSDVTDSK (Mqt) | 1 | 70.47
GGSYSQAASSNSAQGSDVSLTA (Mc) | 1 | 31
WDAEKVALQAR (Mc) | 1 | 32
EINDVAGEIDSVTSSGTAPSTTLVDR (Mc) | 1 | 34

a FDR < 1% at both peptide and protein levels determined by Scaffold or MaxQuant. b Antibody evidence in HPA; “U” indicates that a protein has uncertain antibody evidence, “P” pending normal tissue annotation, and “n/a” no such evidence. c Cell lines treated with drugs: cell line-drug. d Search engine: Mascot (Mc) and MaxQuant (Mqt).

USA). We used the CBS Prediction Servers TMHMM 2.0 (ref 39) to predict transmembrane proteins and compute their transmembrane helices. In addition, we used the SignalP 4.1 server to predict signal peptides in the first 70 N-terminal AAs.40 We considered a protein to be a transmembrane protein when it had >18 expected AAs in transmembrane helices (TMHMM 2.0 instructions) and at least one signal peptide (SignalP).18
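The classification rule stated above combines two predictor outputs. As a minimal sketch, assuming the two values have already been read from the TMHMM and SignalP reports (the function and parameter names here are hypothetical, not part of either tool):

```python
def is_transmembrane(expected_tm_residues, n_signal_peptides):
    """Apply the rule from the text: more than 18 expected amino acids in
    transmembrane helices AND at least one predicted signal peptide."""
    return expected_tm_residues > 18 and n_signal_peptides >= 1


if __name__ == "__main__":
    # Illustrative inputs only; these are not values from the paper.
    print(is_transmembrane(22.4, 1))  # both criteria met
    print(is_transmembrane(22.4, 0))  # helices predicted, no signal peptide
    print(is_transmembrane(18.0, 1))  # threshold is strict (>18)
```

Requiring both criteria makes the call conservative: a protein with predicted helices but no signal peptide is not counted.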

RNA Extraction and RNA-seq

The transcriptome analysis by RNA-seq was performed as we described previously.29,34 Briefly, total RNA was isolated by using TRIzol RNA extraction reagent (Ambion, Austin, TX). The poly(A)+ mRNA was selected with the NEBNext Poly(A) mRNA magnetic isolation module. We used the NEBNext Ultra RNA Library Prep Kit for Illumina to construct sequencing libraries, which were then sequenced on an Illumina NextSeq-500 sequencer for 75 cycles. The raw sequencing data have been deposited in the Gene Expression Omnibus (accession number: GSE69420).

Gene Ontology Analysis

Identified missing proteins were subjected to gene ontology (GO) analyses by using PANTHER.41 The statistical overrepresentation tests on GO-Slim Biological Processes and Cellular Components were performed, respectively. The statistical significance was accepted when P < 0.05.

Sequence Analysis

High-quality reads were mapped to the human mRNA reference sequence (RefSeq) for GRCh38/hg38 (downloaded from the UCSC genome browser on May 21, 2014) using the FANSe2 mapping algorithm with the options -L80 -S10 -I0 -E5 -B1 (ref 38). The reads mapped to splice variants of the same gene were merged. The mRNA abundance was normalized as RPKM (reads per kilobase per million reads). Genes with ≥10 mapped reads were considered confidently detected genes.29
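The three quantification steps in this paragraph (merging splice variants, RPKM normalization, and the ≥10 read detection threshold) can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: all function names and the toy identifiers are hypothetical, and the text does not say which transcript length is used for a merged gene, so `rpkm` simply takes a length as an argument.

```python
def rpkm(read_count, length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (length_bp * total_mapped_reads)


def merge_variant_counts(variant_counts, variant_to_gene):
    """Sum the read counts of all splice variants belonging to one gene."""
    gene_counts = {}
    for variant, n_reads in variant_counts.items():
        gene = variant_to_gene[variant]
        gene_counts[gene] = gene_counts.get(gene, 0) + n_reads
    return gene_counts


def confidently_detected(gene_counts, min_reads=10):
    """Genes meeting the >=10 mapped reads criterion from the text."""
    return {gene for gene, n in gene_counts.items() if n >= min_reads}


if __name__ == "__main__":
    # Toy identifiers and counts, for illustration only.
    counts = {"NM_1": 6, "NM_2": 5, "NM_3": 3}
    mapping = {"NM_1": "GENE_A", "NM_2": "GENE_A", "NM_3": "GENE_B"}
    merged = merge_variant_counts(counts, mapping)
    print(merged, confidently_detected(merged))
    print(rpkm(10, 1000, 1_000_000))
```

Note that the detection threshold is applied to raw merged read counts, whereas RPKM is used only for comparing abundance.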

Statistical Analysis

The Mann−Whitney U-test was used to test the mean values of the features between groups of proteins, and the statistical significance was accepted when P < 0.01. Bootstrap analyses were performed to compare the mean values of properties between different protein groups. Both U-test and bootstrap were performed with the MATLAB software version R2013a (MathWorks, Natick, MA, USA).
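The authors ran both tests in MATLAB; their scripts are not given here. As a rough, standard-library sketch of the two ideas, the code below computes the Mann−Whitney U statistic by direct pair counting and estimates a two-sided bootstrap p-value for the mean of a protein subset against resamples drawn from the background. The function names, the resampling scheme, and the add-one correction are assumptions of this sketch.

```python
import random
from statistics import mean


def mann_whitney_u(x, y):
    """U statistic for x versus y: number of (x_i, y_j) pairs with
    x_i > y_j, counting ties as one half."""
    u = 0.0
    for a in x:
        for b in y:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


def bootstrap_mean_pvalue(subset, background, n_resamples=10000, seed=0):
    """Fraction of same-size resamples (with replacement) from the
    background whose mean deviates from the background mean at least as
    much as the subset mean does. Add-one correction avoids p = 0."""
    rng = random.Random(seed)
    bg_mean = mean(background)
    observed_dev = abs(mean(subset) - bg_mean)
    k = len(subset)
    extreme = 0
    for _ in range(n_resamples):
        resample = [rng.choice(background) for _ in range(k)]
        if abs(mean(resample) - bg_mean) >= observed_dev:
            extreme += 1
    return (extreme + 1) / (n_resamples + 1)


if __name__ == "__main__":
    background = list(range(100))      # toy feature values
    subset = [40, 45, 50, 55, 60]      # toy "missing protein" values
    print(mann_whitney_u(subset, background))
    print(bootstrap_mean_pvalue(subset, background, n_resamples=2000))
```

In the setting of the paper, `subset` would hold a feature (length, pI, or charge) of the identified PE2−4 proteins and `background` the same feature for the reference proteome.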

Chromosomal Enrichment Analysis



The PE1−5 genes that were expressed only in the treated cells were subjected to chromosomal enrichment analysis using Fisher’s exact test, examining the null hypothesis that the PE1−5 genes switched on in the histone-manipulated cells were not overrepresented on any specific chromosome.
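For one chromosome, the enrichment test described above reduces to a one-sided Fisher's exact test on a 2 × 2 table, which is the upper tail of a hypergeometric distribution. The sketch below is an illustration with hypothetical argument names and toy numbers, not the authors' implementation; whether they applied any multiple-testing correction across chromosomes is not stated in the text.

```python
from math import comb


def fisher_enrichment_p(k, n, K, N):
    """One-sided Fisher's exact test p-value, P(X >= k), where
    N = all genes considered,
    K = genes located on the chromosome of interest,
    n = genes switched on by the treatment,
    k = switched-on genes located on that chromosome."""
    upper = min(n, K)
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


if __name__ == "__main__":
    # Toy numbers for illustration only.
    print(fisher_enrichment_p(k=5, n=5, K=5, N=10))
    print(fisher_enrichment_p(k=2, n=50, K=40, N=1000))
```

A small p-value for a chromosome would reject the null hypothesis that the switched-on genes are not overrepresented there.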

RESULTS

Epigenetic Manipulation of Human Cells

We first tested the minimal effective dose (MED) of the three tested drugs by using Western blotting (Figure 1A). According to the alteration of the histone modifications, the MEDs were determined as 1 μM for TSA, 50 μM for GSK126, and 100 μM for S2101. We further examined the expression levels of a series of known mRNA biomarkers for the downstream effects of these

Physical and Chemical Features of Proteins

The protein lengths in AAs were obtained from the human subset of Uniprot database. The isoelectric point (pI) and the charge at pH = 7.4 of each protein were calculated by using MATLAB bioinformatics toolbox (MathWorks, Natick, MA,


Figure 3. Proteomic identification of PE2−4 missing proteins. (A) Comparison of PE2−4 proteins across cell lines treated with a single drug. (B) Comparison of PE2−4 protein identifications at individual cell line level. (C) A representative MS spectrum for the PE2 protein, dynein heavy chain 2, axonemal (Q9P225). All MS spectra for identified missing proteins are listed in the Supporting Information.

three drugs31,42−47 using qRT-PCR (Figure 1B). The results validated that, at the determined MEDs, the expression level of at least one signature gene was significantly altered (P < 0.01, n = 3), as expected, in each drug-treated cell line compared with the untreated cells, whereas at lower concentrations the drugs failed to mediate such significant changes in all cell lines. To validate the physiological state of the cells, we performed LDH experiments and confirmed that the cell membrane integrity was not compromised at the MED in all cases (Supplementary Figure S1). We next examined the distribution of the cellular protein abundance of all MS-quantified proteins (Figure 1C). We identified more than 6000 proteins in most of the experimental groups, except GSK126-treated A549 cells. The ion score cutoff for peptide level FDR […] (P > 0.01, Mann−Whitney U-test, Figure 4A). This conclusion was further consolidated by the bootstrapping analysis (Figure 4B). We combined the prediction of signal peptides and transmembrane domains to predict the transmembrane proteins, and the missing proteins showed no significantly larger fraction than all PE1 proteins (P > 0.01, Fisher exact test, Figure 4C). This was also confirmed by the GO enrichment analysis on cellular component using PANTHER: no term was overrepresented (P > 0.05). Taken together, our results indicated that these missing proteins were not physically or chemically different from the known proteins. The major reason for their not being detected in previous studies is likely that they are suppressed by histones in adult cells. In this study, we switched on the expression of some of them by epigenetic manipulation.

With Mascot searches, we initially found 44 PE2−4 and seven PE5 proteins that passed the peptide level FDR threshold (<1%) […]




DISCUSSION

Epigenetic modifiers are essential to ensure the distinct gene expression signatures during development and differentiation,49 so that some groups of proteins are expressed solely in certain stages. These proteins tend to be missed in the cells and tissues from adults. However, it is extremely challenging or unrealistic to test samples representing all human developmental stages, including embryo, fetus, neonate, infant, juvenile, and many others, partially due to ethics regulations and technical challenges (e.g., the very limited number of cells during early embryo stages). The epigenetically manipulated human cell models reported in this study represent a feasible alternative way to bypass multiple developmental stage restrictions and to find missing proteins that are not expressed in adult samples.

involve development processes. We then searched the NCBI EST database for the expression of these missing protein coding genes in various developmental stages (Table 3). Fourteen such genes were found to have notable expression of more than five transcripts per million (TPM) in at least one experiment. All 14 genes were expressed in nonadult stages, especially in the fetus; 11 of them showed higher expression in nonadult stages. These results implied that our epigenetic manipulation enhanced the expression of proteins that are mainly expressed in other developmental stages but switched off in adults.


Table 3. mRNA Expression Levels of the Identified Missing Proteins in Various Developmental Stages, Retrieved from NCBI EST Database a

Uniprot ID | gene symbol | fetus
Q9P225 | DNAH2 | 8
Q6UXU6 | TMEM92 | 3
P0C091 | FREM3 | 3
Q15935 | ZNF77 | 10
O75901 | RASSF9 | 5
Q6NX49 | ZNF544 | 34
Q6NT89 | TRNP1 | 7
O15375 | SLC16A5 | 19
Q14929 | ZNF169 | 21
Q8IZF3 | GPR115 | 5
Q9UN75 | PCDHA12 | 190
Q13670 | PMS2P11 | 8
P01893 | HLA-H | 1
Q96L14 | CEP170P1 | 10

Values of the remaining stage columns (embryoid body, blastocyst, neonate, infant, juvenile, adult), in extracted order with row positions not recoverable: 14; 16; 16 28; 48; 14 14; 32 16; 42 14; 48; 28; 11 14; 43 32 32; 42; 35; 53; 8 2 39 11 31 5 7 32 6 8 2

a The expression levels are given in TPM.

demonstrated in this study that histone acetylation and methylation manipulation resulted in a massive increase in the number of expressed genes in most of the experimental groups. Guruceaga et al. have analyzed the properties of missing proteins and found that PE2−4 proteins are significantly different from PE1 proteins in terms of 3′ and 5′ UTRs, CDS length, and structures.15 Therefore, even though we can switch on the transcription, the translation and MS analyses are still challenging, which potentially led to the limited number of missing protein identifications in this study. Interestingly, the significant enrichment of development and spermatogenesis among the PE2−4 missing proteins favored our argument for using such an in vitro epigenetic model to partially reflect developmental stages. Thus, we reason that analyzing more cell lines and tissues, as well as epigenetic manipulation with more reagents, will increase the chances of identifying more missing proteins.

Since finding missing proteins is one of the most challenging current tasks for shotgun proteomics, the HUPO and HPP/C-HPP have strongly encouraged manual/visual spectral inspection to avoid misinterpretation or overinterpretation of data. According to our current data, we realize that the stringent inspection of MS spectra is critical to the field. For example, regarding PE2−4 proteins, we found that 20/44 (45%) and 52/64 (81%) of the initially identified missing proteins did not have confident spectra matches in Mascot and MaxQuant searches, respectively; the most frequent reasons for exclusion included unlabeled major peaks, extremely low ion intensity, and no consecutive b-/y-ions. Although differential performance of database searching software has been noted by other studies,51 it is still surprising that such a large portion of missing protein identifications does not have confident MS spectra matches. When dealing with large data sets aimed at defining the human proteome,13,14 Ezkurdia et al. have proposed that a simple combination of search results from different search engines would be very risky without strict manual inspection.52 Our current results re-emphasize that stringent quality criteria are critical for the C-HPP goals.

A very recent report by Dong et al. has systematically modeled and predicted the neXtProt PE5 proteins using I-TASSER and COFACTOR and proposed 66 highly scoring PE5 proteins that have folding and potential functions.53 Interestingly, among the three PE5 protein

Figure 4. Physical and chemical comparison of identified PE2−4 proteins with proteome background. (A) Fraction distribution of AA length, pI, and charge. (B) Bootstrap comparisons. (C) Prediction of transmembrane proteins.

Supporting this notion, recent single-cell RNA-seq analyses have characterized the oocyte to morula stages and found that the gene expression patterns shift considerably; in addition, the cells from the same eight-cell embryo differ remarkably from each other in their gene expression profiles,50 exhibiting rapid epigenetic modulation and high heterogeneity. In agreement, we


Figure 5. Case report of a rare suspicious spectrum match for a PE5 protein relevant to amino acid substitution and post-translational modifications. (A) An original MASCOT spectrum matched to a unique peptide for Q9BZK3. (B) Reinspected spectrum using pLabel software. (C) The pLabel simulation of the E to Q (deamidated) substitution.

experimentally identified by our current study, HSP90AB4P (Q58FF6) has been listed among these 66 PE5 proteins. Equally important, we were informed by the C-HPP community that some PE5 proteins already included by PeptideAtlas had recently been re-reviewed. The primary reason is that the so-called “unique peptides” may just be common PE1 protein peptides with unusual PTMs. To further reduce the potential false discoveries, we demonstrated a feasible way to use


Smith−Waterman alignment to discard all peptide matches with only one amino acid difference, minimizing the nonuniqueness due to single AA variations or uncommon PTMs. In this study, we also presented an experimental example to emphasize the importance of PTM inspection. According to our current data, among all of the identified PE2−5 proteins, only one was ruled out for such a reason. Hence, the AA substitution and rare PTM problems seem to be largely confined to PE5 proteins; certainly, this should be validated on a larger scale to reach general conclusions.

Table 4. Possible AA Substitution by Identical Mass AA with PTMs

Uniprot AC | HGNC symbol | cell lines | neXtProt PE level | unique peptide sequence | alternative protein | unique peptides of single amino acid substitution | PTM
Q58FF6 | HSP90AB4P | MHCC97H-GSK126 | 5 | SLLSVTKEGLELPEDEEEK | Q58FF7 | SLVSVTKEGLELPEDEEEK | methyl
P01893 | HLAH | HCT116-S2101 | 5 | GGSYSQAASSNSAQGSDVSLTA | P01892 | GGSYSQAASSDSAQGSDVSLTA | amidated
Q9BZK3 | NACAP1 | MHCC97H | 2 | LEDLSQEAQLAAAEK | E9PAV3 | LEDLSQQAQLAAAEK | deamidated
Q9BZK3 | NACAP1 | MHCC97H-GSK126 | 2 | LEDLSQEAQLAAAEKFK | E9PAV3 | LEDLSQQAQLAAAEKFK | deamidated
Q14929 | ZNF169 | HCT116-S2101 | 2 | VTLLRHQR | Q9NZR2 | VTLLRHER | amidated
Q13670 | PMS2P11 | MHCC97H | 5 | LSAASGYSDVTDSK | Q86UW9 | LSTASGYSDVTDSK | dehydroxymethyl
Q8NGK0 | OR51G2 | MHCC97H-GSK126 | 2 | TVLSLASRAER | Q9H339 | TVLSLASREER | decarboxymethyl
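The reasoning behind Table 4 is that a residue in a known protein plus a PTM can have the same mass as the residue seen in the "unique" peptide. The sketch below illustrates that check for three of the PTMs in the table. It is a simplification under stated assumptions: it compares equal-length peptides position by position instead of running a Smith−Waterman alignment, the function names are hypothetical, the masses are standard monoisotopic values rather than values from the paper, and "amidated" is taken as the reverse of deamidation.

```python
# Standard monoisotopic residue masses in Da (assumed reference values).
RESIDUE_MASS = {
    "D": 115.02694, "E": 129.04259, "N": 114.04293, "Q": 128.05858,
    "V": 99.06841, "L": 113.08406, "I": 113.08406,
}
# Mass shifts in Da; "amidated" is assumed to be the reverse of deamidation.
PTM_DELTA = {"deamidated": 0.984016, "amidated": -0.984016, "methyl": 14.015650}


def single_substitution(peptide, other):
    """Return (index, residue_in_peptide, residue_in_other) when the two
    equal-length peptides differ at exactly one position, else None."""
    if len(peptide) != len(other):
        return None
    diffs = [i for i, (a, b) in enumerate(zip(peptide, other)) if a != b]
    if len(diffs) != 1:
        return None
    i = diffs[0]
    return i, peptide[i], other[i]


def isobaric_with_ptm(identified_residue, alternative_residue, ptm, tol_da=0.001):
    """True when the alternative residue carrying the PTM has the same mass
    (within tol_da) as the residue in the identified peptide."""
    shifted = RESIDUE_MASS[alternative_residue] + PTM_DELTA[ptm]
    return abs(shifted - RESIDUE_MASS[identified_residue]) <= tol_da


if __name__ == "__main__":
    # NACAP1 row of Table 4: identified peptide vs peptide of E9PAV3.
    hit = single_substitution("LEDLSQEAQLAAAEK", "LEDLSQQAQLAAAEK")
    print(hit)
    if hit is not None:
        _, identified, alternative = hit
        print(isobaric_with_ptm(identified, alternative, "deamidated"))
```

A peptide that passes both checks cannot be distinguished by precursor mass alone from the modified peptide of the alternative protein, which is why such matches call for closer spectrum inspection.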

CONCLUSIONS

In this study, we demonstrated an alternative way of finding human missing proteins by using epigenetically manipulated human cell lines. Such interference greatly increased the overall number of expressed genes. With peptide and protein level FDR < 1% and stringent manual MS spectrum inspection, we identified 19 PE2−4 and three PE5 proteins. These missing or dubious proteins had no physical−chemical difference from the background proteome, and they had minimal functional interactions with each other according to the knowledgebase. Finally, we presented a rare and dubious PE5 protein MS spectrum with a possible AA substitution.

ASSOCIATED CONTENT

Supporting Information

The Supporting Information is available free of charge on the ACS Publications website at DOI: 10.1021/acs.jproteome.5b00480. LDH release assay on cells treated with epigenetic inhibitors; protein abundances in each cell line obey a near-log-normal distribution; drug treatment did not alter protein distribution; genome-wide coexpression pattern and the identified PE2−5 proteins; qRT-PCR primers; data set FDR control for label-free LC−MS/MS searched by Mascot/Scaffold; and sequencing read counts for each sample (PDF) Analyses of Q9P225, Q6UXU6, P0C091, Q8NEE8, C9J7I0, Q6ZRH9, Q9NS26, F5GYI3, Q8N4W6, Q86WK7, Q15935, P0C6C1, P0C221, O75901, Q8NGK0, Q6NX49, Q6NT89, O15375, Q14929, Q8IZF3, Q8N9W4, H3BRN8, Q5XKL5, Q9UN75, P08912, Q008S8, A6NNA5, Q58FF6, Q13670, P01893, H7BZ55, Q96L14, and the unique peptides therein (PDF)

AUTHOR INFORMATION

Corresponding Authors

*E-mail: [email protected]. Phone: +86-20-85225960. Fax: +86-20-85222616.
*E-mail: [email protected]. Phone: +86-20-85224031. Fax: +86-20-85224031.
*E-mail: [email protected]. Phone: +86-20-85227039. Fax: +86-20-85227039.
*E-mail: [email protected]. Phone: +86-21-54237874. Fax: +86-21-54237874.

Author Contributions

⊥ L.Y., X.L., W.Z., J.G., and Q.W. contributed equally to this work.

Notes

The authors declare no competing financial interest.

ACKNOWLEDGMENTS

This work was supported by the National Basic Research Program “973” of China (No. 2014CBA02000 to T.W. and No. 2011CB910700 to Q.Y.H.), the International Collaboration Program (No. 2014DFB30010 to Q.Y.H.), the National Natural Science Foundation of China (No. 81372135 to T.W.; Nos. 81322028 and 31300649 to G.Z.), the International Science and Technology Cooperation Program of China (No. 2014DFB30020 to F.L.), the Specialized Research Fund for the Doctoral Program of Higher Education of China (No. 20124401120008 to G.Z.), and the Guangdong Natural Science Foundation (No. 2014A030313369 to T.W.).

REFERENCES

(1) Paik, Y. K.; Jeong, S. K.; Omenn, G. S.; Uhlen, M.; Hanash, S.; Cho, S. Y.; Lee, H. J.; Na, K.; Choi, E. Y.; Yan, F.; Zhang, F.; Zhang, Y.; Snyder, M.; Cheng, Y.; Chen, R.; Marko-Varga, G.; Deutsch, E. W.; Kim, H.; Kwon, J. Y.; Aebersold, R.; Bairoch, A.; Taylor, A. D.; Kim, K. Y.; Lee, E. Y.; Hochstrasser, D.; Legrain, P.; Hancock, W. S. The Chromosome-Centric Human Proteome Project for cataloging proteins encoded in the genome. Nat. Biotechnol. 2012, 30, 221−3.
(2) Paik, Y. K.; Omenn, G. S.; Thongboonkerd, V.; Marko-Varga, G.; Hancock, W. S. Genome-wide proteomics, Chromosome-Centric Human Proteome Project (C-HPP), part II. J. Proteome Res. 2014, 13, 1−4.
(3) Huhmer, A. F.; Paulus, A.; Martin, L. B.; Millis, K.; Agreste, T.; Saba, J.; Lill, J. R.; Fischer, S. M.; Dracup, W.; Lavery, P. The chromosome-centric human proteome project: a call to action. J. Proteome Res. 2013, 12, 28−32.
(4) Aebersold, R.; Bader, G. D.; Edwards, A. M.; van Eyk, J. E.; Kussmann, M.; Qin, J.; Omenn, G. S. The biology/disease-driven human proteome project (B/D-HPP): enabling protein research for the life sciences community. J. Proteome Res. 2013, 12, 23−7.


(5) Lane, L.; Bairoch, A.; Beavis, R. C.; Deutsch, E. W.; Gaudet, P.; Lundberg, E.; Omenn, G. S. Metrics for the Human Proteome Project 2013−2014 and strategies for finding missing proteins. J. Proteome Res. 2014, 13, 15−20.
(6) Horvatovich, P.; Lundberg, E. K.; Chen, Y. J.; Sung, T. Y.; He, F.; Nice, E. C.; Goode, R. J.; Yu, S.; Ranganathan, S.; Baker, M. S.; Domont, G. B.; Velasquez, E.; Li, D.; Liu, S.; Wang, Q.; He, Q. Y.; Menon, R.; Guan, Y.; Corrales, F. J.; Segura, V.; Casal, J. I.; Pascual-Montano, A.; Albar, J. P.; Fuentes, M.; Gonzalez-Gonzalez, M.; Diez, P.; Ibarrola, N.; Degano, R. M.; Mohammed, Y.; Borchers, C. H.; Urbani, A.; Soggiu, A.; Yamamoto, T.; Archakov, A. I.; Ponomarenko, E.; Lisitsa, A. V.; Lichti, C. F.; Mostovenko, E.; Kroes, R. A.; Rezeli, M.; Vegvari, A.; Fehniger, T. E.; Bischoff, R.; Vizcaino, J. A.; Deutsch, E. W.; Lane, L.; Nilsson, C. L.; Marko-Varga, G.; Omenn, G. S.; Jeong, S. K.; Cho, J. Y.; Paik, Y. K.; Hancock, W. S. A Quest for Missing Proteins: Update 2015 on Chromosome-Centric Human Proteome Project. J. Proteome Res., 2015, in press. DOI: 10.1021/pr5013009.
(7) Gaudet, P.; Argoud-Puy, G.; Cusin, I.; Duek, P.; Evalet, O.; Gateau, A.; Gleizes, A.; Pereira, M.; Zahn-Zabal, M.; Zwahlen, C.; Bairoch, A.; Lane, L. neXtProt: organizing protein knowledge in the context of human proteome projects. J. Proteome Res. 2013, 12, 293−8.
(8) Shiromizu, T.; Adachi, J.; Watanabe, S.; Murakami, T.; Kuga, T.; Muraoka, S.; Tomonaga, T. Identification of missing proteins in the neXtProt database and unregistered phosphopeptides in the PhosphoSitePlus database as part of the Chromosome-centric Human Proteome Project. J. Proteome Res. 2013, 12, 2414−21.
(9) Farrah, T.; Deutsch, E. W.; Omenn, G. S.; Sun, Z.; Watts, J. D.; Yamamoto, T.; Shteynberg, D.; Harris, M. M.; Moritz, R. L. State of the human proteome in 2013 as viewed through PeptideAtlas: comparing the kidney, urine, and plasma proteomes for the biology- and disease-driven Human Proteome Project. J. Proteome Res. 2014, 13, 60−75.
(10) Malm, J.; Danmyr, P.; Nilsson, R.; Appelqvist, R.; Vegvari, A.; Marko-Varga, G. Blood plasma reference material: a global resource for proteomic research. J. Proteome Res. 2013, 12, 3087−92.
(11) Omenn, G. S. Plasma proteomics, the Human Proteome Project, and cancer-associated alternative splice variant proteins. Biochim. Biophys. Acta, Proteins Proteomics 2014, 1844, 866−73.
(12) Ponomarenko, E. A.; Kopylov, A. T.; Lisitsa, A. V.; Radko, S. P.; Kiseleva, Y. Y.; Kurbatov, L. K.; Ptitsyn, K. G.; Tikhonova, O. V.; Moisa, A. A.; Novikova, S. E.; Poverennaya, E. V.; Ilgisonis, E. V.; Filimonov, A. D.; Bogolubova, N. A.; Averchuk, V. V.; Karalkin, P. A.; Vakhrushev, I. V.; Yarygin, K. N.; Moshkovskii, S. A.; Zgoda, V. G.; Sokolov, A. S.; Mazur, A. M.; Prokhortchouck, E. B.; Skryabin, K. G.; Ilina, E. N.; Kostrjukova, E. S.; Alexeev, D. G.; Tyakht, A. V.; Gorbachev, A. Y.; Govorun, V. M.; Archakov, A. I. Chromosome 18 transcriptoproteome of liver tissue and HepG2 cells and targeted proteome mapping in depleted plasma: update 2013. J. Proteome Res. 2014, 13, 183−90.
(13) Wilhelm, M.; Schlegl, J.; Hahne, H.; Gholami, A. M.; Lieberenz, M.; Savitski, M. M.; Ziegler, E.; Butzmann, L.; Gessulat, S.; Marx, H.; Mathieson, T.; Lemeer, S.; Schnatbaum, K.; Reimer, U.; Wenschuh, H.; Mollenhauer, M.; Slotta-Huspenina, J.; Boese, J. H.; Bantscheff, M.; Gerstmair, A.; Faerber, F.; Kuster, B. Mass-spectrometry-based draft of the human proteome. Nature 2014, 509, 582−7.
(14) Kim, M. S.; Pinto, S. M.; Getnet, D.; Nirujogi, R. S.; Manda, S. S.; Chaerkady, R.; Madugundu, A. K.; Kelkar, D. S.; Isserlin, R.; Jain, S.; Thomas, J. K.; Muthusamy, B.; Leal-Rojas, P.; Kumar, P.; Sahasrabuddhe, N. A.; Balakrishnan, L.; Advani, J.; George, B.; Renuse, S.; Selvan, L. D.; Patil, A. H.; Nanjappa, V.; Radhakrishnan, A.; Prasad, S.; Subbannayya, T.; Raju, R.; Kumar, M.; Sreenivasamurthy, S. K.; Marimuthu, A.; Sathe, G. J.; Chavan, S.; Datta, K. K.; Subbannayya, Y.; Sahu, A.; Yelamanchi, S. D.; Jayaram, S.; Rajagopalan, P.; Sharma, J.; Murthy, K. R.; Syed, N.; Goel, R.; Khan, A. A.; Ahmad, S.; Dey, G.; Mudgal, K.; Chatterjee, A.; Huang, T. C.; Zhong, J.; Wu, X.; Shaw, P. G.; Freed, D.; Zahari, M. S.; Mukherjee, K. K.; Shankar, S.; Mahadevan, A.; Lam, H.; Mitchell, C. J.; Shankar, S. K.; Satishchandra, P.; Schroeder, J. T.; Sirdeshmukh, R.; Maitra, A.; Leach, S. D.; Drake, C. G.; Halushka, M. K.; Prasad, T. S.; Hruban, R. H.; Kerr, C. L.; Bader, G. D.; Iacobuzio-Donahue, C. A.; Gowda, H.; Pandey, A. A draft map of the human proteome. Nature 2014, 509, 575−81.

(15) Guruceaga, E.; Sanchez del Pino, M. M.; Corrales, F. J.; Segura, V. Prediction of a missing protein expression map in the context of the human proteome project. J. Proteome Res. 2015, 14, 1350−60.
(16) Glatter, T.; Ludwig, C.; Ahrne, E.; Aebersold, R.; Heck, A. J.; Schmidt, A. Large-scale quantitative assessment of different in-solution protein digestion protocols reveals superior cleavage efficiency of tandem Lys-C/trypsin proteolysis over trypsin digestion. J. Proteome Res. 2012, 11, 5145−56.
(17) Nardiello, D.; Palermo, C.; Natale, A.; Quinto, M.; Centonze, D. Strategies in protein sequencing and characterization: multi-enzyme digestion coupled with alternate CID/ETD tandem mass spectrometry. Anal. Chim. Acta 2015, 854, 106−17.
(18) Chen, Y.; Li, Y.; Zhong, J.; Zhang, J.; Chen, Z.; Yang, L.; Cao, X.; He, Q. Y.; Zhang, G.; Wang, T. Identification of missing proteins defined by chromosome-centric proteome project in the cytoplasmic detergent-insoluble proteins. J. Proteome Res., 2015, in press. DOI: 10.1021/pr501103r.
(19) Cedar, H.; Bergman, Y. Epigenetics of haematopoietic cell development. Nat. Rev. Immunol. 2011, 11, 478−88.
(20) Hu, N.; Strobl-Mazzulla, P. H.; Bronner, M. E. Epigenetic regulation in neural crest development. Dev. Biol. 2014, 396, 159−68.
(21) Gonzalo, S. Epigenetic alterations in aging. J. Appl. Physiol. 2010, 109, 586−97.
(22) Bartke, T.; Borgel, J.; DiMaggio, P. A. Proteomics in epigenetics: new perspectives for cancer research. Briefings Funct. Genomics 2013, 12, 205−18.
(23) Hussain, N. Epigenetic influences that modulate infant growth, development, and disease. Antioxid. Redox Signaling 2012, 17, 224−36.
(24) Portela, A.; Esteller, M. Epigenetic modifications and human disease. Nat. Biotechnol. 2010, 28, 1057−68.
(25) Guo, R.; Zheng, L.; Park, J. W.; Lv, R.; Chen, H.; Jiao, F.; Xu, W.; Mu, S.; Wen, H.; Qiu, J.; Wang, Z.; Yang, P.; Wu, F.; Hui, J.; Fu, X.; Shi, X.; Shi, Y. G.; Xing, Y.; Lan, F.; Shi, Y. BS69/ZMYND11 reads and connects histone H3.3 lysine 36 trimethylation-decorated chromatin to regulated pre-mRNA processing. Mol. Cell 2014, 56, 298−310.
(26) Wang, Q.; Wen, B.; Yan, G.; Wei, J.; Xie, L.; Xu, S.; Jiang, D.; Wang, T.; Lin, L.; Zi, J.; Zhang, J.; Zhou, R.; Zhao, H.; Ren, Z.; Qu, N.; Lou, X.; Sun, H.; Du, C.; Chen, C.; Zhang, S.; Tan, F.; Xian, Y.; Gao, Z.; He, M.; Chen, L.; Zhao, X.; Xu, P.; Zhu, Y.; Yin, X.; Shen, H.; Zhang, Y.; Jiang, J.; Zhang, C.; Li, L.; Chang, C.; Ma, J.; Yan, G.; Yao, J.; Lu, H.; Ying, W.; Zhong, F.; He, Q. Y.; Liu, S. Qualitative and quantitative expression status of the human chromosome 20 genes in cancer tissues and the representative cell lines. J. Proteome Res. 2013, 12, 151−61.
(27) Chang, C.; Li, L.; Zhang, C.; Wu, S.; Guo, K.; Zi, J.; Chen, Z.; Jiang, J.; Ma, J.; Yu, Q.; Fan, F.; Qin, P.; Han, M.; Su, N.; Chen, T.; Wang, K.; Zhai, L.; Zhang, T.; Ying, W.; Xu, Z.; Zhang, Y.; Liu, Y.; Liu, X.; Zhong, F.; Shen, H.; Wang, Q.; Hou, G.; Zhao, H.; Li, G.; Liu, S.; Gu, W.; Wang, G.; Wang, T.; Zhang, G.; Qian, X.; Li, N.; He, Q. Y.; Lin, L.; Yang, P.; Zhu, Y.; He, F.; Xu, P. Systematic analyses of the transcriptome, translatome, and proteome provide a global view and potential strategy for the C-HPP. J. Proteome Res. 2014, 13, 38−49.
(28) Wang, Q.; Wen, B.; Wang, T.; Xu, Z.; Yin, X.; Xu, S.; Ren, Z.; Hou, G.; Zhou, R.; Zhao, H.; Zi, J.; Zhang, S.; Gao, H.; Lou, X.; Sun, H.; Feng, Q.; Chang, C.; Qin, P.; Zhang, C.; Li, N.; Zhu, Y.; Gu, W.; Zhong, J.; Zhang, G.; Yang, P.; Yan, G.; Shen, H.; Liu, X.; Lu, H.; Zhong, F.; He, Q. Y.; Xu, P.; Lin, L.; Liu, S. Omics evidence: single nucleotide variants transmissions on chromosome 20 in liver cancer cell lines. J. Proteome Res. 2014, 13, 200−11.
(29) Wang, T.; Cui, Y.; Jin, J.; Guo, J.; Wang, G.; Yin, X.; He, Q. Y.; Zhang, G. Translating mRNAs strongly correlate to proteins in a multivariate manner and their translation ratios are phenotype specific. Nucleic Acids Res. 2013, 41, 4743−54.
(30) Ou, J. N.; Torrisani, J.; Unterberger, A.; Provencal, N.; Shikimi, K.; Karimi, M.; Ekstrom, T. J.; Szyf, M. Histone deacetylase inhibitor Trichostatin A induces global and gene-specific DNA demethylation in human cancer cell lines. Biochem. Pharmacol. 2007, 73, 1297−307.
(31) McCabe, M. T.; Ott, H. M.; Ganji, G.; Korenchuk, S.; Thompson, C.; Van Aller, G. S.; Liu, Y.; Graves, A. P.; Della Pietra, A., III; Diaz, E.; LaFrance, L. V.; Mellinger, M.; Duquenne, C.; Tian, X.; Kruger, R. G.; McHugh, C. F.; Brandt, M.; Miller, W. H.; Dhanak, D.; Verma, S. K.; Tummino, P. J.; Creasy, C. L. EZH2 inhibition as a therapeutic strategy for lymphoma with EZH2-activating mutations. Nature 2012, 492, 108−12.
(32) Suzuki, T.; Miyata, N. Lysine demethylases inhibitors. J. Med. Chem. 2011, 54, 8236−50.
(33) Shen, S.; Guo, J.; Luo, Y.; Zhang, W.; Cui, Y.; Wang, Q.; Zhang, Z.; Wang, T. Functional proteomics revealed IL-1beta amplifies TNF downstream protein signals in human synoviocytes in a TNF-independent manner. Biochem. Biophys. Res. Commun. 2014, 450, 538−44.
(34) Zhong, J.; Cui, Y.; Guo, J.; Chen, Z.; Yang, L.; He, Q. Y.; Zhang, G.; Wang, T. Resolving chromosome-centric human proteome with translating mRNA analysis: a strategic demonstration. J. Proteome Res. 2014, 13, 50−9.
(35) Wisniewski, J. R.; Zougman, A.; Nagaraj, N.; Mann, M. Universal sample preparation method for proteome analysis. Nat. Methods 2009, 6, 359−62.
(36) Wang, L. H.; Li, D. Q.; Fu, Y.; Wang, H. P.; Zhang, J. F.; Yuan, Z. F.; Sun, R. X.; Zeng, R.; He, S. M.; Gao, W. pFind 2.0: a software package for peptide and protein identification via tandem mass spectrometry. Rapid Commun. Mass Spectrom. 2007, 21, 2985−91.
(37) Fan, S. B.; Meng, J. M.; Lu, S.; Zhang, K.; Yang, H.; Chi, H.; Sun, R. X.; Dong, M. Q.; He, S. M. Using pLink to Analyze Cross-Linked Peptides. Curr. Protoc. Bioinformatics 2015, 49, 8.21.1−8.21.19.
(38) Xiao, C. L.; Mai, Z. B.; Lian, X. L.; Zhong, J. Y.; Jin, J. J.; He, Q. Y.; Zhang, G. FANSe2: a robust and cost-efficient alignment tool for quantitative next-generation sequencing applications. PLoS One 2014, 9, e94250.
(39) Krogh, A.; Larsson, B.; von Heijne, G.; Sonnhammer, E. L. Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes. J. Mol. Biol. 2001, 305, 567−80.
(40) Petersen, T. N.; Brunak, S.; von Heijne, G.; Nielsen, H. SignalP 4.0: discriminating signal peptides from transmembrane regions. Nat. Methods 2011, 8, 785−6.
(41) Mi, H.; Muruganujan, A.; Casagrande, J. T.; Thomas, P. D. Large-scale gene function analysis with the PANTHER classification system. Nat. Protoc. 2013, 8, 1551−66.
(42) Liu, Y.; He, G.; Wang, Y.; Guan, X.; Pang, X.; Zhang, B. MCM-2 is a therapeutic target of Trichostatin A in colon cancer cells. Toxicol. Lett. 2013, 221, 23−30.
(43) Kashyap, V.; Ahmad, S.; Nilsson, E. M.; Helczynski, L.; Kenna, S.; Persson, J. L.; Gudas, L. J.; Mongan, N. P. The lysine specific demethylase-1 (LSD1/KDM1A) regulates VEGF-A expression in prostate cancer. Mol. Oncol. 2013, 7, 555−66.
(44) Sato, T.; Kaneda, A.; Tsuji, S.; Isagawa, T.; Yamamoto, S.; Fujita, T.; Yamanaka, R.; Tanaka, Y.; Nukiwa, T.; Marquez, V. E.; Ishikawa, Y.; Ichinose, M.; Aburatani, H. PRC2 overexpression and PRC2-target gene repression relating to poorer prognosis in small cell lung cancer. Sci. Rep. 2013, 3, 1911.
(45) Gao, S. B.; Xu, B.; Ding, L. H.; Zheng, Q. L.; Zhang, L.; Zheng, Q. F.; Li, S. H.; Feng, Z. J.; Wei, J.; Yin, Z. Y.; Hua, X.; Jin, G. H. The functional and mechanistic relatedness of EZH2 and menin in hepatocellular carcinoma. J. Hepatol. 2014, 61, 832−9.
(46) Nagel, S.; Venturini, L.; Marquez, V. E.; Meyer, C.; Kaufmann, M.; Scherr, M.; MacLeod, R. A.; Drexler, H. G. Polycomb repressor complex 2 regulates HOXA9 and HOXA10, activating ID2 in NK/T-cell lines. Mol. Cancer 2010, 9, 151.
(47) Valente, S.; Rodriguez, V.; Mercurio, C.; Vianello, P.; Saponara, B.; Cirilli, R.; Ciossani, G.; Labella, D.; Marrocco, B.; Monaldi, D.; Ruoppolo, G.; Tilset, M.; Botrugno, O. A.; Dessanti, P.; Minucci, S.; Mattevi, A.; Varasi, M.; Mai, A. Pure enantiomers of benzoylaminotranylcypromine: LSD1 inhibition, gene modulation in human leukemia cells and effects on clonogenic potential of murine promyelocytic blasts. Eur. J. Med. Chem. 2015, 94, 163−74.
(48) Szklarczyk, D.; Franceschini, A.; Wyder, S.; Forslund, K.; Heller, D.; Huerta-Cepas, J.; Simonovic, M.; Roth, A.; Santos, A.; Tsafou, K. P.; Kuhn, M.; Bork, P.; Jensen, L. J.; von Mering, C. STRING v10: protein-protein interaction networks, integrated over the tree of life. Nucleic Acids Res. 2015, 43, D447−52.
(49) Cantone, I.; Fisher, A. G. Epigenetic programming and reprogramming during development. Nat. Struct. Mol. Biol. 2013, 20, 282−9.
(50) Xue, Z.; Huang, K.; Cai, C.; Cai, L.; Jiang, C. Y.; Feng, Y.; Liu, Z.; Zeng, Q.; Cheng, L.; Sun, Y. E.; Liu, J. Y.; Horvath, S.; Fan, G. Genetic programs in human and mouse early embryos revealed by single-cell RNA sequencing. Nature 2013, 500, 593−7.
(51) Dyrlund, T. F.; Poulsen, E. T.; Scavenius, C.; Sanggaard, K. W.; Enghild, J. J. MS Data Miner: a web-based software tool to analyze, compare, and share mass spectrometry protein identifications. Proteomics 2012, 12, 2792−6.
(52) Ezkurdia, I.; Vazquez, J.; Valencia, A.; Tress, M. Analyzing the first drafts of the human proteome. J. Proteome Res. 2014, 13, 3854−5.
(53) Dong, Q.; Menon, R.; Omenn, G. S.; Zhang, Y. A Structural Bioinformatics Inspection of neXtProt PE5 Proteins in the Human Proteome. J. Proteome Res., 2015, just accepted manuscript. DOI: 10.1021/acs.jproteome.5b00516.
