Article pubs.acs.org/jpr

Omics Evidence: Single Nucleotide Variants Transmissions on Chromosome 20 in Liver Cancer Cell Lines

Quanhui Wang,†,‡,▽ Bo Wen,‡,▽ Tong Wang,§,▽ Zhongwei Xu,∥,▽ Xuefei Yin,⊥,#,▽ Shaohang Xu,‡ Zhe Ren,‡ Guixue Hou,‡ Ruo Zhou,‡ Haiyi Zhao,‡ Jin Zi,‡ Shenyan Zhang,‡ Huan Gao,‡ Xiaomin Lou,†,‡ Haidan Sun,†,‡ Qiang Feng,‡ Cheng Chang,∥ Peibin Qin,∥ Chengpu Zhang,∥ Ning Li,∥ Yunping Zhu,∥ Wei Gu,§ Jiayong Zhong,§ Gong Zhang,§ Pengyuan Yang,⊥,# Guoquan Yan,⊥ Huali Shen,⊥,# Xiaohui Liu,⊥,# Haojie Lu,⊥,# Fan Zhong,*,⊥ Qing-Yu He,*,§ Ping Xu,*,∥ Liang Lin,*,‡ and Siqi Liu*,†,‡

† Beijing Institute of Genomics, Chinese Academy of Sciences, No. 1 Beichen West Road, Beijing 100101, China
‡ BGI-Shenzhen, Beishan Road, Shenzhen 518083, China
§ Key Laboratory of Functional Protein Research of Guangdong Higher Education Institutes, Institute of Life and Health Engineering, College of Life Science and Technology, Jinan University, 601 Huangpu Road, Guangzhou 510632, China
∥ State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Engineering Research Center for Protein Drugs, National Center for Protein Sciences, Beijing Institute of Radiation Medicine, 27 Taiping Road, Beijing 102206, China
⊥ Institutes of Biomedical Sciences, Fudan University, 130 DongAn Road, Shanghai 200032, China
# Department of Chemistry, Fudan University, 220 Handan Road, Shanghai 200433, China

Supporting Information

ABSTRACT: Cancer genomics unveils many cancer-related mutations, including some chromosome 20 (Chr.20) genes. The mutated messages have been found in the corresponding mRNAs; however, whether they could be translated to proteins still requires more evidence. Herein, we proposed a transomics strategy to profile the expression status of human Chr.20 genes (555 in Ensembl v72). The data of transcriptome and translatome (the mRNAs bound with ribosome, translating mRNAs) revealed that ∼80% of the coding genes on Chr.20 were detected with mRNA signals in three liver cancer cell lines, whereas of the proteome identified, only ∼45% of the Chr.20 coding genes were detected. The high amount of overlapping of identified genes in mRNA and RNC-mRNA (ribosome nascent-chain complex-bound mRNAs, translating mRNAs) and the consistent distribution of the abundance averages of mRNA and RNC-mRNA along the Chr.20 subregions in three liver cancer cell lines indicate that the mRNA information is efficiently transmitted from transcriptional to translational stage, qualitatively and quantitatively. Of the 457 genes identified in mRNAs and RNC-mRNA, 136 were found to contain SNVs with 213 sites, and >40% of these SNVs existed only in metastatic cell lines, suggesting them as the metastasis-related SNVs. Proteomics analysis showed that 16 genes with 20 SNV sites were detected with reliable MS/MS signals, and some SNVs were further validated by the MRM approach. With the integration of the omics data at the three expression phases, therefore, we are able to achieve the overall view of the gene expression of Chr.20, which is constructive in understanding the potential trend of encoding genes in a cell line and exploration of a new type of markers related to cancers.

KEYWORDS: Proteome, transcriptome, translatome, Chromosome, mutation

Special Issue: Chromosome-centric Human Proteome Project Received: September 2, 2013 Published: November 22, 2013 © 2013 American Chemical Society


dx.doi.org/10.1021/pr400899b | J. Proteome Res. 2014, 13, 200−211

Journal of Proteome Research




INTRODUCTION

Cancer pathogenesis is rooted in chromosomal abnormalities and aneuploidy in cells. It is generally believed that chromosomal imbalances play a causative role in tumorigenesis.1 Moreover, each chromosome has its own characteristic genomic structure and exerts a specific impact on the intrinsic infidelity of inheritance. For instance, recurrent gain of the long arm of chromosome 20 (Chr.20q) has been observed in many cancers, such as pancreatic, gastric, and colon cancer, whereas 20q deletion is very rare in the cases reported so far.2−7 After examining samples of prostate cancer, Tabach et al. proposed that 13 genes in Chr.20q13 associated with Chr.20q amplification act as “cancer initiating genes” in the cancer-driving processes.8 A large number of somatic mutations were found in Chr.20 as well. According to the Catalogue of Somatic Mutations in Cancer (COSMIC v65: http://cancer.sanger.ac.uk/cancergenome/projects/cosmic/), there are in total 1684 genes with liver-cancer-related variations (Table S1 in the Supporting Information), of which 54 are located on Chr.20, including 50 SNVs, 2 deletions, and 2 of unknown type. The variation rate is ∼10% of the genes on this chromosome.
Determining which somatic mutations are likely to contribute to the cancer phenotype is the most common aim of chromosome-based cancer research.9 With the advancement of sequencing technology, cancer genomics has emerged to increase the basic knowledge of cancer biology and the opportunities to advance cancer prevention, diagnostics, prognostics, and treatment.10−12 A huge body of somatic mutations across several major tumor types has been generated by deep sequencing, which has provided instructive insights into the opportunities and challenges of a genomics-driven framework in cancer.13−15 Meanwhile, these databases have unveiled immense genomic complexity and striking inter- and intratumor heterogeneity.16 Even though thousands of cancer mutations have accumulated, only a limited number of functional changes and drivers of cancer biology have been learned from the sequencing data. The field urgently needs to gather systematic information related to oncological questions and somatic mutations and to bring the benefit of sequencing power to clinical care. Unexpectedly, high-throughput RNA sequencing studies revealed that only a limited number of mutations were expressed at the mRNA level, indicating that the mutation messages in transcripts are not completely inherited from DNA. Moreover, even fewer mutated proteins were confirmed on the basis of somatic mutations in cancer. A transomics strategy is thus proposed, which requires the generation of several types of omics data from each individual cancer specimen, including genome, transcriptome, proteome, and metabolome.17,18 This strategy is expected to identify and validate the reliable cancer-related mutations that are altered at high frequency. It is generally accepted that mRNA abundance is poorly correlated with protein abundance in a biological system, especially for expression products at low abundance.
It has been argued for years that the amount of translating mRNAs (mRNAs bound to the ribosome-nascent chain complex, RNC-mRNA) may better reflect protein abundance.19 Recently, Wang et al. systematically analyzed the relative abundances of mRNAs, RNC-mRNAs, and proteins on a genome-wide scale and reported that a strong correlation between the relative abundances of RNC-mRNAs and proteins could be established through a multivariate linear model by integrating the mRNA length as an element.20 The authors proposed that the intrinsic and genome-wide translation modulations at the translatomic level at the steady state are tightly correlated with protein abundance and functionally relevant to cellular phenotypes in human cells.20 On the basis of these findings, we adopted a transomic strategy to qualitatively and quantitatively monitor the changes of gene expression in liver cancer cell lines. We expect that the strategy, with one more dimension of omics data, will create a new scope to overview the expression status of the encoding genes on Chr.20, which is in accordance with the aim of the Chromosome-Centric Human Proteome Project (C-HPP).21,22 In this study, we selected three liver cancer cell lines and profiled the gene expression at three omics levels in these cells. There is a high incidence of liver cancer worldwide, and it is characterized by serious chromosomal abnormality and gene mutations. Cancer cell lines are an ideal model for the study of carcinogenic mechanisms. Hep3B has lost the p53 gene and behaves as a nonmetastatic cell line.23,24 MHCC97H and HCCLM3, with p53 mutations, exhibit 100% lung metastasis upon orthotopic inoculation, and HCCLM3 has additional lymph node metastasis ability.25 The three cell lines have been widely adopted in many research laboratories. On the basis of the qualitative and quantitative data from the transomics analysis of the three cell lines, the expression status of Chr.20 genes was mapped onto the chromosomal positions. It was revealed that the mRNAs from Chr.20 were close to the corresponding RNC-mRNAs in both the number identified and the abundance, while the proteins identified were obviously fewer than in the two mRNA data sets. Accordingly, the abundance distribution of the expressed Chr.20 genes followed similar modes along the subregions of Chr.20 in all three cancer cell lines. The Chr.20 mutations detected at the mRNA, RNC-mRNA, and protein levels and their chromosomal distribution in the three cell lines were analyzed in depth.
We obtained solid evidence, for the first time, that the SNV information in the genome could be transmitted through all expression stages, from mRNA to RNC-mRNA and from RNC-mRNA to protein. Meanwhile, the SNV types in Hep3B at both mRNA and protein were found to be different from those in MHCC97H and HCCLM3. It is worth investigating whether these SNVs may function as the biomarkers that enable discrimination of metastatic and nonmetastatic cancer cell lines.



MATERIALS AND METHODS

1. Sample Preparation and Transomics Data Acquisition

The three liver cancer cell lines Hep3B, MHCC97H, and HCCLM3 were cultured in H-DMEM supplemented with 10% fetal bovine serum at 37 °C and 5% CO2, with strict quality control to avoid contamination. The cells were harvested at ∼80% confluence and divided into two groups for RNA-Seq and LC−MS/MS analysis. For the sample preparation of transcriptomic and translatomic analysis, ribosome-nascent chain (RNC) extraction was performed as previously reported.20 In brief, the mRNA and RNC-mRNA were isolated separately from the cultured cells using TRIzol RNA extraction reagent (Ambion, Austin, TX). Equal amounts of mRNA or RNC-mRNA from three independent preparations were pooled; subsequently, the library was constructed using the NEBNext mRNA Sample Prep Master Mix Set. RNA sequencing was conducted on an Illumina HiSeq-2000 with 50 cycles. For the sample preparation of proteomic analysis, the harvested cells were washed with ice-cold PBS three times and lysed in a buffer containing 8 M urea, 50 mM NH4HCO3, and 5 mM IAA. The total cell lysates were centrifuged at 12 000g for 10 min at 4 °C to remove cell debris. The lysate was diluted to reduce the concentration of urea to ∼1 M, followed by sequential in-solution protein digestion with Lys-C and then trypsin at 37 °C. The resulting tryptic peptides were cleaned with a C18 Sep-Pak column (Waters UK, Manchester, U.K.) and fractionated by high-pH RP into 24 fractions. Peptides in each fraction were then analyzed by LC−MS/MS on a Q-Exactive (Thermo Fisher Scientific) equipped with an Easy-nLC nanoflow LC system (Thermo Fisher Scientific) and on a Triple TOF 5600 equipped with a nano-HPLC (Eksigent Technologies). Each fraction was analyzed in duplicate.

2. Data Analysis of Transcriptome and Translatome

The sequencing reads were mapped to the Ensembl-v72 mRNA reference sequences using the FANSe2 algorithm (http://bioinformatics.jnu.edu.cn/software/fanse2/) with the parameters −L55 −E3 −S14.26 Alternative splicing variants were merged. A threshold of at least 10 mapped reads per gene was set for confident gene identification and quantification. For calling single nucleotide variants (SNVs), TopHat was used to align reads to the hg19 reference sequences (http://genome.ucsc.edu/cgi-bin/hgGateway), and SAMtools mpileup was used for statistical evaluation of the aligned reads, requiring a minimum read depth of 10 and a maximum read depth of 2000.27,28 ANNOVAR was used to annotate the SNVs, and those SNVs in gene coding regions were collected to construct a database for further proteome data searching.29 The abundance of mRNA and RNC-mRNA was normalized and estimated as RPKM.30 Differential mRNAs and RNC-mRNAs across cell lines were evaluated with edgeR, using the trimmed mean of M values method and the negative binomial distribution.31
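The read-count threshold and RPKM normalization described above can be sketched as follows. This is a minimal illustration rather than the pipeline used in the study: it assumes the standard RPKM definition (reads per kilobase of transcript per million mapped reads), and the function names, gene names, and example counts are hypothetical.

```python
MIN_READS = 10  # identification threshold stated in the text


def rpkm(read_count: int, gene_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of transcript per million mapped reads."""
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and library size must be positive")
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def quantify(counts: dict, lengths: dict, total_mapped_reads: int) -> dict:
    """Return RPKM only for genes that pass the >=10 mapped-read threshold."""
    return {
        gene: rpkm(n, lengths[gene], total_mapped_reads)
        for gene, n in counts.items()
        if n >= MIN_READS
    }


# Hypothetical counts and lengths, for illustration only.
example_counts = {"GENE_A": 500, "GENE_B": 4}
example_lengths = {"GENE_A": 2000, "GENE_B": 1500}
example = quantify(example_counts, example_lengths, total_mapped_reads=20_000_000)
print(example)
```

Genes below the read threshold are dropped before normalization, so they count as not identified rather than as identified with a very low abundance.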

Figure 1. Expressed Chr.20 genes identified at three omics levels in the three liver cancer cell lines. (a) The overlapping of the identified genes in all three cell lines among transcriptome, translatome, and proteome. (b) Overlapping of the identified genes at transcriptome among the three liver cancer cell lines. (c) Overlapping of the identified genes at translatome among the three liver cancer cell lines. (d) Overlapping of the identified genes at proteome among the three liver cancer cell lines.

3. Data Analysis of Proteome

The raw MS/MS data were converted into MGF format by the MSconvert module32 in the Trans-Proteomic Pipeline (TPP v4.5.2), followed by protein search using Mascot 2.3.02 (Matrix Science, Boston, MA) against Swiss-Prot (20 258 proteins, release 2013_06). For the discovery of the mutated peptides, the MS/MS data were searched against the SNV database generated in this study. All identified peptides met the FDR criterion of ≤0.01. For protein quantification, the extracted ion chromatograms (XICs) corresponding to peptides were constructed, and the areas under the XIC curves were calculated as the peptide abundance. Then, the protein abundance was determined as the sum of the abundances of all of its unique peptides. The quantification algorithm was based on SILVER.33 The iBAQ index was used to represent protein abundance, and its median was normalized for comparison of the same protein in different data sets.

RESULTS

1. Expression Status of Chr.20 Genes in Three Liver Cancer Cell Lines

To explore the expression status of Chr.20 genes related to liver cancer, we performed a systematic transomics investigation in three liver cancer cell lines, Hep3B, HCCLM3, and MHCC97H, in which the expression of Chr.20 genes at the mRNA, RNC-mRNA, and protein levels was qualitatively and quantitatively monitored. The data illustrated in Figure 1a reveal that approximately 79 (438/555), 78 (429/555), and 44% (245/555) of the coding genes on Chr.20 are detected in three liver cancer cell lines using RNA-Seq for mRNA and RNC-mRNA and LC−MS/MS for proteins. The scale of the translatomic data is quite close to that of the transcribed genes, and the overlapping rate of transcriptome and translatome is as high as 95.5% (425 Chr.20 genes shared by mRNA and RNC-mRNA), indicating that most transcripts could bind to ribosomes. However, only ∼57% (243/425) of the transcribed genes were detected by the proteomic approach. Additionally, there are some unexplained results: 7 genes had read numbers >10 in RNC-mRNA but not in mRNA, while 2 proteins were identified with FDR < 1% by MASCOT but not detected by RNA-Seq in either mRNA or RNC-mRNA. Further comparison of the expression status of Chr.20 genes among the three liver cancer cell lines, as shown in Figure 1b−d, demonstrates that the expressed genes are highly overlapped in these cell lines at all three levels, 87% (379/438) for transcriptome, 84% (362/429) for translatome, and 77% (189/245) for proteome, respectively. The expression data indicate that the three cell lines possibly come from close origins. Because relatively few genes are expressed specifically in an individual cell line, quantitative analysis of the large number of expressed Chr.20 genes is necessary to find the metastasis-related genes. As in the earlier report, the chromosomal proteome data sets generated by the Chinese Human Chromosome Proteome Consortium are designated CCPD: the version from last year is CCPD1.0, and CCPD1.0 supplemented with the data of this year is CCPD2.0. The proteome data sets obtained in this study are termed CCPD2013. The database search of LC−MS/MS signals was first performed against Ensembl v72, and the proteins identified were compared with four public databases, GPMDB (http://gpmdb.proteome.ca/), PeptideAtlas

(http://www.peptideatlas.org/), HPA (http://www.proteinatlas.org/), and neXtProt. The results of proteomics analysis and the data comparison for Chr.20 genes are summarized in Table 1.

Table 1. Overlapping of Identified Proteins Encoded by Chr.20 Genes in CCPD with the Five Public Databases(a)

              Ensembl   GPMDB(b)   PeptideAtlas   HPA(c)   neXtProt(d)
Database      555       386        330            238      429
CCPD 1.0      319       308        274            172      309
LCCPD 2013    252       248        241            141      249
CCPD 2.0      335       322        285            181      323

(a) Five databases and their versions are Ensembl (v72, release Jun. 2013), GPMDB (release Jul. 2013), PeptideAtlas (release Dec. 2012), ProteinAtlas (HPA, v 11.0), and neXtProt (release Jun. 2013). (b) Data set annotated as “Green” was used, with the threshold “>20 Observations and log(e) < −5”. (c) Data set with HPA evidence medium/high was used. (d) Data set annotated as “protein evidence” was used.

Compared with the proteins encoded by Chr.20 genes and identified from liver cancer cells last year, CCPD2013 exhibits an obviously improved rate of protein identification, with 125 more proteins.34 The number of identified Chr.20 proteins in CCPD2.0, however, is not significantly changed versus CCPD1.0. This is understandable because the additional proteins in CCPD2.0 come only from the proteomic analysis of the liver cancer cell lines. Of the coding genes of Chr.20 in Ensembl v72, 40% remain unidentified in CCPD2.0. With regard to the four public databases, CCPD2.0 contains some identified proteins derived from Chr.20 genes that are not covered by them: 13, 50, 154, and 10 proteins absent from GPMDB, PeptideAtlas, HPA, and neXtProt, respectively.

2. Correlation of Gene Expression Status with the Subregions of Chr.20

We further mapped the expression status of Chr.20 genes onto their chromosomal locations. As shown in Table 2, the ratios of genes detected by mRNA, RNC-mRNA, or protein vary greatly among the subregions, from 33 to 100% in mRNA or RNC-mRNA and from 0 to 89% in protein. With the Fisher test, the likelihood of gene expression at all three levels in three subregions, p12.3, q11.22, and q13.13, is significantly high, with p values ≤0.05 and detection ratios above 90% at mRNA or RNC-mRNA and above 60% at protein (Table 2). The detection ratios of gene expression in q11.21 are obviously lower, 51, 60, and 31% at mRNA, RNC-mRNA, and protein, respectively, with p values over 0.95, indicating that the genes in this subregion are possibly not preferentially transcribed and translated. Intriguingly, the average gene lengths in the high gene expression regions, p12.3, q11.22, and q13.13, ranged from 1.6 kb to 2.6 kb, whereas the value in q11.21 is only 0.97 kb, implying that the likelihood of gene expression seems somehow correlated with gene length, at least in Chr.20. In most subregions, the more mRNAs and RNC-mRNAs were found, the more proteins were identified. However, there are two subregions, q13.12 and p11.23, in which the ratios of proteins identified are obviously different from the average value, 30% in q13.12 and 61% in p11.23, whereas the ratios of mRNA and RNC-mRNA in these subregions are close to the average level of ∼79%. It merits attention that the proteins encoded by the p11.23 genes possess relatively larger molecular mass and higher hydrophilicity, while those of the q13.12 genes have relatively lower molecular weight and higher hydrophobicity. The extreme behavior of protein expression in the two subregions can likely be attributed to these distinct biophysical properties (Figures S1 and S2 in the Supporting Information). As previously reported, no protein products encoded by defensin genes located on Chr.20 were detected in three human tissues and several cancer cell lines.35 The transomics study here offers further strong support for this finding.
A total of 13 defensin genes of Chr.20 are located in q11.21 and p13; however, no transcriptional or translational signals are detected

Table 2. Gene Coverage of Each Subregion on Chr.20 Identified at mRNA, RNC-mRNA, and Proteins

subregion  genes  mRNAs  ratio(a)  p value  RNC-mRNAs  ratio(b)  p value  proteins  ratio(c)  p value
p13        87     66     76%       0.78     67         77%       0.58     35        40%       0.83
p12.3      16     15     94%       0.02     15         94%       0.01     10        63%       0.05
p12.2      9      5      56%       0.90     5          56%       0.89     3         33%       0.65
p12.1      16     13     81%       0.33     13         81%       0.29     9         56%       0.12
p11.23     23     18     78%       0.47     18         78%       0.42     14        61%       0.04
p11.22     3      1      33%       0.89     1          33%       0.88     0         0         0.83
p11.21     30     22     73%       0.74     21         70%       0.82     11        37%       0.79
p11.1      3      3      100%      0.00     3          100%      0.00     1         33%       0.43
q11.1      0      0      0         NA       0          0         NA       0         0         NA
q11.21     47     24     51%       1.00     28         60%       0.99     15        32%       0.96
q11.22     40     39     98%       0.00     39         98%       0.00     26        65%       0.003
q11.23     40     34     85%       0.13     34         85%       0.10     23        58%       0.04
q12        7      7      100%      0.00     7          100%      0.00     4         57%       0.15
q13.11     2      1      50%       0.63     1          50%       0.61     1         50%       0.20
q13.12     84     65     77%       0.65     61         73%       0.89     25        30%       0.99
q13.13     26     26     100%      0.00     26         100%      0.00     21        81%       0.00
q13.2      17     14     82%       0.28     14         82%       0.25     10        59%       0.08
q13.31     14     12     85%       0.18     10         71%       0.64     3         21%       0.94
q13.32     19     15     79%       0.43     14         74%       0.60     10        53%       0.19
q13.33     73     62     85%       0.07     59         81%       0.24     33        45%       0.47
total      556    442    79%       NA       436        78%       NA       253       47%       NA

(a) Number of identified genes in mRNA/the number of genes in the corresponding subregion. (b) Number of identified genes in RNC-mRNA/the number of genes in the corresponding subregion. (c) Number of identified proteins in proteome/the number of genes in the corresponding subregion.
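The subregion comparison summarized in Table 2 can be illustrated with a one-sided Fisher exact test. The sketch below is an illustration under stated assumptions rather than the authors' procedure: the text does not say how the contingency table was built or whether the test was one- or two-sided, so the code compares detected and undetected genes inside one subregion against the rest of Chr.20 and asks whether detection is enriched. The function name `enrichment_p` is hypothetical, and the p values it returns need not match those printed in Table 2.

```python
from math import comb


def enrichment_p(detected_in: int, genes_in: int,
                 detected_total: int, genes_total: int) -> float:
    """One-sided Fisher exact p value (hypergeometric upper tail).

    Probability of seeing at least `detected_in` detected genes among the
    `genes_in` genes of a subregion if the `detected_total` detected genes
    were spread at random over all `genes_total` genes.
    """
    undetected_total = genes_total - detected_total
    denom = comb(genes_total, genes_in)
    upper = min(genes_in, detected_total)
    tail = 0
    for k in range(detected_in, upper + 1):
        # math.comb returns 0 when genes_in - k exceeds undetected_total
        tail += comb(detected_total, k) * comb(undetected_total, genes_in - k)
    return tail / denom


# mRNA counts taken from Table 2 (chromosome totals: 442 detected of 556 genes).
p_q11_22 = enrichment_p(39, 40, 442, 556)  # 39 of 40 genes detected
p_p13 = enrichment_p(66, 87, 442, 556)     # 66 of 87 genes detected
print(f"q11.22: {p_q11_22:.4f}  p13: {p_p13:.4f}")
```

Exact integer arithmetic with `math.comb` avoids the floating-point underflow that a naive product of probabilities could hit for larger gene sets.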

by RNA-Seq or LC−MS/MS in all three liver cancer cell lines (Table S2 in the Supporting Information). There are in total around 40 defensin genes in the human genome, but only two genes, DEFB1 and DEFBA104, on Chr.8 were detected with mRNA. No protein signals for the two genes were detected either. The sequence alignment of DEFB1 and DEFBA104 with the defensins on Chr.20 shows little sequence homology (Figure S3 in the Supporting Information), which excludes the possibility of biased matching of mRNA reads to the two genes. These findings are consistent with previous reports that the expression of defensin genes is usually restricted to a few tissues and cells, such as the respiratory, gastrointestinal, and genitourinary tracts, skin, and circulating blood cells.36 There is no report about these genes in liver cells. Our observation here provides strong evidence of the deficient expression of defensin genes in liver cancer cells, on either Chr.8 or Chr.20. How the expression of defensin genes is regulated in relation to the chromosome has yet to be understood. In addition, some defensin proteins possibly block tumor proliferation and trigger antitumor immunity37,38 and were reported to be specifically lost in cancer tissue while remaining at a high level in the control tissues, such as DEFB1.39,40

Figure 2. Distribution of the average abundance for the expressed genes in Chr.20 at the subregions. Upper panel and lower panel: the distribution of the average abundance for the expressed genes in Chr.20 p arm and q arm. The y axes assigned as RPKM and iBAQ represent the mRNA and protein abundance, respectively. The triangles represent differential mRNAs or proteins, taking Hep3B as the reference.

The expression abundance of Chr.20 genes along the chromosomal subregions is outlined in Figure 2. The pattern of mRNA abundance distribution in the subregions of Chr.20 is close to that of RNC-mRNA, whereas that of protein abundance seems quite distinct from the corresponding mRNAs. Because the abundance distributions of mRNA and RNC-mRNA largely overlap, the transcriptional information transferred from Chr.20 genes is expected to be mostly delivered to the early translational stage, in which mRNA binds to the ribosome. Looking closely at the subregions with higher protein abundance, such as p13, p11.21, q11.22, and q11.23, the abundance profiles of proteins are not so different from those of the corresponding mRNA and RNC-mRNA. We postulate that the difference between the abundance distribution of proteins and that of mRNAs on Chr.20 is likely due to the lower ratio of protein identification. In Figure 2, the distribution curves of expression abundance among the three cancer cell lines are compared in parallel. Although in most subregions the abundance profiles of mRNA and RNC-mRNA in three cell lines are similar, the abundance of mRNA and RNC-mRNA of Hep3B in some subregions, such as p12.3, p11.2, and q13.12, is slightly different from that of the other two cell lines. On the basis of the expression abundance of Chr.20 genes and taking Hep3B as a reference, the differential genes potentially related to metastasis in MHCC97H and HCCLM3 are defined by setting stringent criteria. In MHCC97H, 46, 42, and 31 differential genes were defined, while in HCCLM3, 39, 42, and 23 differential genes were found at the transcriptomic, translatomic, and proteomic levels, respectively (Table S3 in the Supporting Information). The overlapping of differential genes between MHCC97H and HCCLM3 is over 80% at either mRNA or RNC, whereas it is significantly lower at protein, 20% of the genes in q11.23, q13.13, and q13.2 are suspected to be the differential genes, whereas

Figure 3. Comparison and cluster analysis of the mutated genes and sites identified in transcriptome and translatome. (a) Overlapping of the identified mutated genes in this study and COSMIC. (b) Overlapping of the identified mutated sites in this study and COSMIC. (c) Heatmap of mutated sites among three liver cancer cell lines.

(Figure 3a, Table S4 in the Supporting Information). Further analysis reveals that of these genes, 82 have a single variant site, 30 have 2 variant sites, and 11 have 3 or more variant sites. Two genes, LAMA5 and PRIC285, contain more than 5 variant sites, with 11 and 13, respectively. The gene LAMA5 was reported to be involved in the regulation of cell migration, growth, and proliferation, and its mutated forms were found in cancer samples, such as gastric cancer and intestinal cancer. The gene PRIC285 functions as a coregulator of peroxisome proliferator-activated receptor γ, but its involvement in carcinogenesis has not yet been reported. In a comparison with COSMIC, the public cancer-related SNV database, ∼88% of the identified SNV genes are covered by the database, suggesting that these variant sites have a high likelihood of being cancer-related. The 136 SNV genes in Chr.20 were identified with a total of 203 variant sites, including 162 from mRNA and 176 from RNC-mRNA, and the overlapped sites are also ∼80% (Figure 3b), implying that the majority of SNV information in mRNA could be accurately transmitted to RNC-mRNA. In a comparison of these sites with COSMIC, only ∼13% (26 mutation sites) are covered, and of the 26 sites, 21 are identified by both mRNA and RNC-mRNA. These SNV sites shared by the different data sets may really represent a biological correlation with liver carcinogenesis. Moreover, the SNV sites not covered by COSMIC but confirmed by both mRNA and RNC-mRNA may offer a new clue for exploring potential candidates for metastasis-related biomarkers. The SNV sites in three liver cancer cell lines are clustered into three groups, as depicted in Figure 3c. One group includes 87 sites in 72 genes, found in MHCC97H and HCCLM3 but not in Hep3B, while another group has 73 sites in 57 genes, found only in Hep3B but not in MHCC97H or HCCLM3. The third group contains the SNV sites shared by all three cell lines, 25 sites in 23 genes.
Of the SNV sites shared by MHCC97H and HCCLM3, 12 are annotated as cancer-related in COSMIC, such as LAMA5 at Chr.20_60887581_C→T in breast cancer and DIDO1 at Chr.20_61528271_G→A in gastric cancer, and of the SNV sites detected only in Hep3B, 10 are documented as cancer-related, such as SIRPA at Chr.20_1895889_G→C in prostate cancer. The SNV sites

25%. Looking at the rates of the individual nucleotides at each SNV site, there seems to be an order of variation preference of A > G > T > C; however, in terms of substitution rate, there seem to be two dominant substitutions, G→A and A→G, accounting for 23 and 18% of all substitution types, suggesting that these two changes are very active (Figure S4

in the Supporting Information). Comparison of the substitution types detected in mRNA and RNC-mRNA reveals that most types show no change between the transcription and ribosome-binding stages; however, some substitution types changed greatly: 14 C→G, 15 C→T, and 13 T→C were detected in mRNA, whereas there were 5, 21, and 18 of the same types in RNC-mRNA, respectively.

4. SNVs of Chr.20 Genes Identified in Proteome

Are the SNVs in cancer cells finally delivered into the functional molecules? Although current proteomics techniques are still limited in identifying enough unique peptides to cover the whole amino acid sequence of a protein, high-resolution mass spectrometry is able to detect some mutated peptides. To ensure reliable discovery of mutated peptides, we identified these peptides on the basis of high-quality and repeated MS/MS spectra (Figure S5 in the Supporting Information). For instance, the mutated sites A303S and D502N in PYGB are confirmed by 78 and 88 MS/MS spectra, respectively, and the 10 mutated sites in RRBP1, DIDO1, BCAS1, NSFL1C, and CPNE1 are matched with more than 10 spectra at each site.

We used the MRM approach to validate all of the mutated peptides under the LC condition without prefractionation. Because the peptide complexity in the unfractionated samples is too high to detect all target peptides, only six mutated peptides were found with satisfactory MRM signals. Figure 5 illustrates the quantitative distribution of two mutated peptides, wLFPTGGSVR (r→w) in LAMA5 and LAaETGEGEGEPLSR (t→a) in DIDO1, in three liver cancer cell lines. Because the signals of the two mutated peptides are much lower in Hep3B, a logical deduction is that they are likely to be representative of the two metastatic cell lines, and the mutated peptides are potential candidates related to metastasis. The other four mutated peptides confirmed by MRM are listed in Figure S6 in the Supporting Information.

As shown in Table 3, 20 variant sites contained in 16 unique proteins encoded by genes in Chr.20 were identified. The corresponding mRNAs or RNC-mRNAs of the majority of mutated proteins (10/16) have multiple SNV sites, whereas only three proteins, LAMA5, PYGB, and RRBP1, were detected with multiple mutated peptides. For LAMA5, 2 of the 13 sites at mRNA and RNC-mRNA were identified by proteome, ALFSQISSAVsLR (f→s) and wLFPTGGSVR (r→w); for RRBP1, all 3 variant sites, LhSLTQAK (l→h), LLAaEQEDAAVAK (t→a), and LTAEFEEAQTSAClLQEELEK (r→l); and for PYGB, both of the 2 variant sites, LKQEYFVVAsTLQDIIR (a→s) and RWLLLCNPGLAnTIVEK (d→n). Checking the SNV information transmitted from mRNA to protein, we found that 14 out of 20 sites had detectable variant messages through all three expression levels (Table 3). Of the mutated peptides, more mutation signals appear detectable for the metastatic cell lines, especially for MHCC97H. More specifically, some mutated sites detected only in MHCC97H and HCCLM3 at both mRNA and protein are possibly metastasis-related indicators, such as VEDQENEPEAETYk (q→k) in BCAS1, ALFSQISSAVsLR (f→s) in LAMA5, ASSSILInESEPTTNIQIR (d→n) in NSFL1C, and LTAEFEEAQTSAClLQEELEK (r→l) in RRBP1. The mutated sites detected only in Hep3B are likely to be non-metastasis-related, such as FSVPVQHFCGGNPSTPIQVr (q→r) in CPNE1, GCELVDLADEVASVYeSYQPR (q→e) in MAVS, DNWNRPICSAPGPLFDvMER (l→v) in SPTLC3, and LLAaEQEDAAVAK (t→a) in RRBP1.

This study has begun to explore the SNVs and expressed Chr.20 genes in cancer cell lines using transomics approaches. The transomics evidence paves an avenue for validation of the metastasis-related candidates in liver cancer in further investigations.

Figure 5. Validation of the mutated peptides in LAMA5 and DIDO1 by MRM. Upper panel: the overlays of MRM signals for the peptide of LAMA5, wLFPTGGSVR (r → w) in three liver cancer cell lines. Lower panel: the overlays of MRM signals for the peptide of DIDO1, LAaETGEGEGEPLSR (t → a), in three liver cancer cell lines.
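The m/z values reported for the mutated peptides in Table 3 can be checked against theoretical values. The sketch below is illustrative only: it uses standard monoisotopic residue masses and ignores all modifications, so as written it is not suitable for cysteine-containing peptides if alkylation by IAA is assumed. The function name `peptide_mz` is hypothetical and is not part of the software used in the study.

```python
# Standard monoisotopic residue masses (Da).
RESIDUE_MASS = {
    "G": 57.021464, "A": 71.037114, "S": 87.032028, "P": 97.052764,
    "V": 99.068414, "T": 101.047679, "C": 103.009185, "L": 113.084064,
    "I": 113.084064, "N": 114.042927, "D": 115.026943, "Q": 128.058578,
    "K": 128.094963, "E": 129.042593, "M": 131.040485, "H": 137.058912,
    "F": 147.068414, "R": 156.101111, "Y": 163.063329, "W": 186.079313,
}
WATER = 18.010565   # added once per peptide (N-terminal H plus C-terminal OH)
PROTON = 1.007276   # mass of each charge-carrying proton


def peptide_mz(sequence: str, charge: int) -> float:
    """Theoretical m/z of an unmodified peptide at the given charge state."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    # Table 3 marks the substituted residue in lower case, so normalise case.
    neutral = sum(RESIDUE_MASS[aa] for aa in sequence.upper()) + WATER
    return (neutral + charge * PROTON) / charge


# Two mutated peptides from Table 3 with their reported m/z (both 2+).
for seq, reported in [("wLFPTGGSVR", 560.3009), ("LhSLTQAK", 449.2613)]:
    print(seq, round(peptide_mz(seq, 2), 4), reported)
```

For these two cysteine-free peptides the computed values agree with the reported ones to within 0.001, which is the kind of consistency check that can catch a misassigned substitution before MRM validation.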



DISCUSSION

Under the guidance of C-HPP, we proposed a transomics strategy aimed at unveiling the correlations between gene expression and chromosome. We found that >95% of the Chr.20 genes identified by RNA-Seq at the mRNA level were also identified at the RNC-mRNA level, and that the abundance distributions of the two mRNA populations along the Chr.20 subregions follow a similar pattern. Moreover, >80% of the SNVs are shared by the mRNAs and RNC-mRNAs encoded by Chr.20 genes. This high correlation leads to the conclusion that the delivery of mRNA to RNC-mRNA is very efficient, with nearly complete conversion, at least for Chr.20. Furthermore, the transomics data from the three different

dx.doi.org/10.1021/pr400899b | J. Proteome Res. 2014, 13, 200−211

Table 3. Identified Mutated Proteins and Peptides in the Three Liver Cancer Cell Lines

| gene | chromosome location | mut. peptide | mut. sites | m/z of mutated peptides | score | 97H (A/C/P) | Hep3B (A/C/P)a | LM3 (A/C/P) | COSMIC |
|---|---|---|---|---|---|---|---|---|---|
| AURKA | 20q13 | VLVTQQiPCQNPLPVNSGQAQR | F→I | 868.466 (3+) | 84 | ○○● | ●●○ | ●●○ | no |
| BCAS1 | 20q13.2 | VEDQENEPEAETYk | Q→K | 840.8654 (2+) | 106 | ●●● | ○○○ | ●●● | no |
| CPNE1 | 20q11.22 | FSVPVQHFCGGNPSTPIQVr | Q→R | 743.0407 (3+) | 154 | ○○○ | ○●● | ○○○ | no |
| DIDO1 | 20q13.33 | LAaETGEGEGEPLSR | T→A | 758.3679 (2+) | 116 | ●●● | ●●● | ●●○ | no |
| KIAA0889 | 20q11.23 | GPGPGSAVACSAAsSSRPDK | F→S | 620.2953 (3+) | 40 | ●●● | ●●○ | ●●○ | no |
| KIF16B | 20p11.23 | HSTLGtEIEEQR | M→T | 700.3442 (2+) | 88 | ●●● | ●●○ | ●●○ | no |
| LAMA5 | 20q13.2-q13.3 | ALFSQISSAVsLR | F→S | 689.8881 (2+) | 27 | ●●● | ○○○ | ●●○ | no |
| LAMA5 | 20q13.2-q13.3 | wLFPTGGSVR | R→W | 560.3009 (2+) | 23 | ●●● | ○○○ | ●●○ | yes |
| MAVS | 20p13 | GCELVDLADEVASVYeSYQPR | Q→E | 800.7056 (3+) | 120 | ●●○ | ○●● | ●●○ | no |
| NPEPL1 | 20q13.32 | ASEDPLLNLVSPLGCEVDVEEGDvGR | L→V | 923.7798 (3+) | 33 | ●●● | ●●○ | ●●○ | no |
| NSFL1C | 20p13 | ASSSILInESEPTTNIQIR | D→N | 1037.0448 (2+) | 154 | ●●● | ○○○ | ●●● | yes |
| PLCG1 | 20q12-q13.1 | EDELTFtK | I→T | 491.7400 (2+) | 39 | ●●● | ○●● | ●●○ | no |
| PSMF1 | 20p13 | QDALVCFLHWEVVTHGYcGLGVGDQPGPNDK | F→C | 867.9079 (4+) | 72 | ○○○ | ○●● | ○○○ | no |
| PYGB | 20p11.21 | LKQEYFVVAsTLQDIIR; QEYFVVAsTLQDIIR | A→S | 1012.0648 (2+); 891.4753 (2+) | 115 | ○●● | ○○● | ●●● | yes |
| PYGB | 20p11.21 | WLLLCNPGLAnTIVEK | D→N | 921.0031 (2+) | 111 | ●●● | ○○● | ●●● | no |
| RRBP1 | 20p12 | LhSLTQAK | L→H | 449.2613 (2+) | 23 | ●●● | ○●○ | ●●○ | no |
| RRBP1 | 20p12 | LLAaEQEDAAVAK | T→A | 664.8564 (2+) | 103 | ○○○ | ○●● | ○○○ | no |
| RRBP1 | 20p12 | LTAEFEEAQTSAClLQEELEK | R→L | 1220.0834 (2+) | 154 | ●●● | ○○○ | ●●● | no |
| SEC23B | 20p11.23 | GAIQFVTHYQqSSTQR | H→Q | 925.9608 (2+) | 80 | ●●● | ○●● | ○○○ | no |
| SPTLC3 | 20p12.1 | DNWNRPICSAPGPLFDvMER | L→V | 792.0400 (3+) | 36 | ●○○ | ●●● | ○○○ | no |

a A, mRNA; C, RNC-mRNA; P, proteome.
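The m/z values listed in Table 3 can be cross-checked from the peptide sequences alone. The sketch below computes the monoisotopic m/z of a peptide at a given charge from standard residue masses. It is an illustrative calculation, not the authors' software: the function name and the `cys_mod` parameter are hypothetical, and treating cysteine as carbamidomethylated (+57.02146 Da) is an assumption that happens to reproduce the tabulated value for the PYGB D→N peptide.

```python
# Standard monoisotopic residue masses (Da) of the 20 amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # mass of H2O added for the peptide termini
PROTON = 1.007276   # mass of one proton per unit of charge


def peptide_mz(sequence: str, charge: int, cys_mod: float = 0.0) -> float:
    """Monoisotopic m/z of a peptide; cys_mod is added once per cysteine."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    seq = sequence.upper()  # Table 3 writes the variant residue in lower case
    neutral = sum(RESIDUE_MASS[aa] for aa in seq) + WATER
    neutral += cys_mod * seq.count("C")
    return (neutral + charge * PROTON) / charge


if __name__ == "__main__":
    # Cysteine-free peptides from Table 3, all doubly charged.
    for peptide in ("wLFPTGGSVR", "LAaETGEGEGEPLSR", "QEYFVVAsTLQDIIR"):
        print(peptide, round(peptide_mz(peptide, 2), 4))
    # Assumed carbamidomethylated cysteine for the PYGB D->N peptide.
    print("WLLLCNPGLAnTIVEK", round(peptide_mz("WLLLCNPGLAnTIVEK", 2, 57.02146), 4))
```

Agreement to within a few thousandths of an m/z unit with the tabulated values is a quick consistency check that each peptide, charge state, and substitution in the table belong to the same row.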


techniques support the discovery of differential genes and SNVs between the metastatic and nonmetastatic cell lines. In contrast with the high coverage and sensitivity at the mRNA and RNC-mRNA levels, the proteomics data cover only about half of the Chr.20 genes at the protein level, and far fewer of the peptides carrying SNVs. This most likely results from the technical limitations of proteomics and from the poor understanding of translation rates and protein stability. Our results regarding the SNVs in liver cancer cell lines are partially in agreement with other reports in cancer genomics. Importantly, we found that although as many as ∼80% of SNVs could be transmitted from mRNA to RNC-mRNA, the mutated ribosome-bound mRNAs do not appear to be translated into mutated proteins as efficiently, as only 12% of the SNVs in RNC-mRNA were verified at the protein level. We tend to attribute the low detection of SNVs in peptides to the current limitations of proteomics techniques rather than to a special biological process. Once the peptide coverage per protein increases, the number of peptides with SNV signals is expected to rise dramatically. Even though SNV peptide detection is not yet satisfactory, our study does provide first-hand experimental evidence that SNV messages can be delivered from mRNA to protein.

Cancer genomics projects have provided strong evidence of a huge number of somatic mutations in cancer tissues and cells. A reasonable hypothesis for the mechanism of these mutations is that mutations occurring within the exome can change amino acid residues. Changes in key residues of functional domains may activate or inactivate a protein in cancer and distort metabolic or signaling pathways. This deduction has been supported by abundant evidence from mRNA measurements but not by protein validation, apart from a few Western blot experiments.41 The data in Table 3 demonstrate that SNVs in mRNA or RNC-mRNA are translated into variations in protein. Our study therefore offers solid support for the hypothesis derived from cancer genomics and shows that transomics analysis is a feasible approach to exploring biomarkers of cancer and metastasis based on mutated residues.

According to the transomics analysis of the Chr.20 genes in the liver cancer cells, some genes with SNVs may be worth further investigation in cancer research, such as LAMA5 and DIDO-1. The protein product of LAMA5 is laminin α5, which belongs to the laminin family of extracellular matrix glycoproteins, the noncollagenous components of basement membranes. Laminin α5 mainly combines with laminin β and γ chains to form Laminin-511 (α5β1γ1) and Laminin-521 (α5β2γ1). These proteins mediate cell-matrix adhesion and therefore regulate the migration, growth, proliferation, and differentiation of various cell types, indicating their possible roles in cancer metastasis.42 Although Laminin-511 and Laminin-521 have not been reported to be involved in carcinoma-related pathways, other members of the laminin family, such as laminin-332 (α3β3γ2), have been documented to mediate the invasion of gastric carcinoma cells and to enhance the metastatic potential of breast cancer cells.43−47 In our study, 13 SNV sites in LAMA5 were identified at the mRNA and RNC-mRNA levels, and 2 were confirmed in the proteome. Moreover, the abundance of the SNV-containing proteins in the metastatic cell lines differed from that in the nonmetastatic cell line. We thus hypothesize that mutant LAMA5 is involved in liver cancer metastasis.

It was reported that overexpression of death-inducer obliterator protein 1 (DIDO-1) induced cell apoptosis,48 while a more recent study shows that DIDO-1 enhances the expression of integrin αV and consequently promotes the attachment, migration, invasion, and apoptosis resistance of cancer cells.49 Here DIDO-1 was detected with three mutation sites at either the mRNA or the RNC-mRNA level, and one was verified in the proteome. Importantly, an increased abundance of the DIDO-1 mutant was observed in the metastatic liver cancer cell lines, which suggests that mutated DIDO-1 possibly inhibits apoptosis and regulates cell invasion ability.



CONCLUSIONS

From the transomics data of three liver cancer cell lines, ∼82% of Chr.20 genes are confirmed with expression signals at either the mRNA or the protein level, which provides a reference set of active Chr.20 genes in such samples. Of these expressed genes, 136 were detected with SNVs; over 40% of the SNV sites are specific to the metastatic cell lines, while 36% are specific to the nonmetastatic cell line, suggesting that these SNVs are possible metastasis-related markers. Over 80% of the SNVs detected in the transcriptome and translatome overlap, implying that SNV information is transmitted efficiently from transcripts to translating mRNAs. Furthermore, 20 SNV sites in 16 genes were detected in the proteomics data, with 6 sites validated by the MRM approach, illustrating that SNVs can be transmitted sequentially from mRNA to protein. The roles in metastasis pathways of some SNV-containing genes, such as DIDO1 and LAMA5, merit further investigation. By integrating the transomics data, we provide, for the first time, evidence of SNVs at both the mRNA and protein levels, offering a constructive clue for discovering metastasis-related marker genes on Chr.20.
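The overlap percentages quoted above are set ratios, which can be made explicit with a short Python sketch. The function name and the toy SNV identifiers below are hypothetical and are not data from this study; the choice of denominator (the size of the reference set rather than the union) is also an assumption about how such a percentage is defined.

```python
from typing import Hashable, Set


def shared_fraction(reference: Set[Hashable], other: Set[Hashable]) -> float:
    """Fraction of items in `reference` that are also present in `other`."""
    if not reference:
        raise ValueError("reference set must not be empty")
    return len(reference & other) / len(reference)


if __name__ == "__main__":
    # Toy SNV identifiers (position, reference base, variant base); hypothetical.
    mrna_snvs = {(101, "A", "G"), (202, "C", "T"), (303, "G", "A"),
                 (404, "T", "C"), (505, "A", "T")}
    rnc_snvs = {(101, "A", "G"), (202, "C", "T"), (303, "G", "A"),
                (404, "T", "C"), (606, "G", "C")}
    print(f"{shared_fraction(mrna_snvs, rnc_snvs):.0%} of mRNA SNVs are also in RNC-mRNA")
```

Because the ratio depends on which set is the denominator, a reported overlap is only comparable across studies when that choice is stated.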



ASSOCIATED CONTENT

Supporting Information

This work contains supplementary Tables S1−S4 and Figures S1−S6. Figures S1 and S2 show the molecular weight and hydrophobicity of proteins encoded by genes in each subregion of Chr.20. Figure S3 is the sequence alignment of the defensin protein family using CLUSTAL 2.1. Figure S4 shows the nucleotide substitution types at all the mutation sites. Figure S5 lists the MS/MS spectra of all the identified mutated peptides, and Figure S6 shows the MRM signals of the additional validated mutated peptides. These materials are available free of charge via the Internet at http://pubs.acs.org.



AUTHOR INFORMATION

Corresponding Authors

*Siqi Liu: Tel/Fax: 86-10-80485324. E-mail: siqiliu@genomics.org.cn.
*Liang Lin: Tel/Fax: 86-755-25274284. E-mail: linl@genomics.org.cn.
*Ping Xu: Tel/Fax: 86-10-80705155. E-mail: xupingghy@gmail.com.
*Qing-Yu He: Tel/Fax: 86-20-85227039. E-mail: tqyhe@jnu.edu.cn.
*Fan Zhong: Tel/Fax: 86-21-54237158. E-mail: zonefan@163.com.

Author Contributions

▽Quanhui Wang, Bo Wen, Tong Wang, Zhongwei Xu, and Xuefei Yin contributed equally to this work.

Notes

The authors declare no competing financial interest.


(6) Karhu, R.; Mahlamaki, E.; Kallioniemi, A. Pancreatic adenocarcinoma – genetic portrait from chromosomes to microarrays. Genes, Chromosomes Cancer 2006, 45 (8), 721−30. (7) Hodgson, J. G.; Chin, K.; Collins, C.; Gray, J. W. Genome amplification of chromosome 20 in breast cancer. Breast Cancer Res. Treat. 2003, 78 (3), 337−45. (8) Tabach, Y.; Kogan-Sakin, I.; Buganim, Y.; Solomon, H.; Goldfinger, N.; Hovland, R.; Ke, X. S.; Oyan, A. M.; Kalland, K. H.; Rotter, V.; Domany, E. Amplification of the 20q chromosomal arm occurs early in tumorigenic transformation and may initiate cancer. PLoS One 2011, 6 (1), e14632. (9) Ndegwa, N.; Cote, R. G.; Ovelleiro, D.; D’Eustachio, P.; Hermjakob, H.; Vizcaino, J. A.; Croft, D. Critical amino acid residues in proteins: a BioMart integration of Reactome protein annotations with PRIDE mass spectrometry data and COSMIC somatic mutations. Database 2011, 2011, bar047. (10) Previati, M.; Manfrini, M.; Galasso, M.; Zerbinati, C.; Palatini, J.; Gasparini, P.; Volinia, S. Next generation analysis of breast cancer genomes for precision medicine. Cancer Lett. 2013, 339 (1), 1−7. (11) Wheeler, D. A.; Wang, L. From human genome to cancer genome: The first decade. Genome Res. 2013, 23 (7), 1054−62. (12) Makohon-Moore, A.; Brosnan, J. A.; Iacobuzio-Donahue, C. A. Pancreatic cancer genomics: insights and opportunities for clinical translation. Genome Med. 2013, 5 (3), 26. (13) Garraway, L. A. Genomics-driven oncology: framework for an emerging paradigm. J. Clin. Oncol. 2013, 31 (15), 1806−14. (14) Grasso, C. S.; Wu, Y. M.; Robinson, D. R.; Cao, X.; Dhanasekaran, S. M.; Khan, A. P.; Quist, M. J.; Jing, X.; Lonigro, R. J.; Brenner, J. C.; Asangani, I. A.; Ateeq, B.; Chun, S. Y.; Siddiqui, J.; Sam, L.; Anstett, M.; Mehra, R.; Prensner, J. R.; Palanisamy, N.; Ryslik, G. A.; Vandin, F.; Raphael, B. J.; Kunju, L. P.; Rhodes, D. R.; Pienta, K. J.; Chinnaiyan, A. M.; Tomlins, S. A. The mutational landscape of lethal castration-resistant prostate cancer.
Nature 2012, 487 (7406), 239−43. (15) Barbieri, C. E.; Bangma, C. H.; Bjartell, A.; Catto, J. W.; Culig, Z.; Gronberg, H.; Luo, J.; Visakorpi, T.; Rubin, M. A. The mutational landscape of prostate cancer. Eur. Urol. 2013, 64 (4), 567−76. (16) Samuel, N.; Hudson, T. J. Translating genomics to the clinic: implications of cancer heterogeneity. Clin. Chem. 2013, 59 (1), 127−37. (17) Liu, L. Y.; Yang, T.; Ji, J.; Wen, Q.; Morgan, A. A.; Jin, B.; Chen, G.; Lyell, D. J.; Stevenson, D. K.; Ling, X. B.; Butte, A. J. Integrating multiple 'omics' analyses identifies serological protein biomarkers for preeclampsia. BMC Med. 2013, 11 (1), 236. (18) Berghoff, B. A.; Konzer, A.; Mank, N. N.; Looso, M.; Rische, T.; Forstner, K. U.; Kruger, M.; Klug, G. Integrative "omics"-approach discovers dynamic and regulatory features of bacterial stress responses. PLoS Genet. 2013, 9 (6), e1003576. (19) Pradet-Balade, B.; Boulme, F.; Beug, H.; Mullner, E. W.; Garcia-Sanz, J. A. Translation control: bridging the gap between genomics and proteomics? Trends Biochem. Sci. 2001, 26 (4), 225−9. (20) Wang, T.; Cui, Y.; Jin, J.; Guo, J.; Wang, G.; Yin, X.; He, Q. Y.; Zhang, G. Translating mRNAs strongly correlate to proteins in a multivariate manner and their translation ratios are phenotype specific. Nucleic Acids Res. 2013, 41 (9), 4743−54. (21) Legrain, P.; Aebersold, R.; Archakov, A.; Bairoch, A.; Bala, K.; Beretta, L.; Bergeron, J.; Borchers, C. H.; Corthals, G. L.; Costello, C. E.; Deutsch, E. W.; Domon, B.; Hancock, W.; He, F.; Hochstrasser, D.; Marko-Varga, G.; Salekdeh, G. H.; Sechi, S.; Snyder, M.; Srivastava, S.; Uhlen, M.; Wu, C. H.; Yamamoto, T.; Paik, Y. K.; Omenn, G. S. The human proteome project: current state and future direction. Mol. Cell. Proteomics 2011, 10 (7), M111.009993. (22) Paik, Y. K.; Jeong, S. K.; Omenn, G. S.; Uhlen, M.; Hanash, S.; Cho, S. Y.; Lee, H. J.; Na, K.; Choi, E. Y.; Yan, F.; Zhang, F.; Zhang, Y.; Snyder, M.; Cheng, Y.; Chen, R.; Marko-Varga, G.; Deutsch, E.
W.; Kim, H.; Kwon, J. Y.; Aebersold, R.; Bairoch, A.; Taylor, A. D.; Kim, K. Y.; Lee, E. Y.; Hochstrasser, D.; Legrain, P.; Hancock, W. S. The Chromosome-Centric Human Proteome Project for cataloging proteins encoded in the genome. Nat. Biotechnol. 2012, 30 (3), 221−3. (23) Yang, T. P.; Lee, H. J.; Ou, T. T.; Chang, Y. J.; Wang, C. J. Mulberry Leaf Polyphenol Extract Induced Apoptosis Involving

ACKNOWLEDGMENTS

We acknowledge the entire group of the Chinese Human Chromosome Proteome Consortium, the Guangdong Provincial Engineering Laboratory for Proteomics, and the Shenzhen Engineering Laboratory for Proteomics. This work was supported by grants from the 973 programs (2010CB912700, 2011CB910700, 2012CB910600, 2013CB911200, and 2013CB910802), Nature Science Foundations of China (91131009, 31070673, 31170780, 31000379, and 81372135), 863 projects (2012AA020200, 2011AA02A114, and 2012AA020502), Shenzhen Municipal Government of China (20101749), Shenzhen Key Laboratory of Transomics Biotechnologies (No. CXB2011O8250096A), Key Projects in the National Science & Technology Pillar Program (2012BAF14B00), and State Key Project Specialized for Infectious Diseases (2012ZX10002012−006).



ABBREVIATIONS

Chr.20, chromosome 20; C-HPP, Chromosome-Centric Human Proteome Project; RNC, ribosome-nascent chain; CCPD, Chinese Chromosome Proteome Data Set; COSMIC, Catalogue of Somatic Mutations in Cancer



REFERENCES

(1) Jones, M. J.; Jallepalli, P. V. Chromothripsis: chromosomes in crisis. Dev. Cell 2012, 23 (5), 908−17. (2) Scotto, L.; Narayan, G.; Nandula, S. V.; Arias-Pulido, H.; Subramaniyam, S.; Schneider, A.; Kaufmann, A. M.; Wright, J. D.; Pothuri, B.; Mansukhani, M.; Murty, V. V. Identification of copy number gain and overexpressed genes on chromosome arm 20q by an integrative genomic approach in cervical cancer: potential role in progression. Genes, Chromosomes Cancer 2008, 47 (9), 755−65. (3) Buffart, T. E.; van Grieken, N. C.; Tijssen, M.; Coffa, J.; Ylstra, B.; Grabsch, H. I.; van de Velde, C. J.; Carvalho, B.; Meijer, G. A. High resolution analysis of DNA copy-number aberrations of chromosomes 8, 13, and 20 in gastric cancers. Virchows Arch. 2009, 455 (3), 213−23. (4) Davison, E. J.; Tarpey, P. S.; Fiegler, H.; Tomlinson, I. P.; Carter, N. P. Deletion at chromosome band 20p12.1 in colorectal cancer revealed by high resolution array comparative genomic hybridization. Genes, Chromosomes Cancer 2005, 44 (4), 384−91. (5) Deloukas, P.; Matthews, L. H.; Ashurst, J.; Burton, J.; Gilbert, J. G.; Jones, M.; Stavrides, G.; Almeida, J. P.; Babbage, A. K.; Bagguley, C. L.; Bailey, J.; Barlow, K. F.; Bates, K. N.; Beard, L. M.; Beare, D. M.; Beasley, O. P.; Bird, C. P.; Blakey, S. E.; Bridgeman, A. M.; Brown, A. J.; Buck, D.; Burrill, W.; Butler, A. P.; Carder, C.; Carter, N. P.; Chapman, J. C.; Clamp, M.; Clark, G.; Clark, L. N.; Clark, S. Y.; Clee, C. M.; Clegg, S.; Cobley, V. E.; Collier, R. E.; Connor, R.; Corby, N. R.; Coulson, A.; Coville, G. J.; Deadman, R.; Dhami, P.; Dunn, M.; Ellington, A. G.; Frankland, J. A.; Fraser, A.; French, L.; Garner, P.; Grafham, D. V.; Griffiths, C.; Griffiths, M. N.; Gwilliam, R.; Hall, R. E.; Hammond, S.; Harley, J. L.; Heath, P. D.; Ho, S.; Holden, J. L.; Howden, P. J.; Huckle, E.; Hunt, A. R.; Hunt, S. E.; Jekosch, K.; Johnson, C. M.; Johnson, D.; Kay, M. P.; Kimberley, A. M.; King, A.; Knights, A.; Laird, G. 
K.; Lawlor, S.; Lehvaslaiho, M. H.; Leversha, M.; Lloyd, C.; Lloyd, D. M.; Lovell, J. D.; Marsh, V. L.; Martin, S. L.; McConnachie, L. J.; McLay, K.; McMurray, A. A.; Milne, S.; Mistry, D.; Moore, M. J.; Mullikin, J. C.; Nickerson, T.; Oliver, K.; Parker, A.; Patel, R.; Pearce, T. A.; Peck, A. I.; Phillimore, B. J.; Prathalingam, S. R.; Plumb, R. W.; Ramsay, H.; Rice, C. M.; Ross, M. T.; Scott, C. E.; Sehra, H. K.; Shownkeen, R.; Sims, S.; Skuce, C. D.; Smith, M. L.; Soderlund, C.; Steward, C. A.; Sulston, J. E.; Swann, M.; Sycamore, N.; Taylor, R.; Tee, L.; Thomas, D. W.; Thorpe, A.; Tracey, A.; Tromans, A. C.; Vaudin, M.; Wall, M.; Wallis, J. M.; Whitehead, S. L.; Whittaker, P.; Willey, D. L.; Williams, L.; Williams, S. A.; Wilming, L.; Wray, P. W.; Hubbard, T.; Durbin, R. M.; Bentley, D. R.; Beck, S.; Rogers, J. The DNA sequence and comparative analysis of human chromosome 20. Nature 2001, 414 (6866), 865−71.


Regulation of Adenosine Monophosphate-Activated Protein Kinase/ Fatty Acid Synthase in a p53-Negative Hepatocellular Carcinoma Cell. J. Agric. Food Chem. 2012, 60 (27), 6891−6898. (24) Jiang, Y.; Zhou, X.; Chen, X.; Yang, G.; Wang, Q.; Rao, K.; Xiong, W.; Yuan, J. Benzo(a)pyrene-induced mitochondrial dysfunction and cell death in p53-null Hep3B cells. Mutat. Res. 2011, 726 (1), 75−83. (25) Li, Y.; Tian, B.; Yang, J.; Zhao, L.; Wu, X.; Ye, S. L.; Liu, Y. K.; Tang, Z. Y. Stepwise metastatic human hepatocellular carcinoma cell model system with multiple metastatic potentials established through consecutive in vivo selection and studies on metastatic characteristics. J. Cancer Res. Clin. Oncol. 2004, 130 (8), 460−8. (26) Zhang, G.; Fedyunin, I.; Kirchner, S.; Xiao, C.; Valleriani, A.; Ignatova, Z. FANSe: an accurate algorithm for quantitative mapping of large scale sequencing reads. Nucleic Acids Res. 2012, 40 (11), e83. (27) Trapnell, C.; Pachter, L.; Salzberg, S. L. TopHat: discovering splice junctions with RNA-Seq. Bioinformatics 2009, 25 (9), 1105−11. (28) Li, H.; Handsaker, B.; Wysoker, A.; Fennell, T.; Ruan, J.; Homer, N.; Marth, G.; Abecasis, G.; Durbin, R. The Sequence Alignment/Map format and SAMtools. Bioinformatics 2009, 25 (16), 2078−9. (29) Wang, K.; Li, M.; Hakonarson, H. ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Res. 2010, 38 (16), e164. (30) Mortazavi, A.; Williams, B. A.; McCue, K.; Schaeffer, L.; Wold, B. Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat. Methods 2008, 5 (7), 621−8. (31) Robinson, M. D.; McCarthy, D. J.; Smyth, G. K. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 2010, 26 (1), 139−40. (32) Kessner, D.; Chambers, M.; Burke, R.; Agus, D.; Mallick, P. ProteoWizard: open source software for rapid proteomics tools development. Bioinformatics 2008, 24 (21), 2534−6. 
(33) Wu, S.; Li, N.; Ma, J.; Shen, H.; Jiang, D.; Chang, C.; Zhang, C.; Li, L.; Zhang, H.; Jiang, J.; Xu, Z.; Ping, L.; Chen, T.; Zhang, W.; Zhang, T.; Xing, X.; Yi, T.; Li, Y.; Fan, F.; Li, X.; Zhong, F.; Wang, Q.; Zhang, Y.; Wen, B.; Yan, G.; Lin, L.; Yao, J.; Lin, Z.; Wu, F.; Xie, L.; Yu, H.; Liu, M.; Lu, H.; Mu, H.; Li, D.; Zhu, W.; Zhen, B.; Qian, X.; Qin, J.; Liu, S.; Yang, P.; Zhu, Y.; Xu, P.; He, F. First proteomic exploration of protein-encoding genes on chromosome 1 in human liver, stomach, and colon. J. Proteome Res. 2013, 12 (1), 67−80. (34) Wang, Q.; Wen, B.; Yan, G.; Wei, J.; Xie, L.; Xu, S.; Jiang, D.; Wang, T.; Lin, L.; Zi, J.; Zhang, J.; Zhou, R.; Zhao, H.; Ren, Z.; Qu, N.; Lou, X.; Sun, H.; Du, C.; Chen, C.; Zhang, S.; Tan, F.; Xian, Y.; Gao, Z.; He, M.; Chen, L.; Zhao, X.; Xu, P.; Zhu, Y.; Yin, X.; Shen, H.; Zhang, Y.; Jiang, J.; Zhang, C.; Li, L.; Chang, C.; Ma, J.; Yao, J.; Lu, H.; Ying, W.; Zhong, F.; He, Q. Y.; Liu, S. Qualitative and quantitative expression status of the human chromosome 20 genes in cancer tissues and the representative cell lines. J. Proteome Res. 2013, 12 (1), 151−61. (35) Wang, Q.; Wen, B.; Yan, G.; Wei, J.; Xie, L.; Xu, S.; Jiang, D.; Wang, T.; Lin, L.; Zi, J.; Zhang, J.; Zhou, R.; Zhao, H.; Ren, Z.; Qu, N.; Lou, X.; Sun, H.; Du, C.; Chen, C.; Zhang, S.; Tan, F.; Xian, Y.; Gao, Z.; He, M.; Chen, L.; Zhao, X.; Xu, P.; Zhu, Y.; Yin, X.; Shen, H.; Zhang, Y.; Jiang, J.; Zhang, C.; Li, L.; Chang, C.; Ma, J.; Yan, G.; Yao, J.; Lu, H.; Ying, W.; Zhong, F.; He, Q. Y.; Liu, S. Qualitative and quantitative expression status of the human chromosome 20 genes in cancer tissues and the representative cell lines. J. Proteome Res. 2013, 12 (1), 151−61. (36) Winter, J.; Pantelis, A.; Kraus, D.; Reckenbeil, J.; Reich, R.; Jepsen, S.; Fischer, H. P.; Allam, J. P.; Novak, N.; Wenghoefer, M. Human alpha-defensin (DEFA) gene expression helps to characterise benign and malignant salivary gland tumours. BMC Cancer 2012, 12, 465.
(37) Gerashchenko, O. L.; Zhuravel, E. V.; Skachkova, O. V.; Khranovska, N. N.; Filonenko, V. V.; Pogrebnoy, P. V.; Soldatkina, M. A. Biologic activities of recombinant human-beta-defensin-4 toward cultured human cancer cells. Exp Oncol 2013, 35 (2), 76−82. (38) Li, D.; Wang, W.; Shi, H.; Fu, Y. J.; Chen, X.; Chen, X.; Liu, Y. T.; Kan, B.; Wang, Y. Gene therapy with beta defensin-2 induces anti-tumor immunity and enhances local anti-tumor effects. Hum. Gene Ther. 2013, DOI: 10.1089/hum.2013.161. (39) Sun, C. Q.; Arnold, R.; Fernandez-Golarz, C.; Parrish, A. B.; Almekinder, T.; He, J.; Ho, S. M.; Svoboda, P.; Pohl, J.; Marshall, F. F.;

Petros, J. A. Human beta-defensin-1, a potential chromosome 8p tumor suppressor: control of transcription and induction of apoptosis in renal cell carcinoma. Cancer Res. 2006, 66 (17), 8542−9. (40) Donald, C. D.; Sun, C. Q.; Lim, S. D.; Macoska, J.; Cohen, C.; Amin, M. B.; Young, A. N.; Ganz, T. A.; Marshall, F. F.; Petros, J. A. Cancer-specific loss of beta-defensin 1 in renal and prostatic carcinomas. Lab. Invest. 2003, 83 (4), 501−5. (41) Yoshida, K.; Sanada, M.; Ogawa, S. Deep sequencing in cancer research. Jpn. J. Clin. Oncol. 2013, 43 (2), 110−5. (42) Mittag, F.; Falkenberg, E. M.; Janczyk, A.; Gotze, M.; Felka, T.; Aicher, W. K.; Kluba, T. Laminin-5 and type I collagen promote adhesion and osteogenic differentiation of animal serum-free expanded human mesenchymal stromal cells. Orthop. Res. Rev. 2012, 4 (4), e36. (43) Imura, J.; Uchida, Y.; Nomoto, K.; Ichikawa, K.; Tomita, S.; Iijima, T.; Fujimori, T. Laminin-5 is a biomarker of invasiveness in cervical adenocarcinoma. Diagn. Pathol. 2012, 7, 105. (44) An, S. J.; Lin, Q. X.; Chen, Z. H.; Su, J.; Cheng, H.; Xie, Z.; Zhang, X. C.; Zhou, H. Y.; Huang, Y.; Chen, S. L.; Guo, W. B.; Wu, Y. L. Combinations of laminin 5 with PTEN, p-EGFR and p-Akt define a group of distinct molecular subsets indicative of poor prognosis in patients with non-small cell lung cancer. Exp. Ther. Med. 2012, 4 (2), 226−230. (45) Hamasaki, H.; Koga, K.; Aoki, M.; Hamasaki, M.; Koshikawa, N.; Seiki, M.; Iwasaki, H.; Nakayama, J.; Nabeshima, K. Expression of laminin 5-gamma2 chain in cutaneous squamous cell carcinoma and its role in tumour invasion. Br. J. Cancer 2011, 105 (6), 824−32. (46) Santamato, A.; Fransvea, E.; Dituri, F.; Caligiuri, A.; Quaranta, M.; Niimi, T.; Pinzani, M.; Antonaci, S.; Giannelli, G. Hepatic stellate cells stimulate HCC cell migration via laminin-5 production. Clin. Sci. 2011, 121 (4), 159−68. (47) Carpenter, P. M.; Dao, A. V.; Arain, Z. S.; Chang, M. K.; Nguyen, H. P.; Arain, S.; Wang-Rodriguez, J.; Kwon, S. 
Y.; Wilczynski, S. P. Motility induction in breast carcinoma by mammary epithelial laminin 332 (laminin 5). Mol. Cancer Res. 2009, 7 (4), 462−75. (48) Rojas, A. M.; Sanchez-Pulido, L.; Futterer, A.; van Wely, K. H.; Martinez, A. C.; Valencia, A. Death inducer obliterator protein 1 in the context of DNA regulation. Sequence analyses of distant homologues point to a novel functional role. FEBS J. 2005, 272 (14), 3505−11. (49) Braig, S.; Bosserhoff, A. K. Death inducer-obliterator 1 (Dido1) is a BMP target gene and promotes BMP-induced melanoma progression. Oncogene 2013, 32 (7), 837−48.
