
Article

Network-based combinatorial CRISPR-Cas9 screens identify synergistic modules in human cells
Yucheng Guo, Chen Bao, Dacheng Ma, Yubing Cao, Yanda Li, Zhen Xie, and Shao Li
ACS Synth. Biol., Just Accepted Manuscript • DOI: 10.1021/acssynbio.8b00237 • Publication Date (Web): 14 Feb 2019


ACS Synthetic Biology

Network-based combinatorial CRISPR-Cas9 screens identify synergistic modules in human cells

Yucheng Guo#, Chen Bao#, Dacheng Ma, Yubing Cao1, Yanda Li, Zhen Xie*, Shao Li*

MOE Key Laboratory of Bioinformatics and TCM-X Center / Bioinformatics Division / TFIDT, BNRist, Department of Automation, Tsinghua University, Beijing 100084, China

#: These authors contributed equally to this work.
*: Corresponding authors: [email protected], [email protected].
1: Present address: Syngentech Inc., Zhongguancun Life Science Park, Changping District, Beijing 102206, China.

Keywords: Network, CRISPR-Cas9, Combinatorial screen, Synergistic module

ABSTRACT

Tumorigenesis is a complex process that is driven by a combination of networks of genes and environmental factors; however, efficient approaches to identifying functional networks that are perturbed by the process of tumorigenesis are lacking. In this study, we provide a comprehensive network-based strategy for the systematic discovery of functional synergistic modules that are causal determinants of inflammation-induced tumorigenesis. Our approach prioritizes candidate genes selected by integrating clinical-based and network-based genome-wide gene prediction methods and identifies functional synergistic modules based on combinatorial CRISPR-Cas9 screening. Based on candidate genes inferred de novo from experimental and computational methods to be involved in inflammation and cancer, we used an existing TGFβ1-induced cellular transformation model in colonic epithelial cells and a new combinatorial CRISPR-Cas9 screening strategy to construct an inflammation-induced differential genetic interaction network. The inflammation-induced differential genetic interaction network that we generated yielded functional insights into all the genes and functional module combinations and showed varied responses to the inflammation agents. We identified opposing differential genetic interactions of inflammation-induced tumorigenesis: synergistic promotion and suppression. The synergistic promotion state was primarily caused by deletions in the immune and metabolism modules; the synergistic suppression state was primarily induced by deletions in the proliferation and immune modules or in the proliferation and metabolism modules. These results provide insight into possible early targets and biomarkers for inflammation-induced tumorigenesis and highlight the synergistic effects that occur among immune, proliferation and metabolism modules. In conclusion, this approach deepens the understanding of the mechanisms by which inflammation may increase the cancer risk of colonic epithelial cells and accelerates the translation into novel functional modules or synergistic module combinations that modulate complex disease phenotypes.


Inflammation is a major driver of tumorigenesis, and inflammation-induced tumorigenesis (IIT) is rarely caused by a single genetic mutation but rather by the perturbation of a complex gene network 1-2. A full understanding of the functional networks that link inflammation to tumorigenesis is crucial for preventing and treating inflammation-induced tumorigenesis in general and for early diagnosis in particular. The identification of driver functional networks linking inflammation to tumorigenesis, especially those associated with colitis-associated cancer 3-5, is a major focus of current cancer research; however, efficient approaches to identifying functional networks that are perturbed by inflammation-induced tumorigenesis are lacking.

Clinical data analysis and network-based prediction analysis have been valuable in achieving a detailed understanding of complex diseases 6-8. Numerous computational and experimental methods have been developed and used to find driver gene networks in disease progression 8. Massive amounts of gene expression data have accumulated and have been used to infer gene networks with different computational methods 9. Several methods for exploring functional modules in biological networks have recently been introduced, based either on network topology 10-14 or on the integration of network topology with functional data 15-20. Different network models, such as genetic interaction networks and protein-protein interaction networks, have been used in complex disease analysis 6. Additionally, dynamic disease states have been studied by mapping comprehensive networks across different conditions 21. Differential interaction networks can filter out nonspecific interactions in the system, such as housekeeping functional processes, and thereby reveal context-specific interactions 22. Integrating the advantages of computational methods and experimental measurements is critical for their effective application to complex diseases.

Two key events in tumorigenesis are the accumulation of genomic instability and malignant cell transformation 23. An existing transforming growth factor-beta 1 (TGFβ1)-induced cellular model of malignant transformation in colonic epithelial cells has shown that high expression of cytokines and growth factors such as TGFβ1 is an important signal of chronic inflammation in the intestinal mucosa 24. Overexpression of TGFβ1 may result in the hyperproliferation of epithelial cells and increase the risk of colitis-induced colon cancer 25, 26. Research on the cancer risk induced by chronic inflammatory disease, which promotes the development of several malignancies, especially colorectal cancer (CRC) 3-5, is increasing. In CRC tissues, elevated levels of the adhesion molecule L1CAM are correlated with tumor progression 27-29. To simulate the cellular transformation through which inflammation promotes the malignant transformation of colonic epithelial cells, we used a cellular model of colitis-induced colon cancer 24. In this model, the human intestinal epithelial cell line NCM460 was cultured under TGFβ1 stimulation, and L1CAM was used to mimic the mechanism through which TGFβ1 promotes colitis-induced colon cancer. TGFβ1 contributes to colitis-associated carcinogenesis by promoting the epithelial-mesenchymal transition (EMT) 30-32.


Genome editing by the CRISPR-Cas9 system has increasingly been used to discover synergistic modules in cancer and other diseases 33, 34. Genome-wide CRISPR screening of mammalian cells has been extensively used to identify novel disease genes and functional modules 35. Because inflammation-induced tumorigenesis is a complex process that is driven by genetic networks, we performed combinatorial screening of the human intestinal epithelial cell line NCM460 to identify differential genetic interaction networks associated with TGFβ1 and, based on these networks, functional module combinations of inflammation-induced tumorigenesis.

In this study, we propose an integrated computational and experimental approach to the systematic discovery of functional synergistic modules that are causal determinants of inflammation-induced tumorigenesis. First, we prioritized candidate genes using genome-wide network-based prediction and identified functional modules using combinatorial CRISPR-Cas9 screening. Next, based on the candidate genes inferred de novo from experimental and computational data on genes involved in inflammation and cancer, we used an existing TGFβ1-induced cellular transformation model and a new combinatorial CRISPR-Cas9 screening strategy to construct an inflammation-induced differential genetic interaction network. The inflammation-induced differential genetic interaction network that we generated yielded functional insights into all the genes and functional combinations and showed varied responses to inflammatory agents. We identified opposing differential genetic interactions of inflammation-induced tumorigenesis: synergistic promotion and suppression. The synergistic promotion state was mainly caused by deletions in the immune and metabolism modules; the synergistic suppression state was mainly induced by deletions in the proliferation and immune modules or in the proliferation and metabolism modules. These results provide insight into possible early targets and biomarkers for inflammation-induced tumorigenesis and highlight the synergistic effects among the immune, proliferation and metabolism modules. This work deepens the understanding of the mechanisms by which inflammation increases the risk of cellular malignant transformation, and it accelerates the translation into novel functional modules or synergistic module combinations that modulate complex disease phenotypes.

RESULTS

Schematic Diagram of Systematic Strategies

In this study, an integrated experimental and computational workflow of network-based combinatorial CRISPR-Cas9 screens was used to discover synergistic modules of malignant transformation in colonic epithelial cells (Fig. 1). To identify candidate genes responsible for the progression from inflammation to tumorigenesis, we combined clinical gene expression data with genome-wide gene prediction methods to select genes that are associated with inflammation and cancer (Fig. 1A). To identify functional synergistic modules of inflammation-induced tumorigenesis and construct a differential genetic interaction network, we used an existing cellular model of malignant transformation, in which intestinal epithelial cells are exposed to TGFβ1, together with a new combinatorial CRISPR-Cas9 screening strategy to knock out candidate genes (Fig. 1B).

Construction of a gene coexpression network

Network-based computational methods can contribute to the identification of disease genes and pathways, thereby boosting the development of drug targets 2. We employed CIPHER 7, a method that systematically quantifies the concordance between diseases and genes and prioritizes candidate genes for specific diseases, to predict the top 100 candidate genes for INFLAMMATORY BOWEL DISEASE 1 (IBD1; OMIM ID 266600) and COLORECTAL CANCER (CRC; OMIM ID 114500) (Fig. 1A). To further identify causal genes responsible for the progression from inflammation to tumorigenesis, we explored the GEO datasets for three tissue types (colon, stomach, and liver) to identify differentially expressed genes (DEGs) that show statistically significant differences in expression among normal, inflammation and cancer samples (Fig. 2A, 2B). To prioritize candidate disease genes, we performed gene coexpression analysis based on the DEGs (Fig. 2A, 2C). Global gene coexpression networks were constructed from colon (38 arrays), stomach (58 arrays) and liver (314 arrays) gene expression data using 84 differentially expressed genes (Table S8).

To reveal the biological processes involved in inflammation-induced tumorigenesis, we performed KEGG pathway and Gene Ontology (GO) enrichment analyses. The significantly overrepresented pathways included signal transduction-related pathways and immune system-related pathways, such as the sphingolipid signaling pathway (P-value = 5.5E-9) and the chemokine signaling pathway (P-value = 5.1E-6). The significant results of the GO enrichment analysis are shown in Table S9. They include cell cycle-related terms such as positive regulation of cell proliferation (P-value = 1.5E-15), programmed cell death (P-value = 2.0E-10), and cell differentiation (P-value = 3.4E-9), as well as metabolism-related and immune-related terms such as positive regulation of cellular metabolic process (P-value = 9E-21), positive regulation of cytokine production (P-value = 9.2E-22), positive regulation of immune response (P-value = 5.7E-15) and inflammatory response (P-value = 1.4E-15). Overall, two significantly enriched modules, immune system process (P-value = 2.2E-22) and metabolic process (P-value = 1.8E-4), were defined for the gene coexpression network of the colon. Pearson correlation analysis identified coexpression modules corresponding to clusters of correlated genes (Fig. S1). The significant KEGG and GO enrichment results indicate that these candidate genes can characterize the underlying molecular basis of inflammation-induced tumorigenesis.
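As an illustration of the coexpression step described above, the following is a minimal sketch of how a Pearson-based coexpression network could be assembled from per-gene expression profiles. The function names, the gene labels, the expression values and the correlation cutoff of 0.8 are all assumptions made for this example; the text does not state which cutoff or software was used.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def coexpression_edges(expression, threshold=0.8):
    """Return (gene, gene, r) for pairs with |r| >= threshold.

    The threshold is an assumed value, not one taken from the study.
    """
    edges = []
    for g1, g2 in combinations(sorted(expression), 2):
        r = pearson(expression[g1], expression[g2])
        if abs(r) >= threshold:
            edges.append((g1, g2, round(r, 3)))
    return edges


# Toy expression profiles across five arrays (made-up numbers, not study data).
toy_expression = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "GENE_B": [2.1, 3.9, 6.2, 8.0, 9.9],
    "GENE_C": [5.0, 1.0, 4.0, 2.0, 3.0],
}
edges = coexpression_edges(toy_expression)
```

In this toy input only GENE_A and GENE_B are strongly correlated, so they form the single edge; clusters of such edges would correspond to the coexpression modules mentioned in the text.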


Identification of Differential Genetic Interaction Networks using CRISPR-based Dual Knockout

Genome editing by the CRISPR-Cas9 system has increasingly been used to discover synergistic modules in cancer and other diseases 33, 34. Many genomic techniques, such as shRNA and CRISPR screens, have been developed to study gene essentiality 36. To study 84 candidate genes that are essential for the contribution of inflammation to cancer, we used a CRISPR-based double-knockout approach to establish a differential genetic interaction network (Fig. 3A). To test whether we could construct a two-guide RNA-DNA vector library based on the Golden Gate cloning method, we developed a hierarchical strategy (Fig. S2). Compared with the CombiGEM method 37, this strategy requires only two construction steps, is more flexible and involves fewer repetitive sequences. To identify novel genes and functional relationships involved in inflammation-induced tumorigenesis, we used the CRISPR-based double-knockout approach to perform functional screening in the NCM460 cell line and used a cellular model of inflammation-associated cancer that simulates the cellular transformation through which IBD induces the malignant proliferation of colonic epithelial cells 24. Our screening method measures all possible genetic interactions among 104 genes, including 84 candidate genes (Table S8) and 20 negative control genes. To confirm the role of the 84 candidate genes in inflammation-induced tumorigenesis, we used dual sgRNAs to knock out all possible gene pairs in NCM460 cells and constructed a differential genetic interaction map (dGImap) of these genes to uncover functional relationships (Fig. 3A).

Fig. 1B presents an overview of our experimental and computational strategy. We used an sgRNA lentivirus library to infect a modified NCM460 cell line that stably expresses the Cas9 protein and then divided the obtained cell population into an experimental group and a control group. The experimental group was treated with TGFβ1, and the control group was left untreated. After 10 days of cell culture, the cell populations of the two groups were collected, and their genomic DNA was extracted. Next, the counts of the different sgRNA combinations were measured by deep sequencing. Growth rates were determined by measuring the number of live cells present after 10 d under normal conditions and after 10 d in the presence of TGFβ1, which induces an inflammatory microenvironment. The frequency of each sgRNA pair was compared between the experimental and control groups. We used ρ to assess the TGFβ1-induced cell growth rate; ρ quantifies the difference in cell growth rate between the experimental and control groups 38, 39. An sgRNA with no effect in the presence of TGFβ1 has a ρ of 0; sgRNAs that confer TGFβ1-induced cell growth have positive ρ values, and sgRNAs that sensitize cells to TGFβ1-induced cell growth have negative ρ values (Fig. 3A-3B).

Sequencing data were then normalized, and statistical analysis was performed to obtain a quantitative differential GI score (dGI) for double mutants 38, 39. An accelerated growth rate with double mutations indicates that the two genes have a synergistic effect, which is represented by a positive dGI score. By contrast, inhibition of the growth rate after double mutations shows that the two genes have antagonistic effects, which are represented as a negative dGI score (Fig. 3B). In total, the differential genetic interaction network contains quantitative dGI scores for 7,056 gene pairs. By analyzing the distribution of dGI scores, we obtained a set of gene pairs with extreme genetic interactions (dGI < -0.84 or dGI > 1.11), revealing significant differences between TGFβ1-induced and normal colonic epithelial cells. At dGI < -0.84 or dGI > 1.11, we identified 39 positive gene pairs and 45 negative gene pairs with a significant difference (Fig. 3D, Table S10). In differential networks, the number of interactions among genes is correlated with the sensitivity of the genes to inflammation, and the number of interaction hubs increases under inflammatory stress 22. Moreover, we found that many of these interaction hubs are recognized as key genes in immune or metabolic pathways (Fig. 3C), a finding that is consistent with previous studies 22.
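To make the scoring above concrete, the following is a rough sketch of how a ρ-like value and the dGI cutoffs could be applied. It rests on stated assumptions: ρ is approximated here as a pseudocount-stabilized log2 ratio of sgRNA-pair frequencies between the TGFβ1-treated and control populations, which is a simplification and not necessarily the growth-rate-based definition of refs 38, 39. The function names and read counts are hypothetical; only the cutoffs of -0.84 and 1.11 are taken from the text.

```python
from math import log2

# Cutoffs quoted in the text for extreme differential genetic interactions.
DGI_NEGATIVE_CUTOFF = -0.84
DGI_POSITIVE_CUTOFF = 1.11


def rho(treated_count, control_count, treated_total, control_total, pseudocount=1.0):
    """Log2 ratio of an sgRNA pair's frequency with TGFβ1 versus without it.

    Assumed simplification: a pseudocount-stabilized log2 fold change stands in
    for the growth-rate-based rho described in the text.
    """
    treated_freq = (treated_count + pseudocount) / treated_total
    control_freq = (control_count + pseudocount) / control_total
    return log2(treated_freq / control_freq)


def classify_dgi(score):
    """Label a dGI score using the two cutoffs from the text."""
    if score > DGI_POSITIVE_CUTOFF:
        return "positive"
    if score < DGI_NEGATIVE_CUTOFF:
        return "negative"
    return "not significant"


# Hypothetical read counts with equal library sizes in both groups.
enriched_rho = rho(treated_count=399, control_count=99,
                   treated_total=1_000_000, control_total=1_000_000)
neutral_rho = rho(treated_count=99, control_count=99,
                  treated_total=1_000_000, control_total=1_000_000)
labels = [classify_dgi(s) for s in (1.5, 0.2, -1.0)]
```

Under these assumptions a pair that is four times more frequent with TGFβ1 receives a positive value, a pair with unchanged frequency receives 0, and only scores beyond either cutoff are labeled as significant interactions.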

Distinct Interaction Patterns Indicate Specific Inflammation-induced Tumorigenesis Mechanisms

To identify significantly different genetic interaction patterns (Fig. 4A), we constructed a significantly different genetic interaction network based on the differential genetic interactions (Fig. 4B). To determine the association between different interaction patterns and specific inflammation-induced tumorigenesis mechanisms, we performed enrichment analysis of the 63 genes in the differential genetic interaction network and found that these genes are mainly enriched in three inflammation-induced cancer pathways. The main enriched functions comprise immune-related terms such as innate immune response (P-value = 5.8E-10), positive regulation of immune response (P-value = 1.3E-9) and immune system development (P-value = 2.8E-9); metabolism-related terms such as positive regulation of cellular metabolic process (P-value = 2.0E-15), positive regulation of macromolecule metabolic process (P-value = 2.3E-15) and positive regulation of nitrogen compound metabolic process (P-value = 8E-15); and proliferation-related terms such as positive regulation of cell proliferation (P-value = 6.7E-10). Differential genetic interactions are more likely to occur between genes in two different modules than between genes in the same module 22. Thus, we conducted an association analysis between biological processes and differential genetic interactions and found an enrichment of differential genetic interactions among modules. These results indicate that the differential genetic interactions between these biological functional modules are reprogrammed after inflammatory stimulation (Fig. 4B). Based on this analysis, we constructed a functional network to model the differential genetic interactions between the biological functional modules and their dynamic responses after inflammatory stimulation (Fig. 4B). The enrichment of differential genetic interactions among several genes reveals modulewise interactions (Fig. 4B).

Three module combinations strongly affect inflammation-induced tumorigenesis. This result suggests that cooccurrences in the proliferation and immune modules or in the proliferation and metabolism modules synergistically suppress the process of inflammation-induced tumorigenesis and that cooccurrences in the metabolism and immune modules synergistically promote it. The results further suggest that cooccurrences in the immune and proliferation modules or in the metabolism and proliferation modules offer possible early targets for inflammation-induced tumorigenesis and that cooccurrences in the immune and metabolism modules may be possible early biomarkers that indicate an increased risk of transition from inflammation to tumorigenesis (Fig. 4B).

To assess the quality of the differential genetic interactions determined by sgRNAs in this study, we used RNA interference (shRNA) to confirm the screening-based phenotypes. We confirmed that three specific shRNA pairs targeting MYC-CDK4, IL6R-TNF, and PIK3CA-NFKB1 led to synergistic changes in cell growth when used in conjunction with sgRNA pairs. Similarly, sgRNA pairs and shRNA pairs that simultaneously targeted the three specific pairs exhibited similar synergy or antagonism (Fig. 4C-F). In particular, the MYC-CDK4 gene pair is frequently activated in human cancers 40, 41. MYC directly regulates genes involved in glucose metabolism as well as genes involved in ribosome biogenesis 42. Moreover, MYC can directly regulate cell cycle progression by activating genes such as cyclin D and cyclin-dependent kinase 4 (CDK4), in addition to indirectly regulating the cell cycle by affecting metabolic processes 43. Combined deletion of CDK4 and MYC may therefore be a new therapeutic strategy against malignant transformation in inflammation-induced tumorigenesis. In our previous research, mutations in genes in the apoptosis and proliferation categories were highly concurrent in colorectal cancer (CRC), and many double mutations in genes involved in proliferation, apoptosis, differentiation, immune responses and metabolism may be responsible for the transition to inflammation-induced tumorigenesis 44. These results imply that the differential gene interactions detected by our technique indeed capture the underlying molecular basis of inflammation-induced tumorigenesis. Our approach provides further understanding of the mechanisms by which inflammation induces the malignant proliferation of colonic epithelial cells and accelerates the translation into novel functional modules or synergistic module effects that regulate complex disease phenotypes.
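The modulewise view described in this section can be sketched as a simple tally of significant differential interactions per module pair and sign. This is only an illustrative sketch: the function name, the placeholder gene labels, their module assignments and the interaction signs below are invented for the example and are not results from the study.

```python
from collections import Counter


def count_module_pairs(interactions, gene_to_module):
    """Count significant differential interactions per (module pair, sign).

    Each interaction is (gene_a, gene_b, sign); module pairs are stored in
    sorted order so that (immune, metabolism) and (metabolism, immune) merge.
    """
    counts = Counter()
    for gene_a, gene_b, sign in interactions:
        pair = tuple(sorted((gene_to_module[gene_a], gene_to_module[gene_b])))
        counts[(pair, sign)] += 1
    return counts


# Placeholder genes and assignments (hypothetical, for illustration only).
toy_modules = {
    "IMM1": "immune",
    "IMM2": "immune",
    "MET1": "metabolism",
    "PRO1": "proliferation",
}
toy_interactions = [
    ("IMM1", "MET1", "positive"),
    ("IMM2", "MET1", "positive"),
    ("PRO1", "IMM1", "negative"),
    ("PRO1", "MET1", "negative"),
    ("IMM1", "IMM2", "negative"),
]
module_counts = count_module_pairs(toy_interactions, toy_modules)
between_modules = sum(n for (pair, _), n in module_counts.items() if pair[0] != pair[1])
within_modules = sum(n for (pair, _), n in module_counts.items() if pair[0] == pair[1])
```

Comparing the between-module and within-module totals against what random gene pairing would give is one way such an association analysis could be set up; the statistical test itself is not specified in the text and is left out here.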

Synergistic Module-based Drug Prediction for Inflammation-induced Tumorigenesis

Network-based drug discovery is considered one of the most important approaches for next-generation drug discovery 2. Complex signaling networks often regulate inflammation-induced tumorigenesis, and multiple targets are often associated in drug-target networks. In this study, we used synergistic modules to predict drugs that affect inflammation-induced tumorigenesis, particularly compounds used in traditional Chinese medicine (TCM). From the traditional Chinese medicine compound database HerBioMap, we retrieved 392 compounds contained in Liu-wei-di-huang (LWDH) and used the drugCIPHER algorithm to predict the target profiles of the corresponding compounds 45. For each compound, we obtained a compound score for the differential genetic interactions of inflammation-induced tumorigenesis. After ranking all compounds found in the LWDH analogous prescriptions (Table S11), we selected four compounds (quercetin, isorhamnetin, kaempferol and albiflorin). We used the MTT assay to evaluate the inhibitory effects of the four compounds on TGFβ1-induced colonic epithelial cells (NCM460). We observed that all four candidate compounds inhibited inflammation-induced tumorigenesis (Fig. 5). The half-maximal inhibitory concentration (IC50) values for quercetin, kaempferol, isorhamnetin and albiflorin were 64.79 μM, 139.9 μM, 232.4 μM and 464 μM, respectively (Fig. 5). These results suggest that our computational analysis is useful in identifying compounds used in traditional Chinese medicine that effectively inhibit tumor cell growth, although the experimental ranking of the top four compounds differed slightly from our predicted ranking (Table S11). The experimental validation indicates that discovering synergistic target combinations and functional module combinations is an effective approach for inflammation-induced tumorigenesis.
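For readers unfamiliar with how an IC50 is read off MTT data, the following is a minimal sketch under stated assumptions. Viability is taken as the background-corrected OD570 of treated wells relative to untreated wells, and the IC50 is estimated by linear interpolation on a log10 concentration axis. The text does not state how the reported IC50 values were fitted, so interpolation is a stand-in for whatever curve fit was actually used, and the function names and dose-response numbers are hypothetical.

```python
from math import log10


def viability_percent(od_treated, od_control, od_blank=0.0):
    """Viability relative to untreated wells, from background-corrected OD570."""
    return 100.0 * (od_treated - od_blank) / (od_control - od_blank)


def ic50_by_interpolation(concentrations, viabilities):
    """Estimate IC50 by linear interpolation on a log10 concentration axis.

    Expects concentrations in ascending order. Returns None when viability
    never crosses 50% within the tested range.
    """
    points = list(zip(concentrations, viabilities))
    for (c_lo, v_lo), (c_hi, v_hi) in zip(points, points[1:]):
        if v_lo >= 50.0 >= v_hi and v_lo != v_hi:
            fraction = (v_lo - 50.0) / (v_lo - v_hi)
            log_ic50 = log10(c_lo) + fraction * (log10(c_hi) - log10(c_lo))
            return 10 ** log_ic50
    return None


# Hypothetical dose-response values (μM, % viability); not study measurements.
doses = [10.0, 100.0, 1000.0]
viabilities = [90.0, 50.0, 10.0]
estimate = ic50_by_interpolation(doses, viabilities)
no_crossing = ic50_by_interpolation([1.0, 10.0], [90.0, 80.0])
```

A four-parameter logistic fit is the more common choice when enough dose points are available; interpolation is used here only because it keeps the example short and dependency-free.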

DISCUSSION

Cancer is a complex disease that is often caused by genetic mutations and environmental disturbances, especially in inflammation-induced tumorigenesis 1. However, efficient approaches to identifying the functional synergistic modules associated with inflammation-induced tumorigenesis have been lacking. We have proposed an integrated network-based and CRISPR-based approach for the systematic discovery of functional synergistic modules that are causal determinants of inflammation-induced tumorigenesis (Fig. 1). The approach consists of three parts: construction of a gene coexpression network, CRISPR-based genetic interaction analysis, and synergistic module-based drug prediction. The approach also integrates computational and experimental methods, infers candidate genes involved in inflammation and cancer, and constructs a candidate gene coexpression network. The existing cellular model of TGFβ1-induced malignant transformation of colonic epithelial cells is used to construct an inflammation-induced differential genetic interaction network based on a new combinatorial CRISPR-Cas9 screening strategy. The inflammation-induced differential genetic interaction network reveals the presence of a metabolism-immune imbalance, suggesting that the cooccurrence of changes in immune-related and metabolism-related gene expression is a possible early biomarker indicating an increased risk of transition from inflammation to tumorigenesis. A recent study highlights the fundamental role of signaling crosstalk between metabolism-immune balance and inflammatory control mechanisms and identifies metabolism-immune balance as a key determinant in inflammation-induced tumorigenesis 44. Moreover, metabolism-immune imbalance in patients with chronic atrophic gastritis has been found to be closely related to the cold syndrome and heat syndrome described in traditional Chinese medicine 46, 47. Our approach provides further understanding of the mechanisms by which inflammation induces malignant proliferation and accelerates the translation into novel functional synergistic modules.

Microarray-based gene expression analysis offers great potential for the discovery of previously unknown functional connections 48. The digestive system gene coexpression network contains highly connected protein clusters, which reflects the modular regulation of biological networks 14. Two significantly related biological functions were obtained through enrichment analysis: immune-related and metabolism-related biological functional modules. Chronic IBD is a risk factor for CRC 49. In patients with IBD, elevated expression of TGFβ1 is found in various cell types, including epithelial cells. Additionally, TGFβ1-induced EMT was shown to be an important cause of the proliferation of epithelial cancer cells and tumor metastasis 30, 32. We used a cellular model of TGFβ1-induced malignant transformation of colonic epithelial cells to construct a differential genetic interaction network of inflammation-induced tumorigenesis with CRISPR-Cas9-based dual-sgRNA knockouts.

In summary, we proposed an integrated approach for the systematic discovery of functional modules that are causal determinants of complex diseases by prioritizing candidate genes identified through clinical-based and network-based genome-wide gene prediction and identifying functional modules through combinatorial CRISPR-Cas9 screens. Next, we established a dual-sgRNA analysis method to identify novel disease genes and module combinations. This method identifies the inflammation-specific genetic interaction network of malignant transformation. Here, we used this approach to successfully identify specific gene pairs and functional module combinations associated with inflammation-induced malignant transformation via CRISPR-Cas9 dual-sgRNA knockouts that lead to changes in cell growth. The CRISPR-Cas9-based dual-sgRNA knockout screening system facilitates the identification of novel disease genes and module combinations 33, 51. Moreover, we validated three gene pairs (PIK3CA + NFKB1, IL6R + TNF and MYC + CDK4) using dual-shRNA knockdowns and confirmed their synergistic efficacy against malignant transformation induced by TGFβ1. Based on an inflammation-induced genetic interaction network, we predicted four potential chemical constituents of LWDH analogous prescriptions for inflammation-induced tumorigenesis, suggesting that these compounds could be viable therapeutic candidates.
Our method represents an advance over previously described combinatorial disease gene and drug screening platforms 52-54 in that it uses high-throughput screening to reduce labor, material and time costs. Network-based combinatorial CRISPR-Cas9 screening provides a streamlined tool to reveal potential synergistic functional modules in biological networks, especially for complex diseases under multiple perturbations. Our computational and experimental strategy appears to be highly valuable for exploring functional modules inside biological networks, and integrating these functional modules helps to further understand the pathogenesis of complex diseases. We found that the co-occurrence of immune and proliferation changes, or of metabolic and proliferation changes, offers possible early targets for inflammation-induced tumorigenesis, and that the co-occurrence of immune and metabolic changes produces possible early biomarkers that indicate an increased risk of transition from inflammation to tumorigenesis. In conclusion, our strategy provides new insight into the mechanisms by which inflammation induces malignant tumorigenesis and accelerates translation into novel synergistic functional modules.

METHODS

Experimental Materials
The E. coli strain DH5α (TransGen Biotech, Beijing) was used for plasmid cloning and maintenance. The lentiviral packaging plasmids pCMVdR8.9 and pVSV-G were gifts from Professor Zhen Xie (Tsinghua University).

Cell lines and Transfection
The human NCM460 cell line used in this study was purchased from the American Type Culture Collection (ATCC). A Cas9-expressing stable monoclonal NCM460 cell line was constructed using the packaged Cas9 lentivirus. The cell line was cultured in a sterile incubator (37 °C, 5% CO2) in RPMI-1640 medium (Gibco) containing 10% FBS (Gibco), 1% penicillin and 1% streptomycin. The HEK293T cell line was a gift from the Zhen Xie laboratory (Tsinghua University); it was maintained in DMEM/high-glucose medium (Gibco) supplemented with 10% FBS (Gibco), 1% penicillin and 1% streptomycin. Plasmids were introduced into the cells using the Attractene Transfection Reagent (QIAGEN).

Cell viability assay
Cell viability was assessed by the MTT assay. First, cells (approximately 1 × 10^4 cells/well) were seeded in a 96-well plate (100 μL of medium/well). After 24 hours of culture, the medium was replaced with fresh medium, and the cells were treated with a serially diluted drug. After 48 hours of drug treatment, 10 μL of MTT solution (5 mg/mL) and 90 μL of PBS were added to each well. After incubation for 4 hours at 37 °C, the culture solution was removed, and 100 μL of DMSO was added to each well. Finally, after mixing, the OD value at a wavelength of 570 nm was measured using a microplate reader.

Quantitative RT-PCR


Total RNA was extracted from cells using TRIzol (Life Technologies). Next, RNA was reverse transcribed into cDNA using the TransScript All-in-One cDNA Synthesis Kit (TransGen Biotech). Real-time PCR was performed on a Roche LightCycler 480 II qPCR system using Platinum SYBR green qPCR SuperMix (TransGen Biotech) according to the kit's instructions. The primers used are listed in Table S4. Relative changes in transcription levels were calculated using the 2^(-ΔΔCT) method. The housekeeping genes β-ACTIN and GAPDH were used as controls.

Western blot analysis
Proteins were extracted from the cells using RIPA lysis buffer (Solarbio). Next, proteins of different sizes were separated by electrophoresis on a 10% SDS-PAGE gel, transferred to a nitrocellulose membrane by electrophoresis, and incubated with 5% skim milk in PBS or Tween-20 for 2 h. The membrane was then incubated with the primary antibody (Abcam) overnight at 4 °C. Next, the membrane was washed 3 times and incubated for 1 h in blocking solution containing the secondary antibody. Bands were visualized with chemiluminescent reagents, and ACTIN protein was used as a control.

Fluorescence-Activated Cell Sorting Analysis
After treatment with 0.25% trypsin for 3 minutes, the cells were collected into a tube and centrifuged at 300 × g for 7 minutes. Next, the cells were resuspended in PBS. Flow cytometry (BD Biosciences) was used for sorting fluorescent cells, and data analysis was performed using FlowJo software. At least 10,000 cells were analyzed for each sample.

ASSOCIATED CONTENT

Supporting Information
The Supporting Information includes four figures and eleven tables.
Additional guide RNA library construction, genome extraction and NGS library preparation, hierarchical assembly of the dual-gRNA knockout library, gene selection, design and synthesis of sgRNAs, lentivirus packaging and infection, verification by RNA interference, determination of sgRNA phenotypes from pooled screens, definitions of the expected double-sgRNA phenotypes, synergistic module-based drug prediction.

AUTHOR INFORMATION

Corresponding Author
*Tel: +86-010-62797035. E-mail: [email protected].
*Tel: +86-010-62796050. E-mail: [email protected]

Author Contributions


S.L., Z.X., Y.L. and Y.G. conceived the ideas implemented in this work. C.B., D.M. and Y.C. constructed the two-guide RNA-DNA vector library. C.B. and Y.C. performed the cell experiments. Y.G. and S.L. analyzed the data. S.L. supervised the project. Y.G., Z.X. and S.L. wrote the paper.

Notes
The authors declare no competing financial interest.

ACKNOWLEDGMENTS

This research was supported in part by NSFC grants 91729301, 81630103 and 91229201 and the Project of Tsinghua-Fuzhou Institute for Data Technology (TFIDT2018001) to S. Li, and by NSFC grant 31471255 to Z. Xie.


Figure Legends

Fig. 1. Schematic diagram of the systematic strategies used to identify synergistic modules. (A) Data-based and network-based gene prediction of inflammation-induced tumorigenesis (IIT). (B) A new combinatorial CRISPR-Cas9 screening strategy for the identification of synergistic modules.

Fig. 2. Differential gene expression analysis and coexpression network analysis. (A) Flow diagram of microarray data analyses. (B) Differential gene expression analysis of microarray data.


Fig. 3. Calculation of differential genetic interactions. (A) Read counts for all the gRNAs. (B) Quantitative phenotypic changes for each sgRNA combination were calculated based on phenotypic differences between the treated and control subpopulations. (C) Scatterplot showing the numbers of positive and negative differential phenotypic differences. (D) The dGI scores follow a normal distribution with a mean of 0.13 and a standard deviation of 0.49.
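As a rough illustration of panels B-D, the sketch below computes a differential genetic interaction (dGI) score for one gene pair and standardizes it against the distribution reported in panel D (mean 0.13, standard deviation 0.49). The additive expectation for the double knockout, the pseudocount, the treated-minus-control direction, all function names, and the example phenotypes are assumptions made for illustration only; the definitions actually used are described in the Supporting Information.

```python
import math


def log2_fold_change(count_end, count_start, pseudocount=1.0):
    """Phenotype of one sgRNA construct: log2 ratio of end to start read counts.

    Counts are assumed to be already normalized to sequencing depth.
    """
    return math.log2((count_end + pseudocount) / (count_start + pseudocount))


def gi_score(pheno_a, pheno_b, pheno_ab):
    """Observed double-knockout phenotype minus an additive expectation (assumed model)."""
    return pheno_ab - (pheno_a + pheno_b)


def dgi_score(gi_treated, gi_control):
    """Differential GI: interaction in the treated minus the control subpopulation (assumed direction)."""
    return gi_treated - gi_control


def z_score(dgi, mean=0.13, sd=0.49):
    """Standardize a dGI score against the mean and SD reported in panel D."""
    return (dgi - mean) / sd


# Hypothetical phenotypes for one gene pair (not data from the paper).
gi_treated = gi_score(pheno_a=-0.5, pheno_b=-0.3, pheno_ab=-1.5)
gi_control = gi_score(pheno_a=-0.2, pheno_b=-0.1, pheno_ab=-0.4)
pair_dgi = dgi_score(gi_treated, gi_control)
print(f"dGI = {pair_dgi:.2f}, z = {z_score(pair_dgi):.2f}")
```

Under this assumed model, a strongly negative z-score would flag a pair whose combined knockout reduces growth more under treatment than the single knockouts predict.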


Fig. 4. Modularity of differential genetic interaction networks. (A) The differential genetic interaction network was obtained by analyzing the difference in the cell growth rate between single and double mutants. (B) Differential genetic interaction network. The differential genetic interaction maps include both differential positive (red) and differential negative (blue) interactions. The module map groups the genes of the differential genetic interaction network into biological process-based sets (Supplementary Table 1). (C-F) Combinatorial inhibition of three specific shRNA pairs and predicted compounds of TCM. (C-E) The differential enrichment of the three specific shRNA pairs was compared with the differential enrichment of the corresponding sgRNA pairs. (F) Differential GI scores for the three specific pairs of shRNAs and sgRNAs.
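One way to picture the map in panel B is as a signed edge list that is then summarized at the module level. The sketch below is only an illustration under stated assumptions: the cutoff, the placeholder gene names, the gene-to-module assignments, and the scores are all hypothetical and are not taken from the paper or from Supplementary Table 1.

```python
from collections import defaultdict


def signed_edges(dgi_scores, cutoff=1.0):
    """Keep gene pairs whose |dGI| is at least the cutoff, labelled by sign."""
    edges = []
    for (gene_a, gene_b), score in dgi_scores.items():
        if abs(score) >= cutoff:
            sign = "positive" if score > 0 else "negative"
            edges.append((gene_a, gene_b, sign))
    return edges


def module_pair_counts(edges, gene_to_module):
    """Count positive and negative edges for each unordered pair of modules."""
    counts = defaultdict(lambda: {"positive": 0, "negative": 0})
    for gene_a, gene_b, sign in edges:
        key = tuple(sorted((gene_to_module[gene_a], gene_to_module[gene_b])))
        counts[key][sign] += 1
    return dict(counts)


# Hypothetical scores and module assignments (placeholders, not paper data).
example_scores = {("G1", "G2"): -1.4, ("G1", "G3"): 0.2, ("G3", "G4"): 1.2}
example_modules = {
    "G1": "immune",
    "G2": "proliferation",
    "G3": "metabolism",
    "G4": "proliferation",
}
example_edges = signed_edges(example_scores)
example_counts = module_pair_counts(example_edges, example_modules)
```

Sorting the two module names makes the pair key order-independent, so an immune-proliferation edge and a proliferation-immune edge are tallied together.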


Fig. 5. Synergistic module-based drug prediction and predicted compounds of TCM. (A-D) The IC50 values of the compounds based on the MTT assay. The concentrations used and the corresponding cell viability curves are shown for quercetin (A), kaempferol (B), isorhamnetin (C), and albiflorin (D). The IC50 values (the concentrations that resulted in 50% viability of NCM460 cells stimulated with 10 ng/mL of TGFβ1 for 2 days) were 64.79 μM, 139.9 μM, 232.4 μM and 464 μM for quercetin, kaempferol, isorhamnetin, and albiflorin, respectively.
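To illustrate how an IC50 can be read from MTT data such as those in panels A-D, the sketch below converts OD570 readings to percent viability and interpolates the 50% crossing on a log-concentration scale. This is a simple assumed approach with hypothetical readings; the paper does not state which curve-fitting method was used, and a four-parameter logistic fit is a common alternative.

```python
import math


def percent_viability(od_treated, od_control, od_blank=0.0):
    """Viability of treated wells relative to untreated control wells, in percent."""
    return 100.0 * (od_treated - od_blank) / (od_control - od_blank)


def ic50_by_interpolation(concentrations, viabilities, target=50.0):
    """Interpolate the concentration giving `target` % viability.

    Concentrations must be positive and in ascending order. Interpolation is
    linear in log10(concentration) between the two points that bracket the
    target. Returns None if viability never crosses the target.
    """
    points = list(zip(concentrations, viabilities))
    for (c_lo, v_lo), (c_hi, v_hi) in zip(points, points[1:]):
        if v_lo >= target >= v_hi and v_lo != v_hi:
            frac = (v_lo - target) / (v_lo - v_hi)
            log_ic50 = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_ic50
    return None


# Hypothetical dose-response readings in μM and % viability (not paper data).
doses = [10.0, 30.0, 100.0, 300.0]
viability = [92.0, 71.0, 38.0, 12.0]
estimate = ic50_by_interpolation(doses, viability)
print(f"Estimated IC50: {estimate:.1f} μM")
```

Interpolating on the log scale matches the usual practice of plotting dose-response curves against log concentration, but the estimate depends on how densely the doses bracket the 50% point.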


REFERENCES

1. Elinav, E., et al., Inflammation-induced cancer: crosstalk between tumours, immune cells and microorganisms. Nat Rev Cancer, 2013. 13(11): p. 759-771.
2. Barabási, A.L., N. Gulbahce, and J. Loscalzo, Network medicine: a network-based approach to human disease. Nature Reviews Genetics, 2011. 12(1): p. 56-68.
3. Wang, D., et al., Peroxisome proliferator-activated receptor δ promotes colonic inflammation and tumor growth. Proceedings of the National Academy of Sciences of the United States of America, 2014. 111(19): p. 7084-9.
4. Ekbom, A., et al., Increased Risk of Large-Bowel Cancer in Crohns-Disease with Colonic Involvement. Lancet, 1990. 336(8711): p. 357-359.
5. Hu, B., et al., Correction for Hu et al., Microbiota-induced activation of epithelial IL-6 signaling links inflammasome-driven inflammation with transmissible cancer. Proceedings of the National Academy of Sciences, 2013. 110(31): p. 12852-12852.
6. Hu, J.X., C.E. Thomas, and S. Brunak, Network biology concepts in complex disease comorbidities. Nat Rev Genet, 2016. 17(10): p. 615-29.
7. Wu, X., et al., Network-based global inference of human disease genes. Molecular Systems Biology, 2008. 4: p. 189.
8. Zhang, X.-F., et al., Differential network analysis from cross-platform gene expression data. Scientific Reports, 2016. 6: p. 34112.
9. Marbach, D., et al., Wisdom of crowds for robust gene network inference. Nature Methods, 2012. 9(8): p. 796.
10. Barabasi, A.L. and Z.N. Oltvai, Network biology: understanding the cell's functional organization. Nat Rev Genet, 2004. 5(2): p. 101-13.
11. Huang, D.W., B.T. Sherman, and R.A. Lempicki, Bioinformatics enrichment tools: paths toward the comprehensive functional analysis of large gene lists. Nucleic Acids Research, 2009. 37(1): p. 1-13.
12. Ravasz, E., et al., Hierarchical organization of modularity in metabolic networks. Science, 2002. 297(5586): p. 1551-5.
13. Snel, B., P. Bork, and M.A. Huynen, The identification of functional modules from the genomic association of genes, in Proc Natl Acad Sci U S A. 2002. p. 5890-5.
14. Spirin, V. and L.A. Mirny, Protein complexes and functional modules in molecular networks. Proc Natl Acad Sci U S A, 2003. 100(21): p. 12123-8.
15. Bader, G.D. and C.W. Hogue, Analyzing yeast protein-protein interaction data obtained from different sources. Nat Biotechnol, 2002. 20(10): p. 991-7.
16. Bar-Joseph, Z., et al., Computational discovery of gene modules and regulatory networks. Nat Biotechnol, 2003. 21(11): p. 1337-42.
17. Ihmels, J., et al., Revealing modular organization in the yeast transcriptional network. Nat Genet, 2002. 31(4): p. 370-7.
18. Jansen, R., et al., A Bayesian networks approach for predicting protein-protein interactions from genomic data. Science, 2003. 302(5644): p. 449-53.
19. Stuart, J.M., et al., A gene-coexpression network for global discovery of conserved genetic modules. Science, 2003. 302(5643): p. 249-55.
20. Tornow, S. and H.W. Mewes, Functional modules by relating protein interaction networks and gene expression. Nucleic Acids Res, 2003. 31(21): p. 6283-9.
21. Chen, L., et al., Detecting early-warning signals for sudden deterioration of complex diseases by dynamical network biomarkers. Scientific Reports, 2012. 2: p. 342.
22. Bandyopadhyay, S., M. Mehta, and D. Kuo, Rewiring of genetic networks in response to DNA damage. Science, 2010(December): p. 1385-1389.
23. Jinesh, G.G., et al., Molecular genetics and cellular events of K-Ras-driven tumorigenesis. Oncogene, 2017.
24. Schäfer, H., et al., TGF-β1-dependent L1CAM expression has an essential role in macrophage-induced apoptosis resistance and cell migration of human intestinal epithelial cells. Oncogene, 2013. 3244(10): p. 180189.
25. Stadnicki, A., et al., Transforming growth factor-beta1 and its receptors in patients with ulcerative colitis. International Immunopharmacology, 2009. 9: p. 761-766.
26. Babyatsky, M.W., G. Rossiter, and D.K. Podolsky, Expression of transforming growth factors alpha and beta in colonic mucosa in inflammatory bowel disease. Gastroenterology, 1996. 110(4): p. 975-84.
27. Boo, Y.-J., et al., L1 Expression as a Marker for Poor Prognosis, Tumor Progression, and Short Survival in Patients with Colorectal Cancer. Annals of Surgical Oncology, 2007. 14(5): p. 1703-1711.
28. Kaifi, J.T., et al., L1 is associated with micrometastatic spread and poor outcome in colorectal cancer. Mod Pathol, 2007. 20(11): p. 1183-1190.


29. Gavert, N., et al., L1, a novel target of β-catenin signaling, transforms cells and is expressed at the invasive front of colon cancers. Journal of Cell Biology, 2005. 168(4): p. 633-642.
30. Ellenrieder, V., A. Buck, and T.M. Gress, TGFbeta-regulated transcriptional mechanisms in cancer. International journal of gastrointestinal cancer, 2002. 31(1-3): p. 61-9.
31. Ellenrieder, V., et al., Transforming growth factor beta1 treatment leads to an epithelial-mesenchymal transdifferentiation of pancreatic cancer cells requiring extracellular signal-regulated kinase 2 activation. Cancer research, 2001. 61(10): p. 4222-8.
32. Zavadil, J. and E.P. Böttinger, TGF-β and epithelial-to-mesenchymal transitions. Oncogene, 2005. 24(37): p. 5764-74.
33. Shi, J., et al., Discovery of cancer drug targets by CRISPR-Cas9 screening of protein domains. Nature biotechnology, 2015. 33(6): p. 661-667.
34. Hsu, Patrick D., Eric S. Lander, and F. Zhang, Development and Applications of CRISPR-Cas9 for Genome Engineering. Cell, 2014. 157(6): p. 1262-1278.
35. Shalem, O., et al., Genome-Scale CRISPR-Cas9 Knockout Screening in Human Cells. Science, 2014. 343(6166): p. 84-87.
36. Jiang, P., et al., Network analysis of gene essentiality in functional genomics experiments. Genome Biol, 2015. 16: p. 239.
37. Lieben, L., Genetic screens: CombiGEM - high-throughput identification of combinatorial gene effects. Nature Reviews Genetics, 2015. 16(10): p. 564-565.
38. Bassik, M.C., et al., A systematic mammalian genetic interaction map reveals pathways underlying ricin susceptibility. Cell, 2013. 152(4): p. 909-922.
39. Kampmann, M., M.C. Bassik, and J.S. Weissman, Integrated platform for genome-wide screening and construction of high-density genetic interaction maps in mammalian cells. Proceedings of the National Academy of Sciences of the United States of America, 2013. 110(25): p. E2317-26.
40. Obaya, A.J., et al., The Proto-oncogene c-myc Acts through the Cyclin-dependent Kinase (Cdk) Inhibitor p27Kip1 to Facilitate the Activation of Cdk4/6 and Early G1 Phase Progression. Journal of Biological Chemistry, 2002. 277(34): p. 31263-31269.
41. Haas, K., et al., Mutual requirement of CDK4 and Myc in malignant transformation: evidence for cyclin D1/CDK4 and p16INK4A as upstream regulators of Myc. Oncogene, 1997. 15(2): p. 179-92.
42. Dang, C.V., MYC, metabolism, cell growth, and tumorigenesis. Cold Spring Harb Perspect Med, 2013. 3(8).
43. Stine, Z.E., et al., MYC, Metabolism, and Cancer. Cancer Discov, 2015. 5(10): p. 1024-39.
44. Guo, Y., et al., Multiscale modeling of inflammation-induced tumorigenesis reveals competing oncogenic and oncoprotective roles for inflammation. Cancer Res, 2017. 77(22): p. 6429-6441.
45. Zhao, S. and S. Li, Network-based relating pharmacological and genomic spaces for drug target identification. PloS One, 2010. 5(7): p. e11764.
46. Li, R., et al., Imbalanced network biomarkers for traditional Chinese medicine Syndrome in gastritis patients. Sci. Rep., 2013. 3: p. 1543.
47. Li, S., et al., Understanding ZHENG in traditional Chinese medicine in the context of neuro-endocrine-immune network. IET Systems Biology, 2007. 1(1): p. 51-60.
48. Ulitsky, I. and R. Shamir, Identifying functional modules using expression profiles and confidence-scored protein interactions. Bioinformatics, 2009. 25(9): p. 1158-64.
49. Ekbom, A., et al., Ulcerative colitis and colorectal cancer. A population-based study. The New England Journal of Medicine, 1990. 323(18): p. 1228-1233.
50. Wiercińska-Drapało, A., R. Flisiak, and D. Prokopowicz, Effect of ulcerative colitis activity on plasma concentration of transforming growth factor β 1. Cytokine, 2001. 14(6): p. 343-346.
51. Griffith, M., et al., DGIdb: mining the druggable genome. Nature Methods, 2013. 10(12): p. 1209-1210.
52. Kummar, S., et al., Utilizing targeted cancer therapeutic agents in combination: novel approaches and urgent requirements. Nature reviews. Drug discovery, 2010. 9(11): p. 843-856.
53. Borisy, A.A., et al., Systematic discovery of multicomponent therapeutics. Proceedings of the National Academy of Sciences of the United States of America, 2003. 100(13): p. 7977-7982.
54. Griner, L.A.M., et al., High-throughput combinatorial screening identifies drugs that cooperate with ibrutinib to kill activated B-cell–like diffuse large B-cell lymphoma cells. Proceedings of the National Academy of Sciences, 2014. 111(6): p. 2349-2354.

[Table of contents graphic: computational methods and experimental methods for the identification of synergistic modules (immune module, proliferation module, metabolism module).]