Pathway-Based Drug Repositioning for Cancers: Computational

Article Cite This: J. Med. Chem. 2018, 61, 9583−9595

pubs.acs.org/jmc

Pathway-Based Drug Repositioning for Cancers: Computational Prediction and Experimental Validation
Michio Iwata,† Lisa Hirose,‡ Hiroshi Kohara,‡,§ Jiyuan Liao,‡,§ Ryusuke Sawada,∥ Sayaka Akiyoshi,∥ Kenzaburo Tani,‡,⊥ and Yoshihiro Yamanishi*,†,#




† Department of Bioscience and Bioinformatics, Faculty of Computer Science and Systems Engineering, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka 820-8502, Japan
‡ Project Division of ALA Advanced Medical Research, The Institute of Medical Science, The University of Tokyo, 4-6-1 Shirokanedai, Minato-ku, Tokyo 108-8639, Japan
§ Division of Molecular and Clinical Genetics, Department of Molecular Genetics, Medical Institute of Bioregulation, Kyushu University, 3-1-1 Maidashi, Higashi-ku, Fukuoka, Fukuoka 812-8582, Japan
∥ Medical Institute of Bioregulation, Kyushu University, 3-1-1 Maidashi, Higashi-ku, Fukuoka, Fukuoka 812-8582, Japan
⊥ Division of Molecular Design, Research Center for Systems Immunology, Medical Institute of Bioregulation, Kyushu University, 3-1-1 Maidashi, Higashi-ku, Fukuoka, Fukuoka 812-8582, Japan
# PRESTO, Japan Science and Technology Agency, Kawaguchi, Saitama 332-0012, Japan

ABSTRACT: Developing drugs with anticancer activity and low toxic side-effects at low costs is a challenging issue for cancer chemotherapy. In this work, we propose to use molecular pathways as the therapeutic targets and develop a novel computational approach for drug repositioning for cancer treatment. We analyzed chemically induced gene expression data of 1112 drugs on 66 human cell lines and searched for drugs that inactivate pathways involved in the growth of cancer cells (cell cycle) and activate pathways that contribute to the death of cancer cells (e.g., apoptosis and p53 signaling). Finally, we performed a large-scale prediction of potential anticancer effects for all the drugs and experimentally validated the prediction results via three in vitro cellular assays that evaluate cell viability, cytotoxicity, and apoptosis induction. Using this strategy, we successfully identified several potential anticancer drugs. The proposed pathway-based method has great potential to improve drug repositioning research for cancer treatment.



INTRODUCTION
Cancer is a leading cause of death worldwide.1 In the United States, an estimated 609,640 deaths from different types of cancer occurred in 2018.2 Currently, the cancer lifetime risk in Japan is approximately 50%.3,4 It is well-known that most anticancer drugs used in chemotherapy kill both cancer and normal cells, and thus their toxic side-effects are a serious problem.5 Cancer patients are commonly treated with anticancer drugs on a regular basis. The damage caused by the drugs is physically and mentally painful for those patients, whose quality of life deteriorates significantly. In addition, the cost of cancer care is rapidly increasing, partly because of the emergence of antibody drugs.6 It is therefore a challenging issue to develop drugs for cancer chemotherapy that have not only anticancer effects but also low toxic side-effects and low prices. Drug repositioning is the identification of novel therapeutic indications (i.e., applicable diseases) for existing drugs, which makes drug discovery more efficient in terms of time, risk, and expenditure.7−9 Numerous drugs have unknown mechanisms of action; thus, they have enormous potential for new drug indications in terms of polypharmacology.10 Identifying new anticancer actions of existing drugs that have been confirmed to be safe for humans, and that are not necessarily known anticancer drugs, may lead directly to clinical application and prompt delivery to cancer patients. Thus, drug repositioning could potentially reduce the costs of drug development for cancer treatment and also reduce the risk of toxic side-effects. A variety of computational methods have been developed for drug repositioning, and these can be categorized as supervised and unsupervised approaches. In the supervised approach, prior information on drug−disease associations (drug indications) is required. A collection of known drug−disease associations was used for learning predictive models to predict new drug−disease associations using statistical machine learning algorithms, and the prediction was based on various biomedical data (e.g., genomic and omics data) for drugs and diseases in most previous

© 2018 American Chemical Society

Received: July 2, 2018
Published: October 29, 2018

DOI: 10.1021/acs.jmedchem.8b01044 J. Med. Chem. 2018, 61, 9583−9595

Journal of Medicinal Chemistry

Article

Figure 1. Overview of the proposed approach. (a) Identification of activated and inactivated pathways from drug-induced gene expression signatures. The up- and down-regulated genes in the signatures are mapped onto many biological pathway maps, and the enrichment of the up- and downregulated genes in each pathway is evaluated by the pathway enrichment analysis. (b) Prediction of potential anticancer effects of different drugs using the results of pathway enrichment analysis. Drugs that regulate cancer-related pathways, such as cell cycle pathway, p53 signaling pathway, and apoptosis pathway are selected. The pathway-based anticancer drug likeness score (PAD score) is calculated for each drug; high-scoring drugs are predicted to have anticancer effects.
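The enrichment step in panel (a) can be illustrated with a small sketch. The Results section states that enrichment significance was evaluated with a hypergeometric test; the function below is one common way to compute an upper-tail hypergeometric p-value and is not taken from the authors' code. The function name, argument names, and the toy numbers in the usage note are assumptions made for illustration.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_pathway, n_selected, n_overlap):
    """Upper-tail hypergeometric p-value, P(X >= n_overlap).

    n_background: number of genes measured in total (assumed gene universe)
    n_pathway:    number of those genes annotated to the pathway
    n_selected:   number of up- (or down-) regulated genes in the signature
    n_overlap:    number of selected genes that belong to the pathway
    """
    max_overlap = min(n_pathway, n_selected)
    total = comb(n_background, n_selected)
    # Sum the probability of every overlap at least as large as the observed one.
    tail = sum(
        comb(n_pathway, k) * comb(n_background - n_pathway, n_selected - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total
```

As a toy usage example, `hypergeom_enrichment_p(20, 5, 4, 2)` asks how surprising it is that at least 2 of 4 regulated genes fall in a 5-gene pathway when 20 genes were measured. Running the test once on up-regulated genes and once on down-regulated genes would give separate evidence for activation and inactivation of the same pathway.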

methods.11−15 A serious limitation of this strategy is that the prediction results are heavily dependent on prior information on known drugs for diseases of interest. Conversely, in the unsupervised approach, no prior information on drug−disease associations is required. In this context, profiling of genome-wide transcriptional responses to drug treatment on human cell lines is a popular approach for drug repositioning. The Connectivity Map (CMap) database, which stores gene expression profiles induced by 1309 compounds on five cancer cell lines, is commonly used.16 A variety of computational methods that use CMap have been proposed to detect unknown drug−disease associations based on the inverse correlation of gene expression patterns between drugs and diseases.17−21 However, these methodologies heavily depend on the coverage of drugs and cell lines in CMap. Recently, the Broad Institute launched the Library of Integrated Network-based Cellular Signatures (LINCS) L1000 database.22 LINCS stores gene expression profiles induced by 20,413 compounds on 72 human cell lines, greatly exceeding the numbers of CMap. There are already some investigations in which the LINCS resources are used. For example, a normalization method for L1000 data was developed,23 some chemical structures were related to chemically induced gene expression profiles,24 the modes of action of certain bioactive compounds were elucidated in a cell-specific manner,25 and inhibitory and activating targets of candidate drugs were distinctively identified.26 The LINCS database is expected to be valuable for many pharmaceutical applications, such as drug discovery and repositioning.

In cancerous states, there is a complicated combination of abnormalities in multiple genes and proteins, and the associated pathways, such as metabolic pathways and protein interactions that constitute the biological system, are not normally controlled. Some examples of cancer-related abnormalities are malfunctioning growth suppressors, resistance to cell death, and induction of angiogenesis,27 which are likely due to dysfunctional biological pathways rather than to single genes or proteins. In fact, genes and proteins with sequence mutations and abnormal expression tend to interact with each other or work in the same pathways, which is characteristic of the pathological conditions in various cancers, e.g., breast, ovarian, and lung cancers.28−33 Therefore, cancer-related pathways could be promising new drug targets. In fact, the importance of pathways has been recognized in recent pharmaceutical research.34 However, previous methods did not use pathway information or pathway analysis for drug screening. In this work, we developed a computational approach for drug repositioning for cancers that considers molecular pathways as the therapeutic target. The basic strategy for conventional drug discovery is to target a single cancer-specific protein and to search for compounds that specifically bind to it. However, the binding affinity to the target protein is not necessarily correlated with the therapeutic effects. In contrast, the strategy of the proposed method is to target cancer-related pathways, e.g., the cell cycle, apoptosis, and p53 signaling pathways, and to search for compounds that regulate them. We predicted novel anticancer effects of existing drugs using large-scale chemically induced transcriptome data from LINCS, demonstrating the usefulness and


Figure 2. Distribution of detected pathways for known anticancer drugs. The horizontal axis corresponds to the list of biological pathways and the vertical axis the frequency of detected pathways. Red squares indicate the number of activated pathways detected from up-regulated genes, and green circles the number of inactivated pathways detected from down-regulated genes. Pathways were sorted in descending order of the difference between the number of activated and inactivated ones. The pathways are colored according to the KEGG categories: Metabolism (blue), Environmental Information Processing (orange), Cellular Processes (pink), and Organismal Systems (purple).
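The up- and down-regulated gene sets from which activated and inactivated pathways are detected can be sketched as follows. The text defines a signature as the logarithmic ratio of expression after drug treatment to expression under control conditions, and the Discussion reports using the top and bottom 5% of genes. The base-2 logarithm, the function names, and the dictionary-based data layout are assumptions made for this illustration only.

```python
import math


def log_ratio_signature(treated, control):
    """Per-gene log ratio of treated to control expression.

    Both inputs map gene -> positive expression level. Base 2 is an
    assumption; the source only says 'logarithmic ratio'.
    """
    return {gene: math.log2(treated[gene] / control[gene]) for gene in treated}


def split_regulated(signature, fraction=0.05):
    """Return (up, down): the top and bottom `fraction` of genes by log ratio."""
    ranked = sorted(signature, key=signature.get, reverse=True)
    n = max(1, int(len(ranked) * fraction))  # keep at least one gene per side
    return set(ranked[:n]), set(ranked[-n:])
```

A percentile cutoff like this always selects the same number of genes per signature, which keeps enrichment tests comparable across drugs. A fold-change cutoff, which the Discussion mentions as an alternative, would instead let the number of selected genes vary with the strength of the response.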

efficiency of the proposed pathway-based drug repositioning method.

RESULTS
Overview of the Proposed Approach. A cell system consists of various molecular pathways, such as metabolic and signaling pathways and protein−protein interactions, which are responsible for most biological processes. Accordingly, we attempted to relate the response of a cell system to chemical perturbations, such as drugs, by measuring the transcriptional changes in gene expression levels of the system and correlating them with a cancerous state. For this purpose, we identified a catalog of cancer-related pathways and evaluated the extent of their regulation with drug-induced gene expression data. Below, we present an overview of the proposed approach; for the detailed process, please refer to the Experimental Section. We initially analyzed a set of drug-induced gene expression profiles, referred to as “gene expression signatures” or “signatures”, in which each element of the signature is the logarithmic ratio of gene expression measured after drug treatment to that measured under control conditions. We then predicted the potential anticancer effects of the drugs based on the drug-induced pathway activities. The proposed approach consists of two steps: (1) identification of the activity of cancer-related pathways, and (2) prediction of drugs with potential anticancer effects based on the pathway regulation (Figure 1). Figure 1a shows the procedure to identify activated and inactivated pathways from gene expression signatures. The up- and down-regulated genes in the signatures are mapped onto many biological pathway maps, and the enrichment of the regulated genes in each pathway is evaluated by the pathway enrichment analysis.35,36 Figure 1b presents the procedure to predict which drugs have potential anticancer effects using the results from the pathway enrichment analysis. For this study, we focused on three cancer-related pathways, namely the cell cycle pathway, the p53 signaling pathway, and the apoptosis pathway, and selected drugs that regulate them. For each drug, we evaluated the significance of pathway enrichment by a hypergeometric test. Finally, we computed the pathway-based anticancer drug likeness score (PAD score) for each drug based on the enrichment of the selected pathways. The PAD score is a prediction score for finding novel anticancer drugs (see the Experimental Section).

Pathway-Based Characterization of Known Anticancer Drugs. We performed pathway enrichment analyses for 1112 drugs and inferred their activities for 163 biological pathways based on the Kyoto Encyclopedia of Genes and Genomes (KEGG) database.37 Of the evaluated drugs, 83 are known anticancer agents approved for treatment. From the enrichment analysis results, the 83 known anticancer drugs were characterized according to their pathway activities. Figure 2 shows the distributions of activated or inactivated pathways for the known anticancer drugs. The numbers of activated or inactivated pathways are shown in Table S1. The pathway enrichment analysis provides a functional insight into anticancer drugs at a pathway level. As expected, a typical example of drug-induced inactivation was the cell cycle pathway (hsa04110), in which many genes were down-regulated: inhibiting the cell cycle machinery clearly interferes with cancer cell proliferation. Other examples of inactivation included the oocyte meiosis pathway (hsa04114) and the progesterone-mediated oocyte maturation pathway (hsa04914). Examples of the activated pathways included the p53 signaling pathway (hsa04115) and the apoptosis pathway (hsa04210). These observations agree with the fact that several anticancer drugs are known to activate cancer-suppressor genes and induce apoptosis.27 Another example of drug-induced activation was the oxytocin signaling pathway (hsa04921). Oxytocin, a


Figure 3. PCA results for (a) the original gene expression signatures, (b) pathway regulation profile, and (c) cancer pathway regulation profile. The upper panels represent the scatter-plots of the first and second principal component (PC1 and PC2) scores for drugs, and the bottom panels represent the PC loadings (weights) for features.

nonapeptide hormone, is important in the human reproductive system, and its therapeutic effect on cancer has been previously reported;38 thus, this pathway might be related to the inhibition of cancer cell proliferation. Additional examples included the activation of the cyclic adenosine monophosphate (cAMP) signaling pathway (hsa04024), the estrogen signaling pathway (hsa04915), the NF-kappa B signaling pathway (hsa04064), and the Toll-like receptor signaling pathway (hsa04620). The first two might be linked to the apoptosis pathways because some of the downstream pathways are related to the induction of apoptosis. Estrogen is known to affect the regulation of physiological processes such as reproduction and circulation in mammals, and estrogen-mediated pathways have been reported to promote apoptosis in a variety of cells.39 Activation of the NF-kappa B signaling and Toll-like receptor signaling pathways might be related to the activation of immune cells. We performed a principal component analysis (PCA) for the original gene expression signatures, the pathway regulation signatures, and the cancer pathway regulation signatures with 1112 drugs (see the Experimental Section). Figures 3a and 3b respectively show the PCA results for the original gene expression and pathway regulation signatures; the distributions of their PC scores and loadings differed markedly between the two. Both anticancer and other drugs were uniformly distributed in the original gene expression signatures. In contrast, the PC scores for the different types of drugs had characteristic features in the pathway regulation signatures. For example, anticancer drugs were tightly clustered with high scores and low variance along the PC2 axis, compared with other drugs. These results suggest that the pathway information provides different perspectives on drug activities. Figure 3c shows the PCA for the cancer pathway regulation signatures. According to the PC loading plot, PC1 corresponded to the inactivation of the cell cycle pathway, and PC2 corresponded to the activation of the p53 signaling pathway and the apoptosis pathway. For example, dactinomycin (D00214), an anticancer drug known as an inducer of apoptosis and p53,40 had the highest PC2 score, supporting this interpretation of PC2. Figure S1 shows the boxplots representing the distributions of the PC scores of anticancer drugs and other drugs. We observed a difference in the distributions between anticancer drugs and nonanticancer drugs. The PC1 and PC2 scores for anticancer drugs were significantly higher than those for nonanticancer drugs (p-value < 0.001 and p-value < 1 × 10⁻⁵, respectively, by Wilcoxon’s rank sum test). Some nonanticancer drugs were nevertheless widely distributed along both axes. Mefloquine (D04895), an antimalarial drug reported to interrupt the cell cycle pathway41 and to induce cell death,42 had the highest PC1 score. Mefloquine may therefore have anticancer activity. These results suggest that PC scores capture informative features that discriminate anticancer drugs from other drugs.

Prediction of Anticancer Effects for Drugs in the LINCS Database. We obtained PAD scores from the pathway enrichment analyses for the inactivation of the cell cycle pathway and the activation of the p53 signaling pathway and the apoptosis pathway (see the Experimental Section). We


Figure 4. Comparison of the PAD scores for known anticancer drugs and those of other drugs. (a) The distribution of PAD scores for drugs. The vertical axes indicate the relative frequency (density) of drugs. Left panel indicates the density for anticancer drugs, and right panel that of other drugs (i.e., nonanticancer drugs). (b) Classification of nonanticancer drugs according to the first level of the ATC code.
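The comparison of score distributions between known anticancer drugs and other drugs is reported with Wilcoxon's rank sum test. The sketch below shows one textbook form of that test, using average ranks for ties and a normal approximation for the p-value without a tie correction to the variance. It is an illustration of the test itself, not the authors' analysis code, and the function names are assumptions.

```python
import math


def average_ranks(values):
    """1-based ranks of `values`, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of positions i..j, converted to 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum_test(x, y):
    """Two-sided rank sum test; returns (U for x, z, approximate p-value)."""
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)  # no tie correction (assumption)
    z = (u1 - mean_u) / sd_u
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return u1, z, p
```

The normal approximation is reasonable for group sizes like those in the paper (83 versus 1029 drugs) but is crude for very small samples, where an exact test would be preferable.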

Figure 5. Drug−pathway regulation network. Purple and blue circles denote anticancer and other drugs, and yellow triangles correspond to the regulated pathways. Red and green lines denote the activation and inactivation of pathways by drugs. The size of drugs node indicates the sum of the edges of each node. The edge width indicates the number of cell lines in which the drug regulates the pathway. Twenty-nine drugs with the highest PAD scores and the drug−pathway regulations identified in more than 7 cell lines are shown.

calculated the scores for all the drugs for which drug-induced gene expression data were available in the LINCS database. All the results are available at the following Web site: http://labo.bio.kyutech.ac.jp/~yamani/pathwayDR/.
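To make the idea of a pathway-based score concrete, the sketch below combines enrichment p-values for the three regulation events named in the text: inactivation of the cell cycle pathway (hsa04110) and activation of the p53 signaling (hsa04115) and apoptosis (hsa04210) pathways. The actual PAD score is defined in the Experimental Section, which is not reproduced here, so the formula below (a sum of negative log10 p-values) is a hypothetical stand-in chosen only for illustration, as are the function name and data layout.

```python
import math

# The three regulation events the paper focuses on (KEGG IDs from the text).
CANCER_PATHWAY_EVENTS = (
    ("hsa04110", "inactivated"),  # cell cycle
    ("hsa04115", "activated"),    # p53 signaling
    ("hsa04210", "activated"),    # apoptosis
)


def illustrative_pad_score(p_values, floor=1e-300):
    """HYPOTHETICAL score, not the published PAD formula.

    p_values maps (pathway_id, direction) -> enrichment p-value.
    Events without a p-value are treated as p = 1 (no evidence).
    """
    return sum(
        -math.log10(max(p_values.get(event, 1.0), floor))
        for event in CANCER_PATHWAY_EVENTS
    )
```

Under this stand-in, a drug with strong evidence for all three events scores highest, and a drug with no enrichment scores zero. Any monotone combination of the three p-values would serve the same illustrative purpose.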


Table 1. List of Nonanticancer Drugs Used in the Experimental Validation^a

^a Drug ranking is shown for the proposed method. The ranks exclude the anticancer drugs according to the first level of the ATC code.

Figure 4a shows the distributions of the PAD scores for anticancer and other drugs, i.e., nonanticancer drugs. The scores of known anticancer drugs were higher than those of the others (p-value < 1 × 10⁻¹⁰ by Wilcoxon’s rank sum test), implying that the former tend to regulate the cancer-related pathways but the latter do not. Still, some nonanticancer drugs had high PAD


Figure 6. Experimental validation of predicted anticancer effects of drugs in the (a) cell viability, (b) cytotoxicity, and (c) apoptosis assay. The horizontal axis represents the concentration of each drug on a logarithmic scale. The vertical axis represents the relative viability, relative cytotoxicity, and relative apoptosis induction (caspase-3/7 activity). Each curve is colored differently for each cell line.

scores, suggesting that some of them may in fact regulate those pathways and function similarly to the known anticancer drugs. We then performed a large-scale prediction of potential anticancer effects for all the drugs, except the known anticancer ones. Figure 4b shows the classification of nonanticancer drugs according to the first level of the Anatomical Therapeutic Chemical classification system code (ATC code), and we found they belong to a wide variety of ATC codes. Menadione (D02335) from group B (blood and blood-forming organs) in the ATC code had the highest PAD score among all drugs. It was recently reported that this drug exhibits cytotoxicity against various cancer cell lines.43 Ixazomib (D10130), a proteasome inhibitor, had the second highest PAD score among all nonanticancer drugs. This drug significantly inhibits the proliferation of human colorectal cancer cell lines.44 These results suggest that drugs with high PAD scores are likely to have anticancer effects. Furthermore, we investigated the regulation of biological pathways by drugs with high PAD scores (see Table S2). Figure 5 presents the drug−pathway regulation network, where edges are placed between the drugs and pathways if the drugs activate or inactivate them. For example, menadione regulated cancer-related pathways, in accordance with its reported cytotoxicity in different cancer cell lines.43 Interestingly, it also regulated the oxytocin signaling pathway and estrogen signaling pathway. This implies that menadione may be a promising drug candidate for hormone-dependent cancers. We were able to confirm the validity of the predicted anticancer effects of several nonanticancer drugs with high PAD scores based on previous reports. Rosiglitazone (D08491), an antidiabetic, suppresses cancer cell growth and induces apoptosis.45 Zileuton (D00414), an antiasthmatic, inhibits the metastasis of prostate cancer.46 Perhexiline maleate (D05442), a vasodilator, enhances the activity of a known anticancer drug for neuroblastoma.47 Simvastatin (D00434), an antihyperlipidemic, inhibits cell growth and migration in anaplastic thyroid cancer.48 Capsaicin (D00250), an antineuralgic, suppresses cell proliferation and induces apoptosis in osteosarcoma.49 Valproic acid (D00399), an anticonvulsant, induces apoptosis in breast cancer.50 These results suggest a correlation between PAD scores and anticancer effects; thus, this score may help predict the anticancer activity of existing drugs.

Experimental Validation of the Predicted Anticancer Effects via Three in Vitro Cellular Assays. From the prediction results of the pathway-based method, we sorted all drugs in descending order of the PAD scores. Next, we excluded known anticancer drugs according to the first level of the ATC code. We also excluded drugs whose anticancer activities had already been reported when we started this work in 2014. Starting from the highest-ranked drug, we selected drugs that were available for purchase and deliverable on demand at that time. Finally, we selected 15


Figure 7. Self-ranks for all anticancer drugs in the performance evaluation. Distribution for (a) the baseline method and (b) the proposed pathwaybased method. (c) Comparison between the two methods.

nonanticancer drugs for experimental validation under the budget constraint. We tested the anticancer activity using the Triplex Assay (Promega, Madison, WI, USA), which is designed to detect cell viability, cytotoxicity, and apoptosis (caspase-3/7 activity as a marker of apoptosis). Table 1 lists the nonanticancer drugs used for the experimental validation. It is worth noting that high scores were not achieved for these predictions by the conventional similarity search based only on gene expression signatures (see Table S3). To evaluate the effect of the selected drugs, we used human cancer cell lines A549 (lung adenocarcinoma), Caco-2 (epithelial colorectal adenocarcinoma), AsPC-1 (pancreatic tumor), PC-3 (prostate cancer), and immortalized esophageal epithelial Het-1A (a representative normal human epithelial cell). According to the ATC codes, there was one drug in group A (alimentary tract and metabolism), two in C (cardiovascular system), one in J (anti-infectives for systemic use), two in P (antiparasitic products, insecticides, and repellents), and four in group N (nervous system). Five of the 15 selected drugs do not have assigned ATC codes. An antipsychotic agent, penfluridol (D02630), had the highest PAD score in the panel. Drugs with low PAD scores, such as the antiprotozoal agent nitazoxanide (D02486) and the antibacterial cilastatin (D02194), were also included to assess the correlation between PAD score and anticancer activity. Figure S2 shows the heat maps representing the effects of drugs on cell viability, cytotoxicity, and apoptosis. The drugs appropriate for cancer treatment should have low cell viability and high cytotoxicity and apoptosis values. Granisetron (D04370, ATC code A) and dirithromycin (D03865, ATC code J) have some effect on Het-1A but not on cancer cell lines. 
The heat map for apoptosis indicates that two code P drugs, niclosamide (D00436) and nitazoxanide (D02486), have no apoptosis-inducing properties for any cell line, but niclosamide has an effect on the viability of A549; moreover, it has no toxicity on Het-1A. Of the four code N drugs, penfluridol and promazine (D08430) showed toxicity and apoptosis-inducing properties only on cancer cell lines. Both drugs have appropriate effects on cell viability. This suggests that some drugs, especially

penfluridol and promazine, have promising characteristics as novel anticancer drugs. Figure 6 shows the results of the experimental validation for the five drugs with the highest PAD scores: penfluridol, niclosamide, phenothiazine (D02601), promazine, and nicardipine (D08270) (see Table 1). In the cell viability assay, all these drugs except phenothiazine induced a dose-dependent decrease in the viability of all the tested cells (Figure 6a). In the cytotoxicity assay, three of these drugs (all except niclosamide and phenothiazine) induced a dose-dependent increase in cytotoxicity with different specificity; PC-3 and A549 were sensitive to penfluridol, A549 and Caco-2 to promazine, and PC-3 to nicardipine (Figure 6b). Accordingly, the same three drugs showed a dose-dependent upregulation of apoptosis (Figure 6c). Penfluridol-induced apoptosis was remarkable in Caco-2; promazine-induced activity was clearly observed in A549 and Caco-2, and only slightly in AsPC-1; and nicardipine-induced activity was slightly noticeable in Caco-2. Overall, these results suggest that penfluridol, promazine, and nicardipine have anticancer activity against various cancer cell types with slightly different but overlapping sensitivity. In fact, the pathway enrichment analysis identified the activation of the apoptosis pathway in colorectal adenocarcinoma cell lines (e.g., LOVO, RKO, and HCT116) for penfluridol, promazine, and nicardipine; and in a lung carcinoma cell line (H1299) for promazine. The results obtained by the computational analysis were consistent with those from the experimental validation. Therefore, the predicted drugs may be promising candidates for cancer treatment. Figures S3, S4, and S5 show the results for 10 additional drugs in cell viability, cytotoxicity, and apoptosis assays, respectively.
Although the antiprotozoal agent nitazoxanide showed significant activity in reducing cell viability in all the cells tested, cytotoxicity and apoptosis induction were not observed. This suggests that drugs with low PAD scores do not present clear anticancer activity in the in vitro culture. Unexpectedly, in the apoptosis assay, niclosamide and nitazoxanide showed values far lower than that of the control (dimethyl sulfoxide, DMSO), indicating that these drugs may inhibit the assay in some way.


Performance Evaluation. We evaluated the performance of the proposed pathway-based method on the prediction of the anticancer effect of drugs. The similarity search based on gene expression profiles is a baseline method often used for this purpose. In this study, we searched databases for drugs that have gene expression patterns similar to a known anticancer drug (query). We compared the performance of the proposed pathway-based approach to that of the baseline method. Drugs with reported anticancer effects (approved for treatment) are referred to as “known anticancer drugs”. We used the 83 known anticancer drugs as the standard data, and the remaining 1029 as candidate drugs. We conducted a self-rank test by jackknife-type (leave-one-out) cross-validation as follows: (1) Of the 83 known anticancer drugs, we chose one as a test drug and considered it as a drug with unknown anticancer effects. (2) We computed the PAD scores for 1029 candidate drugs and the test drug. (3) We ranked the test drug based on the PAD scores among these 1029 + 1 drugs. (4) We repeated these steps for all the known anticancer drugs. A self-rank of 1 is a perfect prediction, indicating that the method is able to correctly predict the known anticancer effect of the test drug. For the random prediction, the self-rank follows the uniform distribution on the interval from 1 to 1030. Figure 7 compares the distributions of the computed self-ranks for the 83 known anticancer drugs using the baseline and proposed pathway-based methods. For both approaches, the self-ranks are distributed in the high-rank section, which means that transcriptome data are useful to predict anticancer effects. The pathway-based method was better than the baseline method at a significant level (p-value = 0.016, Wilcoxon’s signed-rank test). These results suggest that pathway regulation is informative for predicting the anticancer effects of existing drugs.
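The self-rank procedure in steps (1) to (4) can be sketched as follows. The sketch assumes each drug already has a score that does not depend on the other known anticancer drugs, as would be the case for a per-drug pathway score. For the similarity-search baseline, the score of the held-out drug would have to be recomputed against the remaining known drugs in each round, which is not shown. Function names and the tie-handling rule are assumptions.

```python
def self_rank(test_score, candidate_scores):
    """Rank (1 = best) of the held-out drug among the candidates plus itself.

    Higher scores rank better. Candidates tied with the test drug do not
    push its rank down (an assumed tie-handling rule).
    """
    return 1 + sum(1 for score in candidate_scores if score > test_score)


def self_rank_test(known_scores, candidate_scores):
    """Leave-one-out self-ranks for every known anticancer drug.

    known_scores:     dict mapping drug name -> score (the 83 known drugs)
    candidate_scores: list of scores for the candidate drugs (the 1029 others)
    """
    return {
        drug: self_rank(score, candidate_scores)
        for drug, score in known_scores.items()
    }
```

With 1029 candidates, each self-rank lies between 1 and 1030, which matches the range stated above for a random predictor. The resulting per-drug ranks from two methods can then be compared pairwise, as the paper does with Wilcoxon's signed-rank test.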

database was constructed by performing pathway enrichment analysis of drug-induced gene expression data in CMap.56 However, previous methods did not use pathway information or pathway analysis for drug screening. In addition, most previous methods used drug-induced gene expression data in CMap, and thus, the numbers of drugs and cell lines are limited. Our proposed method is the first to use cancer-related pathway information and pathway analysis for drug screening for cancers. We also used drug-induced gene expression data from LINCS, providing advantages in terms of the numbers of drugs and cell lines. We computed the pathway-based anticancer drug likeness score (PAD score) to identify novel anticancer drugs. It is difficult to determine a specific threshold for the PAD score. The PAD scores were expected to reflect the involvement of drugs in cancer-related pathways. Known anticancer drugs do not always have high PAD scores (see Figure 4a), and there are several explanations for the low PAD scores of known anticancer drugs. First, the effects of anticancer drugs are limited to specific cancer types. Second, the current selection of cancer-related pathways was not optimal for evaluating anticancer activities. Third, the expression levels of genes involved in cancer-related pathways were not differentially changed by drug treatment. The PAD score is a new predictor of anticancer activities, but much room for improvement remains. In this paper we focused on presenting the concept of pathway-based drug screening, and therefore, the current definition of the PAD score may not be perfect. In our future work, we will seek to improve the PAD score from mathematical and biological viewpoints. We focused on the regulation of three cancer-related pathways: the inactivation of the cell cycle pathway, and the activation of the p53 signaling pathway, and the activation of the apoptosis pathway. 
However, the regulation of different pathways may also be considered for this method. It is worth noting that the performance of the method depended on the definition of the pathways. Thus, a more appropriate definition of cancer-related pathways could improve the accuracy and coverage of the prediction of anticancer effects. In the pathway enrichment analysis, we focused on genes that were up- or down-regulated by chemical influence and identified the pathways in which these regulated genes were enriched. However, the performance of the method heavily depends on the quality and coverage of the biological pathways and the threshold values for determining up- and down-regulated genes. We therefore extracted the top and bottom 5% of regulated genes and found that choosing an appropriate threshold value is an important issue that must be considered in future work. Another approach would be to use fold change thresholds for the up- and down-regulation. Even more careful parameter settings for the pathway enrichment analysis could further improve the performance of the prediction. We focused on the repositioning of approved drugs for cancer treatment from clinical perspectives. Thus, we used gene expression profiles only for approved drugs that were available from LINCS, which resulted in the inclusion of 1112 drugs. Additionally, LINCS contains gene expression profiles for approximately 18,000 compounds (excluding approved drugs). Our proposed method is applicable to other compounds as well, but this was beyond the scope of the current study. We aim to analyze other compounds in future research. In this study, we identified several drugs with potential anticancer effects by the combination of in silico predictions and in vitro cellular assays. However, appropriately validating the



DISCUSSION AND CONCLUSIONS

We proposed a novel drug repositioning approach for cancer treatment based on pathway regulation, “pathway-based drug repositioning”. The computational prediction is performed based on drug-induced activities in cancer-related biological pathways, which enables us to efficiently find novel anticancer effects of existing drugs from large-scale drug-induced transcriptome data. The concept of the proposed method differs from those of previous methods. The uniqueness of the proposed method lies in the use of molecular pathways as the therapeutic targets, in the screening of drugs for anticancer activities via cancer-related pathway analyses, in the use of large-scale drug-induced gene expression data from a wide range of human cell lines, and in the integrative work of computational prediction and experimental validation. We validated the computational prediction results by three in vitro cellular assays and successfully identified several drugs with anticancer effects specific to cancer cells and no toxic side-effects on normal cells. The proposed approach is expected to be useful for drug repositioning for cancer treatment. The importance of pathways has been recognized in recent pharmaceutical research, as drugs interact with multiple target proteins and interfere with molecular pathways in a modular manner.34 For example, pathway fingerprints were constructed by evaluating the influences of drugs on targets, and the similarity of fingerprints was used for identifying drug function.51 Transcriptionally similar drugs were evaluated in a pathway-based manner,52 pathway enrichment analysis was performed using curated information on drug targets,53−55 and a


“ctl_vehicle”) and determined the correspondence between the treatment and control profiles by comparing their “distil_ids”. We excluded the gene expression profiles that lacked corresponding control profiles. The name of each compound was converted into its corresponding InChIKey (http://www.iupac.org/home/publications/e-resources/inchi.html) via the perturbation ID using the information provided in LINCS, which yielded the gene expression profiles of 71 cell lines treated with 20,122 bioactive compounds. We used the gene expression profiles, measured at 6 (6.4) h, of 66 cell lines treated with 1112 drugs, which are a subset of all the bioactive compounds. Of the 1112, 1029 drugs were not known as anticancer drugs. Construction of Gene Expression Signatures. We constructed the drug-induced gene expression profiles, referred to as “gene expression signatures” or “signatures”. A gene expression signature is a high-dimensional feature vector in which each element is defined as the logarithmic ratio of the gene expression value measured after compound treatment to that measured under control conditions. In LINCS, gene expression profiles denoted by “Level 3” and “Level 4” are available. The former consists of profiles generated from invariant-set scaling and quantile normalization. Level 4 data are profiles generated using robust z-scores relative to population controls obtained from the average over all other wells on the 384-well plate. We applied the following normalization procedure using Level 3 data. A gene expression signature was constructed based on the logarithmic ratio of the compound treatment profile to the biological control profile.16 Then, it was centered to have a mean of zero and scaled to have a standard deviation of one. We represented the gene expression values in the signature of each drug with a feature vector $x = (x_1, x_2, \ldots, x_d)^{T}$, where d is the number of features, which is identical to the number of genes.
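The signature construction described above (log ratio of treatment to control, then centering and scaling) can be sketched as follows. This is an illustration under stated assumptions rather than the authors' pipeline: the function name `expression_signature` is hypothetical, the inputs are assumed to be positive linear-scale expression values, the logarithm base is assumed to be 2, and the population standard deviation is assumed for scaling. None of these choices is specified in the text.

```python
import math
import statistics


def expression_signature(treated, control):
    """Build a standardized log-ratio signature for one treatment profile.

    treated, control: equal-length lists of positive expression values,
    one entry per gene (assumption: linear scale, not already log-scaled).
    """
    if len(treated) != len(control):
        raise ValueError("profiles must cover the same genes")
    # Log ratio of treatment to control for each gene (base 2 is an assumption).
    log_ratio = [math.log2(t / c) for t, c in zip(treated, control)]
    mean = statistics.fmean(log_ratio)
    # Population standard deviation (assumption; the text does not specify).
    sd = statistics.pstdev(log_ratio)
    if sd == 0:
        raise ValueError("signature has zero variance and cannot be scaled")
    # Center to mean zero and scale to unit standard deviation.
    return [(v - mean) / sd for v in log_ratio]
```

If the input profiles are already log-scaled, the ratio would instead be a difference of the two profiles; the centering and scaling step is unchanged.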
For each cell line, the same drug has multiple signatures based on various concentrations and different time points. In this study, multiple signatures were merged into one by averaging the signatures, and 6.4 h were considered as 6 h. Pathway Enrichment Analyses. We performed pathway enrichment analyses of up- and down-regulated genes following the procedures reported in previous studies for different biological evaluations.35,36 We used 163 biological pathways in the following categories of the KEGG: Metabolism (except for global and overview maps), Environmental Information Processing (except for membrane transport and signaling molecules and interaction), Cellular Processes (except for transport and catabolism), and Organismal Systems. To perform the analysis, we used the genes ranked in the top 5% and bottom 5%. Let Gdrug denote a set of up- or down-regulated genes in a signature induced by a drug, and Gpathway denote a set of genes in a pathway map. Also, let r = |Gdrug|, k = |Gpathway|, z = |Gdrug∩Gpathway|, and l be the total number of genes in the entire data set (l = 978). We assumed that z follows a hypergeometric distribution. Therefore, the probability of observing an intersection of size z between Gpathway and Gdrug is computed as follows:

anticancer activity from various perspectives for all cancers is a difficult task. The anticancer efficacy of penfluridol for pancreatic cancer, where tumor growth is suppressed by induced apoptosis, has been recently reported.57 Contrastingly, the experimental validation results show that penfluridol has anticancer activity for the pancreatic cancer cell only in the cell viability assay, not in the apoptosis induction. The experimental conditions used in previous studies, such as the penfluridol concentration and exposure period, differed from those in our assay. Therefore, it is possible that the anticancer activity of penfluridol against pancreatic cancer would be observed under other experimental conditions. Niclosamide, one of the experimental drugs in this study, affects only the cell viability, but different anticancer activity was reported in previous studies.58,59 Also, phenothiazine, another drug used for the validation, showed no effects on any of the experimental assays. However, it was reported that this drug might work against prostate cancers by inducing apoptosis.25 In the previous study, apoptosis-inducing effects were predicted only on the basis of a computational prediction, and it was not experimentally validated. Thus, the observation for phenothiazine in our assays is more reliable than that in previous research. Some of the predicted drugs whose anticancer activities were not validated in this study still have a possibility to be repositioned for cancer treatment. To identify the final candidate drugs for clinical trials, we are currently working on the experimental validation of the predicted anticancer effects of drugs by comparing the tumor shrinkage effect in vivo using a human tumor immunodeficiency mouse model. In the experiments, we made comparisons between normal and cancerous human cell lines. 
It is noteworthy that we experimentally observed relatively lower cytotoxicity in normal epithelial Het-1A cells than in cancer cell lines in the presence of tested drugs such as penfluridol. Although our current criteria did not exclude high-toxicity drugs, it is critically important to discover anticancer drugs with sufficient safety margins. Identifying the genes and pathways responsible for differences in cytotoxicity between normal and cancer cells through global gene expression analysis should help improve the therapeutic index.



EXPERIMENTAL SECTION

Drug-Induced Transcriptome Data. Gene expression profiles from the LINCS project were obtained from the Broad Institute’s Web site (http://www.lincsproject.org). To our knowledge, LINCS is the largest database storing chemically induced gene expression data in terms of the numbers of compounds and cell lines. Thus, we used LINCS in this study. This project includes 77 human cell lines and various cellular perturbations, including compound treatment, gene silencing, and gene overexpression. We used the data from compound treatment experiments. Gene expression levels were measured using flow cytometry,60 and test samples were prepared using 384-well plates. LINCS provides 978 landmark genes called “L1000 genes”. Of these genes, 614 are essential according to the database of essential genes.61 The expression levels of these landmark genes were experimentally measured22 and used for this study. Those of the remaining genes (approximately 21,000) were estimated using a computational model based on the Gene Expression Omnibus.62 The expression levels of the landmark genes were measured at approximately 6 (6.4), 24 (24.4), and 48 h after compound treatment. Note that 6.4 and 24.4 h were the actual values, but they are denoted as 6 and 24 h, respectively, for simplicity. Each gene expression profile was filed using its “distil_id”. The total number of profiles was 1,328,098. We first selected 663,594 compound treatment profiles (denoted “trt_cp” in the information) and 28,557 control profiles (denoted

$$
P(G_{\mathrm{pathway}}, G_{\mathrm{drug}}) = \sum_{i=z}^{\min(k,\,r)} \frac{\binom{k}{i}\binom{l-k}{r-i}}{\binom{l}{r}}
$$
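The hypergeometric tail probability above can be evaluated exactly with integer binomial coefficients. The sketch below is illustrative only: the function name `enrichment_p_value` is hypothetical, and the symbols z, k, r, and l follow the definitions in the text (overlap size, pathway size, regulated gene set size, and the 978 measured genes).

```python
from math import comb


def enrichment_p_value(z, k, r, l=978):
    """Probability of an overlap of at least z genes between a pathway of
    size k and a regulated gene set of size r, drawn from l genes in total.

    The sum runs from i = z to min(k, r), as in the equation above.
    """
    upper = min(k, r)
    total = comb(l, r)
    # comb returns 0 when r - i exceeds l - k, so impossible terms vanish.
    tail = sum(comb(k, i) * comb(l - k, r - i) for i in range(z, upper + 1))
    return tail / total
```

Because the numerator and denominator are exact integers and only the final division is done in floating point, the result avoids the accumulated rounding error of summing many small probabilities.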

The resulting P values were corrected using the false discovery rate (FDR).63 Pathway Regulation and Cancer Pathway Regulation Signatures. We represented the pathway enrichment analysis of each drug as a feature vector:

$$
f_{\mathrm{path}} = \left(f_1^{\mathrm{inh}}, f_2^{\mathrm{inh}}, \ldots, f_{d_{\mathrm{path}}}^{\mathrm{inh}}, f_1^{\mathrm{act}}, f_2^{\mathrm{act}}, \ldots, f_{d_{\mathrm{path}}}^{\mathrm{act}}\right)^{T}
$$

where $f_j^{\mathrm{inh}}$ is the frequency of inhibition of the j-th pathway, $f_j^{\mathrm{act}}$ is the frequency of activation of the j-th pathway, and $d_{\mathrm{path}}$ is the total number of pathways. The feature vector $f_{\mathrm{path}}$ is referred to as the pathway regulation signature.
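The FDR correction applied to the enrichment P values can be illustrated with a short sketch. The text cites the FDR without naming the procedure, so the Benjamini-Hochberg step-up adjustment used here is an assumption, and the function name `benjamini_hochberg` is hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return FDR-adjusted p-values in the original input order.

    Assumption: Benjamini-Hochberg adjustment; the source does not
    state which FDR procedure was used.
    """
    n = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0  # adjusted values are capped at 1
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted
```

Under this adjustment, a set of weak raw P values is pushed toward the cap of 1.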




In addition, we represented the pathway enrichment analysis of each drug for cancer-related pathways (i.e., cell cycle pathway, p53 signaling pathway, and apoptosis pathway) as a feature vector:

ASSOCIATED CONTENT

Supporting Information

The Supporting Information is available free of charge on the ACS Publications website at DOI: 10.1021/acs.jmedchem.8b01044. Table S1: The number of detected pathways for known anticancer drugs. Table S2: The list of drugs that have top 30 PAD scores. Table S3: List of nonanticancer drugs used in the experimental validation. Figure S1: Distributions of the first and second principal component (PC1 and PC2) scores for anticancer drugs and the other drugs. Figure S2: Heatmap representation of the results of experimental validations for anticancer effects of drugs in terms of cell viability, cytotoxicity, and apoptosis inducibility (caspase 3/7 activity). Figure S3: Experimental validation of predicted anticancer effects of drugs by cell viability assay. Figure S4: Experimental validation of predicted anticancer effects of drugs by cytotoxicity assay. Figure S5: Experimental validation of predicted anticancer effects of drugs by apoptosis inducibility assay based on Caspase 3/7 activity (PDF)

$$
f_{\mathrm{cpath}} = \left(f_1^{\mathrm{inh}}, f_2^{\mathrm{inh}}, \ldots, f_{d_{\mathrm{cpath}}}^{\mathrm{inh}}, f_1^{\mathrm{act}}, f_2^{\mathrm{act}}, \ldots, f_{d_{\mathrm{cpath}}}^{\mathrm{act}}\right)^{T}
$$

where $f_j^{\mathrm{inh}}$ is the frequency of inhibition and $f_j^{\mathrm{act}}$ is the frequency of activation of the j-th cancer-related pathway, and $d_{\mathrm{cpath}}$ is the total number of cancer-related pathways. The feature vector $f_{\mathrm{cpath}}$ is referred to as the cancer pathway regulation signature. Pathway-Based Anticancer Drug Likeness Score. Using the proposed pathway-based method, we computed the pathway-based anticancer drug likeness score (PAD score) for each query drug as follows:

$$
S_{\mathrm{drug}} = \left\{ \sum_{j=1}^{m} \sum_{k=1}^{n} I\{P(G_{\mathrm{pathway}}, G_{\mathrm{drug}}) < 1.0\} \right\} - \log\left(\min\{P(G_{\mathrm{pathway}}, G_{\mathrm{drug}})\}\right)
$$

where $S_{\mathrm{drug}}$ is the proposed PAD score, m is the number of pathways to be regulated (three in this study), n is the number of cell lines (66 in this study), I{·} is an indicator function that returns 1 if the event is true and 0 otherwise, P is the FDR-corrected p-value computed in the pathway enrichment analysis, and min{·} is an operation that takes the minimum value. The indicator function was used to count the number of cancer-related pathways that were regulated by drugs in each cell line. The indicator function returns 0 in most cases, as the FDR-corrected p-value is often 1. To practically enhance the sensitivity of pathway enrichment detection, we used FDR-corrected p-value