Article pubs.acs.org/jcim

Large-Scale Prediction of Beneficial Drug Combinations Using Drug Efficacy and Target Profiles

Hiroaki Iwata,† Ryusuke Sawada,† Sayaka Mizutani,‡ Masaaki Kotera,‡ and Yoshihiro Yamanishi*,†,¶

† Division of System Cohort, Multi-Scale Research Center for Medical Science, Medical Institute of Bioregulation, Kyushu University, 3-1-1 Maidashi, Higashi-ku, Fukuoka, Fukuoka 812-8582, Japan
‡ Graduate School of Bioscience and Biotechnology, Tokyo Institute of Technology, 2-12-1 Ookayama, Meguro-ku, Tokyo 152-8550, Japan
¶ Institute for Advanced Study, Kyushu University, 6-10-1, Hakozaki, Higashi-ku, Fukuoka, Fukuoka 812-8581, Japan

Supporting Information

ABSTRACT: The identification of beneficial drug combinations is a challenging issue in pharmaceutical and clinical research toward combinatorial drug therapy. In the present study, we developed a novel computational method for large-scale prediction of beneficial drug combinations using drug efficacy and target profiles. We designed an informative descriptor for each drug−drug pair based on multiple drug profiles representing drug-targeted proteins and Anatomical Therapeutic Chemical Classification System codes. Then, we constructed a predictive model by learning a sparsity-induced classifier based on known drug combinations from the Orange Book and KEGG DRUG databases. Our results show that the proposed method outperforms the previous methods in terms of the accuracy of high-confidence predictions, and the extracted features are biologically meaningful. Finally, we performed a comprehensive prediction of novel drug combinations for 2,639 approved drugs, which predicted 142,988 new potentially beneficial drug−drug pairs. We showed several examples of successfully predicted drug combinations for a variety of diseases.



© 2015 American Chemical Society
Received: March 5, 2015 Published: December 1, 2015
DOI: 10.1021/acs.jcim.5b00444 J. Chem. Inf. Model. 2015, 55, 2705−2716
Journal of Chemical Information and Modeling

INTRODUCTION

Combinatorial drug therapy is useful for the treatment of complex diseases such as cancers and diabetes mellitus because it can improve therapeutic efficacy and reduce the risk of adverse drug reactions (ADRs). The identification of beneficial drug combinations for combinatorial drug therapy is a challenging issue in pharmaceutical and clinical research. Current knowledge about beneficial drug combinations is limited. Because almost all drugs have medicinal and adverse effects, adjustments of drug type and dose based on the patient’s condition are typically required.1,2 In recent years, various sources of large-scale data on drugs have emerged; thus, the systematic identification of novel drug combinations is expected to contribute to the development of combinatorial drug therapy. In addition, understanding the molecular mechanisms of combinatorial drug therapies should expedite the discovery of more efficacious drug combinations.

Many successful combinatorial drug treatments exist. For example, rosiglitazone and exenatide are used in combination for diabetes treatment. The use of rosiglitazone alone is associated with incidences of myocardial infarction; however, combined therapy with rosiglitazone and exenatide has been reported to reduce myocardial infarction.3 Using the bronchodilators ipratropium bromide and salbutamol together shows a better therapeutic effect than salbutamol alone for the treatment of severe asthma.4 Combined use of the vasodilators aspirin and dipyridamole has been shown to be more beneficial and safer than using either of the drugs alone for secondary prevention of stroke in the U.S.5

To date, the discovery of such drug combinations has been heavily dependent on empirical findings.6 An experimental method was proposed to find optimal combinations among a set of drugs,7 but the method is limited by analysis time and data resources. Screening all possible combinations of drugs is unrealistic; therefore, there is a strong incentive to develop computational methods for systematic identification of beneficial drug combinations.

Recently, a variety of computational methods have been proposed to predict beneficial drug combinations. The use of transcriptome data in the Connectivity Map database8 was proposed by several groups. Pathway enrichment analysis of drug-induced gene expression profiles was performed, and correlated drug pairs were detected, in Combinatorial Drug Assembler (CDA).9 Other information on drugs and diseases (e.g., drug chemical structures and drug similarities) was integrated with drug-induced gene expression data for drug combination prediction in DrugComboRanker.10 However, both methods are applicable only when gene expression data are available for both the drugs and the diseases of interest, which limits large-scale applications. Another promising approach is to use comprehensive drug-related data and information on currently known drug combinations.11 This approach allows researchers to predict new drug pairs comprising drugs that have feature pairs (e.g., side effects, pathways, targets, and indications) available in known Orange Book drug combinations.12 However, the applicability of this approach strongly depends on known drug combinations obtained from the Orange Book alone, and its prediction accuracy depends on the prediction algorithm. There remains potential for improvement regarding applicability and prediction accuracy.

In the present study, we developed a novel computational method for large-scale prediction of beneficial drug combinations using drug efficacy and target profiles. We designed an informative descriptor for each drug−drug pair based on multiple drug profiles representing drug-targeted proteins and Anatomical Therapeutic Chemical Classification System (ATC) codes. Then, we constructed a predictive model by learning a sparsity-induced classifier based on known drug combinations from the Orange Book and KEGG DRUG databases.13 The results show that our proposed method outperforms the previous methods in terms of accuracy of high-confidence predictions, and the extracted features are biologically meaningful. Finally, we used our method to perform a comprehensive prediction of novel drug combinations for 2,639 approved drugs, which predicted 142,988 new potentially beneficial drug−drug pairs. We showed several examples of successfully predicted drug combinations for a variety of diseases.

Figure 1. Workflow of the proposed method. First, we obtained known beneficial drug−drug associations from the Orange Book and KEGG DRUG database. Second, we constructed a descriptor of each drug−drug pair by combining the two drug profiles. Third, we learned a predictive model using logistic regression based on training data. Finally, we applied the predictive model to all possible drug−drug pairs and assigned prediction scores to all drug−drug pairs.



MATERIALS AND METHODS

Data Sets. Information about beneficial drug combinations was obtained from the Orange Book12 and KEGG DRUG databases.13 From the Orange Book, we extracted generic drugs comprising two prescription drugs and converted the Orange Book drug names into KEGG identifiers (D numbers) by exact name matching, followed by manual inspection. The resulting Orange Book data set comprised 382 drug combinations involving 368 drugs (referred to in the present study as “Orange Book data”). In the Orange Book data, the 382 known drug−drug combinations were treated as positive examples, and all other drug−drug pairs were treated as negative examples. From KEGG DRUG, we extracted the mixtures of two drugs. The resulting KEGG DRUG data set comprised 321 drug combinations involving 406 drugs (referred to in the present study as “KEGG DRUG data”). Merging the Orange Book data and KEGG DRUG data gave a combined data set (referred to in the present study as “Merged data”) comprising 583 beneficial drug combinations involving 588 drugs. In the Merged data, the 583 known drug−drug combinations were treated as positive examples, and all other drug−drug pairs were treated as negative examples.

Information about drug−target interactions (including primary targets and off-targets) was obtained from seven major databases: KEGG DRUG,13 DrugBank,14 BindingDB,15 MATADOR,16 ChEMBL,17 PDSP-Ki,18 and TTD.19 The interaction data set comprised 4,007 drug−target interactions involving 588 drugs and 930 target proteins. Each drug was represented by a 930-dimensional binary vector whose elements encode the presence or absence of each target protein interaction by 1 or 0, respectively.

The ATC codes of the drugs were obtained from KEGG DRUG.13 The third level of the ATC code was used to represent drug therapy information. Each drug was coded using a 148-dimensional binary vector whose elements encode the presence or absence of each ATC code by 1 or 0, respectively.
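The binary profiles described here can be illustrated with a minimal Python sketch. The vocabularies and the drug record below are toy values chosen for illustration (the study uses 930 targets and 148 third-level ATC codes), and the helper names `atc_third_level` and `binary_fingerprint` are hypothetical. The sketch assumes that the third ATC level corresponds to the first four characters of a full ATC code, and it shows how a target profile and an ATC profile could be joined into one vector per drug.

```python
def atc_third_level(code):
    """Truncate a full ATC code to its third level (first four characters)."""
    return code[:4]


def binary_fingerprint(present, vocabulary):
    """Return a 0/1 list marking which vocabulary entries occur in `present`."""
    present = set(present)
    return [1 if item in present else 0 for item in vocabulary]


# Toy vocabularies, assumed for illustration only.
target_vocab = ["hsa:43", "hsa:3156", "hsa:6559"]
atc_vocab = ["A03B", "C02A", "C10A", "S01E"]

# Hypothetical drug record: one target and one full ATC code.
drug_targets = ["hsa:3156"]
drug_atc_codes = ["C10AA01"]

target_fp = binary_fingerprint(drug_targets, target_vocab)
atc_fp = binary_fingerprint(
    [atc_third_level(c) for c in drug_atc_codes], atc_vocab
)

# One profile per drug: target bits followed by ATC bits.
profile = target_fp + atc_fp
```

Keeping the vocabulary order fixed across all drugs is what makes position k mean the same feature in every profile.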
Subsequently, each drug was coded with a 1078-dimensional binary vector that combined the target protein profile and the ATC code profile.

Drug indications were obtained from the KEGG database,13,20 where the information about drug indications was based on drug package inserts and medical books.21 Each drug was coded by a 218-dimensional binary vector whose elements encode the presence or absence of each indication by 1 or 0, respectively.

To predict new beneficial combinations, we obtained from KEGG DRUG13 2,051 drugs that have no known beneficial combinations. Each drug was represented by target protein and ATC code profiles. The Merged data set was used by the proposed method as training data to learn a predictive model for making new predictions.

Method for Predicting Beneficial Drug Combinations. We designed a supervised classification framework to predict new beneficial drug combinations using target proteins and ATC drug codes. The proposed method comprises four steps: “Data acquisition,” “Descriptor construction for drug−drug pairs,” “Construction of a predictive model,” and “Prediction of new drug combinations.” Figure 1 shows the workflow of the proposed method. First, we obtained currently known beneficial drug combinations from the Orange Book12 and KEGG DRUG databases.13 We represented each drug by a fingerprint (a binary feature vector) for each single data source and concatenated the fingerprints from the different data sets into a single profile for each drug. Second, we constructed an informative descriptor of each drug−drug pair by combining the two drug profiles. Third, we trained a logistic regression classifier to construct a predictive model, using known beneficial drug combinations as the training data set. Finally, we applied the predictive model to all possible drug−drug pairs and assigned a prediction score to each pair.

The model uses the following classification framework. One drug of a drug−drug pair is denoted $X$ and the other $X'$, so that the pair is written $(X, X')$. We represented the pair by a feature vector $\Phi(X, X')$ and then estimated a function $f(X, X') = w^{T}\Phi(X, X')$ that predicts whether or not $(X, X')$ is a combination pair. We optimized the weight vector $w$ on the learning set using label information.

The profile of drug $X$ was represented by an $N$-dimensional binary vector $\Phi(X) = (x_1, x_2, \ldots, x_N)^{T}$, where $x_k \in \{0, 1\}$, $k = 1, \ldots, N$. In this study, $\Phi(X)$ is the profile of target proteins and ATC codes of the drug. In the same manner, the profile of drug $X'$ was represented by $\Phi(X') = (x'_1, x'_2, \ldots, x'_N)^{T}$, where $x'_k \in \{0, 1\}$, $k = 1, \ldots, N$.

First, we designed a feature vector for each drug−drug pair considering the same features between $\Phi(X)$ and $\Phi(X')$ as follows:

$$\Phi_{\mathrm{same}}(X, X') = (x_1 x'_1, x_2 x'_2, \ldots, x_N x'_N)^{T}$$

Second, we designed a feature vector for each drug−drug pair considering different feature pairs between $\Phi(X)$ and $\Phi(X')$ as follows:

$$\Phi_{\mathrm{diff}}(X, X') = (x_1 x'_2, x_1 x'_3, \ldots, x_1 x'_N, \ldots, x_{N-2} x'_{N-1}, x_{N-2} x'_N, x_{N-1} x'_N)^{T}$$

Note that $\Phi_{\mathrm{same}}(X, X')$ is an $N$-dimensional feature vector and $\Phi_{\mathrm{diff}}(X, X')$ is an $N(N-1)/2$-dimensional feature vector. Based on the concatenation of $\Phi_{\mathrm{same}}(X, X')$ and $\Phi_{\mathrm{diff}}(X, X')$, we designed an integrative high-dimensional feature vector for each drug−drug pair, considering unique combinations of feature pairs between $\Phi(X)$ and $\Phi(X')$ as follows:

$$\Phi(X, X') = (\Phi_{\mathrm{same}}(X, X')^{T}, \Phi_{\mathrm{diff}}(X, X')^{T})^{T} = (x_1 x'_1, x_2 x'_2, \ldots, x_N x'_N, x_1 x'_2, x_1 x'_3, \ldots, x_1 x'_N, \ldots, x_{N-2} x'_{N-1}, x_{N-2} x'_N, x_{N-1} x'_N)^{T}$$
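The pair descriptor defined above can be written out directly. The following Python sketch follows the definitions literally (products of matching positions, then products over index pairs with i < j); the function name `pair_descriptor` and the three-bit toy profiles are assumptions made for illustration, not the authors' implementation.

```python
def pair_descriptor(x, x_prime):
    """Build Phi(X, X') = (Phi_same, Phi_diff) from two equal-length 0/1 profiles."""
    if len(x) != len(x_prime):
        raise ValueError("profiles must have the same length")
    n = len(x)
    # Phi_same: products of matching positions, x_k * x'_k.
    same = [x[k] * x_prime[k] for k in range(n)]
    # Phi_diff: products x_i * x'_j for i < j, in the order of the definition.
    diff = [x[i] * x_prime[j] for i in range(n) for j in range(i + 1, n)]
    return same + diff


# Toy profiles with N = 3, so the descriptor has 3 + 3 = 6 entries.
x = [1, 0, 1]
x_prime = [1, 1, 0]
phi = pair_descriptor(x, x_prime)  # [1, 0, 0, 1, 0, 0]

# Size of the descriptor for the 1078-dimensional profiles used in the study.
full_dimension = 1078 * (1078 + 1) // 2  # 581581
```

Taken literally, the definition depends on which drug is labeled X: swapping the two toy profiles above changes the second half of the vector. How each pair is ordered in practice is not specified in this section, so the sketch leaves that choice to the caller.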

Note that $\Phi(X, X')$ is an $N(N+1)/2$-dimensional feature vector.

We used logistic regression as a binary classifier to predict whether or not a pair comprising drug $X$ and drug $X'$ would be beneficial. To avoid overfitting, the predictive model learned weight parameters by minimizing the loss function with L1-regularization. L2-regularization is standard, but it tends to make most weight elements nonzero, which makes it difficult to interpret features from the resulting weight vector. In contrast, L1-regularization tends to make most weight elements zero, which simplifies extracting informative features from the resulting weight vector. Therefore, we used L1-regularization. Given a learning set of drug−drug pairs and association labels $(\Phi(X_i, X'_j), y_{ij})$, $y_{ij} \in \{+1, -1\}$ $(i, j = 1, 2, \ldots, n;\ i < j)$, where $n$ is the number of unique drugs in the learning set, we optimized the weight vector $w$ of the linear logistic regression with L1-regularization as follows:

$$\min_{w} \; \|w\|_1 + C \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \log\left(1 + \exp\left(-y_{ij}\, w^{T}\Phi(X_i, X'_j)\right)\right)$$

where $\|\cdot\|_1$ is the L1 norm (the sum of absolute values), and $C$ is a penalty parameter that controls sparsity. We examined various values (0.0001, 0.001, 0.01, 0.1, 1, 10, 100, 1000, 10000) for the penalty parameter $C$ and selected the value that gave the highest area under the curve (AUC) score in the pairwise cross-validation (CV) experiment. We used the LIBLINEAR library22 to train the L1-regularized logistic regression model efficiently.

Figure 2. ROC and PR curves of the proposed method and the previous method using the Orange Book data set. The results show comparisons of the prediction accuracy using the drugs common to the two methods. Panels (A) and (B) show the ROC and PR curves, respectively, of the pairwise CV; panels (C) and (D) show the ROC and PR curves, respectively, of the blockwise CV (training drugs vs test drugs); and panels (E) and (F) show the ROC and PR curves, respectively, of the blockwise CV (test drugs vs test drugs).

Performance Evaluation Protocol. We used known drug combinations from the Orange Book data and Merged data as gold standard data sets. In general, there are many scenarios and objectives in addressing the prediction problem for object pairs.23−25 We evaluated the performance of the proposed method by simulating various scenarios in practical applications using two types of 5-fold cross-validation (CV) experiments, which we call pairwise CV and blockwise CV.

The 5-fold pairwise CV was performed as follows. First, we randomly divided all drug−drug pairs in the gold standard set into five independent subsets. Second, we identified one subset of drug−drug pairs as a test set and used the remaining four subsets of drug−drug pairs as a training set. Third, we optimized a predictive model based on the drug−drug pairs in the training set. Finally, we applied the predictive model to the drug−drug pairs in the test set. The drug−drug pairs were

considered independent of each other; therefore, the drugs in the test pairs overlapped to some extent with those in the training set.

The 5-fold blockwise CV was performed as follows. First, we randomly divided all drugs in the gold standard set into five independent subsets. Second, we specified one subset of drugs as test drugs and used the remaining four subsets of drugs as training drugs. Third, we optimized a predictive model based on drug−drug pairs comprising the training drugs. Finally, we applied the predictive model to two kinds of drug−drug pairs: (i) training drugs vs test drugs and (ii) test drugs vs test drugs. The drugs in the test set were completely different from those in the training set.

Figure 3. AUC (FPR = 0.01), AUC, and AUPR scores of the proposed method and the previous method using the Orange Book data set. The results show comparisons of the prediction accuracy using the drugs common to the two methods. Panels (A), (B), and (C) show the AUC (FPR = 0.01), AUC, and AUPR scores, respectively, of the pairwise CV; panels (D), (E), and (F) show the AUC (FPR = 0.01), AUC, and AUPR scores, respectively, of the blockwise CV (training drugs vs test drugs); and panels (G), (H), and (I) show the AUC (FPR = 0.01), AUC, and AUPR scores, respectively, of the blockwise CV (test drugs vs test drugs).

For performance comparison, we implemented the previous method,11 whose drug features include the drug target proteins, ATC codes, side effects, pathways, and drug indications. Herein, we provide a brief explanation of that work.11 It showed that side effects and pathways were not useful for drug combination prediction; consequently, we used only drug target proteins, ATC codes, and drug indications. According to the procedure explained in the previous paper, drug−target interactions were collected from the STITCH,26 DrugBank,14 and TTD19 databases. Drug indications were obtained from the SIDER database,27 where the information about drug indications was extracted from the indications sections of the drug package inserts. In that paper, 0.4 was set as the threshold confidence level for the prediction score. We used the same threshold.
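The difference between the two cross-validation schemes described in this section can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the authors' code: the function names, the fixed random seed, the round-robin fold assignment, and the ten placeholder drug identifiers are all hypothetical, and the gold standard is taken to be every unordered pair of the listed drugs.

```python
import random
from itertools import combinations


def pairwise_folds(drugs, k=5, seed=0):
    """Pairwise CV: split all unordered drug pairs into k disjoint test folds."""
    pairs = list(combinations(sorted(drugs), 2))
    random.Random(seed).shuffle(pairs)
    return [pairs[i::k] for i in range(k)]


def blockwise_split(drugs, fold, k=5, seed=0):
    """Blockwise CV: hold out a block of drugs, then group pairs by membership."""
    order = sorted(drugs)
    random.Random(seed).shuffle(order)
    test_drugs = set(order[fold::k])
    train_pairs, train_vs_test, test_vs_test = [], [], []
    for a, b in combinations(sorted(drugs), 2):
        held_out = (a in test_drugs) + (b in test_drugs)
        if held_out == 0:
            train_pairs.append((a, b))    # both drugs seen in training
        elif held_out == 1:
            train_vs_test.append((a, b))  # scenario (i)
        else:
            test_vs_test.append((a, b))   # scenario (ii)
    return train_pairs, train_vs_test, test_vs_test


# Ten placeholder drug identifiers give 45 unordered pairs.
drugs = [f"drug{i:02d}" for i in range(10)]
folds = pairwise_folds(drugs)
train_pairs, train_vs_test, test_vs_test = blockwise_split(drugs, fold=0)
```

In the pairwise split a drug can appear in both training and test pairs, whereas in the blockwise split every evaluated pair contains at least one drug that the model never saw during training.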



RESULTS

Performance Evaluation. We tested the proposed method on its ability to reconstruct known drug combinations in the Orange Book data and Merged data. We compared the performance of the proposed method with that of the previous method11 using the two data sets. Some features (target proteins, ATC codes, and indications) were not always available for all the drugs in each method; therefore, each method imputed missing data in the corresponding feature profiles using zeros. We performed two types of CVs: pairwise CV and blockwise CV (see Materials and Methods for more details).

We evaluated the performance of the method by receiver operating characteristic (ROC) and precision-recall (PR) curves. The ROC curve plots true positive rates against false positive rates, and the area under the ROC curve (AUC) is 1 for perfect prediction and 0.5 for random prediction. We computed AUC at a low false positive rate (FPR) because it is important for practical applications to evaluate high-confidence prediction score regions. The PR curve plots precision against recall, and the area under the PR curve (AUPR) is 1 for perfect prediction and equals the ratio of positive samples to all samples for random prediction. The PR curve is known to be a more practical performance measure than the ROC curve, especially when positive examples are far fewer than negative examples.28

Figure 2 shows the ROC and PR curves in the cross-validation experiments for the Orange Book data. For both the proposed method and the previous method, the curves of the blockwise CV were lower than those of the pairwise CV. This implies that it is more difficult to predict new combinations for uncharacterized drugs (without any known drug combination) than for drugs with at least one known drug combination. In the PR curves, the proposed method worked significantly better than the previous method in the pairwise CV and blockwise CV. In the low false positive region of the ROC curves, the proposed method worked better than the previous method. However, in the high false positive region of the ROC curves, the proposed method worked worse than the previous method. This result suggests that the proposed method has higher precision and lower sensitivity than the previous method. These properties of the proposed method are important from a practical standpoint because, in practice, experimental validations or clinical applications will usually be performed using high-confidence predictions with high prediction scores.

Figure 3 shows the AUC and AUPR scores for the Orange Book data, where the performance measures were calculated for the combined and individual profiles to evaluate the importance of each data source. Among all individual data sets, ATC code profiles seem to be the most useful information, followed by target protein profiles and indication profiles. Integrating multiple profiles provided better results overall. However, in the proposed method, the integration of the three profiles worked worse than the integration of target protein and ATC profiles. This result suggests that reasonable predictions can be achieved without using the indication information.

Figure 4. ROC and PR curves of the proposed method and the previous method using the Merged data set. The results show comparisons of the prediction accuracy using the drugs common to the two methods. Panels (A) and (B) show the ROC and PR curves, respectively, of the pairwise CV; panels (C) and (D) show the ROC curve and the PR curve, respectively, of the blockwise CV (training drugs vs test drugs); and panels (E) and (F) show the ROC curve and the PR curve, respectively, of the blockwise CV (test drugs vs test drugs).

Figure 4 shows the ROC and PR curves in the cross-validation experiments for the Merged data. Note that the Merged data set is much larger than the Orange Book data set. Figure 5 shows the AUC and AUPR scores for the Merged data, where the performance measures were calculated for the combined and individual profiles to evaluate the importance of each data source. Similar observations were made for the Orange Book data and Merged data, and the tendency was much clearer in the latter. In the PR curves and in the low false positive region of the ROC curves, the proposed method worked significantly better than the previous method in the pairwise CV and blockwise CV. These results suggest that the proposed method is useful in practice for large-scale applications of predicting new drug combinations.

The differences in AUC appear rather small; however, the differences in AUPR were large. The differences were clearly demonstrated by investigating the number of correct predictions of test drug pairs among the top 1,000 predictions (ranked from the highest prediction score) in the cross-validation experiments. In the case of the Orange Book data, the proposed method generated 165 correct predictions compared with only 96 with the previous method. In the case of the Merged data, the proposed method yielded 169 correct predictions compared with 72 with the previous method. These results clearly show that the top enrichment in correct hits is a practical advantage of the proposed method over the previous method.

Figure 5. AUC (FPR = 0.01), AUC, and AUPR scores of the proposed method and the previous method using the Merged data set. The results show comparisons of the prediction accuracy using the drugs common to the two methods. Panels (A), (B), and (C) show the AUC (FPR = 0.01), AUC, and AUPR scores, respectively, of the pairwise CV; panels (D), (E), and (F) show the AUC (FPR = 0.01), AUC, and AUPR scores, respectively, of the blockwise CV (training drugs vs test drugs); and panels (G), (H), and (I) show the AUC (FPR = 0.01), AUC, and AUPR scores, respectively, of the blockwise CV (test drugs vs test drugs).

Large-Scale Prediction of Novel Drug Combinations. Finally, we made comprehensive predictions for unknown drug combinations for all possible drugs. It is noteworthy that the

beneficial combinations of many marketed drugs have not been fully identified. We trained a predictive model on the Merged data and applied it to 2,639 drugs with information available on the target proteins or ATC codes. As a result, we were able to predict 142,988 new potentially beneficial drug−drug pairs involving 1,981 drugs. The top 1,000 predictions can be found in Table S1.

We tested the novelty of our prediction results by comparing them with those obtained with the previous method. Among the 142,988 drug pairs predicted by the proposed method, 4,836 pairs were identified by the previous method, whereas the remaining 138,152 pairs were unique to our method (Figure S1, Supporting Information). Table 1 presents a list of the top 20 drug pairs sorted by our prediction scores. These drug pairs are visualized in a network representation (Figure 6), wherein the edges of the newly predicted pairs are indicated in red, and the known drug pairs are indicated in gray. Of the top 20 predicted pairs, seven pairs (35%) were validated in the scientific literature.

We investigated whether the newly predicted drug combinations were simply based on the same ATC code (or the same disease). Of the top 20 predictions in Table 1, only 4 drug pairs shared at least one ATC code, and 16 drug pairs did not share the same ATC code. This implies that the prediction results are not always obvious.

Herein, we make three important remarks regarding the validated pairs. The first remark relates to the use of adrenaline, a well-known neurotransmitter also known as epinephrine, in combination with a family of topical anesthetics including cocaine (rank 1), chloroprocaine (rank 2), and procaine (rank 18). These combinations were proven beneficial in humans29 and/or animals.30,31 The combination of adrenaline and cocaine was reported to be safe and effective for the treatment of minor oral lacerations in patients.29 The combination of adrenaline and chloroprocaine prolonged chloroprocaine−bupivacaine sciatic blocks in Sprague-Dawley rats.30 Finally, mixtures of adrenaline and procaine extended the duration of local anesthesia in horses.31

The second remark relates to the combinations of antibiotics and anti-inflammatory agents. As several candidate pairs were predicted, they form a large subnetwork in Figure 6. In Table 1, these pairs include chloramphenicol−dexamethasone (rank 3), erythromycin−dexamethasone (rank 4), and erythromycin−fluorometholone (rank 12). These combinations have been validated in human studies.32,33 The combination therapy of chloramphenicol and dexamethasone improved the

Table 1. List of Predicted Beneficial Drug Combinations

| rank | drug1 ID | drug1 name | drug2 ID | drug2 name | common ATC code | prediction score | confirmation | indication/efficacy | ref |
|---|---|---|---|---|---|---|---|---|---|
| 1 | D00095 | adrenaline | D00110 | cocaine | no | 8.65 | yes | minor oral lacerations | 29 |
| 2 | D00095 | adrenaline | D07678 | chloroprocaine | no | 7.45 | yes | sciatic nerve blockade | 30 |
| 3 | D00104 | chloramphenicol | D00292 | dexamethasone | yes | 6.99 | yes | recurrent pterygium | 32 |
| 4 | D00140 | erythromycin | D00292 | dexamethasone | yes | 6.88 | yes | post operative abomasal hypomobility | 34 |
| 5 | D00104 | chloramphenicol | D01367 | fluorometholone | yes | 6.87 | no | | |
| 6 | D00584 | fluorouracil | D07870 | dopamine | no | 6.76 | yes | breast and colon cancer | 35 |
| 7 | D00088 | hydrocortisone | D00104 | chloramphenicol | no | 6.75 | no | | |
| 8 | D00140 | erythromycin | D08333 | pentamidine | no | 6.70 | no | | |
| 9 | D00371 | theophylline | D08187 | metamfetamine | no | 6.65 | no | | |
| 10 | D07668 | chlorhexidine | D08410 | pravastatin | no | 6.52 | no | | |
| 11 | D00295 | cimetidine | D07668 | chlorhexidine | no | 6.42 | no | | |
| 12 | D00140 | erythromycin | D01367 | fluorometholone | yes | 6.33 | yes | ocular rosacea | 33 |
| 13 | D00110 | cocaine | D02335 | menadione | no | 6.30 | no | | |
| 14 | D00434 | simvastatin | D10178 | trelagliptin | no | 5.92 | no | | |
| 15 | D00140 | erythromycin | D00295 | cimetidine | no | 5.85 | no | | |
| 16 | D00088 | hydrocortisone | D01954 | sulfaphenazole | no | 5.68 | no | | |
| 17 | D00105 | estradiol | D09567 | ulipristal | no | 5.64 | no | | |
| 18 | D00095 | adrenaline | D08422 | procaine | no | 5.51 | yes | local anesthetic | 31 |
| 19 | D08288 | nortriptyline | D08354 | phenindione | no | 5.48 | no | | |
| 20 | D00066 | progesterone | D05173 | nisoxetine | no | 5.42 | no | | |
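A ranking like the one in Table 1 comes from scoring every candidate pair with the linear function f(X, X′) = wᵀΦ(X, X′) and sorting by score. Because L1-regularization leaves most weights at zero, the score can be computed from the nonzero weights alone without materializing the full descriptor. The Python sketch below illustrates this idea under stated assumptions: the function names, the dictionary layout of the weights, and all numerical values are made up for illustration and are not taken from the study.

```python
from itertools import combinations


def score_pair(weights, x, x_prime):
    """Linear score w^T Phi(X, X') using only the nonzero weights.

    `weights` maps an index pair (i, j) with i <= j to its weight;
    the matching descriptor entry is x[i] * x_prime[j].
    """
    return sum(w * x[i] * x_prime[j] for (i, j), w in weights.items())


def rank_pairs(weights, profiles, top_k):
    """Score every unordered drug pair and return the top_k by descending score."""
    scored = [
        (score_pair(weights, profiles[a], profiles[b]), a, b)
        for a, b in combinations(sorted(profiles), 2)
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:top_k]


# Toy sparse model and drug profiles (made-up values).
weights = {(0, 0): 1.5, (0, 2): 2.0, (1, 2): -0.5}
profiles = {
    "drugA": [1, 0, 0],
    "drugB": [0, 1, 1],
    "drugC": [1, 0, 1],
}
top = rank_pairs(weights, profiles, top_k=2)
```

With a few hundred nonzero weights, scoring each pair this way costs a few hundred multiplications instead of one per descriptor entry, which matters when millions of candidate pairs are evaluated.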

Figure 6. Predicted drug combination network. The network comprises known combinations and newly predicted combinations, in which the edges of known beneficial drug combinations are colored gray and the edges of newly predicted drug combinations are colored red.

safety and efficacy of human processed pericardium.32 The erythromycin−dexamethasone pair restored corticosteroid sensitivity through the inhibition of the PI3K-δ/Akt pathway and upregulation of GRα.34

The third remark is on the use of dopamine, another well-known neurotransmitter, as an adjuvant for the cytotoxic antineoplastic agent fluorouracil (rank 6). An increased efficacy of tumor growth inhibition has been reported for the combined use of dopamine and 5-fluorouracil in mice.35 This study showed that dopamine markedly improved the efficacy of 5-fluorouracil for cancer treatments.35 Note that this pair was not predicted by the previous method.

Among the seven drug pairs described above, four pairs were unique to our method (Table 1). These results suggest that our method is capable of predicting novel beneficial drug combinations in addition to those predictable by the previous method.

Informative Drug Features Extracted by the Predictive Model. We examined highly weighted drug features in the predictive model learned from the Merged data set. Because the predictive model used in this study (i.e., L1-regularized logistic regression) yields a sparse weight vector, it can be used to extract informative drug features. All results of this examination are contained in the Supporting Information. Table S2 shows 725 feature pairs that comprise drug target proteins or ATC codes. Table S3 shows 338 feature pairs comprising target proteins only, while Table S4 shows 116 feature pairs comprising ATC codes only. These feature pairs can be useful for understanding the mechanism of beneficial drug combinations. We discuss some examples below.

Table 2 shows examples of highly weighted pairs of drug features that comprise drug target proteins and ATC codes, which explain the coadministration mechanisms. The first feature pair, C02A (antiadrenergic agents, centrally acting) and hsa:6559 (SLC12A3: sodium/chloride transporter), includes

the drug pair reserpine (an antihypertensive agent) and hydrochlorothiazide (a diuretic agent). Reserpine blocks adrenergic neurons by inhibiting the uptake of catecholamine and serotonin into the synaptic vesicles, thereby lowering blood pressure. Hydrochlorothiazide increases the amount of salt and water in urine, helping to lower blood pressure and decrease edema. In combination, these drugs are used to effectively lower high blood pressure.36

The second feature pair comprises B02B (vitamin K and other hemostatics) and N01B (anesthetics, local). Complete synthesis of proteins required for blood coagulation is dependent on vitamin K; its shortage causes uncontrolled bleeding. Epinephrine is a hormone and a neurotransmitter, and it lowers the blood flow. In combination with local anesthetics such as lidocaine, epinephrine helps the anesthetic to diffuse in the blood at an effectively slow pace. This lengthens the amount of time a patient can undergo surgery without pain, although these drugs are contraindicated for use in procedures involving structures at terminal arteries (such as fingertips) and for patients who suffer from systemic diseases, such as diabetes.37

The third feature pair comprises A03B (belladonna and derivatives, plain) and hsa:43 (acetylcholinesterase). The corresponding drug pairs include atropine and pralidoxime, which are used in combination to treat organophosphate poisoning. Organophosphate insecticides and nerve gases, such as tabun, sarin, soman, and VX, destroy acetylcholinesterase by phosphorylation. Pralidoxime interrupts phosphorylation, and atropine blocks muscarinic acetylcholine receptors. Coadministration of these agents cooperatively reduces the effect of the poisoning.38

The fourth feature pair comprises S01E (antiglaucoma preparations and miotics) and S01E (antiglaucoma preparations and miotics). The corresponding drug pairs include latanoprost and timolol, coadministered in an eye drop solution. Both of these agents aid treatment for primary open angle glaucoma.39

The fifth feature pair comprises J01A (tetracyclines) and S01H (local anesthetics). Antibiotics, including tetracyclines, are used to prevent infections at the surgical site, which is standard care in surgery using anesthetics.40 To reduce the risk of nosocomial infection, antibiotics are selected from the wide variety available to address only the likely pathogens.

The sixth feature pair comprises C10A (lipid modifying agents, plain) and hsa:3156 (3-hydroxy-3-methylglutaryl-CoA reductase). The corresponding drug pairs include ezetimibe and simvastatin. Whereas ezetimibe reduces the amount of cholesterol absorbed from the diet, simvastatin inhibits the hydroxymethylglutaryl-CoA reductase required to synthesize cholesterol. They are coadministered to lower cholesterol levels in the blood.41

The seventh feature pair comprises D06A (antibiotics for topical use) and J01X (other antibacterials). The corresponding drug pairs include oxytetracycline and polymyxin B, which are both antibiotics used for the treatment of ocular bacterial infections.42

The twelfth feature pair comprises A02B (drugs for peptic ulcer and gastro-oesophageal reflux disease) and M02A (topical products for joint and muscular pain). The corresponding drug pairs include misoprostol and diclofenac. Diclofenac is a nonsteroidal anti-inflammatory drug (NSAID) that can cause stomach ulcers and bleeding. Misoprostol stimulates mucus

Table 2. Examples of Highly Weighted Drug Features Inferred by the Proposed Method

| rank | weight | drug1 feature ID | drug1 description | drug2 feature ID | drug2 description |
|---|---|---|---|---|---|
| 1 | 5.21 | C02A | antiadrenergic agents, centrally acting | hsa:6559 | SLC12A3: solute carrier family 12 (sodium/chloride transporter), member 3 |
| 2 | 4.50 | B02B | vitamin K and other hemostatics | N01B | anesthetics, local |
| 3 | 4.01 | A03B | belladonna and derivatives, plain | hsa:43 | ACHE: acetylcholinesterase (EC:3.1.1.7) |
| 4 | 3.97 | S01E | antiglaucoma preparations and miotics | S01E | antiglaucoma preparations and miotics |
| 5 | 3.63 | J01A | tetracyclines | S01H | local anesthetics |
| 6 | 3.53 | C10A | lipid modifying agents, plain | hsa:3156 | HMGCR: 3-hydroxy-3-methylglutaryl-CoA reductase (EC:1.1.1.34) |
| 7 | 3.48 | D06A | antibiotics for topical use | J01X | other antibacterials |
| 8 | 3.46 | V08A | X-ray contrast media, iodinated | V08A | X-ray contrast media, iodinated |
| 9 | 3.40 | J04A | drugs for treatment of tuberculosis | J04A | drugs for treatment of tuberculosis |
| 10 | 3.27 | J01C | beta-lactam antibacterials, penicillins | J01C | beta-lactam antibacterials, penicillins |
| 11 | 3.18 | D10A | anti-acne preparations for topical use | D10A | anti-acne preparations for topical use |
| 12 | 3.14 | A02B | drugs for peptic ulcer and gastro-oesophageal reflux disease (GORD) | M02A | topical products for joint and muscular pain |
| 13 | 2.94 | J05A | direct acting antivirals | hsa:1633 | DCK: deoxycytidine kinase (EC:2.7.1.74) |
| 14 | 2.92 | hsa:316 | AOX1: aldehyde oxidase 1 (EC:1.2.3.1) | hsa:5241 | PGR: progesterone receptor |
| 15 | 2.91 | C01E | other cardiac preparations | D08A | antiseptics and disinfectants |
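As a rough illustration of how a ranking like the one in Table 2 can arise, the following minimal sketch fits an L1-regularized logistic regression to drug pairs and sorts the learned weights. Several things here are assumptions rather than the authors' implementation: the pair descriptor is taken to be the set of unordered feature pairs (one feature from each drug), the solver is a hand-written proximal-gradient loop instead of the LIBLINEAR library cited in the paper, and the six drugs `d1` to `d6`, their feature sets, and the labeling rule are invented toy data that only borrow feature IDs from Table 2 as labels.

```python
import math
from itertools import combinations


def pair_descriptor(features_a, features_b):
    """Unordered feature pairs, one feature from each drug (assumed descriptor)."""
    return {tuple(sorted((f, g))) for f in features_a for g in features_b}


def fit_l1_logistic(samples, labels, lam=0.02, lr=0.5, epochs=300):
    """Fit L1-regularized logistic regression by proximal gradient descent.

    samples: list of sets of active binary features; labels: list of 0/1.
    """
    keys = sorted(set().union(*samples))
    w = dict.fromkeys(keys, 0.0)
    b = 0.0
    n = len(samples)
    for _ in range(epochs):
        grad = dict.fromkeys(keys, 0.0)
        grad_b = 0.0
        for feats, y in zip(samples, labels):
            z = b + sum(w[k] for k in feats)
            err = 1.0 / (1.0 + math.exp(-z)) - y  # predicted probability minus label
            grad_b += err / n
            for k in feats:
                grad[k] += err / n
        b -= lr * grad_b  # the bias term is not penalized
        for k in keys:
            v = w[k] - lr * grad[k]
            # Soft-thresholding: the proximal step for the L1 penalty,
            # which can set a weight to exactly zero.
            w[k] = math.copysign(max(abs(v) - lr * lam, 0.0), v)
    return w, b


# Hypothetical toy drugs; the feature IDs are used only as labels.
drugs = {
    "d1": {"C02A", "J01A"},
    "d2": {"hsa:6559", "S01E"},
    "d3": {"C02A", "S01H"},
    "d4": {"hsa:6559", "J01A"},
    "d5": {"S01E", "S01H"},
    "d6": {"J01A", "S01E"},
}
planted = ("C02A", "hsa:6559")  # toy rule: a pair is "beneficial" if it has this feature pair

pairs = list(combinations(sorted(drugs), 2))
samples = [pair_descriptor(drugs[a], drugs[b]) for a, b in pairs]
labels = [1 if planted in s else 0 for s in samples]

weights, bias = fit_l1_logistic(samples, labels)
ranked = sorted(weights.items(), key=lambda kv: -kv[1])
for feature_pair, weight in ranked[:3]:
    print(feature_pair, round(weight, 2))
```

In this toy setting the planted feature pair receives the largest positive weight, which mirrors how highly weighted feature pairs are read off the trained model. With real data, the regularization strength would need to be tuned, for example by cross-validation.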

DOI: 10.1021/acs.jcim.5b00444 J. Chem. Inf. Model. 2015, 55, 2705−2716


secretion from the gastrointestinal wall and reduces ulcers caused by NSAIDs.43 These examples demonstrate that the extracted drug feature pairs with high weights are biologically meaningful, which is an interesting result of the present study. This shows that the proposed model is able to extract the pairs of drug features that contribute most to beneficial drug combinations.

DISCUSSION

In the present study, we proposed a computational method for systematic prediction of beneficial drug combinations based on drug target protein and ATC code profiles. The novelty of the proposed method lies in its applicability to a larger number of drugs, in the interpretability of the predictive model, and in the high-performance prediction using state-of-the-art machine learning methods. We demonstrated the usefulness of the proposed method via various cross-validation (CV) experiments and described some successful examples of newly predicted drug combinations validated against independent resources.

One limitation of our proposed method is that information about ATC codes and target proteins is not always obtainable. This study showed that ATC code information was more informative than target protein information for drug combination prediction, but ATC information is available only for marketed drugs; it is unavailable for uncharacterized drugs or new drug candidate compounds. Thus, information on drug target proteins (including primary targets and off-targets) is crucial in practical applications, because drug-target interactions are involved in therapeutic pharmacological effects and in unexpected effects.44,45 However, the information on drug target proteins used in this study does not necessarily represent all target proteins of a specific drug; many drug-target interactions remain to be identified. More information on drug-target interactions is becoming available owing to the recent development of various biological assays and computational prediction methods.46−52 The use of more complete information on drug-target interactions should improve the performance of our proposed method.

Another limitation of the proposed method is that the disease context is not taken into account. From a clinical viewpoint, it is important to identify beneficial drug combinations for each disease. Unfortunately, it is very difficult to prepare the training data because current public databases do not describe drug combinations in sufficient detail in the disease context. One candidate data source is DailyMed;53 however, its drug and disease names are not described by a standard terminology and are not linked to any IDs in other drug or disease databases. Such unstructured text data require ontology-based curation, which is outside the scope of this study. The objective of this study was to design a systematic screening method for drug combination candidates. Integrating disease information into drug combination analysis remains a challenging task for future work.

In the present study, we focused on beneficial drug combinations, but the analysis of adverse drug−drug interactions is also valuable in drug development and administration.54 Inappropriate coadministration of drugs may cause adverse effects. The causes of adverse effects have been grouped according to whether the paired drugs act on (a) the same enzyme, (b) the same target, (c) the same protein family, or (d) the same pathway,55 and several computational analyses have been recently reported.55−59 All drug combinations analyzed in the present study are desirable coadministrations, but they could also be grouped as those acting on the same pathway. We consider this a logical approach because coadministration is usually designed to treat a single disease. The differences between preferable and inappropriate coadministrations may be small; careful examination of inappropriate coadministrations may help uncover preferable coadministrations, and vice versa.

ASSOCIATED CONTENT

Supporting Information

The Supporting Information is available free of charge on the ACS Publications website at DOI: 10.1021/acs.jcim.5b00444.
Table S1 contains the top 1,000 predictions of drug−drug combinations (XLSX)
Table S2 contains extracted drug feature pairs involving target proteins and ATC codes (XLSX)
Table S3 contains extracted drug target protein pairs (XLSX)
Table S4 contains extracted drug ATC code pairs (XLSX)
Figure S1 provides a Venn diagram of the predicted drug pairs between the proposed method and the previous method (PDF)



AUTHOR INFORMATION

Corresponding Author

*Phone: 81 92 642 6699. Fax: 81 92 642 6692. E-mail: [email protected].

Notes

The authors declare no competing financial interest.



ACKNOWLEDGMENTS

This work was supported by JSPS KAKENHI Grant Number 25700029, the Program to Disseminate Tenure Tracking System, MEXT, Japan, and Kyushu University Interdisciplinary Programs in Education and Projects in Research Development.



REFERENCES

(1) Evans, W. E.; Relling, M. V. Moving towards individualized medicine with pharmacogenomics. Nature 2004, 429, 464−468.
(2) International Warfarin Pharmacogenetics Consortium. Estimation of the Warfarin Dose with Clinical and Pharmacogenetic Data. N. Engl. J. Med. 2009, 360, 753.
(3) Zhao, S.; Nishimura, T.; Chen, Y.; Azeloglu, E. U.; Gottesman, O.; Giannarelli, C.; Zafar, M. U.; Benard, L.; Badimon, J. J.; Hajjar, R. J.; Goldfarb, J.; Iyengar, R. Systems Pharmacology of Adverse Event Mitigation by Drug Combinations. Sci. Transl. Med. 2013, 5, 206ra140.
(4) Hossain, A.; Barua, U.; Roy, G.; Sutradhar, S.; Rahman, I.; Rahman, G. Comparison of salbutamol and ipratropium bromide versus salbutamol alone in the treatment of acute severe asthma. Mymensingh Medical Journal: MMJ. 2013, 22, 345−352.
(5) Li, X.; Zhou, G.; Zhou, X.; Zhou, S. The efficacy and safety of aspirin plus dipyridamole versus aspirin in secondary prevention following TIA or stroke: A meta-analysis of randomized controlled trials. J. Neurol. Sci. 2013, 332, 92−96.
(6) Jia, J.; Zhu, F.; Ma, X.; Cao, Z. W.; Li, Y. X.; Chen, Y. Z. Mechanisms of drug combinations: interaction and network perspectives. Nat. Rev. Drug Discovery 2009, 8, 111−128.
(7) Zinner, R. G.; Barrett, B. L.; Popova, E.; Damien, P.; Volgin, A. Y.; Gelovani, J. G.; Lotan, R.; Tran, H. T.; Pisano, C.; Mills, G. B.; Mao, L.; Hong, W. K.; Lippman, S. M.; Miller, J. H. Algorithmic guided screening of drug combinations of arbitrary size for activity against cancer cells. Mol. Cancer Ther. 2009, 8, 521−532.
(8) Lamb, J.; Crawford, E. D.; Peck, D.; Modell, J. W.; Blat, I. C.; Wrobel, M. J.; Lerner, J.; Brunet, J.-P.; Subramanian, A.; Ross, K. N.; Reich, M.; Hieronymus, H.; Wei, G.; Armstrong, S. A.; Haggarty, S. J.; Clemons, P. A.; Wei, R.; Carr, S. A.; Lander, E. S.; Golub, T. R. The Connectivity Map: Using Gene-Expression Signatures to Connect Small Molecules, Genes, and Disease. Science 2006, 313, 1929−1935.
(9) Lee, J. H.; Kim, D. G.; Bae, T. J.; Rho, K.; Kim, J.-T.; Lee, J.-J.; Jang, Y.; Kim, B. C.; Park, K. M.; Kim, S. CDA: Combinatorial Drug Discovery Using Transcriptional Response Modules. PLoS One 2012, 7, e42573.
(10) Huang, L.; Li, F.; Sheng, J.; Xia, X.; Ma, J.; Zhan, M.; Wong, S. T. DrugComboRanker: drug combination discovery based on target network analysis. Bioinformatics 2014, 30, i228−i236.
(11) Zhao, X.-M.; Iskar, M.; Zeller, G.; Kuhn, M.; Van Noort, V.; Bork, P. Prediction of Drug Combinations by Integrating Molecular and Pharmacological Data. PLoS Comput. Biol. 2011, 7, e1002323.
(12) Hare, D.; Foster, T. The Orange Book: The Food and Drug Administration’s advice on therapeutic equivalence. Am. Pharm. 1990, 30, 35−37.
(13) Kanehisa, M.; Araki, M.; Goto, S.; Hattori, M.; Hirakawa, M.; Itoh, M.; Katayama, T.; Kawashima, S.; Okuda, S.; Tokimatsu, T.; Yamanishi, Y. KEGG for linking genomes to life and the environment. Nucleic Acids Res. 2008, 36, D480−D484.
(14) Knox, C.; Law, V.; Jewison, T.; Liu, P.; Ly, S.; Frolkis, A.; Pon, A.; Banco, K.; Mak, C.; Neveu, V.; Djoumbou, Y.; Eisner, R.; Guo, A. C.; Wishart, D. S. DrugBank 3.0: a comprehensive resource for ’Omics’ research on drugs. Nucleic Acids Res. 2011, 39, D1035−D1041.
(15) Liu, T.; Lin, Y.; Wen, X.; Jorissen, R. N.; Gilson, M. K. BindingDB: a web-accessible database of experimentally determined protein-ligand binding affinities. Nucleic Acids Res. 2007, 35, D198−D201.
(16) Günther, S.; Kuhn, M.; Dunkel, M.; Campillos, M.; Senger, C.; Petsalaki, E.; Ahmed, J.; Urdiales, E. G.; Gewiess, A.; Jensen, L. J.; Schneider, R.; Skoblo, R.; Russell, R. B.; Bourne, P. E.; Bork, P.; Preissner, R. SuperTarget and Matador: resources for exploring drug-target relationships. Nucleic Acids Res. 2008, 36, D919−D922.
(17) Gaulton, A.; Bellis, L. J.; Bento, A. P.; Chambers, J.; Davies, M.; Hersey, A.; Light, Y.; McGlinchey, S.; Michalovich, D.; Al-Lazikani, B.; Overington, J. P. ChEMBL: a large-scale bioactivity database for drug discovery. Nucleic Acids Res. 2012, 40, D1100−D1107.
(18) Roth, B. L.; Lopez, E.; Patel, S.; Kroeze, W. K. The Multiplicity of Serotonin Receptors: Uselessly Diverse Molecules or an Embarrassment of Riches? Neuroscientist 2000, 6, 252−262.
(19) Zhu, F.; Shi, Z.; Qin, C.; Tao, L.; Liu, X.; Xu, F.; Zhang, L.; Song, Y.; Liu, X.; Zhang, J.; Han, B.; Zhang, P.; Chen, Y. Therapeutic target database update 2012: a resource for facilitating target-oriented drug discovery. Nucleic Acids Res. 2012, 40, D1128−D1136.
(20) Kanehisa, M.; Goto, S.; Furumichi, M.; Tanabe, M.; Hirakawa, M. KEGG for representation and analysis of molecular networks involving diseases and drugs. Nucleic Acids Res. 2010, 38, D355−D360.
(21) Papadakis, M. A.; McPhee, S. J.; Rabow, M. W. CURRENT Medical Diagnosis and Treatment 2014; McGraw-Hill Medical: New York, 2013.
(22) Fan, R.-E.; Chang, K.-W.; Hsieh, C.-J.; Wang, X.-R.; Lin, C.-J. LIBLINEAR: A Library for Large Linear Classification. J. Mach. Learn. Res. 2008, 9, 1871−1874.
(23) Tabei, Y.; Pauwels, E.; Stoven, V.; Takemoto, K.; Yamanishi, Y. Identification of chemogenomic features from drug-target interaction networks using interpretable classifiers. Bioinformatics 2012, 28, i487−i494.
(24) Iwata, H.; Mizutani, S.; Tabei, Y.; Kotera, M.; Goto, S.; Yamanishi, Y. Inferring protein domains associated with drug side effects based on drug-target interaction network. BMC Syst. Biol. 2013, 7, S18.
(25) Park, Y.; Marcotte, E. M. Flaws in evaluation schemes for pair-input computational predictions. Nat. Methods 2012, 9, 1134−1136.
(26) Kuhn, M.; Szklarczyk, D.; Pletscher-Frankild, S.; Blicher, T. H.; von Mering, C.; Jensen, L. J.; Bork, P. STITCH 4: integration of protein-chemical interactions with user data. Nucleic Acids Res. 2014, 42, D401.
(27) Kuhn, M.; Campillos, M.; Letunic, I.; Jensen, L. J.; Bork, P. A side effect resource to capture phenotypic effects of drugs. Mol. Syst. Biol. 2010, 6, 343.
(28) Davis, J.; Goadrich, M. The relationship between Precision-Recall and ROC curves. Proceedings of the 23rd international conference on Machine learning. 2006; pp 233−240.
(29) Bonadio, W. A. Safe and Effective Method for Application of Tetracaine, Adrenaline, and Cocaine to Oral Lacerations. Ann. Emerg. Med. 1996, 28, 396−398.
(30) Yung, E.; Lahoti, T.; Jafari, S.; Weinberg, J. D.; SchianodiCola, J. J.; Yarmush, J. M.; Ray, S. D. Bicarbonate Plus Epinephrine Shortens the Onset and Prolongs the Duration of Sciatic Block Using Chloroprocaine Followed by Bupivacaine in Sprague-Dawley Rats. Reg. Anesth. Pain Med. 2009, 34, 196−200.
(31) Harkins, J.; Mundy, G.; Stanley, S.; Woods, W.; Boyles, J.; Arthur, R.; Sams, R.; Tobin, T. Regulatory significance of procaine residues in plasma and urine samples: preliminary communication. Equine Vet. J. 1996, 28, 121−125.
(32) Alvarenga, L. S.; de Sousa, L. B.; de Freitas, D.; Mannis, M. J. Efficacy and Safety of Recurrent Pterygium Surgery Using Human Processed Pericardium. Cornea 2002, 21, 542−545.
(33) Nazir, S. A.; Murphy, S.; Siatkowski, R. M.; Chodosh, J.; Siatkowski, R. L. Ocular rosacea in childhood. Am. J. Ophthalmol. 2004, 137, 138−144.
(34) Sun, X.-J.; Li, Z.-H.; Zhang, Y.; Zhou, G.; Zhang, J.-Q.; Deng, J.-M.; Bai, J.; Liu, G.-N.; Li, M.-H.; MacNee, W.; Zhong, X.-N. Combination of erythromycin and dexamethasone improves corticosteroid sensitivity induced by CSE through inhibiting PI3K-δ/Akt pathway and increasing GR protein. Am. J. Physiol. Lung Cell Mol. Physiol. 2015, 309, L139.
(35) Sarkar, C.; Chakroborty, D.; Chowdhury, U. R.; Dasgupta, P. S.; Basu, S. Dopamine Increases the Efficacy of Anticancer Drugs in Breast and Colon Cancer Preclinical Models. Clin. Cancer Res. 2008, 14, 2502−2510.
(36) http://www.drugs.com/mtm/hydralazine-hydrochlorothiazidereserpine.html (accessed February 12, 2015).
(37) http://www.drugs.com/pro/lidocaine-and-epinephrineinjection.html (accessed February 12, 2015).
(38) http://www.drugs.com/mtm/atropine-and-pralidoxime.html (accessed February 12, 2015).
(39) http://www.drugs.com/uk/latanoprost-timolol-50-microgramsml-5-mg-ml-eye-drops-solution-leaflet.html (accessed February 12, 2015).
(40) http://www.australianprescriber.com/magazine/28/2/38/40 (accessed February 12, 2015).
(41) http://www.drugs.com/cdi/ezetimibe-simvastatin.html (accessed May 27, 2015).
(42) http://www.drugs.com/mtm/oxytetracycline-and-polymyxin-bophthalmic.html (accessed May 27, 2015).
(43) http://www.medicinenet.com/diclofenacandmisoprostol/article.html (accessed May 27, 2015).
(44) Whitebread, S.; Hamon, J.; Bojanic, D.; Urban, L. Keynote review: In vitro safety pharmacology profiling: an essential tool for successful drug development. Drug Discovery Today 2005, 10, 1421−1433.
(45) Blagg, J. Structure-Activity Relationships for In vitro and In vivo Toxicity. Annu. Rep. Med. Chem. 2006, 41, 353.
(46) Keiser, M. J.; Roth, B. L.; Armbruster, B. N.; Ernsberger, P.; Irwin, J. J.; Shoichet, B. K. Relating protein pharmacology by ligand chemistry. Nat. Biotechnol. 2007, 25, 197−206.
(47) Yamanishi, Y.; Araki, M.; Gutteridge, A.; Honda, W.; Kanehisa, M. Prediction of drug-target interaction networks from the integration of chemical and genomic spaces. Bioinformatics 2008, 24, i232−i240.
(48) Kolb, P.; Ferreira, R. S.; Irwin, J. J.; Shoichet, B. K. Docking and chemoinformatic screens for new ligands and targets. Curr. Opin. Biotechnol. 2009, 20, 429−436.
(49) Namasivayam, V.; Hu, Y.; Balfer, J.; Bajorath, J. Classification of Compounds with Distinct or Overlapping Multi-Target Activities and Diverse Molecular Mechanisms Using Emerging Chemical Patterns. J. Chem. Inf. Model. 2013, 53, 1272−1281.
(50) Chupakhin, V.; Marcou, G.; Baskin, I.; Varnek, A.; Rognan, D. Predicting Ligand Binding Modes from Neural Networks Trained on Protein-Ligand Interaction Fingerprints. J. Chem. Inf. Model. 2013, 53, 763−772.
(51) Csermely, P.; Korcsmáros, T.; Kiss, H. J.; London, G.; Nussinov, R. Structure and dynamics of molecular networks: A novel paradigm of drug discovery: A comprehensive review. Pharmacol. Ther. 2013, 138, 333−408.
(52) Brown, J.; Okuno, Y.; Marcou, G.; Varnek, A.; Horvath, D. Computational chemogenomics: Is it more than inductive transfer? J. Comput.-Aided Mol. Des. 2014, 28, 597−618.
(53) http://dailymed.nlm.nih.gov/dailymed/ (accessed November 16, 2015).
(54) Manzi, S. F.; Shannon, M. Drug Interactions − A Review. Clin. Pediatr. Emerg. 2005, 6, 93−102.
(55) Takarabe, M.; Shigemizu, D.; Kotera, M.; Goto, S.; Kanehisa, M. Network-based analysis and characterization of adverse drug-drug interactions. J. Chem. Inf. Model. 2011, 51, 2977−2985.
(56) Gottlieb, A.; Stein, G. Y.; Oron, Y.; Ruppin, E.; Sharan, R. INDI: A computational framework for inferring drug interactions and their associated recommendations. Mol. Syst. Biol. 2012, 8, 592.
(57) Vilar, S.; Harpaz, R.; Uriarte, E.; Santana, L.; Rabadan, R.; Friedman, C. Drug−drug interaction through molecular structure similarity analysis. J. Am. Med. Inform. Assoc. 2012, 19, 1066−1074.
(58) Vilar, S.; Uriarte, E.; Santana, L.; Tatonetti, N. P.; Friedman, C. Detection of Drug−Drug Interactions by Modeling Interaction Profile Fingerprints. PLoS One 2013, 8, e58321.
(59) Vilar, S.; Uriarte, E.; Santana, L.; Lorberbaum, T.; Hripcsak, G.; Friedman, C.; Tatonetti, N. P. Similarity-based modeling in large-scale prediction of drug-drug interactions. Nat. Protoc. 2014, 9, 2147−2163.