Article pubs.acs.org/crt

Identification of Drug-Induced Myocardial Infarction-Related Protein Targets through the Prediction of Drug−Target Interactions and Analysis of Biological Processes

Sergey M. Ivanov,*,† Alexey A. Lagunin,‡,† Pavel V. Pogodin,‡,† Dmitry A. Filimonov,† and Vladimir V. Poroikov‡,†

† Orekhovich Institute of Biomedical Chemistry of Russian Academy of Medical Sciences, 10, Pogodinskaya str., 119121 Moscow, Russia
‡ Medico-biological Faculty, Pirogov Russian National Research Medical University, 1, Ostrovitianova str., 117997 Moscow, Russia

Supporting Information

ABSTRACT: Drug-induced myocardial infarction (DIMI) is one of the most serious adverse drug effects and often leads to death. Therefore, the identification of DIMI at the early stages of drug development is essential. For this purpose, the in vitro testing and in silico prediction of interactions between drug-like substances and various off-target proteins associated with serious adverse drug reactions are performed. However, only a few DIMI-related protein targets are currently known. We developed a novel in silico approach for the identification of DIMI-related protein targets. This approach relies on the computational prediction of drug−target interaction profiles using information on 1738 human targets and 828 drugs, including 254 drugs that cause myocardial infarction. Through a statistical analysis, we revealed the 155 most significant associations between protein targets and DIMI. Because not all of the identified associations may lead to DIMI, an analysis of the biological functions of these proteins was performed. The Random Walk with Restart algorithm based on a functional linkage gene network was used to prioritize the revealed DIMI-related protein targets according to the functional similarity between their genes and known genes associated with myocardial infarction. The biological processes associated with the 155 selected protein targets were determined by gene ontology and pathway enrichment analysis. This analysis indicated that most of the processes leading to DIMI are associated with atherosclerosis. The revealed proteins were manually annotated with biological processes using functional and disease-related data extracted from the literature. Finally, the 155 protein targets were classified into three categories of confidence: (1) high (the protein targets are known to be involved in DIMI via atherosclerotic progression; 50 targets), (2) medium (the proteins are known to participate in biological processes related to DIMI; 65 targets), and (3) low (the proteins are indirectly involved in DIMI pathogenesis; 40 targets).



Received: April 24, 2014

INTRODUCTION

Myocardial infarction (MI) is one of the major causes of hospitalization and death in the world.1 It is a complex disease with a genetic predisposition, and its manifestation depends on several risk factors, one of which is the clinical use of certain drugs. Drug-induced myocardial infarction (DIMI) is listed as an adverse reaction in the appropriate section of the labels of many approved drugs, including nonsteroidal anti-inflammatory drugs,2,3 proton pump inhibitors,4 antidiabetic drugs,5 antimigraine drugs,6 calcium channel blockers,7 glucocorticoid agonists,8 viral protease inhibitors,9 and antineoplastic agents.10 The associations between drugs and MI are often not revealed during preclinical and clinical investigations. Their late discovery has led to the withdrawal of drugs from global or local markets, e.g., Vioxx (rofecoxib), Bextra (valdecoxib), Zelnorm (tegaserod), and Reductil/Meridia (sibutramine).11 Most of the MI-associated drugs remain in medical use because of the relatively low frequency of DIMI or the absence of safer pharmacological alternatives.

The majority of adverse drug reactions (ADRs) result from interactions of drugs or their metabolites with particular molecular targets, mostly proteins.12,13 These proteins may be the primary therapeutic targets localized in tissues other than the disease-related one, or off-targets that are unrelated to the therapeutic potential of the drug. For example, terfenadine is a histamine H1 receptor antagonist, and its action on this receptor is the main reason for its therapeutic effect (treatment of allergic conditions). However, terfenadine also blocks HERG potassium channels and therefore causes ventricular tachycardia and fibrillation.12,13 Compounds interacting with off-targets that have known strong relationships with serious ADRs should be


dx.doi.org/10.1021/tx500147d | Chem. Res. Toxicol. XXXX, XXX, XXX−XXX

Chemical Research in Toxicology

Article

eliminated at the early stages of drug research. Currently, Novartis uses in vitro assays to test compounds against the most undesirable targets and thus eliminates the most promiscuous compounds from further research.13 Because of the high costs of in vitro and in vivo experiments, in silico tools based on molecular modeling and QSAR (quantitative structure−activity relationships) approaches have been proposed for the estimation of drug−target interactions (DTIs). Moreover, several Web resources have been developed for the in silico prediction of drug−target interactions. These resources provide qualitative14−19 or quantitative20 predictions of DTIs, which may help in understanding ADRs at the early stages of drug development.

Even though DIMI is one of the most serious ADRs, only a few protein targets associated with DIMI are currently known. The following proteins related to myocardial ischemia/infarction are described in the literature: α-adducin,21 adenosine A3 receptor, adrenergic α2B and α2C, thromboxane A2,13 and serotonin 1B receptors.22 The lack of knowledge regarding DIMI-related protein targets substantially limits the use of in vitro and in silico approaches for the early detection of DIMI.

Several in vitro and in silico methods for the search of ADR-related protein targets have been described in the literature.23−29 These methods are based mainly on the search for correlations between the DTI profiles of drugs and their adverse reactions. The Bioprint method developed by Cerep12 uses information on experimentally determined DTIs and the pharmacokinetic profiles of several hundred drugs to search for correlations between the protein targets and ADRs. Lounkine et al. predicted the interactions of marketed drugs with 73 targets included in the Novartis in vitro assay and validated the predictions using known drug−target interaction information from proprietary databases and through de novo experimental assays. These researchers used the obtained information to find correlations between the 73 targets and ADRs and to explain the causes of ADRs for several drugs through a constructed drug−target−ADR network.23 However, the experimental testing of the interactions of many drugs with thousands of proteins is rather expensive, which limits the utilization of this approach. Some approaches have been developed based on DTI predictions using structure−activity relationships (SAR)24−26 and docking methods.27,28 However, most of the described experimental and in silico approaches involved either relatively few drugs or relatively few protein targets. Therefore, these investigations could identify only a limited number of “adverse reaction−target” associations. Moreover, with the exception of the methods developed by Yang27 and Kuhn,29 these approaches focused on the search for “correlations” between ADRs and protein targets without regard to causality. Yang and coauthors used gene ontology (GO) terms enriched by genes with known associations to the corresponding diseases to classify the identified protein targets into two classes: annotated with these terms or not annotated.27 Kuhn et al. manually annotated each “adverse effect−target” pair with supporting experimental data retrieved from the literature.29 We found only one published work25 that focused on cardiovascular adverse reactions, including DIMI, and only a few potential DIMI-related protein targets were described in all of the above-mentioned works.

We propose a new in silico approach for the identification of protein targets related to adverse drug reactions. The approach is based on the prediction of the DTI profiles of drugs with known MI associations and uses these profiles to reveal the most significant DIMI-related targets through a statistical analysis. To estimate which of the revealed associations may be the cause of DIMI, we used the Random Walk with Restart algorithm, GO and pathway analysis, and experimental data from the literature to classify the selected DIMI-related protein targets into three categories of confidence: high, medium, and low. As a result of our analysis, we suggest that most of the revealed protein targets are potentially related to DIMI.



MATERIALS AND METHODS

Data Set. Information on the adverse effects of 996 approved and withdrawn drugs was retrieved from the SIDER 2 database (release October 17, 2012).30 The created data set included only those drugs with systemic routes of administration (such as oral and intravenous) because such routes provide a sufficient concentration of the drug in the blood for DIMI development. The structural formulas of the drug substances were retrieved from the ChemIdPlus database.31 Structures with a molecular weight lower than 50 Da or higher than 1250 Da, as well as inorganic and charged structures, were removed from the data set. As a result, 747 unique structures with known information on their adverse effects were selected. The adverse effect data for 81 additional drugs meeting the same structural criteria and having systemic routes of administration were retrieved from the RxList Web site.32 As a result, a data set of 828 drug structures was created. Additional adverse events associated with the 828 drugs were obtained from Meyler’s Encyclopedia of Adverse Reactions and Interactions.10 The names of the adverse effects from RxList and the Encyclopedia were mapped to the MedDRA preferred terms used in SIDER 2. We defined the following MedDRA preferred terms to be synonyms of DIMI: “acute myocardial infarction,” “myocardial infarction,” “acute coronary syndrome,” and “coronary artery thrombosis.” We also performed a search in PubMed to obtain information about DIMI that was absent in SIDER 2, RxList, and the Encyclopedia. The final data set included 828 structural formulas: 254 drugs associated with MI and 574 drugs not associated with MI (Supporting Information, Table S1).

Drug−Target Interaction Profiles Prediction. The DTI profiles for the 828 drugs were predicted using a specially created version of the PASS (Prediction of the Activity Spectra of Substances) software. PASS allows the evaluation of the general biological potential of small-molecule organic substances based on their 2D structural formulas.19 PASS uses Multilevel Neighborhoods of Atoms (MNA) descriptors for the chemical structure representation and a Bayesian-like approach for the simultaneous prediction of many types of biological activities.19,33−35 The current version of PASS (PASS 2012) predicts 6400 types of biological activities, including pharmacotherapeutic effects, mechanisms of action (describing the type of ligand−protein interaction: inhibition or stimulation), toxic and side effects, interaction with antitargets, transporters, and drug-metabolizing enzymes, and changes in gene expression. We previously showed that PASS prediction results can reveal possible molecular mechanisms underlying the ulcerogenic action of nonsteroidal anti-inflammatory drugs and identified 24 new mechanisms of drug action that are likely related to the development of peptic ulcers.36

In the present work, we developed a special version of PASS with a training set based on information from the ChEMBLdb 16 database37 and DrugBank 3.0.38 This version predicts interactions with human targets present in the ChEMBLdb 16 database and DrugBank 3.0 without specifying the type of ligand−protein interaction (inhibition or stimulation). The compounds were classified as actives or inactives based on the data retrieved from the ChEMBLdb database and the thresholds for the appropriate end-points (Table 1). The developed PASS training set consisted of 227,379 compound structures. After selecting the human protein targets of small (50−1250 MW), electroneutral, organic compounds and training PASS, we found that it is possible to predict interactions between drug-like compounds and 1738 human targets with an average ROC AUC of 0.97, as calculated through the leave-one-out cross-validation procedure. Each of the predicted DTIs is represented by two values: Pa, which is the probability that the drug interacts with a respective target, and Pi, which is the probability that the drug does not interact with a respective target. Those drugs for which Pa − Pi > 0 are considered to potentially
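The Pa − Pi > 0 criterion stated above can be illustrated with a minimal sketch. It is not the PASS implementation: the record type `Prediction`, the function `predicted_interactions`, the optional `margin` parameter, and all drug names, target names, and probability values are hypothetical and serve only to show how such a cutoff turns a table of predicted (Pa, Pi) pairs into a binary drug−target interaction profile.

```python
from typing import List, NamedTuple, Tuple


class Prediction(NamedTuple):
    """One predicted drug-target pair (hypothetical record layout)."""
    drug: str
    target: str
    pa: float  # estimated probability that the drug interacts with the target
    pi: float  # estimated probability that the drug does not interact


def predicted_interactions(
    predictions: List[Prediction], margin: float = 0.0
) -> List[Tuple[str, str]]:
    """Return the (drug, target) pairs for which Pa - Pi exceeds `margin`.

    With the default margin of 0.0 this is the Pa - Pi > 0 rule from the text.
    """
    return [(p.drug, p.target) for p in predictions if p.pa - p.pi > margin]


# Illustrative values only; they are not taken from the study.
example = [
    Prediction("drug_A", "target_1", 0.82, 0.01),
    Prediction("drug_A", "target_2", 0.10, 0.35),
    Prediction("drug_B", "target_1", 0.40, 0.40),
]

hits = predicted_interactions(example)
print(hits)
```

Note that a pair with Pa equal to Pi is not retained, because the inequality is strict. A stricter margin could be passed to keep only higher-confidence predictions, although the text describes only the zero cutoff.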


(retrieved on 01.10.2013). These genes exhibited several types of association with MI (polymorphisms, expression, and epigenetic changes) and were used as “seed” genes in GPEC.

GO and Pathway Enrichment Analysis. The ClueGO plug-in of Cytoscape44 was used to perform the GO and pathway enrichment analysis and to construct similarity networks from the enriched GO biological processes and pathways. The pathway similarity network consists of enriched pathways from KEGG45 and REACTOME46 with a q-value less than 0.05 and a kappa score of 0.42. The q-value is the p-value adjusted to control for multiple hypothesis testing and represents the minimum false discovery rate at which the association will be regarded as significant. The kappa score shows the similarity between the pathways based on their overlapping genes. The disease-specific pathways in KEGG were removed from the network. The labels of those nodes that correspond to common and highly specific pathways were removed (e.g., “regulation of insulin-like growth factor (IGF) transport and uptake by insulin-like growth factor binding proteins (IGFBPs)” from REACTOME). The GO similarity network consists of GO biological processes from the 7th to the 15th level enriched with a q-value less than 0.01 and a kappa score of 0.4. The GO term fusion option was applied to combine similar terms. The labels of those nodes that correspond to common and highly specific GO terms were removed.

Other Software. Figures 1, 2, 3, 6, and 7 were created using Microsoft Excel 2007 and Microsoft PowerPoint 2007. The probability distributions of the −log2(RWR) scores (Figure 3) were calculated using the polynomial estimation of distribution method.34 The probability distributions and Mann−Whitney statistics were calculated using software developed in our laboratory.
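The two quantities used to build the enrichment networks, the q-value and the kappa score, can be made concrete with a short sketch. Two assumptions are made here that the text does not confirm: the q-value is computed with the Benjamini−Hochberg procedure (the text only says that the p-value is adjusted for multiple testing), and the kappa score is Cohen's kappa computed on binary gene membership of two terms over a common gene universe. The function names and all example values are hypothetical.

```python
from typing import Iterable, List, Set


def bh_qvalues(pvalues: List[float]) -> List[float]:
    """Benjamini-Hochberg adjusted p-values (assumed correction method)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the adjusted
    # values never increase as the raw p-values decrease.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues


def kappa_score(genes_a: Iterable[str], genes_b: Iterable[str],
                universe: Set[str]) -> float:
    """Cohen's kappa between two terms, treating each gene in `universe`
    as annotated (1) or not annotated (0) to each term."""
    n = len(universe)
    a = set(genes_a) & universe
    b = set(genes_b) & universe
    both = len(a & b)
    neither = n - len(a | b)
    observed = (both + neither) / n
    expected = (len(a) * len(b) + (n - len(a)) * (n - len(b))) / (n * n)
    if expected == 1.0:
        # Kappa is undefined when both terms cover all or none of the genes.
        return float("nan")
    return (observed - expected) / (1.0 - expected)


# Illustrative values only; they are not taken from the study.
example_q = bh_qvalues([0.01, 0.04, 0.03, 0.005])
example_universe = {f"g{i}" for i in range(1, 11)}
example_kappa = kappa_score({"g1", "g2", "g3", "g4"},
                            {"g3", "g4", "g5", "g6"},
                            example_universe)
print(example_q, example_kappa)
```

Under these assumptions, a pathway would enter the pathway similarity network if its q-value is below 0.05, and the kappa score between two pathways would then be compared against the 0.42 value given in the text.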

Table 1. Experimental Measures of Ligand−Target Interactions and the Appropriate Thresholds Used for the Classification of Compounds as Actives or Inactives^a

measures of ligand−target interactions | threshold for active compounds
EC50 |
IC50 |
Ki |
DC50 |
inhibition |
inactivation |