Application Note Cite This: J. Chem. Inf. Model. XXXX, XXX, XXX-XXX

pubs.acs.org/jcim

PhID: An Open-Access Integrated Pharmacology Interactions Database for Drugs, Targets, Diseases, Genes, Side-Effects, and Pathways

Zhe Deng,† Weizhong Tu,† Zixin Deng,*,† and Qian-Nan Hu*,†,‡

† Key Laboratory of Combinatorial Biosynthesis and Drug Discovery (Wuhan University), Ministry of Education, and Wuhan University School of Pharmaceutical Sciences, Wuhan, 430071, China
‡ Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, 300308, Tianjin, China

Supporting Information

ABSTRACT: Current network pharmacology research faces a bottleneck: a large amount of public data is scattered across different databases, and there is no open-access, consolidated platform that integrates this information for systemic research. To address this issue, we have developed PhID, an integrated pharmacology database that brings together >400 000 pharmacology elements (drug, target, disease, gene, side-effect, and pathway) and >200 000 element interactions drawn from a range of public databases. PhID has three major applications: (1) helping scientists search the overwhelming amount of pharmacology element interaction data by names, public IDs, molecular structures, or molecular substructures; (2) visualizing pharmacology elements and their interactions in a web-based network graph; and (3) predicting drug−target interactions through two modules, PreDPI-ki and FIM, with which users can predict drug−target interactions for PhID entities or for drug−target pairs of their own interest. To support a systems-level understanding of drug action and disease complexity, PhID was established as a network pharmacology tool with a data layer, a visualization layer, and a prediction model layer, presenting information untapped by current databases.



© XXXX American Chemical Society

Received: March 24, 2017 Published: September 14, 2017

DOI: 10.1021/acs.jcim.7b00175 J. Chem. Inf. Model. XXXX, XXX, XXX−XXX

Journal of Chemical Information and Modeling

INTRODUCTION

The productivity crisis in pharmaceutical research and development over the past two decades has brought great challenges to pharmacologists (1−3). High drug attrition rates might be attributable to the reductionist one-drug−one-target lock-and-key model (4). To achieve novel or better therapeutics and reduce the attrition rate of new molecular entities in drug development, pharmacologists are recognizing the need for systems biology models of reconstructed pharmacological networks and are using these to model how drugs interact with multiple targets to affect diseases (5). Furthermore, network pharmacology has advantages in complex and refractory diseases because it can identify a single target or a set of plausible targets within the range of integrated pharmacodynamic properties (6). For example, new concepts emerged from the dissection of the complex molecular and cellular pathways that connect cancer and inflammation (7). However, knowledge of drug actions, target functions, and disease etiologies alone is not sufficient to obtain an overall picture of a complex disease network (8). In this regard, comprehensive pharmacology networks, which provide not only a list of pharmacology elements but also the relationships between them, are needed for a holistic understanding of pharmacology through simulation and data integration (9).

The growing pharmacological “big data” and the perceived problems surrounding it call for computational solutions (10). Major challenges include combining these pharmacology network data sources and presenting the resulting lists and tables as a connected graph that gives the pharmacologist a full representation of the related elements. Web-based tools make it convenient to track entities of interest and related bioentities across a wide variety of databases. Currently, pharmacology information is stored in multiple databases (11). Some databases focus on the biological actions of drugs, such as DrugBank (12) and TTD (13). ChEBI (14), BindingDB (15), and STITCH (16) focus on small-molecule chemical and binding information. KEGG (17) and SMPDB (18) are well-known pathway databases. UniProt (19), SIDER (20), and OMIM (21) curate and classify gene and protein, side-effect, and disease information, respectively. These databases provide valuable resources for drug discovery, but each covers only one kind of pharmacology data. For example, DrugBank is a comprehensive drug and therapeutic target database, but it contains only around 7760 drugs, while there are more than 45 000 annotated bioactive compounds in ChEBI. In addition, many bioactive natural products are not recorded in DrugBank; users cannot retrieve information about celastrol, deltonin, rhamnazin, etc. Users also cannot obtain treated-disease information (OMIM id or ICD10 id) by searching drug names; instead, the database provides only the pharmacological indications of each drug. The DrugBank pathway information is not as comprehensive as that of SMPDB. For example, the caffeine reference pathway is not available in DrugBank but can be found in SMPDB, which is designed specifically to support pathway elucidation and pathway discovery. Although SMPDB provides detailed information about human metabolic pathways, metabolic disease pathways, metabolite signaling pathways, and drug-action pathways, it contains only 618 small molecules. SIDER stores 1430 marketed medicines, 5868 side-effects (SEs), and 139 756 drug−SE pairs, but it does not include some biotech drugs, such as infliximab, alglucerase, and eculizumab, or drug−target information. ChEBI provides a comprehensive and nonredundant small-molecule dictionary with bioactive properties, but it does not record biotech drugs or a molecule’s adverse reactions, diseases, and pathways. STITCH is a well-known database of known and predicted interactions between chemicals and proteins; however, “celastrol” cannot be retrieved from it. Information on the connections between diseases, genes, chemicals, and proteins would greatly extend the application of drug repositioning. Thus, integrating the different kinds of pharmacological information held in separate public databases remains a big challenge. VNP (22) was developed to visualize the network pharmacology of targets, diseases, and drugs. CTD (23) integrated literature data to reconstruct chemical−gene−disease networks.
PhID is an open-access, integrated pharmacology database that combines a richer set of bioactive molecules with multiscale (drug−target−disease−gene−pathway−side-effect) biological networks, which can be of significant benefit to pharmacologists. We applied an intuitive and efficient method to merge identical physical entities into PhID, reducing the redundancy of pharmacology elements and ensuring that the links between them are not repeated. PhID helps scientists search this overwhelming number of pharmacology elements by entity name, public database ID, or drug chemical structure. The connections between these entities are complex, and text databases can hardly show their relationships intuitively. In PhID, these entities are treated as graph nodes, and the relationships among them are regarded as graph edges. All entities are visualized in a dynamic and clickable graph so that users can start from a node of interest and browse continuously. By integrating and visualizing the complex relations among these entities, PhID may reveal additional knowledge that lies behind the text descriptions. Additionally, PhID provides two modules, PreDPI-ki (24) and FIM (25), to predict interactions between proteins and small molecules. PreDPI-ki integrates chemoinformatics (drug) and bioinformatics (target) descriptors with quantitative drug−target interaction data to construct a predictive model. FIM uses the physical−chemical properties of binding sites to describe interactions between ligands and targets and thereby predict ligand−target interactions. With these two models, users can predict a single ligand−target interaction or obtain a list of potential protein targets for a drug.


DATA AND METHODS

PhID Repository. PhID is an online system that can be accessed through a web browser. The backend is implemented in Python (https://www.python.org) using the Django framework (https://www.djangoproject.com). The web interface includes three applications: (i) entity query, (ii) network pharmacology visualization, and (iii) drug−target interaction prediction tools. PhID was designed to support both discovery-based and hypothesis-based approaches for quickly searching pharmacological entities of interest, visualizing pharmacology networks, and predicting drug repositioning. This support is made possible by integrating vast amounts of information from public repositories. Because the modeling work involves a large number of tables and complex data relationships, we use a PostgreSQL database (www.postgresql.org), described in Figure 2, which is highly efficient for importing and querying big data. The database is managed through Django and contains six “Many to Many” tables, with functions that support data manipulation, including bulk-import operations, adding relations, and specific database operations for data population and integration. The entity records of PhID can also be managed in the Django admin.

Data Standardization. PhID contains six different types of entities: drug, protein, disease, gene, side-effect, and pathway. To cover a broad range of data, it was created by integrating 15 public databases and data from a research article (26), each of which contains at least one kind of PhID entity list or provides connections among entities. To assess the content overlap of the source databases and to reduce redundancy, identical physical entities were merged into PhID in an intuitive and efficient way.
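As a rough illustration of this merging step, the sketch below groups source records by a shared canonical identifier and keeps every source ID on the merged entry. It assumes the canonical SMILES has already been computed (the Data Standardization text names Open Babel for this) and treats it as an opaque key. The function name `merge_by_canonical_key` and the record fields are hypothetical and not part of PhID. The four bromfenac identifiers and the SMILES string are those quoted in Case Study 1; the `ExampleDB` row is an invented placeholder.

```python
from collections import defaultdict


def merge_by_canonical_key(records):
    """Merge source records that share the same canonical SMILES key."""
    grouped = defaultdict(list)
    for rec in records:
        grouped[rec["canonical_smiles"]].append(rec)

    merged = []
    for smiles, group in grouped.items():
        merged.append({
            "canonical_smiles": smiles,
            # Keep every (source, id) pair so the merged entry stays traceable.
            "source_ids": sorted((r["source"], r["source_id"]) for r in group),
        })
    return merged


# SMILES string exactly as quoted in the paper; used here only as a key.
BROMFENAC = "OC(O)Cc1cccc(c1N)C(O)c1ccc(cc1)Br"

records = [
    {"source": "TTD", "source_id": "DAP000732", "canonical_smiles": BROMFENAC},
    {"source": "ChEBI", "source_id": "240107", "canonical_smiles": BROMFENAC},
    {"source": "DrugBank", "source_id": "DB00963", "canonical_smiles": BROMFENAC},
    {"source": "HMDB", "source_id": "HMDB15098", "canonical_smiles": BROMFENAC},
    # Invented placeholder record with a different key.
    {"source": "ExampleDB", "source_id": "EX-0001", "canonical_smiles": "PLACEHOLDER"},
]

merged = merge_by_canonical_key(records)
for entry in merged:
    print(entry["canonical_smiles"], len(entry["source_ids"]))
```

Grouping on a single canonical key keeps the merge a simple dictionary lookup; the same pattern would apply to the other entity types with their own keys (for example UniProt ids for proteins).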
Entities of the same type (drug, protein, disease, gene, side-effect, and pathway) are compared using identifiers, namely chemical structures and database IDs such as UniProt, OMIM, NCBI-Gene, SIDER, and SMPDB. Because different databases tend to label physical entities with different identifier types, we unified the identifier type for each entity class. For drugs/chemicals we use canonical SMILES (biotech drugs from DrugBank were added after the small-molecule integration). Small-molecule drugs that have different database identifiers but correspond to the same SMILES are merged using the Open Babel software. That is, to provide a comprehensive data set of drugs (chemicals), PhID collected drugs (chemicals) from extensive databases such as DrugBank, TTD, ChEBI, HMDB, BindingDB, STITCH, and VNP, and records from different sources were merged into one PhID record if they shared the same canonical SMILES. Analogously, UniProt entry identifiers (ids) for proteins, HGNC (27) unique symbols for genes, SIDER UMLS IDs for side-effects, SMPDB and KEGG IDs for pathways, and two different disease identifiers, MIM and ICD (28) IDs, for diseases were used to merge elements that have different database IDs into PhID.

PhID entities are connected to each other through drug−target, drug−side-effect, drug−disease, drug−gene, drug−pathway, target−disease, target−pathway, target−gene, and disease−gene relations. To provide a comprehensive data set, this information was retrieved from 15 public databases and a research article. The data sets used to describe the relations are binary interactions, comprising different interaction types that correspond to the diverse types of interconnected PhID entities. Such interactions are based on databases or the literature. PhID takes in two types of interactions: physical interactions and reference interactions.

Drug−target interactions represent physical interactions between a drug and a target. First, PhID collects drug−target interactions based on drug action information from DrugBank and therapeutic target inhibitors from TTD. These two databases hold detailed drug and therapeutic target information as well as some drug−target interactions. Second, STITCH is a resource for exploring known and predicted physical interactions between chemicals and proteins; PhID integrates the STITCH chemical−protein interactions for human (Homo sapiens) proteins. In total, 20 292 integrated drug−target interactions were extracted and mapped to PhID. Further drug−target interactions, from BindingDB and sc-PDB (29), were refined as training sets for the prediction models PreDPI-ki and FIM.

The other interactions PhID takes in are reference interactions. Target−gene interactions are connections between protein targets and their corresponding coding genes. Drug−disease, drug−gene, drug−pathway, target−disease, target−pathway, disease−gene, and drug−side-effect interactions extracted from biological databases denote functional associations that form biological/chemical networks. Because diseases have multiple facets and are presented in diverse ways, there is still no standardized representation of human disease covering nomenclature (phenotype, genotype), progression (early, late, metastasis, stages), and manifestations (transient, acute, chronic). A search in PhID by disease-related relations or by inexact disease names will return results with specific IDs based on MIM (used by HGNC and OMIM) or ICD (used by TTD). The processing flow by which PhID integrates interactions between drugs, proteins, diseases, genes, side-effects, and pathways is shown schematically in Figure 1.

RESULTS

Database Statistics. The entities retrieved and mapped from the source databases mentioned above are summarized in Table 1. PhID (release 1.0) contains 332 263 chemicals (small molecules and biotech drugs), 24 059 proteins (human proteins and therapeutic targets), 8530 disease records, 43 415 human genes, 4492 side-effects, 867 pathways, and more than 209 000 interactions among them, which together form a comprehensive pharmacology network. To make these data easily accessible to the public, we designed and deployed the PhID website (http://phid.ditad.org). The statistics of data overlap showed the degree of matching between public databases. Here, 142 drugs were recorded in each of the six types of pharmacology databases, which indicates that these important compounds have well-characterized drug treatments, disease mechanisms, corresponding targets, genetic origins, drug metabolism, and side-effects. Also, only 9351 of the 332 263 drugs (about 2.8%) and 4537 of the 24 059 targets (about 18.9%) are involved in the 20 292 drug−target links in PhID. Therefore, there are still huge gaps in the known drug−target interactions.

User Interface. The PhID website is designed to be concise and intuitive. The welcome page introduces PhID to users by showing the list of integrated public databases and the current state of the data set. On the user interface page, the search panel is divided into seven tabs. The entity selection tabs are at the top of the search box, and the identifier filter box is below it. In PhID, users can search pharmacology-related entities and interactions using names, ids, molecular structures, or molecular fragments for different purposes. Specifically, four points should be noted when querying pharmacology-related entities (drug, target, disease, gene, side-effect, and pathway):

(1) So that users can understand PhID at a glance, the entity query panel shares a single search box. The user first chooses the type of entity to query by clicking the corresponding tab.
(2) Because there is no controlled and standardized vocabulary for describing chemicals, PhID supports searching drugs by chemical structure using a SMILES string. Users can also search for drugs containing a specific molecular fragment: after the SMILES string of the fragment is entered, the drug molecules are scanned to identify those that contain it.
(3) For any type of query, entity names or public database identifiers can be entered in the search box; we recommend using standard names. The query is looked up in the table of the selected entity type. If the query string does not exactly match a name or id record of that entity type, PhID suggests alternative entities whose names or ids contain the query string. Further constraints can be applied with the identifier filter box below the search box for drug, disease, and pathway queries.
(4) In our prediction models, drug features are extracted from the drug SMILES string, and target features are obtained from the target protein sequence in FASTA format. Drug−target predictions in PhID therefore take a SMILES string and a FASTA sequence as input.

A more detailed description of query examples can be found at http://phid.ditad.org/static/FAQ.html.
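The name-or-id lookup described in note (3) of the User Interface description, an exact match first and a "contains" fallback otherwise, can be sketched as follows. This is a minimal illustration, not the PhID implementation: the function name `search_entities`, the record layout, and the case-insensitive comparison are assumptions, and only the bromfenac DrugBank id comes from the paper.

```python
def search_entities(query, records):
    """Return exact name/id matches, else records whose name or id contains the query."""
    q = query.strip().lower()
    exact = [r for r in records if q in (r["name"].lower(), r["id"].lower())]
    if exact:
        return exact
    # Fallback: suggest alternatives whose name or id contains the query string.
    return [r for r in records if q in r["name"].lower() or q in r["id"].lower()]


drugs = [
    {"id": "DB00963", "name": "bromfenac"},   # id quoted in Case Study 1
    {"id": "EX-0001", "name": "fenoprofen"},  # hypothetical id
    {"id": "EX-0002", "name": "dopamine"},    # hypothetical id
]

exact_hits = search_entities("Bromfenac", drugs)
partial_hits = search_entities("fen", drugs)
print([r["name"] for r in exact_hits])
print([r["name"] for r in partial_hits])
```

In this sketch an empty query falls through to the substring branch and matches every record, so a real interface would probably reject empty input before searching.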

Figure 1. Schematic representation of PhID. The database integrates six types of entities: drugs, targets, diseases, genes, pathways, and side-effects. (A) Cross-referenced resources from public databases and the data manipulation. (B) The six types of entities are linked to each other by connections. Connections integrated from the data sources are represented by solid lines, and dashed lines indicate the PreDPI-ki and FIM predictions between drugs and targets (Reprinted from http://phid.ditad.org/static/FAQ.html).
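The structure in panel B, typed nodes joined by integrated (solid) or predicted (dashed) links, can be pictured as a small typed graph. The sketch below is only an illustration of that idea: the class `EntityGraph`, its methods, and the rule that predicted edges must join a drug and a target are assumptions drawn from the caption, and the "example disease" node and its link are invented. The dopamine and VEGF-A pair is the one discussed in Case Study 2.

```python
from dataclasses import dataclass, field

ENTITY_TYPES = {"drug", "target", "disease", "gene", "pathway", "side-effect"}
EDGE_ORIGINS = {"integrated", "predicted"}  # solid vs dashed lines in Figure 1


@dataclass
class EntityGraph:
    node_types: dict = field(default_factory=dict)  # node name -> entity type
    edges: list = field(default_factory=list)       # (node a, node b, origin)

    def add_node(self, name, entity_type):
        if entity_type not in ENTITY_TYPES:
            raise ValueError(f"unknown entity type: {entity_type}")
        self.node_types[name] = entity_type

    def add_edge(self, a, b, origin="integrated"):
        if origin not in EDGE_ORIGINS:
            raise ValueError(f"unknown edge origin: {origin}")
        pair = {self.node_types[a], self.node_types[b]}
        # Predictions are only drawn between drugs and targets.
        if origin == "predicted" and pair != {"drug", "target"}:
            raise ValueError("predicted edges connect a drug and a target")
        self.edges.append((a, b, origin))

    def neighbors(self, name, origin=None):
        """List nodes linked to `name`, optionally filtered by edge origin."""
        found = []
        for a, b, edge_origin in self.edges:
            if origin is not None and edge_origin != origin:
                continue
            if a == name:
                found.append(b)
            elif b == name:
                found.append(a)
        return found


graph = EntityGraph()
graph.add_node("dopamine", "drug")
graph.add_node("VEGF-A", "target")
graph.add_node("example disease", "disease")               # invented node
graph.add_edge("dopamine", "example disease")              # illustrative integrated link
graph.add_edge("dopamine", "VEGF-A", origin="predicted")   # pair from Case Study 2
print(graph.neighbors("dopamine"))
print(graph.neighbors("dopamine", origin="predicted"))
```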


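[Table 1, relations part: sizes and source names of the relations between entity types. Only the drug−target entry (20292; VNP, TTD, ChEBI, STITCH, DrugBank) can be assigned from the text. Remaining values as extracted: 15075; OMIM, HMDB; 19900; 19759; VNP, TTD; 11085; HMDB; SMPDB; 97296; Sider; 9047; VNP, TTD, ChEBI, DrugBank; UniProt, HGNC; 14349; PREDICT, VNP, TTD.]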

Visualization. One of the major functions of PhID is to visualize the pharmacology network, representing the complex relationships between pharmacology entities such as drugs, targets, diseases, genes, pathways, and side-effects. Starting from arbitrary inputs, users obtain an online interactive network of the retrieved results and can explore the complex relationships among them. The pharmacology entities are represented as network nodes, connected by edges that represent the physical or functional relations between them. To obtain a clear and less tangled graph, PhID uses the Fruchterman−Reingold force-directed graph-drawing algorithm to generate the layout (30). One or a few central nodes represent the queries, and the surrounding nodes represent the related results. The network is used to explore pharmacology entity interactions in order to understand drug treatment, disease mechanisms, therapeutic target selection, diseases of genetic origin, metabolism, and side-effects, and thereby to reposition old drugs by discovering new therapeutic indications. Information about the nodes is available as a list below the graph, and clicking a node shows further information and relations for the corresponding record in the search results. With the help of PhID, users can easily visualize a continuous landscape of entity subnetworks in a dynamic and interactive way. The data delivered by PhID can be validated against the public databases whose links are given under the graph.

Drug−Target Predictive Model. PhID not only facilitates the identification of therapeutic targets in retrieved networks but also serves as an online tool for quickly screening interactions between small molecules and their protein targets through two predictive models: PreDPI-ki and FIM. PreDPI-ki constructs a predictive model to determine whether a drug interacts with a protein. It provides a discriminative computational framework for identifying tight-binding drug−target associations in humans, developed with a chemogenomics approach that uses integrated molecular features and Ki binding affinity values. FIM is a fragment interaction model that uses the physical−chemical properties of binding sites to describe the interactions between ligands and targets for ligand−target interaction prediction. Using the PreDPI-ki and FIM data sets recorded in PhID, users can estimate a drug−target association by entering a drug SMILES string and a protein FASTA sequence in PreDPI-ki, or obtain a list of potential protein targets (sc-PDB proteins) for a drug by submitting its SMILES string to FIM; such associations may not be present in the PhID visualization graph.

Database Description and Utility. In our previous work, VNP, we used a graph of the connections among drugs, diseases, and targets to visualize the complex relationships among them and to assess drug repositioning. The more contextualized knowledge provided by network and systems biology is expected to move drug discovery strategies toward mechanism-based rational design of therapies. In the updated version, PhID increases the amount of data and includes more node types in the pharmacological network, as well as more node connections. It also integrates two prediction models. The main page has seven tabs. In the first six tabs, text search is available for the identifiers and common names of the corresponding bioentities. The last tab is used for predicting drug−target interactions. The following two cases present the query results for the integrated drug “bromfenac” and a literature validation of “dopamine” repositioning.

Case Study 1: Bromfenac. This case study illustrates the data integration with a drug-centric network (Figure 2) retrieved by searching the drug name “bromfenac”, in which red triangles,

Table 1. PhID Currently Includes Six Entities and Their Interaction Information from 16 Sources(a)

Nodes source (type of network node, size, source names):

type          size      source names
drug          332263    VNP, DrugBank, TTD, ChEBI, HMDB, BindingDB, sc-PDB, STITCH
target        24059     DrugBank, TTD, BindingDB, UniProt, HGNC, sc-PDB, STITCH
disease       8530      OMIM, ICD10, PharmGKB
gene          43415     HGNC
side-effect   4492      SIDER
pathway       867       SMPDB, KEGG

Relations source: the columns (drug, target, disease, gene, side-effect, pathway) each give a size and a source name; one further relation size, 2393, has no recoverable cell position.

(a) In Table-1.xls available at our website http://phid.ditad.org.


Figure 2. Query using drug, target, disease, gene, side-effect, and pathway names or public database ids in the corresponding tabs. The autocomplete plug-in of the text box suggests at most 10 PhID record names on the basis of the clicked tab and the characters entered. Interactions of any entity subnetwork in the search results can be visualized in the dynamic and clickable visualization zone and downloaded as images in .png format. Entity result lists, external database links, and entity interaction information are listed below the graph. Different shapes and colors of nodes and edges denote different types of entities and interactions, respectively. The Predict_model tab provides two drug−target interaction prediction models (Reprinted from http://phid.ditad.org/static/FAQ.html).
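The Visualization section states that PhID lays out its networks with a Fruchterman−Reingold force-directed algorithm (30). For readers unfamiliar with it, the following is a minimal pure-Python sketch of that algorithm under simple assumptions: a unit drawing frame, a linear cooling schedule, and a fixed random seed. The function name, parameters, and the toy node names are hypothetical; this is not the code used by the PhID website.

```python
import math
import random


def fruchterman_reingold(nodes, edges, width=1.0, height=1.0, iterations=50, seed=0):
    """Return {node: [x, y]} positions for a non-empty list of nodes."""
    rng = random.Random(seed)
    pos = {v: [rng.uniform(0, width), rng.uniform(0, height)] for v in nodes}
    k = math.sqrt(width * height / len(nodes))  # ideal distance between nodes
    temperature = width / 10.0                  # maximum step per iteration
    cooling = temperature / (iterations + 1)

    for _ in range(iterations):
        disp = {v: [0.0, 0.0] for v in nodes}

        # Repulsive force k^2 / d between every pair of nodes.
        for i, v in enumerate(nodes):
            for u in nodes[i + 1:]:
                dx = pos[v][0] - pos[u][0]
                dy = pos[v][1] - pos[u][1]
                dist = max(math.hypot(dx, dy), 1e-9)
                force = k * k / dist
                fx, fy = dx / dist * force, dy / dist * force
                disp[v][0] += fx
                disp[v][1] += fy
                disp[u][0] -= fx
                disp[u][1] -= fy

        # Attractive force d^2 / k along every edge.
        for v, u in edges:
            dx = pos[v][0] - pos[u][0]
            dy = pos[v][1] - pos[u][1]
            dist = max(math.hypot(dx, dy), 1e-9)
            force = dist * dist / k
            fx, fy = dx / dist * force, dy / dist * force
            disp[v][0] -= fx
            disp[v][1] -= fy
            disp[u][0] += fx
            disp[u][1] += fy

        # Move each node by at most `temperature` and keep it inside the frame.
        for v in nodes:
            dx, dy = disp[v]
            length = max(math.hypot(dx, dy), 1e-9)
            step = min(length, temperature)
            pos[v][0] = min(width, max(0.0, pos[v][0] + dx / length * step))
            pos[v][1] = min(height, max(0.0, pos[v][1] + dy / length * step))

        temperature -= cooling
    return pos


# Toy drug-centric network; neighbor names are placeholders.
nodes = ["bromfenac", "target 1", "target 2", "disease 1", "pathway 1"]
edges = [("bromfenac", n) for n in nodes[1:]]
layout = fruchterman_reingold(nodes, edges)
for name, (x, y) in layout.items():
    print(f"{name}: ({x:.2f}, {y:.2f})")
```

The pairwise repulsion makes each iteration quadratic in the number of nodes, which is acceptable for the small query-centered subnetworks described here.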

green circles, orange rectangles, dark blue hexagons, sky blue rhombuses, and purple clouds correspond to diseases, drugs, targets, genes, pathways, and side-effects, respectively. The query string “bromfenac” returns 2 targets, 3 diseases, 2 genes, 1 pathway, and 14 side-effects. TTD id DAP000732, ChEBI:240107, DrugBank id DB00963, and HMDB15098 are merged into the drug ids of one record because they share the same SMILES string: “OC(O)Cc1cccc(c1N)C(O)c1ccc(cc1)Br”. The drug chemical information and medicinal description are from DrugBank and ChEBI, respectively. The drug−target, drug−disease, drug−gene, drug−side-effect, and drug−pathway relations are from DrugBank, TTD, SIDER, and SMPDB. Through the bioentity list below the graph or by clicking a graph node, the results are fully traceable and referenced to the original databases.

Case Study 2: Repositioning Case. This case study is a repositioning example that introduces the use of a prediction model. Dopamine, one of the major transmitters in the extrapyramidal system of the brain, has been used to treat cardiovascular and kidney diseases (31). Past research suggests that dopamine antagonists may initiate and promote breast cancer or colon cancer (32,33). VEGF-A is a growth factor active in angiogenesis, vasculogenesis, and endothelial cell growth, and it is an attractive cancer target in metastatic colorectal carcinoma. By entering the dopamine SMILES string and the VEGF-A FASTA sequence in the PreDPI-ki input box of the predict_model tab, we found a predicted off-target link between them. This prediction is supported by the literature, in which dopamine has been reported as a safe VEGF-A inhibitor that suppresses angiogenesis and vasculogenesis in experimental tumors (34).

CONCLUSIONS AND FUTURE DIRECTIONS

With PhID we have developed a rich source of information about network-pharmacology-related interactions at the systemic level. The database contains not only data on the queried elements but also element−element interactions. All information links back to its original pharmacology database via the public database ID. Moreover, the integration is not specific to a particular disease but covers a wide range. The integrated network visualization tool allows the network around an intended target to be explored intuitively and is useful for identifying potential drugs. The network interactions in PhID give a bird’s-eye view of the system-wide pharmacology landscape and, combined with the predictive models, support drug repositioning.

ASSOCIATED CONTENT

Supporting Information

The Supporting Information is available free of charge on the ACS Publications website at DOI: 10.1021/acs.jcim.7b00175. Overall entity−relationship diagrams approach for PhID (PDF)

AUTHOR INFORMATION

Corresponding Authors

*E-mail: [email protected] (Q.-N.H.).
*E-mail: [email protected] (Z.D.).

ORCID

Zhe Deng: 0000-0002-6016-9890
Weizhong Tu: 0000-0002-3820-9956
Qian-Nan Hu: 0000-0001-5213-472X

Notes

The authors declare no competing financial interest.

ACKNOWLEDGMENTS

This work was supported by the National Science Foundation of China [31270101; 31570092], the National High Technology Research and Development Program [2012CB721000; 2012AA023402], and the Natural Science Foundation of Tianjin, China.



REFERENCES

(1) Kola, I.; Landis, J. Can the pharmaceutical industry reduce attrition rates? Nat. Rev. Drug Discovery 2004, 3, 711−715.
(2) Pammolli, F.; Magazzini, L.; Riccaboni, M. The productivity crisis in pharmaceutical R&D. Nat. Rev. Drug Discovery 2011, 10, 428−438.
(3) Scannell, J. W.; Blanckley, A.; Boldon, H.; Warrington, B. Diagnosing the decline in pharmaceutical R&D efficiency. Nat. Rev. Drug Discovery 2012, 11, 191−200.
(4) Harrold, J. M.; Ramanathan, M.; Mager, D. E. Network-based approaches in drug discovery and early development. Clin. Pharmacol. Ther. 2013, 94, 651−658.
(5) Kell, D. B. Finding novel pharmaceuticals in the systems biology era using multiple effective drug targets, phenotypic screening and knowledge of transporters: where drug discovery went wrong and how to fix it. FEBS J. 2013, 280, 5957−5980.
(6) Pujol, A.; Mosca, R.; Farres, J.; Aloy, P. Unveiling the role of network and systems biology in drug discovery. Trends Pharmacol. Sci. 2010, 31, 115−123.
(7) Trinchieri, G. Cancer and Inflammation: An Old Intuition with Rapidly Evolving New Concepts. Annu. Rev. Immunol. 2012, 30, 677−706.
(8) Somvanshi, P. R.; Venkatesh, K. V. A conceptual review on systems biology in health and diseases: from biological networks to modern therapeutics. Syst. Synth. Biol. 2014, 8, 99−116.
(9) Iskar, M.; Zeller, G.; Zhao, X. M.; van Noort, V.; Bork, P. Drug discovery in the age of systems biology: the rise of computational approaches for data integration. Curr. Opin. Biotechnol. 2012, 23, 609−616.
(10) Leung, E. L.; Cao, Z. W.; Jiang, Z. H.; Zhou, H.; Liu, L. Network-based drug discovery by integrating systems biology and computational technologies. Briefings Bioinf. 2013, 14, 491−505.
(11) Ma’ayan, A.; Rouillard, A. D.; Clark, N. R.; Wang, Z.; Duan, Q.; Kou, Y. Lean Big Data integration in systems biology and systems pharmacology. Trends Pharmacol. Sci. 2014, 35, 450−460.
(12) Law, V.; Knox, C.; Djoumbou, Y.; Jewison, T.; Guo, A. C.; Liu, Y.; Maciejewski, A.; Arndt, D.; Wilson, M.; Neveu, V.; et al. DrugBank 4.0: shedding new light on drug metabolism. Nucleic Acids Res. 2014, 42, 1091−1097.
(13) Zhu, F.; Han, B.; Kumar, P.; Liu, X.; Ma, X.; Wei, X.; Huang, L.; Guo, Y.; Han, L.; Zheng, C.; Chen, Y. Update of TTD: therapeutic target database. Nucleic Acids Res. 2010, 38, 787−791.
(14) Degtyarenko, K.; De Matos, P.; Ennis, M.; Hastings, J.; Zbinden, M.; McNaught, A.; Alcántara, R.; Darsow, M.; Guedj, M.; Ashburner, M. ChEBI: a database and ontology for chemical entities of biological interest. Nucleic Acids Res. 2007, 36, 344−350.
(15) Liu, T.; Lin, Y.; Wen, X.; Jorissen, R. N.; Gilson, M. K. BindingDB: a web-accessible database of experimentally determined protein−ligand binding affinities. Nucleic Acids Res. 2007, 35, 198−201.
(16) Kuhn, M.; Szklarczyk, D.; Pletscher-Frankild, S.; Blicher, T. H.; von Mering, C.; Jensen, L. J.; Bork, P. STITCH 4: integration of protein−chemical interactions with user data. Nucleic Acids Res. 2014, 42, 401−407.
(17) Kanehisa, M.; Goto, S.; Sato, Y.; Furumichi, M.; Tanabe, M. KEGG for integration and interpretation of large-scale molecular data sets. Nucleic Acids Res. 2012, 40, 109−114.
(18) Frolkis, A.; Knox, C.; Lim, E.; Jewison, T.; Law, V.; Hau, D. D.; Liu, P.; Gautam, B.; Ly, S.; Guo, A. C.; et al. SMPDB: the small molecule pathway database. Nucleic Acids Res. 2010, 38, 480−487.
(19) Somervuo, P.; Holm, L. UniProt: a hub for protein information. Nucleic Acids Res. 2015, 43, 204−212.
(20) Kuhn, M.; Letunic, I.; Jensen, L. J.; Bork, P. The SIDER database of drugs and side effects. Nucleic Acids Res. 2016, 44, 1075−1079.
(21) Hamosh, A.; Scott, A. F.; Amberger, J. S.; Bocchini, C. A.; McKusick, V. A. Online Mendelian Inheritance in Man (OMIM), a knowledgebase of human genes and genetic disorders. Nucleic Acids Res. 2004, 33, 514−517.
(22) Hu, Q. N.; Deng, Z.; Tu, W.; Yang, X.; Meng, Z. B.; Deng, Z. X.; Liu, J. VNP: interactive visual network pharmacology of diseases, targets, and drugs. CPT: Pharmacometrics Syst. Pharmacol. 2014, 3, e105.
(23) Davis, A. P.; Grondin, C. J.; Johnson, R. J.; Sciaky, D.; King, B. L.; McMorran, R.; Wiegers, J.; Wiegers, T. C.; Mattingly, C. J. The comparative toxicogenomics database: update 2017. Nucleic Acids Res. 2017, 45, 972−978.
(24) Cao, D.-S.; Liang, Y.-Z.; Deng, Z.; Hu, Q.-N.; He, M.; Xu, Q.-S.; Zhou, G.-H.; Zhang, L.-X.; Deng, Z.; Liu, S. Genome-scale screening of drug-target associations relevant to Ki using a chemogenomics approach. PLoS One 2013, 8, e57680.
(25) Wang, C.; Liu, J.; Luo, F.; Deng, Z.; Hu, Q.-N. Predicting target-ligand interactions using protein ligand-binding site and ligand substructures. BMC Syst. Biol. 2015, 9, S2.
(26) Gottlieb, A.; Stein, G. Y.; Ruppin, E.; Sharan, R. PREDICT: a method for inferring novel drug indications with application to personalized medicine. Mol. Syst. Biol. 2011, 7, 496.
(27) Gray, K. A.; Yates, B.; Seal, R. L.; Wright, M. W.; Bruford, E. A. Genenames.org: the HGNC resources in 2015. Nucleic Acids Res. 2015, 43, 1079−1085.
(28) World Health Organization. The ICD-10 classification of mental and behavioural disorders: clinical descriptions and diagnostic guidelines; World Health Organization: Geneva, 1992.
(29) Meslamani, J.; Rognan, D.; Kellenberger, E. sc-PDB: a database for identifying variations and multiplicity of ‘druggable’ binding sites in proteins. Bioinformatics 2011, 27, 1324−1326.
(30) Fruchterman, T. M.; Reingold, E. M. Graph drawing by force-directed placement. Softw., Pract. Exp. 1991, 21, 1129−1164.
(31) Wang, P. S.; Walker, A. M.; Tsuang, M. T.; Orav, E. J.; Glynn, R. J.; Levin, R.; Avorn, J. Dopamine antagonists and the development of breast cancer. Arch. Gen. Psychiatry 2002, 59, 1147−1154.
(32) Folkman, J. Role of angiogenesis in tumor growth and metastasis. Semin. Oncol. 2002, 29, 15−18.
(33) Dvorak, H. Angiogenesis: update 2005. J. Thromb. Haemostasis 2005, 3 (3), 1835−1842.
(34) Sarkar, C.; Chakroborty, D.; Dasgupta, P. S.; Basu, S. Dopamine is a safe antiangiogenic drug which can also prevent 5-fluorouracil induced neutropenia. Int. J. Cancer 2015, 137, 744−749.