
Article pubs.acs.org/jcim

Cheminformatic Analysis of Antimalarial Chemical Space Illuminates Therapeutic Mechanisms and Offers Strategies for Therapy Development

Julia Nogueira Varela,† María Fernanda Lammoglia Cobo,†,‡,⊥ Sandip V. Pawar,† and Vikramaditya G. Yadav*,†,§

† Department of Chemical & Biological Engineering, The University of British Columbia, Vancouver, BC, Canada, V6T 1Z3
‡ Life Sciences Department, Monterrey Institute of Technology and Higher Education, Mexico City Campus, Mexico City, Mexico, 14380
§ Neglected Global Diseases Initiative, The University of British Columbia, Vancouver, BC, Canada, V6T 1Z3


ABSTRACT: The clear and present danger of malaria, which has been amplified in recent years by climate change, and the progressive thinning of our drug arsenal over the past two decades raise uncomfortable questions about the current state and future of antimalarial drug development. Besides suffering from many of the same technical challenges that affect drug development in other disease areas, the quest for new antimalarial therapies is also hindered by the complex, dynamic life cycle of the malaria parasite, P. falciparum, in its mosquito and human hosts, and its role thereof in the elicitation of drug resistance. New strategies are needed in order to ensure economical and expeditious development of new, more efficacious treatments. In the present study, we employ open-source cheminformatics tools to analyze the chemical space traversed by approved antimalarial drugs and promising candidates at various stages of development to uncover insights that could shape future endeavors in the field. Our scaffold-centric analysis reveals that the antimalarial chemical space is disjointed and segregated into a few dominant structural groups. In fact, the structures of antimalarial drugs and drug candidates are distributed according to Pareto’s principle. This structural convergence can potentially be exploited for future drug discovery by incorporating it into bioinformatics workflows that are typically employed for solving problems in structural biology. Significantly, we demonstrate how molecular scaffold hunting can be applied to unearth putative mechanisms of action of drugs whose activities remain a mystery, and how scaffold-centric analysis of drug space can also provide a recipe for combination therapies that minimize the likelihood of emergence of drug resistance, as well as identify areas on which to focus efforts. 
Finally, we also observe that over half of the molecules in the antimalarial space bear no resemblance to other molecules in the collection, which suggests that the pharmacobiology of antimalarial drugs has not been entirely surveyed.

1. INTRODUCTION

The World Health Organization estimates that malaria infects over 200 million people in the world each year.1,2 Of these, over half a million, most of whom are children in Africa, ultimately die of the disease. This figure translates to a staggering child mortality rate of 1 per minute. Malaria is a heavy economic burden on the poorest societies of the world, and progressively rising temperatures are amplifying its brunt.3 For example, at 20 °C, the malarial parasite, Plasmodium falciparum, takes 26 days to fully mature, which more or less coincides with the average lifespan of the adult female Anopheles gambiae mosquitoes that harbor and transmit these parasites. In contrast, when the temperature rises to 25 °C, the parasite completes its development in a mere 13 days. The mosquitoes are also more aggressive when their surroundings warm up, and they live longer and breed more in these conditions. The shortened development cycle of P. falciparum coupled with the increased likelihood of contact between people and disease-bearing mosquitoes will indubitably accelerate the spread of malaria.

Global warming has also expanded the geographic ranges of many vector-borne diseases.4,5 In North America, for example, mosquito control programs implemented during the 1980s had successfully confined malaria to parts of California. However, on the back of two of the hottest decades on record, it has spread southward to Texas, eastward to Florida and New York, and even as far north as Ontario. Globally, malaria has made a comeback to southern Europe and parts of Russia and has even expanded upward to the highlands of southern Africa. Worryingly, these developments have occurred just as reports of drug resistance against frontline artemisinin-based combination therapies (ACTs) have emerged from Southeast Asia.6 Worse, research and development (R&D) in the pharmaceutical industry,

Received: February 8, 2017
Published: August 15, 2017
© 2017 American Chemical Society

DOI: 10.1021/acs.jcim.7b00072 J. Chem. Inf. Model. 2017, 57, 2119−2131

Table 1. Chemical Categories and Mechanisms of Action of Approved Antimalarial Compounds and Drug Candidates Presently Undergoing Clinical Testing

compound category (letter code for Figure 1) | drug [clinical phase, if not approved] (compound no. in Table S1) | mechanism of action (MoA)
endoperoxides (A) | Artemisinin (1), Dihydroartemisinin (2), Artemether (3), Artesunate (4), Artemisone (5), 3-Artesanilide (6) | broad-spectrum antimalarial drugs that target the asexual and sexual blood stages through heme-mediated decomposition of their endoperoxide bridges, which produces toxic radicals
4-aminoquinolines (B) | Piperaquine (7), Amodiaquine (8), Chloroquine (9), Naphthoquine (10), Hydroxychloroquine (11), Pyronaridine (12), tert-Butyl isoquine (13), AQ-13 (14) | selectively target the blood stage of the malarial parasite; they inhibit the conversion of hematin to hemozoin, which subsequently disrupts the membrane function
8-aminoquinolines (C) | Primaquine (15), Tafenoquine (16), Bulaquine (17), NPC-1161B [Preclinical] (47) | effective during the liver and blood stages; the mechanisms of action are poorly characterized for most molecules in this class
amino alcohols (D) | Quinine (18), Lumefantrine (19), Halofantrine (20), Mefloquine (21) | mefloquine interacts with phospholipids in the cell membrane; MoAs for the other amino alcohols are unknown
antifolates (E) | Dapsone (22), Pyrimethamine (23), Proguanil (24), Chlorproguanil (25), Cycloguanil (26), Ferroquine [Phase 2A] (58) | interfere with folate biosynthesis in the parasite; ferroquine, on the other hand, inhibits hemozoin formation
sulfonamides (F) | Sulfadoxine (27), Sulfamethoxazole (28), Sulfadiazine (29) | inhibit the dihydropteroate pathway that is essential for tetrahydrofolate biosynthesis during the asexual phase of the blood stage
antibiotics (G); there are several distinct MoAs within this category; the letter codes for these MoAs have been listed after the compound number | Azithromycin (30, G1) | semisynthetic azalide derivative of erythromycin that inhibits the 70S subunit of the Plasmodium spp. ribosome
antibiotics (G) | Trimethoprim (31, G2) | closely related to pyrimethamine; it is a pyrimidine inhibitor of dihydrofolate reductase (DHFR)
antibiotics (G) | Tetracycline (32, G3) | slow-acting drug that targets the apicoplast of the malarial parasite
antibiotics (G) | Fosmidomycin (33, G4) | inhibits the reductoisomerase that acts on 1-deoxy-D-xylulose-5-phosphate, a key intermediate in the nonmevalonate pathway for isoprenoid synthesis
antibiotics (G) | Thiostrepton (34, G5) | targets the ribosome in the apicoplast
antibiotics (G) | Doxycycline (35, G6) | synthetic tetracycline that targets the apicoplast; also used as a prophylactic agent with chloroquine
antibiotics (G) | Mirincamycin (36, G7) | lincosamide antibiotic that acts on the apicoplast
others (H) | Atovaquone (37, H1) | hydroxyl naphthoquinone that acts on cytochrome bc1

Journal of Chemical Information and Modeling Article

Table 1. continued

compound category (letter code for Figure 1) | drug [clinical phase, if not approved] (compound no. in Table S1) | mechanism of action (MoA)
others (H), continued; there are several distinct MoAs within this category; the letter codes for these MoAs have been listed after the compound number | Riboflavin (38, H2) | inhibits hemozoin synthesis in the asexual phase of the blood stage
others (H) | Pentamidine (39, H3) | MoA is unknown, but it is speculated that the drug inhibits hemozoin formation in the parasite’s food vacuole
others (H) | Dehydroepiandrosterone or DHEA (40, H4) | prohormone synthesized from cholesterol; it inhibits glucose-6-phosphate dehydrogenase (G6PDH) in the parasite
others (H) | Cycloheximide (41, H5) | inhibits protein synthesis
others (H) | Deferoxamine (43, H6) | iron-chelating agent
others (H) | Methylene Blue (44, H7) | interferes with hemoglobin and heme metabolism in the digestive organelles; also selectively inhibits glutathione reductase and acts as a chloroquine sensitizer
pyrazoles (I1) | 21A02 [Preclinical] (45) | unknown
aminopyridine (I2) | MMV390048 [Preclinical] (46) | unknown
quinolone-3-diarylether (I3) | ELQ-300 [Preclinical] (48) | inhibits mitochondrially expressed cytochrome bc1
immucillin G (I4) | BCX4945 [Preclinical] (49) | inhibitor of purine nucleoside phosphorylase (PNP)
1,2,4,5-tetraoxane (I5) | RKA182 [Preclinical] (50) | interferes with hemoglobin catabolism in the parasite
diaminopyridine (I5) | P218 [Preclinical] (51) | inhibits DHFR and folate biosynthesis
triazolopyrimidine (J) | DSM265 [Phase 1] (54) | inhibits dihydroorotate dehydrogenase (DHOD)
imidazolopiperazine (L1) | KAF156 [Phase 2A] (55) | believed to target a transmembrane domain of the cyclic amine resistance locus protein (Carl), though the precise location is unknown
spiroindolone (L2) | NITD609 [Phase 2A] (56) | targets P-type ATPase 4 (Atp4), which was initially identified as a calcium ion pump but is now reported as being important for maintaining sodium homeostasis in the parasite; inhibition of this enzyme increases the intracellular sodium concentration, which causes the parasitic cell to swell and die
1,2,4-trioxolane (L3) | OZ439 [Phase 2A] (57), OZ277 [Phase 3] (63) | these compounds interfere with hemoglobin catabolism in the parasite
natural products from Nauclea pobeguinil (M1) | Strictosamide [Phase 2B/3] (59) | unknown
natural products from Argemone mexicana (M2) | Berberine [Phase 2B/3] (60), Protopine [Phase 2B/3] (61), Allocryptopine [Phase 2B/3] (62) | unknown

especially in the domain of anti-infective small-molecule drugs, is presently at its lowest ebb. Drug approval rates have never been lower and many pharmaceutical companies have downsized their anti-infective discovery programs.7 A global health catastrophe could be at hand just as our drug arsenal is at its thinnest.8 New antimalarial therapies will soon be required,9 but the current state of antimalarial drug development points to an uncertain future.

Antimalarial hits are presently identified using high-throughput whole-cell assays with the parasite itself, protein kinase activity assays, or through functional genomics.10 Yet, the mechanisms of action (MoA) of most of the compounds identified in these screens are poorly understood or unknown. The complex, dynamic life cycle of P. falciparum in its mosquito and human hosts further hinders the discovery of antimalarial compounds. The parasite develops in humans in two stages, the liver stage and the blood stage, in that order.11 The blood stage is further divided into an asexual and a sexual phase.12 Since symptoms of the disease begin to manifest only after the parasite progresses to the blood stage,13 most antimalarial drugs target the parasite at this stage. Artemisinin, the chief ingredient in gold-standard ACTs, acts on both the asexual and sexual phases of development.10,14 Other frontline combination treatments, including amodiaquine−artesunate, piperaquine−dihydroartemisinin, and ferroquine−artesunate, also target the blood stage of P. falciparum.15,16 Nevertheless, despite having strong binding kinetics, these compounds have short in vivo half-lives. Since the parasite is typically exposed to suboptimal concentrations of the drug due to its rapid clearance, the likelihood of drug resistance emerging is also higher.15−17 It is evident that the inclusion of compounds that target the liver stage of the parasite could significantly improve the efficacy of these combination treatments.

Further knowledge about the MoA and pharmacology of drugs will also be essential for predicting the emergence of and mitigating drug resistance.10,18 For instance, polymorphisms in the Plasmodium spp. drug efflux transporter genes crt and mdr1, dhfr, and dhps have been shown to play a leading role in the development of resistance against classical antimalarial drugs such as chloroquine, antifolates, and sulfadoxine−pyrimethamine, respectively.15,19−21 While these insights are guiding the development of the next generation of antimalarial therapies, a complementary school of thought proposes to re-examine the vast volumes of data generated by previous high-throughput screens using cheminformatics tools in order to improve future R&D outcomes.22−24 Such analyses require minimal computing infrastructure and are incredibly fast and easy to undertake. Moreover, they often succeed in focusing drug discovery programs.25,26

We assessed these claims in the present study by employing open-source cheminformatics tools installed on a personal computer to analyze the chemical space traversed by all approved antimalarial drugs and promising compounds from chemical libraries maintained by pharmaceutical companies and academic research groups. Previous work has used the Tanimoto index to identify new scaffolds for chemokine receptors.27 We extracted the Bemis−Murcko scaffolds of every molecule and performed a similarity analysis on the simplified structures in order to generate a connectivity diagram. Since the compounds that are linked together in this diagram can be inferred to have a common target and a similar MoA,18 our analysis illuminates MoAs that are unclear or unknown, provides recipes for new antimalarial combination therapies,28 and identifies promising structures for lead identification and optimization toward development of new, standalone antimalarial drugs.

2. METHODOLOGY

We analyzed the structures of 614 molecules (Table S1 in the Supporting Information) in order to generate the connectivity diagram. The complete list of molecules in the SMILES specification was compiled from a variety of sources. Of the 614 compounds, 63 are either approved antimalarial drugs or candidates presently undergoing clinical testing, and were sourced using chemical information search engines such as PubMed, SciFinder, Reaxys, and Google Scholar. These molecules act on the liver and/or blood stage of the life cycle of the parasite through a variety of mechanisms (Table 1). A further 313 compounds (out of 400) and 58 compounds (out of 5370) were selected from chemical libraries synthesized by research groups at the University of California, San Francisco (UCSF), and Harvard University, respectively. Only those molecules that inhibit growth of the parasite by 50% or greater at a dose of 5 μM were selected from the UCSF library. Unlike the UCSF library, the molecules in the Harvard library were classified as being active or inactive against the malarial parasite. Consequently, only active molecules were picked from the Harvard library. Compounds were also selected from the Novartis and GSK chemical libraries that are available in the ChEMBL database. The Novartis and GSK libraries cumulatively comprise over 6000 compounds, not all of which are unique. As a consequence, we randomly selected 51 unique compounds from each library. Our decision was informed by the lack of computing power at our disposal to analyze all the samples in these libraries and uncertainty about their activity outcomes in the in vitro assays. Our investigation set also comprised 78 approved anticancer drugs. We hypothesized that since the vast majority of the scaffolds of the anticancer drugs considered in this study would be quite different from those of the antimalarial compounds, the anticancer drugs could serve as effective negative controls for our analysis.
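The composition of the investigation set can be checked with a few lines of arithmetic. The sketch below is illustrative only: the dictionary keys and the helper `passes_ucsf_filter` are hypothetical names introduced here, and only the counts and the 50% inhibition at 5 μM rule come from the text.

```python
# Subset sizes as stated in the Methodology. The keys are illustrative
# labels chosen for this sketch, not identifiers used by the authors.
subset_sizes = {
    "approved_or_clinical": 63,
    "ucsf_library": 313,
    "harvard_library": 58,
    "novartis_chembl": 51,
    "gsk_chembl": 51,
    "anticancer_controls": 78,
}

# The six subsets should add up to the 614 molecules that were analyzed.
total = sum(subset_sizes.values())


def passes_ucsf_filter(percent_inhibition_at_5uM: float) -> bool:
    """Hypothetical helper mirroring the stated UCSF selection rule:
    keep a molecule if it inhibits parasite growth by 50% or more at 5 uM."""
    return percent_inhibition_at_5uM >= 50.0


print(total)
```

The subsets sum to the 614 molecules reported, which confirms that the anticancer controls are counted within that total.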
The structures of all 614 compounds were subsequently extracted from the SMILES specifications and imported into MarvinSketch29 for identification of their loose Bemis−Murcko (BM) scaffolds. The resulting scaffold data were once again expressed in SMILES specification and exported to an Excel spreadsheet. Loose BM scaffolds for two molecules, N-acetyl-D-penicillamine and deferoxamine (compounds 42 and 43, respectively, in Table S1), could not be identified by the software. All analysis from there onward was performed using the JChem plugin30 in Excel. The BM loose framework was chosen instead of the BM framework because it also takes into account different atoms, instead of only carbon and hydrogen. The SMILES specifications for loose BM scaffolds of the 614 structures were used to generate a Tanimoto similarity matrix using an in-built similarity estimation function in the plugin. Our choice of the Tanimoto index was informed by previous studies that reported a high degree of reproducibility of the calculation in analysis of numerous chemical databases.31,32

JChem offers a choice between several Tanimoto similarity indices that differ in the manner in which the fingerprints of the molecule are screened. Of these, comparing the molecules’ chemical-hashed fingerprints or their pharmacophore fingerprints are the two most common approaches to evaluating Tanimoto similarity. We evaluated the mean similarity between any two molecules in a population comprising 3960 bioactive molecules from the ChEMBL database not known to have antimalarial activity using both fingerprinting approaches and determined that the difference between the means is statistically significant (α = 0.05). The choice of pharmacophore fingerprinting is also based on its ability to better capture the properties of the biological binding events.33 Since two molecules with similar pharmacophore fingerprints are more likely to bind to a common target in the parasite, the method’s conclusions are also likely to be more informative for medicinal chemistry efforts than conclusions drawn using chemical-hashed fingerprints. Consequently, we opted to use Tanimoto similarities evaluated through pharmacophore fingerprinting to generate the similarity matrix.

A previous report established that similarities of 0.7 or higher translate to functional equivalence.34,35 Accordingly, we defined molecules having a similarity coefficient of 0.7 or higher as being similar. Conversely, molecules with similarity coefficients below this threshold were considered to be dissimilar. As a consequence, we further processed the matrix using a customized script that ignored similarities below 0.7. This operation generated a sparse matrix, wherein each populated cell confirmed structural connectivity for the pair evaluated in that cell. The sparse matrix was then imported into Cytoscape36 to generate the connectivity diagram for the population (Figure 1). Each node in the connectivity diagram corresponds to a unique compound among the 614 molecules analyzed, and a connection between any two nodes indicates a minimum similarity of 0.7 between the two molecules.

Since antimalarial compounds such as the amino alcohols, 4- and 8-aminoquinolines, and atovaquone, as well as combination therapies such as artemether−lumefantrine, are administered orally, we also assessed each molecule’s compliance with the Rule of 5 using an in-built JChem function and rendered it in Cytoscape. This analysis identifies all molecules that may be desirable as oral drugs since we are comparing chemical structures that would also be comparable in biological systems.

Figure 1. Antimalarial chemical space. The network diagram for the 614 molecules investigated in this study reveals that antimalarial chemical space is disjointed and comprises several hub nodes.

Figure 2. Nodal analysis of antimalarial chemical space. The chemical network comprises 391 connected nodes and 302 isolated nodes. The nodes have, on average, 1.629 neighbors. The network has a clustering coefficient of 0.219. The clustering coefficient for a node is the ratio of the number of connections among that node’s neighbors to the maximum number of connections that could possibly exist among them. The clustering coefficient for the network is the average of the clustering coefficients of all nodes that it comprises. The centralization factor of the network is estimated to be 0.033, which indicates that the network comprises a larger number of smaller, segregated groups.
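The similarity and thresholding steps of the Methodology can be sketched in a few lines. This is a minimal illustration, not the authors' customized script or the JChem function: fingerprints are assumed to be available as sets of "on" bit positions, the names `tanimoto` and `similarity_edges` are hypothetical, and the toy fingerprints are made up. Only the Tanimoto definition and the 0.7 cutoff are taken from the text.

```python
from itertools import combinations


def tanimoto(a: set, b: set) -> float:
    """Tanimoto coefficient |A & B| / |A | B| for fingerprints stored as
    sets of 'on' bit positions."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def similarity_edges(fingerprints: dict, threshold: float = 0.7) -> list:
    """Return (name_a, name_b, score) for every pair whose similarity is at
    least `threshold`; this plays the role of the sparse similarity matrix."""
    edges = []
    for (name_a, fp_a), (name_b, fp_b) in combinations(fingerprints.items(), 2):
        score = tanimoto(fp_a, fp_b)
        if score >= threshold:
            edges.append((name_a, name_b, score))
    return edges


# Toy fingerprints with made-up bit positions (not real pharmacophore data).
toy = {
    "mol_A": {1, 2, 3, 4, 5, 6, 7, 8},
    "mol_B": {1, 2, 3, 4, 5, 6, 7, 9},
    "mol_C": {20, 21, 22},
}
edges = similarity_edges(toy)
print(edges)
```

In this toy set only `mol_A` and `mol_B` are linked (7 shared bits out of 9 in the union), while `mol_C` remains an isolated node. An edge list of this kind is one form of input that a network tool such as Cytoscape can render.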

3. RESULTS AND DISCUSSION

The chemical space, or more appropriately the chemical network, traversed by approved antimalarial drugs and promising drug candidates comprises 391 connected nodes and 302 isolated nodes (Figure 2). The nodes have, on average, fewer than two neighbors (1.629, to be precise), and the network as a whole has a clustering coefficient of 0.219 and a centralization factor of 0.033. Clustering coefficients and centralization factors range between 0 and 1, with lower numbers indicating that the network comprises a larger number of smaller, segregated groups. Likewise, the heterogeneity of the network, which is defined as the ratio of the standard deviation to the mean of the number of neighbors that a node has,37 is estimated to be 1.902. The low clustering coefficient and centralization factor and high heterogeneity suggest that the antimalarial network contains several hub nodes, which implies that the chemical space is disjointed and segregated into a few dominant structural groups such as the endoperoxides that exhibit a great deal of intraclass similarity but do not bear any resemblance to other classes of antimalarial drugs. Moreover, two endoperoxide drugs, artemisone and 3-artesanilide, do not bear any similarities to any of the other endoperoxides or to one another (Table 2). The outlying endoperoxides appear to be products of medicinal chemistry activities aimed at improving the residence time of endoperoxide drugs in the bloodstream.

Table 2. List of Selected 4- and 8-Aminoquinoline Compounds

Table 3. Compounds with Known Antifolate Activity

Table 4. Selected Amino Alcohols, 8-Aminoquinolines, Aminopyridines, and Pyrazoles
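The network statistics quoted at the start of this section (number of isolated nodes, average number of neighbors, clustering coefficient, heterogeneity) can be reproduced from an edge list with the standard library alone. The sketch below is an assumption-laden illustration rather than the Cytoscape implementation: `network_stats` is a hypothetical name, nodes with fewer than two neighbors are assigned a clustering coefficient of 0, and the population standard deviation is used for heterogeneity. Either convention may differ from what the authors' software applied.

```python
from statistics import mean, pstdev


def network_stats(nodes, edges):
    """Summary statistics for an undirected graph given as node names and
    (u, v) pairs."""
    neighbors = {n: set() for n in nodes}
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)

    degrees = [len(neighbors[n]) for n in nodes]

    def local_clustering(n):
        # Fraction of possible links among the neighbors of n that exist.
        nbrs = list(neighbors[n])
        k = len(nbrs)
        if k < 2:
            return 0.0  # assumption: undefined cases count as 0
        links = sum(
            1
            for i in range(k)
            for j in range(i + 1, k)
            if nbrs[j] in neighbors[nbrs[i]]
        )
        return 2.0 * links / (k * (k - 1))

    avg_degree = mean(degrees)
    return {
        "isolated": sum(1 for d in degrees if d == 0),
        "average_neighbors": avg_degree,
        "clustering_coefficient": mean([local_clustering(n) for n in nodes]),
        # Ratio of standard deviation to mean of the number of neighbors.
        "heterogeneity": pstdev(degrees) / avg_degree if avg_degree else 0.0,
    }


# Toy graph: a triangle (a, b, c), a pendant node d, and an isolated node e.
toy_nodes = ["a", "b", "c", "d", "e"]
toy_edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
stats = network_stats(toy_nodes, toy_edges)
print(stats)
```

A heterogeneity well above the toy value, as reported for the antimalarial network, indicates that a few nodes hold most of the connections while many hold none.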

DOI: 10.1021/acs.jcim.7b00072 J. Chem. Inf. Model. 2017, 57, 2119−2131

Article

Journal of Chemical Information and Modeling Table 5. Selected Anticancer Molecules

aminoquinoline compounds (Table 2). Amodiaquine and tertbutyl isoquine, which belongs to the former group, are similar to tafenoquine from the latter group; and primaquine, another

Likewise, intraclass similarity is also observed within the 4aminoquinolines, amino alcohols, and sulfonamides. On the other hand, interclass similarity is observed between 4- and 82128

DOI: 10.1021/acs.jcim.7b00072 J. Chem. Inf. Model. 2017, 57, 2119−2131

Article

Journal of Chemical Information and Modeling

a preclinical drug candidate. ELQ-300 interferes with the activity of cytochrome b1 within the parasite, which suggests that MMV390048 might be acting on P. falciparum through multiple mechanisms. Additionally, the triazolopyrimidine DSM265, a phase 1 candidate, is also similar to several 4and 8-aminoquinolines. Triazolopyrimidines are strong inhibitors of dihydroorotate dehydrogenase (DHOD), a key enzyme in pyrimidine biosynthesis. In addition, the pyrazole-based preclinical candidate 21A02 forms a distinct subset with compounds from the Novartis and UCSF libraries. This result reveals a potentially underexploited role for pyrazole compounds in the treatment of malaria. Similar insights can be drawn from the connections between the UCSF, Harvard, Novartis, and GSK libraries to approved antimalarial drugs. Lastly, over half of the molecules in antimalarial space bear no resemblance to other molecules in the investigated set. One interpretation of this observation is that the pharmacobiology of antimalarial drugs has not been entirely surveyed, which paves the way for employing chemical genomics to investigate compound−target relationships in the parasite and develop a systems-level understanding of antimalarial pharmacobiology.41 The inclusion of the anticancer set within the analysis also yielded some interesting observations. The prohormone dehydroepiandrosterone (DHEA), which is a potent antimalarial agent, is similar to the anticancer drugs prednisone, exemestane, and megestrol (Table 5). DHEA inhibits glucose6-phosphate dehydrogenase, which is chiefly responsible for controlling the pentose phosphate pathway that maintains the redox balance in cells. The enzyme has also been implicated in controlling apoptosis and angiogenesis, thus explaining its role as a target for anticancer therapies.42 Targeting intermediates in biosynthesis that the parasite shares with drugged diseases such as cancer provides a strong platform for drug discovery. 
This observation is corroborated by other connections between the antimalarial and anticancer space. For instance, the antimalarial drugs proguanil and chlorproguanil that target folate biosynthesis are similar to the anticancer drugs procarbazine, chlorambucil, and vorinostat. Inhibition of folate biosynthesis is a proven strategy for treating cancer. Proguanil is also similar to several molecules in the Harvard libraryHMS2093O22, HMS2093C12, and HMS1920P10. The high number of isolated nodes also suggests that the network is sparse. In fact, the network is distributed according to Pareto’s principle (Figure 3). Also known as the 80/20 rule, Pareto’s principle states that roughly 80% of effects originate from 20% of causes. In the case of antimalarial chemical space, we observe that the scaffolds of approximately 80% of approved drugs and drug candidates comprise just 20% of all scaffolds in the space. Given the diverse origins of the molecules considered in this study, it is highly improbable that they are distributed in this manner owing to deliberate design. It is likely that antimalarial chemical spaceor, for that matter, the space traversed by all pharmacoactive moleculesis the product of thermodynamic convergence.43 The thermodynamic convergence for a scaffold is defined as the progressive evolution of its structure in order to minimize the free energy of its binding within the targeted active site. Additionally, adherence with Pareto’s principle implies that the network can be described using a power law, which, in turn, implies that it is scale-free44 and is the product of a stochastic multiplicative process.43 Similar behavior is exhibited by equilibrium statistical mechanical systems that are described by Boltzmann’s law. One interpretation of such behavior is that evolution of any

8-aminoquinoline, is similar to the 4-aminoquinolines, chloroquine hydroxychloroquine, and AQ-13. Scaffold similarity can be extended to unearth unknown MoAs of molecules. For instance, while the mechanism by which the 8-aminoquinolines inhibits P. falciparum is poorly understood, their similarity to the 4-aminoquinolines suggests that they too might act either by inhibiting the conversion of hematin to hemozoin, which subsequently disrupts the parasite’s plasma membrane during the blood stage of its life cycle or by preventing the parasite from detoxifying the heme group.15 8-Aminoquinolines have also been observed to be effective during the liver stage of the parasite’s life cycle.11,15,16 The minor structural difference between the 4- and 8-aminoquinolines and their effectiveness during the blood and liver stages, respectively, lays the groundwork for establishment of novel structure−activity relationships that can be exploited for the development of standalone compounds having the same effect as combination therapies. The strategy of colocalizing structural features onto a common scaffold or combining two scaffolds together has been previously explored by combining the 4-aminoquinoline scaffold with endoperoxides such as dihydroartemisinin.16 While the resulting products exhibited higher residence times, they were simply too toxic to be viable drug candidates. On the other hand, scaffold similarity between trimethoprim, which is an inhibitor of DHFR, and the UCSF drug candidates MMV665928, MMV665929, and MMV665949, whose MoAs are poorly understood, offers a window into the activity of these molecules (Table 3). Trimethoprim is also similar to dapsone, an antifolate and approved antimalarial drug, and the anticancer drug anastrazole. The latter inhibits aromatase or estrogen synthase. 
The structural similarity between trimethoprim and anastrazole could be exploited by conveniently coformulating the drugs into a combination therapy that minimizes the likelihood of emergence of drug resistance. In fact, this strategy has already been effectively employed to develop combination therapies comprising antifolates and 4-aminoquinolines. For example, pyrimethamine, an antifolate drug, is similar to MMV020505, another molecule from the UCSF library. The latter is a 4-aminoquinoline derivative and is similar to a number of compounds in that family. Several 4-aminoquinolines have been investigated as combination therapies with pyrimethamine, and some have even witnessed widespread adoption in regions where the malarial parasite has developed resistance to chloroquine.38 Structural similarity is also observed between amino alcohols with unknown MoAs such as halofantrine and methylene blue, which target the malarial parasite through as many as three distinct routes (Table 4). Of the molecules that are currently in clinical testing, berberine and protopine are similar to one another. These molecules are natural products in plant extracts of Argemone mexicana, a species of poppy that is endemic to Mexico. Structural similarity between these molecules suggests that they may be products of the same of metabolic pathway. Diversity-oriented synthesis, wherein a common scaffold is diversely functionalized, is very common in plant secondary metabolic pathways.39,40 The preclinical candidates NPC1161B and MMV390048 are also similar to one another. The former is an 8-aminoquinoline and the latter is an aminopyridine with an unknown MoA. However, NPC1161B is also similar to the 4aminoquinolines, amodiaquine, and tert-butyl isoquine. These structural relationships provide a window into the MoAs of 8aminoquinolines, as well as aminopyridine drugs. MMV390048 is also similar to the quinolone-3-diarylether, ELQ-300, which is 2129

DOI: 10.1021/acs.jcim.7b00072 J. Chem. Inf. Model. 2017, 57, 2119−2131

Article

Journal of Chemical Information and Modeling

molecules. Our analysis revealed some valuable insights about antimalarial chemical space that are also corroborated by experimental evidence collected by other research groups. Saliently, we observe that antimalarial chemical space is disjointed and segregated into a few dominant structural groups, and the chemical network is distributed according to Pareto’s principle. The emergence of a power law phenomenon within the network raises prospects for the use of principles of network thermodynamics to evolve the network to ultimately comprise a population of highly drug-like molecules. Additionally, scaffold similarity can be successfully employed to determine unknown MoAs of drugs and promising candidates; and either formulate compatible combination therapies that exhibit desired pharmacokinetics and minimize the likelihood of emergence of drug resistance or synthesize scaffolds that colocalize two or more distinct structural moieties for polypharmacological activity.49 We also observe that over half of the molecules in antimalarial space bear low scaffold/ pharmacophore similarity to other molecules in the collection, which suggests that the pharmacobiology of antimalarial drugs has not been entirely surveyed. The insights that we have unearthed regarding scaffold similarity and the preponderance of a small number of scaffolds in antimalarial chemical space, the potential use of chemical genomics to completely chart the pharmacobiology of antimalarial drugs, and the identification of a clear opportunity to develop a vector support machine employing network thermodynamics provides a template for future activities in antimalarial drug discovery. Lastly, the volume of insights that cheminformatics yields, when contrasted with how relatively straightforward it is to use, lays the groundwork for the use of similar techniques to assess a much larger pharmacoactive space to potentially improve research outcomes in drug discovery in general.

Figure 3. Antimalarial chemical space is distributed according to Pareto’s principle. The emergence of a power law phenomenon suggests that the chemical network could be the product of thermodynamic convergence. This observation raises prospects of using principles from network thermodynamics to evolve a chemical library so that it eventually comprises molecules that are highly drug-like.

chemical space or network consumes free energy and that convergence to a state that is described by a power law occurs on account of the emergence of dominant nodes that consume the most free energy.45 This observation, in conjunction with the results of a recent study using support vector machines,46 offers an alternative strategy to rational drug design and conventional QSAR in drug discovery.47 In this alternative approach, the scaffolds that yield the vast majority of approved drugs and drug candidates could be evaluated using machine learning to define core bioactive scaffold(s) that are essential for drug-likeness. The core scaffolds could then be used as building blocks of highly focused libraries that have a higher likelihood of yielding hits compared to conventional chemical libraries. Such use of thermodynamics for evolving networks48 is rapidly gaining traction in the field of pattern recognition, and we predict that its use in drug discovery is imminent. Incidentally, 4- and 8-aminoquinolines accounted for the greatest number of bioactive molecules evaluated in this study. This fact, coupled with recent insights regarding the efficacy of combination therapies comprising 4-aminoquinolines and pyrimethamine, the realization that the pharmacobiology of antimalarial drugs remains incompletely mapped, and the eventual development of suitable support vector machines employing network thermodynamics, provides a template for future activities in antimalarial drug discovery.
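One simple way to probe a Pareto-like distribution of scaffolds is sketched below, under stated assumptions. Given the number of molecules assigned to each scaffold, it reports the share of molecules covered by the most populated 20% of scaffolds and the least-squares slope of log(count) versus log(rank), where a roughly straight line on log-log axes is consistent with a power law. The counts, the 20% cutoff, and the function names are hypothetical and are not taken from the study.

```python
import math


def pareto_share(counts, top_fraction=0.2):
    """Fraction of all molecules covered by the most populated scaffolds."""
    ranked = sorted(counts, reverse=True)
    k = max(1, round(top_fraction * len(ranked)))
    return sum(ranked[:k]) / sum(ranked)


def loglog_slope(counts):
    """Least-squares slope of log(count) against log(rank)."""
    ranked = sorted(counts, reverse=True)
    xs = [math.log(rank) for rank in range(1, len(ranked) + 1)]
    ys = [math.log(count) for count in ranked]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Hypothetical molecules-per-scaffold counts (about 120 / rank), not study data.
toy_counts = [120, 60, 40, 30, 24, 20, 17, 15, 13, 12]

share = pareto_share(toy_counts)
slope = loglog_slope(toy_counts)
print(f"top 20% of scaffolds cover {share:.1%} of molecules")
print(f"log-log slope: {slope:.2f}")
```

A slope fitted this way is only a rough diagnostic; a rigorous test for power-law behavior would need a larger sample and a proper goodness-of-fit assessment.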



ASSOCIATED CONTENT

Supporting Information

The Supporting Information is available free of charge on the ACS Publications website at DOI: 10.1021/acs.jcim.7b00072.

Endoperoxides (Table S2), anticancer compounds (Tables S3 and S4), Harvard lab set compounds (Tables S5−S7), De Risi lab set compounds (Table S8), GSK set compounds (Table S9), and Novartis set compounds (Table S10) with their structure and BM loose framework comparison (PDF)

Table S1: List of the 614 chemicals (XLSX)



4. CONCLUSIONS

The past decade has witnessed frenetic activity by the pharmaceutical industry and academic community to discover new antimalarial drugs. Unfortunately, scientific and technical challenges have greatly constricted the discovery pipeline. The necessity for new and improved antimalarial drugs is now particularly acute owing to the emergence of drug-resistant strains of P. falciparum in Southeast Asia. In the current study, we have employed open-source cheminformatics tools installed on a personal computer to analyze the chemical space traversed by all approved antimalarial drugs and promising compounds from chemical libraries maintained by pharmaceutical companies and academic research groups. We extracted Bemis−Murcko scaffolds of every molecule and subsequently generated a network diagram to assess scaffold similarity between the

AUTHOR INFORMATION

Corresponding Author

*E-mail: [email protected].

ORCID

María Fernanda Lammoglia Cobo: 0000-0002-0369-0364
Vikramaditya G. Yadav: 0000-0002-1120-6392

Present Address

⊥Molecular Cancer Research Centre, Charité−University Hospital Berlin, Berlin, Germany, 13353.

Notes

The authors declare no competing financial interest.



REFERENCES

(1) World Malaria Report 2015; 2015.


DOI: 10.1021/acs.jcim.7b00072 J. Chem. Inf. Model. 2017, 57, 2119−2131


(2) Renslo, A. R. Antimalarial Drug Discovery: From Quinine to the Dream of Eradication. ACS Med. Chem. Lett. 2013, 4 (12), 1126−1128.
(3) McMichael, A. J.; Woodruff, R. E.; Hales, S. Climate Change and Human Health. Lancet 2006, 367 (9513), 859−869.
(4) Epstein, P. R. Climate and Health. Science (Washington, DC, U. S.) 1999, 285 (5426), 347−348.
(5) Epstein, P. R. Is Global Warming Harmful to Health? Sci. Am. 2000, 283 (2), 50−57.
(6) O’Brien, C.; Henrich, P. P.; Passi, N.; Fidock, D. A. Recent Clinical and Molecular Insights into Emerging Artemisinin Resistance in Plasmodium Falciparum. Curr. Opin. Infect. Dis. 2011, 24 (6), 570−577.
(7) Brandenburg, K.; Schürholz, T. Lack of New Antiinfective Agents: Passing into the Pre-Antibiotic Age? World J. Biol. Chem. 2015, 6 (3), 71−77.
(8) Yadav, V. Biosynthonics: Charting the Future Role of Biocatalysis and Metabolic Engineering in Drug Discovery. Ind. Eng. Chem. Res. 2014, 53 (49), 18597−18610.
(9) Anderson, T. J. C.; Roper, C. The Origins and Spread of Antimalarial Drug Resistance: Lessons for Policy Makers. Acta Trop. 2005, 94 (3), 269−280.
(10) Flannery, E. L.; Chatterjee, A. K.; Winzeler, E. A. Antimalarial Drug Discovery - Approaches and Progress towards New Medicines. Nat. Rev. Microbiol. 2013, 11 (12), 849−862.
(11) Derbyshire, E. R.; Mota, M. M.; Clardy, J. The Next Opportunity in Anti-Malaria Drug Discovery: The Liver Stage. PLoS Pathog. 2011, 7 (9), e1002178.
(12) Bousema, T.; Okell, L.; Felger, I.; Drakeley, C. Asymptomatic Malaria Infections: Detectability, Transmissibility and Public Health Relevance. Nat. Rev. Microbiol. 2014, 12, 833−840.
(13) Miller, L. H.; Ackerman, H. C.; Su, X.; Wellems, T. E. Malaria Biology and Disease Pathogenesis: Insights for New Treatments. Nat. Med. 2013, 19 (2), 156−167.
(14) O’Neill, P. M.; Barton, V. E.; Ward, S. A. The Molecular Mechanism of Action of Artemisinin - The Debate Continues. Molecules 2010, 15 (3), 1705−1721.
(15) Biamonte, M. A.; Wanner, J.; Le Roch, K. G. Recent Advances in Malaria Drug Discovery. Bioorg. Med. Chem. Lett. 2013, 23 (10), 2829−2843.
(16) Teixeira, C.; Pérez, B.; Gomes, A.; Gomes, J. R. B.; Gomes, P.; Vale, N. “Recycling” Classical Drugs for Malaria. Chem. Rev. 2014, 114, 11164−11220.
(17) Mbengue, A.; Bhattacharjee, S.; Pandharkar, T.; Liu, H.; Estiu, G.; Stahelin, R. V.; Rizk, S. S.; Njimoh, D. L.; Ryan, Y.; Chotivanich, K.; Nguon, C.; Ghorbal, M.; Lopez-Rubio, J.-J.; Pfrender, M.; Emrich, S.; Mohandas, N.; Dondorp, A. M.; Wiest, O.; Haldar, K. A Molecular Mechanism of Artemisinin Resistance in Plasmodium Falciparum Malaria. Nature 2015, 520 (7549), 683−687.
(18) McNamara, C.; Winzeler, E. A. Target Identification and Validation of Novel Antimalarials. Future Microbiol. 2011, 6, 693−704.
(19) Ekland, E. H.; Fidock, D. A. Advances in Understanding the Genetic Basis of Antimalarial Drug Resistance. Curr. Opin. Microbiol. 2007, 10 (4), 363−370.
(20) Delves, M.; Plouffe, D.; Scheurer, C.; Meister, S.; Wittlin, S.; Winzeler, E. A.; Sinden, R. E.; Leroy, D. The Activities of Current Antimalarial Drugs on the Life Cycle Stages of Plasmodium: A Comparative Study with Human and Rodent Parasites. PLoS Med. 2012, 9 (2), e1001169.
(21) Inoue, J.; Lopes, D.; do Rosário, V.; Machado, M.; Hristov, A. D.; Lima, G. F.; Costa-Nascimento, M. J.; Segurado, A. C.; Di Santi, S. M. Analysis of Polymorphisms in Plasmodium Falciparum Genes Related to Drug Resistance: A Survey over Four Decades under Different Treatment Policies in Brazil. Malar. J. 2014, 13, 372.
(22) Xu, J.; Hagler, A. Chemoinformatics and Drug Discovery. Molecules 2002, 7 (8), 566−600.
(23) Engel, T. Basic Overview of Chemoinformatics. J. Chem. Inf. Model. 2006, 46 (6), 2267−2277.

(24) Willett, P. Chemoinformatics - Similarity and Diversity in Chemical Libraries. Curr. Opin. Biotechnol. 2000, 11 (1), 85−88.
(25) Willett, P. Similarity-Based Data Mining in Files of Two-Dimensional Chemical Structures Using Fingerprint Measures of Molecular Resemblance. Wiley Interdiscip. Rev. Data Min. Knowl. Discovery 2011, 1 (3), 241−251.
(26) Ekins, S.; Freundlich, J. S.; Choi, I.; Sarker, M.; Talcott, C. Computational Databases, Pathway and Cheminformatics Tools for Tuberculosis Drug Discovery. Trends Microbiol. 2011, 19 (2), 65−74.
(27) Nair, P. C.; Sobhia, M. E. J. Chem. Inf. Model. 2008, 48, 1891−1902.
(28) Lewis, R.; Guha, R.; Korcsmaros, T.; Bender, A. Synergy Maps: Exploring Compound Combinations Using Network-Based Visualization. J. Cheminf. 2015, 7, 1−11.
(29) Chemaxon MarvinSketch. www.chemaxon.com.
(30) Chemaxon JChem for Office. www.chemaxon.com.
(31) Bajusz, D.; Rácz, A.; Héberger, K. Why Is Tanimoto Index an Appropriate Choice for Fingerprint-Based Similarity Calculations? J. Cheminf. 2015, 7 (1), 20.
(32) Todeschini, R.; Consonni, V.; Xiang, H.; Holliday, J.; Buscema, M.; Willett, P. Similarity Coefficients for Binary Chemoinformatics Data: Overview and Extended Comparison Using Simulated and Real Data Sets. J. Chem. Inf. Model. 2012, 52 (11), 2884−2901.
(33) Zuccotto, F. Pharmacophore Features Distributions in Different Classes of Compounds 2003, 43, 1542−1552.
(34) Shelat, A. A.; Guy, R. K. Scaffold Composition and Biological Relevance of Screening Libraries. Nat. Chem. Biol. 2007, 3, 442−446.
(35) Baldi, P.; Nasr, R. When Is Chemical Similarity Significant? The Statistical Distribution of Chemical Similarity Scores and Its Extreme Values. J. Chem. Inf. Model. 2010, 50 (7), 1205−1222.
(36) Cytoscape. www.cytoscape.org.
(37) Dong, J.; Horvath, S. Understanding Network Concepts in Modules. BMC Syst. Biol. 2007, 1, 24.
(38) Gasasira, A. F.; Dorsey, G.; Nzarubara, B.; Staedke, S. G.; Nassali, A.; Rosenthal, P. J.; Kamya, M. R. Comparative Efficacy of Aminoquinoline-Antifolate Combinations for the Treatment of Uncomplicated Falciparum Malaria in Kampala, Uganda. Am. J. Trop. Med. Hyg. 2003, 68 (2), 127−132.
(39) Firn, R. D.; Jones, C. G. The Evolution of Secondary Metabolism - a Unifying Model. Mol. Microbiol. 2000, 37 (5), 989−994.
(40) Firn, R. D.; Jones, C. G. Natural Products - a Simple Model to Explain Chemical Diversity. Nat. Prod. Rep. 2003, 20 (4), 382−391.
(41) Roemer, T.; Davies, J.; Giaever, G.; Nislow, C. Bugs, Drugs and Chemical Genomics. Nat. Chem. Biol. 2011, 8, 46−56.
(42) Zhang, C.; Zhang, Z.; Zhu, Y.; Qin, S. Glucose-6-Phosphate Dehydrogenase: A Biomarker and Potential Therapeutic Target for Cancer. Anti-Cancer Agents Med. Chem. 2014, 14 (2), 280−289.
(43) Levy, M.; Solomon, S. Power Laws Are Logarithmic Boltzmann Laws. Int. J. Mod. Phys. C 1996, 7, 595−601.
(44) Barabási, A.; Albert, R. Emergence of Scaling in Random Networks. Science 1999, 286, 509−512.
(45) Hartonen, T.; Annila, A. Natural Networks as Thermodynamic Systems. Complexity 2012, 18, 53−62.
(46) Zernov, V.; Balakin, K.; Ivaschenko, A.; Savchuk, N.; Pletnev, I. Drug Discovery Using Support Vector Machines. The Case Studies of Drug-Likeness, Agrochemical-Likeness, and Enzyme Inhibition Predictions. J. Chem. Inf. Comput. Sci. 2003, 43, 2048−2056.
(47) Tropsha, A.; Reynolds, C.; Merz, K.; Ringe, D. QSAR in Drug Discovery. Design: Structure- and Ligand-Based Approaches 2010, 151−164.
(48) Ye, C.; Torsello, A.; Wilson, R. C.; Hancock, E. R. Thermodynamics of Time Evolving Networks. In Graph-Based Representations in Pattern Recognition; Liu, C.-L., Luo, B., Kropatsch, W. G., Cheng, J., Eds.; Springer International Publishing, 2015; pp 315−324.
(49) Reddy, A. S.; Zhang, S. Polypharmacology: Drug Discovery for the Future. Expert Rev. Clin. Pharmacol. 2013, 6, 41−47.
