
Chapter 7

Downloaded by GEORGIA INST OF TECHNOLOGY on November 9, 2016 | http://pubs.acs.org Publication Date (Web): October 5, 2016 | doi: 10.1021/bk-2016-1222.ch007

Non-Specificity of Drug-Target Interactions – Consequences for Drug Discovery

Gerald Maggiora* and Vijay Gokhale*

BIO5 Institute, The University of Arizona, Tucson, Arizona 85721, United States

*E-mail: [email protected]; [email protected]

Dealing with the complexity of the human biosystem in drug discovery is a daunting task. At best we have an imperfect picture of its underlying physiology and pharmacology, which raises the question of how to identify potential drug compounds from the vast sea of xenobiotics that populate chemical space. The dominant approach is still based on the single-target paradigm, which has a number of inherent problems not the least of which is the difficulty of altering disease phenotypes by intervening at single targets embedded within complex, interconnected biological pathways. While newer multi-target approaches address some of the problems, they are not entirely without problems of their own. Superimposed on all of these difficulties is a surprising lack of compound and target specificity that the growing amount of data clearly shows is a more pervasive problem than has generally been assumed. And although sophisticated phenotypic screening methods are being developed in an effort to deal with some of these issues, many remain refractory.

“The drug discovery process is more like a shotgun than a rifle.”

© 2016 American Chemical Society Bienstock et al.; Frontiers in Molecular Design and Chemical Information Science - Herman Skolnik Award Symposium 2015: ... ACS Symposium Series; American Chemical Society: Washington, DC, 2016.


Introduction

Drug research has made amazing progress since the beginning of the Twentieth Century, especially in the last decades of that century. Early efforts to discover drugs were mainly empirical and largely based on the search for existing substances (i.e., ‘natural products’) that could alter observable phenotypes such as blood pressure, body temperature, reduction of pain, and wound healing. In the mid-Twentieth Century, the notion that such phenotypic changes could also be brought about by chemically synthesized compounds designed to cause similar changes to the phenotype ushered in the era of what could be called “chemical pharmacology”. Since that time drug research has been transformed from an empirical science based largely on phenomenological observations of biological systems to one in which the structure and interactions of many biomolecules as well as drug molecules have been elucidated. And although the development and use of biologics has been on the rise over the last several decades, the present work is focused entirely on issues associated with small-molecule drugs, which remain the mainstay of current drug research.

As depicted in Figure 1, there are three main issues confronting drug discovery. The first, and perhaps most important one, is the complexity of biological (a.k.a. ‘living’) systems (1), biosystems for short, that can have an over-riding effect on numerous aspects of biological function, many of which may have some, albeit in many cases small, influence on drug efficacy. Moreover, because complex biological systems exhibit emergent properties and sometimes behave unpredictably, we are always on rather shaky ground when trying to discover new therapeutics for human health.

Figure 1. The three main issues confronting drug discovery.


The second, biological reductionism, has a long and distinguished history in Western scientific thought from the time of the early Greek philosophers to current-day scientists. Importantly, it has played a substantive role in advancing the notion of single-target-oriented drug discovery, which remains the dominant paradigm, although it is becoming less so with each passing year due to its seeming inability to generate significant numbers of new drugs (2). The third and final issue is due to the lack of specificity of drugs and other xenobiotics for specific ‘drug targets’, which are mainly proteins (3) but may in some cases be nucleic acids (e.g., ribosomal RNA (4), quadruplex DNA (5)). The designation of a given protein or nucleic acid as a drug target is dependent upon the particular pathophysiological condition being addressed. Since most diseases are polygenic, several drug targets should ideally be specified. This, however, is still relatively rare in drug discovery today, although it is changing slowly and multitarget discovery efforts are on the rise (6, 7). In addition to nominal drug targets, there are two other categories, namely, ‘off-targets’, which are known targets that are not related to the designated target or targets, and latent targets, which are unknown targets that nevertheless may interact with drugs and hence affect the biology and pharmacology in unspecified ways. As might be expected, additional complications can in some cases arise due to the presence of multi-tasking proteins whose functions depend on cellular location (8). This phenomenon introduces a whole new level of complexity to the targeting of drugs that is not generally addressed in many drug discovery efforts. In any case, the non-specificity of drug-target interactions gives rise to polypharmacology – the ability of a single drug or xenobiotic to interact with multiple targets and thereby induce multiple pharmacologies (9–13).
The presence of polypharmacology introduces considerable uncertainty in knowing which targets are perturbed by the introduction of a drug, by how much, and how these perturbations influence the overall biological processes associated with a drug’s action – the uncertainty this gives rise to is not fully appreciated within the drug discovery community. The existence of polypharmacologies is, however, not new and has been known for many years under the guise of adverse side effects, a.k.a. adverse drug reactions (ADRs), which are due to undesirable off-target interactions of drugs with both known and latent targets. Another aspect of polypharmacology is reflected by the growing trend to repurpose existing drugs or shelved compounds (14) for new therapeutic indications, an enterprise that may be due to what can be called ‘desirable’ or ‘beneficial’ side effects.

Although a number of important drug discovery concerns are discussed in this work, four closely related issues should also be kept in mind. First, determining a drug’s molecular mechanism of action (MMOA) is not a requirement for FDA submission or approval (15), as nearly 30 percent of approved drugs have unknown MMOAs. However, as pointed out in an editorial in Nature Medicine, lack of an MMOA may reduce the likelihood of drug approval (16). Hence, while molecular mechanistic approaches in drug research are quite common, they are not essential and, as will be seen in the sequel, can sometimes be misleading. Second, the in vitro activity of a drug for a given target is not prima facie evidence that it will also be active in vivo and vice-versa – this also applies to ‘omics’ data such as that describing protein-protein interactions. Third, the primary goal of drug discovery is finding therapeutics that improve human health, not understanding biology. And fourth, and perhaps most important, there are exceptions to virtually every rule in biology.

It may be well to consider that drug discovery is in many ways more of an engineering than a scientific process. Engineers routinely deal with incomplete knowledge and a lack of tools for implementing the relevant processes. Nonetheless, they develop ‘practical’ and in many cases novel solutions under these conditions, a situation that is in many ways reminiscent of drug discovery and development processes. An interesting example in this regard is the construction of massive cathedrals in Europe well before Isaac Newton formulated his laws of motion and before the development of the tower cranes routinely used on large buildings today.

Due to the extreme complexity of biosystems, drug discovery and development is plagued by a number of significant issues in many areas of biology and pharmacology even though modern high-throughput experimental methods have generated a tsunami of data. However, in order to take full advantage of the burgeoning amount of data, mathematical models of biological processes (17–19) are needed to enable the extraction of information that can enhance understanding of the relevant biology in ways that facilitate the discovery and development of new and more effective therapeutics. Although in their relative infancy, such models, which are under development in a number of laboratories, have already shown some promise in this regard. Further details on the nature and application of methods in this important field of research are provided in the following section.

Biosystems Are Open Complex Adaptive Systems

Biosystems are structurally and functionally complex adaptive systems (CAS) with many interacting components (20). This characteristic differs from that in many physical and chemical systems where a single or at most several interactions dominate system behavior. In such cases, the other, weaker interactions can generally be safely ignored, an assumption that is also explicitly or implicitly made in the single-target approach to drug discovery. In some cases it appears to work reasonably well, based on the number of successful drugs discovered using this approach, precisely because perturbations associated with other biological functions are tolerable (21); such behavior may be characterized as ‘weak polypharmacology’.

As indicated by the organismal hierarchy depicted in Figure 2, biosystems are organized in hierarchical fashion moving upwards from molecules, their smallest and simplest entities, to the entire organism, the largest and most complex entity. The downward arrow on the right-hand side of the figure indicates the reductionist tendency to reduce biosystems to their molecular components. By contrast, the upward arrow on the left-hand side of the diagram indicates the functional integration of the molecular components needed for an operative biosystem. As pointed out by Palsson (22), because of their hierarchical organization, biosystems can behave in a biologically but not physically non-causal manner. In such instances, changes that occur at a lower level of the hierarchy may not propagate upwards, and hence, may not influence the functionality of higher system levels. The book by Dougherty and Bittner (23) provides a very clear and comprehensive discussion of many aspects of cellular regulatory logic couched in terms of a ‘factory metaphor’. Although the degree of complexity of the regulatory logic of even the most complex factory does not approach that of biosystems, the factory metaphor does capture a number of their salient regulatory features.

Figure 2. The hierarchical structure of biosystems (a.k.a. ‘living systems’).

Biosystems are open systems that are partitioned by a variety of membranes into a large number of ‘functional compartments’ such as those defined by the elements of the organismal hierarchy. Because they are open systems, they can control the exchange of matter, energy, and information that flows between ‘compartments’ as well as into and out of their surroundings. This affords the means for maintaining homeostasis and provides the physical environment within which essential organismal processes take place (24). The progression of changes in system complexity is accompanied by corresponding changes in spatial scale. In addition to their multi-scale spatial characteristics, biosystems also exhibit multi-scale temporal behavior. This is particularly well illustrated by the temporal behavior of biosystem components, such as enzymes, whose activities can be regulated in a number of ways. A variety of allosteric effectors can regulate enzyme activity by binding directly to the enzyme. The time constants for such regulatory activities are on the order of milliseconds. By contrast, the synthesis of new enzyme requires protein synthesis, which is controlled by various gene regulatory mechanisms and has a time constant several orders of magnitude longer than, for example, the msec timescale associated with enzyme activities (vide supra) or that for degrading enzymes via the ubiquitylation pathway (25).

Biosystems rely on a set of catabolic networks to provide the energy necessary to drive a number of biological processes and a set of anabolic networks to synthesize the molecules needed to sustain life. A complex set of inter-related regulatory networks, exemplified by the set of intracellular networks schematically depicted in Figure 3 (26), provides overall control for the processes necessary to sustain life. The diagram schematically depicts three important biological functions – gene expression, signal transduction, and metabolism – that influence one another. Molecular entities are designated by the dark grey filled ellipses; regulatory events are designated by blue filled rectangles. The solid dark grey arrows indicate ‘mass flow’; dashed red arrows indicate ‘catalytic action’ of molecular entities on the regulatory events. In many cases where disease phenotypes subvert normal biological processes, drugs can interact with one or more protein targets in ways that can help restore the normal phenotype. Lastly, biosystems are self-organizing and self-reproducing, features that help to ensure the perpetuation and survival of a species. All of these behaviors complicate efforts to understand and model their functional characteristics.
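The bipartite representation described for Figure 3 (molecular entities, regulatory events, ‘mass flow’ arrows, and ‘catalytic action’ arrows) can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration only: the class name, the node names (transcription_factor, mRNA, and so on), and the particular edges do not come from Figure 3 or reference (26), and treating an mRNA template as acting ‘catalytically’ on translation is a simplification.

```python
from collections import defaultdict, deque


class RegulatoryBipartiteGraph:
    """Toy bipartite graph: entities and events are the two node classes."""

    def __init__(self):
        self.entities = set()
        self.events = set()
        # node -> list of (neighbor, edge_kind); edges are directed
        self.edges = defaultdict(list)

    def add_entity(self, name):
        self.entities.add(name)

    def add_event(self, name):
        self.events.add(name)

    def add_mass_flow(self, source, target):
        # Mass flows entity -> event (consumption) or event -> entity (production).
        ok = (source in self.entities and target in self.events) or (
            source in self.events and target in self.entities
        )
        if not ok:
            raise ValueError("mass flow must link an entity and an event")
        self.edges[source].append((target, "mass_flow"))

    def add_catalysis(self, entity, event):
        # Catalytic action always points from an entity to an event.
        if entity not in self.entities or event not in self.events:
            raise ValueError("catalysis must point from an entity to an event")
        self.edges[entity].append((event, "catalysis"))

    def downstream_entities(self, start):
        """Entities reachable from `start` along any directed edge (BFS)."""
        seen = {start}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbor, _kind in self.edges[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        return sorted((seen - {start}) & self.entities)


def build_toy_network():
    """Hypothetical three-function network: expression, translation, metabolism."""
    g = RegulatoryBipartiteGraph()
    for name in ("transcription_factor", "mRNA", "enzyme", "substrate", "product"):
        g.add_entity(name)
    for name in ("transcription", "translation", "reaction"):
        g.add_event(name)
    g.add_catalysis("transcription_factor", "transcription")
    g.add_mass_flow("transcription", "mRNA")
    g.add_catalysis("mRNA", "translation")  # template treated as a catalyst here
    g.add_mass_flow("translation", "enzyme")
    g.add_catalysis("enzyme", "reaction")
    g.add_mass_flow("substrate", "reaction")
    g.add_mass_flow("reaction", "product")
    return g


if __name__ == "__main__":
    network = build_toy_network()
    print(network.downstream_entities("transcription_factor"))
```

Even in this toy case, a perturbation at the transcription factor reaches the metabolic product through three different kinds of events, which is one way to see how the three functions can influence one another.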

Figure 3. High-level schematic diagram of an intra-cellular regulatory network depicted as a bipartite graph. The diagram is adapted with permission from Figure 8.2 in reference (26). Copyright 2008, John Wiley & Sons, New York.

Because biosystems are CASs with many components they are by their very nature difficult to compute (27). Based on some rather abstruse mathematical reasoning, the biomathematician Robert Rosen has argued that biosystems (‘living systems’) are, in fact, not even computable (28). Even if his proposition is challenged, simulating the processes that occur within a living cell, the simplest independently functioning biosystem, can present significant challenges to current computing capabilities due to their multi-scale spatial and temporal character (29). Two of the most difficult challenges are due to the small volume of cells, which restricts the number of molecules within them, and the crowded, heterogeneous nature of their interiors (30), which can influence such properties as diffusion and binding. These features preclude the use of more traditional differential equation methods that require a continuous, homogeneous medium within cells and have led to the development of other methods such as those based on three-dimensional stochastic simulations that appear to be having some success (31). However, all approaches that have been carried out to date, or that will likely be carried out in the foreseeable future, are phenomenological (32) and hence require the experimental or computational estimation of numerous system parameters, a task that can become more difficult as system detail increases. Ideker, Galitski, and Hood (33) have given an interesting early, rather prescient account of many of the issues associated with the simulation of biosystems.

Determining and properly accounting for all of the mechanistic features associated with cellular functions is a daunting task, especially given that latent factors of current or yet to be discovered mechanisms can have a significant impact on the nature of the models being developed and thus, on the results one is likely to obtain from such models. An excellent example of this comes from the discovery in the early 1990s of microRNAs (miRNAs). At that time they were considered to be curiosities but within a decade were seen to play an important role in modulating protein expression through their interactions with mRNAs (34).
Hence, prior to the recognition of their important role in gene regulation, any model of gene regulation would have been incomplete. Even now that their role in regulating gene expression has been elucidated there still may be other factors yet to be discovered that will also play critical roles. In a sense, miRNAs were acting in a similar fashion to latent components in biosystems or latent variables in statistics – they were unknown but nonetheless exerted effects that were subsumed into then-existing explanatory variables.

Computations on more complex entities in the organismal hierarchy depicted in Figure 2 are orders of magnitude more difficult than computations on individual cells since they involve cell-cell and higher-order interactions plus the transport of biological molecules and xenobiotics among the various supra-cellular entities (e.g., tissues or organs). Recently, an interesting software platform called Biocellion has been developed that is designed to handle multicellular biosystems and their interactions with the environment (35). Although the results are impressive, they are far from providing a significant amount of molecular detail since the procedures that can be implemented are phenomenological and typically deal with objects that lack internal molecular structure or molecular mechanistic pathways. Incorporating such information significantly increases the level of difficulty since it requires accounting for the behavior of their many interacting pathways, which is a major computational task in itself even if one assumes the processes take place in uncrowded, homogeneous environments, an assumption that, as noted above, differs significantly from actual biosystem environments. This rather pessimistic picture follows since treating cells and more complex entities not only requires accounting for the behavior of their many interacting pathways but also their internal structural features as well as the external environment in which they reside. All of these issues effectively foreclose the possibility that a computer simulation of a sufficiently complete mechanistic description of the molecular mechanism of action of any drug will be available in the near future.

However, this does not mean that modeling biosystems is an impossible task, but rather one that requires a keen understanding of the salient biological features that must be incorporated into any computational model. In this regard, physiologists have been creating a variety of multi-scale phenomenological models at the sub-cellular, cellular, inter-cellular, and supra-cellular levels in order to provide insights into biological processes taking place in the human physiome (36, 37). While many models aimed at treating sub-cellular, cellular, and inter-cellular systems explicitly include ‘molecular entities’, these entities are not typically represented in structural detail but rather as structureless points in much the same fashion that molecules in classical chemical kinetics or reaction-diffusion systems are treated. At higher levels of biological organization that include tissues, organs, and the human organism, continuum models are used to reduce overall model complexity and to reduce computational requirements. A number of efforts are being made to develop methods for comprehensively simulating human physiology (see, for example, (38)), which, if realized at least at some level, could ultimately provide the means for in silico testing of candidate drugs, a goal that is certainly worth achieving.
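The point made in this section, that small molecule numbers inside a cell undermine continuous differential-equation descriptions, can be illustrated with a minimal sketch. The code below is not the three-dimensional stochastic approach cited in (31); it is a well-mixed Gillespie-type simulation of a hypothetical birth-death process (constant synthesis, first-order degradation of a single protein species treated as a structureless count), and the function names and rate constants are assumptions chosen only for illustration.

```python
import random


def gillespie_birth_death(k_syn, k_deg, n0, t_end, seed=0):
    """Stochastic simulation of synthesis (rate k_syn) and degradation (rate k_deg * n).

    Returns the event times and the integer copy number after each event.
    """
    rng = random.Random(seed)
    t, n = 0.0, n0
    times, counts = [t], [n]
    while True:
        a_syn = k_syn          # propensity of producing one molecule
        a_deg = k_deg * n      # propensity of degrading one molecule
        a_total = a_syn + a_deg
        if a_total == 0.0:
            break              # nothing can happen any more
        t += rng.expovariate(a_total)  # exponentially distributed waiting time
        if t > t_end:
            break
        # Choose which event fires, in proportion to its propensity.
        if rng.random() * a_total < a_syn:
            n += 1
        else:
            n -= 1
        times.append(t)
        counts.append(n)
    return times, counts


def deterministic_steady_state(k_syn, k_deg):
    """Steady state of the continuum model dn/dt = k_syn - k_deg * n."""
    return k_syn / k_deg


if __name__ == "__main__":
    # Hypothetical rates giving a mean of about ten copies per cell.
    times, counts = gillespie_birth_death(k_syn=1.0, k_deg=0.1, n0=0, t_end=200.0)
    print("continuum steady state:", deterministic_steady_state(1.0, 0.1))
    print("stochastic copy numbers visited:", min(counts), "to", max(counts))
```

The continuum model predicts a single smooth value, whereas the stochastic trajectory moves in integer steps around it; with only a handful of molecules, those fluctuations are comparable in size to the mean itself.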

Biological Reductionism

Biological reductionism is based on the premise that biological systems can be comprehended by understanding the structure and function of their smallest components. This has certainly been a factor in the widespread adoption of the single-target model that has played a dominant role in drug discovery research for over two decades. The reductionist philosophy derives from more than a century of scientific advances in physics and chemistry, which have been further reinforced by relatively recent progress in molecular and structural biology. All of these developments have given rise to renewed discussions as to whether the laws of physics and chemistry are sufficient in themselves to describe biological systems. Although these laws are most assuredly obeyed by biological systems, they are, nonetheless, insufficient to explain the functional characteristics by which these systems operate (39). This is a manifestation of the reductionist paradox (40), namely, that as the amount of molecular detail of the components of a biosystem increases the amount of information associated with their functionality correspondingly decreases, as depicted in Figure 2. It is also a manifestation of the fact that biosystems are CASs, and hence they possess emergent properties that cannot generally be inferred from the nature of their simpler components (41).


Table 1. Selected functional (bold face) and relational biological (‘interaction’) networks.

Catabolic and anabolic metabolic networks
Intra- and inter-cellular signaling networks
Neuronal networks
Immunological networks
Gene regulatory networks
Cell cycle networks
Protein-protein interaction networks
Transcript-transcript association networks
DNA-protein (transcription factors) networks
Chemical space networks
Drug-target networks
Drug-induced gene expression networks
Drug-drug networks
Target-target networks
Interactome networks
Human disease networks
Phylogenetic networks

Key to the re-introduction of biosystem functionality is the development of the field of systems biology, which arose around the turn of the century in an effort to counter the reductionist trend and reintroduce functional information back into biological research (42, 43). While great strides have been made in identifying many of the mechanisms by which biosystems operate since its introduction, much remains to be done. The following section describes the role that network science (44) is playing in the development of systems biology by providing a powerful visual metaphor for systematically describing relationships among biological entities that are the stepping-stones to systems biology. The classic book by Miller (45) provides a comprehensive description of the systems aspects of all living systems from the simplest cellular to the most complex societal and supra-national systems. It is recommended reading for those interested in developing a fundamental understanding of all manner of ‘living systems’.

Systems Biology and Biological Networks

Although still in its infancy, systems biology is providing a framework for characterizing biological pathways and the functional networks within which they reside. A large number of relational networks depicting a wide variety of ‘omics’ data as well as other types of data relevant to studies in chemical biology and drug research have also been constructed. A listing of some of these networks is provided in Table 1.

Functional networks are associated with the dynamical properties of biosystems, an excellent example being metabolic networks that represent the time-dependent biochemical transformations that sustain living systems. These networks have been extensively studied in metabolic engineering, but applications there have largely been confined to bacterial or simple cellular systems (46, 47). Computational studies have also been carried out on gene regulatory networks (GRNs), but because of their complexity and the lack of data these have been mostly restricted to small subsystems; dynamical modeling of a full GRN has yet to be carried out even for simple biosystems (48). By contrast, relational networks generally capture time-independent binary relationships among biological entities, protein-protein and drug-target interaction (49) networks being excellent examples (50). It is important to recognize that the omics data in relational networks, while very useful, are not sufficient for the construction of functional networks because the latter are associated with non-linear dynamical, time-dependent processes with complex feedback and feed-forward controls that regulate their behavior – metabolic networks being a prime example (51). Nevertheless, relational networks have provided a means for analyzing many important biological relationships. There are also numerous examples of relational networks useful in drug discovery; some of the earliest include the work of Paolini et al. (9) on the global mapping of pharmacological space, that of Yildirim et al. (52) on the construction of networks that link drugs to targets and targets to diseases, and that of Keiser et al.
(53) on the construction of protein pharmacology networks that depict relationships among diverse target classes based on the similarity of their associated ligand sets. All of these works are early examples of networks associated with polypharmacology. Shortly thereafter Hopkins proposed a new paradigm for drug discovery based on what he calls network pharmacology (54, 55). His work provided an early, very clear account of the possible ways in which non-specific drug-target interactions can influence drug discovery.

Figure 4 provides a high-level schematic diagram of the relationship of drugs and other xenobiotics to a variety of cellular processes involving nucleic acids and proteins. Genes and related molecular entities are designated by the green filled ellipses, proteins and amino acids by the dark-red filled ellipses, and drugs and their metabolites by the blue filled ellipses. Directed arrows that point from the entity in question towards the processes they are associated with are labeled with boldface Arabic numerals, for example, drug-protein interactions (1), gene-mediated gene expression (6), miRNA modulation of translation (7), protein-protein interactions (11), signal transduction (12), and protein degradation via the ubiquitin-proteasome pathway (13). In a sense the figure provides a ‘macroscopic view’ since specific details are not given for the entities contained within each of the categories depicted in the figure. For example, the category ‘DRUGS’ subsumes all relevant small-molecule drugs and xenobiotic compounds. The category ‘GENES’ includes genes that code for proteins as well as, for example, non-coding genes that give rise to miRNAs. Similarly, the category ‘PROTEINS’ includes ‘druggable’ targets as well as other proteins involved in a wide range of biological processes.

Figure 4. High-level schematic diagram of the relationship of drugs and other xenobiotics to a variety of cellular processes depicted as a directed graph.

Although the diagram lacks many of the details underlying the processes it depicts it nevertheless provides an overarching framework for understanding the manifold issues associated with the interrelated biological processes that must be dealt with in drug research and highlights their inherent complexity. As will be seen in the sequel, it is particularly beneficial in clarifying these issues in the example of target-based drug discovery described in a forthcoming section (56).
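The relational drug-target networks discussed in this section can be sketched with a few lines of Python. The example below builds a bipartite drug-target relation from a list of binary interaction pairs, counts targets per drug as a crude indicator of polypharmacology, and links targets through the overlap of their ligand sets. The drug and target labels are placeholders rather than real data, and the Jaccard overlap of shared ligands is a deliberately simplified stand-in, assumed here for illustration, for the ligand-set similarity measures used in the cited studies.

```python
from itertools import combinations

# Hypothetical binary drug-target interaction pairs (placeholder labels only).
TOY_PAIRS = [
    ("drug_A", "T1"),
    ("drug_A", "T2"),
    ("drug_B", "T2"),
    ("drug_B", "T3"),
    ("drug_C", "T3"),
]


def targets_per_drug(pairs):
    """Map each drug to the set of targets it interacts with."""
    mapping = {}
    for drug, target in pairs:
        mapping.setdefault(drug, set()).add(target)
    return mapping


def ligands_per_target(pairs):
    """Map each target to the set of drugs (ligands) that interact with it."""
    mapping = {}
    for drug, target in pairs:
        mapping.setdefault(target, set()).add(drug)
    return mapping


def target_target_network(pairs):
    """Link two targets when they share a ligand; weight = Jaccard overlap."""
    ligands = ligands_per_target(pairs)
    edges = {}
    for t1, t2 in combinations(sorted(ligands), 2):
        shared = ligands[t1] & ligands[t2]
        if shared:
            edges[(t1, t2)] = len(shared) / len(ligands[t1] | ligands[t2])
    return edges


if __name__ == "__main__":
    for drug, targets in sorted(targets_per_drug(TOY_PAIRS).items()):
        print(drug, "interacts with", len(targets), "target(s)")
    for (t1, t2), weight in target_target_network(TOY_PAIRS).items():
        print(t1, "--", t2, "ligand-set overlap:", round(weight, 2))
```

Note that such a network is purely relational: it records which entities are linked, not how strongly, in what direction, or over what time course, which is why it cannot by itself serve as a functional network.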

Drug-Target Specificity

The concept of specificity of drug-target interactions (57) was born well over a century ago with the work of Emil Fischer on the specificity of enzyme action. Fischer postulated that the observed specificity was due to a complementary fit between the enzyme and its substrate, an explanation that gave rise to the so-called ‘lock and key model’ (58). While this simple model may not have entirely stood the test of time, it has nonetheless significantly influenced our thinking about the interaction between small molecules and their macromolecular targets. This view was modified in the late 1950s, when Daniel Koshland proposed the ‘induced fit model’ (59), which supposes that the initial enzyme-substrate interaction is relatively weak, but subsequent interactions induce conformational changes in the enzyme that strengthen the interactions and help position key catalytic groups on the enzyme for effective catalysis. Hence, the concept of a complementary fit of an enzyme with its substrate remained, albeit in somewhat altered form.

Slightly after Fischer’s work, at the turn of the Twentieth Century, Langley, Clark, and Ehrlich independently proposed the related notion of a receptor that mediates the interaction of small molecules with living systems in ways that stimulate biological responses (60–62). While their proposal lacked the detail of the lock and key model, it nonetheless bore some resemblance to it since both explicitly or implicitly include the notion that specific entities are involved through which molecules can interact with living systems. As was the case for the mechanisms of enzyme catalyzed reactions, the receptor concept has evolved significantly, and today detailed models of receptors exist that describe the complicated structural and functional properties of biological receptors and the multitude of ways in which they can induce biological effects (63).

In the 1960s numerous developments refined and extended the models of ligand-macromolecule binding (64, 65). Groundbreaking work by Monod, Wyman, and Changeux in France (66) and by Koshland, Nemethy, and Filmer in the United States (67) described models that significantly expanded the concept of ligand-macromolecule interactions. These models, called allosteric (“other site”) models, were based on the notion that sites distinct from active or receptor sites could bind ligands in a manner that would modulate the activity of enzymes or receptors. Significantly, this work provided a molecular basis for the control of many biological processes such as signal transduction (68, 69) and metabolic regulation (70) and afforded a molecular framework for systems biology (22, 71–73).
Such interactions increase the complexity of biosystems and enhance the manner in which individual processes, and hence whole pathways, can interact with one another in ways that contribute to the overall function of biosystems.

The notion that drug-target interactions have a reasonably high degree of specificity is, however, not the theme of this work. On the contrary, it is focused on describing and exploring the consequences of what appears to be a surprising lack of specificity in these interactions and its implications for drug discovery. That this should be the case is not entirely unexpected given that adverse side-effects have been with us from the earliest days of drug research, foreshadowing the possibility that drugs were interacting with biosystems in multiple ways. However, even after the concept was well established that drugs and other xenobiotics induce biological effects by interacting with specific ‘molecular targets’, it was not generally applied to ‘off-target’ interactions. This situation has changed considerably in the last decade. Significant efforts are now being made to develop computational models for predicting likely side effects based on an array of different approaches that utilize information on drug structure and side effect profiles (74–87). SIDER, a publicly available computer-readable side-effect resource (88, 89), currently contains 1,430 drugs and 140,064 drug-side-effect pairs, 39.1 percent of which also have associated frequency of occurrence information.

Another aspect of the non-specificity of drug-target interactions is associated with the practice of drug repurposing (repositioning), that is, identifying new targets for existing drugs or shelved compounds (14). In contrast to ADRs, this type of non-specificity in drug-target interactions can be thought of as arising from ‘beneficial side-effects’. Although physicians are permitted to prescribe, and have for many years prescribed, drugs for indications that are ‘off-label’, repurposing has only recently emerged as a practical strategy for drug discovery (90). That such is the case is not surprising given that any drug slated for repurposing has already been clinically evaluated for toxicity, one of the major stumbling blocks to drug approval. Even in the case of shelved compounds, many have been evaluated in Phase 1 and Phase 2 clinical studies. Hence, some of the most difficult hurdles to drug approval have already been overcome. The PROMISCUOUS database, which was developed as a resource to aid in drug repositioning, provides an excellent source of information for more than 25,000 drug compounds including withdrawn or experimental drugs, annotated with data on side effects and on drug-target and protein-protein interactions (91).

Polypharmacology and Drug-Target Interaction Networks

As noted in the Introduction, compounds that interact with more than one drug target are said to exhibit polypharmacology (9–13). The terms “promiscuity”, “compound promiscuity”, and “promiscuous compounds” have been commonly used as synonyms for polypharmacology, and that practice will be continued here. Although compounds that do not interact with specific molecular features of drug targets are not considered further in this work, it is well to remember that “all compounds exhibiting polypharmacology are promiscuous, but not all promiscuous compounds exhibit polypharmacology” (92, 93).

It has been argued that in some cases polypharmacologically active compounds may be advantageous for particular types of drug therapy where multi-target activities can lead to more effective drugs with potentially fewer side effects than is possible with multi-target, multi-drug therapies (9–13, 54, 55, 94, 95), because the latter would likely suffer additional problems associated with drug delivery and ADRs due to the multiplicity of compounds. However, it should always be remembered that “polypharmacology is a two-edged sword”, since a compound hits both targets and off-targets even if it is specifically designed to hit only the former. Moreover, for structural or other reasons it is unlikely that a compound will interact with all of the required targets in ways that in all instances will produce the desired level of response. The fact that polypharmacologically active compounds may contain or do contain multiple pharmacophores increases their likelihood of hitting more off-targets, both desirable and undesirable. Efforts have been made to overcome this possible liability by designing ligands that will be active against multiple, specific targets, but the task is a difficult one (13, 54, 55, 96–100).
It is interesting to speculate whether combining pharmacophores associated with known activities in some way might provide a feasible strategy for the design of such polypharmacologically active compounds. Alternatively, pharmacophores associated with ADRs might also be removed (101). In this regard, PharmMapperDB, a database of pharmacophores derived from ligand-protein crystal structures, and PharmMapper, a web server that performs reverse pharmacophore analysis by aligning pharmacophores associated with known drug targets to natural products and existing or newly synthesized compounds, may be of assistance (102). Natural products may provide another potential alternative to existing synthetically derived or designed small molecules as a means for achieving polypharmacologically active compounds, since Nature has had millions of years to achieve such polyfunctional compounds (103–108).

Bipartite networks (a.k.a. two-mode networks) provide a visually intuitive way of representing polypharmacological information (vide infra). As noted earlier, such networks are called drug-target networks (52, 109–112) and are made up of two sets of nodes, one corresponding to the set of drugs (113) and the other to the set of targets. An edge connecting a drug node with a given target node indicates that the drug is active against that target (114). There are no edges connecting drug nodes with each other, and similarly none connecting target nodes with each other. The first published drug-target network appears to be that of Yildirim, et al. (52). A variety of other relationships linking drugs, targets, genes, phenotypes, and diseases have also been investigated (52, 115–120). One-mode drug-drug and target-target networks can be obtained from ‘projections’ of the corresponding two-mode, bipartite networks (44). In an analogous fashion, two-mode drug-disease and phenotype-disease networks can be obtained from ‘projections’ of the corresponding three-mode tripartite networks. However, in both instances there is a consequent loss of information (110). Figure 5a provides a simple example of a bipartite drug-target network. Drugs are given by the shaded circles and targets by the shaded squares.
An edge (solid line) is drawn between a given pair of drug-target nodes such as (D1,T1) if the activity of the drug towards that target is equal to or exceeds a chosen activity threshold value, say 10 μM. If the activity falls below the activity threshold, no edge is drawn between the pair of drug-target nodes, as illustrated by (D5,T3). Dashed lines, although not generally considered in bipartite graphs, correspond to cases where the activity of a drug-target pair such as (D4,T2) has not been measured. Figure 5b depicts the binary drug-target matrix ΩDT that faithfully represents the information in the bipartite network given in Figure 5a, where a ‘1’ corresponds to a solid edge, a ‘0’ to the lack of an edge, and a ‘?’ to a dashed edge.

Although bipartite networks such as the drug-target network depicted in Figure 5 currently predominate, tripartite networks are becoming more popular as the amount of drug, target, genotype, phenotype, and disease information increases. As noted above, one-mode or unimodal relational matrices that describe drug-drug and target-target interactions can also be constructed in a number of ways by properly combining the ΩDT matrix with itself. Other one-mode drug-drug and target-target networks based on alternative combining criteria have been constructed by a number of different investigators, whose works should be consulted for details (53, 110). In this regard, the work of Paolini, et al. (9) is particularly interesting since they describe the construction of a polypharmacology interaction network that links targets based on the polypharmacological behavior of their ligands.
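The thresholding rule and the one-mode projection described above can be sketched in a few lines of Python. This is a minimal illustration rather than the procedure used to build Figure 5: the measurement values, the helper names `build_matrix` and `project_drugs`, and the convention that activities are IC50-like concentrations in μM (so that a value at or below the 10 μM threshold counts as active) are all assumptions made for the example. Unmeasured pairs are stored as `None`, the analogue of a ‘?’ entry, and the projection counts only confirmed shared targets, which corresponds to the off-diagonal elements of the product of ΩDT with its transpose when each ‘?’ is read as 0.

```python
# Assumed convention: values are IC50-like concentrations in micromolar,
# so a LOWER value means a MORE active compound.
THRESHOLD_UM = 10.0


def build_matrix(drugs, targets, measurements, threshold=THRESHOLD_UM):
    """Return a dict-of-dicts matrix with entries 1, 0, or None ('?')."""
    matrix = {}
    for d in drugs:
        row = {}
        for t in targets:
            value = measurements.get((d, t))  # absent key -> not measured
            if value is None:
                row[t] = None
            else:
                row[t] = 1 if value <= threshold else 0
        matrix[d] = row
    return matrix


def project_drugs(matrix, targets):
    """One-mode drug-drug projection: number of confirmed shared targets."""
    drugs = list(matrix)
    shared = {}
    for i, a in enumerate(drugs):
        for b in drugs[i + 1:]:
            n = sum(1 for t in targets
                    if matrix[a][t] == 1 and matrix[b][t] == 1)
            if n:
                shared[(a, b)] = n
    return shared


# Hypothetical data; (D2,T2) and (D3,T1) were never measured.
drugs = ["D1", "D2", "D3"]
targets = ["T1", "T2", "T3"]
measurements = {
    ("D1", "T1"): 0.5, ("D1", "T2"): 3.0, ("D1", "T3"): 50.0,
    ("D2", "T1"): 8.0, ("D2", "T3"): 120.0,
    ("D3", "T2"): 9.9, ("D3", "T3"): 2.0,
}

matrix = build_matrix(drugs, targets, measurements)
shared = project_drugs(matrix, targets)
print(matrix["D2"])  # {'T1': 1, 'T2': None, 'T3': 0}
print(shared)        # {('D1', 'D2'): 1, ('D1', 'D3'): 1}
```

In this toy case D2 and D3 share no confirmed target, so the projected network has no D2-D3 edge. Whether such an edge would appear if the unmeasured pairs were assayed cannot be read from the projection alone.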

Figure 5. (a) Drug-target interactions depicted as a bipartite graph. (b) The drug-target matrix, ΩDT, corresponding to the bipartite graph in (a).

The degree of polypharmacology, π(Di), is obtained by counting the number of edges emanating from the i-th drug node Di. For example, π(D1) = 3. This can also be obtained by summing the binary values in the first row of the ΩDT matrix. The situation is more complicated for D2 due to the ambiguity brought on by the dashed edge associated with the drug-target pair (D2,T2). Since the activity of D2 has not been measured against T2, not drawing an edge between them would presume that D2 was measured and found to be inactive, when in fact it may be active. This could undercount the true value of π(D2). Since its true value is unknown, the best that can be said from the available data is that the degree of polypharmacology is bounded, i.e. 1 ≤ π(D2) ≤ 2. The upper and lower bounds can be obtained from the ΩDT matrix as follows. Replace the question mark by a zero, which assumes the drug is inactive with respect to that target, and sum the elements in the second row, which yields a value for the lower bound of π≤ (D2) = 1. Now replace the question mark by unity, which assumes the drug is active. Again summing the elements in the second row yields a value for the upper bound of π≥ (D2) = 2. The procedure can then be repeated for each of the drugs, Di, i = 1,2,…,8.
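The bounding procedure just described amounts to counting the confirmed ‘1’ entries for the lower bound and adding the number of ‘?’ entries for the upper bound. The short Python sketch below shows this for two rows. The rows are hypothetical and were chosen only to reproduce the counts quoted in the text; they are not a transcription of Figure 5b, and the function name `degree_bounds` is an illustrative choice.

```python
def degree_bounds(row):
    """Return (lower, upper) bounds on the number of active entries.

    Entries are 1 (active), 0 (inactive) or None (not measured, i.e. '?').
    Lower bound: every '?' read as 0. Upper bound: every '?' read as 1.
    """
    confirmed = sum(1 for v in row if v == 1)
    unknown = sum(1 for v in row if v is None)
    return confirmed, confirmed + unknown


# Hypothetical rows of a drug-target matrix (not taken from Figure 5b).
row_d1 = [1, 1, 0, 1]
row_d2 = [0, None, 0, 1]

print(degree_bounds(row_d1))  # (3, 3)
print(degree_bounds(row_d2))  # (1, 2)
```

When a row contains no unmeasured entries the two bounds coincide, as for the first row here.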

This issue is related to the problem of what Mestres et al. (121) call data completeness, which arises because the activities of all of the compounds against all of the targets under consideration have not been measured. As they showed, this can have significant consequences for the characteristics of a drug-target network. Not surprisingly, as these authors point out, there is significant bias in the targets considered, since they are heavily weighted towards targets of therapeutic interest.

The ΩDT matrix in Figure 5b reveals another interesting property of drug-target relationships. Rather than viewing the data from the drug (i.e. ‘row’) perspective, it can equivalently be viewed from the target (i.e. ‘column’) perspective. Because of this mathematical duality, data can be collected from experiments that utilize procedures that are drug-based, target-based, or combinations of the two to fill in the elements of an ΩDT matrix. Summing the values in a given column, say T4, yields what could be called the degree of polyspecificity, σ(T4) = 6. Moreover, a similar approach to that presented above for missing row data can also be applied to missing column data. Hence, for example, the lower and upper bounds associated with T3 are given by σ≤ (T3) = 3 and σ≥ (T3) = 5, respectively.
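The row/column duality can be made concrete by transposing the matrix and applying the same counting rule to columns. The Python sketch below does this for a hypothetical 8 × 4 matrix. The entries are invented so that the T4 and T3 columns give the values quoted in the text; the actual contents and dimensions of Figure 5b are not reproduced here, and the names `bounds`, `omega`, and `sigma` are illustrative.

```python
def bounds(entries):
    """(lower, upper) count of actives; None marks an unmeasured '?' entry."""
    confirmed = sum(1 for v in entries if v == 1)
    unknown = sum(1 for v in entries if v is None)
    return confirmed, confirmed + unknown


targets = ["T1", "T2", "T3", "T4"]

# Hypothetical drug-target matrix: one row per drug D1..D8.
omega = [
    [1, 1,    0,    1],  # D1
    [0, None, 0,    1],  # D2
    [0, 0,    1,    1],  # D3
    [0, None, 1,    1],  # D4
    [1, 0,    0,    0],  # D5
    [0, 1,    1,    1],  # D6
    [0, 0,    None, 1],  # D7
    [1, 0,    None, 0],  # D8
]

# zip(*omega) transposes the matrix, turning rows into columns.
columns = dict(zip(targets, zip(*omega)))
sigma = {t: bounds(col) for t, col in columns.items()}

print(sigma["T4"])  # (6, 6)
print(sigma["T3"])  # (3, 5)
```

Because the same `bounds` function serves both views, filling in a single ‘?’ entry tightens one row bound and one column bound at the same time.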

The Unexpected Prevalence of ‘Similarity Cliffs’

It is not surprising that a multiplicity of drugs can interact with a single target, as demonstrated by the ‘column view’ of the ΩDT matrix described above, since it is the basis of the well-established principle of structure-activity relationships (SARs) in drug research. However, there is more to the story. When drug-target interactions cover a wide range of targets with a large set of dissimilar compounds the situation becomes much more interesting.

Consider now the activity landscape generated with respect to a single target (122). There are a number of topographic features associated with such landscapes including ‘activity cliffs’, where structurally similar compounds exhibit large differences in activity, ‘SAR regions’, where small structural changes are associated with correspondingly small changes in activity, ‘similarity cliffs’, where pairs of structurally dissimilar compounds have comparable activities, and lastly, ‘non-descript regions’ that constitute large areas of a landscape without distinguishing features (123). As is well known, activity cliffs are relatively rare, but similarity cliffs are a surprisingly prevalent feature of activity landscapes (123). This is a more general phenomenon than scaffold hopping (124), since in that case pairs of structurally dissimilar compounds may have identical scaffolds or compounds with different scaffolds may, nonetheless, be structurally quite similar as attested to by the existence of numerous bioisosteres (125). While the term ‘similarity cliff’ is relatively new, the underlying idea has a long and distinguished history in the lead optimization phase of drug discovery under the rubric of multiple lead series, which are valuable assets in this phase of drug discovery. In cases where a given series fails due to metabolic, drug delivery, or toxicity issues, another series of active, structurally dissimilar compounds would be available that might not have the liabilities of the previous series.
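One simple way to make the four landscape features operational is to place each compound pair in a quadrant defined by its structural similarity and its activity difference. The Python sketch below does this under stated assumptions: the cut-off values are arbitrary placeholders, the activity difference is taken to be in logarithmic units (e.g. a difference in pIC50), and the assignment of dissimilar pairs with large activity differences to the ‘non-descript’ category is a simplification made for the example rather than a definition taken from the cited work.

```python
def classify_pair(similarity, activity_gap, sim_cut=0.55, gap_cut=2.0):
    """Assign a compound pair to one of four landscape features.

    similarity   : structural similarity in [0, 1] (e.g. a Tanimoto value)
    activity_gap : absolute activity difference, assumed in log units
    sim_cut, gap_cut : hypothetical cut-offs, chosen only for illustration
    """
    similar = similarity >= sim_cut
    large_gap = activity_gap >= gap_cut
    if similar and large_gap:
        return "activity cliff"
    if similar:
        return "SAR region"
    if not large_gap:
        return "similarity cliff"
    return "non-descript region"


# Hypothetical (similarity, activity gap) values for four compound pairs.
pairs = {
    "pair_1": (0.85, 3.1),
    "pair_2": (0.80, 0.3),
    "pair_3": (0.20, 0.4),
    "pair_4": (0.15, 2.8),
}

labels = {name: classify_pair(s, g) for name, (s, g) in pairs.items()}
for name, label in labels.items():
    print(name, label)
```

A stricter treatment would also require both members of a similarity-cliff pair to be active, since two dissimilar inactive compounds also have comparable activities in a trivial sense.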

Pharmacophores provide the final example that supports the prevalence of similarity cliffs. The structure of a pharmacophore is based on the three-dimensional arrangement, generally of three or four ‘pharmacophoric features’ such as hydrogen bond donors and acceptors, charged groups, hydrophobic residues, and aromatic rings, associated with what could be called an ‘active configuration’ of these features. The set of molecules that possess the active configuration is generally made up of a number of structurally dissimilar subsets of compounds with approximately comparable activities. Hence, a pair of compounds that form a similarity cliff can be obtained by selecting an active compound from each of two dissimilar subsets of active compounds all with comparable pharmacophores. In some ways this is just another, albeit more general, aspect of scaffold hopping.

Lastly, the relative commonness of similarity cliffs suggests that active compounds are more likely to be grouped into small clusters scattered throughout chemical space than within one or two large clusters. This suggests a rationale for the observation by Willett and his colleagues (126–128) that group fusion, which aggregates similarity values from multiple active query compounds, in many cases appears to outperform other similarity-based search procedures, especially given that the more diverse the set of actives the better the results tend to be (129). It also suggests why similarity-search methods that rely on single active query compounds tend to do poorly by comparison. Single-query-based similarity searches generate lists of compounds ordered by their similarity values or ranks with respect to the active query. Such a list would in principle contain all of the remaining active compounds in the data set, but they would be scattered throughout the list due to their varying similarities with respect to the active query compound.
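The contrast between single-query searching and group fusion can be illustrated with a toy Python example. The sketch assumes that compounds are described by sets of integer feature identifiers, that similarity is measured with the Tanimoto coefficient, and that the fused score is the maximum similarity to any query (one commonly used fusion rule). The fingerprints and the compound names are invented for the example.

```python
def tanimoto(a, b):
    """Tanimoto coefficient between two feature sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def fused_scores(queries, database, fuse=max):
    """Score each database compound by fusing its similarities to all queries."""
    return {name: fuse(tanimoto(fp, q) for q in queries)
            for name, fp in database.items()}


# Two hypothetical active queries drawn from different structural clusters.
query_1 = {1, 2, 3, 4}
query_2 = {10, 11, 12, 13}

# Hypothetical database: cpd_a resembles query_1, cpd_b resembles query_2.
database = {
    "cpd_a": {1, 2, 3, 5},
    "cpd_b": {10, 11, 12, 14},
    "cpd_c": {20, 21, 22},
}

single = fused_scores([query_1], database)           # single-query search
fused = fused_scores([query_1, query_2], database)   # group fusion (MAX rule)

print(single)
print(fused)
```

With `query_1` alone, `cpd_b` receives a score of zero and would sit at the bottom of the ranked list; adding the structurally dissimilar `query_2` lifts it to the same score as `cpd_a`.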
Since only the top scoring compounds are generally selected for further screening or other chemical informatic procedures, many dissimilar but active compounds are not investigated further.

The apparent pervasiveness of polypharmacology has attracted the attention of numerous investigators because it seems to run counter to the specificity of natural biomolecular interactions and to the longstanding notion of the presumed specificity in drug-target interactions (130). Xenobiotics, a small subset of which are drugs, by definition are substances foreign to humans and to other higher animals. This includes many natural products produced by plants and simple organisms such as bacteria and fungi, and compounds produced by chemical synthesis. While it is the latter that are generally thought to be responsible for the observed polypharmacology, a recent paper by Gu, et al. (107) suggests that natural products may also exhibit significant polypharmacology, although conclusive evidence is not yet available.

These examples raise the question as to the molecular basis for such widespread promiscuous drug-target behavior. An interesting view of this issue is described in the work of Jalencas and Mestres (131), who argue that the physicochemical properties and the fragment composition of drugs, coupled with the protein families and distant binding site similarities of a drug’s principal targets, support the speculation that polypharmacology is a manifestation of protein evolution induced by adaptive mechanisms employed by primitive biosystems to enhance their survival in highly variable chemical environments.

Recent work in Shoichet’s laboratory based on a review of more than 100 X-ray crystal structures of ligand-protein complexes suggests that identical ligands can bind to multiple, structurally dissimilar proteins from different activity classes (132). Moreover, the binding sites occupied by a given ligand in the different proteins were not necessarily similar, nor were the ligand atoms that participated in the binding. These observations may not be entirely surprising given the apparently widespread nature of polypharmacology. Whatever the mechanism(s), polypharmacology nevertheless has a significant impact, both positive and negative, on drug discovery, especially as it provides numerous examples that run counter to some of the key aspects of the single-target paradigm.

Drug-Target Interactions Databases

Growing interest in all aspects of polypharmacology has given rise to considerable analysis of the prevalence and degree of polypharmacology of compounds in biologically relevant chemical space. Because of their relational features, networks provide a highly visual way to represent drug-target relationships, although their detailed analysis is almost always carried out algebraically based on relational matrices such as that in the model drug-target network described in the previous section and depicted in Figure 5b. A significant problem faced by all investigators who are trying to evaluate the ‘true’ degree of polypharmacology of any drug or xenobiotic is the multiplicity of data sources and the bias and sparseness of the data contained within many of them, an issue that has already been addressed to some extent by Mestres et al. (121). The most extensive work in this area is that of Hu and Bajorath, who have investigated numerous aspects associated with the impact of data quality on estimations of the degree of polypharmacology for many different activity classes (133–137).

A number of databases (DBs) have been assembled in an effort to organize the burgeoning amount of information on drug-target interactions, where ‘interactions’ is taken broadly and includes biological responses such as enzyme catalysis as well as binding. It is important to note, however, that binding alone, even strong and highly specific binding, is not prima facie evidence that such an interaction will ultimately produce some type of biological response, which is the sine qua non of drug-target interaction. Impetus for constructing such massive DBs initially grew out of advances in high-throughput screening and combinatorial chemistry, and slightly thereafter out of corresponding advances in several areas of high-throughput biology, all of which have given rise to the Era of Big Data.
Because there is so much data in different formats and of varying quality obtained from experiments in many laboratories using different experimental protocols, combining it all into a single unified DB is a non-trivial though highly desirable task. These factors raise a number of important issues, two of the most important being the quality and internal consistency of the data in the DBs. The former must be addressed to ensure that the well-known computer maxim “Garbage in, garbage out” does not apply, while the latter ensures that closely related DB queries yield comparable results. Although these issues are currently being addressed in many DBs, they remain a concern to some degree in all DBs.

Table 2 provides a listing of some of the most important drug-target databases (138), two of the most comprehensive being STITCH 4.0 (139) and Drug2Gene (140). Both DBs have combined data available from drug-target DBs such as DrugBank, ChEMBL, BindingDB, and PubChem BioAssay, from text mining a wide variety of literature sources, from available pathway data, and from computational estimations. The data in each DB are represented in ‘relation-centric’ formats that consist of three entities: drugs (compounds), their targets (genes), and the relationships that link them to one another. This simple format also affords the possibility of representing the data in the form of networks.

Table 2. Summary of available drug-target interaction databases.

Database            Description
STITCH 4.0          Integration of protein-chemical interactions with user data
Drug2Gene           An exhaustive resource to explore effectively the drug-target relation network
PROMISCUOUS         A database for network-based drug-repositioning
DrugBank            A knowledge base for drugs, drug actions and drug targets
ChEMBL              A large-scale bioactivity database for drug discovery
BindingDB           A web-accessible database of experimentally determined protein-ligand binding affinities
PubChem BioAssay    2014 update

All data are cleaned to remove redundancies and expressed in terminology that is consistent within each DB. Importantly, the reliability of the activity and relational data is also assessed. Both DBs contain data on multiple species including significant amounts of human data, both present the data in graphical and tabular formats along with assessments of data quality facilitating analysis, both allow importing and exporting of data, and both are available online (141, 142). STITCH 4.0 contains relational information on the interaction of more than 300,000 compounds with ca. 2.6 million proteins from more than 1,000 organisms extracted from source DBs such as ChEMBL and BindingDB. Drug2Gene is currently the most comprehensive compound-target DB, containing data from more than 19 publicly available DBs with over 4.4 million compound-target relationships, most of which include bioactivity data. Although it has similar capabilities to STITCH 4.0, it also contains standardized activity data from multiple, diverse sources, an important contribution in that it allows for a more comprehensive comparison of compound-target relational data than has heretofore been available.

The vast amount of data from both of the DBs strongly suggests that the specificity of drug-target interactions is much less than has heretofore been assumed. By contrast, the extensive work of Hu and Bajorath on compound promiscuity suggests that the degree of promiscuity of most compounds is significantly overestimated (133–137). One reason for their assessment may be the more stringent requirement they place on compound activity. Even after removing all of the low quality compound-target interaction data from STITCH 4.0 and Drug2Gene, the amount of high quality data that remains is still significant, although this raises the question as to how well the remaining data can be trusted. In any case, while a preponderance of the data seems to suggest that promiscuity may be widespread, the extent to which this is true cannot be fully assessed at this time. Nevertheless, although it is not definitive, the existence of numerous examples of ADRs and repurposed drugs strongly suggests that polypharmacology is widespread and, because of the duality described in the previous section, also suggests that polyspecificity may be widespread.

An equally serious issue is that of data consistency both within and between databases, which arises when querying the same information in different ways yields inconsistent answers. Importantly, this issue remains even if it is assumed that all of the data within a given database are accurate and of high quality, an assumption that is not materially correct. Due to the size and complexity of drug-target databases, identifying such inconsistencies is not possible without some form of computer assistance. In any case, efforts should be made to evaluate the internal consistency of the current databases, and it would also be informative to do this between databases as well. An interesting example involves querying both the STITCH 4.0 and Drug2Gene databases for the proteins that interact with imatinib (see the sections on ‘Imatinib – A Prototypical Example of Target-Based Drug Discovery’ and ‘The Functional Complexity of CML’ for further details).
Querying the former database yields 52 proteins while querying the latter yields only 41 proteins, an obvious difference. Interestingly, only about 25 proteins were common to both database queries. While somewhat daunting, this result nevertheless points to the need for additional effort to ensure that databases are at least somewhat consistent. Otherwise, conclusions based on their data can only be considered very tentative at best. Moreover, queries that result in a significant amount of spurious data can also lead researchers down incorrect or bogus avenues of research, a problem with very serious consequences.
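The size of the disagreement between the two query results can be summarized with simple set arithmetic. The Python sketch below uses placeholder protein identifiers arranged so that the set sizes match the numbers quoted above, with exactly 25 shared entries standing in for "about 25". The identifiers, the variable names, and the use of the Jaccard index as a consistency measure are assumptions made for illustration, not results returned by either database.

```python
def overlap_summary(a, b):
    """Summarize the agreement between two sets of query results."""
    common = a & b
    union = a | b
    return {
        "common": len(common),
        "only_a": len(a - b),
        "only_b": len(b - a),
        "jaccard": len(common) / len(union) if union else 1.0,
    }


# Placeholder identifiers: 52 and 41 proteins with 25 in common.
stitch_hits = {f"P{i}" for i in range(52)}         # P0 .. P51
drug2gene_hits = {f"P{i}" for i in range(27, 68)}  # P27 .. P67

summary = overlap_summary(stitch_hits, drug2gene_hits)
print(summary["common"], summary["only_a"], summary["only_b"])  # 25 27 16
print(round(summary["jaccard"], 3))                             # 0.368
```

With these counts the two result lists together name 68 distinct proteins, of which only 25, a little over a third, are reported by both sources.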

Drug-Induced Effects on Gene Expression and Related Processes

Although the direct interaction of drugs with a variety of largely protein targets and off-targets (Figure 4: 1) plays a central role in drug action, this is only part of the story. Introduction of drugs or xenobiotics can also alter biosystem functions through a variety of post-transcriptional and post-translational mechanisms that regulate the type and amount of proteins available at any given time. Drug-induced changes in gene expression (Figure 4: 2, 4, and 5) can play a significant role in determining the therapeutic effectiveness and ADRs of drugs since the up-regulation and down-regulation of target genes are related to the synthesis of their corresponding proteins, which play substantive roles in regulating the machinery of human biology (143). Drug-induced gene expression patterns, also called gene profiles or signatures, have been used to connect small molecules (mostly drugs), genes, and diseases, as is exemplified by the seminal paper describing the development of the ‘Connectivity Map’ or CMap by scientists at the Broad Institute (144, 145). In these cases the MMOA of the small molecule perturbants is generally unknown. In addition to drug-induced changes in gene expression, gene-mediated gene expression (Figure 4: 6) can also influence levels of mRNAs (146), although there are relatively few examples of this gene regulatory mechanism.

Gene expression levels do not normally correlate with expression levels of the corresponding proteins due to a variety of post-transcriptional controls on gene expression (147). One of the most important is related to the function of miRNAs (vide supra), short segments of non-coding, single-stranded RNA about 18-22 nucleotides long that are transcribed from non-coding genes and bind to mRNAs and prevent their translation or mark them for degradation (148) (Figure 4: 7 and 9). In either case the relevant genes are “silenced” because translation of an mRNA into its corresponding protein is prevented (Figure 4: 8), which partially explains the observation that levels of expressed genes do not always accord with those of their corresponding expressed proteins (149). MicroRNAs also function in a similar fashion to other gene silencing RNAs such as the small, inhibitory RNAi’s (150). Figure 4: 4 and 5 depict the drug-induced transcription of genes that code for proteins and non-coding miRNAs, respectively, although the effect of drugs on miRNA expression is much less well characterized than is the case for genes that code for proteins (151, 152).
The discovery of miRNAs provides a cautionary tale of how new, seminal discoveries and concepts can languish, slowing their implementation and hence their impact in biological research applications. For nearly a decade after their discovery in 1993 (153, 154) miRNAs were considered to be curiosities. That changed in 2002 with the discovery that they were linked to the disease chronic lymphocytic leukemia (CLL) (155). Since that time a variety of papers and reviews have described the potential role of miRNAs in a number of diseases (156–158) and as biomarkers (159, 160). The miRNA story is interesting in that a relatively new ‘biological entity’ that was essentially unknown little more than two decades ago is now seen to play a major role in the regulation of protein expression and in major diseases such as cancer. This raises an important, albeit rhetorical, question of whether all of the molecular mechanisms that underlie the key functions of biosystems have been discovered; as the case of miRNAs clearly suggests, they most likely have not. Drug-induced changes in gene expression levels are, however, not the only changes that can affect the amounts of different proteins and hence biosystem functions. While gene silencing by miRNAs or other small interfering RNAs can prevent the synthesis of new proteins, it cannot affect the levels of stable proteins with long half-lives that have already been synthesized (161, 162). This can only be accomplished through ubiquitylation, a complex post-translational process whereby proteins “marked” for degradation are conjugated with ubiquitin and subsequently degraded by cellular proteasomes (163) (Figure 4: 13).

All of the biological processes described above are associated with what could be called secondary targets because they represent targets that are involved in regulating the proteins and nucleic acids upon which most target-based drug discovery projects are based. Modulating the activity of these secondary targets is generally much more difficult than modulating the activity of direct or primary targets because of the complexity of the structures and processes involved. Nevertheless, some work has been done on a number of secondary targets. For example, target-based studies of miRNAs have been carried out, but it is difficult to evaluate the promise of this area of research since it is rather new and relatively few studies have been undertaken at this time (164, 165). There are two basic strategies: the design of miRNA mimics, and the design of compounds that bind directly to miRNAs and hence block or inhibit miRNA binding to mRNAs (Figure 4: 2′). Such therapies must, however, typically overcome poor bioavailability. By contrast, efforts are being made in a number of laboratories to design specific small molecules that directly affect transcription factors by hindering the protein-protein or protein-DNA interactions that are necessary for their function (166, 167). Not surprisingly, this is a difficult task as these targets are considered by many to be ‘undruggable’. Nevertheless, some headway is being made, albeit slowly. Efforts to influence levels of existing proteins via the ubiquitin-proteasome system (Figure 4: 13) are also being carried out by a number of investigators. As is the case for transcription factors, developing methods for targeting the ubiquitylation process is chemically challenging, but progress is being made in exploiting the drug discovery possibilities afforded by the ubiquitin-proteasome system that controls protein degradation (161–163). These are just a few of the studies aimed at secondary targets as a strategy for discovering new drugs.
Although the above discussion reveals some of the complexity of biosystems, this is, so to speak, only the “tip of the iceberg”. In this regard the ‘factory metaphor’ described in Epistemology of the Cell (23) provides an excellent overview of the many complexities of the regulatory logic of cells. Figure 3, which was discussed earlier, provides a graphical depiction of a simplified model of an intracellular regulatory network that illustrates some of the complexities that arise due to interactions between the gene expression, signal transduction, and metabolic networks. Even though much of the molecular detail is removed, what remains clearly shows the convoluted nature and numerous feedback and feed-forward controls exhibited by such models. As shown in Table 1, these networks are only part, albeit an important part, of the set of complex, interacting regulatory systems that play a crucial role in maintaining the stability of biosystems to perturbations produced by the introduction of drugs or other xenobiotics. In addition, as was the case for miRNAs, other yet-to-be-discovered system components may have a significant impact on the behavior of the human biosystem. At present the effects of these latent components are unknown, but they may nonetheless be affecting system behavior. Superimposed over biosystem complexity is the polypharmacology of all administered drugs and xenobiotics, which gives rise to significant uncertainties in system behavior that are difficult to account for.

These examples demonstrate the highly complex nature of drug action, much of which may be hidden but can nonetheless have a significant impact on drug efficacy. This raises significant questions with regard to the single-target and even the multi-target drug discovery paradigms that continue to play an important role in drug discovery research today (vide infra).

Target-Based Drug Discovery

It is commonly presumed that target-oriented procedures constitute a mechanistic, molecule-based description of the relevant biology associated with the targets. Such approaches are typically denoted as ‘hypothesis driven’. However, as can be seen from the preceding discussion, this is a significant oversimplification of the actual situation even in the case when information is provided on the pathway within which the target resides (Cf. Figures 3 and 4). This myopic view of biological mechanisms may in part be responsible for the high rate of failure, primarily in Phase 2 clinical studies, when potential drugs first confront the realities of interacting with the human biosystem (168, 169). It may also be at least partly responsible for an increase in phenotypic screening methods, which are discussed in detail in a forthcoming section (170). As has been pointed out numerous times, the success of the target-based approach requires well-validated targets with well-understood biology and pharmacology (171–175). Hence, the notion that novel targets will provide a potential source of new medicines for previously untreated or poorly treatable diseases is generally ill founded (176). This follows because novel targets are, by definition, targets whose function(s) within the human biosystem are not completely understood. Hence, compounds developed in this way typically lead to failures in the clinical phase. Even in cases of well-validated, non-novel targets, complete mechanistic details of their functions are generally lacking, leading to a type of myopia, abetted by reductionist tendencies, that plagues the target-based approach to this day.
Despite these deficiencies, target-based procedures remain a mainstay of current drug research, as evidenced by the large number of journal articles and reviews currently based on this approach. This is quite surprising given that there has been a sharp decline in new drugs developed in this way over the last two decades (177–182). A perceived advantage of the target-based approach is that it is amenable to highly automated high-throughput screening (HTS) procedures that can be applied to large compound collections. This mechanized, brute-force approach appealed to the pharmaceutical industry since it could afford the high upfront costs of the sophisticated equipment and large compound collections needed to carry out HTS campaigns effectively. It also appealed to pharma management because the HTS-based research process provided a simpler and seemingly more predictable path to new drugs, free of the inherent uncertainties and complexities of actual biosystems. And it afforded a means for essentially ‘industrializing’ the drug discovery process: (1) identify and validate a drug target → (2) screen a large compound collection against that target to identify an initial ‘hit-set’ → (3) carry out ligand-based virtual screening (LBVS) to enrich the hit-set → (4) repeat (3) as necessary → (5) carry out biologically relevant phenotypic assays to identify one or more leads from the enriched hit-set → (6) employ medicinal chemistry to optimize the most desirable lead or leads. However, what this approach gained in efficiency it lost in its ability to deal with the unexpected occurrences that inevitably arise. Since target-based screens currently are mostly directed towards single drug targets, the hit-set obtained tends to cover more limited regions of chemical space than is typically the case for phenotypic screens. Since putatively ‘inactive’ regions of chemical space will almost certainly not be considered further, other compounds in these regions that may have some ‘drug potential’ are never evaluated. Moreover, the very nature of the single-target approach also precludes any assessment of a compound’s polypharmacology, at least until it is tested against multiple targets (vide infra). In many cases, however, when multiple targets are screened they tend to belong to the same class, such as kinases. Hence, other regions of ‘target space’ that may lead to more effective therapy or may exhibit fewer possible ADRs are not evaluated. Structure-activity relationships (SARs), which are an integral part of the drug optimization process, tend to be based almost exclusively on the designated drug target and generally are restricted to local regions of chemical space. While the mathematical models developed in such cases, called local SAR models, tend to be reasonably predictive, they are restricted to small regions of chemical space. Hence they are usually not general enough to deal with multiple regions of chemical space that contain potentially active molecules. Global SAR models, on the other hand, cover much broader regions but are more difficult to develop and are generally less predictive.
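The limited reach of a local SAR model can be illustrated with a deliberately artificial sketch. Everything in it is an assumption made for illustration: a single hypothetical descriptor `x`, a made-up nonlinear `true_activity` function standing in for the real activity landscape, and a straight-line fit standing in for the local model. None of it is taken from the chapter or from real SAR data.

```python
# Toy illustration only: the descriptor and the activity function are
# hypothetical and do not come from any real SAR data set.

def true_activity(x):
    """Hypothetical nonlinear activity landscape over one descriptor."""
    return x * x


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def max_abs_error(model, xs):
    """Largest absolute prediction error of the linear model on xs."""
    intercept, slope = model
    return max(abs(intercept + slope * x - true_activity(x)) for x in xs)


# Training compounds all lie in one narrow ("local") descriptor range.
local_xs = [1.0 + 0.01 * i for i in range(11)]
local_model = fit_line(local_xs, [true_activity(x) for x in local_xs])

# Compounds from a different region of the same descriptor space.
distant_xs = [3.0 + 0.01 * i for i in range(11)]

local_error = max_abs_error(local_model, local_xs)
distant_error = max_abs_error(local_model, distant_xs)
print(f"max error inside the training region: {local_error:.4f}")
print(f"max error in a distant region:        {distant_error:.4f}")
```

In this toy setting the fitted line reproduces activities closely within its training range but is badly wrong for the distant compounds, which is the behavior the text attributes to local SAR models.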
More sophisticated methods have been developed that can address some of the deficiencies of local models, but they require additional data, which is not always easy to come by. As discussed earlier, multi-target-based procedures have been developed in an effort to counter some of the perceived deficiencies of single-target-based methods (6, 7). These methods are designed to address the polygenic character of most diseases based on either the polypharmacology of single compounds or the use of multiple compounds. The multi-target strategy also addresses issues raised by the metabolic engineering studies discussed in more detail in the following section, which show that inhibiting or otherwise modulating the activity of a single drug target is unlikely to significantly influence the behavior of biosystem pathways or their associated disease phenotypes. Experience teaches us that the administration of any drug will give rise to numerous side effects, some of which are adverse and some of which may be beneficial. The latter are associated with the recent emphasis on the repurposing of existing drugs for new therapeutic indications, a clear manifestation of the polypharmacological behavior exhibited by many drugs. In numerous cases the single-target, single-disease paradigm, which is born of an implicit or explicit faith in the reductionist paradigm, is too limited to provide a suitable model for the discovery of new drugs. Perhaps more to the point, its adoption may also be due to the fact that it considerably simplifies the implementation and analysis of the drug discovery process.

Importantly, any compound(s) discovered by target-based procedures must also be evaluated in one or more in vivo and/or ex vivo phenotypic screens to ensure that the activity observed in target-based screens is ‘biologically relevant’, which in a number of instances turns out not to be the case. This is rather curious given that target identification should to some extent ensure the validity of the target. Once the target is identified, any compound that inhibits or in any way appropriately modulates the target should be biologically relevant. But it seems that in many cases the main function of target-based assays is to reduce the vast number of compounds residing in most corporate compound collections to a manageable size that is more amenable to phenotypic screening. Such an approach, however, significantly limits the chemical space assayed by such screens (vide supra), substantially lowering the probability of finding active compounds in follow-on phenotypic screens.

Imatinib – A Prototypical Example of Target-Based Drug Discovery

The development of imatinib (Gleevec™, a.k.a. Glivec™) for the treatment of chronic myeloid leukemia (CML) exemplifies the target-based approach and has led to a virtual flood of kinase inhibitors developed within the target-based paradigm for a variety of different targets, many of which are related to cancer (183). A number of reviews of the molecular biology of CML have appeared since the beginning of this century (184–187). Although all of the causative influences that bring about CML have not been completely elucidated, a key factor appears to be the presence of the BCR-ABL fusion gene, which is the product of a translocation between chromosomes 9 and 22 and is found on Philadelphia chromosome 22 (Ph+) (188–190). Its gene product is the constitutively active Bcr-Abl tyrosine kinase that plays a crucial role in the transformation of normal cells to the malignant phenotype, which is characterized by altered adhesion to stromal cells and the extra-cellular matrix (191), by constitutively active mitogenic signaling (192), and by inhibition of apoptosis (193). As has been noted by Ren (185), Bcr-Abl kinase activity is a necessary but not sufficient cause of the transformation of haematopoietic stem cells and the induction of a CML-like myeloproliferative disorder in mice. Hence, other ‘non-kinase’ activities associated with Bcr-Abl kinase must also play roles in leukaemogenesis, including those induced by the interaction of proteins involved in signal transduction with Bcr-Abl kinase (194–196) and those involved in other processes such as Bcr-Abl-mediated induction of transcription factor NF-κB (197, 198). It is not entirely clear whether additional factors may also be necessary to bring about CML in humans. A mathematical model based on epidemiological data predicts that three additional stem cell mutations are required (199).
Moreover, based on a mean latency of 4-11 years, data from ionizing radiation studies suggest that multiple mutations are needed to induce CML in humans (200). The need for further mutations is also supported by data showing that about 30% of healthy humans have detectable amounts of the BCR-ABL fusion gene in their blood cells, but only a small percentage go on to develop CML. Moreover, five percent of patients presenting with CML symptoms do not possess the mutant BCR-ABL oncogene (201). Is this due to misdiagnoses, or is it an important, albeit subtle, clue to other biological factors that can bring about a CML-like cellular transformation? In any case Bcr-Abl kinase is the putative target for drug intervention, and it led to the ‘rational’, target-based development of imatinib, the first in the growing class of tyrosine kinase inhibitors (TKIs) (202, 203), and to a number of second- and third-generation follow-on drugs (204, 205). As has been shown in x-ray crystallographic studies, imatinib binds to the ‘open form’ of the Bcr-Abl kinase (206) and once bound restricts autophosphorylation of Bcr-Abl, an important step in the initiation of cellular transformation (207). Binding of imatinib to the Bcr-Abl kinase also affects a number of other Bcr-Abl related signaling processes (vide supra) (194–198). Once the putative target was identified and its inhibition was shown to reduce the population of CML cells, it appeared that a major step had been made in the treatment of CML, which indeed was the case. However, as is true in so many instances in biology, what appeared to be a rather straightforward solution turned out to be much more complex than initially assumed, and many unanswered questions remain. A key issue is the persistence of subpopulations of CML cells that are resistant to apoptosis. Recent evidence suggests that CML stem cells are refractory to treatment by imatinib and other TKIs that effectively inhibit Bcr-Abl kinase. Such cells, termed kinase independent, do not undergo apoptosis in the presence of TKIs, suggesting that they may be responsible for residual disease in treated patients (208).
Evidence also indicates that Bcr-Abl kinase independent stem cells behave differently from more mature, ‘oncogene addicted’ cells that were transformed by the so-called ‘driver’ BCR-ABL oncogene (209). Hence, efforts at curing patients who respond to TKIs but also exhibit minimal residual disease should be directed towards Bcr-Abl kinase independent survival pathways that remain active in these cells or are activated by kinase inhibition (209). Developing such therapies is important since CML is not cured until the residual disease is successfully eliminated. All of these issues provide examples of emergent phenomena as discussed in the Introduction and the section on Biological reductionism as well as in the Appendix. Resistance to apoptosis of kinase independent CML cells, which remain susceptible to Bcr-Abl kinase inhibition by imatinib, differs from the chemoresistance mechanisms that render imatinib ineffective in inhibiting Bcr-Abl kinase (209). Patients experiencing chemoresistance to imatinib therapy are grouped into two categories designated as primary and secondary. Patients in the first category are refractory to imatinib treatment, while those in the second category experience chemoresistance after initially successful imatinib therapy. A number of mechanisms leading to imatinib resistance have been described including gene duplication (210), gene mutations (211–213), and the induction of efflux transporters (214). Although other mechanisms of chemoresistance have also been described (211), much of the focus has been on the development of second- and third-generation drugs that overcome the loss of activity due to mutations that lower the binding affinity of imatinib for Bcr-Abl kinase. Dasatinib, a second-generation inhibitor, binds to both active and inactive conformations of Bcr-Abl kinase and is active against all imatinib-resistant mutations except those that affect the ‘gatekeeper’ residue (215, 216). Nilotinib, another second-generation inhibitor that was developed by rationally designed modification of imatinib, is more potent than imatinib and shows better selectivity for Abl kinase over other kinases (217).

Functional Complexity of Chronic Myeloid Leukemia

As described earlier (218), Figure 4 provides a high-level ‘macroscopic’ view of some of the major cellular processes that take place among biosystem components, which may also play a role in CML. It is clear from the figure that even in such an over-simplified scheme the relationships among the entities responsible for CML and its treatment are quite complex. For example, in addition to Bcr-Abl kinase a number of proteins are likely to play roles in the etiology of CML; exactly what all of their roles may be remains the subject of on-going research. In any case, the target-based picture afforded by the inhibition of Bcr-Abl kinase greatly oversimplifies many issues associated with CML and its treatment. As indicated in Figure 4: 2 and 10, drugs and proteins can interact with genes and induce the transcription of coding and non-coding genes (Figure 4: 4 and 5), giving rise to mRNAs and miRNAs, respectively. The latter can control the translation of specific mRNAs (Figure 4: 7) or can mark them for degradation (Figure 4: 13) (219). As was shown by Hershkovitz, Rokah, et al. (220) using miRNA microarrays and real-time PCR, three miRNAs, miR-31, miR-155, and miR-564, are down-regulated in CML patients, implying an increase in translation (Figure 4: 8) of the corresponding mRNAs with which they interact (Figure 4: 7). Moreover, their down-regulation also appears to be dependent upon Bcr-Abl activity. Figure 4: 1 indicates that drugs can interact with a variety of proteins, as exemplified by the graph in Figure 6 (221), which depicts the top ten proteins that interact with imatinib, a small subset of the 52 proteins obtained by querying the STITCH 4.0 drug-target database (139, 222). This is a perfect example of polypharmacology that goes well beyond the familiar interaction of imatinib with the Bcr-Abl kinase. Whether any of these interactions has an important effect on biological activity is currently unknown.
However, since PDGFRA, PDGFRB, KIT, STAT5, and LYN are involved in a variety of cellular processes including proliferation, differentiation, growth, and signal transduction (vide supra), all of which may be relevant to the onset and persistence of CML, it seems quite reasonable to determine whether these interactions influence protein functions in ways that may positively or negatively affect their role(s) in the pathogenesis of CML. Regardless of whether this is the case, or the degree to which it applies, the effect of polypharmacology is nevertheless expressed in the significant number of ADRs associated with imatinib therapy, a sample of which is provided in Table 3.

Figure 6. Graphical depiction of the ten proteins (‘targets’) that interact most strongly with imatinib obtained from the STITCH 4.0 website (http://stitch.embl.de. Accessed October 20, 2015).

The existence of polypharmacology alone does not, however, afford a detailed picture of imatinib’s MMOA. On the contrary, it tends to cloud the issue since polypharmacology can be associated with numerous biological effects, and pinpointing specific effects associated with the action of any drug exhibiting significant polypharmacology can be daunting. Even though inhibiting the Bcr-Abl kinase appears to be a crucial factor in imatinib-based CML therapy, clearly an unknown number of off-target and latent biological effects may also play substantive roles in CML remission. At this point in time, it is clear that the single-target view at best provides an incomplete picture of imatinib’s MMOA. Other Bcr-Abl kinase inhibitors such as dasatinib and nilotinib show different degrees of selectivity compared to imatinib. These inhibitors may show similar binding profiles but differ in their inhibitory or binding constants. In some cases, new kinase inhibitors that differ from the prototypical inhibitor imatinib may also interact with new protein targets.

Table 3. Imatinib side effects (ADRs)a

Acid or sour stomach                       Lack or loss of strength
Belching                                   Loose stools
Constipation                               Loss of interest or pleasure
Difficulty moving                          Muscle stiffness
Discouragement                             Night sweats
Excess air or gas in stomach/intestines    Passing gas
Fear or nervousness                        Sleeplessness
Feeling sad or empty                       Stomach discomfort, upset, or pain
Feeling unusually cold                     Swollen joints
Feeling full or bloated                    Difficulty concentrating
Increased bowel movements                  Inability to fall asleep
Irritability                               Weight loss

a This table contains a listing of some of the possible side effects due to administration of imatinib. There is some redundancy in the table due to the way in which ADRs are reported.

Lastly, as discussed earlier in this section, proteins can interact with other proteins (Figure 4: 11) and induce a variety of cellular signaling processes (Figure 4: 12). In the present case, the protein-protein interactions primarily involve the interaction of a number of signaling proteins with Bcr-Abl kinase. Even at the high level of abstraction depicted in Figure 4 the interrelated processes associated with CML are clearly visible. Moreover, the polypharmacology associated with drug-protein interactions depicted in Figure 6 (223) adds a new level of complexity and uncertainty to the functionality of those parts of the human biosystem that are involved in the pathophysiology of CML. Finally, it is clear from the discussion in this section that the complex biosystem within which a given target such as Bcr-Abl kinase is embedded has a significant influence on the manifold processes associated with the target. For example, the effect on signal transduction of inhibiting Bcr-Abl kinase is associated with the inability of a number of specific proteins involved in cellular signaling to bind to the auto-phosphorylated form of the kinase. Another factor that can also have profound effects on the function of human biosystems is polypharmacology due to the surprising lack of specificity of drugs and xenobiotics for a range of biological targets. All of this points to the possible benefit of a phenotype-based rather than a target-based approach to drug discovery, issues that will be considered further in the following sections.

Lessons from Metabolic Engineering

The previous section raises a number of questions regarding the nature of target-based procedures that involve a single target. It seems that even in a case as apparently clear-cut as the inhibition of the mutant, constitutively active Bcr-Abl kinase by imatinib some unresolved issues nevertheless remain. Studies of metabolic pathways represent the most comprehensive and rigorous examination of biological pathways that have been carried out to date, though they represent only one part of the set of complex, interacting pathways that provide the functional framework for the operation of biosystems as depicted in Figure 3 (Cf. Table 1). The field of metabolic engineering has played a major role in this regard. One of its practical goals is optimization of the genetic and regulatory processes taking place within cells to increase their production of chemical substances. To accomplish this, metabolic engineers have carried out extensive computational studies of cellular metabolism. In this regard metabolic control analysis (MCA) (46, 47) has provided a powerful tool that is a mainstay of numerous metabolic engineering applications. Although most of these studies are confined to bacteria and other simple cellular systems such as yeast, they have nonetheless informed a number of issues that are relevant to drug discovery (21, 224–227). Control coefficients play a crucial role in MCA since they provide a measure of the control exerted by each enzyme on substrate fluxes or on any other systemic properties such as cell proliferation, apoptosis, carbon dioxide emission, ATP utilization, or the transport of specific substances across membranes. To ensure that the control coefficients are unitless they are defined as the fractional change in the systemic property divided by the fractional change in enzyme (“target”) activity.
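In symbols, the verbal definition above can be written as follows. The notation is introduced here only for illustration and is not taken from the chapter: Y denotes the systemic property, E_i the activity of enzyme i, and C the corresponding control coefficient. The logarithmic form is the usual small-perturbation limit of the ratio of fractional changes.

```latex
C^{Y}_{E_i} \;=\; \frac{\Delta Y / Y}{\Delta E_i / E_i}
\;\approx\; \frac{\partial \ln Y}{\partial \ln E_i}
```

Because both numerator and denominator are fractional changes, the units of Y and E_i cancel, which is why the coefficient is unitless.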
A powerful feature of control coefficients is that they can be estimated experimentally even for ill-defined systems – detailed mathematical models are not absolutely required (228). As an example, consider the relationship between increased glucose utilization and cell proliferation, a common phenomenon in cancer cells. The control coefficient can be estimated in cultures of tumor cells by measuring, as a function of increasing inhibitor concentration, the ratio of the fractional decrease in cell proliferation to the corresponding fractional decrease in activity of a given enzyme in the glycolytic pathway. Since the glycolytic pathway is coupled to several other pathways, determining whether a particular glycolytic enzyme is an appropriate target for inhibiting glycolysis is not straightforward. However, it can be accomplished by measuring the control coefficient of the enzyme with respect to cell proliferation. A small value indicates that enzyme inhibition is largely decoupled from cell proliferation, in which case the enzyme would not be an appropriate target. By contrast, as the control coefficient approaches unity, a relatively rare occurrence, the coupling between enzyme inhibition and a reduction in cell proliferation increases. Under these conditions the enzyme would be an excellent target for drug therapy, with the caveats that ADRs are minimal and within acceptable ranges and that ADMET properties are also within acceptable ranges. As it turns out, however, inhibiting a single enzyme in the glycolytic pathway is insufficient, and multiple enzymes must be inhibited for effective therapy.

From this example it is clear that all of the complex interactions contributing to the decrease in cell proliferation as a function of enzyme inhibitor concentration need not be considered explicitly. In one sense this is a benefit of the approach. A limitation, however, is that the role of other factors, such as those associated with polypharmacology, cannot be assessed directly, which hinders attempts at developing an explanation of the complete mechanism of the inhibitory process, a problem that persists in most cases.

As noted by Bailey (224), the goal of metabolic engineering, to identify genes that confer specific cellular phenotypes, is entirely consistent with that of drug discovery, although in the latter case the goal is generally, though not exclusively, associated with disease phenotypes. A growing amount of data from metabolic engineering and a variety of other sources indicates that manipulating single genes or their gene products does not generally alter phenotype or, if it does, not in simple, obvious ways (224). This has two important consequences: first, that most diseases are polygenic, and second, that inhibiting or otherwise modulating a single gene product, i.e. a target, will generally be ineffective, although there are exceptions in both cases. The latter is supported by numerous metabolic engineering studies, which indicate that complex interacting metabolic networks cannot be regulated by perturbing a single network component. In addition, the effect of genetically or pharmacologically perturbing the activity of a given protein is not usually confined to that protein, but inevitably affects other proteins as well. This situation is exacerbated in the pharmacological case by the polypharmacological behavior exhibited by many drugs.
All of these factors strongly suggest that drug discovery based on a single target paradigm is problematic at best and is marked by a significant degree of ‘mechanistic uncertainty’.
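The experimental estimate of a control coefficient described earlier in this section can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the procedure used in the cited studies: the function name `control_coefficient`, the least-squares fit on a log-log scale, and the titration values are all hypothetical. The slope of ln(systemic property) against ln(enzyme activity) is used as a finite-range approximation to the ratio of fractional changes.

```python
import math


def control_coefficient(enzyme_activity, system_property):
    """Estimate a control coefficient as the least-squares slope of
    ln(system property) versus ln(enzyme activity).

    Both inputs are positive measurements taken at the same series of
    inhibitor concentrations (hypothetical data layout).
    """
    if len(enzyme_activity) != len(system_property) or len(enzyme_activity) < 2:
        raise ValueError("need two equal-length series with at least two points")
    x = [math.log(e) for e in enzyme_activity]
    y = [math.log(p) for p in system_property]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("enzyme activity must vary across the series")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


# Made-up titration: residual enzyme activity as a fraction of the uninhibited value.
activity = [1.00, 0.90, 0.80, 0.70]

# Two synthetic responses: proliferation barely follows the enzyme (exponent 0.1)
# versus proliferation that tracks it closely (exponent 0.9).
weakly_coupled = [a ** 0.1 for a in activity]
strongly_coupled = [a ** 0.9 for a in activity]

c_weak = control_coefficient(activity, weakly_coupled)
c_strong = control_coefficient(activity, strongly_coupled)
print(c_weak, c_strong)
```

In this toy setting the small coefficient corresponds to an enzyme whose inhibition is largely decoupled from proliferation, and the coefficient near unity to one that would be a more attractive target. Real titration data are noisy and the log-log relation is generally not a straight line, so the fit should be restricted to small perturbations around the uninhibited state.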

Phenotype-Based Drug Discovery

In contrast to the target-based paradigm, phenotype-based procedures (229) are closer to the ‘intrinsic biology’ of human biosystems because phenotypic screens are generally developed with respect to functional responses that ideally are related to some disease state (230, 231) rather than to a specific target. Such an approach is particularly useful in complex, multi-factorial diseases where the underlying biology may not be completely understood (232). Target-based approaches, by contrast, are usually considered to be hypothesis driven, with the implication that phenotypic approaches are not. This characterization is clearly pejorative, since phenotype-based procedures are also based on hypotheses, albeit biological ones, and because target-based approaches take only limited account of the numerous factors associated with the function of a target, it is they that may lie on shakier ground. Though phenotypic screens are to a large degree target and mechanism agnostic (230), they nevertheless possess multiple, albeit in many cases latent, targets since the molecular mechanisms of few biological phenomena are fully elucidated. In such cases compound promiscuity – polypharmacology – may be a virtue, especially given the inherent multi-target character of phenotypic screens. It should be appreciated, however, that the result of hitting multiple targets in a phenotypic screen is considerably different from that of hitting all of the same targets in individual assays, since in the former case the responses of some of the targets may be correlated with one another and hence will influence each other’s activities.

It is not surprising that phenotypic screens typically generate a greater number of hits than target-based screens, which are largely focused on single targets (233). Consequently the hit-sets obtained in phenotypic screens tend to occupy larger regions of chemical space than is the case with target-based screens. This is a distinct advantage with regard to hit-set expansion since the resulting screening sets will cover more extensive regions of chemical space, increasing the possibility of finding additional hits. The expanded coverage of chemical space comes at a price, however, namely the loss of target specificity due to the polypharmacological behavior of most drugs and other xenobiotics. This is not necessarily a bad thing, since phenotypic screens are typically associated with biological phenomena that involve the response of one or more biomarkers or targets to the administration of a drug. Hence, a drug’s effect on a single target in most cases may not be highly relevant to the phenotypic endpoint of interest. Unlike single-target-based SAR studies, which do not suffer from confounding influences due to multiple targets and compound promiscuity, SAR studies based on phenotypic endpoints can suffer from these problems. Hence, the biological activity observed in a phenotypic assay is a complicated mixture of the biological activities of all of the targets that are associated with a drug’s polypharmacology.
This follows because the observed biological activity is not just related to the sum of the activities of the drug’s targets but is also related to their effect(s) on the specific pathways in which they are embedded and to interactions between these pathways. Consequently, such cases are likely to be much more difficult to interpret in structural terms. Nevertheless, as discussed by Lee, et al. (230), SAR studies can be conducted on complex cell-based assays if proper account is taken in the design, statistical validation, and operation of the assay. Lang (234) also presents an interesting discussion based on the work of Young, et al. (235) that considers the combination of complex cellular data obtained from high-content screening (HCS) experiments with modern computational tools as a means for obtaining ‘phenotypic SARs’ and for inferring, albeit only partially, MMOAs.

Though phenotype-based approaches in many cases appear to be superior to target-based ones, the former are not without some difficulties. Two of the most apparent involve the limited throughput of phenotypic screening experiments and target identification. In the former case, although a number of medium throughput phenotypic screens have been developed, it was commonly assumed that high-throughput phenotypic screens comparable to those almost routinely carried out in the target-based case were not possible. But as is true in many instances in drug discovery, this challenge has over the last decade driven the development of new HCS methodologies coupled with sophisticated data analysis methods that are capable of deconvoluting the massive, complex datasets produced by these methods (236, 237). Hence, one of the perceived impediments of phenotypic screens, namely their lower throughput compared with HTSs, has been at least partially removed, and if the past is any indication it is not unrealistic to assume that the throughput of phenotypic screens will be significantly expanded in the future. Although such an improvement would make their throughput comparable to that of most HTSs, it raises the question of whether such mega-screening campaigns increase the likelihood of identifying more viable hits than lower throughput screens – if one is looking for a needle in a haystack, the odds of finding it are not necessarily improved by increasing the size of the haystack!

Although a drug’s MMOA is not required for FDA submission or approval (15), efforts are nevertheless usually made to identify the putative target associated with the biological activity observed in a phenotypic screen, a procedure also known as target identification or target deconvolution (238–243). An interesting recent paper by Wei, et al. (244) suggests a simple statistical test as a means for inferring causal relationships between target activities and phenotypic readouts. Determining a drug’s MMOA is not just a purely scientific enterprise but has very practical ramifications with regard to the choice of drug therapy. Because of this, incorrect target identification can have disastrous consequences. For example, there is a substantial danger in cases where a cryptic target or mechanism other than the nominally identified one is responsible for the observed therapeutic efficacy. This follows because one of the rationales of the target-based approach, particularly in cancer patients, is the use of biomarkers to identify the appropriate drug therapy. This can result in an incorrect diagnosis that at best will provide no therapeutic benefit to patients and at worst can deprive them of potentially life-saving drug therapy. An example provided by Moffat, et al. (183) describes multiple clinical trials of the nominal RAF kinase inhibitor sorafenib that showed no benefit against melanoma regardless of the mutational status of the BRAF gene, a finding that is consistent with evidence that its primary target is the gene product of VEGFR2, not BRAF. More recently, the clinical success of vemurafenib, a well-established inhibitor of the serine/threonine-protein kinase B-Raf, confirmed the BRAF target-based hypothesis for melanoma.

Lastly, although comparative studies of the productivity of target- and phenotype-based methods suggest that the latter approach may be more effective, the results are not entirely clear-cut (245, 246). In any case, assuming that there are two, and only two, approaches to drug discovery is a false dichotomy, and taking a more flexible approach that combines the best features of both appears, quite reasonably, to be desirable.

Final Thoughts

There are two main themes in this work. The first involves the structural and functional complexity of biosystems produced by their compartmentalization and the inter-connectedness of their pathways and networks, which link the systems’ components in ways that give rise to their functional role(s) and ensure the overall robust behavior (i.e. homeostasis) of the biosystem. Because of this complexity it is difficult to anticipate all aspects of how a biosystem will react upon the administration of a drug. This difficulty is compounded by the fact, as intimated by the case of miRNAs, that it is unlikely that all critical biosystem components have been discovered. In such cases important regulatory functions that depend on these latent components cannot be properly accounted for. The complex, non-linear, convoluted, and compartmentalized nature of the pathways and networks frustrates attempts to understand their functions without the aid of mathematical models (17–19). Nevertheless the literature continues to be filled with naïve, cartoon-like renditions of biological processes that at best afford an incomplete and in many cases misleading picture of those processes. Being able to diagram a pathway or network is not tantamount to understanding how it functions by itself or within the context of a complete biosystem.

The second theme is the surprising lack of specificity of drugs manifested by their polypharmacology. Even if a ‘perfect drug’ could be designed that would interact only with the desired target and could be delivered effectively to that target, two highly desirable but not practically achievable goals, the complexity of the biological environment into which it is introduced precludes the possibility of fully accounting for its behavior in that environment. Of course perfect drugs do not in reality exist, and the promiscuous behavior of real drugs further complicates attempts to fully elucidate and understand their MMOA, particularly in cases where their promiscuity involves multiple targets, especially those residing in multiple pathways. In such cases the polypharmacological behavior of many drugs imposes another level of interaction between pathways in addition to the ‘normal’ interactions that take place in the absence of drugs. Lastly, the administration of multiple drugs to treat single or multiple diseases is de facto a form of polypharmacology.
Hence, the issues discussed above associated with the polypharmacological behavior of a single drug are completely applicable in this instance as well.

Since the effects on drug discovery of the structural and functional complexity of the human biosystem and the polypharmacology exhibited by many drugs are not independent of one another, they can combine in ways that produce emergent properties that are greater than those produced by each separately, a characteristic of all complex adaptive systems (1). This synergy further confounds the interpretation of a drug’s MMOA, a situation that is not routinely addressed in drug discovery research. Lastly, it should be noted that most patients, especially elderly ones, are for a variety of reasons rarely on a single drug, and in the case of multiple drugs it is likely that the drugs were prescribed to treat different medical conditions. Thus, a variety of possible drug-drug interactions may arise that can influence the behavior of a given drug because the presence of other drugs can affect both the pharmacokinetic and pharmacodynamic properties of all of the drugs.

Target-based and phenotype-based approaches are in many respects more similar than they are different. Both basically share the same procedures; they differ mainly in the order in which these procedures are typically carried out. For example, in the phenotype-based approach, which exemplifies ‘classical pharmacology’, an assay is developed for a functional activity associated with some disease state. A screening campaign is initiated to identify compounds that are active in the screen, followed in many cases by an effort to identify the molecular target responsible for the observed activity. In the target-based approach, by contrast, a putative target is identified first, followed by a screening campaign to identify active compounds (247). To ensure that the target-active compounds are also biologically active they are then usually tested in one or more phenotypic assays. Since the target-based procedure is ostensibly the reverse of the phenotypic one it is sometimes referred to as ‘reverse pharmacology’ (248). As noted earlier, defining the drug discovery process as either target-based or phenotype-based sets up a false dichotomy: the procedures in many cases are not so well-defined and can be more accurately characterized as a mixture of these approaches, since both can and do draw on various types of mechanistic information.

At this point both procedures enter into what is generally called the lead optimization phase of drug discovery, where medicinal chemists synthesize numerous analogs of the active compounds obtained thus far in an effort to identify more potent compounds. Preclinical development scientists then select one (in rare instances more than one) compound with favorable drug delivery and metabolism properties from the set of compounds with enhanced target activity as a candidate for clinical studies. Hence, it is in the early steps of drug discovery that significant differences in discovery and development procedures are most manifest.

It is clear from the above that both drug discovery paradigms suffer from a number of uncertainties, the most prevalent being the lack of comprehensive information on MMOA. This lack is more serious in the target-based approaches because they are highly oriented towards issues associated with the MMOA.
Recall that in most target-based studies molecular mechanistic information rarely extends beyond the putative pathway in which the target is embedded, and in some cases it rarely extends beyond closely associated components in that pathway, a highly limited mechanistic view to say the least. Phenotype-based approaches, on the other hand, are largely target agnostic. However, they do generally have information on the biological mechanism(s) associated with the phenotypic effect(s) under investigation, and in a number of cases also have supporting information on molecular mechanisms related to these phenotypic mechanism(s). Nevertheless, because of their closer alliance with biology, phenotypic approaches are more robust against either a lack of or incorrect mechanistic information. And since, as noted earlier, HCS throughput has in a limited number of cases begun to rival that of the older, more established HTS methods, it seems that phenotypic methods should become increasingly important as a basic platform upon which to build drug discovery programs.

Appendix – Complex Systems

There are basically two types of complex systems, namely, complex physical systems (CPSs) and complex adaptive systems (CASs). Biological systems fall into the latter category, which is distinguished by the following properties: (1) they have numerous interacting components called agents, (2) interactions between agents can change in response to changing conditions, a feature that differentiates them from their complex physical counterparts whose interactions are invariant, (3) they are open systems because they can interact in various ways with their ‘environment’, (4) they are deterministic but nevertheless can exhibit unpredictable behaviors because of their sensitivity to initial conditions, and (5) they exhibit emergent properties. As the field of CASs is relatively new there is some variability in their characterization, although the above characteristics appear to be relatively common (20, 249–251). Since the action performed by a given agent is conditioned upon the actions of other agents, the behavior of the complete system cannot be obtained by simply combining the actions of individual agents – “the whole is more than the sum of its parts” (20). This non-additivity is responsible for the non-linear behavior of CASs. Because of their adaptability and because they are open systems, CASs do not reach equilibrium. Nevertheless, they attain what can be characterized as ‘steady states’ that may exist far from equilibrium, a feature that allows for the formation of interesting patterns, the main point being that there is no need for an ‘overall plan’ or ‘intelligent design’ in order to bring about the formation of complex patterns such as those inherent in living systems (252).

Complex adaptive systems are generally hierarchical with consistent behavior at each level of the hierarchy. For example, Figure 2 depicts the organismal hierarchy common to most biosystems. Individual molecules, which lie at the bottom of the hierarchy, obey the laws of physics and chemistry but are not CASs; entities further up the hierarchy such as cells are CASs. The important point here is that the laws of physics and chemistry are not suspended in molecules once they become involved in cells. Moreover, it is not essential that all of the subcomponents of a CAS also exhibit complex adaptive behavior.
The unpredictable behavior of CASs is due to the fact that their dynamical trajectories, although deterministic, are highly sensitive to initial conditions (20, 27). Hence small changes in initial conditions can result in dramatic changes to the trajectories of these systems, which can also exhibit chaotic behavior. This sensitivity also applies to small changes in system parameters, which can result in catastrophic changes (e.g., ‘bifurcations’) to the trajectories (253). Since neither the system parameters nor the initial conditions can be determined to sufficient accuracy experimentally, these dramatic effects on the trajectories render computation of the dynamical trajectories of complex systems problematic at best.

One of the most important properties of complex systems is emergence (254), which arises from interactions among their components that give rise to new properties, patterns, or entities that are not present in the components themselves. Living systems are a prime example of emergence since many of their properties cannot be inferred from the properties of the molecules from which they are constituted. For example, epigenetic phenomena cannot be deduced solely from knowledge of the structures of DNA, RNAs, and proteins since these structures are devoid of specific functional information. As a more dramatic example, consciousness, which is an emergent property of the brain, cannot be inferred from the structure of individual neurons. There are many other examples of emergence in biology, such as the termite mound with a ‘cathedral-like’ shape produced by a termite colony, which provides a more easily grasped example of emergent phenomena in nature (255).
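The sensitivity to initial conditions and to system parameters discussed earlier in this appendix can be made concrete with a standard toy model. The sketch below uses the logistic map, x(n+1) = r·x(n)·(1 − x(n)), which is a textbook example of deterministic chaos; it is not a model of any biosystem and does not appear in the chapter, and the function name and numerical values are chosen here only for illustration.

```python
def logistic_trajectory(x0, r, steps):
    """Iterate the logistic map x -> r * x * (1 - x) and return all visited states."""
    x = x0
    trajectory = [x]
    for _ in range(steps):
        x = r * x * (1.0 - x)
        trajectory.append(x)
    return trajectory


# Sensitivity to initial conditions: two starting points that differ by 1e-10,
# evolved under the same fully deterministic rule in the chaotic regime (r = 4).
a = logistic_trajectory(0.2, 4.0, 50)
b = logistic_trajectory(0.2 + 1e-10, 4.0, 50)
gap = [abs(p - q) for p, q in zip(a, b)]
print("initial gap:", gap[0])
print("largest gap within 50 steps:", max(gap))

# Sensitivity to a system parameter: a modest change in r moves the long-run
# behavior from a single steady value to a sustained two-state oscillation.
settled = logistic_trajectory(0.2, 2.8, 1000)
cycling = logistic_trajectory(0.2, 3.2, 1000)
print("r = 2.8, last two states:", settled[-2:])
print("r = 3.2, last two states:", cycling[-2:])
```

The point of the example is only qualitative: a measurement error far smaller than any experiment could resolve is enough to make the two trajectories unrelated after a few dozen steps, and a small parameter shift changes the character of the long-run behavior.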

Acknowledgments

The authors would like to thank the BIO5 Institute at the University of Arizona for partial support of this work.

References

1. See ‘Appendix – Complex Systems’ for a brief discussion.
2. See sections on ‘Target-Based Drug Discovery’ and ‘Phenotype-Based Drug Discovery’ for further discussion of this issue.
3. Hopkins, A. L.; Groom, C. R. The druggable genome. Nat. Rev. Drug Discovery 2002, 1, 727–730.
4. Wilson, D. N.; Doudna Cate, J. H. The structure and function of the eukaryotic ribosome. Cold Spring Harb. Perspect. Biol. 2012, 4, a011536.
5. Bochman, M. L.; Paeschke, K.; Zakian, V. A. DNA secondary structures: stability and function of G-quadruplex structures. Nat. Rev. Genet. 2012, 13, 770–780.
6. Zimmermann, G. R.; Lehár, J.; Keith, C. T. Multi-target therapeutics: when the whole is greater than the sum of its parts. Drug Discovery Today 2007, 12, 34–42.
7. Medina-Franco, J. L.; Giulianotti, M. A.; Welmaker, G. S.; Houghten, R. A. Shifting from the single to the multitarget paradigm in drug discovery. Drug Discovery Today 2013, 18, 495–501.
8. Butler, G. S.; Overall, C. M. Proteomic identification of multitasking proteins in unexpected locations complicates drug targeting. Nat. Rev. Drug Discovery 2009, 8, 935–948.
9. Paolini, G. V.; Shapland, R. H. B.; van Hoorn, W. P.; Mason, J. S.; Hopkins, A. L. Global mapping of pharmacological space. Nat. Biotechnol. 2006, 24, 805–815.
10. Peters, J.-U., Ed. Polypharmacology in Drug Discovery; John Wiley & Sons: New York, NY, 2012.
11. Hopkins, A. L. In Polypharmacology in Drug Discovery; Peters, J.-U., Ed.; John Wiley & Sons: Hoboken, NJ, 2012; pp 1−6.
12. Peters, J.-U. Polypharmacology – friend or foe? J. Med. Chem. 2013, 56, 8955–8971.
13. Anighoro, A.; Bajorath, J.; Rastelli, G. Polypharmacology: challenges and opportunities in drug discovery. J. Med. Chem. 2014, 57, 7874–7887.
14. Shelved compounds are compounds that pharmaceutical companies have dropped from further study (i.e. “shelved”) for a variety of scientific or business reasons even though some clinical data may exist (e.g. from Phases 1-3).
15. FDA website with information on Investigational New Drugs. http://www.fda.gov/drugs/developmentapprovalprocess/howdrugsaredevelopedandapproved/approvalapplications/newdrugapplicationnda/default.htm (accessed October 17, 2015).
16. Mechanism matters. Nature Med. 2010, 16, 347 (editorial).
17. Klipp, E.; Liebermeister, W. Mathematical modeling of intracellular signaling pathways. BMC Neurosci. 2006, 7 (Suppl 1), S10, DOI: 10.1186/1471-2202-7-S1-S10.
18. Bardwell, L.; Zou, X.; Nie, Q.; Komarova, N. L. Mathematical models of specificity in cell signaling. Biophys. J. 2007, 92, 3425–3441.
19. Bachmann, J.; Raue, A.; Schilling, M.; Becker, V.; Timmer, J.; Klingmüller, U. Predictive mathematical models of cancer signaling pathways. J. Intern. Med. 2012, 271, 155–165.
20. Holland, J. H. Complexity: A Very Short Introduction; Oxford University Press: Oxford, U.K., 2014.
21. Bailey, J. E. Reflections on the scope and the future of metabolic engineering and its connections to functional genomics and drug discovery. Metab. Engineer. 2001, 3, 111–114.
22. Palsson, B. Ø. Systems Biology: Properties of Reconstructed Networks; Cambridge University Press: Cambridge, U.K., 2006.
23. Dougherty, E. R.; Bittner, M. L. Epistemology of the Cell; IEEE Press (published by John Wiley & Sons): Hoboken, NJ, 2011.
24. The interior of cells, the simplest self-sustaining biosystems, is quite crowded (Cf. Goodsell, D. S. The Machinery of Life; Springer-Verlag: New York, NY, 1998). Because of this, processes such as diffusion and binding that occur within cells can differ significantly from the same processes that, for example, take place in an in vitro experiment on an isolated enzyme.
25. Pierce, N. W.; Kleiger, G.; Shan, S. O.; Deshaies, R. J. Detection of sequential polyubiquitylation on a millisecond time scale. Nature 2009, 462, 615–619.
26. Potapov, A. P. In Analysis of Biological Networks; Junker, B. H.; Schreiber, F., Eds.; John Wiley & Sons: New York, NY, 2008; pp 183−206.
27. Steenburg, D. Chaos at the marriage of Heaven and Hell. Harvard Theol. Rev. 1991, 84, 447–466.
28. Rosen, R. Essays on Life Itself; Columbia University Press: New York, NY, 2000.
29. See page 69 of Reference 23 for additional discussion on the issue of the non-computability of biosystems.
30. Dobson, C. Chemical space and biology. Nature 2004, 432, 824–828.
31. Burrage, K.; Burrage, P. M.; Leier, A.; Marquez-Lago, T.; Nicolau, D. V., Jr. In Design and Analysis of Biomolecular Circuits: Engineering Approaches to Systems and Synthetic Biology; Koeppl, H., Densmore, D., Setti, G., di Bernardo, M., Eds.; Springer-Science+Business: New York, NY, 2011; pp 43−62.
32. It is highly unlikely that first principle, ab initio computations based on quantum statistical mechanical or quantum dynamics procedures will be carried out within the next two decades even if quantum computers become readily available.
33. Ideker, T.; Galitski, T.; Hood, L. A new approach to decoding life: Systems biology. Ann. Rev. Genomics Hum. Genet. 2001, 2, 343–372.


34. See section on ‘Systems Biology and Biological Networks’ for additional discussion.
35. Kang, S.; Kahan, S.; McDermott, J.; Flann, N.; Shmulevich, I. Biocellion: Accelerating computer simulation of multi-cellular systems. Bioinformatics 2014, 30, 3101–3108.
36. Beard, D. A.; Bassingthwaighte, J. B.; Greene, A. S. Computational modeling of physiological systems. Physiol. Genomics 2005, 23, 1–3.
37. Gavaghan, D.; Garny, A.; Maini, P. K.; Kohl, P. Mathematical models in physiology. Philos. Trans. Roy. Soc. A 2006, 364, 1099–1106.
38. IUPS Physiome Project. http://physiome.org.nz (accessed October 17, 2015).
39. See the discussion in Chapter 4 (‘Cells and Factories’) of Reference 23 for a clear discussion of the regulatory logic of biosystems.
40. Maggiora, G. M. The reductionist paradox: are the laws of chemistry and physics sufficient for the discovery of new drugs? J. Comput.-Aided Mol. Design 2011, 25, 699–708.
41. See ‘Appendix – Complex Systems’ for further discussion of emergent properties.
42. Butcher, E. C.; Berg, E. L.; Kunkel, E. J. Systems biology in drug discovery. Nat. Biotechnol. 2004, 22, 1253–1259.
43. Ekins, S.; Bugrim, A.; Nikolsky, Y.; Nikolskaya, T. Systems Biology: Applications in Drug Discovery. Pharmaceutical Sciences Encyclopedia 2010, 5, 1–61.
44. Newman, M. E. J. Networks – An Introduction; Oxford University Press: Oxford, U.K., 2010.
45. Miller, J. G. Living Systems; McGraw-Hill Book Company: New York, NY, 1978.
46. Fell, D. Understanding the Control of Metabolism; Portland Press: London, U.K., 1997.
47. Torres, N. V.; Voit, E. O. Pathway Analysis and Optimization in Metabolic Engineering; Cambridge University Press: Cambridge, U.K., 2002.
48. Bolouri, H. Computational Modeling of Gene Regulatory Networks – A Primer; Imperial College Press: London, U.K., 2011.
49. The term ‘drug-target interaction’ refers to the interaction between a drug and its target; it does not necessarily imply that the drug has induced some type of biologically relevant activity associated with the target.
50. Time-dependent relational networks have also been constructed using time-dependent Bayesian networks. See e.g. Song, L.; Kolar, M.; Xing, E. P. Time-varying dynamic Bayesian networks. Adv. Neural Inf. Proc. Syst. 2009, 22, 1–9.
51. See Chapter 4 (‘Cells and Factories’) of Reference 23 for a discussion of the difference between relational and dynamical networks in biological systems.
52. Yildirim, M. A.; Goh, K-I.; Cusick, M. E.; Barabási, A.-L.; Vidal, M. Drug-target network. Nat. Biotechnol. 2007, 25, 1119–1126.
53. Keiser, M. J.; Roth, B. L.; Armbruster, B. N.; Ernsberger, P.; Irwin, J. J.; Shoichet, B. K. Relating protein pharmacology by ligand chemistry. Nat. Biotechnol. 2007, 25, 197–206.


54. Hopkins, A. L. Network pharmacology. Nat. Biotechnol. 2007, 25, 1110–1111.
55. Hopkins, A. L. Network pharmacology: the next paradigm in drug discovery. Nat. Chem. Biol. 2008, 4, 682–690.
56. See the section on ‘Imatinib − A Prototypical Example of Target-Based Drug Discovery’ for a detailed discussion.
57. The word ‘drug’ is used nominally in this context (i.e. drug-target interactions) and hence can in many instances be replaced by the word ‘compound’. The term ‘interactions’ is included here to emphasize that drug-target interactions may not always give rise to bioactivities, since binding may occur without concomitant expression of bioactivity.
58. Lemieux, R. U.; Spohr, U. How Emil Fischer was led to the ‘lock and key’ concept for enzyme specificity. Adv. Carbohydr. Chem. Biochem. 1994, 50, 1–20.
59. Koshland, D. E. Application of a theory of enzyme specificity to protein synthesis. Proc. Natl. Acad. Sci. U.S.A. 1958, 44, 98–104.
60. Langley, J. N. On the reaction of cells and of nerve-endings to certain poisons, chiefly as regards the reaction of striated muscle to nicotine and curare. J. Physiol. 1905, 33, 374–413.
61. Prüll, C.-R. Part of a scientific master plan? Paul Ehrlich and the origins of his receptor concept. Med. History 2003, 4, 332–356.
62. Maehle, A.-H.; Prüll, C.-R.; Halliwell, R. F. The emergence of the drug-receptor theory. Nat. Rev. Drug Discovery 2002, 1, 637–641.
63. Limbird, L. E. The receptor concept: A continuing evolution. Mol. Interv. 2004, 4, 326–336.
64. Note that the term ‘ligand’ can refer to either small molecules or macromolecules.
65. Bissantz, C.; Kuhn, B.; Stahl, M. A medicinal chemist’s guide to molecular interactions. J. Med. Chem. 2010, 53, 5061–5084.
66. Monod, J.; Wyman, J.; Changeux, J. P. On the nature of allosteric transitions: A plausible model. J. Mol. Biol. 1965, 2, 88–118.
67. Koshland, D. E., Jr.; Némethy, G.; Filmer, D. Comparison of experimental binding data and theoretical models in proteins containing subunits. Biochemistry 1966, 5, 365–368.
68. Cantley, L. C.; Hunter, T.; Sever, R.; Thorner, J., Eds. Signal Transduction – Principles, Pathways, and Processes; Cold Spring Harbor Laboratory Press: Cold Spring Harbor, NY, 2014.
69. Vivekanand, P.; Rebay, I. Intersection of signal transduction pathways and development. Annu. Rev. Genet. 2006, 40, 139–157.
70. Metallo, C. M.; Vander Heiden, M. G. Understanding metabolic regulation and its influence on cell physiology. Mol. Cell 2013, 49, 388–398.
71. Chuang, H.-Y.; Hofree, M.; Ideker, T. A decade of systems biology. Ann. Rev. Cell Devel. Biol. 2010, 26, 721–744.
72. Alon, U. An Introduction to Systems Biology – Design Principles of Biological Circuits; CRC Press: Boca Raton, FL, 2006.
73. Palsson, B. O. Systems Biology – Simulation of Dynamic Network States; Cambridge University Press: Cambridge, U.K., 2011.


74. Fliri, A. F.; Loging, W. T.; Thadeio, P. F.; Volkman, R. A. Analysis of drug-induced effect patterns to link structure and side effects of medicines. Nat. Chem. Biol. 2005, 1, 389–397.
75. Bender, A.; Scheiber, J.; Glick, M.; Davies, J. W.; Azzaoui, K.; Hamon, J.; Urban, L.; Whitebread, S.; Jenkins, J. Analysis of pharmacology data and the prediction of adverse drug reactions and off target effects from chemical structure. ChemMedChem 2007, 2, 861–873.
76. Campillos, M.; Kuhn, M.; Gavin, A.-C.; Jensen, L. J.; Bork, P. Drug-target identification using side-effect similarity. Science 2008, 321, 263–266.
77. Kuhn, M.; Campillos, M.; Letunic, I.; Jensen, L. J.; Bork, P. A side effect resource to capture phenotypic effects. Mol. Syst. Biol. 2010, 6, 343; Open Source, DOI: 10.1038/msb.2009.98.
78. Lee, S.; Lee, K. H.; Lee, D. Building the process-drug-side effect network to discover the relationship between biological processes and side effects. BMC Bioinf. 2011, 12 (Suppl. 2), S2. http://www.biomedcentral.com/1471-2105/12/S2/S2 (accessed November 23, 2015).
79. Cami, A.; Arnold, Al.; Manzi, Sh.; Reis, B. Predicting adverse drug events using pharmacological network models. Science Trans. Med. 2011, 3, 114–127.
80. Liu, M.; Wu, Y.; Chen, Y.; Sun, J.; Zhao, Z.; Chen, X.; Matheny, M. E.; Xu, H. Large-scale prediction of adverse drug reactions - chemical, biological, and phenotypic properties of drugs. J. Am. Med. Inform. Assoc. 2012, 19, e29–e35.
81. Lounkine, E.; Keiser, M. J.; Whitebread, S.; Mikhailov, D.; Hamon, J.; Jenkins, J. L.; Lavan, P.; Weber, E.; Doak, A. K.; Côté, S.; Shoichet, B.; Urban, L. Large-scale prediction and testing of drug activity on side-effect targets. Nature 2012, 486, 361–368.
82. Bresso, E.; Grisoni, R.; Marchetti, G.; Karaboga, A. S.; Souchet, M.; Devignes, M.-D.; Smail-Tabbone, M. Integrative relational machine-learning for understanding drug side-effect profiles. BMC Bioinf. 2013, 14, 207. http://biomedcentral.com/1471-2105/14/207 (accessed November 4, 2015).
83. Zheng, H.; Wang, H.; Xu, H.; Zhao, Z.; Azuaje, F. Correlating adverse drug reactions with biological pathways in humans. Bioinf. Biomed. 2013, 197–200; IEEE Conference on Bioinformatics and Biomedicine; DOI: 10.1109/BIBM.2013.6732488.
84. Kuhn, M.; Banchaabouchi, M. A.; Campillos, M.; Jensen, L. J.; Gross, C.; Gavin, A.-C.; Bork, P. Systematic identification of proteins that elicit drug side effects. Mol. Syst. Biol. 2013, 9, 663; DOI: 10.1038/msb.2013.10.
85. Zheng, H.; Wang, H.; Xu, H.; Wu, Y.; Zhao, Z. Linking biochemical pathways and networks – adverse drug reactions. IEEE Trans. Nanobiosci. 2014, 13, 131–137.
86. Yildirim, P.; Majnaric, L.; Ekmekci, O. I.; Holzinger, A. Knowledge discovery of drug data on the example of adverse drug reaction prediction. BMC Bioinf. 2014, 15 (Suppl. 6), S7. http://www.biomedcentral.com/1471-2105/15/S6/S7 (accessed November 5, 2015).
87. LaBute, M. X.; Zhang, X.; Lenderman, J.; Bennion, B. J.; Wong, S. E.; Lightstone, F. C. Adverse drug reaction prediction using scores produced by large-scale drug-protein target docking on high-performance computing machines. PLoS ONE 2014, 9, e106298; DOI: 10.1371/journal.pone.0106298.
88. Kuhn, M.; Campillos, M.; Letunic, I.; Jensen, L. J.; Bork, P. A side effect resource to capture phenotypic effects of drugs. Mol. Syst. Biol. 2010, 6, 343; DOI: 10.1038/msb.2009.98.
89. Information in SIDER is downloadable and can be accessed on the EMBL website: sideeffects.embl.de. The current version was updated on August 6, 2015.
90. Barratt, M. J., Frail, D. E., Eds. Drug Repositioning – Bringing New Life to Shelved Assets and Existing Drugs; John Wiley & Sons: Hoboken, NJ, 2012.
91. von Eichborn, J.; Murgueitio, M. S.; Dunkel, M.; Koerner, S.; Bourne, P. E.; Preissner, R. PROMISCUOUS: a database for network-based drug repositioning. Nucl. Acids Res. 2011, 39, D1060–D1066.
92. However, there is an important caveat associated with some seemingly promiscuous compounds: their behavior can be artifactual, as is the case for what have been called pan-assay interference compounds, or PAINS. In contrast to most drugs, compounds in this category do not interact with specific molecular features of the targets. For example, compounds that contain Michael acceptors can undergo non-specific chemical reactions with multiple groups on a target protein. More generally, compounds can disrupt membrane structure in such a manner as to modify the function of membrane receptors or transport proteins, they can form aggregates that can lead to non-specific binding, they can complex metal ions in ways that lead to inactivation of protein function, or they can lead to false readouts because they are fluorescent or highly colored. See e.g. Baell, J.; Walters, M. A. Chemical con artists foil drug discovery. Nature 2014, 513, 481–483 and references cited therein.
93. McGovern, S. L.; Caselli, E.; Grigorieff, N.; Shoichet, B. K. A common mechanism underlying promiscuous inhibitors from virtual and high-throughput screening. J. Med. Chem. 2002, 45, 1712–1722.
94. Mestres, J.; Gregori-Puigjané, E. Conciliating binding efficiency and polypharmacology. Trends Pharmacol. Sci. 2009, 30, 470–474.
95. Mencher, S. K.; Wang, L. G. Promiscuous drugs compared to selective drugs (promiscuity can be a virtue). BMC Clin. Pharmacol. 2005, 5, DOI: 10.1186/1472-6904-5-3.
96. Hopkins, A. L.; Mason, J. S.; Overington, J. P. Can we rationally design promiscuous drugs? Curr. Opin. Struct. Biol. 2006, 16, 127–136.
97. Morphy, R.; Rankovic, Z. Fragments, network biology, and designing multiple ligands. Drug Discovery Today 2007, 12, 156–160.
98. Sivachenko, A.; Kalinin, A.; Yuryev, A. Pathway analysis for the design of promiscuous drugs and selective drug mixtures. Curr. Drug Discovery Technol. 2006, 3, 269–277.
99. Apsel, B.; Blair, J. A.; Gonzalez, B.; Nazif, T. M.; Feldman, M. E.; Aizenstein, B.; Hoffman, R.; Williams, R. L.; Shokat, K. M.; Knight, Z. A. Targeted polypharmacology: discovery of dual inhibitors of tyrosine and phosphoinositide kinases. Nat. Chem. Biol. 2008, 4, 691–699.


100. Besnard, J.; Ruda, G. F.; Setola, V.; Abecassis, K.; Rodriguiz, R. M.; Huang, X.-P.; Norval, S.; Sassano, M. F.; Shin, A. I.; Webster, L. A.; Simeons, F. R. C.; Stojanovski, L.; Prat, A.; Seidah, N. G.; Constam, D. B.; Bickerton, G. R.; Read, K. D.; Wetsel, W. C.; Gilbert, I. H.; Roth, B. L.; Hopkins, A. L. Automated design of ligands to polypharmacological profiles. Nature 2012, 492, 215–222.
101. Since it is likely in many cases that there will be some overlap among pharmacophores this may be difficult to accomplish without the loss of desirable activities. It would be interesting to know whether or not the pharmacophores associated with the desirable activities of repurposed drugs generally overlap and if so to what degree.
102. Liu, X.; Ouyang, S.; Yu, B.; Liu, Y.; Huang, K.; Gong, J.; Zheng, S.; Li, Z.; Li, H.; Jiang, H. PharmMapper server: a web server for potential drug target identification using pharmacophore mapping approach. Nucleic Acids Res. 2010, 38, W609–W614.
103. Cutler, S.; Cutler, H. G. Biologically Active Natural Products: Pharmaceuticals; CRC Press, Taylor & Francis Group: Boca Raton, FL, 2000.
104. Cragg, G. M.; Newman, D. J. Plants as a source of anti-cancer agents. Ethnopharmacology 2005, 100, 72–79.
105. Balunas, M. J.; Kinghorn, A. D. Drug discovery from medicinal plants. Life Sci. 2005, 78, 431–441.
106. Tringali, C., Ed. Bioactive Compounds from Natural Sources: Natural Products as Lead Compounds in Drug Discovery; CRC Press, Taylor & Francis Group: Boca Raton, FL, 2005.
107. Gu, J.; Gui, Y.; Chen, L.; Gu, Y.; Lu, H.-Z.; Xu, X. Use of natural products as chemical library for drug discovery and network pharmacology. PLoS ONE 2013, 8, e62839; DOI: 10.1371/journal.pone.0062839.
108. Mestres, J.; Gregori-Puigjané, E.; Valverde, S.; Solé, R. V. The topology of drug-target interaction networks: implicit dependence on drug properties and target families. Mol. BioSyst. 2009, 5, 1051–1057.
109. Janga, S. C.; Tzakos, A. Structure and organization of drug-target networks: insights from genomic approaches for drug discovery. Mol. BioSyst. 2009, 5, 1536–1548.
110. Vogt, I.; Mestres, J. Drug-target networks. Mol. Inf. 2010, 29, 10–14.
111. Metz, J. T.; Hajduk, P. J. Rational approaches to targeted polypharmacology: creating and navigating protein-ligand interaction networks. Curr. Opin. Chem. Biol. 2010, 14, 498–504.
112. Nejad, A. M.; Mousavian, Z.; Bozorgmehr, J. H. Drug-target and disease networks: polypharmacology in the post-genomic era. In Silico Pharm. 2013, 1, 17; DOI: 10.1186/2193-9616-1-17. http://www.in-silico-pharmacology.com/content/1/1/17 (accessed November 5, 2015).
113. Note that the ‘drug-target’ designation is used nominally and is not meant to preclude the inclusion of compounds that are not drugs or are unlikely to become drugs.


114. A drug is considered to be active with respect to a given target if its activity against that target is equal to or greater than some chosen threshold value, say 10 μM.
115. Schadt, E. E.; Friend, S. H.; Shaywitz, D. A. A network view of disease and compound screening. Nat. Rev. Drug Discovery 2009, 8, 286–295.
116. Hu, G.; Agarwal, P. Human disease-drug network based on genomic expression profiles. PLoS ONE 2009, 4, e6536; DOI: 10.1371/journal.pone.0006536.
117. Dudley, J. T.; Deshpande, T.; Butte, A. J. Exploiting drug-disease relationships for computational drug repositioning. Briefings Bioinf. 2011, 12, 303–311.
118. Barabási, A.-L.; Gulbahce, N.; Loscalzo, J. Network medicine: a network-based approach to human disease. Nat. Rev. Genet. 2011, 12, 56–68.
119. Hu, Z.; Chang, Y.-C.; Wang, Y.; Huang, C.-L.; Liu, Y.; Tian, F.; Granger, B.; DeLisi, C. VisANT 4.0: Integrative network platform to connect genes, drugs, diseases, and therapies. Nucl. Acids Res. 2013 (Web Server Issue), W225–W231.
120. Sun, P. G. The human drug-disease-gene network. Inf. Sciences 2015, 306, 70–80.
121. Mestres, J.; Gregori-Puigjané, E.; Valverde, S.; Solé, R. V. Data completeness – the Achilles heel of drug-target networks. Nat. Biotechnol. 2008, 26, 983–984.
122. Bajorath, J. Modeling activity landscapes for drug discovery. Expert Opin. Drug Discovery 2012, 7, 463–473.
123. Iyer, P.; Stumpfe, D.; Vogt, M.; Bajorath, J.; Maggiora, G. M. Activity landscapes, information theory, and structure-activity relationships. Mol. Inf. 2013, 32, 421–430.
124. Brown, N., Ed. Scaffold Hopping in Medicinal Chemistry; John Wiley & Sons: Hoboken, NJ, 2014.
125. Brown, N., Ed. Bioisosteres in Medicinal Chemistry; Wiley-VCH Verlag GmbH & Co. KGaA: Weinheim, Germany, 2012.
126. Whittle, M.; Gillet, V. J.; Willett, P.; Alex, A.; Loesel, J. Enhancing the effectiveness of virtual screening by fusing nearest neighbor lists: a comparison of similarity coefficients. J. Chem. Inf. Comput. Sci. 2004, 44, 1840–1848.
127. Hert, J.; Willett, P.; Wilton, D. J.; Acklin, P.; Azzaoui, K.; Jacoby, E.; Schuffenhauer, A. New methods for ligand-based virtual screening: use of data fusion and machine learning techniques to enhance the effectiveness of similarity searching. J. Chem. Inf. Model. 2006, 46, 462–470.
128. Willett, P. Combination of similarity rankings using data fusion. J. Chem. Inf. Model. 2013, 53, 1–10.
129. Maggiora, G. M. In Foodinformatics; Martinez-Mayorga, K., Medina-Franco, J. L., Eds.; Springer: Heidelberg, Germany, 2014; pp 1–81.
130. See the earlier section on ‘Drug-Target Specificity’ for further discussion.
131. Jalencas, X.; Mestres, J. On the origins of drug polypharmacology. MedChemComm 2013, 4, 80–87.


132. Barelier, S.; Sterling, T.; O’Meara, M. J.; Shoichet, B. K. The recognition of identical ligands by unrelated proteins. ACS Chem. Biol. 2015, 10, 2772–2784; DOI: 10.1021/acschembio.5b00683.
133. Hu, Y.; Bajorath, J. Growth of ligand-target interaction data in ChEMBL is associated with increasing and measurement-dependent compound promiscuity. J. Chem. Inf. Model. 2012, 52, 2550–2558.
134. Hu, Y.; Bajorath, J. How promiscuous are pharmaceutically relevant compounds? A data-driven assessment. AAPS J. 2013, 15, 104–111.
135. Hu, Y.; Bajorath, J. Activity profile relationships between structurally similar promiscuous compounds. Eur. J. Med. Chem. 2013, 69, 393–398.
136. Hu, Y.; Bajorath, J. Compound promiscuity – what can we learn from current data? Drug Discovery Today 2013, 18, 644–650.
137. Hu, Y.; Bajorath, J. What is the likelihood of an active compound to be promiscuous? Systematic assessment of compound promiscuity on the basis of PubChem confirmatory bioassay data. AAPS J. 2013, 15, 808–815.
138. A comprehensive list of available biologically relevant DBs is given annually in the Database Issue of Nucleic Acids Research. See Fernández-Suárez, X. M.; Rigden, D. J.; Galperin, M. Y. The 2014 Nucleic Acids Research database issue and an updated NAR online molecular biology database collection. Nucl. Acids Res. 2014, 42, D1–D6.
139. Kuhn, M.; Szklarczyk, D.; Pletscher-Frankild, S.; Blicher, T. H.; von Mering, C.; Jensen, L. J.; Bork, P. STITCH 4.0: Integration of protein-chemical interactions with user data. Nucl. Acids Res. 2014, 42, D401–D407.
140. Roider, H. G.; Pavlova, N.; Kirov, I.; Slavov, S.; Slavov, T.; Uzunov, Z.; Weiss, B. Drug2Gene: an exhaustive resource to explore effectively the drug-target relation network. BMC Bioinf. 2014, 15, 68–78.
141. STITCH is available at stitch.embl.de (accessed September 8, 2015).
142. Drug2Gene is available at drug2gene.info (accessed September 8, 2015).
143. Iskar, M.; Campillos, M.; Kuhn, M.; Jensen, L. J.; van Noort, V.; Bork, P. Drug-induced regulation of target expression. PLoS Comput. Biol. 2010, 6, e1000925; DOI: 10.1371/journal.pcbi.1000925.
144. Lamb, J.; Crawford, E. D.; Peck, D.; Modell, J. W.; Blat, I. C.; Wrobel, M. J.; Lerner, J.; Brunet, J. P.; Subramanian, A.; Ross, K. N.; Reich, M.; Hieronymus, H.; Wei, G.; Armstrong, S. A.; Haggarty, S. J.; Clemons, P. A.; Wei, R.; Carr, S. A.; Lander, E. S.; Golub, T. R. The Connectivity Map: using gene-expression signatures to connect small molecules, genes, and disease. Science 2006, 313, 1929–1935.
145. The Connectivity Map (CMap) is a publicly accessible application that can be reached on the Broad Institute website, broad.mit.edu (accessed September 9, 2015).
146. Salesse, S.; Verfaillie, C. M. BCR-ABL-mediated increased expression of multiple known and novel genes that may contribute to the pathogenesis of chronic myelogenous leukemia. Mol. Cancer Ther. 2003, 2, 173–182.
147. Suarez, R. K.; Moyes, C. D. Metabolism in the age of ‘omes’. J. Exp. Biol. 2012, 215, 2351–2357.


148. Faraoni, I.; Antonetti, F. R.; Cardone, J.; Bonmassar, E. miR-155 gene: a typical multifunctional microRNA. Biochim. Biophys. Acta 2009, 1792, 497–505.
149. Greenbaum, D.; Colangelo, C.; Williams, K.; Gerstein, M. Comparing protein abundance and mRNA expression levels on a genomic scale. Genome Biol. 2003, 4, 117. http://genomebiology.com/2003/4/9/117 (accessed October 17, 2015).
150. Carthew, R. W.; Sontheimer, E. J. Origins and mechanisms of miRNAs and siRNAs. Cell 2009, 136, 642–655.
151. Juhasz, K.; Gombos, K.; Gocze, K.; Wolher, V.; Szirmai, M.; Revesz, P.; Magda, I.; Sebestyen, A.; Ember, I. Effect of N-methyl-N-nitrosourea on microRNA expression in CBA/CA mice. J. Environ. Occup. Sci. 2012, 1, 77–82.
152. Takahashi, K.; Tatsumi, N.; Fukami, T.; Yokoi, T.; Nakajima, M. Integrated analysis of rifamycin-induced microRNA and gene expression changes in human hepatocytes. Drug Metab. Pharmacokinet. 2014, 29, 333–340.
153. Lee, R. C.; Feinbaum, R. L.; Ambros, V. The C. elegans heterochronic gene lin-4 encodes small RNAs with anti-sense complementarity to lin-14. Cell 1993, 75, 843–854.
154. Wightman, B.; Ha, I.; Ruvkun, G. Post-transcriptional regulation of the heterochronic gene lin-14 by lin-4 mediates temporal pattern formation in C. elegans. Cell 1993, 75, 855–862.
155. Calin, G. A.; Dumitru, C. D.; Shimizu, M.; Bichi, R.; Zupo, S.; Noch, E.; Aldler, H.; Rattan, S.; Keating, M.; Rai, K.; Rassenti, L.; Kipps, T.; Negrini, M.; Bullrich, F.; Croce, C. M. Frequent deletions and down-regulation of microRNA genes miR15 and miR16 at 13q14 in chronic lymphocytic leukemia. Proc. Natl. Acad. Sci. U.S.A. 2002, 99, 15524–15529.
156. Lu, M.; Zhang, Q.; Deng, M.; Miao, J.; Guo, Y.; Gao, W.; Cui, Q. An analysis of human microRNA and disease associations. PLoS ONE 2008, DOI: 10.1371/journal.pone.0003420.
157. Ardekani, A. M. The role of microRNAs in human diseases. Avicenna J. Med. Biotech. 2010, 2, 161–179.
158. Li, K.; Kowdley, K. V. MicroRNAs in common human diseases. Genom. Proteom. Bioinf. 2012, 10, 246–253.
159. Heneghan, H. M.; Miller, N.; Kerin, M. J. MiRNAs as biomarkers and therapeutic targets in cancer. Curr. Opin. Pharmacol. 2010, 10, 543–550.
160. Dimmeler, S.; Zeiher, A. M. Circulating microRNAs: novel biomarkers for cardiovascular disease. Eur. Heart J. 2010, 31, 2705–2707.
161. Schneekloth, J. S., Jr.; Crews, C. M. Chemical approaches to controlling intracellular protein degradation. ChemBioChem 2005, 6, 40–46.
162. Raina, K.; Crews, C. M. Chemical inducers of targeted protein degradation. J. Biol. Chem. 2010, 285, 11057–11060.
163. Brown, J.; Treherne, J. M. Targeted chemical libraries: the keys to unlock the ubiquitin system. Drug Discovery World 2014 (Summer), 66–74.
164. Schmidt, M. F. Drug target miRNAs: chances and challenges. Trends Biotechnol. 2014, 32, 578–585.


165. Monroig, P. del C.; Chen, L.; Zhang, S.; Calin, G. A. Small molecule compounds targeting miRNAs for cancer therapy. Adv. Drug Delivery Rev. 2015, 81, 104–116.
166. Berg, T. Inhibition of transcription factors with small organic molecules. Curr. Opin. Chem. Biol. 2008, 12, 464–471.
167. Koehler, A. N. A complex task? Direct modulation of transcription factors with small molecules. Curr. Opin. Chem. Biol. 2010, 14, 331–340.
168. Bunnage, M. E.; Gilbert, A. M.; Jones, L. H.; Hett, E. C. Know your target, know your molecule. Nat. Chem. Biol. 2015, 11, 368–372.
169. Sams-Dodd, F. Drug discovery: selecting the optimal approach. Drug Discovery Today 2006, 11, 465–472.
170. Current usage of the word ‘phenotype’ is much broader in scope than it was before the emergence of the field of molecular biology, when phenotype referred to relatively easily observable characteristics of an organism such as body temperature, blood pressure, limb deformities, etc. Nowadays, phenotype can refer to any macroscopically or microscopically observable traits of an organism or cellular population. Hence, one can speak of cellular, disease, cancer, apoptotic, morphological, metabolic, etc., phenotypes. In any case, the important issue is that all of the mechanistic details associated with a given phenotype need not be completely known.
171. Rask-Andersen, M.; Sällman Almén, M.; Schiöth, H. B. Trends in the exploitation of novel drug targets. Nat. Rev. Drug Discovery 2011, 10, 579–590.
172. Mullard, A. Reliability of ‘new drug target’ claims called into question. Nat. Rev. Drug Discovery 2011, 10, 643–644.
173. Metcalf, B. W., Dillon, S., Eds. Target Validation in Drug Discovery; Academic Press: Burlington, MA, 2007.
174. Blake, R. A. Target validation in drug discovery. Methods Mol. Biol. 2007, 356, 367–377.
175. Chen, X.-P.; Du, G.-H. Target validation: a door to drug discovery. Drug Disc. Ther. 2007, 1, 23–29.
176. Arrowsmith, J. Trial watch: Phase II failures: 2008-2010. Nat. Rev. Drug Discovery 2011, 10, 328–329.
177. Sams-Dodd, F. Target-based drug discovery: is something wrong? Drug Discovery Today 2005, 10, 139–147.
178. Sams-Dodd, F. Is poor research the cause of the declining productivity of the pharmaceutical industry? An industry in need of a paradigm shift. Drug Discovery Today 2013, 18, 211–217.
179. Brown, D. Target selection and pharma industry productivity: what can we learn from technology S-curve theory? Curr. Opin. Drug Discovery Dev. 2006, 9, 414–418.
180. Brown, D. Unfinished business: target-based drug discovery. Drug Discovery Today 2007, 12, 1007–1012.
181. Scannell, J. W.; Blanckley, A.; Boldon, H.; Warrington, B. Diagnosing the decline in pharmaceutical R&D efficiency. Nat. Rev. Drug Discovery 2012, 11, 191–200.


182. Pammolli, F.; Magazzini, L.; Riccaboni, M. The productivity crisis in pharmaceutical R&D. Nat. Rev. Drug Discovery 2011, 10, 428–438.
183. Moffat, J. G.; Rudolph, J.; Bailey, D. Phenotypic screening in cancer drug discovery – past, present, and future. Nat. Rev. Drug Discovery 2014, 13, 588–602.
184. Deininger, M. W.; Goldman, J. M.; Melo, J. V. The molecular biology of chronic myeloid leukemia. Blood 2000, 96, 3343–3356.
185. Ren, R. Mechanisms of Bcr-Abl in the pathogenesis of chronic myelogenous leukemia. Nat. Rev. Cancer 2005, 5, 172–183.
186. Maru, Y. Molecular biology of chronic myeloid leukemia. Cancer Sci. 2012, 103, 1601–1610.
187. Comert, M.; Baran, Y.; Saydam, G. Changes in the molecular biology of chronic myeloid leukemia in [the] tyrosine kinase inhibitor era. Am. J. Blood Res. 2013, 3, 191–200.
188. Daley, G. Q.; Van Etten, R. A.; Baltimore, D. Induction of chronic myelogenous leukemia in mice by the P210bcr/abl gene of the Philadelphia chromosome. Science 1990, 247, 824–830.
189. Kelliher, M. A.; McLaughlin, J.; Witte, O. N.; Rosenberg, N. Induction of chronic myelogenous leukemia-like syndrome in mice with v-abl and BCR/ABL. Proc. Natl. Acad. Sci. U.S.A. 1990, 87, 6649–6653.
190. Heisterkamp, N.; Jenster, G.; Ten Hoeve, J.; Zovich, D.; Pattengale, P. K.; Groffen, J. Acute leukemia in bcr/abl transgenic mice. Nature 1990, 344, 251–253.
191. Gordon, M. Y.; Dowding, C. R.; Riley, G. P.; Goldman, J. M.; Greaves, M. F. Altered adhesion interactions with marrow stroma of haematopoietic progenitor cells in chronic myeloid leukemia. Nature 1987, 328, 342–344.
192. Puil, L.; Liu, J.; Gish, G.; Mbamalu, G.; Bowtell, D.; Pelicci, P. G.; Arlinghaus, R.; Pawson, T. Bcr-Abl oncoproteins bind directly to activators of the Ras signaling pathway. EMBO J. 1994, 13, 764–773.
193. Bedi, A.; Zehnbauer, B. A.; Barber, J. P.; Sharkis, S. J.; Jones, R. J. Inhibition of apoptosis by BCR-ABL in chronic myeloid leukemia. Blood 1994, 83, 2038–2044.
194. Danial, N. N.; Rothman, P. JAK-STAT signaling activated by Abl oncogenes. Oncogene 2000, 19, 2523–2531.
195. Arlinghaus, R.; Sun, T. In Molecular Targeting and Signal Transduction; Kumar, R., Ed.; Kluwer Academic Publishers: Boston, MA, 2004; pp 239−270.
196. Seke Etet, P. F.; Vecchio, L.; Nwabo Kamdje, A. H. Signaling pathways in chronic myeloid leukemia and leukemic stem cell maintenance: key role of stromal microenvironment. Cell. Signal. 2012, 24, 1883–1888.
197. Kirchner, D.; Duyster, J.; Ottmann, O.; Schmid, R. M.; Bergmann, L.; Munzert, G. Mechanisms of Bcr-Abl-mediated NF-κB/Rel activation. Expt. Hematol. 2003, 31, 504–511.
198. Shishodia, S.; Aggarwal, B. B. In Molecular Targeting and Signal Transduction; Kumar, R., Ed.; Kluwer Academic Publishers: Boston, MA, 2004; pp 139−173.


199. Vickers, M. Estimation of the number of mutations necessary to cause chronic myeloid leukemia from epidemiological data. Br. J. Haematol. 1996, 94, 1–4.
200. Lichtman, M. In Williams Hematology; Beutler, E., Lichtman, M. A., Coller, B. S., Kipps, T. J., Eds.; McGraw-Hill: New York, NY, 1995; pp 298–324.
201. Kurzrock, R.; Gutterman, J. U.; Talpaz, M. The molecular genetics of Philadelphia chromosome-positive leukemias. N. Engl. J. Med. 1988, 319, 990–998.
202. Capdeville, R.; Buchdunger, E.; Zimmermann, J.; Matter, A. Glivec (STI571, imatinib), a rationally developed, targeted anticancer drug. Nat. Rev. Drug Discovery 2002, 1, 493–502.
203. Deininger, M.; Buchdunger, E.; Druker, B. J. The development of imatinib as a therapeutic agent for chronic myeloid leukemia. Blood 2005, 105, 2640–2653.
204. Reddy, E. P.; Aggarwal, A. K. The ins and outs of Bcr-Abl inhibition. Genes Cancer 2012, 3, 447–454.
205. Jabbour, E.; Kantarjian, H.; Cortes, J. Use of second- and third-generation tyrosine kinase inhibitors in the treatment of chronic myeloid leukemia: an evolving treatment paradigm. Clin. Lymphoma Myeloma Leuk. 2015, 15, 323–334.
206. Schindler, T.; Bornmann, W.; Pellicena, P.; Miller, W. T.; Clarkson, B.; Kuriyan, J. Structural mechanism for STI-571 inhibition of Abelson tyrosine kinase. Science 2000, 289, 1938–1942.
207. Pendergast, A. M.; Gishizky, M. L.; Havlik, M. H.; Witte, O. N. SH1 domain autophosphorylation of P210 BCR/ABL is required for transformation but not growth factor independence. Mol. Cell. Biol. 1993, 13, 1728–1736.
208. Hamilton, A.; Helgason, G. V.; Schemionek, M.; Zhang, B.; Myssina, S.; Allan, E. K.; Nicolini, F. E.; Müller-Tidow, C.; Bhatia, R.; Brunton, V. G.; Koschmieder, S.; Holyoake, T. L. Chronic myeloid leukemia cells are not dependent on Bcr-Abl kinase activity for their survival. Blood 2012, 119, 1501–1510.
209. Bixby, D.; Talpaz, M. Seeking the causes and solutions to imatinib-resistance in chronic myeloid leukemia. Leukemia 2011, 25, 7–22.
210. Gorre, M. E.; Mohammed, M.; Ellwood, K.; Hsu, N.; Paquette, R.; Rao, P. N.; Sawyers, C. L. Clinical resistance to STI-571 caused by BCR-ABL gene mutation or amplification. Science 2001, 293, 876–880.
211. O’Hare, T.; Eide, C. A.; Deininger, M. W. Bcr-Abl kinase domain mutations, drug resistance, and the road to a cure for chronic myeloid leukemia. Blood 2007, 110, 2242–2249.
212. Branford, S.; Rudzki, Z.; Walsh, S.; Parkinson, I.; Grigg, A.; Szer, J.; Taylor, K.; Hermann, R.; Seymour, J. F.; Arthur, C.; Joske, D.; Lynch, K.; Hughes, T. Detection of BCR-ABL mutations in patients with CML treated with imatinib is virtually always accompanied by clinical resistance, and mutations in the ATP phosphate binding loop (P-loop) are associated with poor prognosis. Blood 2003, 102, 276–283.
213. Jabbour, E.; Kantarjian, H.; Jones, D.; Talpaz, M.; Bekele, N.; O’Brien, S.; Zhou, X.; Luthra, R.; Garcia-Manero, G.; Giles, F.; Rios, M. B.; Verstovsek, S.; Cortes, J. Frequency and clinical significance of BCR-ABL mutations in patients with chronic myeloid leukemia treated with imatinib mesylate. Leukemia 2006, 20, 1767–1773.
214. Hegedus, T.; Orfi, L.; Seprodi, A.; Varadi, A.; Sarkadi, B.; Keri, G. Interaction of tyrosine kinase inhibitors with human multidrug transporter proteins, MDR1 and MRP1. Biochim. Biophys. Acta 2002, 1587, 318–325.
215. Shah, N. P.; Tran, C.; Lee, F. Y.; Chen, P.; Norris, D.; Sawyers, C. L. Overriding imatinib resistance with a novel ABL kinase inhibitor. Science 2004, 305, 399–401.
216. O’Hare, T.; Walters, D. K.; Stoffregen, E. P.; Jia, T.; Manley, P. W.; Mestan, J.; Cowan-Jacob, S. W.; Lee, F. Y.; Heinrich, M. C.; Deininger, M. W.; Druker, B. J. In vitro activity of Bcr-Abl inhibitors AMN107 and BMS-354825 against clinically relevant imatinib-resistant Abl kinase domain mutants. Cancer Res. 2005, 65, 4500–4505.
217. Weisberg, E.; Manley, P. W.; Breitenstein, W.; Brüggen, J.; Cowan-Jacob, S. W.; Ray, A.; Huntly, B.; Fabbro, D.; Fendrich, G.; Hall-Meyers, E.; Kung, A. L.; Mestan, J.; Daley, G. Q.; Callahan, L.; Catley, L.; Cavazza, C.; Azam, M.; Neuberg, D.; Wright, R. D.; Gilliland, D. G.; Griffin, J. D. Characterization of AMN107, a selective inhibitor of native and mutant Bcr-Abl. Cancer Cell 2005, 7, 129–141.
218. See section on ‘Systems Biology and Biological Networks’ for further discussion.
219. Valencia-Sanchez, M. A.; Liu, J.; Hannon, G. J.; Parker, R. Control of translation and mRNA degradation by miRNAs and siRNAs. Genes Dev. 2006, 20, 515–524.
220. Rokah, O. H.; Granot, G.; Ovcharenko, A.; Modai, S.; Pasmanik-Chor, M.; Toren, A.; Shomron, N.; Shpilberg, O. Down regulation of Mir-31, Mir-155, Mir-564 in chronic myeloid leukemia. PLoS ONE 2012, 7, e35501; DOI: 10.1371/journal.pone.0035501.
221. The labels of the corresponding gene products are as follows: ABCG2 = ATP-binding Cassette subfamily G member 2 (a.k.a.
BCRP – Breast Cancer Resistance Protein); ABL1 = Abelson murine leukemia viral oncogene homolog 1; ABL2 = Abelson related gene; BCR = Breakpoint cluster region; LCK = Lymphocyte-specific protein kinase; LYN = Member of Src family of protein Tyr kinases; KIT = Mast/stem cell growth factor receptor kit; PDGFRA = Platelet-derived growth factor receptor (α subunit); PDGFRB = Platelet-derived growth factor receptor (β subunit); STAT5A = Signal transducer and activator of transcription 5A. Green lines denote proteins that interact with imatinib; blue lines indicate protein-protein interactions. Not all of the interactions are of equal quality and some have been obtained from computational models. In addition, a drug-target interaction based solely on a binding constant, even a highly reliable one, is not prima facie evidence that a given drug also induces some type of biologically-relevant activity. The section on ‘Drug-Target Interactions’ provides additional discussion on drug-target databases. See also Figure 4: 1. 140


224. Bailey, J. E. Lessons from metabolic engineering for functional genomics and drug discovery. Nat. Biotechnol. 1999, 17, 617–618.
225. Cascante, M.; Boros, L. G.; Comin-Anduix, B.; de Atauri, P.; Centelles, J. J.; Lee, P. W. Metabolic control analysis in drug discovery and disease. Nat. Biotechnol. 2002, 20, 243–249.
226. Hellerstein, M. K. A critique of the molecular target-based drug discovery paradigm based on principles of metabolic control: advantages of pathway-based discovery. Metab. Eng. 2007, 10, 1–10.
227. Hellerstein, M. K. Exploiting complexity and the robustness of network architecture for drug discovery. Perspect. Pharmacol. 2008, 325, 1–9.
228. For an excellent example see Cascante, M.; Boros, L. G.; Comin-Anduix, B.; de Atauri, P.; Centelles, J. J.; Lee, P. W.-N. Metabolic control analysis in drug discovery and disease. Nat. Biotechnol. 2002, 20, 243–249.
229. Kotz, J. Phenotypic screening, take two. SciBX 2012, 5, DOI: 10.1038/scibx.2012.380.
230. Lee, J. A.; Uhlik, M. T.; Moxham, C. M.; Tomandl, D.; Sall, D. J. Modern phenotypic drug discovery is a viable neoclassic pharma strategy. J. Med. Chem. 2012, 55, 4527–4538.
231. Eggert, U. S. The why and how of phenotypic small-molecule screens. Nat. Chem. Biol. 2013, 9, 206–209.
232. Bishop, T.; Sham, P. Analysis of Multifactorial Diseases; BIOS Scientific Publishers: Oxford, U.K., 2000.
233. See section on ‘Target-Based Drug Discovery’ for details.
234. Lang, P. SAR by HCS. Nat. Chem. Biol. 2008, 4, 18–19.
235. Young, D. W.; Bender, A.; Hoyt, J.; McWhinnie, E.; Chirn, G. W.; Tao, C. Y.; Tallarico, J. A.; Labow, M.; Jenkins, J. L.; Mitchison, T. J.; Feng, Y. Integrating high-content screening and ligand-target prediction to identify mechanism of action. Nat. Chem. Biol. 2008, 4, 59–68.
236. Harrison, C. High content screening – integrating information. Nat. Rev. Drug Discovery 2008, 7, 121.
237. Lang, P.; Yeow, K.; Nichols, A.; Scheer, A. Cellular imaging in drug discovery. Nat. Rev. Drug Discovery 2006, 5, 343–356.
238. Terstappen, G. C.; Schlüpen; Raggiaschi, R.; Gaviraghi, G. Target deconvolution strategies in drug discovery. Nat. Rev. Drug Discovery 2007, 6, 891–903.
239. Cho, Y. S.; Kwon, H. J. Identification and validation of bioactive small molecule target through phenotypic screening. Bioorg. Med. Chem. 2012, 15, 1922–1928.
240. Cho, Y. S.; Kwon, H. J. Identification and validation of bioactive small molecule target through phenotypic screening. Bioorg. Med. Chem. 2012, 15, 1922–1928.
241. Futamura, Y.; Muroi, M.; Osada, H. Target identification of small molecules based on chemical biology approaches. Mol. Biosyst. 2013, 9, 897–914.
242. Lee, J.; Bogyo, M. Target deconvolution techniques in modern phenotypic profiling. Curr. Opin. Chem. Biol. 2013, 17, 118–126.
243. Lee, J.; Bogyo, M. Target deconvolution techniques in modern phenotypic profiling. Curr. Opin. Chem. Biol. 2013, 17, 118–126.


244. Wei, X.; Hoffman, A. F.; Hamilton, S. M.; Xiang, Q.; He, Y.; So, W. V.; So, S. S.; Mark, D. A simple statistical test to infer the causality of target/phenotype correlation from small molecule phenotypic screens. Bioinformatics 2012, 28, 301–305.
245. Eder, J.; Sedrani, R.; Wiesmann, C. The discovery of first-in-class drugs: origins and evolution. Nat. Rev. Drug Discovery 2014, 13, 577–586.
246. Swinney, D. C. Phenotypic vs. target-based drug discovery for first-in-class medicines. Clin. Pharm. Therapeut. 2013, 93, 299–301.
247. As discussed in the section on ‘Target-Based Drug Discovery’, the single-target approach tends to yield fewer hits in initial HTS campaigns than phenotypic screens, since the latter contain multiple, latent targets, many of which may exhibit activity towards the screened compounds.
248. Both target-based and phenotype-based procedures generally proceed along the indicated lines, but the processes can vary from project to project; cf. the section on ‘Imatinib – A Prototypical Example of Target-Based Drug Discovery’.
249. Mitchell, M. Complexity: A Guided Tour; Oxford University Press: Oxford, U.K., 2009.
250. Page, S. E. Diversity and Complexity; Princeton University Press: Princeton, NJ, 2011.
251. Miller, J. H.; Page, S. E. Complex Adaptive Systems: An Introduction to Computational Models of Social Life; Princeton University Press: Princeton, NJ, 2007.
252. Since CASs are open systems, they need not obey the Second Law of Thermodynamics, which applies to closed systems. Because of this, entropy is not required to continually increase in time, and hence an open system need not move inexorably towards disorder.
253. Poston, T.; Stewart, I. Catastrophe Theory and its Applications; Pitman Publishing Limited: London, U.K., 1981.
254. Holland, J. H. Emergence – From Chaos to Order; Basic Books, Perseus Books Group: New York, NY, 1998.
255. https://en.wikipedia.org/wiki/emergence (accessed August 23, 2015).
