Chem. Res. Toxicol. 2006, 19, 1393-1401


Perspective

Future of Toxicology - Mechanisms of Toxicity and Drug Safety: Where Do We Go from Here?

James L. Stevens

Toxicology and Drug Disposition, Lilly Research Laboratories, 2001 West Main Street, Greenfield, Indiana 46140

Received August 27, 2006

Recent high-profile drug withdrawals increase the pressure on regulators and the pharmaceutical industry to improve preclinical safety testing. Understanding mechanisms of drug toxicity is an essential step toward improving drug safety testing by providing the basis for mechanism-based risk assessments. Nonetheless, despite several decades of research on mechanisms of drug-induced toxicity and the application of various new technologies to preclinical safety assessment, the overall impact on preclinical safety testing has been modest. Assessing the risk of exposing humans to new drug candidates still depends on preclinical testing in animals, which in many, but not all cases, predicts outcomes in humans accurately. The following offers a perspective on the challenges and opportunities facing efforts to improve preclinical safety testing and outlines gaps and needs that must be addressed. A case is built for focusing solutions on defined problems within the current safety testing paradigm rather than imposing wholesale change. Targets for application of new technologies, including in silico screening, biomarkers, surrogate assays and 'omic technologies, are outlined. Improving drug safety testing will depend on improving the application of mechanism-based risk assessment but will also require improving public and private collaborations in order to focus research regarding the mechanism of drug-induced toxicity on the most important problems.

Contents

1. Introduction
2. Background: Pharmaceutical Safety Assessment
   2.1. Target Validation and Lead Generation
   2.2. Lead Optimization
   2.3. Preclinical Safety Assessment: From Clinical Candidate to Registration
3. Opportunities, Gaps, and Needs: Improving Preclinical Safety Testing
   3.1. Knowledge Management: How, When, and Why Does Preclinical Safety Assessment Fail?
      3.1.1. Opportunities, Gaps, and Needs
   3.2. In Silico Approaches to Drug Safety Assessment
      3.2.1. Opportunities, Gaps, and Needs
   3.3. Surrogate Models for Safety Assessment
      3.3.1. Cell- and Tissue-Based Surrogate Assays
      3.3.2. Surrogate Animal Models
      3.3.3. Opportunities, Gaps, and Needs
   3.4. Biomarkers
      3.4.1. Opportunities, Gaps, and Needs
   3.5. Systems Toxicology and Pharmacogenetics: Making Sense of High Content Information
      3.5.1. Opportunities, Gaps, and Needs
4. Summary: Improving Mechanism-Based Risk Assessment



1. Introduction

Despite decades of research, adverse drug reactions (ADRs) remain a significant problem (1). Recent drug withdrawals increase concern at a time when pharmaceutical industry productivity is nearly flat (2). A Food and Drug Administration (FDA) report (3) suggests that decreased pipeline productivity creates a risk that "the biomedical revolution will not deliver on its promise of better health" and that there is "...an...urgent...need for applying technologies such as genomics, proteomics, bioinformatics systems, and new imaging technologies...to detect safety problems early..." Others point out that "...there is little evidence that [new technologies] have had a major or direct impact on the safety assessment which supports...first studies of new drugs in humans" (4). To improve drug safety and deliver better health outcomes to patients, it is critical for toxicology to advance our understanding of mechanisms of ADRs, apply new technologies to the practical challenge of predicting drug safety, and improve mechanism-based risk assessment to ensure the safe use of new drugs in man.

Herein, I offer a perspective on the current state and future needs for mechanism-based risk assessment in drug safety. Comments are limited to the development of small-molecule therapeutics. A brief summary of the preclinical safety assessment paradigm is followed by suggestions for future research directions. The opinions expressed herein are mine alone, and I apologize in advance for any significant omissions.


2. Background: Pharmaceutical Safety Assessment



Achieving an acceptable efficacy and safety profile for a new drug is a complex process, requiring optimization of many variables within a single chemical structure (5). From a safety perspective, the complexity lies largely in the quality and interpretation of data from a discrete series of studies that define the preclinical safety assessment paradigm (6). The details and regulatory expectations are described elsewhere (http://www.fda.gov/cder/guidance/; http://www.ich.org/); the steps are summarized in Figure 1.




Figure 1. Overview of preclinical safety assessment. The figure shows a schematic representation of the drug discovery and development process from selection of a target to registration of a new drug entity (adapted from ref 14). Abbreviations used include the following: IND, investigational new drug application; NDA, new drug application; CE, candidate evaluation; CIB, clinical investigator brochure; MRSD, maximum recommended starting dose; and SAR, structure-activity relationship.


2.1. Target Validation and Lead Generation

Target selection can be based on prior clinical data with prototypical first-in-class molecules or mechanisms of disease pathogenesis, or it can be inferred indirectly from genetic data in humans or preclinical species. Proactive safety assessment at this stage is largely an in cerebro and/or in silico exercise. Safety issues inherent in modulating a target can be anticipated from existing drug precedents; for example, agonists for peroxisome proliferator-activated receptors (PPAR) might be anticipated to be tumorigenic, increase heart weight, and produce plasma volume expansion in preclinical studies (7). For novel targets, safety concerns must be inferred from literature on genetic studies in humans and lower organisms or by mining pathways involved in a disease process.

At the target-to-lead stage, the focus of safety assessment should be on strategies to assess any anticipated safety issues. In silico analysis of chemical structure for safety end points, such as mutagenicity, is used prior to chemical synthesis (8). Prototypical small molecules or proteins (e.g., antibodies), even if not sufficiently "druglike" for clinical use, can be useful tools to investigate safety issues that may emerge during lead optimization. Pharmacodynamic (PD)-based biomarkers of efficacy, such as antibodies that recognize post-translational modifications associated with target modulation (for example, phosphorylation, ubiquitination, and acetylation states of proteins), help link dose-response relationships for target modulation with organ toxicity.

If the target in man has multiple forms, verifying the expression of similar isozymes and selectivity in preclinical animal models is important. Mapping the tissue distribution of the drug target is useful to determine if toxicity tracks closely with target expression. Knockout mice can also highlight the impact of target inhibition in normal tissues and in disease pathogenesis. In general, multiple approaches are required to decipher the mechanism of target organ toxicity; having the right tools and an effective risk management strategy are the keys to success.

A recent example of this complexity is IKKβ, a drug target implicated in diseases including cancer, asthma, bone loss, arthritis, and diabetes (9, 10). IKKβ phosphorylates IκB, which is then degraded, releasing NF-κB to activate genes linked to apoptosis, inflammation, and, perhaps, cell division. Knockout mice deficient in IKKβ die in utero due to hepatocellular apoptosis and liver degeneration (11). At first glance, the genetic data suggest that inhibiting IKKβ might result in liver injury, but targeted deletion of IKKβ in the liver did not result in hepatocyte apoptosis or embryo lethality in utero (12). The risk that these observations represent to humans is not clear since there are few published data from clinical trials with IKKβ inhibitors (9, 10). Nonetheless, reports of clinical trials suggest that margins of safety (MOS) sufficient to allow testing in humans have been achieved. Additional data will be forthcoming in this rapidly developing field. IKKβ is an important drug target; publication of preclinical and clinical data will clarify whether or not the benefits warrant any risks.

As toxicities emerge during development, one of the most important questions to address is whether the mechanism is related to the intended pharmacology (on-target toxicity) or to the chemical structure itself (off-target toxicity).



Figure 2. Dose-response relationships and on- vs off-target toxicity. A schematic representation (adapted from ref 14) of the relationships between the no observed adverse effect level (NOAEL) marking the threshold for toxicity (T) and an efficacy dose (E) producing 80% of the maximum biological response (ED80). Note that the ED80 occurs at about 30-40% target modulation on the dose-response curve. On-target toxicity, on the left-hand side, is shown for two drugs that differ in the MOS based on the slope and maximum response values of the two drugs. Off-target toxicity, on the right-hand side, is depicted by two separate dose-response curves, one for efficacy (solid line) and one for toxicity (dotted line); the MOS is defined by the difference between the ED80 and the NOAEL from the two separate curves.

Differentiating between these two potential mechanisms is crucial to designing strategies to increase safety; understanding the basic biology and pharmacology of the target is the first step. Drug development is a high-risk business. As Pasteur noted, "Chance favors the well-prepared mind." When resolving safety issues in drug development, chance favors the well-prepared plan, and the plan for assessing safety later in development starts at target selection.

2.2. Lead Optimization

During lead optimization, the characteristics of a "good drug" are built into potential clinical candidates (5). From a safety perspective, the goal is to identify the dose-limiting toxicities and the most sensitive species and to set doses for later definitive safety studies. Potential candidates are often screened for safety using pilot repeat-dose studies (13). Toxicity with one molecule can be easily circumvented by testing another, but recurring toxicity within the lead series can stall progress. Without understanding the mechanism and its relevance to humans, abandoning a novel target or lead series based on toxicity in short-term studies may be premature; however, to proceed, a strategy to design safer compounds must be in place.

The MOS is defined in preclinical models by the separation between the parameters defining efficacy and safety; in general, these are based on exposure rather than administered dose (14). Two things are required to widen the MOS during lead optimization: knowledge of mechanism (on- vs off-target) and screening tools to inform the structure-activity relationship (SAR). If the mechanisms cannot be differentiated and a screen cannot be implemented, finding better compounds may be reduced to empirical screening rather than rational drug design. Establishing the PD relationships among exposure (Cmax and area under the curve), target inhibition, efficacy, and toxicity in vivo is a first step to differentiating mechanisms. Testing compounds in knockout mice can also help differentiate on- vs off-target contributions if the pathology is replicated in the mouse. Inactive compounds within the series can be used to differentiate on-target toxicity from off-target mechanisms related to the chemical scaffold.

If an on- vs off-target mechanism is established, the strategies to increase the MOS will differ. As noted on the left in Figure 2, with an on-target mechanism, the boundaries for increasing the MOS are confined by the dose-response curve. Designing partial agonists/antagonists or less potent molecules with a "shallow" slope to the dose-response curve may help. Defining the human relevance of any pharmacologically linked toxicity is also critical. If the toxicity is due to off-target effects, the SAR can be guided away from offending chemical features linked to the dose-limiting toxicity by shifting the dose-response curve for toxicity to higher doses (Figure 2, right panel). In either case, a mechanism-based screen reduces the time and amount of compound necessary to find an optimized drug candidate. Establishing the mechanism and validating the assay format are the primary goals of mechanistic investigations during lead optimization.
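To make the relationships sketched in Figure 2 concrete, the following minimal Python sketch computes an ED80 and a NOAEL from two sigmoidal dose-response curves and reports the resulting MOS for the off-target case. The curve parameters, dose units, and the adverse-effect threshold are entirely illustrative assumptions, not values from the text.

    import numpy as np

    def hill(dose, ed50, slope, emax=100.0):
        """Percent of maximal response for a simple sigmoidal dose-response curve."""
        return emax * dose**slope / (ed50**slope + dose**slope)

    doses = np.logspace(-2, 3, 5000)                 # hypothetical dose range (arbitrary units)

    # Off-target scenario (Figure 2, right): efficacy and toxicity follow separate curves.
    efficacy = hill(doses, ed50=1.0, slope=1.2)      # assumed efficacy curve
    toxicity = hill(doses, ed50=60.0, slope=2.5)     # assumed off-target toxicity curve

    ed80 = doses[np.argmax(efficacy >= 80.0)]        # lowest dose giving ~80% of maximal efficacy
    noael = doses[toxicity < 5.0].max()              # highest dose below an assumed adverse threshold
    print(f"ED80 ~ {ed80:.2f}, NOAEL ~ {noael:.1f}, MOS ~ {noael / ed80:.0f}-fold")

For an on-target toxicity (Figure 2, left), efficacy and toxicity lie on the same curve, so widening the MOS means reshaping that single curve, for example through a shallower slope or partial agonism, rather than separating two curves.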

An example where strategies to optimize and manage an on-target toxicity are in place is hemorrhage associated with anticoagulant therapy (references in ref 15). Species differences in coagulation parameters are easily measured with standard biochemical assays. When blood levels of anticoagulant are maintained in an acceptable range, a therapeutic end point is achieved (reduced clot formation without excessive bleeding), allowing careful use of anticoagulants in the clinic. Anticoagulants are not ideal drugs, but clinical management of dose-limiting bleeding episodes is common.

An example where the lack of mechanistic information hampers progress is the development of selective serotonin (5-hydroxytryptamine, 5-HT) receptor agonists, specifically 5-HT2c agonists, for obesity (16). Valvulopathies occur in humans taking a combination of fenfluramine and phentermine for weight loss (17). Despite literature reports of preclinical models and suggestions that the 5-HT2B receptor subtype is responsible (18), the relationship between a particular 5-HT receptor and valvulopathy in patients has not been proven. Difficulty in clinical monitoring and the fact that the condition may not be reversible if the drug is withdrawn add to concerns. Because alternate therapies, such as diet and exercise, are available, the unknown mechanism and high level of concern leave a difficult path to testing this clinical hypothesis for the treatment of obesity.

2.3. Preclinical Safety Assessment: From Clinical Candidate to Registration

Once a single clinical candidate is selected, the chemical die is cast, and the biological properties are locked in the structure. The emphasis for preclinical testing shifts from candidate identification to candidate evaluation (Figure 1); the focus of mechanism-based risk assessment shifts as well, to the patient. Before submitting an investigational new drug application (IND), safety is assessed in rodent and nonrodent species according to regulatory expectations (6, 14). A primary goal of these safety studies is to identify a maximum recommended starting dose (MRSD) for the clinic (19). A maximum tolerated dose defines the target organs at the highest dose without mortality. The no observed adverse effect level (NOAEL) defines the highest dose at which adverse effects are not seen preclinically (14). A human equivalent dose (HED), extrapolated from the NOAEL and divided by a safety factor, establishes the MRSD, the point where dose escalation begins in the clinic and a key parameter in setting the dose range used to test the clinical hypothesis.

Mechanism-based risk assessment impacts the MRSD and the safety factor used to set the MRSD in three ways: (i) understanding mechanisms of species sensitivity, (ii) establishing methods to monitor the risk, and (iii) defining the reversibility of target organ toxicity. In the absence of data on human relevance, the default position is to estimate the MRSD using the NOAEL from the most sensitive preclinical test species.


However, if the mechanism of toxicity in the most sensitive species is not relevant to humans, the context can change. For example, the renal toxicity of efavirenz, a reverse transcriptase inhibitor, in rats is not relevant to humans due to differences in routes of metabolism (20). This was a critical decision point in development and illustrates an effective use of mechanistic studies (Miwa, G. Personal communication). Clinical monitoring and reversibility are also critical. If the toxicity can be monitored with a clinical biomarker and is reversible in preclinical species, the default MRSD for phase I trials may be 1/10 of the HED, based on a safety factor of 10 (19). If toxicity is severe, not easily monitored, or irreversible, the allowable MRSD may be much lower. Limiting the clinical dosing range due to preclinical safety concerns can be a major cause of trial failure.
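As a rough numerical illustration of the default calculation described above, the sketch below converts an animal NOAEL to an HED using body-surface-area (km) conversion factors of the kind tabulated in the FDA starting-dose guidance (19) and then applies a 10-fold safety factor. The NOAEL value and the specific km factors shown are illustrative assumptions, not values taken from the text.

    def mrsd_from_noael(noael_mg_per_kg, animal_km, human_km=37.0, safety_factor=10.0):
        """Estimate a maximum recommended starting dose (mg/kg) from an animal NOAEL.

        HED = NOAEL * (animal_km / human_km); MRSD = HED / safety_factor.
        km values are body-surface-area conversion factors (e.g., rat ~6, dog ~20, human ~37).
        """
        hed = noael_mg_per_kg * (animal_km / human_km)
        return hed / safety_factor

    # Hypothetical example: a rat NOAEL of 50 mg/kg with the default 10-fold safety factor.
    print(mrsd_from_noael(50.0, animal_km=6.0))   # ~0.81 mg/kg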

3. Opportunities, Gaps, and Needs: Improving Preclinical Safety Testing

Understanding the current practice of preclinical safety assessment creates the context for three areas that toxicology must address in the future: (i) improving safety predictions for man, (ii) addressing the "pipeline problem" caused by high attrition due to preclinical toxicology, and (iii) defining mechanisms of toxicity. Suggestions regarding opportunities, gaps, and needs are offered below. A first step is the effective use of knowledge management to highlight opportunities for improvement.

3.1. Knowledge Management: How, When, and Why Does Preclinical Safety Assessment Fail?

Managing the current knowledge base on drug safety, e.g., literature, drug safety data, etc., to define the opportunities for improvement is critical. Three related questions need to be addressed: (i) How often do failures occur? (ii) When do failures occur? (iii) Why do failures occur? Recent literature addresses these three questions to some extent.

To understand how often safety predictions fail, Lazarou et al. (21) estimated the frequency of ADRs based on the incidence in hospitalized patients in the United States. In 1994, approximately 2 million ADRs and 106,000 fatalities occurred in hospitalized patients, incidences of 6.7% and 0.32%, respectively. This may be an overestimate, since the incidence of underlying disease and of multiple drug treatments, both of which can contribute to ADR frequency (22, 23), is probably higher in hospitalized patients than in the general population. Nonetheless, this study outlines a clear need to improve human safety; however, from a purely technical point of view, a failure rate of roughly 6-7% suggests that safety assessment works reasonably well. Therefore, targeting specific issues, rather than revamping the entire process, is more likely to tighten the safety net.

Now, the second question: When do failures to detect human risk first occur? A retrospective analysis of clinical and preclinical data suggests that 70% of ADRs were preceded by findings during preclinical testing and that the first observations were in studies of 30 days or less (24), i.e., the studies that support the first dose in humans (Figure 1). Internal industry data agree, since nearly 80% of clinical candidate attrition due to preclinical safety occurs prior to the first human dose (data not shown). Therefore, to improve risk assessment and reduce pipeline attrition, it makes sense to focus technical solutions at or before this pivotal decision point in drug development.

Having considered how and when, why does preclinical safety assessment fail to detect some target organ toxicity?


This is difficult to address since most preclinical safety data are not publicly available. Nonetheless, a recent reanalysis of available data suggests that target organ toxicity for cardiovascular, hematopoietic, and gastrointestinal toxicities is predicted with an efficiency of ~80%; for the urinary tract, it is predicted at about 70%, followed by hepatic, skin, and neurological toxicities at 50% or less (4). The same studies point out that when other species predict human toxicity, 90% of the issues are identified by current practices. This suggests that target organ toxicity not replicated by preclinical species contributes disproportionately to failure and should be a focus for improvement.

3.1.1. Opportunities, Gaps, and Needs

Although there are a number of public and private knowledge bases available (8, 25, 26) and the landscape is improving, the targets for improvement outlined above are based on incomplete safety data. Improving the content of preclinical safety knowledge databases is essential to more precisely define how new approaches and animal models can provide solutions to specific problems in two ways: first, by directing technical solutions toward pathologies not modeled by preclinical test species, and second, by focusing improvements on the most problematic target organ toxicities. Improved information sharing will ensure that knowledge management experiments identify the most important causes of attrition and key gaps in predicting human safety. This is an iterative process, since the Bayesian nature of biology implies that new knowledge will reshape our interpretation of historical data (27).

This incomplete knowledge base also contributes to the apparent failure of new technologies noted by Greaves et al. (4). In the author's opinion, this is not a failure of the technology. Rather, a combination of exaggerated expectations and a failure to address the right questions, i.e., mechanism-based hypothesis testing, with these technologies has contributed to a perception of failure. For example, should one expect to improve predictions of safety by applying a new technology to a preclinical model that is a poor surrogate for the biological outcome of interest? If species differences in basic pathophysiology hamper safety predictions for humans, the probability that a new measurement will yield improved results is low. Addressing technical gaps in information and knowledge management (28) will outline where technical solutions are needed to target the right problem with testable hypotheses in appropriate models. This discussion frames specific opportunities for improving mechanism-based risk assessment in preclinical drug safety testing.

3.2. In Silico Approaches to Drug Safety Assessment

Avoiding an offending chemical feature prior to synthesis is the simplest way to avoid toxicity. Recent reviews outline the state-of-the-art application of in silico models to predicting toxicity (8, 28, 29), including the systems approaches addressed later. At the earliest stages, e.g., target-to-lead (Figure 1), in silico models offer the opportunity to interrogate chemical space prior to compound synthesis (13, 30). Either local or global in silico models can be applied to predict the behavior of virtual libraries (31). As synthesis progresses, local models, built rapidly in real time as a lead series expands, allow modeling of accumulating biological or biochemical data. Global models based on large training sets drawn from databases can be used to profile the properties of any chemical structure against an end point (26). Current applications for predicting individual toxicology issues in silico have been summarized (8), as have recent efforts to model carcinogenic risk (8, 29).
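To illustrate what such a "local" model might look like in practice, the following sketch trains a simple classifier on descriptor data accumulated for a lead series and then profiles a virtual library before synthesis. The descriptor matrix, assay labels, and choice of a random forest are illustrative assumptions rather than a method described in the text.

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    # Hypothetical local model: rows are compounds from the current lead series,
    # columns are calculated descriptors (e.g., logP, polar surface area, etc.).
    rng = np.random.default_rng(0)
    X_train = rng.random((40, 8))                 # placeholder descriptor matrix
    y_train = rng.integers(0, 2, 40)              # placeholder assay outcomes (0 = clean, 1 = flagged)

    model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)

    # Profile a virtual library prior to synthesis and rank analogues by predicted risk.
    X_virtual = rng.random((500, 8))
    risk = model.predict_proba(X_virtual)[:, 1]
    priority = np.argsort(risk)                   # lowest-predicted-risk analogues first
    print(priority[:10])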


How a model is applied can be as important as the model itself. Given the number of parameters that must be optimized in a drug molecule, it may be unrealistic to expect even a suite of in silico models to predict the biological properties of a single clinical candidate. By analogy, the Heisenberg uncertainty principle describes limits to the accuracy with which even two parameters, the position and momentum of a particle, can be measured simultaneously. A more practical approach is to "sieve" the many parameters that describe druggable chemical space in silico into "bins" of predicted biological and biopharmaceutical properties. Biological tests on selected compounds check the accuracy of the sieve and the enrichment of the bins without the need to integrate the entire chemical space. This iterative approach can also guide an effective "fit-for-purpose" plan to define safety issues (30). Using an in silico model to look for the metaphorical needle in a haystack may be less practical than determining whether the haystack is rich in needles before one starts looking.

3.2.1. Opportunities, Gaps, and Needs

Although in silico models exist for some applications, there are many gaps where in silico models either underperform or cannot be built due to a lack of information. In addition, the application of in silico models is an information technology-intensive and model-dependent exercise (8, 26, 28). Barriers to effective implementation include (i) a lack of model compounds (training sets) that exemplify the toxicity, (ii) limited access to in vivo study data in a common format, and (iii) limited availability of relevant higher throughput assays to extend the chemical space available to the model. Training set compounds are necessary to train the model, but chemical structures and associated study data are often intellectual property and unavailable. When available, in vivo data may be fragmented and incomplete or generated in different species and strains, adding further variation (25). Structures in public databases (26) can help inform models but may have little relevance to druggable chemical space. Finally, a lack of physiologically based higher throughput surrogate assays applicable to issue-driven toxicity assessment hampers the ability to rapidly expand the chemical space of interest within the model. These, and other gaps, can limit the impact of in silico models for predicting drug toxicity. Progress using predictive 'omic databases for in silico predictions will be addressed later.

3.3. Surrogate Models for Safety Assessment

During lead optimization (Figure 1), the lead series narrows to a subset of druggable compounds (5). If unacceptable toxicity, e.g., a low MOS, emerges from pilot animal studies, defining an on-target vs off-target mechanism is critical, but assay formats that rapidly and reliably inform the SAR are also essential. In vivo studies, the current gold standard for predicting safety, are poor SAR tools since differences in absorption and disposition confound compound comparisons and throughput is low. Surrogate assays have a higher capacity, and compounds can be compared at equivalent concentrations, but results are only meaningful if connected to pathophysiology in vivo. A review of surrogate assays is beyond the scope of this discussion, but useful links offer entry to the literature (http://ecvam.jrc.it/index.htm). Illustrative examples are outlined below.

3.3.1. Cell- and Tissue-Based Surrogate Assays

A battery of surrogate assays has been deployed to detect delayed ventricular repolarization (QT prolongation) linked to the potentially fatal cardiac arrhythmia, Torsades de Pointes (32). Electrophysiological recording in cells or tissue explants and screening for binding to the human ether-a-go-go-related gene (hERG) product, the potassium channel responsible for repolarization of the cardiomyocyte, are used to screen compounds prior to collecting electrophysiological recordings in conscious animals (33). Interpreting the liability for Torsades de Pointes must be approached on a case-by-case basis; negative results in surrogate assays do not guarantee safety in vivo (32). Nonetheless, this suite of tools has moved the issue from empirical screening to rational drug design and hypothesis testing.
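As one simple illustration of how such surrogate assay data might be used to triage compounds before committing to in vivo electrophysiology, the sketch below ranks a hypothetical lead series by the ratio of hERG potency to projected free plasma exposure. The compounds, the numbers, and the 30-fold cutoff are illustrative assumptions only, not a criterion taken from the text.

    def herg_margin(herg_ic50_uM, free_cmax_uM):
        """Ratio of in vitro hERG IC50 to projected free plasma Cmax (both in uM)."""
        return herg_ic50_uM / free_cmax_uM

    # Hypothetical lead series: (hERG IC50, projected free Cmax) in uM.
    compounds = {"A": (20.0, 0.5), "B": (1.5, 0.8), "C": (30.0, 0.2)}

    # Flag compounds whose margin falls below an assumed, illustrative 30-fold cutoff.
    flagged = [name for name, (ic50, cmax) in compounds.items() if herg_margin(ic50, cmax) < 30]
    print(flagged)   # ['B'] -> candidates for follow-up electrophysiology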


Organ slices and primary cell models can be used to address a variety of target organ toxicities. Isolated hepatocytes have been used to screen for hepatotoxicity and drug-drug interactions, and screening for PPARα agonism in isolated hepatocytes is routine. Isolated hepatocytes from rodents, nonrodents, and humans are useful to screen compounds for induction of drug-metabolizing enzymes and drug-drug interactions and to establish human relevance for metabolism (34, 35). Organ slices are useful models when the tissue architecture is important for toxicity, e.g., if toxicity requires interactions between different cell types, and they provide opportunities for species comparisons (36). Regardless of the assay format, knowledge of the mechanism is necessary to link results from a screen with pathophysiology to ensure that the data will translate to an improved safety profile in vivo.

3.3.2. Surrogate Animal Models

Lower organisms with shorter life cycles are amenable to higher throughput testing. Developmental and behavioral toxicities are only two examples where disrupting complex interactions among many cell types results in toxicity. The value of surrogate animal models seems clear; for example, knowledge of apoptosis can be traced to basic mechanisms defined in Caenorhabditis elegans (37). Recent reviews suggest that zebrafish show considerable promise as surrogate animal models for cardiovascular toxicity (38), C. elegans models offer insights into neurotoxicity (39), and amphibian models have been used for developmental toxicity screening (40). Both public and private databases, too numerous to reference, offer a wealth of genetic, biochemical, and cell biology content for eukaryotic organisms from yeasts to mice.

3.3.3. Opportunities, Gaps, and Needs

Lead optimization is the final opportunity to design properties into a clinical candidate (Figure 1). There is a critical need for mechanism-based surrogate models to address specific issues during lead optimization, accelerate candidate identification, and reduce later pipeline attrition. The importance of this juncture in the development pipeline cannot be overemphasized. The safety studies supporting first human exposure are the points where safety issues for man are first noted (24); the quality of the candidates emerging from lead optimization will impact safety and pipeline productivity directly.

An important gap in implementing mechanism-based surrogate assays is a poor understanding of cellular physiology in vivo and of how physiological pathways adapt at the cellular level in vitro. A focus on developing primary cell culture models, tissue explants, and even a return to more classical isolated organs provides opportunities for species comparisons. Where primary tissues are not available, stem cells may provide populations of differentiated cells that can be used as safety screens (41). Additional work will be necessary for wider application of these technologies.


Defining the relationships among surrogates and the complex physiology of mammalian species represents an opportunity for systems biology approaches (see below). Global transcript profiling can be an effective first step to defining similarities and differences in the basic physiology of preclinical test species vs surrogate models, even in the absence of toxicant treatment. Progress has been made with hepatocyte profiling (42-44), and hepatocyte gene expression profiles have been used to build predictive toxicogenomic databases (45). Combinatorial approaches to defining conditions that support physiological functions of primary cells and technical advances in multiplexing biological measurements of cellular function in higher throughput formats are also essential (46). New sources of physiologically based in vitro models, e.g., surrogate animals and stem cells, to interrogate liabilities for specific target organ toxicities, are sorely needed. Effective deployment of batteries of assays that can be “fit-for-purpose” will allow the flexibility to address safety issues early in development (30).

3.4. Biomarkers

The need for biomarkers in preclinical safety assessment is highlighted in a follow-up report to the FDA Critical Path document (47): "New biomarker development has stalled....[P]otential new biomarkers have been proposed, but the essential work needed to evaluate their utility...has not been carried out." Preclinical studies help validate a biomarker as a surrogate or clinical end point to diagnose or predict clinical outcomes. However, preclinical application of biomarkers can also provide earlier signals of developing safety issues in vivo.

Preclinical biomarkers important for mechanism-based risk assessment include PD markers and safety markers. Safety markers can be further subdivided into premonitory and surrogate markers of injury. PD biomarkers measure the response of the target and relate drug exposure to target modulation and biological outcome, information necessary to differentiate on- vs off-target mechanisms of toxicity. Any measurement technology that monitors a process linked mechanistically to target modulation can be useful for developing PD biomarkers. Premonitory and surrogate safety biomarkers measure changes that occur prior to, or concurrent with, morphological evidence of pathology, respectively. For example, dogs exhibit focal myocardial necrosis after isoproterenol treatment (48). An increase in heart rate may be premonitory for cardiac injury, while cardiac troponins and/or creatine kinase are released coincident with the rupture of cardiomyocytes and serve as surrogate markers of cell death. Likewise, the release of alanine aminotransferase (ALT) and aspartate aminotransferase (AST) with concurrent hyperbilirubinemia may be a surrogate for hepatic injury, but increases in ALT and AST, with or without hyperbilirubinemia, may occur in the absence of morphological evidence of injury. An early signal might indicate that injury will progress, or the tissue may accommodate without an increase in severity or a collapse of the MOS. Predicting whether an early signal is a transient and reversible effect or a harbinger of progressive and irreversible injury is the essence of the predictive toxicology dilemma.

3.4.1. Opportunities, Gaps, and Needs

Mechanism-based PD and safety biomarkers allow safety issues to be monitored and species differences to be defined as a candidate approaches the clinic. Increased focus in this area represents an opportunity to improve risk assessment, relate findings in preclinical species to humans, and increase clinical safety.


Simply associating dose and time dependence with toxicity is insufficient; mechanism-based biomarkers are necessary to provide accurate risk assessment and facilitate clinical monitoring. Additional mechanism-based PD and validated premonitory safety biomarkers are critical to improving preclinical safety assessment, pipeline productivity, and monitoring of human safety.

Species differences are an important issue. Lack of cross-reactivity for reagents, such as antibodies, is a significant hurdle to applying biomarkers to preclinical safety assessment. Mining the genome and tissue expression profiles is useful for gathering candidate biomarkers, but functional annotation of the rat, dog, and monkey genomes, the typical rodent and nonrodent species used in preclinical testing, is incomplete. Species differences in the activity of small molecules or therapeutic proteins targeting important classes of receptors, such as G-protein-coupled receptors or kinases, hamper interpretation of safety data and create uncertainty in assessing on-target safety issues with biologics. Increased attention to the need for biomarkers in preclinical safety assessment is necessary to solve these problems. Streamlining validation criteria is also critical; this will require increased cooperation among academic, regulatory, and industrial scientists. In addition, a single marker will not predict multiple pathologies in a single organ (e.g., ALT and hepatotoxicity), while combinatorial biomarker approaches, although more complex and costly to develop, provide deeper insight (49) and may be an opportunity for systems toxicology approaches.

3.5. Systems Toxicology and Pharmacogenetics: Making Sense of High Content Information

Pharmacogenetics (often used interchangeably with pharmacogenomics) investigates the interactions between a patient or animal genome and drug responses (15). Systems toxicology "represents an analytical approach to the relationships among elements of a system" as applied to toxicology (28, 50). Both require integration of information from multiple sources to describe and predict the behavior of biological systems; summaries of technologies and computational models are available (8, 28, 45, 50). Two aspects will be addressed briefly: predicting toxicity using 'omic approaches and predicting individual patient safety through application of pharmacogenetic information.

Available public and commercial databases and classification algorithms (25, 28, 45, 51, 52) establish that global transcript and metabolite profiles can be used to classify toxicity from unknown compounds. It is noteworthy that the classifications appear robust despite the small numbers of structurally dissimilar compounds (e.g., ~10 for a single pathology) used to train the models (53, 54). For example, a phospholipidosis classifier was built using 12 model compounds and a signature set of fewer than 20 genes (55). This suggests that biological classifiers can be built based on the complexity inherent in the biological data relating to pathogenesis, as opposed to an extensive SAR based on similar structures. Although progress may seem slow relative to expectations, the results suggest that predictive toxicology databases can assist in safety predictions and increase knowledge of mechanism.
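To make the idea of such small-signature classifiers concrete, the sketch below trains and cross-validates a classifier on placeholder expression data for a dozen model compounds. The data, the 18-gene signature size, and the choice of logistic regression are illustrative assumptions, not the method of ref 55.

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import LeaveOneOut, cross_val_score

    # Placeholder training set: expression values for a small gene signature (columns)
    # measured after treatment with ~12 model compounds (rows), labeled 1 if the
    # compound induces the pathology of interest (e.g., phospholipidosis).
    rng = np.random.default_rng(0)
    X = rng.normal(size=(12, 18))
    y = np.array([1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0])

    clf = LogisticRegression(max_iter=1000)
    scores = cross_val_score(clf, X, y, cv=LeaveOneOut())     # leave-one-out accuracy
    print(f"Leave-one-out accuracy on placeholder data: {scores.mean():.2f}")

    # A classifier fit on the full training set can then score expression profiles
    # from new, structurally unrelated compounds.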


Although many genetic polymorphisms may contribute to disease, variability in the phenotype and genotype of the enzymes and transporters that govern the metabolism and disposition of a drug offers the greatest opportunity to understand and eliminate ADRs. The Paracelsus quotation, "All things are poison...the dosis (sic) determines that a thing is not a poison", is central to toxicology (reference in ref 37), but a modern Paracelsus, schooled in pharmacogenetics and epidemiology, might have said, "The dose determines the poison, but the patient and environment determine the dose that poisons." Because toxicity is directly related to exposure, it is not surprising that variation in metabolism and disposition contributes to drug toxicity (15). Accordingly, differences in metabolism and drug-drug interactions may contribute to ADRs from a wide variety of drugs (23).

Current preclinical safety assessment practices are designed to reduce variability and yield statistically robust results. This experimental necessity underestimates contributions from genetic variability to toxicity. Efforts to apply murine (the Mouse Phenome Project; www.jax.org) and rat (www.physiogenix.com) genetic variability to the study of drug responses may offer opportunities to improve our understanding of ADRs using preclinical models. Additional work will be necessary to establish the relevance of genetic diversity in drug responses in preclinical species and humans.

3.5.1. Opportunities, Gaps, and Needs

Recent advances in the tools and application of systems toxicology and pharmacogenetics offer the opportunity to improve preclinical safety testing. Effective information and knowledge management apply here as well but are discussed above with regard to in silico applications.

A mechanistic understanding of species differences is another key area of need. Phylogenetics defines the genetic basis for differences among species and relatedness within lineages. The fact that species differences are at the heart of preclinical safety assessment calls for a "phylogenetic systems toxicology" approach to safety assessment. Investigation of species differences must progress from descriptive to mechanism-based systems biology approaches. This requires consistent functional annotation of the various genomes and mapping of differences in physiology and pathophysiology across species, including surrogate animal models. Understanding the genetic basis of drug toxicity will require systems approaches to integrate a "physiomic" view for preclinical species and humans.

The low incidence of ADRs and the "curse of dimensionality" are barriers that also confound predicting responses for individual patients (15). The incidence of the well-recognized statin-induced rhabdomyolysis, for example, is only 3.4 per 100,000 and depends on multiple factors (23). Low-frequency ADRs with complex genetic and environmental etiologies may only appear after a drug has been on the market for some time (1); understanding genetic and environmental contributions is difficult without the clarity of 20/20 hindsight. Considerable progress will be required to improve pharmacovigilance and streamline phylogenetic systems toxicology approaches to help elucidate the roles of the environment and genetics in mechanisms of ADRs.
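To illustrate why such low-incidence ADRs tend to surface only after marketing, the following back-of-the-envelope sketch estimates how many exposed patients are needed before an event of a given incidence is likely to be observed at all; the calculation is a generic illustration, not an analysis from ref 23.

    import math

    def n_for_detection(incidence, prob=0.95):
        """Patients needed for a `prob` chance of observing at least one event of a given incidence."""
        return math.ceil(math.log(1.0 - prob) / math.log(1.0 - incidence))

    # Statin-associated rhabdomyolysis at ~3.4 per 100,000 exposed patients (ref 23):
    print(n_for_detection(3.4e-5))   # ~88,000 patients for a 95% chance of seeing a single case

A cohort of that size is far larger than a typical preclinical program or early clinical trial, which is consistent with the point that such signals often emerge only through postmarketing pharmacovigilance.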

4. Summary: Improving Mechanism-Based Risk Assessment

The preceding discussion provides a perspective and suggestions on directions for future mechanistic research on drug safety as well as practical applications to mechanism-based risk assessment. Toxicology is poised to deliver safer drugs to patients at lower cost, but technical and nontechnical challenges must be met to realize this vision. Recent advances in understanding the mechanisms of metabolic activation of drugs to reactive species (56) and the basic mechanisms of cell death (37) offer new insights, but a focus on improved understanding of basic mechanisms of drug-induced target organ toxicity in preclinical animal models and humans is also essential to improve mechanism-based risk assessment for new and existing drugs.


Progress will require both effective translation of basic research to practical applications and a clear focus on the right technical problems. A few final points frame the opportunities and challenges already enumerated.

Improving the availability of safety data to fuel public and private collaboration is essential. The NIH Roadmap and FDA Critical Path documents call for more focused application of basic science in toxicology. Toxicologists must embrace the intent of these documents and work to refine and steer the applications to the most productive outcomes. Regulatory agencies must enable the development of new technologies and processes by providing incentives to bring forward new safety data. Industry must enter into open collaborations and facilitate the use of relevant compounds and safety data to test novel methods, improve performance, and measure progress in safety assessment. All sectors must participate in educating the public about the risk:benefit considerations for any drug (57).

The basic science of toxicology must advance to meet these challenges. As noted by Liebler (58), major advances often come from outside the field of toxicology. Changes to study section structure at the NIH raise fears in the academic community that the focus of funding for research in toxicology will be diffused across interdisciplinary study sections. Although there is risk, opportunity and change require risk taking; toxicologists must adapt and compete based on the clinical impact and application of their research.

The high-content nature of systems approaches offers great opportunity but also presents significant barriers that must be addressed. Systems approaches as currently applied often describe what "might happen" or what "did happen" but do not predict the probability of an outcome. Risk assessment is neither a purely qualitative nor a retrospective exercise. Risk assessment, as practiced in pharmaceutical development, attempts to quantify risk (e.g., the MOS) in advance to enable good risk:benefit decisions regarding therapeutic outcomes and patient safety. Systems approaches are complex but must, nonetheless, achieve a level of quantitation and develop capabilities for intuitive visualization of results to allow machine learning to be incorporated into human learning and decision making. Understanding the mechanism is a key to defining these relationships.

To comment on this or other Future of Toxicology perspectives, please visit our Perspectives Open Forum at http://pubs.acs.org/journals/crtoec/openforum.

Acknowledgment. I thank Drs. Lorrene Buckley, Myrtle Davis, Michael Dorato, Thomas Jones, Derek Leishman, Armen Tashjian, Craig Thomas, John Vahle, David Watson, and Daniel Wierda for critical comments and Dr. Gerald Miwa for personal communications.

References

(1) Lasser, K. E., Allen, P. D., Woolhandler, S. J., Himmelstein, D. U., Wolfe, S. M., and Bor, D. H. (2002) Timing of new black box warnings and withdrawals for prescription medications. J. Am. Med. Assoc. 287 (17), 2215-2220.
(2) Kola, I., and Landis, J. (2004) Can the pharmaceutical industry reduce attrition rates? Nat. Rev. Drug Discovery 3 (8), 711-715.
(3) FDA (2004) Innovation and Stagnation: Challenges and opportunities on the critical path to new medical products. http://www.fda.gov/oc/initiatives/criticalpath/whitepaper.html.
(4) Greaves, P., Williams, A., and Eve, M. (2004) First dose of potential new medicines to humans: How animals help. Nat. Rev. Drug Discovery 3 (3), 226-236.
(5) Lipinski, C., and Hopkins, A. (2004) Navigating chemical space for biology and medicine. Nature 432 (7019), 855-861.
(6) Dorato, M. A., and Vodicnik, M. J. (2001) The toxicological assessment of pharmaceutical and biotechnology products. In Principles and Methods of Toxicology (Hayes, W. A., Ed.) pp 243-283, Taylor and Francis, Philadelphia.
(7) El-Hage, J. (2004) Preclinical and Clinical Safety Assessment for PPAR Agonists. http://www.fda.gov/CDER/present/DIA2004/Elhage.ppt.
(8) Ekins, S., Nikolsky, Y., and Nikolskaya, T. (2005) Techniques: Application of systems biology to absorption, distribution, metabolism, excretion and toxicity. Trends Pharmacol. Sci. 26 (4), 202-209.
(9) Karin, M., Yamamoto, Y., and Wang, Q. M. (2004) The IKK NF-kappa B system: A treasure trove for drug development. Nat. Rev. Drug Discovery 3 (1), 17-26.
(10) O'Neill, L. A. J. (2006) Targeting signal transduction as a strategy to treat inflammatory disease. Nat. Rev. Drug Discovery 5 (7), 549-563.
(11) Li, Z. W., Chu, W., Hu, Y., Delhase, M., Deerinck, T., Ellisman, M., Johnson, R., and Karin, M. (1999) The IKKbeta subunit of IkappaB kinase (IKK) is essential for nuclear factor kappaB activation and prevention of apoptosis. J. Exp. Med. 189 (11), 1839-1845.
(12) Arkan, M. C., Hevener, A. L., Greten, F. R., Maeda, S., Li, Z. W., Long, J. M., Wynshaw-Boris, A., Poli, G., Olefsky, J., and Karin, M. (2005) IKK-beta links inflammation to obesity-induced insulin resistance. Nat. Med. 11 (2), 191-198.
(13) Sasseville, V. G., Lane, J. H., Kadambi, V. J., Bouchard, P., Lee, F. W., Balani, S. K., Miwa, G. T., Smith, P. F., and Alden, C. L. (2004) Testing paradigm for prediction of development-limiting barriers and human drug toxicity. Chem.-Biol. Interact. 150 (1), 9-25.
(14) Dorato, M. A., and Engelhardt, J. A. (2005) The no-observed-adverse-effect-level in drug safety evaluations: Use, issues, and definition(s). Regul. Toxicol. Pharmacol. 42 (3), 265-274.
(15) Wilke, R. A., Reif, D. M., and Moore, J. H. (2005) Combinatorial pharmacogenetics. Nat. Rev. Drug Discovery 4 (11), 911-918.
(16) Halford, J. C. (2006) Obesity drugs in clinical development. Curr. Opin. Invest. Drugs 7 (4), 312-318.
(17) Gardin, J. M., Schumacher, D., Constantine, G., Davis, K. D., Leung, C., and Reid, C. L. (2000) Valvular abnormalities and cardiovascular status following exposure to dexfenfluramine or phentermine/fenfluramine. J. Am. Med. Assoc. 283 (13), 1703-1709.
(18) Mekontso-Dessap, A., Brouri, F., Pascal, O., Lechat, P., Hanoun, N., Lanfumey, L., Seif, I., Haiem-Sigaux, N., Kirsch, M., Hamon, M., Adnot, S., and Eddahibi, S. (2006) Deficiency of the 5-hydroxytryptamine transporter gene leads to cardiac fibrosis and valvulopathy in mice. Circulation 113 (1), 81-89.
(19) FDA (2005) Estimating the maximum safe starting dose in initial clinical trials for therapeutics in adult healthy volunteers. http://www.fda.gov/CDER/guidance/5541fnl.doc.
(20) Mutlib, A. E., Gerson, R. J., Meunier, P. C., Haley, P. J., Chen, H., Gan, L. S., Davies, M. H., Gemzik, B., Christ, D. D., Krahn, D. F., Markwalder, J. A., Seitz, S. P., Robertson, R. T., and Miwa, G. T. (2000) The species-dependent metabolism of efavirenz produces a nephrotoxic glutathione conjugate in rats. Toxicol. Appl. Pharmacol. 169 (1), 102-113.
(21) Lazarou, J., Pomeranz, B. H., and Corey, P. N. (1998) Incidence of adverse drug reactions in hospitalized patients: A meta-analysis of prospective studies. J. Am. Med. Assoc. 279 (15), 1200-1205.
(22) Uetrecht, J. P. (1997) Current trends in drug-induced autoimmunity. Toxicology 119 (1), 37-43.
(23) Law, M., and Rudnicka, A. R. (2006) Statin safety: A systematic review. Am. J. Cardiol. 97 (8A), 52C-60C.
(24) Olson, H., Betton, G., Robinson, D., Thomas, K., Monro, A., Kolaja, G., Lilly, P., Sanders, J., Sipes, G., Bracken, W., Dorato, M., Van Deun, K., Smith, P., Berger, B., and Heller, A. (2000) Concordance of the toxicity of pharmaceuticals in humans and in animals. Regul. Toxicol. Pharmacol. 32 (1), 56-67.
(25) Fostel, J., Choi, D., Zwickl, C., Morrison, N., Rashid, A., Hasan, A., Bao, W., Richard, A., Tong, W., Bushel, P. R., Brown, R., Bruno, M., Cunningham, M. L., Dix, D., Eastin, W., Frade, C., Garcia, A., Heinloth, A., Irwin, R., Madenspacher, J., Merrick, B. A., Papoian, T., Paules, R., Rocca-Serra, P., Sansone, A. S., Stevens, J., Tomer, K., Yang, C., and Waters, M. (2005) Chemical effects in biological systems-data dictionary (CEBS-DD): A compendium of terms for the capture and integration of biological study design description, conventional phenotypes, and 'omics data. Toxicol. Sci. 88 (2), 585-601.
(26) Yang, C., Benz, R. D., and Cheeseman, M. A. (2006) Landscape of current toxicity databases and database standards. Curr. Opin. Drug Discovery Dev. 9 (1), 124-133.
(27) Eddy, S. R. (2004) What is Bayesian statistics? Nat. Biotechnol. 22 (9), 1177-1178.
(28) Waters, M. D., and Fostel, J. M. (2004) Toxicogenomics and systems toxicology: Aims and prospects. Nat. Rev. Genet. 5 (12), 936-948.
(29) Contrera, J. F., Maclaughlin, P., Hall, L. H., and Kier, L. B. (2005) QSAR modeling of carcinogenic risk using discriminant analysis and topological molecular descriptors. Curr. Drug Discovery Technol. 2 (2), 55-67.
(30) Mayne, J. T., Ku, W. W., and Kennedy, S. P. (2006) Informed toxicity assessment in drug discovery: Systems-based toxicology. Curr. Opin. Drug Discovery Dev. 9 (1), 75-83.
(31) Wilson, A. G., White, A. C., and Mueller, R. A. (2003) Role of predictive metabolism and toxicity modeling in drug discovery - A summary of some recent advancements. Curr. Opin. Drug Discovery Dev. 6 (1), 123-128.
(32) Lawrence, C. L., Pollard, C. E., Hammond, T. G., and Valentin, J. P. (2005) Nonclinical proarrhythmia models: Predicting Torsades de Pointes. J. Pharmacol. Toxicol. Methods 52 (1), 46-59.
(33) Chiang, A. Y., Holdsworth, D. L., and Leishman, D. J. (2006) A one-step approach to the analysis of the QT interval in conscious telemetrized dogs. J. Pharmacol. Toxicol. Methods 54 (2), 183-188.
(34) O'Brien, P. J., and Siraki, A. G. (2005) Accelerated cytotoxicity mechanism screening using drug metabolising enzyme modulators. Curr. Drug Metab. 6 (2), 101-109.
(35) Sivaraman, A., Leach, J. K., Townsend, S., Iida, T., Hogan, B. J., Stolz, D. B., Fry, R., Samson, L. D., Tannenbaum, S. R., and Griffith, L. G. (2005) A microscale in vitro physiological model of the liver: Predictive screens for drug metabolism and enzyme induction. Curr. Drug Metab. 6 (6), 569-591.
(36) Vickers, A. E., and Fisher, R. L. (2005) Precision-cut organ slices to investigate target organ injury. Exp. Opin. Drug Metab. Toxicol. 1 (4), 687-699.
(37) Orrenius, S., and Zhivotovsky, B. (2006) The future of toxicology - does it matter how cells die? Chem. Res. Toxicol. 19 (6), 729-733.
(38) Zon, L. I., and Peterson, R. T. (2005) In vivo drug discovery in the zebrafish. Nat. Rev. Drug Discovery 4 (1), 35-44.
(39) Driscoll, M., and Gerstbrein, B. (2003) Dying for a cause: Invertebrate genetics takes on human neurodegeneration. Nat. Rev. Genet. 4 (3), 181-194.
(40) Song, M. O., Fort, D. J., McLaughlin, D. L., Rogers, R. L., Thomas, J. H., Buzzard, B. O., Noll, A. M., and Myers, N. K. (2003) Evaluation of Xenopus tropicalis as an alternative test organism for frog embryo teratogenesis assay - Xenopus (FETAX). Drug Chem. Toxicol. 26 (3), 177-189.
(41) Chaudhary, K. W., Barrezueta, N. X., Bauchmann, M. B., Milici, A. J., Beckius, G., Stedman, D. B., Hambor, J. E., Blake, W. L., McNeish, J. D., Bahinski, A., and Cezar, G. G. (2006) Embryonic stem cells in predictive cardiotoxicity: Laser capture microscopy enables assay development. Toxicol. Sci. 90 (1), 149-158.
(42) Baker, T. K., Carfagna, M. A., Gao, H., Dow, E. R., Li, Q., Searfoss, G. H., and Ryan, T. P. (2001) Temporal gene expression analysis of monolayer cultured rat hepatocytes. Chem. Res. Toxicol. 14 (9), 1218-1231.
(43) Waring, J. F., Ciurlionis, R., Jolly, R. A., Heindel, M., Gagne, G., Fagerland, J. A., and Ulrich, R. G. (2003) Isolated human hepatocytes in culture display markedly different gene expression patterns depending on attachment status. Toxicol. in Vitro 17 (5-6), 693-701.
(44) Boess, F., Kamber, M., Romer, S., Gasser, R., Muller, D., Albertini, S., and Suter, L. (2003) Gene expression in two hepatic cell lines, cultured primary hepatocytes, and liver slices compared to the in vivo liver gene expression in rats: Possible implications for toxicogenomics use of in vitro systems. Toxicol. Sci. 73 (2), 386-402.
(45) Fielden, M. R., and Kolaja, K. L. (2006) The state-of-the-art in predictive toxicogenomics. Curr. Opin. Drug Discovery Dev. 9 (1), 84-91.
(46) Flaim, C. J., Chien, S., and Bhatia, S. N. (2005) An extracellular matrix microarray for probing cellular differentiation. Nat. Methods 2 (2), 119-125.
(47) Critical Path Opportunities Report (2006) http://www.fda.gov/oc/initiatives/criticalpath/reports/opp_report.pdf.
(48) Sandusky, G. E., Means, J. R., and Todd, G. C. (1990) Comparative cardiovascular toxicity in dogs given inotropic agents by continuous intravenous infusion. Toxicol. Pathol. 18 (2), 268-278.
(49) Koop, R. (2005) Combinatorial biomarkers: From early toxicology assays to patient population profiling. Drug Discovery Today 10 (11), 781-788.
(50) Hood, L., and Perlmutter, R. M. (2004) The impact of systems approaches on biological problems in drug discovery. Nat. Biotechnol. 22 (10), 1215-1217.
(51) Hayes, K. R., and Bradfield, C. A. (2005) Advances in toxicogenomics. Chem. Res. Toxicol. 18 (3), 403-414.
(52) Lindon, J. C., Holmes, E., and Nicholson, J. K. (2006) Metabonomics techniques and applications to pharmaceutical research and development. Pharm. Res. 23 (6), 1075-1088.
(53) Fielden, M. R., and Zacharewski, T. R. (2001) Challenges and limitations of gene expression profiling in mechanistic and predictive toxicology. Toxicol. Sci. 60 (1), 6-10.
(54) Nicholson, J. K., Holmes, E., Lindon, J. C., and Wilson, I. D. (2004) The challenges of modeling mammalian biocomplexity. Nat. Biotechnol. 22 (10), 1268-1274.
(55) Sawada, H., Takami, K., and Asahi, S. (2005) A toxicogenomic approach to drug-induced phospholipidosis: Analysis of its induction mechanism and establishment of a novel in vitro screening system. Toxicol. Sci. 83 (2), 282-292.
(56) Baillie, T. A. (2006) Future of toxicology - metabolic activation and drug design: Challenges and opportunities in chemical toxicology. Chem. Res. Toxicol. 19 (7), 889-893.
(57) Smith, D. A., Johnson, D. E., and Park, B. K. (2006) Editorial overview: Safety of drugs can never be absolute. Curr. Opin. Drug Discovery Dev. 9 (1), 26-28.
(58) Liebler, D. C. (2006) The poisons within: Application of toxicity mechanisms to fundamental disease processes. Chem. Res. Toxicol. 19 (5), 610-613.
