
Article pubs.acs.org/crt

CADRE-SS, an in Silico Tool for Predicting Skin Sensitization Potential Based on Modeling of Molecular Interactions

Jakub Kostal*,†,‡ and Adelina Voutchkova-Kostal§

† Computational Biology Institute, The George Washington University, 45085 University Drive Suite 305, Ashburn, Virginia 20147, United States
‡ DOT Consulting LLC, 113 South Columbus Street Suite 100, Alexandria, Virginia 22314, United States
§ Department of Chemistry, The George Washington University, 800 22nd Street Northwest, Washington, D.C. 20052, United States

Supporting Information

ABSTRACT: Using computer models to accurately predict toxicity outcomes is considered to be a major challenge. However, state-of-the-art computational chemistry techniques can now be incorporated in predictive models, supported by advances in mechanistic toxicology and the exponential growth of computing resources witnessed over the past decade. The CADRE (Computer-Aided Discovery and REdesign) platform relies on quantum-mechanical modeling of molecular interactions that represent key biochemical triggers in toxicity pathways. Here, we present an external validation exercise for CADRE-SS, a variant developed to predict the skin sensitization potential of commercial chemicals. CADRE-SS is a hybrid model that evaluates skin permeability using Monte Carlo simulations, assigns reactive centers in a molecule and possible biotransformations via expert rules, and determines reactivity with skin proteins via quantum-mechanical modeling. The results were promising with an overall very good concordance of 93% between experimental and predicted values. Comparison to performance metrics yielded by other tools available for this endpoint suggests that CADRE-SS offers distinct advantages for first-round screenings of chemicals and could be used as an in silico alternative to animal tests where permissible by legislative programs.



Received: September 17, 2015
Published: December 9, 2015
© 2015 American Chemical Society

INTRODUCTION

Skin sensitization leading to allergic contact dermatitis (ACD) is a significant consumer and occupational health concern, affecting ca. 15−20% of the world’s Western population.1 While treatment is possible, it requires extended use of medication and permanent avoidance of the offending chemical agent, raising health care costs and costs related to prolonged employee absenteeism. Thus, skin sensitization potential is a key endpoint for the safety assessment of ingredients in commercial chemicals when significant dermal exposure is expected.2

The local lymph node assay (LLNA, OECD 429) is the preferred test method and the only means of definitively assessing skin sensitization for regulatory purposes under REACH. Additionally, one in chimico method (direct peptide reactivity assay, DPRA) and two in vitro methods (the ARE−Nrf2 luciferase test, KeratinoSens™, and the human cell line activation test, h-CLAT) are recommended by ECVAM (the European Centre for the Validation of Alternative Methods) to reduce animal testing where permissible (EURL ECVAM 2013, EURL ECVAM 2014, EURL ECVAM 2015). These non-animal tests cover specific events within the skin sensitization AOP (adverse outcome pathway) and should only be considered in combination or supported by in silico models (OECD 442C; OECD 442D; REACH Guidance, Chapter R.7a, 2015). In a recent study, a weight-of-evidence approach using these assays resulted in 82 and 90% accuracy against LLNA and human data, respectively.3 However, not all protein-binding mechanisms were well-represented in the test set, and the authors recognized the limited ability of these assays to assess precursors requiring metabolic activation.

In contrast to in vitro and in chimico tests, a single in silico model can be designed to address multiple events within the skin sensitization AOP. Furthermore, an in silico approach is likely to provide assessments at much lower cost and in shorter timeframes than experimental testing.

Over the last two decades, much effort has been invested in developing in silico methods for predicting skin sensitization potential, and there are many tools currently in use by industry. Many of these models rely on (quantitative) structure−activity relationships [(Q)SARs], i.e., they combine the use of physicochemical or structural descriptors and statistics to estimate the likelihood of skin sensitization. Some (Q)SAR models incorporate mechanistic knowledge about relevant

DOI: 10.1021/acs.chemrestox.5b00392 Chem. Res. Toxicol. 2016, 29, 58−64


Chemical Research in Toxicology

dermal injury. Since haptenation with skin proteins is widely recognized as the trigger for the sensitization biochemical cascade,2 CADRE-SS was developed as a tiered system with three modules for skin permeability, hapten activation mechanisms, and reactivity with skin proteins. Skin permeability is assessed using a multivariate model that was trained to reproduce experimentally determined permeability coefficients (log Kp) for a set of 143 compounds. The training set was collected from 10 different transdermal studies and consisted of common chemicals and drug-like substances.14−23 Descriptors are generated from Monte Carlo simulations; mixed quantum and classical mechanics calculations are used to describe the target chemical and the solvent, respectively. Several solvents are used to mimic interactions with lipid and protein fractions of the stratum corneum. Descriptors used in the model include physicochemical properties (e.g., dipole moment, surface and volume accessible areas, hydrogen bonding) and interaction energies obtained from energy pair distributions. The second CADRE-SS module identifies specific reactive sites and suggests likely biotransformations. Similar to the approach of Enoch et al.,24 haptenation and hapten-activation mechanisms are assigned via expert rules encoded in substructural patterns using the SMARTS language. An in-house library of SMARTS patterns was developed based on available literature and principles of chemical reactivity applied to the training set of chemicals. 
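As a rough illustration of how such expert rules might be organized, the sketch below maps mechanism names to SMARTS patterns and reports every mechanism whose pattern matches. The patterns are simplified textbook examples, not the authors' in-house library, and `has_substructure` is a hypothetical callback that a cheminformatics toolkit would have to supply.

```python
from typing import Callable, Dict, List

# Illustrative SMARTS patterns only (assumption): the in-house CADRE-SS
# library is not reproduced in the text, so these are generic stand-ins.
ILLUSTRATIVE_RULES: Dict[str, str] = {
    "Michael addition": "[CX3]=[CX3]-[CX3]=[OX1]",      # alpha,beta-unsaturated carbonyl
    "Schiff base formation": "[CX3H1](=[OX1])[#6]",     # aldehyde
    "Nucleophilic substitution": "[CX4][Cl,Br,I]",      # alkyl halide
    "Acyl transfer": "[CX3](=[OX1])[Cl,Br,I]",          # acyl halide
}


def assign_mechanisms(
    smiles: str,
    has_substructure: Callable[[str, str], bool],
    rules: Dict[str, str] = ILLUSTRATIVE_RULES,
) -> List[str]:
    """Return the names of all rules whose SMARTS pattern matches the molecule.

    `has_substructure(smiles, smarts)` is a hypothetical hook; the actual
    substructure search is delegated to whatever toolkit the caller uses.
    """
    return [name for name, smarts in rules.items() if has_substructure(smiles, smarts)]
```

With RDKit installed, the callback could be written as `lambda smi, pat: Chem.MolFromSmiles(smi).HasSubstructMatch(Chem.MolFromSmarts(pat))`. Keeping the matcher separate from the rule table lets the rules be reviewed and extended as plain data.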
Key haptenation mechanisms include Michael addition, Schiff base formation, nucleophilic substitution, nucleophilic aromatic substitution, and acyl transfer.7,25 Some metal ions can also form coordination complexes with skin proteins and induce skin sensitization.26 In the CADRE-SS approach, any chemical identified as a potential hapten in the second module is subsequently evaluated for reactivity with skin proteins using quantum-mechanical calculations in the third module. The key target for haptenation is the sensor protein Kelch-like ECH-associated protein 1 (Keap1), which contains highly reactive surface cysteine and lysine residues.27 Thus, the simplest target-based approach considers covalent binding of skin sensitizers to discrete residues. Mulliner et al. have shown that, for a series of substituted aldehyde and ketone Michael acceptors, quantum-mechanical transition state (TS) calculations can yield a well-fitting regression model of experimental rate constants with glutathione (GSH), kGSH.28 This approach is, however, impractical for screening large data sets of commercial chemicals in reasonable timeframes, particularly when larger chemicals with multiple conformers and reactive centers are involved. Schwöbel and co-workers showed that physicochemical properties and quantum-mechanical parameters of compounds in their ground states, such as local electrophilicity index, bond energy, and accessible surface areas, can substitute for computed reaction energetics in a comparable model.29 Assuming the model is statistically robust, ground-state calculations offer several distinct advantages: (i) structure optimizations converge more easily to a ground state than to a transition state, (ii) reaction mechanisms (e.g., acid- vs base-catalyzed reactions) do not have to be considered explicitly, and (iii) medium effects on reaction energetics can be neglected.
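To make the idea of a ground-state descriptor concrete, the following minimal sketch computes a local electrophilicity index from quantities that need no transition-state search. It assumes the Parr convention (chemical potential μ = −(IP + EA)/2, hardness η = IP − EA, ω = μ²/2η) and scales ω by a condensed Fukui function; the function names and input values are illustrative, and the descriptors used in the cited work may be defined differently.

```python
def global_electrophilicity(ip_ev: float, ea_ev: float) -> float:
    """Electrophilicity index omega = mu^2 / (2 * eta), in eV.

    Assumed convention: mu = -(IP + EA) / 2 and eta = IP - EA.
    """
    mu = -(ip_ev + ea_ev) / 2.0   # chemical potential (finite-difference estimate)
    eta = ip_ev - ea_ev           # chemical hardness (no factor of 1/2 in this convention)
    if eta <= 0:
        raise ValueError("IP must exceed EA for a positive hardness")
    return mu * mu / (2.0 * eta)


def fukui_plus(q_neutral: float, q_anion: float) -> float:
    """Condensed Fukui function for nucleophilic attack from net atomic charges.

    q_neutral: charge on the atom in the N-electron molecule.
    q_anion:   charge on the same atom in the (N + 1)-electron species,
               evaluated at the neutral geometry.
    """
    return q_neutral - q_anion


def local_electrophilicity(ip_ev: float, ea_ev: float,
                           q_neutral: float, q_anion: float) -> float:
    """Site-resolved electrophilicity: global index scaled by the Fukui function."""
    return global_electrophilicity(ip_ev, ea_ev) * fukui_plus(q_neutral, q_anion)


# Made-up inputs for a beta carbon (not data from the study).
omega_beta = local_electrophilicity(ip_ev=10.0, ea_ev=2.0, q_neutral=0.10, q_anion=-0.20)
```

Every input here comes from single-point calculations on one optimized geometry, which is the practical reason such descriptors scale to large screening sets.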
As a result, predictions based on ground-state calculations are notably faster, allowing large data sets of chemicals to be screened in reasonable timeframes. The CADRE-SS module for protein reactivity is similar to approaches described elsewhere.29,30 The applicability of global and site-specific quantum-mechanical parameters in predicting kGSH (and, by extension, protein binding) was tested on a set of 36 aldehyde, ketone, and ester Michael acceptors.28 In Figure 1, electron affinity (EA), electrostatic solvation energy (EE), local softness on the β carbon (sβ), the LCAO-LUMO (LCAO = linear combination of atomic orbitals) coefficient on the β carbon (cβ), and charge on the carbonyl oxygen (qO) were used in constructing a linear model with a correlation coefficient (R²) of 0.93. Local softness was calculated using condensed-to-atom Fukui functions fβ+ employing net atomic charges qβ and the relevant number of electrons N (with N + 1 referring to the anion in the neutral compound geometry).
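For orientation, the standard finite-difference definitions consistent with these symbols can be written as follows. The global softness $S$ and its expression in terms of the ionization potential (IP) and electron affinity (EA) are assumptions taken from conceptual density functional theory; the authors' exact form may differ, for example by a constant factor in the definition of hardness.

```latex
\begin{align}
f_\beta^{+} &= q_\beta(N) - q_\beta(N+1) \\
s_\beta &= S \, f_\beta^{+}, \qquad S \approx \frac{1}{\mathrm{IP} - \mathrm{EA}}
\end{align}
```

Here $q_\beta(N)$ is the net charge on the β carbon in the neutral molecule and $q_\beta(N+1)$ is the charge on the same atom in the anion at the neutral geometry, so a larger $f_\beta^{+}$ indicates a site that accepts more of the added electron density.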

biochemical processes (e.g., DEREK, Toxtree, TIME-SS); others fully rely on statistical evaluation of structural fragments (e.g., CASE Ultra, TOPKAT). Since an AOP has been developed for skin sensitization, mechanistic models tend to perform better in external validation testing than purely statistical models.4 Much of our current understanding of the mechanistic pathways involved in skin sensitization as well as useful strategies to incorporate such mechanistic knowledge into predictive modeling is based on early research by Aptula, Patlewicz, and Roberts.5−8 Their efforts showed that it is feasible to develop quantitative models to estimate sensitization potency, i.e., LLNA pEC3 values, using physicochemical properties and structural descriptors. Furthermore, their characterization of haptenation mechanisms with skin proteins laid the foundation for many incumbent models, including the one described in this study. Most existing mechanistic models focus on the initial triggers in the AOP: permeability of the stratum corneum, activation by autoxidation or by enzymes, and haptenation with skin proteins/peptides. Despite considerable advances in predictive modeling, there is consensus among the scientific community that no incumbent model can predict skin sensitization potentials accurately for broad application2,4 and that an integrated testing strategy (ITS) is needed.9−11 Existing in silico models have many documented shortcomings, such as a limited applicability domain, overpredictivity, and an inability to predict potency category.2,4 Most limitations can be attributed to a relatively small set of publicly available and reliable animal data and to our incomplete understanding of activation pathways. Fewer than 500 chemicals have published LLNA (local lymph node assay) data, which is the most desirable experimental assay for model training since it differentiates the potency of contact allergens.
Limited understanding of metabolic activation and autoxidation pathways leads to narrow applicability domains. In a recent study, TIME-SS, widely recognized as one of the most promising in silico models for predicting skin sensitization potential,12,13 achieved 100% concordance with LLNA in distinguishing sensitizers from nonsensitizers. However, this value reflected only 16% of the substances tested, namely, those within the model’s applicability domain.4 CADRE (Computer-Aided Discovery and REdesign) is a proprietary platform developed to address the challenges of applying (Q)SARs to endpoints with limited experimental data. Instead of using structure-based descriptors, which is commonplace in computational toxicology, CADRE relies on descriptors derived from modeling of molecular interactions, i.e., simulating the behavior of molecules in their biological environments. In this study, we describe CADRE’s module for skin sensitization (CADRE-SS) and gauge its performance against existing in silico models. The data set used to compare predictive models consisted of 45 sensitizing and 55 nonsensitizing chemicals, which were identified from human and animal studies.
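The distinction between concordance and applicability-domain coverage can be illustrated with a short sketch. The records below are hypothetical and merely mirror the 16% figure quoted above; the function name and data layout are assumptions made for illustration.

```python
from typing import List, Optional, Tuple

# Each record: (experimental result, predicted result), True = sensitizer.
# A prediction of None means the chemical is outside the applicability domain.
Record = Tuple[bool, Optional[bool]]


def concordance_and_coverage(records: List[Record]) -> Tuple[float, float]:
    """Return (concordance over in-domain chemicals, fraction in domain)."""
    in_domain = [(obs, pred) for obs, pred in records if pred is not None]
    if not in_domain:
        raise ValueError("no in-domain predictions to score")
    correct = sum(obs == pred for obs, pred in in_domain)
    return correct / len(in_domain), len(in_domain) / len(records)


# Hypothetical set of 100 chemicals: 16 in domain, all predicted correctly.
records = (
    [(True, True)] * 8 + [(False, False)] * 8
    + [(True, None)] * 40 + [(False, None)] * 44
)
concordance, coverage = concordance_and_coverage(records)
print(f"concordance {concordance:.0%}, coverage {coverage:.0%}")
# prints "concordance 100%, coverage 16%"
```

Reporting both numbers side by side makes it clear when a high concordance rests on a small in-domain subset.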



METHODS

The skin sensitization AOP consists of the following steps: penetration of the skin epidermis, possible activation by enzymatic or oxidative processes, and conjugation with skin proteins. Conjugated proteins are subsequently processed by epidermal Langerhans and dendritic cells. These cells migrate to the nearest lymph node where the conjugated protein fragments are recognized by specific T cell receptors. The activated T cells orchestrate an inflammatory response that can lead to


The combined CADRE-SS model was trained on 384 chemicals, consisting of 182 sensitizers and 202 nonsensitizers, obtained from publicly available33 and proprietary sources of LLNA data. Linear models were developed for each mechanistic domain associated with Keap1 binding, incorporating predicted skin permeability (Kp) and protein reactivity. The following haptenation mechanisms were considered in CADRE-SS via expert rules: Michael addition, nucleophilic substitution, nucleophilic aromatic substitution, Schiff base formation, acyl transfer, and mechanisms implicated in nitroxide, hydroxylamine, hydrazine, and hyperoxide interactions with skin proteins. The model has no applicability domain restrictions: predictions are generated for any discrete chemical for which a haptenation mechanism can be identified. If no mechanism is assigned, then the substance is ruled a nonsensitizer. If more than one mechanism is identified, then each is assigned a prediction based on local reactivity parameters. In such a case, users are advised to rely on the highest hazard category of sensitization potential predicted. If any descriptor falls outside the range defined in the training set, then a lower confidence-in-prediction (CIP) score is assigned. A CIP score (1−3) is defined as follows: score 3 is assigned if all computed descriptors fall within 2σ (95%) from the mean of their training set distributions; one point is subtracted for each out-of-bounds descriptor. A prediction is valid regardless of the CIP score; however, users are encouraged to explore its utility within their chemical space. For example, one may use the CIP score to help prioritize chemicals for further animal or in silico testing. While most incumbent models predict only whether a chemical of concern is likely to sensitize or not,4 CADRE-SS predicts sensitization potential based on ECETOC categories: extreme (LLNA EC3%: