
Article

Evaluation of TOPKAT, Toxtree and Derek in silico models for ocular irritation and development of a knowledge-based framework to improve prediction of severe irritation.
Barun Bhhatarai, Daniel M Wilson, Amanda K Parks, Edward Carney, and Pamela J Spencer
Chem. Res. Toxicol., Just Accepted Manuscript • DOI: 10.1021/acs.chemrestox.5b00531 • Publication Date (Web): 28 Mar 2016
Downloaded from http://pubs.acs.org on March 31, 2016


Chemical Research in Toxicology is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.


Chemical Research in Toxicology

Evaluation of TOPKAT, Toxtree and Derek in silico models for ocular irritation and development of a knowledge-based framework to improve prediction of severe irritation Barun Bhhatarai, Daniel M. Wilson, Amanda K. Parks, Edward W. Carney, Pamela J. Spencer

Address correspondence: Barun Bhhatarai, The Dow Chemical Company, Midland MI 48674 USA Telephone: 989-638-6862. Email: [email protected] Fax: 989-638-9863

Running Title: Ocular irritation in silico model evaluation and knowledge-based framework
Competing financial interests: The authors declare they have no actual or potential competing financial interests.
Graphical Abstract:

1 DOW RESTRICTED

ACS Paragon Plus Environment


Abstract: Assessment of ocular irritation is an essential component of any risk assessment. A number of (Q)SARs and expert systems have been developed and are described in the literature. Here, we focused on three in silico models (TOPKAT, the BfR rulebase implemented in Toxtree, and Derek Nexus) and evaluated their performance using 1644 in-house and 123 ECETOC (European Centre for Ecotoxicology and Toxicology of Chemicals) compounds with existing in vivo ocular irritation classification data. Overall, the in silico models performed poorly. The best consensus predictions of severe ocular irritants were 52% and 65% for the in-house and ECETOC compounds, respectively. The prediction performance was improved by designing a knowledge-based chemical profiling framework that incorporated physicochemical properties and electrophilic reactivity mechanisms. The utility of the framework was assessed by applying it to the same test sets and to three additional publicly available in vitro irritation datasets. The prediction of severe ocular irritants improved to 73-77% if compounds were filtered on the basis of AlogP_MR (hydrophobicity with molar refractivity). The predictivity increased to 74-80% for compounds capable of preferentially undergoing hard electrophilic reactions, such as Schiff base formation and acylation. This research highlights the need for reliable ocular irritation models that take into account mechanisms of action and individual structural classes. It also demonstrates the value of profiling compounds with respect to their chemical reactivity and physicochemical properties, which in combination with existing models results in better predictions for severe irritants.

Keywords: Ocular irritation, QSAR, AlogP_MR, Toxtree, TOPKAT, Derek Nexus, physicochemical, knowledge-based framework, electrophilic reactivity.


1. INTRODUCTION: Assessing the ocular irritation potential of a compound is a necessary component of a risk assessment. It is required for communicating hazards for safe use, handling and shipping, product stewardship, screening of new R&D candidates and for registration of new compounds for use in agriculture or for entry into commerce. Historically, ocular irritation potential has been evaluated using the ‘Draize in vivo rabbit ocular irritation test’ (1). In the Draize test, the severity of effects on ocular tissues (cornea, conjunctiva and iris) and the recovery time from injury are assessed (2, 3), although different cutoffs exist for assessment of irritation as recommended by the Organization for Economic Co-operation and Development (OECD) and European Union (EU) guidelines (Figure 1). The irritation potential is summarized as the ‘maximum average score’ (MAS), obtained by averaging the weighted scores from the Draize test across individual animals at each time of observation and selecting the highest of these averages. The modified MAS (MMAS) represents the maximum score calculated at 24 hours or longer following instillation (4); the scores are converted into eight possible descriptive ocular irritation ratings based on classification criteria adopted by the Interagency Coordinating Committee on the Validation of Alternative Methods (ICCVAM) (5) (Figure 1).

Currently, there is much emphasis on using non-animal alternatives, namely in vitro and in silico QSAR and read-across methods, in safety assessment (2, 6, 7). In fact, the EU has banned in vivo testing for cosmetics (8), and existing regulatory frameworks typically recommend a tiered assessment starting with in silico and in vitro approaches (9). In vitro methods for predicting ocular irritation potential are in various stages of development and validation under ECVAM and ICCVAM with consideration for regulatory use (10). The validation process for acceptance of in vitro models for regulatory use is rigorous, and much progress has been made in the past decade. For in silico models, OECD principles exist that were developed to facilitate the use of (Q)SAR models for regulatory purposes. However, any developed model still needs to be assessed for its prediction accuracy and for its relevance to the compounds being evaluated before its predictions are relied upon.
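The MAS and MMAS definitions given above can be illustrated with a minimal sketch. The function name, the data layout (observation time in hours mapped to one weighted score per animal) and the example scores are hypothetical and chosen only for illustration; they are not taken from any study in this work.

```python
from statistics import mean

def max_average_score(scores_by_hour, min_hour=0):
    """Return the highest per-time-point mean of weighted Draize scores.

    scores_by_hour maps an observation time in hours to the list of
    weighted scores, one per animal, recorded at that time.
    """
    averages = {
        hour: mean(animal_scores)
        for hour, animal_scores in scores_by_hour.items()
        if hour >= min_hour and animal_scores
    }
    if not averages:
        raise ValueError("no observations at or after min_hour")
    return max(averages.values())

# Hypothetical weighted scores for three animals (illustrative values only).
example = {1: [30, 36, 33], 24: [20, 25, 21], 48: [10, 12, 8], 72: [2, 4, 0]}

mas = max_average_score(example)                # all observation times
mmas = max_average_score(example, min_hour=24)  # 24 h or later only
```

In this example the MAS comes from the 1 h observation, whereas the MMAS ignores it and is taken from the 24 h observation.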


The objective of the current study was to assess ocular irritation models implemented within three software tools, namely TOPKAT, Toxtree and Derek Nexus, using a large database derived from mining in-house ocular irritation classification data. Additionally, a publicly available but much smaller in vivo dataset (4, 11) was used in the evaluation to compare model predictions. This helps to cross-check whether the coverage of the models is narrow and could be limited by the diversity of the chemicals in Dow’s library. The three in silico models selected were available in-house and were each underpinned by a different modelling algorithm. TOPKAT is a statistics-based expert system containing QSAR models for irritation. Toxtree contains an implementation of the BfR (Bundesinstitut für Risikobewertung, Federal Institute for Risk Assessment) rulebase for eye irritation, and Derek Nexus is a knowledge-based expert system. A brief description of each of these models is provided in the Experimental section below. There are also several regression- and classification-based QSAR models that have been published in the literature. However, these models are typically local in nature, dealing with specific chemistries such as pure organic compounds (3, 12-14), cationic surfactants (15), or pure bulk liquids (16). A detailed review is provided in a publication by Abraham et al. (17) A review of the literature-based QSAR models for dermal and ocular irritation has also been published by the European Commission's Joint Research Centre (JRC) (18). The second research objective was to derive an in silico screening framework to be used in conjunction with the three models evaluated, with the aim of identifying structural features and mechanisms correlated with severe ocular irritation (Schematic 1).
This second objective required the use of in-house and public ocular irritation databases to develop a knowledge-based chemical profiling framework based on physicochemical properties, electrophilic reactivity, and physicochemical and pharmacologic mechanisms, and then applying it to specific test sets. Two frameworks, based on physicochemical properties and electrophilic reactivity, were developed and are described here in detail. A third framework based on physicochemical and pharmacologic (biochemical and/or physiological) mechanisms will be discussed in a subsequent manuscript.

2. EXPERIMENTAL PROCEDURES:


2.1. Introduction to the models: A brief introduction and background information on the three models is provided below.

TOPKAT: TOxicity Prediction by Komputer Assisted Technology (TOPKAT) includes the structural fragments produced by Enslein et al. (19, 20) The model takes SMILES or SD files as input and employs Quantitative Structure Toxicity Relationships (QSTR) for assessing various measures of toxicity. Specific details about the model are published elsewhere (18). The statistical model was trained on different chemical categories, namely substituted monophenyls (286 compounds), acyclic esters (304 compounds), other aromatics (227 compounds), other acyclics (420 compounds) and alicyclics (216 compounds), comprising a total of 1453 compounds. These compounds were then used to derive three sub-models as listed in Table 1, which can be applied in a pipeline flow to sort compounds into four levels of rabbit ocular irritancy: non-irritant, mild, moderate and severe (Figure 2). The applicability of the models can be assessed by checking whether the properties of the test set compounds are within the same range as those of the training set and whether the test set compounds are within the optimum prediction space (OPS) derived from the training set compounds. The TOPKAT model: (1) is applicable only to organic molecules, whereas organometallic and inorganic compounds are out of the domain; (2) does not accept charges except N+ and O-; (3) is limited to smaller compounds, where the number of rings should be less than 9 and the number of SMILES characters less than 249; and (4) is applied only to the largest fragment for toxicity estimation in the case of mixtures and salts.

Toxtree: Toxtree was developed by IDEA Consult Ltd. (21) It has several modules, one of which is the BfR rulebase for predicting ocular irritation and corrosion (Figure S1A) (22). The rulebase uses EU-based risk phrases (23) (Figure 1), which categorize compounds into different ocular irritation/corrosion potentials (Table 2). The rulebase comprises physicochemical exclusion rules and structural alert inclusion rules to classify compounds.


Physicochemical exclusion rules: These depend on properties such as molecular weight (MW), hydrophobicity (logP), melting point (MP), water solubility (AqS) and lipid solubility, and are meant to identify compounds with no ocular irritation/corrosion potential. In the Toxtree implementation, aside from logP and MW, the majority of the properties need to be supplied by the end-user. In the absence of these values, the structural inclusion rules (see below) alone can be used for prediction. There are 7 rules based on physicochemical properties, which are either applicable to all groups of chemicals or further divided into sub-rules for specific chemical classes: C, CN, CNHal, CNS and CHal (where C: carbon, N: nitrogen, Hal: halogen, S: sulfur).

Structural inclusion rules: These depend on several structural filters that help to identify compounds with ocular irritation/corrosion potential. The filters include aliphatic monoalcohols, pyrazoles, ammonium salts, aliphatic amines, iso(thio)cyanates, etc. For example, prediction of category 8 of Toxtree, i.e., serious local lesions to the eye, is based on 17 different structural filters; similarly, category 9, i.e., moderate reversible irritation to the eye, is based on 4 different filters, and category 10, i.e., skin corrosion, is based on 6 different filters. Details about these chemical classes and structural fragments are given in the Toxtree user manual (22, 24). The rulebase implemented in Toxtree either assigns a risk phrase for irritation or corrosion on the basis of the structural alerts or uses the physicochemical parameters to categorically rule out certain risk phrases. Toxtree is unable to assign risk phrases if the physicochemical properties are not provided and/or the structural alerts defined in the model are not present in a molecule. Since prediction is based on the structural filters, the applicability domain is not defined or relevant.

Derek Nexus: Derek Nexus is a knowledge-based expert system by Lhasa Limited (25) wherein toxicity predictions are the result of two processes: evaluating alerts and estimating the likelihood of toxicity. Derek’s alerts are categorized by endpoint and super-endpoint, and alerts for ocular irritation fall into two endpoints: i) irritation to the eye/lachrymation and ii) ocular toxicity.


Derek considers the toxicological endpoint, query structure, physicochemical property values and species when determining ocular irritation potential. Derek predicts the compound to be an irritant or toxic, or provides no suggestion. It also calculates logP, logKp and MW from the structure provided, which are used to refine the likelihood of prediction for other endpoints such as skin sensitization but not irritation. The irritancy estimate is expressed as a likelihood of toxicity drawn from 6 different classes, and arguments are established for or against an outcome (see Figure S1B). For example, if one rule gives a likelihood of improbable and another rule gives a likelihood of probable, the overall level of likelihood will be equivocal.

In all three models, batch mode prediction is possible, although restrictions may apply when analyzing a very large number of compounds because the tools cannot handle a batch input of more than 500 compounds at a time.

2.2. Experimental data and irritation categories: The distribution of the in vivo ocular irritation data for the in-house compounds across the irritancy classes, based on MMAS scores, is provided in Table 3. Significant attention was given to data curation: filtration of duplicates and removal of mixtures and of compounds with unknown CAS or structure. The individual study reports and reviews were thoroughly studied and analyzed to retain high-quality irritation data. Of the 1644 in-house compounds obtained in total, 42 did not have ocular irritation data; 33 of these 42 compounds had experimental dermal irritation data, and the other 9 had dermal irritation data estimated using read-across. Of the 1602 compounds with rabbit ocular irritancy data, 684 (42.7%) were severe, 320 (20%) were moderate, 436 (27.2%) were mild and 162 (10.1%) were non-irritant. Due to proprietary issues, only a few datasets on specific chemical categories are provided as SI reports, but the landscape of the overall data in terms of product category and chemical functional groups (FGs) is presented below. The categorization of the studied compounds in terms of their use in different products for specific business groups is given in Figure 3a, which shows predominantly industrial chemicals and agrochemicals. The distribution of common FGs and building blocks (Figure 3b-A) was examined to ensure representation of different chemical categories, including organics and inorganics (Figure 3b-B), as well as to show the diversity of the data. SMARTNames, a knowledge-based framework of chemistry space based on FGs (26), was used to query the FGs present in the dataset. Compounds in the dataset were categorized according to the MMAS scale and plotted to examine the distribution of irritancy levels (Figure 3b-C). The distribution of severe and non-irritant compounds among amines and alcohols and their classes was also studied (Figure 4), as amines and alcohols are commonly (but not always) associated with irritants and non-irritants, respectively.
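The class percentages quoted in Section 2.2 follow directly from the reported counts. The short check below reproduces them; the variable names are illustrative and only the counts come from the text.

```python
# Counts of in-house compounds with rabbit ocular irritancy data (Section 2.2).
counts = {"severe": 684, "moderate": 320, "mild": 436, "non-irritant": 162}

total_with_ocular_data = sum(counts.values())
percentages = {
    label: round(100 * n / total_with_ocular_data, 1)
    for label, n in counts.items()
}

# Compounds in the full set that lack ocular irritation data.
without_ocular_data = 1644 - total_with_ocular_data
```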

2.3. Evaluation of individual models: The aim of this study was to determine the prediction performance of the selected models for a set of compounds that had existing in vivo data. Predictions could not be obtained for all studied compounds because of limitations of the individual models in assigning risk phrases, identifying structural alerts or defining the applicability domain. The three models assessed during this study are described below.

2.3.1. TOPKAT – The three sub-models discussed above (Table 1) were applied in a pipeline flow from A-B-C to sort compounds into four classes of rabbit ocular irritancy: non-irritant, mild, moderate and severe. The components of TOPKAT implemented in Pipeline Pilot v8.5 (27) were used. The applicability of the models was assessed by checking whether the properties of the test set compounds were within the same range as the properties of the training data set.

2.3.2. Toxtree – The molecular structures were input in batch mode, fewer than 500 compounds at a time. Several physicochemical parameters, including MP, AqS and lipid solubility, are requested by the model; however, these were not entered because the data were unavailable. The model uses these parameters to potentially exclude irritating compounds based solely on physicochemical boundaries. For mixtures that contained metals, different stereochemistry and variable valency, the Toxtree prediction was obtained only after the chemical components were separated into fragments. Predictions were then made for each fragment, and the results were combined. Where the predictions for the fragments differed, the results were prioritized in the order severe > moderate > mild > non-irritant, so that the most severe prediction was used as the overall prediction. For purposes of comparison, we binned the ocular irritation and corrosion prediction categories into five main classes (severe, moderate, non-irritant, not-severe and not-corrosive) by regrouping Toxtree’s nine categories, which are based on the R-phrases used in the software. The irritation categories and the explanations used for classifying the compounds are given in Table 2.

2.3.3. Derek Nexus – The molecular structures were input in batch mode and predictions were compared with calculations generated within Derek Nexus. Both irritation and ocular toxicity were used for calculations. Compounds that are toxic to the eye might not be considered irritants; however, for this study such compounds were placed in the category of severe irritants. Because Derek is a rule-based expert system that utilizes specific structural alerts known to elicit irritation and ocular toxicity, only compounds containing such alerts were identified in the analysis.

This study details the individual model performances for correct ocular irritation prediction for all four classes (severe, moderate, mild and non-irritant). In addition, the models were compared for overall correct predictions using only two experimental classes, irritants (combining severe, moderate and mild) and non-irritants, thereby challenging each model. This study uses the best available information obtained from the overall ocular irritation prediction.

2.4. Data modeling and consensus analysis of model prediction: The prediction results for all compounds from each model were compiled together (Table 3).
Since each model gives a different type of prediction outcome, from quantitative (TOPKAT) to risk phrase (Toxtree) to qualitative (Derek Nexus), the predictions needed to be harmonized for the consensus analysis, which was performed as follows. Compounds predicted to be in the same category by all models were considered reliable. For compounds where only one model gave an irritation prediction, that result was considered valid for consensus analysis; TOPKAT predicted all compounds, whereas Toxtree and Derek could return ‘unknown’ or ‘no decision’. For compounds with two different and/or opposite predictions, it was difficult to determine the concordance of prediction. Here, comparison with the experimental data helped to identify the most reliable prediction or to determine which model performed better. Where experimental ocular irritation data were not available (and only dermal irritation data were present), the prediction was considered equivocal. For consensus analysis, and for those compounds where experimental data were unavailable, the predictions from all three models were combined. For each irritancy level except non-irritant, the most severe prediction from any model was chosen, i.e., the priority of selection was corrosive > severe > moderate > mild > irritant. A compound was predicted to be a non-irritant only if, by consensus, all three predictions were non-irritant or at least two were non-irritant and one was unknown.
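The combination rule described in Section 2.4 can be sketched as follows. This is a minimal illustration, not the protocol used in the study: the function name and the harmonized string labels are assumptions, and the handling of combinations that the text does not cover (for example one non-irritant call and two unknowns) is an assumption marked in the code.

```python
# Severity order taken from the text; lower index means higher priority.
PRIORITY = ["corrosive", "severe", "moderate", "mild", "irritant"]

def consensus(predictions):
    """Combine three harmonized model predictions into one consensus call."""
    preds = [p.lower() for p in predictions]
    irritant_calls = [p for p in preds if p in PRIORITY]
    if irritant_calls:
        # Any irritant-type call wins; pick the most severe one.
        return min(irritant_calls, key=PRIORITY.index)
    non_irritant = preds.count("non-irritant")
    unknown = preds.count("unknown")
    if non_irritant == 3 or (non_irritant == 2 and unknown == 1):
        return "non-irritant"
    # Assumption: combinations the text does not cover are left undecided.
    return "unknown"

call_a = consensus(["severe", "mild", "non-irritant"])
call_b = consensus(["non-irritant", "non-irritant", "unknown"])
call_c = consensus(["non-irritant", "unknown", "unknown"])
```

Because a single irritant-type call overrides any number of non-irritant calls, the rule is conservative: it favors identifying severe irritants at the cost of more false positives.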

2.5. Assessment of the models using publicly available data: ECETOC published data for 132 compounds assessed in 149 in vivo rabbit studies in two phases (4, 11). Based on the classification adopted by ICCVAM (5), the compounds were categorized into five different irritancy levels as given in Figure 1C. The compounds were filtered for uniqueness, and SMILES were generated for 123 compounds, which were then submitted to all three models for estimation of irritation potential. The highest MMAS score was kept for each compound if multiple values were present. The compound list, SMILES representations, MMAS scores and predictions from each model based on the chemical structures are given in Table S2.
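Keeping the highest MMAS score per compound, as described above, amounts to a simple grouped maximum. The sketch below assumes the study records are available as (compound, MMAS) pairs; the function name, compound names and scores are hypothetical.

```python
def keep_highest_mmas(records):
    """Collapse (compound, mmas) pairs to one entry per compound, keeping the maximum."""
    highest = {}
    for compound, mmas in records:
        if compound not in highest or mmas > highest[compound]:
            highest[compound] = mmas
    return highest

# Hypothetical study records (names and scores are illustrative only).
studies = [("compound A", 12.5), ("compound B", 60.0), ("compound A", 30.0)]
unique = keep_highest_mmas(studies)
```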

2.6. Knowledge-based framework: Profiling based on physicochemical cutoffs and electrophilic reactivity: The knowledge-based screening framework, developed to understand the chemistry and mechanisms that influence the ocular irritation potential of chemicals, was divided into the following three categories (Schematic 1).

2.6.1. Physicochemical properties: Compounds can be physicochemically parameterized based on common molecular properties such as hydrophobicity (AlogP), molar refractivity (AlogP_MR), molecular weight (MW), acid dissociation constants (pKa), aqueous solubility (AqS), etc. In addition, some quantum mechanical descriptors for calculation of electron donation, such as Electronegativity, Highest Occupied Molecular Orbital Energy (E_HOMO), Lowest Unoccupied Molecular Orbital Energy (E_LUMO) and Electrophilicity, as well as Number of H-donors (Num_H_Donors) and Number of H-acceptors (Num_H_Acceptors), were also calculated. The ability of these properties to differentiate severe or irritant compounds from the others was studied, and any influential properties found were examined in more detail.

2.6.2. Electrophilic reactivity: Compounds may possess structural fragments that are prone to interaction with ocular proteins by electrophilic mechanisms, such as acylation, Schiff base formation or Michael addition. The mechanisms can be divided into harder and softer electrophilic mechanistic domains. These reaction mechanisms have been studied in detail for other endpoints such as skin sensitization (28), but their application in irritation studies has been explored only to a limited extent. Irritation caused by covalent binding of compounds has been studied in a few publications (23, 29), and some alerts used in the BfR structural rules are indicative of electrophilic reactivity, but they are not complete and are not categorized into mechanistic domains.

2.6.3. Other mechanisms: Compounds may cause ocular irritancy via physicochemical or pharmacologic mechanisms or other covalent mechanisms such as nucleophilic substitution (e.g., SN2, SNAr). Examples include surface-active compounds such as surfactants or compounds that might inhibit specific pharmacologic targets such as mitochondria. This inherent reactivity-based framework also includes reactive chemicals that are important for other acute toxicity end-points. Various chemical fragments for ocular irritation or skin irritation have been reported in the past (19, 30). However, separate assessment of these other potential mechanisms is beyond the scope of the current manuscript.

In addition to the in-house data and Bagley’s data (11), three other publicly available ocular irritation datasets were downloaded to derive the knowledge-based framework: a) eChemportal (31) (May 20, 2014 download); b) ICCVAM’s Background Review Document (BRD) on severe eye irritants and corrosives (10, 32); and c) the US-FDA’s Center for Food Safety and Applied Nutrition (CFSAN) data compiled from various drugs, cosmetics and household products (33). Two sets of ECHA ocular irritation data were downloaded from the OECD eChemportal, comprising 116 severe and 775 non-irritant compounds with a reliability score of 1 based on the Klimisch scoring scheme (34). During the data curation process, mixtures and compounds with unknown CAS or structure were excluded, and only compounds with data pertaining to rabbit ocular irritation were kept. Similarly, from ICCVAM, 126 unique compounds with irritation data from three in vitro test methods, Isolated Chicken Eye (ICE), Hen’s Egg Test-Chorioallantoic Membrane (HET-CAM), and Bovine Corneal Opacity and Permeability (BCOP), were collected. From the FDA’s CFSAN publications (33), 2928 compounds with unique CAS and compound name were obtained. In all datasets, compounds with a single CAS, a defined chemical structure and the most severe endpoint were kept. During curation, duplicates (compounds with the same CAS and/or chemical name) were deleted using filters and protocols written in Pipeline Pilot. SMILES were collected from ChemIDPlus (35).

For all datasets, physicochemical parameters such as the octanol–water partition coefficient (AlogP), molar refractivity (AlogP_MR) (36), molecular weight (MW), Number of H-donors (Num_H_Donors), Number of H-acceptors (Num_H_Acceptors), etc. were calculated using Pipeline Pilot (27) components. Similarly, all compounds with a SMILES representation were profiled for reactivity using the OECD Toolbox v3.2 (37), except for the FDA-CFSAN compounds, which were used to model physicochemical properties only. Compounds that may undergo electrophilic reactions such as Schiff base formation, acylation and Michael addition were tagged. Compounds were subjected to 3D modeling using OASIS TIMES to calculate the following parameters: Electronegativity, Highest Occupied Molecular Orbital Energy (E_HOMO), Lowest Unoccupied Molecular Orbital Energy (E_LUMO) and Electrophilicity. Pipeline Pilot v8.5 (27) was used for data management and analyses. Microsoft Excel and TIBCO Spotfire (38) were used for developing histograms and scatter plots, respectively.
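For readers unfamiliar with the orbital-based descriptors listed above, the sketch below shows one common way such quantities are related to E_HOMO and E_LUMO. It assumes Koopmans-type approximations (ionization energy I ≈ -E_HOMO, electron affinity A ≈ -E_LUMO), hardness taken as I - A (conventions differing by a factor of two exist), and Parr's electrophilicity index. Whether the software used in this study applies these exact definitions is not stated in the text, so the formulas, the function name and the example energies are assumptions for illustration only.

```python
def conceptual_dft_descriptors(e_homo, e_lumo):
    """Return (electronegativity, hardness, electrophilicity) from frontier orbital energies.

    Assumes I ~ -E_HOMO and A ~ -E_LUMO, hardness eta = I - A, and
    Parr's electrophilicity index omega = mu**2 / (2 * eta), where mu = -chi.
    """
    if e_lumo <= e_homo:
        raise ValueError("E_LUMO must lie above E_HOMO")
    electronegativity = -(e_homo + e_lumo) / 2.0   # chi = (I + A) / 2
    hardness = e_lumo - e_homo                     # eta = I - A
    electrophilicity = electronegativity ** 2 / (2.0 * hardness)
    return electronegativity, hardness, electrophilicity

# Hypothetical orbital energies in eV (illustrative values only).
chi, eta, omega = conceptual_dft_descriptors(e_homo=-10.0, e_lumo=-2.0)
```

Under these assumptions, a low-lying LUMO combined with a small HOMO-LUMO gap gives a larger electrophilicity value, which is the qualitative trend relevant to the electrophilic reactivity profiling described in Section 2.6.2.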

3. RESULTS AND DISCUSSION: The distribution of in-house compounds across different business products is given in Figure 3a, and their distribution in terms of common FGs and irritancy category is given in Figure 3b. The distribution shows that FGs representing ethers and alcohols are present in the majority of compounds, followed by amines. Nitrogen-containing compounds are known to play an important role in ocular irritation, often due to their basicity; thus, the distribution of different types of amines was also studied (Figure 4). Primary and secondary amines were present in 57% of the severe compounds (Figure 4, IA) but only 16% of non-irritants (Figure 4, IIA). The distribution of alcohols was not significantly different, with 75% of severe compounds versus 67% of non-irritants being primary alcohols. Of the 1602 compounds (Section 2.2) with ocular irritation data, 684 (42.7%) were severe, 320 (20%) were moderate, 436 (27.2%) were mild and 162 (10.1%) were non-irritant. Of the remaining 42 compounds, 33 had dermal irritation data, of which 13 were severe, 7 were moderate, 3 were mild, 6 were non-irritant and 4 were not-corrosive. For the other 9 compounds, only dermal irritation data estimated by read-across exist. Severe, moderate or even mild dermal irritants may cause significant ocular irritation; thus, for prediction and comparison purposes, none of the compounds were excluded from the total set. Ocular irritation estimation using the QSAR models was evaluated on the total of 1644 compounds, and the performance of each model is given below.

13 DOW RESTRICTED

ACS Paragon Plus Environment

Chemical Research in Toxicology


3.1. Evaluation of individual models: The conceptual framework of this study was to evaluate the performance of three individual models for ocular irritation using a set of compounds not in their training sets. The performance of each model is discussed below, with unknown/no-decision predictions counted as false negatives.

3.1.1. TOPKAT – TOPKAT gave predictions for all 1644 compounds and categorized them into one of four irritancy levels: severe, moderate, mild or non-irritant. The applicability domain of the model was also predicted for each irritancy level. The distribution of the predictions for 1602 compounds is given in Figure 5a. Overall, the model did not give robust predictions. The positive prediction for in-domain compounds for severe, moderate, mild and non-irritant was 43%, 15%, 18% and 29%, respectively, while prediction for the overall compounds was slightly higher. Of the 162 non-irritant compounds, 52 (47 + 5) were predicted correctly, of which 5 had an applicability domain issue because a chemical structural fragment was not identified or the compound was a metal. Even for the best-predicted class, i.e., corrosive/severe, the prediction was less than 50% accurate, followed by non-irritants, which were 32% accurate.

3.1.2. Toxtree – Toxtree gave predictions for only 356 compounds, which were assigned to one of five levels: severe, moderate, non-irritant, not-severe and not-corrosive. The remaining 78% of the total compounds were considered ‘unknown’ by the model. The unavailability of physicochemical parameters for the compounds tested will certainly affect the performance of the model, which may explain these ‘unknowns’. The distribution of the predictions for 1602 compounds is given in Figure 5b. Overall, Toxtree also performed poorly. The sensitivity of the model for total compounds for severe, moderate/mild and non-irritant was 13%, 0.5% and 21%, respectively.
Toxtree doesn’t give information about the applicability domain of the model. For 162 non-irritant compounds, 34 were predicted correctly whereas 119 were predicted unknown and 9 were predicted


other than non-irritant. The best prediction of Toxtree was correctly identifying non-irritants 21% of the time.

3.1.3. Derek Nexus – Of the 1602 compounds analyzed, 165 were estimated to have structural alerts associated with ocular irritation potential and 9 with plausible ocular toxicity, whereas ‘no decision’ was made for the remaining 89% of compounds (Figure 5c). Derek identified 154 irritants (11%) correctly, but 11 non-irritants (6.7%) were incorrectly identified as irritants. Derek does not give information about the applicability domain of the model. Of the 162 non-irritants, Derek could not estimate 150 compounds, which were classified as ‘unknown’ (1 compound was not estimated because of uncommon valency and 8 were considered to have structural problems or were mixtures). The inability to assign any irritation class could be due to the few structural alerts for ocular irritation identified in the studied compounds or implemented in the model. The 9 compounds with plausible ocular toxicity (categorized here as severe) all contained sulfonamide groups, whereas experimentally they were mild irritants.

3.2. Evaluation of in silico models based on regrouping of irritancy classes: The three models were compared for overall correct predictions by grouping the compounds into two experimental classes, irritants and non-irritants, and assessing model performance. TOPKAT correctly predicted 34% of the irritants and 32% of the non-irritants, Toxtree correctly predicted 6.5% of the irritants and 21% of the non-irritants, and Derek correctly predicted 5% of the irritants but provided no information regarding non-irritants.
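The scoring convention used in Section 3.1 (unknown/no-decision outputs counted as false negatives) strongly affects the reported percentages. The following minimal sketch, with an illustrative function name, reproduces the non-irritant figures from the counts quoted in the text and shows the alternative of excluding unknowns for comparison.

```python
def correct_rate(correct: int, wrong: int, unknown: int,
                 unknown_as_miss: bool = True) -> float:
    """Percentage of a class predicted correctly.

    With unknown_as_miss=True, 'unknown'/'no decision' outputs stay in the
    denominator, i.e. they are scored as misses.
    """
    total = correct + wrong + (unknown if unknown_as_miss else 0)
    if total <= 0:
        raise ValueError("no scored compounds in this class")
    return 100.0 * correct / total


# Non-irritant class (162 compounds), counts as quoted in Section 3.1.
toxtree = correct_rate(correct=34, wrong=9, unknown=119)
topkat = correct_rate(correct=52, wrong=162 - 52, unknown=0)
toxtree_known_only = correct_rate(34, 9, 119, unknown_as_miss=False)

print(f"Toxtree: {toxtree:.0f}%  TOPKAT: {topkat:.0f}%")
print(f"Toxtree, unknowns excluded: {toxtree_known_only:.0f}%")
```

Scoring only the 43 non-irritants for which Toxtree returned a call would give about 79% instead of 21%, which is why the treatment of unknowns should be stated alongside any performance figure.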
When the compounds were regrouped into three classes by further splitting the irritants into mild/moderate and severe/corrosive classes, the correct prediction of severe/corrosive compounds increased to 48.5% from 34% for TOPKAT (in-domain compounds) and to 13% from 6% for Toxtree, while mild/moderate prediction for Derek increased to 9% from 0% (Figure 6).

3.3. Consensus prediction for in silico models:


The current assessment shows that, individually, none of the ocular irritation models satisfactorily differentiated compounds into appropriate irritancy classes. Among the three models, TOPKAT, a statistical model, provided better predictions than Toxtree or Derek, the rule-based and expert knowledge-based models. Thus, a consensus using all three in silico models was examined for the various irritancy classes. The consensus results were also inadequate, with the best prediction being 52% for severe irritants; nearly half of the severely irritating compounds (326 of 684) were not flagged as such. Of these, 69% (226 of 326) were under-predicted as irritants and 31% were under-predicted as non-irritants. Similarly, 59% (96 of 162) of non-irritants were over-predicted by consensus. Table 4 shows the predictions for compounds from all three models as well as the consensus prediction. The 42 in-house compounds with dermal but no experimental ocular irritation information were also predicted using all three models, and these predictions are given in Table S1.
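The under- and over-prediction terminology used for the consensus results can be made concrete with a small sketch. The ordinal scale and the function names below are illustrative assumptions (the text does not state how the consensus call itself was derived); the counts in the usage lines are those quoted above.

```python
SEVERITY_ORDER = ("non-irritant", "mild", "moderate", "severe")


def compare_call(experimental: str, predicted: str) -> str:
    """Label a prediction as correct, under-predicted or over-predicted."""
    exp_rank = SEVERITY_ORDER.index(experimental)
    pred_rank = SEVERITY_ORDER.index(predicted)
    if pred_rank == exp_rank:
        return "correct"
    return "under-predicted" if pred_rank < exp_rank else "over-predicted"


def percent(part: int, whole: int) -> int:
    """Whole-number percentage of part in whole."""
    return round(100 * part / whole)


# Counts quoted in the text for the consensus prediction.
severe_total, severe_missed, missed_as_irritant = 684, 326, 226
missed_as_non_irritant = severe_missed - missed_as_irritant

print(compare_call("severe", "mild"))
print(percent(severe_total - severe_missed, severe_total))  # severe flagged
print(percent(missed_as_irritant, severe_missed))           # missed, called irritant
print(percent(missed_as_non_irritant, severe_missed))       # missed, called non-irritant
print(percent(96, 162))                                     # non-irritants over-predicted
```

The four percentages correspond to the 52%, 69%, 31% and 59% figures reported for the consensus.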

For the Bagley et al. dataset (4), the predicted results were slightly better than those for the in-house dataset. TOPKAT performed better than the other two in silico models, with its best correct performance being 61% for severe and then 51% for non-irritant compounds (Figure S2). The best prediction for Toxtree was 19% for severe compounds and for Derek was 6% for mild compounds (Table 5). Derek flagged acrylates and methacrylates as irritants based on the α,β-unsaturated aldehyde FG category. However, 8 (2 acrylates and 5 methacrylates) out of 11 (3 acrylates and 8 methacrylates) of these compounds were reported by in vivo studies to be non-irritants based on MMAS categorization (Report S5). The α,β-unsaturated carbonyl in acrylates/methacrylates is a reactive functionality that acts as a Michael acceptor, a soft electrophile. Such compounds may show a delayed ocular irritation response (39). The in silico models performed poorly not only for Dow’s chemical library but also for the Bagley dataset, showing that the performance is independent of the size or diversity of the chemical library. The models’ performance can be evaluated for the specific chemical classes or reaction domains that were


better or poorly predicted, but the current focus of the manuscript is to examine prediction performance in terms of physicochemical properties and electrophilic reactivity potential.

3.4. Knowledge-based framework: Profiling based on physicochemical cutoff and electrophilic reactivity: The irritation potential of a chemical depends on its structure, physicochemical properties and the reaction it initiates at the point of contact. Results at the upper and lower ranges of a toxicity response are often the most credible (i.e., very non-toxic or highly toxic). Among the different levels of irritancy, data categorizing compounds as corrosive/severe or non-irritant often carry more confidence than data for mild and moderate irritation levels, as it can be technically difficult to differentiate the two middle levels of irritancy. This can be due to the subjective judgment inherent in the Draize scoring system, lack of resolution of the assay, in which the distinction between mild and moderate irritancy is based on minor variations in erythema and edema, or the need to monitor for an adequate amount of time to observe healing of irritation. Thus, to differentiate severe/corrosive compounds from others, a knowledge-based framework was developed and used for chemical profiling.

3.4.1. Physicochemical properties: We studied different physicochemical parameters and found that AlogP and AlogP_MR, which are related to the hydrophobicity and polarizability of a molecule, respectively, help distinguish strong irritants and corrosives. The prediction percentage for all five datasets is given in Table 6. The parametric cut-offs of AlogP ≤ 2.2 and AlogP_MR ≤ 55 effectively differentiated the majority of severe/corrosive compounds (Figure 7), with positive prediction above 70-75% for all studied datasets.
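The cut-offs above can be expressed as a simple filter. The sketch below is a minimal illustration under stated assumptions: descriptor values are supplied precomputed, both cut-offs are applied jointly (the text also reports them separately), and the two rates are one reading of the a posteriori and a priori framing (share of severe compounds that fall within the cut-offs, versus share of within-cut-off compounds that are severe). All compounds and values in the demo list are hypothetical.

```python
from typing import List, NamedTuple


class Compound(NamedTuple):
    name: str
    alogp: float
    alogp_mr: float
    severe: bool  # experimental severe/corrosive call


def within_cutoffs(c: Compound, alogp_max: float = 2.2,
                   alogp_mr_max: float = 55.0) -> bool:
    """True if the compound meets both descriptor cut-offs."""
    return c.alogp <= alogp_max and c.alogp_mr <= alogp_mr_max


def a_posteriori_share(compounds: List[Compound]) -> float:
    """Of the experimentally severe compounds, the fraction within the cut-offs."""
    severe = [c for c in compounds if c.severe]
    if not severe:
        raise ValueError("no severe compounds in the set")
    return sum(within_cutoffs(c) for c in severe) / len(severe)


def a_priori_share(compounds: List[Compound]) -> float:
    """Of the compounds within the cut-offs, the fraction that are severe."""
    flagged = [c for c in compounds if within_cutoffs(c)]
    if not flagged:
        raise ValueError("no compounds within the cut-offs")
    return sum(c.severe for c in flagged) / len(flagged)


# Hypothetical compounds and descriptor values, for illustration only.
demo = [
    Compound("A", 0.5, 30.0, True),
    Compound("B", 1.8, 50.0, True),
    Compound("C", 3.5, 80.0, True),
    Compound("D", 1.0, 40.0, False),
    Compound("E", 4.0, 90.0, False),
    Compound("F", 2.0, 45.0, False),
]
print(a_posteriori_share(demo), a_priori_share(demo))
```

The two rates differ whenever non-severe compounds also fall within the cut-offs, so the class balance of a dataset directly affects the a priori figure.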
While this cut-off was applied a posteriori, i.e., after knowing the irritation potential of the compounds, a priori application, i.e., without any previous knowledge of the irritation potential, gave different results (see Table 6). The prediction ranged from 50-70% for the AlogP_MR cutoff and 40-60% for the AlogP cutoff, except for data from the ECHA and Bagley sets, which were both close to 20-30%. This observation could partly be due to the imbalance in data size and unequal


distribution of the severe irritation class against non-irritants, a common problem of disproportionate data.

3.4.2. Electrophilic Reactivity: Secondly, electrophilic reactivity-based profiling was used, in which three reaction mechanisms were studied: Schiff base formation, acylation, and Michael addition. The first two are hard electrophilic reaction mechanisms and the last is a soft electrophilic reaction mechanism. Schiff base formers and acylators are known to have a stronger dependency on logP (40). When a combined filter (AlogP from physicochemical character and Schiff base formers from reactivity) was applied, it helped to confirm severe/corrosive ocular irritants. Of the 22 in-house compounds that were profiled by the OECD Toolbox as Schiff base formers, 19 had AlogP ≤ 2.2, and among them 15 were severe, two moderate and two mild irritants. Thus, the correct prediction of severe compounds improved to 79% when a combination of Frameworks 1 and 2 was used (Table 7; chemical structures are given in Report S1). There were not enough compounds capable of undergoing Schiff base formation in the ECHA, ICCVAM and Bagley datasets. For acylators, the combination of the acylation alert with AlogP_MR, rather than AlogP, improved the severity prediction. Of the 77 in-house compounds that were profiled using the OECD Toolbox as acylators, 35 had AlogP_MR ≤ 55, and among them 26 were severe, 5 moderate and 4 mild irritants. The correct prediction of severe irritants improved to 74% (chemical structures are given in Report S2). Of the other 42 compounds above the AlogP_MR cutoff, only 7 were severe irritants. For the ECHA, ICCVAM and Bagley datasets, 119, 6 and 3 compounds, respectively, were profiled as acylators. For the ECHA dataset, many instances of misclassification or no classification of the irritancy potential of compounds were found, which hindered further analysis (explained below).
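The combined filter described above pairs each reactivity alert with one descriptor cut-off. The sketch below encodes that pairing and recomputes the reported percentages from the quoted counts. The rule table, mechanism labels and function names are illustrative assumptions, not the authors' implementation, and the compound in the usage line is hypothetical.

```python
from typing import Dict, Set

# Mechanism alert -> (descriptor name, upper cut-off), as paired in the text.
RULES = {
    "schiff_base_formation": ("alogp", 2.2),
    "acylation": ("alogp_mr", 55.0),
}


def flag_severe(mechanisms: Set[str], descriptors: Dict[str, float]) -> bool:
    """True if any profiled mechanism meets its paired descriptor cut-off."""
    for mechanism in mechanisms:
        rule = RULES.get(mechanism)
        if rule is None:
            continue  # mechanisms without a paired cut-off are not flagged here
        descriptor, cutoff = rule
        if descriptors[descriptor] <= cutoff:
            return True
    return False


def share_severe(severe: int, flagged: int) -> int:
    """Whole-number percentage of flagged compounds that were severe."""
    return round(100 * severe / flagged)


# Hypothetical compound: an acylator with AlogP_MR below the cut-off.
print(flag_severe({"acylation"}, {"alogp": 3.0, "alogp_mr": 40.0}))
print(share_severe(15, 19))  # Schiff base formers with AlogP <= 2.2
print(share_severe(26, 35))  # acylators with AlogP_MR <= 55
```

Keeping the alert-to-descriptor pairing in a table makes it straightforward to add a further rule later without changing the flagging logic.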


In the case of Michael addition, 69 in-house compounds were profiled using the OECD Toolbox as Michael acceptors. Although the largest share of them were severe (40%), they were also distributed among the moderate (25%), mild (25%) and non-irritant (10%) classes. Only 42 compounds were under the AlogP ≤ 2.2 cutoff, and among them 26 were severe (62%) (Table 7). Of the other 27 compounds above the AlogP cutoff, only 2 were severe (chemical structures are given in Report S3). Here, the prediction percentage did not improve above 65%. Some compounds with multiple FGs, such as di/tri/tetra-acrylates and bi-sulfoxide, and small-molecule electrophiles, such as ethyl/methyl/butyl acrylate, acrolein, methyl acrolein, etc., were found to be more potent ocular irritants. The role of 3D parameters such as the energy gap (∆E = E_LUMO - E_HOMO) and the electrophilicity index is currently being studied. A small ∆E indicates a compound that is chemically reactive toward covalent binding and is known to be useful for ranking Michael acceptor potency (28). Similarly, some compounds might be reactive by multiple mechanisms; for example, 2-methyl-2-pentenal can undergo Schiff base formation as well as Michael reaction, and tetrahydro-1,3-isobenzo-furan-dione can undergo acylation as well as Michael reaction. Thus far, we have only parameterized compounds potentially acting by the two harder electrophilic mechanistic domains (acylation and Schiff base formation), both of which increased the prediction confidence. Our research efforts are continuing for compounds that may undergo softer electrophilic reactions, such as Michael addition.

Several compounds for which ocular irritation data were downloaded from ECHA were categorized within ECHA as ‘non-irritant’ when they actually showed significant ocular irritation that did not reach the criteria for classification.
In some cases, the ‘interpretation of the result’ and the ‘conclusion’ sections of the public ECHA dossier do not match, and non-classified compounds are also concluded to be non-irritants. Thus, an inherent challenge in using large downloaded datasets for validation of existing in silico models is that the underlying reports are not readily available and that not enough quality control or expertise went into describing the classification results. Some examples included α,β-unsaturated carbonyl compounds, which can act as Michael acceptors, especially the smaller molecules (molecular weight < 200 g/mol).


For example, there were 65 compounds in the ECHA “non-irritant” download, of which 50 were profiled using the OECD QSAR Toolbox as protein binders by Michael addition, and 12 of them had MW below 200 g/mol (Report S4). Some smaller compounds like isobutyl methacrylate (CAS 97-86-9, molecular weight 142 g/mol) were slightly volatile (vapor pressure = 3.63 mmHg), but the majority of these compounds were not volatile (35). The ECHA ‘interpretation of result’ and ‘conclusion’ sections for these compounds are given in Table 8. Only 3 compounds had AlogP_MR above 55 and only half of them crossed the AlogP cutoff of 2.2. Further study based on the third knowledge-based framework of ‘other mechanisms’ is in progress to understand key reaction mechanisms and predict irritation categories with higher levels of confidence and reliability. Similar approaches were used along with physicochemical-based filtering in the development of the models discussed above and others such as those from MultiCASE Inc. (41). The three QSAR models that were assessed did not adequately flag compounds known to be strong ocular irritants, which may be attributed to poor data quality or inadequate knowledge-based characterization in the model development.

4. CONCLUSIONS: Our study shows that three commonly used in silico models for ocular irritation prediction could not reasonably be used to predict the ocular irritation potential of a set of in-house compounds. To the best of our knowledge, these models have not been frequently upgraded with new data or knowledge to improve their predictions. The individual model performances were poor for the datasets evaluated. Among the three models evaluated, none correctly identified more than 48% of compounds or flagged sub-structures for even the most confident apical irritation levels such as corrosive/severe or non-irritant.
In an effort to maximize predictivity, a consensus approach was taken, and although combining predictions led to an increase in reliability (52%), it still was not adequate for in silico ocular irritation assessment. The results could be due in part to inadequate information and curation of the data, subjectivity in the scoring of ocular irritancy used to develop the models, and difficulty in identifying and binning the majority of irritants in


appropriate classes or finding structural alerts to determine the irritation potential. Questions can be raised about the reliability of the training set data, the scoring criteria used during the in vivo studies, the expert judgment used to categorize structural filters, the treatment of unknown/no-decision predictions as false negatives during model evaluation, etc., because the predictions made by different in silico models were often different for a given compound. This could perhaps be understood by having access to the highly curated experimental data used to build the models and the rules and decision trees implemented, all of which are proprietary. These models may be meaningful for screening selected chemistries, provided the compounds carry structural alerts defined in the models. Our effort to improve prediction for severe irritants using a knowledge-based combination framework was satisfactory. The prediction increased up to 79% when physicochemical properties and alerts for electrophilic reactivity were combined. Thus, this research emphasizes the need for in silico models that address chemical reactivity and physicochemical-based filtering as well as mechanism-based grouping of compounds, which could also address the lack of predictions from these tools. Even currently existing in silico models may be able to implement such filters for better categorization of irritation potential. The results of this study were not intended to demonstrate poor ocular irritation predictability of any particular model, but rather to show that the knowledge required for efficient prediction is lacking in the in silico models. Our next effort will be to use these data to develop a comprehensive framework for predicting ocular irritation and to gain a greater understanding of the mechanism(s) of action based on chemical reactivity.


Acknowledgements: The authors gratefully acknowledge Tyler Auernhammer for data collection and ocular irritation report analysis. We also thank Dr. Raja Settivari and anonymous reviewers for critical review of the manuscript. Supporting Information: The Supporting Information file includes reports, figures and tables. This material is available free of charge via the Internet at http://pubs.acs.org.

Funding sources: Dow Chemical Co.’s internal funding for postdoctoral research.

Abbreviations: AlogP: log of octanol–water partition coefficient using Ghose and Crippen method; AlogP_MR: Ghose and Crippen estimate of Molar Refractivity; AqS: Aqueous Solubility; BCOP Assay: Bovine Corneal Opacity and Permeability Assay; BfR: Bundesinstitut für Risikobewertung (Federal Institute for Risk Assessment); CFG: Chemical Functional Group; CFSAN: Center for Food Safety and Applied Nutrition; ECETOC: European Centre for Toxicology and Ecotoxicology of Chemicals; ECHA: European Chemicals Agency; ECVAM: European Centre for the Validation of Alternative Methods; E_HOMO: Highest Occupied Molecular Orbital Energy; E_LUMO: Lowest Unoccupied Molecular Orbital Energy; EU: European Union; ICCVAM: Interagency Coordinating Committee on the Validation of Alternative Methods; IUPAC: International Union of Pure and Applied Chemistry; MMAS: Modified Maximum Average Score; MP: Melting Point; MW: Molecular Weight; Num_H_Acceptors: Number of H-acceptors; Num_H_Donors: Number of H-donors; OECD: Organisation for Economic Co-operation and Development; OPS: Optimum Prediction Space; QSAR: Quantitative Structure Activity Relationship; QSTR: Quantitative Structure Toxicity Relationship; SMARTS: SMiles ARbitrary Target Specification; SMILES: Simplified Molecular-Input Line-Entry System; TOPKAT: TOxicity Prediction by Komputer Assisted Technology; US-FDA: United States Food and Drug Administration; UN GHS: United Nations Globally Harmonized System.


References

(1)

Draize, J. H., Woodard, G., and Calvery, H. O. (1944) Methods for the study of irritation and toxicity of substances applied topically to the skin and mucous membranes. J. Pharmacol. Exp. Ther. 82, 377-390.

(2)

Worth, A., Barroso, J., Bremer, S., Burton, J., Casati, S., Coecke, S., Corvi, R., Desprez, B., Dumont, C., Gouliarmou, V., Goumenou, M.-P., Graepel, R., Griesinger, C., Halder, M. E., Janusch Roi, A., Kienzler, A., Madia, F., Munn, S., Nepelska, M., Paini, A., Price, A., Prieto, P. P., Rolaki, A., Schaeffer, M. W., Jutta, T., Whelan, M., Wittwehr, C., and Zuang, V. (2014) Alternative methods for regulatory toxicology – a state-of-the-art review, Institute for Health and Consumer Protection, Joint Research Center (JRC). https://ec.europa.eu/jrc/en/publication/eur-scientific-and-technical-research-reports/alternativemethods-regulatory-toxicology-state-art-review (Accessed 03/26/2016)

(3)

Kulkarni, A., and Hopfinger, A. J. (1999) Membrane-Interaction QSAR Analysis: Application to the Estimation of Eye Irritation by Organic Compounds. Pharm. Res. 16, 1245-1253.

(4)

Bagley, D. M., Gardner, J. R., Holland, G., Lewis, R. W., Vrijhof, H., and Walker, A. P. (1999) Eye Irritation: Updated Reference Chemicals Data Bank. Toxicol. In Vitro 13, 505-510.

(5)

Kay, J. H., and Calandra, J. C. (1962) Interpretation of eye irritation test. J. Cosmet. Sci. 13, 281-289.

(6)

Scott, L., Eskes, C., Hoffmann, S., Adriaens, E., Alepee, N., Bufo, M., Clothier, R., Facchini, D., Faller, C., Guest, R., Harbell, J., Hartung, T., Kamp, H., Le Varlet, B., Meloni, M., McNamee, P., Osborne, R., Pape, W., Pfannenbecker, U., Prinsen, M., Seaman, C., Spielmann, H., Stokes, W., Trouba, K., Van den Berghe, C., Van Goethem, F., Vassallo, M., Vinardell, P., and Zuang, V. (2010) A proposed eye irritation testing strategy to reduce and replace in vivo studies using Bottom-Up and Top-Down approaches. Toxicol. In Vitro 24, 1-9.

(7)

Patlewicz, G., Ball, N., Booth, E. D., Hulzebos, E., Zvinavashe, E., and Hennes, C. (2013) Use of category approaches, read-across and (Q)SAR: General considerations. Regul. Toxicol. Pharmacol. 67, 1-12.

(8)

EU Regulation, Regulation (EC) No 1223/2009. http://eur-lex.europa.eu/legal-content/EN/TXT/ (Accessed 03/26/2016)

(9)

Brantom, P. G., Bruner, L. H., Chamberlain, M., DeSilva, O., Dupuis, J., Earl, L. K., Lovell, D. P., Pape, W. J. W., Uttley, M., Bagley, D. M., Baker, F. W., Bracher, M., Courtellemont, P., Declercq, I., Freeman, S., Steiling, W., Walker, A. P., Carr, G. J., Dami, N., Thomas, G., Harbell, J., Jones, P. A., Pfannenbecker, U., Southee, J. A., Tcheng, M., Argembeaux, H., Castelli, D., Clothier, R., Esdaile, D. J., Itigaki, H., Jung, K., Kasai, Y., Kojima, H., Kristen, U., Larnicol, M., Lewis, R. W., Marenus, K., Moreno, O., Peterson, A., Rasmussen, E. S., Robles, C., and Stern, M. (1997) A summary report of the COLIPA international validation study on alternatives to the Draize rabbit eye irritation test. Toxicol. In Vitro 11, 141-179.

(10)

ICCVAM. (2006) In vitro test methods for detecting ocular corrosives and severe irritants. Background Review Documents. (Accessed 03/26/2016)

(11)

Bagley, D. M., Botham, P. A., Gardner, J. R., Holland, G., Kreiling, R., Lewis, R. W., Stringer, D. A., and Walker, A. P. (1992) Eye irritation: Reference chemicals data bank. Toxicol. In Vitro 6, 487-491.


(12)

Barratt, M. D. (1995) A quantitative structure-activity relationship for the eye irritation potential of neutral organic-chemicals. Toxicol. Lett. 80, 69-74.

(13)

Barratt, M. D. (1997) QSARS for the eye irritation potential of neutral organic chemicals. Toxicol. In Vitro 11, 1-8.

(14)

Chamberlain, M., and Barratt, M. D. (1995) Practical applications of QSAR to in vitro toxicology illustrated by consideration of eye irritation. Toxicol. In Vitro 9, 543-547.

(15)

Patlewicz, G. Y., Rodford, R. A., Ellis, G., and Barratt, M. D. (2000) A QSAR model for the eye irritation of cationic surfactants. Toxicol. In Vitro 14, 79-84.

(16)

Cronin, M. T. D., Basketter, D. A., and York, M. (1994) A quantitative structure-activity relationship (QSAR) investigation of a Draize eye irritation database. Toxicol. In Vitro 8, 21-28.

(17)

Abraham, M. H., Hassanisadi, M., Jalali-Heravi, M., Ghafourian, T., Cain, W. S., and Cometto-Muñiz, J. E. (2003) Draize Rabbit Eye Test Compatibility with Eye Irritation Thresholds in Humans: A Quantitative Structure-Activity Relationship Analysis. Toxicol. Sci. 76, 384-391.

(18)

Saliner, A. G., Patlewicz, G., and Worth, A. P. (2008) A Review of (Q)SAR Models for Skin and Eye Irritation and Corrosion. QSAR Comb. Sci. 27, 49-59.

(19)

Enslein, K. (1988) An overview of structure-activity-relationships as an alternative to testing in animals for carcinogenicity, mutagenicity, dermal and eye irritation, and acute oral toxicity. Toxicol. Ind. Health 4, 479-498.

(20)

Enslein, K., Borgstedt, H. H., Blake, B. W., and Hart, J. B. (1987) Prediction of rabbit skin irritation severity by structure activity relationships. In Vitro Toxicol. 1, 129-147.

(21)

Idea Consult Inc. https://www.ideaconsult.net. (Accessed 03/26/2016)

(22)

Herzler, M. (2010) The BfR decision support system (DSS) for local lesions, In http://www.ufz.de/export/data/32/38810_8_OSIRIS_Berlin_2010_Herzler.pdf, Federal Institute for Risk Assessment, Berlin. (Accessed 03/26/2016)

(23)

Gerner, I., Liebsch, M., and Spielmann, H. (2005) Assessment of the eye irritating properties of chemicals by applying alternatives to the Draize rabbit eye test: the use of QSARs and in vitro tests for the classification of eye irritation. Altern. Lab Anim. 33, 215-237.

(24)

IdeaConsult Ltd., (2009) Toxtree User Manual, 4 Angel Kanchev St. 1000 Sofia, Bulgaria.

(25)

Sanderson, D. M., and Earnshaw, C. G. (1991) Computer prediction of possible toxic action from chemical structure; the DEREK system. Hum. Exp. Toxicol. 10, 261-273.

(26)

Bhhatarai, B., and Schurer, S. (2011) SMARTNames: a new framework to organize chemical structural information based on chemically relevant functional groups, In Abstracts of the papers of the American Chemical Society (ACS), American Chemical Society 1155 16TH ST, NW, Washington, DC 20036 USA, Denver, CO.

(27)

http://accelrys.com/products/pipeline-pilot/. Accelrys Scitegic Pipeline Pilot v8.5, Accelrys Software Inc., San Diego.

(28)

Enoch, S. J., Roberts, D. W., and Cronin, M. T. D. (2010) Mechanistic Category Formation for the Prediction of Respiratory Sensitization. Chem. Res. Toxicol. 23, 1547-1555.

(29)

Hulzebos, E., and Gerner, I. (2010) Weight factors in an Integrated Testing Strategy using adjusted OECD principles for (Q)SARs and extended Klimisch codes to decide on skin irritation classification. Regul. Toxicol. Pharmacol. 58, 131-144.

(30)

Klopman, G., Ptchelintsev, D., Frierson, M., Pennisi, S., Renssskers, K., and Dickens, M. (1993) Multiple computer automated structure evaluation methodology as an alternative to in vivo eye irritation testing. Altern. Lab. Anim. 21, 13.

(31)

eChemPortal, Registered Substances ECHA, http://echa.europa.eu/web/guest/information-onchemicals/registered-substances. (Accessed 03/26/2016)

(32)

ICCVAM. (2010) Evaluation of In Vitro Ocular Test Methods. https://ntp.niehs.nih.gov/pubhealth/evalatm/test-method-evaluations/ocular/in-vitro/tmer/index.html, (Accessed 03/26/2016)

(33)

Verma, R. P., and Matthews, E. J. (2015) Estimation of the chemical-induced eye injury using a weight-of-evidence (WoE) battery of 21 artificial neural network (ANN) c-QSAR models (QSAR-21): Part I: Irritation potential. Regul. Toxicol. Pharmacol. 71, 331-336.

(34)

Klimisch, H. J., Andreae, M., and Tillmann, U. (1997) A Systematic Approach for Evaluating the Quality of Experimental Toxicological and Ecotoxicological Data. Regul. Toxicol. Pharmacol. 25, 1-5.

(35)

ChemIDPlus. ChemID Plus Advanced. http://chem.sis.nlm.nih.gov/chemidplus/ (Accessed 03/26/2016)

(36)

Ghose, A. K., Viswanadhan, V. N., and Wendoloski, J. J. (1998) Prediction of Hydrophobic (Lipophilic) Properties of Small Organic Molecules Using Fragmental Methods:  An Analysis of ALOGP and CLOGP Methods. J. Phys. Chem. A 102, 3762-3772.

(37)

http://www.oecd.org/chemicalsafety/risk-assessment/theoecdqsartoolbox.htm. OECD QSAR Toolbox. (Accessed 03/26/2016)

(38)

TIBCO SpotFire. SpotFire for data visualization and decision making, http://spotfire.tibco.com/.

(39)

Singh, M., Mallampati, R., Patlolla, R. R., Vashi, P., Hayden, P., and Klausner, M. (2009) To develop an in vitro EFT-300 Skin Model for study of wound healing, In Society of Toxicology (SOT) Meeting, Baltimore MD.


(40)

Roberts, D. W., Aptula, A. O., and Patlewicz, G. (2006) Mechanistic Applicability Domains for Non-Animal Based Prediction of Toxicological Endpoints. QSAR Analysis of the Schiff Base Applicability Domain for Skin Sensitization. Chem. Res. Toxicol. 19, 1228-1233.

(41)

http://www.multicase.com. 23811 Chagrin Blvd. Ste 305,Beachwood, OH.


Figure Legends:

Figure 1. Guidelines for ocular irritation classification recommended by A) OECD and B) EU. C) Categorization of MAS score for classifying data from Bagley et al. (1992).

Figure 2. Use of TOPKAT for ocular irritation prediction distinguishing A) non-irritant vs. irritant, B) mild vs. moderate irritant and C) moderate vs. severe irritant. The results are implemented in a pipeline flow to distinguish irritant categories. Blue circles are in-domain and red circles are out-of-domain compounds.

Figure 3. a) Distribution of experimental compounds vs. product categories based on Dow business groups, where the majority of compounds are from building and construction (23%) followed by coating materials (16.8%). b) Distribution of total experimental data in terms of (A) common chemical functional groups and building blocks, where the majority are aliphatic alcohols (18%) followed by ethers (13%) and amines (12%); (B) organics and inorganics; and (C) different levels of ocular irritancy, where the majority are severe (42%) followed by mild (26%).

Figure 4. Distribution of (I) severe and (II) non-irritant compounds for A) amines and B) alcohols. The total number of compounds and the percentile value are labelled.

Figure 5. Distribution of experimental data vs. positive prediction for estimation using a) TOPKAT and applicability domain (AD), b) Toxtree and c) Derek Nexus.

Figure 6. Correct prediction comparison by in silico models in terms of experimental classes. A) Two experimental classes, B) three experimental classes obtained by splitting irritant into mild/moderate and severe/corrosive.

Figure 7. The AlogP_MR cutoffs observed for a) in-house compounds, b) ECHA corrosive compounds, c) ECETOC Bagley et al. compounds, d) ICCVAM's data, and e) FDA-CFSAN Verma and Matthews data. The color codes represent decreases in severity going from red to yellow to green, except for the ECHA data, which all have a Klimisch reliability score of 1. The Y-axis represents the AlogP_MR value, whereas the parameters on the X-axis were chosen for better visualization and data spread and have no relevance to the plots. The color codes represent the irritancy category; the circle size depends on molecular weight.

Schematics

Schematic 1. Schematic representing the knowledge-based screening frameworks described in the manuscript (represented by a + sign) to assess parameters that potentially impact ocular irritation. This scheme does not consider other physicochemical properties included in the BfR rule base, nor the knowledge that strong dermal irritants can be ocular irritants.

Figures

Figure 1

Figure 2

Figure 3

Figure 4

Figure 5

Figure 6

Figure 7

Tables:

Table 1. Three TOPKAT QSAR models for ocular irritation with the number of compounds, AlogP range, molecular weight, and the chemical categorization based on the true or false predictions from each model.

| Model | TOPKAT model | Compounds | AlogP range | Mol. weight | Prediction False | Prediction True |
|---|---|---|---|---|---|---|
| A | TOPKAT Ocular None vs. Irritant | 1241+218 | -4.20 to 16.26 | 30 to 1199 | Non-irritant | Irritant |
| B | TOPKAT Ocular Irritancy Mild vs. Moderate Severe | 855+386 | -4.20 to 16.26 | 30 to 1021 | Mild | Moderate and Severe |
| C | TOPKAT Ocular Irritancy Moderate vs. Severe | 530+325 | -3.83 to 16.26 | 30 to 925 | Moderate | Severe |
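The three binary models in Table 1 can be chained, in the spirit of the pipeline flow mentioned in the Figure 2 legend. The following is a minimal Python sketch of such a cascade, not TOPKAT's API: the function name `cascade_category` and its boolean arguments are hypothetical, and each argument simply stands for the "Prediction True" outcome of models A, B and C as listed in Table 1.

```python
def cascade_category(a_irritant: bool,
                     b_moderate_or_severe: bool,
                     c_severe: bool) -> str:
    """Combine three binary model calls into one irritancy category.

    Hypothetical helper: the arguments are the 'Prediction True' outcomes
    of models A, B and C in Table 1, supplied by the caller. Nothing here
    calls TOPKAT itself.
    """
    if not a_irritant:
        # Model A (None vs. Irritant): prediction false means non-irritant.
        return "Non-irritant"
    if not b_moderate_or_severe:
        # Model B (Mild vs. Moderate Severe): prediction false means mild.
        return "Mild"
    # Model C (Moderate vs. Severe) splits the remaining irritants.
    return "Severe" if c_severe else "Moderate"


if __name__ == "__main__":
    print(cascade_category(True, True, False))
```

Later models are consulted only when the earlier ones call the compound an irritant, so a non-irritant call from model A ends the cascade regardless of the other two flags.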

Table 2. The Toxtree category rules and explanations observed for the compounds studied.

| Category | Ocular irritation and corrosion | Ocular irritation and corrosion#explanation | Explanation | Classification |
|---|---|---|---|---|
| 1 | NOT skin corrosion R34 or R35 | 1N,2N,3N,4N,5N,6N,7N,8N,9Y,9.1N,9.2N,9.3N,9.4Y | 9.4Y is logP GT 4.5 | |
| 2 | NOT lesions R34, R35, R36 or R41 | 1N,2Y | 2Y is logP GT 9.0 | Non irritant |
| 3 | NOT eye irritation R41 | 1N,2N,3N,4N,5N,6N,7N,8N,9N,10N,11Y,11.1N,11.2N,11.3N,11.4N,11.5Y | 11.5Y is logP GT 1.5 | Not Severe |
| 4 | NOT eye irritation R36 | 1N,2N,3N,4N,5N,6N,7Y | 7Y MW is 650 | Non irritant |
| 5 | NOT corrosion R34, R35 or R41 | 1N,2N,3N,4N,5N,6N,7N,8N,9N,10Y,10.1Y | 10.1Y is logP GT 3.8 | Not Corrosive |
| 6 | NOT lesions R34, R35 or R36 | 1N,2N,3N,4N,5N,6N,7N,8N,9N,10N,11N,12Y,12.1Y | 12.1Y is MW GT 370 | Non Irritant |
| 8 | Serious lesions to the eye R41 | 1N,2N,3N,4N,5N,6N,7N,8N,9N,10N,11N,12N,13N,14N,15N,16N,17N,18N,19N,20N,21N,22N,23N,24N,25N,26N,27N,28Y | 28Y is Triphenylphosphonium salts | Severe |
| 9 | Moderate reversible irritation to the eye R36 | 1N,2N,3N,4N,5N,6N,7N,8Y,8.1N,8.2N,8.3N,8.4N,9N,10N,11N,12N,13N,14N,15N,16N,17N,18N,19N,20N,21N,22N,23N,24N,25N,26N,27N,28N,29N,30N,31N,32Y | 32Y is aliphatic carboxylic Acid | Moderate |
| 10 | Skin corrosion R34 or R35 | 1N,2N,3N,4N,5N,6N,7N,8N,9N,10N,11N,12N,13N,14N,15N,16N,17N,18N,19N,20N,21N,22N,23N,24N,25N,26N,27N,28N,29N,30N,31N,32N,33N,34N,35N,36N,37N,38Y | 38Y is Aliphatic Amines | Corrosive |
| 11 | Unknown | 1N,2N,3N,4N,5N,6N,7N,8Y,8.1N,8.2N,8.3N,8.4N,9N,10N,11N,12N,13N,14N,15N,16N,17N,18N,19N,20N,21N,22N,23N,24N,25N,26N,27N,28N,29N,30N,31N,32N,33N,34N,35N,36N,37N,38N,39N | Unknown | Unknown |
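The explanation strings in Table 2 list each rule number followed by Y or N, and the Explanation column names the last rule answered Y (for example, 9.4Y is logP GT 4.5). A minimal Python sketch for pulling that deciding rule out of such a string is given below. The function name `deciding_rule` is hypothetical, and treating the last Y answer as the deciding rule is an assumption drawn from the table, not from Toxtree documentation.

```python
from typing import Optional


def deciding_rule(path: str) -> Optional[str]:
    """Return the identifier of the last rule answered 'Y', or None.

    Hypothetical helper. Assumes a comma-separated string such as
    '1N,2N,9Y,9.1N,9.4Y'; stray spaces inside tokens (an extraction
    artifact, e.g. '1 1.5Y') are removed before parsing.
    """
    tokens = [tok.replace(" ", "") for tok in path.split(",")]
    fired = [tok[:-1] for tok in tokens if tok.endswith("Y")]
    return fired[-1] if fired else None


if __name__ == "__main__":
    print(deciding_rule("1N,2N,3N,4N,5N,6N,7N,8N,9Y,9.1N,9.2N,9.3N,9.4Y"))
```

A string with no Y answers, such as the all-N path of the Unknown category, returns `None`.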

Table 3. Experimental classes and number of compounds predicted by each in silico model, irrespective of the correct prediction, for different toxicity levels. *Irritant category is only available for Derek. **Not Corrosive category is only available for Toxtree. #Only dermal irritation data exist for these compounds, which could offer a hint for understanding ocular irritation data.

| Irritancy Level | Experimental data (1644 total) | TOPKAT Predicted | Toxtree Predicted | Derek Predicted |
|---|---|---|---|---|
| Severe | 684 | 604 | 127 | 9 |
| Moderate | 320 | 409 | 11 | NA |
| Mild | 436 | 271 | NA | NA |
| Irritant* | | NA | NA | 171 |
| Non-irritant | 162 | 360 | 206 | NA |
| Not corrosive** | | NA | 12 | NA |
| No ocular data# | 42 | NA | 1288 | 1464 |

Table 4. Experimental data, number (No.) and percentage (Perc.) of correct predictions from all three in silico models (individually and in consensus) for different irritancy levels. Also shown are the data for correctly and not-correctly predicted compounds. *Consensus prediction using all three models for non-irritants. This value was derived by taking 66 non-irritants and including only those compounds that were correctly predicted by three models to be non-irritant or at least two models predicted non-irritant and one predicted unknown.

Correct prediction by individual models:

| Irritancy Level | Experimental data | TOPKAT No. | TOPKAT Perc. | TOPKAT out of AD compounds | Toxtree No. | Toxtree Perc. | Derek No. | Derek Perc. |
|---|---|---|---|---|---|---|---|---|
| Severe | 684 | 332 | 48.54 | 35/332 | 89 | 13.01 | 85 | 12.43 |
| Moderate | 320 | 79 | 24.69 | 30/79 | 2 | 0.53 (Moderate and Mild) | 37 | 9.13 (Moderate and Mild) |
| Mild | 436 | 83 | 19.04 | 5/83 | 2 | | 32 | |
| Non-irritant | 162 | 52 | 32.10 | 5/52 | 34 | 20.99 | NA | 0 |
| No ocular data | 42 | | | | | | | |

Consensus prediction:

| Irritancy Level | Not predicted correctly No. | Not predicted correctly Perc. | Predicted correctly No. | Predicted correctly Perc. |
|---|---|---|---|---|
| Severe | 326 | 47.67 | 358 | 52.33 |
| Moderate | 204 | 63.75 | 116 | 36.25 |
| Mild | 324 | 74.31 | 112 | 25.69 |
| Non-irritant | 96 | 59.25 | 45* | 27.78 |
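The consensus rule for non-irritants stated in the Table 4 caption (all three models predict non-irritant, or two predict non-irritant and one predicts unknown) can be written as a short check. This is a minimal Python sketch under stated assumptions: the function name `consensus_non_irritant` and the label strings "non-irritant" and "unknown" are illustrative and do not come from the software outputs.

```python
from typing import Sequence


def consensus_non_irritant(predictions: Sequence[str]) -> bool:
    """Apply the Table 4 caption rule to three model predictions.

    Hypothetical helper: `predictions` holds one label per model
    (TOPKAT, Toxtree, Derek); the label spellings are assumptions.
    """
    if len(predictions) != 3:
        raise ValueError("expected exactly three model predictions")
    labels = [p.strip().lower() for p in predictions]
    n_non_irritant = labels.count("non-irritant")
    n_unknown = labels.count("unknown")
    # Three agreeing models, or two agreeing plus one unknown.
    return n_non_irritant == 3 or (n_non_irritant == 2 and n_unknown == 1)


if __name__ == "__main__":
    print(consensus_non_irritant(["Non-irritant", "Non-irritant", "Unknown"]))
```

Under this reading, a compound with two non-irritant calls and one call of any other class (for example, severe) does not count toward the consensus.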

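Table 5. Bagley experimental data and correct predictions using all three in silico models.

| Toxicity Level | Experimental data | TOPKAT No. | TOPKAT Perc. | TOPKAT out of AD compounds | Toxtree No. | Toxtree Perc. | Derek No. | Derek Perc. | Consensus predicted correctly No. | Consensus predicted correctly Perc. |
|---|---|---|---|---|---|---|---|---|---|---|
| Corrosive/Severe | 26 | 16 | 61.54 | 1/16 | 5 | 19.23 | 1 | 3.84 | 17 | 65.38 |
| Moderate | 20 | 5 | 25.0 | 2/5 | 1 | 1.85 (Moderate and Mild) | 1 | 5.56 (Moderate and Mild) | 6 | 30.0 |
| Mild | 34 | 9 | 26.47 | 1/9 | 0 | | 2 | | 10 | 29.42 |
| Non-irritant | 43 | 23 | 53.49 | 1/23 | 3 | 6.97 | NA | 0 | 23 | 53.49 |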
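The Perc. values in Table 5 are consistent with the number of correct predictions divided by the number of experimental compounds in the class. The Python sketch below reproduces a few entries. The helper name `percent_correct` is hypothetical, and pooling the Moderate and Mild classes for the single Toxtree and Derek values is an assumption inferred from the arithmetic rather than stated in the table.

```python
def percent_correct(correct: int, total: int) -> float:
    """Percentage of correct predictions, rounded to two decimals."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100.0 * correct / total, 2)


if __name__ == "__main__":
    # TOPKAT, Corrosive/Severe: 16 correct out of 26 compounds.
    print(percent_correct(16, 26))          # 61.54
    # TOPKAT, Non-irritant: 23 correct out of 43 compounds.
    print(percent_correct(23, 43))          # 53.49
    # Toxtree, Moderate and Mild pooled (assumption): 1 + 0 of 20 + 34.
    print(percent_correct(1 + 0, 20 + 34))  # 1.85
    # Derek, Moderate and Mild pooled (assumption): 1 + 2 of 20 + 34.
    print(percent_correct(1 + 2, 20 + 34))  # 5.56
```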

Table 6. Physicochemical cut-offs for AlogP and AlogP_MR used for differentiating severe compounds from others, and their correct prediction percentages. The better predictions among AlogP and AlogP_MR are bolded. Data for a priori (without previous knowledge of irritation potential) as well as a posteriori estimation are given.

A posteriori estimation Datasets In-house ECHA ICCVAM in-vitro Bagley_MMAS Verma_FDA CFSAN

Experimental data - Severe 684 116 74 26 1029

AlogP_MR < 55 ‘Severe’

A priori estimation

AlogP < 2.2 ‘Severe’

AlogP_MR < 55 ‘Severe’

Correct

Percent

Correct

Percent

Correct

Total