
Mechanistic Reactivity Descriptors for the Prediction of Ames Mutagenicity of Primary Aromatic Amines

Lara Kuhnke, Antonius M. ter Laak, and Andreas H. Göller

J. Chem. Inf. Model., Just Accepted Manuscript • DOI: 10.1021/acs.jcim.8b00758 • Publication Date (Web): 29 Jan 2019


Journal of Chemical Information and Modeling

Lara Kuhnke*,†, Antonius ter Laak†, Andreas H. Göller‡

† Bayer AG, Pharmaceuticals R&D, 13353 Berlin, Germany
‡ Bayer AG, Pharmaceuticals R&D, 42096 Wuppertal, Germany

KEYWORDS: In silico prediction, Ames mutagenicity, primary aromatic amines, quantum mechanical descriptors, mutagen, structure-activity relationship

ABSTRACT: Pharmaceutical products are often synthesized using reactive starting materials and intermediates. These can bind to DNA, either directly as impurities or after metabolic activation. Primary aromatic amines belong to the critical classes that are considered potentially mutagenic in the Ames test, so there is a great need for good prediction models for risk assessment. The mutagenic potential of primary aromatic amines can be rationalized by the widely accepted nitrenium ion hypothesis: reactive electrophiles formed from the aromatic amines bind covalently to DNA. Because the reactive species differs in chemical structure from the parent compound, it is difficult to achieve good predictions with classical descriptor- or fingerprint-based machine learning. In this approach, we use a combination of molecular and atomic descriptors that captures different mechanistic aspects of the metabolic transformation leading from the primary aromatic amine to the reactive metabolite that binds to DNA. Applied to a test set, the combination performs significantly better than models that use only one of these descriptors, and it complements the general internal Ames mutagenicity prediction model at Bayer.

INTRODUCTION. Primary aromatic amines (pAAs) are often used in medicinal chemistry because of their broad range of chemical variability. This structure class, however, carries a high risk of being mutagenic through metabolic activation. Mutagenicity is addressed early in drug discovery to avoid possible safety issues. The most common assay for mutagenic effects is the bacterial reverse mutation test, known as the Ames assay1, which is considered a reliable early predictor of mutagenic and carcinogenic potential. It assesses the mutagenic potential of substances by looking for back mutations in different bacterial strains, typically of Salmonella typhimurium, that detect different types of mutations, e.g., point mutations or frameshift mutations. Compounds can be tested with or without a rat liver extract (S9) that contains active liver enzymes to simulate metabolism in the bacterial assay. Primary aromatic amines exert their mutagenic potential through metabolic activation, which can be tested by adding the S9 mix to the Ames assay. The metabolic activation pathway of aromatic amines is well understood2: pAAs are activated by N-hydroxylation through cytochrome P450 enzymes, and the intermediate hydroxylamines can then form a reactive nitrenium ion, either directly or via transferases (Figure 1). This positively charged nitrenium species can bind to the negatively charged DNA and cause miscoding during DNA replication, and hence has a mutagenic effect. For many other structural classes, Ames mutagenicity can be predicted in a straightforward manner using common QSAR methods with fingerprint descriptors3. However, when we apply our general model to primary aromatic amines, it performs significantly worse than for typical drug molecules: on non-pAAs the model has an F1 measure of 0.87, whereas on pAAs it reaches only 0.51. The model seems to treat the aromatic amine substructure as an indicator of mutagenicity, as can be seen from the drop in negative predictive value from 0.89 for non-pAAs to 0.56 for pAAs at a similar precision (0.87 for non-pAAs compared to 0.84 for pAAs).4

There are several approaches to predicting Ames mutagenicity for this unique structural class. Ahlberg et al.5 reported good prediction performance by encoding a list of substructural features such as substitution patterns of the aniline or heterocycles. They encoded 591 of these features in an SAR fingerprint and calculated for each feature a statistical relevance as an activating or deactivating fragment. While this method worked well on small building-block-like structures, we were not able to apply it to our internal data, which included larger compounds and more complicated, diverse substitution patterns.6 Bentzien et al.7 used the “nitrenium ion hypothesis” developed by Ford et al.2 to distinguish Ames-positive from Ames-negative compounds by calculating the stability of the nitrenium ion that is formed during metabolism. While we were able to reproduce the good performance on the test set they used, the method did not perform well on our internal data. Debnath et al.8 used Ames data obtained only with S9 metabolic activation in the Salmonella strains TA98 and TA100 and used AM1 molecular orbital energies (HOMO and LUMO) to express the mutagenic potency as log(revertants/nmol), arriving at one equation for each strain. They observed a positive correlation with the HOMO values and a negative correlation with the LUMO values. All these approaches aim to describe one key step involved in the mutagenicity of pAAs and work well if the mutagenicity of the pAA in question is determined by exactly that reaction. However, our internal data set is large and very likely includes compounds for which different key steps of metabolic activation are decisive, and thus cannot be predicted by only one approach. Therefore, we decided to combine several methods to describe the different steps of the mechanism as well as possible.

FIGURE 1. Mechanism of the nitrenium ion hypothesis by Ford and Griffin2

DATA. We collected compounds from our internal databases in both the Pharma and Crop Science divisions of Bayer as well as from the Lhasa database9 that were tested at least in the mini Ames test (TA98, TA100). As this method aims to learn the mechanism of mutagenicity of pAAs shown in Figure 1, compounds that showed Ames mutagenicity without the S9 liver activation mix were excluded, resulting in the loss of 2% of the data set. In these cases, we have to assume that the mutagenic effect is caused by a substructure other than the activated pAA, so these compounds would add unwanted noise to the model. Using a BIOVIA Pipeline Pilot10 protocol, salts and counter ions were removed11, bases were deprotonated12, and acids were protonated13 to obtain the neutral chemical structures. Stereochemistry information was discarded, as there is no clear evidence that different stereoisomers behave differently in the Ames assay. All structures have a molecular weight between 80 and 450 g mol^-1. Contradicting Ames result flags were manually curated, or discarded if there was not sufficient information available to decide which test result is correct. After curating and filtering our data, we had 1657 compounds with a maximum of two aromatic amino functionalities, of which 883 are Ames positive and 774 Ames negative. 10% of the data set was randomly chosen and set aside as a test set with a similar distribution over the classes (91 compounds Ames positive, 75 compounds Ames negative).

METHODS

ECFC-6. Extended connectivity fingerprints (ECFP) are circular topological fingerprints introduced by Rogers and Hahn14. They systematically encode various atom properties for each heavy atom in the molecule, together with its neighborhood, in circular layers up to a given diameter. By default, every substructure feature occurring in the molecule is stored only once. However, there is the option to keep occurrence counts, in which case the fingerprint is denoted ECFC (extended connectivity fingerprint counts). For our models, we chose fingerprint counts and a diameter of six. We calculated ECFC-6 for the parent pAA with the “Molecular Fingerprint” implementation15 in BIOVIA Pipeline Pilot.

HOMO-LUMO gap. The gap is the difference between the energy of the lowest unoccupied molecular orbital (LUMO) and the energy of the highest occupied molecular orbital (HOMO). It has been used successfully in multiple QSAR models as a parameter for chemical reactivity, molecular stability, and electronic band gaps in solids16. We used the Pipeline Pilot VAMP17 component to calculate the HOMO and LUMO eigenvalues and then calculated the HOMO-LUMO gap for the parent pAA.

Stability of the nitrenium ion ΔΔE. Bentzien et al.7 presented an in silico method that predicts Ames mutagenicity of pAAs using the “nitrenium ion hypothesis” (Figure 1) of Ford et al.2 This approach attributes the mutagenic effect of pAAs to the formation of a reactive nitrenium ion and concludes that the stability of this cation is therefore correlated with the mutagenic potential. They used quantum-mechanical calculations to compute the stability of the nitrenium ion in terms of the relative energy ΔΔE according to this equation:

$$\Delta\Delta E = \Delta E_{\mathrm{ArNH^+}} + \Delta E_{\mathrm{PhNH_2}} - \Delta E_{\mathrm{ArNH_2}} - \Delta E_{\mathrm{PhNH^+}}$$
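As a worked illustration of the relative nitrenium ion stability (ΔΔE = ΔE of ArNH+ plus ΔE of PhNH2, minus ΔE of ArNH2, minus ΔE of PhNH+), the following minimal Python sketch evaluates the expression for given energies. The function name `delta_delta_e` and all numerical values are hypothetical placeholders chosen for the arithmetic, not results from this work; all energies are assumed to be in the same unit.

```python
def delta_delta_e(e_ar_cation, e_ar_neutral, e_ph_cation, e_ph_neutral):
    """Relative nitrenium ion stability with aniline as the reference.

    e_ar_cation  : energy of the nitrenium ion ArNH+ of the query amine
    e_ar_neutral : energy of the neutral query amine ArNH2
    e_ph_cation  : energy of the aniline-derived nitrenium ion PhNH+
    e_ph_neutral : energy of neutral aniline PhNH2
    All values must share one unit (for example heats of formation).
    """
    return e_ar_cation + e_ph_neutral - e_ar_neutral - e_ph_cation


# Hypothetical placeholder energies, NOT values from the paper.
PH_NEUTRAL = 20.0
PH_CATION = 245.0

# A made-up amine whose nitrenium ion is cheaper to form than aniline's.
example = delta_delta_e(e_ar_cation=230.0, e_ar_neutral=25.0,
                        e_ph_cation=PH_CATION, e_ph_neutral=PH_NEUTRAL)

# Aniline compared with itself: the reference point of the scale.
reference = delta_delta_e(PH_CATION, PH_NEUTRAL, PH_CATION, PH_NEUTRAL)

print(example, reference)
```

By construction, aniline itself gives ΔΔE = 0, and a negative value means that forming the nitrenium ion costs less energy than it does for aniline.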

We used heats of formation of AM1-optimized structures, calculated with the semi-empirical AM1 Hamiltonian in the VAMP software as implemented in BIOVIA Pipeline Pilot17. To calculate ΔΔE, we first calculated ΔE of the neutral reference aniline ($\Delta E_{\mathrm{PhNH_2}}$) and of its nitrenium ion ($\Delta E_{\mathrm{PhNH^+}}$), both for AM1 geometry-optimized structures. For every other pAA, the heat of formation of the neutral compound ($\Delta E_{\mathrm{ArNH_2}}$) was calculated. Then, all nitrenium ion species were generated, and the corresponding $\Delta E_{\mathrm{ArNH^+}}$ values were used to calculate ΔΔE. For each compound, the nitrenium ion species with the lowest ΔΔE value was selected as the most likely one.

Sorted Shell Descriptors. Finkelmann et al.18 developed atom-reactivity descriptors that capture the electronic environment of a query atom in the molecule based on atomic charges calculated with the semi-empirical density functional tight binding (DFTB) method as implemented in the software DFTB+18,19. Here we apply the same type of descriptor, but based on the coordinate sets obtained from the AM1 optimizations of the parent pAA. For each query atom, N topological radial shells are defined from the neighbor graph18. As query atom we used the nitrogen of the same aromatic amino group as for ΔΔE. The first shell consists of the direct topological neighbors of the query atom, i.e., all atoms bonded to it. For each member of a shell, this procedure is repeated to identify the atoms of the next shell. This yields one linear descriptor that contains the atomic charges sorted in descending order within each shell, with the N shells concatenated in order (Figure 2). As the substitution pattern studies related to Ahlberg et al.5 confirmed that the para-position of the aromatic amine is crucial for detecting mutagenicity, we decided to use the sorted shell descriptor (charge descriptor) of length six so that it also covers small functional groups such as carboxylic acids and nitro groups in the para position.

Algorithm. We used the random forest classification implementation in BIOVIA Pipeline Pilot for all models20 with the following default settings: 500 trees, maximum tree depth 50, number of descriptors considered at each split sqrt(D), where D is the total number of descriptors, and bagging as the ensemble method.

Performance Measures. The following performance measures are used to quantify the quality of the models. They can all be derived from the confusion matrix, where TP is the number of true positive predictions, TN the number of true negative predictions, FN the number of false negative predictions, and FP the number of false positive predictions.

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$

$$F_1\ \mathrm{measure} = \frac{2 \cdot TP}{2 \cdot TP + FP + FN}$$

$$\mathrm{Precision} = \frac{TP}{TP + FP}$$

$$\mathrm{Negative\ predictive\ value} = \frac{TN}{FN + TN}$$

$$\mathrm{Cohen's\ Kappa} = \frac{p_0 - p_e}{1 - p_e},$$

where

$$p_0 = \frac{TP + TN}{TP + FP + TN + FN}$$

and

$$p_e = \frac{(TP + FN)(TP + FP) + (TN + FP)(TN + FN)}{(TP + FP + TN + FN)^2}.$$

These measures were applied to assess the performance of the model on the 10% test set. Additionally, we performed an external 5-fold cross validation on the training data to assess the robustness of the model with respect to the training data. External cross validation means that the held-out fold was not used to optimize any parameters during training; this gives five scenarios, each with a smaller training set (80% of the training data) and a larger test set (20%). This external cross validation was repeated five times with different folds, and the mean values are reported. The results for both cross validation and test set are shown in Table 1.

Domain of applicability. In the final model, we provide a reliability measure for each individual prediction. It is calculated as the fraction of trees in the forest that vote for the final prediction; e.g., when all 500 trees vote for the same class, the reliability is 100%. When the reliability is below 65%, we mark the compound as out of domain because the model certainty for the prediction is very low. Applied to our test set, 21 compounds would be labeled “out of domain”.
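The performance measures and the vote-based reliability measure described above can be sketched in a few lines of Python. This is a minimal illustration rather than the Pipeline Pilot implementation: the function names, the assumption that the final class is the majority vote (with ties counted as positive), and the example counts are all hypothetical, and Cohen's Kappa is computed in its standard form with the expected agreement normalized by the squared total count.

```python
def confusion_metrics(tp, tn, fp, fn):
    """Performance measures derived from confusion-matrix counts.

    No guard against empty classes: a zero denominator raises
    ZeroDivisionError, which is acceptable for a sketch.
    """
    n = tp + tn + fp + fn
    p0 = (tp + tn) / n  # observed agreement, identical to accuracy
    # Expected chance agreement from the row and column totals.
    pe = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / n ** 2
    return {
        "accuracy": p0,
        "precision": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "f1": 2 * tp / (2 * tp + fp + fn),
        "kappa": (p0 - pe) / (1 - pe),
    }


def vote_reliability(votes_positive, n_trees=500, threshold=0.65):
    """Return (predicted_positive, reliability, in_domain) from tree votes.

    Assumption: the final prediction is the majority class and a tie
    counts as positive; the source only states that reliability is the
    fraction of trees voting for the final prediction.
    """
    votes_negative = n_trees - votes_positive
    predicted_positive = votes_positive >= votes_negative
    reliability = max(votes_positive, votes_negative) / n_trees
    return predicted_positive, reliability, reliability >= threshold


# Hypothetical counts, NOT taken from Table 1.
metrics = confusion_metrics(tp=40, tn=40, fp=10, fn=10)
print(metrics)

# 300 of 500 trees vote positive: reliability 0.6, below the 65% cutoff.
print(vote_reliability(300))
```

With the 65% threshold and 500 trees, a prediction is kept in domain only if at least 325 trees agree on the predicted class.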

FIGURE 2 Scheme for building the sorted shell descriptors.
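The shell construction for the sorted shell descriptor can be sketched as a breadth-first walk over the bond graph. The sketch below is a simplified illustration under stated assumptions, not the published implementation: the molecule is given as a plain adjacency dictionary, the atomic charges are made-up numbers, the function name is hypothetical, and padding of the shells to a fixed vector length is left out.

```python
def sorted_shell_descriptor(adjacency, charges, query_atom, n_shells=6):
    """Return one list of descendingly sorted charges per topological shell.

    adjacency  : dict mapping atom index -> list of bonded atom indices
    charges    : dict mapping atom index -> atomic partial charge
    query_atom : index of the atom the shells are centered on
    n_shells   : number of shells N (six is assumed here)
    """
    visited = {query_atom}
    frontier = [query_atom]
    shells = []
    for _ in range(n_shells):
        next_frontier = []
        for atom in frontier:
            for neighbor in adjacency[atom]:
                if neighbor not in visited:
                    visited.add(neighbor)
                    next_frontier.append(neighbor)
        # Sorting within a shell makes the result independent of atom order.
        shells.append(sorted((charges[a] for a in next_frontier), reverse=True))
        frontier = next_frontier
    return shells


# Toy heavy-atom graph of aniline: atom 0 is N, atoms 1-6 form the ring.
adjacency = {0: [1], 1: [0, 2, 6], 2: [1, 3], 3: [2, 4],
             4: [3, 5], 5: [4, 6], 6: [5, 1]}
# Hypothetical charges for illustration only.
charges = {0: -0.40, 1: 0.10, 2: -0.20, 3: -0.08,
           4: -0.15, 5: -0.12, 6: -0.05}

shells = sorted_shell_descriptor(adjacency, charges, query_atom=0)
print(shells)
```

In this toy ring the para carbon (atom 4) falls into the fourth shell, so atoms of a substituent attached there would appear in the fifth and sixth shells.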

RESULTS

Fingerprint-based models. Extended connectivity fingerprint counts (ECFC-6) showed good results in our general Ames model when primary aromatic amines are excluded. The general Ames model is a random forest classification model built on a training set of 9279 compounds. It yielded an F1 measure of 0.87, a negative predictive value (NPV) of 0.89, and a precision of 0.87. Therefore, using the same approach to predict the pAA class seemed reasonable. We kept the fingerprint diameter of six because it can include the para-substituted functional group of the ring, and Ahlberg et al.5 showed that substitution patterns play a fundamental role in the mutagenicity of pAAs. However, the performance of this model was very poor, with a Kappa of 0.37 and an F1 measure of 0.63, while the NPV reached just 0.6 and the precision 0.67 (Table 1).

The model using only the HOMO-LUMO gap as descriptor yielded an F1 measure of 0.61, a very poor Kappa of 0.16, a precision of 0.63, and an NPV of 0.53 (Table 1). Using ΔΔE resulted in a Kappa of 0.27, an F1 measure of 0.7, a precision of 0.72, and an NPV of 0.73 (Table 1). The sorted shell descriptors performed best on the test set, with a Kappa of 0.49, an F1 measure of 0.73, a precision of 0.76, and an NPV of 0.73. This model looks reasonably good; however, we were not satisfied with its performance in the external 5-fold cross validation, where, for example, the F1 measure drops to 0.66.

Combinations of descriptors. None of the descriptors alone led to a good, robust model, but each showed some predictive ability for Ames mutagenicity of primary aromatic amines. We therefore combined the two vector descriptors and the two single-value descriptors to see whether the combinations improve the model. The combination of the single-value descriptors HOMO-LUMO gap and ΔΔE shows a clear improvement over the model using only the HOMO-LUMO gap. The difference from using only ΔΔE is not as clear in the test set results, but the external cross validation also shows an improvement from combining the descriptors over using them separately. The combination of the vector descriptors ECFC-6 and the sorted shell descriptor also led to an improvement over the individual descriptors. Here, on the test set we achieved a Kappa of 0.57, an F1 measure of 0.81, a precision of 0.82, and an NPV of 0.76 (Table 1).


The combination of all four descriptors shows a significant overall improvement in all measures for both cross validation and test set performance (Table 1). We also tested all other combinations of the descriptors; the results are listed in Table 1.

Table 1. Performance of the model using different descriptor sets

| Descriptor set | Accuracy (CV mean) | NPV (CV mean) | Precision (CV mean) | F-Measure (CV mean) | Kappa (CV mean) | Accuracy (test set) | NPV (test set) | Precision (test set) | F-Measure (test set) | Kappa (test set) |
|---|---|---|---|---|---|---|---|---|---|---|
| Single descriptors | | | | | | | | | | |
| ECFC-6 | 0.73 | 0.69 | 0.77 | 0.77 | 0.46 | 0.77 | 0.6 | 0.67 | 0.63 | 0.37 |
| HOMO-LUMO gap | 0.62 | 0.58 | 0.67 | 0.68 | 0.25 | 0.58 | 0.53 | 0.63 | 0.61 | 0.16 |
| ΔΔE | 0.62 | 0.6 | 0.65 | 0.63 | 0.25 | 0.64 | 0.69 | 0.72 | 0.7 | 0.27 |
| Sorted shell 6 | 0.68 | 0.66 | 0.69 | 0.66 | 0.35 | 0.75 | 0.73 | 0.76 | 0.73 | 0.49 |
| Two descriptor combinations | | | | | | | | | | |
| HOMO-LUMO gap, ΔΔE | 0.67 | 0.64 | 0.69 | 0.67 | 0.33 | 0.66 | 0.64 | 0.67 | 0.61 | 0.3 |
| ECFC-6, Sorted shell 6 | 0.73 | 0.7 | 0.76 | 0.75 | 0.47 | 0.78 | 0.74 | 0.82 | 0.81 | 0.57 |
| ECFC-6, ΔΔE | 0.75 | 0.71 | 0.78 | 0.78 | 0.49 | 0.8 | 0.77 | 0.82 | 0.8 | 0.59 |
| ECFC-6, HOMO-LUMO gap | 0.74 | 0.7 | 0.78 | 0.78 | 0.48 | 0.8 | 0.76 | 0.83 | 0.81 | 0.59 |
| ΔΔE, Sorted shell 6 | 0.7 | 0.68 | 0.71 | 0.68 | 0.39 | 0.73 | 0.7 | 0.75 | 0.72 | 0.45 |
| HOMO-LUMO gap, Sorted shell 6 | 0.7 | 0.68 | 0.72 | 0.69 | 0.4 | 0.75 | 0.73 | 0.76 | 0.72 | 0.49 |
| Three descriptor combinations | | | | | | | | | | |
| ECFC-6, HOMO-LUMO gap, Sorted shell 6 | 0.74 | 0.71 | 0.77 | 0.75 | 0.48 | 0.78 | 0.74 | 0.82 | 0.8 | 0.56 |
| ECFC-6, HOMO-LUMO gap, ΔΔE | 0.75 | 0.71 | 0.77 | 0.76 | 0.48 | 0.81 | 0.74 | 0.81 | 0.8 | 0.55 |
| HOMO-LUMO gap, Sorted shell 6, ΔΔE | 0.71 | 0.69 | 0.72 | 0.70 | 0.41 | 0.75 | 0.73 | 0.77 | 0.74 | 0.5 |
| ECFC-6, ΔΔE, Sorted shell 6 | 0.77 | 0.71 | 0.76 | 0.75 | 0.47 | 0.74 | 0.80 | 0.73 | 0.79 | 0.54 |
| Combination of all four descriptors | | | | | | | | | | |
| HOMO-LUMO gap, Sorted shell 6, ΔΔE, ECFC-6 | 0.78 | 0.72 | 0.79 | 0.79 | 0.51 | 0.82 | 0.79 | 0.82 | 0.82 | 0.61 |

The table shows the mean performance for a 5x5-fold cross validation on the training set as well as the performance on the 10% test set using different sets of descriptors. The performance measures are the accuracy, the negative predictive value (NPV), the precision, the F1 measure, and Kappa.

DISCUSSION. There are various approaches to predicting the mutagenicity of primary aromatic amines. However, most of them cover only one of the metabolic steps by which a pAA forms the reactive metabolite that binds covalently to DNA. Our approach combines multiple descriptors that capture different key steps of the pathway and thereby improves model performance.

These descriptors cannot provide a complete description of the underlying mechanism, as complete physics-based modeling would, but they explain some of the important steps that lead to a mutagenic effect. As we were able to compile a large data set using not only public but also Bayer-internal data, we can now cover more chemical space and thereby extend the applicability domain of the model. While we tried to retain only data that follow the mechanism of mutagenicity of aromatic amines, we are aware that not all primary aromatic amines necessarily follow the generally accepted mechanism. Other parts of the molecule can be equally responsible for the mutagenic effect. To derive a cleaner training set, we excluded up front data points that showed mutagenicity without the activating S9 liver mix. In this way we aimed to lower the risk of training a model for the nitrenium ion mechanism with non-fitting data. However, for novel primary aromatic amines we cannot know whether another part of the molecule might lead to Ames mutagenicity even if the aromatic amine does not. In this case, our model might falsely classify them as Ames negative. Nevertheless, for most compounds with molecular weight and properties similar to our training data, the mutagenic effect can be predicted by the pAA mechanism of mutagenicity via an assumed liver metabolism. Additionally, not all descriptors focus only on the amine part of the molecule. In the end, a machine-learning model simply learns a nonlinear mapping from combinations of features to the learned endpoint. The reliability measure provided for each individual prediction further qualifies the classification.

AUTHOR INFORMATION Corresponding Author * E-Mail: [email protected]

Orcid Lara Kuhnke: 0000-0002-4918-0132

Notes The authors declare no competing financial interest.

ACKNOWLEDGMENT We thank Dr. Joerg Wichard for assistance with deriving and understanding the data and for the fruitful discussions, as well as Federica Maschietto for her evaluations of the substitution patterns and the ΔΔE descriptor.

ABBREVIATIONS pAA, primary aromatic amine; ECFC-6, extended connectivity fingerprint counts with diameter 6; HOMO, highest occupied molecular orbital; LUMO, lowest unoccupied molecular orbital; NPV, negative predictive value


REFERENCES

1. Ames, B. N.; McCann, J.; Yamasaki, E. Methods for Detecting Carcinogens and Mutagens with the Salmonella/Mammalian-Microsome Mutagenicity Test. Mutation Research, 1975, Vol. 31, Nr. 6, pp 347–364.
2. Ford, G.P.; Griffin, G.R. Relative stabilities of nitrenium ions derived from heterocyclic amine food carcinogenesis: relationship to mutagenicity. Chem. Biol. Interact., 1992, 81, pp 19–33.
3. Hillebrecht, A.; Muster, W.; Brigo, A.; Kansy, M.; Weiser, T.; Singer, T. Comparative Evaluation of in Silico Systems for Ames Test Mutagenicity Prediction: Scope and Limitations. Chem. Res. Toxicol., 2011, 24 (6), pp 843–854.
4. Kuhnke, L. In silico prediction of Ames mutagenicity: Improving prediction performance by considering different DNA-binding mechanisms. Master thesis, FU Berlin, 2013.
5. Ahlberg, E.; Amberg, A.; Beilke, L.D.; Bower, D.; Cross, K.P.; Custer, L.; Ford, K.A.; Van Gompel, J.; Harvey, J.; Honma, M.; Jolly, R.; Joossens, E.; Kemper, R.A.; Kenyon, M.; Kruhlak, N.; Kuhnke, L.; Leavitt, P.; Naven, R.; Neilan, C.; Quiqley, D.; Shuey, H.P.; Sprikl, H.P.; Stavitskaya, L.; Teasdale, A.; White, A.; Wichard, J.; Zwickl, C.; Myatt, G.J. Extending (Q)SARs to incorporate proprietary knowledge for regulatory purposes: a case study using aromatic amine mutagenicity. Regul. Toxicol. Pharmacol., 2016, 77, pp 1–12.
6. Unpublished results: Due to our smaller data set with significantly higher structural diversity, we were not able to get statistically relevant substitution patterns for strong activating and strong deactivating fragments as described in the publication by Ahlberg et al. (cf. reference 5).
7. Bentzien, J.; Hickey, E.R.; Kemper, R.A.; Brewer, M.L.; Dyekjær, J.D.; East, S.P.; Whittaker, M. An in silico method for predicting Ames activities of primary aromatic amines by calculating the stabilities of nitrenium ions. J. Chem. Inf. Model., 2010, 50, pp 274–297.
8. Debnath, A.K.; Debnath, G.; Shusterman, A.J.; Hansch, C. A QSAR investigation of the role of hydrophobicity in regulating mutagenicity in the Ames test: 1. Mutagenicity of aromatic and heteroaromatic amines in Salmonella typhimurium TA98 and TA100. Environmental and Molecular Mutagenesis, 1992, Vol. 19, 1, pp 37–52.
9. Vitic Nexus 2.6.1, Lhasa Limited (2017), https://www.lhasalimited.org/products/vitic-nexus.htm
10. Pipeline Pilot, version 16.5.0.143, Server version 17.1.0.115, Dassault Systemes Biovia Corp. 2016.
11. Pipeline Pilot Component “Standardize”, Setting: Keep Largest Fragment.
12. Pipeline Pilot Component “Deprotonate Bases”.
13. Pipeline Pilot Component “Protonate Acids”.
14. Rogers, D.; Hahn, M. Extended-Connectivity Fingerprints. J. Chem. Inf. Model., 2010, 50 (5), pp 742–754.
15. Pipeline Pilot Component “Molecular Fingerprints”, Settings: Type: Extended Connectivity; AtomAbstraction: AtomType; Maximum Distance: 6; OutputType: Fingerprint.
16. Karelson, M.; Lobanov, V.S.; Katritzky, A.R. Quantum-chemical descriptors in QSAR/QSPR studies. Chem. Rev., 1996, 96 (3), pp 1027–1044.
17. Pipeline Pilot Component “Semiempirical QM descriptors”, VAMP 10.0, Dassault Systemes Biovia Corp. 2017.
18. Finkelmann, A.R.; Goeller, A.H.; Schneider, G. Site of Metabolism Prediction Based on ab initio Derived Atom Representations. J. Med. Chem., 2017, 12, pp 606–612.
19. Gaus, M.; Cui, Q.; Elstner, M. DFTB3: Extension of the Self-Consistent-Charge Density-Functional Tight-Binding Method (SCC-DFTB). J. Chem. Theory Comput., 2011, 7, pp 931–948.
20. Pipeline Pilot Component “Learn RP Forest Model”, default settings.
