Toxic Colors: The Use of Deep Learning for ... - ACS Publications


Chemical Information

Toxic Colors: The Use of Deep Learning for Predicting Toxicity of Compounds Merely from Their Graphic Images Michael Fernandez, Fuqiang Ban, Godwin Woo, Michael Hsing, Takeshi Yamazaki, Eric LeBlanc, Paul S. Rennie, William J. Welch, and Artem Cherkasov J. Chem. Inf. Model., Just Accepted Manuscript • DOI: 10.1021/acs.jcim.8b00338 • Publication Date (Web): 31 Jul 2018 Downloaded from http://pubs.acs.org on August 1, 2018



Journal of Chemical Information and Modeling

chemical library, and resulted in experimental identification of significant and previously unreported anti-androgen potentials for several well established generic drugs.

Introduction

The discovery of new chemicals and materials plays a key role in technological innovation and economic growth but can also be a source of significant environmental concerns.1 Current environmental regulations in North America and Europe demand rather comprehensive and costly physico-chemical, biological and toxicological characterization of any new chemical product before it can advance to public use.1 The economic burden of such evaluations can be significantly mitigated by predicting potential negative impacts of chemicals using the methods of modern cheminformatics. Leveraging high-throughput toxicology experiments with chemical information and machine learning technologies in the form of a quantitative structure-property relationship (QSPR) is a viable strategy, as described in several comprehensive reviews.2,3-5

In order to compare and benchmark the established and emerging methods of predictive toxicology, the “Toxicity testing in the twenty-first century” initiative (Tox21 Challenge) has been launched in recent years.6,7 The aim of this open competition is to evaluate the performance of in silico predictive tools on 12 different toxicology endpoints. Among others, these include stress response (SR) effects and nuclear receptor (NR) effects, which are highly relevant to human health8,9 and can be implicated in activation of stress response pathways, liver injury and cancer.10-12 The most accurate QSPR models of the Tox21 challenge were ‘black-box’ machine learning solutions trained with precomputed molecular descriptors.2,13,14

In general, traditional QSPR predictions require featurization of chemical structures into descriptors such as electronic and physico-chemical properties, structural fingerprints, topological and 3D indices, etc.15 The trade-off between accuracy and interpretability of the resulting QSAR models depends on the complexity of the molecular descriptors and of the machine learning methods used to correlate them with the toxicity response. On the other hand, the recently emerged deep learning (DL) methodology employs a different paradigm in which artificial neural networks are trained directly on molecular graph representations without precomputing any QSPR molecular descriptors.16 We also believe that, similar to its impact on speech, image and signal recognition,17-19 DL has the potential to revolutionize the field of predictive toxicology by delivering ‘end-to-end’ toxicology models. In ‘end-to-end’ learning, the network inputs are some sort of ‘raw’ data, i.e. molecular images, and the final output is a category or quality score, which reduces the effort of human design and performs better in most applications.18

In this work we propose to use DL to predict chemical toxicity in a ‘free from precomputed descriptors’ manner by directly using 2D sketches of molecules to train 2D convolutional neural networks (2DConvNets) on the corresponding images. Such networks have recently been extremely successful in various image classification and segmentation tasks.20 Herein, we trained 2DConvNets using three different 2D representations of chemical structures to directly predict biological endpoints from the Tox21 dataset. We demonstrate that such an AI-based approach can successfully identify relevant features in simple 2D graphic molecular representations that correspond to certain toxicity patterns. Notably, the visualization of the 2DConvNet filters provides an alternative pixel decomposition view of the classification tasks, while the images corresponding to optimal classifier outputs are abstract yet human-readable representations of what the networks recognize as potentially toxic chemicals.


Materials and Methods

Tox21 data set and prior preparation

The Tox21 dataset consists of 11,764 training compounds, including some environmental chemicals and drugs, and the corresponding endpoints are assembled into five SR and seven NR categories.13 Nuclear receptor (NR) effects include estrogen receptor alpha ligand binding domain (LBD) (NR-ER-LBD), full estrogen receptor alpha (NR-ER), aromatase enzyme (NR-Aromatase), aryl hydrocarbon receptor (NR-AhR), full androgen receptor (NR-AR), androgen receptor LBD (NR-AR-LBD) and peroxisome proliferator-activated receptor gamma (NR-PPAR-gamma). Toxic stress response (SR) effects include nuclear factor (erythroid-derived 2)-like 2/antioxidant responsive element (SR-ARE), human embryonic kidney cells expressing luciferase-tagged ATAD5 (SR-ATAD5), heat shock factor response element (SR-HSE), mitochondrial membrane potential (SR-MMP) and p53 signaling pathway (SR-p53). The compounds were labelled according to the outcomes of the measurements as ‘active’ or ‘inactive’ (not all compounds were assessed by all assays; see Table S1 in the Supporting Information).

Chemical structures in the Tox21 dataset were curated using the MOE software21 workflow for structure cleaning and standardization, which includes removing salts and fragments from the loaded SMILES structures. For the purpose of 2DConvNet training, SMILES records were used to render 300x300 pixel images of 2D sketches of chemical structures with the OEDepict Toolkit from OpenEye Scientific Software.22 Figure 1 features the three drawing schemes we employed, which differ in their representation of atom labels, colored dots and partial charge maps and were generated using the Python programming language. Image resolutions from 50x50 to 500x500 pixels were preliminarily tested, but the 300x300 format was used as providing the optimal balance between model accuracy and complexity.

Multilayer perceptron (MLP) models

In addition to 2DConvNets, we trained multilayer perceptrons (MLPs) with different sets of conventional QSPR molecular descriptors to model Tox21 endpoints. An MLP is a feedforward neural network in which the neurons of the input layer receive their values from the input data vector (with the exception of a bias neuron). One or more layers of hidden neurons collect values from the input neurons, giving a result that is passed to the neurons in the output layer, which correspond to the different dependent labels (Figure 2A). The connections between units are associated with trained weight values.

2D convolutional neural networks (2DConvNets)

For processing of the Tox21 molecular dataset we employed a 2DConvNet approach, which has a very large learning capacity and has already demonstrated superior performance on numerous image classification tasks.23 The capacity of ConvNets can be controlled by varying their depth and breadth, and they also build in assumptions about the stationarity of statistics and locality of pixel dependencies in images. ConvNets have far fewer connections and parameters than standard feedforward neural networks with similarly sized layers, while their theoretically best performance is comparable.23 The 2DConvNet architecture used here is shown in Figure 2B and consists of a hierarchy of trainable filters, interleaved with non-linearities and pooling (local averaging) connections, which, in our case, given an input 2D image, estimates its probability of representing a toxic chemical.
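As a rough illustration of the data flow just described (trainable filter, non-linearity, local-average pooling, dense layer, class probability), the following pure-Python sketch pushes a toy single-channel image through one filter. It is not the authors' Keras model: the image, the kernel, the dense weights, the use of ReLU and softmax in this snippet, and all function names are illustrative assumptions.

```python
import math

def conv2d_valid(image, kernel):
    """Slide a square kernel over a 2D image (no padding, stride 1)."""
    k = len(kernel)
    rows = len(image) - k + 1
    cols = len(image[0]) - k + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(k) for j in range(k))
             for c in range(cols)]
            for r in range(rows)]

def relu(feature_map):
    """Element-wise non-linearity: negative responses become zero."""
    return [[max(0.0, v) for v in row] for row in feature_map]

def average_pool(feature_map, size):
    """Non-overlapping local averaging over size x size windows."""
    rows = len(feature_map) // size
    cols = len(feature_map[0]) // size
    return [[sum(feature_map[r * size + i][c * size + j]
                 for i in range(size) for j in range(size)) / (size * size)
             for c in range(cols)]
            for r in range(rows)]

def softmax(scores):
    """Turn raw class scores into probabilities that sum to one."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict_toxic_probability(image, kernel, dense_weights, dense_bias):
    """Return [p_inactive, p_active] for one single-channel image."""
    pooled = average_pool(relu(conv2d_valid(image, kernel)), size=2)
    flat = [v for row in pooled for v in row]
    scores = [sum(w * x for w, x in zip(weights, flat)) + b
              for weights, b in zip(dense_weights, dense_bias)]
    return softmax(scores)

# Toy 5x5 "image" with a bright diagonal and a hand-picked (untrained) kernel.
image = [[1.0 if r == c else 0.0 for c in range(5)] for r in range(5)]
kernel = [[1.0, -1.0], [-1.0, 1.0]]
dense_weights = [[-1.0, -1.0, -1.0, -1.0], [1.0, 1.0, 1.0, 1.0]]
dense_bias = [0.0, 0.0]
probabilities = predict_toxic_probability(image, kernel, dense_weights, dense_bias)
```

In a trained network the kernel and dense weights would be learned from labelled images rather than chosen by hand, and many filters would be applied to all three color channels.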


In particular, the 2D images are passed to a 2DConvNet for feature extraction by convolutional and pooling layers, followed by feature condensation in a pooling layer (local averaging), whose output is flattened and passed to a dense layer and subsequently to the binary classification output. The internal layers were activated with the ReLU function, while the output layers were activated with the Softmax function.23 A dropout framework was used to regularize the network predictions: a fraction of the network parameters is set to zero according to a preset dropout value during batch training with batches of 128 examples, so that the network parameters are never all optimized on the same data.23 The 2DConvNets were trained using the Keras24 Python library (with a TensorFlow backend) by optimizing a softmax loss cost function that targets the classification of the molecules’ toxicity.

ROC curves

Based on the predicted toxicity response and the experimental labels, the ROC curve plots the fraction of correctly predicted toxic compounds among all toxic compounds (TPR = true positive rate or sensitivity) versus the fraction of compounds falsely predicted as toxic among the non-toxic compounds (FPR = false positive rate or 1 − specificity).
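The TPR and FPR just defined can be traced out by lowering the decision threshold over the predicted probabilities, and the resulting points can be integrated into an AUC. The snippet below is a minimal stdlib sketch of that procedure; the six labels and scores are made-up illustrative values, and the function names are not taken from the paper.

```python
def roc_points(labels, scores):
    """Return (FPR, TPR) pairs obtained by lowering the decision threshold.

    labels: 1 for experimentally active (toxic), 0 for inactive.
    scores: predicted probability of being active.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    true_pos = false_pos = 0
    for index, (score, label) in enumerate(ranked):
        if label == 1:
            true_pos += 1
        else:
            false_pos += 1
        # Emit a point only after the last compound sharing this score (ties).
        is_last = index == len(ranked) - 1
        if is_last or ranked[index + 1][0] != score:
            points.append((false_pos / negatives, true_pos / positives))
    return points

def auc(points):
    """Trapezoidal area under a list of (FPR, TPR) points."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Illustrative data only: three actives and three inactives.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
curve = roc_points(labels, scores)
area = auc(curve)  # 8 of the 9 active/inactive pairs are ordered correctly
```

The AUC equals the fraction of active/inactive pairs in which the active compound receives the higher score, which is why a random ranking gives 0.5 and a perfect one gives 1.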

The ROC curve therefore depicts the relative trade-offs between true (beneficial) and false (costly) predictions. The best possible prediction has 100% sensitivity and 100% specificity with an area under the curve (AUC) of 1, while completely random guessing is depicted by the diagonal dashed line from the bottom left to the top right corner of the ROC plot, with an AUC value of 0.5 (as shown in Figure 3 and thereafter).

In the MLP models, additional weights are assigned to bias values that act as offsets for the different layers. The network error is minimized by adjusting the weights through a training process that matches the output neurons to the corresponding target values. The MLPs were trained using the Keras24 Python library (with a TensorFlow25 backend) by optimizing a softmax loss cost function that targets the classification of the molecules’ toxicity.

Spearman's rank correlation coefficient

Spearman correlation is a nonparametric measure of rank correlation, or statistical dependence between the rankings of two variables, that assesses how well the relationship between two variables can be described using a monotonic function that is not necessarily linear. In this case, we used the Spearman correlation in Equation 1 to evaluate the linear relationship between rank values derived from the toxicity probabilities predicted by different QSAR models.

$$R_s = \frac{\operatorname{cov}(rg_A, rg_B)}{\sigma_{rg_A} \times \sigma_{rg_B}}$$   (Equation 1)

In Equation 1, $R_s$ is the usual Pearson correlation coefficient but applied to the rank variables $rg_A$ and $rg_B$, derived from the predicted probabilities probA and probB of two QSAR models A and B; $\operatorname{cov}(rg_A, rg_B)$ is the covariance of the rank variables, and $\sigma_{rg_A}$ and $\sigma_{rg_B}$ are the standard deviations of the rank variables.
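Equation 1 can be evaluated directly by ranking the two probability lists and taking the Pearson correlation of the ranks. The following stdlib sketch does exactly that; the probability values are made up for illustration, and assigning tied values the mean of their ranks is a common convention assumed here rather than something stated in the text.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[start]]):
            end += 1
        mean_rank = (start + end) / 2.0 + 1.0
        for position in range(start, end + 1):
            ranks[order[position]] = mean_rank
        start = end + 1
    return ranks

def pearson(x, y):
    """Covariance of x and y divided by the product of their std. deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    sd_x = (sum((a - mean_x) ** 2 for a in x) / n) ** 0.5
    sd_y = (sum((b - mean_y) ** 2 for b in y) / n) ** 0.5
    return cov / (sd_x * sd_y)

def spearman(prob_a, prob_b):
    """Equation 1: Pearson correlation applied to the rank variables."""
    return pearson(average_ranks(prob_a), average_ranks(prob_b))

# Illustrative predicted probabilities from two hypothetical models A and B.
prob_a = [0.91, 0.40, 0.75, 0.10, 0.55]
prob_b = [0.80, 0.35, 0.20, 0.05, 0.60]
r_s = spearman(prob_a, prob_b)  # ranks differ for three compounds -> 0.7
```

Because only ranks enter the calculation, any monotonic rescaling of either model's probabilities leaves the coefficient unchanged.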

Prestwick Chemical Library

This library is a collection of 1280 small molecules, 95% of which are off-patent approved drugs (FDA, EMA and other agencies). It is provided in the Supporting Information in SMILES format.

eGFP cellular AR transcription assay

Selected predictions from a 2DConvNet on an external set of Prestwick compounds were experimentally evaluated; specifically, we examined Prestwick off-patent drugs for the predicted inhibitory effect against the human androgen receptor (AR). This experiment was conducted as previously described,26 i.e., LNCaP cells stably transfected with an eGFP reporter under the control of the probasin promoter were incubated with compounds at 2.5 µM for 3 days and fluorescence was measured.

Data Availability

All data generated or analyzed during this study are included in this published article (and its Supporting Information files).

Results

Multilayer perceptron (MLP) baseline models of toxicity

To create the baseline QSPR classifiers for the Tox21 dataset we employed the MLP machine learning approach and three independent sets of conventional 2D and 3D molecular descriptors (all are available in the Supporting Information):
- a set of inductive descriptors,27 comprising a total of 51 partial charge, hardness and steric indices computed on the 3D structures,
- a set of 361 2D graph-derived descriptors available via the MOE software,21
- a larger set of 2000 descriptors computed by the Dragon software,28 which includes various molecular properties as well as 2D and 3D indices.

The three descriptor sets were evaluated for the prediction of toxicity endpoints in the Tox21 dataset using MLP classifiers. The quality of the resulting toxic and nontoxic class predictions was evaluated using receiver operating characteristic (ROC) plots, as exemplified by Figure S1 in the Supporting Information, which illustrates the ability of the MLP models to correctly identify compounds that affect NR-AhR with 10-fold cross-validation AUC higher than 0.8 for the three sets of descriptors. Details of the optimum MLP classifiers appear in the Supporting Information along with details of MLP training.

A comparative analysis of the 10-fold cross-validation AUC values of the single-task MLP classifiers trained on the three descriptor sets and all toxic effects in the Tox21 dataset is depicted in Figure 4. All MLP models of the NR-AhR, NR-AR-LBD and SR-MMP effects yield AUC values higher than 0.80, while all toxic effects exhibit AUC values higher than 0.75. Notably, the inductive MLP models using only 51 IND variables demonstrated accuracies competitive with those of the MOE and Dragon models trained on descriptor sets roughly 7 to 40 times larger. In general, for all 12 Tox21 endpoints, the developed baseline MLP models exhibit cross-validation AUC values in the range 0.66-0.85, comparable to what has been previously reported in the Tox21 challenge2,14 and other recent toxicity prediction models.29 We should reiterate, however, that we did not intend or expect to outperform Tox21 models with the presented approach; rather, we demonstrate the utility of modern image-recognition AI even in such an unconventional application as the prediction of chemical toxicity from molecular structures.

Use of 2DConvNets for predicting Tox21 endpoints using molecular images as inputs

We employed three different approaches to sketching chemical structures in 2D (as depicted in Figure 1), which differ in the color scheme of chemical bonds, atoms and aromatic systems. Specifically, in the ‘label sketch’ scheme all non-carbon atoms are depicted with colored labels and presented as capital Latin letters according to a canonical 2D representation of a molecule; in the ‘dot sketch’ scheme all atoms are depicted as colored dots, with no letter representations; and in the ‘charge sketch’ scheme non-carbon atoms are depicted with capital Latin letters, overlaid with red and blue gradient maps around negatively and positively charged atoms, respectively. All three representations omitted non-chiral hydrogens in the structures.

The three types of molecular images were then used to optimize the architecture of the 2DConvNets by an extensive search over the hyper-parameter space (see Supporting Information for details), yielding optimal combinations of network parameters, presented in Table 1 for the dot-sketch scheme. The best performing 2DConvNet models had between 6 and 24 convolution filters with kernel sizes from 4 to 16, connected to a pooling layer with pooling areas from 4 to 12 and fully connected to a dense layer with 10 to 100 nodes that finally connects to the binary outputs. The 2DConvNets were trained for a maximum of 20 epochs to avoid overtraining and yielded optimum cross-validation AUC values of 0.64-0.78 with recall and specificity of 0.21-0.56 and 0.70-1.00, respectively, also presented in Table 1. The schematic representation of the 2DConvNet used to predict the NR-AR toxicity effect is depicted in Figure 4. Convolution filters and pooling layers automatically extract features from the pixel composition of the training images, and these features are used to predict the toxicity scores using fully connected nodes. The representations of the remaining 2DConvNet models appear in the Supporting Information.

The performance of the optimum 2DConvNets in automatically identifying compounds with toxic effects from images is also depicted in Figure 4. The cross-validation accuracies were higher than 70% for eight out of 12 modelled Tox21 categories. Interestingly, the DL approach was particularly effective at identifying toxic compounds for all NR assays.
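The hyper-parameter ranges quoted above (6 to 24 filters, kernel sizes 4 to 16, pooling areas 4 to 12, 10 to 100 dense nodes) can be explored with a simple search loop. The sketch below uses uniform random sampling, which is an assumption: the actual search procedure is described only in the Supporting Information. The scoring function is a deliberately artificial stand-in, not a trained model, and all names are hypothetical.

```python
import random

# Ranges quoted in the text for the best-performing models.
SEARCH_SPACE = {
    "n_filters": (6, 24),
    "kernel_size": (4, 16),
    "pool_size": (4, 12),
    "dense_nodes": (10, 100),
}

def sample_configuration(rng):
    """Draw one candidate architecture uniformly from the ranges above."""
    return {name: rng.randint(low, high)
            for name, (low, high) in SEARCH_SPACE.items()}

def random_search(evaluate, n_trials, seed=0):
    """Return (best_score, best_config) over n_trials random candidates.

    `evaluate` is a caller-supplied function mapping a configuration to a
    score; in practice it would train a network and return its
    cross-validation AUC.
    """
    rng = random.Random(seed)
    best = None
    for _ in range(n_trials):
        config = sample_configuration(rng)
        score = evaluate(config)
        if best is None or score > best[0]:
            best = (score, config)
    return best

def placeholder_score(config):
    """Stand-in scorer (NOT a real model): prefers kernel sizes near 10."""
    return -abs(config["kernel_size"] - 10)

best_score, best_config = random_search(placeholder_score, n_trials=50)
```

Fixing the seed makes the search reproducible, which helps when the candidates are later compared by cross-validation.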
Overall, the accuracies of the image-based 2DConvNets were within 3-10% of those of the descriptor-based MLPs. Figure 4 shows that all three sketching approaches describe toxicity accurately to some extent, but the simplest ‘dot sketch’ representation outperformed the others across all toxic effects.

Visualization of optimized 2DConvNet outputs

It is well known that feedforward back-propagated neural nets are very efficient for functional mapping and pattern recognition, but their intrinsic ‘black-box’ nature has historically drawn considerable criticism in the modeling community. In contrast, the inner filters of a 2DConvNet can be visualized during its training to help understand the underlying learned patterns. For this purpose, we visualized the optimum inputs representing different atoms (red, blue and yellow for oxygen, nitrogen and sulfur, respectively) and analyzed the distribution and intensity of the color channels that maximize the outputs of the network for the considered Tox21 endpoints. The generated red color intensity profiles of the optimum image inputs for all toxic effects in the Tox21 dataset are collected in Figures S2 and S3 in the Supporting Information. The smoothed intensities of the red and blue channels of the optimum input correspond to regions in molecular images where high pixel intensity contributes to toxic classifications. As can be seen, red and blue pixel intensities show very distinctive patterns across the 12 modelled toxic effects, where intense single and bimodal peaks could be associated with single and multiple oxygen and nitrogen atom substitutions, respectively. Despite the distinctive pixel distributions across the predicted classes of toxicity, the filter patterns are too complex to draw useful insights.
Moreover, the significant structural diversity of the Tox21 dataset compromises such model readability, because filters overlay in a cumbersome manner as the 2DConvNet models handle sketches of molecules with different orientations, sizes and structural resolutions. Thus, human-readable models would require conveniently aligned 2D molecular sketches where substitutions on common scaffolds are consistently represented for series of analog structures, and the corresponding implementation requires further research and modeling efforts.

Application of 2DConvNets to identify potent AR inhibitors

Predictive cheminformatics models can be tremendously useful for the identification of candidate compounds from chemical databases. Herein, we evaluate the applicability of the developed 2DConvNet models by ranking ~1200 approved drugs from the Prestwick Chemical Library (http://www.prestwickchemical.fr/) by their predicted inhibition effect on the human androgen receptor (AR). For all 1200 molecules we generated the ‘colored dot’ sketches, which were then passed through the previously discussed 2DConvNet model trained on NR-AR activities from the Tox21 dataset. Inductive, MOE and Dragon descriptors were also computed for the Prestwick Chemical Library, and the toxicity rankings from the MLP models in Figure 4 were quantitatively compared with the 2DConvNet predictions using the Spearman correlation coefficient in Equation 1. Figure 5A illustrates that only the inductive descriptors exhibit ranking correlation coefficients higher than 0.5 for 8% to 30% of top-ranked structures, while the MOE and Dragon descriptors exhibit low agreement with the image-based ranking. This is also corroborated by Tanimoto similarity histograms of the MOE, inductive and Dragon descriptors for the top-ranked predictions against AR active and inactive compounds in the Tox21 dataset, which showed that only the inductive descriptors exhibit a similarity histogram biased toward active AR inhibitors (Figure S4 in the Supporting Information).
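For readers unfamiliar with the Tanimoto similarity used in the comparison above, here is a minimal stdlib sketch. It implements the general vector form, which reduces to the familiar shared-bits-over-union ratio for 0/1 fingerprints; which exact variant the authors applied to their descriptor sets is not stated, so this choice is an assumption, as are the example vectors and the convention for two all-zero vectors.

```python
def tanimoto(a, b):
    """Tanimoto coefficient between two equal-length numeric vectors.

    For 0/1 fingerprints this equals (bits set in both) / (bits set in either).
    """
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    denom = sum(x * x for x in a) + sum(y * y for y in b) - dot
    # Two all-zero vectors are treated as identical (a convention, assumed).
    return dot / denom if denom else 1.0

# Illustrative 8-bit fingerprints (not real descriptor values).
query = [1, 1, 0, 1, 0, 0, 1, 0]
known_active = [1, 1, 0, 0, 0, 0, 1, 1]
known_inactive = [0, 0, 1, 0, 1, 1, 0, 0]

sim_to_active = tanimoto(query, known_active)      # 3 shared bits, 5 in union
sim_to_inactive = tanimoto(query, known_inactive)  # no shared bits
```

Collecting such similarities for every top-ranked compound against the active and the inactive training compounds gives the two histograms that are compared in the text.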


From the ranked molecular images, we selected the 30 top-scored molecules for experimental evaluation using the eGFP cellular AR transcription assay.26 The Tanimoto similarity matrix for the top-ranked structures, computed using MACCS keys fingerprints30 (Figure 6B), shows diverse chemical scaffolds. In addition, we randomly selected 28 molecules from the remaining portion of the ranked Prestwick library and also evaluated them experimentally to estimate the background probability of random identification of AR inhibitors using image-trained DL models. Preliminary experimental evaluation of the test-set Prestwick chemicals demonstrated that seven out of 58 tested chemicals exhibited significant (>95%) inhibition potency (when compared to the positive control substance, the current anti-androgen drug enzalutamide). Importantly, six out of the seven active compounds came from the predicted actives group, while only one of the randomly selected compounds exhibited sufficient inhibitory potency against the AR. Further evaluation of the inhibition potency of the predictions, depicted in Figure 7B, corroborates that compounds Prestwick-371, -792, -842, and -1495 suppressed AR transcription in a concentration-dependent manner in LNCaP-eGFP cells treated with R1881 (Table 2), with IC50 values in the range from 8 to 30 µM. Meanwhile, compounds Prestwick-330 and -697 were initial false positives, but compound Prestwick-826 was corroborated as having no effect on AR transcription, as initially expected from our 2DConvNet model.
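The preliminary screening counts above translate into hit rates as follows. This short calculation uses only the numbers from the primary screen (6 of 30 top-scored and 1 of 28 randomly selected compounds); it does not account for the later dose-response confirmation, so the ratio it yields is a rough, illustrative figure rather than a reported result.

```python
def hit_rate(hits, tested):
    """Fraction of tested compounds that were active."""
    return hits / tested

# Counts from the primary screen described in the text.
top_scored_rate = hit_rate(6, 30)    # predicted-active group
random_rate = hit_rate(1, 28)        # randomly selected background group

# Ratio of the two hit rates (primary screen only, before confirmation).
primary_enrichment = top_scored_rate / random_rate
```

With groups this small, a single additional or retracted hit changes the ratio substantially, so it should be read as an order-of-magnitude indication.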

Discussion

The accuracies of the developed MLP models featured in Figure 3 imply that certain toxic effects are probably more amenable to QSPR modeling than others. The prediction differences can be partly attributed to limitations of the molecular descriptors and of the methods for their selection. In fact, relevant feature selection is a well-known issue in the QSAR/QSPR field that has been tackled by a variety of methods, including automatic relevance determination (ARD),31 random forest32 and genetic algorithms (GA),33 and still represents an active area of research. In contrast, deep network convolutions are very effective at extracting relevant input features. Thus, 2DConvNets demonstrate that chemically relevant features can be automatically identified from input tensors of 2D sketches of molecular structures by kernel mapping and maximum pooling cycles before activation of dense nodes connected to label output nodes. In this way, relevant features are directly learned during the training phase.20

We also established that 2DConvNet predictions were largely invariant to the orientation and layout of the molecular sketches. An augmented training set generated by translation, rotation, and flipping of the original molecular images along their axes yielded cross-validation accuracies in the range from 0.62 to 0.83 for 10 to 100 training epochs (Figure 7). A more illustrative analysis of the effect of the molecule sketch orientation on the activation of the different layers is depicted in Figure 8 for the NR-AR model and compound Prestwick-371. Although the rotated molecules displayed different activations of the convolution, pooling and dense layers, the final activation of the outputs yielded similarly high scores for this active compound.

Thus, it is possible to speculate that the developed DL models, trained just on simple 2D molecular images, achieved surprisingly high accuracy on the Tox21 benchmark and, hence, may be capable of recognizing deterministic structural features of toxic chemicals. In fact, the 2DConvNet approach might also allow identification of intriguing atom distribution patterns associated with various toxicities.
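The augmentation by translation, rotation and flipping mentioned above can be sketched on a plain pixel grid. The version below is a simplified illustration: it rotates only in 90-degree steps and shifts by whole pixels, whereas the transformations actually used in the study are not specified at that level of detail. Function names and the toy image are hypothetical.

```python
def rotate90(image):
    """Rotate a 2D pixel grid 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]

def flip_horizontal(image):
    """Mirror the grid left to right."""
    return [row[::-1] for row in image]

def flip_vertical(image):
    """Mirror the grid top to bottom."""
    return [list(row) for row in image[::-1]]

def translate(image, dx, dy, background=0):
    """Shift pixels by (dx, dy); vacated pixels take the background value."""
    height, width = len(image), len(image[0])
    shifted = [[background] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            nr, nc = r + dy, c + dx
            if 0 <= nr < height and 0 <= nc < width:
                shifted[nr][nc] = image[r][c]
    return shifted

def augment(image):
    """Return the original image plus six transformed copies."""
    variants = [image]
    rotated = image
    for _ in range(3):
        rotated = rotate90(rotated)
        variants.append(rotated)
    variants.append(flip_horizontal(image))
    variants.append(flip_vertical(image))
    variants.append(translate(image, dx=1, dy=0))
    return variants

# Toy 2x2 "sketch"; every variant keeps the label of the original molecule.
toy_image = [[1, 2], [3, 4]]
augmented = augment(toy_image)
```

Training on such copies exposes the network to the same molecule in several layouts, which is one way to encourage the orientation invariance discussed in the text.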
The fact that the enrichment between the top-scored and randomly sampled low-scored compounds achieved four-fold magnitude also demonstrates that the developed 2DConvNet approach could be quite effective. Another notable observation is that 14 ACS Paragon Plus Environment

Journal of Chemical Information and Modeling 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

none of the confirmed DL-predicted AR inhibitors (all featured in Table 2) belongs to the steroid family. Thus, one could speculate that the 2DConvNets did not simply pick up on obvious structural or image patterns. In general, the potent AR inhibitors presented in Table 2 have very diverse chemical scaffolds, which illustrates the capacity of the 2DConvNet to learn non-obvious structural arrangements for fast screening of in silico libraries for novel drug candidates or for drug repurposing. In particular, we found it very encouraging that the 2DConvNets identify rather diverse active inhibitors that also represent different subregions of chemical space in comparison with classical MLP predictions. The Tox21 benchmark dataset was used to demonstrate that simplified image-based processing of molecular structures, somewhat surprisingly, provides reasonable predictions for such complicated and multifaceted endpoints as chemical toxicity, which, among other implications, demonstrates the power and maturity of modern AI methods. Our results corroborate that DL can help to overcome two fundamental drawbacks that have hindered QSAR modeling for decades: optimal selection of molecular descriptors and reversibility of predictions (the “inverse QSAR problem”). In the presented, somewhat orthogonal, approach only traditional chemical sketches are required to predict biological activities, and chemical features are learned directly from human-readable representations. DL models can act as complete pipelines that perform full ML workflows, from feature extraction to learning the task output. In our ‘end-to-end’ implementation, features are extracted from simple images using convolution layers connected to perceptron nodes that learn the generated features and activate the toxicity output, omitting any hand-crafted feature precomputation. Image representation of molecules can be considered as a special featurization method ‘free from precomputed descriptors’ that is very simple in comparison with sophisticated descriptors derived from computational chemistry, statistics and/or graph theory. The choice of molecular descriptors plays a significant role in defining the applicability domain (AD) in QSAR modeling.34 We have defined the AD of the developed DNN models in terms of similarity between molecular images, a measure that has already found extensive use in computer vision and image comparison studies.35 Specifically, for the AD analysis of the developed models we have utilized KAZE image features,36 which detect and describe 2D features in a nonlinear scale space using nonlinear diffusion filtering. Such KAZE features reduce noise and retain object boundaries, yielding superior localization accuracy and distinctiveness.36 The AD of our 2DConvNet model is therefore defined using the Mahalanobis distance histogram (Figure S5 in the Supporting Information) computed from the KAZE features. The corresponding distance distribution defines a confidence interval for how much the query chemicals differ from the training set (for more details see Figure S5 and its legend). It is also worth mentioning that the presented DL-based QSAR models, being so drastically different from any conventional QSAR/QSPR models, could represent a viable, unbiased and complementary tool for consensus predictions, a growing trend in modern cheminformatics that forms part of its best practices. By using molecular images to predict functional properties, we leverage the power of DL neural networks to handle complex input objects with minimum human intervention, i.e. without preprocessing and/or input feature selection. Image representations of molecular properties were also recently utilized by the Schneider group37 to predict antimicrobial peptides, but in the form of a self-organizing map (SOM) transform of traditional molecular descriptors.
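As a rough illustration of a Mahalanobis-distance applicability domain, the sketch below scores a query against a training set. It is not the authors' implementation: the two-dimensional feature vectors merely stand in for KAZE descriptors, and the function names and the 95% training-distance cutoff are assumptions chosen for the example (the paper works from the distance histogram in Figure S5).

```python
import math

def mean_and_covariance(points):
    """Return the mean vector and 2x2 sample covariance of 2-D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), ((sxx, sxy), (sxy, syy))

def mahalanobis(point, mean, cov):
    """Mahalanobis distance of a 2-D point, using the analytic 2x2 inverse."""
    (a, b), (_, d) = cov
    det = a * d - b * b
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    # inverse of [[a, b], [b, d]] is (1/det) * [[d, -b], [-b, a]]
    q = (d * dx * dx - 2 * b * dx * dy + a * dy * dy) / det
    return math.sqrt(q)

def in_domain(query, training, quantile=0.95):
    """Hypothetical AD rule: accept a query whose distance does not exceed
    the chosen quantile of the training-set distances."""
    mean, cov = mean_and_covariance(training)
    dists = sorted(mahalanobis(p, mean, cov) for p in training)
    cutoff = dists[min(len(dists) - 1, int(quantile * len(dists)))]
    return mahalanobis(query, mean, cov) <= cutoff

# Toy stand-in for image feature vectors (assumed data, not from the paper)
training = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.5)]
print(in_domain((0.5, 0.5), training))  # query at the training centroid
print(in_domain((5.0, 5.0), training))  # query far from the training set
```

In practice the feature vectors would have many more dimensions, so the covariance matrix would be inverted numerically rather than with the closed-form 2x2 expression used here.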
Another recent report describes molecular sketches that encode multiple layers of chemical and structural information in multiple image channels for the prediction of chemical properties, also using ConvNets.38 However, our approach is innovative in that it directly leverages the intrinsic information content of traditional 2D molecular sketches as a very simplified way to convey chemical information that is amenable to interpretation using chemical intuition. In fact, the choice between ‘mechanistic’ and ‘black box’ modeling ideologies, that is, whether the physical meaning of descriptors should play a defining role and whether QSAR models should be chemically interpretable, is a long-standing and still unresolved discussion within the cheminformatics community. By implementing simplified image description, we provide a novel angle on this discussion, where ‘chemical interpretability’ of molecular descriptors is less relevant. We demonstrate the ability of DL to produce end-to-end computational models that require minimum preprocessing to achieve an important and complex end task such as toxicity prediction. At the same time, the analysis of the different model filters provides some insights into the effect of the image features on the predicted toxicity. However, we would like to acknowledge that achieving further interpretability would require imposing constraints on the 2DConvNet models (such as 2D alignment and image size rescaling), which we have preferred to avoid in order to highlight the end-to-end nature of our approach.

Conclusions

We have developed a deep learning tool that can automatically extract and learn toxicity-related structural features of chemical compounds from their 2D images. Three different coloring schemes were integrated with supervised 2D convolution networks that were trained to predict 12 biological endpoints described in the Tox21 challenge. Despite the fact that no featurization of chemical structures was used in the training, i.e. no chemical descriptors were computed whatsoever, the resulting AI solutions demonstrated competitive performance in comparison with state-of-the-art QSPR models. The results of this work showcase the power of modern AI-enabled image recognition technology even for such a non-obvious application as the prediction of potential toxicity of chemicals. Furthermore, the use of the 2DConvNet approach not only allowed identification of intriguing atom distribution patterns associated with various toxicities, but also enabled identification of very potent androgen receptor inhibitors from a library of existing off-patent drugs, which can facilitate drug repurposing efforts.

Acknowledgements

This work was supported by the UBC Data Sciences Institute’s PHIX program and a research grant from the Terry Fox Research Institute.

Supporting Information Available

Details of the Tox21 dataset, neural network model calibration, model sketches, calculated descriptor sets and the intensity of the color channels of the 2DConvNet inputs that maximize toxicity output. This material is available free of charge via the Internet at http://pubs.acs.org/

References

1. Breithaupt, H., The costs of REACH. REACH is largely welcomed, but the requirement to test existing chemicals for adverse effects is not good news for all. EMBO Rep. 2006, 7, 968-971.
2. Mayr, A.; Klambauer, G.; Unterthiner, T.; Hochreiter, S., DeepTox: Toxicity Prediction using Deep Learning. Front. Environ. Sci. 2016, 3.
3. Raies, A. B.; Bajic, V. B., In silico toxicology: computational methods for the prediction of chemical toxicity. WIREs Comput. Mol. Sci. 2016, 6, 147-172.
4. Toropov, A. A.; Toropova, A. P.; Raska, I., Jr.; Leszczynska, D.; Leszczynski, J., Comprehension of drug toxicity: software and databases. Comput. Biol. Med. 2014, 45, 20-25.
5. Ekins, S., Progress in computational toxicology. J. Pharmacol. Toxicol. Methods 2014, 69, 115-140.
6. Krewski, D.; Acosta, D., Jr.; Andersen, M.; Anderson, H.; Bailar, J. C., 3rd; Boekelheide, K.; Brent, R.; Charnley, G.; Cheung, V. G.; Green, S., Jr.; Kelsey, K. T.; Kerkvliet, N. I.; Li, A. A.; McCray, L.; Meyer, O.; Patterson, R. D.; Pennie, W.; Scala, R. A.; Solomon, G. M.; Stephens, M.; Yager, J.; Zeise, L., Toxicity testing in the 21st century: a vision and a strategy. J. Toxicol. Environ. Health B Crit. Rev. 2010, 13, 51-138.
7. Andersen, M. E.; Krewski, D., Toxicity testing in the 21st century: bringing the vision to life. Toxicol. Sci. 2009, 107, 324-330.
8. Eduati, F.; Mangravite, L. M.; Wang, T.; Tang, H.; Bare, J. C.; Huang, R.; Norman, T.; Kellen, M.; Menden, M. P.; Yang, J.; Zhan, X.; Zhong, R.; Xiao, G.; Xia, M.; Abdo, N.; Kosyk, O.; The, N.-N.-U. N. C. D. T. C.; Friend, S.; Dearry, A.; Simeonov, A.; Tice, R. R.; Rusyn, I.; Wright, F. A.; Stolovitzky, G.; Xie, Y.; Saez-Rodriguez, J., Prediction of human population responses to toxic compounds by a collaborative competition. Nat. Biotechnol. 2015, 33, 933.
9. Grun, F.; Blumberg, B., Perturbed nuclear receptor signaling by environmental obesogens as emerging factors in the obesity crisis. Rev. Endocr. Metab. Disord. 2007, 8, 161-171.
10. Bartkova, J.; Hořejší, Z.; Koed, K.; Krämer, A.; Tort, F.; Zieger, K.; Guldberg, P.; Sehested, M.; Nesland, J. M.; Lukas, C.; Ørntoft, T.; Lukas, J.; Bartek, J., DNA damage response as a candidate anti-cancer barrier in early human tumorigenesis. Nature 2005, 434, 864.
11. Labbe, G.; Pessayre, D.; Fromenty, B., Drug-induced liver injury through mitochondrial dysfunction: mechanisms and detection during preclinical safety studies. Fundam. Clin. Pharmacol. 2008, 22, 335-353.
12. Jaeschke, H.; McGill, M. R.; Ramachandran, A., Oxidant stress, mitochondria, and cell death mechanisms in drug-induced liver injury: lessons learned from acetaminophen hepatotoxicity. Drug Metab. Rev. 2012, 44, 88-106.
13. Huang, R.; Xia, M., Editorial: Tox21 Challenge to Build Predictive Models of Nuclear Receptor and Stress Response Pathways As Mediated by Exposure to Environmental Toxicants and Drugs. Front. Environ. Sci. 2017, 5.
14. Capuzzi, S. J.; Politi, R.; Isayev, O.; Farag, S.; Tropsha, A., QSAR Modeling of Tox21 Challenge Stress Response and Nuclear Receptor Signaling Toxicity Assays. Front. Environ. Sci. 2016, 4.
15. Todeschini, R.; Consonni, V. Handbook of Molecular Descriptors. In; 2008; Vol. V11, pp 1-123.
16. Altae-Tran, H.; Ramsundar, B.; Pappu, A. S.; Pande, V., Low Data Drug Discovery with One-Shot Learning. ACS Cent. Sci. 2017, 3, 283-293.
17. Liu, B.; Lane, I., An end-to-end trainable neural network model with belief tracking for task-oriented dialog. Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH 2017, 2017-August, 2506-2510.
18. Mnih, V.; Kavukcuoglu, K.; Silver, D.; Rusu, A. A.; Veness, J.; Bellemare, M. G.; Graves, A.; Riedmiller, M.; Fidjeland, A. K.; Ostrovski, G.; Petersen, S.; Beattie, C.; Sadik, A.; Antonoglou, I.; King, H.; Kumaran, D.; Wierstra, D.; Legg, S.; Hassabis, D., Human-level control through deep reinforcement learning. Nature 2015, 518, 529-533.
19. Karanov, B.; Chagnon, M.; Thouin, F.; Eriksson, T. A.; Bülow, H.; Lavery, D.; Bayvel, P.; Schmalen, L., End-to-end Deep Learning of Optical Fiber Communications. CoRR 2018, abs/1804.0.
20. Krizhevsky, A.; Sutskever, I.; Hinton, G., Imagenet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst. 2012, 1097-1105.
21. Chemical Computing Group ULC, S. S. W., Suite #910, Montreal, QC, Canada, H3A 2R7 Molecular Operating Environment (MOE), 2013.08, 2018.
22. OpenEye Scientific Software (https://www.eyesopen.com), Santa Fe, NM, 2017.
23. Chen, L.-C.; Schwing, A. G.; Yuille, A. L.; Urtasun, R., Learning Deep Structured Models. CoRR 2014, abs/1407.2538.
24. Chollet, F. Keras (https://github.com/fchollet/keras), GitHub: 2015 (Accessed February 2, 2018).
25. Abadi, M.; Agarwal, A.; Barham, P.; Brevdo, E.; Chen, Z.; Citro, C.; Corrado, G. S.; Davis, A.; Dean, J.; Devin, M.; Ghemawat, S.; Goodfellow, I.; Harp, A.; Irving, G.; Isard, M.; Jia, Y.; Jozefowicz, R.; Kaiser, L.; Kudlur, M.; Levenberg, J.; Mane, D.; Monga, R.; Moore, S.; Murray, D.; Olah, C.; Schuster, M.; Shlens, J.; Steiner, B.; Sutskever, I.; Talwar, K.; Tucker, P.; Vanhoucke, V.; Vasudevan, V.; Viegas, F.; Vinyals, O.; Warden, P.; Wattenberg, M.; Wicke, M.; Yu, Y.; Zheng, X., TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems. CoRR 2016, abs/1603.0.
26. Tavassoli, P.; Snoek, R.; Ray, M.; Rao, L. G.; Rennie, P. S., Rapid, non-destructive, cell-based screening assays for agents that modulate growth, death, and androgen receptor activation in prostate cancer cells. Prostate 2007, 67, 416-426.
27. Cherkasov, A., Inductive Descriptors: 10 Successful Years in QSAR. Curr. Comput. Aided Drug Des. 2005, 1, 21-42.
28. Dragon (ver. 6.0) (www.talete.mi.it/products/software.htm), Talete srl: Italy, 2018.
29. Yang, H.; Sun, L.; Li, W.; Liu, G.; Tang, Y., In Silico Prediction of Chemical Toxicity for Drug Design Using Machine Learning Methods and Structural Alerts. Front. Chem. 2018, 6, 30.
30. Durant, J. L.; Leland, B. A.; Henry, D. R.; Nourse, J. G., Reoptimization of MDL Keys for Use in Drug Discovery. J. Chem. Inf. Comput. Sci. 2002, 42, 1273-1280.
31. MacKay, D. J. C. Bayesian Non-linear Modelling for the Prediction Competition. 1994, ASHRAE; pp 1053-1062.
32. Breiman, L., Random Forests. Mach. Learn. 2001, 45, 5-32.
33. Fernandez, M.; Caballero, J.; Fernandez, L.; Sarai, A., Genetic algorithm optimization in drug design QSAR: Bayesian-regularized genetic neural networks (BRGNN) and genetic algorithm-optimized support vectors machines (GA-SVM). Mol. Divers. 2011, 15, 269-289.
34. Sahigara, F.; Mansouri, K.; Ballabio, D.; Mauri, A.; Consonni, V.; Todeschini, R., Comparison of different approaches to define the applicability domain of QSAR models. Molecules 2012, 17, 4791-4810.
35. Goshtasby, A. A. Image Descriptors. In Image Registration; Springer: London, 2012; Chapter 5, pp 219-246.
36. Alcantarilla, P. F.; Bartoli, A.; Davison, A. J. KAZE Features. In European Conference on Computer Vision (ECCV) 2012, Berlin, Heidelberg, 2012; Springer Berlin Heidelberg: Berlin, Heidelberg, 2012; pp 214-227.
37. Schneider, P.; Muller, A. T.; Gabernet, G.; Button, A. L.; Posselt, G.; Wessler, S.; Hiss, J. A.; Schneider, G., Hybrid Network Model for "Deep Learning" of Chemical Data: Application to Antimicrobial Peptides. Mol. Inform. 2017, 36.
38. Goh, G. B.; Siegel, C.; Vishnu, A.; Hodas, N.; Baker, N., In 2018 IEEE Winter Conference on Applications of Computer Vision (WACV); IEEE: 2018, pp 1340-1349.


TABLES

Table 1. Hyper-parameters, cross-validation AUC, recall and specificity for DL models using colored dots for atom representation.

Toxic effect    Number of filters  Kernel size  Pooling area  Dense nodes  Dropout  AUC   Recall  Specificity
NR-AhR          12                 16           8             100          0.4      0.77  0.46    0.89
NR-AR           6                  16           4             20           0.4      0.75  0.34    0.99
NR-AR-LBD       24                 16           4             20           0.4      0.78  0.41    1.00
NR-Aromatase    12                 4            12            60           0.8      0.72  0.28    0.93
NR-ER           12                 16           12            100          0.6      0.67  0.34    0.89
NR-ER-LBD       24                 16           8             60           0.4      0.73  0.42    0.90
NR-PPAR-gamma   24                 16           12            60           0.4      0.66  0.50    0.71
SR-ARE          24                 16           4             100          0.4      0.67  0.56    0.70
SR-ATAD5        12                 16           8             60           0.4      0.64  0.27    0.91
SR-HSE          24                 16           4             60           0.4      0.76  0.21    0.93
SR-MMP          24                 16           4             100          0.4      0.69  0.56    0.82
SR-p53          12                 16           4             20           0.4      0.65  0.30    0.91
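For readers less familiar with the metrics reported in Table 1, the following sketch shows one common way to compute recall, specificity and ROC AUC for a binary toxicity label. It is only an illustration: the labels, scores and the 0.5 decision threshold are made-up assumptions, and the function names are hypothetical rather than taken from the paper.

```python
def recall_specificity(y_true, y_pred):
    """Recall = TP / (TP + FN); specificity = TN / (TN + FP)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

def roc_auc(y_true, scores):
    """AUC as the probability that a random positive is scored above a
    random negative (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy data (assumed): 1 = toxic, 0 = non-toxic
y_true = [1, 1, 0, 0, 0]
scores = [0.9, 0.4, 0.5, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print(recall_specificity(y_true, y_pred))
print(roc_auc(y_true, scores))
```

Because AUC is threshold-free while recall and specificity depend on the chosen cutoff, a model can combine a moderate AUC with high specificity and low recall, which is the pattern seen for several endpoints in Table 1.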


Table 2. Structures of six potent AR inhibitors (>95%) from the Prestwick dataset identified by the 2DConvNet model.

Compounds     Structure  2DConvNet score  IC50 (µM)
Prestw-1495   [image]    0.53             6.8
Prestw-792    [image]    0.42             27.7
Prestw-842    [image]    0.41             7.8
Prestw-371    [image]    0.36             28.5


Figure 1. 2D images of molecule sketches using three schemes. A) Atoms other than carbon are depicted with colored labels. B) Atoms are depicted as colored dots. C) Atoms other than carbon are depicted with black labels, and partial charges are represented as red and blue gradient maps around negatively and positively charged atoms, respectively.


Figure 2. Schematic representation of the neural network models: A) multilayer perceptron (MLP) model and B) 2DConvNet model. Panel labels: A) Inputs, Outputs; B) Convolution, Pooling, Dense, Outputs.


Figure 3. Cross-validation AUC for the predictions of the five SR and seven NR toxic effects in the Tox21 dataset. Nuclear receptor (NR) effects: estrogen receptor alpha ligand binding domain (LBD) (NR-ER-LBD), full estrogen receptor alpha (NR-ER), aromatase (NR-Aromatase), aryl hydrocarbon receptor (NR-AhR), full androgen receptor (NR-AR), androgen receptor LBD (NR-AR-LBD) and peroxisome proliferator-activated receptor gamma (NR-PPAR-gamma). Stress response (SR) effects: nuclear factor (erythroid-derived 2)-like 2/antioxidant responsive element (SR-ARE), human embryonic kidney cells expressing luciferase-tagged ATAD5 (SR-ATAD5), heat shock factor response element (SR-HSE), mitochondrial membrane potential (SR-MMP) and p53 signaling pathway (SR-p53).


Figure 4. Schematic representation of the 2DConvNet for the NR-AR toxicity effect. Activation of the internal layers uses a ReLU function and the output layer is activated using a Softmax function.21 Dropout is implemented to improve regularization of the network predictions. Diagram labels: Representation: input (images); Feature extraction: convolution and pooling (ReLU); Feature learning: fully connected neurons (ReLU); Output scores (Softmax).
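The layer sequence in Figure 4 (convolution, ReLU, pooling, fully connected neurons, Softmax) can be sketched as a plain-Python forward pass. This is a toy illustration under stated assumptions, not the authors' Keras/TensorFlow model: it uses a single channel, a single hand-written kernel and made-up dense weights, and it omits dropout, which is active only during training.

```python
import math

def conv2d(image, kernel):
    """'Valid' 2D cross-correlation of a single-channel image with one kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(out_w)] for i in range(out_h)]

def relu(fmap):
    """Element-wise rectified linear unit."""
    return [[max(0.0, v) for v in row] for row in fmap]

def max_pool(fmap, size):
    """Non-overlapping max pooling; rows/columns that do not fit are dropped."""
    h, w = len(fmap) // size, len(fmap[0]) // size
    return [[max(fmap[i * size + a][j * size + b]
                 for a in range(size) for b in range(size))
             for j in range(w)] for i in range(h)]

def dense(vector, weights, biases):
    """Fully connected layer: one weight row and one bias per output node."""
    return [sum(w * x for w, x in zip(row, vector)) + b
            for row, b in zip(weights, biases)]

def softmax(logits):
    """Numerically stable softmax over the output scores."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def forward(image, kernel, weights, biases):
    """Convolution -> ReLU -> pooling -> flatten -> dense -> softmax."""
    pooled = max_pool(relu(conv2d(image, kernel)), 2)
    flat = [v for row in pooled for v in row]
    return softmax(dense(flat, weights, biases))

# Toy 5x5 'image' with a diagonal stroke (assumed data)
image = [[1.0 if i == j else 0.0 for j in range(5)] for i in range(5)]
kernel = [[1.0, 0.0], [0.0, 1.0]]                       # hypothetical filter
weights = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]  # hypothetical dense weights
biases = [0.0, 0.0]
print(forward(image, kernel, weights, biases))          # two class probabilities
```

In the actual models the number of filters, kernel size, pooling area, dense nodes and dropout rate are the hyper-parameters listed in Table 1, and the weights are learned by backpropagation rather than set by hand.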


Figure 5. A) Spearman correlation coefficients between the 2D ConvNet predictions and the MLP models built using inductive, MOE and Dragon descriptors for the NR-AR effect. B) Tanimoto similarity matrix for the top-ranked structures using MACCS keys fingerprints.30
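The Tanimoto similarity used in panel B can be illustrated with a short sketch. Fingerprints are represented here as sets of "on" bit positions; the bit sets are invented for the example and are not real MACCS keys, and returning 1.0 for two empty fingerprints is an assumed convention.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto (Jaccard) similarity between two sets of 'on' bit positions."""
    a, b = set(bits_a), set(bits_b)
    union = a | b
    if not union:
        return 1.0  # assumed convention for two empty fingerprints
    return len(a & b) / len(union)

def similarity_matrix(fingerprints):
    """Pairwise Tanimoto similarity for a list of fingerprints."""
    return [[tanimoto(x, y) for y in fingerprints] for x in fingerprints]

# Hypothetical fingerprints (bit indices are made up)
fps = [{1, 2, 3}, {2, 3, 4}, {10, 11}]
for row in similarity_matrix(fps):
    print(row)
```

Low off-diagonal values in such a matrix indicate structurally dissimilar compounds, which is how the diversity of the top-ranked structures is read from panel B.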


Figure 6. A) Percent inhibitory potency of the compounds Prestwick-371, -792, -842, and -1495 predicted as active (blue) with respect to enzalutamide control (green). B) Dose-response suppression of AR transcription by compounds Prestwick-371, -792, -842, and -1495 using LNCaP-eGFP cells treated with R1881. Panel A legend: randomly sampled from predicted negative; predicted positive. Panel B: one plot per compound; x-axis: log concentration [µM].


Figure 7. Cross-validation accuracies for the prediction of Tox21 effects using 2DConvNet models built from augmented training sets of images using translation, rotation, and flipping along the axis.
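The augmentation operations named in the caption (translation, rotation and flipping) can be sketched on a small pixel grid as follows. This is an illustrative assumption of how such variants might be generated, not the authors' augmentation code; the helper names, the zero fill value and the choice of 90° rotation steps are hypothetical.

```python
def rotate90(image):
    """Rotate a 2D grid 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]

def flip_horizontal(image):
    """Mirror the grid left to right."""
    return [row[::-1] for row in image]

def flip_vertical(image):
    """Mirror the grid top to bottom."""
    return [row[:] for row in image[::-1]]

def translate(image, dy, dx, fill=0):
    """Shift the grid by (dy, dx); pixels shifted outside are discarded."""
    h, w = len(image), len(image[0])
    out = [[fill] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            ni, nj = i + dy, j + dx
            if 0 <= ni < h and 0 <= nj < w:
                out[ni][nj] = image[i][j]
    return out

def augment(image):
    """Return the original plus 90/180/270 degree rotations and both flips."""
    variants = [image]
    rotated = image
    for _ in range(3):
        rotated = rotate90(rotated)
        variants.append(rotated)
    variants.append(flip_horizontal(image))
    variants.append(flip_vertical(image))
    return variants

sketch = [[1, 2], [3, 4]]            # toy 2x2 'image'
for variant in augment(sketch):
    print(variant)
print(translate(sketch, 0, 1))       # shifted one pixel to the right
```

Each augmented image keeps the label of the original molecule, so the training set grows without any new measurements.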


Figure 8. Visualization of the activation of the convolution (1), pooling (2), dense (3) and output (4) layers of the 2DConvNet for Prestwick-371 (IC50 = 28.5 µM) with input image rotations of A) 0° (original), B) 90°, C) 180° and D) 270°.

Graphical TOC Entry

Toxicity
