Interpretation of QSAR Models: Past, Present and Future - Journal of

Sep 26, 2017 - Comparison of the Predictive Performance and Interpretability of Random Forest and Linear Models on Benchmark Data Sets. Journal of Che...
14 downloads 21 Views 4MB Size
Perspective Cite This: J. Chem. Inf. Model. XXXX, XXX, XXX-XXX

pubs.acs.org/jcim

Interpretation of Quantitative Structure−Activity Relationship Models: Past, Present, and Future Pavel Polishchuk* Institute of Molecular and Translational Medicine, Faculty of Medicine and Dentistry, Palacký University and University Hospital in Olomouc, Hněvotínská 1333/5, 779 00 Olomouc, Czech Republic ABSTRACT: This paper is an overview of the most significant and impactful interpretation approaches of quantitative structure−activity relationship (QSAR) models, their development, and application. The evolution of the interpretation paradigm from “model → descriptors → (structure)” to “model → structure” is indicated. The latter makes all models interpretable regardless of machine learning methods or descriptors used for modeling. This opens wide prospects for application of corresponding interpretation approaches to retrieve structure−property relationships captured by any models. Issues of separate approaches are discussed as well as general issues and prospects of QSAR model interpretation.



LOCAL AND GLOBAL INTERPRETATION AND THEIR POSSIBLE APPLICATIONS At the outset, I will define local and global interpretation. Local interpretation is an estimation of how different parts of a single compound influence each other and a studied endpoint. It can be applied to identify structural motifs which reduce or enhance activity and thus will provide useful information about the further structural optimization of studied compounds to improve activity, like it was done in design of new antiviral agents. 5 Fragments which are identified as negatively contributing can be removed or replaced with new ones designed by a medicinal chemist or by computational enumeration of possible replacements. Local interpretation may be particularly useful for lead optimization tasks where the goal is to optimize an ADME profile. Global interpretation is retrieval of structure−property relationships for a series of compounds from QSAR models in a chemically/biologically meaningful way. Such information is useful from various viewpoints. 1. Knowledge mining. QSAR models can be used for hypothesis confirmationpredicting properties of designed compounds (virtual screening), and they can also be used for hypothesis generation based on model interpretation. Retrieved structure−property relationships may shed light on mechanisms of action, activity, selectivity of compounds, etc. The knowledge may then be used in the design or structure optimization steps.6 Further, interpretation can help researchers derive hypotheses that are sensible from a chemical viewpoint.7

U

nderstanding the laws of nature is of fundamental importance to humanity, and structure−property relationships are no exception.1−4 In the modern era, a vast amount of knowledge about properties of innumerable chemical compounds has been amassed. Machine learning promises to allow us to use the data not only for prediction but also for retrieval of knowledge, inaccessible to humans otherwise. But what is interpretation? Interpretation is the retrieval of usef ul knowledge from computational models. This is particularly important because it is possible to extract different parameters from models which cannot be interpreted by researchers, e.g. retrieved contribution of a noninterpretable descriptor may not improve problem understanding or explain structure−property relationships captured by a model. In this overview, I do not pursue the goal of providing an exhaustive review of all interpretation approaches developed to retrieve relationships captured by quantitative structure− activity relationship (QSAR) models. The main focus is to show the most interesting approaches, progress in their development, and the change in interpretation paradigm that made all QSAR models interpretable.2 Historical examples selected to illustrate interpretation approaches do not always meet best practices of QSAR modeling and the interpretation principles given below, but they are instructive nonetheless. It should be noted that interpretation approaches of machine learning models are the same for QSAR and Quantitative structure−property relationship (QSPR) studies, and therefore, no difference regarding these terms should be made in the context of the paper. © XXXX American Chemical Society

Received: May 17, 2017 Published: September 26, 2017 A

DOI: 10.1021/acs.jcim.7b00274 J. Chem. Inf. Model. XXXX, XXX, XXX−XXX

Journal of Chemical Information and Modeling





Perspective

PRINCIPLES AND ISSUES OF QSAR INTERPRETATION I would like to define several principles and issues of QSAR interpretation which are sometimes missed in QSAR studies. Generally speaking the prerequisites of model interpretation are the same as for model development.14,15 1. Interpretation of only predictive models is reasonable. This is a particularly important issue but missed in many older QSAR studies. If a model is not predictive, this means that it cannot capture reasonable structure− property relationships for various reasons.14,16 Therefore, interpretation results are questionable and additional confirmation of interpretation validity is required, e.g., empirical mechanistic confirmation. However, if weak models are combined in a strong consensus model the latter can be interpreted, since it will pass the statistical validation step. Interpretation may be useful in the diagnostics of problems with a modeled data set leading to poor models. 2. Interpretation should consider the applicability domain of a model because each model trained on a limited number of samples covers only a small part of the whole chemical space which is vast.17 Applicability domain can be estimated as a distance from training set examples in a chemical space or statistically using various approaches.18,19 Interpretation results cannot be extrapolated to compounds which lie too far from applicability domain. This means that if a researcher retrieved a certain structure−property relationship, it does not mean that it is applicable to any compound in the chemical universe. However, extrapolation is possible if found structure−property relationships have a mechanistic confirmation. For example, if the toxic action of a limited number of α-haloketones were identified by a QSAR model, it does not automatically mean that all compounds containing this group will be toxic. However, knowledge that this group is chemically reactive will confirm its toxic mechanism of action and thus it may be considered a toxicophore. If interpretation results do not have such mechanistic support, they may be considered valid only within the applicability domain of a model. Similar conclusions were drawn by Alves et al.20 These authors discuss the too broad nature of widely recognized structural alerts and their possible integration with QSAR predictions in order to improve the outcome. 3. Interpretation results depend on the training set compounds used for modeling with the exception of the modeling of additive endpoints. There are two issues here. First, interpretation results can change with different training sets leading to different results for the same end-point. This is related to the domain applicability issue. The second is related to the nature of QSAR models. Statistical models explain Y by X variance. If a particular descriptor is found with a high contribution or importance this implies that changing the value of this variable corresponds to changes in modeling endpoint values but this does not necessarily mean that this causes the response. For example, if certain structural motifs were identified as negatively contributing to the protein binding constant, this means that those fragments decrease affinity values but they might still be important for receptor recognition and binding.

2. Knowledge-based validation of QSAR models. If retrieved model structure−property relationships correspond to experimentally observed ones and to expert knowledge, then the model can be considered valid. If they disagree, then the model is probably incorrect and should be discarded or the available knowledge about structure−property relationships should be revised. Interpretation may also help to identify biased data sets if very trivial patterns distinguishing actives and inactives were found. Let us imagine, if actives contain two negatively charged groups at a distance of 10−11 bonds and inactives do not contain them at all, then, the built model may recognize a simple patternthe presence of a negatively charged group. But, that will not be enough for really active compounds. However, the model will pass the statistical validation step successfully. Therefore, knowledge-based validation can be considered complementary to the traditional statistical validation of QSAR models. 3. Regulatory purposes. Models which pass knowledgebased validation will certainly have a higher confidence level. Therefore, the fifth principle of OECD guidelines asks for “mechanistic interpretation of QSAR models, if possible”.8 However, this principle is optional because for a long time many QSAR models were considered noninterpretable. Recent advances in interpretation of QSAR models could make the fifth OECD principal mandatory for models used for regulatory purposes. In this perspective, I will show that currently all models can be considered interpretable. However, not all endpoints can be interpreted.

INTERPRETABILITY VS PREDICTIVITY OF QSAR MODELS Unfortunately, the importance and possible benefits of model interpretation are still underestimated and there are several reasons for this. Interpretation is usually considered a double probleminterpretability of chosen descriptors and interpretability of a chosen machine learning method. Descriptor interpretability was a common limitation starting from the first QSAR models until recently and almost all interpretable models were built using only interpretable descriptors having clear structural or physicochemical meaning. However, addition or use of noninterpretable or hardly interpretable descriptors may improve model predictivity.9,10 On the other hand, modern machine learning approaches which are successfully applied to produce highly predictive models, such as Support Vector Machine (SVM), Neural Nets (NNs), and Random Forest (RF), are not interpretable by design, unlike the first QSAR models which were developed with a view to subsequent interpretation. This created the notion that there is a tradeoff between the predictivity and interpretability of QSAR models.11−13 Since predictive models are the primary goal of QSAR modeling, interpretation of models was considered minor and frequently neglected in QSAR studies. Recent advances in the interpretation of QSAR models have made predictivity and interpretability closer than ever before. Changing of the interpretation paradigm made all models interpretable regardless of descriptors or machine learning methods used. I hope that this short overview will show that all models are interpretable from which usef ul knowledge can be extracted. B

DOI: 10.1021/acs.jcim.7b00274 J. Chem. Inf. Model. XXXX, XXX, XXX−XXX

Perspective

Journal of Chemical Information and Modeling



MODEL → DESCRIPTORS → (STRUCTURE) INTERPRETATION PARADIGM As mentioned, for many years, interpretable models were built only on interpretable descriptors. The model → descriptors → (structure) interpretation paradigm was implicitly formulated. This consists of two consecutive steps. In the first step, descriptor contributions are calculated depending on the interpretability of the machine learning method used for modeling. In the second step one interprets the calculated contributions “as is” or transfers them to a structural level to estimate the contributions of substructures. This step depends on the interpretability of descriptors. The review is focused mainly on the interpretability of machine learning methods and calculation of descriptor contributions from QSAR models and less on interpretation of descriptors because the latter is subjective and depends on the experience and knowledge of a particular researcher. A good example of how difficult descriptor interpretation could be was given by Stanton and Jurs in their study of surface tension of alkanes.21 More interpretation issues related to variable selection, model size, and outliers are discussed by Stanton in a separate publication.22 Model → descriptors → (structure) interpretation approaches can be divided into two groups: (i) approaches which are applied only to models based on specific machine learning methods and (ii) approaches which can be applied to models based on any machine learning method. The former I shall call machine learning method-specific approaches, and the latter, machine learning method-independent. Visualization of descriptor contributions transferred at a structural level is common to both groups and will be described in a separate section.

descriptor set. The obtained models had reasonable predictive ability. Analysis of the descriptors selected by each model showed their similar physicochemical meaning. All models recognized hydrophobicity and polarizability as important factors. This suggests that interpretation results can be similar regardless descriptors and variable selection routines used. Free−Wilson Models. Free and Wilson proposed another approach based on the assumption of the additive nature of quantitative structure−property relationships.25 The linear equation they suggested linked the biological activity of compounds with the presence/absence of particular substituents in their structures. The authors demonstrated the use of this approach with several examples. One is given below. The relative inhibitory activity of compounds depicted in Figure 1

Figure 1. Scaffold of compounds used by Free and Wilson in their analyses.25 R is H, CH3; X is Br, Cl, NO2; and Y is NO2, NH2, NHC(O)CH3.

against Staphylococcus aureus can be represented by the eq 2. The regression coefficients show the contributions of substituents. The model was trained on 10 out of 18 possible compounds and thus it can be applied for predicting the activity of only 8 new compounds. Such models have limited applicability domain. Although the Free−Wilson approach can be applied mainly for additive endpoints which rarely occur,26 they are very attractive to chemists due to the straightforward interpretation and clear structural explanation of the underlying relationships.



MACHINE LEARNING METHOD-SPECIFIC INTERPRETATION APPROACHES Multiple Linear Regression (MLR, Hansch Models). The first QSAR models were developed to be interpretable. Hansch et al. in their seminal paper carefully chose the chemical descriptors to explain the plant growth inhibition activity of phenoxyacetic acids, eq 1:23 1/C = 4.08π − 2.14π 2 + 2.78σ + 3.38

Act = 75R H − 112R CH3 + 84X Cl − 16XBr − 26X NO2 + 123YNH 2 + 18YNHC(= O)CH3 − 218YNO2

(2)

where R, X, and Y are indicator variables equal to 1 when the substituent is present at this position and 0 if not. Ordinary linear regression models cannot work with mutually correlated variables and high dimensional data. Therefore, it is recommended to limit the number of variables to be 5−10 times fewer than the number of compounds and to exclude mutually correlated descriptors. These issues can be solved by different regularization techniques, e.g., ridge regression,27 or the partial least-squares (PLS) method.28 Partial Least-Squares. PLS is a generalization of multiple linear regression and can work with correlated and numerous X variables and multiple outputs (Y). The PLS model finds new descriptors (X-scores, T) which are estimates of the latent variables. X-scores are orthogonal. They are estimated as a linear combination of the original variables X with coefficients W* (eq 3) and are good predictors of Y (eq 4). For more detailed description of the PLS method, please refer to the work of Wold et al.29

(1)

where σ is the Hammet constant and π is the difference in the logarithms of the partition coefficients of the substituted and unsubstituted compounds (π = log PX − log PH). These variables were chosen to represent the influence of electronic factors and rate of plant cell membrane penetration. The regression coefficients show that increasing the electron withdrawal ability of substituents increases the inhibitory effect. The effect of the hydrophobicity of substituents is nonlinear and hence there is an optimal value π ≈ 1. This information may be useful for the design of new active compounds, but these simple rules may be not applicable outside the phenoxyacetic acid series. Bhhatarai et al. showed that interpretation results of MLR models obtained based on different descriptor sets and using different variable selection techniques converge.24 The authors build MLR models for 158 cycloalkyl-pyranones encoded by manually selected descriptors utilizing knowledge about important molecular properties of HIV protease inhibitors and CODESSA and MOE descriptors. They used different variable selection strategies to obtain best models for each

T = XW*

(3)

Y = TC′ + F = XW*C′ + F = XB + F

(4)

Analysis of regression coefficients (B) in a PLS model (eq 4) is just like a normal linear regression (Hansch models) and will C

DOI: 10.1021/acs.jcim.7b00274 J. Chem. Inf. Model. XXXX, XXX, XXX−XXX

Perspective

Journal of Chemical Information and Modeling

Figure 2. Interpretation of the PLS model.29 (left) Scores t1 and t2 for the PLS model show similarity between objects in a “new” descriptor space. (right) Descriptor contributions (loadings) into the first and second PLS components (DDGTS is the modeled property).

ships cannot be established. In these cases, methods which capture nonlinearity are preferable. One such early approach is decision trees which is nonparametric and can be applied to both regression and classification tasks.32,33 A decision tree is a top-down set of nodes. Starting from the root node, nodes are split on two or more child nodes according to the generated rules. Decision trees are interpretable by design and have become popular. Starting from the root node one may reconstruct the set of “IF···THEN···” rules which lead to compounds with particular property values and thus predict properties for new compounds and explain a captured structure−property relationship. In one of the first publications, decision trees were applied to investigate solid-phase fluorescence enhancement of 2-(diphenylacetyl)-1,3-indandione 1-(p-(dimethylamino)benzaldazine).34 The authors constructed the decision tree on the basis of a set of manually selected structural features which could be important for the investigated property (Figure 3). If one of the routes leading to potentially fluorescent compounds is followed, the following rule can be created: fluorescent compounds should have a maximum length of 7.2 Å or more AND should have a pentavalent phosphorus atom P(R)(−X)(−Y)−Z where R is O or S and X, Y, Z contain S. Such rules can be created for each

not be discussed here. However, PLS has more options for looking inside the models. To illustrate the interpretation of PLS models, let us consider a particular taskmodeling of the free energy of tryptophan synthase unfolding a unit of bacteriophage T4 lysosome when position 49 is modified to contain each of the 19 amino acids except arginine.29 Amino acids were encoded by seven highly correlated variables. PIE and PIF are lipophilicity constants of the amino acid side chain. DGR is the free energy of transfer of an amino acid side chain from the protein interior to water. SAC is the water-accessible surface area of amino acids calculated by MOLSV. MR is molecular refractivity, and Lam is a polarity. Vol is the molecular volume of the amino acid calculated by MOLSV. The authors excluded aromatic amino acids as possible outliers. For the rest of the 16 amino acids, they built the PLS models containing two components with reasonable predictive ability (R2 = 0.783, Q2 = 0.706). The plot of X-scores of the built PLS model (t1 against t2, Figure 2) shows similarity between amino acids in the new latent descriptor space. Sixteen amino acids are grouped according to polarity and inside each group by size and lipophilicity. Analysis of the contributions of descriptors in each PLS component demonstrated that the first component is dominated by lipophilicity and polarity (positive contributions of PIE, PIF, and Lam and negative contribution of DGR) (Figure 2). Size and polarity of amino acids mainly contribute to the second component with positive contributions of MR, Vol, and SAC and negative contribution of Lam. The importance of the descriptors is proportional to their distance from the origin in the loading space (Figure 2), thus more distant variables are more important. The usefulness of interpretation in this case is limited by the small training set. It is unlikely that the developed model will be able to correctly predict unfolding energy for non-natural mutations of tryptophan synthase, and thus these findings may not be extended on new observations. However, interpretation results support the hypothesis that unfolding energy depends on lipophilicity, water-accessible surface area, molecular volume, and refraction of mutated amino acids. For more examples of PLS model interpretations, please refer to the paper of Stanton.22 Other PLS methods, like orthogonal PLS (OPLS), O2PLS, L-shaped PLS, and orthogonal LPLS have also found application in chemoinformatics. They can be interpreted similarly to conventional PLS models on the basis of regression coefficients, scores, and loading.30,31 Decision Trees. The predictive ability of linear models is limited when the linear trend of structure−property relation-

Figure 3. Fragment of the decision tree from the work of Ashman et al. which describes the fluorescence of compounds as a function of structural motifs. (Reproduced with permission from ref 34. Copyright 1985 American Chemical Society). D

DOI: 10.1021/acs.jcim.7b00274 J. Chem. Inf. Model. XXXX, XXX, XXX−XXX

Perspective

Journal of Chemical Information and Modeling terminal node. However, for deep and highly branched trees such analysis may become cumbersome. Decision tree models can work with high-dimensional data and correlated descriptors because the variable selection step is implemented in the trees by designin each split, the model selects the optimal descriptor value for splitting. However, decision tree models are prone to overfitting and susceptible to noise that substantially decrease their predictive ability. Different strategies were developed to avoid overfitting, e.g. pruning of the models based on predictions of the validation set. Regarding interpretation, the major shortcoming of decision trees is their instability since small changes in the training set may lead to a model with substantially different rules in tree nodes. “Black Box” Models. To overcome issues of low predictive ability and model instability, more advanced modeling approaches were introduced: NNs, SVM, and later RF.35 These can work in a high-dimensional descriptor space even without preselection of variables, have high predictive ability and reasonable computational speed. Whereas NN models are prone to overfitting, SVM and RF models are more stable and tolerant to noise in the input data. These methods are excellent tools for predictive modeling and they quickly became the gold standards in the chemoinformatics community. However, these models are considered black boxes since their interpretation is not straightforward unlike linear and decision tree models. Neural Nets. In the simplest case, neural nets consist of an input layer of neurons, one hidden layer, and an output layer (Figure 4). The sigmoid function is usually used as an

Q ij =

Si =

Pij ∑i Pij

(6)

∑ Q ij (7)

j

relative importancei =

Si ∑i Si

(8)

One of the first examples of the application of the described approach was an investigation of the adsorbability of 55 organic compounds on activated carbon fibers.38 The authors used molecular connectivity indices (χ) in MLR and NN models. Several outliers were removed to obtain a better MLR model (eq 9). v

v

v

log K = 3.33 − 1.553χ + 0.585χ + 3.526χ − 1.423χ c v

+ 2.294χ pc

n = 49, R adj2 = 0.648, SE = 0.199

(9)

The NN model with three neurons in the hidden layer was trained on all 55 examples and 5 descriptors selected by the MLR model (5:3:1 network topology). It demonstrated greater fitness (Radj2 = 0.901) than the MLR model (Radj2 = 0.648). Both models had an acceptable predictive accuracy measured on a small external test set and therefore their interpretation would be reasonable. Since the relative importance of descriptors calculated from the NN model alone is insufficient for understanding the captured structure−property relationship, the authors used the signs of the regression coefficients of the MLR model to determine the direction of descriptor influence (Table 1). However, this is a big assumption which can significantly affect interpretation results, and normally such knowledge transfer between models should be avoided. Table 1. Relative Importance of Descriptors and Their Influence on Adsorbability Derived from NN and MLR Models, Correspondingly relative importance, % influence on log K

Figure 4. Example of a NN model structure. Iijweights of the connections between ith input and jth hidden neurons; Hjweights of the connections between jth hidden and output neurons.

χ

5 v

χ

6 v

χ

20.3 ↓

17.3 ↑

34.4 ↑

χc

3

11.6 ↓

χ pc

4 v

16.4 ↑

The conclusions drawn by the authors are somewhat contradictory due to ambiguous interpretation of the chosen descriptors. The 3χv index was considered as related to bulky and branched molecules. Its negative influence was explained by a slower diffusion in slitlike micropores of bulky molecules or by the structure of activated carbon which is composed of stacking micrographitic planes. Earlier it was shown that adsorption can occur via electron donor−acceptor interactions and flat molecules are favorable for adsorption probably due to a shorter distance between the molecule center and the activated carbon surface.39,40 This phenomenon may explain the adsorption decrease with a 3χc decrease which is related to highly substituted compounds comprising tert-butyl groups or having more than three substituents. However, the positive effect of the most important variable 6χv (34.4%), which is also related to highly branched compounds, like atrazin, the authors explained by lower solubility of such compounds and hence better adsorbability on the surface of activated carbon. The positive effect of 5χv descriptor which is related to heteroatomic contents may confirm the electron donor−acceptor nature of

activation function in hidden neurons and weights of connections between neurons are optimized during model training, e.g., by backpropagation. Various approaches have been proposed for extracting information about variables contributions from NN models. One of the first attempts was estimation of the relative importance of variables proposed by Garson.36,37 For each hidden neuron Hj, the product of the absolute values of weights of input (Iij) and hidden (Hj) neurons are calculated (eq 5), the product value is scaled across all input neurons (eq 6) and for each input neuron, the sum is calculated (eq 7). The relative importance of the ith input neuron is calculated as the ratio of Si to the sum of Si across all input neurons (eq 8). However, the relative importance measure does not provide information about the direction of the descriptor influence. Pij = |Iij| × |Hj|

3 v

(5) E

DOI: 10.1021/acs.jcim.7b00274 J. Chem. Inf. Model. XXXX, XXX, XXX−XXX

Perspective

Journal of Chemical Information and Modeling

The authors mainly focused on the first two components of the PLS model since these explained most of the variance and on the first two columns of the effective weights matrix for the NN model. The models had much in common. The large number of double bonds (NDB) had a significant negative effect on blood−brain permeation. The WNSA-3 and PNHS-3 descriptors representing the effects of partial negative surface area and the hydrophobic surface area, correspondingly, had the biggest positive impact on blood−brain permeability. This conclusion is supported by other studies.43 Differences between the two models are observed for the V4P descriptor representing molecular branching. In the PLS model, this had a small negative contribution in the first component and a large positive contribution in the second one, whereas in the NN model it had a large positive contribution in the two most important hidden neurons 1 and 2 whose effect was somewhat compensated by negative contributions in neurons 3 and 4. Thus, the overall picture of the structure−property relationship is similar for both models despite the “linearization” of the NN model interpretation. Both Garson’s and Guha’s approaches can also be applied to classification tasks where an NN model has several outputs (classes). Then descriptor contributions will be calculated for each output neuron separately. It should also be noted that NN models are usually initialized with random weights which are optimized during learning. For this reason, final model weights and interpretation results may differ slightly for independently trained models. To achieve more robust interpretation results several NN models may be built to average the relative importances/contributions. Random Forest. A random forest is an ensemble of decision trees built according to the following rules: (1) for the training of each tree the bootstrap sample of the whole training set is used, (2) the best split in each node is selected among a defined number of randomly chosen descriptors, (3) no pruning of trees.35 The most popular and widely used interpretation technique of RF models is estimation of variable importance. This procedure can be applied for other models as well therefore it will be discussed in detail in the corresponding section on machine learning method-independent approaches. Extraction of rules from the decision trees which constitute the forest is another straightforward interpretation alternative. However, since there is no tree pruning in RF models these rules can be very cumbersome. Therefore, different techniques were developed to reduce the number and complexity of extracted rules,44,45 but these techniques remain unpopular; few publications can be found regarding them.46 Kuz’min et al. proposed the procedure of estimating the contribution of single descriptors in a property value of a particular compound from regression RF models.47 First, in each node of the tree, the average activity among bootstrap training set compounds reaching this node is calculated (eq 13). For each child node the local score is calculated as the difference between average activity values in child and parent nodes (eq 14). This is the contribution of the ith descriptor used for splitting of the parent node. The overall contribution (Sk,i) of the ith descriptor for the kth compound is computed as the sum of its local scores when the compound reaches the node containing this descriptor averaged among all trees (T) (eq 15). Calculated descriptor contributions are context dependent and will vary for different compounds. This approach was later extended by Palczewska et al. to be suitable for interpretation of classification RF models and implemented

interactions of adsorbed molecules with activated carbon. This example demonstrates the importance of proper choice of descriptors for models which will be interpreted in term of descriptor contributions because uninterpretable descriptors may lead to contradictory conclusions or overinterpretation. An idea similar to Garson’s estimation of relative importance was used by Guha et al.41 These authors proposed calculating effective weights Pij as a product of weights of the ith input neuron (Iij) and jth hidden neuron (Hj) (eq 10) unlike the product of absolute values of Iij and Hj used in the relative importance interpretation scheme described above (eq 5). The matrix of size i × j of effective weights (Pij) is constructed where rows are input neurons and columns are hidden neurons. Columns are sorted according to relative squared contribution of hidden neurons (SCVj) calculated according to eqs 11 and 12. The interpretation of such a matrix is similar to interpretation of X-scores in the PLS method when hidden neurons are considered latent variables. Pij = Iij × Hj

CVj =

1 n

SCVj =

(10)

n

∑ Pij

(11)

i=1

CV2j ∑j CV2j

(12)

This approach was applied to model the blood−brain barrier permeability (logBB) of 97 compounds.41 This data set had been studied by means of the PLS model and the four most important descriptors were identified: NDB number of double bonds, WNSA-3 the difference between the partial negative surface area and the sum of the surface area on negative parts of molecules multiplied by the total molecular surface area, PNHS-3 atomic-constant-weighted hydrophilic surface area, V4P fourth-order valence-corrected path molecular connectivity index.42 Both PLS and the built NN models had comparable predictive ability. The X-scores matrix from the PLS model and the effective weights matrix from the NN model were retrieved for further comparison (Table 2). Table 2. X-Scores of the PLS Model and the Effective Weights Matrix for the NN Model (4-4-1 Topology)41 PLS components 1 WNSA-3 V4P NDB PNHS-3 Q2

0.54 −0.09 −0.57 0.62 0.59

2 −0.13 0.97 −0.08 0.17 0.74 NN (4-4-1)

3

4

0.79 0.17 0.58 −0.12 0.75

0.28 0.12 −0.58 −0.76 0.75

hidden neurons (ordered by SCV) WNSA-3 V4P NDB PNHS-3 SCV

1

2

3

4

52.41 37.65 −10.50 11.46 0.74

29.30 22.14 −16.85 6.59 0.16

−19.64 −3.51 −5.02 −2.72 0.08

2.26 −13.99 22.16 8.36 0.03 F

DOI: 10.1021/acs.jcim.7b00274 J. Chem. Inf. Model. XXXX, XXX, XXX−XXX

Perspective

Journal of Chemical Information and Modeling in the rfFC R package.48 The described approach was used for modeling of different endpoints. An example will be given below in the section on visualization of descriptor contributions.

A mean

1 = n

1 T

and IF

n

∑ Ai i=1

c p LSic = A mean − A mean

Sk , i =

THEN IC50 is low

number of atoms is medium-high AND molecular volume is medium-high AND index of hydrogen deficiency is medium-high or high AND molecular weight is medium-high AND total charge is medium-high AND angle bending energy is low AND torsional energy is low-medium AND electrostatic energy is low or low-medium THEN IC50 is high Since rule extraction techniques for NN models are not widely used in chemoinformatics there are few examples of their application. Benfenati et al. interpreted the fuzzy neural network model which was used as a meta-learner in classifying pesticide toxicity.58 More examples of the application of rule extraction algorithms in chemo- and bioinformatics can be found in the work of Hudson et al.59 Recently approaches to rule extraction from deep neural nets were developed.60 There are a number of rule extraction approaches to SVM models, and I recommend the review of Barakat and Bradley61 for more detailed information. One such approach is relabeling of predicted outcomes of an SVM model and applying one of the decision tree methods to produce an interpretable model (Figure 6). In some implementationsthe active learning

(13) (14)

n

∑ LSi ,j j=1

(15)

Rule Extraction Approaches. To provide a clear representation of relationships captured by NN and other black box models, a number of rule extraction approaches were proposed. These extract symbolic rules “IF···THEN···ELSE” as in decision trees.49 Originally they were developed for interpretation of NN models; however, later such algorithms were developed for SVM models as well. One of the early approaches which can be applied for interpretation of NN models was KT.50 This finds single features (neurons) or their subsets which guarantee exceeding the activation threshold in the neuron on the next layer of NN (Figure 5). The same procedure is applied to find features

Figure 5. Schematic representation of results provided by KT rule extraction algorithm developed by Fu.50 Letters represent nodes with associated weights.

which guarantee inactivation of the node. Another approach, Validity Interval Analysis (VIA), iteratively finds the intervals of input variables which consistently activate output neurons.51 Rule extraction approaches return sets of rules which are very similar to the rules derived by decision trees. However, rules derived from NN models are more stable to noise than decision trees and may outperform them demonstrating better generalization ability.52,53 Different modifications were proposed to efficiently extract rules from NN models and new rule extraction approaches to target specific NN architectures were developed.54−56 A fuzzy three-layer NN model with 10 neurons in the hidden layer was built for 151 HIV-1 protease inhibitors based on 30 descriptors which were scaled to the [0; 1] range.57 Descriptor and output values were split on intervals before rule extraction. Descriptor values were split on five equal intervals: low [0;0.2], low-medium (0.2;0.4], medium (0.4;0.6], medium-high (0.6;0.8], high (0.8;1]. Output IC50 values (nM) were also partitioned on five splits: low [0;20], low-medium (20;50], medium (50;100], medium-high (100;500], high > 500. Several rule sets were derived to distinguish active and inactive compounds, e.g.: IF molecular weight is low-medium AND angle bending energy is medium AND torsional energy is low, medium, or medium-high AND electrostatic energy is medium-high

Figure 6. Rule extraction from SVM models based on relabeling of input data points and on additionally populated new data points.

based approachadditionally a certain number of new data points are randomly added to be close to the support vectors to increase the density in those regions and improve their representation. These new data points are labeled by the SVM model and conventional rule induction algorithms are applied to classify the new data set containing the original training set and the generated data points (Figure 6).62 The obtained set of rules may even outperform initial SVM models. Raccuglia et al. developed an SVM model to predict reaction outcomes for the crystallization of diamine templated vanadium selenites.7 This model exceeded human intuition success rate G

DOI: 10.1021/acs.jcim.7b00274 J. Chem. Inf. Model. XXXX, XXX, XXX−XXX

Perspective

Journal of Chemical Information and Modeling

Figure 7. SVM-derived decision tree of diamine templated vanadium selenites synthesis. Green, blue, and red colored nodes define a specific set of reaction parameters that facilitate single-crystal formation. Inspection of these parameters leads to three chemical hypotheses of the effects low-, medium-, and high-polarizability amines on a reaction output, respectively. (Reproduced with permission from ref 7. Copyright 2016 Nature Publishing Group.)



(89% vs 78%) in a prospective study of hydrothermal reactions involving 34 new diamines. Interpretation of the obtained SVM models using a decision tree gave some insight leading to several hypotheses about optimal conditions of the studied reactions (Figure 7). Small, low-polarizability amines require the absence of competing Na+ cations and longer reactions times. Medium-polarizability amines require V4+ ions which were present in VOSO4 (sulfur present rule in Figure 7), because they are unable to generate V4+ from typical V5+ precursors in situ. High-polarizability tri- and tetramines require oxalate reactants to alter the charge density of inorganic secondary building blocks.7 This is a good example of how interpretation of QSAR models may help to generate reasonable chemical hypothesis. Recently, the same technique was used to confirm the importance of a developed descriptor which effectively discriminates structures which can be crystallized from those which cannot be crystallized.63 Actually, rule extraction cannot be considered as a true interpretation of black box models. It is just an approximation of these models with more simple ones which provide human readable output. Rule extraction approaches can be effective if there is only a relatively small number of highly important descriptors in the model, otherwise very long and complex rules can be generated. This limits the applicability of these interpretation approaches in chemoinformatic studies.

MACHINE LEARNING METHOD-INDEPENDENT INTERPRETATION APPROACHES

These approaches do not rely on the internal structure of models. This information is not even necessary. They investigate the model behavior, how the model output (response) changes with variation in input values (descriptors). They process all models as black boxes and therefore are suitable for interpretation of any model regardless of machine learning method used. These approaches are sensitivity analysis, variable importance, and partial derivatives. Sensitivity Analysis. Sensitivity analysis explores the dependence of output values to systematic changes in descriptor values while values of all other descriptors remain constant.64 Since the number of all possible combinations of descriptor values is incalculable later it was suggested to keep values of fixed descriptors as minimum, first quartile, median, third quartile, and maximum.65 In one of the first studies where sensitivity analysis was applied the authors built an NN model for 68 inhibitors of dihydrofolate reductase (Figure 8).64 This data set had been studied by means of a Hansch analysis and four of the most important descriptors were found: molar refraction of substituents in 3, 4, and 5 positions (MR3, MR4, MR5) and hydrophobic constant of the substituent in the position 3 (π3) (eq 16).66 The NN model was built on the same set of descriptors. H

DOI: 10.1021/acs.jcim.7b00274 J. Chem. Inf. Model. XXXX, XXX, XXX−XXX

Perspective

Journal of Chemical Information and Modeling

log(1/K ) = 11.79MR 35 − 15.74MR 52 + 6.55MR 5 + 0.89MR3 + 0.80MR 4 − 0.21MR 24 + 1.58π3 − 1.77log(β × 10 π3 + 1) + 6.24, where log β = 0.175 Figure 8. Structures of inhibitors of dihydrofolate reductase used to build linear Hansch66 and neural net64 models. One or more X substituents were in the 3, 4, and 5 positions of the phenyl ring, and they were selected from halogens, hydroxyl, nitro, alkyl, alkoxy, acetylamino, and others.

The above example provides a clear functional relationship between descriptor values and the investigated response. However, if the model includes a large number of descriptors which is now common, the analysis can be inefficient and it is rarely used in chemoinformatics today. For more details on sensitivity analysis I recommend the review of Iooss and Saltelli.67 Variable Importance. Estimation of variable importance provides information about relative importance of descriptors used for model building but not about the direction of their influence (positive or negative). However, even such general information may be useful. Most often it is used for variable selection but other applications are also possible. Unlike the Garson approach of estimating of descriptor importance from NN models there are several approaches which can be applied to any model. One of the early attempts to estimate variable importance was made by Györgyi.68 He suggested adding noise to the input variables and measuring changes in predictive performance of the NN model. The more sensitive to noise a variable is the more important it is. A similar strategy was proposed by Breiman for estimating variable importance in RF models.35 He suggested permuting the values of the single variable to break the initial association of the variable with the response and to calculate the decrease in prediction accuracy on the out-of-bag set. This technique is widely used in bioinformatics to select important genes in microarray studies.69 In QSAR modeling it can be used to estimate the importance of single descriptors or group of descriptors representing different physicochemical

The sensitivity analysis suggested that the biological response almost linearly depended on MR3 values and there is a parabolic relation to MR4 and cubic to MR5 values (Figure 9). The authors added the cubic term of MR5 in the regression model published beforehand (eq 16) and rebuilt it keeping the same percentage of overall variance for each physicochemical parameter. Some improvement in predictive ability of the new linear model was found (eq 17). The root mean squared error calculated for eight compounds with substituents in position 5 whose predicted responses were affected by the newly built model was decreases: 0.093 vs 0.074 for eqs 16 and 17, correspondingly. Unfortunately in this study the authors improved model fitting but did not estimate the predictive ability of either model properly. Therefore, to avoid possible overfitting and overinterpretation, models should be properly validated and sensitivity analysis diagrams should be produced on cross-validation folds and not on training data. log(1/K ) = 0.95MR 5 + 0.89MR 5 + 0.80MR 4 − 0.21MR 24 + 1.58π3 − 1.77log(β × 10 π3 + 1) + 6.65,

where log β = 0.175

(17)

(16)

Figure 9. Sensitivity analysis of inhibition of dihydrofolate reductase as a function of physicochemical parameters. (Reproduced with permission from ref 64. Copyright 1992 American Chemical Society.) I

DOI: 10.1021/acs.jcim.7b00274 J. Chem. Inf. Model. XXXX, XXX, XXX−XXX

Perspective

Journal of Chemical Information and Modeling

separately. It can be applied for calculation of descriptor contributions from any model regardless of machine learning method and endpoint (regression or classification). In combination with visualization techniques described in the next section, this method becomes really impressive. It should be noted that the first partial derivatives for linear models equal the corresponding regression coefficients. Thus, this approach can be considered as an estimation of “regression coefficients” from nonlinear models for a particular data point (compound) in a chemical space. Aoyama and Ichikawa proposed calculating partial derivatives for NN models that can be done in the analytical form.77 For the three-layer NN model depicted in Figure 4 the partial derivatives of the ith descriptor is calculated according to eq 18.77

factors to explain the underlying relationships or for variable selection to discard less relevant descriptors.70,71 Guha and Jurs demonstrated that the RF model selects important descriptors similar to the MLR and PLS models.72 They studied 79 derivatives of 4-piperazinylquinazolines for their ability to inhibit platelet derived growth factor receptor phosphorylation. Three descriptors which were used for developing the best MLR and PLS models were found in the top 10 of the most important ones selected by the RF model. One of those descriptors (MDEN-23) represents the geometric mean of the topological path lengths between secondary and tertiary nitrogens and encodes the presence or absence of nitrogen containing substituents attached to the common scaffold. Two other descriptors (RNHS3 and SURR-5) represent relative hydrophilicity or hydrophobicity of the molecular surface. The most important descriptor in the RF model (SURR-5) had a large negative contribution according to the linear models indicating that more hydrophobic molecules would have higher inhibition. This is supported by experimental observations. This example indicates once more (i) that variable importance alone is not enough since it does not show the direction of the influence and (ii) the importance of choosing clearly interpretable descriptors with unambiguous structural or physicochemical meaning. Later, the same authors demonstrated the applicability of Breiman’s variable importance to the interpretation of NN models.73 Variable importance can be also used for the estimation of relative importance of different physicochemical factors in the investigated property. Polishchuk et al. modeled the toxicity of 664 compounds toward Tetrahymena pyriformis.74 The authors encoded compound structures by simplex descriptors (count of tetraatomic fragments) where atoms were labeled by different properties: partial atomic charge, lipophilicity, refractivity, and H-bond donor/acceptor to represent possible contribution of electrostatic and hydrophobic interactions, polarizability, and H-bonding, respectively. Variable importance calculated from the RF model was summarized for each group of descriptors separately. Comparison of the summarized importances of different groups of descriptors indicated that mainly changes in hydrophobicity (31%) and polarizability (29%) contribute to toxicity changes. The importance of hydrophobic interactions for compounds toxic to Tetrahymena pyriformis was also confirmed by the PLS model built in the same study. Descriptors weighted by lipophilicity explained 50% of the variance of the toxicity. Estimating variable importance is a relatively simple technique and is widely used in different studies. This approach has some drawbacks related to the interpretation of RF models. The classification and regression tree (CART) algorithm32 used in conventional RF to build individual trees is biased to select variables with a larger number of categories leading to overestimation of importance of such variables.75 The bootstrap sampling used in RF may also have a negative effect. Variable importance may also be overestimated for highly correlated variables.75,76 To overcome these issues Strobl et al. suggested (1) building trees based on conditional inference framework instead of Gini criterion used in CART and (2) to estimate variable importance by conditional importance measures when input objects are first split into groups according to descriptor value and permuting descriptor values inside single groups.75,76 Partial Derivatives. The partial derivatives approach is a local sensitivity analysis method,67 but due to its exceptional power and wide use in chemoinformatics studies, I describe it

∂Y = ∂xi

∑ f ′(yj )Iijg ′(y)Hj j

(18)

where f′(yj) and g′(y) are the differential functions of the activation functions in the second (hidden) layer and third (output) layer of the NN model, and Iij and Hj are weights of connections between first and second layers and between second and third layers, correspondingly. In the case of several outputs in classification tasks or multitask models partial derivatives are calculated separately for each output neuron. The authors used this approach for interpretation of the NN model of 13C NMR chemical shifts of nonbornanes and nonbornenes depended on the exo and endo orientation of R substituents (Figure 10).77 The NN model was trained on

Figure 10. General structure of nonbornane and nonbornene derivatives used for modeling of 13C NMR chemical shifts.77 The dashed bond indicates the single or double bond.

chemical shifts of seven carbon atoms and the partial derivative of each descriptor was calculated. As expected, contributions had an opposite sign and almost the same magnitude for exo and endo isomers (Table 3). Contributions of chemical shifts of the first and the third carbon atoms were near zero and are probably insignificant whereas the chemical shift of the seventh carbon atom made a major contribution to the distinction of exo/endo isomers by the NN model. While the first partial derivatives designate descriptor contributions, Baskin et al. suggested calculating the dispersions of first partial derivatives as a measure of nonlinearity of contributions and the second partial derivatives to discriminate higher order dependencies like parabolic or hyperbolic ones.78 Marcou et al. proposed the scheme for calculation of partial derivatives for classification tasks and demonstrated their applicability on several examples.79 The finite difference approach can be used for the calculation of first partial derivatives for models which do not support analytical derivatives (see eqs 19−21 for forward, backward, and central difference, correspondingly). This can be applied for calculation of descriptor contributions from any model. An et al. used Canvas2D fingerprints and kernel PLS for modeling J

DOI: 10.1021/acs.jcim.7b00274 J. Chem. Inf. Model. XXXX, XXX, XXX−XXX

Perspective

Journal of Chemical Information and Modeling

Table 3. Average Contributions of Chemical Shifts for Each Carbon Atom in Nonbornane and Nonbornene Derivatives Calculated from the NN Model (Architecture 8-14-2) by the Partial Derivatives Method.a exo endo a

C1

C2

C3

C4

C5

C6

C7

−0.009 0.010

0.088 −0.088

0.031 −0.032

−0.111 0.110

−0.107 0.109

0.141 −0.141

−0.213 0.214

Reprinted with permission from ref 77. Copyright 1992 American Chemical Society.

descriptor value is unlikely to change compound neighbors and thus the predicted value. Therefore, the descriptor contributions will frequently be zero. This is especially so if the number of neighbors k is relatively small. Those weighted k-NN schemes which explicitly consider the distance to neighbors will be less likely to show this effect. Another unexplored issue of the partial derivatives approach is the estimation of the numeric differentiation error. For forward and backward differences, the approximation error is proportional to the step size δ, for central difference it is proportional to δ2, for the Richardson extrapolation, which calculates partial derivatives based on several step sizes, e.g. δ and 2δ, the approximation error is proportional to δ4. Thus, the Richardson extrapolation gives the lowest error, and the central difference has an advantage over forward and backward difference approaches. However, the total error is the sum of the approximation error and the calculation error. The latter depends on the prediction error of the model (ε) and is proportional to ε/δ. Thus, decreasing the step size will decrease the approximation error and increase the calculation error. Therefore, an optimal value of the step size δ should be found which will minimize the total error.

many endpoints followed by visualization of atom contributions.80 For calculation of the partial derivatives, they used the forward difference approach (eq 19) with step size δ 0.05. Baehrens et al. used 142 substructure Dragon descriptors and Gaussian Process (GP) with radial-based kernel for classification of Ames mutagenicity of chemical compounds.81 All 142 substructures were ranked according to their calculated average contributions. Almost all previously established toxicophores were found at the top of the list and known detoxicophores at the bottom. There are numerous studies where the partial derivatives approach has been used for interpretation of QSAR models. Many of them used the visualization of calculated contributions that will be described in the next section.79,82−87 C=

f (x + δ ) − f (x ) δ

(forward difference)

C=

f (x ) − f (x − δ ) δ

(backward difference)

C=

f (x + δ /2) − f (x − δ /2) δ

(19)

(20)



(central difference)

VISUALIZATION OF DESCRIPTOR CONTRIBUTIONS The comparative molecular fields analysis (CoMFA) is an example of an excellent combination of interpretability of PLS models with smartly designed descriptors.91 All training molecules are aligned in 3D space and then special probe atoms are placed in the intersections of the 3D lattice to calculate electrostatic and steric interactions of molecules with appropriate probes. These values form the matrix of X variables which is used for PLS modeling of the target property. Regression coefficients of the PLS model represent the contributions of electrostatic and steric factors in a particular point in 3D space. This makes possible the visualization of positive and negative effects of electrostatic and steric fields by contour plots. This information may suggest which structural modifications are preferable to improve the target property, e.g., remove bulky groups in regions with positive contributions of a steric field. Variance explained separately by steric and electrostatic descriptors provides additional information about the sensitivity of an investigated property to changes in these factors. For more details and applications of CoMFA, please refer elsewhere.92−94 The CoMFA approach is limited mainly to congeneric series of compounds because atom-based alignments do not always correctly represent the binding mode of compounds, and it is most effective for compounds with relatively rigid structures. Otherwise, selection of the appropriate “bioactive” conformers can become a hard task. There are many similar approaches which use the same idea of developing visually interpretable models which overcome some issues of the original CoMFA approach: topomer CoMFA,95 GRID,96 CoMSIA,97 SOMFA,98 etc. Visualization is very effective for interpretation of models built on fragment descriptors. Descriptor contributions may be

(21)

It is worth noting that some published approaches are not recognized as partial derivatives by their authors; however, they use the finite difference approach. The following scheme is frequently used for interpretation of a model built on substructural fingerprints or fragment descriptors. The bit or descriptor value is replaced with zero and the difference between the initial prediction and the prediction after fingerprint/descriptor nullification is calculated and it is interpreted by authors as the contribution of the corresponding substructure.82,88−90 The described procedure is the backward finite difference approach with δ = x (eq 20). However, the calculated contributions of substructural descriptors regardless of the used approach (regression coefficients, partial derivatives, etc.) should not be considered as contributions of corresponding fragments because the substructures used as descriptors for modeling usually overlap and the negative contribution of a particular substructure can be compensated by the positive contribution of an overlapping substructure. For example, let us consider that for a molecule NCCO (2-aminoethanol), the fragment NCC has a contribution +1, and CCO, −1. One may conclude that the NCC substructural motif has a positive influence while CCO has a negative one. However, if one distributes the fragment contribution on corresponding atoms and sums them up, only the nitrogen atom will make a positive contribution +1/3 and the oxygen −1/3; both carbon atoms will have zero contribution. Despite many advantages, such as simplicity of calculation and broad applicability, the partial derivatives approach may fail for some machine learning approaches, e.g., for k-nearest neighbor (k-NN) models because a small change of a single K

DOI: 10.1021/acs.jcim.7b00274 J. Chem. Inf. Model. XXXX, XXX, XXX−XXX

Perspective

Journal of Chemical Information and Modeling

phenyls were not favorable for a chosen molecule depicted in Figure 12. The calculated atom contributions can be summed up for the selected fragments to retrieve the general trend of structure− activity relationship, called global interpretation (Tables 4 and

calculated using any available approach, e.g. regression coefficients from linear models or partial derivatives from nonlinear models, and then, they are distributed on corresponding atoms. The calculated contribution of each descriptor is divided by the descriptor value and the number of atoms in the fragment and is assigned to corresponding atoms. Total atom contributions are the sum of the scores obtained from different descriptors. These can be visualized on the molecular structure of the investigated compounds by colorcoding. For the first time, a similar approach was suggested for visualizing the results of HQSAR,99 and later, it was used by other researchers to visualize descriptor contributions taken directly from linear QSAR models,100 calculated by means of different machine learning method-specific approaches or calculated by the partial derivatives method from SVM, RF, and other models.79,82−84 Let us examine the study of Kuz’min et al. where the authors modeled affinity for the 5-HT1A receptor of 347 compounds of the general formula depicted in Figure 11.47 Two models PLS

Table 4. Average Contributions Retrieved from PLS and RF Models for Aryl (Ar) Moieties of Ligands of 5-HT1A47

Figure 11. General formula of the 347 ligands of 5-HT1A receptor studied in the work of Kuz’min et al.47 Ar is substituted (hetero)aryls, L is a polymethylene chain, and R is various (poly)cyclic residues.

and RF were built and two machine learning method-specific approaches were used to estimate descriptor contributions in order to establish the structure−activity relationship trend. The first was based on regression coefficients from the PLS model as descriptor contributions and the second on the approach of estimation of descriptor contributions from RF models described in the previous section.47 Both models were trained on simplex descriptors (count of tetraatomic fragments). The variable section procedure was applied and 72 descriptors were selected for building of the PLS model, whereas the RF model was trained on all 6460 descriptors. Both models had reasonable predictive ability. Calculated descriptor contributions were distributed on the corresponding atoms and summarized. Summarized atom contributions were visualized on training set molecules to represent the relative importance of different parts of a moleculelocal interpretation (Figure 12). For example, imido moiety and ortho-methoxy group looked favorable and promoted affinity, whereas three-methylene chain and bulky

5). It becomes obvious that aryl moieties bearing in the orthoposition a p-electron donor substituents are favorable for interaction with the 5-HT1A receptor, whereas the substituents in para-position regardless of their nature are unfavorable probably due to some steric hindrance in that region of the receptor. At the same time it is clear that high affinity ligands should have a long enough linker chain with at least four Table 5. Average Contributions Retrieved from PLS and RF Models for Linker (L) Moieties of Ligands of 5-HT1A47

Figure 12. Color-coded atom contributions of the PLS model of a ligand of the 5-HT1A receptor. Atoms colored green have positive contributions to binding to the 5-HT1A receptor, and atoms colored red are negative. (Reproduced with permission from ref 47. Copyright 2011 John Wiley and Sons.) L

L

PLS

RF

−(CH2)6− −(CH2)5− −(CH2)4− −(CH2)3− −(CH2)2− −CH2−

0.8 0.71 0.81 0.08 −0.04 0.06

0.14 0.19 0.14 −0.01 −0.03 0.05 DOI: 10.1021/acs.jcim.7b00274 J. Chem. Inf. Model. XXXX, XXX, XXX−XXX

Perspective

Journal of Chemical Information and Modeling methylene groups. These findings correspond to pharmacophore modeling and experimentally observed relationships.101−103 Of course, both models PLS and RF have some discrepancies in calculated contribution values, since different interpretation approaches were used and the modeling property is not additive. However, that did not interfere with retrieving similar structure−activity relationship trends and correct conclusions. This example demonstrates the important features of model interpretation. Results of structural interpretation of correct models should not depend significantly on a chosen machine learning method, interpretation procedure, or variable selection. Partially this conclusion was supported by the study of Marchese Robinson et al.104 The authors investigated different interpretation strategies for RF, linear SVM, and PLS models on a number of benchmark data sets and a good correspondence between found important structural motifs was observed. The same interpretation strategy can be applied for visualization of descriptor contributions calculated by means of partial derivatives. For example, Hasegawa et at. used extended connectivity fingerprints (ECFP) in combination with the SVM method for modeling inhibitors of D2, D3, and D4 dopamine isoenzymes.84 Visualization of calculated atom contributions helped to explain the selectivity of compounds to different dopamine isoenzymes. In particular, the carbonyl group of the depicted compound looks more suitable for binding to D2 and D3 receptors and not to the D4 receptor (Figure 13). There are many more studies which use the power of visualization for analysis of the results of model interpretation obtained by means of partial derivatives.79,82,83,85−87

They allow us to estimate the contributions of separate structural motifs directly from the model that obviates the intermediate stage of calculating descriptor contributions. Therefore, they are applicable for interpretation of models based on noninterpretable or hardly interpretable descriptors, such as many Dragon descriptors. All these approaches use the idea of molecular transformation105 or the matched molecular pairs (MMP)106 approaches. If there is a molecule AB consisting of two fragments A and B, the contribution of fragment B can be calculated as the difference between predicted activity values for the compound AB and the corresponding counter-fragment A (hypothetical molecule which is obtained from AB by removing fragment B). This approach does not use information about the machine learning method and descriptors used, and it can be applied equally well to classification, regression, and consensus models. Thus, it can be considered universal for the structural interpretation of QSAR models. It is important to distinguish this group of approaches where a whole fragment is removed from the structure from the approaches mentioned above where the value of a particular fragmental descriptor or a corresponding bit in the substructural fingerprint is replaced by zero. The former estimates the contribution of the removed fragment, and the latter estimates the contribution of a particular substructural descriptor; these are not the same. In the second case the encoded substructures can overlap, and thus to properly estimate the fragment contribution, one needs to distribute calculated descriptor contributions on atoms, summarize them, and assign them to the structure as described in the previous section. From this point of view, the Free−Wilson approach is the closest to the model → structure paradigm because fragments used for Free− Wilson modeling usually do not overlap. Thus, this approach can be considered as belonging to both interpretation paradigms. There are several implementations of the model → structure paradigm. The first two approaches were proposed independently by Riniker and Landrum107 and by Polishchuk et al.108 They are very similar in nature. Riniker and Landrum107 suggested estimating the contributions of single atoms by removing them from the structure and calculating the difference between the predicted values for the initial structure and the structure with a removed atom. This gives information about the contributions of individual atoms in each moleculelocal interpretation. The authors called this “similarity maps”. It was implemented using Python open-source toolkits RDKit,109 scikit-learn, and matplotlib. The same authors demonstrated its applicability on a set of dopamine D3 ligands using RF and ̈ Bayes (NB) models built on different fingerprints.107 Naive Interpretation of both models indicated the significant contribution of the piperazine ring of the studied compounds (Figure 14). Indeed, the nitrogen atom of the piperazine ring is protonated under physiological conditions and is responsible for formation of the critical salt bridge with Asp1103.32 of the D3 receptor.110 However, the authors noted that possible collisions in hashed fingerprints may somewhat distort the interpretation of the results. Later this approach was used for visualization of atomic contributions of the SVM model based on Morgan fingerprints in the Pred-hERG web-service which predicts the probability of compounds being hERG blockers.111 This approach however is limited by the use of descriptors which can be calculated for molecules consisting of multiple fragments. After removal of a single atom quite often multiple

Figure 13. Atom coloring of a dopamine inhibitor regarding different QSAR models. (Reproduced with permission from ref 84. Copyright 2010 John Wiley and Sons.)

In this section, conventional interpretation approaches of QSAR models are described. Machine learning methodindependent approaches and partial derivatives in particular are well-suited to interpret models based on any machine learning method. Thus, it is hardly true to say that genuine black box models exist in QSAR modeling. However, the common limitation of all approaches used in the traditional model → descriptors → (structure) paradigm is the necessity to use only interpretable descriptors.



MODEL → STRUCTURE INTERPRETATION PARADIGM Approaches using this paradigm have appeared recently and have not yet become popular. However, they have great potential because they make it possible to interpret any model regardless of machine learning method and descriptors used. M

DOI: 10.1021/acs.jcim.7b00274 J. Chem. Inf. Model. XXXX, XXX, XXX−XXX

Perspective

Journal of Chemical Information and Modeling

included both interpretable and noninterpretable Dragon descriptors. The calculated contributions of known toxicophores and detoxicophores from mutagenicity RF and SVM models were consistent across the models and correspond to the known effects of those fragments (Figure 15). For the

Figure 15. Contributions of established toxicophore and detoxicophores from RF and SVM models based on Dragon and simplex (SiRMS) descriptors. (Reproduced with permission from ref 108. Copyright 2013 John Wiley and Sons.)

Figure 14. Similarity maps for two compounds from the set of dopamine D3 ligands using RF (left) and NB (right) models based on Morgan fingerprints with a radius of 2. Atoms shown in green are favorable for higher affinity ligands, in gray, neutral, and in pink, unfavorable. (Reproduced with permission from ref 107. Copyright 2013 Riniker and Landrum.)

solubility data set the retrieved global trend of structure− property relationship from PLS, RF, and SVM models based on simplex and Dragon descriptors demonstrate good convergence and correspondence to the known influence of different functional groups and fragments on compound solubility.108 Later this interpretation approach was successfully applied in other studies: modeling of of adenosine A2B receptor antagonists (consensus of linear models built on Dragon and MOE descriptors)112 and modeling of blood−brain barrier permeability (RF and SVM models based on ECFP).113 Recently the described structural interpretation approach was extended to enable estimation not only of overall fragment contributions but also the contributions of different physicochemical properties of those fragments in order to answer the question why those fragments are important.114 The extended approach of structural and physicochemical interpretation (SPCI) uses interpretable descriptors which have a clear physicochemical meaning but it does not belong to the model → descriptors → (structure) paradigm because contributions of single descriptors are not calculated. Let us illustrate the SPCI approach using a simple example. Assume that activity is described by the equation F = aπ + ρσ + i, where π is the hydrophobic term, e.g., lipophilicity, σ is the electronic term, e.g., the Hammett constant and i is an intercept. One has a molecule AB consisting of two fragments. The overall contribution of fragment B can be calculated as the difference between FAB = aπAB + ρσAB + i and FA = aπA + ρσA + i (structural interpretation). The contribution of individual physicochemical terms of fragment B, e.g., hydrophobic factor, is calculated as the difference between FAB = aπAB + ρσAB + i and FAπ = aπA + ρσAB + i (physicochemical interpretation). The fragment B was virtually removed in terms of just one physicochemical factor. This can be repeated separately for all other terms as well. The interpretation of these results can be somewhat tricky. The positive contribution of lipophilicity of fragment B will mean that the distribution of this property over atoms of the fragment causes increasing of the studied endpoint value. On

fragments will appear and hence models based on, for example, Dragon descriptors, cannot be interpreted in this way. Polishchuk et al. proposed a more general approach and suggested estimating contributions of whole fragments or groups of atoms by means of the same removal routine.108 This enables estimation of the contributions of whole fragments (functional groups, scaffolds, linkers) as well as their combined effect if two or more groups are removed simultaneously from the structure. This approach may be used for local and global interpretation. The global interpretation is ranking of a series of selected fragments together according to their average contributions to represent captured structure−property relationships. The distribution of contributions of a single fragment may also provide additional useful information. Low variance of contributions means that fragment contribution is almost independent of the molecular context while high variance suggests strong influence of the molecular context on fragment contribution. This approach is less susceptible to descriptors used than the previous one. Contributions of terminal fragments can be calculated for models based on any descriptors, even noninterpretable Dragon ones. In this case, free valence is capped with hydrogen atoms. Contributions of linkers and scaffolds can be calculated only from models based on descriptors supporting molecules consisting of multiple fragments, e.g. simplex descriptors, substructural fingerprints, etc. The applicability of the developed structural interpretation approach was demonstrated on two modeling tasks: solubility (regression) and Ames mutagenicity (classification).108 Two types of descriptors were chosen: (i) simplex descriptors which are counts of tetraatomic fragments with atoms labeled according to different atomic properties and (ii) all 2D Dragon descriptors. The former are interpretable descriptors and the latter can be considered as noninterpretable because they N

DOI: 10.1021/acs.jcim.7b00274 J. Chem. Inf. Model. XXXX, XXX, XXX−XXX

Perspective

Journal of Chemical Information and Modeling

Figure 16. Overall contributions and contributions of different physicochemical factors of selected fragments in blood−brain barrier permeability resulted from the consensus model consisting of RF, GBM, and SVM models built on simplex descriptors. (Adapted with permission from ref 114. Copyright 2016 American Chemical Society.)

authors proposed calculating the difference between the predicted activity values of two compounds which form the matched pair to estimate the contribution of the transformation, e.g. difference in predicted values of compounds AB and AC results in the contribution of the B > C transformation. Such information can be useful for lead optimization. The absent members of matched pairs in the training set can be virtually generated to increase pair coverage and to improve the significance of the results in comparison to MMP. The proposed approach has no restrictions on descriptors used. It was implemented as a part of the OCHEM portal (http://ochem.eu). The applicability of the approach was demonstrated for the aquatic toxicity of chemical compounds (ASNN model built on E-state indices) and inhibitors of CYP3A4 (a decision tree model built on E-state indices and ALogPS descriptors)117 and later for interpreting the consensus model of lipophilicity of Pt(II) and Pt(IV) complexes which included three ASNN models based on E-state, EFG, and ISIDA fragmental descriptors.118 The authors suggested visualizing pairs found on a transformation graph where arrows links matched pairs and show the direction of the increase of the studied property (Figure 17). However, such representation may be misleading. If it was found that both A > B and B > C transformations enhance the activity, it would not be possible to conclude about the A > C transformation effect, as the molecular context in A > B and B > C pairs may differ. To overcome this issue, an analogue of the matched molecular series approach can be implemented where sequences of transformations are identified, e.g., A > B > C > ··· > Z, but that would require substantial computational efforts to enumerate all missing compounds. However, this approach may be feasible if the series of enumerated compounds would be limited by design. In their study of the toxicity of nitroaromatics Tinkov et al. enumerated possible substitutions of nitrobenzene to track the

the contrary, the negative contribution will mean that lipophilicity of the fragment B decreases the endpoint value. But it is not possible to derive from these results information similar to interpretation of descriptors contributions described above, e.g., increasing of lipophilicity will increase or decrease the endpoint value. In real models it is not single descriptors but whole groups that represent particular physicochemical factors and the fragment of interest should be removed in terms of all descriptors belonging to a certain group representing a single physicochemical factor. Simplex representation of molecular structure perfectly suits this interpretation strategy because each atom is labeled by different physicochemical parameters which are taken into account for generation of simplex descriptors.100,114,115 The described SPCI approach based on simplex descriptors was implemented in the open-source software tool with a graphical user interface.116 However, this approach is not limited using only simplex descriptors; other suitable descriptors can be used as well. SPCI was successfully applied to interpretation of RF, SVM, PLS, gradient boosting machines, and consensus models of blood−brain barrier permeability, affinity of antagonists of the fibrinogen receptor, and acute in vivo toxicity in rats as well as of one of the classical Free−Wilson data sets.114 In all cases, interpretation results of individual models converged and corresponded to experimentally observed structure−activity relationship. For example, it was found that groups with the ability to form hydrogen bonds, like carboxylic, amide, nitro, and thiazolyl, decrease blood−brain permeability, while the trifluoromethyl group may enhance it (Figure 16). Known and new potential toxicophores were retrieved by SPCI of QSAR models of in vivo toxicity; however, these findings should be extrapolated carefully to other compounds considering the applicability domain of the developed QSAR model.114 The third approach of structural interpretation was proposed by Sushko et al., and it replicates the MMP approach.117 The O

DOI: 10.1021/acs.jcim.7b00274 J. Chem. Inf. Model. XXXX, XXX, XXX−XXX

Perspective

Journal of Chemical Information and Modeling

Figure 17. Example of a transformation graph for aquatic toxicity. (Reproduced with permission from ref 117. Copyright 2014 Sushko et al.

changes of predicted acute toxicity in rats depending on structural changes (Figure 18).119 It was predicted that 1,3dinitrobenze would be much more toxic than nitrobenzene, whereas 1,2- and 1,4-dinitrobenzenes have intermediate toxicity. Derivatives with more nitro groups have intermediate toxicity. It might be supposed that 1,3-dinitrobenzene may have a specific mechanism of toxic action relative to 1,2- and 1,4dinitrobenzenes, e.g., it may act as an uncoupler of oxidative phosphorylation like 2,4-dinitrophenol. Different mono-, di-, tri-, and tetrafluoro nitrobenzenes have similar toxicity which is related to the number of fluorine atoms and not to their position in the benzene ring. This might mean that these compounds have the same or very similar mechanism of toxic action. Interpretation based on a series of enumerated compounds provides a clear representation of structure− activity relationships captured by the model and can be easily analyzed by a human. However, it requires a reasonably designed series of compounds that may be very difficult in some cases. Therefore, methods which can automatically enumerate reasonable structures will be helpful. Webb et al. proposed an approach based on feature/fragment combination networks which is applicable to interpretation of binary classification models.120 A molecule of interest is fragmented, and fragments are organized in a hierarchical manner. The studied end-point values are then predicted by the model for each fragment, and the nodes of the obtained network are labeled according to their roles (Figure 19). The authors applied this approach to interpretation of RF, SVM, kNN, and DT models of mutagenicity. An example of an

Figure 19. Example of interpreted network for 2-amino-6-nitrobenzoic acid built for SVM model of mutagenicity. (Adapted with permission from ref 120. Copyright 2014 Webb et al.)

outputted fragment network is provided in Figure 19. ACTIVATING nodes are the first nodes from the bottom predicted to be active and are not deactivated by the descendant nodes (node 7 in Figure 19). A DEACTIVATED node is one predicted active (node 2) and having the inactive parent node (node 5). A DEACTIVATING node is one predicted inactive (node 2) and having an active child node (node 5). Nodes 1, 3, 4, and 6 are labeled as ACTIVITY_IDENTIFIED since they have an activating descendant. NEGATED and IGNORE nodes are omitted in Figure 19 for clarity. This provides a clear picture of important patterns. It identifies the toxic contribution of a nitro group and the deactivating effect of a carboxyl group appended to aminobenzene. Such networks can be obtained for each compound giving a local interpretation. Patterns can be further aggregated to produce more general patterns. This looks like a promising interpretation strategy because influence of molecular context is captured automatically during analysis of the network. However, this approach is not adapted for interpretation of regression models that would be desirable. One of unexplored issues of all described approaches is the effect of model accuracy on calculated contributions. The contribution is calculated as a difference between two predicted

Figure 18. Effect of different substituents on toxicity of nitrobenzene derivatives predicted from the QSAR model. (Reproduced with permission from ref 119. Copyright 2016 Springer.) P

DOI: 10.1021/acs.jcim.7b00274 J. Chem. Inf. Model. XXXX, XXX, XXX−XXX

Perspective

Journal of Chemical Information and Modeling Table 6. Different Scenarios of Global Interpretation of QSAR Models scenario 1

Do specific interactions exist that cannot be disregarded? NO (e.g., passive diffusion through membranes, solubility, lipophilicity, etc.)

2 3

YES (e.g., ligand−receptor interactions)

Is the position of a ligand toward its target known?

fragment selection and grouping

NOT RELEVANT

manual selection on the basis of researcher experience

YES

selection according to ligand’s pose relatively to the target recommendation: select fragments from homogeneous sets of compounds having a common scaffold and presumably, acting by the same mechanism

NO

position in the structure. If one mixes its contributions, this might lead to loss of information and wrong conclusions. The third scenario is the most frequent in QSAR modeling tasks, when the investigated compounds form specific interactions but their poses relative to the target are unknown, cannot be established or compounds act by different or competing mechanisms. This is true for many tasks related to toxicity prediction when it is difficult to identify the target or the effect may be caused by several mechanisms simultaneously. In this case, retrieval of general structure−activity relationship trends should be done with care. For example, one may consider limiting the analysis to only congeneric series of compounds assuming that they all have the same mechanism of action and binding mode. Data sets corresponding to the third scenario are real black box for mechanistic interpretation of QSAR models. Examples of all three scenarios were provided in our recent publication.114

values which can be affected by a model. Thus, a small difference (fragment contribution) can be explained by the prediction error which is mainly caused by uncertainty of the training experimental data. Another issue is overestimation of contributions calculated for highly active/inactive compounds based on models which use nonextrapolating machine learning methods, e.g. tree-based ones (decision trees, Random Forest, gradient boosting machines, etc.). Nonextrapolating methods always return predictions within the range of values for the training set compounds used for modeling. Thus, for top active (inactive) compounds the contributions of fragments or matched pair transformations will always return zero or positive (negative) values because minuend will always be greater (lesser) or equal to subtrahend. This may create a bias and some outliers can be observed. All the approaches described investigate QSAR models as black boxes. They provide some input and analyze an output. Using such a paradigm, any model can be interpreted and thus no true black box QSAR models exist.



FUTURE PERSPECTIVES OF QSAR INTERPRETATION Increased interest in QSAR model interpretation is notable in recent years. Researchers more frequently use different techniques to extract the knowledge about structure−property relationships captured by QSAR models. In the era of big data and deep learning, interpretation of obtained models should become the ultimate tool in discovery of treasures hidden in chemical data sets. Interpretation already helps researchers to better understand structure−property relationships and generate new hypotheses. Wider application of model interpretation in QSAR studies, e.g. structural modification of studied compounds to improve their biological profiles, design of compounds with a desired set of properties, etc., are anticipated. Since all QSAR models are currently interpretable, their use for regulatory purposes may require the mandatory interpretation step and comparison of structure−property relationships captured by models with expert knowledge, thus performing knowledge-based validation of models. The latter may be important for all QSAR studies as well because it may reveal hidden biases in training/test sets which may be missed by statistical validation, e.g., trivial unreasonable patterns which discriminate actives and inactives. It is anticipated that new interpretation approaches will emerge. However, the main progress should be achieved in development of approaches targeting black box data sets (Table 6, scenario 3) where the underlying mechanism(s) is not revealed. There is a need to develop approaches which can distinguish the influence of different molecular contexts. Another important direction is development of easy-to-use interpretation workflows and tools for analysis of enormous amounts of information from interpretation of QSAR models and systematization of captured relationships. Some steps have been made. Organization (visualization) of structure−property



INTERPRETATION SCENARIO AS A FUNCTION OF A MODELING ENDPOINT Despite the described advances in model interpretation, there is one natural limitation of model interpretabilityinterpretability of an endpoint. It has a minor effect on local interpretability when contributions of atoms or fragments are calculated and considered for a single molecule. But it affects global interpretability of QSAR models when the general trend of structure−activity relationship is established as fragment grouping and analysis should take into account a mechanism of action or binding mode of investigated compounds; otherwise, interpretation results might be distorted. Several scenarios are possible (Table 6). The simplest one is where compounds do not form specific interactions with their functional target or these interactions can be neglected. Of course this is simplification but it may be particularly true for solubility, lipophilicity, passive diffusion through membranes, etc. The position of a particular group, e.g., cyclohexyl, in the middle of the structure or at the terminus will slightly affect compound lipophilicity or solubility. In this case fragments can be selected based on the decision of a researcher and identical fragments can be grouped without considering their molecular context. The second scenario is interpretation of models built for compounds which form specific interactions with their target and these interactions are known from experiments or molecular simulation studies. This is true for ligand−protein recognition. Here, fragments should be selected and grouped according to the ligand poses relative to the target and it would be reasonable to analyze and group fragments which bind to the same amino acids separately because the same fragment may have different effects on compound binding depending on its Q

DOI: 10.1021/acs.jcim.7b00274 J. Chem. Inf. Model. XXXX, XXX, XXX−XXX

Perspective

Journal of Chemical Information and Modeling Table 7. Applicability of Interpretation Approaches to QSAR Models

Table 8. List of Freely Accessible Software for QSAR Model Interpretation Using One of the Model → Structure Approaches approach

software

link

ref

similarity map

RDKit

http://rdkit.org

107

universal structural interpretation structural and physicochemical interpretation

SPCI

http://qsar4u.com/pages/sirms_qsar.php; https://github.com/DrrDom/spci

108 114

prediction-driven matched molecular pairs

OCHEM

http://ochem.eu

117

QSAR models are empirical by nature. Therefore, to avoid misinterpretation, the model should meet certain criteria: (1) it should be predictive according to conventional statistical validation and (2) interpretation results should be considered only within the applicability domain of the corresponding QSAR model unless they are not supported by experimental mechanistic confirmation which can broaden the applicability of the found structure−property relationships. Extrapolation is unreliable not only for prediction but for interpretation results as well. There are several available tools which can be used for interpretation of QSAR models (Table 8). However, more tools will probably become available in the future.

relationships as (un)directed graphs is an attractive systematization approach. However, such networks can be extremely complex, and one needs a tool to extract relevant information from them.121,122 Automation aside, interpretation is complex, and human intervention will still be required.



CONCLUSIONS

Recent advances in interpretating QSAR models and the shift in interpretation paradigm from model → descriptors → (structure) to model → structure may change the field of QSAR modeling. For now there are no reasons to oppose QSAR model interpretability and predictivity. Traditionally considered highly predictive models such as NN, SVM, or RF are interpretable as well as any other model. The only difference is choice of interpretation approach that depends on chosen descriptors and machine learning methods (Table 7). If interpretable descriptors are used for QSAR modeling then one may choose among many machine learning method-specific and method-independent interpretation approaches as well as model → structure paradigm approaches. In the case of substructural fingerprints and fragment descriptors, it is possible to transfer descriptor contributions onto structures for visualization and analysis. If noninterpretable descriptors are used for modeling, then the model → structure approaches are the only choice. The good news is that if models built on the same data set are correct the structural interpretation results should converge regardless of the chosen interpretation approach, machine learning method, and descriptors. Thus, any model can be considered as interpretable, but not all endpoints! QSAR models built for endpoints with an unknown mechanism of action, and binding mode should be interpreted carefully to avoid overinterpretation and wrong conclusions.



AUTHOR INFORMATION

Corresponding Author

*E-mail: [email protected]. ORCID

Pavel Polishchuk: 0000-0001-5088-8149 Notes

The author declares no competing financial interest.



ACKNOWLEDGMENTS The author thanks Prof. V. Kuz’min for fruitful discussions, comments, and remarks which helped me considerably improve the manuscript. This work was supported by Russian Scientific Foundation, grant no. 14-43-00024.



LIST OF ABBREVIATIONS DT, decision tree; MLR, multiple linear regression; MMP, matched molecular pairs; k-NNs, k-Nearest Neighbors; NNs, Neural Nets; PLS, partial least squares; QSAR, quantitative R

DOI: 10.1021/acs.jcim.7b00274 J. Chem. Inf. Model. XXXX, XXX, XXX−XXX

Perspective

Journal of Chemical Information and Modeling

Benchmarking of Distance to Models for Ames Mutagenicity Set. J. Chem. Inf. Model. 2010, 50, 2094−2111. (19) Roy, K.; Ambure, P.; Aher, R. B. How important is to detect systematic error in predictions and understand statistical applicability domain of QSAR models? Chemom. Intell. Lab. Syst. 2017, 162, 44−54. (20) Alves, V. M.; Muratov, E. N.; Capuzzi, S. J.; Politi, R.; Low, Y.; Braga, R. C.; Zakharov, A. V.; Sedykh, A.; Mokshyna, E.; Farag, S.; et al. Alarms about structural alerts. Green Chem. 2016, 18, 4348− 4360. (21) Stanton, D. T.; Jurs, P. C. Computer-assisted study of the relationship between molecular structure and surface tension of organic compounds. J. Chem. Inf. Model. 1992, 32, 109−115. (22) Stanton, D. T. QSAR and QSPR Model Interpretation Using Partial Least Squares (PLS) Analysis. Curr. Comput.-Aided Drug Des. 2012, 8, 107−127. (23) Hansch, C.; Maloney, P. P.; Fujita, T.; Muir, R. M. Correlation of Biological Activity of Phenoxyacetic Acids with Hammett Substituent Constants and Partition Coefficients. Nature 1962, 194, 178−180. (24) Bhhatarai, B.; Garg, R.; Gramatica, P. Are Mechanistic and Statistical QSAR Approaches Really Different? MLR Studies on 158 Cycloalkyl-Pyranones. Mol. Inf. 2010, 29, 511−522. (25) Free, S. M.; Wilson, J. W. A Mathematical Contribution to Structure-Activity Studies. J. Med. Chem. 1964, 7, 395−399. (26) Patel, Y.; Gillet, V. J.; Howe, T.; Pastor, J.; Oyarzabal, J.; Willett, P. Assessment of Additive/Nonadditive Effects in Structure−Activity Relationships: Implications for Iterative Drug Design. J. Med. Chem. 2008, 51, 7552−7562. (27) Hoerl, A. E.; Kennard, R. W. Ridge Regression: Biased Estimation for Nonorthogonal Problems. Technometrics 1970, 12, 55− 67. (28) Lindberg, W.; Persson, J.-A.; Wold, S. Partial least-squares method for spectrofluorimetric analysis of mixtures of humic acid and lignin sulfonate. Anal. Chem. 1983, 55, 643−648. (29) Wold, S.; Sjöström, M.; Eriksson, L. PLS-regression: a basic tool of chemometrics. Chemom. Intell. Lab. Syst. 2001, 58, 109−130. (30) Eriksson, L.; Rosén, J.; Johansson, E.; Trygg, J. Orthogonal PLS (OPLS) Modeling for Improved Analysis and Interpretation in Drug Design. Mol. Inf. 2012, 31, 414−419. (31) Hasegawa, K.; Funatsu, K. Data Mining of Chemogenomics Data Using Bi-Modal PLS Methods and Chemical Interpretation for Molecular Design. Mol. Inf. 2014, 33, 749−756. (32) Breiman, L.; Friedman, J. H.; Olshen, R. A.; Stone, C. J. Classification and Regression Trees; Wadsworth: Belmont, CA, 1984; p 368. (33) Quinlan, J. R. C4.5: programs for machine learning; Morgan Kaufmann Publishers Inc.: San Francisco, CA, 1993; p 302. (34) Ashman, W. P.; Lewis, J. H.; Poziomek, E. J. Decision tree for chemical detection applications. Anal. Chem. 1985, 57, 1951−1955. (35) Breiman, L. Random Forests. Machine Learning 2001, 45, 5−32. (36) Garson, G. D. Interpreting neural-network connection weights. AI Expert 1991, 6, 46−51. (37) Goh, A. T. C. Back-propagation neural networks for modeling complex systems. Artificial Intelligence in Engineering 1995, 9, 143−151. (38) Brasquet, C.; Bourges, B.; Le Cloirec, P. Quantitative Structure−Property Relationship (QSPR) for the Adsorption of Organic Compounds onto Activated Carbon Cloth: Comparison between Multiple Linear Regression and Neural Network. Environ. Sci. Technol. 1999, 33, 4226−4231. (39) Leon y Leon, C. A.; Solar, J. M.; Calemma, V.; Radovic, L. R. Evidence for the protonation of basal plane sites on carbon. Carbon 1992, 30, 797−811. (40) Luehrs, D. C.; Hickey, J. P.; Nilsen, P. E.; Godbole, K. A.; Rogers, T. N. Linear Solvation Energy Relationship of the Limiting Partition Coefficient of Organic Solutes between Water and Activated Carbon. Environ. Sci. Technol. 1996, 30, 143−152. (41) Guha, R.; Stanton, D. T.; Jurs, P. C. Interpreting Computational Neural Network Quantitative Structure−Activity Relationship Models:

structure−activity relationship; QSPR, quantitative structure− property relationship; RF, Random Forest; SVM, Support Vector Machine



REFERENCES

(1) Cronin, M. T. D.; Puzyn, T.; Leszcynski, J.; Cronin, M. Quantitative Structure−Activity Relationships (QSARs)−Applications and Methodology. Recent Adv, QSAR Studies 2010, 3−11. (2) Cherkasov, A.; Muratov, E. N.; Fourches, D.; Varnek, A.; Baskin, I. I.; Cronin, M.; Dearden, J.; Gramatica, P.; Martin, Y. C.; Todeschini, R.; et al. QSAR Modeling: Where Have You Been? Where Are You Going To? J. Med. Chem. 2014, 57, 4977−5010. (3) Gasteiger, J. Chemoinformatics: Achievements and Challenges, a Personal View. Molecules 2016, 21, 151. (4) Dearden, J. C. The History and Development of Quantitative Structure-Activity Relationships (QSARs). International Journal of Quantitative Structure-Property Relationships (IJQSPR) 2016, 1, 1−44. (5) Kuz’min, V. E.; Artemenko, A. G.; Muratov, E. N.; Volineckaya, I. L.; Makarov, V. A.; Riabova, O. B.; Wutzler, P.; Schmidtke, M. Quantitative Structure−Activity Relationship Studies of [(Biphenyloxy)propyl]isoxazole Derivatives. Inhibitors of Human Rhinovirus 2 Replication. J. Med. Chem. 2007, 50, 4205−4213. (6) Muratov, E. N.; Artemenko, A. G.; Varlamova, E. V.; Polischuk, P. G.; Lozitsky, V. P.; Fedchuk, A. S.; Lozitska, R. L.; Gridina, T. y. L.; Koroleva, L. S.; Sil’nikov, V. N.; et al. Per aspera ad astra: application of Simplex QSAR approach in antiviral research. Future Med. Chem. 2010, 2, 1205−1226. (7) Raccuglia, P.; Elbert, K. C.; Adler, P. D. F.; Falk, C.; Wenny, M. B.; Mollo, A.; Zeller, M.; Friedler, S. A.; Schrier, J.; Norquist, A. J. Machine-learning-assisted materials discovery using failed experiments. Nature 2016, 533, 73−76. (8) Kuz’min, V. E.; Artemenko, A. G.; Muratov, E. N.; Ognichenko, L. N.; Hromov, A. I.; Liahovskij, A. V.; Polischuk, P. G. The Hierarchical QSAR Technology for Effective Virtual Screening and Molecular Design of Potential Antiviral Agents. Eleventh Congress of the Bulgarian Microbiologists with International Participation, St. Constantine, Varna, October 5−7, 2006; p 65. (9) Novotarskyi, S.; Sushko, I.; Körner, R.; Pandey, A. K.; Tetko, I. V. A comparison of different QSAR approaches to modeling CYP450 1A2 inhibition. J. Chem. Inf. Model. 2011, 51, 1271−1280. (10) Zhu, H.; Tropsha, A.; Fourches, D.; Varnek, A.; Papa, E.; Gramatica, P.; Ö berg, T.; Dao, P.; Cherkasov, A.; Tetko, I. V. Combinatorial QSAR Modeling of Chemical Toxicants Tested against Tetrahymena pyriformis. J. Chem. Inf. Model. 2008, 48, 766−784. (11) Guha, R. On the interpretation and interpretability of quantitative structure−activity relationship models. J. Comput.-Aided Mol. Des. 2008, 22, 857−871. (12) Fujita, T.; Winkler, D. A. Understanding the Roles of the “Two QSARs. J. Chem. Inf. Model. 2016, 56, 269−274. (13) Johansson, U.; Sönströd, C.; Norinder, U.; Boström, H. Tradeoff between accuracy and interpretability for predictive in silico modeling. Future Med. Chem. 2011, 3, 647−663. (14) Tropsha, A.; Gramatica, P.; Gombar, V. K. The Importance of Being Earnest: Validation is the Absolute Essential for Successful Application and Interpretation of QSPR Models. QSAR Comb. Sci. 2003, 22, 69−77. (15) Tropsha, A. Best Practices for QSAR Model Development, Validation, and Exploitation. Mol. Inf. 2010, 29, 476−488. (16) Dearden, J. C.; Cronin, M. T. D.; Kaiser, K. L. E. How not to develop a quantitative structure−activity or structure−property relationship (QSAR/QSPR). SAR QSAR Environ. Res. 2009, 20, 241−266. (17) Polishchuk, P. G.; Madzhidov, T. I.; Varnek, A. Estimation of the size of drug-like chemical space based on GDB-17 data. J. Comput.Aided Mol. Des. 2013, 27, 675−679. (18) Sushko, I.; Novotarskyi, S.; Körner, R.; Pandey, A. K.; Cherkasov, A.; Li, J.; Gramatica, P.; Hansen, K.; Schroeter, T.; Müller, K.-R.; et al. Applicability Domains for Classification Problems: S

DOI: 10.1021/acs.jcim.7b00274 J. Chem. Inf. Model. XXXX, XXX, XXX−XXX

Perspective

Journal of Chemical Information and Modeling A Detailed Interpretation of the Weights and Biases. J. Chem. Inf. Model. 2005, 45, 1109−1121. (42) Stanton, D. T.; Mattioni, B. E.; Knittel, J. J.; Jurs, P. C. Development and Use of Hydrophobic Surface Area (HSA) Descriptors for Computer-Assisted Quantitative Structure−Activity and Structure−Property Relationship Studies. J. Chem. Inf. Comput. Sci. 2004, 44, 1010−1023. (43) Rankovic, Z. CNS Drug Design: Balancing Physicochemical Properties for Optimal Brain Exposure. J. Med. Chem. 2015, 58, 2584− 2608. (44) Deng, H. Interpreting Tree Ensembles with inTrees. 2014, arXiv:1408.5456. arXiv.org ePrint archive. http://arxiv.org/abs/1408. 5456 (accessed Aug 22, 2017). (45) Mashayekhi, M.; Gras, R. Rule Extraction from Random Forest: the RF+HC Methods. In Advances in Artificial Intelligence: 28th Canadian Conference on Artificial Intelligence, Canadian AI 2015, Halifax, Nova Scotia, Canada, June 2−5, 2015, Barbosa, D.; Milios, E., Eds.; Springer International Publishing: Cham, 2015; pp 223−237. (46) Liu, S.; Patel, R. Y.; Daga, P. R.; Liu, H.; Fu, G.; Doerksen, R.; Chen, Y.; Wilkins, D. Multi-class Joint Rule Extraction and Feature Selection for Biological Data. 2011 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), Nov. 12−15, 2011; pp 476− 481. (47) Kuz’min, V. E.; Polishchuk, P. G.; Artemenko, A. G.; Andronati, S. A. Interpretation of QSAR models based on Random Forest method. Mol. Inf. 2011, 30, 593−603. (48) Palczewska, A.; Palczewski, J.; Marchese Robinson, R.; Neagu, D. Interpreting Random Forest Classification Models Using a Feature Contribution Method. In Integration of Reusable Systems; BouabanaTebibel, T., Rubin, H. S., Eds.; Springer International Publishing: Cham, 2014; pp 193−218. (49) Huysmans, J.; Baesens, B.; Vanthienen, J. Using Rule Extraction to Improve the Comprehensibility of Predictive Models, 2006. http:// dx.doi.org/10.2139/ssrn.961358 (accessed Aug 22, 2017). (50) Fu, L. Rule learning by searching on adapted nets. In Proceedings of the ninth National conference on Artificial intelligence; AAAI Press: Anaheim, CA, 1991; Vol. 2, pp 590−595. (51) Thrun, S. B. Extracting Provably Correct Rules from Artificial Neural Networks; Technical report, University of Bonn, Germany, 1993. (52) Fu, L. Rule generation from neural networks. IEEE Transactions on Systems, Man, and Cybernetics 1994, 24, 1114−1124. (53) Jagielska, I.; Matthews, C.; Whitfort, T. An investigation into the application of neural networks, fuzzy logic, genetic algorithms, and rough sets to automated knowledge acquisition for classification problems. Neurocomputing 1999, 24, 37−54. (54) Andrews, R.; Diederich, J.; Tickle, A. B. Survey and critique of techniques for extracting rules from trained artificial neural networks. Knowledge-Based Systems 1995, 8, 373−389. (55) Enbutsu, I.; Baba, K.; Hara, N. Fuzzy rule extraction from a multilayered neural network. In Proceedings of International Joint Conference on Neural Networks, Seattle, USA, July 8−12, IEEE: Seattle, WA, 1991; pp 461−465. (56) Towell, G. G.; Shavlik, J. W. Extracting refined rules from knowledge-based neural networks. Machine Learning 1993, 13, 71− 101. (57) Andonie, R.; Fabry-Asztalos, L.; Collar, C. J.; Abdul-Wahid, S.; Salim, N. In Neuro-fuzzy Prediction of Biological Activity and Rule Extraction for HIV-1 Protease Inhibitors. 2005 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology; Nov. 14−15, 2005; pp 1−8. (58) Benfenati, E.; Mazzatorta, P.; Neagu, D.; Gini, G. Combining Classifiers of Pesticides Toxicity through a Neuro-fuzzy Approach. In Multiple Classifier Systems: Third International Workshop, MCS 2002, Cagliari, Italy, June 24−26; Roli, F., Kittler, J., Eds.; Springer: Berlin, Heidelberg, 2002; pp 293−303. (59) Hudson, B. D.; Whitley, D. C.; Browne, A.; Ford, M. G. Extraction of Comprehensible Logical Rules from Neural Networks.

Application of TREPAN in Bio and Chemoinformatics. Croat. Chem. Acta 2005, 78, 557−561. (60) Zilke, J. R.; Loza Mencía, E.; Janssen, F. DeepRED−Rule Extraction from Deep Neural Networks. In Discovery Science: 19th International Conference, DS 2016, Bari, Italy, October 19−21; Calders, T., Ceci, M., Malerba, D., Eds.; Springer International Publishing: Cham, 2016; pp 457−473. (61) Barakat, N.; Bradley, A. P. Rule extraction from support vector machines: A review. Neurocomputing 2010, 74, 178−190. (62) Martens, D.; Baesens, B.; Gestel, T. V. Decompositional Rule Extraction from Support Vector Machines by Active Learning. IEEE Transactions on Knowledge and Data Engineering 2009, 21, 178−191. (63) Wicker, J. G. P.; Cooper, R. I. Beyond Rotatable Bond Counts: Capturing 3D Conformational Flexibility in a Single Descriptor. J. Chem. Inf. Model. 2016, 56, 2347−2352. (64) So, S. S.; Richards, W. G. Application of neural networks: quantitative structure-activity relationships of the derivatives of 2,4diamino-5-(substituted-benzyl)pyrimidines as DHFR inhibitors. J. Med. Chem. 1992, 35, 3201−3207. (65) Lek, S.; Belaud, A.; Baran, P.; Dimopoulos, I.; Delacoste, M. Role of some environmental variables in trout abundance models using neural networks. Aquat. Living Resour. 1996, 9, 23−29. (66) Dias Selassie, C.; Li, R.-l.; Poe, M.; Hansch, C. Optimization of hydrophobic and hydrophilic substituent interactions of 2,4-diamino5-(substituted-benzyl)pyrimidines with dihydrofolate reductase. J. Med. Chem. 1991, 34, 46−54. (67) Iooss, B.; Saltelli, A. Introduction to Sensitivity Analysis. In Handbook of Uncertainty Quantification; Ghanem, R., Higdon, D., Owhadi, H., Eds.; Springer International Publishing: Cham, 2016; pp 1−20. (68) Györgyi, G. Inference of a rule by a neural network with thermal noise. Phys. Rev. Lett. 1990, 64, 2957−2960. (69) Díaz-Uriarte, R.; Alvarez de Andrés, S. Gene selection and classification of microarray data using random forest. BMC Bioinf. 2006, 7, 3. (70) Svetnik, V.; Liaw, A.; Tong, C.; Wang, T. Application of Breiman’s Random Forest to Modeling Structure-Activity Relationships of Pharmaceutical Molecules. In Multiple Classifier Systems: 5th International Workshop, MCS 2004, Cagliari, Italy, June 9−11, Roli, F., Kittler, J., Windeatt, T., Eds.; Springer: Berlin, Heidelberg, 2004; pp 334−343. (71) Teixeira, A. L.; Leal, J. P.; Falcao, A. O. Random forests for feature selection in QSPR Models - an application for predicting standard enthalpy of formation of hydrocarbons. J. Cheminf. 2013, 5, 9. (72) Guha, R.; Jurs, P. C. Development of Linear, Ensemble, and Nonlinear Models for the Prediction and Interpretation of the Biological Activity of a Set of PDGFR Inhibitors. J. Chem. Inf. Comput. Sci. 2004, 44, 2179−2189. (73) Guha, R.; Jurs, P. C. Interpreting Computational Neural Network QSAR Models: A Measure of Descriptor Importance. J. Chem. Inf. Model. 2005, 45, 800−806. (74) Polishchuk, P. G.; Muratov, E. N.; Artemenko, A. G.; Kolumbin, O. G.; Muratov, N. N.; Kuz’min, V. E. Application of Random Forest Approach to QSAR Prediction of Aquatic Toxicity. J. Chem. Inf. Model. 2009, 49, 2481−2488. (75) Strobl, C.; Boulesteix, A.-L.; Zeileis, A.; Hothorn, T. Bias in random forest variable importance measures: Illustrations, sources and a solution. BMC Bioinf. 2007, 8, 25. (76) Strobl, C.; Boulesteix, A.-L.; Kneib, T.; Augustin, T.; Zeileis, A. Conditional variable importance for random forests. BMC Bioinf. 2008, 9, 307. (77) Aoyama, T.; Ichikawa, H. Neural networks as nonlinear structure-activity relationship analyzers. Useful functions of the partial derivative method in multilayer neural networks. J. Chem. Inf. Model. 1992, 32, 492−500. (78) Baskin, I. I.; Ait, A. O.; Halberstam, N. M.; Palyulin, V. A.; Zefirov, N. S. An approach to the interpretation of backpropagation neural network models in QSAR studies. SAR QSAR Environ. Res. 2002, 13, 35−41. T

DOI: 10.1021/acs.jcim.7b00274 J. Chem. Inf. Model. XXXX, XXX, XXX−XXX

Perspective

Journal of Chemical Information and Modeling

(100) Kuz’min, V. E.; Artemenko, A. G.; Polischuk, P. G.; Muratov, E. N.; Hromov, A. I.; Liahovskiy, A. V.; Andronati, S. A.; Makan, S. Y. Hierarchic System of QSAR Models (1D-4D) on the Base of Simplex Representation of Molecular Structure. J. Mol. Model. 2005, 11, 457− 467. (101) Chilmonczyk, Z.; Szelejewska-Wozniakowska, A.; Cybulski, J.; Cybulski, M.; Koziol, A. E.; Gdaniec, M. Conformational Flexibility of Serotonin1A Receptor Ligands from Crystallographic Data. Updated Model of the Receptor Pharmacophore. Arch. Pharm. 1997, 330, 146− 160. (102) López-Rodríguez, M. L.; Morcillo, M. J.; Fernández, E.; Porras, E.; Murcia, M.; Sanz, A. M.; Orensanz, L. Synthesis and Structure− Activity Relationships of a New Model of Arylpiperazines. 3. 2-[ω-(4Arylpiperazin-1-yl)alkyl]perhydropyrrolo- [1,2-c]imidazoles and -perhydroimidazo[1,5-a]pyridines: Study of the Influence of the Terminal Amide Fragment on 5-HT1A Affinity/Selectivity. J. Med. Chem. 1997, 40, 2653−2656. (103) Gaillard, P.; Carrupt, P.-A.; Testa, B.; Schambel, P. Binding of Arylpiperazines, (Aryloxy)propanolamines, and Tetrahydropyridylindoles to the 5-HT1A Receptor: Contribution of the Molecular Lipophilicity Potential to Three-Dimensional Quantitative Structure− Affinity Relationship Models. J. Med. Chem. 1996, 39, 126−134. (104) Marchese Robinson, R. L.; Palczewska, A.; Palczewski, J.; Kidley, N. Comparison of the Predictive Performance and Interpretability of Random Forest and Linear Models on Benchmark Data Sets. J. Chem. Inf. Model. 2017, 57, 1773−1792. (105) Sheridan, R. P.; Hunt, P.; Culberson, J. C. Molecular Transformations as a Way of Finding and Exploiting Consistent Local QSAR. J. Chem. Inf. Model. 2006, 46, 180−192. (106) Leach, A. G.; Jones, H. D.; Cosgrove, D. A.; Kenny, P. W.; Ruston, L.; MacFaul, P.; Wood, J. M.; Colclough, N.; Law, B. Matched Molecular Pairs as a Guide in the Optimization of Pharmaceutical Properties; a Study of Aqueous Solubility, Plasma Protein Binding and Oral Exposure. J. Med. Chem. 2006, 49, 6672−6682. (107) Riniker, S.; Landrum, G. Similarity maps - a visualization strategy for molecular fingerprints and machine-learning methods. J. Cheminf. 2013, 5, 43. (108) Polishchuk, P. G.; Kuz’min, V. E.; Artemenko, A. G.; Muratov, E. N. Universal Approach for Structural Interpretation of QSAR/ QSPR Models. Mol. Inf. 2013, 32, 843−853. (109) Mokshyna, E.; Polishchuk, P.; Nedostup, V.; Kuz’min, V. QSPR-Modeling for the Second Virial Cross-Coefficients of Binary Organic Mixtures. International Journal of Quantitative StructureProperty Relationships (IJQSPR) 2016, 1, 72−84. (110) Chien, E. Y. T.; Liu, W.; Zhao, Q.; Katritch, V.; Won Han, G.; Hanson, M. A.; Shi, L.; Newman, A. H.; Javitch, J. A.; Cherezov, V.; et al. Structure of the Human Dopamine D3 Receptor in Complex with a D2/D3 Selective Antagonist. Science 2010, 330, 1091−1095. (111) Braga, R. C.; Alves, V. M.; Silva, M. F. B.; Muratov, E.; Fourches, D.; Lião, L. M.; Tropsha, A.; Andrade, C. H. Pred-hERG: A Novel web-Accessible Computational Tool for Predicting Cardiac Toxicity. Mol. Inf. 2015, 34, 698−701. (112) Pérez-Garrido, A.; Rivero-Buceta, V.; Cano, G.; Kumar, S.; Pérez-Sánchez, H.; Bautista, M. T. Latest QSAR study of adenosine A2B receptor affinity of xanthines and deazaxanthines. Mol. Diversity 2015, 19, 975−989. (113) Zhang, Y.-Y.; Liu, H.; Summerfield, S. G.; Luscombe, C. N.; Sahi, J. Integrating in Silico and in Vitro Approaches To Predict Drug Accessibility to the Central Nervous System. Mol. Pharmaceutics 2016, 13, 1540−1550. (114) Polishchuk, P.; Tinkov, O.; Khristova, T.; Ognichenko, L.; Kosinskaya, A.; Varnek, A.; Kuz’min, V. Structural and PhysicoChemical Interpretation (SPCI) of QSAR Models and Its Comparison with Matched Molecular Pair Analysis. J. Chem. Inf. Model. 2016, 56, 1455−1469. (115) Kuz’min, V. E.; Artemenko, A. G.; Muratov, E. N. Hierarchical QSAR technology based on the Simplex representation of molecular structure. J. Comput.-Aided Mol. Des. 2008, 22, 403−421.

(79) Marcou, G.; Horvath, D.; Solov’ev, V.; Arrault, A.; Vayer, P.; Varnek, A. Interpretability of SAR/QSAR Models of any Complexity by Atomic Contributions. Mol. Inf. 2012, 31, 639−642. (80) An, Y.; Sherman, W.; Dixon, S. L. Kernel-Based Partial Least Squares: Application to Fingerprint-Based QSAR with Model Visualization. J. Chem. Inf. Model. 2013, 53, 2312−2321. (81) Baehrens, D.; Schroeter, T.; Harmeling, S.; Kawanabe, M.; Hansen, K.; Muller, K.-R. How to Explain Individual Classification Decisions. J. Mach. Learn. Res. 2010, 11, 1803−1831. (82) Franke, L.; Byvatov, E.; Werz, O.; Steinhilber, D.; Schneider, P.; Schneider, G. Extraction and Visualization of Potential Pharmacophore Points Using Support Vector Machines: Application to Ligand-Based Virtual Screening for COX-2 Inhibitors. J. Med. Chem. 2005, 48, 6997−7004. (83) Carlsson, L.; Helgee, E. A.; Boyer, S. Interpretation of Nonlinear QSAR Models Applied to Ames Mutagenicity Data. J. Chem. Inf. Model. 2009, 49, 2551−2558. (84) Hasegawa, K.; Keiya, M.; Funatsu, K. Visualization of Molecular Selectivity and Structure Generation for Selective Dopamine Inhibitors. Mol. Inf. 2010, 29, 793−800. (85) Hasegawa, K.; Migita, K.; Funatsu, K. Atom Coloring for Chemical Interpretation and De Novo Design for Molecular Design. In Knowledge-Oriented Applications in Data Mining; Funatsu, K., Ed.; InTech, 2011; pp 167−182. (86) Chen, H.; Carlsson, L.; Eriksson, M.; Varkonyi, P.; Norinder, U.; Nilsson, I. Beyond the Scope of Free-Wilson Analysis: Building Interpretable QSAR Models with Machine Learning Algorithms. J. Chem. Inf. Model. 2013, 53, 1324−1336. (87) Stålring, J.; Almeida, P. R.; Carlsson, L.; Helgee Ahlberg, E.; Hasselgren, C.; Boyer, S. Localized Heuristic Inverse Quantitative Structure Activity Relationship with Bulk Descriptors Using Numerical Gradients. J. Chem. Inf. Model. 2013, 53, 2001−2017. (88) van Westen, G. J. P.; Wegner, J. K.; Geluykens, P.; Kwanten, L.; Vereycken, I.; Peeters, A.; Ijzerman, A. P.; van Vlijmen, H. W. T.; Bender, A. Which Compound to Select in Lead Optimization? Prospectively Validated Proteochemometric Models Guide Preclinical Development. PLoS One 2011, 6, e27518. (89) Cortes-Ciriano, I.; van Westen, G.; Lenselink, E.; Murrell, D.; Bender, A.; Malliavin, T. Proteochemometric modeling in a Bayesian framework. J. Cheminf. 2014, 6, 35. (90) Cortes-Ciriano, I.; Murrell, D.; van Westen, G.; Bender, A.; Malliavin, T. Prediction of the potency of mammalian cyclooxygenase inhibitors with ensemble proteochemometric modeling. J. Cheminf. 2015, 7, 1. (91) Cramer, R. D.; Patterson, D. E.; Bunce, J. D. Comparative molecular field analysis (CoMFA). 1. Effect of shape on binding of steroids to carrier proteins. J. Am. Chem. Soc. 1988, 110, 5959−5967. (92) Damale, M.; Harke, S.; Kalam Khan, F.; Shinde, D.; Sangshetti, J. Recent Advances in Multidimensional QSAR (4D-6D): A Critical Review. Mini-Rev. Med. Chem. 2014, 14, 35−55. (93) Kim, K. H.; Greco, G.; Novellino, E. A critical review of recent CoMFA applications. Perspect. Drug Discovery Des. 1998, 12, 257−315. (94) Akamatsu, M. Current State and Perspectives of 3D-QSAR. Curr. Top. Med. Chem. 2002, 2, 1381−1394. (95) Cramer, R. D. Topomer CoMFA: A Design Methodology for Rapid Lead Optimization. J. Med. Chem. 2003, 46, 374−388. (96) Goodford, P. The Basic Principles of GRID. In Molecular Interaction Fields; Cruciani, G., Ed.; Wiley-VCH Verlag GmbH & Co. KGaA: Weinheim, 2006; pp 1−25. (97) Klebe, G.; Abraham, U.; Mietzner, T. Molecular Similarity Indices in a Comparative Analysis (CoMSIA) of Drug Molecules to Correlate and Predict Their Biological Activity. J. Med. Chem. 1994, 37, 4130−4146. (98) Robinson, D. D.; Winn, P. J.; Lyne, P. D.; Richards, W. G. SelfOrganizing Molecular Field Analysis: A Tool for Structure−Activity Studies. J. Med. Chem. 1999, 42, 573−583. (99) Seel, M.; Turner, D. B.; Willett, P. Effect of Parameter Variations on the Effectiveness of HQSAR Analyses. Quant. Struct.-Act. Relat. 1999, 18, 245−252. U

DOI: 10.1021/acs.jcim.7b00274 J. Chem. Inf. Model. XXXX, XXX, XXX−XXX

Perspective

Journal of Chemical Information and Modeling (116) Krysko, A. A.; Kornylov, A. Y.; Polishchuk, P. G.; Samoylenko, G. V.; Krysko, O. L.; Kabanova, T. A.; Kravtsov, V. C.; Kabanov, V. M.; Wicher, B.; Andronati, S. A. Synthesis, biological evaluation and molecular docking studies of 2-piperazin-1-yl-quinazolines as platelet aggregation inhibitors and ligands of integrin αIIbβ3. Bioorg. Med. Chem. Lett. 2016, 26, 1839−1843. (117) Sushko, Y.; Novotarskyi, S.; Korner, R.; Vogt, J.; Abdelaziz, A.; Tetko, I. Prediction-driven matched molecular pairs to interpret QSARs and aid the molecular optimization process. J. Cheminf. 2014, 6, 48. (118) Tetko, I. V.; Varbanov, H. P.; Galanski, M.; Talmaciu, M.; Platts, J. A.; Ravera, M.; Gabano, E. Prediction of logP for Pt(II) and Pt(IV) complexes: Comparison of statistical and quantum-chemistry based approaches. J. Inorg. Biochem. 2016, 156, 1−13. (119) Tinkov, O. V.; Ognichenko, L. N.; Kuz’min, V. E.; Gorb, L. G.; Kosinskaya, A. P.; Muratov, N. N.; Muratov, E. N.; Hill, F. C.; Leszczynski, J. Computational assessment of environmental hazards of nitroaromatic compounds: influence of the type and position of aromatic ring substituents on toxicity. Struct. Chem. 2016, 27, 191− 198. (120) Webb, S. J.; Hanser, T.; Howlin, B.; Krause, P.; Vessey, J. D. Feature combination networks for the interpretation of statistical machine learning models: application to Ames mutagenicity. J. Cheminf. 2014, 6, 8. (121) Wawer, M.; Bajorath, J. Local Structural Changes, Global Data Views: Graphical Substructure−Activity Relationship Trailing. J. Med. Chem. 2011, 54, 2944−2951. (122) Maggiora, G. M.; Bajorath, J. Chemical space networks: a powerful new paradigm for the description of chemical space. J. Comput.-Aided Mol. Des. 2014, 28, 795−802.

V

DOI: 10.1021/acs.jcim.7b00274 J. Chem. Inf. Model. XXXX, XXX, XXX−XXX