Letter pubs.acs.org/jcim

Kernel Target Alignment Parameter: A New Modelability Measure for Regression Tasks

Gilles Marcou,*,† Dragos Horvath,† and Alexandre Varnek†,‡

† Laboratory of Chemoinformatics, University of Strasbourg, 1 rue Blaise Pascal, 67000 Strasbourg, France
‡ Laboratory of Chemoinformatics, Federal University of Kazan, Kremlevskaya str. 18, 420008 Kazan, Russia



S Supporting Information *

© 2015 American Chemical Society

ABSTRACT: In this paper, we demonstrate that the kernel target alignment (KTA) parameter can be used efficiently to estimate the relevance of molecular descriptors for QSAR modeling on a given data set, i.e., as a modelability measure. The ability of KTA to assess modelability was demonstrated in two series of QSAR modeling studies, either varying the descriptor space for the same data set or comparing various data sets within the same descriptor space. The data sets considered included 25 series of GPCR binders with ChEMBL-reported pKi values and a toxicity data set. The descriptor spaces covered more than 100 different ISIDA fragment descriptor types and ChemAxon BCUT terms. Model performances (RMSE) were seen to anticorrelate consistently with the KTA parameter. Two other modelability measures were employed for benchmarking purposes: the Jaccard distance averaged over the data set (Div) and a measure related to the normalized mean absolute error (MAE) obtained in 1-nearest neighbor calculations on the training set (Sim = 1 − MAE). Both Div and Sim were found to perform similarly to KTA. However, a consensus index combining KTA, Div, and Sim provides a more robust correlation with RMSE than any of the individual modelability measures.

1. INTRODUCTION

The question of data modelability, i.e., an a priori estimate of the feasibility of obtaining predictive models for a given data set using a given set of molecular descriptors, is a recurring concern that can be traced back to the seminal work of Patterson et al. on neighborhood behavior.1 Its solution would allow one to avoid time-consuming trials in which the selection of the “best” set of descriptors results from numerous model-building studies. Recently, modelability indices (MODI) estimating the feasibility of classification2 and regression3 have been reported. For the classification models, the nearest neighbor of each compound in the descriptor space is considered, and the ratio of the number of pairs belonging to the same class to the total number of pairs is estimated.2 For the regression models, statistical parameters (Q² or R²) of different versions of kNN models were suggested as modelability indices.3

Conceptually, the modelability of regression models could be derived from the similarity principle, stating that similar compounds possess similar properties.4 This means that, for properly selected molecular descriptors, similarity measures for a set of molecules in descriptor space (DS) and in activity space (AS) should correlate. In this case, one can expect a given DS to yield a reasonable QSAR model for the data set. As a measure of overall similarity between DS and AS, we propose to use the kernel target alignment (KTA),5,6 a normalized form of the Hilbert−Schmidt independence criterion (HSIC).7 The latter is an empirical estimate of the Hilbert−Schmidt norm of the cross-covariance operator that measures a correlation between two kernels.7 When two different DS are compared, the one with the larger KTA value is expected to yield the more predictive QSAR model; therefore, KTA could be considered as a modelability measure. This hypothesis has been checked on 26 QSAR data sets (ligands of dopamine, adrenergic, and serotonin receptors, and Tetrahymena pyriformis toxicity).

For comparison purposes, two other parameters characterizing the chemical space of a given training set have been assessed. First, a diversity index (Div) has been calculated from the average value of the Tanimoto coefficient over all pairs of compounds. Second, the absolute normalized activity difference between each compound and its nearest neighbors, averaged over the data set (Sim), was also computed. According to the nomenclature set in ref 2, the Div measure corresponds to the MODI_TC parameter, whereas Sim corresponds to the MODI_ACI parameter.

2. METHODS

In this section, we describe the calculation of the modelability measures developed: kernel target alignment (KTA) and the similarity-based parameters Sim and Div.

2.1. Kernel Target Alignment. Kernel target alignment (KTA) calculations involve two N × N kernel matrices, KX and KY, which should first be centered (giving K̅X and K̅Y) according to the procedure suggested by Schölkopf et al.:8

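$$\bar{K} = K - \frac{1}{N}\mathbf{1}\mathbf{1}^{T}K - \frac{1}{N}K\mathbf{1}\mathbf{1}^{T} + \frac{1}{N^{2}}\mathbf{1}\mathbf{1}^{T}K\mathbf{1}\mathbf{1}^{T} \tag{1}$$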
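As an illustration, the centering of eq 1 and the HSIC and KTA definitions of eqs 2 and 3 can be sketched in a few lines of NumPy. This is a minimal sketch under stated assumptions, not the authors' implementation (which is described in the Supporting Information). The function names and the toy data are hypothetical. The RBF form exp(−(yi − yj)²/(2σ²)) with σ² set to the property variance is one possible reading of "property variance as the kernel parameter". KTA is normalized here by the Frobenius norms of the centered kernels, as in the usual centered-alignment definition (ref 5).

```python
import numpy as np

def center_kernel(K):
    """Center a kernel matrix as in eq 1."""
    N = K.shape[0]
    J = np.full((N, N), 1.0 / N)              # (1/N) * 11^T
    return K - J @ K - K @ J + J @ K @ J

def tanimoto_kernel(X):
    """Tanimoto similarity between the rows of a non-negative descriptor matrix."""
    X = np.asarray(X, dtype=float)
    G = X @ X.T                                # x_i . x_j
    sq = np.diag(G)                            # x_i . x_i
    denom = sq[:, None] + sq[None, :] - G
    # Assumption: pairs of all-zero vectors (denom == 0) get similarity 1.
    return np.divide(G, denom, out=np.ones_like(G), where=denom > 0)

def rbf_kernel(y, sigma2):
    """RBF kernel on a property vector; sigma2 is the assumed kernel parameter."""
    y = np.asarray(y, dtype=float)
    d2 = (y[:, None] - y[None, :]) ** 2
    return np.exp(-d2 / (2.0 * sigma2))

def hsic(Kx, Ky):
    """Empirical HSIC of eq 2."""
    N = Kx.shape[0]
    Kxc, Kyc = center_kernel(Kx), center_kernel(Ky)
    # Tr(AB) equals sum(A * B) for symmetric B, which costs O(N^2).
    return np.sum(Kxc * Kyc) / (N - 1) ** 2

def kta(Kx, Ky):
    """Kernel target alignment of eq 3 (Frobenius-normalized)."""
    Kxc, Kyc = center_kernel(Kx), center_kernel(Ky)
    num = np.sum(Kxc * Kyc)
    den = np.sqrt(np.sum(Kxc * Kxc) * np.sum(Kyc * Kyc))
    return num / den

# Toy data: 30 "compounds" described by 12 fragment counts (hypothetical).
rng = np.random.default_rng(0)
X = rng.integers(0, 4, size=(30, 12))
y = X[:, 0] + 0.5 * X[:, 1] + rng.normal(0.0, 0.1, size=30)
Kx = tanimoto_kernel(X)
Ky = rbf_kernel(y, sigma2=np.var(y))
print("HSIC:", hsic(Kx, Ky), "KTA:", kta(Kx, Ky))
```

Computing the traces as element-wise sums avoids forming the full matrix product, which keeps the cost of the trace step quadratic in N once the centered kernels are available.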

Published: December 16, 2015

DOI: 10.1021/acs.jcim.5b00539 J. Chem. Inf. Model. 2016, 56, 6−11


Journal of Chemical Information and Modeling

Table 1. Name of Receptors, Number of Activity Measurements, and Spearman’s Rank Correlation between the Cross-Validated RMSE and the KTA, Div, Sim, and Consensus Score for Individual Receptors (ST-QSAR) or for Receptor Families (MT-QSAR)a

| receptor family | receptor's name | number of compounds | ST-QSAR: RMSE vs KTA | ST-QSAR: RMSE vs Div | ST-QSAR: RMSE vs Sim | ST-QSAR: RMSE vs consensus | Spearman’s correlation coefficient for MT-QSAR |
|---|---|---|---|---|---|---|---|
| dopamine receptors | D1 | 272 | −0.60 | −0.30 | −0.95 | −0.70 | −0.67 |
| | D2 | 1325 | −0.91 | −0.63 | −0.91 | −0.89 | |
| | D3 | 846 | −0.50 | −0.53 | −0.93 | −0.91 | |
| | D4 | 424 | −0.86 | −0.81 | −0.87 | −0.88 | |
| | D5 | 98 | −0.77 | −0.58 | −0.41 | −0.69 | |
| adrenergic receptors | α1A | 222 | −0.88 | −0.63 | −0.76 | −0.88 | −0.75 |
| | α1B | 141 | −0.72 | −0.65 | −0.66 | −0.77 | |
| | α1D | 130 | −0.51 | −0.58 | −0.76 | −0.72 | |
| | α2A | 158 | −0.47 | −0.38 | −0.41 | −0.53 | |
| | α2B | 94 | −0.62 | −0.54 | −0.41 | −0.59 | |
| | α2C | 119 | −0.84 | −0.53 | −0.46 | −0.69 | |
| | β1 | 168 | −0.55 | −0.64 | −0.25 | −0.66 | |
| | β2 | 221 | −0.76 | −0.89 | −0.79 | −0.88 | |
| | β3 | 141 | −0.24 | −0.38 | −0.51 | −0.53 | |
| serotonin receptors | 5-HT1A | 884 | −0.82 | −0.58 | −0.84 | −0.79 | −0.63 |
| | 5-HT1B | 138 | −0.73 | −0.69 | −0.54 | −0.80 | |
| | 5-HT1D | 139 | −0.58 | −0.59 | −0.30 | −0.52 | |
| | 5-HT2A | 654 | −0.43 | −0.69 | −0.87 | −0.80 | |
| | 5-HT2B | 256 | −0.65 | −0.41 | −0.87 | −0.66 | |
| | 5-HT2C | 504 | −0.77 | −0.57 | −0.62 | −0.72 | |
| | 5-HT3A | 95 | −0.43 | −0.57 | −0.48 | −0.60 | |
| | 5-HT4 | 62 | −0.00 | 0.03 | −0.02 | −0.01 | |
| | 5-HT5A | 79 | −0.74 | −0.68 | −0.37 | −0.80 | |
| | 5-HT6 | 859 | −0.51 | −0.67 | −0.64 | −0.72 | |
| | 5-HT7 | 275 | −0.68 | −0.55 | −0.64 | −0.50 | |
| | THP Tox | 1093 | −0.20 | −0.88 | −0.57 | −0.70 | |

a THP Tox stands for the Tetrahymena pyriformis toxicity data.

Here, the centered kernel matrix K̅ is expressed as a function of the kernel matrix K and the matrix 11ᵀ (a square N × N matrix containing only ones). Both HSIC and KTA are then calculated from the centered kernel matrices K̅X and K̅Y according to eqs 2 and 3, respectively:

$$\mathrm{HSIC} = \frac{1}{(N-1)^{2}}\,\mathrm{Tr}(\bar{K}_{X}\bar{K}_{Y}) \tag{2}$$

$$\mathrm{KTA} = \frac{\mathrm{Tr}(\bar{K}_{X}\bar{K}_{Y})}{\sqrt{\mathrm{Tr}(\bar{K}_{X}\bar{K}_{X})\,\mathrm{Tr}(\bar{K}_{Y}\bar{K}_{Y})}} \tag{3}$$

where Tr(A) is the trace of the matrix A. In our case, the matrices K̅X and K̅Y correspond to the descriptor and property (or activity) spaces, respectively. It should be noted that the dependence between two sets of variables could also be expressed in terms of kernel canonical correlation analysis (KCCA),9−11 which is an extension of canonical correlation analysis (CCA).12 However, both KTA and HSIC are significantly easier to compute than KCCA. Details of the algorithm for KTA calculations are given in the Supporting Information. The assessment of KTA is straightforward and computationally much less demanding than the preparation of a model. The most computationally intensive step is the calculation of Tr(K̅XK̅Y), involving O(N²) multiplications for a data set composed of N instances. This is of the same complexity as the nearest neighbor searches required, for instance, by the various MODI measures reported in ref 3.

In this work, for single-task QSAR applications we used a Tanimoto kernel to characterize the descriptor space. Each element KX(i, j) of the kernel matrix is equal to the Tanimoto similarity index between compounds i and j calculated for a given set of descriptors. An RBF kernel was used for the property space, with the property variance as the kernel parameter. In multitask QSAR calculations, all activities grouped per receptor family were modeled at once. In this case, the Tanimoto kernel was again used for the descriptor space. For the property space, we were not able to use an RBF kernel because of missing values. Therefore, following the recommendation of Kung et al.,13 we used a cosine kernel in which a zero value was assigned to all missing elements.

2.2. Similarity-Based Parameters. Training set dissimilarity (Div) is estimated by an average Jaccard distance, derived from the averaged Tanimoto coefficient ⟨Tc⟩ over the training set:

$$\mathrm{Div} = 1 - \langle T_{c}\rangle = 1 - \frac{2}{N(N-1)}\sum_{\substack{(x_i,\,x_j)\in S_X\\ x_i \neq x_j}} \frac{x_i x_j}{x_i^{2} + x_j^{2} - x_i x_j} \tag{4}$$
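Eq 4 can be illustrated with a short NumPy sketch. It assumes that the descriptors are non-negative count vectors stored as the rows of a matrix, that the sum runs over the N(N − 1)/2 unordered pairs of distinct compounds, and that no two compounds are both described by all-zero vectors. The function name and the toy matrix are hypothetical.

```python
import numpy as np

def div_index(X):
    """Average Jaccard distance (1 - mean Tanimoto) over unordered pairs, as in eq 4."""
    X = np.asarray(X, dtype=float)
    G = X @ X.T                                # x_i . x_j for all pairs
    sq = np.diag(G)                            # x_i . x_i
    denom = sq[:, None] + sq[None, :] - G      # x_i^2 + x_j^2 - x_i . x_j
    iu = np.triu_indices(len(X), k=1)          # index pairs with i < j
    tc = G[iu] / denom[iu]                     # Tanimoto coefficient of each pair
    return 1.0 - tc.mean()

# Two identical compounds and a third one sharing no fragment with them.
X = np.array([[1, 0, 2],
              [1, 0, 2],
              [0, 3, 0]])
print(div_index(X))
```

In this toy case one pair has a Tanimoto coefficient of 1 and the other two pairs have 0, so ⟨Tc⟩ = 1/3 and Div = 2/3.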


Figure 1. Modeling workflow used in this work. An individual QSAR model is built using selected machine-learning method (RR for single-task and LASSO for multitask models) on the training set of each fold in both internal (3-CV) and external (3-CV) cross-validation procedures followed by its validation on the test set of this fold.

Here, xi is the descriptor vector of compound i belonging to the training set SX of size N, and ⟨Tc⟩ is the Tanimoto coefficient averaged over the training set.

The second measure, Sim, considers for each compound i in the data set S its closest neighbor j. In general, the subset Si of Ni (potentially more than one) neighbors located at exactly the same distance from i is considered. Each property value yi is normalized between 0 and 1. For each compound i, the absolute difference between yi and the property of its closest neighbor(s) is computed and averaged over Si. Finally, Sim is calculated as an average over all N instances in S:

$$\mathrm{Sim} = 1 - \frac{1}{N}\sum_{i\in S}\frac{1}{N_i}\sum_{j\in S_i}\lvert y_i - y_j\rvert \tag{5}$$

The second term in eq 5 represents the normalized mean absolute error (MAE) obtained in leave-one-out calculations on the training set using the 1-nearest neighbor technique. It has also been reported in ref 3 as the MODI_ACI(1) parameter. We use Sim instead of MAE (MODI_ACI(1)) in order to be in line with the KTA and Div parameters, which anticorrelate with the performances (RMSE) of QSAR models.

2.3. Consensus Modelability. Because the modelability indices considered (Div, Sim, and KTA) are based on distinct, independent premises, an effort was made to combine them in the expectation of a synergistic effect. Such a consensus modelability index can be assessed by the rank product (RP) method.14 For a given subset, each descriptor space DSi receives the ranks rKTA, rDiv, and rSim corresponding to the KTA, Div, and Sim measures, respectively. The RP of a DS is then the product of its ranks:

$$\mathrm{RP}(DS_i) = r_{\mathrm{KTA}}(DS_i) \times r_{\mathrm{Div}}(DS_i) \times r_{\mathrm{Sim}}(DS_i) \tag{6}$$

3. COMPUTATIONAL PROCEDURE

3.1. Data Preparation. Three data sets used in this study were taken from ref 15. The first one is composed of affinity data (pKi) of 1597 compounds for the five (D1−D5) dopamine GPCRs (G protein-coupled receptors). The second data set includes compounds targeting the adrenergic receptors α1A, α1B, α1D, α2A, α2B, α2C, β1, β2, and β3. The third data set gathers binders of the serotonin receptors 5-HT1A, 5-HT1B, 5-HT1D, 5-HT2A, 5-HT2B, 5-HT2C, 5-HT3A, 5-HT4, 5-HT5A, 5-HT6, and 5-HT7. Overall, these data sets combine 2965, 1394, and 3945 affinity values for dopamine, adrenergic, and serotonin receptors, respectively. The data distribution is given in Table 1. We also considered the data set concerning the acute toxicity against Tetrahymena pyriformis (THP) of 1093 compounds,16 in order to diversify the scope of investigated biological properties.

3.2. Molecular Descriptors. Two types of molecular descriptors (ISIDA and BCUT) were used in this study. ISIDA fragment descriptors17 are counts of substructural molecular fragments derived from 2D chemical structures. Different types of fragmentation (sequences, atom-centered fragments, and triplets of atoms) can be coupled with various coloring schemes (atom symbol, pharmacophore type, electrostatic and lipophilic properties, etc.), resulting in an unprecedented versatility of supported fragmentation schemes, in principle tunable to support a vast range of QSPR models. BCUT descriptors18 encode atomic charges, polarizability, and hydrogen-bond donor−acceptor properties. They were generated with the ChemAxon tool generatemd 15.5.11.0 (2015) and subsequently normalized.

3.3. Building and Validation of Models. Two types of regression models, single-task and multitask, were built. Single-task models (ST-QSAR) were obtained with the ridge regression (RR) method.19 The ridge parameter was optimized within an inner 3-fold cross-validation procedure (3-CV), in which an optimal value of the ridge parameter was calculated at each fold using a golden section search (GSS) algorithm.20 Finally, the median of the three values obtained (one per fold) was calculated and used to build a model on the entire training set, followed by its validation on the external test set. The procedure was repeated following an external 3-CV procedure (see Figure 1). In multitask modeling (MT-QSAR), we used the l2,1 variant21−25 of the LASSO26 algorithm implemented in the MALSAR library27 for MATLAB.28 The models were validated in an external 3-CV procedure. As for the RR parametrization (see above), an optimal value of the LASSO regularization parameter was obtained using a GSS optimization algorithm within an internal 3-CV (Figure 1).

The ISIDA Fragmentor17 software was configured to generate at random from 50 to 100 different types of ISIDA descriptors of various topologies, sizes, and coloring schemes. In this way, 50−100 ST-QSAR models per individual receptor and 50 MT-QSAR models per receptor family (dopamine, adrenergic, and serotonin) were obtained. The predictive performance of each model was assessed by the root-mean-square error (RMSE) of the external 3-CV procedure.
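For readers who wish to reproduce the similarity-based and consensus indices of section 2, the Sim measure (eq 5) and the rank product (eq 6) can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' code. The nearest neighbors are searched with the Tanimoto similarity on non-zero count vectors, ties are detected with a small numerical tolerance, and rank 1 is given to the largest value of each index (ties between index values are broken by position). The text does not specify these choices, and all function names and toy values are hypothetical.

```python
import numpy as np

def tanimoto_matrix(X):
    """Pairwise Tanimoto similarity; assumes no all-zero descriptor vector."""
    X = np.asarray(X, dtype=float)
    G = X @ X.T
    sq = np.diag(G)
    return G / (sq[:, None] + sq[None, :] - G)

def sim_index(X, y, tol=1e-12):
    """Sim of eq 5: 1 minus the mean |y_i - y_j| over the nearest neighbors of each i."""
    y = np.asarray(y, dtype=float)
    y = (y - y.min()) / (y.max() - y.min())    # normalize the property to [0, 1]
    T = tanimoto_matrix(X)
    np.fill_diagonal(T, -np.inf)               # a compound is not its own neighbor
    errors = []
    for i in range(len(y)):
        # S_i: all neighbors tied at the highest similarity to compound i
        nearest = np.flatnonzero(T[i] >= T[i].max() - tol)
        errors.append(np.abs(y[i] - y[nearest]).mean())
    return 1.0 - np.mean(errors)

def rank_product(kta, div, sim):
    """Rank product of eq 6 for several descriptor spaces (rank 1 = largest value)."""
    ranks = [(-np.asarray(v, dtype=float)).argsort().argsort() + 1
             for v in (kta, div, sim)]
    return ranks[0] * ranks[1] * ranks[2]

# Toy data set: four compounds, three fragment counts (hypothetical).
X = [[1, 0, 2], [1, 0, 2], [0, 3, 0], [0, 3, 1]]
y = [5.0, 5.2, 7.0, 8.0]
print(sim_index(X, y))

# Toy index values for three descriptor spaces (hypothetical).
print(rank_product(kta=[0.9, 0.5, 0.7], div=[0.3, 0.2, 0.1], sim=[0.8, 0.9, 0.7]))
```

With the ranking convention assumed here, the smallest rank product points to the descriptor space with the best consensus modelability.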

4. RESULTS AND DISCUSSION

In single-task QSAR modeling, each type of ISIDA descriptors was used to build and validate the models, on the one hand, and to calculate the KTA, Div, and Sim values, on the other hand. The calculations returned, for each data set in each DS, ensembles of RMSE vs modelability index values. Reasonable correlations between RMSE and KTA were observed: for 20 out of 25 receptors, the Spearman correlation coefficient is 0.50 or larger in absolute value (Table 1, Figure 2, and Figures S1−S5 in the Supporting Information). The relatively small correlation coefficients could be explained by the fact that RMSE reaches its minimum at a certain KTA threshold and does not change much with a further increase of KTA (Figure 3, left), which is perfectly in line with the expected behavior of a modelability index. An imperfect linear correlation does not automatically contradict the usefulness of KTA as a modelability score.

Figure 2. Spearman’s correlation between cross-validated RMSE and KTA (green), Div (blue), Sim (red), and consensus score (black).

Figure 3. Cross-validated RMSE as a function of KTA for the dopamine D2 data set in ST-QSAR (left) and for the ensemble of 5 dopamine receptors in MT-QSAR (right). Each point corresponds to a different descriptor space. The solid line illustrates the general trend.

In MT-QSAR studies, the activities of all receptors within one family (dopamine, adrenergic, serotonin) were modeled simultaneously, using the same set of descriptors. For each family, a mean RMSE was calculated as the arithmetic average of the RMSE values per receptor. A clear anticorrelation between the mean RMSE and KTA was observed (Figure 3, right, and Table 1).

To test the methodology outside the realm of integer fragment count fingerprints, BCUT descriptors were also considered. Because these take both positive and negative real values, the Tanimoto kernel may not be positive definite and was replaced by an RBF kernel of unit width to assess the KTA parameters. Figure 4 reports the observed anticorrelation between KTA values and relative, cross-validated RMSE values obtained for different data sets in the BCUT space. Thus, the KTA formalism can address both questions: (i) “Given a data set, what are the best descriptors to model it?” and (ii) “Given a descriptor space, which data set is most likely to be successfully modeled therein?” We note that when the DS changes, the answer to the second question can change too.

Figure 4. Modeling performances as a function of KTA for each ST-QSAR model using BCUT descriptors. THP Tox stands for the Tetrahymena pyriformis toxicity data set.

The results given in Table 1 and Figures 3 and 4 demonstrate that the kernel target alignment parameter relating descriptor and property spaces anticorrelates with the cross-validated RMSE and therefore can be used efficiently as a modelability measure for regression tasks. In one case, the serotonin 5-HT4 data set, no correlation is observed. This might be related to the presence of activity cliffs in every descriptor space studied. For instance, the reported activities of the compounds CHEMBL1258452 and CHEMBL1258559 are very different (2.4 and 8664 nM,29 respectively), although their structures look very similar (Figure 5). The poor KTA performance can also be related to the small size of this data set (64 molecules only).

Figure 5. Example of an activity cliff in the serotonin 5-HT4 data set. CHEMBL1258452 (on the left) has an activity of 2.4 nM and CHEMBL1258559 (on the right) has an activity of 8664 nM.

The application of the Div and Sim parameters as modelability indices demonstrates that descriptor spaces that enhance the spread of training set compounds over vast space zones (in the sense of maximizing Div) implicitly allow for better modeling performances. Therefore, Div can also be used as a modelability criterion. Besides, we confirm that Sim (and the related MODI_ACI3 parameter) correlates with modeling performances. This is not surprising, because it relies on the relative MAE of a 1-NN model in a LOO cross-validation procedure. Thus, it is equivalent to estimating the modelability of a data set using a basic modeling strategy. Most importantly, as shown in Figure 2 and Table 1, the consensus of the three modelability indices nearly always (with the only exception of the 5-HT4 data set) correlates with RMSE, even if one of the individual indices KTA, Div, or Sim fails.

5. CONCLUSIONS

It has been demonstrated that the kernel target alignment (KTA) parameter correlates reasonably well with the RMSE of QSAR models and can thus be used as a modelability measure. The algorithm for its calculation is rather simple and not very time-consuming. Unlike previously reported MODI parameters,3 KTA is not the result of a modeling study per se but estimates the intrinsic similarity between descriptor and property (activity) spaces. It can be combined with other modelability indices such as Sim and Div, which leads to a more robust correlation with QSAR model performances. We believe that in practical applications, modelability indices are useful for selecting the most relevant descriptor spaces for QSAR modeling on a given data set. Because the chance of obtaining a well-performing QSAR model increases with the modelability score, the focus should be set on DS candidates within the top KTA (or other modelability index) range.



ASSOCIATED CONTENT

Supporting Information

The Supporting Information is available free of charge on the ACS Publications website at DOI: 10.1021/acs.jcim.5b00539. Plots of RMSE as a function of KTA for each target (in the case of STL) and for each target family (in the case of MTL); algorithm to compute KTA and an analysis of its computational complexity (PDF).



AUTHOR INFORMATION

Corresponding Author

*G. Marcou. E-mail: [email protected].

Notes

The authors declare no competing financial interest.



ACKNOWLEDGMENTS

Dr. Igor Baskin is acknowledged for fruitful discussion. A.V. thanks Russian Scientific Foundation (Agreement No. 14-4300024 from October 1, 2014) for support.



REFERENCES

(1) Patterson, D. E.; Cramer, R. D.; Ferguson, A. M.; Clark, R. D.; Weinberger, L. E. Neighborhood behavior: a useful concept for validation of “molecular diversity” descriptors. J. Med. Chem. 1996, 39, 3049−3059.
(2) Golbraikh, A.; Muratov, E.; Fourches, D.; Tropsha, A. Data Set Modelability by QSAR. J. Chem. Inf. Model. 2014, 54, 1−4.
(3) Golbraikh, A.; Fourches, D.; Sedykh, A.; Muratov, E.; Liepina, I.; Tropsha, A. Modelability Criteria: Statistical Characteristics Estimating Feasibility to Build Predictive QSAR Models for a Dataset. In Practical Aspects of Computational Chemistry III; Leszczynski, J.; Shukla, M. K., Eds.; Springer: New York, 2014; Chapter 7.
(4) Johnson, M. A.; Maggiora, G. M. Concepts and Applications of Molecular Similarity; Wiley: Hoboken, NJ, 1990.
(5) Cortes, C.; Mohri, M.; Rostamizadeh, A. Algorithms for Learning Kernels Based on Centered Alignment. J. Mach. Learn. Res. 2012, 13, 795−828.
(6) Chang, B.; Kruger, U.; Kustra, R.; Zhang, J. Canonical Correlation Analysis based on Hilbert-Schmidt Independence Criterion and Centered Kernel Target Alignment. In Proceedings of the 30th International Conference on Machine Learning (ICML-13), Atlanta, 2013; Dasgupta, S.; McAllester, D., Eds.; ACM: Atlanta, 2013; Vol. 28; pp 316−324.
(7) Gretton, A.; Bousquet, O.; Smola, A.; Schölkopf, B. Measuring statistical dependence with Hilbert-Schmidt norms. In Proceeding Algorithmic Learning Theory, Singapore, 2005; Springer-Verlag: Singapore, 2005; pp 63−77.
(8) Schölkopf, B.; Smola, A. Learning with Kernels; The MIT Press: Cambridge, London, 2002.
(9) Melzer, T.; Reiter, M.; Bischof, H. Nonlinear feature extraction using generalized canonical correlation analysis. In Artificial Neural Networks - ICANN 2001; Dorffner, G.; Horst, B.; Hornik, K., Eds.; Springer: Vienna, 2001; pp 353−360.
(10) Bach, F. R.; Jordan, M. I. Kernel independent component analysis. J. Mach. Learn. Res. 2003, 3, 1−48.
(11) Akaho, S. A Kernel Method For Canonical Correlation Analysis. In Proceedings of the International Meeting of the Psychometric Society (IMPS2001), Osaka, Japan, 2001; Yanai, H.; Okada, A.; Shigemasu, K.; Kano, Y.; Meulman, J., Eds.; Springer: Osaka, 2001.
(12) Hotelling, H. Relations between two sets of variates. Biometrika 1936, 28, 321−377.
(13) Kung, S.; Wu, P.-Y. In ICIMCS 2014: Proceedings of the Sixth International Conference on Internet Multimedia Computing and Service; ACM: Xiamen, China, 2014.
(14) Breitling, R.; Armengaud, P.; Amtmann, A.; Herzyk, P. Rank products: a simple, yet powerful, new method to detect differentially regulated genes in replicated microarray experiments. FEBS Lett. 2004, 573, 83−92.
(15) Brown, J.; Okuno, Y.; Marcou, G.; Varnek, A.; Horvath, D. Computational chemogenomics: Is it more than inductive transfer? J. Comput.-Aided Mol. Des. 2014, 28, 597−618.
(16) Zhu, H.; Tropsha, A.; Fourches, D.; Varnek, A.; Papa, E.; Gramatica, P.; Oberg, T.; Dao, P.; Cherkasov, A.; Tetko, I. V. Combinatorial QSAR modeling of chemical toxicants tested against Tetrahymena pyriformis. J. Chem. Inf. Model. 2008, 48, 766−784.
(17) Varnek, A.; Fourches, D.; Horvath, D.; Klimchuk, O.; Gaudin, C.; Vayer, P.; Solov'ev, V.; Hoonakker, F.; Tetko, I. V.; Marcou, G. ISIDA - Platform for Virtual Screening Based on Fragment and Pharmacophoric Descriptors. Curr. Comput.-Aided Drug Des. 2008, 4, 191−198.
(18) Pearlman, R. S.; Smith, K. Novel software tools for chemical diversity. In 3D QSAR in Drug Design; Kubinyi, H.; Folkers, G.; Martin, Y. C., Eds.; Springer: Berlin, 1998; pp 339−353.
(19) McDonald, G. C. Ridge regression. Wiley Interdiscip. Rev. Comput. Stat. 2009, 1, 93−100.
(20) Avriel, M.; Wilde, D. J. Optimality proof for the symmetric Fibonacci search technique. Fibonacci Quart. 1966, 4, 265−269.
(21) Argyriou, A.; Evgeniou, T.; Pontil, M. Convex multi-task feature learning. Mach. Learn. 2008, 73, 243−272.
(22) Argyriou, A.; Evgeniou, T.; Pontil, M. Multi-task feature learning. Adv. Neural Inf. Process. Syst. 2007, 19.
(23) Argyriou, A.; Micchelli, C.; Pontil, M.; Ying, Y. A spectral regularization framework for multi-task structure learning. Adv. Neural Inf. Process. Syst. 2008, 20, 25−32.
(24) Liu, J.; Ji, S.; Ye, J. Multi-task feature learning via efficient l2,1-norm minimization. In Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence, Montreal, 2009; Bilmes, J.; Ng, A., Eds.; AUAI Press: Corvallis, Oregon, 2009; pp 339−348.
(25) Nie, F.; Huang, H.; Cai, X.; Ding, C. H. Efficient and Robust Feature Selection via Joint l2,1-Norms Minimization. In Advances in Neural Information Processing Systems 23; Lafferty, J. D.; Williams, C. K. I.; Shawe-Taylor, J.; Zemel, R. S.; Culotta, A., Eds.; Curran Associates, Inc.: Vancouver, 2010; pp 1813−1821.
(26) Tibshirani, R. Regression shrinkage and selection via the lasso. J. R. Stat. Soc. Series B Stat. Methodol. 1996, 58, 267−288.
(27) Zhou, J.; Chen, J.; Ye, J. MALSAR: Multi-tAsk Learning via StructurAl Regularization, 1.1; Arizona State University: Phoenix, 2011.
(28) Baldi, P.; Brunak, S.; Chauvin, Y.; Andersen, C. A.; Nielsen, H. Assessing the accuracy of prediction algorithms for classification: an overview. Bioinformatics 2000, 16, 412−424.
(29) Xu, R.; Hong, J.; Morse, C. L.; Pike, V. W. Synthesis, structure−affinity relationships, and radiolabeling of selective high-affinity 5-HT4 receptor ligands as prospective imaging probes for positron emission tomography. J. Med. Chem. 2010, 53, 7035−7047.