Editorial pubs.acs.org/jcim
Cite This: J. Chem. Inf. Model. 2019, 59, 945−946
Machine Learning in Drug Discovery
Machine learning is a workhorse of modern drug discovery and has been ever since the early days of QSAR. Despite this long tradition, machine learning methods have recently gained substantial momentum, triggered by the success of deep learning in many application areas.1 A wide range of tasks in modeling and cheminformatics have been influenced by machine learning, such as chemical synthesis planning2 and library design,3 bioactivity and toxicity prediction,4 and virtual screening.5,6 Machine learning methods are no longer restricted to traditional data types, such as compounds and protein sequences, but also extend to protein structures, imaging, textual, and transcriptomics data. We are pleased to see that the works included in this special issue reflect this variety of tasks, data types, and methods.

From the methodological point of view, the most notable effect is the tendency toward deep learning and deep neural networks. In a large majority of papers, a deep neural network is used as the central machine learning method.7,8 Besides feedforward neural networks,9,10 deeper, more specialized architectures consisting of many stacked modules are also used.11 Overall, a diverse range of deep architectures has been employed, directly adopting novel advances in deep learning to highly relevant problems in drug design.

Judging from those papers that focus on bioactivity or toxicity prediction, deep multitask networks appear to be established as a standard technique for large data sets.8−10,12 The benefits of the multitask effect were described by Wenzel et al.10 for their ADME models as well as for toxicity models by Sosnin et al.12

In the context of neural networks, two drawbacks are frequently noted: (a) the difficulty of interpreting their predictions and (b) the lack of posterior estimates or confidence intervals on the predictions. To this end, Wenzel et al.10 developed “response maps”, a visualization tool that is able to mark substructures in the molecule that are indicative of the prediction. In the DeepConfidence framework,13 an ensemble of neural networks is used to provide confidence intervals for predictions of neural networks. This indicates that the community is aware of these drawbacks of deep learning methods in drug design, i.e., interpretability and confidence, and has started to address them with appropriate techniques.

The lack of really large, reliable training and test data remains an open issue, especially in the context of structure-based design.14 To estimate the predictivity of machine learning methods realistically, novel evaluation schemes need to be developed.

Besides typical cheminformatics tasks such as bioactivity and toxicity prediction, many other informative, novel, and relevant challenges have been undertaken. Cai et al.15 aim at predicting the site of metabolism for UGT-catalyzed reactions using tree-form machine learning algorithms. Association of molecules with pathways can be addressed with self-normalizing neural networks.7 Membrane permeation of drug molecules can also be considered and tackled by traditional machine learning methods.16 Litsa et al.17 demonstrate how decision trees aid in mapping atoms from reactants to products of a chemical reaction. Many of these tasks appear to be extremely challenging, and improvements can be achieved with machine learning techniques.

Machine learning, especially deep learning, has become an even more important and fruitful basis for computational methods in drug discovery, as reflected by the recent successes in this field. Looking back at the issues of this journal over the past 12 months, when we started to think about this special issue, the growing role of machine learning cannot be overlooked. The editors hope that this issue brings these two fields closer together, encourages collaborations between machine learning groups and computational and medicinal chemists, enables newcomers to get an introduction, and provides an overview of the current research status for experts. We envision that employing machine learning in drug discovery will eventually lead to new, more efficient, and safer drugs.

© 2019 American Chemical Society
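For newcomers, the idea of deriving confidence intervals from an ensemble, mentioned above in connection with the DeepConfidence framework, can be illustrated with a minimal sketch. This is not the DeepConfidence method itself: as a simplifying assumption, tiny least-squares lines fitted on bootstrap resamples stand in for the neural networks, the interval is a plain mean ± z·standard deviation over the ensemble members, and all function names and the toy data are hypothetical.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a*x + b; returns (a, b)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx if sxx > 0 else 0.0  # degenerate resample: flat line
    return a, my - a * mx


def fit_ensemble(xs, ys, n_members=50, seed=0):
    """Fit one model per bootstrap resample (stand-in for ensemble members)."""
    rng = random.Random(seed)
    n = len(xs)
    members = []
    for _ in range(n_members):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        members.append(fit_line([xs[i] for i in idx], [ys[i] for i in idx]))
    return members


def predict_with_interval(members, x, z=1.96):
    """Return (mean, lower, upper) from the spread of member predictions."""
    preds = [a * x + b for a, b in members]
    mean = statistics.fmean(preds)
    spread = statistics.stdev(preds)
    return mean, mean - z * spread, mean + z * spread


if __name__ == "__main__":
    # Hypothetical toy data: one descriptor x and a noisy activity value y.
    noise = random.Random(1)
    xs = [float(i) for i in range(10)]
    ys = [2.0 * x + 1.0 + noise.gauss(0.0, 0.5) for x in xs]
    members = fit_ensemble(xs, ys)
    for query in (5.0, 100.0):
        print(query, predict_with_interval(members, query))
```

In this sketch the interval widens for queries far from the training data, because the ensemble members disagree more there; that disagreement is the signal an ensemble-based confidence estimate relies on.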
Günter Klambauer† Sepp Hochreiter*,† Matthias Rarey*,‡
† Johannes Kepler University, LIT AI Lab & Institute for Machine Learning, 4040 Linz, Austria
‡ Universität Hamburg, ZBH−Center for Bioinformatics, 20146 Hamburg, Germany
■ AUTHOR INFORMATION
Corresponding Authors
*E-mail: [email protected].
*E-mail: [email protected].
ORCID
Günter Klambauer: 0000-0003-2861-5552
Matthias Rarey: 0000-0002-9553-6531
Notes
Views expressed in this editorial are those of the authors and not necessarily the views of the ACS.
■ REFERENCES
(1) Sanchez-Lengeling, B.; Aspuru-Guzik, A. Inverse molecular design using machine learning: Generative models for matter engineering. Science 2018, 361 (6400), 360−365.
(2) Segler, M. H.; Waller, M. P. Modelling chemical reasoning to predict and invent reactions. Chem. - Eur. J. 2017, 23, 6118−6128.
(3) Segler, M. H.; Kogej, T.; Tyrchan, C.; Waller, M. P. Generating focused molecule libraries for drug discovery with recurrent neural networks. ACS Cent. Sci. 2018, 4 (1), 120−131.
(4) Mayr, A.; Klambauer, G.; Unterthiner, T.; Steijaert, M.; Wegner, J. K.; Ceulemans, H.; Hochreiter, S.; Clevert, D.-A. Large-scale comparison of machine learning methods for drug target prediction on ChEMBL. Chemical Science 2018, 9, 5441.
(5) Unterthiner, T.; Mayr, A.; Klambauer, G.; Steijaert, M.; Wegner, J. K.; Ceulemans, H.; Hochreiter, S. Deep learning as an opportunity in virtual screening. In Proceedings of the Deep Learning Workshop at NIPS; December 2014; Vol. 27, pp 1−9.
(6) Pereira, J. C.; Caffarena, E. R.; dos Santos, C. N. Boosting docking-based virtual screening with deep learning. J. Chem. Inf. Model. 2016, 56 (12), 2495−2506.
(7) Jiménez, J.; Sabbadin, D.; Cuzzolin, A.; Martínez-Rosell, G.; Gora, J.; Manchester, J.; De Fabritiis, G.; Duca, J. PathwayMap: Molecular pathway association with self-normalizing neural networks. J. Chem. Inf. Model. 2019, DOI: 10.1021/acs.jcim.8b00711.
(8) Sturm, N.; Sun, J.; Vandriessche, Y.; Mayr, A.; Klambauer, G.; Carlsson, L.; Chen, H.; Engkvist, O. Application of Bioactivity Profile-Based Fingerprints for Building Machine Learning Models. J. Chem. Inf. Model. 2018, DOI: 10.1021/acs.jcim.8b00550.
(9) Zhou, Y.; Cahya, S.; Combs, S. A.; Nicolaou, C. A.; Wang, J.; Desai, P. V.; Shen, J. Exploring Tunable Hyperparameters for Deep Neural Network with Industrial ADME Data Sets. J. Chem. Inf. Model. 2019, DOI: 10.1021/acs.jcim.8b00671.
(10) Wenzel, J.; Matter, H.; Schmidt, F. Predictive Multitask Deep Neural Network Models for ADME-Tox Properties: Learning from Large Datasets. J. Chem. Inf. Model. 2019, DOI: 10.1021/acs.jcim.8b00785.
(11) Pogany, P.; Arad, N.; Genway, S.; Pickett, S. D. De novo Molecule Design by Translating from Reduced Graphs to SMILES. J. Chem. Inf. Model. 2018, DOI: 10.1021/acs.jcim.8b00626.
(12) Sosnin, S.; Karlov, D.; Tetko, I. V.; Fedorov, M. V. A Comparative Study of Multitask Toxicity Modeling on a Broad Chemical Space. J. Chem. Inf. Model. 2019, DOI: 10.1021/acs.jcim.8b00685.
(13) Cortés-Ciriano, I.; Bender, A. Deep Confidence: A Computationally Efficient Framework for Calculating Reliable Prediction Errors for Deep Neural Networks. J. Chem. Inf. Model. 2018, DOI: 10.1021/acs.jcim.8b00542.
(14) Sieg, J.; Flachsenberg, F.; Rarey, M. In The Need of Bias Control: Evaluating Chemical Data for Machine Learning in Structure-Based Virtual Screening. J. Chem. Inf. Model. 2019, DOI: 10.1021/acs.jcim.8b00712.
(15) Cai, Y.; Yang, H.; Li, W.; Liu, G.; Lee, P. W.; Tang, Y. Computational Prediction of Site of Metabolism for UGT-catalyzed Reactions. J. Chem. Inf. Model. 2018, 58, 1169.
(16) Brocke, S.; Degen, A.; MacKerell, A. D.; Dutagaci, B.; Feig, M. Prediction of Membrane Permeation of Drug Molecules by Combining an Implicit Membrane Model with Machine Learning. J. Chem. Inf. Model. 2018, DOI: 10.1021/acs.jcim.8b00648.
(17) Litsa, E. E.; Pena, M. I.; Moll, M.; Giannakopoulos, G.; Bennett, G. N.; Kavraki, L. E. Machine Learning Guided Atom Mapping of Metabolic Reactions. J. Chem. Inf. Model. 2018, DOI: 10.1021/acs.jcim.8b00434.

Special Issue: Machine Learning in Drug Discovery
Published: March 25, 2019
DOI: 10.1021/acs.jcim.9b00136 J. Chem. Inf. Model. 2019, 59, 945−946