Artificial Intelligence Approach To Investigate the Longevity Drug | The

2 days ago - Longevity is a very important and interesting topic, and Klotho has been demonstrated to be related to longevity. We combined network ...
0 downloads 0 Views 12MB Size
Letter Cite This: J. Phys. Chem. Lett. 2019, 10, 4947−4961

pubs.acs.org/JPCL

Artificial Intelligence Approach To Investigate the Longevity Drug Jun-Yan Li,†,# Hsin-Yi Chen,†,# Wen-jie Dai,‡,# Qiu-Jie Lv,†,# and Calvin Yu-Chian Chen*,†,§,∥ †

School of Intelligent Systems Engineering, Artificial Intelligence Medical Center, Sun Yat-sen University, Shenzhen 510275, China School of Pharmacy, Sun Yat-sen University, Shenzhen 510275, China § Department of Medical Research, China Medical University Hospital, Taichung 40447, Taiwan ∥ Department of Bioinformatics and Medical Engineering, Asia University, Taichung 41354, Taiwan ‡

Downloaded via NOTTINGHAM TRENT UNIV on August 15, 2019 at 02:59:26 (UTC). See https://pubs.acs.org/sharingguidelines for options on how to legitimately share published articles.

S Supporting Information *

ABSTRACT: Longevity is a very important and interesting topic, and Klotho has been demonstrated to be related to longevity. We combined network pharmacology, machine learning, deep learning, and molecular dynamics (MD) simulation to investigate potent lead drugs. Related protein insulin-like growth factor 1 receptor (IGF1R) and insulin receptor (IR) were docked with the traditional Chinese medicine (TCM) database to screen out several novel candidates. Besides, nine different machine learning algorithms were performed to build reliable and accurate predicted models. Moreover, we used the novel deep learning algorithm to build predicted models. All of these models obtained significant R2, which are all greater than 0.87 on the training set and higher than 0.88 for the test set, respectively. The long time 500 ns molecular dynamics simulation was also performed to verify protein−ligand properties and stability. Finally, we obtained Antifebrile Dichroa, Holarrhena antidysenterica, and Gelsemium sempervirens, which might be potent TCMs for two targets.

L

largest Traditional Chinese Medicine database is employed for this study for drug screening.21 So we could develop a TCM formula that furnished multiple target system effects. As artificial intelligence (AI) has developed, breakthroughs have been made in the medical field.22−24 In drug discovery, machine learning and deep neural networks have shown their huge and astonishing potential.25,26 A concept of end-to-end drug discovery and development exploiting machine learning was proposed by Ekins et al.25 Studies showed that the machine learning model can obtain the implicit chemical knowledge from the chemical reaction data, which can help to synthesize new chemical substances.27 Chang et al. proposed a novel deep learning model that predicts anticancer drug responsiveness.28 The novelty is that it is the application of a deep learning model in predicting the feasibility of drug repurposing. In the latest reviews, Chan et al. concluded that computational prediction of atomic and molecular properties could assist de novo design. And AI is also able to search for correlations between molecular representations and biological and toxicological activities.29 For drug discovery, it requires a lot of money and time to produce a new drug. So, how to reduce the cost and shorten the time are crucial points for pharmaceutical companies. It is reported that a new drug optimization method based on the nanopore force spectroscopy has been proposed. By all-atom molecular dynamics simulations, the binding strength between

ongevity has received the popular attention of researchers in all ages, and many related pathways have been researched.1 Since Klotho family proteins are relevant to longevity, they could help uncover the secret of longevity.2 Klotho is also associated with many diseases. Most of the studies show that the increase or overexpression of Klotho is conducive to longevity.3 Research shows that mutation of the mouse Klotho gene leads to the aging syndrome.4 This target could be a modulator of the insulin-like growth factor 1 (IGF1) pathway or Wnt/β-catenin pathway, which could increase antitumor properties.5,6 Klotho could also lead to nervous system diseases, such as depression.7 However, Klotho could protect the heart and kidney.1,8,9 β-Klotho was a coreceptor complex with FGFR1, serving as a cofactor for the FGF21 signal pathway.10 The receptor complex formed at the plasma membrane in response to FGF21 stimulation.11,12 Several ratio-based designs of FGF2113,14 were reported on the basis of the natural FGF21 structure to be designed as a supplement.15 An FGF21imitating antibody treated diabetes through the β-Klotho pathway.16 The 3D structure of the β-Klotho ternary complex was solved,17 and the structure of α-Klotho was provided.18 The network-pharmacology based method was one of the practical ways to find related proteins.19 The downstream signaling pathway of Klotho inhibited the insulin receptor substrate (IRS) and insulin-like growth factor 1 receptor (IGF1R), not directly binding to these receptors.20 The inhibitors of these two targets could develop as collaborative work, respectively. Compared to western medicine, Chinese medicine properties are more mild and safe. The world’s © XXXX American Chemical Society

Received: July 30, 2019 Accepted: August 12, 2019

4947

DOI: 10.1021/acs.jpclett.9b02220 J. Phys. Chem. Lett. 2019, 10, 4947−4961

Letter

The Journal of Physical Chemistry Letters a drug molecule and a target protein could be electrically detected.30,31 Ensemble learning is a method that integrates multiple machine learning algorithms by using some rules, so as to achieve better effects than a single machine learning algorithm.32 There are two kinds of ensemble learning; one is boosting, and another is bagging. The former’s characteristic is the dependencies between weak learners, while for the latter, there are no dependencies between weak learners; all weak learners could be parallel fitted. In this case, we used some ensemble learning models, such as random forest, gradient boost regression tree. This Letter focuses on these two proteins, which are closely related to longevity. A novel traditional Chinese medicine (TCM) formula was also proposed in this study since we are utilizing the world’s largest TCM database; hence, we can know which TCM herb these compounds come from. That is why we can propose which TCM formula may be a potent prescription for longevity. The docking results and nine artificial intelligence, machine learning, and deep learning models could help us to identify the lead compounds. The integration of traditional Chinese medicines and artificial intelligence could be a developed determination. The flowchart of this project is provided in Figure 1. Network-Pharmacology-Based Analysis. The Kyoto Encyclopedia of Genes and Genomes (KEGG) database provided the longevity pathway for reference research.33 The related targets around the Klotho protein in the upstream could be focused as well. The interaction relationship between different proteins not only could provide a potential mechanism toward longevity but also could make contributions to further drug discovery. The protein−protein interaction provided information on the Klotho protein to furnish longevity through other proteins. Two proteins, IGF-1R and IR proteins, were selected for further study. The parts of the longevity regulating pathway in this research from the KEGG pathway database are shown in Figure 2. Virtual Screening of IGF-1R and IR Protein. The sequence of IGF-1R was obtained from the Uniprot Knowledgebase (P08069), and the three-dimensional structure of the IGF1R protein was acquired from the Protein Data bank (PDB) database,34 5FXS (1.9 Å), with a prefect resolution, for which the inhibited site could be next to the ligand in the crystal structure.35 The sequence of the insulin receptor (IR) was acquired from the Uniprot Knowledgebase (P06213), and the 3D conformation was obtained from the PDB database (4IBM) with 1.8 Å resolution. The binding site was in the insulin receptor subunit β area.36 The TCM Database @ Taiwan, the largest TCM database in the world, was employed for virtual screening with the Dock (Ligandfit) protocol. The Chemistry at HARvard Macromolecular Mechanics (CHARMm) force field, which could fix the receptor−ligand complex, was utilized for parametrization. Artif icial Intelligence, Deep Learning, and Machine Learning Models. The compounds with inhibited activity information for the IGF-1R protein were searched in previous literature.35,37 We used the software named Discovery Studio 2017 to calculate molecule properties. All machine learning algorithms were applied on the data set, which was separated into a training set and test set by a random protocol. The 2D-QSAR models were built by the compounds with known activities using support vector machine (SVM), random forest (RF), gradient boost regression tree (GBRT), ridge regression, AdaBoost, elastic net, stochastic gradient descent

Figure 1. Flowchart of the experiment.

Figure 2. Parts of the longevity regulating pathway in this research from KEGG pathway database.

Figure 3. Flowchart of machine learning algorithms. We used nine kinds of machine learning algorithms and then evaluated their performance. Finally, we chose those algorithms that perform better in this case to predict the target value.

4948

DOI: 10.1021/acs.jpclett.9b02220 J. Phys. Chem. Lett. 2019, 10, 4947−4961

Letter

The Journal of Physical Chemistry Letters

Figure 4. Interesting example to illustrate the principle of support vector machine (SVM). Left: In this case, like the XOR problem, in the twodimensional surface, there is no line that could separate the blue triangle from the red square. Right: This is a special case where two-dimensional space is inseparably linear, but for SVM it maps the function to three-dimensional space by a kernel function; then we could find a line to separate them.

Scheme 2

Figure 5. Simplified chart of the random forest regression algorithm.

Scheme 1

Figure 6. Flowchart of AdaBoost algorithm.

(SGD) regressor, Bayesian linear regression, and least angle regression algorithms (Figure 3). Several compounds with

IC50 activity toward IGF-1R were changed to PIC50 activity by formula 1. 4949

DOI: 10.1021/acs.jpclett.9b02220 J. Phys. Chem. Lett. 2019, 10, 4947−4961

Letter

The Journal of Physical Chemistry Letters Scheme 3

Table 1. Results of Different Machine Learning Models name SVM AdaBoost RIDGE random forest GBRT SGD elastic net LARS Bayesian

MSE on training set

R2 on training set

MSE on test set

R2 on test set

0.137 0.067 0.189 0.046

0.909 0.955 0.874 0.968

0.166 0.146 0.091 0.151

0.884 0.900 0.934 0.909

0.080 0.190 0.168 0.136 0.181

0.947 0.873 0.888 0.904 0.879

0.141 0.077 0.094 0.195 0.110

0.901 0.944 0.932 0.890 0.923

Table 2. Results of Docking and Partial Predicted Value of PIC50 compound conamine # GS-3 # isofebrifugine # OZN (control) nudicaulin α-dichroine 7,4′-dihydroxyflavan arecatannin A2 1-O-β-D-glucopyranosyl(2S,3R,4E,11E)-2-(2′Rhydroxyhexadece-noylamino)4,11-octadecadiene-1,3-diol gomisin D casearlucin A budmunchiamine L4 anhydrocannabisativine nauclederine 3β-[(α-L-arabinopyranosyl)oxy]19α-hydroxyolean-12-en-28-oic acid 28-β-D-glucopyranosyl ester nazlinin febrifugine ergocornine fritillebeinol subaphylline

random forest

GBRT

AdaBoost

96.62 97.143 97.753 55.401 96.577 96.026 95.238 101.93 94.677

7.028 7.271 6.985 6.409 6.224 6.900 6.236 6.446 5.765

6.846 6.878 6.846 6.791 6.307 6.846 6.293 6.537 5.469

6.927 7.160 7.011 6.398 6.185 7.011 6.394 6.185 5.401

111.908 109.203 108.14 110.509 100.053 97.829

6.327 5.711 6.589 6.745 7.197 6.914

6.198 5.963 6.717 6.670 7.141 7.330

6.439 5.852 6.690 6.927 7.193 7.193

100.596 97.662 96.932 96.652 94.674

7.036 6.595 6.463 6.642 6.228

6.846 6.670 6.294 6.486 5.631

6.927 6.927 6.568 6.624 6.462

suitably oriented separation hyperplane is established to maximize the distance between two parallel hyperplanes. It is assumed that the larger the distance or difference between parallel hyperplanes, the smaller the total error of the classifier. The key of SVM is kernel function. Vector sets in lowdimensional spaces are often difficult to partition, and the solution is to map them to higher-dimensional spaces. But the difficulty with this approach is the increase in computational complexity, and the kernel neatly solves this problem. In other words, as long as the proper kernel function is selected, the classification function of high dimensional space can be obtained. For a long time, the support vector machine (SVM) model was popular because of its rigorous mathematical derivation and excellent performance.39 A simple but easily understood example is shown in Figure 4. Random Forest (RF) Model. The random forest (RF) model, one of the ensemble learning algorithms, has a great generalization ability.40 For the regression task, the RF takes classification and regression trees41 (CART) as the weak learner and uses mean square error (MSE) as the principle.

Figure 7. Result of calculating the Pearson correlation coefficient of 204 features (a). Result of calculating the Pearson correlation coefficient of 9 features selected by Lasso (b).

PIC50 = − log(IC50)

docking score

(1)

Support Vector Machine (SVM) Model. The support vector machine (SVM)38 is an algorithm that maps raw data to a hyperplane through the kernel function and then classifies the data, minimizing the distance from all the data in the sample set to the hyperplane. Support vector machine (SVM) maps vector to a higher dimensional space where there is a maximal interval hyperplane. Two parallel hyperplanes are built on either side of the hyperplane where the data are separated. A 4950

DOI: 10.1021/acs.jpclett.9b02220 J. Phys. Chem. Lett. 2019, 10, 4947−4961

Letter

The Journal of Physical Chemistry Letters Table 3. Hyperparameter Settings of All Machine Learning Models name

code parameters

SVM AdaBoost RIDGE random forest GBRT SGD elastic net LARS Bayesian

random_state = 0,tol = 1e-5 alpha = 0.05, fit_intercept = True, normalize = True, copy_X = True, max_iter = None, tol = 5, solver = ‘auto’, random_state = 10 alpha = 0.05, fit_intercept = True, normalize = True, copy_X = True, max_iter = None, tol = 5, solver = ‘auto’, random_state = 10 n_estimators = 20, max_depth = 20, oob_score = True, n_jobs = −1, random_state = 10 n_estimators = 100 max_iter = 100, penalty = None,alpha = 0.5 random_state = 10,alpha = 0.1,l1_ratio = 0.001,normalize = False n_nonzero_coefs = 50 compute_score = True

Figure 8. Results of nine machine learning algorithms.

node, so the prediction of the random forest is the average of all predicted values of trees. For each decision tree, the training set was selected from the sample set randomly and the rest

The CART is a kind of binary tree, which could not only solve classification but also deal with the regression task. The prediction of the CART is based on the mean value of the leaf 4951

DOI: 10.1021/acs.jpclett.9b02220 J. Phys. Chem. Lett. 2019, 10, 4947−4961

Letter

The Journal of Physical Chemistry Letters

learning algorithms, for the same training set, and then assemble these weak learning algorithms to construct a stronger final learning algorithm (Figure 6). The AdaBoost algorithm specifies how to readjust the sample distribution and determine the weights of each weak classifier in the learning process. The simplified realization of AdaBoost is given in Scheme 2. Elastic Net Model. The elastic net45 model can be seen as a combination of lasso and ridge models. It is also the regularization of ordinary linear regression, but its loss function is neither all L1 regularization nor all L2 regularization, but with a weight parameter ρ to balance the proportion of L1 and L2 regularization, the objective function is as follows (3): 1 J (θ ) = 2

Figure 9. Structure of neural network.

m

n (i)

∑ (y

T (i) 2

− θ x ) + λ(ρ ∑ |θj|

i

j n

+ (1 − ρ) ∑ θj 2

data was used as the test set. In the process of training the nodes of each tree, the features used are extracted randomly from all the features without putting back according to a certain proportion. So, the random forest usually performs well in reducing overfitting and could achieve low variance.42 The chart of the random forest model is shown in Figure 5. Gradient Boost Regression Tree. The gradient boost regression tree (GBRT) is a kind of boosting algorithm.43 The algorithm flow is given in Scheme 1. Ridge Regression. Ridge regression is a biased estimation method specially used for collinearity data analysis, which is an improved least-squares estimation method in essence. By giving up the unbiased least-squares method, the regression coefficient is more consistent with reality and more reliable regarding the cost of losing part of the information and reduced precision. When the number of features is larger than the number of samples, the ridge regression performs well. The objective function is as follows (2): 1 J (θ ) = 2

m

j

Stochastic Gradient Descent. The stochastic gradient descent (SGD) regressor is a special case of gradient descent optimization. As the batch gradient descent requires all training samples when updating each parameter, the training process will become unusually slow as the number of samples increases. The principle behind the gradient descent algorithm: the gradient of the objective function J(θ) with respect to the parameter θ will be the direction in which the objective function rises the fastest. For the optimization problem of minimization, only one step in the opposite direction of the gradient is needed to advance the parameters, and then the objective function can be reduced. This step size is also called the learning rate η. The parameter update formula is as follows (4): θ ← θ − η ·∇θ J(θ )

n j

(4)

∇θJ(θ) is the gradient of parameter. According to the different data size of objective function J(θ), the gradient descent algorithm can be divided into three parts: batch gradient descent, stochastic gradient descent, and mini-batch gradient descent. For SGD, it minimizes the cost function of each sample. Although the cost function obtained by each iteration is not always in the direction of the global optimum, the overall

∑ (y(i) − θ Tx(i))2 + λ ∑ θj 2 i

(3)

(2)

Adapt Boost (AdaBoost) Model. AdaBoost is the abbreviation for adaptive boosting, which is based on the theory of boosting.44 AdaBoost is an iterative algorithm. The core idea of AdaBoost is to train different learning algorithms, named weak

Figure 10. Results of 17 deep learning models. 4952

DOI: 10.1021/acs.jpclett.9b02220 J. Phys. Chem. Lett. 2019, 10, 4947−4961

Letter

The Journal of Physical Chemistry Letters Table 4. Predicted Activity through Different Deep Learning Models compound

Model1a

Model2b

Model3c

Model4d

Model5e

conamine # GS-3 # isofebrifugine # OZN (control) nudicaulin α-dichroine 7,4′-dihydroxyflavan arecatannin A2 1-O-β-D-glucopyranosyl-(2S,3R,4E,11E)-2-(2′R-hydroxyhexadece-noylamino)-4,11-octadecadiene-1,3-diol gomisin D casearlucin A budmunchiamine L4 anhydrocannabisativine nauclederine 3β-[(α-L-arabinopyranosyl)oxy]-19α-hydroxyolean-12-en-28-oic acid 28-β-D-glucopyranosyl ester nazlinin febrifugine ergocornine fritillebeinol subaphylline

4.747 8.872 4.917 25.691 5.455 4.716 4.029 58.829 5.442 5.566 5.240 4.937 4.607 5.280 6.286 4.464 4.662 7.669 5.169 4.311

4.758 8.862 4.887 26.213 5.375 4.695 3.997 56.645 5.534 5.520 5.157 5.026 4.661 5.266 6.154 4.469 4.633 7.614 5.126 4.293

4.770 8.876 4.881 26.549 5.318 4.685 3.996 55.397 5.684 5.495 5.140 5.123 4.724 5.262 6.128 4.477 4.642 7.603 5.103 4.294

4.781 8.882 4.871 26.690 5.290 4.673 3.983 54.751 5.797 5.492 5.151 5.185 4.762 5.254 6.127 4.481 4.633 7.590 5.110 4.281

4.725 8.789 4.723 30.361 5.211 4.537 3.752 32.023 6.998 4.897 4.931 6.023 5.232 5.159 5.175 4.679 4.645 7.338 4.831 4.336

a

Model1: R2 on training set = 0.95, R2 on test set = 0.86. bModel2: R2 on training set = 0.94, R2 on est set = 0.87. cModel3: R2 on training set = 0.93, R2 on test set = 0.88. dModel4: R2 on training set = 0.92, R2 on test set = 0.89. eModel5: R2 on training set = 0.91, R2 on test set = 0.9.

Figure 11. Target compounds interaction network. The red diamond replaces the related targets, the round form presents the compounds. The compounds in the rectangle frame especially mean multitarget candidates.

probability of event A occurring. The Bayesian regression uses maximum likelihood estimation. The idea of maximum likelihood could be comprehended as, when something happens, we believe that the current environmental conditions are most conducive to this matter. Least Angle Regression (LARS). The least angle regression (LARS),46 which is similar to the forward stepwise form, is an efficient solution for lasso regression from the perspective of solution process. The least angle regression method makes a compromise between the forward gradient algorithm and the forward selection algorithm, retains a certain degree of accuracy of the forward gradient algorithm, and simplifies the iterative process of the forward gradient algorithm step by step. Deep Learning. Deep learning, which has been applied in many fields since it was first proposed,41 is a new revolution tool of machine learning. The basic models of deep learning

direction is still toward the global optimal solution, and the final result is usually near the global optimal solution. The pseudocode of SGD is in Scheme 3. Bayesian Linear Regression. Bayesian linear regression is a linear regression model using the statistical Bayesian inference (Bayesian inference) method. Bayes’ classification method based on statistics, based on Bayes’ theorem, predicts the probability that the sample belongs to a certain class by solving the posterior probability distribution. The Bayesian formula is as follows (5): P(A|B) = P(A)

P(B|A) P(B)

(5)

The P(A|B) is named as the posterior probability, which is the probability that event A happens when event B happens. The P(A) is named as prior probability, which represents the 4953

DOI: 10.1021/acs.jpclett.9b02220 J. Phys. Chem. Lett. 2019, 10, 4947−4961

Letter

The Journal of Physical Chemistry Letters

Figure 12. RMSD, total energy, and hydrogen bond distance of each receptor−ligand complex. (a) RMSD and energy of IGF1R complexes. (b) RMSD and total energy of IR complexes. (c) Hydrogen bond distance of candidate complexes.

can be roughly divided into three categories: multilayer perceptron model, deep neural network model, and recurrent neural network model. These are represented by the deep belief network (DBN), convolution neural networks (CNN), and recurrent neural network (RNN), respectively.42 The CNN performs well in image processing and natural language processing. It contains five main parts, input layer, convolutional layer, pooling layer, fully connected layer, and output layer. The input layer is the input of entire neural network. For image processing, it generally represents the pixel matrix of an image. For the data classification problem, it represents a one-dimensional array. The function of the convolution layer is to extract features from the input data. With the convolution kernel, features can be extracted. The function of the pooling layer is to gradually reduce the spatial size of the data body, so that the number of parameters in the network can be reduced, the computational resource consumption can be reduced, and overfitting can be effectively controlled. The fully connected layer is usually refitted at the

tail of CNN to reduce the loss of feature information. The output layer is the output of the whole CNN. Molecular Dynamics Simulation (MDs) in the Candidate Compounds. The receptor−ligand complex of candidates and control were confirmed by molecular dynamics simulation (MDs) in 500 ns with Gromacs 2018 software. The MD simulation system added solvent, energy minimization, NVT equilibrium, and NPT equilibrium. The explicit solvent water model TIP3P was added to the system, and the corresponding amount of ions was substituted in the system to simulate the physiological environment. The energy minimization was performed with the steepest descent minimization algorithm in 5000 steps. The NVT equilibration simulation was for a total of 20 ns with time steps 2 fs. The Lincs algorithm constrained all bonds and the 310 K temperature setting, which is similar to the physiological environment. The NPT equilibration ran for 20 ns with a set temperature coupling during the simulation. The particle mesh Ewald (PME) was utilized for long-range electrostatics analysis. The MD simulation was performed in 250000000 steps with 2 fs for each step. The total 500 ns MD 4954

DOI: 10.1021/acs.jpclett.9b02220 J. Phys. Chem. Lett. 2019, 10, 4947−4961

Letter

The Journal of Physical Chemistry Letters

Figure 13. Variation trend of gyrate, MSD, and SASA during the MD period: (a) protein IGF1R; (b) protein IR; (c) ligands of IGF1R; (d) ligand of IR.

applied the Verlet scheme for neighbor searching cutoff. The pressure coupling was applied in the Parrinello−Rahman way with isotropic type. In the calculation of the root-mean-square deviation (RMSD), the radius of gyration (gyrate), root-meansquare fluctuation (RMSF), secondary structure of protein, solvent accessible surface area (SASA), mean square displacement (MSD), the residue contact map (mdmat), hydrogen bond distance, total energy, torsion angle, and the motion of complex could be observed clearly. Especially, the frame during simulation could display the process of motion intuitively. For the ternary complex Klotho-FGF-FGFR, the high similarity of each component reflected a similar action in 3D. Sequence alignments αKlotho/βKlotho, FGF19/FGF21/ FGF23, and FGF1/FGF2/FGF3/FGF4 displayed the consistency of each element, respectively. As for FGFs, the protein conservation pattern displayed with a dendrogram and we could trace the residue distribution. To improve the result, we

observed the spatial distribution differences especially each same cluster. What is more, we cut the Klotho component from the ternary complex to the binary complex to analyze the effect of Klotho. Protein−Protein Interaction in Signal Pathway. The longevity pathway in the KEGG databank provided the related target of the Klotho protein (Figure 2). One of the mechanisms was the down-regulation of IGF-1R, which accompanied the benefit of up-regulation of Klotho. From the pathway, it could be identified what possesses this synergistic effect. And finally two proteins, IGF-1R and IR, which should be inhibited, were targets in the research. Docking Results. The whole procedure could be divided into three parts: protein preparation, ligand preparation, and parameter settings. The software we used is Discovery Studio Client v17.2.0.16349. The binding site was set in insulin receptor subunit β area 42. Then we did the protein 4955

DOI: 10.1021/acs.jpclett.9b02220 J. Phys. Chem. Lett. 2019, 10, 4947−4961

Letter

The Journal of Physical Chemistry Letters

Figure 14. RMSF analysis and residue contact map: (a) RMSF value of each IGF1R complexes; (b) RMSF value of IR complexes.

publications and protein database bank (PDB). We set the option of “input preserve controls in input” as true and “input control ligand property” as control. For the energy grid, we chose “Dreiding”; the extension from the site was set at 3.0, the softened potential energy was “true,” the dielectric constant was set as 1.0, and the distance dependent dielectric was true. The number of Monte Carlo trials are “2 500 120, 4 1200 300, 6 1500 350, 10 2000 500, 25 3000 750”. For the conformation search, the included electrostatic energy was false, the torsional step size for polar H was 30.0, and the maximum internal energy was set as 10000.00. For the docking parts, the RMS threshold for ligand/site match was 2.0, the rigid body SD iterations were 10, the rigid body BFGS iterations were 20, and the maximum poses retained were set as 5. For pose saving, the DockScore threshold and rigid body BFGS iterations were set as 0. The RMS threshold for diversity was 1.50, and the score threshold for diversity was 20.0. We chose true for perform clustering. The maximum clusters per molecule were 5, and the RMS threshold for clustering was 1.5. For interaction filters, we set 0 as the minimum number of features and true for donor including SH, for acceptor including aromatic F, and for metal including S. We chose false for “save only poses with minimum features”. For the minimization algorithm, we chose “do not minimize”, the minimization iterations were 1000, the

preparation in DS by removing the water and adding hydrogen atoms. These actions were done through a function called “prepare protein”. We used the default settings to finish this part. Finally, we opened the generated dsv format file, removed the cell from it, and saved it. For ligand preparation, we did not need to do preprocessing; we just loaded all the TCM database compounds and calculated the properties we want.21 The properties including AbsoluteEnergy, ConfNumber, ADMET_BBB, ADMET_BBB_Level, ADMET_Absorption_Level, ADMET_Solubility, ADMET_Solubility_Level, ADMET_Hepatotoxicity, ADMET_Hepatotoxicity_Probability, ADMET_CYP2D6, ADMET_CYP2D6_Probability, ADMET_PPB_Level, ADMET_AlogP98, ADMET_Unknown_AlogP98, and ADMET_PSA_2D. Before we began to dock the protein to the TCM database, we chose a docking protocol and set the parameters. In this experiment, we chose LigandFit as the protocol47 and the Chemistry at HARvard Macromolecular Mechanics (CHARMm) force field. The input receptor is the proteins we prepared. And for the insulin receptor (IR), the binding site was set in the insulin receptor subunit β area.36 The input ligands are the all 18776 compounds from the TCM database. The control ligand was IR1 according to the 4956

DOI: 10.1021/acs.jpclett.9b02220 J. Phys. Chem. Lett. 2019, 10, 4947−4961

Letter

The Journal of Physical Chemistry Letters

Figure 15. Genetic similarity of FGFs. (a) Sequence alignment of representative structure. Four clusters appeared with a 10 Å cutoff value. The purple cluster suggests a conservative area, and the red cluster suggests a variable region. (b) Three dimensions of FGFs. The conservative region is mainly placed around the linker area; however, several variable residues enhanced the selectivity between different subtypes.

minimization dielectric constant was 1.0, the minimization distance-dependent dielectrics were set as true, the minimization gradient tolerance was 0.001, and the minimization nonbond cutoff was 13.0. We used ten scoring functions

including LigScore 1, LigScore2, PLP1, PLP2, Jain, PMF, PMF04, Ludi Energy Estimate 1, Ludi Energy Estimate 2, and Ludi Energy Estimate 3. We chose parallel processing, and the batch size was set as 25. The whole experiment was 4957

DOI: 10.1021/acs.jpclett.9b02220 J. Phys. Chem. Lett. 2019, 10, 4947−4961

Letter

The Journal of Physical Chemistry Letters

according to the error rate. The larger the error rate is, the larger the weight is. The data confirm that the ensemble learning method does perform better than the single machine learning algorithm. So, we used the three ensemble learning algorithms to predict the bioactivity value, PIC50 (Table 2). The source code of these nine algorithms is shown in Supporting Information Table S2. Deep Learning Model. In order to enhance the accuracy of predicted activity value, the deep learning (DL) method was applied to build a new property-bioactivity model. A simple four-layer full connected neural network using ReLu (rectified linear units) function as activation function was constructed (Figure 9). The dropout technique was also applied in the second and third layer (with rate 0.4 for the second and 0.6 for the third) to reduce overfitting. The loss function is mean squared error (MSE) (7).

Table 5. TCM Formula Proposal TCM

component

target

antifebrile dichroa# Arecae Semen Gelsemium sempervirens# Holarrhena antidysenterica# Hippophae rhamnoides Ziziphus jujuba Cordyceps sinensis Lycium chinense Panax ginseng Perilla frutescens Phragmites communis Celosia cristada Trigonella foenum-graecum

isofebrifugine arecatannin A2 GS-3 conamine vitamin B1 vitamin B1 vitamin B1 vitamin B1 vitamin B1 vitamin B1 vitamin B1 vitamin B1 vitamin B1

IGF1R;IR IGF1R;IR IGF1R;IR IGF1R IR IR IR IR IR IR IR IR IR

accomplished by 22 h and 16 min. We input 18777 ligands (18776 from TCM database and one control ligand). There were 38314 poses docked, and 846 poses filtered or failed to dock. For the protein named IGF-1R, the inhibited site could be next to the ligand in the crystal structure.35 Its control ligand is OZN. In this part, we used the same parameters as above. But this experiment only took 8 h and 50 min. It produced 24642 docked poses, and 9758 poses filtered or failed to dock. The sample data set was constructed by 91 data, which included 73 training sets and 18 test sets. Taking the random forest model as an example, we calculated the Pearson correlation coefficient of 204 features to know the correlation between features. The formula is as follows (6): r=

MSE =

N

∑ (observed t − predictedt )2 t=1

(7)

We use the Adam optimizer49 with learning rate 0.001 and other parameters used in the original paper. We did 200 experiments on the GeForce GTX 1060. Only if the r2 is greater than 0.9 on the training set and greater than 0.85 on the test set could the model be saved. The result shows that only 17 models satisfied both requirements (Figure 10). The source code is shown in Supporting Information Table S3. The predicted bioactivity value is shown in Table 4. From the table, we can see the control compound OZN achieves a very high mark in every deep learning model. That is what we would like to see happen, because it proves that the control we selected is an appropriate choice. However, we noticed that the compound named arecatannin A2 gets the highest score in all models. Looking up its dock score 101.93 and the predicted values of three ensemble learning models 6.446, 6.537, 6.185, we could not find a reasonable reason to explain this abnormal phenomenon. It may be due to the size of data set; we only found and used 91 data. Though the value of r2 of deep learning models is great, the predicted result showed that models maybe did not perform as well as expected. For the rigor of the results, we only take the results of deep learning as a reference. In other words, the final results were determined by three ensemble learning models only. TCM Candidates. Considering docking score and predicted activity, candidates were determined. The top three ligands of IR were conamine, GS-3, isofebrifugine. Control ligand OZN had a low dock score but great performance in many areas. Two of the candidates (GS-3 and isofebrifugine) performed well in two targets, which indicated synergistic effects against disease. The interactions of the top 20 IGF1R candidates and top 60 IR ligands were displayed in a network (Figure 11). Multitarget ligands were focused. The top three ligands of IR were vitamine B1, GS-3, and isofebrigugine; the last two were multitargeting. Molecular Dynamics. The top three candidates as well as the control would validate through 500 ns molecular dynamics for IGF1R and IR protein. In the RMSD calculation (Figure 12), two of the IGF1R candidates were bumped up in 175 ns (8a violet) and 425 ns (8a orange), respectively, which matched with the videos we observed. The ligand RMSD of compound C was entirely unstable during MD, and the control stayed at a state for 175−350 ns. Both of them disappeared at the end. As

N ∑ xiyi − ∑ xi ∑ yi 2

N ∑ xi 2 − (∑ xi)2 N ∑ yi 2 − (∑ yi )

1 N

(6)

The result showed that most of features have a high correlation (Figure 7a). In data preprocessing, we first screened out the features with variance greater than 0.05, and 122 features were chosen. Then we normalized the values of selected features. Finally, we used the Lasso feature selection.48 For the 9 features selected by the Lasso, we calculated the Pearson correlation coefficient again and the result shows the features we selected have a good orthogonality (Figure 7b). The detailed description is shown in Supporting Information Table S1. We used up to 20 trees to train the model. The model performed well in the training set, the mean square error (MSE) is 0.046, and the R2 is 0.968. On the test set, the performance is also good, the mean square error (MSE) is 0.151, and the R2 is 0.910. For other models, we use the same preprocessing step. So, the features used in all models are the same. When the parameters were adjusted, all models achieved good results (Table 1, Figure 8) and predicted values were shown in Table 2. Besides, all the hyperparameter settings for machine learning algorithms were shown in Table 3. Notably, the random forest, GBRT, and AdaBoost algorithms achieved better results than others. It is interesting to note that these three algorithms belong to ensemble learning. The random forest is a kind of bagging algorithm, in which all weak learners were trained, respectively. Then each weak learner weights equally. GBRT and AdaBoost belong to boosting algorithms, which use same training set in each iteration. The weight of the sample is constantly adjusted 4958

DOI: 10.1021/acs.jpclett.9b02220 J. Phys. Chem. Lett. 2019, 10, 4947−4961

Letter

The Journal of Physical Chemistry Letters

experiment proved that Antifebrile dichroa, Holarrhena antidysenterica, and Gelsemium sempervirens could perform well in two targets, IR and IGF-1R (Table 5). Maybe these Chinese medicines could be potent lead drugs for longevity. Deep learning, as a popular algorithm, did not perform well in this case. We attributed this to the small data set. But it is hard to get a huge amount of bioactivity value data. So, how to apply deep learning with a small amount of data may be a new subject in the drug discovery field. On the basis of the existing data and professional background knowledge, we are committed to replacing some traditional biochemical experiments with computational methods to find out the most possible Chinese herb formula. For instance, when docking the protein to the traditional Chinese database (TCM), we found that some molecules could not combine with a protein based on the structure. So, if we want to design a drug for this target, these molecules, which could not dock well with a protein, should be excluded first. At present, the biochemical experiment is vital and essential, but it needs a lot of time and money. If we could use some methods to cut the workload of biochemical experiments or to provide some possible suggestions for biochemical experiments, it has great practical significance. Emulate, a company based in Boston, is developing multiple organ chips and disease models. A single organ chip is a living, microengineered environment capable of recreating the natural physiological and mechanical forces that cells undergo in the human environment. The organ chips are a new home for cells, which are no different from living in a human body. The organs by Emulate (chip work in a simulated human system) can provide internal operation of a window to reflect human physiology and disease, provide a new technology for the researchers, and are used to predict the reaction of the human body. In conclusion, in this Letter, a novel TCM formula was obtained by computing. Taking the effect of the Klotho pathway, we focused on two mainly related proteins, IGF1R and IR. Candidate compounds were screened out after structure-based drug discovery and ligand-based drug discovery. In the end, the GS-3, conamine, and isofebrigugine were chosen as candidate compounds. Reviewing the TCM database, we found the related Chinese herbs. Antifebrile dichroa, Holarrhena antidysenterica, and Gelsemium sempervirens could perform well in both targets. By consulting relevant literatures of traditional Chinese medicine, we found that some of the effects of the three drugs were documented, such as antibacterial action and detoxification. But three is not enough evidence to prove these effects. Maybe the three Chinese medicines should arouse the attention of concerned researchers.

for the IR protein, vitamin B1 and the control both had interactions with receptor. The GS-3 disappeared initially (due to the bad interaction, the ligand was out of the region during the equilibrium step), so the complex RMSD of GS-3-IR could be the trajectory of the origin IR protein, which set as a blank control. The total energies of all complexes were stable during MD. High occupancies of potential hydrogen bonds were observed, especially. Unfortunately, complex IR-control performed best by including two meaning H-bonds. The occupancies were 91.6% (ligand:H27−O:Met1153) and 99.3% (ligand:H34−O:Met1079), respectively (Figure 12c). For the IGF1R protein, two ligands (conamine and GS3) remained in the binding site during MD (controlling “disappearance”). We focused on the gyrate (tightness), mean square displacement (MSD), solvent accessible surface area (hydrophilic ability), and change of protein (Figure 13a,b) and ligand (Figure 13c,d), respectively. As for ligand MSD, logarithmic transformation occurs by the function YMSD = 2 + log10 MSD (nm2). The control ligand of IGF1R “disappears”, and obviously three stages (gyrate or SASA showed) were displayed. Vitamin B1 (red in Figure 13b,d) and control (orange) stayed in the binding region. We concentrated on vitamin B1 and found a novel skeleton type molecule for the IR protein. However, the new structure showed no activity superior to that of the control. The dock score of vitamin B1 was 122.655 higher than that of the control (55.216), but the control was more stable during MD. The RMSF for four IGF1R candidates (Figure 14a) performed differently in comamine and GS3 (pink and cyan) due to both being combined stably with the target. The marked area displayed different fluctuations influenced by different ligands. Two protein residues with stable ligandbinding showed similar spatial distances (Figure 14b). However, an unexpected similarity presented in the residue distance matrix (Figure 14b) even if it was the same ligand binding as in Figure 14a. There is now a general consensus that a ternary complex, namely, Klotho-FGFR-FGF, controlled the reflex of the FGF single pathway. However, the factors that govern the Klotho had yet to be fully defined, while interactions of Klotho (α and β) with naive FGFR (FGFR23 and FGF21) in the membrane had been newly examined by X-ray analysis. As with the crystal structure, other homogeneous complexes were also capable of undergoing the ternary complex mechanism, known as a “zip code”-like mechanism of FGF. Mohammadi et al.10,18 reported FGF23 was associated with the αKlotho-FGFR23 coreceptor in a 1:1:1 ratio. The interaction was constructed between ternary complexes by a similar sequence, yielding a representative sequence alignment between the FGFR family, FGF family, and klotho family, respectively. For each protein family, getting a sequence similarity was the first inspiring evidence. When the conservative region mainly concentrated at the area around the linker, the distinction between different FGFs such as red cluster indicated the specificity (Figure 15a). Therefore, the uniformity and otherness formed the selectiveness. The Ig-like C2-type 2 and 3 regions of FGFRs displayed similarity in three dimensions so that various FGFs could be recognized. The N terminal of FGF bound with FGFR while its C terminal bound with the kl family protein. Missing any of the three components would affect the effect. Longevity was maintained by a healthy lifestyle and healthy habits. However, on the basis of the key longevity secrets of the human body, a “longevity medicine” was possible. Our



ASSOCIATED CONTENT

S Supporting Information *

The Supporting Information is available free of charge on the ACS Publications website at DOI: 10.1021/acs.jpclett.9b02220.



Nine molecular properties, Balaban and shadow indices, and the source codes for algorithms and detailed description about nine features.(PDF)

AUTHOR INFORMATION

Corresponding Author

*Tel: 02039332153. E-mail: [email protected]. 4959

DOI: 10.1021/acs.jpclett.9b02220 J. Phys. Chem. Lett. 2019, 10, 4947−4961

Letter

The Journal of Physical Chemistry Letters ORCID

(12) Kliewer, S. A.; Mangelsdorf, D. J. A Dozen Years of Discovery: Insights into the Physiology and Pharmacology of FGF21. Cell Metab. 2019, 29, 246−253. (13) Hecht, R.; Li, Y. S.; Sun, J.; Belouski, E.; Hall, M.; Hager, T.; Yie, J.; Wang, W.; Winters, D.; Smith, S.; et al. Rationale-Based Engineering of a Potent Long-Acting FGF21 Analog for the Treatment of Type 2 Diabetes. PLoS One 2012, 7, No. e49345. (14) Kharitonenkov, A.; Beals, J. M.; Micanovic, R.; Strifler, B. A.; Rathnachalam, R.; Wroblewski, V. J.; Li, S.; Koester, A.; Ford, A. M.; Coskun, T.; et al. Rational design of a fibroblast growth factor 21based clinical candidate, LY2405319. PLoS One 2013, 8, No. e58575. (15) Ameka, M.; Markan, K. R.; Morgan, D. A.; BonDurant, L. D.; Idiga, S. O.; Naber, M. C.; Zhu, Z.; Zingman, L. V.; Grobe, J. L.; Rahmouni, K.; et al. Liver Derived FGF21 Maintains Core Body Temperature During Acute Cold Exposure. Sci. Rep. 2019, 9, 630. (16) Foltz, I. N.; Hu, S.; King, C.; Wu, X.; Yang, C.; Wang, W.; Weiszmann, J.; Stevens, J.; Chen, J. S.; Nuanmanee, N.; et al. Treating diabetes and obesity with an FGF21-mimetic antibody activating the βKlotho/FGFR1c receptor complex. Sci. Transl. Med. 2012, 4, 162ra153. (17) Lee, S.; Choi, J.; Mohanty, J.; Sousa, L. P.; Tome, F.; Pardon, E.; Steyaert, J.; Lemmon, M. A.; Lax, I.; Schlessinger, J. Structures of β-klotho reveal a ’zip code’-like mechanism for endocrine FGF signalling. Nature 2018, 553, 501−505. (18) Chen, G.; Liu, Y.; Goetz, R.; Fu, L.; Jayaraman, S.; Hu, M. C.; Moe, O. W.; Liang, G.; Li, X.; Mohammadi, M. α-Klotho is a nonenzymatic molecular scaffold for FGF23 hormone signalling. Nature 2018, 553, 461−466. (19) Hasan, S.; Bonde, B. K.; Buchan, N. S.; Hall, M. D. Network analysis has diverse roles in drug discovery. Drug Discovery Today 2012, 17, 869−874. (20) Xu, Y.; Sun, Z. Molecular basis of Klotho: from gene to function in aging. Endocr. Rev. 2015, 36, 174−193. (21) Chen, C.-Y. C. TCM Database@Taiwan: The World’s Largest Traditional Chinese Medicine Database for Drug Screening In Silico. PLos ONE 2011, 6, e15939. (22) Moen, E.; Bannon, D.; Kudo, T.; Graf, W.; Covert, M.; Van Valen, D. Deep learning for cellular image analysis. Nat. Methods 2019, DOI: 10.1038/s41592-019-0403-1. (23) Chen, D.; Liu, S.; Kingsbury, P.; Sohn, S.; Storlie, C. B.; Habermann, E. B.; Naessens, J. M.; Larson, D. W.; Liu, H. Deep learning and alternative learning strategies for retrospective real-world clinical data. npj Digital Med. 2019, 2, 43. (24) Hansen, K.; Biegler, F.; Ramakrishnan, R.; Pronobis, W.; von Lilienfeld, O. A.; Müller, K.-R.; Tkatchenko, A. Machine Learning Predictions of Molecular Properties: Accurate Many-Body Potentials and Nonlocality in Chemical Space. J. Phys. Chem. Lett. 2015, 6, 2326−2331. (25) Ekins, S.; Puhl, A. C.; Zorn, K. M.; Lane, T. R.; Russo, D. P.; Klein, J. J.; Hickey, A. J.; Clark, A. M. Exploiting machine learning for end-to-end drug discovery and development. Nat. Mater. 2019, 18, 435−441. (26) Vamathevan, J.; Clark, D.; Czodrowski, P.; Dunham, I.; Ferran, E.; Lee, G.; Li, B.; Madabhushi, A.; Shah, P.; Spitzer, M.; et al. Applications of machine learning in drug discovery and development. Nat. Rev. Drug Discovery 2019, 18, 463−477. (27) Valan, M.; Makonyi, K.; Maki, A.; Vondracek, D.; Ronquist, F. Automated Taxonomic Identification of Insects with Expert-Level Accuracy Using Effective Feature Transfer from Convolutional Networks. Syst. Biol. 2019, 68, syz014. (28) Chang, Y.; Park, H.; Yang, H.-J.; Lee, S.; Lee, K.-Y.; Kim, T. S.; Jung, J.; Shin, J.-M. Cancer Drug Response Profile Scan (CDRscan): A Deep Learning Model That Predicts Drug Effectiveness from Cancer Genomic Signature. Sci. Rep. 2018, 8, 8857. (29) Chan, H. C. S.; Shan, H.; Dahoun, T.; Vogel, H.; Yuan, S. Advancing Drug Discovery via Artificial Intelligence. Trends Pharmacol. Sci. 2019, 40, 592−604.

Jun-Yan Li: 0000-0002-4679-8324 Calvin Yu-Chian Chen: 0000-0001-9213-9832 Author Contributions #

Equal contribution.

Author Contributions

Calvin Yu-Chian Chen designed research. Jun-Yan Li, Hsin-Yi Chen, and Wen-Jie Dai worked together to complete the experiment. Calvin Yu-Chian Chen contributed to analytic tools. Hsin-Yi Chen, Qiu-Jie Lv, and Wen-Jie Dai analyzed the data. Jun-Yan Li, Calvin Yu-Chian Chen, Hsin-Yi Chen, and Wen-Jie Dai wrote the manuscript together. Notes

The authors declare no competing financial interest.



ACKNOWLEDGMENTS This work was supported by the Guangzhou Science and Technology Fund (Grant No. 201803010072) and the Science, Technology, & Innovation Commission of Shenzhen Municipality (JCYL 20170818165305521). We also acknowledge the start-up funding from SYSU “Hundred Talent Program”.



REFERENCES

(1) Cardoso, A. L.; Fernandes, A.; Aguilar-Pimentel, J. A.; de Angelis, M. H.; Guedes, J. R.; Brito, M. A.; Ortolano, S.; Pani, G.; Athanasopoulou, S.; Gonos, E. S.; et al. Towards frailty biomarkers: Candidates from genes and pathways regulated in aging and agerelated diseases. Ageing Res. Rev. 2018, 47, 214−277. (2) Kuro-o, M. The Klotho proteins in health and disease. Nat. Rev. Nephrol. 2019, 15, 27−44. (3) Arbel Rubinstein, T.; Shahmoon, S.; Zigmond, E.; Etan, T.; Merenbakh-Lamin, K.; Pasmanik-Chor, M.; Har-Zahav, G.; Barshack, I.; Vainer, G. W.; Skalka, N.; et al. Klotho suppresses colorectal cancer through modulation of the unfolded protein response. Oncogene 2019, 38, 794−807. (4) Kuro-o, M.; Matsumura, Y.; Aizawa, H.; Kawaguchi, H.; Suga, T.; Utsugi, T.; Ohyama, Y.; Kurabayashi, M.; Kaname, T.; Kume, E.; et al. Mutation of the mouse klotho gene leads to a syndrome resembling ageing. Nature 1997, 390, 45−51. (5) Wolf, I.; Levanon-Cohen, S.; Bose, S.; Ligumsky, H.; Sredni, B.; Kanety, H.; Kuro-o, M.; Karlan, B.; Kaufman, B.; Koeffler, H. P.; et al. Klotho: A tumor suppressor and a modulator of the IGF-1 and FGF pathways in human breast cancer. Oncogene 2008, 27, 7094−7105. (6) Tang, X.; Wang, Y.; Fan, Z.; Ji, G.; Wang, M.; Lin, J.; Huang, S.; Meltzer, S. J. Klotho: A tumor suppressor and modulator of the Wnt/ β-catenin pathway in human hepatocellular carcinoma. Lab. Invest. 2016, 96, 197−205. (7) Pavlatou, M. G.; Remaley, A. T.; Gold, P. W. Klotho: A humeral mediator in CSF and plasma that influences longevity and susceptibility to multiple complex disorders, including depression. Transl. Psychiatry 2016, 6, No. e876. (8) Xie, J.; Cha, S.-K.; An, S.-W.; Kuro-o, M.; Birnbaumer, L.; Huang, C.-L. Cardioprotection by Klotho through downregulation of TRPC6 channels in the mouse heart. Nat. Commun. 2012, 3, 1238. (9) Hu, M.-C.; Moe, O. W. Klotho as a potential biomarker and therapy for acute kidney injury. Nat. Rev. Nephrol. 2012, 8, 423−429. (10) Ogawa, Y.; Kurosu, H.; Yamamoto, M.; Nandi, A.; Rosenblatt, K. P.; Goetz, R.; Eliseenkova, A. V.; Mohammadi, M.; Kuro-o, M. βKlotho is required for metabolic activity of fibroblast growth factor 21. Proc. Natl. Acad. Sci. U. S. A. 2007, 104, 7432−7437. (11) Kilkenny, D. M.; Rocheleau, J. V. The FGF21 Receptor Signaling Complex: Klothoβ, FGFR1c, and Other Regulatory Interactions. Vitam. Horm. 2016, 101, 17−58. 4960

DOI: 10.1021/acs.jpclett.9b02220 J. Phys. Chem. Lett. 2019, 10, 4947−4961

Letter

The Journal of Physical Chemistry Letters (30) Luan, B.; Zhou, R. Nanopore-Based Sensors for Detecting Toxicity of a Carbon Nanotube to Proteins. J. Phys. Chem. Lett. 2012, 3, 2337−2341. (31) Altae-Tran, H.; Ramsundar, B.; Pappu, A. S.; Pande, V. Low Data Drug Discovery with One-Shot Learning. ACS Cent. Sci. 2017, 3, 283−293. (32) Zhou, Y.; Li, S.; Xiong, N. Improved Vine Copula-Based Dependence Description for Multivariate Process Monitoring Based on Ensemble Learning. Ind. Eng. Chem. Res. 2019, 58, 3782−3796. (33) Kanehisa, M.; Sato, Y.; Kawashima, M.; Furumichi, M.; Tanabe, M. KEGG as a reference resource for gene and protein annotation. Nucleic Acids Res. 2016, 44, D457−D462. (34) Burley, S. K.; Berman, H. M.; Christie, C.; Duarte, J. M.; Feng, Z.; Westbrook, J.; Young, J.; Zardecki, C. RCSB Protein Data Bank: Sustaining a living digital data resource that enables breakthroughs in scientific research and biomedical education. Protein Sci. 2018, 27, 316−330. (35) Degorce, S. L.; Boyd, S.; Curwen, J. O.; Ducray, R.; Halsall, C. T.; Jones, C. D.; Lach, F.; Lenz, E. M.; Pass, M.; Pass, S.; et al. Discovery of a Potent, Selective, Orally Bioavailable, and Efficacious Novel 2-(Pyrazol-4-ylamino)-pyrimidine Inhibitor of the Insulin-like Growth Factor-1 Receptor (IGF-1R). J. Med. Chem. 2016, 59, 4859− 4866. (36) Anastassiadis, T.; Duong-Ly, K. C.; Deacon, S. W.; Lafontant, A.; Ma, H.; Devarajan, K.; Dunbrack, R. L., Jr.; Wu, J.; Peterson, J. R. A highly selective dual insulin receptor (IR)/insulin-like growth factor 1 receptor (IGF-1R) inhibitor derived from an extracellular signalregulated kinase (ERK) inhibitor. J. Biol. Chem. 2013, 288, 28068− 28077. (37) Bryan, M. C.; Whittington, D. A.; Doherty, E. M.; Falsey, J. R.; Cheng, A. C.; Emkey, R.; Brake, R. L.; Lewis, R. T. Rapid Development of Piperidine Carboxamides as Potent and Selective Anaplastic Lymphoma Kinase Inhibitors. J. Med. Chem. 2012, 55, 1698−1705. (38) Hsu, C.-W.; Lin, C.-J. A comparison of methods for multiclass support vector machines. IEEE Trans. Neural Networks 2002, 13, 415−425. (39) Hwang, E.-J.; Jung, J.-Y.; Lee, S. K.; Lee, S.-E.; Jee, W.-H. Machine Learning for Diagnosis of Hematologic Diseases in Magnetic Resonance Imaging of Lumbar Spines. Sci. Rep. 2019, 9, 6046. (40) Breiman, L. Random Forests. Machine Learning 2001, 45, 5− 32. (41) Gordon, A. D.; et al. Review of “Classification and Regression Trees by L. Breiman; J. H. Friedman; R. A. Olshen; C. J. Stone”. Biometrics 1984, 40, 874. (42) Sheridan, R. P. Three Useful Dimensions for Domain Applicability in QSAR Models Using Random Forest. J. Chem. Inf. Model. 2012, 52, 814−823. (43) Friedman, J. H. Greedy Function Approximation: A Gradient Boosting Machine. Ann. Stat. 2001, 29, 1189−1232. (44) Freund, Y.; Schapire, R. E. Experiments with a new boosting algorithm. Machine Learning. Proceedings of the Thirteenth International Conference (ICML ’96); M. Kaufmann Publishers: San Francisco, 1996; pp 148−156. (45) Zou, H.; Hastie, T. Regularization and variable selection via the elastic net. J. Royal Stat. Soc. B 2005, 67, 301−320. (46) Efron, B.; Hastie, T.; Johnstone, I.; Tibshirani, R. Least Angle Regression. Ann. Stat. 2004, 32, 407−499. (47) Venkatachalam, C. M.; Jiang, X.; Oldfield, T.; Waldman, M. LigandFit: A novel method for the shape-directed rapid docking of ligands to protein active sites. J. Mol. Graphics Modell. 2003, 21, 289− 307. (48) Baraniuk, R. G. Compressive Sensing [Lecture Notes]. IEEE Signal Processing Mag 2007, 24, 118−121. (49) Kingma, D. P.; Ba, J. Adam: A Method for Stochastic Optimization. arXiv preprint arXiv:1412.6980 2014.

4961

DOI: 10.1021/acs.jpclett.9b02220 J. Phys. Chem. Lett. 2019, 10, 4947−4961