Biophysical Chemistry, Biomolecules, and Biomaterials; Surfactants and Membranes
Artificial Intelligence Approach to Find Lead Compounds for Treating Tumors
Jian-Qiang Chen, Hsin-Yi Chen, Wenjie Dai, Qiu-Jie Lv, and Calvin Yu-Chian Chen
J. Phys. Chem. Lett., Just Accepted Manuscript • DOI: 10.1021/acs.jpclett.9b01426 • Publication Date (Web): 13 Jul 2019
The Journal of Physical Chemistry Letters
Artificial Intelligence Approach to Find Lead Compounds for Treating Tumors
Jian-Qiang Chen†#, Hsin-Yi Chen†#, Wen-jie Dai‡#, Qiu-Jie Lv†#, Calvin Yu-Chian Chen†,§,¶*
†School of Intelligent Systems Engineering, Artificial Intelligence Medical Center, Sun Yat-sen University, Shenzhen, 510275, China
‡School of Pharmacy, Sun Yat-sen University, Shenzhen, 510275, China
§Department of Medical Research, China Medical University Hospital, Taichung 40447, Taiwan
¶Department of Bioinformatics and Medical Engineering, Asia University, Taichung 41354, Taiwan
#Equal contribution
*Corresponding Author: Calvin Yu-Chian Chen, Ph.D. School of Intelligent Systems Engineering, Director of Artificial Intelligence Medical Center, Sun Yat-sen University, Guangzhou 510275, China. TEL: 02039332153 E-mail: [email protected]
Abstract
It has been demonstrated that the MMP13 enzyme is related to the cancer cells of most tumors. The world's largest traditional Chinese medicine (TCM) database was screened using structure-based and ligand-based drug design. To predict drug activity, machine learning models (Random Forest (RF), AdaBoost Regressor (ABR), Gradient Boosting Regressor (GBR)) and a deep learning model were used to validate the docking results. The RF algorithm gave an R² of 0.922 on the training set and 0.804 on the test set. For the deep learning algorithm, R² was 0.90 on the training set and 0.810 on the test set. However, after the docking and quantitative structure-activity relationship (QSAR) process, these TCM compounds flew away during the molecular dynamics (MD) simulation. We therefore sought another way, peptide design. All candidates from the peptide databases were screened by docking. Modified peptides optimized the interaction modes, and the affinities were assessed with the ZDOCK protocol and the Refine Docked Protein protocol. A 300 ns MD simulation evaluated the stability of the receptor-peptide complexes. A double-site effect appeared for S2, a peptide designed from a known inhibitor, when complexed with BCL2. S3, a peptide designed with reference to the endogenous inhibitor p16, competed against cyclin when binding to CDK6. The MDM2 inhibitors S5 and S6 were derived from the p53 structure and bound MDM2 stably.
A flexible region of peptides S5 and S6 may have enhanced the binding ability by changing its own conformation, which was not foreseen. These peptides (S2, S3, S5, and S6) are potentially interesting for treating cancer; however, these findings need to be confirmed by biological testing, which will be conducted in the near future.
TOC GRAPHICS
The flowchart shows the whole protocol, which provides a brand new concept of drug and peptide design.
Tumors are the second most deadly disease worldwide, in both developed and developing countries, and targeting metastasis remains a challenge¹. Activation of the epidermal growth factor receptor (EGFR)² occurs in many types of tumors and promotes tumor progression. Macrophage migration inhibitory factor (MIF)³ binds to EGFR and thereby blocks its activation. However, activation of EGFR by mutation or by ligand binding enhances the secretion of MMP13, which degrades extracellular MIF and so removes the negative regulation of EGFR by MIF⁴. Inhibiting MMP13 could slow the expansion of cancer cells and the deterioration of the disease⁴. Recent research shows that MMP13 is related to colorectal metastases⁵, breast tumors⁶, and knee osteoarthritis⁷. Network pharmacology-based analysis provides a multi-target concept that is relevant to cancer treatment⁸. Combining multiple drugs not only improves the efficacy of the treatment but also kills the diseased cells before drug resistance emerges⁹. Several related target proteins, cyclin-dependent kinase inhibitor 2A (CDKN2A), p53, B-cell lymphoma 2 (BCL-2), cyclin-dependent kinase 6 (CDK6), and E3 ubiquitin-protein ligase Mdm2, were studied as well. CDKN2A causes cell cycle arrest and inhibits tumor cell proliferation in cell culture when it is overexpressed¹⁰. The protein works by inhibiting the activity of cyclin-dependent kinase 4 (cdk4) or cdk6¹¹⁻¹².
The tumor suppressor p16INK4a binds to cdk6, which further inhibits tumor growth¹³⁻¹⁴. Mutations in the p53 tumor suppressor are the most common genetic changes in human cancers¹⁵. The inactivation or mutation of tumor suppressor genes and the disruption of the balance that inhibits apoptosis can promote tumor development¹⁶. The selection of p53 mutations during tumor occurrence and development may lead to the parallel inactivation of multiple tumor suppressor genes, which may be the main reason for the high frequency of p53 mutations in cancer¹⁷⁻¹⁸. The interaction between angiopoietin and the p53 TAD2 domain in cancer cells can inhibit the tumor-suppressor function of p53 and promote cell survival¹⁹⁻²⁰. Abnormal regulation of Bcl-2 family members allows tumor cells to escape apoptosis and to resist chemotherapy²¹. The literature provides the rationale for testing combined therapies that use C-X-C chemokine receptor type 4 (CXCR4) and Bcl-2 inhibitors to increase the efficacy of these agents²²⁻²³.

Artificial intelligence is realized by a system that combines representation learning with sophisticated reasoning. Multiple processing layers in deep learning can represent multiple levels of abstraction, which greatly improves the technical level of drug discovery²⁴. It has been demonstrated that deep neural nets (DNNs) can be used as a practical quantitative structure-activity relationship (QSAR) method²⁵. The Random Forest model is widely applied in the field of bioinformatics and provides compelling
results²⁶. The AdaBoost algorithm yielded good prediction performance in breast cancer analysis²⁷. The Gradient Boosting algorithm aggregates its tree models to form a stronger predictor²⁸.

Because targeting a single protein has yielded few effective treatments for many diseases, network pharmacology-based multiple targets⁸ were screened against the TCM database²⁹ and applied to cancer treatment. Deep learning and other algorithms were used to find potential drugs. The compounds in this study performed poorly during MD simulation³⁰, so we focused on peptides for drug design. The peptides we designed bound stably to the receptors throughout the MD simulation. There is reason to believe that these peptides could affect the conformation of the receptors. Cancer-targeting peptides can significantly improve the selectivity and efficacy of existing chemotherapy drugs³¹. Peptides have great prospects in the market³².

To identify several targets related to MMP13, the relationship network was constructed from the STITCH database³³, with the first and second shells each set to no more than 20 interactions. Proteins belonging to pathways in cancer were highlighted as red points. Several known ligands were displayed as rounded rectangles. Three other targets were sought based on the combined score, which is assessed from several lines of evidence. The relationships between these three targets and cancer are available in the literature.
The Kyoto Encyclopedia of Genes and Genomes (KEGG) database³⁴ provides the protein-protein interactions within a pathway. This approach provides a multi-target method for treating tumors. The structures of the four target proteins were obtained from the Protein Data Bank (PDB)³⁵. The crystal structure of MMP13 (PDB: 3ZXH) is a complex with an inhibitor (IC50 = 3 nM) at high resolution (1.3 Å)³⁶. The complete structure of CDKN2A (PDB: 1A5E) was acquired; none of the constraints shows a violation greater than 0.5 Å or a dihedral angle violation greater than 5 degrees¹¹. One of the p53 variants (with mutations at 5 sites) was obtained from the PDB (5O1H)³⁷; its mutation sites lie in the region that is hypervariable when a tumor occurs. A more rational approach would be to screen after sequencing each patient. Here, a multi-site mutation model provides more mutation information, even though different mutation models may in fact differ greatly. The original p53 protein (1TSR)¹⁵ has anti-tumor activity, so its conformation was used as a control. The spatial structure was collected from the PDB (2XA0)³⁸. All of the proteins were used to screen the TCM database with the Docking (LigandFit) protocol³⁹. Disorder validation showed that the conformations of the docking proteins were reasonable; the blue areas mark the key residues. The 41 MMP13 inhibitors were collected from the literature⁴⁰ to create the machine learning models and the deep learning model. To verify the reasonableness of the QSAR models,
20% of the data were set aside as a test set for external validation. The predicted activities were provided as an assessment. The "Remove cell" module of Accelrys Discovery Studio 2.5.5.9350 (DS 2.5) was employed to process these crystal structures before screening; it removes all top-level cells while leaving their constituent contents intact and splits structures into separate molecules. The "Prepare Protein" module was used to clean the protein molecules, remove water, and add hydrogens for the four proteins. The "Calculate Molecular Properties" module was employed to obtain 204 properties of these inhibitors. A Pearson correlation coefficient matrix (Figure 1) was used to judge the correlation and orthogonality of the features, and principal component analysis (PCA) and Lasso feature selection were applied for data preprocessing (Figure 2). The residual plot (Figure 10) shows the residuals on the vertical axis against the dependent variable on the horizontal axis. For these models, the residual is the difference between the observed value of the target variable ($y$) and the predicted value ($\hat{y}$).

AdaBoost Regressor model. In each iteration, the weights of the data misclassified by the previous classifier are increased, while the weights of the correctly classified data are reduced. Finally, AdaBoost²⁷ takes a linear combination of the basic classifiers as a strong classifier, in which basic classifiers with a small classification error rate are given large weights and those with a large classification error rate are given small weights. $M$ is the number of weak classifiers in the boosting tree, $G_m(x)$ denotes the $m$-th weak classifier, $\alpha_m$ is the coefficient of the $m$-th weak classifier, and $w_{i,m}$ represents the weight of the $i$-th instance in round $m$. The algorithmic principle of AdaBoost is as follows:
Algorithm 1: AdaBoost Regressor
Learn the basic classifier $G_1(x)$ from the training set
For $m = 1$ to $M$ do:
    Learn the basic classifier $G_m(x)$ using the training data set weighted by the current distribution $D_m$
    $$D_m = (w_{1,m}, \ldots, w_{i,m}, \ldots, w_{N,m}) \tag{1}$$
    Calculate the classification error rate $e_m$ of the basic classifier $G_m(x)$ on the weighted training data set
    Calculate the coefficient of the basic classifier $G_m(x)$
    $$\alpha_m = \frac{1}{2}\log\frac{1 - e_m}{e_m} \tag{2}$$
    Update the weight distribution of the training data
endFor
Form the new combination of basic classifiers:
$$f(x) = \sum_{m=1}^{M} \alpha_m G_m(x) \tag{3}$$
end Algorithm
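One round of Algorithm 1 can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the text gives the coefficient formula (equation 2) but not the exact reweighting rule, so the standard exponential update is assumed here, and the four-sample toy data and the function names `classifier_coefficient` and `reweight` are hypothetical.

```python
import math


def classifier_coefficient(error_rate):
    """Coefficient of a basic classifier: 0.5 * log((1 - e_m) / e_m), equation (2)."""
    if not 0.0 < error_rate < 1.0:
        raise ValueError("error rate must lie strictly between 0 and 1")
    return 0.5 * math.log((1.0 - error_rate) / error_rate)


def reweight(weights, is_correct, alpha):
    """Increase the weights of misclassified samples, decrease the others, renormalize.

    Assumption: the usual exponential update w_i * exp(-alpha) for correct samples
    and w_i * exp(+alpha) for misclassified ones; the text only states the direction.
    """
    scaled = [w * math.exp(-alpha if ok else alpha)
              for w, ok in zip(weights, is_correct)]
    total = sum(scaled)
    return [w / total for w in scaled]


# Hypothetical toy round: four equally weighted samples, the last one misclassified.
weights = [0.25, 0.25, 0.25, 0.25]
is_correct = [True, True, True, False]

# Weighted classification error rate of the current basic classifier.
error_rate = sum(w for w, ok in zip(weights, is_correct) if not ok)
alpha = classifier_coefficient(error_rate)
new_weights = reweight(weights, is_correct, alpha)
```

In this toy round the misclassified sample ends up carrying half of the total weight, so the next basic classifier is pushed to fit it.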
Random Forest model. For a given batch of data, a given algorithm can generate only one tree, so the Bagging strategy is used to generate different data sets. Samples are drawn with replacement from the sample set (assumed to contain $N$ data points; because each drawn sample is put back, the number of data points remains $N$), and a classifier is built on the resampled data. These two steps are repeated $m$ times to obtain $m$ classifiers, which finally determine the class to which the data belong. The forest consists of the classifiers $\{f(a, b_k), k = 1, \ldots, N\}$, where the $\{b_k\}$ are independent and identically distributed, and each tree votes for the best class at input $a$ using the weight vector (equation 4)⁴¹. The trees in the random forest are denoted $\delta_1, \delta_2, \delta_3, \ldots, \delta_T$, and $w_i(x)$ is the average weight⁴¹. The mean square error (equation 5) is used to evaluate the model.

$$w_i(x) = \frac{1}{T}\sum_{t=1}^{T} w_i(x, \delta_t) \tag{4}$$

$$\mathrm{MSE} = \frac{1}{N}\sum_{t=1}^{N}\left(y_{\mathrm{true},t} - y_{\mathrm{predict},t}\right)^2 \tag{5}$$
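The two evaluation metrics used throughout this work, the mean square error of equation (5) and R², can be computed as in the following minimal sketch. The function names and the four activity values are hypothetical and serve only to show the arithmetic; R² is assumed to be the usual coefficient of determination.

```python
def mean_squared_error(y_true, y_pred):
    """Mean square error, equation (5): average squared prediction error."""
    n = len(y_true)
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical observed and predicted activities (not data from this study).
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.0, 3.0, 5.0]

mse = mean_squared_error(y_true, y_pred)
r2 = r_squared(y_true, y_pred)
```

Here a single prediction is off by one unit, which gives an MSE of 0.25 and an R² of 0.8.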
Gradient Boosting Regressor. The pseudo-residuals give the gradient descent direction of the loss function of the established model, which can be used to ensure better results in each model iteration. $x = \{x_1, x_2, x_3, \ldots, x_n\}$ represents the random "input" values, and $y$ is the random "output" value. $h(x_i; a)$ is the base learner, $\varphi(y, F(x))$ is the loss function, and $\tilde{y}_i$ is the current pseudo-residual. The pseudocode of Gradient Boosting²⁸ is as follows:

Algorithm 2: Gradient Boosting Regressor
$$F_0(x) = \arg\min_{\rho} \sum_{i=1}^{N} \varphi(y_i, \rho) \tag{6}$$
For $m = 1$ to $M$ do:
    $$\tilde{y}_i = -\left[\frac{\partial \varphi(y_i, F(x_i))}{\partial F(x_i)}\right]_{F(x) = F_{m-1}(x)}, \quad i = 1, \ldots, N \tag{7}$$
    $$a_m = \arg\min_{a, \beta} \sum_{i=1}^{N} \left[\tilde{y}_i - \beta h(x_i; a)\right]^2 \tag{8}$$
    $$\rho_m = \arg\min_{\rho} \sum_{i=1}^{N} \varphi\left(y_i, F_{m-1}(x_i) + \rho h(x_i; a_m)\right) \tag{9}$$
    $$F_m(x) = F_{m-1}(x) + \rho_m h(x; a_m) \tag{10}$$
endFor
end Algorithm
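The loop of Algorithm 2 can be illustrated with a small pure-Python sketch. Several simplifications are assumed and are not taken from the text: the loss is the squared error ½(y − F)², so the pseudo-residual of equation (7) reduces to y − F; the base learner is a one-split regression stump on one-dimensional inputs; and the line search of equation (9) is replaced by a fixed learning rate. The data and all function names are hypothetical.

```python
def fit_stump(xs, residuals):
    """Least-squares stump on 1-D inputs; returns (threshold, left_mean, right_mean)."""
    best = None
    # Every distinct value except the largest is a valid split point,
    # so both sides of the split are guaranteed to be non-empty.
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]


def stump_predict(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def gradient_boost(xs, ys, n_rounds, learning_rate):
    """Return (F_0, list of stumps) fitted by boosting on squared loss."""
    f0 = sum(ys) / len(ys)  # equation (6): the mean minimizes squared loss
    preds = [f0] * len(xs)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]  # equation (7)
        stump = fit_stump(xs, residuals)                # equation (8)
        stumps.append(stump)
        # Equation (10), with a fixed learning rate standing in for rho_m.
        preds = [p + learning_rate * stump_predict(stump, x)
                 for p, x in zip(preds, xs)]
    return f0, stumps


def predict(model, x, learning_rate):
    f0, stumps = model
    return f0 + learning_rate * sum(stump_predict(s, x) for s in stumps)


# Hypothetical toy data with a step between x = 3 and x = 4.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.0, 1.2, 0.9, 3.1, 2.9, 3.0]
rate = 0.1
model = gradient_boost(xs, ys, n_rounds=50, learning_rate=rate)

mean_y = sum(ys) / len(ys)
mse_before = sum((y - mean_y) ** 2 for y in ys) / len(ys)
mse_after = sum((y - predict(model, x, rate)) ** 2
                for x, y in zip(xs, ys)) / len(ys)
```

Each round fits the current residuals, so the training error falls below that of the constant initial model; a small learning rate slows this descent and needs more rounds to reach the same fit.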
Deep Learning model. Deep learning has made breakthroughs in image classification, speech recognition, and automatic driving²⁴. Depth refers to the number of layers in the deep learning model. These layers represent what the neural network model learns. The transformation performed by each layer of the neural network is parameterized by the weights of that layer. The loss function measures the output of the neural network. The weight values are fine-tuned by the optimizer (Figure 3) to reduce the loss value. The deep learning model learns all the representation layers jointly rather than successively. Once the model modifies an internal feature, all other features that depend on it automatically adjust and adapt accordingly. The model learns these representations by decomposing complex and abstract representations into many intermediate layers, each of which is only a simple transformation of the previous space. The Adam optimizer algorithm is as follows:

Algorithm 3: Adam optimizer
Learning rate $\epsilon = 0.0006$
Exponential decay rates for the moment estimates, $\rho_1, \rho_2 \in [0, 1)$
Small constant for numerical stability, $\delta = 10^{-8}$
Initial parameters $\theta$
Initialize the first and second moment variables $s = 0$, $r = 0$
Initialize the time step $t = 0$
While the stopping criterion is not met do
    Sample $m$ examples $x^{(1)}, \ldots, x^{(m)}$ from the training set, with targets $y^{(i)}$
    Compute the gradient:
    $$g \leftarrow \frac{1}{m}\nabla_{\theta}\sum_{i} L\left(f(x^{(i)}; \theta), y^{(i)}\right) \tag{11}$$
    $$t \leftarrow t + 1 \tag{12}$$
    Update the biased first moment estimate:
    $$s \leftarrow \rho_1 s + (1 - \rho_1) g \tag{13}$$
    Update the biased second moment estimate:
    $$r \leftarrow \rho_2 r + (1 - \rho_2) g \odot g \tag{14}$$
    Correct the bias of the first moment:
    $$\hat{s} \leftarrow \frac{s}{1 - \rho_1^{t}} \tag{15}$$
    Correct the bias of the second moment:
    $$\hat{r} \leftarrow \frac{r}{1 - \rho_2^{t}} \tag{16}$$
    Compute the update:
    $$\Delta\theta = -\epsilon \frac{\hat{s}}{\sqrt{\hat{r}} + \delta} \tag{17}$$
    Apply the update:
    $$\theta \leftarrow \theta + \Delta\theta \tag{18}$$
End while
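The update rule of Algorithm 3 can be traced on a one-parameter problem. The sketch below follows equations (11) to (18) step by step with the learning rate 0.0006 and δ = 10⁻⁸ given above; the decay rates ρ₁ = 0.9 and ρ₂ = 0.999 are the defaults suggested by Kingma and Ba and are an assumption here, since the text only states that they lie in [0, 1). The quadratic objective and the name `adam_minimize` are hypothetical.

```python
import math


def adam_minimize(grad, theta, lr=0.0006, rho1=0.9, rho2=0.999,
                  delta=1e-8, n_steps=1000):
    """Run n_steps Adam updates on a scalar parameter and return the result."""
    s = 0.0  # first moment estimate
    r = 0.0  # second moment estimate
    for t in range(1, n_steps + 1):                        # equation (12)
        g = grad(theta)                                    # equation (11)
        s = rho1 * s + (1.0 - rho1) * g                    # equation (13)
        r = rho2 * r + (1.0 - rho2) * g * g                # equation (14)
        s_hat = s / (1.0 - rho1 ** t)                      # equation (15)
        r_hat = r / (1.0 - rho2 ** t)                      # equation (16)
        theta += -lr * s_hat / (math.sqrt(r_hat) + delta)  # equations (17), (18)
    return theta


def quadratic_grad(theta):
    """Gradient of the hypothetical objective (theta - 3)^2, whose minimum is at 3."""
    return 2.0 * (theta - 3.0)


theta_start = 0.0
theta_final = adam_minimize(quadratic_grad, theta_start)
```

Because the bias-corrected ratio ŝ/√r̂ stays close to one in magnitude while the gradient keeps its sign, each step moves the parameter by roughly the learning rate, which is why such a small rate needs many iterations.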
Apart from the Dock score and the activities predicted by the Random Forest, AdaBoost Regressor, Gradient Boosting Regressor, and deep learning algorithms, network interactions were examined to find candidates related to multiple targets (Table 2). The top 50 compounds for the different targets (Table 7 and Table 8) were then intersected to identify multifunctional drug candidates (Figure 5). With this cross-screening method, the candidates we selected influence as many related targets as possible. The yellow dots represent proteins, and the blue squares are small molecules. The multi-target compounds are marked with a box.
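The cross-screening step amounts to intersecting the per-target hit lists. A minimal sketch follows, assuming each target has a ranked list of compound names; the placeholder names (`cmpd_A` and so on), the short lists, and the function `multi_target_compounds` are hypothetical and are not results from this study.

```python
from collections import defaultdict


def multi_target_compounds(top_hits, min_targets=2):
    """Map each compound found in at least `min_targets` hit lists to its targets."""
    targets_by_compound = defaultdict(set)
    for target, compounds in top_hits.items():
        for compound in compounds:
            targets_by_compound[compound].add(target)
    return {compound: sorted(targets)
            for compound, targets in targets_by_compound.items()
            if len(targets) >= min_targets}


# Hypothetical top-ranked compounds per target (placeholders, not study data).
top_hits = {
    "MMP13": ["cmpd_A", "cmpd_B", "cmpd_C"],
    "TP53": ["cmpd_B", "cmpd_D"],
    "CDKN2A": ["cmpd_B", "cmpd_C", "cmpd_E"],
}

shared = multi_target_compounds(top_hits)
```

With these placeholder lists, `cmpd_B` is shared by all three targets and `cmpd_C` by two, which corresponds to the compounds drawn in the center of the network.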
Four receptor-ligand complexes were further validated with 300 ns of MD. Four compounds were selected according to their Dock score, H-bonds, and pi-pi interactions, based on the intersection information (Figure 8). The four complexes were Nazlinin-MMP13, Subaphylline-TP53, Adrenaline-CDKN2A, and E41 (control)-MMP13. The explicit TIP3P water model was added before the MD stage, and sodium and chloride ions were added to the system to mimic the humoral environment. Energy minimization, NVT equilibration, and NPT equilibration were then performed. The steepest descent algorithm was used for energy minimization, and all bonds were constrained with the LINCS algorithm in the NVT and NPT stages. Particle Mesh Ewald (PME) was used for long-range electrostatics, and temperature coupling used V-rescale, a modified Berendsen thermostat, during NVT and NPT. MD simulations were run for 300 ns to evaluate the binding stability. The periodic boundary conditions were set as 3D PBC, and dispersion correction was applied to the system. Since the TCM compound candidates did not achieve the desired effect, the idea of screening peptides was put forward. For the different targets, peptides in the bioactivity peptides database and the SATPdb database⁴² were screened with the LigandFit protocol in the DS software. The ZDOCK program was used to generate different conformations between biological macromolecules. The top pose in the top cluster was deemed the most likely binding mode, and the poses with a root mean
square deviation (RMSD) below 1 were refined with the RDOCK program. The protein-protein interaction interface was further analyzed to characterize the binding interaction. Inhibitors of cdk6 were screened from SATPdb based on the structure of p16, a tumor suppressor. The candidate for the Bcl2 protein, SATPdb26921, was further designed using the Calculate Mutation Energy (Stability) protocol. Saturation mutagenesis provided the full energy change of mutation. Residues 15 and 47 were mutated to cysteine (Cys) to form a disulfide bond and thereby enhance spatial stability. Peptide aggregation was evaluated with the aggregation score and developability indices (DI) to identify aggregation within 5 Å and 10 Å. Aggregation-prone hydrophobic regions were replaced based on the mutation energy results. After energy minimization and refinement of the loops and side chains, ZDOCK and RDOCK provided the interaction modes as well as the assessment scores. The ligand for protein p53 was designed on the basis of the designed Bcl2 drug and further optimized. P16 inhibits the activation of cdk6; it binds the catalytic cleft, opposite the cyclin binding site¹³. Interestingly, the BCL2 peptide candidate displayed a synergistic effect by binding at different sites. Binding at both the catalytic cleft and the cyclin binding site of cdk6 inhibited the interaction with cyclin. E3 ubiquitin-protein ligase Mdm2 inhibits the activity of p53, which suggests that an MDM2 inhibitor could be developed. The structure of MDM2-p53 (1YCQ) was employed⁴³. 300 ns MD
simulations were run to evaluate the behavior of the complexes. Three proteins, CDKN2A, TP53, and BCL2, were confirmed. The false discovery rate of pathways in cancer was 1.73e-09, indicating that this pathway is highly related to these proteins. Interestingly, the MMP13 protein was not directly related to cancer. The combined score between MMP13 and TP53 was 0.944, and MMP13 likely inhibits natural TP53 to cause cancer (Figure 4). The melanoma pathway (map05218) in the KEGG database displays the relationships between the selected targets, especially the tumor suppressor pathways. MITF and TP53 were implicated in further melanoma progression. Data from the literature and databases showed a high correlation between these targets and tumors. All of the top 50 candidates for the different targets were intersected in a network (Figure 5), with a focus on multi-target compounds. Yellow points represent the target proteins, and blue squares represent the different TCM ligands. Several molecules related to multiple targets are shown in the center. The top 10 ligands and the control E41 were assessed with Dock score, -PMF, -PLP1, and -PLP2. The hydrogen-bond-forming residues and the number of H-bonds are provided (Table 1). The introduction of hydrogen bonds would greatly improve the stability of binding. The Dock scores of the compounds screened from the TCM database, as well as their other score functions, were much higher than those of the control, so there was a chance that we could discover better
binding substrates. The disorder validation result showed a reasonable structure for docking (Figure 6); values lower than 0.5 indicate a region that is more stable, more conserved, and more credible. The Dock score, H-bond, and pi-pi interaction information provided a comprehensive assessment.

AdaBoost Regressor model. Most of the predicted activities stayed in the proper range (Figure 10a). As can be seen from the histogram, the errors are mostly distributed near zero, which usually indicates that the model is a good fit. The n_estimators value was set to 25. The mean square error (MSE) is 0.020 on the training set and 0.235 on the test set. R² is 0.973 on the training set and 0.781 on the test set. The suitable model was used to predict the activity of the screening candidates. A higher predicted activity gives greater confidence in a candidate.

Gradient Boosting Regressor model. The n_estimators value, which is the number of boosting iterations or the number of weak learners, was set to 650, and the learning_rate was 0.003. The smaller the learning_rate is, the smaller the test error is. The maximum depth of each decision tree (tree depth does not include the root) was set to 4. Some predicted points are scattered (Figure 10b), but overall the predictions are good. The MSE is 9.83e-08 on the training set and 0.253 on the test set. R² is 1 on the training set and 0.765 on the test set.
Random Forest model. The model was built using Python and the Jupyter Notebook tool. The data for the 41 inhibitors were preprocessed. First, principal component analysis (PCA) was used to reduce the dimensionality of the 204-dimensional data; features with variance greater than 0.05 were retained, which gave 83 features. The data were then standardized to a mean of 0 and a variance of 1. Lasso feature selection was used to obtain features with small correlation coefficients and good orthogonality. The alpha of Lasso was set to 0.171, and 7 features were obtained by Lasso feature selection. According to the Pearson correlation coefficient matrix, the correlation coefficients of the selected features were small and their orthogonality was good. In the Random Forest model, the n_estimators value was set to 7 and the random_state was 1. The two-dimensional residual distribution is quite random and uniform, with the points randomly distributed around the horizontal axis (Figure 10c). This seems to indicate that the model works well. The MSE is 0.056 on the training set and 0.211 on the test set. R² is 0.922 on the training set and 0.804 on the test set. As with the docking score, the predicted activities of the candidates were better than that of the control.

Deep Learning model. To better predict the activity values of the candidates, a simple 4-layer fully connected neural network using the ReLU (Rectified Linear Unit) activation function was constructed. We used a very small network, which
contains two hidden layers. The last layer of the network has only one unit and no activation; it is a linear layer. Generally speaking, the less training data there is, the more serious the over-fitting. Dropout is one of the most effective and commonly used regularization methods for neural networks. Applying dropout to a layer means randomly discarding (setting to 0) some output features of that layer during training. Dropout was applied to the second and third layers (with a rate of 0.4 for the second and 0.6 for the third) to reduce over-fitting. We used the Adam optimizer (Kingma & Ba, 2014) with a learning rate of 0.0006. We ran 350 experiments on a CPU. The scatter plot (Figure 11) shows the subset of results for which R² is greater than 0 on both the training set and the test set. Among all the experiments, we obtained one model with an R² of 0.90 on the training set and 0.810 on the test set. The predicted bioactivity values are shown in Table 1. For all algorithms, R² is at least 0.9 on the training set and more than 0.7 on the test set, which shows good validation of the predicted results.

The voting system (Table 3) was established from the multiple QSAR models (RF, ABR, GBR, DL) and the Dock-score validations, and we selected the multi-target compounds from the top ten candidates. The Nazlinin-MMP13 complex included the most key residues for pi-pi interactions as well as the highest total score. The Subaphylline-TP53 complex had the largest number (5) of H-bonds among all TP53 complexes. The Adrenaline-CDKN2A complex had 6 H-bonds, and 3 residues (Asp14, Pro41
and Asn42) formed the H-bonds (Figure 2). Nazlinin, Subaphylline and Adrenaline (Figure 7) would be further verified by molecular dynamics. The interactions in the complexes, shown in 2D and 3D views, included hydrogen bonds, pi interactions and van der Waals forces (Figure 8). Some hydrophobic effects also contributed to the stability of the complexes (Figure 9). However, several unfavorable bumps indicated a risk that the ligands would “fly away” during MD.

Given the disappointing results for the small molecules, our research interest turned to developing peptides. Six proteins were screened against a bioactive peptide library. The top 5 peptides scored mostly higher than the small-molecule candidates, which implied development potential. The Dock score of the lead peptide with MMP13 (207.901) was far higher than that of the control (57.629). Oligopeptides behaved poorly during MD, so long-chain peptides were considered. The saturation mutagenesis results showed which amino acids could be changed. The designed peptide for Bcl2 displayed a double binding mechanism, which gave reason to believe that it could be developed into a better inhibitor. We wanted to design a peptide targeting the catalytic cleft of CDK6, but this peptide ended up binding at the cyclin binding site. This was a beneficial outcome, because the peptide could directly inhibit the interaction between CDK6 and cyclin. The sequences of the potential peptides are provided (Table 4). Some of the docking scores and energy terms were used to assess the
stability of the complexes. The mutation energy from 26921 to S2 was -18.18 kcal/mol in Bcl2, and that from S2 to S3 was -13.65 kcal/mol in p53.

MD validation over 300 ns displayed the interactions between the various receptors and ligands during the simulation period. However, all of our potential candidates failed this test: all of the ligands “flew away” by the end of MD. Representative conformations were identified by cluster analysis of the last 10 ns (Figure 12). The MD results reminded us that candidates may not bind stably even if the docked poses display strong interactions. What went wrong with our candidates? The conformational changes deserve attention, and an effective summary of them would provide principles for further design.

Double site effect. Peptide 26921 was screened out as a candidate to bind the BH1 domain of Bcl2. An optimized peptide, S2, was found to show a double site effect, also binding the BH3 and BH4 domains (Figure 13a). The two binding modes were named O and T. The RMSD of the complex rose during the first 25 ns because of a conformational turn of T. O went through 3 stages. The N terminal unwound during the first 200 ns; from 200 ns to 230 ns the disordered N terminal searched for a stable structure, for example by cyclizing itself; and finally the N terminal could bind to the Bcl2 protein, which improved the interaction (Figure 13d), consistent with the changes in SASA, RMSD and gyration radius (Figure 13b-c, cyan). As for T, the main binding region was the N terminal, which differed from O (where the N terminal only assisted). An “8”
type structure was present for most of the MD time rather than the original “P” type. The N terminal could bind stably, but the cyclic C terminal did not bind well and tended to form a “turn” (Figure 14a). The φ and ψ angles of the node residues confirmed the “turn”; these two amino acids tended to form β secondary structure (Figure 14b). The cavity of the Bcl2 protein provided an anchor point (Figure 14c). The BH1, BH3 and BH4 domains displayed low RMSF because they were bound to the ligand and therefore less flexible. Likewise, O bound through the cyclic C terminal, and these residues had low RMSF values. T bound through the N terminal (Figure 14d).

Based on the probability distribution of the RMSD and gyration radius values in p53-S3, the Gibbs free energy could be estimated (Figure 15a). The time point with low free energy was set as a reference. The C terminal of S3 tended to be disordered, which could influence the structure of p53. The main hydrogen bonds are displayed (Figure 15b). The final structure was similar to the low-energy structure. The MSD of S3 changed significantly at 230 ns because of the decentralization of the C terminal (Figure 15c).

From non-competitive to competitive peptide. CDK6 can be inhibited by p16 (CDKN2A). p16 binds at the catalytic cleft, opposite to where the cyclin binds 13. Two peptides, p16 (yellow) and S4 (pink), are displayed (Figure 16a). S4 was designed from 23678, a peptide that was screened on the basis of the p16 binding site. However, S4 bound at the cyclin site. The ATP binding site K43 (red), proton acceptor D145 (yellow),
nucleotide binding region IGEGAYGKV (19-27, orange), phosphorylation site T177 (green) and de-phosphorylation site Y24 (magenta) were all near the ligand S4 (300 ns). S4 could therefore block many CDK6 functions, including cyclin binding. However, the ligand could not contact T177 directly, which could be addressed by further modification. The interaction with the proton acceptor region could also be optimized to make it more stable. The S4 binding pathway indicated which amino acids it could reach (Figure 16c). The residue distance matrix is displayed (Figure 16d).

Flexible area rebuild. MDM2 and its ligands S5 and S6 were colored by the Debye-Waller factor, which displayed the flexible regions (red) of the conformation (Figure 17a). Apart from the terminals, residues 20-28 of S6 changed conformation constantly during MD. Like S6, S5 altered its structure, but over a shorter region (20-22). While the two parallel terminals interacted with MDM2, the residues in the middle began to change conformation. The sharp jump of the cyan line at 25 ns corresponded to the loosening of the tight structure, which the gyration radius reflects intuitively (Figure 17b-c).

The occupancy, maximum distance and minimum distance of the significant hydrogen bonds in the complexes are displayed (Table 5). A high H-bond occupancy represented binding potential. High-occupancy hydrogen bonds (like the Bcl2-O complex H-bond) reflected a steady interaction mode (Figure 18). On the contrary, the acquired H-bond (like MDM2-S6) resulted from a conformational change (like the
flexible residues 20-28 mentioned above). The Molecular Mechanics/Poisson-Boltzmann Surface Area (MM/PBSA) calculation provided binding free energy information; most of the complexes reached a lower energy after 300 ns of MD (Table 6).

The 300 ns MD simulations displayed poor results despite favorable docking results, which reminded us that conclusions drawn from short-time MD can be unreliable. Peptides could be an alternative option because they offer more binding potential. It is hard to explain why our small molecules failed: they had a higher docking score than the control (E41, IC50 = 3 nM) and might have been expected to become nanomolar-level drugs. We put forward, for the first time, de novo peptides acting on a pathway network as an anti-tumor strategy to overcome multidrug resistance. MMP13 is associated with arthritis, and this is the first time that drugs have been designed against it for tumors. The theory of inflammation-induced tumorigenesis has been reported in many studies 44-46. Inflammation-related proteins could be developed as tumor targets. It was clear that inflammation proteins influence tumor-related proteins in the pathway. Bcl2 inhibitors are classic cancer drugs 47-49, but designing a small-molecule drug for Bcl2 is a challenge because the Bcl2 protein acts through protein-protein interactions. This is like a finger that cannot completely block two palms. The designed peptide could act on the BH1, BH3 and BH4 domains at the same time and could therefore intervene more effectively. A novel polypeptide was designed based on the endogenous inhibitor p16 to obtain a non-ATP
competitive inhibitor; however, we ended up with a competitive inhibitor. Selecting CDK6 is probably a way to reduce toxicity, which is why we wanted to develop a non-competitive peptide. Our CDK6-inhibiting peptide is expected to be useful, and the next step is to increase its selectivity. Interestingly, MDM2 inhibits the activity of p53, yet the p53 peptide (QETFSDLWKLLP) can block MDM2. Inhibiting MDM2 to release the action of p53 is a promising approach, but it should not be the mutated p53 that most researchers emphasize. As for S5 and S6, it was interesting that the flexible area became another binding site, which improved the binding ability. Few researchers pin their hopes on flexible regions because they are less controllable. Multidrug resistance (MDR) is a problem that must be considered in the study of cancer drugs. Our approach was based on network analysis and de novo peptide design. The use of synergistic drugs reduces both side effects and the occurrence of drug resistance. The peptide segments studied are promising.

The findings of this research point to a direction for development, in patients with tumors, of MMP13 and in some cases of the related overexpressed proteins. We propose two methods to find lead compounds for the tumor target MMP13. First, we used a novel deep learning model and other algorithms to find the best potential drugs, and potential inhibitors were selected using computer-aided drug design. However, the 300 ns MD simulations displayed poor results. The
candidates “flew away”. We therefore put forward another approach, peptide design. Peptides were found in a few of the tumor cases that we analyzed. The inflammation-cancer transition represented a potential situation, and some signaling pathways indeed displayed an interaction. In this context, it is worth noting that blocking multiple related proteins at the same time meets the clinical therapeutic principle. A double site effect appeared on S2, a peptide designed from a known inhibitor, in complex with Bcl2. S3, a peptide designed with reference to the endogenous inhibitor p16, competed against cyclin when binding to CDK6. The MDM2 inhibitors S5 and S6 were derived from the p53 structure and bound MDM2 stably. The flexible regions of peptides S5 and S6 enhanced the binding ability by changing their own conformation, which was unexpected.

Peptides and peptidomimetics have recently attracted attention in the treatment of cancer 32. Peptide-based therapeutics have many advantages. Peptides penetrate easily into deep tissues, have low immunogenicity and can be synthesized rapidly and efficiently 31. Compared with large proteins, peptides are simpler and more reproducible. In addition, peptides show greater stability under different storage conditions. Peptides also have some drawbacks, such as rapid renal clearance, poor enzymatic stability and a secondary structure that is difficult to maintain 31. In this manuscript, 300 ns MD indicated that the selected peptides can bind their protein targets with high affinity and specificity. The designed peptides (S2, S3, S5, and S6) have drug potential for treating tumors. Also,
we have contacted the Shenzhen Institute of Advanced Research, Chinese Academy of Sciences, and are cooperating with it to test whether these peptides can effectively inhibit MMP13.
ASSOCIATED CONTENT
Supporting Information
Source code for the algorithms is available in the Supporting Information.

AUTHOR INFORMATION
Corresponding Author
*E-mail: [email protected]
Notes
The authors declare no competing financial interest.
ACKNOWLEDGEMENT
This work was supported by the Guangzhou science and technology fund (Grant No 201803010072), the Science, Technology & Innovation Commission of Shenzhen Municipality (JCYJ20170818165305521), and China Medical University Hospital (DMR-107-110). We also acknowledge the start-up funding from the SYSU “Hundred Talent Program”.
References:
1. Schwitalla, S., Tumor cell plasticity: the challenge to catch a moving target. Journal of gastroenterology 2014, 49 (4), 618-27.
2. Chong, C. R.; Janne, P. A., The quest to overcome resistance to EGFR-targeted therapies in cancer. Nat Med 2013, 19 (11), 1389-400.
3. Calandra, T.; Roger, T., Macrophage migration inhibitory factor: a regulator of innate immunity. Nature reviews. Immunology 2003, 3 (10), 791-800.
4. Zheng, Y.; Li, X.; Qian, X.; Wang, Y.; Lee, J. H.; Xia, Y.; Hawke, D. H.; Zhang, G.; Lyu, J.; Lu, Z., Secreted and O-GlcNAcylated MIF binds to the human EGF receptor and inhibits its activation. Nature cell biology 2015, 17 (10), 1348-55.
5. Mendonsa, A. M.; VanSaun, M. N.; Ustione, A.; Piston, D. W.; Fingleton, B. M.; Gorden, D. L., Host and tumor derived MMP13 regulate extravasation and establishment of colorectal metastases in the liver. Molecular cancer 2015, 14, 49.
6. Dumortier, M.; Ladam, F.; Damour, I.; Vacher, S.; Bieche, I.; Marchand, N.; de Launoit, Y.; Tulasne, D.; Chotteau-Lelievre, A., ETV4 transcription factor and MMP13 metalloprotease are interplaying actors of breast tumorigenesis. Breast cancer research : BCR 2018, 20 (1), 73.
7. Ruan, G.; Xu, J.; Wang, K.; Wu, J.; Zhu, Q.; Ren, J.; Bian, F.; Chang, B.; Bai, X.; Han, W.; Ding, C., Associations between knee structural measures, circulating inflammatory factors and MMP13 in patients with knee osteoarthritis. Osteoarthritis and cartilage 2018, 26 (8), 1063-1069.
8. Hao da, C.; Xiao, P. G., Network pharmacology: a Rosetta Stone for traditional Chinese medicine. Drug development research 2014, 75 (5), 299-312.
9. Wu, Q.; Yang, Z.; Nie, Y.; Shi, Y.; Fan, D., Multi-drug resistance in cancer chemotherapeutics: mechanisms and lab approaches. Cancer letters 2014, 347 (2), 159-66.
10. LaPak, K. M.; Burd, C. E., The molecular balancing act of p16(INK4a) in cancer and aging. Molecular cancer research : MCR 2014, 12 (2), 167-83.
11. Byeon, I. J.; Li, J.; Ericson, K.; Selby, T. L.; Tevelev, A.; Kim, H. J.; O'Maille, P.; Tsai, M. D., Tumor suppressor p16INK4A: determination of solution structure and analyses of its interaction with cyclin-dependent kinase 4. Molecular cell 1998, 1 (3), 421-31.
12. Matsuda, Y.; Ichida, T., p16 and p27 are functionally correlated during the progress of hepatocarcinogenesis. Medical molecular morphology 2006, 39 (4), 169-75.
13. Russo, A. A.; Tong, L.; Lee, J. O.; Jeffrey, P. D.; Pavletich, N. P., Structural basis for inhibition of the cyclin-dependent kinase Cdk6 by the tumour suppressor p16INK4a. Nature 1998, 395 (6699), 237-43.
14. Sherr, C. J.; Beach, D.; Shapiro, G. I., Targeting CDK4 and CDK6: From Discovery to Therapy. Cancer discovery 2016, 6 (4), 353-67.
15. Cho, Y.; Gorina, S.; Jeffrey, P. D.; Pavletich, N. P., Crystal structure of a p53 tumor suppressor-DNA complex: understanding tumorigenic mutations. Science 1994, 265 (5170), 346-55.
16. Conover, C. A., The IGF-p53 connection in cancer. Growth Horm IGF Res 2018, 39, 25-28.
17. Pappas, K.; Xu, J.; Zairis, S.; Resnick-Silverman, L.; Abate, F.; Steinbach, N.; Ozturk, S.; Saal, L. H.; Su, T.; Cheung, P.; Schmidt, H.; Aaronson, S.; Hibshoosh, H.; Manfredi, J.; Rabadan, R.; Parsons, R., p53 Maintains Baseline Expression of Multiple Tumor Suppressor Genes. Molecular cancer research : MCR 2017, 15 (8), 1051-1062.
18. Gurpinar, E.; Vousden, K. H., Hitting cancers' weak spots: vulnerabilities imposed by p53 mutation. Trends in cell biology 2015, 25 (8), 486-95.
19. Yeo, K. J.; Jee, J. G.; Hwang, E.; Kim, E. H.; Jeon, Y. H.; Cheong, H. K., Interaction between human angiogenin and the p53 TAD2 domain and its implication for inhibitor discovery. FEBS letters 2017, 591 (23), 3916-3925.
20. Raj, N.; Attardi, L. D., The Transactivation Domains of the p53 Protein. Cold Spring Harbor perspectives in medicine 2017, 7 (1).
21. Dey, J.; Deckwerth, T. L.; Kerwin, W. S.; Casalini, J. R.; Merrell, A. J.; Grenley, M. O.; Burns, C.; Ditzler, S. H.; Dixon, C. P.; Beirne, E.; Gillespie, K. C.; Kleinman, E. F.; Klinghoffer, R. A., Voruciclib, a clinical stage oral CDK9 inhibitor, represses MCL-1 and sensitizes high-risk Diffuse Large B-cell Lymphoma to BCL2 inhibition. Scientific reports 2017, 7 (1), 18007.
22. Klein, S.; Abraham, M.; Bulvik, B.; Dery, E.; Weiss, I. D.; Barashi, N.; Abramovitch, R.; Wald, H.; Harel, Y.; Olam, D.; Weiss, L.; Beider, K.; Eizenberg, O.; Wald, O.; Galun, E.; Pereg, Y.; Peled, A., CXCR4 Promotes Neuroblastoma Growth and Therapeutic Resistance through miR-15a/16-1-Mediated ERK and BCL2/Cyclin D1 Pathways. Cancer research 2018, 78 (6), 1471-1483.
23. Kremer, K. N.; Peterson, K. L.; Schneider, P. A.; Meng, X. W.; Dai, H.; Hess, A. D.; Smith, B. D.; Rodriguez-Ramirez, C.; Karp, J. E.; Kaufmann, S. H.; Hedin, K. E., CXCR4 chemokine receptor signaling induces apoptosis in acute myeloid leukemia cells via regulation of the Bcl-2 family members Bcl-XL, Noxa, and Bak. The Journal of biological chemistry 2013, 288 (32), 22899-914.
24. LeCun, Y.; Bengio, Y.; Hinton, G., Deep learning. Nature 2015, 521, 436.
25. Ma, J.; Sheridan, R. P.; Liaw, A.; Dahl, G. E.; Svetnik, V., Deep Neural Nets as a Method for Quantitative Structure–Activity Relationships. Journal of Chemical Information and Modeling 2015, 55 (2), 263-274.
26. Van Echelpoel, W.; Goethals, P. L. M., Variable importance for sustaining macrophyte presence via random forests: data imputation and model settings. Scientific Reports 2018, 8 (1), 14557.
27. Huang, Q.; Chen, Y.; Liu, L.; Tao, D.; Li, X., On Combining Biclustering Mining and AdaBoost for Breast Tumor Classification. IEEE Transactions on Knowledge and Data Engineering 2019, 1-1.
28. Zhang, C.; Zhang, Y.; Shi, X.; Almpanidis, G.; Fan, G.; Shen, X., On Incremental Learning for Gradient Boosting Decision Trees. Neural Processing Letters 2019.
29. Chen, Y.-C., Beware of docking! 2014; Vol. 36.
30. Hess, B.; Kutzner, C.; van der Spoel, D.; Lindahl, E., GROMACS 4: Algorithms for Highly Efficient, Load-Balanced, and Scalable Molecular Simulation. 2008; Vol. 4, p 435-447.
31. Soudy, R.; Byeon, N.; Raghuwanshi, Y.; Ahmed, S.; Lavasanifar, A.; Kaur, K., Engineered Peptides for Applications in Cancer-Targeted Drug Delivery and Tumor Detection. Mini Rev Med Chem 2017, 17 (18), 1696-1712.
32. Ladner, R.; Sato, A.; Gorzelany, J.; de Souza, M., Phage display-derived peptides as therapeutic alternatives to antibodies. 2004; Vol. 9, p 525-9.
33. Szklarczyk, D.; Santos, A.; von Mering, C.; Jensen, L. J.; Bork, P.; Kuhn, M., STITCH 5: augmenting protein–chemical interaction networks with tissue and affinity data. Nucleic Acids Research 2015, 44 (D1), D380-D384.
34. Kanehisa, M., The KEGG database. Novartis Foundation symposium 2002, 247, 91-101; discussion 101-3, 119-28, 244-52.
35. Burley, S. K.; Berman, H. M.; Christie, C.; Duarte, J. M.; Feng, Z.; Westbrook, J.; Young, J.; Zardecki, C., RCSB Protein Data Bank: Sustaining a living digital data resource that enables breakthroughs in scientific research and biomedical education. Protein science : a publication of the Protein Society 2018, 27 (1), 316-330.
36. Tommasi, R. A.; Weiler, S.; McQuire, L. W.; Rogel, O.; Chambers, M.; Clark, K.; Doughty, J.; Fang, J.; Ganu, V.; Grob, J.; Goldberg, R.; Goldstein, R.; Lavoie, S.; Kulathila, R.; Macchia, W.; Melton, R.; Springer, C.; Walker, M.; Zhang, J.; Zhu, L.; Shultz, M., Potent and selective 2-naphthylsulfonamide substituted hydroxamic acid inhibitors of matrix metalloproteinase-13. Bioorganic & medicinal chemistry letters 2011, 21 (21), 6440-5.
37. Baud, M. G. J.; Bauer, M. R.; Verduci, L.; Dingler, F. A.; Patel, K. J.; Horil Roy, D.; Joerger, A. C.; Fersht, A. R., Aminobenzothiazole derivatives stabilize the thermolabile p53 cancer mutant Y220C and show anticancer activity in p53-Y220C cell lines. European journal of medicinal chemistry 2018, 152, 101-114.
38. Ku, B.; Liang, C.; Jung, J. U.; Oh, B. H., Evidence that inhibition of BAX activation by BCL-2 involves its tight and preferential interaction with the BH3 domain of BAX. Cell research 2011, 21 (4), 627-41.
39. Venkatachalam, C. M.; Jiang, X.; Oldfield, T.; Waldman, M., LigandFit: a novel method for the shape-directed rapid docking of ligands to protein active sites. Journal of molecular graphics & modelling 2003, 21 (4), 289-307.
40. Fuerst, R.; Yong Choi, J.; Knapinska, A. M.; Smith, L.; Cameron, M. D.; Ruiz, C.; Fields, G. B.; Roush, W. R., Development of matrix metalloproteinase-13 inhibitors – A structure-activity/structure-property relationship study. Bioorganic & Medicinal Chemistry 2018, 26 (18), 4984-4995.
41. Rahman, R.; Dhruba, S. R.; Ghosh, S.; Pal, R., Functional random forest with applications in dose-response predictions. Scientific Reports 2019, 9 (1), 1628.
42. Singh, S.; Chaudhary, K.; Dhanda, S. K.; Bhalla, S.; Usmani, S. S.; Gautam, A.; Tuknait, A.; Agrawal, P.; Mathur, D.; Raghava, G. P., SATPdb: a database of structurally annotated therapeutic peptides. Nucleic acids research 2016, 44 (D1), D1119-26.
43. Kussie, P. H.; Gorina, S.; Marechal, V.; Elenbaas, B.; Moreau, J.; Levine, A. J.; Pavletich, N. P., Structure of the MDM2 oncoprotein bound to the p53 tumor suppressor transactivation domain. Science 1996, 274 (5289), 948-53.
44. Elinav, E.; Nowarski, R.; Thaiss, C. A.; Hu, B.; Jin, C.; Flavell, R. A., Inflammation-induced cancer: crosstalk between tumours, immune cells and microorganisms. Nature reviews. Cancer 2013, 13 (11), 759-71.
45. Hu, B.; Elinav, E.; Huber, S.; Booth, C. J.; Strowig, T.; Jin, C.; Eisenbarth, S. C.; Flavell, R. A., Inflammation-induced tumorigenesis in the colon is regulated by caspase-1 and NLRC4. Proceedings of the National Academy of Sciences of the United States of America 2010, 107 (50), 21635-40.
46. Zhang, X.; Ai, F.; Li, X.; She, X.; Li, N.; Tang, A.; Qin, Z.; Ye, Q.; Tian, L.; Li, G.; Shen, S.; Ma, J., Inflammation-induced S100A8 activates Id3 and promotes colorectal tumorigenesis. International journal of cancer 2015, 137 (12), 2803-14.
47. Iyer, D.; Vartak, S. V.; Mishra, A.; Goldsmith, G.; Kumar, S.; Srivastava, M.; Hegde, M.; Gopalakrishnan, V.; Glenn, M.; Velusamy, M.; Choudhary, B.; Kalakonda, N.; Karki, S. S.; Surolia, A.; Raghavan, S. C., Identification of a novel BCL2-specific inhibitor that binds predominantly to the BH1 domain. The FEBS journal 2016, 283 (18), 3408-37.
48. Kawashima-Goto, S.; Imamura, T.; Tomoyasu, C.; Yano, M.; Yoshida, H.; Fujiki, A.; Tamura, S.; Osone, S.; Ishida, H.; Morimoto, A.; Kuroda, H.; Hosoi, H., BCL2 Inhibitor (ABT-737): A Restorer of Prednisolone Sensitivity in Early T-Cell Precursor-Acute Lymphoblastic Leukemia with High MEF2C Expression? PloS one 2015, 10 (7), e0132926.
49. Vartak, S. V.; Hegde, M.; Iyer, D.; Gaikwad, S.; Gopalakrishnan, V.; Srivastava, M.; Karki, S. S.; Choudhary, B.; Ray, P.; Santhoshkumar, T. R.; Raghavan, S. C., A novel inhibitor of BCL2, Disarib abrogates tumor growth while sparing platelets, by activating intrinsic pathway of apoptosis. Biochemical pharmacology 2016, 122, 10-22.
Table 1. Dock score and RF, ABR, GBR and DL predicted activities of the top 10 TCM database candidates towards MMP13.

                                                                                              Predicted activity
Name                                      Dock score  –PMF    –PLP1   –PLP2   H-bond quantity  RF        ABR       GBR       DL
Nazlinin                                  135.895     63.240  49.490  52.420  4                4.430916  4.056     4.304040  4.110227
Subaphylline                              110.056     50.590  28.800  19.770  9                4.506833  3.83      3.876606  3.204858
Ochrolifuanine A                          108.029     55.470  74.170  75.910  4                5.018305  5.058230  5.080192  5.560901
N-Methyl tyramine-O-α-L-rhamnopyranoside  107.707     57.920  52.870  45.740  2                4.735444  4.125     4.294524  4.813676
Usambarine                                105.303     66.820  77.850  77.610  2                4.977472  5.214     5.250005  5.268369
Tubulosine                                103.955     39.680  67.200  69.120  3                4.982027  5.09125   4.996268  8.218174
Emetine                                   103.495     36.960  55.640  60.170  7                4.9735    5.250062  5.278868  9.047226
Alangimarckine                            102.011     36.570  66.140  66.370  4                4.979055  5.1322    5.069089  6.467669
Vitamin B1                                98.292      57.010  65.330  52.850  2                4.345916  3.725333  4.251917  5.065189
Hydrachine A                              89.277      63.290  50.410  46.090  1                4.910305  5.279666  5.215940  3.978405
E41*                                      42.241      42.690  61.880  58.330  2                3.955305  4.866     5.104626  5.135390

*E41: control
RF: Random Forest; ABR: AdaBoost Regressor; GBR: Gradient Boosting; DL: Deep Learning
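The DL values in Table 1 come from the small fully connected network described in the Deep Learning model paragraph. The following is a minimal, illustrative forward pass of such a network in plain Python; it is not the authors' code, which is in the Supporting Information. It assumes that the "second and third layers" are the two hidden layers, that dropout uses the common inverted scaling, and that the input has 7 features with hidden widths of 16 and 8. These sizes, the random weights, the example input and all function names are hypothetical, and training with Adam is not shown.

```python
import random

def relu(values):
    """Rectified linear unit applied element-wise."""
    return [max(0.0, v) for v in values]

def dense(values, weights, biases):
    """Fully connected layer: one weight row and one bias per output unit."""
    return [sum(w * v for w, v in zip(row, values)) + b
            for row, b in zip(weights, biases)]

def dropout(values, rate, training, rng):
    """Randomly set outputs to 0 during training (inverted scaling is an assumption)."""
    if not training or rate == 0.0:
        return list(values)
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in values]

def init_layer(n_in, n_out, rng):
    """Hypothetical random initialisation; the source does not state one."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def forward(features, layers, training=False, rng=None):
    """Input -> hidden (ReLU, dropout 0.4) -> hidden (ReLU, dropout 0.6) -> linear unit."""
    rng = rng or random.Random(0)
    (w1, b1), (w2, b2), (w3, b3) = layers
    h = dropout(relu(dense(features, w1, b1)), 0.4, training, rng)
    h = dropout(relu(dense(h, w2, b2)), 0.6, training, rng)
    return dense(h, w3, b3)[0]  # single output unit, no activation

rng = random.Random(1)
layers = [init_layer(7, 16, rng), init_layer(16, 8, rng), init_layer(8, 1, rng)]
example = [0.1, -0.3, 0.5, 0.0, 1.2, -0.7, 0.4]  # made-up standardized features
predicted_activity = forward(example, layers)     # dropout is inactive at prediction time
```

Dropout is only active when `training=True`; at prediction time every unit is kept, so repeated calls on the same input return the same value.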
Table 2. The intersection of the TCM candidates from the MMP13, CDKN2A, TP53 and BCL2 docking results. TCM name
Nazlinin
Subaphylline
Emetine Hernovine Adrenaline 1,7-Bis(4hydroxyphenyl)1,4,6-heptatrien3-one Diiodotyrosine Mescaline
Code a b
PDB name
Dock score
H-bond forming residues
Pi-Pi interaction
H-bond quantity
His226,Asp231,His232,Pro242 Met160,Ser99,Pro98 His232
4 3
Arg267 His222 Asp14 His226, His232 --Ala17 --Ser43, Tyr44, Ala17
5 7 2 3 2 6 3 2
MMP13 TP53 MMP13
135.895 89.796 110.056
d e f g h i j k
TP53 MMP13 CDKN2A MMP13 TP53 CDKN2A TP53 CDKN2A
80.638 103.495 72.432 99.644 48.776 77.896 67.328 76.144
Asp231, Ala186, Glu223 Asp208 Ala186, His222, Glu223, His226,Asp231, His232, Pro242 Ser99, Arg158, Asp208 Ala186, His222, Glu223,His226, Pro242 Asp14, Pro41 Ala188, His226, Asp231 Asp208, Thr256 Asp14, Pro41, Asn42 Asp208, Glu258 Asp14, Pro41
l
TP53
45.838
Arg158, Asp208, Thr256
Arg158, Met160
3
m n o p
CDKN2A TP53 CDKN2A TP53
73.831 41.146 70.371 56.574
Asp14, Pro41, Ser43 Asp208 Asp14, Pro41, Asn42 Ser99, Asp208, Thr256
Asp14 Pro98 Asp14 Met160
4 1 4 5
c
9
Table 3. Vote score of the top ten candidates.

                                          Vote score
                                          pKi
Name                                      RF  ABR  GBR  DL  Dock score  Total score  Multi-target
Nazlinin                                  1   0    0    0   5           6            1
Subaphylline                              1   0    0    0   5           6            1
Ochrolifuanine A                          1   1    0    1   5           8            0
N-Methyl tyramine-O-α-L-rhamnopyranoside  1   0    0    0   5           6            0
Usambarine                                1   1    1    1   5           9            0
Tubulosine                                1   1    0    1   0           3            0
Emetine                                   1   1    1    1   0           4            1
Alangimarckine                            1   1    0    1   0           3            0
Vitamin B1                                1   0    0    0   0           1            0
Hydrachine A                              1   1    1    0   0           3            0
E41*                                      0   0    0    0   0           0            0

*E41: control
Vote score: for the activity values predicted by each algorithm, candidates with a value larger than that of the control received 1 point and the others 0 points. Candidates in the top 50% by Dock score received 5 points (the Dock score is critical) and the others 0 points.
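The scoring rule in the note above can be written out in a few lines. The sketch below is illustrative only: the 0/1 activity votes are copied from Table 3, the Dock scores from Table 1, and the function and variable names are hypothetical.

```python
# Dock scores of the ten candidates (Table 1).
dock_scores = {
    "Nazlinin": 135.895,
    "Subaphylline": 110.056,
    "Ochrolifuanine A": 108.029,
    "N-Methyl tyramine-O-α-L-rhamnopyranoside": 107.707,
    "Usambarine": 105.303,
    "Tubulosine": 103.955,
    "Emetine": 103.495,
    "Alangimarckine": 102.011,
    "Vitamin B1": 98.292,
    "Hydrachine A": 89.277,
}

# Votes per algorithm in the order (RF, ABR, GBR, DL), as listed in Table 3:
# 1 if the predicted activity exceeds that of the control E41, otherwise 0.
activity_votes = {
    "Nazlinin": (1, 0, 0, 0),
    "Subaphylline": (1, 0, 0, 0),
    "Ochrolifuanine A": (1, 1, 0, 1),
    "N-Methyl tyramine-O-α-L-rhamnopyranoside": (1, 0, 0, 0),
    "Usambarine": (1, 1, 1, 1),
    "Tubulosine": (1, 1, 0, 1),
    "Emetine": (1, 1, 1, 1),
    "Alangimarckine": (1, 1, 0, 1),
    "Vitamin B1": (1, 0, 0, 0),
    "Hydrachine A": (1, 1, 1, 0),
}

def dock_votes(scores, points=5, top_fraction=0.5):
    """Give `points` to the candidates in the top fraction by Dock score."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    n_top = int(len(ranked) * top_fraction)
    return {name: (points if rank < n_top else 0) for rank, name in enumerate(ranked)}

def total_scores(scores, votes):
    """Total score = sum of the activity votes + Dock-score vote."""
    dock = dock_votes(scores)
    return {name: sum(votes[name]) + dock[name] for name in scores}

totals = total_scores(dock_scores, activity_votes)
best = max(totals, key=totals.get)
```

With these inputs the totals reproduce the Total score column of Table 3. The multi-target flag is a separate criterion and is not computed here.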
Table 4. Designed peptides sequences. Sequences
Dock
*S1 /KKKK E41(control) /GGGG /RGGS 9GN(control) 26921/FKFGSFIKRMWRSKLAKKLRAK GKELLRDYANRVLSPEEEAAAPAPVPA *S2 (O and T) /THQGQHHCCKHLIKCWKLLRIWGIEL LRDYANRVLSPEEEAAANCDCYK 23678/THQGQHHCCKHLIKCWKLLRIW GIELLRDYANRVLSPEEEAAANCDCYK *S3/ESEFDPQEYYECKRQCMQLETSGQ YRRCHSQCLKRFEEDWPWSKYDCEE P16 (PDB code:1BI7 –B) 23678/ESEFDRQEYEECKRQCMQLETSG QMRRCVSQCDKRFEEDIDWSKYDNQD
207.901 57.629 87.021 107.402 50.305 1205.99
Zdock Zrank
Rdock
39.78
-93.019
14.442
42.75 42.37
-94.232 -73.495
-12.961 -2.6243
39.19
-96.464
-27.668
51.13
-93.09
-19.994
71.09 49.81
-134.42 -115.69
-23.466 -28.180
PDF Total Energy
1600.86
PDF Physical Energy
109.953
DOPE Score
-25221
Targets
Remark
MMP13 MMP13 CDKN2A P53 mut P53 mut Bcl2
Success
Bcl2
Fail Fail Screening Two sites synergies,
P53 mut 3199.54
84.7084
-36991
P53 mut Cdk6 Cdk6
pI =8.05
Screening based on p16
*S4/ESEFDPQEYYECKRQCMQLETSGQ YRRCHSQCLKRFEEDWPWSKYDCEE
51.13
-93.09
-19.994
*S5(16322)/CLAGRLDKQCTCRRSQPSR RSGHEVGRPSPHCGPSRQCGCHMD *S6/CWDHWLRKQHICRMWQYYLRFG HEVGRPSPHCGPSRQCGCHMD
37.6
-64.907
-8.8279
35.64
-80.812
-11.218
*Potential peptides PDF: probability density function; DOPE: Discrete Optimized Protein Energy;
3199.54
84.7084
-36991
Cdk6
MDM2 2008.51
107.514
-15588
MDM2
Cyclin binding site Based on p16 Based on S5
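Table 4 lists both the screened peptide 23678 (the entry screened on the basis of p16) and the designed peptide S4, so the substitutions introduced during design can be read off by comparing the two sequences position by position. The sketch below assumes that the two halves of each sequence printed in the table belong together and that the peptides align without gaps; the helper name is hypothetical.

```python
def point_mutations(parent, designed):
    """Return (position, parent residue, designed residue) for every mismatch."""
    if len(parent) != len(designed):
        raise ValueError("sequences must have equal length for a gap-free comparison")
    return [(pos, a, b)
            for pos, (a, b) in enumerate(zip(parent, designed), start=1)
            if a != b]

# Sequences as printed in Table 4 (each split into its two printed halves).
parent_23678 = "ESEFDRQEYEECKRQCMQLETSG" "QMRRCVSQCDKRFEEDIDWSKYDNQD"
designed_s4 = "ESEFDPQEYYECKRQCMQLETSGQ" "YRRCHSQCLKRFEEDWPWSKYDCEE"

mutations = point_mutations(parent_23678, designed_s4)
labels = [f"{a}{pos}{b}" for pos, a, b in mutations]  # e.g. "R6P"
```

For these two sequences the comparison gives ten substitutions, the first being R6P. A saturation mutagenesis scan would evaluate each such position for every possible replacement residue.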
Table 5. Pivotal hydrogen bond distance data.

H-bond                     Occupancy  Max (nm)  Min (nm)
lig/H20:O/A186             46.03%     1.857     0.154
lig/H20:OE1/E223           74.03%     2.277     0.143
lig/H20:OE2/E223           78.13%     2.352     0.147
lig/H64:OD1/D179           68.80%     2.325     0.143
R109/HE:OD1/D46(O)         97.70%     1.654     0.151
R109/HE:OD2/D46(O)         97.90%     1.576     0.154
R109/1HH1:OD1/D46(O)       95.13%     1.574     0.148
R109/1HH1:OD2/D46(O)       94.67%     1.585     0.149
D46/HN(O):NE2/Q25          51.47%     1.128     0.217
R129/HE:ND1/H7(lig)        47.20%     0.978     0.166
R129/1HH1:ND1/H7(lig)      42.07%     1.134     0.181
K10/HZ1(T):OE2/E114        53.37%     0.981     0.146
H11/HE2(T):O/A149          42.63%     0.998     0.187
F113/HN:O:Q4(lig)          81.73%     0.608     0.167
S227/HG1:OH/Y48(lig)       60.80%     1.430     0.166
D228/HN:OE1/Q11(lig)       73.30%     0.588     0.163
D228/HN:NE2/Q11(lig)       64.43%     0.690     0.192
Q4/1HE2(lig):O/L111        56.93%     0.836     0.155
R144/1HH1:OE1/Q18(lig)     64.60%     1.417     0.150
I169/HN:OG1/T21(lig)       42.37%     2.140     0.173
K13/HZ1(lig):O/A23         44.73%     0.836     0.154
R14/HE(lig):NH2/R144       45.10%     1.443     0.228
Y25/HH(lig):O/G22          75.00%     1.229     0.152
K51/HZ1:OE1/E24(lig)       43.25%     2.584     0.143
K51/HZ1:OE2/E24(lig)       37.70%     2.551     0.144
R97/1HH1:OE1/E24(lig)      40.43%     4.145     0.151
Y100/HH:O/P28(lig)         56.45%     3.162     0.149

Complexes, in printed order: MMP13-S1, Bcl2-O, Bcl2-T, p53-S3, CDK6-S4, MDM2-S5, MDM2-S6
Table 6. Molecular Mechanics/Poisson-Boltzmann Surface Area (MM/PBSA) calculation for binding free energy.

            0 ns (kcal/mol)               300 ns (kcal/mol)
Complex     Binding energy  Complex energy  Binding energy  Complex energy  Mutation energy
MMP13-S1    -13.8371        -7213.272       12.6730         -7097.7526
Bcl2-O      5.2504          -8389.766       -51.8586        -8464.5357      -18.18
Bcl2-T      38.8589         -8344.393       -22.0154        -8398.3407
P53-S3      15.3423         -10243.48       -10.1375        -10098.453      -13.65
CDK6-S4     5.5667          -13634.24       -18.4367        -14119.587      0.42
MDM2-S5     13.0436         -5077.350       -8.9623         -5143.5054
MDM2-S6     11.3177         -5159.072       -47.8072        -5209.4696      -2.39
Table 27. Top 50 TCM candidates for MMP13 and TP53. PDB MMP13
Dockscore
PDB
name
Dockscore
Budmunchiamine L4
153.969
TP53
Glyasperin B
111.883
isofebrifugine
139.379
L-Valine-L-valine anhydride
109.544
Nazlinin Tetrahydrodeoxyoxolucidine A
135.895 134.046
Nazlinin Assamicadine
108.487 99.263
Nudicaulin
133.82
Eupachinilide J
97.207
Subaphyllin Febrifugine Anhydrocannabisativine
133.155 131.852 130.993
Saussureamine C Lindechunine B 11-Hydroxycephalotaxine
97.207 96.513 96.387
Carpaine
125.472
Gomisin D
95.16
Carpaine
125.472
94.627
L-Valine-L-valine anhydride
120.07
1-(1,5-Dimethyl-4-hexenyl)-4methyl benzene N-Norarmepavine
L-Valine-L-valine anhydride
118.651
2,6-Decamethylene pyridine
94.078
94.627
Phytosphingosine Trypargine N-Methyl tyramine-O-α-Lrhamnopyranoside Tetrahydrodeoxylucidine B
117.696 115.203 114.945
Launobine Subaphylline Arnidiol 3-O-palmitate
94.078 93.977 93.541
114.056
Cularimine
93.541
Trypargine
113.485
Cassyfiline
92.661
Acanthoine Acanthoine
113.403 113.403
Hernangerine S-(2-Carboxyethyl)-L-cysteine
92.541 91.438
Tetrahydrodeoxyoxolucidine B
112.488
Funtumine
91.438
Tetrahydrodeoxylucidine A
112.486
(−)-Nordicentrine
91.328
beta-Dichroine 111.848 N-Methyl tyramine-O-alpha-L- 111.106 rhamnopyranoside Subaphylline 110.056
Norannuradhapurine FB1
91.185 91.103
(+)-Guaiacin
91.049
β-Dichroine gamma-Dichroine Acanthoidine
109.045 109.044 108.365
Xylopine 7,4'-Dihydroxyflavan β-Acoradiene
91.049 91.042 90.765
Acanthoidine Ochrolifuanine A Norerythrostachaldine
108.365 108.029 107.999
Boldine Casearlucin A Annulide
90.681 90.279 90.19
Febrifugine
107.589
Cheliensisamine
90.19
Ochrolifuanine A
107.09
Adrenaline
89.704
Dihydrooxolucidine A
105.639
5,15-Dimethylmorindol
89.671
Usambarine Tubulosine Tubulosine 3_4_5-trimethoxy_benzeneethanamine
105.354 105.29 104.825 104
6-O-Acetylstritosamide Diiodotyrosine (+)-Nordicentrine Cinchonicine
89.297 89.284 89.213 89.092
Acanthoidine
103.633
89.006
Acanthoidine Emetine
103.633 103.495
3,4,5-trimethoxy benzeneethanamine Hernovine 3α-Acetoxydiversifolol
Norerythrostachaldine Cephaeline
102.5 102.392
Actinodaphnine 88.602 Anticancer Flavonoid PMV70P691- 88.534
Alangimarckine
102.338
95 norboldine
Cephaeline
101.603
Alangimarckine
101.055
1,7-Bis(4-hydroxyphenyl)-1,4,6heptatrien-3-one Benzoylpaeoniflorin
88.769 88.602
88.302 88.268 88.22
Conessimine
100.448
Norcinnamolaurine
88.127
Broussonetine V
100.294
4-Epi-larreatricin
88.035
Hernovine (−)-Cassine Merresectine A
99.644 99.596 99.473
Deformylflustrabromine B Norhyoscyamine Mescaline
88.019 87.762 87.675
Table 38. Top 50 TCM candidates for CDKN2A and BCL2. PDB
Dock-
PDB
name
score CDKN2A
dopamine
82.325
25-Anhydroalisol A 24-acetate
81.977
Dockscore
BCL2
trans-Phenylitaconic acid
102.302
5-(Hydroxymethyl)-furan-2-
80.881
carboxylic acid D-Norpseudoephedrine
81.977
Arillanin C
73.776
Anhydroalkannin
81.634
4(18),13-Clerodadien-3-oxo-15-oic
73.418
acid methyl ester Chinese bittersweet alkaloid I
81.561
trans-2-Hexenoic acid
73.418
Noradrenaline
80.595
Diterpenoid EF-D
72.909
(S)-cathinone
80.028
Oxalic acid
72.909
Dopamine
79.664
3-Butenoic acid
72.824
Broceaketolic acid
79.661
5-Carboxy-7-hydroxy-2-methyl-
72.356
benzopyran-γ-one
Ethyl O-β-D-oleandropyranosyl-(1→4)- 79.24
2-Furancarboxylic acid
72.356
O-3-O-methyl-6-deoxy-β-Dallopyranoside Salicylamine
79.24
trans-Phenylitaconic acid
71.798
Norephedrine
79.14
Aristolochic acid II methyl ester
69.671
Bufotalin
78.855
crotonic acid
69.671
ephedrine hydrochloridum
78.855
Crotonic acid
69.671
octopamine
78.447
Crocusatin B
68.655
Embelin
78.108
Imidazolylpropionic acid
68.655
Phenethylamine
78.108
succinic acid methyl ester
68.576
3-O-β-D-glucopyranosyl-(1→2)-β-D-
77.963
heptenoic acid
68.415
Tyramine
77.963
α-Aminoadipic acid
67.982
Tyramine
77.963
Butanoic acid
67.982
Adrenaline
77.896
Butyric acid
67.982
quinovopyranosyl quinovic acid
4,4-Dimethyl-1,7-heptanedioic acid
77.645
hexanoic acid
67.813
noradrenaline
77.645
Caproic acid
67.813
Evocarpine
77.423
5β-Androstan-3α,17β-diol
67.584
Serotonin
77.423
Clausenidin
67.3
Coniferyl diangelate
76.589
2-Heptenic acid
67.3
4-Hydroxybenzylamine
76.589
3-methyl-butanoic acid
66.439
D-Cathinone
76.466
Ecliptasaponin A
65.662
1,7-Bis(4-hydroxyphenyl)-1,4,6-
76.144
Pentanic acid
65.662
9-Acetoxyfukinanolide
75.84
(2S)-(O-Hydroxyphenyl)lactate
64.75
13β,17β-Epoxyalisol A 24-acetate
75.801
m-Hydroxyphenylpyruvic acid
64.328
Propylamine
75.801
3-O-(E)-Coumaroylerythrodiol
64.327
Synephrine
75.446
Ginsenoside Rf
63.127
(2S,6ζ)-3,7-Dimethyloct-3(10)-ene-
75.161
Tiglic acid
63.127
heptatrien-3-one
1,2,6,7-tetrol 1-O-β-D-gluco-pyranoside (1S,2S)-norpseudoephedrine
74.665
cis-cou-maric acid
63.007
Cucurbitoside A
74.123
Ascosonchine
62.667
Isoamylamine
74.123
4'-O-β-D-Glucosyl-9-O-(6''-
62.595
deoxysaccharosyl)olivil (-)-synephrine
74.074
(2R)-Sodium 3-phenyllactate
62.454
Clerosterol
74.046
Melilotic acid
62.336
Hexyl amine-1
74.046
Dihydrooroxylin A
61.565
Diiodotyrosine
73.831
Methyl glutarate
61.565
7,7-Dimethyl-2-
73.718
1,4-Dimethyl-cis-cyclohexane
61.275
73.551
2,4-Nonadienic acid
61.275
73.551
ent-15α,18-Dihydroxy-16-kaurene
60.553
methylenebicyclo[3.1.1]heptan-6-ol acetate (2S)-2-O-β-D-Glucopyranosyl-2hydroxyphenylacetic acid Tryptophan
Phenylalanine
72.865
2-Minaline
60.553
Emetine
72.432
6'-O-Acetylloganic acid
60.289
l-tyrosine
72.114
Angelic acid
60.289
D-Pseudoephedrine
71.543
Gallic acid
59.983
(1R,2S)-norephedrine
70.671
Urocanic acid
59.753
Mescaline
70.371
Ferulic acid
59.079
Table 4. Dock score and RF, ABR, GBR, and DL predicted activities of the top 10 TCM database candidates for MMP13.
Predicted activity*: RF*, ABR*, GBR*, DL*
4
4.43091
4.056
4.30404
4.11022
667 4.50683 333
3.83
009 3.87660 68
7 3.20485 85
5.01830 4.73544 556 4.97747 444 4.98202 222 4.9735 778
5.05823 4.125 077 5.214 5.09125 5.25006 25 5.1322 3.72533 5.27966 333 4.866 667
5.08019 4.29452 255 5.25000 461 4.99626 544 5.27886 8 836 5.06908 4.25191 94 5.21594 76 5.10462 021 626
5.56090 4.81367 16 5.26836 64 8.21817 9 9.04722 4 6 6.46766 5.06518 9 3.97840 9 5.13539 5 03
Name
Dock score
–PMF
–PLP1
–PLP2
H-bond quantity
Nazlinin
135.895
63.240
49.490
52.420
Subaphylline Ochrolifuanine A N-Methyl Usambarine rhamnopyranoside Tubulosine
110.056 108.029 tyramine-O-α-L- 107.7070 105.303 103.955
50.590
28.800
19.770
9
55.470 57.920 66.820 39.680
74.170 52.870 77.850 67.200
75.910 45.740 77.610 69.120
4 2 2 3
Emetine
103.495
36.960
55.640
60.170
7
Alangimarckine Vitamin B1 Hydrachine A E41*
102.011 98.292 89.277 42.241
36.570 57.010 63.290 42.690
66.140 65.330 50.410 61.880
66.370 52.850 46.090 58.330
4 2 1 2
*E41: control; RF: Random Forest; ABR: AdaBoost Regressor; GBR: Gradient Boosting Regressor; DL: Deep Learning
4.97905 4.34591 556 4.91030 667 3.95530 556 556
Table 5. Vote score of top ten candidates. Vote score Name Nazlinin Subaphylline Ochrolifuanine A N-Methyl tyramine-O-α-L-rhamnopyranoside Usambarine Tubulosine Emetine Alangimarckine Vitamin B1 Hydrachine A E41*
pKi RF
ABR
GBR
DL
Dockscore
1 1 1
0 0 1
0 0 0
0 0 1
5 5 5
6 6 8
1 1 0
0
5
6
0
1 1 1 1 0 0 0
5 0 0 0 0 0 0
9 3 4 3 1 3 0
0 0 1 0 0 0 0
1 1 1 1 1 1 1 0
0
0
1 1 1 1 0 1 0
1 0 1 0 0 1 0
Totalscore
Multitarget
*E41: control. Vote score: for the activity values predicted by each algorithm, candidates scoring higher than the control received 1 point and all others received 0 points. Candidates in the top 50% by Dock score received 5 points (the Dock score is critical) and all others received 0 points.
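The voting rule described in the footnote can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `vote_scores` is hypothetical, the predicted activities are made-up placeholders, and the sketch assumes that the top 50% is taken over the candidates with the control excluded. Only the Dock scores are taken from Table 4.

```python
import math

def vote_scores(predictions, dock_scores, control="E41", dock_points=5):
    """Score candidates by the voting rule in the Table 5 footnote.

    predictions: {model_name: {compound: predicted activity}}
    dock_scores: {compound: Dock score}
    Assumption: the top 50% is taken over the candidates, control excluded.
    """
    candidates = [name for name in dock_scores if name != control]
    ranked = sorted(candidates, key=lambda n: dock_scores[n], reverse=True)
    top_half = set(ranked[: math.ceil(len(ranked) / 2)])

    totals = {}
    for name in candidates:
        # One point per model whose prediction exceeds the control's prediction.
        model_votes = sum(
            1 for preds in predictions.values() if preds[name] > preds[control]
        )
        totals[name] = model_votes + (dock_points if name in top_half else 0)
    return totals

# Dock scores are from Table 4; the predicted activities are placeholders.
dock = {"Nazlinin": 135.895, "Tubulosine": 103.955, "Emetine": 103.495,
        "Hydrachine A": 89.277, "E41": 42.241}
preds = {
    "RF": {"Nazlinin": 5.0, "Tubulosine": 5.1, "Emetine": 5.2,
           "Hydrachine A": 4.9, "E41": 4.0},
    "DL": {"Nazlinin": 4.5, "Tubulosine": 4.2, "Emetine": 5.6,
           "Hydrachine A": 5.1, "E41": 4.8},
}
scores = vote_scores(preds, dock)
```

Weighting the Dock score at 5 points means a top-half docking rank outweighs agreement among all four activity models, which matches the footnote's remark that the Dock score is critical.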
Table 6. Designed peptides sequences. Sequences
Dock
*S1 /KKKK E41(control) /GGGG /RGGS 9GN(control) 26921/FKFGSFIKRMWRSKLAKKLRAK GKELLRDYANRVLSPEEEAAAPAPVPA *S2 (O and T) /THQGQHHCCKHLIKCWKLLRIWGIEL LRDYANRVLSPEEEAAANCDCYK 23678/THQGQHHCCKHLIKCWKLLRIW GIELLRDYANRVLSPEEEAAANCDCYK *S3/ESEFDPQEYYECKRQCMQLETSGQ YRRCHSQCLKRFEEDWPWSKYDCEE P16 (PDB code:1BI7 –B) 23678/ESEFDRQEYEECKRQCMQLETSG
207.901 57.629 87.021 107.402 50.305 1205.99
Zdock Zrank
Rdock
39.78
-93.019
14.442
42.75 42.37
-94.232 -73.495
-12.961 -2.6243
39.19
-96.464
-27.668
51.13
-93.09
-19.994
71.09 49.81
-134.42 -115.69
-23.466 -28.180
PDF Total Energy
1600.86
PDF Physical Energy
109.953
DOPE Score
-25221
Targets
Remark
MMP13 MMP13 CDKN2A P53 mut P53 mut Bcl2
Success
Bcl2
Two sites synergies,
Fail Fail Screening
P53 mut 3199.54
84.7084
-36991
P53 mut
pI =8.05
Cdk6 Cdk6
Screening
QMRRCVSQCDKRFEEDIDWSKYDNQD *S4/ESEFDPQEYYECKRQCMQLETSGQ YRRCHSQCLKRFEEDWPWSKYDCEE
51.13
-93.09
-19.994
*S5(16322)/CLAGRLDKQCTCRRSQPSR RSGHEVGRPSPHCGPSRQCGCHMD *S6/CWDHWLRKQHICRMWQYYLRFG HEVGRPSPHCGPSRQCGCHMD
37.6
-64.907
-8.8279
35.64
-80.812
-11.218
*Potential peptides. PDF: probability density function; DOPE: Discrete Optimized Protein Energy.
3199.54
84.7084
-36991
Cdk6
MDM2 2008.51
107.514
-15588
MDM2
based on p16 Cyclin binding site Based on p16 Based on S5
Table 7. Pivotal hydrogen bonds distances data.

Hbond                     Occupancy   Max (nm)   Min (nm)
lig/H20:O/A186            46.03%      1.857      0.154
lig/H20:OE1/E223          74.03%      2.277      0.143
lig/H20:OE2/E223          78.13%      2.352      0.147
lig/H64:OD1/D179          68.80%      2.325      0.143
R109/HE:OD1/D46(O)        97.70%      1.654      0.151
R109/HE:OD2/D46(O)        97.90%      1.576      0.154
R109/1HH1:OD1/D46(O)      95.13%      1.574      0.148
R109/1HH1:OD2/D46(O)      94.67%      1.585      0.149
D46/HN(O):NE2/Q25         51.47%      1.128      0.217
R129/HE:ND1/H7(lig)       47.20%      0.978      0.166
R129/1HH1:ND1/H7(lig)     42.07%      1.134      0.181
K10/HZ1(T):OE2/E114       53.37%      0.981      0.146
H11/HE2(T):O/A149         42.63%      0.998      0.187
F113/HN:O:Q4(lig)         81.73%      0.608      0.167
S227/HG1:OH/Y48(lig)      60.80%      1.430      0.166
D228/HN:OE1/Q11(lig)      73.30%      0.588      0.163
D228/HN:NE2/Q11(lig)      64.43%      0.690      0.192
Q4/1HE2(lig):O/L111       56.93%      0.836      0.155
R144/1HH1:OE1/Q18(lig)    64.60%      1.417      0.150
I169/HN:OG1/T21(lig)      42.37%      2.140      0.173
K13/HZ1(lig):O/A23        44.73%      0.836      0.154
R14/HE(lig):NH2/R144      45.10%      1.443      0.228
Y25/HH(lig):O/G22         75.00%      1.229      0.152
K51/HZ1:OE1/E24(lig)      43.25%      2.584      0.143
K51/HZ1:OE2/E24(lig)      37.70%      2.551      0.144
R97/1HH1:OE1/E24(lig)     40.43%      4.145      0.151
Y100/HH:O/P28(lig)        56.45%      3.162      0.149

Complex (column order): MMP13-S1, Bcl2-O, Bcl2-T, p53-S3, CDK6-S4, MDM2-S5, MDM2-S6
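Occupancy, maximum, and minimum values of the kind listed in Table 7 can be derived from a per-frame distance series for one hydrogen bond. The sketch below is only an illustration under stated assumptions: the manuscript does not say how occupancy was computed, so the 0.35 nm distance cutoff, the omission of any angle criterion, the function name `hbond_summary`, and the sample distances are all hypothetical.

```python
def hbond_summary(distances_nm, cutoff_nm=0.35):
    """Return (occupancy %, max distance, min distance) for one hydrogen bond.

    distances_nm: per-frame distances in nm for a single donor/acceptor pair.
    cutoff_nm: assumed distance cutoff; no angle criterion is applied here.
    """
    if not distances_nm:
        raise ValueError("need at least one frame")
    # A frame counts as bonded when the distance is within the cutoff.
    formed = sum(1 for d in distances_nm if d <= cutoff_nm)
    occupancy = 100.0 * formed / len(distances_nm)
    return occupancy, max(distances_nm), min(distances_nm)

frames = [0.16, 0.18, 0.40, 0.21, 1.20, 0.17, 0.33, 0.90]  # made-up distances
occupancy, d_max, d_min = hbond_summary(frames)
```

A bond can show a large maximum distance and still have high occupancy, since the maximum reflects a single frame while occupancy is averaged over the whole trajectory.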
Table 8. Molecular Mechanics/Poisson-Boltzmann Surface Area (MM/PBSA) calculation for binding free energy.

            0 ns (kcal/mol)               300 ns (kcal/mol)
Complex     Binding energy  Complex energy  Binding energy  Complex energy  Mutation energy
MMP13-S1    -13.8371        -7213.272       12.6730         -7097.7526
Bcl2-O      5.2504          -8389.766       -51.8586        -8464.5357      -18.18
Bcl2-T      38.8589         -8344.393       -22.0154        -8398.3407
P53-S3      15.3423         -10243.48       -10.1375        -10098.453      -13.65
CDK6-S4     5.5667          -13634.24       -18.4367        -14119.587      0.42
MDM2-S5     13.0436         -5077.350       -8.9623         -5143.5054
MDM2-S6     11.3177         -5159.072       -47.8072        -5209.4696      -2.39
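The binding energies in Table 8 can be compared between the two snapshots with a few lines of code. The sketch below uses the tabulated values as given; the helper name `binding_shift` is hypothetical, and treating a negative binding energy as favorable follows the usual sign convention rather than an explicit statement in the table.

```python
# (binding energy at 0 ns, binding energy at 300 ns) in kcal/mol, from Table 8.
binding = {
    "MMP13-S1": (-13.8371, 12.6730),
    "Bcl2-O": (5.2504, -51.8586),
    "Bcl2-T": (38.8589, -22.0154),
    "P53-S3": (15.3423, -10.1375),
    "CDK6-S4": (5.5667, -18.4367),
    "MDM2-S5": (13.0436, -8.9623),
    "MDM2-S6": (11.3177, -47.8072),
}

def binding_shift(table):
    """Change in binding energy from the 0 ns to the 300 ns snapshot."""
    return {name: end - start for name, (start, end) in table.items()}

shift = binding_shift(binding)

# Complexes whose binding energy is negative (favorable) at 300 ns.
favorable = [name for name, (_, end) in binding.items() if end < 0]
```

With these values, every peptide complex except MMP13-S1 ends the simulation with a negative binding energy.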
Figure 1. The Pearson correlation coefficient matrix heat map of 204 selected features.
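For reference, a Pearson correlation matrix like the one shown in Figure 1 can be computed as sketched below. This is a dependency-free illustration with three made-up feature columns, not the authors' pipeline; the function names are hypothetical, and a constant feature column would cause a division by zero.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)  # undefined for constant columns

def correlation_matrix(columns):
    """Square matrix of pairwise Pearson coefficients between feature columns."""
    return [[pearson(ci, cj) for cj in columns] for ci in columns]

# Made-up feature columns, one list per feature.
features = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],  # perfectly correlated with the first
    [4.0, 3.0, 2.0, 1.0],  # perfectly anti-correlated with the first
]
matrix = correlation_matrix(features)
```

Pairs of features with coefficients near 1 or -1 carry largely redundant information, which is what a heat map of this matrix makes visible.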
Figure 2. Principal component analysis (PCA) visualization: (a) 2D, (b) 3D.
Figure 3. Fine tuning structure of optimizer in neural network.
Figure 4. Each of the four target proteins was screened by docking against the TCM database. The top 50 TCM candidates for each protein were integrated into a network, with particular attention to the compounds shared between targets.
Figure 5. Interaction modes of the candidate complexes in 2D and 3D views. A. Nazlinin-MMP13 (a); B. Subaphylline-TP53 (d); C. Adrenaline-CDKN2A (i); D. E41 (control)-MMP13.
Figure 6. Related proteins. Three proteins (CDKN2A, TP53, BCL2) were selected from the STITCH interaction database. Spheres represent proteins, and rounded rectangles represent the compounds related to MMP13. The false discovery rate of the cancer pathway was 1.73e-09. The first and second shells were each set to no more than 20 interactors. Interestingly, MMP13 appears to be linked to cancer through other related proteins such as TP53.
Figure 7. Disorder analysis for the four proteins. A disorder value below 0.5 suggests a stable structure. The amino acids around the binding areas are shown in cyan. Protein: A. MMP13, B. CDKN2A, C. TP53, D. BCL2.
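The 0.5 threshold in the caption can be applied to a per-residue disorder profile as sketched below. The scores and binding-site positions are made-up placeholders, and both function names are hypothetical; the sketch only illustrates the thresholding step, not the disorder predictor itself.

```python
def ordered_residues(scores, threshold=0.5):
    """Return 1-based residue indices whose disorder score is below threshold."""
    return [i for i, s in enumerate(scores, start=1) if s < threshold]

def fraction_ordered(scores, positions, threshold=0.5):
    """Fraction of the given 1-based positions (e.g. a binding site) that are ordered."""
    ordered = sum(1 for p in positions if scores[p - 1] < threshold)
    return ordered / len(positions)

# Made-up disorder profile for a six-residue stretch.
scores = [0.8, 0.6, 0.4, 0.3, 0.2, 0.55]
ordered = ordered_residues(scores)
site_fraction = fraction_ordered(scores, positions=[3, 4, 6])
```

A binding site whose residues mostly fall below the threshold is more likely to keep a well-defined shape, which is the property the figure is used to check.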
Figure 8. Residual plots. Different prediction models predicted compound activities based on known MMP13 inhibitors: (a) AdaBoost Regressor, (b) Gradient Boosting Regressor, (c) Random Forest.
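The quantity plotted in a residual plot can be computed as in the sketch below. The observed and predicted activities are made-up placeholders, the function names are hypothetical, and the sign convention (observed minus predicted) is an assumption, since the caption does not state it.

```python
import math

def residuals(observed, predicted):
    """Residuals as observed minus predicted (assumed sign convention)."""
    return [o - p for o, p in zip(observed, predicted)]

def rmse(observed, predicted):
    """Root-mean-square error, a single-number summary of the residuals."""
    res = residuals(observed, predicted)
    return math.sqrt(sum(r * r for r in res) / len(res))

# Made-up activities for four compounds.
observed = [5.0, 6.2, 7.1, 4.8]
predicted = [5.3, 6.0, 6.6, 5.0]
res = residuals(observed, predicted)
error = rmse(observed, predicted)
```

Residuals scattered evenly around zero across the predicted range indicate that a model has no systematic bias, which is what such plots are used to compare across the three regressors.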
Figure 9. Scatter plots of the results of 350 experiments with the deep learning model.
Figure 10. TCM candidates. Based on the ligands common to multiple targets, four ligands were selected for MD analysis according to Dock score, H-bonds, and pi-pi interactions. A. Nazlinin-MMP13 (a); B. Subaphylline-TP53 (d); C. Adrenaline-CDKN2A (i); D. E41 (control)-MMP13.
Figure 11. Hydrophobic effect of the MMP13 binding site with different ligands, shown in 2D and 3D views.
Figure 12. Clustering of the MD results over 290 ns to 300 ns. The clustering results at different times and the proportion of each group (pie chart) are shown. Unfortunately, all of the candidates (top1_Nazlinin, top3_Adrenaline, cont_E41) had drifted away from the binding site by the end of the simulation.
Figure 13. Complexes of Bcl2-S2 and MMP13-KKKK. (a) Two-site binding of the S2 peptide (O and T) to Bcl2. The peptide was designed from an inhibitor to target the BH1 domain and also acquired a high affinity for the BH3 domain. (b) RMSD and MSD during the 300 ns MD simulation. Both complexes remained stably bound throughout the MD period. T changed conformation early in the run (at about 25 ns), which led to the change in the complex RMSD. O changed conformation between 200 ns and 250 ns. (c) SASA and radius of gyration analysis. All values were relatively stable. The SASA of O changed in step with its RMSD. (d) Change of the N terminus in O. The N terminus provided auxiliary binding once it matched the conformation of Bcl2.
Figure 14. Structural alteration of T. (a) Different types of T. (b) Notable amino acids at the "turn" node. In fact, residues 1 to 15 all changed noticeably; the two most significant residues are displayed. (c) Cavity pathway of Bcl2. (d) RMSF analysis. Although O and T have the same structure, their fluctuations were quite different. O bound to the cyclic "O"-type region, and its residues 1 to 15 were flexible. T was just the opposite.
Figure 15. MD analysis of the p53 and CDK6 proteins. (a) Free energy landscape (FEL). The Gibbs free energy was estimated from the distribution of conformations, and the structure with the lowest Gibbs free energy was taken as the reference. (b) Vital hydrogen bonds in p53-S3 during the MD period. The binding mode changed considerably compared with the original structure. The final binding mode was similar to that of the low Gibbs free energy structure. (c) RMSD and MSD of the different complexes. The changes of the protein and the ligand are shown separately. (d) Radius of gyration and SASA analysis. Both were stable during MD.
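Estimating a free energy landscape from a distribution of conformations is commonly done by binning two collective coordinates and applying G = -kB T ln(P / Pmax). The sketch below illustrates that relation only. The choice of two coordinates (`pc1`, `pc2`), the bin width, the 300 K temperature, the function name, and the sample data are all assumptions, since the caption does not give these details.

```python
import math
from collections import Counter

GAS_CONSTANT_KJ = 0.0083144626  # kJ/(mol K), Boltzmann constant per mole

def free_energy_landscape(pc1, pc2, bin_width, temperature=300.0):
    """Map each occupied 2D bin to G = -kB*T*ln(P/Pmax) in kJ/mol."""
    # Count how many frames fall into each (pc1, pc2) bin.
    counts = Counter(
        (math.floor(x / bin_width), math.floor(y / bin_width))
        for x, y in zip(pc1, pc2)
    )
    max_count = max(counts.values())
    kt = GAS_CONSTANT_KJ * temperature
    # The most populated bin gets G = 0; rarer bins get higher free energy.
    return {b: -kt * math.log(c / max_count) for b, c in counts.items()}

# Made-up projections of six frames onto two coordinates.
pc1 = [0.1, 0.2, 0.15, 1.1, 1.2, 2.5]
pc2 = [0.1, 0.3, 0.2, 1.4, 1.3, 0.2]
fel = free_energy_landscape(pc1, pc2, bin_width=1.0)
lowest_bin = min(fel, key=fel.get)
```

The bin with the lowest free energy identifies the most populated conformational basin, and a frame from that basin is a natural choice of reference structure.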
Figure 16. Structure of CDK6 bound to S4. (a) Binding sites of p16 and S4. (b) Significant sites of CDK6. S4 could influence these regions, except for the Thr177 phosphorylation site, which could be addressed in an improved design. (c) The cavity that the peptide could reach. (d) Residue distance matrix of CDK6 when bound to S4.
Figure 17. MD analysis of MDM2 with its ligands. (a) Flexible regions of MDM2 and S6. The most variable amino acids are shown in red. Residues 20-28 of S6 could develop into a potential binding area. (b) RMSD and MSD values. (c) Radius of gyration and SASA changes. S5 and S6 had similar effects on MDM2. (d) Reconstruction of the flexible area enhanced the binding ability of S5 and S6. The variability of the peptide implies additional binding potential, although in most cases such a region is regarded as an unfavorable one that cannot be controlled.
Figure 18. Vital hydrogen bonds of candidate complexes. The donor H and hydrogen acceptors were shared.