Modeling Compound–Target Interaction Network ... - ACS Publications

College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China. §College of Chemistry and Molecular Engineering, Peking Un...
2 downloads 4 Views 4MB Size
Subscriber access provided by Washington & Lee University Library and VIVA (Virtual Library of Virginia)

Article

Modeling compound-target interaction network of Traditional Chinese Medicines for type II diabetes mellitus: insight for polypharmacology and drug design Sheng Tian, Youyong Li, Dan Li, Xiaojie Xu, Junmei Wang, Qian Zhang, and Tingjun Hou J. Chem. Inf. Model., Just Accepted Manuscript • DOI: 10.1021/ci400146u • Publication Date (Web): 16 Jun 2013 Downloaded from http://pubs.acs.org on June 18, 2013

Just Accepted “Just Accepted” manuscripts have been peer-reviewed and accepted for publication. They are posted online prior to technical editing, formatting for publication and author proofing. The American Chemical Society provides “Just Accepted” as a free service to the research community to expedite the dissemination of scientific material as soon as possible after acceptance. “Just Accepted” manuscripts appear in full in PDF format accompanied by an HTML abstract. “Just Accepted” manuscripts have been fully peer reviewed, but should not be considered the official version of record. They are accessible to all readers and citable by the Digital Object Identifier (DOI®). “Just Accepted” is an optional service offered to authors. Therefore, the “Just Accepted” Web site may not include all articles that will be published in the journal. After a manuscript is technically edited and formatted, it will be removed from the “Just Accepted” Web site and published as an ASAP article. Note that technical editing may introduce minor changes to the manuscript text and/or graphics which could affect content, and all legal disclaimers and ethical guidelines that apply to the journal pertain. ACS cannot be held responsible for errors or consequences arising from the use of information contained in these “Just Accepted” manuscripts.

Journal of Chemical Information and Modeling is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.

Page 1 of 46

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Information and Modeling

1 2

Modeling compound-target interaction network of

3

Traditional Chinese Medicines for type II diabetes mellitus:

4

insight for polypharmacology and drug design

5 6

Sheng Tiana, Youyong Lia, Dan Li,b Xiaojie Xuc, Junmei Wangd, Qian Zhanga,

7

Tingjun Houa,b*

8 a

9

Institute of Functional Nano & Soft Materials (FUNSOM) and Jiangsu Key Laboratory for

10

Carbon-Based Functional Materials & Devices, Soochow University, Suzhou, Jiangsu 215123,

11

China b

12

College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China

13 14 15

c

College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China d

Department of Biochemistry, The University of Texas Southwestern Medical Center, 5323 Harry Hines Blvd., Dallas, TX 75390

16 17 18 19

Corresponding author:

20

Tingjun Hou

21

E-mail: [email protected]

22

Phone: +86-512-65882039

or

[email protected]

23 24

Keywords: Type II diabetes mellitus; Traditional Chinese Medicines; Drug-target

25

interaction network; Molecular docking; Pharmacophore mapping; Naïve Bayesian

26

classifier; Polypharmacology

27 1

ACS Paragon Plus Environment

Journal of Chemical Information and Modeling

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

28 29

For Table of Contents Use Only

30

31 32 33 34 35 36 37 38 39 40 41 42 2

ACS Paragon Plus Environment

Page 2 of 46

Page 3 of 46

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Information and Modeling

43

Abstract

44

In this study, in order to elucidate the action mechanism of Traditional Chinese

45

Medicines (TCM) that exhibit clinical efficacy for type II diabetes mellitus (T2DM),

46

an integrated protocol that combines molecular docking and pharmacophore mapping

47

was employed to find the potential inhibitors from TCM for the T2DM-related targets

48

and establish the compound-target interaction network. First, the prediction

49

capabilities of molecular docking and pharmacophore mapping to distinguish

50

inhibitors from non-inhibitors for the selected T2DM-related targets were evaluated.

51

The results show that molecular docking or pharmacophore mapping can give

52

satisfactory predictions for most targets but the validations are still quite necessary

53

because the prediction accuracies of these two methods are variable across different

54

targets. Then, the Bayesian classifiers by integrating the predictions from molecular

55

docking and pharmacophore mapping were developed, and the well-validated

56

Bayesian classifiers for 15 targets were utilized to find the potential inhibitors from

57

TCM and establish the compound-target interaction network. The analysis of the

58

compound-target network demonstrates that a small portion (18.6%) of the predicted

59

inhibitors can interact with multi-targets. The pharmacological activities for some

60

potential inhibitors have been experimentally confirmed, highlighting the reliability of

61

the Bayesian classifiers. Besides, it is interesting to find that a considerable number of

62

the predicted multi-target inhibitors have free radical scavenging/antioxidant activities,

63

which are closely related to T2DM. It appears that the pharmacological effect of the

64

TCM formulae is determined not only by the compounds that interact directly with

65

one or more T2DM-related targets, but also by the compounds with other

66

supplementary bioactivities important for relieving T2DM, such as free radical

67

scavenging/antioxidant effects. The mechanism uncovered by this study may offer a

68

deep insight for understanding the theory of the classical TCM formulae for

69

combating T2DM. Moreover, the predicted inhibitors for the T2DM-related targets

70

may provide a good source to find new lead compounds against T2DM.

71 3

ACS Paragon Plus Environment

Journal of Chemical Information and Modeling

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

72

Introduction

73

Diabetes mellitus, or simply diabetes, is a group of metabolic disorder diseases.

74

The diabetes patients have high blood glucose, because either the body system cannot

75

produce enough insulin (absolute insulin deficiency) or the related cells are not

76

sensitive to or respond to insulin (insulin resistance or relative insulin deficiency).

77

Diabetes mellitus has become a serious health problem around the world. Diabetes

78

mellitus is classified into three categories: type 1, type 2 and gestational diabetes.

79

Type 1 diabetes mellitus (T1DM) is characterized by the loss of the insulin-producing

80

beta cells of the islets of langerhans in the pancreas, leading to absolute insulin

81

deficiency. Type 2 diabetes mellitus (T2DM) is characterized by insulin resistance,

82

which may be combined with relatively reduced insulin secretion. The last one,

83

gestational diabetes mellitus (GDM), which occurs in about 2~5% of all pregnancies,

84

resembles T2DM in several respects, involving a combination of relatively inadequate

85

secretion and responsiveness. Globally, it is estimated that 285 million people have

86

diabetes mellitus, with T2DM making up about 90% of the cases in 2010.1 Its

87

incidence is increasing rapidly, and by 2030, this number is estimated to be almost

88

double.2

89

T2DM is a chronic disease, associated with a ten-year shorter life expectancy,

90

caused by a number of complications including cardiovascular, eyes, kidney and

91

nervous system diseases.3-6 Most T2DM patients need the medication to reduce their

92

blood glucose to near-normal level.7 To control the high-level blood glucose and

93

prevent the diabetic complications, many drugs, such as alpha-glucosidase inhibitors,

94

sulfonylureas, metformin, thiazolidinediones (TZDs) and insulin injections, have been

95

used to treat T2DM patients.7 To our disappointment, all of these therapeutic agents

96

have limited efficacy and are associated with undesired side effects, including

97

hypoglycemia, weight gain, gastrointestinal disturbances, edema, anemia, lactic

98

acidosis, etc.8, 9 Due to the deficiencies of marketed drugs for T2DM, the philosophy

99

of rational drug design, or more specifically, the “one gene, one drug, one disease”

100

paradigm for treating T2DM was met with controversy. 4

ACS Paragon Plus Environment

Page 4 of 46

Page 5 of 46

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Information and Modeling

101

Over the past decade, the number of new molecular entities (NME) approved by

102

US Food and Drug Administration (FDA) did not change significantly. The reasons

103

that have been proposed for this discouraging situation may not be technological,

104

environmental or even scientific but philosophical.10 Also, it has been proven that

105

many effective drugs act on multiple rather than one specific disease-related target.11

106

It seems that the “magic bullets” cannot cure some specific diseases as expected.

107

The new drug candidates with low efficacy and undesired ADME/T properties are

108

not the only factor to debate the traditional paradigm for drug design/discovery.12, 13

109

The phenotype usually cannot be disturbed by single-gene knockouts14 due to

110

redundant functions and alternative compensatory signaling pathways.10,

111

intrinsic robustness of living systems against various perturbations is an essential

112

factor that hinders such magic bullet drugs to be successful in the drug market.16

15

The

113

Compared with the traditional single-target-drug paradigm, the new philosophy

114

for drug design known as polypharmacology or network pharmacology has attracted

115

more attentions.10, 17-19 Therefore, it has been realized that designing drug candidates

116

to target a disease-associated network rather than a single target may be a better

117

strategy to find better drug candidates for some complicated diseases, such as T2DM.

118

It is estimated that approximated 34% of new drugs were directly or indirectly

119

from natural products from 1981 to 2010.20 As an important source of natural products

120

in clinical use, the Traditional Chinese Medicines (TCM) has played an important role

121

in healthcare system of Chinese people for more than several thousand years. Because

122

of satisfactory in vivo efficacy and safety of TCM in targeting complicated chronic

123

diseases,21 finding potential drug candidates from TCM may be a good practice.22, 23

124

The main philosophical theories of TCM are based on Yin-Yang, Five Elements,

125

human body meridian systems and Zang Fu theory.24 It is well known that the basic

126

form of TCM for combating diseases is TCM formula (or prescription), which is the

127

mixture of special herbs with suitable dosages. A TCM formula consists of a large

128

number of chemical compounds, which may interact with multi-targets. Therefore, at

129

the molecular level, the mechanisms of TCM formulae may be explained by 5

ACS Paragon Plus Environment

Journal of Chemical Information and Modeling

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

130

polypharmacology or network pharmacology.

131

Compared with the time-consuming and costly experimental methods to uncover

132

the action mechanism of TCM for combating diseases, the use of computational

133

approaches may provide an easier and more practical way to understand the

134

mysterious power of TCM.25-27 Previous studies suggest that molecular modeling

135

techniques, especially molecular docking and pharmacophore mapping, not only can

136

find potentially active compounds from TCM for targeting disease-related proteins,

137

but also can identify potential targets for compounds from TCM.28-34 However, it

138

seems that most previous studies just simply used computational approaches to find

139

potential inhibitors from TCM or even establish the compound-target networks to

140

explore the polypharmacology of TCM. It should be noted that the validations of the

141

computational approaches is quite critical to guarantee the reliability of the

142

predictions for TCM given by the computational approaches.

143

Through thousands of years of medical use for various diseases in the Chinese

144

medical system, a large amount of knowledge has also been accumulated for diabetes

145

therapy that is termed as “XiaoKeZheng”.24, 35 Similar to the combination therapy of

146

multi-component drugs, the multi-components from TCM can interact with

147

multi-targets through an integrated way to lowing blood glucose or relieving

148

diabetes.21, 36 However, we still know little about how the active compounds in the

149

TCM formulae modulate the network for combating diabetes. Therefore, in this study,

150

we tried to establish the compound-target interaction network by theoretical

151

approaches and uncover the underlying action mechanisms of TCM for T2DM. To

152

make our predictions more reliable, the capabilities of molecular docking and

153

pharmacophore mapping to discriminate the known inhibitors from the non-inhibitors

154

for each T2DM-related target were assessed. To integrate the advantages of molecular

155

docking and pharmacophore modeling, the naïve Bayesian classifiers were developed

156

by combining the predictions from molecular docking and pharmacophore mapping

157

together. The combined models show better performance than those only based on the

158

predictions from molecular docking or pharmacophore mapping. Then, the 6

ACS Paragon Plus Environment

Page 6 of 46

Page 7 of 46

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Information and Modeling

159

well-validated Bayesian classifiers were utilized to find potentially active compounds

160

in TCM that can interact with the important T2DM-related targets. Finally, the

161

compound-target interaction network was established to uncover the action

162

mechanisms of the classical TCM formulae for relieving T2DM.

163 164

Methods and Materials

165

Collection of the T2DM-related targets

166

By searching the public data sources, including Therapeutic Targets Database

167

(TTD)37, Drugbank38, Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway

168

database39 and previous reported literatures31,

169

T2DM were identified and the available crystal structures of the protein-ligand

170

complexes for these targets were retrieved from the RCSB protein data bank41. For

171

each target, the crystal structure with the highest resolution was reserved if multiple

172

structures were available, and finally a non-redundant set of 48 crystal complexes for

173

the T2DM-related targets was obtained. It is necessary to point out that almost all of

174

the T2DM-related targets may have several synonyms because different standards

175

might be used in different TCM-related databases.37 For example, the corticosteroid

176

11-beta-dehydrogenase isozyme 1 also can be named as 11-Beta hydroxysteroid

177

dehydrogenase 1, 11-beta-HSD1, 11HSD1, and so on.37 The specific target with

178

different names should be check carefully. The names, PDB entries and the

179

corresponding crystal resolutions for the selected protein-ligand complexes are

180

summarized in Table S1 in the Supporting Materials.

40

, the important targets related to

181 182

Collection of the compounds in the T2DM-related TCM formulae

183

By extensive data profiling, 52 classical TCM formulae with clinical efficacy for

184

T2DM were compiled. There are 95 Chinese herbs in those 52 TCM formulae. The

185

frequency of each herb was counted, and only the 32 Chinese herbs with the

186

frequency to 5 or higher were considered in our study (Table S2 in the Supporting

187

Materials). The Latin names for the 32 studied Chinese herbs were collected from the 7

ACS Paragon Plus Environment

Journal of Chemical Information and Modeling

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 8 of 46

188

latest version of TCMCD developed in our group42-45 and [email protected]

189

It should be noted that multiple Latin names might be available for the same Chinese

190

herb. The reason is that different standards may be used in different TCM databases.

191

For example, the herb “Cang Zhu” in Chinese name is named as Latin name

192

Atractylodes lancea in TCMCD and Atractylodes lancea (Thunb.) DC. in

193

TCM-Database@Taiwan (Table S2 in the Supporting Materials). A total of 2,479

194

non-duplicated compounds isolated from these 32 Chinese herbs were found from the

195

TCMCD

196

standardized and optimized by using Discovery Studio 3.1 (DS3.1) molecular

197

simulation package.47

and

TCM-Database@Taiwan

databases.

These

compounds

were

198 199

Preparation of the validation datasets

200

The validation dataset for each T2DM-related target was prepared to evaluate the

201

prediction accuracies of molecular docking or pharmacophore mapping to distinguish

202

known inhibitors from non-inhibitors. The known inhibitors were obtained from the

203

BindingDB database.48 The number of the inhibitors for each target is listed in Table

204

S3 in the Supporting Materials. To ensure the statistical significance of the validation

205

results, the targets with the number of the known inhibitors less than 10 are excluded.

206

Besides, in order to achieve acceptable computational efficiency, when the number of

207

the known inhibitors for a target is higher than 500, only 500 randomly-selected

208

inhibitors

209

experimentally-validated non-inhibitors are quite limited, the compounds were

210

randomly chosen from the Chembridge database to serve as non-inhibitors for each

211

target.49 The inhibitors and non-inhibitors were compared and the duplicates were

212

removed from the non-inhibitor subset. To mimic the unbalanced nature of inhibitors

213

versus non-inhibitors, we chose to set the ratio of non-inhibitors versus inhibitors to

214

20. By applying the processing protocol mentioned above, the validation datasets for

215

the 33 T2DM-related targets were generated (Table S3 in the Supporting Materials).

were

included

in

the

validation

dataset.

216 8

ACS Paragon Plus Environment

Then,

because

the

Page 9 of 46

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Information and Modeling

217

Validation of molecular docking-based virtual screening

218

The crystal structures of the protein-ligand complexes for the 33 T2DM-related

219

targets were used for the docking calculations performed with Glide50 in Schrodinger

220

9.0.51 For each crystal structure, the crystallographic water molecules were removed,

221

the missing hydrogen atoms were added, and the protonated states and partial charges

222

were assigned with the OPLS-2005 force field52. Minimizations were performed until

223

the average root mean square deviation of non-hydrogen atoms reached 0.3 Å.

224

Preparation and refinement were accomplished with the protein preparation wizard in

225

Schrodinger 9.051.

226

The validation datasets were processed using the LigPrep module in

227

Schrodinger51. For each small molecule, the possible ionization states were generated

228

by using Epik at pH=7.0. Because the non-inhibitors randomly chosen from

229

Chembridge do not have 3D structural information, all possible stereoisomers for

230

each molecule were generated. All of the other parameters were kept as the default

231

settings in LigPrep.

232

Then, the bounding box of size 10 Å × 10 Å × 10 Å was defined for each target and

233

centered on the mass center of the co-crystal ligand to confine the mass center of the

234

docked ligand by using the Receptor Grid Generation function of Glide.51 According

235

to our docking results, the size of the bounding box is big enough for almost all of the

236

ligands to be docked into the binding site successfully for each target. The default

237

Glide settings were used for grid generation. All of the compounds were docked into

238

the binding site of the studied targets and scored by the Standard Precision (SP) and

239

Extra Precision (XP) scoring modes of Glide.51 Five thousand poses per ligand were

240

generated during the initial phase of the docking calculation, and out of which the best

241

400 poses were chosen for energy minimization. Dielectric constant of 2.0 and 100

242

steps of conjugate gradient minimizations were used in energy minimization stage.

243

The performances of the SP and XP scoring modes of Glide on the 33 T2DM-related

244

targets were evaluated and compared.

245 9

ACS Paragon Plus Environment

Journal of Chemical Information and Modeling

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

246

Validation of pharmacophore-based virtual screening

247

The low-energy conformations of the molecules in the validation datasets were

248

generated by using the Conformation Generation protocol in DS3.147 with the FAST

249

setting and the number of the maximum conformations per compound was set to 100.

250

Then, the receptor-ligand interaction patterns (or pharmacophore features) for each

251

target was generated by using the Receptor-Ligand Pharmacophore Generation

252

(RLPG) protocol in DS3.1.47 A set of predefined pharmacophore features that include

253

hydrogen bond acceptor/donor (HBA/HBD), hydrophobic (HYD), negative/positive

254

ionizable (PI/NI) and ring aromatic (RA) of the ligand are identified for each

255

receptor-ligand complex firstly and then pruned when not satisfying the protein-ligand

256

interactions using adjustable topological rules.53

257

The RLPG protocol in DS3.1 was applied to the receptor-ligand complexes for the

258

33 T2DM-related targets. For each pharmacophore model, the minimum number of

259

the pharmacophore features was set to 3, and the maximum number of the

260

pharmacophore features was same to the number of the total features that can match

261

the receptor-ligand interactions. Up to ten pharmacophore models per receptor-ligand

262

complex were built and ranked by the selectivity as the fault settings in DS3.1. The

263

selectivity of a pharmacophore model was evaluated by a rule-based score function

264

based on the genetic function approximation (GFA) technique54. The details of the

265

GFA model for pharmacophore modeling were described in the previous study.55

266

Besides, the discrimination power of a pharmacophore model was determined by the

267

capability to distinguish known inhibitors from non-inhibitors. The discrimination

268

capability of a pharmacophore model is more important than the selectivity for virtual

269

screening. An ideal pharmacophore model used in virtual screening must have the

270

ability not only to find the most promising active compounds, but also to abandon the

271

inactive compounds. The discrimination capability of each pharmacophore model for

272

the corresponding validation set was evaluated.

273 274

Generation of the molecule-target interaction network 10

ACS Paragon Plus Environment

Page 10 of 46

Page 11 of 46

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Information and Modeling

275

As mentioned above, the performance of molecular docking or pharmacophore

276

modeling for each target was validated. The validation process aiming to distinguish

277

the known inhibitors from non-inhibitors for each target is a binary classification

278

problem. Recently, the naïve Bayesian classification technique was successfully

279

applied to different classification problems by our group.45,

280

classification can process large amounts of data, learn fast, and is tolerant of random

281

noise. It only requires a small amount of training data to estimate the parameters

282

(means and variances of the variables) necessary for classification. Moreover, in the

283

scheme of naïve Bayesian classification, the belonging of a compound in which class

284

can be qualitatively evaluated by the posterior probability. The mathematical

285

procedure to train a naive Bayesian classifier was described previously.45, 56, 57 To

286

integrate the advantages of molecular docking and pharmacophore mapping, each

287

compound in the validation datasets was docked into the binding site firstly and then

288

mapped onto the pharmacophore model for each target. The docking scores given by

289

molecular docking and the fit values given by pharmacophore mapping were used as

290

the independent variables to train the naïve Bayesian classifiers.

56, 57

Naïve Bayesian

291

After rigorous validation, the naïve Bayesian classifiers with reliable predictions

292

for 15 targets were utilized to virtually screen the TCM compound dataset. First, The

293

2,479 non-duplicate compounds from 32 Chinese herbs of the TCM formulae for

294

T2DM (Table S2 in the Supporting Materials) were processed using the LigPrep

295

module in Schrodinger.51 Then, those compounds were docked into the binding sites

296

of the T2DM-related targets by using Glide in Schrodinger 9.0 and scored by the SP

297

and XP scoring modes. Next, each compound was mapped onto the pharmacophore

298

model with highest discrimination capability for each target. Based on the calculations

299

from molecular docking and pharmacophore mapping, each compound was predicted

300

by the reliable naïve Bayesian classifiers for 15 T2DM-related targets.

301

By applying the Bayesian classifiers, the potential inhibitors for each target were

302

determined. In order to balance the prediction accuracy and the complexity of

303

interaction network, only the top 10% of the predicted inhibitors ranked by decreasing 11

ACS Paragon Plus Environment

Journal of Chemical Information and Modeling

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

304

the Bayesian scores were extracted to generate the compound-target interaction

305

networks by using the Osprey network visualization program (version 1.2.0).58

306 307

Results and Discussion

308

The performance of molecular docking-based ligand profiling

309

Before applying molecular docking to screen the TCM compound dataset (Table

310

S1 in the Supporting Materials), the performance of molecular docking to distinguish

311

inhibitors from non-inhibitors for 33 T2DM-related targets (Table S2 in the

312

Supporting Materials) was evaluated.

313

First, as a representative measurement for the accuracy and reliability of

314

molecular docking, the “docking power” was examined.59 The co-crystallized ligands

315

were extracted and then re-docked into the binding sites. The root-mean-squared

316

deviation (RMSD) of the docked pose from the original position for each complex

317

was computed to evaluate whether the Glide docking is reliable for virtual screening.

318

The RMSD was used as the criterion for the success of the prediction. If RMSD of the

319

docked pose was less than or equal to 2.0 Å from the experimentally observed

320

conformation, we considered it to be a successful prediction. As shown in Table S3 in

321

the Supporting Materials, the SP and XP scoring modes can successfully recognize

322

the near-native conformations for 25 and 26 of the entire 33 complexes, i.e. the

323

successful rate are 75.8% and 78.8%, respectively. In addition, only four complexes

324

that do not meet the criterion of RMSD ≤ 2.0 Å by using either SP or XP scoring

325

mode (Table S3 in the Supporting Materials). From this data, it is clear that for most

326

complexes the Glide docking can successfully recognize the correct binding

327

conformations.

328

Then, we evaluated the “discrimination power” of molecular docking to

329

distinguish known inhibitors from non-inhibitors for each target. The student’s t-test

330

was employed to evaluate the significance of the difference between the means of the

331

two distributions of the Glide SP or XP scores for the known inhibitor and

332

non-inhibitor classes (Table S3 in the Supporting Materials). On the whole, the 12

ACS Paragon Plus Environment

Page 12 of 46

Page 13 of 46

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Information and Modeling

333

molecular docking calculations for most targets give satisfactory results, indicated by

334

the quite low P-values associated with the difference in the mean SP or XP scores of

335

the inhibitors versus those of the non-inhibitors. The P-values for 25 of 33 targets

336

(76%) are lower than 10-20, suggesting that molecular docking can successfully

337

discriminate the known inhibitors from non-inhibitors for most targets. In other words,

338

the Glide docking can be used as a reliable tool to identify known inhibitors from

339

diverse chemical space for each target. The distributions of the Glide SP or XP scores

340

for four targets with good predictions and four targets with poor predictions are

341

illustrated in Figures 1a and 1b as examples.

342

Generally speaking, the XP scoring (extra precision) is believed to be better than

343

the SP scoring (standard precision). However, it is interesting to observe that the XP

344

scoring of Glide does not always show better predictions than the SP scoring. As

345

shown in Table S3 in the Supporting Materials, the XP scoring only shows better or

346

equal discrimination power for 15 (45.5%) targets according to the measurement of

347

the P-values. The results suggest that the XP scoring does not always yield better

348

performance than SP. Therefore, the comparison study of the performance between

349

the SP and XP scoring modes of Glide is quite necessary for a specific target. For each

350

target, based on the comparison results, more reliable scoring mode was chosen for

351

the following virtual screening of the TCM compound dataset.

352

In summary, the Glide docking based on the SP or XP scoring mode can

353

successfully recognize the near-native conformations for 25 or 26 of the entire 33

354

complexes. It is indicated that the Glide docking is suitable for our studies. Moreover,

355

it can successfully discriminate the known inhibitors from non-inhibitors for 25 of 33

356

targets, proving the reliability of molecular docking for most targets. Ideally, if the

357

results of docking can meet both of the two requirements: RMSD ≤ 2.0 Å and P-value

358

≤ 10-20, the docking-based virtual screening is reliable.

359

The reason of the failure of the docking calculations for some targets lies in the

360

fact that the performance of scoring function used by Glide is often varied across

361

different protein families.59,

60

For example, for HMG-CoA reductase, six crystal 13

ACS Paragon Plus Environment

Journal of Chemical Information and Modeling

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

362

complexes can be retrieved from PDB database (Table S4 in the Supporting

363

Materials). According to our analysis, there is only one complex (1DQA) can meet the

364

demand of RMSD (≤ 2.0 Å) by using the SP mode of Glide. Meanwhile, the

365

capability of distinguishing the known inhibitors from non-inhibitors for 1DQA

366

complex is not good, indicated by the high P-value (1.13×10-6) associated with the

367

difference in the mean SP scores of the known inhibitors versus that of the

368

non-inhibitors (Figure S1 in the Supporting Materials). That is to say, all of the

369

docking calculations by using the SP and XP scoring modes for HMG-CoA reductase

370

are questionable either duo to unacceptable RMSD or poor discrimination capacity. In

371

most previous application of molecular docking on the TCM studies, the capabilities

372

of molecular docking on the studied systems were not evaluated, and therefore, the

373

reported results may be questionable. According to our calculations, it is strongly

374

recommended that the validation process is necessary before any docking study for

375

TCM.

376 377

The performance of pharmacophore-based ligand profiling

378

Based on the 33 receptor-ligand complexes of the T2DM-related targets (Table S5

379

in the Supporting Materials), 28 complex-based pharmacophore models were

380

generated (Table S5 in the Supporting Materials), and no pharmacophore model can

381

be generated for 5 complexes. There are two possible reasons for explaining the

382

failures for these 5 complexes:(1) the number of extracted pharmacophore features is

383

quite limited (less than 3), and (2) some pharmacophore features extracted from the

384

receptor-ligand complex do not comply with the minimum feature distance criterion

385

(< 1.0 Å).

386

For each complex, the generated pharmacophore models were ranked by the

387

selectivity scores (Table S5 in the Supporting Materials). Then, the prediction power

388

of each pharamcophore model to distinguish the known inhibitors from non-inhibitor

389

in the validate dataset was evaluated. The discrimination power of a pharmacophore

390

model was quantitatively characterized by the AUC area under a receiver operating 14

ACS Paragon Plus Environment

Page 14 of 46

Page 15 of 46

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Information and Modeling

391

characteristics (ROC) curve. All pharmacophore features derived from the

392

receptor-ligand interactions (Total features), the pharmacophore features for the most

393

selective pharmacophore model (Feature set A) and those for the pharmacophore

394

model with the highest AUC value (Feature set B) are listed in Table S5 in the

395

Supporting Materials. The comparison of the numbers of the pharmacophore features

396

for 25 targets among Total features, Feature set A and Feature set B are shown in the

397

Figure 2a and listed in Table S6 in the Supporting Materials. We can observe that the

398

numbers of Total features are generally more than those of Feature set A, and the

399

numbers of Feature set A are also more than those of Feature set B for the same

400

complex (Table S5 in the Supporting Materials) for most cases. Besides, the

401

percentage of the number of the pharmacophore features that are more than five for

402

Total features, Feature set A and Feature set B are 71.4% (20/28), 64.3% (18/28) and

403

32.1% (9/28), respectively.

404

It is expected that a pharmacophore model with more pharmacophore features

405

should be more selective than that with less pharmacophore features. According to our

406

analysis, the numbers of Feature set A for most selective pharmacophore models are

407

not always equal to those of Total features. If the number of Total features derived

408

from a complex is n, the numbers of Feature set A for most selective pharmacophore

409

models determined by GFA are usually equal to n (14/28) or n-1 (11/28), except n-2

410

for PPAR-γ, PTP1B and estrogen receptor. The results demonstrate that some

411

pharmacophore features derived directly from the protein-ligand interactions have no

412

effect on the identification of the known inhibitors from chemical space. Meanwhile,

413

the numbers of Feature set B of the pharmacophore models that have the highest

414

discrimination capacity determined by the AUC values are usually n-1 (11/28) or n-2

415

(11/28), except n for angiotensin-converting enzyme, insulin-like growth factor 1

416

receptor and mitogen-activated protein kinase 1, n-3 for PPAR-γ, PTP1B and estrogen

417

receptor. By comparing the pharmacophore features between Feature set A and

418

Feature set B, it was found that the numbers of the pharmacophore models with

419

Feature set B equaling to, one less than, and two less than those in Feature set A are 15

ACS Paragon Plus Environment

Journal of Chemical Information and Modeling

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

420

21.4% (6/28), 64.3% (18/28) and 14.3% (4/28), respectively.

421

In summary, the pharmacophore models with Feature set B determined by the

422

validation datasets are more reasonable to be used than those with Feature set A

423

determined by GFA (Table S5 in the Supporting Materials and Figure 2b). If AUC

424

higher than 0.70 was used as the criterion to judge a successful prediction, 5 (18%) of

425

the most selective pharmacophore models yield acceptable predictions while 15 (55%)

426

of the pharmacophore models with the best discrimination capabilities yield

427

acceptable predictions. The most selective pharmacophore model was optimized by

428

GFA, but it should be cautiously used in virtual screening when there are no enough

429

known inhibitors for the validation. The aim of virtual screening is to find more

430

promising compounds (true positives) and avoid false positives from the huge

431

chemical space. The most distinguishing pharmacophore models, which are validated

432

by the validation dataset, are more suitable to be used in structure-based virtual

433

screening. Obviously, the validation process before any pharmacophore modeling

434

study for TCM is quite necessary.

435 436

The Bayesian classifier by combining the predictions from molecular docking

437

and pharmacophore mapping

438

As mentioned above, molecular docking gives good predictions for 21 targets, and

439

pharmacophore mapping gives acceptable predictions for 15 targets. Both of

440

molecular docking (RMSD ≤ 2.0 Å and P-value ≤ 10-20) and pharmacophore mapping

441

(AUC ≥ 0.7) give good predictions for 8 targets, and five targets cannot be well

442

predicted by either molecular docking or pharmacophore mapping. However,

443

molecular docking or pharmacophore mapping can give relatively reliable predictions

444

for 28 T2DM-related targets (28/33=84.8%). It must keep in mind that the virtual

445

screening based on molecular docking or pharmacophore mapping is a binary

446

classification problem. Then, we may raise the following question: could the

447

combination of these two structure-based methods enhance the prediction? For each

448

molecule, the docking score and fit value are the quantitative measurements for 16

ACS Paragon Plus Environment

Page 16 of 46

Page 17 of 46

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Information and Modeling

449

molecular docking and pharmacophore mapping, respectively. Therefore, the naïve

450

Bayesian classifiers were built by using the docking scores and fit values as the

451

independent variables for 24 targets with reliable predictions of molecular docking

452

and/or phamacophore mapping. Three criteria should be satisfied for the chosen

453

targets: (1). the number of known inhibitors should be equal to or higher than 10; (2).

454

the native-like conformation of the ligand should be reproduced by molecular docking

455

(RMSD ≤ 2.0 Å); (3). reasonable pharmacophore model should be generated. In order

456

to compare the prediction capacity among molecular docking, pharmacophore

457

modeling and the combination of the predictions from both, the AUC values with

458

5-fold cross validation were employed to measure the performance of the naïve

459

Bayesian classifiers (Table 1 and Figure 3). According to statistical results shown in

460

Table 1, we observe that the AUC values of the Bayesian classifiers based on the

461

docking scores and fit values are higher than those only based on the docking scores

462

or fit values for almost all targets (except PTP1B). For example, for insulin-like

463

growth factor 1 receptor (IGF-1R), the AUC values for the Bayesian classifier based

464

on the fit values and that based on the docking scores are 0.729 and 0.788,

465

respectively, while the AUC value for the Bayesian classifier based on both the fit

466

values and docking scores is 0.904 (Figure 4). The reason why the naïve Bayesian

467

classifier based on both fit values and docking scores for PTP1B cannot get the

468

improvement was also investigated. We found that only six known inhibitors could be

469

docked into the binding site of PTP1B and mapped onto the pharmacophore model of

470

the PTP1B complex, and the limited number of the PTP1B inhibitors could not

471

guarantee the statistical reliability of the validation results.

472

Here, the AUC values in the range of 0.5-0.6, 0.6-0.7, 0.7-0.8, 0.8-0.9 and 0.9-1.0

473

represent fail, poor, fair, good and excellent predictions for the Bayesian classifiers

474

(Table S7 in the Supporting Materials and Figure 3). The definition of AUC for

475

indicating the performance of prediction models is consistent with the definition used

476

by the Discovery Studio simulation package. The Bayesian classifiers built based on

477

docking scores or fit values can only give good or excellent predictions for 4 or 9 17

ACS Paragon Plus Environment

Journal of Chemical Information and Modeling

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

478

targets, while the combination of these two measurements can give good or excellent

479

predictions for 15 targets. Clearly, the combination of docking scores and fit values

480

can enhance the discrimination capacity.

481 482

The deep analysis of the potential active compounds for the T2DM-related

483

targets

484

As we discussed in the above section, for 15 targets (15/24=62.5%), the Bayesian

485

classifiers based on both docking scores and fit values give reliable predictions (AUC

486

≥ 0.8), and therefore, only the Bayesian classifier for these 15 targets were employed

487

to virtually screen the TCM compound dataset.

488

The numbers of the predicted inhibitors are listed in Table S8 in the Supporting

489

Materials. For protein-tyrosine kinase erbB-2, no potential inhibitor was found by the

490

Bayesian classifier. For protein-tyrosine kinase erbB-2, there are only ten inhibitors

491

for building and validating the Bayesian classifier. As shown in Figure S2 in the

492

Supporting Materials, these ten inhibitors almost have the same molecular framework

493

(Murcko framework).61 Therefore, the conservation of the target-inhibitor interaction

494

patterns may be unfavorable to generate a general prediction model for finding other

495

dissimilar compounds from the TCM compound dataset. Then, we investigated the

496

correlation between the number of the potential inhibitors and the volume and

497

hydrophobic property of the binding pocket for each target (Table S8 in the

498

Supporting Materials), and did not find obvious relationship between the number of

499

the potential inhibitors for each target and the volume or hydrophobic property of the

500

binding pocket.

501

It is well known that two similar compounds or two compounds that share similar

502

framework usually have similar pharmacological effects.62 Therefore, the similarities

503

between the potential inhibitors from TCM predicted by the Bayesian classifiers and

504

the known inhibitors for each target (Table S8 in the Supporting Materials) were

505

evaluated by the 2-D similarities (Tanimoto Coefficient) based on the MDLPubicKeys

506

fingerprints63 supported in DS3.1. The results are summarized in Table S9 in the 18

ACS Paragon Plus Environment

Page 18 of 46

Page 19 of 46

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Information and Modeling

507

Supporting Materials. It is encouraging to find that a number of the potential

508

inhibitors are similar to the known inhibitors for 12 of 15 targets when the similarity

509

coefficient higher than 0.6 was used as the threshold. In addition, for insulin receptor

510

and estrogen receptor, 7 and 11 potential inhibitors have the similarity coefficient of

511

1.0 to the known inhibitors. For estrogen receptor, 9 of the 11 potential inhibitors are

512

the known inhibitors (Figure 5a), and the other 2 are quite similar to the known

513

inhibitors of estrogen receptor (Figure 5b).

514

It is interesting to find that all of the 8 compounds shown in Figure 5a were

515

directly isolated from natural products. Be specific, the compounds 2 (Emodin), 6

516

(Genistein) and 7 (Daidzein) were isolated from Polygonum cuspidatum

517

(polygonaceae),64 the compound 8 (Broussonin A) from Broussonetia Kazinoki,65 the

518

compound 4 (8-PN) from Humulus lupulus L.,66 the compound 3 (Coumestrol) from

519

plant-based natural molecules or derivatives,67 and the compounds 1 (Resveratrol)68

520

and 5 (Apigenin)69 also from natural plants. It should be noted that all of the 8

521

compounds are not only found in the natural plants mentioned above, but also found

522

in the Chinese herbs in the classical TCM formulae for treating T2DM. Furthermore,

523

for insulin receptor, we can also observe that 21 predicted inhibitors have the

524

similarity coefficients higher than 0.9 to the known inhibitors of insulin receptor

525

(Table S9 in the Supporting Materials). The structural analysis shows that 18 of the 21

526

molecules are quite similar to the known inhibitor, myricetin, of insulin receptor. The

527

myricetin and the 18 potential inhibitors similar to myricetin are shown in Figure 6.

528

Obviously, these 18 molecules share the same molecular framework (Murcko

529

framework)61 with myricetin and can be seen as the derivatives of myricetin.

530

In summary, our results imply that the Bayesian classifiers by combining the

531

predictions from molecular docking and pharmacophore mapping have exciting

532

capability to find promising inhibitors from TCM for the T2DM-related targets. Many

533

potential inhibitors predicted by the Bayesian classifiers are quite similar or even

534

identical to the known inhibitors. Moreover, by the 2D similarity analysis, we can

535

observe that some potential inhibitors are dissimilar (low similarity) to the known 19

ACS Paragon Plus Environment

Journal of Chemical Information and Modeling

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 20 of 46

536

inhibitors for most targets and they may be novel lead compounds for the studied

537

targets. Our observation can provide a deep insight for understanding the action

538

mechanism of TCM formulae for T2DM. The main mechanism of the TCM formulae

539

for treating T2DM may be similar to western drugs: the active compounds or

540

inhibitors in TCM can interact with the important disease-related targets, forming the

541

leading force for combating diabetes.

542 543

Analysis of the compound-target interaction network

544

A TCM formula usually consists of various herbs with numerous compounds to

545

exert potency towards diseases. The action mechanism of TCM is also termed as

546

“cocktails

547

multi-targets. Two possible mechanisms may be proposed for the pharmacological

548

effect of the TCM formulae for treating T2DM: (1). multi-active-components interact

549

with different T2DM-related targets to achieve synergic effect, (2). limited individual

550

components interact with multi-targets and most of them have activities against single

551

specific T2DM-related targets.

of

drugs”,

which

represents multi-components

interacting

with

552

After employing the Bayesian classifiers to virtually screen the TCM compound

553

dataset, 1,996 non-duplicated compounds were determined to be potential inhibitors

554

for 14 targets. Among these 1,996 compounds, 1,590 (74.38%) of them were

555

predicted to be drug-like by using the drug-likeness model developed in our group.45

556

The distributions of eight important molecular physicochemical properties, including

557

MW, PSA, AlogP, logD, logS, the number of hydrogen bond acceptors/donors and the

558

number of rotatable bonds, for these compounds are displayed in Figure S3 in the

559

Supporting Materials. In order to achieve more reliable predictions, only the top 10%

560

of the compounds (693) ranked by decreasing the Bayesian scores for each target

561

were chosen for the following analysis (Table S8 in the Supporting Materials). The

562

numbers of the predicted potential inhibitors for each target are listed in Figure 7.

563

It is interesting to find that there are a number of potential inhibitors that interact

564

with multiple targets (Figure 8 and Table S10 in the Supporting Materials). There are 20

ACS Paragon Plus Environment

Page 21 of 46

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Information and Modeling

565

519 non-duplicated predicted inhibitors against a single target, 94 against two targets,

566

28 against three targets, 8 against four targets, and only one against five targets. The

567

compound-target interaction networks are shown in Figure 9. As can be seen from

568

Figure 8, the number of the potential inhibitors decreases rapidly with the increase of

569

the number of targets. Apparently, most compounds in Chinese herbs have potential

570

activities against limited targets. As can be seen in Figure 9a, the whole

571

compound-target network has 519 nodes (potential inhibitors) and 696 edges

572

(interactions) with an average of 1.34 interactions per node. About 81.39% (564/693)

573

of the potential inhibitors only interact with one target. According to our analysis, we

574

can conclude that the mechanism of the TCM formulae may belong to the second

575

hypothesis: limited individual components interact with multi-targets and the others

576

only interact with the single specific T2DM-related targets.

577

We then investigated the potential multi-target inhibitors predicted by the

578

Bayesian classifiers. The similarities between the potential multi-target inhibitors and

579

the known inhibitors for each target were calculated. When the minimum similarity

580

coefficient based on the MDLPubicKeys fingerprint was set to 0.6, no similar known

581

inhibitor was found for these potential inhibitors for all of four and five targets, but

582

similar known inhibitors were successfully found for some potential inhibitors for all

583

of two and three targets.

584

For the only potential inhibitor (mirificin) against five targets (carbonic anhydrase

585

1, carbonic anhydrase 2, FBPase, glycogen synthase kinase-3 beta and insulin

586

receptor), similar known inhibitors for three of five targets (carbonic anhydrase 1,

587

carbonic anhydrase 2 and insulin receptor) could be found. For the 8 potential

588

inhibitors against four targets, similar known inhibitors for two or three of four targets

589

could be found. Besides, for the predicted inhibitors against two and three targets,

590

similar known inhibitors for all of two or three targets could be found. For examples,

591

the compound (Pureoside A) that was predicted to be a potential inhibitor of carbonic

592

anhydrase 1, PPAR delta and PPAR gamma, and similar known inhibitors for each

593

target were found and shown in Figure 10; two compounds (Epicatechin and 21

ACS Paragon Plus Environment

Journal of Chemical Information and Modeling

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

594

(2RS)-2-(4-O-β-D-glucopyranosylphenyl) propionyl β-D-glucopyranoside) that were

595

predicted to be potential inhibitors for two targets and similar known inhibitors were

596

found and shown in Figure 11.

597

In summary, our predictions suggest that most of the compounds extracted from

598

the Chinese herbs in TCM formulae can only interact with a single specific target.

599

Although the theory of TCM formulae is multi-compounds for multi-targets,

600

according to our analysis, it is obvious that only limited compounds from the TCM

601

formulae for T2DM show activities against multi-targets. The 2D-similarity analysis

602

shows that some known inhibitors are similar to the predicted potential inhibitors

603

against multi-targets. Furthermore, the low similarities between some of the predicted

604

potential inhibitors and the known inhibitors for each target suggest that some novel

605

potential inhibitors might be found from the predictions given by the Bayesian

606

classifiers. Those novel potential inhibitors may provide good source for

607

pharmaceutical chemists to find promising leads for treating T2DM.

608 609

The possible mechanisms of the classical TCM formulae for treating T2DM

610

As discussed in the above section, it is demonstrated that the 519 non-duplicated

611

potential inhibitors can interact with 14 T2DM-related targets. More than 80% of the

612

potential inhibitors only interact with one target, and the other potential inhibitors

613

interact with two or more targets. The multiple compounds from one herb or different

614

herbs of TCM formulae may form synergistic interactions for combating T2DM. Then,

615

the pharmacological activities of those predicted inhibitors that can interact with two

616

or more targets were examined by extensive literature searching. First, we analyzed

617

the nine compounds that interact with at least four targets (Table S10 in the

618

Supporting Materials). The only compound, mirificin, which was predicted to be an

619

inhibitor of five targets (carbonic anhydrase 1, carbonic anhydrase 2, FBPase,

620

glycogen synthase kinase-3 beta and insulin receptor), has free radical scavenging

621

activity.70 Moreover, the eight compounds that are predicted to be the inhibitors of

622

four targets have reported activities. Interestingly, the reported activities for three 22

ACS Paragon Plus Environment

Page 22 of 46

Page 23 of 46

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Information and Modeling

623

compounds (Apiopaeonoside, Kakkalide and tectorigenin 7-O-xylosylglucoside) are

624

not related to the studied targets while related to aldose reductase.71,

625

studies have shown the close relationship between aldose reductase and free radical

626

scavenging/antioxidant.73 Aldose reductase might induce the oxidative damage

627

resulted from the competition between aldose reductase and glutathione reductase for

628

NADPH. Furthermore, the inhibitors of aldose reductase not only show activities for

629

aldose reductase, but also have antioxidant activity.74 By checking the top 10% of the

630

predicted inhibitors of aldose reductase (Table S8 in the Supporting Materials), we

631

also found that 6 of the 15 compounds have free radical scavenging/antioxidant

632

activities. According to those observations, the linkage between the three potential

633

inhibitors for four targets and the one potential inhibitor for five targets can be

634

established. Although the predicted potential inhibitors for more than four targets do

635

not have reported activities as predicted by us, all of those four compounds have the

636

reported activities of free radical scavenging or antioxidant. Besides, the previous

637

studies have shown that there were direct relationship between free radical

638

scavenging/antioxidant and T2DM.75 The level of free radical increases with the

639

T2DM disease progressing76, and in addition, some complications of T2DM,

640

including obesity, retinopathy and nephropathy, and some T2DM-related diseases,

641

such as coronary arteriosclerotic heart disease and arteriosclerosis, are also related to

642

the out of control of free radical.77-81 At last, it is necessary to emphasize that some

643

potential inhibitors that can interact only one, two or three targets also possess the

644

reported activities of free radical scavenging/antioxidant, for example, the compound,

645

apigenin-7-glucoside, a potential inhibitor of two targets (insulin receptor and

646

pancreatic alpha-amylase). Actually, the activity of apigenin-7-glucoside against

647

pancreatic alpha-amylase has been reported;82 in addition, this compound also has

648

aldose reductase and free radical scavenging/antioxidant activities.83-85 The compound,

649

4'-O-β-glucopyranosyl-3',5,7-trihydroxyflavone with predicted activities for carbonic

650

anhydrase 1 and pancreatic alpha-amylase targets have aldose reductase and free

651

radical scavenging activities.84, 86 For the compound, taxifolin, a predicted inhibitor 23

ACS Paragon Plus Environment

72

Previous

Journal of Chemical Information and Modeling

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

652

against three targets (carbonic anhydrase 2, estrogen receptor and pancreatic

653

alpha-amylase), the experimental activity against estrogen receptor has been

654

reported,87 and this compound also shows free radical scavenging/antioxidant,88, 89

655

PI3K inhibition,90 and PPAR gamma inhibition activities.91 Moreover, we also found

656

that some potential multi-target inhibitors did not have any documented activities for

657

the T2DM-related targets. For example, the compound, lobetyolin, was predicted to be

658

a potential inhibitor of carbonic anhydrase 2, insulin receptor and pancreatic

659

alpha-amylase targets, but the reported activity of lobetyolin is antibacterial.92 It was

660

interesting to find that the known drug of T2DM, glibenclamide, also has antibacterial

661

activity. The antibacterial activity is also found to be valuable in relieving diabetes

662

and the associated secondary disorders.93 According to our observations, we can find

663

that some of the predicted potential inhibitors have reported pharmacological

664

activities related to T2DM for more than one target.

665

In summary, we have examined the pharmacological activities of the multi-targets

666

compounds extensively. It was found that some pharmacological effects of the

667

predicted inhibitors of multi-targets have been reported previously, highlighting the

668

reliable predictions of the Bayesian classifiers. Furthermore, some potential inhibitors

669

predicted by our study need experimental verifications. Then, during the analysis of

670

the pharmacological activities of potential multi-target inhibitors, it was found that

671

most of these potential compounds have free radical scavenging/antioxidant activities

672

that have been proved to be effective to relieve T2DM and the related diseases of

673

T2DM.

674

The theory of TCM formulae is multi-compounds versus multi-targets. The

675

hundreds or even thousands of compounds extracted from the herbs of TCM formulae

676

interact with multi-targets to achieve synergistic effects. How the compounds from the

677

herbs of TCM formulae affect the T2DM-related targets is also unclear and needs to

678

be uncovered. According to our analysis, it was found that most of the predicted

679

potential inhibitors can only interact with a single target. However, a few of them still

680

can interact with more than one target. Some pharmacological effects, such as radical 24

ACS Paragon Plus Environment

Page 24 of 46

Page 25 of 46

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Information and Modeling

681

scavenging/antioxidant and antibacterial activities, which are not directly related to

682

the studied targets, were found. Most compounds from the herbs in the TCM formulae

683

only interact with an individual target, forming the leading fighting force to combat

684

T2DM. Then, those potential multi-target compounds may influence the

685

T2DM-related targets, forming additional forces to enhance the therapeutic effects. At

686

last, a portion of the compounds are responsible for remedying the other related

687

symptoms that are proved to be related to T2DM, such as abnormal status of free

688

radical and oxidation in the human body. All of these observation found in this study

689

can be seen as a proper way to reveal the classical theory “Monarch, Minster,

690

Assistant and Guide” in TCM prescriptions at the molecular level.

691 692

Conclusion

693

In this study, we employed structure-based and complex-based virtual screening

694

approaches to find potential inhibitors from the Chinese herbs of the classical TCM

695

formulae for T2DM against the important T2DM-related targets. In order to achieve

696

more reliable predictions, the performance of molecular docking and pharmacophore

697

mapping for the 33 T2DM-related targets was evaluated. By combining the

698

advantages of molecular docking and pharmacophore mapping together, the Bayesian

699

classifiers were developed by integrating the predictions from molecular docking and

700

pharmacophore mapping. The validation results show that the integrated models are

701

superior to that based on the prediction from molecular docking or pharmacophore

702

mapping only.

703

The Bayesian classifiers with satisfactory discrimination capabilities for 15

704

targets were employed to screen the 2,479 compounds from the Chinese herbs of the

705

TCM formulae for T2DM. The pharmaceutical activities of some compounds

706

predicted by the Bayesian classifiers are confirmed by the experimental studies. In

707

order to uncover the underlying mechanism of the TCM formulae for combating

708

T2DM, the top 10% of the potential compounds ranked by the Bayesian scores for

709

each target were extracted to build the compound–target interaction network. The 25

ACS Paragon Plus Environment

Journal of Chemical Information and Modeling

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

710

analysis of the compound-target network shows that ~19% of the potential inhibitors

711

interact with more than one target. The examination of the pharmacological activities

712

demonstrates that the biological activities for a small portion of these potential

713

multi-target inhibitors have been reported, but the underlying activities for most

714

potential multi-target inhibitors need experimental verification. Besides, it is

715

important for us to find that a variety of the potential multi-target inhibitors have free

716

radical scavenging/antioxidant activities, which are closely related to T2DM. We can

717

conclude that the pharmacological effects of the TCM formulae for T2DM are not

718

only determined by the compounds that interact directly with one or more

719

T2DM-related targets, but also by the compounds with other supplementary

720

bioactivities important for treating T2DM, such as free radical scavenging/antioxidant

721

activities. The holistic way to express join forces for treating T2DM may be the

722

essence of the TCM formulae to break the robustness of the phenotype of T2DM and

723

restore the human body to become normal physiological condition.

724 725

Acknowledgements

726

This study was supported by the National Science Foundation of China (21173156),

727

the National Basic Research Program of China (973 program, 2012CB932600), and

728

the Priority Academic Program Development of Jiangsu Higher Education Institutions

729

(PAPD)

730 731

Supporting Information

732

Table S1. The names, PDB entries and crystal resolutions for the 48 T2DM-related

733

targets; Table S2. The 32 Chinese herbs in the TCM classical formulae for T2DM

734

with the highest frequency; Table S3. The docking power and discrimination power

735

of the Glide docking for 33 T2DM-related targets; Table S4. The docking power of

736

the Glide docking based on the six crystal structures of the complexes of HMG-CoA

737

reductase; Table S5. The numbers of Total features, Feature set A and Feature set B,

738

and the selectivity and discrimination capabilities of the pharmacophore models based 26

ACS Paragon Plus Environment

Page 26 of 46

Page 27 of 46

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Information and Modeling

739

on Feature set A and Feature set B; Table S6. The comparisons of the number of

740

pharmacophore features in Total features, Feature set A and Feature set B; Table S7.

741

The comparison of the performance of the naïve Bayesian classifiers built by using

742

docking scores, fit values and the combination of docking scores and fit values; Table

743

S8. The volume and hydrophobic properties of the binding pockets and the numbers

744

of the potential inhibitors predicted by the naïve Bayesian classifiers for 15 targets;

745

Table S9. The number of the potential inhibitors from the TCM formulae that are

746

similar to the known inhibitors for 15 targets by using different thresholds of the

747

similarity based on the MDLPublicKeys fingerprints; Table S10. The number of the

748

predicted inhibitors that can interact with two, three, four and five targets; Figure S1.

749

The distributions of the docking scores of the known inhibitors and non-inhibitors by

750

using the (a) SP and (b) XP scoring modes of Glide for 1DQA of HMG-CoA

751

reductase; Figure S2. Ten known inhibitors of receptor protein-tyrosine kinase erbB-2

752

target with similar Murcko framework (colored in red); Figure S3. The distributions

753

of the eight important physicochemical properties for the 1,996 non-duplicated

754

potential inhibitors for 15 targets. This information is available free of charge via the

755

Internet at http://pubs.acs.org

756 757

References

758 759 760 761 762 763 764 765 766 767 768 769 770 771 772

1.

Smyth, S.; Heron, A. Diabetes and obesity: the twin epidemics. Nature Med. 2006, 12, 75-80.

2. Wild, S.; Roglic, G.; Green, A.; Sicree, R.; King, H. Global prevalence of diabetes - Estimates for the year 2000 and projections for 2030. Diabetes Care 2004, 27, 1047-1053. 3. Florez, H.J.; Sanchez, A.A.; Marks, J.B. Type 2 Diabetes. Diabetes and the Brain 2010, 33-53. 4. Pasquier, F. Diabetes and cognitive impairment: how to evaluate the cognitive status? Diabetes Metab. 2010, 36, S100-S105. 5. Reaven, G.M. Pathophysiology of insulin-resistance in human-disease. Physiol. Rev. 1995, 75, 473-486. 6. Ripsin, C.M.; Kang, H.; Urban, R.J. Management of Blood Glucose in Type 2 Diabetes Mellitus. Am. Fam. Physician 2009, 79, 29-36. 7. Morral, N. Novel targets and therapeutic strategies for type 2 diabetes. Trends Endocrin. Met. 2003, 14, 169-175. 8. Marcus, A.O. Safety of drugs commonly used to treat hypertension, dyslipidemia, and type 2 diabetes (the metabolic syndrome): part 1. Diabetes Technol. Ther. 2000, 2, 101-10. 9. Marcus, A.O. Safety of drugs commonly used to treat hypertension, dyslipidemia, and Type 2 27

ACS Paragon Plus Environment

Journal of Chemical Information and Modeling

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

773 774 775 776 777 778 779 780 781 782 783 784 785 786 787 788 789 790 791 792 793 794 795 796 797 798 799 800 801 802 803 804 805 806 807 808 809 810 811 812 813 814 815 816

diabetes (the metabolic syndrome): part 2. Diabetes Technol. Ther. 2000, 2, 275-81. 10. Hopkins, A.L. Network pharmacology: the next paradigm in drug discovery. Nature Chem. Biol. 2008, 4, 682-690. 11. Roth, B.L.; Sheffler, D.J.; Kroeze, W.K. Magic shotguns versus magic bullets: selectively non-selective drugs for mood disorders and schizophrenia. Nat. Rev. Drug Discovery 2004, 3, 353-359. 12. Hou, T.; Wang, J. Structure - ADME relationship: still a long way to go? Expert Opin. Drug Metab. Toxicol. 2008, 4, 759-770. 13. Hou, T.J.; Xu, X.J. Recent development and application of virtual screening in drug discovery: An overview. Curr. Pharm. Des. 2004, 10, 1011-1033. 14. Winzeler, E.A.; Shoemaker, D.D.; Astromoff, A.; Liang, H.; Anderson, K.; Andre, B.; Bangham, R.; Benito, R.; Boeke, J.D.; Bussey, H.; Chu, A.M.; Connelly, C.; Davis, K.; Dietrich, F.; Dow, S.W.; El Bakkoury, M.; Foury, F.; Friend, S.H.; Gentalen, E.; Giaever, G.; Hegemann, J.H.; Jones, T.; Laub, M.; Liao, H.; Liebundguth, N.; Lockhart, D.J.; Lucau-Danila, A.; Lussier, M.; M'Rabet, N.; Menard, P.; Mittmann, M.; Pai, C.; Rebischung, C.; Revuelta, J.L.; Riles, L.; Roberts, C.J.; Ross-MacDonald, P.; Scherens, B.; Snyder, M.; Sookhai-Mahadeo, S.; Storms, R.K.; Veronneau, S.; Voet, M.; Volckaert, G.; Ward, T.R.; Wysocki, R.; Yen, G.S.; Yu, K.X.; Zimmermann, K.; Philippsen, P.; Johnston, M.; Davis, R.W. Functional characterization of the S-cerevisiae genome by gene deletion and parallel analysis. Science 1999, 285, 901-906. 15. Kitano, H. Towards a theory of biological robustness. Mol. Syst. Biol. 2007, 3. 16. Kitano, H. Innovation - A robustness-based approach to systems-oriented drug design. Nat. Rev. Drug Discovery 2007, 6, 202-210. 17. Berger, S.I.; Iyengar, R. Network analyses in systems pharmacology. Bioinformatics 2009, 25, 2466-2472. 18. Goh, K.-I.; Cusick, M.E.; Valle, D.; Childs, B.; Vidal, M.; Barabasi, A.-L. The human disease network. Proc. Natl. Acad. Sci. U. S. A. 2007, 104, 8685-8690. 19. Yildirim, M.A.; Goh, K.-I.; Cusick, M.E.; Barabasi, A.-L.; Vidal, M. Drug-target network. Nature Biotechnol. 2007, 25, 1119-1126. 20. Newman, D.J.; Cragg, G.M. Natural Products As Sources of New Drugs over the 30 Years from 1981 to 2010. J. Nat. Prod. 2012, 75, 311-335. 21. Zhao, J.; Jiang, P.; Zhang, W. Molecular networks for the study of TCM Pharmacology. Brief. Bioinform. 2009, 11, 417-430. 22. Ji, H.-F.; Li, X.-J.; Zhang, H.-Y. Natural products and drug discovery Can thousands of years of ancient medical knowledge lead us to new and powerful drug combinations in the fight against cancer and dementia? EMBO Rep. 2009, 10, 194-200. 23. Newman, D.J.; Cragg, G.M.; Snader, K.M. The influence of natural products upon drug discovery. Nat. Prod. Rep. 2000, 17, 215-234. 24. Lukman, S.; He, Y.; Hui, S.-C. Computational methods for Traditional Chinese Medicine: A survey. Comput. Methods Programs Biomed. 2007, 88, 283-294. 25. Chang, S.-S.; Huang, H.-J.; Chen, C.Y.-C. Two Birds with One Stone? Possible Dual-Targeting H1N1 Inhibitors from Traditional Chinese Medicine. PLoS Comput. Biol. 2011, 7. 26. Chen, K.-C.; Chang, S.-S.; Tsai, F.-J.; Chen, C.Y.-C. Han ethnicity-specific type 2 diabetic treatment from traditional Chinese medicine? J. Biomol. Struct. Dyn. 2012, 1-17. 27. Tou, W.I.; Chang, S.-S.; Lee, C.-C.; Chen, C.Y.-C. Drug Design for Neuropathic Pain Regulation 28

ACS Paragon Plus Environment

Page 28 of 46

Page 29 of 46

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Information and Modeling

817 818 819 820 821 822 823 824 825 826 827 828 829 830 831 832 833 834 835 836 837 838 839 840 841 842 843 844 845 846 847 848 849 850 851 852 853 854 855 856 857 858 859 860

from Traditional Chinese Medicine. Sci. Rep. 2013, 3. 28. Chen, K.-C.; Chang, K.-W.; Chen, H.-Y.; Chen, C.Y.-C. Traditional Chinese medicine, a solution for reducing dual stroke risk factors at once? Mol. Biosyst. 2011, 7, 2711-2719. 29. Chen, X.; Ung, C.Y.; Chen, Y.Z. Can an in silico drug-target search method be used to probe potential mechanisms of medicinal plant ingredients? Nat. Prod. Rep. 2003, 20, 432-444. 30. Ehrman, T.M.; Barlow, D.J.; Hylands, P.J. In silico search for multi-target anti-inflammatories in Chinese herbs and formulas. Bioorg. Med. Chem. 2010, 18, 2204-2218. 31. Gu, J.; Zhang, H.; Chen, L.; Xu, S.; Yuan, G.; Xu, X. Drug-target network and polypharmacology studies of a Traditional Chinese Medicine for type II diabetes mellitus. Comput. Biol. Chem. 2011, 35, 293-297. 32. Harvey, A.L.; Clark, R.L.; Mackay, S.P.; Johnston, B.F. Current strategies for drug discovery through natural products. Expert Opin. Drug Discov. 2010, 5, 559-568. 33. Schuster, D.; Wolber, G. Identification of Bioactive Natural Products by Pharmacophore-Based Virtual Screening. Curr. Pharm. Des. 2010, 16, 1666-1681. 34. Tou, W.I.; Chen, C.Y.-C. In Silico Investigation of Potential Src Kinase Ligands from Traditional Chinese Medicine. PLoS One 2012, 7. 35. Li, W.L.; Zheng, H.C.; Bukuru, J.; De Kimpe, N. Natural medicines used in the traditional Chinese medical system for therapy of diabetes mellitus. J. Ethnopharmacol. 2004, 92, 1-21. 36. Herrick, T.M.; Million, R.P. Tapping the potential of fixed-dose combinations. Nat. Rev. Drug Discovery 2007, 6, 513-514. 37. Zhu, F.; Shi, Z.; Qin, C.; Tao, L.; Liu, X.; Xu, F.; Zhang, L.; Song, Y.; Liu, X.; Zhang, J.; Han, B.; Zhang, P.; Chen, Y. Therapeutic target database update 2012: a resource for facilitating target-oriented drug discovery. Nucleic Acids Res. 2012, 40, D1128-D1136. 38. Wishart, D.S.; Knox, C.; Guo, A.C.; Cheng, D.; Shrivastava, S.; Tzur, D.; Gautam, B.; Hassanali, M. DrugBank: a knowledgebase for drugs, drug actions and drug targets. Nucleic Acids Res. 2008, 36, D901-D906. 39. Kanehisa, M.; Goto, S.; Sato, Y.; Furumichi, M.; Tanabe, M. KEGG for integration and interpretation of large-scale molecular data sets. Nucleic Acids Res. 2012, 40, D109-D114. 40. Moller, D.E. New drug targets for type 2 diabetes and the metabolic syndrome. Nature 2001, 414, 821-827. 41. Berman, H.M.; Westbrook, J.; Feng, Z.; Gilliland, G.; Bhat, T.N.; Weissig, H.; Shindyalov, I.N.; Bourne, P.E. The Protein Data Bank. Nucleic Acids Res. 2000, 28, 235-242. 42. Qiao, X.B.; Hou, T.J.; Zhang, W.; Guo, S.L.; Xu, S.J. A 3D structure database of components from Chinese traditional medicinal herbs. J. Chem. Inf. Comput. Sci. 2002, 42, 481-489. 43. Shen, M.; Tian, S.; Li, Y.; Li, Q.; Xu, X.; Wang, J.; Hou, T. Drug-likeness analysis of traditional Chinese medicines: 1. property distributions of drug-like compounds, non-drug-like compounds and natural compounds from traditional Chinese medicines. J. Cheminform. 2012, 4, 1-13. 44. Tian, S.; Li, Y.; Wang, J.; Xu, X.; Xu, L.; Wang, X.; Chen, L.; Hou, T. Drug-likeness analysis of traditional Chinese medicines: 2. Characterization of scaffold architectures for drug-like compounds, non-drug-like compounds, and natural compounds from traditional Chinese medicines. J. Cheminform. 2013, 5, 1-14. 45. Tian, S.; Wang, J.; Li, Y.; Xu, X.; Hou, T. Drug-likeness Analysis of Traditional Chinese Medicines: Prediction of Drug-likeness Using Machine Learning Approaches. Mol. Pharm. 2012, 9, 2875-86. 46. Chen, C.Y.-C. TCM Database@Taiwan: The World's Largest Traditional Chinese Medicine 29

ACS Paragon Plus Environment

Journal of Chemical Information and Modeling

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

861 862 863 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879 880 881 882 883 884 885 886 887 888 889 890 891 892 893 894 895 896 897 898 899 900 901 902 903 904

Database for Drug Screening In Silico. PloS One 2011, 6. 47. Discovery Studio 3.1 Guide, Accelrys Inc., San Diego, 2012, http://www.accelrys.com. 48. Liu, T.; Lin, Y.; Wen, X.; Jorissen, R.N.; Gilson, M.K. BindingDB: a web-accessible database of experimentally determined protein-ligand binding affinities. Nucleic Acids Res. 2007, 35, D198-D201. 49. Krueger, D.M.; Evers, A. Comparison of Structure- and Ligand-Based Virtual Screening Protocols Considering Hit List Complementarity and Enrichment Factors. Chemmedchem 2010, 5, 148-158. 50. Friesner, R.A.; Murphy, R.B.; Repasky, M.P.; Frye, L.L.; Greenwood, J.R.; Halgren, T.A.; Sanschagrin, P.C.; Mainz, D.T. Extra precision glide: Docking and scoring incorporating a model of hydrophobic enclosure for protein-ligand complexes. J. Med. Chem. 2006, 49, 6177-6196. 51. Schrödinger, version 9.0, Schrödinger, LLC, New York, NY, 2009, http://www.schrodinger.com. 52. Kaminski, G.A.; Friesner, R.A.; Tirado-Rives, J.; Jorgensen, W.L. Evaluation and reparametrization of the OPLS-AA force field for proteins via comparison with accurate quantum chemical calculations on peptides. J. Phys. Chem. B 2001, 105, 6474-6487. 53. Sutter, J.; Li, J.; Maynard, A.J.; Goupil, A.; Luu, T.; Nadassy, K. New Features that Improve the Pharmacophore Tools from Accelrys. Curr. Comput. Aided Drug Des. 2011, 7, 173-180. 54. Rogers, D.; Hopfinger, A.J. Application of genetic function approximation to quantitative structure-activity-relationships and quantitative structure-property relationships. J. Chem. Inf. Comput. Sci. 1994, 34, 854-866. 55. Meslamani, J.; Li, J.; Sutter, J.; Stevens, A.; Bertrand, H.-O.; Rognan, D. Protein-Ligand-Based Pharmacophores: Generation and Utility Assessment in Computational Ligand Profiling. J. Chem. Inf. Model. 2012, 52, 943-955. 56. Chen, L.; Li, Y.; Zhao, Q.; Peng, H.; Hou, T. ADME Evaluation in Drug Discovery. 10. Predictions of P-Glycoprotein Inhibitors Using Recursive Partitioning and Naive Bayesian Classification Techniques. Mol. Pharm. 2011, 8, 889-900. 57. Wang, S.; Li, Y.; Wang, J.; Chen, L.; Zhang, L.; Yu, H.; Hou, T. ADMET Evaluation in Drug Discovery. 12. Development of Binary Classification Models for Prediction of hERG Potassium Channel Blockage. Mol. Pharm. 2012, 9, 996-1010. 58. Breitkreutz, B.J.; Stark, C.; Tyers, M. Osprey: a network visualization system. Genome Biol. 2003, 4. 59. Cheng, T.; Li, X.; Li, Y.; Liu, Z.; Wang, R. Comparative Assessment of Scoring Functions on a Diverse Test Set. J. Chem. Inf. Model. 2009, 49, 1079-1093. 60. Cross, J.B.; Thompson, D.C.; Rai, B.K.; Baber, J.C.; Fan, K.Y.; Hu, Y.; Humblet, C. Comparison of Several Molecular Docking Programs: Pose Prediction and Virtual Screening Accuracy. J. Chem. Inf. Model. 2009, 49, 1455-1474. 61. Bemis, G.W.; Murcko, M.A. The properties of known drugs .1. Molecular frameworks. J. Med. Chem. 1996, 39, 2887-2893. 62. Yang, L.; Wang, K.; Chen, J.; Jegga, A.G.; Luo, H.; Shi, L.; Wan, C.; Guo, X.; Qin, S.; He, G.; Feng, G.; He, L. Exploring Off-Targets and Off-Systems for Adverse Drug Reactions via Chemical-Protein Interactome - Clozapine-Induced Agranulocytosis as a Case Study. PLoS Comput. Biol. 2011, 7. 63. Durant, J.L.; Leland, B.A.; Henry, D.R.; Nourse, J.G. Reoptimization of MDL keys for use in drug discovery. J. Chem. Inf. Comput. Sci. 2002, 42, 1273-1280. 64. Matsuda, H.; Shimoda, H.; Morikawa, T.; Yoshikawa, M. Phytoestrogens from the roots of 30

ACS Paragon Plus Environment

Page 30 of 46

Page 31 of 46

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Information and Modeling

905 906 907 908 909 910 911 912 913 914 915 916 917 918 919 920 921 922 923 924 925 926 927 928 929 930 931 932 933 934 935 936 937 938 939 940 941 942 943 944 945 946 947 948

Polygonum cuspidatum (Polygonaceae): Structure-requirement of hydroxyanthraquinones for estrogenic activity. Bioorg. Med. Chem. Lett. 2001, 11, 1839-1842. 65. Lee, D.Y.; Kim, D.H.; Lee, H.J.; Lee, Y.; Ryu, K.H.; Jung, B.-I.; Song, Y.S.; Ryu, J.-H. New estrogenic compounds isolated from Broussonetia kazinoki. Bioorg. Med. Chem. Lett. 2010, 20, 3764-3767. 66. Roelens, F.; Heldring, N.; Dhooge, W.; Bengtsson, M.; Comhaire, F.; Gustafsson, J.-A.; Treuter, E.; De Keukeleire, D. Subtle side-chain modifications of the hop phytoestrogen 8-prenylnaringenin result in distinct agonist/antagonist activity profiles for estrogen receptors alpha and beta. J. Med. Chem. 2006, 49, 7357-7365. 67. Zhao, L.Q.; Brinton, R.D. Structure-based virtual screening for plant-based ER beta-selective ligands as potential preventative therapy against age-related neurodegenerative diseases. J. Med. Chem. 2005, 48, 3463-3466. 68. Lyons, M.M.; Yu, C.W.; Toma, R.B.; Cho, S.Y.; Reiboldt, W.; Lee, J.; Van Breemen, R.B. Resveratrol in raw and baked blueberries and bilberries. J. Agric. Food Chem. 2003, 51, 5867-5870. 69. Ferreira, C.V.; Justo, G.Z.; Souza, A.C.S.; Queiroz, K.C.S.; Zambuzzi, W.F.; Aoyama, H.; Peppelenbosch, M.P. Natural compounds as a source of protein tyrosine phosphatase inhibitors: Application to the rational design of small-molecule derivatives. Biochimie 2006, 88, 1859-1873. 70. Wang, F.-R.; Yang, X.-W.; Zhang, Y.; Liu, J.-X.; Yang, X.-B.; Liu, Y.; Shi, R.-B. Three new isoflavone glycosides from Tongmai granules. J. Asian Nat. Prod. Res. 2011, 13, 319-329. 71. Ha, D.T.; Tran Minh, N.; Lee, I.; Lee, Y.M.; Kim, J.S.; Jung, H.; Lee, S.; Na, M.; Bae, K. Inhibitors of Aldose Reductase and Formation of Advanced Glycation End-Products in Moutan Cortex (Paeonia suffruticosa). J. Nat. Prod. 2009, 72, 1465-1470. 72. Matsuda, H.; Morikawa, T.; Toguchida, I.; Yoshikawa, M. Structural requirements of flavonoids and related compounds for aldose reductase inhibitory activity. Chem. Pharm. Bull. 2002, 50, 788-795. 73. Srivastava, S.K.; Ansari, N.H. Prevention of sugar-induced cataractogenesis in rats by butylated hydroxytoluene. Diabetes 1988, 37, 1505-1508. 74. Wolff, S.P.; Crabbe, M.J.C. Low apparent aldose reductase-activity produced by monosaccharide autoxidation. Biochem. J. 1985, 226, 625-630. 75. Collier, A.; Wilson, R.; Bradley, H.; Thomson, J.A.; Small, M. Free-radical activity in type-2 diabetes. Diabetic Med. 1990, 7, 27-30. 76. Sato, Y.; Hotta, N.; Sakamoto, N.; Matsuoka, S.; Ohishi, N.; Yagi, K. Lipid peroxide level in plasma of diabetic-patients. Biochem. Med. 1979, 21, 104-107. 77. Kannel, W.B.; McGee, D.L. Diabetes and Cardiovascular Risk-Factors - Framingham Study. Circulation 1979, 59, 8-13. 78. Greene, D.A. Acute and chronic complications of diabetes-mellitus in older patients. Am. J. Med. 1986, 80, 39-53. 79. Mohora, M.; Virgolici, B.; Paveliu, F.; Lixandru, D.; Muscurel, C.; Greabu, M. Free radical activity in obese patients with type 2 diabetes mellitus. Rom. J. Intern. Med. 2006, 44, 69-78. 80. Nishimura, C.; Kuriyama, K. Alteration of lipid peroxide and endogenous antioxidant contents in retina of streptozotocin-induced diabetic rats - effect of vitamin-a administration. Jpn. J. Pharmacol. 1985, 37, 365-372. 81. Paller, M.S.; Hoidal, J.R.; Ferris, T.F. Oxygen free-radicals in ischemic acute-renal-failure in the rat. 31

ACS Paragon Plus Environment

Journal of Chemical Information and Modeling

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

949 950 951 952 953 954 955 956 957 958 959 960 961 962 963 964 965 966 967 968 969 970 971 972 973 974 975 976 977 978 979

J. Clin. Invest. 1984, 74, 1156-1164. 82. Funke, I.; Melzig, M.F. Effect of different phenolic compounds on alpha-amylase activity: screening by microplate-reader based kinetic assay. Pharmazie 2005, 60, 796-797. 83. He, Z.D.; Lau, K.M.; But, P.P.H.; Jiang, R.W.; Dong, H.; Ma, S.C.; Fung, K.P.; Ye, W.C.; Sun, H.D. Antioxidative glycosides from the leaves of Ligustrum robustum. J. Nat. Prod. 2003, 66, 851-854. 84. Jang, D.S.; Lee, Y.M.; Jeong, I.H.; Kim, J.S. Constituents of the Flowers of Platycodon grandiflorum with Inhibitory Activity on Advanced Glycation End Products and Rat Lens Aldose Reductase In Vitro. Arch. Pharmacal Res. 2010, 33, 875-880. 85. Wang, N.; Yang, X.-W. Two new flavonoid glycosides from the whole herbs of Hyssopus officinalis. J. Asian Nat. Prod. Res. 2010, 12, 1044-1050. 86. Xie, H.H.; Wang, T.; Matsuda, H.; Morikawa, T.; Yoshikawa, M.; Tani, T. Bioactive constituents from Chinese natural medicines. XV.(1)) - Inhibitory effect on aldose reductase and structures of saussureosides A and B from Saussurea medusa. Chem. Pharm. Bull. 2005, 53, 1416-1422. 87. Fang, H.; Tong, W.D.; Shi, L.M.; Blair, R.; Perkins, R.; Branham, W.; Hass, B.S.; Xie, Q.; Dial, S.L.; Moland, C.L.; Sheehan, D.M. Structure-activity relationships for a large diverse set of natural, synthetic, and environmental estrogens. Chem. Res. Toxicol. 2001, 14, 280-294. 88. Dong, Y.; Shi, H.; Yang, H.; Peng, Y.; Wang, M.; Li, X. Antioxidant Phenolic Compounds from the Stems of Entada phaseoloides. Chem. Biodiv. 2012, 9, 68-79. 89. Khairullina, V.R.; Gerchikov, A.Y.; Denisova, S.B. Comparative study of the antioxidant properties of selected flavonols and flavanones. Kinet. Catal. 2010, 51, 219-224. 90. Kong, D.; Zhang, Y.; Yamori, T.; Duan, H.; Jin, M. Inhibitory Activity of Flavonoids against Class I Phosphatidylinositol 3-Kinase Isoforms. Molecules 2011, 16, 5159-5167. 91. Mueller, M.; Lukas, B.; Novak, J.; Simoncini, T.; Genazzani, A.R.; Jungbauert, A. Oregano: A Source for Peroxisome Proliferator-Activated Receptor gamma Antagonists. J. Agric. Food Chem. 2008, 56, 11621-11630. 92. Mei, R.-Q.; Lu, Q.; Hu, Y.-F.; Liu, H.-Y.; Bao, F.-K.; Zhang, Y.; Cheng, Y.-X. Three new polyyne (= polyacetylene) glucosides from the edible roots of Codonopsis cordifolioidea. Helv. Chim. Acta. 2008, 91, 90-96. 93. Ilango, K.; Chitra, V.; Kanimozhi, P.; Balaji, G. Antidiabetic, antioxidant and antibacterial activities of leaf extracts of Adhatoda zeylanica. Medic (Acanthaceae). J. Pharm. Sci. Res. 2009, 1, 67-73.

980

Legend of figures

981

Figure 1. The distributions of the docking scores of the validation datasets for (a) four

982

targets (PPAR-gamma, PPAR-alpha, DPP-4 and insulin receptor) with low p-values,

983

and (b) four targets (carbonic anhydrase 2, pyruvate dehydrogenase [lipoamide]

984

kinase isozyme 1, transthyretin and corticosteroid 11-beta-dehydrogenase, isozyme 1)

985

with high p-values.

986

Figure 2. (a) The comparison of the numbers of the pharmacophore features in Total

987

features, Feature set A and Feature set B, and (b) The comparison of the 32

ACS Paragon Plus Environment

Page 32 of 46

Page 33 of 46

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Information and Modeling

988

discrimination capacity indicated by the AUC values of the pharmacophore models

989

based on Feature set A and Feature set B.

990

Figure 3. The comparison of the discrimination capability indicated by the AUC

991

values for the naïve Bayesian classifiers based on fit values, docking scores and the

992

combination of fit values and docking scores.

993

Figure 4. The distributions of Bayesian scores of the inhibitors and non-inhibitors and

994

the AUC value of the naïve Bayesian classifiers based on (a) docking scores, (b) fit

995

values and (c) the combination of docking scores and fit values for insulin-like growth

996

factor 1 receptor (IGF-1R).

997

Figure 5. (a) Eight potential inhibitors predicted by the Bayesian classifier are

998

identical to the known inhibitors of estrogen receptor, and (b) three potential

999

inhibitors predicted by the Bayesian classifier are similar to the known inhibitors of

1000

estrogen receptor.

1001

Figure 6. Eighteen potential inhibitors from the TCM formulae predicted by the

1002

Bayesian classifier are similar to the known inhibitor, myricetin, of insulin receptor.

1003

Figure 7. The number of the potential inhibitors ranked by the Bayesian scores for

1004

establishing the compounds-targets network.

1005

Figure 8. The distribution of the number of the potential multi-target inhibitors.

1006

Figure 9. (a) The compound-target network for all potential inhibitors, (b) for those

1007

against no less than targets, (c) for those against no less than three targets, and (d) for

1008

those against no less than four targets.

1009

Figure 10. The molecule, Pueroside A, was predicted to an inhibitor of three targets

1010

(carbonic anhydrase, PPAR delta and PPAR gamma). (a) schematic representation of

1011

the interaction between Pueroside A and three targets, and (b) the known inhibitors

1012

that have the maximum similarity to Pueroside A based on the MDLPulicKeys

1013

fingerprints for three targets, respectively.

1014

Figure 11. The molecules, Epicatechin and (2RS)-2-(4-O-β-D-glucopyranosylphenyl)

1015

propionyl β-D-glucopyranoside were predicted to be the inhibitors of two targets. (a)

1016

schematic representation of the interactions between Epicatechin and carbonic 33

ACS Paragon Plus Environment

Journal of Chemical Information and Modeling

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

2

and

estrogen

Page 34 of 46

1017

anhydrase

receptor,

and

those

between

1018

(2RS)-2-(4-O-β-D-glucopyranosylphenyl)

propionyl

β-D-glucopyranoside

1019

carbonic anhydrase 1 and carbonic anhydrase 2 targets, and (b) the known inhibitors

1020

of carbonic anhydrase 2 and estrogen receptor that have the maximum similarity to

1021

Epicatechin; the known inhibitors of carbonic anhydrase 1 and carbonic anhydrase

1022

that have the maximum similarity to (2RS)-2-(4-O-β-D-glucopyranosylphenyl)

1023

propionyl β-D-glucopyranoside based on the MDLPublicKeys fingerprints.

and

1024 1025 1026 1027 1028 1029 1030 1031 1032 1033 1034 1035 1036 1037 1038

Table 1. The performances of the naïve Bayesian classifiers by using fit values,

1039

docking scores and the combination of fit values and docking scores for 24 targets No

Target name

Ninh

Nnon

AUC Fit value

Score

Combined

1

Carbonic anhydrase 1

52

1040

0.908

0.841

0.942

2

Carbonic anhydrase 2

91

1820

0.803

0.500

0.867

3

Growth factor receptor-bound protein 2

16

320

0.687

0.597

0.741

4

Peroxisome proliferator-activated receptor gamma

500

10000

0.532

0.795

0.837

5

Glycogen synthase kinase-3 β

69

1380

0.833

0.916

0.971

6

Aldose reductase

145

2900

0.717

0.836

0.892

7

Pancreatic α-amylase

22

440

0.795

0.743

0.865

34

ACS Paragon Plus Environment

Page 35 of 46

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Information and Modeling

8

Peroxisome proliferator-activated receptor delta

224

4480

0.624

0.781

0.830

59

1180

0.641

0.573

0.672

0.614

0.741

0.814

(PPAR-δ) 9

Pyruvate dehydrogenase [lipoamide] kinase isozyme 1

10

Insulin receptor

370

7400

11

Insulin-like growth factor 1 receptor

177

2540

0.788

0.729

0.904

12

mRNA of Protein-tyrosine phosphatase,

34

680

0.604

0.759

0.596

263

5260

0.533

0.809

0.847

0.856

0.896

non-receptor type 1 (PTP1B) 13

Peroxisome proliferator-activated receptor α (PPAR-α)

14

Glucokinase (Hexokinase D)

314

6280

0.624

15

Protein Kinase C, alpha type

500

10000

0.572

0.600

0.683

16

Mitogen-activated protein kinase 10

500

10000

0.573

0.655

0.720

17

Corticosteroid 11-beta-dehydrogenase, isozyme 1

500

10000

0.526

0.578

0.604

(11-Beta hydroxysteroid dehydrogenase 1) 18

Mitogen-activated protein kinase 8

500

10000

0.551

0.601

0.661

19

Mitogen-activated protein kinase 1

130

2600

0.562

0.668

0.722

20

Receptor protein-tyrosine kinase erbB-2

13

260

0.669

0.937

0.957

21

Estrogen receptor

500

10000

0.676

0.897

0.940

22

Glucocorticoid receptor

115

2300

0.628

0.736

0.788

23

Fructose-1,6-bisphosphatase 1 (FBPase)

89

1780

0.740

0.821

0.856

24

Phosphatidylinositol-4,5-bisphosphate 3-kinase

16

320

0.840

0.898

0.934

catalytic subunit, delta isoform (Phosphoinositide 3-kinase delta)

1040 1041 1042

35

ACS Paragon Plus Environment

Journal of Chemical Information and Modeling

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

1043 1044

Figure 1

1045

36

ACS Paragon Plus Environment

Page 36 of 46

Page 37 of 46

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Information and Modeling

1046 1047

Figure 2

1048

37

ACS Paragon Plus Environment

Journal of Chemical Information and Modeling

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

1049 1050

Figure 3

1051

38

ACS Paragon Plus Environment

Page 38 of 46

Page 39 of 46

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Information and Modeling

1052 1053

Figure 4

1054

39

ACS Paragon Plus Environment

Journal of Chemical Information and Modeling

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

1055 1056

Figure 5

40

ACS Paragon Plus Environment

Page 40 of 46

Page 41 of 46

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Information and Modeling

1057 1058

Figure 6

1059

41

ACS Paragon Plus Environment

Journal of Chemical Information and Modeling

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

1060 1061

Figure 7

1062 1063 1064

42

ACS Paragon Plus Environment

Page 42 of 46

Page 43 of 46

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Information and Modeling

1065 1066

Figure 8

1067 1068 1069 1070 1071

43

ACS Paragon Plus Environment

Journal of Chemical Information and Modeling

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

1072 1073

Figure 9

1074 1075 1076 1077 1078 1079

44

ACS Paragon Plus Environment

Page 44 of 46

Page 45 of 46

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Information and Modeling

1080 1081

Figure 10

45

ACS Paragon Plus Environment

Journal of Chemical Information and Modeling

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

1082 1083

Figure 11

46

ACS Paragon Plus Environment

Page 46 of 46