Article
ADMET Evaluation in Drug Discovery. 11. PharmacoKinetics Knowledge Base (PKKB) - a comprehensive database of pharmacokinetic and toxic properties for drugs Dongyue Cao, Junmei Wang, Rui Zhou, Youyong Li, Huidong Yu, and Tingjun Hou J. Chem. Inf. Model., Just Accepted Manuscript • DOI: 10.1021/ci300112j • Publication Date (Web): 06 May 2012 Downloaded from http://pubs.acs.org on May 7, 2012
Journal of Chemical Information and Modeling is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.
Journal of Chemical Information and Modeling
ADMET Evaluation in Drug Discovery. 11. PharmacoKinetics Knowledge Base (PKKB) - a comprehensive database of pharmacokinetic and toxic properties for drugs
Dongyue Cao (a,†), Junmei Wang (c,†), Rui Zhou (a), Youyong Li (a), Huidong Yu (a), Tingjun Hou (a,b,*)
(a) Institute of Functional Nano & Soft Materials (FUNSOM) and Jiangsu Key Laboratory for Carbon-Based Functional Materials & Devices, Soochow University, Suzhou, Jiangsu 215123, China
(b) College of Pharmaceutical Science, Soochow University, Suzhou, Jiangsu 215123, China
(c) Department of Biochemistry, The University of Texas Southwestern Medical Center, 5323 Harry Hines Blvd., Dallas, TX 75390
(†) Co-first authors
(*) Corresponding author: Tingjun Hou
E-mail: [email protected] or [email protected]
Post address: Institute of Functional Nano & Soft Materials (FUNSOM), Soochow University, Suzhou 215123, P. R. China.
Keywords: PKKB; ADMET; ADME; Toxicity; Database; CADD
Abstract
High-quality, extensive experimental ADMET (absorption, distribution, metabolism, excretion and toxicity) data are critical for developing reliable in silico ADMET models. Here we develop the PharmacoKinetics Knowledge Base (PKKB) to compile comprehensive information about ADMET properties into a single electronic repository. We incorporate more than 10000 experimental ADMET measurements of 1685 drugs into PKKB. The ADMET properties in PKKB include octanol/water partition coefficient, solubility, dissociation constant, intestinal absorption, Caco-2 permeability, human bioavailability, plasma protein binding, blood-plasma partitioning ratio, volume of distribution, metabolism, half-life, excretion, urinary excretion, clearance, toxicity, half lethal dose in rat or mouse, etc. PKKB provides the most extensive collection of freely available ADMET data to date. All these ADMET properties, as well as the pharmacological information and the calculated physicochemical properties, are integrated into a web-based information system. Ten separate datasets, covering octanol/water partition coefficient, solubility, blood-brain partitioning, intestinal absorption, Caco-2 permeability, human oral bioavailability and P-glycoprotein inhibitors, are provided for free download and can be used directly for ADMET modeling. PKKB is available online at http://cadd.suda.edu.cn/admet.
Introduction
Drug discovery and development is a time-consuming and expensive process. It has been estimated that 40-60% of new chemical entity (NCE) failures can be attributed to poor ADMET profiles [1, 2]. ADMET properties can be predicted from chemical structures, so that a huge number of compounds can be evaluated before they are synthesized and assayed [3-5]. Theoretical predictions of ADMET properties have proved to be efficient in recent years [3-7]. The lack of sufficient high-quality experimental data for training reliable models has been the major hurdle in modeling ADMET properties [6, 8, 9]. When the sample size used in training is limited, in silico models cannot give robust and accurate predictions, especially for ADMET properties involving complex processes, such as bioavailability, metabolism and toxicity. Traditionally, the experimental datasets available for ADMET modeling in the public domain have often been limited in quantity and quality. This is particularly true for in vivo properties obtained directly from humans, where data are typically available only for compounds in clinical development [8]. Encouragingly, the number of large datasets has been growing in recent years. For example, three extensive datasets for intestinal absorption, oral bioavailability in humans and P-gp inhibitors were reported by our group [10-13]. Nevertheless, further improvement in the availability of ADMET data for public use is still necessary [6].
As more ADMET data become available, it will be helpful to integrate data for a variety of ADMET properties from different sources into a single information system. PK/DB, reported by Moda and coworkers, is one information system providing this service [14]. PK/DB incorporates 1389 compounds and 4141 pharmacokinetic measurements for eight ADME properties. However, the data in PK/DB were taken directly from publicly available ADME datasets without careful curation. For instance, two core datasets in PK/DB, the intestinal absorption dataset with 687 molecules and the oral bioavailability dataset with 660 molecules, were taken directly from the datasets reported by us [11, 12]. Thus PK/DB provides only limited data.
Here, we develop PKKB (PharmacoKinetics Knowledge Base) to house 2-D and 3-D chemical structures, pharmacological information, experimental or calculated physicochemical properties, and, in particular, high-quality ADMET data for drugs. PKKB incorporates the most extensive collection of ADMET data in the public domain, which is much larger than that in PK/DB: the total number of experimental data in PKKB is more than 10000, in comparison with about 4000 in PK/DB. Most data in PKKB were collected by us to develop ADMET prediction models, and the experimental data were carefully curated during the modeling process [10-13, 15-20]. The datasets in PKKB developed by us have been widely used by well-known information systems, such as DrugBank [21], KnowItAll [22] and PK/DB [14], and by many scientists from pharmaceutical companies and academia. The extensive feedback from users is very helpful for improving the data in PKKB. We frequently update the data in PKKB, which is important for improving data quality and reliability. As a publicly available online database, PKKB will be continuously maintained and updated.
Methodology
The content of the database
PKKB is hosted at http://cadd.suda.edu.cn/admet. The ADMET data currently in PKKB cover 1685 drugs. All the FDA-approved small-molecule drugs found in DrugBank have been collected into PKKB [21]. We categorize the data fields for each molecule into four groups: general information, pharmacological information, physicochemical properties and ADMET properties (Table 1). The general information for each molecule includes the molecular name, synonyms, CAS number and DrugBank ID. The pharmacological information includes the status, the administration route and the pharmacological effect. The physicochemical properties include the experimental octanol/water partition coefficient (logP), the experimental aqueous solubility (logS), the experimental dissociation constant (pKa), the calculated octanol/water partition coefficient (logP), the calculated octanol/water distribution coefficient at pH 7 (logD), the calculated aqueous solubility, the number of hydrogen bond donors, the number of hydrogen bond acceptors, the number of rotatable bonds and the topological polar surface area (TPSA). We performed all the calculations with ACD/Labs (version 12.0). The ADMET properties for each molecule in PKKB are categorized into five parts: absorption, distribution, metabolism, excretion and toxicity. The properties associated with absorption include the absolute value of intestinal absorption, the description of intestinal absorption, Caco-2 permeability and bioavailability in humans. The properties associated with distribution include plasma protein binding, volume of distribution (Vd) and blood/plasma partitioning ratio (D-blood). The properties associated with metabolism include general metabolism information and half-life; the properties associated with excretion include excretion route, urinary excretion and clearance. The properties associated with toxicity include general toxicity information, LD50 in rat and LD50 in mouse. The distributions of ten ADMET properties are shown in Figure 1.
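The grouping of data fields described above can be illustrated with a small record type. This is only a sketch of one possible in-memory layout, not the PKKB schema itself: the class name `DrugRecord`, the attribute names and the property keys are all hypothetical and were chosen for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# The five ADMET categories named in the text.
ADMET_CATEGORIES = ("absorption", "distribution", "metabolism", "excretion", "toxicity")


@dataclass
class DrugRecord:
    """One molecule, with fields grouped as in the text (names are hypothetical)."""
    # General information
    name: str
    synonyms: List[str] = field(default_factory=list)
    drugbank_id: Optional[str] = None
    # Pharmacological information
    status: Optional[str] = None
    administration_route: Optional[str] = None
    # Physicochemical properties, keyed by a hypothetical field name
    physicochemical: Dict[str, Optional[float]] = field(default_factory=dict)
    # ADMET properties: category -> {property name -> value, or None if empty}
    admet: Dict[str, Dict[str, Optional[float]]] = field(
        default_factory=lambda: {category: {} for category in ADMET_CATEGORIES}
    )

    def missing_admet_fields(self) -> List[str]:
        """Return 'category.property' for every ADMET value that is still empty."""
        return [
            f"{category}.{prop}"
            for category, props in self.admet.items()
            for prop, value in props.items()
            if value is None
        ]


# Usage with made-up values (not taken from PKKB)
record = DrugRecord(name="example drug")
record.admet["absorption"]["intestinal_absorption_percent"] = 95.0
record.admet["distribution"]["volume_of_distribution"] = None
print(record.missing_admet_fields())  # ['distribution.volume_of_distribution']
```

Keeping empty fields as explicit `None` values makes it straightforward to list which measurements are still missing for a given molecule.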
We also release eleven ADMET datasets reported by us in PKKB. These datasets include three solubility datasets of 1290, 1708 and 1210 molecules, respectively [16, 23, 24]; a Caco-2 permeability dataset of 100 molecules [20]; a blood-brain partitioning dataset of 109 molecules [18]; a P-gp inhibitor dataset of 1302 molecules [10]; an intestinal absorption dataset of 647 molecules [11, 15]; a bioavailability dataset of 1013 molecules [12, 13]; a hERG blocker dataset of 806 molecules [25]; a combined dataset of 470 compounds with both intestinal absorption data and oral bioavailability data; and a combined dataset of 69 compounds with both intestinal absorption data and Caco-2 permeability data [11-13, 20]. All these datasets have been post-processed and can be used directly for ADMET modeling. In the near future, more datasets will be added to the dataset collection in PKKB. These datasets are important for those who are interested in benchmarking experimental results, validating the accuracy of existing ADMET predictive models, and building new predictive models.
It should be noted that the purpose of PKKB is quite different from those of ChEMBL [26], ChemSpider [27] and DrugBank [21]. ChemSpider is a free chemical structure database with more than 25 million molecules, and ChEMBL is a free chemical database of bioactive drug-like small molecules. DrugBank is a popular information system for drugs, developed primarily to provide chemical structures, pharmacological data and drug targets. ChemSpider does not provide ADMET information of value for modeling. Some ADMET data are included in DrugBank and ChEMBL, but they are quite limited. In comparison, PKKB provides comprehensive data for ADMET modeling.
Implementation
We developed an integrated data structure and a variety of querying functions to allow easy and efficient retrieval of ADMET data (Figure 2). PKKB is installed on Windows Server workstations. MySQL 5.1.46 is used as the relational database management system (RDBMS). Apache Tomcat 6.0.26 is used as the web server platform. We use J2EE (Spring 3.0.5 + Hibernate 3.6.3), jQuery and DHTML for the web interface. To provide user-friendly searching and retrieval, PKKB offers interactive web interfaces based on the graphical structure editor MarvinSketch and the substructure matching algorithm implemented in Open Babel 2.2.3.
Database access and database query
All querying functions in PKKB are available to everyone with internet access. However, registration is highly encouraged, because registered users can download search results through download hyperlinks that are unavailable to non-registered users. Downloadable results are exported as an SDF file for maximum flexibility of use in other applications. Everyone can download the ten separate ADMET datasets.
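Because search results are exported in SDF format, they can be read with a few lines of code even without a cheminformatics toolkit. The sketch below assumes only the general SDF layout (records terminated by `$$$$`, data items introduced by a `> <TAG>` header and ended by a blank line). The tag names `NAME` and `LOGP_EXPERIMENTAL` are hypothetical; the tags actually written by PKKB are not specified in the text.

```python
from typing import Dict, List


def read_sdf_properties(text: str) -> List[Dict[str, str]]:
    """Collect the data items (tag -> value) of every record in an SDF string."""
    records = []
    for block in text.split("$$$$"):
        if not block.strip():
            continue  # skip the empty remainder after the last record
        lines = block.strip("\n").splitlines()
        props: Dict[str, str] = {}
        i = 0
        while i < len(lines):
            line = lines[i]
            if line.startswith(">") and "<" in line:
                start = line.index("<")
                tag = line[start + 1:line.index(">", start)]
                i += 1
                values = []
                # A data item runs until the next blank line.
                while i < len(lines) and lines[i].strip():
                    values.append(lines[i])
                    i += 1
                props[tag] = "\n".join(values)
            else:
                i += 1
        records.append(props)
    return records


# Usage with a made-up, atom-free record (hypothetical tag names)
sample = """example_drug
  sketch

  0  0  0  0  0  0  0  0  0  0999 V2000
M  END
> <NAME>
example_drug

> <LOGP_EXPERIMENTAL>
1.23

$$$$
"""
print(read_sdf_properties(sample))
```

For real work a toolkit such as Open Babel would also parse the structure block; this reader deliberately ignores it and returns only the property fields.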
PKKB provides registered users with an integrated searching interface for both structure and text searches (Figure 3a). The results of such searches lead users to a “Molecule list” view (Figure 3b), which shows basic information about each molecule, including 2-D structure, name, molecular weight, DrugBank ID and CAS number. Users can click the hyperlink of a particular molecule to get more detailed information, namely 2-D and 3-D structures, molecular properties and ADMET properties, which are displayed on its information webpage (Figures 3c and 3d).
PKKB incorporates a web-based query tool supported by MarvinSketch and Open Babel 2.2.3. The molecular drawing interface, MarvinSketch, allows users to quickly draw molecules using basic GUI functions and advanced functionalities, including sprout drawing, customizable shortcuts, abbreviated groups, default and user-defined templates and context-sensitive popup menus. SMARTS rules allow users to define specific or generic queries. Structural searches include exact, substructure and similarity searches, all supported by Open Babel 2.2.3. The similarity between the query and each molecule is measured by the Tanimoto coefficient based on the FP2 fingerprint. Users can also perform a structural search by inputting a molecule in any of the molecular formats supported by MarvinSketch.
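The Tanimoto coefficient used for similarity searching is the number of bits set in both fingerprints divided by the number of bits set in either one. The sketch below illustrates the calculation and a threshold-based ranking on toy fingerprints represented as sets of on-bit positions. It does not generate FP2 fingerprints (in PKKB these come from Open Babel), and the function names, molecule names and the 0.5 threshold are assumptions made for this example.

```python
from typing import Dict, List, Set, Tuple


def tanimoto(bits_a: Set[int], bits_b: Set[int]) -> float:
    """Tanimoto coefficient: shared on-bits / on-bits present in either fingerprint."""
    if not bits_a and not bits_b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(bits_a & bits_b)
    return shared / (len(bits_a) + len(bits_b) - shared)


def similarity_search(
    query: Set[int], library: Dict[str, Set[int]], threshold: float
) -> List[Tuple[str, float]]:
    """Return (name, score) pairs with score >= threshold, best match first."""
    scored = [(name, tanimoto(query, bits)) for name, bits in library.items()]
    hits = [hit for hit in scored if hit[1] >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Usage with toy fingerprints (not real FP2 output)
library = {
    "mol_a": {1, 2, 3, 4},
    "mol_b": {1, 2, 3, 5},
    "mol_c": {7, 8},
}
print(similarity_search({1, 2, 3, 4}, library, threshold=0.5))
# [('mol_a', 1.0), ('mol_b', 0.6)]
```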
PKKB provides rapid text searching functions for molecular name, DrugBank ID and CAS number. Moreover, for several molecular properties, including molecular weight, predicted logD, experimental logP, predicted logP, predicted solubility, number of hydrogen-bond acceptors, number of hydrogen-bond donors, number of rotatable bonds, intestinal absorption and bioavailability, users can search by numerical interval to find target molecules.
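A numerical interval search of this kind amounts to keeping the molecules whose values fall inside every requested range. The following is a minimal in-memory sketch, assuming hypothetical property keys and made-up values; in PKKB itself the query is presumably executed by the database rather than in application code.

```python
from typing import Dict, List, Optional, Tuple

Properties = Dict[str, Optional[float]]
Interval = Tuple[Optional[float], Optional[float]]  # (low, high); None = unbounded


def in_interval(value: Optional[float], interval: Interval) -> bool:
    """True if value is present and lies within the closed interval."""
    low, high = interval
    if value is None:
        return False  # an empty field cannot satisfy a numeric criterion
    if low is not None and value < low:
        return False
    if high is not None and value > high:
        return False
    return True


def interval_search(
    molecules: Dict[str, Properties], criteria: Dict[str, Interval]
) -> List[str]:
    """Names of molecules that satisfy every interval criterion."""
    return [
        name
        for name, props in molecules.items()
        if all(in_interval(props.get(key), rng) for key, rng in criteria.items())
    ]


# Usage with made-up values (hypothetical property keys)
molecules = {
    "mol_a": {"molecular_weight": 180.2, "logp_predicted": 1.2},
    "mol_b": {"molecular_weight": 450.5, "logp_predicted": 3.9},
    "mol_c": {"molecular_weight": 310.0, "logp_predicted": None},
}
criteria = {"molecular_weight": (150.0, 500.0), "logp_predicted": (1.0, None)}
print(interval_search(molecules, criteria))  # ['mol_a', 'mol_b']
```

Treating an empty field as a non-match is a design choice of this sketch; a search tool could equally report such molecules separately.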
Database management
The administrator of PKKB can activate the database management system, which includes three management interfaces: User Management, Role Management and Compound Management. The User Management interface is used to manage registered user accounts, such as organizing the user's name, email and phone number. The Role Management interface is used to set up user permissions. The Compound Management interface is used to add or remove compounds in PKKB (Figure 4a). In the Compound Management interface, administrators can run a text search to find a specific compound to be edited (Figure 4a) and can edit the available properties of each molecule when necessary (Figure 4b); furthermore, the structure of a compound can be edited in a molecular drawing interface supported by MarvinSketch (Figure 4b). These functions are helpful for the continuous maintenance of PKKB.
Conclusions and future development
In summary, we have developed the PKKB database, a unique knowledge environment for ADMET properties. PKKB provides structures, pharmacological information, important experimental or predicted physicochemical properties, and experimental ADMET data for 1685 drugs. PKKB integrates both predicted and experimental information into a single public resource. We expect that the rich content of PKKB will help researchers to develop more reliable ADMET prediction models in the near future. With the extensive data in PKKB, it is feasible to develop more sophisticated models for more “complex” ADMET properties, such as bioavailability, clearance and metabolism. For example, from the combined dataset of 470 compounds with both intestinal absorption data and oral bioavailability data, we may develop rules or models for the first-pass metabolism effect.
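One simple way to see why the combined dataset is informative uses the standard pharmacokinetic decomposition F = Fa × Fg × Fh, where F is the oral bioavailability, Fa the fraction absorbed, and Fg and Fh the fractions escaping gut-wall and hepatic first-pass extraction. When both F and Fa are known, the first-pass loss can be estimated as 1 − F/Fa. The sketch below applies this textbook relation; it is not the authors' model, and the function name and the input values are assumptions made for illustration.

```python
from typing import Optional


def first_pass_loss(absorption_percent: float, bioavailability_percent: float) -> Optional[float]:
    """Estimate the fraction lost to first-pass extraction as 1 - F/Fa.

    Both inputs are percentages of the oral dose. Returns None when nothing
    is absorbed, because the ratio is then undefined.
    """
    if absorption_percent <= 0:
        return None
    escaped = bioavailability_percent / absorption_percent
    # Experimental noise can make F exceed Fa; cap the escaping fraction at 1.
    escaped = min(escaped, 1.0)
    return 1.0 - escaped


# Usage with made-up values: 90% absorbed, 45% bioavailable
print(first_pass_loss(90.0, 45.0))  # 0.5
```

A rule-based classifier could then, for example, flag compounds whose estimated loss exceeds a chosen cutoff, although any such cutoff would have to be justified against the data.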
We are continuing to improve PKKB in the following directions. First, the quality of the collected data needs to be improved further. Reliable data are usually generated under a single experimental protocol or even a single experimental assay. Unfortunately, the data in PKKB were assembled from various sources and are subject to variability due to experimental conditions and inter-laboratory errors. We will carefully check the reliability of the data from different sources and ensure the reliability of the data in PKKB as far as we can. Second, to better characterize each entry, a complete record of that entry is needed, i.e., all the fields for each molecule need to be filled in. Although PKKB already provides extensive data for many ADMET properties, some data fields are far from “complete”. An empty data field usually indicates that the data have not been measured or reported. In many cases, however, the data probably exist elsewhere, but our ADMET team has not yet found and validated them. Moreover, the coverage of ADMET properties in PKKB is still limited. In the new version of PKKB, some important ADMET properties will be added, such as genotoxicity, aquatic toxicity, eye irritation, skin irritation, and P450 inhibitors and substrates. In addition to missing data, PKKB is also missing some drug-like molecules. Currently more than 1000 drug candidates in clinical trials are on the PKKB ‘to do’ list and will be added shortly. Third, we will soon replace Open Babel 2.2.3 with MORT (Molecular Objects and Relevant Templates), developed in our group. MORT, the foundation library of gleap in AmberTools [28], has already been released. A Java version of MORT, currently under development, will be used by PKKB. Finally, we plan to provide online prediction models for several important ADMET properties in the next version of PKKB.
Acknowledgement
The project is supported by the National Science Foundation of China (Grant No. 20973121), the National Basic Research Program of China (973 program, 2012CB932600 to T. Hou), the NIH (R21GM097617 to J. Wang) and the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD).
References
1. Kennedy, T. Managing the drug discovery/development interface. Drug Discovery Today 1997, 2, 436-444.
2. Kola, I.; Landis, J. Can the pharmaceutical industry reduce attrition rates? Nature Reviews Drug Discovery 2004, 3, 711-715.
3. Hou, T. J. In Silico Predictions of ADME/T Properties: Progress and Challenges. Combinatorial Chemistry & High Throughput Screening 2011, 14, 306-306.
4. Hou, T. J.; Wang, J. M.; Zhang, W.; Wang, W.; Xu, X. Recent advances in computational prediction of drug absorption and permeability in drug discovery. Current Medicinal Chemistry 2006, 13, 2653-2667.
5. Zhu, J. Y.; Wang, J. M.; Yu, H. D.; Li, Y. Y.; Hou, T. J. Recent Developments of In Silico Predictions of Oral Bioavailability. Combinatorial Chemistry & High Throughput Screening 2011, 14, 362-374.
6. Hou, T.; Wang, J. Structure - ADME relationship: still a long way to go? Expert Opinion on Drug Metabolism & Toxicology 2008, 4, 759-770.
7. van de Waterbeemd, H.; Gifford, E. ADMET in silico modelling: Towards prediction paradise? Nature Reviews Drug Discovery 2003, 2, 192-204.
8. Gola, J.; Obrezanova, O.; Champness, E.; Segall, M. ADMET property prediction: The state of the art and current challenges. QSAR & Combinatorial Science 2006, 25, 1172-1180.
9. Dearden, J. C. In silico prediction of ADMET properties - How far have we come? Expert Opinion on Drug Metabolism & Toxicology 2007, 3, 635-639.
10. Chen, L.; Li, Y. Y.; Zhao, Q.; Peng, H.; Hou, T. J. ADME Evaluation in Drug Discovery. 10. Predictions of P-Glycoprotein Inhibitors Using Recursive Partitioning and Naive Bayesian Classification Techniques. Molecular Pharmaceutics 2011, 8, 889-900.
11. Hou, T. J.; Wang, J. M.; Zhang, W.; Xu, X. J. ADME evaluation in drug discovery. 7. Prediction of oral absorption by correlation and classification. Journal of Chemical Information and Modeling 2007, 47, 208-218.
12. Hou, T. J.; Wang, J. M.; Zhang, W.; Xu, X. J. ADME evaluation in drug discovery. 6. Can oral bioavailability in humans be effectively predicted by simple molecular property-based rules? Journal of Chemical Information and Modeling 2007, 47, 460-463.
13. Tian, S.; Li, Y. Y.; Wang, J. M.; Zhang, J.; Hou, T. J. ADME Evaluation in Drug Discovery. 9. Prediction of Oral Bioavailability in Humans Based on Molecular Properties and Structural Fingerprints. Molecular Pharmaceutics 2011, 8, 841-851.
14. Moda, T. L.; Torres, L. G.; Carrara, A. E.; Andricopulo, A. D. PK/DB: database for pharmacokinetic properties and predictive in silico ADME models. Bioinformatics 2008, 24, 2270-2271.
15. Hou, T. J.; Wang, J. M.; Li, Y. Y. ADME evaluation in drug discovery. 8. The prediction of human intestinal absorption by a support vector machine. Journal of Chemical Information and Modeling 2007, 47, 2408-2415.
16. Hou, T. J.; Xia, K.; Zhang, W.; Xu, X. J. ADME evaluation in drug discovery. 4. Prediction of aqueous solubility based on atom contribution approach. Journal of Chemical Information and Computer Sciences 2004, 44, 266-275.
17. Hou, T. J.; Xu, X. J. ADME evaluation in drug discovery - 1. Applications of genetic algorithms to the prediction of blood-brain partitioning of a large set of drugs. Journal of Molecular Modeling 2002, 8, 337-349.
18. Hou, T. J.; Xu, X. J. ADME evaluation in drug discovery. 3. Modeling blood-brain barrier partitioning using simple molecular descriptors (vol 43, 2137, 2003). Journal of Chemical Information and Computer Sciences 2004, 44, 766-770.
19. Hou, T. J.; Xu, X. J. ADME evaluation in drug discovery. 2. Prediction of partition coefficient by atom-additive approach based on atom-weighted solvent accessible surface areas (vol 43, pg 1058, 2003). Journal of Chemical Information and Computer Sciences 2004, 44, 1516-1516.
20. Hou, T. J.; Zhang, W.; Xia, K.; Qiao, X. B.; Xu, X. J. ADME evaluation in drug discovery. 5. Correlation of Caco-2 permeation with simple molecular properties. Journal of Chemical Information and Computer Sciences 2004, 44, 1585-1600.
21. Wishart, D. S.; Knox, C.; Guo, A. C.; Shrivastava, S.; Hassanali, M.; Stothard, P.; Chang, Z.; Woolsey, J. DrugBank: a comprehensive resource for in silico drug discovery and exploration. Nucleic Acids Research 2006, 34, D668-D672.
22. KnowItAll information system; Bio-Rad Laboratories, Inc. http://www.knowitall.com, 2012.
23. Wang, J. M.; Hou, T. J.; Xu, X. J. Aqueous Solubility Prediction Based on Weighted Atom Type Counts and Solvent Accessible Surface Areas. Journal of Chemical Information and Modeling 2009, 49, 571-581.
24. Wang, J. M.; Krudy, G.; Hou, T. J.; Zhang, W.; Holland, G.; Xu, X. J. Development of reliable aqueous solubility models and their application in druglike analysis. Journal of Chemical Information and Modeling 2007, 47, 1395-1404.
25. Wang, S.; Li, Y.; Wang, J.; Chen, L.; Zhang, L.; Yu, H.; Hou, T. ADMET Evaluation in Drug Discovery. 12. Development of Binary Classification Models for Prediction of hERG Potassium Channel Blockage. Molecular Pharmaceutics 2012, 9, 996-1010.
26. Gaulton, A.; Bellis, L. J.; Bento, A. P.; Chambers, J.; Davies, M.; Hersey, A.; Light, Y.; McGlinchey, S.; Michalovich, D.; Al-Lazikani, B. ChEMBL: a large-scale bioactivity database for drug discovery. Nucleic Acids Research 2012, 40, D1100-D1107.
27. Pence, H. E.; Williams, A. ChemSpider: an online chemical information resource. Journal of Chemical Education 2010, 87, 1123-1124.
28. Zhang, W.; Hou, T. J.; Qiao, X. B.; Xu, X. J. Some basic data structures and algorithms for chemical generic programming. Journal of Chemical Information and Computer Sciences 2004, 44, 1571-1575.
Figure legends
Figure 1. The distributions of ten experimental physicochemical and ADMET properties in PKKB.
Figure 2. The basic schema of PKKB. We mine, clean and organize the ADMET data from the publications by manual curation.
Figure 3. (a) The integrated text and structure searching interface in PKKB; (b) the search results for a list of target molecules; (c) the detailed information of a molecule with 2-D structure; (d) the detailed information of a molecule with 3-D structure.
Figure 4. (a) The Compound Management interface and (b) the interface for editing the available properties of a specific molecule.
Table 1. The important data fields in PKKB and the corresponding number of measures

No.  Property                                     Measures
Physicochemical properties
1    Molecular weight                             1684
2    logP (experiment)                            1019
3    logP (predicted, AB/logP v2.0)               1625
4    pKa (experiment)                             638
5    logD (pH=7, predicted)                       1625
6    Solubility (experiment)                      800
7    logS (predicted, ACD/Labs) (pH=7)            1614
8    logSw (predicted, AB/LogSw 2.0)              1625
9    Sw (mg/ml) (predicted, ACD/Labs)             1613
10   Sw (predicted)                               1625
11   Number of hydrogen bond donors               1625
12   Number of hydrogen bond acceptors            1625
13   Number of rotatable bonds                    1625
14   TPSA                                         1625
Pharmacology
15   Status                                       1372
16   Administration                               501
17   Pharmacology                                 1543
Absorption
18   Intestinal absorption                        679
19   Absorption (description)                     699
20   Caco-2 permeability                          64
21   Human bioavailability                        992
Distribution
22   Plasma protein binding                       1058
23   Volume of distribution (Vd)                  646
24   Blood/plasma partitioning ratio (D-blood)    66
Metabolism
25   Metabolism                                   1111
26   Half-life                                    1116
Excretion
27   Excretion                                    855
28   Urinary excretion                            281
29   Clearance                                    410
Toxicity
30   Description of toxicity                      873
31   LD50 (rat)                                   219
32   LD50 (mouse)                                 243
Figure 1
Figure 2
Figure 3
Figure 4