Article Cite This: Mol. Pharmaceutics XXXX, XXX, XXX−XXX
pubs.acs.org/molecularpharmaceutics
Predicting Fraction Unbound in Human Plasma from Chemical Structure: Improved Accuracy in the Low Value Ranges Reiko Watanabe,*,† Tsuyoshi Esaki,† Hitoshi Kawashima,† Yayoi Natsume-Kitatani,†,§ Chioko Nagao,†,§ Rikiya Ohashi,†,‡ and Kenji Mizuguchi*,†,§ Laboratory of Bioinformatics and §Laboratory of In-silico Drug Design, Center of Drug Design Research, National Institutes of Biomedical Innovation, Health and Nutrition, 7-6-8 Saito-Asagi, Ibaraki, Osaka 567-0085, Japan ‡ Discovery Technology Laboratories, Mitsubishi Tanabe Pharma Corporation, 2-2-50 Kawagishi, Toda, Saitama 335-8505, Japan Mol. Pharmaceutics Downloaded from pubs.acs.org by UNIV OF SUNDERLAND on 09/28/18. For personal use only.
†
S Supporting Information *
ABSTRACT: Predicting the fraction unbound in plasma provides a good understanding of the pharmacokinetic properties of a drug to assist candidate selection in the early stages of drug discovery. It is also an effective tool to mitigate the risk of late-stage attrition and to optimize further screening. In this study, we built in silico prediction models of fraction unbound in human plasma with freely available software, aiming specifically to improve the accuracy in the low value ranges. We employed several machine learning techniques and built prediction models trained on the largest ever data set of 2738 experimental values. The classification model showed a high true positive rate of 0.826 for the low fraction unbound class on the test set. The strongly biased distribution of the fraction unbound in plasma was mitigated by a logarithmic transformation in the regression model, leading to improved accuracy at lower values. Overall, our models showed better performance than those of previously published methods, including commercial software. Our prediction tool can be used on its own or integrated into other pharmacokinetic modeling systems. KEYWORDS: plasma protein binding, fraction unbound in plasma, PPB, f u,p, machine learning
■
INTRODUCTION
packages easily, and therefore, such models would be a valuable asset to both industry and academia. The fraction unbound in plasma (f u,p) is an important determinant of drug efficacy in pharmacokinetic and pharmacodynamic studies. This is because, in general, only the unbound (free) drug is capable of interacting with pharmacological target proteins such as receptors, channels, and enzymes and is able to diffuse between plasma and tissues. The value of f u,p influences renal glomerular filtration and hepatic metabolism, and consequently, it affects the volume of distribution (Vd) and the total clearance (CLtot) of a drug, both essential factors to determine its pharmacokinetics.5 In addition, when the degree of plasma protein binding (PPB)
Drug discovery in academia has rapidly drawn attention as an alternative way to developing new drugs,1 and it has been reported that more than half of the recent small-molecule discoveries were originated from academic research.2,3 However, academic drug discovery projects fail to reach clinical trials in most cases primarily because of the lack of expertise and resources in academia to optimize the key molecular properties concerning pharmacokinetics and toxicology.4 Properly validated in silico models for ADMET (absorption, distribution, metabolism, excretion, and toxicity) prediction can help medicinal chemists in prioritizing suitable compounds at the screening and optimization stages of new drug seeds. While comprehensive commercial suites to predict ADMET properties exist, models built using freely available computational tools can be distributed widely or integrated into other © XXXX American Chemical Society
Received: July 24, 2018 Revised: September 7, 2018 Accepted: September 14, 2018
A
DOI: 10.1021/acs.molpharmaceut.8b00785 Mol. Pharmaceutics XXXX, XXX, XXX−XXX
Article
Molecular Pharmaceutics
For the PharmaPendium-derived data, 15 512 records of human PPB or f u were initially filtered on the Web site and downloaded. Then, the records that did not satisfy the inclusion criteria were removed, such as nonadult (e.g., infant, newborn baby, and aged), nonhealthy (e.g., pregnant, patients), concomitant administration, the use of radioactive methods, and other abnormalities such as the presence of special notes in the comment column), resulting in 9765 records. If a record had a range of values such as 96 to 97, the average was taken. The records with a value containing “” (e.g., >99, 1000 were removed, and f u,p between 0.001 to 1.0 were used. Finally, we created the final data set of 2738 compounds (419 from PharmaPemdium and 2130 from ChEMBL and 189 from Interview Forms). The list of 2319 compounds and their f u,p values are shown in Table S1, except for the PharmaPendium-derived 419 compounds due to licensing restrictions. As a control data set for the chemical space analysis, 6092 approved drugs in Japan were taken from KEGG DRUG (Kyoto Encyclopedia of Genes and Genomes, https://www.genome.jp/kegg/drug/). Structural information in each data set was converted to a 2D Structure Data File (SDF) using Open Babel.14 Descriptors Calculation. We employed open source programs Mordred (ver. 1.0.0)15 and PaDEL-Descriptor16 for the calculation of descriptors and fingerprints, respectively. For comparison, we also used the commercial software ADMET Predictor 8.1 (Simulations Plus) for the descriptor calculation. For the PCA analysis, three descriptors un-
is high, small differences in protein binding can have large effects on f u,p, and consequently, the medical efficacy can change dramatically.6 Therefore, it is necessary to make an accurate prediction of f u,p, especially in the low value ranges, during the course of drug development. About a dozen quantitative structure−activity relationship (QSAR) models for f u,p (or PPB) have been reported, but most of them are series-specific local models, focusing on individual plasma proteins such as human serum albumin7 and α-acid glycoprotein,8 or only deal with acidic drugs.9 Previously, only a few global models (i.e., models built from a data set that spans a large chemical space) have been developed; Votano et al. and Zhu et al. used a variety of machine learning methods to predict PPB with Molconn-Z and Dragon software on a training set of 808 and 1242 compounds, respectively.10,11 More recently, Ingle et al. constructed models with MOE descriptors on 1045 compounds and evaluated their models using pharmaceuticals and environmentally relevant ToxCast chemicals.12 Sun et al. developed global models with a pool of molecular descriptors calculated by PaDEL-descriptors, Schrodinger, and Discovery Studio software on 967 compounds and evaluated the model performance by pharmaceutical, industrial, and newly designed chemicals.13 Although the overall performance of these models has improved, all of them relied upon some commercial software, and the accuracy in the crucial low value ranges of f u,p needs to be improved. Several commercial software packages to predict values of f u,p are available, such as ADMET Predictor (Simulations Plus, Lancaster, CA, http://www.simulations-plus.com/) and BIOVIA Discovery Studio (Accelrys, San Diego, CA, http:// accelrys.com/). These suites are wildly used in the pharmaceutical industry. On the other hand, to the best of our knowledge, no models, created with free software and open to the public free of charge, are available at this moment. In this study, we built f u,p prediction models trained on the largest ever data set of experimental values, containing more than 2000 compounds. We employed several machine learning techniques using freely available software and focused specifically on the low value ranges of f u,p. To create a large and high-quality data set, comparable to those used in commercial suites, we compiled the experimental PPB or f u,p values from the ChEMBL (http://www.ebi.ac.uk/chembl/) and PharmaPendium (Elsevier, Amsterdam, Netherlands, https://www.pharmapendium.com) databases and manually processed the data by checking the experimental protocols, values, units, and other details. Then, molecular descriptors were calculated to represent the chemical structures, and generated models with several machine learning techniques were calculated to predict the values of f u,p using only freely available computational tools. The prediction models showed performance better than that of previous methods including commercial software, particularly in the lower range of f u,p. Our best models are freely available at http://adme.nibiohn.go.jp/ fup/.
■
MATERIALS AND METHODS Data Set Preparation. Two data sources were used in this study, PharmaPendium, a fully searchable database of the drug approval documents of the FDA and EMA and the ChEMBL database. We decided to include in our data set the reported values of PPB or f u,p in healthy adult humans in a single administration. B
DOI: 10.1021/acs.molpharmaceut.8b00785 Mol. Pharmaceutics XXXX, XXX, XXX−XXX
Article
Molecular Pharmaceutics
Figure 1. (A) Scatter plot of principal components 1 and 2 of the assembled data set (red, 2738 compounds) and control (green, 6092 compounds). The cumulative contribution ratio is shown on the top left. The frames indicate 95% normal confidence ellipses in the assembled data set (red) and control (green). The length of the vector is proportional to the importance of the descriptor. (B) A plot of ionization profiles in the assembled data set. Dots and the horizontal lines in the boxplots represent mean and median values, respectively. The percentages of the compounds in the each of the assembled data sets are printed on the top. (C) A plot of f u,p in logarithmic scale versus LogDpH7.4. in the assembled data set. Acids (587 compounds) are highlighted with light red circles, bases (516 compounds) with green, neutrals (1,509 compounds) are blue, and zwitterions (84 compounds) are dark red.
Mordred descriptors and the three types of fingerprints calculated by free software constituted the first set (D1), and ADMET Predictor descriptors and the three types of fingerprints relying on commercial software constituted the second set (D2). Descriptors that showed near-zero-variance and absolute correlations above 0.90 were identified and excluded by calculating the frequency ratio with the nearZeroVar function and by creating a correlation matrix with the findCorrelation function in the caret package. The data set was split into the training (2192 compounds) and test sets (546 compounds) with random selection at a ratio of 8:2. The Boruta package20 was used to automatically select the important features in the training set with the D1 and D2 descriptor sets. The binary or three-class classification models to predict the degree of fraction unbound in plasma were constructed using various machine learning techniques, namely, random forest (RF), support vector machine (SVM, with radial and linear basis functions), k-nearest neighbors (k-NN ), artificial neural network (ANN), AdaBoost, and partial least squares (PLS),
supported by Mordred, LogDpH7.4, acidic pKa (apKa), and basic pKa (bpKa), were generated by ChemAxon calculatorplugin software (ChemAxon). Data Analysis. Data analysis was performed in R, and the results were visualized by the ggplot217 and ggfortify18 packages. In total, 11 descriptors, i.e., molecular weight (MW), topological polar surface area (TopoPSA), SLogP, LogD pH7.4, apKa, bpKa, hydrogen bond acceptor (HBAcc), hydrogen bond donor (HBDon), the number of aromatic atoms (nAromAtom), the number of aromatic bonds (nAromBond), and the number of rotatable bonds (nRot), were used for the PCA analysis. Ionization profiles of 2696 compounds out of 2738 compounds in the data set were extracted from the ChEMBL database. Model Construction. The caret19 package in R was used to build prediction models. 2D descriptors were calculated by Mordred and ADMET Predictor, and 1436 and 345 features were used for model construction, respectively. Extended, KlekotaRoth, and AtomPairs2D fingerprints were generated by PaDEL-Descriptor. Two descriptor sets were defined; the C
DOI: 10.1021/acs.molpharmaceut.8b00785 Mol. Pharmaceutics XXXX, XXX, XXX−XXX
Article
Molecular Pharmaceutics
Figure 2. Boxplots representing the relationships between f u,p and (A) MW, (B) LogDpH7.4, (C) SLogP, (D) C2SP2, (E) SMR_VSA7, and (F) nAromAtom. The dotted lines and the horizontal lines in the boxplots represent mean and median values, respectively. Correlation coefficients (r) are indicated on the top right. Counts of compounds in each category are shown in Table S4.
together. Figure 1A shows a plot of the first two principal components (PC1 and PC2). The two principal components together explained approximately 60% of the variance, and they roughly correspond to lipophilicity/hydrophilicity and molecular weight, respectively. The 95% normal confidence ellipses for our data set and for the approved drugs were almost concentric, with the former being slightly smaller. This observation suggested that the created data set covered chemical space similar to that of the approved drugs. The assembled data set included 21.4, 18.8, and 55.1% of acids, bases, and neutrals, respectively, and a small fraction of zwitterions. Each of these molecular types shows a distinct distribution of f u,p (Figure 1B). Acids and neutrals exhibited lower f u,p than bases and zwitterions. The fraction unbound in the logarithmic scale versus LogDpH7.4, a lipophilicity measure, is plotted in Figure 1C. The more lipophilic a molecule is, the smaller its f u,p is in general. Acids tended to display a lower unbound concentration than neutral and basic molecules due to their higher affinity for plasma proteins. The correlations between physicochemical properties and logarithmic f u,p were analyzed with descriptors calculated by ADMET Predictor and Mordred, and 15 descriptors highly correlated with f u,p in Table S2 and their descriptions are summarized in Tables S3. They include lipophilicity measures such as SLogP and MLogP, covalent-bond-related descriptors such as C2SP2 (sp2 carbon bound to two other carbons), π-
with tuning algorithms to prioritize the optimal parameters for our predictions by the caret package. We implemented a 10fold cross-validation during training, and accuracy, kappa, and sensitivity (also referred to as recall or true positive rate) obtained from the confusion matrix were used to evaluate the classification models on the test set. The data distribution was examined prior to constructing the regression models, in which the f u,p values were transformed into the logarithmic scale of f u,p and ln Ka (natural logarithm of the pseudo binding constant imputed from PPB) using the equation of Zhu et al.11 Three end points, the original linear scale, the logarithmic scale of f u,p, and lnKa, were adopted for the regression model. The models were generated using RF, SVM (with radial and linear basis functions), k-NN, ANN, and PLS. We implemented a 10fold cross-validation during training and used a range of metrics to evaluate their performance on the test set.
■
RESULTS Data Collection and Chemical Space Analysis. To visualize the chemical space of the created data set, we performed a PCA of 11 descriptors, consisting of SLogP, LogDpH7.4, MW, apKa, bpKa, HBAcc, HBDon, TopoPSA, nAromAtom, nAromBond, and nRot, all of which are generally considered to be important parameters for synthetic expansion. As a control, a representative set of 6092 small-molecule approved drugs retrieved from KEGG DRUG was analyzed D
DOI: 10.1021/acs.molpharmaceut.8b00785 Mol. Pharmaceutics XXXX, XXX, XXX−XXX
Article
Molecular Pharmaceutics
Figure 3. (A) Summary of classification models. (B) Statistical results of the binary (upper columns) and three-class (lower columns) classification models from the training set (2192 compounds) and the test set (546 compounds) in the seven models are shown. D1: Models created with free software-derived descriptors. D2: Models created with commercial software-derived descriptors. *: The model that displayed the highest accuracy and kappa in binary and three-class classification models. (C) Comparison of the statistical results in the binary (RF) and three-class classification (RF) models with the test set in the D1 descriptors set. True positive rate in low class was indicated.
Figure 4. Distribution of the f u,p values in assembled data set with 2738 compounds depending on the modeling end points. (A) Original linear scale, (B) f u,p_lnKa, (C) log-scaled f u,p. Bars below the graphs indicate the transformed f u,p value into the fraction unbound in plasma.
TopoPSA, and nRot exhibited no or negligible correlation with f u,p. Classification Models. Binary and three-class global classification models were constructed, with threshold values of f u,p set to 0.05 to define the low and high classes for the binary classification and 0.05 and 0.2 to define the low, middle, and high classes for the three-class classification, respectively (Figure 3A). These thresholds were chosen based on the hepatic availability (Fh) values. Figure S1 shows how the hepatic availability (Fh) and hepatic clearance (CLh) change as a function of intrinsic clearance (CLint); the calculation was performed by applying the well-stirred model and with different values of f u,p. Since 96.9% of the ChEMBL compounds showed CLint < 500 mL/min/mg protein,21 we can only consider the range shown in Figure S1A; compounds
related descriptors, polarizability-related descriptors such as SMR_VSA7 (molar refractivity), and hydrogen bond and aromatic bond counts. The relationships between f u,p and six selected descriptors (four showing the highest correlation with f u,p, SLogP, C2SP2, SMR_VSA7, and nAromAtom, as well as two basic properties, MW and LogD) were plotted in Figure 2. While MW showed a clear negative correlation with f u,p for small molecules less than 500 Da (r = 0.34), no apparent correlation was observed for larger molecules (Figure 2A). LogDpH7.4 showed a moderate negative correlation (r = −0.37) (Figure 2B), and SLogP exhibited the highest negative correlation with f u,p (r = −0.54) (Figure 2C). C2SP2, SMR_VSA7, and nAromAtom also showed a moderate negative correlation with f u,p, r = −0.47, −0.43, and −0.43, respectively (Figure 2D−F). The number of HBAcc, HBDon, E
DOI: 10.1021/acs.molpharmaceut.8b00785 Mol. Pharmaceutics XXXX, XXX, XXX−XXX
Article
Molecular Pharmaceutics Table 1. Statistical Results of the Regression Models by Each of the Six Modelsa end point f u,p (lin)
D1
D2
f u,p (log)
D1
D2
R2 RMSE MAE R2 RMSE MAE R2 RMSE MAE R2 RMSE MAE
RF
SVMradi
SVMlin
ANN
k-NN
PLS
0.699 0.158 0.116 0.693 0.157 0.111 0.691b 0.410b 0.332b 0.677 0.415 0.331
0.717b 0.149b 0.101b 0.728b 0.145b 0.100b 0.667 0.415 0.323 0.684b 0.404b 0.317b
0.594 0.180 0.124 0.599 0.180 0.122 0.504 0.521 0.388 0.583 0.471 0.371
0.672 0.160 0.113 0.657 0.164 0.116 0.600 0.455 0.354 0.594 0.458 0.363
0.606 0.174 0.116 0.652 0.164 0.108 0.591 0.459 0.351 0.586 0.465 0.357
0.576 0.182 0.141 0.588 0.179 0.134 0.547 0.484 0.394 0.573 0.470 0.377
a
Statistical analysis of the regression models to be validated by a test set (546 compounds) by each of the six models using the free software-derived descriptor set (D1) and the commercial software-derived descriptor set (D2). The prediction results are displayed both in linear scale (upper columns) and logarithmic scale (lower columns). bThe model that displayed the highest R2 and lowest error rates.
key to better prediction accuracy for f u,p. Therefore, to examine the influence of the biased distribution, the original f u,p values were transformed to lnKa (Figure 4B) or logarithmic scale (Figure 4C), resulting in the more evenly distributed data sets. Initially, prediction models were constructed using six machine learning methods (RF, SVM with radial and linear basis functions, k-NN, ANN, and PLS) in the linear and logarithmic scales. A set of 326 and 305 free software-derived descriptors (D1) and 391 and 354 commercial softwarederived descriptors (D2) were selected in the linear and logarithmic scales, respectively. Statistical results on the test set are shown in Table 1, including the mean absolute error (MAE), the root-mean-square error (RMSE), and the coefficient of determination (R2). No significant differences in accuracy and kappa were detected by the paired t-test between the models trained with D1 and D2 descriptor sets (Table S5). With D1 descriptor set in the linear scale, the SVM radial model showed the lowest error rate, and with the D1 descriptor set in the logarithmic scale, the RF model showed the best prediction capability, when ntree and mtry were 500 and 261, respectively. These best models were defined as lin_D1_SVMradi and log_D1_RF, respectively. Figure 5A,B shows plots of the observed and predicted f u,p by lin_D1_SVMradi and log_D1_RF. To compare these models, the predicted values of f u,p by lin_D1_SVMradi were transformed into the logarithmic scale, and the accuracy scores were compared with those of the predicted values of f u,p by log_D1_RF (Figure 5C). The predicted values of f u,p by log_D1_RF showed a higher R2 (0.668 versus 0.490) and lower RMSE (0.409 versus 0.528) and MAE (0.332 versus 0.382) than those by lin_D1_SVMradi. Lipophilicity was the most important feature to predict f u,p, and descriptors related to polarizability and electronegativity showed the highest contributions (Figure S3). In addition, by setting the end point to lnKa, 351 descriptors were used to create regression models with D1, and the SVMradi model showed the best performance (lnKa_D1_SVMradi). The predicted values of f u,p by lnKa_D1_SVMradi were transformed into the logarithmic scale, and the scores were compared with those of the predicted values of f u,p by log_D1_RF (Figure S4). Statistical results showed that R2 in log_D1_RF was higher than that in lnKa_D1_SVMradi, and error rates in log_D1_RF were smaller than those in lnKa _D1_SVMradi.
with Fh > 0.5 roughly correspond to f u,p < 0.05, and those with Fh > 0.2 correspond to f u,p < 0.2. The binary prediction models were trained using seven machine learning methods (RF, SVM with radial and linear basis functions, k-NN, ANN, AdaBoost, and PLS), either with a selected set of 267 free software-derived descriptors (D1) or a selected set of 249 commercial software-derived descriptors (D2). Similarly, the three-class models were trained either with 307 D1 descriptors or 250 D2 descriptors. The models were validated on the common test set, and the statistical results of each model are shown in Figure 3B. The performance in terms of accuracy and kappa was comparable between the training set and the test set. No significant differences in accuracy and kappa were detected by the paired t-test between the models trained with D1 and D2 descriptor sets (Table S5). With the D1 descriptor set, RF showed the highest performance for the binary and three-class classification. RF parameters (ntree and mtry) were 500 and 103, respectively. The true positive rate for the low class in the three-class RF model was higher than that of the binary RF model (0.826 versus 0.753; Figure 3C). The top-ranked descriptors according to their variable importance for the RF models are shown in Figure S2A−D. Lipophilicity was the most important descriptor both in D1 and D2 models, and C2SP2 and AATS (averaged Moreau− Broto autocorrelation weighted by polarizability) in D1 models and hydrogen bond or π-related descriptors in D2 models were also listed as important determinants of f u,p, indicating that both D1 and D2 models mainly selected descriptors of lipophilicity and polarizability. Regression Models. Regression models to predict the numerical values of f u,p were generated. As shown in Figure 4A, our data set was weighted toward the lower range of f u,p; more than half of the compounds had a value of less than 0.1, an observation also made about the data sets used in previous reports.9,13 For highly biased data, the R-squared (R2, coefficient of determination) value can be unreliable in evaluating a prediction model, as discussed in the previous studies.10,11,15 Zhu et al. and Ingle et al. proposed to transform the percentage of PPB to lnKa, a pseudoequilibrium constant parameter, and this approach provided predicted values of PPB with better accuracy than those obtained by other methods based on PPB in the linear scale.9,10 This observation suggests that data transformation to obtain a more appropriate distribution is a F
DOI: 10.1021/acs.molpharmaceut.8b00785 Mol. Pharmaceutics XXXX, XXX, XXX−XXX
Article
Molecular Pharmaceutics
Figure 5. Plots of the observed and predicted f u,p by the best models in different end points with the D1 descriptors set. (A) lin_D1_SVMradi. (B) log_D1_RF. (C) Comparison of the prediction results between lin_D1_SVMradi (blue) and log_D1_RF (orange). Statistical analysis is summarized below. (D) Comparison of the prediction results among the three types of end points in the range of lower f u,p (0.001−0.05). Predicted result was plotted in lin_D1_SVMradi (blue), in lnKa_D1_SVMradi (green), and in log_D1_RF (orange). Error rates are shown on the right side.
and higher (0.05−1.0) ranges of f u,p separately, and the results were summarized in Figure 6B. The error rates in log_D1_RF were smaller than those in S+PrUnbnd (0.470 versus 0.642 for RMSE and 0.383 versus 0.495 for MAE in the lower range, 0.266 versus 0.360 for RMSE and 0.163 versus 0.213 for MAE in the higher range). Furthermore, RMSE and MAE in log_D1_RF in the higher range were smaller than those in the lower and overall ranges in log_D1_RF (0.266 versus 0.410 for RMSE, 0.163 versus 0.332 for MAE). In summary, the log_D1_RF model showed better accuracy overall than S +PrUnbnd, and the differences were pronounced in the low value ranges.
The error rates in the lower range of f u,p (0.001−0.05) for the three different end points were compared in Figure 5D. Those for log_D1_RF were the smallest, suggesting that the logarithmic transformation was particularly effective for the prediction of the lower range of f u,p. Then, the results for log_D1_RF were compared with those of the S+PrUnbnd model in ADMET Predictor, one of the most widely used commercial software suites (Figure 6A). The S+PrUnbnd model in the physicochemical and biopharmaceutical module of ADMET Predictor can predict the percentage of drug that is free in the plasma. The model was built using an ensemble of ANNs that were trained on the experimental data gathered from the literature in the logarithmic scale for 689 well-characterized molecules. Our log_D1_RF showed a higher R2 (0.691) than that of the S +PrUnbnd model (0.421) as well as a lower RMSE and MAE. We also calculated the error rates in the lower (0.001−0.05)
■
DISCUSSION The focus of drug discovery in academia has shifted in recent years from its traditional roles of target identification to a wider range of activities including high-throughput screening, and G
DOI: 10.1021/acs.molpharmaceut.8b00785 Mol. Pharmaceutics XXXX, XXX, XXX−XXX
Article
Molecular Pharmaceutics
Figure 6. Plots of the observed and predicted f u,p in the log_D1_RF model and ADMET Predictor S+PrUnbnd model. (A) Plot in overall range. Dotted line indicates the line of unity (x = y). (B) Statistical results in lower (left) and higher (right) ranges are summarized.
However, balanced accuracies for the low class of the two models were comparable, and the three-class classification model by RF showed a better true positive rate for the low class (0.826) than that of the binary classification model (0.753). When the degree of protein binding is high, small changes in protein binding can have a large influence on the free drug concentration and hence the drug’s pharmacological effects.6 Therefore, it is important to be able to predict whether a compound belongs to the low class or not. Given also that the three-class classification can provide more information than the binary classification, our three-class classification model should become a useful tool in the early stages of drug discovery. Quantitative values of f u,p are required to calculate the hepatic and renal clearance and Vd. We adopted three end points in our regression models, f u,p in the linear and logarithmic scales and lnKa. As shown in Figure 4A, our original data set is weighted toward the highly bound (lower unbound) compounds. Many of the approved drugs are, indeed, highly bound to plasma protein5 because of the relations between f u,p and hepatic availability. Figure S1 shows that with small f u,p, high hepatic availability (high Fh and low CLh) is achieved over a wide range of CLint. Although no specific f u,p values can ensure success as a drug, having small f u,p can be a step closer to achieving twice or three times daily dosing. To normalize the distribution of f u,p, we applied the logarithmic transformation. This procedure appears to expand the lower f u,p range more than the lnKa transformation adopted by Zhu et al. and Ingle et al.11,12 (Figure 4B,C). The log_D1_RF model showed the smallest error rates among the three types of end points in the range of lower f u,p, (lin_D1_ RF > lnKa_D1_SVMradi > log_D1_RF model; Figure 5D), presumably because training errors of each sample were amplified more in the logarithmic scale, especially at the lower range of f u,p, compared to the other scales. Consequently, the model in the logarithmic scale was the best suited to an accurate assessment of f u,p at lower values. Furthermore, evaluated on our test set, the S+PrUnbnd model’s R2 was lower
industrial-type drug discovery units in academic institutions have emerged.22 While a small number of academic drug discovery projects receive large investments, many academic projects suffer from the lack of funding and expertise in drug discovery. In silico pharmacokinetic prediction is a costeffective tool to incorporate pharmacokinetic viewpoints into the screening processes in academia. However, in silico prediction tools created by free software and freely available to academic researchers are limited to simple pharmacokinetics parameters such as logP, polar surface area (PSA), and the rule of five descriptors.23−25 PPB can dramatically affect the hepatic and renal clearance, tissue distribution, and pharmacological effects, and models that can accurately predict the low value ranges of f u,p are needed. In this study, global three-class classification and regression models to predict f u,p were constructed with the largest ever data set of 2738 compounds using several machine learning methods. For both classification and regression, the models trained with D1 and D2 descriptor sets showed almost identical accuracy and kappa, indicating that the models built using only freely available descriptor calculation tools achieved performance comparable to that of the models built using commercial software. In addition, our best regression model showed better accuracy than that of previously reported methods including commercial software, especially in the lower ranges of f u,p. Important variables to predict f u,p were common in most of the models, where lipophilicity and conjugated double-bondrelated features such as C2SP2, AATS, and π contributed descriptors were the key components of the models. It is known that lipophilic compounds tend to highly bind plasma proteins, and electronegativity or polarizability affects molecular acidity and basicity.26 Since f u,p correlates with lipophilicity and acidity of compounds as shown in Figure 1C, the inclusion of the descriptors related to lipophilicity and polarizability such as logP, C2SP2, AATS, and π led to the models successfully capturing the key factors for f u,p prediction. The best binary classification model showed better overall accuracy than the three-class model (0.801 versus 0.676). H
DOI: 10.1021/acs.molpharmaceut.8b00785 Mol. Pharmaceutics XXXX, XXX, XXX−XXX
Molecular Pharmaceutics
■
than that of our log_D1_RF model (0.421 versus 0.691), indicating that the log_D1_RF model showed performance better than that of S+PrUnbnd model. This R2 value was lower than the reported value of the developers’ own evaluation (0.68),27 presumably because our test set covered a wider range of chemical space than the S+PrUnbnd model’s training set. Although the model in the logarithmic scale was able to predict lower values of f u,p better than the previous methods, the variance of the error in the lower range of f u,p (0−0.05) is larger than that in the higher range of f u,p (0.05−1.0; Figure 6B), suggesting that the prediction accuracy in the lower range should still be improved. One possible reason is the lack of data in the range of extremely high protein binding (>99%). In addition, in creating the data set for the regression model, we had to omit the reports that cannot be associated with a single numerical value. Examples include “More than 99.0% of PPB”, and such cases are common for the lower range of f u,p, because of the limitations in the experimental setup. Given the critical importance of the lower range of f u,p, it will be beneficial to construct models by collecting the data, including this range, acquired by a well-validated evaluation system such as those employed in pharmaceutical companies. The f u,p prediction can play an important role in other types of pharmacokinetic prediction models such as clearance, Vd, and the elimination route. For example, Kusama et al. and Toshimoto et al. reported computational systems to predict major clearance pathways (renal or hepatic metabolism by cytochrome isoforms) of drugs by using four physicochemical features, i.e., charge, molecular weight, lipophilicity, and the fraction unbound in plasma.28,29 While the program is publicly available online (CPathPred, http://www.bi.cs.titech.ac.jp/ CPathPred/), the value of f u,p is required as input to obtain the prediction results. Predicted f u,p values from our models can be used to fill this gap. In general, prediction models built using freely available tools can be integrated into other systems for predicting many different types of pharmacokinetic parameters, and we believe that such an attempt will boost drug discovery in academia. Finally, we created the largest ever data set of f u,p consisting of 2738 compounds and built freely available prediction models of f u,p by machine learning. The classification model showed a high true positive rate of 0.826 for the low fraction unbound class. The strongly biased distribution of fraction unbound in plasma was mitigated by a logarithmic transformation in the regression model. Our models showed better performance than those of previously published methods including commercial software. The improvements over the previous methods were particularly pronounced in the low value ranges. Our prediction tool can be used on its own in academia to lead successful drug discovery or integrated into other pharmacokinetic modeling systems. Our f u,p prediction models can be accessed at http://adme.nibiohn.go.jp/fup/
■
Article
AUTHOR INFORMATION
Corresponding Authors
*Phone: +81-72-639-7010; Fax: +81-72-641-9881; E-mail:
[email protected] (R.W.) *Phone: +81-72-641-9890; Fax: +81-72-641-9881; E-mail:
[email protected] (K.M.) ORCID
Reiko Watanabe: 0000-0001-9359-8731 Tsuyoshi Esaki: 0000-0001-8780-6346 Notes
The authors declare no competing financial interest.
■
ACKNOWLEDGMENTS This work was conducted as part of the “Development of a Drug Discovery Informatics System” supported by the Japan Agency for Medical Research and Development (AMED). We thank Arina Afanaseva as part of the bioinformatics project, the National Institutes of Biomedical Innovation, Health and Nutrition for writing a custom python script to merge the two data sets, and Toshiyuki Oda and Daisuke Sato in Lifematics Inc. for helping with the creation of web interface.
■
ABBREVIATIONS f u,p, fraction unbound in plasma; ADMET, absorption, distribution, metabolism, excretion, and toxicity; Vd, volume of distribution; CLtot, total clearance; Fh, hepatic availability; CLh, hepatic clearance; CLint, intrinsic clearance; QSAR, quantitative structure−activity relationship; R2, coefficient of determination; PPB, plasma protein binding; lnKa, natural logarithm of the pseudo binding constant imputed from PPB; MAE, mean absolute error; RMSE, root-mean-square error; RF, random forest; SVM, support vector machine; k-NN, knearest neighbors; ANN, artificial neural network; PLS, partial least squares; FDA, Food and Drug Administration; EMA, European Medicines Agency
■
REFERENCES
(1) Frearson, J.; Wyatt, P. Drug Discovery in Academia- the third way? Expert Opin. Drug Discovery 2010, 5 (10), 909−19. (2) Munos, B. Lessons from 60 years of pharmaceutical innovation. Nat. Rev. Drug Discovery 2009, 8 (12), 959−68. (3) Stevens, A. J.; Jensen, J. J.; Wyller, K.; Kilgore, P. C.; Chatterjee, S.; Rohrbaugh, M. L. The role of public-sector research in the discovery of drugs and vaccines. N. Engl. J. Med. 2011, 364 (6), 535− 41. (4) Shamas-Din, A.; Schimmer, A. D. Drug discovery in academia. Exp. Hematol. 2015, 43 (8), 713−7. (5) Bohnert, T.; Gan, L. S. Plasma protein binding: from discovery to development. J. Pharm. Sci. 2013, 102 (9), 2953−94. (6) Roberts, J. A.; Pea, F.; Lipman, J. The clinical relevance of plasma protein binding changes. Clin. Pharmacokinet. 2013, 52 (1), 1−8. (7) Hall, L. M.; Hall, L. H.; Kier, L. B. QSAR modeling of betalactam binding to human serum proteins. J. Comput.-Aided Mol. Des. 2003, 17 (2−4), 103−18. (8) Lambrinidis, G.; Vallianatou, T.; Tsantili-Kakoulidou, A. In vitro, in silico and integrated strategies for the estimation of plasma protein binding. A review. Adv. Drug Delivery Rev. 2015, 86, 27−45. (9) Zhivkova, Z.; Doytchinova, I. Quantitative structure–plasma protein binding relationships of acidic drugs. J. Pharm. Sci. 2012, 101 (12), 4627−41. (10) Votano, J. R.; Parham, M.; Hall, L. M.; Hall, L. H.; Kier, L. B.; Oloff, S.; Tropsha, A. QSAR Modeling of Human Serum Protein
ASSOCIATED CONTENT
* Supporting Information S
The Supporting Information is available free of charge on the ACS Publications website at DOI: 10.1021/acs.molpharmaceut.8b00785. Curation process of 625 compounds derived from PharmaPendium; Supporting figures (Figures S1−4) and tables (Tables S2−3) (PDF) Data set (Table S1) (XLSX) I
DOI: 10.1021/acs.molpharmaceut.8b00785 Mol. Pharmaceutics XXXX, XXX, XXX−XXX
Article
Molecular Pharmaceutics Binding with Several Modeling Techniques Utilizing StructureInformation Representation. J. Med. Chem. 2006, 49, 7169−81. (11) Zhu, X. W.; Sedykh, A.; Zhu, H.; Liu, S. S.; Tropsha, A. The use of pseudo-equilibrium constant affords improved QSAR models of human plasma protein binding. Pharm. Res. 2013, 30 (7), 1790−8. (12) Ingle, B. L.; Veber, B. C.; Nichols, J. W.; Tornero-Velez, R. Informing the Human Plasma Protein Binding of Environmental Chemicals by Machine Learning in the Pharmaceutical Space: Applicability Domain and Limits of Predictability. J. Chem. Inf. Model. 2016, 56 (11), 2243−52. (13) Sun, L.; Yang, H.; Li, J.; Wang, T.; Li, W.; Liu, G.; Tang, Y. In Silico Prediction of Compounds Binding to Human Plasma Proteins by QSAR Models. ChemMedChem 2018, 13, 572. (14) O’Boyle, N. M.; Banck, M.; James, C. A.; Morley, C.; Vandermeersch, T.; Hutchison, G. R. Open Babel: An open chemical toolbox. J. Cheminf. 2011, 3, 33. (15) Moriwaki, H.; Tian, Y. S.; Kawashita, N.; Takagi, T. Mordred: a molecular descriptor calculator. J. Cheminf. 2018, 10 (1), 4. (16) Yap, C. W. PaDEL-descriptor: an open source software to calculate molecular descriptors and fingerprints. J. Comput. Chem. 2011, 32 (7), 1466−74. (17) Wickham, H. ggplot2: Elegant Graphics for Data Analysis, 2nd ed.; Springer, 2016. (18) Tang, Y.; Horikoshi, M.; Li, W. ggfortify: Unified Interface to Visualize Statistical Results of Popular R Packages. The R Journal 2016, 8, 478−489. (19) Kuhn, M. Building predictive models in R using the caret package. J. Stat Softw. 2008, 28 (5), 1−26. (20) Kursa, M. B.; Rudnicki, W. R. Feature Selection with the Boruta Package. J. Stat Softw 2010, 36 (11), 1−13. (21) Esaki, T.; Watanabe, R.; Kawashima, H.; Ohashi, R.; NatsumeKitatani, Y.; Nagao, C.; Mizuguchi, K. Data curation can improve the prediction accuracy of metabolic intrinsic clearance. Mol. Inf. 2018, 37, 1800086. (22) Vallance, P. Industry-Academic Relationship in a New Era of Drug Discovery. J. Clin. Oncol. 2016, 34 (29), 3570−5. (23) Tetko, I. V.; Bruneau, P. Application of ALOGPS to predict 1octanol/water distribution coefficients, logP, and logD, of AstraZeneca in-house database. J. Pharm. Sci. 2004, 93 (12), 3103−10. (24) Poda, G. I.; Tetko, I.; Rohrer, D. C. Towards predictive ADME profiling of drug candidates: Lipophilicity and solubility. In 229th American Chemical Society National Meeting & Exposition 2005, San Diego, CA, USA, March 13−17, 2005; MEDI 514. (25) Molinspiration Cheminformatics. Molinspiration Cheminformatics Software. http://www.molinspiration.com/. (26) Nagle, J. K. Atomic polarizability and electronegativity. J. Am. Chem. Soc. 1990, 112 (12), 4741−47. (27) SimulationsPlus. ADMET Predictor User Manual, version 8.1, 2015. (28) Kusama, M.; Toshimoto, K.; Maeda, K.; Hirai, Y.; Imai, S.; Chiba, K.; Akiyama, Y.; Sugiyama, Y. In silico classification of major clearance pathways of drugs with their physiochemical parameters. Drug Metab. Dispos. 2010, 38 (8), 1362−70. (29) Toshimoto, K.; Wakayama, N.; Kusama, M.; Maeda, K.; Sugiyama, Y.; Akiyama, Y. In silico prediction of major drug clearance pathways by support vector machines with feature-selected descriptors. Drug Metab. Dispos. 2014, 42 (11), 1811−19.
J
DOI: 10.1021/acs.molpharmaceut.8b00785 Mol. Pharmaceutics XXXX, XXX, XXX−XXX