Constructing Interconsistent, Reasonable, and Predictive Models for

Sep 14, 2016 - College of Bioengineering, Chongqing University, Chongqing 400044, China. J. Chem. ... Journal of Chemical Information and Modeling...
0 downloads 0 Views 3MB Size
Article pubs.acs.org/jcim

Constructing Interconsistent, Reasonable, and Predictive Models for Both the Kinetic and Thermodynamic Properties of HIV‑1 Protease Inhibitors Sujun Qu,† Shuheng Huang,‡ Xianchao Pan,‡ Li Yang,†,‡ and Hu Mei*,†,‡ †

Key Laboratory of Biorheological Science and Technology, Ministry of Education, Chongqing University, Chongqing 400044, China College of Bioengineering, Chongqing University, Chongqing 400044, China



S Supporting Information *

ABSTRACT: Accumulated evidence suggests that the in vivo biological potency of a ligand is more strongly correlated with the binding/unbinding kinetics than the equilibrium thermodynamics of the protein−ligand interaction (PLI). However, the existing experimental and computational techniques are largely insufficient and limited in large-scale measurements or accurate predictions of the kinetic properties of PLI. In this work, elaborate efforts have been made to develop interconsistent, reasonable, and predictive models of the association rate constant (kon), dissociation rate constant (koff), and equilibrium dissociation constant (KD) of a series of HIV protease inhibitors with different structural skeletons. The results showed that nine Volsurf descriptors derived from water (OH2) and hydrophobic (DRY) probes are key molecular determinants for the kinetic and thermodynamic properties of HIV-1 protease inhibitors. To the best of our knowledge, this is the first time that interconsistent and reasonable models with strong prediction power have been established for both the kinetic and thermodynamic properties of HIV protease inhibitors.

1. INTRODUCTION In the life cycle of human immunodeficiency virus type 1 (HIV-1), HIV-1 protease is a crucial enzyme for protein maturation and viral infection.1 HIV protease inhibitors (PIs) can block viral replication by preventing the formation of functional viral proteins2 and thus have become one of the most effective drugs in the treatment of acquired immune deficiency syndrome (AIDS).3 The binding affinities of the HIV protease inhibitors are often measured by thermodynamic properties, such as the IC50, equilibrium dissociation constant (KD), or Gibbs free energy.4 However, recent research increasingly suggests that the most crucial factors for drug potency are related not only to the thermodynamic properties but also to the kinetic rate constants for complex association (kon) and dissociation (koff).5 At present, kinetic properties are mainly determined by techniques such as stopped-flow analysis,6 isothermal titration calorimetry,7 capillary electrophoresis,8 affinity chromatography,9 and surface plasmon resonance methods.10−12 For example, surface plasmon resonance spectroscopy is most widely used in flexible platforms, but it is generally not suitable for detecting fast association or dissociation kinetics. Affinity chromatography can complement surface plasmon resonance spectroscopy and help to improve the reproducibility and lower the cost per analysis. The advantages of capillary electrophoresis are its efficiency, speed, and small sample requirements. © XXXX American Chemical Society

However, technical difficulties (i.e., slowness and measurement errors)13−15 still exist for each method, especially in highthroughput determination. With the development of computational chemistry, more and more computational methods have been used to predict thermodynamic and kinetic properties of HIV-1 PIs.16−18 For example, Wright et al.18 studied the binding free energies of nine FDA-approved HIV-1 PIs by molecular mechanics Poisson−Boltzmann surface area (MMPBSA) and molecular mechanics generalized Born surface area (MMGBSA) methods, and the results showed that seven inhibitors can be correctly distinguished by both methods with correctly converged normal-mode estimates of the configuration entropy. Li et al.19 performed steered molecular dynamics and umbrella sampling simulations to analyze the entire dissociation processes of three HIV-1 PIs. The results showed that the strength of the hydrogen-bonding network between the PIs and the protease plays a crucial role in the dissociation process and that the free energies of dissociation obtained by umbrella sampling were in good agreement with experimental data. Although some advances have been made in binding and unbinding simulations of HIV-1 PIs, accurate prediction of kinetic parameters (i.e., kon and koff) remains a daunting task. Received: June 7, 2016

A

DOI: 10.1021/acs.jcim.6b00326 J. Chem. Inf. Model. XXXX, XXX, XXX−XXX

Article

Journal of Chemical Information and Modeling As a classic method of computer-aided drug design, quantitative structure−activity relationship (QSAR) studies have been applied to predict the kinetic properties of HIV-1 PIs.20,21 Schaal22 applied comparative molecular field analysis (CoMFA) to predict association and dissociation rates of 34 HIV-1 PIs. After a laborious process of molecular alignment and parameter optimization, the cross-validated determination coefficients (q2) were 0.72 and 0.48 for the optimal koff and kon models, respectively. However, the prediction power of the kon model on a test data set was very poor, with a determination coefficient (R2pred) of 0.20, in comparison with 0.60 for the koff model. To predict the kinetic rate constants of HIV-1 PIs, Chiu and Xie23 adopted integrated energetic and conformational dynamic features derived from molecular dynamics simulations to construct multitarget machine learning (MTML) models by the random forest predictive clustering method. The four classes of PIs were predicted accurately, with an accuracy of 74.35%. In spite of these exploratory QSAR studies, the structural description method and prediction accuracy remain bottlenecks for predicting the kinetic parameters. In this study, after exhaustive seeking for an efficient method of structural description, the three-dimensional (3D)-grid-based VolSurf method was finally selected to construct prediction models for both the kinetic and thermodynamic properties of 37 HIV-1 PIs with different structural skeletons. For the first time, interconsistent, highly predictive, and reasonable models were obtained for both the kinetic and thermodynamic properties. It can be expected that the models developed in this work can be applied for screening of HIV-1 protease inhibitors with desired thermodynamic and kinetic properties. Also, the method and research framework presented in this paper could provide a useful reference for predicting the kinetic properties of other molecules.

Table 1. The 37 HIV-1 Protease Inhibitors and Corresponding log(kon), −log(koff), and −log(KD) Values

2. METHOD 2.1. Data Set. The 37 cyclic and linear HIV-1 protease inhibitors (Figure S1 in the Supporting Information) were obtained from the studies of Shuman et al.24 and Markgren et al.25. The kinetic rate constants were determined by global fitting of multiple sensorgrams, using a minimum of three sensorgrams for each inhibitor. It should be noted that for a certain inhibitor, the kinetic constants are averages of each constant for all of the replicates, and the KD value is not identical to the ratio of the averages of koff and kon. Herein, the logarithm of kon and the negative logarithms of koff and KD were used as target variables for the subsequent QSAR modeling (Table 1). Figure 1 shows the distribution of the 37 HIV-1 protease inhibitors in the space of kon and koff on logarithmic scales. It can be seen that association rate constants vary by more than 4 orders of magnitude and the dissociation rate constants by 3 orders of magnitude. 2.2. VolSurf. The VolSurf approach was first proposed to predict absorption, distribution, metabolism, and excretion (ADME) properties, which are closely related to pharmacokinetics.26−28 VolSurf probes each grid vertex around a molecule with chemical probes and then compresses 3D maps of the interaction energies of the molecule and probes into a few quantitative 2D numerical descriptors. A total of nine probes are used in VolSurf: water (OH2), a hydrophobic probe (DRY), an amphipathic probe (BOTH), H-bonding carbonyl (O), sp2 carboxy oxygen atom (O::), sp2 phenolate oxygen (O−), neutral flat NH (N1), sp2 N with one lone pair (N: ), and sp3 amine NH3 cation (N3+). For VolSurf calculations, OH2

clinical inhibitors

class non-B268 analogues P1/P1′ analogues of B268

P2/P2′ and central hydroxy analogues

cyclic sulfamide compounds

ID

log(kon)

−log(koff)

−log(KD)

B355 B295 A037 A038 B268 B277 B408 B409 B412 B429 B439 B440 A015 A016 A017 A018 B249 B322 B347 B365 B369 B388 B425 B435 A021 A023 A024 A030 A045 A047 Saq Lop Nelf Ind Rit Amp Ataz

6.033424 5.955207 5.30963 4.466868 5.550228 2.127105 5.948902 5.541579 5.257679 5.509203 4.909021 5.678518 5.037426 2.235528 4.639486 5.541579 4.612784 6.267172 3.963788 5.482874 6.805501 6.775974 5.823474 5.004321 5.836957 5.30103 5.344392 5.70927 5.698101 5.274158 5.912222 6.821514 5.821514 6.184691 6.593286 6.646404 6.222716

0.42829 0.36051 3.43771 3.31247 2.43533 2.31426 2.77211 3.36452 3.08778 3.42829 2.78781 3.51856 0.0278 1.21824 0.74715 0.3242 0.56384 1.16941 1.56864 1.51004 1.87615 1.64397 0.63078 2.18509 1.56384 0.85699 1.16431 1.37675 0.58004 1.15677 3.64397 3.18442 3.17522 2.80134 2.66555 2.31158 3.16115

6.436519 6.376751 8.709965 7.756962 7.966576 4.415669 7.675718 8.517126 7.739929 8.928118 7.465974 9.191789 5.153045 3.510042 5.301899 5.886057 5.101824 7.316953 5.524329 6.9914 7.856985 8.354578 7.446117 7.156767 7.400117 5.943095 6.464706 6.772113 6.170696 6.238824 9.501689 9.995679 8.785156 8.970616 9.216096 8.958607 9.394695

and DRY probes are generally used in most cases, and other probes can be selectively used in different cases. VolSurf descriptors are independent of superposition and thus overcome the shortage of traditional CoMFA and comparative molecular similarity index analysis (CoMSIA) methods. Also, the calculation process is fully automated and very fast, which benefits a large-scale computation. After removal of the counterions and addition of hydrogen atoms, the 37 HIV-1 PIs were charged by the MMFF94 method and then optimized by Powell’s method using the Tripos force field (Sybyl 8.1). The optimal conformer of each molecule was then used to generate Volsurf descriptors. Herein, five probes (OH2, DRY, BOTH, O, and N:) were used, and a total of 118 descriptors were generated, of which 112 were GRID descriptors and six were non-GRID descriptors for polarizability (POL), molecular weight (MW), elongation (Elon), fixed elongation (EEER), diffusivity (DIFF), and water/octanol partition coefficient (LogP). Five types of physiochemical properties can be extracted from these Volsurf descriptors: size and shape, hydrophilic properties, hydrophobic properties, interaction energy (integy) moments, and mixed descriptors. B

DOI: 10.1021/acs.jcim.6b00326 J. Chem. Inf. Model. XXXX, XXX, XXX−XXX

Article

Journal of Chemical Information and Modeling

the subsequent forward feature selection and PLS modeling. The determination coefficient (R2), the three-fold cross-validated R2 (Q2), and the R2 for the test data set (R2pred) were used for model evaluations.

3. RESULTS AND DISCUSSION 3.1. Feature Selection and PLS Modeling of −log(koff), log(kon), and −log(KD). Table S1 in the Supporting Information shows the results of forward feature selection and PLS modeling of −log(koff). It can be seen that all seven of these models achieve fairly good performances on both the training and test data sets by using less than eight variables. On the basis of a consideration of model complexity and interpretability, model 4 (R2 = 0.704; R2pred = 0.783) was chosen as the optimal model. This model uses only four variables: ID6-DRY, BV12-OH2, BV31-OH2, and Cw4-OH2 (Table S1 and Table 2). Table S2 shows the results of forward selection and PLS modeling of log(kon). It can be observed that models 6, 7, 8, and 9 have better predictive capabilities for both the training and test samples compared with the other candidate models. On the basis of a consideration of model complexity and interpretability, model 7 (R2 = 0.573; R2pred = 0.647) was selected as the optimal model. This model uses seven variables: D23DRY, BV12-OH2, Cw5-OH2, CP, HB1-O, W6, and D23-OH2 (Table S2 and Table 2). Because of the relationships among log(KD), log(koff), and log(kon), we first employed the four descriptors in the −log(koff) model and the seven descriptors in the log(kon) model (Table 2) to establish a PLS model of −log(KD). After removal of the duplicated BV12-OH2 descriptor, 10 variables were used for forward feature selection and PLS modeling. From the results (Table S3), it can be seen that model 4 with four variables achieves the best performance, with R2 = 0.578 and R2pred = 0.677 (Table 2). In comparison with model 4, forward feature selection from the original 118 descriptors and PLS modeling of −log(KD) were further investigated. The results (Table S4) showed that model 15 with five variables was the best model, with

Figure 1. Distribution of kinetic properties of the 37 HIV-1 protease inhibitors. Symbols: red circles, P1/P1′ analogues of B268; green triangles, P2/P2′ analogues of B268; black squares, non-B268 analogues; blue circles, cyclic sulfamide compounds; purple squares, clinical inhibitors. Diagonal lines refer to particular KD values (as labeled), calculated as the ratio of the dissociation rate constant to the association rate constant.

2.3. PLS Modeling. Partial least-squares (PLS) is a technique that combines features from principal component analysis (PCA) and multiple linear regression.29 PLS is especially useful when (1) the number of descriptors is similar to or higher than the number of observations and/or (2) the descriptors are highly correlated with each other. PLS has been proved to be more reliable than other techniques especially in cases of small sample size and low tolerance. In this paper, 118 descriptors were used for feature selection by forward stepwise regression (SPSS version 10.0), of which the entry and removal probabilities were set to 0.05 and 0.10, respectively. Twentyeight HIV-1 PIs were randomly selected as training samples using the random function in MATLAB, and the remaining nine were used as test samples. Then the values of log(kon), −log(koff), and −log(KD) were used as the target variables for

Table 2. Descriptors Involved in the PLS Models of log(kon), −log(koff), and −log(KD) descriptors involved in the optimal PLS modelsa −log(KD) class

description

size and shape descriptors

best volume

descriptors of hydrophilic regions

capacity factors hydrophilic regions hydrophobic integy moment hydrophobic local interaction energy minima distances critical packing parameter hydrogen bonding

interaction energy moments mixed descriptors

−log(koff)

log(kon)

model 4

model 15

BV31-OH2 BV12-OH2 Cw4-OH2

BV12-OH2

BV31-OH2

Cw5-OH2 W6

Cw5-OH2

BV31-OH2 BV11-OH2 Cw5-OH2

ID6-DRY D23-DRY

ID4-DRY D23-DRY

ID6-DRY D23-DRY D23-OH2 CP HB1-O

BV11-OH2 and BV31-OH2 are the first and third largest local hydrophilic volumes at −1.0 kcal/mol produced with the OH2 probe, respectively. BV12-OH2 is the largest local hydrophilic volume at −3.0 kcal/mol produced with the OH2 probe; Cw5-OH2 is the concentration of polar interactions on the molecular surface at the fifth energy level (−3.0 kcal/mol). Cw4-OH2 is the concentration of polar interactions on the molecular surface at the fourth energy level (−2.0 kcal/mol). ID4-DRY and ID6-DRY are the distances between the molecular center of mass and the barycenter of the hydrophobic interaction region at the fourth (−2.0 kcal/mol) and sixth (−4.0 kcal/mol) energy levels. D23-OH2 and D23-DRY are the distances between the second and third local minima of the interaction energy obtained using the OH2 and DRY probes, respectively. W6 is the molecular envelope accessible by solvent water molecules and accounts for polarizability and dispersion forces. CP is the ratio of the hydrophobic and hydrophilic parts of a molecule. a

C

DOI: 10.1021/acs.jcim.6b00326 J. Chem. Inf. Model. XXXX, XXX, XXX−XXX

Article

Journal of Chemical Information and Modeling Table 3. Final PLS Models of log(kon), −log(koff), and −log(KD) model −log(koff)

BV31-OH2 (0.420)

log(kon) −log(KD)

BV31-OH2 (0.536)

BV12-OH2 (−0.495) BV12-OH2 (0.328)

Cw5-OH2 (−0.305) Cw5-OH2 (−0.578) Cw5-OH2 (−0.704)

variables (standard coefficients)

R2 a

Q2 b

R2predc

ID6-DRY (0.584)

0.695

0.695

0.772

0.573

0.549

0.647

0.578

0.577

0.677

ID6-DRY (0.658)

D23-DRY (−0.394) D23-DRY (−0.452)

HB1-O (−0.264)

W6 (0.283)

CP (−0.233)

D23-OH2 (−0.246)

a

The determination coefficient of the regression equation for the training set. bThe determination coefficient of three-fold cross-validation. cThe determination coefficient of the regression equation for the test set.

Figure 2. PLS modeling results for −log(koff). (a, b) Scatter plots of experimental vs predicted −log(koff) values for (a) the 28 training samples and (b) the nine test samples. (c) First two principal component scores for the 28 training samples. The color legend represents the activity range of the samples. (d) Loadings of the independent variables in the first two principal components.

was selected in the log(kon) and −log(KD) models, while Cw4-OH2 was selected in the −log(koff) model. These two variables indicate the concentration of polar interactions on the molecular surface, but at different energy levels. The correlation coefficient between the two variables is 0.913 (p < 0.01). On the basis of model interpretability, Cw4-OH2 was thus replaced by Cw5-OH2 in the model of −log(koff). The performance of the new −log(koff) model is equivalent to that of the original model, with R2 = 0.695, Q2 = 0.695, and R2pred = 0.772. Table 3 shows the performances and selected variables of the final PLS models of −log(koff), log(kon), and −log(KD). It can be deduced that the most important variables for the kinetic properties of HIV-1 protease inhibitors, i.e., BV31-OH2, BV12OH2, Cw5-OH2, ID6-DRY, and D23-DRY, are all derived from the OH2 and DRY probes. BV31-OH2 and BV12-OH2 are related to the local hydrophilic volumes; Cw5-OH2 is

R2 = 0.648 and R2pred = 0.729. By comparison of model 15 with model 4, it can be seen that the variables selected in the two models are very similar: three variables (BV31-OH2, Cw5-OH2, and D23-DRY) are common, and an integy moment variable differs only in the energy level (ID4-DRY vs ID6-DRY, respectively) (Table 2). Further analysis showed that these two DRY variables are significantly correlated with each other (R = 0.886, p < 0.01). Thus, it can be deduced that the two PLS models derived from different variable selection schemes are consistent with each other in terms of variables selected and model performance. On the basis of a consideration of both model performance and logistic reasoning, model 4 was chosen as the optimal model for −log(KD); all of its variables are screened from those of the −log(koff) and log(kon) models (Table 2). By comparison of the variables in the PLS models of −log(koff), log(kon), and −log(KD) (Table 2), it can be seen that Cw5-OH2 D

DOI: 10.1021/acs.jcim.6b00326 J. Chem. Inf. Model. XXXX, XXX, XXX−XXX

Article

Journal of Chemical Information and Modeling

Figure 3. PLS modeling results for log(kon). (a, b) Scatter plots of experimental vs predicted log(kon) values for (a) the 28 training samples and (b) the nine test samples. (c) First two principal component scores for the 28 training samples. The color legend represents the activity range of the samples. (d) Loadings of the independent variables in the first two principal components.

Nelf, and Saq, are all distributed in the upper right corner. Figure 2d shows the loadings of independent variables in the first two principal components. It can be seen that BV31-OH2 and ID6-DRY make positive contributions to −log(koff). That is to say, relatively weaker hydrophilic (polar) interactions (−1.0 kcal/mol) and a larger imbalance between the center of mass and the hydrophobic region benefit a lower dissociation rate or longer residence time. On the contrary, BV12-OH2 and Cw5-OH2 make negative contributions to −log(koff), indicating that relatively stronger hydrophilic (polar) interactions (−3.0 kcal/mol) and larger polar surface benefit a higher dissociation rate or shorter residence time. 3.2.2. log(kon). Figure 3a,b shows scatter plots of experimental versus predicted log(kon) values for the training and test data sets, respectively. It can be observed that compounds B277 and A016 deviate largely from the regression line through the origin. The relatively larger prediction errors may be caused by the lower association rate constants of the two compounds. Figure 3c shows the first two principal component scores for the 28 training samples. The results show that most of the training samples are located in the first, second, and fourth quadrants. Also, it can be seen that the majority of the training samples are distributed around the five clinical drugs, which indicates similar association properties. Figure 3d shows the loadings of the independent variables in the first two principal components. It can be inferred that BV12-OH2 makes the largest positive contribution to log(kon). That is to say, strong

related to the concentration of polar interactions on the molecular surface; ID6-DRY is the distance between the molecular center of mass and the barycenter of the hydrophobic interaction region; and D23-DRY represents the distance between local minima of the interaction energy between the DRY probe and the target molecule. Thus, it can be concluded that both the kinetic and thermodynamic properties of HIV-1 protease inhibitors are closely related to the hydrophobic and polar interactions as well as the unbalanced degree of deviation of the hydrophobic barycenter from the center of mass. To further validate the derived PLS models, 30 repeats of PLS modeling based on randomly selected training and test samples were performed. The results showed that the average predictive capabilities of the three models are fairly good with small deviations (Tables S5 and S6). 3.2. Optimal PLS Models of −log(koff), log(kon), and −log(KD). 3.2.1. −log(koff). Figure 2a,b shows scatter plots of experimental versus predicted −log(koff) values for the training and test data sets, respectively. It is clear that all of the data points are distributed along regression lines through origin very well. Figure 2c shows the first two principal component scores for the 28 training samples. It can be observed that the sample scores in the first two components can characterize the dissociation rate constants very well. Along the diagonal from the third quadrant to the first quadrant, there is an obvious tendency of the decreased dissociation rate constants. Meanwhile, it can be seen that five clinical drugs, i.e., Amp, Ind, Rit, E

DOI: 10.1021/acs.jcim.6b00326 J. Chem. Inf. Model. XXXX, XXX, XXX−XXX

Article

Journal of Chemical Information and Modeling

Figure 4. PLS modeling results for −log(KD). (a) Scatter plots of experimental vs predicted −log(KD) values for (a) the 28 training samples and (b) the nine test samples. (c) First two principal component scores for the 28 training samples. The color legend represents the activity range of the samples. (d) Loadings of the independent variables in the first two principal components.

association process. BV12-OH2 was not selected in the final −log(KD) model. The reason for this may be the counteraction of its effects on the dissociation and association processes, which results in no significant contribution to the apparent −log(KD). 3.3. Graphical Representation of Volsurf Descriptors. Figure 5 shows the Volsulf descriptors of the two clinical compounds Nelf and Ind with equivalent equilibrium dissociation constants. By comparison, Nelf has lower dissociation and association rate constants than Ind (Table 1 and Figure 1). Figure 5a−d shows the hydrophilic volumes of Nelf and Ind generated by the OH2 probe at energy levels of −1.0 and −3.0 kcal/mol. It can be observed that the total hydrophilic volume of Nelf is similar to that of Ind at both energy levels. Also, there are no significant differences in BV31-OH2 and BV12-OH2 between these two compounds. Figure 5e,f shows the distances between the molecular center of mass and the barycenter of the hydrophobic interaction region at eight energy levels. It is obvious that Nelf has a larger unbalanced degree of hydrophobic distribution at the sixth energy level compared to Ind, which may be a determinant factor for the lower dissociation rate. Figure 6 shows the Volsulf descriptors of compounds A024 and B355. The association and dissociation rate constants of A024 are both lower than those of B355, although the equilibrium dissociation constants of the two compounds are similar to each other. From Figure 6a−f it can be observed that A024 has a larger BV31-OH2 value than B355, while the

hydrophilic (polar) interactions are beneficial to a fast association process. On the contrary, Cw5-OH2, D23-DRY, HB1-O, etc. make negative contributions to log(kon). It is interesting to note the different roles of BV12-OH2 and Cw5-OH2 in the −log(koff) and log(kon) models. BV12-OH2 benefits both the association and dissociation processes, while Cw5-OH2 favors the dissociation process and disfavors the association process. This finding may provide an important cue for the design of HIV-1 protease inhibitors with desired kinetic properties. 3.2.3. −log(KD). Figure 4a,b shows scatter plots of experimental versus predicted −log(KD) values for the training and test data sets, respectively. Again, the prediction errors of B277 and A016 are relatively larger compared with those of other samples because of the lower association rate constants. The score plot for the 28 training samples is shown in Figure 4(c). An obvious tendency of increasing −log(KD) values can be observed along the diagonal from the third quadrant to the first quadrant. In addition, the five clinical drugs in the training data set are all located in the upper right corner, which indicates strong binding affinities. Figure 4d shows the loadings of the four independent variables in the first two principal components. Consistent with the conclusions from the −log(koff) and log(kon) models, BV31OH2 and ID6-DRY make large positive contributions to −log(KD) by slowing the dissociation process, while Cw5-OH2 makes a negative contribution to −log(KD) by facilitating the dissociation process and, at the same time, hampering the F

DOI: 10.1021/acs.jcim.6b00326 J. Chem. Inf. Model. XXXX, XXX, XXX−XXX

Article

Journal of Chemical Information and Modeling

BV12-OH2 and Cw5-OH2 values for A024 are similar to those for B355. However, the unbalanced degree of the hydrophobic distribution of A024 is smaller than that of B355 at the sixth energy level (Figure 6e,f), which is beneficial to a slow dissociation process. Thus, it can be induced that the relatively lower dissociation rate of A024 is mainly caused by the larger BV31-OH2 value.

4. CONCLUSIONS It is well-known that quantitative predictions of association and dissociation rate constants have always been a challenging problem in the computational chemistry domain. In this work, grid-based VolSurf descriptors were successfully applied to modeling of both the kinetic and thermodynamic properties of a series of HIV-1 protease inhibitors. The results proved that nine VolSurf descriptors derived from water (OH2) and hydrophobic (DRY) probes are dominant factors for the kinetic and thermodynamic properties of HIV-1 protease inhibitors. To the best of our knowledge, this is the first time that interconsistent, highly predictive, and reasonable models have been established for both the kinetic and thermodynamic properties. The VolSurf descriptors can characterize size, shape, polarity, hydrophobicity, and unbalanced degree of property distribution. Also, the descriptors can be calculated rapidly and are independent of molecular alignments. Thus, the VolSurf descriptors can be widely applied to the design and optimization of lead compounds with desired kinetic and thermodynamic properties.

Figure 5. Volsulf properties of Nelf and Ind. (a−d) Hydrophilic volumes of (a, c) Nelf and (b, d) Ind at energy levels of (a, b) −1.0 kcal/mol and (c, d) −3.0 kcal/mol. (e, f) Distances between the molecular center of mass and the barycenter of the hydrophobic interaction region at eight energy levels for (e) Nelf and (f) Ind.



ASSOCIATED CONTENT

* Supporting Information S

The Supporting Information is available free of charge on the ACS Publications website at DOI: 10.1021/acs.jcim.6b00326. Structures of the 37 HIV-1 protease inhibitors (Figure S1); forward feature selection and PLS modeling of −log(koff) (Table S1) and log(kon) (Table S2); forward feature selection and PLS modeling of −log(KD) based on the descriptors from the optimal log(kon) and log(koff) models (Table S3); forward feature selection and PLS modeling of −log(KD) (Table S4); performances (Table S5) and statistics (Table S6) of 30 repeats of PLS modeling (PDF)



AUTHOR INFORMATION

Corresponding Author

*Phone: +86-23-65102507. E-mail: [email protected]. Notes

The authors declare no competing financial interest.



ACKNOWLEDGMENTS This research was supported by the Natural Science Foundation of Chongqing (2013jcyjA10063) and the “111” Project of “Introducing Talents of Discipline to Universities”.



Figure 6. Volsulf properties of compounds A024 and B355. (a−d) Hydrophilic volumes of (a, c) A024 and (b, d) B355 at energy levels of (a, b) −1.0 kcal/mol and (c, d) −3.0 kcal/mol. (e, f) Distances between the molecular center of mass and the barycenter of the hydrophobic interaction region at eight energy levels for (e) A024 and (f) B355.

REFERENCES

(1) Kohl, N. E.; Emini, E. A.; Schleif, W. A.; Davis, L. J.; Heimbach, J. C.; Dixon, R. A. F.; Scolnick, E. M.; Sigal, I. S. Active Human Immunodeficiency Virus Protease Is Required for Viral Infectivity. Proc. Natl. Acad. Sci. U. S. A. 1988, 85, 4686−4690. (2) De Clercq, E. HIV-chemotherapy and -prophylaxis: New Drugs, Leads and Approaches. Int. J. Biochem. Cell Biol. 2004, 36, 1800−1822.

G

DOI: 10.1021/acs.jcim.6b00326 J. Chem. Inf. Model. XXXX, XXX, XXX−XXX

Article

Journal of Chemical Information and Modeling (3) De Clercq, E. Anti-HIV Drugs: 25 Compounds Approved within 25 Years after the Discovery of HIV. Int. J. Antimicrob. Agents 2009, 33, 307−320. (4) Copeland, R. A.; Pompliano, D. L.; Meek, T. D. Drug-target Residence Time and Its Implications for Lead Optimization. Nat. Rev. Drug Discovery 2006, 5, 730−739. (5) Dahl, G.; Akerud, T. Pharmacokinetics and the Drug−target Residence Time Concept. Drug Discovery Today 2013, 18, 697−707. (6) Kalidas, C. Chemical Kinetic Methods: Principles of Fast Reaction Techniques and Applications; New Age International (P) Ltd.: Delhi, 2005. (7) Leavitt, S.; Freire, E. Direct Measurement of Protein Binding Energetics by Isothermal Titration Calorimetry. Curr. Opin. Struct. Biol. 2001, 11, 560−566. (8) Couffin-Hoarau, A. C.; Aubertin, A. M.; Boustta, M.; Schmidt, S.; Fehrentz, J. A.; Martinez, J.; Vert, M. Peptide-Poly(L-lysine citramide) Conjugates and their In Vitro Anti-HIV Behavior. Biomacromolecules 2009, 10, 865−876. (9) Delille, C. A.; Pruett, S. T.; Marconi, V. C.; Lennox, J. L.; Armstrong, W. S.; Arrendale, R. F.; Sheth, A. N.; Easley, K. A.; Acosta, E. P.; Vunnava, A.; Ofotokun, I. Effect of Protein Binding on Unbound Atazanavir and Darunavir Cerebrospinal Fluid Concentrations. J. Clin. Pharmacol. 2014, 54, 1063−1071. (10) Esseghaier, C.; Ng, A.; Zourob, M. A Novel and Rapid Assay for HIV-1 Protease Detection Using Magnetic Bead Mediation. Biosens. Bioelectron. 2013, 41, 335−341. (11) Sussman, F.; Villaverde, M. C.; Dominguez, J. L.; Danielson, U. H. On the Active Site Protonation State in Aspartic Proteases: Implications for Drug Design. Curr. Pharm. Des. 2013, 19, 4257−4275. (12) Ershov, P. V.; Gnedenko, O. V.; Molnar, A. A.; Lisitsa, A. V.; Ivanov, A. S.; Archakov, A. I. Kinetic and Thermodynamic Analysis of Dimerization Inhibitors Binding to HIV Protease Monomers by Surface Plasmon Resonance. Biochemistry (Moscow) Supplement Series B: Biomed. Chem. 2012, 6, 94−97. (13) Zheng, X. W.; Bi, C.; Li, Z.; Podariu, M.; Hage, D. S. Analytical Methods for Kinetic Studies of Biological Interactions: A Review. J. Pharm. Biomed. Anal. 2015, 113, 163−180. (14) Wang, J. L.; Skolnik, S. Recent Advances in Physicochemical and ADMET Profiling in Drug Discovery. Chem. Biodiversity 2009, 6, 1887−1899. (15) Rossi, A. M.; Taylor, C. W. Analysis of Protein-ligand Interactions by Fluorescence Polarization. Nat. Protoc. 2011, 6, 365− 387. (16) Zoete, V.; Michielin, O.; Karplus, M. Protein-ligand Binding Free Energy Estimation Using Molecular Mechanics and Continuum Electrostatics. Application to HIV-1 Protease Inhibitors. J. Comput.Aided Mol. Des. 2003, 17, 861−880. (17) Cheng, Y. A.; Li, D. C.; Ji, B. H.; Shi, X. H.; Gao, H. J. Structurebased Design of Carbon Nanotubes as HIV-1 Protease Inhibitors: Atomistic and Coarse-grained Simulations. J. Mol. Graphics Modell. 2010, 29, 171−177. (18) Wright, D. W.; Hall, B. A.; Kenway, O. A.; Jha, S.; Coveney, P. V. Computing Clinically Relevant Binding Free Energies of HIV-1 Protease Inhibitors. J. Chem. Theory Comput. 2014, 10, 1228−1241. (19) Li, D.; Ji, B.; Hwang, K.-C.; Huang, Y. Strength of Hydrogen Bond Network Takes Crucial Roles in the Dissociation Process of Inhibitors from the HIV-1 Protease Binding Pocket. PLoS One 2011, 6, e19268. (20) Andersson, K.; Hämäläinen, M. D. Replacing Affinity with Binding Kinetics in QSAR Studies Resolves Otherwise Confounded Effects. J. Chemom. 2006, 20, 370−375. (21) Kurup, A.; Mekapati, S. B.; Garg, R.; Hansch, C. HIV-1 Protease Inhibitors: A Comparative QSAR Analysis. Curr. Med. Chem. 2003, 10, 1679−1688. (22) Schaal, W. Computational Studies of HIV-1 Protease Inhibitors. In Comprehensive Summaries of Uppsala Dissertations from the Faculty of Pharmacy, Vol. 263; Acta Universitatis Upsaliensis: Uppsala, Sweden, 2002.

(23) Chiu, S. H.; Xie, L. Toward High-throughput Predictive Modeling of Protein Binding/Unbinding Kinetics. J. Chem. Inf. Model. 2016, 56, 1164−1174. (24) Shuman, C. F.; Vrang, L.; Danielson, U. H. Improved Structureactivity Relationship Analysis of HIV-1 Protease Inhibitors Using Interaction Kinetic Data. J. Med. Chem. 2004, 47, 5953−5961. (25) Markgren, P. O.; Schaal, W.; Hamalainen, M.; Karlen, A.; Hallberg, A.; Samuelsson, B.; Danielson, U. H. Relationships Between Structure and Interaction Kinetics for HIV-1 Protease Inhibitors. J. Med. Chem. 2002, 45, 5430−5439. (26) van de Waterbeemd, H.; Gifford, E. ADMET in Silico Modelling: Towards Prediction Paradise? Nat. Rev. Drug Discovery 2003, 2, 192−204. (27) Ermondi, G.; Caron, G.; Pintos, I. G.; Gerbaldo, M.; Perez, M.; Perez, D. I.; Gandara, Z.; Martinez, A.; Gomez, G.; Fall, Y. An Application of Two MIFs-based Tools (Volsurf plus and Pentacle) to Binary QSAR: The Case of a Palinurin-related Data Set of Non-ATP Competitive Glycogen Synthase Kinase 3 Beta (GSK-3 beta) Inhibitors. Eur. J. Med. Chem. 2011, 46, 860−869. (28) Tintori, C.; Magnani, M.; Schenone, S.; Botta, M. Docking, 3DQSAR Studies and in Silico ADME Prediction on C-Src Tyrosine Kinase Inhibitors. Eur. J. Med. Chem. 2009, 44, 990−1000. (29) Geladi, P.; Kowalski, B. R. Partial Least-squares Regression: A Tutorial. Anal. Chim. Acta 1986, 185, 1−17.

H

DOI: 10.1021/acs.jcim.6b00326 J. Chem. Inf. Model. XXXX, XXX, XXX−XXX