J. Med. Chem. 2009, 52, 4488–4495. DOI: 10.1021/jm9004658

In Silico Prediction of Volume of Distribution in Human Using Linear and Nonlinear Models on a 669 Compound Data Set

Giuliano Berellini,†,§ Clayton Springer,‡ Nigel J. Waters,§ and Franco Lombardo*,§

†Laboratory of Chemometrics, Department of Chemistry, University of Perugia, 06123 Perugia, Italy, ‡Computational Chemistry Group, Novartis Institutes for Biomedical Research, Cambridge, Massachusetts 02139, and §Metabolism and Pharmacokinetics Group, Novartis Institutes for Biomedical Research, 250 Massachusetts Avenue, Cambridge, Massachusetts 02139

Received April 10, 2009

The prediction of human pharmacokinetics early in the drug discovery cycle has become of paramount importance, aiding candidate selection and benefit-risk assessment. We present herein computational models to predict human volume of distribution at steady state (VDss) entirely from in silico structural descriptors. Using both linear and nonlinear statistical techniques, namely partial least-squares (PLS) and random forest (RF) modeling, we explored a recently published data set of human VDss values for 669 drug compounds (Drug Metab. Disp. 2008, 36, 1385-1405). Descriptors covering 2D and 3D molecular topology, electronics, and physical properties were calculated using MOE and VolSurf+. Model evaluation was accomplished using a leave-class-out approach on nine therapeutic or structural classes. The models were assessed using an external test set of 29 additional compounds. Our analysis generated models, based either on a single method or on a consensus, that were able to predict human VDss within a geometric mean 2-fold error, a predictive accuracy considered good even for more resource-intensive approaches such as those requiring data generated from studies in multiple animal species.

Introduction

The prediction of human pharmacokinetics for new compounds has become an important process in drug research, and many reports, understandably, have focused on prediction methods that utilize animal pharmacokinetic1-10 as well as in vitro data.11-16 Recently, the availability of computational chemistry methodologies has increased, and these have been applied to the prediction of human pharmacokinetics and/or general ADMET properties.17-24 The advantages of the latter endeavors are clear and manifold: they avoid the time-consuming, expensive acquisition of in vivo PK structure-activity relationships (SAR) (generally limited to rodent), the need to synthesize a compound or series of compounds, and the use of in vivo data from multiple animal species needed for extrapolative modeling.
The advantage of being able to prioritize synthesis and screening studies with a significantly higher probability of success is apparent. The steady-state volume of distribution (VDss) is a key pharmacokinetic parameter, which together with clearance determines the half-life and thus affects the dosing regimen of a compound. The dosing regimen is designed to maintain a free plasma concentration greater than that required to give the pharmacodynamic effect throughout the dosing interval, while lessening the maximal concentration (Cmax) and the potential for related side effects.

The construction of effective models requires not only sound computational tools but, very importantly, databases that have been carefully assembled. Human pharmacokinetic databases are challenging to compile because each data point typically derives from a separate report, and experimental approaches differ from report to report. Such variables include the numbers and types of study subjects (e.g., healthy vs diseased, gender, age, etc.), the routes of administration and doses, sample collection times, methods of sample analysis, and the types of pharmacokinetic parameters reported. We recently published a trend analysis of pharmacokinetic parameters based on simple physicochemical descriptors for a set comprising 670 compounds derived from original literature data, and we have detailed the quality of the data and the steps taken to gather and analyze them. This data set, to our knowledge, represents the largest publicly available data set of human PK parameters, and the full set of references was also reported as Supporting Information.25 Obach26 and Sui27 have recently reviewed the approaches taken to predict volume of distribution, and several authors have published in silico models for the prediction of volume of distribution,17-19 which is a fundamental parameter for the prediction of the dosage interval.

These data offer a good opportunity, first, to compare and contrast descriptors and statistical approaches on a high-quality data set, equal for all the methods used, in analogy with recent work published by Hughes et al.,28 and, second, to develop a rugged, comprehensive model with a larger data set than any of the previously reported models, encompassing a broader chemical space. This article details our approaches and results, providing predictive models based on a structurally diverse and carefully assembled data set.

*To whom correspondence should be addressed. Phone: (617) 871-4003. Fax: (617) 871-3078. E-mail: [email protected].

pubs.acs.org/jmc. Published on Web 06/17/2009. © 2009 American Chemical Society.

Results and Discussion

The computational and statistical work led to two major sets of results: one derived from the random forest (RF) approach29 and the other from a partial least-squares (PLS) analysis, thus encompassing potentially “orthogonal” nonlinear and linear statistical treatments. We also explored the application of “direct” MLR approaches and, even though we did not pursue them in detail after consideration of the RF and PLS analyses, we observe that, in general, linear models with exploration of suitable descriptors seem to represent a viable avenue for the prediction of some structure-property relationships,28 which, in this case, are exemplified by VDss. The best models from efforts using PLS and RF with either or both descriptor sets (MOE and/or VolSurf+) were combined to assess the potential of a consensus model (via the average of the logarithms of VDss). The summary statistics of the leave-class-out (LCO) analysis are shown in Table 1, and the performance of the models in predicting an external test set of 29 compounds is shown in Table 2. The results are also shown graphically in Figures 1 and 2, which refer to the consensus model predictions for the LCO and external test set, respectively.

The PLS models were built using either MOE or VolSurf+ or a combination of both descriptor calculations. A five-component model using a combination of 95 MOE, VolSurf+, ISIS keys, and SMARTS string descriptors showed improved performance over the PLS model that used MOE descriptors alone. All leave-class-out groups gave GMFE values less than 2 with the exception of the NSAIDs and steroids. The mean LCO GMFE was 1.8, with 70% predicted within 2-fold and 85% predicted within 3-fold of the observed figure. For the external test set, this model gave rise to a GMFE of 2.2, with about 55% of compounds predicted within 2-fold and 79% within 3-fold. The PLS model using just 11 VolSurf+ descriptors showed some further enhancements. Although the GMFE for the NSAIDs remained greater than 2, there was a marked improvement in the prediction of the steroid class (GMFE 2.0).
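The summary statistics quoted above (GMFE, percentage within 2-fold and 3-fold) and the consensus by averaging log VDss can be sketched in a few lines. This is a minimal illustration, not the authors' code: the excerpt does not spell out the GMFE formula, so the common definition 10^mean(|log10(predicted/observed)|) is assumed, and the function names and VDss values are hypothetical.

```python
import math


def fold_error(pred, obs):
    """Fold error of one prediction; always >= 1, symmetric for over/under-prediction."""
    return max(pred / obs, obs / pred)


def gmfe(predicted, observed):
    """Geometric mean fold error, assumed here to be 10 ** mean(|log10(pred / obs)|)."""
    logs = [math.log10(fold_error(p, o)) for p, o in zip(predicted, observed)]
    return 10 ** (sum(logs) / len(logs))


def percent_within(predicted, observed, fold):
    """Percentage of compounds predicted within the given fold of the observed value."""
    hits = sum(fold_error(p, o) <= fold for p, o in zip(predicted, observed))
    return 100.0 * hits / len(observed)


def consensus(pred_a, pred_b):
    """Average the logarithms of two VDss predictions, then back-transform."""
    return [10 ** ((math.log10(a) + math.log10(b)) / 2) for a, b in zip(pred_a, pred_b)]


# Hypothetical VDss values (L/kg), invented purely for illustration.
observed = [1.0, 2.0, 10.0, 1.0]
model_a = [2.0, 1.0, 10.0, 4.0]
model_b = [0.5, 4.0, 10.0, 1.0]
combined = consensus(model_a, model_b)

print("GMFE model A:  ", round(gmfe(model_a, observed), 2))
print("GMFE model B:  ", round(gmfe(model_b, observed), 2))
print("GMFE consensus:", round(gmfe(combined, observed), 2))
print("% within 2-fold (model A):", percent_within(model_a, observed, 2.0))
```

Averaging in log space is equivalent to taking the geometric mean of the two predictions, so opposite-signed errors from the two models partly cancel; in this toy data the consensus GMFE is lower than that of either model.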
However, the GMFE for the tricyclic antidepressant class increased from 1.7 to 3.0. Overall, the average LCO GMFE was 2.0, with 60% predicted within 2-fold and 80% within 3-fold. This model performed especially well in predicting the external test set; 79% fell within the 2-fold limit, 83% within the 3-fold limit, and the GMFE was the lowest of any model at 1.8.

Several RF models were also built using either MOE or VolSurf+ descriptors, or a combination of both descriptor sets, as well as a simplified three-descriptor model (logP, acidic, and basic character). The performance of the RF models was found to be very similar irrespective of the physicochemical parameters used. It is interesting to note that, except when MOE descriptors were used alone, the NSAID class was predicted with GMFE values close to 2. However, random forest models were unable to accurately capture the fluoroquinolone and morphinan classes, yielding GMFE values of 3.7 and 2.3, respectively. The best RF model, using a combination of both MOE and VolSurf+ descriptors, gave rise to a mean GMFE of 2.0 with 60% predicted within 2-fold and 81% within 3-fold. Similar values were also observed in prediction of the external test set. The RF model using just three descriptors covering lipophilicity and ionization state performed reasonably well (data not shown), although from this analysis it is clear that these three descriptors alone do not capture as much of the variance as the more complex models that use a greater number of variables. The more robust linear and nonlinear models utilize VolSurf+ descriptors either alone or in combination with MOE. In addition, the predictive accuracy of the pKa or fraction ionized calculation within VolSurf+ is likely a crucial determinant in the performance of the human VDss models.

In Figure 3, the 11 coefficients of model 3 are shown. WN1 and WN5 are the H-bond acceptor volumes calculated from the GRID MIF generated with the probe N1 (amide nitrogen) at -1 and -5 kcal/mol, respectively. The first represents the H-bond acceptor interaction surface, while the second represents the presence of strong electrostatic H-bond acceptor anchor points in a molecule (e.g., an ionized acidic group). It is therefore physically reasonable that the latter descriptor should negatively influence the volume of distribution, as it would be strongly correlated with the presence of a carboxylic (or other acidic) group. FLEX represents the maximum flexibility of a molecule, and LgD5-LgD10 are the 1-octanol/water distribution coefficients (logD) calculated at pH 5, 6, 7.5, 8, 9, and 10. The most important positive coefficients belong to the lipophilic profile at higher pH, expressed by logD from pH 7.5 to 10, while the most important negative coefficient belongs to the lipophilic profile at pH 5. Collectively, these descriptors identify, on the basis of the lipophilicity profile, acidic and basic compounds, with the latter generally showing significantly higher VDss values. This series of descriptors is likely to yield a more subtle and perhaps more accurate profile of a given compound than a single logP or logD value. In fact, for a neutral compound, logD is equal to logP along the entire pH range, and so the positive contribution of the lipophilicity to the VDss is given by the sum of all logD coefficients.

a Abbreviations: MIF, molecular interaction field; MLR, multiple linear regression; PCA, principal component analysis; PLS, partial least-squares; RF, random forest; SMARTS, SMILES arbitrary target specification.
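To illustrate why a logD profile over several pH values separates neutral, acidic, and basic compounds, the sketch below uses the textbook approximation for monoprotic compounds, in which only the neutral species is assumed to partition into octanol. This is not the VolSurf+ calculation; the function names, logP, and pKa values are hypothetical and chosen only for illustration.

```python
import math

# pH values at which the text says the logD descriptors are calculated.
PH_GRID = [5, 6, 7.5, 8, 9, 10]


def log_d(log_p, ph, pka=None, kind="neutral"):
    """Approximate logD of a monoprotic compound (only the neutral form partitions)."""
    if kind == "neutral":
        return log_p
    if kind == "acid":
        return log_p - math.log10(1 + 10 ** (ph - pka))
    if kind == "base":
        return log_p - math.log10(1 + 10 ** (pka - ph))
    raise ValueError("kind must be 'neutral', 'acid', or 'base'")


def profile(log_p, pka=None, kind="neutral"):
    """logD evaluated over the whole pH grid."""
    return [log_d(log_p, ph, pka, kind) for ph in PH_GRID]


# Hypothetical compounds sharing logP = 2.0.
neutral = profile(2.0)
acid = profile(2.0, pka=4.0, kind="acid")
base = profile(2.0, pka=9.0, kind="base")

for name, values in [("neutral", neutral), ("acid", acid), ("base", base)]:
    print(name, [round(v, 2) for v in values])
```

Under these assumptions the neutral profile is flat at logP, the acid profile falls with increasing pH, and the base profile rises, which is consistent with the text's observation that coefficients of opposite sign at low and high pH let the model distinguish acids from bases.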
It is interesting to note that the negative value of the coefficient for FLEX is reminiscent of Veber’s analysis of permeability and the negative impact of rotatable bonds on it.30 A high value for this descriptor could therefore be interpreted as being detrimental to a compound’s ability to diffuse into tissues across membranes. DRDRAC and DRDRDO represent, respectively, the triplet between hydrophobic (DR) and H-bond acceptor (AC) points and the triplet between hydrophobic (DR) and H-bond donor (DO) points. The pharmacophoric descriptor DRDRDO shows a positive correlation with VDss, i.e., a higher value of this descriptor would increase VDss, and it is generally higher for protonated bases. In contrast, the pharmacophoric descriptor DRDRAC shows a negative correlation with VDss, i.e., a higher value would decrease VDss, and this descriptor is generally associated with ionized acidic groups. DRDRAC, being pharmacophoric in nature, may account for the high propensity of lipophilic acidic compounds to bind albumin. It could be generally stated that the lipophilicity (profile) and charge state of a molecule (or its propensity toward yielding a negative or positive charge at physiological pH) are among the most important descriptors across the models developed. This confirms observations reported in the literature.14,15,17,19,20 As the PLS and RF models tended to lose predictive accuracy for certain classes across the data set, the potential of a consensus model was assessed to see whether deficiencies in one model could be compensated for by the other. On the basis of the statistics presented in Tables 1 and 2, a consensus of the PLS-MOE/VolSurf+ and PLS-VolSurf+ models was able to show some improvement in overall performance in terms of


Table 1. GMFE of Each Class Using Four Models (1, 2, 3, and the Consensus of Models 2 and 3) with Different Descriptors (Only VolSurf+ or Mixed VolSurf+ and MOE) and Different Statistical Regression Methods (Random Forest (RF) or Partial Least Squares (PLS))

method                            model 1           model 2           model 3        consensus of models 2 and 3
type of descriptors               MOE and VolSurf+  MOE and VolSurf+  VolSurf+ only
number of descriptors             280               95                11
statistical approach              RF                PLS (PC5)         PLS (PC6)

structural classes (no.)
β-adrenergics (27)                2.0               1.8               1.8            1.8
benzodiazepines (18)              1.8               1.6               1.6            1.5
cephalosporins (27)               1.5               1.5               2.0            1.5
fluoroquinolones (12)             3.7               1.3               1.9            1.3
morphinans (12)                   2.3               1.9               1.9            1.8
NSAIDs (16)                       1.8               2.2               2.3            2.2
nucleosides/nucleotides (31)      1.8               1.8               2.0            1.9
steroids (21)                     2.2               2.5               2.0            2.2
tricyclic antidepressants (8)     2.1               1.7               3.0            2.2

GMFE across all compds (172)      2.0               1.8               2.0            1.8
% within 2-fold                   60                70                60             67
% within 3-fold                   81                85                80             86
…                                 -0.62             -0.40             -0.70          -0.59