
ARTICLE pubs.acs.org/crt

Predicting Activation of the Promiscuous Human Pregnane X Receptor by Pharmacophore Ensemble/Support Vector Machine Approach

Ci-Nong Chen,† Yu-Hsuan Shih,† Yi-Lung Ding,† and Max K. Leong*,†,‡

†Department of Chemistry and ‡Department of Life Science and Institute of Biotechnology, National Dong Hwa University, Shoufeng, Hualien 97401, Taiwan

Supporting Information

ABSTRACT: The nuclear receptor human pregnane X receptor (hPXR) is a ligand-regulated transcription factor that responds to a wide range of endogenous and xenobiotic molecules. Upon activation by ligands, hPXR can increase the induction levels of metabolic enzymes and therefore plays a critical role in drug metabolism and excretion. Identifying the molecules that activate this protein can be of great help in predicting adverse drug interactions, which, nevertheless, cannot be accurately modeled without taking into account its promiscuous nature, namely, highly flexible protein conformation and multiple ligand orientations. An in silico model was developed to predict the activation of hPXR using the novel pharmacophore ensemble/support vector machine (PhE/SVM) scheme. The predictions by the PhE/SVM model are in good agreement with the experimental observations for those molecules in the training set (n = 32, r2 = 0.86, q2 = 0.80, RMSE = 0.37, s = 0.21) and test set (n = 120, r2 = 0.80, RMSE = 0.25, s = 0.19). In addition, this PhE/SVM model performed equally well for those molecules in the outlier set (n = 8, r2 = 0.91, RMSE = 0.15, s = 0.12) and met all of the validation criteria generally adopted to gauge the predictivity of a theoretical model. A mock test also verified its predictivity. When compared with crystal structures, the calculated results are consistent with the published hPXR–ligand cocomplex structure, and the plastic nature of hPXR is also revealed. Thus, this accurate, fast, and robust PhE/SVM model can be utilized for predicting the activation of promiscuous hPXR to facilitate drug discovery and development.

INTRODUCTION

The nuclear receptor pregnane X receptor (PXR; NR1I2) was first named because of its activation by pregnanes.1 It was not until 1998 that Kliewer et al.2 discovered PXR from mouse and Bertilsson et al. identified its counterpart from human.3 PXR is present in various body tissues.4 Predominately, it is expressed in liver and also can be found in colon, small intestines, kidney, testis, and embryonic tissue.3,5–7 PXR can be activated by a wide range of structurally diverse endogenous and xenobiotic molecules, including bile acids, steroids, clinical drugs, herbal medicines, nutrition supplements, and environmental contaminants.8 Upon activation with ligands, PXR dissociates and translocates from the cytosol into the nucleus to activate many transcription factors that regulate induction of metabolic enzymes, including transporters.9,10 Consequently, PXR is referred to as the “master” regulator of pathways involving the major metabolic enzymes.11 In addition, recent evidence also indicates its potential role in drug resistance to chemotherapy in various cancers12–15 as well as its association with inflammatory diseases, Crohn’s disease, and ulcerative colitis.16,17

© 2011 American Chemical Society

Activation of PXR by a given administered drug and the subsequent induction of metabolic enzymes may lead to accelerated metabolism of any other drugs in the case of polypharmacy, resulting in reduction of clinical efficacy or severe toxicity in the worst cases.11 The herbal antidepressant St. John’s wort (SJW), for example, can activate PXR, which, in turn, induces CYP3A4 expression, resulting in adverse drug–drug interaction with any other coadministered drugs that are metabolized by CYP3A4.18 Therefore, it is of critical importance to develop an in silico model to predict the ligand activation of hPXR in the process of drug discovery in the hope of reducing the attrition rates due to adverse drug–drug interaction.8 A number of pharmacophore, CoMFA, and QSAR models have been proposed to predict the activation of hPXR.19–27 Nevertheless, the effects of protein plasticity are ignored by these proposed models as illustrated by the fact that different pharmacophore hypotheses adopted different combinations of chemical features, depending on the sample selections to develop the

Received: July 27, 2011
Published: September 15, 2011

dx.doi.org/10.1021/tx200310j | Chem. Res. Toxicol. 2011, 24, 1765–1778

Chemical Research in Toxicology

Figure 1. The superposition of proteins in various cocomplex structures (PDB codes: 1ILH, 1SKX, 1M13, 2QNV, chain A of 1NRL, chain A of 2O9I, and chain A of 3HVL), which are color-coded by purple, crimson, green, yellow, gray, blue, and cyan, respectively.

predictive models.28 In other words, there is no agreement among published models, which can be presumably attributed to the promiscuous nature of PXR protein.8 The high level of PXR flexibility can be illustrated by recently published crystal structures, namely, unbound hPXR (PDB codes: 1ILG and 3CTB) and hPXR in cocomplexes with hyperforin (PDB code: 1M13), SR12813 (PDB codes: 1ILH, 1NRL, and 3HVL), rifampicin (PDB code: 1SKX), T0901317 (PDB code: 2O9I), and colupulone (PDB code: 2QNV).29–35 When superimposed, these proteins exhibit substantial structural discrepancies as illustrated by Figure 1, in which protein structures excerpted from cocomplex structures (PDB codes: 1ILH, 1SKX, 1M13, 2QNV, and chain A of 1NRL, 2O9I, and 3HVL) were aligned. It can be observed that the intrinsically flexible residues, namely, Lys210, Met243, His407, and Arg410, form the putative binding pocket, contributing to the plastic nature of hPXR. In addition to protein conformation, the plastic nature of hPXR is also manifested by substantial variations in the size of the binding pocket upon binding with structurally distinct ligands. For instance, the binding pocket volume of the unbound hPXR structure (PDB code: 1ILG) is about 334 Å³ as estimated by the CASTp package (available at http://sts-fw.bioengr.uic.edu/castp/calculation.php) using a 1.4 Å probe, whereas that of the hPXR–rifampicin cocomplex (PDB code: 1SKX) is about 589 Å³, a 76% increase in volume. As a result, any proposed analogue-based model that fails to take into account the plastic nature of hPXR can only be applied to those compounds that interact with hPXR in a specific protein conformation. In contrast to analogue-based modeling, structure-based modeling seems to be a better alternative. In fact, a number of docking studies have been reported.19,26,27,36–38 The ligand–protein interactions of such a highly flexible system


Figure 2. The superposition of chains A, B, and C in the hPXR–SR12813 cocomplex structure (PDB code: 1ILH), which are colored green, crimson, and blue, respectively.

Figure 3. Schematic presentation of PhE/SVM architecture.

can be modeled accurately by ensemble docking39 in conjunction with induced fit docking and molecular dynamics (MD).40 However, a large number of protein structures substantially increases the demand for computational resources, making it extremely difficult to implement in a high-throughput fashion. Additional challenges can be encountered in the process of predictive model development since a ligand can position itself in different directions for a given hPXR protein structure as


Table 1. Statistical Parameters: Correlation Coefficient (r2), RMSE, Maximum Residual (ΔMax), Average Residual (MAE), Standard Deviation of Residual (s), and Cross-Validation Coefficient (q2) Evaluated by Hypo A, Hypo B, Hypo C, Hypo D, and PhE/SVM in the Training Set

params    Hypo A    Hypo B    Hypo C    Hypo D    PhE/SVM
r2        0.84      0.76      0.77      0.74      0.86
RMSE      0.38      0.46      0.45      0.47      0.37
ΔMax      0.85      1.07      1.19      1.02      0.78
MAE       0.30      0.38      0.35      0.38      0.30
s         0.24      0.26      0.29      0.29      0.21
q2        N/A(a)    N/A       N/A       N/A       0.80

(a) Not applicable.

Table 3. Weights, Tolerances, Three-Dimensional Coordinates of Chemical Features, and Interfeature Distances of Pharmacophore Model Hypo A

Table 2. Statistical Parameters: Correlation Coefficient (r2), RMSE, Maximum Residual (ΔMax), Average Residual (MAE), and Standard Deviation of Residual (s) Evaluated by Hypo A, Hypo B, Hypo C, Hypo D, and PhE/SVM in the Test Set

params    Hypo A    Hypo B    Hypo C    Hypo D    PhE/SVM
r2        0.71      0.67      0.65      0.64      0.80
RMSE      0.30      0.32      0.33      0.33      0.25
ΔMax      1.04      1.31      1.25      1.29      0.95
MAE       0.17      0.19      0.19      0.20      0.16
s         0.25      0.25      0.27      0.27      0.19

demonstrated by Figure 2, in which chains A, B, and C excerpted from the hPXR–SR12813 cocomplex structure (PDB code: 1ILH) were superimposed. It was shown that SR12813 can take three dramatically different poses to interact with hPXR, resulting in different interactions between SR12813 and the target protein as depicted by Figure 2 of Watkins et al.35 Therefore, any molecular modeling method that fails to take into account diverse conformations of the hPXR protein and multiple orientations of a given ligand will yield fallible predictions or substantial deviations. Nevertheless, these seemingly intricate and yet critical issues can be resolved using a novel scheme recently proposed by Leong, in which a group of plausible pharmacophore hypothesis candidates were assembled to construct a pharmacophore ensemble (PhE), which, in turn, was treated as input for regression analysis via support vector machines (SVM).41 As such, each pharmacophore member in the PhE represents a single protein conformation or a group of protein conformations with similar spatial arrangements, and the architecture of PhE/SVM is schematically represented in Figure 3. It has been suggested by Yasuda et al. that the nature of multiple ligand–hPXR binding sites can be modeled by constructing multiple pharmacophore models for consensus prediction.27 However, it has been demonstrated that the PhE/SVM model performed better than the consensus prediction of a pharmacophore ensemble.41 More importantly, this PhE/SVM scheme takes into consideration protein structural flexibility as well as multiple ligand orientations, as demonstrated by numerous studies.41–43 The objective of this investigation was to develop an accurate, fast, and robust in silico model based on the PhE/SVM scheme to predict the activation of hPXR.
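As a rough illustration of the PhE/SVM architecture described above, the sketch below arranges the pEC50 values predicted by the individual pharmacophore hypotheses as features for SVM regression, using the sparse text format read by LIBSVM (`<label> <index>:<value> ...`). The function name, the feature ordering, and the numerical values are hypothetical and are not taken from the study; only the input format itself is LIBSVM's.

```python
from typing import List, Sequence, Tuple


def to_libsvm_line(observed_pec50: float, hypo_predictions: Sequence[float]) -> str:
    """Format one molecule as a LIBSVM row: '<label> 1:<v1> 2:<v2> ...'.

    The label is the observed pEC50; each feature is the pEC50 predicted by
    one pharmacophore hypothesis in the ensemble (assumed order: Hypo A-D).
    """
    features = " ".join(
        f"{index}:{value:g}"
        for index, value in enumerate(hypo_predictions, start=1)
    )
    return f"{observed_pec50:g} {features}"


# Hypothetical molecules: (observed pEC50, predictions by Hypo A, B, C, D).
training_rows: List[Tuple[float, List[float]]] = [
    (5.30, [5.10, 5.45, 5.02, 5.61]),
    (4.20, [4.55, 4.10, 4.38, 4.05]),
]

lines = [to_libsvm_line(observed, preds) for observed, preds in training_rows]
print("\n".join(lines))
```

With this layout the regression step sees one row per molecule and one column per ensemble member, so the SVM can weight the hypotheses differently in different regions of the input space rather than averaging them.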

Table 4. Weights, Tolerances, Three-Dimensional Coordinates of Chemical Features, and Interfeature Distances of Pharmacophore Model Hypo B

COMPUTATIONAL METHODS

Data Compilation. A comprehensive literature search was carried out to retrieve all available EC50 values of 160 molecules to maximize the structural diversity.20,26,44–50 To ensure better consistency, the average values were taken when there was more than one EC50 value in very close range for the same molecule and assay system. Furthermore, chemical structures were carefully scrutinized, and only compounds with defined stereochemistry were collected. All molecules enrolled in this study, their corresponding biological activities, and references to the literature are listed in Table S1 in the Supporting Information.

Conformation Generations. The MacroModel package (Schrödinger, Portland, OR) was employed to generate the stable conformations for each molecule using mixed Monte Carlo multiple minimum (MCMM)51/low mode52 in conjunction with the GB/SA hydration algorithm.53 Furthermore, the truncated-Newton conjugated gradient method (TNCG) with the force field selection of MMFFs54 was employed to conduct energy minimization, and the solvation effect was taken into account using water as solvent with a constant dielectric constant. The number of selected unique structures was limited


Table 5. Weights, Tolerances, Three-Dimensional Coordinates of Chemical Features, and Interfeature Distances of Pharmacophore Model Hypo C

Table 6. Weights, Tolerances, Three-Dimensional Coordinates of Chemical Features, and Interfeature Distances of Pharmacophore Model Hypo D

to 255 within the energy window of 20 kcal/mol (or 83.7 kJ/mol) above the global minimum energy conformation.

Training Set Selection. The generation of a good pharmacophore hypothesis, in large part, depends on the chemical and biological characteristics of those samples selected for the training set, which, theoretically, should include all classes of chemical structure with all ranges of biological activities. As such, any chemical or biological redundancy present in the samples will possibly give rise to overfitted or overtrained predictive models. More specifically, the key to constructing a good training set is to “teach” the program new knowledge from the input, which can be done, for example, by selecting structurally similar compounds with significantly different biological activities or structurally dissimilar compounds with similar biological activities. More detailed selection criteria have been described elsewhere.55,56 Thirty-two molecules with biological activities spanning about 5 logarithm units were deliberately selected from the compound collection to construct the training set for automatic pharmacophore generation and regression. The remaining 120 molecules from the compound collection, with biological activities spanning about 4 logarithm units, were treated as the test set to validate the generated pharmacophore hypotheses. Table S1 in the Supporting Information lists the compounds used in the training set and test set and their corresponding negative logarithm EC50 values, namely, pEC50, since all pharmacophore calculations are executed on a logarithmic scale.

Pharmacophore Development. Pharmacophore hypotheses were developed by the HypoGen protocol implemented in DiscoveryStudio (Accelrys, San Diego, CA) using a variety of combinations of hydrogen-bond acceptor (HBA), hydrogen-bond donor (HBD), hydrophobic (HP), aliphatic hydrophobic (AHP), and ring aromatic (RA) chemical features. The placement of excluded volume spheres (XVs) was also taken into account. In addition, the minimum and maximum numbers of each selected chemical feature, the total number of chemical features, and chemical feature weights and tolerances were varied in order to maximize the hypothesis diversity and performance.

SVM Calculations. The predicted pEC50 values generated from the PhE in the training set were used as input for the SVM calculations, which were executed using the svm-train module implemented in the LIBSVM package (software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm), to generate regression models, and the developed SVM models were verified by those samples in the test set using the svm-predict module implemented in LIBSVM. Two regression modes, namely, ε-SVR and ν-SVR, which introduce parameters ε and ν to

control the number of support vectors, respectively, were tested. The frequently used radial basis function (RBF) was selected as the kernel type. Run time conditions, namely, cost C, the width of the RBF kernel γ, and ε and ν in the cases of ε-SVR and ν-SVR, respectively, were systematically scanned using an in-house Perl script.

Outlier Set. It is believed that an excellent predictive model should be accurate for those samples in the training and test sets. In addition, it is critically important to demonstrate the robustness of a predictive model with a group of samples that are distinctly different from those in the training set, namely, an outlier set.57 As a result, a small number of molecules were deliberately selected as the outliers to evaluate the robustness of the developed SVM models since they are significantly different from those in the training set and test set in the chemical space determined by the first three principal components (PCs) (see Results).

Predictive Evaluations. The correlation coefficient (r2) between the predicted and observed values was calculated. All generated SVM models were subjected to 10-fold cross-validation, which gave rise to q2 values. In addition, other statistical parameters, namely, root-mean-square error (RMSE), maximum residual (ΔMax), mean absolute error (MAE), and standard deviation (s), were also employed to gauge the performance of a predictive model. It has been suggested by Golbraikh et al.58 that a good predictive model should meet the following criteria, which also have been adopted by Development of Environmental Modules for Evaluation of Toxicity of pesticide Residues in Agriculture (DEMETRA).59

$$r^2 > 0.6$$

$$q^2 > 0.5$$

$$\frac{r^2 - r_o^2}{r^2} < 0.1 \quad \text{and} \quad 0.85 \le k \le 1.15$$

$$\left| r_o^2 - r_o'^2 \right| < 0.3$$

where $r_o^2$ and $r_o'^2$ are the correlation coefficients except that the former is obtained from the regression line of observed vs predicted values, whereas the latter is obtained from predicted vs observed values; and $k$ is the slope of the regression line through the origin. As a result, those criteria were also adopted to verify the predictivity of generated models. In addition, it has been proposed by Roy and Roy60 recently that an acceptable predictive model should also fulfill the following condition,


which was also adopted to test the derived SVM models:

$$r_m^2 > 0.5$$

where $r_m^2$ is a modified version of $r^2$, which is defined by

$$r_m^2 = r^2 \left( 1 - \sqrt{\left| r^2 - r_o^2 \right|} \right)$$

Figure 4. Generated pharmacophore models (A) Hypo A, (B) Hypo B, (C) Hypo C, and (D) Hypo D, in which aliphatic hydrophobic, excluded volume, hydrophobic, hydrogen-bond donor, hydrogen-bond acceptor, and ring aromatic chemical features are represented by cyan, gray, light blue, magenta, green, and orange blobs, respectively. The interfeature distances and angles among features, depicted in white, are measured in angstroms and degrees, respectively.
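The validation criteria of the Predictive Evaluations section can be checked programmatically. The following is a minimal sketch, not the authors' implementation: it assumes one common convention in which the through-origin slope is k = Σ(y·ŷ)/Σ(ŷ²) and the through-origin coefficient is 1 − SS_res/SS_tot of the dependent variable, and all function names and data are hypothetical. The cross-validated q2 is passed in rather than computed.

```python
import math
from typing import Dict, Optional, Sequence, Tuple


def pearson_r2(x: Sequence[float], y: Sequence[float]) -> float:
    """Squared Pearson correlation coefficient between x and y."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


def origin_fit(dep: Sequence[float], indep: Sequence[float]) -> Tuple[float, float]:
    """Slope and r_o^2 for regressing `dep` on `indep` through the origin."""
    k = sum(d * i for d, i in zip(dep, indep)) / sum(i * i for i in indep)
    mean_dep = sum(dep) / len(dep)
    ss_res = sum((d - k * i) ** 2 for d, i in zip(dep, indep))
    ss_tot = sum((d - mean_dep) ** 2 for d in dep)
    return k, 1.0 - ss_res / ss_tot


def validation_report(observed: Sequence[float], predicted: Sequence[float],
                      q2: Optional[float] = None) -> Dict[str, object]:
    """Evaluate the Golbraikh-type criteria and the r_m^2 condition."""
    r2 = pearson_r2(observed, predicted)
    k, ro2 = origin_fit(observed, predicted)          # observed vs predicted
    _, ro2_prime = origin_fit(predicted, observed)    # predicted vs observed
    rm2 = r2 * (1.0 - math.sqrt(abs(r2 - ro2)))
    checks = {
        "r2 > 0.6": r2 > 0.6,
        "(r2 - ro2)/r2 < 0.1": (r2 - ro2) / r2 < 0.1,
        "0.85 <= k <= 1.15": 0.85 <= k <= 1.15,
        "|ro2 - ro2'| < 0.3": abs(ro2 - ro2_prime) < 0.3,
        "rm2 > 0.5": rm2 > 0.5,
    }
    if q2 is not None:
        checks["q2 > 0.5"] = q2 > 0.5
    return {"r2": r2, "ro2": ro2, "ro2_prime": ro2_prime,
            "k": k, "rm2": rm2, "checks": checks}


# Hypothetical observed and predicted pEC50 values, for illustration only.
observed = [4.0, 4.5, 5.0, 5.5, 6.0, 6.5]
predicted = [4.1, 4.4, 5.2, 5.4, 6.1, 6.3]
report = validation_report(observed, predicted, q2=0.80)
for name, passed in report["checks"].items():
    print(f"{name}: {'pass' if passed else 'fail'}")
```

Definitions of the through-origin coefficients differ slightly between publications, so values computed this way may not match published tables to the last digit.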

RESULTS

PhE. Four pharmacophore hypotheses, designated Hypo A, Hypo B, Hypo C, and Hypo D, were selected from all generated pharmacophore hypotheses to constitute the PhE based on their prediction accuracy for individual molecules as well as the statistical evaluations in the training set and test set as listed in Table S1 in the Supporting Information and Tables 1 and 2. These four candidate models in the ensemble consist of a variety of combinations of chemical features, namely, one HBA, three HPs, and one RA in Hypo A; one HBA, three HPs, one RA, and one XV in Hypo B; one HBA, two HPs, one AHP, one RA, and one XV in Hypo C; and one HBA, one HBD, and three HPs in Hypo D. Tables 3–6 summarize the characteristics of these four hypotheses, including weights, tolerances, three-dimensional coordinates, and interfeature distances. These four pharmacophore hypotheses are spatially arranged differently in addition to a variety of combinations of chemical features as demonstrated by Figure 4. Nevertheless, one HBA and two HPs always can be found among these four models. The closest distance between HP and HBA features in Hypo A is 5.733 Å, whereas that decreases to 5.491 Å, 4.907 Å, and 4.081 Å in Hypo B, Hypo C, and Hypo D, respectively. The distinct differences among these four pharmacophore models in space as well as chemical features can be illustrated by the superposition of


Figure 5. Superposition of four pharmacophore models Hypo A, Hypo B, Hypo C, and Hypo D, denoted in red, blue, green, and black, respectively.

these four models as shown in Figure 5. When fitted to 105, which is the most potent ligand enlisted in this study (Table S1 in the Supporting Information), Hypo A, Hypo B, Hypo C, and Hypo D yielded modest residuals of 0.24, 0.16, 0.36, and 0.43, respectively, whereas 105 adopted different orientations to generate the best fit in these four models as shown in parts A–D of Figure 6, indicating multiple orientations of 105 when placed in the PXR binding pocket, which, in turn, can lead to different interactions between ligand and protein. The aromatic ring, for instance, participates in a π–π interaction in Hypo A, Hypo B, and Hypo C, whereas it interacts hydrophobically with hPXR in Hypo D. The variations become even more pronounced in the overlay of these four conformations as demonstrated in part E of Figure 6. As a result, a PhE is suitable to address the conformational plasticity of hPXR as well as multiple orientations of binding ligands. The maximum prediction error (ΔMax) of Hypo C in the training set arose from the prediction of 93 with a value of 1.19, whereas Hypo A, Hypo B, and Hypo D only yielded deviations of 0.47, 0.92, and 0.17, respectively (Table S1 in the Supporting Information). The predictions of 23 by Hypo A and Hypo B deviated most from the observed value with residuals of 0.85 and 1.07, respectively, whereas the corresponding residuals were only 0.68 and 0.67 for Hypo C and Hypo D, respectively. Similarly, the prediction deviations of 72 by Hypo A, Hypo B, and Hypo C were merely 0.29, 0.29, and 0.46, respectively, whereas Hypo D generated a significant residual of 1.02. Such prediction discrepancies among these four models in the PhE manifest


the fact that no single pharmacophore model executed better than the others for all molecules in the training set; nor did one perform worse than the others. Generally, the predictions by Hypo A, Hypo B, Hypo C, and Hypo D are in agreement with observed values in the training set as indicated by those small statistical evaluations, namely, RMSE, MAE, and s as well as the scatter plot of observed vs predicted pEC50 values as illustrated in Figure 7. However, all of them produced correlation coefficients (r2) of less than 0.80 except Hypo A (0.84), indicating their modest predictivity. When applied to those 120 samples in the test set, these four models in the PhE performed slightly better than they did in the training set as depicted by the statistical parameters RMSE, MAE, and s (Table 2). The RMSE values, for example, were 0.38, 0.46, 0.45, and 0.47 by Hypo A, Hypo B, Hypo C, and Hypo D in the training set (Table 1), respectively, and unanimously decreased to 0.30, 0.32, 0.33, and 0.33 in the test set despite the fact that there were more samples in the test set. These seemingly unusual observations can be explained by the fact that most of the activation values of those molecules in the test set are within logarithm scale between 4 and 5 as shown in Figure 8, as compared with their counterparts in the training set. Conversely, their ΔMax values indicate their performance deteriorations from the training set to the test set because of their increased values (Tables 1 and 2). Prediction discrepancies among these four pharmacophore hypotheses found in the training set also can be found in the test set. For example, the prediction of 96 by Hypo B yielded the maximal deviation of 1.31, whereas Hypo A, Hypo C, and Hypo D only generated errors of 0.52, 0.82, and 0.10, respectively. 
The maximum residual of Hypo A in the test set resulted from the prediction of 53 with a value of 1.04, for which the errors were 0.71 and 0.93 by Hypo B and Hypo C, respectively, whereas Hypo D made a perfect prediction without any deviation. Thus, discrepancies in the predictions by these four models also can be found in the test set as exhibited in Figure 8. The predictions by Hypo A, Hypo B, Hypo C, and Hypo D, in general, are also in agreement with observed values for molecules in the test set as shown in Table S1 in the Supporting Information. Therefore, it can be affirmed that Hypo A, Hypo B, Hypo C, and Hypo D are qualified to constitute the PhE based on their performances in the training set and test set as well as their statistical evaluations as mentioned above. Nevertheless, their nonnegligible decreases in r2 values from the training set to the test set indicate their various levels of overtraining.

PhE/SVM. Those four pharmacophore models enrolled in the ensemble were assembled and subjected to regression by SVM to generate the final PhE/SVM model. Table 7 summarizes the run time parameters for generating the optimal SVM, which was chosen on the basis of the prediction results of those samples in the training set and cross-validation as given in Table S1 in the Supporting Information and Table 1. It can be observed from Figure 7 that the PhE/SVM model predicted those molecules in the training set better than all of the individual hypotheses in the PhE. It also can be found from Table S1 in the Supporting Information that the PhE/SVM model yielded smaller residuals than the maximal errors produced by those hypotheses in the PhE for most of the molecules in the training set. In some cases, the PhE/SVM model even produced the smallest residuals. The prediction of 9 by PhE/SVM, for example, deviated from the


Figure 6. Pharmacophore models (A) Hypo A, (B) Hypo B, (C) Hypo C, and (D) Hypo D fitted to 105 and (E) overlay of these four models, which are color-coded by red, blue, green, and black, respectively. The chemical features are described in Figure 4.


Figure 7. Observed pEC50 vs the pEC50 predicted by Hypo A, Hypo B, Hypo C, Hypo D, and PhE/SVM model for those molecules in the training set. The solid line, dashed lines, and dotted lines correspond to the SVM regression of the data, 95% confidence interval for the SVM regression, and 95% confidence interval for the prediction, respectively.

Figure 8. Observed pEC50 vs the pEC50 predicted by Hypo A, Hypo B, Hypo C, Hypo D, and PhE/SVM model for those molecules in the test set. The solid line, dashed lines, and dotted lines correspond to the SVM regression of the data, 95% confidence interval for the SVM regression, and 95% confidence interval for the prediction, respectively.

observed value by 0.11, whereas Hypo A, Hypo B, Hypo C, and Hypo D generated errors of 0.20, 0.41, 0.28, and 0.16, respectively. Furthermore, the PhE/SVM model gave rise to the largest r2 and the smallest RMSE, ΔMax, MAE, and s in the training set (Table 1), signifying its better performance as compared with all of the pharmacophore models in the PhE as shown by Figure 7. When subjected to the 10-fold cross-validation, PhE/SVM yielded the correlation coefficient q2 of 0.80 as compared with its r2 of 0.86 in the training set. This small difference between the two parameters suggests a statistically significant relationship between the estimated values and the input data and, more importantly, makes it highly likely that this PhE/SVM model is an authentic model. Like all models in the PhE, PhE/SVM also executed slightly better when applied to the test set as indicated by the parameters RMSE, MAE, and s (Table 2). Even ΔMax only slightly increased from 0.78 in the training set to 0.95 in the test set, and the parameter r2 merely decreased from 0.86 in the training set to 0.80 in the test set. Similar to those observations found in the training set, PhE/SVM also yielded the largest r2 and the smallest RMSE, ΔMax, MAE, and s in the test set. Thus, it can be asserted that this PhE/SVM model performed better than any of the pharmacophore models in the PhE in the test set as demonstrated by


Figure 8. More importantly, these slight differences between both r2 values and between r2 and q2 values assert that this PhE/SVM model is a well-trained theoretical model, since overtraining would otherwise produce at least one substantial difference; this is further manifested by its small RMSE values in the training set (0.37) and test set (0.25) as well as the small difference between them.

Validation by Outliers. Eight molecules were deliberately selected as the outlier set since they are very dissimilar to those in the training set so that the extrapolation power of generated models can be challenged as demonstrated by Figure 9, which displays the projection of all molecules adopted in this study in the chemical space spanned by the first three principal components, which explain 95.4% of the variance in the original data. It can be found that all molecules in the outlier set are entirely situated outside the perimeter of the training set, reflecting their high level of dissimilarity and providing a good measure of robustness of a predictive model.61 Table S1 in the Supporting Information and Table 8 list the prediction results in the outlier set and their associated statistical evaluations, and Figure 10 displays the corresponding scatter plot of observed vs predicted pEC50 values. It can be observed from Table S1 in the Supporting Information and Table 8 that the predictions by PhE/SVM are in excellent agreement with observed values for all molecules in the outlier set as indicated by its low ΔMax (0.38) and high r2 value (0.91). In fact, all statistical parameters indicate that PhE/SVM performed better in the outlier set than in the training set and test set.

Predictive Evaluations. It can be observed from Figure 11, which displays the scatter plots of the residual vs the pEC50 values predicted by PhE/SVM for all of the molecules in the training set, test set, and outlier set, that the residuals are approximately evenly distributed on both sides of zero along the range of predicted values, suggesting that there is no systematic error associated with this PhE/SVM model. The generated PhE/SVM was further subjected to those validation criteria proposed by Golbraikh et al.58 as well as Roy and Roy,60 and the evaluation results are listed in Table 9. It can be observed from Table 9 that PhE/SVM overall yielded large statistical values and, more importantly, fulfilled all statistical validation requirements, asserting its high level of predictivity. In addition, only small variations among these three data sets can be observed. The rm2 values, for instance, were 0.77, 0.80, and 0.75 evaluated by those samples in the training set, test set, and outlier set, respectively. The insignificant discrepancies among these three data sets, regardless of their sample numbers and chemical structures, suggest that this PhE/SVM model is very insensitive to the outliers, namely, very robust per se, which is an unusual and yet important characteristic for a predictive model.

Table 7. Optimal Run Time Parameters for the SVM Model

parameter      value
SVM type       ε-SVR
kernel type    radial basis function
γ              0.0390625
cost           3.2
ε              0.508804

Table 8. Statistical Parameters: Correlation Coefficient (r2), RMSE, Maximum Residual (ΔMax), Average Residual (MAE), and Standard Deviation of Residual (s) Evaluated by PhE/SVM in the Outlier Set

parameters    PhE/SVM
r2            0.91
RMSE          0.15
ΔMax          0.38
MAE           0.10
s             0.12
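For orientation, the run time parameters in Table 7 map onto LIBSVM's `svm-train` command-line options (`-s 3` for ε-SVR, `-t 2` for the RBF kernel, `-c` cost, `-g` γ, `-p` ε, `-v` n-fold cross-validation). The sketch below only builds the argument lists and does not run LIBSVM; it is written in Python rather than the in-house Perl script mentioned in the Methods, and the file name and the scan grid are hypothetical.

```python
import itertools
from typing import List, Optional


def svm_train_args(cost: float, gamma: float, epsilon: float,
                   train_file: str, folds: Optional[int] = None) -> List[str]:
    """Build an svm-train argument list for epsilon-SVR with an RBF kernel."""
    args = ["svm-train",
            "-s", "3",            # svm_type 3 = epsilon-SVR
            "-t", "2",            # kernel_type 2 = radial basis function
            "-c", str(cost),
            "-g", str(gamma),
            "-p", str(epsilon)]
    if folds is not None:
        args += ["-v", str(folds)]   # n-fold cross-validation mode
    args.append(train_file)
    return args


# Parameters reported in Table 7; "phe_train.txt" is a hypothetical file name.
table7_args = svm_train_args(cost=3.2, gamma=0.0390625, epsilon=0.508804,
                             train_file="phe_train.txt")
print(" ".join(table7_args))

# Hypothetical coarse grid for a systematic scan with 10-fold cross-validation.
costs = [2.0 ** k for k in range(-1, 4)]      # 0.5 ... 8
gammas = [2.0 ** k for k in range(-6, -2)]    # 2^-6 ... 2^-3
epsilons = [0.1, 0.3, 0.5]
grid = [svm_train_args(c, g, e, "phe_train.txt", folds=10)
        for c, g, e in itertools.product(costs, gammas, epsilons)]
print(len(grid), "candidate runs")
```

In cross-validation mode for regression, `svm-train` reports a mean squared error and a squared correlation coefficient, which a scan of this kind could compare across parameter combinations.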

Figure 9. Molecular distribution for those samples in the training set (filled circles), the test set (open triangles), and the outlier set (gray squares) in the chemical space spanned by three principal components.


Figure 10. Observed pEC50 vs the pEC50 predicted by Hypo A, Hypo B, Hypo C, Hypo D, and PhE/SVM model for those molecules in the outlier set. The solid line, dashed lines and dotted lines correspond to the SVM regression of the data, 95% confidence interval for the SVM regression, and 95% confidence interval for the prediction, respectively.

Mock Test. To mimic a real-world challenge, the developed PhE/SVM model was further tested with the 10 compounds assayed by Moore et al.9,18 Of those 10 molecules, 6 were also adopted in this study, providing a good way to calibrate the testing system, and 4 were novel, furnishing a true test. Thus, this group of samples served as a sound mock test. Nevertheless, these molecules were measured in CV-1 cells, whereas all of the compounds enrolled in this study were tested in HepG2 cells. To reconcile the two assay systems, the following correlation was constructed using the 6 common compounds:

pEC50(CV-1) = 0.6433 × pEC50(HepG2) + 2.0677
n = 6, r2 = 0.80, RMSE = 0.42

The obtained scatter plot is shown in Figure 12. It can be observed that the experimental values in both systems were modestly correlated with each other with an r2 of 0.80, suggesting that there is no significant cell difference in the activation of hPXR. Thus, it is plausible to examine the PhE/SVM model with those molecules assayed in CV-1 cells. The other four molecules were RU486, PCN, androstanol, and dexamethasone, of which RU486 gave an experimental EC50 value of 5.5 μM and the latter three gave EC50 values of more than 10 μM. For these four molecules, the PhE/SVM yielded EC50 values of 3.5 μM, 8.7 μM, 7.5 μM, and 7.7 μM, respectively, indicating that the predictions by PhE/SVM are in agreement with experimental observations. Thus, this mock test supported the predictivity of PhE/SVM.
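The unit handling behind this comparison can be sketched as follows. The sketch assumes the conventional definition pEC50 = −log10(EC50 in mol/L), which is not spelled out in this section, and uses the slope and intercept of the CV-1/HepG2 correlation given above. The function names are hypothetical, and the sketch does not claim to reproduce how the authors applied the correlation to their predictions.

```python
import math

# Coefficients of the correlation reported in the text (n = 6).
SLOPE = 0.6433
INTERCEPT = 2.0677


def pec50_from_ec50_um(ec50_um):
    """Convert an EC50 in micromolar to pEC50 (assumed -log10 of molar EC50)."""
    return -math.log10(ec50_um * 1e-6)


def ec50_um_from_pec50(pec50):
    """Convert a pEC50 back to an EC50 in micromolar."""
    return 10 ** (-pec50) * 1e6


def hepg2_to_cv1(pec50_hepg2):
    """Map a HepG2 pEC50 onto the CV-1 scale with the reported correlation."""
    return SLOPE * pec50_hepg2 + INTERCEPT


def cv1_to_hepg2(pec50_cv1):
    """Inverse of the reported correlation."""
    return (pec50_cv1 - INTERCEPT) / SLOPE


# EC50 values (in micromolar) quoted in the text for the four novel compounds.
predicted_ec50_um = {"RU486": 3.5, "PCN": 8.7, "androstanol": 7.5, "dexamethasone": 7.7}
for name, ec50 in predicted_ec50_um.items():
    print(f"{name}: EC50 = {ec50} uM, pEC50 = {pec50_from_ec50_um(ec50):.2f}")
```

On this scale an EC50 of 10 μM corresponds to a pEC50 of 5, so an experimental EC50 of more than 10 μM means a pEC50 below 5.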

■ DISCUSSION

These four pharmacophore hypotheses in the ensemble comprise different numbers and types of chemical features, suggesting that not all ligands bind to hPXR using the same interactions, which is consistent with the fact that different published pharmacophore models adopted different combinations of chemical features (vide supra). Collectively, the selections of all chemical features employed by these four models are in agreement with all published pharmacophore hypotheses. HBA, which plays a critical role in hPXR activation,62 is a common feature among these models in PhE. For instance, Figure 13, which was generated by superimposing the conformation of 150 mapped by Hypo D with SR12813 in chain A of the cocomplex structure (PDB code: 1NRL), shows that the distance between atom O of 150 and that of Ser247 is 2.520 Å, which is short enough to constitute a hydrogen-bond interaction. In fact, this chemical interaction is completely consistent with the cocomplex structure (PDB code: 1NRL), in which the distance between atom O of SR12813 and that of Ser247 in chain A is 2.87 Å as illustrated in Figure 14, which was generated using LigPlot (available at http://www.ebi.ac.uk/thornton-srv/software/LIGPLOT/). The (aliphatic or aromatic) hydrophobic chemical features, which are of importance in ligand binding,62 also can be found among these four models. The selections of RA by Hypo A, Hypo B, and Hypo C coincide with the qualitative model developed by Ekins et al.,26 which was derived from common features. Of all published pharmacophore hypotheses, only the qualitative model developed by Schuster and Langer,25 in which the structure-based model was derived from chain B of the 1NRL cocomplex structure, and the quantitative model reported by Ekins et al.20 adopted the XV chemical feature, which justifies the inclusion of XV in both Hypo B and Hypo C. In fact, it can be observed from Figure 15, in which the conformation of 31 fitted to Hypo B was overlaid on SR12813 in chain B of the cocomplex structure (PDB code: 1NRL), that the chemical feature XV was perfectly positioned on Trp299, suggesting that it is necessary to describe the ligand–protein interaction using the feature XV in some cases.
The selection of HBD, conversely, seems unusual since only Hypo D adopted this chemical feature among the four models in the PhE. Nevertheless, of all published pharmacophore hypotheses, the model developed by Ekins et al.26 is the only one that selected this chemical feature, and the QSAR model derived by Ung et al.21 also employed the number of HBD as a descriptor, suggesting


Figure 11. Residual vs the pEC50 predicted by PhE/SVM in the training set (filled circles), the test set (open triangles), and the outlier set (gray squares).

Table 9. Validation Verification Based on Prediction Performance of Those Molecules in the Training Set, Test Set, and Outlier Set

                                            training set    test set    outlier set
n                                           32              120         8
ro2                                         0.85            0.80        0.88
k                                           1.00            1.00        0.99
rm2                                         0.77            0.80        0.75
r2 > 0.6                                    ✓               ✓           ✓
q2 > 0.5                                    ✓               N/A(a)      N/A
(r2 − ro2)/r2 < 0.1 and 0.85 ≤ k ≤ 1.15     ✓               ✓           ✓
|ro2 − r′o2| < 0.3                          ✓               ✓           ✓
rm2 > 0.5                                   ✓               ✓           ✓

(a) Not applicable.
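The rm2 entries of Table 9 can be cross-checked against the other reported statistics with a short calculation. The sketch below uses the rm2 definition of Roy and Roy, rm2 = r2 × (1 − √|r2 − ro2|), together with the r2 and q2 values quoted in the abstract and Table 8 and the ro2 and k values of Table 9. The function and variable names are hypothetical, and the |ro2 − r′o2| criterion is omitted because r′o2 is not reported in this section.

```python
import math


def rm2(r2, r02):
    """Modified r2 of Roy and Roy: r2 * (1 - sqrt(|r2 - r02|))."""
    return r2 * (1.0 - math.sqrt(abs(r2 - r02)))


def passes_checks(r2, r02, k, q2=None):
    """Evaluate the Table 9 criteria that can be computed from reported values."""
    checks = [
        r2 > 0.6,
        (r2 - r02) / r2 < 0.1,
        0.85 <= k <= 1.15,
        rm2(r2, r02) > 0.5,
    ]
    if q2 is not None:  # q2 applies to the training set only
        checks.append(q2 > 0.5)
    return all(checks)


# r2 and q2 from the abstract and Table 8; r02 and k from Table 9.
DATA_SETS = {
    "training set": {"r2": 0.86, "r02": 0.85, "k": 1.00, "q2": 0.80},
    "test set": {"r2": 0.80, "r02": 0.80, "k": 1.00, "q2": None},
    "outlier set": {"r2": 0.91, "r02": 0.88, "k": 0.99, "q2": None},
}

for name, values in DATA_SETS.items():
    value = round(rm2(values["r2"], values["r02"]), 2)
    print(f"{name}: rm2 = {value}, criteria met = {passes_checks(**values)}")
# rm2 rounds to 0.77, 0.8, and 0.75, matching the Table 9 entries
```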

that this chemical interaction can take place between hPXR and some ligands. The most compelling evidence of HBD interaction is the hyperforin–hPXR cocomplex structure (PDB code: 1M13),63 in which the distance between atom H of hyperforin and atom O of Gln285 is 2.7 Å, resulting in the formation of a hydrogen-bond interaction as depicted by Figure 3D of Watkins et al.63 The same observation was also made by Xiao et al.36 Thus, these discrepancies among published pharmacophore hypotheses can be plausibly attributed to the plastic nature of hPXR as well as possible multiple ligand orientations in the binding pocket, resulting in different interactions between ligand and protein. Consequently, it can be expected that no single theoretical model can suffice to explain such a promiscuous system without yielding substantial prediction errors. Nevertheless, such a perplexing system can be properly modeled using the PhE/SVM scheme because it can take into account both critical factors, which would otherwise fail to be addressed by any other analogue-based molecular modeling techniques. Furthermore, rifapentine, whose molecular volume was the largest among the 160 molecules enlisted in this study, was rigidly

Figure 12. Correlation between experimental pEC50 values assayed in CV-1 and HepG2 cells based on 6 compounds, whose names are shown here.

aligned against bound rifampicin in the cocomplex structure (PDB code: 1SKX), whose binding pocket was the largest among published crystal structures,29,31–35,63 as displayed in Figure 16, from which it can be observed that the molecular volume of aligned rifapentine collides with Ser247 and Met243, showing that the binding pocket of hPXR–rifampicin is not spacious enough to accommodate rifapentine even though it is the largest pocket. Therefore, it is necessary for the hPXR protein to change its conformation, which could plausibly occur by reorganization of the side chains of some residues64 to expand its active site so that the bulkier rifapentine can be accommodated, resulting in a new protein conformation with a larger binding pocket. This situation certainly can be resolved by time-consuming crystallization or by MD calculations, which may provide an alternative despite their higher computational costs. However, that will inevitably increase the protein number in the case of ensemble docking, making


Figure 13. The alignment of 150 with SR12813 in the chain A of the hPXR–ligand cocomplex structure (PDB code: 1NRL). The protein and SR12813 are represented in green, and S247 is represented in light gray. The chemical features are depicted in Figure 4.

Figure 15. The alignment of 31 with SR12813 in the chain B of the hPXR–ligand cocomplex structure (PDB code: 1NRL). The protein and SR12813 are represented in green, and W299 is depicted in light gray. The chemical features are depicted in Figure 4.

Figure 16. The alignment of rifapentine with rifampicin in the hPXR–activator cocomplex structure (PDB code: 1SKX). Ser247 and Met243 are represented by purple, and rifampicin is depicted in green. The green meshed lobe shows the molecular volume of rifapentine.

Figure 14. Binding interactions between SR12813 and hPXR based on the crystal structure of the chain A of the cocomplex structure (PDB code: 1NRL). Figure generated by LigPlot.

it more difficult to be modeled by structure-based techniques; or more substantial predictive errors will be yielded in the case of analogue-based modeling. PhE/SVM, on the other hand, can easily and accurately model such a system without spending a lot of computational time.

■ CONCLUSIONS

An in silico model was developed to predict the activation of human pregnane X receptor using the PhE/SVM scheme, which combines a pharmacophore ensemble, to address protein plasticity and multiple ligand orientations, with a support vector machine, to yield an accurate and robust regression model. The derived PhE/SVM model performed extremely well for the 32 and 120 molecules in the training set and test set, respectively. Even the predictions for the 8 molecules in the outlier set, which were structurally distinct from those in the training set, confirm its excellent performance. Various statistical evaluations and validation criteria also support its accuracy and predictivity, which was further supported by a mock test. In addition, this predictive model can justify the variations in reported pharmacophore hypotheses and revealed a possible new protein conformation that has not been reported in any published cocomplex crystal structure. Therefore, it can be claimed, based on the facts mentioned above, that this PhE/SVM model can be employed as a powerful predictive tool to facilitate drug discovery and development by reducing the attrition rates due to adverse drug–drug interactions arising from the activation of hPXR.


■ ASSOCIATED CONTENT

Supporting Information. Table S1 listing all molecules compiled for this investigation, their SMILES strings, observed and predicted pEC50 values, the corresponding residuals, data sets, and literature references. This material is available free of charge via the Internet at http://pubs.acs.org.

■ AUTHOR INFORMATION

Corresponding Author
*E-mail: [email protected].

Funding Sources
This work was supported by the National Science Council, Taiwan.

■ ACKNOWLEDGMENT

Parts of the calculations were performed at the National Center for High-Performance Computing, Taiwan. The authors are grateful to Dr. G. H. Hakimelahi for reading the manuscript.

■ ABBREVIATIONS

hPXR, human pregnane X receptor; PhE/SVM, pharmacophore ensemble/support vector machine; HBA, hydrogen-bond acceptor; HBD, hydrogen-bond donor; HP, hydrophobic; AHP, aliphatic hydrophobic; RA, ring aromatic; XV, excluded volume sphere.

■ REFERENCES

(1) Kliewer, S. A., and Willson, T. M. (2002) Regulation of xenobiotic and bile acid metabolism by the nuclear pregnane X receptor. J. Lipid Res. 43, 359–364.
(2) Kliewer, S. A., Moore, J. T., Wade, L., Staudinger, J. L., Watson, M. A., Jones, S. A., McKee, D. D., Oliver, B. B., Willson, T. M., Zetterström, R. H., Perlmann, T., and Lehmann, J. M. (1998) An Orphan Nuclear Receptor Activated by Pregnanes Defines a Novel Steroid Signaling Pathway. Cell 92, 73–82.
(3) Bertilsson, G., Heidrich, J., Svensson, K., Åsman, M., Jendeberg, L., Sydow-Bäckman, M., Ohlsson, R., Postlind, H., Blomquist, P., and Berkenstam, A. (1998) Identification of a human nuclear receptor defines a new signaling pathway for CYP3A induction. Proc. Natl. Acad. Sci. U.S.A. 95, 12208–12213.
(4) Lamba, V., Yasuda, K., Lamba, J. K., Assem, M., Davila, J., Strom, S., and Schuetz, E. G. (2004) PXR (NR1I2): splice variants in human tissues, including brain, and identification of neurosteroids and nicotine as PXR activators. Toxicol. Appl. Pharmacol. 199, 251–265.
(5) Timsit, Y. E., and Negishi, M. (2007) CAR and PXR: The xenobiotic-sensing receptors. Steroids 72, 231–246.
(6) Xie, W., and Evans, R. M. (2001) Orphan Nuclear Receptors: The Exotics of Xenobiotics. J. Biol. Chem. 276, 37739–37742.
(7) Kliewer, S. A., Goodwin, B., and Willson, T. M. (2002) The nuclear pregnane X receptor: A key regulator of xenobiotic metabolism. Endocr. Rev. 23, 687–702.
(8) Ekins, S. (2004) Predicting undesirable drug interactions with promiscuous proteins in silico. Drug Discovery Today 9, 276–285.
(9) Moore, L. B., Parks, D. J., Jones, S. A., Bledsoe, R. K., Consler, T. G., Stimmel, J. B., Goodwin, B., Liddle, C., Blanchard, S. G., Willson, T. M., Collins, J. L., and Kliewer, S. A. (2000) Orphan Nuclear Receptors Constitutive Androstane Receptor and Pregnane X Receptor Share Xenobiotic and Steroid Ligands. J. Biol. Chem. 275, 15122–15127.
(10) Tolson, A. H., and Wang, H. (2010) Regulation of drug-metabolizing enzymes by xenobiotic receptors: PXR and CAR. Adv. Drug Delivery Rev. 62, 1238–1249.

(11) Wang, H., and LeCluyse, E. L. (2003) Role of Orphan Nuclear Receptors in the Regulation of Drug-Metabolising Enzymes. Clin. Pharmacokinet. 42, 1331–1357.
(12) Chen, Y., Tang, Y., Wang, M.-T., Zeng, S., and Nie, D. (2007) Human Pregnane X Receptor and Resistance to Chemotherapy in Prostate Cancer. Cancer Res. 67, 10361–10367.
(13) Masuyama, H., Nakatsukasa, H., Takamoto, N., and Hiramatsu, Y. (2007) Down-Regulation of Pregnane X Receptor Contributes to Cell Growth Inhibition and Apoptosis by Anticancer Agents in Endometrial Cancer Cells. Mol. Pharmacol. 72, 1045–1053.
(14) Zhou, J., Liu, M., Zhai, Y., and Xie, W. (2008) The Antiapoptotic Role of Pregnane X Receptor in Human Colon Cancer Cells. Mol. Endocrinol. 22, 868–880.
(15) Gupta, D., Venkatesh, M., Wang, H., Kim, S., Sinz, M., Goldberg, G. L., Whitney, K., Longley, C., and Mani, S. (2008) Expanding the Roles for Pregnane X Receptor in Cancer: Proliferation and Drug Resistance in Ovarian Cancer. Clin. Cancer Res. 14, 5332–5340.
(16) Martínez, A., Marquez, A., Mendoza, J., Taxonera, C., Fernandez-Arquero, M., Díaz-Rubio, M., de la Concha, E. G., and Urcelay, E. (2007) Role of the PXR gene locus in inflammatory bowel diseases. Inflammatory Bowel Dis. 13, 1484–1487.
(17) di Masi, A., Marinis, E. D., Ascenzi, P., and Marino, M. (2009) Nuclear receptors CAR and PXR: Molecular, functional, and biomedical aspects. Mol. Aspects Med. 30, 297–343.
(18) Moore, L. B., Goodwin, B., Jones, S. A., Wisely, G. B., Serabjit-Singh, C. J., Willson, T. M., Collins, J. L., and Kliewer, S. A. (2000) St. John’s wort induces hepatic drug metabolism through activation of the pregnane X receptor. Proc. Natl. Acad. Sci. U.S.A. 97, 7500–7502.
(19) Khandelwal, A., Krasowski, M. D., Reschly, E. J., Sinz, M. W., Swaan, P. W., and Ekins, S. (2008) Machine Learning Methods and Docking for Predicting Human Pregnane X Receptor Activation. Chem. Res. Toxicol. 21, 1457–1467.
(20) Ekins, S., Reschly, E., Hagey, L., and Krasowski, M. (2008) Evolution of pharmacologic specificity in the pregnane X receptor. BMC Evol. Biol. 8, 103.
(21) Ung, C. Y., Li, H., Yap, C. W., and Chen, Y. Z. (2007) In Silico Prediction of Pregnane X Receptor Activators by Machine Learning Approaches. Mol. Pharmacol. 71, 158–168.
(22) Jacobs, M. N. (2004) In silico tools to aid risk assessment of endocrine disrupting chemicals. Toxicology 205, 43–53.
(23) Kobayashi, K., Yamagami, S., Higuchi, T., Hosokawa, M., and Chiba, K. (2004) Key Structural Features of Ligands for Activation of Human Pregnane X Receptor. Drug Metab. Dispos. 32, 468–472.
(24) Ekins, S., and Erickson, J. A. (2002) A Pharmacophore for Human Pregnane X Receptor Ligands. Drug Metab. Dispos. 30, 96–99.
(25) Schuster, D., and Langer, T. (2005) The Identification of Ligand Features Essential for PXR Activation by Pharmacophore Modeling. J. Chem. Inf. Model. 45, 431–439.
(26) Ekins, S., Chang, C., Mani, S., Krasowski, M. D., Reschly, E. J., Iyer, M., Kholodovych, V., Ai, N., Welsh, W. J., Sinz, M., Swaan, P. W., Patel, R., and Bachmann, K. (2007) Human Pregnane X Receptor Antagonists and Agonists Define Molecular Requirements for Different Binding Sites. Mol. Pharmacol. 72, 592–603.
(27) Yasuda, K., Ranade, A., Venkataramanan, R., Strom, S., Chupka, J., Ekins, S., Schuetz, E., and Bachmann, K. (2008) A Comprehensive in vitro and in silico Analysis of Antibiotics that Activate PXR and Induce CYP3A4 in Liver and Intestine. Drug Metab. Dispos. 36, 1689–1697.
(28) Ai, N., Krasowski, M. D., Welsh, W. J., and Ekins, S. (2009) Understanding nuclear receptors using computational methods. Drug Discovery Today 14, 486–494.
(29) Wang, W., Prosise, W. W., Chen, J., Taremi, S. S., Le, H. V., Madison, V., Cui, X., Thomas, A., Cheng, K. C., and Lesburg, C. A. (2008) Construction and characterization of a fully active PXR/SRC-1 tethered protein with increased stability. Protein Eng., Des. Sel. 21, 425–433.
(30) Watkins, R. E., Maglich, J. M., Moore, L. B., Wisely, G. B., Noble, S. M., Davis-Searles, P. R., Lambert, M. H., Kliewer, S. A., and Redinbo, M. R. (2003) 2.1 Å crystal structure of human PXR in complex with the St. John’s wort compound hyperforin. Biochemistry 42, 1430–1438.

(31) Watkins, R. E., Davis-Searles, P. R., Lambert, M. H., and Redinbo, M. R. (2003) Coactivator binding promotes the specific interaction between ligand and the pregnane X receptor. J. Mol. Biol. 331, 815–828.
(32) Chrencik, J. E., Orans, J., Moore, L. B., Xue, Y., Peng, L., Collins, J. L., Wisely, G. B., Lambert, M. H., Kliewer, S. A., and Redinbo, M. R. (2005) Structural disorder in the complex of human PXR and the macrolide antibiotic rifampicin. Mol. Endocrinol. 19, 1125–1134.
(33) Xue, Y., Chao, E., Zuercher, W. J., Willson, T. M., Collins, J. L., and Redinbo, M. R. (2007) Crystal structure of the PXR-T1317 complex provides a scaffold to examine the potential for receptor antagonism. Bioorg. Med. Chem. 15, 2156–2166.
(34) Teotico, D. G., Bischof, J. J., Peng, L., Kliewer, S. A., and Redinbo, M. R. (2008) Structural Basis of Human Pregnane X Receptor Activation by the Hops Constituent Colupulone. Mol. Pharmacol. 74, 1512–1520.
(35) Watkins, R. E., Wisely, G. B., Moore, L. B., Collins, J. L., Lambert, M. H., Williams, S. P., Willson, T. M., Kliewer, S. A., and Redinbo, M. R. (2001) The human nuclear xenobiotic receptor PXR: structural determinants of directed promiscuity. Science 292, 2329–2333.
(36) Xiao, L., Nickbarg, E., Wang, W., Thomas, A., Ziebell, M., Prosise, W. W., Lesburg, C. A., Taremi, S. S., Gerlach, V. L., Le, H. V., and Cheng, K. C. (2011) Evaluation of in vitro PXR-based assays and in silico modeling approaches for understanding the binding of a structurally diverse set of drugs to PXR. Biochem. Pharmacol. 81, 669–679.
(37) Pan, Y., Li, L., Kim, G., Ekins, S., Wang, H., and Swaan, P. W. (2011) Identification and Validation of Novel hPXR Activators Amongst Prescribed Drugs via Ligand-Based Virtual Screening. Drug Metab. Dispos. 39, 337–344.
(38) Liu, Y.-H., Mo, S.-L., Bi, H.-C., Hu, B.-F., Li, C. G., Wang, Y.-T., Huang, L., Huang, M., Duan, W., Liu, J.-P., Wei, M. Q., and Zhou, S.-F. (2011) Regulation of human pregnane X receptor and its target gene cytochrome P450 3A4 by Chinese herbal compounds and a molecular docking study. Xenobiotica 41, 259–280.
(39) Totrov, M., and Abagyan, R. (2008) Flexible ligand docking to multiple receptor conformations: a practical alternative. Curr. Opin. Struct. Biol. 18, 178–184.
(40) Cornell, W., and Nam, K. (2009) Steroid Hormone Binding Receptors: Application of Homology Modeling, Induced Fit Docking, and Molecular Dynamics to Study Structure-Function Relationships. Curr. Top. Med. Chem. 9, 844–853.
(41) Leong, M. K. (2007) A Novel Approach Using Pharmacophore Ensemble/Support Vector Machine (PhE/SVM) for Prediction of hERG Liability. Chem. Res. Toxicol. 20, 217–226.
(42) Leong, M. K., and Chen, T.-H. (2008) Prediction of cytochrome P450 2B6-substrate interactions using pharmacophore ensemble/support vector machine (PhE/SVM) approach. Med. Chem. 4, 396–406.
(43) Leong, M. K., Chen, Y.-M., Chen, H.-B., and Chen, P.-H. (2009) Development of a New Predictive Model for Interactions with Human Cytochrome P450 2A6 Using Pharmacophore Ensemble/Support Vector Machine (PhE/SVM) Approach. Pharm. Res. 26, 987–1000.
(44) McGinnity, D. F., Zhang, G., Kenny, J. R., Hamilton, G. A., Otmani, S., Stams, K. R., Haney, S., Brassil, P., Stresser, D. M., and Riley, R. J. (2009) Evaluation of Multiple in Vitro Systems for Assessment of CYP3A4 Induction in Drug Discovery: Human Hepatocytes, Pregnane X Receptor Reporter Gene, and Fa2N-4 and HepaRG Cells. Drug Metab. Dispos. 37, 1259–1268.
(45) Persson, K. P., Ekehed, S., Otter, C., Lutz, E. S. M., McPheat, J., Masimirembwa, C. M., and Andersson, T. B. (2006) Evaluation of human liver slices and reporter gene assays as systems for predicting the cytochrome P450 induction potential of drugs in vivo in humans. Pharm. Res. 23, 56–69.
(46) Sinz, M., Kim, S., Zhu, Z., Chen, T., Anthony, M., Dickinson, K., and Rodrigues, A. D. (2006) Evaluation of 170 xenobiotics as transactivators of human pregnane X receptor (hPXR) and correlation to known CYP3A4 drug interactions. Curr. Drug Metab. 7, 375–388.

(47) Lemaire, G., de Sousa, G., and Rahmani, R. (2004) A PXR reporter gene assay in a stable cell culture system: CYP3A4 and CYP2B6 induction by pesticides. Biochem. Pharmacol. 68, 2347–2358.
(48) Hurst, C. H., and Waxman, D. J. (2004) Environmental phthalate monoesters activate pregnane X receptor-mediated transcription. Toxicol. Appl. Pharmacol. 199, 266–274.
(49) Cui, X., Thomas, A., Gerlach, V., White, R. E., Morrison, R. A., and Cheng, K. C. (2008) Application and interpretation of hPXR screening data: Validation of reporter signal requirements for prediction of clinically relevant CYP3A4 inducers. Biochem. Pharmacol. 76, 680–689.
(50) Mitro, N., Vargas, L., Romeo, R., Koder, A., and Saez, E. (2007) T0901317 is a potent PXR ligand: Implications for the biology ascribed to LXR. FEBS Lett. 581, 1721–1726.
(51) Chang, G., Guida, W. C., and Still, W. C. (1989) An internal-coordinate Monte Carlo method for searching conformational space. J. Am. Chem. Soc. 111, 4379–4386.
(52) Kolossvary, I., and Guida, W. C. (1996) Low mode search. An efficient, automated computational method for conformational analysis: Application to cyclic and acyclic alkanes and cyclic peptides. J. Am. Chem. Soc. 118, 5011–5019.
(53) Still, W. C., Tempczyk, A., Hawley, R. C., and Hendrickson, T. (1990) Semianalytical treatment of solvation for molecular mechanics and dynamics. J. Am. Chem. Soc. 112, 6127–6129.
(54) Halgren, T. A. (1996) Merck molecular force field. I. Basis, form, scope, parameterization, and performance of MMFF94. J. Comput. Chem. 17, 490–519.
(55) Sprague, P. W. (1995) Automated chemical hypothesis generation and database searching with Catalyst. Perspect. Drug Discovery Des. 3, 1–20.
(56) Güner, O. F. (2000) Pharmacophore perception, development, and use in drug design, International University Line, La Jolla, CA.
(57) Leong, M. K., Lin, S.-W., Chen, H.-B., and Tsai, F.-Y. (2010) Predicting Mutagenicity of Aromatic Amines by Various Machine Learning Approaches. Toxicol. Sci. 116, 498–513.
(58) Golbraikh, A., Shen, M., Xiao, Z., Xiao, Y.-D., Lee, K.-H., and Tropsha, A. (2003) Rational selection of training and test sets for the development of validated QSAR models. J. Comput.-Aided Mol. Des. 17, 241–253.
(59) Benfenati, E., Chretien, J. R., Gini, G., Piclin, N., Pintore, M., and Roncaglioni, A. (2007) Validation of the models, in Quantitative Structure-Activity Relationships (QSAR) for Pesticide Regulatory Purposes (Benfenati, E., Ed.) pp 185–199, Elsevier, Amsterdam.
(60) Roy, P. P., and Roy, K. (2008) On Some Aspects of Variable Selection for Partial Least Squares Regression Models. QSAR Comb. Sci. 27, 302–313.
(61) Gnanadesikan, R., and Kettenring, J. R. (1972) Robust estimates, residuals, and outlier detection with multiresponse data. Biometrics 28, 81–124.
(62) Zimmermann, K., Wittman, M. D., Saulnier, M. G., Velaparthi, U., Sang, X. P., Frennesson, D. B., Struzynski, C., Seitz, S. P., He, L. Q., Carboni, J. M., Li, A. X., Greer, A. F., Gottardis, M., Attar, R. M., Yang, Z., Balimane, P., Discenza, L. N., Lee, F. Y., Sinz, M., Kim, S., and Vyas, D. (2010) SAR of PXR transactivation in benzimidazole-based IGF-1R kinase inhibitors. Bioorg. Med. Chem. Lett. 20, 1744–1748.
(63) Watkins, R. E., Maglich, J. M., Moore, L. B., Wisely, G. B., Noble, S. M., Davis-Searles, P. R., Lambert, M. H., Kliewer, S. A., and Redinbo, M. R. (2003) 2.1 Å crystal structure of human PXR in complex with the St. John’s wort compound hyperforin. Biochemistry 42, 1430–1438.
(64) Ngan, C.-H., Beglov, D., Rudnitskaya, A. N., Kozakov, D., Waxman, D. J., and Vajda, S. (2009) The Structural Basis of Pregnane X Receptor Binding Promiscuity. Biochemistry 48, 11572–11581.
