Chem. Res. Toxicol. 2003, 16, 974-987
Stepwise Discrimination between Four Modes of Toxic Action of Phenols in the Tetrahymena pyriformis Assay
Gerrit Schüürmann,* Aynur O. Aptula, Ralph Kühne, and Ralf-Uwe Ebert
Department of Chemical Ecotoxicology, UFZ Centre for Environmental Research, Permoserstrasse 15, 04318 Leipzig, Germany
Received March 17, 2003
For a set of 220 phenols with literature data on their toxicity and associated mode of action (MOA) toward the ciliate Tetrahymena pyriformis, a stepwise classification scheme was developed that allows the identification of four MOAs from molecular hydrophobicity and AM1-based quantum chemical descriptors, employing linear discriminant analysis or binary logistic regression. With the AM1 lowest unoccupied molecular orbital energy as the only parameter, an initial separation of polar narcotics and proelectrophiles from oxidative uncouplers and soft electrophiles is 97% correct. For the subsequent discrimination between polar narcotics and proelectrophiles, and between oxidative uncouplers and soft electrophiles, 99 and 98% correct classifications are achieved using three and two molecular descriptors, respectively. The results are discussed in terms of detailed contingency table statistics and with respect to relationships between molecular descriptors and mechanisms of toxicity. Statistical model evaluation includes simulated external validation employing complementary subset models.
Introduction
Phenols exert a broad spectrum of biological activities. Naturally occurring polyphenols in fruits and vegetables such as flavonoids and phenolic acids are known for their antioxidant activity, which apparently contributes to the protection against cardiovascular disease and some forms of cancer (1). However, phenolic compounds may both scavenge and generate reactive oxygen species, and the development of antioxidants for clinical use includes strategies to minimize the prooxidant activity (2). Phenolic additives used as preservatives, disinfectants, and in sunscreens or light stabilizers show estrogenic activity (3, 4), and there is increasing evidence that phenol toxicity may also proceed through a radical pathway initiated by a homolytic abstraction of the hydroxyl hydrogen (5). The widespread exposure to chlorophenols has recently led to the development of an immunoassay for 2,4,5-trichlorophenol as one of the congeners with significant toxic effects (6).
Depending on the system pH, phenols undergo heterolytic dissociation, and within certain ranges of hydrophobicity and pKa, this results in the ability to act as oxidative uncouplers through a distinct shuttle mechanism across the inner membrane of mitochondria (7, 8). Electron-attracting substituents such as nitro groups and halogens diminish the electron density in the aromatic ring, yielding an increased electrophilicity that may lead to an impairment of proteins (receptors, enzymes) and nucleic acids (DNA, RNA) through covalent interactions with electron-rich sites (9, 10). This electrophilic mode of toxic action (MOA) can also take place after initial biotransformation, in which case the agents are classified as proelectrophiles (10). Moreover, distinct features of the electronic structure may lead to a redox-cycling activity of nitrophenols that is associated with the formation of intermediate radical anions and the generation of the superoxide radical anion (O2•−) through another shuttle mechanism. This additional pathway was hypothesized for 3-trifluoromethyl-4-nitrophenol (TFM) (11), which has been employed as an agent against the sea lamprey (Petromyzon marinus) in the Great Lakes for more than 30 years (12).
Despite the rather complex bioreactivity pattern of phenolic compounds, an important mode of their toxic action toward aquatic organisms is polar narcosis (13), an unspecific membrane irritation caused by noncovalent van der Waals type interactions of xenobiotics accumulated in lipid tissues (14).
In view of the various types of noncovalent and covalent interactions that may occur between chemical substances and membranes, proteins, and nucleic acids, the toxicity of a given compound is likely to be caused by the superposition of effects from different mechanisms. At the same time, it may not be uncommon that for a given xenobiotic and biological system, a certain MOA will dominate the overall hazardous effect. In line with this assumption, an expert system was developed based on acute fish toxicity data toward the fathead minnow (Pimephales promelas) and empirical knowledge, which allows us to classify organic chemicals with respect to eight MOAs using only substructural fragments (15). While this approach offers a particularly simple way to predict the most likely MOA from molecular structure, the relevant structural rules are not necessarily linked to distinct types of bioreactivity and associated intermolecular interactions.
Because the biological activity of drugs depends crucially on their three-dimensional (3D) structure and property profile, corresponding molecular descriptors offer a way to identify classification rules that elucidate the relevant geometric and electronic properties. A recent example is the discrimination between antibacterial and nonantibacterial activity of 661 organic
10.1021/tx0340504 CCC: $25.00 © 2003 American Chemical Society Published on Web 07/23/2003
Four Modes of Toxic Action of Phenols
chemicals including many druglike compounds, where an overall classification rate of 90% was achieved using log Kow and two parameters encoding certain aspects of the surface charge density of the compounds (16).
In the present investigation, a set of 220 phenols with four empirically assigned (quasiexperimental) MOAs in the Tetrahymena pyriformis assay (17) is used to develop a stepwise classification scheme that allows us to predict the prevalent MOA from 3D molecular descriptors. The four MOAs under discussion are polar narcosis, oxidative uncoupling, proelectrophilicity, and soft electrophilicity. These MOAs have been shown earlier to occur in the T. pyriformis assay, and the assignment of phenols to prevalent MOAs is based on previous experimental and QSAR studies as discussed in ref 17. While a recently developed 1-step 4-MOA discrimination yielded an overall concordance between predicted and experimental MOAs of 86-89%, the correct classification rate for oxidative uncouplers and proelectrophiles was only up to 78%, and model validation indicated greater variations in the statistical performance for compounds associated with these two MOAs (18). Interestingly, a separate discrimination of proelectrophiles from all other compounds resulted in a highly significant model, which, however, was not the case for a corresponding attempt with oxidative uncouplers. To address these aspects, additional 3D-based molecular descriptors are taken into account, and both linear discriminant analysis (LDA) and binary logistic regression (BLR) are employed. The results demonstrate that a hierarchical decision tree is superior to the previously derived 1-step discrimination (18) and leads to high classification rates for all four MOAs.
Material and Methods
The data set of 220 phenols and the associated allocations to the four modes of action (polar narcosis, oxidative uncoupling, proelectrophilicity, and soft electrophilicity) were taken from our previous investigation (18). However, two corrections were made to the MOA assignment: 2,6-dibromo-4-nitrophenol is now classified as an oxidative uncoupler according to a respective structural rule (17), and the previously included tetrachlorocatechol is now omitted due to an assumed redox-cycling activity (17). Both two-dimensional (2D) and 3D descriptors have been generated in order to quantify physicochemical, geometric, and electronic characteristics of the compounds.
Two-Dimensional Molecular Descriptors. Calculated values for the molecular hydrophobicity in terms of the logarithmic octanol/water partition coefficient (log Kow), the compound acidity constant (pKa), the hydrophobicity corrected for ionization (log Dowu, considering only the undissociated compound fraction fu = 1/[1 + 10^(pH - pKa)] (19)), and the hydrogen bond donor and acceptor counts NHdon (donor centers: O-H, N-H, and S-H) and NHacc (acceptor centers: -O-, C=O, -C≡N, -N=, tertiary nitrogen except N-C (sp2), S-H, and C=S) were taken from our previous investigation (18).
Three-Dimensional Molecular Descriptors. Geometric structures of the compounds were optimized using the semiempirical quantum chemical AM1 scheme as implemented in MOPAC 93 (20). Subsequently, molecular surface areas (SA) were calculated for the whole molecules as well as confined to the SA portion built from certain atom types (SAC, SAH, SAN, and SAO), employing the MOLSV program (21) and the following van der Waals radii: C, 1.5 Å; H, 1.2 Å; N, 1.5 Å; O, 1.4 Å; F, 1.35 Å; Cl, 1.8 Å; Br, 2.0 Å; and I, 2.15 Å. To characterize the electronic structure of the compounds, parameters were calculated as follows: energy of the highest
Chem. Res. Toxicol., Vol. 16, No. 8, 2003 975
occupied and lowest unoccupied molecular orbital, EHOMO and ELUMO; atomic charge in terms of the largest positive or negative value when considering all atomic sites, Qmax+, Qmax-, or only C, H, N, or O sites (QC-max+, QC-max-, etc.) as well as in terms of the corresponding average value when normalized with respect to all atoms or to all C, H, N, or O in the molecule (Qav+, Qav-, QC-av+, QC-av-, etc.) and, in addition, the average atomic charge when normalized with respect to atoms charged with the same (only positive or only negative) sign (Qavp+, Qavn-, QC-avp+, QC-avn-, etc.); acceptor (nucleophilic) and donor (electrophilic) delocalizability, DN and DE (for mathematical definitions, cf. ref 19, where for DN unoccupied molecular orbital energies are shifted upward by 10 eV in order to avoid arbitrarily small denominators) in terms of total value, maximum value, and average value with respect to all atoms or to C, H, N, or O (e.g., DHN, DC-maxN, DO-avE, DmaxN); charged partial surface area (CPSA) descriptors (22) based on AM1 atomic charges and the SAs as described above and evaluated in three different ways: all atoms, only heavy atoms (nonhydrogens), and only hydrogens (e.g., PPSA-1, sum of all positively charged atomic SAs; PPSA-1Z, PPSA-1 confined to heavy atoms; PPSA-1H, PPSA-1 confined to hydrogens). In addition to the 25 standard CPSA descriptors (22), six additional parameters were calculated as introduced in ref 16. In total, 125 3D descriptors were generated (five SA, 24 atomic charge, 27 delocalizability, and 69 CPSA), leading to an overall total of 130 descriptors (including five 2D descriptors).
Classification Modeling and Validation. LDA and BLR were employed as implemented in STATISTICA (23) and SPSS (24), respectively.
Model building was performed in a stepwise manner, starting with those variables from each group (physicochemical characteristics, SA, charge, delocalizability, and CPSA) that indicate high Fisher F test and p significance values and including up to a maximum of five variables through a corresponding application of the three criteria (descriptor type, Fisher statistics, and p level). For the LDA and BLR classification models, a hierarchical strategy was selected, aiming at an initial separation into two groups covering either one and three or two and two modes of action, and a subsequent separation of the lumped groups from the initial step into four separate classes according to the four MOAs. In LDA, the assignment of the compounds to one of the two classes is fitted to the equation

$$ d = a_0 + \sum_{k=1}^{m} a_k x_k \tag{1} $$
where d denotes the canonical discrimination function separating the two classes, xk is the k-th molecular property taken into account with its coefficient ak, a0 is a constant, and m is the number of descriptors included. The corresponding BLR equation reads
$$ \ln\left(\frac{P}{1-P}\right) = a_0 + \sum_{k=1}^{m} a_k x_k \tag{2} $$
where P is the probability that a compound belongs to a certain class (e.g., the combined group of polar narcotics and oxidative uncouplers) and 1 - P is the probability of the alternative assignment (in this case, the combined group of proelectrophiles and soft electrophiles), which combine to the so-called odds P/(1 - P). The statistical performance was characterized in terms of contingency table statistics as outlined in the next section. To evaluate the predictive performance, the strategy of simulated external validation was applied as introduced recently for the case of categorical data (18). After ordering the compounds according to MOAs (polar narcotics, oxidative uncouplers, proelectrophiles, and soft electrophiles) and within each MOA according to increasing toxicity, two subgroups were generated by allocating all odd-numbered compounds to group 1 and all
even-numbered compounds to group 2. In this way, both group 1 and group 2 contain 110 compounds and almost the same number of compounds of each MOA (77 vs 76 polar narcotics, 9 vs 10 oxidative uncouplers, 13 vs 13 proelectrophiles, and 11 vs 11 soft electrophiles). Then, LDA and BLR models were calibrated for group 1 and group 2 separately, and for each of the two subgroups, an external MOA prediction was performed from the submodel trained on the complementary subset (group 1 predicted from group 2 and vice versa). Both calibration and external prediction performance were then evaluated using contingency table statistics (see below). For comparative purposes, two previously derived 1-step 4-MOA LDA models, a three-variable model based on log Kow, pKa, and ELUMO and a five-variable model based on log Kow, pKa, ELUMO, EHOMO, and NHdon (18), were recalculated and reevaluated for the corrected data set as used in the present investigation. Moreover, multilinear regression using STATISTICA (23) was performed in order to analyze MOA specific relationships between the toxicity and the molecular descriptors selected for the final classification models. The respective statistics are characterized in terms of the squared correlation coefficient (r2), the standard error (SE), and the Fisher F value.
Statistical Evaluation of Contingency Tables. For evaluating the calibration and prediction performance of the LDA and BLR models, the results are presented in the form of 2D 4 × 4 contingency tables. Here, nij denotes the number of chemicals with predicted MOA class i (row index running from 1 to r; in our case, r = 4) and experimental MOA class j (column index running from 1 to c; in our case, c = 4), and
$$ N = \sum_{i=1}^{r} \sum_{j=1}^{c} n_{ij} \equiv n_{\bullet\bullet} \tag{3} $$
is the total number of compounds (here, N = 220). In the contingency table, the row index refers to predicted categories, and the column index refers to experimental categories. The ith predicted MOA class contains
$$ n_{i\bullet} = \sum_{j=1}^{c} n_{ij} \tag{4} $$
compounds, which is also called the marginal total of the ith row and correspondingly
$$ n_{\bullet j} = \sum_{i=1}^{r} n_{ij} \tag{5} $$
denotes the total number of compounds in the jth table column as its marginal total. Note the use of the dot notation to simplify mathematical expressions such as in n•j, where the dot in the subscript indicates summation over the respective index (here, row index i, running from 1 to r). To evaluate the degree of agreement between predicted and experimental MOAs, several association coefficients have been calculated (25). The concordance is simply defined as the proportion of compounds where the predicted and experimental classification agree
$$ \text{concordance} = \frac{1}{N} \sum_{i=1}^{r} n_{ii} \tag{6} $$
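As a minimal sketch of eqs 3-6, the marginal totals and the concordance can be computed from a contingency table stored as a nested list. The 4 × 4 counts below are hypothetical illustration values (not results of this study), and the function names are assumptions introduced only for this example.

```python
# Hypothetical 4 x 4 contingency table (illustrative counts only, not study data).
# Row index i: predicted MOA class; column index j: experimental MOA class.
table = [
    [50, 2, 1, 0],
    [3, 8, 0, 1],
    [1, 0, 12, 0],
    [0, 1, 0, 9],
]


def row_totals(n):
    """Marginal totals of the rows (eq 4): sum over the column index j."""
    return [sum(row) for row in n]


def column_totals(n):
    """Marginal totals of the columns (eq 5): sum over the row index i."""
    return [sum(column) for column in zip(*n)]


def grand_total(n):
    """Total number of compounds N (eq 3)."""
    return sum(sum(row) for row in n)


def concordance(n):
    """Fraction of compounds on the table diagonal (eq 6)."""
    return sum(n[i][i] for i in range(len(n))) / grand_total(n)


if __name__ == "__main__":
    print("row totals:   ", row_totals(table))
    print("column totals:", column_totals(table))
    print("N:            ", grand_total(table))
    print("concordance:  ", round(concordance(table), 3))
```

In this hypothetical table, 79 of 88 compounds lie on the diagonal, so the concordance is 79/88.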
While the concordance appears intuitively convincing, it ignores the fraction of agreement that might have been obtained by chance. The latter is addressed by the so-called κ index
$$ \kappa = \frac{(1/N) \sum_{i} n_{ii} - (1/N^2) \sum_{i} n_{i\bullet} n_{\bullet i}}{1 - (1/N^2) \sum_{i} n_{i\bullet} n_{\bullet i}} \tag{7} $$
The κ index (26) normalizes the level of agreement between predicted and actual categories with respect to the agreement that could have been achieved by allocating the chemicals to classes at random but in accord with the marginal totals. Both the simple concordance and κ vary between 0 (no agreement between predicted and experimental classes) and 1 (complete agreement).

Table 1. Two-Dimensional 4 × 4 Contingency Table

| category type | experimental category | | | | total |
|---|---|---|---|---|---|
| predicted category | n11 | n12 | n13 | n14 | n1• |
| | n21 | n22 | n23 | n24 | n2• |
| | n31 | n32 | n33 | n34 | n3• |
| | n41 | n42 | n43 | n44 | n4• |
| total | n•1 | n•2 | n•3 | n•4 | n•• = N |

The third contingency coefficient used in our analysis is the λB parameter (25):
$$ \lambda_B = \frac{\sum_{i} \max_{j} (n_{ij}) - \max_{j} (n_{\bullet j})}{N - \max_{j} (n_{\bullet j})} \tag{8} $$
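Equations 7 and 8 can be illustrated with a short sketch that evaluates κ and λB for a square contingency table. The table entries are hypothetical counts chosen for illustration, and the helper names (`marginals`, `kappa`, `lambda_b`) are assumptions of this example rather than part of the software used in the study.

```python
# Hypothetical 4 x 4 contingency table (illustrative counts only, not study data).
# Rows: predicted MOA class; columns: experimental MOA class.
table = [
    [50, 2, 1, 0],
    [3, 8, 0, 1],
    [1, 0, 12, 0],
    [0, 1, 0, 9],
]


def marginals(n):
    """Return (N, row totals, column totals) of a contingency table."""
    rows = [sum(row) for row in n]
    cols = [sum(column) for column in zip(*n)]
    return sum(rows), rows, cols


def kappa(n):
    """Chance-corrected agreement (eq 7); assumes a square table."""
    total, rows, cols = marginals(n)
    observed = sum(n[i][i] for i in range(len(n))) / total
    # Agreement expected from random allocation in accord with the marginal totals.
    chance = sum(r * c for r, c in zip(rows, cols)) / total**2
    return (observed - chance) / (1 - chance)  # undefined if chance == 1


def lambda_b(n):
    """Reduction in prediction error relative to using marginal totals only (eq 8)."""
    total, _, cols = marginals(n)
    largest_column = max(cols)                # max_j of the column totals
    row_maxima = sum(max(row) for row in n)   # sum over i of max_j n_ij
    return (row_maxima - largest_column) / (total - largest_column)


if __name__ == "__main__":
    print("kappa:   ", round(kappa(table), 3))
    print("lambda_B:", round(lambda_b(table), 3))
```

For a table with all counts on the diagonal, both functions return 1; for a table whose rows carry no information about the columns (e.g., all cells equal), both return 0.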
In contrast to conventional measures of association, λB is specifically designed to characterize the predictive power of the classification method under investigation. Its value ranges between 0 (no prediction capability) and 1 (full prediction capability), and it quantifies the reduction in prediction error through exploitation of the classification model as compared to the prediction error when allocating the classes solely on the basis of the marginal totals. While the concordance, κ, and λB provide numerical values for the overall performance of the classification model, they do not specify potential differences in the classification rates between the individual MOA classes. For evaluating these class specific performances of the LDA and BLR models, the following two parameters have been employed
$$ \text{sensitivity for class } k = \frac{n_{kk}}{n_{\bullet k}} \tag{9} $$

$$ \text{predictivity for class } k = \frac{n_{kk}}{n_{k\bullet}} \tag{10} $$
For a given mode of action k (here, k = 1-4; 1, polar narcosis; 2, oxidative uncoupling; 3, proelectrophilicity; and 4, soft electrophilicity), the associated sensitivity of the classification model represents the fraction of correctly identified class k chemicals, that is, the proportion of actual class k compounds that were predicted to belong to class k. As such, the sensitivity may also be called the recognition power of the model for MOA k. Correspondingly, the predictivity of the classification model with respect to category k is defined as the fraction of correctly predicted class k chemicals, which means the proportion of predicted class k compounds that actually belong to class k. Equations 9 and 10 provide a generalization of two-category statistics (27) to the case of polychotomous categorical variables.
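The class specific measures of eqs 9 and 10 can be sketched as follows. The counts are again hypothetical, the function names are assumptions of this example, and the class index k is zero-based in the code (0-3) rather than 1-4 as in the text.

```python
# MOA classes in the order used in the text (k = 1-4 there, 0-3 in this sketch).
MOA_CLASSES = [
    "polar narcosis",
    "oxidative uncoupling",
    "proelectrophilicity",
    "soft electrophilicity",
]

# Hypothetical 4 x 4 contingency table (illustrative counts only, not study data).
# Rows: predicted MOA class; columns: experimental MOA class.
table = [
    [50, 2, 1, 0],
    [3, 8, 0, 1],
    [1, 0, 12, 0],
    [0, 1, 0, 9],
]


def sensitivity(n, k):
    """Eq 9: correctly predicted class k compounds / all actual class k compounds."""
    column_total = sum(row[k] for row in n)
    return n[k][k] / column_total


def predictivity(n, k):
    """Eq 10: correctly predicted class k compounds / all compounds predicted as k."""
    row_total = sum(n[k])
    return n[k][k] / row_total


if __name__ == "__main__":
    for k, name in enumerate(MOA_CLASSES):
        print(f"{name}: sensitivity = {sensitivity(table, k):.3f}, "
              f"predictivity = {predictivity(table, k):.3f}")
```

Sensitivity divides by a column total (experimental class) and predictivity by a row total (predicted class), so the two values differ whenever the misclassifications into and out of class k are unequal.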
Results
The data set with 220 phenols, ciliate toxicities in terms of log IC50 (mol/L) values (growth inhibition in the 2-day T. pyriformis assay), and assignments to the four MOAs polar narcosis (153 compounds), oxidative uncoupling (19 compounds), proelectrophilicity (26 compounds), and soft electrophilicity (22 compounds) was taken from a previous study (18) (with two corrections as mentioned above) and is summarized in Table 2. The compounds are ordered by MOA class and increasing toxicity, allowing a convenient separation into the subgroups group 1 and group 2 used for the model validation (see below). The toxicity covers four orders of magnitude, ranging from 3.16 × 10^-2 mol/L (4-hydroxyphenylacetic acid) to 1.94 × 10^-6 mol/L (2,3,4,5-tetrachlorophenol). Within each MOA class, the median log IC50 (mol/L) values are
Table 2. Compounds with Ciliate Toxicity in Terms of the Logarithmic Inhibition Concentration 50% (log IC50) and Calculated Molecular Descriptors^a

| no. | name | log IC50 (mol/L) | log Kow | ELUMO (eV) | EHOMO (eV) | DC-maxE (1/eV) | DC-avN (1/eV) | NHdon | NHacc |
|---|---|---|---|---|---|---|---|---|---|
| | polar narcotics | | | | | | | | |
| 1 | 4-hydroxyphenylacetic acid | -1.500 | 0.750 | 0.141 | -9.182 | -0.264 | 0.294 | 2 | 3 |
| 2 | 1,3,5-trihydroxybenzene | -1.736 | 0.160 | 0.247 | -9.158 | -0.278 | 0.294 | 3 | 3 |
| 3 | 3-hydroxybenzyl alcohol | -1.957 | 0.437 | 0.166 | -9.211 | -0.265 | 0.287 | 2 | 2 |
| 4 | 4-hydroxybenzoic acid | -1.979 | 1.557 | -0.481 | -9.608 | -0.261 | 0.302 | 2 | 3 |
| 5 | 3-hydroxy-4-methoxybenzyl alcohol | -2.010 | 0.290 | 0.070 | -9.039 | -0.262 | 0.289 | 2 | 3 |
| 6 | 4-hydroxy-3-methoxybenzylamine | -2.030 | 0.280 | 0.291 | -8.936 | -0.264 | 0.286 | 2 | 3 |
| 7 | 2-hydroxybenzyl alcohol | -2.046 | 0.437 | 0.188 | -9.260 | -0.264 | 0.288 | 2 | 2 |
| 8 | 4-hydroxyphenethyl alcohol | -2.174 | 0.516 | 0.473 | -8.862 | -0.269 | 0.281 | 2 | 2 |
| 9 | 3-hydroxybenzoic acid | -2.186 | 1.557 | -0.576 | -9.522 | -0.255 | 0.302 | 2 | 3 |
| 10 | 4-hydroxybenzamide | -2.220 | 0.327 | -0.174 | -9.438 | -0.263 | 0.297 | 2 | 2 |
| 11 | 4-hydroxy-3-methoxybenzyl alcohol | -2.300 | 0.290 | 0.413 | -8.641 | -0.263 | 0.287 | 2 | 3 |
| 12 | resorcinol | -2.348 | 0.800 | 0.366 | -9.125 | -0.272 | 0.289 | 2 | 2 |
| 13 | 2,6-dimethoxyphenol | -2.402 | 1.098 | 0.399 | -8.749 | -0.267 | 0.288 | 1 | 3 |
| 14 | 2,4,6-tris-(dimethylaminomethyl)phenol | -2.480 | 0.920 | 0.337 | -8.847 | -0.265 | 0.275 | 1 | 4 |
| 15 | salicylic acid | -2.488 | 2.187 | -0.684 | -9.644 | -0.256 | 0.306 | 2 | 3 |
| 16 | 2-methoxyphenol | -2.490 | 1.324 | 0.356 | -8.693 | -0.265 | 0.286 | 1 | 2 |
| 17 | 5-methylresorcinol | -2.611 | 1.310 | 0.300 | -9.015 | -0.275 | 0.286 | 2 | 2 |
| 18 | 4-hydroxybenzylcyanide | -2.616 | 0.897 | 0.063 | -9.309 | -0.262 | 0.293 | 1 | 2 |
| 19 | 3-hydroxyacetophenone | -2.619 | 1.455 | -0.473 | -9.382 | -0.259 | 0.293 | 1 | 2 |
| 20 | 2-ethoxyphenol | -2.642 | 1.853 | 0.369 | -8.803 | -0.263 | 0.283 | 1 | 2 |
| 21 | 3-methoxyphenol | -2.674 | 1.574 | 0.414 | -8.941 | -0.274 | 0.286 | 1 | 2 |
| 22 | 4-hydroxyacetophenone | -2.698 | 1.455 | -0.376 | -9.434 | -0.263 | 0.293 | 1 | 2 |
| 23 | 3-ethoxy-4-methoxyphenol | -2.701 | 1.687 | 0.248 | -8.821 | -0.271 | 0.286 | 1 | 3 |
| 24 | 2-cresol | -2.705 | 1.974 | 0.409 | -8.960 | -0.268 | 0.281 | 1 | 1 |
| 25 | salicylamide | -2.758 | 1.277 | -0.265 | -9.298 | -0.262 | 0.298 | 2 | 2 |
| 26 | ethyl-4-hydroxy-3-methoxyphenylacetate | -2.770 | 1.530 | 0.179 | -8.728 | -0.263 | 0.289 | 1 | 4 |
| 27 | phenol | -2.792 | 1.475 | 0.398 | -9.115 | -0.268 | 0.285 | 1 | 1 |
| 28 | 4-cresol | -2.816 | 1.974 | 0.428 | -8.881 | -0.268 | 0.282 | 1 | 1 |
| 29 | 4-hydroxy-3-methoxyphenethyl alcohol | -2.820 | 0.470 | 0.247 | -8.950 | -0.263 | 0.285 | 2 | 3 |
| 30 | 3-acetamidophenol | -2.845 | 0.494 | 0.240 | -8.812 | -0.274 | 0.290 | 2 | 2 |
| 31 | 4-methoxyphenol | -2.857 | 1.574 | 0.304 | -8.636 | -0.262 | 0.287 | 1 | 2 |
| 32 | isovanillin | -2.859 | 1.275 | -0.489 | -9.111 | -0.257 | 0.299 | 1 | 3 |
| 33 | 4-hydroxy-3-methoxyacetophenone | -2.880 | 1.270 | -0.397 | -9.001 | -0.259 | 0.293 | 1 | 3 |
| 34 | 3,5-dimethoxyphenol | -2.908 | 1.598 | 0.413 | -8.964 | -0.281 | 0.288 | 1 | 3 |
| 35 | 2-hydroxyethylsalicylate | -2.920 | 1.560 | -0.301 | -9.438 | -0.263 | 0.295 | 2 | 4 |
| 36 | 3-cyanophenol | -2.936 | 1.597 | -0.501 | -9.568 | -0.256 | 0.300 | 1 | 2 |
| 37 | 3-cresol | -2.937 | 1.974 | 0.390 | -9.025 | -0.271 | 0.282 | 1 | 1 |
| 38 | methyl-3-hydroxybenzoate | -2.954 | 1.985 | -0.491 | -9.443 | -0.257 | 0.298 | 1 | 3 |
| 39 | vanillin | -2.970 | 1.275 | -0.476 | -9.126 | -0.258 | 0.299 | 1 | 3 |
| 40 | 4-hydroxy-3-methoxybenzonitrile | -2.970 | 1.420 | -0.430 | -9.184 | -0.255 | 0.301 | 1 | 3 |
| 41 | 4-ethoxyphenol | -3.013 | 2.103 | 0.328 | -8.597 | -0.263 | 0.284 | 1 | 2 |
| 42 | 3-ethoxy-4-hydroxybenzaldehyde | -3.015 | 1.804 | -0.502 | -9.333 | -0.261 | 0.294 | 1 | 3 |
| 43 | 4-fluorophenol | -3.017 | 1.915 | 0.059 | -9.093 | -0.259 | 0.295 | 1 | 1 |
| 44 | 2-cyanophenol | -3.034 | 1.597 | -0.509 | -9.556 | -0.257 | 0.300 | 1 | 2 |
| 45 | 5-fluoro-2-hydroxyacetophenone | -3.040 | 2.170 | -0.786 | -9.274 | -0.256 | 0.303 | 1 | 2 |
| 46 | 4-hydroxypropiophenone | -3.053 | 1.984 | -0.364 | -9.418 | -0.263 | 0.290 | 1 | 2 |
| 47 | 2,4-dimethylphenol | -3.070 | 2.473 | 0.400 | -8.784 | -0.266 | 0.280 | 1 | 1 |
| 48 | 2-hydroxyacetophenone | -3.078 | 1.915 | -0.517 | -9.303 | -0.264 | 0.294 | 1 | 2 |
| 49 | 2,5-dimethylphenol | -3.081 | 2.473 | 0.348 | -8.899 | -0.269 | 0.280 | 1 | 1 |
| 50 | methyl-4-hydroxybenzoate | -3.084 | 1.985 | -0.397 | -9.536 | -0.262 | 0.298 | 1 | 3 |
| 51 | 3-hydroxybenzaldehyde | -3.085 | 1.443 | -0.547 | -9.449 | -0.257 | 0.298 | 1 | 2 |
| 52 | 3,5-dimethylphenol | -3.113 | 2.473 | 0.390 | -8.969 | -0.272 | 0.279 | 1 | 1 |
| 53 | 2,3-dimethylphenol | -3.122 | 2.423 | 0.374 | -8.931 | -0.269 | 0.280 | 1 | 1 |
| 54 | 3,4-dimethylphenol | -3.122 | 2.423 | 0.402 | -8.822 | -0.270 | 0.279 | 1 | 1 |
| 55 | 4-chlororesorcinol | -3.125 | 1.580 | -0.008 | -9.099 | -0.267 | 0.300 | 2 | 2 |
| 56 | 2-ethylphenol | -3.160 | 2.503 | 0.410 | -8.985 | -0.268 | 0.279 | 1 | 1 |
| 57 | syringaldehyde | -3.168 | 0.988 | -0.572 | -9.348 | -0.255 | 0.300 | 1 | 4 |
| 58 | salicylhydrazide | -3.181 | 0.847 | -0.317 | -9.584 | -0.260 | 0.300 | 3 | 3 |
| 59 | 2-chlorophenol | -3.183 | 2.155 | 0.030 | -9.260 | -0.260 | 0.296 | 1 | 1 |
| 60 | 2-fluorophenol | -3.185 | 1.715 | 0.013 | -9.271 | -0.257 | 0.295 | 1 | 1 |
| 61 | 4-hydroxy-2-methylacetophenone | -3.190 | 1.950 | -0.291 | -9.317 | -0.267 | 0.289 | 1 | 2 |
| 62 | 4-ethylphenol | -3.205 | 2.503 | 0.433 | -8.912 | -0.268 | 0.280 | 1 | 1 |
| 63 | 3-ethylphenol | -3.228 | 2.503 | 0.386 | -9.043 | -0.270 | 0.279 | 1 | 1 |
| 64 | salicylaldoxime | -3.253 | 1.098 | -0.191 | -8.991 | -0.265 | 0.290 | 2 | 3 |
| 65 | 4-hydroxybenzaldehyde | -3.266 | 1.443 | -0.446 | -9.493 | -0.262 | 0.298 | 1 | 2 |
| 66 | 2,3,6-trimethylphenol | -3.277 | 2.922 | 0.383 | -8.782 | -0.269 | 0.278 | 1 | 1 |
| 67 | 2,4,6-trimethylphenol | -3.281 | 2.972 | 0.429 | -8.693 | -0.262 | 0.278 | 1 | 1 |
| 68 | 2-hydroxy-5-methylacetophenone | -3.310 | 2.410 | -0.321 | -9.198 | -0.264 | 0.289 | 1 | 2 |
| 69 | 2-bromophenol | -3.330 | 2.355 | -0.013 | -9.245 | -0.263 | 0.296 | 1 | 1 |
| 70 | 2-allylphenol | -3.334 | 2.548 | 0.360 | -9.036 | -0.281 | 0.280 | 1 | 1 |
| 71 | 5-bromo-2-hydroxybenzyl alcohol | -3.343 | 1.597 | 0.041 | -9.101 | -0.262 | 0.295 | 2 | 2 |
| 72 | 2,3,5-trimethylphenol | -3.360 | 2.922 | 0.383 | -8.807 | -0.272 | 0.277 | 1 | 1 |
| 73 | 2-vanillin | -3.377 | 1.645 | -0.670 | -9.272 | -0.261 | 0.299 | 1 | 3 |
Table 2 (Continued)

| no. | name | log IC50 (mol/L) | log Kow | ELUMO (eV) | EHOMO (eV) | DC-maxE (1/eV) | DC-avN (1/eV) | NHdon | NHacc |
|---|---|---|---|---|---|---|---|---|---|
| | polar narcotics | | | | | | | | |
| 74 | salicylhydroxamic acid | -3.379 | 0.877 | -0.433 | -9.386 | -0.261 | 0.300 | 3 | 3 |
| 75 | 3-fluorophenol | -3.381 | 1.915 | 0.025 | -9.373 | -0.265 | 0.295 | 1 | 1 |
| 76 | 2-chloro-5-methylphenol | -3.393 | 2.654 | 0.054 | -9.059 | -0.265 | 0.291 | 1 | 1 |
| 77 | 4-allyl-2-methoxyphenol | -3.420 | 2.400 | 0.303 | -8.886 | -0.276 | 0.283 | 1 | 2 |
| 78 | salicylaldehyde | -3.424 | 1.813 | -0.434 | -9.500 | -0.262 | 0.297 | 1 | 2 |
| 79 | 2,6-difluorophenol | -3.471 | 1.747 | -0.321 | -9.459 | -0.252 | 0.306 | 1 | 1 |
| 80 | 4-isopropylphenol | -3.473 | 2.902 | 0.446 | -8.919 | -0.268 | 0.278 | 1 | 1 |
| 81 | ethyl-3-hydroxybenzoate | -3.478 | 2.514 | -0.460 | -9.421 | -0.258 | 0.294 | 1 | 3 |
| 82 | 4-cyanophenol | -3.516 | 1.597 | -0.413 | -9.510 | -0.259 | 0.300 | 1 | 2 |
| 83 | 4-chlorophenol | -3.545 | 2.485 | 0.095 | -9.125 | -0.262 | 0.296 | 1 | 1 |
| 84 | 2-hydroxy-4-methoxyacetophenone | -3.550 | 1.980 | -0.288 | -9.299 | -0.270 | 0.293 | 1 | 3 |
| 85 | ethyl-4-hydroxybenzoate | -3.573 | 2.514 | -0.365 | -9.513 | -0.264 | 0.293 | 1 | 3 |
| 86 | 2-bromo-4-methylphenol | -3.599 | 2.854 | 0.032 | -9.054 | -0.263 | 0.292 | 1 | 1 |
| 87 | 2,4-difluorophenol | -3.604 | 1.947 | -0.282 | -9.246 | -0.254 | 0.306 | 1 | 1 |
| 88 | 3-isopropylphenol | -3.612 | 2.902 | 0.416 | -9.014 | -0.270 | 0.278 | 1 | 1 |
| 89 | 5-bromovanillin | -3.617 | 1.917 | -0.758 | -9.566 | -0.254 | 0.308 | 1 | 3 |
| 90 | α,α,α-trifluoro-4-cresol | -3.618 | 2.877 | -0.348 | -9.792 | -0.258 | 0.308 | 1 | 1 |
| 91 | methyl-4-methoxysalicylate | -3.620 | 2.490 | -0.313 | -9.355 | -0.270 | 0.296 | 1 | 4 |
| 92 | 4-propylphenol | -3.643 | 3.032 | 0.430 | -8.908 | -0.268 | 0.278 | 1 | 1 |
| 93 | 4-bromophenol | -3.680 | 2.635 | 0.020 | -9.189 | -0.262 | 0.297 | 1 | 1 |
| 94 | 2-chloro-4,5-dimethylphenol | -3.688 | 3.103 | 0.053 | -8.963 | -0.263 | 0.288 | 1 | 1 |
| 95 | 4-butoxyphenol | -3.701 | 3.161 | 0.328 | -8.594 | -0.263 | 0.28 | 1 | 2 |
| 96 | 4-chloro-2-methylphenol | -3.701 | 2.984 | 0.134 | -9.008 | -0.262 | 0.291 | 1 | 1 |
| 97 | 2-hydroxy-4,5-dimethylacetophenone | -3.707 | 2.863 | -0.473 | -9.030 | -0.266 | 0.288 | 1 | 2 |
| 98 | 3-(tert)-butylphenol | -3.730 | 3.301 | 0.422 | -9.009 | -0.270 | 0.276 | 1 | 1 |
| 99 | 2,6-dichlorophenol | -3.735 | 2.627 | -0.258 | -9.374 | -0.255 | 0.307 | 1 | 1 |
| 100 | 2-methoxy-4-propenylphenol | -3.750 | 2.580 | -0.006 | -8.482 | -0.267 | 0.283 | 1 | 2 |
| 101 | 3-chloro-5-methoxyphenol | -3.757 | 2.500 | 0.027 | -9.222 | -0.270 | 0.296 | 1 | 2 |
| 102 | 4-chloro-3-methylphenol | -3.796 | 2.984 | 0.134 | -9.035 | -0.265 | 0.291 | 1 | 1 |
| 103 | 2-isopropylphenol | -3.798 | 2.902 | 0.412 | -8.986 | -0.269 | 0.277 | 1 | 1 |
| 104 | 2,6-dichloro-4-fluorophenol | -3.804 | 2.797 | -0.568 | -9.384 | -0.248 | 0.318 | 1 | 1 |
| 105 | 4-iodophenol | -3.854 | 2.895 | 0.024 | -9.243 | -0.270 | 0.296 | 1 | 1 |
| 106 | 3-chlorophenol | -3.871 | 2.485 | 0.019 | -9.335 | -0.263 | 0.296 | 1 | 1 |
| 107 | 4-(tert)-butylphenol | -3.914 | 3.301 | 0.471 | -8.894 | -0.268 | 0.276 | 1 | 1 |
| 108 | 3,4,5-trimethylphenol | -3.932 | 2.872 | 0.429 | -8.749 | -0.272 | 0.278 | 1 | 1 |
| 109 | 4,6-dichlororesorcinol | -3.967 | 2.080 | -0.263 | -9.224 | -0.260 | 0.312 | 2 | 2 |
| 110 | 4-(sec)-butylphenol | -3.979 | 3.431 | 0.448 | -8.907 | -0.268 | 0.276 | 1 | 1 |
| 111 | 4-hydroxybenzophenone | -4.024 | 3.070 | -0.486 | -9.399 | -0.264 | 0.293 | 1 | 2 |
| 112 | 2,4-dichlorophenol | -4.036 | 2.957 | -0.197 | -9.230 | -0.257 | 0.307 | 1 | 1 |
| 113 | 4-benzyloxyphenol | -4.038 | 3.342 | 0.236 | -8.977 | -0.263 | 0.286 | 1 | 2 |
| 114 | 2,4,6-tribromoresorcinol | -4.060 | 4.370 | -0.610 | -9.397 | -0.258 | 0.323 | 2 | 2 |
| 115 | 4-chloro-3-ethylphenol | -4.081 | 3.513 | 0.140 | -9.041 | -0.265 | 0.288 | 1 | 1 |
| 116 | 2-phenylphenol | -4.094 | 3.090 | -0.047 | -8.733 | -0.267 | 0.284 | 1 | 1 |
| 117 | 3-iodophenol | -4.119 | 2.895 | -0.048 | -9.347 | -0.259 | 0.296 | 1 | 1 |
| 118 | 2,5-dichlorophenol | -4.125 | 2.957 | -0.325 | -9.408 | -0.256 | 0.307 | 1 | 1 |
| 119 | 3-chloro-4-fluorophenol | -4.131 | 2.717 | -0.265 | -9.251 | -0.256 | 0.306 | 1 | 1 |
| 120 | 3-bromophenol | -4.145 | 2.635 | -0.051 | -9.337 | -0.260 | 0.297 | 1 | 1 |
| 121 | 6-(tert)-butyl-2,4-dimethylphenol | -4.157 | 4.299 | 0.438 | -8.687 | -0.271 | 0.273 | 1 | 1 |
| 122 | 4-bromo-2,6-dimethylphenol | -4.167 | 3.633 | 0.085 | -8.996 | -0.258 | 0.289 | 1 | 1 |
| 123 | 4-chloro-3,5-dimethylphenol | -4.201 | 3.483 | 0.147 | -8.977 | -0.267 | 0.288 | 1 | 1 |
| 124 | 4-(tert)-pentylphenol | -4.229 | 3.830 | 0.472 | -8.885 | -0.268 | 0.275 | 1 | 1 |
| 125 | 4-bromo-3,5-dimethylphenol | -4.268 | 3.633 | 0.109 | -9.056 | -0.267 | 0.288 | 1 | 1 |
| 126 | 2,3-dichlorophenol | -4.276 | 2.837 | -0.248 | -9.390 | -0.259 | 0.306 | 1 | 1 |
| 127 | 4-bromo-6-chloro-2-cresol | -4.276 | 3.606 | -0.225 | -9.237 | -0.253 | 0.302 | 1 | 1 |
| 128 | 4-cyclopentylphenol | -4.292 | 3.536 | 0.432 | -8.902 | -0.268 | 0.276 | 1 | 1 |
| 129 | 2-(tert)-butylphenol | -4.295 | 3.301 | 0.418 | -8.965 | -0.271 | 0.275 | 1 | 1 |
| 130 | 2-(tert)-butyl-4-methylphenol | -4.301 | 3.800 | 0.459 | -8.769 | -0.271 | 0.274 | 1 | 1 |
| 131 | 5-pentylresorcinol | -4.306 | 3.420 | 0.259 | -8.941 | -0.276 | 0.277 | 2 | 2 |
| 132 | 3-phenylphenol | -4.351 | 3.230 | -0.149 | -8.956 | -0.267 | 0.285 | 1 | 1 |
| 133 | 4-phenylphenol | -4.393 | 3.200 | -0.088 | -8.692 | -0.266 | 0.285 | 1 | 1 |
| 134 | 2,4-dibromophenol | -4.398 | 3.307 | -0.299 | -9.331 | -0.258 | 0.308 | 1 | 1 |
| 135 | 2,4,6-trichlorophenol | -4.410 | 3.367 | -0.502 | -9.390 | -0.244 | 0.318 | 1 | 1 |
| 136 | 2-hydroxy-4-methoxybenzophenone | -4.420 | 3.580 | -0.572 | -9.171 | -0.271 | 0.295 | 1 | 3 |
| 137 | 3,5-dichlorosalicylaldehyde | -4.550 | 3.070 | -0.893 | -9.578 | -0.251 | 0.318 | 1 | 2 |
| 138 | 3,5-dichlorophenol | -4.569 | 3.287 | -0.285 | -9.537 | -0.259 | 0.307 | 1 | 1 |
| 139 | 3,5-di-(tert)-butylphenol | -4.638 | 5.127 | 0.452 | -8.942 | -0.273 | 0.273 | 1 | 1 |
| 140 | 4-hexyloxyphenol | -4.638 | 4.219 | 0.339 | -8.608 | -0.264 | 0.278 | 1 | 2 |
| 141 | 3,5-dibromosalicylaldehyde | -4.638 | 3.423 | -1.071 | -9.543 | -0.255 | 0.320 | 1 | 2 |
| 142 | 3,4-dichlorophenol | -4.745 | 3.167 | -0.236 | -9.279 | -0.258 | 0.306 | 1 | 1 |
| 143 | 4-bromo-2,6-dichlorophenol | -4.778 | 3.517 | -0.514 | -9.446 | -0.248 | 0.319 | 1 | 1 |
| 144 | 2,6-di-(tert)-butyl-4-methylphenol | -4.796 | 5.626 | 0.512 | -8.641 | -0.272 | 0.271 | 1 | 1 |
| 145 | 4-hexylresorcinol | -4.798 | 3.450 | 0.353 | -8.849 | -0.272 | 0.275 | 2 | 2 |
| 146 | 4-chloro-2-isopropyl-5-methylphenol | -4.854 | 4.411 | 0.143 | -8.919 | -0.265 | 0.283 | 1 | 1 |
| 147 | 2,4,6-tribromophenol | -5.030 | 3.917 | -0.621 | -9.503 | -0.252 | 0.320 | 1 | 1 |
Table 2 (Continued)

| no. | name | log IC50 (mol/L) | log Kow | ELUMO (eV) | EHOMO (eV) | DC-maxE (1/eV) | DC-avN (1/eV) | NHdon | NHacc |
|---|---|---|---|---|---|---|---|---|---|
| | polar narcotics | | | | | | | | |
| 148 | 4-heptyloxyphenol | -5.033 | 4.748 | 0.329 | -8.596 | -0.263 | 0.276 | 1 | 2 |
| 149 | 4-(tert)-octylphenol | -5.097 | 5.157 | 0.465 | -8.850 | -0.270 | 0.272 | 1 | 1 |
| 150 | 2,4,5-trichlorophenol | -5.097 | 3.577 | -0.511 | -9.323 | -0.254 | 0.317 | 1 | 1 |
| 151 | 3,5-diiodosalicylaldehyde | -5.340 | 3.870 | -0.901 | -9.719 | -0.263 | 0.316 | 1 | 2 |
| 152 | 2,3,5-trichlorophenol | -5.373 | 3.577 | -0.555 | -9.485 | -0.256 | 0.317 | 1 | 1 |
| 153 | nonylphenol | -5.468 | 6.204 | 0.429 | -8.913 | -0.268 | 0.271 | 1 | 1 |
| | oxidative uncouplers | | | | | | | | |
| 154 | 2,4,6-trinitrophenol | -2.845 | 1.588 | -2.534 | -11.414 | -0.228 | 0.368 | 1 | 7 |
| 155 | 3,4-dinitrophenol | -3.266 | 1.978 | -1.862 | -10.727 | -0.236 | 0.339 | 1 | 5 |
| 156 | 2,3-dinitrophenol | -3.463 | 1.978 | -1.799 | -10.649 | -0.238 | 0.338 | 1 | 5 |
| 157 | 2,6-dinitrophenol | -3.539 | 1.788 | -1.952 | -10.659 | -0.241 | 0.339 | 1 | 5 |
| 158 | 2,6-dichloro-4-nitrophenol | -3.632 | 2.736 | -1.441 | -10.174 | -0.236 | 0.334 | 1 | 3 |
| 159 | 2,5-dinitrophenol | -3.950 | 1.788 | -2.194 | -10.676 | -0.236 | 0.340 | 1 | 5 |
| 160 | 2,4-dinitrophenol | -4.077 | 1.788 | -1.806 | -10.807 | -0.242 | 0.339 | 1 | 5 |
| 161 | 2,3,5,6-tetrafluorophenol | -4.167 | 2.068 | -0.994 | -9.874 | -0.249 | 0.329 | 1 | 1 |
| 162 | 2,6-dinitro-4-cresol | -4.229 | 2.287 | -1.894 | -10.351 | -0.241 | 0.330 | 1 | 5 |
| 163 | 2,6-dibromo-4-nitrophenol | -4.356 | 3.136 | -1.452 | -10.216 | -0.244 | 0.335 | 1 | 3 |
| 164 | pentafluorophenol | -4.638 | 2.213 | -1.296 | -9.940 | -0.214 | 0.341 | 1 | 1 |
| 165 | 2,6-diiodo-4-nitrophenol | -4.712 | 3.716 | -1.422 | -10.245 | -0.256 | 0.332 | 1 | 3 |
| 166 | 4,6-dinitro-2-cresol | -4.721 | 2.287 | -1.825 | -10.510 | -0.242 | 0.330 | 1 | 5 |
| 167 | 2,4-dichloro-6-nitrophenol | -4.745 | 3.066 | -1.578 | -9.880 | -0.240 | 0.334 | 1 | 3 |
| 168 | pentachlorophenol | -5.049 | 4.323 | -0.977 | -9.574 | -0.237 | 0.337 | 1 | 1 |
| 169 | 2,3,5,6-tetrachlorophenol | -5.222 | 3.848 | -0.816 | -9.629 | -0.249 | 0.327 | 1 | 1 |
| 170 | 3,4,5,6-tetrabromo-2-cresol | -5.574 | 4.967 | -0.882 | -9.498 | -0.251 | 0.319 | 1 | 1 |
| 171 | pentabromophenol | -5.664 | 4.853 | -1.193 | -9.684 | -0.246 | 0.338 | 1 | 1 |
| 172 | 2,3,4,5-tetrachlorophenol | -5.712 | 4.058 | -0.725 | -9.456 | -0.252 | 0.326 | 1 | 1 |
| | proelectrophiles | | | | | | | | |
| 173 | 4-acetamidophenol | -2.180 | 0.494 | 0.658 | -8.281 | -0.286 | 0.281 | 2 | 2 |
| 174 | 3-aminophenol | -2.476 | 0.248 | 0.253 | -8.460 | -0.265 | 0.290 | 2 | 1 |
| 175 | 4-aminophenol | -2.924 | 0.248 | 0.557 | -7.957 | -0.273 | 0.282 | 2 | 1 |
| 176 | 2,4-diaminophenol | -3.127 | -0.609 | 0.847 | -7.569 | -0.294 | 0.279 | 3 | 1 |
| 177 | 3-methylcatechol | -3.280 | 1.380 | 0.269 | -8.845 | -0.263 | 0.287 | 2 | 2 |
| 178 | 2-amino-4-(tert)-butylphenol | -3.366 | 2.444 | 0.531 | -8.268 | -0.270 | 0.276 | 2 | 1 |
| 179 | 4-methylcatechol | -3.368 | 1.370 | 0.333 | -8.720 | -0.263 | 0.286 | 2 | 2 |
| 180 | 1,2,4-trihydroxybenzene | -3.439 | 0.210 | 0.133 | -8.620 | -0.266 | 0.295 | 3 | 3 |
| 181 | 5-amino-2-methoxyphenol | -3.450 | 0.148 | 0.391 | -8.390 | -0.276 | 0.286 | 2 | 2 |
| 182 | hydroquinone | -3.473 | 0.590 | 0.233 | -8.734 | -0.261 | 0.290 | 2 | 2 |
| 183 | catechol | -3.752 | 0.880 | 0.297 | -8.885 | -0.261 | 0.290 | 2 | 2 |
| 184 | 5-chloro-2-hydroxyaniline | -3.775 | 1.712 | 0.286 | -8.231 | -0.271 | 0.293 | 2 | 1 |
| 185 | 1,2,3-trihydroxybenzene | -3.850 | 0.210 | 0.269 | -8.963 | -0.265 | 0.294 | 3 | 1 |
| 186 | 6-amino-2,4-dimethylphenol | -3.886 | 1.616 | 0.449 | -8.309 | -0.272 | 0.280 | 2 | 1 |
| 187 | 2-aminophenol | -3.939 | 0.618 | 0.620 | -8.020 | -0.276 | 0.282 | 2 | 1 |
| 188 | 4-chlorocatechol | -4.061 | 1.980 | -0.044 | -8.911 | -0.258 | 0.301 | 2 | 2 |
| 189 | chlorohydroquinone | -4.262 | 1.400 | -0.111 | -8.907 | -0.257 | 0.301 | 2 | 2 |
| 190 | 4-amino-2-cresol | -4.307 | 0.747 | 0.446 | -8.208 | -0.269 | 0.281 | 2 | 1 |
| 191 | trimethylhydroquinone | -4.342 | 1.690 | 0.229 | -8.497 | -0.266 | 0.281 | 2 | 2 |
| 192 | 2,3-dimethylhydroquinone | -4.413 | 1.240 | 0.216 | -8.560 | -0.263 | 0.283 | 2 | 2 |
| 193 | 4-amino-2,3-dimethylphenol | -4.440 | 1.150 | 0.417 | -8.156 | -0.268 | 0.279 | 2 | 1 |
| 194 | bromohydroquinone | -4.678 | 1.780 | -0.210 | -8.937 | -0.254 | 0.302 | 2 | 2 |
| 195 | methylhydroquinone | -4.858 | 0.980 | 0.241 | -8.623 | -0.264 | 0.286 | 2 | 2 |
| 196 | phenylhydroquinone | -5.005 | 2.430 | -0.168 | -8.585 | -0.263 | 0.287 | 2 | 2 |
| 197 | 3,5-di-(tert)-butylcatechol | -5.109 | 4.530 | 0.374 | -8.659 | -0.271 | 0.274 | 2 | 2 |
| 198 | methoxyhydroquinone | -5.202 | 0.470 | 0.229 | -8.581 | -0.269 | 0.290 | 2 | 3 |
Soft electrophiles
no.  name                            log IC50  log Kow   ELUMO    EHOMO  DC-maxE  DC-avN  NHdon  NHacc
                                      (mol/L)             (eV)     (eV)   (1/eV)  (1/eV)
199  3-hydroxy-4-nitrobenzaldehyde     -3.270    1.470  -1.755  -10.152   -0.244   0.323      1      4
200  5-hydroxy-2-nitrobenzaldehyde     -3.330    1.751  -1.458  -10.294   -0.247   0.322      1      4
201  2-amino-4-nitrophenol             -3.475    1.177  -0.882   -8.798   -0.255   0.308      2      3
202  3-nitrophenol                     -3.506    1.854  -1.159   -9.947   -0.247   0.312      1      3
203  4-methyl-2-nitrophenol            -3.571    2.353  -1.141   -9.650   -0.253   0.305      1      3
204  4-hydroxy-3-nitrobenzaldehyde     -3.610    1.470  -1.336  -10.254   -0.249   0.322      1      4
205  4-nitrosophenol                   -3.654    1.358  -0.796   -9.575   -0.264   0.297      1      3
206  2-nitroresorcinol                 -3.658    1.560  -0.758   -9.840   -0.258   0.315      2      4
207  2-nitrophenol                     -3.670    1.854  -1.014   -9.954   -0.254   0.311      1      3
208  4-methyl-3-nitrophenol            -3.740    2.273  -1.107   -9.652   -0.251   0.305      1      3
209  2-chloromethyl-4-nitrophenol      -3.750    2.416  -1.186  -10.144   -0.25    0.316      1      3
210  4-amino-2-nitrophenol             -3.879    0.807  -0.947   -8.968   -0.251   0.310      2      3
211  3-fluoro-4-nitrophenol            -3.935    1.790  -1.285  -10.239   -0.254   0.322      1      3
212  5-fluoro-2-nitrophenol            -4.125    2.086  -1.298  -10.195   -0.254   0.322      1      3
213  4-nitrocatechol                   -4.170    1.660  -1.159   -9.763   -0.247   0.317      2      4
214  2-amino-4-chloro-5-nitrophenol    -4.174    1.804  -0.955   -9.157   -0.256   0.319      2      3
215  4-nitrophenol                     -4.420    1.854  -1.065  -10.072   -0.253   0.312      1      3
216  2-chloro-4-nitrophenol            -4.585    2.326  -1.264  -10.130   -0.246   0.323      1      3
Table 2 (Continued)

Soft electrophiles
no.  name                               log IC50  log Kow   ELUMO    EHOMO  DC-maxE  DC-avN  NHdon  NHacc
                                         (mol/L)             (eV)     (eV)   (1/eV)  (1/eV)
217  4-chloro-6-nitro-3-cresol            -4.638    3.155  -1.193   -9.789   -0.253   0.315      1      3
218  4-nitro-3-(trifluoromethyl)phenol    -4.652    2.770  -1.587  -10.498   -0.243   0.332      1      3
219  3-methyl-4-nitrophenol               -4.729    2.273  -1.007   -9.994   -0.255   0.305      1      3
220  4-chloro-2-nitrophenol               -5.053    2.656  -1.230   -9.871   -0.249   0.322      1      3
a For the model validation in terms of two complementary subsets (see Model Validation section), all odd-numbered compounds are allocated to group 1 and all even-numbered compounds to group 2, respectively. The ciliate toxicity in terms of the logarithmic inhibition concentration 50% (log IC50), the calculated octanol/water partition coefficient in logarithmic form (log Kow), and the hydrogen bond donor and acceptor counts (NHdon and NHacc) were taken from ref 18, and the LUMO and HOMO energies (ELUMO and EHOMO), as well as the maximum donor delocalizability at carbon atoms (DC-maxE), and the average acceptor delocalizability with respect to carbon atoms (DC-avN) were calculated with the quantum chemical AM1 scheme (see the Materials and Methods section).
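The odd/even allocation described in footnote a can be sketched in a few lines. This is a minimal illustration, not the authors' procedure: it assumes the compounds are numbered 1 to 220 with polar narcotics occupying nos. 1 to 153 (only nos. 154 to 220 are visible in this part of Table 2), and the names `MOA_RANGES` and `split_by_parity` are hypothetical.

```python
# Sketch of the complementary-subset split from footnote a of Table 2.
# ASSUMPTION: polar narcotics are nos. 1-153; the other ranges follow the
# compound numbers listed in Table 2 (154-172, 173-198, 199-220).
MOA_RANGES = {
    "polar narcotics": range(1, 154),
    "oxidative uncouplers": range(154, 173),
    "proelectrophiles": range(173, 199),
    "soft electrophiles": range(199, 221),
}


def split_by_parity(moa_ranges):
    """Return (group 1, group 2): odd- and even-numbered compounds per MOA."""
    group1 = {moa: [n for n in nums if n % 2 == 1] for moa, nums in moa_ranges.items()}
    group2 = {moa: [n for n in nums if n % 2 == 0] for moa, nums in moa_ranges.items()}
    return group1, group2


group1, group2 = split_by_parity(MOA_RANGES)
group1_counts = {moa: len(nums) for moa, nums in group1.items()}
group2_counts = {moa: len(nums) for moa, nums in group2.items()}

for moa in MOA_RANGES:
    print(f"{moa}: group 1 = {group1_counts[moa]}, group 2 = {group2_counts[moa]}")
```

Under these assumptions each subset holds 110 compounds, and the four MOAs are represented almost equally in both groups.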
as follows: polar narcotics, -3.42; oxidative uncouplers, -4.36; proelectrophiles, -3.87; and soft electrophiles, -3.82. It shows that a simple discrimination between the four MOAs based on IC50 values is not feasible. Model Development. In contrast to 1-step 4-MOA classification models that showed deficiencies in identifying oxidative uncouplers and proelectrophiles (18), a 2-step approach was selected, starting with an initial separation into two groups followed by a second discrimination into the individual MOA classes. To this end, all six combinations of initially separating two MOAs from two other MOAs were investigated, taking up to five descriptors as discriminating variables. In addition, various options of initially separating one MOA from the other three and subsequently subdividing the 3-MOA group further were screened. For the variable selection, the following strategy was applied. From each of the five descriptor categories (physicochemical characteristics, SA, atomic charge, delocalizability, and CPSA), the most significant variables (as indicated by Fisher F test and p level) for the given discrimination step were selected as starting variables. Then, stepwise inclusion of further variables up to a maximum of five descriptors was performed, following standard statistical criteria and mechanistic reasoning (e.g., combine global and local electronic structure descriptors, charge and delocalizability descriptors, SA and electronic structure, and hydrogen atom and heavy atom descriptors). The best hierarchical classification approach turned out to be the following 2-step procedure: The first step consists of separating polar narcotics and proelectrophiles from oxidative uncouplers and soft electrophiles. This initial subdivision is best accomplished with ELUMO as the only descriptor, leading to a correct discrimination rate of 97% (both LDA and BLR). 
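The first step amounts to a single cutoff on ELUMO. The sketch below illustrates that idea only. The cutoff of -0.85 eV is an assumption, not a value from the paper: the fitted LDA and BLR boundaries are not reported in this section, and the number was picked merely because it is consistent with the first-step misclassifications described later in the Results. The function name `first_step_branch` is hypothetical; the ELUMO values are taken from the text and from Table 2.

```python
# Illustrative first-step rule: one descriptor (ELUMO in eV), one cutoff.
# ASSUMPTION: -0.85 eV is a stand-in cutoff chosen for illustration; it is
# not the boundary fitted by LDA or BLR in the paper.
ELUMO_CUTOFF_EV = -0.85


def first_step_branch(e_lumo_ev, cutoff=ELUMO_CUTOFF_EV):
    """Left branch: polar narcotics / proelectrophiles.
    Right branch: oxidative uncouplers / soft electrophiles."""
    return "right" if e_lumo_ev < cutoff else "left"


# (compound, experimental MOA, ELUMO in eV)
examples = [
    ("hydroquinone", "proelectrophile", 0.233),
    ("pentachlorophenol", "oxidative uncoupler", -0.977),
    ("3,5-dichlorosalicylaldehyde", "polar narcotic", -0.893),
    ("3,5-dibromosalicylaldehyde", "polar narcotic", -1.071),
    ("3,5-diiodosalicylaldehyde", "polar narcotic", -0.901),
    ("2,3,5,6-tetrachlorophenol", "oxidative uncoupler", -0.816),
    ("2,3,4,5-tetrachlorophenol", "oxidative uncoupler", -0.725),
    ("4-nitrosophenol", "soft electrophile", -0.796),
    ("2-nitroresorcinol", "soft electrophile", -0.758),
]

for name, moa, e_lumo in examples:
    print(f"{name} ({moa}): {first_step_branch(e_lumo)} branch")
```

With this stand-in cutoff, the three dihalogenated salicylaldehydes fall into the right branch and the two tetrachlorophenols, 4-nitrosophenol, and 2-nitroresorcinol fall into the left branch, which mirrors the seven first-step errors discussed in the text.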
In the second step, separate discrimination models are applied for the two groups that resulted from the initial separation, leading to the final allocation of the compounds to the four MOAs. For the discrimination between polar narcotics and proelectrophiles, the three descriptors EHOMO, DC-maxE, and NHdon yield a correct classification rate of 99%, and with log Kow and DC-avN, oxidative uncouplers and soft electrophiles can be separated with 98% correct classifications. The relevant squared intercorrelations are 0.61 (EHOMO vs DC-maxE), 0.12 (EHOMO vs NHdon), and 0.07 (DC-maxE vs NHdon) among the first three parameters, and 0.30 between log Kow and DC-avN. In Table 2, the values of all finally selected 2D and 3D descriptors are listed for the 220 phenols. Minor differences between present and previously reported frontier orbital energies are caused by a slightly different geometry optimization
Figure 1. Stepwise classification tree for identifying four MOAs of phenols. Initial separation based on ELUMO is followed by two branches, leading to a final classification as polar narcotics, proelectrophiles (left branch, employing EHOMO, NHdon, and DC-maxE), oxidative uncouplers, and soft electrophiles (right branch, employing log Kow and DC-avN).
procedure. The 2-step decision tree applies for both LDA and BLR and is visualized in Figure 1. LDA and BLR Classification Models. The above-described 2-step approach was calibrated with LDA and BLR. To allow a direct comparison with earlier derived LDA models that discriminate between all four phenol MOAs in one step (18), a three-variable model employing log Kow, pKa, and ELUMO (the previous model with the smallest number of descriptors, now termed CM1, with CM denoting classification model) and a five-variable model based on log Kow, pKa, ELUMO, EHOMO, and NHdon (the previously best performing model, now termed CM2) were recalibrated for the present (slightly corrected) data set. The resultant contingency table statistics for all four models are summarized in Table 3. As can be seen from the first row of that table, the correct classification rate (concordance, defined by eq 6) of the four models varies from 86.8 (CM1) to 95.9% (CM4, 2-step BLR model). Moreover, comparison with κ (eq 7) in the second row of Table 3 shows that correction for agreement by chance leads to overall agreement rates between 71.2 (CM1) and 90.7% (CM3). The influence of chance agreement between predicted and experimental MOA classes can be illustrated by considering the group of 153 polar narcotics (marginal total n•1 = 153, cf. Tables 1 and 4). Because polar narcotics represent 68.9% of the data set under investigation, simply predicting any phenol to act as a polar narcotic would be correct to 68.9% on average. While both the concordance and κ are symmetric measures of agreement (that would, in principle, allow us to exchange rows and columns in the contingency table), λB (eq 8) characterizes the ability of the classifica-
philes based on EHOMO, NHdon, and DC-maxE (left branch of the decision tree as shown in Figure 1). Here, CM2, CM3, and CM4 achieve correct classification rates above 93% for both recognition and prediction. Greater differences in performance between the 1-step and the 2-step approach are seen for oxidative uncouplers and proelectrophiles. The poor CM1 sensitivity for proelectrophiles has been noted earlier (18), and it is now seen that this recognition power (the proportion of proelectrophiles that are predicted as proelectrophiles) is much lower than the corresponding prediction power (the proportion of predicted proelectrophiles that are actually proelectrophiles). Moreover, with proelectrophiles, both the CM2 sensitivity and the CM1 and CM2 predictivity are significantly inferior to the corresponding CM3 and CM4 performances. Interestingly, for oxidative uncouplers, all models except CM1 yield better predictivities than sensitivities. Note also that CM1 and CM2 both contain pKa, which is known as a crucial parameter for these compounds to act as efficient proton shuttles across the inner mitochondrial membrane (7, 8), while the superior 2-step models CM3 and CM4 identify oxidative uncouplers through an initial discrimination based on ELUMO and a subsequent separation from soft electrophiles employing log Kow and DC-avN (right branch in Figure 1). Figure 2 shows the molecular structures of those phenols that are misclassified in the hierarchical decision tree using LDA or BLR. In the first step, three polar narcotic 3,5-dihalogenated salicylaldehydes (halogen = Cl, Br, and I) are wrongly assigned to the right branch, because their ELUMO values of -0.893 (Cl), -1.071 (Br), and -0.901 eV (I) are far below the typical range found for this MOA and actually close to the ELUMO median of phenols acting as soft electrophiles (-1.159 eV). 
In accord with this finding, subsequent application of the right branch CM (calibrated by LDA or BLR) classifies these three polar narcotics wrongly as soft electrophiles. In Table 4, which contains the four contingency tables showing predicted vs experimental MOA classes for CM1 to CM4, these misclassified compounds are located in cell entry n41 = 3 that refers to row 4 (predicted MOA = soft
Table 3. Contingency Table Statistics of Four Classification Models^a

statistical evaluation    CM1     CM2     CM3     CM4
all chemicals
  concordance             0.868   0.900   0.955   0.959
  κ                       0.712   0.797   0.907   0.906
  λB                      0.567   0.657   0.851   0.866
polar narcotics
  sensitivity             0.961   0.935   0.967   0.974
  predictivity            0.891   0.953   0.974   0.974
oxidative uncouplers
  sensitivity             0.842   0.789   0.895   0.895
  predictivity            0.800   0.833   0.944   0.944
proelectrophiles
  sensitivity             0.385   0.808   1.0     1.0
  predictivity            0.769   0.750   0.929   0.963
soft electrophiles
  sensitivity             0.818   0.864   0.864   0.864
  predictivity            0.818   0.792   0.864   0.864

a CM1, LDA model with three variables (log Kow, pKa, and ELUMO); CM2, LDA model with five variables (log Kow, pKa, ELUMO, EHOMO, and NHdon); CM3, LDA-calibrated stepwise decision tree (cf. Figure 1), employing ELUMO in the first step and subsequently EHOMO, DC-maxE, and NHdon for the left branch and log Kow and DC-avN for the right branch, respectively; CM4, BLR-calibrated stepwise decision tree, using the same descriptors as CM3.
tion model to make the correct MOA assignment. More precisely, λB quantifies the reduction in error probability due to application of the classification model when predicting the MOA class for a given compound. According to λB, the correct prediction rate ranges from 56.7 (CM1) to 86.6% (CM4). With all three contingency measures, the presently derived 2-step models CM3 (LDA) and CM4 (BLR) are significantly superior to the earlier 1-step LDA models CM1 and CM2. With respect to the MOA specific performances, Table 3 contains the calculated sensitivities (eq 9) and predictivities (eq 10) for all four classification models. Again, CM3 and CM4 are superior to CM1 and CM2. High classification rates are achieved for polar narcotics, which in the case of the 2-step approach are identified through ELUMO and a subsequent discrimination from proelectro-
Table 4. Two-Dimensional 4 × 4 Contingency Tables of Four Classification Models^a

                                                        experimental category
model  predicted category    polar narcotics  oxidative uncouplers  proelectrophiles  soft electrophiles  total
CM1    polar narcotics                   147                     1                16                   1    165
       oxidative uncouplers                1                    16                 0                   3     20
       proelectrophiles                    3                     0                10                   0     13
       soft electrophiles                  2                     2                 0                  18     22
CM2    polar narcotics                   143                     1                 5                   1    150
       oxidative uncouplers                1                    15                 0                   2     18
       proelectrophiles                    7                     0                21                   0     28
       soft electrophiles                  2                     3                 0                  19     24
CM3    polar narcotics                   148                     2                 0                   2    152
       oxidative uncouplers                0                    17                 0                   1     18
       proelectrophiles                    2                     0                26                   0     28
       soft electrophiles                  3                     0                 0                  19     22
CM4    polar narcotics                   149                     2                 0                   2    153
       oxidative uncouplers                0                    17                 0                   1     18
       proelectrophiles                    1                     0                26                   0     27
       soft electrophiles                  3                     0                 0                  19     22
       total                             153                    19                26                  22    220

a CM1, 1-step 3-variable LDA model; CM2, 1-step 5-variable LDA model; CM3, 2-step (1 + 3 + 2)-variable LDA model; CM4, 2-step (1 + 3 + 2)-variable BLR model (see legend of Table 3 and text).
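The contingency statistics can be recomputed from a single 4 × 4 table. The sketch below does this for CM3. Eqs 6 to 10 are not reproduced in this part of the paper, so the code assumes the standard definitions: the fraction of diagonal entries for the concordance, Cohen's κ, and the Goodman-Kruskal λ for predicting the experimental category (columns) from the predicted one (rows). With these assumptions the CM3 entries of Table 3 are recovered. All function names are hypothetical.

```python
# Contingency statistics for CM3 (rows = predicted, columns = experimental),
# in the order polar narcotics, oxidative uncouplers, proelectrophiles,
# soft electrophiles. Counts are taken from Table 4.
CM3 = [
    [148, 2, 0, 2],
    [0, 17, 0, 1],
    [2, 0, 26, 0],
    [3, 0, 0, 19],
]


def row_totals(t):
    return [sum(row) for row in t]


def col_totals(t):
    return [sum(col) for col in zip(*t)]


def concordance(t):
    """Fraction of compounds on the diagonal (correct classification rate)."""
    n = sum(row_totals(t))
    return sum(t[k][k] for k in range(len(t))) / n


def cohen_kappa(t):
    """Agreement corrected for chance agreement (standard Cohen's kappa)."""
    n = sum(row_totals(t))
    p_obs = concordance(t)
    p_chance = sum(r * c for r, c in zip(row_totals(t), col_totals(t))) / n**2
    return (p_obs - p_chance) / (1 - p_chance)


def lambda_b(t):
    """Goodman-Kruskal lambda: error reduction when predicting the column
    (experimental) category from the row (predicted) category."""
    n = sum(row_totals(t))
    best_without_model = max(col_totals(t))
    best_with_model = sum(max(row) for row in t)
    return (best_with_model - best_without_model) / (n - best_without_model)


def sensitivity(t, k):
    """Proportion of experimental class k that is predicted as class k."""
    return t[k][k] / col_totals(t)[k]


def predictivity(t, k):
    """Proportion of compounds predicted as class k that belong to class k."""
    return t[k][k] / row_totals(t)[k]


print(round(concordance(CM3), 3), round(cohen_kappa(CM3), 3), round(lambda_b(CM3), 3))
print(round(sensitivity(CM3, 2), 3), round(predictivity(CM3, 2), 3))
```

The λB numerator counts how many compounds are assigned correctly beyond what is achieved by always guessing the largest experimental class (153 polar narcotics), which is why it is the strictest of the three overall measures.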
Table 5. Contingency Table Statistics of Four Classification Models for Group 1 and Group 2 Calibration^a
Group 1
Figure 2. Compounds misclassified in CM3 (2-step LDA-based classification model) and CM4 (2-step BLR-based classification model), applying the stepwise classification tree as outlined in Figure 1.
electrophilicity) and column 1 (experimental MOA = polar narcosis) of CM3 and CM4. Moreover, the 2-step decision tree wrongly assigns two oxidative uncouplers (2,3,5,6-tetrachlorophenol and 2,3,4,5-tetrachlorophenol) to the left branch and subsequently to the polar narcosis MOA (n12 = 2 for both CM3 and CM4 in Table 4), which is also the case for two soft electrophiles (4-nitrosophenol and 2-nitroresorcinol; n14 = 2 for CM3 and CM4 in Table 4). In this context, it should be noted that 4-nitrosophenol was the only nitroso derivative in the present set of 220 phenols. In addition, the polar narcotics 4-hydroxy-3-methoxybenzyl alcohol (BLR, LDA) and salicylhydroxamic acid (LDA) are misclassified as proelectrophiles (initial correct ELUMO-based assignment to the left branch of the decision tree is followed by misclassification based on EHOMO, NHdon, and DC-maxE; cf. n31 = 2 and 1 in Table 4 for CM3 and CM4, respectively), and with both LDA and BLR, one soft electrophile, 4-nitro-3-(trifluoromethyl)phenol, is wrongly classified as an oxidative uncoupler (initial correct ELUMO-based assignment to the right branch is followed by misclassification based on log Kow and DC-avN; cf. n24 = 1 in Table 4 for both CM3 and CM4). As can be further seen from Table 4, the above-mentioned poor CM1 sensitivity for proelectrophiles (38.5%) corresponds to 16 such compounds that are wrongly allocated to the polar narcosis MOA (n13 = 16 for CM1 vs experimental marginal total n•3 = 26), while the much better CM1 predictivity for this MOA is seen by the fact that among the 13 phenols predicted as proelectrophiles (marginal total n3• = 13), 10 compounds actually belong to this MOA (n33 = 10 for CM1 in Table 4). Similarly, the CM2 deficiencies for oxidative uncouplers as compared to CM1, CM3, and CM4 are seen through comparison of the n22 entries in Table 4. Model Validation. 
Following the recently introduced approach to perform simulated external prediction with categorical data (18), two subsets that each contain 50% of the compounds and an almost equal representation of the four MOAs were generated by allocating all odd-numbered phenols (that are ordered by MOA and increasing toxicity) to group 1 and all even-numbered phenols to group 2 (cf. Table 2 and Materials and Methods). For all four CMs, the calibration statistics of the separate model trainings on the subsets group 1 and group 2 are summarized in Table 5. Interestingly, CM4
statistical evaluation    CM1     CM2     CM3     CM4
  concordance             0.891   0.918   0.927   0.936
  κ                       0.762   0.825   0.852   0.867
  λB                      0.636   0.727   0.758   0.788
polar narcotics
  sensitivity             0.961   0.974   0.948   0.974
  predictivity            0.914   0.949   0.973   0.974
oxidative uncouplers
  sensitivity             0.889   0.778   0.889   0.778
  predictivity            0.889   0.875   0.727   0.875
proelectrophiles
  sensitivity             0.462   0.692   1.0     1.0
  predictivity            0.750   0.900   0.867   0.867
soft electrophiles
  sensitivity             0.909   0.909   0.727   0.727
  predictivity            0.833   0.769   0.889   0.800

Group 2
statistical evaluation    CM1     CM2     CM3     CM4
  concordance             0.809   0.882   0.982   1.0
  κ                       0.590   0.762   0.963   1.0
  λB                      0.382   0.618   0.941   1.0
polar narcotics
  sensitivity             0.895   0.947   0.987   1.0
  predictivity            0.895   0.960   0.987   1.0
oxidative uncouplers
  sensitivity             0.500   0.600   0.900   1.0
  predictivity            0.556   0.600   1.0     1.0
proelectrophiles
  sensitivity             0.538   0.923   1.0     1.0
  predictivity            0.583   0.800   0.989   1.0
soft electrophiles
  sensitivity             0.818   0.636   1.0     1.0
  predictivity            0.692   0.700   1.0     1.0

a CM1, 1-step 3-variable LDA model; CM2, 1-step 5-variable LDA model; CM3, 2-step (1 + 3 + 2)-variable LDA model; CM4, 2-step (1 + 3 + 2)-variable BLR model (see legend of Table 3 and text).
yields a perfect calibration for group 2 (all contingency coefficients are 1 including the MOA specific sensitivities and predictivities). The overall comparison of group 1 and group 2 statistics shows that CM1 and CM2 yield a better calibration performance for group 1, while CM3 and CM4 perform better for group 2. At the same time, the differences in training statistics between the two subsets are greater for CM1 and CM2 than for CM3 and CM4, and for both group 1 and group 2, CM1 and CM2 yield, on the average, greater differences between MOA specific sensitivities and predictivities than CM3 and CM4. The former is illustrated by the CM1 λB value (0.636 vs 0.382) and the CM2 sensitivity for soft electrophiles (0.909 vs 0.636), and examples for the latter include the CM1 sensitivity and predictivity for proelectrophiles in group 1 (0.462 vs 0.750) and the group 1 performance of CM2 for soft electrophiles (0.909 vs 0.769). Finally, comparison of the contingency statistics across group 1 and group 2 shows that CM3 and CM4 are, on the whole, significantly superior to CM1 and CM2. Application of the submodels to predict the MOA classes of the complementary subset (group 1 model
Table 6. Contingency Table Statistics of Four Classification Models for Group 1 and Group 2 External Prediction^a

Group 1
statistical evaluation    CM1     CM2     CM3     CM4
  concordance             0.873   0.909   0.845   0.909
  κ                       0.725   0.818   0.876   0.824
  λB                      0.576   0.697   0.788   0.697
polar narcotics
  sensitivity             0.948   0.922   0.922   0.896
  predictivity            0.913   0.973   1.0     0.986
oxidative uncouplers
  sensitivity             0.889   1.0     1.0     1.0
  predictivity            0.800   1.0     0.818   0.900
proelectrophiles
  sensitivity             0.462   0.846   1.0     0.923
  predictivity            0.600   0.579   0.929   0.923
soft electrophiles
  sensitivity             0.818   0.818   0.909   0.909
  predictivity            0.900   1.0     0.714   0.588

Group 2
statistical evaluation    CM1     CM2     CM3     CM4
  concordance             0.836   0.855   0.873   0.909
  κ                       0.652   0.700   0.923   0.810
  λB                      0.471   0.529   0.882   0.706
polar narcotics
  sensitivity             0.934   0.934   1.0     0.961
  predictivity            0.888   0.922   0.962   0.936
oxidative uncouplers
  sensitivity             0.700   0.600   0.700   0.600
  predictivity            0.875   0.857   1.0     1.0
proelectrophiles
  sensitivity             0.308   0.538   1.0     0.846
  predictivity            0.667   0.875   1.0     0.786
soft electrophiles
  sensitivity             0.909   0.909   0.909   0.909
  predictivity            0.625   0.556   0.909   0.833

a CM1, 1-step 3-variable LDA model; CM2, 1-step 5-variable LDA model; CM3, 2-step (1 + 3 + 2)-variable LDA model; CM4, 2-step (1 + 3 + 2)-variable BLR model (see legend of Table 3 and text).
predicts group 2 MOA classes and vice versa) yields the external prediction statistics as summarized in Table 6. Again, the CM3 and CM4 performance is, on the whole, more balanced with respect to the two subsets and the four MOAs. When comparing CM2 and CM3, Table 6 reveals that when predicting group 1 MOAs from the group 2 model, there are only two individual cases where CM2 outperforms CM3 (predictivity of proelectrophiles and of oxidative uncouplers). Similarly, CM1 yields better statistics than CM3 only in again two cases (sensitivity for polar narcosis, predictivity for soft electrophiles). In all other cases, CM3 is (partially far) superior to CM1 and CM2 for the external prediction of group 1 and group 2. The concordances in Table 6 would suggest a better external prediction performance for CM4 than for CM3. However, comparison of κ (overall degree of agreement corrected for agreement by chance) and λB (overall prediction power) between CM3 and CM4 indicates that the BLR-based external prediction is indeed inferior to the one based on LDA. This holds true for the compounds from both group 1 (classified by the group 2 model) and
group 2 (classified by the group 1 model). Except for the external predictivity of group 1 oxidative uncouplers (0.818 vs 0.900 in Table 6), CM3 is also superior to CM4 with respect to the MOA specific external sensitivities and predictivities. It follows that among the four CMs under investigation, the 2-step LDA classification model CM3 shows the overall best external prediction performance. Interestingly, the overall CM3 calibration performance (in terms of κ and λB) for the total set is superior to the one with group 1 but inferior to the one with group 2. It indicates that group 1 contains more of those compounds that are misclassified by CM3 than group 2. When comparing calibration with external prediction for group 1 and group 2, the variation in κ and λB is smaller for CM3 than for the other three classification models. It follows that among the different modeling strategies tested (1-step vs 2-step approach, LDA vs BLR), the hierarchical 2-step LDA classification tree appears to be the statistically most robust procedure for the identification and prediction of the phenolic MOAs under investigation. Comparison with Additional Models. For the sake of completeness, we have also analyzed the 6-variable 1-step 4-MOA LDA model that includes all descriptors (ELUMO, EHOMO, NHdon, DC-maxE, log Kow, and DC-avN) employed consecutively in the 2-step classification tree. For the total set of 220 phenols, concordance, κ, and λB are 0.932, 0.861, and 0.776 and thus (as expected) inferior to CM3 and CM4 but superior to CM1 and CM2, and the same picture is achieved when evaluating the group 1 and group 2 calibration and external prediction statistics (results not shown). Note that in the previous investigation confined to 1-step LDA models (18), neither atomic charges nor delocalizabilities nor CPSA descriptors had been considered. 
Finally, we also tested the 2-step classification tree when confining the stepwise variable selection to the descriptors used for CM1 (log Kow, pKa, and ELUMO) and CM2 (log Kow, pKa, ELUMO, EHOMO, and NHdon). Following standard selection criteria (Fisher F test, p level), ELUMO was (of course) selected again as the only discriminating variable for the first step, and for the subsequent left and right branch classification, the following descriptors were identified as statistically significant: CM1 descriptor set, log Kow, pKa, ELUMO (left), log Kow, and ELUMO (right); CM2 descriptor set, ELUMO, EHOMO, NHdon (left), log Kow, and ELUMO (right). The resultant contingency statistics in terms of concordance, κ, and λB for the total set of 220 phenols are 0.900, 0.779, and 0.672 (CM1 descriptor set) and 0.918, 0.831, and 0.731 (CM2 descriptor set), respectively. Again, these and the corresponding group 1 and group 2 statistics for both calibration and external prediction (results not shown) are inferior to CM3 and CM4. Interestingly, the 2-step model based on CM1 descriptors performs almost equally well as the original (but recalibrated) 1-step CM1, while CM2 is inferior to the corresponding 2-step procedure. MOA Specific Relationship between Toxicity and Descriptors. Figure 3 shows the data distributions of log IC50 vs log Kow for all four MOA groups as compared to a regression model of 20 nonpolar narcotics representing baseline toxicity in this bioassay
$$\log \mathrm{IC}_{50}\ (\mathrm{mol/L}) = -0.795\,\log K_{\mathrm{ow}} - 0.831 \qquad (11)$$
Figure 3. Log IC50 (mol/L) vs log Kow for 153 polar narcotics (circles, top left), 19 oxidative uncouplers (squares, top right), 26 proelectrophiles (triangles, bottom left), and 22 soft electrophiles (diamonds, bottom right). In all four plots, the regression line of 20 nonpolar narcotics (log IC50 (mol/L) = -0.795 log Kow - 0.831 (28)) representing baseline toxicity is included for comparison.
(28). As can be seen from the figure, almost all phenols show IC50 values below the baseline-toxicity line (i.e., they are more toxic than predicted for nonpolar narcosis), with only a few exceptions occurring for the group of polar narcotics (top left). Interestingly, the largest scatter and greatest excess toxicities are observed for the group of 26 proelectrophiles (bottom left in Figure 3). Here, methoxyhydroquinone (compound no. 198 in Table 2) and methylhydroquinone (no. 195) show the largest deviations from baseline toxicity according to eq 11 with log IC50 values of -5.202 and -4.858, corresponding to excess toxicities
$$T_{\mathrm{e}} = \frac{\mathrm{IC}_{50}\,(\mathrm{calcd})}{\mathrm{IC}_{50}\,(\mathrm{exp})} \qquad (12)$$
of Te = 9931 (methoxyhydroquinone) and Te = 1770 (methylhydroquinone), respectively. For polar narcotics and oxidative uncouplers, linear regression of log IC50 on log Kow shows reasonable statistics
polar narcotics:
$$\log \mathrm{IC}_{50}\ (\mathrm{mol/L}) = -0.629\,(\pm 0.028)\,\log K_{\mathrm{ow}} - 1.979\,(\pm 0.073) \qquad (13)$$
where n = 153, r² = 0.78, SE = 0.38, and F(1,151) = 520, and
oxidative uncouplers:
$$\log \mathrm{IC}_{50}\ (\mathrm{mol/L}) = -0.643\,(\pm 0.094)\,\log K_{\mathrm{ow}} - 2.554\,(\pm 0.287) \qquad (14)$$
where n = 19, r² = 0.73, SE = 0.44, and F(1,17) = 47.1, while the respective correlations are rather poor for both proelectrophiles (r² = 0.26) and soft electrophiles (r² = 0.39). Moreover, the descriptors selected to best discriminate between polar narcotics and proelectrophiles (left branch in the classification tree of Figure 1: EHOMO, NHdon, and DC-maxE) appear unsuited for modeling the toxic potency of the respective compounds, alone or in combination. When log Kow is included, however, DC-maxE and EHOMO yield slightly improved calibration statistics for polar narcotics as compared to eq 13
polar narcotics:
$$\log \mathrm{IC}_{50}\ (\mathrm{mol/L}) = -0.634\,(\pm 0.026)\,\log K_{\mathrm{ow}} - 21.01\,(\pm 4.48)\,\text{DC-maxE} - 7.508\,(\pm 1.182) \qquad (15)$$
where n = 153, r² = 0.80, SE = 0.36, and F(2,150) = 307, and
$$\log \mathrm{IC}_{50}\ (\mathrm{mol/L}) = -0.649\,(\pm 0.026)\,\log K_{\mathrm{ow}} + 0.469\,(\pm 0.105)\,E_{\mathrm{HOMO}} + 2.350\,(\pm 0.968) \qquad (16)$$
where n = 153, r² = 0.80, SE = 0.36, and F(2,150) = 303, while there is still no reasonable regression equation for the group of 26 proelectrophiles. As regards the 22 soft electrophiles, the descriptors used to discriminate them from oxidative uncouplers (right branch in Figure 1: log Kow, DC-avN) do not yield proper correlations with their toxic potency when considered alone or combined, and for oxidative uncouplers, DC-avN does not contribute significantly to modeling log IC50 alone or in combination with log Kow. A more comprehensive analysis of MOA specific quantitative structure-activity relationships considering all 2D and 3D molecular descriptors as mentioned above is currently in progress and will be reported in due course.
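As a check on the regression and excess-toxicity figures, the following sketch refits log IC50 on log Kow for the 19 oxidative uncouplers using the values listed in Table 2, and evaluates eq 12 with the baseline of eq 11 for the two hydroquinones named in the text. It is an ordinary least-squares calculation written out by hand, not the authors' software; the function names are hypothetical, and the F statistic is computed for 1 and n - 2 = 17 degrees of freedom.

```python
# (log Kow, log IC50) for oxidative uncouplers nos. 154-172 from Table 2.
OXIDATIVE_UNCOUPLERS = [
    (1.588, -2.845), (1.978, -3.266), (1.978, -3.463), (1.788, -3.539),
    (2.736, -3.632), (1.788, -3.950), (1.788, -4.077), (2.068, -4.167),
    (2.287, -4.229), (3.136, -4.356), (2.213, -4.638), (3.716, -4.712),
    (2.287, -4.721), (3.066, -4.745), (4.323, -5.049), (3.848, -5.222),
    (4.967, -5.574), (4.853, -5.664), (4.058, -5.712),
]


def ols(points):
    """Simple linear regression y = slope * x + intercept with r^2 and F."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r2 = sxy**2 / (sxx * syy)
    f_stat = r2 / (1 - r2) * (n - 2)  # F on 1 and n - 2 degrees of freedom
    return slope, intercept, r2, f_stat


def baseline_log_ic50(log_kow):
    """Baseline (nonpolar narcosis) toxicity according to eq 11."""
    return -0.795 * log_kow - 0.831


def excess_toxicity(log_kow, log_ic50_exp):
    """Te = IC50(calcd) / IC50(exp), eq 12, evaluated on the log scale."""
    return 10 ** (baseline_log_ic50(log_kow) - log_ic50_exp)


slope, intercept, r2, f_stat = ols(OXIDATIVE_UNCOUPLERS)
print(f"slope = {slope:.3f}, intercept = {intercept:.3f}, r2 = {r2:.2f}, F = {f_stat:.1f}")

te_methyl = excess_toxicity(0.980, -4.858)   # methylhydroquinone, no. 195
te_methoxy = excess_toxicity(0.470, -5.202)  # methoxyhydroquinone, no. 198
print(f"Te(methylhydroquinone) = {te_methyl:.0f}, Te(methoxyhydroquinone) = {te_methoxy:.0f}")
```

The fit comes out close to the coefficients of eq 14, and the two excess toxicities land near the quoted values of 1770 and 9931; small differences in the last digits are expected because the tabulated log Kow values are rounded.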
Discussion Polar Narcotics and Proelectrophiles. The toxicity profile of phenols covers both nonspecific and specific components and is also linked to the metabolic capacity of the biological system under investigation. For only weakly acidic phenols, the hydrophobicity-dependent accumulation in lipid tissue leads to the polar narcosis syndrome (13). While it is still a matter of debate whether narcosis involves membrane-bound protein target sites or results from the accumulation of xenobiotics in lipid components of the membrane (29), the polar narcosis MOA appears to be related to both molecular hydrophobicity and hydrogen bond donor capacity of the phenols (13, 30). Surprisingly, a recent study of the aquatic toxicity of polar narcotics suggested that polar narcosis is associated with the hydrogen bond acceptor strength, while there was no significant regression with a corresponding hydrogen bond donor parameter (31). Moreover, toxicity decreased with increasing hydrogen bond acceptor strength (31). In our previous 1-step 4-MOA LDA classification model for phenols (18), NHdon and NHacc were tested as two simple counts of structural features associated with hydrogen bond donor or acceptor capacity, and NHdon (but not NHacc) was identified as a parameter with a significant discrimination power. Interestingly, NHdon also appears as a finally selected discriminating variable in the left branch of the 2-step classification tree (Figure 1) where polar narcotics are separated from proelectrophiles. It suggests that in accord with the initial idea (30), a distinct feature of phenols acting as polar narcotics is their combination of hydrophobicity and hydrogen bond donor capacity. Coming back to the left branch of the classification tree, EHOMO and DC-maxE also contribute significantly to the discrimination between polar narcotics and proelectrophiles. 
The HOMO energy is a quantum chemical measure of the ionization potential and thus can be used to model the oxidation potential or electron donation capability of molecules (19). In the present context, EHOMO may thus characterize the readiness of proelectrophiles to undergo oxidative biotransformation as suggested earlier (18). A similar meaning can be attributed to the maximum donor delocalizability confined to carbons, DC-maxE,
which characterizes the susceptibility of phenol carbon atoms to be attacked by (endogenous) electrophiles (19). The overall descriptor pattern is thus in line with actual differences in bioreactivity between polar narcotics and proelectrophiles. Oxidative Uncouplers and Soft Electrophiles. Increasing acidity of phenols promotes their ability to uncouple the respiratory chain from oxidative phosphorylation (ATP synthesis), provided they are moderately hydrophobic and not yet strongly acidic. Under natural conditions, the energy gained through respiration is converted into a proton gradient across the inner mitochondrial membrane, which in turn drives the buildup of ATP from ADP. The hydrophobic character of potent oxidative uncouplers allows them to reach the intracellular mitochondrion and cross its outer and inner membrane, and through their moderate acidity, they remain protonated at the low pH region of the intermembrane space but become deprotonated in the inner matrix where the natural pH is elevated in accord with the above-mentioned proton gradient. Accordingly, the pKa would be expected to discriminate between oxidative uncouplers and phenols acting by other MOAs, considering also that a specific pKa range is required for the phenol to act as a proton shuttle (8). Surprisingly, however, with the previous 1-step LDA discrimination models that included both pKa and log Kow (CM1 and CM2), the identification of oxidative uncouplers was relatively poor (18), and for the presently derived 2-step LDA and BLR classification trees, pKa is not among the most potent discriminating variables. Instead, the separation of uncouplers from soft electrophiles is achieved (after initial discrimination from the group of polar narcotics and proelectrophiles) through log Kow and DC-avN, the average acceptor delocalizability confined to carbon atoms. 
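The pH dependence behind the proton-shuttle argument follows from the Henderson-Hasselbalch relation for a monoprotic acid. The sketch below is a generic illustration of that relation. The pKa values and the two pH values are assumptions chosen for the example, not data from the paper, and `neutral_fraction` is a hypothetical helper.

```python
def neutral_fraction(pka, ph):
    """Fraction of a monoprotic acid present in the protonated (neutral) form,
    from the Henderson-Hasselbalch relation."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


# ASSUMPTION: illustrative pH values for a more acidic and a more alkaline
# compartment; they are placeholders, not measured mitochondrial values.
PH_ACIDIC_SIDE = 7.0
PH_ALKALINE_SIDE = 7.8

# ASSUMPTION: hypothetical pKa values spanning strong to weak phenolic acids.
for pka in (3.0, 6.0, 7.4, 10.0):
    protonated_low = neutral_fraction(pka, PH_ACIDIC_SIDE)
    protonated_high = neutral_fraction(pka, PH_ALKALINE_SIDE)
    print(f"pKa {pka:4.1f}: neutral fraction {protonated_low:.4f} at pH {PH_ACIDIC_SIDE}, "
          f"{protonated_high:.4f} at pH {PH_ALKALINE_SIDE}")
```

In this toy calculation, a very strong acid is almost fully ionized on both sides and a very weak acid is almost fully neutral on both sides, so only intermediate pKa values give appreciable amounts of both forms, which is the qualitative point of the pKa window mentioned in the text.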
The nucleophilic (acceptor) delocalizability of a given molecular site characterizes its susceptibility to attack by a nucleophile (19). Thus, DC-avN quantifies the electrophilic or electron-accepting character of the phenol carbon atoms, which appears to be a reasonable parameter to discriminate soft electrophiles from phenols with different electronic structures and associated MOAs. Interestingly, replacement of log Kow by log Dowu (19) in the right branch of the 2-step classification tree (Figure 1) resulted in a significantly inferior discrimination between oxidative uncouplers and soft electrophiles. With ionogenic compounds such as phenols, log Kow refers to the neutral form and thus ignores that phenolate anions formed through dissociation have a substantially lowered hydrophobicity. By contrast, log Dowu refers to the undissociated compound portion that is actually present at the system pH of interest. As was demonstrated with structure-activity studies of nitroaromatics including nitrophenols (12), the use of log Kow tends to overestimate the narcosis component of the aquatic toxicity of ionogenic compounds and thus is likely to mask toxicity contributions from specific modes of action. At present, it remains unclear why both pKa and log Dowu appear to be less suitable for discriminating oxidative uncouplers from phenols with other MOAs.
As noted above, molecular hydrophobicity and carbon-specific electrophilicity in terms of DC-avN serve to separate soft electrophiles from oxidative uncouplers. The electrophilic MOA is characterized by covalent interactions between the toxicant and the electron-rich endogenous sites of biological macromolecules such as sulfhydryl, amino, and hydroxyl groups (10). While soft electrophiles may directly attack nucleophilic sites of proteins and enzymes, proelectrophiles require metabolic activation before undergoing the respective covalent interactions (10).
Initial Discrimination Based on ELUMO. In the presently derived 2-step classification tree, the initial separation between proelectrophiles and polar narcotics on one hand and soft electrophiles and oxidative uncouplers on the other hand is achieved by ELUMO as the only molecular parameter. The LUMO energy is a measure of the energy gain through uptake of an additional electron (the lower ELUMO, the greater the energy gain) and thus may well serve as a global electrophilicity parameter besides the acceptor delocalizability DN and other quantum chemical parameters (9, 19, 32). The MOA grouping with respect to ELUMO as outlined in Figure 1 thus reflects the greater intrinsic electron affinity of soft electrophiles as compared to proelectrophiles. It shows further that with respect to the ELUMO scale of electrophilicity, polar narcotics and proelectrophiles are less reactive than oxidative uncouplers and soft electrophiles. Interestingly, the respective ELUMO medians of -0.028 (polar narcotics), 0.298 (proelectrophiles), -1.507 (oxidative uncouplers), and -1.163 eV (soft electrophiles) indicate that the greatest global electron affinity is associated with oxidative uncouplers. A possible explanation could be that electron-attracting substituents decrease the electron density in the aromatic ring and at the same time increase the acidity of the phenolic OH group. It would follow that with increasing acidity, the phenol passes an electrophilic window until oxidative uncoupling has become the dominating MOA. With very strong acids (that is, below a certain pKa threshold), however, oxidative uncoupling is no longer feasible as outlined above.
LDA vs BLR.
Breaking down the 4-MOA classification into two consecutive 2-class steps enabled the use of BLR in addition to LDA, the latter having been the only classification method employed in the earlier investigation (18). While LDA is based on the assumption that all explanatory variables follow a normal distribution with equal covariance matrices for all classes to be identified (33), these requirements do not apply to BLR. Accordingly, BLR is more robust from the mathematical viewpoint and would thus appear to be preferred in many practical discrimination tasks. For the present data set of 220 phenols, application of the Kolmogorov-Smirnov statistic with the Lilliefors correction (24) (preferred for more than 50 objects) to the 153 polar narcotics and of the Shapiro-Wilk test (24) (preferred for fewer than 50 objects) to the 19 oxidative uncouplers, 26 proelectrophiles, and 22 soft electrophiles revealed that the finally selected six molecular descriptors (ELUMO, EHOMO, NHdon, DC-maxE, log Kow, and DC-avN) do not, in the majority of cases, fulfill the requirement of normal distribution for each MOA group. Although such deviations from multivariate normality are expected to decrease the predictive power of LDA classification models, the present findings are somewhat different. As regards model calibration for the total data set (Table 3), BLR and LDA perform almost equally well. In the simulated external prediction analysis, however, BLR is slightly superior to LDA in the group 1 and group 2 submodel training, but LDA yields better external
predictions (group 1 MOAs predicted by group 2 model and vice versa) than BLR. As a consequence, the LDA-based 2-step classification model CM3 is the current method of choice. However, the unexpected disadvantage of BLR with regard to prediction performance requires attention in future classification studies.
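The simulated external validation discussed above, in which a model calibrated on one subset predicts the complementary subset and vice versa, can be sketched as follows. This is an illustrative sketch under stated assumptions, not the procedure actually used: it takes a single ELUMO-like descriptor, assumes equal class priors (so that a univariate two-class LDA reduces to a cut at the midpoint of the class means), and uses small synthetic data sets invented for the example. The names `fit_midpoint_lda`, `predict`, and `accuracy` are hypothetical helpers.

```python
from statistics import mean


def fit_midpoint_lda(samples):
    """Univariate two-class LDA with equal priors: the decision boundary is
    the midpoint of the two class means. `samples` is a list of (x, label)
    pairs with labels 0 and 1. Returns (threshold, label_below_threshold)."""
    m0 = mean(x for x, y in samples if y == 0)
    m1 = mean(x for x, y in samples if y == 1)
    threshold = (m0 + m1) / 2.0
    low_label = 0 if m0 < m1 else 1
    return threshold, low_label


def predict(model, x):
    threshold, low_label = model
    return low_label if x < threshold else 1 - low_label


def accuracy(model, samples):
    hits = sum(predict(model, x) == y for x, y in samples)
    return hits / len(samples)


# Synthetic ELUMO-like values in eV (invented for illustration).
# Label 0: polar narcotic / proelectrophile branch.
# Label 1: oxidative uncoupler / soft electrophile branch.
group1 = [(0.25, 0), (-0.05, 0), (0.10, 0), (-1.40, 1), (-1.10, 1), (-0.55, 1)]
group2 = [(0.30, 0), (0.00, 0), (-0.30, 0), (-1.55, 1), (-1.20, 1), (-0.90, 1)]

model1 = fit_midpoint_lda(group1)  # calibrated on group 1
model2 = fit_midpoint_lda(group2)  # calibrated on group 2

print("training accuracy, model 1 on group 1:", accuracy(model1, group1))
print("training accuracy, model 2 on group 2:", accuracy(model2, group2))
print("external accuracy, model 1 on group 2:", accuracy(model1, group2))
print("external accuracy, model 2 on group 1:", accuracy(model2, group1))
```

The point of the sketch is the bookkeeping: training performance and cross-subset (external) performance are evaluated separately, so that a model that fits its own subset well can still be seen to lose accuracy on the complementary subset.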
Conclusions
The presently derived 2-step classification tree is significantly superior to previous 1-step 4-MOA LDA models in discriminating between four phenolic MOAs. Apparently, the stepwise approach allows a more efficient separation between prevalent mechanisms of toxicity as associated with the electronic and geometric structure of chemical compounds. A possible explanation is that the biological activity of chemical agents is usually not restricted to one (prevalent) MOA, which is reflected in correspondingly overlapping property profiles across different MOA classes and which can be better accommodated through a hierarchical decision tree. With regard to the current discussion about narcotic MOAs, the results support the view that a crucial feature of polar narcotics is their hydrogen bond donor capacity. This suggests that, as compared to isohydrophobic nonpolar narcotics, compounds exerting the polar narcosis syndrome are likely to achieve a tighter fixation in the membrane and a correspondingly enhanced toxic potency. Interestingly, polar narcotics are more electrophilic on the global ELUMO scale than proelectrophiles but less electrophilic than soft electrophiles, which in this respect still rank below oxidative uncouplers. The latter suggests that when the acidity of an initially polar narcotic phenol is increased, it may pass a soft electrophilicity window before reaching the characteristics of an oxidative uncoupler. In view of the partially significant deviations of the descriptor values from normal distributions, the almost equal discrimination performance of LDA and BLR is somewhat surprising. In particular, the slight superiority of LDA in the simulated external validation is remarkable, considering the often advocated greater robustness of BLR. So far, we have no explanation at hand for this finding and thus recommend that the performance of LDA and BLR be compared more thoroughly in future classification studies.
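The hierarchical structure of the 2-step classification tree can be summarized in a short routing sketch. All numerical values below are hypothetical placeholders: the ELUMO cut is merely chosen to lie between the reported group medians, and the step-2 weights and intercepts (including their signs) are arbitrary and do not reproduce the fitted LDA or BLR functions. The assignment of EHOMO, NHdon, and DC-maxE to the left branch is inferred from the descriptor list given in the text.

```python
from typing import Callable, Dict

Descriptors = Dict[str, float]

# Hypothetical step-1 cut in eV (assumption; not the fitted value).
ELUMO_CUT_EV = -0.6


def make_linear_rule(weights: Dict[str, float], intercept: float,
                     positive: str, negative: str) -> Callable[[Descriptors], str]:
    """Build a two-class rule from a linear decision function, the general
    form shared by LDA and BLR. Returns `positive` if the score is > 0."""
    def rule(d: Descriptors) -> str:
        score = intercept + sum(w * d[name] for name, w in weights.items())
        return positive if score > 0.0 else negative
    return rule


# Step-2 rules with PLACEHOLDER weights (arbitrary values and signs).
LEFT_RULE = make_linear_rule(
    {"EHOMO": 1.0, "NHdon": -1.0, "DC-maxE": 1.0}, 0.0,
    positive="proelectrophile", negative="polar narcotic")
RIGHT_RULE = make_linear_rule(
    {"logKow": 1.0, "DC-avN": -1.0}, 0.0,
    positive="oxidative uncoupler", negative="soft electrophile")


def classify(d: Descriptors) -> str:
    """Step 1 routes on ELUMO alone; step 2 applies the branch-specific rule."""
    if d["ELUMO"] > ELUMO_CUT_EV:
        return LEFT_RULE(d)   # polar narcotics vs proelectrophiles
    return RIGHT_RULE(d)      # oxidative uncouplers vs soft electrophiles


# Invented descriptor values for two hypothetical compounds.
example_left = {"ELUMO": 0.3, "EHOMO": -9.0, "NHdon": 1.0, "DC-maxE": 0.3}
example_right = {"ELUMO": -1.5, "logKow": 3.0, "DC-avN": 0.5}
print(classify(example_left))
print(classify(example_right))
```

Because each compound passes through only one of the two step-2 rules, each branch needs only the descriptors relevant to the two classes it separates, which is one way a hierarchical tree can accommodate overlapping property profiles.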
Acknowledgment. This work was supported in part by the European Union IMAGETOX Research Training Network, HPRN-CT-1999-00015.
References
(1) Morton, L. W., Caccetta, R. A., Puddey, I. B., and Croft, K. D. (2000) Chemistry and biological effects of dietary phenolic compounds: Relevance to cardiovascular disease. Clin. Exp. Pharmacol. Physiol. 27, 152-159.
(2) Fukuhara, K., Nakanishi, I., Shimada, T., Ohkubo, K., Miyazaki, K., Hakamata, W., Urano, S., Ozawa, T., Okuda, H., Miyata, N., Ikota, N., and Fukuzumi, S. (2003) A planar catechin analogue as a promising antioxidant with reduced prooxidant activity. Chem. Res. Toxicol. 16, 81-86.
(3) Routledge, E. J., Parker, J., Odum, J., Ashby, J., and Sumpter, J. P. (1998) Some alkyl hydroxy benzoate preservatives (parabens) are estrogenic. Toxicol. Appl. Pharmacol. 153, 12-19.
(4) Miller, D., Wheals, B. B., Beresford, N., and Sumpter, J. P. (2001) Estrogenic activity of phenolic additives determined by an in vitro yeast bioassay. Environ. Health Perspect. 109, 133-138.
(5) Garg, R., Kapur, S., and Hansch, C. (2001) Radical toxicity of phenols: A reference point for obtaining perspective in the formulation of QSAR. Med. Res. Rev. 21, 73-82.
(6) Nichkova, M., Galve, R., and Marco, M.-P. (2002) Biological monitoring of 2,4,5-trichlorophenol (II): Evaluation of an enzyme-linked immunosorbent assay for the analysis of water, urine, and serum samples. Chem. Res. Toxicol. 15, 1371-1379.
(7) Terada, H. (1990) Uncouplers of oxidative phosphorylation. Environ. Health Perspect. 87, 213-218.
(8) Schüürmann, G., Somashekar, R. K., and Kristen, U. (1996) Structure-activity relationships for chloro- and nitrophenol toxicity in the pollen tube growth test. Environ. Toxicol. Chem. 15, 1702-1708.
(9) Cronin, M. T. D., Manga, N., Seward, J. R., Sinks, G. D., and Schultz, T. W. (2001) Parametrization of electrophilicity for the prediction of the toxicity of aromatic compounds. Chem. Res. Toxicol. 14, 1498-1505.
(10) Lipnick, R. L. (1990) Outliers: Their origin and use in the classification of molecular mechanisms of toxicity. Sci. Total Environ. 109/110, 131-153.
(11) Hubert, T. D., Jeffrey, A. B., Vue, C., and Gingerich, W. H. (1999) Uptake, metabolism and elimination of TFM by fish. In Xenobiotics in Fish (Smith, D. J., Gingerich, W. H., and Beconi-Barker, M. G., Eds.) pp 177-187, Kluwer Academic/Plenum Publishers, New York.
(12) Schmitt, H., Altenburger, R., Jastorff, B., and Schüürmann, G. (2000) Quantitative structure-activity analysis of the algae toxicity of nitroaromatic compounds. Chem. Res. Toxicol. 13, 441-450.
(13) Bradbury, S. P., Henry, T. R., Niemi, G. J., Carlson, R. W., and Snarski, V. M. (1989) Use of respiratory-cardiovascular responses of rainbow trout (Oncorhynchus mykiss) in identifying acute toxicity syndromes in fish. Part 3: Polar narcosis. Environ. Toxicol. Chem. 8, 247-261.
(14) Kaufman, R. D. (1977) Biophysical mechanisms of anesthetic action: Historical perspectives and review of current theories. Anesthesiology 46, 49-62.
(15) Russom, C. L., Bradbury, S. P., Broderius, S. J., Hammermeister, D. E., and Drummond, R. A. (1997) Predicting modes of toxic action from chemical structure: Acute toxicity in the fathead minnow (Pimephales promelas). Environ. Toxicol. Chem. 16, 948-967.
(16) Aptula, A. O., Kühne, R., Ebert, R.-U., Cronin, M. T. D., Netzeva, T. I., and Schüürmann, G. (2003) Modeling discrimination between antibacterial and nonantibacterial activity based on 3D molecular descriptors. QSAR Comb. Sci. 22, 113-128.
(17) Schultz, T. W., Sinks, G. D., and Cronin, M. T. D. (1997) Identification of mechanisms of toxic action of phenols to Tetrahymena pyriformis from molecular descriptors. In Quantitative Structure-Activity Relationships in Environmental Sciences-VII (Chen, F., and Schüürmann, G., Eds.) pp 329-342, SETAC Press, Pensacola, FL.
(18) Aptula, A. O., Netzeva, T. I., Valkova, I. V., Cronin, M. T. D., Schultz, T. W., Kühne, R., and Schüürmann, G. (2002) Multivariate discrimination between modes of toxic action of phenols. Quant. Struct.-Act. Relat. 21, 12-22.
(19) Schüürmann, G. (1998) Ecotoxic modes of action of chemical substances. In Ecotoxicology (Schüürmann, G., and Markert, B., Eds.) pp 665-749, John Wiley and Spektrum Akademischer Verlag, New York.
(20) MOPAC 93, 2nd revision (1994) Fujitsu Limited, 9-3, Nakase 1-Chome, Mihama-ku, Chiba-city, Chiba 261, Japan, and Stewart Computational Chemistry, 15210 Paddington Circle, Colorado Springs, CO 80921.
(21) Smith, G. (1985) MOLSV, QCPE program 509.
(22) Stanton, D. T., and Jurs, P. C. (1990) Development and use of charged partial surface area structural descriptors in computer-assisted quantitative structure-property relationships. Anal. Chem. 62, 2323-2329.
(23) STATISTICA for Windows, '99 ed. (1999) Statsoft Inc., Tulsa, OK.
(24) SPSS for Windows, Release 10.0 (1999) SPSS Inc., Chicago, IL.
(25) Hartung, J., Elpelt, B., and Klösener, K.-H. (2002) Statistik, 13th ed., 975 pp, Oldenbourg Verlag, München, Germany.
(26) Cohen, J. (1960) A coefficient of agreement for nominal scales. Educ. Psychol. Measure. 20, 37-46.
(27) Cooper, J. A., II, Saracci, R., and Cole, P. (1979) Describing the validity of carcinogen screening tests. Br. J. Cancer 39, 87-89.
(28) Schultz, T. W., Wyatt, N. L., and Lin, D. T. (1990) Structure-toxicity relationships for nonpolar narcotics: A comparison of data from Tetrahymena, Photobacterium and Pimephales systems. Bull. Environ. Contam. Toxicol. 44, 67-72.
(29) Chaisuksant, Y., Yu, Q., and Connell, D. W. (1999) The internal critical level concept of nonspecific toxicity. Rev. Environ. Contam. Toxicol. 162, 1-41.
(30) Veith, G. D., and Broderius, S. J. (1987) Structure-toxicity relationships for industrial chemicals causing type(II) narcosis syndrome. In QSAR in Environmental Toxicology-II (Kaiser, K. L. E., Ed.) pp 385-391, D. Reidel Publishing Company, Dordrecht, The Netherlands.
(31) Dearden, J. C., Cronin, M. T. D., Zhao, Y.-H., and Raevsky, O. A. (2000) QSAR studies of compounds acting by polar and non-polar narcosis: an examination of the role of polarisability and hydrogen bonding. Quant. Struct.-Act. Relat. 19, 3-9.
(32) Veith, G. D., and Mekenyan, O. G. (1993) A QSAR approach for estimating the aquatic toxicity of soft electrophiles (QSARs for soft electrophiles). Quant. Struct.-Act. Relat. 12, 349-356.
(33) Hartung, J., and Elpelt, B. (1999) Multivariate Statistik, 6th ed., 815 pp, Oldenbourg Verlag, München, Germany.
TX0340504