Identifying the Structural Requirements for Chromosomal Aberration

Chem. Res. Toxicol. 2007, 20, 1927–1941


Identifying the Structural Requirements for Chromosomal Aberration by Incorporating Molecular Flexibility and Metabolic Activation of Chemicals

Ovanes Mekenyan,*,† Milen Todorov,† Rossitsa Serafimova,† Stoyanka Stoeva,† Aynur Aptula,‡ Robert Finking,§ and Elard Jacob§

Laboratory of Mathematical Chemistry, Bourgas As. Zlatarov University, 8010 Bourgas, Bulgaria, Safety Environmental Assurance Centre (SEAC), Unilever Colworth, Colworth House, Sharnbrook, Bedford MK44 1LQ, U.K., and Department of Product Safety, Regulations, Toxicology and Ecology, BASF Aktiengesellschaft, D-67056 Ludwigshafen, Germany

Received July 10, 2007

Modeling the potential of chemicals to induce chromosomal damage has been hampered by the diversity of mechanisms which condition this biological effect. The direct binding of a chemical to DNA is one of the underlying mechanisms that is also responsible for bacterial mutagenicity. Disturbance of DNA synthesis due to inhibition of topoisomerases and interaction of chemicals with nuclear proteins associated with DNA (e.g., histone proteins) were identified as additional mechanisms leading to chromosomal aberrations (CA). A comparative analysis of in vitro genotoxic data for a large number of chemicals revealed that more than 80% of chemicals that elicit bacterial mutagenicity (as indicated by the Ames test) also induce CA; alternatively, only 60% of chemicals that induce CA have been found to be active in the Ames test. In agreement with this relationship, a battery of models is developed for modeling CA. It combines the Ames model for bacterial mutagenicity, which has already been derived and integrated into the Optimized Approach Based on Structural Indices Set (OASIS) tissue metabolic simulator (TIMES) platform, and a newly derived model accounting for additional mechanisms leading to CA. Both models are based on the classical concept of reactive alerts. Some of the specified alerts interact directly with DNA or nuclear proteins, whereas others are applied in a combination of two- or three-dimensional quantitative structure-activity relationship models assessing the degree of activation of the alerts from the rest of the molecules. The use of each of the alerts has been justified by a mechanistic interpretation of the interaction. In combination with a rat liver S9 metabolism simulator, the model explained the CA induced by metabolically activated chemicals that do not elicit activity in the parent form. The model can be applied in two ways: with and without metabolic activation of chemicals. 
Introduction The assessment of genotoxic potential is a critical point for approval and registration of new chemicals and drugs. Since no single test is capable of detecting all relevant genotoxic end points, a battery of in vitro and in vivo tests for genotoxicity is recommended (1). Computational programs used for genotoxicity prediction mainly focus on predicting the outcome of the Ames test as an indicator of mutagenicity, and relatively good predictive accuracies can be reached for this end point (2–4). The good performance of the model for the Ames test is based on the abundance of available data for this test system as well as on the fact that most of the molecular mechanisms underlying this genetic end point are fairly well understood and can be directly related to the chemical structure (5). The process of model development is more complex for the second in vitro genotoxicity test, which detects chromosomal aberrations (CA).¹

* To whom correspondence should be addressed. Tel: ++359 56 880230. Fax: ++359 56 880230. E-mail: [email protected].
† Bourgas As. Zlatarov University.
‡ Unilever Colworth.
§ BASF Aktiengesellschaft.
¹ Abbreviations: CA, chromosomal aberrations; CAS, Chemical Abstracts Service; CASE, Computer Automated Structure Evolution; COREPA, common reactivity pattern; CPSA, charged partial surface area; Diammin, minimum diameter (the minimum distance between two parallel planes circumscribing the molecule); EGAP, EHOMO - ELUMO; EHOMO, energy of the highest occupied molecular orbital; ELUMO, energy of the lowest unoccupied molecular orbital; EN, electronegativity; fHOMO, frontier atomic charge of highest occupied molecular orbital; fLUMO, frontier atomic charge of lowest unoccupied molecular orbital; GA, genetic algorithm; Lmax, greatest interatomic distance; LogKow, octanol/water partition coefficient; ML, machine learning; Mol. weight, molecular weight; NTP, National Toxicology Program; OASIS, Optimized Approach Based on Structural Indices Set; PNSA, partial negative surface area; PPSA, partial positive surface area; qi, atomic charge on species i; QSAR, quantitative structure–activity relationship; SEi, donor superdelocalizability of species i; SMILES, simplified molecular input line entry specification; SNi, acceptor superdelocalizability of species i; TIMES, tissue metabolic simulator; TOPS-MODE, Topological Sub-Structural Molecular Design; VolP, volume polarizability.

This test is experimentally standardized to a lesser extent than the Ames test, and publicly available experimental data for it are significantly less abundant than data for the Ames test. Moreover, in addition to the direct interaction of chemicals with DNA (6), other mechanisms could cause structural CA, such as the interactions of chemicals with enzymes involved in DNA replication and transcription (7) and with other nuclear proteins involved in chromosome segregation (e.g., histone proteins) (8). Chromosomal aberrations become observable as a result of the structural properties of chromosomes in the metaphase and can be registered as structural or numerical aberrations. In spite of all these considerations, a variety of computational approaches have been used in attempts to model the potential of chemicals

10.1021/tx700249q CCC: $37.00 © 2007 American Chemical Society Published on Web 12/04/2007


Figure 1. Illustration of the conformer distributions of two chemicals, 4,4′-dimethoxydiphenylamine (CAS number 101-70-2) and N,N′-di-sec-butyl-p-phenylenediamine (CAS number 101-96-2), with respect to EGAP. The overlap between the conformer distributions [assessed by the Hellinger distance (35)] is used in the COREPA method to evaluate the similarity between the chemicals with respect to EGAP.

to cause CA. A large group of classification schemes are statistically based and are amenable to analysis by so-called machine learning (ML) models. These models are based on generation of a large number of molecular descriptors (structural indices) encoding chemical structures, selection of indices encoding significant structural information (i.e., elimination of redundant information), and construction of a classification model using the identified significant descriptors. The molecular descriptors reflect the molecular topologies, geometries, and electronic structures of chemicals that have been studied. Different feature-selection approaches, such as a genetic algorithm (GA) known as the method of variable importance (9), are used to specify the most discriminative indices. A number of classification models are used here (11–14). Despite the respectable statistical performance of ML, these models have no prior knowledge of a mechanism of action. This kind of model can lend support to a particular mechanism of action but in general cannot propose or contradict such a mechanism. Another statistically driven method used for predicting genotoxicity is the Computer Automated Structure Evolution (CASE) system (15). Recently, the CASE system was employed in order to develop a model for CA for the use of the Danish Environmental Protection Agency (Danish EPA, www.mst.dk). Details related to model generation are provided elsewhere (16). This system detects fragments (physicochemical parameters) that act as modulators of the modeled effect. The model for CA was trained on approximately 500 compounds taken from the Ishidate data collection (17) and showed relatively poor performance with respect to sensitivity² (≤60%) and specificity³ (82%). The notably poor sensitivity indicates that the underlying two-dimensional (2D)-fragment-based descriptors do not describe the multitude of mechanisms leading to a positive result in the CA test.
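The two performance measures quoted for the CASE model can be made concrete with a short sketch. It uses the standard definitions (sensitivity as the fraction of experimentally active chemicals predicted active, specificity as the fraction of inactive chemicals predicted inactive). The function name and the toy labels are assumptions made for illustration; they are not data from the study.

```python
def sensitivity_specificity(observed, predicted):
    """Return (sensitivity, specificity) for paired boolean labels.

    observed[i]  -- True if chemical i is experimentally CA-positive
    predicted[i] -- True if the model predicts chemical i to be CA-positive
    """
    pairs = list(zip(observed, predicted))
    tp = sum(1 for o, p in pairs if o and p)          # true positives
    fn = sum(1 for o, p in pairs if o and not p)      # false negatives
    tn = sum(1 for o, p in pairs if not o and not p)  # true negatives
    fp = sum(1 for o, p in pairs if not o and p)      # false positives
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical toy labels: 5 active and 5 inactive chemicals.
observed = [True] * 5 + [False] * 5
predicted = [True, True, True, False, False,
             False, False, False, False, True]

sens, spec = sensitivity_specificity(observed, predicted)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

A model with low sensitivity but high specificity, as in this toy case, misses many active chemicals while rarely flagging inactive ones.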
² Sensitivity: ability to detect active substances with a minimum of false-negative predictions.
³ Specificity: ability to discriminate against false-positive predictions.

An attempt has been made by others to identify structural alert rules that facilitate identification of classes of chemicals with CA activity (18). Instead of basing a direct identification of alerts on the reactivities of the chemicals, the authors used information coded in their molecular structures. For this purpose, a classification model called Topological Sub-Structural Molecular Design (TOPS-MODE), which calculates the contribution of each part of a molecule to the activity, was developed. TOPS-MODE has a substructural nature, allowing this quantitative structure–activity relationship (QSAR) model to be considered as an incremental scheme in which the end-point value is calculated as an additive sum of the bond contributions. The approach was used to identify the regions making significant contributions to the clastogenic activity of a set of chemicals, which in turn made possible the generation of 22 rules containing alerts potentially responsible for CA induced by these chemicals. These rules permit the identification of certain classes of chemicals with a potential CA effect and can be implemented in expert systems (i.e., without ab initio QSAR calculations).
Deductive Estimation of Risk from Existing Knowledge (DEREK) (19, 20) is an expert system using a mechanistic rules-based approach. The system applies “If-Then-Else” rules extracted by human experts on the basis of qualitative associations with chemical structures. In a recent upgrade, a module capable of predicting the potential of a chemical to induce CA (21) was added to the modules for predicting mutagenicity and carcinogenicity. Approximately 100 prototype alerts derived from expert suggestions and the analysis of several collections of in vitro CA test data have been proposed. No information concerning the performance of this model has been found in the literature.
Analysis of the reviewed models for predicting CA showed that although these models turned out to be useful for screening purposes, they have the following deficiencies. First, the majority of the models have a statistical character, which does not allow explicit identification of structural alerts on a purely mechanistic basis. The identification of alerts in some of the models is not subsequently associated with the multitude of mechanisms which could cause CA. Second, no identification of mechanisms which are specific for clastogenic activity has been done.
Thus, almost all of the identified fragments represent known structural alerts associated with DNA reactivity; no alerts that result in protein binding have been defined. Finally, in the training sets used for deriving the models, no distinction has been made between chemicals eliciting activity as parent structures and those requiring metabolic activation. In all the experimental studies, the data with and without metabolic activation were included in the same class of biologically active chemicals. Hence, metabolic activation of the chemicals has not been simulated in the models derived so far. In view of this, the goal of the present work was to develop a method for predicting CA which combines a previously derived model for covalent interaction with DNA (i.e., a model for bacterial mutagenicity) (22, 23) with a newly derived model


Figure 2. Correlation between Ames mutagenicity and chromosomal aberrations. Tests for both effects have been carried out on 662 chemicals listed in the OASIS mutagenicity database. (a) About 80% of the mutagenic chemicals were found to also cause CA, whereas (b) only 60% of the chemicals which cause CA are Ames-positive.

accounting for additional interaction mechanisms with proteins that lead to clastogenic activity. The new model is based on the classical concept of reactive alerts. However, the use of each alert has been justified by a mechanistic interpretation of the interaction. In addition, the tissue metabolic simulator (TIMES) model (22, 23) is used to predict the metabolic activation of chemicals to elicit CA.

Materials and Methods CA Test Data Information. Two training sets were obtained from a Danish EPA inventory of chemicals containing data for chromosomal aberrations (24). All tests were performed using a Chinese hamster lung cell fibroblast cell line that has been kept as a single-cell subclone since 1973. Test results for a total of 901 substances are presented in the Data Book of Chromosomal Aberration Test In Vitro (17), but only a subset of these chemicals was used to form the training sets for the current study. Many of the chemicals from the original set fall into the unknown or variable composition or biological material (UVCB) class, which includes chemicals that cannot be represented by a definite structural diagram and/or a specific molecular formula. These were excluded for the obvious reason that it is impossible to model a chemical with an undefined chemical structure. Inorganic chemicals, stereoisomers, chemicals with equivocal activity data, and chemicals considered as false positives by Niemela and Wedebye (24) on the basis of expert evaluation were also excluded. The remaining 538 chemicals from the initial list were grouped into two training sets. One of these training sets included 497 chemicals for which data on induced CA without S9 activation were provided; it was used to derive a CA model that did not account for metabolic activation of chemicals. The other training set included 162 chemicals for which

data on induced CA in the presence of S9 activation were provided; it was used for modeling CA with metabolic activation of chemicals. There were 121 chemicals that were members of both of these sets. Moreover, comparative analysis of the two sets showed that 23 chemicals were found to be CA-positive with and without S9 activation, whereas for 41 chemicals, information about CA activity was provided only for metabolically activated forms and not for the nonmetabolized parents. All of this caused difficulties in assessing the performance of the system with regard to the absence or presence of metabolic activation. The Chemical Abstracts Service (CAS) numbers, chemical names, simplified molecular input line entry specifications (SMILES), and CA activities of all chemicals used in this study can be found in Tables 1 and 2 of Appendix I in the Supporting Information. Modeling Methodology. Conformational Flexibility of Chemicals. On the basis of thermodynamic and kinetic considerations, it has been shown that molecules at macromolecular binding sites can adopt conformations that are substantially different from those of isolated, lowest-energy, or crystal-phase states (25). Individual conformers having a free energy higher than that of the lowest-energy structure by ≲20 kcal/mol (usually accepted as a threshold) exhibit significant variation in potentially relevant electronic descriptors. The observation that relatively small differences in conformer energies can result in significant variations in electronic structure highlighted the necessity of including all energetically reasonable conformers when defining common reactivity patterns (26, 27). The recently developed nondeterministic GA-based method for coverage of the conformational space by a limited number of conformers (28) was applied to generate conformers for the QSAR analysis.
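The idea of retaining all energetically reasonable conformers can be illustrated with a small filter. This is only a sketch under stated assumptions: conformer energies are supplied as a plain list of numbers in kcal/mol, the 20 kcal/mol window is the threshold mentioned in the text, and the function name is hypothetical. It does not reproduce the OASIS conformer generator.

```python
def filter_conformers(energies, window=20.0):
    """Return indices of conformers lying within `window` kcal/mol
    of the lowest-energy conformer.

    energies -- list of conformer energies (kcal/mol), one per conformer
    window   -- energy window above the minimum (assumed user-defined)
    """
    if not energies:
        return []
    lowest = min(energies)
    return [i for i, e in enumerate(energies) if e - lowest <= window]


# Hypothetical energies for four conformers of one molecule.
energies = [-35.0, -30.0, -10.0, -15.0]
kept = filter_conformers(energies)
print(kept)  # conformer 2 lies 25 kcal/mol above the minimum and is dropped
```

All retained conformers, not only the global minimum, would then contribute descriptor values to the subsequent analysis.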
Because GAs generate random candidates for further selection, thereby leading to a nonreproducibility of conformers and their distribution in structural space, a new procedure for the saturation of a conformational space was developed (29). The goal of the saturation is to represent the conformational space of a molecule with an optimal number of conformers, providing stable conformational distributions across selected molecular descriptors that are no longer perturbed by adding new conformers. Such conformer distributions are expected to eventually provide reliable reactivity patterns (see the discussion below concerning the COREPA approach). Each of the generated conformations is subjected to a quantum-chemical geometry-optimization procedure that results in a minimum-energy state at the enthalpy level under consideration.⁴ Next, the conformers are screened to eliminate those whose standard heats of formation, ∆Hf°, exceed that of the lowest-energy conformer by more than a user-defined threshold (20 kcal/mol). Subsequently, conformational degeneracy due to molecular symmetry and geometry convergence is detected within a user-defined torsion-angle resolution. Molecular Descriptors. A variety of mechanistically sound molecular descriptors have been used in the Optimized Approach Based on Structural Indices Set (OASIS) software to assess receptor binding interactions (26, 27). In a preceding work (32) we have also tried to classify these descriptors on the basis of their abilities to describe toxic end points with different specificities. The effects mediated by DNA interaction could be characterized with lower specificity.
Consequently, mutagenicity should be modeled by the set of traditionally used reactivity parameters, encompassing among others the energies of the lowest unoccupied and highest occupied molecular orbitals (ELUMO and EHOMO, respectively), which assess global electrophilicity and nucleophilicity of molecules, respectively; the difference between ELUMO and EHOMO (EGAP), which is a measure of molecular reactivity; electronegativity (EN), given by (EHOMO + ELUMO)/2; dipole moment (µ); and volume polarizability (VolP), which measures the average ability of a molecule to change electron density at its atoms during chemical interactions. Additional descriptors are the degree of stretching or compactness
⁴ Usually, MOPAC 93 (30, 31) is employed, making use of the AM1 Hamiltonian.


Table 1. Identified Alerting Groups for Interactions with DNA and the Associated Parameter Boundaries Used for Building a Bacterial Mutagenicity Model Based on the NTP and BASF Databases


(quantified as the sum of interatomic steric distances, Geom. Wiener); the greatest interatomic distance (Lmax); planarity (the normalized sum of the torsion angles in a molecule); the van der Waals surface; and the solvent-accessible surface, which is calculated using the Connolly algorithm (33). The local specificity of molecular structure was also described by making use of atomic charges (qi); the frontier atomic charges of the highest occupied and lowest unoccupied molecular orbitals (fHOMO and fLUMO, respectively); donor and acceptor superdelocalizabilities (SEi and SNi, respectively); and charged partial surface areas (CPSAs), which were introduced by Stanton and Jurs (34) and include, among others, partial positive surface area (PPSA) and partial negative surface area (PNSA). Less specific molecular descriptors have also been used in order to describe receptor-mediated effects, such as water solubility and octanol-water partition coefficient (LogKow), which are important for bioavailability-related factors conditioning the effect (e.g., penetration, diffusion). Basic Principles of the COREPA Method. The common reactivity pattern (COREPA) method is a probabilistic classification scheme identifying criteria which classify an unknown object into predefined classes on the basis of a training set of objects from multiple classes. The COREPA formalism uses a Bayesian probabilistic method to identify common structural characteristics among chemicals that elicit similar biological activity. The important consequence of this approach is that instead of comparing single conformational representations of the chemicals, it analyzes and compares their probabilistic conformational distributions in the molecular descriptor space, thus accounting for the molecular flexibilities of the chemicals. The common reactivity pattern is developed by seeking overlap between the conformer distributions of biologically similar chemicals in the specific structural space (Figure 1).
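The comparison of conformer distributions along a single descriptor can be sketched as follows. The Figure 1 caption states that overlap is assessed by the Hellinger distance; the sketch below computes that distance between two normalized histograms. The histogram binning, the function names, and the EGAP values are assumptions made for illustration, and the density estimation used by COREPA itself may differ.

```python
import math


def histogram(values, lo, hi, bins):
    """Normalized histogram of descriptor values over [lo, hi]."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        if lo <= v <= hi:
            # Values equal to `hi` are placed in the last bin.
            idx = min(int((v - lo) / width), bins - 1)
            counts[idx] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * bins


def hellinger(p, q):
    """Hellinger distance between two discrete distributions.

    0 means identical distributions; 1 means no overlap at all.
    """
    return math.sqrt(0.5 * sum((math.sqrt(a) - math.sqrt(b)) ** 2
                               for a, b in zip(p, q)))


# Hypothetical EGAP values (eV) for the conformers of two chemicals.
egap_a = [8.1, 8.2, 8.3, 8.4]
egap_b = [8.3, 8.4, 9.0, 9.1]

p = histogram(egap_a, lo=8.0, hi=9.5, bins=3)
q = histogram(egap_b, lo=8.0, hi=9.5, bins=3)
print(f"Hellinger distance = {hellinger(p, q):.3f}")
```

A small distance indicates that the two chemicals populate the same region of the descriptor axis, which is the kind of similarity the common reactivity pattern is built from.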
Thus, the COREPA method circumvents the problem of structure alignment encountered in the method traditionally used for similarity assessments by overlapping and comparing the conformational distributions of chemicals across the descriptor axis. Though the latter approach does not require the alignment of structures, it allows the common reactivity pattern to be identified. The common reactivity pattern consists of a structural subspace populated mainly by the conformers of chemicals with similar biological activities. The COREPA algorithm consists of three steps. In Step 1, two sets of chemicals are selected as training sets. The first set consists of chemicals having activities above a user-defined high-activity threshold, while the second set includes chemicals having activities below a predetermined nonactive threshold. In Step 2, a set of parameters associated with biological activity is established by evaluating the degree (%) of overlap between the distributions associated with those thresholds. The stereoelectronic parameters

that provide the maximal measure of similarity among chemicals in the training sets containing active and inactive chemicals and have the least overlap between overall patterns associated with those subsets (i.e., have the most distinct patterns) are assumed to be related to biological activity and are used in the subsequent step of the algorithm. Finally, in Step 3, common reactivity patterns for biologically similar molecules are obtained by summing the probabilistic distributions for specific stereoelectronic parameters associated with chemicals in the training sets containing active and inactive chemicals. As a result, the system automatically generates a decision tree. The Bayesian probabilistic method is used to segregate chemicals which meet the reactivity pattern associated with each of the nodes of the decision tree. Decisions about whether the structural boundaries defined by the probabilistic distributions of the common reactivity pattern (of the tree node) have been met are made at a given probability threshold. Chemicals are predicted to meet the structural constraints of the common reactivity pattern when they possess at least one conformer having a population density estimate exceeding the user-defined probability threshold. The thresholds depend on the specificity of the investigated classification task. In the current study, the probability threshold was parametrized to be 0.60. The mathematical formalism of the current algorithm is described elsewhere (25, 36, 37). Applicability Domain of the OASIS Model. The reliability of the predictions made by the CA model was evaluated using a recently developed stepwise approach for determining the applicability domain of (Q)SAR models (38). Four stages are applied to account for the diversity and complexity of the (Q)SAR models, reflecting their mechanistic rationales (including metabolic activation of chemicals) and transparencies. 
General parametric requirements (in terms of ranges for molecular weight, water solubility, volatility, and so on) are imposed in the first stage, specifying the domain for only those chemicals that fall in the range of variation of the selected physicochemical parameters of the chemicals in the training set. The second stage defines the structural similarity between chemicals which are correctly predicted by the model. The structural neighborhood of atom-centered fragments is used to determine this similarity. The training-set chemicals for which the (Q)SAR model provides correct predictions (within user-defined accuracy thresholds) are used as the source from which atom-centered fragments are extracted. The resulting list of “good fragments” can be used to assess an external chemical. If the atom-centered fragments for each atom constituting an external chemical are determined to be elements of this list, then the chemical belongs to the structural domain of the model. If not, the chemical is considered to be outside of this domain.
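The second-stage structural check described above amounts to a set-membership test, which the following sketch illustrates. It assumes that atom-centered fragments have already been extracted and are available as strings; the fragment labels, function names, and the simple correct/incorrect flag (standing in for the user-defined accuracy thresholds) are hypothetical.

```python
def build_fragment_domain(training_set):
    """Collect fragments from correctly predicted training chemicals.

    training_set -- iterable of (fragments, correctly_predicted) pairs,
                    where `fragments` is a set of fragment labels
    """
    good_fragments = set()
    for fragments, correctly_predicted in training_set:
        if correctly_predicted:
            good_fragments.update(fragments)
    return good_fragments


def in_structural_domain(fragments, good_fragments):
    """True only if every fragment of the chemical is a known good fragment."""
    return all(f in good_fragments for f in fragments)


# Hypothetical fragment labels; real fragments encode an atom and its neighbors.
training_set = [
    ({"C(ar)-N", "N-H", "C(ar)-C(ar)"}, True),
    ({"C-Cl", "C-C"}, True),
    ({"S=O", "C-S"}, False),  # mispredicted chemical: fragments are not trusted
]
good = build_fragment_domain(training_set)

print(in_structural_domain({"C(ar)-N", "C-C"}, good))  # inside the domain
print(in_structural_domain({"C-C", "S=O"}, good))      # outside the domain
```

Requiring every fragment to be known is deliberately strict: a single unfamiliar atom environment is enough to place a chemical outside the structural domain.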


Table 2. Interaction Mechanisms of (a) Alerting Groups Directly Acting with DNA and (b) Alerts Acting after Bacterial Enzymatic Activationᵃ


ᵃ References supporting these mechanisms are listed in Appendix II of the Supporting Information.

The third stage in defining the domain is based on a mechanistic understanding of the modeled phenomenon (i.e., the domain of the mechanistic hypothesis). Here, the model domain combines the reliability of specific reactive groups hypothesized to cause the effect and the domain of explanatory variables, determining the parametric requirements for functional groups to elicit their reactivity. Finally, if metabolic activation of chemicals is a part of the (Q)SAR model, the reliability of simulated metabolism (metabolites, pathways, and maps) is taken into account in assessing the reliability of predictions. Some of the stages of the proposed approach for defining the model domain can be skipped, depending on the availability and quality of the experimental data used to derive the model, the specificity of (Q)SARs, and the goals of their ultimate application. Tissue Metabolic Simulator. The TIMES model is based on a probabilistic approach. It consists of a list of hierarchically ordered transformations and a substructure-matching engine (explained below) that implements them. According to the probabilistic approach we have developed, the hierarchy of transformations is defined by transformation probabilities determined in order to reproduce a database of documented metabolic transformations or data on rates of disappearance. The transformation probabilities are related to rate constants associated with the feasibility of occurrence of various reactions within the time frame of the metabolism tests. It is assumed that the transformations are independent and performed sequentially. Each molecular transformation consists of parent submolecular fragments, transformation products, and inhibiting masks. The last of these play the role of reaction inhibitors. If a fragment assigned as a mask is attached to the target subfragment, the execution of the transformation on the parent chemical is prevented. 
The presence of groups that can promote or inhibit metabolic reactions significantly increases the number of principal transformations. Although the number of organic functional groups known in intermediary and organ-specific metabolism is less than 60, the reactions that may occur for polyfunctional

compounds can be uncountable (39). Positional isomerism also adds to the combinatorial explosion. Currently, 343 principal transformations are used to model metabolism in the rat liver. These transformations have been separated into two major classes of reaction: non-rate-determining and rate-determining. The first class includes 41 abiotic and enzyme-controlled reactions that occur at very high rates on the time scale of the tests. Transformations of highly reactive groups and intermediates are included here. Various chemical-equilibrium processes such as tautomerism are also included in this class of transformations. The second class of reaction includes 302 metabolic transformations of phase 1 and phase 2 detoxification mechanisms, such as oxidative, redox, reductive, hydrolytic, and synthetic reactions. The simulator starts by matching the parent molecule with the reaction fragment associated with the transformation having the highest probability of occurrence. When a match is identified, the molecule is metabolized, and transformation products are treated as parent molecules for the next conversion step. The procedure is repeated for the newly formed chemicals until the product of probabilities of consecutively performed transformations reaches a user-defined threshold. Initially, the parent chemical is subjected to the list of transformations, and all transformations meeting the associated substructures are implemented on the parent, producing the list of first-level metabolites. Each of these generated metabolites is then subjected to the same list of transformations to produce the second level of metabolites, and so on. The mathematical formalism is based on the assumption that transformations occur sequentially; that is, the most probable transformation is applied first to the parent chemical, then the remaining nonmetabolized parent molecules undergo the second transformation with lower probability, and so on. 
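The level-by-level generation of metabolites with a probability cutoff can be sketched as below. This is a deliberately simplified illustration, not the TIMES engine: molecules are toy strings, substructure matching is plain substring matching without inhibiting masks, the transformation list and its probabilities are hypothetical, and the path probability is taken as the simple product of the probabilities of consecutively applied transformations.

```python
from collections import deque

# Hypothetical transformations: (label, probability, pattern, product).
TRANSFORMATIONS = [
    ("nitro reduction", 0.8, "NO2", "NH2"),
    ("N-hydroxylation", 0.5, "NH2", "NHOH"),
    ("N-acetylation", 0.3, "NH2", "NHAc"),
]


def simulate(parent, transformations, threshold):
    """Generate metabolites whose path probability is at least `threshold`.

    Returns a dict mapping each metabolite to the highest path probability
    found for it.
    """
    # Try the most probable transformations first.
    ordered = sorted(transformations, key=lambda t: t[1], reverse=True)
    metabolites = {}
    queue = deque([(parent, 1.0)])
    while queue:
        molecule, path_prob = queue.popleft()
        for _label, prob, pattern, product in ordered:
            if pattern not in molecule:
                continue
            new_prob = path_prob * prob
            if new_prob < threshold:
                continue  # branch is too improbable to follow
            metabolite = molecule.replace(pattern, product, 1)
            # Skip the parent itself and anything already reached
            # with an equal or higher probability.
            if metabolite == parent or new_prob <= metabolites.get(metabolite, 0.0):
                continue
            metabolites[metabolite] = new_prob
            queue.append((metabolite, new_prob))
    return metabolites


for name, prob in simulate("Ar-NO2", TRANSFORMATIONS, threshold=0.3).items():
    print(f"{name}: {prob:.2f}")
```

Lowering the threshold lets less probable branches (here the acetylated product) appear in the metabolic map, at the cost of a larger tree.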
The mathematical formalism defining the amount of metabolite, formation and metabolism probabilities, and so forth, is given in our recent publications (22, 40, 41). The reaction probabilities of the metabolic simulator were adjusted to reproduce a database containing 332 documented maps

Table 3. Identified Alerting Groups for Interactions with Proteins and the Associated Parameter Boundaries Used for Building a CA Model

for mammalian (primarily rat) liver metabolism (22, 42). The degree to which the training set reproduces the documented maps defines the performance of the simulator, which can be adjusted in light of new experimental evidence. Similarly, assessments evaluating the reliability of generated metabolites and metabolic maps on the basis of the rate of reproduction of documented metabolites by the individual transformations of the simulator have been introduced (22, 40).

Results and Discussion Model for Chromosomal Aberrations. The relationship between Ames mutagenicity and CA was studied by analyzing data included in the OASIS genotoxicity database (43). This is a large collection of data (7317 entries) on chemicals for which


information about different genotoxic effects (e.g., CA, micronucleus formation, and mutagenicity in the Mouse Lymphoma Assay, the Ames test, and other genotoxicity assays) is available. Of these 7317 chemicals, 662 that have been tested for both bacterial (Ames) mutagenicity and CA were identified. Comparison of the test results showed that about 80% of the mutagenic chemicals also cause CA (see Figure 2a). The opposite comparison showed that only 60% of the chemicals that cause CA are Ames-positive (see Figure 2b). Within the range of variation of the experimental data (80–85%), it was concluded that all chemicals that cause bacterial mutagenicity are also highly likely to be positive with respect to CA in vitro. Chemicals that are not Ames mutagenic could still be genotoxic (i.e., cause CA) by mechanisms different from those associated with direct interactions with DNA. In this regard, two other types of interaction mechanisms for CA have been distinguished: inhibition of enzymes controlling replication of DNA and interactions with nuclear (e.g., histone) proteins that preserve eukaryotic DNA in their chromosomal structures. To distinguish these two mechanisms from those involving direct interactions with DNA, the present model was separated into two parts. The first part accounts for direct interactions of chemicals with DNA, resulting in bacterial (Ames) mutagenicity. The model reproducing these interactions has already been derived (23) using chemicals from the National Toxicology Program (NTP) and BASF training sets for Ames mutagenicity. It is based on rules combining structural boundaries of alerting groups with ranges of variation of physicochemical properties and other 2D and 3D molecular parameters controlling bioavailability of chemicals and reactivity of alerts, respectively. A set of 17 such rules (listed in Table 1) was derived. 
Each alerting group is a functionality that could interact with DNA by covalent bond rearrangement or intercalation between base pairs. The use of each alert has been justified by the mechanistic interpretation of that interaction given in Table 2. Two of the alerts were identified as ones that cause effects without the need of modulating factors. In eight cases, global physicochemical (2D) parameters such as LogKow and molecular weight were imposed as modulating factors. In four cases only, quantum-chemical requirements were used to assess the degree to which the alerts are fired by the rest of the molecule. For three alerts, two-parameter COREPA models were derived as modulating components, which explains the multiplicity of parameter ranges corresponding to the complex reactivity patterns associated with these alerts. To explain the variation of toxicity resulting from direct interaction with DNA, all of the recently identified structural alerts for Ames mutagenicity and their parametric boundaries (Table 1) were applied to the first training set used in this study (497 chemicals without S9 activation, 166 CA-positive and 331 CA-negative). Six of the rules (#2, 4, 6, 10, 15, and 16 in Table 1) associated with DNA interactions (23) were slightly modified to accommodate the variation of data within the CA training set. In the case of nitroso groups (alert #6, Table 1), the range of variation of the supporting parameter, molecular weight, was expanded from 117–240 to 103–272 g/mol. For α,β-unsaturated aldehydes (alert #15), the range of the LogKow parameter was modified from 0.18–1.7 to 0.18–3.35. The ranges of both parameters supporting primary aromatic amines (alert #16) were modified: -0.4 to 2.24 instead of -0.2 to 1.86 was used for LogKow, and 7.2–9.1 eV instead of 7.2–8.5 eV was used for EGAP. The modulating parameter for p-quinones (alert #4) was eliminated; however, a structural requirement for o-quinones was added to the current model.
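The combination of a structural alert with parameter ranges can be illustrated with a short Python sketch. It is not the OASIS/TIMES implementation: the `Rule` class, the descriptor keys, and the descriptor values of the example compound are hypothetical, and detection of the alerting group (a substructure search) is assumed to have been done elsewhere and is passed in as a boolean. Only the numeric ranges for primary aromatic amines (alert #16) are taken from the text.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    """Hypothetical container: an alert name plus inclusive descriptor ranges."""
    alert: str
    ranges: dict  # descriptor name -> (low, high)

    def fires(self, has_alert: bool, descriptors: dict) -> bool:
        # A rule fires only if the alerting group is present AND every
        # modulating descriptor lies inside its allowed range.
        if not has_alert:
            return False
        return all(low <= descriptors[name] <= high
                   for name, (low, high) in self.ranges.items())


# Ranges quoted in the text for primary aromatic amines (alert #16):
# the original Ames-model boundaries and the boundaries widened for CA.
ames_rule = Rule("primary aromatic amine",
                 {"logKow": (-0.2, 1.86), "E_gap_eV": (7.2, 8.5)})
ca_rule = Rule("primary aromatic amine",
               {"logKow": (-0.4, 2.24), "E_gap_eV": (7.2, 9.1)})

# Hypothetical descriptor values, chosen only to show the effect of widening.
compound = {"logKow": 2.0, "E_gap_eV": 8.8}

print(ames_rule.fires(True, compound))  # False: logKow and E_gap exceed the old limits
print(ca_rule.fires(True, compound))    # True: both values fall in the widened ranges
```

Under these assumptions, a compound carrying the alert but lying just outside the original boundaries is flagged only by the modified rule, which is the practical effect of the range adjustments described above.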
New parameters were identified


Table 4. Interaction Mechanisms of (a) Alerts That Interact Only with SH Groups of Proteins and (b) Alerts That Interact with Both SH and NH Groups of Proteins^a


^a References supporting these mechanisms are listed in Appendix III of the Supporting Information.

Figure 3. Example of an extended structural alert proposed by Estrada and Molina (18). R1 = H, aromatic C; R2 = OH, NH2, O, alkyl.

to control the behavior of N-acyl and other urea derivatives (alert #10) in order to correctly reproduce the CA activities of these chemicals; in the present work, ELUMO and LogKow were combined in a two-parameter COREPA model that was used instead of a range of volume polarizability. The two-parameter COREPA model supporting epoxides (alert #2) was also modified in the current study, where it was based on the parameters EHOMO and EN instead of molecular weight and EHOMO. The next step was the development of the component of the CA model that accounts for protein interactions. In this regard, 10 new alerts associated with protein binding (Table 3) were proposed. The role of each alert identified in the present work was justified by an interaction mechanism(s) identified in the literature or introduced by our experts. These interaction mechanisms are listed in Table 4. The alerting groups and associated mechanisms can be classified on the basis of interaction target: specific enzymes (e.g., topoisomerases) or other proteins (e.g., histones). Interactions with enzymes that control replication of DNA, in particular, inhibition of topoisomerases, are thought to be a cause of CA (44). Topoisomerases are enzymes which participate in all stages of replication, functional activity, and structural maintenance of DNA (44). In the present study, given the restricted structural domain of the training set, only one alert, quinone, was considered to interact with these enzyme classes; the fact that this interaction causes CA has been documented (45). The quinones are well-known mutagens, and they have already been implemented in this work as alerts which can

cause DNA damage. This is an example of how the same alert can provoke different outcomes depending on the interaction target: DNA bases or proteins associated with DNA (see alert #4 in Table 2 and alert #1 in Table 4, respectively). The protein-binding alerts can also be classified according to the attacked nucleophilic site of the protein. Here, the principal nucleophilic centers are sterically accessible S and N atoms of SH and NH groups, respectively. Alerts in Tables 3 and 4 that can interact only with SH groups are quinones (alert #1), acrylates (alert #2) (46), and pyranones (alert #3) (47). The rest of the alerts can interact with either kind of nucleophilic site. The interaction mechanisms associated with all of the alerts studied in this work are summarized in Table 4, and detailed descriptions of these mechanisms can be found in Appendix IV of the Supporting Information. In addition to the direct interaction mechanisms, modulating 2D and 3D structural factors were also associated with the identified alerts for protein interactions. Parameter ranges associated with the alerts are presented in Table 3. Six of the alerts [quinones (alert #1), pyranones (alert #3), methylenedioxyphenyl derivatives (alert #5), pyrazolone and pyrazolidine derivatives (alert #8), formaldehyde (alert #9), and aromatic (thio)amides (alert #10)] were identified as ones that interact directly with protein without the need of modulating factors. In one case, acrylates (alert #2), the associated global physicochemical (2D) parameter LogKow had a range of 0.43–1.25. In two cases, quantum-chemical requirements were imposed in order to assess the effect of the rest of the molecule on the reactivity of the associated alerts: for sulfonamides (alert #6), EN was in the range from -5.65 to -5.29 eV, and for carbamates (alert #7), EHOMO was in the range from -9.2 to -8.9 eV.
In one case, aromatic esters (alert #4), a two-parameter COREPA model was derived as a modulating component, which explains the multiplicity of parameter ranges for LogKow and EHOMO corresponding to the complex reactivity patterns for this alert.
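The ten protein-binding alerts described above can be collected in a small lookup table, which makes the two classifications (by attacked nucleophilic site and by modulating factor) easy to check. The sketch below is only an illustration: the dictionary layout and variable names are hypothetical, while the alert numbers, names, target sites, and parameter ranges are those stated in the text.

```python
# Fields per alert: (name, nucleophilic sites attacked, modulating factor).
# A modulating factor of None means the alert acts without one.
# The tuple layout is a hypothetical convenience, not the authors' format.
PROTEIN_ALERTS = {
    1: ("quinones", {"SH"}, None),
    2: ("acrylates", {"SH"}, "LogKow in 0.43 to 1.25"),
    3: ("pyranones", {"SH"}, None),
    4: ("aromatic esters", {"SH", "NH"},
        "two-parameter COREPA model (LogKow, EHOMO)"),
    5: ("methylenedioxyphenyl derivatives", {"SH", "NH"}, None),
    6: ("sulfonamides", {"SH", "NH"}, "EN in -5.65 to -5.29 eV"),
    7: ("carbamates", {"SH", "NH"}, "EHOMO in -9.2 to -8.9 eV"),
    8: ("pyrazolone and pyrazolidine derivatives", {"SH", "NH"}, None),
    9: ("formaldehyde", {"SH", "NH"}, None),
    10: ("aromatic (thio)amides", {"SH", "NH"}, None),
}

# Alerts that react only with SH groups of proteins.
sh_only = [n for n, (_, sites, _) in PROTEIN_ALERTS.items() if sites == {"SH"}]

# Alerts that need no modulating 2D or 3D factor.
unmodulated = [n for n, (_, _, mod) in PROTEIN_ALERTS.items() if mod is None]

print(sh_only)      # [1, 2, 3]
print(unmodulated)  # [1, 3, 5, 8, 9, 10]
```

The counts agree with the description: three SH-only alerts, six alerts without modulating factors, and four alerts with a modulating parameter or model.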


Table 5. Comparison of the CA Activity Alerts Identified by Estrada and Molina (18) with Those Used in the Current Work


The present collection of DNA and protein binding alerts was compared with the set of alerts used by Estrada and Molina (18). They proposed 22 structural alerts associated with CA (for four of these alerts, no structural information was provided). Some of the alerts are specifically defined structural functionalities, whereas others are extended beyond the interacting functional group to include a large fragment from the structure. An example of such an extended alert is the structural fragment shown in Figure 3, which covers the unsaturated carboxylic acids, amides, esters, and ketones simultaneously. The use of extended alerts is based on the assumption that a single compound can have more than one interacting alert responsible for its activity. Estrada and Molina (18) adopted the concept that different interactions can lead to CA. However, their work considered only those alerts assumed to interact directly with DNA, and they only attempted to review the interaction mechanisms of three toxicophores.

We have compared the alerts derived by Estrada and Molina (18) with those used in the present work (Table 5). It is evident that the majority of the alerts from the current work could be identified within the extended alerts of Estrada and Molina. Three of their extended fragments (#10, 14, and 17) had no analogues from the list introduced in the present paper. Conversely, no analogues were found in ref 18 for 12 of the alerts from the current paper (alerts #3, 5, 9, 11, 12, and 14 from Table 1 and alerts #3, 4, 5, 8, 9, and 10 from Table 3). Some of the alerts introduced by these authors as reactive toward DNA were identified in the present work as alerts that can also interact with proteins. One such example is quinones. Structure of the Model. Two alternative configurations are proposed for implementation of the derived model, as illustrated in Figure 4a,b. Each alternative includes two individual parts. The first part accounts for the direct interactions with DNA,


Figure 4. Structure of the currently derived model for chromosomal aberrations. The first part of the model (the upper pie chart in each panel) accounts for interactions with DNA; the second part (the lower pie chart) describes the interactions with proteins. Two alternative configurations are provided for this scheme: (a) If a chemical is identified in the first part of the model as active with respect to interaction with DNA, it is predicted to be CA-positive and forwarded to the second part for further analysis regarding possible interactions with protein. (b) If a chemical is identified in the first part of the model as active with respect to interaction with DNA, it is forwarded to the second part of the model. A CA-positive prediction is assigned only in the case that the chemical meets the requirements for interaction with both DNA and proteins.

whereas the second part reproduces the interactions which lead to CA by protein or enzyme binding mechanisms. According to the first alternative (Figure 4a), when a new chemical is submitted for prediction, the system applies the first part of the model associated with DNA interactions. A positive prediction


for CA is assigned if the requirements for interaction with DNA are met, indicating that the ultimate effect is due to this interaction mechanism. Regardless of whether the chemical meets the requirements for direct interaction with DNA, it is forwarded to the second part of the model, which investigates its ability to interact with proteins. The system indicates the cases where the same chemical could cause CA by both mechanisms (direct interaction with DNA and interaction with protein) simultaneously. If the chemical passes through both parts of the model without being flagged for activity, the system predicts this chemical to be unable to produce chromosomal aberrations. A second alternative for combining the two parts of the model (Figure 4b) is also introduced. In this case, a chemical is predicted to be CA-positive only if it causes CA both by direct damage to DNA and by interactions with protein. This alternative, in which the prediction is based on applying a logical “and” to the results from the two parts of the model, could be considered as a more conservative version of the first prediction scheme. The assessment of model performance described below was based only on the first prediction scheme, which could be considered as more valuable from a regulatory point of view. Performance of the Model for Training Set Chemicals. The performance of the model was assessed by screening the training set of 497 chemicals without S9 metabolic activation. The model correctly predicted the activity of 121 of the 166 chemicals in the training set that elicit CA. Nine other chemicals were unclassified by the model because they did not meet the probabilistic level required for the COREPA model to make reliable predictions. This result (121 of 157) corresponds to a sensitivity of 77%. The model correctly classified 253 of the 331 CA-negative chemicals and left 24 others unclassified, corresponding to a specificity of 82%. Thus, the total concordance of the model is 80%. 
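The two prediction schemes of Figure 4 described above reduce to a logical "or" and a logical "and" over the outcomes of the two model parts. A minimal Python sketch follows; the function name `predict_ca` and its signature are illustrative assumptions, and the handling of chemicals left unclassified by the COREPA components is deliberately omitted.

```python
def predict_ca(dna_active: bool, protein_active: bool, scheme: str = "a") -> bool:
    """Combine the DNA part and the protein part of the model.

    scheme "a": CA-positive if either part flags the chemical (Figure 4a).
    scheme "b": CA-positive only if both parts flag it (Figure 4b).
    """
    if scheme == "a":
        return dna_active or protein_active
    if scheme == "b":
        return dna_active and protein_active
    raise ValueError("scheme must be 'a' or 'b'")


# Print the full truth table for both schemes.
for dna in (False, True):
    for protein in (False, True):
        print(dna, protein,
              predict_ca(dna, protein, "a"),
              predict_ca(dna, protein, "b"))
```

The schemes differ only for chemicals flagged by exactly one part of the model: scheme (a) predicts them as CA-positive, whereas scheme (b) does not.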
Of the 162 chemicals in the training set for CA with S9 metabolic activation, 121 were also included in the set of 497 chemicals studied for CA without metabolic activation. The remaining 41 chemicals, for which there are data only for activity after metabolic activation, were not included in this set because there is no information for their activity as parents. Hence, the positives among these 41 chemicals activated by metabolic transformation were not incorporated as negative parents in the evaluation of the specificity of the model. The performance of the modeling scheme combined with the simulator for rat liver S9 metabolism (22) was assessed by screening the training set of 162 S9-activated chemicals (81 CA-positive and 81 CA-negative). The model correctly predicted the behavior of 55 of the 81 CA-positive chemicals and 41 of the 81 CA-negative chemicals; eight chemicals in each group were left unclassified for failure to meet the probability threshold for reliable prediction. Thus, the sensitivity of the model accounting for metabolic activation is 75%. The specificity of the model with metabolic activation, however, was found to be relatively low, namely 56%. As shown in Figure 5, there was an even distribution of false positives across alerting groups, which hampered the attempts to improve model specificity. One way to increase the specificity of the current model would be the acquisition and analysis of more experimental data for metabolic activation (especially important are chemicals which appear to be CA-negative after metabolic activation). Profound assessment of the bioavailability of chemicals would be another way to increase the specificity of the model, especially if the


Figure 5. Distribution of false positives across alerting groups.

data indicate precipitation of the test substance at critical concentration levels.
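The sensitivity and specificity values quoted in this section can be reproduced from the reported counts, provided that unclassified chemicals are excluded from the denominator. That convention is inferred from the "121 of 157" figure given for the set without S9 activation and is assumed here to apply to the other three values as well; the helper function `rate` is hypothetical.

```python
def rate(correct: int, total: int, unclassified: int) -> float:
    """Fraction predicted correctly among chemicals that received a prediction."""
    return correct / (total - unclassified)


# Counts quoted in the text.
results = {
    "sensitivity, no S9": rate(121, 166, 9),    # 121 of 157
    "specificity, no S9": rate(253, 331, 24),   # 253 of 307
    "sensitivity, with S9": rate(55, 81, 8),    # 55 of 73
    "specificity, with S9": rate(41, 81, 8),    # 41 of 73
}

for label, value in results.items():
    print(f"{label}: {value:.1%}")
```

Rounded to whole percentages, these ratios give 77, 82, 75, and 56%, matching the values reported above.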

Summary and Conclusions The model currently developed for chromosomal aberrations accounts for two principal types of interaction mechanisms. The first addresses direct interactions with DNA. In this regard, chemicals eliciting bacterial mutagenicity are assumed to cause CA as well. This assumption is supported by a large number of experimental observations. The second group of mechanisms includes interactions with enzymes participating in the replication of DNA (topoisomerases) and proteins maintaining the activity and structure of chromosomes (e.g., histones). The alerting groups associated with different mechanisms are defined by specific structural boundaries as well as by 2D and 3D parameter ranges describing effects of bioavailability and reactivity of alerts that are conditioned by the rest of the molecules. The use of each alert is justified by a corresponding mechanism. In this respect, the model provides for each mechanism explicit knowledge found in the literature or inferred by experts. A comparative analysis was performed between the alerts used in the model and those published in the literature. A commonality has been identified, but the boundaries defined for some of the publicly available alerts are biased by statistical analysis and not supported by insight into specific interaction mechanisms. In addition to its mechanistic transparency, the current work is the first attempt to model CA in a way that explicitly accounts for metabolic activation of chemicals. This was accomplished by coupling the model with a metabolic simulator that was trained to reproduce documented maps for mammalian (mainly rat) liver metabolism of 332 chemicals. In order to assess the model, chemicals were grouped into training sets distinguished by the absence or presence of metabolic activation. The performance of the model without metabolic activation was characterized by sensitivity and specificity values of 77 and 82%, respectively. 
For the model coupled with the metabolic simulator, the sensitivity reached 75%, whereas the specificity dropped to 56%. The low specificity of the model could not be associated with specific alert(s). The performance of the model could be improved with inclusion of additional S9-activated chemicals in the training set (especially ones that are CA-negative after metabolic activation). Profound assessment of the bioavailability of chemicals would be another way

to increase the specificity of the model, especially if the data indicate precipitation of the test substance at critical concentration levels.

Acknowledgment. Research associated with this paper was funded in part through research agreements with Unilever and BASF Aktiengesellschaft. Gratitude is expressed to Drs. Grace Patlewicz (Joint Research Centre), Camilla Pease, and Carl Westmoreland from Unilever for discussions improving the quality of the paper.

Supporting Information Available: CAS numbers, chemical names, SMILES, and observed CA activities for the training set chemicals with and without S9 activation (Appendix I); interaction mechanisms of alerting groups used in this work to represent DNA covalent binding (Table 2), supported by literature sources (Appendix II); interaction mechanisms of alerting groups used in this work to represent interactions with proteins (Table 4), supported by literature sources (Appendix III); and descriptions and interaction mechanisms of all identified toxicophores associated with CA (Appendix IV). This material is available free of charge via the Internet at http://pubs.acs.org.

References

(1) European Medicines Agency, International Conference on Harmonization (1997) Genotoxicity: a standard battery for genotoxicity testing of pharmaceuticals, ICH Topic S2B, Ref. No. CPMP/ICH/174/95, London, U.K.
(2) Enslein, K. (1988) An overview of structure-activity relationships as an alternative to testing in animals for carcinogenicity, mutagenicity, dermal and eye irritation, and acute oral toxicity. Toxicol. Ind. Health 4, 479–498.
(3) Schultz, T. W., Cronin, M. T. D., and Netzeva, T. I. (2003) The present status of QSAR in toxicology. THEOCHEM 622, 23–38.
(4) Woo, Y.-T., Lai, D. Y., Argus, M. F., and Arcos, J. C. (1995) Development of structure-activity relationship rules for predicting carcinogenic potential of chemicals. Toxicol. Lett. 79, 219–228.
(5) Simon-Hettich, B., Rothfuss, A., and Steger-Hartmann, T. (2006) Use of computer-assisted prediction of toxic effects of chemical substances. Toxicology 224, 156–162.
(6) Obe, G., Pfeiffer, P., Savage, J. R., Johannes, C., Goedecke, W., Jeppesen, P., Natarajan, A. T., Martinez-Lopez, W., Folle, G. A., and Drets, M. E. (2002) Chromosomal aberrations: formation, identification and distribution. Mutat. Res. 504, 17–36.
(7) Degrassi, F., Fiore, M., and Palitti, F. (2004) Chromosomal aberrations and genomic instability induced by topoisomerase-targeted antitumour drugs. Curr. Med. Chem.: Anti-Cancer Agents 4, 317–325.
(8) Parry, E. M., Parry, J. M., Corso, C., Doherty, A., Haddad, F., Hermine, T. F., Johnson, G., Kayani, M., Quick, E., Warr, T., and Williamson, J. (2002) Detection and characterization of mechanisms of action of aneugenic chemicals. Mutagenesis 17, 509–521.
(9) Breiman, L. (2001) Random forests. Mach. Learn. 45, 5–32.
(10) Rothfuss, A., Steger-Hartmann, T., Heinrich, N., and Wichard, J. (2006) Computational prediction of the chromosome-damaging potential of chemicals. Chem. Res. Toxicol. 19, 1313–1319.
(11) Chang, C.-C., and Lin, C.-J. (2001) LIBSVM: A library for support vector machines. http://www.csie.ntu.edu.tw/~cjlin/libsvm (accessed Oct 26, 2007).
(12) Hastie, T., Tibshirani, R., and Friedman, T. (2001) The elements of statistical learning. In Springer Series in Statistics (Bickel, P., Diggle, P., Fienberg, S., Gather, U., Olkin, I., and Zeger, S., Eds.) Springer-Verlag, Heidelberg, Germany.
(13) Merkwirth, C., and Wichard, J. D. (2002) ENTOOL: A MATLAB toolbox for ensemble modelling. http://www.j-wichard.de/entool/ (accessed Oct 26, 2007).
(14) Huberty, C. J. (1994) Applied Discriminant Analysis, John Wiley & Sons, New York.
(15) Klopman, G. (1992) MULTICASE 1. A hierarchical computer automated structure evaluation program. Quant. Struct.-Act. Relat. 11, 176–184.
(16) Rosenkranz, H. S., Cunningham, A. R., Zhang, Y. P., Claycamp, H. G., Macina, O. T., Sussman, N. B., Grant, G. S., and Klopman, G. (1999) Development, characterization and application of predictive toxicology models. SAR QSAR Environ. Res. 10, 277–298.
(17) Sofuni, T., Ed. (1998) Data Book of Chromosomal Aberration Test In Vitro, revised ed., Life-Science Information Center, Tokyo, Japan.
(18) Estrada, E., and Molina, E. (2006) Automatic extraction of structural alerts for predicting chromosome aberrations of organic compounds. J. Mol. Graphics Modell. 25, 275–288.
(19) Sanderson, D., and Earnshaw, C. (1991) Computer prediction of possible toxic action from chemical structure; The DEREK system. Hum. Exp. Toxicol. 10, 261–273.
(20) Ridings, J. E., Barratt, M. D., Cary, R., Earnshaw, C. G., Eggington, C. E., Ellis, M. K., and Judson, P. N. (1996) Computer prediction of possible toxic action from chemical structure: An update on the DEREK system. Toxicology 106, 267–279.
(21) Williams, V. R., Naven, T. R., Marchant, A. C., Hirose, A., Kamata, E., and Hayashi, M. (2006) Derek for windows assessment of chromosomal aberration effects. Toxicol. Lett. 164 (Suppl. 1), S292.
(22) Mekenyan, O., Dimitrov, S., Serafimova, R., Thompson, E., Kotov, S., Dimitrova, N., and Walker, J. (2004) Identification of the structural requirements for mutagenicity by incorporating molecular flexibility and metabolic activation of chemicals I: TA100. Chem. Res. Toxicol. 17, 753–766.
(23) Serafimova, R., Todorov, M., Pavlov, T., Kotov, S., Jacob, E., Aptula, A., and Mekenyan, O. (2007) Identification of the structural requirements for mutagenicity, by incorporating molecular flexibility and metabolic activation of chemicals. II. General Ames mutagenicity model. Chem. Res. Toxicol. 20, 662–676.
(24) Niemelä, J., and Wedebye, E. (2004) Evaluation of the setubal principles for establishing the status of development and validation of (Q)SARs. Annex 4: A “global” MULTI-CASE model for in vitro chromosomal aberrations in mammalian cells. http://glwww.mst.dk/kemi/Word/work%20item%201%20report_annex%204.doc (accessed Oct 30, 2007).
(25) Mekenyan, O., Ivanov, J., Karabunarliev, S., Bradbury, S., Ankley, G., and Karcher, W. (1997) A computationally-based hazard identification algorithm that incorporates ligand flexibility. 1. Identification of potential androgen receptor ligands. Environ. Sci. Technol. 31, 3702–3711.
(26) Bradbury, S., Kamenska, V., Schmieder, P., Ankley, G., and Mekenyan, O. (2000) A computationally-based identification algorithm for estrogen receptor ligands: Part I. Predicting hERα binding affinity. Toxicol. Sci. 58, 253–269.
(27) Mekenyan, O., Kamenska, V., Schmieder, P., Ankley, G., and Bradbury, S. (2000) A computationally based identification algorithm for estrogen receptor ligands: Part 2. Evaluation of a hERα binding affinity model. Toxicol. Sci. 58, 270–281.
(28) Mekenyan, O., Dimitrov, D., Nikolova, N., and Karabunarliev, S. (1999) Conformational coverage by a genetic algorithm. J. Chem. Inf. Comput. Sci. 39, 997–1016.
(29) Pavlov, T., Todorov, M., Serafimova, R., Aladjov, H., and Mekenyan, O. (2007) Conformational coverage by genetic algorithm: saturation of conformational space. J. Chem. Inf. Model. 47, 851–863.
(30) Stewart, J. (1990) MOPAC: a semiempirical molecular orbital program. J. Comput.-Aided Mol. Des. 4, 1–103.
(31) Stewart, J. (1993) MOPAC 93, Fujitsu Limited, Chiba, Japan, and Stewart Computational Chemistry, Colorado Springs, CO.
(32) Mekenyan, O., Nikolova, N., and Schmieder, P. (2003) Dynamic 3D QSAR techniques: Applications in toxicology. THEOCHEM 622, 147–165.
(33) Connolly, M. (1983) Analytical molecular surface calculation. J. Appl. Crystallogr. 16, 548.
(34) Stanton, D. T., and Jurs, P. C. (1990) Development and use of charged partial surface area structural descriptors in computer-assisted quantitative structure-property relationship studies. Anal. Chem. 62, 2323–2329.
(35) Nikolov, N., Grancharov, V., Stoyanova, G., Pavlov, T., and Mekenyan, O. (2006) Representation of chemical information in OASIS centralized 3D database for existing chemicals. J. Chem. Inf. Model. 46, 2537–2551.
(36) Mekenyan, O., Nikolova, N., Schmieder, P., and Veith, G. (2004) COREPA-M: A multi-dimensional formulation of COREPA. QSAR Comb. Sci. 23, 5–18.
(37) Serafimova, R., Todorov, M., Nedelcheva, D., Pavlov, T., Akahori, Y., Nakai, M., and Mekenyan, O. (2007) QSAR and mechanistic interpretation of estrogen receptor binding. SAR QSAR Environ. Res. 18, 1–33.
(38) Dimitrov, S., Dimitrova, G., Pavlov, T., Dimitrova, N., Patlewicz, G., Niemela, J., and Mekenyan, O. (2005) A stepwise approach for defining the applicability domain of SAR and QSAR models. J. Chem. Inf. Model. 45, 839–849.
(39) Williams, J. A., and Phillips, D. H. (2000) Mammary expression of xenobiotic metabolizing enzymes and their potential role in breast cancer. Cancer Res. 60, 4667–4677.
(40) Mekenyan, O., Dimitrov, S., Pavlov, T., and Veith, G. (2004) A systematic approach to simulating metabolism in computational toxicology. I. The TIMES heuristic modelling framework. Curr. Pharm. Des. 10, 1273–1293.
(41) Dimitrov, S., Pavlov, T., Vasilev, R., and Mekenyan, O. (2005) Simulation of abiotic molecular transformations by CATABOL. Poster presented at the SETAC Europe 15th Annual Meeting, Lille, France, May 21–26.
(42) Lenk, W., and Rosenhauer-Thilmann, R. (1993) Metabolism of 2-acetylaminofluorene. I. Metabolism in vitro of 2-acetylaminofluorene and 2-acetylaminofluoren-9-one by hepatic enzymes. Xenobiotica 23, 241–257.
(43) Laboratory of Mathematical Chemistry, Bourgas, Bulgaria (2007) OASIS Genotoxic Database, to be submitted for publication.
(44) Gatto, B., Capranico, G., and Palumbo, M. (1999) Drugs acting on DNA topoisomerases: Recent advances and future perspectives. Curr. Pharm. Des. 5, 195–215.
(45) Hutt, A. M., and Kalf, G. F. (1996) Inhibition of human topoisomerase II by hydroquinone and p-benzoquinone, reactive metabolites of benzene. Environ. Health Perspect. 104, 1265–1269.
(46) Freidig, A. P., Verhaar, H. J. M., and Hermens, J. L. M. (1999) Quantitative structure-property relationships for the chemical reactivity of acrylates and methacrylates. Environ. Toxicol. Chem. 18, 1133–1139.
(47) Uhrig, M. L., and Varela, O. (2002) Synthesis of glycosides of 3-deoxy-4-thiopentopyranosid-2-uloses and their reduction products: 3-deoxy-4-thiopentopyranosides. Carbohydr. Res. 337, 2069–2076.
