Environ. Sci. Technol. 2004, 38, 6343-6352

Deviation from Additivity with Estrogenic Mixtures Containing 4-Nonylphenol and 4-tert-Octylphenol Detected in the E-SCREEN Assay

NISSANKA RAJAPAKSE, ELISABETE SILVA, MARTIN SCHOLZE, AND ANDREAS KORTENKAMP*

Centre for Toxicology, School of Pharmacy, University of London, 29-39 Brunswick Square, London, WC1N 1AX, United Kingdom

An intriguing deviation from expected additivity is reported with mixtures containing 17β-estradiol, 17α-ethinylestradiol, genistein, bisphenol A, 4-nonylphenol, and 4-tert-octylphenol. The effect of these chemicals on the proliferation of estrogen-dependent MCF-7 human breast cancer cells (the E-SCREEN) was measured. Data variance-component analyses, carried out to optimize the assay for mixture studies, showed that between-experiment variability was the dominant source of data variation. Adoption of a data-normalization procedure reduced the impact of this variability and allowed the pooling of historical E-SCREEN data. Concentration-response relationships for all six chemicals were recorded and utilized to calculate predictions of their joint effects by employing the model of concentration addition. Surprisingly, the observed combination effects of the mixture fell short of the additivity expectations, indicating weak antagonism. Experimental or prediction errors were ruled out as possible explanations for this deviation, which suggested that it might be the result of interactions between mixture components. With the aim of identifying the responsible components, mixtures were designed by excluding one or more of the chemicals from the original six-component mixture, and the resulting combination effects were assessed. These permutation studies allowed us to conclude that the presence of 4-nonylphenol and 4-tert-octylphenol is associated with the antagonisms observed with the six-component mixture and thus negatively affected the predictability of mixture effects. Future mixture studies utilizing the E-SCREEN with endocrine disrupters that also exhibit toxicity or growth-inhibitory effects will have to take account of the possibility that such interactions might compromise the predictability of estrogenic combination effects.

* Corresponding author telephone/fax: +44 20 7753 5908; email: [email protected].
10.1021/es049681e CCC: $27.50. © 2004 American Chemical Society. Published on Web 10/28/2004.

Introduction

Although efforts to assess the effects of combinations of chemicals in toxicology and therapeutics go back a long time (1-4), the recent interest in combination effects in connection with endocrine-disrupting chemicals is unprecedented. With the identification of increasing numbers of endocrine-active chemicals, it has become clear that humans and wildlife are exposed to a multitude of these chemicals, all at rather low levels. Progress with making judgments about the risks associated with endocrine-active chemicals depends on our ability to assess and predict the effect of joint exposures.

Our group has published studies of the effects of combinations of up to 12 estrogenic chemicals at low doses, using the yeast estrogen screen (YES), an ERα reporter gene construct (5, 6). To assess whether the joint effect of estrogenic chemicals was additive in these studies, we attempted to predict mixture effects based on information about concentration-effect relationships of all individual mixture components. These data were used to calculate the expected responses of a mixture with defined mixture ratio, over a large range of responses (“fixed mixture ratio design”; 4, 7). The calculations were made by assuming that all mixture components acted in an additive fashion and that the mode of action of each mixture constituent was similar, such that each could be replaced by an equi-effective concentration of another component. This concept in assessing mixture effects, called concentration addition (CA), was developed by Loewe and Muischnek (8). The alternative assessment concept, independent action (IA), views combination effects as a stochastic process and assumes that each mixture component produces its effects independently from the responses induced by other components by a variety of (dissimilar) mechanisms (9).

Our attempts to verify the expected effects of mixtures of estrogenic chemicals revealed excellent agreement between observation and the additive combination effects predicted by CA (5, 6). However, reporter gene constructs such as the YES are only able to capture events close to receptor activation.
They are blind to other effects that might interfere with steroid receptor signaling in living organisms. It is unclear, at present, whether intervening factors operating in more complex biological systems might compromise the excellent predictability of estrogenic combination effects that is achievable in the YES. We became interested in testing this by choosing a model system representative of greater biological complexity. An obvious choice was the E-SCREEN developed by Soto and colleagues (10). It exploits the principle that MCF-7 human breast cancer cells proliferate in the presence of chemicals that bind to and activate the estrogen receptor. Importantly, the activation of signaling pathways independent of the estrogen receptor also leads to mitogenicity; thus, cell number affords an attractive, highly integrative end point suitable for the study of multiple interfering factors. The measured parameter in this assay is the number of cells yielded after a defined period of incubation. As this parameter is a composite measure of mitogenicity and cytotoxicity, any toxic or even growth-restricting influence will have a more direct impact on assay outcomes than in reporter gene assays.

The E-SCREEN is widely used as a screening tool for the identification of estrogen-like agents, and a series of papers describing its optimization for this purpose are available (10-12). However, its application in mixture studies places demands, in terms of reproducibility and quantifiability, that go far beyond the data quality sufficient for screening exercises. It was, therefore, necessary to investigate the contribution of various experimental features to overall data variation and to find ways of dealing with variability.

Previously, our group has reported on the joint effects of four organochlorine pesticides in the E-SCREEN (13). With respect to their physicochemical properties and their toxicokinetics, these four organochlorines (o,p′-DDT, p,p′-DDT, p,p′-DDE, and β-HCH) show great similarities. However, endocrine-active chemicals encompass a wide variety of chemicals with vastly differing properties. It was therefore necessary to gain further insights into the behavior of endocrine-disrupter mixtures composed of different classes of endocrine-active chemicals. Here, we present the results of studies with combinations containing 17β-estradiol (E2), 17α-ethinylestradiol (EE2), bisphenol A (BPA), 4-tert-octylphenol (4t-OP), 4-nonylphenol (4-NP), and genistein (GEN). These chemicals are able to bind to and activate the estrogen receptor (14-16), and all induce cell proliferation in the E-SCREEN. Therefore, we considered that these agents act in a similar fashion and judged the concept of IA inappropriate for an assessment of combination effects in this case. Instead, we have applied the concept of CA.

Apart from our interest in expanding the spectrum of chemicals tested as mixtures in the E-SCREEN, the choice of the six chemicals was motivated by the fact that many of them are present in U.K. rivers, particularly in those that receive effluents from sewage treatment works with industrial waste input (17). In this paper, we present detailed investigations of the variability of the E-SCREEN and its adaptation for multi-component mixture studies. We have approached this by carrying out data variance-component analyses. In our experience, controlling for inter-experimental variability proved to be of paramount importance for the implementation of mixture studies using the E-SCREEN, and we suggest procedures for dealing with this by using data normalization. Finally, we apply the optimized protocol to assess the predictability of multi-component mixtures containing estrogenic chemicals.

VOL. 38, NO. 23, 2004 / ENVIRONMENTAL SCIENCE & TECHNOLOGY

Experimental Section

Chemicals. 17β-Estradiol (E2, 99% purity) and 17α-ethinylestradiol (EE2, 98% purity) were purchased from Sigma (Dorset, U.K.); genistein (GEN, 97% purity) from Lancaster Synthesis Ltd. (Morecambe, U.K.); 4-nonylphenol (4-NP, mixture of isomers, 99% purity) from Acros Organics (Geel, Belgium); and 4-tert-octylphenol (4t-OP, 97% purity) and bisphenol A (BPA, 99+% purity) from Aldrich (Dorset, U.K.). All chemicals were used as supplied, and stock solutions (1-10 mM) were prepared in HPLC-grade ethanol (Sigma). Stock solutions and subsequent dilutions were stored at -20 °C, except for 4-NP, which was stored in amber glass bottles at room temperature and used for a maximum of four weeks.

Cell Culture. MCF-7 BOS breast cancer cells were routinely maintained in 75 cm² canted-neck tissue culture flasks (Greiner, U.K.) in Dulbecco’s modified Eagle’s medium (DMEM, Gibco, Invitrogen Corporation, U.K.) supplemented with 5% (v/v) fetal bovine serum (FBS, Gibco) and 1% (v/v) MEM nonessential amino acids (MEM-NEAA, Gibco) in a humidified incubator at 37 °C and 5% CO2. Flasks were seeded with approximately 250 000 cells, and the medium was changed every 3-4 d. Cells were subcultured when flasks reached 70% confluence (usually every 6-8 d). Conditioned cell medium was regularly tested for Mycoplasma and proved negative.

The E-SCREEN. The protocol briefly described below is largely that developed by Soto and colleagues (10). Studies were carried out in 12-well plates (Falcon, BD Biosciences, U.K.). A single-cell suspension of MCF-7 BOS cells was obtained at a density of 20 000 cells/mL DMEM by gently pulling the cell suspension through an 18-gauge needle, enumerating, and diluting accordingly. To each well, 1 mL of the cell suspension was added, and the plates were placed in the incubator to allow cells to attach for 24 ± 2 h before treating. To reduce static effects, a manual pipettor was used to seed the plates, whereas a low-ejection-force electronic pipettor was used at all other stages of the assay. Plates were stacked in the incubator, with the bottom plate of each stack not being used for treatments.

In a change from the protocol described by Soto et al. (10), the media change into experimental conditions was carried out on a plate-by-plate basis. First, once the seeding DMEM was aspirated, the attached cells were rinsed with 1 mL of phenol red-free DMEM (Gibco). The rinse medium was quickly aspirated and replaced by the experimental medium (CD-DMEM, consisting of phenol red-free DMEM supplemented with sodium pyruvate, 1% (v/v) MEM-NEAA, and 10% charcoal-dextran stripped FBS). Each plate consisted of duplicate negative controls (CD-DMEM + 0.5% ethanol; i.e., the final ethanol concentration for all treatments), duplicate positive controls (CD-DMEM + 1-10 × 10⁻⁹ M E2), and eight increasing concentrations of test chemical, all randomly located on the plate.

After a further 120 h, the assay was terminated by placing plates on ice for 1 min before removing the experimental media by rapid inversion over a waste vessel, replacing on ice, and fixing the cells for 25 min with a 10% (w/v) solution of ice-cold trichloroacetic acid (TCA). Plates were gently rinsed five times with running water, allowed to dry, and stained with 0.4% (w/v) sulforhodamine B (SRB, Sigma) in 1% (v/v) acetic acid for 10 min. Unbound dye was completely removed by rinsing with 1% (v/v) acetic acid, and bound SRB was solubilized by 10 mM Tris (1 mL, pH 10.5). Aliquots (100 µL) of the solubilized SRB were transferred into a flat-bottomed microtiter plate, and the optical density was read at 510 nm on a microtiter plate reader (Labsystems Multiskan, U.K.). We have previously established that there is a direct linear relationship between cell number and optical density (O.D.) values of the Tris-SRB solution, and experimental readings were in the linear range of the standard curve.

Estrogenicity Testing.
All single agents and mixtures were tested using eight different concentrations and four controls (two negative and two positive controls) on each 12-well plate, with duplicates and three repetitions (mixtures were only repeated once). The spacing of test concentrations was decided based on range-finding studies, such that the entire range between low and maximal effects was covered evenly. In terms of the mixtures, the tested concentrations were based on the concentration range described by the additivity prediction (see Calculation of Mixture Effect Predictions Using Concentration Addition for details).

Data Normalization. As described in detail in the Results section of this paper, there is variability of the assay outcome (cell number measurements) within one plate, between different plates of one experiment, and from experiment to experiment. A data-normalization procedure was adopted to deal with these variations, as follows: The experimental unit in the E-SCREEN is represented by the multi-well plate, and it is assumed that all the wells on the plate are affected in the same way by experimental factors, such as cell passage number, initial seeding density, etc. However, to minimize the impact of factors leading to increased intra- and inter-experimental variations, normalization of the raw O.D. values was carried out on a plate-by-plate basis (eq 1):

$$\text{normalized proliferation} = \frac{\text{O.D.}_{\text{test}} - \text{average O.D.}_{\text{neg}}}{\text{average O.D.}_{\text{pos}} - \text{average O.D.}_{\text{neg}}} \tag{1}$$

where the raw effect data for the negative controls (O.D.neg) of each plate (n = 2) were averaged, and this average was subtracted from the whole plate’s raw effect data (O.D.test). To scale the data to an effect range between 0% (ethanol controls) and 100% (maximal effect seen with saturating concentrations of E2), the resulting values were divided by the average of the positive controls (O.D.pos; cells treated with 1-10 × 10⁻⁹ M E2), from which the negative-control average had likewise been subtracted, as in the denominator of eq 1. There were two positive controls on every plate.
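The plate-by-plate normalization of eq 1 can be illustrated with a minimal Python sketch. The function name `normalize_plate` and all optical density readings below are hypothetical and chosen only for illustration; they are not data from this study, and the original computations were not carried out in Python.

```python
from statistics import mean


def normalize_plate(od_test, od_neg, od_pos):
    """Apply eq 1 to the raw optical densities of a single plate.

    od_test: raw O.D. readings of the treated wells
    od_neg:  raw O.D. readings of the negative (ethanol) controls
    od_pos:  raw O.D. readings of the positive (E2) controls
    """
    neg = mean(od_neg)
    pos = mean(od_pos)
    span = pos - neg
    if span <= 0:
        raise ValueError("positive controls must exceed negative controls")
    # 0 corresponds to the ethanol controls, 1 to the E2 controls of this plate
    return [(od - neg) / span for od in od_test]


# Hypothetical readings for one 12-well plate (illustrative values only)
neg_controls = [0.20, 0.22]
pos_controls = [1.00, 1.04]
test_wells = [0.21, 0.30, 0.45, 0.61, 0.80, 0.95, 1.02, 1.03]

normalized = normalize_plate(test_wells, neg_controls, pos_controls)
```

Because only the controls of the same plate enter the calculation, a well whose reading equals the plate's negative-control average maps to 0 and one that equals the positive-control average maps to 1, whatever the absolute O.D. level of that plate.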

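TABLE 1. Regression Models

| name | function (F)^a | inverse function (F^-1) |
|---|---|---|
| Generalized Logit (GL) | $E = \theta_{\min} + (\theta_{\max} - \theta_{\min})/(1 + \exp(-K))^{\theta_3}$ | $c = 10^{(-\log_e((1/M)^{1/\hat{\theta}_3} - 1) - \hat{\theta}_1)/\hat{\theta}_2}$ |
| Weibull (W) | $E = \theta_{\min} + (\theta_{\max} - \theta_{\min}) \cdot (1 - \exp(-\exp(K)))$ | $c = 10^{(\log_e(-\log_e(1 - M)) - \hat{\theta}_1)/\hat{\theta}_2}$ |
| Box-Cox-Logit (BL) | $E = \theta_{\min} + (\theta_{\max} - \theta_{\min})/(1 + \exp(-L))$ | $c = (1 + (-\log_e((1/M) - 1) - \hat{\theta}_1) \cdot (\hat{\theta}_3/\hat{\theta}_2))^{1/\hat{\theta}_3}$ |
| Box-Cox-Weibull (BW) | $E = \theta_{\min} + (\theta_{\max} - \theta_{\min}) \cdot (1 - \exp(-\exp(L)))$ | $c = (1 + (\log_e(-\log_e(1 - M)) - \hat{\theta}_1) \cdot (\hat{\theta}_3/\hat{\theta}_2))^{1/\hat{\theta}_3}$ |

with $K = \theta_1 + \theta_2 \log_{10}(c)$, $L = \theta_1 + \theta_2((c^{\theta_3} - 1)/\theta_3)$, and $M = (E - \hat{\theta}_{\min})/(\hat{\theta}_{\max} - \hat{\theta}_{\min})$.

a E, effect, expressed as a fraction of a maximum possible effect (0 ≤ E ≤ 1). c, concentration. θ1, θ2, θ3, θmin, and θmax, model parameters (corresponding statistical estimates marked by ^).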
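As an illustration of how the regression functions of Table 1 relate to their inverses, the following Python sketch implements the four models and solves each for the concentration c. The function names `effect` and `effect_concentration` are hypothetical, and the code is not the SAS implementation used in this work. The inverses are written as they follow from solving F for c. The parameter values in the usage lines are the estimates listed for 17β-estradiol and 4t-OP in Table 2, with concentrations in nM.

```python
import math


def _k(c, t1, t2):
    # K = theta1 + theta2 * log10(c)
    return t1 + t2 * math.log10(c)


def _l(c, t1, t2, t3):
    # L = theta1 + theta2 * (c**theta3 - 1) / theta3  (Box-Cox transformation)
    return t1 + t2 * (c ** t3 - 1.0) / t3


def effect(model, c, t1, t2, t3=None, tmin=0.0, tmax=1.0):
    """Effect E at concentration c for the models GL, W, BL and BW."""
    if model == "GL":
        core = 1.0 / (1.0 + math.exp(-_k(c, t1, t2))) ** t3
    elif model == "W":
        core = 1.0 - math.exp(-math.exp(_k(c, t1, t2)))
    elif model == "BL":
        core = 1.0 / (1.0 + math.exp(-_l(c, t1, t2, t3)))
    elif model == "BW":
        core = 1.0 - math.exp(-math.exp(_l(c, t1, t2, t3)))
    else:
        raise ValueError(f"unknown model: {model}")
    return tmin + (tmax - tmin) * core


def effect_concentration(model, e, t1, t2, t3=None, tmin=0.0, tmax=1.0):
    """Concentration expected to cause effect e (functional inverse of F)."""
    m = (e - tmin) / (tmax - tmin)
    if not 0.0 < m < 1.0:
        raise ValueError("effect level lies outside the fitted effect range")
    if model == "GL":
        return 10 ** ((-math.log((1.0 / m) ** (1.0 / t3) - 1.0) - t1) / t2)
    if model == "W":
        return 10 ** ((math.log(-math.log(1.0 - m)) - t1) / t2)
    if model == "BL":
        y = -math.log(1.0 / m - 1.0)
    elif model == "BW":
        y = math.log(-math.log(1.0 - m))
    else:
        raise ValueError(f"unknown model: {model}")
    return (1.0 + (y - t1) * t3 / t2) ** (1.0 / t3)


# 17beta-estradiol: Box-Cox-Logit with an estimated upper asymptote of 1.06
ec50_e2 = effect_concentration("BL", 0.5, 7.748, 3.617, 0.323, 0.0, 1.06)
# 4t-OP: Weibull with fixed asymptotes 0 and 1
ec50_4top = effect_concentration("W", 0.5, -3.094, 1.340)
```

With the rounded parameter estimates, both values should come out close to the EC50 entries of Table 2 (0.0237 nM and 108.5 nM); small differences are to be expected from rounding. Effect levels at or above a chemical's estimated θmax have no inverse, which the sketch reports as an error.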

TABLE 2. Estrogenicity and Regression Parameters of Single Agents and Mixtures

Columns RM through θ̂max give the model and estimated parameters^b.

| substance^a | RM | θ̂1 | θ̂2 | θ̂3 | θ̂min | θ̂max | EC50 (nM [CI])^c | EC10 (nM [CI])^d |
|---|---|---|---|---|---|---|---|---|
| EE2 | BW | 4.664 | 1.726 | 0.203 | 0* | 1* | 0.0122 [0.0086-0.0169] | 0.0003 [0.0001-0.0005] |
| 17β-estradiol | BL | 7.748 | 3.617 | 0.323 | 0* | 1.06 | 0.0237 [0.0212-0.0269] | 0.0010 [0.0007-0.0012] |
| 4t-OP | W | -3.094 | 1.340 | | 0* | 1* | 108.5 [72.7-163.4] | 4.26 [2.46-7.73] |
| genistein | GL | -13.33 | 5.181 | 0.274 | 0* | 1* | 126.4 [110.0-150.0] | 8.96 [6.99-11.63] |
| nonylphenol | BL | -3.345 | 0.394 | 0.170 | 0* | 0.90 | 242 [161-387] | 13 [7.7-26] |
| bisphenol A | BW | -3.355 | 0.168 | 0.284 | 0* | 0.74 | 881 [671-1096] | 75 [50-117] |
| mixture 1^e | GL | -8.195 | 2.743 | 0.459 | 0.002 | 1* | 336 [284-415] | 14.1 [11.4-17.0] |
| mixture 2^e | BL | -3.486 | 0.343 | 0.161 | 0* | 1* | 413 [329-509] | 18.9 [12.9-26.3] |
| mixture 3^e | BL | -3.067 | 0.488 | 0.248 | 0* | 1* | 44.3 [32.2-58.0] | 4.37 [2.58-7.42] |
| mixture 4^e | BL | -3.220 | 0.367 | 0.139 | 0* | 1* | 308 [239-400] | 10.51 [6.92-16.58] |

a Listed in order of EC50 values. EE2, 17α-ethinylestradiol; 4t-OP, 4-tert-octylphenol. b RM indicates the mathematical regression function used for describing the concentration-response relationships (W, Weibull; GL, Generalized Logit; BW, Box-Cox-Weibull; BL, Box-Cox-Logit), mathematical formulations of which are given in Table 1. Parameters that were fixed, i.e., not estimated, are indicated by an asterisk (*). c Concentration provoking 50% maximal effect of E2, with upper and lower limits of the approximate 95% confidence interval based on bootstrap replicates in brackets. d Concentration provoking 10% maximal effect of E2, with upper and lower limits of the approximate 95% confidence interval based on bootstrap replicates in brackets. e Mixtures 1-4 are six-, five-, three-, and four-component mixtures, respectively; composition defined in Table 4.

As a result of this normalization, it became possible to pool data from different plates, both within the same experiment and from different experiments, before deriving reliable concentration-response functions for the single agents. One consequence of this procedure is that the maximal cell proliferation induced by a test chemical is expressed relative to the maximal effect seen with E2.

Biometrical Concentration-Response Analyses. Concentration-response relationships were determined using the best-fit approach described by Scholze et al. (18). Fourteen different nonlinear regression models were fitted to each data set. By using a robust goodness-of-fit criterion, the best-fitting model was selected. As a common feature, all models have parameters that describe a lower and an upper effect asymptote (θmin, θmax). These parameters were set to fixed values for normalized effects (i.e., θmin = 0 and θmax = 1) when no significant improvement was achieved by the application of estimated values deviating from 0 and 1, respectively. In all other cases, the best-fitting model with estimated asymptotes was chosen. Four of the 14 regression models proved to provide the best possible description of the concentration-response relationship, and these are given in Table 1. Effect concentrations (ECx) were calculated from the functional inverse F^-1 of the best-fitting model (Table 1). These values give concentrations (c) that are expected to cause a defined effect (E). Confidence intervals (CI) for effect concentrations were estimated by using the bootstrap approach described in Scholze et al. (18).

Mean effects and effect concentrations of individual substances are subject to a stochastic variability. Consequently, the calculation of a prediction according to CA has to give a mean that is also affected by statistical uncertainty. This uncertainty was quantified by determining approximate central 95% confidence intervals for mean predicted effect concentrations using the bootstrap method. The bootstrap samples were generated on the basis of the effect distributions that were estimated within the fitting process for every individual concentration-response function (parametric bootstrap). All computations were programmed using SAS (1996).

Calculation of Mixture Effect Predictions Using Concentration Addition. When predicting the estrogenicity of combinations of chemicals, specific assumptions are made about the quantitative relationships between the estrogenicity of single substances and those of mixtures. For a multi-component mixture of n chemicals, the concept of CA (8) states

$$\sum_{i=1}^{n} \frac{c_i}{EC_{xi}} = 1 \tag{2}$$

where ci are the concentrations of the individual substances present in a mixture with a total effect of X, and ECxi are the concentrations of the single substances that produce the same effect X on their own. If eq 2 holds true, a mixture component can be replaced totally or in part by an equal fraction of an equi-effective concentration of another, without altering the overall effect of the mixture. On the basis of the concentration-response functions of single xenoestrogens and E2 (Table 2), predictions of effect concentrations had to be calculated for mixtures containing all components at defined mixture ratios according to eq 2. To achieve this, eq 2 had to be rearranged, as follows: The concentrations of individual mixture compounds ci were replaced by the relative proportions (pi) of the total mixture concentration cmixture (i.e., ci = pi × cmixture). Assuming that the mixture concentration cmixture produces an effect of X, the corresponding effect concentration (ECX(mixture)) is given as

$$EC_{X(\text{mixture})} = \left( \sum_{i=1}^{n} \frac{p_i}{EC_{xi}} \right)^{-1} \tag{3}$$


FIGURE 1. Plate-to-plate variability, between-experiment variability, and normalization of the absolute effect scale. Concentration-effect curves for E2 from two independent experiments (A) based on raw optical density values (black and red line) and (B) following a normalization based on negative and positive controls. The black and red box and whisker plots show the data variability of positive controls between plates for respective experiments (1.26 nM E2). The bottom and top edges of the box are located at the sample 25th and 75th percentiles, the center horizontal at the median, and the whiskers to the 5th and 95th percentiles. Within-plate variability is indicated by the blue box and whisker plot (A). It was derived by treating all wells in one plate with the same concentration of E2. For clarity, only the plate-to-plate variability from one experiment is shown (B) (blue box and whisker plot).

Equation 3 allows an explicit calculation of any effect concentration of a mixture under the hypothesis of CA. The only prerequisites are knowledge about individual effect concentrations (ECxi) and the relative concentrations (pi) of the mixture components. Effect concentrations for mixtures (ECX(mixture)) denote the mixture concentrations that produce a given quantitative effect X. However, the effect range for X is limited: Equation 3 can only be used when it is possible to determine for each mixture compound a reliable estimate of a concentration that would produce the same effect when applied on its own (ECX(i)). Table 2 shows that the model estimates θmax for maximal effects of all tested chemicals differ, with BPA producing the lowest maximal effect (0.74) relative to E2. Thus, concentrations of BPA yielding effects higher than 0.74 cannot be estimated, and mixture concentrations corresponding to effects exceeding 0.74 were impossible to calculate. Furthermore, in the interest of achieving predictions of low statistical uncertainty, we did not calculate mixture concentrations corresponding to effect levels lower than 0.1. This restricted the range of predicted effects to between 0.1 and 0.74 (10 and 74%). Graphs of predicted concentration-response curves were obtained by calculating numerous ECX(mixture) values corresponding to predetermined effect levels X from 10% to 74%, in steps of ≤1%.
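The calculation in eq 3 can be sketched in a few lines of Python. The function name `ca_effect_concentration` is hypothetical, and the code is an illustration rather than the SAS program used in this work. As inputs, the usage lines take the rounded EC50 values of the six single agents from Table 2 and the rounded relative proportions of the six-component mixture (mixture 1) from Table 4, both in the order EE2, E2, 4t-OP, GEN, 4-NP, BPA.

```python
def ca_effect_concentration(proportions, ecx):
    """Eq 3: mixture concentration expected to produce effect X under CA.

    proportions: relative proportions p_i of the components (any common scale;
                 they are rescaled here so that they sum to 1)
    ecx:         concentrations EC_xi at which each component alone produces X
    """
    if len(proportions) != len(ecx):
        raise ValueError("one effect concentration is needed per component")
    total = sum(proportions)
    return 1.0 / sum((p / total) / ec for p, ec in zip(proportions, ecx))


# Rounded EC50 values (nM) from Table 2: EE2, E2, 4t-OP, GEN, 4-NP, BPA
ec50 = [0.0122, 0.0237, 108.5, 126.4, 242.0, 881.0]
# Rounded relative proportions (%) of mixture 1 from Table 4, same order
mixture_1 = [0.0014, 0.0027, 12.98, 12.69, 19.94, 54.39]

ec50_mixture = ca_effect_concentration(mixture_1, ec50)

# Check against eq 2: at the predicted concentration, the ratios c_i / EC_xi
# of all components should add up to 1.
total = sum(mixture_1)
ratio_sum = sum((p / total) * ec50_mixture / ec for p, ec in zip(mixture_1, ec50))
```

For these rounded inputs the sketch gives roughly 169 nM as the predicted EC50 of mixture 1, compared with an observed EC50 of 336 nM in Table 2. Repeating the calculation with ECx values at other effect levels between 10% and 74% traces out the predicted concentration-response curve.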

Results

For the calculation of expected mixture effects, multi-component mixture studies have to rely on data about concentration-effect relationships of single chemicals. Inevitably, therefore, there is a time lag between experiments with single chemicals and the final mixture experiments. Practical considerations also force the experimenters into conducting experiments consecutively. The sheer number of samples in multi-component mixture studies often makes it impractical to set up and run all single chemicals and mixtures in parallel. Thus, because of the need to rely on historical data, dealing with experimental variations becomes an essential issue in the design and conduct of multi-component mixture experiments. Their success depends on high reproducibility, defined as the ability of a test system to repeat an experimental outcome without systematic error. Ideally, experimental results should only be affected by random errors, and these should be small. This is of great importance for the quality of mixture effect predictions, which should also be free of systematic errors and only be affected by random variations, which can be dealt with by statistical assessment.

As an integral part of our multi-component mixture studies, it became necessary to assess the reproducibility of the E-SCREEN and to analyze potential sources of systematic error in order to define limitations of mixture effect predictions. We have approached this by carrying out data variance-component analyses, where we considered the contribution of experimental features to overall variation in our data. For the E-SCREEN, this meant analyzing data variability within the plates, the variability between the plates, and finally the data variation between independent experiments.

Variation in the E-SCREEN. Extensive studies with E2 were carried out. Data variation within plates was relatively small. The blue box and whisker plot in Figure 1A shows the variation of raw, unadjusted readings of optical density (number of wells = 10) seen with cultures exposed to 1.26 nM E2 (positive controls) in one representative 12-well plate (within-plate variability). The range of all coefficients of variation was between 1.5 and 14% (Table 3). For two independent experiments, variability of readings obtained from the positive controls on separate 12-well plates is shown in the red and black box and whisker plots in Figure 1A (in each case, number of plates = 12) (between-plate variability).
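The coefficients of variation reported in Table 3 are defined there as the ratio of the standard deviation to the mean, expressed as a percentage. The short Python sketch below illustrates that definition for within-plate and between-plate variability. The readings are hypothetical, the sample standard deviation (n - 1) is an assumption, and treating between-plate variability as the CV of the plate means is a simplification made here for illustration; the variance-component analysis of the study itself may have been set up differently.

```python
from statistics import mean, stdev


def cv_percent(values):
    """Coefficient of variation: sample standard deviation / mean, in percent."""
    return 100.0 * stdev(values) / mean(values)


# Hypothetical raw O.D. readings of positive-control wells on three plates
# of one experiment (illustrative values only, not data from the paper)
positive_controls = {
    "plate_1": [1.00, 1.05, 0.98, 1.02],
    "plate_2": [1.20, 1.18, 1.25, 1.22],
    "plate_3": [0.90, 0.95, 0.93, 0.91],
}

# Within-plate variability: one CV per plate
within_plate = {name: cv_percent(wells) for name, wells in positive_controls.items()}

# Between-plate variability (assumption: CV of the plate means)
between_plate = cv_percent([mean(wells) for wells in positive_controls.values()])
```

In this made-up example the wells of each plate agree closely with one another, while the plate means differ more strongly, which is the pattern described for the raw readings.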
FIGURE 2. Concentration-response data and curves for E2 and the five xenoestrogens. The best-fitting regression models (see Table 2) are shown as red lines with the corresponding 95% confidence belt for the mean effect as dotted red lines.

TABLE 3. Data Variability for Positive Controls (1.2 nM E2) in the E-SCREEN

| variability source | CV (%)^a for raw absorbances^b | CV (%)^c for normalized cell proliferation^d |
|---|---|---|
| within-plate | 1.5-13.3 | 1.5-14.8 |
| between-plate^e | 5.7-17.9 | 5.8-12.3 |
| inter-experiment | ≈60 | 10.2 |

a CV, coefficient of variation; defined as the ratio of the standard deviation to the mean expressed as a percentage. b Optical density values for SRB absorbance at 510 nm. c Corresponds to the standard deviation as the mean of the normalized positive control is always 1. d Normalization described in Experimental Section. e Variability between plates within the same experiment.

Generally, between-plate variability was higher than that observed within plates but was found to depend on the magnitude of effect. It was higher when the effect levels were large (black box and whisker plot in Figure 1A).

Experiment-to-experiment variation of unadjusted readings was the largest when compared to within-plate and between-plate variability (Table 3). To illustrate this, we have selected two different experimental runs, which represent the extremes of variation seen in our laboratory. They are shown in Figure 1A as black and red regression lines, with corresponding box and whisker plots. In both cases, the regression lines were derived from eight tested concentrations of E2, placed in one 12-well plate. On each occasion, the same numbers of positive and negative controls were run in 12 parallel plates. As can be seen in Figure 1A, in one case the mean of unadjusted positive control readings came to lie outside of the plateau of the regression line observed within the same experiment (black line and box and whisker plot), while on a different occasion the mean of most positive controls corresponded well with the plateau of the regression model (red line and red box and whisker plot). Between-experiment variability is obviously the dominant source of data variation in the E-SCREEN, as indicated by the “worst-case” situation of the two concentration-response curves for E2. In this example, the maximal effects described by the regression models differed by a factor of about 5. We regard differences in seeding density as the most likely source of experiment-to-experiment variation. It is obvious that recourse to historical data cannot be made without normalization of raw readings. The question arises as to which normalization procedure is appropriate for dealing with E-SCREEN data.

Data Normalization. First, we considered the feasibility of using the mean of positive controls of all plates from the same experiment as a way of dealing with variability.
However, pooling data from different plates in this way can yield a mean value for the positive controls that does not reflect the control situation within the individual plates, as demonstrated by the box and whisker plots in Figure 1A. Instead, we adopted the procedure described in the Experimental Section, where normalization is strictly in relation to the positive and negative controls of each plate. To assess the impact of this adjustment on the resulting effect data, the raw data shown in Figure 1A were normalized and replotted (Figure 1B). After normalization to the controls in each plate, the regression models derived from the two experiments in Figure 1A became virtually indistinguishable. Within-plate variability of the normalized data was relatively low, as shown by the blue box and whisker plot (n = 10), which corresponds to the data represented by the blue box and whisker plot in Figure 1A. For the maximal effect, the average standard deviation was below 10%. Thus, our procedure proved effective in dealing with inter-experimental variation and allows data from separate independent experiments to be pooled. The introduction of strong bias is avoided by only utilizing control values within the same plate.

FIGURE 3. Concentration-response curves for E2 and the five xenoestrogens, showing the best-fitting models for E-SCREEN data. Test agents: (1) EE2; (2) E2; (3) GEN; (4) 4t-OP; (5) 4-NP; and (6) BPA.

Concentration-Response Analyses for Six Estrogenic Chemicals. In addition to the steroidal estrogens E2 and EE2, concentration-response relationships were recorded for GEN, BPA, 4t-OP, and 4-NP using the normalization procedure described above (Figure 2). The data supporting the concentration-response functions in Figure 2 are duplicates from three independent experiments. The model parameters derived for each chemical are summarized in Table 2. The concentrations required to induce a 10% and a 50% increase in cell numbers relative to (positive and negative) controls (EC10 and EC50, respectively) are provided alongside the upper and lower limits of the 95% confidence intervals, demonstrating a low uncertainty in the curve estimates. The narrow confidence intervals are mainly due to the pooling of data sets from three experiments, which results in a high number of observations on which to base the fitting process. The maximum proliferative effect of the phenolic chemicals BPA and 4-NP was approximately 75% and 90%, respectively, of that provoked by the other tested chemicals (Table 2). The phenolic compounds, especially 4t-OP, also demonstrated the most pronounced variation over time if stock solutions were used for more than 4 weeks (data not shown). Similar to previous reports in the YES (19), there was a “creeping” effect with 4t-OP, which may be attributed to its volatility under the experimental conditions of the E-SCREEN. As recommended by Beresford et al. (19), we dealt with this phenomenon by ensuring that cultures treated with high concentrations were placed in separate 12-well plates.
The EC50 values obtained in our studies are slightly higher than those reported previously in the literature (10, 11, 15), although the differences are small and relative proliferative potencies are comparable. For clarity, we have collated the six concentration-response curves in Figure 3.

Mixture Estrogenicity and Predictability of Combination Effects. In the first instance, we composed a six-component mixture consisting of E2, EE2, BPA, GEN, 4-NP, and 4t-OP. The mixture ratio (Table 4) was designed such that each chemical was present in approximate proportion to its individual estimated EC50 value, derived from the best-fit regression functions. Knowledge of the concentration-response functions for the individual mixture constituents as well as the mixture ratio allowed the estrogenicity of the mixture to be predicted using CA (Figure 4A). When tested experimentally, the best-fit regression model was determined as that which described the observed mixture effect data the best (Table 2).

As shown in Figure 4A, observed mixture responses spanned the entire 0-100% effect range, and the observed data variability was comparable to the variation obtained for single effect data.

TABLE 4. Composition of Tested Mixtures: Relative Proportions (Percentages)^a

| components^b | mixture 1 (six-component) | mixture 2 (five-component) | mixture 3 (three-component) | mixture 4 (four-component) |
|---|---|---|---|---|
| 17α-ethinylestradiol | 0.0014 | 0.0016 | 0.0096 | 0.0012 |
| 17β-estradiol | 0.0027 | 0.0030 | 0.0187 | 0.0024 |
| 4-tert-octylphenol | 12.98 | 14.87 | | |
| genistein | 12.69 | | 99.97 | 12.55 |
| nonylphenol | 19.94 | 22.83 | | |
| bisphenol A | 54.39 | 62.29 | | 87.45 |

a Rounded values given for relative proportions. b In order of increasing EC50 values.

FIGURE 4. Predicted and observed mixture effects of equipotent mixtures with six components (A), five components (B), three components (C), and four components (D). Observed mixture effects (circles) are from two independent mixture experiments. The best-fitting regression models are shown as red lines with the corresponding 95% confidence belt for the mean effect as dotted red lines. Predicted effects were calculated using the model of CA (solid blue lines), with dotted blue lines indicating the corresponding approximate 95% confidence belts.

The comparative assessment between

TABLE 5. Statistical Uncertainty of Predicted and Observed Effect Concentration ECx (mixture) (nM) effect concentration ECxmix (nM) predicted by CAa effect level x

observed

mean

95% CIb

mean

95% CI

6.82 64.5 170 367

4.0-8.72 55.6-73.7 147-190 299-423

14.1 114 336 847

11.4-17.0 97-136 284-415 672-1107

6.60 66.5 179 396

3.71-8.59 55.8-77.8 150-204 311-474

18.9 149 413 989

12.9-26.3 115-188 329-509 731-1380

1.50 16.7 42.2 83

0.75-2.04 13.5-19.5 36.8-48.0 70-97

4.37 21.1 44.3 83

2.58-7.42 14.2-30.0 32.2-58.0 57-122

10.46 104 252 503

5.48-13.85 87-118 222-280 422-690

10.51 100.4 308 815

6.92-16.58 74.1-136.3 239-400 601-1119

six-componentc

mixture 1: 10% 30% 50% 70% mixture 2: five-componentc 10% 30% 50% 70% mixture 3: three-componentc 10% 30% 50% 70% mixture 4: four-componentc 10% 30% 50% 70% a

CA, concentration addition.

b

CI, confidence interval. c Mixture ratio as defined in Table 4.

observed and predicted mixture effects showed clearly that the observed effect data were lower than the expected combination effects. To provoke a particular effect, higher concentrations of the mixture of estrogenic chemicals were required than predicted on the basis of CA. At no point was there an overlap between the 95% confidence interval for observations with the bootstrap fitted to the prediction, which indicates that this difference is statistically significant. The deviation from expected additivity became more pronounced at high effect levels. We speculated that one or more mixture components might interact with each other to produce the observed deviations from expected concentration additivity. In an effort to trace such interactions to specific chemicals, we began removing components from the original six-component mixture, with a view to assessing whether the observed antagonism might disappear. Due to its wide spectrum of biological and toxic effects, we first suspected GEN as the cause of the observed deviations. Thus, we created a fivecomponent mixture by removing the phytoestrogen from the previously tested six-component mixture. Again, the mixture ratio was proportional to the individual agent’s EC50 values (Table 4). The deviation from expected concentration additivity became even more pronounced than with the sixcomponent mixture (Figure 4B). Comparing the EC50 values of the respective mixtures, the ratio between the observed and predicted estimate increased from 2 for the sixcomponent mixture to approximately 2.3 for the fivecomponent mixture (Table 5). Considering the reported cytotoxicity of the phenolic chemicals present in our mixtures (20, 21), we decided to omit all of these chemicals and studied the effects of a threecomponent mixture containing E2, EE2, and GEN (for mixture ratios see Table 4). 
As shown in Figure 4C, the joint effects of this mixture were essentially concentration additive, with good agreement between prediction and observation over the entire range of effects. Finally, we tested the influence of the least experimentally variable phenolic chemical (BPA) on the combined effects of E2, EE2, and GEN and created a four-component mixture of E2, EE2, GEN, and BPA (for mixture ratios see Table 4). Again, the CA prediction agreed very well with the observed combined effects (Figure 4D).
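The comparison of predicted and observed effect concentrations described above can be retraced from the rounded values in Table 5. The following is a minimal sketch, not the authors' analysis code: the dictionary layout and the function names are illustrative assumptions, and only the numbers are taken from the table.

```python
# Mean effect concentrations (nM) with 95% CI bounds, transcribed from Table 5.
# Layout (an assumption of this sketch): mixture -> effect level (%) -> (mean, ci_low, ci_high).
PREDICTED_CA = {
    "six": {10: (6.82, 4.0, 8.72), 30: (64.5, 55.6, 73.7),
            50: (170, 147, 190), 70: (367, 299, 423)},
    "five": {10: (6.60, 3.71, 8.59), 30: (66.5, 55.8, 77.8),
             50: (179, 150, 204), 70: (396, 311, 474)},
    "three": {10: (1.50, 0.75, 2.04), 30: (16.7, 13.5, 19.5),
              50: (42.2, 36.8, 48.0), 70: (83, 70, 97)},
    "four": {10: (10.46, 5.48, 13.85), 30: (104, 87, 118),
             50: (252, 222, 280), 70: (503, 422, 690)},
}
OBSERVED = {
    "six": {10: (14.1, 11.4, 17.0), 30: (114, 97, 136),
            50: (336, 284, 415), 70: (847, 672, 1107)},
    "five": {10: (18.9, 12.9, 26.3), 30: (149, 115, 188),
             50: (413, 329, 509), 70: (989, 731, 1380)},
    "three": {10: (4.37, 2.58, 7.42), 30: (21.1, 14.2, 30.0),
              50: (44.3, 32.2, 58.0), 70: (83, 57, 122)},
    "four": {10: (10.51, 6.92, 16.58), 30: (100.4, 74.1, 136.3),
             50: (308, 239, 400), 70: (815, 601, 1119)},
}


def observed_to_predicted_ratio(mixture, level):
    """Ratio of the observed to the CA-predicted mean effect concentration."""
    return OBSERVED[mixture][level][0] / PREDICTED_CA[mixture][level][0]


def intervals_overlap(mixture, level):
    """True if the predicted and observed 95% CIs share at least one value."""
    _, p_lo, p_hi = PREDICTED_CA[mixture][level]
    _, o_lo, o_hi = OBSERVED[mixture][level]
    return p_lo <= o_hi and o_lo <= p_hi


if __name__ == "__main__":
    for mixture in PREDICTED_CA:
        ratio = observed_to_predicted_ratio(mixture, 50)
        overlap = intervals_overlap(mixture, 50)
        print(f"{mixture}-component: EC50 ratio {ratio:.2f}, CI overlap: {overlap}")
```

With the rounded table entries, the EC50 ratios come out at roughly 2 for the six-component and 2.3 for the five-component mixture, and the intervals of these two mixtures do not overlap at any tabulated effect level. At the 50% level the intervals of the three- and four-component mixtures do overlap.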

Discussion

For the first time, we report a deviation from expected concentration additivity with six- and five-component mixtures of various estrogenic chemicals. The deviations were relatively small but statistically significant. CA underestimated the observed EC50 values by a factor of around 2 (1.97 for the six- and 2.3 for the five-component mixture). Given that EC50 values for the same agent reported in the literature can vary by a factor of 10 or more (18), these differences may be judged irrelevant. However, from a conceptual and theoretical perspective, they deserve attention.

It is conceivable that the observed deviations, rather than reflecting a biological phenomenon, were the result of systematic experimental errors affecting either the predictions or the measured joint effects. This, however, is highly unlikely because no differences could be detected between the three repetitions of the concentration-response relationships of the individual components or the mixtures. Furthermore, the model of CA becomes very robust against prediction errors provided that there are sufficient numbers of mixture components and that none of the constituents contributes disproportionately to the overall mixture effect. This is because the predicted effect concentrations correspond to the weighted harmonic mean of all individual effect concentrations (eq 3). Thus, we maintain that the additivity predictions for the six- and five-component mixtures were correct and that the observed mixture effects were not influenced by systematic errors.

Although CA has been proposed as the generally applicable concept for estimating additive mixture effects (22), we next considered whether the basic assumption underlying the concept of CA, namely, that all mixture components show similar modes of action, was fulfilled in our case. All six estrogens have previously been shown to bind and activate the ERα (14-16).
This, together with the largely parallel nature of the concentration-response functions in the E-SCREEN (Figure 3), supports the notion that similar mechanisms of action are involved. Other laboratories have reported that GEN induced little or no cell proliferation and in fact exhibited an antiestrogenic effect at higher concentrations, but clearly, this cannot explain our observed deviations because GEN was absent from the five-component mixture. In conclusion, there are insufficient data to suggest that the mitogenic effects of the six estrogens tested in our mixtures do not arise via the same molecular pathway. The available evidence overwhelmingly indicates that the similarity assumptions of CA are met.

Since systematic errors in effect measurement or calculation of predicted effects as well as inappropriate use of CA can all be ruled out as explanations for our findings, we are forced to conclude that the deviations from additivity are the result of interactions between certain mixture components. Here, we use the term “interactions” in the sense of interferences with toxicokinetics or toxicodynamics but do not imply direct interactions between the chemicals themselves (e.g., through chemical reactivity). Our permutation analyses strongly support this idea and allow us to narrow the cause of the observed interactions to 4-NP and 4t-OP, for the following reasons: The short-fall from expected additivity became even more pronounced when GEN was removed from the six-component mixture. This indicated that GEN itself was not involved in the interactions leading to antagonism. Rather, GEN acted additively with all mixture components, except 4-NP and 4t-OP. This was demonstrated by the good agreement between observed mixture effects and predicted concentration additivity observed with the three- and four-component mixtures involving the phytoestrogen and E2, EE2, and BPA. It seems that the removal of GEN from the six-component mixture had made the impact of 4-NP and 4t-OP more prominent. Conversely, the presence of GEN in the six-component mixture may have led to a “diluting out” of the effect of 4-NP and 4t-OP, which gave rise to the deviation. All these findings strongly point to 4-NP and 4t-OP as being associated with the antagonisms observed with the six- and five-component mixtures.
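The weighted-harmonic-mean form of CA invoked in the robustness argument above can be illustrated with a short sketch. It assumes the standard CA expression, in which the mixture ECx is the inverse of the sum of p_i / ECx_i over all components (p_i being the fraction of component i in the mixture). The function names and the unit effect concentrations are hypothetical and are not taken from Table 2.

```python
def ca_effect_concentration(fractions, ecx):
    """Mixture ECx under concentration addition: 1 / sum(p_i / ECx_i)."""
    if abs(sum(fractions.values()) - 1.0) > 1e-6:
        raise ValueError("mixture fractions must sum to 1")
    return 1.0 / sum(p / ecx[name] for name, p in fractions.items())


def equipotent_fractions(ec50):
    """Mixture ratio in which each component is present in proportion to its EC50."""
    total = sum(ec50.values())
    return {name: value / total for name, value in ec50.items()}


def prediction_shift(n_components, error_factor):
    """Factor by which the CA prediction changes when the ECx of one of
    n equipotent components is overestimated by error_factor.
    Effect concentrations are hypothetical unit values."""
    true_ecx = {f"agent_{i}": 1.0 for i in range(n_components)}
    fractions = equipotent_fractions(true_ecx)
    wrong_ecx = dict(true_ecx)
    wrong_ecx["agent_0"] = error_factor
    return (ca_effect_concentration(fractions, wrong_ecx)
            / ca_effect_concentration(fractions, true_ecx))


if __name__ == "__main__":
    for n in (2, 3, 6):
        print(n, round(prediction_shift(n, 2.0), 3))
```

In this sketch, a twofold error in a single component shifts the predicted mixture ECx by a factor of 1.2 with three equipotent components and by about 1.09 with six, which is the sense in which the prediction becomes more robust as the number of evenly contributing components grows.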
Having narrowed the nature of the chemicals behind the antagonisms in our mixtures, the question arises as to the biological basis for the deviations from additivity with the mixtures containing 4-NP and 4t-OP. We suggest that a consideration of the features of the E-SCREEN holds the key to an answer to this question. The end point evaluated (i.e., cell numbers yielded within a constant time period) is a composite measure of two diametrically opposed cellular responses to treatment with estrogenic chemicals (i.e., cell division and toxicity). Even if it is accepted that the assumption of similarity in mode of action is valid for mitogenic effects, it is still conceivable that the toxicity component involves a variety of mechanisms, perhaps better described by IA. The CA concept may be unable to capture such complexities. In support of this line of argumentation, we have observed a marked decrease in cell numbers at higher concentrations of all phenolic chemicals. Due to the effect parameter measured, this is the only way toxicity can become apparent in the E-SCREEN. However, this does not preclude that toxic or growth-restricting influences are in operation at lower concentrations. Unfortunately, it is difficult to assess such effects by direct measurement because they cannot be dissociated easily from measurement of cell numbers in the E-SCREEN, for technical reasons. Toxicity has also been observed by others with the phenolic mixture constituents 4-NP and BPA in various test systems (20, 21). In contrast, toxicity did not become apparent with E2, EE2, or GEN, in the concentration range where these chemicals produced activity in the E-SCREEN. Thus, our hypothesis is that the “mitogenic effect components” of all tested agents may well act in a concentration additive fashion but that this may not be the case for the “toxic” component of their effect spectrum. 
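For readers less familiar with IA, its usual formulation combines the fractional effects of the individual agents as 1 minus the product of (1 - E_i). The sketch below is only an illustration of that formula with hypothetical growth-inhibition values. It is not a model of the E-SCREEN end point and was not part of our analysis.

```python
from math import prod


def independent_action(fractional_effects):
    """Combined fractional effect under independent action: 1 - prod(1 - E_i)."""
    for effect in fractional_effects:
        if not 0.0 <= effect <= 1.0:
            raise ValueError("fractional effects must lie between 0 and 1")
    return 1.0 - prod(1.0 - effect for effect in fractional_effects)


if __name__ == "__main__":
    # Hypothetical example: two agents that each inhibit growth by 10%.
    print(independent_action([0.10, 0.10]))
```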
Considering the manifold ways in which chemicals may exhibit growth-restricting or cytotoxic effects, it is conceivable that these aspects of their effect spectrum may be better described by IA. The impact of such interactions on the effect parameter measured in the E-SCREEN is likely to be direct and immediate, leading to a short-fall from expected additivity. A similar short-fall may not become apparent in assays monitoring gene expression (YES) or increases of the levels of the products of estrogen-dependent genes as in vitellogenin analyses in fish (23) because the impact of “toxicity” on these end points is indirect. Thus, we suggest that the special features of the end point investigated in the E-SCREEN make this assay particularly sensitive to intervening growth-restricting influences and that this contributed to the observed deviations from additivity.

Such reasoning is in contradiction to Berenbaum (1), who has emphasized the need to adopt mechanism-free approaches in assessing mixture effects and has promoted the idea that a test system be treated like a “black box”: As long as the same experimental outcome or end point is assessed for both the mixture and the individual components, events occurring at the molecular level within a test system need not be understood in order to derive accurate mixture effect predictions. Thus, if this holds true for our mixture constituents, then there would be no need to distinguish between combined mitogenic and toxic effects. However, we would like to maintain that the black box approach may be an oversimplification in the special case of the mixtures we have analyzed here.

It is unfortunate that, for technical reasons, our ideas cannot be tested directly by measuring the balance of toxicity and mitogenicity in the E-SCREEN. Given these difficulties and the absence of direct experimental evidence to back our idea, it may be useful to probe whether our reasoning is consistent with published reports on mixture effects of estrogenic chemicals. On the basis of our hypothesis, the following expectations can be formulated: (i) Combination effects of chemicals with predominantly mitogenic (estrogenic) effect spectra should be described well by CA in the E-SCREEN.
The findings with the three- and four-component mixtures presented here support this idea. (ii) Furthermore, for combinations of agents that exhibit estrogenicity and toxicity, concentration addition should prove applicable if the assumption of CA of “similar” mode-of-action holds true for each of the opposed cellular responses (mitogenicity and toxicity) taken in isolation. Our previous results with four estrogenic organochlorine pesticides (13) are in line with this expectation. (iii) Problems with predictability may arise when points (i) and (ii) are not met. For example, the mitogenicity of all mixture components may meet the criteria of CA while the toxic/growth-inhibiting effects do not, being the result of “dissimilar” modes of action. (iv) The end point utilized in the E-SCREEN is highly integrative, and this means that possible intervening (toxic, growth inhibiting) factors have a direct influence on the measured effect parameter. The impact of such influences on end points investigated in gene reporter constructs or in fish (vitellogenin induction) is much less direct, and this may be the reason deviations from additivity have not yet been observed in mixture studies with these assays.

Dealing with the issue of experimental reproducibility in the E-SCREEN has been a crucial element in the planning, design, and conduct of the multi-component mixture experiments presented here. Because we had to rely on historical effect data, the demands for reproducibility go far beyond those sufficient for screening exercises. For quantitative dose-response analysis, a thorough optimization is required, and we have identified potential sources of systematic errors that need to be minimized to improve the reproducibility of the E-SCREEN. For example, cells can be detached from the bottoms of the wells during media changes from seeding conditions into experimental media. We reduced this systematic error by the use of an electronic pipet set to low-force ejection, and with more consistent pipetting we were able to assume with confidence that cell losses between wells were comparable. Experiments carried out on different dates using cells from different passages did exhibit differences in raw optical density values (Figure 1A), but by ensuring that each plate contained at least two positive and negative controls it was possible to normalize the raw data using these wells as 100% and 0% effects, respectively. In so doing, the experimental data could be pooled. The increased number of independent replicates achieved in this way led to better quality best-fit regression models, as the likelihood of a systematic error becomes progressively smaller and the precision is increased, demonstrated by the small confidence intervals of the regression lines.

Of all the in vitro assays in use for the screening of endocrine-active chemicals, the E-SCREEN represents the highest level of biological complexity, and we regard this as a strength because it affords the opportunity to identify factors that may impact on mixture effect predictability. And here lies the relevance of our observations to ecotoxicology: It will be interesting to see whether apical end points affected by endocrine-active chemicals in fish, such as effects on fertility or fecundity, may also be sensitive to intervening toxicity during the prediction of combination effects. Our findings suggest that this may well be the case and would have to be taken into consideration during experimental planning, design, and assessment of estrogenic combination effects.
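The plate-wise normalization described above, with positive controls defining 100% and negative controls 0% effect, can be sketched as follows. The function name, the use of the mean of the control wells, and the optical density values are illustrative assumptions and do not reproduce our data or the exact procedure.

```python
from statistics import mean


def normalize_plate(raw_od, negative_controls, positive_controls):
    """Scale raw optical densities of one plate to percent effect, with the
    mean negative control as 0% and the mean positive control as 100%."""
    if len(negative_controls) < 2 or len(positive_controls) < 2:
        raise ValueError("each plate needs at least two controls of each kind")
    low = mean(negative_controls)
    high = mean(positive_controls)
    if high == low:
        raise ValueError("positive and negative controls are indistinguishable")
    return [100.0 * (od - low) / (high - low) for od in raw_od]


if __name__ == "__main__":
    # Hypothetical plates from two experiments with different raw OD scales.
    plate_a = normalize_plate([0.61], [0.20, 0.22], [1.00, 1.02])
    plate_b = normalize_plate([1.22], [0.40, 0.44], [2.00, 2.04])
    print(plate_a, plate_b)
```

Although the raw optical densities of the two hypothetical plates differ by a factor of 2, both treated wells map to the same normalized effect, which is what permits pooling across experiments.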

Acknowledgments

The MCF-7 BOS cells were a kind gift from Ana Soto (Tufts University School of Medicine), and Janine Calabro (Tufts) provided invaluable technical assistance. We are grateful for funding by the European Commission (Contracts EVK1-200100091 and QLRT-CT2002-00603).

Literature Cited

(1) Berenbaum, M. Pharmacol. Rev. 1985, 41, 93-141.
(2) Könemann, H. Toxicology 1981, 19, 229-238.
(3) Escher, B. I.; Hermens, J. L. M. Environ. Sci. Technol. 2002, 36, 4201-4217.
(4) Altenburger, R.; Backhaus, T.; Boedeker, W.; Faust, M.; Scholze, M.; Grimme, L. H. Environ. Toxicol. Chem. 2000, 19, 2341-2347.
(5) Silva, E.; Rajapakse, N.; Kortenkamp, A. Environ. Sci. Technol. 2002, 36, 1751-1756.
(6) Rajapakse, N.; Silva, E.; Kortenkamp, A. Environ. Health Perspect. 2002, 110, 917-921.
(7) Backhaus, T.; Altenburger, R.; Boedeker, W.; Faust, M.; Scholze, M.; Grimme, L. H. Environ. Toxicol. Chem. 2000, 19, 2348-2356.
(8) Loewe, S.; Muischnek, H. Arch. Exp. Pathol. Pharmakol. 1926, 114, 313-326.
(9) Bliss, C. I. Ann. Appl. Biol. 1939, 26, 585-615.
(10) Soto, A. M.; Sonnenschein, C.; Chung, K. L.; Fernandez, M. F.; Olea, N.; Serrano, F. O. Environ. Health Perspect. 1995, 103 (Suppl. 7), 113-122.
(11) Villalobos, M.; Olea, N.; Brotons, J. A.; Olea-Serrano, M. F.; Almodovar, J. M.; Pedraza, V. Environ. Health Perspect. 1995, 103, 844-850.
(12) Körner, W.; Hanf, V.; Schuller, W.; Kempter, C.; Metzger, J.; Hagenmaier, H. Sci. Total Environ. 1999, 225, 33-48.
(13) Payne, J.; Scholze, M.; Kortenkamp, A. Environ. Health Perspect. 2001, 109, 391-397.
(14) Kuiper, G. G.; Lemmen, J. G.; Carlsson, B.; Corton, J. C.; Safe, S. H.; van der Saag, P. T.; van der Burg, B.; Gustafsson, J. A. Endocrinology 1998, 139, 4252-4263.
(15) Andersen, H. R.; Andersson, A. M.; Arnold, S. F.; Autrup, H.; Barfoed, M.; Beresford, N. A.; Bjerregaard, P.; Christiansen, L. B.; Gissel, B.; Hummel, R.; Jorgensen, E. B.; Korsgaard, B.; Le Guevel, R.; Leffers, H.; McLachlan, J.; Moller, A.; Nielsen, J. B.; Olea, N.; Oles-Karasko, A.; Pakdel, F.; Pedersen, K. L.; Perez, P.; Skakkebæk, N. E.; Sonnenschein, C.; Soto, A. M.; Sumpter, J. P.; Thorpe, S. M.; Grandjean, P. Environ. Health Perspect. 1999, 107 (Suppl. 1), 89-108.
(16) Rich, R.; Hoth, L. R.; Geoghegan, K. F.; Brown, T. A.; LeMotte, P. K.; Simons, S. P.; Hensley, P.; Myszka, D. G. Proc. Natl. Acad. Sci. U.S.A. 2002, 99, 8562-8567.
(17) Purdom, C. E.; Hardiman, P. A.; Bye, V. J.; Eno, N. C.; Tyler, C. R.; Sumpter, J. P. Chem. Ecol. 1994, 8, 275-285.
(18) Scholze, M.; Boedeker, W.; Faust, M.; Backhaus, T.; Altenburger, R.; Grimme, L. H. Environ. Toxicol. Chem. 2001, 20, 448-457.
(19) Beresford, N.; Routledge, E. J.; Harris, C. A.; Sumpter, J. P. Toxicol. Appl. Pharmacol. 2000, 162, 22-33.
(20) Roy, D.; Colerangle, J. B.; Singh, K. P. Front. Biosci. 1998, 3, d913-921.
(21) Nakagawa, Y.; Tayama, S. Arch. Toxicol. 2000, 74, 99-105.
(22) Faust, M.; Altenburger, R.; Backhaus, T.; Blanck, H.; Boedeker, W.; Gramatica, P.; Hamer, V.; Scholze, M.; Vighi, M.; Grimme, L. H. Aquat. Toxicol. 2003, 63, 43-63.
(23) Thorpe, K. L.; Hutchinson, T. H.; Hetheridge, M. J.; Scholze, M.; Sumpter, J. P.; Tyler, C. R. Environ. Sci. Technol. 2001, 35, 2476-2481.

Received for review March 1, 2004. Revised manuscript received September 17, 2004. Accepted September 22, 2004. ES049681E