Environ. Sci. Technol. 2006, 40, 395-401

Bayesian and Time-Independent Species Sensitivity Distributions for Risk Assessment of Chemicals

ERIC P. M. GRIST,*,† ANTHONY O’HAGAN,‡ MARK CRANE,§ NEAL SOROKIN,| IAN SIMS,| AND PAUL WHITEHOUSE⊥

CSIRO Marine and Atmospheric Research, GPO Box 1538, Hobart, Tasmania 7001, Australia, Department of Probability and Statistics, University of Sheffield, Hicks Building, Sheffield S3 7RH, United Kingdom, Watts & Crane Associates, Faringdon, Oxfordshire SN7 7AG, United Kingdom, WRc, Henley Road, Medmenham, Marlow, Buckinghamshire SL7 2HD, United Kingdom, and Environment Agency, Wallingford, Oxfordshire, OX10 8BD, United Kingdom

Species sensitivity distributions (SSDs) are increasingly used to analyze toxicity data but have been criticized for a lack of consistency in data inputs, lack of relevance to the real environment, and a lack of transparency in implementation. This paper shows how the Bayesian approach addresses concerns arising from frequentist SSD estimation. Bayesian methodologies are used to estimate SSDs and compare results obtained with time-dependent (LC50) and time-independent (predicted no observed effect concentration) endpoints for the insecticide chlorpyrifos. Uncertainty in the estimation of each SSD is obtained either in the form of a pointwise percentile confidence interval computed by bootstrap regression or an associated credible interval. We demonstrate that uncertainty in SSD estimation can be reduced by applying a Bayesian approach that incorporates expert knowledge and that use of Bayesian methodology permits estimation of an SSD that is more robust to variations in data. The results suggest that even with sparse data sets theoretical criticisms of the SSD approach can be overcome.

* Corresponding author phone: +61 3 6232 5354; e-mail: [email protected].
† CSIRO Marine and Atmospheric Research. ‡ University of Sheffield. § Watts & Crane Associates. | WRc. ⊥ Environment Agency.
10.1021/es050871e CCC: $33.50. Published on Web 11/25/2005. © 2006 American Chemical Society

Introduction

Species sensitivity distributions (SSDs) are increasingly employed in ecological risk assessment procedures (1-3) and are usually constructed by fitting a statistical distribution, typically log-normal (4) or log-logistic (5), to toxicity data. One aim of an SSD analysis is to determine a chemical concentration protective for most species in the environment, usually by calculating an HC5 value (hazardous concentration for 5% of species) (6). In Europe, risk assessment methods for new and existing chemicals are described in the Technical Guidance Document (TGD) (7). This includes a methodology for the use of statistical extrapolation methods with SSDs, and there has also been considerable interest in the use of SSDs to assess pesticide risks (8).
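To make the HC5 concrete, the following is a minimal sketch of a log-logistic SSD fitted to the 17 literature 96 h LC50 values reported in Table 1, with a bootstrap percentile interval for the HC5. It is not the authors' implementation: the fit is a linearized least-squares regression of log10 concentration on the logit of Hazen plotting positions (an assumption of this sketch, swapped in for the nonlinear regression used in the paper), and the function names `fit_log_logistic`, `hc`, and `bootstrap_hc5` are hypothetical.

```python
import math
import random

# 96 h LC50 values (µg/L) for the 17 literature species listed in Table 1.
LC50_UG_L = [540, 2.7, 0.5, 0.47, 0.2, 2, 120, 0.11, 0.07, 0.32, 0.18,
             8.54, 4.7, 0.8, 0.77, 7.1, 14.42]


def logit(p):
    """Log-odds of a proportion p in (0, 1)."""
    return math.log(p / (1.0 - p))


def fit_log_logistic(values):
    """Least-squares fit of log10(conc) = mu + s * logit(p).

    p are Hazen plotting positions (i + 0.5) / n for the sorted data;
    this choice is an assumption of the sketch, not taken from the paper.
    """
    x = sorted(math.log10(v) for v in values)
    n = len(x)
    z = [logit((i + 0.5) / n) for i in range(n)]
    z_mean = sum(z) / n
    x_mean = sum(x) / n
    szz = sum((zi - z_mean) ** 2 for zi in z)
    szx = sum((zi - z_mean) * (xi - x_mean) for zi, xi in zip(z, x))
    s = szx / szz
    mu = x_mean - s * z_mean
    return mu, s


def hc(p, mu, s):
    """Concentration at which a fraction p of species is affected."""
    return 10 ** (mu + s * logit(p))


def bootstrap_hc5(values, n_resamples=2000, seed=1):
    """Percentile bootstrap (2.5%, 50%, 97.5%) for the HC5."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_resamples):
        resample = [rng.choice(values) for _ in values]
        mu, s = fit_log_logistic(resample)
        estimates.append(hc(0.05, mu, s))
    estimates.sort()

    def percentile(q):
        return estimates[min(len(estimates) - 1, int(q * len(estimates)))]

    return percentile(0.025), percentile(0.5), percentile(0.975)


mu, s = fit_log_logistic(LC50_UG_L)
print("HC5 point estimate (µg/L):", hc(0.05, mu, s))
print("bootstrap 2.5%, 50%, 97.5%:", bootstrap_hc5(LC50_UG_L))
```

Because the fitting method differs from the paper's, the numbers this sketch prints should not be expected to match Table 3 exactly; it is intended only to show how an HC5 and its interval follow from a fitted SSD.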

Despite the increased use of SSDs, the approach potentially suffers from some important flaws. For example, Forbes and Calow (9) have the following main criticisms, which are also discussed in Posthuma et al. (10):

(1) SSDs are usually constructed from what they describe as a haphazard collection of values that have no direct relevance to site-specific assemblages of organisms. This may be a particular problem if site-specific exposure concentrations are then compared with these generic SSDs.

(2) The endpoints used to construct SSDs are not demographically relevant; i.e., they are collections of lethal or sublethal threshold concentrations, such as median lethal effects (LC50s) or sublethal no observed effect concentrations (NOECs).

(3) The technical aspects of constructing an SSD, such as model choice, selection of appropriate confidence intervals, definition of the minimum number, quality, and representativeness of data points required, and selection of summary statistics such as the HC5, are often opaque and may not clearly relate to environmental protection goals.

In this paper we begin to address these criticisms, using the organophosphorus insecticide chlorpyrifos as a model. Chlorpyrifos is used worldwide to control pests in arable crops and orchards and in homes and gardens (11). Chlorpyrifos toxicity is rapid and intense for susceptible species, but relatively rapid degradation in the environment means that cumulative toxicity is unlikely. A comprehensive ecological risk assessment for chlorpyrifos has been performed for North American aquatic environments (10), and mesocosm data are also available, providing a useful comparison for results in the present study.

The main objectives of this paper are to address the valid criticisms of SSDs by Forbes and Calow (9) by (1) collating available data on chlorpyrifos and selecting only those that are relevant to U.K. species for inclusion in an SSD, (2) eliciting information on organism tolerances to chlorpyrifos from U.K. field biologists to counteract bias in the toxicity database due to data being available only for species that are suitable for laboratory tests, (3) developing scenarios for three different aquatic habitat types containing different generic organism assemblages in a preliminary effort to understand differences between site-specific assemblages, and (4) generating time-independent toxicity data that more closely relate to the protection aims of the risk assessment process.

We apply both frequentist and Bayesian statistical approaches to construct SSDs and associated confidence or credible intervals from time-independent predicted no observed effect concentration (PNOEC) toxicity values as well as the more usual 96 h LC50 data. First, we collated literature data on the toxicity of the insecticide chlorpyrifos and estimated time-independent no effect concentrations. Then we elicited the opinions of freshwater biologists on the relative sensitivities of different freshwater taxa to the effects of this insecticide. Finally, species sensitivity distributions were constructed with and without elicited opinions and compared with results from new toxicity studies with previously untested species.

Materials and Methods

LC50 Values from Literature Review. Acute lethal data for chlorpyrifos on the U.S. EPA ECOTOX database (http://www.epa.gov/ecotox/) were collated by Crane et al. (12) and used in a simple probabilistic risk assessment without regard to data quality or specific relevance to the U.K. environment. The original papers were reviewed for the present study according to the usual criteria for acceptability of data when setting U.K. Environmental Quality Standards (13). Those that contained information on species relevant to the U.K. environment were selected because U.K. field biologists could not be expected to possess expertise on the sensitivity to chlorpyrifos of non-native species during the expert elicitation exercise described below. A total of 17 species for which 96 h LC50 values had been calculated were selected for use in SSD construction (Table 1).

TABLE 1. LC50 Values at 96 h Obtained from the Literature (17 Species) and Laboratory Experiments (3 Species)

species                     taxon              96 h LC50 (µg/L)
literature (17 species)
Anguilla anguilla           Anguillidae        540
Asellus aquaticus           Asellidae          2.7
Caenis horaria              Caenidae           0.5
Chironomus tentans          Chironomidae       0.47
Chironomus thummi           Chironomidae       0.2
Corixa punctata             Corixidae          2
Rutilus rutilus             Cyprinidae         120
Gammarus lacustris          Gammaridae         0.11
Gammarus pulex              Gammaridae         0.07
Gammarus fasciatus          Gammaridae         0.32
Gammarus pseudolimnaeus     Gammaridae         0.18
Gasterosteus aculeatus      Gasterosteidae     8.54
Pungitius pungitius         Gasterosteidae     4.7
Peltodytes sp.              Haliplidae         0.8
Leptocerida sp.             Leptoceridae       0.77
Oncorhynchus mykiss         Salmonidae         7.1
Oncorhynchus clarki         Salmonidae         14.42
laboratory (3 species)
Cloeon dipterum             Baetidae           0.035
Leuctra geniculata          Leuctridae         0.87
Brachycentrus subnubilus    Brachycentridae    0.45
Hirudo medicinalis          Hirudidae          a

a Values were computable for three laboratory species only, because no mortality occurred in the exposure experiment conducted on Hirudo medicinalis (a leech).

PNOEC Values Based on Time-to-Event Analysis. Raw survival data after 48 and 96 h exposure to chlorpyrifos were kindly provided by René van Wijngaarden (14) of Alterra (Netherlands) for the following eight species: Chaoborus sp. (phantom midge larva), Cloeon dipterum (mayfly nymph), Corixa punctata (water boatman), Simocephalus vetulus (waterflea), Daphnia longiseta (waterflea), Gammarus pulex (shrimp), Gasterosteus aculeatus (3-spined stickleback), and Pungitius pungitius (10-spined stickleback). Mayer et al. (15) describe a two-step linear regression method for estimating time-independent PNOEC values from such raw survival data. We applied this approach to estimate PNOEC values for each of the species in the van Wijngaarden raw survival data. An inherent problem of the Mayer et al. technique is that it does not properly account for the effects of individual variability whenever there is a decrease in mortality with an increase in concentration. Thus, although mortality under any single concentration can never decrease with time (the experiment is performed on one set of individuals, with death being irreversible), it may sometimes decrease with increasing concentration when measured at a fixed point in time because of the effects of heterogeneity (individual variability in response). This problem occurred in the PNOEC calculations for one of the eight species in the Wijngaarden raw survival data set (Gasterosteus aculeatus). Biologically meaningful PNOECs were hence calculable only for seven species, hereafter referred to as the Wijngaarden PNOECs (Table 2). These values were subsequently used to construct time-independent SSDs.

Expert Elicitation. The views of freshwater field biologists were sought on the sensitivity to chlorpyrifos of 96 taxonomic groups found in U.K. freshwaters (Supporting Information). Seventeen biologists employed by the Environment Agency or belonging to the Freshwater Biological Association were asked to score the sensitivity to chlorpyrifos of each taxonomic group on a scale from 1 (insensitive) to 8 (highly sensitive). They were also asked to score their own knowledge of each taxonomic group from 0 (no knowledge) to 5 (high level of knowledge). The sensitivity scores for each taxon were then weighted according to expertise.
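The weighting step can be illustrated with a short sketch. The text does not give the exact weighting formula, so the knowledge-weighted mean below is an assumption, and the scores are hypothetical values chosen only for illustration.

```python
def weighted_sensitivity(scores, knowledge):
    """Knowledge-weighted mean sensitivity score for one taxon.

    scores:    one sensitivity score per expert, 1 (insensitive) to 8
    knowledge: the same experts' self-assessed knowledge, 0 to 5
    Returns None when no expert claims any knowledge of the taxon.
    The weighted-mean form is an assumption of this sketch.
    """
    total_weight = sum(knowledge)
    if total_weight == 0:
        return None
    return sum(s * k for s, k in zip(scores, knowledge)) / total_weight


# Hypothetical scores from three experts for a single taxon.
example_scores = [7, 5, 8]
example_knowledge = [4, 1, 0]
print(weighted_sensitivity(example_scores, example_knowledge))
```

Under this assumed form, an expert who reports no knowledge of a taxon (weight 0) contributes nothing to that taxon's weighted score.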
Experts all had the same information presented to them and were given the same instructions for completing the exercise after a trial run with a facilitator. The expert opinions on species sensitivities to chlorpyrifos, weighted according to their assessment of individual knowledge, are shown in the Supporting Information.

TABLE 2. Time-Independent Predicted No Effect Concentrations (PNOECs) Calculated by the Mayer et al. (15) Technique

species                     taxon              48 h LC0 (µg/L)   96 h LC0 (µg/L)   PNOEC (µg/L)
Wijngaarden (7 species)
Chaoborus obscuripes        Chaoboridae        1.77 × 10⁻¹       1.10 × 10⁻¹       6.81 × 10⁻²
Cloeon dipterum             Baetidae           1.56 × 10⁻¹       1.45 × 10⁻²       1.35 × 10⁻³
Corixa punctata             Corixidae          5.66 × 10⁻¹       2.62 × 10⁻¹       1.21 × 10⁻¹
Daphnia longiseta           Daphniidae         6.31 × 10⁻²       5.56 × 10⁻²       4.89 × 10⁻²
Gammarus pulex              Gammaridae         6.35 × 10⁻³       2.66 × 10⁻³       1.12 × 10⁻³
Pungitius pungitius         Gasterosteidae     7.50 × 10⁻¹       6.48 × 10⁻¹       5.59 × 10⁻¹
Simocephalus vetulus        Daphniidae         9.40 × 10⁻²       9.26 × 10⁻²       9.12 × 10⁻²
Gasterosteus aculeatus      Gasterosteidae     3.27 × 10⁻¹       4.06 × 10⁻¹       a
laboratory (1 species)
Cloeon dipterum             Baetidae           1.09 × 10⁻⁴       4.63 × 10⁻⁵       1.25 × 10⁻⁵
Leuctra geniculata          Leuctridae         0.0000            1.41 × 10⁻⁵       a
Brachycentrus subnubilus    Brachycentridae    1.99 × 10⁻³       6.72 × 10⁻³       a
Hirudo medicinalis          Hirudidae          a                 a                 a

a Values were calculable for only seven species in the Wijngaarden data set and one species (Cloeon dipterum) in the laboratory data because of the high levels of individual variability in mortality response.

Generic Aquatic Assemblage Scenarios. The taxonomic groups (listed in the Supporting Information) were associated with three generic U.K. assemblages on the basis of information in Fitter and Manuel (16). These assemblages were (a) a fast-flowing stream, (b) a slow-flowing lowland river, and (c) a static pond or ditch.

SSDs Based on Frequentist Analysis. SSDs were constructed from the 96 h LC50 data or time-independent PNOEC values by nonlinear regression of a logistic model to the cumulative frequency plot of the log-transformed data set (5). Associated 95% pointwise percentile confidence intervals were derived using bootstrap regression in which nonlinear regression is applied repeatedly to resamples of the cumulative frequency data plot (3).

SSDs Based on Bayesian Analysis. A Bayesian statistical model was developed to construct SSDs from either 96 h LC50 data or time-independent PNOEC values, plus the results of the expert elicitation exercise, and was implemented in the software package WINBUGS (http://www.mrcbsu.cam.ac.uk/bugs). Taxa mean log(96 h LC50) or log(PNOEC) values were assumed to be normally distributed around a linear function of the experts’ weighted mean sensitivity values, with a precision representing experts’ assessment errors. The linear function was determined by least-squares regression as the line of best fit to measurements of the toxicity data for chlorpyrifos (relating to freshwater aquatic species found in the U.K. in the literature) plotted against the corresponding average assessment of sensitivity made by the experts (on a scale of 1 to 8). Figure 1 shows the line of best fit together with corresponding data points. As expected, the line has negative slope, reflecting that more sensitive species were broadly judged by the experts to be associated with more sensitive taxa. However, the scatter in the plot reveals that expert judgments are not strongly correlated with true toxicity and hence that the data are unable to support a more complex relationship being fitted.

FIGURE 1. Plot showing the 17 measurements of the 96 h LC50 obtained for chlorpyrifos from the literature for freshwater aquatic species found in the U.K. plotted against the average assessment of the corresponding taxa sensitivity made by the experts. The y-axis is the log LC50 and the x-axis is the average assessment of sensitivity by the experts (on a scale of 1 to 8), with rings around data points to identify those species that are in the same taxon.
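The line of best fit relating log LC50 to the experts' mean sensitivity score can be sketched as an ordinary least-squares regression. The expert scores are available only in the Supporting Information, so the (score, LC50) pairs below are hypothetical, and `least_squares` is an illustrative helper rather than part of the WINBUGS model.

```python
import math

# Hypothetical (mean expert score, 96 h LC50 in µg/L) pairs, for
# illustration only; they are not the values plotted in Figure 1.
PAIRS = [(2.0, 500.0), (3.5, 10.0), (4.0, 3.0),
         (5.5, 0.8), (6.0, 0.1), (7.0, 0.3)]


def least_squares(xs, ys):
    """Return (intercept, slope) of the ordinary least-squares line."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_mean - slope * x_mean, slope


scores = [score for score, _ in PAIRS]
log_lc50 = [math.log10(lc50) for _, lc50 in PAIRS]
intercept, slope = least_squares(scores, log_lc50)

# Predicted taxon-mean LC50 for an untested taxon scored 6.5 by experts.
predicted_lc50 = 10 ** (intercept + slope * 6.5)
print("slope:", slope, "predicted LC50 (µg/L):", predicted_lc50)
```

A negative slope corresponds to the behaviour described for Figure 1: taxa judged more sensitive by the experts have lower fitted LC50 values. In the Bayesian model this line supplies only the mean of the taxon-level distribution; the spread around it reflects the experts' assessment errors.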
Values for individual species were assumed to be normally distributed around their taxa means. SSDs were constructed for each of the three generic assemblages by predicting the proportion of species in each taxon with 96 h LC50 or PNOEC values below a given concentration and then averaging over the taxa present in the assemblage. These were constructed both with and without the use of expert opinion from the elicitation exercise, in the latter case by forcing the linear regression of taxa means on expert weighted means to have zero slope. Further details of the methodology are available in O’Hagan et al. (17).

Empirical Testing. Through the use of the expert opinions of taxa sensitivity obtained in the elicitation exercise, taxa of different predicted sensitivity for which empirical data were not available were identified. Four species were then exposed to chlorpyrifos to examine whether the experts had accurately predicted their sensitivity. These species, with the common name and experts’ assessment of their relative sensitivity in parentheses, were Cloeon dipterum (mayfly, 6th), Brachycentrus subnubilus (caddis fly, 8th), Leuctra geniculata (stonefly, 17th), and Hirudo medicinalis (leech, 47th). Test organisms were collected from the wild (mayflies, caddisflies, and stoneflies) or acquired commercially (leeches) and exposed to a range of concentrations of technical grade chlorpyrifos for 96 h in semistatic test systems, with test medium renewal every 24 h. Stock solutions were chemically analyzed to verify exposure concentrations. A 96 h LC50 value was not calculable for H. medicinalis, because no mortality occurred in that exposure experiment, but the observation thereby gave an empirical lower bound on the LC50 value. In summary, three additional 96 h LC50 values (hereafter referred to as the laboratory 96 h LC50 data) were obtained, together with a lower bound on the LC50 value for H. medicinalis, which was used as a fourth additional data point in the Bayesian analysis. Application of the Mayer et al. (15) technique to the mortality data collected for these three species yielded only one further time-independent PNOEC, for Cloeon dipterum, hereafter referred to as the laboratory PNOEC, because of the problem caused by high levels of individual variability in mortality response (discussed earlier).

Hence, two combined data sets were finally obtained: literature and laboratory 96 h LC50 data (20 species (17 + 3) for the frequentist analysis or 21 species (17 + 4) for the Bayesian analysis) and Wijngaarden and laboratory PNOECs (8 = 7 + 1 species). The frequentist and Bayesian approaches described above were then run again to generate new 96 h LC50 and time-independent SSDs using these two newly combined data sets.
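The time-to-event extrapolation can be illustrated with the Wijngaarden rows of Table 2. The sketch below regresses log10(LC0) on reciprocal exposure time and reads the PNOEC off the intercept (the limit as time grows large). That this is the second step of the Mayer et al. (15) procedure is an inference from the tabulated numbers, which it reproduces to within rounding, and not something the text states; `extrapolate_pnoec` is a hypothetical helper name.

```python
import math

# (48 h LC0, 96 h LC0, reported PNOEC), all in µg/L, copied from the
# Wijngaarden rows of Table 2.
TABLE_2 = {
    "Chaoborus obscuripes": (1.77e-1, 1.10e-1, 6.81e-2),
    "Cloeon dipterum": (1.56e-1, 1.45e-2, 1.35e-3),
    "Corixa punctata": (5.66e-1, 2.62e-1, 1.21e-1),
    "Daphnia longiseta": (6.31e-2, 5.56e-2, 4.89e-2),
    "Gammarus pulex": (6.35e-3, 2.66e-3, 1.12e-3),
    "Pungitius pungitius": (7.50e-1, 6.48e-1, 5.59e-1),
    "Simocephalus vetulus": (9.40e-2, 9.26e-2, 9.12e-2),
}


def extrapolate_pnoec(times_h, lc0_values):
    """Intercept of log10(LC0) regressed on 1/time, back-transformed.

    Assumed form of the extrapolation step; with two time points the
    fitted line simply passes through both observations.
    """
    xs = [1.0 / t for t in times_h]
    ys = [math.log10(v) for v in lc0_values]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return 10 ** intercept


def relative_error(estimate, reported):
    return abs(estimate - reported) / reported


for species, (lc0_48, lc0_96, reported) in TABLE_2.items():
    estimate = extrapolate_pnoec([48, 96], [lc0_48, lc0_96])
    print(f"{species}: sketch {estimate:.3g}, Table 2 {reported:.3g}")
```

For Gasterosteus aculeatus the tabulated LC0 rises from 48 to 96 h, so the same extrapolation would return a value above both observations, which is consistent with no PNOEC being reported for that species. The two-point version does not reproduce the laboratory Cloeon dipterum PNOEC; the text does not say how that value was obtained in detail.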

Results

Figure 2a shows the SSD together with 95% pointwise percentile confidence intervals constructed by the frequentist approach using the 96 h LC50 literature data (17 species). Figure 2b compares the Bayesian SSD constructed from the same data for each of the three generic habitats and associated assemblages using expert opinion, together with the single SSD constructed without expert opinion. If no expert opinion is incorporated, then the SSD for each generic assemblage is identical because there is simply no information to specify how they should differ. Only by using expert knowledge can the assemblages be distinguished and an SSD thus be tailored to fit a given assemblage, unless there are large numbers of empirical data for each assemblage type.

In general, incorporation of expert opinion produces a leftward shift of the Bayesian SSD corresponding to each assemblage. Figure 3 illustrates this effect by comparing the Bayesian SSD obtained both with and without expert opinion for the fast-flowing stream assemblage, which incurred the greatest shift through inclusion of expert opinion. This is likely to be because of the experts’ perception, probably reflecting a more general view, that fast-flowing streams are characterized by sensitive species such as stoneflies and mayflies, while ponds and ditches are characterized by less sensitive species such as snails and leeches. For all three habitats the results show that the use of expert opinion within the Bayesian analysis produces lower (i.e., more sensitive) estimates of the respective SSD and its 95% credible interval, plus summary statistics such as the HC5. This is largely because the taxa that experts regarded as being particularly sensitive to chlorpyrifos were not well represented in the empirical toxicity dataset (that is, the LC50 values obtained from either the literature or the laboratory tests).

FIGURE 3. Leftward shift that occurs through incorporation of expert opinion in the Bayesian-estimated SSD, illustrated for the fast-flowing stream habitat using the 96 h LC50 literature data (open circles, 17 species). Respective 50% credible percentiles with (dashes) and without (bold) expert opinion are compared together with lower (2.5%) and upper (97.5%) 95% credible limits (dotted and solid lines, respectively).

FIGURE 2. Chlorpyrifos species sensitivity distributions (SSDs) estimated from the 96 h LC50 literature data (open circles, 17 species). (a) Frequentist using a log-logistic model with bootstrap regression. The 50% percentile (bold) is shown together with lower (2.5%) and upper (97.5%) pointwise percentile confidence intervals (thin). The number of resamples is 2000. (b) Bayesian using expert opinion from the elicitation exercise for the three generic assemblages of fast-flowing stream (dashes), slow-flowing river (dots), or static pond or ditch (dot-dash). With no expert opinion (bold), the three SSDs are identical. In each case the 50% credible percentile is shown (together with lower (2.5%) and upper (97.5%) credible percentiles (thin) with no expert opinion).

For example, the HC5 was approximately 0.02 µg L⁻¹ with and 0.05 µg L⁻¹ without expert opinion for both fast-flowing streams and slow-flowing rivers. The HC5 estimate turned out to be slightly higher for the static pond, reflecting the fact that taxa found in this habitat and not the others were judged by the experts to be generally less sensitive. The curves for the different assemblages are strongly correlated because they have many species in common. Hence, although the 95% credible intervals are generally wide and overlapped substantially for all SSDs, there is evidence that the SSD for the fast-flowing stream lies to the left of that for the slow-flowing river, which in turn lies to the left of the SSD for the static pond.

Figure 4 compares time-independent SSDs constructed by either the frequentist (Figures 4a and 4c) or Bayesian (Figures 4b and 4d) methodologies using the Wijngaarden PNOECs derived from time-to-event analyses. The HC5 values for these PNOECs are all considerably lower than those calculated previously from 96 h LC50 data, as would be expected from extrapolation of short-term lethal concentrations to time-independent no effect concentrations. Bayesian incorporation of expert opinion with the Wijngaarden data produced a similar trend to the 96 h LC50 analyses, with lower estimates of HC5 for fast-flowing stream and slow-flowing river habitats when compared with a frequentist analysis, and a higher estimated HC5 for static pond habitats.

The results from toxicity tests with four additional species, designed to test the predictions made by experts, suggested that the experts were broadly correct in their assignment of sensitivities (Supporting Information). Cloeon dipterum (mayfly) was the most sensitive species, followed by Brachycentrus subnubilus (caddis fly), Leuctra geniculata (stonefly), and Hirudo medicinalis (leech), in the order predicted by the weighted mean expert ranking. All statistical analyses were performed again to include these new experimental data, and the results are summarized in Table 3 together with those obtained from the original datasets.

A comparison between the analyses performed with the various data sets reveals two interesting aspects. First, the SSDs for the three habitats differ more when constructed from the time-independent PNOECs than when constructed from the 96 h LC50 endpoints. In particular, Table 3 shows that the SSD constructed from PNOECs for the third habitat (pond or ditch) has an HC5 more than 2-fold higher than that calculated for the other two habitats (whether using just seven or all eight PNOEC values), which suggests that lentic invertebrate assemblages may be less susceptible to the toxicity of chlorpyrifos than lotic assemblages. Second, comparing the time-independent SSDs for each habitat derived from the Wijngaarden values alone with those derived from the Wijngaarden and laboratory values reveals a further important difference.
This can be seen by comparing respective HC5 values obtained in each case and observing that those derived from the SSDs constructed with Wijngaarden and laboratory values are in general lower by a factor of about 5 (Table 3). Since the single laboratory PNOEC (Cloeon dipterum) has a very much lower value than any of those obtained by Wijngaarden, an effect of this kind through its inclusion in SSD construction would be expected. However, the Bayesian SSD analysis had access to expert knowledge, which quite clearly held that the Ephemeridae would be more sensitive than any of the Wijngaarden species. So in principle this very low value should have been discounted to an extent. In any event, the Bayesian SSDs would be expected to be less affected by the singularly different laboratory data point than each of the respective frequentist SSDs, which did not incorporate expert elicitation.

A direct comparison of the specific HC5 values determined for each habitat in Table 3 shows this to indeed be the case, as follows. Through the use of only the Wijngaarden PNOEC values to construct the Bayesian SSDs, the HC5 values (in µg L⁻¹) were 1.34 × 10⁻⁴ for fast-flowing streams, 1.56 × 10⁻⁴ for slow-flowing rivers, and 3.89 × 10⁻⁴ for static ponds. Through the use of the Wijngaarden and laboratory values, these respectively become 2.33 × 10⁻⁵, 2.92 × 10⁻⁵, and 9.94 × 10⁻⁵, which represent leftward shifts by a factor of 5.75, 5.34, and 3.91 (corresponding to an average factor of about 5, as stated above). With the frequentist SSD construction approach, using the Wijngaarden values, the HC5 is estimated as 1.74 × 10⁻⁴ µg L⁻¹, whereas using the Wijngaarden and laboratory values the HC5 estimate is 1.66 × 10⁻⁵ µg L⁻¹. The leftward shift therefore now corresponds to a factor of 10.48, which is about twice that for the Bayesian SSDs. This provides demonstrable evidence of the greater robustness to be expected in the Bayesian approach to SSD construction.

FIGURE 4. Chlorpyrifos species sensitivity distributions (SSDs) estimated from time-independent PNOEC values (open circles). Respective frequentist pointwise confidence or Bayesian credible 50% percentiles (bold) are shown in each case together with lower (2.5%) and upper (97.5%) limits (thin). (a) Frequentist with a log-logistic model and bootstrap regression using the Wijngaarden PNOECs (7 species). The number of resamples is 2000. (b) Bayesian for the fast-flowing stream habitat using the Wijngaarden PNOEC values (7 species). (c) Frequentist with a log-logistic model and bootstrap regression using the combined Wijngaarden and laboratory PNOECs (8 = 7 + 1 species). The number of resamples is 2000. (d) Bayesian for the fast-flowing stream habitat using the combined Wijngaarden and laboratory PNOEC values (8 = 7 + 1 species).
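The shift factors quoted in this comparison follow directly from the median HC5 values in Table 3, as the short calculation below shows. The variable names are illustrative only.

```python
# Median HC5 values (µg/L) from Table 3, time-independent PNOEC data:
# (Wijngaarden only, Wijngaarden and laboratory).
BAYESIAN_HC5 = {
    "fast-flowing stream": (1.34e-4, 2.33e-5),
    "slow-flowing river": (1.56e-4, 2.92e-5),
    "static pond": (3.89e-4, 9.94e-5),
}
FREQUENTIST_HC5 = (1.74e-4, 1.66e-5)

# Leftward shift = HC5 before adding the laboratory PNOEC / HC5 after.
bayesian_shift = {
    habitat: before / after
    for habitat, (before, after) in BAYESIAN_HC5.items()
}
mean_bayesian_shift = sum(bayesian_shift.values()) / len(bayesian_shift)
frequentist_shift = FREQUENTIST_HC5[0] / FREQUENTIST_HC5[1]

for habitat, factor in bayesian_shift.items():
    print(f"Bayesian, {habitat}: shift factor {factor:.2f}")
print(f"Bayesian mean shift factor: {mean_bayesian_shift:.2f}")
print(f"Frequentist shift factor: {frequentist_shift:.2f}")
print(f"Frequentist / Bayesian mean: {frequentist_shift / mean_bayesian_shift:.2f}")
```

The ratio of the frequentist shift to the mean Bayesian shift is the quantity behind the statement that the frequentist HC5 moved about twice as far when the single laboratory PNOEC was added.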

Discussion The present study shows that for chlorpyrifos the use of a time-to-event approach to estimate PNOEC values leads to estimates of toxicity with an HC5 value generally less than 0.0004 µg L-1 (Table 3). However, this estimate is 50 times lower than the 96 h LC10 value of 0.02 µg L-1 estimated for Gammarus pulex, the most sensitive species tested by Wijngaarden et al. (13), and may therefore be an overestimate of likely effects caused by transient environmental exposure. Wheeler et al. (18) reported freshwater HC5 values for chlorpyrifos, based on acute data, of 0.086 µg/L (log-logistic model) and 0.063 µg/L (log-normal model), which are 215 and 158 times higher, respectively, than the HC5 based on time-independent PNOEC values. However, Giesy et al. (10) report acute-to-chronic ratios for chlorpyrifos of up to 181, so such a difference between HC5 estimates based on acute or chronic summaries is at least plausible. The use of Bayesian methodology to incorporate expert judgment of species tolerance distributions and empirical VOL. 40, NO. 1, 2006 / ENVIRONMENTAL SCIENCE & TECHNOLOGY

9

399

TABLE 3. Comparison of HC5 Values Determined from Species Sensitivity Distributions (SSDs) Constructed by Frequentist and Bayesian Methodologies, Using the 96 h LC50 Data Sets and Time-Independent PNOEC Valuesa Bayesian

frequentist HC5 (µg/L)

SSD constructed using literature (17 species)

literature and laboratory (17 + 3 species)b

Wijngaarden (7 species)

habitat no expert opinion fast-flowing stream slow-flowing river static pond no expert opinion fast-flowing stream slow-flowing river static pond

no expert opinion fast-flowing stream slow-flowing river static pond Wijngaarden and laboratory no expert opinion (7 + 1 species) fast-flowing stream slow-flowing river static pond

median

L (2.5%)

HC5 (µg/L) U (97.5%)

median

L (2.5%)

U (97.5%)

2.41 × 10-1 1.60 × 10-1 1.73 × 10-1 2.34 × 10-1 1.47 × 10-1 1.03 × 10-1 1.15 × 10-1 1.53 × 10-1

2.53 × 10-2 * * * 2.40 × 10-2 * * *

4.60 × 10-3 * * * 5.40 × 10-3 * * *

1.10 × 10-1 * * * 9.45 × 10-1 * * *

time-independent PNOEC values 4.01 × 10-4 1.16 × 10-6 4.28 × 10-3 1.34 × 10-4 3.26 × 10-7 1.47 × 10-3 1.56 × 10-4 4.42 × 10-7 1.75 × 10-3 3.89 × 10-4 1.86 × 10-6 2.83 × 10-3 2.17 × 10-5 1.52 × 10-8 5.55 × 10-4 2.33 × 10-5 9.82 × 10-8 2.75 × 10-4 2.92 × 10-5 1.49 × 10-7 3.51 × 10-4 9.94 × 10-5 6.44 × 10-7 7.83 × 10-4

1.74 × 10-4 * * * 1.66 × 10-5 * * *

8.20 × 10-6 * * * 3.87 × 10-8 * * *

3.05 × 10-2 * * * 2.02 × 10-2 * * *

96 h LC50 data 4.95 × 10-2 1.11 × 10-2 1.90 × 10-2 1.86 × 10-4 2.14 × 10-2 2.95 × 10-4 3.29 × 10-2 5.39 × 10-4 3.17 × 10-2 1.38 × 10-3 2.03 × 10-2 9.10 × 10-4 2.30 × 10-2 1.19 × 10-3 3.23 × 10-2 2.01 × 10-3

a L ) lower 2.5%, U ) upper 97.5% percentiles, for frequentist pointwise confidence or Bayesian credible intervals. were used in the Bayesian analysis.

data into SSDs produced lower estimates of HC5 than SSDs based on empirical data alone. This is because the available empirical data were probably not fully representative of the whole population of data. In this study, the differences were not large, but they could be for other substances if available data are from studies that concentrate on highly sensitive or insensitive taxa. Use of formalized expert opinion is therefore of considerable value in reducing any bias that may be due to an unrepresentative selection of test species. The experts were correct in their relative ranking of the four species tested as part of this study, but did make some mistakes with other species. For example, Chironomidae were judged by some experts to have low sensitivity to chlorpyrifos, which clearly should not be the case for an insecticide. Such beliefs probably originate from the “sanitary water quality” bias of many field biologists in the U. K., and particularly those who work for the regulatory agencies. If the experts had been more accurate in their relative assessments, then the predictions of SSDs would have tighter credible intervals. Hence, appropriate selection of experts is important, and some objective test of their true level of expertise, rather than reliance on self-assessment, could usefully be incorporated into elicitation exercises such as these. There were insufficient data points to be able to test any assumptions on taxa heterogeneity, and so for reasons of parsimony the same variance in every taxon was incorporated into the Bayesian model. In principle, the experts could have been asked to estimate such heterogeneity, but this would have raised questions about what they could tell us reliably. Unlike their assessments of sensitivity, such assessments of heterogeneity would not be accessible to meaningful validation. Hence it was not possible to develop a more realistic model. 
The choice of organisms used to produce the three generic assemblages in this study is open to debate. There was substantial species overlap between the three assemblages, with most phyla represented in all three scenarios, and this will have contributed to the relatively small differences in estimated SSDs. Forbes and Calow (9) suggest that risk assessments based upon SSDs should be relevant to specific sites. However, there is a question over whether we should seek to protect what is currently present at a site or whether we should protect what could be present at a site. A more sophisticated treatment of site-specific assemblages is certainly achievable for U.K. lotic systems, by using RIVPACS (19) to predict site-specific assemblages under pristine conditions for subsequent SSD construction.
In general, it seems from other studies that LC/EC values for chlorpyrifos can be broadly predictive of longer-term toxic effects and do not appear to over- or underestimate them greatly. Crane et al. (12) concluded that chlorpyrifos is highly toxic to arthropods, with the water flea Ceriodaphnia dubia the most sensitive species on the ECOTOX database with a 96 h LC50 of 0.057 µg L⁻¹. These data compare well with Bayesian estimates of HC5 for 96 h LC50 values calculated for the three generic assemblages, with expert judgment, which ranged from 0.020 to 0.032 µg L⁻¹. This suggests that chlorpyrifos concentrations should be less than 0.057 µg L⁻¹ and may need to be less than 0.02 µg L⁻¹ to protect all aquatic systems from harm.
Mesocosm results may help in “ground-truthing” laboratory estimates, although even these systems cannot fully represent the range of natural water bodies and taxa that could potentially be adversely affected in the natural environment (20). Giesy et al. (11) reviewed available mesocosm data for chlorpyrifos and concluded that effects on invertebrates could be reliably measured at concentrations of chlorpyrifos >0.2 µg L⁻¹, with recovery of most populations within 2-8 weeks, and that effects on fish occurred at concentrations >0.5 µg L⁻¹ (21). These values are an order of magnitude higher than those estimated in this study, which may reflect overly conservative estimates based on laboratory studies or problems in detecting low levels of effect on a wide variety of organisms in variable mesocosm experiments. If laboratory-to-field extrapolation factors are required to take lower field sensitivity into account, then it would be better to apply them to time-independent no effect concentrations rather than LC50 values estimated at an arbitrary multiple of 24 h.
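The comparison above can be made concrete with a short sketch. It is not the Bayesian procedure used in this study: it assumes a simple frequentist log-normal SSD fitted by sample moments, the function name `hc_p` is hypothetical, and the LC50 values are invented purely for illustration. Only the HC5 range (0.020 to 0.032 µg/L) and the mesocosm invertebrate threshold (0.2 µg/L) are taken from the text.

```python
import math
from statistics import NormalDist, mean, stdev


def hc_p(toxicity_values, p=0.05):
    """Concentration hazardous to a fraction p of species.

    Assumes log10-transformed toxicity values are normally distributed
    (a log-normal SSD) and fits the distribution by sample moments.
    """
    logs = [math.log10(v) for v in toxicity_values]
    fitted = NormalDist(mu=mean(logs), sigma=stdev(logs))
    return 10 ** fitted.inv_cdf(p)


# Hypothetical 96 h LC50 values in µg/L (illustration only, not study data).
lc50 = [0.06, 0.1, 0.3, 0.8, 2.0, 5.0, 12.0, 40.0]
hc5 = hc_p(lc50)    # 5th percentile of the fitted SSD
hc50 = hc_p(lc50, p=0.5)  # median of the fitted SSD (geometric mean)

# Values quoted in the text: Bayesian HC5 range and mesocosm threshold (µg/L).
hc5_low, hc5_high = 0.020, 0.032
mesocosm_invertebrates = 0.2

# Fold difference between the mesocosm threshold and each end of the HC5 range.
fold_vs_low = mesocosm_invertebrates / hc5_low
fold_vs_high = mesocosm_invertebrates / hc5_high

print(f"HC5 from hypothetical data: {hc5:.3g} µg/L")
print(f"Mesocosm threshold is {fold_vs_high:.2f}x to {fold_vs_low:.2f}x the HC5 range")
```

The fold differences show what "an order of magnitude higher" means for the quoted figures: the invertebrate mesocosm threshold is roughly 6 to 10 times the Bayesian HC5 estimates.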
This study has shown that valid theoretical criticisms of the SSD approach can be overcome. We have demonstrated some strategies for constructing and using SSDs that would help to minimize current deficiencies and make SSDs more environmentally relevant, technically robust, and useful for both site-specific and more generic risk assessments. In this investigation, the PNOECs of assemblages representing different freshwater habitats did not differ in sensitivity to chlorpyrifos by more than a factor of 4.3. In this case it may therefore be that risk management decisions could be based on a generic species assemblage without the need to consider different habitats. However, this may not always be the case, and the use of expert elicitation in the construction of SSDs can therefore help with two quite different aspects. First, expert knowledge allows us to account for differences between habitats. Second, it allows us to take account of the nonrepresentativeness of the highly selective data that may be available. Finally, we have also been able to show how more meaningful toxicological endpoints may be estimated from existing data sets through the use of time-to-event approaches to estimate low levels of effect which are, presumably, the protection goals of most regulatory frameworks.
In principle, similar approaches to those described in this paper may be adopted for other chemicals. However, there are undoubtedly some practical constraints. These include difficulties in soliciting opinions on species sensitivities to chemicals with which field biologists are unfamiliar and in obtaining raw data for a sufficiently wide range of species from which more meaningful toxicological endpoints may be calculated. Nevertheless, the benefits seem sufficiently important to justify efforts to address these deficiencies.

Acknowledgments
This work was funded by NERC Environmental Diagnostics Grant No. GST/02/2062, DEFRA Grant No. PN0933, and the Environment Agency of England and Wales. We thank Environment Agency staff and members of the Freshwater Biological Association for help during the elicitation exercise. We also thank colleagues, especially René van Wijngaarden, for providing access to raw data.

Supporting Information Available
Results of the expert elicitation displaying the taxonomic groups considered together with their likely habitats and assessed sensitivities. This material is available free of charge via the Internet at http://pubs.acs.org.

Literature Cited
(1) Solomon, K. R.; Baker, D. B.; Richards, R. P.; Dixon, D. R.; Klaine, S. J.; Lapoint, T. W.; Kendall, R. J.; Weisskopf, C. P.; Giddings, J. M.; Giesy, J. P.; Hall, L. W.; Williams, W. M. Ecological risk assessment of atrazine in North American surface waters. Environ. Toxicol. Chem. 1996, 15, 31-74.
(2) Steen, R. J. C. A.; Leonards, P. E. G.; Brinkman, U. A. T.; Barcelo, D.; Tronczynski, J.; Albanis, T. A.; Cofino, W. P. Ecological risk assessment of agrochemicals in European estuaries. Environ. Toxicol. Chem. 1999, 18, 1574-1581.
(3) Grist, E. P. M.; Leung, K. M. Y.; Wheeler, J. R.; Crane, M. Better bootstrap estimation of hazardous concentration thresholds for aquatic assemblages. Environ. Toxicol. Chem. 2002, 21, 1515-1524.
(4) Wagner, C.; Lokke, H. Estimation of ecotoxicological protection levels from NOEC toxicity data. Water Res. 1991, 25, 1237-1242.
(5) Aldenberg, T.; Slob, W. Confidence limits for hazardous concentrations based on log-logistically distributed NOEC toxicity data. Ecotoxicol. Environ. Saf. 1993, 25, 48-63.
(6) Van Straalen, N. M.; Van Rijn, J. P. Ecotoxicological risk assessment of soil fauna recovery from pesticide application. Rev. Environ. Contam. Toxicol. 1998, 154, 85-141.
(7) EC Technical Guidance Document No 1488/94 on Risk Assessment for Existing Substances and Directive 98/8/EC of the European Parliament and the Council Concerning the Placing of Biocidal Products on the Market. Part II; EUR 20418 EN/2; European Commission Joint Research Centre: Luxembourg, 2003.
(8) Hart, A. Probabilistic Risk Assessment for Pesticides in Europe: Implementation and Research Needs; Central Science Laboratory: York, U.K., 2001.
(9) Forbes, V. E.; Calow, P. Species sensitivity distributions revisited: A critical appraisal. Hum. Ecol. Risk Assess. 2002, 8, 473-492.
(10) Species Sensitivity Distributions in Ecotoxicology; Posthuma, L., Suter, G. W., II, Traas, T. P., Eds.; Lewis Publishers: Boca Raton, FL, 2002.
(11) Giesy, J. P.; Solomon, K. R.; Coats, J. R.; Dixon, K. R.; Giddings, J. M.; Kenaga, E. E. Ecological risk assessment in North American aquatic environments. Rev. Environ. Contam. Toxicol. 1999, 160, 1-129.
(12) Crane, M.; Whitehouse, P.; Comber, S.; Watts, C.; Giddings, J.; Moore, D. R. J.; Grist, E. P. M. Evaluation of probabilistic risk assessment of pesticides in the UK: Chlorpyrifos use on top fruit. Pest Manage. Sci. 2003, 59, 512-526.
(13) Whitehouse, P.; Cartwright, N. Standards for environmental protection. In Pollution Risk Assessment and Management: A Structured Approach; Douben, P. E. T., Ed.; Wiley: Chichester, U.K., 1998; pp 235-272.
(14) van Wijngaarden, R.; Leeuwangh, P.; Lucassen, W. G. H.; Romijn, K.; Ronday, R.; Van Der Velde, R.; Willigenburg, W. Acute toxicity of chlorpyrifos to fish, a newt, and aquatic invertebrates. Bull. Environ. Contam. Toxicol. 1993, 51, 716-723.
(15) Mayer, F. L.; Ellersieck, M. R.; Krause, G. F.; Sun, K.; Lee, G.; Buckler, D. R. Time-concentration-effect models in predicting chronic toxicity from acute toxicity data. In Risk Assessment With Time to Event Models; Crane, M., Newman, M. C., Chapman, P. F., Fenlon, J., Eds.; Lewis Publishers: Boca Raton, FL, 2002; pp 39-67.
(16) Fitter, R.; Manuel, R. Lakes, Rivers, Streams and Ponds of Britain and North-West Europe; Collins Photo Guide, Harper Collins: Hong Kong, 1994.
(17) O’Hagan, A.; Crane, M.; Grist, E. P. M.; Whitehouse, P. Estimating Species Sensitivity Distributions with the Aid of Expert Judgements; Research Report No. 556/05; Department of Probability and Statistics, University of Sheffield: Sheffield, U.K., 2005. Freely downloadable from http://www.shef.ac.uk/~st1ao/pub.html.
(18) Wheeler, J. R.; Leung, K. M. Y.; Morritt, D.; Whitehouse, P.; Sorokin, N.; Toy, R.; Holt, M.; Crane, M. Freshwater to saltwater toxicity extrapolation using species sensitivity distributions. Environ. Toxicol. Chem. 2002, 21, 2459-2467.
(19) Wright, J. F.; Furse, M. T.; Armitage, P. D. Use of macroinvertebrate communities to detect environmental stress in running waters. In Water Quality and Stress Indicators in Marine and Freshwater Systems; Sutcliffe, D. W., Ed.; Freshwater Biological Association: Ambleside, U.K., 1994; pp 15-34.
(20) Crane, M. Research needs for predictive multispecies tests in aquatic toxicology. Hydrobiologia 1997, 346, 149-155.
(21) Giddings, J. M.; Biever, R. C.; Racke, K. D. Fate of chlorpyrifos in outdoor pond microcosms and effects on growth and survival of bluegill sunfish. Environ. Toxicol. Chem. 1997, 16, 2353-2362.

Received for review May 6, 2005. Revised manuscript received October 28, 2005. Accepted October 28, 2005. ES050871E
