Environ. Sci. Technol. 2001, 35, 3294-3301

Geostatistical Assessment and Validation of Uncertainty for Three-Dimensional Dioxin Data from Sediments in an Estuarine River

NOÉMI BARABÁS, PIERRE GOOVAERTS,* AND PETER ADRIAENS

Environmental and Water Resources Engineering, Department of Civil and Environmental Engineering, The University of Michigan, Ann Arbor, Michigan 48109-2125

Contaminated sediment management is an urgent environmental and regulatory issue worldwide. Because remediation is expensive, sound quantitative assessments of uncertainty about the spatial distribution of contaminants are critical, but they are hampered by the physical complexity of sediment environments. This paper describes the use of geostatistical modeling approaches to quantify uncertainty of 2,3,7,8-tetrachlorodibenzo-p-dioxin concentrations in Passaic River (New Jersey) sediments and to incorporate this information in decision-making processes, such as delineation of contaminated areas and additional sampling needs. First, coordinate transformation and analysis of three-dimensional semivariograms were used to describe and model the directional variability, accounting for the meandering course of the river. Then, indicator kriging was employed to provide models of local uncertainty at unsampled locations without requiring a prior transform (e.g., lognormal) of concentrations. Cross-validation results show that the use of probability thresholds leads to more efficient delineation of contaminated areas than a classification based on the exceedence of regulatory thresholds by concentration estimates. Depending on whether additional sampling aims at reducing prediction errors or misclassification rates, the variance of local probability distributions or a measure of the expected closeness to the regulatory threshold can be used to locate candidate locations.

Introduction

Contaminated sediment management is an urgent problem worldwide, as the relative risks associated with remedial dredging have to be weighed against those to the quality of the watershed if the sediments are left in place. Particular emphasis is placed on the evaluation of uncertainty about the spatial distribution of pollutants since remedial action can be very expensive. Uncertainty assessment is also critical for the cost-efficient design of sampling schemes. Geostatistics provides a set of tools that allows explicit modeling of uncertainty in the form of local probability distributions for concentrations (1, 2). However, these tools need to be adapted to the specific features of each data set. The first challenge is that the physical domain where contaminated sediments are often found (estuaries and rivers) has an irregular shape and can display complex connectivity patterns (channels, networks). Geostatistics as a means to analyze spatial relationships relies on Euclidean distances between points, which are often meaningless in a meandering river, as these distances could relate to points over intervening land. A more appropriate measure of distance should follow the path of the channel, since sediment deposition follows the specific path of water flow (3). Coordinate transformation prior to analysis provides a possible solution to this problem: generating a grid within the river "straightens out" the domain of analysis, ensuring that distances are measured within the river. An additional burden arises from the three-dimensional nature of the domain under study. Although geostatistics offers tools that allow consideration of both vertical and horizontal variability in an integrated fashion, they are seldom used, and too often each depth is processed separately (4). Once spatial patterns have been modeled using three-dimensional semivariograms, kriging allows one to capitalize on relationships between observations to make predictions at unsampled locations. In this paper a variant of kriging, called indicator kriging, is used to derive the distribution of possible outcomes (e.g., concentrations) at unsampled locations. The conditional cumulative distribution functions (ccdfs) provide a model of local uncertainty that can be used for subsequent decision-making such as remediation or additional sampling (1, 2). However, it is critical to evaluate the quality of these uncertainty models prior to using them. Kriging estimates are commonly compared with observations that have been either temporarily removed one at a time (leave-one-out or cross-validation) or set aside for the whole analysis (jackknife), and statistics such as mean square or mean absolute errors of prediction are computed. The difficulty here resides in the fact that the prediction is a probability distribution instead of a single value.

* Corresponding author phone: (734) 936-0141; fax: (734) 763-2275; e-mail: [email protected].
Several authors (5-7) have recently proposed criteria for measuring the goodness of local uncertainty models, and these are applied here. The primary purpose of this paper is to illustrate a set of methodologies. The objective is 4-fold: (1) demonstrate the use of coordinate transformation and 3-D variography to model the spatial variability of contaminant concentration in a river environment, (2) present an application of indicator kriging for the assessment of uncertainty, (3) introduce criteria for evaluating the accuracy of uncertainty models through cross-validation, and (4) show how local uncertainty can be incorporated in decision making, such as delineation of contaminated areas and selection of locations for additional sampling. The methodology is illustrated using concentrations of 2,3,7,8-tetrachlorodibenzo-p-dioxin (TCDD) measured in the sediments of the Passaic River. The results represent the initial phase of research aimed at determining the expressed potential for biochemical transformation of dioxins in sediments, as was demonstrated in laboratory studies (8, 9).

Theory

The uncertainty about an attribute z (e.g., dioxin concentration) at any location u, where u is a vector of spatial coordinates, can be quantified by deriving the ccdf Prob{Z(u) ≤ z|(n)}, where Z(u) is the random variable (RV) modeling the uncertainty at u based on the values of neighboring observations. This section reviews the nonparametric approach of indicator kriging to model local probability distributions and the use of accuracy plots to assess the goodness of these models. Criteria for delineation of contaminated areas and location of additional samples are then introduced.

10.1021/es010568n

© 2001 American Chemical Society. Published on Web 07/13/2001.

Indicator Kriging (IK). Nonparametric algorithms do not assume any particular shape or analytical expression for the conditional distributions. Instead, the value of the function F(u; z|(n)) is determined for a series of K threshold values zk (e.g., the nine deciles of the sample distribution) discretizing the range of variation of z:

F(u; zk|(n)) = Prob{Z(u) ≤ zk|(n)}   k = 1, ..., K   (1)
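As an illustration of the indicator coding behind eq 1, the sketch below builds a discrete ccdf at an unsampled location from indicator-transformed neighbors. The kriging weights are replaced by inverse-distance weights purely for illustration (a simplifying assumption; the paper uses indicator kriging proper), and all numbers are hypothetical:

```python
import numpy as np

def indicator_ccdf(z_obs, dists, thresholds):
    """Discrete ccdf F(u; z_k|(n)) from indicator-coded neighbors.
    Inverse-distance weights stand in for kriging weights (illustration only)."""
    w = 1.0 / np.asarray(dists)
    w /= w.sum()
    # Indicator coding: i(u_a; z_k) = 1 if z(u_a) <= z_k, else 0 (basis of eq 1)
    ind = (np.asarray(z_obs)[:, None] <= np.asarray(thresholds)[None, :]).astype(float)
    F = w @ ind                      # weighted average of indicators per threshold
    return np.maximum.accumulate(F)  # enforce a non-decreasing ccdf

# Nine deciles of a hypothetical sample distribution (ppt)
zk = np.array([2.0, 5.0, 8.0, 10.0, 15.0, 25.0, 60.0, 150.0, 600.0])
F = indicator_ccdf(z_obs=[4.0, 12.0, 300.0], dists=[10.0, 25.0, 40.0], thresholds=zk)
```

With kriging weights substituted for the inverse-distance weights, the same averaging of indicators yields the ccdf values F(u; zk|(n)) of eq 1.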

The resolution of the discrete ccdf is then increased by interpolation within each class (zk, zk+1) and extrapolation beyond the two extreme threshold values z1 and zK. The underlying assumption is that the shape of the ccdf within a threshold class or beyond the extreme thresholds is similar to the shape of the sample cdf, that is, the marginal distribution of the data. The technique, known as "linear interpolation between tabulated bounds", is described in detail in ref 2 (pp 284-328); practical implementation of the indicator approach is discussed there and illustrated in ref 10.

Goodness of Uncertainty Models. Prior to using such uncertainty models in decision-making, one should evaluate how well these models capture the uncertainty about the unknown values. At any location u, knowledge of the ccdf F(u; z|(n)) allows the computation of a series of symmetric p-probability intervals (PI) bounded by the (1 - p)/2 and (1 + p)/2 quantiles of that ccdf. For example, the 0.5-PI is bounded by the lower and upper quartiles of the ccdf, [F⁻¹(u; 0.25|(n)), F⁻¹(u; 0.75|(n))]. A correct modeling of local uncertainty would entail that there is a 0.5 probability that the actual z-value at u falls into that interval or, equivalently, that over the study area 50% of the 0.5-PIs include the true value. If a set of z-measurements and independently derived ccdfs are available at N locations uα (e.g., through cross-validation or jackknife), {[z(uα), F(uα; z|(n))], α = 1, ..., N}, the fraction of true values falling into the symmetric p-PI can be computed as

ξ̄(p) = (1/N) ∑_{α=1}^{N} ξ(uα; p)   ∀ p ∈ [0, 1]   (2)

with ξ(uα; p) = 1 if F⁻¹(uα; (1 - p)/2|(n)) ≤ z(uα) ≤ F⁻¹(uα; (1 + p)/2|(n)), and zero otherwise. The accuracy plot is a scattergram of the estimated fractions ξ̄(p) versus the expected fractions p. Deutsch (5) proposed to assess the closeness of the estimated and theoretical fractions using the following "goodness" statistic:

G = 1 - ∫₀¹ [3a(p) - 2][ξ̄(p) - p] dp   (3)

where a(p) = 1 if ξ̄(p) ≥ p, and zero otherwise. Twice as much importance is given to deviations when ξ̄(p) < p, that is, when over the study area fewer observations fall into the probability interval than expected (inaccurate case). Indeed, inaccuracy could lead one to underestimate the presence of high concentrations (i.e., values outside the symmetric PI), with consequences for risk assessment. The weight |3a(p) - 2| equals 1 instead of 2 for the accurate case, that is, the case where the fraction of true values falling into the p-PI is larger than expected. The best case is G = 1.

Delineation of Contaminated Areas. Once ccdfs have been modeled for all grid nodes, the delineation of contaminated areas can follow several possible paths (11). A straightforward approach consists of classifying as contaminated all locations where the probability of exceeding the critical concentration threshold zc is greater than a critical probability threshold pc:

u is contaminated if Prob{Z(u) > zc} = [1 - F(u; zc|(n))] > pc   (4)

A first option for selecting pc consists of negotiating with regulatory authorities to define the meaning of an "acceptably low" probability for the site under study; for example, both Johnson (12) and Buxton et al. (13) mentioned a negotiated probability threshold of 0.2. Garcia and Froidevaux (14) used two probability thresholds to delineate clean areas (pc ≤ 0.2), contaminated areas (pc > 0.8), and uncertain or unclassified areas (0.2 < pc ≤ 0.8) where further investigation should be conducted (15). In this paper, cross-validation is used to investigate how the choice of pc influences the proportion of locations that are misclassified. In cross-validation mode, each observation is removed one at a time, and the ccdf at the data location is estimated using the remaining observations except those within the same core.

An alternative is to estimate the z-value at each location u and to classify as contaminated all locations where the threshold zc is exceeded. A common estimate is the mean of the ccdf, known as the E-type estimate and computed as

z*_E(u) = (1/2K) ∑_{k=1}^{K} [F⁻¹(u; k/K|(n)) + F⁻¹(u; (k-1)/K|(n))]   (5)

where K is typically taken as 100, and F⁻¹(u; k/K|(n)) then represents the kth percentile of the ccdf. The classification rule is

u is contaminated if z*_E(u) > zc   (6)

Because the classification involves estimates, there is actually a risk that u is wrongly declared contaminated (false positive, risk α) or clean (false negative, risk β). These two types of risk can be computed directly from F(u; z|(n)) as

α(u) = Prob{Z(u) ≤ zc | z*_E(u) > zc} = F(u; zc|(n))   (7)

β(u) = Prob{Z(u) > zc | z*_E(u) ≤ zc} = 1 - F(u; zc|(n))   (8)

Location of Additional Samples. Because of the large uncertainty about the actual pollutant concentrations, it might be more advantageous to collect additional information before the final classification into clean and contaminated areas. Consider that resources are available for the collection of K additional samples. A well-established principle of efficient site exploration and remediation is that a sampling procedure must be designed with a purpose in mind. A first objective may be the reduction of the uncertainty about the unknown concentrations, which is achieved by sampling those K locations with the largest ccdf variance

s²(u) = ∫_{-∞}^{+∞} [z - z*_E(u)]² f(u; z|(n)) dz   (9)

where f(u; z|(n)) is the probability density function of Z(u), and z*_E(u) is the mean of that function. In many other situations, the primary concern is to assess the intensity and areal extent of pollutant concentrations exceeding a regulatory threshold. In other words, it is the remediation decision that matters, not the accuracy of the prediction itself. The objective would then be to minimize the uncertainty about the exceedence of the critical threshold zc. Following ref 16, we propose as sampling criterion the ratio of the ccdf standard deviation to the absolute difference between the E-type estimate and the regulatory threshold:

r(u) = s(u)/|z*_E(u) - zc|   with z*_E(u) ≠ zc   (10)

The K locations with the largest r(u) values, including the case z*_E(u) = zc, are sampled. The quantity r(u) is large if the denominator is small, that is, if the pollutant concentration and the threshold value are close, so that the uncertainty about the exceedence of that threshold becomes large. For the same absolute difference, the ratio will be larger if the variance of the ccdf, which measures the uncertainty at u, is large.

FIGURE 1. Layout of the sampling profiles in the original space and in the new coordinate system. The star marks the Lister Ave site of Diamond Alkali, a major former point source.
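The decision quantities of eqs 4-10 can be sketched numerically from a ccdf tabulated at the nine decile thresholds. The interpolation between tabulated bounds is reduced here to linear interpolation with clamped tails, and the thresholds and probabilities are hypothetical, so this is an illustration of the formulas rather than the paper's implementation:

```python
import numpy as np

def ccdf_summaries(zk, Fk, zc, K=100):
    # E-type estimate, eq 5: mean of paired percentiles F^{-1}(k/K), F^{-1}((k-1)/K)
    p_hi = np.arange(1, K + 1) / K
    p_lo = np.arange(0, K) / K
    q_hi = np.interp(p_hi, Fk, zk)          # quantiles; tails clamped to z1, zK
    q_lo = np.interp(p_lo, Fk, zk)
    z_E = np.mean(0.5 * (q_hi + q_lo))
    # ccdf variance, eq 9, discretized over the same percentile grid
    q_mid = np.interp((p_hi + p_lo) / 2, Fk, zk)
    s2 = np.mean((q_mid - z_E) ** 2)
    # ingredient of decision rule (4): Prob{Z(u) > zc} = 1 - F(u; zc|(n))
    p_exceed = 1.0 - np.interp(zc, zk, Fk)
    # sampling criterion, eq 10
    r = np.inf if z_E == zc else np.sqrt(s2) / abs(z_E - zc)
    return z_E, s2, p_exceed, r

zk = np.array([2.0, 5.0, 8.0, 10.0, 15.0, 25.0, 60.0, 150.0, 600.0])  # deciles (ppt)
Fk = np.linspace(0.1, 0.9, 9)                                          # F(u; zk|(n))
z_E, s2, p_exceed, r = ccdf_summaries(zk, Fk, zc=10.0)
contaminated_rule4 = p_exceed > 0.5        # eq 4 with pc = 0.5
contaminated_rule6 = z_E > 10.0            # eq 6
```

The risks of eqs 7 and 8 follow directly as α(u) = F(u; zc|(n)) for locations classified contaminated and β(u) = 1 - F(u; zc|(n)) for locations classified clean.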

Materials and Methods

Study Site. The study area is a 10 km stretch of the Passaic River in New Jersey, under investigation by the EPA as part of its evaluation of the Diamond Alkali Superfund site. Historical industrial activity as well as nonpoint sources of pollution have resulted in elevated levels of dioxins in the sediments (17, 18). The river is part of a tidal estuarine system, the Hudson-Raritan Estuary, and its flow pattern is determined by both seasonally varying freshwater input and tidal events. Although Passaic River contamination data have been analyzed by several authors (19-24), only one geostatistical study has been published so far on 2378-TCDD (19), which aimed at assessing the risk of contamination caused by the Diamond Alkali facility. It was found that sediments contaminated by the facility's outfall have been buried by cleaner sediments and that dioxin contamination upstream and downstream of the facility was caused by additional sources as well. Polychlorinated dibenzo-p-dioxin (PCDD) data were acquired from the U.S. Environmental Protection Agency and from a database of New Jersey sediment data compiled by the National Oceanic and Atmospheric Administration (25, 26). The data set includes 94 sediment cores that were taken along 27 approximately equally spaced (370 m) transects in 1995 (Figure 1, left). Most transects consist of a mid-stream core and two flanking cores about 48 m apart. In this analysis, data for 2378-TCDD, the most toxic congener, are utilized. Observations represent depth-averaged concentrations over approximately 30 cm core intervals. Figure 2 (top) maps these observations in transformed space (discussed in the next section). About 9.0% of the data were below the level of detection (sample dependent) and were reset to half the detection-limit value, in agreement with previous data analyses at the study site (19-24). The sample histogram and summary statistics of the entire data set are displayed in Figure 2 (bottom).
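The non-detect handling just described (resetting values below the sample-dependent detection limit to half that limit) can be sketched as follows; the arrays are hypothetical:

```python
import numpy as np

# Hypothetical raw 2378-TCDD measurements (ppt); NaN marks a non-detect whose
# sample-dependent detection limit is given alongside.
conc = np.array([np.nan, 34.0, 870.0, np.nan, 12.0])
det_limit = np.array([1.0, 0.5, 0.5, 2.0, 0.5])

# Non-detects are reset to half the detection limit, as in the text.
bdl = np.isnan(conc)
conc[bdl] = 0.5 * det_limit[bdl]

frac_bdl = bdl.mean()   # fraction of data below detection (about 9% in the paper)
```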
The distribution is highly positively skewed and bimodal, which indicates the presence of zones of high contamination (hot-spots). Although a logarithmic scale was used to facilitate presentation of such skewed data, either original or scaled data can be used for indicator kriging, as the results are not influenced by data transformation.

FIGURE 2. 2378-TCDD concentrations [log(ppt)] in transformed space and their histogram. Crosses denote data locations (vertical exaggeration for clarity).

Grid Generation. Semivariograms measure the average dissimilarity between data as a function of the separation distance. Due to the meandering nature of the river, physical distances between observations cannot be identified with a mere Euclidean distance, as some of them would necessarily be measured over land. To ensure that all distances are measured within the water, sample coordinates were transformed prior to analysis. This amounts to mathematically "unbending" the meanders. The coordinate transformation was accomplished with Gridgen, a grid-generating software (27). Grid generation is widely used in the field of flow simulation in channels or around bodies where boundary conditions apply along irregular surfaces (hydrodynamics, aeronautics, etc.); see ref 28 for an in-depth treatment of the theory and ref 29 for a summary. First, the physical boundary is divided into four segments that correspond to the four sides of a rectangle. For this study site, the two banks correspond to two opposite lengths of the rectangle, while the lines marking the northern and southern ends of the river section correspond to the other two opposite sides. Then an equal number of grid points are laid out along each of the pairs of facing boundaries. Depending on the problem, various algorithms, ranging from Laplace transforms to linear interpolation, calculate grid lines between the extreme grid point pairs. In this paper, the river boundary is approximated by a series of straight lines connecting the flanking sampling sites, hence a linear interpolation algorithm has been used. With sufficient resolution, the transformation leads to a one-to-one correspondence between the two coordinate systems. Grid nodes do not necessarily coincide with data locations, which thus need to be assigned to the nearest node within the limits of resolution and sample coordinate precision. One unit distance in the transverse direction corresponds to 0.6-4.0 m depending on the local width of the channel, and in the flow direction one unit distance corresponds to 19-21 m. Figure 1 shows the sampling points in the original and transformed coordinate systems. Subsequent geostatistical analysis is conducted in the transformed space, and the results need merely be mapped back into the original coordinate system.

Modeling of Local Uncertainty. The first step in the kriging procedure is the calculation of experimental indicator semivariograms in three dimensions and the fitting of anisotropic models. Nine thresholds, corresponding to the nine deciles (0.1, 0.2, ..., 0.9) of the sample histogram, were used. For any semivariogram model three parameters have to be determined: the nugget effect, the sill, and the range.
The nugget effect is a discontinuity at the origin relating to measurement error and/or spatial sources of variation at distances smaller than the shortest sampling interval (2, p 31). The sill approaches the sample variance, and the range indicates the maximum distance within which a relationship exists among data pairs. Although programs exist that compute semivariograms in various directions in 3-D and fit a model automatically, the modeling is carried out visually here, using the interactive program xgam (30). A visual procedure is preferred because erratic fluctuations in semivariograms caused by sparse information are better handled by expert judgment than by blind regression procedures. Except for a few books (31), the modeling of 3-D semivariograms has received little attention in the geostatistical literature. The following protocol was adopted:

1. The experimental semivariograms are computed in the three main directions: x (along the river stream), y (transverse to the stream), and z (vertical). These directions are considered the main axes of the ellipsoid of anisotropy, hence they suffice to characterize the variability in all other directions.

2. The nugget effect is assumed to be isotropic (direction-independent) and is estimated from the direction with the highest resolution, that is, the smallest separation distance between observations, which is usually the vertical one.

3. In addition to the nugget component, several basic models (e.g., spherical, exponential) need to be combined to fit the complex shape of directional semivariograms. Key decisions involve the number and type of models as well as the values of the range (distance of spatial dependence) and sill (contribution to the total variance). Typically, horizontal and vertical semivariograms display different sills, which indicates that the variability is larger in certain directions; the anisotropy is then referred to as zonal (2).
The idea is to select two to three basic models and first estimate each sill such that their sum plus the nugget effect approximates the largest sill displayed by the directional semivariograms. Keeping this total constant, the sills can be adjusted so that in every direction one or more basic models add up to the total sill. Then, the range can be freely chosen for each of the basic models in each direction until a reasonable fit is achieved, which amounts to approximating zonal anisotropy by geometric anisotropy. Indicator kriging was performed using the ik3d procedure in GSLIB (32). Figure 3 shows an example of a ccdf model obtained for the distribution of 2378-TCDD at the location depicted by a black diamond on the location map in Figure 2 (top). The accuracy plot is built from the data and cross-validated ccdfs using a GSLIB-type program available at http://www.ualberta.ca/~cdeutsch/. Ccdfs were processed using the program postik to calculate the E-type estimate, the variance, and the probabilities of exceedence of given thresholds.

FIGURE 3. Ccdf model for 2378-TCDD concentrations [log(ppt)] at the location depicted by a black diamond in Figure 2, left. The black dots denote the decile thresholds.

Delineation of Contaminated Areas. Currently, a wide range of guidelines exists to assist in evaluating potential risks from dioxin contamination. Based on the detection limits of standard analytical methods as listed in ref 33, two thresholds, zc = 10 ppt and 25 ppt, are considered for this study, which are exceeded by 81.0% and 76.0% of the data, respectively. For each threshold, the probability-threshold rule (4) and the concentration-threshold rule (6) are applied to the cross-validated ccdfs at the 538 data locations. Classification results are then compared with actual concentrations, allowing the computation of the proportions of false positives and false negatives. For decision rule (4), a series of probability thresholds pc is used, and the resulting rate of misclassification is computed. The pc-value that minimizes this rate is chosen for classification.
In the case of decision rule (6), the ability of the quantities α and β to assess the risk of misclassification is evaluated by computing their averages separately for the locations correctly and wrongly classified as clean (β) or contaminated (α). For example, in the case of β, which measures the risk of declaring a contaminated location clean, the average β-value is expected to be large (ideally 1.0) for locations that are actually contaminated and small (ideally 0.0) for locations that are clean. By both of these measures one can expect to classify as contaminated areas significantly beyond those with very large concentration estimates. A procedure capable of further distinguishing such hot-spots may be necessary to prioritize remedial action.

Additional Sampling. Cross-validated ccdfs are used to compute at each data location the sampling criteria (9) and (10), the absolute prediction error |z(uα) - z*_E(uα)|, and the closeness to the critical threshold |z(uα) - zc|. The ability of criterion (9) to forecast large prediction errors and of r(u) to detect closeness to the threshold value is investigated by computing the correlation between these quantities. In addition, two subsets of the 50 data locations with the largest values for each of the two criteria are selected, and the histograms of sampled values are compared.

FIGURE 4. Experimental indicator semivariograms in the x, y, and z directions, with the anisotropic model fitted. The threshold is the median of the sample histogram of Figure 2. The sills are standardized to the sample variance. Nugget effect (NE), sill, and range are as indicated.
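The "straightening" transform described under Grid Generation can be mimicked with a toy structured grid: facing bank points are connected by linear interpolation, so that the integer (row, column) indices of the grid become the transformed coordinates. This is a simplified stand-in for Gridgen, with made-up bank geometry:

```python
import numpy as np

def generate_grid(left_bank, right_bank, n_across):
    """Structured grid by linear interpolation between facing bank points.
    left_bank and right_bank are (n_along, 2) arrays of matched boundary points;
    returns an (n_along, n_across, 2) array of grid nodes."""
    t = np.linspace(0.0, 1.0, n_across)[None, :, None]   # cross-channel fraction
    L = np.asarray(left_bank)[:, None, :]
    R = np.asarray(right_bank)[:, None, :]
    return (1.0 - t) * L + t * R

# A gently curving toy channel: banks offset by 10 m across
s = np.linspace(0.0, 100.0, 6)
left = np.column_stack([s, 5.0 * np.sin(s / 30.0)])
right = np.column_stack([s, 5.0 * np.sin(s / 30.0) + 10.0])
grid = generate_grid(left, right, n_across=5)
```

In the transformed (index) space, distances between nodes follow the channel rather than cutting across land, which is the property the semivariogram analysis requires; data are then assigned to their nearest grid node.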

Results and Discussion

Local Uncertainty Models. The calculated indicator semivariograms for the fifth decile, z0.5, are shown in Figure 4 (actual data points) along with the following anisotropic model (dashed lines):

γ(h; z0.5) = 0.05 g0(h) + 0.123 Sph(|hx|/1u, |hy|/40u, |hz|/∞) + 0.29 Sph(|hx|/∞, |hy|/∞, |hz|/2.8m) + 0.057 Sph(|hx|/65u, |hy|/∞, |hz|/∞)   (11)

where g0(h) is a nugget effect model, and Sph refers to the spherical semivariogram model with ranges along the x, y, and z directions given in the denominators (units are meters in the vertical direction and transformed units in the horizontal plane; x: u ≈ 10 m, y: u ≈ 2 m). An infinite range means that the basic model does not contribute to that particular direction. For example, the second spherical model, with a sill of 0.29, contributes only to the vertical direction, with a range of 2.8 m. For all deciles, indicator semivariograms display similar patterns of geometric anisotropy (i.e., different ranges but the same sill in the different directions): the longest range is in the x direction, at several hundred meters, while the shortest range is in the z direction, at about 2-3 m. The y direction has an intermediate range of around 50-80 m. Zonal anisotropy patterns vary. For the first three deciles, the variability in the three directions is similar, as indicated by the similarity of the standardized sills (relative to the variance) (Figure 4, right bottom graph). In the case of the top 6 deciles,


including the model in Figure 4, variability in the z direction increases with the threshold, indicating less vertical continuity of high values, and thus the effect of hot-spots especially in subsurface samples. Significant variability in the z-direction would be expected on account of the temporal variation of dioxin input in the estuarine system. Dioxins found in the sediments are derived from known point sources such as the Diamond Alkali plant, which manufactured 2,4,5-trichlorophenate (TCP) in the 1960s (34), and potentially various combined sewage overflows (23), in addition to diffuse sources such as stormwater runoff and deposition of atmospheric dioxins resulting from combustion-related activities (35). While 245-TCP tends to generate dioxin patterns dominated by 2378-TCDD, the homologue (isomers with the same degree of chlorination) pattern of combustion-related sources is more evenly distributed among tetra- to octa-CDD, and 2378-TCDD represents only a minor fraction of total TCDD and PCDD (36). In addition, it was recently shown in laboratory studies that 2378-TCDD may be in a state of flux as the result of dechlorination of higher chlorinated homologues to form 2378-TCDD and further dechlorination to di-CDD (8, 9). The combined effect of source input and sediment pattern alteration would affect the resultant 2378-TCDD concentrations. Similarly, changes with the flow direction are an indication for the combined effect of localized dioxin input sources along the meander and contaminated sediment transport. As the latter would result in increased sedimentation in the bends of the meander, dioxin pattern distribution would be affected. Ehrlich et al. (22) reported, using polytopic vector analysis (PVA), that Passaic River patterns could be largely deconvoluted into a number of dominant constituent source profiles (end-members) but indicated that pattern alteration

FIGURE 5. Accuracy plot showing the proportion of true values falling within probability intervals versus the probability p and the value of the goodness statistic G. due to in-situ microbial or abiotic activity were not considered. With this caveat in mind, the changes in the x, y, and z directions are likely to be largely attributable to source pattern contributions. The accuracy plot of 2378-TCDD concentrations (Figure 5) shows good agreement between experimental and expected probabilities of lying within probability intervals. Uncertainty is slightly underestimated for p > 0.65 (inaccurate case, i.e., spread of distribution not wide enough to capture the data behavior). This effect is due to the high values: as the PI widens, more of these should be captured, but these values are so largesseveral orders of magnitude larger at the hot-spots, that their respective distributions are too narrow to capture them. Nevertheless, the goodness statistic G is close to 1, which indicates that ccdfs provide a reasonable model of local uncertainty. Contaminated Area Delineation. Figure 6 shows the map of E-type estimates and the corresponding classification according to exceedence of the concentration threshold (eq 6) with zc ) 10 ppt. The three-dimensional variability of logconcentrations is clearly visible, with an identifiable hotspot at 4000-6000 m in the flow direction of the data set used, which should locate this zone at and downstream of the Lister Ave (Diamond Alkali) facility (Figure 1). Moreover, the depth of highest concentration ranges from 1.5 to 3 m below the river bottom, which at a mean sedimentation rate of 1.7 × 108 kg/year and based on isotopic decay of radionuclides correlates to a maximum deposit in the 19601970 decade (19). It was estimated (34) that 4-8 kg of 2378TCDD was discharged in the river during 245-TCP production in the same period. 
Nevertheless, the 10 ppt criterion leads one to classify almost the entire sediment volume (95%) as contaminated. Cross-validation indicates that the misclassification rate for this criterion is quite low at 10.6% for zc ) 10 ppt. Similarly, for zc ) 25 ppt, misclassification is 10.4%. The average risk β to declare clean a contaminated location (eq 7) is 0.26 (0.22) for zc ) 10 (25) ppt at locations that were correctly classified as clean. As expected, this is less than for locations that were incorrectly classified as clean, where the risk β is 0.43 (0.36). Hence, β is an appropriate measure of the risk of misclassification using the concentration threshold (6). The impact of the choice of probability threshold pc on the rate of misclassification of data locations using decision

FIGURE 6. Map of E-type estimates of 2378-TCDD concentrations [log(ppt)] and indicators of exceedence of the threshold of 10 ppt by these estimates (black pixels).

FIGURE 7. Impact of the probability threshold pc on the proportion of sampling locations that are wrongly classified as clean or contaminated for the two thresholds (10 ppt - solid line, 25 ppt dashed line). The optimal probability threshold pc for 10 ppt is used to delineate contaminated areas (black pixels). rule (eq 4) is shown in Figure 7 (top graph). The minimum rate of 9.7 (9.3)% is reached at pc ) 0.54 (0.52) for zc ) 10 (25) ppt, which is sensibly lower than the proportion of data that VOL. 35, NO. 16, 2001 / ENVIRONMENTAL SCIENCE & TECHNOLOGY

9

3299

FIGURE 8. Two measures of uncertainty for additional sampling strategies: the ccdf variance and r(u) value for zc ) 10 ppt and the histograms of log(2378-TCDD) concentrations at the 50 locations with the largest values for both measures. actually exceed the thresholds, 0.81 (0.76). These misclassification rates are slightly smaller than the ones produced by the decision rule (eq 6) based on E-type estimates. Considering the negotiated value of 0.2 for pc in Johnson (12) and Buxton et al. (13), the values calculated here may not be acceptable for decision-makers, in such cases, an alternative is to rank the locations by decreasing risk of contamination and to clean the most risky locations. Figure 7 (bottom graph) shows the delineation of contaminated areas (black pixels) for the first threshold of 10 ppt. Interestingly, the use of probability thresholds leads to the classification as contaminated of a smaller sediment volume (91 and 80%) compared with the E-type estimate-based criterion (>95%). The combination of smaller estimated volumes for remediation with lower misclassification rates (assuming equal importance of false negatives and positives) indicates the superiority, for this data set, of the use of a probability threshold versus a simple concentration threshold. Additional Sampling. Uncertainty, as measured by the variance of the local probability distributions, is mapped in Figure 8 (left top graph). As expected, uncertainty is greater in sparsely sampled regions; it increases in the upstream direction and with depth. The greatest uncertainty is in the central regions. Cross-validation results show that the correlation of s2(uR) with the absolute prediction error is quite 3300
high at 0.56. The mean absolute error at the locations with the 50 highest values of s2(uR) is almost double (1.10) the overall mean absolute error (0.59), so this measure is a good indicator of where large prediction errors are likely. Figure 8 (left bottom graph) displays the value of r(u), a measure of the uncertainty about exceedence of the 10 ppt concentration threshold. Like the ccdf variance, this criterion increases with depth and in the upstream direction; it is highest at depths exceeding 2 m, where estimates fall near the threshold value and the uncertainty about the classification is greatest. The rank correlation between r(uR) and |z(uR) - zc| is -0.47 (-0.35), reflecting the ability of this criterion to detect locations where actual concentrations are close to the regulatory threshold. The histogram of 2378-TCDD concentrations at the 50 locations with the largest r(u) values demonstrates the same effect: the sampled concentrations are distributed about the threshold value of 1 (= log10(10 ppt)). In contrast, using s2(uR) leads one to sample uniformly from the entire bimodal distribution of the data set; compare Figure 8 (right top graph) with Figure 3.
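The classification and sampling-prioritization logic discussed above can be sketched in a few lines of Python. This is an illustrative sketch on synthetic data, not the paper's implementation: the synthetic ccdf-derived probabilities, the Bernoulli-style proxy used for the ccdf variance s2(u), and the closeness-to-0.5 form used for r(u) are all assumptions here, since eq 4 and the actual definitions of s2(u) and r(u) appear earlier in the article, outside this excerpt.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for the paper's quantities (illustration only):
# z_true[i]   = log10 concentration at location i; zc = log10(10 ppt) = 1.0
# p_exceed[i] = ccdf-derived probability that location i exceeds zc
zc = 1.0
z_true = rng.normal(loc=1.0, scale=0.8, size=500)
noise = rng.normal(0.0, 1.0, size=500)
p_exceed = 1.0 / (1.0 + np.exp(-3.0 * (z_true - zc) + noise))

truly_contaminated = z_true > zc

def misclassification_rate(pc):
    """Decision rule: declare a location contaminated when P(Z > zc) > pc."""
    declared = p_exceed > pc
    return float(np.mean(declared != truly_contaminated))

# Scan candidate probability thresholds and keep the minimizer,
# analogous to reading the optimal pc off Figure 7 (top graph).
grid = np.linspace(0.01, 0.99, 99)
rates = np.array([misclassification_rate(pc) for pc in grid])
pc_opt = grid[np.argmin(rates)]

# Two measures for prioritizing additional sampling:
# (1) a proxy for the ccdf variance s2(u), here the Bernoulli variance
#     of the exceedence indicator (largest when P is near 0.5);
# (2) a proxy for r(u), here the closeness of P(Z > zc) to 0.5, i.e.
#     large where the clean/contaminated classification is most uncertain.
s2 = p_exceed * (1.0 - p_exceed)
r = 1.0 - 2.0 * np.abs(p_exceed - 0.5)

# Candidate locations: the 50 sites with the largest value of each measure.
top50_by_s2 = np.argsort(s2)[-50:]
top50_by_r = np.argsort(r)[-50:]
```

Under these assumptions, sorting by r would concentrate new samples near the regulatory threshold (reducing misclassification), while sorting by s2 would target overall prediction uncertainty, mirroring the trade-off described for Figure 8.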

Acknowledgments
We would like to thank the National Science Foundation (Grant DMS-9905454) and the Office of Naval Research for financial support. The authors express their gratitude for the constructive comments provided by Dr. Sharon Jaffes and Dr. Jon Josephs of the U.S. EPA and by the anonymous reviewers.

Literature Cited
(1) Journel, A. G. Geostatistics for the Environmental Sciences; EPA Project No. CR 811893, Technical Report; U.S. EPA, EMS Lab: Las Vegas, NV, 1987.
(2) Goovaerts, P. Geostatistics for Natural Resources Evaluation; Oxford University Press: New York, 1997.
(3) Little, L. S. J. Exp. Mar. Biol. Ecol. 1997, 213, 1-11.
(4) Yoon, G. L.; O'Neill, M. W. Probabilistic Engineering Mechanics 1999, 14, 205-211.
(5) Deutsch, C. V. In Geostatistics Wollongong '96; Baafi, E. Y., Schofield, N. A., Eds.; Kluwer Academic Publishers: Dordrecht, 1997; pp 115-125.
(6) Papritz, A.; Dubois, J.-P. In geoENV II - Geostatistics for Environmental Applications; Gómez-Hernández, J., et al., Eds.; Kluwer Academic Publishers: Dordrecht, 1999; pp 429-440.
(7) Goovaerts, P. Geoderma 2001, in press.
(8) Albrecht, I. D.; Barkovskii, A. L.; Adriaens, P. Environ. Sci. Technol. 1999, 33, 737.
(9) Barkovskii, A. L.; Adriaens, P. Appl. Environ. Microbiol. 1996, 62, 4556.
(10) Saito, H.; Goovaerts, P. Environ. Sci. Technol. 2000, 34, 4228-4235.
(11) Goovaerts, P.; Webster, R.; Dubois, J.-P. Environ. Ecological Statistics 1997, 4, 31-48.
(12) Johnson, R. L. In Geostatistics for Environmental and Geotechnical Applications; ASTM STP 1283; Rouhani, S., et al., Eds.; 1996; pp 102-116.
(13) Buxton, B. E.; Wells, D. E.; Diluise, G. In Geostatistics Wollongong '96; Baafi, E. Y., Schofield, N. A., Eds.; Kluwer Academic Publishers: Dordrecht, 1997; pp 984-995.
(14) Garcia, M.; Froidevaux, R. In geoENV I - Geostatistics for Environmental Applications; Soares, A., et al., Eds.; Kluwer Academic Publishers: Dordrecht, 1997; pp 309-325.
(15) Kyriakidis, P. C. In Geostatistics Wollongong '96; Baafi, E. Y., Schofield, N. A., Eds.; Kluwer Academic Publishers: Dordrecht, 1997; pp 973-983.
(16) Van Meirvenne, M.; Goovaerts, P. Geoderma 2001, 102, 75-100.
(17) Finley, B. L.; Wenning, R. J.; Ungs, M. J.; Huntley, S. L.; Paustenbach, D. J. Tenth International Meeting, Dioxin 90; 1990.
(18) Gunster, D. G.; Gillis, C. A.; Bonnevie, N. L.; Abel, T. B.; Wenning, R. J. Environ. Poll. 1993, 82, 245-253.
(19) International Technology Corp. Final Report: Passaic River Sediment Study - Diamond Shamrock Chemicals Company; 1986; Project No. 501007.70.
(20) Wenning, R. J.; Paustenbach, D. J.; Johnson, G. W.; Ehrlich, R.; Harris, M.; Bedburry, H. Chemosphere 1993, 27, 55-64.
(21) O'Keefe, P.; Smith, R.; Connor, S.; Aldous, K.; Valente, H.; Donnelly, R. Arch. Environ. Contam. Toxicol. 1994, 27, 357-366.
(22) Ehrlich, R.; Wenning, R. J.; Johnson, G. W.; Su, S. H.; Paustenbach, D. J. Arch. Environ. Contam. Toxicol. 1994, 27, 486-500.
(23) Huntley, S. L.; Ianuzzi, T. J.; Avantaggio, J. D.; Carlson-Lynch, H.; Schmidt, C. W.; Finley, B. L. Chemosphere 1997, 34, 213-231.
(24) Huntley, S. L.; Carlson-Lynch, H.; Johnson, G. W.; Paustenbach, D. J.; Finley, B. L. Chemosphere 1998, 36, 1167-1185.
(25) U.S. Environmental Protection Agency. http://www.epa.gov/r02earth/superfnd/sedsamp.htm (accessed Nov 2000).
(26) National Oceanic and Atmospheric Administration/National Ocean Service, Office of Response and Restoration. http://response.restoration.noaa.gov/cpr/watershed/watershedtools.html (accessed Nov 2000).
(27) Pointwise, Inc. Gridgen v. 13.3; http://www.pointwise.com (accessed Nov 2000).
(28) Thompson, J. F. Numerical Grid Generation: Foundations and Applications; North-Holland, 1985.
(29) Thompson, J. F.; Thames, F. C.; Mastin, C. W. J. Comput. Physics 1974, 40, 299-319.
(30) Chu, J. Report 6, Stanford Center for Reservoir Forecasting; Stanford, CA, 1990.
(31) Isaaks, E. H.; Srivastava, R. M. An Introduction to Applied Geostatistics; Oxford University Press: New York, 1989.
(32) Deutsch, C. V.; Journel, A. G. GSLIB: Geostatistical Software Library and User Guide; Oxford University Press: New York, 1998.
(33) Ianuzzi, T. J.; Bonnevie, N. L.; Wenning, R. J. Arch. Environ. Contam. Toxicol. 1995, 28, 366-377.
(34) Bopp, R. F.; Gross, M. L.; Tong, H.; Simpson, H. J.; Monson, S. J.; Deck, B. L.; Moser, F. C. Environ. Sci. Technol. 1991, 25, 951-956.
(35) Lohmann, R.; Nelson, E.; Eisenreich, S. J.; Jones, K. Environ. Sci. Technol. 2000, 34, 3086-3093.
(36) Baker, J. I.; Hites, R. A. Environ. Sci. Technol. 2000, 34, 2879-2886.

Received for review January 24, 2001. Revised manuscript received May 23, 2001. Accepted May 24, 2001.

ES010568N

VOL. 35, NO. 16, 2001 / ENVIRONMENTAL SCIENCE & TECHNOLOGY