
Estimating a Representative Value and Proportion of True Zeros for Censored Analytical Data with Applications to Contaminated Site Assessment

Claus P. Haslauer,*,† Jessica R. Meyer,‡ András Bárdossy,§ and Beth L. Parker‡

†University of Tübingen, Center for Applied Geoscience, Hölderlinstrasse 12, 72074 Tübingen, Germany
‡University of Guelph, G360 Institute for Groundwater Research, Guelph, Ontario N1G 2W1, Canada
§University of Stuttgart, Institute for Modelling Hydraulic and Environmental Systems, 70569 Stuttgart, Germany


ABSTRACT: This paper demonstrates a maximum likelihood (ML)-based approach to derive representative (“best guess”) contaminant concentrations from data with censored values (e.g., less than the detection limit). The method represents an advancement over existing techniques because it is capable of estimating the proportion of measurements that are true zeros and of incorporating varying levels of censorship (e.g., sample-specific detection limits, changes in method detection through time). The ability of the method to estimate the proportion of true zeros is validated using precipitation data. The stability and flexibility of the method are demonstrated with stochastic simulation, a sensitivity analysis, and an unbiasedness analysis with varying numbers of significant digits. A key aspect of this paper is the application of the statistical analysis to real site rock core contaminant concentration data collected within a plume at two locations using high resolution depth-discrete sampling. Comparison of the representative concentrations at the two locations along the plume centerline shows a larger number of true zeros and generally lower concentrations at the downgradient location, consistent with the conceptual site model, leading to improved estimates of attenuation with distance and/or time and of the associated confidence; this is not achievable using deterministic methods. The practical relevance of the proposed method is that it provides an improved basis for evaluating change (spatial, temporal, or both) in environmental systems.



INTRODUCTION
Sound interpretations of measurements rely on making the best use of all data collected during an investigation. This includes both “crisp” values, that is, measurements with concentrations larger than their associated level of censorship, and “censored” measurements, that is, measurements whose true value is hidden within an interval, typically between zero and the method detection or reporting limit. This is typical for any discipline using analytical data collected from any natural compartment such as air, soil, and water. In addition, distinguishing between very small or censored concentrations and true zeros remains an important challenge. Interest in methods that properly incorporate censored measurements into data analysis and identify true zeros will only increase as more “new” chemicals are released into the environment and remediation efforts at legacy contaminant sites strive to meet cleanup standards. The data sets used in this paper exemplify the usefulness of the proposed approach for hydro-, geo-, and chemical scientists when robust quantification of changes in concentrations and/or mass storage is desired.
Levels of censorship in the context of environmental investigations are typically fairly small, but a bias occurs when censored measurements are replaced with an arbitrary value (such as zero, the detection or reporting limit, or half the detection or reporting limit1), particularly when the distribution of the measurements within the censored interval is skewed.

Despite early approaches2−7 to incorporate such censored data into analysis, it is still common practice to discard censored results. Censored measurements have been included in environmental and hydraulic applications;8,9 a recent relevant overview is given by Helsel.10 An optimal value to replace censored measurements, instead of the arbitrarily assigned values mentioned above, has been found,11 but it is strictly limited to the unbounded normal distribution. Survival analysis12−14 handles censored values with a likelihood function but cannot take true zeros into account. Zero-inflated models15 have a discrete mass at zero, as is required to estimate the proportion of true zeros, but use discrete distributions. Alternatives to the maximum likelihood (ML; abbreviations are listed in Supporting Information (SI) Table S1) estimator have been developed, including probability plot regression16 and the expected moments algorithm (EMA).17−19



Some studies have tried to incorporate censored measurements for hydrogeologic applications: censored data have been used for regression models20,21 and for site groundwater flow and contaminant transport model calibration.22 A Markov chain Monte Carlo based method for spatial and temporal statistical models that incorporates constrained measurements has also been developed.23 Additionally, an expected value and a variance have been estimated from a censored sample using ML, under the assumption that the sample distribution is symmetric and with a subsequent bias correction step;24 this has also been done for other non-Gaussian distributions.25 In an artificial simulation study without real-world field data, the proportion of pure zeros or the proportion of censored measurements was analyzed for low flow hydrological analysis.26 Extreme historical hydrological events, which are typically censored at the left boundary (whereas measurements below detection limits are typically censored at the right boundary), have recently been included in a multivariate statistical analysis.27 Lastly, censored measurements have been used for geostatistical interpolation.28,29

Although work has been done to develop statistical methods that incorporate censored measurements, additional flexibility is desired to handle analytical data sets typical of contaminated sites, where data are often collected over many decades using a variety of sample collection techniques and spatial resolutions. For example, what if censorship levels change through time? Or what if censorship levels are sample-specific due to differences in dilution factors? In addition, most contaminated site investigations must evaluate the effectiveness of various remedial measures and/or natural attenuation processes. In both cases, quantitatively determining the best estimate of a representative concentration and the proportion of true zeros (i.e., samples with no contamination), from which a change in total mass in the system can be inferred, would be helpful for improved performance evaluations.

The objective of this work is to provide a method that includes censored measurements with multiple censorship levels in an ML-based approach to jointly estimate the representative concentration (including the moments of the fitted distribution) and the proportion of true zeros based on a set of real hydrogeological censored measurements. A variety of parametric distributions are fit to the field data, where the magnitude of the optimized likelihood indicates the optimal parametric distribution function. The representative value (i.e., the expected value), variance, higher statistical moments, and/or interquantile ranges can then be deduced. In addition, a procedure for estimating the proportion of true zeros in the data set is incorporated into the above methodology and validated using a precipitation data set containing true zeros. The stability and sensitivity of the method are demonstrated by means of stochastic simulation. Method efficacy is demonstrated by using a real-world data set to derive a representative contaminant concentration for upgradient and downgradient locations within a mixed organic contaminant plume in fractured sandstone.
The contaminant concentrations are derived from densely sampled continuous rock cores analyzed for several contaminants of concern, where concentrations vary over several orders of magnitude over short distance increments due to contrasts in permeability from fracture to rock matrix. The resulting ML-based representative concentrations are then compared to those calculated deterministically to evaluate if the ML-based representative concentrations provide improved hydrogeological insight.



MATERIALS AND METHODS
Site Description. This section describes the contaminated hydrogeology data set that demonstrates the method and relevance of the proposed ML-based approach. The data were collected from a fractured sedimentary rock field site in southern Wisconsin that has dense nonaqueous phase liquid (DNAPL) impacts in the Lone Rock Formation sandstone between ∼46 and 56 m below ground surface (bgs).30,31 Groundwater flow over three decades created a dissolved phase plume that extends 3 km downgradient at its maximum observed extent in 2003. The site geology, hydrogeology, and contaminants are well described elsewhere.32,33 This study utilizes rock core contaminant concentrations derived from two locations with continuous cores at this site, MP-19D and MP-21D, situated approximately 0.5 km and 1.5 km downgradient of the DNAPL source zone, respectively34 (SI Figure S1). A detailed rock core contaminant sampling approach33,35,36 was used to collect depth-discrete and detailed profiles of the contaminant mass concentrations in the low permeability rock matrix that show evidence of contaminant mass storage away from the fractures where groundwater flow occurs. A total of 258 samples were collected from the 83.2 m long MP-19D core and 220 samples were collected from the 81.7 m long MP-21D core (SI Figure S2). A subset of the samples collected from the Lone Rock Formation (i.e., the general interval associated with accumulated DNAPL in the upgradient source zone and the most extensive dissolved phase plume) is utilized here (SI Figure S2). These samples were analyzed for 36 volatile organic contaminants following the U.S. EPA SW846 8260B method.37 A subset of these results, including those for tetrachloroethene (PCE), trichloroethene (TCE), cis-1,2-dichloroethene (cis-DCE), vinyl chloride (VC), 1,1,1-trichloroethane (111TCA), 1,1-dichloroethane (11DCA), and chloroethane (CE), is examined.
The contaminant concentrations were reported by the laboratory as the mass of solute per liter of methanol extract with respect to several quality assurance/quality control (QA/QC) limits. The laboratory used two basic QA/QC limits, the method detection limit (MDL) and the limit of quantitation (LOQ). The analytical method required a 24-fold dilution of the methanol extract for all samples, increasing the MDL and LOQ for all samples. For the purposes of this paper, the sample detection limit (SDL) is equal to the MDL multiplied by the dilution factor, and the sample quantitation limit (SQL) is equal to the LOQ multiplied by the dilution factor. Prior to interpretation, the results, SDL, and SQL for each sample were converted from mass of solute per liter of methanol extract to mass of solute per mass of wet rock. The majority of the measured concentrations within the Lone Rock Formation at both core hole locations were either nondetect (below the SDL) or not quantifiable (below the SQL) (SI Table S2).
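As a small illustration of how these per-sample QA/QC limits and units interact, the sketch below applies the dilution factor and converts an extract-based limit to a rock-mass basis. The MDL, LOQ, extract volume, and rock mass values are invented for illustration, and the actual laboratory conversion may include additional correction factors; this is a sketch, not the laboratory procedure.

# Illustrative sketch only: per-sample QA/QC limits and unit conversion.
# All numerical values are hypothetical, not site or laboratory data.
dilution_factor = 24.0      # 24-fold dilution of the methanol extract (as described above)
mdl_ug_per_L = 1.0          # method detection limit reported by the laboratory (ug/L extract)
loq_ug_per_L = 5.0          # limit of quantitation reported by the laboratory (ug/L extract)

sdl_ug_per_L = mdl_ug_per_L * dilution_factor   # sample detection limit (SDL)
sql_ug_per_L = loq_ug_per_L * dilution_factor   # sample quantitation limit (SQL)

# Convert a concentration in the methanol extract (ug per liter of extract)
# to a concentration per mass of wet rock (ug per gram of wet rock).
extract_volume_L = 0.010    # volume of methanol extract for this sample (hypothetical)
wet_rock_mass_g = 15.0      # mass of wet rock extracted (hypothetical)

def extract_to_wet_rock(conc_ug_per_L_extract):
    return conc_ug_per_L_extract * extract_volume_L / wet_rock_mass_g

sdl_ug_per_g = extract_to_wet_rock(sdl_ug_per_L)    # sample-specific SDL in ug/g wet rock
sql_ug_per_g = extract_to_wet_rock(sql_ug_per_L)
print(sdl_ug_per_g, sql_ug_per_g)

Because the extract-volume-to-rock-mass ratio differs from sample to sample, the SDL and SQL expressed per gram of wet rock are sample-specific, which is what motivates the multiple censorship levels handled below.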


ML Method With Partially Censored Measurements Including True Zeros. This section derives an ML-based approach that leads to an estimated theoretical parametric distribution function based on a sample containing multiple censored values. Ultimately, summary measures (e.g., representative concentrations) can be calculated based on the fitted distributions. The level of censorship can vary, and the proportion of true zeros in the sample is estimated jointly. No prior assumptions about the type of distribution function are required.
The goal of a maximum likelihood method (MLM)38 is to find the parameters θ of a given type of distribution function F such that the measurements are most likely a sample from that distribution. For this paper, the likelihood L consists of two parts (eq 1): the first part of the equation for crisp measurements and the second part for measurements within an interval. Crisp measurements are included in L via their probability density function values f(x); censored measurements are incorporated via the difference of the cumulative distribution function values F(x) evaluated at the boundaries of the censored interval. L is optimized using a gradient-based optimizer. As an example, assume a given data set has x1, ..., xk values above a generic detection limit (DL; here, DL = SDL) and y1, ..., ym values below the DL. The lower and upper boundaries of the interval of possible values that could be assigned to the m censored sample values are 0 and the DL, respectively. One characteristic of the given data set is that each sample has a different SDL when the concentrations are expressed in μg/g of wet rock (the ratio of the volume of methanol extract to the mass of rock for each sample, which is very difficult to keep constant, acts as a dilution factor). This characteristic makes it impossible to treat all values below the SDL identically; m different summations need to be performed, each time with a different SDL. In the second part of eq 1, the term F(DL_j^u) − F(DL_j^l) represents the probability that a measurement lies within an interval with upper (superscript u) and lower (superscript l) boundaries:

ln L(x, θ) = Σ_{i=1}^{k} ln f(x_i, θ) + Σ_{j=1}^{m} ln[F(DL_j^u, θ) − F(DL_j^l, θ)] → max   (1)

The MLM assumes independent measurements; the hypothesis of independence is rejected if the test statistic for the correlation between X and Y exceeds t_{n−2;1−α/2},39 where X = concentration and Y = t = depth. The correlation with censored measurements can be estimated.40 Here, the assumption of independence cannot be rejected at a level of α ≥ 0.05 for all the solutes in the hydrogeological data set used in this paper, except for 111TCA at location MP-19D, where α = 0.03. The Akaike Information Criterion (AIC)41 was used to compare model fits. In addition to L, it also takes the number of model parameters (nθ) into account (eq 2):

AIC = 2 nθ − 2 ln(L)   (2)
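To make eq 1 concrete, the following minimal sketch fits a gamma distribution to a mix of crisp values and interval-censored nondetects with sample-specific detection limits, using the numpy/scipy tools named in the Supporting Information. It is not the authors' published code (which is stated to be available on request); the toy data, the log-parametrization, and the optimizer choice are assumptions for illustration. The last lines evaluate eq 2 for the fitted model.

# Minimal sketch of the censored maximum likelihood fit of eq 1 (illustrative only).
import numpy as np
from scipy import stats, optimize

crisp = np.array([0.8, 1.3, 2.1, 4.7, 0.6])       # concentrations above their SDL (toy values)
censored = np.array([[0.0, 0.20],                 # nondetects: true value within [0, SDL_j],
                     [0.0, 0.35],                 # with a sample-specific SDL_j
                     [0.0, 0.15]])

def neg_log_likelihood(log_params):
    a, scale = np.exp(log_params)                 # log-parametrization keeps both parameters > 0
    ll = stats.gamma.logpdf(crisp, a, scale=scale).sum()                   # first sum of eq 1: ln f(x_i, theta)
    ll += np.log(stats.gamma.cdf(censored[:, 1], a, scale=scale)
                 - stats.gamma.cdf(censored[:, 0], a, scale=scale)).sum()  # second sum: ln[F(DL_u) - F(DL_l)]
    return -ll

res = optimize.minimize(neg_log_likelihood, x0=np.log([1.0, 1.0]), method="L-BFGS-B")
a_hat, scale_hat = np.exp(res.x)
mean_hat = a_hat * scale_hat                      # representative value E[C] of the fitted gamma
aic = 2 * 2 - 2 * (-res.fun)                      # eq 2 with n_theta = 2 parameters
print(a_hat, scale_hat, mean_hat, aic)

Repeating such a fit for the exponential, Weibull, and gamma alternatives and comparing the resulting AIC values is the model-selection step described above.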

Theoretical Distribution Functions Used. For this study, the following parametric distribution functions were used: the exponential distribution function with parameter λ, the Weibull distribution function with parameters α and β, and the gamma distribution function with parameters α and β (SI Table S3). These types of distribution functions are able to represent skewness reasonably well, even though other distribution functions could also be used. The decision of which distribution function to use is based on the AIC (the smaller, the better; identical sample size). For comparison purposes, all three distributions are used and reported.
Estimating the Proportion of True Zeros. In the case of solute concentrations, a censored measurement can imply that the concentration of the solute is too small to be detected with the measurement device. However, it could also be that the sample does not contain any solute mass. In the latter case, the true concentration is zero, which typically occurs in the analysis of “new” contaminants.42 Mathematically, the proportion of zeros, p0, can be described by adding one parameter to the regular distribution function, yielding H(x) (eq 3) and the related density function h(x) (eq 4). The Dirac delta δ at the origin represents the probability mass p0 at the origin. The likelihood function including p0 (eq 5) is obtained by inserting eqs 3 and 4 into the basic likelihood function (eq 1). SI Table S4 lists analytic expressions for the first and second moments of the distributions including p0.

H(x) = p0 + (1 − p0)·F(x)   (3)

h(x) = (1 − p0)·f(x) + p0·δ(0)   (4)

ln L(x, θ, p0) = Σ_{i=1}^{k} ln h(x_i, θ, p0) + Σ_{j=1}^{m} ln[H(DL_j^u, θ, p0) − H(DL_j^l, θ, p0)] → max   (5)

Statistical simulations demonstrated that all the parameters, including p0, were estimated well (SI Figure S4). Statistically, the distinction between a true zero and a censored but small value is difficult. The addition of the single parameter p0 leads to an improved estimation of the distribution, and hence also of the moments of the distribution, compared to the model without this parameter (SI Figure S5). For validation, we resorted to a data set of precipitation intensities, where measurements of zero occur (days without rain at a given location) and are measurable. A data set of daily precipitation measured between 1980 and 2010 at ∼250 stations within the state of Baden-Württemberg, Germany,43 was used to validate the outlined approach for estimating the proportion of true zeros at each location with a one-parameter exponential distribution function. The proportion of true zeros could be reproduced very well (r = 0.998, SI Figure S6).
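The zero-inflated formulation of eqs 3−5 adds only one more parameter to such a fit. The sketch below, again illustrative rather than the authors' implementation, jointly estimates λ and p0 for a one-parameter exponential model (the same family used in the precipitation validation) from crisp values and nondetect intervals [0, SDL_j]; the toy data and the log/logistic transforms used to keep the parameters in range are assumptions.

# Illustrative sketch of eqs 3-5: estimate (lambda, p0) jointly; toy data only.
import numpy as np
from scipy import stats, optimize

crisp = np.array([0.4, 1.7, 2.9, 0.9])            # measurable concentrations (above their SDL)
sdl = np.array([0.25, 0.25, 0.30, 0.20, 0.25])    # SDLs of the nondetects; true value in [0, SDL_j]

def neg_log_likelihood(params):
    lam = np.exp(params[0])                       # exponential rate, lambda > 0
    p0 = 1.0 / (1.0 + np.exp(-params[1]))         # proportion of true zeros, 0 < p0 < 1
    # eq 4 for crisp values x > 0 (the Dirac mass at the origin does not contribute):
    ll = np.sum(np.log(1.0 - p0) + stats.expon.logpdf(crisp, scale=1.0 / lam))
    # eq 5 censored terms, using eq 3: P(0 <= X <= SDL_j) = p0 + (1 - p0) * F(SDL_j)
    ll += np.sum(np.log(p0 + (1.0 - p0) * stats.expon.cdf(sdl, scale=1.0 / lam)))
    return -ll

res = optimize.minimize(neg_log_likelihood, x0=np.zeros(2), method="L-BFGS-B")
lam_hat = np.exp(res.x[0])
p0_hat = 1.0 / (1.0 + np.exp(-res.x[1]))
mean_hat = (1.0 - p0_hat) / lam_hat               # mean of the zero-inflated exponential
print(lam_hat, p0_hat, mean_hat)

For data sets such as the rock core concentrations, where exact zeros are never observed directly, p0 is identified only through the interplay between the censored intervals and the shape of the continuous part of the distribution, which is why the distinction between a true zero and a small censored value remains statistically difficult.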



RESULTS AND DISCUSSION
Comparison of Estimated Representative Concentrations. This section compares results for estimating average concentrations calculated deterministically and representative concentrations calculated statistically, based on the partially censored rock core concentration measurements at the two core locations.


Table 1. Estimated Percent Differences (“Δ”) in Representative Concentrations (E[C]) and Representative Concentration Variances (VAR[C]) between the Locations MP-19D and MP-21D

ΔE[C] (%)
solute     exp     exp p0   Weibull   Weibull p0   gamma   gamma p0
PCE        4.26    0.475    –         –            –       –
TCE        38.4    38.4     38.3      35.6         41.7    33.4
cis-DCE    43.4    41.5     69.8      42.0         41.4    41.4
VC         3.86    1.09     –         –            –       –
111TCA     71.6    70.2     79.4      77.0         69.7    69.7

ΔVAR[C] (%)
solute     exp     exp p0   Weibull   Weibull p0   gamma   gamma p0
PCE        0.181   0.226    –         –            –       –
TCE        14.7    12.7     5.93      9.636        13.5    12.0
cis-DCE    18.9    33.0     25.3      32.164       31.9    32.3
VC         0.149   0.20     –         –            –       –
111TCA     51.2    79.9     49.1      –            –       –

Deterministically Calculated Representative Concentrations. The premise of this research is that censored measurements reside within a known interval, but the precise position within the interval is not known. In the basic case of a “measurement below detection limit”, the true value resides within the interval (0, DL). Depending on the character of the censorship, other scenarios for the interval bounds are possible. If not treated within a statistical framework, four scenarios for the establishment of these intervals are commonly used and evaluated: (1) ignore censored measurements and use only crisp values (this implies a reduced sample size), or place the censored measurement within the interval, either (2) on the lower boundary, (3) on the upper boundary, or (4) in the center of the interval. In the example presented in this paper, there are two levels of censorship: (1) values for which concentrations (C) ≤ SDL and (2) values for which SDL < C ≤ SQL (Figure S3). The deterministically calculated representative values can vary for each contaminant in each core by up to 1 order of magnitude, and there is neither an indication of which among them is the best estimate nor consistency in the calculated changes between deterministic methods (SI Table S6). In fact, within each interval, any arbitrary value of the infinite number of possibilities between the “lower” and “upper” interval bounds is equally wrong. The worst results occur when values below the SDL are included in the calculation of the average as zeros and when censored values are ignored. The remainder of this paper provides a methodology that narrows this range in a robust and statistically justifiable manner.
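For a toy sample with the two censorship levels described above, the four deterministic scenarios reduce to simple substitutions before averaging. The concentrations and limits below are invented for illustration, not site data.

# Illustrative sketch of the four deterministic substitution scenarios (toy values).
import numpy as np

crisp = np.array([3.2, 5.1, 9.8])      # quantified concentrations (C > SQL)
n_nd, sdl = 4, 0.5                     # nondetects: 0 <= C <= SDL
n_tr, sql = 2, 1.5                     # detected but not quantifiable: SDL < C <= SQL

def substituted_mean(low_sub, mid_sub):
    """Average after replacing each censored result with a fixed value."""
    values = np.concatenate([crisp, np.full(n_nd, low_sub), np.full(n_tr, mid_sub)])
    return values.mean()

scenarios = {
    "ignore censored": crisp.mean(),                                # scenario 1
    "lower boundary":  substituted_mean(0.0, sdl),                  # scenario 2
    "upper boundary":  substituted_mean(sdl, sql),                  # scenario 3
    "interval center": substituted_mean(sdl / 2, (sdl + sql) / 2),  # scenario 4
}
print(scenarios)

The spread among these four averages is the ambiguity that the statistical estimates in the next subsection narrow.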

Statistically Estimated Representative Concentrations. The performance of the three types of theoretical distribution functions, with and without the proportion of samples with zero contaminant mass (p0), is evaluated and compared to deterministic estimates where possible. The fitted distributions lie within the bounds spanned by empirical distributions when censored values are plotted at either boundary (SI Figure S7), the estimated expected concentrations and expected concentration variances are stable (SI Tables S7, S8, and S9), and the differences between estimations at the upgradient and downgradient locations are stable for most distribution functions used (Table 1).
The contaminant cis-DCE illustrates the methodology well because at least some noncensored measurements exist at both locations. When the change in cis-DCE concentration between the upgradient (MP-19D) and downgradient (MP-21D) locations is calculated deterministically, the concentration decreases by between ∼40% and ∼92%, depending on whether the censored data are included in the calculation and what value is assigned to the censored data (SI Table S3). The corresponding MLM-based change in representative concentration for cis-DCE is most likely ∼41%, as the gamma distribution results in the smallest AIC among the chosen distributions. This example shows that the representative concentrations based on the noncensored values only are extreme and not likely representative of actual conditions. The ML-based estimates are within the anticipated bounds, they include the censored measurements in a statistical model (as probabilities of nonexceedance), and the AIC is a reasonable measure of the fit of the model to the data. The estimation is stable and largely independent of the type of theoretical distribution used. The estimated representative concentration is also not affected by the presence of the p0 parameter. In all cases, the estimated distributions fall within the limits spanned by the empirical distributions with censored values substituted by the interval boundaries. Notably, the range of representative values estimated for each contaminant at both locations is much smaller for the values obtained using the MLM approach. For some contaminants, for example TCE and cis-DCE, the ML-based estimation is much improved and consistent between the varying types of distribution functions. For most contaminants, the estimated change is similar regardless of the type of distribution function used for the MLM. The less information is available, the more difficult the estimation becomes. Although several contaminants (e.g., PCE, TCE, and VC) do show a large range in the calculated change depending on the distribution function used, these ranges are smaller than those calculated deterministically for the same contaminants. For contaminants where nothing is known, i.e., those that have no noncensored measurements (e.g., chloroethane), reliable estimation is difficult even with an ML-based method.
A larger proportion of true zero concentrations is estimated at the downgradient than at the upgradient location. This is particularly the case for TCE, cis-DCE, and 111TCA, for which the data availability is best (Table 1). These results are consistent with the difference in proximity of the two locations to the DNAPL source zone and the associated increased degree of attenuation due to matrix diffusion, degradation, and dispersion in the fracture network.


When the proportion of true zeros is not estimated, the gamma distribution performs best and achieves the highest likelihood in most of the cases among the three theoretical types of distribution functions used. The Weibull distribution performs second best. Generally, the difference in performance between the Weibull and gamma distributions is minor. The exponential distribution never outperforms the other distributions. Cases where zero noncensored measurements were available were ignored for this performance analysis. If the goal is to estimate p0, then the one-parameter exponential distribution gives the most stable results.
Performance Analysis. Stability. The stability of the estimated representative value using the proposed ML-based methodology is evaluated by pretending to “trust” the laboratory's analysis less and less, successively using fewer reported significant digits. In each successive step, an ML estimation is performed pretending that the true value is censored within an interval corresponding to the number of significant digits. The method is shown to be stable, as the estimated representative concentration does not change significantly whether the “exact” reported laboratory value is used or the reported value with fewer significant digits is used (SI Figure S8). This recognition could be useful when planning analytical methods for future samples. For this data set, a somewhat higher variability in the estimated representative values (i.e., the ML-based means) between the different types of distribution functions can be recognized in the downgradient core (MP-21D). This variability is attributed to the smaller portion of crisp values at the downgradient location compared to the upgradient location (for these contaminants, a decrease of ∼20%, cf. SI). The fewer significant digits are used for the estimation, the more the estimated representative value deviates from the average estimated with the “exact” value reported by the laboratory. This approach can be incorporated into the ML-based framework by finding, for each result, the interval boundaries that round to the reported value at the desired number of significant digits. The difference between the distribution function values at these boundaries is then used as the probability of that measurement, instead of its density function value, very similar to the approach used for incorporating censored information (eq 1).
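The rounding interval used in this significant-digit experiment can be written down directly. The helper below is a hypothetical illustration (decade-boundary edge cases ignored) of how a value reported to a given number of significant digits becomes an interval whose probability F(upper) − F(lower) enters eq 1 in place of the density term.

# Illustrative sketch: bounds of the values that round to a reported result at a
# given number of significant digits (decade-boundary edge cases ignored).
import math

def rounding_interval(reported, sig_digits):
    if reported == 0.0:
        return 0.0, 0.0
    exponent = math.floor(math.log10(abs(reported))) - (sig_digits - 1)
    half_step = 0.5 * 10.0 ** exponent
    return reported - half_step, reported + half_step

print(rounding_interval(1.4, 2))    # (1.35, 1.45): interval used instead of the exact lab value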

Robustness. The MLM approach is used to estimate a representative concentration based on simulated concentrations that are artificially censored to varying degrees. In this case, the known true average is compared to the estimated representative concentrations. The comparison is performed using a cumulative average of MLM-based averages over s = 2000 simulations, using a two-parameter Weibull distribution to generate n = 200 samples per simulation (page S15). The better the MLM, the closer the representative value based on the MLM and the average of the simulated results agree. As more simulations are performed, the simulated representative value moves closer to the “true” average, where the true average is the arithmetic average of the simulated results prior to censorship by the SDL and SQL. This converging behavior demonstrates the stability of the method. The more measurements are censored, the more uncertain the estimated mean becomes. A measure for this uncertainty is the spread around the true mean. No bias in this spread is observed after 200 simulations; however, its magnitude becomes systematically larger with increasing numbers of censored values (SI Figure S9).
Sensitivity. The goal of the sensitivity analysis is to evaluate the influence of a varying detection limit; at the same time, this methodology can be used to evaluate the accuracy of the estimated means. A sample of size 10,000 is generated based on the fitted Weibull distribution for core MP-21D and contaminant 1,1,1-TCA. Successively, more and more values of that sample are artificially censored, corresponding to a successively increasing detection limit. The ML-based estimator is stable for up to 95% of the data being censored and is independent of the magnitude of the detection limit (Figure 1). Deterministic methods for calculating the mean, such as ignoring censored measurements or setting the censored values to either the detection limit, half the detection limit, or zero, result in a biased estimate of the mean.
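The essence of this experiment can be sketched in a few lines: draw a large Weibull sample, censor it at an increasing detection limit, and track how far the deterministic substitution means drift from the known mean (the censored ML fit of eq 1, as sketched earlier, can be rerun at each step to verify its stability). The Weibull parameters and seed below are arbitrary choices, not the fitted MP-21D parameters.

# Illustrative sketch of the sensitivity experiment: bias of deterministic means
# as the censored fraction grows (parameters are arbitrary, not the site fit).
import numpy as np
from scipy import stats

sample = stats.weibull_min.rvs(c=0.8, scale=2.0, size=10000, random_state=42)
true_mean = sample.mean()

for frac in (0.2, 0.5, 0.8, 0.95):
    dl = np.percentile(sample, 100.0 * frac)     # detection limit censoring this fraction
    crisp = sample[sample > dl]
    n_cens = sample.size - crisp.size
    substitutes = {"zero": 0.0, "DL/2": dl / 2.0, "DL": dl}
    rel_bias = {name: (np.concatenate([crisp, np.full(n_cens, v)]).mean() - true_mean) / true_mean
                for name, v in substitutes.items()}
    rel_bias["ignore"] = (crisp.mean() - true_mean) / true_mean
    print(frac, {k: round(100.0 * b, 1) for k, b in rel_bias.items()})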

Figure 1. Sensitivity analysis. A bias occurs if censored measurements are set to an identical value (zero, half the detection limit, or the detection limit) or ignored. For this case, the bias obtained when using half the detection limit is not midway between the biases obtained when using the detection limit and when using zero.

Notably, the amount of the bias is different if the upper or the lower threshold is used as a replacement for censored values when calculating the average deterministically. The MLM-based estimated representative value is not the arithmetic average of these two deterministically calculated averages. When deterministic averages are used, the deviation from the true mean is larger than 5% when more than ∼20% of the values are censored. In the real data set for core MP-19D and contaminant 1,1,1-TCA, 59% of the values are censored. Calculating a deterministic mean with censored values set to the lower limit results in an error of ∼17%, using the detection limit as a replacement results in an error of ∼22%, and using half of the detection limit as a replacement results in an error of ∼3%. These errors are magnified for larger proportions of censored measurements. It must be emphasized that these numbers are based on a simulation using the Weibull distribution fitted to the data from this location for this contaminant; depending on the shape of the distribution used, the results will change. Using a larger sample size could enhance the methodology. A Monte Carlo-based sensitivity analysis could quantify the uncertainty of the errors.
Improved Contaminated Site Assessment Using the Proposed Methodology. In previous studies at the site, information about the contaminant distribution in the Lone Rock Formation was collected using a variety of methods: detailed rock core contaminant samples, as discussed here; groundwater samples collected from multilevel systems, with ∼30 samples within the 10-m thick Lone Rock Formation; and groundwater samples collected from conventional wells screened within the formation.


Figure 2. Comparison of rock core ML method representative concentrations for cis-DCE and 111TCA with nearby conventional wells and high resolution multilevel systems monitoring the same interval of rock at different spatial resolution.

Consequently, there is a desire to compare these results at the same or nearby locations. Such comparisons, either locally and/or upgradient versus downgradient, can be used to inform general system conditions, evaluate contaminant mass in the matrix versus the fractures, and assess the amount of contaminant attenuation along the flowpath or over time. Comparison of the raw data sets is difficult due to the extreme variability in spatial resolution and the complexity introduced by having multiple contaminants with different degradation pathways and degrees of attenuation.
For illustration purposes, three data sets with varying degrees of spatial resolution at an upgradient and a downgradient location are compared (SI Figure S1). The upgradient and downgradient locations at the site each include depth-discrete rock core contaminant concentration profiles collected in June/July 2008 (MP-19D and MP-21D, respectively; ∼3 samples per meter of formation), groundwater contaminant concentration profiles from high resolution multilevel systems (MLS) collected in June 2009 (MP-19S and MP-21S, respectively; ∼0.4 samples per meter of formation), and groundwater contaminant concentrations from conventional wells collected in June 2008 (P-115 and P-132, respectively; ∼0.1 samples per meter of formation). The rock core contaminant profiles provide the highest degree of spatial resolution for the contaminant distribution in the Lone Rock Formation and the conventional monitoring wells the lowest (SI Figure S2). To allow for comparison with the groundwater data sets, the rock core ML means were converted from units of μg/g wet rock to units of μg/L pore water using a mass partitioning equation and Lone Rock Formation sandstone measurements for porosity, bulk density, and fraction of organic carbon.
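A common form of such a mass partitioning calculation assumes equilibrium linear sorption, so the sketch below should be read as an assumed, generic implementation rather than the authors' exact equation, which relies on the measured Lone Rock Formation properties; all parameter values and the f_oc-based sorption coefficient are hypothetical.

# Assumed, generic equilibrium-partitioning conversion from a rock core concentration
# (ug of solute per g of wet rock) to an equivalent pore-water concentration (ug/L).
# This is a sketch of the standard linear-sorption mass balance; all values are hypothetical.
porosity = 0.12                                   # water-filled porosity (L water per L bulk rock)
dry_bulk_density = 2.3                            # dry bulk density (g/cm^3, numerically kg/L)
wet_bulk_density = dry_bulk_density + porosity    # approximate wet bulk density (water ~1 g/cm^3)
f_oc = 0.0005                                     # fraction of organic carbon (-)
k_oc = 35.0                                       # organic carbon partition coefficient (L/kg), solute-specific
k_d = f_oc * k_oc                                 # linear sorption distribution coefficient (L/kg)

def rock_to_pore_water(c_wet_rock_ug_per_g):
    total_ug_per_L_bulk = c_wet_rock_ug_per_g * wet_bulk_density * 1000.0   # ug per L of bulk rock
    return total_ug_per_L_bulk / (porosity + dry_bulk_density * k_d)        # ug per L of pore water

print(rock_to_pore_water(0.01))   # e.g., 0.01 ug/g wet rock converted to ug/L pore water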

The 111TCA rock core ML mean, multilevel system ML mean, and conventional well contaminant concentrations at both the upgradient and downgradient locations are all within a factor of 2 (Figure 2a). The good comparison between the 111TCA data sets provides support for the robustness of the ML means for capturing the 2 orders of magnitude of variability observed in the rock core concentrations (SI Figure S3) and the 3−4 orders of magnitude of variability in the concentrations from the 5−6 monitoring zones of each multilevel system (Figure 2a). In addition, all three data sets show a slight decline in 111TCA concentrations from the upgradient to the downgradient location, indicating some degree of attenuation of this parent compound. We are confident in this assessment because this decline was estimated for all three locations using the objective ML-based method.
The cis-DCE concentrations varied by more than 2 orders of magnitude in the rock core (SI Figure S3) and over 4 orders of magnitude in the multilevel system monitoring zones (Figure 2b). However, when the ML mean data sets for the rock core and the multilevel systems are compared to the conventional well data sets, the cis-DCE concentrations generally compare well for the downgradient location (Figure 2b). For example, all three data sets are within a factor of 2 at the downgradient location. At the upgradient location, the conventional well cis-DCE concentration and the ML mean for the 5−6 multilevel system monitoring zones are within a factor of 4, while the rock core ML mean for cis-DCE is 16 and 70 times less than the multilevel system ML mean and the conventional well concentration, respectively (Figure 2b). The larger variability upgradient deserves further evaluation of site conditions.
Review of the time series data from the conventional well shows a 28-fold increase in cis-DCE concentrations between March and June 2008. This increase in concentration in the conventional well may be related to a number of time-coincident actions, including (1) temporary shut-down of a nearby pump and treat system due to flooding and/or (2) redistribution of contaminant concentrations due to recent drilling associated with rock core sampling at the MP-19D location. It is unlikely that contaminant concentrations in the rock core, which represent concentrations in the low permeability but porous rock matrix away from fractures and were collected in June, would have adjusted to the increase in concentration observed at about the same time in the conventional well, where the groundwater sample is dominated by concentrations in the highest transmissivity fractures. This may explain the discrepancy between the rock core ML mean and the conventional well data; the MLM clearly points out disparate data sets. The conventional well concentration for cis-DCE at the upgradient location in March 2008, prior to the increase in concentration, was 3,395 μg/L, which compares well with the 2008 rock core ML mean of 1,374 μg/L.


In addition, the cis-DCE concentrations for the conventional well, rock core ML mean, and multilevel system ML mean all show declines in concentration by between a factor of 2 and 165 (5 if the March value for the conventional well is used) from the upgradient to the downgradient location, suggesting some degree of attenuation. This example shows that the ML approach provides a robust basis for creating enhanced composite concentration estimates for high resolution data sets to facilitate comparison of conditions at the same location and between locations.
Discussion. This paper shows that the representative concentrations for censored data (censored by detection, quantitation, and/or reporting limits) estimated using the described MLM are robust, reasonable, and useful for quantitative environmental site evaluations. The estimates are reasonable because they (1) are unbiased, (2) are consistent for different solutes at different locations, and (3) lie within the expected bounds spanned by empirical distributions with censored measurements replaced by the interval boundaries within which they reside, and they are improved by estimating the proportion of true zeros. The estimated representative concentration should not be based on a deterministic value somewhere between those bounds, as deterministic methods introduce a systematic error. This ML method is valuable for quantitative evaluation of environmental site conditions when comparing data sets from different locations and different time periods with varying analytical detection limits and/or before/after remediation or natural attenuation. The number of true zeros is of particular interest for projects involving remediation and/or natural attenuation performance assessments.
The flexibility of the ML-based method was demonstrated by incorporating multiple QA/QC limits (e.g., detection, quantitation, and reporting limits) in the estimation and by allowing variable thresholds for the censorship (a different censorship level for different measurements, to accommodate complex data sets with, for example, multiple analytes and dilution factors, multiple measurement types, or various laboratories). Practically, this means that the proposed method can incorporate different and multiple QA/QC limits for each available measurement; this could be particularly valuable when analyzing data collected over long time periods during which analytical methods and associated QA/QC limits change. The method is able to estimate the proportion of true zeros in a censored data set, as shown by using a precipitation data set for validation. The estimation of the proportion of true zero concentrations aided in the distinction between anthropogenic and nonanthropogenic contamination origins in an example using regional groundwater quality data.
For all of the above, the MLM-estimated representative concentration was shown to be stable. This was also true for individual estimates based on different distribution functions. Stability of the representative concentration was further demonstrated by performing simulations with the true mean of the simulated data known, the censoring limits known, and the means estimated. For multiple simulations and for increasing sample size per simulation, the estimated means converged toward the true mean. For the sake of this paper, only the mean and the variance were analyzed.
However, the presented approach estimates parameters for the full theoretical distribution and, therefore, higher moments can be calculated. The method was applied to high-resolution rock core contaminant concentrations collected from continuous cores at two locations within a mixed organic contaminant plume in sedimentary rock.

The purpose of applying the method to this data set was to facilitate a comparison between two locations along the plume centerline, and to less spatially resolved sample concentrations from monitoring wells and multilevel systems at these same locations. The comparison showed that the ML-based representative concentrations fit well into the conceptual hydrogeological model of the site, whereas averages calculated using the standard deterministic approach were biased and showed no distinguishable pattern.
The significance of this work goes beyond simply including more data in interpretations and decision making. The use of a statistical model allows for quantitative analysis and comparison of censored data sets (and avoids arbitrarily assigned values, which enhances the insights gained from the collected data), thus improving the typical site assessments required from hydrologists and environmental scientists at legacy and/or contaminated sites to determine changes in site conditions with respect to risk or technology performance.
We have estimated a representative concentration at two locations. This could be repeated at multiple locations to obtain a more realistic estimate of mass in the system, a key parameter for the design of remediation activities. This step would require the inclusion of a model of spatial dependence. The method could also be repeated over time to improve the estimation of solute transport behavior. The presented method allows for inclusion of historical data collected with outdated analytical methods and, hence, larger censorship intervals, which are commonly removed from such evaluations. Similarly, the quality of the estimation could be improved by compensating for poorer analytical resolution (for example, a smaller number of significant digits) with a larger number of measurements. In fact, analytical data collected from a variety of environmental compartments (e.g., water, air, soil) are commonly “censored” by detection, quantitation, and/or reporting limits. A deterministic analysis of the data without the statistical model was not able to derive a similarly conclusive assessment of site conditions. The practical relevance of the proposed method is an improved basis for evaluating spatial and/or temporal change in environmental systems, such as contaminant mass depletion via remediation or natural attenuation.



ASSOCIATED CONTENT

Supporting Information

The Supporting Information is available free of charge on the ACS Publications website at DOI: 10.1021/acs.est.6b05385. The source code is available upon request to the corresponding author. Computations were performed using open source tools python (3.6), numpy (1.12), scipy (0.18.1), and matplotlib (2.0.0) on macOS 10.12.3.44 (PDF)



AUTHOR INFORMATION

Corresponding Author

*Phone: +49 (0)7071−29 73081; e-mail: [email protected].
ORCID

Claus P. Haslauer: 0000-0003-0180-8602
Notes

The authors declare no competing financial interest.



ACKNOWLEDGMENTS
Funding for the collection and analysis of the dataset was provided by Hydrite Chemical Co. through the University Consortium for Field-Focused Groundwater Contamination Research and by a grant from the Natural Sciences and Engineering Research Council of Canada (NSERC) provided to Dr. Beth Parker, School of Engineering at the University of Guelph, for an NSERC Industrial Research Chair (Grant IRCPJ 363783−2006). Funding was also received from the German Research Foundation (DFG) under projects GRK 1829 and HA 7339. Data were collected as part of an industry-sponsored initiative and are currently proprietary.




REFERENCES

(1) Helsel, D. More Than Obvious: Better Methods for Interpreting Nondetect Data. Environ. Sci. Technol. 2005, 39, 419A−423A.
(2) Cohen, C. A., Jr Restriction and Selection in Samples from Bivariate Normal Distributions. J. Am. Stat. Assoc. 1955, 50, 884−893.
(3) Cohen, C. A., Jr Simplified Estimators for the Normal Distribution when Samples are Singly Censored or Truncated. Technometrics 1959, 1, 217−237.
(4) Cohen, C. A., Jr Progressively Censored Sampling in the Three Parameter Log-Normal Distribution. Technometrics 1976, 18, 99−103.
(5) Singh, N. Estimation of Parameters of a Multivariate Normal Population from Truncated and Censored Samples. J. R. Stat. Soc. B 1960, 22, 307−311.
(6) Nath, G. B. Estimation in Truncated Bivariate Normal Distributions. J. R. Stat. Soc. C 1971, 20, 313−319.
(7) Beale, E. M. L.; Little, R. J. A. Missing Values in Multivariate Analysis. J. R. Stat. Soc. B 1975, 37, 129−145.
(8) Gilliom, R. J.; Helsel, D. R. Estimation of Distributional Parameters for Censored Trace Level Water Quality Data: 1. Estimation Techniques. Water Resour. Res. 1986, 22, 135−146.
(9) Helsel, D. R.; Cohn, T. A. Estimation of Descriptive Statistics for Multiply Censored Water Quality Data. Water Resour. Res. 1988, 23, 1997−2004.
(10) Helsel, D. R. Statistics for Censored Environmental Data Using Minitab and R, 2nd ed.; John Wiley: Hoboken, NJ, 2012.
(11) Aboueissa, A. E.-M. A.; Stoline, M. R. Estimation of the Mean and Standard Deviation from Normally Distributed Singly-censored Samples. Environmetrics 2004, 15, 659−673.
(12) Wenger, S. J.; Freeman, M. C. Estimating Species Occurrence, Abundance, and Detection Probability Using Zero-inflated Distributions. Ecology 2008, 89, 2953−2959.
(13) Lee, A. H.; Wang, K.; Scott, J. A.; Yau, K. K. W.; McLachlan, G. J. Multi-level Zero-inflated Poisson Regression Modelling of Correlated Count Data With Excess Zeros. Stat. Methods Med. Res. 2006, 15, 47−61.
(14) Thelwall, M. Are There too Many Uncited Articles? Zero Inflated Variants of the Discretised Lognormal and Hooked Power law Distributions. Journal of Informetrics 2016, 10, 622−633.
(15) Sheu, M. l.; Hu, T. w.; Keeler, T. E.; Ong, M.; Sung, H. Y. The Effect of a Major Cigarette Price Change on Smoking Behavior in California: a Zero-inflated Negative Binomial Model. Health Economics 2004, 13, 781−791.
(16) Kroll, C. N.; Stedinger, J. R. Estimation of Moments and Quantiles using Censored Data. Water Resour. Res. 1996, 32, 1005−1012.
(17) Cohn, T. A.; Lane, W. L.; Baier, W. G. An Algorithm for Computing Moments-based Flood Quantile Estimates when Historical Flood Information is Available. Water Resour. Res. 1997, 33, 2089−2096.
(18) Cohn, T. A.; Lane, W. L.; Stedinger, J. R. Confidence Intervals for Expected Moments Algorithm Flood Quantile Estimates. Water Resour. Res. 2001, 37, 1695−1706.

(19) Griffis, V. W.; Stedinger, J. R.; Cohn, T. A. Log Pearson Type 3 Quantile Estimators With Regional Skew Information and low Outlier Adjustments. Water Resour. Res. 2004, 40, W07503.
(20) Liu, S.; Lu, J.-C.; Kolpin, D. W.; Meeker, W. Q. Analysis of Environmental Data with Censored Observations. Environ. Sci. Technol. 1997, 31, 3358−3362.
(21) Kroll, C. N.; Stedinger, J. R. Development of Regional Regression Relationships With Censored Data. Water Resour. Res. 1999, 35, 775−784.
(22) LeFrancois, M.; Poeter, E. P. Use of Observations Below Detection Limit for Model Calibration. Groundwater 2009, 47, 228−236.
(23) Michalak, A. M. A Gibbs Sampler for Inequality-constrained Geostatistical Interpolation and Inverse Modeling. Water Resour. Res. 2008, 44, W09437.
(24) Shumway, R.; Azari, R. Statistical Approaches to Estimating Mean Water Quality Concentrations with Detection Limits. Environ. Sci. Technol. 2002, 36, 3345−3353.
(25) Harter, H. L.; Moore, A. H. Asymptotic Variances and Covariances of Maximum-Likelihood Estimators, from Censored Samples, of the Parameters of Weibull and Gamma Populations. Ann. Math. Stat. 1967, 38, 557−570.
(26) Durrans, S. R.; Ouarda, T. B. M. J.; Rasmussen, P. F.; Bobée, B. Treatment of Zeroes in Tail Modeling of Low Flows. J. Hydrol. Eng. 1999, 4, 19−27.
(27) Sabourin, A.; Renard, B. Combining Regional Estimation and Historical Floods: a Multivariate Semiparametric Peaks-over-threshold Model With Censored Data. Water Resour. Res. 2015, 51, 9646−9664.
(28) Dubrule, O.; Kostov, C. An Interpolation Method Taking into Account Inequality Constraints: I. Methodology. Math. Geol. 1986, 18, 33−51.
(29) Bárdossy, A. Interpolation of Groundwater Quality Parameters with Some Values Below the Detection Limit. Hydrol. Earth Syst. Sci. 2011, 15, 2763−2775.
(30) HSI GeoTrans. DNAPL Removal Report; Hydrite Chemical Co., 1999.
(31) GeoTrans Inc. Additional Characterization Deep DNAPL Bedrock Zone; Hydrite Chemical Co., 2003.
(32) Meyer, J. R.; Parker, B. L.; Arnaud, E.; Runkel, A. C. Combining High Resolution Vertical Gradients and Sequence Stratigraphy to Delineate Hydrogeologic Units for a Contaminated Sedimentary Rock Aquifer System. J. Hydrol. 2016, 534, 505−523.
(33) Lima, G. G.; Parker, B. B.; Meyer, J. M. Dechlorinating Microorganisms in a Sedimentary Rock Matrix Contaminated with a Mixture of VOCs. Environ. Sci. Technol. 2012, 46, 5756−5763.
(34) Meyer, J. M. A High Resolution Vertical Gradient Approach to Hydrogeologic Unit Delineation in Fractured Sedimentary Rocks. Ph.D. thesis, University of Guelph, Guelph, Ontario, 2013.
(35) Parker, B. L.; Cherry, J. A.; Chapman, S. W. Discrete Fracture Network Approach for Studying Contamination in Fractured Rock. Aqua Mundi 2012, 3, 101−116.
(36) Sterling, S. N.; Parker, B. L.; Cherry, J. A.; Williams, J. H.; Lane, J. W.; Haeni, F. P. Vertical Cross Contamination of Trichloroethylene in a Borehole in Fractured Sandstone. Groundwater 2005, 43, 557−573.
(37) U.S. Environmental Protection Agency (U.S. EPA). Test Methods for Evaluating Solid Waste, Physical/Chemical Methods: Method 8260B Volatile Organic Compounds by Gas Chromatography/Mass Spectrometry (GC/MS); Technical Report SW-846, Chapter 4.3.2, 1996.
(38) Aldrich, J. R. A. Fisher and the Making of Maximum Likelihood 1912−1922. Statistical Science 1997, 12, 162−176.
(39) Hartung, J.; Elpelt, B. Multivariate Statistik, 7th ed.; Oldenbourg Verlag, 2007.
(40) Begier, M. H.; Hamdan, M. A. Correlation in a Bivariate Normal Distribution with Truncation in Both Variables. Aust. J. Stat. 1971, 13, 77−82.
(41) Akaike, H. A new Look at the Statistical Model Identification. IEEE Trans. Autom. Control 1974, 19, 716−723.

(42) Gerbersdorf, S. U.; Cimatoribus, C.; Class, H.; Engesser, K.-H.; Helbich, S.; Hollert, H.; Lange, C.; Kranert, M.; Metzger, J.; Nowak, W.; Seiler, T.-B.; Steger, K.; Steinmetz, H.; Wieprecht, S. Anthropogenic Trace Compounds (ATCs) in Aquatic Habitats - Research Needs on Sources, Fate, Detection and Toxicity to Ensure Timely Elimination Strategies and Risk Management. Environ. Int. 2015, 79, 85−105.
(43) Bárdossy, A.; Pegram, G. Interpolation of Precipitation Under Topographic Influence at Different Time Scales. Water Resour. Res. 2013, 49, 4545−4565.
(44) van der Walt, S.; Colbert, S. C.; Varoquaux, G. The NumPy Array: A Structure for Efficient Numerical Computation. Comput. Sci. Eng. 2011, 13, 22−30.

