Updating Exposure Models of Indoor Air Pollution Due to Vapor

Jan 14, 2014 - In this research, we demonstrate a Bayesian approach for using .... (38-40) The calibration process assumes there is an optimal set of ...
0 downloads 0 Views 1MB Size
Policy Analysis pubs.acs.org/est

Updating Exposure Models of Indoor Air Pollution Due to Vapor Intrusion: Bayesian Calibration of the Johnson-Ettinger Model Jill E. Johnston,*,† Qiang Sun,‡ and Jacqueline MacDonald Gibson† †

Department of Environmental Sciences & Engineering, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, United States ‡ Department of Biostatistics, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, United States S Supporting Information *

ABSTRACT: The migration of chlorinated volatile organic compounds from groundwater to indoor airknown as vapor intrusionis an important exposure pathway at sites with contaminated groundwater. High-quality screening methods to prioritize homes for monitoring and remediation are needed, because measuring indoor air quality in privately owned buildings is often logistically and financially infeasible. We demonstrate an approach for improving the accuracy of the Johnson-Ettinger model (JEM), which the Environmental Protection Agency (EPA) recommends as a screening tool in assessing vapor intrusion risks. We use Bayesian statistical techniques to update key JohnsonEttinger input parameters, and we compare the performance of the prior and updated models in predicting indoor air concentrations measured in 20 homes. Overall, the updated model reduces the root mean squared error in the predicted concentration by 66%, in comparison to the prior model. Further, in 18 of the 20 homes, the mean measured concentration is within the 90% confidence interval of the concentration predicted by the updated model. The resulting calibrated model accounts for model uncertainty and variability and decreases the false negatives rate; hence, it may offer an improved screening approach, compared to the current EPA deterministic approach.



INTRODUCTION Chlorinated volatile organic chemicals (CVOCs) can migrate from contaminated groundwater into overlying buildings, an exposure pathway known as vapor intrusion. The challenges of collecting a robust data set inside private homes have complicated the characterization of vapor intrusion risks across potentially affected communities.1−4 Empirical evidence shows substantial spatial and temporal variability in these risks, but the factors driving the observed variability are incompletely understood.1,5,6 The scarcity of community-scale data, in turn, creates challenges in prioritizing homes for monitoring and remediation. Because of political, technical, and monetary constraints on directly monitoring indoor air quality in private homes, mathematical screening tools typically are used to identify atrisk buildings. While few studies have compared model predictions to measured indoor air pollutant concentrations, the existing literature suggests that the current models are, in general, inadequate to describe observed concentrations.7 Previous research has identified the need for vapor intrusion models that more accurately reflect real-world conditions and better screen homes to identify priorities for monitoring.8 The model most commonly employed in the U.S. during vapor intrusion site characterizations and recommended by the U.S. Environmental Protection Agency (EPA) draft vapor intrusion guidance is the Johnson-Ettinger model (JEM).9,10 The JEM couples one-dimensional steady-state diffusion of © 2014 American Chemical Society

volatile compounds through porous media with diffusion and advection through the building foundation.11 In U.S. regulatory applications, the EPA spreadsheet version of the JEM is the most widely used and therefore is the benchmark against which we compare the Bayesian approach in this paper. In the EPA spreadsheet version, deterministic values for variables that describe contaminant, environmental, and household properties serve as JEM inputs. The output is the vapor attenuation ratio, α, a unitless parameter that relates the indoor air concentration to that in the subsurface water or soil.11 The indoor concentration due to vapor intrusion is estimated as C indoor = α × Csource

(1)

where Cindoor is the indoor air concentration (mass/volume) and Csource is the vapor-source concentration (mass/volume), generally expressed as groundwater concentration multiplied by the appropriate Henry’s Law coefficient. In this research, we demonstrate a Bayesian approach for using empirically measured indoor air concentrations from a case study community to calibrate the JEM. We test this approach using indoor air quality data that we collected from 20 homes in a case study community in San Antonio, Texas. Received: Revised: Accepted: Published: 2130

May 21, 2013 January 10, 2014 January 14, 2014 January 14, 2014 dx.doi.org/10.1021/es4048413 | Environ. Sci. Technol. 2014, 48, 2130−2138

Environmental Science & Technology

Policy Analysis

Table 1. Complete Set of Equation Needed to Implement the EPA Version of the Johnson-Ettinger Model (the input parameters calibrated using Bayesian updating are marked with an asterisk (*)) Model Equations

Variables:

Model Equations

House/Foundation Calculations

Q building =

R crack

Soil Property Calculations Vb = building volume (cm3) A = building area (cm2)

Vb = A × MH VbE b 3600

A =η b Xcrack

MH = building mixing height (cm)*

θv = θT − θm Qbuilding = Building ventilation rate (cm3/s)

K rg = (1 − Ste)0.5 (1 − (Ste)1/ M )2M Ki =

Lt n

∑i = 1

Li Dieff

K sμw ρw g

Q soil =

2πk(ΔP)Xcrack

(

μln

θr = residual soil water content (cm3/cm3)* θT = total porosity (m3-voids/m3 -soil)* M = Van Genuchten curve shape parameter* Krg = relative air permeability

k = K rg × K t 2Zcrack R crack

Ki = soil intrinsic permeability (cm2)

) Ks = soil saturated hydraulic conductivity (cm/s)* μw = dynamic viscosity of water (g/ cm-s) ρw = density of water (g/cm3) g = acceleration due to gravity (cm/s2) k = soil permeability near foundation (cm2/s) ΔP = indoor-outdoor pressure difference (g/cm-s2)* μ = viscosity of air (g/cm-s) Zcrack = crack opening depth below grade (cm)

Diffusion Calculations

eff Dtotal =

Ste = effective total fluid saturation θm = volumetric moisture content (cm3/cm3)*

θ − θr Ste = m θT − θr

Eb = indoor air exchange rate (hr−1)* Ab = area of enclosed space in contact with soil and below grade (cm2) Lcrack = foundation thickness (cm) Rcrack = effective crack width (cm) η = fraction of surface area with cracks* Xcrack = floor-wall seam perimeter (cm)

⎛ θ 3.33 ⎞ DH2O ⎛ θm3.33 ⎞ ⎜ ⎟ Dieff = Dair ⎜ v 2 ⎟ + Hi ⎝ θT2 ⎠ ⎝ θT ⎠

Variables:

Defi f = effective diffusion for soil layer, i (cm2/s) Dair = PCE diffusion coefficient in air (cm2/s) DH2O = PCE diffusion coefficient in water (cm2/s) Hi = PCE Henry’s constant Deff total = total overall effective diffusion coefficient (cm2/s)+ Lt = distance between the source and the bottom of the enclosed space floor (cm) Li = thickness of soil layer i (cm)

Systematic calibration procedures using site-level data have the potential to decrease the false-positive and false-negative rates in community-scale vapor intrusion risk assessments.12,13 The Bayesian method yields posterior calibrated distributions for the model input parameters of interest. The calibration approach used in this research has not been employed previously to update the JEM, although it has been applied successfully to a variety of ecological process-based and chemical fate-and-transport models.12,14−19 The JEM is intended to estimate the influence of groundwater contamination on the overlying indoor air and to identify buildings of concern for further evaluation.20 While the model contains numerous parameters, its application involves substantial uncertainty contributed by both model structure and parameter inputs.21 Since many sites contain hundreds or thousands of potentially impacted structures, the 2002 EPA draft guidance document on assessing and managing vapor intrusion risks recommends the use of a combination of default and readily available site-specific-values as JEM inputs in order to evaluate whether the modeled indoor air concentrations exceed an established risk level. The EPA guidance provides default deterministic values for many of the inputs and suggests that these defaults are “conservative,” in other words, tend to overestimate exposure risk. Previous studies comparing JEM predictions to measured concentrations have indicated that, with reasonable input

parameters, the JEM can predict indoor CVOC concentrations within 1 order of magnitude.22 Nontheless, although purported to be conservative, in some past studies the JEM has underpredicted indoor concentrations by more than an order of magnitude.3,5,23,23−25 More complex, three-dimensional models may be more accurate for an individual home, but these approaches require detailed local information, are not scaled to the community level, and have yet to be applied in a regulatory context.26−28 Nonetheless, a recent study demonstrated that making small JEM modifications, such as changing the soil-gas entry rate, can improve agreement with detailed 3dimensional models.29 The San Antonio, Texas, neighborhood employed to assess performance of the JEM in this research overlies extensive groundwater plumes of tetrachloroethylene (PCE), extending 8 km from the former Kelly Air Force Base and affecting some 30 000 homes. The shallow groundwater table lies 1−10 m below the houses, and PCE concentrations range from 5− 50 000 μg/L. In previous work we developed a method for incorporating parameter uncertainty into the JEM algorithm via Monte Carlo simulation and simulated indoor PCE concentrations in each of the affected homes, but the predictions underestimated measured concentrations.30 The goal of the research presented here is to further improve the JEM algorithm by calibrating it to measured indoor air concentrations, hence improving the model’s accuracy and 2131

dx.doi.org/10.1021/es4048413 | Environ. Sci. Technol. 2014, 48, 2130−2138

Environmental Science & Technology

Policy Analysis

measured data (Supporting Information (SI), Figure S1). The approach iteratively computes posterior distributions for each uncertain input by comparing the predicted and measured indoor concentrations at each new iteration. The algorithm produces a sequence of updated input parameter values that comprise the posterior probability distribution functions for these inputs. This approach allows for the formal management of uncertainties in both the input parameters and the measured values and permits the explicit calculation of uncertainty in model results.12 Johnson-Ettinger Model. The JEM uses the following relationship to characterize the diffusion of pollutant vapors through the subsurface soil and across a building foundation, in order to predict the subsequent pollutant accumulation inside the building:11

precision. We compare the indoor concentrations predicted by the calibrated stochastic model to those predicted by the deterministic JEM version currently used for policy decisions, in order to evaluate whether the calibration improves model predictions. To our knowledge, such calibration of the JEM previously has not been attempted.



MATERIALS AND METHODS The mechanistic JEM was combined with stochastic representations of input parameters and observed indoor air concentration data from the 20 homes to update unknown model input parameters through Bayesian calibration using a Markov Chain Monte Carlo (MCMC) technique.31 The method involves specifying prior probability distributions for the input parameters and the likelihood function for the

eff ⎛ Dtotal Q soilLcrack Ab ⎞ ⎜ ⎟exp Q Dcreffack ηA b ⎝ buildingLt ⎠ ⎛ Deff A ⎞ Deff A ⎡ + ⎜ Q total Lb ⎟ QtotalL b ⎢exp ⎝ building t ⎠ soil t ⎣

(

α= exp

(

Q soilLcrack eff ηA b Dcrack

)

(

2 where Deff total is total overall effective diffusion coefficient (cm / 2 s), Ab is the area of enclosed space below grade (cm ), Qbuilding is the building ventilation rate (cm3/s), Lt is the vertical sourcebuilding separation (cm), Qsoil is the volumetric flow rate of soil gas into the enclosed space (cm3/s), Lcrack is the enclosed space foundation or slab thickness (cm), η is the fraction of foundation surface area with cracks (unitless), and Deff crack is the effective diffusion coefficient through the cracks (cm2/s). A majority of the JEM inputs are time-consuming, impractical or expensive to characterize.32 Thus, in practice, parameter values are estimated from a series of secondary and tertiary equations (see Table 1) as suggested by the EPA.20 As a result, in practice the model requires 25 primary parameters. The EPA and many state agencies suggest using deterministic values for each parameter. In reality these parameters are both variable and uncertain. Hence, our stochastic modeling approach represents nine of the key parameters (five of which are specific to the soil type) with log-normal probability distributions. The prior parameters for each distribution are informed by readily available site-specific data or, where such data are unavailable, national surveys or experimental studies available in the literature.30 For this analysis, parameters describing soil properties were considered separately for each U.S. Department of Agriculture (USDA) classified soil type present at the sites of indoor air measurements (clay or silty clay). The distributions reflect both uncertainty in environmental drivers of vapor intrusion and variability in the parameter values. The remaining 14 inputs, the majority of which describe physical properties of PCE or measurable properties of a house (e.g., square footage), were represented as deterministic values. Collectively, the set of equations and parameter distributions constitutes the prior knowledge in the Bayesian framework to estimate the prior model-predicted indoor air concentration for each house. The groundwater depth and groundwater contaminant concentrations were interpolated for 2011 based on semiannual data for the 900 shallow groundwater monitoring wells located in and around the former Kelly AFB. Monitoring data are available through the Department of Defense Air Force Real Property Agency Semi-Annual Compliance Plans from 1998 to

)

) (

Q soilLcrack eff ηA b Dcrack

) − 1⎤⎥⎦

(2)

2011. The 31-km2 area was divided into 150 m by 150 m grid cells. A state-of-the-art spatiotemporal geostatistical interpolation method, Bayesian maximum entropy, was then used to describe the estimated mean concentration and variance for each grid cell using a technique described elsewhere.33,34 Every house in the grid cell was assigned the corresponding values (and uncertainty) for the groundwater depth (cm below surface) and PCE concentration (μg/L). Figures S2 and S3 (SI) show the resulting estimated mean values of groundwater concentration and depth across the study site. Since these values are house-specific and based on a dense network of monitoring wells, these two parameters were not calibrated during the Bayesian updating. Indoor Air Concentration Measurements. From July to August 2011, we collected multiple indoor air samples from 20 homes in the community (eight samples per home). The indoor PCE concentration (μg/m3) was measured with passive samplers taking multiple 72-h integrated measurements. Potential confounding sources were removed from each residence before sampling. Further details about the sampling methods are available elsewhere.35 The measured concentrations comprise the data values, yij, used to update the JEM parameters. The mean and variance of the measured concentration in each house were calculated, accounting for the detection limit (0.13 ug/m3) and using the Kaplan−Meier nonparametric technique.36,37 Table S1 (SI) presents indoor air concentration summary statistics and other relevant attributes for each sampled home. Bayesian Approach for Updating JEM Parameters. Let yij denote the PCE indoor air concentration for the jth measurement in the ith house, and let f(θ, ω, xi) denote the mean value of the indoor air concentration predicted in the ith house by the JEM using the parameter vectors θ (the nine parameters with asterisks in Table 1, which are represented using the same probability distribution for each house and are updated in the calibration), ω (representing parameters that are deterministic and therefore are not updated), and the xi (the three parametershouse area, groundwater concentration and groundwater levelthat vary by house and also are not updated). For every home, 8−12 indoor air measurements were 2132

dx.doi.org/10.1021/es4048413 | Environ. Sci. Technol. 2014, 48, 2130−2138

Environmental Science & Technology

Policy Analysis

Figure 1. Comparison of the measured indoor air PCE concentrations due to vapor intrusion (Johnston & MacDonald Gibson, 2013) and EPArecommended deterministic modeling based on the Johnson-Ettinger algorithm.

parameter. These samples are generated iteratively, with the distribution of the next candidate sample depending only on the current sample value, denoted as θt. The candidate (θcandidate) is either accepted, in which case it is used in the next iteration, or rejected, in which case it is discarded, and the current value is reused for the next iteration. The acceptance probability is determined by dividing the adjusted likelihood of the candidate parameter vector of model inputs by the adjusted posterior likelihood of the current input vector; the result is called the Metropolis ratio, r. If r ≥ 1, the candidate vector is always accepted; if r < 1, the candidate vector is accepted with probability r. The SI provides additional details. For each calibration, three parallel Markov chains were run from three different parameter vector initial values: the prior mean value and two randomly sampled points from the prior parameter distribution. We ran 20 000 iterations, disregarding the first 10 000 as the unrepresentative “burn-in” of the chain and adopting the final 10 000 to characterize the parameter vector posterior distribution.31,38 Chain convergence was assessed with Geweke’s (1992) convergence criterion for individual chains and the Gelman-Rubin convergence statistic for multiple chains.43−45 All test statistics indicated that the resulting chains reached stationary distributions and therefore can be considered as a representative sample from the posterior probability distribution. From this sample we calculated the mean, variance, and 90% confidence interval of the posterior estimate for each parameter based on the three chains. The generation and analysis of the Markov chains were carried out with the statistical package R, and diagnostics were conducted using the coda package for R.46 Soil Parameter Inputs Based on Site Measurements. Previous research suggests that the JEM is sensitive to soil property inputs and that these account for much of the model output uncertainty.30,47 As a result, we collected site-specific soil samples in order to provide site-level prior estimates of volumetric moisture content (θm), hydraulic conductivity (Ks), and total porosity (θT) for each of the two soil types in the

available. We can then use a stochastic model to investigate the relationship between yij, and f(θ,ω, xi), as follows: yij = f (θ , ω , x i) + εij , i = 1, 2, ..., n; j = 1, 2, ..., ji (3)

where n is the number of houses, ji is the total number of measurements within the ith house, and εij is the difference between the predicted mean and the observed mean concentration (the error term). In accordance with previous work, the errors are assumed to be independent and identically distributed as normal random variables, that is, εij∼N(0, σ2i ).38−40 The calibration process assumes there is an optimal set of input parameters that minimizes the error. These assumptions lead to the following likelihood function: n

p(y|θ)] =

ji

∏∏ i=1 j=1

⎛ (y − f (θ , ω , x ))2 ⎞ i ij ⎟ exp⎜⎜ 2 ⎟ 2 σ 2 2πσi i ⎝ ⎠ 1

(4)

The likelihood function evaluates how well the simulation model at each value of θ is able to reproduce the observed data yij. In the Bayesian paradigm, any prior information about the uncertain input parameters, θ, can be improved by incorporating the measured indoor concentrations through the likelihood function, p(y|θ), which further leads to the posterior distribution p(θ|y), according to Bayes’ Theorem: p(θ|y) =

p(y|θ)p(θ) p(y)

(5)

Posterior Distribution. The posterior distribution of θ cannot be specified analytically because of the complex relationships among the input parameters, the model output, and the data. However, samples from the posterior distribution can be generated using the MCMC method, specifically the Metropolis-Hastings algorithm.12,15,41,42 The Metropolis-Hastings algorithm works by generating a sequence of random samples to approximate the posterior distribution for each 2133

dx.doi.org/10.1021/es4048413 | Environ. Sci. Technol. 2014, 48, 2130−2138

Environmental Science & Technology

Policy Analysis

Figure 2. Box plots of the prior and posterior distributions (Model 1 and Model 2) for each of the four household parameters and five soil parameters (for clay and SiC: silty clay soil) calibrated during the Bayesian updating.

study area. Soil cores were collected at four of the 20 study homes, resulting in two cores for each soil type. These cores were evaluated at approximately 1.5 and 5 m below the surface. Due to the limited sample size and the model’s assumption of homogeneous soil properties, we identified the median, minimum, and maximum values for each soil type and assigned a log-normal distribution for each parameter such that the minimum and maximum measured values equaled the lower and upper 95th percentile values of the calculated log-normal distribution. Steinberg et al. (1997) and Arhonditsis et al. (2007) previously used a similar assumption with PCB fate and transport parameters and eutrophication model parameters, respectively.48,49 The extent to which the small number of soil cores collected represents soil properties across this study community is unknown. A second set of MCMC simulations was run to estimate a posterior probability distribution to evaluate whether these limited soil parameter measurements improve model calibration and posterior predictions, compared to using soil property estimates from available U.S. Department of Agriculture (USDA) databases. Evaluation of Model Performance. The calibrated model performance was evaluated via leave−one-out cross-validation, since no additional measurements were available for comparison. In this cross-validation, one home was excluded from the MCMC calibration process, and then the updated parameters were used to predict the indoor air concentration for the excluded home. This was repeated for each home. We then (a) compared the 90% confidence intervals of the in-home PCE concentration measurements and the model predictions, (b) determined the average pairwise correlations between the mean and 95th percentile values of the predictions and measurements, and (c) calculated the average root-mean-squared error (RMSE):

n

rmse =

j

∑i = 1 ∑ ji= 1 (yij − f (θ , ω , x i) n

(6)



RESULTS AND DISCUSSION JEM Deterministic Predictions. The predicted indoor PCE concentrations in the 20 homes were initially calculated using the standard adaptation of the JEM as recommended by the EPA vapor intrusion guidance document, incorporating the mean estimates for groundwater level and groundwater PCE concentrations, house-specific soil type and house area, and EPA default values (shown in SI, Table S3) for the other parameters.20 The JEM deterministic predictions for PCE ranged from 0.005 to 0.15 μg/m3 for the 20 homes where measurements were collected. As shown in Figure 1 the deterministic method underpredicted the mean measured concentrations in all cases, and the predicted value was less than the fifth percentile of the observed concentrations in 11 of 20 study homes. Hence, the standard JEM approach underestimated exposure levels. Simulation of Indoor Air Predictions. Monte Carlo simulation of the JEM using 10 000 iterations and the prior input parameter distributions (i.e., distributions based on already available local and national databases, prior to updating with measured indoor air concentration data) produced prior probability estimates of PCE indoor air concentrations for each house. On average, these prior predictions were within approximately 1 order of magnitude of the observed measurements, as generally expected for the JEM (see SI, Figure S4). In 19 of 20 homes, the modeled mean value was less than the observed mean. We calibrated the JEM against measured indoor air data using two different prior distributions for soil parameters: (1) distributions based on publicly available USDA soil information 2134

dx.doi.org/10.1021/es4048413 | Environ. Sci. Technol. 2014, 48, 2130−2138

Environmental Science & Technology

Policy Analysis

Table 2. Comparing Model Performance of the Deterministic Approach, The Prior Predictions and the Two Updated Model Predictions to the Measured Indoor Air Concentrations deterministic mean mean difference (paired t test)a RMSEb a

0.097 (p < 0.001) 0.264

prior 95th

Model 1

mean 0.082 (p < 0.001) 0.103

95th

mean

0.087 (p = 0.16) 0.276

Model 2 95th

0.024 (p = 0.23) 0.089

0.035 (p = 0.57) 0.262

mean 0.0019 (p = 0.94) 0.109

95th −0.14 (p = 0.84) 0.301

The average difference between the measured and the predicted mean (μg/m3) and the significance of this difference. bRoot Mean Squared Error.

(Model 1), and (2) distributions estimated from our analysis of the four soil cores (Model 2). Figure 2 shows the corresponding box plots of the prior and posterior distributions of the nine updated model parameters (further details in SI, Table S3). For the four house-related parameters (mixing height, air exchange rate, fraction of foundation area with cracks, and indoor-outdoor pressure difference), in both models the standard deviations of the posterior distributions were reduced as a result of the calibration. These results suggest that the Bayesian calibration technique was able to reduce the uncertainty in the model input variables and narrow the prediction interval. For example, the standard deviation of the air exchange rate (Eb) decreased to almost half of its prior value, and the mean also decreased (by about 20%) in both Model 1 and Model 2. We found a higher average differential pressure (ΔP) than initially chosen along with a decrease in the standard deviation of this parameter. Both of these changes suggest that our initial prior distributional assumptions were not sufficiently conservative, as a higher pressure differential and a lower air exchange rate increase the vapor attenuation factor and thus lead to higher indoor air concentrations. Model 1 decreased the 90% confidence interval width for four soil parameters (clay volumetric moisture content and silty clay total porosity, volumetric moisture content, and Van Genuchten curve shape parameter, shown in Figure 2). For Model 2, we integrated site-level soil core data on total porosity, volumetric moisture content, and hydraulic conductivity into our prior knowledge. As a result, the calibration adjusted the soil parameters to some extentalthough the posterior distribution is largely tied to the initial assumptions of the prior probability functions (from which the posterior candidate values were drawn). For Model 2, our prior distributions for these parameters were substantially different than the assumptions based on USDA data (Table 2). Further, we assumed that our soil measurements represented the soil type across the neighborhood in spite of the small sample size and single sampling event. As a result of these assumptions, the average hydraulic conductivity converged on values significantly lower than estimated in the first model. However, this effect appears to be partially offset by the higher overall total porosity of the measured soil parameters. Model Prediction Error. In general, the JEM’s predictive ability improved as a result of calibration with site-level data. The mean values of both Model 1 and Model 2 show better agreement with the measured mean values for PCE compared to both the deterministic and prior information models (Figure 1 and 3). In 18 of 20 homes, the mean measured value is within the 90% confidence interval of the predicted concentration, and the confidence intervals are less than 1 order of magnitude in range. However, in cases of high indoor air concentrations (especially in homes above lower PCE groundwater concentrations), the calibrated model was still unable to adequately predict the upper ranges of the observed indoor air

Figure 3. Measured predictions versus updated model predictions. The dotted gray horizontal line represents the PCE detection limit (0.13 μg/m3).

concentrations. This is also evident by examining the averaged cumulative distribution function of the measured and predicted indoor air concentrations (see SI Figure S6). In the two homes for which the model predictions were outside of the observed values, no detectable PCE was measured, and therefore model performance is difficult to assess. In general, Model 2 estimates are slightly higher (and have a somewhat larger uncertainty range) than Model 1, indicating a slightly more conservative (but less precise) exposure estimate. The average measured indoor PCE concentration for all homes was 0.16 μg/m3 compared to 0.07, 0.13, and 0.15 μg/m3 for the deterministic model, Model 1 and Model 2, respectively. As Table 2 shows, the leave-one-out-cross-validation technique shows that both Model 1 and Model 2 decreased the error (measured as the RMSE) between the measured and predicted means and 95th percentiles. Compared to the deterministic approach, Models 1 and 2 reduce the RMSE of the mean by 68% and 58%, respectively. In general, Model 1 residuals are less than Model 2 residuals, suggesting better agreement with observed values. The results of paired t tests between measured and modeled values indicate that the mean differences between both of the calibrated models and the measured values is not significantly different than zero. In contrast, for both the prior model and the deterministic model estimates, the predicted means differed significantly from the measured means (p < 0.001). Our approach assumed the individual elements of εij are uncorrelated. The empirical correlation matrix of the posterior chains in SI, Table S4, confirms that this assumption is reasonable, because the correlations do not differ significantly from 0. Sensitivity and Uncertainty Analysis. Previous work has found that the predicted vapor attenuation ratio is most 2135

dx.doi.org/10.1021/es4048413 | Environ. Sci. Technol. 2014, 48, 2130−2138

Environmental Science & Technology

Policy Analysis

Figure 4. Sensitivity analysis of uncertain variables. Indicates the change in the mean predicted indoor air concentrations (μ/m3) when uncertain parameters are set to their 5th or 95th percentile. The soil parameters displayed are for the silty clay (SiC) soil type.

below a certain range. Using measured soil values (Model 2) narrowed the absolute range of uncertainty in most soil parameters but increased the coefficient of variation (the ratio of the standard deviation to the mean), resulting in an increase in sensitivity of the predicted indoor air concentration (Figure 4). Notably, we observe greater uncertainty contributed by the hydraulic conductivity than seen in Model 1, which further suggests that soil properties exhibit substantial, irreducible variability. The important influence of small-scale heterogeneity in soil on soil vapor transport has been noted in other studies.52,53 This observation also may be an artifact of the limited sample size. Potential Suitability of Bayesian Calibration for Vapor Intrusion. Our primary goal in this study was to demonstrate the potential of a Bayesian calibration procedure to improve the parametrization of a vapor intrusion model, systematically quantify the uncertainty, and reduce the uncertainty in model output. Bayesian calibration with MCMC can handle a large number of parameters simultaneously, associate prior knowledge of parameter values with measurements of output variables, and reduce uncertainty when there is insufficient knowledge about the parameter distributions. We chose to calibrate model parameters common across the community (rather than house-specific parameters). This approach searches for posterior distributions that simultaneously minimize error among all study homes at this site. In this study, indoor air measurements and model estimates do not exceed the revised 2012 EPA screening levels for PCE, but the research nonetheless offers insights into a systematic approach to improving JEM exposure estimates based on limited data. This technique could also be applied to multisite data to compare site-level and regional-level estimates.15 The calibration results can serve as the default values for spatial extrapolation in new houses where indoor air measurements have not been collected. However, Bayesian calibration cannot

sensitive to the effective soil moisture content, air exchange rate, total porosity, and building mixing height.50,51 A sensitivity analysis was conducted by setting each calibrated model input variable to the fifth and 95th percentile of its posterior distribution (Figure 4) and also calculating the partial rankorder correlation (SI Figure S7), in order to determine the effects on the predicted indoor air concentration. We also conducted a similar analysis using the prior distributions of model parameters, in order to assess the effects of the model calibration on prediction sensitivity. Increasing the air exchange rate, the mixing height or the soil moisture content decreases the predicted indoor air concentration, while the opposite relationship is true of the other parameters. Among these input variables, the uncertainty in the air exchange rate and then mixing height had the largest influence on the predicted indoor air concentrations. Beyond building condition parameters, predicted concentrations also were sensitive to total soil porosity, pressure differential and hydraulic conductivity. Importantly, the calibration process tended to decrease the uncertainty around these important input parameters, hence providing for narrower prediction intervals, compared to the prior model. On the other hand, the uncertainty around some of the input parameters stayed the same, indicating that the calibration had little impact. This is particularly true for the first posterior model (Model 1), where only slight changes occurred in the probability distributions of the soil parameters (Figure 2). This may suggest either that data are insufficient to improve these parameters or that their optimal values were outside the prescribed range.15 The inability to narrow the uncertainty around the soil properties (for the 2 soil types observed in this study) may also suggest that there is extensive heterogeneity among the soil properties influencing vapor diffusion even within a given soil type. This natural heterogeneity may make it impossible to reduce the standard deviation of these inputs 2136

dx.doi.org/10.1021/es4048413 | Environ. Sci. Technol. 2014, 48, 2130−2138

Environmental Science & Technology



reduce the uncertainty of a parameter estimate or improve its accuracy without appropriate observational data. While the calibrated models improve model predictions compared to a deterministic approach, a large amount of uncertainty remains. This uncertainty is consistent with empirical observations at vapor intrusion sites, where indoor air concentrations can fluctuate by more than an order of magnitude.1,4,6 Predictions did not improve substantially when including measured soil properties as the initial parameter assumptions, although this may not be the case if a large sample is available from which to establish the initial distributional properties or for scenarios with nonclay soil types. At this site, improving parameters of the soil properties proved challenging and possibly reflects that the spatial resolution of the soil classification scheme may need to be increased. In the future, it may be possible to improve estimates even further by applying a hierarchical Bayesian technique.54,55 Research Implications. This research combined a vapor intrusion model with a mechanistic foundation with a statistical approach for model calibration and uncertainty estimation; the mechanistic model component can predict system behavior, while the Bayesian statistical updating approach allows for empirical parameter estimation and rigorous uncertainty analysis, which can strengthen the inferences drawn from model results in support of decision-making. Uncertainty is fundamental to all scientific activities, and people regularly make decisions based on uncertain data (e.g., weather forecasts). The results need not be treated as final predictions, but rather can be viewed as new sources of information (new prior probability functions) for subsequent “experiments” that lead to modifications in management practices (new decisions) or improvement of exposure predictions. For better model-based decision-making, the uncertainty in model projections should ideally be reduced, quantified, and reported in a way that can be used by decision makers. As EPA is emphasizing a multiple-lines-of-evidence approach, we demonstrate a technique for practitioners to improve sitelevel risk characterizing by integrating the JEM with indoor air measurements. While the limitations of JEM are well documented, it is nonetheless still widely used in regulatory contexts. Parameterizing and then calibrating inputs to the JEM may decrease the potential for false negatives, provide a better concept of the potential range of exposures due to vapor intrusion, and produce a more robust data set from which to make decisions.



REFERENCES

(1) Folkes, D.; Wertz, W.; Kurtz, J.; Kuehster, T. Observed spatial and temporal distributions of CVOCs at Colorado and New York vapor intrusion sites. Ground Water Monit. Rem. 2009, 29, 70−80. (2) McDonald, G. J.; Wertz, W. E. PCE, TCE, and TCA vapors in subslab soil gas and indoor air: A case study in upstate New York. Ground Water Monit. Remediat. 2007, 27, 86−92. (3) Schreuder, W. A. Uncertainty Approach to the Johnson and Ettinger Vapor Intrusion Model, EPA/600/R-05/110; U.S. Environmental Protection Agency, 2006; 072681998, pp 164−173. (4) McHughT. E.NicklesT.BrockS.Evaluation of spatial and temporal variability in VOC concentrations at vapor intrusion investigation sites. In Proceedings of the Air & Waste Management Assocition, Vapor Intrusion: Learning from the Challenges; Providence, RI, 2007. (5) Fitzpatrick, N. A.; Fitzgerald, J. J. An evaluation of vapor intrusion into buildings through a study of field data. Soil Sed. Contam. 2002, 11, 603−623. (6) Johnson, P.; Ettinger, R.; Kurtz, J.; Bryan, R.; Kester, J. Empirical assessment of ground water-to-indoor air attenuation factors for the CDOT-MTL Denver site. Ground Water Monit. Remediat. 2009, 29, 153−159. (7) Yao, Y.; Shen, R.; Pennell, K. G.; Suuberg, E. M. Examination of the U.S. EPA’s vapor intrusion database based on models. Environ. Sci. Technol. 2013, 47, 1425−1433. (8) Picone, S.; Valstar, J.; van Gaans, P.; Grotenhuis, T.; Rijnaarts, H. Sensitivity analysis on parameters and processes affecting vapor intrusion risk. Environ. Toxicol. Chem. 2012, 31, 1042−1052. (9) Eklund, B.; Beckley, L.; Yates, V.; McHugh, T. E. Overview of state approaches to vapor intrusion. Rem. J. 2012, 22, 7−20. (10) U.S. EPA. OSWER Draft Guidance for Evaluating the Vapor Intrusion to Indoor Air Pathway from Groundwater and Soils (Subsurface Vapor Intrusion Guidance), EPA530-D-02-004, 2002. (11) Johnson, P. C.; Ettinger, R. A. Heuristic model for predicting the intrusion rate of contaminant vapors into buildings. Environ. Sci. Technol. 1991, 25, 1445−1452. (12) Larssen, T.; Huseby, R. B.; Cosby, B. J.; Høst, G.; Høgåsen, T.; Aldrin, M. Forecasting acidification effects using a Bayesian calibration and uncertainty propagation approach. Environ. Sci. Technol. 2006, 40, 7841−7847. (13) Ellison, A. M. An introduction to Bayesian inference for ecological research and environmental decision-making. Ecol. Appl. 1996, 1036−1046. (14) Van Oijen, M.; Cameron, D.; Butterbach-Bahl, K.; Farahbakhshazad, N.; Jansson, P. E.; Kiese, R.; Rahn, K. H.; Werner, C.; Yeluripati, J. A Bayesian framework for model calibration, comparison and analysis: Application to four models for the biogeochemistry of a Norway spruce forest. Agric. For. Meteorol. 2011, 151, 1609−1621. (15) Lehuger, S.; Gabrielle, B.; Oijen, M.; Makowski, D.; Germon, J. C.; Morvan, T.; Hénault, C. Bayesian calibration of the nitrous oxide emission module of an agro-ecosystem model. Agric. Ecosyst. Environ. 2009, 133, 208−222. (16) Yeluripati, J. B.; van Oijen, M.; Wattenbach, M.; Neftel, A.; Ammann, A.; Parton, W.; Smith, P. Bayesian calibration as a tool for initialising the carbon pools of dynamic soil models. Soil Biol. Biochem. 2009, 41, 2579−2583. (17) Arhonditsis, G. B.; Papantou, D.; Zhang, W.; Perhar, G.; Massos, E.; Shi, M. Bayesian calibration of mechanistic aquatic biogeochemical models and benefits for environmental management. J. Mar. Syst. 2008, 73, 8−30. (18) Reinds, G. J.; Van Oijen, M.; Heuvelink, G.; Kros, H. Bayesian calibration of the VSD soil acidification model using European forest monitoring data. Geoderma 2008, 146, 475−488. (19) Saloranta, T. M.; Armitage, J. M.; Haario, H.; Næs, K.; Cousins, I. T.; Barton, D. N. Modeling the effects and uncertainties of contaminated sediment remediation scenarios in a Norwegian Fjord by Markov Chain Monte Carlo simulation. Environ. Sci. Technol. 2007, 42, 200−206.

ASSOCIATED CONTENT

S Supporting Information *

Additional details on the methods are provided. This information is available free of charge via the Internet at http://pubs.acs.org/.



Policy Analysis

AUTHOR INFORMATION

Corresponding Author

*Phone: 919.843.5786; fax: 919.966.7911; e-mail: jillj@unc. edu. Notes

The authors declare no competing financial interest.



ACKNOWLEDGMENTS This project was supported in part by the National Science Foundation Graduate Research Fellowship Program. 2137

dx.doi.org/10.1021/es4048413 | Environ. Sci. Technol. 2014, 48, 2130−2138

Environmental Science & Technology

Policy Analysis

(20) Environmental Quality Management, Inc. User’s Guide for Evaluating Subsurface Vapor Intrusion into Buildings, 2004. (21) Fitzgerald, J. One regulatory perspective on the vapor intrusion pathway. t. 2009, 29, 51−52. (22) Hers, I.; Zapf-Gilje, R. Evaluation of the Johnson and Ettinger model for prediction of indoor air quality. Ground Water Monit. Rem. 2003, 23, 119−133. (23) Hers, I.; Zapf-Gilje, R.; Evans, D.; Li, L. Comparison, validation, and use of models for predicting indoor air quality from soil and groundwater contamination. Soil Sed. Contam. 2002, 11, 491−527. (24) Mills, W. B.; Liu, S.; Rigby, M. C.; Brenners, D. Time-variable simulation of soil vapor intrusion into a building with a combined crawl space and basement. Environ. Sci. Technol. 2007, 41, 4993−5001. (25) Provoost, J.; Bosman, A.; Reijnders, L.; Bronders, J.; Touchant, K.; Swartjes, F. Vapour intrusion from the vadose zoneSeven algorithms compared. J. Soils Sed. 2010, 10, 473−483. (26) Pennell, K. G.; Bozkurt, O.; Suuberg, E. M. Development and application of a three-dimensional finite element vapor intrusion model. J. Air Waste Manag. Assoc. 2009, 59, 447−460. (27) Yao, Y.; Suuberg, E. M. A review of vapor intrusion models. Environ. Sci. Technol. 2013, 47, 2457−2470. (28) Bozkurt, O.; Pennell, K. G.; Suuberg, E. M. Simulation of the vapor intrusion process for nonhomogeneous soils using a threedimensional numerical model. Ground Water Monit. Rem. 2009, 29, 92−104. (29) Yao, Y.; Shen, R.; Pennell, K. G.; Suuberg, E. M. Comparison of the Johnson-Ettinger vapor intrusion screening model predictions with full three-dimensional model results. Environ. Sci. Technol. 2011, 45, 2227−2235. (30) Johnston, J. E.; Mac Donald Gibson, J. Probabilistic approach to estimating indoor air concentrations of chlorinated volatile organic compounds from contaminated groundwater: A case study in San Antonio, Texas. Environ. Sci. Technol. 2011, 45, 1007−1013. (31) Gilks, W. R.; Roberts, G. O. Strategies for improving MCMC. In Markov Chain Monte Carlo in Practice; CRC Press, 1996; pp 89−114. (32) Yao, Y.; Pennell, K. G.; Suuberg, E. M. Estimation of contaminant subslab concentration in vapor intrusion. J. Hazard. Mater. 2012, 231, 10−17. (33) Christakos, G.; Bogaert, P.; Serre, M. L. Temporal GIS: Advanced Functions for Field-Based Applications; Springer: New York, 2001, pp 217. (34) Serre, M. L.; Carter, G.; Money, E. Geostatistical space/time estimation of water quality along the Raritan river basin in New Jersey. In Computational Methods in Water Resources, Part 2, Proceedings of the 15th International Conference on Computational Methods in Water, 2004; Vol. 55, pp 13−17. (35) Johnston, J. E.; Gibson, J. M. Spatiotemporal variability of tetrachloroethylene in residential indoor air due to vapor intrusion: A longitudinal, community-based study. J. Exposure Sci. Environ. Epidemiol. 2013, DOI: 10.1038/jes.2013.13. (36) Helsel, D. R. Nondetects and Data Analysis. Statistics for Censored Environmental Data; Wiley-Interscience, 2005. (37) Antweiler, R. C.; Taylor, H. E. Evaluation of statistical treatments of left-censored environmental data using coincident uncensored data sets: I. Summary statistics. Environ. Sci. Technol. 2008, 42, 3732−3738. (38) Van Oijen, M.; Rougier, J.; Smith, R. Bayesian calibration of process-based forest models: Bridging the gap between models and data. Tree Physiol. 2005, 25, 915−927. (39) Svensson, M.; Jansson, P. E.; Gustafsson, D.; Kleja, D. B.; Langvall, O.; Lindroth, A. Bayesian calibration of a model describing carbon, water and heat fluxes for a Swedish boreal forest stand. Ecol. Model. 2008, 213, 331−344. (40) Klemedtsson, L.; Jansson, P. E.; Gustafsson, D.; Karlberg, L.; Weslien, P.; von Arnold, K.; Ernfors, M.; Langvall, O.; Lindroth, A. Bayesian calibration method used to elucidate carbon turnover in forest on drained organic soil. Biogeochemistry 2008, 89, 61−79.

(41) Metropolis, N.; Rosenbluth, A. W.; Rosenbluth, M. N.; Teller, A. H.; Teller, E. Equation of state calculations by fast computing machines. J. Chem. Phys. 1953, 21, 1087. (42) Kennedy, M. C.; O’Hagan, A. Bayesian calibration of computer models. J. R. Stat. Soc.: Ser. B 2001, 63, 425−464. (43) Gelman, A.; Carlin, J. B.; Stern, H. S.; Rubin, D. B. Bayesian Data Analysis; Chapman & Hall/CRC, 2003. (44) Geweke, J. Evaluating the accuracy of sampling-based approaches to the calculation of posterior moments, Bayesian statistics. In Bayesian Statistics 4; Bernardo, J. M., Berger, J. O., DavidA. P., Smith, A. F. M., Eds.; Oxford University Press, 1992; 1690193. (45) Cowles, M. K.; Carlin, B. P. Markov chain Monte Carlo convergence diagnostics: A comparative review. J. Am. Stat. Assoc. 1996, 91, 883−904. (46) Plummer, M.; et al. CODA: Output analysis and diagnostics for Markov Chain Monte Carlo simulations. 2006. (47) Tillman, F. D.; Weaver, J. W. Temporal moisture content variability beneath and external to a building and the potential effects on vapor intrusion risk assessment. Sci. Total Environ. 2007, 379, 1− 15. (48) Steinberg, L. J.; Reckhow, K. H.; Wolpert, R. L. Characterization of parameters in mechanistic models: A case study of a PCB fate and transport model. Ecol. Model. 1997, 97, 35−46. (49) Arhonditsis, G. B.; Qian, S. S.; Stow, C. A.; Lamon, E. C.; Reckhow, K. H. Eutrophication risk assessment using Bayesian calibration of process-based models: Application to a mesotrophic lake. Ecol. Model. 2007, 208, 215−229. (50) Johnson, P. C. Identification of application-specific critical inputs for the 1991 Johnson and Ettinger vapor intrusion algorithm. Ground Water Monit. Remediat. 2005, 25, 63−78. (51) Tillman, F. D.; Weaver, J. W. Uncertainty from synergistic effects of multiple parameters in the Johnson and Ettinger (1991) vapor intrusion model. Atmos. Environ. 2006, 40, 4098−4112. (52) Garbesi, K.; Robinson, A.; Sextro, R.; Nazarof, W. Radon entry into houses: The importance of scale-dependent permeability. Health Phys. 1999, 77, 183. (53) Carsel, R. F.; Parrish, R. S. Developing joint probability distributions of soil water retention characteristics. Water Resour. Res. 1988, 24, 55−769. (54) Goyal, A.; Small, M. J.; von Stackelberg, K.; Burmistrov, D.; Jones, N. Estimation of fugitive lead emission rates from secondary lead facilities using hierarchical Bayesian models. Environ. Sci. Technol. 2005, 39, 4929−4937. (55) Borsuk, M. E.; Higdon, D.; Stow, C. A.; Reckhow, K. H. A Bayesian hierarchical model to predict benthic oxygen demand from organic matter loading in estuaries and coastal zones. Ecol. Model. 2001, 143, 165−181.

2138

dx.doi.org/10.1021/es4048413 | Environ. Sci. Technol. 2014, 48, 2130−2138