Article pubs.acs.org/est
Acute Gastrointestinal Illness Risks in North Carolina Community Water Systems: A Methodological Comparison Nicholas B. DeFelice,† Jill E. Johnston,‡ and Jacqueline MacDonald Gibson*,§ †
Department of Environmental Health Sciences, Mailman School of Public Health, Columbia University, New York, New York 10032, United States
‡ Division of Environmental Health, Keck School of Medicine, University of Southern California, Los Angeles, California 90089, United States
§ Department of Environmental Sciences & Engineering, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, United States
Supporting Information
ABSTRACT: The magnitude and spatial variability of acute gastrointestinal illness (AGI) cases attributable to microbial contamination of U.S. community drinking water systems are not well characterized. We compared three approaches (drinking water attributable risk, quantitative microbial risk assessment, and population intervention model) to estimate the annual number of emergency department visits for AGI attributable to microorganisms in North Carolina community water systems. All three methods used 2007−2013 water monitoring and emergency department data obtained from state agencies. The drinking water attributable risk method, which was the basis for previous U.S. Environmental Protection Agency national risk assessments, estimated that 7.9% of annual emergency department visits for AGI are attributable to microbial contamination of community water systems. However, the other methods’ estimates were more than 2 orders of magnitude lower, each attributing 0.047% of annual emergency department visits for AGI to community water system contamination. The differences in results between the drinking water attributable risk method, which has been the main basis for previous national risk estimates, and the other two approaches highlight the need to improve methods for estimating endemic waterborne disease risks, in order to prioritize investments to improve community drinking water systems.
■
INTRODUCTION The introduction of improved municipal water and sewer services was one of the most influential U.S. public health advances during the twentieth century. These interventions are credited with decreasing U.S. infant mortality by 75%, child mortality by 67%, and total mortality by 50% between 1900 and 1936.1 Drinking water filtration and chlorination contributed to the majority of observed mortality reductions; these methods are highly effective at preventing common causes of waterborne enteric disease. As of October 2011, 299 million U.S. residents (96% of the population) were served by community water supply systems (CWSs),2 defined by the Environmental Protection Agency (EPA) as systems serving at least 25 residents or 15 residential connections year-round.3 The Safe Drinking Water Act (SDWA) requires regular monitoring of CWSs for indicators of microbial contamination and corrective action if these indicators exceed regulatory standards. Nonetheless, pathogens can still enter CWSs due to resistance to disinfection (e.g., Giardia, Cryptosporidium, and enteric viruses), treatment system deficiencies (e.g., inadequate disinfection), periodic treatment failures, or distribution system contamination.4
© 2015 American Chemical Society
Received: August 4, 2014. Revised: July 1, 2015. Accepted: July 13, 2015. Published: July 13, 2015. DOI: 10.1021/acs.est.5b01898. Environmental Science & Technology (Environ. Sci. Technol.) 2015, 49, 10019−10027.
The magnitude of current waterborne disease risk in U.S. communities with CWSs is not well quantified.5−7 Although the U.S. Centers for Disease Control and Prevention (CDC) has long tracked waterborne disease outbreaks, most outbreaks are unrecognized, because they are difficult to separate from background disease rates and because outbreak reporting is voluntary.7,8 Three previous groups have sought to estimate the U.S. national disease burden associated with microbial contamination of drinking water. Responding to requirements in the SDWA Amendments of 1996, EPA in 2006 developed an approach, which we refer to as the drinking water attributable risk (DWAR) method, to estimate the national rate of acute gastrointestinal illness (AGI) attributable to microbial contamination in CWSs.9 Using this method, the EPA attributed 16.4 million AGI cases annually to CWS contamination.9 Using a similar method, Colford et al. estimated a waterborne AGI rate among CWS customers of 4.26−11.7 million cases per year.10 Reynolds et al. in 2008 used quantitative microbial risk assessment (QMRA) to estimate that 18.4 million AGI cases annually are attributable to CWS contamination.6 Each of these groups presented only national-scale estimates rather than estimates at the finer spatial resolution needed to inform drinking water management decisions. In addition, the estimates were based on assumed rather than measured water quality characteristics, and the analyses did not systematically evaluate associations between observed CWS water quality and health outcomes.
Improved methods for estimating the disease burden attributable to microbial contamination of CWSs are needed to identify CWSs and geographic regions where interventions may be needed.7,11 Toward this end, this paper compares three different methods for estimating waterborne disease risks in CWSs. As a basis for comparison, we apply each method to quantify the average annual AGI rate attributable to microbial contamination in the 2,120 CWSs in North Carolina (NC) (Supporting Information, SI, Table S1) over the time period 2007−2013. We incorporate the extensive water quality monitoring data available for NC CWSs under the SDWA, along with county-level public health data collected by the NC Division of Public Health (NCDPH). We provide results by county and for the entire state. The three methods compared are the DWAR, QMRA, and population intervention model (PIM) approaches:
DWAR. The DWAR method (the basis for the previously referenced EPA national risk estimate)9 assumes there is a fixed probability distribution of endemic waterborne disease risk that maps directly to the probability distribution of microbiological water quality among CWSs.
A major uncertainty is the reliance on a distribution of endemic disease estimated from two previous studies in Laval, Canada, in which some water system customers were provided with household water filters, while others were provided with sham filters, and AGI rates between the two groups were compared. The method assumes the probability distribution of endemic waterborne disease among U.S. CWS customers is the same as that observed in the Laval studies.
QMRA. QMRA (the basis for the previous risk estimate by Reynolds et al.)6 couples estimated pathogen concentrations to pathogen dose-response functions. Major uncertainties include the need to infer pathogen concentrations from fecal indicator bacteria concentrations, since CWSs do not monitor for pathogens, and the extrapolation of dose-response functions from studies of healthy human volunteers to the general population.
PIM. The PIM method is widely used to estimate the benefits of public health interventions (such as smoking cessation programs),12−14 and a recent review by Clasen et al.15 recommended considering it for future estimates of the global disease burden associated with inadequate water and sanitation. However, the PIM method has not been previously applied to quantify U.S. waterborne disease risks.15 The method fits a regression model to data on exposure to a particular risk factor (in this case, microbiologically contaminated drinking water) and disease incidence. The preventable disease burden is estimated by using the regression model to estimate the change in health outcomes if a portion of the population is shifted from a higher to a lower risk factor exposure level (such as from a higher to a lower concentration of microbial contaminants in drinking water). Limitations arise when exposure and health outcome data are available only at the group level (for example, among communities) and not at the individual level.
The present study is the first to employ the PIM approach to estimate U.S. waterborne disease risks. In addition, this study is the first to compare all three methods. In implementing the PIM, we also provide the first county-level statistical analysis of associations between NC emergency department (ED) visits for AGI and CWS violations of SDWA microbiological standards.
■
METHODS Data. Water Quality Data. The SDWA requires all CWSs to monitor total coliform bacteria throughout their distribution systems as indicators of potential fecal contamination. The minimum number of samples required each month is based on population served, but even the smallest systems must analyze at least one sample per month.3 If more than 5% of samples over a 30-day period test positive for total coliforms, then the system violates the monthly maximum contaminant level (MCL).3 Follow-up analysis for E. coli or fecal coliforms is required for any sample testing positive for total coliforms; a positive result indicates the system has violated the acute MCL. The NC Department of Environment and Natural Resources (NCDENR) provided the results of all microbiological water quality analyses conducted under the SDWA for all 2,120 active NC CWSs for January 1, 2006−December 31, 2013.16 For each CWS, the data set included the sample date, indicator microbes analyzed, and test results (positive or negative). From these data, we computed the number of MCL violations in each month for each CWS (see SI Table S2 for aggregated results by county). Because the health data used for this analysis (see below) covered the 82-month time period January 1, 2007−October 10, 2013, this same time period of water quality data was used in the analysis.
Water quality data from private drinking water wells (used by those without CWS service) were needed as a covariate in the PIM model. We obtained private well data for January 1, 2009−December 31, 2013, from the NC State Laboratory of Public Health.17 Since 2008, NC has required all newly drilled private wells to be tested for total coliforms and E. coli. The data set included 16 138 observations with well location (county), test date, microbial indicators assessed, and test results. Data were received for 91 of the 100 NC counties; the remaining 9 counties were excluded when fitting the regression model used in the PIM method.
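The two violation rules described above can be sketched in a few lines of Python. This is only an illustration of the rules as stated in the text, not the processing applied to the NCDENR data set; the function and argument names are hypothetical, and the 5% threshold is applied literally, without any special handling for systems that collect few samples.

```python
def classify_system_month(n_samples, n_tc_positive, n_fecal_positive):
    """Flag MCL violations for one CWS in one month (illustrative only).

    n_samples        -- distribution-system samples analyzed that month
    n_tc_positive    -- samples positive for total coliforms
    n_fecal_positive -- total-coliform-positive samples whose follow-up
                        test was positive for E. coli or fecal coliforms
    """
    if n_samples <= 0:
        raise ValueError("at least one sample is required each month")
    if not 0 <= n_fecal_positive <= n_tc_positive <= n_samples:
        raise ValueError("counts must satisfy fecal <= TC-positive <= samples")
    # Monthly MCL: more than 5% of samples positive for total coliforms.
    # Integer arithmetic avoids floating-point ties at exactly 5%.
    monthly = 100 * n_tc_positive > 5 * n_samples
    # Acute MCL: any positive E. coli / fecal coliform follow-up result.
    acute = n_fecal_positive > 0
    return {"monthly_mcl": monthly, "acute_mcl": acute}
```

Summing these flags over systems and months would give violation counts of the kind aggregated by county in SI Table S2.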
Data on the size of the population relying on private wells (and hence without CWS access) in each county were obtained from the U.S. Geological Survey.18 AGI Incidence Data. We used emergency department (ED) visits for AGI as a proxy for AGI incidence, while recognizing that only a fraction of those with AGI seek treatment in an ED. ED visit data for January 1, 2007−October 31, 2013 were extracted from the NC Disease Event Tracking and Epidemiologic Collection Tool (NCDETECT), which includes all 122 NC EDs.19 In keeping with prior research,20,21 AGI visits were defined using the following primary and secondary diagnostic codes: infectious GI illness (001 to 009), noninfectious GI illness (558.9), and nausea and vomiting (787.01−787.03, 787.91). In total, 2 769 620 such cases were reported during this time period, an average of 405 000 (SD = 38 500) per year. Records included the patient’s county of residence. Residence location rather than ED location was used in all analyses.
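The case definition above amounts to a simple code filter. The Python sketch below is one hedged way to express it, assuming the diagnostic codes are ICD-9-CM strings written with a decimal point and zero-padded three-digit categories (for example "008.8"), and assuming every subcode under categories 001 to 009 counts as infectious GI illness. The function names are hypothetical and do not come from NCDETECT.

```python
# Codes that must match exactly (noninfectious GI illness; nausea and vomiting).
EXACT_AGI_CODES = {"558.9", "787.01", "787.02", "787.03", "787.91"}


def is_agi_code(code):
    """Return True if a single diagnostic code falls in the AGI definition."""
    code = code.strip()
    category = code.split(".")[0]
    # Infectious GI illness: any code in categories 001 through 009 (assumed
    # to include all subcodes, e.g. 008.8).
    if len(category) == 3 and category.isdigit() and 1 <= int(category) <= 9:
        return True
    return code in EXACT_AGI_CODES


def is_agi_visit(diagnosis_codes):
    """Return True if any primary or secondary code on the visit is an AGI code."""
    return any(is_agi_code(code) for code in diagnosis_codes)
```

Applied to each ED record, a filter of this kind yields the visit counts that are then grouped by the patient's county of residence.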
The QMRA required national AGI incidence rate information. We estimated the mean national AGI rate from previous studies based on telephone surveys by the CDC under the Foodborne Disease Active Surveillance Network (FoodNet) Program.22−25 Mean AGI rates in these studies ranged from 0.55 to 0.72 cases per person-year.22,25 Therefore, we represented the total U.S. AGI rate as a uniform distribution with parameters (0.55, 0.72).
Demographic Data. County-level demographic data (population, poverty rate, and health insurance access) were obtained from the American Community Survey 2008−2012 5-Year Summary File.26
Models. DWAR. The DWAR approach matches the distribution of microbial contamination in CWSs to an assumed probability distribution representing the fraction of AGI cases attributable to drinking water contamination (represented as AF).9 We replicated EPA’s process for estimating an AF probability distribution (SI Figure S1) by following the simulation steps in Messner et al., Appendix A.9 To compute the fraction of AGI associated with contamination in each CWS, we first developed an empirical cumulative distribution function (CDF) of monthly MCL violation rates among all 2120 NC CWSs using the NCDENR data (SI Figure S2). Next, we matched the location of each CWS on this CDF to the equivalent location on the AF CDF (SI Figure S1). For each county, we then calculated the population-weighted average of the AFs across all CWSs. We multiplied the result by the county-specific yearly AGI ED visit rate to estimate the visits attributable to CWSs.
QMRA.
Since QMRA is pathogen-specific, previous QMRAs have quantified risks of pathogens in drinking water by selecting one or two reference organisms from each major pathogen group (protozoan parasites, bacteria, and viruses).6,27 Consistent with these previous QMRAs,6 parasites were represented by Giardia, nonlegionella bacteria by Campylobacter, and viruses by rotavirus.5 Dose-response and morbidity information for each pathogen were drawn from previous QMRAs (SI Table S3).28−32 Because the SDWA does not require CWSs to monitor for pathogens, we estimated pathogen exposure from microbial indicator data by multiplying E. coli concentrations by pathogen-to-E. coli ratios derived from previous studies that estimated probability distributions of these ratios in surface water and groundwater (SI Table S3). These ratios have been used in previous QMRA models.27 Data to simulate the ratios were obtained from the author of a previous QMRA, who had conducted a literature review of prior studies of pathogen-to-E. coli ratios.33 The ratios were assumed to be lognormally distributed with mean and standard deviation varying by organism and water source. For surface water systems, means of 0.033 (SD = 0.024), 0.20 (SD = 0.13), and 0.0070 (SD = 0.0045) were assumed for Giardia, Campylobacter, and rotavirus, respectively. The rotavirus-to-E. coli ratio in groundwater was assumed to have mean value 0.36 (SD = 1.4). The mean E. coli concentration was represented as a Poisson distribution with mean value estimated from monthly CWS sampling data and previous U.S. Geological Survey (USGS) data on total coliform and E. coli concentrations in surface water and groundwater samples.34,35 As the first step in this estimation process, the mean monthly total coliform concentration was estimated from monthly presence−absence samples using a maximum likelihood approach:36
$$\mu_{TC,i,j} = -\frac{1}{V}\,\ln\!\left(\frac{n_{i,j} - p_{i,j}}{n_{i,j}}\right) \qquad (1)$$
where μTCi,j is the mean concentration of total coliforms in CWS i during month j, V is the volume of water sampled, ni,j is the number of samples collected in the distribution system of CWS i during month j and pi,j is the corresponding number of positive samples. The mean E. coli concentrations, μi,j, in each CWS and month were then estimated by multiplying the result of eq 1 by a probability distribution representing the ratio of E. coli to total coliform concentrations (from USGS data34,35) and by the fraction of total coliform samples testing positive for E. coli in the given CWS and month. SI Table S3 provides details of these calculations. The number of pathogens ingested by an individual was computed as follows:
$$P_{exposure,d,i,j} = \mu_{i,j}\, R_{pathogen}\, I \qquad (2)$$
where Pexposure,d,i,j is the number of pathogens ingested by a random customer of CWS i on day d in month j, μi,j is the mean E. coli concentration in CWS i during month j, Rpathogen is the lognormally distributed ratio of pathogens to E. coli, and I is a log-normal distribution representing daily tap water consumption (mean = 1.129 L; SD = 0.674).37 Daily infection risk (Pinf,d,i,j) was simulated using dose-response models from previous studies (SI Table S4).27,28,30,32 Daily illness probability (Pill,d,i,j) was calculated by multiplying daily infection risk (Pinf,d,i,j) by morbidity ratios from previous studies (SI Table S4).28−32 The risk of illness per fecal contamination event (Pill,e,i,j) was then computed as follows:
$$P_{ill,e,i,j} = 1 - \left(1 - P_{ill,d,i,j}\right)^{t} \qquad (3)$$
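The chain from eq 1 through eq 3 can be sketched as a small Monte Carlo loop in Python. This is a simplified illustration rather than the authors' model: the exponential dose-response form and its parameter `r_dose`, the `morbidity` ratio, and the input E. coli concentration are hypothetical placeholders (the actual models and values are in SI Tables S3 and S4), the reported means and SDs are assumed to be arithmetic moments of the lognormal distributions, the exponent t in eq 3 is taken to be the event duration in days, and the Poisson variability in E. coli concentration is omitted.

```python
import math
import random


def mean_total_coliform(n_samples, n_positive, volume_l):
    """Eq 1: maximum likelihood mean concentration from presence-absence data."""
    if n_positive >= n_samples:
        raise ValueError("estimate is unbounded when every sample is positive")
    return -math.log((n_samples - n_positive) / n_samples) / volume_l


def lognormal_params(mean, sd):
    """Convert an arithmetic mean and SD to log-scale (mu, sigma)."""
    sigma2 = math.log(1.0 + (sd / mean) ** 2)
    return math.log(mean) - sigma2 / 2.0, math.sqrt(sigma2)


def event_illness_risk(p_ill_daily, duration_days):
    """Eq 3: illness risk over a contamination event of the given duration."""
    return 1.0 - (1.0 - p_ill_daily) ** duration_days


def simulate_event_risk(mu_ecoli, ratio_mean, ratio_sd, r_dose, morbidity,
                        n_iter=5000, seed=1):
    """Average per-event illness risk for one system-month (illustrative)."""
    rng = random.Random(seed)
    ratio_mu, ratio_sigma = lognormal_params(ratio_mean, ratio_sd)
    intake_mu, intake_sigma = lognormal_params(1.129, 0.674)  # L/day, from the text
    dur_mu, dur_sigma = lognormal_params(3.95, 6.77)          # days, from the text
    total = 0.0
    for _ in range(n_iter):
        # Eq 2: E. coli concentration x pathogen ratio x daily water intake.
        dose = (mu_ecoli
                * rng.lognormvariate(ratio_mu, ratio_sigma)
                * rng.lognormvariate(intake_mu, intake_sigma))
        # ASSUMPTION: exponential dose-response with hypothetical parameter r_dose.
        p_inf = 1.0 - math.exp(-r_dose * dose)
        p_ill = p_inf * morbidity
        duration = rng.lognormvariate(dur_mu, dur_sigma)
        total += event_illness_risk(p_ill, duration)
    return total / n_iter
```

For example, `simulate_event_risk(0.5, 0.033, 0.024, r_dose=0.02, morbidity=0.5)` combines the surface water Giardia ratio quoted in the text with placeholder values for everything else; multiplying the result by a system's customer population gives an expected case count per event.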
Here, t is the duration of the contamination event. We represented t as lognormally distributed with parameters (mean = 3.95 days, SD = 6.77 days) estimated from the CWS water quality data set.16 The number of cases per event was calculated by multiplying the CWS customer population by Pill,e for each monitoring event. Results for each CWS were then aggregated to the county level. The proportion of AGI cases attributable to CWS contamination was estimated by dividing the resulting estimate by the number of cases expected based on the national AGI incidence rate. This fraction was then multiplied by the number of ED visits in the county, in order to estimate an attributable ED visit rate.
PIM. To implement the PIM, a panel structure log-Poisson regression model with temporal autocorrelation and a log-person-month offset was fitted to monthly county-level health outcome and water quality data. The final model form is as follows:
$$\ln\!\left(\frac{Y_{i,j}}{N_{i,j}}\right) = \alpha + \beta_1 C_{CWS,i,j} + \beta_2 E_{CWS,i,j} + \beta_3 C_{DWS,i} + \beta_4 Pov_i + \beta_5 ED_i + \beta_6 I_i + \sum_{l=7}^{9} \beta_l R_l + \sum_{m=10}^{20} \beta_m t_m + \mu_j \qquad (4)$$
where Yi,j is the number of AGI ED visits in county i during month j, Ni,j is the county population, α is a constant, CCWS,i,j is the population proportion exposed to a monthly MCL violation during month j (determined by assuming all customers of a CWS with monthly MCL violations were exposed), ECWS,i,j is the population proportion exposed to an acute MCL violation, CDWS,i is the population proportion potentially exposed to coliform bacteria in private wells (determined by multiplying the fraction of private wells testing positive by the proportion of the county population served by private wells), Povi is the proportion of the population in poverty, EDi is a binary indicator of whether or not the county has an emergency department, Ii is a binary indicator of whether the county health uninsurance rate exceeds the median rate for North Carolina (16%), Rl indicates the region of the state where the county is located (Coastal Plain, Piedmont, or Mountain), and tm is an indicator of month (with January serving as the reference month). The first-order autoregressive error term is represented as μj, where
$$\mu_j = \mu_{j-1} + \varepsilon_j \qquad (5)$$
and the εj are assumed to be independent with a mean of zero. A generalized estimating equations approach in STATA IC 12 (College Station, TX) was used to fit regression models to the data. Multiple model forms (including alternative explanatory variables) were tried, and the model with the lowest quasi-Akaike Information Criterion (QIC) value was selected. In addition to the variables shown in eq 4, explanatory variables considered included monthly precipitation and racial composition. In addition, models excluding the emergency department and health insurance variables were tested. Additional specifications for temporal variability (other than indicator variables for month of the year) also were tried and are discussed in the sensitivity analysis.
Table 1. Associations Between AGI ED Visits and Model Covariates Used in the PIM log-Poisson Regression Model
variable | Beta | (95% CI)
alpha (α) | −5.92 | (−5.96 to −5.88)
fraction of population exposed to monthly MCL violation (CCWSi,j) | 0.00666 | (0.00327−0.0100)
fraction of population exposed to acute MCL violation (ECWSi,j) | 0.0598 | (0.0520−0.0677)
fraction of population exposed to total coliform bacteria in private wells (%) (CDWSi) | 0.797 | (0.719−0.874)
fraction of population living in poverty (%) (Povi) | 2.49 | (2.36−2.62)
county has an emergency department (binary) (EDi) | 0.100 | (0.0701−0.131)
greater than 16% of population uninsured (binary) (Ii) | −0.267 | (−0.282 to −0.252)
region (R): coastal plain | referent |
region (R): piedmont | −0.117 | (−0.1295 to −0.1045)
region (R): mountain | −0.4961 | (−0.5204 to −0.4718)
month (m): January | referent |
month (m): February | 0.02762 | (0.02584−0.0294)
month (m): March | 0.09786 | (0.0955−0.1002)
month (m): April | −0.08354 | (−0.08641 to −0.0807)
month (m): May | −0.1337 | (−0.137 to −0.131)
month (m): June | −0.1924 | (−0.196 to −0.189)
month (m): July | −0.186 | (−0.189 to −0.183)
month (m): August | −0.1789 | (−0.182 to −0.175)
month (m): September | −0.1862 | (−0.189 to −0.183)
month (m): October | −0.1715 | (−0.175 to −0.168)
month (m): November | −0.167 | (−0.170 to −0.164)
month (m): December | −0.04821 | (−0.0502 to −0.0463)
Using the fitted regression model (Table 1), AGI cases in each county were estimated under two scenarios: current conditions and a counterfactual scenario wherein no SDWA violations occur (CCWS,i,j = ECWS,i,j = 0). Cases under current conditions were computed by using all parameters in the regression model to estimate Yi,j. Cases under the counterfactual scenario (Yi,j,counterfactual) were estimated as follows:
$$\ln\!\left(\frac{Y_{i,j,counterfactual}}{N_{i,j}}\right) = \alpha + \beta_3 C_{DWS,i} + \beta_4 Pov_i + \beta_5 ED_i + \beta_6 I_i + \sum_{l=7}^{9} \beta_l R_l + \sum_{m=10}^{20} \beta_m t_m + \mu_j \qquad (6)$$
Next, the fraction of cases attributable to microbial contamination from CWSs was estimated as follows:
$$AF_{CWS,i,j} = 1 - \frac{\exp\!\left(\ln\!\left(Y_{i,j,counterfactual}/N_{i,j}\right)\right)}{\exp\!\left(\ln\!\left(Y_{i,j}/N_{i,j}\right)\right)} \qquad (7)$$
where AFCWS,i,j is the fraction of AGI cases attributable to microbial contamination of CWSs in county i during month j. We then multiplied AF by the observed number of ED visits for month j.
Uncertainty Analysis. 3 (Lumina Decision Systems, Los Gatos, CA). For the DWAR and QMRA methods, simulations were conducted at the level of the individual CWS, and the resulting disease burden estimates were summed across all systems for each county and for the state as a whole. Repeat runs of 5000 iterations each confirmed that the mean and standard deviation of estimated cases converged to at least two significant figures for each modeling approach. For the DWAR method, uncertainty was estimated by assigning a uniform distribution to the percentile ranking of each water system according to total MCL violations (monthly and acute) over the study period. For each CWS, the lower and upper bounds of this uniform distribution were determined from the lower and upper bounds of the system’s position on the histogram of MCL violations among all systems (SI Figure S2). For example, 67% of all CWSs reported zero MCL violations. Therefore, the percentile location of these systems on the AF probability distribution (SI Figure S1) was
Table 2. Annual AGI ED Visits Potentially Attributable to Microbial Contaminants in NC CWSs
method | AGI ED visits potentially attributable to microbial contaminants (95% CI)a | percent of total AGI ED visits potentially attributable to microbial contaminants (95% CI)
Quantitative Microbial Risk Assessment (QMRA) | 190 (3.0−1000) | 0.047% (7.4 × 10^−4−0.26)
Drinking Water Attributable Risk (DWAR) | 32 000 (31 000−33 000) | 8.0% (7.7−8.2)
Population Intervention Model (PIM) | 190 (120−240) | 0.046% (0.031−0.060)
a Average annual number of cases estimated to be attributable to microbial contamination of community water supply systems between years 2007 and 2013.
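The PIM entries in Table 2 rest on the attributable fraction of eq 7. Because the counterfactual model (eq 6) differs from the fitted model (eq 4) only by dropping the two CWS exposure terms, every other term cancels in the ratio, and eq 7 reduces to AF = 1 − exp(−(β1 CCWS + β2 ECWS)). The Python sketch below evaluates that reduced form with the two coefficients from Table 1. It assumes the exposure variables are proportions between 0 and 1, and the exposure levels and visit count in the usage line are hypothetical, not values from the study data.

```python
import math

BETA_MONTHLY_MCL = 0.00666  # beta_1 from Table 1 (monthly MCL violation exposure)
BETA_ACUTE_MCL = 0.0598     # beta_2 from Table 1 (acute MCL violation exposure)


def attributable_fraction(c_cws, e_cws):
    """Reduced form of eq 7 for one county-month.

    c_cws -- proportion of the population exposed to a monthly MCL violation
    e_cws -- proportion of the population exposed to an acute MCL violation
    """
    linear_shift = BETA_MONTHLY_MCL * c_cws + BETA_ACUTE_MCL * e_cws
    return 1.0 - math.exp(-linear_shift)


def attributable_visits(c_cws, e_cws, observed_visits):
    """Attributable ED visits: AF multiplied by the observed monthly visits."""
    return attributable_fraction(c_cws, e_cws) * observed_visits


# Hypothetical county-month: 5% of residents exposed to a monthly violation,
# 0.2% to an acute violation, and 400 observed AGI ED visits.
print(attributable_visits(0.05, 0.002, 400))
```

Because both coefficients are small, the attributable fraction stays well below 1% unless a large share of a county's population is exposed, which is consistent in magnitude with the PIM row of Table 2.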
Figure 1. Ranking of counties by annual rate of emergency department (ED) visits for acute gastrointestinal illness (AGI) attributable to community water systems using (A) the drinking water attributable risk (DWAR) method, (B) quantitative microbial risk assessment (QMRA), and (C) a population intervention model (PIM). A ranking of 100 indicates the highest rate of AGI attributable to contamination in community water systems among North Carolina counties. Ranks correspond to the following ranges of expected annual attributable cases per person: DWAR method: 1−25: