
Article pubs.acs.org/est

Linking Quantitative Microbial Risk Assessment and Epidemiological Data: Informing Safe Drinking Water Trials in Developing Countries

Kyle S. Enger,† Kara L. Nelson,‡ Thomas Clasen,§ Joan B. Rose,† and Joseph N. S. Eisenberg*,∥

† Department of Fisheries and Wildlife, 13 Natural Resources Building, Michigan State University, East Lansing, Michigan, 48824, United States
‡ Department of Civil and Environmental Engineering, 760 Davis Hall, University of California, Berkeley, California, 94720, United States
§ Department of Disease Control, London School of Hygiene & Tropical Medicine, North Courtyard, Third Floor, Keppel Street, University of London, London, WC1E 7HT, United Kingdom
∥ Department of Epidemiology, University of Michigan School of Public Health, M5065 SPH II, 1415 Washington Heights, Ann Arbor, Michigan, 48109, United States

Supporting Information

ABSTRACT: Intervention trials are used extensively to assess household water treatment (HWT) device efficacy against diarrheal disease in developing countries. Using these data for policy, however, requires addressing issues of generalizability (relevance of one trial in other contexts) and systematic bias associated with design and conduct of a study. To illustrate how quantitative microbial risk assessment (QMRA) can address water safety and health issues, we analyzed a published randomized controlled trial (RCT) of the LifeStraw Family Filter in the Congo. The model accounted for bias due to (1) incomplete compliance with filtration, (2) unexpected antimicrobial activity by the placebo device, and (3) incomplete recall of diarrheal disease. Effectiveness was measured using the longitudinal prevalence ratio (LPR) of reported diarrhea. The Congo RCT observed an LPR of 0.84 (95% CI: 0.61, 1.14). Our model predicted LPRs, assuming a perfect placebo, ranging from 0.50 (2.5−97.5 percentile: 0.33, 0.77) to 0.86 (2.5−97.5 percentile: 0.68, 1.09) for high (but not perfect) and low (but not zero) compliance, respectively. The calibration step provided estimates of the concentrations of three pathogen types (modeled as diarrheagenic E. coli, Giardia, and rotavirus) in drinking water, consistent with the longitudinal prevalence of reported diarrhea measured in the trial, and constrained by epidemiological data from the trial. Use of a QMRA model demonstrated the importance of compliance in HWT efficacy, the need for pathogen data from source waters, the effect of quantifying biases associated with epidemiological data, and the usefulness of generalizing the effectiveness of HWT trials to other contexts.



INTRODUCTION

Diarrhea is a major cause of infectious disease mortality, accounting for 17% of deaths in children under 5 years of age; only pneumonia accounts for a similarly high share of mortality in this age group.1 Diarrheal mortality has declined from approximately 5 million in 19802 to 2 million in 2000 and 2004.3,4 However, the incidence of diarrhea has remained at 2−3 episodes per child-year from 1980 to 2000.2,5,6 Contaminated drinking water is an important route of transmission for diarrheal pathogens. Recent reviews indicate that household water treatment (HWT) interventions, which can improve microbiological quality at the point of use, can be more protective against diarrhea than interventions at the water source in the developing world.7−10 HWT addresses not only contamination of the source water, but also recontamination during collection, transport, and storage in the home.11 The long-term sustainability and scalability of HWT remain important issues of discussion.

© 2012 American Chemical Society

Received: December 10, 2011. Revised: April 3, 2012. Accepted: April 9, 2012. Published: April 9, 2012. dx.doi.org/10.1021/es204381e | Environ. Sci. Technol. 2012, 46, 5160−5167

The randomized controlled trial (RCT) is considered the gold standard study design in epidemiology; it is the study design with the least systematic bias, and therefore the highest internal validity. Two important components of RCT design for internal validity are the randomization of subjects to the intervention and the nonintervention groups, and blinding of the subject and investigator to group assignment. It is difficult to blind HWT interventions because these devices are visually obvious and cannot be concealed from participants or investigators. It is also difficult to develop a placebo HWT filter that does not remove pathogens, but improves the appearance of water like an effective filter.12 Other biases may also affect the internal validity of an estimate derived from the trial, such as recall bias, incomplete compliance with the intervention, or unexpected difficulties conducting the trial.12

In a recent RCT12 in rural communities in the Democratic Republic of the Congo (DRC) using the LifeStraw Family Filter (LFF; Vestergaard Frandsen Corporation, Lausanne, Switzerland), investigators attempted to blind the intervention. The LFF is an ultrafilter with a 20-nm pore size that was shown to remove 99.99999% of Escherichia coli, 99.998% of Cryptosporidium oocysts, and 99.97% of MS2 coliphage from challenge water in the laboratory.13 For the Lifestraw RCT, investigators developed a placebo filter resembling the LFF in appearance, weight, operation, and flow rate.12 The placebo was tested in the laboratory for three weeks against the same three organisms, and no removal was observed. In the field, however, the intended placebo removed on average 91% (95% CI: 88−93%) of thermotolerant coliform bacteria (TTC), a group that includes E. coli and indicates fecal contamination, from source water.12 Therefore, the study could only compare a highly effective filter with a poorly effective filter. Although 65% of people reported using the filter, most filter users also reported drinking unfiltered water.12 The proportion of unfiltered water that people consumed was not quantified. The Lifestraw RCT did not find a statistically significant (P < 0.05) effect of the LFF against diarrhea.12

Quantitative microbial risk assessment (QMRA) models can examine and account for biases associated with environmental intervention trials (e.g., imperfect compliance, recall bias, or an imperfect placebo) and can explore risks associated with different contexts from those observed in empirical studies. Such models can provide a conceptual framework for understanding systems that are difficult to explore in the real world.
QMRA models have been used to quantify disease risk in many contexts.14−16 Our analytic framework for HWT and approach to link QMRA and epidemiological data are unique and consist of (1) a calibration step using a QMRA model to produce results consistent with the epidemiological study; and (2) an estimation step that examines counterfactual scenarios that adjust for biases within the study and explores how altered contexts affect risk. The effectiveness of an intervention in those contexts can then be estimated, even if it was never directly studied under such conditions. In this manuscript we develop a counterfactual causal inference framework using a QMRA model to evaluate the impact of biases on estimates of intervention efficacies. We illustrate the approach by simulating the Lifestraw RCT12 and adjusting for some of its biases, to estimate the effectiveness of the LFF compared with a perfect placebo under differing levels of LFF compliance.

Figure 1. Conceptual model linking QMRA models to epidemiological studies. The results from an actual epidemiology study define a context (e.g., the LifeStraw Family Filter randomized controlled trial [Lifestraw RCT] in rural Congo). The calibration phase generates a set of simulated studies that are consistent with this defined context. Calibration also estimates values for parameters that were not observed during the real study, thus inferring unobserved context of the real study. The estimation phase generates simulated studies that are generalized to other contexts (e.g., higher or lower compliance than was observed during the Lifestraw RCT). For more detail, see Supporting Information (SI), Figure S1.
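The calibration phase described in the caption resembles rejection sampling: draw parameter values from a range, simulate a study, and keep only the draws whose output is consistent with the observed trial. The sketch below illustrates that idea only and is not the authors' model. `simulate_lpr` is a hypothetical toy stand-in for the QMRA simulation, the concentration range is invented, and the acceptance window simply reuses the trial's reported 95% CI for the LPR (0.61, 1.14) as an assumed consistency criterion.

```python
import random


def simulate_lpr(conc, rng):
    """Hypothetical stand-in for one simulated trial (NOT the QMRA model).

    Returns a toy longitudinal prevalence ratio that falls as the
    waterborne share of diarrhea rises with pathogen concentration.
    """
    waterborne_share = conc / (conc + 1.0)   # toy saturating curve in [0, 1)
    noise = rng.gauss(0.0, 0.05)             # toy run-to-run variability
    return max(0.0, 1.0 - 0.5 * waterborne_share + noise)


def calibrate(n_runs, lpr_low, lpr_high, conc_upper, seed=0):
    """Keep parameter draws whose simulated LPR lies in [lpr_low, lpr_high]."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_runs):
        conc = rng.uniform(0.0, conc_upper)  # uniform draw over an assumed range
        lpr = simulate_lpr(conc, rng)
        if lpr_low <= lpr <= lpr_high:
            accepted.append((conc, lpr))
    return accepted


# Assumed acceptance window: the trial's reported 95% CI for the LPR.
accepted = calibrate(n_runs=10_000, lpr_low=0.61, lpr_high=1.14, conc_upper=5.0)
print(f"{len(accepted)} of 10000 runs accepted")
```

The accepted draws play the role of the calibrated model: their distribution describes the parameter values compatible with the observed study.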

MATERIALS AND METHODS

Conceptual Framework Linking QMRA Models to Epidemiological Studies. Quantitative microbial risk assessment (QMRA) uses environmental contamination data as input to models used to predict risk of infection or disease. Epidemiological studies provide data on patterns of disease measured by incidence or prevalence and measures of relative risk. Here we provide a framework for the calibration of risk models by using epidemiological data from a particular study that describes the risk in a particular context, where the context is defined by a particular time in a particular geographic setting (Figure 1). The calibration process involves simulating a risk model many times using different input and parameter values. The parameter sets (or parameter distributions) representing simulations that are consistent with the epidemiological study comprise the calibrated model; the parameter distributions provide a representation of the context in which the epidemiological study was conducted. Using this calibrated model, the epidemiological study can be generalized to other contexts in what we call the estimation step. This estimation step, therefore, consists of a set of simulations in which specific parameter values are varied to describe different contexts, such as alternative intervention strategies or different ecological or social settings.

For the research described herein, a QMRA model was developed that simulates the following chain of events:

1. Determination of the concentrations of three pathogen types (bacteria, protozoa, and viruses) in drinking water, sampled from gamma distributions
2. Calculation of daily doses of pathogens based on their concentrations and the amount of water consumed
3. Use of dose response functions to convert daily doses of pathogens to probabilities of infection
4. Assignment of infection to individuals, based on the probabilities of infection
5. Assignment of diarrheal illness, based on morbidity ratios

The same conceptual approach illustrated in Figure 1 could also be applied to more complex models including processes such as transmission dynamics17,18 or environmental fate and transport dynamics.19

Case Study Model Description. The model describing the Lifestraw RCT conducted in the Congo12 follows a simulated population of children under 5 years of age for 12 months using a time unit of 1 day (for details, see Supporting Information (SI), Section A and Figure S1). The population is surveyed about their diarrheal symptoms every 4 weeks, similar to the Lifestraw RCT. The simulated children ingest bacteria, protozoa, and viruses in their drinking water, respectively represented by diarrheagenic Escherichia coli, Giardia cysts, and rotavirus. These three pathogens were chosen because they are major causes of diarrheal disease in much of the developing world, and they represent the three main taxa of waterborne pathogens.20 A child is either susceptible to, immune to, or infected by each of these three pathogens; we assume that the infective processes of each pathogen are independent of each other, and a child may therefore be infected with 0, 1, 2, or 3 types of pathogens simultaneously. Children are divided into two groups. One group receives the intervention filter, with log10 removal values for E. coli, Giardia, and rotavirus set to 6.9, 3.6, and 4.7, respectively, based on laboratory testing.13 The other group receives the placebo filter, with log10 removal values for the calibration step set to 1.05 for all three pathogens (as a comparison during the analysis), based on coliform removal results from the Lifestraw RCT.12 Additional discussion of the rationale for these values is provided in the SI, Section C5.

Compliance. Children’s compliance with water filtration is described by two parameters: probability of using their filter, and proportion of water treated if using the filter. Low compliance was defined as 65% of children treating 1/3 of their drinking water; medium compliance was 65% treating 2/3; high compliance was 65% treating 100%; and perfect compliance was 100% treating 100%. Although the proportion of children who were treating water was estimated at 65% during the Lifestraw RCT, the proportion of water treated was not measured, so 1/3, 2/3, and 100% were chosen for illustration.

Estimation of Environmental Concentration and Daily Dose. The daily dose of each pathogen type for each person is determined as follows:

$$\text{Daily dose} = c\,d\,\left[(1 - w) + w\,10^{-r}\right] \tag{1}$$

where c is the concentration per liter of a pathogen type in untreated water (sampled from a gamma distribution), d is the liters of water consumed daily, w is the proportion of water treated (which varies depending on compliance), and r is the log10 reduction value (which varies depending on whether the intervention filter, the placebo filter, or no filter is being used).

Dose Response (Assignment of Infection and Disease). The daily dose is converted to a probability of infection using a dose response function21 (SI, Section A5 and Figure S4), for children who are susceptible to that pathogen type. The probability of infection is then used to randomly determine which children become infected on that day. The duration of infection is determined by sampling from appropriate distributions (SI, Sections A7 and C2). Morbidity ratios (proportion of infected who have diarrhea) are used to randomly determine diarrheal illness given infection. Immunity is incorporated in the model in two ways: (1) complete immunity to infection from that pathogen type for 7 days after recovery from infection, followed by complete susceptibility; (2) within the morbidity ratios, because immunity to some diarrheal pathogens confers resistance to illness rather than immunity to infection (SI, Sections C1 and E4).22−24 The morbidity ratios were obtained from studies of diarrheal disease in developing countries.25−27

The model estimates reported diarrhea in a manner similar to the Lifestraw RCT. It simulates a survey every 30 days that asks every child whether they had any diarrhea during the previous 7 days. The model assumes recall is perfect for the first 2 days and declines thereafter (SI, Section A8) to adjust for recall bias. We assume recall bias is nondifferential because the Lifestraw RCT was blinded.

The primary output of the model is longitudinal prevalence (LP) of diarrhea. LP is person-time diseased divided by person-time observed, as determined by the simulated surveys. This is analogous to the LP measured by the Lifestraw RCT. Two measures of reported waterborne diarrhea are generated: the LP of reported waterborne diarrhea in the intervention group (LPIrwd) and the placebo group (LPPrwd). A third measure, LPrNW, is the LP of reported diarrhea acquired by nonwaterborne routes. Combining LPrNW with LPIrwd and LPPrwd yields the LP of all reported diarrhea in the intervention group (LPIrad) and the placebo group (LPPrad); (SI Section A9 has more detail):

$$\mathrm{LPI_{rad}} \approx \mathrm{LPI_{rwd}} + \mathrm{LP_{rNW}} \quad \text{or} \quad \mathrm{LPP_{rad}} \approx \mathrm{LPP_{rwd}} + \mathrm{LP_{rNW}} \tag{2}$$

The longitudinal prevalence ratio of all reported diarrhea (LPRrad) describes the effectiveness of the LFF; the preventable fraction of reported diarrhea is 1 − LPRrad.

$$\mathrm{LPR_{rad}} = \mathrm{LPI_{rad}} / \mathrm{LPP_{rad}} \tag{3}$$

There are a total of 33 model parameters. Values for 26 of these parameters came from published scientific literature (SI, Table S1). Four of these parameter values were estimated by calibration: the concentrations of the three pathogen types in untreated water, and the longitudinal prevalence of reported nonwaterborne diarrhea, LPrNW (Table 1). The remaining three

Table 1. Uniform Distributions Used to Determine the Values of the Stochastically Varying Parameters for Each Simulation Model Run During Calibration

| description | lower limit | upper limit (low calibration compliance) | upper limit (medium calibration compliance) |
| --- | --- | --- | --- |
| mean concentration per L, diarrheagenic E. coli in untreated drinking water | 0 | 7.0 × 10^4 | 8.0 × 10^4 |
| mean concentration per L, Giardia cysts in untreated drinking water | 0 | 0.95 | 1.3 |
| mean concentration per L, rotavirus in untreated drinking water | 0 | 0.14 | 0.18 |
| baseline nonwaterborne diarrhea longitudinal prevalence (LPrNW) (a) | 0 | 0.0972 | 0.0972 |

a The upper limit for baseline diarrhea longitudinal prevalence is the upper limit of the 95% CI for LP in the 2000 runs) for each of four estimation compliance levels, assuming a perfect placebo.

The estimation step simulated measurements of LPIrad and LPPrad, and their ratio LPRrad, which were calculated in the same way as in the calibration step (number of monthly person-surveys reporting diarrhea during the previous 7 days, divided by the total number of person-surveys).

Differing compliance values were used in each step. In the calibration step, we use “calibration compliance” to refer to a set of compliance values that describe what probably occurred during the actual Lifestraw RCT; these are necessary to calibrate the model to the four parameter values mentioned above. In the estimation step, we use “estimation compliance” to refer to a larger set of compliance values that allow the model to make predictions for several different scenarios. Calibration compliance and estimation compliance must be considered simultaneously because different calibration compliance levels lead to different results in the estimation step.

The QMRA model was programmed in Octave 3.2; the code also runs in Matlab 7.11. The model code is available online with the SI. Results were analyzed using R 2.11; the two-tailed Wilcoxon rank sum test (α = 0.05) was used to compare distributions.
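The survey-based definition of longitudinal prevalence reduces to counting person-surveys, and the LPR and preventable fraction then follow from Eq. 3. The following is a minimal sketch under stated assumptions: the function name, the data layout (one list of booleans per child), and the survey results are all hypothetical and chosen only to make the arithmetic easy to follow.

```python
def longitudinal_prevalence(reports):
    """LP = person-surveys reporting diarrhea / total person-surveys.

    `reports` is a list with one entry per child; each entry is a list of
    booleans, one per survey (True = diarrhea reported for the recall window).
    """
    total = sum(len(child) for child in reports)
    positive = sum(sum(child) for child in reports)
    return positive / total


# Hypothetical survey results (illustrative only): 3 children x 4 surveys per arm.
intervention = [
    [False, False, True, False],
    [False, False, False, False],
    [False, True, False, False],
]
placebo = [
    [True, False, True, False],
    [False, False, False, True],
    [False, True, False, False],
]

lp_i = longitudinal_prevalence(intervention)  # 2 of 12 person-surveys
lp_p = longitudinal_prevalence(placebo)       # 4 of 12 person-surveys
lpr = lp_i / lp_p                             # Eq. 3
preventable_fraction = 1.0 - lpr

print(f"LP_I={lp_i:.3f}  LP_P={lp_p:.3f}  LPR={lpr:.2f}  PF={preventable_fraction:.2f}")
```

In this toy data set the intervention arm reports diarrhea in half as many person-surveys as the placebo arm, so the LPR is 0.5 and the preventable fraction is 0.5.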

RESULTS

Calibration Step. Out of the 100 000 simulation runs in the calibration step, 210 were consistent with the Lifestraw RCT based on the criteria in Table 2 and assuming low calibration compliance with water filtration. Repeating the calibration step assuming medium calibration compliance yielded 258 consistent runs. Calibration estimated distributions for two outputs:

1. The longitudinal prevalence ratio (LPRrad) distributions were similar by level of calibration compliance (Figure 2). The estimate from the Lifestraw RCT falls within the central 95% of the distributions, suggesting consistency between the model and the Lifestraw RCT. The median LPRrad estimated by the model differs from the Lifestraw RCT estimate because the Lifestraw RCT is a single experiment, whereas each distribution of LPRrad represents over 200 simulated experiments.
2. Concentrations of pathogen types in untreated water (Figure 3) were higher for medium calibration compliance compared with low calibration compliance, which is necessary to produce LPIrad and LPPrad values consistent with the RCT. In individual calibration runs, higher concentrations of one pathogen type were associated with lower concentrations of the other two pathogen types. The median diarrheagenic E. coli concentration predicted by the model is lower than the median thermotolerant coliforms (TTC) concentration measured in untreated drinking water in the Lifestraw RCT (Figure 3). This is plausible since E. coli are a subset of TTC, and not all E. coli are pathogenic.

Figure 2. Distributions of longitudinal prevalence ratios from simulation runs consistent with the Lifestraw RCT from the calibration step for low (65% of children treat 1/3 of their drinking water) and medium (65% of children treat 2/3 of their drinking water) calibration compliance. These distributions differed significantly (Wilcoxon rank sum test, p = 0.02). Boxplots include: median (heavy line), 25th and 75th percentiles (lower and upper limits of the box), 2.5th and 97.5th percentiles (× symbols), and range (whiskers).

Estimation Step. This step estimated LFF effectiveness compared to a perfect placebo for low, medium, high, and perfect estimation compliance, given low or medium calibration compliance. Estimation compliance was a major driver of effectiveness. For example, under low, medium, high, and perfect estimation compliance, the median LPRrad was 0.86, 0.70, 0.50, and 0.13, respectively, regardless of calibration compliance (Figure 5). Additionally, LPIrad was significantly greater with medium calibration compliance (compared to low calibration compliance), for all levels of estimation compliance except perfect (Figure 4). This difference occurred because both calibration steps (low and medium calibration compliance) were constrained to the same RCT result; if calibration compliance decreases, the pathogen concentrations must also decrease for the model to remain consistent with the Lifestraw RCT. During the estimation step, the higher LPPrad for medium calibration compliance is due to lack of protection from the perfect placebo; as a result, the LPIrad values were also higher. The differences between low and medium calibration compliance decrease as estimation compliance increases.
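Equation 1 helps explain why compliance drives effectiveness: the untreated share of a child's water passes through at full concentration, so it dominates the dose unless nearly all water is treated. The sketch below evaluates Eq. 1 for a filter user at the three treated proportions used in the compliance scenarios. The log10 reduction value of 6.9 for E. coli comes from the text, while the concentration and daily intake are hypothetical values chosen for illustration.

```python
def daily_dose(c, d, w, r):
    """Eq. 1: pathogens ingested per day.

    c: pathogens per liter in untreated water
    d: liters of water consumed per day
    w: proportion of drinking water that is treated (0 to 1)
    r: log10 reduction value of the filter in use
    """
    return c * d * ((1.0 - w) + w * 10.0 ** (-r))


LRV_LFF_ECOLI = 6.9   # intervention filter, E. coli (value from the text)
c, d = 100.0, 1.0     # hypothetical concentration and intake, illustration only

untreated = daily_dose(c, d, 0.0, LRV_LFF_ECOLI)
for label, w in [("1/3 treated", 1 / 3), ("2/3 treated", 2 / 3), ("all treated", 1.0)]:
    fraction = daily_dose(c, d, w, LRV_LFF_ECOLI) / untreated
    print(f"{label}: dose is {fraction:.3g} of the untreated dose")
```

A user who treats one-third of their water still ingests about two-thirds of the untreated dose, however effective the filter is. Only when all water is treated does the filter's removal capacity determine the dose.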

Figure 3. Simulated distributions of microbial concentrations per liter of untreated water, consistent with the Lifestraw RCT. Distributions were obtained from the calibration step assuming low (65% of children treat 1/3 of their drinking water) or medium (65% of children treat 2/3 of their drinking water) calibration compliance. Thermotolerant coliforms (TTC) measured by the Lifestraw RCT are also shown for comparison with simulated E. coli. For all three pathogen types, the concentration distributions differ by calibration compliance (Wilcoxon rank sum test, p < 0.001).

ance, the preventable fraction increased by 22 percentage points (median LPRrad: 0.92 and 0.70 for imperfect and perfect placebo, respectively). All three pathogen types contributed substantially to infection and disease (SI, Figure S5). Multiple infections accounted for about 2% of infections.
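Because the preventable fraction is 1 − LPRrad, the 22-percentage-point increase follows directly from the two median LPRrad values quoted above (0.92 for the imperfect placebo and 0.70 for the perfect placebo):

$$(1 - 0.70) - (1 - 0.92) = 0.30 - 0.08 = 0.22$$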



DISCUSSION

Results from an epidemiological study may only be relevant to the ecological and social conditions of the communities studied. However, quantitative microbial risk assessment (QMRA) models that are calibrated to epidemiologic data can predict risk under scenarios that were not actually studied, known as counterfactual scenarios.28,29 This calibration process can enhance QMRA models that are usually informed by environmental contamination data under conditions where direct risks are difficult to measure using epidemiology. There are many situations where epidemiological data can be used to inform QMRA models, such as in a developing country context where direct measures of risk are frequently collected but environmental contamination data are rare. We applied our modeling framework in this context to generalize results from the Lifestraw RCT.12 Generalizing across different compliance scenarios, we used our model to quantify the relationship between compliance and HWT effectiveness. Our analysis suggests that perfect compliance in the Lifestraw RCT communities would yield an LPRrad of 0.13, suggesting that 87% of reported diarrhea could be prevented by consumption of treated water. This result, suggesting that only 13% of diarrhea in the Lifestraw RCT community was caused by nonwaterborne transmission, is consistent with a HWT trial in a refugee camp in which there was 95% compliance and an 83% reduction in diarrhea prevalence,30 as well as numerous field trials of ceramic filters indicating risk ratios