ARTICLE pubs.acs.org/est
Modeling Contaminant Concentration Distributions in China’s Centralized Source Waters
Rui Wu,† Song S. Qian,*,‡ Fanghua Hao,† Hongguang Cheng,† Dangsheng Zhu,§ and Jianyong Zhang§
† State Key Joint Laboratory of Environmental Simulation and Pollution Control, School of Environment, Beijing Normal University, Beijing 100875, China
‡ Nicholas School of the Environment, Duke University, Durham, North Carolina 27708, United States
§ General Institute of Water Resources and Hydropower Planning and Design, Ministry of Water Resources, Beijing 100120, China
Supporting Information
ABSTRACT: Characterizing contaminant occurrences in China’s centralized source waters can provide an understanding of source water quality for stakeholders. The single-factor (i.e., worst contaminant) water-quality assessment method, commonly used in Chinese official analysis and publications, provides a qualitative summary of the country’s water-quality status but does not specify the extent and degree of specific contaminant occurrences at the national level. Such information is needed for developing scientifically sound management strategies. This article presents a Bayesian hierarchical modeling approach for estimating contaminant concentration distributions in China’s centralized source waters using arsenic and fluoride as examples. The data used are from the most recent national census of centralized source waters in 2006. The article uses three commonly used source water stratification methods to establish alternative hierarchical structures reflecting alternative model assumptions as well as competing management needs in characterizing pollutant occurrences. The results indicate that the probability of arsenic exceeding the standard of 0.05 mg/L is about 0.96–1.68% and the probability of fluoride exceeding 1 mg/L is about 9.56–9.96% nationally, both with strong spatial patterns. The article also discusses the use of the Bayesian approach for establishing a source water-quality information management system as well as other applications of our methods.
Received: November 16, 2010
Revised: April 14, 2011
Accepted: June 1, 2011
Published: June 21, 2011
© 2011 American Chemical Society
dx.doi.org/10.1021/es1038563 | Environ. Sci. Technol. 2011, 45, 6041–6048
Environmental Science & Technology

1. INTRODUCTION
With rapid economic development and increasing water pollution, source water quality in China is facing a severe threat. According to the Chinese government, source water with a high arsenic or fluoride detection rate is a concern for some areas in China.1–3 Epidemiological studies show that elevated levels of arsenic in drinking water may be related to the development of several cancers, particularly skin, bladder, and lung cancer.4 Long-term high fluoride intake from drinking water may lead to dental or skeletal fluorosis and is suspected to be the leading cause of endemic fluorosis in some areas.4

In China, source waters are divided into distributed sources and centralized sources. The former, similar to transient, noncommunity water systems in the U.S., are usually wells without purification, whereas the latter are treated and serve a large population. Although some studies indicate that high arsenic or fluoride concentration is common among the distributed source waters in some rural areas,1–3,5,6 especially groundwater, few clearly show the concentration distributions within centralized source waters. Hence, characterization of arsenic and fluoride occurrences in China’s centralized source waters will help better understand the contamination status and provide a starting point for the subsequent analysis and the development of management strategies through better water treatment regulations and risk assessment.

Water-quality assessment in China is guided by the Environmental Quality Standards for Surface Water (GB 3838-2002),7 which classifies the country’s waters into five grades. Grade 1 is defined as undisturbed headwater and water in national nature reserves, whereas grades 2 to 5 are defined based on a water’s function as a source for different usages (two tiers of surface drinking water sources, industrial/noncontact recreational water, and agricultural/landscape water). Grade 1 represents the best water quality and grade 5 represents severely polluted water. Most official analyses and publications (e.g., Annual Report on Environmental Status, Water Pollution Control Planning) use the single-factor method, which judges water quality based on the worst contaminant (of all routinely monitored ones). Using the single-factor method, each routinely monitored pollutant is classified into five grades corresponding to the five functional grades. All monitored water-quality constituents are compared to these tiered water-quality grades and the pollutant with the highest grade is used to define the functional grade of a water body.

Although simple and clear, the single-factor method is often less informative8,9 because it cannot specify the extent and degree of a specific contaminant occurrence at the national level. For example, a water body is classified as a grade 3 water body if the worst water-quality constituent is at grade 3, and the worst pollutant is required by regulations to be identified. However, when summarizing water-quality status at a regional or national level, identifying the worst pollutant is often difficult because the worst pollutant may vary by region. For example, when summarizing national water-quality status of source waters, the fraction of all source waters violating China’s environmental standards (worse than a grade 3 water, GB 3838-2002) is often reported. But a grade 3 water body in one part of the country can be due to organic pollution (indicated by COD), whereas another grade 3 water body can be due to arsenic pollution. Consequently, comparison among regions is difficult because information about the leading pollutant is often buried in less accessible archives. As a result, the single-factor method results in a qualitative assessment of water-quality status. Qualitative information is often very easy to understand and very effective in representing the overall status of the country’s water quality. But qualitative information can mask the weakness in the monitoring and assessment system.
A recent survey of compliance with the Water Environmental Monitoring Regulations (SL 219-98) suggests that fewer than 20% of the cities and counties meet the minimum sampling effort requirement (at least 12 annual samples for at least 23 water-quality criteria).10 As a result, the single-factor assessment results are often based on available data rather than the required routinely monitored water-quality constituents. The problem of data porosity cannot be addressed explicitly.

Accurate assessment of individual pollutant concentration distributions is a necessary step in assembling basic information to support China’s effort in developing its own national water-quality criteria system.11 As China’s environmental management and policy are closely controlled by the central government, policies with regard to source water protection should be guided by pollution information more specific than the general status. Such information, including trends (and regional differences in trends) of specific pollutants, can be used for setting regional environmental standards and management strategies. Under the current data collection, management, and reporting system represented by the single-factor method for assessing water quality, we get a qualitative summary of the general water-quality status of the country, and information about specific pollutants is often hidden. Regional and national management and policy strategies require knowledge of regional and national trends and status, as well as variations within a region and across different regions of specific pollutants.

Several studies have tried to describe the occurrence and distribution of specific chemicals in China’s waters on the basis of frequency or simple statistical summary of sample concentrations exceeding standards.1,2,13,14 Such analyses are frequently conducted at a system level, representing a no-pooling analysis15 that is sensitive to outliers due to small sample sizes.
Results from these analyses are useful at a local level but cannot be easily aggregated into a regional or national level because of uneven sample sizes and a frequently large number of concentration values below detection limits. To accurately describe regional or national distribution of pollutant concentrations, one may pool all data within a region or the entire country leading to
quantitative summaries at the regional or national level (the complete-pooling method). However, pooling local-level data for analyzing national trends is a difficult task because of different sample sizes and various levels of data censorship (values below detection limits). Simply pooling all data together will lead to final results dominated by systems (or regions) with large sample sizes. Even under the ideal condition (a unified sampling design, standard analytical protocol, similar levels of censorship), the complete-pooling approach will result in the loss of information on spatial variation. The partial-pooling approach in the context of analyzing data with a multilevel structure15,16 is clearly the ideal tool for this problem.

A complete assessment of source water contaminant occurrence ideally would be based on repeated samples from source waters. However, such data are generally not available for all source waters in China, largely due to regional differences in economic development and environmental monitoring capacities. Such regional differences often lead to varying quality assurance and quality control practices, particularly the handling of values below method detection limits (MDLs). Consequently, the single-factor assessment approach is often the least controversial one. To better understand the status of source water, the Chinese government organized a source water-quality survey using the same standard developed by relevant ministries. We believe that the Bayesian hierarchical modeling approach can be used for analyzing data from this effort and can provide not only a detailed summary of concentration distributions of various contaminants but also a case for a systematic revamp of the national environmental monitoring and assessment strategy. Many statistical methods have been used for estimating national distributions of chemicals in water systems17–20 (a brief summary can be found in Qian et al.18).
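The contrast between no pooling, complete pooling, and partial pooling described above can be illustrated with a small numerical sketch. This is not the model used in this article: it is a simplified precision-weighted shrinkage that assumes the within-group and between-group standard deviations are known, whereas the Bayesian hierarchical model estimates them. The function name `partial_pool` and all numbers are hypothetical.

```python
def partial_pool(group_means, group_sizes, sigma_within, sigma_between):
    """Shrink each group's mean toward the overall mean (variances assumed known)."""
    # Complete pooling: one size-weighted mean shared by every group.
    total = sum(group_sizes)
    pooled = sum(m * n for m, n in zip(group_means, group_sizes)) / total
    estimates = []
    for m, n in zip(group_means, group_sizes):
        w_data = n / sigma_within ** 2      # precision of the group's own mean
        w_group = 1.0 / sigma_between ** 2  # precision of the between-group distribution
        # Partial pooling: precision-weighted compromise between the two extremes.
        estimates.append((w_data * m + w_group * pooled) / (w_data + w_group))
    return pooled, estimates


# Hypothetical log-scale group means and sample sizes (not census data).
means = [-6.0, -5.0, -7.0]
sizes = [50, 2, 10]
pooled, shrunk = partial_pool(means, sizes, sigma_within=1.0, sigma_between=0.5)
for m, n, s in zip(means, sizes, shrunk):
    print(f"n={n:3d}  no pooling={m:.2f}  partial pooling={s:.2f}  complete pooling={pooled:.2f}")
```

In this sketch the group with only two observations is pulled strongly toward the overall mean, while the group with 50 observations barely moves, which is the behavior that makes partial pooling attractive when sample sizes are uneven.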
The Bayesian hierarchical modeling approach, proposed based on the model in Qian et al.18 developed for the first and second Six-Year Reviews of Drinking Water Standards by the U.S. EPA,21,22 is selected for its ability to fully account for all sources of uncertainty and its relatively light computational burden. Apart from the advantages of using a Bayesian approach (e.g., the ability to incorporate prior information, the ease of incorporating into a formal decision analytic context, the explicit handling of uncertainty, and the straightforward ability to assimilate new information in contexts such as adaptive management), the model has two important features: handling censored data and Bayesian sequential updating.18 The latter feature can be especially useful as it can provide prior information for future studies and a framework for systematic information accumulation.

Through the characterization of the occurrences of two important contaminants (arsenic and fluoride) in China’s centralized source waters (in terms of national mean source water concentration distributions and probabilities of exceeding water-quality standards), this article intends to (1) demonstrate the need for a unified data collection standard for China, (2) recommend a source water-quality information management system for better managing China’s drinking water quality, and (3) ultimately, contribute to the improvement of China’s water-quality monitoring and management practices. Analysis of a third water-quality constituent (copper) is included in the online Supporting Information.
2. MATERIALS AND METHODS
2.1. Data. Arsenic and fluoride concentration data used in this study were from the most recent national census of source waters,
Figure 1. Map of centralized source waters in China.
which was conducted in 2006 to support the compilation of the China National Urban Safe Drinking Water Planning (2006–2020). The census, conducted at the city/county level for the first time by several Chinese central government ministries, included 4555 urban centralized source waters all over the country (Figure 1), serving a population of approximately 379 840 000.23 This data set is the only existing one with source water characteristics (such as type, geographic locations, administrative relations, hydrographical relations, population served), water-quality monitoring data, and other related information.

The design of the source water census required that only the 2004 source water mean concentrations of specific contaminants be included in the data set. The system mean value is the average of all samples collected at different routine observational points. Although all samples were collected and tested by Ministries of Water Resources and Environmental Protection personnel using the same protocols and procedures according to official technical specifications, monitoring programs (including the number of routine observational points, sampling frequency, testing items, and testing method) may vary by source. After excluding sources whose routine monitoring program did not include arsenic or fluoride and those whose water-quality monitoring data were unavailable in that census, there are 2928 observations of arsenic mean concentrations (with 1467 measurements below MDL) and 3042 observations of fluoride mean concentrations (with 315 measurements below MDL). We note that all sources in Sichuan Province were excluded because monitoring data were unavailable.

Three alternative stratification methods were used to stratify source waters into groups representing cultural and natural hierarchies. First, source waters are grouped based on administrative boundaries (country → province → city → source water).
Source waters within a city can be assumed to be exchangeable and cities within a province are assumed to be exchangeable (Section 2.2). This stratification puts an emphasis on local management. Second, source waters can be grouped based on hydrographical relations. China is divided into 10 class 1 water resources divisions, each within the watersheds of a large river basin. These are the Songhuajiang River Basin, Liao River Basin, Hai River Basin, Yellow River Basin, Huai River Basin, Yangtze River Basin, Southeast Rivers Basin, Pearl River Basin, Southwest Rivers Basin, and Northwest Rivers Basin.24 Each class 1 division is subdivided into several class 2 divisions as subwatersheds of the class 1 watershed. Each source water belongs to a certain class 2 division. Third, source waters are stratified based on source water type and size (population served). The Chinese government routinely classifies source waters into four types (reservoir, lake, stream, and groundwater) and four size categories (≤10 000, 10 001–50 000, 50 001–200 000, and >200 000 population served).

Water-quality standards for source water are obtained from relevant water-quality regulations, including Environmental Quality Standards for Surface Water (GB 3838-2002),7 Quality Standards for Ground Water (GB/T 14848-93),25 and Standards for Drinking Water Quality (GB 5749-2006).26 There are three different standards for arsenic (0.005, 0.01, and 0.05 mg/L) and one for fluoride (1 mg/L).

2.2. Bayesian Hierarchical Modeling. For different levels of administrative agencies, management targets range from individual sources to regional and national distributions of source means. The conventional analytical approach (e.g., regression on ordered statistics, maximum likelihood estimator) calculates contaminant mean concentrations separately for individual source water systems17,20 and then pools these estimated source
Figure 2. Three hierarchical model structures and the corresponding estimated CDFs of national distribution of source water log mean concentrations. In each panel, the solid line is the estimated median log mean concentration, the dotted lines represent the estimated 95% credible interval, and the solid gray line is the empirical CDF with all censored values being substituted with 0.
means (using weighted averages) to generate regional and national distributions in terms of the probabilities of source means exceeding the standards. However, when dealing with a large number of source waters, these methods become prohibitively expensive and can provide misleading estimates of exceedence probabilities due to the problem of uneven sample sizes and the exclusion of sources with many left-censored values.18 Compared to the conventional approach, the Bayesian hierarchical modeling approach facilitates the computation of exceedence probabilities and addresses the problems of uneven sample sizes and censored data. The hierarchical model can build the correlations among sources indirectly through the hierarchical structure, whereas it relies on many distribution parameters to form a realistic model thereby avoiding problems of overfitting. Furthermore, the approach provides a Bayesian framework for combining and updating information. Details of the Bayesian hierarchical model can be found in Qian et al.18 and the computational details are in the online Supporting Information. In this section, we discuss three alternative hierarchical structures and how the model results can be used for management and policy analysis. In analyzing drinking water survey data, we often group source water systems according to management needs. For example, we can group source water systems based on administrative borders so that various levels of governments can have a clear picture of the water-quality status within their jurisdictions. Source water systems can also be grouped based on major drainage basins so that we can better separate natural and anthropogenic contributions of
pollutants. Source water systems can also be grouped based on their source water type (groundwater or surface water) and size (characterized by population served) so that a meaningful human exposure risk assessment can be carried out.

In grouping source water systems and analyzing the resulting data using a Bayesian hierarchical model, we impose the assumption that systems within a group are exchangeable, that is, system means within a group can be modeled as random variables with the same probability distribution even though we know that these means are probably different. By imposing the exchangeability assumption within each group, we summarize the between-group difference in terms of the group-specific common probability distribution of system means. Once systems are grouped, we most likely view the between-group differences as more prominent than within-group differences, thereby ignoring the patterns within a group. Such intentional or unintentional ignorance is often harmless and imposing the exchangeability assumption is a natural way of modeling ignorance,27 although the capability of modeling ignorance is not an encouragement for ignoring available information. Given that empirical evidence of exchangeability is often not easily available, we use three alternative hierarchical structures for this study.

Model 1 stratifies source waters based on administrative boundaries with four hierarchies: national level → provinces and province-level metropolises → cities → source waters. Model 1 uses location and the degree of source water protection capacity (in accordance with the differences of economic development between cities and provinces) as the major factors influencing contaminant concentration distribution. Data are available
Table 1. Estimated National Level Probability of Exceeding Selected Thresholds (mg/L) and Median Concentrations (mg/L), with 95% Credible Intervals in Parentheses

Arsenic
model     Pr[>0.005]                Pr[>0.01]                 Pr[>0.05]                 median
model 1   0.2568 (0.2392, 0.2748)   0.1360 (0.1228, 0.1496)   0.0168 (0.0120, 0.0216)   0.0018 (0.0017, 0.0020)
model 2   0.2468 (0.2316, 0.2640)   0.1232 (0.1108, 0.1368)   0.0124 (0.0084, 0.0172)   0.0018 (0.0017, 0.0020)
model 3   0.2628 (0.2448, 0.2796)   0.1256 (0.1136, 0.1392)   0.0096 (0.0064, 0.0140)   0.0021 (0.0020, 0.0022)

Fluoride
model     Pr[>1]                    median
model 1   0.0956 (0.0848, 0.1084)   0.22 (0.21, 0.23)
model 2   0.0996 (0.0876, 0.1116)   0.21 (0.19, 0.22)
model 3   0.0960 (0.0852, 0.1076)   0.22 (0.21, 0.23)
from 30 (of 34) provinces. Sichuan, Hong Kong, Macao, and Taiwan did not report arsenic and fluoride concentration data. Model 2 is based on the hierarchical structure of national level → class 1 water resources divisions → class 2 water resources divisions → source waters, which considers different hydrographical conditions to be the most important factor affecting concentration levels. Model 3 is based on the stratification of source water type and size (Section 2.1). There are 15 unique source water type–size combinations (national level → source water type–size class → source waters). As the Chinese government is in the process of phasing in different management requirements for different sizes of source waters, results from Model 3 can serve as a reference for this effort. Models 1 and 2 can be expressed in eq 1:

$$
\begin{aligned}
y_{ijk} &\sim N(\mu_{ij}, \sigma_1^2)\, I(\,, S_{ijk}) \\
\mu_{ij} &\sim N(\mu_i, \sigma_2^2) \\
\mu_i &\sim N(\mu, \sigma_3^2)
\end{aligned}
\qquad (1)
$$
where the subscript $k$ represents source water systems, $j$ represents the first level group (city in model 1, class 2 water resources division in model 2, or type–size group in model 3), and $i$ represents the second level group (provinces in model 1 and class 1 water resources divisions in model 2). The model has four levels. Using model 1 as an example, at the first level, the log transformed contaminant source mean concentration values $y_{ijk}$ are assumed to follow a normal distribution with unknown city mean $\mu_{ij}$ and variance $\sigma_1^2$. At the second level, the city means $\mu_{ij}$ in a given province are assumed to come from a higher level normal distribution with an unknown province mean $\mu_i$ and variance $\sigma_2^2$. At the third level, the province means $\mu_i$ are assumed to come from a normal distribution with an unknown national mean $\mu$ and variance $\sigma_3^2$. The fourth level consists of priors for $\mu$, $\sigma_1^2$, $\sigma_2^2$, and $\sigma_3^2$. When information about these parameters is unavailable, vague priors are used. Details of model formulation are in the online Supporting Information.
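To make the first level of eq 1 concrete, the following sketch shows how a censored observation could enter the likelihood. It assumes that the $I(\,, S_{ijk})$ term denotes left-censoring at a reporting limit $S_{ijk}$ on the log scale, so that a value below the limit contributes the probability of falling below $S_{ijk}$ rather than a density. The function name `censored_loglik`, the detection limit, and the parameter values are illustrative assumptions and are not taken from the article or its Supporting Information.

```python
import math
from statistics import NormalDist


def censored_loglik(values, is_censored, mu, sigma):
    """Log-likelihood of log concentrations under N(mu, sigma^2) with left-censoring.

    values[i] is the observed log concentration, or the log reporting limit
    when is_censored[i] is True.
    """
    dist = NormalDist(mu, sigma)
    total = 0.0
    for v, censored in zip(values, is_censored):
        if censored:
            total += math.log(dist.cdf(v))  # P(y < limit): all that is known
        else:
            total += math.log(dist.pdf(v))  # density at the observed value
    return total


# Hypothetical city with four source means (mg/L); two fall below an assumed MDL.
log_mdl = math.log(0.001)  # assumed detection limit, for illustration only
values = [math.log(0.004), log_mdl, math.log(0.02), log_mdl]
flags = [False, True, False, True]
print(censored_loglik(values, flags, mu=-6.3, sigma=1.0))
```

Treating censored values this way avoids substituting 0, one-half of the MDL, or the MDL itself, which is the main practical reason for keeping the censoring term in the model.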
3. RESULTS AND DISCUSSION
Figure 2 shows the CDFs and Table 1 presents the national exceedence probabilities of selected thresholds and median concentrations as well as the 95% credible intervals. In the figure, the solid line is the median of log mean concentration, the dotted lines represent the 95% credible interval, the vertical dashed lines are water-quality standards, and the gray solid line is the empirical CDF with all censored values being substituted with 0. Figure 2 and Table 1 show that the estimated national distributions based on the three alternative source water stratification methods are similar. For example, the estimated national median arsenic concentration ranges between 0.0018 and 0.0021 mg/L among the three models, and the probabilities of exceeding the standards of 0.005, 0.01, and 0.05 mg/L range over 24.68–26.28%, 12.32–13.60%, and 0.96–1.68%, respectively. The estimated national fluoride median range is 0.21–0.22 mg/L and the probability of exceeding the standard of 1 mg/L is approximately 9.56–9.96%. The small differences among the three models suggest that the hierarchical model is not sensitive to methods of stratification when estimating national source water mean distributions. However, the relatively large between-province variance from model 1 suggests regional variation in system mean distributions. Models 2 and 3 represent much larger spatial aggregation, thereby eliminating regional differences. Consequently, results from model 1 are emphasized in the rest of the article. Results from the other two models are summarized in the Supporting Information.

We highlight the spatial patterns of arsenic and fluoride occurrence in terms of probabilities of exceeding current standards (As: 0.05, F: 1 mg/L) in Figure 3. Nationwide, the average probability of source water mean arsenic concentrations exceeding 0.05 mg/L is about 1.2% with a substantial between-province variation (ranging from 0.04% to 6.04%). Provinces with above national average exceedence probabilities are mostly in northern China (Gansu, Heilongjiang, Hebei, Shaanxi, and Inner Mongolia), southern mountain areas (Hunan and Jiangxi), and Tibet. Fluoride source water mean distribution also shows a clear north to south gradient with the exceedence probabilities ranging from 0.72% to 28.84%.
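One way to see how exceedence probabilities and their credible intervals can be obtained from a fitted hierarchical model is sketched below. It assumes that, given the parameters of eq 1, a source mean on the natural-log scale is marginally normal with mean μ and variance σ1² + σ2² + σ3², and that posterior draws of (μ, σ1, σ2, σ3) are available. The article's actual computation is described in its Supporting Information and may differ; the function names and the draws below are hypothetical.

```python
import math
from statistics import NormalDist


def exceedance_probability(threshold_mg_per_l, mu, sigma1, sigma2, sigma3):
    """P(source mean > threshold) when the log mean is N(mu, s1^2 + s2^2 + s3^2)."""
    total_sd = math.sqrt(sigma1 ** 2 + sigma2 ** 2 + sigma3 ** 2)
    return 1.0 - NormalDist(mu, total_sd).cdf(math.log(threshold_mg_per_l))


def summarize(threshold_mg_per_l, draws):
    """Crude 2.5%, 50%, and 97.5% empirical quantiles of the exceedance probability."""
    probs = sorted(exceedance_probability(threshold_mg_per_l, *d) for d in draws)
    n = len(probs)
    return probs[int(0.025 * (n - 1))], probs[n // 2], probs[int(0.975 * (n - 1))]


# Hypothetical posterior draws of (mu, sigma1, sigma2, sigma3); not model output.
draws = [
    (-6.35, 1.00, 1.15, 0.60),
    (-6.30, 1.02, 1.20, 0.55),
    (-6.28, 0.98, 1.18, 0.65),
    (-6.32, 1.01, 1.22, 0.58),
    (-6.25, 1.03, 1.17, 0.62),
]
lower, median, upper = summarize(0.05, draws)
print(lower, median, upper)
```

Because each posterior draw yields its own exceedence probability, the spread of those values across draws is what produces an interval such as those reported in Table 1.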
At a provincial level, our model can provide within-province variability in the probability of source water mean exceeding a standard. Figure 4 shows the source water mean concentration distribution for the Province of Inner Mongolia (dark solid line) as well as the mean concentration distributions for the 12 cities within the province where source water data were available (gray lines). The city-level arsenic mean concentration distributions can vary widely, whereas the city-level fluoride mean concentration distributions are clustered into two groups: two cities have consistently above average concentrations whereas the other 10 have consistently below average concentrations. Figure 4 illustrates the level of detail our models can provide. When needed, the same model structure can be used to develop a provincial model using data from local sources. The current model can be used to develop prior information. Figure 4
Figure 3. Arsenic and fluoride concentration exceedence probabilities of current standard (As: 0.05, F: 1 mg/L) for source waters in each province and class 1 water resources division.
hints at a clustering of the fluoride concentration distribution in Inner Mongolia suggesting that cities within the province may not be exchangeable. The results, nevertheless, provide a basis for further investigation and model refinement. Although the Bayesian hierarchical model is effective in presenting the status of source water, our study also suggests areas of improvement necessary for the final product to be more useful for supporting decision making and planning. These improvements are related to data collection and management. Data included in the current data set represent all available data and not necessarily a random sample of all data (e.g., no data were available from Sichuan Province). As a result, our model results are to be treated as a start in assessing drinking water safety in China. This initial result should be used to guide the designs of future surveys. Because a stratified random sampling procedure will result in a more cost-effective data collection process and will ensure that the results are representative, the stratification scheme should be carefully designed, maybe based on the estimated uncertainty of exceedence probabilities from this study. Areas (provinces or water resources divisions) with high levels of uncertainty should be sampled more. Data reported in the last source water census included only a water-quality constituent’s annual mean concentration for each source water. As a result, we treat system means as samples from a larger group (e.g., city in model 1). We have no information on sample size and sample variance. Such information is necessary
for properly setting up our model. That is, the within-system variance $\sigma_1^2$ in eq 1 should be changed to $\sigma_1^2/n_{ijk}$, reflecting the effect of varying sample sizes. But a more effective data management practice would be to retain the raw concentration data so that we can assess the equal variance assumption. Furthermore, with individual concentration values for each system we can better quantify variances at all levels, and thereby better quantify the exceedence probabilities. For example, the current model 1 shows between-city variation (Figure 4) for each province. With raw data, we can model between- and within-system variance so that cities with high levels of uncertainty can better understand the problems. Keeping all measurements does not increase the effort in data collection and allows a more detailed model to be developed.

We note that different provinces in China employed different chemical analytical methods with different MDLs. In the final data set we obtained, methods for handling censored data are inconsistent among provinces. Most provinces choose to substitute the censored value with a fixed number, most likely 0, one-half of MDL, or MDL. We have spent a large amount of time reviewing each provincial report and other relevant documents to understand their MDLs and methods of reporting a censored value. A standard should be established to unify the reporting process.

Although different stratification methods imply different assumptions about which source waters are exchangeable, results
Figure 4. Estimated CDFs of arsenic and fluoride log concentration for all cities in Inner Mongolia. In each panel, the black line is the estimated cumulative distribution of the log mean concentration and the gray lines are the estimated cumulative distributions for individual cities within the province. The vertical dotted lines are the respective water-quality standards.
from this article suggest that the model is quite robust for these two chemicals. Different stratifications not only reflect different statistical assumptions but also affect how model results can be used. When using political boundaries as the basis for stratifying source waters, results are suited for supporting local or regional resource management and planning. The central government can use the results to identify provinces with specific pollution problems. Results from our models can be more useful for provincial governments to provide information on public health risk of potential exposure to unhealthy drinking water when population served information is overlaid with the spatial distribution of exceedence probability. Such analyses can be carried out on a long-term basis to assess the effectiveness of environmental initiatives aimed at protecting drinking water safety. The Chinese government can facilitate such analyses by making important demographic information available and funding a long-term study aimed at accurately understanding the status and trend of the nation’s water quality.

Analysis of an additional chemical (copper, online Supporting Information) suggests that methods of stratification of source water systems can sometimes influence the model results when the concentration distribution has a high spatial variation. In our three models, model 3 assumes no regional difference, model 2 assumes a small regional variation (as the nation is divided into 10 large regions), and model 1 assumes a large regional variation. Using different stratification methods representing various levels of spatial aggregation allows us to diagnose problems and propose corrective actions. In this case, we believe that, for those chemicals showing a large regional variation, provincial models should be developed and updated using the Bayesian sequential updating method. Sequential updating can be made at the national level or at the provincial level.
At the national level, sequential updating is aimed at assessing the effectiveness of various national policies and management initiatives. At the provincial level, the updating can be aimed at developing a regional model to solve regional and local problems. At both levels, the capability of sequential updating can be used to develop a systematic information management system for better planning and for better quantifying trends in source water quality. For example, results from the current study can be summarized to derive informative prior distributions for the four basic model parameters in eq 1 ($\sigma_1^2$, $\sigma_2^2$, $\sigma_3^2$, $\mu$) at the national level. (At the provincial level, the joint distribution of $\sigma_1^2$, $\sigma_2^2$, and $\mu_i$ is of interest.) With a set of informative prior distributions (e.g., in the form of conjugate priors), we can reduce sampling intensity while maintaining the statistical power needed to detect changes in water quality over time. A centralized national sampling plan can be developed to achieve the maximum effect. The method of moments can be used to develop the informative prior distributions for $\sigma_1^2$, $\sigma_2^2$, $\sigma_3^2$, and $\mu$ from the posterior samples in this study. The most commonly used informative prior distributions for $\sigma_1^2$, $\sigma_2^2$, $\sigma_3^2$, and $\mu$ are:

$$
\begin{aligned}
1/\sigma_1^2 &\sim \mathrm{gamma}(1/\sigma_1^2 \mid a, b) \\
1/\sigma_2^2 &\sim \mathrm{gamma}(1/\sigma_2^2 \mid c, d) \\
\mu, 1/\sigma_3^2 &\sim \mathrm{Ng}(\mu, 1/\sigma_3^2 \mid \mu_0, n_0, \alpha, \beta)
\end{aligned}
\tag{2}
$$

Table 2. Informative Priors Based on Model 1 Output

| parameter | arsenic | fluoride |
|---|---|---|
| $1/\sigma_1^2$ | $\mathrm{gamma}(1/\sigma_1^2 \mid 672, 700)$ | $\mathrm{gamma}(1/\sigma_1^2 \mid 1208, 856)$ |
| $1/\sigma_2^2$ | $\mathrm{gamma}(1/\sigma_2^2 \mid 57, 41)$ | $\mathrm{gamma}(1/\sigma_2^2 \mid 71, 21)$ |
| $\mu, 1/\sigma_3^2$ | $\mathrm{Ng}(\mu, 1/\sigma_3^2 \mid 6.3, 23, 8, 4)$ | $\mathrm{Ng}(\mu, 1/\sigma_3^2 \mid 1.5, 25, 10, 3)$ |
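The method-of-moments step for a gamma prior on a precision can be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes the gamma distribution is parameterized by shape and rate (mean = shape/rate, variance = shape/rate²), and the function name and the simulated draws are hypothetical stand-ins for posterior samples of 1/σ².

```python
import random
import statistics


def gamma_prior_from_samples(precision_samples):
    """Fit gamma(shape, rate) to draws of a precision by matching moments.

    Assumes mean = shape / rate and variance = shape / rate**2, so that
    shape = mean**2 / variance and rate = mean / variance.
    """
    mean = statistics.fmean(precision_samples)
    var = statistics.variance(precision_samples)  # sample variance
    shape = mean * mean / var
    rate = mean / var
    return shape, rate


# Hypothetical "posterior draws" of a precision, simulated here only so the
# example runs; they are not output from the models in this article.
rng = random.Random(1)
draws = [rng.gammavariate(50.0, 1.0 / 40.0) for _ in range(20000)]  # shape 50, scale 1/40

shape, rate = gamma_prior_from_samples(draws)
print(f"gamma prior: shape = {shape:.1f}, rate = {rate:.1f}")
```

With real MCMC output, the draws of each precision would replace the simulated list, and the fitted pair would serve as the prior parameters in the next round of updating.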
The distribution $\mathrm{Ng}(\mu, 1/\sigma_3^2 \mid \mu_0, n_0, \alpha, \beta)$ is the normal-gamma distribution. Details of prior distribution parameter estimation can be found in Qian and Reckhow (28). For example, the informative prior distributions for $\sigma_1^2$, $\sigma_2^2$, $\sigma_3^2$, and $\mu$ based on the results from model 1 are listed in Table 2. Data from a future study can be used to derive posterior distributions of these four parameters, from which new exceedence probabilities can be derived. Changes in model parameters over time will reflect changes in water-quality conditions. When a model-based water-quality assessment program is implemented at a national level, the overall mean ($\mu$) provides an aggregated summary of water-quality status, and the between-province variance ($\sigma_3^2$) shows the variability among provinces. A large between-province variance would indicate a need for regional strategies, so that not only is the overall concentration level ($\mu$) acceptable but the regional variation ($\sigma_3^2$) is also relatively small. This model-based assessment program can also be implemented at the provincial level so that information from different provinces is comparable and can be aggregated upward. Such a model-based system can fully exploit the existing monitoring resources and systematically summarize water-quality status at both the provincial and national levels.

The Bayesian hierarchical modeling approach presented in this article is mainly for assessing the status of water quality at both regional and national scales. Our model does not include covariates and is not intended for forecasting or for analyzing associations.
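As an illustration of how an exceedence probability follows from the model parameters, the sketch below assumes that concentrations are normally distributed on the natural-log scale and that the variance components are independent and additive. It treats the parameters as fixed point estimates, so it ignores posterior uncertainty; the function names and all input values are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def exceedence_probability(log_mean, log_sd, standard):
    """P(concentration > standard) when log(concentration) ~ N(log_mean, log_sd**2)."""
    z = (math.log(standard) - log_mean) / log_sd
    return 1.0 - normal_cdf(z)


def marginal_log_sd(var_1, var_2, var_3):
    """SD of a new log concentration when three independent normal
    variance components add (an assumption of this sketch)."""
    return math.sqrt(var_1 + var_2 + var_3)


# Hypothetical point estimates on the natural-log scale (mg/L); these are
# placeholders, not results from this article.
sd = marginal_log_sd(1.0, 0.6, 0.4)
p = exceedence_probability(log_mean=-5.0, log_sd=sd, standard=0.05)
print(f"P(concentration > 0.05 mg/L) = {p:.4f}")
```

In a full Bayesian analysis the same calculation would be repeated for every posterior draw of the parameters, giving a distribution of exceedence probabilities rather than a single value.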
ASSOCIATED CONTENT
Supporting Information. (1) Details of the Bayesian hierarchical model and the BUGS program, (2) detailed results of all 3 models, with additional figures and tables, (3) results of an additional chemical (copper), and (4) maps of China (provinces and major river basins). This material is available free of charge via the Internet at http://pubs.acs.org.

AUTHOR INFORMATION
Corresponding Author
*Phone: 919 613-8105; e-mail: [email protected].
ACKNOWLEDGMENT
This research was partially funded by the National Natural Science Foundation of China (Nos. 40930740, 40771192, and 40771191). The work was completed while Rui Wu was visiting Duke University supported by the China Scholarship Council (CSC). S.S.Q.'s work was supported by the U.S. Geological Survey under a cooperative agreement (08HQAG0121). The authors thank Thomas F. Cuffney, Gerald McMahon, Kenneth H. Reckhow, Ibrahim Alameddine, Yun Jian, Kristofor Voss, and Eric Money for their constructive comments and suggestions.

REFERENCES
(1) Liu, Y.; Zheng, B. H.; Fu, Q.; Meng, W.; Wang, Y. Y. Risk assessment and management of arsenic in source water in China. J. Hazard. Mater. 2009, 170 (2–3), 729–734.
(2) Zhang, B.; Hong, M.; Zhao, Y. S.; Lin, X. Y.; Zhang, X. L.; Dong, J. Distribution and risk assessment of fluoride in drinking water in the west plain region of Jilin Province, China. Environ. Geochem. Health 2003, 25 (4), 421–431.
(3) Ministry of Health, National Development and Reform Commission, and Ministry of Finance, P.R.C. National key endemic diseases control planning (2004–2020); Technical report, 2004.
(4) Guidelines for drinking-water quality: Incorporating first and second addenda to third edition, vol 1. recommendations; Technical report, WHO, 2008.
(5) Zhang, H. Heavy-metal pollution and arseniasis in Hetao region, China. Ambio 2004, 33 (3), 138–140.
(6) Gong, Z. L.; Lu, X. F.; Watt, C.; Wen, B.; He, B.; Mumford, J.; Ning, Z. X.; Xia, Y. J.; Le, X. C. Speciation analysis of arsenic in groundwater from Inner Mongolia with an emphasis on acid-leachable particulate arsenic. Anal. Chim. Acta 2006, 555 (1), 181–187.
(7) Ministry of Environmental Protection, P.R.C. Environmental quality standards for surface water (GB 3838-2002), 2002.
(8) Zeng, Y.; Fan, Y. Q.; Wang, L. W.; Diao, L. F.; Li, H. A comparison between fuzzy comprehensive evaluation and single-factor water quality assessment method. Yellow River 2007, No. 2, 45.
(9) Yin, H. L.; Xu, Z. X. Discussion on China's single-factor water quality assessment method. Water Purif. Technol. 2008, 27 (2), 1–3.
(10) Wang, Y.; Tang, K.; Xu, Z.; Tang, Y.; Liu, H. Water quality assessment of surface drinking water sources in cities and towns of China. Water Res. Protection 2009, 25 (2), 1–4.
(11) Wu, F.; Meng, W.; Zhao, X.; Li, H.; Zhang, R.; Cao, Y.; Liao, H. China embarking on development of its own national water quality criteria system. Environ. Sci. Technol. 2010, 44 (21), 7992–7993.
(12) Gao, J. J.; Liu, L. H.; Liu, X. R.; Lu, J.; Zhou, H. D.; Huang, S. B.; Wang, Z. J.; Spear, P. A. Occurrence and distribution of organochlorine pesticides lindane, p,p'-DDT, and heptachlor epoxide in surface water of China. Environ. Int. 2008, 34 (8), 1097–1103.
(13) Gao, J. J.; Liu, L. H.; Liu, X. R.; Zhou, H. D.; Lu, J.; Huang, S. B.; Wang, Z. J. The occurrence and spatial distribution of organophosphorous pesticides in Chinese surface water. Bull. Environ. Contam. Toxicol. 2009, 82 (2), 223–229.
(14) Xing, X. R.; Cao, Q.; Liu, J.; Shi, J.; Ren, L. J.; Wu, G. P.; Wei, F. S. Water quality of public drinking water sources in some key cities of main environmental protection in China. China Environ. Sci. 2008, 28 (11), 961–967.
(15) Gelman, A.; Hill, J. Data Analysis Using Regression and Multilevel/Hierarchical Models; Cambridge University Press: New York, 2007.
(16) Qian, S. S. Environmental and Ecological Statistics with R; Chapman and Hall/CRC Press, 2009.
(17) U.S. EPA. Methods, occurrence, and monitoring documents for radon in drinking water; Technical report, Office of Water, 2000.
(18) Qian, S. S.; Schulman, A.; Koplos, J.; Kotros, A.; Kellar, P. A hierarchical modeling approach for estimating national distributions of chemicals in public drinking water systems. Environ. Sci. Technol. 2004, 38 (4), 1176–1182.
(19) Lockwood, J. R.; Schervish, M. J.; Gurian, P.; Small, M. J. Characterization of arsenic occurrence in source waters of U.S. community water system. J. Am. Stat. Assoc. 2001, 96 (456), 1184–1193.
(20) U.S. EPA. Arsenic occurrence in public drinking water supplies; Technical report, Office of Water, 2000.
(21) U.S. EPA. Occurrence estimation methodology and occurrence findings report for the six-year review of existing national primary drinking water regulations; Technical report, Office of Water, 2003.
(22) U.S. EPA. The analysis of regulated contaminant occurrence data from public water systems in support of the second six-year review of national primary drinking water regulations; Technical report, Office of Water, 2009.
(23) Zhu, D. S. Strategies on China Urban Drinking Water Safety; Science Press: Beijing, 2008.
(24) Ministry of Water Resource, P.R.C. National water resource divisions; Technical report, 2002.
(25) Ministry of Environmental Protection, P.R.C. National Standards of the People's Republic of China: Quality standards for groundwater (GB/T 14848-93), 1993.
(26) Ministry of Health, P.R.C. National Standards of the People's Republic of China: Standards for drinking water quality (GB 5749-2006), 2006.
(27) Gelman, A.; Carlin, J. B.; Stern, H. S.; Rubin, D. B. Bayesian Data Analysis, 2nd ed.; Chapman & Hall: London, 2003.
(28) Qian, S. S.; Reckhow, K. H. Combining model results and monitoring data for water quality assessment. Environ. Sci. Technol. 2007, 41 (14), 5008–5013.
dx.doi.org/10.1021/es1038563 |Environ. Sci. Technol. 2011, 45, 6041–6048