Article pubs.acs.org/est
Variations of DOM Quality in Inflows of a Drinking Water Reservoir: Linking of van Krevelen Diagrams with EEMF Spectra by Rank Correlation
Peter Herzsprung,*,† Wolf von Tümpling,† Norbert Hertkorn,‡ Mourad Harir,‡ Olaf Büttner,† Jenny Bravidor,† Kurt Friese,† and Philippe Schmitt-Kopplin‡
† Helmholtz Centre for Environmental Research − UFZ, Brückstrasse 3a, 39114 Magdeburg, Germany
‡ Helmholtz Zentrum München, German Research Center for Environmental Health, Research Unit Analytical BioGeoChemistry (BGC), Ingolstädter Landstraße 1, 85764 Neuherberg, Germany
Supporting Information
ABSTRACT: Elevated concentrations of dissolved organic matter (DOM) such as humic substances in raw water pose significant challenges during the processing of the commercial drinking water supplies. This is a relevant issue in Saxony, Central East Germany, and many other regions worldwide, where drinking water is produced from raw waters with noticeable presence of chromophoric DOM (CDOM), which is assumed to originate from forested watersheds in spring regions of the catchment area. For improved comprehension of DOM molecular composition, the seasonal and spatial variations of humic-like fluorescence and elemental formulas in the catchment area of the Muldenberg reservoir were recorded by excitation emission matrix fluorescence (EEMF) and ultrahigh-resolution mass spectrometry (FT-ICR-MS). The Spearman rank correlation was applied to link the EEMF intensities with exact molecular formulas and their corresponding relative mass peak abundances. Thereby, humic-like fluorescence could be allocated to the pool of oxygen-rich and relatively unsaturated components with stoichiometries similar to those of tannic acids, which are suspected to have a comparatively high disinfection byproduct formation potential associated with the chlorination of raw water. Analogous relationships were established for UV absorption at 254 nm (UV254) and dissolved organic carbon (DOC) and compared to the EEMF correlation.
1. INTRODUCTION
Dissolved organic matter (DOM) is widespread in natural waterways and comprises a huge diversity of organic compounds.1 DOM is generated by the combined photochemical and biological transformation of organic precursors and/or by partial decomposition of organisms, and might be stored in soils for variable periods of time. A component of DOM, generally described as the colored or chromophoric DOM (CDOM), makes up a substantial part of freshwater DOM. The concentration of dissolved organic carbon (DOC) is frequently used as an indicator for DOM and usually ranges from 1 to 50 mg L−1 in freshwaters.2 The lowest DOC concentrations are observed in groundwaters, so-called clear-water lakes, and rivers draining from bare rock, whereas DOC concentrations are high in freshwaters draining peat and wetlands.3 Recent studies have revealed rising trends of DOC concentrations across large areas of North America and Europe.4−7 It has been reported that DOC concentrations in UK lakes and streams have almost doubled since 1988, whereas for Finnish lakes an average annual increase of total organic carbon (TOC) by about 0.1−0.2 mg L−1 has been observed in recent years.6
Increasing levels of organic carbon in freshwaters pose serious challenges for processing and the commercial supply of drinking water. Organic material in water can cause aesthetic problems such as an unpleasant taste, odor, and color. DOM does not pose a health risk itself but may become transformed into potentially harmful disinfection byproducts (DBPs) when subjected to raw water processing, which often includes treatment with reactive species such as free chlorine, ozone, chloramines, or chlorine dioxide.8−11 For these reasons, DOC-rich yellow−brown raw waters cause increased flocculation costs to remove the unwanted organic byproducts. Optimizing the processing of drinking water requires substantial research among engineers and water chemists directed at characterization and removal of humic substances from raw waters.12
In the southern Saxony region of Germany, raw drinking water is mainly obtained from reservoirs situated in the Ore Mountains (Erzgebirge). Most of these reservoirs are affected by elevated concentrations of humic substances, which are monitored by the drinking water administration13,14 via measurements of the DOC and the UV absorption at 254 nm (UV254).

Received: January 27, 2012
Revised: April 23, 2012
Accepted: April 23, 2012
Published: April 23, 2012
© 2012 American Chemical Society
dx.doi.org/10.1021/es300345c | Environ. Sci. Technol. 2012, 46, 5511−5518
Environmental Science & Technology
(see additional data in SI5.1) from the five sampling sites. The samples were taken carefully with a scooper, filled bubble-free into amber glass bottles, and stored in a refrigerator at 4 °C until analysis. Additionally, water samples were analyzed with respect to DOC, UV254, total inorganic carbon (TIC), and pH. The determination of DOC and TIC has been described previously.35,36 The UV measurements were carried out with a CADAS 2000 spectrophotometer (Dr. Lange, Duesseldorf, Germany).
3.2. EEMF Analysis. 3.2.1. Determination of Spectra. Water samples were filtered through Minisart cellulose acetate filters (pore size 0.22 μm; Sartorius Stedim Biotech). The filtrates were diluted about 2-fold with a phosphate buffer of pH 7 (Merck, Germany) before measuring fluorescence spectra to reduce inner filter effects and to obtain similar pH values. Excitation emission matrices (EEM) were generated using a SKALAR (The Netherlands) Fluo-Imager M53 spectrofluorometer, which was equipped with a xenon lamp and a photomultiplier as detector. The fluorescence intensity was recorded for emission wavelengths between 260 and 575 nm, while exciting at wavelengths ranging from 240 to 360 nm in 5 nm increments. In addition, on each measuring day, the EEM of Milli-Q water diluted with buffer of pH 7 was measured as a blank. The EEM were not normalized further (with respect to the water Raman peak or otherwise).
3.2.2. Evaluation of Humic-Like Fluorescence Intensity (HumLikeFluoInt). A blank subtraction was performed for each wavelength combination within the fluorescence matrix. No further correction of the inner filter effect (which was described in ref 37) was performed. A monotonic increase of fluorescence intensity with DOC concentration was confirmed with dilution experiments. Thus the inner filter effect could be neglected for calculation of HumLikeFluoInt ranks.
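The blank subtraction and the window-based intensity of Section 3.2.2 can be sketched as follows. This is a minimal illustration, not the evaluation routine used in the study: the dictionary layout keyed by (excitation, emission) wavelength pairs and the function names `blank_subtract` and `hum_like_fluo_int` are assumptions made here, while the default window (320−345 nm excitation, 410−440 nm emission) and the use of the median follow Section 3.2.2.

```python
from statistics import median


def blank_subtract(sample_eem, blank_eem):
    """Subtract the blank intensity at every (excitation, emission) pair.

    Both EEMs are assumed to be dicts keyed by (ex_nm, em_nm) tuples;
    this layout is hypothetical and chosen only for the sketch.
    """
    return {pair: value - blank_eem[pair] for pair, value in sample_eem.items()}


def hum_like_fluo_int(eem, ex_range=(320, 345), em_range=(410, 440)):
    """Median intensity inside the humic-like window (bounds inclusive)."""
    in_window = [
        value
        for (ex_nm, em_nm), value in eem.items()
        if ex_range[0] <= ex_nm <= ex_range[1] and em_range[0] <= em_nm <= em_range[1]
    ]
    return median(in_window)
```

Because only the ranks of the resulting intensities enter the later correlation, any distortion that leaves the order of the samples unchanged does not affect the outcome, which is the argument the text makes for neglecting the inner filter effect.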
Because protein-like fluorescence was never observed in the spectra of the investigated sampling sites, only the humic-like fluorescence area was evaluated. For simplicity, the HumLikeFluoInt was set equal to the median intensity value for the range 320−345 nm excitation and 410−440 nm emission for each sample (Figure S8). Fluorescence quenching effects and quantum yields were not considered for the calculation of fluorescence ranking since quantification of these effects was precluded by the unknown allocation of optical properties to discrete isomers within the DOM pool.
3.3. FTICR Mass Spectra Determination and Evaluation. The filtered samples (see EEMF analysis) were stored at 4 °C in a refrigerator prior to analysis. High-resolution FT-ICR mass spectra were acquired after electrospray ionization (ESI) in the negative mode. All detected masses were singly charged, since the mass difference between two isotopic components was equivalent to the mass of one neutron (1.003 Da). Applying other ionization techniques (in the positive or negative mode) may produce a considerable number of complementary elemental formulas.1,38 Solid-phase extraction, mass spectrometer, further acquisition details of mass spectra, and the extraction of reliable elemental formulas from data sets were described in ref 36. Whereas EEMF, UV254, and DOC represent the complete fraction of DOM insofar as these methods are sensitive to DOM, there are some limitations to using FT-ICR-MS for DOM analysis: FT-ICR-MS at present analyzes only the DOC which is extractable by solid-phase extraction. Non-extractable compounds may contribute to CDOM. In addition, only molecules with masses of 100−1000 Da were evaluated. While
For evaluation of raw water quality, easy, cheap, and fast analytical tools are required. Excitation emission matrix fluorescence spectroscopy (EEMF) is an easy-to-operate, low-cost method with the added advantage of a high sample throughput. EEMF is useful, for instance, to distinguish between protein- and humic- (and fulvic-) like CDOM15 and has been widely used for investigation of the optical properties of DOM16−20 and of its potential alteration reactions.21,22 EEMF thus provides sensitive bulk optical parameters with low structural resolution concerning DOM quality (even when coupled to size exclusion chromatography23). The protein-like fluorescence is supposed to originate mainly from tryptophan and tyrosine residues.24 However, the molecular origin of humic-like fluorescence is still not known in detail, and current knowledge is clearly insufficient for understanding the elemental composition of CDOM. Humic-like fluorophores have at least been discussed to be quinone-like.22 Fourier transform ion cyclotron resonance mass spectrometry (FT-ICR-MS) is a high-resolution analytical tool to determine the elemental compositions of thousands of DOM compounds directly out of mixtures. Although it lacks the ability to identify distinct chemical substances (i.e., isomers), the elemental compositions can nevertheless be allocated to biogeochemical pools by means of van Krevelen diagrams.
The stoichiometric ranges used to allocate natural organic matter (NOM) components (Supporting Information (SI) Figure S3) are quite variable in the literature.1,25−32 The attribution of DOM quality specific to the biogeochemical pools might be useful for deciding whether raw water is more or less appropriate for a certain kind of drinking water processing (flocculation, disinfection) due to a different DBP formation potential of DOM components.10,33 The aim of the study was to link EEMF with FT-ICR-MS spectra and to develop a new evaluation procedure to obtain appropriate rank data from the FT-ICR-MS data sets of different samples, termed “inter sample rankings analysis”. The procedure of connecting humic-like fluorescence intensity (HumLikeFluoInt) ranks with elemental formula specific ranks, which were obtained from inter sample rankings analysis, is a new tool to assess CDOM molecular properties and will be presented here.
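For readers unfamiliar with van Krevelen diagrams: each elemental formula is placed according to its atomic ratios, with O/C conventionally on the horizontal axis and H/C on the vertical axis. The following minimal sketch computes these coordinates from a formula string. The helper names `element_counts` and `van_krevelen_coordinates` are hypothetical and introduced only for illustration; the parser assumes simple formulas without brackets or charges.

```python
import re


def element_counts(formula):
    """Parse a simple elemental formula such as 'C17H14O11' into a dict."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def van_krevelen_coordinates(formula):
    """Return the atomic ratios (H/C, O/C) used as van Krevelen coordinates."""
    counts = element_counts(formula)
    carbon = counts["C"]
    return counts.get("H", 0) / carbon, counts.get("O", 0) / carbon
```

For C17H14O11, one of the components listed in Table 1, this gives H/C = 14/17 and O/C = 11/17, that is, an oxygen-rich and relatively unsaturated composition.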
2. STUDY AREA AND SAMPLING SITES
The catchment area of the Muldenberg drinking water reservoir in the Ore Mountains, Saxony, Germany (18.4 km2) was chosen as the study area (map and GPS data are provided in SI2). The 23% of the catchment which is forested is a possible source of leached DOM. In the 1980s, the forested watersheds were drained using elaborate trench systems. This resulted in partial mineralization and decomposition of the peat horizons.13 Currently, the trenches are not maintained and therefore a rewetting of the forested watersheds is observed. Previously, Åström et al.34 showed that such catchment areas leach more DOC into the waters than others. This was also observed in the catchment of the Muldenberg drinking water reservoir. The periodic sampling sites are the three main streams Rote Mulde (RMU), Weiße Mulde (WMU), and Saubach (SBA), the outflow of the pond Sauteich (STE), and the reservoir effluent (TSP).
3. METHODS
3.1. Sampling, Sampling Process, and Analysis. Water samples were collected in March, April, July, and August 2009
totCommonPre pool for the reasons explained above. The only practical possibility to compare the samples on the basis of their totCommonPre pool was to compare the relationships of mass peak intensities of the different samples. An initial statistical procedure to enable the comparison of data sets would be to calculate the ranking of mass peak intensities for each sample, based on the assumption of equal ionization potential for any given component in all compared samples, even if mass spectra of complex mixtures are known to be decisively affected by competition for ionization.1,39 However, the isomeric composition of detected molecular ions might vary in different samples, and even the isomer-specific ionization potentials might vary from sample to sample. Hence, the ranking of mass peak intensities according to elemental formulas (derived from ions) will not always correctly reflect the authentic distribution of original DOM molecules. Nevertheless, to minimize effects of selective ionization, all DOM samples were analyzed applying the same extraction procedure, the same enrichment factor, and the same ionization technique in FT-ICR mass spectrometry. Because only samples from a regionally confined area were compared, they showed quite similar distributions of elemental formulas; therefore, the relative inter sample ionization competition effects could be reasonably neglected. The procedure for rank analysis is explained on the basis of an example (additional details are explained in SI4.2). To analyze seasonal variations of components, the totCommonPre(4) pool (total common presence in four samples) from the Rote Mulde River (RMU3, RMU4, RMU7, RMU8, where 3 stands for March, 4 for April, 7 for July, and 8 for August) was calculated by sorting the joined elemental formula data sets according to increasing IUPAC mass as described in Section 3.4.1. The data set was examined for unique mass peaks and possible multiplets.
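The presence pools can also be derived by counting, for every elemental formula, the number of samples in which it occurs. The sketch below swaps in this counting approach for the sort-and-find-multiplets spreadsheet routine described in the text; it should give the same pools, but it is not the authors' implementation. The data layout (a dict mapping sample name to a dict of formula and intensity) and the function names are assumptions made for illustration.

```python
from collections import Counter


def presence_counts(samples):
    """Count in how many samples each elemental formula occurs.

    `samples` is assumed to map a sample name (e.g. 'RMU3') to a dict
    {formula: intensity}; this layout is hypothetical.
    """
    counts = Counter()
    for peaks in samples.values():
        counts.update(set(peaks))  # each formula counts once per sample
    return counts


def presence_pool(samples, min_presence, max_presence=None):
    """Formulas present in at least `min_presence` and at most `max_presence` samples."""
    upper = len(samples) if max_presence is None else max_presence
    counts = presence_counts(samples)
    return sorted(f for f, n in counts.items() if min_presence <= n <= upper)


def tot_common_pre(samples):
    """Formulas present in every sample (total common presence)."""
    return presence_pool(samples, min_presence=len(samples))
```

With four RMU samples, `tot_common_pre` corresponds to the totCommonPre(4) pool, `presence_pool(samples, 1, 1)` to the formulas found in only one sample, and intermediate settings to the partial common presence pools.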
Then, quartets of identical IUPAC masses (i.e., those occurring in all four RMU samples) were identified and the corresponding rows were extracted to the totCommonPre(4) data set. This data set was then divided according to sample number and sorted with regard to increasing mass. The ranks of mass peak intensities (massPeakRank) were calculated for each of the four samples one after another. The data set was then again sorted according to increasing IUPAC mass to regenerate the multiplet (quartet) alignment of associated lines. An additional column (intSampRank, SI4.2) was generated for a new rank calculation (inter sample rankings analysis). To illustrate this procedure, seven examples of components from the totCommonPre(4) pool are shown in Table 1. A total of 638 common components were found in all four samples from RMU. Component C17H14O11, for instance, received the 116th rank in RMU3, the 86th rank in RMU4, the 36th rank in RMU7, and the 52nd rank in RMU8. As a result of the inter sample rankings, the March sample received the fourth rank, the April sample the third rank, the July sample the first rank, and the August sample the second rank specific to this component. The components C15H22O2 and C13H10O9 demonstrate that sometimes two (or more) of the compared samples received the same rank specific to one component by chance. This implies that the distribution of the (four different) ranks was not necessarily equal within the four compared samples (a detailed balance is shown in Table S3). It can be assumed that the probability of receiving the same rank in this manner will decrease with increasing number of components in the totCommonPre pool.
3.4.3. Rank Correlation According to Spearman.40 The inter sample rankings analysis provided a component-specific
considering only CHO components for presence and ranking analyses, CHNO, CHOS, and CHNOS components (which might have contributed to CDOM) were in fact detected (as provided in the rank_correlation_database.xls in the Supporting Information) but in too few numbers to allow for a significant statistical analysis. 3.4. Data Analysis. The organization of the elemental formula evaluation is shown in Figure 1. From EEMF spectra
Figure 1. Overview of elemental formula data analysis.
the ranks were easily calculated from the HumLikeFluoInt of the different samples. The Spearman rank correlation was applied to determine the relationships between EEMF and FT-ICR-MS data.
3.4.1. Comparison of Elemental Formulas of Two Samples. An obvious option to compare the two elemental formula data sets A and B was to sort the data table into formulas which were present in both samples (common presence, CommonPre) and those which were present only in sample A or only in sample B (different presence, DifPre). Initially, the data were sorted according to increasing molecular weight (named as IUPAC (International Union of Pure and Applied Chemistry) mass in ref 36). A programmed Microsoft Excel routine was used to find unique IUPAC masses (DifPre) and doublets of IUPAC masses (CommonPre). The results of CommonPre or DifPre (sample A or B) were displayed using separate or color-coded van Krevelen diagrams.
3.4.2. New Evaluation Strategy to Assess Elemental Formula Intensities Based on Rank Analysis (Inter Sample Rankings Analysis). Because the expenditure of relative mass peak presence analysis massively increases with peak counts, a new approach to evaluate occurrences of elemental formulas was developed. For instance, in three samples A, B, and C, one could find formulas with total common presence, totCommonPre, (A, B, C), partial common presence, partCommonPre, (A, B; A, C; B, C), and with DifPre (only in A, only in B, only in C). A detailed discussion concerning the dimensions of the DifPre, partCommonPre, and totCommonPre pools is provided in SI4.1, addressing the question of which pools contain higher or lower intensities from a statistical point of view. As a result, the totCommonPre pools, in general, contained components with higher mass peak intensities than the other pools. A comparison of elemental formula data sets of three or more different samples was required on the basis of their
were obtained for each of the 455 components and were depicted in a van Krevelen diagram as described in the Results and Discussion section (Figure 4). The significance test was performed for positive correlations only: the null hypothesis r_s = 0 was rejected when r_s > r_s(N = 20, α), in which r_s is the rank correlation coefficient, N is the number of samples, and α denotes the significance level. Components with a negative correlation were defined as “non significant” as they reflect irregular correlations for the regarded issue.
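A minimal sketch of the rank correlation and the one-sided decision rule described above is given below. It computes the Spearman coefficient as the Pearson correlation of rank vectors, using average ranks for ties; this tie handling is the textbook convention and an assumption here, since the text does not state how ties were treated in the correlation. The critical value is passed in by the caller (for example, from a published table for N = 20 and the chosen α) rather than hard-coded. All function names are hypothetical.

```python
def average_ranks(values):
    """Rank values in ascending order; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of the 1-based positions start+1..end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman_rs(x, y):
    """Spearman coefficient as the Pearson correlation of the two rank vectors.

    Raises ZeroDivisionError if either input is constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def is_significant_positive(rs, critical_value):
    """One-sided test: only positive correlations above the critical value count."""
    return rs > critical_value
```

In the setting of the text, `x` would hold the fluorescence-based values (or ranks) of the 20 samples and `y` the component-specific ranks of one elemental formula across the same samples; negative coefficients never pass the test, matching the "non significant" classification.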
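Table 1. Examples of Results of Inter Sample Rankings Analysis of Elemental Formula Compositions Intensities

| component | sample | intensity | massPeakRank | intSampRank |
|---|---|---|---|---|
| C17H14O11 | RMU3 | 31,556,566 | 116 | 4 |
| C17H14O11 | RMU4 | 34,251,255 | 86 | 3 |
| C17H14O11 | RMU7 | 36,336,859 | 36 | 1 |
| C17H14O11 | RMU8 | 29,675,498 | 52 | 2 |
| C20H28O7 | RMU3 | 41,229,274 | 64 | 1 |
| C20H28O7 | RMU4 | 29,751,961 | 122 | 2 |
| C20H28O7 | RMU7 | 19,414,439 | 228 | 3 |
| C20H28O7 | RMU8 | 16,463,927 | 282 | 4 |
| C33H28O20 | RMU3 | 4,909,504 | 635 | 2 |
| C33H28O20 | RMU4 | 4,390,666 | 637 | 3 |
| C33H28O20 | RMU7 | 4,632,457 | 638 | 4 |
| C33H28O20 | RMU8 | 5,884,579 | 629 | 1 |
| C15H22O2 | RMU3 | 46,308,958 | 42 | 4 |
| C15H22O2 | RMU4 | 49,075,488 | 28 | 3 |
| C15H22O2 | RMU7 | 70,759,213 | 1 | 1 |
| C15H22O2 | RMU8 | 60,252,467 | 1 | 1 |
| C13H10O9 | RMU3 | 16,571,272 | 322 | 4 |
| C13H10O9 | RMU4 | 24,037,926 | 200 | 2 |
| C13H10O9 | RMU7 | 20,750,952 | 200 | 2 |
| C13H10O9 | RMU8 | 19,923,487 | 184 | 1 |
| C15H20O6 | RMU3 | 59,532,427 | 13 | 1 |
| C15H20O6 | RMU4 | 56,257,644 | 18 | 2 |
| C15H20O6 | RMU7 | 35,661,319 | 43 | 4 |
| C15H20O6 | RMU8 | 32,945,855 | 36 | 3 |
| C10H6O7 | RMU3 | 14,848,888 | 379 | 4 |
| C10H6O7 | RMU4 | 23,223,937 | 214 | 3 |
| C10H6O7 | RMU7 | 24,185,951 | 133 | 2 |
| C10H6O7 | RMU8 | 25,320,135 | 95 | 1 |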
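The two ranking steps behind Table 1 can be sketched as follows. The tie rule (tied entries share the lowest rank and the next rank is skipped) and the direction of massPeakRank (highest intensity receives rank 1) are inferred here from the values in Table 1, for example 42, 28, 1, 1 giving 4, 3, 1, 1; they are not stated explicitly in the text, and the function names are hypothetical.

```python
def competition_ranks(values):
    """Rank 1 goes to the smallest value; ties share the lowest rank."""
    return [1 + sum(other < value for other in values) for value in values]


def mass_peak_ranks(intensities):
    """Rank the peaks within one sample; the highest intensity gets rank 1."""
    return competition_ranks([-intensity for intensity in intensities])


def inter_sample_ranks(mass_peak_ranks_across_samples):
    """Rank one component's massPeakRank values across the compared samples."""
    return competition_ranks(mass_peak_ranks_across_samples)
```

Applied to the massPeakRank values of C17H14O11 in RMU3, RMU4, RMU7, and RMU8 (116, 86, 36, 52), `inter_sample_ranks` returns 4, 3, 1, 2, in agreement with the intSampRank column of Table 1.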
4. RESULTS AND DISCUSSION
4.1. Seasonal Variations of HumLikeFluoInt, UV254, and DOC. The HumLikeFluoInt was quite diverse in the various samples and at different seasons (Table 2). HumLikeFluoInt,

Table 2. Seasonal Variations of HumLikeFluoInt, UV254, and DOC Values
Column headings: HumLikeFluoInt (A.U.)a; UV254 (cm−1); DOC (mg L−1)
ranking pattern for a given data set from three or more different samples. From here, a comparison of this pattern with (for instance) the chromophoric properties of the samples was viable. The median fluorescence intensity was calculated for each sample as described in Section 3.2.2. From those intensities the fluorescence ranking was calculated for all of the investigated 20 samples (RMU, WMU, SBA, STE, and TSP in March, April, July, and August). A merged CommonPre pool was defined for the 20 samples from each component which was present in at least 19 of the 20 samples [partCommonPre(19) + totCommonPre(20)]. The partCommonPre(19) pool was defined as an entity of components which were present in exactly 19 of the 20 samples. The totCommonPre(20) pool was defined accordingly (components present in all 20 samples). The frequency values (presence of components in “n” samples) were calculated and assigned to each component (sample wise) as described in SI4.2 (Search1 columns in the exemplary tables). Together, 455 components (101 from the partCommonPre(19) pool and 354 from the totCommonPre(20) pool) were found to fulfill this criterion for the 20 samples. The reasons for using exactly partCommonPre(19) and totCommonPre(20) are explained in detail in SI4.2. Components from the DifPre or partCommonPre (