Environ. Sci. Technol. 2010, 44, 2503–2508

Quantitative Structure-Property Relationship for Predicting Chlorine Demand by Organic Molecules

GEBHARD B. LUILO AND STEPHEN E. CABANISS*

Department of Chemistry and Chemical Biology, MSC03 2060, 1 University of New Mexico, Albuquerque, New Mexico 87131-0001

Received October 17, 2009. Revised manuscript received February 19, 2010. Accepted February 22, 2010.

Conventional methods for predicting chlorine demand (HOCldem) due to dissolved organic matter (DOM) are based on bulk water quality parameters and ignore structural features of individual molecules that may better indicate reactivity toward the disinfectant. The Quantitative Structure-Property Relationship (QSPR) modeling approach can account for structural properties of individual molecules. Here we report a QSPR for HOCldem based on eight constitutional descriptors. Model compounds with HOCldem ranging from 0.1 to 13.4 mol chlorine per mole compound were divided into a calibration and cross-validation data set (N = 159) and an external validation set (N = 42). The QSPR was calibrated using multiple linear regression in a 5-way leave-many-out approach and has average R2 = 0.86, standard error of regression (StdEreg) = 1.24 mol HOCl per mole compound, and p < 0.05. Internal cross-validation has average q2 = 0.85 and the external validation has q2 = 0.88, indicating a robust model. The leverage of 7 of 42 compounds in the external validation data set exceeded the critical value, suggesting that these compounds may be overextrapolated. However, root-mean-square error of prediction in the external validation was 1.17 mol HOCl per mole compound, and all compounds were predicted within ±2.5 standardized residuals (Sresid). Application of the QSPR to model structures of NOM predicts HOCldem comparable to reported measurements from natural water treatment.

* Corresponding author e-mail: [email protected]. DOI: 10.1021/es903164d. © 2010 American Chemical Society. Published on Web 03/15/2010.

1. Introduction

Freshwater contamination from microbial pathogens poses serious public health risks worldwide (1–3). Chlorination has been the most commonly used technology to eliminate microbes in drinking water (1, 4). However, residual chlorine reacts with traces of dissolved organic matter (DOM) in water to produce disinfection byproducts (DBPs), mostly through electrophilic substitution reactions (5). Disinfection byproducts include trihalomethanes, haloacetic acids, haloacetonitriles, halonitromethanes, and haloketones (6). Most DBPs are carcinogenic and tumorigenic in test animals (7) and are being regulated in most countries (8–10). Regulation is based on the precautionary principle (11) since little evidence links DBPs directly to reproductive problems and carcinogenicity in humans (12, 13). Since the first detection of DBPs in the 1970s, minimization of DBP production without compromising water quality has been a major challenge. A closely related problem is prediction of the amount of HOCl consumed, or chlorine demand (HOCldem). Quantitative prediction of HOCldem for different water supplies can help optimize chlorine dosages while maintaining disinfection and minimizing DBP production. Empirical models relate HOCldem to bulk water quality parameters like pH, ultraviolet absorption, and dissolved organic carbon (14–19). However, this approach treats DOM as a single, average entity, while it is actually a mixture of organic molecules with different chemical structures (20). The types, number, and arrangement of functional groups in each molecule strongly influence reactivity toward chlorine during water treatment (21–23).

An alternative approach develops predictive models based on molecular structure and applies them to postulated DOM molecules. This approach is not a substitute for traditional experimental measurements but can enable predictions of HOCldem due to potential changes in the DOM, for example, from changing pretreatment methods or land use within the watershed. A predictive model of DOM has been developed which provides composition data for thousands of molecules in a mixture (24, 25). What is lacking is a rapid and quantitative method to predict HOCldem from this molecular information.

QSPRs require minimal computing power and have been used successfully to predict physical properties of organic pollutants (26) and activities of pharmaceuticals (27). Common variables used in QSPR modeling include electrostatic (e.g., partial charge), geometric (e.g., molecular volume), quantum-chemical (e.g., dipole moment), and topological (e.g., Wiener index) descriptors (28). Constitutional descriptors reflect only the chemical composition without any reference to electronic structure and are attractive for work with large numbers of molecules because of their conceptual and computational simplicity (29).
Despite numerous applications of QSPRs in environmental studies, we are unaware of any attempts to predict HOCldem using this approach. Here we use experimental data from the literature on HOCl consumption by small molecules to develop, calibrate, and validate a QSPR that predicts HOCldem based solely on constitutional descriptors. This QSPR is then used to predict HOCldem by tannic acid and model DOM structures.

2. Data Collection and Model Calibration

2.1. Data Collection. HOCldem data for model compounds were obtained from publications over a ∼30-year span (Table 1). The complete list of compounds and HOCldem values is given in the Supporting Information (Tables S1, S2, S3), and some representative compounds are shown in Figure S1. These 201 data points include both aromatic and aliphatic compounds with carboxyl, amine, alcohol, phenol, ether, and other functional groups. The data were not acquired under consistent chlorination conditions; in particular, reaction times varied from 4 to 96 h (Table 1).

TABLE 1. HOCldem Data Sources and Reaction Conditions

| sources | chlorine source | chlorine dose | pH | time |
|---|---|---|---|---|
| Norwood et al. (30) | HOCl | 1.5-2.0 mol/mol-C | 7 | 0.3-4 h |
| de Laat et al. (21) | Cl2 | 2 mol/mol | 7 | 0.25 h |
| Boyce and Hornig (22) | Cl2 | 10-102 mol/mol | 7 | 24 h |
| Hureiki et al. (31) | Cl2 | 8-20 mol/mol | 8 | 72 h |
| Gallard and von Gunten (32) | NaOCl | 90 mol/2-6 mol | 8 | 20 h |
| Bull et al. (23) | Cl2 | 20 mg/L | 7 | 48 h |
| Dickenson et al. (33) | Cl2 | 4-7 mg/mL | 8 | 24 h |
| Hong et al. (19) | NaOCl | 10 mg/mg-C | 7 | 96 h |
| Bond et al. (34) | Cl2 | 35 mol/mol | 7 | 24 h |

Since reaction with HOCl can require several days (35), the HOCldem values in the shorter-time studies were adjusted upward by comparing the chlorine demand of compounds included in both longer- and shorter-time studies. Two studies could only be compared if one or more "common" compounds were used in both. The ratio of HOCldem at a shorter reaction time (s_i) to HOCldem at a longer reaction time (l_i) was calculated for each "common" compound. If the average ratio s_i/l_i for the two studies was less than 0.85, only the shorter-time HOCldem was adjusted using eq 1. The adjusted chlorine demands for the "common" compounds given in Tables S2 and S3 were used in calibration and validation along with those compounds which were not adjusted.

$$\mathrm{AdjHOCl_{dem}} = \mathrm{HOCl_{dem}} \times \frac{1}{N} \sum_{i=1}^{N} \frac{l_i}{s_i} \qquad (1)$$
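As a rough illustration of the adjustment in eq 1, the following Python sketch applies the 0.85 threshold on the mean s_i/l_i ratio and, when the threshold is met, scales a shorter-time value by the mean of l_i/s_i. The function name `adjust_hocl_dem`, the `(s_i, l_i)` pair layout, and all numerical values are assumptions made for illustration; they are not data from the cited studies.

```python
from statistics import mean

def adjust_hocl_dem(hocl_dem_short, common_pairs, threshold=0.85):
    """Adjust a shorter-time HOCldem upward following eq 1.

    common_pairs holds (s_i, l_i) tuples: the chlorine demand of each
    'common' compound in the shorter- and longer-time study (hypothetical
    layout chosen for this sketch).
    """
    # Average ratio of shorter-time to longer-time demand.
    mean_ratio = mean(s / l for s, l in common_pairs)
    if mean_ratio >= threshold:
        # The two studies agree closely enough; no adjustment is applied.
        return hocl_dem_short
    # Eq 1: multiply by (1/N) * sum(l_i / s_i).
    factor = mean(l / s for s, l in common_pairs)
    return hocl_dem_short * factor

# Made-up example: two common compounds, shorter-time demand of 3.0 mol/mol.
pairs = [(2.0, 4.0), (3.0, 4.0)]
adjusted = adjust_hocl_dem(3.0, pairs)
unchanged = adjust_hocl_dem(3.0, [(9.0, 10.0)])
print(adjusted, unchanged)
```

Note that the decision uses the mean of s_i/l_i while the correction uses the mean of l_i/s_i, as the text and eq 1 state; the two are not reciprocals of each other in general.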

2.2. Descriptors. Initially, 26 constitutional descriptors were calculated for each compound (definitions of each descriptor are given in Table S4). Descriptor variables are limited to those compatible with the AlphaStep data model (24), including atom counts (the number of atoms of each element), functional group counts (the number of each functional group, including the number of aromatic rings), and variables which can be calculated from those (for example, H:C ratio, O:C ratio, and number of phenol groups per ring). Two types of composite descriptors require explanation: the ring activation index (RAI) and the carbonyl index (CI). An RAI descriptor is motivated by the observation that aromatic rings with no substituents or only electron-withdrawing substituents show much less HOCldem than compounds with a single strong electron-donating substituent (-OH or -NH2), and rings with multiple electron-donating substituents show intermediate HOCldem. For example, benzoic acid and nitrobenzene consume [...]

[...] 0.6, q2 > 0.5, 0.85 < k < 1.15, (R2 - Ro′2)/R2 < 0.1 (41, 42). These results were comparable to those obtained when prediction using the averaged model (eq 2 and Table 2) was performed on each of the five cross-validation sets (Table S8). Finally, the predictive power of the average QSPR was tested using the entire training data (N = 159), giving q2 = 0.86, RMSE = 1.21, and MBD = -0.28% (Table S8). The plot of predicted HOCldem versus experimental HOCldem with and without the y-intercept had R2 = 0.86, k = 0.88 and Ro′2 = 0.85, k′ = 0.97, respectively (Figure S3), and the ratio (R2 - Ro′2)/R2 was 0.02. All these statistics of predictive power are comparable and meet the criteria for checking predictive power of QSPR models (41, 42). Therefore the average QSPR obtained is robust.
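The statistics quoted above can be computed from paired experimental and predicted values. The Python sketch below shows one common formulation, which is an assumption here because the defining equations are not part of this text: q2 as 1 minus the ratio of the prediction error sum of squares to the total sum of squares, k as the slope of predicted versus experimental values with an intercept, and k′ and Ro′2 from the same regression forced through the origin. The function names and the example arrays are hypothetical.

```python
import numpy as np

def validation_stats(y_exp, y_pred):
    """Predictive-power statistics for predicted vs experimental values."""
    y_exp = np.asarray(y_exp, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    resid = y_exp - y_pred
    # q2: 1 - PRESS / total sum of squares about the experimental mean.
    q2 = 1.0 - np.sum(resid**2) / np.sum((y_exp - y_exp.mean())**2)
    rmse = np.sqrt(np.mean(resid**2))
    # Regression of predicted on experimental with a y-intercept.
    k, _intercept = np.polyfit(y_exp, y_pred, 1)
    r2 = np.corrcoef(y_exp, y_pred)[0, 1] ** 2
    # Same regression with the intercept set to zero.
    k0 = np.sum(y_exp * y_pred) / np.sum(y_exp**2)
    ss_tot = np.sum((y_pred - y_pred.mean())**2)
    r0 = 1.0 - np.sum((y_pred - k0 * y_exp)**2) / ss_tot
    return {"q2": q2, "RMSE": rmse, "R2": r2, "k": k, "R0p2": r0, "kp": k0}

def meets_criteria(s):
    """Check the criteria visible in the text: q2, k, and (R2 - Ro'2)/R2."""
    ratio = (s["R2"] - s["R0p2"]) / s["R2"]
    return bool(s["q2"] > 0.5 and 0.85 < s["k"] < 1.15 and ratio < 0.1)

# Hypothetical values in mol HOCl per mole compound.
y_exp = [0.5, 2.0, 4.0, 6.5, 9.0, 12.0]
y_pred = [1.0, 2.4, 3.6, 6.0, 9.5, 11.2]
stats = validation_stats(y_exp, y_pred)
print(stats, meets_criteria(stats))

# The external validation values reported in the text pass the same check.
reported = {"q2": 0.88, "k": 0.90, "R2": 0.91, "R0p2": 0.87}
print(meets_criteria(reported))
```

For an external set, some formulations take the total sum of squares about the training-set mean rather than the test-set mean; which convention was used here is not stated in this text.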
Plotting predicted HOCldem against experimental HOCldem showed that the two were linearly related, with most of the points lying along the line bisecting the two axes and only a few points having high residuals (Figure 1). The plot of residuals against predicted HOCldem also supported the argument that the model has predictive power, as the data points scattered with no distinct pattern and only a few points fell outside ±2.5 standardized residuals (Figure S4). The results from LOOCV (N = 159) using statistical and graphical approaches were comparable to those obtained by LMOCV (Tables S7 and S8; Figures S5, S6, and S7). The consistency of the two results implies that there was no serious bias in the 5-way LMO data splitting employed to calibrate the QSPR model.

3.2.2. External Validation. The external validation data set (N = 42, Table S1) has HOCldem ranging from 0.1 to 11.0 mol HOCl per mole compound with a mean of 4.98 mol/mol and a standard deviation of 3.35 mol/mol. The HOCldem for each test compound was computed (eq 2 and Table 2), from which we obtained q2 = 0.88, RMSE = 1.17, and MBD = 11.42% (Table S9). The plot of predicted HOCldem versus experimental HOCldem with the y-intercept gave R2 = 0.91 and k = 0.90; when the regression was repeated with the intercept set to zero, we obtained Ro′2 = 0.87 and k′ = 1.05 (Figure S8), and the ratio (R2 - Ro′2)/R2 = 0.04. These statistics of predictive power were comparable to average values obtained when each of the five QSPR equations was used to predict the test compounds (Table S9). The y-permutation test was performed 60 times, and the average permuted R2 was 0.073 and q2 was -0.186, which are less than 0.3 and 0.05, respectively, implying that the model is robust. These external validation statistics are consistent with those obtained from LMO and LOO cross-validations, satisfying the criteria for model predictive power and robustness (41, 42). The plot of predicted HOCldem versus experimental HOCldem in the external validation data shows a good linear relationship with some bias toward overprediction for compounds with HOCldem < 9 mol/mol (Figure 2), consistent with the MBD of 11%. Nonetheless, all compounds in the external data set are predicted within ±2.5 Sresid (Figure 2, Figure S9).

FIGURE 2. Plot of predicted chlorine demand against experimental chlorine demands for external validation data (dotted lines are ±2.5 standardized residuals).

3.2.3. Model Applicability Domain. Results from the AD evaluation indicate that 4-iodophenol (Sresid = -2.94), leucine (Sresid = -2.70), and isoleucine (Sresid = -2.70) are outliers in terms of fits in the training data (Figure S4). However, these are not considered influential as they fall within the model AD. Although acetylacetone acid had a leverage (h = 0.30) above, yet close to, h* of 0.25 (Figure 3), it is not considered structurally influential in the determination of the model descriptors (39). Seven compounds in the external validation data had leverage h > h* that also exceeded that of acetylacetone. These compounds were 3-oxohexanedioic acid (h = 0.33), 3,4,5-trimethoxybenzyl alcohol (h = 0.33), 3-(3,4,5-trimethoxyphenyl) propionic acid (h = 0.33), 3,4,5-triethoxybenzyl alcohol (h = 0.33), 1,2,3-trihydroxybenzene (h = 0.36), 3-ethylaceto acetate (h = 0.54), and ornithine chlorohydrate (h = 0.54). Predictions for these compounds therefore involve overextrapolation, but absolute residuals were generally modest (Figure 3).
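The leverage comparison described above can be sketched in Python using the standard definition h = x (XᵀX)⁻¹ xᵀ, where X is the training descriptor matrix. This is a minimal sketch under stated assumptions: the descriptor values are random placeholders, the function name `leverages` is hypothetical, and whether an intercept column was included in the original calculation is not stated in this text. Only the critical value h* = 0.25 is taken from the text.

```python
import numpy as np

def leverages(X_train, X_query, add_intercept=True):
    """Leverage of each query row relative to the training design matrix."""
    X_train = np.asarray(X_train, dtype=float)
    X_query = np.asarray(X_query, dtype=float)
    if add_intercept:
        # Assumption: the regression includes a constant term.
        X_train = np.column_stack([np.ones(len(X_train)), X_train])
        X_query = np.column_stack([np.ones(len(X_query)), X_query])
    xtx_inv = np.linalg.pinv(X_train.T @ X_train)
    # Row-wise quadratic form x_i (X^T X)^-1 x_i^T.
    return np.einsum("ij,jk,ik->i", X_query, xtx_inv, X_query)

# Placeholder data: 30 training compounds, 3 descriptors each.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(30, 3))
# One query near the training data and one far outside it.
X_query = np.array([[0.1, -0.2, 0.3], [10.0, 10.0, 10.0]])

h_star = 0.25  # critical leverage quoted in the text
h = leverages(X_train, X_query)
outside_domain = h > h_star
print(h, outside_domain)
```

A compound flagged this way is structurally distant from the training set, so its prediction is an extrapolation even if its residual happens to be small.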

4. Mechanistic Implications of the Descriptors in the QSPR Model

The final model had 8 descriptors (RAI, ACN, CI, ArOH, AS, O:C, ArORnonact, and ArORact) that differed in their contribution to HOCldem. The ring activation index (RAI), which accounts for the ratio of OH and NH2 to the number of rings, is the most significant and represents the degree of ring activation that favors electrophilic substitution reactions. Anilinic and phenolic compounds consume more chlorine than nonactivated aromatics; however, more than one electron donor on the ring slows the reaction due to steric effects and antagonism among the substituents, particularly when they are ortho or para to each other. Aliphatic carbon bonded to reduced nitrogen, NH2 (ACN), the second most significant descriptor, represents increased HOCldem due to chlorine substitution on the amine, which accounts for most of the substitution reactions in amino acids (31). The remaining descriptors represent smaller but mechanistically distinct effects. β-Dicarbonyl compounds have acidic carbons that allow HOCl to abstract the proton easily to form keto-enolates. However, the acidity of the carbon decreases if it is sandwiched between two carboxylic acids or between a carboxylic acid and a keto or aldehyde group. Sulfur increases reactivity of the molecules because sulfur has lone pairs of electrons that could be donated, and thiols tend to be more reactive than alcohols; e.g., benzothiomide had higher chlorine demand than benzamide, as was the case for phenylthiourea and acetanilide. The more oxidized molecules (i.e., those with high O:C) are less reactive toward oxidants in general, including HOCl. For example, oxalic, fumaric, and maleic acid have HOCldem < 1 (21, 44). However, phenolic and carbonyl groups act as strong activators, resulting in an overall positive effect. Alkoxy groups are weak ring activators relative to OH and NH2. When alkoxy groups are present with strong ring activators (ArOH or ArNH2), they decrease activation due to antagonistic effects. Thus βArORnonact is negative, and the higher the number of alkoxy groups, the lower the HOCldem relative to aniline or phenol (23). However, when alkoxy groups are the only groups on a ring or they are present with deactivating groups, they activate the ring, and βArORact is positive, as observed in the case of 3,5-dimethoxybenzoic acid and 3,4,5-trimethoxybenzoic acid (23, 30).

FIGURE 3. Williams plot of standardized residuals against leverage for training and external validation data sets.

The QSPR presented here has several important limitations. The data set used here includes a variety of structures, but insufficient data were available for nucleic acids, highly conjugated nonring compounds, phosphorus compounds, and polymers, which may limit the range of applicability. The assumption of constant error for all HOCldem values means that the predicted values for low HOCldem compounds may have a large relative error (but not a large absolute error). Finally, constitutional descriptors do not describe chlorine demand mechanistically, and supplementing or replacing these with quantum chemical, geometrical, electrostatic, and/or topological descriptors may improve predictive power and extrapolation to molecules with diverse structures.

5. Prediction of Chlorine Demand of Large DOM Surrogates

The QSPR model is derived using model compounds of low molecular weight, but DOM contains larger molecules. The predictive power of the QSPR model (eq 2 and Table 2) on larger molecules was tested using tannic acid (C76H52O46, MW = 1701), which has an experimental chlorine demand of 32.4 mol/mol (∼35.5 mmol/g-C) at pH 7 (34). The predicted HOCldem with standard error of prediction was 33.42 ± 6.50 mol/mol (∼36.61 mmol HOCl/g-C). This prediction is slightly higher than the experimental value, consistent with the external validation results. An alternative test used two proposed model structures of fulvic acids, FA-1 and FA-2, which had molecular formulas of C43H44O25 (MW = 960) and C42H44O25 (MW = 948), respectively (45). The predicted HOCldem was 8.37 mol/mol (∼16.21 mmol HOCl/g-C) for FA-1 and 7.96 mol/mol (∼15.78 mmol HOCl/g-C) for FA-2. These HOCldem predictions are within the expected range of HOCldem of phenolic model compounds reported in the literature (21–23, 32) and are consistent with surface water data from the U.S. EPA (46), which have a mean HOCldem of ∼13 mmol HOCl/g C. Application of this QSPR to source water DOM requires plausible structure(s) of the DOM present in those sources. The agent-based model of NOM can produce such structures (24, 25) and has been successfully combined with metal complexation QSPRs to predict metal binding by NOM and humic substances (47). In future work, the HOCldem QSPR will be interfaced with this agent-based model to predict chlorine demand in source waters.
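The carbon-normalized values quoted above follow from dividing the molar chlorine demand by the mass of carbon in one mole of compound. The Python sketch below reproduces that arithmetic from the molecular formulas given in the text; the helper names and the atomic mass of carbon (12.011 g/mol) are assumptions of this sketch rather than details stated by the authors.

```python
import re

CARBON_MASS = 12.011  # g/mol, standard atomic mass of carbon (assumed here)

def carbon_count(formula):
    """Count carbon atoms in a simple formula such as 'C76H52O46'."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts.get("C", 0)

def mol_per_mol_to_mmol_per_g_c(demand, formula):
    """Convert mol HOCl per mol compound to mmol HOCl per gram of carbon."""
    grams_carbon_per_mol = carbon_count(formula) * CARBON_MASS
    return 1000.0 * demand / grams_carbon_per_mol

# Values from the text: tannic acid (experimental and predicted), FA-1, FA-2.
print(round(mol_per_mol_to_mmol_per_g_c(32.4, "C76H52O46"), 1))   # 35.5
print(round(mol_per_mol_to_mmol_per_g_c(33.42, "C76H52O46"), 2))  # 36.61
print(round(mol_per_mol_to_mmol_per_g_c(8.37, "C43H44O25"), 2))   # 16.21
print(round(mol_per_mol_to_mmol_per_g_c(7.96, "C42H44O25"), 2))   # 15.78
```

Normalizing to carbon mass is what allows the per-molecule predictions to be compared with the surface water mean of about 13 mmol HOCl/g C.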

Acknowledgments

We thank the University of New Mexico and National Science Foundation (#NSF DEB 0113570) for financial support and David Reckhow for initial ideas and valuable suggestions to this project.

Appendix A

βj: coefficient of jth descriptor
AD: applicability domain
AdjHOCldem: adjusted chlorine demand
DOM: dissolved organic matter
HOCldem: chlorine demand
LMOCV: leave-many-out cross-validation
LOOCV: leave-one-out cross-validation
MBD: mean bias deviation
MLR: multiple linear regression
QSPR: quantitative structure property relationship
RMSE: root-mean-square error
Sresid: standardized residuals
StdE: standard error of βj
StdEreg: standard error of regression

Supporting Information Available

Tables S1-S9, Figures S1-S9, protocol, results, and comparison of statistics for LOO cross-validation, and limitations of using constitutional descriptors. This material is available free of charge via the Internet at http://pubs.acs.org.

Literature Cited

(1) Surveillance for waterborne-disease outbreaks - United States, 1999-2000; Morbidity and mortality weekly report, surveillance summaries, Vol. 51/No. SS-8; Center for Disease Control and Prevention: 2002. http://www.cdc.gov/mmwr/preview/mmwrhtml/ss5108a1.htm.
(2) Acosta, C. J.; Galindo, C. M.; Kimaro, J.; Senkoro, K.; Urasa, H.; Casalo, C.; Corachan, M.; Esko, N.; Tanner, M.; Mshindo, H.; Lwilla, F.; Vila, J.; Alonso, P. L. Cholera outbreak in Southern Tanzania: Risk factors and patterns of transmission. Emerg. Infect. Dis. 2001, 7 (3), 583–587.
(3) Anderson, Y.; Bohan, P. Disease surveillance and waterborne disease outbreaks. In Water Quality Guidelines, Standards and Health; Fewtrell, L., Bartram, J., Eds.; IWA Publishers: London, 2000; Chapter 6.
(4) Christman, K. History of Chlorine; Chlorine Chemistry Council, Waterworld, September 1998. www.waterandhealth.org/drinkingwater/history.html.
(5) Larson, R. A.; Weber, E. J. Reaction Mechanisms in Environmental Organic Chemistry; Lewis Publishers: New York, 1994.
(6) Crittenden, J. C.; Trussell, R. R.; Hand, D. H.; Howe, K. J.; Tchobanoglous, G. Water Treatment: Principles and Design, 2nd ed.; Wiley & Sons Inc.: New York, 2005.
(7) Disinfectants and disinfectant byproducts: Environmental Health Criteria 216; WHO: Geneva, 2000. http://whqlibdoc.who.int/ehc/WHO_EHC_216.pdf.
(8) Iriarte, U.; Álvarez-Uriarte, J. I.; Lopez-Fonseca, R.; Gonzalez-Velasco, J. R. Trihalomethane formation in ozonated and chlorinated surface water. Environ. Chem. Lett. 2003, 1, 57–61.
(9) National primary drinking water regulations: disinfectants and disinfection byproduct; Final rule, Federal Registry, 63:241:69390; U.S. Environmental Protection Agency: 1998.
(10) Council directive 98/83/EC of November 3, 1998 on the quality of water intended for human consumption; European Union/Commission Legislative Documents: European Union, 1998.
(11) Richardson, S. D.; Simmons, J. E.; Rice, G. Disinfection byproducts: The next generation. Environ. Sci. Technol. 2002, 36 (9), 198A–205A.
(12) Porter, C. K.; Putnam, S. D.; Hunting, K. L.; Riddle, M. R. The effect of trihalomethane and haloacetic acid exposure on fetal growth in a Maryland County. Am. J. Epidemiol. 2005, 162, 334–344.
(13) Monographs on the evaluation of carcinogenic risks to humans, Volume 52: Chlorinated drinking-water; chlorination by-products; Some other halogenated compounds; Cobalt and cobalt compounds; Summary of data reported and evaluation: International Agency for Research on Cancer; IRC Press: Lyon, France, 1999.
(14) Gang, D. D.; Segar, R. J., Jr.; Clevenger, T. E.; Banerji, S. K. Using chlorine demand to predict THM and HAA9 formation. J. Am. Water Works Assoc. 2002, 94 (10), 76–85.
(15) Liang, L.; Singer, P. C. Factors influencing the formation and relative distribution of haloacetic acids and trihalomethanes in drinking water. Environ. Sci. Technol. 2003, 37, 2920–2928.
(16) Baxter, C. W.; Smith, D. W.; Stanley, S. J. A comparison of artificial neural networks and multiple regression methods for the analysis of pilot-scale data. J. Environ. Eng. Sci. 2004, 3, S45–S58.
(17) Shimazu, H.; Kouchi, M.; Yonekura, Y.; Kumano, H.; Hashiwata, K.; Hirota, T.; Ozaki, N.; Fukushima, H. Developing a model for disinfection by-products based on multiple regression analysis in a water distribution system. J. Water Supply: Res. Technol. AQUA 2005, 54 (4), 225–237.
(18) Fitzgerald, F.; Chow, C. W. K.; Holmes, M. Disinfectant demand prediction using surrogate parameters - a tool to improve disinfection control. J. Water Supply Res. Technol. Aqua. 2006, 55, 391–400.
(19) Hong, H. C.; Wong, M. H.; Liang, Y. Amino acids as precursors of trihalomethane and haloacetic acid formation during chlorination. Arch. Environ. Toxicol. 2009, 56, 638–645.
(20) Perdue, E. M.; Ritchie, J. D. Dissolved organic matter in freshwater. In Surface and Groundwater, Weathering, and Soils; Elsevier Inc.: San Diego, CA, 2005; Vol. 5, pp 273-318.
(21) de Laat, J.; Merlet, N.; Doré, M. Chlorination of organic compounds: chlorine demand and reactivity in relationship to the trihalomethane formation. Water Res. 1982, 16, 1437–1.
(22) Boyce, S.; Hornig, J. Reaction pathways of trihalomethane formation from the halogenation of dihydroxyaromatic model compounds for humic acid. Environ. Sci. Technol. 1983, 17, 202–211.
(23) Bull, R. J.; Reckhow, D. A.; Rotello, V.; Bull, O. M.; Kim, J. Use of toxicological and chemical models to prioritize DBP research; AWWA Research Foundation: 2006.
(24) Cabaniss, S. E.; Madey, G.; Leff, L.; Maurice, P. A.; Wetzel, R. A stochastic model for the synthesis and degradation of natural organic matter. Part I. Data structures and reaction kinetics. Biogeochemistry 2005, 76, 319–347.
(25) Cabaniss, S. E.; Madey, G.; Leff, L.; Maurice, P. A.; Wetzel, R. A stochastic model for the synthesis and degradation of natural organic matter. Part II. Molecular property distributions. Biogeochemistry 2007, 86, 269–286.
(26) Schwarzenbach, R. P.; Gschwend, P. M.; Imboden, D. M. Environmental Chemistry, 2nd ed.; John Wiley & Sons Inc.: Hoboken, NJ, 2003.
(27) Karelson, M.; Lobanov, V. S.; Katritzky, A. R. Quantum-chemical descriptors in QSPR/QSAR. Chem. Rev. 1996, 96 (3), 1027–1044.
(28) Katritzky, A. R.; Lobanov, V. S.; Karelson, M. QSPR: the correlation and quantitative prediction of chemical and physical properties from structure. Chem. Soc. Rev. 1995, 24, 279–287.
(29) Karelson, M. Molecular Descriptors in QSAR/QSPR; Wiley-Interscience: New York, 2000.
(30) Norwood, D. L.; Johnson, J. D.; Christman, R. F.; Hass, J. R.; Bobenrieth, M. J. Reactions of chlorine with selected aromatic models of aquatic humic material. Environ. Sci. Technol. 1980, 14 (2), 187–189.
(31) Hureiki, L.; Croué, J.-P.; Legube, B. Chlorination studies of free and combined amino acids. Water Res. 1994, 28, 2521–2531.
(32) Gallard, H.; Von Gunten, U. Chlorination of natural organic matter: Kinetics of chlorination and of THM formation. Water Res. 2002, 36, 65–74.
(33) Dickenson, E. V.; Summers, S.; Croué, J.-P.; Gallard, A. Haloacetic acid and trihalomethane formation from the chlorination and bromination of aliphatic β-dicarbonyl acid model compounds. Environ. Sci. Technol. 2008, 42, 3226–3233.
(34) Bond, T.; Henriet, O.; Goslan, E. H.; Parsons, S. A.; Jefferson, B. Disinfection byproducts and fractionation behavior of natural organic matter surrogates. Environ. Sci. Technol. 2009, 43, 5982–5989.
(35) Reckhow, D. A.; Singer, P. C.; Malcolm, R. L. Chlorination of humic materials: Byproduct formation and chemical interpretation. Environ. Sci. Technol. 1990, 24, 1655–1664.
(36) Formation of halogenated organics by chlorination of water supplies; US-Environmental Protection Agency (1975). National Service Center for Environmental Publication (NEPIS).
(37) Reusch, W. (1999) Virtual textbook of organic chemistry (2008 revision). http://www.cem.msu.edu/~reusch/VirtualText/intro1.htm.
(38) Minitab Inc. (2007). "Graphical data," Meet Minitab 15; pp 2-1 to 2-13. www.minitab.com.
(39) Harrell, F. E., Jr. Regression Modeling Strategies with Application to Linear Models, Logistic Regression, and Survival Analysis; Springer-Verlag: New York, 2001.
(40) Eriksson, L.; Jaworska, J.; Worth, A. P.; Cronin, M. T. D.; McDowell, R. M.; Gramatica, P. Methods for reliability and uncertainty assessment and for applicability evaluations of classification- and regression-based QSARs. Environ. Health Perspect. 2003, 111 (10), 1361–1375.
(41) Tropsha, A.; Gramatica, P.; Gombar, V. K. The importance of being earnest: Validation is the absolute essential for successful application and interpretation of QSPR models. QSAR Comb. Sci. 2003, 23 (1), 69–77.
(42) Golbraikh, A.; Tropsha, A. Beware of q2. J. Mol. Graphics Modell. 2002, 20 (4), 269–276.
(43) Rücker, C.; Rücker, G.; Meringer, M. y-randomization and its variants in QSPR/QSAR. J. Chem. Inf. Model. 2007, 47, 2345–2357.
(44) Orvik, J. A. Kinetics and mechanism of nitromethane chlorination. A new rate expression. J. Am. Chem. Soc. 1980, 102 (2), 740–743.
(45) Leenheer, J. A.; Brown, G. K.; MacCarthy, P.; Cabaniss, S. E. Models of metal binding structures in fulvic acid from the Suwannee River, Georgia. Environ. Sci. Technol. 1998, 32, 2410–2416.
(46) Obolensky, A.; Singer, P. C. Development and interpretation of disinfection byproduct formation models using the Information Collection Rule database. Environ. Sci. Technol. 2008, 42, 5654–5660.
(47) Cabaniss, S. E. Forward modeling of metal complexation by NOM: I. A priori prediction of conditional constants and speciation. Environ. Sci. Technol. 2009, 43, 2838–2844.
