Environ. Sci. Technol. 2011, 45, 1021–1027
Prediction of Soil Sorption Coefficients using Model Molecular Structures for Organic Matter and the Quantum Mechanical COSMO-SAC Model †
KATHY L. PHILLIPS, DOMINIC M. DI T O R O , * ,‡ A N D S T A N L E Y I . S A N D L E R † Center for Molecular and Engineering Thermodynamics, Department of Chemical Engineering, University of Delaware, 150 Academy Street, Newark, Delaware 19716, United States and Department of Civil and Environmental Engineering, University of Delaware, DuPont Hall, Newark, Delaware 19716, United States
Received August 12, 2010. Revised manuscript received November 19, 2010. Accepted November 30, 2010.
The soil sorption coefficient, KOC, is an important property affecting the environmental fate of organic molecules. Difficulties associated with measuring KOC have led to many attempts to predict this property, but most rely on empirical descriptors for the soil phase determined from correlations with measured KOC data, and are thereby limited by the data quality and diversity. A new method is presented to predict KOC for nonionic organic compounds that requires only molecular structures. No calibration is performed. Using model humic acid (HA) and fulvic acid (FA) molecular structures from the literature, the soil organic matter is modeled as an organic solvent composed of HA or FA molecules. KOC is predicted as an organic solvent-water partition coefficient using the quantum mechanicsbased model COSMO-SAC. The log KOC values for a set of 440 diverse, environmentally relevant chemicals are predicted with a root-mean-square error of 0.84-1.08, depending on which model HA or FA is used.
Introduction The sorption of organic molecules to soil or sediment is a major factor determining their fate in the environment. For many classes of these compounds, organic matter (OM) is the primary sorption site in soil (or sediment), and the sorption process can be considered as partitioning between soil OM and the surrounding water. This can be described by the equilibrium distribution coefficient, Kd (L water/kg dry soil) Kd ≡
Ci,S Ci,W
(1)
where Ci,S (mg i/kg dry soil) and Ci,W (mg i/L water) are, respectively, the equilibrium concentrations of compound i sorbed by the soil and dissolved in the water. For a given compound, normalizing Kd by the total soil organic content * Corresponding author phone: (302) 831-4092; fax: (302) 8313640; e-mail:
[email protected]. † Department of Chemical Engineering. ‡ Department of Civil and Environmental Engineering. 10.1021/es102760x
2011 American Chemical Society
Published on Web 12/31/2010
reduces the variation in the partition coefficient caused by variability in soil composition. A measure of the soil organic content is the mass fraction of organic carbon (OC) in the soil, fOC (kg OC/kg dry soil), where OC means all carbon contained in the OM. The fOC-normalized soil-water equilibrium partition coefficient is commonly referred to as the soil sorption coefficient, KOC (L water/kg OC) KOC ≡
Ci,S 1 Ci,W fOC
(2)
Considerable time, expense, and potential experimental artifacts are associated with measuring KOC (1, 2), prompting numerous attempts to develop models for its estimation. The OM-water partitioning approach is generally suitable for modeling sorption to soil in cases where the OC content of the soil is sufficiently high (>0.1% (2)), and sorption to black carbon (3) or various mineral phases present in the soil is not significant. Only sorption of nonionic organic compounds to soils under these conditions is considered, for chemicals that are present at low concentrations in the environment, and assuming sorption in the linear regime. Nonlinear sorption has also been observed under some lowconcentration conditions, and neglecting this phenomenon may introduce some error into the model (4). More than 200 relationships for KOC estimation have been reported (2, 5-7) (see Supporting Information (SI) for additional references). Typically, the characteristics of each unique compound i (the “solute”) are represented by descriptors such as the octanol-water partition coefficient (KOW), and the properties of the soil are implicitly accounted for using empirically determined regression coefficients in a correlation between the descriptors and the experimental KOC data. This approach avoids the need for precise characterization of the soil. However, there are several limitations, including potentially unavailable solute descriptors, and the class-specificity of many correlations. Correlations are also limited by the quality of the measured KOC data used. For a given compound, measured KOC values may vary substantially due to both natural variations in soil composition beyond those accounted for by fOC-normalization, and artifacts associated with the use of different experimental methods (2). For example, the range in log KOC values measured for pyrene is 1.9 log units (8). Although some modelers have used techniques to avoid overfitting (9), it is likely that many other models are overfitted, leading to a false sense of accuracy (2). The limitations of the existing KOC estimation methods suggest that a more fundamental approach is needed. This would require direct characterization of soil OM, a heterogeneous material in which humic substances, including humic acids (HA) and fulvic acids (FA) (10, 11), are the most important components affecting the sorption of organic compounds. Model molecular structures for HAs and FAs based on nuclear magnetic resonance and other techniques (12-17) have been used as representations of soil OM with generally good results (8, 17-20). Ames and Grulke (18) used a model HA molecule to represent OM as a polymeric phase, and then predicted KOC from group contribution methods. However, the average error in their method was large compared to many other approaches, due mainly to errors in their water-phase predictions (18). Also, group contribution methods are limited to compounds for which suitable parameters are available. It has been demonstrated that the OM-water partitioning process can be treated like partitioning between an organic VOL. 45, NO. 3, 2011 / ENVIRONMENTAL SCIENCE & TECHNOLOGY
9
1021
solvent and water (5, 7, 21). The availability of molecular models for OM introduces the possibility of predicting KOC using standard quantum mechanical (QM) approaches for organic solvent-water partition coefficients. Niederer et al. (17) used model HA structures combined with the QM-based COSMO-RS model (22) to predict HA-air partition coefficients. Their results suggest that a similar approach would be suitable for predicting KOC. A major advantage of developing a QM approach to predict soil sorption is the ability to screen large and diverse sets of compounds, including compounds that have not yet been synthesized, since predictions can be made using only the molecular structures of the compounds (17). In this work, a new method is proposed for predicting KOC using the QM-based continuum solvation model COSMOSAC (Conductor-like Screening Model - Segment Activity Coefficient; described below) (23). This approach relies on three important assumptions derived from previous research: (1) that soil sorption can be modeled as partitioning between OM, represented as an organic solvent, and water; (2) that OM can be characterized by one or more of the 16 model HA or FA molecular structures compiled in ref 20 (hereafter referred to as “model OM”); and (3) that OM can be adequately described using a QM-based continuum solvation approach (5, 7). Utilizing these assumptions, KOC values can be predicted using the COSMO-SAC model, starting with only the molecular structures of the model OM, water, and the solutes.
Methods and Theory (1) Experimental KOC Data. The experimental KOC data set compiled from the literature by Winget et al. (5) was used to evaluate the accuracy of the KOC predictions. This data set contains KOC values for 440 nonionic organic compounds, some of which represent averages of measured KOC values from multiple soil and sediment types and different studies. Diverse classes of environmentally relevant compounds are included (e.g., aromatics, amines, and halogenated, nitro, S-, and P-containing compounds), with log KOC values from 0 to 6.5. (2) COSMO-SAC KOC Predictions. The COSMO-SAC model, described in detail elsewhere (23, 24), was developed within our research group and is a variation on the COSMORS model of Klamt (22). In brief, the model uses density functional QM methods combined with the dielectric continuum solvation model COSMO (25) to calculate the energy of each solute or solvent molecule and the charge density on its surface. The solvent is modeled as a mixture of solvent molecules; for example, the model HA or FA molecules in the case of OM. Methods based on statistical thermodynamics are then applied using the surface-charge distributions on the molecules to predict properties such as the activity coefficient for a solute in a given solvent. Partition coefficients are calculated from the ratio of the activity coefficients in the two phases, as indicated below. The overall approach is similar to that used by the COSMO-RS model (22), however the details of the two methods differ. As COSMO-RS is part of a proprietary software package, the complete details of that method are not publicly available. KOC Prediction from Activity Coefficients. Consider a twophase mixture of OM and water (W) to which a small amount of solute, i, is added. At equilibrium (26), xi,OMγi,OM ) xi,Wγi,W
(3)
where xi,j is the mole fraction of i in phase j, and γi,j(T,P,xi,j) is its activity coefficient at the given temperature (T) and pressure (P). In this work, a fixed temperature of 25 °C was used, and the pressure dependence of γi,j was neglected, as it is not important at low P (26). Our model assumes that the two phases consist of dry OM and pure water, as KOC data 1022
9
ENVIRONMENTAL SCIENCE & TECHNOLOGY / VOL. 45, NO. 3, 2011
are reported on a dry soil basis, and the model OM molecular structures do not include bound water. Equation 3 can be rearranged to define an OM-water partition coefficient on a mole fraction basis KX ≡
∞ γi,W γi,W xi,OM ) = ∞ xi,W γi,OM γi,OM
(4)
∞ is the infinite dilution activity coefficient for i in where γi,j phase j. The last (approximate) equality follows from the ∞ since pollutants are present at low assumption that γi,j = γi,j concentrations in the environment (i.e., xi,j f 0). KOC can be calculated from Kx with a conversion of units and normalization, giving
KOC =
∞ γi,W
MWW
∞ MWOCFWnC,OM γi,OM
(5)
where MWW ) 0.01802 kg/mol and MWOC ) 0.01200 kg/mol are the molecular weights of water and OC; Fw is the density of water (0.997 kg water/L water at 25 °C (27)); and nC,OM (mol OC/mol OM) is the number of moles of C per mole of OM, which can be obtained from the molecular formula of the model OM. Computational Methods. COSMO-SAC activity coefficient calculations involve the following steps: (1) generate a good initial guess for the 3-D molecular geometry of each solute and solvent molecule; (2) perform a gas-phase molecular geometry optimization using density functional theory (DFT); (3) perform a single-point calculation of the energy and surface charges for the molecule within a conductor, using DFT and the COSMO model; and (4) apply a statistical thermodynamics-based method to calculate the value of γi,j. The DFT calculations represent the majority of the timeinvestment required (up to several days of computational time), but need to be performed only once for each molecule, and the results can be stored in a database. Activity coefficients can then be calculated within a few seconds at the temperatures and compositions specified by the user during step 4. Although calculations in this work were all performed at a temperature of 25 °C, COSMO-SAC predictions can also be made at other environmentally relevant temperatures. Solute Geometries. DFT calculation results for many of the solutes were available in the COSMO-SAC database (28). For all remaining molecules, preliminary optimizations were conducted using the Gaussian03 software (29), because a good initial guess of the molecular structure has been shown to be important (30). The geometries optimized by Winget et al. (5) at the HF/MIDI! level were used as the starting point for a gas-phase geometry optimization using B3LYP/631G(d,p). A frequency calculation was performed at the same level, and the optimized structure was confirmed to represent a minimum by the absence of imaginary frequencies. Next, a final gas-phase geometry optimization was performed using the Amsterdam Density Functional (ADF) program (31) with the GGA Becke Perdew exchange correlation functional and the TZ2P basis set with a small core. ADF defaults were used for the other settings. This geometry was used as the input to the single-point solvation calculation at the same level, using the COSMO model implemented in ADF. Solvent Geometries. DFT results for water were available in the COSMO-SAC database, so no further calculations were needed. For the OM, the 16 model HA and FA molecular structures compiled in ref 20 were used. The SMILES strings were converted to 2-D structures using the Online SMILES Translator and Structure File Generator (32). Because of the large size of the molecules, a stepwise approach was taken to determine the gas-phase 3-D geometry. Using Gaussian03,
a preliminary calculation was performed using the AM1 semiempirical method, followed by a QM geometry optimization at the HF/6-31G(d) level, then a further optimization and frequency calculation at the B3LYP/6-31G(d,p) level. The absence of imaginary frequencies was confirmed. Because of the high computational demand associated with optimizing the large model OM molecules, no higher-level gas-phase geometry optimization was performed in ADF. A single-point COSMO calculation was performed by the same method as for the solutes. Activity Coefficients. The COSMO-SAC program was used to predict infinite dilution activity coefficients (xi,j ) 10-12) for each solute in water and in the model OM solvent. The version of COSMO-SAC described in ref 23 was used with modified values of some parameters, as listed in the SI. (3) SPARC KOC Predictions. For comparison with the COSMO-SAC predictions, the SPARC (33) method was also used to predict KOC values based on each model OM. SPARC is available as a free online calculator that predicts properties for organic compounds from their 2-D molecular structures, using molecular interaction models based on chemical structure theory and calibrated using experimental data (34, 35). To obtain the KOC predictions, the SPARC OM-water distribution coefficient, P (L water/L OM), was calculated for each solute based on its SMILES string, and a unit conversion and normalization was performed. KOC ) P
MWOM MWOCFOMnC,OM
(6)
The molecular weight of OM, MWOM (kg OM/mol OM), was determined from the molecular formula of the model OM. The density for each of the model OMs, FOM (kg OM/L OM), was estimated using SPARC. Calculations could not be performed using the Stevenson-HA model OM as this compound was too large to be processed by SPARC, nor for the two solutes azoxybenzene and 3-chloro-4-methoxyaniline which contain functional groups not supported by SPARC.
Results and Discussion The model HA and FA structures used in this work were each determined using samples from one of three OM classes: terrestrial HA, aquatic HA, or aquatic FA (20). To evaluate the effect on the KOC prediction of using different OM models, calculations were performed using the 16 individual model OMs and one mixture of the model OMs for each of the 3 classes. As OM is a heterogeneous substance, mixtures are expected to better characterize its properties. We examined two approaches for determining KOC values corresponding to a mixture of model OMs. In the first approach, the “mixture” KOC value for a solute was taken to be the arithmetic mean of the KOC predictions using the individual model OMs within that class. In the second approach, we defined the organic solvent in COSMO-SAC as an equimolar mixture of the model OMs within a class, and directly calculated the partition coefficient for the mixed solvent. Both approaches led to almost identical KOC values, suggesting that solutesolvent interactions in the pure solvents and in the mixture are not dramatically different. This result is not surprising considering the range in KOC values for a given solute based on the individual model OMs within a class is not large (see discussion below). Only results obtained using the arithmetic mean approach are presented here. All results discussed below are available in the SI. Unless otherwise indicated, prediction accuracy is reported as the root-mean-square error (RMSE) in log KOC relative to the experimental data reported by Winget et al. (5). Evaluation of COSMO-SAC Predictions Relative to Average Experimental Data. The KOC predictions were first evaluated by comparing to the experimental values in the
FIGURE 1. Comparison of root-mean-square errors in the log KOC predictions based on different model HA and FA molecular structures as representations of OM. RMSEs are calculated using experimental data from the Winget et al. (5) compilation. Black bars: COSMO-SAC predictions for 440 solutes; gray bars: SPARC predictions for 438 solutes. Winget et al. (5) data set, which includes averages of multiple KOC measurements. The data do not contain information about the variability in measured KOC values resulting from differences in soil composition, which will be discussed later. The COSMO-SAC KOC predictions using each of the 19 models for OM agree generally well with the experimental data, with RMSE ) 0.84-1.08 (Figure 1). That is, the COSMOSAC method can predict the KOC values for a diverse set of compounds to within approximately an order of magnitude of the average experimental value, without any calibration of the model using experimental KOC data. Predictions that deviate from the average KOC value may still be within the range of the measured values, once variations in soil composition are taken into account (see below). The accuracy of the predictions supports the use of the organic solvent-water partitioning approach and suggests that the model OMs are good molecular models for the species involved in sorption to soil under the conditions represented by the data. By comparison with the COSMO-SAC results, the RMSEs in the SPARC predictions (1.05-1.96) are always higher when using the same model for OM (Figure 1). Winget et al. (5) and Klamt et al. (7) each developed QM, continuum solvationbased methods with RMSEs of 1.38 and 0.63, respectively, for the Winget et al. (5) data set. However, both methods used the measured KOC values for 387 of the 440 compounds to fit the 5 or 6 adjustable parameters in their models. Recently, Goudarzi et al. (9) reported a new quantitative structure-property relationship that achieved an RMSE of 0.29 for a test set of 31 pesticides, after calibration using data for other pesticides. However, the model has not been tested for the much larger and more diverse Winget et al. (5) data set. The COSMO-SAC KOC predictions corresponding to the Suwanee River FA model OM SR-FA-a agree most closely with the experimental data (RMSE ) 0.84; Figure 2A). For 76% of the compounds the log KOC predictions are within 1 log unit of the measured value. Only 8 of the 440 predictions deviate from the measured log KOC by more than 2 log units (amitrole, asulam, 3,5-dinitrobenzamide, tricyclazol, benzo[f]quinoline, urea, thiabendazole, 7H-dibenzo[c,g]carbazole). The KOC values for many of these compounds are also poorly predicted by other models discussed above (5, 7). There is no correlation between the prediction accuracy and the measured log KOC values, or predicted values of log KOC, ∞ ∞ , log γi,OM , the molecular volume of the solute log γi,W VOL. 45, NO. 3, 2011 / ENVIRONMENTAL SCIENCE & TECHNOLOGY
9
1023
FIGURE 2. Comparison of predicted and measured values of (A) log (KOC [L water/kg OC]) and (B) log (KOW [L water/L wet octanol]) for compounds in the Winget et al. (5) data set. KOC predictions were made using the SR-FA-a model molecular structure for OM. The one-to-one line (i.e., prediction ) measurement) is indicated by the solid line, and deviations of (1 log unit are marked by the dashed lines. Most predictions are within 1 log unit of the measured value for both properties.
FIGURE 3. Predicted log (KOC [L water/kg OC]) values as a function of the logarithms of the infinite dilution activity coefficients in the (A) water phase and (B) OM phase for compounds in the Winget et al. (5) data set. Predictions were made using the SR-FA-a model molecular structure for OM. The one-to-one line (i.e., log KOC ) log γ∞) is indicated by the solid line, and linear regressions fitted to the values shown are marked by the dotted lines. (McGowan characteristic volume (36), calculated using 37), or solvated solute polarity. As the COSMO-SAC KOC value is calculated from the ratio ∞ ∞ and γi,OM , the predicted activity coefficients can offer of γi,W insights into the molecular-level processes influencing KOC. ∞ varies from -0.11 to +9.7, For the 440 compounds, log γi,W ∞ varies from only -1.5 to +1.7. Clearly, the whereas log γi,OM variability in predicted KOC between different solutes is mainly ∞ due to changes in the solute-water interactions (i.e., the γi,W value) with a much smaller contribution from changes in the ∞ ). Only two of the compounds solute-OM interactions (γi,OM ∞ ∞ ∞ > γi,W . In all other cases, γi,W (urea and amitrole) have γi,OM is the dominant term affecting KOC, as reflected in the strong linear correlation (r2 ) 0.968, p < 0.01) between the predicted ∞ values (Figure 3A). Because the aqueous log KOC and log γi,W solubility of any sparingly soluble compound (on a mole fraction basis) is the reciprocal of γ∞i,W, this result is consistent with previous work showing that KOC can be predicted from a log-linear correlation with the aqueous solubility (2, 38). A weaker linear correlation exists between log KOC and log ∞ (r2 ) 0.608, p < 0.01; Figure 3B). The generally lower γi,OM ∞ is consistent with the sensitivity of KOC to the value of γi,OM observation that KOC values for most compounds can be predicted accurately using a variety of approximate molecular models for OM. ∞ ∞ and log γi,OM are also linearly Interestingly, log γi,W correlated (r2 ) 0.774, p < 0.01), suggesting that the solute 1024
9
ENVIRONMENTAL SCIENCE & TECHNOLOGY / VOL. 45, NO. 3, 2011
interacts with the model OM in a manner related its interaction with water. One interpretation is that hydrogen bonding is a major factor affecting solvation in both the water and the model OMs. Like water, the model OMs are all capable of both donating and accepting hydrogen bonds (HBs), since they contain several -OH functional groups and some also contain -NH groups. Therefore, in both solvents, dissolution of a solute molecule requires breaking any existing solvent-solvent HBs to create a cavity within which to place the solute. This may be followed by the formation of new solute-solvent HBs. Others have also reported that hydrogen bonding may affect the sorption properties of OM (5, 8). To probe this idea, we examined the change in the COSMO∞ prediction resulting from artificially “turning-off” SAC γi,OM hydrogen bonding (i.e., setting the COSMO-SAC parameter cHB ) 0). As expected, eliminating the solvent’s internal HB network acted to reduce γ∞i,OM for all solutes (i.e., solute-model OM interactions became more favorable), while in a competing effect for solutes that are capable of HB interactions, eliminating solute-solvent HBs decreased the strength of ∞ ). These the solute-model OM interaction (increased γi,OM findings support the importance of both internal and external hydrogen bonding involving the model OM. However, neglecting HBs involving OM strengthened the linear rela∞ ∞ and log γi,OM (r2 ) 0.862, p < 0.01), tionship between log γi,W therefore the relationship cannot be attributed to common hydrogen bonding features of model OM and water. Instead,
the difference in strength of the HBs for the model OM compared with water weakens the relationship between log ∞ ∞ γi,W and log γi,OM . Properties including the molecular volume ∞ and of the solute are significant for determining both γi,W ∞ γi,OM, and contribute to the correlation between them. The COSMO-SAC model tends to under-predict KOC; that is, the average measurements indicate stronger sorption to the soil than predicted. The deviations between measured and predicted log KOC values range from -2.62 for amitrole to +1.87 for di-2-ethylhexyl phthalate, with a median value of -0.31. It is of interest to assess the extent to which general deficiencies in the COSMO-SAC model for predicting organic solvent-water partition coefficients contribute to these deviations, as distinct from incomplete characterization of the OM or other factors. A useful approach is to examine the accuracy with which COSMO-SAC predicts KOW, since, unlike OM, wet octanol is a well-defined phase. Experimental values of KOW have been compiled by Winget et al. (5) for 316 of the compounds in their data set. The RMSE in COSMO-SAC log KOW values is 0.70 (Figure 2B). The error bounds on the data were not reported. The prediction errors in COSMO-SAC KOC and KOW values are correlated (r2 ) 0.308, p < 0.01), although there is substantial scatter with several outliers. The median deviation between predicted and measured log KOW values is -0.19, suggesting that the slight bias toward under-predicting KOC may be a general feature of COSMO-SAC organic solvent-water partition coefficient calculations with the model’s current parameterization. For activity coefficient calculations, the COSMOSAC model utilizes 15 universal parameters that were optimized based on small molecule vapor-liquid equilibrium data. KOC predictions may be improved by reparameterizing COSMO-SAC using data that more closely resemble organic solvent-water partitioning, such as liquid-liquid equilibrium data. Also, there are continuing improvements to both the ADF and COSMO-SAC programs. Another factor potentially affecting the accuracy of the COSMO-SAC partition coefficient predictions is the presence of multiple stable conformations for some solutes. Previous work has demonstrated that different conformationssthat is, different spatial arrangements of the atoms in the same moleculescan significantly affect the thermodynamic properties predicted using COSMO-based models (17, 30, 39). In this work, property predictions were based only on the lowest energy conformation we found for each species. Although this method often provides a good approximation (30), a more rigorous approach is to use a Boltzmann-weighted averaging of the properties corresponding to all likely conformations. Conformational differences in the model OMs are less likely to be important, since they have previously been found to have little effect on the properties predicted (17). A further limitation of the current approach is that, for reasons of computational efficiency, gas-phase optimized geometries for the solutes, water, and model OMs were used to perform the condensed phase (COSMO) calculations. This assumes that the molecular geometry does not change significantly upon solvation, which may not be a good approximation in all cases (40). Better condensed-phase property predictions may result from relaxing the geometry upon solvation. However, COSMO-based approaches use a conductor, rather than the real solvent of interest, to represent the condensed-phase during the DFT calculations. Therefore, reoptimizing the geometry in the presence of the conductor would not take into account the effect of specific solutesolvent interactions on the solute geometry; to do so is not presently feasible. It has been widely reported that linear correlations between log KOC and log KOW can provide a useful method to estimate KOC values (2, 38). There is a strong linear
correlation between the log KOC and log KOW values predicted using COSMO-SAC log KOC ) 0.809log KOW + 0.0674 r2 ) 0.987, p < 0.01 (7) ∞ being the dominant factor deterThis results from γi,W mining both KOC and KOW for most solutes. Like KOC, KOW is strongly log-linearly correlated with γ∞i,W (r2 ) 0.966, p < 0.01). However, there is essentially no correlation between KOW and the infinite dilution activity coefficient in the wet octanol 2 ∞ phase (γi,wet oct; a log-linear regression leads to r ) 0.065, p < 0.01). Also, there is only a weak correlation (r2 ) 0.528, ∞ ∞ p < 0.01) between log γi,wet oct and log γi,OM. Therefore, for ∞ ∞ ∞ solutes for which γi,W . (γi,OM, γi,wet oct), KOC values can be predicted with similar accuracy either from eq 7 (using COSMO-SAC log KOW values) or from a first-principles ∞ approach using model OMs. However, in cases in which γi,W ∞ ≈ γi,OM, accurate KOC values require the OM phase to be modeled directly, as its properties are not adequately described by the properties of wet octanol. Evaluation of COSMO-SAC Model As a Method to Account for Variability in OM. Thus far, the KOC predictions have been evaluated relative to averages of KOC measurements. As the variability in OM leads to a range in KOC values measured in different soils, new prediction methods need to account for differences in the composition of OM if greater accuracy is to be obtained (2, 8). For eight of the compounds in our study, Niederer et al. (8) have compiled literature values of KOC measured in different soil OM fractions. For a given solute, the COSMO-SAC log KOC predictions vary by 0.20-1.80 depending on which model for OM is used. This range is similar to the range of 0.34-1.87 in log KOC values measured for a given compound in different soil fractions (8). The range in SPARC log KOC predictions (0.75-6.13) is unreasonably high compared with the measured variability. In all cases, variability occurs both within and between OM classes. For four of the eight compounds for which data are available (toluene, 1,1,2-trichloroethylene, 1,4-dichlorobenzene, and 1,2,4-trichlorobenzene), there is a high degree of overlap between the measured and predicted KOC values. For example, for toluene, log KOC values measured in HAs and FA extracted from soil or coal range from 1.91-2.27 for the HAs, with a single value of 1.13 for a FA (8). The COSMOSAC log KOC predictions of 1.72-2.26 are within the range of the experimental data, and indicate that, for certain compounds, the COSMO-SAC method using different model OMs can describe much of the variability in KOC values caused by differences in the composition of OM. However, the predictions do not span the complete range of the measurements, suggesting that additional models for OM may need to be incorporated, as they become available, to describe the full range in the sorption properties of OM. In contrast, there is no overlap between the measured and predicted KOC values for the remaining four compounds, which are all planar polycyclic aromatic hydrocarbons (naphthalene, anthracene, phenanthrene, and pyrene). The measured values are substantially higher than the predictions. For example, the predicted log KOC values are 2.66-3.58 for pyrene, whereas the experimental range, based on HAs and FAs from terrestrial and aquatic OM samples, is 3.64-5.51. It has been reported that planar compounds sorb particularly strongly to black carbon and other carbonaceous materials in the soil distinct from OM (41). The presence of these additional sorbents would lead to higher measured KOC values than those predicted on the basis of sorption only to OM, and this may explain the difference between the measurements and predictions. Niederer et al. (8) have suggested selecting a model for estimating KOC based on a class of OM that most closely
VOL. 45, NO. 3, 2011 / ENVIRONMENTAL SCIENCE & TECHNOLOGY
9
1025
resembles the OM of interest. However, this approach would be limited by the difficulty associated with characterizing the OM. Further, while the COSMO-SAC predictions can give an indication of the range of KOC values corresponding to different soils, predictions using OM models from a specific class are not necessarily in better agreement with measurements based on a similar class of OM. For example, for toluene, the measured KOC value for the FA fraction is lower than for the terrestrial HA fractions (8). The lowest predicted KOC value also corresponds to a model FA, but some other model FAs lead to higher predictions than some model terrestrial HAs. The log KOC predictions of 1.91 for the aqueous FA mixture and 1.93 for the terrestrial HA mixture are not sufficiently different to allow the prediction of trends based on OM class. As most KOC values are determined primarily by the solute-water interactions, any of the OM models can provide reasonable KOC predictions for a variety of OM types.
Acknowledgments This material is based upon work supported in part by the National Science Foundation under Grant GOALI-0853685 and in part by the National Institute of Environmental Health Sciences through a Superfund Basic Research Program Grant (5R01ES015444). We thank Christopher Cramer (University of Minnesota) for providing coordinates for the solutes in electronic format; Undine Kipka (University of Delaware [UD]) for useful discussions, providing McGowan characteristic volumes and solute class characterizations, and for performing many of the SPARC calculations; Russell Burnett (UD) for assistance with the COSMO-SAC calculations; and Jeffrey Frey (UD) and Lionel Carreira (SPARC) for computing support.
Supporting Information Available List of publications investigating KOC estimation; Figures S1-S7: plots of the relationships between various predicted and measured properties discussed in the article; Tables S1-S6: list of the model OMs, the COSMO-SAC parameters, the solutes, and all predictions, and experimental data. This material is available free of charge via the Internet at http:// pubs.acs.org.
(10)
(11)
(12) (13)
(14)
(15)
(16)
(17) (18)
(19)
(20) (21) (22)
Literature Cited (1) Sabljic, A. Predictions of the nature and strength of soil sorption of organic pollutants by molecular topology. J. Agric. Food. Chem. 1984, 32, 243–246. (2) Doucette, W. J. Quantitative structure-activity relationships for predicting soil-sediment sorption coefficients for organic chemicals. Environ. Toxicol. Chem. 2003, 22, 1771–1788. (3) Cornelissen, G.; Haftka, J.; Parsons, J.; Gustafsson, O. Sorption to black carbon of organic compounds with varying polarity and planarity. Environ. Sci. Technol. 2005, 39, 3688–3694. (4) Liu, W.; Xu, S.; Xing, B.; Pan, B.; Tao, S. Nonlinear binding of phenanthrene to the extracted fulvic acid fraction in soil in comparison with other organic matter fractions and to the whole soil sample. Environ. Pollut. 2010, 158, 566–575. (5) Winget, P.; Cramer, C. J.; Truhlar, D. G. Prediction of soil sorption coefficients using a universal solvation model. Environ. Sci. Technol. 2000, 34, 4733–4740. (6) Wauchope, R. D.; Yeh, S.; Linders, J.; Kloskowski, R.; Tanaka, K.; Rubin, B.; Katayama, A.; Kordel, W.; Gerstl, Z.; Lane, M.; Unsworth, J. B. Pesticide soil sorption parameters: Theory, measurement, uses, limitations and reliability. Pest Manage. Sci. 2002, 58, 419–445. (7) Klamt, A.; Eckert, F.; Diedenhofen, M. Prediction of soil sorption coefficients with a conductor-like screening model for real solvents. Environ. Toxicol. Chem. 2002, 21, 2562–2566. (8) Niederer, C.; Schwarzenbach, R. P.; Goss, K. U. Elucidating differences in the sorption properties of 10 humic and fulvic acids for polar and nonpolar organic chemicals. Environ. Sci. Technol. 2007, 41, 6711–6717. (9) Goudarzi, N.; Goodarzi, M.; Araujo, M. C. U.; Galvao, R. K. QSPR modeling of soil sorption coefficients (K-OC) of pesticides using 1026
9
ENVIRONMENTAL SCIENCE & TECHNOLOGY / VOL. 45, NO. 3, 2011
(23) (24) (25)
(26) (27) (28)
(29)
SPA-ANN and SPA-MLR. J. Agric. Food. Chem 2009, 57, 7153– 7158. Senesi, N. Organic pollutant migration in soil as affected by soil organic matter. Molecular and mechanistic aspects. In Migration and Fate of Pollutants in Soils and Subsoils; NATO-ASI Series Vol. G32; Petruzzelli, D., Helfferich, F. G., Eds.; Springer: Berlin, 1993; pp 47-74. Hawker, D. W.; Boonsaner, M.; Connell, D. W.; Sripongpun, G. The role of organic matter components in soil on partitioning of chlorohydrocarbons. Toxicol. Environ. Chem. 1999, 71, 289– 307. Aiken, G. R., McKnight, D. M., Wershaw, R. L., MacCarthy, P., Eds. Humic Substances in Soil, Sediment, and Water: Geochemistry, Isolation, and Characterization; Wiley: New York, 1985. Wilson, M. A.; Vassallo, A. M.; Perdue, E. M.; Reuter, J. H. Compositional and solid-state nuclear-magnetic-resonance study of humic and fulvic-acid fractions of soil organic-matter. Anal. Chem. 1987, 59, 551–558. Leenheer, J. A.; McKnight, D. M.; Thurman, E. M.; P., M. Structural components and proposed structural models of fulvic acid from the Suwannee River. U.S. Geol. Surv. Water-Supply Pap. 1995, 195–211. Leenheer, J. A.; Brown, G. K.; MacCarthy, P.; Cabaniss, S. E. Models of metal binding structures in fulvic acid from the Suwannee River, Georgia. Environ. Sci. Technol. 1998, 32, 2410– 2416. Diallo, M. S.; Simpson, A.; Gassman, P.; Faulon, J. L.; Johnson, J. H.; Goddard, W. A.; Hatcher, P. G. 3-D structural modeling of humic acids through experimental characterization, computer assisted structure elucidation and atomistic simulations. 1. Chelsea soil humic acid. Environ. Sci. Technol. 2003, 37, 1783– 1793. Niederer, C.; Goss, K. U. Quantum chemical modeling of humic acid/air equilibrium partitioning of organic vapors. Environ. Sci. Technol. 2007, 41, 3646–3652. Ames, T. T.; Grulke, E. A. Group-contribution method for predicting equilibria of nonionic organic-compounds between soil organic-matter and water. Environ. Sci. Technol. 1995, 29, 2273–2279. Niederer, C.; Goss, K. U.; Schwarzenbach, R. P. Sorption equilibrium of a wide spectrum of organic vapors in Leonardite humic acid: Modeling of experimental data. Environ. Sci. Technol. 2006, 40, 5374–5379. Atalay, Y. B.; Carbonaro, R. F.; Di Toro, D. M. Distribution of proton dissociation constants for model humic and fulvic acid molecules. Environ. Sci. Technol. 2009, 43, 3626–3631. Chiou, C. T.; Porter, P. E.; Schmedding, D. W. Partition equilibria of nonionic organic-compounds between soil organic-matter and water. Environ. Sci. Technol. 1983, 17, 227–231. Klamt, A. COSMO-RS: From Quantum Chemistry to Fluid Phase Thermodynamics and Drug Design; Elsevier Science: Amsterdam, 2005. Wang, S.; Sandler, S. I.; Chen, C. C. Refinement of COSMO-SAC and the applications. Ind. Eng. Chem. Res. 2007, 46, 7275–7288. Lin, S. T.; Sandler, S. I. A priori phase equilibrium prediction from a segment contribution solvation model. Ind. Eng. Chem. Res. 2002, 41, 899–913. Klamt, A.; Schuurmann, G. COSMO: A new approach to dielectric screening in solvents with explicit expressions for the screening energy and its gradient. J. Chem. Soc., Perkin Trans. 2 1993, 79, 9–805. Sandler, S. I. Chemical, Biochemical, and Engineering Thermodynamics, 4th ed.; John Wiley & Sons: Hoboken, NJ, 2006. Lide, D. R., Ed. CRC Handbook of Chemistry and Physics, 90th ed. (Internet Version 2010); CRC Press/Taylor and Francis: Boca Raton, FL, 2010. Mullins, E.; Oldland, R.; Liu, Y. A.; Wang, S.; Sandler, S. I.; Chen, C.-C.; Zwolak, M.; Seavey, K. C. VT 2005 Sigma Profile Database; www.design.che.vt.edu/VT-2005.html; Virginia Polytechnic Institute and State University: Blacksburg, VA, 2006. Frisch, M. J.; Trucks, G. W.; Schlegel, H. B.; Scuseria, G. E.; Robb, M. A.; Cheeseman, J. R.; Montgomery, J. A., Jr.; Vreven, T.; Kudin, K. N.; Burant, J. C.; Millam, J. M.; Iyengar, S. S.; Tomasi, J.; Barone, V.; Mennucci, B.; Cossi, M.; Scalmani, G.; Rega, N.; Petersson, G. A.; Nakatsuji, H.; Hada, M.; Ehara, M.; Toyota, K.; Fukuda, R.; Hasegawa, J.; Ishida, M.; Nakajima, T.; Honda, Y.; Kitao, O.; Nakai, H.; Klene, M.; Li, X.; Knox, J. E.; Hratchian, H. P.; Cross, J. B.; Bakken, V.; Adamo, C.; Jaramillo, J.; Gomperts, R.; Stratmann, R. E.; Yazyev, O.; Austin, A. J.; Cammi, R.; Pomelli, C.; Ochterski, J. W.; Ayala, P. Y.; Morokuma, K.; Voth, G. A.; Salvador, P.; Dannenberg, J. J.; Zakrzewski, V. G.; Dapprich, S.; Daniels, A. D.; Strain, M. C.; Farkas, O.; Malick, D. K.; Rabuck,
(30) (31)
(32) (33) (34)
(35)
A. D.; Raghavachari, K.; Foresman, J. B.; Ortiz, J. V.; Cui, Q.; Baboul, A. G.; Clifford, S.; Cioslowski, J.; Stefanov, B. B.; Liu, G.; Liashenko, A.; Piskorz, P.; Komaromi, I.; Martin, R. L.; Fox, D. J.; Keith, T.; Al-Laham, M. A.; Peng, C. Y.; Nanayakkara, A.; Challacombe, M.; Gill, P. M. W.; Johnson, B.; Chen, W.; Wong, M. W.; Gonzalez, C.; Pople, J. A. Gaussian 03, revision B.05; Gaussian, Inc.: Wallingford, CT, 2004. Wang, S. Thermodynamic properties predictions using the COSMO-SAC solvation method. Ph.D. Thesis. University of Delaware, Newark, DE, 2007. Baerends, E. J.; Autschbach, J.; Bashford, D.; Be´rces, A.; Bickelhaupt, F. M.; Bo, C.; Boerrigter, P. M.; Cavallo, L.; Chong, D. P.; Deng, L.; et al. ADF2009.01, SCM Theoretical Chemistry; Vrije Universiteit: Amsterdam, The Netherlands; http:// www.scm.com. National Cancer Institute. Online SMILES Translator and Structure File Generator. Accessed Aug. 2009. http://cactus. nci.nih.gov/services/translate/. Karickhoff, S. W.; Carreira, L. A.; Hilal, S. H. SPARC, version 4.5 (September 2009). http://ibmlc2.chem.uga.edu/sparc/. Hilal, S. H.; Carreira, L. A.; Karickhoff, S. W. Prediction of the solubility, activity coefficient and liquid/liquid partition coefficient of organic compounds. QSAR Comb. Sci. 2004, 23, 709– 720. Carreira, L. A.; Hilal, S. H.; Karickhoff, S. W. Estimation of chemical reactivity parameters and physical properties of organic molecules using SPARC. In Theoretical and Compu-
(36) (37) (38)
(39)
(40)
(41)
tational Chemistry, Quantitative Treatment of Solute/Solvent Interactions; Politzer, P., Murray, J. S., Eds.; Elsevier Publishers, 1994. Abraham, M. H.; McGowan, J. C. The use of characteristic volumes to measure cavity terms in reversed phase liquidchromatography. Chromatographia 1987, 23, 243–246. ACD/ADME Suite -Absolv, v 5.0; ACD Labs: Toronto, Canada. Gawlik, B. M.; Sotiriou, N.; Feicht, E. A.; SchulteHostede, S.; Kettrup, A. Alternatives for the determination of the soil adsorption coefficient, K-oc, of non-ionicorganic compounds - A review. Chemosphere 1997, 34, 2525–2551. Mullins, P. E. Application of COSMO-SAC to Solid Solubility in Pure and Mixed Solvent Mixtures for Organic Pharmacological Compounds. Masters Thesis. Virginia Polytechnic Institute and State University: Blacksburg, VA, 2007. Cramer, C. J. Essentials of Computational Chemistry: Theories and Models, 2nd ed.; John Wiley & Sons: Chichester, England, 2004. Cornelissen, G.; Gustafsson, O.; Bucheli, T. D.; Jonker, M. T. O.; Koelmans, A. A.; Van Noort, P. C. M. Extensive sorption of organic compounds to black carbon, coal, and kerogen in sediments and soils: Mechanisms and consequences for distribution, bioaccumulation, and biodegradation. Environ. Sci. Technol. 2005, 39, 6881–6895.
ES102760X
VOL. 45, NO. 3, 2011 / ENVIRONMENTAL SCIENCE & TECHNOLOGY
9
1027