Modeling Photo-oxidative Degradation of ... - ACS Publications

May 4, 2015 - Faculty of Chemical Engineering and Technology, University of Zagreb, Marulicev trg 19, Zagreb 10000, Croatia. •S Supporting Informati...
1 downloads 0 Views 4MB Size
Article pubs.acs.org/IECR

Modeling Photo-oxidative Degradation of Aromatics in Water. Optimization Study Using Response Surface and Structural Relationship Approaches Marina Smidt, Hrvoje Kusic,* Daria Juretic, Mirjana Novak Stankov, Sime Ukic, Tomislav Bolanca, Marko Rogosic, and Ana Loncaric Bozic* Faculty of Chemical Engineering and Technology, University of Zagreb, Marulicev trg 19, Zagreb 10000, Croatia S Supporting Information *

ABSTRACT: The study described herein was aimed at a combined modeling performance, applying response surface methodology (RSM) and quantitative structure−property relationship (QSPR) approaches to simulate photo-oxidative degradation of aromatics. Thirty single-benzene-ring compounds were treated by UV/H2O2 according to the set experimental conditions using 32 full factorial design (FFD). The experimental data showed that aromatics conversion obey first-order kinetics; degradation rates (kobs) were calculated for each pollutant treated at experimental points according to FFD. Using kobs, a quadratic polynomial equation (QPE) was derived for each pollutant, describing its degradation. The coefficients pertaining to each term in QPEs were used as responses in QSPR modeling to establish their dependence on the structural features of studied aromatics. The QSPR models were verified internally and externally, yielding satisfactory accuracy in both cases. The hypothesis on structural dependence of RSM model coefficients is confirmed, yielding powerful tool for the optimization and control of photo-oxidative treatment of aromatics.

1. INTRODUCTION Water resources are exposed to a vast array of man-made pollutants; thus, development of eco- and cost-effective technologies to mediate that emerged as an important task. The simulation of processes for this purpose, aiming at their optimization and control, is certainly one of great importance for engineers in all fields of science and technology.1−5 These actions are practically needed due to the fact that water is not a commercial product like any other but, rather, a heritage which must be protected, defended, and treated as such.6 The majority of man-made organic pollutants reaching bodies of water is hardly or not at all biodegradable. Therefore, such pollutants cannot be effectively treated by conventional processes based on aerobic biological methods. Although the application of physical methods such as adsorption, filtration, and sedimentation might be effective for removal of recalcitrant organic pollutants, it would also result in secondary waste which would need to be properly treated, thus increasing overall costs and bringing additional stress to the environment.2,4 In contrast, advanced oxidation technologies (AOTs) are capable of oxidizing refractory organics present in water into biodegradable products or ultimately into CO2 and H2O.2,7 AOTs’ effectiveness is based on the generation of highly reactive and non-selective radical species, primarily hydroxyl radicals (HO•). One of the simplest ways for the direct generation of HO• is through the photolysis of H2O2 (i.e., UVC/H2O2 process). Because the UV-C/H2O2 process is a multifactor system, its effectiveness depends on several key process parameters,7,8 primarily the operating pH and concentration of H2O2, both of which influence the yield of radicals generated either (i) through undesired dissociation of an oxidant or (ii) © 2015 American Chemical Society

by scavenging effects of excess oxidant. Besides, radiant power at applied wavelengths is an additional process parameter directly related to the proportion of radicals generated.9 Two additional reactor-geometry-related process parameters are effective path length and flow regime, which influence the process effectiveness (i) via an inverse proportional relationship with the yield of generated radicals and (ii) by overcoming mass transfer limitations (turbulent flow is preferable). The composition of the water matrix, including the structure of pollutants and the presence of HO• scavengers and suspended solids, plays an important role due to the competitive kinetics, scavenging reactions, and photon-shielding effect. Hence, such a multi-factor system requires optimization in order to maximize its effectiveness, but also to control unwanted implications which may rise during the application. Modeling strategies such as artificial neural networks (ANNs) and response surface methodology (RSM) can be used for the optimization and robust prediction of system response(s) that are dependent on input parameters.9−11 Although ANN- and RSM-based models do not encompass the underlying chemical phenomena influencing the degradation kinetics within AOTs, specific process chemistry may be robustly described through influencing model terms. In this manner, the key process parameters can be easily and effectively identified.10,12 By setting the process kinetics as the targeted system response, the time variable is also considered as a process parameter, bringing additional value to the optimization and control of the studied Received: Revised: Accepted: Published: 5427

February 10, 2015 April 29, 2015 May 4, 2015 May 4, 2015 DOI: 10.1021/acs.iecr.5b00588 Ind. Eng. Chem. Res. 2015, 54, 5427−5441

Article

Industrial & Engineering Chemistry Research

Figure 1. Molecular structures of the single-benzene-ring aromatics studied.

water treatment process. However, the matrix composition particularly pollutant structure, which may vary significantly, influencing not only the degradation kinetics but also desirable process conditionsis not considered at all. The modeling strategy that shows relationships between various properties (including process and kinetic parameters) or activities of organic pollutants and their structural features is called the quantitative structure−activity (or property) relationship (QSAR/QSPR) approach. There are numerous studies dealing with the application of QSAR/QSPR modeling tools in order to predict reaction rate constants, which may enlighten the degradation kinetics,15−18 but none of them have tried to apply such a complex tool to optimize and control multi-factor system such as the UV-C/H2O2 process for the treatment of various organic pollutants. Hence, our goal was to combine the RSM and QSPR modeling approaches in order to simulate photo-oxidative degradation of aromatics by the UV-C/H2O2 process. Accordingly, we collected experimental data from the treatment of 30 single-benzene-ring compounds at the set experimental conditions regarding key process parameters (initial pH and [H2O2]) using 32 full factorial design (FFD). Such data were afterward used for the derivation of the quadratic polynomial equation (QPE), i.e., RSM models of the same order for all studied compounds, in order to get the highest possible compatibility between them, which is required for further modeling by QSPR tools. Hence, the coefficients pertaining to each model term in the derived QPEs were afterward used as responses in QSPR modeling in order to

establish their dependence on the structural features of the studied aromatics. The selected QSPR models were verified internally and externally in order to confirm their accuracy.

2. MATERIALS AND METHODS 2.1. Chemicals. Single-benzene-ring compounds with various types and numbers of substituents were used as model pollutants; they were all purchased from either SigmaAldrich, USA or Acros Chemicals, USA. Their structures are provided in Figure 1, while their names, abbreviations, chemical formula, CAS Registry Numbers, and purities are summarized in Table S1 (Supporting Information). Other chemicals used in the study are methanol (CH3OH, HPLC grade, SigmaAldrich), ortho-phosphoric acid (o-H3PO4, w ≈ 85%, SigmaAldrich), hydrogen peroxide (H2O2, w = 30%, Kemika, Croatia), sulfuric acid (H2SO4, >96%, Kemika), and sodium hydroxide (NaOH, p.a., Kemika). 2.2. Experimental Procedure. Degradation of the studied aromatics was performed with their model solutions (c0 = 1 mM and V = 1.4 L) in a water-jacketed (T = 25.0 ± 0.2 °C) glass batch photoreactor equipped with a UV-C lamp (I0 = 5.12 × 10−6 einstein s−1) and a magnetic stirrer (Figure S1, Supporting Information). Initial pH and [H2O2] varied from 3 to 11 and from 10 mM to 200 mM, respectively, according to the FFD used (Table 1). Aliquots were taken from the reactor periodically during the treatment (0, 10, 20, 30, 40, 50, and 60 min) and analyzed immediately. All the experiments were 5428

DOI: 10.1021/acs.iecr.5b00588 Ind. Eng. Chem. Res. 2015, 54, 5427−5441

Article

Industrial & Engineering Chemistry Research

where Y is the chosen response, k is the number of patterns, i and j are index numbers for patterns, β0 is the offset term, Xi...Xk are coded independent variables, βi is the first-order effect, βii is the second-order effect, βij represents the interaction effect, and ε′ is the random error allowing for the discrepancy or uncertainty between predicted and observed values. The independent variables are transformed into dimensionless coded values at levels according to the chosen experimental design. In our case of 32 FFD, two numeric variables (X1 and X2) representing UV-C/H2O2 process parameters (pH and [H2O2], respectively) had three levels (−1, 0, and 1) corresponding to upper and lower boundaries of set process parameters (pH 3 and 11, and [H2O2] = 10 and 200 mM) and their arithmetic average (pH 7 and [H2O2] = 105 mM). Since it was determined that the degradation of all studied aromatics by the UV-C/H2O2 process at all set experimental conditions followed first-order degradation kinetics (RLR2 values of linear regression performed using eq 2 range from 0.952 up to 0.999; graphical representations are

Table 1. General Layout of the Experimental Matrix for the 32 Full Factorial Design Used input variables

response

coded X1

uncoded pH

coded X2

uncoded [H2O2], mM

Y

−1 0 1 −1 0 1 −1 0 1

3 7 11 3 7 11 3 7 11

−1 −1 −1 0 0 0 1 1 1

10 10 10 105 105 105 200 200 200

k1_obs k2_obs k3_obs k4_obs k5_obs k6_obs k7_obs k8_obs k9_obs

performed in triplicate; averages are reported, and the reproducibility of the experiments was >96%. 2.3. Analyses. The conversion of the studied aromatics was monitored by high-performance liquid chromatography (HPLC) (24 compounds) or ion chromatography (IC) (6 compounds) analyses. The HPLC system (Series 10, Shimadzu, Japan) was equipped with a UV diode array detector (SPDM10AVP, Shimadzu, Japan), two pumps, and a de-gas unit. The analyses were performed using either a 5 μm, 25.0 cm × 4.6 mm C18 column (Supelco Discovery, USA) or an EC 250/4.6 120-5 C18 column (Macherey-Nagel Nucleosil, Germany). The mobile phase of H2O/CH3OH/o-H3PO4 (the exact ratio depends on the type of pollutant, i.e., its structure) operated at 1.0 mL min−1, while the sample loop volume was 100 μL. An IC system (Dionex ICS-5000, Thermo Fisher Scientific, USA) was used for monitoring the conversion of BenzAc, CC, GalAc, HQ, Ph, and SalAc (compounds 7, 10, 20, 22, 26, and 27, respectively; Figure 1). The system was equipped with a dual pump, an eluent generator module (including a KOH cartridge, a de-gas unit on the eluent generator, and a continuously regenerated anion trap column), a thermostatically controlled detection module, and an autosampler. A Dionex IonPac AS18 (4 × 250 mm) anion-exchange column, with the respective AG18 (4 × 50 mm) guard column, was applied. The eluent flow rate was 1.0 mL min−1, and the sample loop volume was 10 μL. The components were eluted and detected at a constant temperature of 30 °C. BenzAc and SalAc were detected using conductometric detection, while GalAc, CC, HQ, and Ph were detected with pulsed amperometry with a gold working electrode, Ag/AgCl reference electrode, and potential waveform for carbohydrates analysis (for Ph the potential waveform was modified). A Handylab pH/LF portable pH-meter (Schott Instruments GmbH, Mainz, Germany) was used to adjust the pH to the desired value.

⎛ [P] ⎞ ln⎜ 0 ⎟ = kobst ⎝ [P]t ⎠

k

k

k

k

∑ βi Xi + ∑ βiiXi 2 + ∑ ∑ βijXiXj + ε′ i=1

i=1

i=1 j=1

(2)

given in Figures S2 and S3, Supporting Information), the calculated first-order degradation rate constants were chosen as the system response (Y). The goodness-of-fit was evaluated using the coefficient of determination (R2) and the analysis of variance (ANOVA). The software packages STATISTICA 10.0 (StatSoft Inc., USA) and Design-Expert 8.0 (StatEase, USA) were used for ANOVA, regression, and graphical analyses of the obtained data. 3.2. QSPR Models Formulation. Data Set. The set of 30 studied aromatics (along with their corresponding data collected during RSM modeling) was divided into a training set and a test set (19 and 10 compounds, respectively), and 1 compound was left for external validation. The same intervals of targeted values (coefficients from RSM modeling) were considered in QSPR modeling, while the test set included approximately 30% of the initial number of compounds. Since targeted responses in QSPR modeling were incorporated afterward in a quadratic equation for the purpose of the external validation of the developed QSPR models, we kept the same split into training and test sets in all cases. Calculation of Theoretical Structural Parameters. Molecular structures of studied aromatics were built using Chem3D Pro software (ChemOffice 7.0, PerkinElmer, USA) and optimized by AM1 and MNDO methods. The semiempirical quantum-chemical descriptorsdipole moment (total μ, as well as its X, Y, and Z components), energy of the highest occupied molecular orbital (EHOMO), energy of the lowest unoccupied molecular orbital (ELUMO), gap between EHOMO and ELUMO (HLG), final heat of formation (ΔHf), and ionization potentialwere calculated by using AM1 and MNDO methods built in the Chem3D Pro software. The molecular descriptors were calculated by using the DRAGON 3.0 software (Milano Chemometrics and QSAR Research Group, Italy). From a total of 1497 molecular descriptors available through DRAGON 3.0 software, 1034 were calculated, capturing relevant structural features of compounds placed in the training and test sets and describing, in this way, the chemical diversity of the studied aromatics. The definitions of

3. CALCULATIONS 3.1. Design of Experiments and Response Surface Modeling. The influence of UV/H2O2 process parameters (initial pH and [H2O2]) on the degradation of studied aromatics was investigated using the 32 FFD (Table 1) combined with RSM. Generally, the relationship between the dependent variable and the set of independent variables in RSM is presented by eq 1:11 Y = β0 +

(P stands for pollutant)

(1) 5429

DOI: 10.1021/acs.iecr.5b00588 Ind. Eng. Chem. Res. 2015, 54, 5427−5441

Article

Industrial & Engineering Chemistry Research

Figure 2. 3D response surface and contour diagrams showing the effects of the mutual interactions between input variables, i.e., initial pH (X1) and [H2O2] (X2), on the system response, i.e., first-order degradation rates (kobs), of six studied compounds [AcPh (1), BenzAc (7), CC (10), p-MP (18), ABSA (28), and TO (29)] when treated by the UV/H2O2 process.

the molecular descriptors used and the calculation procedures are summarized by Todeschini and Consonni.19 Statistical Correlation. The correlation between targeted QSPR responses, i.e., coefficients from RSM modeling, and structurally related descriptors (1034 generated by DRAGON and 18 calculated by semiempirical quantum chemistry relations) was obtained using the variable selection Genetic Algorithm (GA) and Multiple Linear Regression Analysis

(MLRA) methods. The combined GA-MLRA technique was applied for the selection of descriptors and construction of one-, two-, three-, four-, five-, and six-variable models using the BuildQSAR 2.1.0.0 program.20 The GA variable selection technique started with a population of 100 random models and 1500 iterations to evolution with the mutation probability specified as 35%. 5430

DOI: 10.1021/acs.iecr.5b00588 Ind. Eng. Chem. Res. 2015, 54, 5427−5441

Article

Industrial & Engineering Chemistry Research

parameters (pH and [H2O2], respectively). In order to get the maximal accuracy of all the models and the highest possible similarity between them, we applied in all cases the same transformation of the original response: Y (i.e., kobs) was transformed using the logarithmic function Y′ = ln(Y). The ANOVA results obtained for RSM models for all 30 studied aromatics are given in Tables S2 and S3 in the Supporting Information. Table S2 summarizes R2, determining the model accuracy in predicting obtained system responses, and p, determining the model significance for applied case. All R2 values are rather high, ranging from 0.9556 to 0.9988, clearly demonstrating the capability of derived RSM models, in a form of QPE, to accurately predict the experimentally determined degradation rate constants (kobs) of all 30 aromatics within the studied range of process parameters. Furthermore, p values of all RSM models are lower than 0.05, which is the cutoff point for model significance;26 they ranged from 0.0001 to 0.0305. The significance of a model term’s contribution to the predicted response (kobs) is also determined by ANOVA, and the results are summarized in Table S3, along with the values of each of the six coefficients (A, B, C, D, E, and F) in the RSM models. Hence, the observed degradation rate (kobs) of most of the studied compounds is influenced by both considered process parameters (through either direct linear or quadratic model terms or both). The exceptions are AcPh, BenzAld, BenzAm, BSA, and BenzAc, where [H2O2] is the only influential process parameter, while in the case of Ph the degradation rate is influenced only by the initial pH value. However, due to their role in further modeling of all RSM coefficients by QSPR, they cannot be neglected or excluded from the data set. Actually, these findings imply the structural influence on the degradation rate; the kobs values of all compounds possessing only a −COR (where R could be −CH3, −H, −OH, or −NH2) substituent group do not depend on the initial pH. This supports our hypothesis that the degradation rate, as influenced by process parameters, is highly structurally dependent and might be modeled through structural relationship modeling, i.e., QSPR. The simultaneous influence of both process parameters (within the investigated range) on the observed degradation rates of studied compounds by the UV/H2O2 process is clearly depicted by the 3D surface and contour plots. In Figure 2, the 3D surface and contour plots of 6 selected compounds are shown, while all 30 studied compounds are summarized in Figures S4 and S5 (Supporting Information). Hence, it can be clearly observed that the degradation rates of AcPh and BenzAc do not depend on the initial pH; kobs varied rather weakly with increasing pH from 3 to 11 (i.e., X1 from −1 to 1) while the [H2O2] (X2) was held constant. In all other cases the 3D spheres are rather curved, demonstrating the simultaneous influence of initial pH and [H2O2] through the investigated range. Based on the above-presented observations, i.e., (i) the accuracy and significance of derived RSM models and (ii) the indication of structural influences on both degradation rates and process parameters, the values obtained for RSM coefficients A, B, C, D, E, and F can be used in further modeling without affecting the accuracy of the QSPR models. 4.2. Modeling of RSM Coefficients Using QSPR. The values of A, B, C, D, E, and F coefficients of RSM model (3) were correlated with the structural features, i.e., descriptors, of studied compounds through the following methodology. The initial set of 30 aromatics was split into the training (19 compounds) and test (10) sets (Table 2); one compound was

The correlation coefficients for all pairs of descriptor variables used in the QSPR models were evaluated to identify highly correlated descriptors in order to detect redundancy in the data set. Therefore, the models with descriptors that had a cross-correlation coefficient of more than 0.6 were excluded from further consideration (this option, offered through the BuildQSAR 2.1.0.0 program, was turned on during programming). The validation of models was performed by comparing the values of selected statistical model quantifiers: r, the correlation coefficient of regression; r2, the model explained variance; F, the F-ratio between the variances of observed and calculated property/activity; p, the probability value for calculated F; Q2, the leave-one-out cross-validation coefficient; s, standard error; and SPRESS, standard error of the predictive residue of sum of squares. It should be noted that all the descriptors were transformed into their normalized values and used as such in model development in order to enable comparison of their contribution to the studied QSPR responses. The chemical applicability domain (AD) for obtained models was checked by the leverage approach to verify the prediction reliability.13 A Williams plot is used to visualize the AD of the developed QSPR models. This type of plot of standardized cross-validated residuals (RES) vs leverage (Hat diagonal) values (HAT) clearly depicts both the response outliers (Y outliers) and structurally influential compounds (X outliers) in a model.

4. RESULTS AND DISCUSSION 4.1. Response Surface Modeling. RSM modeling is a powerful tool for process optimization and control within the boundaries of the set process conditions.11 In the case of water/ wastewater oxidation treatment (WWOT), the effectiveness of which often depends on several process parameters, including both direct and interactive effects, RSM can be used efficiently to establish the key process parameters and to evaluate their contribution within the studied system.10,12,21 Among various WWOT process parameters, the characteristics of any pollutants present may also influence the treatment effectiveness. Hence, as shown in our previous studies,22−25 the effectiveness of the UV/H2O2 process for the treatment of various aromatic pollutants strongly depends on the pollutant structure, which is represented through categorical terms in the applied experimental designs. However, the exact influence of structural features of aromatic pollutants on the process effectiveness, as well as the optimal conditions for their maximal degradation, cannot be estimated using only the RSM approach. Accordingly, the goal of this study was to combine two modeling strategies, RSM and QSPR, in order to fully estimate the influence of a pollutant’s structure and its role in optimizing the UV/H2O2 process. Hence, we applied 32 FFD (Table 1) and RSM modeling on the experimental results obtained through the treatment of 30 aromatics with the UVC/H2O2 process, varying the key process parameters (pH and [H2O2]) within a rather wide experimental range (Table 1). The influence of studied process parameters on the chosen system response (kobs, first-order degradation rate constant) is represented through QPE of a general layout, Y = AX1 + BX12 + CX 2 + DX 2 2 + EX1X 2 + F

(3)

including direct linear (X1 and X2) and quadratic (X12 and X22) as well as interactive (X1 × X2) effects of both process 5431

DOI: 10.1021/acs.iecr.5b00588 Ind. Eng. Chem. Res. 2015, 54, 5427−5441

Article

Industrial & Engineering Chemistry Research

Table 2. Shifted and Transformed Values of Coefficients of RSM Models and Splitting of Studied Aromatics into Training and Test Set for QSAR Modeling shifted and transformed 1 C+1

1 D+1

1.019 0.977 0.919 1.023 0.927 0.978 1.026 1.060 1.009 1.045 1.055 1.097 1.084 1.042 1.447 0.890 0.953 1.013 1.019

Training Set: 19 Compounds 1.032 0.922 1.141 0.881 1.109 1.069 0.984 0.718 1.105 0.895 1.041 1.029 1.158 0.844 1.416 0.843 1.055 0.948 1.136 0.949 1.152 0.875 0.902 0.894 1.423 0.941 1.181 0.893 1.769 0.812 1.213 0.912 1.178 0.943 1.140 0.866 1.032 0.922

1.050 0.923 0.984 1.247 1.041 0.893 1.108 0.960 0.951 0.936

Test Set: 10 Compounds 1.100 0.908 1.140 0.866 1.054 0.882 1.005 0.872 1.320 0.841 1.163 0.862 1.161 0.792 1.108 0.927 1.196 0.924 1.093 0.871

compd no.

abbreviation

1 A+1

1 2 3 4 5 7 8 9 11 12 13 15 18 19 20 22 23 24 30

AcPh Anl Ans BenzAld BenzAm BenzAc BenzAlc o-BP CB CBSA o-CP DCP p-MP o-FP GalAc HQ HBSA o-NP TSA

6 10 14 16 17 21 25 27 28 29

BSA CC p-CP TCP o-MP GenAc p-NP SalAc ABSA TO

1 B+1

E+1

F

1.166 1.236 1.344 2.718 4.151 1.149 1.313 1.878 1.191 1.259 1.119 1.304 1.100 1.185 1.550 1.222 1.133 1.515 1.166

0.967 0.991 0.921 1.033 1.072 0.967 0.955 1.040 0.992 0.985 0.974 0.908 1.041 0.934 0.954 0.992 1.004 0.975 0.967

2.647 1.419 2.041 2.175 1.457 1.976 2.472 1.693 2.214 1.507 1.704 1.317 1.708 1.992 0.694 1.830 1.665 1.515 2.647

1.339 1.263 1.335 1.369 1.179 1.302 1.570 1.250 1.185 1.284

0.968 0.959 1.008 0.976 1.014 1.002 1.018 0.960 1.040 1.040

1.816 1.775 2.108 1.049 1.807 1.129 1.738 1.729 2.438 1.815

D, and E coefficients). Generally, the transformation is used to narrow the range of targeted responses and yields an increased correlation upon modeling actions.11 As it can be seen from Table 2, we determined that A, B, C, and D coefficients required shifting and transformation through the following expression:

excluded to be used in the external validation of developed models (26, Ph). It should be noted that the semiempirical quantum-chemical descriptors and molecular descriptors were calculated only for the compounds located in the training and test sets using their optimized molecular structures. The QSPR modeling for the each coefficient of RSM model was performed by building of models with 1 to n variables (i.e., descriptors) for the training set in order to get the highest possible accuracy (according to r) while keeping the linearity of derived models. This means that the descriptors included in the chosen model could not be cross-correlated (their correlation coefficient (r) should be ≤0.6). The preliminary modeling actions yielded rather low correlation between values of A, B, C, D, and E coefficients of RSM model (3) obtained directly through RSM and those predicted using QSPR models. Therefore, the originally obtained values of A, B, C, D, and E coefficients were shifted and transformed. The shifting was done in a very simple fashion; to each originally obtained value the constant (k = 1) was added. In this manner, we got only positive values of all A, B, C, D, and E coefficients, allowing their easier correlation with the derived molecular and semiempirical descriptors. Furthermore, these shifted values were also tested for several transformations using simple mathematical functions such as ln(Z), log(Z), sqrt(Z), 1/Z, 1/sqrt(Z)... (Z stands for A, B, C,

Z′ =

1 Z+k

(Z stands for A , B , C , and D coefficients,

k = 1) (4)

The E coefficient yielded the best results (i.e., highest correlation between RSM-calculated and QSPR-predicted values) during preliminary modeling actions using the following shifting and transformation scheme: E′ =

E+k

(k = 1)

(5)

Accordingly, the shifted and transformed values of A, B, C, D, and E coefficients thus obtained and the original value of F coefficient were used in QSPR modeling (Table 2). The values of statistical parameters (r2, q2, F, p, s, SPRESS) determining the performance of selected best one-, two-, three-, four-, five-, and six-variable (only in the case of C coefficient) QSPR models predicting A, B, C, D, E, and F coefficients of 5432

DOI: 10.1021/acs.iecr.5b00588 Ind. Eng. Chem. Res. 2015, 54, 5427−5441

Article

Industrial & Engineering Chemistry Research

Figure 3. Comparison of correlation coefficients obtained for training and test sets by one-, two-, three-, four-, five-, and six-variable models for modeling of A, B, C, D, E, and F coefficients of the RSM model.

model for the test set decreased; in some cases even rapidly, indicating that these higher-dimensional models are overfitted. Since the QSPR models in this study were built in the stepwise fashion by increasing the dimension of the model followed by statistical evaluation of its predictive capability, it can be easily concluded that building of six-variable (or seven-variable in the case of C coefficient) or higher-dimensional models would not be efficient. The predictivity of such higher-dimensional models for the training set would slightly increase, but in the case of test set r values would be lower than for that of four-variable model (or five-variable model in the case of C coefficient). Hence, four-variable models for A, B, D, E, and F coefficients,

RSM model (3) for both training and test set are summarized Table S4 (Supporting Information). In Figure 3, the curves of the correlation coefficient of regression, r, obtained for training and test sets versus number of variables in those selected QSPR models, are plotted. As it can be observed from Figure 3, r values for training set models increased with the increasing of number of variables in the models. Similar trend can be observed for test set models, but only up to the certain number of variables in the model: in the case of A, B, D, E, and F coefficients the number was 4, while in the case of C coefficient the number was 5. After that point (i.e., number of variables in the particular QSPR model) the predictivity of selected QSPR 5433

DOI: 10.1021/acs.iecr.5b00588 Ind. Eng. Chem. Res. 2015, 54, 5427−5441

Article

Industrial & Engineering Chemistry Research

Table 3. Names and Definitions of Descriptors Included in the Best Four- and Five-Variable Models for Prediction of Coefficients A, B, C, D, E, and F descriptor definitiona

coefficient

descriptor name

A

RDF050m Mor15p L2p R5e+

radial distribution function − 5.0/weighted by atomic masses 3D-MoRSE − signal 15/weighted by atomic polarizabilities 2nd component size directional WHIM index/weighted by atomic polarizabilities R maximal autocorrelation of lag 5/weighted by atomic Sanderson electronegativities

RDF 3D-MoRSE WHIM GETAWAY

B

SPI RDF050u Mor21m G3s

superpendentic index radial distribution function − 5.0/unweighted 3D-MoRSE − signal 21/weighted by atomic masses 3rd component symmetry directional WHIM index/weighted by atomic electrotopological states

topological RDF 3D-MoRSE WHIM

C

BEHv5 GATS3e Mor10e Kp R6m+

highest eigenvalue n.5 of Burden matrix/weighted by atomic van der Waals volumes Geary autocorrelation − lag 3/weighted by atomic Sanderson electronegativities 3D-MoRSE − signal 10/weighted by atomic Sanderson electronegativities K global shape index/weighted by atomic polarizabilities R maximal autocorrelation of lag 6/weighted by atomic masses

BCUT 2D autocorrelation 3D-MoRSE WHIM GETAWAY

D

Mor27u Mor29p R3e+ DPZ

3D-MoRSE − signal 27/unweighted 3D-MoRSE − signal 29/weighted by atomic polarizabilities R maximal autocorrelation of lag 3/weighted by atomic Sanderson electronegativities dipole moment (z axis)b

3D-MoRSE 3D-MoRSE GETAWAY semiempirical

E

GATS3p RDF040u Mor30m Km

Geary autocorrelation − lag 3/weighted by atomic polarizabilities radial distribution function − 4.0/unweighted 3D-MoRSE − signal 30/weighted by atomic masses K global shape index/weighted by atomic masses

2D autocorrelation RDF 3D-MoRSE WHIM

F

RDF065v HATSp R3u H-050

radial distribution function − 6.5/weighted by atomic van der Waals volumes leverage-weighted total index/weighted by atomic polarizabilities R autocorrelation of lag 3/unweighted H attached to heteroatom

RDF GETAWAY GETAWAY atom-centered fragment

a

descriptor type

From ref 19. bCalculated by AM1 theory.

and a five-variable model for C coefficient, were selected as the most accurate for the entire set (29 compounds); their ratio of the number of variables and predictive ability ensured the avoidance of overfitting. The final equations of these best selected QSPR models when applied for the entire set (29 compounds) are provided below (eqs 6−11), and the names, description, and group types of included descriptors are summarized in Table 3. A′ =

C′ =

+ 0.200(± 0.109) × GATS3e + 0.070(± 0.049) × Mor10e + 0.699( ± 0.255) × Kp − 0.136(± 0.092) × R6m+ + 0.384(± 0.195)

D′ =

− 0.297(± 0.194) × Mor15p + 0.888(± 0.279) × L2p

2

+ 1.268( ± 0.386) × DPz + 1.230(± 0.391)

2

E′ =

E + 1 = 0.344(± 0.168) × GATS3p

− 0.173(± 0.058) × RDF040u − 0.048(± 0.025)

× RDF050u + 0.571(± 0.152) × Mor21m

2

(9)

(N = 29; r2 = 0.752; q2 = 0.414; F = 18.208; p < 0.0001; s = 0.326; SPRESS = 0.464)

1 = 0.724(± 0.184) × SPI + 0.106(± 0.150) B+1

+ 0.252( ± 0.109) × G3s + 0.981(± 0.147)

1 = 0.840( ± 0.381) × Mor27u D+1

+ 1.268( ± 0.485) × Mor29p − 0.601(± 0.633) × R3e

(6)

(N = 29; r = 0.778; q = 0.603; F = 20.993; p < 0.0001; s = 0.058; SPRESS = 0.077) B′ =

(8)

(N = 29; r2 = 0.694; q2 = 0.399; F = 10.404; p < 0.0001; s = 0.041; SPRESS = 0.057)

1 = 0.417(± 0.126) × RDF050m A+1

− 0.479(± 0.182) × R5e+ + 0.797(± 0.134)

1 = −0.220(± 0.157) × BEHv5 C+1

× Mor30m + 0.120( ± 0.085) × Km + 0.712(± 0.171)

(7)

(10)

2

2

(N = 29; r = 0.787; q = 0.623; F = 22,135; p < 0.0001; s = 0.080; SPRESS = 0.107)

2

(N = 29; r = 0.694; q = 0.551; F = 13.562; p < 0.0001; s = 0.024; SPRESS = 0.028) 5434

DOI: 10.1021/acs.iecr.5b00588 Ind. Eng. Chem. Res. 2015, 54, 5427−5441

Article

Industrial & Engineering Chemistry Research

predicting the C coefficient, represents specific structural indices correlated with the highest value of eigenvalue in matrix with atomic van der Waals volumes and lag 5. 2D autocorrelation descriptors GATS3e and GATS3p (models (8) and (10), respectively) pertain to Geary (GATS) autocorrelations and are calculated from the molecular graphs by summing the products of atom weights of the terminal atoms of all the paths of the considered path length (so-called “the lag”). Besides Geary, 2D autocorrelation may include Moreau−Broto (ATS) and Moran (MATS) autocorrelations.27 Similar to BCUT descriptors, these 2D autocorrelations may be weighted with some property associated with the atoms, e.g., m, p, e, and v. In the cases of both models (8) and (10) predicting coefficients C and E, respectively, the Geary autocorrelation of the same lag (3) is included, but the weights are different: Sanderson electronegativity in the case of C, and polarizibility for the E coefficient. RDF descriptor types (RDF050m, model (6); RDF050u, model (7); RDF040u, model (10); and RDF065v, model (11)) are based on the distance distribution in the molecule. Generally, the radial distribution function of an ensemble of n atoms can be interpreted as the probability distribution of finding an atom in a spherical volume of radius R and are typically denoted as RDFsw, where 10 ≤ s ≤ 155 in five unit steps, while w stands for weighting scheme.27 As abovedescribed BCUT and Geary autocorrelations, they can also be weighted by atomic properties m, p, e, and v, or can be unweighted (u). 3D-MoRSE descriptors (3D molecule representation of structures based on electron diffraction) (Mor15p, model (6); Mor21m, model (7); Mor10e, model (8); Mor27u and Mor29p, model (9); and Mor30m, model (10)) are calculated by summing atom weights viewed by different angular scattering functions and are typically denoted as Morsw, where 1 ≤ s ≤ 32 in 1 unit steps, while w stands for weighting scheme, which is the same as in the case of RDF descriptors (u, m, p, e, and v).27 WHIM descriptors (L2p, model (6); G3s, model (7); Kp, model (8); and Km, model (10)) are based on the statistical indices calculated on the projections of atoms along principal axes.19 They are built in such a manner to capture relevant molecular 3D information regarding the molecular size, shape, symmetry, and atom distribution with respect to invariant reference frames. Similar to the majority of the above-described descriptors, they can also be weighted by the same schemes (m, p, e, and v) and additionally by electrotopological states (s). Besides, they can be either directional (calculated as univariate statistical indices on the scores of each individual principal component; in our case L2p and G3s) or global (calculated directly as a combination of the directional WHIM descriptors; in our case Kp and Km).27 The GETAWAY (geometry, topology, and atom-weights assembly) descriptors (including R5e+, model (6); R6m+, model (8); R3e+, model (9); HATSp and R3u, model (11))) are proposed as chemical structure descriptors derived from a new representation of molecular structure: Molecular Influence Matrix (MIM). The methodology of matrices construction, applied leverages approaches, and calculation procedure is given elsewhere.19 There are two types of GETAWAY descriptors. The descriptors within the first group are derived by applying traditional matrix operators and concepts of information theory to MIM and influence/ distance matrix R. Descriptors in the second group are based on spatial autocorrelation formulas which weight the molecule atoms in such a way as to account for atomic mass, polarizability, van der Waals volume, and electronegativity

F = 0.403( ± 0.358) × RDF065v − 1.571(± 0.554) × HATSp + 2.261(± 0.765) × R3u − 1.636(± 0.347) × H‐050 + 1.487( ± 0.587)

(11)

(N = 29; r2 = 0.848; q2 = 0.749; F = 33.605; p < 0.0001; s = 0.181; SPRESS = 0.233) The descriptors included in the above-presented QSPR models pertain to several classes: topological (SPI, model (7)), BCUT (BEHv5, model (8)), 2D autocorrelations (GATS3e and GATS3p, models (8) and (10), respectively), RDF (RDF050m, model (6); RDF050u, model (7); RDF040u, model (10); and RDF065v, model (11)), 3D-MoRSE (Mor15p, model (6); Mor21m, model (7); Mor10e, model (8); Mor27u and Mor29p, model (9); and Mor30m, model (10)), WHIM (L2p, model (6); G3s, model (7); Kp, model (8); and Km, model (10)), GETAWAY (R5e+, model (6); R6m+, model (8); R3e+, model (9); HATSp and R3u, model (11)), atom-centered fragment (H-050, model (11)), and semiempirical (DPZ, model (9)). These descriptors provide clearand some of them even simplecorrelation with the structural features of the studied compounds. Hence, the topological descriptor SPI (model (6)) represents in our case the level of benzene ring saturation. Generally, the superpendentic index is calculated as the square root of the sum of the products in a pendentic matrix, considering only non-zero elements (i.e., topological distances to the terminal vertices);19 terminal vertices are determined by the number of substituents, while the topological distance between them is determined by their position at the benzene ring. Furthermore, atom-centered fragment descriptor H-050 (model (11)) represents the number of all H-atoms attached to a heteroatom and can be considered in our case as an indirect measure of solubility. Our data set compounds in most cases possess an O-atom as the heteroatom where the H-atom is attached, forming a −OH group (“pure” in phenols, −COOH, −SO3H). The exceptions are H-atoms attached to an N-atom, forming an amino group (−NH2), i.e., compounds 2, 5, and 28 (Figure 1). The included semiempirical descriptor DPz (model (9)) represents the value of dipole moment in the direction of the z axis, providing the direct correlation of chemical structure, i.e., their polarity, with RSM coefficient D. Other included descriptors in the above QSPR models are calculated through rather complicated schemes providing specific structural indices. Hence, descriptors from the BCUT group are actually topological descriptors that consider BCUT indices. These are the eigenvalues of a modified connectivity matrix known as the Burden matrix, which can be described as follows. The diagonal elements Mww are the weights wi for atom Ai where the weights could be some property associated with the atoms such as m (relative atomic mass), p (polarizability), e (Sanderson electronegativity), and v (van der Waals volume). The non-diagonal elements Mwk are 1 if k = dij and 0 otherwise, where k is the lag defined as the topological distance d between the atom pair i-j and may have a value between 0 and 8. Hence, for a given k, the non-diagonal element Mij will be unity if the atoms i and j are apart by a topological distance k and zero otherwise.27,28 For a given w and k, there are two BCUT descriptors: BEHwk and BELwk. The first represents the highest positive eigenvalue of the matrix Mwk and will be 0 if there are no positive eigenvalues. The latter represents the lowest negative eigenvalue of the matrix Mwk and will be 0 if there are no negative eigenvalues.27 In our case, the BEHv5 descriptor included in model (8), 5435

DOI: 10.1021/acs.iecr.5b00588 Ind. Eng. Chem. Res. 2015, 54, 5427−5441

Article

Industrial & Engineering Chemistry Research

Figure 4. Plots of the observed and predicted values for A, B, and C coefficients of RSM model for the entire set (29 compounds) calculated by corresponding QSPR models (6), (7), and (8), respectively (panels A, C, and E, respectively), and determination of applicability domain of models (6), (7), and (8) through a Williams plot (panels B, D, and F, respectively).

together with 3D information encoded by the elements of MIM and influence/distance matrix R.19 Such kinds of indices encode information on structural fragments, effective position of substituents, and fragments in the molecular space and account for the information on molecular size and shape, and are

particularly suitable for describing differences in congeneric series of molecules.19 The GETAWAY descriptors included in models (6), (7), (8), and (10) pertain to the second group. The coefficients of RSM model (3) mimic the influence of operating parameters of UV/H2O2 process (pH and [H2O2]) 5436

DOI: 10.1021/acs.iecr.5b00588 Ind. Eng. Chem. Res. 2015, 54, 5427−5441

Article

Industrial & Engineering Chemistry Research

Figure 5. Plots of the observed and predicted values for D, E, and F coefficients of RSM model for the entire set (29 compounds) calculated by corresponding QSPR models (9), (10), and (11), respectively (panels A, C, and E, respectively), and determination of applicability domain of models (9), (10), and (11) through a Williams plot (panels B, D, and F, respectively).

on the observed degradation rate constant (kobs). Hence, F can be observed as a mean degradation rate, while other RSM coefficients correct it for the operating parameters influence through direct linear (A and C for pH and [H2O2], respectively), quadratic (B and D for pH and [H2O2],

respectively), and their mutual interaction (E) effects. If the included descriptors are viewed through such a prism, the following can be concluded. According to the value of coefficients in QSPR model (11), the highest contribution to F coefficient (i.e., average degradation rate) is obtained from 5437

DOI: 10.1021/acs.iecr.5b00588 Ind. Eng. Chem. Res. 2015, 54, 5427−5441

Article

Industrial & Engineering Chemistry Research

compounds in our data set; others are mostly mono- or disubstituted (Figure 1.). Such results can be also plausibly explained with the included descriptors, depicting specific structural properties of studied compounds, and their contributions to the prediction of A and B coefficients. Hence, the major contribution to B coefficient prediction is obtained by SPI descriptor (model (7)), which can be observed as a measure of benzene ring saturation in our case. Furthermore, in both of these models a rather high contribution to the prediction of A and B coefficients is obtained by descriptors related to the atomic masses, RDF050m and Mor21m in models (6) and (7), respectively. The benzene rings of these two compounds are highly saturated (four substituents), and they also possess rather high molecular masses (i.e., due to possessing atoms of rather high atomic masses) among the studied compounds. The structurally influential chemicals are also recognized by the leverage approach used in models (8) and (9): CBSA (compound 12, Figure 1) in the model predicting the C coefficient (Figure 4F), and BenzAm (compound 5, Figure 1) and BenzAlc (compound 8, Figure 1) in the model predicting the D coefficient (Figure 5B). However, the HAT value of CBSA is very close to the X limits, and its determination as a structurally influential variable in model (8) presumably relies on the nature of included descriptors, particularly those related with atomic mass (R6m) and polarizability (Kp), where this compound possesses extremely high values compared to other compounds from the studied set. The direct structural features of CBSA, i.e., type (−Cl and −SO3), number (2), or position (para) of substituents, are not peculiar for the studied set of compounds. On the other hand, simple and direct structural features of BenzAm and BenzAlc, the type of substituent (−COR) and its functional groups (R = −NH2, −OH), distinguish these two compounds from the majority of others included in the studied set. That is most probably the reason why these two compounds are recognized as structurally influential among others in the model (9). Accordingly, they possess values twice as high (or more) as those of other compounds of the greatest contributing descriptor (Mor29p) in model (9). Besides compounds crossing X limits in derived QSPR models, only one compound is marked as an outlier: BenzAld in model (8) (Figure 4F). However, its RES (standardized residual) value is rather close to 3.0σ. Since the fitting for that particular RSM model (Table S2, Supporting Information) is very high (R2 = 0.9974), this cannot be considered as an experimental error affecting QSPR model (8) accuracy significantly. Besides, the leverage approach used determined several compounds that possess 2.0σ < RES < 3.0σ: p-MP and GalAc in model (6) and BenzAc, p-CP, and TSA in model (8). 4.3. Verification of Developed QSPR Models through RSM Simulation. Although some authors recommend that testing a QSPR derived from a training set using a test set could be considered sufficient external validation,18,30 we performed the external validation of our QSPR models on a compound which was not included in the training nor in the test set: Ph (compound 26, Figure 1). In Table 4, the values of A, B, C, D, E, and F RSM coefficients obtained directly through RSM and those calculated using QSPR models (6)−(11) are compared. As it can be seen, the values of F coefficient, determining the mean degradation rate, are rather close: 2.088 and 2.191, differing less than 5%. In the case of other RSM coefficients, providing the correction of mean degradation rate in

R3u, providing synergistic effect through its positive index, and HATSp and H-050, providing antagonistic effects through their negative indexes. Hence, this indicates that the ability of the molecular structure to dissociate, providing ionic forms of targeted compounds, has negative effect to the degradation rate. Furthermore, it can be observed that the influence of pH is contributed by the same RDF descriptors (RDF050) possessing different weighting schemes: by atomic mass (direct linear, i.e., A coefficient) and without any weighting (direct quadratic, i.e., B coefficient). Both of them provide the same negative contribution to pH influence; they have positive indexes, but A and B coefficients are shifted and transformed using reverse proportional correlation (eq 4). Hence, this means that when the RDF050 value is higher, the values of A and B coefficients would be lower. Furthermore, the contribution of SPI to direct quadratic influence of pH (through B coefficient) is the greatest of all included descriptors in model (7) (it possesses the highest value among all coefficients) and is antagonistic; the lower the value of SPI, the higher is the B coefficient. The highest contributions to the influence of [H2O2] are obtained by Kp (direct linear in model (8) through C coefficient) and Mor29p and DPZ (direct quadratic in model (9) through D coefficient). According to their positive coefficients in corresponding QSPR models, and the reverse proportional relationship with C and D coefficients, the higher the values of these descriptors, the lower will be values of C and D. It should be noted that all these three most contributing descriptors are related to the polarity of molecular structures of studied compounds, either through their weighting schemes or directly through dipole moment. The graphical representations of QSPR models (6), (7), and (8) predicting the A, B, and C coefficients versus calculated values obtained directly through RSM are given in Figure 4, parts A, C, and E, respectively. The analogues for the other three coefficients, D, E, and F, are provided in Figure 5, parts A, C, and E, respectively. As it can be seen, points or point clusters are placed closely to diagonal line; the only exception can observed in Figure 4C (for model (8) predicting coefficient C) for compound BenzAld (compound 4, Figure 1). The ADs of derived QSPR models were defined by the leverage approach using a Williams plot. In this manner the highly structurally influential chemicals (X limits: hii = 2(m + 1)/n); m stands for the number of variables in the model, while n stands for the number of compounds in the training set) and outliers (Y limits: ±2.0σ and ±3.0σ; σ stands for the standard deviation units) are simultaneously detected. Generally, the compounds characterized by peculiar structural features that are poorly represented in the training set are considered to be influential on the structural domain of the model, while the outliers, characterized with RES value exceeding 3σ, might be associated with experimental error.29 The Williams plots for models (6), (7), and (8) are provided in Figure 4, parts B, D, and F, while those for models (9), (10), and (11) are shown in Figure 5, parts B, D, and F. As it can be seen, GalAc (20, Figure 1) is recognized as the structurally influential chemical in our data set in the case of models (6) and (7) predicting A and B coefficients of RSM model; i.e., in both models correlated with the influence of pH on the observed degradation rate (Figure 4B,D). Besides GalAc, in model (6) TCP (compound 16, Figure 1) is also marked as structurally influential chemical (Figure 4B), while in model (7) its HAT value is very close to the X limits (Figure 4D). It should be noted that these two compounds are the only tetrasubstituted single-benzene-ring 5438

DOI: 10.1021/acs.iecr.5b00588 Ind. Eng. Chem. Res. 2015, 54, 5427−5441

Article

Industrial & Engineering Chemistry Research Table 4. Values of A, B, C, D, E, and F Coefficients of RSM Model for Ph (Compound 26, Figure 1) coefficient

calculated through RSM

predicted by QSPR

A B C D E F

−0.401 −0.644 0.196 −0.303 0.045 2.088

−0.097 −0.334 0.128 −0.569 −0.043 2.191

convex when changing [H2O2] through pH up to neutral values. On the other hand, the major difference can be observed at basic pHs; in the case of 3D surface obtained through direct RSM modeling a significant drop in degradation rate is observed when increasing pH up to boundary value (pH 11, X1 = 1), while in the simulation using QSPR predicted coefficients the leveling off of degradation rate is there, but not in such a significant amount. As it was mentioned above, one of the main advantages of RSM modeling for process optimization, besides establishing the influence of operating parameters on the targeted system response, is the determination of optimal conditions yielding maximum process efficiency: in our case, maximal degradation rate. As it can be seen from Figure 7, the optimal area of process conditions for both simulations is placed in the very similar range of process parameters. In the case where QSPR predicted coefficients are used that range is a little bit wider (if the narrow peak area is not taken into accountthe ellipsoid with the smallest diameter in contour plot of Figure 7). The similarity in predicting the optimal conditions yielding maximal degradation rate between the case using direct RSM modeling and the case using QSPR predicted coefficients can be observed from Table 5, where the optimal values of pH and [H2O2] are listed along with the predicted maximal degradation rate constants for Ph. The predicted degradation rates are rather close (they differ only by 2.78%). The optimal pH values are also rather close; 5.80 and 6.39 (9.3%), while some larger discrepancy can be observed in the case of optimal [H2O2] (13%). Additionally, the experiments at the optimal values determined by RSM directly and those estimated through QSPR were performed in order to additionally verify used modeling approaches. The results are given in Figure S6 (Supporting Information). Basically, the experimental results (and correspondingly determined kobs) are in good agreement with the predicted values, either directly through RSM or through QSPR; corresponding root-mean-square deviation (RMSD) values, calculated using eq 12, are lower than 0.91%.

dependence on operating parameters, larger differences can be observed. Hence, 75% and 52% in the case of A and B coefficients related to pH influence, and 35% and 47% in the case of C and D coefficients related to [H2O2] influence. However, the indexes, determining the direction of influence of operating parameters, are the same (Table 4). In the case of E coefficient, related to the interactive effect of pH and [H2O2], although the absolute values are very similar, the indexes are opposite (Table 4). However, in the case of Ph the obtained result would not affect the observed degradation rate prediction because this model term (X1 × X2) is not significant (Table S3, Supporting Information). Comparisons of 3D surface and contour plots presenting simultaneous influence of both studied operating parameters of the UV/H2O2 process on the observed degradation rate constant of Ph obtained by direct RSM modeling on experimentally obtained results using 32 FFD matrix (Table 1), and those simulated using A, B, C, D, E, and F coefficients predicted through QSPR models (6)−(11), are presented in Figures 6 and 7, respectively. Although at the first

RMSD =

1 N

i=N

∑ (yexp,i − ymod,i )2 i=1

(12)

Hence, it can be concluded that our hypothesis of a structural dependence of the coefficients of the RSM model (3) is confirmed and that we derived satisfactorily accurate QSPR models that can predict coefficients of the RSM model eventually simulating the degradation rate of aromatic pollutants by UV/H2O2 as a function of key operating parameters.

5. CONCLUSIONS Thirty selected common water pollutants, members of the class of single-benzene-ring compounds, were treated by the UV-C/ H2O2 process at predefined conditions in two key process parameters, initial pH and [H2O2], that were set using 32 FFD. All of studied pollutants obeyed first-order degradation kinetics, which enabled us to use their degradation rate constants as appropriate responses during RSM modeling. The RSM models were derived by means of quadratic polynomial equations, including direct linear and quadratic equations, as well as interaction model terms and their corresponding coefficients, A, B, C, D, E, and F, which are assumed to depend on the structures of the studied aromatics. Accordingly, these coefficients were investigated using the QSPR modeling

Figure 6. Verification of QSPR models for prediction of A, B, C, D, E, and F coefficients of RSM model on the external set compound (compound 26, phenol). Comparison of RSM using QSPR-predicted coefficients and direct RSM simulation; 3D surface plot.

sight the layouts of these two 3D surfaces differ significantly, a closer look at particular parts of the investigated range of operating parameters yields certain similarities. As it can be seen, the part of 3D surface in the acidic and neutral pH (X1) is satisfactorily simulated, although in the case of QSPR the simulated degradation rate of Ph is more influenced by changes in [H2O2] (Figure 6). Hence, the 3D surface becomes more 5439

DOI: 10.1021/acs.iecr.5b00588 Ind. Eng. Chem. Res. 2015, 54, 5427−5441

Article

Industrial & Engineering Chemistry Research

Figure 7. Verification of QSPR models for prediction of A, B, C, D, E, and F coefficients of RSM model on the external set compound (compound 26, phenol). Comparison of RSM using QSPR-predicted coefficients and direct RSM simulation; contour plots.



Table 5. Optimal Conditions for Maximal Degradation Rate for Ph (Compound 26, Figure 1)

Corresponding Authors

*(H.K.) Tel.: +385 1 4597 160. Fax: +385 1 4597 143. E-mail: [email protected]. *(A.L.B.) Tel.: +385 1 4597 123. Fax: +385 1 4597 143. E-mail: [email protected].

process parameters calcd through RSM pred by QSAR

pH

[H2O2]

5.80 6.39

133.60 116.23

AUTHOR INFORMATION

−1

kpre, s

8.827 9.079

Notes

The authors declare no competing financial interest.



ACKNOWLEDGMENTS We gratefully acknowledge financial support from the University of Zagreb, Republic of Croatia (Project No. TP1.30). Generous support and help of Thermo Fisher Scientific Corporation is also gratefully acknowledged.

approach, yielding the confirmation of set hypothesis. The QPE, i.e., RSM model, coefficients were successfully correlated by structural features of studied aromatics via selected derived QSPR models, which were successfully internally and externally verified. Hence, the applied combined modeling approach using RSM followed by QSPR modeling is a promising tool which can be used for future studies in predicting the influence of key process parameters of the UV-C/H2O2 process in the case of other aromatics. Moreover, using the presented approach, the optimization and control of an applied process in terms of key process parameters and structural features of the pollutants present can be simplified.





REFERENCES

(1) Chinchuluun, A.; Pardalos, P. M.; Enkhbat, R.; Pistikopoulos, E. N. Optimization, Simulation, and Control; Springer: New York, 2013. (2) Litter, M. I.; Candal, R. J.; Meichtry, J. M. Advanced Oxidation Technologies: Sustainable Solutions for Environmental Treatments; CRC Press/Taylor and Francis: London, 2014. (3) Qu, X.; Alvarez, P. J. J.; Li, Q. Applications of nanotechnology in water and wastewater treatment. Water Res. 2013, 47, 3931−3946. (4) Tang, W. Z. Physicochemical Treatment of Hazardous Wastes; CRC Press/Lewis Publisher: Boca Raton, FL, 2004. (5) Viesmann, W., Jr.; Hammer, M. J. Water Supply and Pollution Control, 7th ed., Pearson Education: Upper Saddle River, NJ, 2005. (6) European Parliament, Council of the European Union. Directive 2013/39/EU of the European Parliament and of the Council of 12 August 2013 amending Directives 2000/60/EC and 2008/105/EC as regards priority substances in the field of water policy. Off. J. Eur. Communities 2013, 226, 1−17. (7) Parsons, S. Advanced Oxidation Processes for Water and Wastewater Treatment; IWA Publishing: London, 2004. (8) Beltran, F. J. Ozone-UV Radiation-Hydrogen Peroxide Oxidation Technologies. In Chemical Degradation Methods for Wastes and Pollutants, Environmental and Industrial Applications; Tarr, M. A., Ed.; Marcel Dekker, Inc.: New York, 2003; pp 1−77. (9) Khataee, A. R.; Kasiri, M. B. Artificial neural networks modeling of contaminated water treatment processes by homogeneous and

ASSOCIATED CONTENT

S Supporting Information *

Names, CAS Registry Numbers, molecular formulas, abbreviations, and purities for the studied compounds (Table S1); ANOVA results for RSM models (Table S2) and for each model term in RSM models (Table S3); descriptors and statistical values for training and test QSPR models (Table S4); photoreactor design (Figure S1); first-order kinetics estimation for compounds 1−15 (Figure S2) and for compounds 16−30 (Figure S3); 3D response surface and contour diagrams for compounds 1−15 (Figure S4) and for compounds 16−30 (Figure S5); and comparison of experimental and modeling results for Ph (Figure S6). The Supporting Information is available free of charge on the ACS Publications website at DOI: 10.1021/acs.iecr.5b00588. 5440

DOI: 10.1021/acs.iecr.5b00588 Ind. Eng. Chem. Res. 2015, 54, 5427−5441

Article

Industrial & Engineering Chemistry Research heterogeneous nanocatalysis. J. Mol. Catal. A: Chem. 2010, 331, 86− 100. (10) Kusic, H.; Jovic, M.; Kos, N.; Koprivanac, N.; Marin, V. The comparison of photo-oxidation processes for the minimization of organic load of colored wastewater applying the response surface methodology. J. Hazard. Mater. 2010, 183, 189−202. (11) Myers, R. H.; Montgomery, D. C.; Anderson-Cook, C. M. Response Surface Methodology: Process and Product Optimization using Designed Experiments, 3rd ed.; John Wiley & Sons: Hoboken, NJ, 2009. (12) Dopar, M.; Kusic, H.; Koprivanac, N. Treatment of simulated industrial wastewater by photo-Fenton process: Part I. The optimization of process parameters using design of experiments (DOE). Chem. Eng. J. 2011, 173, 267−279. (13) Gramatica, P. Minireview: Principles of QSAR models validation: internal and external. QSAR Comb. Sci. 2007, 26, 694−701. (14) Puzyn, T.; Leszczynski, J.; Cronin, M. T. Recent Advances in QSAR Studies; Methods and Applications; Springer: New York, 2010. (15) Gramatica, P.; Pilutti, P.; Papa, E. A tool for the assessment of VOC degradability by tropospheric oxidants starting from chemical structure. Atmos. Environ. 2004, 38, 6167−6175. (16) Kusic, H.; Rasulev, B.; Leszczynska, D.; Leszczynski, J.; Koprivanac, N. Prediction of rate constants for radical degradation of aromatic pollutants in water matrix: A QSAR study. Chemosphere 2009, 75, 1128−1134. (17) Sudhakaran, S.; Amy, G. L. QSAR models for oxidation of organic micropollutants in water based on ozone and hydroxyl radical rate constants and their chemical classification. Water Res. 2013, 47, 1111−1122. (18) Zhu, H.; Shen, Z.; Tang, Q.; Ji, W.; Jia, L. Degradation mechanism study of organic pollutants in ozonation process by QSAR analysis. Chem. Eng. J. 2014, 255, 431−436. (19) Todeschini, R., Consonni, V. Handbook of Molecular Descriptors; Wiley-VCH: Weinheim, 2000. (20) De Oliveira, D. B.; Gaudio, A. C. BuildQSAR: A new computer program for QSAR analysis. Quant. Struct.-Act. Rel. 2001, 19, 599−601. (21) Kourdali, S.; Badis, A.; Boucherit, A. Degradation of direct yellow 9 by electro-Fenton: Process study and optimization and, monitoring of treated water toxicity using catalase. Ecotox. Environ. Safety. 2014, 110, 110−120. (22) Juretic, D.; Kusic, H.; Koprivanac, N.; Loncaric Bozic, A. Photooxidation of benzene-structured compounds: Influence of substituent type on degradation kinetic and sum water parameters. Water Res. 2012, 46, 3074−3084. (23) Kalinski, I.; Juretic, D.; Kusic, H.; Peternel, I.; Loncaric Bozic, A. Structural aspects of degradation of sulfoaromatics by UV/H2O2 process. J. Photochem. Photobiol., A 2014, 293, 1−11. (24) Kusic, H.; Koprivanac, N.; Papic, S.; Loncaric Bozic, A. Influence of substituent type and position on photo-oxidation of phenolic compounds: Response surface methodology approach. J. Photochem. Photobiol., A 2012, 242, 1−12. (25) Milovac, N.; Juretic, D.; Kusic, H.; Dermadi, J.; Loncaric Bozic, A. Photo-oxidative degradation of aromatic carboxylic acids in water; influence of hydroxyl substituents. Ind. Eng. Chem. Res. 2014, 53, 10590−10598. (26) Stat-Ease, Multifactor RSM Tutorial (Part 2 - Optimization). Design-Expert software Version 7.1.5 User’s Guide, 2008. (27) Sarchitect Designer 2.5, Strand Life Sciences, User Guide, Theory − Descriptors, 2005; http://www.strandls.com/sarchitect/ documents/manual_html/desctheory.html (accessed on Jan 22, 2015). (28) Burden, F. R. Molecular identification number for substructure searches. J. Chem. Inf. Comput. Sci. 1989, 29, 225−227. (29) Liu, H.; Papa, E.; Gramatica, P. Evaluation and QSAR modeling on multiple endpoints of estrogen activity based on different bioassays. Chemosphere 2008, 70, 1889−1897. (30) Pramanik, S.; Roy, K. Environmental toxicological fate prediction of diverse organic chemicals based on steady-state compartmental chemical mass ratio using quantitative structure−fate relationship (QSFR) models. Chemosphere 2013, 92, 600−607. 5441

DOI: 10.1021/acs.iecr.5b00588 Ind. Eng. Chem. Res. 2015, 54, 5427−5441