Article pubs.acs.org/est

Application of PCA and SIMCA Statistical Analysis of FT-IR Spectra for the Classification and Identification of Different Slag Types with Environmental Origin

B. Stumpe,*,† T. Engel,† B. Steinweg,‡ and B. Marschner†

† Department Soil Science/Soil Ecology, Institute of Geography, Ruhr-Universität Bochum, Universitätsstrasse 150, 44801 Bochum, Germany
‡ Municipal Mönchengladbach/Soil Protection Agency, Division of Polluted Areas, Weiherstrasse 21, 41050 Mönchengladbach, Germany

Supporting Information

ABSTRACT: In the past, different slag materials were often used for landscaping and construction purposes or simply dumped. Nowadays German environmental laws strictly control the use of slags, but a remaining share of 35% is still dumped in landfills without control. Since some slags have high heavy metal contents, and since different slag types have typical chemical and physical properties that influence the risk potential and other characteristics of the deposits, the slag types need to be identified. We developed an FT-IR-based statistical method to identify different slag classes. Slag samples were collected at different sites in various cities within the industrial Ruhr area. Spectra of 35 samples from four different slag classes, ladle furnace (LF), blast furnace (BF), oxygen furnace steel (OF), and zinc furnace (ZF) slags, were then recorded in the mid-infrared region (4000−400 cm−1). The spectral data sets were subjected to statistical classification methods to separate the spectral data of the different slag classes. Principal component analysis (PCA) models were developed for each slag class and further used for soft independent modeling of class analogy (SIMCA). Precise classification of slag samples into the four slag classes was achieved by using two different SIMCA models stepwise. First, SIMCA 1 was used to classify ZF and OF slags over the total spectral range. If no correct classification was found, the spectrum was then analyzed with SIMCA 2 at reduced wavenumbers to classify LF and BF spectra. As a result, we provide a time- and cost-efficient method based on FT-IR spectroscopy for processing and identifying large numbers of environmental slag samples.



Received: November 22, 2011
Revised: March 5, 2012
Accepted: March 5, 2012
Published: March 5, 2012
© 2012 American Chemical Society
dx.doi.org/10.1021/es204187r | Environ. Sci. Technol. 2012, 46, 3964−3972

INTRODUCTION
In the past, slags generated as the solid coproducts of steel, iron, or metal production reached a worldwide annual production of more than 50 million tons, with approximately 12 million tons of slags in Europe alone.1,2 There are different classes of steel industry slags, each named after the process by which it is generated.1 Ladle furnace (LF), blast furnace (BF), and oxygen furnace steel (OF) slags are produced as the nonmetallic coproducts of iron and steel production, whereas zinc furnace (ZF) slags are coproducts of extractive metallurgy processes. In the 19th century the use of slags for construction and landscaping became common practice in Europe, where the incentive to make all possible use of industrial byproducts was strong and storage space for byproducts was lacking.3 During these times, the different slag types were deposited without control in landfills or used as construction material for roads, children's playgrounds, or sports fields.4 Thus, although present environmental laws strictly control the use of slags, worldwide up to 35% of environmental slag material is of unknown origin.2,3,5 At environmental sites, slag samples occur pure or as mixtures with natural substrates in urban soils. Since slag materials are more or less resistant to weathering,4,6 these environmental samples are typically in the size range of 2−20 cm.4 Because of its coarse fraction, slag material can be easily sampled in the field, but its attribution to different slag classes is difficult and thus generally not practiced. In Germany only a few experts are able to differentiate visually between slag types, since slag composition and structural properties are visually very similar.7 However, as will be shown in the following, there is a need to develop analytical methods for accurate identification of slag samples in the urban environment. Previous studies1,5,8 have shown that each slag type has typical chemical, mineralogical, and physical properties. Thus, infiltration and leaching processes at slag deposits will be mainly controlled by the presence of specific slag types.4 Additionally, because of their physical properties and chemical composition, only certain slags are considered for recycling or reuse.1 For example, Gupta et al.9 focused on the reuse of slag material as low-cost adsorbents for water treatment. Of further environmental concern are slag classes with high concentrations of heavy metals. While blast furnace and oxygen furnace steel slags are generally low in heavy metal content, ladle furnace slags and zinc furnace slags often have high contents of heavy metals.4,10−12 Although techniques such as mobile XRF can be useful for fast identification of heavy-metal-contaminated sites, they cannot identify the source of the contamination. However, this aspect is highly important for risk assessment, since heavy metals within the slag matrix behave differently than within natural substrates. Mansfeldt and Dohrmann,5 Proctor et al.,4 and Bunzel et al.13 showed that the hazardous effects of incidental ingestion and the leaching risks decrease when heavy metal contamination within urban soils is due to slag products, because slag-borne metals are strongly bound. This means that if environmental heavy metal contamination is detected, it is important to be able to decide whether this contamination is due to the slag material deposited there. Based on this, the objective of our study was to develop a reliable method for identifying field-collected slag samples with regard to their slag type. Since diffuse reflectance Fourier transform infrared (FT-IR) spectroscopy is known as a rapid, low-cost method,14−16 we decided to use FT-IR spectroscopy in combination with unsupervised and supervised statistical classification procedures to develop an identification tool for environmental slag samples.

MATERIAL AND METHODS
Slag Sampling. To gain a representative set of different environmental slag samples, slags were collected in various cities within the industrial Ruhr area (NRW, Germany), which has been one of the main industrial regions for coal mining and steel production in Germany since the 1850s.17 In the urban environment slag products can occur pure or as mixtures with natural soil substrates,7 and this aspect was taken into account in the sampling strategy. On the one hand, samples were taken from sites where slags were used as base material for parking areas or sports fields (see Supporting Information (SI) Table S1). On the other hand, slags were sampled from the upper layer of urban soils where they occurred as mixtures with natural substrates, for example at children's playgrounds (see SI Table S1). All samples were in the size range of 2−20 cm, as is typical for environmental slag samples.4 Since slags were deposited at these urban sites without control in the past, no information was available regarding the age of the deposits. However, given the “industrial boom” during the second half of the 19th century, we assume that some of the samples were dumped more than 100 years ago. After sampling, slags were visually identified by the coauthor Dr. Bernd Steinweg, who is one of the few experts in this field. According to this classification, 9 samples were identified as ladle furnace slags (LF), 11 as blast furnace slags (BF), 6 as oxygen furnace steel (OF), and 9 as zinc furnace (ZF) slags (see SI Table S1). For all further measurements these slag samples were ground in an agate ball mill (Retsch, Germany) to 20 μm and then dried at 105 °C.

SLAG SAMPLE CHARACTERIZATION
Chemical Slag Composition. To determine the main slag constituents via element analysis, X-ray fluorescence spectroscopy (XRF) was used as described by Navarro et al.1 For this purpose, the milled slag powder was pressed into a powder pellet and analyzed for major elements with the wavelength-dispersive system of a PHILIPS PW 2404 X-ray fluorescence spectrometer. The total heavy metal content of all slag samples was also determined by acid digestion according to Brokbartold et al.18 In the extracts, heavy metals were analyzed by ICP-AES (Ciros, Spectro Analytical Instruments GmbH, Kleve, Germany).
Mineral Slag Composition. The mineral composition of the slag samples was determined according to Miao et al.19 using X-ray powder diffraction (XRD) measurements. Diffraction patterns were recorded in reflection geometry with a Panalytical MPD diffractometer equipped with a copper tube, 0.5° divergence and antiscatter slits, a 0.2-mm high receiving slit, incident and diffracted beam 0.04 rad Soller slits, and a secondary graphite monochromator. The detailed procedure of the qualitative phase analysis is described in Miao et al.19
FT-IR Spectroscopy. About 50 mg of the raw sample was transferred to custom-made aluminum microplates with 24 wells. Since raw slag material was used, the samples were not diluted with KBr. Consistent with previous studies,14−16,20 neat slag sample powder was filled into the sample holder, and the surface was smoothed with a spatula to obtain a plane sample surface and thus to avoid scattering effects as far as possible. Spectra were recorded in five replicates using a Bruker Tensor 27 equipped with an automated high-throughput device (Bruker HTS-XT, Ettlingen) and operating with a liquid N2-cooled mercury−cadmium telluride (MCT) detector. For further processing, these five sample replicates were averaged and not used as independent samples. Employing a broadband KBr beam splitter, spectra in the mid-infrared region from 4000 to 400 cm−1 were recorded at a resolution of 4 cm−1. Gold was used as background sample. Each sample was measured with 120 scans in order to increase the signal-to-noise ratio (SNR) and thus improve the spectra quality. More details are described in Stumpe et al.20 Prior to use, a variety of mathematical pretreatments such as standard normal variate (SNV), first and second derivatives, or unit variance scaling were performed according to Reeves et al.14,16 in order to obtain the most representative data set for the further statistical classification procedures.
Unsupervised Pattern Recognition. In the life and environmental sciences, different classification or pattern recognition techniques are used for spectroscopic data.21 As a first classification step, principal component analysis (PCA) is commonly used as an unsupervised pattern recognition technique to detect groups in the measured data set.22 Thus, a PCA was applied to identify groupings of slag spectra between the different slag classes. In the course of a PCA, a score vector can be calculated for each sample on a given principal component (factor vector). Score vectors provide the principal component (PC) composition related to the slag sample, while the loading vectors provide this composition related to the variables.23 Since the first score vectors account for the highest variance in the different spectra, the first two calculated scores were used to describe similarities between spectra. The PCA algorithm was used with mean-centered data.
Supervised Pattern Recognition. As a second classification step, supervised classification tools such as support vector machines (SVMs), neural networks (NNs), or soft independent modeling of class analogy (SIMCA) can be used.21,24,25 While SVMs and NNs require large sample sizes and low within-class variability for good classification results, SIMCA is an established method for multivariate classification, especially for high within-class variability.24 SIMCA is based on PCA modeling performed for each class in the calibration set. Unknown samples are compared to the PCA class models and assigned to a class according to their analogy with the calibration samples. Since high within-class variability is covered in this way by the principal components calculated for each class, SIMCA is one of the most commonly used class-modeling techniques for the classification of spectral data.21,23−25 For the slag spectra data set, SIMCA modeling was performed as follows. The slag spectra data set comprised all averaged slag spectra. Each slag group was modeled using PCA, and outliers were eliminated based on the Hotelling T2 ellipse so that the PCA models describe the structure of each slag group as well as possible. The optimal number of PCs was chosen for each model separately according to a suitable cross-validation procedure. Based on this training data set, classification rules were established within the SIMCA procedure to allow slag samples of unknown origin to be classified. Since the quality of the classification rules needs to be validated, a leave-one-out cross-validation procedure within the SIMCA data set was performed25 using the averaged spectra of the five replicates.
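The class-modeling idea described above can be illustrated with a minimal NumPy sketch. It is not the procedure implemented in The Unscrambler: the function names (`fit_class_pca`, `residual_ratio`, `simca_predict`, `loo_percent_correct`) are hypothetical, the Hotelling T2 outlier removal and the significance test are omitted, and a spectrum is simply assigned to the class whose PCA model leaves the smallest residual variance relative to that class. The data are synthetic and only serve to make the sketch runnable.

```python
import numpy as np


def fit_class_pca(X, n_pcs):
    """PCA model of one class: mean spectrum, loadings, residual variance."""
    mean = X.mean(axis=0)
    centered = X - mean
    # Rows of vt are the loading vectors of the mean-centered class data.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    loadings = vt[:n_pcs]
    resid = centered - centered @ loadings.T @ loadings
    n, p = X.shape
    dof = max(n - n_pcs - 1, 1) * (p - n_pcs)
    return mean, loadings, (resid ** 2).sum() / dof


def residual_ratio(x, model):
    """Residual variance of one spectrum relative to the class residual variance."""
    mean, loadings, s0_sq = model
    centered = x - mean
    resid = centered - centered @ loadings.T @ loadings
    return (resid ** 2).sum() / (x.size - loadings.shape[0]) / s0_sq


def simca_predict(x, models):
    """Simplified rule (assumption): pick the class with the smallest ratio."""
    return min(models, key=lambda c: residual_ratio(x, models[c]))


def loo_percent_correct(X, y, n_pcs):
    """Leave-one-out cross-validation; returns Nc / (Nc + Nic) * 100."""
    y = np.asarray(y)
    n_correct = 0
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        models = {c: fit_class_pca(X[keep & (y == c)], n_pcs) for c in np.unique(y)}
        n_correct += int(simca_predict(X[i], models) == y[i])
    return 100.0 * n_correct / len(y)


# Synthetic demo data (made up, not slag spectra): two classes of 8 "spectra".
rng = np.random.default_rng(0)
grid = np.linspace(0.0, np.pi, 50)


def fake_class(base, n=8):
    # One direction of within-class variability plus small random noise.
    scores = rng.normal(size=(n, 1))
    return base + scores * grid + 0.01 * rng.normal(size=(n, grid.size))


X = np.vstack([fake_class(np.sin(grid)), fake_class(np.cos(grid))])
y = np.array(["A"] * 8 + ["B"] * 8)
cc = loo_percent_correct(X, y, n_pcs=1)
```

In this sketch each left-out spectrum is classified by models that were fitted without it, which mirrors the leave-one-out validation of the averaged spectra; a fuller implementation would also allow a spectrum to be rejected by all classes or accepted by several.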
According to Galtier et al.,21 the percentage of correct classification (%CC, eq 1) was used as the criterion to compare classification results

$$\%\mathrm{CC} = \frac{N_c}{N_c + N_{ic}} \times 100 \qquad (1)$$

with $N_c$ as the number of correct and $N_{ic}$ as the number of incorrect classifications. To improve the classification through the selection of variables, SIMCA provides the discrimination power (DP) as a quality parameter. The DP describes how well a variable discriminates between two PCA class models. To calculate the DP, each sample has to be fitted to both class models, and the residuals have to be calculated for each variable as described in eq 2. As eq 2 makes clear, a variable with a high DP is very important for the differentiation between the two corresponding classes; variables with a discrimination power higher than 3 can be considered quite important.26 For each slag class, the discrimination powers with respect to all other classes were calculated and then added, so that a single parameter characterizes the most important wavenumbers for discrimination between the slag classes

$$\mathrm{DP} = \frac{AB^2 + BA^2}{AA^2 + BB^2} \qquad (2)$$

with $AB$ as the residuals for samples in class A fitted to the model of class B, $BA$ as the residuals for samples in class B fitted to the model of class A, $AA$ as the residuals for samples in class A fitted to the model of class A, and $BB$ as the residuals for samples in class B fitted to the model of class B. Additionally, the Fisher-Snedecor variable (F) (eq 3) of the original data set was calculated for each variable according to Galtier et al.21 to characterize each slag class and also to explain the variable selection for SIMCA. F is expressed as the ratio between the interspectral variance (InterV) of all spectra (n spectra) and the intraspectral variance (IntraV) of all spectra within one class (j spectra) at each wavenumber. As previously described by Galtier et al.,21 the Fisher-Snedecor test and the Fisher criterion were used to determine whether variances are significantly different

$$F = \frac{\mathrm{InterV}}{\mathrm{IntraV}} = \frac{\sum_{k=1}^{n} (A_{kx} - A_x)^2 / (n - 1)}{\sum_{i=1}^{j} (a_{ix} - a_x)^2 / (j - 1)} \qquad (3)$$

with $a_{ix}$ being the absorbance of spectrum i at wavenumber x, $a_x$ the mean absorbance of all j spectra in the class at wavenumber x, $A_{kx}$ the absorbance of spectrum k at wavenumber x, and $A_x$ the mean absorbance of all n mean spectra of each class at wavenumber x.
Software. The chemometric applications were performed with The Unscrambler software version 10.01 from CAMO (Computer Aided Modeling, Trondheim, Norway) as well as with the programming language R (Gui) using the package pcaMethods.

RESULTS AND DISCUSSION
Chemical Slag Composition. The major constituents of the slag samples determined via X-ray fluorescence spectroscopy (XRF) are listed in Table S2 (SI). Generally, the main slag constituents (71 to 94%) are silica, calcium, magnesium, aluminum, manganese, and iron compounds. Silica and calcium in particular are the primary constituents of the slags, accounting for up to 72%, which is consistent with the chemical slag characterization of previous studies.2,3,8 However, the percentage of each constituent and thus the compound ratios vary widely between the different slag classes. For example, whereas the LF slags have a Si/Ca ratio of about 2:1, the Si/Ca ratio of the BF slags is the reciprocal. This means that the chemical slag class characteristics are mostly represented by different main component ratios. Additionally, ZF slags showed the most distinct chemical composition, since the main slag constituents account for only 71% of their chemical composition (see SI Table S2). As will become clear in the following, the remaining 29% were identified as heavy metals. The concentrations of heavy metals of each slag sample are listed in Table S1 (SI). The heavy metal concentrations vary strongly between the four slag classes. While the heavy metal concentrations in the LF and BF slags were all below the trigger values of the German soil protection ordinance, consistent with the study of Proctor et al.,4 the heavy metal concentrations of the OF and especially the ZF slags exceed the tolerable environmental concentrations by far. For OF slags, Cr concentrations are up to 10 times above tolerable levels, and for ZF slags almost all heavy metal concentrations exceed the tolerable limits in soils. This clearly shows that it is important to differentiate between slag classes because of their different environmental hazards.
Mineralogical Slag Composition. Table S3 (SI) shows the major mineral phases of the four slag classes determined via XRD analysis. Since the baselines of the diffractograms showed no increase, no amorphous slag phases were detected (data not shown). Thus, all patterns could be assigned to slag minerals. The XRD analysis revealed the presence of calcium, aluminum, and magnesium silicates as major constituent phases of all four slag classes, as described earlier by Navarro et al.1 Other minor phases were very difficult to assign because of the complexity of the diffractograms. Interestingly, in contrast to the study of Navarro et al.,1 no free CaO was detected, although it is often used as a basic slag additive.1,8 This may be due to the fact that Navarro et al.1 obtained their slag samples directly from the steel plants, whereas our slag samples were collected in urban soils throughout the industrial Ruhr area. Thus, weathering and leaching processes seem to have removed CaO or transformed this phase over time.

Figure 1. Original (a), UV scaled (b), and SNV transformed (c) spectra (left) and PC1−PC2 score plots of the original (a), UV scaled (b), and SNV transformed (c) spectra (right) of the ladle furnace (LF, red), blast furnace (BF, blue), oxygen furnace steel (OF, green), and zinc furnace (ZF, gray) slags.

However, although slags are complex materials consisting of a mixture of crystalline mineral phases (see SI Table S3), differences in the mineral composition between the four slag classes are evident. Consistent with the chemical composition discussed above, LF and BF slags revealed relatively simple diffraction patterns and could be described as pure calcium silicate and calcium magnesium silicates, respectively. Although the OF and ZF slags also mainly contain calcium and magnesium silicates, both slag classes revealed more heterogeneous diffraction patterns. Both are additionally composed of aluminum and calcium aluminum silicates, whereas the OF slags also contain iron oxides and the ZF slags zinc silicates.
FT-IR Analysis. In a first step, the FT-IR spectra were analyzed qualitatively (Figure 1 and Figure S4 (SI)). Although most of the spectra came from different locations all over the industrial zone, slags within one class clearly display similar MIR spectra (Figure 1). As demonstrated above, the composition of slags is complex.1 Thus, it is difficult to attribute slag mineralogy directly to spectral absorbance intervals, since interval characteristics are likely affected by overlapping bands of different slag constituents. However, some mineralogical information could be attributed to differences in the slag spectral response. In Figure S4 (SI) all slag spectra of one class were averaged to visualize slag-class-specific mineral band assignments. The mean LF slag spectrum reflected the presence of silica in the broad and very sharp absorption peak at 1190 cm−1, since absorption here is due to Si−O−Si asymmetric stretching.27 There is also evidence of calcium silicates with a sharp absorption peak at 572 cm−1.28 However, consistent with their diffraction pattern, the LF spectra are relatively simple. The mean BF slag spectrum is more complex. It revealed a broad absorption band at 3460 cm−1, which is characteristic of hydroxyl stretching vibrations of calcium/aluminum hydroxides and oxyhydroxides.1 Further, the sharp BF absorption peaks at 1629 and 1457 cm−1 were assigned to C−O stretching modes,28 indicating the presence of some sort of carbonate minerals, although no carbonates were detected as main phases in the X-ray diffraction patterns. The BF peaks at 1116 and 543 cm−1 were related to silica and calcium silicates, respectively.28 The mean OF slag spectrum reveals a broad band at 3450 cm−1 and a very sharp peak at 1488 cm−1, which were attributed to the presence of hydroxides/oxyhydroxides and carbonate minerals, respectively.28 The very sharp peak at 3640 cm−1 corresponds to hydroxyl stretching vibrations of calcium/aluminum hydroxides or kaolinite.29 However, no intense peaks from Si−O−Si bond vibrations were observed. The mean ZF slag spectrum showed a broad band around 3195 cm−1, which again was attributed to hydroxide and oxyhydroxide stretching modes.1 Similar to the BF slags, small but sharp peaks appear at 1620 and 1496 cm−1, which were assigned to C−O stretching modes of the CO₃²⁻ ions of carbonates.30 No Si−O−Si bond vibrations were observed, since the characteristic peaks at 1190 and 576 cm−1 were absent. However, a broad peak at 856 cm−1 was related to calcium silicates and Al−O stretching vibrations of oxides.27 The increase in absorption intensity at wavenumbers below 447 cm−1 likely corresponds to iron oxides such as Fe2O3.28 Figure S4 (SI) also shows the PC1 loading vectors of each slag class. Since the loading vectors represent the most important slag-class-specific mineral phases, the slag fingerprint intervals were defined based on the slag class PC1 loadings. The intervals between 3600 and 2600 cm−1, between 1500 and 1000 cm−1, and between 700 and 500 cm−1 clearly show the highest loading intensities, so these MIR intervals were considered the fingerprint regions for the slag class differentiation.
Unsupervised Pattern Recognition for Slag Spectra Classification. Muik et al.25 stated that before using supervised pattern recognition techniques, a PC analysis is advisable to evaluate whether clustering exists in a data set without using class membership information in the calculation, even if the memberships of the samples are known a priori. Thus, similar to Vogt et al.,31 a PCA on the mean-centered data without spectral pretreatment was performed as a first step. Figure 1 shows the score plot projection of the PCA performed on the full MIR slag spectra data set as well as the corresponding slag spectra. Since PC1 and PC2 together explain 90% of the spectral variance, the score plot is projected in the PC1−PC2 plane. The score vectors of the individual spectra are clearly separated according to the four slag classes, which is a prerequisite for a possible classification via SIMCA modeling. The ZF slag spectra were the most homogeneous of all. To improve class separation, different data pretreatment methods such as data scaling and transformations as described in Heinz et al.32 were tested on the slag spectra data set. In Figure 1, the unit-variance (UV) scaled (b) and the standard normal variate (SNV) transformed (c) slag spectra are shown with the corresponding PC1−PC2 score plots. Since SNV, as a data set-independent transformation, is a common and effective data pretreatment procedure that removes major effects of light scattering,33 the data set-dependent multiplicative scattering correction (MSC) can be avoided. Thus, extensions of the training data set will not have any influence on the spectral pretreatment procedures. Clearly, the unit-variance scaling resulted in depressed absorption peaks within the fingerprint intervals (Figure 1b). This means that these wavenumbers become less important for all further PC analyses, although they represent the most important chemical information of the slag samples. Thus, although this scaling is often used to improve PC analysis,32 this pretreatment did not improve the slag class clustering by PCA. This became apparent in the PC1−PC2 score plot of the UV scaled data set (Figure 1b), where the clustering of the four slag classes was not improved compared with the clustering of the non-pretreated spectral data (Figure 1a). This is likely because the influence of the slag-type-specific information within the fingerprint intervals was suppressed by unit-variance scaling. Although all slag samples were ground with constant mill intensity, the powders of the different slag samples may vary in particle size. As a consequence, scattering effects likely occur within the FT-IR spectra. To reduce this effect, the spectra were standard normal variate (SNV) transformed (Figure 1c).32 Although the variance explained by the first two PCs decreased to 85% because unimportant scattering information was removed, a better separation of ZF and OF slags was obtained (Figure 1c). For the BF and LF slag classes this pretreatment did not improve the clustering. This is explained by the fact that, although the peaks within the fingerprint interval between 1500 and 1000 cm−1 are characteristic of the respective slag class, these peaks occur at slightly different wavenumbers, especially within the BF and LF slag classes. This shift in band positions seems to control and complicate the classification procedure and cannot be diminished by any data pretreatment.
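The difference between the two pretreatments and the score calculation can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the routine used in The Unscrambler or pcaMethods: the helper names `snv`, `uv_scale`, and `pca_scores` are hypothetical, UV scaling is taken as division of each wavenumber by its standard deviation over the data set, and the toy spectra are invented.

```python
import numpy as np


def snv(spectra):
    """Standard normal variate: center and scale every spectrum (row) on its own."""
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std


def uv_scale(spectra):
    """Unit-variance scaling: divide every wavenumber (column) by its standard
    deviation over the whole data set, so the result depends on the data set."""
    return spectra / spectra.std(axis=0, ddof=1)


def pca_scores(spectra, n_pcs=2):
    """Scores on the first PCs of the mean-centered data and the fraction of
    variance explained by each of these PCs."""
    centered = spectra - spectra.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    explained = s ** 2 / (s ** 2).sum()
    return u[:, :n_pcs] * s[:n_pcs], explained[:n_pcs]


# Toy data (invented): the second row is the first one with a multiplicative
# and an additive scatter-like distortion; the third row is a different shape.
base = np.array([0.1, 0.5, 0.9, 0.4, 0.2])
toy = np.vstack([base, 2.0 * base + 3.0, base ** 2])
toy_snv = snv(toy)
scores, explained = pca_scores(toy_snv, n_pcs=2)
```

After SNV the first two toy spectra coincide, which is the sense in which the transformation removes scatter-like offsets and scale factors without reference to the rest of the data set. UV scaling, in contrast, equalizes the variance of all wavenumbers and therefore reduces the weight of intervals with strong bands.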

Table 1. Results of the SIMCA 1 and SIMCA 2 Classification for All Slag Classes (LF, BF, ZF, OF)a

model   | number of wavenumbers used | group | correct classification [%] | not recognized by any of the classes [%] | allocated to more than one class [%] | significance level | number of PCs used
SIMCA 1 | 1654                       | LF    | 39                         | 0                                        | 62                                   | 0.001              | 4
SIMCA 1 | 1654                       | BF    | 0                          | 90                                       | 10                                   | 0.05               | 5
SIMCA 1 | 1654                       | ZF    | 100                        | 0                                        | 0                                    | 0.001              | 4
SIMCA 1 | 1654                       | OF    | 100                        | 0                                        | 0                                    | 0.001              | 2
SIMCA 2 | 706                        | LF    | 100                        | 0                                        | 0                                    | 0.001              | 3
SIMCA 2 | 706                        | BF    | 100                        | 0                                        | 0                                    | 0.001              | 5

a PC means principal component.
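The stepwise use of the two models in Table 1 can be written as a short decision routine. The sketch below is only an illustration of the control flow: `classify_stepwise` and `percent_correct` are hypothetical names, the two classifiers are passed in as plain callables that return a class label or None, and `reduced_idx` stands for the indices of the reduced set of wavenumbers, all of which are assumptions rather than part of the published procedure.

```python
def classify_stepwise(spectrum, simca1, simca2, reduced_idx):
    """Apply SIMCA 1 on the full spectrum, then SIMCA 2 on reduced wavenumbers.

    simca1 and simca2 are assumed callables returning a class label or None.
    """
    label = simca1(spectrum)
    if label in ("ZF", "OF"):
        # The procedure ends once the spectrum is assigned to ZF or OF.
        return label
    reduced = [spectrum[i] for i in reduced_idx]
    label = simca2(reduced)
    # SIMCA 2 only distinguishes LF and BF; anything else stays unclassified.
    return label if label in ("LF", "BF") else None


def percent_correct(predicted, expected):
    """Percentage of correct classification, Nc / (Nc + Nic) * 100."""
    n_correct = sum(p == e for p, e in zip(predicted, expected))
    return 100.0 * n_correct / len(expected)
```

Keeping the two models behind a common callable interface makes it easy to swap in models trained on a larger sample set later without touching the decision logic.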

Figure 2. F variable (inter/intra variance) and SNV transformed spectra (light gray lines) (left) and the Discrimination Power (D-Power, right) of the slag classes LF (red), BF (blue), ZF (gray), and OF (green).
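The two quantities plotted in Figure 2 can be computed per wavenumber as sketched below. The sketch rests on several assumptions that the text does not fix: the inter variance is taken over all spectra of the data set and the intra variance over the spectra of one class, both as sample variances; the squared residuals in the discrimination power are averaged over the samples of a class; and the function names (`f_variable`, `pca_residuals`, `discrimination_power`) as well as the synthetic data are hypothetical.

```python
import numpy as np


def f_variable(all_spectra, class_spectra):
    """Inter/intra variance ratio at every wavenumber for one class."""
    inter = all_spectra.var(axis=0, ddof=1)    # variance over all n spectra
    intra = class_spectra.var(axis=0, ddof=1)  # variance over the j class spectra
    return inter / intra


def pca_residuals(X, model_X, n_pcs):
    """Residuals of X after fitting it to the PCA model built from model_X."""
    mean = model_X.mean(axis=0)
    _, _, vt = np.linalg.svd(model_X - mean, full_matrices=False)
    loadings = vt[:n_pcs]
    centered = X - mean
    return centered - centered @ loadings.T @ loadings


def discrimination_power(XA, XB, n_pcs):
    """Per-variable ratio (AB^2 + BA^2) / (AA^2 + BB^2) for classes A and B."""
    ab = (pca_residuals(XA, XB, n_pcs) ** 2).mean(axis=0)  # A fitted to model B
    ba = (pca_residuals(XB, XA, n_pcs) ** 2).mean(axis=0)  # B fitted to model A
    aa = (pca_residuals(XA, XA, n_pcs) ** 2).mean(axis=0)  # A fitted to model A
    bb = (pca_residuals(XB, XB, n_pcs) ** 2).mean(axis=0)  # B fitted to model B
    return (ab + ba) / (aa + bb)


# Synthetic demo (invented data): six variables, both classes vary along the
# same direction, and class A carries a constant offset at variable index 2.
rng = np.random.default_rng(1)
direction = np.array([1.0, 1.0, 0.0, 1.0, 1.0, 1.0]) / np.sqrt(5.0)


def fake_class(offset, n=10):
    return rng.normal(size=(n, 1)) * direction + 0.05 * rng.normal(size=(n, 6)) + offset


XA = fake_class(np.array([0.0, 0.0, 5.0, 0.0, 0.0, 0.0]))
XB = fake_class(np.zeros(6))
f_a = f_variable(np.vstack([XA, XB]), XA)
dp = discrimination_power(XA, XB, n_pcs=1)
```

In the demo both measures peak at the variable that separates the classes. With real spectra the two can disagree, for instance when band shifts inflate the intra variance at a wavenumber that the class PCA models nevertheless describe well.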

3969

dx.doi.org/10.1021/es204187r | Environ. Sci. Technol. 2012, 46, 3964−3972

Environmental Science & Technology

Article

power obtained from SIMCA classification describes which variables discriminate best between PCA models. Generally, wavenumbers with discrimination power higher than 3 can be considered to be important for class differentiation. The discrimination power is shown in Figure 2 separately for each slag class. Obviously, a high discrimination power corresponds to a high F variable or rather a low intra variance which is not surprising. However, even at some intervals with a low F variable the discrimination power can be high. For example, the LF and BF slag spectra had low F variables within the fingerprint interval, but the discrimination power at this wavelength is one of the highest of all. This means that transforming the spectral information into principal components using PCA modeling, the most important chemical or mineralogical information of the slag classes will be extracted despite of band shifts (see SI Figure S4) and corresponding low F variables. Since the discrimination power combines information of the F variable as well as of important chemical characteristics of the slag classes it was selected as an adequate parameter for variable selection to obtain a better SIMCA classification. Figure S5 (SI) shows the most important wavenumbers for the differentiation between the slag classes (discrimination power values >7) separately for each slag class. Regarding their discrimination power the wavenumbers of the ZF and OF slag classes as well as the LF and BF slag classes were quite similar. Since for ZF and OF almost the total spectral range was important for classification, the 100% correct classification of the ZF and OF slags over 1654 wavelength (Table 1, SIMCA 1) is explained with the discrimination parameter. For the classification of LF and BF slags less wavenumber are relevant which are mainly located in the fingerprint intervals. 
Thus, to gain improved classification results for LF and BF slags, wavenumbers (706) with a discrimination power higher than five were selected which are most important for the discrimination between these two slags. Based on these wavenumbers new PCA models for both slag classes were performed for further SIMCA classification (Table 1, SIMCA 2). The results obtained with SIMCA 2 resulted in a 100% correct classification of the LF and BF slags at a significance level of 0.001 (Table 1). Thus, for precise classification results via SIMCA modeling two different SIMCA models have to be used stepwise. At first, SIMCA 1 has to be used for classification of ZF as well as OF slags over the total spectral range. If an unknown slag sample will be correctly classified to the ZF or OF slag group the classification procedure ends. If no correct classification is found, then the spectrum is analyzed with SIMCA 2 at reduced wavenumbers for the classification of LF as well as BF spectra. Although SIMCA classification proved to be a suitable classification tool dealing with low sample numbers,24 still, before practical use, the number of slag samples within a group needs to be increased to generate robust classification models. However, this classification system is only applicable when slag samples occur pure or in the coarse fraction in the environment. Since slag products were generally distributed as rather coarse fragments with 2 to 15 cm diameter, these samples can be easily retired for analysis even when mixed into urban soils. Only in rare cases, finer grained slag materials have been found in the environment.4,34 If materials or other typically fine grained technogenic substrates such as ashes are components of urban soils, the separation of this material from the soil matrix will be difficult. Thus, our current research

The original, UV scaled, and SNV transformed training data sets were used for the SIMCA modeling procedure to identify the most suitable data pretreatment for precise classification. The original as well as the SNV transformed data sets gave the best discrimination results. However, we selected the SNV transformed data sets, since eliminating unwanted scattering effects yields more robust classification rules. Thus, in the following, details of the SIMCA modeling procedure are described only for the SNV transformed training data set.

Supervised Pattern Recognition Techniques for Slag Spectra Classification. Because of the promising results of the unsupervised PCA technique, the PCA-based supervised SIMCA classification technique was applied to the slag spectral data. Thus, PCA models over the total spectral range (4000−400 cm−1) were built for each slag class separately and used for SIMCA classification. For stable PCA models, slag classes were represented by different numbers of cross-validated significant principal components (Table 1). Table 1 shows the percentage of correct classification obtained via the SIMCA cross-validation procedure. Results were obtained from the mean-centered, SNV transformed spectra. All slag spectra belonging to the OF or ZF class were classified 100% correctly at a significance level of 0.001. However, SIMCA gave poor results for the classification of the LF and BF slags: only 39% of the LF slag spectra were correctly classified, while the classification of BF slag spectra failed completely, even at a significance level of 0.05. In order to explain these differences in classification accuracy and to improve slag classification, the Fisher-Snedecor variable (F) of the spectral information of the different slag classes was analyzed, as reported by Galtier et al.21
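As an illustration of this kind of analysis, the sketch below computes an F variable per wavenumber as the ratio of inter-class to intra-class variance, in the sense of a one-way analysis of variance. This is an assumption about the general form only; the exact formulation used by Galtier et al. and by the authors may differ. The function name and the toy spectra are hypothetical.

```python
def fisher_variable(classes):
    """Inter/intra variance ratio for every spectral variable.

    classes: dict mapping a class label to a list of spectra, where each
             spectrum is a list of absorbances of equal length.
    Returns one F value per variable (one-way ANOVA form, assumed here).
    """
    n_vars = len(next(iter(classes.values()))[0])
    n_total = sum(len(spectra) for spectra in classes.values())
    n_classes = len(classes)
    f_values = []
    for j in range(n_vars):
        # Collect the values of variable j for every class.
        columns = [[s[j] for s in spectra] for spectra in classes.values()]
        means = [sum(c) / len(c) for c in columns]
        grand_mean = sum(sum(c) for c in columns) / n_total
        # Inter variance: spread of the class means around the grand mean.
        inter = sum(
            len(c) * (m - grand_mean) ** 2 for c, m in zip(columns, means)
        ) / (n_classes - 1)
        # Intra variance: spread of the samples around their own class mean.
        intra = sum(
            (x - m) ** 2 for c, m in zip(columns, means) for x in c
        ) / (n_total - n_classes)
        f_values.append(inter / intra if intra > 0 else float("inf"))
    return f_values


# Toy example with two classes, two spectra each, two variables (not real data).
toy = {
    "A": [[1.0, 1.0], [1.2, 3.0]],
    "B": [[2.0, 2.0], [2.2, 0.0]],
}
f_toy = fisher_variable(toy)
```

In the toy data, the first variable separates the two classes with little scatter inside each class and therefore receives a much higher F value than the second variable, where the scatter within each class is large relative to the distance between the class means.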
As a result, the Fisher-Snedecor variable is represented as the ratio of inter/intra variance in Figure 2 separately for each slag class together with the slag spectra. Generally, the F variable is high when the intra variance is much lower than the inter variance. This means that high F variables within a spectral interval identify those wavenumbers which best represent the spectral class information. Generally, the distribution of the highest F variables and the corresponding representative spectral intervals showed similar patterns for the LF and BF slags as well as for BF and OF slags. For the LF and BF slags, the highest F variables of spectral information occurred at 3600−2500 cm−1, whereas the lowest F variables were found especially in the fingerprint intervals at 1500 and 1000 cm−1 as well as 700−500 cm−1. These low F variables in the fingerprint intervals seem to be due to the fact that the characteristic peaks occur at slightly different wavenumbers. Thus, these band shifts within the LF and BF slag spectra resulted in a high intra variance. However, this means that, based on the F variable as a quality parameter, the spectral information at 3600−2500 cm−1 best characterizes both slag classes, whereas the fingerprint intervals at 1500 and 1000 cm−1 as well as 700−500 cm−1 do not appear to be representative or useful for stable classification. In contrast, the F variable of the BF and OF slag classes is low over the total spectral range except in the interval at 1300−1200 cm−1. This means that both slag classes have a high intra variance, as shown in Figure 2 for the single slag class spectra. Based on the F variable, only the small interval at 1300−1200 cm−1 within the fingerprint intervals seems to be useful for classification. In addition to the F variable, the discrimination power was used as a parameter for identifying the most important wavenumbers for precise slag classification. The discrimination

dx.doi.org/10.1021/es204187r | Environ. Sci. Technol. 2012, 46, 3964−3972


(13) Bunzl, K.; Trautmannsheimer, M.; Schramel, P.; Reifenhauser, W. Availability of arsenic, copper, lead, thallium, and zinc to various vegetables grown in slag-contaminated soils. J. Environ. Qual. 2001, 30 (3), 934−939, DOI: 10.2134/jeq2001.303934x.
(14) Reeves, J. B. Near- versus mid-infrared diffuse reflectance spectroscopy for soil analysis emphasizing carbon and laboratory versus on-site analysis: Where are we and what needs to be done? Geoderma 2010, 158, 3−14, DOI: 10.1016/j.geoderma.2009.04.005.
(15) Janik, L. J.; Forrester, S. T.; Rawson, A. The prediction of soil chemical and physical properties from mid-infrared spectroscopy and combined partial least-squares regression and neural networks (PLSNN) analysis. Chemom. Intell. Lab. Sys. 2009, 97 (2), 179−188, DOI: 10.1016/j.chemolab.2009.04.005.
(16) Reeves, J. B.; McCarty, G. W.; Reeves, V. B. Mid-infrared diffuse reflectance spectroscopy for the quantitative analysis of agricultural soils. J. Agric. Food Chem. 2001, 49 (2), 766−772, DOI: 10.1021/jf0011283.
(17) Hennings, G.; Kunzmann, K. R. Priority to local economic development: Industrial reconstructing and local development response in the Ruhr area - the case of Dortmund. In Global Challenge and Local Response; Stöhr, W. B., Ed.; The United Nations University: Tokyo, 1990.
(18) Brokbartold, M.; Wischermann, M.; Marschner, B. Plant availability and uptake of lead, zinc, and cadmium in soils contaminated with anti-corrosion paint from pylons in comparison to heavy metal contaminated urban soils. Water, Air, Soil Pollut. 2011, 223 (1), 199−2013, DOI: 10.1007/s11270-011-0851-4.
(19) Miao, S. J.; d’Alnoncourt, R. N.; Reinecke, T.; Kasatkin, I.; Behrens, M.; Schlogl, R.; Muhler, M. A Study of the Influence of Composition on the Microstructural Properties of ZnO/Al(2)O(3) Mixed Oxides. Eur. J. Inorg. Chem. 2009, 7, 910−921, DOI: 10.1002/ejic.200800987.
(20) Stumpe, B.; Weihermüller, L.; Marschner, B. Sample preparation and selection for qualitative and quantitative analyses of soil organic carbon with mid-infrared reflectance spectroscopy. Eur. J. Soil Sci. 2011, 62 (6), 849−862, DOI: 10.1111/j.1365-2389.2011.01401.x.
(21) Galtier, O.; Abbas, O.; Le Dreau, Y.; Rebufa, C.; Kister, J.; Artaud, J.; Dupuy, N. Comparison of PLS1-DA, PLS2-DA and SIMCA for classification by origin of crude petroleum oils by MIR and virgin olive oils by NIR for different spectral regions. Vib. Spectrosc. 2011, 55 (1), 132−140, DOI: 10.1016/j.vibspec.2010.09.012.
(22) Ciosek, P.; Brzozka, Z.; Wroblewski, W.; Martinelli, E.; Di Natale, C.; D’Amico, A. Direct and two-stage data analysis procedures based on PCA, PLS-DA and ANN for ISE-based electronic tongue Effect of supervised feature extraction. Talanta 2005, 67 (3), 590−596, DOI: 10.1016/j.talanta.2005.03.006.
(23) Brereton, R. G. Chemometrics - Data analysis for the laboratory and chemical plant; Wiley: Weinheim, Germany, 2002.
(24) Bylesjo, M.; Rantalainen, M.; Cloarec, O.; Nicholson, J. K.; Holmes, E.; Trygg, J. OPLS discriminant analysis: combining the strengths of PLS-DA and SIMCA classification. J. Chemom. 2006, 20 (8−10), 341−351, DOI: 10.1002/cem.1006.
(25) Muik, B.; Lendl, B.; Molina-Diaz, A.; Ortega-Calderon, D.; Ayora-Canada, M. J. Discrimination of olives according to fruit quality using Fourier transformed raman spectroscopy and pattern recognition techniques. J. Agric. Food Chem. 2004, 52 (20), 6055−6060, DOI: 10.1021/jf049240e.
(26) Otto, M. Chemometrics - Statistic and computer application in analytical chemistry; Wiley: Weinheim, Germany, 2007.
(27) Smith, B. Infrared spectral interpretation − A systematic approach; CRC Press: New York, USA, 1999.
(28) Gadsden, J. A. Infrared spectra of minerals and related inorganic compounds; Butterworth Group: London, U.K., 1975.
(29) Janik, L. J.; Merry, R. H.; Forrester, S. T.; Lanyon, D. M.; Rawson, A. Rapid prediction of soil water retention using mid infrared spectroscopy. Soil Sci. Soc. Am. J. 2007, 71 (2), 507−514, DOI: 10.2136/sssaj2005.0391.

activities are focusing on the identification and classification of technogenic substrates in mixtures and as part of the finer grained soil matrix.
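The stepwise procedure described in this section (SIMCA 1 over the total spectral range for ZF and OF, then SIMCA 2 at reduced wavenumbers for LF and BF) can be sketched as a simple cascade. The sketch below is an assumption-laden illustration, not the authors' implementation: the two SIMCA models are passed in as hypothetical callables that return the set of class labels accepting a spectrum, SNV is applied to the full spectrum before the reduced wavenumbers are picked, and a sample is treated as classified only when exactly one target class accepts it.

```python
from statistics import mean, stdev


def snv(spectrum):
    """Standard normal variate: center and scale a single spectrum."""
    m = mean(spectrum)
    s = stdev(spectrum)
    return [(x - m) / s for x in spectrum]


def classify_stepwise(spectrum, simca1, simca2, reduced_idx):
    """Two-step classification cascade.

    simca1, simca2: hypothetical callables taking a list of values and
                    returning a set of accepted class labels.
    reduced_idx:    indices of the wavenumbers kept for the second model.
    Returns a class label, or None if neither step gives a unique class.
    """
    x = snv(spectrum)

    # Step 1: full spectral range, only ZF and OF are accepted here.
    hits = simca1(x) & {"ZF", "OF"}
    if len(hits) == 1:
        return hits.pop()

    # Step 2: reduced wavenumbers, only LF and BF are accepted here.
    hits = simca2([x[i] for i in reduced_idx]) & {"LF", "BF"}
    if len(hits) == 1:
        return hits.pop()

    return None


# Stub models standing in for fitted SIMCA models (illustrative only).
def stub_simca1(x):
    return {"ZF"} if x[0] > 0 else set()


def stub_simca2(x_reduced):
    return {"LF"} if x_reduced[0] < 0 else {"BF"}


first = classify_stepwise([3, 2, 1], stub_simca1, stub_simca2, reduced_idx=[2])
second = classify_stepwise([1, 2, 3], stub_simca1, stub_simca2, reduced_idx=[2])
third = classify_stepwise([1, 2, 3], stub_simca1, stub_simca2, reduced_idx=[0])
```

Returning None when neither step yields a unique class keeps unassigned samples visible instead of forcing them into a class, which matches the soft character of SIMCA, where a sample may fit none or several of the class models.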



ASSOCIATED CONTENT

Supporting Information

Information on sample origin, heavy metal contents, and further elemental analysis, the mean slag class spectra, their mineral band assignments combined with one-vector loading plots, and a plot of the selected wavenumbers are provided. This material is available free of charge via the Internet at http://pubs.acs.org.



AUTHOR INFORMATION

Corresponding Author

*Phone: +49-(0)234-32-29598. Fax: +49-(0)234-32-14469. Email: [email protected].

Notes

The authors declare no competing financial interest.



ACKNOWLEDGMENTS

This work was funded by The Foundation for Innovative Science within a program of the Ruhr-University Bochum. We thank the municipality of Mönchengladbach for providing the slag samples and for approval of this paper for publication as well as the Institute for Geologie, Mineralogie and Geophysik at the Ruhr-University Bochum for the XRD and XRF analyses.



REFERENCES

(1) Navarro, C.; Diaz, M.; Villa-Garcia, M. A. Physico-chemical characterization of steel slag. Study of its behavior under simulated environmental conditions. Environ. Sci. Technol. 2010, 44 (14), 5383−5388, DOI: 10.1021/es100690b.
(2) Motz, H.; Geiseler, J. Products of steel slags an opportunity to save natural resources. Waste Manage. 2001, 21 (3), 285−293, DOI: 10.1016/S0956-053X(00)00102-1.
(3) Kalyoncu, R. S. Slag - Iron and steel. U.S. Geological Survey Publications, Ch. in Mineral Commodity Summaries 2001.
(4) Proctor, D. M.; Fehling, K. A.; Shay, E. C.; Wittenborn, J. L.; Green, J. J.; Avent, C.; Bigham, R. D.; Connolly, M.; Lee, B.; Shepker, T. O.; Zak, M. A. Physical and chemical characteristics of blast furnace, basic oxygen furnace, and electric arc furnace steel industry slags. Environ. Sci. Technol. 2000, 34 (8), 1576−1582, DOI: 10.1021/es9906002.
(5) Mansfeldt, T.; Dohrmann, R. Chemical and mineralogical characterization of blast-furnace sludge from an abandoned landfill. Environ. Sci. Technol. 2004, 38, 5977−5984, DOI: 10.1021/es040002+.
(6) Meuser, H. Contaminated urban soils; Springer: Dordrecht, Netherlands, 2010.
(7) Burghardt, W. Soils in Urban and Industrial Environments. J. Plant Nutr. Soil Sci. 1994, 157 (3), 205−214.
(8) Geiseler, J. Use of steelworks slag in Europe. Waste Manage. 1996, 16 (1−3), 59−63, DOI: 10.1016/S0956-053X(96)00070-0.
(9) Gupta, V. K.; Carrott, P. J. M.; Ribeiro Carrott, M. M. L.; Shuhas, A. B. Low-cost adsorbents: Growing approach to wastewater treatment − A review. Crit. Rev. Environ. Sci. Technol. 2009, 39 (10), 783−842, DOI: 10.1080/10643380801977610.
(10) Chaurand, P.; Rose, J.; Domas, J.; Bottero, J. Y. Speciation of Cr and V within BOF steel slag reused in road constructions. J. Geochem. Explor. 2006, 88, 10−14, DOI: 10.1016/j.gexplo.2005.08.006.
(11) Drissen, P. Binding of trace elements in steel slags. Proceedings of the 5th European Slag Conference: Luxembourg 2008.
(12) Rawlings, B. G.; Lark, R. M.; Ó Donnell, K. E.; Tye, A. M.; Lister, T. R. The assessment of point and diffuse metal pollution of soils from an urban geochemical survey of Sheffield, England. Soil Use and Manage. 2005, 21, 353−362, DOI: 10.1079/SUM2005335.


(30) Mayo, D. W.; Foil, A. M.; Hannah, R. W. Course notes on the interpretation of infrared and raman spectra; Wiley: New York, USA, 2004; DOI: 10.1002/0471690082.
(31) Vogt, N. B.; Brakstad, F.; Thrane, K.; Nordenson, S.; Krane, J.; Aamot, E.; Kolset, K.; Esbensen, K.; Steinnes, E. Polycyclic Aromatic Hydrocarbons in Soil and Air - Statistical-Analysis and Classification by the Simca Method. Environ. Sci. Technol. 1987, 21 (1), 35−44, DOI: 10.1021/es00155a003.
(32) Heinz, A.; Savolainen, M.; Rades, T. Quantifying ternary mixtures of different solid-state forms of indomethacin by Raman and near-infrared spectroscopy. Eur. J. Pharm. Sci. 2007, 32 (3), 182−192, DOI: 10.1016/j.ejps.2007.07.003.
(33) Dhanoa, M. S.; Lister, S. J.; Sanderson, R. The link between multiplicative scatter correction (MSC) and standard normal variate (SNV) transformations of NIR spectra. J. Near Infrared Spectrosc. 1994, 2 (1), 42−47, DOI: 10.1255/jnirs.30.
(34) Meuser, H.; Blume, H. P. Characteristics and classification of anthropogenic soils in the Osnabruck area, Germany. J. Plant Nutr. Soil Sci. 2001, 164 (4), 351−358.
