Partial Least Squares Multivariate Regression as an Alternative To

Sep 3, 2003 - M. Felipe-Sotelo, J. M. Andrade,* A. Carlosena, and D. Prada. Department Analytical Chemistry, University of A Coruña, Campus da Zapat...
Anal. Chem. 2003, 75, 5254-5261

Partial Least Squares Multivariate Regression as an Alternative To Handle Interferences of Fe on the Determination of Trace Cr in Water by Electrothermal Atomic Absorption Spectrometry

M. Felipe-Sotelo, J. M. Andrade,* A. Carlosena, and D. Prada

Department of Analytical Chemistry, University of A Coruña, Campus da Zapateira s/n, E-15071, A Coruña, Spain

Current chromium concentrations in natural waters amount to only a few micrograms per liter. Electrothermal atomic absorption spectrometry (ETAAS) is one of the most suitable techniques for its determination. However, it was found that high iron concentrations (3-4 orders of magnitude higher than those of Cr) may cause serious atomic signal enhancement in Cr analysis. The possibility that Fe causes spectral or chemical interferences (or both) on the Cr determination is discussed. The goal of this paper is to develop multivariate calibration models not affected by such interferences. Three multivariate regression methods were applied. One of them was linear in nature (partial least squares, PLS), and two were nonlinear (polynomial PLS and locally weighted regression). PLS proved to be a suitable and convenient alternative to simplify current laboratory workload when coping with chemical and spectral interferences caused by a major metal (Fe) in the determination of a trace metal (Cr) by ETAAS. Three effects due to the aging of the atomizers, each of which modifies the atomic peak shape, were simulated to evaluate to what extent they affect the predictions: peak shift, peak enhancement (or depletion), and increased random noise. The ETAAS-PLS methodology was satisfactorily tested with water samples and two CRMs.

Chromium is an essential element involved in glucose, fat, and protein metabolism of mammals. Despite this, high levels of chromium may be toxic to flora and fauna. Both of its common oxidation states are harmful: undue ingestion of Cr3+ compounds can cause toxicological alterations in humans, and ingestion of even low amounts of Cr6+ can cause skin damage as well as serious pulmonary problems that can even lead to lung cancer. Thus, Cr and its compounds have long been recognized as human carcinogenic agents by the International Agency for Research on Cancer (IARC).1-3

* Corresponding author. Fax: +34-981-167065. E-mail: [email protected].
(1) Gauglhofer, J.; Bianchi, V. Chromium. In Metals and their compounds in the environment; Merian, E., Ed.; VCH: Weinheim, Germany, 1991.
(2) Klein, C. B. Carcinogenicity and Genotoxicity of chromium. In Toxicology of metals; Chang, L. W., Ed.; CRC Press: Boca Raton, FL, 1996.
(3) Alloway, B. J.; Ayres, D. C. Chemical principles of environmental pollution; Blackie Academic & Professional: Suffolk (U.K.), 1997.

5254 Analytical Chemistry, Vol. 75, No. 19, October 1, 2003

Chromium is generally present in the environmental compartments at trace, but widely varying, levels. For instance, freshwater contains around 1-10 µg L-1; seawater, up to 0.3 µg L-1 (mainly as CrO42-); and up to 25 µg L-1 has been found in drinking water (although in the exceptional case of a well).1 The maximum acceptable concentration of Cr in water for human consumption was established as 50 µg L-1 by the World Health Organization (WHO), a value adopted by Canada, the United States,3 and the European Union.4

Even though electrothermal atomic absorption spectrometry (ETAAS) is a suitable and widely applied analytical technique for the analysis of trace metals, it carries a high risk of interferences (which can affect both precision and accuracy), because the concentrations of the concomitants can be up to 6-7 orders of magnitude higher than that of the analyte. An example occurs when determining Cr by ETAAS in the presence of Fe (whose levels in water range from 0.5 to 100 mg L-1),5 as Carlosena et al.6 demonstrated for acid extracts of soils. In that work, the absorbance of Cr was significantly altered (>15%) for Fe concentrations above 7.5 µg mL-1. The interference could not be corrected using traditional methods based on the stabilized temperature platform furnace (STPF) concept.6,7 Although STPF-based methods are widely applied, they are time-consuming, labor-intensive, costly, and not totally efficient in all cases. At present, there is another powerful and less time-consuming alternative, rooted in the application of multivariate chemometric models to the atomic data, which can complement the others. So far, this approach has rarely been applied, even though the savings in time and effort (direct and indirect costs) it allows are highly attractive.
Hou et al.8 determined Cd and As in grains employing an air/acetylene flame combined with a back-propagation neural network model which accounted for the most important interferences. Moreover, the results were comparable to those from the standard addition method, but they were obtained faster. Grotti et al.9,10 did not rely on a black-box approach (which, like neural nets, lacks chemical interpretation) but, instead, analyzed the particular interferences they suffered when determining several metals (Pb, Cd, Ni, Cr, and Mn, each one separately) in preconcentrated seawater by ETAAS. An experimental design was implemented to account for several levels of the most important interfering agents (Cl-, Na+, K+, Ca2+, and Mg2+), and then a multivariate ordinary least squares (OLS) regression model was implemented and studied. The predictions obtained by the multivariate equations performed satisfactorily when compared to other classical methods (such as matrix-matched standards and the analyte-addition technique). In our opinion, the most important drawback of this approach is that the equations contain linear, quadratic, and cross-product terms to model the response surface, which are not easy to understand from a chemical point of view (except for their mathematical significance).

The approach presented here focuses on using the atomic peak profiles that are gathered from the atomization of each standard and sample to develop multivariate models. Therefore, emphasis will be placed on considering each atomic absorption profile as a typical spectrum and on solving the quantitation issue by assuming that the atomic peak contains both systematic and random information. This means that there is a need to extract the systematic information that is related to the parameter of interest (Cr concentration) from the overall information in the atomic peak (which includes random noise, chemical and spectral interferences caused by other analytes in the solution, etc.).

(4) Marín-Galvín, R.; Rodríguez-Mellado, J. M. Ing. Quím. 2000, 32, 210-222.
(5) Huebers, H. A. Iron. In Metals and their compounds in the environment; Merian, E., Ed.; VCH: Weinheim, Germany, 1991.
(6) Carlosena, A.; López-Mahía, P.; Muniategui, S.; Fernández, E.; Prada, D. J. Anal. At. Spectrom. 1998, 13, 1361.
(7) Welz, B.; Sperling, M. Atomic absorption spectrometry; Wiley-VCH: Weinheim, Germany, 1999.
(8) Hou, J.; Chen, G. S.; Wang, Z. P. Spectrosc. Spectral Anal. 2001, 21 (3), 387.

10.1021/ac0343477 CCC: $25.00
© 2003 American Chemical Society
Published on Web 09/03/2003

Accordingly, the problem is similar to that generally encountered in molecular spectrometry, where multivariate models have been most helpful when quantifying an analyte in the presence of huge amounts of unrelated information in the spectra. To the best of our knowledge, this is the first application of PLS, polynomial PLS, and locally weighted models to handle interferences in ETAAS analysis.

The particular problem considered here consists of quantifying low amounts of Cr (trace levels) in natural waters when much larger amounts of an interfering metal, Fe, are present. Thus, the complexity of the problem is 2-fold:

(i) The most sensitive spectral line for Cr (357.9 nm) was selected in order to quantify the very low concentrations of Cr in water (as is generally done), although it is not totally free of interferences. In particular, Fe presents an adjacent line (358.1 nm)11 that cannot be resolved with the instrumental slit (0.7 nm). Recently, this problem was considered in more detail when analyzing trace metals in whole blood samples.12 Likewise, a spectral interference could be present in our problem.

(ii) Although the most important interferences seem to be spectral, the chemical ones cannot be fully disregarded. This is quite a common situation in FAAS. It has been reported that Cr and Fe interact to form mixed carbides in reducing flames (some authors proposed the formation of chromite).7 Although FAAS and ETAAS imply different mechanisms to get the metal in the elemental state, the reducing environment of the pyrolytically coated tube might play an important role. Thus, the reduction of Cr by carbon just before the vaporization of the free metal has been proposed as the mechanism.13,14 Castillo et al.15,16 presented a more comprehensive explanation, taking into account the formation of higher carbides, which decompose very slowly to Cr (gas).

Because there was no previous information about the particular effects the interferences caused in multivariate models or about the adequacy of the models to handle them properly, three different regression models were applied: (i) linear PLS ("PLS"); (ii) nonlinear PLS, in which the inner relationship can be modeled by a polynomial ("polynomial PLS"); and (iii) locally weighted regression models ("Local"), in which only a small number of samples closely related to an unknown sample are considered to develop the regression models. The motivation for the two latter models is that, if nonlinearities are present in the relationship between X and Y, they should be modeled better by these two types of regression models (artificial neural networks are not used in this work).

(9) Grotti, M.; Abelmoschi, M. L.; Soggia, F.; Tiberiade, C.; Frache, R. Spectrochim. Acta, Part B 2000, 55, 1847.
(10) Grotti, M.; Leardi, R.; Frache, R. Anal. Chim. Acta 1998, 376, 293.
(11) CRC Handbook of Chemistry and Physics, 83rd ed.; CRC Press: Boca Raton, FL, 2002.
(12) Baraj, B.; Bianchini, A.; Niencheski, L. F. H.; Campos, C. C. R.; Martínez, P. E.; Robaldo, R. B.; Muelbert, M. M. C.; Colares, E. P.; Zarzur, S. Fresenius' Environ. Bull. 2001, 10 (12), 859.

EXPERIMENTAL SECTION

Equipment. A Perkin-Elmer 4100 atomic absorption spectrometer (Überlingen, Germany) equipped with an HGA-700 graphite furnace, an AS-70 autosampler, and deuterium-arc background correction was employed throughout. Argon was used as the inert gas, the flow rate being 300 mL min-1 for all steps except atomization (gas stopped). The furnace program is shown in Table 1. Measurements were made by using a hollow-cathode lamp (Perkin-Elmer) at 357.9 nm (0.7-nm slit). Pyrolytically coated tubes with preinserted L'vov platforms were purchased from Z-tek (Amsterdam, The Netherlands).

Table 1. Furnace Temperature Program To Analyze Cr in Aqueous Samples by ETAAS

step          temp (°C)   ramp (s)   hold (s)   Ar flow (mL min-1)
dry 1         100         5          5          300
dry 2         130         20         5          300
pyrolysis     1500        10         20         300
atomization   2500        0          5          0 (read)
clean         2600        1          3          300

Reagents. All reagents were of analytical reagent grade. Fe and Cr standards were prepared on a daily basis in HNO3 (0.5% v/v) (Baker Instra-Analyzed grade, J. T.
Baker, Phillipsburg) from stock standard solutions of 1000 µg mL-1 (Panreac, Barcelona, Spain). High-purity water (18 MΩ·cm resistivity) was prepared on a Milli-Q water system (Millipore, Madrid, Spain). All glassware and plasticware were soaked in 10% v/v HNO3 for 24 h and rinsed with high-purity water at least three times before use.

(13) Sturgeon, R. E.; Chakrabarti, C. L. Anal. Chem. 1976, 48, 1792.
(14) Genç, Ö.; Akman, S.; Özdural, A. R.; Ates, S.; Balkis, T. Spectrochim. Acta, Part B 1981, 36, 163.
(15) Castillo-Suarez, J. R.; Mir, J. M.; Bendicho, C. Spectrochim. Acta, Part B 1988, 43, 263.
(16) Castillo-Suarez, J. R.; Mir, J. M.; Bendicho, C. Fresenius' J. Anal. Chem. 1988, 332, 783.

Standards and Samples. To establish the calibration and validation (test) sets, several aqueous standards with different Cr concentrations were considered (they cover its usual levels in environmental samples): 0, 2, 4, 8, 10, 12, 16, 18, and 20 µg L-1. For each Cr level, several Fe concentrations were prepared: 0, 1, 2.5, 5, 7.5, and 10 mg L-1. Note that the Fe concentrations are 3-4 orders of magnitude greater than the Cr ones. The highest Fe concentrations (7.5 and 10 mg L-1) largely exceed what is currently found in waters, but they were included to stress the models and evaluate their behavior. To account for the variability of the ETAAS measurements, every standard and sample was prepared in duplicate. If the duplicates disagreed, a new standard was prepared. In total, 108 binary standards were prepared. From these, the standards containing 4, 12, and 18 µg Cr L-1 were employed only as the external validation (testing) set; i.e., none of the corresponding solutions were used in the calibration stage. In this way, it was possible to study how the multivariate models predicted unknown samples with different levels of interfering Fe. In addition, five samples taken from wells and springs as well as two certified reference waters (Water TM24 from the CRN, Canada; and Water SPS-SW1, Spectrapure Standards AS, Oslo, Norway) were measured in triplicate in order to test the models for routine analysis. Different aliquots of natural and CRM samples were spiked with Cr to assess the predictions.

All standards and samples were analyzed by ETAAS using the furnace temperature program summarized in Table 1. Once the atomic peak is displayed in the ETAAS software (Winlab 4.1SP1, Perkin-Elmer), a temporary file containing the raw data needs to be saved immediately to an ASCII printable file in order to digitize the peak (further operations in the software lose the data). The ASCII file is read by the statistical software MATLAB (v4.2c.1, The MathWorks, Inc., Natick, MA).
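As a quick cross-check of the design described in this section, the following minimal sketch enumerates the binary standards and splits them into calibration and external validation sets. The concentration levels and the held-out Cr levels are taken from the text; the function name `build_design` and the tuple layout are illustrative assumptions, not part of the original work.

```python
from itertools import product

# Concentration levels quoted in the text.
CR_LEVELS = [0, 2, 4, 8, 10, 12, 16, 18, 20]   # µg/L
FE_LEVELS = [0, 1, 2.5, 5, 7.5, 10]            # mg/L
VALIDATION_CR = {4, 12, 18}                    # Cr levels held out for external validation
N_REPLICATES = 2                               # every standard was prepared in duplicate


def build_design(fe_max=10.0):
    """Return (calibration, validation) lists of (cr, fe, replicate) tuples.

    Hypothetical helper: standards with Fe above fe_max are left out,
    which mimics restricting the models to a narrower Fe range.
    """
    calibration, validation = [], []
    for cr, fe, rep in product(CR_LEVELS, FE_LEVELS, range(N_REPLICATES)):
        if fe > fe_max:
            continue
        target = validation if cr in VALIDATION_CR else calibration
        target.append((cr, fe, rep))
    return calibration, validation


cal_all, val_all = build_design()             # Fe range 0-10 mg/L
cal_red, val_red = build_design(fe_max=5.0)   # Fe range 0-5 mg/L
print(len(cal_all) + len(val_all), len(val_all), len(val_red))
```

Under these assumptions the enumeration gives 108 standards in total, with 36 validation standards for the 0-10 mg L-1 Fe range and 24 for the 0-5 mg L-1 range, in agreement with the numbers quoted in the text and in the footnotes of Table 2.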
For the purposes of this work, each peak profile was defined by only 26 variables, although a greater number might be considered as well.

Regression Methods. Although a detailed presentation of the mathematics underlying the regression methods is beyond the scope of this work, some details are given to frame the problem and to illustrate that the atomic peaks obtained by ETAAS can be treated as a typical (molecular) spectral problem.

(a) Partial Least Squares, PLS. PLS regression is a biased, linear algorithm intended to extract most of the information present in the predictor variables (X block) which is related to the predicted variables (Y block). A linear model is assumed to relate the score vectors of the X block to those from the Y block (i.e., the inner relation). The PLS foundations have been broadly presented, and more details can be found elsewhere.17 Note that PLS can, indeed, find a useful predictive model even when an important amount of information in the X block seems to be unrelated to the predicted property. Recall that both spectral and chemical interferences (or their combination) can imply not only a change in the slope (maybe in the intercept, as well) but also nonlinear behavior in the Lambert-Beer-Bouguer law (thus, in the linear models). Additionally, although PLS is able to model slight nonlinearities by increasing the number of latent variables considered in the model,18 it seemed interesting to compare the behavior of linear PLS with other regression models intended to cope with such nonlinear behavior. Should its results be similar to the nonlinear ones, it could be concluded that either nonlinearities are not present or they are not too strong and, therefore, that linear PLS models can model them. Because, so far, there is no simple way to elucidate whether nonlinearities affect the multivariate models, this empirical approach can be applied to take possible nonlinear problems into account.19 Another useful way to cope with nonlinearities is to augment the X matrix with squared and cross-terms, but this approach has serious drawbacks when large spectroscopic data sets are considered: the large number of predictor variables makes the cross-product terms dominate the model, adding noise to the X block and thus often resulting in bad predictions. Although this problem can be overcome (see, e.g., Berglund and Wold20), here it was decided to follow the first strategy, that is, to develop nonlinear regression models.

(17) Wold, S.; Kettaneh-Wold, M.; Skagerberg, B. Chemom. Intell. Lab. Syst. 1989, 7, 53.
(18) DiFoggio, R. Appl. Spectrosc. 1995, 49 (1), 67.

(b) Polynomial PLS. A simple and straightforward approach to develop nonlinear models is to modify the inner PLS relationship so as to consider a quadratic, a cubic, etc., polynomial.17 In polynomial PLS, two parameters have to be adjusted: (i) the order of the inner polynomial and (ii) the number of latent variables to include in the PLS model. This can be done using the calibration sets to create different models varying both the polynomial order and the number of latent variables. Each model is used to predict a test set, and the model leading to the lowest average prediction error (root-mean-squared error of prediction, RMSEP) is selected and, finally, validated by means of another validation set.

(c) Locally Weighted Regression. Locally weighted regression (Local) roughly consists of developing a new model for each sample to be predicted by means of a sort of local principal component regression (PCR). Thus, for each sample to be predicted, the k samples in the calibration set that are closest to its spectrum are selected.
Then a local regression based on these k samples is obtained by using PCR, and finally, the unknown sample is introduced into the model, and its prediction is obtained.21 The term "weighted" is justified because each of the k neighboring samples is weighted by its distance to the unknown. The Mahalanobis distance based on the original principal components was selected. In this way, the influence of outlying samples can largely be avoided. An important problem is that the k nearest neighbors in the X space may not necessarily be close in the space of the predicted variables (Y space), and therefore, the distances between the k calibration standards and the unknown in the Y space must be evaluated as well. To sum up, the proposed distance measurement is a weighted sum of the two Mahalanobis distances (in the X and Y spaces).21,22 This calls for an iterative process to, first, estimate the property of interest (e.g., by PCR) and, second, apply such estimation to calculate the distance and select the local calibration samples to get the "useful" analyte concentration (mathematical demonstrations can be found in the references given above).

There are some steps to get "local" models. First, estimate the analyte concentration (using a preliminary PCR model), and then, use such prediction to initialize the Local algorithm itself.21 The next issue is how to weight the Mahalanobis distances in the X and Y spaces to give the "global distance". This is better understood if the global distance is written as $gd = \alpha Y + \beta X$, where $gd$ is the global distance, $Y$ and $X$ are the Mahalanobis distances in the Y and X spaces, respectively, and $\alpha$ and $\beta$ are weights ($\alpha + \beta = 1$). It was found16 that when the Y values are noisy, $\alpha$ gets small; that is, more emphasis is put on the spectral similarities. A compromise value to start the iterations is to set $\alpha$ equal to 0.2.22

(19) Andrade, J. M.; Sánchez, M. S.; Sarabia, L. A. Chemom. Intell. Lab. Syst. 1999, 46, 41.
(20) Berglund, A.; Wold, S. J. Chemom. 1997, 11, 141.
(21) Naes, T.; Isaksson, T.; Kowalski, B. Anal. Chem. 1990, 62, 664.
(22) Wang, Z.; Isaksson, T.; Kowalski, B. Anal. Chem. 1994, 66, 249.

RESULTS AND DISCUSSION

Figure 1 shows that the presence of Fe increased the Cr signals, although the effect is quite complex. Comparing Figure 1a and b, it is observed that the magnitude of the interference is not constant and, in fact, depends on the Cr concentration at hand (the lower the Cr concentration, the higher the signal enhancement). In these figures, the signal increases from 0.049 to 0.121 A (147%) for a 2.00 µg Cr L-1 solution and from 0.416 to 0.513 A (23.3%) for a 20.00 µg Cr L-1 solution when Fe varies from 0 to mg L-1. Figure 1c reveals that a signal is recorded even for null concentrations of Cr. This fact shows that Fe might be atomized under the experimental conditions, because it requires almost the same furnace temperature program as Cr7 (and their two spectral lines are very close). The fact that this effect is not constant throughout all the Cr concentrations (considering different Fe levels) might suggest the additional influence of chemical interferences.

Figure 1. Influence of Fe on the atomic peaks obtained for different concentrations of Cr in aqueous samples: (a) 2.00, (b) 20.00, and (c) 0.00 µg L-1.

The first step to select useful multivariate models was to ascertain whether strong nonlinearities were present in the calibration data set. Different calibration models were developed using PLS, polynomial PLS, and Local. In all cases, the best predictive model was searched for using cross-validation to find the best number of latent variables. Whenever the minimum in the average leave-one-out cross-validation error was not clearly defined, the vicinity of the minimum was studied using the external validation set. To avoid a too optimistic calibration error due to the presence of duplicated samples,23 other approaches, such as Venetian blind blocks, were tested24 to gain confidence in the number of latent variables being selected (several iterations were needed). Differences were not found between the cross-validation modes, and thus, only results corresponding to the leave-one-out cross-validation scheme will be presented. The development of Local models could not be made with cross-validation, and thus, the external validation set was mandatory. Therefore, all the results presented in the tables correspond to the root-mean-squared error of prediction for the external validation set, $\mathrm{RMSEP} = [\sum (y_{\mathrm{real}} - y_{\mathrm{pred}})^2 / n]^{1/2}$, where $n$ is the number of samples.
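The selection of the number of latent variables by leave-one-out cross-validation, and the RMSEP figure of merit defined above, can be illustrated with a minimal sketch. It is not the software used in this work (MATLAB with the PLS-Toolbox); it is a plain NumPy implementation of mean-centered PLS1 by the NIPALS algorithm, and the synthetic 26-point peak profiles, their shapes, and all function names (`rmse`, `pls1_fit`, `pls1_predict`, `loo_rmsecv`, `synthetic_peaks`) are assumptions made only for illustration.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root-mean-squared error: [sum((y_true - y_pred)^2) / n]^(1/2)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def pls1_fit(X, y, n_lv):
    """Fit a mean-centered PLS1 model by NIPALS; returns (x_mean, y_mean, b)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = X.mean(axis=0), y.mean()
    E, f = X - x_mean, y - y_mean           # residual X block and y vector
    W, P, q = [], [], []
    for _ in range(n_lv):
        w = E.T @ f
        w = w / np.linalg.norm(w)           # unit-length weight vector
        t = E @ w                           # X scores of this latent variable
        tt = t @ t
        p = E.T @ t / tt                    # X loadings
        c = (f @ t) / tt                    # linear inner-relation coefficient
        E = E - np.outer(t, p)              # deflate X
        f = f - c * t                       # deflate y
        W.append(w)
        P.append(p)
        q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    b = W @ np.linalg.solve(P.T @ W, q)     # regression vector in the X space
    return x_mean, y_mean, b


def pls1_predict(model, X):
    """Predict concentrations for the rows of X with a fitted model."""
    x_mean, y_mean, b = model
    return (np.asarray(X, dtype=float) - x_mean) @ b + y_mean


def loo_rmsecv(X, y, n_lv):
    """Leave-one-out cross-validation error for a given number of LVs."""
    y = np.asarray(y, dtype=float)
    preds = np.empty_like(y)
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        model = pls1_fit(X[keep], y[keep], n_lv)
        preds[i] = pls1_predict(model, X[i:i + 1])[0]
    return rmse(y, preds)


def synthetic_peaks(cr, fe, rng, n_points=26):
    """Hypothetical profiles: a Cr peak plus an overlapping Fe contribution."""
    t = np.arange(n_points)
    cr_peak = np.exp(-0.5 * ((t - 10) / 3.0) ** 2)
    fe_peak = np.exp(-0.5 * ((t - 12) / 4.0) ** 2)
    X = 0.02 * np.outer(cr, cr_peak) + 0.008 * np.outer(fe, fe_peak)
    return X + rng.normal(scale=0.002, size=X.shape)


rng = np.random.default_rng(0)
cr = np.repeat([0, 2, 8, 10, 16, 20], 4).astype(float)   # µg/L, calibration levels
fe = np.tile([0, 1, 2.5, 5], 6).astype(float)            # mg/L, reduced Fe range
X = synthetic_peaks(cr, fe, rng)
errors = {k: loo_rmsecv(X, cr, k) for k in range(1, 6)}
best = min(errors, key=errors.get)
print(best, round(errors[best], 3))
```

In practice, the model with the lowest cross-validation error is not always retained automatically; as stated above, when the minimum is not clearly defined, the neighboring models are compared on the external validation set, for which `rmse` applied to the validation predictions gives the RMSEP.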
A preliminary PLS model was built using all samples in order to look for outliers by visual inspection of typical PLS graphs, mainly the "X variable" score plots (t1 vs t2, t1 vs t3, ..., t1 vs tk) and the "X-Y scores relationship" (t1 vs u1, t2 vs u2, ..., tk vs uk), where 1, 2, ..., k is the order of the latent variable (LV) being considered. Additionally, the presence of anomalous spectra or anomalous predictions in the calibration set was assessed by applying the sample leverage (studentized error vs leverage) and the T2 (the multivariate t-test) and Q (considering the residuals of the model) tests. Outliers in the external validation (test) set were assessed by using the Mahalanobis distance. The bias of the model was studied by regressing the values predicted using each PLS model against the "true" ones (or reference values, obtained after diluting stock solutions) and computing the F-test for the joint confidence intervals. Throughout all the studies, overfitting was avoided as much as possible (note that two validation sets were used; see below) and parsimonious models were looked for. This implies that models with good prediction capabilities (e.g., with fewer extreme errors) were preferred to those with excellent fits but poorer predictive behavior.

(23) Faber, N. M.; Duewer, D. L.; Choquette, S. J.; Green, T. L.; Chesler, S. N. Anal. Chem. 1998, 70, 2972.
(24) Wise, B. M.; Gallagher, N. B. PLS-Toolbox for Matlab v.1.5; Eigenvector Technology: Manson, WA, 1996.

Table 2. Selection of the Partial Least Squares Model, Scaling Mode of the Atomic Signals, and Range of Fe Included in the Models

(RMSEC and RMSEP values are given as Cr concn, µg L-1)

model                           scaling       range of Fe   LVa   RMSEC   RMSEP   RMSEP water samples
linear PLS                      autoscaling   overallb      5     0.78    0.90    1.65
                                autoscaling   reducedc      5     0.57    0.96    0.85
                                mean-center   overall       4     0.65    1.10    1.76
                                mean-center   reduced       4     0.47    0.87    0.95
polynomial PLS (second order)   autoscaling   overall       4     0.83    0.96    2.30
                                autoscaling   reduced       6     0.45    0.99    1.76
                                mean-center   overall       2     0.80    1.13    2.50
                                mean-center   reduced       5     0.43    1.12    2.31
local                           autoscaling   overall       6     0.76    1.34    2.74
                                autoscaling   reduced       6     0.52    1.11    2.10
                                mean-center   overall       4     0.79    1.02    2.50
                                mean-center   reduced       4     0.54    1.02    1.87

a Number of latent variables considered in the models. b Fe range: 0-10 mg L-1; number of validation samples for RMSEP = 36. c Fe range: 0-5 mg L-1; number of validation samples for RMSEP = 24.

Figure 2. Linear relationship among the spectral variables (t-scores) and the Cr concentration (u-scores) for each of the four latent variables included in the regression model.

To study possible nonlinearities, a reduced calibration set was prepared by excluding the standards with the highest Fe contents (7.5 and 10 mg L-1) from the initial one. Table 2 shows that there are no obvious differences among the three different models, and accordingly, the linear PLS models can be selected (compare only calibrations with the same Fe range). This means that interferences causing nonlinearities are not too strong or (more likely) that they can be quite satisfactorily modeled by linear PLS. In addition, it can be seen that Local models show a trend to overfit the calibration set, while the prediction of new samples is not satisfactory (compare the good RMSECs, root-mean-squared errors of calibration, to the RMSEPs). The capability of the (linear) PLS models to handle nonlinearities has been discussed elsewhere.18,19 Henceforth, only linear models will be considered. Although binary standards were measured, the number of latent variables is greater than two (except for one nonlinear model), confirming that linear models handle Fe interferences by including higher latent variables. It was not possible to ascertain if a given latent variable was more clearly associated with a spectral or

chemical interference. Other minor effects, such as subtle changes in the tail of the atomic peak profiles (observed when all atomic peaks were overlaid and closely scrutinized), also seemed to be modeled.

Two scaling options were studied: autoscaling and mean-centering (see Table 2). The latter leads to both simpler models and slightly lower average errors (provided the same Fe range is considered). The calibration errors (RMSEC) for linear PLS were a bit lower when mean-centering was used, 0.65 and 0.47 µg Cr L-1 (Fe ranges, 0-10 and 0-5 mg L-1, respectively) vs 0.78 and 0.57 µg Cr L-1 for autoscaling (Fe ranges, 0-10 and 0-5 mg L-1, respectively). Accordingly, mean-centering was selected. The calibration errors are a bit lower than those for validation (see Table 2), because the use of duplicate samples yields optimistic cross-validation errors.23

Comparing the models developed using both ranges of Fe shows that there are slight improvements in calibration and prediction when the "reduced" range of Fe (0-5 mg L-1) is used, and thus, the reduced-range models were selected here (also because Fe concentrations as high as 7.5 or 10 mg L-1 are not usually reported for water samples). Note that a strict comparison between the average errors obtained by using the two ranges is not possible because the validation sets with the reduced range of Fe do not include the standards with higher Fe concentrations.

Therefore, the final selection was a linear PLS model, using mean-centered data and considering the reduced range of Fe. This model gave quite satisfactory predictions without bias: predicted value $= (0.0417 \pm 0.1179) + (0.9955 \pm 0.01027) \times$ real ($r = 0.9977$; standard error $= 0.48$ µg Cr L-1; joint confidence test for the slope and intercept, $F_{\mathrm{exp}} = 0.098$, $F_{\mathrm{tab},95\%} = 3.22$); 99.98% explained variance in X and 99.54% explained variance in Y. The X score vs Y score plot (tk vs uk, where k is each of the latent variables, LV) reveals that the four selected factors hold a linear relationship between the spectra and the Cr concentrations (see Figure 2). Although the fourth LV (0.15% explained variance in Y) reveals two aqueous standards with a slightly different behavior, they correspond to the highest Cr and Fe concentrations (high leverage but not anomalous).

Figure 3. Loadings defining each of the latent variables included in the regression model (continuous line) compared to a typical atomic profile of the aqueous standards (broken line).

The loadings associated with each LV (see Figure 3) are not easy to explain, except for the first LV, which resembles the average peak profile of the aqueous standards. The second and third LVs are more difficult to understand; they present a "first-derivative shape" just where spectral shoulders can be observed in the atomic peaks. This might suggest that the PLS model is correcting for some kind of interference or maybe some kind of peak shift or spectral artifact not visible to us. This seems a bit clearer in the loadings of the fourth LV, as they establish a negative relationship between the maximum of the atomic peaks and the Cr concentration.
A reasonable explanation is that this variable is modeling the interference of Fe on the Cr peak. In this sense, the regression coefficients support the hypothesis above (see Figure 4). The variables most related to the Cr concentrations (positive coefficients) are associated with minor spectral characteristics (mainly spectral shoulders, not clearly defined). Variable number 5 (a shoulder) and, more importantly, number 10 (the maximum of the atomic peaks) have a negative influence on the Cr quantitation and, so, they should be attributed to the interferences caused by Fe.

Figure 4. Regression coefficients for the PLS model using four latent variables. An original atomic peak is overlaid for comparison.

Regarding the five natural samples taken from wells and springs, they were analyzed exactly as the aqueous standards and quantified both in the classical way (using an aqueous calibration curve and further assessment by a standard addition calibration) and by applying the multivariate models. To study the behavior of the predictive models, original and Cr-spiked sample aliquots were quantified. The average prediction errors (RMSEP) for all quantitations (samples plus spiked samples) are presented in Table 2, where it can be seen that the average errors are satisfactory (recall


Table 3. Cr Concentrations ± Standard Deviation (n = 3) Predicted by the PLS Model for Different Original and Spiked Samplesa

Cr concn (µg L-1)

sample

true

predicted

ELV ELV-spiked P1 P1-spiked P2 P2-spiked P2-spiked PR5 PR5-spiked PR5-spiked TAB TAB-spiked TM24 SPS