Anal. Chem. 2002, 74, 4944-4954

2D Correlation Spectroscopy and Multivariate Curve Resolution in Analyzing pH-Dependent Evolving Systems Monitored by FT-IR Spectroscopy, A Comparative Study

Josef Diewok,† María José Ayora-Cañada,‡ and Bernhard Lendl*,†

Institute of Chemical Technologies and Analytics, Vienna University of Technology, Getreidemarkt 9/164-AC, A-1060 Vienna, Austria, and Department of Physical and Analytical Chemistry, University of Jaen, Paraje de las Lagunillas S/N, E-23071 Jaen, Spain

Multivariate curve resolution (MCR) and 2D correlation spectroscopy (2D-CoS), including sample-sample correlation, have been applied to the analysis of evolving mid-infrared spectroscopic data sets obtained from titrations of organic acids in aqueous solution. In these data sets, well-defined species with significant differences in their spectra are responsible for the spectral variation observed. The two fundamentally different chemometric techniques have been evaluated and discussed on the basis of experimental and supportive simulated data sets. MCR gives information that can be directly related to the chemical species, which is of importance from a practical point of view, whereas 2D-CoS results normally require more interpretation. The obtained conclusions are regarded as valid for similar evolving data, which are increasingly being encountered in analytical chemistry when multivariate detectors are used to follow dynamic processes, including separations as well as chemical reactions, among others.

In analytical chemistry, evolving data sets are frequently obtained when spectroscopic techniques are applied to the investigation of dynamic processes that proceed as a function of a modulation variable, such as time, pH, temperature, or pressure, among others. To increase knowledge and understanding about the chemical system under investigation, chemometric methods are used. They extract information about correlated and uncorrelated spectral changes in the recorded data sets and describe overall changes between different spectra of the data sets. The most relevant chemometric methods for this purpose are based either on curve resolution techniques or on 2-dimensional correlation spectroscopy. However, because they rely on different mathematical approaches, their usability and the form in which the results of the data analysis are obtained differ fundamentally.
* Corresponding author. Fax: +43(0) 1 58801 15199. E-mail: blendl@mail.zserv.tuwien.ac.at. † Vienna University of Technology. ‡ University of Jaen.

Generalized 2-dimensional correlation spectroscopy (2D-CoS) performs cross-correlation analysis of a series of spectra of a

4944 Analytical Chemistry, Vol. 74, No. 19, October 1, 2002

system that is changed with some modulation variable and spreads the 1-dimensional spectra into a second spectral dimension, thus yielding 2-dimensional maps with the same wavenumber or wavelength axis (wavenumber will be used throughout the paper without loss of generality) in both directions.1 The 2D wavenumber-wavenumber (WW) correlation analysis generally yields two different correlation maps, one synchronous and another asynchronous. The synchronous correlation map shows correlations between spectral bands that change at the same rate (in-phase) in the system and whether they increase or decrease relative to each other. The asynchronous map gives information on bands that change out-of-phase and on which of these changes happen before or after each other.2

Since its introduction, 2D-CoS has spread widely and has been used for the analysis of a wide range of different systems in mid- and near-infrared, Raman, fluorescence, and UV/vis spectroscopy. Application examples include investigations of the stretching deformation of polymer films monitored with IR linear dichroism,2 the temperature behavior of methylene IR stretching vibrations in phospholipid acyl chains,3 thermo-induced periodic changes in the secondary structure of poly-(L)-lysine,4 a study of pH titrations of malic and succinic acid with 2D-FT-IR spectroscopy,5 or the 2D-FT-Raman investigation of different kinds of hydrogen bonding in concentrated phosphoric acid.6 2D-CoS is also commonly used in the MIR and NIR spectroscopic investigation of pressure-7 and temperature-induced8 protein folding and unfolding processes where it effectively helps to resolve overlapping spectral bands.

A recent important innovation was the presentation of sample-sample (SS) correlation analysis,9,10 in which synchronous and

(1) Noda, I. Appl. Spectrosc. 1993, 47, 1329-36.
(2) Noda, I.; Dowrey, A. E.; Marcott, C.; Story, G. M.; Ozaki, Y. Appl. Spectrosc. 2000, 54, 236A-248A.
(3) Nabet, A.; Auger, M.; Pezolet, M. Appl. Spectrosc. 2000, 54, 948-955.
(4) Mueller, M.; Buchet, R.; Fringeli, U. P. J. Phys. Chem. 1996, 100, 10810-10825.
(5) Ayora-Cañada, M. J.; Lendl, B. Vib. Spectrosc. 2000, 24, 297-306.
(6) Abderrazak, H.; Dachraoui, M.; Ayora-Cañada, M. J.; Lendl, B. Appl. Spectrosc. 2000, 54, 1610-1616.
(7) Smeller, L.; Heremans, K. Vib. Spectrosc. 1999, 19, 375-378.
(8) Schultz, C. P.; Barzu, O.; Mantsch, H. H. Appl. Spectrosc. 2000, 54, 931-938.
(9) Sasic, S.; Muszynski, A.; Ozaki, Y. J. Phys. Chem. A 2000, 104, 6380-6387.
(10) Sasic, S.; Muszynski, A.; Ozaki, Y. J. Phys. Chem. A 2000, 104, 6388-6394.

10.1021/ac0257041 CCC: $22.00

© 2002 American Chemical Society Published on Web 09/06/2002

asynchronous correlation maps with two modulation variable axes instead of wavenumber axes are obtained. In contrast to traditional 2D WW correlation concerning spectral features, the SS correlation maps give information on the dynamics in the direction of the modulation variable, that is, information about the concentration profiles of the chemical or physical components contributing to the experimental data matrix. Therefore, SS correlation is regarded to be complementary to the widely used WW correlation analysis.9 SS correlation analysis has already been applied in the analysis of temperature-dependent NIR spectra of oleic acid,10 in the prediction of fat and protein content in milk from short-wave NIR spectra,11 in the elucidation of the temperature-dependent water structure from NIR spectra,12 and in the analysis of a polycondensation reaction monitored by MIR spectroscopy.13 In these examples, it was shown that the synchronous SS correlation maps were easier to interpret, whereas full interpretation of the asynchronous maps still requires more investigation of the underlying chemical and mathematical mechanisms.12 The second group of chemometric methods mentioned above for the analysis of evolving data sets is based on curve resolution (CR).14,15 These methods determine the significant number of chemical/physical components that contribute spectroscopically to the experimentally measured data matrix D and decompose D into the product of two smaller matrices, C and S, containing the concentration profiles and the spectra, respectively, of the modeled components. Thus, a model that describes the amount of spectral contribution of each component in every spectrum of the data set is obtained. 
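
The bilinear model behind curve resolution can be illustrated with a short numerical sketch. The Python/NumPy snippet below is not part of the original work: the concentration profiles, band positions, noise level, and threshold are hypothetical values chosen only for illustration. It builds a two-component data matrix as the product of concentration profiles and spectra plus noise, and then counts the singular values that stand clearly above the noise level.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: m spectra (rows) recorded at n wavenumbers (columns).
m, n = 20, 100
noise_sd = 1e-4

# Assumed concentration profiles of two components (m x 2), closed system.
x = np.linspace(0.0, 1.0, m)
C = np.column_stack([1.0 - x, x])

# Assumed pure spectra stored as rows (2 x n), i.e. this array plays the role of S^T.
axis = np.arange(n)
S_T = np.vstack([
    np.exp(-0.5 * ((axis - 30) / 7.0) ** 2),
    np.exp(-0.5 * ((axis - 60) / 7.0) ** 2),
])

# Bilinear model D = C S^T + E with random noise as E.
D = C @ S_T + noise_sd * rng.standard_normal((m, n))

# Singular values of D; chemically relevant components give large values.
sv = np.linalg.svd(D, compute_uv=False)

# Rough, assumed noise threshold (scaled estimate of the largest noise singular value).
threshold = 10 * noise_sd * (np.sqrt(m) + np.sqrt(n))
n_components = int(np.sum(sv > threshold))
print("leading singular values:", sv[:4])
print("estimated number of components:", n_components)
```

With real spectra the gap between significant and noise-related singular values is usually less clear than in this idealized case, so the threshold has to be judged for each data set.
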
Commonly used curve resolution methods include Simplisma (SIMPLe-to-use Interactive Self-modeling Mixture Analysis),16 evolving factor analysis (EFA)-based methods,17 orthogonal projection approach (OPA),18 and multivariate curve resolution-alternating least squares (MCR-ALS).19,20 These methods differ mainly in whether the number of significant components is determined automatically, how initial estimates for the calculation are obtained, whether the matrix decomposition is performed in a single step or in an iterative manner, and which chemical/physical knowledge about the system under study can be included in the calculation. MCR-ALS19,20 is a very flexible iterative CR method that allows for the inclusion of different chemical constraints and also for the simultaneous analysis of several data sets. It has already been successfully applied to a broad range of different chemical problems, including the analysis

(11) Sasic, S.; Ozaki, Y. Appl. Spectrosc. 2001, 55, 163-172.
(12) Segtnan, V. H.; Sasic, S.; Isaksson, T.; Ozaki, Y. Anal. Chem. 2001, 73, 3153-3161.
(13) Sasic, S.; Amari, T.; Ozaki, Y. Anal. Chem. 2001, 73, 5184-5190.
(14) De Juan, A.; Casassas, E.; Tauler, R. Soft Modeling of Analytical Data. In Encyclopedia of Analytical Chemistry: Applications, Theory, and Instrumentation; Meyers, R. A., Ed.; John Wiley & Sons: New York, 2000; Vol. 11, pp 9800-9837.
(15) Massart, D. L.; Vandeginste, B. G. M.; Buydens, L. M. C.; de Jong, S.; Lewi, P. J.; Smeyers-Verbeke, J. Handbook of Chemometrics and Qualimetrics: Part B; Elsevier: Amsterdam, 1998; Chapter 34.
(16) Windig, W.; Stephenson, D. A. Anal. Chem. 1992, 64, 2735-2742.
(17) Maeder, M.; Zuberbühler, A. D. Anal. Chim. Acta 1986, 181, 287-291.
(18) Cuesta Sanchez, F.; Toft, J.; van den Bogaert, B.; Massart, D. L. Anal. Chem. 1996, 68, 79-85.
(19) Tauler, R.; Kowalski, B.; Fleming, S. Anal. Chem. 1993, 65, 2040-2047.
(20) Tauler, R. Chemom. Intell. Lab. Syst. 1995, 30, 133-146.

of pH-modulated UV21 and FT-IR22 data, protein denaturation,23 and kinetic24 and equilibrium25 experiments.

In separate scientific contributions12,13,26,27 the same data sets were analyzed by full 2D-CoS (WW and SS correlation) and CR methods, but no direct comparison of the two different chemometric approaches was given, despite the fact that different information details could be extracted by each method. Because of the increasing occurrence and importance of evolving data sets in analytical chemistry, it is of interest to systematically compare 2D-CoS and CR techniques by applying both to the same experimental and simulated data sets and to discuss their strengths and weaknesses on the basis of the obtained results. For this purpose, we have selected titration of organic acids in aqueous solution monitored by FT-IR spectrometry as well-understood chemical systems that in their complexity compare well with recent work dealing with SS correlation spectroscopy. The chosen systems are characterized by strong spectral changes that extend over a broad spectral range. Emphasis will be put on aspects of data pretreatment in both methods, influence of baseline contributions, and interpretability of the obtained result. For 2D-CoS, special attention will be paid to SS correlation and its interpretation, because this feature is much newer and, therefore, less investigated than the widely used WW correlation.

Although FT-IR titrations are treated specifically in this study, these systems can be regarded as typical examples for other evolving systems. Therefore, many of the findings and conclusions presented here can directly be applied to evolving data sets of various spectroscopies and should help the readers to optimize their use of the discussed evaluation methods for their data.
THEORY SECTION

For all data evaluation, the same nomenclature and data structure will be used: The m spectra recorded (or simulated) of each titration experiment at n wavenumbers are grouped into an m × n data matrix D so that it contains the spectra in rows, thus yielding a pH × wavenumber (or a titration time × wavenumber) matrix.

2D Correlation Spectroscopy. The full generalized 2D correlation analysis of a data matrix D gives four different correlation spectra/maps: synchronous and asynchronous wavenumber-wavenumber (WW) correlation spectra Sww and Aww that depict the correlations between the spectral changes during the experiment and synchronous and asynchronous sample-sample (SS) correlation maps Sss and Ass that offer information about concentration dynamics during the experiment. It should be noted here that a “sample” in SS correlation actually refers to one spectrum of the experimental data matrix, such as in PCA or PLS regression, and not to a sample as a whole experiment (a series of spectra), as is usual in CR. The term “sample-sample

(21) Saurina, J.; Hernandez-Cassou, S.; Tauler, R.; Izquierdo-Ridorsa, A. Anal. Chem. 1999, 71, 2215-2220.
(22) Diewok, J.; de Juan, A.; Tauler, R.; Lendl, B. Appl. Spectrosc. 2002, 56, 40-50.
(23) Navea, S.; de Juan, A.; Tauler, R. Anal. Chim. Acta 2001, 446, 187-197.
(24) Saurina, J.; Hernandez-Cassou, S.; Tauler, R.; Izquierdo-Ridorsa, A. J. Chemom. 1998, 12, 183-203.
(25) Mendieta, J.; Diaz-Cruz, M. S.; Tauler, R.; Esteban, M. Anal. Biochem. 1996, 240, 134-141.
(26) Sasic, S.; Segtnan, V. H.; Ozaki, Y. J. Phys. Chem. A 2002, 106, 760-766.
(27) Sasic, S.; Amari, T.; Siesler, H. W.; Ozaki, Y. Appl. Spectrosc. 2001, 55, 1181-1191.


correlation” can, thus, be misleading, but it is kept in this article, because it was already established in other works.

The calculation of 2D correlation spectra is commonly not performed directly on the original spectral matrix Dexp, although this is also an option, but on the pretreated matrix D. The most widely used pretreatment consists of mean-centering the data set by subtracting the mean spectrum of Dexp from every spectrum, thus giving so-called “dynamic” spectra. If no other or additional pretreatment is stated in this article, the spectral data set was mean-centered in the described manner. Alternative pretreatment options will be discussed in the Results and Discussion Section.

As shown in recent works,28,29 in which the calculation of 2D-CoS, previously based on Fourier transformation, was reformulated in matrix notation of linear algebra and thus connected to classical correlation analysis, Sww can be obtained by calculating the simple rows cross product,

$$S_{ww} = D^{T} D \qquad (1)$$

where D^T denotes the transpose of the spectral data matrix D. Sww is therefore the n × n covariance matrix of D and contains correlation coefficients between all wavenumbers. The asynchronous spectrum Aww is obtained by orthogonalizing D with the m × m Hilbert transform matrix H and calculating the rows cross-product between D and the orthogonal matrix HD.

$$A_{ww} = D^{T} H D \qquad (2)$$

A detailed article on calculating asynchronous 2D-CoS maps using the Hilbert transform was published by I. Noda.28 The appearance of a peak at the coordinate (ν1, ν2) in the asynchronous spectrum means that the spectral dynamics at the positions ν1 and ν2 are not in linear relationships; i.e., they are not proceeding at the same rate. Guidelines for interpretation of the different correlation peaks in Sww and Aww and their relative signs are given in the focal point article by Noda et al.2

The SS correlation maps are obtained in an analogous manner but with transposed D matrices and an adapted Hilbert matrix H, thus leading to m × m matrices that give correlations or disrelations in the modulation variable direction (pH or titration time in this work).

$$S_{ss} = D D^{T} \qquad (3)$$

$$A_{ss} = D H D^{T} \qquad (4)$$

Usually no typical correlation “peaks” as known from WW correlation, but “surfaces” of different shapes are obtained. The possible shapes and underlying processes are not yet systematically documented, but are given in a descriptive manner in the previous works. Therefore, simulated data will be used to clarify the shapes of SS correlation maps for pH evolving systems.

For graphical depiction of Sww and Aww, contour plots are used, but attention has to be paid not to neglect small correlation peaks with this plotting method. Additional slice spectra (correlation intensities at a fixed wavenumber of the first spectral direction plotted versus the second spectral direction) or the direct interpretation of 3-dimensional plots can be useful. The surfaces obtained in Sss and Ass are not very well-presented with contour plots, because they lack typical correlation peaks. A better way is to analyze 3-dimensional plots or to plot all slices of the sample-sample correlation map in a single figure. All 2D-CoS analyses in this work were performed with Matlab30 programs on the basis of the freely available 2D-CoS Toolbox by J. R. Berry and Y. Ozaki.31

Multivariate Curve Resolution-Alternating Least Squares. The general goal of all curve resolution methods is decomposing mathematically the mixed measurements in the original data set D into the pure contributions due to each of the components in the system. All pure components are supposed to contribute in an additive manner to the global measured response, and the general expression linked to CR mimics the Beer-Lambert law.

$$D = C S^{T} + E \qquad (5)$$

C is a matrix of pure concentration profiles, and S^T is the matrix of corresponding pure spectra. E is the matrix of residuals not explained by the modeled components. The general steps in MCR-ALS analysis of evolving data sets are summarized below.

1. The number of components that contribute spectroscopically to D has to be determined. This is usually done with singular value decomposition (SVD) or principal component analysis (PCA) of matrix D.32 Chemically relevant components give rise to bigger singular values than noise or minor instrument contributions. The “chemical rank” of the matrix can therefore be estimated as the number of singular values significantly higher than those associated with noise.

2. Initial estimates for the number of components to be modeled have to be generated. The estimates can be either of spectral type (e.g., taken from the original data set) or of concentration profile type, for example, obtained from evolving factor analysis (EFA).17

3. Alternating least squares optimization of eq 5. The optimization is not performed directly on Dexp, but on matrix Dpca that is obtained by PCA of Dexp and its reproduction with as many principal components as will be modeled in the MCR analysis. Then C and S^T are calculated by an iterative alternating least squares algorithm. In every iteration, new C and S^T are obtained, and the applied constraints (see below) are updated. The algorithm stops when convergence or a selected number of iterations is reached.

The output of every MCR analysis is a spectrum and a concentration profile for every modeled component, the lack of fit and a matrix of residuals between the modeled and the original data set. Curve resolution methods generally yield ambiguous results (rotational and intensity ambiguity33) if no assumptions at all about the possible values and shapes of spectra and concentration

(28) Noda, I. Appl. Spectrosc. 2000, 54, 994-999.
(29) Sasic, S.; Muszynski, A.; Ozaki, Y. Appl. Spectrosc. 2001, 55, 343-349.
(30) MATLAB 5.3, The MathWorks Inc., Natick, MA, 1999.
(31) Berry, J. R.; Ozaki, Y. 2D-CoS Toolbox, MATLAB code, 2001, Kwansei-Gakuin University, Uegahra, Japan; science.kwansei.ac.jp/~ozaki/Main-eng.htm.
(32) Golub, G. H.; Reinsch, C. Numer. Math. 1970, 14, 403-420.
(33) Tauler, R.; Smilde, A.; Kowalski, B. J. Chemom. 1995, 9, 31-58.

profiles of the modeled components are made. Therefore, calculation of C and S^T is always subject to constraints that reflect physical and chemical knowledge about the studied system. In this work, the following constraints have been applied for concentration profiles: nonnegativity, local rank constraints (in certain rows of the C matrix, zero-concentration windows are set for species known to be absent, e.g., at the beginning and end of titrations), and optionally closure constraint (i.e., the sum of concentrations of the different species of an acid is constant). The spectra were not subject to constraints. For MCR-ALS analysis, the freely available program (Matlab code) by A. de Juan and R. Tauler was used.34

EXPERIMENTAL SECTION

All chemicals (acetic acid, L(+)-tartaric acid, NaOH, HCl, and HNO3) were of analytical reagent grade. Sample and titrant solutions were prepared by weighing the appropriate amounts and dissolving them in deionized water. The acetic acid titration (pH 2-8) was performed by preparing 500 mL of sample solution (pH 2 adjusted with HCl), adding small amounts of concentrated NaOH, and simultaneously passing the solution through an attenuated total reflection (ATR) flow cell and back into the sample beaker by means of a peristaltic pump. The spectra were recorded during stopped sample flow on a Bruker Equinox 55 FT-IR spectrometer (Bruker Optik GmbH, Germany) equipped with a narrow-band mercury-cadmium telluride (MCT) detector. A total of 128 scans/spectrum were coadded at a resolution of 4 cm-1; a spectrum of water (pH 7) was used as reference spectrum.

The tartaric acid titrations were performed in a fully automated flow titration between pH 12.0, as was described in a previous work.22 The sample and the titrant volume flow were constant throughout the titration, but the composition and, thus, the pH of the titrant flow were varied gradually by mixing different amounts of H2O, HNO3, and NaOH. The spectra were recorded in a 25-µm transmission cell with 32 co-added scans and a spectral resolution of 8 cm-1 on a Bruker IFS 88 FT-IR spectrometer (Bruker Optik GmbH, Germany) equipped with an MCT detector. A low-pass filter with a 5% cut at 1900 cm-1 was employed. In both titration methods, it was guaranteed that the total concentration of the organic acid did not change during the titration.

SIMULATED DATA

Model data of varying complexity were simulated in order to assess the performance of 2D-CoS and MCR-ALS and the optimal data pretreatment procedures and to investigate the output that can be expected from sample-sample correlation of different systems. Three two-component systems (D1, D1a, D1b) with a constant total concentration (closed systems) of the two species and different concentration profile shapes (monoprotic (D1), linear (D1a), exponential (D1b)) were simulated. Because of closure, these systems exhibit only one independent component. Additionally, a two-component system (D2) with individually changing concentrations of the two components (first component, linear;

(34) Tauler, R.; de Juan, A. Multivariate Curve Resolution - Alternating Least Squares (MCR-ALS), MATLAB code, 1999, University of Barcelona, Barcelona, Spain; www.ub.es/gesq/eq1_eng.htm.

Table 1. Spectral Models for Simulated Data

model name      conc profiles                      component   band A (a)             band B (a)
D1, D1a, D1b    monoprotic, linear, exponential    1, 2        31 (0.1), 36 (0.03)    111 (0.12), 121 (0.11)
D2              linear/exp.                        1, 2        31 (0.1), 36 (0.03)    111 (0.12), 121 (0.11)
D3              diprotic                           1, 2, 3     31 (1.0), 46 (0.7)     131 (0.6), 141 (0.8)

(a) Wavenumber positions and height (in parentheses) are given for the band maxima.

second component, exponential) was constructed. The same model spectra based on Gaussian peaks with a standard deviation of seven wavenumbers were used for all systems; the corresponding spectral parameters are shown in Table 1. Each component spectrum consisted of two bands, one slightly and one heavily overlapping with the bands of the second component. The overall spectral intensity (integrated area of the spectrum) is not the same for the two different component spectra (see D1 data in Figure S1a in Supporting Information). Furthermore, a model D3 for a diprotic acid with close pKa values (3.0 and 4.4, taken from literature values for tartaric acid) for the two protonation steps was established. As can be seen from Table 1, the two bands of the intermediate species overlap to a lesser degree with the band of the protonated and heavily with the band of the deprotonated species, respectively (see also Figure S1b in Supporting Information).

EXPERIMENTAL DATA

Acetic Acid. Thirteen spectra of a 16 g/L acetic acid sample were recorded every 0.5 pH units in the pH interval 2.0-8.0. The spectra were recorded in the spectral range 910-1840 cm-1, yielding a 13 × 724 data matrix (Figure S2 in Supporting Information).

Tartaric Acid. Fifty spectra were recorded during the flow titration of a 6 g/L sample in the spectral range 900-1900 cm-1 and evaluated between 1030 and 1583 cm-1 (yielding a 50 × 144 data matrix), because the region around 1640 cm-1 is not accessible with the transmission cell used (Figure S3 in Supporting Information).

RESULTS AND DISCUSSION

Simulated Data Sets. The model systems described in detail in the previous section will help to get a “reference” output for the two data analysis methods applied, 2D-CoS and MCR-ALS. System D1 mimics the titration of a monoprotic acid in which two species can be observed and contribute spectroscopically. Models D1a and D1b are closely related and different only in the shape of their concentration profiles.
Model D2 displays a system in which the total concentration of the two contributing species is not constant. This behavior is normally not observed in pH-evolving systems but allows some better understanding of 2D-CoS output and the concepts of synchronicity and asynchronicity. Finally, model system D3, which represents the titration of a diprotic acid with close pKa values, will be analyzed.
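
A monoprotic model of the D1 type and its WW correlation maps (eqs 1 and 2) can be sketched in a few lines of Python/NumPy. This is an illustration under stated assumptions, not the Matlab code used in this work: the wavenumber axis (1-150), the pH grid, and the pKa of 4.75 are hypothetical choices, whereas the band positions, heights, and the standard deviation of seven wavenumbers follow Table 1. For H, the Hilbert-Noda transformation matrix described in ref 28 is assumed, with elements 0 on the diagonal and 1/(π(k - j)) elsewhere.

```python
import numpy as np

def gaussian(x, center, height, sd=7.0):
    """Gaussian band with the given maximum position and height."""
    return height * np.exp(-0.5 * ((x - center) / sd) ** 2)

def hilbert_noda(size):
    """Hilbert-Noda matrix: N[j, k] = 0 if j == k, else 1 / (pi * (k - j))."""
    j, k = np.indices((size, size))
    diff = (k - j).astype(float)
    N = np.zeros((size, size))
    off_diag = diff != 0
    N[off_diag] = 1.0 / (np.pi * diff[off_diag])
    return N

# Assumed axes and acid constant (not given in the text).
wn = np.arange(1, 151)               # wavenumber axis, arbitrary units
pH = np.linspace(2.0, 8.0, 25)       # modulation variable
pKa = 4.75

# Closed two-component system: protonated and deprotonated fractions.
frac_acid = 1.0 / (1.0 + 10.0 ** (pH - pKa))
C = np.column_stack([frac_acid, 1.0 - frac_acid])          # m x 2

# Pure spectra as rows, band parameters for D1 from Table 1.
S_T = np.vstack([
    gaussian(wn, 31, 0.10) + gaussian(wn, 111, 0.12),      # component 1
    gaussian(wn, 36, 0.03) + gaussian(wn, 121, 0.11),      # component 2
])

D_exp = C @ S_T                      # m x n data matrix, spectra in rows
D = D_exp - D_exp.mean(axis=0)       # mean-centering: "dynamic" spectra

S_ww = D.T @ D                               # synchronous map, eq 1
A_ww = D.T @ hilbert_noda(len(pH)) @ D       # asynchronous map, eq 2

print("max |S_ww|:", np.abs(S_ww).max())
print("max |A_ww|:", np.abs(A_ww).max())
```

Because the mean-centered data of a closed two-component system have rank one and the Hilbert-Noda matrix is antisymmetric, the asynchronous map in this sketch contains only rounding noise.
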

2D-CoS. First, the WW correlation spectra of the monoprotic model D1 were calculated (Figure S4 in Supporting Information). Three auto peaks and corresponding cross-peaks are obtained in the synchronous spectrum at wavenumbers 30, 109, and 124, which means that the less overlapping bands at 111 and 121 are resolved, but not the heavily overlapping bands at 31 and 36 that are also very different in relative intensity. The asynchronous spectrum (Figure S4b in Supporting Information) contains only computational noise and no peaks, which is expected from a model system in which the concentrations of two chemical components change synchronously at the expense of each other. Identical results are obtained for models D1a and D1b.

To enhance the resolving ability of WW correlation, mean-normalization of the spectra, that is, dividing each spectrum by the mean of its absorbance values, was tried before the mean-centering step. This is a common pretreatment sequence in WW correlation when variations in the spectral intensity of the overlapping bands are observed. The results did not improve significantly (or improved only with an extremely high number of displayed contour levels). The heavily overlapping bands can actually be resolved if only the corresponding, mean-normalized wavenumber region of the data set is analyzed with WW correlation, but this means also that the full spectrum method claim is given up, and information obtained from different zooms has to be reassembled to give a single picture of the spectral information.

For model system D2 in which one chemical component decreases exponentially in concentration while the other increases linearly, the synchronous WW correlation spectrum is similar to the ones observed for the closed two-component system; i.e., the highly overlapping bands are not resolved. The asynchronous correlation map proves to be more useful. Because of the partially asynchronous behavior of the two components, correlation peaks for the highly overlapping bands are also obtained, and approximate wavenumbers for the peak positions can be extracted. Mean-normalization of the spectra of D2 deteriorates the results from WW correlation. This pretreatment linearizes the model system and reduces the asynchronous correlation map to noise, which causes loss of band resolution.

The last model system investigated was D3, the diprotic acid model. The WW results are given in Figure 1. Positive correlations are depicted with black solid lines; negative correlations, with black dashed or red solid lines throughout the paper. From the synchronous correlation spectrum (Figure 1a), it can be found that there is an overall change of the bands in the low wavenumber region that is synchronized with the overall change of the bands at higher wavenumbers, but occurring in opposite direction. The asynchronous spectrum (Figure 1b) resolves all overlapping band pairs and gives approximate peak maxima positions. However, the information about the sequential order of the different spectral changes that can be obtained from the signs of the correlation peaks2 is blurred and difficult to extract. This is due to the fact that the changes in the intermediate species of the diprotic acid are not monotonic in a single direction29 but follow a classical “peak” shape, increasing from 0 to a maximum and then decreasing to 0 again. Mean-normalization of spectra does not change the WW correlation analysis of the diprotic model system.

In the next step, all models were subjected to the recently introduced SS correlation analysis. The pretreatment in all SS

Figure 1. Wavenumber-wavenumber correlation of diprotic model D3. Synchronous (a) and asynchronous (b) map.

calculations was simple mean-centering of the spectra and not of the concentration profiles, as proposed by Sasic et al.,9 because the latter did not improve the results but can cause some unfavorable distortions in the data.13 Mean-centering in the concentration profile direction leads to systematic correlation features other than noise in the asynchronous SS correlation map for monoprotic acids (see below). This effect can be avoided by first mean-normalizing and then mean-centering the concentration profiles, but this pretreatment sequence leads to strong amplification of noise in the experimental spectra.13 The synchronous and asynchronous correlation maps of the monoprotic model D1 are shown in Figure S5 in Supporting Information. The synchronous map has a strictly monoprotic shape and shows no additional features. An asynchronous correlation cannot be observed in D1, because the slice plots are at calculation noise level. This is in agreement with the WW correlation results and corresponds to the general rule that if a system shows perfect correlation and no asynchronicities, both asynchronous maps (WW and SS) show no features, only experimental (or calculation) noise contributions. The analyses

Figure 2. Sample-sample correlation of diprotic model D3. Synchronous (a) and asynchronous (b) slice plot.

of the related systems D1a and D1b give analogous results, the difference being the overall shape of the synchronous maps that is linear for D1a and exponential for D1b.

For system D2 in which the total concentration is not constant and the concentration profiles of the two components are not the same, different results were obtained. Both synchronous and asynchronous SS correlation maps exist, but the shapes of the SS correlations are linear combinations of exponentially and linearly shaped profiles. It can be extracted that there are bigger changes in the first spectra than in the last ones, but it is not possible to determine the true concentration profile shapes for the two components in the model system.

Finally, the diprotic model D3 was analyzed with SS correlation; the results are shown in Figure 2. The extreme slices of the synchronous correlation map show intense monoprotic features that are caused by the concentration variations of the two extreme acid species H2A and A2-. The intermediate correlation slices describe the synchronous spectral changes that are caused by the intermediate acid species HA-, but these correlation features are less intense than the ones caused by H2A and A2-. The asynchronous correlation map also gives an overall account of diprotic behavior of the system, but it actually emphasizes the intermediate species of the acid. This can be explained by the spectral asynchronicities that exist between HA- and H2A or A2- and also between H2A and A2- and that all contribute to the calculation of the asynchronous map. Some slices of the SS correlation map represent approximate concentration profiles for the three acid species, but there is no general rule for selecting these slices.
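
The SS correlation maps of eqs 3 and 4 for a diprotic model can be sketched in the same way. The Python/NumPy example below is a hedged illustration, not the code used in this work. The pKa values (3.0 and 4.4) are those stated for D3; the pH grid, the wavenumber axis, and in particular the assignment of the Table 1 bands to the three species (one band each for H2A and A2-, two bands for HA-) are assumptions made here for illustration. The n × n Hilbert-Noda matrix is assumed as the adapted Hilbert matrix.

```python
import numpy as np

def gaussian(x, center, height, sd=7.0):
    """Gaussian band with the given maximum position and height."""
    return height * np.exp(-0.5 * ((x - center) / sd) ** 2)

def hilbert_noda(size):
    """Hilbert-Noda matrix: N[j, k] = 0 if j == k, else 1 / (pi * (k - j))."""
    j, k = np.indices((size, size))
    diff = (k - j).astype(float)
    N = np.zeros((size, size))
    off_diag = diff != 0
    N[off_diag] = 1.0 / (np.pi * diff[off_diag])
    return N

# Assumed axes (not given in the text).
wn = np.arange(1, 181)
pH = np.linspace(1.0, 7.0, 40)

# Species fractions of a diprotic acid H2A with pKa values 3.0 and 4.4.
h = 10.0 ** (-pH)
K1, K2 = 10.0 ** -3.0, 10.0 ** -4.4
den = h ** 2 + K1 * h + K1 * K2
C = np.column_stack([h ** 2 / den, K1 * h / den, K1 * K2 / den])   # H2A, HA-, A2-

# Assumed band assignment (one plausible reading of the D3 row of Table 1).
S_T = np.vstack([
    gaussian(wn, 31, 1.0),                              # H2A
    gaussian(wn, 46, 0.7) + gaussian(wn, 131, 0.6),     # HA-
    gaussian(wn, 141, 0.8),                             # A2-
])

D_exp = C @ S_T
D = D_exp - D_exp.mean(axis=0)       # mean-centering of the spectra

S_ss = D @ D.T                               # synchronous SS map, eq 3 (m x m)
A_ss = D @ hilbert_noda(len(wn)) @ D.T       # asynchronous SS map, eq 4 (m x m)

# Slices of the maps (rows) can be plotted versus pH to inspect their shapes.
first_slice, last_slice = S_ss[0], S_ss[-1]
print("S_ss shape:", S_ss.shape, " A_ss shape:", A_ss.shape)
```

Plotting all rows of S_ss and A_ss against pH in a single figure gives slice plots of the kind discussed for Figure 2; the exact shapes depend on the assumed bands and pH grid.
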
Nevertheless, the presented correlation maps can be used as reference maps for identifying any spectral system where “diprotic behavior” is the reason for the observed spectral changes, a procedure that will be applied in the analysis of experimental data in this work. MCR-ALS. The data analysis with MCR-ALS is rather simple for the model systems of this work. The number of components that are present in the data sets can be unambiguously determined by singular value decomposition, yielding 2 for the systems D1,

D1a, D1b, and D2 and 3 for the diprotic system D3. It should be noted that for real data, SVD may yield ambiguous results as a result of noise, baseline contributions, etc. (see also Theory Section/MCR-ALS). Initial estimates were obtained by evolving factor analysis and yield an approximate concentration profile for each component. Then appropriate constraints are applied. Concentration profiles are forced to nonnegative values; the first and last spectrum of the data set are constrained to contain only one component. Closure (assumption of constant total concentration) can optionally be applied, because in these model systems, it does not influence the resolution process (i.e., change the shapes of obtained spectra and concentration profiles) but guarantees only proper scaling of spectra and concentration profiles, that is, reducing intensity ambiguity.33 Because data pretreatment (different mean-centering options, mean-normalization, etc.) is known to strongly influence the output of 2D-CoS and, thus, needs special attention, some comments concerning data pretreatment in MCR-ALS are necessary. Common pretreatments are baseline correction (also applied in 2D-CoS) if feasible, but also procedures such as smoothing or differentiation can be used. It is, however, not advisable to mean-center the data set under investigation. Mean-centering decreases the rank of the data matrix by one; i.e., one component is lost. Consequently, the applicability of certain constraints (nonnegativity of concentration profiles and/or spectra) is also lost, which may lead to MCR results with severe rotational ambiguity. Likewise, mean-normalization of spectra is not favorable. The pure spectra of the components are not altered by this treatment, but the corresponding concentration profiles change, so that, for example, closure is no longer justified as a constraint.
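The two statements above, that singular value decomposition reveals the number of components and that mean-centering lowers the rank of the data matrix by one, can be checked numerically. The following Python/NumPy sketch is an illustration only and not the calculation used in this work: the band positions, band widths, pKa values, pH grid, and the relative tolerance `rel_tol` are hypothetical, and constant total concentration (closure) is assumed, in which case the rank loss on mean-centering is exact.

```python
import numpy as np

def gaussian_band(wn, center, width=12.0):
    """Gaussian band shape used for the hypothetical pure spectra."""
    return np.exp(-0.5 * ((wn - center) / width) ** 2)

def monoprotic_fractions(pH, pKa):
    """Mole fractions of HA and A- at constant total concentration."""
    h, k = 10.0 ** (-pH), 10.0 ** (-pKa)
    return np.column_stack([h, np.full_like(h, k)]) / (h + k)[:, None]

def diprotic_fractions(pH, pKa1, pKa2):
    """Mole fractions of H2A, HA-, and A2- at constant total concentration."""
    h, k1, k2 = 10.0 ** (-pH), 10.0 ** (-pKa1), 10.0 ** (-pKa2)
    terms = np.column_stack([h * h, k1 * h, np.full_like(h, k1 * k2)])
    return terms / terms.sum(axis=1, keepdims=True)

def numerical_rank(D, rel_tol=1e-8):
    """Number of singular values larger than rel_tol times the largest one."""
    s = np.linalg.svd(D, compute_uv=False)
    return int(np.sum(s > rel_tol * s[0]))

# Hypothetical wavenumber and pH axes (rows of D = spectra, columns = wavenumbers).
wn = np.linspace(1000.0, 1800.0, 401)
pH = np.linspace(0.0, 9.0, 19)

# Hypothetical pure spectra: one band per species.
S2 = np.vstack([gaussian_band(wn, 1700.0), gaussian_band(wn, 1550.0)])
S3 = np.vstack([S2, gaussian_band(wn, 1400.0)])

# Noise-free bilinear data sets D = C S.
D_mono = monoprotic_fractions(pH, 4.8) @ S2
D_di = diprotic_fractions(pH, 3.0, 5.0) @ S3

ranks = {}
for name, D in (("monoprotic", D_mono), ("diprotic", D_di)):
    D_centered = D - D.mean(axis=0)  # subtract the mean spectrum
    ranks[name] = (numerical_rank(D), numerical_rank(D_centered))
    print(name, "rank before / after mean-centering:", ranks[name])
```

Under these assumptions the raw matrices have as many significant singular values as there are species, and each mean-centered matrix has one fewer. With noisy experimental data the cutoff between significant and noise-level singular values is less clear, as noted in the text.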
The MCR-ALS analyses of systems D1, D1a, D1b, and D2 (varying total concentration) give correctly modeled concentration profiles and spectra that are equal to the simulated ones within calculation error. Results for the monoprotic system D1 are shown in Figure S6 (Supporting Information). The upper plot contains the concentration profiles for the two modeled components, whereas the lower plot shows the corresponding pure spectra. Each pair of concentration profile and spectrum can be used to reproduce the spectral contribution to the observed mixture spectra for one of the components. It has to be emphasized that all spectral bands (also the heavily overlapping ones in the low wavenumber region) are readily resolved and that their correct intensities can also be extracted. The two-component model systems presented here are actually trivial tasks for curve resolution methods (and could also be obtained by simple curve fitting), because both component spectra are actually known: the first and the last spectrum of each data set are known to contain only one component. To complete the assessment of MCR-ALS, the diprotic model system D3 was investigated (Figure 3). As before, the obtained spectra and concentration profiles are identical to the simulated ones. All spectral bands are correctly resolved, and the concentration profiles fulfill all characteristics of a diprotic acid. Only information about the absence of the intermediate species in the first and last spectra of the data set was included in the calculation, and similar resolution results can also be obtained when the spectral features are overlapping to a much higher extent than in the present example. Analytical Chemistry, Vol. 74, No. 19, October 1, 2002


Figure 3. MCR results for diprotic model D3. Concentration profiles (a) and spectra (b). H2A (solid line), HA- (dash-dotted line), and A2- (dashed line).
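The alternating least-squares procedure with nonnegativity and closure described above can be sketched for a noise-free diprotic model as follows. This is a minimal Python/NumPy illustration under stated assumptions, not the implementation used in this work: the pure spectra, pKa values, and pH grid are hypothetical; the initial estimates are taken from single-wavenumber traces at bands assumed to be selective instead of from evolving factor analysis; nonnegativity is imposed by simple clipping; closure is imposed by rescaling the concentration columns in a least-squares sense; the selectivity constraint on the first and last spectrum is left out for brevity; and the lack of fit is assumed to be the root of the ratio of residual to total sum of squares, in percent.

```python
import numpy as np

def gaussian_band(wn, center, width):
    """Gaussian band shape used for the hypothetical pure spectra."""
    return np.exp(-0.5 * ((wn - center) / width) ** 2)

def diprotic_fractions(pH, pKa1, pKa2):
    """Mole fractions of H2A, HA-, and A2- at constant total concentration."""
    h, k1, k2 = 10.0 ** (-pH), 10.0 ** (-pKa1), 10.0 ** (-pKa2)
    terms = np.column_stack([h * h, k1 * h, np.full_like(h, k1 * k2)])
    return terms / terms.sum(axis=1, keepdims=True)

def apply_closure(C, total=1.0):
    """Rescale the columns of C so that row sums match `total` (least squares)."""
    t = np.linalg.lstsq(C, np.full(C.shape[0], total), rcond=None)[0]
    return C * t

def mcr_als(D, C_init, n_iter=50):
    """Alternate between spectra and concentration updates, D ~ C @ S."""
    C = apply_closure(np.clip(C_init, 0.0, None))
    for _ in range(n_iter):
        S = np.clip(np.linalg.lstsq(C, D, rcond=None)[0], 0.0, None)
        C = np.clip(np.linalg.lstsq(S.T, D.T, rcond=None)[0].T, 0.0, None)
        C = apply_closure(C)
    S = np.clip(np.linalg.lstsq(C, D, rcond=None)[0], 0.0, None)
    return C, S

def lack_of_fit(D, C, S):
    """Percentage of the variation in D not reproduced by C @ S (assumed definition)."""
    R = D - C @ S
    return 100.0 * np.sqrt(np.sum(R ** 2) / np.sum(D ** 2))

# Hypothetical diprotic model: overlapping low-wavenumber bands plus one
# isolated band per species.
wn = np.linspace(1000.0, 1800.0, 401)
pH = np.linspace(0.0, 9.0, 19)
S_true = np.vstack([
    1.0 * gaussian_band(wn, 1730.0, 10.0) + 0.7 * gaussian_band(wn, 1250.0, 15.0),
    0.8 * gaussian_band(wn, 1600.0, 10.0) + 0.6 * gaussian_band(wn, 1270.0, 15.0),
    1.2 * gaussian_band(wn, 1450.0, 10.0) + 0.5 * gaussian_band(wn, 1290.0, 15.0),
])
C_true = diprotic_fractions(pH, 3.0, 5.0)
D = C_true @ S_true

# Initial estimates: absorbance traces at the (assumed) selective wavenumbers.
selective = [int(np.argmin(np.abs(wn - c))) for c in (1730.0, 1600.0, 1450.0)]
C_est, S_est = mcr_als(D, D[:, selective])
lof = lack_of_fit(D, C_est, S_est)
print("lack of fit (%):", lof)
```

Because selective bands are assumed for every species, this case is as easy as the two-component systems discussed above; for strongly overlapped data without selective regions, better initial estimates and the additional constraints described in the text become important.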

Influence of noise in the data sets was also investigated for both chemometric methods. Results are very similar to those for noise-free data and are not given in detail. In 2D-CoS, noise affects only the asynchronous correlation maps. In MCR, both spectra and concentration profiles are influenced by noise, but the overall impact is small because of noise-filtering from the PCA data reproduction step in MCR-ALS. Acetic Acid. From the analysis of the model systems, we know about output, pretreatment effects, and special features of 2D-CoS and MCR-ALS under fully controlled conditions. Now, these methods will be applied to experimental data sets. Analogously to the simulated model system D1 (monoprotic acid), the titration of acetic acid was chosen as the simplest experimental example for a pH-evolving system. Acetic acid (HAc) was titrated in the pH range 2-8 while 13 FT-IR spectra were recorded from 910 to 1840 cm-1 (see original data in Supporting Information). The WW correlation spectra are shown as contour plots in Figure 4. In the synchronous correlation map, intense autocorrelation peaks at (1712, 1712), (1279, 1279), (1551, 1551), and (1415, 1415) can be observed. These bands correspond in the given order to ν(C=O) and ν(C-O, dimer) of HAc and to νas(COO-) and νs(COO-) of Ac-. The pairwise attributions (bands at 1712 and 1279 change in one intensity direction, increase or decrease, and the bands at 1551 and 1415 change synchronously in the opposite direction) can be obtained from the signs of correlation peaks among these four bands. When a much higher number of contour levels is selected for creating the contour map, bands at 1371 and 1348 cm-1 can be attributed to HAc and Ac-, respectively, as well. Problems remain with the HAc peak at 1391 cm-1. The expected correlation peak cannot be observed, because it is suppressed by the neighboring correlation peak with opposite sign at 1415 cm-1.
In addition, the weak bands at 1052 and 1016 cm-1, where small band shifts occur during titration, are not correctly resolved. The asynchronous correlation map (Figure 4b) shows weak correlation peaks and mainly baseline contributions, because there are small baseline shifts and bending in the original spectra that do not change at the same rate as the spectral bands of the acetic acid

Figure 4. Wavenumber-wavenumber correlation of acetic acid titration. Synchronous (a) and asynchronous (b) map.

species. However, the correlations in this asynchronous map should not be overestimated, because they are very weak (maximum correlation intensity ±0.00075) compared with the correlations in the synchronous map (ranging from -0.010 to +0.025). The SS correlation maps are shown in Figure 5 as slice plots. The synchronous slices have a monoprotic shape, as can be expected from the simulations. The extreme slices (first and last), but also other pairs, can be used as concentration profiles for the two acid species. The asynchronous correlations are much weaker than the synchronous ones and close to the experimental noise level (see correlation intensity in Figure 5b). However, the first slice shows a systematic S-shape. From additional simulations, we know that such features are a result of simple baseline offsets and, to an even higher degree, of deviations of the baseline from strictly linear shape. Both effects, an offset of +0.001 abs units and a slightly curved baseline, can be observed in the first experimental spectrum and are reflected in the first slice of the asynchronous SS correlation map.
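The behavior of the correlation maps discussed above can be reproduced with a short Python/NumPy sketch. It assumes the commonly used Noda formulation, with synchronous map Ỹ^T Ỹ/(m - 1) and asynchronous map Ỹ^T N Ỹ/(m - 1), where Ỹ holds the mean-centered (dynamic) spectra of m samples and N is the Hilbert-Noda matrix; the SS maps are obtained here by applying the same expressions to the transposed matrix. The helper names, band widths, relative intensities, the pKa values, and the diprotic band positions are hypothetical; only the four acetic acid band positions are taken from the text. The simulated data are noise-free and baseline-free, unlike the experimental spectra.

```python
import numpy as np

def gaussian_band(wn, center, width=12.0):
    """Gaussian band shape used for the hypothetical pure spectra."""
    return np.exp(-0.5 * ((wn - center) / width) ** 2)

def hilbert_noda_matrix(m):
    """Hilbert-Noda matrix: 0 on the diagonal, 1 / (pi * (k - j)) elsewhere."""
    idx = np.arange(m)
    diff = idx[None, :] - idx[:, None]  # element (j, k) holds k - j
    N = np.zeros((m, m))
    off_diagonal = diff != 0
    N[off_diagonal] = 1.0 / (np.pi * diff[off_diagonal])
    return N

def correlation_maps(Y):
    """Synchronous and asynchronous maps between the columns of Y."""
    m = Y.shape[0]
    sync = Y.T @ Y / (m - 1)
    asyn = Y.T @ hilbert_noda_matrix(m) @ Y / (m - 1)
    return sync, asyn

def asyn_to_sync_ratio(Y):
    """Largest asynchronous intensity relative to the largest synchronous one."""
    sync, asyn = correlation_maps(Y)
    return np.abs(asyn).max() / np.abs(sync).max()

wn = np.arange(910.0, 1841.0, 2.0)

def index_of(value):
    """Index of the grid point closest to the given wavenumber."""
    return int(np.argmin(np.abs(wn - value)))

# Monoprotic, acetic-acid-like model (13 spectra, pH 2-8, assumed pKa 4.76).
pH_mono = np.linspace(2.0, 8.0, 13)
frac_ac = 1.0 / (1.0 + 10.0 ** (4.76 - pH_mono))
C_mono = np.column_stack([1.0 - frac_ac, frac_ac])
S_mono = np.vstack([
    gaussian_band(wn, 1712.0) + 0.6 * gaussian_band(wn, 1279.0),  # HAc
    gaussian_band(wn, 1551.0) + 0.7 * gaussian_band(wn, 1415.0),  # Ac-
])
D_mono = C_mono @ S_mono
Y_mono = D_mono - D_mono.mean(axis=0)  # dynamic spectra

sync_ww, asyn_ww = correlation_maps(Y_mono)
cross_same = sync_ww[index_of(1712.0), index_of(1279.0)]      # both HAc bands
cross_opposite = sync_ww[index_of(1712.0), index_of(1551.0)]  # HAc vs Ac-
ratio_mono_ww = asyn_to_sync_ratio(Y_mono)    # wavenumber-wavenumber
ratio_mono_ss = asyn_to_sync_ratio(Y_mono.T)  # sample-sample

# Hypothetical diprotic model with three species.
pH_di = np.linspace(0.0, 9.0, 19)
h, k1, k2 = 10.0 ** (-pH_di), 1e-3, 1e-5
terms = np.column_stack([h * h, k1 * h, np.full_like(h, k1 * k2)])
C_di = terms / terms.sum(axis=1, keepdims=True)
S_di = np.vstack([gaussian_band(wn, c) for c in (1730.0, 1600.0, 1400.0)])
D_di = C_di @ S_di
Y_di = D_di - D_di.mean(axis=0)
ratio_di_ww = asyn_to_sync_ratio(Y_di)
ratio_di_ss = asyn_to_sync_ratio(Y_di.T)

print("monoprotic asyn/sync (WW, SS):", ratio_mono_ww, ratio_mono_ss)
print("diprotic asyn/sync (WW, SS):", ratio_di_ww, ratio_di_ss)
```

In this idealized monoprotic case the synchronous cross peak between the two HAc bands is positive and the one between an HAc and an Ac- band is negative, while both asynchronous maps stay at the level of numerical rounding. The diprotic model, in contrast, gives clearly nonzero asynchronous intensities, which is consistent with the asynchronicity introduced by the intermediate species.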

Figure 5. Sample-sample correlation of acetic acid titration. Synchronous (a) and asynchronous (b) slice plot.

Figure 6. MCR results for acetic acid titration. Concentration profiles (a) and spectra (b). HAc (solid line) and Ac- (dash-dotted line).

The same acetic acid data set was analyzed with MCR-ALS. Two components can be identified as contributing significantly to the variation in the data matrix and are selected for modeling. The data set is reduced to two pure spectra and the corresponding concentration profiles for HAc and Ac- that are shown in Figure 6. All bands of the two acid species are correctly reproduced, and the small band shifts in the low wavenumber region (e.g., HAc → Ac-: 1016 → 1020 cm-1) can also be easily extracted from the spectra. The obtained spectra are actually very similar to the first and last original spectra of the titration (we already mentioned in the discussion of the simulated data that a monoprotic two-component system is a trivial task for a CR method), but the added value of the analysis is the noise-filtered concentration profiles obtained from the full spectrum analysis, as compared to concentration profiles obtained from single wavenumber traces. The quality of the MCR modeling can be assessed by the lack of fit (LOF, percentage of total variation in Dexp that is not modeled

by MCR) and by analysis of the residual matrix ($D_{\mathrm{res}} = D_{\mathrm{exp}} - C_{\mathrm{mcr}} S_{\mathrm{mcr}}^{\mathrm{T}}$). MCR results without closure constraint have an LOF of 1.8%, which, according to the residual matrix, is caused by the small baseline shifts and bending that are observed during titration and that are not accounted for in the two-component model. Although application of constraints always increases the LOF in MCR, the results with closure constraint (Figure 6) have a markedly higher LOF of 3.2%, showing that the application of closure is not completely justified. This large increase in LOF is caused by the baseline offset of the first spectrum, which causes an apparent acid concentration higher than the true one. Correcting for this offset decreases the LOF to the original level (∼2%) again. Tartaric Acid. The second experimental data set analyzed is the titration of a diprotic acid. A tartaric acid sample was titrated in a fully automated setup starting with a very low pH (