Anal. Chem. 2003, 75, 4010-4018
Applications of Moving Window Two-Dimensional Correlation Spectroscopy to Analysis of Phase Transitions and Spectra Classification
Slobodan Šašić,* Yukiteru Katsumoto, Harumi Sato, and Yukihiro Ozaki
Kwansei-Gakuin University, School of Science and Technology, Sanda 669-1337, Japan
Our recently proposed idea of moving window two-dimensional (2D) correlation spectroscopy, which partitions a data set into a series of relatively small submatrices (windows) and calculates their covariance maps in succession, is tested on three convoluted data sets. Phase-transition temperatures of oleic acid and poly-(N-isopropylacrylamide) in an aqueous solution are sought by analyzing covariances of their temperature-dependent near-infrared and infrared spectra, respectively, while Raman spectra of three kinds of polyethylene (PE) pellets are investigated to find the spectral differences among them and to classify randomly ordered spectra by a sample-sample (SS) covariance map. The mean of the standard deviation of the covariance matrices is used as an indicator of the crucial information present in these matrices, so that only a few of them need to be discussed in detail. The results are obtained quickly after very simple calculations and are studied at length. The baseline variation is not removed prior to the calculations but is found to be of use for the determination of the phase-transition temperatures. Randomly ordered Raman spectra of the PE pellets are classified by innovatively used and interpreted SS slice spectra, with the relation to principal component analysis discussed.
The concept of multivariate spectral analysis has become widely popular in analytical spectroscopy during the past decade. Nowadays, instead of analyzing specifically predetermined peaks or bands, automatic and comprehensive analysis of whole spectra is often practiced. The variety of problems treated this way covers, for example, chromatograms, images, online spectra, or biological processes.1-3 The common point of all these spectral studies is minimal or limited reliance on prior knowledge and a linear algebra mind-frame in which the problems are formulated and tackled. Two all-inclusive groups of multivariate methods can be distinguished: multivariate curve resolution (MCR)4 and multivariate calibration.5
The algorithms behind the above two multivariate groups are not too complex but still require an experimenter to become significantly familiar with their basic postulates. A simpler multivariate approach, easily understandable and, therefore, potentially appealing to everyone, is two-dimensional (2D) correlation (or covariance; we shall often use the term “covariance” as interchangeable with and more correct than correlation) spectroscopy.6,7 The 2D correlation technique is based on monitoring covariances among variations at the variables (e.g., intensity variations along the wavenumbers) or variations along the sample axis (e.g., similarity/dissimilarity among the spectra). The first operation yields so-called variable-variable (VV) covariance maps, well-documented in the literature, whereas the second one produces sample-sample (SS) correlation maps, which have been proposed very recently8 and still need to find their place in analytical applications. Both maps are trivially calculated as
$$\mathrm{Syn}_{VV} = \frac{1}{s-1}\, X^{T} X, \qquad \mathrm{Syn}_{SS} = \frac{1}{w-1}\, X X^{T} \tag{1}$$
where SynVV stands for the conventionally called synchronous spectrum, or covariance map, and SynSS stands for an SS covariance map (synchronous spectrum). X represents an s × w matrix of experimental data with spectra aligned in rows, and T means transposition. For all the variables behaving similarly, a SynVV map displays positive peaks at the cross points of these variables, while negative peaks appear at the cross points of the variables with opposite variations of intensities. The same holds true for the SS covariances, with spectra replacing variables and similarity replacing intensity variations. Monitoring the covariances only may lead to maps that are difficult to read and understand. To alleviate such problems, the so-called asynchronous spectrum, or disvariance map,9 is introduced to simplify intricate variance maps by inclusion of an operator that annuls all the proportionality in the data. The simplicity of eq 1 necessarily implies that the range of problems resolvable by 2D correlation spectroscopy is somewhat
* To whom correspondence should be addressed. Phone: +81-795-65-8349. Fax: +81-795-65-9077. E-mail: [email protected].
(1) Vandenginste, B. G. M.; Massart, D. L.; Buydens, L. M. C.; de Jong, S.; Lewi, P. J.; Smeyers-Verbeke, J. Handbook of Chemometrics and Qualimetrics B; Elsevier: Amsterdam, 1998.
(2) Martens, H.; Naes, T. Multivariate Calibration; Wiley: New York, 1993.
(3) Chalmers, J. M., Ed. Spectroscopy in Process Analysis; Sheffield Academic Press: Sheffield, 2000.
(4) Cuesta Sanchez, F.; van den Bogaert, B.; Rutan, S. C.; Massart, D. L. Chemom. Intell. Lab. Syst. 1996, 34, 139-171.
(5) Geladi, P.; Kowalski, B. Anal. Chim. Acta 1986, 185, 1-17.
(6) Noda, I. Appl. Spectrosc. 1993, 47, 1329-1336.
(7) Šašić, S.; Muszynski, A.; Ozaki, Y. Appl. Spectrosc. 2001, 55, 343-349.
(8) Šašić, S.; Muszynski, A.; Ozaki, Y. J. Phys. Chem. A 2000, 104, 6380-6387.
(9) Isaksson, T.; Katsumoto, Y.; Ozaki, Y.; Noda, I. Appl. Spectrosc. 2002, 56, 1289-1297.
4010 Analytical Chemistry, Vol. 75, No. 16, August 15, 2003
10.1021/ac020769p CCC: $25.00
© 2003 American Chemical Society Published on Web 07/11/2003
limited. Because much more powerful mathematics is behind MCR or multivariate calibration, one may expect that 2D spectroscopy cannot be fully compared with them when highly complex data sets (many species with heavily overlapping spectra) are to be resolved or when a very precise quantification is the goal.10 However, it may apparently be useful whenever the problem is not considered a purely analytical one. This means that 2D spectra can be used to reach information indecipherable by an intuitive or univariate approach and then to utilize obtained information for other purposes. Such a scenario suggests that 2D spectroscopy can particularly be a useful analytical tool in physical chemistry where analytical information is not in itself the final objective of the experiment. This paper aims at illustrating how 2D spectroscopy, in its very simple form described above, may be used to provide information about typical physicochemical problems, such as phase transition, and how it may help to document the spectral differences between the different phases. Three sets of spectral data are analyzed herein: temperature-dependent near-infrared (NIR) spectra of oleic acid in the pure liquid state, temperature-dependent infrared spectra (IR) of an aqueous solution of poly-(N-isopropylacrylamide) (PNiPA), and Raman spectra of three kinds of polyethylene (PE) in pellets of different density. 
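As a rough illustration of the "very simple form" referred to here, eq 1 can be sketched in a few lines of NumPy. The function name `synchronous_maps` and the random demonstration matrix are hypothetical, and the column-wise mean-centering step is an assumption taken from the description of the window calculations in the Results section, not part of eq 1 itself.

```python
import numpy as np

def synchronous_maps(X):
    """Return (Syn_VV, Syn_SS) of eq 1 for an s x w matrix X (spectra in rows)."""
    X = np.asarray(X, dtype=float)
    s, w = X.shape
    syn_vv = X.T @ X / (s - 1)   # w x w variable-variable covariance map
    syn_ss = X @ X.T / (w - 1)   # s x s sample-sample covariance map
    return syn_vv, syn_ss

# Hypothetical demonstration data: 5 "spectra" recorded at 40 "wavenumbers".
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(5, 40))

# Assumption: each variable is mean-centred over the samples before eq 1 is applied.
X_centered = X_demo - X_demo.mean(axis=0)
syn_vv, syn_ss = synchronous_maps(X_centered)
```

With this centering, the VV map coincides with the ordinary sample covariance matrix of the variables, which is one way to check the sketch against a library routine such as `numpy.cov`.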
All these systems were already investigated by our group by using other algorithms and for different reasons.11-13 We have selected them here for reanalysis because we want to exemplify the merits of our new idea in the frame of 2D spectroscopy, named moving-window 2D correlation spectroscopy,14 and to unveil other aspects of correlation spectroscopy, such as classification, that have rarely, if ever, been mentioned so far.15 In short, moving window 2D correlation spectroscopy is nothing but executing eq 1 on comparatively small data matrices that are obtained by partitioning X into sets of submatrices16,17 (or windows; hence, the name). These windows usually consist of only several spectra and are moved by one place along the sample axis up to the last sample. For a matrix comprising N spectra, there will be N - n + 1 windows of size n. This way of calculating covariance among the data is particularly useful for circumventing over-interpretation of covariance maps that may easily occur if they are calculated from large data sets. Overly complex data should not be straightforwardly analyzed by means of 2D methodology because the shape of the resulting maps (that are misleadingly effortless to calculate) is due to several causes that cannot unambiguously be separated and analyzed. Hence, one has to seek a simplification, and in our opinion, the most plausible way to alleviate the problem of data complexity is to start with simpler data, that is, to analyze windows. The results of the window analysis, though seemingly numerous, are easy to
(10) Diewok, J.; Ayora-Canada, M. J.; Lendl, B. Anal. Chem. 2002, 74, 4944-4954.
(11) Šašić, S.; Muszynski, A.; Ozaki, Y. J. Phys. Chem. A 2000, 104, 6388-6394.
(12) Katsumoto, Y.; Tanaka, T.; Sato, S.; Ozaki, Y. J. Phys. Chem. A 2002, 106, 3429-3435.
(13) Sato, H.; Shimoyama, M.; Kamiya, T.; Amari, T.; Šašić, S.; Ninomiya, T.; Siesler, H. W.; Ozaki, Y. J. Appl. Polym. Sci. 2002, 86, 443-448.
(14) Šašić, S.; Ozaki, Y. Submitted for publication.
(15) Wang, G.; Geng, L. Anal. Chem. 2000, 72, 4531-4542. Geng, L. Presentation at XXIX FACSS Meeting, Providence, RI, 2002.
(16) Keller, H. R.; Massart, D. L.; De Beer, J. O. Anal. Chem. 1993, 65, 471-475.
(17) Darj, M. M.; Malinowski, E. R. Anal. Chem. 1996, 68, 1593-1598.
Figure 1. The NIR spectra in the 6600-7600 cm-1 region of oleic acid in the pure liquid state over the temperature range of 16-80 °C.
compile and to extract succinct information from. In this paper, we demonstrate how the combination of the moving window idea and 2D correlation spectroscopy may be utilized to evidence the phase transition points and to classify the spectra from quite complex and demanding data sets.
EXPERIMENTAL SECTION
Only a brief description of the experiments is given here because the details can be found in refs 11-13. A sample of oleic acid of very high purity was supplied by Nippon Oil and Fats Co. (Amagasaki, Japan) and was used without further purification. The NIR spectra were taken on a Nicolet Magna 760 FT-IR/NIR spectrophotometer equipped with a PbSe detector. A total of 512 scans were accumulated per measurement. The temperature of the sample was controlled by circulating thermostated water in a cell holder.11
The preparation of poly-(N-isopropylacrylamide) (PNiPA) was reported in our previous paper.12 PNiPA was dissolved in water and incubated for 12 h before the IR measurements. The transmission and attenuated total reflection (ATR) spectra were measured by a Nicolet Magna 760 FT-IR spectrometer equipped with a liquid nitrogen cooled mercury-cadmium-telluride detector. The ATR cell used was made of a horizontal ZnSe crystal with an incidence angle of 45°. The temperature was varied at a rate of ∼2 °C/h. The IR spectrum of water had been subtracted by a home-written C++ program.12
The PE pellets were provided by Mitsubishi Chemical Co. and were used as received. The Raman spectra of the pellets (4 mm in diameter) were measured with a JASCO NRS 2001 Raman spectrometer equipped with a liquid-cooled CCD detector (Princeton Instruments). The Ar laser line of 514.5 nm was employed for excitation, with a power of 50 mW at the sample position. The total measurement time per spectrum was 20 s.13
RESULTS
Oleic Acid Data. Figure 1 represents NIR spectra in the 7600-6600 cm-1 region of oleic acid in the pure liquid state measured over the temperature range of 16-80 °C.
The spectra are characterized by prominent baseline fluctuations caused by the rise in temperature and by the intensity increase of the peak at
∼6920 cm-1 assignable to the first overtone of an O-H-stretching vibration of oleic acid monomer. This system was investigated in detail by different physicochemical techniques18,19 and was the first one to which SS correlation spectroscopy was applied.11 These references suggest that one could expect three phases of oleic acid in the pure liquid state to appear in the course of heating: quasi-smectic liquid crystal structure in the low-temperature range, the same structure but less ordered in the middle, and isotropic liquid in the high temperatures, respectively.18,19 The ascent of the band at 6920 cm-1 is a consequence of the increased monomer concentration at high temperatures. The spectra in Figure 1 would be easy to analyze with respect to searching for the phase transition temperatures only if the band at 6920 cm-1 is representative of the appearance of different phases. If so, one could eliminate the baseline variations and monitor the intensity variation at 6920 cm-1 the pattern of which should then reveal the existence of given phases. However, it turned out to be more elucidating to analyze the spectra in Figure 1 in a multivariate way using the entire spectral range. Hence, we start with the moving window 2D correlation analysis by first normalizing the spectra (dividing every spectrum by its mean), and then creating windows that consist of five spectra (the first window covering the first five spectra in the temperature range of 16-25 °C, etc.). The total of 22 available spectra gives 18 5-spectra-wide windows. The 18 covariance matrices are calculated from these windows (the spectra being mean-centered beforehand), and the overall conclusion is drawn by analyzing a common parameter of these matrices. The covariance matrix of the first window of the normalized data is shown in Figure 2A as a set of slice spectra (in this paper, we do not use 2D contour plots). 
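The window procedure just described (mean-normalization, windows of five spectra moved by one place, mean-centering, then eq 1) can be sketched as follows. This is a minimal sketch under stated assumptions: the function name `moving_window_vv` and the random stand-in data are hypothetical, and mean-centering is taken here to mean subtracting the window's mean spectrum from each spectrum in that window.

```python
import numpy as np

def moving_window_vv(X, n=5):
    """VV covariance maps of all n-spectra windows of X (N x w, spectra in rows)."""
    X = np.asarray(X, dtype=float)
    # Normalization as described in the text: divide every spectrum by its mean.
    Xn = X / X.mean(axis=1, keepdims=True)
    N = Xn.shape[0]
    maps = []
    for start in range(N - n + 1):          # N - n + 1 windows, moved by one place
        W = Xn[start:start + n]
        # Assumption: centre each variable on its mean within the window.
        Wc = W - W.mean(axis=0)
        maps.append(Wc.T @ Wc / (n - 1))    # eq 1 (VV form) applied to the window
    return maps

# Hypothetical stand-in for the 22 oleic acid spectra (positive values, 30 variables).
rng = np.random.default_rng(0)
X_demo = rng.uniform(1.0, 2.0, size=(22, 30))
maps = moving_window_vv(X_demo, n=5)
```

For 22 spectra and windows of five, the loop produces the 18 covariance matrices mentioned in the text; each row (or column) of a matrix is one slice spectrum.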
Naturally, because the baseline variation is not eliminated, one may see the slice spectra containing features all along the spectral range. The next two windows give covariance matrices similar to the one in Figure 2A, but the covariance matrix of the fourth window (Figure 2B) is somewhat specific as the intensity of covariances sharply rises, being almost three times higher than those shown in Figure 2A. With the following windows, the shape of the covariances does not change, but the intensity decreases up to window 8, whereas at window 9, both intensity and shape of covariances apparently do change. The pattern of decreasing and increasing covariance intensities is seen once more, with window 15 being the break point. The common information contained in the covariance matrices of all the windows is measured through the mean of the standard deviation (SD) of the matrices. Our idea is the following: if the spectra in a window are due to a single species, then after mean-centering, the resulting covariance must be comparable to the covariance of a matrix of noise, but if a window contains at least one spectrum arising from a different species, then the intensity of covariances must sharply rise, because the features of two spectrally distinguishable species cannot be annulled by mean-centering and will give rise to nonnegligible covariances. The mean of the SD of the matrices charts the variation of the covariances and can be used as a graphical criterion to elucidate
(18) Iwahashi, M.; Yamaguchi, Y.; Kato, T.; Horiuchi, T.; Sakurai, I.; Suzuki, M. J. Phys. Chem. 1991, 95, 445-451.
(19) Iwahashi, M.; Hachiya, N.; Hayashi, Y.; Matsuzawa, H.; Suzuki, M.; Fujimoto, Y.; Ozaki, Y. J. Phys. Chem. 1993, 91, 707-711.
Figure 2. Covariance matrices of windows 1 (A) and 4 (B) viewed as the slice spectra.
the nature of the matrices. Such a plot is shown in Figure 3, and it clearly demonstrates that the first three windows (comprising 7 spectra) seem to belong to one phase, that another phase rises at window 4 (the corresponding temperature is 32 °C), and that the third and probably the fourth phase appear at windows 9 (53 °C) and 15 (73 °C), respectively. These results nicely concur with those obtained by Iwahashi et al.18,19 and us,11 except for the transition at 73 °C that has not been experimentally evidenced. The criterion displayed in Figure 3 can theoretically be supported by adding the maximum of autocovariance of a matrix of uniform noise of the same size as the experimental data and with a precisely defined SD. This value is expected to lie somewhere near the line connecting the first three points in Figure 3. Alternatively, one may use a tripled mean of the SD of such a noisy matrix. The SD of the mentioned matrix of noise is a number estimated as equal to the SD of intensities at several wavenumbers at the beginning or end of any of the spectra shown in Figure 1. However, it turned out that the variations in the set of spectra belonging to the same phase were such that assumption of noise
Figure 3. Mean of standard deviation of the covariance matrices vs the windows. The temperatures corresponding to the sharp rise of the ordinate values are 32, 53, and 73 °C, respectively.
Figure 4. The IR spectra of an aqueous solution of PNiPA in the temperature range ∼24 to 38 °C.
uniformity did not quite hold true, so that the maximum of autocovariance of a noisy matrix with the SD as defined was somewhat below the first three points in Figure 3. The discrepancy at higher windows (windows 8 and 14 should have been in line with the first three points) may be due to a more pronounced deviation from the uniformity of noise caused by incomplete separation of the phases.
PNiPA Data. The temperature-dependent IR spectra of PNiPA in an aqueous solution are given in Figure 4. The strong overall intensity variation taking place above 33 °C obscures comprehension of the changes in the bands. To identify the coil-globule transition temperature and to document the spectral variations associated with it, Katsumoto et al.12 employed two techniques: first, they calculated second derivatives of the spectra shown in Figure 4 in order to detect a new peak that appears above the phase-transition temperature, and then they calculated the correlation between the intensity variations at the wavenumber intervals assigned to each of the amide bands of both phases, as in statistical 2D correlation spectroscopy.20 The perfect correlation coefficient of 1 was found for the amide bands below the temperature of coil-globule transition, and the appearance
(20) Šašić, S.; Ozaki, Y. Anal. Chem. 2001, 73, 2294-2301.
Figure 5. The mean-normalized spectra in the amide I (A), II (B), and III (C) regions.
of the globule phase was noted through the decline of the perfect correlation coefficient. In the present study, we analyze somewhat wider spectral ranges than those used by Katsumoto et al.12 and do not pay attention in advance to the positions of the peaks in order to calculate covariances among the most representative wavenumbers, but rather, we identify these wavenumbers through the covariance maps and, finally, analyze the entire spectra shown in Figure 4 without any other pretreatment except for mean-normalization. Figure 5A-C describes the analyzed regions of amide I, II, and III bands. The covariance matrices obtained from the total of 21 spectra do not differ substantially from the ones shown in Figure 2, except, of course, for the shape of covariances and are, therefore, not shown. Figure 6 displays the criterion from Figure 3. For the amide I and II bands and for the entire IR spectra of PNiPA (Figure 6A, B, and D, respectively), there is a clear indication of the phase transition occurring at 33.1 °C, whereas a transition temperature of 33.8 °C is found using the amide III
Figure 6. The mean of standard deviation of the covariance matrices obtained from the amide I (A), II (B), and III (C) regions and from the entire spectra (D). The straight lines represent the absolute maximum of the covariance matrix of noise, the SD of which is comparable with the experimental one.
band. These are the points where the means of the SD of covariance matrices sharply ascend, indicating the appearance of a new local conformation of the polymer. Of particular interest is the finding that even the entire IR spectra of PNiPA may be considered indicative of the phase transition, as shown in Figure 6D. The usual approach followed by many physicochemists is to select the bands of interest (such as amide bands) and to search for the changes in these bands. This study shows that attention should not be paid only to the bands of interest: the phase transition may induce variations along the whole spectral range, so the information about the phase transition is not localized only in the bands, and perhaps every single wavenumber may contain indications of the phase transition. The experimental noise is estimated through band-free regions in Figure 5B and C, and the maximum of its covariance matrix is
added to all the criteria. It performed quite satisfactorily, except for the entire range of covariances where it turned out to be remarkably underestimated. The spectral differences between the two phases in all three amide regions can be learned from the VV maps of the windows that include the spectra of both species. Figure 7 clearly suggests that there are two bands in the amide I region varying in opposite directions with maxima at 1610 and 1652 cm-1 (these two positions are commented on in the Discussion section). A similar result is obtained for the amide II region (not shown) with the variation on the higher wavenumber side being difficult to assess, whereas there is no indication that the positions of the bands have changed in the amide III region.
Polyethylene (PE) Pellets Data. The Raman spectra of PE pellets are not the subject of the same interest as the previously
Figure 7. The VV covariance maps shown as slice spectra in the amide I region. The marked slices illustrate opposite behavior of the bands.
Figure 9. The mean of SD of covariance matrices of the ordered set of PE pellets’ Raman spectra.
Figure 8. The Raman spectra of HDPE, LLDPE, and LDPE in pellets with different physical properties.
analyzed two data sets. There is no phase transition among the PE pellets in terms of continuous evolution of one phase into another. However, there is a similarity between this system and the former two in terms of classification. Searching for a phase-transition point may be understood as an attempt to cluster the spectra. A sharp conversion between the phases should always give a clear separation of the spectra belonging to different phases, and from a related cluster plot, it should be easy to read the point of conversion. Hence, one may assume that the spectra belonging to the different phases of oleic acid or PNiPA can be clustered. Our objective in applying moving window 2D correlation spectroscopy to the Raman spectra of PE pellets is to demonstrate that this method can be used as a clustering tool and that, following classification, one may employ the covariance spectra to learn about the difference between the different clusters (phases). The spectra shown in Figure 8 were analyzed13 by the most popular multivariate method, partial least squares (PLS),5 to build a model that can predict density, crystallinity, and melting points of unknown samples. The three kinds of pellets were separated by score biplots, and the spectral differences among them were discussed via the loadings’ features.13 In applying the moving window covariance methodology, one faces two options: if any parameter describing the groups of samples (such as density or crystallinity, in this case) is known, one may order the spectra before running the calculation, and in that case, the main interest can
only be in unraveling the spectral differences among the phases, whereas if nothing is known about the samples, then the covariance spectroscopy may be used to cluster them. The latter task is apparently much more difficult. In the first instance, we use the knowledge about the parameters describing the three kinds of PE and order the Raman spectra shown in Figure 8 in ascending order of the above three parameters, that is, the Raman spectra of six low-density (LDPE) samples are followed by those of six linear low-density (LLDPE) and five high-density (HDPE) samples. Then, we run the covariance analysis in the same way as before. The means of the SD of the resulting maps are displayed in Figure 9. Apparently, the three kinds of PE are successfully separated, indicating that there are measurable spectral differences among them. The first six spectra (two windows) are only due to LDPE. The inclusion of the first spectrum of LLDPE causes an increase in the deviation of the covariance matrix of the third window. After all six LLDPE spectra are used (windows 3-8), the covariance again rises, indicating the emergence of an HDPE spectrum (window 9). Because Figure 9 provides information about the separability of the spectra, one may look for the spectral variations among the PEs in the same way as for the PNiPA data. Thus, window 3 is selected to monitor the differences between the Raman spectra of LDPE and LLDPE (Figure 10). All of the slice spectra appear to be positively covaried, with a minor number of negative data points and practically without splitting of any of the peaks. This result suggests that the spectral differences between these two kinds of PE are quite small (as can be seen in Figure 9, as well) and that they mainly originate in the overall intensity pattern, because no particular band-splitting evidence is acquirable from Figure 10. The spectral differences between LLDPE and HDPE are investigated through the covariances of window 9 (not shown).
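The mean-of-SD criterion applied to an ordered sample set can be sketched as below. Several points are assumptions rather than statements of the original calculation: the criterion is read here as the SD of each column of a window's covariance matrix, averaged over the columns; the function name `window_criterion` is hypothetical; and the 6 + 6 + 5 "spectra" are synthetic two-band curves that mimic only the group sizes of the ordered PE set, not real Raman data.

```python
import numpy as np

def window_criterion(X, n=5):
    """Mean of the column-wise SD of each window's VV covariance matrix."""
    X = np.asarray(X, dtype=float)
    Xn = X / X.mean(axis=1, keepdims=True)       # mean-normalize every spectrum
    crit = []
    for start in range(Xn.shape[0] - n + 1):
        W = Xn[start:start + n]
        Wc = W - W.mean(axis=0)                  # mean-centre within the window
        C = Wc.T @ Wc / (n - 1)                  # eq 1, VV form
        # Assumed reading of "mean of the SD": SD per column, then the mean.
        crit.append(np.std(C, axis=0, ddof=1).mean())
    return np.array(crit)

# Synthetic ordered set: three groups (6, 6, 5 spectra) differing in a band ratio.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 60)
band1 = np.exp(-((x - 0.3) / 0.05) ** 2)
band2 = np.exp(-((x - 0.7) / 0.05) ** 2)
ratios = np.repeat([0.2, 0.6, 1.0], [6, 6, 5])
X_demo = 1.0 + band1 + ratios[:, None] * band2
X_demo = X_demo + rng.normal(scale=1e-3, size=X_demo.shape)

crit = window_criterion(X_demo, n=5)
# crit[0], crit[1]: windows 1 and 2 (first group only); crit[2]: window 3;
# crit[7]: window 8 (second group only); crit[8]: window 9.
```

In this synthetic case the criterion stays near the noise level for windows that contain a single group and rises sharply at windows 3 and 9, where the first spectrum of the next group enters the window.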
Again, no remarkable difference between them is observable, suggesting that the Raman spectra of the two species are very similar. However, a sharp increase in the intensity of covariance reveals that HDPE has fairly different Raman scattering intensities, making it easily distinguishable from LLDPE and LDPE. If nothing is known about the spectra, then the above calculation is not possible anymore. Moreover, the moving window approach may not make sense for the systems of randomly ordered spectra. To employ 2D correlation spectroscopy for
Figure 10. The 741 slice spectra of window 3 illustrating high similarity and significant intensity variations among the Raman spectra of different kinds of PE in pellets.
spectra classification, which is all that can be done with indiscriminately arranged spectra, one has to run SS covariance spectroscopy on the entire data set. So far, we have used here only VV covariance spectra, because they have correctly provided the information sought. We could also have calculated SS covariance spectra, and the results should have been highly similar to those in Figures 3, 6, and 9, respectively (in fact, we have confirmed this). For disordered spectra, however, only SS covariance maps may be useful. Irrespective of the order of the spectra, the covariance among the spectra of the same species should always be high, while at the same time, the covariance with the spectra of the different species should be lower but ordered. In other words, from an SS slice spectrum (which is preferable, because a 2D presentation may be hardly readable) taken at any of the samples, one should recognize as many groups of covaried samples as there are different species present. Hence, we disarranged the above spectral assembly and created a new one in which the spectra were randomly ordered. Plotting the 2D representation of the obtained SS map is not advisable, whereas the slice spectra offer diverse information depending on the sample at which the slice spectrum is taken. As displayed in Figure 11A, one slice spectrum can nicely confirm the above scenario, but another one may yield no information (Figure 11B). The slice taken at sample 1 (Figure 11A) clearly separates the three groups of samples through the three levels of covariances: there are five samples with strong, six with medium, and five with negative covariance with sample 1, describing, therefore, exactly different phases of PE, while at the same time the slice taken at sample 12 is completely uninformative. Thus, the separation of the species seems to be heavily dependent on the choice of a slice spectrum to be examined.
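The SS slice idea for randomly ordered spectra can be sketched as follows. This is an illustration under stated assumptions, not the calculation used for Figure 11: the data are synthetic two-band curves, the names `ss_map` and `split_by_gaps` are hypothetical, the spectra are centred on the mean spectrum of the whole set before eq 1 is applied, and the grouping rule (cutting the sorted slice at its largest gaps) is a simple stand-in for reading the levels off a scatter plot by eye.

```python
import numpy as np

def ss_map(X):
    """SS covariance map (eq 1) of mean-normalized, mean-centred spectra."""
    X = np.asarray(X, dtype=float)
    Xn = X / X.mean(axis=1, keepdims=True)
    Xc = Xn - Xn.mean(axis=0)            # assumption: centre on the mean spectrum
    return Xc @ Xc.T / (X.shape[1] - 1)

def split_by_gaps(values, n_groups):
    """Label samples by cutting the sorted values at the largest gaps (n_groups >= 2)."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values)
    gaps = np.diff(values[order])
    cuts = np.sort(np.argsort(gaps)[-(n_groups - 1):]) + 1
    labels = np.empty(len(values), dtype=int)
    for g, idx in enumerate(np.split(order, cuts)):
        labels[idx] = g                  # label 0 = lowest covariance level
    return labels

# Synthetic set: three groups (6, 6, 5 spectra) differing in a band ratio, shuffled.
rng = np.random.default_rng(1)
x = np.linspace(0.0, 1.0, 60)
band1 = np.exp(-((x - 0.3) / 0.05) ** 2)
band2 = np.exp(-((x - 0.7) / 0.05) ** 2)
true = np.repeat([0, 1, 2], [6, 6, 5])
X_demo = 1.0 + band1 + np.array([0.2, 0.6, 1.0])[true][:, None] * band2
X_demo = X_demo + rng.normal(scale=1e-3, size=X_demo.shape)
perm = rng.permutation(len(true))
X_demo, true = X_demo[perm], true[perm]

# Slice taken at a sample from the most distinct group (an informed choice here).
ref = int(np.flatnonzero(true == 2)[0])
slice_ss = ss_map(X_demo)[ref]
labels = split_by_gaps(slice_ss, n_groups=3)
```

The reference sample is picked from the most distinct synthetic group, which gives three well-separated covariance levels; a slice taken at a sample close to the mean spectrum gives much smaller covariances, in line with the remark that the outcome depends heavily on which slice is examined.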
Bearing in mind the complexity of the spectra analyzed (Figure 8) and the inability of a more complex separation method, principal component analysis (PCA), to provide a comprehensive clustering of the three kinds of PE (Figure 11C), one has to conclude that the result shown in Figure 11A is reasonably encouraging concerning the separation ability of the SS covariance spectroscopy.
DISCUSSION
It has been found that normalization is of crucial importance for the success of the present study. Some of the results
Figure 11. The SS slice spectra (shown as scatter plots) taken at sample 1 (A) and 3 (B) of randomly ordered PE Raman spectra. The PCA score biplot of the same set of spectra is shown in (C).
are also acceptable without normalizing the spectra, but the best ones are obtained only after normalization. The purpose of normalizing spectra is to make spectral variations comparable; a statistical analysis is greatly endangered if the inputs (spectra) are biased.1 The analysis of the PNiPA spectra (Figure 4) would hardly make sense if these spectra were analyzed without any pretreatment. However, this operation is usually applied to the bands, not to the full spectra, because it is the selected bands where inequality of spectral responses is evident and not the spectra. A phase transition represents a coherent move of an ensemble of molecules. This may lead to changes in the optical parameters of the system that may cause baseline variations, such
as those spotted here in all three cases. It is thus reasonable to suppose that some information regarding phases may be embedded in the baseline variations and that it is not only peak shifts that may be considered descriptors of phase transitions. Therefore, the baseline variations should not be omitted straightforwardly in monitoring phase transformations. The normalization of the whole spectra with the baseline varying so markedly means only that the spectra of one phase are shifted so as to reach a common, as much as possible, baseline level. Another phase would have another baseline, slightly but clearly different from other baselines. Hence, running covariance analysis on such spectra with the purpose of determining phase transition points can be successful, as shown here. This approach means that the band details are somewhat ignored. The most purposeful result obtained by analyzing the full spectra is to verify the phase-transition points, whereas whether the covariance maps will be useful for elucidating the spectral differences among the phases is not always clear. For instance, although the analysis of the PE Raman spectra has clearly revealed that only minor spectral differences combined with scattering coefficients of different pellets lead to obvious discrimination of the pellets,13 the corresponding analysis of oleic acid data has a much smaller chance to be successful owing to the very complex combination of baseline variation, with one varying and two static bands. Hence, in the latter case, the phase transition description has been the safest result to report, while in the former case, additional information has been reachable. The covariance spectra in Figure 7 point to 1610 cm-1 as a variable where remarkable spectral variation takes place. Although this is quite correct, it is wrong to consider 1610 cm-1 a peak position. 
There is no literature data reporting the band at 1610 cm-1; however, the band at 1652 cm-1 is reported for the globule state (also in Figure 7). The reason for the appearance of the peak at 1610 cm-1 in the covariance spectra is simply in the notable intensity variation at this variable. The band is located at 1624 cm-1, but because its variation is somewhat moderated (probably) by the band at 1652 cm-1, the strongest intensity variation in the shorter wavenumber side appears at 1610 cm-1. The lesson to be learned from this is that the peak position in covariance spectra does not necessarily mean that a band is located at that place; the same applies to PCA peaks as well. There must be a band peak in the vicinity of a covariance peak, but complete matching of the two is not guaranteed (see also ref 10 for similar observations). Finally, analysis of the full spectra hints to the role of water in the phase transition. The spectra shown in Figure 5 have already been pretreated in order to eliminate an overwhelming water IR spectrum. This may have led to the suppression of the baseline influence, and therefore, another cause may have contributed to the pattern in Figure 6. Most likely, it is interaction between water and the polymer that could not have been removed by manipulating spectra prior to the covariance analysis, and its trace has been imprinted along the entire analyzed region. If so, then the pattern in Figure 6 is not only due to the spectral differences between the coil and globule conformations but also due to the different nature of interaction of these two phases of PNiPA with water. 2D correlation spectroscopy is, of course, incomparable with the PLS algorithm in terms of prediction ability, but the features of the most important loadings and scores of PLS can be compared
with 2D covariance and disvariance spectra. With respect to the analysis of the Raman spectra of the PE pellets, it turns out that the first loadings of PLS and PCA are highly similar to the covariance spectra in Figure 10. The negative features in the first PLS loading and the features of the higher PCA loadings can be found in the disvariance spectrum. This suggests that the main information, in this case, the spectral differences among the three kinds of PE, is satisfactorily acquirable from a very simple method, such as 2D correlation spectroscopy, and that there is no specific need to utilize more complex means, such as PCA or PLS (excluding prediction). Simply speaking, because noise does not seem to be an issue in the PE Raman spectra and because this system is not too complicated, covariance maps operate very well in extracting the crucial information. They also perform properly at the level of detail that requires disvariance spectra to be analyzed, which will not be commented on further here. It is known that the amorphous band at ∼1320 cm-1 and the peak at 1422 cm-1 are markers of the PE density, but for the purpose of this work, the overall Raman intensity provided the bulk of the information, so there was no need to employ the above two indicators. It is necessary to say that citing the high signal-to-noise ratio and the relative simplicity of the Raman spectra of the PE pellets (actually, they are visually very complex, and only after analysis may one judge their true complexity) does not necessarily mean that covariance spectra are manifestly inferior to PCA. The great advantage of PCA lies in the identification of the number of active species and the delineation of signal, significantly facilitating the analysis of spectral details, but the information contained in PCA loadings is definitely readable from 2D variance spectra, as well. 
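The similarity between a first PCA loading and a covariance spectrum noted above can be checked numerically on synthetic data. The sketch below is only an illustration under stated assumptions: the band positions, the three intensity scales, the 18 hypothetical samples, and the noise level are invented, and "covariance spectrum" is taken here to mean the slice of the variable-variable covariance map at the variable of largest variance. PCA is computed by a plain singular value decomposition of the mean-centered data.

```python
import numpy as np

def gaussian(x, center, width):
    """Unit-height Gaussian band (assumed band shape)."""
    return np.exp(-0.5 * ((x - center) / width) ** 2)

rng = np.random.default_rng(0)
x = np.linspace(1000.0, 1500.0, 251)      # hypothetical Raman shift axis, cm-1

# One common spectral shape; the three sample classes differ only in overall intensity.
base = gaussian(x, 1130.0, 8.0) + 0.8 * gaussian(x, 1295.0, 10.0) + 0.6 * gaussian(x, 1440.0, 12.0)
scales = np.repeat([1.0, 1.3, 1.7], 6)    # 3 hypothetical classes x 6 spectra each
spectra = np.outer(scales, base) + 0.005 * rng.standard_normal((scales.size, x.size))

# Variable-variable covariance map of the mean-centered spectra.
centered = spectra - spectra.mean(axis=0)
vv_cov = centered.T @ centered / (spectra.shape[0] - 1)

# Covariance spectrum: slice of the map at the variable with the largest variance.
ref = int(np.argmax(np.diag(vv_cov)))
cov_slice = vv_cov[ref]

# First PCA loading: first right singular vector of the centered data.
_, _, vt = np.linalg.svd(centered, full_matrices=False)
loading1 = vt[0]

# Cosine similarity (sign-insensitive, because the sign of a loading is arbitrary).
cosine = abs(cov_slice @ loading1) / (np.linalg.norm(cov_slice) * np.linalg.norm(loading1))
print("similarity of covariance slice and first loading:", round(float(cosine), 4))
```

When a single source of variation dominates and the noise is low, as assumed here, the covariance slice is nearly proportional to the first loading, so little is lost by reading the covariance spectrum directly. With several comparable sources of variation or poorer signal-to-noise ratio, the two would be expected to diverge.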
For example, the average perspective of the geometrical distance among the spectra provided by a PCA score biplot (Figure 11C) can be outperformed by conveniently selected SS slice spectra. Apparently, some samples may serve as good distance markers (Figure 11A), and some are useless for this purpose (Figure 11B). In this particular example, 4 out of 17 samples effectively cluster the phases, while the rest fail. Searching for the descriptive samples may appear to be a difficult and troublesome task, because if one knows nothing about the sample set, one has no idea what a correct result is; in addition, for large sample sets, such a search becomes progressively pointless. However, such situations are rarely encountered. Usually, at least elementary knowledge is present, and it may appreciably help to pose the problem and to apply effectively the methodology developed in this study. We again emphasize that we mainly think of physicochemical systems and that there is no intention to underestimate or replace the high-quality, very popular, and powerful PCA.
The method presented here has much in common with fixed-size moving window evolving factor analysis (FSMWEFA).16 Application of FSMWEFA to the first two spectral data sets gives the same results. The crucial difference between the two methods is that 2D correlation routinely does not decompose the data. This is an advantage when data that are not too complex are analyzed, because the 2D correlation spectra are always in a close relation with physical reality and may simultaneously provide information about the spectral and concentration aspects of the data. On the other hand, FSMWEFA is superior in signal-to-noise separation and, consequently, may provide more reliable information about the number
of existing species. Therefore, it is a better choice when data containing numerous analytes or having a poor S/N ratio, for example, chromatograms, are analyzed. The criterion for the S/N threshold proposed here is akin to but less precise than the definition of the pool of eigenvalues assignable to noisy directions in FSMWEFA. FSMWEFA is also easily comprehensible and undemanding concerning computation and programming, except for the inclusion of routines for singular value decomposition.
CONCLUSION
Several applications of the newly proposed combination of the moving window concept and 2D correlation spectroscopy have successfully been demonstrated in this study. The phase-transition points of oleic acid and PNiPA are determined from their temperature-dependent vibrational spectra by analyzing small covariance matrices in succession. The appearance of new species is indicated through sharp rises of covariance intensities. A criterion of the mean of the standard deviation of the covariance matrices is employed to characterize these matrices so that, despite the numerous maps calculated, only a single plot is essentially used to extract the sought data. After the transition point is determined, the spectral differences between the coil and globule states of PNiPA are investigated. It is found, in agreement with previous studies, that new bands appear in the amide I and II regions, but only intensity variations between the two conformations are detected in the amide III region. The baseline variation is found
to be indicative of different phases of oleic acid, whereas water-polymer interaction is believed to have influenced (to some extent) the covariance spectra of the polymer. The same method is found to be capable of separating the three kinds of PE pellets with different physical properties. The analysis of the spectral features of the LDPE, LLDPE, and HDPE pellets reveals that their Raman spectra are highly similar, with the overall intensity of the spectra separating the three kinds of pellets. An original idea of classifying randomly ordered experimental spectra by using the slice spectra of SS covariance maps is demonstrated. It turns out that this method may in some instances outperform a PCA score biplot. All three spectral sets are analyzed in an automatic fashion, quickly and efficiently, with mean normalization being necessary. The formulas used and the programming involved have been utterly simple. The importance of the whole spectral range in analyzing physicochemical problems, such as those mentioned in this study, is underlined.
ACKNOWLEDGMENT
S.S. thanks the Japan Society for the Promotion of Science (JSPS) for financial support.
Received for review December 17, 2002. Accepted May 12, 2003. AC020769P
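The moving-window procedure summarized in the Conclusion can be sketched in a few lines. This is a hedged illustration, not the code used in the study: the function name, the window size of four spectra, the centering of each window along the sample axis, and the reading of the criterion as "standard deviation of each column of the covariance map, then the mean of these values" are assumptions, and the data are synthetic, with a sloped baseline that switches on at a hypothetical transition index.

```python
import numpy as np

def moving_window_criterion(spectra, window):
    """For each window of consecutive spectra, build the variable-variable
    covariance map and reduce it to one number: the mean of the column-wise
    standard deviations of the map (assumed form of the criterion)."""
    n_samples = spectra.shape[0]
    values = []
    for start in range(n_samples - window + 1):
        block = spectra[start:start + window]
        centered = block - block.mean(axis=0)          # center within the window
        cov = centered.T @ centered / (window - 1)     # covariance map of this window
        values.append(cov.std(axis=0).mean())
    return np.array(values)

rng = np.random.default_rng(1)
x = np.linspace(0.0, 1.0, 120)                          # arbitrary spectral axis
band = np.exp(-0.5 * ((x - 0.5) / 0.05) ** 2)           # one static band

n_spectra = 30
transition = 15                                         # first spectrum of the second "phase"
in_second_phase = (np.arange(n_spectra) >= transition).astype(float)
sloped_baseline = 0.3 * x                               # baseline present only in the second phase
spectra = (band
           + in_second_phase[:, None] * sloped_baseline
           + 0.002 * rng.standard_normal((n_spectra, x.size)))

window = 4
crit = moving_window_criterion(spectra, window)
best = int(np.argmax(crit))
print("window with the largest criterion covers spectra", best, "to", best + window - 1)
```

Plotting `crit` against the window index gives the single curve from which the transition is read; its sharp rise marks the windows that straddle the baseline change. Under this particular reading of the criterion, a perfectly uniform baseline offset would produce a flat covariance map and hence no rise, which is why a wavenumber-dependent baseline is assumed in the synthetic data.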