Anal. Chem. 2002, 74, 1371-1379
Combined Use of Conventional and Second-Derivative Data in the SIMPLISMA Self-Modeling Mixture Analysis Approach
Willem Windig,*,† Brian Antalek,† Joseph L. Lippert,† Yann Batonneau,‡ and Claude Brémard‡
Imaging Materials and Media, Research & Development, Eastman Kodak Company, Rochester, New York 14650-2132, and Laboratoire de Spectrochimie Infrarouge et Raman UMR-CNRS 8516, PRC Région Nord Pas de Calais, Bât. C5 Université des Sciences et Technologies de Lille, F-59655 Villeneuve d’Ascq cedex, France.
Simple-to-use interactive self-modeling mixture analysis (SIMPLISMA) is a successful pure variable approach to resolve spectral mixture data. A pure variable (e.g., wavenumber, frequency number, etc.) is defined as a variable that has significant contributions from only one of the pure components in the mixture data set. For spectral data with highly overlapping pure components or significant baselines, the pure variable approach has limitations; in such cases, however, second-derivative spectra can be used. In some spectroscopies, very wide peaks of components of interest overlap with narrow peaks of interest. In these cases, the use of conventional data in SIMPLISMA will not result in proper pure variables, and the use of second-derivative data will not be successful either, since the wide peaks are lost. This paper describes a new SIMPLISMA approach in which both the conventional spectra (for pure variables of wide peaks) and second-derivative spectra (for pure variables of narrow peaks overlapping with the wide peaks) are used. This new approach is able to properly resolve spectra with wide and narrow peaks and minimizes baseline problems by resolving them as separate components. Examples will be given of NMR spectra of surfactants and Raman imaging data of dust particle samples taken from a lead and zinc factory’s ore stocks that were stored outdoors.

For cases in which spectral mixture data are available without either pure component spectra or concentration profiles of pure components, a wide variety of self-modeling mixture analysis tools are available. For somewhat dated reviews, see Hamilton1 and Windig.2 The SIMPLISMA (simple-to-use interactive self-modeling mixture analysis) approach is different in that it is not based on principal component analysis, and it is interactive.3,4 The interactivity is important in a practical industrial environment where it is not always possible to design experiments or obtain replicate or uncontaminated samples when troubleshooting. Furthermore, in practical situations, events such as temperature changes may cause peak shifts, which may result in extra components. When resolving the data set, user interaction is necessary to avoid such problems.

SIMPLISMA is based on a pure variable approach. A pure variable (e.g., wavenumber, frequency number, etc.) is one that has significant contributions from only one of the pure components in the mixture data set. When highly overlapping spectral features or baselines are present in the spectra, second-derivative spectra can be used to resolve the data properly.5-8 A variation of SIMPLISMA has been described to resolve time-resolved mass spectral data. This approach uses pure spectra as a first estimate and derives pure variables from the resolved concentration profiles.9 Demonstration code and files for SIMPLISMA are available.10

SIMPLISMA has been described for a variety of applications: (a) Raman spectra of a time-resolved reaction of tetramethyl orthosilicate,3,10 hydrogen peroxide activation by nitriles,11 and imaging of dust particles emitted by smelters;12 (b) FT-IR microscopy of a polymer laminate;3-8,10 (c) pyrolysis mass spectral data of plant materials;4 (d) time-resolved mass spectral data of photographic color-coupling compounds;9,10 (e) NIR spectra of five-component mixtures of solvents5 or polymers13 and monitoring of powder blending;14 (f) diffuse reflectance UV-vis spectra of zeolites15 and polynucleotides;16 (g) fluorescence spectra of fulvic acids17 and natural organic matter from water;18 and (h) as a preprocessing step for other curve-resolution methods to analyze UV-vis spectra of polynucleotides,19 ion mobility

* Fax: (716) 477-7781. E-mail: [email protected].
† Eastman Kodak Company.
‡ UMR-CNRS.
(1) Hamilton, J. C.; Gemperline, P. J. J. Chemometrics 1990, 4, 1-13.
(2) Windig, W. Chemom. Intell. Lab. Syst. 1992, 16, 1-16.
(3) Windig, W.; Guilment, J. Anal. Chem. 1991, 63, 1425-1432.
(4) Windig, W.; Heckler, C. E.; Agblevor, F. A.; Evans, R. J. Chemom. Intell. Lab. Syst. 1992, 14, 195-207.
(5) Windig, W.; Stephenson, D. A. Anal. Chem. 1992, 64, 2735-2742.
(6) Windig, W. Chemom. Intell. Lab. Syst. 1994, 23, 71-86.
(7) Windig, W.; Markel, S. J. Mol. Struct. 1993, 292, 161-170.
(8) Guilment, J.; Markel, S.; Windig, W. Appl. Spectrosc. 1994, 48, 320-326.
(9) Phalp, J. M.; Payne, A. W.; Windig, W. Anal. Chim. Acta 1995, 318, 43-53.
(10) Windig, W. Chemom. Intell. Lab. Syst. 1997, 36, 3-16.
(11) Vacque, V.; Dupuy, N.; Sombret, B.; Huvenne, J. P.; Legrand, P. Appl. Spectrosc. 1997, 51, 407-415.
(12) Batonneau, Y.; Laureyns, J.; Merlin, J. C.; Brémard, C. Anal. Chim. Acta 2001, 446, 23-37.
(13) Mansuetto, E. S.; Wight, C. A. Appl. Spectrosc. 1992, 46, 1799-1803.
(14) Sanchez, F. C.; Toft, J.; van den Bogaert, B.; Massart, D. L.; Dive, S. S.; Hailey, P. Fresenius’ J. Anal. Chem. 1995, 352, 771-778.
10.1021/ac0110911 CCC: $22.00 Published on Web 02/19/2002
© 2002 American Chemical Society
Analytical Chemistry, Vol. 74, No. 6, March 15, 2002 1371
spectra of volatiles in air20 for which a real-time version has been reported,21 and for resolution of chromatographic peaks.22

In this paper, examples will be shown of spectral mixture data with both narrow and wide spectral features that need to be resolved. Neither the conventional nor the second-derivative approach alone can resolve these data sets. To resolve such data sets properly, SIMPLISMA needs to be extended toward resolving spectral data using a combination of conventional and second-derivative data. We will show practical applications of this approach to NMR spectra of surfactants and Raman imaging data of dust particle samples taken from a lead and zinc factory’s ore stocks that were stored outdoors.

MATERIALS AND METHODS

Simulated Data. Simulated data were generated as follows. Two Gaussian profiles were generated to represent spectra. The first one, representative of wide spectral features, such as a background, had a mean of 50 and a standard deviation of 30. The second Gaussian profile, representative of narrow spectral features, had a mean of 25 and a standard deviation of 1. Both profiles were represented by 100 data points. The spectra were normalized to have the same maximum of 1 and combined into mixtures in ratios of 1/9, 2/8, ..., 9/1. After this, random noise with a uniform distribution of -0.005 to 0.005 was added to the spectra. This resulted in noise with a range of 1% of the maximum intensity in the data.

NMR. Conventional 1H NMR spectra were obtained at 23 °C on a Varian (Palo Alto, CA) Inova 400 MHz spectrometer using a standard 5-mm probe. Typical parameters included a spectral width of 8000 Hz, a relaxation delay of 6 s, and 16 transients averaged. Low-power presaturation was used in all cases to attenuate the water signal. The residual water resonance was completely removed using digital filtering. The water resonance (4.7 ppm) was used to estimate the proper chemical shift scale.
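The simulated data set described under Simulated Data above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code. The function names (`gaussian`, `simulate_mixtures`, `second_difference`) are hypothetical. The 100 data points are assumed to sit on the grid 1, 2, ..., 100. The ratios 1/9, ..., 9/1 are assumed to be wide/narrow fractions that sum to 1. A plain second finite difference stands in for whatever second-derivative filter is actually applied. The seed is arbitrary.

```python
import math
import random


def gaussian(n_points, mean, sd):
    """Gaussian profile sampled at x = 1..n_points, scaled to a maximum of 1."""
    profile = [math.exp(-0.5 * ((x - mean) / sd) ** 2)
               for x in range(1, n_points + 1)]
    peak = max(profile)
    return [v / peak for v in profile]


def simulate_mixtures(n_points=100, noise=0.005, seed=0):
    """Return the wide profile, the narrow profile, and nine noisy mixtures."""
    rng = random.Random(seed)                    # arbitrary seed (assumption)
    wide = gaussian(n_points, mean=50, sd=30)    # wide, background-like feature
    narrow = gaussian(n_points, mean=25, sd=1)   # narrow feature
    mixtures = []
    for k in range(1, 10):                       # ratios 1/9, 2/8, ..., 9/1
        a, b = k / 10, (10 - k) / 10             # assumed: fractions sum to 1
        mixtures.append([a * w + b * s + rng.uniform(-noise, noise)
                         for w, s in zip(wide, narrow)])
    return wide, narrow, mixtures


def second_difference(y):
    """Second finite difference, used here as a stand-in second derivative."""
    return [y[i - 1] - 2 * y[i] + y[i + 1] for i in range(1, len(y) - 1)]


if __name__ == "__main__":
    wide, narrow, mixtures = simulate_mixtures()
    d_wide = max(abs(v) for v in second_difference(wide))
    d_narrow = max(abs(v) for v in second_difference(narrow))
    print("largest |second difference|, wide profile:  ", d_wide)
    print("largest |second difference|, narrow profile:", d_narrow)
```

In this sketch the second difference of the noise-free wide profile is several hundred times smaller than that of the narrow profile, and it is also smaller than the added noise. This is consistent with the statement that wide peaks are lost in second-derivative data.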
The data set represents spectra of a solid particle aqueous slurry obtained at different times during a ball mill process. Typically, eight spectra were acquired within a range of 24 h. The spectra consist of the signal from the surfactant used, N-methyl-N-oleoyltaurate potassium salt. Within the data set, mixtures of two distinct spectra are visible, one significantly broader than the other. The broad spectrum represents surface-adsorbed surfactant, and the narrow spectrum represents the dissolved population. The assumption is that all of the surfactant is visible within the entire spectrum. Other experiments not reported in this article suggest that the assumption is valid. We observed no signal from the solid particles.

(15) Verberckmoes, A. A.; Weckhuysen, B. M.; Schoonheydt, R. A. In Progress in Zeolite and Microporous Materials/Studies in Surface Science and Catalysis; Chon, H., Ihm, S.-K., Uh, Y. S., Eds.; Elsevier Science: New York, 1997, 105, 623-630.
(16) Gargallo, R.; Sanchez, F. C.; Izquierdo-Ridorsa, A.; Massart, D. L. Anal. Chem. 1996, 68, 2241-2247.
(17) Esteves da Silva, J. C. G.; Machado, A. A. S. C.; Oliveira Laquipai, C. J. S. Environ. Toxicol. Chem. 1998, 19, 1268-1273.
(18) Smith, D. S.; Kramer, J. R. Environ. Int. 1999, 25, 295-306.
(19) Vives, M.; Gargallo, F.; Tauler, R. Anal. Chem. 1999, 71, 4328-4337.
(20) Shaw, L. A.; Harrington, P. B. Spectroscopy 2000, 15, 41-45.
(21) Rauch, P. J.; Harrington, P. B.; Davis, D. M. Chemom. Intell. Lab. Syst. 1997, 39, 175-185.
(22) Gargallo, R.; Tauler, R.; Cuesta-Sanchez, F.; Massart, D. L. TRAC 1996, 15, 279-286.

Raman Microspectrometry. All data were collected on a LabRAM confocal scanning spectrometer manufactured by Jobin Yvon S. A. (231, Rue de Lille, 59650 Villeneuve d’Ascq, France) that was equipped with a Peltier-cooled charge-coupled device (1152 × 298 pixels). The Raman scattering was excited using a 632.8-nm excitation wavelength supplied by an internal, air-cooled, helium-neon laser through an Olympus high-stability BX 40 microscope coupled confocally. Laser power delivered at the sample was 15 mW and could be monitored via a filter wheel with optical densities of 0.3, 0.6, and 1. The backscattered radiation was collected using the same microscope. An Olympus 100× ultralong working distance (ULWD) objective with a numerical aperture of 0.80 was used. The spot size of the laser focused by the 100× ULWD objective at the sample was estimated to be