Anal. Chem. 2000, 72, 1690-1698
Calibration Transfer Algorithm for Automated Qualitative Analysis by Passive Fourier Transform Infrared Spectrometry
Frederick W. Koehler IV† and Gary W. Small*
Center for Intelligent Chemical Instrumentation, Department of Chemistry and Biochemistry, Clippinger Laboratories, Ohio University, Athens, Ohio 45701-2979
Roger J. Combs, Robert B. Knapp, and Robert T. Kroutil
U.S. Army, SBCCOM, Edgewood Chemical Biological Center, Aberdeen Proving Ground, Maryland 21010-5424
The automated qualitative analysis of passive Fourier transform infrared (FT-IR) remote sensing data is made difficult by the presence in the data of background and instrument-specific variation. For data collected with a single instrument, variation in the data arises from changes in the infrared background radiance, changes in the atmospheric composition within the field-of-view of the spectrometer, and changes in the instrument response function arising from temperature variation in the spectrometer. When more than one spectrometer is used, the variation in detector responses and phase signatures between instruments serves to complicate further the task of implementing an automated processing algorithm for detecting the signature of a target compound. In this work, a combination of signal processing and pattern recognition methodology is applied directly to the interferogram data collected by the FT-IR spectrometer to implement an automated compound detection procedure that is independent of background and instrument-specific variation. The key to this algorithm is the use of highly attenuating digital filters to isolate in the interferogram the frequencies associated with an analyte absorption or emission band while suppressing information at other frequencies. For the test compounds, acetone and sulfur hexafluoride, it is demonstrated that when this digital filtering procedure is coupled with either piecewise linear discriminant analysis or a back-propagation neural network, an automated detection algorithm can be developed with data from a primary instrument and then subsequently used to predict the presence of analyte signatures in data collected with a secondary spectrometer. Correct classification rates in excess of 92% are obtained for both compounds when the algorithm is applied to data collected with the secondary instrument. 
† Present address: Sandia National Laboratories, MS 0342, Org. 1812, Albuquerque, NM 87185-1407.
1690 Analytical Chemistry, Vol. 72, No. 7, April 1, 2000
The use of Fourier transform infrared (FT-IR) remote sensing spectrometry for the detection of airborne pollutants is gaining
in importance.1 This trend can be attributed to several advantages inherent to the technique. The sensitivity and specificity of FT-IR spectrometry allow the determination of analytes in the presence of other potentially interfering species, which is critical in applications such as industrial stack monitoring. In addition, remote sensing measurements are capable of monitoring a large line-of-sight volume of atmosphere without direct sample collection. When the passive FT-IR configuration is used in which naturally occurring infrared background radiation serves as the light source, no internal source or external retroreflector is required. This allows great flexibility in performing open-air experiments. Both ground-based and airborne data collection are possible. Two significant problems have hindered the widespread application of the technique. Traditionally in FT-IR remote sensing, a reference background spectrum is collected and used to remove the background emission profile present in analyte spectra.1 Simple changes in the environment such as wind or temperature often prohibit stable, reproducible reference spectra from being measured. Analyte absorbance spectra computed with mismatched background spectra often contain widely varying baselines and can be difficult to analyze. In addition, a second important background problem is the presence in the data of large instrument-specific signatures which make the automated analysis of data from different instruments difficult. Differences between instruments, or even changes in a single instrument over time, jeopardize the application of the mathematical models created to perform automated qualitative or quantitative analyses. Since the methodology requires a large investment of time and effort spent initially in the collection and processing of training (calibration) data, it is desirable to eliminate the necessity of performing a full-scale training step for each spectrometer used. 
(1) Hammaker, R. M.; Fateley, W. G.; Chaffin, C. T.; Marshall, T. L.; Tucker, M. D.; Makepeace, V. D.; Poholarz, J. M. Appl. Spectrosc. 1993, 47, 1471-1475.
(2) de Noord, O. E. Chemom. Intell. Lab. Syst. 1994, 25, 85-97.
(3) Bouveresse, E.; Massart, D. L. Vib. Spectrosc. 1996, 11, 3-15.
10.1021/ac9907888 CCC: $19.00 © 2000 American Chemical Society. Published on Web 03/04/2000
A significant amount of research has been performed to allow data analysis methods to be used with multiple spectrometers.2-7
The approaches investigated are termed calibration transfer methods and can generally be subdivided into two categories. One strategy attempts to build robust analysis methods that are insensitive to the differences between spectrometers.4,5 The other approach is based on adapting an existing method for use with a different instrument through the use of a limited amount of data collected with that instrument.6,7 None of these methods have been evaluated in the context of infrared remote sensing measurements, however. In previous work in our laboratories, methodology has been developed to overcome the effects of variation in the infrared background radiance that is inherent in field measurements through the use of signal processing and pattern recognition techniques applied directly to the raw interferogram data obtained from the FT-IR spectrometer.8-14 This methodology avoids the need for a separate background measurement or the calculation of absorbance spectra. Digital filtering steps are used to help isolate the analyte signal from signals arising from the background or interfering species, and pattern recognition techniques are utilized to discriminate and characterize signals which contain the signature of a target analyte. In this paper, in an effort to overcome the instrument-specific variation that differentiates data collected with multiple spectrometers, the automated compound detection methodology developed in these prior studies is made more robust with additional signal processing steps and with more strongly attenuating digital filtering techniques. The effectiveness of these digital filtering and signal processing steps is evaluated using both piecewise linear discriminant analysis (PLDA) and back-propagation neural network (BNN) pattern recognition techniques. 
It will be shown that, in conjunction with either pattern recognition method, instrument-specific background problems can be eliminated, allowing a successful transfer of qualitative calibration information between spectrometers for two test compounds, acetone and sulfur hexafluoride (SF6).
(4) Blank, T. B.; Sum, S. T.; Brown, S. D.; Monfre, S. L. Anal. Chem. 1996, 68, 2987-2995.
(5) Swierenga, H.; de Groot, P. J.; de Weijer, A. P.; Derksen, M. W. J.; Buydens, L. M. C. Chemom. Intell. Lab. Syst. 1998, 41, 237-248.
(6) Bouveresse, E.; Massart, D. L.; Dardenne, P. Anal. Chim. Acta 1994, 297, 405-416.
(7) Swierenga, H.; Haanstra, W. G.; de Weijer, A. P.; Buydens, L. M. C. Appl. Spectrosc. 1998, 52, 7-16.
(8) Small, G. W.; Kroutil, R. T.; Ditillo, J. T.; Loerop, W. R. Anal. Chem. 1988, 60, 264-269.
(9) Small, G. W.; Carpenter, S. E.; Kaltenbach, T. F.; Kroutil, R. T. Anal. Chim. Acta 1991, 246, 85-102.
(10) Carpenter, S. E.; Small, G. W. Anal. Chim. Acta 1991, 249, 305-321.
(11) Barber, A. S.; Small, G. W. Chemom. Intell. Lab. Syst. 1992, 15, 203-217.
(12) Kroutil, R. T.; Combs, R. J.; Knapp, R. B.; Small, G. W. Appl. Spectrosc. 1994, 48, 724-732.
(13) Shaffer, R. E.; Small, G. W.; Combs, R. J.; Knapp, R. B.; Kroutil, R. T. Chemom. Intell. Lab. Syst. 1995, 29, 89-108.
(14) Koehler, F. W., IV; Small, G. W. In Fourier Transform Spectroscopy: 11th International Conference; de Haseth, J. A., Ed.; American Institute of Physics: Woodbury, NY, 1998; pp 231-234.
EXPERIMENTAL SECTION
Data Collection. Calibration transfer issues in passive FT-IR spectrometry were explored by collecting laboratory acetone and SF6 interferograms with a pair of similarly configured Midac Outfielder FT-IR emission spectrometers, labeled units 120 and 145 (Midac Corp., Irvine, CA). These spectrometers employed
liquid-nitrogen-cooled Hg:Cd:Te detectors for use in the 800-1400 cm-1 spectral range. The spectrometers were interfaced to a Dell system 486P/50 IBM PC compatible computer (Dell Computer, Austin, TX) operating under MS-DOS (Version 6.2, Microsoft, Redmond, WA). Data acquisition was performed with the MIDCOL software package.15 Interferograms were sampled at every eighth zero crossing of the HeNe reference laser, corresponding to a maximum spectral frequency of 1974.75 cm-1. A nominal spectral point spacing of 4 cm-1 was obtained through the collection of 1024 interferogram points per scan. A 4 × 4 in. extended blackbody (model SR-80, CI Systems, Agoura, CA) provided a NIST traceable infrared source whose temperature was varied over 5 to 50 °C. The source temperature was accurate to ±0.03 °C and precise to ±0.01 °C. This procedure simulated conditions found in open-air remote sensing experiments in which the radiance of the infrared background varies and in which the radiance of the analyte gas is sometimes greater and sometimes less than that of the background. This variation in the background radiance caused the spectral bands of the analytes to occur as absorptions when the background was hotter than the analyte gas and as emission signals when the background was colder than the analyte. Spectrophotometric grade acetone (Aldrich Chemical Co., Madison, WI, 99.5+% purity) and SF6 (Matheson Gas Products, Parsippany, NJ, 99.996% purity) were used as analytes. A sample cell with a path length of 8.2 cm and windows composed of low density polyethylene (0.0005 in. thickness) was used. The temperature of the gas cell was uncontrolled but was typically 25 ± 1 °C. A thermistor was employed to monitor the cell temperature. For both acetone and SF6 experiments, data collection for units 120 and 145 was performed alternately by moving the cell and blackbody in front of each instrument in turn.
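The sampling figures quoted above (a maximum spectral frequency of 1974.75 cm-1 and a nominal 4 cm-1 point spacing) can be checked with a short calculation. This is only a sketch: the HeNe laser wavenumber of 15798.0 cm-1 and the function names are assumptions introduced for illustration and are not given in the text.

```python
# Sampling arithmetic for the interferograms (illustrative sketch).
# Assumption: the HeNe reference laser wavenumber is taken as 15798.0 cm^-1;
# this value is not stated in the text.
HENE_WAVENUMBER_CM1 = 15798.0
ZERO_CROSSING_STEP = 8   # sample at every eighth zero crossing
N_POINTS = 1024          # interferogram points per scan


def max_wavenumber(laser_wavenumber, step):
    """Nyquist limit (cm^-1) when sampling at every `step`-th zero crossing."""
    # Adjacent zero crossings are half a laser wavelength apart in retardation.
    sample_interval_cm = step / (2.0 * laser_wavenumber)
    return 1.0 / (2.0 * sample_interval_cm)


def point_spacing(nu_max, n_points):
    """Spectral point spacing (cm^-1) for a real FFT of `n_points` samples."""
    return nu_max / (n_points / 2)


nu_max = max_wavenumber(HENE_WAVENUMBER_CM1, ZERO_CROSSING_STEP)
spacing = point_spacing(nu_max, N_POINTS)
print(f"maximum spectral frequency: {nu_max:.2f} cm^-1")
print(f"spectral point spacing:     {spacing:.2f} cm^-1")
```

Under the assumed laser wavenumber, the calculation reproduces the 1974.75 cm-1 limit and gives a point spacing of about 3.86 cm-1, consistent with the nominal 4 cm-1 value.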
For the acetone data set, interferograms were collected with blackbody temperatures from 5 to 50 °C with steps at approximately 5 °C intervals for approximate dilution factors with distilled/deionized water of 0 (pure acetone), 1/2, 1/4, 1/8, 1/16, 1/32, and 1/64. The corresponding estimated path-averaged concentrations were 21866, 13399.6, 9321.4, 5647.5, 3130.8, 1651.5, and 852.6 ppm m, respectively. These concentrations were computed by use of the Wilson equation and acetone liquid mole fraction to establish the acetone vapor pressures.16 Between 20 and 200 replicate interferograms were collected of the analyte at each combination of temperature and concentration. A similar number of interferograms were also collected of the blackbody without the cell in place and of the blank cell before it was filled. All interferograms were sequential single scans (i.e., no signal averaging was performed). The variation in the number of replicate interferograms obtained at each temperature/concentration setting was not part of the experimental design but rather an artifact of the control structure of the MIDCOL program. Once data collection is initiated, interferograms are scanned and saved until the data acquisition is manually terminated.
(15) Kroutil, R. T.; Housky, M.; Small, G. W. Spectroscopy (Eugene, Oreg.) 1994, 9, 41.
(16) Field, P. E.; Combs, R. J.; Knapp, R. B. Appl. Spectrosc. 1996, 50, 1307-1313.
The following data collection protocol was used. For a given concentration level, data at all blackbody temperatures were collected before moving to the next concentration. Data collection began in each case with the blackbody at 50 °C. For a given concentration/temperature combination, the blackbody temperature was allowed to equilibrate to the desired level, and interferograms were collected of the blackbody without the cell in place. Interferograms were then collected of the blank cell, the cell was filled, and interferograms of the analyte were collected. The blackbody temperature was then lowered to the next level, the cell was evacuated, and the procedure was repeated. For each temperature/concentration combination and each type of interferogram (i.e., analyte, blank cell, or blackbody), data collection with both instruments was performed alternately before changing to the next type of interferogram or next concentration/temperature. For the SF6 data set, the same temperature range and data collection procedure were used with similar 5 °C steps. The injected analyte volumes were 0.02, 0.05, 0.1, 0.2, 0.3, 0.5, and 1.0 cm3. Using the ideal gas law, the corresponding path-averaged concentrations for these samples were calculated to be 0.96, 2.41, 4.82, 9.64, 14.46, 24.10, and 48.21 ppm m, respectively. Between 20 and 150 sequential single-scan analyte interferograms were acquired at each combination of temperature and concentration. As with the acetone data, a similar number of interferograms were collected of the blank cell before it was filled and of the blackbody without the cell in place.
Data Subset Selection. For the pattern recognition work, two classes of interferograms were needed, analyte-active and analyte-inactive. Separate subsets of data were also needed for developing or training the pattern recognition algorithm and for testing the algorithm once it was developed. These data subsets are termed training and prediction sets, respectively.
The analyte-inactive class consisted of the blank-cell and blackbody interferograms. To obtain accurate classification statistics in the pattern recognition studies, it was also important to establish that detectable analyte signals were indeed present in the analyte-active interferograms. For this purpose, the collected analyte-active interferograms were Fourier processed, and the resulting single-beam spectra were ratioed to corresponding background spectra collected when no analyte was present. After conversion to absorbance units, the spectra were visually inspected to ensure that the analyte signal was at least three times the noise level. This corresponds to the conventional definition of the limit of detection. Those interferograms whose corresponding spectra did not meet this criterion were removed from further use. Since the degree of variation between sequential interferograms in the data was minimal in this laboratory data set, a random distribution of the total set of interferograms between training and prediction sets would have placed nearly replicate data across the training and prediction sets, thus preventing the assembly of an independent prediction set. Instead of a random or completely representative distribution of data, after visual inspection, alternating temperature/concentration combinations were placed into the training and prediction sets, with the extremes in temperature and concentration being placed in the training set. For example, for the concentration level of 48.21 ppm m SF6, interferograms corresponding to blackbody temperatures of 50, 40, 30, ... °C were placed in the training set, while
Table 1. Partition of Acetone and SF6 Data Sets

                          acetone                               SF6
                Midac unit 120   Midac unit 145a     Midac unit 120   Midac unit 145a
PLDA Data Sets
  training      3170 (8782)b                         2640 (3940)
  prediction    2190 (8202)      2292 (7753)         2320 (3773)      2134 (3776)
BNN Data Sets
  training      3170 (5000)                          2640 (3940)
  monitoring    1110 (3909)      1090 (3678)         1060 (1810)      1025 (1815)
  prediction    1080 (4293)      1201 (4075)         1260 (1963)      1109 (1961)

a No data from Midac unit 145 was used in training for either analyte.
b Values in parentheses represent the number of analyte-inactive (i.e., blank-cell and blackbody) interferograms.
interferograms corresponding to blackbody temperatures of 45, 35, 25, ... °C were placed in the prediction set. All replicate interferograms corresponding to a given temperature/concentration combination were placed together into either the training or prediction set. This included the blank-cell and blackbody interferograms collected in conjunction with the analyte interferograms at each temperature/concentration setting. The prediction set thus contained temperature/concentration combinations that were not present in the training set. This led to the training and prediction sets for each analyte and instrument as listed in Table 1 for the PLDA study. The table entries consist of the numbers of interferograms in the various data subsets. The acetone and SF6 interferograms, as well as their companion blank-cell and blackbody interferograms, were kept separate throughout the studies, as were the interferograms from units 120 and 145. In the BNN study, the data were subdivided in similar fashion into three sets, termed training, monitoring, and prediction sets. The monitoring set was a separate prediction set used to help determine the optimal values for the large number of experimental parameters required for the BNN modeling technique. This allowed a prediction test to be incorporated into the development of the method while retaining a separate prediction set for use in testing the final algorithm. These subsets are also listed in Table 1. As with the PLDA study, separate data sets were assembled for data for each compound and from each instrument. For the BNN study, the acetone training set was reduced in size because of a software limit on the total number of training patterns. Data Analysis. For data analysis, the collected data sets were transferred to a dual 180 MHz Pentium Pro (Intel Corp., Santa Clara, CA) personal computer operating under the Linux operating system, version 2.0.14. 
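The alternating assignment of temperature levels described above can be sketched for a single concentration level as follows. The function name, the three hypothetical replicates per level, and the explicit rule that forces the lowest temperature into the training set are assumptions for illustration; the text states only that alternating combinations were used and that the extremes were placed in the training set.

```python
def split_levels(levels):
    """Alternate ordered levels between training and prediction sets.

    Both extremes (first and last level) are forced into the training set.
    """
    training, prediction = [], []
    for index, level in enumerate(levels):
        is_extreme = index in (0, len(levels) - 1)
        if is_extreme or index % 2 == 0:
            training.append(level)
        else:
            prediction.append(level)
    return training, prediction


# Blackbody temperatures from 50 down to 5 deg C in 5 deg C steps.
blackbody_temps = list(range(50, 0, -5))
training_temps, prediction_temps = split_levels(blackbody_temps)

# Hypothetical replicate records: (temperature, replicate index).
# All replicates of a level follow that level into the same subset.
replicates = [(t, r) for t in blackbody_temps for r in range(3)]
training_set = [rec for rec in replicates if rec[0] in training_temps]
prediction_set = [rec for rec in replicates if rec[0] in prediction_temps]

print("training temperatures:  ", training_temps)
print("prediction temperatures:", prediction_temps)
```

Keeping every replicate of a level in one subset is what prevents near-duplicate interferograms from appearing on both sides of the split.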
The digital filtering and pattern recognition were performed on this system with original software written in FORTRAN 77 and C. Additional processing in the design and creation of finite impulse response (FIR) digital filters was performed with the aid of Matlab, version 4.2c (The MathWorks, Natick, MA).
Figure 1. Laboratory acetone (A) and SF6 (B) spectra collected with Midac unit 120 and a blackbody temperature of 50 °C.
RESULTS AND DISCUSSION
Data Analysis Overview. Infrared signals measured through passive FT-IR remote sensing experiments consist of analyte, background, and instrument-specific features superimposed.17 The lack of a stable background complicates the use of conventional data analysis methods such as the calculation of absorbance or
difference spectra for an automated determination since it is sometimes difficult to remove background and instrumental features reliably through ratioing or subtraction. The purpose of the signal processing and pattern recognition steps outlined here is the extraction of analyte information and the suppression of interfering signals, thereby allowing an automated determination to be performed without the use of background measurements. Acetone and SF6 are two chemical species frequently used as test compounds in pollution monitoring studies, and both have been well characterized in our previous work.8-14 The spectral features of interest for these compounds are the 1216 cm-1 C-CO-C stretching band of acetone (49 cm-1 full width at half-maximum (fwhm)) and the 945 cm-1 S-F stretching band of SF6 (10 cm-1 fwhm). Figure 1 demonstrates the types of signals obtained through the calculation of absorbance spectra for interferograms collected from the laboratory acetone and SF6 data sets. The absorbance values are approximate since no correction was performed for self-emission of the analyte vapor. The vertical lines in each spectrum denote the bands in each compound that were targeted for the analysis. The bands in Figure 1 arise as absorptions since the background radiance (blackbody temperature of 50 °C) was greater than that of the ambient temperature analyte. However, as noted previously, for the combinations of cell and blackbody temperatures investigated in this study, both absorption and emission signals were present in the data sets. Fine rotation features were not observed in spectra calculated from these data because of the approximately 4 cm-1 spectral point spacing. To avoid the use of background measurements, signal processing and pattern recognition analysis are applied directly to the interferogram data. Interferograms are the raw data collected by the FT-IR spectrometer. They are a time-domain representation of the spectral information.
(17) Kroutil, R. T.; Ditillo, J. T.; Small, G. W. In Computer-Enhanced Analytical Spectroscopy; Meuzelaar, H., Ed.; Plenum: New York, 1984; Chapter 4.
While not generally suited for visual inspection, to an automated digital signal processing and pattern
recognition technique, the interferogram contains equivalent information. Direct interferogram analysis provides advantages by decomposing spectral features of different widths into different regions of the interferogram. This can be attributed to the fact that the interferogram representation of a narrow spectral feature damps more slowly than the corresponding representation of a wide background feature. By optimal choice of the interferogram segment to use for analysis, a significant amount of background interference can be removed. Once an optimal segment is isolated from the interferogram, digital filtering is used to enhance the analyte signal further. Time-domain digital filtering involves the estimation of the convolution of the interferogram with the time-domain representation of the filter frequency response function.18 Digital filtering provides a means of extracting frequency information corresponding to the analyte from problematic background frequencies while allowing the methodology to employ only a short section of the interferogram. Restriction of the data processing to a short interferogram segment allows an increase in scan speed and may make possible the use of alternative interferometer designs that are compatible with a short optical path difference in the interferometer.19 Two types of digital filtering were used in this study, a time-varying finite impulse response matrix (FIRM) filter developed previously in our laboratory18 and a standard finite impulse response (FIR) filter. FIRM filters sacrifice attenuation but offer high computational efficiency by having fewer coefficients. During filter generation, coefficients deemed statistically insignificant in the estimation of the convolution sum can be discarded. The time-varying aspect of the FIRM filter is realized by having a separate set of filter coefficients for use in filtering each interferogram point.
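The statement above that a narrow spectral feature damps more slowly in the interferogram than a wide one can be illustrated numerically. The sketch below assumes idealized Gaussian band shapes, a hypothetical 300 cm-1 wide background feature centered at 1000 cm-1, and an arbitrary segment starting 100 points from the centerburst; none of these choices come from the text, which supplies only the band centers, the band widths, the 1974.75 cm-1 frequency limit, the 1024-point scan, and the 120-point segment length.

```python
import numpy as np

NU_MAX = 1974.75   # cm^-1, maximum spectral frequency quoted in the text
N_POINTS = 1024    # interferogram points per scan


def band_interferogram(center, fwhm):
    """Interferogram of an idealized Gaussian band (assumed line shape)."""
    nu = np.linspace(0.0, NU_MAX, N_POINTS // 2 + 1)
    sigma = fwhm / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    spectrum = np.exp(-0.5 * ((nu - center) / sigma) ** 2)
    # Inverse real FFT; index 0 of the result is the centerburst.
    return np.fft.irfft(spectrum, n=N_POINTS)


def segment_ratio(igram, start, length=120):
    """Largest magnitude in a segment relative to the centerburst."""
    segment = igram[start:start + length]
    return np.max(np.abs(segment)) / np.abs(igram[0])


START = 100  # segment start relative to the centerburst (assumed)
ratio_sf6 = segment_ratio(band_interferogram(945.0, 10.0), START)
ratio_acetone = segment_ratio(band_interferogram(1216.0, 49.0), START)
ratio_background = segment_ratio(band_interferogram(1000.0, 300.0), START)

print(f"SF6 band (10 cm^-1 fwhm):      {ratio_sf6:.3g}")
print(f"acetone band (49 cm^-1 fwhm):  {ratio_acetone:.3g}")
print(f"background (300 cm^-1 fwhm):   {ratio_background:.3g}")
```

In this idealization the narrow SF6 band retains most of its amplitude in the chosen segment, while the broad feature has essentially decayed away, which is the basis for selecting segments displaced from the centerburst.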
The composite frequency response of an FIRM filter thus depends on the specific interferogram segment to which the filter is applied. In the work reported here, FIRM frequency response characteristics will be stated specific to a given interferogram segment, or average characteristics will be given with respect to all interferogram segments investigated. In the second type of digital filtering, standard FIR filters were calculated through the Remez exchange algorithm.20,21 These filters provide exceptional out-of-band attenuation; however, they contain nearly 1 order of magnitude more coefficients than the FIRM filters. Figure 2 shows frequency response plots for a representative acetone FIRM filter, as well as several acetone FIR filters of varying stopband attenuation utilized in this study. Because of the larger number of coefficients, FIR filtering allows a closer approximation of the desired band-pass width as well as increased stopband attenuation to be attained. The FIRM filter achieves approximately 25 decibels (dB) of attenuation with an average of 22 filter coefficients, whereas the FIR filters used in this study employ 200 coefficients and provide up to 69 dB of attenuation. The FIR filters also produce a narrower band-pass.
(18) Small, G. W.; Harms, A. C.; Kroutil, R. T.; Ditillo, J. T.; Loerop, W. R. Anal. Chem. 1990, 62, 1768-1777.
(19) Hariharan, P. Basics of Interferometry; Academic Press: San Diego, CA, 1992; Chapter 3.
(20) McClellan, J. H.; Parks, T. W. IEEE Trans. Circuit Theory 1973, CT-20, 697-701.
(21) McClellan, J. H.; Parks, T. W.; Rabiner, L. R. IEEE Trans. Audio Electroacoust. 1973, AU-21, 506-525.
For comparison purposes, the dB scale is logarithmic, with each 20 dB of
Table 2. FIRM Filter Parameters

variable                                SF6                        acetone
filter band-pass width (fwhm, cm-1)a    100, 112, 143, 175, 191    94, 106, 143, 166, 204, 218
interferogram segment locationb         75, 100, 125, 150, 175     50, 75, 100, 125, 150

a Represents average filter band-pass width (fwhm) when applied to the specific interferogram segments used.
b Relative to centerburst. All segments used were 120 points in length.
attenuation corresponding to 1 order of magnitude reduction in the signal intensity. When research utilizing the interferogram analysis methodology began, computational limits for real-time application prohibited the use of FIR filtering, and FIRM filtering was developed with computational efficiency as a strong priority. However, with the deployment of more powerful computing processors, this limitation has been relaxed, thus allowing the standard FIR filters to be investigated. After digital filtering has been performed on the interferogram data, a pattern recognition step is required in the analysis to determine the presence or absence of analyte signal in the filtered data. Because of its high performance and simplicity in configuration, the pattern recognition technique first utilized for this methodology was PLDA. PLDA attempts to optimize the location of linear separating surfaces, termed discriminants, which approximate a nonlinear boundary in the multidimensional data space between analyte-active and analyte-inactive categories.22,23 In the current work, a BNN24 was also used to implement automated classifiers for SF6 and acetone. PLDA Data Analysis. Previous work utilizing PLDA as the pattern recognition technique has demonstrated the most efficient means of optimizing the experimental parameters of FIRM filter band-pass center and width, interferogram segment starting position and length, and the parameters associated with the PLDA algorithm itself.13 Using this protocol, and a subset of the overall
experimental design used previously, FIRM filters were created for use in detecting SF6. Acetone FIRM filters were also created but with segment locations and filter band-pass centers optimized for its 1216 cm-1 peak. These filters were utilized to examine training, prediction, and calibration transfer issues for acetone and SF6. Table 2 summarizes the FIRM filter parameters used. All interferogram segments employed were 120 points in length. Previous work has established that as long as the segments used are 120 points in length or greater, the filter band-pass parameters and segment starting position can be optimized independently of the segment length.13 Absolute values of the filtered interferograms were used as inputs to the pattern recognition algorithms to make the data space more robust for calibration transfer by forcing absorption and emission patterns to fall on one side of the analyte-inactive category, instead of surrounding it in the multidimensional data space. This approach has been used previously.9 In addition, Forman phase correction was used to remove phase errors in the interferograms.25,26 Since each instrument imparts a characteristic phase signature to the interferogram, phase correction was used to help remove this source of variation and thereby facilitate transfer of the pattern recognition algorithm between spectrometers. Since the phase spectrum of an instrument contains mostly broad features, phase errors have the largest effect in regions of the interferogram close to the centerburst. Given that the pattern recognition algorithms investigated here were based on the use of short interferogram segments, the effect of phase errors on the ability to transfer the algorithms between spectrometers was expected to be variable, depending in large part on which interferogram segment was used. Use of phase correction throughout the work was judged to be the simplest way to remove this potential complication. 
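The effect of taking absolute values of the filtered interferograms can be shown with a toy example. The sketch assumes an idealized case in which an emission signal is the exact sign-inverted copy of the corresponding absorption signal; the numerical values and the function name are invented for illustration only.

```python
def rectify(filtered_segment):
    """Return the absolute values of a filtered interferogram segment."""
    return [abs(value) for value in filtered_segment]


# Hypothetical filtered segment for an analyte seen in emission.
emission = [0.8, -0.5, 0.3, -0.1]
# Idealized assumption: the absorption case is the sign-inverted copy.
absorption = [-value for value in emission]
# Hypothetical analyte-inactive segment with only small residual values.
inactive = [0.02, -0.01, 0.01, -0.02]

print("emission  ->", rectify(emission))
print("absorption ->", rectify(absorption))
print("inactive  ->", rectify(inactive))
```

Before rectification the two analyte-active patterns lie on opposite sides of the near-zero inactive pattern; afterward they coincide, so a single boundary can separate them from the analyte-inactive class.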
In all cases throughout the work, Midac unit 120 was used as the primary instrument, meaning that its interferograms were used during FIRM filter generation, as well as during pattern recognition training. Midac unit 145 was used as a secondary instrument to test calibration transfer, i.e., the ability of the discriminants to perform properly when applied to data acquired with an “unknown” instrument. No interferograms from unit 145 were included during training in any part of this study. Results for FIRM filtering experiments from data collected with unit 120 and then utilized for both training and prediction were between 88.45 and 99.93% correct classification for both SF6 and
(22) Kaltenbach, T. F.; Small, G. W. Anal. Chem. 1991, 63, 936-944.
(23) Shaffer, R. E.; Small, G. W. Chemom. Intell. Lab. Syst. 1996, 32, 95-109.
(24) Zupan, J.; Gasteiger, J. Neural Networks for Chemists: An Introduction; VCH Publishers: New York, 1993; Chapter 8.
(25) Forman, M. L.; Steel, W. H.; Vanasse, G. A. J. Opt. Soc. Am. 1966, 56, 59-63.
(26) Griffiths, P. R.; de Haseth, J. A. Fourier Transform Infrared Spectroscopy; Wiley: New York, 1986; pp 93-119.
Figure 2. Frequency responses of acetone filters. (A-C) FIR filter frequency responses with decreasing attenuation in the stopbands and a constant band-pass width of 85 cm-1 (fwhm). Each filter has 200 coefficients. (D) FIRM filter with fwhm of approximately 94 cm-1.
Figure 3. FIRM filter PLDA prediction results for data from the secondary instrument for (A) SF6 and (B) acetone vs interferogram segment starting point from the centerburst. Midac unit 120 was the primary instrument used in predicting the data set from unit 145. Key: (A) squares, circles, inverted triangles, triangles, and hourglass-shaped symbols represent SF6 FIRM filters with average band-pass widths (fwhm) of 100, 112, 143, 175, and 191 cm-1, respectively. (B) squares, circles, inverted triangles, triangles, and hourglass-shaped symbols indicate acetone FIRM filters with average band-pass widths (fwhm) of 106, 143, 166, 204, and 218 cm-1, respectively.
acetone. These results demonstrate that FIRM filtering performs well for prediction for both analytes when the data arise from a single instrument. However, when the same discriminants were applied to data from the secondary instrument (unit 145), the prediction results decreased as seen in Figure 3A,B, particularly for SF6. At approximately 50 cm-1 fwhm, the acetone spectral feature at 1216 cm-1 is approximately five times wider than the band for SF6. The typical FIRM band-pass more closely approximates the wider acetone peak but allows a large amount of background information to pass for the narrow SF6 peak. The prediction results for the data from the secondary instrument appear to improve as the acetone FIRM filter band-pass widths increase; however, no clear trend is evident for the optimal segment location. The individual functions that comprise the piecewise linear discriminant are positioned by use of the simplex optimization technique.22 The objective function that drives the optimization is designed to position the discriminant boundary as close as possible to the analyte-inactive class, without allowing any analyte-inactive patterns to be misclassified as analyte-active for a given
discriminant.23 This allows a low limit of detection but also means that the linear discriminants themselves may closely conform to the instrument response function of the spectrometer with which the training data were collected. This characteristic instrument signature is a dominant part of the background of single-beam spectra, and its shape is not removed through subtraction or ratioing to a background during the interferogram-based analysis. The interferogram windowing and filtering steps are essential in this methodology to eliminate the background signature from the data. In the case of FIRM filtering, the prediction results with the data from the secondary instrument suggest that the overall FIRM filter stopband attenuation does not remove sufficient background variation to keep the linear discriminants free of instrument-specific information in the course of training and prediction. To explore the potential benefits of additional attenuation in the removal of these instrument-specific signatures, standard FIR filters were utilized, since increasing the number of filter coefficients allows the filter attenuation to be increased and the filter band-pass width to be narrowed. Although an extensive experimental study of FIR filter parameters, similar to that done with the FIRM filters, has yet to be performed, four different 200-coefficient FIR filters were generated for each analyte. In each case, the band-pass location was centered on the analyte band of interest. The band-pass width was constant for the filters for a given analyte. The four filters exhibited varying attenuation in the stopbands. This was accomplished by varying the weights used with the Remez exchange algorithm to adjust the fit in each frequency band. For acetone, the band-pass width was fixed at approximately 85 cm-1, and the four filters had stopband attenuations of 28, 47, 64, and 69 dB.
The SF6 FIR filters had a band-pass width of 72 cm-1 and stopband attenuation values of 25, 29, 40, and 59 dB. For FIR filters with a fixed number of coefficients, stopband attenuation drops as the width of the specified filter band-pass decreases. This is the reason for the poorer attenuation characteristics of the SF6 filters. The FIR filters were applied to the same interferogram segment positions used in the FIRM study. Frequency responses for the four filters for acetone are shown in Figure 2, with those of SF6 being similar except for the band-pass center being located at 945 cm-1. For the FIR filters, prediction classification results for the case in which both training and prediction data were collected with the same instrument varied between 89.99 and 99.98% for both analytes and were similar to the results obtained with the FIRM filters. However, as seen in Figure 4A,B, classification results for both compounds with data from the secondary instrument were markedly improved for the higher attenuating filters that used segments far from the centerburst. The acetone and SF6 predictions were observed to improve with increasing attenuation in the stopbands, with the best results being observed for attenuations above 60 dB and segments located past point 125 (relative to the centerburst). These results suggest that while FIRM filters provide sufficient performance for training and prediction with data from a single instrument, FIR filters with high degrees of stopband attenuation have the added benefit of allowing a successful transfer of qualitative calibration information across data spanning two spectrometers for both acetone and SF6.
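As an illustration of the filter design step described above, the following Python sketch generates 200-coefficient band-pass filters with the Remez exchange algorithm as implemented in SciPy (`scipy.signal.remez`) and raises the stopband attenuation by increasing the weight assigned to the stopbands. It is not the software used in this work. The sampling frequency `FS`, the transition width, the weight values, and the helper names `design_bandpass` and `stopband_attenuation_db` are assumptions made for the example, and the pass-band edges are specified directly rather than as an fwhm.

```python
import numpy as np
from scipy.signal import remez, freqz

# Assumed spectral sampling frequency in cm^-1 (Nyquist = FS / 2).
# This value is illustrative and is not taken from the text above.
FS = 3949.5


def design_bandpass(center, width, transition, stop_weight,
                    numtaps=200, fs=FS):
    """Equiripple band-pass FIR filter designed with the Remez exchange algorithm.

    center, width, transition are in cm^-1. stop_weight is the weight of the
    two stopbands relative to a pass-band weight of 1; a larger value trades
    pass-band ripple for deeper stopband attenuation.
    """
    lo = center - width / 2.0
    hi = center + width / 2.0
    bands = [0.0, lo - transition, lo, hi, hi + transition, fs / 2.0]
    desired = [0.0, 1.0, 0.0]            # stop, pass, stop
    weight = [stop_weight, 1.0, stop_weight]
    return remez(numtaps, bands, desired, weight=weight, maxiter=100, fs=fs)


def stopband_attenuation_db(taps, center, width, transition, fs=FS):
    """Smallest attenuation (dB) found anywhere in the two stopbands."""
    freqs, response = freqz(taps, worN=8192, fs=fs)
    lo = center - width / 2.0
    hi = center + width / 2.0
    in_stop = (freqs <= lo - transition) | (freqs >= hi + transition)
    return -20.0 * np.log10(np.max(np.abs(response[in_stop])))


if __name__ == "__main__":
    # Acetone-like pass band centered at 1216 cm^-1 with an 85 cm^-1 width.
    for w in (1.0, 10.0, 30.0):
        taps = design_bandpass(1216.0, 85.0, 60.0, stop_weight=w)
        att = stopband_attenuation_db(taps, 1216.0, 85.0, 60.0)
        print(f"stopband weight {w:5.1f}: attenuation {att:.1f} dB")
```

With the number of coefficients and the band edges held fixed, the stopband weight is the only quantity varied, which mirrors the way the four filters for each analyte are described as differing only in attenuation.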
Figure 4. FIR filtering PLDA prediction results for data from the secondary instrument for (A) SF6 and (B) acetone vs interferogram segment starting point from the centerburst. Key: (A) squares, circles, inverted triangles, and hourglass-shaped symbols represent FIR filter stopband attenuation values of 25, 29, 40, and 59 dB, respectively; (B) squares, circles, inverted triangles, and triangles indicate FIR filter stopband attenuation values of 28, 47, 64, and 69 dB, respectively.
BNN Data Analysis. BNNs are a nonlinear modeling technique often used in qualitative and quantitative analysis.24 Potential advantages to the use of the BNN in this application include true nonlinear modeling ability, the ability to adjust the size and degree of nonlinearity of the model by adding or removing neurons in the hidden layer of the network, and the fact that there is no requirement for discrete boundaries between analyte-active and -inactive categories. Challenges to the application of BNNs include the potentially large amount of computational time required in training, the large number of experimental parameters which may require optimization, and the fact that the model created is not easily interpreted. Thus, there is very little diagnostic information available besides the quality of the prediction itself. The back-propagation algorithm implemented for this research was fixed at a three-layer design consisting of an input layer, one hidden layer, and an output layer. The number of neurons in the input layer was equal to the number of points in the input patterns or filtered interferogram segments (120 points). Two output neurons were provided, one dedicated to analyte-active status and varying between 0 and 1 and the other for analyte-inactive classification, also varying between 0 and 1. Classifications of patterns in the prediction or monitoring sets were performed by
assigning patterns to the class (i.e., analyte-active or analyte-inactive) whose output neuron produced the largest value. A sigmoidal transfer function was used between each layer of the network. Because the random initial weights cause variation in the trained BNN, three replicates of each network were typically calculated, except in several initial studies. The weights were updated by random presentation of the training patterns. The learning rate and momentum parameters, used in updating the weights, were held fixed during training, but the effects of these values were explored as part of the experimental design. Because of the large number of experimental parameters involved in the BNN study, two prediction sets for each instrument were used. The monitoring set was used to find optimal settings for the various experimental parameters. The final prediction set was then used as an independent validation set to give a final prediction result. These data sets are summarized in Table 1. Although poor classification results were obtained for prediction data from the secondary instrument in the PLDA study utilizing the FIRM filters, a small study was undertaken with BNNs and three different FIRM filter/interferogram segment combinations for both SF6 and acetone. An initial set of parameters for the BNN was chosen on the basis of previous work in our laboratory.27 Learning rates of 0.1, 0.3, and 0.5, momentum values of 0.1, 0.3, and 0.5, and an architecture of 10 neurons in the hidden layer were studied with the three FIRM filter/interferogram segment combinations taken from the previous PLDA study. Prediction results with the data in the monitoring sets for each instrument were obtained every 250 training epochs up to 2000. Monitoring set prediction results after 500 epochs remained unchanged with the addition of up to 1500 more epochs, and so results after 500 epochs were reported in all BNN experiments.
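The network described above can be sketched in a few lines of Python with NumPy. The sketch follows the stated design (three layers, sigmoidal transfer functions, two output neurons, random presentation of the training patterns, fixed learning rate and momentum, and assignment to the class with the largest output), but it is not the BNN software used in this work. The class name `ThreeLayerNet`, the bias terms, the initial weight range, and the squared-error gradient are assumptions made for the example.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class ThreeLayerNet:
    """Input layer, one hidden layer, and an output layer with sigmoid units."""

    def __init__(self, n_in, n_hidden, n_out=2, seed=0):
        self.rng = np.random.default_rng(seed)
        # Random initial weights; the extra row in each matrix is a bias term.
        self.w1 = self.rng.uniform(-0.5, 0.5, (n_in + 1, n_hidden))
        self.w2 = self.rng.uniform(-0.5, 0.5, (n_hidden + 1, n_out))
        self.dw1 = np.zeros_like(self.w1)   # previous updates, for momentum
        self.dw2 = np.zeros_like(self.w2)

    def forward(self, x):
        hidden = sigmoid(np.append(x, 1.0) @ self.w1)
        output = sigmoid(np.append(hidden, 1.0) @ self.w2)
        return hidden, output

    def train(self, X, T, epochs=500, lr=0.1, momentum=0.1):
        """Pattern-by-pattern back-propagation with fixed lr and momentum."""
        for _ in range(epochs):
            # Present the training patterns in a new random order each epoch.
            for i in self.rng.permutation(len(X)):
                hidden, output = self.forward(X[i])
                # Error terms for a squared-error criterion and sigmoid units.
                delta_out = (T[i] - output) * output * (1.0 - output)
                delta_hid = (self.w2[:-1] @ delta_out) * hidden * (1.0 - hidden)
                self.dw2 = (lr * np.outer(np.append(hidden, 1.0), delta_out)
                            + momentum * self.dw2)
                self.dw1 = (lr * np.outer(np.append(X[i], 1.0), delta_hid)
                            + momentum * self.dw1)
                self.w2 += self.dw2
                self.w1 += self.dw1

    def predict(self, X):
        """Index of the output neuron with the largest value for each pattern."""
        return np.array([int(np.argmax(self.forward(x)[1])) for x in X])
```

In this sketch, output neuron 0 plays the role of the analyte-active neuron and output neuron 1 the analyte-inactive neuron, so the targets are the rows `[1, 0]` and `[0, 1]`. Replicate networks correspond to different values of `seed`.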
Excellent classification results for the primary instrument were obtained in all cases in this initial FIRM study, with Midac unit 120 training data used in the creation of the BNN model and unit 120 data used in the monitoring set. The average classification results for the monitoring set of the primary instrument were 99.16 and 98.27% for SF6 and acetone, respectively. However, when the trained networks were applied to the monitoring set from unit 145, the percentages of correctly classified patterns never exceeded 85.46% for either analyte. The average classification result was 52.55 and 30.32% for SF6 and acetone, respectively. In addition, a study with the same parameters for learning rate and momentum was performed with 20 hidden units for SF6 and acetone with similar poor prediction results for the data from the secondary instrument. Further work with five hidden units for SF6 and acetone yielded similarly poor results. These results support the conclusion drawn from the PLDA study that the FIRM filters have insufficient stopband attenuation to allow a successful calibration transfer to be performed. The four FIR filters with increasing stopband attenuation employed in the PLDA study were used to filter interferogram segments from point 125 to 244 (relative to the centerburst) for SF6 and acetone, and BNN pattern recognition was applied. This segment of the interferogram was chosen as a good candidate on the basis of earlier results with PLDA where the combination of
(27) Hammer, C. L.; Small, G. W.; Combs, R. J.; Knapp, R. B.; Kroutil, R. T. Anal. Chem. 2000, 72, 1680-1689.
Figure 5. (A) SF6 and (B) acetone prediction results vs FIR filter stopband attenuation (dB). The classification results correspond to the monitoring set from the secondary instrument. The learning rate used was 0.1, the momentum was 0.1, and 10 hidden units were used. Interferogram segment 125/244 (relative to the centerburst) was used in all cases. Error bars represent the upper 95% confidence limits based on the results of the three replicate networks. The hatched bar in each plot denotes the filter used in all subsequent studies.
FIR filters and segments far from the centerburst produced good calibration transfer results. An initial set of BNN parameters of 0.1 for learning rate, 0.1 for momentum, and 10 hidden units was used. Results for classifying the monitoring set from the primary instrument varied between 99.75 ± 0.26% and 100 ± 0.00% for SF6 and between 96.89 ± 0.66% and 98.99 ± 0.99% for acetone. The indicated precision estimates represent the 95% confidence intervals computed from the classification results of the three replicate networks. These results are similar to those observed previously in the PLDA study when the corresponding filters and segments were used. The degree of attenuation in the stopbands of the filter clearly does not have much effect on the classification results for the primary instrument. Figure 5A,B show prediction results for the monitoring set from the secondary instrument for SF6 and acetone, respectively. The results for the secondary instrument clearly indicate the importance of filter attenuation in the stopband for removing the instrument-specific background signals and thereby allowing a successful calibration transfer. Further experiments with BNNs utilized the highest attenuating filters in each case. Experiments were performed next to investigate the effects of learning rate and momentum values on the BNN classification results. These parameters were found to have little effect on the prediction results obtained with the monitoring set of the primary
Figure 6. (A) SF6 and (B) acetone prediction results using the monitoring set from the secondary instrument to determine the optimal learning rate and momentum. The FIR filter with the highest attenuation was used in each case, and each network was constructed with 10 neurons in the hidden layer. In both plots, circles, squares, and triangles represent learning rate values of 0.1, 0.2, and 0.3, respectively.
instrument. Figure 6 illustrates prediction results for the monitoring set of the secondary instrument with learning rate and momentum values between 0.01 and 0.5 for both SF6 and acetone. For SF6 and acetone, with only one exception, the learning rate changed the prediction results by less than 1%. Increasing the momentum from 0.01 to 0.5 gave slightly improved results for SF6, while, for acetone, the larger degree of variation in the prediction results makes the effect of momentum more difficult to interpret. In general, since the analyte band of acetone is broader than that of SF6, analyte signals are located closer to the background signatures and make calibration transfer more difficult for this analyte, even with highly attenuating FIR filters. Experiments were also performed in which the number of neurons in the hidden layer was varied from 1 to 40. The learning rate and momentum values were fixed at 0.1 for these experiments, and the highest attenuating FIR filters for SF6 and acetone were used. Almost no impact was observed on the prediction results for the monitoring sets of the primary and secondary instruments. This suggests the degree of nonlinearity provided by the pattern recognition algorithm is not a significant issue in
obtaining good prediction results for data from the secondary instrument. Final prediction results for SF6 and acetone were obtained by selecting one network for each compound and applying it to the prediction sets of the primary and secondary instruments. The optimal network parameters were found based on the best classification result with the monitoring set from the secondary instrument, since almost all combinations performed well in predicting data collected with the same instrument as the training data. These results may be slightly optimistic since results from the secondary instrument were used to select the optimal network. For acetone, the best parameters yielded classification percentages of 95.70 ± 1.50% for the primary instrument and 92.96 ± 3.04% for the secondary instrument. The precision estimates again reflect the 95% confidence intervals computed from the three replicate networks. A learning rate of 0.5, a momentum of 0.5, and a network architecture of 10 hidden units were used. While these results are the highest overall prediction scores for acetone, the relatively high momentum may have meant that several of the replicates became trapped in local optima during training, causing the relatively large confidence interval. For SF6, the highest prediction results obtained for the independent prediction set from the secondary instrument were 92.32 ± 1.23%. The corresponding results for the primary instrument were 93.71 ± 0.02% correctly classified. A learning rate of 0.1, a momentum of 0.5, and 10 hidden units were used. The classification results described above for the application of the networks to the acetone and SF6 prediction sets of the secondary instrument are very similar to the corresponding PLDA prediction results in Figure 4 for the same interferogram segment (starting point 125).
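The precision estimates quoted above are 95% confidence intervals from three replicate networks. A minimal sketch of one common way to compute such an interval is given here, assuming a two-sided Student's t interval with two degrees of freedom; the text does not state the exact formula used, and the function name `ci95_three_replicates` and the input values in the usage line are hypothetical.

```python
import math
import statistics

# Two-sided 95% critical value of Student's t for 2 degrees of freedom.
T_CRIT_95_2DOF = 4.303


def ci95_three_replicates(percent_correct):
    """Return (mean, half-width) of a 95% interval for three replicate results."""
    if len(percent_correct) != 3:
        raise ValueError("expected results from exactly three replicate networks")
    mean = statistics.mean(percent_correct)
    std = statistics.stdev(percent_correct)          # sample standard deviation
    half_width = T_CRIT_95_2DOF * std / math.sqrt(3)
    return mean, half_width


if __name__ == "__main__":
    # Hypothetical replicate results, not values from this study.
    mean, half = ci95_three_replicates([92.0, 93.0, 94.0])
    print(f"{mean:.2f} +/- {half:.2f}%")
```

Because the critical value for two degrees of freedom is large, a modest spread among three replicates already produces an interval of several percent.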
This suggests that the two pattern recognition methods have similar performance characteristics with respect to their ability to implement a calibration transfer with interferogram data. In addition, the similarity in performance of the two methods confirms that the use of the monitoring set of the secondary instrument in selecting the optimal network did not bias the final prediction results significantly.
CONCLUSION
The results presented in this paper demonstrate that an automated compound classification procedure can be developed
with data from one instrument and then applied to data collected with a second instrument. This work represents the first demonstration that a direct interferogram analysis can be transferred from a primary to a secondary instrument without prior knowledge of the response characteristics of the secondary instrument. This research also confirms that calibration transfer can be performed in the context of passive FT-IR spectrometry. The key to this calibration transfer capability is the suppression of background and instrument-specific variation through the combination of windowing the interferogram to reject spectral features on the basis of their widths and the use of FIR filters approaching 60 dB or higher in stopband attenuation to suppress spectral information on the basis of frequency. The resulting filtered interferogram segments can be used successfully with either the PLDA or BNN pattern recognition methods to implement an automated compound detection procedure that is largely independent of instrumental characteristics. This analysis method requires the collection of only a short segment of the FT-IR interferogram and can thus be used to increase the scan speed of the spectrometer. This capability can be particularly important in airborne applications of FT-IR remote sensing since faster scan speeds translate into observations with improved spatial resolution. The next step in extending this research is to add complexity to the infrared background through the use of field remote sensing data in which increased variation is present in the background radiance and in which the chemical composition of the background is more complex. A key issue in this investigation will be whether the algorithm developed in the current work will be able to overcome the effects of differences between the primary and secondary spectrometers in the presence of this increased background variation. These studies are currently underway in our laboratories. 
ACKNOWLEDGMENT
Funding for this work was provided by the Department of the Army. Cheryl Hammer is acknowledged for writing the initial version of the BNN software used in this work.
Received for review July 19, 1999. Accepted January 10, 2000.
AC9907888