
Anal. Chem. 1999, 71, 751-761

Application of Multilayer Feed-Forward Neural Networks to Automated Compound Identification in Low-Resolution Open-Path FT-IR Spectrometry

Husheng Yang and Peter R. Griffiths*

Department of Chemistry, University of Idaho, Moscow, Idaho 83844-2343

A drawback of current open-path Fourier transform infrared (OP/FT-IR) systems is that they need a human expert to determine those compounds that may be quantified from a given spectrum. In this work, multilayer feed-forward neural networks with one hidden layer were used to automatically recognize compounds in an OP/FT-IR spectrum without compensation of absorption lines due to atmospheric H2O and CO2. The networks were trained by fast back-propagation. The training set comprised spectra that were synthesized by digitally adding randomly scaled reference spectra to actual open-path background spectra measured over a variety of path lengths and temperatures. The reference spectra of 109 compounds were used to synthesize the training spectra. Each neural network was trained to recognize only one compound in the presence of up to 10 other interferences in an OP/FT-IR spectrum. Every compound in a database of vapor-phase reference spectra can be encoded in an independent neural network so that a neural network library can be established. When these networks are used for the identification of compounds, the process is analogous to spectral library searching. The effect of learning rate and band intensities on the convergence of network training was examined. The networks were successfully used to recognize five alcohols and two chlorinated compounds in field-measured controlled-release OP/FT-IR spectra of mixtures of these compounds.

Atmospheric monitoring plays an important role in air pollution control, understanding atmospheric chemistry, and modeling the transportation of pollutants. With the increasing concern for atmospheric environmental quality, better monitoring techniques are required.
An ideal detection technique should be able to detect numerous compounds with one instrument, have high sensitivity and selectivity, and provide real-time, automated, in situ measurements.1 Open-path Fourier transform infrared (OP/FT-IR) spectrometry can meet most of these requirements and is considered a competitive technique for environmental atmospheric monitoring.2,3

(1) Sigrist, M. W. Air Monitoring by Spectroscopic Techniques; John Wiley & Sons: New York, 1994.
(2) Marshall, T. L.; Chaffin, C. T.; Hammaker, R. M.; Fateley, W. G. Environ. Sci. Technol. 1994, 28, 224A-232A.

10.1021/ac980955o CCC: $18.00 Published on Web 01/06/1999

© 1999 American Chemical Society

One of the major drawbacks of contemporary OP/FT-IR spectrometry is the way in which the data are processed. Currently an expert spectroscopist is required to set up the instruments and analyze the measured spectra, and this has significantly limited the widespread application of this technique.4 The most popular data-processing techniques that are used to identify and quantify atmospheric species by OP/FT-IR spectrometry are spectral subtraction, classical least-squares (CLS) regression, and partial least-squares (PLS) regression.5 Each of these techniques has some drawbacks and usually requires an experienced user to implement it optimally.

When spectral subtraction is used, the reference spectrum of a compound being analyzed is multiplied by a scaling factor and then subtracted from the field-measured spectrum. The concentration of each analyte can be estimated from the magnitude of the scaling factor required to eliminate it from the spectrum. In practice, however, the presence of a given analyte can be difficult for an inexperienced user to recognize, and the scaling factor can be difficult to determine even for an expert.6 Although some automated methods of determining the scaling factor have been developed,7-10 these methods fail when the spectrum becomes complicated. As a result, it is more common to apply multivariate methods (especially CLS) to obtain quantitative information on each analyte from an OP/FT-IR spectrum.

In current practice, as recommended in the EPA’s FT-IR Open-Path Monitoring Guidance Document,5 the intense lines in the rotation-vibration spectrum of atmospheric H2O and CO2 should be compensated before the CLS algorithm is implemented. Although several ways of compensating for atmospheric species have been described,5 none is easily implemented in practice.

(3) McClenny, W. A. Program Objectives and Status For the U. S. EPA Program on FTIR-Based Open-Path Monitoring. In Proceedings of Optical Remote Sensing for Environmental and Process Monitoring, Dallas, TX, 1996.
(4) Newman, A. R. Anal. Chem. 1997, 69, 43A-48A.
(5) Russwurm, G. M.; Childers, J. W. FT-IR Open-Path Monitoring Guidance Document, 2nd ed.; ManTech Environmental Technology, Inc.: Research Triangle Park, NC, 1995.
(6) Griffiths, P. R.; de Haseth, J. A. Fourier Transform Infrared Spectrometry; John Wiley & Sons: New York, 1986.
(7) Banerjee, S.; Li, D. Appl. Spectrosc. 1991, 45, 1047-1049.
(8) Friese, M. A.; Banerjee, S. Appl. Spectrosc. 1992, 46, 246-248.
(9) Hanst, P. L.; Hanst, S. T.; Williams, G. M. Mouse-Controlled Air Analysis Using Grams-386. In Proceedings of Optical Remote Sensing for Environmental and Process Monitoring, San Francisco, CA, 1995.
(10) Pescatore, D. E.; Perry, S. H.; DuBois, A. E.; Kricks, R. J. Coanalysis and Spectral Subtraction: Developments in the Optimization and Automation of Open-Path FTIR Spectroscopy Analysis Software Based on Recent Field Projects. In Proceedings of Optical Sensing for Environmental and Process Monitoring, McLean, VA, 1994.

Analytical Chemistry, Vol. 71, No. 3, February 1, 1999 751

When CLS is used, a spectral region is first selected and a calibration model is established. All compounds that give rise to absorption bands in this region can affect the prediction result, so it is very important that all compounds giving rise to absorption bands in the chosen window are included in the calibration set.5 Although PLS models do not require all the possible compounds that have absorption bands in the selected frequency region to be known and modeled, these compounds must also be included in the calibration sets as latent factors. If a new factor appears in the measured spectrum, the prediction accuracy will be degraded, although not to the same extent as for CLS.

There has been considerable controversy over the question of whether OP/FT-IR spectra should be measured at high or low resolution. The rotational fine structure of bands in the spectra of small molecules is better resolved when the spectrum is measured at high resolution, increasing the specificity of the measurement. However, the noise level and peak absorbance of interfering lines due to atmospheric H2O and CO2 both decrease significantly when the resolution is reduced. Furthermore, the rotational lines in the spectra of most volatile organic compounds (VOCs) are separated by less than their collision-broadened half-width, so these lines can never be resolved and only the band contour, which is usually at least 10 cm-1 in width, is measured. Thus the band intensity is not affected significantly until the resolution is made coarser than ∼8 cm-1. For monitoring many hazardous air pollutants, therefore, there are real advantages from a spectroscopic standpoint to measuring OP/FT-IR spectra at low resolution. In this paper we address the problem of the automated recognition of representative VOCs from low-resolution OP/FT-IR spectra without compensating for the absorption of atmospheric H2O and CO2.
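The requirement that a CLS model include every absorbing compound in the chosen window can be illustrated with a small numerical sketch. The code below is not the authors' software; the six-point "spectra", the function names, and the use of normal equations solved by Gauss-Jordan elimination are all assumptions chosen to keep the example self-contained.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def cls_fit(mixture, references):
    """Least-squares coefficients c that minimize
    ||mixture - sum_k c[k] * references[k]||, obtained from the
    normal equations by Gauss-Jordan elimination with pivoting.
    Intended only for a handful of well-conditioned references."""
    n = len(references)
    # Augmented matrix [R^T R | R^T m]
    a = [[dot(ri, rj) for rj in references] + [dot(ri, mixture)]
         for ri in references]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for row in range(n):
            if row != col:
                factor = a[row][col] / a[col][col]
                a[row] = [x - factor * y for x, y in zip(a[row], a[col])]
    return [a[i][n] / a[i][i] for i in range(n)]


# Hypothetical, overlapping reference spectra (absorbance per unit amount).
ref_a = [0.0, 0.2, 1.0, 0.2, 0.0, 0.0]
ref_b = [0.0, 0.0, 0.5, 1.0, 0.3, 0.0]
mixture = [0.3 * x + 0.1 * y for x, y in zip(ref_a, ref_b)]

full_model = cls_fit(mixture, [ref_a, ref_b])   # both compounds modeled
partial_model = cls_fit(mixture, [ref_a])       # compound B left out
```

With both references in the model the fit recovers the true coefficients (0.3 and 0.1). When compound B is omitted, its overlapping band is attributed to compound A and the estimate for A is biased high, which is why the list of compounds has to be settled before calibration.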
The full mid-infrared spectrum, from 4000 to 400 cm-1, is rarely used in OP/FT-IR spectrometry, since only those atmospheric windows from 3000-2400, 2250-2000, and 1300-700 cm-1 are usually accessible. To date, it has not proved possible to use a single CLS model for a universal calibration using the information in the full atmospheric windows, considering the total number of VOCs that may exist in the atmosphere.11 This is often not a problem because the number of compounds present above the detection limit in a specific spectral window at any given time is usually small. Nonetheless, it is critical to determine which compounds should be included in a given CLS or PLS model prior to setting up the calibration. Before building the model, a human expert has always been required to determine which compounds should be included. This determination has been made in a number of ways, such as analyzing the nearby emission sources and/or visual inspection of the spectrum. An automated method that can replace the human expert in deciding which compounds are to be included in the model is vital if OP/FT-IR is to gain more widespread acceptance for atmospheric monitoring.

Because of the rapid development of computer technology, a large number of spectra can now be conveniently stored on personal computers and quickly retrieved for subsequent processing. This capability has provided a basis for automated compound identification. Since each compound has a unique IR spectrum, it should be possible in principle to identify all compounds in an OP/FT-IR spectrum above the detection limit by spectral library searching. However, library searching always works best for the spectra of pure compounds, and there is no simple way to decompose an OP/FT-IR spectrum into the spectra of its pure components, especially in light of the fact that the spectrum is dominated by the ubiquitous water lines. In addition, field-measured OP/FT-IR spectra have higher noise levels and poorer baselines than spectra measured in the laboratory.

(11) Graedel, T. E.; Hawkins, D. T.; Claxton, L. D. Atmospheric Chemical Compounds: Sources, Occurrence, and Bioassay; Academic Press: Orlando, FL, 1986.

In recent years, artificial neural networks (ANNs) have proved to be a very useful tool for pattern recognition and classification problems12 and have been used to investigate problems that cannot be easily solved by traditional methods in several fields.13 A major advantage of ANNs is that they have the capability of discovering patterns that are so obscure as to be imperceptible to either human researchers or standard statistical methods.13 Another advantage of ANNs is that once a network is well trained, it can retain excellent performance even if degraded, noisy, or missing data are applied.14 Several applications of artificial neural networks in chemistry and spectroscopy have been reported since 1986.15,16 Most of the applications to infrared spectroscopy have concentrated on obtaining structural information from IR spectra.17-22 ANNs have proved to be useful for IR spectral database searching and baseline correction23-26 and have also been used for joint interpretation of infrared, mass, and 13C NMR spectra.27,28 Among the many types of neural network, multilayer feed-forward networks trained with back-propagation have been used most widely.15,29 These networks are particularly powerful for pattern classification and function approximation.12 For example, it has been shown that a two-layer feed-forward network can uniformly approximate any continuous function to an arbitrary degree of exactness provided that the hidden layer contains a sufficient number of neurons.30

In the work reported in this paper, we have applied multilayer feed-forward neural networks for the development of an automated method for the recognition of compounds in low-resolution OP/FT-IR spectra. The compounds predicted to be present may then be used as a guide for spectral subtraction or multivariate models for quantitative analysis. The results should be applicable to high-resolution spectra, although it should be understood that the effects of noise and of interference by atmospheric water vapor become greater as the resolution is increased and more computer memory and/or increased processing time may be required.

(12) Hagan, M. T.; Demuth, H. B.; Beale, M. Neural Network Design; PWS Publishing Co.: Boston, 1996.
(13) Masters, T. Practical Neural Network Recipes in C++; Academic Press: San Diego, 1993.
(14) Elling, J. W.; Lahiri, S.; Luck, J. P.; Roberts, R. S.; Hruska, S. I.; Adair, K. L.; Levis, A. P.; Timpany, R. G.; Robinson, J. J. Anal. Chem. 1997, 69, 409A-415A.
(15) Zupan, J.; Gasteiger, J. Anal. Chim. Acta 1991, 248, 1-30.
(16) Zupan, J.; Gasteiger, J. Neural Networks for Chemists: An Introduction; VCH: Weinheim, Germany, 1993.
(17) Meyer, M.; Meyer, K.; Hobert, H. Anal. Chim. Acta 1993, 282, 407-415.
(18) Novic, M.; Zupan, J. J. Chem. Inf. Comput. Sci. 1995, 35, 454-466.
(19) Gasteiger, J.; Li, X.; Simon, V.; Novic, M.; Zupan, J. J. Mol. Struct. 1993, 292, 141-159.
(20) Schulz, H.; Derrick, M.; Stulik, D. Anal. Chim. Acta 1995, 316, 145-159.
(21) Ricard, D.; Cachet, C.; Cabrol-Bass, D. J. Chem. Inf. Comput. Sci. 1993, 33, 202-210.
(22) Weigel, U.-M.; Herges, R. J. Chem. Inf. Comput. Sci. 1992, 32, 723-731.
(23) Tanabe, K.; Tamura, T.; Uesaka, H. Appl. Spectrosc. 1992, 46, 807-810.
(24) Klawun, C.; Wilkins, C. L. Anal. Chem. 1995, 67, 374-378.
(25) Wabuyele, B. W.; Harrington, P. de B. Anal. Chem. 1994, 66, 2047-2051.
(26) Wabuyele, B. W.; Harrington, P. de B. Appl. Spectrosc. 1996, 50, 35-42.
(27) Munk, M. E.; Madison, M. S. J. Chem. Inf. Comput. Sci. 1996, 36, 231-238.
(28) Klawun, C.; Wilkins, C. L. J. Chem. Inf. Comput. Sci. 1996, 36, 249-257.
(29) Burns, J. A.; Whitesides, G. M. Chem. Rev. 1993, 93, 2583-2601.
(30) Morris, A. J.; Montague, G. A.; Willis, M. J. Chem. Eng. Res. Des. 1994, 72, 3-19.

DATA

Reference Spectra Library. The database of vapor-phase FT-IR spectra used in this work contained the reference spectra of 109 compounds, 105 of which were from the U.S. Environmental Protection Agency (EPA) library (prepared by Entropy Inc. under EPA Contract 68D90055). The other four reference spectra, of ethanol, 1-propanol, 2-propanol, and 1-butanol, were measured in our laboratory by a procedure described previously.31 The EPA library is a public domain library that is available on the Internet (http://info.arnold.af.mil/epa/refsym.htm). It contains 385 spectra of 105 different compounds, most of which are listed in the U.S. Clean Air Act or its 1990 amendments. Most compounds in the library are represented by four spectra, with replicate spectra at two different concentrations. Each spectrum was measured at a nominal resolution of 0.25 cm-1 and is stored between ∼4400 and 400 cm-1. The spectra were collected after air broadening to 1 atm pressure. For this project, all high-resolution reference spectra were converted to an effective resolution of 8 cm-1 by boxcar averaging the neighboring data points and stored between 4000 and 700 cm-1.

(31) Richardson, R. L.; Griffiths, P. R. Appl. Spectrosc. 1998, 52, 143-153.

Measurement of Open-Path FT-IR Spectra. The open-path background spectra were OP/FT-IR spectra measured in the field in an area of northern Idaho that was believed to be free of industrial pollutants. Single-beam spectra were measured over path lengths between 25 and 375 m; each was ratioed against a “zero” path length (actually ∼2 m) reference spectrum, and the ratio was converted to absorbance. The only features that are recognizable in these spectra are due to atmospheric water and carbon dioxide. The original data were collected as interferograms with a nominal resolution of 1 or 8 cm-1. When used for training the network, the higher resolution spectra were deresolved to 8 cm-1 by truncating the interferograms and applying the Norton-Beer medium apodization function. The instrument used to collect these spectra was a Bomem MB-104 OP/FT-IR spectrometer (Quebec, Canada) operated in the monostatic mode with a cube-corner array retroreflector (Opticon Inc., Billerica, MA).

OP/FT-IR spectra of controlled releases of dichloromethane, chloroform, methanol, ethanol, 1- and 2-propanol, and 1-butanol were measured at 8-cm-1 resolution with the same spectrometer and retroreflector as the open-path background spectra. The liquid analytes were simply poured into a dish and heated on a hotplate located downwind of the path between the spectrometer and the retroreflector. For these measurements, the analytes present in the beam path are known but not their concentration. The path length was fixed at 75 m for all controlled-release measurements. Absorbance spectra were obtained after taking the ratio of these spectra against a 2-m path length single-beam background spectrum.
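The conversion from a single-beam ratio to absorbance mentioned above follows the standard definition A = -log10(I/I0). The sketch below assumes that definition; the function name and the sample intensities are hypothetical and are not taken from the measurements described here.

```python
import math


def absorbance(sample_single_beam, background_single_beam):
    """Point-by-point absorbance A = -log10(I / I0), where I is the
    long-path single-beam intensity and I0 the short-path background."""
    if len(sample_single_beam) != len(background_single_beam):
        raise ValueError("spectra must have the same number of points")
    result = []
    for intensity, reference in zip(sample_single_beam, background_single_beam):
        if intensity <= 0.0 or reference <= 0.0:
            raise ValueError("single-beam intensities must be positive")
        result.append(-math.log10(intensity / reference))
    return result


# Hypothetical intensities: 50% and 10% transmittance.
example = absorbance([50.0, 10.0], [100.0, 100.0])
```

A transmittance of 10% corresponds to an absorbance of 1, so the strongly absorbing H2O and CO2 lines quickly reach values where the photometric accuracy is poor.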

Training and Validation Sets. Synthetic spectra were used for the training sets and part of the validation sets for the feed-forward neural networks. Each synthesized spectrum was obtained by digitally adding the reference spectrum of a single component or a mixture to an open-path background absorbance spectrum that had been randomly selected from 195 spectra. These background spectra had been measured at path lengths between 25 and 375 m, ratioed against a short-path background, and converted to absorbance. The reference spectra of mixtures were obtained by randomly selecting several compounds in the reference library and digitally adding these spectra after scaling by randomly selected coefficients. The scaling factors were selected so that the strongest band of any component did not exceed 0.3 absorbance unit (AU). Furthermore, at no wavenumber did the total absorbance of all the components in a mixture spectrum exceed 0.3 AU. The training sets and validation sets used in this paper are summarized in Table 1. When other training sets or validation sets were developed for some special neural networks, they are described in the relevant discussion.

Data Preprocessing. In an OP/FT-IR spectrum, many of the H2O and CO2 lines have peak absorbance values greater than 2. Because the photometric accuracy in spectral regions of high absorbance is very low, any absorbance value in an OP/FT-IR spectrum that is greater than 2 was set to zero. A simple one-point baseline correction was applied to all open-path background spectra. No other data preprocessing methods, such as subtraction of H2O lines or advanced baseline-correction algorithms, were used.

Calculations. All the neural network computations were conducted with Matlab (The MathWorks, Inc., Natick, MA).
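The synthesis and preprocessing rules given under Training and Validation Sets and Data Preprocessing can be sketched as follows, in Python rather than Matlab purely for illustration. The function name, the toy spectra, and in particular the way the 0.3 AU ceiling on the summed mixture is enforced (a uniform rescale) are assumptions; the text states the constraint but not how it was met.

```python
import random

MAX_AU = 0.3          # ceiling on each component's strongest band and on the sum
SATURATION_AU = 2.0   # absorbance values above this are set to zero


def synthesize(background, references, n_components, rng):
    """Add randomly scaled reference spectra to one background spectrum."""
    chosen = rng.sample(references, n_components)
    mixture = [0.0] * len(background)
    for ref in chosen:
        # Scale so that the strongest band falls between 0 and MAX_AU.
        scale = rng.uniform(0.0, MAX_AU) / max(ref)
        mixture = [m + scale * r for m, r in zip(mixture, ref)]
    # Assumed enforcement of the total-absorbance limit: rescale the sum.
    peak = max(mixture)
    if peak > MAX_AU:
        mixture = [m * MAX_AU / peak for m in mixture]
    spectrum = [b + m for b, m in zip(background, mixture)]
    # Preprocessing step from the text: discard saturated points.
    return [0.0 if a > SATURATION_AU else a for a in spectrum]


# Hypothetical four-point spectra; the second background point is saturated.
background = [0.1, 2.5, 0.05, 0.0]
references = [[0.0, 0.0, 1.0, 0.5], [0.2, 0.0, 0.0, 1.0], [1.0, 0.1, 0.3, 0.0]]
spectrum = synthesize(background, references, 2, random.Random(0))
```

Passing an explicit `random.Random` instance keeps the sketch reproducible; a real training set would draw the background from the full pool of measured spectra for each synthesized example.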
The “trainbpx” function in the Neural Network Toolbox for Matlab32 was used to train all the neural networks in this paper, and the “simuff” function was used to simulate the neural networks in the testing phase. Spectral data manipulations, such as calculation of single-beam spectra, calculation of absorption spectra, and deresolution, were performed using GRAMS/32 (Galactic Industries Corp., Salem, NH). All the calculations were conducted on a Pentium 166-MHz personal computer (PC) with 32-MB memory.

(32) Demuth, H.; Beale, M. Neural Network Toolbox User’s Guide; The MathWorks, Inc.: Natick, MA, 1994.

Table 1. Training Sets and Validation Sets Used To Train and Test the CHCl3, CH2Cl2, Methanol, Ethanol, 1-Propanol, 2-Propanol, and 1-Butanol Neural Networks

Training Sets
type | no. of spectra | description | targets
A | 381 | single-component spectra. Each spectrum was synthesized by adding an open-path background and the spectrum of the compound being identified at different concentrations | between 0 and 1
B | 381 | single-component spectra. Each spectrum was synthesized by adding an open-path background and an interference spectrum selected from the EPA library at different concentrations | between -1 and 0
C | 1000 | 10-component synthesized mixture spectra containing the compound being identified and 9 other interferences | between 0 and 1
D | 1000 | 9-component mixture spectra that did not contain the compound being identified. The 9 components were randomly selected from the EPA library | between -1 and 0
E | 500 | synthesized mixture spectra containing the compound being identified and 1-5 other interferences. The interferences were specially selected as strong interferences of the target compound | between 0 and 1
F | 500 | synthesized 1-5-component mixture spectra that did not contain the compound being identified. These compounds were specially selected as strong interferences of the target compound | between -1 and 0

Validation Sets
type | no. of spectra | description | targets
A | 109 | spectra for all 109 compounds in the reference library. Each compound has only one spectrum and the maximum peak absorbance is 0.3 AU. Open-path background is added to each spectrum | between -1 and 0 or between 0 and 1
B | 100 | synthesized the same way as training set A | between 0 and 1
C | 100 | synthesized the same way as training set C | between 0 and 1
D | 100 | synthesized the same way as training set D | between -1 and 0
E | 116 | field-measured controlled releases of single components: CHCl3 (12), CH2Cl2 (12), methanol (22), ethanol (24), 1-propanol (12), 2-propanol (8), and 1-butanol (26) | between -1 and 0 or between 0 and 1
F | 36 | field-measured controlled releases of mixtures: CHCl3 and CH2Cl2 (12), alcohols (26) | between -1 and 0 or between 0 and 1

NEURAL NETWORK METHODOLOGY

The neural network architecture used in this study was a fully connected, feed-forward system with one hidden layer. A hyperbolic tangent sigmoid transfer function was used for both the hidden layer and output layer. Small random numbers between -0.05 and 0.05 were used as the initial weights. Training of the networks was performed by using the “trainbpx” function, which uses a fast back-propagation algorithm with momentum and an adaptive learning rate. A constant momentum of 0.9 was used for training all the networks. The adaptive learning rate is described further in ref 32.

Neural Network Inputs. Three types of neural network inputs have been compared. The first type used all the absorbance values of a full spectrum from 4000 to 700 cm-1; in this case, the input vector had 856 elements. The second type of input used only the absorbance values in the atmospheric windows in the regions of 3000-2400, 2250-2000, and 1300-700 cm-1; for this case, the input vector had 392 elements. The third type of input used only the absorbance values in spectral regions where the compound of interest has an absorbance greater than 0.5% of the maximum value. For example, CHCl3 has two absorption bands (from 1231 to 1208 cm-1 and from 795 to 745 cm-1) and only the absorbance values in these two spectral regions were used as the network inputs. In this case, the CHCl3 neural network had 21 inputs. For the third type, the numbers of inputs for the CH2Cl2, methanol, ethanol, 1-propanol, 2-propanol, and 1-butanol networks were 28, 139, 152, 154, 204, and 205, respectively. For the reasons described below, the third type of input was used for the data described in this paper unless otherwise specified.

In principle, the full spectrum contains the maximum information that can be obtained, but in an OP/FT-IR spectrum, most of the absorbance values outside the atmospheric windows are close to infinity and are of little use. Furthermore, since a PC has limited speed and memory, the size of the network has to be reasonably small. For these reasons, the full mid-infrared spectrum was the least useful window and no results obtained using the full spectrum are reported in this paper. For networks designed to recognize CHCl3 and ethanol, the second and third types gave similar results but the time required to train the network to converge was substantially different. For CH2Cl2, the atmospheric windows gave better results because the two strongest absorption bands of CH2Cl2 overlap severely with the bands of three potential interferences, epichlorohydrin, o-toluidine, and 1,1,2,2-tetrachloroethane. In this case, the spectrum in the region of the strong bands of CH2Cl2 did not contain enough information to permit the interferences to be distinguished from CH2Cl2. In general, therefore, use of the atmospheric windows (3000-2400, 2250-2000, and 1300-700 cm-1) gave the best prediction accuracy when several analytes contributed to the spectrum. However, for the networks described in this paper, we used the third type of input to keep the neural networks as small as possible.
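The third type of input, in which a network sees only the points where its target compound absorbs appreciably, can be sketched as a simple thresholding step. The function names and the six-point reference spectrum below are hypothetical; only the 0.5% cutoff comes from the text.

```python
def select_input_indices(reference, threshold_fraction=0.005):
    """Indices where the reference absorbance exceeds the given fraction
    (0.5% by default) of the reference spectrum's maximum."""
    cutoff = threshold_fraction * max(reference)
    return [i for i, a in enumerate(reference) if a > cutoff]


def extract_inputs(spectrum, indices):
    """Input vector for one network: the measured absorbance values at
    the points selected from that compound's reference spectrum."""
    return [spectrum[i] for i in indices]


# Hypothetical reference spectrum with its maximum at index 3.
reference = [0.0, 0.001, 0.5, 1.0, 0.004, 0.006]
indices = select_input_indices(reference)

# Hypothetical measured spectrum on the same wavenumber grid.
measured = [0.02, 0.03, 0.11, 0.25, 0.04, 0.05]
inputs = extract_inputs(measured, indices)
```

Because the index list depends only on the reference spectrum, it can be computed once per compound and reused for every field spectrum, which is what keeps each network's input layer small.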

Determination of the Network Architecture. Before a multilayer feed-forward neural network can be trained, the number of layers, the number of neurons in each layer, the number of inputs and outputs, and the transfer functions used for each layer must be optimized. Currently there are no formal rules that can be applied to determine these parameters, and the ANNs described in many published papers have been based on trial and error. The number of inputs depends on the measured data and the available data reduction method, and the number of outputs on the problem being solved. In OP/FT-IR data processing, the final outputs include the identity of compounds and, ideally, their corresponding concentrations. In this paper, the number of outputs was equal to the number of compounds being identified by a single network. In an ideal case, one network should be able to recognize all the compounds of interest, but such a network could not be trained in practice. Instead, each network was trained to recognize the presence of a single analyte and, therefore, had one output neuron. This approach may seem to be less efficient when these networks are used for prediction, but it is not a serious problem when implemented on contemporary PCs. In practice, this architecture can be trained quickly and is very flexible.

When we attempted to correlate the output with the concentration of the compound being analyzed, we found that the networks were difficult to train, especially when the spectrum was synthesized from a mixture of several analytes. Provided that the neural networks could be used to recognize the compounds in an OP/FT-IR spectrum, CLS or PLS gave quantitative data faster and more accurately than the ANNs we investigated.

In determining the optimal number of layers and the optimal number of neurons in each layer, we found that it was best to use the simplest network that adequately represented the training set.
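The single-output architecture, and the way a library of such networks would be applied to one spectrum, can be sketched as below. This is a minimal illustration, not the Matlab "simuff" call used by the authors: the dictionary layout, the compound names, and the hand-set weights are hypothetical, and a positive output is read as "present" following the sign convention used for the training targets.

```python
import math


def forward(inputs, w_hidden, b_hidden, w_out, b_out):
    """Output of a one-hidden-layer network with tanh transfer
    functions in both the hidden layer and the single output neuron."""
    hidden = [
        math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    return math.tanh(sum(w * h for w, h in zip(w_out, hidden)) + b_out)


def identify(spectrum, library):
    """Run every single-analyte network over the spectrum and return
    the names of the compounds whose network output is positive."""
    found = []
    for name, net in library.items():
        x = [spectrum[i] for i in net["indices"]]
        y = forward(x, net["w_hidden"], net["b_hidden"], net["w_out"], net["b_out"])
        if y > 0.0:
            found.append(name)
    return found


# Hypothetical library: weights are set by hand, not trained.
library = {
    "compound_1": {"indices": [1], "w_hidden": [[10.0]], "b_hidden": [-1.0],
                   "w_out": [1.0], "b_out": 0.0},
    "compound_2": {"indices": [2, 3], "w_hidden": [[10.0, 10.0]], "b_hidden": [-1.0],
                   "w_out": [1.0], "b_out": 0.0},
}
detected = identify([0.0, 0.2, 0.0, 0.0], library)
```

Each network carries its own input indices, so adding a compound to the library means training one more small network rather than retraining a single large one.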
Networks with two hidden layers were never trained successfully even after long training times. Thus all the networks described here had only one hidden layer. Generally speaking, the more neurons there are in the hidden layer, the more powerful is the network but the longer it takes to train. CHCl3 is used as an example to show how the number of neurons in the hidden layer was determined. Eight CHCl3 neural networks with different numbers of neurons in the hidden layer (1, 2, 5, 10, 25, 50, 100, 200) were trained with the same training sets and same number of epochs and then tested with the same validation sets. The first four types of training sets (A-D) listed in Table 1 were used to train these networks, with only 200 spectra being used for types C and D instead of 1000 to keep the time needed to train the networks reasonable. The testing sets included a total of 36 field-measured spectra of CHCl3 (12), CH2Cl2 (12), and mixtures of CHCl3 and CH2Cl2 (12). These spectra were part of the validation sets E and F listed in Table 1. The percentage of correctly identified spectra in the testing sets achieved by each network is shown schematically in Figure 1.

Figure 1. Percentage of correctly identified spectra obtained by eight neural networks designed for CHCl3 with different numbers of neurons in the hidden layer. All networks were trained with the same training sets and the same number of epochs and tested with the same validation sets containing 36 field-measured OP/FT-IR spectra.

The neurons in the hidden layer are henceforth referred to as “hidden neurons” since only one hidden layer was used. It can be seen that when the number of hidden neurons exceeded 100, the percentage of correct predictions became low; however, there is no big difference in the percentages of correct predictions for the neural networks for which the number of hidden neurons is less than or equal to 50. Therefore, the optimum number of hidden neurons for CHCl3 could not be readily determined.

Since the number of hidden neurons could not be determined explicitly, a second study was conducted. Four new CHCl3 neural networks with 1, 2, 5, and 10 hidden neurons were trained. Training sets A and B in Table 1 were used to train these four networks, and the validation set included all the spectra in validation set A listed in Table 1.
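To give a sense of how network size grows with the number of inputs and hidden neurons, the short sketch below counts the adjustable weights and biases of a fully connected network with one hidden layer. The function name is hypothetical; the input counts (21, 392, 856) are the ones quoted in the text for the CHCl3 window, the atmospheric windows, and the full spectrum.

```python
def parameter_count(n_inputs, n_hidden, n_outputs=1):
    """Weights plus biases in a fully connected one-hidden-layer network."""
    hidden_layer = (n_inputs + 1) * n_hidden     # +1 for each hidden bias
    output_layer = (n_hidden + 1) * n_outputs    # +1 for each output bias
    return hidden_layer + output_layer


chcl3_small = parameter_count(21, 10)      # compound-specific window
chcl3_wide = parameter_count(21, 200)      # same inputs, 200 hidden neurons
windows = parameter_count(392, 10)         # atmospheric windows
full_spectrum = parameter_count(856, 10)   # full 4000-700 cm-1 spectrum
```

With 10 hidden neurons the CHCl3 network has 231 adjustable parameters, against 3941 for the atmospheric-window input and 8581 for the full spectrum, which is consistent with the preference for small input windows on a PC of limited speed and memory.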
Each network was trained repeatedly with all the spectra in the training sets until the percentage of correctly identified spectra in the testing set could not be increased further. For the two networks with one or two hidden neurons, two spectra in the validation set were incorrectly predicted even after extensive training. For the network with 5 hidden neurons, only one spectrum was misidentified, and no compounds were misidentified for the network with 10 hidden neurons. Apparently a network with 10 hidden neurons was better than neural networks with 1, 2, and 5 hidden neurons for our data set; therefore, we used 10 hidden neurons for all the networks discussed after this point. To keep the networks small, the number of hidden neurons was never increased beyond 10.

Training of the Network. When back-propagation is used to train a feed-forward neural network, a target has to be determined for each training vector. Two methods of setting the targets for the training sets were tested. The first method was to use the concentration of the compound being identified; however, we found it difficult to train networks to this target, especially for the spectra of mixtures. The second approach was to train all the spectra that contained the compound being identified to give a positive output and all the spectra that did not contain the compound of interest to give a negative output. Since the transfer function was a hyperbolic tangent sigmoid, the network was actually trained to give an answer between 0 and +1 when the compound was known to be present and between 0 and -1 when the compound was absent. A spectrum that had a higher concentration of the analyte may give a smaller output than a spectrum that had a lower concentration when interferences are present, but the two outputs should have the same sign.

Before any network was trained, the weights and biases were initialized with random numbers, and these values were used to calculate a network output for a given input vector. If the output was incorrect (as defined above), then the weights and biases were adjusted to new values by training with the input vector. If the output was correct (i.e., between +1.0 and 0 if the analyte was present and between -1.0 and 0 if it was absent), the input vector was eliminated from the training set. At the end of each epoch, the outputs for all of the spectra in the training set were calculated and the procedure was carried out again, randomizing the order of the input spectra. All the vectors in the training set were tested and trained in such a manner.
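The sign-based correctness rule and the epoch loop described above can be sketched independently of the weight-update rule. In the code below the back-propagation step is abstracted as a `train_step` callable; the `ToyModel` class, with a single weight and a crude update, is a hypothetical stand-in used only to exercise the loop and is not the authors' network or the toolbox's algorithm.

```python
import math
import random


def is_correct(output, analyte_present):
    """Positive output means 'present', negative output means 'absent'."""
    return output > 0.0 if analyte_present else output < 0.0


def train_until_classified(examples, predict, train_step, max_epochs, rng):
    """Each epoch: find the vectors still classified with the wrong sign,
    shuffle them, and train only on those. Returns the number of epochs
    run before every vector was correct (or max_epochs)."""
    for epoch in range(max_epochs):
        pending = [ex for ex in examples if not is_correct(predict(ex[0]), ex[1])]
        if not pending:
            return epoch
        rng.shuffle(pending)
        for x, present in pending:
            if not is_correct(predict(x), present):
                train_step(x, present)
    return max_epochs


class ToyModel:
    """Hypothetical one-weight model: output = tanh(weight * x)."""

    def __init__(self, weight):
        self.weight = weight

    def predict(self, x):
        return math.tanh(self.weight * x)

    def train_step(self, x, present):
        target_sign = 1.0 if present else -1.0
        self.weight += 0.3 * target_sign * x   # crude illustrative update


model = ToyModel(weight=-0.5)
examples = [(1.0, True), (-1.0, False)]
epochs = train_until_classified(
    examples, model.predict, model.train_step, max_epochs=100, rng=random.Random(0)
)
```

Skipping vectors that already have the correct sign concentrates each epoch on the spectra the network still gets wrong, and the loop ends either when none remain or when the epoch limit is reached.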
This iterative procedure was terminated either after a predetermined number of epochs or when all the training spectra had been correctly classified. Networks trained by back-propagation are easily overtrained; i.e., they can give perfect results for the training set but very poor predictions for spectra outside it. There is no formal method that can solve this problem. To avoid overtraining, the training sets have to be carefully prepared so that they represent the data well. Many training spectra were synthesized for this study (see Table 1). During the training process, spectra that did and did not contain the analyte of interest were alternately input to the network so that training was effected with equal numbers of positive and negative examples. To judge when the training process could be terminated, validation set A in Table 1 was used. If the accuracy was lower than desired, training was repeated with all the training sets (A-F in Table 1) until convergence was achieved for more than 95% of all the spectra in validation set A. OP/FT-IR spectra are usually taken in conditions that are not conducive to good spectroscopy. Many factors, such as fluctuations in temperature, humidity, and wind speed and direction and the use of an excessively long path length, can affect the quality of the measured spectrum. As a result, OP/FT-IR spectra tend to have poor baselines and a high noise level. Knowledge of the detection limit proved to be very important in the synthesis of the training set. In this work, we found that when the peak absorbance of the strongest band in the spectrum of the analyte was less than or only slightly greater than the noise level, the network learned

some irrelevant features instead of learning to recognize the key features of the compound being identified. Not surprisingly, the prediction accuracy was reduced in this case. On the other hand, if the scaling factors by which the reference spectra were multiplied were set too high, some small features were missed by the network and the specificity was decreased. Thus, to utilize the capability of the network fully, the scaling factor should be set to the lowest possible level consistent with good predictions. In order for the networks to correctly predict which components are present, spectra collected under different conditions have to be included in the training set. Since the concentrations of the compound of interest vary widely in field-measured OP/FT-IR spectra, the maximum absorbance of each analyte was controlled between the detection limit and 0.3. To determine the detection limit for the five alcohol networks trained in this paper, the peak absorbance of the strongest band in the calibration spectra was varied between 0.1 and 0.005 at intervals of 0.005. It was found that when the peak absorbance of the strongest band was less than 0.05, the prediction accuracy began to degrade. The root-mean-square (rms) noise of all the open-path background spectra used to synthesize the training spectra in this project between 2200 and 2100 cm-1 was ∼0.045 AU. Thus the detection limit was approximately equal to the rms noise, i.e., below the peak-to-peak noise level. In all spectra in the training sets containing the compound of interest, the peak absorbance of its strongest band was set to be between 0.05 and 0.3. Another important parameter that can be adjusted during the training process is the learning rate. As indicated in the user’s guide for the Matlab Neural Network Toolbox,32 a learning rate that is too large leads to unstable learning, whereas a learning rate that is too small results in excessively long training times. 
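The constraint on analyte band intensity described above (peak absorbance of the strongest band between 0.05 and 0.3) can be sketched as follows. This is only an illustration under stated assumptions: spectra are plain lists of absorbance values on a common wavenumber grid, the scaling factor is drawn uniformly, and the function name `synthesize_spectrum` is hypothetical.

```python
import random

def synthesize_spectrum(reference, background, rng, low=0.05, high=0.30):
    """Scale a reference spectrum to a random peak absorbance and add a background.

    `low` and `high` are the peak-absorbance limits quoted in the text; the
    uniform draw between them is an assumption made for this sketch.
    """
    if len(reference) != len(background):
        raise ValueError("spectra must share the same wavenumber grid")
    peak = max(reference)
    if peak <= 0:
        raise ValueError("reference spectrum has no positive absorbance")
    target_peak = rng.uniform(low, high)
    scale = target_peak / peak
    mixed = [b + scale * r for r, b in zip(reference, background)]
    return mixed, target_peak
```

Setting `low` near the rms noise of the background spectra (about 0.045 AU here) corresponds to training at the detection limit found for the alcohol networks.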
The toolbox uses an adaptive learning rate for fast learning, but it allows the user to choose the initial learning rate. The correct selection of the initial learning rate was very important for the networks trained in this project. During the training process, a network was first trained with a high initial learning rate; it was then retrained with a smaller initial learning rate. The initial learning rate was decreased progressively until a satisfactory result was obtained. For all the networks reported in this paper, the initial learning rate was decreased from 0.02 to 0.0001 (if the initial learning rate was above 0.02, the convergence of the networks trained in this paper was very poor even after extensive training).

RESULTS AND DISCUSSION

Identifying a Compound in Synthesized Single-Component OP/FT-IR Spectra. Neural networks were trained for OP/FT-IR spectra of chloroform and dichloromethane and five alcohols. The reference spectra of these seven compounds are shown in Figure 2. Chloroform and dichloromethane were selected because, even though the strongest bands in their spectra overlap somewhat, their spectra should not be difficult to distinguish. The five alcohols were selected as “worst cases” not only because they all contain the same functional groups but also because many potential interferences absorb in the regions where the alcohols have strong bands. Field-measured OP/FT-IR spectra of controlled releases of each of the five alcohols are shown in Figure 3. The shaded areas in the figure indicate where atmospheric H2O and CO2 absorb strongly; obviously the data in these

Figure 2. Reference spectra of CHCl3, CH2Cl2, methanol, ethanol, 1-propanol, 2-propanol and 1-butanol. Neural networks were trained to recognize each of these compounds in OP/FT-IR spectra.

Figure 3. Representative OP/FT-IR spectra of controlled releases of the five alcohols (A, methanol; B, ethanol; C, 1-propanol; D, 2-propanol; E, 1-butanol). The shaded areas indicate the absorption regions that are out of the atmospheric windows.

areas contain little useful information. By comparing Figures 2 and 3, it can be readily seen that the OP/FT-IR spectra are much noisier than laboratory spectra and that many absorption bands that are useful for identification are lost in OP/FT-IR spectra. Neural networks to recognize CHCl3 in OP/FT-IR spectra and in laboratory-measured reference spectra were compared. It was found that the training time for the OP/FT-IR spectra was much longer, indicating that it was more difficult to identify a compound in an OP/FT-IR spectrum. To test if a compound can be identified correctly in single-component spectra, validation set B (see Table 1) was used. A total of 100 spectra were synthesized for each of the seven selected compounds. As before, the maximum absorbance of the analyte in the synthesized OP/FT-IR spectra was constrained to be between 0.05 and 0.30. The percentage of false positives was determined using only those spectra that did not contain the analyte. Similarly, the percentage of false negatives was determined using only spectra in which the analyte was present. Each alcohol network was tested by the 500 spectra synthesized for all five alcohols. All the spectra were identified correctly by the methanol, ethanol, 2-propanol, and 1-propanol neural networks. A total of 3.5% of the spectra that did not contain 1-butanol gave a false positive result, and 3% of the spectra in which 1-butanol was known to be present gave a false negative result. In light of the similarity of the spectra of 1-butanol and

ethanol at 8-cm-1 resolution (see Figure 2), this result was considered to be acceptable. A total of 500 alcohol spectra that had maximum absorbance values between 0.005 and 0.05 were also tested. When the maximum absorbance was greater than 0.035, no false positives and fewer than 10% false negatives were found. The chloroform and dichloromethane networks were each tested by the sets of validation spectra (set B in Table 1 for that analyte). The CHCl3 network identified all the spectra correctly when the maximum absorbance value was greater than 0.11, but no spectra were identified correctly when the maximum absorbance was between 0.05 and 0.11. Similarly for CH2Cl2, the network gave 100% accuracy when the maximum absorbance was greater than 0.08, but 0% accuracy when the maximum absorbance was less than 0.08. The main reason that the detection limit for these two compounds was higher than that of the alcohol networks is believed to be because of the low SNR in the spectral region below 800 cm-1, where the strongest absorption band of these molecules is located. Interference from atmospheric water lines and variations in the baselines near the detector cutoff (see Figures 3 and 4) may also play a part. Identifying a Compound in Synthetic Multicomponent OP/FT-IR Spectra. OP/FT-IR spectra often contain absorption features from more than one component (other than H2O and CO2). It is, therefore, important that a network can recognize individual components in the spectrum of a mixture. Synthesized mixture spectra have been used to test the power of neural networks to recognize the presence of individual analytes in the presence of several interferences. Two sets of 100 simulated OP/FT-IR spectra were synthesized for each of the seven analytes (validation sets C and D in Table 1). To test for false negatives, one set contained the reference spectrum of the analyte of interest along with the spectrum of nine other randomly selected compounds. 
To test for false positives, each spectrum in the second set was synthesized from randomly selected reference spectra of nine compounds and did not include the analyte of interest. All of the networks gave fewer than 5% false positives. With respect to false negatives, the predicted accuracy depended on the relative amount of the compound of interest in each spectrum. When the relative amount of the analyte (as estimated by the intensity of the strongest band in the spectral region being examined) was greater than 10% and the peak absorbance of the strongest band in the analyte spectrum was greater than ∼0.05, the percentage of false negatives for all analytes was less than 10%. To test the performance of the networks under conditions of strong interference, 100 spectra of alcohol mixtures were synthesized. Each mixture spectrum was composed of the spectra of three randomly selected alcohols added to randomly selected OP/ FT-IR background spectra. Three of these spectra in the 10-µm atmospheric window are shown in Figure 4. The peak absorbance of the strongest band of each alcohol was constrained to be between 0.05 and 0.30. When these spectra were tested by the five alcohol networks, no false positives were found. The number of false negatives was much higher, however; the percentages of false negatives calculated for the methanol, ethanol, 1-propanol, 2-propanol, and 1-butanol networks were 41%, 5%, 31%, 39%, and 55%, respectively. In light of the similarity of the ethanol and

Figure 4. Three representative synthetic three-component alcohol mixture spectra. A total of 100 spectra were used to test the neural networks. For each spectrum, three alcohols were randomly selected. Each selected alcohol spectrum was randomly scaled between 0.05 and 0.30 AU. (A) methanol (peak, 0.058 AU), ethanol (peak, 0.059 AU), 1-butanol (peak, 0.081 AU); (B) 1-propanol (peak, 0.066 AU), 2-propanol (peak, 0.078 AU), 1-butanol (peak, 0.069 AU); (C) methanol (peak, 0.050 AU), 2-propanol (peak, 0.050 AU), 1-butanol (peak, 0.050 AU). See Table 2 to convert peak absorbance into path-integrated concentration.

1-butanol spectra, it is quite remarkable that the number of false negatives for ethanol was so low. Two circumstances caused the networks to give a false negative prediction: either the amount of the analyte was equal or close to the detection limit in the single-analyte case or the amount of the analyte was small relative to the amounts of the interferences. Special neural networks were trained for the five alcohols in order to find out if higher prediction accuracy could be achieved. The training sets were synthetic OP/FT-IR spectra containing only the five alcohols (single components and mixtures), which were similar to training sets E and F in Table 1, where the interferences were the other alcohols. For each network, the training set contained 1000 spectra. Half of them contained the compound of interest and the other half did not. Each spectrum contained a minimum of one and a maximum of five alcohols. When the 100 synthesized mixture spectra were tested by these five specifically trained neural networks, no false positive or false negative predictions were found. Identifying Compounds Not Included in the Training Sets. In the above tests, although the spectra in the validation set were independent of those in the training set, the compounds that appeared in the validation sets might also appear in the training sets. The major differences between the training sets and validation sets were in the concentrations of each compound, the existence of different interferences, and different baselines. In practice, however, it is quite probable that a network may be tested by a compound that it has not been trained to recognize as an interference. Thus it is important to understand the behavior of neural networks in this situation. A special methanol neural network with 10 neurons in the hidden layer and using the wavenumber regions where methanol has significant absorption (2997-2781, 1366-1204, and 1110-960 cm-1) as the network inputs was trained for this purpose.
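Restricting the network inputs to the methanol absorption regions quoted above can be sketched as a simple window filter. The wavenumber limits come from the text; the function name, the list-based spectrum representation, and the inclusive treatment of the window edges are assumptions made for illustration.

```python
# Methanol absorption regions from the text, as (low, high) limits in cm-1.
METHANOL_WINDOWS = [(2781.0, 2997.0), (1204.0, 1366.0), (960.0, 1110.0)]

def select_inputs(wavenumbers, absorbances, windows=METHANOL_WINDOWS):
    """Keep only the absorbance values whose wavenumber falls inside a window."""
    return [a for wn, a in zip(wavenumbers, absorbances)
            if any(low <= wn <= high for low, high in windows)]
```

Only the retained points would be presented to the network, which reduces the number of input nodes relative to using the full atmospheric windows.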
Among the 109 compounds in the vapor-phase spectra library, 98 have absorption bands that overlap with one or more bands of
methanol. The spectra of 56 compounds that interfere relatively strongly with methanol were selected manually from these 98 spectra. A total of 28 of these 56 spectra were randomly selected and used to synthesize the validation set. The other 81 spectra were used to synthesize the training set for the special methanol neural network. Each spectrum in the training set and validation set was a synthesized single-component spectrum or a binary mixture of methanol and an interference with open-path background added. Although absorption bands of the 28 spectra in the validation set overlapped with absorption bands of methanol and these compounds were not seen by the neural network during the training process, the methanol network still recognized 26 of them correctly as non-methanol spectra. Two alcohol spectra, 2-propanol and 1-butanol, were included in the validation set. Although bands in these two spectra overlapped with bands in the methanol spectrum significantly, the spectra were still correctly identified. When the neural network was tested with 28 binary mixture spectra of methanol and untrained interferences, the network reported the existence of methanol for all the spectra. This example showed that a neural network can correctly identify a spectrum in the presence of an interference that was not included in the training set and can also reject interferences it has not seen before. Field-Measured Controlled-Release Single-Component and Mixture OP/FT-IR Spectra. The best validation of the conclusions made above is to use field-measured spectra instead of digitally synthesized spectra. For this reason, field-measured controlled-release single-component (validation set E in Table 1) and multicomponent spectra (validation set F in Table 1) were collected and tested by the seven networks. 
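The held-out-compound test described above (28 of the 56 strongly interfering compounds reserved for validation, the remaining 81 of the 109 library spectra used for training) can be sketched as a random split. The compound identifiers and the function name are hypothetical placeholders; only the counts are taken from the text.

```python
import random

def split_compounds(all_compounds, strong_interferents, n_validation, rng):
    """Hold out a random subset of the strong interferents for validation."""
    validation = set(rng.sample(sorted(strong_interferents), n_validation))
    training = [c for c in all_compounds if c not in validation]
    return training, sorted(validation)

# Placeholder identifiers standing in for the 109 library compounds.
library = [f"compound_{i:03d}" for i in range(109)]
strong = library[:56]  # stand-in for the 56 manually selected interferents
train, held_out = split_compounds(library, strong, 28, random.Random(1))
```

Because the held-out compounds never enter the training set, any correct rejection of their spectra reflects generalization rather than memorization.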
Note that none of the networks was trained with data synthesized from OP/FT-IR background spectra that were collected on the same day as the controlled-release data, and that the spectrometer and retroreflector were set up on the day of the measurements. Several of the field-measured controlled-release single-component spectra are shown in Figure 5. To dissipate the analytes effectively, the equipment was set up on the roof of our laboratory. The total path length was constant at 75 m. Even though the intensity of the water lines in these spectra is approximately constant, it should be noted that the network was trained using background spectra measured over path lengths from 25 to 375 m. Because the path length was relatively short, the field-measured spectra shown in Figures 5-7 were less noisy than synthesized spectra such as those in Figure 4. The chlorinated solvents are quite volatile and gave rise to strong absorption bands in these spectra. All 12 chloroform spectra (peak absorbance, 0.26-0.63) and all 12 dichloromethane spectra (peak absorbance, 0.15-0.47) were identified correctly by all seven neural networks; i.e., there were no false positives or false negatives. The 22 methanol spectra (peak absorbance 0.014-0.082), 24 ethanol spectra (peak absorbance 0.005-0.049), 12 1-propanol spectra (peak absorbance 0.031-0.115), 8 2-propanol spectra (peak absorbance 0.029-0.068), and 26 1-butanol spectra (peak absorbance 0.017-0.058) were also tested by all the neural networks. None of the alcohol spectra gave a false positive prediction. With respect to false negatives, however, the results depended on the maximum peak absorbance. All the 1-propanol and 2-propanol spectra were identified correctly by their corresponding networks, but several

Figure 5. Representative OP/FT-IR single-component spectra of field-measured controlled releases of CHCl3, CH2Cl2, and the five alcohols. A total of 12, 12, 22, 24, 12, 8, and 26 spectra for CHCl3, CH2Cl2, methanol, ethanol, 1-propanol, 2-propanol, and 1-butanol, respectively, were used to test the seven neural networks. Only three spectra are shown for each compound. All the components in the spectra shown in this figure were correctly identified by the seven neural networks. The path-integrated concentrations (ppm·m) of the components as calculated by PLS regression are as follows: CHCl3: A, 272, B, 189, C, 112. CH2Cl2: A, 663, B, 380, C, 210. Methanol: A, 272, B, 175, C, 100. Ethanol: A, 202, B, 157, C, 107. 1-Propanol: A, 463, B, 310, C, 125. 2-Propanol: A, 442, B, 292, C, 188. 1-Butanol: A, 208, B, 147, C, 111.

Figure 6. Representative OP/FT-IR mixture spectra of field-measured controlled releases of CHCl3 and CH2Cl2. A total of 3 out of the 12 spectra tested are shown in this figure. The path-integrated concentrations of the components as calculated by PLS regression are as follows: (A) CHCl3, 125 ppm·m, CH2Cl2, 264 ppm·m; (B) CHCl3, 96 ppm·m, CH2Cl2, 272 ppm·m; (C) CHCl3, 65 ppm·m, CH2Cl2, 183 ppm·m.

spectra of methanol, ethanol, and 1-butanol for which the peak absorbance was less than 0.029, 0.026, and 0.031, respectively, were identified incorrectly. Three field-measured spectra of mixtures of the two chlorinated solvents are plotted in Figure 6. Visual inspection of the 12 CHCl3/CH2Cl2 mixture spectra (peak absorbance 0.20-0.51) showed that CHCl3 and CH2Cl2 were both present in all spectra. The components of all 12 spectra were correctly identified by all seven neural networks. Field-measured spectra of mixtures of alcohols were used to test the alcohol neural networks. Representative spectra of each mixture are plotted in Figure 7. Each mixture sample set

Table 3. Path-Integrated Concentrations (Represented as the Absorbance of the Strongest Peak) of Each Alcohol in the Mixture Spectra Predicted by PLS Regression (a)

Figure 7. Representative OP/FT-IR mixture spectra of field-measured controlled releases of alcohol mixtures. Mixture 1 contains methanol, ethanol, and 1-propanol; mixture 2 contains 1-propanol, 2-propanol, and 1-butanol; mixture 3 contains methanol, 2-propanol, and 1-butanol. Concentrations are given in Table 3. Spectra A-C for mixture 1 correspond to spectra 6, 3, and 8 of mixture 1; spectra A-C for mixture 2 correspond to spectra 4, 1, and 5 of mixture 2; spectra A-C for mixture 3 correspond to spectra 6, 3, and 9 of mixture 3.

Table 2. Conversion Factors To Find the Path-Integrated Concentration (ppm·m) from the Maximum Observed Absorbance in an OP/FT-IR Spectrum Measured at 8-cm-1 Resolution

analyte       conversion factor (ppm·m/AU)
CHCl3         435
CH2Cl2        1408
methanol      3241
ethanol       4124
1-propanol    4027
2-propanol    6494
1-butanol     3582
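As a worked illustration of how the Table 2 factors are applied, the short sketch below multiplies a peak absorbance by the analyte's conversion factor. The factors are copied from Table 2; the dictionary and function names are hypothetical.

```python
# Conversion factors from Table 2, in ppm·m per absorbance unit (AU).
CONVERSION_PPM_M_PER_AU = {
    "CHCl3": 435,
    "CH2Cl2": 1408,
    "methanol": 3241,
    "ethanol": 4124,
    "1-propanol": 4027,
    "2-propanol": 6494,
    "1-butanol": 3582,
}

def path_integrated_concentration(analyte, peak_absorbance):
    """Convert the maximum observed absorbance (AU) into ppm·m."""
    return CONVERSION_PPM_M_PER_AU[analyte] * peak_absorbance
```

For example, an ethanol peak absorbance of 0.049 AU corresponds to 4124 × 0.049 ≈ 202 ppm·m, which is in line with the largest ethanol concentration listed in the caption of Figure 5.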

comprised mixtures of three alcohols at different relative concentrations. A total of 12 spectra were collected for mixture sample set 1, 5 spectra for mixture sample set 2, and 19 spectra for mixture sample set 3. Since the composition of each sample in the liquid phase was known in advance, the composition of each vapor spectrum was also known qualitatively. The PLS regression model described by Hart et al.33 was used to determine the concentration of each component in each of the OP/FT-IR mixture spectra. The concentrations (represented as the absorbance of the strongest peak in a single-component spectrum) are given in Table 3. The alcohol mixture spectra in validation set F of Table 1 were used to test the five neural networks for both false positives and false negatives. For the test for false negatives, the absorbance of the strongest peak of the alcohol being tested was greater than or equal to 0.05. The results are summarized in Table 4. No false positive predictions were obtained under any circumstance. The number of false negatives for methanol and ethanol was generally satisfactory, in part because the peak absorbance of bands due to these two compounds was higher than that of the corresponding bands of the other alcohols (see Table 3) (presumably because the volatility of methanol and ethanol was greater than that of the other alcohols tested). For the 1-propanol and 2-propanol networks, only some of the spectra for which the strongest

(33) Hart, B. K.; Berry, R. J.; Griffiths, P. R. Field Anal. Chem. Technol., in press.

spectrum no.   methanol   ethanol   1-propanol   2-propanol   1-butanol

Mixture 1
1              0.242      0.167     0.020
2              0.030      0.127     0.020
3              0.059      0.164     0.064
4              0.041      0.110     0.028
5              0.137      0.039     0.047
6              0.235      0.035     0.052
7              0.205      0.181     0.054
8              0.020      0.060     0.106
9              0.010      0.084     0.067
10             0.007      0.121     0.076
11             0.011      0.051     0.075
12             0.100      0.037     0.030

Mixture 2
1                                   0.018        0.053        0.070
2                                   0.021        0.067        0.070
3                                   0.097        0.022        0.043
4                                   0.123        0.031        0.027
5                                   0.060        0.013        0.008

Mixture 3
1              0.052                             0.021        0.002
2              0.090                             0.026        0.013
3              0.096                             0.043        0.013
4              0.060                             0.044        0.026
5              0.105                             0.027        0.035
6              0.119                             0.026        0.059
7              0.139                             0.072        0.035
8              0.089                             0.070        0.019
9              0.030                             0.073        0.021
10             0.100                             0.018        0.021
11             0.091                             0.023        0.024
12             0.113                             0.029        0.038
13             0.156                             0.080        0.041
14             0.081                             0.034        0.005
15             0.068                             0.021        0.023
16             0.051                             0.015        0.001
17             0.113                             0.045        0.050
18             0.152                             0.008        0.038
19             0.088                             0.032        0.047

(a) Assuming that only the compounds released into the atmosphere in each experiment were present. (The factors to convert peak absorbances into ppm·m are listed in Table 2.)

absorption band had a peak absorbance greater than 0.05 were identified correctly. In these cases, it is probable that the strong interference from other alcohols was responsible for the failure of the network. Ethanol and 1-butanol had very similar reference spectra (see Figure 2), so the performance of the neural networks for these two alcohols would be expected to be similar. None of the four spectra measured when the peak absorbance of 1-butanol was greater than 0.05 was identified correctly by the 1-butanol network. The reason for this may in part be the weak absorption of 1-butanol (only slightly higher than 0.05, as shown in Table 3), and in part the severe interference by the other two alcohols. As mentioned previously, when synthetic spectra of alcohol mixtures containing no other interferences were employed as the training and validation sets, much higher prediction accuracy was achieved than when the network had to compensate for nonalcohol interferences. The five special neural networks that were trained

Table 4. Result of Testing the Alcohol Neural Networks with Field-Measured Spectra of Controlled Releases of Alcohol Mixtures (a)

                   false positive test                    false negative test
neural network     no. of testing   no. of correctly      no. of testing   no. of correctly
                   spectra          identified spectra    spectra          identified spectra
methanol NN        5                5                     24               22
ethanol NN         24               24                    9                9
1-propanol NN      19               19                    10               7
2-propanol NN      12               12                    6                1
1-butanol NN       12               12                    4                0

(a) For all tests for false negatives, the absorbance of the strongest peak of the alcohol being tested was greater than or equal to 0.05.
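The counts in Table 4 can be converted into false positive and false negative percentages with the short sketch below. The counts are copied from Table 4; the dictionary layout and the function name are assumptions made for illustration.

```python
# Table 4 counts: (FP spectra tested, FP correct, FN spectra tested, FN correct).
TABLE_4 = {
    "methanol": (5, 5, 24, 22),
    "ethanol": (24, 24, 9, 9),
    "1-propanol": (19, 19, 10, 7),
    "2-propanol": (12, 12, 6, 1),
    "1-butanol": (12, 12, 4, 0),
}

def error_rates(fp_tested, fp_correct, fn_tested, fn_correct):
    """Return (false positive %, false negative %) from the table counts."""
    fp_rate = 100.0 * (fp_tested - fp_correct) / fp_tested
    fn_rate = 100.0 * (fn_tested - fn_correct) / fn_tested
    return fp_rate, fn_rate

rates = {name: error_rates(*counts) for name, counts in TABLE_4.items()}
```

With these counts, every network has a false positive rate of 0%, while the false negative rate ranges from 0% for ethanol (9 of 9 correct) to 100% for 1-butanol (0 of 4 correct).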

Table 5. Results Obtained for the Five Special Alcohol Neural Networks (a)

                   false positive test                    false negative test
neural network     no. of validation   no. of correctly   no. of validation   no. of correctly
                   spectra             identified spectra spectra             identified spectra
methanol NN        5                   5                  24                  24
ethanol NN         24                  24                 9                   9
1-propanol NN      19                  17                 10                  10
2-propanol NN      12                  12                 6                   6
1-butanol NN       12                  12                 4                   4

(a) The validation spectra were the same as in Table 4, but the neural networks were trained by spectra synthesized from only the five alcohol reference spectra.

using synthetic OP/FT-IR spectra containing only the five alcohols (single components and mixtures) were also used to test the above-measured alcohol mixture spectra. Except for two false positives obtained with the 1-propanol network, all the other spectra were correctly identified by all the networks, both with respect to false positives and false negatives. The results, which are summarized in Table 5, are much better than those shown in Table 4, especially for false negatives obtained by the 1-propanol, 2-propanol, and 1-butanol networks.

CONCLUSIONS

The possibility of using multilayer feed-forward neural networks with one hidden layer for compound identification in OP/FT-IR spectra was studied. The method of synthesizing the training spectra provided a convenient way of obtaining a large number of statistically independent training sets without the need for extensive measurements in the field. The initial learning rate and the minimum amount of the compound added to the OP/FT-IR background spectra used in the training set were critical for training the network effectively. Neural networks for five alcohols and two chlorinated solvents were trained and tested. Each neural network was designed to recognize only one compound in the presence of a number of interferences. Neither baseline correction nor spectral subtraction of water or CO2 lines was required when these networks were used. The application of these networks to compound recognition in OP/FT-IR spectrometry is analogous to spectral

library searching and can be easily automated. All the available reference spectra are currently being encoded into independent networks in our laboratory and a “neural network library” for OP/FT-IR spectrometry is being established. Neural networks can be trained to identify individual components that have very similar spectra under the conditions of the measurement, such as ethanol and 1-butanol. The networks also gave reasonably good predictions from the spectra of mixtures. Neural networks trained with spectra synthesized only from the five alcohols gave much higher prediction accuracy than when mixture spectra containing compounds in addition to these alcohols were employed. This result suggests an alternative approach for identifying the presence of individual analytes in the presence of a mixture of similar compounds. In this case, a network could be trained to recognize the presence of a member of a particular compound class. Subsequent networks could then be trained to identify the presence or absence of individual compounds in this group. The results from field-measured controlled-release OP/FT-IR spectra were comparable with those obtained from the reference spectra. Three types of neural network inputs were compared. Not surprisingly, restricting OP/FT-IR spectra to the atmospheric windows gave the best result. Expansion to the full spectrum degraded the network performance because those data that are out of the atmospheric windows contain little useful information for identification. Using only those spectral windows containing absorption bands of the analyte as neural network input could reduce the input data significantly so that the neural networks can be trained more rapidly. If the absorption bands of the analyte overlap severely with those of an interference, however, the specificity is reduced. It should be emphasized here that neural networks rarely gave results that were 100% accurate.
It is quite easy for inexperienced users to overtrain neural networks; thus testing with independent validation sets is even more important than it is with more linear methods such as CLS and PLS. In practice, the combination of neural networks with other methods is often to be recommended. For example, in OP/FT-IR spectrometry, the combination of the neural network approach described in this paper with PLS, CLS, or spectral subtraction for quantification will increase most users’ confidence in the final answer. It is also quite possible that the combination of feed-forward networks with other neural network methods will allow more confident predictions. It is generally believed that neural networks are highly tolerant of noise. However, the success of neural networks can depend strongly on the number of layers, the number of neurons in each layer, and the speed and memory of the computer. Even if larger computers become available, the training algorithms must be improved for larger networks. In the project described in this paper, a small reference library containing the spectra of only 109 compounds was used. If the reference library were expanded, it is probable that more input data and more neurons would be needed compared to the networks illustrated in this paper. In this case, some other available training algorithms, such as simulated annealing and genetic optimization,13,34,35 may be needed.

(34) Masters, T. Advanced Algorithms for Neural Networks; John Wiley & Sons: New York, 1995.
(35) Shaffer, R. E.; Small, G. W. Anal. Chem. 1997, 69, 236A-242A.

While training feed-forward networks is time-consuming and experience is needed to avoid many of the potential pitfalls, the implementation of these networks is fast and simple and can be easily automated. Thus we believe that feed-forward neural networks can provide useful information not only for the data processing of OP/FT-IR spectrometry but also for a number of other areas of chemical analysis that are not amenable to more linear methods, such as those based on principal component analysis.
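The analogy with spectral library searching can be sketched as a loop over a library of single-compound networks, reporting every compound whose network gives a positive output. This is a minimal sketch: the constant-output lambdas below are hypothetical stand-ins for trained networks, and the names `identify_compounds` and `toy_library` are illustrative only.

```python
def identify_compounds(spectrum, network_library):
    """Return the names of all compounds whose network output is positive."""
    return sorted(name for name, network in network_library.items()
                  if network(spectrum) > 0.0)

# Hypothetical stand-ins: each callable maps a spectrum to a value in (-1, 1).
toy_library = {
    "methanol": lambda spectrum: 0.8,
    "ethanol": lambda spectrum: -0.6,
    "CHCl3": lambda spectrum: 0.2,
}

detected = identify_compounds([0.0, 0.01, 0.02], toy_library)
```

Because each network is independent, adding a compound to such a library would only require training one additional network rather than retraining the existing ones.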

ACKNOWLEDGMENT The authors thank Professor Howard B. Demuth of the Department of Electrical Engineering, University of Idaho (UI), for his assistance in the neural network design, Brian K. Hart for his assistance in acquiring the OP/FT-IR spectra used in this study, and Dr. John D. Jegla of the UI Department of Chemistry for useful discussions. This work was supported in part by a grant from the Idaho National Engineering and Environmental Laboratory (INEEL) University Research Consortium. The INEEL is managed by Lockheed Martin Idaho Technologies Co. for the U.S. Department of Energy, Idaho Operations Office under Contract DE-AC07-94ID13223.

Received for review August 25, 1998. Accepted December 5, 1998. AC980955O

Analytical Chemistry, Vol. 71, No. 3, February 1, 1999
