Anal. Chem. 1997, 69, 4641-4648

Identification of Multiple Analytes Using an Optical Sensor Array and Pattern Recognition Neural Networks

Stephen R. Johnson, Jon M. Sutter, Heidi L. Engelhardt, and Peter C. Jurs*
152 Davey Laboratory, Department of Chemistry, The Pennsylvania State University, University Park, Pennsylvania 16802

Joel White and John S. Kauer
Department of Neuroscience, Tufts University School of Medicine, Boston, Massachusetts 02111

Todd A. Dickinson and David R. Walt
Department of Chemistry, Tufts University, Medford, Massachusetts 02155

The further development of a vapor-sensing device utilizing an array of broadly distributed optical sensors is detailed. Data from these optical sensors provided input to pattern-recognizing neural networks, which successfully identified and quantified a collection of 20 analyte vapors. The optical sensor array consisted of 19 optical fibers whose tips were coated with Nile Red immobilized in various polymer matrices. Responses consisted of the changes in fluorescence with time resulting from the presentation of a vapor to the sensor array. Numerical descriptors calculated from these responses were then used to highlight important temporal and spatial features. Learning vector quantization neural network models were constructed using subsets of these descriptors, and they accurately identified and quantified each of the presented analytes. Successful classification was achieved for both the training set data (89%) and the external prediction set data (90%). Relative concentrations were correctly assigned for 90% of the prediction set data.

In recent years, there have been many reports regarding the development and use of sensor arrays for the identification and quantification of organic chemicals. The primary motivation behind these efforts is the mammalian olfactory system, which is known to possess both a broad-band response and remarkable sensitivity. Indeed, the human nose can detect differences between many odorant stimuli,1 often at low concentrations.2 The olfactory pathway appears to achieve these abilities by using a large array of cross-reactive receptor cells3 rather than by having species-specific sensors. The signals transmitted from these receptors are passed to the neuronal circuitry of the brain, where spatial and temporal response patterns are generated. These patterns contain molecular identity information, which may be used to recognize the odorant.

(1) Persaud, K. C.; Travers, P. J. In Handbook of Biosensors and Electronic Noses: Medicine, Food, and the Environment; Kress-Rogers, E., Ed.; CRC Press: New York, 1997; Chapter 24.
(2) Breer, H. In Handbook of Biosensors and Electronic Noses: Medicine, Food, and the Environment; Kress-Rogers, E., Ed.; CRC Press: New York, 1997; Chapter 22.
(3) Kauer, J. S. Trends Neurosci. 1991, 14, 79-85.

© 1997 American Chemical Society

There have been many attempts to mimic this property in the design of vapor-sensing instruments, usually incorporating an array of chemical sensors.4-7 Such a device would have many potential applications, including environmental monitoring, quality control in production processes, and diagnostic utility in the medical field. Many different types of chemosensors have been incorporated into these devices, including conductive polymer,5,8 surface acoustic wave (SAW),9,10 piezoelectric,11 and electrochemical6,12 sensors. The work presented here is a continuation of work reported earlier7,13,14 involving an optical sensor array based on fiber-optic sensor technology. Optical sensors offer some intriguing advantages for a practical chemical sensing device, including easy miniaturization and the ability to locate the sensing element several kilometers from the instrument. In addition, owing to their optical nature, fiber optics are free of electrical interferences.

Like the mammalian olfactory system, the fiber-optic sensing instrument described here generates responses with a temporal component. As reported earlier,13,14 the use of temporal information in a chemical sensing device yields more information regarding the interaction of an analyte with the sensor and therefore should increase the probability of classifying the analyte successfully. This is in agreement with what is known about the mammalian nose, in which small changes in odors are detected by spatial and temporal patterns of the activated olfactory receptor cells. To date, there have been no other reports of temporal information being used in sensor array technology.

The current study presents the continued development and characterization of the optical sensor array. In comparison to our previous study,14 a larger set of analytes was used here to evaluate the ability of the sensor array to provide sufficient information for discrimination. This analyte set contained a more diverse range of functionality; however, several of the functional groups appear in multiple compounds. Numerical descriptors encoding the change in fluorescence data from the optical sensor array were presented to a learning vector quantization (LVQ) neural network, which classified each of the presented analytes. The use of LVQ as the classifying algorithm represents a significant departure from the methodology detailed in the previous studies.13,14 Among the advantages of this approach is a reduction in the overall number of adjustable parameters, which may imply a more reliable classifier. Additional LVQ networks were then used to assign a relative concentration to each of the analytes. Cross-validation sets were used to guard against overtraining each of these networks, and an independent prediction set was used to verify the ability of the networks to generalize.

(4) Lundström, I.; Erlandsson, R.; Frykman, U.; Hedborg, E.; Spetz, A.; Sundgren, H.; Welin, S.; Winquist, F. Nature 1991, 352, 47-50.
(5) Freund, M. S.; Lewis, N. S. Proc. Natl. Acad. Sci. U.S.A. 1995, 92, 2652-2656.
(6) Singh, S.; Hines, E. L.; Gardner, J. W. Sens. Actuators, B 1996, 30, 185-190.
(7) Dickinson, T. A.; White, J.; Kauer, J. S.; Walt, D. R. Nature 1996, 382, 697-700.
(8) Hodgins, D. Sens. Actuators, B 1995, 27, 255-258.
(9) Grate, J. W.; Abraham, M. H. Sens. Actuators 1991, 3, 85-111.
(10) Zellers, E. T.; Batterman, S. A.; Han, M.; Patrash, S. J. Anal. Chem. 1995, 67, 1092-1106.
(11) Grate, J. W.; Rose-Pehrsson, S. L.; Venezky, D. L.; Klusty, M.; Wohltjen, H. Anal. Chem. 1993, 65, 1868-1881.
(12) Stetter, J. R.; Jurs, P. C.; Rose, S. L. Anal. Chem. 1986, 58, 860-866.
(13) White, J.; Kauer, J. S.; Dickinson, T. A.; Walt, D. R. Anal. Chem. 1996, 68, 2191-2202.
(14) Sutter, J. M.; Jurs, P. C. Anal. Chem. 1997, 69, 856-862.
(15) Holmberg, M.; Winquist, F.; Lundström, I.; Gardner, J. W.; Hines, E. L. Sens. Actuators, B 1995, 26/27, 246-249.
(16) Aishima, T. Anal. Chim. Acta 1991, 243, 293-300.

EXPERIMENTAL SECTION

Analyte Set. The analyte set used in the current study consisted of 17 pure compounds and three complex mixtures. Two of the mixtures, cologne-1 and cologne-2, were colognes produced by the same manufacturer (Fragrance Impressions Ltd.). In addition, several alcohols (propanol, 2-propanol, methanol) likely to be similar to the solvents used in the colognes were present in the analyte set. Four carbonyl-containing compounds (acetone, butyl acetate, camphor, and (+)- and (-)-carvone) were also present. The third mixture was a pseudoexplosive of unknown composition (Sigma). As the pseudoexplosive is known to be dissolved in water, water was included as an analyte to investigate the possibility of resolving the two. The remainder of the analyte set consisted of two sulfur-containing compounds, two chlorinated compounds, one nitrogen-containing compound, one aromatic compound, one carboxylic acid, and one simple hydrocarbon. A complete list of the 20 analytes used in this study is given in Table 6.

Sensor Fabrication. The distal ends of each of the fibers in the bundled array were cleaved, polished, cleaned, and silanized in the same manner as that previously described.13 Twelve solutions were prepared for coating the fibers: five for photopolymerization (Table 1) and seven for dip-coating (Table 2). The solutions chosen were those that had yielded the most useful sensors in the previous studies.13,14 Sensors were then coated according to the procedures listed in Table 3. A variety of application orders and mixtures were used to investigate whether two or more well-responding polymers would yield a sensor of greater utility.

Data Collection. The system for data acquisition and for diluting and applying pulses of analyte vapors has been described previously.13 Fluorescence changes in the fiber sensors (λex 540 nm, λem 600 nm) were recorded using a cooled CCD video camera (Photometrics Ltd.) controlled by a Macintosh-based imaging system. Commercial software was used for experimental timing and data acquisition (Signal Analytics Corp.).

Table 1. Summary of Photopolymerization Parameters for Sensor Construction^a

soln   monomer 1; amt (µL)   monomer 2; amt (µL)   amt of Nile Red (µL)   BEE
a      PS802; 500                                  400                    30
b      PS802; 500            MMA; 100              400                    60
c      PS802; 500            PS078.9; 150          400                    30
d      PS802; 500            MMA; 50               400                    30
m1     equal mixture of a-d

^a Definitions: PS802, 80-85% dimethyl, 15-20% acryloxypropyl methylsiloxane copolymer; PS078.9, vinylmethoxysiloxane; MMA, methyl methacrylate.

Table 2. Summary of Dip-Coating Parameters for Sensor Construction^a

soln   polymer name; amt (g)   solvent; amt (mL)   amt of Nile Red (µL)^b
e      DOW; 0.36               toluene; 1          1000
f      PC; 0.09                chlor; 1.2          500
g      PSAN; 0.1               chlor; 1            250
h      PMMA; 0.11              chlor; 0.5          250
i      PABS; 0.11              chlor; 1            500
m2     equal mixture of f and h
m3     equal mixture of e and i

^a Definitions: DOW, dimethylsiloxane dispersion coating; PC, polycaprolactone; PSAN, poly(styrene-acrylonitrile); PMMA, poly(methyl methacrylate); chlor, chloroform. ^b 1 mg/mL in chlor.

A response to analyte application was recorded as a sequence of 60 video frames (256 × 256 pixels). Each frame represents the fluorescent sensor signal integrated over 200 ms; frames were collected at 350 ms intervals. Each sequence thus lasted 21 s, during which a 4 s analyte pulse was presented to the sensor array. The integration time, combined with on-chip binning (by a factor of 2), resulted in a high signal-to-noise ratio; no spatial or temporal filtering of the video sequence was necessary. Fluorescence values were obtained by averaging over a 30 × 30 pixel square centered on each fiber. Sensor responses to 20 different analytes were recorded, each presented at three concentrations (1:1, 1:2, and 1:3 dilutions of saturated vapor with air for dichloroethane and chloroform; saturated vapor and 1:1 and 1:2 dilutions of saturated vapor with air for all other analytes). A trial consisted of these 60 analyte applications, plus three pulses of air without analyte, presented at 1-2 min intervals. Data from 10 trials were collected over 10 consecutive days. The change in fluorescence intensity was obtained by subtracting the intensity immediately prior to analyte presentation from each time slice.

Data Analysis. To improve the likelihood of building successful models, the sensor responses were visually inspected to identify sensors with essentially negligible responses. Seven sensors showed no response to the presented analytes and were eliminated from further consideration, leaving 12 sensors (1, 6, 8, 10-17, 19) for further investigation. Prior to the calculation of descriptors, several preprocessing steps were taken. To account for changes in atmospheric conditions, such as humidity, a blank (air) response measured on the day of each trial was subtracted from the response for each observation. For the development of models for analyte

Table 3. Sensor Construction Parameters

Dip-coated:
fiber   soln (in order of application)   dips (per soln)
1       e                                5
3       i                                3
5       g                                3
7       h                                3
9       f                                5
11      h, g                             2
13      e, i                             2
15      i, e                             2
17      m2                               3
19      m3                               3

Photopolymerized:
fiber   soln (in order of application)   dips (per soln)   UV exposure (s per dip)
2       a                                2                 30
4       d                                2                 15
6       c                                2                 15
8       b                                2                 15
10      b, c                             1                 15
12      c, b                             1                 15
14      m1                               1                 15
16      m1                               2                 15
18      a-d                              1                 10

classification, the change in fluorescence response plots were normalized using the relation

    R_ijk^norm = (R_ijk - R̄_ij) / σ_ij    (1)

where R_ijk is the kth time slice of the jth sensor for the ith observation, R̄_ij is the average value of the jth sensor for the ith observation, and σ_ij is the corresponding standard deviation. A similar normalizing relation has been used recently for the calculation of similarity measures for mass spectra.17 It is worth noting that the transformation defined in eq 1 differs significantly from the autoscaling performed in other sensor array data processing approaches,5 in which descriptor values for all observations are centered around zero with unit standard deviation. Here, each individual sensor response is normalized. The effect is to largely remove concentration-dependent intensity changes from consideration, thus simplifying analyte recognition. Figure 1A shows a typical change in fluorescence response plot for the three concentrations of 2-propanol prior to normalization; analyte concentration clearly has a large impact on the intensity of the change in fluorescence. Figure 1B shows the effect of normalizing the responses. Because this normalization removes the concentration information, the change in fluorescence data were not normalized for the quantification models. Finally, for the development of both classification and quantification models, the data were smoothed using a simplified least-squares method.18,19

Many different descriptors were calculated for each sensor. These descriptors were intended to encode regions of the response curves that were important for analyte identification. Table 4 lists some of the descriptors calculated in this study. Although some of the descriptors contained no temporal information (e.g., most positive intensity), most did (e.g., the average intensity in time region 3). The descriptors containing temporal information proved to be the most useful for classification of the presented analytes.

Model Development.
After the descriptors had been calculated, the observation pool was split into a 480-member training set, a 60-member cross-validation set, and a 60-member external prediction set. The members of each set were selected in a semirandom manner. Because it was important that each analyte be adequately represented in each set for reliable model development and validation, each analyte at each concentration was guaranteed to be in both the prediction and cross-validation sets. The training set and cross-validation set were used for descriptor reduction and selection and also for model development; the prediction set was used only for model validation.

(17) Windig, W.; Phalp, J. M.; Payne, A. W. Anal. Chem. 1996, 68, 3602-3606.
(18) Savitzky, A.; Golay, M. J. E. Anal. Chem. 1964, 36, 1627-1639.
(19) Steiner, J.; Termonia, Y.; Deltour, J. Anal. Chem. 1972, 44, 1906-1909.

Figure 1. (A) Change in fluorescence versus time for low, medium, and high concentrations of 2-propanol (sensor 14, trial 3). Large differences in response intensity are apparent as a function of concentration, although peak shape is similar across concentrations. (B) Response plots normalized as in eq 1. Differences in intensity are largely removed, allowing for easier recognition of analyte identity.

Table 4. Examples of Descriptors Derived from the Sensor Responses Used in This Study

descriptor   description
AVER         average change in intensity for all time slices
STSL         steepest positive slope of response plots
SNSL         steepest negative slope of response plots
MPOS         most positive change in fluorescence intensity
MNEG         most negative change in fluorescence intensity
APOS         average of all positive changes in fluorescence intensity
AVE(X)^a     average value of region X
DMPS         most positive intensity of the derivative of the intensity plot
MCHN         maximum change in fluorescence intensity

^a Intensity plots were divided into either 6 or 10 regions.

The number of descriptors that could be calculated from the response data was large; thus, it was important to eliminate descriptors that did not produce a consistent response for a given analyte as well as descriptors that did not adequately discriminate between analyte types. Accordingly, descriptors with high standard deviations within most classes of odorants, or with nearly identical values between classes of odorants, were eliminated from the descriptor pool. Small subsets of the remaining descriptors were then investigated using a genetic algorithm (GA) feature selection routine.20 The GA is an evolutionary optimization technique that can be used to seek the optimal subset of descriptors for a model. The fitness of each descriptor subset is measured by a fitness function appropriate to the type of model being developed. For example, the root-mean-square (rms) error can be used as the fitness function in the search for models using multiple linear regression; in such a case, subsets yielding a lower rms error are favored over those yielding a higher rms error.

For analyte classification, a collection of 20 three-layer, fully connected, feed-forward computational neural network models was developed. Each individual network was trained to recognize the presence or absence of its target analyte. The descriptor subsets for each network were selected using the feature selection routine, with the rms error from a multiple linear regression guiding the search. A value of 0.05 was used to indicate the absence of an analyte, and 0.95 indicated that the analyte was present. After a subset of descriptors was chosen using multiple linear regression, the descriptors were used in a computational neural network.
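The descriptor-subset search can be sketched as a simple genetic algorithm of the kind cited above (ref 20). This is an illustrative reimplementation, not the authors' code; the population size, mutation rate, operator choices, and toy fitness function are our own assumptions:

```python
import random

def ga_select(n_desc, subset_size, fitness, generations=50,
              pop_size=20, seed=0):
    """Evolve fixed-size descriptor subsets toward low fitness values
    (e.g., rms error or percent misclassified -- lower is better)."""
    rng = random.Random(seed)
    pop = [rng.sample(range(n_desc), subset_size) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness)
        survivors = pop[: pop_size // 2]           # truncation selection
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)        # two parent subsets
            child = list(set(a) | set(b))          # crossover: pool the parents
            rng.shuffle(child)
            child = child[:subset_size]
            repl = rng.randrange(n_desc)           # mutation: swap in a new descriptor
            if rng.random() < 0.3 and repl not in child:
                child[rng.randrange(subset_size)] = repl
            children.append(child)
        pop = survivors + children
    return min(pop, key=fitness)

# toy fitness: count how many chosen descriptors fall outside the
# (pretend) informative set {0..4}; the GA should drive this toward 0
best = ga_select(n_desc=100, subset_size=5,
                 fitness=lambda s: sum(1 for d in s if d >= 5))
```

In the paper's setting, the fitness function would instead train a regression or an abbreviated LVQ network on the candidate subset and return its error.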
The networks were trained using a quasi-Newton update function,21 with the starting weights and biases chosen using an optimization routine based on generalized simulated annealing.22 The error of a cross-validation set (cvset) was monitored during training, and training was halted at the minimum error for the cvset to prevent overtraining the neural networks. After that point, the network begins to memorize the training data and loses its ability to generalize to data not included in the training set. Several architectures were investigated for each of the 20 neural network models; the smallest network architecture with acceptable rms errors was used for the final models. For detailed information regarding this approach, see ref 14.

(20) Luke, B. T. J. Chem. Inf. Comput. Sci. 1994, 34, 1279-1287.
(21) Sutter, J. M.; Dixon, S. L.; Jurs, P. C. J. Chem. Inf. Comput. Sci. 1995, 35, 77-84.
(22) Sutter, J. M.; Jurs, P. C. In Data Handling in Science and Technology. Adaption of Simulated Annealing to Chemical Optimization Problems; Kalivas, J. H., Ed.; Elsevier: Amsterdam, 1995; Vol. 15, Chapter 5.
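The cvset-monitored early stopping described above can be sketched generically. The quasi-Newton optimizer itself (ref 21) is replaced here by a stand-in `step` function; the `patience` parameter, the toy error curve, and all names are our own illustrative assumptions:

```python
def train_with_early_stopping(step, cv_error, max_epochs=500, patience=25):
    """Run training steps while tracking cross-validation error; return
    the epoch, model state, and error at the cvset minimum, so the model
    is taken from before overtraining sets in."""
    best_err, best_state, best_epoch, stale = float("inf"), None, 0, 0
    for epoch in range(max_epochs):
        state = step(epoch)              # one optimizer update; returns new weights
        err = cv_error(state)
        if err < best_err:
            best_err, best_state, best_epoch, stale = err, state, epoch, 0
        else:
            stale += 1
            if stale >= patience:        # cvset error has stopped improving
                break
    return best_epoch, best_state, best_err

# toy example: cvset error falls until "epoch" 100, then rises (overtraining)
epoch, state, err = train_with_early_stopping(
    step=lambda e: e,
    cv_error=lambda s: abs(s - 100) / 100.0,
)
```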


As the results for the feed-forward networks were unsatisfactory, other methods of classifying the analytes were investigated. Learning vector quantization (LVQ) is a computational neural network approach particularly well suited to pattern recognition applications.23 Here, LVQ is used to classify each observation into one of 20 classes. LVQ is a supervised network that iteratively updates a collection of weight vectors, through a simple learning function, to better classify the presented data; in effect, it defines the class regions in descriptor space in a piecewise linear manner. There are three main stages to LVQ training: initialization, competition, and learning. The initialization stage creates the starting weight vectors for each neuron. In the implementation of LVQ used in this study (programmed in our laboratory), initialization creates a preset, user-specified number of weight vectors for each class (e.g., 3 for each of the 20 classes, 60 in total). These weight vectors were constructed using the equation

    C_ij = X̄_j ± κσ_j    (2)

where C_ij is the jth element of the weight vector for the ith neuron, X̄_j is the average value of the jth descriptor for the given class, κ is a uniformly distributed random number between -1.0 and 1.0, and σ_j is the standard deviation of the jth descriptor over all observations of the given class. An observation is presented to the network, and the competition phase starts. The Euclidean distance is calculated between the presented observation, X_k, and each weight vector:

    d_i² = Σ_j ω_j (C_ij - X_kj)²    (3)

where the ω_j values are determined iteratively as described in ref 24. Briefly, these weights are set by an iterative procedure in which the degree of interclass separation determines the importance of each feature. The two weight vectors most similar to the observation are then identified: the neuron whose weight vector has the smallest Euclidean distance is the winning neuron, and that with the second smallest distance is the runner-up neuron. Only these two neurons are eligible for learning in LVQ. If the class of the winning neuron matches the correct classification of the observation, only the winner proceeds to the update stage. If the classifications do not agree, and the class of the runner-up neuron does agree with that of the observation, both neurons proceed to the update stage. The neurons are updated through the use of the learning function

    C_ij^new = C_ij^old + δ R(t) [X_j - C_ij^old]    (4)

where C_ij^new is the jth element of the weight vector for the ith neuron after learning, C_ij^old is the corresponding element before learning, R(t) is a learning rate that decreases monotonically with time, and X_j is the jth descriptor value of the presented observation. δ is equal to +1 if the winning neuron yields a correct classification (positive feedback) or -1 if the winning neuron yields an incorrect classification (negative feedback). For the update of the runner-up neuron, δ is equal to +1 if the winning neuron yields an incorrect classification and the runner-up yields the correct classification; otherwise, it is equal to zero. In this manner, the learning function reinforces correct output by making the neuron more similar to the presented observation; that neuron is then more likely to be selected again when the observation is presented in the future. Conversely, an incorrect output makes the neuron less similar to the presented observation, which improves the probability that a different neuron is selected for the same observation in future training epochs.

(23) Kohonen, T. Self-Organizing Maps; Springer: New York, 1995.
(24) Pregenzer, M.; Pfurtscheller, G.; Flotzinger, D. Neurocomputing 1996, 11, 19-29.

It is possible that a neuron is never or seldom selected during training. These neurons, known as dead neurons, do not benefit the classification process, and they may actually degrade future performance. Dead neurons were therefore reinitialized to poorly classified classes during the training process: the incorrectly classified training observations for a poorly classified class were used to reinitialize a dead neuron via eq 2, with κ set equal to zero (i.e., at the centroid of the incorrect observations). Accordingly, analyte classes with observations that were more difficult to classify had more associated neurons, while easily classified analytes had fewer.

At the end of each epoch (one complete presentation of all the available training data), the current capability of the network is measured by calculating a cost function

    cost = [TSET + CVSET + 0.4 |TSET - CVSET|] / 2    (5)

where TSET and CVSET are the percentages of incorrectly classified samples in the training and cross-validation sets, respectively. This cost function (determined empirically) is then used in a generalized simulated annealing routine25 to determine whether the results of the epoch should be accepted or rejected. Generalized simulated annealing is an optimization routine modeled after the annealing of a metal. It accepts all beneficial steps (in this case, a decrease in the cost function) and accepts detrimental steps with a probability derived from the Boltzmann distribution. This ability to accept detrimental steps helps keep the network from converging to local minima. Should the routine reject the results of an epoch, the weight vectors are returned to their values from the end of the previous epoch, and training then continues as normal. Training can be terminated after a set number of epochs or after a number of update rejections occur. The weight vectors corresponding to the minimum cost function (eq 5) are retained and used for later classification of the external prediction set. For prediction, an unknown observation is presented to the LVQ network, competitive selection identifies the weight vector most similar to the observation, and the classification corresponding to this winner is output.

(25) Song, H.-H.; Lee, S.-W. Neural Networks 1996, 9, 329-336.
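The core LVQ machinery of eqs 2-4 can be sketched as follows. This is a minimal reimplementation under simplifying assumptions: the feature weights ω_j of eq 3 are fixed at 1, the epoch-level annealed accept/reject step built around eq 5 is omitted, and the two-class toy data are invented:

```python
import numpy as np

def init_codebook(X, y, per_class=3, seed=0):
    """Eq 2: seed each class's weight vectors at the class mean plus a
    random multiple (kappa in [-1, 1]) of the per-class std deviation."""
    rng = np.random.default_rng(seed)
    vecs, labels = [], []
    for c in np.unique(y):
        Xc = X[y == c]
        for _ in range(per_class):
            kappa = rng.uniform(-1.0, 1.0, size=X.shape[1])
            vecs.append(Xc.mean(axis=0) + kappa * Xc.std(axis=0))
            labels.append(c)
    return np.array(vecs), np.array(labels)

def lvq_step(C, C_labels, x, y_true, rate):
    """Eq 3 (unit feature weights) and eq 4: pull the winner toward x if
    correct, push it away if wrong; the runner-up learns only when it
    can fix a miss by the winner."""
    d2 = ((C - x) ** 2).sum(axis=1)          # squared Euclidean distances
    win, run = np.argsort(d2)[:2]            # winner and runner-up neurons
    if C_labels[win] == y_true:              # delta = +1: positive feedback
        C[win] += rate * (x - C[win])
    else:                                    # delta = -1: negative feedback
        C[win] -= rate * (x - C[win])
        if C_labels[run] == y_true:          # runner-up pulls in
            C[run] += rate * (x - C[run])

def predict(C, C_labels, x):
    return C_labels[np.argmin(((C - x) ** 2).sum(axis=1))]

# two well-separated toy 2-D classes
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.3, (40, 2)), rng.normal(2, 0.3, (40, 2))])
y = np.array([0] * 40 + [1] * 40)
C, Cl = init_codebook(X, y, per_class=2)
for epoch in range(20):
    for xi, yi in zip(X, y):                 # monotonically decreasing rate R(t)
        lvq_step(C, Cl, xi, yi, rate=0.05 * (1 - epoch / 20))
acc = np.mean([predict(C, Cl, xi) == yi for xi, yi in zip(X, y)])
```

The paper's version additionally learns per-feature distance weights (ref 24) and accepts or rejects each epoch's updates via simulated annealing on the eq 5 cost.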

Descriptor subsets were investigated using the GA feature selection routine, with the fitness of each descriptor subset measured by the percentage incorrectly classified in an abbreviated LVQ training. Thus, the driving force behind the GA search was the continuous reduction of the number of incorrectly classified observations. Descriptor subsets with the highest degree of fitness (lowest rate of misclassification) were then subjected to a more thorough LVQ training and investigation.

Models for the assignment of relative concentrations were also developed, using both feed-forward neural networks and learning vector quantization neural networks. The methodology used is similar to that outlined above, and it is explained in detail in the Results and Discussion.

RESULTS AND DISCUSSION

Classification of Analytes. A set of 20 three-layer, fully connected, feed-forward computational neural network models was developed for the classification of the analytes. Each model was trained to recognize the presence or absence of a particular analyte. Descriptor subsets for each of the models were chosen using the rms error from a multiple linear regression as the measure of fitness; the chosen descriptors were then used in a computational neural network model. A total of 154 descriptors (of which 86 were unique) were used in the 20 models. Each network was developed and trained independently, with final network architectures ranging from 4 to 13 input neurons, 3 to 5 hidden neurons, and a single output neuron. A cvset was used to monitor the training of the networks, and training was stopped at the point of minimum cvset error.

Although the classification results for the training and cross-validation sets approached 100% correct classification, the prediction set did not fare as well: only 73% (44 of 60) of the prediction set members were correctly classified. Twelve of the 16 errors were false negatives, in which the observation was not assigned an identity. Although there are several possible explanations for this poor performance, the most probable is network bias. Because each network is trained to recognize a single analyte, only 1/20th of the observations in the training set represent a positive analyte presence; the network is therefore biased toward giving a negative response for all analytes, as this lowers the overall rms training error. As the previous study14 contained only nine target analytes, this effect may not have been as important there as it has proven to be here. Other methods of pattern recognition, such as learning vector quantization, are less susceptible to this problem, as only a single model needs to be trained for classification.

A learning vector quantization model was developed to build a more robust classifier. Many descriptor subsets were investigated using the evolutionary feature selection routine, which uses the genetic algorithm to seek the descriptor subset with the highest degree of fitness; in this application, fitness was evaluated as the lowest cost function from an LVQ training (eq 5). Subsets with a lower cost function were retained for further investigation. Several subset sizes were investigated, with smaller subsets preferred over larger ones with similar cost functions. Each of the chosen subsets was then subjected to a more rigorous analysis. As the training of an LVQ network is dependent on the starting point, each descriptor subset was trained several


Table 5. Descriptors Used in the Classification Model^a

descriptor                sensor
steepest negative slope   12
most positive value       14
av value, region 2        14
av value, region 3        10
av value, region 3        12
av value, region 3        13
av value, region 3        14

^a Average values calculated by dividing the response plots for each analyte into six equal regions.
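The per-response normalization of eq 1 and the region-average descriptors of Table 5 can be sketched as follows (an illustrative reconstruction; the function names and the toy trace are our own, not the paper's code):

```python
import numpy as np

def normalize_response(r):
    """Eq 1: z-score a single sensor's change-in-fluorescence trace so
    that concentration-dependent intensity differences are removed."""
    return (r - r.mean()) / r.std()

def region_averages(r, n_regions=6):
    """Split a response trace into n equal time regions (Table 5
    footnote) and return the average value within each region."""
    return np.array([seg.mean() for seg in np.array_split(r, n_regions)])

# toy trace: 60 time slices, matching the 21 s video sequences;
# the peak shape itself is arbitrary
trace = np.sin(np.linspace(0, np.pi, 60)) * 5.0
norm = normalize_response(trace)
desc = region_averages(norm, n_regions=6)
```

Descriptors such as the "av value, region 3" entries in Table 5 would be single elements of `desc`, computed per sensor and per observation.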

Table 6. Classification Errors by Concentrationa concn high

Figure 2. (A) Analyte recognition confusion matrix for the training set used in developing the LVQ model. The actual analyte identities are specified by the labels on the left axis, while the classifications assigned by the model are along the top axis. Labels are as follows: (1) acetone, (2) butyl acetate, (3) cologne-2, (4) camphor, (5) (-)carvone, (6) (+)-carvone, (7) chloroform, (8) dichloroethane, (9) DMSO, (10) cologne-1, (11) water, (12) heptane, (13) 2-propanol, (14) indole, (15) mercaptoethanol, (16) methanol, (17) propanol, (18) propionic acid, (19) pseudoexplosive, (20) toluene. (B) Confusion matrix for the external prediction set. Misclassification errors are largely consistent with those present in the training set.

times using different initial weight vectors. Training was monitored, and descriptor subsets with highly oscillatory behavior during training were discarded as being nonrobust. The few remaining subsets were retrained using several different starting points and with many different network sizes. The model with the lowest training set and cross-validation set errors, as well as the fewest adjustable parameters, was retained. The descriptors used by this final model are shown in Table 5. Interestingly, the sensors used in this classification model were sensors in which well-responding polymers were combined to yield new sensors. This network was trained using three neurons initially assigned to each class, although several neurons were later reassigned to difficult classes during training. Finally, model validation was performed using the external prediction set. The model validated well, with classification accuracy being 89% for the training set and 90% for both the cross validation set and the prediction set. Figure 2A shows the confusion matrix for the data used to train the network. Values along the main diagonal represent correct classifications, with a value of 24 being perfect for each analyte class. While there are few perfect classifications, clearly the network does an excellent job classifying the analytes overall. 4646 Analytical Chemistry, Vol. 69, No. 22, November 15, 1997

Table 6. Classification Errors by Analyte and Concentrationa

[Table layout lost in extraction; per-analyte entries could not be reliably recovered. Rows: acetone, butyl acetate, cologne-2, camphor, (-)-carvone, (+)-carvone, chloroform, dichloroethane, DMSO, cologne-1, water, heptane, 2-propanol, indole, mercaptoethanol, methanol, propanol, propionic acid, pseudoexplosive, toluene, and total errors. Columns: high, medium, and low concentration. Recoverable totals: 18 (1) errors at low concentration; 32 (4) errors overall.]

a Parenthetical values represent misclassifications in the external prediction set. Other values are the combined errors for the training set and cross-validation set.

Table 7. Analyte Subsets Used for Quantification Models

group 1: cologne-1, acetone, cologne-2, dichloroethane, DMSO, heptane, 2-propanol, methanol
group 2: indole, chloroform, water, pseudoexplosive
group 3: butyl acetate, mercaptoethanol, propionic acid, toluene, propanol, camphor, (+)/(-)-carvone
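The subsets in Table 7 were formed on the basis of the similarity of the sensor responses. The paper does not state the exact procedure; one plausible sketch is a greedy grouping of analytes by the Euclidean distance between their mean response vectors, where the function name, threshold, and toy data are all illustrative assumptions.

```python
def group_by_similarity(responses, threshold):
    """responses: dict mapping analyte name -> mean response vector.
    Greedily assigns each analyte to the first group whose seed
    response lies within `threshold` Euclidean distance; otherwise
    the analyte seeds a new group."""
    groups = []  # each entry: (seed_vector, [analyte names])
    for name, vec in responses.items():
        for seed, members in groups:
            dist = sum((a - b) ** 2 for a, b in zip(vec, seed)) ** 0.5
            if dist <= threshold:
                members.append(name)
                break
        else:
            groups.append((vec, [name]))
    return [members for _, members in groups]
```

A hierarchical clustering would serve the same purpose; the point is that each quantification model then only has to rank concentrations among analytes that excite the sensors similarly.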

Indeed, a more in-depth analysis of the errors supports this statement. For example, the two colognes present in the data set, cologne-1 (10) and cologne-2 (3), were confused by the network a total of six times. As colognes consist primarily of solvent, and these two are products of the same company, it is reasonable to assume a high degree of similarity between them. Another significant source of misclassifications was the pair of carvone isomers (5 and 6). None of the sensors used in this study is chiral-sensitive, so the device is likely discriminating between these compounds on the basis of the impurities found in the commercial sources. This possibility hints

Table 8. Descriptors Present in the Quantification Modelsa

group 1:
  descriptor                 sensor
  steepest positive slope    12
  steepest negative slope    13
  av negative value          16
  av negative value          14
  av positive value          10
  av positive value          16
  av value, region 2         14
  av value, region 6         17
  av value, region 6         16
  av value, region 7         12

group 2:
  descriptor                 sensor
  av value, region 2         17
  av value, region 2         8
  av value, region 3         19
  av value, region 3         6
  av value, region 4         17
  av value, region 6         17
  av value, region 7         10
  av value, region 7         6
  av value, region 8         13
  av value, region 9         10

group 3:
  descriptor                 sensor
  steepest positive slope    10
  av negative value          14
  av positive value          8
  av value, region 3         12
  av value, region 5         16
  av value, region 6         7
  av value, region 6         6
  av value, region 8         10
  av value, region 8         13
  av value, region 9         10

a Average values calculated by dividing response plots into 10 equal regions.
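The slope and region-average descriptors of Table 8 could be computed as in the following sketch, assuming the footnote's 10 equal regions; the exact smoothing and time sampling of the response plots are not specified here and the function names are illustrative.

```python
def steepest_slopes(response):
    """Largest positive and largest negative point-to-point slope of a
    sensor's fluorescence-response trace (unit time step assumed)."""
    diffs = [b - a for a, b in zip(response, response[1:])]
    return max(diffs), min(diffs)

def region_averages(response, n_regions=10):
    """Average response within each of n_regions equal time segments,
    per the footnote to Table 8."""
    n = len(response)
    bounds = [round(i * n / n_regions) for i in range(n_regions + 1)]
    return [sum(response[lo:hi]) / (hi - lo)
            for lo, hi in zip(bounds, bounds[1:])]
```

Descriptors of this kind preserve both the magnitude and the temporal shape of a response, which is what lets the models separate concentration effects from analyte identity.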

at potential future applications of such a device for monitoring manufacturing processes. While the high rate of confusion between water (11) and the pseudoexplosive (19) may seem troubling, in retrospect it is not surprising. The pseudoexplosive used in the present study is a proprietary mixture of unknown composition. It is known, however, that it consists largely of water as a solvent. Because the concentration of the solution is unknown, it is not possible to adequately estimate the amount of pseudoexplosive needed for an accurate classification. The confusion matrix for the prediction set is shown in Figure 2B. The misclassifications in the prediction set are consistent with the results for the training set. Again, water and the pseudoexplosive yield the most errors. Table 6 summarizes the classification errors by concentration. As one might expect, the majority (55%) of the errors occurred for the low-concentration analytes. This is reasonable, as the response intensity is related to analyte concentration; as the concentration is lowered, the signal-to-noise ratio becomes less favorable. This likely has a significant impact on the discrimination of highly similar analytes, such as the colognes. Quantification. As in the previous study,14 a single three-layer feed-forward neural network model was developed for the identification of relative concentrations. Analytes were classified as being of high, medium, or low concentration. For chloroform and dichloroethane, these classes corresponded to 1:1, 1:2, and 1:3 dilutions with carrier air, respectively. For all other analytes, the concentrations corresponded to saturated, 1:1, and 1:2 dilutions with carrier air. As stated earlier, the numerical descriptors used for the quantification models were calculated from the smoothed change in fluorescence plots (i.e., the response plots were not normalized).
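A sketch of such a three-layer feed-forward model is given below: a forward pass with eight descriptor inputs, four hidden neurons, and a single output decoded against high/medium/low targets. The 8-4-1 sizing and the 0.95/0.50/0.05 targets follow the description in this section, while the sigmoid activations, the nearest-target decoding rule, and the placeholder weights are assumptions, not the trained model.

```python
import math

# Target encoding for the single output neuron, per the text.
TARGETS = {"high": 0.95, "medium": 0.50, "low": 0.05}

def _sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass of a three-layer 8-4-1 network: 8 descriptor
    inputs, 4 hidden sigmoid neurons, 1 sigmoid output."""
    hidden = [_sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
              for w, b in zip(w_hidden, b_hidden)]
    return _sigmoid(sum(wo * h for wo, h in zip(w_out, hidden)) + b_out)

def decode(output):
    """Assign the concentration class whose target is nearest."""
    return min(TARGETS, key=lambda name: abs(TARGETS[name] - output))
```

Training such a network adjusts `w_hidden`, `b_hidden`, `w_out`, and `b_out` so that the output lands near the encoded target for each observation.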
Descriptor subsets were investigated using an evolutionary optimization routine in which the search was guided by the rms error from a multiple linear regression. The target values for both the linear regression feature selection and the computational neural network analysis were 0.95, 0.50, and 0.05 for high, medium, and low concentrations, respectively. The eight-descriptor subset chosen by this method was then submitted to computational neural networks. A computational neural network can be thought of as a nonlinear mapping technique, and it was conceivable that this nonlinearity might yield an improvement over the linear model. A variety of network architectures were investigated, with the final architecture consisting of eight inputs, four hidden neurons, and a single output neuron. Initial starting weights and biases were selected using

an optimization routine based on generalized simulated annealing. The resulting CNN model gave correct classifications for 83% of the training set, 80% of the cross-validation set, and 80% of the prediction set. In previous work,13,14 quantification of the presented analytes was easily obtained, with a single network producing 97% correct classification for the prediction set. One possible explanation for the difference between the earlier13,14 and current studies is the diversity of the current analyte pool. In the previous study,7,13,14 a collection of nine relatively simple organic compounds was investigated. The current study contains several complex mixtures and a wider range of functional groups. This diversity precludes a sensor that responds to concentration differences alone, as the wide differences in functionality likely play a significant role in the interaction process for all the sensors. One possible solution to this wider diversity is analyte subsetting for the quantification models. An effort was made to create analyte subsets on the basis of the similarity of the responses to the sensors. The subsets are shown in Table 7. For the purposes of the LVQ quantification models, the carvone isomers were treated as a single analyte type. To avoid biasing the network in favor of this large class, observations were chosen at random for use in the training set. Separate LVQ models were then developed for each of the analyte subsets, using the approach described above. A 10-descriptor LVQ model, shown in Table 8, was developed and trained for the group 1 compounds. Interestingly, all of the sensors used in the classification models were also used in this model. However, two additional sensors were required for correct recognition of the relative concentrations. The correct concentration value was assigned for 96% of the training set members and 89% of the external prediction set members. The models for groups 2 and 3 are also shown in Table 8.
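The evolutionary descriptor-subset search guided by the rms error of a multiple linear regression, described above, might be organized like the following toy version. The population size, mutation scheme, and synthetic data are assumptions for illustration; in the actual application, `y` would hold the 0.95/0.50/0.05 concentration targets.

```python
import random
import numpy as np

def mlr_rms(X, y, subset):
    """RMS residual of a multiple linear regression restricted to the
    descriptor columns in `subset`, with an intercept term."""
    A = np.column_stack([X[:, list(subset)], np.ones(len(y))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = A @ coef - y
    return float(np.sqrt(np.mean(resid ** 2)))

def evolve_subset(X, y, k, generations=200, pop_size=20, seed=0):
    """Toy evolutionary search: keep the best half of a population of
    k-descriptor subsets, refill with point-mutated copies."""
    rng = random.Random(seed)
    n = X.shape[1]
    pop = [rng.sample(range(n), k) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda s: mlr_rms(X, y, s))
        survivors = pop[:pop_size // 2]
        children = []
        for parent in survivors:
            child = parent[:]
            child[rng.randrange(k)] = rng.randrange(n)  # point mutation
            # keep the mutation only if the subset stays distinct
            children.append(child if len(set(child)) == k else parent[:])
        pop = survivors + children
    return min(pop, key=lambda s: mlr_rms(X, y, s))
```

Using the fast linear-regression error as the fitness function, as the text describes, avoids retraining a neural network for every candidate subset; only the winning subset is passed on to the CNN.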
The group 2 and group 3 models contained descriptors encoding significantly different sensors. This further supports the hypothesis that the diversity of the analyte responses made a single quantification model unsatisfactory. Both models generalized well, with correct quantification assignments of 88% and 90% for the training sets and correct prediction set assignments of 92% and 89% for the group 2 and group 3 analytes, respectively. For the three models taken together, 93% of the training set and 90% of the prediction set were quantified correctly. A summary of the quantification errors is presented in Table 9. Again, most misassignments were made for medium- and low-concentration analytes. This result is consistent with what


Table 9. Quantification Errors by Analyte and Concentrationa

[Table layout lost in extraction; per-analyte entries could not be reliably recovered. Rows: acetone, butyl acetate, cologne-2, camphor, carvone, chloroform, dichloroethane, DMSO, cologne-1, water, heptane, 2-propanol, indole, mercaptoethanol, methanol, propanol, propionic acid, pseudoexplosive, toluene, and total errors. Columns: high, medium, and low concentration. Recoverable totals: 9 (1) errors at high, 13 (3) at medium, and 17 (2) at low concentration.]

a Parenthetical values represent misclassifications in the external prediction set. Other values are the combined errors for the training set and cross-validation set. As discussed in the text, the two carvone isomers were combined as a single analyte type.

was expected from an analysis of the response data. In the response data plots, it is clear that the medium- and low-concentration responses had very similar intensities (Figure 2B). The high-concentration responses were typically significantly more intense than the medium- and low-concentration responses. It


is worth noting that, while the final quantification models were developed using LVQ, it is not immediately clear that feed-forward networks were outperformed for the assignment of relative concentrations, as feed-forward models were not developed for the analyte subsets.

Conclusion. This report shows that the change in fluorescence intensity with time contains sufficient information to accurately identify and quantify a broad range of organic vapors. In addition to responding to a wide range of analytes, the sensor array has the capability to discriminate between analytes with very similar functionality. The numerical descriptors calculated from the change in fluorescence response plots encode information that can be used in a single LVQ neural network model to correctly classify the 20 vapors, with 89% of the training set and 90% of the external prediction set being correctly identified. In addition, the sensor array provides enough information to correctly assign relative concentrations to these analytes. Future work will consist of mixture component analysis, chemical property recognition, and further exploration of the quantification ability of the optical sensor array. A device capable of these operations, as well as analyte identification, would find many practical applications.

ACKNOWLEDGMENT
We thank Carolyn Bauer for technical assistance. This research was funded by the Office of Naval Research.

Received for review March 18, 1997. Accepted September 9, 1997.

AC970298K

Abstract published in Advance ACS Abstracts, October 15, 1997.