Anal. Chem. 2007, 79, 1485-1491

Fuzzy Rule-Building Expert System Classification of Fuel Using Solid-Phase Microextraction Two-Way Gas Chromatography Differential Mobility Spectrometric Data

Preshious Rearden,† Peter B. Harrington,*,† John J. Karnes,‡ and Christopher E. Bunker‡

Clippinger Laboratories, Center for Intelligent Chemical Instrumentation, Department of Chemistry and Biochemistry, Ohio University, Athens, Ohio 45701-2979 and Air Force Research Laboratory, Propulsion Directorate, Wright-Patterson Air Force Base, Dayton, Ohio 45433

Gas chromatography/differential mobility spectrometry (GC/DMS) has been investigated for characterization of fuels. Neat fuel samples were sampled using solid-phase microextraction (SPME) and analyzed using a micromachined differential mobility spectrometer with a photoionization source interfaced to a gas chromatograph. The coupling of DMS to GC offers an additional order of information in that two-way data are obtained with respect to compensation voltages and retention time. A fuzzy rule-building expert system (FuRES) was used as a multivariate classifier for the two-way gas chromatograms of fuels, including rocket (RP-1, RG-1), diesel, and jet (JP-4, JP-5, JP-7, JP-TS, Jet A-3639, Jet A-3688, Jet A-3690, Jet A-3694, and Jet A-generic) fuels. The GC/DMS with SPME was able to produce characteristic profiles of the fuels, and a classification rate of 95 ± 0.3% was obtained with a FuRES model. The classification system also classified every fuel sample correctly when applied one month later.

* To whom correspondence should be addressed. Phone: 740-517-8458. Fax: 740-593-0148. E-mail: [email protected].
† Ohio University.
‡ Air Force Research Laboratory.

10.1021/ac060527f CCC: $37.00 Published on Web 01/09/2007
© 2007 American Chemical Society

Fuel characterization/identification is important for quality assurance,1-3 arson investigation,4-6 and environmental analysis.7-9 Gas chromatography (GC) is often used for analyzing fuels because the fuels are volatile, and the high resolution achieved with GC capillary columns allows these complex mixtures to be characterized. Flame ionization and mass spectrometric detectors are the most commonly used GC detectors. Recently, the coupling of GC with a differential mobility spectrometry (DMS) detector has offered a new hyphenated system for gas-phase separation and detection.10-13 DMS offers low cost, simple design, high sensitivity, and selectivity. In this study, a DMS with a 10.6-eV photoionization source was used as an alternative GC detector for fuel identification. Photoionization is well-suited to fuel analysis because many of the fuel components will efficiently ionize to furnish cations, and confounding reactant ion peaks that are produced by the more prevalent ⁶³Ni β-emitter sources are eliminated. In addition, photoionization sources are not regulated, as are the radioactive ion sources. A disadvantage of photoionization is that a useful analytical negative ion spectrum is not obtained.

DMS is an ambient pressure, ion-separation technique that characterizes an ion by its change in gas-phase ion mobility with respect to strong and weak electric fields. These fields are generated via a 300-MHz asymmetric waveform. The separation of ions by their differential mobilities using asymmetric fields was introduced in 1993 by Buryakov and co-workers.14 A brief description of DMS operational principles will follow. A more comprehensive review of DMS principles and methodologies can be found elsewhere.15,16

(1) Morris, R. E.; Hammond, M. H.; Shaffer, R. E.; Gardner, W. P.; Rose-Pehrsson, S. L. Energy Fuels 2004, 18, 485-489.
(2) Striebich, R. C.; Motsinger, M. A.; Rauch, M. E.; Zabarnick, S.; Dewitt, M. Energy Fuels 2005, 19, 2445-2454.
(3) Murty, B. S. N.; Rao, R. N. Fuel Process. Technol. 2004, 85, 1595-1602.
(4) Vendeuvre, C.; Bertoncini, F.; Duval, L.; Duplan, J. L.; Thiebaut, D.; Hennion, M. C. J. Chromatogr., A 2004, 1056, 155-162.
(5) Zadora, G.; Borusiewicz, R.; Zieba-Palus, J. J. Sep. Sci. 2005, 28, 1467-1475.
(6) Sandercock, P. M. L.; Du Pasquier, E. Forensic Sci. Int. 2004, 140, 43-59.
(7) Lavine, B. K.; Brzozowski, D. M.; Ritter, J.; Moores, A. J.; Mayfield, H. T. J. Chromatogr. Sci. 2001, 39, 501-507.
(8) Raia, J. C.; Blakley, C. R.; Fuex, A. N.; Villalanti, D. C.; Fahrenthold, P. D. Environ. Forensics 2004, 5, 21-32.
(9) Cam, D.; Gagni, S. J. Chromatogr. Sci. 2001, 39, 481-486.
(10) Eiceman, G. A.; Krylov, E. V.; Nazarov, E. G.; Miller, R. A. Anal. Chem. 2004, 76, 4937-4944.
(11) Schmidt, H.; Tadjimukhamedov, F.; Mohrenz, I. V.; Smith, G. B.; Eiceman, G. A. Anal. Chem. 2004, 76, 5208-5217.
(12) Eiceman, G. A.; Nazarov, E. G.; Miller, R. A.; Krylov, E. V.; Zapata, A. M. Analyst 2002, 127, 466-471.
(13) Eiceman, G. A.; Krylov, E. V.; Tadjikov, B.; Ewing, R. G.; Nazarov, E. G.; Miller, R. A. Analyst 2004, 129, 297-304.
(14) Buryakov, I. A.; Krylov, E. V.; Nazarov, E. G.; Rasulev, U. K. Int. J. Mass Spectrom. Ion Processes 1993, 128, 143-148.
(15) Krylov, E. V. Int. J. Mass Spectrom. 2003, 225, 39-51.
(16) Miller, R. A.; Nazarov, E. G.; Eiceman, G. A.; King, A. T. Sens. Actuators, A 2001, 91, 301-312.

DMS is a microfabricated analyzer with a planar drift tube design that allows simultaneous characterization of both positive and negative ions. In DMS, the ions are carried between two parallel-plate electrodes spaced from 0.5 to 5 mm apart for which a high frequency asymmetric electric field is applied to one plate and the other is held at ground. This applied field, referred to as

Analytical Chemistry, Vol. 79, No. 4, February 15, 2007 1485

Figure 1. An adapted cross section schematic of a differential mobility spectrometer.31 The sample enters the spectrometer, where it is photoionized. The ions then pass between two electrodes, where a dispersion voltage is applied by a radio frequency (rf) generator, and the compensation voltage, a direct current (dc), is superimposed on the asymmetric waveform. The ions that traverse the electrodes are detected by electrometers.

the dispersion or separation voltage, causes ions to undergo fast oscillations perpendicular to the gas flow. Ions that do not change mobility with respect to field strength will travel between the electrodes and be detected. If the ion has different mobilities at different field strengths, it will experience a displacement toward one of the two electrodes. The ions that contact electrodes are neutralized and are no longer detectable. The net displacement of the ions can be corrected with a compensation voltage, a low dc voltage superimposed on the high-frequency asymmetric field. This dc voltage corrects the path of an ion so that it no longer has a net migration toward an electrode and the ion traverses the electrodes. A differential mobility spectrum is obtained by scanning the compensation voltage across a range of voltages. A peak at a given compensation voltage is a measure of the difference in the mobility of an ion between high and low electric fields. This measurement is used to characterize the ion. A schematic of a differential mobility spectrometer is given in Figure 1.

Several pattern recognition techniques have been enlisted for fuel analysis, including genetic algorithms,17,18 artificial neural networks,19-22 and principal component analysis.22,23 Pattern recognition is often used to classify fuels by only the chromatogram, even when multichannel detectors (e.g., mass spectrometric) are used. The additional information of the mass spectrum is typically used only to match peaks in the chromatogram to assist in alignment; otherwise, the mass order is not used by the classifier. The coupling of DMS to GC provides two-way data obtained with respect to compensation voltages and retention time.

Figure 2. Schematic of the experimental setup used for gas chromatography/differential mobility spectrometry (not drawn to scale). The differential mobility spectrometer (DMS) is interfaced to the gas chromatograph using stainless steel tubing, and the data are collected by a data acquisition system (DAQ).

(17) Lavine, B. K.; Moores, A. J.; Mayfield, H.; Faruque, A. Microchem. J. 1999, 61, 69-78.
(18) Lavine, B. K.; Brzozowski, D.; Moores, A. J.; Davidson, C. E.; Mayfield, H. T. Anal. Chim. Acta 2001, 437, 233-246.
(19) Rao, R. N. TrAC, Trends Anal. Chem. 2002, 21, 175-186.
(20) Santos, V. O.; Oliveira, F. C. C.; Lima, D. G.; Petry, A. C.; Garcia, E.; Suarez, P. A. Z.; Rubim, J. C. Anal. Chim. Acta 2005, 547, 188-196.
(21) Long, J. R.; Mayfield, H. T.; Henley, M. V.; Kromann, P. R. Anal. Chem. 1991, 63, 1256-1261.
(22) Doble, P.; Sandercock, M.; Du Pasquier, E.; Petocz, P.; Roux, C.; Dawson, M. Forensic Sci. Int. 2003, 132, 26-39.
(23) Johnson, K. J.; Synovec, R. E. Chemom. Intell. Lab. Syst. 2002, 60, 225-237.
(24) Harrington, P. D.; Vieira, N. E.; Espinoza, J.; Nien, J. K.; Romero, R.; Yergey, A. L. Anal. Chim. Acta 2005, 544, 118-127.
(25) Malinowski, E. R. Factor Analysis in Chemistry, 2nd ed.; John Wiley & Sons, Inc.: New York, 1991.

Two-way data are often referred to as two-dimensional data; however, it is incorrect to refer to these two-way data objects as two-dimensional because the GC way has a dimensionality of 3479 and the DMS way has 250 dimensions, so the dimensionality of each data object is 869 750 and not 2. The imprecise practice of referring to two-way data as two-dimensional stems from image analysis, but those dimensions are spatial and refer to the image and not the measurement dimensionality. In our work, FuRES was the pattern recognition technique used to classify the fuels using entire two-way data objects. The two-way data objects can be unfolded into vectors, classified, and then refolded back into the original two-way representation. One advantage of a FuRES model is that it provides a lucid mechanism of inference in that the variables responsible for the classification and the inductive logic of the classifier can be elucidated. The principal component transform (PCT) was used to provide a lossless compression of the data objects to speed up the FuRES algorithm. FuRES constructs a classification tree that allows the visualization of the inductive structure of the rules. The inductive classification tree is a collection of membership functions for which each branch is a multivariate fuzzy rule. The entropy of the system decreases as the classification tree branches. Membership functions in FuRES allow values between 0 and 1 and provide a measure of the degree of similarity of elements in the total population with the subsets.

Analysis of variance-principal component analysis (ANOVA-PCA)24 was used to evaluate the two-way data for differences among the data collected on different days with the GC/DMS. ANOVA is a statistical technique for determining the significance of the differences between two or more means and is a key but often neglected method for statistical decision-making from experimental data. PCA25 is used to reduce the number of variables in a data set while retaining as much of the variation in the original data set as possible. ANOVA-PCA separates the variance pertaining to the different factors of the experiment. The scores of each factor combined with the residual error can be displayed with 95%

Figure 3. The top figure is a positive two-way GC/DMS measurement for JetA3690. The figure to the right is the average DMS spectrum and the figure below is the average ion chromatogram.

confidence intervals about each level of the experimental design. A clear statistical rendering of the experimental design was obtained.

This study demonstrates the use of DMS as an alternative detector for fuel analysis and the use of the entire two-way data object for classification. Neat fuel samples were sampled using solid-phase microextraction (SPME) of the headspace. Direct headspace injections did not furnish useful signals. These fuels included rocket propellants (RP-1, RG-1), diesel, military jet propellant (JP-4, JP-5, JP-7, JP-TS), and commercial grade jet propellant (Jet A-3639, Jet A-3688, Jet A-3690, Jet A-3694, Jet A-generic) fuels. The PCT was used as a lossless compression method so that the number of variables equals the number of objects in the training set. These training set objects were then used for FuRES modeling. The PCT prior to FuRES reduces training time significantly. These studies will provide useful information from complex two-way data sets obtained using GC/DMS and characterized with chemometric techniques.

EXPERIMENTAL SECTION

Reagents. Neat samples of RP-1, RG-1, JP-4, JP-TS, JP-7, JP-5, Jet A-3639, Jet A-3688, Jet A-3690, Jet A-3694, Jet A-generic, and diesel fuels were provided by the Wright Patterson Air Force Base at Dayton, OH. The number following each Jet A fuel indicates the location where the fuel was obtained. Each sample was stored in a borosilicate glass vial with a polytetrafluoroethylene-lined cap at 20 °C and used as received.

Materials. All samples were sampled using headspace solid-phase microextraction (HS-SPME).26 All extractions were performed in 4-mL amber vials with polytetrafluoroethylene-lined silicone septa screw-top caps (Supelco Inc., Bellefonte, PA) and using a polydimethylsiloxane (PDMS, 100 µm) SPME fiber. The manual SPME holder and fiber were purchased from Supelco Inc. (Bellefonte, PA). Prior to use, the fiber was conditioned as recommended by the manufacturer in the GC injector port at 250 °C for 1 h.

Instruments. All experiments were performed on a GC/DMS system consisting of a HP 5890A (Agilent Technologies, Palo Alto, CA) gas chromatograph interfaced to a differential mobility spectrometer (model SDP-1, Sionex Corporation, Bedford, MA). The differential mobility spectrometer comprised two micromachined parallel plate electrodes with a 1-mm flow channel and an ultraviolet lamp (10.6 eV) photoionization source. A schematic of the GC/DMS setup is given in Figure 2.

Figure 4. Two-way data objects of four fuel types: (A) commercial, (B) military, (C) diesel, and (D) rocket propellant. Chromatograms are representative of all the chromatograms obtained with GC/DMS.

Figure 5. ANOVA-PCA score plot giving the date experimental factor. There is an obvious difference in data collected on day 1 (101605). The first and second principal components account for 31% of the total variance. Percentage of principal components is given in parentheses with the absolute variance. A 95% confidence interval is drawn around the mean of each day.

(26) Zhang, Z. Y.; Pawliszyn, J. Anal. Chem. 1993, 65, 1843-1852.

Table 1. Fuel Samples and Replicates

fuel            no. of samples   validation set
diesel          5                1
Jet A-3639      6                1
Jet A-3688      5                1
Jet A-3690      5                1
Jet A-3694      5                1
Jet A-generic   5                1
JP4             5                1
JP5             5                1
JP7             4                1
JPTS            5                1
RG1             5                1
RP-1            5                1
total           60               12

Gas chromatography was carried out on a cross-linked 5% diphenyl/95% dimethyl polysiloxane (Rtx-5MS; Restek Corporation, Bellefonte, PA) wall-coated open tubular column (30 m × 0.25-mm i.d., 0.25-µm film thickness). The GC and DMS were interfaced using stainless steel tubing and a Swagelok (Solon, OH) tee union. The total transfer line length was 14 cm. The GC column penetrated 2.5 cm into the DMS inlet. Compressed air, regulated by a mass flow controller (model 1159B, MKS Instruments, Andover, MA), was used as makeup drift gas for the DMS at a flow rate of 250 mL/min. The makeup and transfer gas lines were heated to 100 °C using heating tape to prevent sample condensation. The GC carrier gas was helium at a constant flow rate of 3.0 mL/min. The GC injector was operated in the splitless mode with a purge delay of 2 min. The temperature of the GC oven was programmed from 50 °C (3 min hold) to 200 °C (5 min hold) at 3 °C/min. The DMS was operated at a dispersion or rf voltage of 1500 V. The DMS spectra were acquired at a 1 Hz rate and ranged from -13 to +6 V. DMS is capable of simultaneous detection of both positive and negative ions; however, only positive ion mode data were reported because no peaks were observed when the DMS was operated in negative polarity.

Data Collection. A 100-µL aliquot of neat jet fuel sample was placed in the vial and allowed to equilibrate for 30 min. The SPME fiber was then exposed to the headspace for 3 min. After extraction, the SPME fiber was immediately transferred to the GC injector, and chromatographic analysis by GC/DMS was performed. The desorption time was 5 min, and the desorption temperature was 250 °C. The fiber was placed in the GC injector for 20 min between runs to minimize carryover effects.

Figure 6. Principal component analysis scores for training data set. Each letter represents a replicate of the fuel sample. This graph indicates no apparent separation of fuel type. The first and second principal components account for 45% of the total variance. The percent variance spanned by the principal components is given in parentheses with the absolute variance. A 95% confidence interval is drawn around the mean of each class.

Figure 7. The effect of polynomial order on alignment of spectral data for FuRES classification of fuels. NA refers to the evaluation without alignment. The fourth-order polynomial provided the best classification rate for the jet fuel data. Each result is reported with 95% confidence intervals for five bootstrapped Latin partitions.

(27) Wan, C.; Harrington, P. B. Anal. Chim. Acta 2000, 408, 1-12.
(28) Wehrens, R.; Putter, H.; Buydens, L. M. C. Chemom. Intell. Lab. Syst. 2000, 54, 35-52.
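As a quick consistency check on the oven program given in the Instruments paragraph, the total run time follows from the two holds and the single ramp. The helper name below is illustrative only and is not part of the original work.

```python
def oven_run_time_min(start_c, end_c, rate_c_per_min, initial_hold_min, final_hold_min):
    """Total run time of a single-ramp oven program, in minutes."""
    ramp_min = (end_c - start_c) / rate_c_per_min
    return initial_hold_min + ramp_min + final_hold_min

# Values from the text: 50 °C (3 min hold) to 200 °C (5 min hold) at 3 °C/min.
run_time = oven_run_time_min(50.0, 200.0, 3.0, 3.0, 5.0)
print(run_time)  # 58.0
```

The 58-min result agrees with the upper limit of the retention time axis used in the data processing.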

Six replicates of each of the 12 fuels were collected, with the exception of Jet A-3694 (4 replicates) and JP4 (6 replicates). All samples were collected using a random block design. The entire series of fuel samples was run once in random order in each of the replicate blocks. The 12 fuel samples were rerun as blind unknowns one month after the initial spectra were collected. A LabView virtual instrument (VI) software program was used to acquire the data. Data were collected on a Dell notebook computer with a 1.20-GHz Intel Celeron processor interfaced to the GC/DMS via a data acquisition board, type DAQCard-6024E (National Instruments, Austin, TX).

Data Processing. The two-way data were acquired as comma-separated text files and converted to Microsoft Office Excel workbooks using Excel (Microsoft Office Excel 2003, Redmond, WA). MATLAB 2006a 64 bit SP1 (Mathworks; Natick, MA) was used to process data and generate graphs. MATLAB ran under Windows XP Professional 64 bit edition with SP1 (Microsoft; Redmond, WA). The computer was a home-built Opteron 150 Venus processor (Advanced Micro Devices; Sunnyvale, CA) with 4 GB of dual channel DDR PC4000 RAM (OCZ Technology; Sunnyvale, CA). The CPU operated at 2.8 GHz (256 × [email protected] V) and was cooled with a CL-P0024 copper cooling heat sink with 2 A2017 90-mm fans (Thermaltake, City of Industry, CA). The motherboard was a socket 939 LanParty nF4 Ultra-D (DFI Corp; San Jose, CA). Each data file was read by MATLAB as a MS Excel spreadsheet and converted to a data matrix such that the rows corresponded to retention times and the columns corresponded to the compensation voltages from the DMS. For the entire data set, a data tensor of 60 × 2240 × 250 was collected for the sample, retention time, and compensation voltage ways, respectively. The class designations were represented by a 60 × 12 binary encoded matrix. Each column designated one of the 12 fuel classes.
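The binary class matrix can be illustrated with a short sketch. The replicate counts below follow the Data Collection text (six JP4, four Jet A-3694) and assume five retained replicates for every other fuel, which gives the 60 objects; the function name `one_hot` is hypothetical.

```python
# Fuel classes in the order used by the confusion matrix (Table 2).
FUELS = ["diesel", "JP4", "JP5", "JP7", "JPTS", "JetA-3639", "JetA-3688",
         "JetA-3690", "JetA-3694", "JetA-generic", "RG1", "RP1"]

# Assumed replicate counts: six JP4, four Jet A-3694, five otherwise.
counts = {fuel: 5 for fuel in FUELS}
counts["JP4"] = 6
counts["JetA-3694"] = 4

labels = [fuel for fuel in FUELS for _ in range(counts[fuel])]

def one_hot(labels, classes):
    """Return a binary matrix with one row per object and one column per class."""
    index = {c: j for j, c in enumerate(classes)}
    matrix = []
    for lab in labels:
        row = [0] * len(classes)
        row[index[lab]] = 1
        matrix.append(row)
    return matrix

Y = one_hot(labels, FUELS)
print(len(Y), len(Y[0]))  # 60 12
```

Each row contains a single 1 in the column of its fuel class, so the column sums recover the number of objects per class.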
Each spectrum was baseline-corrected by subtracting the mean of the DMS intensities calculated from the five outermost points on each side of the spectrum. These points corresponded to compensation voltages of [-13, -11.8] V and [+4.4, +5.9] V. A linear interpolation was used to standardize the retention time data to 1-s intervals from 2 s to 58 min, which yielded uniform sets of retention time measurements.

The FuRES models were evaluated using Latin partitions. The Latin-partition method randomly divides the data set into training and testing set pairs so that every spectrum in the set is used once and only once in a prediction set.27 Latin partitions maintain a constant class distribution among the training and testing sets during the random sampling process. Five training/testing set pairs were created using Latin partitions. Each training data set contained 48 two-way objects, and each testing data set contained 12 two-way objects. The results of the five Latin partitions were pooled. Because the objects are randomly sampled without replacement for the training and test sets, these evaluations can be bootstrapped.28 The partitions were bootstrapped five times so that the average prediction results could be reported with confidence intervals. For five bootstraps, 25 FuRES models were constructed. No two-way object used for testing or prediction was ever used for model building or optimization in any of the evaluations, including the PCT and alignment steps.
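The baseline correction and retention time standardization described above can be sketched as follows. This is a minimal illustration using NumPy on synthetic data, not the original MATLAB code; the function names and the synthetic scan times are assumptions.

```python
import numpy as np

def baseline_correct(spectrum, n_edge=5):
    """Subtract the mean of the n_edge outermost points on each side."""
    spectrum = np.asarray(spectrum, dtype=float)
    edge = np.concatenate([spectrum[:n_edge], spectrum[-n_edge:]])
    return spectrum - edge.mean()

def resample_to_seconds(times_s, data, t_start=2, t_stop=58 * 60):
    """Linearly interpolate each compensation-voltage channel onto a 1-s grid."""
    grid = np.arange(t_start, t_stop + 1, dtype=float)
    data = np.asarray(data, dtype=float)
    out = np.column_stack([np.interp(grid, times_s, data[:, j])
                           for j in range(data.shape[1])])
    return grid, out

# Synthetic stand-in for one run: irregular scan times, 250 channels, offset 0.3.
rng = np.random.default_rng(0)
times = np.sort(rng.uniform(0.0, 3500.0, size=2240))
raw = 0.3 + np.zeros((times.size, 250))
raw[:, 100:110] += 1.0  # a fake peak away from the spectrum edges

corrected = np.apply_along_axis(baseline_correct, 1, raw)
grid, uniform = resample_to_seconds(times, corrected)
print(uniform.shape)  # (3479, 250)
```

A 1-s grid from 2 s to 58 min has 3479 points, which matches the dimensionality of the GC way quoted in the introduction (3479 × 250 = 869 750 variables per object).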

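The partitioning scheme can be sketched in a few lines. The code below is a simplified, class-stratified split in the spirit of the Latin-partition description (every object appears in exactly one test set, and class proportions are kept roughly constant); it is not the authors' implementation, and the label vector is hypothetical.

```python
import random

def latin_partitions(labels, n_parts, rng):
    """Split object indices into n_parts test sets, stratified by class.

    Every index lands in exactly one test set; the matching training set is
    the union of the other partitions.
    """
    by_class = {}
    for i, lab in enumerate(labels):
        by_class.setdefault(lab, []).append(i)
    parts = [[] for _ in range(n_parts)]
    k = 0  # dealing position carried across classes to keep sizes equal
    for lab in sorted(by_class):
        members = by_class[lab]
        rng.shuffle(members)
        for i in members:
            parts[k % n_parts].append(i)
            k += 1
    return parts

# Hypothetical label vector: 12 classes, 60 objects (one class of 6, one of 4).
sizes = [5, 6, 5, 5, 5, 5, 5, 5, 4, 5, 5, 5]
labels = [c for c, n in enumerate(sizes) for _ in range(n)]

rng = random.Random(1)
n_bootstraps, n_parts = 5, 5
n_models = 0
for _ in range(n_bootstraps):
    parts = latin_partitions(labels, n_parts, rng)
    for test_idx in parts:
        train_idx = [i for p in parts if p is not test_idx for i in p]
        assert len(test_idx) == 12 and len(train_idx) == 48
        n_models += 1  # one classifier would be trained per train/test pair
print(n_models)  # 25
```

Repeating the random partitioning five times gives five pooled prediction results, from which an average and a confidence interval can be computed.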

Table 2. Average Confusion Matrix with 95% Confidence Intervals for Fourth-Order Chromatographic Alignment of Two-Way Fuel Measurements and Five Bootstrapped Latin Partitions with Five Partitions Per Bootstrap and the FuRES Classifier

Rows give the actual class; columns give the predicted class.

fuel          diesel  JP4  JP5        JP7  JPTS  JetA-3639  JetA-3688  JetA-3690  JetA-3694  JetA-generic  RG1  RP1
diesel        5       0    0          0    0     0          0          0          0          0             0    0
JP4           0       6    0          0    0     0          0          0          0          0             0    0
JP5           0       0    5          0    0     0          0          0          0          0             0    0
JP7           0       0    0          5    0     0          0          0          0          0             0    0
JPTS          0       0    0.6 ± 0.7  0    4.6   0          0          0          0.2 ± 0.6  0.2 ± 0.6     0    0
JetA-3639     0       0    0          0    0     4.6 ± 0.7  0          0.2 ± 0.6  0          0.2 ± 0.6     0    0
JetA-3688     0       0    0          0    0     0          4.8        0.2 ± 0.6  0          0             0    0
JetA-3690     0       0    0          0    0     0          0          5          0          0             0    0
JetA-3694     0       0    0          0    0     0          0          0          4          0             0    0
JetA-generic  0       0    1          0    0     0          0          0          0          4             0    0
RG1           0       0    0          0    0     0          0          0          0          0             5    0
RP1           0       0    0          0    0     0          0          0          0          0             0    5
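A confusion matrix of this kind is tallied from paired actual and predicted labels, with the actual class indexing the rows and the predicted class indexing the columns. The sketch below uses three hypothetical classes rather than the fuel data.

```python
def confusion_matrix(actual, predicted, classes):
    """Rows index the actual class, columns the predicted class."""
    index = {c: k for k, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix

def classification_rate(matrix):
    """Fraction of objects on the diagonal (correctly classified)."""
    correct = sum(matrix[k][k] for k in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

# Toy example with three hypothetical classes (not the paper's data).
classes = ["A", "B", "C"]
actual    = ["A", "A", "B", "B", "B", "C"]
predicted = ["A", "B", "B", "B", "B", "C"]
cm = confusion_matrix(actual, predicted, classes)
print(cm)  # [[1, 1, 0], [0, 3, 0], [0, 0, 1]]
print(round(classification_rate(cm), 3))  # 0.833
```

Averaging such matrices over the bootstraps produces non-integer entries like those in Table 2.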

Figure 8. FuRES classification tree for 12 fuels with a 95% classification rate. Nc = number of samples, H = entropy values. The numbers indicate the number of rules used to build the tree. There is no splitting of the fuels among the leaf (circle) nodes. All fuels were separated using this 11-rule tree.
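To illustrate how a fuzzy rule can lower the entropy of a node, the sketch below scores a two-way split with a membership-weighted class entropy. The logistic membership function, the `temperature` parameter, and the weighting scheme are assumptions made for illustration; the exact FuRES formulation is not given in this text and should be taken from the FuRES literature.

```python
import math

def logistic_membership(score, temperature=1.0):
    """Degree of membership in (0, 1) for a projected score."""
    return 1.0 / (1.0 + math.exp(-score / temperature))

def fuzzy_split_entropy(scores, labels, temperature=1.0):
    """Membership-weighted class entropy of a two-way fuzzy split."""
    classes = sorted(set(labels))
    mu = [logistic_membership(s, temperature) for s in scores]
    total = float(len(scores))
    entropy = 0.0
    for side in (mu, [1.0 - m for m in mu]):
        weight = sum(side)
        if weight == 0.0:
            continue
        for c in classes:
            p = sum(m for m, lab in zip(side, labels) if lab == c) / weight
            if p > 0.0:
                entropy -= (weight / total) * p * math.log(p)
    return entropy

# Hypothetical one-dimensional rule scores for two classes.
labels = ["a", "a", "a", "b", "b", "b"]
good = [-4.0, -5.0, -6.0, 4.0, 5.0, 6.0]   # rule separates the classes
poor = [-1.0, 1.0, -1.0, 1.0, -1.0, 1.0]   # rule mixes the classes

h_good = fuzzy_split_entropy(good, labels)
h_poor = fuzzy_split_entropy(poor, labels)
print(h_good < h_poor < math.log(2))  # True
```

A rule that separates the classes drives the entropy toward zero, whereas a rule that mixes them leaves it close to the entropy of the unsplit node (ln 2 for two equally sized classes).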

Retention time drift due to variations in pressure, temperature, and flow rate occurring during the chromatographic run was adjusted by using polynomial interpolation. The fminsearch function of MATLAB was used to maximize the correlation between the individual spectra and the two-way mean of the samples. Polynomial orders 0-7 were used to evaluate the FuRES model to find the best order to align the spectra. After retention time drift adjustment, the data were unfolded to yield vectors with 869 750 dimensions for each object. These objects were compressed using the PCT and normalized to unit length. The FuRES model and parameters optimized by the Latin-partition study were validated by running the same 12 fuel samples on the GC/DMS one month after the initial fuel collection.

RESULTS AND DISCUSSION

GC/DMS Analysis. Twelve fuel samples were evaluated for classification. An example of the two-way GC/DMS measurement for the JetA-3690 fuel sample is given in Figure 3 as an image along with the average or marginal chromatogram and DMS spectrum. Note that this spectrum was obtained one month after the initial collection of data and was one of the 12 used in the validation study.
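The unfolding, PCT compression, and unit-length normalization steps from the data processing can be sketched with NumPy. The sketch assumes the PCT is an uncentered projection onto the right singular vectors of the training matrix, which is one way to obtain as many variables as training objects without losing information; the function names and the small synthetic tensor are illustrative.

```python
import numpy as np

def unfold(tensor):
    """Unfold (objects, times, voltages) into (objects, times * voltages)."""
    return tensor.reshape(tensor.shape[0], -1)

def pct_fit(train):
    """Return principal-component loadings spanning the rows of `train`."""
    # Economy-size SVD: with n objects << n variables, at most n components
    # carry variance, so n scores per object preserve the training data exactly.
    _, _, vt = np.linalg.svd(train, full_matrices=False)
    return vt

def pct_transform(data, vt):
    """Project onto the loadings, then scale each object to unit length."""
    scores = data @ vt.T
    return scores / np.linalg.norm(scores, axis=1, keepdims=True)

# Small synthetic tensor (the study's objects were 3479 x 250 per sample).
rng = np.random.default_rng(0)
tensor = rng.normal(size=(48, 40, 25))
X = unfold(tensor)                    # 48 x 1000
vt = pct_fit(X)                       # 48 x 1000 loadings
scores = X @ vt.T                     # 48 x 48 compressed training set

# Lossless for the training objects: back-projection reproduces X.
print(np.allclose(scores @ vt, X))    # True
Z = pct_transform(X, vt)
print(Z.shape)                        # (48, 48)
refolded = X[0].reshape(40, 25)       # refold one object for display
```

Test objects would be projected with the loadings fitted on the training set only, in keeping with the rule that prediction objects never take part in the PCT step.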

The fuels can be subdivided into four groups: rocket propellants (RP-1, RG-1), diesel, military (JP-4, JP-5, JP-7, and JP-TS), and commercial (Jet A-3639, Jet A-3688, Jet A-3690, Jet A-3694, and Jet A-generic) jet fuels. Images of the two-way data objects of the fuels from each fuel type group are given in Figure 4. These images have been aligned, and the range of the image was reduced from [0, 58] min to [3, 40] min and from [-13, 6] V to [-8, 2] V for display purposes. A plot of a two-way data object (retention time as a function of compensation voltage) for a fuel, with the averages for each order, is given in Figure 4.

ANOVA-PCA Analysis. ANOVA-PCA was applied to assess the data, and one factor that was studied was the date for each block of data collection. ANOVA-PCA revealed that the first study was different from the rest. After the flaw in the first day of data collection was recognized, an additional block of data was collected two weeks later. For all six blocks of data, the date factor ANOVA-PCA scores for the first two principal components are given in Figure 5. The first day of collection (A) was determined to be different with respect to precision (i.e., size and shape of the 95% confidence ellipses) and bias (i.e., the location of the center of the ellipse), so this block of data was discarded. Confidence intervals (95%) obtained from the Hotelling T² statistics29 are drawn around the means for each class. A difference in instrumental warm-up time was believed to be responsible for the differences of the first day. After removing these data, 60 measurements that comprised two-way data objects were used for evaluating the FuRES classifier. The fuels used for this study are given in Table 1. ANOVA-PCA was also used to evaluate the degree of class separation or clustering for the remaining two-way data. The ANOVA-PCA scores for the class factor are given in Figure 6. The first two principal components account for 45% of the total variance.
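The idea behind the date-factor score plot can be sketched as a simplified ANOVA-PCA-style decomposition: the grand mean and the fuel-class effect are removed, and PCA is applied to the date-level means plus the residuals. The sequential order of the decomposition, the function names, and the synthetic balanced design are assumptions for illustration; the Hotelling T² confidence ellipses are omitted.

```python
import numpy as np

def level_means(X, levels):
    """Replace each row by the mean of the rows sharing its factor level."""
    out = np.zeros_like(X)
    for lev in np.unique(levels):
        rows = levels == lev
        out[rows] = X[rows].mean(axis=0)
    return out

def anova_pca(X, date, fuel, n_components=2):
    """Scores for the date factor after removing the fuel-class effect."""
    X = np.asarray(X, dtype=float)
    date, fuel = np.asarray(date), np.asarray(fuel)
    centered = X - X.mean(axis=0)            # remove the grand mean
    fuel_effect = level_means(centered, fuel)
    remainder = centered - fuel_effect
    date_effect = level_means(remainder, date)
    residual = remainder - date_effect
    test = date_effect + residual            # matrix examined for the date factor
    _, s, vt = np.linalg.svd(test, full_matrices=False)
    scores = test @ vt[:n_components].T
    explained = (s**2 / np.sum(s**2))[:n_components]
    return scores, explained

# Synthetic balanced design: 3 dates x 4 fuels, with a bias added on date 0.
rng = np.random.default_rng(0)
date = np.repeat(np.arange(3), 4)
fuel = np.tile(np.arange(4), 3)
X = rng.normal(scale=0.1, size=(12, 50)) + fuel[:, None]
X[date == 0] += 1.0
scores, explained = anova_pca(X, date, fuel)
print(scores.shape)  # (12, 2)
```

In this synthetic case the biased day separates from the other days along the first principal component, which is the kind of pattern that led to the first block of data being discarded.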
Considerable overlap among fuel classes was observed. The first two principal components do not indicate linear separability of the data set.

Classification Model. Retention time drift is an important source of variation for gas chromatography, especially when the chromatograms are used for pattern recognition. A polynomial retention time alignment method was applied to the two-way objects so that each two-way object would maximize its correlation with the mean spectrum in the training set of data. If too high a polynomial order is used, the two-way data objects may become distorted when the algorithm is trapped in local minima. The effect of retention time alignment for each two-way object was evaluated by the FuRES prediction rates. Because Latin partitions were used, a precise statistical analysis was obtained for polynomial orders 0-7. An evaluation without retention time alignment was included as a control. The average number of misclassified two-way measurements for the 60 fuel samples using no alignment and as a function of the order of the polynomial used for alignment is given in Figure 7. Latin partitions were used with five bootstraps to generate the 95% confidence intervals. From these results, one can see that alignment improves classification until the order of the polynomial increases above 4; beyond that point, the interpolation model begins to overfit the data. The quartic polynomial was optimal with respect to overall classification performance and was used for the other studies.

The FuRES models built from the quartic polynomial retention time alignment correctly identified 57 of the 60 fuel measurements. The prediction results for FuRES with the quartic polynomial alignment are reported as a confusion matrix in Table 2. A confusion matrix was used to evaluate the performance of the classification model by providing information about the actual and predicted results. In the confusion matrix, information about the occurrence of each element in the predicted class is located in the columns and for the actual class, in the rows.30,31 The number of correctly classified fuels for each class can be seen along the diagonal of the matrix, and the number of misclassified fuels is located along the off diagonal.

(29) Vandeginste, B. G. M.; Massart, D. L.; Buydens, L. M. C.; De Jong, S.; Lewi, P. J.; Smeyers-Verbeke, J. In Data Handling in Science and Technology; Elsevier: New York, 1998; Vol. 20B, pp 228.
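The polynomial retention time alignment discussed in this section can be sketched for a single one-dimensional profile. The sketch warps the time axis by a polynomial in scaled time and searches for coefficients that maximize the correlation with a target profile. A simple pattern search stands in for MATLAB's fminsearch (a Nelder-Mead simplex method), the example uses a first-order polynomial rather than the quartic, and all names and the synthetic drift are assumptions.

```python
import numpy as np

def warp(profile, coeffs):
    """Resample `profile` at t + p(t), where p is a polynomial in scaled time."""
    n = profile.size
    t = np.arange(n, dtype=float)
    u = t / (n - 1)                                   # scaled time in [0, 1]
    shift = np.polynomial.polynomial.polyval(u, coeffs)
    return np.interp(t + shift, t, profile)

def correlation(a, b):
    """Pearson correlation between two profiles."""
    return float(np.corrcoef(a, b)[0, 1])

def align(profile, target, order, step=4.0, min_step=0.01):
    """Pattern search for polynomial coefficients maximizing correlation."""
    coeffs = np.zeros(order + 1)
    best = correlation(warp(profile, coeffs), target)
    while step > min_step:
        improved = False
        for k in range(coeffs.size):
            for delta in (step, -step):
                trial = coeffs.copy()
                trial[k] += delta
                r = correlation(warp(profile, trial), target)
                if r > best:
                    coeffs, best, improved = trial, r, True
        if not improved:
            step /= 2.0                               # refine the search
    return coeffs, best

def peaks(t):
    """Three Gaussian peaks standing in for a chromatogram."""
    return sum(np.exp(-0.5 * ((t - c) / 6.0) ** 2) for c in (80.0, 200.0, 310.0))

t = np.arange(400, dtype=float)
target = peaks(t)                     # stands in for the training-set mean
drifted = peaks(1.02 * t + 3.0)       # hypothetical linear retention drift
before = correlation(drifted, target)
coeffs, after = align(drifted, target, order=1)
print(after > before)
```

For two-way objects, the same warp would be applied along the retention time way of every compensation-voltage channel, with the correlation computed against the two-way mean of the training set.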
When classifying the replicates of each fuel, JP4 was misclassified 0.4 times, in two of five bootstraps as Jet A-3694. Jet A-3639 was misclassified 0.2 times, in one of five bootstraps as Jet A-generic. JP-TS was misclassified 0.4 times, in two of five bootstraps as Jet A-3694 and 0.6 times in three of five bootstraps as JP5. One spectrum in Jet A-generic was misclassified in five of five bootstraps as JP-5. The consistent misclassification of the Jet A-generic spectrum as a JP-5 spectrum occurred for only a single replicate; however, a closer inspection of the five JP-5 spectra did not show any obvious differences.

The entire two-way data set was used to construct the FuRES classification tree given in Figure 8 for all 12 fuels in Table 1. The FuRES algorithm used 11 rules to build the classification tree.

The 12 blind fuel unknowns were collected with the GC/DMS one month after the initial fuel samples. The blind fuel unknowns were subjected to the same data processing and pattern recognition analysis as the 60 fuel samples that yielded the optimum results from the Latin partitions (i.e., a fourth-order alignment). All 12 blind unknown fuel samples were accurately classified, yielding a 100% classification rate.

CONCLUSIONS

FuRES models can be used to classify fuels from GC/DMS two-way measurements. The use of the full two-way data object facilitates retention time alignment and classification. Using SPME as a sampling method eliminates problems with solvent peaks and provides a simple procedure for sample introduction. The classification rate for the model used in this study was 95 ± 0.3%. The robustness of the model was demonstrated by correctly classifying fuel samples after one month. GC/DMS with pattern recognition techniques such as FuRES and polynomial retention time alignment can be used to characterize complex samples, such as fuels, arson accelerants, flavors, fragrances, and volatile environmental samples.

ACKNOWLEDGMENT

The Center for Intelligent Chemical Instrumentation and Department of Chemistry and Biochemistry at Ohio University are acknowledged for their financial support. The USAF/WPAFB is thanked for the funding of this project and providing the jet fuel samples. The Research Corporation is thanked for the Research Opportunity Award. The Sionex Corporation is thanked for the differential mobility spectrometer (model SDP-1). Special thanks to Erkinjon Nazarov for his useful suggestions regarding the differential mobility spectrometer. Ping Chen and Yao Lu are thanked for their helpful comments and suggestions.

(30) James, M. Classification Algorithms; John Wiley & Sons: London, 1985, pp 82.
(31) Rainsberg, M. R.; Harrington, P. D. B. Appl. Spectrosc. 2005, 59, 754-762.

Received for review March 22, 2006. Accepted November 16, 2006. AC060527F
