Use of Wavelet Packet Transform in Characterization of Surface Quality


Ind. Eng. Chem. Res. 2007, 46, 5152-5158

J. Jay Liu,† Daeyoun Kim,‡ and Chonghun Han*,‡
Samsung Electronics, Myeongam-ri 200, Tangjeong-myeon, Asan, Chungchengnam-do, South Korea, and School of Chemical and Biological Engineering, Seoul National University, San 56-1, Shillim-dong, Kwanak-gu, Seoul 151-742, Korea

Feature extraction is a crucial step in pattern recognition problems as well as in methods for characterizing the quality of a product surface (Liu, J. Ph.D. thesis, McMaster University, Canada, 2004). In this paper, different types of wavelet transforms, that is, the wavelet packet transform and the discrete wavelet transform, are compared in the feature extraction step for classification of the surface quality of rolled steel sheets (Bharati, M.; et al. Chemom. Intell. Lab. Syst. 2004, 72, 57-71). Using this real-world industrial example, we have experimentally shown that the wavelet packet transform is superior to the discrete wavelet transform in terms of classification performance and Fisher’s criterion. We also propose an easy but powerful method to determine the optimal decomposition level. A closer look at the characteristics of the image data reveals that as a result of its equal frequency bandwidth, wavelet packet transform is more suitable for extracting textural features when textural information from different classes of images is not confined within a certain (spatial) frequency region.

1. Introduction

Cutting-edge information technology has made tremendous amounts of process data available to us on an on-line, real-time basis. As it became possible to monitor the state of processes based on real-time data, multivariate statistical methods have been introduced in chemical engineering to analyze data and extract information from those data. Such methods have been successfully applied for process monitoring, optimization, fault diagnosis, and safety.1-7 However, as chemical industries change their focus from commodity chemicals to specialized chemicals, the product quality has increased in importance. Previously a lab technician would measure the quality of a product by the visual appearance of the product surface, but now digital image sensors can be used to monitor product surfaces.
The visual appearance of a product surface is very important in cases where the product is used mainly for display or as an outer part of other products. For example, the surface quality or surface roughness of steel sheets is important to automakers because these can be major sources of disturbance that can affect the coating quality after coating processes. In such cases, the visual quality (i.e., visual appearance) and the physical or mechanical qualities of a product surface need to be controlled or maintained. Although machine vision has been successfully applied to manufacturing industries, its main application area has been in automatic assembly and/or inspection where scenes in images from such applications are deterministic. On the other hand, in process industries there is a different class of problems where the major concern is the stochastic visual appearance of products or processes.8 Examples include, but are not limited to, the color and appearance of mineral flotation froth,9 visible patterns such as stripes and swirls on the surfaces of injection-molded plastic panels,10 and the surface quality of rolled steel sheets.11

* To whom correspondence should be addressed. Tel.: 82-2-8801887 E-mail: [email protected]. † Samsung Electronics. ‡ Seoul National University.

A new machine vision methodology for process industries was proposed by Liu12 to handle the stochastic nature of the visual quality of industrial processes and products. The proposed methodology has been successfully applied to industrial processes, including the examples mentioned earlier. This paper compares two different wavelet transform techniques that can be used for feature extraction. The feature extraction step is crucial because it can determine the overall performance of the machine vision system. The difference between the two wavelet transforms, the wavelet packet transform (WPT) and the discrete wavelet transform (DWT), will be examined in the classification of surface quality for rolled steel sheets, which was the industrial data set used in Bharati et al.,11 where the DWT showed the best performance among other texture analysis techniques. Also, we discuss a simple but powerful method for determining an optimal decomposition level for wavelet transform, which can be used for searching optimal wavelet packets.

The rest of this paper is organized as follows. Section 2 briefly describes the theories used in the methodology, namely wavelet transforms and Fisher’s discriminant analysis (FDA). In section 3, the implementation of the new machine vision methodology using a texture classification for steel quality is discussed in detail. Experimental results for texture classification are presented in section 4. In the last section, some useful conclusions are given, and further discussions are presented.

2. Methodology for Surface Quality Characterization

2.1. Discrete Wavelet Transform. The DWT can be derived from discretization of the continuous wavelet transform (CWT) at dyadic scale $a = 2^j$ and shift $\tau = 2^j k$ ($j \in \mathbb{Z}^+$, $k \in \mathbb{Z}$), as given by

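$$
\mathrm{CWT}(a,\tau) = \frac{1}{\sqrt{a}} \int f(s)\, \psi\!\left(\frac{\tau - s}{a}\right) ds
\;\Rightarrow\;
\mathrm{DWT}(j,k) = \frac{1}{\sqrt{2^j}} \int f(s)\, \psi\!\left(\frac{2^j k - s}{2^j}\right) ds \tag{1}
$$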

where f(s) is the signal to be analyzed and ψ(s) is the mother wavelet or basis function.

10.1021/ie061348r CCC: $37.00 © 2007 American Chemical Society Published on Web 06/16/2007

Ind. Eng. Chem. Res., Vol. 46, No. 15, 2007 5153

Figure 1. Separable structure for 2-D DWT at the jth decomposition stage.
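The separable stage of Figure 1 and the energy signatures built from its sub-images can be illustrated with a minimal sketch. It assumes the two-tap Haar filter instead of the order-one Coiflet filter used in this paper, images with even side lengths, and a particular labeling of the three detail sub-images; all function names (`haar2d_stage`, `dwt_signature`, `wpt_signature`) are hypothetical and chosen only for illustration.

```python
import math

def haar_rows(img):
    """Filter each row with the Haar low-pass/high-pass pair and downsample by 2."""
    s = 1 / math.sqrt(2)
    low = [[s * (r[i] + r[i + 1]) for i in range(0, len(r) - 1, 2)] for r in img]
    high = [[s * (r[i] - r[i + 1]) for i in range(0, len(r) - 1, 2)] for r in img]
    return low, high

def transpose(m):
    return [list(col) for col in zip(*m)]

def haar2d_stage(img):
    """One separable stage: filter rows, then columns. Returns (a, h, v, d).
    The h/v naming is a convention assumed here, not taken from the paper."""
    low, high = haar_rows(img)
    a, h = (transpose(x) for x in haar_rows(transpose(low)))
    v, d = (transpose(x) for x in haar_rows(transpose(high)))
    return a, h, v, d

def energy(sub):
    """Squared Frobenius norm of a sub-image."""
    return sum(x * x for row in sub for x in row)

def dwt_signature(img, levels):
    """Decompose only the approximation at each level: 3*levels + 1 energies."""
    feats, a = [], img
    for _ in range(levels):
        a, h, v, d = haar2d_stage(a)
        feats += [energy(h), energy(v), energy(d)]
    return feats + [energy(a)]

def wpt_signature(img, levels):
    """Decompose every sub-image at each level (full tree): 4**levels energies."""
    nodes = [img]
    for _ in range(levels):
        nodes = [sub for n in nodes for sub in haar2d_stage(n)]
    return [energy(n) for n in nodes]

# Small synthetic 8 x 8 test image (not steel data).
img = [[float((3 * r + 5 * c) % 7) for c in range(8)] for r in range(8)]
print(len(dwt_signature(img, 2)), len(wpt_signature(img, 2)))
```

Because the Haar filter pair is orthonormal, both signatures sum to the energy of the input image in this sketch, which is a convenient sanity check when swapping in another orthonormal filter.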

The mother wavelet ψ(s) is related to the scaling function φ(s) with some suitable sequence h[k]:13-15

$$
\psi(s) = \sqrt{2} \sum_k h_1[k]\, \phi(2s - k) \tag{2}
$$

where $\phi(s) = \sqrt{2} \sum_k h_0[k]\, \phi(2s - k)$ and $h_1[k] = (-1)^k h_0[1 - k]$. Employing the following relations, the DWT at decomposition level j can be achieved without the explicit forms of ψ(s) and φ(s):

$$
\phi_{j,l}[k] = 2^{j/2}\, h_0^{(j)}[k - 2^j l] \tag{3}
$$

$$
\psi_{j,l}[k] = 2^{j/2}\, h_1^{(j)}[k - 2^j l] \tag{4}
$$

where $h_m^{(j+1)}[k] = [h_m]_{\uparrow 2^j} \cdot h_m^{(j)}[k]$ and $h_m^{(0)}[k] = h_m[k]$ ($m = \{0, 1\}$), and $[\cdot]_{\uparrow 2^j}$ denotes upsampling by a factor of $2^j$. j and l are the scale and translation indices, respectively. The DWT coefficients of a signal f(x) can now be computed as

$$
a^{(J)}[l] = \langle f[k], \phi_{J,l}[k] \rangle, \qquad d^{(j)}[l] = \langle f[k], \psi_{j,l}[k] \rangle \tag{5}
$$

where the $a^{(J)}[l]$’s are expansion coefficients of the scaling function or approximation coefficients and the $d^{(j)}[l]$’s are the wavelet coefficients or detail coefficients. The separable solution for achieving a two-dimensional (2-D) DWT is given in Figure 1. It gives rectangular divisions of the frequency spectrum and strongly oriented coefficients (often called sub-images because the wavelet coefficients for 2-D signals are also 2-D) in the horizontal, vertical, and diagonal directions. It consists of horizontal and vertical filtering of 2-D signals using low-pass and high-pass 1-D wavelet filters, $H_0$ and $H_1$. Separable horizontal (2↓1) and vertical (1↓2) downsampling by 2 gives a separable sampling lattice.10

2.2. Wavelet Packet Transform. The DWT produces dyadic decomposition and thus yields narrower bandwidths in the lower frequency regions and wider bandwidths in the higher frequency regions. In other words, the DWT yields an octave frequency bandwidth (Figure 2a). This octave frequency bandwidth of the DWT is suitable for analyzing signals whose information is only located in the low-frequency regions. However, the DWT may not be suitable for signals when the information is mainly located in the middle or high-frequency regions due to the wide bandwidth in the higher frequency region. To analyze this type of signal, the DWT is generalized to include a library of modulated waveform orthonormal bases, called wavelet packets. Implementation of the WPT can be done through tree-structured filter banks. While the DWT is implemented through iterative decomposition of approximation coefficients using the two-channel filter bank (see Figure 1), the WPT can be implemented through iterative decomposition of all coefficients (including both approximation and detail coefficients), yielding an equal frequency bandwidth (Figure 2b).

2.3. Wavelet Texture Analysis. Wavelet Texture Analysis (WTA) based on the 2-D DWT is a very powerful method compared to other texture analysis methods.11,16 In WTA, it is assumed that a texture has a unique distribution in the 3-D scale space. The space is composed of two spatial axes and an additional scale axis. Therefore, if the scale axis of the scale space of a textured image is discretized appropriately, different textures will have different responses at the discretized scales. When treating each wavelet sub-image (i.e., $a^{(J)}$ and $d_k^{(j)}$, where $j = 1, 2, \ldots, J$ and $k = h, v, d$ for a separable 2-D DWT, see Figure 1) as a matrix, then the energy of the approximation sub-image is defined as

$$
E_{a^{(J)}} = \lVert a^{(J)} \rVert_F^2 \tag{6}
$$

The energy divided by the number of pixels is averaged power or normalized energy. The most popular wavelet textural feature, a wavelet energy signature, is a vector organized by energies of all sub-images. The concepts of WTA based on the 2-D DWT can also apply to 2-D WPT with an arbitrary tree structure.17-19 At the Jth level, the size of a feature vector for an image (when including an approximation sub-image) is $3J + 1$ for the 2-D DWT and $4^J$ for the 2-D full-tree WPT, respectively.

2.4. Fisher’s Criterion. In FDA, a d-dimensional input space is linearly projected to a (c - 1)-dimensional space, where c is the number of classes.20,21 Let the between-class scatter matrix be $S_B$ and the within-class scatter matrix be $S_W$. Then, the optimal projection $W_{opt}$ is chosen as the matrix with orthonormal columns which maximizes the ratio of the between-class scatter and the within-class scatter:

$$
J(W_{opt}) = \frac{\lvert W^T S_B W \rvert}{\lvert W^T S_W W \rvert} \tag{7}
$$

The objective value denoted as J(Wopt) is called Fisher’s criterion. The projection vector, z, often called the discriminant variables, from a d-dimensional space to a (c - 1)-dimensional space, is calculated by

$$
z = W^T x \tag{8}
$$

Equation 7 can be calculated only if the inverse of the matrix, $W^T S_W W$, exists. This condition for the inverse of $W^T S_W W$ is equivalent to full-rankness of W.22 When W has rank less than or equal to its number of columns, we can get a similar result by replacing the determinant with the trace, which is the sum of the diagonal elements. In this case, we calculated Fisher’s criterion by using the trace because of numerical solvability.23

$$
J(W_{opt}) = \operatorname{tr}\!\left( \frac{W^T S_B W}{W^T S_W W} \right) \tag{9}
$$
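A trace-based criterion of this kind can be sketched in a few lines of NumPy. The sketch reads the matrix ratio in eq 9 as the product $(W^T S_W W)^{-1}(W^T S_B W)$, which is an assumption about the notation and not something stated in the text; the scatter matrices follow the usual unnormalized definitions, and the function names `scatter_matrices` and `fisher_trace` are hypothetical.

```python
import numpy as np

def scatter_matrices(X, y):
    """Between-class (S_B) and within-class (S_W) scatter of the rows of X."""
    X, y = np.asarray(X, dtype=float), np.asarray(y)
    mean_all = X.mean(axis=0)
    d = X.shape[1]
    S_B, S_W = np.zeros((d, d)), np.zeros((d, d))
    for label in np.unique(y):
        Xc = X[y == label]
        diff = (Xc.mean(axis=0) - mean_all)[:, None]
        S_B += len(Xc) * (diff @ diff.T)
        centered = Xc - Xc.mean(axis=0)
        S_W += centered.T @ centered
    return S_B, S_W

def fisher_trace(X, y, W):
    """tr((W^T S_W W)^{-1} (W^T S_B W)): eq 9 with the ratio read as an
    inverse product (assumption). Requires W^T S_W W to be invertible."""
    S_B, S_W = scatter_matrices(X, y)
    W = np.asarray(W, dtype=float)
    return float(np.trace(np.linalg.solve(W.T @ S_W @ W, W.T @ S_B @ W)))

# Toy 1-D features: two well-separated classes versus two nearby classes.
far = fisher_trace([[0], [2], [10], [12]], [0, 0, 1, 1], [[1.0]])
near = fisher_trace([[0], [2], [3], [5]], [0, 0, 1, 1], [[1.0]])
print(far, near)
```

In the toy data the well-separated pair gives the larger value, which matches the reading of Fisher's criterion as a separability index.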

Fisher’s criterion obtained from eq 9 can be used for an approximated classification performance measure for different data sets. Among the different feature vectors, a well-classifiable one will have a high Fisher’s criterion value. Belhumeur21 utilized these characteristics to select a discriminant feature vector obtained through wavelet transform for pattern recognition problems. In this work, we also employed a Fisher’s criterion value, but instead of evaluating an individual feature vector, we focused on appraising the overall discrimination


Figure 2. Ideal frequency division of (a) octave-band tree-structured filter bank or wavelet transform and (b) full-tree-structured filter bank or WPT at the Jth decomposition level using sinc filters. For simplicity, a one-dimensional case is shown.

performance for each decomposition level in each transform method. Besides Fisher’s criterion, other indices such as a similarity measure, a classification rate, and an information theoretic measure, can be employed. The similarity measure can be obtained from the Bhattacharyya distance or Kullback-Leibler divergences.20,24 The Bayes’ classifier is the optimum classifier, giving the minimum expected misclassification error.20 However, a classification rate as a performance indicator causes a great computational burden, because one has to train and test a classifier at every iterative step during the search process for the best packets. The information theoretic measure estimates the class-related information content and the class-irrelevant variation using Shannon’s information theory.25,26 If feature distribution is available, other measures are preferred to Fisher’s criterion. However, when only a finite number of samples are available, Fisher’s criterion gives reliable results with the least amount of computational effort. The Fisher index is one of the most popular choices for indirectly indicating classification results.27-29 2.5. Framework for Texture Classification. Figure 3 shows the framework for texture classification used in this study. The four elements involved in the framework are image acquisition, feature extraction, feature reduction, and quality monitoring (classification). The fundamental structure was taken from the machine vision framework of Liu12 and properly modified for this work. A description of each element in the framework is as follows: A. Image Acquisition. Any type of images can be used. It is important to select well-matched image types for capturing the important characteristics of appearance. On-line image acquisition is necessary when rapid on-line monitoring is required. For off-line acquisition, the optimal design conditions need to be determined to maintain quality control for product appearance. B. Feature Extraction. 
An image can include an enormous amount of information, which can be extremely important for texture classification, but this information can also be redundant or obstructive to analysis. Therefore, it is desirable to extract the relevant textural information from the images. By using suitable feature extraction methods, one can accomplish this goal. For texture classification, we selected a WTA based on wavelet transforms because this has been shown to perform better than other methods in many cases, as mentioned earlier.11,16 However, other transform methods, such as the 2-D fast Fourier transform, could also be used.30 The gray level co-occurrence matrix (GLCM) of an image is an estimate of the second-order joint probability, $P_\delta(i,j)$, of the intensity values of two pixels (i and j), a distance δ apart along a given direction θ.11 If the GLCM approach is selected, statistical features are available for analysis.31

Figure 3. Framework for texture classification.

C. Feature Reduction. The purpose of this step is to support an efficient analysis by reduction of the input (i.e., feature) dimension. In other words, the feature vector information is converted to a smaller number of features that summarize all the important information. After extracting textural features (usually much fewer than 100 features per image) from the original images (usually several megabytes and hundreds of thousands of pixels per image), further dimension reduction can be done, and Principal Component Analysis (PCA) is a popular method for this objective. In addition, our selected method, FDA, can be used when class labels are available.

D. Quality Monitoring or Classification. According to the availability of images and the automation level of the plant, the quality monitoring/classification can be implemented in different ways.

On/Off Specifications Quality Monitoring. Conventional procedures are used in statistical quality control (SQC) or statistical process control (SPC). The latent variable values are plotted either on individual control charts or on multivariate control charts. Then, one can determine the on/off specification requirements depending on the constructed control limits.

Gradable Product Quality Classification. Product quality is often classified by operators/graders as good, medium, or bad. Many multi-class classifiers (KNN, SVM, LDC, etc.) are applicable according to characteristics of the features.

Figure 4. Four sample images of steel sheets. The experts’ evaluations are (a) excellent, (b) good, (c) medium, and (d) bad quality. All images are subdivided into four smaller images as shown in part a.

3. Application to Steel Image Sets

3.1. Description of Data. To compare two wavelet transform methods (DWT and WPT), we applied them to classification of the quality of steel surface images. Bharati et al.11 used these same images to illustrate the use of WTA.
A detailed description of data acquisition and image pretreating methods can be found in Bharati et al.11 The steel images were labeled as excellent, good, medium, or bad by skilled graders based on various criteria representing the degree of surface quality. The quality of a steel sheet is reflected by the number and severity of pits on its surface. Steel surfaces with good quality have a few pits that are quite shallow and randomly distributed. In contrast, bad quality surfaces have deep craters throughout the steel. Figure 4 shows examples of steel surface images with excellent, good, medium, and bad surface qualities. The example of bad surface quality (see Figure 4d) contains deep pits that have joined to form craters. Figure 4c illustrates an example of a medium quality surface, which contains more noticeable pits as compared to the excellent and good quality samples. However, medium quality does not contain the winding patterns present in the bad quality steel.

Table 1. Fisher Criterion Values for Each Decomposition Level of the DWT and WPT

level    DWT       WPT
1        2.1361    2.1361
2        2.1235    3.1204
3        2.8011    7.8397
4        3.1174    8.8282
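The two heuristics described in section 3.2, the size-based cap on the decomposition level and the level-by-level use of Fisher's criterion, can be tried on the Table 1 values with a short sketch. It assumes that each level simply halves the sub-image size (boundary extension from the filter length is ignored) and that "greater than 10 × 10" means both sides exceed 10. Read literally, the stopping rule would halt the DWT at level one, since 2.1235 < 2.1361, while the paper reports level four for both transforms; the sketch therefore shows a literal greedy reading next to a plain maximum over the evaluated levels. Which variant was actually applied is not stated, and all function names are hypothetical.

```python
def max_level(rows, cols, min_size=10):
    """Largest level whose sub-images stay larger than min_size x min_size,
    assuming each level halves both sides (floor division)."""
    level = 0
    while min(rows // 2 ** (level + 1), cols // 2 ** (level + 1)) > min_size:
        level += 1
    return level

def greedy_level(criterion):
    """Literal reading: stop at the first level whose successor does not
    increase the criterion. Levels are 1-based."""
    level = 1
    while level < len(criterion) and criterion[level] > criterion[level - 1]:
        level += 1
    return level

def best_level(criterion):
    """Alternative reading: the 1-based level with the largest criterion
    among all levels evaluated."""
    return max(range(len(criterion)), key=lambda i: criterion[i]) + 1

# Fisher criterion values from Table 1, levels 1 to 4.
DWT = [2.1361, 2.1235, 2.8011, 3.1174]
WPT = [2.1361, 3.1204, 7.8397, 8.8282]

print(max_level(240, 254))                   # cap for the 240 x 254 sub-images
print(greedy_level(DWT), greedy_level(WPT))  # literal stopping rule
print(best_level(DWT), best_level(WPT))      # maximum over evaluated levels
```

Under the halving assumption the 240 × 254 sub-images reach about 15 × 15 at level four and 7 × 7 at level five, so the size heuristic caps the search at four levels, consistent with the four levels reported in Table 1.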

Table 2. Leave-One-Out Estimates of Classification Errors at the Fourth Decomposition Level

class        DWT (%)    WPT (%)
Type I Error
bad          0.0        0.0
excellent    0.0        0.0
good         24.1       2.9
medium       0.0        0.0
Type II Error
bad          1.1        0.0
excellent    5.9        0.9
good         0.0        0.0
medium       0.0        0.0
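Per-class error estimates of the kind reported in Table 2 can be produced with a leave-one-out k-nearest-neighbor loop. The following is a minimal hand-written stand-in, not the PRTools routine used in this study; it assumes squared Euclidean distance, a simple majority vote, and the per-class hypotheses given in section 3.2 (Type I: a class-i point is assigned elsewhere; Type II: a point from another class is assigned to class i). The data are synthetic and the function names are hypothetical.

```python
from collections import Counter

def knn_predict(train, labels, query, k=3):
    """Majority vote among the k nearest training points."""
    order = sorted(range(len(train)),
                   key=lambda i: sum((a - b) ** 2 for a, b in zip(train[i], query)))
    votes = Counter(labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]

def leave_one_out(points, labels, k=3):
    """Predict each point from all the remaining points."""
    preds = []
    for i in range(len(points)):
        rest = points[:i] + points[i + 1:]
        rest_labels = labels[:i] + labels[i + 1:]
        preds.append(knn_predict(rest, rest_labels, points[i], k))
    return preds

def error_rates(labels, preds, cls):
    """Return (Type I, Type II) errors for class cls, both in percent."""
    in_cls = [p for t, p in zip(labels, preds) if t == cls]
    out_cls = [p for t, p in zip(labels, preds) if t != cls]
    type1 = 100.0 * sum(p != cls for p in in_cls) / len(in_cls)
    type2 = 100.0 * sum(p == cls for p in out_cls) / len(out_cls)
    return type1, type2

# Synthetic 2-D "discriminant variables": two clusters plus one "good"
# point lying inside the "bad" cluster.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (5.5, 5.5),
          (5, 5), (5, 6), (6, 5), (6, 6)]
labels = ["good"] * 5 + ["bad"] * 4

preds = leave_one_out(points, labels, k=3)
print(error_rates(labels, preds, "good"))
print(error_rates(labels, preds, "bad"))
```

With this toy set, the misplaced "good" point is the only misclassification, so it shows up once as a Type I error for "good" and once as a Type II error for "bad".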

For this study, a total of 35 images of steel surfaces were used. Each image is an 8-bit grayscale image with pixel dimensions of 479 × 508.

3.2. Steel Quality Classification. The procedure to classify steel quality is explained below:

(1) Preparation of Input Images. The number of samples in each class is 8 (excellent), 9 (good), 6 (medium), and 12 (bad). Each original steel image was subdivided into four nonoverlapping (240 × 254) smaller images (see Figure 4a). Therefore, a new image set has a total of 140 images: 32 (excellent), 36 (good), 24 (medium), and 48 (bad).

(2) Determination of an Optimal Decomposition Level for Feature Extraction. The maximum decomposition levels used for wavelet transforms and the wavelet filter were selected following certain guidelines.8,17 For example, the number of maximum decomposition levels was determined according to heuristics such that the size of the smallest sub-image was greater than 10 × 10, and this criterion is similar to that of Chang and Kuo.17 For both cases (DWT and WPT), the order-one Coiflet filter was used for the decomposition of all 140 images. To find an optimal decomposition level, step-by-step decomposition was executed. Starting from level one, textural features, or energy signatures, were calculated for all 140 images at each level (eq 6). Multi-class FDA was performed to obtain Fisher’s criterion values for both DWT and WPT features. Further decomposition was performed only when the Fisher’s criterion value for the next level was larger than that of the previous level. The values of Fisher’s criterion at each level for both cases are summarized in Table 1. The resulting optimal decomposition level was the fourth level for both WPT and DWT. This simple rule is fast and powerful for determining the optimal decomposition level at the feature extraction step using wavelet transforms.

Figure 5. Plots for FDA applied to wavelet energy signatures. (a) DWT-FDA plot (z1 vs z2), (b) DWT-FDA plot (z1 vs z3), (c) WPT-FDA plot (z1 vs z2), and (d) WPT-FDA plot (z1 vs z3). Class labels are as follows: *, bad; +, excellent; △, good; and ○, medium classes.

(3) Classification Using the Fisher’s Discriminant Variables. FDA produces not only Fisher’s criterion but also the projected vectors and discriminant variables from a given feature set. We applied a K-nearest neighbor (KNN) classifier to the discriminant variables (z) obtained from the previous step for classification. The leave-one-out method was used to estimate the classification performance. The KNN algorithm in PRTools32 was used in this study. Two types of errors commonly arise when testing the null and alternative hypotheses:5 Type I and II errors are shown in Table 2 and are given by

H0: A data point is in class i.
H1: A data point is not in class i.
P(type I error) = P(reject H0 | H0 is true)
P(type II error) = P(reject H1 | H1 is true)

4. Results and Discussion

Fisher’s criterion is a ratio of inter-class distance (scatter between classes) to within-class distance (scatter within classes). Accordingly, a high Fisher’s criterion value implies a better separation of classes. From Table 1, we can see that Fisher’s criterion values for WPT features are higher than those for DWT features at all levels (except for decomposition level one).

In particular, at level four, Fisher’s criterion value of the WPT (8.8282) is about 2.8 times that of the DWT (3.1174). The rate of increase in Fisher criterion values of WPT increases from level one to two and from two to three but decreases from level three to four, and on the basis of this, we set level 4 as the maximum decomposition level. With steel images, classification using WPT features has better class separation than classification using DWT features. This observation can be visualized by plotting the discriminant variable, z, for WPT and DWT features. FDA plots show that WPT features produce more compactly clustered classes than do DWT features. In Figure 5, the data points in each class are more compact in WPT-FDA plots (c and d) than in DWT-FDA plots (a and b). WPT is also able to separate classes better. This is evident when seeing the degree of overlap between classes in the two plots (Figure 5a,c); good surfaces (data points △) and excellent surfaces (+) barely overlap in the WPT-FDA plot, while there is clear overlap of classes among bad (*), excellent (+), and good (△) in the DWT-FDA plot. We can conclude from the above that compared to the DWT, the WPT produces more closely clustered data points in the same class and at the same time provides data points that are more separated in the different classes.

To quantitatively compare the DWT and WPT, we applied 3-NN to the discriminant variable, z. For comparing the performances, type I and II errors were estimated by using leave-one-out cross-validation. Only the good class has Type I error in both cases, but the error of WPT is almost 1/10 that of DWT. For Type II errors, the bad and excellent classes have two nonzero errors in DWT while only the excellent class has a nonzero error in WPT. The Type II error for WPT is almost one-sixth that for DWT. These results quantitatively confirm that the WPT-based feature extraction shows better classification performance than the DWT-based feature extraction.

The reason that WPT is more suitable for classification of the steel images lies in the nature of the images themselves. Bharati et al.11 examined the (spatial) frequency distribution of steel images with different class labels. It was clearly evident that there was a shift in the distribution from high-frequency regions to lower frequency regions as one moved from the excellent class to the bad class. In other words, images from the bad class have larger energy signatures in lower frequency regions, while images from the excellent class have larger energy signatures in higher frequency regions. Therefore, WPT is more suitable for classifying all classes equally well when the class information for an image is distributed across all the frequency regions, because the WPT has an equal frequency bandwidth (see Figure 2).
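The bandwidth argument above can be made concrete with the ideal one-dimensional frequency divisions of Figure 2. The sketch below assumes ideal (brick-wall) filters and expresses band edges as fractions of the Nyquist frequency; the function names `dwt_bands`, `wpt_bands`, and `widths` are hypothetical.

```python
from fractions import Fraction

def dwt_bands(levels):
    """Ideal 1-D octave bands as (low, high) fractions of the Nyquist frequency."""
    bands = [(Fraction(0), Fraction(1, 2 ** levels))]   # approximation band
    for j in range(levels, 0, -1):                      # detail bands, coarse to fine
        bands.append((Fraction(1, 2 ** j), Fraction(1, 2 ** (j - 1))))
    return bands

def wpt_bands(levels):
    """Ideal 1-D full-tree bands: 2**levels bands of equal width."""
    n = 2 ** levels
    return [(Fraction(i, n), Fraction(i + 1, n)) for i in range(n)]

def widths(bands):
    return [hi - lo for lo, hi in bands]

print([str(w) for w in widths(dwt_bands(4))])
print(sorted({str(w) for w in widths(wpt_bands(4))}))
```

At level four the DWT leaves the upper half of the spectrum in a single band, whereas the full-tree WPT splits the same range into eight equal bands. In 2-D the corresponding feature counts at level four are 3 × 4 + 1 = 13 for the DWT and 4^4 = 256 for the full-tree WPT, which is the dimensionality cost discussed in the conclusions.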

5. Conclusions

In this work, we assessed the use of WPT for feature extraction to determine if it was more powerful than DWT for characterizing steel images. Also, we demonstrated a new method for determining an optimal wavelet decomposition level. A higher Fisher’s criterion for WPT features indicates that WPT features yield classes that are more compact and better separated when projected using FDA. Consequently, the classification performance of WPT features is much higher, as shown in Figure 5 and Table 2. Our work can be used to classify other images having similar characteristics to steel images, with important textural information not only in lower frequency regions but also in middle and higher frequency regions. However, the WPT has some disadvantages. The main disadvantage in using WPT is that it has too high a dimensionality compared to DWT; the number of features of the full tree is $4^J$, while the number of features of the DWT is $3J + 1$, where J is the decomposition level. To solve this problem, our future work includes selecting the best discriminative packets using Fisher’s criterion, which is an extension of our proposed method for determining an optimal decomposition level.

Acknowledgment

The authors would like to thank Professor John F. MacGregor at McMaster University for his permission to use the image data for this study. The authors gratefully acknowledge the partial financial support of the Korea Institute of Science and Technology, the Korea Science and Engineering Foundation, provided through the Advanced Environmental Biotechnology Research Center (R11-2003-006) at Pohang University of Science and Technology, and the Brain Korea 21 project initiated by the Ministry of Education, Korea. This work was also supported by Grant R01-2004-000-10345-0 from the Basic Research Program of the Korea Science & Engineering Foundation.

Nomenclature

a = scaling factor
f = signal function to be analyzed
h = coefficients of a sequence
j = scale indices
l = translation indices
a(J) = expansion coefficients of the scaling function or approximation coefficients
d(j) = wavelet coefficients or detail coefficients
E = energy of the sub-image
d = input space dimension
c = number of classes
SB = between-class scatter matrix
SW = within-class scatter matrix
W = weight matrix
Wopt = optimal projection weight matrix
J = objective function value or Fisher’s criterion
z = Fisher’s projection vector
x = normalized input vector

Greek Letters

ψ = mother wavelet or basis function
φ = scaling function
τ = time shifting factor

Subscripts

(j) = number of level, j = 1, 2, ..., J

Operators

| | = determinant
|| ||F = Frobenius norm
tr( ) = trace

Literature Cited

(1) Heo, S. K.; Lee, K.-H.; Lee, H.-K.; Lee, I.-B.; Park, J. H. A New Algorithm for Cyclic Scheduling and Design of Multipurpose Batch Plants. Ind. Eng. Chem. Res. 2003, 42 (4), 836-846.
(2) Kano, M.; Hasebe, S.; Hashimoto, I.; Ohno, H. Statistical Process Monitoring Based on Dissimilarity of Process Data. AIChE J. 2002, 48 (6), 1231-1240.
(3) Lee, M.; Lee, K.; Kim, C.; Lee, J. Analytical Design of Multiloop PID Controllers for Desired Closed-Loop Responses. AIChE J. 2004, 50 (7), 1631-1635.
(4) Bakshi, B. R.; Locher, G.; Stephanopoulos, G.; Stephanopoulos, G. Analysis of operating data for evaluation, diagnosis and control of batch operations. J. Process Control 1994, 4 (4), 179-194.
(5) Jin, H. D.; Lee, Y.-H.; Lee, G.; Han, C. Robust Recursive Principal Component Analysis Modeling for Adaptive Monitoring. Ind. Eng. Chem. Res. 2006, 45 (2), 696-703.
(6) Chu, Y.-H.; Lee, Y.-H.; Han, C. Improved Quality Estimation and Knowledge Extraction in a Batch Process by Bootstrapping-Based Generalized Variable Selection. Ind. Eng. Chem. Res. 2004, 43 (11), 2680-2690.
(7) Han, I.-S.; Lee, Y.-H.; Han, C. Modeling and Optimization of the Condensing Steam Turbine Network of a Chemical Plant. Ind. Eng. Chem. Res. 2006, 45 (2), 670-680.
(8) Liu, J. J.; MacGregor, J. F. Estimation and Monitoring of Product Aesthetics: Application to Manufacturing of “Engineered Stone” Countertops. Mach. Vision Appl. 2006, 16, 374-383.
(9) Liu, J.; MacGregor, J. F.; Duchesne, C.; Bartolacci, G. Monitoring of Flotation Processes using Multiresolutional Multivariate Image Analysis. Miner. Eng. 2005, 18 (1), 65-76.
(10) Liu, J.; MacGregor, J. F. Modeling and Optimization of Product Appearance: Application to Injection-molded Plastic Panels. Ind. Eng. Chem. Res. 2005, 44, 4687-4696.
(11) Bharati, M.; Liu, J.; MacGregor, J. F. Image Texture Analysis: Methods and Comparisons. Chemom. Intell. Lab. Syst. 2004, 72 (1), 57-71.
(12) Liu, J. Machine Vision For Process Industries. Ph.D. thesis, McMaster University, Hamilton, Canada, 2004.
(13) Mallat, S. G. A Theory for Multiresolution Signal Decomposition: The Wavelet Representation. IEEE Trans. Pattern Anal. Mach. Intell. 1989, 11, 674-693.
(14) Rioul, O.; Duhamel, P. Fast Algorithms for Discrete and Continuous Wavelet Transforms. IEEE Trans. Inf. Theory 1992, 38 (2), 569-586.
(15) Vetterli, M.; Kovačević, J. Wavelets and Subband Coding; Prentice Hall: Englewood Cliffs, NJ, 1995.
(16) Randen, T.; Husoy, J. H. Filtering for Texture Classification: a Comparative Study. IEEE Trans. Pattern Anal. Mach. Intell. 1999, 21 (4), 291-310.
(17) Chang, T.; Kuo, C. C. J. Texture Analysis and Classification with Tree-Structured Wavelet Transform. IEEE Trans. Image Process. 1993, 2 (4), 429-441.
(18) Etemad, K.; Chellappa, R. Separability-based Multiscale basis Selection and Feature extraction for Signal and Image classification. IEEE Trans. Image Process. 1998, 7 (10), 1453-1465.
(19) Laine, A.; Fan, J. Texture classification by Wavelet Packet Signatures. IEEE Trans. Pattern Anal. Mach. Intell. 1993, 15 (11), 1186-1191.
(20) Duda, R.; Hart, P.; Stork, D. G. Pattern classification and Scene Analysis, 2nd ed.; Wiley-Interscience: New York, 1973.
(21) Belhumeur, P. N.; Hespanha, J. P.; Kriegman, D. J. Eigenfaces vs. Fisherfaces: Recognition using Class Specific Linear Projection. IEEE Trans. Pattern Anal. Mach. Intell. 1997, 19 (7), 711-720.
(22) Camelio, J. A.; Hu, S. J.; Yim, H. Sensor Placement for Effective Diagnosis of Multiple Faults in Fixturing of Compliant Parts. J. Manuf. Sci. Eng. 2005, 127, 68-74.
(23) Demirkol, A.; Demir, Z.; Emre, E. A New Classification Approach using Discriminant Functions. J. Inf. Sci. Eng. 2005, 21, 819-828.
(24) Sim, T.; Zhang, S. Exploring Face Space. Presented at the IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW’04); 2004.
(25) Lee, Y.; Hwang, K. W. Selecting Good Speech Features for Recognition. ETRI J. 1996, 18 (1), 29-41.
(26) Papoulis, A. Probability, Random Variables, and Stochastic Processes, 2nd ed.; McGraw-Hill: New York, 1984.
(27) Model, F.; Adorjan, P.; Olek, A.; Piepenbrock, C. Feature selection for DNA Methylation based Cancer Classification. Bioinformatics Discovery Note 2001, 1, 1-8.
(28) Biem, A.; Katagiri, S.; Juang, B. H. Pattern Recognition using Discriminative Feature Extraction. IEEE Trans. Signal Process. 1997, 45, 500-504.
(29) Duin, R. P. W.; Umbach, R. H. Multiclass Linear Dimension Reduction by Weighted Pairwise Fisher criteria. IEEE Trans. Pattern Anal. Mach. Intell. 2001, 23, 762-766.
(30) Geladi, P. Some Special Topics in Multivariate Image Analysis. Chemom. Intell. Lab. Syst. 1992, 14, 375-390.
(31) Haralick, R. M.; Shanmugam, K.; Dinstein, I. Textural Features for Image Classification. IEEE Trans. Syst. Man Cybern. 1973, 3, 610-621.
(32) Duin, R. P. W. PRTools: A Matlab Toolbox for Pattern Recognition; Delft University of Technology: Delft, The Netherlands, 2000.

Received for review October 20, 2006. Revised manuscript received March 30, 2007. Accepted May 2, 2007. IE061348R