Article pubs.acs.org/IECR

A Novel Integrated Feature Selection Method for the Rational Synthesis of Microporous Aluminophosphate

Miao Qi,† Jinsong Li,†,‡ Jianzhong Wang,† Yinghua Lu,*,†,‡ and Jun Kong*,§

†Key Laboratory of Intelligent Information Processing of Jilin Universities, School of Computer Science and Information Technology, Northeast Normal University, Changchun, China
‡Faculty of Chemistry, Northeast Normal University, Changchun 130024, China
§Key Laboratory for Applied Statistics of MOE, Northeast Normal University, Changchun, China

ABSTRACT: In this paper, an integrated feature selection model is proposed to explore the relationship between the synthetic factors and the specific resulting structure on the database of AlPO syntheses. Specifically, the proposed model can select the most significant synthetic factors for the generation of the (6,12)-ring-containing structure. First, a random subspace method is employed to prerank the synthetic factors based on the predictive performance of a support vector machine. Then, the Fisher score is used to rank the synthetic factors and obtain a fusion weight. Finally, a sequential forward search method is utilized to select the most significant synthetic factors in view of the highest predictive performance. In particular, the principal-component-analysis method is adopted as guidance for estimating the random-subspace dimension. The results are judged on the numerical prediction of (6,12)-ring-containing AlPO structures. We also compare our method with several classical feature selection methods. The experimental results show that the proposed model exhibits higher predictive accuracy with fewer synthetic factors. The results also provide important guidance for the rational design and synthesis of microporous materials.

1. INTRODUCTION
In the past few decades, molecular sieves and microporous materials have been widely researched. In particular, microporous aluminophosphate, as an important branch of molecular sieve materials, has attracted increasing attention because of its potential applications in catalysis, separation, adsorption, and host−guest assemblies.1,2 In recent years, considerable effort has accordingly been devoted to the design and synthesis of aluminophosphate compounds with novel structures and good properties. However, the crystallization kinetics of such materials is rather complicated: a large number of factors can influence the crystallization kinetics and the final crystalline phases. In order to better understand the relationships between the synthetic factors and the resulting structures, and to further guide the rational synthesis of AlPO materials, Yu and co-workers have built the AlPO synthesis database, which includes about 1600 items, for researchers worldwide.3,4 This database provides a research platform for the rational design and synthesis of microporous materials.5 Subsequently, the AlPO molecular sieve has been used as a target to probe the relationship between the synthetic factors and the resulting framework structures with machine-learning methods.6−11 In the literature, Li et al.8 adopted a support vector machine (SVM) to predict (6,12)-ring-containing microporous AlPOs and obtained the best combination of synthetic factors by a brute-force search. In ref 9, partial least squares and logistic discrimination were used to predict the formation of the microporous aluminophosphate AlPO4-5; in addition, four resampling methods were proposed to deal with the problem of class imbalance. Huo et al. proposed an AlPO4-5 prediction system based on C5.0 combined with the Fisher score.10 More recently, Li et al. proposed a back-propagation-based imputation method to solve the missing-value problem.11 Its

effectiveness was evaluated by the prediction accuracy. All of the above works demonstrated that machine-learning methods can provide good solutions for analyzing the relationship between the synthetic factors and the resulting structures. Moreover, accurate analysis and prediction not only offer a priori knowledge before chemistry experiments but also save manpower, materials, and financial resources. In order to further mine the relationship between the synthetic factors and specific resulting structures and to provide an intuitive interpretation, a new integrated feature selection (FS) method is proposed to select and sort the synthetic factors that significantly affect a specific resulting structure on the database of AlPO syntheses. The proposed method, named RPFS, integrates a random subspace method (RSM), the Fisher score, and sequential forward search (SFS). In particular, the principal-component-analysis (PCA) method combined with an SVM is employed to estimate the dimension of the RSM subspaces. To demonstrate the efficiency and effectiveness of the proposed method, we compare RPFS with other representative FS methods in terms of predictive accuracy and the number of selected features through extensive experiments. The main contributions of this paper include (1) proposing a novel FS method, (2) giving a prediction model with fewer synthetic factors and higher accuracy for a specific resulting structure of microporous AlPOs, and (3) providing a priori guidance for chemical experiments and analysis. The paper is organized as follows. Section 2 gives an outline of the existing techniques for FS and describes the proposed

Received: July 24, 2012. Revised: November 18, 2012. Accepted: November 18, 2012. Published: December 11, 2012.
dx.doi.org/10.1021/ie3019774 | Ind. Eng. Chem. Res. 2012, 51, 16734−16740


Usually, the subspace dimension is determined from a large number of experimental results or from a priori knowledge. In this paper, we use PCA21 to estimate the dimension of the low-dimensional subspaces in RSM when building the FS model. PCA is a classical feature extraction, data representation, and data analysis technique widely used in pattern recognition and machine learning. PCA seeks the projections that best represent the data in the least-squares sense, and it is usually used as a tool to reveal the internal variance structure of the data. The principal components are obtained by solving for the eigenvalues and eigenvectors of the data covariance matrix. Here, the subspace dimension is determined by evaluating the predictive performance of a specific classifier on the principal components: the number of principal components giving the best predictive accuracy is taken as the dimension of the subspace. Given a training set Xn×p, a class label set Yn×1, the number of generated random subspaces T, the number of qualified random subspaces t, the weight vector C, and the subspace dimension q obtained by PCA, the feature ranking process of RSM is described as follows:
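The announced procedure can be summarized compactly. The following is a minimal sketch, not the authors' code: the classifier is abstracted into an `accuracy_fn` callback (in the paper, 2-fold cross-validated SVM accuracy), and `th` is the qualification threshold set in section 3.1:

```python
import random
from collections import Counter

def rsm_rank(X, y, q, T, th, accuracy_fn, seed=0):
    """Pre-rank features: draw T random q-dimensional subspaces, keep the
    t subspaces whose predictive accuracy exceeds th, and weight each
    feature by how often it appears among the qualified subspaces."""
    p = len(X[0])
    rng = random.Random(seed)
    counts, t = Counter(), 0
    for _ in range(T):
        subset = rng.sample(range(p), q)              # one random subspace
        X_sub = [[row[j] for j in subset] for row in X]
        if accuracy_fn(X_sub, y) > th:                # e.g. 2-fold CV of an SVM
            t += 1
            counts.update(subset)
    # C_j: selection frequency of feature j among the t qualified subspaces
    C = [counts[j] / t if t else 0.0 for j in range(p)]
    return C, t
```

The weight vector C is then carried forward and fused with the Fisher score in section 2.2.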

integrated FS method in detail. The selection of the classifier and the parameter settings are discussed in section 3, where the proposed FS method is also compared with some well-known FS methods. Finally, conclusions are given in section 4.

2. THE PROPOSED METHOD
Feature selection (FS) is the process of choosing a feature subset from the original features for building robust learning models. It is particularly important when analyzing high-dimensional data from experimental disciplines such as biology and chemistry, because the selection results can help people acquire a better understanding of the data and offer interpretability to a domain expert. Generally, existing FS methods are divided into two categories: filter models and wrapper models.12−15 A filter model evaluates the performance of a feature subset using only the intrinsic properties of the data. Its advantages are that it is simple, fast, and independent of the classification algorithm. However, a filter model ignores the relationships among the selected features, and the interactions between the features and the learning algorithm are also neglected; thus, the selected subset might not match the classification algorithm. In contrast, a wrapper model evaluates candidate subsets on a specific classifier. It can usually obtain a better subset than a filter model in terms of prediction/classification accuracy. Although wrapper models consider the interaction between the features and the classifier and take feature dependencies into account, they tend to be more computationally expensive and carry a higher risk of overfitting than filter models. To make the filter and wrapper models complement each other, a novel feature selection method (RPFS) integrating RSM, PCA, the Fisher score, and SFS is proposed; the flowchart is shown in Figure 1.
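The contrast between the two families can be illustrated with a small sketch (the `scores` and `accuracy_fn` inputs are hypothetical, not from the paper):

```python
def filter_select(scores, k):
    """Filter model: rank features by an intrinsic score computed from the
    data alone (no classifier in the loop) and keep the top k."""
    return sorted(range(len(scores)), key=lambda j: -scores[j])[:k]

def wrapper_select(p, accuracy_fn):
    """Greedy wrapper model: grow the subset one feature at a time, always
    adding the feature whose inclusion maximizes the classifier's accuracy,
    and stop when no addition improves it."""
    selected, remaining, best_acc = [], set(range(p)), -1.0
    improved = True
    while improved and remaining:
        improved = False
        j_best, acc_best = None, best_acc
        for j in remaining:
            acc = accuracy_fn(selected + [j])   # classifier evaluated per subset
            if acc > acc_best:
                j_best, acc_best = j, acc
        if j_best is not None:
            selected.append(j_best)
            remaining.remove(j_best)
            best_acc, improved = acc_best, True
    return selected
```

The wrapper loop calls the classifier once per candidate subset, which is exactly why wrappers are costlier but better matched to the final learner.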

Figure 1. Flowchart of the proposed RPFS method.

2.1. RSM−PCA. RSM was first introduced by Ho.16 RSM is based on random sampling of the original feature components to obtain different feature subsets. In recent years, it has been applied to FS, clustering, and other areas.17−20 When it is used for FS, it typically seeks the optimal result by evaluating a predefined number of features. Determining the dimension of the subspaces in RSM is a crucial step; however, there is no theoretical guidance for setting this dimension.

2.2. Fisher Score. The above RSM−PCA method ranks features by the frequency with which each synthetic factor is chosen. However, a high frequency alone might not fully capture importance: some irrelevant or weakly relevant features might be selected because of the randomness. To rectify this problem, the Fisher score22 is used to rank the synthetic factors and obtain a fusion weight. The Fisher score is supervised with class labels, and it seeks the features with the best discrimination ability. The Fisher score is defined as

F_j = [n_{y=+1}(μ_j^{y=+1} − μ_j)^2 + n_{y=−1}(μ_j^{y=−1} − μ_j)^2] / [n_{y=+1}(σ_j^{y=+1})^2 + n_{y=−1}(σ_j^{y=−1})^2],  j = 1, 2, ..., p  (1)

where y ∈ {+1, −1} is the class label: y = +1 marks the positive samples and y = −1 the negative samples; μ_j^{y=+1}, μ_j^{y=−1}, and μ_j are the averages of the jth feature over the positive class, the negative class, and all samples, respectively; σ_j^{y=+1} and σ_j^{y=−1} are the corresponding class standard deviations; and n_{y=+1} and n_{y=−1} are the numbers of positive and negative samples. The larger F_j is, the more discriminative the jth feature is. We fuse the results of RSM−PCA and the Fisher score using




W_j = F_j × C_j  (2)
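Equations 1 and 2 can be transcribed directly; `fisher_scores` and `fusion_weights` are illustrative names, not from the paper:

```python
def fisher_scores(X, y):
    """Fisher score of eq 1 for each of the p features; y entries are +1/-1."""
    p = len(X[0])
    scores = []
    for j in range(p):
        pos = [row[j] for row, label in zip(X, y) if label == +1]
        neg = [row[j] for row, label in zip(X, y) if label == -1]
        allv = pos + neg
        mu = sum(allv) / len(allv)                      # mean over all samples
        mu_p, mu_n = sum(pos) / len(pos), sum(neg) / len(neg)
        var_p = sum((v - mu_p) ** 2 for v in pos) / len(pos)
        var_n = sum((v - mu_n) ** 2 for v in neg) / len(neg)
        num = len(pos) * (mu_p - mu) ** 2 + len(neg) * (mu_n - mu) ** 2
        den = len(pos) * var_p + len(neg) * var_n
        scores.append(num / den if den else 0.0)        # guard zero variance
    return scores

def fusion_weights(F, C):
    """Fusion weight of eq 2: W_j = F_j * C_j."""
    return [f * c for f, c in zip(F, C)]
```

A feature that separates the classes well gets a large F_j, and the fusion zeroes out any feature that never survived the random-subspace screening (C_j = 0).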


The molar concentrations (F1−F4) are always included among the synthetic factors, and the remaining factors (F5−F21) are the candidates for selection.

3.1. Classifier Selection. In the proposed integrated FS method, a classifier is used at two stages: the dimension determination of RSM with PCA and the FS in SFS. In order to select a good classifier for improving the predictive performance, we compare several well-known classifiers on the original features, including the back-propagation network (BP), classification and regression tree (Cart), iterative dichotomizer 3 (ID3), K-nearest-neighbor algorithm (KNN), Adaboost, and SVM with different kernel functions. The comparison experiment proceeds as follows:
Step 1: Build the sample database, containing the 398 positive samples and 398 randomly selected negative samples.
Step 2: Randomly select 199 positive samples and 199 negative samples to train the classifiers; the remainder are used as test samples.
Step 3: Evaluate the predictive accuracy of the various classifiers using 2-fold cross-validation.
Step 4: Repeat the above three steps n times, and take the average classification accuracy as the final predictive result.
Here, n is set to 13 because more than 99% of the data is sampled after 13 repetitions. The comparison results are shown in Figure 2. As we can see, SVM with a Gaussian radial basis function (RBF) kernel exhibits the best predictive accuracy of 84.40%, followed by Adaboost with 84.16%. Although these two methods show similar performance, Adaboost is more time-consuming. Therefore, SVM with the RBF kernel is employed for FS and prediction. In addition to choosing the classifier, this experiment is also used to set the threshold th in RSM: the performance after FS should not fall below the result without FS, so th is set to 84% in our experiments.

3.2. Parameter Setting. In RSM−PCA, PCA combined with SVM is first employed to estimate the feature dimension q of the subspaces.
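This estimation step can be sketched as follows; it is a sketch under the assumption that the cross-validated SVM accuracy is supplied as an `accuracy_fn` callback, not the authors' implementation:

```python
import numpy as np

def estimate_q(X, y, accuracy_fn, q_max=None):
    """Estimate the subspace dimension q: project the data onto its leading
    k principal components for k = 1..q_max and keep the k giving the best
    predictive accuracy (ties go to the smaller k)."""
    Xc = X - X.mean(axis=0)                  # center the data
    cov = np.cov(Xc, rowvar=False)           # data covariance matrix
    vals, vecs = np.linalg.eigh(cov)         # eigenvalues in ascending order
    order = np.argsort(vals)[::-1]           # principal axes, descending
    vecs = vecs[:, order]
    q_max = q_max or X.shape[1]
    best_q, best_acc = 1, -1.0
    for k in range(1, q_max + 1):
        Z = Xc @ vecs[:, :k]                 # k-dimensional PCA projection
        acc = accuracy_fn(Z, y)              # e.g. CV accuracy of an SVM
        if acc > best_acc:
            best_q, best_acc = k, acc
    return best_q, best_acc
```

In the paper this sweep produces the accuracy-versus-dimension curve of Figure 3, from which q = 11 is read off.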
The predictive results with different q values are shown in Figure 3. The curve shows an ascending trend with increasing q: it reaches its top accuracy of 84.04% at q = 11 and remains stable for q > 11. Therefore, q is set to 11. Then, we generate T random subspaces of dimensionality q. On the basis of the predictive accuracy of the selected synthetic factors, the t random subspaces whose predictive accuracy is higher than th are retained. Figures 4 and 5 depict the ratio of qualified random subspaces t/T and the selected frequency of each synthetic factor, respectively. We can see from Figure 4 that t increases with T; the ratios are 45/100, 80/200, 115/300, 157/400, and 193/500, which illustrates that t/T is stable and nearly unaffected by the changes of t and T. As seen from Figure 5, the selected frequency varies with T, but changing T has only a weak influence on the overall trend for each synthetic factor. F12 shows the highest selected frequency, which implies that F12 plays a very important role in the prediction of the (6,12)-ring framework structure. Considering t/T, the selected frequency, and the running time comprehensively, T is set to 200.

3.3. Comparison Experiments. In order to demonstrate its effectiveness and advantages, we compare RPFS with 11 classical filter and wrapper methods. The comparison results are listed in Table 2. It is worth noting that only F5−F21 are candidates in the various FS methods. For the SFS method, SVM with the RBF kernel is used as the classifier. In Table 2, except for the Blogreg


2.3. SFS. With the above two methods, each feature holds a weight. To obtain a feature subset that is better in terms of dimensionality, interpretability, and predictive performance, SFS23 is further employed to search for the subset of synthetic factors with the highest accuracy. SFS is one of the most widely used wrapper FS methods: it starts with an empty set, at each step adds the one feature that provides the highest incremental discriminatory information to the existing subset, and terminates when the classification performance no longer improves. Instead of trying every remaining feature exhaustively, in our work the order in which features are added is determined by the fusion weight; that is, at each step of SFS, the unselected feature with the largest weight is the next one added to the existing subset. This considerably accelerates SFS.
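The weight-guided variant of SFS described above can be sketched as follows (illustrative code, with the classifier evaluation abstracted into an `accuracy_fn` callback):

```python
def sfs_by_weight(weights, accuracy_fn):
    """Weight-guided SFS: add features in descending fusion-weight order,
    keeping each addition only while the accuracy improves; stop as soon
    as the next addition does not improve the classification performance."""
    order = sorted(range(len(weights)), key=lambda j: -weights[j])
    selected, best_acc = [], -1.0
    for j in order:
        acc = accuracy_fn(selected + [j])
        if acc > best_acc:
            selected, best_acc = selected + [j], acc
        else:
            break                 # performance no longer improves; terminate
    return selected, best_acc
```

Because candidates are tried in fusion-weight order, each step costs one classifier evaluation instead of one per remaining feature, which is the speedup claimed above.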

3. EXPERIMENTAL RESULTS
To demonstrate the efficiency of the proposed RPFS method, we evaluate its performance in terms of prediction accuracy and the number of selected features on the AlPO database (http://zeobank.jlu.edu.cn/). This database contains 1576 reaction records for ca. 230 AlPO structures. Each reaction record consists of four groups of synthesis information:8 the source materials, the template, the synthesis conditions, and the structural characteristics of the product. In view of the missing-value problem, the 21 synthetic factors shown in Table 1 are selected for the experiments.

Table 1. Description of the Synthetic Factors

gel composition
  F1   molar ratio of Al2O3/Al2O3 in the gel composition
  F2   molar ratio of P2O5/Al2O3 in the gel composition
  F3   molar ratio of solvent/Al2O3 in the gel composition
  F4   molar ratio of template/Al2O3 in the gel composition
solvent
  F5   density
  F6   melting point
  F7   boiling point
  F8   dielectric constant
  F9   dipole moment
  F10  polarity
organic template
  F11  longest distance of the organic template
  F12  second longest distance of the organic template
  F13  shortest distance of the organic template
  F14  van der Waals volume
  F15  dipole moment
  F16  ratio of C/N
  F17  ratio of N/(C + N)
  F18  ratio of N/van der Waals volume
  F19  Sanderson electronegativity
  F20  number of freely rotated single bonds
  F21  maximum number of protonated H atoms

Among AlPOs, the (6,12)-ring structures represent a major class,24 so their prediction is of instructional significance. Accordingly, 398 positive samples [containing (6,12)-rings] and 884 negative samples [containing no (6,12)-rings] are used to discriminate the (6,12)-ring-containing AlPOs from the other AlPOs. Four important synthetic factors related to the molar



Figure 2. Performance comparisons of different methods.

Figure 3. Predictive accuracy with different dimensions.

and FCBF methods, the different FS models exhibit similar prediction performance, in a range of about 84−85%. However, the selected features are quite different, not only in number but also in character. This may be attributed to the different criteria by which features are selected. Taking the Fisher score and SFS as examples: the Fisher score is a filter-based method that scores each feature independently under the Fisher criterion, which requires that distances between data points in different classes be as large as possible while distances between data points in the same class be as small as possible; that is, the Fisher score considers both within-class and between-class scatter information. SFS, in contrast, starts from an empty set and, among all unselected features, selects the one that together with the already selected features gives the highest classification/prediction rate. Therefore, SFS not only considers the interaction between the features and the classifier but also takes the dependency relationships among the features into account. Although these methods obtain similar prediction performance on this database, subtle differences remain. In terms of predictive accuracy, Gini, t test, ChiSquare,



Figure 4. Relationship between t and T.

Figure 5. Frequency of each chosen synthetic factor with different T values.

Fisher score, SFS, and RPFS all exceed 85%. Clearly, RPFS and the Fisher score exhibit the best results, with 85.21% and 85.14%, respectively. Among these six methods, SFS selects the fewest features, 5, followed by RPFS with 7; however, the predictive accuracy of RPFS is higher than that of SFS by 0.11%. We also find that some methods, such as Blogreg and FCBF, are inferior to the result without FS. In particular, Blogreg shows the worst predictive accuracy, 74.77%, which is lower than that of RPFS by 10.44%. Accordingly, considering both the predictive accuracy and the number of selected features, RPFS clearly outperforms the others, which illustrates that the proposed method is effective.

The selected features and their rank orders differ between methods because each method uses a different evaluation criterion and thus produces a different ranking. Visually, F12 almost always appears in the first two positions, which means that F12 is a crucial factor for distinguishing structures that contain (6,12)-rings from those that do not. F16 and F18 also play significant roles in the predictive task. These results agree with the fact that the geometric and electronic parameters of an organic template have a vital effect on the pore size and shape of an AlPO structure. Figure 6 shows how the predictive accuracy changes as the factors are added in sequence. When the feature dimension increases from 5 to 6, all



problem. In particular, a small set of synthetic factors (11 dimensions) suffices to predict the formation of (6,12)-ring-containing AlPOs with an accuracy of 85.21%. Simultaneously, the prediction and rank results can provide a priori knowledge and a better understanding for rational synthesis experiments.

Table 2. Comparisons among Different FS Methods*

method          predictive accuracy (%)   selection and rank
without FS      84.69
Blogreg25       74.77    F12 F20 F16 F11 F21 F17 F15 F5 F18
SBMLR26         84.27    F17 F19 F10 F12 F9 F5 F6 F7
Gini27          85.04    F10 F8 F21 F20 F11 F15 F13 F19 F14
InforGain28     84.73    F15 F12 F18 F14 F13 F11 F17 F16 F19 F20
relief          84.84    F18 F12 F21 F17 F19 F16 F20 F14 F11 F13
t test29        85.12    F16 F12 F14 F8 F20 F10 F13 F6 F11 F15
ChiSquare30     85.08    F12 F18 F14 F13 F11 F17 F16 F19 F20 F21
Fisher score    85.14    F12 F16 F18 F17 F19 F14 F21 F13
FCBF31          78.89    F18 F19 F21 F15
KruskalW32      84.86    F5 F6 F7 F8 F10 F11 F12 F13 F14 F16 F17 F18 F19 F20
SFS             85.10    F12 F16 F21 F6 F18
RPFS            85.21    F12 F16 F18 F19 F14 F13 F6

*The selected factors are ranked in descending order according to the fusion weight.

AUTHOR INFORMATION
Corresponding Author
*E-mail: [email protected] (Y.L.), [email protected] (J.K.). Tel: +86-431-84536326.
Notes
The authors declare no competing financial interest.

ACKNOWLEDGMENTS
This work is supported by the Young Scientific Research Foundation of Jilin Province Science and Technology Development Project (Grants 201201070 and 201201063), the Jilin Provincial Natural Science Foundation (Grant 201115003), the Fund of Jilin Provincial Science & Technology Department (Grants 20111804 and 20110364), the Fundamental Research Funds for the Central Universities (Grant 11QNJJ005), the Science Foundation for Postdoctors of Jilin Province (Grant 2011274), the Program for Changjiang Scholars and Innovative Research Team in University (PCSIRT), and the National Nature Science Foundation of China (Grant 11071046).

methods except Gini trend upward rapidly, and each reaches its highest point in turn as factors are added. SFS reaches its peak first, with the fewest factors, 9; RPFS gives the highest accuracy, but even it needs 11 factors. Blogreg gives the worst performance, below 80%.



4. CONCLUSIONS
In this work, a novel integrated FS method has been proposed to analyze the relationship between a specific structure and the associated synthetic conditions of AlPOs; it combines the advantages of the filter and wrapper models. Extensive experimental results have shown that the proposed method is efficient and improves the predictive performance. Moreover, it displays better predictive performance than some existing methods applied to the same

REFERENCES

(1) Lee, H.; Zones, S. I.; Davis, M. E. A Combustion-free Methodology for Synthesizing Zeolites and Zeolite-like Materials. Nature 2003, 425, 385. (2) Yu, J. H.; Xu, R. R. Insight into the Construction of Openframework Aluminophosphates. Chem. Soc. Rev. 2006, 35, 593. (3) Li, J. Y.; Yu, J. H.; Xu, R. R. http://zeobank.jlu.edu.cn/. (4) Li, J. Y.; Yu, J. H.; Sun, J. H.; Dong, X. C.; Li, Y.; Wang, Z. P.; Wang, S. X.; Xu, R. R. Introduction and Application of Zeobank: Synthesis and Structure Databases of Zeolites and Related Materials. Stud. Surf. Sci. Catal. 2007, 170, 168.

Figure 6. Comparison of the different FS methods.



(5) Yan, Y.; Li, J.; Qi, M.; Zhang, X.; Yu, J. H.; Xu, R. R. Database of Open-framework Aluminophosphate Syntheses: Introduction and Application (I). Sci. China Ser. B: Chem. 2009, 52, 1734. (6) Yu, J. H.; Xu, R. R. Toward the Rational Design and Synthesis of Inorganic Microporous and Related Materials. Stud. Surf. Sci. Catal. 2004, 154, 1. (7) Li, J. Y.; Li, L.; Liang, J.; Chen, P.; Yu, J. H.; Xu, R. R. Template-Designed Synthesis of Open-Framework Zinc Phosphites with Extra-Large 24-Ring Channels. Cryst. Growth Des. 2008, 8, 2318. (8) Li, J. Y.; Qi, M.; Kong, J.; Wang, J. Z.; Yan, Y.; Huo, W. F.; Yu, J. H.; Xu, R. R.; Xu, Y. Computational Prediction of the Formation of Microporous Aluminophosphates with Desired Structural Features. Microporous Mesoporous Mater. 2010, 129, 251. (9) Qi, M.; Lu, Y. H.; Wang, J. Z.; Kong, J. Prediction of Microporous Aluminophosphate AlPO4-5 based on Resampling Using Partial Least Squares and Logistic Discrimination. Mol. Inf. 2010, 29, 203. (10) Huo, W. F.; Gao, N.; Yan, Y.; Li, J. Y.; Yu, J. H.; Xu, R. R. Decision Trees Combined with Feature Selection for the Rational Synthesis of Aluminophosphate AlPO4-5. Acta Phys.-Chim. Sin. 2011, 27, 2111. (11) Li, J. S.; Lu, Y. H.; Kong, J.; Gao, N.; Yu, J. H.; Xu, R. R.; Wang, J. Z.; Qi, M.; Li, J. Y. Missing Value Estimation for Database of Aluminophosphate (AlPO) Syntheses. Microporous Mesoporous Mater. DOI: 10.1016/j.micromeso.2012.03.007. (12) Kohavi, R.; John, G. H. Wrappers for Feature Subset Selection. Artif. Intell. 1997, 97, 273. (13) Das, S. Filters, Wrappers and a Boosting-based Hybrid for Feature Selection. Proc. 18th Int. Conf. Machine Learning 2001, 74. (14) Lei, Y.; Liu, H. Feature Selection for High-dimensional Data: A Fast Correlation-based Filter Solution. Proc. 20th Int. Conf. Machine Learning 2003, 856. (15) Liu, H.; Yu, L. Toward Integrating Feature Selection Algorithms for Classification and Clustering. IEEE Trans. Knowl. Data Eng. 2005, 17, 491. (16) Ho, T. K. Random Decision Forests. Proc. 3rd Int. Conf. Document Anal. Recogn. 1995, 278. (17) Lai, C.; Reinders, M. J. T.; Wessels, L. Random Subspace Method for Multivariate Feature Selection. Patt. Recogn. Lett. 2006, 27, 1067. (18) Yan, B. J.; Domeniconi, C. Subspace Metric Ensembles for Semi-supervised Clustering of High Dimensional Data. Lect. Notes Comput. Sci. 2006, 4212, 509. (19) Wang, J.; Luo, S. W.; Zeng, X. H. A Random Subspace Method for Co-training. Proc. IEEE Int. Joint Conf. Neural Networks 2008, 195. (20) Gao, Y.; Wang, Y. S. Boosting in Random Subspace for Face Recognition. Lect. Notes Contr. Inf. Sci. 2006, 345, 172. (21) Jolliffe, I. T. Principal Component Analysis, 2nd ed.; Springer: New York, 2002. (22) Duda, R. O.; Hart, P. E.; Stork, D. G. Pattern Classification; Wiley: New York, 2001. (23) Kittler, J. Pattern Recognition and Signal Processing; Sijthoff and Noordhoff: Alphen aan den Rijn, The Netherlands, 1978. (24) Baerlocher, C.; McCusker, L. B. Database of Zeolite Structures. http://www.iza-structures.org/databases/ (accessed April 2009). (25) Gavin, C. C.; Talbot, N. L. C. Gene Selection in Cancer Classification Using Sparse Logistic Regression with Bayesian Regularization. Bioinformatics 2006, 22, 2348. (26) Cawley, G. C.; Talbot, N. L.; Girolami, M. Sparse Multinomial Logistic Regression via Bayesian L1 Regularisation. Adv. Neur. Inf. Process. Syst. 2007, 19, 209. (27) Breiman, L.; Friedman, J.; Olshen, R.; Stone, C. Classification and Regression Trees; Chapman & Hall: London, 1984. (28) Cover, T. M.; Thomas, J. A. Elements of Information Theory; Wiley: New York, 1991. (29) Press, W. H.; Teukolsky, S. A.; Vetterling, W. T.; Flannery, B. P. Numerical Recipes in C, 2nd ed.; Cambridge University Press: New York, 1992. (30) Liu, H.; Setiono, R. Chi2: Feature Selection and Discretization of Numeric Attributes. Proc. 7th Int. Conf. Tools Artif. Intell. 1995, 388. (31) Das, S. Filters, Wrappers and a Boosting-based Hybrid for Feature Selection. Proc. 18th Int. Conf. Machine Learning 2001, 74. (32) Wei, L. J. Asymptotic Conservativeness and Efficiency of Kruskal−Wallis Test for K Dependent Samples. J. Am. Stat. Assoc. 1981, 76, 1006.

dx.doi.org/10.1021/ie3019774 | Ind. Eng. Chem. Res. 2012, 51, 16734−16740