Ind. Eng. Chem. Res. 2006, 45, 8223-8224


CORRESPONDENCE

Comment on “Design of a Propane Ammoxidation Catalyst Using Artificial Neural Networks and Genetic Algorithms”

W. Sha*

Metals Research Group, School of Planning, Architecture and Civil Engineering, The Queen’s University of Belfast, Belfast BT7 1NN, United Kingdom

1. Introduction

Sir: A paper has been published in Ind. Eng. Chem. Res. by Cundari et al.1 using feed-forward artificial neural networks (ANNs) in the design of a propane ammoxidation catalyst. The present correspondence discusses and comments on that paper.

2. Mathematical Indeterminacy

A feature of the paper is that the artificial neural networks selected have structures of 6-n1-2, where the number of hidden neurons n1 is varied from 5 to 50 in increments of 5; 6-n1-n2-2 (6-10-15-2, 6-14-10-2, 6-15-10-2, 6-20-10-2, 6-15-15-2, and 6-20-20-2); and 6-n1-n2-n3-2 (6-5-3-2-2 and 6-20-10-10-2). An extremely limited amount of data was used for training the networks: the training set in ref 1 contains only 15 data pairs. The networks are so complicated that they are not mathematically determined. The training process minimizes the sum of squared errors between actual and predicted outputs over the available training data by continuously adjusting, and finally fixing, the weights connecting neurons in adjacent layers. The total number of weights to be determined in a neural network is (according to eq 3 in ref 1) as follows: (i) one hidden layer: (6 + 1) × n1 + (n1 + 1) × 2; (ii) two hidden layers: (6 + 1) × n1 + (n1 + 1) × n2 + (n2 + 1) × 2; and (iii) three hidden layers: (6 + 1) × n1 + (n1 + 1) × n2 + (n2 + 1) × n3 + (n3 + 1) × 2. This accounts for all the connections between neurons in the input, hidden, and output layers. Though often regarded as a “black box”, or incorrectly as a model whose relationships cannot be expressed with equations, a neural network in fact has exact mathematical relationships between the neuron connections.
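The weight-counting formulas above can be sketched in a few lines of Python; this is an illustrative calculation (not code from ref 1), where each pair of adjacent layers contributes (n_in + 1) × n_out parameters, the +1 being the bias term:

```python
def weight_count(layers):
    """Total weights (including biases) in a fully connected
    feed-forward network whose layer sizes are given in order,
    e.g. [6, 5, 2] for a 6-5-2 architecture."""
    return sum((n_in + 1) * n_out
               for n_in, n_out in zip(layers, layers[1:]))

# The smallest and largest architectures considered in ref 1:
print(weight_count([6, 5, 2]))       # 6-5-2       -> 47 parameters
print(weight_count([6, 20, 20, 2]))  # 6-20-20-2   -> 602 parameters
```

Both results match the 47-to-602 range of fitting parameters cited from Figure 3 of ref 1, to be compared against the 15 training data pairs available.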
These relationships have a very simple weighted-sum form but are numerous because of the full connection between neurons in adjacent layers. Such relationships can be given explicitly as mathematical equations, as shown in a paper by Okuyucu et al.2 Unfortunately, ref 2 itself suffers from the same inadequacy criticized in this comment. Increasing the number of neurons in the hidden layers increases the number of connections and weights to be fitted. This number cannot be increased without limit, because one may reach a situation where the number of connections to be fitted exceeds the number of data pairs available for training. Though the neural network can still be trained, such a case is mathematically undetermined: it is not possible to determine more fitting parameters than there are data points. For example, two data points are the minimum required for linear regression, three data points for second-order polynomial (parabolic) regression, and so on. In practice, reliable regression uses far more data than these minima, to ensure statistical significance. For example, if two points are used to determine a slope by linear regression, the standard error of the calculated slope is infinite.3,4 A slope determined through two points has no statistical significance.

The amount of data used in ref 1 is not enough to determine the fitting parameters in the networks, which range in number from 47 to 602 among the selected networks (Figure 3 in ref 1). Therefore, none of the models is mathematically sound or justified. Redundant hidden neurons are present in the models; alternatively, it may be said that not all the weights in the models are independent. In fact, the so-called optimal linear combination of neural networks developed in ref 1 points to this very problem. Of the 140 neural networks selected, only 18 had nonzero combination weights, because only 19 data points were used to optimize the combination of the 140 networks. Hence, at most 18 neural networks could possibly be selected through the optimization (18 combination weights plus a bias term equals the 19 data points available). Overfitting is likely whenever one tries to determine more fitting parameters than there are data pairs. As a simple demonstration, if only two data points are available, a fitted parabola will certainly pass through both of them (“perfect” fitting), yet the fit is undetermined and not unique, because many, in fact infinitely many, parabolic curves pass through the two points.

* Tel.: +44-28-90974017. Fax: +44-28-90663754. E-mail: w.sha@qub.ac.uk.
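The non-uniqueness of a three-parameter parabola fitted to two points can be demonstrated directly; the following sketch (illustrative, not from ref 1) fixes the two points (0, 0) and (1, 1) and shows that the curvature c can be chosen freely while the fit remains “perfect”:

```python
def parabola_through_two_points(c):
    """For y = a + b*x + c*x**2 constrained to pass through
    (0, 0) and (1, 1), the curvature c is a free parameter:
    the conditions force a = 0 and b = 1 - c, whatever c is."""
    a = 0.0
    b = 1.0 - c
    return a, b, c

for c in (-5.0, 0.0, 5.0):
    a, b, c = parabola_through_two_points(c)
    y0 = a                  # value at x = 0
    y1 = a + b + c          # value at x = 1
    print(c, y0, y1)        # every choice of c reproduces both points
```

Each of the three (and infinitely many other) parabolas interpolates the data exactly, so a sum-of-squares training error of zero says nothing about whether the fitted parameters are determined.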
In the work by Cundari et al., overfitting has indeed occurred in attempting to determine more fitting parameters than there are data pairs, as shown by the errors on the testing data sets being orders of magnitude larger than those on the training and validation data sets (Table 1 in ref 1). Unfortunately, such misuse of the neural-network technique is not uncommon.5,6 The problem has existed since the pioneering work of Kito et al.,7 one of the earliest neural-network papers published in Ind. Eng. Chem. Res.; in ref 7, only 38 training data pairs were used, which is likewise not a large enough number. On a separate note, ref 1 used six input units corresponding to the concentrations of six different elemental or compound components. However, the six concentrations sum to 100%, so only five of them are independent parameters.
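The compositional constraint noted above can be sketched as follows (the numerical mixture is hypothetical, chosen only for illustration): because the six concentrations sum to 100%, the sixth input carries no information beyond the other five and is redundant as a network input.

```python
# Hypothetical six-component composition in percent; values are
# illustrative only, not taken from ref 1.
concentrations = [40.0, 25.0, 15.0, 10.0, 7.0, 3.0]
assert abs(sum(concentrations) - 100.0) < 1e-9  # closure constraint

independent = concentrations[:5]                 # five free inputs
reconstructed_sixth = 100.0 - sum(independent)   # forced by closure
print(reconstructed_sixth)                       # -> 3.0, the dropped value
```

Feeding all six values to the network therefore adds a linearly dependent input, compounding the indeterminacy already caused by the shortage of training data.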

10.1021/ie0611251 CCC: $33.50 © 2006 American Chemical Society Published on Web 10/28/2006


Therefore, five input units should be used, each representing an independent variable.

3. Concluding Remarks

In conclusion, ANN modeling should be used with care and with enough data. The deficient papers1,7 criticized here have been highly cited and used by many other authors, prompting this comment in order to draw the attention of the research community to the problem.

Literature Cited

(1) Cundari, T. R.; Deng, J.; Zhao, Y. Design of a Propane Ammoxidation Catalyst Using Artificial Neural Networks and Genetic Algorithms. Ind. Eng. Chem. Res. 2001, 40, 5475-5480.
(2) Okuyucu, H.; Kurt, A.; Arcaklioglu, E. Artificial Neural Network Application to the Friction Stir Welding of Aluminum Plates. Mater. Des., in press (doi: 10.1016/j.matdes.2005.06.003).

(3) Mendenhall, W.; Beaver, R. J. Introduction to Probability and Statistics, 9th ed.; Wadsworth: Belmont, CA, 1994.
(4) Harnett, D. L.; Murphy, J. L. Introductory Statistical Analysis; Addison-Wesley: Reading, MA, 1975.
(5) Sha, W. Comment on “Modeling of Tribological Properties of Alumina Fiber Reinforced Zinc-Aluminum Composites Using Artificial Neural Network” by Genel, K. et al. [Mater. Sci. Eng., A 2003, 363, 203]. Mater. Sci. Eng., A 2004, 372, 334-335.
(6) Sha, W. Comment on “Prediction of the Flow Stress of 0.4C-1.9Cr-1.5Mn-1.0Ni-0.2Mo Steel during Hot Deformation” by Wu, R. H. et al. [J. Mater. Process. Technol. 2001, 116, 211]. J. Mater. Process. Technol. 2006, 171, 283-284.
(7) Kito, S.; Hattori, T.; Murakami, Y. Estimation of the Acid Strength of Mixed Oxides by a Neural Network. Ind. Eng. Chem. Res. 1992, 31, 979-981.
