Energy & Fuels 2007, 21, 379-380
Comment on "Optimization of the Temperature Profile of a Temperature Gradient Reactor for DME Synthesis Using a Simple Genetic Algorithm Assisted by a Neural Network" by Kohji Omata, Toshihiko Ozaki, Tetsuo Umegaki, Yuhsuke Watanabe, Noritoshi Nukui, and Muneyoshi Yamada

W. Sha*

Metals Research Group, School of Planning, Architecture and Civil Engineering, The Queen's University of Belfast, Belfast BT7 1NN, United Kingdom

Received August 26, 2006. Revised Manuscript Received December 8, 2006

1. Introduction

A paper has been published in Energy & Fuels using a feedforward artificial neural network (ANN) to optimize the temperature profile of a temperature-gradient reactor for dimethyl ether (DME) synthesis.1 The present comment discusses that paper.

2. Mathematical Indeterminacy
A feature of the paper is that the selected ANN has a 5-18-4-1 structure, where the numbers of hidden neurons in the two hidden layers are 18 and 4, respectively. An extremely limited amount of data, 75 data pairs, was used for training the network. The network is so complicated that it is not mathematically determined. The training process minimizes the sum of the squared errors between actual and predicted outputs over the available training data, by continuously adjusting and finally determining the weights connecting neurons in adjacent layers. For the two-hidden-layer structure, the total number of weights to be determined in the neural network is (5 + 1) × 18 + (18 + 1) × 4 + (4 + 1) × 1 = 189. This accounts for all of the connections between neurons in the input, hidden, and output layers. Although often regarded as a "black box", or incorrectly as a model whose relationships cannot be expressed using equations, a neural network actually has exact mathematical relationships between the neuron connections. These relationships have a very simple weighted-sum form but are numerous because of the full connection between neurons in adjacent layers. Such relationships can be given explicitly using mathematical equations, as shown in a paper by Okuyucu et al.2 Unfortunately, ref 2 itself suffers from the same inadequacy criticized in this comment.
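As a minimal illustrative sketch (in Python; the helper name count_parameters is an assumption of this comment and does not appear in ref 1), the weight count quoted above, which includes one bias term per neuron, can be reproduced as follows:

```python
def count_parameters(layer_sizes):
    """Count weights plus biases of a fully connected feedforward network."""
    total = 0
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        # every neuron in the next layer receives n_in weights and 1 bias
        total += (n_in + 1) * n_out
    return total

# The 5-18-4-1 network discussed above:
print(count_parameters([5, 18, 4, 1]))  # 189, far more than the 75 training pairs
```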
Increasing the number of neurons in the hidden layers increases the number of connections and weights to be fitted. This number cannot be increased without limit, because one may reach a situation where the number of connections to be fitted is larger than the number of data pairs available for training. Although the neural network can still be trained, the case is mathematically undetermined. Mathematically, it is not possible to determine more fitting parameters than the available data points. For example, two data points are required as a minimum for linear regression, three data points are required as a minimum for second-order polynomial (parabolic) regression, and so on. In practice, for reliable regression, much more data than the minimum amount is used, to increase statistical significance. For example, if we use two points to determine a slope through linear regression, the calculated standard error of the slope will be infinite (infinitely large).3,4 A slope determined through two points has no statistical significance.

The amount of data used in ref 1 is not enough to determine the fitting parameters in the network. Therefore, the model is not mathematically sound or justified. Redundant hidden neurons are present in the model. Alternatively, it may be said that not all of the weights in the model are independent. Over-fitting is likely when trying to determine more fitting parameters than the number of data pairs. As a simple demonstration, a fitting curve based on a parabolic equation will definitely pass through both data points (thus, "perfect" fitting) if there are only two data points available. Such fitting is undetermined, or not unique, because many, in fact an infinite number of, parabolic curves pass through the two data points (a numerical sketch of this under-determination is given at the end of this section).

Other noteworthy points from ref 1 are the following: (1) It was stated that "a neural network was also used for mapping of the catalytic activity to replace the laborious experimental activity test". A neural network cannot replace experimental tests; in fact, a neural network can be developed only when plenty of experimental test data are available. (2) The quoted pioneering work by Hattori and Kito5 had the same problem as that criticized in this comment. (3) No testing data were used to test the neural network model developed. Testing was made against the optimized conversion process.
However, the testing result (about 70% conversion) is significantly lower than the predicted value (98%), confirming the inadequacy of the models.
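As noted above, the two-point example can be made concrete with a small numerical sketch (in Python; the data points and the helper parabola_through are illustrative assumptions, not values from ref 1):

```python
# Two data points only (illustrative values).
pts = [(0.0, 1.0), (2.0, 3.0)]

# (a) A straight-line fit through two points is exact, so the residual
# degrees of freedom are n - 2 = 0 and the estimated standard error of
# the slope is undefined (described above as infinitely large).
print("residual degrees of freedom for a straight-line fit:", len(pts) - 2)

# (b) A parabolic fit y = a + b*x + c*x**2 through the same two points:
# for ANY chosen c, the remaining coefficients a and b can be solved
# exactly, so infinitely many "perfect" parabolas exist; the fit is
# under-determined (not unique).
def parabola_through(points, c):
    (x1, y1), (x2, y2) = points
    b = ((y2 - c * x2 ** 2) - (y1 - c * x1 ** 2)) / (x2 - x1)
    a = y1 - c * x1 ** 2 - b * x1
    return a, b, c

for c in (0.0, 1.0, -5.0):
    a, b, c = parabola_through(pts, c)
    # both points are reproduced exactly for every choice of c
    assert all(abs(a + b * x + c * x ** 2 - y) < 1e-12 for x, y in pts)
    print(f"exact fit: y = {a:+.2f} {b:+.2f}*x {c:+.2f}*x^2")
```

The neural network case in ref 1 is analogous: 189 adjustable weights are fitted to only 75 data pairs, so an exact fit to the training data carries no statistical significance.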
3. Concluding Remarks

In conclusion, ANN modeling should be used with care and with enough data. Unfortunately, such misuse of the neural network technique is not uncommon,6,7 prompting the need for this comment to draw the attention of the research community to the issue.
* To whom correspondence should be addressed. Telephone: +44-2890974017. Fax: +44-28-90663754. E-mail: [email protected].

(1) Omata, K.; Ozaki, T.; Umegaki, T.; Watanabe, Y.; Nukui, N.; Yamada, M. Energy Fuels 2003, 17, 836-841.
(2) Okuyucu, H.; Kurt, A.; Arcaklioglu, E. Mater. Des. 2007, 28, 78-84.
(3) Mendenhall, W.; Beaver, R. J. Introduction to Probability and Statistics, 9th ed.; Wadsworth: Belmont, CA, 1994; pp 447-450.
(4) Harnett, D. L.; Murphy, J. L. Introductory Statistical Analysis; Addison-Wesley: Reading, MA, 1975; pp 416-425.
(5) Hattori, T.; Kito, S. Catal. Today 1995, 23, 347-355.
(6) Sha, W. Mater. Sci. Eng., A 2004, 372, 334-335.
(7) Sha, W. J. Mater. Process. Technol. 2006, 171, 283-284.