Ind. Eng. Chem. Res. 1998, 37, 3985-3991
Software Sensor Design Using Bayesian Automatic Classification and Back-Propagation Neural Networks

F. Z. Chen and X. Z. Wang*

Department of Chemical Engineering, The University of Leeds, Leeds LS2 9JT, U.K.
Back-propagation neural networks (BPNN) have attracted attention as an effective method for designing software sensors. A critical issue with BPNN is the danger of extrapolation beyond the parameter space covered by the training data. It is therefore important to select the data for model development and testing with some care, and there is a need to know when the BPNN model must be retrained with new data during use. This paper describes an approach that addresses this issue by combining an unsupervised clustering method with a BPNN model. An unsupervised Bayesian clustering system is used to group the multivariate data automatically into clusters, in such a way that data patterns within a class have similar characteristics which distinguish them from other classes. Test data patterns for the BPNN model are selected from each class, and when new data are available, the clustering system is employed to check whether they lie beyond the parameter space of the previous training data and therefore require retraining of the model. The approach is discussed by reference to the development of a software sensor for the fractionator of a refinery fluid catalytic cracking process.

Introduction

Process control contributes to process operational performance in safety, environmental protection, and product quality through proper monitoring and control of process variables.1 Some variables, including flow rates, temperatures, pressures, and levels, can easily be monitored on-line with cost-effective and reliable measuring devices. Other variables, however, are often analyzed off-line in laboratories because an on-line instrument would be either too expensive or technically unreliable. These variables often relate to product quality and to environmentally significant parameters such as composition and physical properties.
There is now considerable interest in both biotechnology and chemical engineering in the development of so-called software sensors, which use readily available measurements to estimate hard-to-measure variables. Among the various approaches which have been used, back-propagation neural networks (BPNN) have been widely studied and have shown great potential as a basis for reliable software sensor design.2-12 BPNN offers the distinctive ability to learn complex nonlinear relations without requiring specific knowledge of the model structure, and it has demonstrated surprisingly good performance in various applications. It is known that any continuous function of N variables can be computed using only linear summations and nonlinear but continuously increasing functions of only one variable.13 This effectively means that a three-layered BPNN with N(2N + 1) nodes using continuously increasing nonlinearities can be used to compute any continuous function of N variables. However, while this result provides a foundation, it does not remove many of the difficulties encountered in the practical application of BPNN.

* Corresponding author. Tel.: +44 113 233 2427. Fax: +44 113 233 2405. E-mail: [email protected].
A limitation of BPNN models is that they are data-intensive, because they are trained rather than programmed.14 Therefore, when a BPNN model is to be developed, it is important to divide the data into training and test sets so that the model can be validated. In addition, the trained model needs to be updated periodically to maintain its accuracy.

The traditional way of dividing a database into training and test sets is random sampling, which has major shortcomings. If many of the sampled test patterns come from a sparse region of the high-dimensional parameter space, the results will not be reliable, because the training patterns from that region are inadequate. On the other hand, if most of the sampled test patterns come from densely populated regions, the model will give an over-optimistic estimate of performance. Another problem is deciding when the model needs to be retrained. It is generally assumed that if the values of all input variables are within the boundaries of the training data, the model is reliable, but this assumption is questionable. Given a new data pattern whose input attributes are within the boundaries of the training data, it is still difficult to tell whether it is covered by the training data, since the new pattern represents a point in a high-dimensional nonlinear parameter space.

This paper describes a method for developing a software sensor for the main fractionator of a fluid catalytic cracking (FCC) process. The method combines a Bayesian automatic classification system (AutoClass) and BPNN, using AutoClass to cluster the data automatically into classes so that data patterns within a class can be distinguished from those in other classes and so form a basis for sampling test data patterns. The paper is organized as follows. The method is introduced in the next section, followed by a description of the process and the problem of the industrial case study. Then the process and results of developing the software sensor for the condensation point of the light diesel are described.
S0888-5885(98)00230-9 CCC: $15.00 © 1998 American Chemical Society Published on Web 09/03/1998
Figure 1. A combined framework for software sensor design using automatic classification and BPNN.
Combined Framework Using Automatic Classification and Neural Networks for Software Sensor Design

Figure 1 depicts the method for designing software sensors. It is divided into two stages, model development and model maintenance. In model development, data are first grouped into classes using the unsupervised clustering system AutoClass. Test patterns for the BPNN are sampled from each class according to the size of the class. During model maintenance, it is necessary to decide when the model needs to be retrained. Because of the high dimensionality and large volume of the data, it is difficult to carry out this analysis manually. The approach used here mixes new data with older data and then analyzes the combined set using AutoClass. If any of the following three situations arises, the model needs to be retrained. (1) New classes are formed that cover mainly new data. This means that the new data are located in new regions of the parameter space. (2) Some new data are assigned to small existing classes. This implies that the parameter spaces covered by the small classes are insufficiently trained; when such new data are available, the model should be retrained to improve its performance. (3) New data are assigned to large classes, but the degree of confidence of the old model's predictions for the new patterns is low.

It should be pointed out that the framework in Figure 1 does not cover all the steps necessary for general software sensor design. For example, Qin et al.15 have developed an integral framework for developing software sensors that is able to validate the input sensors before making predictions using software sensor models. Since BPNN is well-studied, only the clustering approach, AutoClass, is described here.

Unsupervised Bayesian Clustering Systems

AutoClass. The clustering approach used is based on unsupervised Bayesian classification developed by NASA.16-19 Given a number of data patterns (also called cases, observations, samples, instances, objects, or individuals), each described by a set of attributes, AutoClass can automatically devise a classification scheme that groups the data patterns into a number of classes such that instances within a class are similar, in some respect, but distinct from those in other classes. The approach has several advantages over other clustering methods. (1) The number of classes is determined automatically. Deciding when to stop forming classes is a fundamental problem in classification.20 More classes can always explain the data better, so there is a need to limit the number of classes. Many systems rely on an ad hoc stopping criterion. For example, ART2 (adaptive resonance theory) is strongly influenced by a vigilance (threshold) value which users must set by trial and error, and the Kohonen network requires the number of classes to be fixed beforehand. The Bayesian solution to the problem is based on the use of prior knowledge: simpler class hypotheses (e.g., those with fewer classes) are taken to be more likely than complex ones in advance of acquiring any data, and the prior probability of a hypothesis reflects this preference. The prior probability term prefers fewer classes, while the likelihood of the data prefers more, so the two effects balance at the most probable number of classes. Because of this, AutoClass finds only one class in random data. (2) Objects are not assigned to a class absolutely. AutoClass calculates the probability of membership of an object in each class, providing a more intuitive classification than absolute partitioning techniques. An object described equally well by two class descriptions should not be assigned to either class with certainty, because the evidence cannot support such an assertion. (3) All attributes are potentially significant.
Classification can be based on any or all attributes simultaneously, not just the most important one. This represents an advantage of the Bayesian method over human classification. In many applications, classes are distinguished not by one or even by several attributes but by many small differences, and humans often have difficulty taking more than a few attributes into account. The Bayesian approach uses all attributes simultaneously, permitting uniform consideration of all the data, and at the end of learning AutoClass reports the factors contributing to class formation. (4) Data can be real or discrete. Many methods have difficulty in analyzing mixed data: some insist on real-valued data, while others accept only discrete data. The Bayesian approach can use the data exactly as they are given. (5) Missing attribute values are allowed.

AutoClass has been applied as a data mining tool to the analysis of process operational data of a fluid catalytic cracking process and has proved able to cluster the data into classes representing significantly different operational modes.21

Figure 2. The main fractionator of the FCC process.

Figure 3. The back-propagation neural network structure.
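The balance that fixes the number of classes can be illustrated with a small sketch. AutoClass itself scores full Bayesian posteriors over mixture models; as a rough stand-in, the sketch below scores hard clusterings of two-dimensional data with the Bayesian information criterion (BIC), whose log(n) parameter penalty plays the role of the prior that prefers fewer classes. All function names and numbers here are illustrative, not part of AutoClass.

```python
import numpy as np

def kmeans_labels(X, k, iters=100, seed=0):
    """Plain Lloyd's algorithm, used here only to obtain a hard clustering."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(iters):
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members):
                centers[j] = members.mean(axis=0)
    return labels

def bic(X, labels, k):
    """BIC for a hard-assigned diagonal-Gaussian mixture: more classes always
    fit the data better, but the log(n) penalty on parameters pushes back."""
    n, d = X.shape
    loglik = 0.0
    for j in range(k):
        members = X[labels == j]
        if len(members) < 2:
            continue
        var = members.var(axis=0) + 1e-9               # diagonal covariance
        resid = (members - members.mean(axis=0)) ** 2 / var
        loglik += -0.5 * (len(members) * np.log(2 * np.pi * var).sum()
                          + resid.sum())
        loglik += len(members) * np.log(len(members) / n)  # mixing weights
    n_params = k * 2 * d + (k - 1)   # means, variances, mixing proportions
    return -2.0 * loglik + n_params * np.log(n)

# Two well-separated operating regimes: the penalty stops k at 2.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 1.0, (100, 2)),
               rng.normal(10.0, 1.0, (100, 2))])
scores = {k: bic(X, kmeans_labels(X, k), k) for k in (1, 2, 3)}
best_k = min(scores, key=scores.get)
```

As with AutoClass on random data, the likelihood gain from a third class does not repay its parameter cost, so the score is minimized at two classes.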
Main Fractionator and Its Product Quality Prediction

The fluid catalytic cracking process of the refinery converts a mixture of heavy oils into more valuable products. The relevant section of the process is shown in Figure 2, where the oil-gas mixture leaving the reactor goes into the main fractionator to be separated into various products. The individual side-draw products are further processed by downstream units before being sent to blending units. Software sensors have been developed to predict the saturation pressure of gasoline, the end boiling point of gasoline, and the condensation point of light diesel. Here, by way of example, the development of the software sensor estimating the condensation point of the light diesel is described.

The condensation point of light diesel is an important indication of product quality. Typically, it has been monitored by off-line laboratory analysis. The sampling interval is between 4 and 6 h, and it is not practical to sample more frequently since the procedure is time-consuming. This deficiency of off-line analysis is obvious, and the time delay is a cause for concern because control action is delayed. Moreover, laboratory analysis requires sophisticated equipment and skilled analytical technicians. Clearly, there is an important industrial need for a software sensor to carry out such monitoring on-line.

Fourteen parameters are used as inputs to the BPNN model, as shown in Figure 3. These are the following:
(1) T1-11, the temperature on tray 22, where the light diesel is withdrawn;
(2) T1-12, the temperature on tray 20, where the light diesel is withdrawn;
(3) T1-33, the temperature on tray 19;
(4) T1-42, the temperature on tray 16 (i.e., the initial temperature of the pumparound);
(5) T1-20, the return temperature of the pumparound;
(6) F215, the flow rate of the pumparound;
(7) T1-09, the column top temperature;
(8) T1-00, the reaction temperature;
(9) F205, the fresh feed flow rate to the reactor;
(10) F204, the flow rate of the recycle oil;
(11) F101, steam flow rate;
(12) FR-1, steam flow rate;
(13) FIQ22, the flow rate of the over-heated steam;
(14) F207, the flow rate of the rich-absorbent oil.

Crowe and Vassiliadis14 have drawn attention to the fact that chemical process models are multidimensional, with peaks and valleys which can trap the gradient descent process before it reaches the global minimum. The technique used here therefore employs the GDR-EGA search method (a combination of gradient descent and an extended genetic algorithm) to avoid local minima.22,23

Initially, 146 data patterns (data set 1) were obtained from the refinery to develop the model (model 1). Later, three further sets of data became available, with 84 (data set 2), 48 (data set 3), and 25 (data set 4) patterns, respectively. Model 1 has therefore been modified three times, the versions being denoted model 2, model 3, and model 4. Each model improvement gave interesting and different results, as described below. In all cases, the accuracy of the model is required to be within ±2 °C.
Model Development (Model 1) Using Data Set 1

The data set (data set 1) used for model development comprises 146 patterns. As illustrated in Figure 1, the first step is to process the data using AutoClass. This requires selecting a density distribution model for the attributes (i.e., all the inputs to the BPNN, Figure 3). All variables are real-valued and no data are missing, so a Gaussian model is used. AutoClass finds seven classes (numbered 0, 1, ..., 6), as shown in Table 1. For example, class 0 has 32 members (i.e., 32 data patterns). Test data are sampled from each class; they are indicated in bold and underlined in Table 1. More data patterns are sampled from larger classes and fewer from smaller classes. Altogether, 30 data patterns are used for testing and 116 for training. The procedure is summarized in Table 2.
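The class-proportional sampling step can be sketched as follows. The paper does not state the authors' exact allocation rule, so this sketch simply rounds each class's proportional share (keeping at least one test pattern per class); the resulting per-class counts are close to, but not identical with, those in Table 2. The function name is hypothetical.

```python
import random

def sample_test_patterns(labels, n_test_total, seed=0):
    """Draw test patterns from each class roughly in proportion to class
    size (at least one per class), mirroring the sampling step in Figure 1."""
    rng = random.Random(seed)
    by_class = {}
    for idx, c in enumerate(labels):
        by_class.setdefault(c, []).append(idx)
    n = len(labels)
    test = []
    for c in sorted(by_class):
        members = by_class[c]
        quota = max(1, round(len(members) * n_test_total / n))
        test.extend(rng.sample(members, quota))
    return sorted(test)

# Class sizes from Table 1: 32, 31, 29, 26, 11, 9, 8 (146 patterns in all),
# with 30 patterns to be set aside for testing.
labels = sum(([c] * w for c, w in enumerate([32, 31, 29, 26, 11, 9, 8])), [])
test_idx = sample_test_patterns(labels, n_test_total=30)
```

Every class, including the smallest, is guaranteed at least one test pattern, which random sampling of the whole database cannot promise.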
Table 1. Classification of Data Set 1 and Test Data Selection for BPNNa

class 0 (weight 32): 5 6 16 17 31 32 34 35 36 37-39 40 41-43 81 103 104 105 108 109 110 112 114 115 116 118 136 137 138 139
class 1 (weight 31): 1 10 25 26 27 44 45 46 47-50 51 52 76 79 80 82 120 121 122-124 125 126 128 129 132 133 134 135
class 2 (weight 29): 2 3 4 15 18 19 64 65 69 70 72 73 74 75 83 84 85 86-88 89 90 91 93 94 95 96 97 146
class 3 (weight 26): 11 12 13 14 53 54 55 56-59 60 61-63 66 67 68 71 77 78 92 98 99 100 143
class 4 (weight 11): 7 9 28 29 33 111 117 127 130 131 145
class 5 (weight 9): 20 21 30 101 102 106 140 141 142
class 6 (weight 8): 8 22 23 24 107 113 119 144

a Data in bold and underlined are test data; weight = number of members.
Table 2. Selection of Test Data and Training Data from the 146 Patterns in Data Set 1

class   class weight   patterns for training   patterns for test
0       32             25                      7
1       31             24                      7
2       29             23                      6
3       26             21                      5
4       11             9                       2
5       9              7                       2
6       8              7                       1
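The benefit of drawing test data class by class, rather than at random from the whole database, can be quantified. Under simple random sampling the number of test patterns falling in a given class is hypergeometric, so there is an appreciable chance that a small class contributes no test pattern at all. A short calculation (function name hypothetical) for the smallest class in Table 1:

```python
from math import comb

def p_unrepresented(n_total, class_size, n_test):
    """Probability that a simple random test sample of size n_test contains
    no pattern at all from a class of the given size (hypergeometric)."""
    return comb(n_total - class_size, n_test) / comb(n_total, n_test)

# Smallest class in Table 1: 8 of the 146 patterns; 30 patterns go to test.
p = p_unrepresented(146, 8, 30)   # roughly 0.15
```

So about one random split in seven would leave class 6 entirely untested, whereas the class-based allocation in Table 2 guarantees it a test pattern.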
Table 3. Examples of (Class, Membership Probability) Pairs

data pattern   (class, membership probability)
16             (0, 0.803) (1, 0.197)
17             (0, 0.914) (1, 0.055) (4, 0.030)
31             (0, 0.984) (1, 0.014) (6, 0.002)
32             (0, 1.000)
34             (0, 0.995) (6, 0.005)
35             (0, 1.000)
36             (0, 0.967) (1, 0.003)
37             (0, 0.996) (1, 0.002) (6, 0.001)

The assignment of a data pattern to a class is fuzzy in the sense that there is a membership probability. Examples of membership probabilities are shown in Table 3. Data pattern 17, for instance, has a membership probability of 0.914 to class 0, 0.055 to class 1, and 0.030 to class 4. It is therefore assigned to class 0.

The BPNN software sensor model (model 1) is obtained when the training error reaches 4.24e-01. With training data normalized to [0, 1], the error is calculated as E = (1/2) ∑(y′i - yi)², summed over the n training patterns, where y′i is the prediction by the model for the ith training pattern and yi the target value. There are three training patterns and two test patterns with absolute errors exceeding the required ±2 °C. Therefore, the degree of confidence for the training data is 100% - (3/116)% = 97.4%, and that for the test data is 100% - (2/30)% = 93.3%. A degree of confidence of 90% is considered acceptable by the refinery.

Model 1 Improvement Using Data Set 2

The production strategy changes with the season. For example, in winter the plant produces grade-8 diesel, with a condensation point of -8 °C, while in summer it produces grade-10 diesel, with a condensation point of 10 °C. The accuracy of model 1 was later found to be inadequate, and 84 more data patterns (data set 2) were provided in order to improve model performance. Initial application of model 1 to the 84 data patterns indicated that the degree of confidence is only 100% - (63/84)% = 25.0%.

The 84 new data patterns were combined with data set 1 and processed by AutoClass. It was found that the 230 data patterns (84 + 146) were classified into 15 classes. Interestingly, all 84 new data patterns in data set 2 were classified into seven new classes (classes 2, 5, 7, 8, 9, 13, and 14), while all 146 patterns in data set 1 are in eight classes (classes 0, 1, 3, 4, 6, 10, 11, and 12). This is consistent with the poor predictions for data set 2 using model 1. A summary of the classification is given in Table 4, where the data in italic are from data set 2. The data patterns in bold and underlined were chosen for testing and the rest for retraining to generate model 2. Altogether, 49 data patterns were chosen for testing, of which 19 are from data set 2 and 30 from data set 1. The degree of confidence for the training data is found to be 100% - (13/181)% = 92.8% and that for the test data 100% - (3/49)% = 93.9%.

Model 2 Improvement Using Data Set 3

Later, 48 more data patterns (data set 3) were provided. Model 2 was applied to data set 3, and the degree of confidence was found to be 100% - (27/48)% = 43.8%. This is low but better than the prediction of data set 2 using model 1 (25.0%). The 48 new data patterns were then mixed with data sets 1 and 2 and processed by AutoClass to give 16 classes (the detailed classification is given in Table 5). It is found that 25 of the 48 data patterns form a new class (class 1 in Table 5, data patterns 251-275), while the rest are assigned to classes shared with data from data sets 1 and 2. The degrees of confidence using model 2 to predict classes 1, 2, 3, 8, and 11 are summarized in Table 6. It can be seen that class 1 contains only new data patterns and has the lowest degree of confidence for prediction using model 2: nineteen of the twenty-five patterns in class 1 have deviations bigger than ±2 °C. Classes with fewer new data patterns have a higher degree of confidence in the predictions. This result further demonstrates the advantage of using AutoClass to cluster the data before training a BPNN model. It is also found that a model's confidence in predicting new data is lower if more of the data are grouped into new classes: all patterns in data set 2 fell into new classes, and the confidence of predicting data set 2 using model 1 is 25.0%, whereas some patterns in data set 3 were classified into old classes, and the confidence of predicting data set 3 using model 2 is 43.8%.

The 48 new data patterns in data set 3 were then combined with data sets 1 and 2 to develop model 3. The sampling of test data patterns is the same as before. A total of 68 data patterns were used for testing, with the remaining 210 patterns for training. Model 3 has a degree of confidence of 92.6% (= 100% - (5/68)%) for the test data and 93.8% (= 100% - (13/210)%) for the training data.
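The retraining triggers exercised in these model updates can be summarized in a short sketch. It implements the first two criteria of the framework (new classes made up mostly of new patterns; new patterns landing in small classes); the third criterion needs the trained model's predictions and is omitted. The function name and thresholds are illustrative, not taken from the paper.

```python
from collections import Counter

def retraining_reasons(labels, n_old, small_class=10, new_share=0.8):
    """Given cluster labels for the combined (old + new) data, where the
    first n_old entries are the previously seen patterns, report classes
    that suggest the BPNN model should be retrained."""
    old, new = labels[:n_old], labels[n_old:]
    n_o, n_n = Counter(old), Counter(new)
    reasons = []
    for c in sorted(set(labels)):
        total = n_o[c] + n_n[c]
        if n_n[c] and n_n[c] / total >= new_share:
            reasons.append((c, "mostly new data"))       # criterion 1
        elif n_n[c] and total <= small_class:
            reasons.append((c, "new data in a small class"))  # criterion 2
    return reasons   # a non-empty list suggests retraining

# The data set 3 situation in miniature: of 48 new patterns, 25 form a
# class of their own while the rest join a large existing class.
labels = [0] * 100 + [0] * 23 + [1] * 25   # last 48 entries are "new"
reasons = retraining_reasons(labels, n_old=100)
```

Here the all-new class is flagged, while the new patterns absorbed into the large existing class raise no alarm, matching the qualitative behavior described above.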
Table 4. Classification Results of Data Set 1 Plus Data Set 2a

class 0 (weight 38): 2 3 4 14 15 18 19 21 64 65 66-69 70 71-75 82-84 86 87 88-90 91 93 94 95 96-98 99 100 146
class 1 (weight 31): 5 6 17 29 32 35 36-39 40 41-43 81 103 104 105 108 109 110 111 112 114 115 116 117 136 137 138 139
class 2 (weight 26): 160 161 175 177-179 182 186 187 192 193-195 196 197 209 210 211 212 215 216 217-219 220 221
class 3 (weight 19): 1 10 16 44 45-48 49 50-52 53 54 55 79 80 121 122
class 4 (weight 17): 11 12 13 25 56 57 58-60 61 62 63 77 78 85 92 143
class 5 (weight 15): 157 176 180 181 183 184 185 188 189 190 191 205 206 207 208
class 6 (weight 14): 7 8 22 23 24 31 33 34 76 107 113 118 119 144
class 7 (weight 12): 162 163 164 222 223 224 225-227 228 229 230
class 8 (weight 11): 158 159 198 199 200-202 203 204 213 214
class 9 (weight 10): 147 148 149 150-152 153 154-156
class 10 (weight 10): 9 26 28 120 123 124 127 130 131 145
class 11 (weight 9): 27 125 126 128 129 132 133 134 135
class 12 (weight 8): 20 30 101 102 106 140 141 142
class 13 (weight 7): 166 167 168-170 171 172
class 14 (weight 3): 165 173 174

a Italic refers to data patterns from data set 2. Bold and underlined are used for the test of BPNN.
Table 5. Classification Results of Data Sets 1, 2, and 3a

class 0 (weight 50): 1 6 7 9 10 16 17 22 23 26 28 29 31 33 34 36-39 40 41 47 48 49 50 51 52 76 81 82 111 113 117 118 120 121 122 123 124 126 127 128 129 130 131 133 134 135 138 145
class 1 (weight 25): 251 252 253-255 256 257-259 260 261-264 265 266-268 269 270-273 274 275
class 2 (weight 22): 160 161 175 177 178 179 186 187 192 193 209 212 216 217 218 219 220 221 234 235 236 237
class 3 (weight 22): 162 176 182 183 184 185 222 223 224-227 228 229-231 232 233 238 239 240 241
class 4 (weight 19): 157 158 180 181 188 189 190 191 198 199 200 201-203 204 205 206 207 208
class 5 (weight 18): 5 32 35 42 43 103 104 105 108 109 110 112 114 115 116 136 137 139
class 6 (weight 18): 11 12 13 53 54 55 56-59 60 61-63 77 78 79 85
class 7 (weight 18): 8 20 24 27 30 44 45-46 80 101 102 119 125 132 140 141 142 144
class 8 (weight 13): 147 148 149-151 152 153 154 155 156 276 277 278
class 9 (weight 12): 21 25 73 83 87 92 93 94 100 106 107 143
class 10 (weight 12): 2 14 15 64 65 66 67 68 70 71 98 99
class 11 (weight 11): 163 164 242 243-245 246 247-249 250
class 12 (weight 10): 3 18 19 72 84 86 88 89 90 91
class 13 (weight 10): 165 166 167 168-171 172 173 174
class 14 (weight 10): 159 194 195 196 197 210 211 213 214 215
class 15 (weight 8): 4 69 74 75 95 96 97 146

a Italic: data from data set 3. Bold and underlined: test data.
Table 6. Degree of Confidence Using Model 2 to Predict Data Set 3

class   patterns from data set 3   patterns from data sets 1 and 2   total patterns in class   predictions with absolute error > ±2 °C   degree of confidence
1       25                         0                                 25                        19                                        24.0% (= 100% - (19/25)%)
2       4                          18                                22                        0                                         100.0%
3       7                          15                                22                        4                                         57.1%
8       3                          10                                13                        0                                         100.0%
11      9                          2                                 11                        4                                         44.4%
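The degree-of-confidence figures used throughout this paper follow one simple formula, 100% minus the percentage of patterns missing the target by more than the required tolerance. A minimal sketch (function name hypothetical, data invented for illustration):

```python
def degree_of_confidence(predictions, targets, tol=2.0):
    """Share of patterns predicted within the required tolerance
    (here +/-2 deg C), i.e. 100% - (n_errors/n_patterns)%."""
    errors = sum(1 for p, t in zip(predictions, targets) if abs(p - t) > tol)
    return 100.0 * (1.0 - errors / len(targets))

# Class 1 of Table 6 in miniature: 19 of 25 predictions miss by more
# than 2 deg C (hypothetical errors of 3 deg C), the other 6 are exact.
preds = [3.0] * 19 + [0.0] * 6
targs = [0.0] * 25
conf = degree_of_confidence(preds, targs)   # 24.0, as in Table 6
```

Note that this counts only whether the ±2 °C band is met; it is distinct from the sum-of-squares error E used to terminate BPNN training.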
Model 3 Improvement Using Data Set 4

Data set 4 was subsequently obtained from the refinery and has 25 new data patterns. The confidence of applying model 3 to data set 4 is 100% - (8/25)% = 68.0%. Data set 4 was then combined with the previous data sets to give a database of 303 patterns, which AutoClass classified into 15 classes. Seventy-seven data patterns were selected as test data and the rest for training to develop BPNN model 4. The model has a confidence of 92.5% (= 100% - (17/226)%) for the training data and 90.9% (= 100% - (7/77)%) for the test data.

To make a comparison with the conventional sampling approach, 77 data patterns were also selected by random sampling as test patterns. The training was terminated using the same criterion (i.e., a training error of 5.75e-1). On this basis, 11 test data patterns have errors exceeding ±2 °C, so the confidence for the test data is 100% - (11/77)% = 85.7%. This is lower than that of the test data selected in this study, which is 90.9% (= 100% - (7/77)%). The confidence of the training data for both approaches is the same, 92.5%.

Application of Model 4

Model 4, developed from all four data sets, covers all the operational seasons and is being used very satisfactorily. It has proved to be robust and reliable over a wide range of operational conditions. Figures 4 and 5 show the differences between predictions using model 4 and the target values of the 303 patterns. Variations in the input and output variables are summarized in Table 7. The plant has reduced the sampling frequency to twice
a day compared to the four to six times used previously, and the intention is to reduce this to once a day in the future.

Figure 4. Comparison between predictions using model 4 and the target values for the first 151 data patterns.

Figure 5. Comparison between predictions using model 4 and the target values for the last 152 data patterns.

Table 7. Changing Ranges of Input and Output Variables of BPNN Model 4

attribute   unit   input/output   maximum   minimum   range
T1-11       °C     input          205.40    181.50    23.90
T1-12       °C     input          262.30    215.10    47.20
T1-33       °C     input          228.00    186.10    41.90
T1-42       °C     input          285.00    246.80    38.20
T1-20       °C     input          187.30    129.20    58.10
F215        T/h    input          170.00    106.00    64.00
T1-09       °C     input          113.00    102.30    10.70
T1-00       °C     input          511.00    493.00    18.00
F205        T/h    input          118.40    80.00     38.40
F204        T/h    input          39.50     10.00     29.50
F101        T/h    input          3.46      2.39      1.07
FR-1        T/h    input          10.00     6.40      3.60
FIQ22       T/h    input          7.63      3.30      4.33
F207        T/h    input          14.00     6.80      7.20
TS3         °C     output         7.00      -13.00    20.00

Conclusions

It is important to have an appropriate mechanism for testing the reliability of software sensor models developed using BPNN, in both the model development and maintenance stages. This is complicated by the high dimensionality and large volume of the data. The integral framework described in this paper uses Bayesian automatic classification to cluster the multivariate data into classes of similar patterns. Test data for BPNN model development are then selected from each of the classes. When new data become available, it is possible to tell whether they represent new information and therefore to determine when the neural network needs to be retrained. Through the progressive improvement of the model, it is found that a model's confidence in predicting new data is lower if more of the data are grouped into new classes: all patterns in data set 2 fell into new classes, and the confidence of predicting data set 2 using model 1 is 25.0%, whereas some patterns in data set 3 were classified into old classes, and the confidence of predicting data set 3 using model 2 is 43.8%. The effectiveness of the approach has been demonstrated by designing a software sensor for the main fractionator of a refinery fluid catalytic cracking process to predict the condensation point of light diesel oil. It shows that the selection of test data based on an AutoClass classification gives better results than the random selection approach.

Nomenclature
ART2 = adaptive resonance theory
BPNN = back-propagation neural networks
F101 = steam flow rate
F204 = flow rate of the recycle oil
F205 = fresh feed flow rate to the reactor
F207 = flow rate of the rich-absorbent oil
F215 = flow rate of the pumparound
FIQ22 = flow rate of the over-heated steam
FR-1 = steam flow rate
T1-00 = reaction temperature
T1-09 = top temperature of the column
T1-11 = temperature on tray 22, where the light diesel is withdrawn
T1-12 = temperature on tray 20, where the light diesel is withdrawn
T1-20 = return temperature of the pumparound
T1-33 = temperature on tray 19
T1-42 = temperature on tray 16 (i.e., the initial temperature of the pumparound)
y′i = prediction by the model for the ith training pattern
yi = target value of the ith training pattern
n = total number of training patterns
Acknowledgment Support for this work by EPSRC (grant reference: GR/L61774) is gratefully acknowledged. Literature Cited (1) Marlin, T. E. Process control: designing processes and control systems for dynamic performance; McGraw-Hill: New York, 1995. (2) Montague, G. A.; Morris, A. J.; Tham, M. T. Enhancing bioprocess operability with genetic software sensors. J. Biotechnol. 1992, 25, 183-201. (3) Montague, G. A.; Gent, C.; Morris, A. J.; Buttress, J. Industrial reactor modelling with artificial neural networks. Trans. Inst. Measurement Control 1996, 18, 118-124. (4) Thibault, J.; Breusegem, V. V.; Cheruy, A. On-line prediction of fermentation variables using neural networks. Biotechnol. Bioeng. 1990, 36, 1041-1048. (5) Thibault, J.; Breusegem, V. V.; Cheruy, A. Adaptive neural models for on-line prediction in fermentation. Can. J. Chem. Eng. 1991, 69, 481-487. (6) Cheruy, A. Software sensors in bioprocess engineering. J. Biotechnol. 1997, 52, 193-199. (7) He, X. R.; Zhao, X.G.; Chen, B. Z. On-line estimation of vapour pressure of stabilised gasoline via ANN’s. Chin. J. Chem. Eng. 1997, 5 (1), 23-28. (8) Gudi, R. D.; Shah, S. L.; Gray, M. R. Adaptive multirate state and parameter-estimation strategies with application to a bioreactor. AIChE J. 1995, 41, 2451-2464. (9) Chen, B. Z.; He, X. R. Neural network intelligent system for the on-line optimisation in chemical plants. Chin. J. Chem. Eng. 1997, 5 (1), 57-62. (10) Regnier, N.; Defaye, G.; Caralp, L.; Vidal, C. Software sensor-based control of exothermic batch reactors. Chem. Eng. Sci. 1996, 51, 5125-5136. (11) McGreavy, C.; Lu, M. L.; Wang, X. Z.; Kam, E. K. T. Characterisation of the behaviour and product distribution in fluid catalytic cracking using neural networks. Chem. Eng. Sci. 1994, 49, 4717-4724.
(12) Yang, S. H.; Wang, X. Z.; McGreavy, C.; Chen, Q. H. Soft sensor based predictive control of industrial fluid catalytic cracking processes. Chem. Eng. Res. Des. 1998, 76, 499-508. (13) Lorentz, G. G. The 13th problem of Hilbert. In Mathematical Developments from Hilbert Problems; Browder, F. E., Ed.; American Mathematical Society: Providence, RI, 1976. (14) Crowe, E. R.; Vassiliadis, C. S. Artificial intelligence: starting to realise its practical promise. Chem. Eng. Progress 1995, (January), 22-31. (15) Qin, S. J.; Yue, H.; Dunia, R. Self-validating inferential sensors with application to air emission monitoring. Ind. Eng. Chem. Res. 1997, 36 (6), 1675-1685. (16) Cheeseman, P.; Stutz, J. Bayesian classification (AutoClass): theory and results. In Advances in Knowledge Discovery and Data Mining; Fayyad, U. M., Piatetsky-Shapiro, G., Smyth, P., Uthurusamy, R., Eds.; AAAI Press/MIT Press: Menlo Park, CA/Cambridge, MA, 1996. http://ic-www.arc.nasa.gov/ic/projects/bayes-group/autoclass/autoclass-c-program.html. (17) Cheeseman, P.; Stutz, J.; Self, M.; Taylor, W.; Goebel, J.; Volk, K.; Walker, H. Automatic classification of spectra from the infrared astronomical satellite (IRAS); NASA Reference Publication 1217; National Technical Information Service: Springfield, VA, 1989. (18) Cheeseman, P.; Freeman, D.; Kelly, J.; Self, M.; Stutz, J.; Taylor, W. AutoClass: a Bayesian classification system. In Proceedings of the Fifth International Conference on Machine Learning; Laird, J., Ed.; Morgan Kaufmann: San Mateo, CA, 1988. (19) Hanson, R.; Stutz, J.; Cheeseman, P. Bayesian classification theory; Technical Report FIA-90-12-7-01; NASA Ames Research Center, Artificial Intelligence Branch, May 1991; http://ic-www.arc.nasa.gov/ic/projects/bayes-group/autoclass/autoclass-c-program.html. (20) Everitt, B. S.; Hand, D. J. Cluster Analysis, 2nd ed.; John Wiley & Sons: New York, 1980. (21) Wang, X. Z.; McGreavy, C. Automatic classification for mining process operational data. Ind. Eng. Chem. Res. 1998, 37, 2215-2222. (22) Chen, F.; Chen, B.; He, X. Genetic algorithms and artificial neural networks. (I) Training artificial neural networks with extended genetic algorithms. J. Chem. Ind. Eng. (in Chinese with English abstract) 1996, 47 (3), 280-286. (23) Chen, F.; Chen, B.; He, X. Genetic algorithms and artificial neural networks. (II) Training artificial neural networks by EGA-GDR. J. Chem. Ind. Eng. (in Chinese with English abstract) 1996, 47 (4), 421-426.
Received for review April 13, 1998 Revised manuscript received July 17, 1998 Accepted July 20, 1998 IE980230A