Ind. Eng. Chem. Res. 2000, 39, 3778-3788
PROCESS DESIGN AND CONTROL

An Extended Self-Organizing Map for Nonlinear System Identification

Ming Ge,†,§ Min-Sen Chiu,*,‡ and Qing-Guo Wang†

Departments of Electrical Engineering and of Chemical and Environmental Engineering, National University of Singapore, 10 Kent Ridge Crescent, Singapore 119260, Singapore

* To whom all correspondence should be addressed. Telephone: (65) 874 2223. Fax: (65) 779 1936. E-mail: checms@nus.edu.sg.
† Department of Electrical Engineering.
‡ Department of Chemical and Environmental Engineering.
§ Current address: Honeywell Singapore Technology Center, Singapore 486073, Singapore.

Local model networks (LMN) have recently been proposed for modeling a nonlinear dynamical system with a set of locally valid submodels across the operating space. Despite the recent advances of LMN, a priori knowledge of the processes has to be exploited for the determination of the LMN structure and the weighting functions. However, in most practical cases, a priori knowledge may not be readily accessible for the construction of LMN. In this paper, an extended self-organizing map (ESOM) network, which can overcome the aforementioned difficulties, is developed to construct the LMN. The ESOM is a multilayered network that integrates the basic elements of a traditional self-organizing map and a feed-forward network into a connectionist structure. A two-phase learning algorithm is introduced for constructing the ESOM from the plant input-output data, with which the structure is determined through the self-organizing phase and the model parameters are obtained by the linear least-squares optimization method. Literature examples are used to demonstrate the effectiveness of the proposed scheme.

1. Introduction

The technique of local model networks (LMN) has become appealing for modeling complex nonlinear systems in recent years because of its intrinsic simplicity and the weak assumptions required. The idea of LMN is to approximate a nonlinear system with a set of relatively simple local models valid in certain operating regimes. To this end, a weighting function is assigned to each local model to indicate its respective degree of validity in the operating space considered, and the LMN is formed as the weighted sum of these local models. The approximation properties of LMN have recently been examined in greater detail by several researchers.1-3 Under some mild conditions, it has been shown that LMN can uniformly approximate any system on a compact subset given a sufficient number of local models.4

Although the results obtained so far have been quite promising for LMN, there remain problems that need to be solved. First, the problem of how to partition the operating space still exists in approximating high-dimensional and complex nonlinear systems. A coarse split of the operating space may create the LMN by simply optimizing the weighting functions; however, such a strategy will inevitably increase the difficulty of modeling high-dimensional systems. In particular, for LMN it may not be easy to find a small number of variables to characterize the operating regimes because of the possibly large number of local models and the resulting loss of transparency of the LMN. The other problem is that the weighting functions for interpolating between the local models are usually difficult to establish directly when little knowledge of the actual regimes exists. Some methods for solving these problems can be found in the literature,2,5 but they remain complex and laborious tasks.

Despite the recent advances of LMN, a priori knowledge of the processes has to be exploited for the determination of the LMN structure and the weighting functions. However, in most practical cases, a priori knowledge may not be readily accessible for the construction of LMN. To overcome these difficulties, some techniques have recently been developed. Hecker et al.6 identified a heat exchanger using LMN with the local linear model trees algorithm as the learning method to estimate the weighting functions. Brown et al.7 proposed a hybrid learning approach to construct LMN from autoregressive exogenous (ARX) local models and normalized Gaussian basis functions. In their work, singular value decomposition is utilized to identify the local models, in conjunction with a quasi-Newton optimization method for determining the centers and widths of the weighting functions. Other recent progress was made by Chikkula et al.,8 who employed hinging hyperplanes to identify the local models used for scheduled model predictive control.

The main contribution of this paper is to develop an extended self-organizing map (ESOM) network to identify the LMN, which can overcome the difficulties discussed above. A two-phase learning algorithm is proposed for building the ESOM network. The first learning phase decides the proper partition of the local models and the reduction of the high-dimensional operating space.
The second phase adjusts the link parameters of the ESOM network, with a least-squares method incorporated to optimize the local models. Literature examples are used to illustrate the performance and applicability of the proposed learning algorithm.

This paper is organized as follows. Section 2 formulates the LMN problem based on ARX local models. Section 3 presents a brief introduction of the self-organizing map network, and Section 4 develops the ESOM as the new infrastructure for obtaining the LMN. Simulation results are presented in Section 5 to illustrate the effectiveness of the ESOM.

2. Problem Formulation

The problem of developing the LMN involves constructing a set of local models and the accompanying weighting functions using either a priori knowledge or observed input-output data. In most practical cases, the latter is required to determine the LMN. In this paper, we restrict our attention to single-input single-output systems, and the LMN under consideration has the following form:4

$$y(k) = \sum_{i=1}^{n_\phi} \rho_i(\phi(k-1))\, \psi^{\mathrm T}(k-1)\, \theta_i \qquad (1)$$
where ψ(k-1) is the regression vector given by

$$\psi(k-1) = [y(k-1), \dots, y(k-n_y), u(k-n_d), \dots, u(k-n_d-n_u+1), 1]^{\mathrm T} \qquad (2)$$

where y is the system output, u is the input, k denotes discrete time samples, ny and nu are integers related to the system's order, nd is the time delay, and θi is the local parameter vector:
$$\theta_i = [\theta_{i,1}, \dots, \theta_{i,n_y}, \theta_{i,n_y+1}, \dots, \theta_{i,n_y+n_u}, \theta_{i,n_y+n_u+1}]^{\mathrm T}$$

φ(k-1) ∈ Φ is an operating point vector that characterizes the process dynamics, and Φ, the set of operating points, is called the operating range. An operating regime Φi is defined as a subset of Φ. A weighting function ρi(φ), which determines the validity of the local model given the current operating point, is associated with each local operating regime Φi. The number of weighting functions and operating regimes is nφ. Each weighting function is constrained between 0 and 1 and satisfies

$$\sum_{i=1}^{n_\phi} \rho_i(\phi) = 1 \qquad (3)$$
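For concreteness, the following minimal sketch (in Python/NumPy, not part of the original formulation) evaluates eq 1 for a toy LMN. The normalized Gaussian form of the weighting functions and all numerical values are illustrative assumptions; the paper itself leaves the weighting functions unspecified at this point.

```python
import numpy as np

def lmn_predict(psi, phi, centers, widths, thetas):
    """One-step LMN prediction (eq 1) with normalized Gaussian weights (eq 3)."""
    # Assumed Gaussian regimes; any weights summing to 1 would satisfy eq 3.
    w = np.exp(-np.sum((phi - centers) ** 2, axis=1) / (2.0 * widths ** 2))
    rho = w / w.sum()                    # weighting functions rho_i, summing to 1
    return float(rho @ (thetas @ psi))   # sum_i rho_i(phi) psi^T theta_i

# Toy usage: two first-order ARX local models scheduled on phi = y(k-1).
thetas = np.array([[0.9, 0.1, 0.0],      # theta_1 for psi = [y(k-1), u(k-1), 1]
                   [0.5, 0.4, 0.1]])     # theta_2
y_next = lmn_predict(np.array([0.2, 1.0, 1.0]), np.array([0.2]),
                     centers=np.array([[0.0], [1.0]]),
                     widths=np.array([0.5, 0.5]), thetas=thetas)
```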
3. Self-Organizing Map (SOM) Network

The SOM introduced by Kohonen9 is used to transform an input signal of high dimension into a lower dimensional discrete representation while preserving topological neighborhoods. It can work as learning vector quantization machinery to adaptively quantize the input space by discovering a set of representative prototypes. The basic idea of SOM is to categorize vectorial stochastic data into different groups with the winner-selection rule, which is similar to the machinery of system identification.

The SOM is composed of two fully connected layers: an input layer and a competitive layer, namely, the Kohonen layer (Figure 1).

Figure 1. SOM architecture.

The input layer simply receives the incoming input vector and forwards it to the competitive layer through the connected weight vectors. The goal of the SOM is to represent the input data density by the distribution of the connected weights via competitive learning and finally drive the connected weight vectors to become similar to the input data. Throughout this paper, we define the connected weight vector between layer 1 and the ith node of layer 2 as Wi = (wi,1, wi,2, ..., wi,K1)T, where wi,j denotes the connected weight between the jth node in layer 1 and the ith node in layer 2 and Km is the number of nodes in layer m. In what follows, a brief description of the original SOM algorithm is given.

Input Layer. The input training data v are given.

Kohonen Layer. Every node in this layer has input lines connected to the input layer with the corresponding weight vectors. The purpose of the Kohonen layer is to make the connected weights similar to the input data by a self-organizing phase, as described in the following:

Step 1. Initialize Wi, i = 1, 2, ..., K2.

Step 2. At learning step l, determine the winner node whose weight vector Ws best matches the input v, i.e.,
$$\|W_s(l) - v(l)\| = \min_i \|W_i(l) - v(l)\| \qquad (4)$$
where ||·|| denotes the Euclidean norm.

Step 3. Update every weight vector Wi in the Kohonen layer by
$$W_i(l+1) = W_i(l) + \gamma(i,l)\,(v(l) - W_i(l)) \qquad (5)$$
where γ(i,l) is a scalar-valued function with a value between 0 and 1 depending on the distance between the winner node and the node to be adapted. A typical γ(i,l) can be chosen as9
$$\gamma(i,l) = \frac{1}{\sqrt{1+l}\; e^{\|p_i - p_s\|^2}} \qquad (6)$$

where pi and ps are the lattice positions of node i and the winner node, respectively. Check for the convergence of Wi. If it has not converged, go to step 2; otherwise, stop the algorithm.
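A minimal sketch of this self-organizing phase is given below. It replaces the convergence test with a fixed number of epochs and assumes the lattice positions pi are supplied explicitly; both choices, and all numerical values, are our assumptions rather than prescriptions of the paper.

```python
import numpy as np

def som_train(V, K2, grid, n_epochs=20, seed=0):
    """Self-organizing phase of the plain SOM (steps 1-3, eqs 4-6)."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((K2, V.shape[1]))     # step 1: initialize W_i
    l = 1
    for _ in range(n_epochs):                     # fixed epochs replace the convergence test
        for v in V:
            s = np.argmin(np.linalg.norm(W - v, axis=1))      # eq 4: winner node
            d2 = np.sum((grid - grid[s]) ** 2, axis=1)        # ||p_i - p_s||^2
            gamma = 1.0 / (np.sqrt(1.0 + l) * np.exp(d2))     # eq 6
            W += gamma[:, None] * (v - W)                     # eq 5
            l += 1
    return W

# e.g., a 3 x 4 Kohonen lattice (K2 = 12) trained on random 2-D data:
grid = np.array([(i, j) for i in range(3) for j in range(4)], dtype=float)
W = som_train(np.random.default_rng(1).random((500, 2)), K2=12, grid=grid)
```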
Figure 2. ESOM architecture.
Because of its self-organizing clustering property, SOM is suitable for the identification of a nonlinear system if it can be trained with a supervised learning mechanism instead of its original unsupervised one. To achieve this, a new ESOM embedded with a two-phase learning algorithm is developed in the next section.

4. ESOM Network

The ESOM shown in Figure 2 has one additional layer attached to the original SOM, and thus it consists of three layers: an input layer, a Kohonen layer, and an output layer.

Input Layer. Given input-output training data v and y, the input of this layer includes y as well as v; i.e., the data pair v̄ = (v, y) serves as the input to the ESOM. This forms one of the major distinctions between ESOM and SOM, whose input contains only v. This modification is essential in the proposed ESOM because its purpose is to capture the characteristics of the input-output mapping space. Therefore, in the subsequent self-organizing phase, the nonlinear system can be divided into local regions much more accurately than those generated only from the input training data.

Figure 3. Input-output data set for CSTR.

Kohonen Layer. In this layer, the partition task is accomplished by a modified self-organizing phase in which the system training data are partitioned into different local regions and the connected weights are determined via self-organizing of the data v̄. As mentioned above, the input of the ESOM contains both v and y. As a result, the original self-organizing algorithm used in SOM would produce incorrect results if applied directly to v and y.10 Therefore, v and y must be treated differently in order to obtain the proper self-organizing feature map. To this end, a master-slave decomposition method formerly applied to the Bayes classifier is employed in the proposed ESOM network. The input v is the master input vector and is treated as usual in the original SOM algorithm. The input y is the slave input vector and is only used to update the corresponding weight vector. In this way, the weight vector Wi is also decomposed into two parts, (Wi^v, Wi^y), where the former is called the master weight vector and the latter the slave weight vector. According to the vector quantization property,10 the final weight vector Wi = (Wi^v, Wi^y) will become similar to the input vector v̄ = (v, y), and thus the slave weight vector Wi^y will also approach the slave input vector y. Consequently, the topology-conserving feature still holds in this modified algorithm.

Another distinct feature of the proposed algorithm is the use of the multi-winner-take-all (MWTA) rule11 in the selection of winner nodes. In the SOM algorithm, only one Kohonen unit is selected in the competition. In comparison, the MWTA strategy determines multiple winner nodes in the ESOM; i.e., the output of the Kohonen layer is calculated from a combination of a set of winner nodes. This strategy therefore operates the ESOM in an interpolation mode that improves the mapping approximation accuracy.
Figure 4. Weighting functions for example 1: (a) ρ1, (b) ρ2, (c) ρ3, (d) ρ4, (e) ρ5, (f) ρ6, (g) ρ7, (h) ρ8.
The modified self-organizing algorithm is described in the following:

Step 1. Initialize Wi, i = 1, 2, ..., K2.

Step 2. At each learning step l, only the master weight vector Wi^v, i = 1, 2, ..., K2, is considered and used in eq 4.

Step 3. Update every weight vector Wi = (Wi^v, Wi^y) in the Kohonen layer as follows:
$$W_i(l+1) = W_i(l) + \gamma(i,l)\,(\bar v(l) - W_i(l)) \qquad (7)$$
Step 4. Check for the convergence of Wi. If it has not converged, go to step 2. Otherwise, the weight vectors Wi^v, i = 1, 2, ..., K2, serve as the initial values in the subsequent steps for determining the multiple winner nodes. Reset the learning step l = 1, and go to step 5.

Step 5. At each learning step l, the MWTA rule is employed to determine a fixed number of winner nodes, Ωj, j = 1, 2, ..., Kw, in the Kohonen layer according to the distance between Wi^v and v:

$$\|\Omega_j(l) - v(l)\| = \operatorname*{least}_{K_w}\, \|W_i^v(l) - v(l)\|, \quad i = 1, 2, \dots, K_2;\; j = 1, 2, \dots, K_w \qquad (8)$$

where least_Kw means to choose the Kw winner nodes that are closest to the current input v as measured by the Euclidean norm.

Step 6. Update every master weight vector Wi^v in the Kohonen layer as follows:

$$W_i^v(l+1) = W_i^v(l) + \chi(i,l)\,(v(l) - W_i^v(l)), \quad i = 1, 2, \dots, K_2 \qquad (9)$$

where

$$\chi(i,l) = \frac{1}{K_w}\sum_{j=1}^{K_w} \frac{1}{\sqrt{1+l}\; e^{\|p_i - p_j\|^2}} \qquad (10)$$

Step 7. Check for the convergence of Wi^v. If it has not converged, go to step 5. Otherwise, stop.

This modified self-organizing phase can be mathematically interpreted in such a way that the algorithm creates a topology-conserving mapping of a subset of the input-output space into the space spanned by the weight vectors in the Kohonen layer. More significantly, the ESOM represents the system dynamics in the discrete output lattice enhanced with neighborhood relationships; i.e., states that are adjacent in the input space will be represented by nodes in the Kohonen layer that are neighbors in the output space. This feature makes it possible to extract a set of linear models with centers located at the winner nodes.

To compute the output of the winner nodes, we define zi,j(k) and µj(k) as follows:11

$$z_{i,j}(k) = \left(1 - \frac{\|v(k) - \Omega_j\|^2}{\sum_{j=1}^{K_w} \|v(k) - \Omega_j\|^2}\right)\frac{v_i(k)}{\|v(k)\|}, \quad i = 1, \dots, K_1-1;\; j = 1, \dots, K_w;\; k = 1, 2, \dots, K_t \qquad (11)$$

where Kt is the number of training data, vi(k) is the ith element of v(k), and

$$\mu_j(k) = z_{1,j}(k)\, z_{2,j}(k) \cdots z_{K_1-1,j}(k), \quad j = 1, \dots, K_w \qquad (12)$$

Equations 11 and 12 may be easier to understand if we interpret them by using the analogy of a fuzzy system (FS). In fact, the winner nodes can be regarded as the fuzzy sets, and the preceding two equations can be regarded as the fuzzification procedure in FS. For example, assume that there exist two winner nodes (corresponding to two fuzzy sets) and the data vector v = [v1, v2]; after processing v by eqs 11 and 12 (corresponding to fuzzification), we obtain z1,1 and z1,2 from v1 and z2,1 and z2,2 from v2 (z1,1, z1,2, z2,1, and z2,2 corresponding to the fuzzy values). Because there is more than one element in v, we have to aggregate their "fuzzy values" using some operator like the T-norm fuzzy operator in FS.12 In other words, µ1 = z1,1z2,1 and µ2 = z1,2z2,2 calculated from eq 12 represent the aggregated effects of two different "fuzzy values" by the algebraic product.
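The sketch below outlines both passes of the modified self-organizing phase (eqs 7-10). The fixed iteration counts and the rule used at the end to pick the Kw local-model centers (the nodes that win most often) are our assumptions; the paper leaves these implementation details open.

```python
import numpy as np

def esom_phase1(V, Y, K2, Kw, grid, n1=20, n2=20, seed=0):
    """Modified self-organizing phase of the ESOM (eqs 7-10), a hedged sketch."""
    rng = np.random.default_rng(seed)
    Wv = rng.standard_normal((K2, V.shape[1]))   # master weight vectors W_i^v
    Wy = rng.standard_normal(K2)                 # slave weight vectors  W_i^y
    l = 1
    for _ in range(n1):                          # steps 1-4: single-winner pass
        for v, y in zip(V, Y):
            s = np.argmin(np.linalg.norm(Wv - v, axis=1))   # winner from master part only
            g = 1.0 / (np.sqrt(1.0 + l) * np.exp(np.sum((grid - grid[s]) ** 2, axis=1)))
            Wv += g[:, None] * (v - Wv)          # eq 7, master part
            Wy += g * (y - Wy)                   # eq 7, slave part
            l += 1
    l = 1
    for _ in range(n2):                          # steps 5-7: MWTA refinement
        for v in V:
            win = np.argsort(np.linalg.norm(Wv - v, axis=1))[:Kw]        # eq 8
            d2 = np.sum((grid[:, None, :] - grid[win][None, :, :]) ** 2, axis=2)
            chi = np.mean(1.0 / (np.sqrt(1.0 + l) * np.exp(d2)), axis=1)  # eq 10
            Wv += chi[:, None] * (v - Wv)        # eq 9
            l += 1
    # One plausible reading: take the Kw most frequently winning nodes as Omega_j.
    counts = np.zeros(K2, dtype=int)
    for v in V:
        counts[np.argsort(np.linalg.norm(Wv - v, axis=1))[:Kw]] += 1
    return Wv[np.argsort(-counts)[:Kw]]
```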
Figure 5. System surface of a CSTR example.

Let the output of each winner node be calculated by
$$\zeta_{2,i}(k) = \frac{\mu_i(k)}{\sum_{i=1}^{K_w}\mu_i(k)}\left(\sum_{j=1}^{K_1-1}\theta_{i,j}\, v_j(k) + \theta_{i,K_1}\right) \qquad (13)$$

where the θi,j are the parameters that determine the local linear model.

Output Layer. The output of this layer is computed as the summation of all incoming signals:

$$\zeta_3(k) = \sum_{i=1}^{K_w} \zeta_{2,i}(k) \qquad (14)$$

Combining eqs 11-14 gives

$$\zeta_3(k) = \sum_{i=1}^{K_w} \rho_i(k)\left(\sum_{j=1}^{K_1-1}\theta_{i,j}\, v_j(k) + \theta_{i,K_1}\right) = \sum_{i=1}^{K_w} \rho_i(k)\, \hat v^{\mathrm T}(k)\, \theta_i = \hat\rho^{\mathrm T}(k)\, \theta \qquad (15)$$

where θi = (θi,1, ..., θi,K1-1, θi,K1)T, θ = (θ1T, ..., θKwT)T, v̂(k) = (v(k), 1)T, ρi(k) = µi(k)/Σ_{i=1}^{Kw} µi(k), and ρ̂(k) = (ρ1v̂T(k), ρ2v̂T(k), ..., ρKwv̂T(k))T.

Given the input-output data {v(k), y(k), k = 1, 2, ..., Kt}, ρ̂(k) is fixed after the self-organizing process, so the output ζ3(k) is actually a linear combination of the elements of θ. Hence, θ can be identified with any efficient linear optimization method. A linear least-squares method is employed here to find the solution for θ that minimizes ||Λθ - Y||, where Λ = [ρ̂(1), ρ̂(2), ..., ρ̂(Kt)]T and Y = [y(1), y(2), ..., y(Kt)]T:

$$\theta = (\Lambda^{\mathrm T}\Lambda)^{-1}\Lambda^{\mathrm T} Y \qquad (16)$$
This is the second learning phase of the ESOM algorithm. Because the problem solved above is only a linear optimization problem, the weight parameter θ is guaranteed to be the global optimum. Such a result is not surprising because the nonlinearity of the system has been handled in the self-organizing phase.

In what follows, a brief comparison between the ESOM and neural networks is made. It is easily observed that the ESOM is similar to local neural networks such as the cerebellar model articulation controller and the radial basis function network,13 except in the second layer, where the way the nonlinearity of the system is partitioned into local regions is different. In most neural networks, the nonlinearity is partitioned into local regions only through a predetermined nonlinear structure, without making use of the system information. Consequently, the identification results may become rather unsatisfactory when the nonlinear system structure is incorrectly prespecified, such that the determination of the network weights is prone to be trapped in a local optimum. This point will be further demonstrated in the Examples section.

Figure 6. ESOM model surface.

Figure 7. Error surface between the plant and ESOM.

To recap, the reasoning mechanism of ESOM can be summarized as follows: First, the input-output training data are collected as the input-output of the ESOM. Second, these data are trained via the modified self-organizing phase, and the nonlinearity is distributed into the different local regions. Finally, a linear optimization problem is solved to estimate the weight parameters θ.
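A compact sketch of this second phase is given below. It assembles Λ row by row from eqs 11, 12, and 15 and solves eq 16 with numpy.linalg.lstsq rather than the explicit normal-equations inverse, a numerically safer equivalent; the function and variable names are ours.

```python
import numpy as np

def esom_phase2(V, Y, Omega):
    """Least-squares phase: build Lambda from eqs 11, 12, 15 and solve eq 16."""
    N, n = V.shape                      # n = K1 - 1 input elements per sample
    Kw = Omega.shape[0]
    Lam = np.zeros((N, Kw * (n + 1)))
    for k in range(N):
        v = V[k]
        d2 = np.sum((v - Omega) ** 2, axis=1)        # ||v(k) - Omega_j||^2
        z = (1.0 - d2 / d2.sum())[None, :] * (v / np.linalg.norm(v))[:, None]  # eq 11
        mu = np.prod(z, axis=0)                      # eq 12
        rho = mu / mu.sum()                          # rho_i(k) of eq 15
        Lam[k] = np.kron(rho, np.append(v, 1.0))     # row rho-hat(k)^T
    theta, *_ = np.linalg.lstsq(Lam, Y, rcond=None)  # eq 16, solved stably
    return theta.reshape(Kw, n + 1)                  # one theta_i per local model
```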
The ESOM can be easily implemented to solve the LMN problem formulated in the second section. Given the plant data which are sampled from the nonlinear system, the input-output data pair for ESOM is constructed as (v(k-1), y(k)), where v(k-1) is expressed in a regression vector format:
$$v(k-1) = [y(k-1), \dots, y(k-n_y), u(k-n_d), \dots, u(k-n_d-n_u+1)] \qquad (17)$$
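Arranging raw plant records into such pairs is mechanical; a small helper along the following lines (hypothetical, using the same ny, nu, nd conventions as eq 2) suffices.

```python
import numpy as np

def make_pairs(u, y, ny=1, nu=1, nd=1):
    """Stack plant records into ESOM training pairs (v(k-1), y(k)) per eq 17."""
    start = max(ny, nd + nu - 1)                     # first k with full history
    V = [[y[k - i] for i in range(1, ny + 1)] +      # y(k-1), ..., y(k-ny)
         [u[k - nd - i] for i in range(nu)]          # u(k-nd), ..., u(k-nd-nu+1)
         for k in range(start, len(y))]
    return np.array(V), np.array(y[start:])         # inputs v(k-1), targets y(k)
```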
Figure 8. NN model surface.
From eqs 15 and 17, it is obvious that eqs 1 and 15 are essentially equivalent. Furthermore, Kw in fact represents the number of local models. Generally, the model number can be chosen according to a priori knowledge, as is widely done in the LMN-related literature. Nonetheless, the choice of the local model positions in the identification space is not a trivial problem. Usually one may decide the approximate positions by observing the distribution of the input-output data. However, this empirical practice is quite difficult in a high-dimensional space, leading to degradation of the modeling accuracy. This problem can be overcome by the ESOM because the position of each local model is self-determined during the self-organizing process, irrespective of the dimensionality of the operating space.

Note that the operating point vector φ(k-1) in eq 1 should be chosen prior to constructing the local models in the conventional LMN. However, choosing such a characteristic vector often requires detailed knowledge of the system dynamics, which is a demanding task in industrial practice. In the ESOM, the regression vector v̂(k-1) is simply chosen as φ(k-1). This is reasonable because the task of characterization, i.e., the partition of the input-output space, has already been accomplished in the self-organizing map, and therefore the choice of a characteristic vector is no longer required.

5. Examples
This section presents simulation results to demonstrate the approximation properties of the ESOM network. Example 1. Consider a single enzyme-catalyzed reaction with substrate-inhibited kinetics in an isothermal continuous stirred tank reactor14 which can be described by the following dynamic equation:
$$\frac{dy}{dt} = u - y - \frac{\beta}{1 + k_1/y + y/k_2} \qquad (18)$$
where y is the reactant concentration, u is the feed reactant concentration, k1 and k2 are the kinetic parameters, and β is a constant. With k1 = 0.01, k2 = 0.1, and β = 2, the process exhibits multiple steady states for a range of input values. To generate the input-output training data set, a probability-based switching scheme identical with that of Chikkula et al.8 is utilized to design the input signal. By doing so, both low- and high-frequency behavior of the process can be excited. The resulting data are given in Figure 3. The whole data set is divided into two subsets, one for training and one for validation. Because eq 18 is a first-order ordinary differential equation, v(k-1) is chosen as

$$v(k-1) = [y(k-1), u(k-1)]$$

To determine the number of local models (Kw) and the number of nodes of the Kohonen layer (K2) in the proposed scheme, the approximation accuracy of the ESOM model is studied. Table 1 lists the mean-squared error of the trained model for different combinations of K2 and Kw.

Table 1. Mean-Squared Error versus K2 and Kw in Example 1

Kw\K2     12         36         81         144        225
 2        0.00919    0.00904    0.00904    0.00905    0.00901
 4        0.00492    0.00386    0.00358    0.00372    0.00318
 6        0.00215    0.00144    0.00152    0.00149    0.00173
 8        0.00133    0.00132    0.00123    0.00125    0.00121

For a fixed K2, the accuracy of the ESOM model improves as Kw increases. For a fixed Kw, the error decreases initially as K2 increases from 12 to a certain value, after which the error fluctuates without clear improvement as K2 increases further. Taking into account the tradeoff between the accuracy and the efficiency of the ESOM model, K2 = 12 and Kw = 8 are chosen. The resulting local models are given in Table 2, and the weighting functions are shown in Figure 4.

Table 2. ESOM Local Models for Example 1

$$y(k) = \sum_{i=1}^{8} \rho_i(\psi(k-1))\, \psi^{\mathrm T}(k-1)\, \theta_i, \qquad \psi(k-1) = [y(k-1),\, u(k-1),\, 1]^{\mathrm T}$$

θ1 = [0.536 0.732 -0.374]T, θ2 = [0.216 0.021 -0.007]T, θ3 = [0.4 0.644 -0.199]T, θ4 = [-7.602 2.239 -2.942]T, θ5 = [0.463 0.672 -0.279]T, θ6 = [-0.038 -0.02 0.008]T, θ7 = [0.961 0.862 -0.695]T, θ8 = [0.394 0.676 -0.176]T

Figures 5-7 compare the system surfaces generated by eq 18 and by the ESOM, together with the corresponding error surface. They show that the ESOM achieves a highly accurate approximation of this nonlinear process.

Figure 9. Error surface between the plant and NN.

Figure 10. Error surface between the plant and ESOM (trained with corrupted data).
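For readers wishing to reproduce a data set of this kind, the sketch below integrates eq 18 with a crude explicit Euler scheme under a probability-based input-switching signal. The step size, input levels, switching probability, and the guard against the singularity at y = 0 are all our guesses, not the settings used in the paper.

```python
import numpy as np

k1, k2, beta = 0.01, 0.1, 2.0          # kinetic parameters from the paper
dt, n_steps = 0.005, 40000             # integration setup: our assumption
rng = np.random.default_rng(1)

y = np.empty(n_steps + 1)
u = np.empty(n_steps)
y[0], u_now = 0.5, 1.0
for k in range(n_steps):
    if rng.random() < 0.01:            # probability-based input switching
        u_now = rng.uniform(0.5, 2.0)  # input range: our assumption
    u[k] = u_now
    dydt = u_now - y[k] - beta / (1.0 + k1 / y[k] + y[k] / k2)   # eq 18
    y[k + 1] = max(y[k] + dt * dydt, 1e-3)   # Euler step; keep y strictly positive
```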
To further illustrate the effectiveness of the ESOM, a feedforward neural network (NN), which consists of one hidden layer and is trained with a fast Levenberg-Marquardt algorithm, is employed for comparison. Simulation results reveal that a NN with 20 perceptrons gives the best prediction. The modeling surface reconstructed by the NN is shown in Figure 8. By comparison of the error surfaces shown in Figures 7 and 9, it is evident that the predictive performance of the ESOM is much better than that of the NN. The poor performance of the NN seems to be attributable to its structure: the learning process could become trapped in local minima because of the randomly initialized weights, or some neurons could be pushed into saturation during training. Either of these two situations can significantly degrade the modeling accuracy of the NN.

To demonstrate the robustness of the ESOM to noise effects, a random noise with a noise-to-signal ratio of 10% is deliberately added to the plant output. For the purpose of comparison, both the ESOM and the NN are trained using the same set of corrupted training data. From the error surfaces of the ESOM and the NN shown in Figures 10 and 11, it is obvious that the ESOM still has better predictive performance than the NN in the presence of corrupted data.

Figure 11. Error surface between the plant and NN (trained with corrupted data).

Figure 12. Comparison between the outputs of the crystallizer (solid line) and the ESOM model (dashed line).
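A plausible reading of the 10% figure, used in the sketch below, is a noise-to-signal standard-deviation ratio; the paper does not state how the ratio is measured, so this definition is an assumption.

```python
import numpy as np

def corrupt(y, ratio=0.10, seed=2):
    """Add zero-mean Gaussian noise scaled to 10% of the signal's std (assumed)."""
    rng = np.random.default_rng(seed)
    return y + ratio * np.std(y) * rng.standard_normal(len(y))
```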
Example 2. Consider a batch crystallization process described by15

$$\frac{d\mu_0}{dt} = B_0$$

$$\frac{d\mu_i}{dt} = iG\mu_{i-1} + B_0 r_0^i, \quad i = 1, \dots, 4$$

$$\frac{dC_s}{dt} = -3\rho_c k_v G\mu_2 - \rho_c k_v B_0 r_0^3$$

$$\frac{dT}{dt} = \frac{-3\Delta H_c \rho_c k_v m_{sol} G\mu_2 - \Delta H_c \rho_c k_v m_{sol} B_0 r_0^3 + UA_c(T_c - T)}{m_{slu} c_p}$$

$$\frac{d\tilde\mu_0}{dt} = 0$$

$$\frac{d\tilde\mu_i}{dt} = iG\tilde\mu_{i-1}, \quad i = 1, \dots, 3$$

where U is the jacket heat-transfer coefficient, Ac the heat-transfer area, ∆Hc the heat of crystallization, kv the volume shape factor, ρc the crystal density, Cs the solute concentration on a per-mass-solvent basis, h a conversion factor equal to the volume of slurry per mass of solvent, cp the specific heat of the slurry, mslu the mass of slurry, msol the mass of solvent, r0 the crystal size at nucleation (assumed constant), µi the ith moment of the crystal size distribution (CSD), and µ̃i the ith moment of the CSD for the control volume that includes only the seed crystals. The crystallizer temperature T is the measured variable, and the cooling fluid temperature Tc is taken as the manipulated variable. The growth and nucleation rates for the seeded crystallizer can be expressed as

$$G = k_g S^g, \qquad B_0 = k_b S^b \mu_3, \qquad S = \frac{C_s - C_{sat}}{C_{sat}}$$

where S is the supersaturation and Csat is the saturation concentration, which is approximated as15

$$C_{sat} = 0.1286 + 0.00588T + 0.0001721T^2$$

The values of the other kinetic parameters for this crystallization system are given in Tables 3 and 4.

Table 3. Model Parameters for Example 2

variable   value                             units
msol       7.57 × 10^6                       g
ρc         2.11                              g/cm^3
∆Hc        -85.75 + 92.82Cs - 99.97Cs^2      cal/g
UA         59 643                            cal/(min·°C)
ka         6
kv         1
kb         e^17.142                          no./(cm^3·min)
kg         e^8.849                           µm/min
b          1.78
g          1.32
r0         0

Table 4. Initial Values of State Variables for Example 2

state      value             units
µ0, µ̃0     1.8996            no. of particles/g of solvent
µ1, µ̃1     372.322           µm/g of solvent
µ2, µ̃2     73072.2           µm^2/g of solvent
µ3, µ̃3     1.43602 × 10^7    µm^3/g of solvent
µ4         2.82582 × 10^9    µm^4/g of solvent
Cs         0.59              g of solute/g of solvent

The crystallization process is simulated with the input Tc generated randomly, and the resulting input-output data are collected for the training and validation of the ESOM. In this example, K2 = 81 and Kw = 3 are chosen. The local models constructed from the ESOM are given in Table 5.

Table 5. ESOM Local Models for Example 2

$$y(k) = \sum_{i=1}^{3} \rho_i(\psi(k-1))\, \psi^{\mathrm T}(k-1)\, \theta_i, \qquad \psi(k-1) = [y(k-1),\, y(k-2),\, u(k-1),\, 1]^{\mathrm T}$$

θ1 = [0.807 0.184 0.006 0.076]T, θ2 = [1.256 -0.26 0.006 -0.043]T, θ3 = [0.826 0.166 0.006 0.094]T

Figure 12 shows that the constructed ESOM model gives an excellent approximation of the underlying nonlinear system dynamics.

6. Conclusion

In this paper, an ESOM is developed for nonlinear system modeling and identification. Based on the ESOM network, a set of local models is extracted via the self-organizing of the input-output training data. In comparison with existing local model network based identification methods, the ESOM does not require knowledge of the locations of the local models and the corresponding weighting functions, because they can all be generated automatically through the two-phase learning algorithm without a priori knowledge. Another advantage is that, during the learning process of the ESOM, the high-dimensional operating space can be divided into lower dimensional local regions formed according to the probability and density distribution of the training data, thus effectively alleviating the curse of dimensionality. These properties are illustrated in the identification of two nonlinear systems, and the simulation results demonstrate the effectiveness of the ESOM.

Literature Cited

(1) Murray-Smith, R. A local model network approach to nonlinear modeling. Ph.D. Thesis, University of Strathclyde, Glasgow, U.K., 1994.
(2) Johansen, T. A.; Foss, B. A. Identification of nonlinear system structure and parameters using regime decomposition. Automatica 1995, 31, 321-326.
(3) Narendra, K. S.; Balakrishnan, J.; Ciliz, M. K. Adaptation and learning using multiple models, switching and tuning. IEEE Control Syst. Mag. 1995, 15, 37-51.
(4) Johansen, T. A.; Foss, B. A. Constructing NARMAX models using ARMAX models. Int. J. Control 1993, 58, 1125-1153.
(5) Kavli, T.; Weyer, E. On ASMOD: an algorithm for empirical modeling using spline functions. In Neural Network Engineering in Dynamic Control Systems; Hunt, K., Irwin, G., Warwick, K., Eds.; Springer: New York, 1995; pp 83-104.
(6) Hecker, O.; Nells, O.; Moseler, O. Nonlinear system identification and predictive control of a heat exchanger based on local linear fuzzy models. Proceedings of the American Control Conference, Albuquerque, NM, 1997; pp 3294-3298.
(7) Brown, M. D.; Lightbody, G.; Irwin, G. W. Nonlinear internal model control using local model networks. IEE Proc.: Control Theory Appl. 1997, 144, 505-514.
(8) Chikkula, Y.; Lee, J. H.; Ogunnaike, B. A. Dynamically scheduled MPC of nonlinear processes using hinging hyperplane models. AIChE J. 1998, 44, 2658-2674.
(9) Kohonen, T. Self-Organization and Associative Memory; Springer-Verlag: Berlin, 1984.
(10) Veelenturf, L. P. J. Analysis and Application of Artificial Neural Networks; Prentice-Hall: New York, 1994.
(11) Nie, J. H.; Linkens, D. A. Fuzzy Neural Control: Principles, Algorithms, and Applications; Prentice-Hall: New York, 1995.
(12) Dubois, D.; Prade, H. Fuzzy Sets and Systems: Theory and Applications; Academic Press: New York, 1980.
(13) Hagan, M. T.; Demuth, H. B.; Beale, M. Neural Network Design; PWS: Boston, 1996.
(14) Bruns, D. D.; Bailey, J. E. Process operation near an unstable steady state using nonlinear feedback control. Chem. Eng. Sci. 1975, 30, 755-762.
(15) Miller, S. M. Modelling and quality control strategies for batch cooling crystallizers. Ph.D. Thesis, University of Texas at Austin, Austin, TX, 1993.
Received for review January 3, 2000. Accepted July 22, 2000.