Ind. Eng. Chem. Res. 1995, 34, 1735-1742

Identification of Nonlinear Dynamic Processes with Unknown and Variable Dead Time Using an Internal Recurrent Neural Network

Yi Cheng, Thomas W. Karjala, and David M. Himmelblau*

Department of Chemical Engineering, The University of Texas at Austin, Austin, Texas 78712

Methods for identifying a nonlinear dynamic process with unknown and possibly variable dead times via an internal recurrent network (IRN) model are proposed. It is shown that an IRN with sufficient hidden nodes can be used directly for the identification of a nonlinear dynamic process with fixed or variable dead times. If a process input window rather than just the current process input is used as the input to an IRN model, the number of hidden nodes in the IRN model can be reduced, and the prediction performance of the IRN improves for processes with large and variable dead times. Simulation results for a pH neutralization process with transportation lags demonstrate the effectiveness of the proposed methods.

1. Introduction

The identification of processes with unknown pure time delays has been a challenge in many industries, particularly in the process industries. It is well-known that the performance of control systems can be degraded if time delays are not modeled appropriately. We consider here only what are termed pure (time) delays or dead time. Such delays can be put into two general categories: (a) transport lags and (b) observation lags. Delays from transport lags correspond to the time it takes for a fluid to move through a process vessel, for a thermal response to propagate, or for recycled material to be returned to the system. Delays from observation lags correspond to the time it takes for an instrument to respond or the time it takes to receive an analysis from a laboratory. We examine the identification of processes in which the time delays are unknown a priori. Consequently, the problem is how to identify a process model that can model both the process dynamics and the process time delays based on the process input and output data.

To model a process (including instrumentation) that exhibits pure time delays, you have to keep in mind that whatever model is formulated will be an approximate model. Even though the selected state equations are nonlinear, by using an approximate model with delayed coordinates, identification can be applied to finite dimensional rather than infinite dimensional equations. Whether or not the approximate model converges to some "true" model in some sense is unimportant since the "true" model of the process is never known. What is important is that the approximate model represents the process in all of the features that are important for the application of the model. The conditions for the convergence of the identification strategy are also important. Modeling for processes with delays is greatly facilitated by using a discrete model and assuming that any delays are approximately integer multiples of the sampling period.
Various kinds of discrete models that have been suggested include autoregressive moving average (ARMA) and moving average (MA) models, differential-difference equations, Padé approximations, Walsh functions, approximations based on linear semigroup theory, state space models, wavelets, and artificial neural networks. Our work focuses on using recurrent neural networks. In contrast to the usual type of parameter estimation strategies, we are not interested in getting unbiased and minimum variance estimates for the values of the model parameters, the initial conditions, or the delays themselves but instead seek unbiased and minimum variance estimates for the predictions of the process states by the net, given a particular domain for the inputs to the net. Various types of corruption in the values of the process measurements must be considered in practice, including random noise, correlated noise, gross errors, and biased measurements as well as unmeasured disturbances that influence the process data. We treat only the first here, but the other sources of degradation of data can be ameliorated if valid data records exist.

In what follows, we first summarize the essential features of modeling a dynamic process with delays by using internally recurrent neural networks (IRNs). Then we show how IRNs can be used to model processes with fixed or variable time delays. Finally, we provide an example application of the proposed technique for a pH neutralization process.

* Author to whom correspondence should be addressed. Fax: (512) 471-7060.

© 1995 American Chemical Society

Figure 1. Structure of an IRN (inputs, input units, context units, computation nodes, outputs).

2. Internal Recurrent Nets as Approximators for Processes with Dead Time

The IRNs (Elman, 1990; Williams, 1990; Karjala and Himmelblau, 1994) used in our work comprised three layers of nodes (see Figure 1): the input layer, the hidden layer, and the output layer. A bias node was added both to the input layer and to the hidden layer. In addition to the feedforward connections between the nodes in the input layer and the nodes in the hidden layer, and between the hidden node layer and the output node layer, that exist for ordinary feedforward neural networks, an IRN has connections and nodes (called context nodes) that provide 1 time step delay feedback among the hidden nodes themselves. We used Gaussian activation functions (not radial basis functions) for the hidden nodes and linear activation functions for the output nodes in our studies. The mathematical description for this kind of IRN model is as follows:

$$x(k+1) = \sigma\left[W^{R} x(k) + W^{I} u(k) + W^{Ibias}\right]$$

$$\hat{y}(k+1) = W^{O} x(k+1) + W^{Obias} \tag{1}$$

where $x(k)$ represents the vector of the states (the outputs) of the hidden nodes of an IRN model, $\hat{y}(k+1)$ is the process output vector at time $k+1$ predicted by the IRN model, $u(k)$ is the process input vector at time $k$, $\sigma(\cdot)$ is the vector of the nonlinear activation function of hidden neurons (Gaussian function in our work), and $W^{I}$, $W^{R}$, $W^{O}$, $W^{Ibias}$, and $W^{Obias}$ are the weight matrices for the weights (coefficients) associating the input to the hidden nodes, the context nodes to the hidden nodes as feedback (every context node is a 1 time step delay shift operator), the hidden nodes to the output nodes, and the bias weights, respectively.

Internal recurrent neural networks have been shown to be very good models for the identification of nonlinear dynamic processes without time delays (You and Nikolaou, 1993; Karjala and Himmelblau, 1994). After the training of an IRN model is completed, only the process input at time step $k$ is needed to predict the process output at time $k+1$. No previous process input and output information is needed, in contrast to a NARMAX model (Chen et al., 1990).

Let us consider here a multiple input single output (MISO) process that can be modeled via an IRN (multiple input multiple output (MIMO) processes can be considered to be combinations of MISO processes). A nonlinear MISO process with time delays can be represented by the following NARMAX model

$$\hat{y}(k+1|k) = f[y(k), y(k-1), \ldots, y(k-n);\; u_1(k-d_1), \ldots, u_1(k-d_1-m);\; \ldots;\; u_I(k-d_I), \ldots, u_I(k-d_I-m)] \tag{2}$$

where $u_i(\cdot)$ is the ith process input, $i = 1, 2, \ldots, I$, $y(\cdot)$ is the scalar process output, $\hat{y}(k+1|k)$ is the scalar process output at time step $k+1$ predicted by the model on the basis of the process input and output information up to $k$, and $d_i$ represents the pure time delay from the ith input to the process output. For simplicity and without loss of generality, we use $m$ and $n$ as the lengths of the windows for all the inputs and outputs, respectively.
By shifting the time back by successive time steps, model 2 can describe both input and output time delays for a nonlinear process; therefore, $d_i$ represents the total time delay from the ith process input to the process output. Since an IRN model can represent a nonlinear dynamic system to an arbitrary accuracy, the above NARMAX model for a process with time delays can also be represented by an IRN in the nonlinear state space form according to eq 1

$$x(k+1) = \sigma\left[W^{R} x(k) + \sum_{i=1}^{I} W_i^{I} u_i(k-d_i) + W^{Ibias}\right]$$

$$\hat{y}(k+1) = W^{O} x(k+1) + W^{Obias} \tag{3}$$

where for a MISO model $W^{O}$ is now a vector of weights and $W^{Obias}$ is a scalar weight.
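The state update and output equation of an IRN can be illustrated with a short sketch. The code below is only an illustration under stated assumptions, not the implementation used in this work: the Gaussian activation is assumed to have the form exp(-z^2), the function and argument names (`gaussian`, `irn_step`, `irn_predict`, `b_I`, `b_O`) are hypothetical, and the weights are supplied by the caller rather than trained.

```python
import math

def gaussian(z):
    # Assumed Gaussian activation of a scalar pre-activation: exp(-z^2).
    return math.exp(-z * z)

def irn_step(x, u, W_R, W_I, b_I, W_O, b_O):
    """One time step of a single-output IRN.

    x   : hidden states x(k) (list of floats)
    u   : process inputs fed to the net at step k (list of floats)
    W_R : recurrent weights, W_R[j][i] from context node i to hidden node j
    W_I : input weights, W_I[j][i] from input i to hidden node j
    b_I : hidden-layer bias weights
    W_O : output weights (one per hidden node), b_O: scalar output bias
    Returns (x(k+1), y_hat(k+1)).
    """
    n_hidden = len(x)
    x_next = []
    for j in range(n_hidden):
        z = b_I[j]
        z += sum(W_R[j][i] * x[i] for i in range(n_hidden))
        z += sum(W_I[j][i] * u[i] for i in range(len(u)))
        x_next.append(gaussian(z))
    # Linear output node acting on the updated hidden states.
    y_next = b_O + sum(W_O[j] * x_next[j] for j in range(n_hidden))
    return x_next, y_next

def irn_predict(inputs, x0, W_R, W_I, b_I, W_O, b_O):
    # Run the net over an input sequence; only the current input is
    # supplied at each step, the hidden states carry the past.
    x, preds = list(x0), []
    for u in inputs:
        x, y = irn_step(x, u, W_R, W_I, b_I, W_O, b_O)
        preds.append(y)
    return preds
```

To use delayed inputs as in eq 3, the caller would pass $u_i(k-d_i)$ in place of $u_i(k)$ for each input; the update itself is unchanged.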

Although, in principle, we should use inputs with time delays $d_i$ instead of the current process inputs $u_i(k)$ as the inputs to an IRN that would be used to represent a process with time delays, we can just as well use current process inputs because the states of an IRN (the outputs of hidden nodes) can store the information about the past process inputs. However, the IRN will be less complex and use fewer hidden nodes if the inputs at the correct dead time, $d_i$, are used instead.

When "training" an IRN model (estimating the weights) to identify an actual process, we use the observed process outputs at $k+1$ as the desired IRN training targets. The node outputs from the net are adjusted to match the targets for a given set of process input measurements at time $(k - d_i)$. For a process without time delays, this time is just $k$. Thus, the training problem is how to adjust the weights $W^{I}$, $W^{R}$, and $W^{O}$ and the bias weights by least squares such that the corresponding output of an IRN is as close as possible to the actual process output at time step $k+1$. The training of all the IRN models in our simulation was accomplished by using the nonlinear programming code NPSOL (Gill et al., 1986) in a batch fashion.

An IRN model used to represent a process is not unique as to either the structure or the values of the weights (coefficients). Because no good way exists to determine the number of hidden nodes to use in an IRN model, we chose the number by trial and error so as to reduce the mean square error adequately. Usually, one trains and then tests sequentially until the mean square error on test data (by cross validation) starts to rise. We found, on the basis of a large number of simulations, that for our training method Gaussian activation functions worked better than sigmoidal activation functions in achieving lower training error and faster training.


3. Modeling Time Delays Using an IRN

The time delays considered in what follows can be constant or variable. For variable time delays, any delay must have some relationship with the cause of the delay that in turn is related to a measured or calculated variable. If a time-varying delay does not have any relation with a process input variable, or is caused by some unmeasured process disturbance, then an IRN model will not be able to model the delay well unless trained with typical disturbances.

3.1. Modeling a Process with Time Delay Using an Ordinary IRN. Properly designed discrete recurrent neural network models inherently accommodate some delays because of the function of the context nodes illustrated in Figure 1. As indicated by model 1, an IRN model is nothing more than a state space representation of a nonlinear dynamic system. The outputs of hidden nodes are the states of an IRN model, and just like a linear state space model, the states include information about past process inputs. This is the reason that an IRN model needs only the current process input information rather than an input window, as in the case of a NARMAX model, to predict the process output at the next time step. Consequently, model 1 can approximate a process with time delays as represented by model 2 if model 1 can store the information about past process inputs long enough to include all the information about $u_1(k-d_1), \ldots, u_1(k-d_1-m); \ldots; u_I(k-d_I), \ldots, u_I(k-d_I-m)$. The question is as follows: For how long can an IRN model store information about past process inputs? Obviously, the more hidden nodes in the IRN, the longer the IRN can store such information (Macmurray, 1993).

Table 1. Parameters of a pH Neutralization Process

parameter    value
A            207 cm^2
C_v          8.75 cm^(5/2) s^-1
pK_1         6.35
pK_2         10.25
W_a1         3 × 10^-3 M
W_a2         -3 × 10^-2 M
W_a3         -3.05 × 10^(…) M
W_b1         0
W_b2         3 × 10^(…) M
W_b3         5 × 10^(…) M
q_1          16.6 cm^3 s^-1
q_2          0.55 cm^3 s^-1
q_3          15.6 cm^3 s^-1
h            14.0 cm
pH_4         7.0

Figure 2. pH neutralization process.

Thus, we can use an IRN to model a process with time delays as long as the IRN has a sufficient number of hidden nodes to capture both the time delays and the nonlinear dynamics of a process. If the process time delays are not very long, no matter what they might be, fixed or variable, we can use an IRN of reasonable size to model the process directly from process input and output data.

3.2. Modeling a Process with Time Delays Using an IRN with a Past Input Window. Time delays caused by transport through process equipment, piping, and so on can lead to long time delays and, equally important, to variable time delays caused by time-varying flow rates. For such circumstances, the common type of IRN model would have to include a large number of hidden nodes to guarantee that the IRN could represent both the time delays and the nonlinear dynamics of a process. But using a large number of hidden nodes in the net may lead to deterioration of the performance of an IRN model. Also, as the number of weights increases, so must the training time for a net. One way to accommodate long time delays, if the values of the true process time delays are known, is to use model 3, i.e., to introduce to the IRN all the input signals with correct time delays instead of $u(k)$ in training and prediction. In such a case, the number of hidden nodes of an IRN only has to satisfy the requirement of representing the nonlinear dynamics of a process and can be much smaller than the number of nodes needed to satisfy the requirement of modeling both the nonlinear dynamics and the time delays even if a process time delay is very large. However, in practice, the true values of the process time delays may not be known, and it is quite possible for the values of the true process time delays to vary in a large range, depending on some process variable such as flow rate.

To deal with this situation, we propose using a window of past process inputs as the input to an IRN to model a process with unknown time delays. Suppose that the true process time delay from the ith input to the output is $d_i$ and that the value is unknown but that the possible maximum and minimum value of each delay is known (which is a reasonable assumption for most practical processes) or can be estimated. Let $d^{i}_{\mathrm{MAX}}$ and $d^{i}_{\mathrm{MIN}}$ represent the possible maximum and minimum time delays for the ith input so that the true process time delays satisfy

$$d^{i}_{\mathrm{MAX}} \ge d_i \ge d^{i}_{\mathrm{MIN}} \quad (\forall\, i = 1, \ldots, I) \tag{4}$$

If we do not know the true value of a time delay $d_i$, we cannot use model 3 to identify a process with unknown time delays. But we can use a window of past input data $u_i(k-d^{i}_{\mathrm{MAX}}), \ldots, u_i(k-d^{i}_{\mathrm{MIN}})$ ($i = 1, 2, \ldots, I$) as the input to an IRN at time $k$ so as to include the inputs with the correct time delays. Thus, we can use the following model to identify a process with unknown time delays.

$$x(k+1) = \sigma\left[W^{R} x(k) + \sum_{i=1}^{I} \sum_{j=d^{i}_{\mathrm{MIN}}}^{d^{i}_{\mathrm{MAX}}} W_{ij}^{I} u_i(k-j) + W^{Ibias}\right]$$

$$\hat{y}(k+1) = W^{O} x(k+1) + W^{Obias} \tag{5}$$

Because a neural net has the asymptotic property of being a universal approximator of a nonlinear function, model 5 automatically weights the process inputs with the correct time delays more heavily and tends to diminish the effect of an input with an incorrect time delay in the input window during training. Hence, model 5 can be used to represent nonlinear process dynamics and at the same time take into account the process time delays. The number of past inputs used in the IRN model is determined by the estimation of the possible maximum and minimum time delays. Note that the time delays must not be smaller than $d^{i}_{\mathrm{MIN}}$ for $i = 1, \ldots, I$. If $d^{i}_{\mathrm{MIN}}$ is larger than the corresponding time delay, you are erroneously trying to predict $y(k)$ using $u_i(k-d^{i}_{\mathrm{MIN}}), \ldots, u_i(k-d^{i}_{\mathrm{MAX}})$ because $y(k)$ actually depends on $u_i(k-d^{i}_{\mathrm{MIN}}+s)$ ($0 < s \le d^{i}_{\mathrm{MIN}}$). If you cannot estimate $d^{i}_{\mathrm{MIN}}$, the recommended procedure is to set $d^{i}_{\mathrm{MIN}} = 0$ at the expense of incorporating an excess number of inputs (and hence weights) in the net.

To sum up, if we choose an IRN model that uses a window of past process inputs, the number of hidden nodes in the IRN model only has to be large enough to account for the dynamics of a nonlinear system. Hence, by using an input window, we can obtain an IRN model of reasonable size that is better able to model large time delays. Furthermore, not every past input at all the measurement times has to be included in the window of past measurements since an ordinary IRN has some capability to store past input information. Hence, the number of input nodes to an IRN can be less than the full number of sampling intervals in a window. For example, you can delete from the input window all the data except that taken at every third time step and use the abbreviated sample rather than the whole process input window as the input to an IRN. Also, the estimated $d^{i}_{\mathrm{MAX}}$ can be less than the largest possible time delay as long as the difference is not too large.
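As an illustration of the abbreviated input window described above, the following sketch assembles the IRN input vector at time step k from selected lags of each process input. It is a minimal example under stated assumptions: the function name `input_window`, the explicit `lags` argument, and the choice to hold the first sample when a lag reaches before the start of the record are all hypothetical and are not taken from the simulations reported here.

```python
def input_window(u_hist, k, lags):
    """Build the IRN input vector at time step k from delayed inputs.

    u_hist : u_hist[t] is the list of process inputs measured at step t
    k      : current time step
    lags   : delays (in time steps) to include, e.g. (0, 3, 6) keeps
             every third sample of a window that starts at step k
    Returns a flat list [u_1(k-lag), u_2(k-lag), ...] for each lag in order.
    """
    window = []
    for lag in lags:
        # Assumption: before the record starts, reuse the first sample.
        t = max(k - lag, 0)
        window.extend(u_hist[t])
    return window


# Two process inputs sampled over 10 time steps (made-up numbers).
u_hist = [[float(t), 10.0 * t] for t in range(10)]
print(input_window(u_hist, 9, (0, 3, 6)))
```

With two process inputs and three lags, the resulting vector has six entries, one per IRN input node.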
Use of an input window to an IRN involves using more process input information than needed for modeling a process. This redundancy of information may have an adverse impact on the prediction performance of an IRN if the process input signal contains significant noise (spikelike noise) or step changes of large magnitude. To reduce the noise in the inputs and thus to improve the prediction performance of an IRN with an input window, you can use a first-order filter to preprocess the input data fed to the IRN for both training and prediction, i.e., you can calculate $\bar{u}_i(\cdot)$ by the following equation as a substitute for $u_i(\cdot)$ in model 5

$$\bar{u}_i(k) = \alpha_i u_i(k) + (1.0 - \alpha_i)\,\bar{u}_i(k-1) \tag{6}$$

where $\alpha_i$ is a filter coefficient between 0.0 and 1.0 for the ith process input. If the noise level in the input signal is not very high or no large step change in the input signal occurs, there is no need to use such a prefilter.

Many investigators have used ordinary feedforward networks with process inputs and outputs taken from a moving window to model a nonlinear dynamic process. If such networks are used to model a process with large unknown time delays and the difference between the maximum time delay and minimum time delay is large, the number of input nodes needed to represent the process well would become quite large. Thus, the advantage of using an IRN is to get a parsimonious model and, implicitly, a better representation of the process.

Figure 3. Data for training the IRN models (abscissa: time step k, 0 to 2500; 1 time step = 10 seconds).
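The first-order prefilter of eq 6 can be sketched in a few lines. This is an illustrative example only; the function name `prefilter` and the initialization of the filter state with the first sample (unless `u0` is given) are assumptions, since the initial condition of the filter is not specified in the text.

```python
def prefilter(u, alpha, u0=None):
    """First-order filter: u_bar(k) = alpha*u(k) + (1 - alpha)*u_bar(k-1).

    u     : sequence of raw measurements of one process input
    alpha : filter coefficient between 0.0 and 1.0
    u0    : initial filter state; assumed equal to u[0] when omitted
    """
    filtered = []
    prev = u[0] if u0 is None else u0
    for value in u:
        prev = alpha * value + (1.0 - alpha) * prev
        filtered.append(prev)
    return filtered


# A unit step is smoothed; alpha = 1.0 would return the input unchanged.
print(prefilter([0.0, 1.0, 1.0], 0.5))
```

A smaller coefficient gives heavier smoothing at the cost of a slower response to genuine step changes, which is the trade-off behind applying the same filter in both training and prediction.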

Figure 4. Prediction of pH by the IRN with 4 hidden nodes for the test data (process output and IRN prediction vs time step k; 1 time step = 10 seconds).

Figure 5. Prediction of pH by the IRN with 10 hidden nodes for the test data (process output and IRN prediction vs time step k; 1 time step = 10 seconds).

4. Identification of a pH Neutralization Process with Unknown Dead Time

We selected a pH neutralization process to demonstrate the performance of IRN models in identifying a nonlinear dynamic process with time delays caused by transportation lag.

4.1. Process Description. Figure 2 shows the schematic diagram of the pH neutralization process we used in our simulations in which acid, buffer, and base streams are mixed in a tank and the effluent pH is measured. The mathematical model of the pH neutralization process was taken from Nahas (1992) and consists of three nonlinear differential equations plus a nonlinear output equation:

$$\dot{h} = \frac{1}{A}\left(q_1 + q_2 + q_3 - C_v\sqrt{h}\right)$$

$$\vdots$$

$$W_{a4} + 10^{pH_4 - 14} + \cdots \tag{7}$$

where $h$ is the liquid level, $W_{a4}$ and $W_{b4}$ are the reaction invariants of the effluent stream, and $q_1$, $q_2$, and $q_3$ are the acid, buffer, and base flow rates, respectively. The pH measurement $pH_{4m}$ is assumed to be available with a delay $d$ from the actual process pH value, $pH_4$, i.e.,

$$pH_{4m}(t) = pH_4(t - d) \tag{8}$$

where $d$ is the delay caused by the transportation lag of effluent flow $q_4$. The parameters, nominal operating condition, and initial conditions of the process are listed in Table 1. In our simulation, we treated the pH process as having two inputs (a manipulated variable, the base flow rate $q_3$, and a measured disturbance, the buffer flow rate $q_2$) and one output (the measured pH value of the effluent, $pH_{4m}$). The delay $d$ caused by the transportation lag in the output pipeline can be assumed to be proportional to the reciprocal of the process output flow rate $q_4$, i.e.,

$$d = \frac{C}{q_4} \tag{9}$$

where $C$ is a coefficient associated with the distance between the outlet of the tank and the point on the output pipe at which the measurement is made. The flow rate $q_4$ is determined by the height of the liquid in the tank, which is

$$q_4 = C_v\sqrt{h} \tag{10}$$

where $C_v$ is the orifice coefficient. From the pH process, eq 7, we know that $h$ varies with time and depends nonlinearly on the process input variables $q_2$ and $q_3$. Hence, in what follows, we assume that the process time delay is a variable depending on $q_2$ and $q_3$, and it is such a relationship between the time delay and process input variables that makes it possible to use an IRN to model a process with variable time delay. Although only one process time delay $d$ (the process output delay) exists in the process, we need to treat the delay as two input delays $d_1$ and $d_2$ since the process output time delay is equivalent to delays in both of the input-output channels of the process.

Figure 6. Prediction of pH by the IRN with a sampled input window for the test data (process output and IRN prediction vs time step k; 1 time step = 10 seconds).

Figure 7. Prediction of pH by the IRN with a sampled input window and a prefilter for the test data (process output and IRN prediction vs time step k; 1 time step = 10 seconds).

To generate the pseudoprocess data for the simulations, we assumed that the range for $d$ was between 2 and 9 time steps, meaning that the highest liquid level in the tank would cause a 2 time step output transportation lag and the lowest liquid level in the tank would cause a lag of 9 time steps in the output. Simulated process input measurements were generated by introducing step changes in the two process input variables, and pseudoprocess output measurements were obtained by integrating the process model and delaying the resulting process output an appropriate number of time steps governed by the liquid level $h$ in the tank at time $k$. A sampling period of 10 s was chosen in our simulations. All three measurements were corrupted by zero mean Gaussian noise with standard deviations of about 5% of the ranges for the three variables. In Figure 3, the top three panels show the typical process input and output data used for training the IRN models for the different cases discussed below, and the bottom panel shows the process output time delays at each time step (altogether 2500 process input and output data points were generated). From Figure 3, you can see that the process output delays were complex functions of the process inputs $q_2$ and $q_3$.
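The way a level-dependent transportation lag can be imposed on a simulated output may be sketched as follows. This is a hedged illustration rather than the procedure actually used to generate the data: the value of the coefficient `C`, the rounding of the delay to a whole number of sampling periods, and the clamping to the 2 to 9 time step range are assumptions, and the names `delay_steps` and `delay_output` are hypothetical. The orifice coefficient and sampling period are the values given in the text.

```python
import math

def delay_steps(h, C, C_v=8.75, T_s=10.0, d_lo=2, d_hi=9):
    """Transportation lag in whole time steps for liquid level h (cm).

    q4 = C_v * sqrt(h) is the outlet flow rate and d = C / q4 the delay
    in seconds; C is a hypothetical pipe coefficient supplied by the caller.
    """
    q4 = C_v * math.sqrt(h)
    d_seconds = C / q4
    steps = int(round(d_seconds / T_s))
    # Assumption: keep the delay inside the stated 2 to 9 step range.
    return min(max(steps, d_lo), d_hi)

def delay_output(ph, h, C):
    """Measured pH at step k is the true pH from d(k) steps earlier,
    with d(k) set by the liquid level at step k."""
    measured = []
    for k in range(len(ph)):
        d = delay_steps(h[k], C)
        # Assumption: hold the first sample before the record starts.
        measured.append(ph[max(k - d, 0)])
    return measured


# Constant nominal level of 14 cm and an illustrative C of 1600 cm^3.
print(delay_steps(14.0, 1600.0))
```

Because the delay is inversely proportional to the square root of the level, a high level gives a short lag and a low level a long one, which matches the direction of the 2 and 9 time step limits described above.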

Figure 8. Prediction of pH by the IRN with a sampled input window and a prefilter for the case of a constant time delay (process output and IRN prediction vs time step k; 1 time step = 10 seconds).

Table 2. Scaled Prediction MSEs for the Test Data Set and the Number of Coefficients in the IRN Model for the Five Cases

case    scaled MSE             number of parameters in the model
1       2.329 35 × 10^(…)      33
2       1.715 25 × 10^(…)      141
3       3.712 98 × 10^(…)      49
4       1.402 80 × 10^(…)      49
5       9.244 07 × 10^(…)      49

Another 1000 input and output data points were generated in the same way (except that no noise was added) to serve as the test data set.

4.2. Identification of the Process with Variable Dead Time Using an IRN Directly. In our simulations, IRN models such as shown in Figure 1 and given by eq 1 were trained as 1 time step ahead predictors using the pseudoprocess input and output measurements with the variable time delays shown in Figure 3; i.e., given process input measurements at time step k as IRN model inputs, the process output measurements at k + 1 were used as the training targets for the IRN models. Two cases were considered.

Case 1. An IRN model with 2 input nodes, 4 hidden nodes, and 1 output node (resulting in a total of 33 coefficients) was trained. The pH output of the process and the corresponding IRN predictions for the test data are shown in Figure 4.

Case 2. An IRN model with 2 input nodes, 10 hidden nodes, and 1 output node (resulting in a total of 141 coefficients) was trained. The pH output of the process and the corresponding IRN predictions for the test data are shown in Figure 5.

You can see from Figure 4 that the IRN with 4 hidden nodes could not model the pH process with variable time delays very well. However, the IRN model with 10 hidden nodes could model the pH process with maximum delays of 9 time steps fairly well. If the time delays had been larger, we would have had to use more hidden nodes to model the time delays.

4.3. Identification of the Process Using an IRN with an Input Window and an Abbreviated Data Sample. In the simulations discussed next, we assumed that the estimate of the largest possible time delay was 9 time steps and the smallest possible time delay was 0 time steps so that we used a past input window for the two process inputs from time step k - 9 to the current time step k. However, since an IRN model has certain inherent capabilities in representing a process with small time delays, we did not use all of the measurements in the window. Instead, we used as a sample every third measurement as the input to the IRN. Consequently, we used an IRN with 6 input nodes (corresponding to the two input signals at time steps k, k - 3, and k - 6), 4 hidden nodes, and 1 output node (representing the delayed process output $pH_{4m}$). The number of coefficients in the IRN model was 49.

Case 3. An IRN model without a prefilter was trained using the process input and output data shown in Figure 3. The results of the prediction of the pH by the IRN model for the test data are shown in Figure 6.

Case 4. Another IRN model with a prefilter (α = 0.5) was trained on the same data shown in Figure 3. The results of the prediction of the pH by the IRN model for the test data are shown in Figure 7.

From Figure 6, you can see that the IRN model using an input window of 9 time steps with a subset of the data in the window satisfactorily modeled the nonlinear dynamics of the pH neutralization process that had pure time delays between 2 and 9 time steps. You can also see from Figure 7 that the use of the prefilter improved the prediction performance somewhat further.

4.4. Identification of the Process with a Constant Time Delay. In some practical situations, process time delays may be constant. As an example of using the proposed approach to identify such a process, another simulation was carried out. In the simulation, the process input and output data used for training the IRN model were similar to those shown in Figure 3 except that the time delay d was a constant of 7 time steps. The test data set was also similar to that used in cases 1-4 except that a fixed time delay of 7 time steps was employed.
We assumed that the possible maximum time delay was 9 time steps and the minimum time delay was 0.

Case 5. An IRN model with 6 input nodes (corresponding to the two input signals at time steps k, k - 3, and k - 6 in the window from k - 9 to k), 4 hidden nodes, and 1 output node was trained. A prefilter (α = 0.5) was also used. The test results are shown in Figure 8.
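The coefficient counts quoted for the five cases follow directly from the structure of the IRN (input weights, recurrent weights, hidden bias weights, output weights, and output bias weights). The short sketch below reproduces that arithmetic; the function name `irn_coefficient_count` is hypothetical and is introduced only for this illustration.

```python
def irn_coefficient_count(n_inputs, n_hidden, n_outputs=1):
    """Number of weights in a three-layer IRN with bias nodes on the
    input and hidden layers and full feedback among the hidden nodes."""
    input_weights = n_hidden * n_inputs       # input layer to hidden layer
    recurrent_weights = n_hidden * n_hidden   # context nodes to hidden layer
    hidden_bias = n_hidden                    # bias weights into hidden layer
    output_weights = n_outputs * n_hidden     # hidden layer to output layer
    output_bias = n_outputs                   # bias weights into output layer
    return (input_weights + recurrent_weights + hidden_bias
            + output_weights + output_bias)


# Cases 1, 2, and 3-5: (inputs, hidden nodes) = (2, 4), (2, 10), (6, 4).
for n_in, n_hid in [(2, 4), (2, 10), (6, 4)]:
    print(n_in, n_hid, irn_coefficient_count(n_in, n_hid))
```

The recurrent term grows with the square of the number of hidden nodes, which is why adding input nodes through a window is cheaper in coefficients than adding hidden nodes.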


From Figure 8, you can see that the IRN model using a sampled input window and a prefilter modeled the nonlinear dynamic properties of the process with a constant time delay of 7 time steps well.

4.5. Discussion of Results. Table 2 lists the scaled MSE (mean square error) for the five cases discussed above. The MSE was calculated by summing the squares of the deviations between the output of the net and the test data, each respectively scaled to the range 0 to 1, and dividing by the number of data points in the test set. Table 2 shows that the most parsimonious model, case 1, also had the highest MSE. Adding more hidden nodes improved the prediction of the pH. However, using a window of selected past inputs reduced the MSE from that of case 1 with a smaller number of parameters than were used in case 2. Filtering the inputs improved the MSE even further. Case 5, the one for a constant delay, a special case of variable delay, appeared to be easy to treat.

We believe that IRN models for modeling delays would be considerably less complex, and involve far fewer coefficients, than ordinary feedforward nets with a window of past inputs and outputs. The feedback of information ameliorates the need for a large input data set, which in turn requires more hidden nodes, and perhaps more than one hidden layer in the net. To model with any type of net, keep in mind that the variable delay must be related directly or indirectly to measured variables.

5. Conclusions

How to handle delays, and in particular time-varying delays, is a significant problem in modeling nonlinear processes. Internal recurrent neural networks (IRNs) are nonparametric models that can represent nonlinear dynamic processes with small dead times well by parsimonious models and at the same time can accommodate large dead times either by increasing the number of hidden nodes or by adding a window of delayed coordinates of the inputs to the IRN.
An important advantage of using an IRN model with or without a past process input window is that it is easy and straightforward to apply such a model to multi-input and multi-output nonlinear processes. Both SISO and MIMO processes with time-varying delays can be represented by such nets as long as the varying delay is related to observed variables. The simulations presented for a pH neutralization process demonstrate how an IRN could be applied in practice.

Nomenclature

$A$ = pH neutralization tank area
$C_v$ = valve coefficient
$d$ = process pure time delay
$f[\cdot]$ = nonlinear function and nonlinear function vector
$h$ = liquid level in pH neutralization tank
$k$ = time step for a variable
$I$ = number of inputs for a multivariable process
$m$ = length of process input window for NARMAX model
$n$ = length of process output window for NARMAX model
$N_H$ = number of hidden nodes of an IRN
$N_I$ = number of input nodes of an IRN
$pK$ = log of equilibrium constant for pH neutralization process
$pH_4$ = effluent pH from the neutralization tank
$pH_{4m}$ = measured effluent pH from the neutralization tank
$q$ = flow rate
$W_a$ = reaction invariant a
$W_b$ = reaction invariant b
$W^{I}$, $W^{R}$, $W^{O}$, $W^{Ibias}$, $W^{Obias}$ = weights of an IRN
$x(k)$ = output of hidden nodes of an IRN
$y(k)$ = measured process output variables
$\hat{y}(k+1)$, $\hat{y}(k+1|k)$ = predicted process output variables

Greek Letters

$\sigma(\cdot)$ = activation function of hidden nodes of an IRN

Literature Cited

Chen, S.; Billings, S.; Cowan, C.; Grant, P. Practical Identification of NARMAX Models Using Radial Basis Functions. Int. J. Control 1990, 52, 1327-1350.
Elman, J. L. Finding Structure in Time. Cognit. Sci. 1990, 14, 179-211.
Gill, P. E.; Murray, W.; Saunders, M. A.; Wright, M. H. User's Guide for SOL/NPSOL: A Fortran Package for Nonlinear Programming; Technical Report SOL 86-2; Systems Optimization Laboratory, Department of Operations Research, Stanford University: Stanford, CA, 1986.
Karjala, T.; Himmelblau, D. M. Dynamic Data Reconciliation by Recurrent Neural Networks vs. Traditional Methods. AIChE J. 1994, 40, 1865-1875.
Macmurray, J. Modeling and Control of a Packed Distillation Column Using Artificial Neural Networks. M.S. Thesis, The University of Texas at Austin, 1993.
Nahas, E.; Henson, M. A.; Seborg, D. Nonlinear Internal Model Control Strategy for Neural Network Models. Comput. Chem. Eng. 1992, 16, 1039-1057.
Williams, R. J. Adaptive State Representation and Estimation Using Recurrent Connectionist Networks. In Neural Networks for Control; Miller, W. T., Sutton, R. S., Werbos, P. J., Eds.; The MIT Press: Cambridge, MA, 1990; pp 97-114.
You, Y.; Nikolaou, M. Dynamic Process Modeling with Recurrent Neural Networks. AIChE J. 1993, 39, 1654-1666.

Received for review August 19, 1994
Revised manuscript received January 30, 1995
Accepted February 9, 1995

IE940501R

Abstract published in Advance ACS Abstracts, March 15, 1995.