VolterraLaguerre Models for Nonlinear Process Identification with

function expansion of the Volterra kernels and orthogonal regression analysis for the ... applied to the identification of a 3 × 3 third-order nonlin...
0 downloads 0 Views 140KB Size
340

Ind. Eng. Chem. Res. 2004, 43, 340-348

Volterra-Laguerre Models for Nonlinear Process Identification with Application to a Fluid Catalytic Cracking Unit Qingsheng Zheng and Evanghelos Zafiriou* Department of Chemical Engineering, University of Maryland, College Park, Maryland 20742

Volterra series models are attractive for use in model-based control of nonlinear processes because they are direct extensions of linear impulse response models commonly used in process control. However, a limitation in their use is the fact that higher than second-order nonlinearities and/ or multi-input multi-output Volterra models involve very large numbers of parameters. Here we address the problem with a parameter reduction method that utilizes a Laguerre basis function expansion of the Volterra kernels and orthogonal regression analysis for the determination of the dominating terms in the model. The conditions under which a nonlinear system can be approximated by a Volterra-Laguerre model are investigated. The technique is then applied to the identification of a 3 × 3 third-order nonlinear model for a simulated model IV fluid catalytic cracking unit. 1. Introduction The performance of most advanced process control systems depends heavily on the availability of adequate process models. Multivariable interactions and nonlinearities are common characteristics in chemical process systems. During the past several years, there has been a significant increase in the number of nonlinear process control techniques that are based on Volterra series models,1 which are linear-in-parameters models that can be obtained by identification from input/ output (I/O) data. A nonlinear model-based control methodology for systems modeled by the Volterra series was developed by Doyle et al.2 Genceli and Nikolaou3 used the Volterra series with constrained model predictive control. Zheng and Zafiriou4 developed a local small gain theorem for stability analysis of feedback Volterra systems. The main criticism in using the Volterra series as nonlinear models lies in the large number of parameters needed to represent the kernels. For instance, a fourthorder Volterra series may need hundreds of thousands of parameters. When these kernels need to be estimated from I/O data, this becomes an unrealistic identification problem. Indeed, process control applications utilizing the Volterra series have usually been limited to secondorder models.3,5,6 To tackle this problem, Kurth and Rake7 proposed a procedure to reduce this number by using orthonormal basis functions (distorted sine functions) to approximate the kernels first and then to use a significant term selection algorithm to eliminate those terms that make relatively little contribution to the model accuracy. The use of orthonormal basis functions to approximate Volterra kernels has long been known in Volterra series identification (e.g., see ref 8), but the combination with an algorithm for selecting the dominating terms was novel. However, their discussion was based on Hammerstein and Wiener systems, and the classification of the systems that can be identified with this procedure was not considered possible. * To whom correspondence should be addressed. Tel.: (301)405-6625. Fax: (301)314-9126. E-mail: [email protected].

In this paper, the procedure used by Kurth and Rake7 is extended to so-called Volterra-Laguerre models to make it more suitable for chemical process control purposes. Dumont et al.9 and Zheng and Zafiriou10 suggested the use of such models for process control. Parker and Doyle11 used a second-order VolterraLaguerre model for the control of a bioreactor, and Parker12 further examined the utilization of such models in model predictive control. Here we derive conditions under which a nonlinear system can be approximated by a Volterra-Laguerre model to any precision. The Volterra-Laguerre model is then rearranged into a form suitable for identification, and a vector orthogonalization algorithm, proposed by Korenberg et al.13 for a nonlinear ARMAX model, is adopted to construct an orthogonal regression analysis procedure. This allows the determination of the dominating terms in the Volterra-Laguerre model. The Volterra series and the Volterra-Laguerre model are then extended to cover the multi-input multi-output (MIMO) case. The impact of the input excitation signals on the model identifiability is also discussed. Finally, as an application example, we use the technique to identify a 3 × 3 nonlinear model for a simulated fluid catalytic cracking unit (FCCU). For this purpose, a model IV FCCU dynamic simulator, based on the models developed by McFarlane et al.,14 is used to generate I/O data, thus emulating the actual plant. The selected manipulated variables include fresh feed flow rate, slurry recycle flow rate, and lift air flow rate, and the controlled variables include regenerator bed temperature, riser temperature, and stack gas carbon monoxide concentration. The use of the resulting structure of the identified model to draw information on process characteristics is also discussed.

2. Volterra-Laguerre Models A Volterra series is a nonlinear I/O model. A comprehensive discussion on the Volterra series can be found in work by Schetzen8 and Rugh.15 A discrete-time Volterra series for a single-input single-output (SISO)

10.1021/ie021064g CCC: $27.50 © 2004 American Chemical Society Published on Web 09/26/2003

Ind. Eng. Chem. Res., Vol. 43, No. 2, 2004 341

where the sequence {[x1-a2/(z - a)][(1 - az)/(z a)]k-1} is a Laguerre series. Denote

process has the form N

y(k) )

m

m

∑ ∑ i∑)0hn(i1,...,in) u(k-i1) ... u(k-in) ‚‚‚

n)1i1)0

(1)

where hn(i1,...,in) is a Volterra kernel, u is an input sequence, y is an output sequence, N is the order of the Volterra series, and m is the memory (0 e m e +∞). As a matter of convenience, one can always assume a kernel to be symmetric about its indices because any kernel can be put into symmetric form.15 When compared to a Taylor series, a Volterra series is a polynomial expansion to a nonlinear system when its input and output are time series, while a Taylor series is a polynomial expansion to a nonlinear function when its input and output are simple variables. The first-order Volterra series (N ) 1) is simply the popular process control linear impulse response model. A Volterra series can be used to describe a large class of nonlinear dynamic systems. For instance, it was shown by Boyd and Chua16 that a fading memory system can be approximated by eq 1 to within arbitrary accuracy. Compared to other types of empirical nonlinear models, a Volterra series has several attractive properties: (i) A Volterra series representation has a clear nonlinearity structure, which is a natural extension to the linear impulse response models with which process control engineers are familiar. (ii) A Volterra series is a linear-in-parameters model, which makes the parameter estimation mathematically no more difficult than obtaining a linear impulse response model from I/O data. (iii) If an open-loop stable process is investigated, the model is always stable, even in the case when the admissible input range is violated or unmeasured disturbances occur. These properties are important to nonlinear model predictive control. The major obstacle in using a Volterra series model is the large number of parameters needed for the kernels. For instance, if an Nth-order Volterra series with memory m is used, the parameter number in the kernels will be on the order of mN. Therefore, applications using higher than second-order Volterra series are rarely seen in the literature. As orthogonal basis functions, Laguerre functions are widely used in linear system identification.17,18 The basic properties of Laguerre functions in the perspective of linear system identification can be summarized as follows. Consider a linear system k

y ) H(z) u

or

y(k) )

h(i) u(k-i) ∑ i)0

(2)

Assume the z-transfer function H(z) to be strictly proper [h(0) ) 0], analytic in |z| > 1, and continuous in |z| g 1. Let the time factor |a| < 1. Then there exists a sequence {θk} such that

H(z) )

Lk(i) ) Z-1

n

(

)

∞ x1 - a2 1 - az k-1 θk ∑ z-a z-a k)1

[x

]

1 - a2 1 - az k-1 z-a z-a

(

)

(3)

The orthonormality property reads ∞

Lk (i) Lk (i) ) 0, ∑ i)0 1

2

∀ k1 * k2



Lk (i) Lk (i) ) 1, ∑ i)0 1

2

∀ k1 ) k2

Denote ∞

lp ) {f: [0, +∞) f R|

|f(i)|p < +∞}, ∑ i)0

1 e p < +∞ (4)

l∞ ) {f: [0, +∞) f R|sup|f(i)| < +∞} ig0

(5)

Then the Laguerre series forms a complete set on l2. Therefore, for any real functions f(i), i ) 0, 1, ..., ∞ f 2(i) < ∞, we can always satisfying f(0) ) 0 and ∑i)0 expand f as ∞

f(i) )

∑ θkLk(i)

(6)

k)1

where {θk} are real constants determined by ∞

θk )

f(i) Lk(i) ∑ i)0

(7)

To obtain a fast rate of convergence of the partial sums, a should be chosen close to the dominating pole of H(z). Fu and Dumont19 proposed a method for the optimal selection of a for linear systems based on knowledge of the impulse response of the process. If we truncate both h(i) and Lk(i) and denote

[

Θ ) [θ1, θ2, ..., θr]T

L1(0) L (1) B) 1 ... L1(m)

L2(0) L2(1)

... ...

Lr(0) Lr(1)

L2(m)

...

Lr(m)

]

Uk ) [u(k), u(k-1), ..., u(k-m)]

(8)

(9)

(10)

where r is the order of the basis functions, then system (2) can be approximated by the following truncated Laguerre expansion:

y˜ (k) ) UkBΘ

(11)

Now, we extend this result to higher-order kernels. For this purpose, we use the following definitions: Definition 1. A kernel is called separable if q

hn(i1,...,in) )

ν1j(i1) ν2j(i2) ... νnj(in) ∑ j)1

(12)

342

Ind. Eng. Chem. Res., Vol. 43, No. 2, 2004

where q is some finite number and νlj(il) is a single variable real function, l ) 1, ..., n. If each νlj(il) also satisfies





(13)

l

the kernel is called stably separable. The term “stable” comes from the fact that if one takes νlj(il) as the kernel of a linear system, then the system is stable if and only if the kernel satisfies eq 13. Definition 2. A symmetric kernel is called strictly proper if

hn(i1,i2,...,in) ) 0, for i1i2...in ) 0

(14)

Then the following theorem describes the VolerraLaguerre model. Theorem 1. For a given finite-order Volterra system in eq 1, if each kernel is stably separable and strictly proper, then the system has a Volterra-Laguerre description defined by

y(k) )





n

m

∑ ∑ ‚‚‚j∑)1θn(j1,...,jn)∏ ∑ Lj (il) u(k-il) l)1 i )1 l

n)1j1)1

n

(15)

l

where θn can be obtained from ∞



∑ ‚‚‚i∑)1hn(i1,...,in) Lk (i1) ... Lk (in) i )1

θn(k1,...,kn) )

1

1



∑ ∑ hn(i1,...,in) Lk (i1) ... Lk (in) ) ‚‚‚

∑ |νlj(il)| < ∞ i )0

N

both sides of eq 17 with Lk1(i1) ... Lkn(in) and take summations to obtain

n

n

(16)

Proof. Because hn(i1,...,in) is stably separable, it can be expressed as eq 12. Because it is also strictly proper, without loss of generality, we can write

1

i1)0 in)0 ∞ ∞

∑ j∑)1 ‚‚‚

j1)1

n



{θn(j1,...,jn) [



i1)0

n



Lj1(i1) Lk1(i1)]‚‚‚[

∑ Lj (in)

in)0

n

Lkn(in)]} ) θn(k1,...,kn) where the orthonormality properties of the Laguerre functions are used to obtain the last expression, which is eq 16. Next, we substitute eq 17 into eq 1. Reordering of the summations and grouping of each Laguerre function with the input term that corresponds to the same il yields eq 15. 3. System Classification The previous section provides a sufficient condition in order for Volterra kernels to have Laguerre expansions. This leads to the question, what types of systems possess stably separable kernels? In this section, we will show that two commonly encountered nonlinear systems, fading memory dynamic systems and bilinear systems, do have this property. A formal definition of fading memory can be found in work by Boyd and Chua.16 Roughly speaking, the concept of fading memory states that the impact of remote inputs on the system output will eventually vanish. Lemma 1. If a system has a state-space realization

X(k+1) ) AX(k) + bu(k), X(0) ) 0



∑ |νlj(il)| < ∞

y(k) ) p[X(k)]

il)0

where A is a q × q exponentially stable matrix and p is a polynomial, then the system has a finite Volterra series representation with stably separable and strictly proper kernels. Proof. Because A is exponentially stable, each state variable xi(k), i ) 1, 2, ..., q, has the following solution:

νlj(0) ) 0, ∀ 1 e l e n, 1 e j e q Because l1 ⊂ l2, we have ∞

ν2lj(il) < ∞ ∑ i )0 l



xi(k) )

Therefore, each νlj(il) has a Laguerre expansion

gi(l) u(k-l), ∑ l)0

i ) 1, 2, ..., q



νlj(il) )

∑ alj(k) Lk(il)

k)1

for some constants {alj}. Use of these expansions in eq 12 and collection of the terms for each product of n Laguerre functions result in a Laguerre expansion for the nth-order kernel: ∞

hn(i1,...,in) )





∑ ‚‚‚j∑)1θn(j1,...,jn) Lj (i1) ... Lj (in)

j1)1

1

n

where gi(‚) is a first-order kernel satisfying gi(0) ) 0 and gi ∈ l1. Because p is a polynomial, it consists of a finite number of multiplications and additions of {xi(k)}, exclusively. Consider the multiplication yij(k) ) xi(k) xj(k). It is clear that

(17)

n

where θn values depend only on the constants {alj}. However, they can also be obtained directly from the Volterra kernels and the Laguerre functions. Multiply

yij(k) )



∑ ∑ gi(li) gj(lj) u(k-li) u(k-lj)

li)0lj)0

Therefore, the multiplication has a second-order kernel gij(li,lj) ) gi(li) gj(lj), which is stably separable and strictly proper.

Ind. Eng. Chem. Res., Vol. 43, No. 2, 2004 343

If we consider the addition ya(k) ) yi(k) + yst(k), we can obtain ∞

ya(k) )



In scalar terms,

hn(i1,i2,...,in) )



m1

gs(ls) gt(lt) u(k-ls) ∑ gi(li) u(k-li) + l∑)0l∑ l )0 )0 i

s

ln

∑ ∑ k∑)1j∑)1ak j ...k j ij1 -1...ijn -1 λki ...λki

Therefore, the addition yields a second-order Volterra series with stably separable and strictly proper kernels. Hence, a finite number of multiplications and additions will not change the stably separable and strictly proper properties. This lemma is used in the following theorem to directly show that a fading memory system whose input has no instantaneous effect on the output can be approximated arbitrarily well by a Volterra series that satisfies the conditions of theorem 1 for having a Volterra-Laguerre expansion. Theorem 2. Let H be any time-invariant operator: l∞ f l∞ with fading memory on K ) {u ∈ l∞||u| e d} for some finite number d and PkHu ) PkHPk-1u, where Pk denotes up to time k truncation. Then there is a finiteorder Volterra series H ˜ with stably separable and strictly proper kernels that uniformly approximates H to any precision. Proof. Because H is of fading memory, according to theorems 2 and 3 in work by Boyd and Chua,16 for any given  > 0 there is a finite Volterra series operator H ˜ such that for all u ∈ K

˜ u(k)] e  sup [Hu(k) - H kg0

n

n

n n

1

n

1

n

n

where m1, ..., mn, l1, ..., ln are some finite numbers, ak1j1...knjn are constants, and λk1, ..., λkn are real numbers satisfying |λkj| < 1, j ) 1, 2, ..., n. It is clear that this kernel is stably separable and strictly proper. 4. Identification Formulation In this section, we focus on rearranging the VolterraLaguerre model into a form suitable for identification from I/O data. Consider the Volterra series in eq 1. If we use the first r Laguerre functions L1, L2, ..., Lr to approximate hn, n ) 1, 2, ..., N, then from eq 15 we obtain N

y˜ (k) )

n

r

m

∑ ∑ ‚‚‚θn(j1,...,jn)∏ ∑ Lj (il) u(k-il) n)1j )1 l)1 i )0 l

1

l

(22)

In fact, there are many repeated terms in eq 22. This redundancy can be eliminated by combining the repeated terms and using a lexicographic ordering. This leads to N

y˜ (k) )

n

r

r

∑ ∑ ‚‚‚j )j∑ n)1j )1 1

and H ˜ has a finite dimensional state-space realization

1

1 1

k1)1j1)1

t

n

m

∑ Ljl(il) u(k-il) ∏ l)1 i )0

θn(j1,...,jn)

n-1

l

(23)

Define a reduced Kronecker product15

a[2] ) a X a ) [a1, a2, ..., an][2] ) [a1a1, a1a2, ..., a2a2, a2a3, ..., anan] (24)

X(k+1) ) AX(k) + bu(k) y(k) ) p[X(k)] where A is an exponentially stable matrix and p is a polynomial. The result then follows from lemma 1. We next show that a bilinear system is also described by a Volterra series that satisfies the conditions of theorem 1. Note that a nonlinear system is said to have a bilinear realization if it can be described by

X(k+1) ) A0X(k) + A1X(k) u(k) + bu(k)

(18)

y(k) ) cX(k)

(19)

Theorem 3. If a nonlinear system has an asymptotically stable bilinear realization, then its Volterra kernels are stably separable and strictly proper. Proof. Following section 6.4 in work by Rugh,15 this bilinear system has a finite-order Volterra series inputoutput representation. Moreover, the nth-order Volterra kernel is given by

cAi0n-1A1Ai0n-1-1A1...A1Ai01-1b,

i1, i2, ..., in > 0 (20)

hn(i1,i2,...,in) ) 0, i1i2...in ) 0

mn

...

u(k-lt)

hn(i1,i2,...,in) )

l1

(21)

a[i] ) a[i-1] X a, i g 2

(25)

Then eq 23 is written as

y˜ (k) ) [UkB, (UkB)[2], ..., (UkB)[N]]‚Θ

(26)

where Θ ) [θ1T, ..., θNT]T, with each θn a column vector with the same dimension as (UkB)[n], and B and Uk are defined in eqs 9 and 10, respectively. As such, UkBθ1 corresponds to the first-order Volterra operator, (UkB)[2]θ2 corresponds to the second-order Volterra operator, and so on. When {θn} are known, the corresponding Volterra kernels can be derived based on eqs 15 and 16. Let

Y ) [y(1), y(2), ..., y(D)]T

[

Y ˜ ) [y˜ (1), y˜ (2), ..., y˜ (D)]T U1B (U1B)[2] U2B (U2B)[2] X) ‚ ‚ ULB (ULB)[2]

... (U1B)[N] ... (U2B)[N] ... ‚ ... (ULB)[N]

]

344

Ind. Eng. Chem. Res., Vol. 43, No. 2, 2004

where D is the data length. Then eq 26 can be written as

Y ˜ ) X‚Θ

(27)

which is a linear-in-parameters model. Using the integrated square errors (ISE) as the identification criterion, the parameter vector Θ can be obtained from

˜ )T(Y - Y ˜) Θ* ) arg min(Y - Y Θ

(28)

5. Orthogonal Regression Analysis The advantage in using a Volterra-Laguerre model over the conventional Volterra series model lies in the fact that the number of parameters in a VolterraLaguerre model is independent of the system memory m, and the order of Laguerre functions r is usually smaller than m (compare eq 1 with eq 22). Therefore, the number of parameters to be estimated can be reduced. However, if the system order N is high, the parameter number, which is proportional to rN, will still be very large. In fact, not all basis functions contribute to the model accuracy equally, with some of them making relatively larger contributions than the rest. As such, the number of basis functions (and hence the number of estimated parameters) can be further reduced if we only use dominating basis functions in the model. For a model with a linear-in-parameters structure, there are several possible ways to determine the dominating terms that should be included in the model. Among these are forward regression, backward regression, stepwise regression, and orthogonal regression analysis.20 The orthogonal regression analysis has several advantages over the others: (i) Entering a new model component does not alter the values of the previously estimated parameters. (ii) The decrease in the ISE value due to regression of a model having more components is the sum of the ISE decreases due to regression of the individual components when they are individually included in the model. (iii) As a consequence of the above, the best model components to be considered for possible entry into the model can be deduced by simply calculating the ISE values due to regression of the candidate terms and choosing the one that results in the largest decrease in the ISE value. A widely used orthogonalization algorithm is the Gram-Schmidt procedure. However, as pointed out by Korenberg et al.,13 this procedure is very sensitive to rounding errors. As such, another procedure for orthogonalization was proposed there and is used in our study. Assume that X is full rank in its column vectors (see the next section if X does not satisfy this assumption). Suppose that l - 1 parameters have been included in the model. Accordingly, l - 1 column vectors from X have been selected. Let zj, j ) 1, 2, ..., l - 1, represent these selected and already orthogonalized vectors. Next, a new element from X, say x, will be considered. Define D

Rjl )

∑ k)1

D

x(k) zj(k)/

z2j (k), ∑ k)1

j ) 1, 2, ..., l - 1 (29)

Then the corresponding orthogonal vector for x, relative to {zj, j ) 1, 2, ..., l - 1}, can be obtained: l-1

zx ) x -

Rjlzj ∑ j)1

(30)

The ISE value when zx is added into the model will become D

ISEl ) ISEl-1 - (YTzx)2/

z2x (k) ∑ k)1

(31)

where ISEl is the ISE value when l terms are included in the model. All columns in X except those already having been included in the model should be tested. Then, the new component will be chosen as the one that corresponds to the smallest ISEl. The selection procedure will stop when ISE reaches a predefined limit or the tendency of decreasing in ISE saturates. Note that the term in eq 31 corresponding to the incremental error is obtained from the expression for the reduction in mean-squared error (ISE divided by D) derived by Korenberg et al.13 In the reference, the expression involves the estimated parameters for the orthogonalized form of eq 27, which we do not use here. To obtain the form shown in eq 31, the expression for those parameter estimates, also derived by Korenberg et al.,13 must be substituted into the error expression. This is a trivial calculation, and it is not included here in order to avoid introduction of a lot of mathematical notation, not needed elsewhere in the paper. After the dominating terms are selected, the model parameters can be estimated as follows. Let Xr be the matrix whose columns are those vectors selected from X by the above selection procedure, and Θr be the parameter vector whose elements correspond to Xr. Then the partial parameters Θr can be estimated from the output prediction model Y ˜ ) Xr‚Θr using a leastsquares method to solve eq 28. The other remaining parameters in Θ are set to zero. 6. Effect of Inputs on Identifiability In the previous section, X was assumed to be of full column rank. In identification, X is a tall matrix. Assuming full rank, the square matrix (XTX) will be invertible. Then eq 28 has a unique solution for Θ and for Y ˜ ) Xr‚Θr as well because Xr consists of column vectors from X. What will happen if this assumption does not hold? If X is not full rank in its column vectors, (XTX) becomes singular and there will be multiple solutions for Θ in eq 28. In other words, the complete VolterraLaguerre model is not identifiable from the given data. Because X depends only on input data when {N, r, m, a} are given, a basic requirement has to be that the input excitation signal should be designed in such a way that the matrix X is of full rank. The impact on the dominating term selection, if X is not full rank, can be analyzed as follows. Intuition suggests that the selection procedure through the orthogonal regression analysis will guarantee Xr to be full rank. Suppose we have selected l - 1 vectors from X that are linearly independent of each other. Now consider another vector x from X. If x is linearly dependent on the selected vectors, then it will also be linearly dependent on the corresponding l - 1 orthogo-

Ind. Eng. Chem. Res., Vol. 43, No. 2, 2004 345

nalized vectors. Hence, the orthogonalization of x described by eqs 29 and 30 leads to zx ) 0. Such a vector, of course, has no individual contribution to the ISE value and hence should not be included in the model. In other words, we should not use those vectors that will induce zx ) 0. However, in the numerical implementation, check if zx ) 0 is not as simple as one may think because numerical errors may make zx not exactly zero even if it is theoretically. Therefore, we need some sort of criterion to reject such vectors. For this purpose, the following steps are included in the identification procedure: 1. Choose a threshold s, say s ) 1 × 105. 2. If ||x||/||zx|| g s, x is considered to be linearly dependent on the previously selected vectors and is dismissed, where ||x|| ) ∑i|xi|. 3. If ||x||/||zx|| < s and x results in the largest decrease in the ISE value, then x is selected. Though the difficulty arising from the singularity problem can be avoided through these steps, it should be noted that, because this singularity problem has nothing to do with the output data, one can check if a designed input signal will result in this problem before the actual experiment is carried out. Finally, it should be noted that several input design methods for use with Volterra models have been proposed in the literature1,22,23 and their use may reduce the amount of data required for the identification of Volterra-Laguerre models as well.

we define an Nth-order multi-input single-output (MISO) Volterra series: N

y(t) )

M

M

∑ ∑ i∑)1∫0 ‚‚‚∫0 hi ‚‚‚i (τ1,...,τn) ui (t-τ1) ... n)1i )1 t

‚‚‚

t

1

1

n

1

n

uin(t-τn) dτ1 ... dτn (35) where hi1...in(τ1,...,τn) is the nth-order Volterra kernel with respect to the input indices {i1, ..., in}. The discrete-time version of the MISO Volterra series is accordingly defined as N

y(k) )

M

M

m

m

∑ ∑ ... ∑ ∑ ... ∑ hi ...i (j1,...,jn) ui (k-j1) ... n)1i )1 i )1j )1 j )1 1

1

n

1

n

1

n

uin(k-jn) (36) From the identification point of view, obtaining a MIMO model breaks down into q independent problems of obtaining a MISO model. As in the SISO case, eq 36 can also be described by the following Volterra-Laguerre model: N

y(k) )

M

M





∑ ∑ i∑)1l∑)1 l∑)1 ‚‚‚

n)1i1)1

‚‚‚

n

1

m

θi1...in(l1,...,ln) [

∑ Ll (j1)

j1)1

n

1

m

ui1(k-j1)]‚‚‚[

∑ Ll (jn) ui (k-jn)]

jn)1

n

n

(37)

or in truncated form 7. Extension to MIMO Systems

N

Lesiak and Krener21 have shown that a fairly general class of physical systems have Volterra series representations. In differential/algebraic equation form, these systems can be written in the following MIMO nonlinear state-space description: M

X4 ) F(X) +

uigi(X) ∑ i)1

(32)

Y ) C(X)

(33)

where X ∈ Rp and Y ∈ Rq. The vector field F, output function C, and gi are assumed to have a sufficient degree of smoothness. Each of the M inputs, u1, ..., uM, is assumed to be either absolutely integrable or bounded on a finite interval. It is assumed that the origin is both an equilibrium point and the initial condition. Lesiak and Krener21 showed that such a system has a Volterra series representation of the following form: M

yj(t) )

∫0 hij (t-τ1) ui (τ1) dτ1 + ... + ∑ i )1 t

1

1

1

M

M

∑ ‚‚‚i∑)1∫0 ...∫

i1)1

t

t j h (t-τ1,...,t-τn) 0 i1‚‚‚in

y˜ (k) )

M

M

r

r

∑ ∑ i∑)1l∑)1 l∑)1 ‚‚‚

n)1i1)1

‚‚‚

n

1

m

θi1...in(l1,...,ln) [

∑ Ll (j1)

j1)1

n

1

m

ui1(k-j1)]‚‚‚[

∑ Ll (jn) ui (k-jn)]

jn)1

n

n

(38)

Notice that unlike the SISO case where we can use a triangular form of θi1...in(l1,...,ln) to eliminate repeated terms, we need all elements of θi1...in(l1,...,ln) for a MISO system because the inputs ui1, ..., uin are different (in the SISO case, i1 ) i2 ) ... ) in ) 1). If we denote

Uik ) [ui(k), ..., ui(k - m)], i ) 1, ..., M

(39)

Uk ) [U1kB, ..., UM k B]

(40)

and

Θ ) [θ1(1), ..., θ1(r), θ2(1), ..., θM(r), θ11(1,1), ..., θ11(1,r), θ12(1,1), ..., θ12(1,r), ..., θ1M(1,r), ..., θ11(2,2), ..., θ11(2,r), θ12(2,1), ..., θ1M(2,r), θ11(3,3), ..., θ1M(r,r), θ22(1,1), ..., θ2M(r,r), ..., θMM(r,r), ..., θM...M(r,...,r)]T (41)

ui1(τ1) ...

n

n

uin(τn) dτ1 ... dτn + (||u|| ), j ) 1, 2, ..., q (34) One can see from eq 34 that the number of outputs does not affect the Volterra expression because no autoregressive terms are involved. Motivated by this result,

then eq 38 can be written as

y˜ (k) ) [Uk, Uk[2], ..., Uk[N]]Θ

(42)

Notice that when the structure parameters {N, m, r, a} and the I/O data are given, eq 42 is a linear-inparameters model and the parameters Θ can be esti-

346

Ind. Eng. Chem. Res., Vol. 43, No. 2, 2004

Figure 1. Schematic of a model IV FCCU.

mated using a linear regression method. Moreover, the dominating term selection procedure proposed for the SISO case can be carried over to the MISO case.

Figure 2. Input data for identification.

8. Application to an FCCU The fluidized-bed catalytic cracker is the refinery’s workhorse. It converts gas oils into a range of hydrocarbon products, of which gasoline is the most valuable. Its large throughput gives the FCCU a major role in the overall economic performance of a refinery; therefore, it is a prime candidate for advanced process control.24 Its internal feedback loops (interactions) and high nonlinearity make it an appropriate candidate for the application of MIMO nonlinear model identification techniques. 8.1. Description of Model IV FCCU. For a detailed description of this process, one is referred to work by McFarlane et al.14 The model IV FCCU consists of two main sections: the reactor, where the catalytic cracking reaction occurs, and the regenerator, where hydrocarbon deposits that are products of the cracking reaction are burned off the catalyst (Figure 1). The fresh feed from the preheating furnace is mixed with the hot slurry recycle before entering the reactor where the catalytic conversion takes place. The conversion yields hydrocarbon vapor as well as coke, which deposits on the catalyst and suppresses its reactivity. Catalyst regeneration is achieved by burning off the coke deposit in a fluidized bed inside the regenerator, where the lift air assists in catalyst circulation and, together with the combustion air blower, provides oxygen for combustion reaction. The main control objective is to operate the process in its maximum throughput with no violation of various constraints. Usually, a number of regulatory proportional-integral-derivative controllers are permanently equipped in a FCCU to maintain a stable process. A user-supplied advanced control algorithm can then manipulate the setpoints to these controllers (as well as other variables). In the model IV FCCU, there are 14 manipulated and 20 measured output variables. Among these, the following variables are considered to be important to the control objective, as suggested by Kalra and Georgakis.25 Controlled variables: stack gas carbon monoxide concentration (XCO); reactor riser temperature (Tr); regenerator bed temperature (Treg). Manipulated variables: fresh feed flow rate Fset 3 (as u1); slurry recycle flow rate Fset (as u ); lift air flow rate 2 4 Fset (as u ). 3 9 8.2. Nonlinear Model Identification. Kalra and Georgakis25 obtained different linearized models at

Figure 3. Output data for identification.

different operating points, which were then used for model predictive control at the corresponding operating points. The reason for using different linearized models is the significant process nonlinearity because no single linear model can describe its behavior appropriately. It would be desirable to be able to obtain from available I/O data a nonlinear model that captures the process characteristics over a relatively wide operating range. A model IV FCCU simulator, based on the models developed by McFarlane et al.,14 is used to emulate the plant. Prior to the beginning of the identification experiment, the process is assumed at the following initial settings (others not listed can be found in work by McFarlane et al.:14 Fset 3 ) 126 lb/s XCO ) 59.81 ppm

Fset 4 ) 5.25 lb/s Tr ) 994.517 °F

Fset 9 ) 14.51 lb/s Treg ) 1271.09 °F

As suggested, the process has a settling time of about 100 min for a step response, and a sampling time of 1 min is adequate. Here we use a sampling time of 50 s. A sequence of I/O data are collected as shown in Figures 2 and 3, where all three manipulated variables are allowed maximum 8% changes. All data are in deviation variable form around their initial values. No special effort is made to design the input sequence to avoid singularity of the X matrix. The input values are selected randomly within the set limits, and the time duration for each value is also random. The following model structure parameters are used: N ) 3, r ) 5, a ) 0.85, and m ) 100. There are no optimal selection procedures for these parameters; however, relatively simple choices lead to reasonable

Ind. Eng. Chem. Res., Vol. 43, No. 2, 2004 347 Table 1. Model Parameters for Tr no.

index

θ values

APE

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15

θ1(1) θ1(2) θ1(5) θ3(1) θ2(4) θ12(2,4) θ12(1,1) θ2(1) θ3(5) θ122(3,2,4) θ111(1,3,4) θ112(2,5,5) θ12(5,5) θ22(1,5) θ22(2,2)

-0.5111 0.3009 0.1015 0.6862 0.8287 -9.517 × 10-2 -0.1380 1.037 0.3633 -0.5701 -7.575 × 10-4 1.691 × 10-2 -0.1164 2.865 1.784

1.627 1.101 0.935 0.881 0.838 0.799 0.777 0.743 0.721 0.705 0.692 0.680 0.665 0.651 0.636

Table 2. Model Parameters for Treg no.

index

θ values

APE

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15

θ1(3) θ1(5) θ2(3) θ3(3) θ123(5,5,1) θ1(2) θ23(5,1) θ12(3,4) θ222(5,5,5) θ2(1) θ23(1,5) θ223(3,5,3) θ123(2,5,2) θ122(5,5,5) θ113(1,5,5)

0.2284 0.2272 1.8486 0.9315 0.5117 0.1208 -1.7332 -0.1784 7.6279 1.6352 2.1155 -7.3213 -0.2298 -0.3204 5.2887 × 10-3

1.631 1.267 1.138 1.066 1.008 0.962 0.907 0.880 0.842 0.814 0.783 0.761 0.741 0.727 0.713

results in our experience. The truncation number m can be selected based on the settling time of the process, for example, from a step response. Note that, because a large m does not increase the number of parameters to be identified (contrary to the standard Volterra model), a conservative choice on the large side may be used. A step response will also provide a guide for the selection of a in an attempt to match a dominant pole. Of course, for a nonlinear process, there may be no such thing, and unlike linear processes, for which optimal methods exist,19 we can only use it as a rough approximation. The use of the orthogonal regression analysis to keep only significant terms in the identified model provides some insurance against the selection of a model order, N, or order of Laguerre functions, r, larger than needed. Our experience shows that the value r ) 5 is adequate as long as the selection of a is reasonable. Fifteen dominating terms are selected for each of the three identified MISO models; these are shown in Tables 1-3. The parameters are arranged in the order of their inclusion in the model. Therefore, terms with lower numbers make more significant contributions to model accuracy than those with higher numbers. The values in the APE column represent the average prediction errors (defined as ISE1/2/number of data) when this term and the terms above it are included in the model. The corresponding model outputs are shown in Figure 3. To validate the model, another sequence of I/O data are collected and the model output is compared against them, as shown in Figures 4 and 5. Some discussion about these models is in order: (1) The 8% changes in the manipulated variables cover all four different operating points discussed by Kalra and Georgakis.25 Therefore, the identified nonlinear models can capture the nonlinear dynamics of the

Figure 4. Input data for comparison.

Figure 5. Output comparison (data not used in the identification). Table 3. Model Parameters for XCO no.

index

θ values

APE

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15

θ1(1) θ11(1,1) θ2(1) θ111(1,2,5) θ11(1,5) θ12(1,1) θ111(2,2,2) θ1(4) θ333(2,5,5) θ123(2,2,5) θ223(1,4,4) θ111(1,5,5) θ22(1,1) θ13(3,1) θ1(2)

7.241 0.445 32.040 -1.276 × 10-2 -0.117 2.416 1.385 × 10-2 -1.222 15.274 -2.746 -67.956 -6.597 × 10-3 20.822 0.404 0.706

30.874 20.109 15.540 12.487 11.359 10.459 9.731 8.718 7.981 7.675 7.298 7.121 6.965 6.853 6.767

FCCU in the operating range over which four different linear models are needed otherwise. (2) Among the three models for Tr, Treg, and XCO, we can recognize from their parameter indices that the one corresponding to XCO displays the strongest nonlinearity. This is because the first four dominating terms for Tr and Treg are all linear terms; however, two out of the first four dominating terms for XCO are nonlinear. This observation is consistent with the response curves where the one corresponding to XCO shows the strongest asymmetry to the symmetric inputs (Figures 3 and 5). Kalra and Georgakis25 also reached a similar conclusion. (3) The impact of F9 (lift air flow rate) on Tr (riser temperature) is relatively weak. Moreover, their relation is quite linear because there is no nonlinear term for F9. One possible explanation could be that there are two

348

Ind. Eng. Chem. Res., Vol. 43, No. 2, 2004

regulatory control loops equipped to maintain relatively constant pressures inside both the reactor and the regenerator, and hence the impact of the lift air on the catalyst circulation is weakened and, in turn, on the riser temperature as well. 9. Concluding Remarks The major obstacle in using a higher than secondorder Volterra series model, namely, the very large number of parameters, is removed through the use of Volterra-Laguerre models and orthogonal regression analysis to determine the dominating terms. At the same time, the Volterra-Laguerre model remains a linear-in-parameters model, whose parameters can be estimated by linear regression. Conditions under which a nonlinear system can be accurately approximated by the Volterra-Laguerre model were developed for the SISO case. It was shown that fading memory systems and bilinear systems satisfy such conditions. This indicates the wide applicability of Volterra-Laguerre models. The identification of Volterra-Laguerre models, including the orthogonal regression analysis for selection of dominating terms, was extended to MIMO processes by breaking the problem into several MISO identification problems. The successful testing on the simulated model IV FCC process demonstrates its usefulness in multivariable nonlinear system identification. Furthermore, as seen from the discussion, a Volterra-Laguerre model with only the dominating terms included has a structure from which we can draw useful information regarding process characteristics. Nomenclature D ) data length Fset 3 ) fresh feed flow rate Fset 4 ) slurry recycle flow rate Fset 9 ) lift air flow rate hn ) nth-order Volterra kernel H ) nonlinear operator k ) time index Li ) ith-order Laguerre function m ) memory length M ) number of inputs N ) Volterra series order r ) order of Laguerre functions Tr ) reactor riser temperature Treg ) regenerator bed temperature u ) input XCO ) stack gas carbon monoxide concentration y ) output y˜ ) model output θ ) coefficient of the Laguerre expansion (‚)[i] ) ith-order reduced Kronecker product (‚)T ) transposition of a vector or a matrix

Literature Cited (1) Doyle, F. J., III; Pearson, R. K.; Ogunnaike, B. A. Identification and Control Using Volterra Models; Springer-Verlag: New York, 2002. (2) Doyle, F. J., III; Ogunnaike, B. A.; Pearson, R. K. Nonlinear Model-Based Control Using Second-Order Volterra Models. Automatica 1995, 31, 697. (3) Genceli, H.; Nikolaou, M. Design of Robust Constrained Model Predictive Controllers with Volterra Series. AIChE J. 1995, 41, 2098.

(4) Zheng, Q.; Zafiriou, E. A Local Form of Small Gain Theorem and Analysis of Feedback Volterra Systems. IEEE Trans. Autom. Control 1999, 44, 635. (5) Maner, B.; Doyle, F. J., III; Ogunnaike, B. A.; Pearson, R. K. Nonlinear Model Predictive Control of Multivariable Polymerization Reactor Using Second-Order Volterra Series. Automatica 1996, 32, 1285. (6) Pearson, R. K.; Ogunnaike, B. A.; Doyle, F. J., III Identification of Structurally Constrained Second-Order Volterra Models. IEEE Trans. Signal Process. 1996, 44, 2837. (7) Kurth, J.; Rake, H. Identification of Nonlinear Systems with Reduced Volterra Series. Proceedings of the 10th IFAC Symposium on System Identification (SYSID’94), Copenhagen, Denmark, 1994; pp 143-150. (8) Schetzen, M. The Volterra and Wiener Theories of Nonlinear Systems; John Wiley & Sons: New York, 1980. (9) Dumont, G. A.; Fu, Y.; Lu, G. Nonlinear Adaptive Generalized Predictive Control and Applications. In Advances in Model Predictive Control; Clarke, D., Ed.; Oxford University Press: Oxford, U.K., 1994. (10) Zheng, Q.; Zafiriou, E. Nonlinear System Identification for Control Using Volterra-Laguerre Expansion. Proceedings of the American Control Conference, Seattle, WA, 1995; pp 2195-2199. (11) Parker, R. S.; Doyle, F. J., III. Optimal Control of a Continuous Bioreactor Using an Empirical Nonlinear Model. Ind. Eng. Chem. Res. 2001, 40, 1939. (12) Parker, R. S. Efficient Nonlinear Model Predictive Control: Exploiting the Volterra-Laguerre Model Structure. Proceedings of the Conference on Chemical Process Control VI, Tucson, AZ, 2001. (13) Korenberg, M.; Billings, S. A.; Liu, Y. P.; McIlroy, P. J. Orthogonal Parameter Estimation Algorithm for Nonlinear Stochastic Systems. Int. J. Control 1988, 48, 193. (14) McFarlane, R. C.; Reineman, R. C.; Bartee, J. F.; Georgakis, C. Dynamic Simulator for a Model IV Fluid Catalytic Cracking Unit. Comput. Chem. Eng. 1993, 17, 275. (15) Rugh, W. Nonlinear System Theory: Volterra/Wiener Approach; The Johns Hopkins University Press: Cleveland, OH, 1981. (16) Boyd, S.; Chua, L. O. Fading Memory and the Problem of Approximating Nonlinear Operators with Volterra Series. IEEE Trans. Circuits Syst. 1985, 32, 1150. (17) Lee, Y. W. Statistical Theory of Communication; John Wiley & Sons: New York, 1967. (18) Wahlberg, B. System Identification Using Kautz Models. IEEE Trans. Autom. Control 1994, 39, 1276. (19) Fu, Y.; Dumont, G. A. An Optimum Time Scale for Discrete Laguerre Network. IEEE Trans. Autom. Control 1993, 38, 934. (20) Haber, R.; Unbehauen, H. Structure Identification of Nonlinear Dynamic SystemssA Survey on Input/Output Approach. Automatica 1990, 26, 651. (21) Lesiak, C.; Krener, A. J. The Existence and Uniqueness of Volterra Series for Nonlinear Systems. IEEE Trans. Autom. Control 1978, 23, 1090. (22) Nowak, R. D.; Van Veen, B. D. Random and Pseudorandom Inputs for Volterra Filter Identification. IEEE Trans. Signal Process. 1994, 42, 2124. (23) Parker, R. S.; Heemstra, D.; Doyle, F. J., III; Pearson, R. K.; Ogunnaike, B. A. The Identification of Nonlinear Models for Process Control Using Tailored “Plant-Friendly” Input Sequences. J. Process Control 2001, 11, (special issue) SI:237. (24) Grosdidier, P.; Mason, A.; Aitolahti, A.; Heinonen, P.; Vanhamaki, V. FCC Unit Reactor-Regenerator Control. Comput. Chem. Eng. 1993, 17, 165. (25) Kalra, L.; Georgakis, C. Effect of Process Nonlinearity on the Performance of Linear Model Predictive Controllers for the Environmentally Safe Operation of a Fluid Catalytic Cracking Unit. Ind. Eng. Chem. Res. 1994, 33, 3063.

Received for review December 30, 2002 Revised manuscript received May 21, 2003 Accepted July 2, 2003 IE021064G