From Data to Nonlinear Predictive Control. 1. Identification of

Feb 16, 2006 - From Data to Nonlinear Predictive Control. 1. Identification of Multivariable. Nonlinear State Observers. M. Srinivasarao, Sachin C. Pa...
1 downloads 8 Views 208KB Size
Ind. Eng. Chem. Res. 2006, 45, 1989-2001

1989

From Data to Nonlinear Predictive Control. 1. Identification of Multivariable Nonlinear State Observers M. Srinivasarao, Sachin C. Patwardhan,* and Ravindra D. Gudi Department of Chemical Engineering, Indian Institute of Technology, Bombay, Powai, Mumbai 400 076, India

This work aims at the identification of a special class nonlinear state space observers for nonlinear multivariable systems directly from input-output data when the data is corrupted with unmeasured disturbances. At the identification stage, the one step ahead predictor form of the model is arranged to have a Weiner-like structure. The linear dynamic component of the predictor is parametrized using generalized orthonormal basis functions. The resulting observer is shown to be a nonlinear ARX (NARX) type model with an infinite but fading memory property. It is also shown that the proposed model structure is capable of capturing input as well as output multiplicity (multiple steady states) behavior. The efficacy of the proposed modeling scheme is demonstrated using simulation studies on a continuously stirred tank reactor (CSTR) process model, which exhibits input multiplicity, and another CSTR process model that exhibits output multiplicities. The types of unmeasured disturbances investigated are (a) unknown input disturbances (such as feed concentration fluctuations), (b) uncertainties in manipulated inputs, and (c) fluctuation in process parameters. The proposed modeling scheme is also validated in real time using a laboratory scale, multivariable experimental system. The analysis of the simulation and experimental studies reveals that the identified models have excellent disturbance modeling and long range prediction abilities. The identified models are also able to capture the steady-state behavior of the systems under consideration reasonably accurately over a wide operating range. The resulting stochastic model can be directly used for the development of an extended Kalman filter and to formulate a nonlinear model predictive control (NMPC) scheme. 1. Introduction Model predictive control (MPC) has been widely used in the process industry as an integral component of the on-line optimization of control schemes. As many chemical plants exhibit nonlinear behavior when operated over a wide range, analysis and synthesis of MPC schemes based on nonlinear prediction models has been an important area of research over last two decades.1-3 Despite the large number of theoretical developments in this area, nonlinear MPC (or NMPC) schemes have not made a significant industrial impact and their industrial applications are relatively few. Lee4 has pointed out that the lukewarm response of the industry to this new generation of predictive control schemes can be mainly attributed to the inability to construct a nonlinear model on a reliable and consistent basis. This is particularly true when the system under consideration is of large dimension and is subjected to unmeasured disturbances of significant magnitudes. Thus, the development of NMPC relevant models, starting from available data and information about the system, can be singled out as one of the most important issues in this area. Models used in NMPC formulation can be broadly classified as mechanistic (or fundamental) models and black-box models. The mechanistic models have a clear advantage over the blackbox models in terms of extrapolation ability and portability to multiple facilities.5 The majority of NMPC formulations proposed in the literature are based on mechanistic models.4-7 These formulations typically employ extended Kalman filtering to deal with unmeasured disturbances. As the model parameters have physical significance, the uncertainties in parameters can be treated systematically by incorporating robustness at the controller design stage.8,9 However, the development and validation of a mechanistic model is, in general, a difficult and * To whom correspondence should be addressed. E-mail: [email protected]. Tel.: +91-22-25767211. Fax: +91-22-25726895.

time-consuming process and requires a domain expert. Thus, the cost and the time required for developing such a model is justified when the model can be reused with minor changes at multiple production sites with similar facilities.5 Development of a black-box model directly from perturbed plant data, on the other hand, can be a relatively easy and economically attractive alternative in some situations.10 Sjoberg et al.11 and Pearson and Oggunaike12 provide excellent reviews of the variety of model structures available for black-box modeling. Recently, Lee4 has critically reviewed different model structures that have been used in various NMPC formulations. The emphasis of MPC relevant model identification has been on developing models with good long range prediction capability with respect to known inputs. The issue of identifying the stochastic part of the system (and using it for disturbance rejection) has been largely ignored in the NMPC related identification literature.4 Other primary issues of importance that need to be addressed systematically are model structure selection and dimensionality reduction. From the viewpoint of structures used for modeling unmeasured disturbance effects on process outputs, various blackbox models proposed in NMPC literature can be broadly classified into two classes: (a) nonlinear output error (NOE) or nonlinear moving average (NMAX) models and (b) nonlinear ARX (NARX) models. Some of the significant models belonging to the NOE class are the recurrent neural networks (RNN),13 the Volterra models,14 the decoupled AB net,15 the LaguerreWeiner model,16 and the multivariable Hammerstein model.17 The RNNs are models with large numbers of unknown parameters in which parameter estimation by prediction error minimization results in a highly ill-conditioned, nonconvex, nonlinear optimization problem. The Volterra models alleviate this difficulty as the model parameter can be estimated analytically using linear least-squares.14 In fact, the Volterra series models can be viewed as nonlinear generalizations of finite

10.1021/ie050904z CCC: $33.50 © 2006 American Chemical Society Published on Web 02/16/2006

1990

Ind. Eng. Chem. Res., Vol. 45, No. 6, 2006

impulse response (FIR) models, which have been widely used for the development of linear MPC schemes. However, these models are nonparsimonious in parameters and require a large data set to keep the variance errors low. This is an important issue, especially when they are used to model multivariable systems. Block oriented structures, such as Weiner or Hammerstein structures, with canonical parametrization of linear dynamics have been used to partly deal with the problems associated with the dimensionality. Wiener originally proposed the use of orthonormal filters in combination with a memoryless nonlinear map for the modeling of nonlinear dynamical systems.18 Boyd and Chua19 have later shown that any time invariant, continuous, nonlinear operator with fading memory can be realized in terms of a finite, dimensional, linear dynamical system coupled with a nonlinear readout map. The block oriented decoupled AB net15 or the Laguerre-Weiner model16 belong to this class of Weiner type models where the linear dynamic component of the model is parametrized using Laguerre/Kautz filters. Recently, Gomez and Baeyens17 have proposed a multivariable Hammerstein model where linear dynamics is parametrized using generalized orthonormal basis filters (GOBF). The main advantage of a NOE/NMAX structure is that it ensures good prediction capability over a large horizon with respect to known (manipulated) and measured inputs, which is of vital importance in any NMPC formulation. In addition, NMAX/NOE type models have also been shown to be bounded input bounded output (BIBO) stable.14,16 However, NOE/NMAX structures do not facilitate the modeling of unmeasured disturbances. Moreover, models belonging to the NMAX class cannot capture multiplicities of steady states (or output multiplicities) or subharmonic behavior.12 The NARX models (or polynomial autoregressive moving average (ARMA) models) proposed by Hernandez and Arkun20,21 explicitly capture the effect of unmeasured disturbances through nonlinear coupling between known inputs and unknown disturbances. The NARX structure is quite general and is capable of exhibiting steady-state multiplicities.4 However, identifying the proper structure of a conventional NARX model can be quite a tedious task particularly for multivariable systems. The feed forward neural networks (FNN), trained using past inputs and past output measurements, also belong to the class of NARX models. The primary drawback of such models is that they are nonparsimonious. Though training these models is relatively easy when compared to RNNs, these models have been shown to have poor prediction capabilities.13 The nonlinear additive ARX model,4 the AR-Volterra model,12 the multivariable Weiner model,17 or the partial least sqaure (PLS) based Hammerstein and Weiner models22 can be viewed as special forms of the NARX class of models with certain constraints imposed on the model structure. Note that the assumptions on model structure restrict the types of nonlinear phenomenon these models can capture. For example, the invertibility conditions imposed on the static nonlinear blocks in the multivariable Weiner model17 do not facilitate the modeling of output multiplicities. One major concern while developing black-box models is the length of data required for parameter estimation, which translates to the time required for conducting identification experiments and possible loss of production during this period. Note that the variance errors in parameter estimates are directly proportional to the number of model parameters and inversely proportional to the length of data.11,12 The black-box models such as Volterra models or models developed using neural network (RNN or FNN) have large numbers of unknown

parameters. Thus, in these cases, the length of data required for keeping bias and variance errors low is significantly large. The black-box models parametrized using orthonormal basis functions or conventional NARX type models are, on the other hand, parsimonious in parameters, and consequently, the data length requirement is moderate. From linear MPC formulations, it is well-known that unmeasured disturbance modeling and prediction improves the controller performance.23 In most of these formulations, however, it is implicitly assumed that the model relating physical unmeasured disturbance variables (such as feed composition fluctuations) with the state/output dynamics is available a priori. In practice, however, such disturbance models relating the true root cause of the unmeasured disturbances with the states/outputs are difficult to develop when the disturbances are not measurable. It has been recently shown that data driven stochastic modeling of unmeasured disturbances can be effectively used to improve the performance of linear MPC.25,24 Even though NARX type models facilitate unmeasured disturbance modeling, not much literature is available on how well these models capture typical unmeasured disturbances encountered in process industry and how to translate the advantages of unmeasured disturbance modeling into the better regulatory performance of NMPC. This work aims at the identification of a special class of nonlinear state space observers for nonlinear multivariable systems directly from input-output data.26 From an implementation viewpoint, this idea is particularly attractive as it is not straightforward to design an effective nonlinear state estimator starting from a general system description.4 At the identification stage, the one step ahead predictor form of the observer is arranged to have a Weiner like structure. The linear dynamic component of the predictor is parametrized using GOBF. The resulting observer is shown to have an infinite but fading memory property. It is also shown that the proposed model structure can capture input as well as output multiplicity behavior. The efficacy of the proposed modeling scheme is demonstrated using simulations on following systems: (i) A benchmark continuously stirred tank reactor (CSTR) system,27,28 which exhibits input multiplicity and change in the sign of steadystate gain in the operating region. (ii) A CSTR system considered in Hernandez and Arkun,21 which exhibits output multiplicities. The types of unmeasured disturbances investigated are (a) unknown input disturbances (such as feed concentration fluctuations), (b) uncertainties in manipulated inputs, and (c) fluctuation in process parameters. The applicability of the proposed modeling scheme is also validated in real time using a laboratory scale, multivariable experimental system. This paper is organized as follows. The next section deals with the proposed model structure and formulation of the parameter estimation problem. In the third section, simulation studies on systems exhibiting input and output multiplicities are presented. The results of an identification exercise carried out using experimental data are presented in the fourth section. The fifth section presents the conclusions reached through the analysis of simulation and experimental results. 2. Nonlinear State Observer Consider a process represented as set of nonlinear ordinary differential equations (ODEs)

dz ) F[z, uT(t), d(t), p(t)] dt

(1)

y(t) ) G[z] + υy(t)

(2)

Ind. Eng. Chem. Res., Vol. 45, No. 6, 2006 1991

Figure 1. Schematic diagram for process.

Figure 2. Schematic diagram for nonlinear state observer.

where z ∈ Rs represents the state vector, uT ∈ Rm represents the true value of the manipulated inputs, d ∈ Rd represents unmeasured disturbances, y ∈ Rr represents the vector of measured outputs corrupted with measurement noise υy(t) and p ∈ Rν represents the parameter vector (see Figure 1). We further assume that (i) u ∈ Rm represents known (or computed) value of manipulated inputs, which are related to the true values as follows

represents some nonlinear static map relating states with the outputs and, e(k) represents a white noise sequence. A schematic block diagram of the proposed model is shown in Figure(2). Rearranging eqs 8 and 9 as

uT(t) ) u(t) + υu(t)

(3)

where υu ∈ Rm denotes an unknown input disturbance, which is assumed to be a zero-mean stationary signal. (ii) Variations of signals d(t) and p(t) around their mean values, denoted as υd(t) and υp(t), respectively, can be represented as zero-mean stationary stochastic processes and

d(t) ) d h + υd(t)

(4)

p(t) ) p j + υp(t)

(5)

where d h and p j represent mean values of these signals. In eq 3, u(k) can be viewed as a set point to a fast acting cascaded slave loop generated by the control law, while uT(k) represents the actual input to the plant. In practice, the operators F[.] and G[.] are too complex and seldom known exactly. Thus, the information available from the plant is the sampled sequence of input and output vectors ΣN ) {(y(k), u(k)): k ) 1, 2, ..., N}.Given input and output data set ΣN collected from a plant, the problem of identifying a nonlinear time series model can be stated as finding a nonlinear operator Ξ[.]

y(k) ) Ξ[φ(k), θ] + e(k)

(6)

φ(k) ) φ[u(k - 1), ..., u(1), y(k - 1), ..., y(1)]

(7)

such that a suitable norm of model residuals {e(k): k ) 1, ..., N} is minimized with respect to parameter vector θ. Here, φ[.] represents a regressor vector. Note that, even though variations of various disturbances and parameters are unmeasured, their effect is present in the measured output sequence. Thus, the model represented by eqs 6 and 7 are expected to not only capture the effect of manipulated inputs but also model the effect of unmeasured disturbances. The modeling problem can be further decomposed as (a) choosing a suitable regressor φ[.] and (b) selecting a suitable nonlinear mapping Ξ[.] from the regressor space to the output space.11 In this work, we propose to develop a special class of nonlinear state space observers of the form

X(k + 1) ) ΦX(k) + Γu(k) + Ky(k)

(8)

y(k) ) Ω[X(k)] + e(k)

(9)

where X(k)(≡ φ(k)) ∈ Rn represents the state vector, Ω[.]

X(k + 1) ) F [X(k)] + Γu(k) + Ke(k)

(10)

y(k) ) Ω[X(k)] + e(k)

(11)

F [X(k)] ) [ΦX(k) + KΩ(X(k))]

(12)

it may be observed that these equations represent a special type of nonlinear state observer, which is affine in the manipulated inputs and the state noise is additive. The representation given by eqs 8 and 9 has an advantage over eqs 10-12 at the model identification stage as the state dynamics is linear and the observer has a Weiner like structure. Thus, the modeling problems at hand are parametrizations of the linear dynamic component (eq 8) and the nonlinear static map Ω[.]. Remark 1. Weiner type nonlinear models proposed in the literature, which haVe been parameterized using orthonormal basis filters (OBF), can be expressed in a generic form

X(k + 1) ) ΦX(k) + Γu(k)

(13)

y(k) ) Ω[X(k)] + v(k)

(14)

Note that these models haVe a nonlinear output error (NOE) structure and it is implicitly assumed that the effects of unmeasured disturbances are additiVe. Thus, in these modeling schemes, there is no attempt to assign any structure to the unmeasured disturbances or model them explicitly. 2.1. Parametrization of Linear Dynamic Component. There are multiple possible ways by which we can achieve canonical parametrization of the linear dynamic component (eq 8). For example, matrices (Φ, Γ, K) can be parametrized as in the classical observer canonical form. In this work, we choose to parametrize these matrices using generalized orthonormal basis filters,24 which represent an orthonormal basis for the set of strictly proper stable transfer functions (denoted as H2). Ninness and Gustafson29 have shown that a complete orthogonal set in H2 can be constructed as follows

Fk(z, ξ) )

x(1 - |ξ | ) k

(z - ξk)

2 k-1 (1

- ξ/i z)

∏ i)1 (z - ξ )

(15)

i

where {ξk: k ) 1, 2, ...} is an arbitrary sequence of poles inside the unit circle appearing in complex conjugate pairs. Classical FIR models, Laguerre filters, or Kautz filter based models are well-known examples of the class of models parametrized using GOBF.25,30 The main advantage of using orthogonal basis filters other than {z-j} in the classical FIR models is that a strictly proper stable transfer function can be approximated by only a small number of coefficients in the expansion, i.e., a model

1992

Ind. Eng. Chem. Res., Vol. 45, No. 6, 2006

parsimonious in parameters is generated. If the system under consideration has scattered poles, GOBF is a better choice of orthonormal basis filters than Laguerre or Kautz filters. Given a set of real poles inside a unit circle, a method of parametrization of matrices (Φ, Γ, K) using a GOBF network for a multiinput model is given in Patwardhan and Shah.24 For a r × m multi-input/multi-output (MIMO) process, we propose to develop r nonlinear state observers from inputoutput data. The linear dynamic component for the ith observer is constructed by using either of the following two approaches: • Multi-input/single-output (MISO) linear component (LMISO model):

X(i)(k + 1) ) Φ(i)X(i)(k) + Γ(i)u(k) + K(i)yi(k)

(16)

• MIMO linear component (L-MIMO model):

X(i)(k + 1) ) Φ(i)X(i)(k) + Γ(i)u(k) + K(i)y(k)

(17)

where X(i) ∈ Rni represents the state vector of the ith observer and yi(.) represents the ith component of the measurement vector y(k). 2.2. Parametrization of the State Output Map. The nonlinear state output map Ωi[.]: Rni f R can be parametrized as a function space expansion

yi(k) ) Ωi[X(i)(k)] + ei(k) Ωi[X(i)(k)] )

∑j cijωij[X(i)(k)]

(18) (19)

where ωij[.] represent some basis functions. Sjoberg et al.11 have discussed several possible ways of choosing single variable or multivariable basis functions such as sigmoidal neural networks, wavelet and radial basis networks, etc. In the present work, these static nonlinear maps are chosen to be simple multidimensional polynomial functions of finite order mainly with the intention of simplifying the resulting parameter estimation problem.. For example, a quadratic polynomial function can be expressed as

Ωi[.] ) C(i)X(i)(k) + (X(i)(k))TD(i)(X(i)(k))

(20)

for i ) 1, ..., m, where C(i) represents a (1 × ni) vector and D(i) represents a ni × ni symmetrical matrix. Thus, with the state output map given by eq 20, the output can be expressed as

yi(k) ) (Θ(i))TZ(i)(k) + ei(k)

(21)

number of basis filters (filter order) necessary to develop a reasonably good approximation of the system dynamics. Given input sequence {u(0), u(1), ..., u(Ns)} and output {yi(0), yi(1), ..., yi(Ns)} data sets, the least-squares estimate of the parameters can be obtained by solving the following minimization problem

(Θ ˆ (i), ξˆ (i)) ) arg min Θ(i),ξ(i)

(22)

(i) (i) 2 T χ(i)(k) ) [(X(i) 1 (k)) 2X1 (k)X2 (k) ...]

(23)

Ns

∑[eˆ i(k, Θ(i), ξ(i))]2

Ns k)1

|ξ(i) j | e 1; i ) 1, 2, ..., ni

(i) T (i) Θ(i)(k) ) [C(i)D(i) 11 D12 ... DNi,Ni]

(24)

Here, Θ(i) is a Ni × 1 vector with Ni ) ni × (ni + 3)/2, X(i) l (k) represents the lth element of vector X(i)(k), and D(i) jl represents the (j, l)th element of matrix D(i). Thus, the main advantage of choosing polynomial functions is that the resulting state output map is linear in parameters. 2.3. Parameter Estimation. The key step in the development of GOBF based models is the selection of filter poles and the

(26)

ξ(i)

where represents a vector of GOBF poles. In principle, an orthonormal basis for H2 can be constructed by selecting GOBF poles in an arbitrary manner. However, using an arbitrary set of basis filters may require a large number of terms in the Fourier series and the resulting model is no longer parsimonious in parameters. For linear model identification, it has been shown that selecting a basis with poles that closely match the dominant poles of the system requires fewer terms in the Fourier expansion.30 In many practical situations, there is some a priori knowledge about the dominant system time constants. If it is desired to develop an OE or NOE type model, then such information can be effectively used to fix the poles of the deterministic component of the model. However, this approach appears infeasible when it is desired to develop a stochastic model together with the deterministic components. Suppose we select the GOBF pole locations by some means; then, the parameter that remains to be selected is the number of basis filters (nij) for each single-input/single-output (SISO) submodel (i.e., model order) of an observer. There are two difficulties with the conventional strategy of first fixing the pole locations and then trying to find suitable filter orders. (i) To get a good fit, we may have to choose large filter orders. This would lead to an increase in variance errors as the number of model parameters to be estimated (which is a polynomial function of nij) becomes large. (ii) Since we are working in a multi-input scenario, choosing optimal filter orders (nij) can result in a combinatorial optimization problem. Thus, to keep variance errors low and data length requirements reasonable, we propose to choose filter orders a priori and perform a search in the set of filter poles as proposed by Patwardhan and Shah.25 The model order can be fixed by fitting a linear state space model to the identification data using a subspace identification method. Given the model order, the parameter estimation problem is formulated as two nested optimization problems as follows:

(Θ ˆ (i), ξˆ (i)) ) arg min ξ(i)

1 Ns

Ns

[eˆ i(k, Θ ˆ (i), ξ(i))]2 ∑ k)1

(27)

subject to

|ξ(i) j | e 1; j ) 1, 2, ..., ni

and

(25)

subject to

where

Z(i)(k) ) [(X(i)(k))T(χ(i)(k))T]T

1

(28)

where, given a guess of pole vector ξ˜ (i), the parameter vector Θ ˆ (i) is estimated by solving another optimization problem

Θ ˆ (i)(ξ˜ (i)) ) arg min Θ(i)

1 Ns

Ns

[eˆ i(k, Θ(i), ξ˜ (i))]2 ∑ k)1

(29)

In particular, when the state output map is linear in parameters, we can exploit the fact that the parameter vector Θ ˆ (i) can be estimated analytically by a simple linear regression scheme. Thus, when the state output map is constructed using

Ind. Eng. Chem. Res., Vol. 45, No. 6, 2006 1993

finite order polynomial functions, given a choice of filter orders ˆ (i) can (nij) and a set of GOBF poles x˜ , the parameter vector Θ be estimated as follows

Θ ˆ (i)(ξ˜ (i)) ) [E h (Z(i)(k)Z(i)(k)T)]-1E h (Z(i)(k)Y(i)) ) (ΨTΨ)-1ΨTY(i) (30) where matrix Ψ and vector Y(i) are defined as,

[ ]

Z(i)(1)T Ψ ) ... Z(i)(k)T

and E h (.) represents the expected value operator. Thus, the parameter estimation problem can be reformulated as a constrained optimization problem

ξ(i)

1 Ns

Ns

[eˆ i(k, Θ ˆ (i)(ξ(i)))]2 ∑ k)1

(31)

subject to the constraints 26 and 30. This leads to a constrained nonlinear optimization problem which can be solved using any standard nonlinear optimization method. The above nonlinear optimization problem has, in general, multiple solutions. In fact, since the GOBF generated using any choice of filter poles within the unit circle forms a orthonormal basis for H2, any local feasible solution is an acceptable solution to the optimization problem. Even if we choose a nonlinear-in-parameter state output map, a similar nested optimization strategy can be employed as the state sequence depends only on the choice of the GOBF filter pole vector (x(i)). Remark 2. Let us assume that the data sequence {yi(k)} is generated by

˜ ))T Z(i)(k) + ei(k) yi(k) ) (Θ(i) T (ξ

(32)

where Θ(i) ˜ ) represents the true Value of the parameter Vector T (ξ Θ(i) and {ei(k)} is a white noise sequence. Under this assumption, Θ ˆ (i) ˜ ) giVen by eq 30 represents an unbiased estimate of T (ξ (i) ΘT (ξ˜ ). 2.4. Combined MIMO Observer. The parameter estimation procedure outlined above generates r observers, which can be further combined to form a MIMO observer of the form

X(k + 1) ) [ΦX(k) + KΩ(X(k))] + Γu(k) + w(k) y(k) ) Ω[X(k)] + e(k)

(33) (34)

where X ∈ RN represents the augmented state space vector with r Ni. Here, w(k) and e(k) are white noise sequences. N ) ∑i)1 The parameter estimation procedure also generates estimates of the innovation sequence {eˆ (k) ∈ Rr: k ) 1, 2, ..., Ns}. This innovation sequence can be used to estimate the covariance matrix of innovation sequence Vˆ e as follows

Vˆ e ) E[eˆ (k)eˆ (k) ] ) T

1

(i) (i) (i) X(i) u (k + 1) ) Φu Xu (k) + Γu u(k)

(37)

yi(k) ) Ωu(i)[X(i) u (k)] + vi(k)

(38)

i ) 1, 2, ..., r

Y(i) ) [yi(1)T yi(2)T ... yi(Ns)T ]T

(Θ ˆ (i), ξˆ (i)) ) arg min

The above information about the state and the residual covariances is important if it is desired to use the model in eqs 33 and 34 to develop an extended Kalman filter while formulating a nonlinear MPC scheme. Remark 3. As indicated earlier, some authors haVe proposed NOE type models of the form15,16

If it is desired to deVelop a model of this form for the deterministic predictions, then, it is possible to deVelop a disturbance model using the residual sequence {v(k)} under the assumption that the effect of unmeasured disturbances on the outputs is additiVe. For example, we can identify either SISO NAR type models of the form (i) (i) (i) X(i) V (k + 1) ) ΦV XV (k) + ΓV vi(k)

(39)

vi(k) ) Ω(i) i [XV(k)] + ei(k)

(40)

i ) 1, 2, ..., r or MISO NAR type models of the form (i) (i) (i) X(i) V (k + 1) ) ΦV XV (k) + ΓV v(k)

(41)

vi(k) ) Ω(i) i [XV(k)] + ei(k)

(42)

i ) 1, 2, ..., r (i) where {ei(k)} is a white noise sequence and (F(i) V , GV ) is parameterized using GOBF as described aboVe. The aboVe NAR models can be rearranged as NARMA models by a simple transformation as follows

(i) (i) X(i) V (k + 1) ) F[XV (k)] + ΓV ei(k)

(43)

vi(k) ) Ω(i) i [XV(k)] + ei(k)

(44)

(i) (i) (i) (i) F[X(i) V (k)] ) [ΦV XV (k) + ΓV ΩV (XV(k))]

(45)

i ) 1, 2, ..., r and combined with the deterministic model to formulate NOE+NARMA models of the form (i) (i) (i) X(i) u (k + 1) ) Φu Xu (k) + Γu u(k)

(46)

(i) (i) X(i) V (k + 1) ) F [XV (k)] + ΓV ei(k)

(47)

(i) (i) yi(k) ) Ω(i) u [Xu (k)] + ΩV [XV(k)] + ei(k)

(48)

i ) 1, 2, ..., r

Ns

∑eˆ (k)eˆ (k)

Ns k)1

T

(35)

3. Model Properties

The estimated innovation covariance can be further used to estimate state noise covariances as follows

3.1. NARX Model with Large but Finite Memory. The linear state dynamics of the observer given by eq 8 can be expressed as

ˆ eKT; E[w(k)e(k)T] ) KVe (36) E[w(k)w(k)T] ) KV

X(k) ) [(qI - Φ)-1Γ]u(k) + [(qI - Φ)-1K]y(k) (49)

1994

Ind. Eng. Chem. Res., Vol. 45, No. 6, 2006

where q represents the forward shift operator. As GOBF are infinite impulse response (IIR) type filters,30 we have ∞

X(k) )



Wu(i)u(k - i) + ∑Wy(i)y(k - i) ∑ i)1 i)1

Wu(i) ) Φi-1Γ; Wy(i) ) Φi-1K

(50)

tions can be established as follows ∞

||X ˆ (k|k - 1)||∞ e ||



Wu(i)u(k - i) + ∑Wy(i)y(k - i)||∞ ∑ i)1 i)1

(59)



(51)

e

[||Wu(i)||∞mu + ||Wy(i)||∞my] ∑ i)1

and the proposed model can be expressed as ∞

y(k) ) Ω[



Wu(i)u(k - i) + ∑Wy(i)y(k - i)] + e(k) ∑ i)1 i)1

(52)

The above representation of the state sequence reveals that the proposed model is an NARX type model with, in principle, infinite memory. However, since F is parametrized through poles strictly inside the unit circle, both Wu(i) and Wy(i) tend to null matrices for large i. The number of significant input and output vectors in the past is determined by the locations of the real parts of the GOBF poles. Thus, if Nu and Ny denote the number of past inputs and output vectors which have significant effects on current state, then we can write Nu

X(k) =

∑ i)1

Ny

Wu(i)u(k - i) +

Nu

y(k|k - 1) = Ω[

∑ i)1

Wy(i)y(k - i) ∑ i)1

(53)

Ny

Wu(i)u(k - i) +

Wy(i)y(k - i)] ∑ i)1

(54)

e M um u + M y m y

(60) (61)

Thus, if the input and the measurement sequences are bounded, the proposed modeling scheme generates a bounded state sequence when used for generating one step ahead predictions. Remark 4. It is also possible to perceiVe a more general obserVer of the form

X(k + 1) ) Φ(k) + ΓΨ[u(k)] + KΛ[y(k)]

(62)

y(k) ) Ω[X(k)] + e(k)

(63)

where Ψ[.]: Rm f Rm and Λ[.]: Rr f Rr are inVertible nonlinear maps and (Φ, Γ, K) are parameterized though GOBF. Note that the fading memory property of the obserVer and boundedness of the estimated state sequence are preserVed in this generalization. 3.3. Steady-State Multiplicities. The steady-state outputs of an observer for fixed values of the manipulated input vector, say u ) us, can be computed by solving the algebraic equation

(Φ - I)Xs + KΩ(Xs) + Γus ) 0h

(64)

ys ) Ω[Xs]

(65)

Thus, the proposed observer belongs to the class of fading memory systems (see to the work of Pearson and Oginnaike12 for the definition of fading memory systems). The main advantage of this property can be found while formulating the parameter estimation problem. To compute the state evolution while estimating Θ ˆ (i) using eq 26, the initial state X(0) has to be specified. If the initial state is estimated together with the parameter vector, then it can result in a nonlinear programming problem. However, since the initial state can be estimated using a finite number of past inputs and outputs, this difficulty can be avoided by starting with a zero initial state and formulating the loss function of the optimization problem (eq 25) starting from k > k0. 3.2. Bounds on Estimated State Sequence. Let us assume that the inputs and output measurements available from a process are bounded, i.e.,

If the structure of the nonlinear map is chosen properly, then multidimensional nonlinear eq 64 can have multiple solutions for a fixed steady-state input us. Thus, the proposed nonlinear observer structure facilitates the modeling of systems exhibiting output multiplicity (multiple output steady states for a fixed input us). Since the nonlinear state output function typically maps from high dimensional state space to output space of lower dimensions (Ω: Rn f Rr), the proposed model structure is also capable of capturing input multiplicity behavior (multiple steadystate inputs giving rise to an identical steady-state output). On the other hand, the steady-state behavior of the NOE type block oriented nonlinear models proposed in the literature (given by eq 13-14) can be obtained by solving

||u(k)||∞ e mu; ||y(k)||∞ e my

(Φ - I)Xs + Γus ) 0h

(66)

for all k g 0. Since poles of F are strictly inside the unit circle, it follows that

ys ) Ω[Xs]

(67)

Wu(i) f [0] and Wy(i) f [0] as i f ∞

From the above expressions, it is easy to see that these NOE models can capture input multiplicity behavior but cannot model output multiplicity behavior. Thus, unlike the other GOBF based nonlinear models available in the literature, the proposed model structure facilitates the modeling of input as well as output multiplicity behavior. 3.4. Local Dynamic Behavior and Infinite Horizon Predictions. Consider the deterministic model derived from eqs 3334

(55)

(56)

where [0] represents a null matrix. As a consequence,

||Wu(i)|| f 0 and ||Wy(i)|| f 0 as i f ∞

(57)

and it can be shown that ∞

∑ i)1



||Wu(i)|| e Mu;

||Wy(i)|| e My ∑ i)1

(58)

Using above inequality, the bounds on one step ahead predic-

X h (k + 1) ) [ΦX h (k) + KΩ(X h (k))] + Γu(k)

(68)

yj(k) ) Ω[X h (k)]; X h (0) ) E[X(0)]

(69)

Ind. Eng. Chem. Res., Vol. 45, No. 6, 2006 1995

where X h (k) ) E[X(k)]. Let (Xs, us) represent a steady state of this mode. Then, the local dynamic behavior of the model in the neighborhood of steady state (Xs, us) is governed by eigenvalues of matrix Φs where

Φs ) Φ + K

[∂Ω ∂X ]

X)Xs

(70)

Further, following Economou,31 the region of attraction for the unforced system

X h (k + 1) ) [ΦX h (k) + KΩ(X h (k))] + Γus

(71)

h (k) such that can be characterized as the set Π(Xs) of all X

||

Φ+K

[∂Ω ∂X ]

X)X h (k)

||