Nonlinear Predictive Control of Systems Exhibiting Input Multiplicities

3186

Ind. Eng. Chem. Res. 2002, 41, 3186-3198

Nonlinear Predictive Control of Systems Exhibiting Input Multiplicities Using the Multimodel Approach K. Kishore Kumar and Sachin C. Patwardhan*,† Department of Chemical Engineering, Indian Institute of Technology, Madras, Chennai 600 036, India

This work establishes the feasibility of using a multilayer perceptron for the development of a multimodel that combines structurally simple local models developed in different operating regions. The local models either are obtained by linearizing a first principles model or are identified from input-output data using a linear combination of Laguerre filters. In particular, it is shown that the proposed multimodels can capture dynamic and steady-state characteristics of a continuous fermenter, which exhibit input multiplicities and change in the sign of steadystate gains, fairly accurately over a wide operating range. The proposed multimodel is further used to develop a nonlinear model predictive control (NMPC) scheme. The effectiveness of the NMPC scheme is demonstrated by simulating a servo problem that requires the fermenter to be controlled at its optimum operating point, which happens to be singular points where the invertibility is lost. The proposed NMPC scheme is found to achieve a smooth transition for large-magnitude setpoint changes and control the systems at the singular operating point even in the presence of measurement noise. The NMPC scheme is also found to be robust to moderate variations in system parameters. 1. Introduction Linear model predictive control (MPC) schemes are now widely used in the process industry for control of key unit operations.1 However, the use of linear dynamic models for prediction limits their applicability to a narrow range of operation or to systems which exhibit mildly nonlinear dynamics. Many key unit operations in chemical plants, such as high-purity distillation columns, reactors, and biochemical reactors, exhibit highly nonlinear dynamic behavior. The need to achieve a tighter control of strongly nonlinear processes has led to a more general MPC formulation whereby nonlinear dynamic models are used for prediction.1-5 Selection of a suitable form of a nonlinear model to represent system dynamics is a crucial step in the development of a nonlinear MPC (NMPC) scheme. The NMPC schemes proposed in the literature use models developed from either first principles or models identified from input-output data. The first principle models are valid globally and can predict system dynamics over the entire operating range. However, development of a reliable first principle model is, in general, a difficult and time-consuming task. The nonlinear black box models, on the other hand, have certain advantages over the mechanistic models in terms of development time and efforts. The major shortcoming of these models is their local validity pertaining to the range covered by the training data set. Also, for a nonlinear black box model, determination of the appropriate model structure that can capture system dynamics over the entire operating regime is not an easy task. Thus, it is necessary to evolve a scheme for the development of a * To whom all correspondence should be addressed. Email: [email protected]. † Present address: Department of Chemical Engineering, Indian Institute of Technology, Bombay, Powai, Mumbai 400 076, India.

black box model in which the model structure can be selected relatively easily and the resulting model is valid over a wide operating range. Most of the NMPC schemes based on first principle models are straightforward (but computationally expensive) extensions of linear MPC, whereby local linear approximations of the first principle model are generated at each sampling instant and used for future prediction.6-8 In practice, even for a highly nonlinear system, if the linear models developed at nearby points are examined, then it can be observed that the model parameters do not change appreciably. Thus, repeated or successive linearization within this region may not give a significant advantage in terms of prediction accuracy. If this fact can be exploited by a modeling scheme effectively, then the computational complexity associated with successive linearization based NMPC schemes can be reduced considerably. Use of multiple linear models or multiple linear controllers to control a nonlinear system is gaining importance in control literature in recent years.9-13 By this approach, the entire operating regime is divided into multiple small overlapping operating regimes and a local model is developed in each regime. These local models either can be combined to formulate a global nonlinear model for the system or can be used to develop a local controller for each region. These local controllers are then combined to generate a nonlinear controller capable of operating the nonlinear system in the entire operating region. Main advantages of this approach are as follows:12 (i) A relatively simple model structure (linear/bilinear/ quadratic) can be selected in each operating region. This alleviates the difficulties associated with model structure selection while developing a nonlinear black box model. (ii) The rich and well-developed linear control theory can be used for local controller design.

10.1021/ie001049g CCC: $22.00 © 2002 American Chemical Society Published on Web 05/30/2002

Ind. Eng. Chem. Res., Vol. 41, No. 13, 2002 3187

(iii) Qualitative, vague, and imprecise system knowledge can be more easily incorporated into the model or controller. With these attractive features, multimodels appear to be ideally suited for formulation of a nonlinear predictive control scheme. The present work is aimed at exploring the possibility of using the multiple-model approach to modeling and control of highly nonlinear processes exhibiting input multiplicities. Specifically, it is desired to use the multimodel approach to overcome limitations associated with linearized first principle models and conventional black box models. We propose to develop a multimodel by combining structurally simple linear or nonlinear local models using an artificial neural network (ANN). The local models in individual operating regions are obtained by linearization of a nonlinear first principle model or developed directly from input-output data. The resulting multimodel is expected to have a much larger region of validity than the individual submodels. The proposed form of the multimodel is then used to develop a NMPC scheme. The effectiveness of this modeling and control strategy is demonstrated by simulating a benchmark nonlinear control problem involving control of a continuous fermenter that exhibits input multiplicities. A phenomenon typically associated with systems exhibiting input multiplicities is the change in the sign of the steady-state gain in the operating region. A smooth change in the sign of the steady-state gain implies that the steadystate gain reduces to zero at some singular points in the operating region. The existence of such singular points renders the task of controlling a system exhibiting input multiplicities extremely difficult.14 We specifically deal with the problem of controlling the fermenter at a singular operating point, which happens to be the optimum operating point of the system. This paper is organized in five sections. The development of the multimodels and their properties are presented in the next section. In the section that follows, a NMPC scheme is developed using the proposed multimodels. Simulation studies carried out using a continuous fermenter system are presented in the subsequent section. The main conclusions reached from the simulation studies are discussed in the last section. 2. Development of a Multimodel In the present study, we restrict the scope of investigations only to lumped parameter nonlinear systems which are open-loop stable. The development of a multimodel for a lumped parameter nonlinear system consists of the following steps: (i) Selection of sub-regions: dividing the operating region into sub-regions such that the system exhibits more or less uniform behavior within each sub-region. (ii) Development of local models: development of a structurally simple dynamic model within each region to capture local dynamics. (iii) Development of a multimodel: combining the models from sub-regions into a multimodel which is valid over the entire operating range covered by the subregions. The operating region is divided into overlapping subregions based on some characterization variable(s). The choice of characterization variables is system-dependent and can include manipulated inputs, measured outputs, disturbances, or some auxiliary variables.12 In the

present work, it is assumed that the identification of operating sub-regions can be carried out on the basis of physical or engineering insight into a specific problem. The remaining two steps in multimodel formulation are dealt with in detail in the following two subsections. 2.1. Development of Local Models. Let us assume that the operating region is divided into NR overlapping sub-regions based on some characterization variable(s). In each sub-region, it is desired to develop a structurally simple discrete model of the form of

xi(k+1) ) Ψi[xi(k),u(k)] yˆ i(k) ) Πi[xi(k),u(k)]

for i ) 1, 2, ..., NR

(1) (2)

where xi ∈ Rni, u ∈ Rm, and yˆ i ∈ Rr represents a state variable vector, input vector, and model output vector in the ith region, respectively. In general, a highly nonlinear system can exhibit drastically different local dynamic characteristics from one sub-region to another. Consequently, different sub-regions may require different types of approximations for satisfactorily capturing the local dynamics. Several structurally simple forms can be considered for capturing the local dynamics in a sub-region, such as the following: (i) Perturbation Models from a First Principle Model. Starting from a nonlinear first principle model of the form

dx ) F(x,u); x(0) ) x0 dt

(3)

yˆ ) G(x,u)

(4)

linear perturbation models of the form

δxi(k+1) ) Φi δxi(k) + Γi δu(k)

(5)

j i + Ci δxi(k) + Di δu(k) yˆ i(k) ) y

(6)

can be developed. Here, δxi(k) ∈ Rn and δui(k) represents state and input and output perturbations from the j i, y j i) for the ith region, nominal steady state (x j i, u respectively, while yˆ i(k) represents output of the ith output. If such a simple linear form is found to be inadequate for capturing the local dynamics, discrete quadratic perturbation models proposed by Patwardhan and Madhavan15 can be used. (ii) Simple Linear/Nonlinear Black Box Models Developed from Input-Output Data. If the local dynamics in a sub-region is linear, then a transfer function model can be developed to capture the local dynamics. For the purpose of multimodel development, it is convenient to work with a state-space realization of the forms given in eqs 5 and 6 for the transfer function model. Alternatively, simple nonlinear forms such as Weiner- or Hammarstein-type nonlinear models

xi(k+1) ) Φixi(k) + ΓiΛ[u(k)]

(7)

yˆ i(k) ) Ξ[xi(k)]

(8)

developed from input-output data can be considered. Here Ξ(‚) and Λ(‚) represent simple nonlinear functions, such as quadratic or higher order polynomials, of their respective arguments.

3188

Ind. Eng. Chem. Res., Vol. 41, No. 13, 2002

One of the linear black box modeling techniques that has been receiving increasing attention in the past decade is the use of orthonormal filter networks such as Laguerre and Kautz filters.16,17 The orthonormal filter approximations provide a simple and elegant method of representing open-loop stable systems. Advantages of using such orthonormal filters for process modeling can be summarized as follows: (i) A good approximation can be obtained with a small number of terms due to the orthogonal property. (ii) The development of orthonormal filter models does not need any explicit knowledge about the system time constant and time delay. In the present work, we propose to use the Laguerre filter based models for developing local linear models. Thus, state-space realization of a multiple-input, multiple-output (MIMO) linear Laguerre model of the form

xi(k+1) ) Φ(ai) xi(k) + Γ(ai) u(k) yˆ i(k) ) C0,i + Cixi(k)

for i ) 1, 2, ..., r

Thus, local models are combined using an MLP, which consists of an input layer with r × NR input nodes, one hidden layer with Nh nodes, and an output layer with r nodes. The number of nodes in the hidden layer can be suitably selected to achieve the desired accuracy of interpolation. Let

η(k) ) [yˆ 1T(k) yˆ 2T(k) ... yˆ NRT(k)]T

denote the (rNR × 1)-dimensional network input vector. Then, the combined output of the multimodel can be written as follows:21

v(k) ) W1h[η(k)] - r qi(k) )

1

i ) 1, 2, ..., Nh

1 + e-vi(k)

(9) (10)

can be developed for each region where ai represents the set of Laguerre filter parameters and xi ∈ Rni represents the state vector defined for the ith region18 (also see the appendix for details). 2.2. Development of a Multimodel. After local models are developed within each operating regime, the next step is to combine the submodels, i.e., to decide the policy of when and how to switch or interpolate between local models. For a general nonlinear system, it is difficult to define hard or mutually exclusive partitions where only a given local model is valid. A more realistic approach is to define the operating regimes as overlapping sets and implement a smooth transition between them. The global multimodel is now an interpolation of the local approximations where smooth and normalized weighting functions10,12 or fuzzy membership functions9 are used to provide soft transitions between the operating regimes. An alternative to defining deterministic operating regimes is to use statistical methods to infer which operating regime is most appropriate at each time instant.13,19 Recently, Galan et al.20 have proposed to use a global filter that coordinates the efforts of local linear controllers, which are designed based on H∞ control theory. The weighting functions or fuzzy membership function based approaches appear more promising for developing a predictive control scheme. However, finding such interpolating functions or fuzzy membership functions for a MIMO nonlinear system is not an easy task. To alleviate this difficulty, it is proposed to combine these individual models into a global model using a multilayer perceptron (MLP) in the present work. An ANN of this type has an inherent capability of approximating any arbitrary function with adequate accuracy by the proper choice of its weights and biases.21 Unlike other approaches, where an explicit functional form for interpolating functions has to be selected, the structure of MLP (barrier function and number of neurons in the hidden layer) can be selected relatively easily. The MLP can also be trained relatively easily to perform interpolation between the submodels using well-established methods such as the back-propagation algorithm.

(11)

yˆ i(k) ) gi

(12) (13)

z(k) ) W2q(k) - β

(14)

[

(15)

1

1 + e-zi(k)

]

i ) 1, 2, ..., r

Here, W1 and W2 represent (Nh × rNR)- and (r × Nh)dimensional weighting matrices, respectively, r and β represent the associated Nh × 1 and r × 1 bias vectors, respectively, h(‚) and gi(‚) represent the functions used for scaling network input vector η(k) and the network output yˆ i(k), respectively. The network weights and biases can be computed using the standard backpropagation algorithm. In the rest of the text, for the sake of convenience, the multimodel developed by combining linearized first principle models will be referred to as the linearized multimodel, while the multimodel developed by combining linear Laguerre models will be referred to as the Laguerre multimodel. 2.3. Properties of a Multimodel. The multimodel developed from linear submodels has following properties: (i) Weiner Structure. The individual linear models for each region can be further combined to form a global state-space model such as

X(k+1) ) ΦX(k) + Γu(k)

(16)

yˆ (k) ) Ω[X(k)]

(17)

where

X(k+1) ) [x1(k+1)T x2(k+1)T ... xNR(k+1)T]T Ω(‚) represents the ANN state-output map together with state-output maps within the individual regions. Here the matrices Φ and Γ are given as

Φ ) blockdiag[Φ1 Φ2 ... ΦM] Γ ) [Γ1T Γ2T ... ΓMT]T From this representation, it is clear that the proposed multimodel has a Weiner-type structure. This also implies that the proposed multimodel can model only a limited class of nonlinear systems.


(ii) Output Error Structure. The multimodel in eqs 16 and 17 can be expressed as

X(k+1) ) ΦX(k) + Γu(k)

(18)

y(k) ) yˆ (k) + e(k) ) Ω[X(k)] + e(k)

(19)

where y(k) represents the vector of measured outputs and e(k) is a zero mean random sequence representing the measurement noise vector. Clearly, the proposed multimodel has an output error structure. (iii) Steady-State Characteristics. Unlike a general NARX model, it is comparatively easy to examine the steady-state behavior of the proposed multimodel given in eqs 16 and 17. For any given steady-state input, say us, if the spectral radius of matrix Φ is less than 1, the corresponding steady-state vector xs can be computed as

xs ) (I - Φ)-1Γus

(20)

prediction. Given a sequence of future control moves, i.e., {u(k/k), u(k+1/k), ..., u(k+M-1/k)}, a “p step ahead” output prediction can be written as follows:

xi(k+j+1/k) ) Φixi(k+j/k) + Γiu(k+j/k) for j ) 1, ..., M - 1 and for i ) 1, 2, ..., NR (22) xi(k+j+1/k) ) Φixi(k+j/k) + Γiu(k+M-1/k) for j ) M, ..., p - 1 and for i ) 1, 2, ..., NR (23) yˆ (k+j/k) ) Ω[xˆ 1(k+j/k), xˆ 2(k+j/k), ..., xˆ NR(k+j/k)] for j ) 1, ..., p (24) where Ω(‚) represents the ANN state-output map. To account for plant-model mismatch and unmeasured disturbances, a simple unmeasured disturbance estimator similar to the linear dynamic matrix control scheme is incorporated as follows:

and the corresponding steady-state output becomes

ys ) Ω(xs)

(21)

(iv) Bounded Input, Bounded Output (BIBO) Stability. If the spectral radius Φ in eqs 16 and 17 is strictly less than 1 and if the state-output map for each output, i.e., Ω(‚): RrNR f Rr is a bounded function (a function Ω(‚) is defined as a bounded function if |Ω(‚)| e My < ∞ on every bounded subset of the RrNR), then it can be argued that the proposed multimodel is globally BIBO stable. This follows from the fact that the linear part is BIBO stable if the spectral radius of Φ is strictly less than 1. Thus, for any bounded input sequence such that |u(k)|∞ e Mu < ∞, the state sequence |x(k)|∞ e Mx < ∞ will be bounded. If Ω(‚) is a bounded function, then it follows that the corresponding output sequence will be bounded, i.e., |y(k)| e My < ∞. In the present work, it is proposed to use a multilayer perceptron-type ANN for capturing the state-output map. It can be argued that the output of such an ANN is always bounded because the output of the sigmoidal function in the output layer is bounded between 0 and 1. Also, if Laguerre filters are used for developing the linear dynamic part, the Φ matrices in the state-space realizations (9) and (10) will always have their corresponding spectral radii less than 1. Note that, for a general NARX model, it is difficult to arrive at a similar general conclusion regarding the BIBO stability. 3. NMPC Formulation Using a Multimodel In a typical MPC formulation, at every sampling instant an explicit dynamic model is used for predicting the future behavior of the plant over a finite number of future time steps, say p, which is called the prediction horizon. A set of M future manipulated input moves {u(k/k), u(k+1/k), ..., u(k+M-1/k)} (M is called the control horizon) are determined by optimization with the objective of minimizing the sum of the squares of the future prediction errors. The controller is implemented in a moving horizon framework; i.e., only u(k/ k) is implemented at each sampling instant, and the optimization is repeated at each sampling instant based on the updated information from the plant. The discrete multimodels developed in the previous section can be used recursively to obtain multistep

yc(k+j/k) ) yˆ (k+j/k) + d(k+j/k)

(25)

d(k+j/k) ) d(k/k) ) y(k) - yˆ (k)

(26)

where

for j ) 1, 2, ..., p. Here y(k) represents the measured output at the kth instant and yˆ (k) represents the model output at the kth instant. Given a future setpoint trajectory {yr(k+j/k); j ) 1, 2, ..., p}, the controller design problem can be formulated as P

[E(k+j/k)]TWE[E(k+j/k)] + ∑ j)1

{

min u(k/k)...u(k+M-1/k)

M

[∆u(k+j/k)]TWU[∆u(k+j/k)]} ∑ j)1 where

E(k+j) ) yr(k+j/k) - yc(k+j/k) ∆u(k+j) ) u(k+j/k) - u(k+j-1/k) subject to

uL e u(k+j/k) e uU

for j ) 0, ..., M - 1 (input limits)

∆u(k+M/k) ) ∆u(k+M+1/k) ) ... ) ∆u(k+p-1/k) ) 0 h Here, WE represents the error weighting matrix, while WU represents the input weighting matrix. The above formulation closely corresponds to the NMPC formulation used in the industry.5 Note that the open-loop BIBO stability characteristics of the multimodel is of vital importance while formulating the above MPC scheme. The NMPC formulation and unmeasured disturbance prediction scheme uses the multimodel as an open-loop observer. This requires the multimodel to be BIBO stable, and consequently the proposed NMPC scheme can be used for controlling systems which are open-loop stable. The resulting nonlinear programming problem

3190


can be solved using any standard optimization technique such as successive quadratic programming. 4. Simulation Studies In this section we demonstrate that the proposed modeling and control scheme can be effectively used to deal with a benchmark control problem involving control of a continuous fermenter at a singular operating point. This system is a typical example of the class of systems exhibiting input multiplicities and changes in the sign of the steady-state gain in the operating region. It should be noted that, in the simulation studies presented in this section, the neural networks have been trained using the Neural Networks Toolbox of MATLAB, while the nonlinear optimization problems in the Laguerre model identification and NMPC formulation have been solved using the MATLAB Optimization Toolbox. 4.1. Control of Systems Exhibiting Input Multiplicities. Many chemical processes, such as reactors, distillations columns, and continuous fermenters, are known to exhibit input multiplicities; i.e., more than one set of manipulated variables can produce the identical steady-state outputs. Input multiplicities can exist when the steady-state gain matrix becomes singular in the operating region.22 The simplest examples of the systems exhibiting input multiplicity are processes that exhibit a steady-state maxima (or minima) in their outputs.14 The chemical processes are often required to operate at the optimal value of a physical process variable. Note that the conditions for the existence of input multiplicity get automatically satisfied at such an unconstrained optimum. Though control at the optimum operating point appears to be economically the most desirable option, the resulting control problem is difficult to handle because there exists a singularity at the optimum operating point. From a system theoretical viewpoint, the relative degree of the system becomes undefined and the invertibility is lost at the singular point.14 There are number of difficulties associated with the control of systems exhibiting input multiplicities. Morari24 has observed that input multiplicities of nonlinear systems usually cause robustness problems which cannot be eliminated using linear controllers with integral action. Sistu and Bequette25 have shown that in the presence of input multiplicities the linearized model of the nonlinear system exhibits nonminimum phase behavior and the control problem becomes difficult to handle because of the unstable dynamics of the model inverse. The phenomena of the change in the sign of the steady-state gain and the loss of invertibility pose difficulties even for nonlinear control strategies. Biegler and Rawlings2 observe that use of an unconstrained nonlinear controller may not be sufficient in such cases, even when the model is perfect. In the case of almost all of the nonlinear control strategies based on exact linearization, an assumption, often not stated explicitly, is that the steady-state gain of the model cannot change sign in the operating region.26 In fact, global inputoutput linearization based approaches cannot be directlyapplied in the regions of state space where the relative degree is not well defined and invertibility is lost.27,28 Thus, control of a system exhibiting input multiplicities at its singular point offers a challenging control problem.

Table 1. Continuous Fermenter: Nominal Model Parameters param

nominal value

param

nominal value

YX/S a b µm

0.4 g/g 2.2 g/g 0.2 h-1 0.48 h-1

Pm Km Ki

50 g/L 1.2 g/L 22 g/L

4.2. Control of a Continuous Fermenter. Consider a continuously operated fermenter described by the following set of equations:3,29

dX ) -DX + µX dt

(27)

1 dS µX ) D(Sf - S) dt YX/S

(28)

dP ) -DP + (Rµ + β)X dt

(29)

where X represents the biomass concentration, S represents the substrate concentration, and P denotes the product concentration. It is assumed that the product concentration (S) and the biomass concentration (X) are measured process outputs while the dilution rate (D) and the feed the substrate concentration (Sf) are manipulated inputs. Model parameter µ represents the specific growth rate, YX/S represents the cell mass yield, and R and β are the yield parameters for the product. The specific growth rate model is allowed to exhibit both substrate and product inhibition:

(

µm 1 µ)

)

P S Pm

Km + S +

S2 Ki

where µm represents the maximum specific growth rate, Pm represents the product saturation constant, Km represents the substrate saturation constant, and Ki represents the substrate inhibition constant. The nominal values of model parameters are listed in Table 1. The optimum steady-state operating conditions for the fermenter that maximize the productivity (Q ) DP) are

S* ) xKmKi ) 5.138 g/L

and P* )

Pm ) 25.0 g/L 2

The optimum is reached when manipulated inputs are

D h ) D* ) 0.164 h-1

and

S h f ) Sf* ) 23.4 g/L

The following figures present the steady-state behavior of the system in the neighborhood of the optimum: (i) Figure 1 a: biomass concentration (X) vs dilution rate (D) (at Sf ) 23.4). (ii) Figure 1b: product concentration (P) vs dilution rate (D) (at Sf ) 23.4). (iii) Figure 2 a: biomass concentration (X) vs feed substrate concentration (Sf) (at D ) 0.164). (iv) Figure 2b: product concentration (P) vs feed substrate concentration (Sf) (at D ) 0.164). As can be observed from these figures, both the biomass concentration (X) and the product concentration (P) exhibit a maximum and a change in the sign of the


Figure 1. Continuous fermenter: comparison of the process and model steady-state behavior with respect to the dilution rate.

steady-state gain with respect to the input Sf. The behavior of the biomass concentration (X) with respect to the dilution rate is also similar. Thus, the problems associated with singularity can arise if it is desired to control the cell concentration (X) and product concentration (P) at the optimum operating point using the feed substrate concentration (Sf) for manipulation. 4.2.1. Development of Multimodels. Using both the manipulated inputs and steady-state gains as characterization variables, the entire operating locus was divided into nine operating regions and a nominal steady-state operating point was selected in each region (see Table 2). The sampling interval was chosen as 0.1 h. It was also assumed that the measurements obtained from the system are corrupted with noise. To simulate the effect of measurement noise, zero mean normally distributed random variables with standard deviations equal to 0.1 and 0.3 were added to measurements of the cell mass concentration (X) and product concentration (P), respectively. For the development of a linearized multimodel, the nonlinear system equations (27)-(29) were linearized around the operating point in each region and represented in the state-space form. Analysis of these models revealed that, apart from the changes in the signs of the steady-state gains (see Table 2), some of the transmission zeros which are found in the left-hand plane in one region move to the right-hand plane in another region. For the purposes of development of a global multimodel, low-frequency random input signals covering all of the regions were introduced simultaneously in both the manipulated variables, and the responses of the two output variables were recorded. A

Figure 2. Continuous fermenter: comparison of the process and model steady-state behavior with respect to the feed substrate concentration. Table 2. Fermenter Operating Points and Steady-State Gain Matrices region

operating point (D, Sf)

1 2 3 4 5 6 7 8 9

(0.14, 19.4) (0.14, 23.4) (0.14, 27.4) (0.16, 19.4) (0.16, 23.4) (0.16, 27.4) (0.18, 19.4) (0.18, 23.4) (0.18, 27.4)

X-D

P-D

-0.0859 -0.9820 -0.1950 -1.4531 -0.2454 -1.6155 -0.1213 -1.0091 -0.2851 -1.6470 -0.4854 -2.3084 -0.1955 -1.1899 -0.5825 -2.5893 -5.2547 -19.2588

X - Sf

P - Sf

0.2850 1.0241 0.0922 0.3312 -0.1716 -0.6166 0.2601 0.9345 0.0004 0.0014 -0.3778 -1.3575 0.2119 0.7613 -0.2532 -0.9098 -3.9093 -14.0475

total of 3600 points (equivalent to 15 days of operation) were collected during an identification exercise. Based on the bounds on input perturbations introduced for developing the multimodels, the training region for the multimodel can be loosely defined as a set of points in state space generated by input sequences confined to the following region in the input space

0.12 e D e 0.2

and

16 e Sf e 30

During development of the MLP using all nine submodels, it was found that the resulting multimodel is unable to capture the dynamic characteristics properly. The reason for this was traced to the properties of the linear system in region 9. It can be observed from Table 2 that the gain matrix for this submodel differs significantly from the gain matrix of neighboring regions and individual gain elements are comparatively very high. Consequently, the dynamic behavior predicted by this model far exceeds the normal operating region and, in

3192


Figure 3. Multimodel validation: comparison of model outputs and noise-free components of process outputs.

Figure 4. Multimodel validation: manipulated input perturbations.

effect, deteriorates the effectiveness of the resulting multimodel. Much better identification and validation results were obtained when the ninth region was excluded, and the multimodel was developed using the remaining eight regions. The MLPs used for multimodel development consisted of an input layer with 8 × 2 input nodes, a hidden layer with 30 nodes, and an output layer with 2 output nodes. The results of model validation for a linearized multimodel, generated using an independent data set covering all of the regions, are compared with the noise-free components of the measured outputs in Figure 3, and the corresponding input sequences are given in Figure 4. As is evident from Figures 3 and 4, the linearized multimodel is able to predict the nonlinear system behavior quite accurately over a wide operating range. A comparison of the steady-state behavior of the process and the linearized multimodel is presented in Figures 1 and 2. The linearized multimodel is able to capture the steady-state behavior reasonably well only in the training region. To develop the Laguerre multimodel, the fermenter was perturbed at each operating point by simultaneously introducing low-frequency random magnitude fluctuations with standard deviations

Table 3. Intitial and Optimum Operating Conditions for the Fermenter

σD ) 0.0075

and

σSf ) 0.75

in both the inputs and 1500 data points were collected. Note that, because the system under consideration is highly nonlinear, it is important to introduce multilevel perturbations even for linear model development so that the local model can capture the average behavior around the operating point. Based on the input perturbations

X S P D Sf

initial conditions

optimum conditions

setpoint

4.3866 1.0335 15.499 0.15 12.0

7.3044 5.138 25 0.1636 23.4

7.3044 25

introduced, the region of validity of each submodel can be loosely defined as the set of points in state space generated by input sequences confined to the following region in the input space:

D h i - 0.015 e D hi e D h i + 0.015 and S h f,i - 1.5 e S h f,i e S h f,i + 1.5 h f,i represent steady-state inputs in each where D h i and S region. In each region, two multiple-input, single-output (MISO) linear Laguerre models were identified using the nonlinear optimization procedure given in Appendix A. Three basis filters were used for capturing dynamics relating each input-output pair. The optimum values of filter time-scale parameters for each submodel are given in Table 4. Note that, for the ease of interpretation, the Laguerre filter parameters ai are reported in terms of equivalent continuous time constant τi where

ai ) exp(-T/τi) Variations in dominant time constants of the system with changes in the operating point are reflected in significant variation of Laguerre filter parameters from one region to another. Figures 5 and 6 present the results of a typical identification exercise (for region 2).


Figure 5. Identification of the local linear model for region 2: comparison of model outputs and noise-free components of process outputs.

Figure 6. Identification of the local linear model in region 2: input perturbations.

Table 4. Time-Scale Parameters (h) of Local Laguerre Models

a local linear model developed based on linearization at one operating point. This reflects in better extrapolation capabilities of the Laguerre multimodel. Note that, because the measurements are corrupted with only additive white noise and the proposed multimodel has an output error structure, the multimodels developed with a relatively small data size were found to generate reasonably accurate predictions of the process behavior in the training region. 4.2.2. Closed-Loop Studies. The MIMO servo control problem is formulated similar to that of Patwardhan and Madhavan3 where it is desired to shift the fermenter operation from a given suboptimal initial steady state to the nominal optimum operating point (see Table 3). In this case both the product concentration and the cell mass concentration are the controlled and measured process outputs, while the dilution rate and the feed substrate concentration are available for manipulation. During closed-loop operation, the following constraints are imposed on manipulated inputs:

region

X-D

X - Sf

P-D

P - Sf

1 2 3 4 5 6 7 8

4.52 6.75 10.45 5.19 7.99 15.29 6.54 4.14

3.80 2.73 6.39 3.34 11.41 7.60 3.04 9.21

4.25 7.72 12.24 5.47 9.10 17.7 6.98 5.05

5.48 3.32 7.45 4.45 11.33 8.61 3.73 9.41

The noise-free component of the process output is compared with the linear Laguerre model response in Figure 5, while the corresponding input perturbations are presented in Figure 6. Because the input perturbations are small, the outputs remain bounded within a narrow region around the operating point of region 2 and a linear model is adequate to capture the local dynamics. Taking clues from the identification exercise for the linearized multimodel, only the first eight regions were considered for model development and the resulting linear models were combined to generate the Laguerre multimodel. The training and validation data sets and ANN structure used for the development of the Laguerre multimodel were identical to that of the linearized multimodel. The results of model validation are presented in Figures 1-4. The Laguerre multimodel not only captures global dynamics well but also is able to extrapolate the steady-state behavior outside the training region. For a highly nonlinear system, a linear model identified for data may provide a better approximation of the average local dynamic behavior in a region than

0 e D e 0.30

and

10 e Sf e 40

A prediction horizon of P ) 150 and a control horizon M ) 1 have been used for tuning the NMPC controller. The prediction horizon was selected to be relatively large ()15 h) because the system has a large settling time (≈40 h) when compared to the sampling interval (T ) 0.1 h). The error weighting and controller weighting matrices used in the NMPC formulation are as follows:

WE ) I

and

WU ) [0]

3194


Figure 7. Comparison of closed-loop output responses for the servo problem.

Figure 8. Comparison of closed-loop manipulated input responses for the servo problem.

Note that identical controller tuning parameters were for developing NMPC schemes based on the linearized and Laguerre multimodels. Figures 7 and 8 compare the closed-loop performances of the NMPC schemes based on the linearized and Laguerre multimodels. Both of the NMPC schemes are able to achieve a smooth transition to the peak operating point in the presence of measurement noise, and their performances are comparable. The corresponding manipulated input profiles are presented in Figure 8. The fact that the steady-state gain reduces to zero at the singular operating point can lead to input fluctuations as the gain of the model inverse based control becomes large in the neighborhood of this point. However, the ability of both of the multimodels to predict the system dynamics close to the singular points accurately ensures that the input fluctuations are not large even when the steady-state gain to close to zero and measurements are corrupted with noise. The average computation time for control law computations together with their respective standard deviations based on a 866 MHz Pentium based personal computer are as follows: NMPC

linearized multimodel

Lageurre multimodel

avgcomp.time(s)

3.2(0.93)

5.83(1.52)

In the presence of plant-model mismatch, the nonlinear systems with input multiplicities and a maximum in the output exhibit a peculiar behavior. Figure 9 presents the effect of typical parameter perturbation (in µm) on the steady-state behavior of the system. The steadystate behaviors for perturbations in other model pa-

Figure 9. Continuous fermenter: comparison of the steady-state behavior in the presence of parameter perturbations.


Figure 10. Performance of NMPC (linearized multimodel) in the presence of parameter perturbations: output profiles.

rameters, namely, YX/S, ki, and km, are qualitatively similar. From the analysis of the relative locations of the peaks with respect to the nominal optimum point (setpoint), the following two types of control problems can be identified.3,14 Suboptimal Operation. In this case, the optimum operating point shifts to a higher value than the nominal value. In such a situation, operation at the setpoint based on the nominal parameter value will become suboptimal. Unattainable Setpoint. In this case, the optimum operating point shifts to a lower value than the nominal value. In such a situation, the setpoint based on the nominal parameter value will become unattainable. The situation leading to infeasibility of the setpoint is potentially more dangerous because the final steadystate offset cannot be eliminated and does not change its sign. In the presence of such a persistent error, a controller with integral action can cause instability in the unconstrained case or input saturation and loss of control in the constrained case. Because the closed-loop behaviors under these two situations are qualitatively similar for different parameter perturbations, only the results for the case when perturbation in µm is introduced simultaneously with the setpoint change are reported. Figures 10 and 11 present output and manipulated input responses obtained using NMPC based on the linearized multimodel, respectively, while Figures 12 and 13 present output and manipulated input responses obtained using NMPC based on the Laguerre multimodel, respectively. A parameter value of µm ) 0.44 renders the specified setpoint unattainable and results in a steady-state offset

Figure 11. Performance of NMPC (linearized multimodel) in the presence of parameter perturbations: manipulated input profiles.

that cannot be eliminated. Even when there is an offset that cannot be eliminated, both of the controllers produce acceptable closed-loop behavior without causing input saturation or instability due to integral windup. This can be attributed to the fact that the NMPC algorithm attempts to minimize the difference between the setpoint and process output and moves the plant closest to the setpoint, if the setpoint becomes unattainable. Note that, because of the least-squares formulation of the NMPC problem, the resulting final steady state is the point closest to the nominal setpoint and not the peak operating point under parameter perturbation. For the case when µm ) 0.52, the specified setpoint becomes suboptimal and is attained by both of the controllers without any difficulty. However, in both cases the final steady state is different from maximum attainable product concentrations under the perturbed conditions. Thus, the controller performance is sacrificed at the cost of maintaining stability in the face of parameter perturbations. 5. Conclusions This work establishes feasibility of using a MLP for the development of a multimodel that combines structurally simple local models developed in different operating regions. The local models either are obtained by linearizing a first principle model or are identified from input-output data using a linear combination of Laguerre filters. The resulting multimodel has a larger range of validity and can be used to develop a model based controller that can operate the process over a wide operating range. In particular, it is shown that the proposed multimodels can capture dynamic and steady-

3196


Figure 12. Performance of NMPC (Laguerre multimodel) in the presence of parameter perturbations: output profiles.

Figure 13. Performance of NMPC (Laguerre multimodel) in the presence of parameter perturbations: manipilated input profiles.

state characteristics of a continuous fermenter, which exhibits input multiplicities and changes in the signs of the steady-state gains in the operating region, fairly accurately over a wide operating range. The proposed multimodels are further used to develop NMPC schemes. The effectiveness of the NMPC scheme is demonstrated by simulating a servo problem that requires the fermenter to be controlled at the optimum operating point. The control problem is particularly difficult because the optimum operating points happen to be singular points where the invertibility is lost. The proposed NMPC scheme is found to achieve a smooth transition for large magnitude setpoint changes and to control the systems at the singular operating point even in the presence of measurement noise. The NMPC scheme is also found to be robust with respect to moderate variations in system parameters. In particular, when the setpoint is rendered unattainable because of parameter variations, both of the controllers are found to produce acceptable closed-loop behavior without causing input saturation or instability due to integral windup.

where Lj(q,a) represents the ith-order discrete Laguerre filter

Appendix: Development of Models Using Laguerre Filters A.1. Single-Input, Single-Output (SISO) Laguerre Models. A SISO linear Laguerre model relating the system input with the system output can be written as

y(q) u(q)

n

) G(q) =

cjLj(q,a) ∑ j)1

Lj(q,a) )

x(1 - a2)T 1 - aq j-1 q-a

(q-a)

Here, a (-1 < a < 1) represents the adjustable filter parameter, T denotes the sampling interval, and n represents the number of Laguerre filters used for the approximation. Let us define a state vector x(k) whose ith component xi(k) represents the output of the ithorder filter for i ) 1, 2, ..., n. Then, following Kalpana,30 a discrete-time state-space representation of the Laguerre filter model can be written in the form

x(k+1) ) Φ(a) x(k) + Γ(a) u(k) y(k) ) cx(k) where

[

Φ(a) ) a 1 - a2 ... (-1)nan-2(1 - a2)

0 a ... (-1)n-1an-3(1 - a2)

Γ(a) ) [x(1 - a2)T - ax(1 - a2)T ... ...

0 0 ... ...

0 0 ... ...

0 0 0 a

]

n×n

(-a)n-1x(1 - a2)T]n×1T

c ) [c1 c2 ... ... cn]1×n


A.2. MIMO Laguerre Models. The state-space representations for the SISO Laguerre filter models developed in the previous section can easily be extended to a MIMO case.30 An r × m MIMO linear system can be represented as r MISO systems

where

θˆ j ) [E h (φj(k) φj(k)T)]-1E h (φj(k) yj(k))

subject to

|ajk| < 1

for k ) 1, 2, ..., Nj

m

yj(q) )

Gjl(q) ul(q) ∑ l)1

where j ) 1, 2, ..., r where Gjl(q) represents a transfer function relating the jth output and the lth input, i.e., the (j, l)th element of the transfer function matrix G(q). Now, let xjl ∈ Rnjl represent the state vector associated with the (j, l)th element of the transfer function matrix. Then, the following equations are associated with the jth output and the ith input:

xjl(k+1) ) Φ(ajl) xjl(k) + Γ(ajl) ul(k) yj(k) ) cjlxjl(k) Defining an augmented state vector xj(k) as

xj(k) ) [xj1T(k) xj2T(k) ... xjmT(k)]T the MISO state-space model can be written as

xj(k+1) ) Φj(aj) xj(k) + Γ(aj) u(k) yj(k) ) Cjxj(k) where

Φj(aj) ) blockdiag[Φj1(aj1) Φj2(aj2) ... Φjn(ajn)]nj×nj Γj(aj) ) blockdiag[Γi1(aj1) Γi2(aj2) ... Γim(ajm)]nj×m Cj ) [cj1 cj2 ... cjm] m

nj )

njl ∑ l)1

and

aj ) [aj1 aj2 ... ajn]T

where j ) 1, 2, ..., r. The above linear MISO models can be further combined to form a MIMO state-space model in a similar manner. A.3. Estimation of Model Parameters. For the MISO model given above, the parameter estimation problem can be stated as Ns

(aˆ j, θˆ j) ) arg min (aj, θj)

[j(k)]2 ∑ k)1

j(k,aj,θj) ) yj(k) - φ(k,aj)T θj φi(k,aj) ) [1 xj(k,aj)T]T where θj ) CjT. Note that, given a guess of aj, the above parameter estimation problem is linear with respect to θj. Thus, θj can be estimated using an ordinary linear least-squares method. Taking advantage of this fact, the above optimization problem can be reformulated as Ns

(aˆ j) ) arg min [ aj

∑ [y(k) - yˆ (k|θˆ j)]T[y(k) - yˆ (k|θˆ j)]]

k)1

where E h (‚) represents the expected value. This leads to a constrained nonlinear optimization problem which can be solved using any standard nonlinear optimization technique. Nomenclature a ) Laguerre time-scale parameter a ) vector of the Laguerre time-scale parameter A ) state transition matrix in a continuous domain B ) input transition matrix in a continuous domain c ) coefficient vector of the linear Laguerre model C ) output coupling matrix D ) dilution rate d ) plant/model mismatch correction term E ) error vector G(q) ) transfer function matrix k ) sampling instant P ) product concentration p ) prediction horizon Q ) productivity S ) substrate concentration Sf ) feed substrate concentration T ) sampling period u ) manipulated input vector W1, W2 ) weighting matrices for MLP x(‚) ) state vector X ) cell concentration y(‚) ) system output vector YX/S ) yield coefficient in the fermenter model Greek Letters Φ ) state transition matrix Γ ) input coupling matrix r, β ) bias vectors in MLP R, β ) growth-associated parameters in the fermenter model Ω ) MLP state output map µ ) cell growth rate µm ) maximum growth rate Subscripts c ) corrected value f ) feed i ) pertaining to the ith region r ) setpoint

Literature Cited (1) Qin, S. J.; Badgwell, T. A. An Overview of Industrial Model Predictive Control Technology. In Proceedings of the Fifth International Conference on Chemical Process Control, Tahoe City, CA, 1996; Kantor, J. C., Garcia C. E., Carnaham, B., Eds.; CACHEAIChE: New York, 1996; pp 232-256. (2) Biegler, L. T.; Rawlings, J. B. Optimization Approaches to Nonliear Model Predictive Control. In Proceedings of the Fourth International Conference on Chemical Process Control, Padre Island, TX, 1991; Arkun, Y., Ray, W. H., Eds.; CACHE-AIChE: New York, 1991; pp 543-571. (3) Patwardhan, S. C.; Madhavan, K. P. Nonlinear Predictive Control Using Approximate Second-Order Perturbation Model. Ind. Eng. Chem. Res. 1993, 32, 334-344. (4) Mayne, D. Q. Nonlinear Model Predictive Control: An Assessment. In Proceedings of the Fifth International Conference on Chemical Process Control, Tahoe City, CA, 1996; Kantor, J.

3198


C., Garcia, C. E., Carnaham, B., Eds.; CACHE-AIChE: New York, 1996; pp 217-231. (5) Qin, S. J.; Badgwell, T. A. An Overview of Nonlinear Model Predictive Control Applications. In Nonlinear Model Predictive Control; Allgower, F., Zheng, A., Eds.; Birkhauser: Basel, Switzerland, 1999. (6) Garcia, C. E. Quadratic Dynamic Matrix Control of Nonlinear ProcessessAn Application to Batch Reactor Process. AIChE Annual Meeting, San Francisco, 1984. (7) Brengel, D. D.; Seider, W. D. Multistep Nonlinear Predictive Control. Ind. Eng. Chem. Res. 1989, 28, 1812-1822. (8) Li, W. C.; Biegler, L. T. Newton-type Controllers for Constrained Nonlinear Processes with Uncertainties. Ind. Eng. Chem. Res. 1990, 29 (8), 1674. (9) Takagi, T.; Sugeno, M. Fuzzy Identification of systems and its applications to Modelling and Control. IEEE Trans. Syst., Man Cybernet. 1985, SMC-15 (1), 116-132. (10) Unbehauen, H. Adaptive Control of Time-Varying and Nonlinear Systems Using a Multi-Model Approach. In Control of Uncertain Systems; Hinrichsen, D., Martensson, B., Eds.; Birkhauser: Basel, Switzerland, 1990; pp 297-308. (11) Kuipres, B.; Astrom, K. The Composition and Validation of Heterogeneous Control Laws. Automatica 1994, 30 (2), 233249. (12) Johanson, T. A.; Murray-Smith, R. The Operating Regime Approach. In Multiple Model Approaches to Modelling and Control; Murray-Smith, R., Johnson, T. A., Eds.; Taylor and Francis: London, 1997; pp 3-72. (13) Schott, K. D.; Bequette, B. W. Multiple Model Adaptive Control. In Multiple Model Approaches to Modelling and Control; Murray-Smith, R., Johnson, T. A., Eds.; Taylor and Francis: London, 1997; pp 269-291. (14) Patwardhan, S. C.; Madhavan, K. P. Nonlinear Internal Model Control using Quadratic Prediction Models. Comput. Chem. Eng. 1998, 22 (3-4), 587-601. (15) Patwardhan, S. C.; Madhavan, K. P. Improved Techniques for the Development of Quadratic Perturbation Models. Ind. Eng. Chem. Res. 1996, 35, 4281-4290. (16) Zervos, C. C.; Dumont, G. A. Deterministic adaptive control based on Laguerre series representation. Int. J. Control 1988, 48 (6), 2333-2359. (17) Wahlberg, B. System identification using Laguerre models. IEEE Trans. Autom. Control 1991, 36 (5), 551-562. (18) Kishore Kumar, K. Predictive Control of Nonlinear Processes Using Multiple Model Approach. M.S. Dissertation, Indian Institute of Technology Madras, Chennai, India, 2000.

(19) Rao, R.; Palerm, C. C.; Aufderheide, B.; Bequette, B. W. Experimental Studies on Automated Refulation of Hemodynamic Variables. IEEE Eng. Med. Biol. Mag. 2001, 20 (1), 24-38. (20) Galan, O.; Ramagloni, J. A.; Palazoglu, A. Robust H∞ Control of Nonlinear Plants Based on Multilinear Models: An Application to a Bench Scale pH Neutralization Reactor. Chem. Eng. Sci. 2000, 55 (20), 4435-4450. (21) Hunt, L. R.; Sbarbaro, D.; Zbikowski, R.; Gawthrop, P. J. Neural networks for control systemssA survey. Automatica 1992, 28 (6), 1083-1112. (22) Koppel, L. B. Input Multiplicities in Nonlinear Multivariable Control Systems. AIChE J. 1982, 28 (6), 935-945. (23) Fox, J. M.; Schmidt, W. J.; Kantor, J. C. Comparison Sensitivity and Feedback Control of Optimized Chemical Reactors. Proceedings of the 1984 American Control Conference, San Diego, June 6-8, 1984; IEEE: New York, 1984; pp 1621-1627. (24) Morari, M. Robust Stability of Systems with Intergral Control. Proceedings of the Conference on Decision and Control, San Antonio, TX, Dec 14, 1983; IEEE: New York, 1983; pp 865869. (25) Sistu, P. B.; Bequette, B. W. Model Predictive Control of Processes with Input Multiplicities. Chem. Eng. Sci. 1995, 50 (6), 921-936. (26) Henson, M. A.; Seborg, D. E. A Critique of Exact Linearization Strategies for Process Control. J. Process Control 1991, 1, 122-139. (27) Slotine, J. E.; Li, W. Applied Nonlinear Control; PrenticeHall: Englewood Cliffs, NJ, 1991. (28) Daoutidis, P.; Kravaris, C. Dynamic Output Feedback Control of Minimum-phase Nonlinear Processes. Chem. Eng. Sci. 1992, 47 (4), 837-849. (29) Henson, M. A.; Seborg, D. E. Nonlinear Control Strategies for Continuous Fermenters. Chem. Eng. Sci. 1992, 47 (4), 821835. (30) Kalpana, N. Internal Model Control of Nonlinear Processes using Laguerre and Kautz Filter Models. M.S. Dissertation, Indian Institute of Technology Madras, Chennai, India, 2000.

Received for review December 5, 2000 Revised manuscript received September 24, 2001 Accepted March 22, 2002 IE001049G

Nonlinear Predictive Control of Systems Exhibiting Input Multiplicities

Recommend Documents