Ind. Eng. Chem. Res. 2006, 45, 8575-8582


Computationally Efficient Neural Predictive Control Based on a Feedforward Architecture

Matthew Kuure-Kinsey,† Rick Cutright,‡ and B. Wayne Bequette*,†

Isermann Department of Chemical & Biological Engineering, Rensselaer Polytechnic Institute, Troy, New York 12180-3590, and Research & Systems Architecture, Plug Power Inc., Latham, New York 12110

A new strategy for integrating system identification and predictive control is proposed. A novel feedforward neural-network architecture is developed to model the system. The network structure is designed so that the nonlinearity can be mapped onto a linear time-varying term. The linear time-varying model is augmented with a Kalman filter to provide disturbance rejection and compensation for model uncertainty. The structure of the model developed lends itself naturally to a neural predictive control formulation. The computational requirements of this strategy are significantly lower than those using the nonlinear neural network, with comparable control performance, as illustrated on a challenging nonlinear chemical reactor and a multivariable process, each with both nonminimum and minimum phase behavior.

1. Prelude

Warren Seider has been a pioneer in the use of nonlinear models for optimization and control. While model predictive control (MPC), as initially applied in the oil and petrochemical industries, was based on linear convolution coefficient-based models, Warren recognized that there was no inherent limitation to linear models. His publications with David Brengel were among the first to use nonlinear models as the basis for control computations in a model predictive control framework. He also used nonlinear MPC as a basis for integrating design and control, recognizing that the limitations to linear controller performance could impose unnecessary limitations to the process design. One of the most impressive attributes of Professor Seider is his ability to bring novel design and control concepts directly into the undergraduate classroom. His process design textbook with Professors Seader and Lewin makes these research results accessible to undergraduate students, many of whom will have the chance to implement these techniques early in their industrial careers.
It is with pleasure that we contribute a paper on a new technique to incorporate a neural-network-based model into a model predictive control strategy to this special issue dedicated to Professor Seider.

* To whom correspondence should be addressed. E-mail: bequette@rpi.edu. Tel.: (518) 276-6683. Fax: (518) 276-4030.
† Rensselaer Polytechnic Institute.
‡ Plug Power Inc.

2. Introduction

There has been a trend toward specialty chemicals manufacturing, with production plants that frequently change operating conditions to meet varying product requirements for different consumers. Linear control strategies, which work quite well in maintaining a plant at a given steady state, may fail when required to operate over a wide range of conditions. To develop appropriate control strategies, accurate nonlinear models of the processes must first be developed. Fundamental first-principles models have been applied to the control of specialty and batch chemical processes.1,2 First-principles models can give physical insights into the system. This physical insight allows for more meaningful interpretation of control system behavior, and a first-principles basis often means the model is accurate over a wider range of input/output space. However, not all systems lend themselves to fundamental models without considerable time and effort. Though lacking in physical meaning, empirical models such as fuzzy sets and neural networks can often be developed with less time and effort. Fuzzy sets use a series of rules and classifications to model nonlinear relations.3 One potential drawback to the use of fuzzy sets is the large amount of data that can be required to accurately train the model.

Among empirical modeling techniques, neural networks have garnered the most research attention in recent years. This is due in part to the simplicity of the “black-box” input-output relationship that results.4 Also important is the “universal approximation” capability of neural networks. A two-layer feedforward neural network, with sigmoidal and linear activation functions, is capable of representing any nonlinear function to an arbitrary degree of accuracy given enough hidden layer neurons.5 By definition, in a feedforward network, each input is only fed forward and passes through each activation function once. This allows for the well-documented ability to represent static systems. Recurrent neural networks, on the other hand, incorporate internal feedback by using layer outputs as inputs to hidden nodes. This internal feedback adds difficulty and time to the training process, because of convergence issues, but it also makes it possible for recurrent neural networks to model dynamic systems.6

The number of neurons in a hidden layer, whether the neural network is feedforward or recurrent, is also of importance. Too few neurons can lead to incomplete training and poor generalization abilities. With too many neurons, the neural network will become more complex than necessary. Several approaches have been used to determine an optimal number of hidden layer neurons.
Prasad and Bequette7 used a singular value decomposition based technique for model reduction, while Youngjik et al.8 used an iterative reduction technique to determine the optimal number of neurons. Alongside development of neural-network architectures to model nonlinear systems has been research into control strategies that use these neural networks. Several classes of strategies have begun to emerge from this research. Ge and Cong have used neural networks adaptively,9 where the network weights and biases are adapted based on feedback about control performance. Psichogios and Ungar have put neural networks into an internal model control (IMC) framework, where the neural network serves as the model inverse for control design.10 Neural networks

10.1021/ie060246y CCC: $33.50 © 2006 American Chemical Society Published on Web 10/19/2006


Ind. Eng. Chem. Res., Vol. 45, No. 25, 2006

have also increasingly been used in model predictive control, where the neural network is used to predict n steps into the future in order to determine optimal control actions. Within model predictive control, feedforward neural networks,11 recurrent neural networks,12 and radial basis functions13,14 have been used as the model. To determine the optimal control solution in model predictive control, the model prediction must be propagated a number of steps into the future. The optimal control solution is then found by application of an appropriate optimization routine. In the case of neural networks being used as the model, the resulting system to be optimized is nonconvex, which leaves two options. One solution is to solve the nonlinear neural-network optimization with techniques such as sequential quadratic programming.15 The primary drawback to such methods is the high computational expense required, because of convergence issues and local minima associated with nonconvex sets. As the size of the nonlinear system in question grows, computational speed becomes more of an issue,16 especially for systems with fast sampling time requirements. The alternative to solving the nonlinear neural-network optimization problem is to use an approximation of the nonlinear model. The goal of approximation is to recast the nonlinear model in a linear form that closely matches the original system, while gaining computational savings associated with the simpler linear model. Common methods of approximation include using multiple linear models to span the operating space.17 For a more complete review of nonlinear control topics, see ref 18 and the references therein. The contribution of this paper is to present a novel feedforward neural-network architecture that can be approximated as a linear time-varying system for which a model predictive control formulation with an algebraic solution in the unconstrained case can be developed. 
The result is a neural predictive control strategy that is more computationally efficient than solving the nonlinear neural-network optimization problem. The paper is organized into five sections. After a brief background and introduction, Section 3 introduces the neural-network architecture and corresponding input-output relationship. Section 4 develops the model predictive control formulation. An example illustrating the use and benefits of the method is presented in Section 5. The computational efficiency is analyzed in Section 6, with conclusions and recommendations in Section 7.

3. Neural-Network Architecture

Standard feedforward neural networks are constructed from several important subfunctions: layer weights, biases, and activation functions. Layer weights are matrices that linearly transform data signals prior to the activation functions. Bias terms are static inputs to the neural network that are summed with layer matrix output to form the input for the activation functions. Activation functions are what allow the neural network to capture nonlinear behavior and can be either linear or nonlinear in nature. Common nonlinear activation functions are natural logarithm and hyperbolic tangent based. Each activation function is made up of nodes or “neurons” that can transform signals. The more neurons an activation function has, the larger the corresponding layer weight matrix. Each group of weight matrices, biases, and activation functions is known as a layer. Each element of the weight matrices, along with the bias terms, is free to be manipulated in order to match the neural-network output with training data. This is known as training, and it is typically done based on the gradient of the network error. The standard training algorithm for feedforward neural

Figure 1. Standard two-layer feedforward neural network with nonlinear hidden layer and linear output layer.

Figure 2. Proposed feedforward neural-network architecture with decoupled inputs and multiple hidden layers connected in series and parallel.

networks, and the one used in this work, is the Levenberg-Marquardt algorithm, given in eq 1.

$$w_{k+1} = w_k - [J^T J + \mu I]^{-1} J^T e \tag{1}$$

where e is a vector of network errors and J is the Jacobian matrix containing first derivatives of network errors with respect to network weights and biases. The general input-output relationship in a feedforward neural network can be described by $y = f(u^*, y)$, and an example of a typical network with two layers is shown in Figure 1, where $u^*$ is a vector of inputs, y is a vector of outputs, and f(•) is the neural network. To better approximate the training data, the input vector is often a combination of past system inputs and outputs described by eq 2, similar to an ARX model in linear systems theory. The use of past inputs and outputs allows for a degree of feedback similar to what is found in recurrent neural networks.

$$u^* = (u_k, u_{k-1}, \ldots, u_{k-n}, y_k, y_{k-1}, \ldots, y_{k-n}) \tag{2}$$
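As a rough illustration of the update in eq 1, the sketch below applies Levenberg-Marquardt steps to a toy two-parameter model, y = a tanh(bx), rather than to a full network. The toy model, the synthetic data, the starting point, and the damping schedule (divide µ by 10 after an accepted step, multiply it by 10 after a rejected one) are assumptions made for this example and are not taken from the paper.

```python
import math

def residuals(w, xs, ys):
    """Network errors e for the toy model y = w[0] * tanh(w[1] * x)."""
    return [w[0] * math.tanh(w[1] * x) - y for x, y in zip(xs, ys)]

def sse(w, xs, ys):
    """Sum of squared errors."""
    return sum(e * e for e in residuals(w, xs, ys))

def lm_step(w, xs, ys, mu):
    """One update w_new = w - (J^T J + mu*I)^-1 J^T e for two parameters."""
    a, b = w
    e = residuals(w, xs, ys)
    # Jacobian rows: derivatives of each error with respect to (a, b)
    J = [(math.tanh(b * x), a * x * (1.0 - math.tanh(b * x) ** 2)) for x in xs]
    h11 = sum(r[0] * r[0] for r in J) + mu
    h12 = sum(r[0] * r[1] for r in J)
    h22 = sum(r[1] * r[1] for r in J) + mu
    g1 = sum(r[0] * ei for r, ei in zip(J, e))
    g2 = sum(r[1] * ei for r, ei in zip(J, e))
    det = h11 * h22 - h12 * h12
    # Solve the 2x2 system (J^T J + mu*I) d = J^T e by Cramer's rule
    d1 = (h22 * g1 - h12 * g2) / det
    d2 = (h11 * g2 - h12 * g1) / det
    return (a - d1, b - d2)

# Synthetic, noise-free training data generated with a = 2.0, b = 0.5
xs = [i / 10.0 for i in range(-20, 21)]
ys = [2.0 * math.tanh(0.5 * x) for x in xs]

w0 = (1.0, 1.0)
w_fit, mu = w0, 0.01
for _ in range(50):
    trial = lm_step(w_fit, xs, ys, mu)
    if sse(trial, xs, ys) < sse(w_fit, xs, ys):
        w_fit, mu = trial, max(mu / 10.0, 1e-9)   # accept: lean toward Gauss-Newton
    else:
        mu = min(mu * 10.0, 1e6)                  # reject: lean toward gradient descent
```

A small µ makes the step close to a Gauss-Newton step, while a large µ shrinks it toward a short gradient-descent step, which is why the damping is adjusted according to whether the trial step reduced the error.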

The standard “universal approximation” feedforward neural network, with two layers, has the following input-output relationship.

$$y_{k+1} = LW \tanh(IW\,u^* + b_1) + b_2 \tag{3}$$
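To make eqs 2 and 3 concrete, the following sketch builds the regressor $u^*$ from short input and output histories and evaluates the two-layer network. The weight values, the history data, and the function names are hypothetical and chosen only for illustration.

```python
import math

def newest_first(history, n):
    """Return the last n + 1 entries of a history, newest first."""
    return [history[-1 - i] for i in range(n + 1)]

def regressor(u_hist, y_hist, n):
    """Eq 2: u* = (u_k, ..., u_{k-n}, y_k, ..., y_{k-n}); last list entry is time k."""
    return newest_first(u_hist, n) + newest_first(y_hist, n)

def two_layer(u_star, IW, b1, LW, b2):
    """Eq 3: y_{k+1} = LW tanh(IW u* + b1) + b2, with weights stored as lists of rows."""
    hidden = [math.tanh(sum(w * x for w, x in zip(row, u_star)) + b)
              for row, b in zip(IW, b1)]
    return [sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(LW, b2)]

# Hypothetical weights: 4 inputs (n = 1), 2 hidden neurons, 1 output
IW = [[0.5, -0.2, 0.1, 0.0],
      [0.0, 0.3, -0.4, 0.2]]
b1 = [0.1, -0.1]
LW = [[1.5, -0.7]]
b2 = [0.05]

u_star = regressor([0.0, 1.0, 2.0], [0.3, 0.4, 0.5], n=1)
y_next = two_layer(u_star, IW, b1, LW, b2)
```

Because $u_k$ sits inside the hyperbolic tangent, propagating this prediction over several steps makes the future outputs a nested nonlinear function of the control moves.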

Using the standard least squares objective function propagated into the future for model predictive control, the result is a nonconvex optimization problem. To get a more computationally efficient optimization problem, the standard architecture is modified by decoupling uk and yk into distinct inputs. This allows for a more favorable placement of the nonlinear activation function. Several linear activation functions are also added, and the five-layer feedforward architecture is shown in Figure 2. The architecture retains the feedback structure through the use of yk but results in a different input-output relationship given in eq 4.


$$y_{k+1} = CAy_k + CBu_k + CE\tanh(Fy_k) \tag{4}$$

$$A = W_F W_C, \quad B = W_E W_B, \quad C = W_G, \quad E = W_D, \quad F = W_A$$

4. Model Predictive Control

Model predictive control, or receding horizon control, is an advanced control technique that takes advantage of the model’s inherent ability to predict system behavior into the future. At each time step, an optimization problem is formulated and solved. The objective function, which penalizes setpoint error and control action, is minimized over a prediction horizon of p time steps. The decision variables are m control moves, where m is the control horizon. Only the first control move is applied to the system, the model is updated, and the entire process is repeated at the next time step.

4.1. Feedforward Neural-Network Formulation. A number of different types of models can be used in a model predictive control framework. Among the most popular are step response, impulse response, and state-space models.19 A closer inspection of eq 4 reveals an inherent state-space nature, which allows the input-output relationship to be rewritten as the following nonlinear state-space model given in eq 5.

$$x_{k+1} = Ax_k + Bu_k + E\tanh(Fy_k), \qquad y_k = Cx_k \tag{5}$$

The model in eq 5 is a nonlinear state-space model of a particular class. The presence of $y_k$ within the hyperbolic tangent term makes the model a linear state-space model with a feedback path nonlinearity. In order for a computationally efficient control solution to be possible, the model must be cast into a different form using a specific type of linearization. Instead of linearizing the model around a single operating point, the model will be linearized around a setpoint vector of length p. By linearizing around a vector instead of a point, the static nonlinearity will effectively be mapped into a time-varying linear term. The result will be the transformation of the nonlinear time-invariant model in eq 5 to a linear time-varying model. The linearization is accomplished by approximating the feedback path nonlinearity with a Taylor series expansion similar to eq 6.

$$f(x) = f(x_0) + f'(x_0)(x - x_0) + \cdots \tag{6}$$

A Taylor series expansion of any order can be used; however, for the sake of brevity, the derivation will be done using a first-order approximation. Use of higher-order approximations can be done in a similar manner without a significant increase in complexity and difficulty. Applying the first-order Taylor series approximation, the feedback path nonlinearity becomes

$$E\tanh(Fy_k) \approx E\tanh(Fr_k) + E(1 - \tanh(Fr_k)^2)F[y_k - r_k] \tag{7}$$

The nonlinearity now only has linear time-varying terms and can be combined with the linear terms in eq 5. The result is the linear time-varying model given in eq 8.

$$\hat{x}_{k+1} = A_k\hat{x}_k + Bu_k + g_k, \qquad \hat{y}_k = C\hat{x}_k \tag{8}$$

where the terms $A_k$ and $g_k$ are defined in eqs 9 and 10.

$$A_k = A + E(1 - \tanh(Fr_k)^2)FC \tag{9}$$

$$g_k = E\tanh(Fr_k) - E(1 - \tanh(Fr_k)^2)Fr_k \tag{10}$$

From the system model in eq 8, the next step is to derive the optimal model predictive control solution. The objective function that will be minimized is eq 11.

$$\Phi = \sum_{i=1}^{p}(r_{k+i} - \hat{y}_{k+i})^2 + W_u\sum_{i=1}^{m}\Delta u_{k+i}^2 \tag{11}$$

The objective function is of the sum-of-squared errors form, where the first term represents the error over the prediction horizon and the second term is a penalization of excessive control actions whose purpose is to minimize potentially expensive control actions. No system model will perfectly describe the system at all times. There will always be parameter uncertainty, and system characteristics change with time. To account for this, the model predicted output, $\hat{y}$, in the linear time-varying model (eq 8) must be augmented with a correction term. The most basic, and frequently used, correction method is an additive disturbance term. A more advanced correction method is the use of an augmented state-space-based Kalman filter. The Kalman filter approach has been studied by Muske and Badgwell,20 and it has been shown to provide superior disturbance rejection over a range of possible disturbance types and is, therefore, the correction method used in this work. The estimated disturbance term, $d_k$, is used to account for model uncertainty and disturbances and is augmented to the linear time-varying model in eq 8 to give eq 12.

$$\begin{bmatrix}\hat{x}_{k+1}\\ d_{k+1}\end{bmatrix} = \begin{bmatrix}A_k & \Gamma_d\\ 0 & I\end{bmatrix}\begin{bmatrix}\hat{x}_k\\ d_k\end{bmatrix} + \begin{bmatrix}B\\ 0\end{bmatrix}u_k + \begin{bmatrix}I\\ 0\end{bmatrix}g_k + \omega_k$$

$$\hat{y}_k = \begin{bmatrix}C & 0\end{bmatrix}\begin{bmatrix}\hat{x}_k\\ d_k\end{bmatrix} + \upsilon_k \tag{12}$$

The matrix $\Gamma_d$ represents the input disturbances that are being estimated, and because of this, $\Gamma_d$ is simply the input B matrix. The zero and identity matrices are then of appropriate dimension. Writing this in more compact matrix-vector notation, with appropriately defined terms, gives

$$\hat{x}^a_{k+1} = A^a_k\hat{x}^a_k + B^a u_k + G^a g_k + \omega_k, \qquad \hat{y}_k = C^a\hat{x}^a_k + \upsilon_k \tag{13}$$

The augmented state vector, $\hat{x}^a_k$, is updated with the predictor/corrector equations given in eq 14.

$$\hat{x}^a_{k|k-1} = A^a_k\hat{x}^a_{k-1|k-1} + B^a u_k + G^a g_k$$

$$\hat{x}^a_{k|k} = \hat{x}^a_{k|k-1} + L(y_k - C^a\hat{x}^a_{k|k-1}) \tag{14}$$

In eq 14, L is the Kalman gain, which is determined by solving the steady-state Riccati equation. The objective function in eq 11 can then be written

$$\Phi = (R - Y)^T(R - Y) + \Delta u^T W_u \Delta u \tag{15}$$

where R is a vector of setpoints, Y is a vector of predicted outputs, and $\Delta u$ is the optimal control solution. By propagating eqs 9 and 15 p times into the future, the vector Y can be written as

$$Y = Q\hat{x}^a_{k|k} + SW + EZ\Delta u + Eu_{k-1} \tag{16}$$
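Before the prediction matrices are defined, the linearization behind eqs 7-10 can be checked numerically. The sketch below uses a scalar instance of eq 5 and compares one step of the nonlinear model with one step of the linear time-varying model of eq 8. All parameter values are hypothetical, and the scalar setting is an assumption made to keep the example short.

```python
import math

# Hypothetical scalar model parameters (not taken from the paper)
A, B, C, E, F = 0.8, 0.5, 1.0, 0.3, 1.2

def nonlinear_step(x, u):
    """Eq 5: x_{k+1} = A x_k + B u_k + E tanh(F y_k), with y_k = C x_k."""
    return A * x + B * u + E * math.tanh(F * C * x)

def ltv_terms(r):
    """Eqs 9 and 10: time-varying A_k and g_k for the setpoint value r_k."""
    slope = E * (1.0 - math.tanh(F * r) ** 2) * F
    return A + slope * C, E * math.tanh(F * r) - slope * r

def ltv_step(x, u, r):
    """Eq 8: x_{k+1} = A_k x_k + B u_k + g_k."""
    A_k, g_k = ltv_terms(r)
    return A_k * x + B * u + g_k

r, u = 0.5, 0.2
errors = {}
for dev in (0.0, 0.05, 0.1):
    x = (r + dev) / C          # state whose output is r + dev
    errors[dev] = abs(nonlinear_step(x, u) - ltv_step(x, u, r))
```

The two models agree when the output sits on the setpoint, and the mismatch grows roughly with the square of the distance from the setpoint, as expected from a first-order Taylor expansion. This is consistent with the role of the Kalman filter in eq 12, which absorbs the remaining model error.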


The E matrix is a multivariable identity matrix whose elements have the dimensions $n_u \times n_u$, where $n_u$ is the number of manipulated inputs. The remaining matrices are more complex and defined as

$$Q(i) = C^a \prod_{k=0}^{i-1} A^a_k \tag{17}$$

$$W = \begin{bmatrix} g_k \\ \vdots \\ g_{k+p-1} \end{bmatrix} \tag{18}$$

$$S(i,j) = \begin{cases} 0 & i > j \\ C^a & i = j \\ C^a \prod_{k=1}^{j-i} A^a_{k+1} & i < j \end{cases} \tag{19}$$

$$Z(i,j) = \begin{cases} 0 & i > j \\ C^a B^a & i = j \\ C^a \prod_{k=1}^{j-i} A^a_{k+1} & i < j \end{cases} \tag{20}$$

Using eq 16, the optimization problem in eq 15 can be solved to yield the algebraic solution, where H = EZ:

$$\Delta u = (H^T H + W_u)^{-1} H^T (R - Q\hat{x}^a_{k|k} - SW - Eu_{k-1}) \tag{21}$$

4.2. Nonlinear Neural-Network Formulation. The aforementioned feedforward neural-network formulation is based on the specific architecture developed in Figure 2. As a means of standard comparison for efficacy and computational efficiency of the proposed method, the nonlinear neural network will be used as the model for model predictive control. This results in the nonlinear programming problem defined in eq 22.

$$\min_{\Delta u} \Phi = \sum_{i=1}^{p}(r_{k+i} - \hat{y}_{k+i})^2 + W_u\sum_{i=1}^{m}\Delta u_{k+i}^2 \tag{22}$$

To make an accurate comparison, an augmented Kalman filter approach, similar to eq 12, was developed.

$$\begin{bmatrix}\hat{x}_{k+1}\\ d_{k+1}\end{bmatrix} = \begin{bmatrix}A & \Gamma_d\\ 0 & I\end{bmatrix}\begin{bmatrix}\hat{x}_k\\ d_k\end{bmatrix} + \begin{bmatrix}B\\ 0\end{bmatrix}u_k + \begin{bmatrix}I\\ 0\end{bmatrix}\xi_k + \omega_k$$

$$\hat{y}_k = \begin{bmatrix}C & 0\end{bmatrix}\begin{bmatrix}\hat{x}_k\\ d_k\end{bmatrix} + \upsilon_k, \qquad \xi_k = E\tanh(Fy_k) \tag{23}$$

Writing this in more compact matrix-vector notation, with appropriately defined terms, gives

$$\hat{x}^a_{k+1} = A^a\hat{x}^a_k + B^a u_k + \Xi^a\xi_k + \omega_k, \qquad \hat{y}_k = C^a\hat{x}^a_k + \upsilon_k \tag{24}$$

The augmented state vector is updated with the predictor/corrector equations given in eq 25.

$$\hat{x}^a_{k|k-1} = A^a\hat{x}^a_{k-1|k-1} + B^a u_k + \Xi^a\xi_k$$

$$\hat{x}^a_{k|k} = \hat{x}^a_{k|k-1} + L(y_k - C^a\hat{x}^a_{k|k-1}) \tag{25}$$

In eq 25, L is the Kalman gain, which is determined from the solution of the steady-state Riccati equation. The nonlinear programming problem in eq 22 will be solved by using a trust-region approach, implemented in the MATLAB optimization toolbox. For additional details on the algorithm and implementation, the reader is referred to The Mathworks.21

5. Application of Neural Predictive Control

To demonstrate and verify the efficacy of the feedforward architecture and the efficiency of the proposed neural predictive control technique compared to solving the nonlinear neural-network optimization problem resulting from the standard inner-hidden-outer feedforward neural network, two representative problems of different dimensions are solved.

5.1. van de Vusse Reactor. The single-input, single-output example comes from the chemical reactor literature and involves a series-parallel reaction scheme known as the van de Vusse reactor. The two reactions are

$$A \xrightarrow{k_1} B \xrightarrow{k_2} C, \qquad 2A \xrightarrow{k_3} D \tag{26}$$

The desired product in the reaction is species B, the intermediate product in the primary reaction, which increases the difficulty of control. The reactor has been shown to exhibit input multiplicity,19,22 where a desired output may be achieved by two different steady-state input values. In addition, there is a right-half-plane zero (unstable inverse) over part of the desired operating region. The modeling equations are

$$\frac{dC_A}{dt} = \frac{F}{V}(C_{Af} - C_A) - k_1 C_A - k_3 C_A^2$$

$$\frac{dC_B}{dt} = -\frac{F}{V}C_B + k_1 C_A - k_2 C_B \tag{27}$$

These modeling equations assume a constant reactor volume. The equations for $C_C$ and $C_D$ are neglected because $C_B$ is not dependent on them. The manipulated input in this system is F/V, the dilution rate. The parameters for the reactor are given in Table 1. Using eq 27, a series of step-test data was generated from step changes to the dilution rate. The dilution rates were switched every 10 min, with an average value of 1.617 min⁻¹ and a standard deviation of 1.383 min⁻¹, and kept positive to ensure physically realizable flowrates. These data were then used as the training data for the feedforward neural network given in eq 4. The network was trained using the Levenberg-Marquardt algorithm in eq 1, and the results of the training are given in Figure 3.

The neural-network architecture developed in Figure 2 has five layers, each with an independent number of neurons. Because the output of the fifth layer is $y_{k+1}$, the network-predicted output, the number of neurons in the fifth layer is constrained to equal the number of outputs in the model. The third layer, where $y_k$ is an input to a linear activation function, defines the number of state variables in the model. Table 2 gives more detailed information about each layer in the model.


Figure 3. Neural-network training.

Figure 4. Neural-network validation.

Table 1. Van de Vusse Reactor Details

parameter    value
k1           5/6 min⁻¹
k2           5/3 min⁻¹
k3           1/6 L/(mol·min)
CAf          10 gmol/L
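The input multiplicity of the reactor can be seen directly from the steady state of eq 27 with the parameters in Table 1. The sketch below solves the steady-state balances in closed form, scans the dilution rate for the maximum attainable concentration of B, and uses bisection to find a second dilution rate that gives the same steady-state output as F/V = 4/7 min⁻¹. The grid, the search interval, and the function names are choices made for this example.

```python
import math

K1, K2, K3, CAF = 5.0 / 6.0, 5.0 / 3.0, 1.0 / 6.0, 10.0   # values from Table 1

def steady_state(d):
    """Steady-state (C_A, C_B) of eq 27 for dilution rate d = F/V in 1/min."""
    # 0 = d (CAf - CA) - k1 CA - k3 CA^2 is a quadratic in CA; keep the positive root
    ca = (-(K1 + d) + math.sqrt((K1 + d) ** 2 + 4.0 * K3 * d * CAF)) / (2.0 * K3)
    # 0 = -d CB + k1 CA - k2 CB
    cb = K1 * ca / (d + K2)
    return ca, cb

def upper_dilution_rate(cb_target, lo, hi, iters=60):
    """Bisection on the falling branch, where C_B decreases from lo to hi."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if steady_state(mid)[1] > cb_target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

grid = [0.01 * i for i in range(1, 1001)]            # 0.01 to 10 1/min
d_peak = max(grid, key=lambda d: steady_state(d)[1])
cb_peak = steady_state(d_peak)[1]

ca_low, cb_low = steady_state(4.0 / 7.0)             # about 0.5714 1/min
d_high = upper_dilution_rate(cb_low, d_peak, 10.0)   # second rate with the same C_B
```

With these parameters, the steady state at F/V = 4/7 min⁻¹ is $C_A$ = 3 mol/L and $C_B$ of about 1.117 mol/L, and the maximum steady-state $C_B$ on the grid is about 1.266 mol/L, which matches the feasible limit quoted in the text. The sign of the steady-state gain changes at that maximum, which is what makes the reactor difficult for a single linear model.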

Table 2. Proposed Neural-Network Details

layer    activation function    number of nodes
1        nonlinear              6
2        linear                 5
3        linear                 3
4        linear                 2
5        linear                 number of outputs

As Figure 3 clearly shows, the proposed neural-network architecture is able to accurately model the training data. Successful modeling of the training data is not sufficient to prove the efficacy of the neural-network architecture. The network must also show the ability to predict the input-output behavior from data not included in the training set. This ability to generalize data makes the network applicable over a wider range of inputs from a control perspective. Again using eq 27, a series of step-test data was generated from step changes to the dilution rate. The dilution rates were switched every 10 min, with an average value of 1.622 min⁻¹ and a standard deviation of 1.391 min⁻¹, and kept positive to ensure physically realizable flowrates. The resulting comparison of network-predicted outputs to actual outputs is found in the validation plot of Figure 4. As the figure clearly shows, the proposed neural-network architecture also has the ability to generalize input-output data it was not trained to model. The ability to both train and generalize, as illustrated by Figures 3 and 4, demonstrates the efficacy of the network.

Using the control equations developed in eqs 9, 10, and 17-20, the computational efficiency of the neural-network architecture in a model predictive control framework was investigated. To establish a baseline for comparison, a model predictive control strategy where the nonlinear model in eq 4 is optimized at each time step was devised. Before the computational efficiency can be gauged, it first needs to be established that the proposed neural predictive control method performs well in comparison with the nonlinear approach. Initial simulations, for both setpoint tracking and disturbance rejection, were done for a prediction horizon of 15 and a control horizon of 1. To simulate the inherent noise present in a real process, measurement error was applied in the form of an error term with a standard deviation of 5% of the steady-state value added to the output. To determine how the two methods perform relative to one another, it is instructive to compare control performances over a range of operating setpoints, including an infeasible setpoint. In this simulation, a concentration of 1.35 mol/L was used, which is greater than the feasible limit of 1.266 mol/L. The comparison of strategies over this range of setpoints is shown in Figure 5. The results of the two control strategies over the operating space show that the proposed approach has comparable performance to the nonlinear optimization approach. The quantitative measure used to assess this is the mean absolute deviation (MAD). Mean absolute deviation values and computational times for all three simulation studies are summarized in Table 3, and the corresponding entry for Figure 5 shows that the two methods have nearly identical performance, with the proposed method taking an order of magnitude less computational time.

Figure 5. Comparison of proposed neural-network MPC and nonlinear neural-network optimization based MPC over operating range of van de Vusse reactor, p = 15, m = 1.

It is also important that the proposed strategy is able to reject disturbances as well as track setpoints. With the same measurement noise previously described, a positive disturbance to the dilution rate of magnitude 0.1 min⁻¹ is applied at time 35 min and a negative disturbance to the dilution rate of magnitude 0.1 min⁻¹ is applied at time 70 min; the results are shown in Figure 6.
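The mean absolute deviation used to compare the two controllers can be computed from a simulated trajectory as sketched below. The text does not spell out the formula, so this example assumes that MAD is the average of the absolute difference between setpoint and measured output over all samples. The trajectory values are hypothetical.

```python
def mean_absolute_deviation(setpoints, outputs):
    """Mean of |r_k - y_k| over a simulation (assumed definition of MAD)."""
    if len(setpoints) != len(outputs) or not setpoints:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(r - y) for r, y in zip(setpoints, outputs)) / len(setpoints)

# Hypothetical closed-loop data: a setpoint of 1.0 mol/L and four sampled outputs
r_traj = [1.0, 1.0, 1.0, 1.0]
y_traj = [0.90, 0.98, 1.01, 1.00]
mad = mean_absolute_deviation(r_traj, y_traj)
```

Under this definition, a smaller MAD means the output stayed closer to the setpoint on average, so two controllers can be compared with a single number per simulation.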


Table 3. Statistical Data on Simulations

simulation                                     MAD            MAD             computational time   computational time
                                               (proposed      (nonlinear      (proposed            (nonlinear
                                               neural MPC)    optimization)   neural MPC)          optimization)
van de Vusse reactor, setpoint tracking        0.0066         0.0015          0.000133             0.04105
van de Vusse reactor, disturbance rejection    0.0033         0.0037          0.000133             0.04105
quadruple tank, setpoint tracking              0.00083204     0.00069466      0.000152             0.04345

Figure 6. Comparison of proposed neural-network MPC and nonlinear neural-network optimization based MPC for disturbance rejection of van de Vusse reactor, p = 15, m = 1.

Table 4. Kalman Filter Tuning Information

system            q11     q22     r11    r22
van de Vusse      0.01            1
quadruple tank    0.1     0.01    1      1

The comparison of the two strategies for disturbance rejection yields the same comparable performance seen for setpoint tracking. Looking at the mean absolute deviation values and computational times in Table 3, along with visual inspection of Figure 6, provides confirmation that disturbance rejection is nearly identical and the proposed method takes an order of magnitude less computational time.

To provide additional motivation for the development of the proposed neural predictive control strategy, a comparison against linear model predictive control is presented for the van de Vusse reactor. While a fundamental model-based approach is not always a viable control strategy, as in the case where only empirical data are available, it provides a solid baseline for comparing the performance of other model-based control strategies. The linear model predictive control strategy uses a state-space formulation with an augmented Kalman filter, to make the comparison to the proposed neural predictive control approach valid. Details about the tuning of the Kalman filter, for both the van de Vusse and subsequent multivariable example, are given in Table 4. The off-diagonal terms in the Q and R matrices are zero, and the diagonal terms are given in the table. The linearization point for generating the linear model is CBs = 1.117 mol/L and F/V = 0.5714 min⁻¹. Additional details on linear model predictive control, as well as on the linearization of the van de Vusse reactor, can be found in ref 17. The comparison is shown in Figure 7. The results in Figure 7 show the clear benefits of the proposed neural predictive control strategy over linear model predictive control. The proposed neural predictive control strategy both brings the concentration to the setpoint considerably faster than the linear model predictive control strategy and requires less manipulated input action to do so.

Figure 7. Comparison of proposed neural-network MPC and linear MPC for van de Vusse reactor, p = 15, m = 1.

5.2. Quadruple Tank. The single-input, single-output van de Vusse example demonstrated the efficacy of the proposed neural-network architecture and started to analyze the computational efficiency of the resulting neural predictive control strategy. A more complete look at the computational efficiency, however, requires another example that illustrates the effect of increasing system dimension, because of the increased complexity resulting from more inputs and outputs in the system. The quadruple tank of Johansson,23 a two-input, two-output system, will be used. The quadruple tank is also a challenging system because it can be operated in both nonminimum-phase and minimum-phase configurations, as will be done in this work. The outputs are voltages corresponding to tank height measurements, and inputs are control voltages for valves, with the modeling equations given in eq 28.

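$$\frac{dh_1}{dt} = -\frac{a_1}{A_1}\sqrt{2gh_1} + \frac{a_3}{A_1}\sqrt{2gh_3} + \frac{\gamma_1 k_1}{A_1}V_1$$

$$\frac{dh_2}{dt} = -\frac{a_2}{A_2}\sqrt{2gh_2} + \frac{a_4}{A_2}\sqrt{2gh_4} + \frac{\gamma_2 k_2}{A_2}V_2$$

$$\frac{dh_3}{dt} = -\frac{a_3}{A_3}\sqrt{2gh_3} + \frac{(1 - \gamma_2)k_2}{A_3}V_2$$

$$\frac{dh_4}{dt} = -\frac{a_4}{A_4}\sqrt{2gh_4} + \frac{(1 - \gamma_1)k_1}{A_4}V_1$$

$$y = \begin{bmatrix} k_c & 0 & 0 & 0 \\ 0 & k_c & 0 & 0 \end{bmatrix}\begin{bmatrix} h_1 \\ h_2 \\ h_3 \\ h_4 \end{bmatrix} \tag{28}$$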

For the parameters of the system, and a more detailed discussion of the modeling equations, the reader is referred to ref 23. The procedure for training and validating the neural-network model does not change with increasing the dimension of the system. Therefore, the training and validation plots created in the generation of the neural-network model are not shown.
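For readers who want to experiment with eq 28, the sketch below evaluates the right-hand side of the tank model and the measured outputs. The parameter values, initial levels, and pump voltages are hypothetical placeholders chosen for illustration and are not the values of ref 23. The dictionary layout and function names are likewise assumptions of this example.

```python
import math

G = 981.0  # gravitational acceleration, cm/s^2

def tank_rhs(h, v, p):
    """Right-hand side of eq 28: returns (dh1/dt, dh2/dt, dh3/dt, dh4/dt)."""
    # Outlet flow of each tank, a_i * sqrt(2 g h_i)
    out = [p["a"][i] * math.sqrt(2.0 * G * h[i]) for i in range(4)]
    A, k, gam = p["A"], p["k"], p["gamma"]
    return (
        (-out[0] + out[2] + gam[0] * k[0] * v[0]) / A[0],
        (-out[1] + out[3] + gam[1] * k[1] * v[1]) / A[1],
        (-out[2] + (1.0 - gam[1]) * k[1] * v[1]) / A[2],
        (-out[3] + (1.0 - gam[0]) * k[0] * v[0]) / A[3],
    )

def outputs(h, kc):
    """Measured outputs y = (kc * h1, kc * h2)."""
    return (kc * h[0], kc * h[1])

# Hypothetical values for illustration only; see ref 23 for the actual parameters
params = {
    "A": (30.0, 30.0, 30.0, 30.0),    # tank cross sections, cm^2
    "a": (0.07, 0.06, 0.07, 0.06),    # outlet cross sections, cm^2
    "k": (3.3, 3.3),                  # pump gains, cm^3/(V s)
    "gamma": (0.7, 0.6),              # valve split fractions
}
h0 = (12.0, 13.0, 2.0, 1.5)           # tank levels, cm
v0 = (3.0, 3.0)                       # pump voltages, V
dh = tank_rhs(h0, v0, params)
y0 = outputs(h0, kc=0.5)
```

A quick consistency check on such an implementation is the overall volume balance: summing $A_i\,dh_i/dt$ over the four tanks must equal the total pump flow minus the outflow from the two lower tanks, because the flows from tanks 3 and 4 into tanks 1 and 2 cancel.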


Figure 8. Comparison of proposed neural MPC and nonlinear neural-network optimization based MPC for setpoint tracking of quadruple tank, p = 15, m = 1.

The training and validation data used to generate the neural-network model contained data from both the nonminimum-phase and minimum-phase configurations to allow the neural network to operate in either regime. To provide a challenging control problem, the quadruple tank switches from nonminimum-phase to minimum-phase configuration during setpoint tracking. In Figure 8, which shows a setpoint-tracking comparison between the proposed neural predictive control and nonlinear neural-network optimization approaches, the quadruple tank switches configuration at time 500 s. There are two important points to note from Figure 8. The first is that the neural network successfully handles the switch from nonminimum phase to minimum phase, which demonstrates the effectiveness of the neural-network architecture from a modeling perspective. The second is that the proposed neural predictive control approach is again comparable to the nonlinear neural-network optimization approach, both visually from Figure 8 and from the quantitative data given in Table 3.

As for the previous system, the proposed neural predictive control strategy was compared to linear model predictive control for the quadruple tank. The linearization point is the minimum-phase operating point described by Johansson (ref 23), and the quadruple tank switches to nonminimum-phase behavior at time 600 s. The resulting comparison is given in Figure 9. The results of Figure 9 again show the benefit of the proposed neural predictive control strategy over linear model predictive control, which is unable to maintain stability after the switch to nonminimum-phase behavior. The proposed neural predictive control strategy is able to handle the switch, further motivating the development of the strategy.

6. Computational Efficiency

The comparisons to verify the proposed neural predictive control strategy shown in Figures 7-9 were done for a specific set of parameters. To assess the computational efficiency of the proposed strategy, a comparison must be made over a range of prediction and control horizons, as well as for systems of increasing input and output size. For all simulations in this work, MATLAB 7.0 was used to do the computations on a 3 GHz Pentium 4 computer. The nonlinear neural-network optimization

Figure 9. Comparison of proposed neural-network MPC and linear MPC for quadruple tank, p = 15, m = 1.

Figure 10. van de Vusse; left axis is computational time for nonlinear neural-network optimization, and right axis is computational time for proposed neural predictive control.

based model predictive control approach was carried out using the solution procedure described in Section 4.2. To determine the effect of prediction horizon and control horizon on the computational efficiency, both the single-input, single-output van de Vusse reactor scheme and the two-input, two-output quadruple tank were used for comparison against the nonlinear neural-network optimization based model predictive control strategy. The control horizon was held constant at 1 and the prediction horizon was varied to determine the effect of changing prediction horizon on the computational efficiency. Similarly, the prediction horizon was held constant at 20 and the control horizon was varied to determine the effect of changing control horizon on the computational efficiency. The results of these simulations for the van de Vusse reactor are shown in Figure 10. It is important to note the time-scale differences in Figure 10, as the computational time for the proposed neural predictive control strategy is several orders of magnitude less than the computational time for the nonlinear neural-network optimization based model predictive control strategy. The gap in computational time widens with increases in both prediction


and control horizon. Simulations were also done for the quadruple tank, and the results are shown in Figure 11. The time scales are again important in Figure 11, as the proposed neural predictive control strategy takes several orders of magnitude less computational time than does the nonlinear neural-network optimization based approach. Also consistent is the widening gap in computational time as the prediction and control horizons are increased.

Figure 11. Quadruple tank; left axis is computational time for nonlinear neural-network optimization, and right axis is computational time for proposed neural predictive control.

7. Summary

A new strategy for combining system identification and model predictive control was demonstrated. The system identification was accomplished by use of a novel feedforward neural-network architecture. The network was trained and validated on a highly nonlinear chemical reactor and a multivariable quadruple tank exhibiting both nonminimum- and minimum-phase behavior. The network structure is designed so that the static nonlinearity can be mapped onto a linear time-varying term, which spans the setpoint vector. An optimal control solution to the resulting model predictive control formulation was then derived. The neural predictive control approach was then tested against the nonlinear neural-network optimization based model predictive control approach for the same chemical reactor and quadruple tank. The proposed neural predictive control approach was shown to have comparable control performance while requiring significantly less computational time. This computational efficiency was shown to extend over the range of prediction and control horizons. It was also shown that the computational efficiency of the proposed method increases with increasing system dimension.

Acknowledgment

The authors gratefully acknowledge the National Institute of Standards and Technology (Cooperative Agreement No. 70NANB3H3037) and National Science Foundation (BES-0411693 and DGE-0504361) for partial support of this research.

Literature Cited

(1) Reuter, E.; Wozny, G.; Jeromin, L. Modeling of multicomponent batch distillation processes with chemical reactions and their control systems. Comput. Chem. Eng. 1992, 16, 27.
(2) Gabbar, H. A.; Aoyama, A.; Naka, Y. Automated solution for control recipe generation of chemical batch plants. Comput. Chem. Eng. 2005, 29, 949.
(3) Fischle, K.; Schroder, D. An improved stable adaptive fuzzy control method. IEEE Trans. Fuzzy Syst. 1999, 7, 27.
(4) Bostanci, M.; Koplowitz, J.; Taylor, C. W. Identification of power system load dynamics using artificial neural networks. IEEE Trans. Power Syst. 1997, 12, 1468.
(5) Cotter, N. E. The Stone-Weierstrass theorem and its application to neural networks. IEEE Trans. Neural Networks 1990, 1, 290.
(6) Chao-Chee, K.; Lee, K. Y. Diagonal recurrent neural networks for dynamic systems control. IEEE Trans. Neural Networks 1995, 6, 144.
(7) Prasad, V.; Bequette, B. W. Nonlinear system identification and model reduction using artificial neural networks. Comput. Chem. Eng. 2003, 27, 1741.
(8) Youngjik, L.; Song, H. K.; Kim, M. W. An efficient hidden node reduction technique for multilayer perceptrons. IEEE Conf. Neural Networks 1991, 1937.
(9) Ge, S. S.; Cong, W. Adaptive neural control of uncertain MIMO nonlinear systems. IEEE Trans. Neural Networks 2004, 15, 674.
(10) Psichogios, D. C.; Ungar, L. H. Direct and indirect model based control using artificial neural networks. Ind. Eng. Chem. Res. 1991, 30, 2564.
(11) Hong, T.; Zhang, J.; Morris, A. J.; Martin, E. B.; Karim, N. M. Neural based predictive control of a multivariable microalgae fermentation. IEEE Conf. Syst., Man Cybernetics 1996, 1, 345.
(12) Gil, P.; Henriques, J.; Dourado, A.; Duarte-Ramos, H. Constrained neural model predictive control with guaranteed free offset. IEEE Conf. Ind. Electron. Soc. 2000, 3, 1991.
(13) Bhartiya, S.; Whiteley, J. R. Factorized approach to nonlinear MPC using a radial basis function model. AIChE J. 2001, 47, 358.
(14) Bhartiya, S.; Whiteley, J. R. Benefits of factorized RBF-based NMPC. Comput. Chem. Eng. 2002, 26, 1185.
(15) Chen, J.; Yea, Y.; Wang, C.-H. Neural network model predictive control for nonlinear MIMO processes with unmeasured disturbances. J. Chem. Eng. Jpn. 2002, 35, 150.
(16) Rau, M.; Schroder, D. Model predictive control with nonlinear state space models. Workshop Adv. Motion Control 2002, 136.
(17) Norgaard, M.; Sorensen, P. H.; Poulsen, N. K.; Ravn, O.; Hansen, L. K. Intelligent predictive control of nonlinear processes using neural networks. Proc. IEEE Symp. Intell. Control 1996, 301.
(18) Bequette, B. W. Nonlinear control of chemical processes: A review. Ind. Eng. Chem. Res. 1991, 30, 1391.
(19) Bequette, B. W. Process Control: Modeling, Design and Simulation; Prentice Hall: Upper Saddle River, NJ, 2003.
(20) Muske, K. R.; Badgwell, T. A. Disturbance modeling for offset-free linear model predictive control. J. Process Control 2002, 12, 617.
(21) Optimization Toolbox, MATLAB, version 7.0; The Mathworks Inc.: Natick, MA, 2005.
(22) Sistu, P. B.; Bequette, B. W. Model Predictive Control of Processes with Input Multiplicity. Chem. Eng. Sci. 1995, 50, 921.
(23) Johansson, K. H. The quadruple tank process: a multivariable laboratory process with an adjustable zero. IEEE Trans. Control Syst. Technol. 2000, 8, 456.

Received for review February 27, 2006
Revised manuscript received September 2, 2006
Accepted September 5, 2006
IE060246Y