Ind. Eng. Chem. Res. 1999, 38, 1973-1986
A Multiscale Model Predictive Control Strategy
Arun Krishnan and Karlene A. Hoo*
Department of Chemical Engineering, University of South Carolina, Columbia, South Carolina 29208
Multiscale systems defined on trees can provide local time and scale information about the behavior of the process, in contrast to the usual time-domain model and Fourier transforms. Because the model predictive control (MPC) framework uses a model of the process to determine the optimal control action, improving the model by using a multiscale approach will result in controller actions that can compensate for phenomena that may occur at different scales. This work develops multiscale models on trees, describes how these time-scale models can be used in the MPC framework to represent both the process and the disturbance, and proposes a new optimization strategy to determine the controller actions such that the optimal inputs at the finer scales reflect the inputs at the coarser scales. The performance of this multiscale MPC strategy is demonstrated on a continuous process and on a chemical batch reactor.
1. Introduction
Model predictive control (MPC) is the generic name given to a class of model-based control strategies.1-6 MPC provides a systematic framework to solve multivariable, constrained control problems by replacing fixed-parameter control laws with open-loop, on-line optimization. Other advantages of MPC include its ability to handle nonminimum phase behavior and multiple interactions.
Linear MPC uses a convolution model or a linear state-space model of the process in an optimization strategy to determine both the present and future controller actions by using the model to predict future process performance. Because all real chemical processes exhibit some degree of nonlinear dynamic behavior and are subject to unmeasured and measured disturbances, there will be inaccuracies associated with the use of a linear model to the extent that the effect of the process nonlinearities and disturbances is significant.
Thus, the quality of the optimal inputs is heavily dependent upon the identification of a meaningful approximation to the process.7,8 Because the use of a linear model to predict the future process behavior may result in plant-model mismatch, thereby affecting the calculation of the present and future optimal controller actions, only the optimal controller action at the present sampling time is actually implemented. At each subsequent sample time, all of the future controller actions are recalculated taking into account the present measurements of the plant outputs.
Recent research to improve the model in the MPC strategy has focused on the explicit or implicit use of a nonlinear model. The former means the direct use of a nonlinear model in a nonlinear, constrained optimization framework. The latter involves several possibilities, for example, the repeated on-line linearization of a nonlinear model to produce a linear state-space or convolution model.4,9,10 Another is the off-line development of multiple linear state-space models from a representative nonlinear model using the reference trajectory as the reference point.11 Both strategies continue to use linear optimization techniques, and because the model(s) remain(s) linear, linear theory can be used for analysis.
Disturbances also contribute to plant-model mismatch. In the MPC framework, the conventional approach is to assume that the disturbance in the future will be of the same magnitude and direction as that at the present time. Some effort has been made to provide better disturbance estimates, thereby improving the performance of the controller.9,10 It is primarily in the description of the input disturbance that this work proposes improvements by using the emerging field of multiresolution analysis.12,13
Physical phenomena often involve dynamics at different time scales. For instance, in any industrial process, different subprocesses exhibit behaviors on different time scales; sensors too may provide measurements at different sampling rates, resulting in controller actions at correspondingly different rates. It is not unexpected that disturbances may also occur over many frequencies and at different times. To exacerbate the issue, sampling is usually done at a fixed fast rate to retain the smallest feature, thereby producing redundant information for slowly varying features. Many of the typical data representations, however, do not retain the multiscale character of the process. For effective, intelligent process control, what is needed is a method that can represent the multiscale nature of the process. Wavelet theory is one such method. Like Fourier analysis, wavelet theory is a frequency-domain technique that can be applied to both continuous and discrete time signals. However, it is more flexible in that wavelets provide for localization of information in both time and frequency. More importantly, the time-frequency decompositions are done in a way that matches the frequency resolution to the time scale.
* To whom correspondence should be addressed. E-mail: [email protected]. Tel: (803)777-0143. Fax: (803)777-8265.
10.1021/ie980658+ CCC: $18.00 © 1999 American Chemical Society Published on Web 03/31/1999
1974 Ind. Eng. Chem. Res., Vol. 38, No. 5, 1999
Benveniste et al.14 developed a multi-scale theoretical framework for obtaining process representations which capture scale-based characteristics of process models, measurements, and control actions. This work builds upon these results to provide time-scale models of the process and the disturbance identified on trees for use in a multi-scale MPC (ms-MPC) strategy. A new top-down optimization strategy and a multi-scale disturbance estimation algorithm are proposed such that the controller action implemented at the finest scale (at the process level) contains components that reflect the multi-scale distributed nature of the disturbance and the process.
The organization of the paper is as follows. The first section introduces wavelet-based multi-scale representation of signals on trees. Specifically, a linear time-invariant system is used to demonstrate this concept. Section 2 first outlines the time-domain linear MPC strategy and then develops the theoretical framework for ms-MPC. This is followed by the development of the top-down multi-scale optimization strategy and the disturbance estimation algorithm in section 3. Section 4 introduces the candidate processes and demonstrates the ms-MPC controller performance. Last, section 5 summarizes and discusses the important results about multi-scale representations and ms-MPC.
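The receding-horizon mechanism and the conventional constant-disturbance assumption described in this introduction can be illustrated with a small sketch. It is not the ms-MPC algorithm of this paper. It assumes a hypothetical scalar first-order plant, a single input move held over the prediction horizon, no constraints, and illustrative parameter values (`a`, `b`, `w`, `horizon`) that do not come from the source.

```python
def mpc_move(xm, d_hat, r, a, b, horizon):
    """Least-squares input for the scalar model x+ = a*x + b*u.

    The input u is held constant over the horizon (an assumption of this
    sketch) and the disturbance estimate d_hat is assumed constant in the
    future, which is the conventional MPC disturbance model.
    """
    num = 0.0
    den = 0.0
    s = 0.0        # s_i = 1 + a + ... + a**(i-1), effect of u after i steps
    a_pow = 1.0    # a**i, effect of the current model state after i steps
    for _ in range(horizon):
        s = a * s + 1.0
        a_pow *= a
        gain = s * b                    # sensitivity of prediction i to u
        free = a_pow * xm + d_hat       # prediction i with u = 0
        num += gain * (r - free)
        den += gain * gain
    return num / den


def simulate(steps=60, a=0.8, b=0.5, w=0.1, r=1.0, horizon=5):
    """Closed loop: the plant sees an unmeasured constant disturbance w."""
    x = 0.0    # plant state (measured output)
    xm = 0.0   # model state, propagated without the disturbance
    outputs = []
    for _ in range(steps):
        d_hat = x - xm                       # mismatch seen at this sample
        u = mpc_move(xm, d_hat, r, a, b, horizon)
        # Only the move computed for the present sample is applied; the
        # optimization is repeated at the next sample with new measurements.
        x = a * x + b * u + w
        xm = a * xm + b * u
        outputs.append(x)
    return outputs


if __name__ == "__main__":
    trajectory = simulate()
    print("final output:", trajectory[-1])
```

In this sketch the mismatch between the measured output and the model output is the only disturbance information available to the controller, which is the limitation that a richer disturbance description is meant to address.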
2. Multiscale Representation of Signals
A large number of papers about multiresolution signal processing and signal analysis have appeared, in part because of the development of fast algorithms to compute the wavelet transform15,16 and the introduction of orthonormal wavelet functions by Daubechies.17 The term wavelets is used to describe a framework that is a unification of several methods, of which multiresolution analysis is one. Others include sub-band coding, signal processing, speech processing, and data compression. Existing research has demonstrated that wavelets hold considerable potential for feature analysis, numerical computation, system identification, and control. The main feature of wavelet transforms is their ability to represent signals in such a way as to obtain time and scale (analogous to frequency) localization of the signal. Excellent introductions and reviews can be found in refs 18-21. A brief introduction follows.
2.1. The Wavelet Transform. Any continuous signal $f(t)$ can be represented by a sequence of approximations at many scales, with each approximation expressed as the weighted sum of translated and compressed (or dilated) versions of a basic scaling function $\phi(t)$. As an example, the representation of $f(t)$ at the $m$th scale is given as22

$$ f^{(m)}(t) = \sum_{j=-\infty}^{+\infty} f(m,j)\,\phi(2^{m}t - j), \qquad m, j \in \mathbb{Z} \tag{1} $$

where $\mathbb{Z}$ represents the set of integers and $j$ indexes the translations. If the $(m+1)$th approximation is a refinement of the $m$th approximation, then the function $\phi(2^{m}t)$ should satisfy the following criterion:

$$ \phi(2^{m}t) = \sum_{j} h(j)\,\phi(2^{m+1}t - j) \tag{2} $$

where the right-hand side is a weighted sum of translated and dilated basis functions spanning the space of the $(m+1)$th approximation and $h(j)$ are the scaling function coefficients. In addition, the scaling function must satisfy several other properties such as (i) having compact support or fast decay to zero and (ii) forming an orthogonal set with its integer translates. These properties impose certain conditions on the coefficients $h(j)$. In particular, $h(j)$ must be the impulse response coefficients of a quadrature mirror filter.22 The second property can be stated mathematically as

$$ \sum_{k} h(k)\,h(k-2j) = \delta_{j} \tag{3} $$

where $\delta_{j}$ is the Kronecker delta.15,17
The concept of a wavelet transform can be arrived at by considering the incremental detail added to the $(m+1)$th scale approximation from the $m$th scale. Let $V^{(m+1)}$ represent the space of all functions spanned by the orthogonal set $\{\phi(2^{m+1}t - k);\ k \in \mathbb{Z}\}$, corresponding to a finer approximation of functions in $V^{(m)}$, the space of all functions spanned by the orthogonal set $\{\phi(2^{m}t - l);\ l \in \mathbb{Z}\}$. Then, $V^{(m)} \subset V^{(m+1)}$, that is

$$ V^{(m+1)} = V^{(m)} \oplus W^{(m)}, \qquad W^{(m)} \perp V^{(m)} \tag{4} $$

where $W^{(m)}$ is the space that contains the detail removed in the coarser, $f^{(m)}(t)$, representation of the original function $f(t)$. The symbol $\oplus$ is the orthogonal sum. The space $W^{(m)}$ is spanned by the orthogonal translates of a single wavelet function, $\psi(2^{m}t)$. Similar to eq 1, the relation between the $(m+1)$th scale and the $m$th scale is given as23

$$ f^{(m+1)}(t) = f^{(m)}(t) + \sum_{j=-\infty}^{+\infty} \delta f(m,j)\,\psi(2^{m}t - j), \qquad m, j \in \mathbb{Z} \tag{5} $$

Not every wavelet function forms an orthogonal set.19 In fact, wavelets can be classified as orthogonal, semiorthogonal, biorthogonal, and nonorthogonal and still satisfy the criterion to be a wavelet. An example of a nonorthogonal wavelet function is the Poisson wavelet, as developed by Kosanovich et al.24 The wavelet function $\psi(2^{m}t)$ is related to the scaling function $\phi(2^{m+1}t)$ through the relationship23

$$ \psi(2^{m}t) = \sum_{j} g(j)\,\phi(2^{m+1}t - j) \tag{6} $$

where $g(j)$ are the wavelet function coefficients. The coefficients $f(m,j)$ and $\delta f(m,j)$ are given by

$$ f(m,j) = \sum_{k} h(2j-k)\,f(m+1,k), \qquad \delta f(m,j) = \sum_{k} g(2j-k)\,f(m+1,k) \tag{7} $$

The scaling and wavelet functions can be interpreted as low- and band-pass filters, respectively.25 Thus, any continuous signal can be filtered to a desired approximation, $f^{(m)}(t)$, using the scaling function, $\phi(2^{m+1}t)$, as a low-pass filter. Similarly, the wavelet function can be viewed as a band-pass filter that extracts information at a given scale or range of frequencies.21
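The filter interpretation of eqs 3 and 7 can be made concrete with a short sketch. It assumes the standard two-tap Haar pair, h = (1/√2)[1, 1] and g = (1/√2)[1, -1], and it assumes that coarse node j combines the fine-scale samples 2j and 2j + 1; this tap indexing is a choice made for the illustration and is one of several equivalent conventions for the argument 2j - k in eq 7. The function names are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)
H = [1.0 / SQRT2, 1.0 / SQRT2]     # scaling (low-pass) taps, assumed Haar
G = [1.0 / SQRT2, -1.0 / SQRT2]    # wavelet (band-pass) taps, assumed Haar


def shifted_product(taps, j):
    """Left-hand side of eq 3: sum_k taps(k) * taps(k - 2j).

    Taps are treated as zero outside their two-sample support.
    """
    total = 0.0
    for k in range(len(taps)):
        n = k - 2 * j
        if 0 <= n < len(taps):
            total += taps[k] * taps[n]
    return total


def analyze(fine):
    """One coarsening step in the spirit of eq 7 (scale m+1 to scale m).

    Returns the scaling coefficients f(m, j) and the detail
    coefficients delta-f(m, j) for an even-length list of samples.
    """
    if len(fine) % 2 != 0:
        raise ValueError("signal length must be even")
    coarse, detail = [], []
    for j in range(len(fine) // 2):
        pair = (fine[2 * j], fine[2 * j + 1])
        coarse.append(H[0] * pair[0] + H[1] * pair[1])
        detail.append(G[0] * pair[0] + G[1] * pair[1])
    return coarse, detail


def synthesize(coarse, detail):
    """Inverse step: rebuild the finer scale from approximation plus detail,
    the discrete counterpart of eq 5."""
    fine = []
    for a, d in zip(coarse, detail):
        fine.append(H[0] * a + G[0] * d)
        fine.append(H[1] * a + G[1] * d)
    return fine


if __name__ == "__main__":
    signal = [4.0, 2.0, 5.0, 5.0]
    approx, det = analyze(signal)
    print("approximation:", approx)
    print("detail:", det)
    print("reconstruction:", synthesize(approx, det))
```

Because the assumed filters are orthonormal, the energy of the fine-scale samples equals the combined energy of the approximation and detail coefficients, and the synthesis step recovers the original samples up to rounding.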
average of the values at the two descendent nodes (m + 1, 2j) and (m + 1, 2j + 1). Let {f(0,0), f(0,1), ..., f(0,2k), f(0,2k+1), ...} be a sequence of discrete-time values of f(t). Filtering f(t) using the Haar coefficients given as23
$$ h = \frac{1}{\sqrt{2}}\,[1 \;\; 1], \qquad g = \frac{1}{\sqrt{2}}\,[1 \;\; -1] \tag{11} $$
generates the scaling and wavelet coefficients at scale m = -1,
f(-1,k) =
Figure 1. Translations and dilations of the Haar wavelet. In all of the panels, the dash-dotted line represents ψ(0,0). Top panel: the dotted line represents translation, ψ(0,2). Middle panel: the dotted line represents compression, ψ(2,0). Bottom panel: the dotted line represents both compression and translation, ψ(2,2).
The simplest example of a scaling function and its corresponding wavelet function is the Haar function (or Daub 1)23
φ(t) ) ψ(t) )
{
1 0et