Predictive controller design by principal components analysis

Predictive controller design by principal components analysis ... On the Tuning of Predictive Controllers: Inverse Optimality and the Minimum Variance...
1 downloads 0 Views 1MB Size
1204

I n d . Eng. Chem. Res. 1988,27, 1204-1212

Predictive Controller Design by Principal Components Analysis Paul R. Maurath,+Alan J. Laub,t Dale E. Seborg, and Duncan A. Mellichamp* Department of Chemical and Nuclear Engineering, University of California, Santa Barbara, California 93106

A new method of designing predictive controllers has been developed t h a t is based on a singular value analysis of the process dynamics. T h e primary design parameter in the new method is the number of principal components of the system generalized inverse t h a t are retained in the approximate process inverse used by the controller. T h e effects of retaining each of the individual components on closed-loop performance and robustness can be easily calculated. Choices of other controller design parameters have a minimal impact on the results of the new method. A particular feature is t h a t explicit move suppression is not required. The method works particularly well on multiple-input/multiple-output (MIMO) processes and tolerates changes in process scaling and output (SISO) weighting. Application of the method is illustrated with a simple single-input/single-output process model and with two MIMO distillation column models. The implementation of predictive controllers for both single-input/single-output(SISO) processes and multiple-input/multiple-output (MIMO) processes has been discussed in a number of recent papers. A particular form of predictive control, Dynamic Matrix Control (DMC), was developed in an industrial (refinery) control environment (Cutler and Ramaker, 1980; Cutler, 1983; Garcia and Prett, 1986) and only later became the subject of academic research. DMC and other predictive controllers such as Model Algorithmic Control (Richalet et al., 1978; Rohani and Mehra, 1982; Mehra et al., 1982 introduced a very different philosophy of control, one that could utilize the power of modern process computers to deal directly with the computationally intensive task of computer feedback signals based on predicted future outputs of the process. Predictive controllers typically utilize a sampled-data representation (model) of the process to predict and optimize the process outputs over a finite future time interval. The independent variables in the optimization procedure are the future values of the manipulated variables. Thus, predictive control such as DMC involves a sequence of operations: (1)At each sampling time, a discrete convolution model of theprocess (either a step or impulse response model) is used to predict the trajectories of the process outputs over the chosen finite time interval (a design parameter called the optimization horizon). (2) A sequence of process inputs is calculated which optimizes these trajectories. (3) Only the first set of calculated process inputs is implemented, after which the entire procedure beginning a t (1)is repeated again at the next sampling time. The use of this shifting (or receding) optimization horizon and the implementation of only the first set of values of the optimized input sequence have a number of consequences for the resulting predictive controller. In particular, because the optimization interval is finite, the form of the controller is not restricted to state feedback (usually proportional feedback) as in classical optimal control theory. Since the finite control interval, i.e., the length of the computed input sequence, usually is chosen to be smaller than the output optimization horizon, the optimization problem reduces to a least-squares formulation. Finally, because the process outputs are measured at each

* Author to whom correspondence should be sent.

Present address: Procter and Gamble Company, Sharon Woods Technical Center, 11530 Reed Hartman Highway, Cincinnati, OH 45241. Present address: Department of Electrical and Computer Engineering, University of California, Santa Barbara, CA 93106.

*

sampling time and the entire prediction/optimization procedure is repeated, the method is robust (insensitive to model errors) and deals easily with disturbances. Perhaps as a consequence, DMC and other similar predictive control methods have been adopted relatively widely in the process industries in contrast to classical optimal control methods. In implementing DMC, it was early recognized that the optimization procedure was subject to the same type of ill-conditioning problems as any other least-squares procedure. Previous methods of dealing with this problem, e.g., Marchetti (1982) and Marchetti et al. (1983),utilized a move suppression factor which is equivalent to the ridge regression factor introduced by statisticians much earlier in dealing with ill-conditioned (least-squares) parameter estimation. The primary contribution of the present paper is to show that a principal components analysis based on singular value decomposition methods can be used to determine the important design parameter values more easily while furnishing more guidance to the designer in making critical decisions. In particular, no trial-and-error search is required in order to specify the move suppression factor. This paper utilizes the theoretical and notational developments given in a number of earlier publications (Marchetti, 1982; Marchetti et al., 1983; Maurath, 1985; Maurath et al., 1985, 1988).

The Predictive Controller Optimization The optimization problem that is embedded in the predictive controller algorithm has been developed in Maurath et al. (1988); the objective function to be minimized can be written as

J(Au) = (AAu - E ‘ ) T Q T Q ( A A- ~E’)

(1)

where E’is a vector of predicted deviations from the set point a t the next R sampling times, assuming no future control action, Au is a vector of the next L input changes (to be calculated), A is the R X L system “Dynamic Matrix” (Cutler, 1983), and Q is a positive weighting matrix, usually chosen to be diagonal. R is the prediction horizon, the number of future sampling periods included in the optimization; and L is the control horizon, the number of sampling periods for which control changes will be computed. Then AAu gives the effect on the system output a t the next R sampling times of the next L input changes. For purposes of the optimization, it is assumed that the input is not changed after the next L changes have been implemented. This problem is in a standard leastsquares form; the solution can be written in several forms.

0888-5885/88/2627-1204$01.50/0 0 1988 American Chemical Society

Ind. Eng. Chem. Res., Vol. 27, No. 7, 1988 1205 By use of the normal equations and assuming that the A matrix is of full column rank, the solution can be written as Au = (ATQTQA)-‘ATQTQE’ The solution can also be expressed in the form

(2)

where (QA)+is the Moore-Penrose pseudoinvese (Penrose, 1956) of QA. If matrix A is of full column rank, these two solutions are equivalent. However, when matrix A is nearly rank-deficient, the problem is ill-conditioned; i.e., small changes in the elements of A can result in large changes in the elements of the solution, Au One measure of how much the elements of the solution Au can change is the so-called condition number of QA with respect to pseudoinversion. This number, which arises naturally from a first-order perturbation analysis of the problem, is given by IIQ~II~II~QA~~II where 11-11 denotes the matrix %norm. When A has full column rank, this condition number is given by the square root of the ratio of the largest to smallest eigenvalues of ATQTQA. The formulation of the solution using eq 2 above is particularly ill-conditioned. Statisticians working with linear least-squares parameter estimation, a problem identical in form with this optimization, have observed that the computed solution can be significantly different from the “true” solution if ill-conditioning is involved. Control problems invariably are ill-conditioned for values of the R and L parameters usually assumed.

.

Ridge Regression Several approaches have been suggested for dealing with this ill-conditioning. Some of these modifications to the standard least-squares problem were examined in a simulation study by Dempster et al. (1977). One of the earliest and most widely used is called “ridge regression” (Hoed, 1962). In ridge regression and its variants, an additional term is added to the performance index which weights the solution vector elements. By use of this same principle, the modified performance index in eq 1for the ridge regression (RR) approach is JRR(Au)

= [(AAu - E’)TQTQ(AA~ - E’)

+~Au~Au]

(4) where f is a positive scalar called the ridge parameter or move suppression parameter. Thisapproach has been used in several previous predictive control studies beginning with Marchetti (1982) because the controller designed by using eq 2 resulted in control input changes which were unacceptably large. The real cause for the poor control was that, as the R and L parameters become large, matrix A becomes very ill-conditioned with respect to pseudoinversion. Weighting vector Au reduces the size of the control inputs, as would be expected. The solution to the modified problem is AuRR

+

= (ATQTQA fI)-’ATQTQE’

(5)

One disadvantage of using this approach to design a predictive controller is that the move suppression parameter must be chosen empirically with few guidelines available to aid in its selection. An approach to choosing f based on generalized cross-validation in statistics is described by Golub et al. (1979). A different way of handling poorly conditioned leastsquares problems is called principal components analysis. This approach uses a singular value decomposition of matrix QA whereby orthogonal transformations are used

to enable a more direct determination of the effective rank of the matrix. By use of a lower rank approximation to the matrix, a solution can be determined which results in only a slightly larger problem residual (poorer control) with a solution vector of smaller norm (smaller control input changes and better robustness). This approach also produces information useful for choosing the ridge (move suppression) parameter, if that approach to the problem is preferred. Singular Value Decomposition Singular value decomposition (SVD) is a very effective matrix decomposition technique which can be used for many purposes, e.g., to solve sets of linear algebraic equations or linear least-squares problems. General information on the applications of singular value decomposition and its numerical calculation can be found in a number of sources, including Stewart (1973) and Klema and Laub (1980). Given an m X n matrix, QA, of rank r , there exist orthogonal matrices U ( m X m) and V (n X n) and a positive diagonal matrix S ( r X r ) such that QA = UZVT

(6)

where (7)

The QA matrix has min (m,n) singular values, r of which are non-zero. The ith singular value is denoted by oi.The algorithms which calculate these matrices normally arrange the singular values in order of descending magnitude. In this case, the condition number of QA is given by ul/u,. The SVD can be used to compute the pseudoinverse of the QA matrix in eq 3. It is easily verified that (QA)+ = VZ+UT

(8)

where

s-’ E+=[ 0

0

01

The computation of the SVD has good numerical properties and is the preferred method for finding a matrix pseudoinverse. The SVD is also the most reliable method for numerically determining the rank of a matrix (Stewart, 1973). See also section 1II.C. of Laub (1985) and the references therein for further remarks on numerical rank. Principal Components Analysis The SVD approach provides a simple way of transforming the predictive controller least-squares problem into orthogonal variables. We can rewrite the optimization problem in terms of vector norms as

J(Au) = IlQAAu - QE’11z2

(10)

By use of the orthogonality of U and V and the fact that the Euclidean norm is unitarily invariant, i.e., if U is orthogonal, llUXll2 = llxllz (11) the optimization problem can be rewritten as J(w)=

IlZw - t41z2

(12)

where w = VTAu

(13)

g = UTQE’

(14)

1206 Ind. Eng. Chem. Res., Vol. 27, No. 7, 1988

The solution ( w = Z'g) to this problem is trivial using eq 9. The elements of vector ware the principal components of the solution to the least-squares problem. The desired solution A u can be calculated by using the relationships in eq 13 and 14. Since V is orthogonal, vectors w a n d Au have the same norm. This property can be exploited further, as is noted below. Each of the principal components can be easily calculated. Each also contributes to improving the solution by reducing the residual (improving the control). If the ith component is excluded from the solution w, = 01, the residual increases by.:g Each component used in the solution also increases the squared norm (or magnitude) of the w vector by w,2, implying that the control input changes are increased accordingly. In a poorly conditioned problem, the components which correspond to small singular values contribute very little to reducing the residual @: is small) but they greatly increase the norm of the w vector (w: is large). Under these conditions, a more appropriate solution may be obtained by excluding these components from the computed solution. The sensitivity of the resulting approximate solution, i.e., the robustness of the solution to modeling errors, is also reduced compared to the solution of the full problem using all components.

The Relationship between Principal Components and Ridge Regression The ridge regression and principal components techniques both attempt to find an "appropriate" solution to the problem by solving a slightly modified minimization problem. If we rewrite eq 5 using the SVD of QA and the orthogonality of the U and V matrices, the following expression for the ridge regression solution can be obtained: AURR

= VhR+UTQE'

The ridge regression method has the advantage of allowing a gradual change in the weighting of the solution components. If two components have singular values of comparable size, it is not possible to eliminate the smaller singular value component from the solution without affecting the adjacent singular value component. The component selection method has just this capability. Examination of the g and w vectors can also provide a great deal of information as to which components contribute significantly to improvement in the control and which components only increase the size of the resulting control moves while increasing the sensitivity to modeling errors. This information allows the designer to gauge the importance of the individual components and to make informed design decisions in developing a predictive controller. These results can also be useful in applying the ridge regression approach; however, the component selection method offers definite design advantages for predictive controllers because of the explicit robustness information that it provides.

The Component Selection Design Procedure The component selection design procedure follows the steps outlined earlier for principal components analysis of any linear least-squares problem. The first step is to perform a singular value decomposition of the QA matrix. The problem can then be reduced to the form of eq 12. In order to complete this minimization, the error vector E' must be specified. In the ordinary least-squares application, it would be part of the problem data set. However, in the control application, the error vector changes at each sampling time. For purposes of controller design, E' is specified so the controller is optimized for set-point changes. In a SISO situation, E' takes the following form (for a unit step set-point change)

E' = (l,l,...,l)T

where

and

Equation 17 assumes that f is non-zero. Some of these diagonal elements may be zero or very small. Using the principal components analysis and retaining k ( S r ) components (the component selection method), we obtain the following solution: Aucs = VZcs+UTQE'

(18)

where

E; = (0,...,0,1,...,1IT Note that the matrices ERR+and ZCS' are diagonal. The ridge regression and principal components methods can be viewed as two techniques for weighting the components of the least-squares solution. The ridge regression method weights all r components with the factors in eq 17, which are bounded by the inverse of the singular value and 0. I f f dominates u;, the component will be effectively eliminated from the solution. If a? dominates f , then the component will be unaffected by the inclusion of the ridge parameter. In between there is a range off values which reduces the impact of the individual component. The component selection method eliminates components from the solution entirely or leaves them unaffected.

(20)

The formulation of the optimization for a multiple-input/multiple-output (MIMO) process is only slightly more complicated than that for SISO systems. The error vector E'and the input change vector Au are constructed with a partition for each output in E' and each input in Au . The MIMO dynamic matrix A is then a block matrix with each block taking the form of the SISO A matrix; i.e., it is built of the step response models relating each input to each output. The final formulation results in the same form of the optimization problem as given above. For thorough design with an MIMO system, the design computations should be performed for set-point changes in each variable. For a 2 X 2 system, the two E'vectors to be tested are E,' = (1,...,1,0,...,O)T (21) (22)

Some controllers designed for set-point changes, e.g., minimal prototype controllers, may not perform well for load disturbances. This is not a serious problem for predictive controllers because of the way they handle feedback from the process. A predictive controller reacts to a negative step change in a disturbance added to the process output in the same way it would react to an equivalent positive set-point change in that output. Once the SVD has been completed and the w and g vectors formed, a table can be constructed for the designer to summarize useful information about the components. For each component, several quantities can be easily computed and displayed: (1)the associated singular value ai;(2) the contribution, g:, that component makes toward

Ind. Eng. Chem. Res., Vol. 27, No. 7, 1988 1207 improving the control by reducing the residual (this residual reduction is the improvement in the objective function obtained by computing a value of wi instead of removing the component from the solution and setting wi to 0); (3)the remaining residual assuming that the current and all previous components (those associated with larger singular values) are retained in the controller; (4) an indicator of the norm of the solution vector for the controller (the vector of future input changes) which includes the current and all previous components. The two residual columns can be normalized by the largest possible residual, which would result if no control action is taken and the pseudoinvese is set equal to the zero matrix. Two convenient measures of the control input are available: the norm of the solution vector and the sizes of the first input changes. The first input changes are particularly helpful to the designer working with an MIMO system because they can be compared easily to the final steady-state input change required for the set-point change being examined. The first computed change is also the only input computed that actually is implemented at any sampling time (Cutler, 1983). In our experience, the first input changes computed are also the largest elements of the solution vector. A good rule of thumb to follow in developing the design is that the initial input changes (for each input) should not exceed 2 times the steady-state change in that input. This guideline ensures a physically reasonable control policy. The individual designer may wish to impose a stricter guideline. The normalized residual columns and the first control move in an SISO system can be computed as follows:

Table I. Fifth-Order System Principal Components Results (L= 10, R = 50, N = 60, T. = 2) component “i 1 1.73 X 2 1.84 X 3 3.35 X 4 6.64 X 5 1.43 X 6 3.45 X 7 9.41 x 8 2.86 X 9 9.44 X 10 3.39 x

-0.251 0

( G F ) ~ (Gdi 10’

0.769 loo 0.095 lo-’ 0.050 0.032 0.021 0.014 10-4 0.009 lo4 0.006 0.004 10-6 0.002

0.231 0.137 0.087 0.055 0.035 0.021 0.011 0.005 0.002 0.000

(Wdi 1.25 X 1.49 X 2.30 X 3.74 X 5.38 x 6.24 X 5.69 X 4.23 X 2.34 x 9.85 x

I

I

I

1

10

20

30

40

lo-’ loo 10’ lo2 103 lo4 lo6 lo6 107 107

AUl

0.119 0.682 3.52 11.4 35.7 90.7 188

so

TIME

Figure 1. Three-, four-, and five-component controllers. Fifth-order system. L = 10,R = 50,N = 60,T = 2.

With this information the control system designer can make a trade-off between control system performance, quantified by the Gs figures, and control input energy, quantified by Au,. Lower values of Aul in the final controller will also contribute to increase robustness. The final value of Gs also indicates the limit of control using the specified R and L values selected for analysis. Applications of the New Controller Design Met hod A Simple Example. To illustrate the application of the SVD to predictive controller design, consider control of the following SISO process model: ~ ( s=) i/(5s + 1)5

(26)

A sampling period of T,= 2 was used, and 60 terms were retained in the convolution model ( N = 60). The optimization and control horizons were chosen to be 50 and 10 sampling periods, respectively (R = 50, L = 10). Equal weighting of all outputs was used. All SVD calculations were performed by using a single precision version of the IMSL subroutine LSMF on a Data General S-130Eclipse minicomputer. The least-squares problem using this process model and the chosen parameters are very poorly conditioned. A straightforward attempt to solve the problem using eq 2

might result in an inaccurate solution due to the extreme sensitivity of the solution to small changes in the process A matrix. The results of the component analysis appear in Table I. The third column of the table traces the contribution of each component in reducing the objective function (G). The table clearly shows that the last three components contribute approximately 1% to reduction of the objective function while increasing the square of the norm of the input vector by a factor of almost 200. In this case, each component in the solution contributes less to residual reduction that all those with larger singular values. The designer must decide which components corresponding to small singular values to drop from the solution. Figure 1 shows closed-loop system step responses for controllers designed with three, four, and five components retained. Closed-loop systems with controllers designed using five or more components showed significant sensitivity to errors introduced during integration of the system equations. The source of this sensitivity can be seen in the last column of Table I. The initial control change in response to a unit step change in set point is greater than 35 times the steady-state change in the input for controllers with five or more components. Such controllers are totally unrealistic for any process subject to significant disturbances, nonlinearities, or modeling errors. The three-component controller is the most reasonable choice for this system, although the four-component controller may be

1208 Ind. Eng. &em. Res., Vol. 27, No. 7, 1988 Table 11. Principal Components Analysis of Wood and Berry Model 6, GF GS (mF)l (A~B),

GF

GS

0.010 0.000

0.626 0.197 0.151 0.129 0.062 0.058 0.038 0.038 0.028 0.028

0.002 0.026 0.027 0.037 0.094 0.098 0.169 0.168 0.237 0.242

-0.003 -0.012 -0.023 -0.024 0.001 -0.021 -0.012 -0.016 -0.021 -0.011

0.365 0.107 0.381 0.023 0.035 0.018 0.003 0.012 0.000 0.006

0.635 0.528 0.147 0.124 0.089 0.070 0.068 0.056 0.056 0.049

0.002 -0.010 -0.012 -0.001 -0.042 -0.034 -0.060 -0.080 -0.085 -0.120

(AVB)~ -0.003 0.001 -0.026 -0.049 -0.067 -0.113 -0.116 -0.183 -0.183 -0.245

0.000

0.001

0.875

0.006

0.001

0.040

-0.024

-0.390

1 2 3 4 5 6 7 8 9 10

305.50 51.92 47.91 14.36 9.30 6.04 3.95 3.09 2.45 1.85

0.374 0.429 0.047 0.022 0.066 0.004 0.020

io

0.16

0.000

acceptable for some applications. Selection of a move suppression factor f can also be guided by these component analysis results. As noted above, a component will effectively be removed from the solution if u: