The Gradient Method in Process Control


S. M. ROBERTS and H. I. LYVERS
TRW Computers Co., Beverly Hills, Calif.
(Present address: Bonner and Moore Engineering Associates, Adams Petroleum Center, 6910 Fannin, Houston 25, Tex.)

A flexible, computable, efficient technique for handling constrained or unconstrained optima

IN CONTROLLING A PROCESS, the shape, position, and orientation of both the response surface and the constraints may vary. The gradient method is a powerful tool for ferreting out optimum conditions. The method, which is readily mechanized for on-line control, can handle linear and nonlinear objective functions with linear and nonlinear constraints. In the gradient method there are three problems:

• to find the direction of the gradient vector at the base point
• to ascertain how far to move along the gradient vector to locate the next point
• to determine when to terminate the calculation

Optimization Models

Optimization models of chemical processes are often formulated using three general types of equations. The first expresses the object of optimization; commonly the objective is to maximize profit or to minimize costs. The second consists of relationships that describe the metamorphosis of raw feed material into products; included in this category are the heat balance, material balance, and reaction rate equations. The third type expresses the constraints. These equations limit the process in some way, perhaps by upper and/or lower bounds on certain variables or conditions.

A distinction between the second and third types of relationships is that the second type is true for all times. A material balance or heat balance, for example, has general validity, whereas the constraints are valid only over certain limited ranges. For example, a feed rate limitation or maximum heat transfer rate limitation may be changed by the installation of a new feed pump or a larger heat transfer surface.

There are several classes of variables: the manipulated, the disturbance, and the intermediate. The manipulated variables are those which the operator can set (by, say, turning valves). The disturbance variables are those over which the operator has no control, for example, ambient temperature. Between these is a third class designated as intermediate variables, which may be partially controlled. Included may be such variables as catalyst activity.
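To make the three kinds of equations and the variable classes concrete, the sketch below sets up a small hypothetical model in Python. The profit expression, the conversion correlation, and the bounds are invented purely for illustration; they do not come from the article or from any particular process.

```python
# Hypothetical two-variable process model illustrating the three equation types.
# feed_rate and reactor_temp play the role of manipulated variables.

def conversion(feed_rate, reactor_temp):
    """Second type: a process relationship (for example, a reaction-rate correlation)."""
    return 0.9 * reactor_temp / (reactor_temp + 50.0 * feed_rate)

def profit(feed_rate, reactor_temp):
    """First type: the objective, here product value minus feed and heating costs."""
    product = feed_rate * conversion(feed_rate, reactor_temp)
    return 12.0 * product - 3.0 * feed_rate - 0.05 * reactor_temp

# Third type: constraints bounding the permissible operating region.
constraints = [
    lambda f, t: f <= 100.0,   # feed pump capacity
    lambda f, t: t <= 450.0,   # maximum allowable temperature
    lambda f, t: f >= 10.0,    # minimum turndown
]

def feasible(feed_rate, reactor_temp):
    return all(c(feed_rate, reactor_temp) for c in constraints)
```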

Computer Control and Gradient Methods. Gradient methods can be applied to computer control of chemical processes. All of the situations depicted below can be handled well by gradient methods. The use of the gradient method for computer control involves searching the mathematical model for the optimum response. The experimentation and the step-by-step movements up the response surface are done mathematically. Once the optimum is found mathematically, the process itself can be moved to the optimum conditions.

If the mathematical search for the optimum takes a long time, the process conditions which triggered off the optimization routine may very well have changed by the time the optimum is found by the computer. In this case, it will pay for the computer controller to start guiding the process toward the optimum before the optimum is actually found, by using intermediate results which give a higher response than the current plant operating conditions. By using the mathematical model to find the optimum, the computer never sends the process outside of its constraints. After the optimum is found mathematically, the process is physically sent to the optimum point in a fashion that is always within the permissible region. The speed and the manner of moving to the optimum point must be determined by the process dynamics.

Classification of Steady State Optimization Problems

Mathematically, relationships of the second type may be substituted into the objective function to express algebraically (or graphically) the objective function as a function of the manipulatable variables. When plotted, this gives a response surface of the objective contours vs. these variables. On this same plot may be drawn the various constraints as functions of the manipulatable variables. Using the response surface-constraints plot for reference (Figure 1), a classification system for steady-state optimization models is shown below as a matrix of 16 possible combinations. The conditions describing the response surface in the ith column and the constraints in the jth row identify the category i,j.

Invariant response surface shape means that the relative position of any point on the response surface does not change with respect to the principal axes of that surface. This still allows for change of its relative position and orientation with respect to the coordinate axes. Relative position refers to the location of the center of the response surface relative to the axes of the plot. Orientation refers to the angles between the coordinate axes of the plot and the principal axes of the response surface. Similar statements apply for the constraints.

In terms of the classification system, one can think of how to cope with this variability. A study of the process and the constraints and their relative variability will be helpful in determining whether computer control is desirable. The general approach of Box (1) has been to determine the response surface as evaluated over a relatively long period of time. By continually evaluating and re-evaluating the response surface, Box is able to follow the long-term trends in the shape, position, and orientation of the response surface as it moves with respect to the axes.

Figure 1. Response surface and constraints plots

Response Surface and Constraints Classification Table

Each index denotes a condition of the shape, position, and orientation: 1, all three invariant; 2, shape and position invariant, orientation varies; 3, shape and position vary, orientation invariant; 4, shape, position, and orientation all vary. The condition of the response surface gives the column index i, and the condition of the constraints gives the row index j, of category i,j.

                               Response Surface
 Constraints     1                  2                  3       4
 1               1,1 (Figure 1, A)  2,1 (Figure 1, C)  3,1     4,1
 2               1,2 (Figure 1, B)  2,2                3,2     4,2
 3               1,3                2,3 (Figure 1, D)  3,3     4,3
 4               1,4                2,4                3,4     4,4

Where there is migratory behavior of the response surface and constraints over a short period of time, computer control makes it possible to cope with the shift in the optimum operating conditions due to the movement of the response surface and/or the constraints. The methods of Box, as well as other methods, have been used to develop the response surface for computer control. Through frequent measurements and analyses taken on-line in the plant, the computer is able to re-evaluate the coefficients in the equations. In this manner, the computer develops and utilizes equations that represent the current state of the system.

The category 1,1 describes the situation where one determination of the response surface and constraint equations suffices for all times. This case may not require a control computer. All the other categories may be candidates for on-line computer control.

Practical Examples

A response surface invariant with respect to shape and/or position and/or orientation may exist where the process is unaffected by ambient conditions. The feed stock may be of such uniformly high quality that uncertainty in feed quality is not a consideration. In addition, the catalyst may be so long-lived that its activity is essentially constant. On the other hand, variations in the response surface shape, position, and orientation may be caused by the effects of changing ambient conditions, feed quality, and catalyst degradation, among others.

The constraints may vary owing to disturbance variables such as ambient conditions. Typical of this is the compressor capacity constraint, which is a function of cooling water temperature. On the other hand, the constraints may be invariant. For example, the constraint on an overhead condenser may be a cooling duty; the quantity of water flowing over the coils may be so large that the required heat is removed despite variations in the cooling water temperature.

Gradient Method

In view of the situations described by the classification system, one would like to employ an optimization technique that can cope with the complexities of variability in the response surface and in the constraints. The technique should be flexible, computable, and efficient. The gradient method (2-5) meets these requirements and is one of the most powerful means to ferret out optimum response. It can be used with linear or nonlinear response functions, with or without constraints, with linear or nonlinear constraints. The technique is based on the calculus, which states that the most rapid change in a response is achieved by moving along the gradient (which is orthogonal to the response surface at the initial point). If the surface is represented by

$$Y = f(x_1, x_2, x_3, \dots, x_n) \qquad (1)$$

the total derivative of Y with respect to the distance s is

$$\frac{dY}{ds} = \sum_{i=1}^{n} \frac{\partial Y}{\partial x_i}\,\frac{dx_i}{ds} \qquad (2)$$

The partial derivative terms ∂Y/∂x_i are proportional to the direction cosines of the vector orthogonal to the response surface. In particular, the component of the normal to the surface in the ith direction may be normalized to give

$$\cos \alpha_i = \frac{\partial Y/\partial x_i}{\left[\sum_{j=1}^{n} \left(\partial Y/\partial x_j\right)^2\right]^{1/2}} \qquad (3)$$

which is evaluated at an initial or base point.

Figure 2. Response surface for Y = 5x1³ - 3x1²x2 + x2²

Figure 3. Value of tangent along x2 = 0.07697(x1 - 1.38) + 3.00. This line is based on the gradient vector evaluated at point (1.38, 3.00)

In the gradient method, there are three problems that must be solved: to determine the direction of the gradient vector at the base point, to determine how large a step should be taken in the direction of the gradient, and to determine when to terminate the calculation. The problem of direction is solved by evaluating Equation 3.
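As a numerical check on Equations 1 to 3, the sketch below estimates the partial derivatives of the example surface of Figure 2, Y = 5x1³ - 3x1²x2 + x2², by central differences and normalizes them into direction cosines. The finite-difference step h is an arbitrary choice made for the sketch.

```python
import math

def response(x):
    """Example surface of Figure 2: Y = 5*x1^3 - 3*x1^2*x2 + x2^2."""
    x1, x2 = x
    return 5.0 * x1**3 - 3.0 * x1**2 * x2 + x2**2

def gradient(f, x, h=1e-6):
    """Central-difference estimates of the partial derivatives dY/dxi."""
    g = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        g.append((f(xp) - f(xm)) / (2.0 * h))
    return g

def direction_cosines(g):
    """Equation 3: normalize the gradient components by the gradient length."""
    length = math.sqrt(sum(gi * gi for gi in g))
    return [gi / length for gi in g]

g = gradient(response, [1.38, 3.00])
print(direction_cosines(g))
# The ratio g[1]/g[0] is about 0.077, the slope of the tangent line in Figure 3.
```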

The problem of step size is somewhat more knotty in that the size of the step in the x1, x2, x3, ..., xn directions depends a great deal on the type of surface. If the step is too small, it takes too long to ascend the surface. If the step is too large, it is possible to overshoot the optimum or to miss certain important features of the surface. The size of the step in each direction is evaluated from the values of the partial derivatives at an initial point. Only at the base point is the gradient strictly orthogonal to the response surface; if the steps in each of the i directions are too large, then the vector from the base point will not be orthogonal to the response surface at the new point. The choice of a satisfactory step size implies that the derivatives at the next point are "reasonably close" to those at the base point.

The problem of terminating the calculation may be solved by establishing certain rules. One satisfactory procedure is to store in the computer memory the tentative maximum yield as it is calculated. If the next iteration produces a smaller yield, the computer continues for as many as k more gradient steps. If the computer does not find a higher yield within the k steps, the tentative maximum yield is accepted as the optimum. If the computer does find a higher yield within the k steps, the computer continues its search for a tentative maximum. The number k is determined by running many case histories for a sufficient number of iterations that, in general, the maximum yield will fall within the k iterations.

Step Size. One is not free to choose the sizes of the incremental steps, Δx_i, independently; they are set by the direction cosines of Equation 3 and a constant of proportionality K,

$$\Delta x_i = K\,\frac{\partial Y/\partial x_i}{\left[\sum_{j=1}^{n} \left(\partial Y/\partial x_j\right)^2\right]^{1/2}} \qquad (4)$$

evaluated at the base point. Having moved to the new point by these increments:

• Evaluate the gradient at the new point.
• Evaluate the new Δx_i to yield the third point.
• Repeat the process until the optimal response is reached.
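A minimal sketch of this procedure is given below: each step is K times the direction cosines of Equation 3 (that is, Equation 4), and the search stops when no higher response is found within k further steps. It reuses gradient() and direction_cosines() from the previous sketch; the values K = 0.25 and k = 5 are arbitrary choices for the illustration, since the article leaves k to be fixed from case histories.

```python
def gradient_ascent(f, x0, K=0.25, k=5, max_steps=200, h=1e-6):
    """Climb f by steps of length K along the normalized gradient (Equation 4).

    Terminate when k further gradient steps fail to improve on the tentative
    maximum, as suggested in the text.
    """
    x = list(x0)
    best_x, best_y = x, f(x)
    steps_since_best = 0
    for _ in range(max_steps):
        cosines = direction_cosines(gradient(f, x, h))
        x = [xi + K * ci for xi, ci in zip(x, cosines)]   # Equation 4 step
        y = f(x)
        if y > best_y:
            best_x, best_y, steps_since_best = x, y, 0
        else:
            steps_since_best += 1
            if steps_since_best >= k:   # accept the tentative maximum
                break
    return best_x, best_y
```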

Merely moving in the general direction of the gradient will enable one to find the optimum point, albeit after many more trials and a longer period of time. One drawback to the gradient method is that it finds only the local optimum; therefore the likely region must be explored by using several different starting points. The validity of the gradient method is limited, of course, to the region over which the data were taken to generate the response surface.

There is no way to determine beforehand what the value of K should be to fix the size of the Δx_i steps. In sophisticated systems, the value of K may be changed depending on the position of the operating point on the response surface. One way to guarantee that the size of the K factor is satisfactory is to choose it arbitrarily and then evaluate the direction cosines at the base point and the next point so that

$$|A_i - B_i| \le c \qquad (7)$$

where A_i is the direction cosine in the ith direction at the base or initial point, B_i is the direction cosine in the ith direction at the next point, and c is the maximum permissible difference. If the conditions of Equation 7 are not met, the size of the assumed K is reduced, and the procedure is repeated until the conditions of Equation 7 are met.
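A sketch of this safeguard is shown below, again reusing gradient() and direction_cosines() from the earlier snippets. The tolerance c and the reduction factor are arbitrary choices; the article only states that the assumed K is reduced until Equation 7 is satisfied.

```python
def satisfies_equation_7(f, x, K, c=0.05, h=1e-6):
    """Compare direction cosines at the base point and at the point reached
    with step size K; they must agree within c (Equation 7)."""
    base = direction_cosines(gradient(f, x, h))
    trial = [xi + K * ci for xi, ci in zip(x, base)]
    new = direction_cosines(gradient(f, trial, h))
    return all(abs(a - b) <= c for a, b in zip(base, new))

def choose_K(f, x, K=0.25, shrink=0.5, c=0.05):
    """Reduce an arbitrarily chosen K until Equation 7 is met."""
    while K > 1e-6 and not satisfies_equation_7(f, x, K, c):
        K *= shrink
    return K
```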

Methods of Finding the Next Point. There are three ways of finding the location of the next point. The most general method is given above: pick a constant K and use Equation 4 to find the Δx_i. The new point is found from (x_1)_0 + (Δx_1)_0, (x_2)_0 + (Δx_2)_0, ..., (x_n)_0 + (Δx_n)_0.

A second method is to write the equation of the normal line to the surface at (x_1)_0, (x_2)_0, ..., (x_n)_0,

$$\frac{\Delta x_1}{(\partial Y/\partial x_1)_0} = \frac{\Delta x_2}{(\partial Y/\partial x_2)_0} = \dots = \frac{\Delta x_n}{(\partial Y/\partial x_n)_0} \qquad (8)$$

If the value of any Δx_i is chosen arbitrarily, then the other Δx terms can be found from Equation 8. Again the new point is found from (x_1)_0 + (Δx_1)_0, (x_2)_0 + (Δx_2)_0, ..., (x_n)_0 + (Δx_n)_0. This second method of evaluating the normal line is very closely related to the first method; solving Equation 4 for K yields Equation 8.

The third way is to select the value of the response function at the new point, then solve simultaneously the equations of the normal line at the original point and the yield equation. Let the arbitrary choice of the new response function at the new point be

$$Y_{new} = Y(x_1, x_2, x_3, \dots, x_n) \qquad (9)$$

Then the simultaneous solution of Equations 8 and 9 will yield the coordinates of the next point. This method is not recommended.

Adding Constraints to the Gradient Method. HEMSTITCHING METHOD. The gradient method can be used very conveniently with a constrained problem. Starting with an initial point which lies within the constraints, the gradients are computed and the objective function is evaluated for the incremental steps (Δx_1)_0, (Δx_2)_0, ..., (Δx_n)_0. At this time, the various constraints are checked to see if the new point (x_1)_0 + (Δx_1)_0, (x_2)_0 + (Δx_2)_0, ..., (x_n)_0 + (Δx_n)_0 satisfies all the constraints. If it does, the gradient is evaluated at the new point, and then the response. If one or more constraints are violated, it is necessary to move the new point back into the permitted region.

As long as the operating point is within the constraints, the gradient is taken with respect to the objective function. As soon as the operating point falls into the forbidden region, the gradient is taken with respect to the constraint(s). The rationale for this is that, if the greatest rate of change in the response surface is found by moving in the direction of the gradient vector, then the fastest way to move out of the forbidden region is to move orthogonal to the constraint(s). There is, therefore, one set of rules when the operating point lies within the constraints and another set of rules when it lies outside the constraints.

If the point lies outside one constraint, the gradient is taken with respect to that constraint. If, for example, the constraint is expressed as

$$\varphi_1 = \varphi_1(x_1, x_2, \dots, x_n) \qquad (10)$$

the size of the step in the x_i direction is found by Equation 4 applied to φ_1,

$$\Delta x_i = K_1\,\frac{\partial \varphi_1/\partial x_i}{\left[\sum_{j} \left(\partial \varphi_1/\partial x_j\right)^2\right]^{1/2}} \qquad (11)$$

If the point lies outside two constraints, φ_1(x_1, x_2, ..., x_n) and φ_2(x_1, x_2, ..., x_n), then a composite gradient is computed. The size of the step in the x_i direction is

$$\Delta x_i = K_1\,\frac{\partial \varphi_1/\partial x_i}{\left[\sum_{j} \left(\partial \varphi_1/\partial x_j\right)^2\right]^{1/2}} + K_2\,\frac{\partial \varphi_2/\partial x_i}{\left[\sum_{j} \left(\partial \varphi_2/\partial x_j\right)^2\right]^{1/2}} \qquad (12)$$

where K_1 and K_2 are the chosen constants. Extensions to multiple constraints follow in the same manner. Care should be taken to see that the directions of the gradient vectors to the constraints direct the process toward, not away from, the permissible region. There is no guarantee that, if the point falls outside the constraints, one application of the gradient with respect to the constraints will put the point within the constraints. If the point still violates some of the constraints, then the gradient with respect to only those constraints which are violated must be calculated again. The approach described here is illustrated in Figure 4, where the movement to and fro across the boundary is a kind of mathematical "hemstitching."

RIDE THE CONSTRAINT. Another way of using the gradient method is to move orthogonal to the response surface until a constraint is violated. When a constraint is violated, instead of mathematically hemstitching across the constraint, one can return to the point that was satisfactory just before the constraint was violated. Starting at this point one can use smaller Δx_i and then move orthogonal to the lines of constant response until the new point lies on the constraint itself. The optimization is then carried out by moving along the constraint. If the constraint is

$$\varphi_1 = \varphi_1(x_1, x_2, \dots, x_n) \qquad (13)$$

then along the constraint

$$\sum_{i} \frac{\partial \varphi_1}{\partial x_i}\,\Delta x_i = 0 \qquad (14)$$

since the constraint represents a line of constant value. In this relationship, the ∂φ_1/∂x_i terms can be evaluated at the point where the response surface initially crosses the constraint. All the Δx_i terms except one can be specified, and the remaining one is determined by Equation 14. The next point on the constraint is found from the last point on the constraint, from the ∂φ_1/∂x_i terms computed at this point, and from the Δx_i terms chosen consistent with Equation 14. For each new point along the given constraint, the response function is evaluated, as well as the other constraints. As soon as a new constraint is violated, the method switches over to the new constraint and rides it, provided of course the response continues to increase.

While the "constraint riding" technique implies that the optimum lies on a constraint, the gradient method and the hemstitching variation of it do not; the optimum point may be within the constraints. If a good starting place for the calculation is found (good in the sense that the local optimum is the optimum), then orthogonal movement to the response contours will find the optimum.
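The constraint-riding step of Equation 14 can be sketched for two variables as follows. An arbitrary Δx1 is chosen and the remaining increment is solved for so that the move stays on the line of constant φ; gradient() is reused from the earlier snippets, and the example call uses the nonlinear constraint of the numerical example that follows.

```python
def ride_constraint_step(phi, x, dx1, h=1e-6):
    """Equation 14 in two variables: (dphi/dx1)*dx1 + (dphi/dx2)*dx2 = 0.

    Given a chosen dx1, return the new point whose remaining increment dx2
    keeps the move along the line of constant phi.
    """
    dphi = gradient(phi, x, h)
    dx2 = -dphi[0] * dx1 / dphi[1]   # solve Equation 14 for the free increment
    return [x[0] + dx1, x[1] + dx2]

# Nonlinear constraint of the example below, phi = (x1 - 1)^2 + (x2 - 2)^2,
# starting from a point that lies on the circle phi = 4:
phi = lambda x: (x[0] - 1.0) ** 2 + (x[1] - 2.0) ** 2
print(ride_constraint_step(phi, [2.8250, 2.8881], -0.1000))   # about [2.725, 3.094]
```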

Numerical Example

Hemstitching Method. The response, Equation 5,

$$Y = 5x_1^3 - 3x_1^2 x_2 + x_2^2 \qquad (5)$$

is constrained by a linear function,

$$x_2 - \tfrac{2}{3}\,x_1 \ge 1 \qquad (15)$$

and a nonlinear function,

$$(x_1 - 1)^2 + (x_2 - 2)^2 \le 4 \qquad (16)$$

Taking as the starting point (2.0000, 2.5000), the optimal path has been determined by the gradient method, using the hemstitching technique when a constraint (or constraints) is violated. The results of the calculations are tabulated in Table III. In Table III the values of x1 and x2 and the response Y are given for the point at the beginning of each step. The Δx1 and Δx2 are listed for the steps in the permissible region and for the steps taken to move back across the linear and the nonlinear constraints. Reading across any line, the new x1 and x2 are formed by algebraically summing all the Δx1 and Δx2 terms with the x1 and x2 values, respectively, at the beginning of the step. To determine the size of the Δx1 and Δx2 steps, the proportionality factor K = 0.25 was used for the yield equation, the linear constraint, and the nonlinear constraint.

Table III. Step-by-Step Calculations

           Beginning of Step           Permissible Region   Linear Constraint   Nonlinear Constraint   New Point
 Step No.  x1       x2       Y         Δx1      Δx2         Δx1      Δx2        Δx1      Δx2           x1       x2
 1         2.0000   2.5000   16.25     0.2434   -0.0567     ...      ...        ...      ...           2.2434   2.4433
 2         2.2434   2.4433   25.54     ...      ...         -0.1386  0.2080     ...      ...           2.1048   2.6513
 3         2.1048   2.6513   18.42     0.2429   -0.0588     ...      ...        ...      ...           2.3477   2.5925
 4         2.3477   2.5925   28.57     0.2428   -0.0595     ...      ...        ...      ...           2.5905   2.5330
 5         2.5905   2.5330   42.35     ...      ...         -0.1386  0.2080     ...      ...           2.4519   2.7410
 6         2.4519   2.7410   31.79     0.2424   -0.0608     ...      ...        ...      ...           2.6943   2.6802
 7         2.6943   2.6802   46.63     ...      ...         -0.1386  0.2080     ...      ...           2.5557   2.8882
 8         2.5557   2.8882   35.22     0.2421   -0.0622     ...      ...        ...      ...           2.7978   2.8260
 9         2.7978   2.8260   51.14     ...      ...         -0.1386  0.2080     ...      ...           2.6592   3.0340
 10        2.6592   3.0340   38.86     0.2418   -0.0633     ...      ...        ...      ...           2.9010   2.9707
 11        2.9010   2.9707   55.92     ...      ...         ...      ...        -0.2226  -0.1136       2.6784   2.8571
 12        2.6784   2.8571   42.76     0.2421   -0.0620     ...      ...        ...      ...           2.9205   2.7951
 13        2.9205   2.7951   60.87     ...      ...         -0.1386  0.2080     -0.2309  -0.0956       2.5510   2.9075
 14        2.5510   2.9075   34.71     0.2421   -0.0624     ...      ...        ...      ...           2.7931   2.8451
 15        2.7931   2.8451   50.49     ...      ...         -0.1386  0.2080     ...      ...           2.6545   3.0531
 16        2.6545   3.0531   38.32     0.2417   -0.0636     ...      ...        ...      ...           2.8962   2.9895
 17        2.8962   2.9895   55.21     ...      ...         ...      ...        -0.2216  -0.1156       2.6746   2.8739
 18        2.6746   2.8739   42.25     0.2421   -0.0621     ...      ...        ...      ...           2.9167   2.8118
 19        2.9167   2.8118   60.23     ...      ...         -0.1386  0.2080     -0.2301  -0.0976       2.5480   2.9222
 20        2.5480   2.9222   34.34     0.2420   -0.0625     ...      ...        ...      ...           2.7900   2.8597
 21        2.7900   2.8597   50.01     ...      ...         ...      ...        ...      ...           ...      ...

Note: The constant of proportionality, K, is equal to 0.25 for computing the Δx1 and Δx2 in the permissible region, and for computing the Δx1 and Δx2 to cross the linear and nonlinear constraints.

The path is shown in Figure 4 for only 14 steps out of the 20, for clarity. Note that even though the hemstitching technique was used here, this does not mean that once the hemstitching begins the boundary will necessarily be crossed at every step. Until the end of step 10, only the linear constraint is violated. At the end of step 11, only the nonlinear constraint is violated. At the end of step 12, both the linear and nonlinear constraints are violated. To move into the permissible region requires moving along a vector which is the resultant of the vectors orthogonal to the linear and nonlinear constraints. This is done by calculating the components of the vector orthogonal to the linear constraint and the vector orthogonal to the nonlinear constraint and summing them algebraically. In particular, at the end of step 13 the new x1 = 2.9205 - 0.1386 - 0.2309 = 2.5510 and the new x2 = 2.7951 + 0.2080 - 0.0956 = 2.9075.

In computing the gradient vectors to the constraints, one must be careful with the signs in order to find the gradient which moves toward the permissible region; it is possible to find the gradient vector which is in the opposite direction. In this example, the proper gradient components for the linear constraint are -2/3 and 1 in the x1 and x2 directions, respectively; for the nonlinear constraint, the proper components of the gradient are -(x1 - 1) and -(x2 - 2) in the x1 and x2 directions, respectively.

Table III and Figure 4 show that the gradient method does not "home" directly on the optimal point. Near the intersection of the constraints, there may be a number of reflections back and forth across the constraints before the optimal response is found. In addition, in this search for an optimum, some of the early steps may give better responses in the permissible region than later steps; compare the response at the beginning of step 12 with the response at the beginning of step 14. Due to the reflection back and forth across the boundaries, it is necessary to establish rules for terminating the calculation, as suggested earlier.

Figure 4. Hemstitching method
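A compact sketch of the hemstitching walk for this example is given below. Inside the permissible region it takes the Equation 4 step with K = 0.25; when the point has crossed a constraint, it applies the correction of Equations 11 and 12 with K1 = K2 = 0.25 and the constraint-gradient signs noted above. The loop bound of 21 steps simply mirrors Table III; the first gradient step computes to (0.2435, -0.0568) and the linear-constraint correction to (-0.1387, 0.2080), matching the tabulated values to within rounding.

```python
import math

def Y(x):                                   # Equation 5
    x1, x2 = x
    return 5 * x1**3 - 3 * x1**2 * x2 + x2**2

def grad_Y(x):
    x1, x2 = x
    return [15 * x1**2 - 6 * x1 * x2, -3 * x1**2 + 2 * x2]

def unit(g):
    n = math.sqrt(sum(gi * gi for gi in g))
    return [gi / n for gi in g]

# Constraint checks (Equations 15 and 16) and the inward-pointing gradient
# components, with signs chosen to move toward the permissible region.
def lin_violated(x):    return x[1] - 2.0 / 3.0 * x[0] < 1.0
def nonlin_violated(x): return (x[0] - 1) ** 2 + (x[1] - 2) ** 2 > 4.0
def lin_grad(x):        return [-2.0 / 3.0, 1.0]
def nonlin_grad(x):     return [-(x[0] - 1), -(x[1] - 2)]

K = 0.25
x = [2.0, 2.5]
for step in range(1, 22):
    print(step, [round(v, 4) for v in x], round(Y(x), 2))
    if lin_violated(x) or nonlin_violated(x):
        dx = [0.0, 0.0]                     # composite correction, Equations 11-12
        for violated, grad in ((lin_violated, lin_grad), (nonlin_violated, nonlin_grad)):
            if violated(x):
                c = unit(grad(x))
                dx = [d + K * ci for d, ci in zip(dx, c)]
    else:
        dx = [K * ci for ci in unit(grad_Y(x))]   # Equation 4 step
    x = [xi + di for xi, di in zip(x, dx)]
```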


Ride-the-Constraint Method. For the linear constraint, a violation occurs if

$$x_2 - \tfrac{2}{3}\,x_1 < 1$$

and for the nonlinear constraint, a violation occurs if

$$(x_1 - 1)^2 + (x_2 - 2)^2 > 4$$

Let us consider a point to lie on the constraints if it meets the following conditions:

$$x_2 - \tfrac{2}{3}\,x_1 = 1.0000 \pm 0.0050$$

$$(x_1 - 1)^2 + (x_2 - 2)^2 = 4.0000 \pm 0.2500$$

With this as a standard, Table IV shows that the point (2.2434, 2.4433) at the end of step 1 lies outside the linear constraint. Starting again at the same point, the point at the end of step 2 lies on the linear constraint. Proceed up the linear constraint according to the equation

$$\Delta x_2 = \tfrac{2}{3}\,\Delta x_1$$

After each step a test was made to see whether the nonlinear constraint had been violated. Adjustments are made until the nonlinear constraint is reached. Then proceed up the nonlinear constraint by

$$(x_1 - 1)\,\Delta x_1 + (x_2 - 2)\,\Delta x_2 = 0$$

See Table IV and Figure 5 for the results. Here, the maximum response occurs at the intersection of the constraints. This is not always the case.

Table IV. Ride the Constraint Example
Linear constraint: x2 - (2/3)x1 = 1.0000 ± 0.0050. Nonlinear constraint: (x1 - 1)² + (x2 - 2)² = 4.0000 ± 0.2500.

           Beginning of Step
 Step No.  x1       x2       Y          Δx1       Δx2
 1         2.0000   2.5000   16.2500    0.2434    -0.0567
 2         2.0000   2.5000   16.2500    0.1800    -0.0419
 3         2.1800   2.4581   22.7978    0.3750    0.2500
 4         2.5550   2.7081   37.6936    0.3750    0.2500
 5         2.5550   2.7081   37.6936    0.2700    0.1800
 6         2.8250   2.8881   51.9205    -0.1000   0.2054

Figure 5. Ride the constraint method
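The constraint-riding portion of this example can be sketched as below. The code assumes the point has already been brought onto the linear constraint, as at the beginning of step 3 in Table IV, and rides it with Δx2 = (2/3)Δx1 starting from the tabulated Δx1 = 0.375; whenever a trial point violates the nonlinear constraint, the step is shrunk (halving is an assumption made for the sketch, since the article only says the step is reduced) until the point lands inside the ± 0.25 tolerance band of the nonlinear constraint.

```python
def on_nonlinear(x, tol=0.25):
    """Within the tolerance band of Table IV: (x1-1)^2 + (x2-2)^2 = 4 +/- tol."""
    return abs((x[0] - 1) ** 2 + (x[1] - 2) ** 2 - 4.0) <= tol

def nonlinear_violated(x, tol=0.25):
    return (x[0] - 1) ** 2 + (x[1] - 2) ** 2 > 4.0 + tol

def ride_linear(x, dx1=0.375, min_dx1=1e-3):
    """Ride the linear constraint x2 - (2/3)x1 = 1 upward until the
    nonlinear constraint is reached."""
    path = [list(x)]
    while dx1 > min_dx1:
        trial = [x[0] + dx1, x[1] + 2.0 / 3.0 * dx1]   # keeps x2 - (2/3)x1 constant
        if nonlinear_violated(trial):
            dx1 *= 0.5        # too far: shrink the step (assumed rule)
            continue
        x = trial
        path.append(list(x))
        if on_nonlinear(x):
            break             # near the intersection of the two constraints
    return path

print(ride_linear([2.1800, 2.4581]))   # starting point of step 3 in Table IV
```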



Matrix Methods. Recently a very interesting paper was published describing a theory and a general algorithm for handling gradient problems through the gradient projection method (4). This paper is based upon optimizing an objective function subject to a set of constraints and relationships. The method guarantees a solution in a finite number of iterations and has been programmed for a large computer.

In some computer control applications, it is not convenient to put down the constraints and the relationships as a set of equations or inequalities which can be manipulated through matrix methods. There are several reasons for this. The mathematical description of the process may be cast into a sequential operation where one equation must be solved first to yield information for the next. In these cases, it is not possible to substitute one equation into the next and thus eliminate some intermediate result. For example, some of the product yield equations may be correlations that depend on knowing some intermediate value such as catalyst rate; this must be calculated first by heat and material balances, which in turn require some knowledge about the products. In some parts of the model, a trial-and-error or iterative calculation must be made before the rest of the problem can be attacked. There are numerous situations where the equations describing the process cannot be manipulated via matrix methods. In these cases, the methods of gradient projection, linear programming, and the like cannot be used.

Another objection to these sophisticated matrix techniques for use with process control computers, which are in general medium-size machines, is the size of the program. When so much of the memory is devoted to the optimization routine, there is only limited room for the other computer functions. The time for solution may be so long that the process conditions upon which the optimization is based may have changed. As mentioned before, this can be partly compensated for by printing out intermediate results and acting on these.

One of the outstanding advantages of the matrix methods is the fact that an optimum will be found despite the nature of the objective function and the constraints. In addition, parameter studies may be very conveniently carried out. The fact that the matrix methods are powerful tools for solving gradient and other programming problems is fairly obvious. The fact that there are many problem situations that cannot be cast into such neat form is less well known.

Literature Cited

(1) Davies, O. L., "Design and Analysis of Industrial Experiments," Chap. 11, Hafner, New York, 1956.
(2) Dennis, J. B., Pease, R. F., Saunders, R. M., "System Synthesis With the Aid of Digital Computers," A.I.E.E. Conference Paper No. 56-328, February 1956.
(3) Pyne, L. B., "Linear Programming on an Electronic Analog Computer," A.I.E.E. Conference Paper No. 56-147, February 1956.
(4) Rosen, J. B., J. Soc. Ind. Appl. Math. 8, No. 1, 181-217 (1960).
(5) Sokolnikoff, I. S., Sokolnikoff, E. S., "Higher Mathematics for Engineers and Physicists," 2nd ed., p. 143, McGraw-Hill, New York, 1941.

RECEIVED for review December 2, 1960
ACCEPTED April 17, 1961