
Ind. Eng. Chem. Res. 1992, 31, 1308-1314

Use of Random Admissible Values for Control in Iterative Dynamic Programming

Bojan Bojkov and Rein Luus*

Department of Chemical Engineering, University of Toronto, Toronto, Ontario M5S 1A4, Canada

Iterative dynamic programming employing region contraction, where randomly generated admissible values for control are used instead of uniformly chosen admissible values, is examined for optimal control. The use of randomly generated control values becomes necessary when the number of control variables is very large. Two examples are used to examine the viability of this method of choosing candidates for control values. Choosing control values at random becomes especially useful for keeping the number of trajectories to be evaluated and compared reasonably small when the number of control variables is very large. In the numerical example with 20 state variables and 20 control variables, convergence to the optimum was fast even when only 100 randomly chosen control values were used at each grid point.

Introduction

In engineering design and operation one is frequently confronted with optimization problems where the system is described in terms of nonlinear differential equations. Since in many practical situations the global optimum solution may be difficult to obtain, as shown by Luus et al. (1991), it is important to cross-check the solution by using different optimization procedures. Dynamic programming provides numerical procedures that are different from the type based on Pontryagin's maximum principle and therefore provides a good alternate means for solving optimal control problems, for example, when state constraints are present, as shown by Luus and Rosen (1991). In fact, for some problems, Luus (1991) obtained better results than reported in the literature. Recently, Luus (1989, 1990) introduced the iterative dynamic programming method using region contraction to overcome the state dimensionality problem. In that numerical procedure the grid points for the state vector were generated by assigning different values for control, and at each grid point candidates for the control variables were chosen from a uniform distribution.
The minimum number of values that can be used is 3^m, where m is the number of control variables. When there are 3 control variables, 27 values to be examined at each grid point is quite reasonable. With 6 control variables, the minimum number 729 may still be manageable. However, with 10 control variables the number of values to be examined at each grid point becomes 59,049, which is unreasonably high, even for a mainframe computer. It is obvious, therefore, that for iterative dynamic programming to be viable for systems having more than 6 control variables, a method of choosing candidates for the control should be established so that a reasonably small number, such as 100, can be used. The purpose of this paper is to examine substituting the uniform set of control values with a randomly generated set, based on the idea by Luus and Jaakola (1973) for random search optimization, so that iterative dynamic programming may be applied to very high-dimensional nonlinear optimal control problems. Since the use of a randomly generated control set enables any number of values for control to be used, it is important to examine the convergence of such a scheme by considering numerical examples.

Problem Formulation

Let us consider the continuous dynamic system described by the following nonlinear differential equation:

dx/dt = f(x, u),  x(t_0) given   (1)

where x is an (n × 1) state vector and u is an (m × 1) control vector bounded by

α_i ≤ u_i ≤ β_i,  i = 1, 2, ..., m   (2)

Associated with the system is a performance index

I[x(t_0), t_f] = Ψ(x(t_f)) + ∫_{t_0}^{t_f} Φ(x, u) dt   (3)

where the initial time t_0 and final time t_f are given. The optimal control problem is to find the control policy u(t) in the time interval t_0 ≤ t ≤ t_f such that the performance index given in eq 3 is minimized. Instead of a continuous control policy, we seek a piecewise constant control policy over P time stages, each of equal time length L. The performance index is thus approximated by

I[x(t_0), P] = Ψ(x(t_f)) + Σ_{k=1}^{P} ∫_{t_{k-1}}^{t_k} Φ(x(t), u(t_{k-1})) dt   (4)

where t_{k-1} is the time at the beginning of stage k, and the control is kept constant at u(t_{k-1}) throughout the time interval t_{k-1} ≤ t ≤ t_k. If P is chosen to be sufficiently large, then a close approximation to the continuous control policy is obtained. Without loss of generality, t_0 can be chosen to be 0. If the system is described by a set of difference equations, rather than differential equations, no special approximations have to be made for the application of iterative dynamic programming, as was shown by Luus and Smith (1991). This is illustrated by the first example.

Iterative Dynamic Programming Algorithm. The iterative dynamic programming procedure using systematic region contraction is described as follows.

1. Divide the time interval [t_0, t_f] into P time stages, each of length L.

2. Choose the number of x-grid points N and the number of allowable values M for each of the control variables for the uniform distribution, or the number of values R for the control for the random distribution.

3. Choose the initial size of the search region r_i for each of the control variables u_i over which values of control can be taken for the generation of the x-grid and for the optimization.

4. By choosing N values for control evenly distributed inside the allowable region for the control, integrate eq 1 from t = t_0 to t = t_f to generate N accessible values for the x-grid at each stage.

0888-5885/92/2631-1308$03.00/0 © 1992 American Chemical Society


5. Starting at stage P, corresponding to time t_f - L, for each x-grid point integrate eq 1 from t_f - L to t_f once with each of the allowable values for control. For each grid point choose the control that minimizes I and store the value of control for use in step 6.

6. Step back to stage P - 1, corresponding to time t_f - 2L, and for each x-grid point integrate eq 1 from t_f - 2L to t_f - L once with each of the allowable values for control. To continue integration from t_f - L to t_f, choose the control from step 5 that corresponds to the grid point closest to the x reached at t_f - L. Compare the different values of the performance index I and store the control policy that gives the minimum value.

7. Repeat step 6 until stage 1, corresponding to the initial time t = t_0, is reached. Store the control policy that minimizes the performance index I.

8. Reduce the size of the search region for allowable control by the factor γ,

r^(j+1) = γ r^(j)   (5)

where j is the iteration index. Use the optimal control policy from step 7 as the midpoint of the allowable values for the control at each stage.

9. Increment the iteration index j by 1 and go to step 4. Continue the procedure for a specified number of iterations and examine the results.

In using the uniform distribution for control values, M^m values for control are used at each grid point, whereas for the random distribution R values are used, where the maximum value of R used here is 100, regardless of the value of m.

Numerical Examples

Computations were performed on the Cray X-MP/28 digital computer. For integration of the differential equations in example 2, a fifth-order Runge-Kutta-Butcher method as given by Chapra and Canale (1985) was used. For generation of random numbers, the built-in random number generator RANF of Cray Research was used.

Example 1: Series of Chemical Reactors.
Let us consider the reaction system consisting of a series of CSTRs first considered by Aris (1961) and used by Fan and Wang (1964) for illustration of the discrete maximum principle and more recently by Rangaiah (1985) for comparison of direct search optimization procedures. The system is described by the difference equations expressing the concentrations of the three key components

x_1(k) = x_1(k-1) / [1 + u_2(k)(1 + u_1(k))]   (6)

x_2(k) = [x_2(k-1) + …] / [1 + u_1(k)]   (7)

x_3(k) = x_3(k-1) + 0.01 u_2(k) x_2(k)   (8)

The control u_1(k) is related to the temperature T at stage k through the relation

u_1(k) = 10^4 exp(-3000/T(k))   (9)

and u_2(k) is the residence time in stage k. The initial conditions are

x^T(0) = [1  0  0]   (10)

The constraint on the temperature is specified as

T(k) ≤ 394   (11)

and the constraint on the residence time is

0 < u_2(k) ≤ 2100   (12)

I

I

I

1

0,540

1

0.540 0

id,

A

5

I

1

I

I

1

IO

15

20

25

30

ITERATION NUMBER

Figure 1. Improvement of the performance index as a function of iteration number for example 1 with P = 3. (0-0)Uniformly chosen values for control at each grid point. (A---A) Randomly chosen values for control at each grid point. Table I. Optimal Control Policy and ExamDle 1, with P = 3 stage no. k uAk) T(k) udk) 1 0.1233 265.40 1.3802 2 0.3343 291.09 2.3792 3 4.9339 394.00 2100.0

State Trajectories for x k k ) xAk) x&k) 0.3921 0.4807 0.0066 0.0939 0.6431 0.0219 O.oo00 0.0251 0.5490

Given the number of stages P, the problem is to determine u_1(k) and u_2(k) for k = 1, 2, ..., P to maximize x_3(P), i.e., to minimize -x_3(P). We initially consider the standard case of three stages (i.e., P = 3), for which Rangaiah (1985) obtained the optimum x_3(3) = 0.54897. Of interest is to examine the rate of convergence of iterative dynamic programming when candidates for the control are chosen at random rather than over a uniform grid. We first consider 21 grid points for the x-grid (N = 21) and a reduction factor γ = 0.80. For the uniformly generated control values we choose M = 3, yielding nine values to be compared at each grid point, since there are two control variables. For the randomly generated values we choose nine values for control at random (R = 9) inside the admissible region. As an initial control policy, we take u_1(1) = 0.2, u_1(2) = 0.5, u_1(3) = 4.0, u_2(1) = 0.5, u_2(2) = 1.0, and u_2(3) = 1900. For initial region sizes we take r_1(k) = 0.5 for k = 1, 2, 3; r_2(1) = 1.0; r_2(2) = 1.0; and r_2(3) = 100. In each case the convergence from the initial value of x_3(3) = 0.45773 to x_3(3) = 0.54897 was rapid, as shown in Figure 1. However, it is noted that convergence with randomly chosen control values is somewhat faster, taking 18 iterations rather than 24. The resulting optimal control policy and state trajectories are given in Table I. Let us now consider the situation where instead of 3 stages we have 10 stages (P = 10). As initial control policy and initial region size, we choose u_1(k) = 0.3 for k = 1, 2, ..., 8; u_1(9) = u_1(10) = 4.0 and r_1(k) = 0.5 for k = 1, 2, ..., 10; u_2(k) = 0.45 and r_2(k) = 0.6 for k = 1, 2, ..., 8; u_2(9) = u_2(10) = 1900, and r_2(9) = r_2(10) = 100.
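The mechanics of the random scheme, namely candidates drawn uniformly inside a search region centered on the incumbent policy, clipped to the admissible bounds of eq 2, with the region contracted by the factor γ of eq 5 after each iteration, can be sketched in a few lines. The following is a minimal illustration on a static test function, not the authors' implementation; the function name and the test objective are invented for the example.

```python
import random

def random_candidate_search(cost, u0, r0, lower, upper,
                            R=25, gamma=0.80, iters=40, seed=1):
    """Sketch of random candidate generation with region contraction.

    cost   -- objective to minimize, evaluated on a candidate control vector
    u0     -- initial control policy (one value per control variable)
    r0     -- initial search-region sizes, one per control variable
    lower, upper -- admissible bounds alpha_i, beta_i of eq 2
    R      -- number of randomly generated candidates per iteration
    gamma  -- region reduction factor of eq 5
    """
    rng = random.Random(seed)
    best_u, best_cost = list(u0), cost(u0)
    r = list(r0)
    for _ in range(iters):
        for _ in range(R):
            # candidate drawn uniformly inside the current region,
            # clipped to the admissible region (eq 2)
            cand = [min(max(best_u[i] + r[i] * (2.0 * rng.random() - 1.0),
                            lower[i]), upper[i])
                    for i in range(len(best_u))]
            c = cost(cand)
            if c < best_cost:
                best_u, best_cost = cand, c
        # contract the search region about the best policy found (eq 5)
        r = [gamma * ri for ri in r]
    return best_u, best_cost

# illustrative use: minimize a simple quadratic with minimum at (1, -2)
if __name__ == "__main__":
    f = lambda u: (u[0] - 1.0) ** 2 + (u[1] + 2.0) ** 2
    u, c = random_candidate_search(f, [0.0, 0.0], [2.0, 2.0],
                                   [-5.0, -5.0], [5.0, 5.0])
    print(u, c)
```

Any number of candidates R can be used here, which is the freedom the paper exploits when M^m becomes unmanageable.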

By using only nine allowable values for control at each grid point, as used previously, we were not able to reach the optimum value of x_3(10) = 0.63140, although we could reach 0.63132 by using randomly chosen values for control, as shown in Figure 2. However, increasing M from 3 to 5 and using R = 25 values rather than 9 resulted in very fast convergence to the optimum, as shown in Figure 3. Again the randomly chosen values performed more efficiently, reaching the optimum x_3(10) = 0.63140 in 15 iterations as compared to 21 iterations by the uniformly chosen values. The optimal control results are given in Table II. It is noted that taking 10 stages rather than 3 leads to a yield improvement of 15%. To show that the results are not dependent on the random number generator and also to provide comparisons to personal computers, we ran this example with R = 15. On the Cray, the computation time was 15.5 s for 30 iterations, yielding x_3(10) = 0.63140 in 20 iterations. On a 486/33 personal computer using the Microsoft Fortran 5.0 random number generator, the computation time was 118 s for 30 iterations, yielding convergence to x_3(10) = 0.63140 in 16 iterations. Therefore, the use of a different random number generator had negligible effect. When the region contraction factor γ is in the range 0.75-0.90, the randomly chosen candidates for control yield faster convergence to the vicinity of the optimum than

Figure 3. Convergence to the vicinity of the optimum for example 1 with P = 10, using 25 values for control at each grid point. (O-O) Uniformly chosen values. (Δ---Δ) Randomly chosen values.

Table 11. Optimal Control Policy and State Trajectories for Examde 1. with P = 10 stage no. k ui(k) T(k) UAk) xi@) x A k ) d k ) 1 0.0465 244.34 0.2076 0.8215 0.1629 0.0003 2 0.0765 254.65 0.2715 0.6357 0.3232 0.0012 3 0.1095 262.64 0.3263 0.4668 0.4603 0.0027 4 0.1505 270.17 0.3822 0.3242 0.5676 0.0049 5 0.2071 278.17 0.4456 0.2108 0.6448 0.0078 6 0.2945 287.55 0.5246 0.1255 0.6946 0.0114 7 0.4532 299.95 0.6348 0.0653 0.7214 0.0160 8 0.8447 319.86 0.8225 0.0259 0.7296 0.0220 O.oo00 0.0280 0.6090 9 4.9339 394.00 2100.0 10 4.9339 394.00 2100.0 O.oo00 0.0011 0.6314

Figure 2. Convergence to the vicinity of the optimum for example 1 with P = 10, using nine values for control at each grid point. (O-O) Uniformly chosen values. (Δ---Δ) Randomly chosen values.


Figure 4. Effect of the reduction factor on the rate of convergence to the vicinity of the optimum for example 1 with P = 10. (O-O) Uniform distribution within 0.1% of optimum. (Δ---Δ) Random distribution within 0.1% of optimum. (O-O) Uniform distribution within 0.0001% of optimum. (Δ---Δ) Random distribution within 0.0001% of optimum.

using a uniform distribution, as shown in Figure 4, where 25 values were used. For reduction factors less than 0.70, however, the uniformly chosen values yield faster convergence. Choosing the control values at random has the further advantage of providing freedom in the number of values that can be used at each grid point. This is illustrated in the next example.

Example 2: Mathematical System. Let us consider the problem presented by Nagurka et al. (1991), which


Figure 5. Convergence to the vicinity of the optimum for example 2 with n = 2, P = 30, using nine values for control at each grid point. (O-O) Uniformly distributed values. (Δ---Δ) Randomly distributed values.

consists of an n-input nth-order linear time-invariant dynamic system described by

dx/dt = A x(t) + u(t),  x^T(0) = [1  2  3  ...  n]   (13)

where

    [ 0  1  0  ...  0 ]
    [ 0  0  1  ...  0 ]
A = [ .  .  .       . ]   (14)
    [ 0  0  0  ...  1 ]
    [ 0  0  0  ...  0 ]

and the performance index to be minimized is

I = 10 x^T(1) x(1) + ∫_0^1 (x^T x + u^T u) dt   (15)
The objective is to examine the convergence properties of iterative dynamic programming for different values of the dimensionality of the system n. Let us first consider this system with n = 2, where there are two state variables and two control variables. We took N = 21 for the x-grid, M = 3, allowing three levels for both u_1(t_{k-1}) and u_2(t_{k-1}) for a total of 9 values for the uniform distribution, and R = 9 for the random distribution. For the initial pass the center point for control was taken as (-2, -2), the region for control was taken as r_1 = r_2 = 1, and a region contraction factor γ = 0.75 was used. By taking P = 30 and an integration step size of 1/30, convergence for both the uniformly chosen values and the randomly chosen values is rapid, as shown in Figure 5, both yielding a minimum of 5.3594 within 24 iterations. As shown in Figure 6, the use of randomly chosen control yields somewhat faster convergence than the uniform distribution for a wide range of the reduction factor. By using P = 60 an improved value of 5.3592 was obtained for the performance index. When the performance index is plotted for various values of P against 1/P², as in Figure 7, the extrapolated value of 5.3591 is obtained, which is the same as reported by Nagurka et al. (1991). Let us next consider the system with n = 6. Let us take P = 10, N = 11, and M = 3 for the uniform distribution


Figure 6. Effect of reduction factor on rate of convergence to the vicinity of the optimum for example 2 with P = 30. (O-O) Uniform distribution within 0.01% of optimum. (Δ---Δ) Random distribution within 0.01% of optimum. (O-O) Uniform distribution within 0.0001% of optimum. (Δ---Δ) Random distribution within 0.0001% of optimum.


The problem is to find u(t) in the time interval 0 ≤ t ≤ 1 that minimizes the performance index given in eq 15.


Figure 7. Variation of optimal performance index with 1/P² for example 2 with n = 2.

for a total of 729 values and R = 100 for the random distribution. The initial values for control are chosen as u_1 = -2, u_2 = -3, u_3 = -5, u_4 = -6, u_5 = -7, and u_6 = -1, and the initial region sizes are r_1 = 1, r_2 = 1, r_3 = 2, r_4 = 2, r_5 = 3, and r_6 = 1. These values are used as initial values for each time stage. Using an integration step of 0.05 and γ = 0.80, we obtained fast convergence to 153.978 for the uniform grid and to 153.982 for the random distribution using only one-seventh the number of trajectories. The rate of convergence is shown in Figure 8, and the optimal

Table III. Optimal Control Policy and State Trajectories for Example 2, with n = 6 and P = 10

stage   u1(t_{k-1})  u2(t_{k-1})  u3(t_{k-1})  u4(t_{k-1})  u5(t_{k-1})  u6(t_{k-1})   x1(t_k)   x2(t_k)   x3(t_k)   x4(t_k)   x5(t_k)   x6(t_k)
1       -2.8732      -3.7644      -8.8173      -7.5234      -12.7641     -2.1328       0.9080    1.8990    2.5044    3.7104    4.2223    4.0946
2       -2.5919      -3.6826      -7.6212      -7.0933      -10.6567     -1.6919       0.8322    1.7611    2.0978    3.3881    3.4978    2.8029
3       -2.3570      -3.5613      -6.5945      -6.6487       -8.8627     -1.3403       0.7648    1.5982    1.7604    3.0411    2.8442    1.8971
4       -2.1610      -3.4145      -5.7107      -6.2075       -7.3369     -1.0565       0.6997    1.4189    1.4757    2.6764    2.2663    1.2493
5       -1.9975      -3.2523      -4.9477      -5.7825       -6.0413     -0.8251       0.6326    1.2292    1.2302    2.3000    1.7627    0.7841
6       -1.8614      -3.0821      -4.2892      -5.3819       -4.9420     -0.6333       0.5598    1.0335    1.0124    1.9197    1.3296    0.4544
7       -1.7506      -2.9081      -3.7229      -5.0100       -4.0135     -0.4703       0.4783    0.8343    0.8128    1.5304    0.9618    0.2290
8       -1.5996      -2.5563      -2.8389      -4.3485       -3.2351     -0.1720       0.2789    0.4306    0.4344    1.1446    0.6535    0.0862
9       -1.5636      -2.3723      -2.5232      -4.0387       -2.5937      0.0120       0.1556    0.2274    0.2398    0.7624    0.3986    0.0098
10         …            …            …            …          -2.0911        …             …         …         …      0.3880    0.1891   -0.0127
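The computational advantage of the random distribution in this example is easy to quantify: the minimum uniform grid requires M^m = 3^m candidate controls at every x-grid point, while the random scheme uses a fixed R (at most 100 here) independent of m. A short check of the counts quoted in the text:

```python
# minimum uniform candidate count (M = 3 levels per control variable)
# versus the fixed random sample size used in the paper
R = 100
for m in (3, 6, 10, 20):
    print(f"m = {m:2d}: uniform 3^m = {3 ** m:>13,}   random R = {R}")

assert 3 ** 6 == 729             # still manageable
assert 3 ** 10 == 59_049         # unreasonably high at each grid point
assert 3 ** 20 == 3_486_784_401  # about 3.49 x 10^9, the n = 20 case
```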


Figure 8. Convergence to the vicinity of the optimum for example 2 with n = 6, P = 10. (O-O) Uniformly distributed values. (Δ---Δ) Randomly distributed values.


Figure 9. Comparison of N = 11 and N = 21 for the x-grid for example 2 with n = 6, P = 10, using 100 randomly distributed values for control.

Figure 10. Effect of reduction factor on rate of convergence to the vicinity of the optimum for example 2 with P = 15. (O-O) Uniform distribution within 0.1% of optimum. (Δ---Δ) Random distribution within 0.1% of optimum. (O-O) Uniform distribution within 0.01% of optimum. (Δ---Δ) Random distribution within 0.01% of optimum.

policy and states are given in Table III. Increasing the number of grid points for the x-grid increases the computation time but does not improve the rate of convergence enough to justify using more than 11 grid points for this system. Computationally, it is attractive to use as small a number as reasonable, as can be seen in Figure 9. The effect of the region contraction factor on the convergence to the vicinity of the optimum with P = 15, using an integration step size of 1/15 with the same initial values for u and r as above, is shown in Figure 10. The number of iterations required by the uniform distribution with M = 3 is comparable to that of the random distribution using R = 100 for convergence to within 0.1 and 0.01% of the minimum performance index 153.855. However, the computation time for the random distribution is substantially lower, since 100 rather than 729 values for admissible control are used at each grid point. As before, Figure 11 gives an extrapolated value of 153.757, which is close to the minimum reported by Nagurka et al. (1991) of 153.75. The use of randomly generated values for control in the iterative dynamic programming algorithm allows a very large number of control variables to be used. To illustrate this, we consider the case with n = 20, where to use uni-


Figure 11. Linear variation of optimal performance index with 1/P² for example 2 with n = 6.


Figure 12. Three-pass method using 100 random values for control for example 2 with n = 20, using P = 20.

Table IV. Initial Control Policy and Region Size for Example 2, with n = 20

i      u_i      r_i          i      u_i      r_i
1      -2.5     10.0         11     -16.0    10.0
2      -3.0     10.0         12     -17.0    10.0
3      -6.0     10.0         13     -22.5    10.0
4      -6.0     10.0         14     -20.0    10.0
5      -9.5     10.0         15     -25.0    10.0
6      -8.5     10.0         16     -22.0    10.0
7      -12.5    10.0         17     -30.0    10.0
8      -12.0    10.0         18     -25.0    10.0
9      -16.0    10.0         19     -20.0    10.0
10     -14.0    10.0         20     -0.0     10.0


formly generated control values, one would be confronted with considerable computational difficulty, since the minimum number of points for the uniform grid is 3^20, or 3.49 × 10^9 values. Here we used only 100 values chosen at random. However, since the convergence is dependent on the initial control policy, we shall use a three-pass method where the best policy from one pass is used as a starting policy for the next pass, as was done by Luus and Galli (1991). Let us use P = 10, N = 11, γ = 0.85, an integration step size of 0.05, and the initial control policy and region size given in Table IV. As before, these values are used initially for all the time stages. The minimum value obtained for the performance index using the three-pass method is 6236.95, as shown in Figure 12. To obtain a more refined control policy, comparable to the continuous system, three passes were used where the number of stages was doubled after each pass and the policy at the end of a pass was used as the initial policy for the following pass. We took a reduction factor of γ = 0.85, N = 11, R = 100, and an integration step size of 0.05. As shown in Figure 13, the convergence is systematic and efficient, reaching a performance index for P = 20 of 6228.73. The total computation time was 777 s. The graph of the optimal performance index versus 1/P² in Figure 14 is linear, with an extrapolated minimum value of 6225.78, which is very close to the optimum 6225.4 reported by Nagurka et al. (1991).
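The stage-doubling between passes is purely mechanical: each piecewise-constant control value is repeated once, so the best P-stage policy becomes the starting 2P-stage policy for the next pass over the same time horizon. A minimal sketch (the function name is invented for illustration):

```python
def double_stages(policy):
    """Turn a P-stage piecewise-constant control policy into an
    equivalent 2P-stage starting policy by repeating each stage."""
    return [u for u in policy for _ in range(2)]

# e.g. a 5-stage policy for one control variable becomes a 10-stage policy;
# each stage entry may equally well be a vector of m control values
p10 = double_stages([-2.5, -3.0, -6.0, -6.0, -9.5])
print(p10)
```

Because the doubled policy reproduces the same piecewise-constant control trajectory, the performance index at the start of the new pass equals the best value from the previous pass, and further iterations can only refine it.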

5 U

e U

w

a

63 -I 63001

t

t-

P.20

'c"""""""----.

62001 0

I

I

I

5

IO

15

I

I

I

20

25

30

ITERATION NUMBER

Figure 13. Three-pass method using 100 randomly chosen values for control for example 2 with n = 20, doubling the number of stages after each pass.

Conclusions

The use of randomly selected values for admissible control in iterative dynamic programming appears to be an effective alternative to the uniformly generated values. It has been shown that the use of 100 or fewer randomly distributed values for control at each grid point in iterative dynamic programming provides rapid and efficient convergence to the vicinity of the optimum, even when a very large number of control variables, such as 20, is present. There were no computational difficulties for either example. Therefore, the use of randomly chosen values for


t = time
t_0 = initial time
t_f = final time
T = temperature, K
u = control vector (m × 1)
x = state vector (n × 1)

Greek Letters

α_i = lower bound on control u_i
β_i = upper bound on control u_i
γ = reduction factor for the control region
Φ = integrand in performance index, eq 3
Ψ = final state function used in performance index, eq 3
Literature Cited

Aris, R. The Optimal Design of Chemical Reactors; Academic Press:


Figure 14. Linear variation of optimal performance index with 1/P² for example 2 with n = 20.

control extends the usefulness of iterative dynamic programming to systems with a very large number of control variables.

Acknowledgment

Financial support from the Natural Sciences and Engineering Research Council of Canada Grant A-3515 is gratefully acknowledged. Computations were performed on the Cray X-MP/28 digital computer of the Ontario Center of Large Scale Computations at the University of Toronto.

Nomenclature

f = nonlinear function of x and u (n × 1)
I = performance index
i = general index, variable number
j = general index, iteration number
k = general index, stage number
L = length of stage
m = number of control variables
M = number of allowable values for each control variable for uniform distribution
n = number of state variables
N = number of grid points for state x
P = number of stages
r = region over which control is taken (m × 1)
R = number of allowable values for each control variable for random distribution

New York, 1961; pp 85-95.

Chapra, S. C.; Canale, R. P. Numerical Methods for Engineers, with Personal Computer Applications; McGraw-Hill: New York, 1985; pp 497-499.

Fan, L.-T.; Wang, C.-S. The Discrete Maximum Principle; Wiley: New York, 1964; pp 104-108.

Luus, R. Optimal Control by Dynamic Programming using Accessible Grid Points and Region Contraction. Hung. J. Ind. Chem. 1989, 17, 523-543.

Luus, R. Application of Dynamic Programming to High-Dimensional Nonlinear Optimal Control Problems. Int. J. Control 1990, 52, 239-250.

Luus, R. Application of Iterative Dynamic Programming to State Constrained Optimal Control Problems. Hung. J. Ind. Chem. 1991, 19, 245-254.

Luus, R.; Jaakola, T. H. I. Optimization by Direct Search and Systematic Reduction of the Size of Search Region. AIChE J. 1973, 19, 760-766.

Luus, R.; Galli, M. Multiplicity of Solutions in Using Dynamic Programming for Optimal Control. Hung. J. Ind. Chem. 1991, 19, 55-62.

Luus, R.; Rosen, O. Applications of Dynamic Programming to Final State Constrained Optimal Control Problems. Ind. Eng. Chem. Res. 1991, 30, 1525-1530.

Luus, R.; Smith, S. G. Application of Dynamic Programming to High-Dimensional Systems Described by Difference Equations. Chem. Eng. Technol. 1991, 14, 122-126.

Luus, R.; Dittrich, J.; Keil, F. J. Multiplicity of Solutions in the Optimization of a Bifunctional Catalyst Blend in a Tubular Reactor. Paper presented at the International Workshop on Chemical Engineering Mathematics, Göttingen, Germany, July 7-11, 1991.

Nagurka, M.; Wang, S.; Yen, V. Solving Linear Quadratic Optimal Control Problems by Chebychev-Based State Parameterization. Proceedings of the 1991 American Control Conference, Boston, MA; American Automatic Control Council, IEEE Service Center: Piscataway, NJ, 1991; pp 104-109.

Rangaiah, G. P. Studies in Constrained Optimization of Chemical Process Problems. Comput. Chem. Eng. 1985, 9, 395-404.

Received for review October 7, 1991
Revised manuscript received January 21, 1992
Accepted January 30, 1992