Ind. Eng. Chem. Res. 1998, 37, 2469-2478
2469
Enhancement of the Global Convergence of Using Iterative Dynamic Programming To Solve Optimal Control Problems Jeng-Shiaw Lin Department of Chemical Engineering, National Cheng Kung University, Tainan 700, Taiwan
Chyi Hwang* Department of Chemical Engineering, National Chung Cheng University, Chia-Yi 621, Taiwan
Although iterative dynamic programming (IDP) has been well recognized as a powerful method for finding the true optimal solution to a nonlinear dynamic optimization problem, the global optimality of the IDP solution is still not guaranteed completely. This paper explores enhancing the global convergence of using iterative dynamic programming to solve optimal control problems. Two approaches are employed to enhance the possibility of obtaining the true optimum while reducing the required computation efforts. One approach employs Sobol’s quasi-random sequence generator to generate allowable controls and the other utilizes multipass computation. Numerical examples show that the use of multipass IDP computation with small numbers of state grid points and allowable control values can indeed enhance the possibility of obtaining a true optimum. This is particularly true when the allowable control values are generated by using Sobol’s quasi-random sequence generator. Introduction Recently, the iterative dynamic programming (IDP) method has been introduced by Luss and coworkers (Bojkov and Luus, 1992a,b; Luus, 1989, 1990a,b; Luus and Rosen, 1991) to solve continuous-time optimal control problems where the systems are described by nonlinear differential equations. The IDP method is essentially based on treating control as time-stage operation, gridding the state and control variables, and utilizing region contraction to overcome the state dimensionality problem. In performing IDP computations, the state grid points are generated by assigning different values for control and at each state grid point admissible control values, which are chosen from a uniform distribution or at random, are evaluated to determine the optimal control associated with the state grid point. By contracting the control region for each time stage after each cycle of dynamic programming computation, the IDP finally obtains an optimal control profile. Due to the advantages of being easily implemented on a personal computer and having greater possibility to obtain a globally optimal solution than a nonlinear programming method, the original IDP method and some of the improved versions (Lin and Hwang, 1996b; Hwang and Lin, 1998) have been successfully applied to solve various optimal control problems of nonlinear processes (Bojkov and Luus, 1994a,b, 1996; Dadebo and McAuley, 1995b; Hartig and Keil, 1993a, 1994; Hartig et al., 1996; Keil et al., 1996; Luus, 1990c, 1991, 1993a-c; Luus and Bojkov, 1994; Luus et al., 1992; Luus and Galli, 1991; Mekarapiruk and Luus, 1997), including time-delay systems (Dadebo and Luus, 1992; Dadebo and McAuley, 1995a; Lin and Hwang, 1996a) and systems with a large number of state and control variables (Bojkov and Luus, 1992a,b, 1993, 1995; Hartig and Keil, 1993b; Luus, 1990b, 1993d; Luus and Smith, 1991). * To whom all correspondence should be addressed.
It is noted that IDP is a heuristic method that does not guarantee finding the global optimum. In the literature (Bojkov and Luus, 1993, 1994b, 1995, 1996; Hartig and Keil, 1993a; Luus and Bojkov, 1994; Luus et al., 1992; Mekarapiruk and Luus, 1997), several cases have been reported that the IDP method gives local optimal solutions. The possibility for the IDP method to obtain a local optimum depends on many parameters, such as the region contraction factor, the number of state grid points, the number of allowable control values, the choice of admissible values for control, etc. In general, using a large region contraction factor or large numbers of state grid points and control values can reduce the possibility of obtaining a local optimal solution. However, the computational cost may be exceptionally high for the case where the system has many state and control variables. Hence, the selection of appropriate control values and the multipass computation seem to be viable approaches to ensure the global optimality of an IDP solution while not increasing computation cost. The purpose of this paper is to explore the possibility of enhancing global convergence of the IDP solutions to optimal control problems. Specially, it is to use Sobol’s systematic approach (Bratley and Fox, 1988; Fox, 1986; Sobol, 1979) as an alternative way of generating allowable control values and to demonstrate the efficacy of obtaining the true optimal solution by using multipass IDP computation. Two examples are used to compare the schemes of generating control values and to compare the convergence and the computational efficiency of using single-pass and multipass IDP computations. Optimal Control Problem Consider a dynamic process described by the nonlinear differential equation
S0888-5885(97)00629-5 CCC: $15.00 © 1998 American Chemical Society Published on Web 05/15/1998
2470 Ind. Eng. Chem. Res., Vol. 37, No. 6, 1998
dx(t) ) f(x(t), u(t), t) dt
x(0) ) x0
(1)
where x ) [x1, x2, ..., xn]T is an n-state vector, u ) [u1, u2, ..., um]T is the m-control vector, and f is the n-component vector-valued function. The optimal control problem is to find the control policy u(t) bounded by
ji ui e ui(t) e u
i ) 1, 2, ..., m
(2)
over the specified time interval 0 e t e tf, where tf is the final time, such that the performance index
I ) Φ(x(tf), tf) +
∫0
tf
φ(x(t), u(t), t) dt
tk-1 e t < tk, k ) 1, 2, ..., Nt (4)
over Nt equal-length time stages, where tk ) ktf/Nt. With this control discretization, the problem is then to find the control sequence u(k), k ) 1, 2, ..., Nt, bounded by
ji ui e ui(k) e u
i ) 1, 2, ..., m
(5)
for the augmented dynamic system
[ ][
]
f(x, u, t) x dx d } )x , dt dt n+1 φ(x, u, t) u(t) ) u(k)
xˆ (0) )
tk-1 e t < tk
[ ] x0 0
Uil,k ) [uil(k), u j il(k)]
(9a)
i uil(k) ) max(ul, ui-1 l (k) - rl)
(9b)
i u j il(k) ) min(u j l, ui-1 l (k) + rl)
(9c)
(3)
is minimized. To solve this problem by the IDP method, we approximate the control u(t) by the piecewise-constant control profile
u(t) ) u(k)
In the above equations, u0l (k) and r1l denote the initial central control at stage k and the half-width of the region of the lth control variable ul(t), respectively. For the subsequent iterations, the region for each control component at each stage is determined by the region contraction factor R, the optimal control sequence of the previous iteration, and the clipping policy. Suppose that the (i - 1)th iteration of dynamic programming gives i-1 the optimal control sequence ui-1(k) ) [ui-1 1 (k), u2 (k), i-1 T ..., um (k)] , k ) 1, 2, ..., Nt. Then, the region of the lth control variable ul(t) at the ith dynamic programming iteration is determined as follows:
(6a) (6b)
such that the performance index
I ) Φ(x(tf), tf) + xn+1(tf)
(7)
is minimized. A Review of Iterative Dynamic Programming In applying the IDP method to solve the optimal control problem described by (5)-(7), it is required first to divide the time interval [0, tf] of interest into Nt equallength time stages. Then, for each optimization iteration of dynamic programming, it is required to (i) specify the control region for each time stage, (ii) construct Nx state grid points x(k, j), j ) 1, 2, ..., Nx, for all stages but the first time stage, whose state node has the initial condition x(0), and (iii) assign Nc control vectors u(k, j), j ) 1, 2, ..., Nc, for each stage. After the continuous optimization problem is discretized, the dynamic programming is then performed iteratively to determine the optimal control sequence u*(k), k ) 1, 2, ..., Nt. Before proceeding with IDP computations, the region of the lth (1 e l e m) control component ul(t) at stage k (1 e k e Nt) is set to
j 1l (k)] U1l,k ) [u1l (k), u
(8a)
u1l (k) ) max(ul, u0l (k) - r1l )
(8b)
u j 1l (k) ) min(u j l, u0l (k) + r1l )
(8c)
where
where ril ) Rri-1 l . According to (9a), the width of the region for the lth control variable at stage k is given by
|Uil,k| ) u j il(k) - uil(k)
(10)
Once the control region for the ith iteration of dynamic programming has been determined, we can construct the state grids by first choosing Nx control sequences {uj(1), uj(2), ..., uj(Nt)}, j ) 1, 2, ..., Nx with
uj(k) ) ui(k) +
j-1 i [|Ui |, |U2,k |, ..., |Uim,k|]T Nx - 1 1,k k ) 1, 2, ..., Nt (11)
and then integrating state equation (6) with each of these control sequences to obtain state grid points x(k, j), j ) 1, 2, ..., Nx; k ) 2, 3, ..., Nt. Such an approach of constructing state grid points was proposed by Luus (1989). The Nc allowable controls ui(k, j), j ) 1, 2, ..., Nc for stage k at the ith iteration of dynamic programming optimization are taken values from the hyperrectangle i i × U2,k × ... × Uim,k Uik ) U1,k
(12)
The scheme of assigning allowable controls will be discussed in detail in the next section. Having set up the parameters for the ith iteration of dynamic programming, the optimal control sequence ui(k), k ) 1, 2, ..., Nt can be obtained by the algorithm of reverse dynamic programming (Luus, 1990b; Hartig and Keil, 1993a; Hwang and Lin, 1998). Generating Allowable Values for the Control Optimization by IDP The choice of allowable values for the control of each stage plays a crucial role in dynamic programming optimization. In the IDP computation with Nc allowable controls and Nx state grid points for each stage, it requires NcNt(Nx(Nt - 1)/2 + 1) times of state shootings to complete a cycle of optimization if the usual reverse dynamic programming (Luus, 1990b) is used. Here a state shooting means that of integrating the augmented system equation (6) over the time interval of a stage. Usually, the number of state grid points, Nx, chosen for each time stage is proportional to the number of
Ind. Eng. Chem. Res., Vol. 37, No. 6, 1998 2471
allowable control values, Nc. Also, the number of state grid points, Nx, is large enough to ensure the global convergence. As a result, the number of state shootings increases almost quadratically with Nc. However, several examples appeared in the literature as well as the examples given in this work reveal the fact that very little may be gained by using too many allowable controls for each stage. In the past development of the IDP computation algorithms, two schemes of assigning allowable control values have been used. If the number of control variables m is small, a uniform gridding of control space is preferable to the random distribution. Suppose that at stage k, Ml + 1 values for the lth control variable ul(t) are distributed uniformly over the control region i i-1 i i [ui-1 j il(k)]. Then there l (k) - rl, ul (k) + rl] } [ul(k), u m are Nc ) ∏l)1(Ml + 1) allowable control vectors [v1, v2, ..., vm]T, where
vl )
uil(k)
2ril + jl Ml
(13)
and jl ) 0, 1, ..., Ml; l ) 1, 2, ..., m. To include the central control sequence ui-1(k) of the previous iteration of dynamic programming computation in the above set of control vectors so as to obtain a nonincreasing convergence of the performance index, the numbers Ml, l ) 1, 2, ..., m, should be chosen as even integers. On the other hand, if the number of control variables m is quite large, it becomes difficult to use uniformlydistributed control values. In this case the allowable control vectors for each time stage and each dynamic programming iteration are usually taken at random from the allowable control domain (12), which is an m-dimensional hyperrectangle. Suppose that Nc allowable control vectors are chosen for each stage and the central control ui-1(k) is to be included as one of the allowable control vectors, it is required first to generate a sequence of random numbers rν, ν ) 1, 2, ..., mNt(Nc - 1), which are uniformly distributed in the interval [0, 1]. Then, allowable control vectors ui(k, j), which lie in the control region (9), are obtained as
ui(k, j) )
{
[v1, v2, ..., vm]T for j ) 1, ..., Nc - 1 (14) for j ) Nc ui-1(k)
where
ul ) uil(k) + rν(u j il(k) - uil(k)) and ν ) (k - 1)(Nc - 1)m + m(j - 1) + l. It is noted that, in the IDP computation using randomly-generated allowable control values, the sequences of random variables generated are not repeated from stage to stage and from iteration to iteration. In this paper, we suggest the use of a systematic approach to generate allowable values of control for the dynamic programming optimization. The approach is based on using Sobol’s quasi-random sequence generator to generate a sequence of points that are distributed very uniformly in a multidimensional hypercube. These points have been successfully used for a systematic crude search, as starting points for a global search algorithm. For a detailed description of Sobol’s quasirandom sequence generator and its implementation algorithm, the readers are referred to the papers by
Bratley and Fox (1988), Fox (1986), Sobol (1979), Homma and Saltelli (1995). By using Sobol’s quasi-random sequence generator, we can obtain in sequence M sets of (Nc - 1) points in the m-dimensional unit hypercube [0, 1]m. Let these points be denoted by
pi,j ) [pi,j,1, pi,j,2, ..., pi,j,m]T
j ) 1, 2, ..., Nc - 1 (15)
where i ) 1, 2, ..., M. On the basis of these sets of points, the allowable controls for each stage are assigned as follows:
ui(k, j) )
{
[v1, v2, ..., vm]T for j ) 1, ..., Nc - 1 (16) for j ) Nc ui-1(k)
where
j il(k) - u j il(k)) vl ) uil(k) + pµ,j,l(u µ is equal to 1 plus the remainder of dividing i by M. Here it should be noted that the points generated by Sobol’s quasi-random sequence generator are deterministic rather than stochastic. Also noted is that the allowable control patterns are repeated for every M iterations of dynamic programming computation. Before leaving the section, it is noted that the values of allowable control generated by the uniform distribution approach may violate the bounds (2). If a component of the generated allowable control vector u(k, l) does not reside in the feasible control region [u, u j ], a simple clipping technique (Luus 1989, 1990a-c) or a modified clipping technique (Hartig and Keil, 1993a) can be used to shift the infeasible control component into the allowable region. Multipass IDP Computation A pass of IDP computation consists of Ni iterations of dynamic programming optimization with contracting the control region by a factor R in each iteration. The optimal control sequence obtained at the current iteration will be used as the central control profile of the next iteration. Similarly, in solving optimal control problems by multipass IDP with a pass region contraction factor η, the optimal control sequence obtained in the current pass of IDP optimization will be used as the central control profile of the next pass. Multiple passes of IDP computation have been used (Bojkov and Luus, 1992a; Dadebo and McAuley, 1995a; Luus and Galli, 1991; Luus and Rosen, 1991) to refine the piecewise-constant control policy by doubling the number of time stages so that it is comparable to the optimal continuous control profile. Recently, it has been found (Luus, 1993d; Luus et al., 1995) that several optimal control problems can be effectively solved by multipass IDP computation using only a single state grid for each stage. Therefore, it is worth examining the global convergence property of multipass IDP computation. For this purpose, we outline in the following the procedure of multipass IDP computation. Procedure of Multipass IDP Scheme (1) Let the control region size be initially given by r(1). Select the parameters Np, Nt, and η, and the central control profile u(0)(k), k ) 1, 2, ..., Nt. (2) Set the pass index p ) 1.
2472 Ind. Eng. Chem. Res., Vol. 37, No. 6, 1998
(3) Select the parameters Ni, Nc, and R. (4) Set the initial control region size to p(p). Use u(p-1)(k), k ) 1, 2, ..., Nt, as the central control profile and then use an existing IDP algorithm (Hwang and Lin, 1998) to perform one pass of Ni iterations of dynamic programming optimization and obtain the optimal control sequence u(p)(k), k ) 1, 2, ..., Nt. (5) Increment the pass index p by 1, and restore the control region size to r(p) ) ηr(p-1). (6) If p e Np, go to step (3). (7) Stop. Illustrative Examples To illustrate the enhancement of global convergence of the IDP method by using a systematic approach to assign allowable controls and using multipass computation, two examples are worked out. The improved IDP algorithm of Hwang and Lin (1998) was invoked to perform dynamic optimization. The required state shootings are performed with a fourth-order RungeKutta method with a step length h. All the numerical computations are carried out in double precision on an IBM RS/6000 workstation. Since the major computation burden lies in the state shootings for evaluating the performance index, the computational efficiency will be measured by the number of state shootings, Nss, actually performed in the computation. Example 1. Consider an isothermal photochemical continuous stirred tank reactor (CSTR) in which the following five chemical reactions are occurring
A + B f 2D C + B f (CB) (CB) + B f 2E E + D f 2F F + A f 2G where A, B, and C are the reactants, (CB) is an intermediate, and D, E, F, and G are the products. This typical chemical engineering system has been used for optimal control studies by several authors (Jensen, 1964; Lapidus and Luus, 1967; Rao and Luus, 1972; Luus, 1990b, 1991; Bojkov and Luus, 1993). The system equations are given by Lapidus and Luus (1967):
x˘ 1(t) ) u4(t) - q(t) x1(t) - 17.6x1(t) x2(t) 23x1(t) x6(t) u3(t) x˘ 2(t) ) u1(t) - q(t) x2(t) - 17.6x1(t) x2(t) 146x2(t) x3(t) x˘ 3(t) ) u2(t) - q(t) x3(t) - 73x2(t) x3(t) x˘ 4(t) ) -q(t) x4(t) + 35.2x1(t) x2(t) - 51.3x4(t) x5(t) x˘ 5(t) ) -q(t) x5(t) + 219x2(t) x3(t) - 51.3x4(t) x5(t) x˘ 6(t) ) -q(t) x6(t) + 102.6x4(t) x5(t) 23x1(t) x6(t) u3(t) x˘ 7(t) ) -q(t) x7(t) + 46x1(t) x6(t) u3(t) where x1(t), x2(t), x3(t), x4(t), x5(t), x6(t), and x7(t) denote the concentrations of species A, B, C, D, E, F, and G, respectively. The quantity
q(t) ) u1(t) + u2(t) + u4(t) is the total feed rate, u4(t), u1(t), and u2(t) are the flow
rates of A, B, and C, respectively, and u3(t) is the square root of the light intensity. For the given initial state
x(0) ) [0.1883, 0.2507, 0.0467, 0.0899, 0.1804, 0.1394, 0.1046]T and the control bounds
0 e u1(t) e 20 0 e u2(t) e 6 0 e u3(t) e 4 0 e u4(t) e 20 the optimal control objective is to maximize the performance index
I)
∫0t [5.8(q(t) x1(t) - u4(t)) - 3.7u1(t) - 4.1u2(t) + f
q(t)(23x4(t) + 11x5(t) + 28x6(t) + 35x7(t)) - 5u32(t) 0.099] dt by choosing the controls u1(t), u2(t), u3(t), and u4(t) in the time interval 0 e t e tf. As reported in the literature (Luus, 1990b), the use of single-pass IDP with Nt ) 11 time stages, Nx ) 27 state grid points for each stage, and Nc ) 81 ()34) rectangular grids for allowable controls gave the optimal performance index I ) 21.76. We solved the same problem by using single-pass IDP with the same number of time stages and the same allowable controls but with 15 state grid points, the optimal performance index is converged to I ) 21.7575 in 29 iterations. The total number of state shootings performed is Ntss ) 1 963 764, which will be used as a comparison basis of the computational savings of the multipass IDP computation. In order to verify this result, we solved the problem again by using single x-grid multipass IDP computation with the same number of time stages and the same allowable controls. The values of the optimum performance I is still converged to 21.7575 by 20 passes of 20-iteration IDP computations. Hence, in the following, the value of 21.7575 will be regarded as the true optimal solution for the case where Nt ) 11 is used. To solve this problem by the IDP method with the allowable controls chosen randomly or systematically, we partitioned the time interval [0, 0.2] into Nt ) 11 equal-length time stages and set the region contraction factor R ) 0.85. The step length of numerical integration used for each state shooting is h ) 1/110. The initial central control for each time stage is set as u0(k) ) [10, 3, 2, 6]T, k ) 1, 2, ..., Nt, and the initial halfwidth of the control region is given as r1 ) } [r11, r12, r13, r14]T ) [10, 3, 2, 6]T. The converged performance indices by 30-iteration IDP computations are listed in Table 1 for using different numbers of state grid points and allowable controls, and using different approaches to generating allowable controls. As can be seen from the table, single-pass IDP computations with allowable controls chosen either randomly or systematically fail to converge to the true optimal solution. For each set of IDP parameters, none of ten runs with randomly chosen allowable controls gave the true optimal solution. Furthermore, the single-pass IDP computation using large numbers of allowable controls and state grid
Ind. Eng. Chem. Res., Vol. 37, No. 6, 1998 2473 Table 1. Comparison of Convergence of Performance Indices Obtained by Single-Pass IDP Computations with Allowable Controls Generated by Sobol’s Quasi-Random Sequence and Random Distribution for Example 1a Nc ) 40 allowable controls random 1 random 2 random 3 random 4 random 5 random 6 random 7 random 8 random 9 random 10 Sobol’s quasi-random
I Nss I Nss I Nss I Nss I Nss I Nss I Nss I Nss I Nss I Nss I Nss
Nc ) 60
Nx ) 9
Nx ) 15
Nx ) 21
Nx ) 9
Nx ) 15
Nx ) 21
21.676 82 600 520 21.628 28 600 880 21.559 91 602 040 21.610 08 597 600 21.525 43 601 200 21.711 68 600 440 21.532 44 599 680 21.413 99 601 240 21.666 26 600 480 21.542 95 600 520 21.667 71 607 200
21.432 27 977 240 21.574 94 975 120 21.603 51 975 760 21.604 90 980 400 21.499 09 976 480 21.642 03 979 760 21.607 40 977 160 21.558 19 975 040 21.482 24 981 040 21.405 47 978 000 21.756 56b 996 000
21.490 95 1 337 160 21.451 03 1 346 360 21.469 32 1 341 840 21.683 75 1 338 840 21.490 87 1 339 200 21.602 91 1 343 760 21.544 03 1 343 600 21.664 57 1 341 760 21.700 24 1 338 000 21.696 12 1 342 640 21.752 24c 1 381 640
21.500 35 909 660 21.641 68 903 660 21.609 35 906 600 21.617 06 909 600 21.707 06 906 720 21.535 84 905 400 21.711 80 904 440 21.678 41 904 800 21.540 22 906 600 21.511 78 907 800 21.619 90 910 800
21.501 01 1 482 600 21.668 84 1 483 200 21.605 39 1 487 580 21.706 40 1 485 000 21.593 36 1 484 100 21.722 60 1 491 000 21.543 68 1 486 200 21.722 64 1 482 000 21.609 03 1 483 320 21.477 46 1 488 000 21.611 45 1 500 000
21.651 83 2 054 400 21.577 37 2 050 920 21.731 67 2 047 320 21.709 77 2 055 240 21.643 88 2 057 820 21.511 81 2 043 840 21.719 82 2 045 820 21.538 91 2 053 980 21.648 57 2 054 520 21.742 17 2 053 560 21.725 32 2 088 000
a I: performance index. N : number of state shootings performed. b Performance index within 0.01% of the optimum. c Performance ss index within 0.03% of the optimum.
points, which requires an excessively large number of state shootings, still give the local solution rather than the global solution. In the most runs, IDP with allowable controls generated by Sobol’s quasi-random distribution gave better results than those generated by a random distribution. On the basis of the results in Table 1, we infer that the multipass IDP may be an effective alternative to obtain the true optimal solution. To verify this, we solved the problem using multi-pass IDP computation with a few state and control grid points. The convergence of the performance index versus the pass number for multipass 20-iteration IDP computations using Sobol’s systematically-generated allowable controls is shown in Figure 1. Table 2 shows the convergence of the performance index by multipass 20-iteration IDP computations with the allowable controls generated by Sobol’s quasi-random sequence. Table 3 shows the convergence of the performance index by multipass 20iteration IDP computation with allowable control generated by random distribution. It is seen from Table 2 that, with the allowable controls generated by Sobol’s quasi-random sequence, all sets of multipass IDP computation gave global optimal solutions. In contrast, as can be seen from Table 3, not all multipass IDP computations using randomly-generated allowable controls gave the true optimal solution. In other words, Sobol’s systematic distribution of points is a robust approach to generate allowable controls to ensure the global convergence of the multipass IDP computation. In order to compare the computation burdens of singlepass and multipass IDP computation, the actual numbers of state shootings performed in each run of IDP computation with the allowable control generated by Sobol’s quasi-random sequence is an efficient and effective choice for obtaining the global optimal solution. We are now in a position to examine in greater detail the effect of the number of allowable controls Nc and the contraction factor R on the convergence of the multipass IDP computation. Let us first investigate the influence of the number of allowable controls Nc on the
Figure 1. Convergence of the performance index by the multipass IDP algorithm with Sobol’s quasi-random distribution of allowable controls for example 1.
convergence of the performance index to within 0.03% of the optimal performance index I ) 339.10. In order to compare with the results obtained by utilizing randomly-chosen values for allowable controls (Bojkov and Luus, 1993), we solved this problem with tf ) 4.0, Nt ) 20, Ni ) 20, R ) 0.85, h ) 0.01, and the same initial control policy and initial region size as before. In actual computations, multipass IDP computations were performed if the number of allowable controls Nc is so small as not to obtain a convergent performance index within 0.03% of the optimum by a one-pass IDP computation. In performing the multipass IDP computation, the region contraction factor η between passes is set as η ) R, and 15 IDP passes were carried out. The cumulative
2474 Ind. Eng. Chem. Res., Vol. 37, No. 6, 1998 Table 2. Convergence of the Performance Indices Obtained by Multipass IDP Computations with Allowable Control Generated by Sobol’s Quasi-Random Sequence for Example 1a Nx ) 1 pass 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
I Nss I Nss I Nss I Nss I Nss I Nss I Nss I Nss I Nss I Nss I Nss I Nss I Nss I Nss I Nss I Nss E
Nx ) 3
Nc ) 10
Nc ) 15
Nc ) 20
Nc ) 10
Nc ) 15
Nc ) 20
21.2881 13 200 21.5348 13 200 21.6258 13 200 21.6572 13 200 21.7043 13 200 21.7475 13 200 21.7538d 13 200 21.7547d 13 200 21.7558c 13 200 21.7564c 13 200 21.7568c 13 200 21.7570c 13 200 21.7572c 13 200 21.7573c 13 200 21.7574c 13 200 21.7575b 13 200 89.25%
21.5658 19 800 21.7219 19 800 21.7566c 19 800 21.7572c 19 800 21.7574c 19 800 21.7574c 19 800 21.7574c 19 800 21.7575b 19 800
21.6406 26 400 21.7469 26 400 21.7503 26 400 21.7522d 26 400 21.7537d 26 400 21.7547d 26 400 21.7563c 26 400 21.7569c 26 400 21.7570c 26 400 21.7572c 26 400 21.7574c 26 400 21.7575b 26 400
21.2586 35 100 21.4745 35 200 21.5613 35 200 21.5888 35 200 21.6243 35 200 21.6542 35 200 21.6904 35 200 21.7320 35 200 21.7491 35 200 21.7532d 35 200 21.7554c 35 200 21.7563c 35 200 21.7569c 35 200 21.7573c 35 200 21.7575b 35 200
21.5445 52 800 21.6352 52 800 21.6846 52 800 21.7494 52 800 21.7550d 52 800 21.7565c 52 800 21.7573 52 800 21.7575b 52 800
21.4077 70 400 21.6180 70 400 21.6773 70 400 21.7550d 70 400 21.7562c 70 400 21.7567c 70 400 21.7571c 70 400 21.7573c 70 400 21.7574c 70 400 21.7575b 70 400
91.93%
83.87%
73.12%
78.49%
64.15%
I: performance index. Nss: number of state shootings performed. E percentage of reduction in the number of state shootings, E ) 100% × (Ntss - N*ss)/Ntss, Ntss ) 1 963 764, N*ss ) cumulative number of state shootings. b Optimal performance. c Performance index within 0.01% of the optimum. d Performance index within 0.03% of the optimum. a
Table 3. Number of Passes Performed by the Multipass IDP Computations with Randomly-Chosen Allowable Controls To Obtain Optimal Performance Index 21.7575 for Example 1a Nx ) 1
Nx ) 3
run
Nc ) 10
Nc ) 15
Nc ) 20
Nc ) 10
Nc ) 15
Nc ) 20
1 2 3 4 5 6 7 8 9 10
6 15 (21.7574) 13 9 8 (21.7569) 15 10 (21.7572)
(21.5118) 9 (21.4595) 11 (21.5383) (21.4903) 13 10 12 13
10 7 (21.5389) 9 12 9 14 6 10 10
12 13 10 10 12 7 12 14 6 13
6 8 10 9 8 (21.5385) 10 8 8 7
(21.4595) 6 4 6 9 8 7 8 6 8
a The parenthesized figures represent the converged performance indices, which are not the true optimum, obtained by 20 passes of IDP computations.
number of iterations Iter and the number of cumulative state shootings N*ss are recorded in Table 4. It can be observed from the table that (i) the global convergence of the IDP solutions appears to be independent of the number of state grid points Nx used; (ii) the total number of state shootings performed increases as the number of state grid points is increased; (iii) the total number of state shootings required to give a converged solution is not necessarily increased with the increase of the number of allowable controls; (iv) the convergence does not have a significant improvement when the number of allowable controls Nc is greater than 10; (v) when Nc is less than 8, the multipass IDP is required to ensure the solution convergence. It is interesting to
note that multipass IDP with a single x-grid requires fewer state shootings than that with multiple x-grids. In the present case, single x-grid multipass IDP Nc ) 8 gives a converged solution while requiring the least total number of state shootings N* ss ) 31 920. Note that Bojkov and Luus (1993) have applied onepass IDP with Nc ) 16 randomly-chosen values for control and Nx ) 3 state grid points to obtain a convergence to within 0.03% of the optimum in 16 iterations. The cumulative number of state shootings was N* ss ) 148 480. Hence, the percentage of reduction in the total number of state shootings by using the proposed systematic approach to assign allowable controls can be as high as
Ind. Eng. Chem. Res., Vol. 37, No. 6, 1998 2475 Table 4. Effect of the Parameters Nc and Nx Used in Multipass IDP Computations (with tf ) 4.0, Nt ) 20 and Sobol’s Quasi-Random Allowable Controls) on the Convergence of the Performance Index to within the 0.03% of the True Optimum for Example 1a Nx ) 1 Nc Iter
N*ss
Nx ) 3 Iter
N*ss
Nx ) 5 Iter
N*ss
Nx ) 7 Iter
N*ss
3 136 85 680 155 205 782 119 167 553 136 213 537 4 40 33 600 55 113 440 55 123 712 55 44 852 5 59 61 950 54 143 555 72 228 955 72 265 610 6 78 98 280 78 270 420 77 336 468 77 398 124 7 58 85 260 58 228 900 58 315 595 39 258 363 8 19 31 920 19 89 680 19 129 352 19 156 064 9 20 37 800 37 196 470 37 310 698 37 393 129 10 19 39 900 20 118 000 20 177 080 20 226 840 15 19 59 850 19 168 150 19 276 450 20 393 315 20 19 79 800 19 224 200 18 349 200 19 497 800 25 18 94 500 18 265 500 17 401 225 17 562 950 30 19 119 700 18 318 600 18 523 800 18 728 430 35 20 147 000 17 351 050 19 645 050 18 850 500 40 19 159 600 17 401 200 18 698 400 17 918 000 a Iter: cumulative number of iterations. b N* : cumulative ss number of state shootings.
E)
N* ss - Nss × 100% ) 78.50% N* ss
The convergence of the IDP solution also depends on the contraction factor R used. In one-pass IDP computation, the smaller contraction factor R is used, the smaller number of state shootings is required, and the higher possibility of a local optimum will be obtained. To investigate the effect of the contraction factor R on the convergence in multipass IDP computation, we solved the problem by using single x-grid multipass IDP with Nc ) 6, 8, 10, and 15. In these computations, the contraction factor η between passes has the same value as R. We compare in Figure 2 the number of cumulative state shootings required by multipass IDP with the systematic control assignment and by one-pass IDP with the random control assignment for various values of contraction factor R. In this figure, the number of state shootings required for each one-pass IDP computation is calculated by multiplying NcNt(1 + Nx(Nt - 1)/2) with the number of iterations performed, which is quoted from Figure 5 in the paper by Bojkov and Luus (1993). It is noted that for R < 0.8, one-pass IDP with random control assignment did not yield convergence for Nc ) 11, 16, and 21 while multipass IDP with systematic control assignment yields convergence, even if a small contraction factor R ) 0.65 and a small of number of allowable controls Nc ) 6 were used. Hence, as can be seen from Figure 2, multipass IDP with systematic control assignment is more robust to obtain global optimum while requiring less computational effort than one-pass IDP with random control assignment. Example 2. Consider the nth-order linear timeinvarient dynamic system with n inputs previously presented by Nagurka et al. (1991):
x3 (t) ) Ax(t) + u(t)
x(0) ) [1, 2, ..., n]T
where x [x1, ..., xn]T is the state vector, u ) [u1, ..., un]T, and the system matrix A is given by
Figure 2. Effect of the contraction factor R on the number of state shootings required to yield convergence with 0.03% of the optimum for example 1 with Nt ) 20 and tf ) 4.
[
0 0 · A) · · 0 1
1 0 · · · 0 -2
0 1 · · · 0 · · ·
· · · · · · · · · · · · · · ·
0 0 · · · 1 (-1)n+1n
]
The problem here is to find the optimal policy u(t) in the time interval 0 e t e 1 that minimizes the performance index
I ) 10xT(1) x(1) +
∫01(xT(t) + uT(t) u(t)) dt
Recently, Bojkov and Luus (1992a,b) have solved this problem for the cases of n ) 2, 6, 8, and 20 with the randomly-chosen values for control. In particularly, Bojkov and Luus (1992a) found the optimal control policy for the eighth-order by using a three-pass IDP method (Luus and Galli, 1991). In their multipass IDP computations, the number of time stages was doubled after each pass and the control policy at the end of the pass was used as the initial policy for the following pass. The IDP parameters used to solve the problem are as follows: the region contraction factor R ) 0.85, the number of state grid points Nx ) 11, the number of random allowable values for control Nc ) 100 and the integration step size of h ) 0.05 for the fifth-order Runge-Kutta method. The initial values for control policy are chosen as u0(k) ) [-2, -3, -4, -6, -7, -8, -8, -1]T, k ) 1, 2, ..., Nt, and initial region sizes are given as r ) [1, 1, 1, 1, 1, 1, 1, 1]T. Moreover, they obtained the minimum performance index I ) 373.17 for the case where the time interval [0, 1] was divided into Nt ) 20 equal-length time stages. It is noted that the number of state shootings for single-pass IDP with Nt ) 10, Nx ) 11, Nc ) 100, and Ni ) 30 is Nsst ) 1 515 000, which will be used as the comparison basis for the computational savings of the multipass IDP computation.
2476 Ind. Eng. Chem. Res., Vol. 37, No. 6, 1998 Table 5. Comparison of the Convergence of Performance Indices by Single-Pass IDP Computations with Allowable Controls Generated by Sobol’s Quasi-Random Sequence and Random Distribution for Example 2a Nx ) 7 allowable controls random 1 random 2 random 3 random 4 random 5 random 6 random 7 random 8 random 9 random 10 Sobol’s quasi-random
I Nss I Nss I Nss I Nss I Nss I Nss I Nss I Nss I Nss I Nss I Nss
Nx ) 9
Nc ) 50
Nc ) 75
Nc ) 100
Nc ) 50
Nc ) 75
Nc ) 100
373.993 319 700 373.890 318 700 373.853 322 750 373.843 321 850 373.875 323 200 373.865 324 100 373.848 319 050 373.907 321 400 373.786 322 300 373.857 322 350 373.701c 317 500
373.778 485 475 373.798 484 800 373.726 486 150 373.830 486 150 373.810 486 150 373.714 485 475 373.747 487 500 373.764 485 475 373.762 485 475 373.733 481 200 373.648c 426 250
373.716 650 000 373.692c 650 000 373.732 649 100 373.722 650 000 373.738 649 100 373.767 649 100 373.719 650 000 373.785 646 500 373.765 648 200 373.710 647 300 373.683c 635 000
373.869 410 950 374.032 407 950 374.196 408 750 373.788 407 800 373.966 407 800 373.996 409 600 373.828 407 400 373.843 409 150 373.838 405 650 373.876 409 200 373.665c 385 000
373.803 617 775 373.824 620 475 373.774 621 825 373.805 620 475 373.789 617 175 373.751 619 125 373.743 619 800 373.775 621 150 373.761 621 825 373.765 619 800 373.625b 608 900
373.727 827 300 373.718 829 100 373.715 829 100 373.712 829 100 373.758 828 200 373.728 829 100 373.747 825 500 373.711 828 200 373.682c 827 300 373.710 829 100 373.684c 805 000
a I: performance index. N : number of state shootings performed. b Performance index within 0.01% of the optimum. c Performance ss index within 0.03% of the optimum.
Table 6. Converged Performance Indices and Number of State Shootings Performed by the Multipass IDP Computation with Sobol’s Quasi-Random Allowable Controls for Example 2a Nx ) 1 pass 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
I Nss I Nss I Nss I Nss I Nss I Nss I Nss I Nss I Nss I Nss I Nss I Nss I Nss I Nss I Nss I Nss I Nss I Nss I Nss E
Nx ) 3
Nc ) 10
Nc ) 15
Nc ) 10
Nc ) 10
Nc ) 15
Nc ) 10
403.079 11 000 379.798 11 000 375.168 11 000 374.337 11 000 373.952 11 000 373.763 11 000 373.695d 11 000 373.648d 11 000 373.639d 11 000 373.633d 11 000 373.613c 11 000 373.608c 11 000 373.602c 11 000 373.600c 11 000 373.598c 11 000 373.597c 11 000 373.596c 11 000 373.595b 11 000
381.857 16 500 375.523 16 500 374.336 16 500 373.815 16 500 373.723 16 500 373.664d 16 500 373.641d 16 500 373.623c 16 500 373.614c 16 500 373.606c 16 500 373.602c 16 500 373.599c 16 500 373.596c 16 500 373.595c 16 500 373.595b 16 500
378.540 22 000 374.365 22 000 373.835 22 000 373.733 22 000 373.678d 22 000 373.653d 22 000 373.636d 22 000 373.615c 22 000 373.608c 22 000 373.605c 22 000 373.604c 22 000 373.600c 22 000 373.598c 22 000 373.596c 22 000 373.595b 22 000
380.077 43 500 374.870 43 500 373.940 43 500 373.762 43 500 373.690d 43 500 373.651d 43 500 373.628c 43 500 373.620c 43 500 373.610c 43 500 373.604c 43 500 373.601c 43 500 373.598c 43 500 373.597c 43 500 373.596c 43 500 373.595b 43 500
379.508 58 000 374.382 58 000 373.860 58 000 373.696d 58 000 373.670d 58 000 373.648d 58 000 373.635d 58 000 373.619c 58 000 373.609c 58 000 373.603c 58 000 373.600c 58 000 373.599c 58 000 373.597c 58 000 373.596c 58 000 373.595b 58 000
86.93%
83.66%
78.22%
407.026 26 300 379.869 26 660 375.092 26 300 374.161 26 300 373.849 26 300 373.742 26 300 373.682d 26 300 373.652d 26 300 373.639d 26 300 373.626c 26 300 373.621c 26 300 373.613c 26 660 373.608c 26 300 373.604c 26 300 373.601c 26 300 373.599c 26 300 373.597c 26 300 373.596c 26 660 373.595b 26 300 66.95%
56.93%
42.57%
I: performance index. Nss: number of state shootings performed. E: percentage of reduction in the number of state shootings, E ) 100 × (Ntss - N*ss)/Ntss, Ntss ) 1 515 000, N*ss ) cumulative number of state shootings. b Optimal peformance index. c Performance index within 0.01% of the optimum. d Performance index within 0.03% of the optimum. a
Ind. Eng. Chem. Res., Vol. 37, No. 6, 1998 2477
Here, we solved the problem with n ) 8 to demonstrate the efficacy and robustness of the multipass IDP computation with allowable controls generated by Sobol’s quasi-random sequence in obtaining the global optimal solution. First, we examine the convergence property of the single-pass IDP computation with randomlychosen allowable control values. For these purposes, the following IDP parameters were used: Nt ) 10, Nc ) 50, 75, and 100, Nx ) 7, 9, h ) 0.01. The initial control policy is chosen as uc(k) ) [-5, -5, -5, -5, -5, -5, -5, -5]T, k ) 1, 2, ..., Nt, and the initial half region size is given by r0 ) [5, 5, 5, 5, 5, 5, 5, 5]T. The converged performance indices I for Ni ) 30 and the number of state shootings Nss are recorded in Table 5 for 60 runs. With the same parameters but using Sobol’s quasi-random sequence to generate allowable control values, we obtained the converged performance indices shown in the bottom row of Table 5. It is observed from Table 5 that none of these runs gave the true optimal solution. However, all single-pass IDP computations with systematically-chosen allowable controls were able to obtain a performance index within 0.03% of the optimum values (373.595) while only 2 out of 60 runs using randomly-chosen controls converged to the same performance level. Moreover, the singlepass IDP computation using randomly-chosen allowable controls fails to obtain the true optimal solution even by the number of allowable controls is chosen to be sufficiently large. Now let us solve this problem using multipass IDP computation with the allowable controls generated by Sobol’s quasi-random sequence. We solved the problem by the multipass IDP scheme with R ) 0.85, η ) 0.85, Nt ) 10, Ni ) 20, and h ) 0.01 for various combinations of the number of allowable controls Nc ) 10, 15, 20 and the number of state grids Nx ) 1, 3. The convergence properties obtained and the number of state shootings performed in these multipass IDP computations are listed in Table 6. It is noted that all multipass IDP computations gave the true optimal solutions, even for the cases where the single state grid point and a small number of allowable controls were used. As compared with the results shown in Table 5, we know that the multipass IDP computation along with the allowable control being generated from Sobol’s quasi-random sequence is a robust and efficient approach to obtain the global solution to the optimal control problem by the IDP computation. Conclusions Iterative dynamic programming using uniformlydistributed allowable controls is a reliable method of finding optimal piecewise constant control policy for nonlinear optimal control problems. However, the required computation burden is excessively high as the number of control variables is large. To avoid the combinatorial explosion of the number of allowable controls by using rectangular gridding of the feasible control space, it has been suggested in the literature to use a prescribed number of the allowable controls generated by using the random numbers with a uniform distribution in the interval (0, 1). In this paper, we have shown by numerical examples that the single-pass IDP computation with the randomly-distributed allowable controls is not so robust to obtain the true optimal solution. In order to enhance the global convergence of the IDP method without incrasing the computation load,
we propose the use of Sobol’s quasi-random sequence to generate allowable controls and the use of multipass IDP computation with very few state grid points and a small number of allowable controls. Simulation results show that the multipass IDP method with allowable controls generated by Sobol’s quasi-random sequence is indeed a robust and efficient means of obtaining the true optimal solution to the optimal control problems having a large number of control variables. Acknowledgment This work was supported by the National Science Council of the Republic of China under Grant NSC-872214-E-194-001. Nomenclature E ) reduction ratio in the number of state shootings f(‚) ) vector-valued function I ) performance index defined in (3) IDP ) iterative dynamic programming Iter ) cumulative number of iterations NLP ) nonlinear programming Nc ) number of control grid points Ni ) number of iterations Np ) number of passes Nss ) number of state shootings N* ss ) cumulative number of state shootings Nt ) number of equilength time stages Nx ) number of state grid points ri ) half-width of the control region vector for the ith iteration tf ) final time u(k) ) control used for stage k u(k, j) ) jth allowable control for stage k u(t) ) m-component control vector ui(k) ) central control vector used for stage k at the ith iteration u ) lower bound of the control vector u(t) u j ) upper bound of the control vector u(t) Uil,k ) region for the lth component of control for stage k at the ith iteration Uik ) m-dimensional hyperrectangle defined in (12) x(k, j) ) jth state grid point for stage k x(t) ) n-component state vector xˆ (t) ) augmented state vector in (7a) x0 ) initial state vector Greek Letters R ) region contraction factor between iterations η ) region contraction factor between passes Φ(‚) ) scalar function defined in (3) φ(‚) ) scalar function defined in (3) Suffixes (Subscripts/Superscripts) a ) optimal performance index b ) performance index within 0.01% of the optimum c ) performance index within 0.03% of the optimum i ) component i, ith element T ) matrix transpose * ) optimal value
Literature Cited Bojkov, B.; Luus, R. Extension of Iterative Dynamic Programming to High-Dimensional Systems by using Randomly Chosen Values for Control. Proc. Am. Control Conf. 1992a, 194. Bojkov, B.; Luus, R. Use of Random Admissible Values for Control in Iterative Dynamic Programming. Ind. Eng. Chem. Res. 1992b, 31, 1308.
2478 Ind. Eng. Chem. Res., Vol. 37, No. 6, 1998 Bojkov, B.; Luus, R. Evaluation of the Parameters Used in Iterative Dynamic Programming. Can. J. Chem. Eng. 1993, 71, 451. Bojkov, B.; Luus, R. Application of Iterative Dynamic Programming to Time Optimal Control. Trans. Inst. Chem. Eng. 1994a, 72A, 72. Bojkov, B.; Luus, R. Time Optimal Control by Iterative Dynamic Programming. Ind. Eng. Chem. Res. 1994b, 33, 1486. Bojkov, B.; Luus, R. Time Optimal Control of High Dimensional Systems by Iterative Dynamic Programming. Can. J. Chem. Eng. 1995, 73, 380. Bojkov, B.; Luus, R. Optimal Control of Nonlinear Systems with Unspecified Final Times. Chem. Eng. Sci. 1996, 51 (6), 905. Bratley, P.; Fox, B. L. Algorithm 659: Implementing Sobol’s Quasirandom Sequence Generator. ACM Trans. Math. Soft. 1988, 14 (1), 88. Dadebo, S.; Luus, R. Optimal Control of Time-Delay Systems by Dynamic Programming. Opt. Control App. Methods 1992, 13, 29. Dadebo, S. A.; McAuley, K. B. Iterative Dynamic Programming for Minimum Energy Control Problems with Time Delay. Opt. Control App. Methods 1995a, 16, 217. Dadebo, S. A.; McAuley, K. B. A Simultaneous Iterative Solution Technique for Time-Optimal Control using Dynamic Programming. Ind. Eng. Chem. Res. 1995b, 34, 2077. de Tremblay, M.; Luus, R. Optimization of Non-Steady-State Operation of Reactors. Can. J. Chem. Eng. 1989, 67, 494. Fox, B. L. Algorithm 647: Implementation and Relative Efficiency of Quasirandom Sequence Generators. ACM Trans. Math. Soft. 1986, 12 (4), 362. Hartig, F.; Keil, F. J. A Modified Algorithm of Iterative Dynamic Programming. Hung. J. Ind. Chem. 1993a, 21, 101. Hartig, F.; Keil, F. J. Large-scale Spherical Fixed Bed Reactors: Modeling and Optimization. Ind. Eng. Chem. Res. 1993b, 32, 424. Hartig, F.; Keil, F. J. Global Optimization of Quench Reactors by Iterative Dynamic Programming. Hung. J. Ind. Chem. 1994, 22, 233. Hartig, F.; Keil, F. J.; Kafarov, V. V. Optimization of Complex Reactions by Mixed-Integer Iterative Dynamic Programming. Theoretical Foundations Chem. Eng. 1996, 30 (1), 50. Homma, T.; Saltelli, A. Use of Sobol’s Quasirandom Sequence Generator for Integration of Modified Uncertainty Importance Measure. J. Nucl. Sci. Technol. 1995, 32 (11), 1164. Hwang, C.; Lin, J. S. An Improved Computational Scheme for Solving Dynamic Optimization Problem with Iterative Dynamic Programming. Submitted to Chin. Inst. Chem. Eng. 1998. Jensen, T. Dynamic Control of Large Dimension Nonlinear Chemical Processes. Ph.D. Dissertation, University of Princeton, 1964. Keil, F. J.; Stoyanov, S.; Chunova, E. Optimization of Ferrite Production by Fuzzy Iterative Dynamic Programming. Hung. J. Ind. Chem. 1996, 24, 309. Lapidus, L.; Luus, R. Optimal Control of Engineering Process; Blaisdell: Waltham, MA, 1967. Lin, J. S.; Hwang, C. Optimal Control of Time-Delay Systems by Forward Iterative Dynamic Programming. Ind. Eng. Chem. Res. 1996a, 35 (8), 2795. Lin, J. S.; Hwang, C. A Forward Iterative Dynamic Programming Technique for Optimal Control of Nonlinear Dynamic Systems. J. Chinese Inst. Chem. Engrs. 1996b, 27, 477.
Luus, R. Optimal Control by Dynamic Programming using Accessible Grid Points and Region Reduction. Hung. J. Ind. Chem. 1989, 17, 523. Luus, R. Optimal Control by Dynamic Programming using Systematic Reduction in Grid Size. Int. J. Control 1990a, 51, 995. Luus, R. Application of Dynamic Programming to High-Dimensional Nonlinear Optimal Control Problems. Int. J. Control 1990b, 52, 239. Luus, R. Application of Dynamic Programming to Singular Optimal Control Problems. Proc. Am. Control Conf. 1990c, 2932. Luus, R. Effect of the Choice of Final time in Optimal Control of Nonlinear Systems. Can. J. Chem. Eng. 1991, 69, 144. Luus, R. Piecewise Linear Continuous Optimal Control by Iterative Dynamic Programming. Ind. Eng. Chem. Res. 1993a, 32, 859. Luus, R. Application of Dynamic Programming to DifferentialAlgebraic Process Systems. Comput. Chem. Eng. 1993b, 17 (4), 373. Luus, R. Optimization of Fed-Batch Fermentors by Iterative Dynamic Programming. Biotechnol. Bioeng. 1993c, 41, 599. Luus, R. Application of Iterative Dynamic Programming to very High-Dimensional Systems. Hung. J. Ind. Chem. 1993d, 21, 243. Luus, R.; Bojkov, B. Global Optimization of the Bifunctional Catalyst Problem. Can. J. Chem. Eng. 1994, 72, 160. Luus, R.; Galli, M. Multiplicity of Solutions in using Dynamic Programming for Optimal Control. Hung. J. Ind. Chem. 1991, 19, 55. Luus, R.; Rosen, O. Application of Dynamic Programming to Final State Constrained Optimal Control Problems. Ind. Eng. Chem. Res. 1991, 30, 1525. Luus, R.; Smith, S. G. Application of Dynamic Programming to High-Dimensional Systems Described by Difference Equations. Chem. Eng. Technol. 1991, 14, 122. Luus, R.; Dittrich, J.; Keil, F. J. Multiplicity of Solutions in the Optimization of a Bifunctional Catalyst Blend in a Tubular Reactor. Can. J. Chem. Eng. 1992, 70, 780. Luus, R.; Zhang, X.; Hartig, F.; Keil, F. J. Use of Piecewise Linear Continuous Optimal Control for Time-Delay Systems. Ind. Eng. Chem. Res. 1995, 34, 4136. Mekarapiruk, W.; Luus, R. Optimal Control of Inequality State Constrained Systems. Ind. Eng. Chem. Res. 1997, 36, 1686. Nagurka, M.; Wang, S.; Yen, V. Solving Linear Quadratic Optimal Control Problems by Chebyshev-Based State parametrization. Proc. Am. Control Conf. 1991, 104. Rao, S. N.; Luus, R. Evaluation and Improvement of Control Vector Iteration Procedures for Optimal Control. Can. J. Chem. Eng. 1972, 50, 777. Sobol, I. M. On the Systematic Search in a Hypercube. SIAM J. Numer. Anal. 1979, 16 (5), 790.
Received for review September 8, 1997 Revised manuscript received March 23, 1998 Accepted March 24, 1998 IE970629J