Optimal Control of Time-Delay Systems by Forward Iterative Dynamic Programming

Jeng-Shiaw Lin
Department of Chemical Engineering, National Cheng Kung University, Tainan 700, Taiwan

Chyi Hwang*
Department of Chemical Engineering, National Chung Cheng University, Chia-Yi 621, Taiwan

* To whom all correspondence should be addressed.
In this paper we apply the forward iterative dynamic programming (IDP) technique to obtain optimal control policies for time-delay systems. The benefits of fast convergence and accurate solutions offered by forward IDP are demonstrated by several numerical examples.

Introduction

In a time-delay system, the evolution of the current states depends on past values of the states and/or controls. In general, a time-delay system arises as a result of inherent delays in the transmission of information or material between different parts of the system and/or as a deliberate introduction of time delay into the system for control purposes. Since time delays occur in many systems, control engineers are frequently confronted with optimization problems in which the system dynamics are described by nonlinear delay-differential equations.

The problem of computing optimal control policies for nonlinear time-delay systems has received constant attention. The solution method (Pontryagin et al., 1962) based on Pontryagin's maximum principle gives rise to an extremely difficult problem in which the differential equations involve time-delayed state variables and time-advanced adjoint variables. Instead, a control parametrization method (Chan and Perkins, 1973; Chyung and Lee, 1966) is often used. In this method, the control is parametrized such that the optimal control problem becomes an optimal parameter selection problem, which can then be solved by a nonlinear programming method. The major difficulty associated with the combination of control parametrization and nonlinear programming is the existence of local optima. To obtain a reliable global solution to the optimal control problem of a time-delay system, an iterative dynamic programming (IDP) algorithm has recently been used (Dadebo and Luus, 1992; Dadebo and McAuley, 1995; Jiménez-Romero and Luus, 1991). Basically, the IDP algorithm also uses the control parametrization technique. However, it converts an optimal control problem into a stagewise optimization problem by dividing the continuous time interval of interest into discrete time stages and gridding the state and control variables. The optimal control sequence is found by performing state integrations from each of the state grid points with all allowable controls. The accuracy and global optimality of the solution are ensured by performing the dynamic programming and contracting the allowable control region iteratively.

The IDP algorithm (Dadebo and Luus, 1992; Dadebo and McAuley, 1995; Jiménez-Romero and Luus, 1991) searches for the optimal control policy at each iteration by backward dynamic programming, which starts the
stagewise search from the last time stage. Hence, except for the case of using a single state grid for each time stage, the initial profiles of the delayed variables are available only for the first time stage. If multiple state grids are selected, approximations for the missing initial profiles of the delayed variables have to be made. The use of approximate initial profiles for the delayed variables, without considering the corresponding control in the previous time stage, may lead to slow convergence or, even worse, to a locally optimal solution. Moreover, it is extremely difficult to apply the IDP algorithm to optimal control problems which involve input delays.

These drawbacks of the general IDP can be partially overcome by using the multipass IDP with a single state grid (Luus and Bojkov, 1994) or the semiexhaustive search scheme (Gupta, 1995). In the multipass IDP with a single state grid, the missing state values are obtained from the state trajectory corresponding to the optimal control profile found in the previous pass. Since the state profile used is independent of the control in the previous time stage, the multipass IDP with a single state grid is not very effective when the system is highly nonlinear. The semiexhaustive search scheme performs the stagewise optimization in the forward direction and avoids the explosion in the number of state grids by eliminating those state grids that have a poorer performance. The strategy of reducing the number of state grids in later time stages restricts the scheme to the case where the performance index is a strictly monotonic function of time.

In this paper, we present the use of a forward IDP (Lin and Hwang, 1995) for the optimal control of nonlinear dynamic systems with delays in both the state and control variables. In general, the forward IDP (FIDP) requires less computation time and gives more accurate solutions than the IDP. Moreover, because an FIDP performs the stagewise optimization in the forward direction, the initial profiles of the delayed states and inputs for each state grid are always available by storing the computed results of the previous time stage. Hence, it is not necessary to make any approximation to the initial profiles of the delayed variables, and a more accurate solution than that of an IDP algorithm can be obtained. To illustrate the effectiveness of the forward IDP in obtaining globally optimal control policies for time-delay systems, three examples are examined.
Problem Formulation

Consider a time-delay system described by the delay-differential equations
dx/dt = f(x(t), x(t−τx), u(t), u(t−τu), t)    (1)
where x and u are the n-state and m-input vectors, respectively. Suppose that x(t) = z(t) for −τx ≤ t ≤ 0 and u(t) = w(t) for −τu ≤ t < 0. The optimal control problem is to find the control vector u(t) that is bounded within the interval u(t) ∈ [u, ū], i.e., ui ≤ ui(t) ≤ ūi for i = 1, 2, ..., m, and minimizes the performance index
J(x(0), tf) = Φ(x(tf), tf) + ∫₀^tf φ(x, u, t) dt    (2)
where tf is the specified final time. To solve this optimal control problem by an IDP algorithm, we partition the time interval [0, tf] into Nt equidistant time stages and seek a piecewise constant control policy
u(t) = u(k),   t_{k−1} ≤ t < t_k    (3)
where t_k = k·tf/Nt, k = 0, 1, ..., Nt. With this approximation of the control, the problem then is to find u(k), k = 1, 2, ..., Nt, for the augmented system
dx̂/dt ≡ d/dt [x, x_{n+1}]^T = [f(x(t), x(t−τx), u(t), u(t−τu), t), φ(x(t), u(t), t)]^T,   x̂(0) = [x0, 0]^T    (4a)

x(t) = z(t),   −τx ≤ t < 0    (4b)

u(t) = w(t), −τu ≤ t ≤ 0;   u(t) = u(k), t_{k−1} < t ≤ t_k    (4c)

to minimize the performance index

J = Φ(x(tf), tf) + x_{n+1}(tf)    (5)
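To make the augmented formulation concrete, the following is a minimal Python sketch, not taken from the paper, of how eqs 4 and 5 can be evaluated for a given piecewise-constant control: the running cost φ is accumulated as the extra state x_{n+1}, and the delayed state and control values are read from stored histories. The function names, the simple Euler scheme, and the assumption that the delays are integer multiples of the step h are illustrative choices of this sketch only.

```python
import numpy as np

def integrate_augmented(f, phi, z, w, u_seq, tf, tau_x, tau_u, h=0.01):
    """Explicit-Euler integration of the augmented system (4).

    f(x, x_del, u, u_del, t): right-hand side of eq 1;  phi(x, u, t): cost rate of eq 2.
    z(t): initial state profile on [-tau_x, 0];  w(t): initial control profile on [-tau_u, 0).
    u_seq: Nt constant control vectors defining the piecewise-constant policy of eq 3.
    Returns x(tf) and the accumulated running cost x_{n+1}(tf).
    """
    N = int(round(tf / h))                  # Euler steps on [0, tf]
    Nt = len(u_seq)
    steps_per_stage = max(N // Nt, 1)
    kx = int(round(tau_x / h))              # state delay, in steps
    ku = int(round(tau_u / h))              # control delay, in steps

    # x[j] holds x((j - kx)*h): indices 0..kx cover the initial profile z on [-tau_x, 0]
    x = [np.atleast_1d(np.asarray(z((j - kx) * h), dtype=float)) for j in range(kx + 1)]
    # u[j] holds the control applied on [(j - ku)*h, (j - ku + 1)*h): the first ku
    # entries come from the given initial control profile w on [-tau_u, 0)
    u = [np.atleast_1d(np.asarray(w((j - ku) * h), dtype=float)) for j in range(ku)]

    cost = 0.0                              # this is the appended state x_{n+1} of eq 4a
    for i in range(N):
        t = i * h
        u.append(np.atleast_1d(np.asarray(u_seq[min(i // steps_per_stage, Nt - 1)], dtype=float)))
        x_now, x_del = x[kx + i], x[i]      # x(t) and x(t - tau_x)
        u_now, u_del = u[ku + i], u[i]      # u(t) and u(t - tau_u)
        cost += h * phi(x_now, u_now, t)
        x.append(x_now + h * np.asarray(f(x_now, x_del, u_now, u_del, t), dtype=float))
    return x[-1], cost
```

For example 1 below, f = lambda x, xd, u, ud, t: -xd + u and phi = lambda x, u, t: 0.5 * u[0]**2 with z(t) = 1 and zero control delay reproduce the stated problem; the paper itself uses a locally fifth-order interpolant Runge-Kutta scheme rather than Euler.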
Forward Iterative Dynamic Programming

A detailed description of the concept and algorithm of forward dynamic programming (FDP) for the optimal control of continuous-time systems has been presented in our previous paper (Lin and Hwang, 1995). Here, we first give a modified FDP algorithm for time-delay systems. Then we present an FIDP algorithm, which employs the FDP algorithm along with the systematic region contraction strategy, to solve optimal control problems of time-delay systems.

To begin, we explain some notation used in the FDP algorithm in order to facilitate an understanding of the algorithm. In the FDP, it is required to specify the number of time stages, Nt, the number of state grids, Nx, the number of allowable controls, Nc, the state grids x(k,l), l = 1, 2, ..., Nx, k = 1, 2, ..., Nt, and the allowable controls u(k,l), l = 1, 2, ..., Nc, k = 1, 2, ..., Nt. At stage k, the set of time-delay state equations in (4) is integrated with an interpolant Runge-Kutta scheme (Enright et al., 1986) from t = t_{k−1} to t = t_k, with the initial condition x̂(k − 1) and an allowable control u(t) = u(k,j). It is noted that x̂(0) is the specified initial condition. The computed state x̂(t_k) is stored in x̂(k), which will be used as the initial condition of the time-stage integration of stage k + 1. Each state grid x(k,l) is associated with a logical variable L(k,l), which is set to 1 if the allowable control associated with the state grid x(k,l) has been determined. If this is the case, an integer variable I_{k,l} is used to record the index of the optimal allowable control. Also, integer variables Ic(k) and Ix(k) are used to memorize the indices of the allowable control being used and of the state grid being connected, and the variable J(k) is used to record the smallest value of the performance index evaluated. Each time a smaller value of the performance index is calculated, the indices of the current control sequence are stored in Ic*(k). Initially, Ic(k) = 1 for k = 1, 2, ..., Nt, and Ix(1) = 1.

Having introduced the notation used in the FDP algorithm, we list below the complete FDP procedure for determining the optimal control sequence {uc(1), uc(2), ..., uc(Nt)}, where uc(k) ∈ {u(k,1), u(k,2), ..., u(k,Nc)}, k = 1, 2, ..., Nt, which gives the smallest value of the performance index J* for the time-delay nonlinear dynamic system (1).

FDP Algorithm
(1) Initialization: (a) Ic(k) = 1, k = 1, 2, ..., Nt; (b) Ix(1) = 1; (c) L(k,l) = 0, l = 1, 2, ..., Nx, k = 2, 3, ..., Nt.
(2) Store [x0, 0]^T to x̂(0).
(3) Save the initial state profile z(t), t ∈ [−τx, 0], in xd(1,t) and set the initial control profile ud(1,t) = w(t), t ∈ [−τu, 0].
(4) Set k = 1.
(5) Integrate the augmented system equations in (4), with the control u(t) = u(k, Ic(k)) and the initial condition x̂(t_{k−1}) = x̂(k − 1), from t = t_{k−1} to t_k to obtain x̂(t_k) ≡ [x(t_k), x_{n+1}(t_k)]^T. Note that in the integration of the equations in (4), let x(t − τx) = xd(k, t − τx) for t ∈ [t_{k−1}, t_{k−1} + τx], u(t − τu) = ud(k, t − τu) for t ∈ [t_{k−1}, t_{k−1} + τu], and u(t − τu) = u(k, Ic(k)) for t > t_{k−1} + τu. Store the computed state profile x(t), t_k − τx ≤ t ≤ t_k, in xd(k+1, t) and the control profile ud(k+1, t) = u(k, Ic(k)).
(6) If k < Nt, then (a) let Ix(k+1) = arg min_{1≤l≤Nx} |x(t_k) − x(k+1,l)|; (b) store x̂(t_k) to x̂(k); (c) advance one time stage by setting k = k + 1; (d) if L(k, Ix(k)) = 1, set Ic(k) = I_{k,Ix(k)}; (e) go to step 5.
(7) If k = Nt, then (a) evaluate the performance index J = Φ(x(t_k), t_k) + x_{n+1}(t_k); (b) if J < J*, replace J* by J and record the indices of the optimal control sequence Ic*(k) = Ic(k), k = 1, 2, ..., Nt; (c) if L(k, Ix(k)) = 0, go to step 7d; else reset Ic(k) = 1 and go to step 7g; (d) if Ic(k) = 1 or J < J(k), then set J(k) = J and I_{k,Ix(k)} = Ic(k); (e) set the index of the next allowable control, Ic(k) = Ic(k) + 1; (f) if Ic(k) ≤ Nc, go to step 5; else reset Ic(k) = 1 and set J = J(k); (g) back up one time stage by setting k = k − 1; (h) if k ≥ 1, go to step 7c.
(8) End of FDP: (a) the optimal control sequence is given by uc(k) = u(k, Ic*(k)), k = 1, 2, ..., Nt; (b) the smallest value of the performance index is stored in J*.
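The forward sweep of steps 4-6 can be summarized in the following illustrative sketch (ours, not the authors' code). It assumes a user-supplied routine integrate_stage(k, x_aug, j) that integrates the augmented equations (4) over stage k with allowable-control index j, carrying the delayed state and control profiles xd and ud internally as in step 5; the backtracking enumeration of step 7 is omitted.

```python
import numpy as np

def forward_sweep(integrate_stage, x0_aug, grids, Ic, L, I_opt):
    """One forward sweep of FDP steps 4-6 (simplified sketch, 0-based stage index).

    integrate_stage(k, x_aug, j): augmented state at t_{k+1} when stage k is
        integrated from x_aug with allowable-control index j.
    grids[k]: state grid points at time t_k (used for k = 1, ..., Nt-1; the grids
        hold only the physical state, not the appended cost component).
    Ic[k]: current control index per stage;  L[k][l], I_opt[k][l]: the logical flag
        and recorded optimal control index of grid point l at stage k.
    Returns the augmented state at t_f; step 7 evaluates J = Phi + x_{n+1} from it.
    """
    Nt = len(Ic)
    x_aug = np.asarray(x0_aug, dtype=float)
    for k in range(Nt):
        x_aug = integrate_stage(k, x_aug, Ic[k])                  # step 5
        if k < Nt - 1:
            # step 6a: connect the reached state to the nearest grid point of stage k+1
            d = [np.linalg.norm(x_aug[:-1] - np.asarray(g)) for g in grids[k + 1]]
            ix_next = int(np.argmin(d))                           # Ix(k+1)
            # step 6d: if that grid point has already been closed, reuse its recorded
            # optimal control index instead of searching it again
            if L[k + 1][ix_next]:
                Ic[k + 1] = I_opt[k + 1][ix_next]
    return x_aug
```

In the full algorithm this sweep is embedded in the backtracking loop of step 7, which enumerates the allowable controls of the later stages and records J(k), I_{k,l}, and Ic*(k).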
Now, we combine the above FDP algorithm with the systematic region contraction strategy (Luus, 1989, 1990a-c) to obtain the optimal piecewise constant control profile for the time-delay nonlinear dynamic system (1). The complete forward iterative dynamic programming (FIDP) algorithm is as follows:

FIDP Algorithm
(1) Partition the time interval [0, tf] into Nt equal-length time stages.
(2) Choose the number of state grids, Nx, and the number of allowable values of control, Nc.
(3) Choose an initial region-size vector r = [r1, r2, ..., rm]^T and a contraction factor η.
(4) Choose an initial center control sequence uc(k), k = 1, 2, ..., Nt, and evaluate the corresponding performance index J*.
(5) Choose the maximum number of iterations, Ni, and set the iteration index i = 1.
(6) Choose Nx control sequences {ux(1,l), ux(2,l), ..., ux(Nt,l)}, l = 1, 2, ..., Nx, within the allowable control region [uc(k) − r, uc(k) + r], and then integrate the dynamic equations in (4) with these control sequences to generate the state grids x(k,l), l = 1, 2, ..., Nx, k = 2, 3, ..., Nt.
(7) Choose allowable controls u(k,l), l = 1, 2, ..., Nc, k = 1, 2, ..., Nt, within the allowable control region [uc(k) − r, uc(k) + r].
(8) Use the FDP algorithm described above to obtain the optimal control sequence uc(k), k = 1, 2, ..., Nt, and the corresponding performance index J*.
(9) Reduce the region of allowable control by the factor η, i.e., r(i+1) = ηr(i).
(10) Increase the iteration index: i = i + 1.
(11) If i ≤ Ni, go to step 6.
(12) Stop.

Before ending this section, it should be mentioned that the choice of the allowable values of control for each time stage plays a crucial role in both the IDP and the FIDP algorithms. For a given center control sequence uc(k), k = 1, 2, ..., Nt, and a given region-size vector r = [r1, r2, ..., rm]^T, the allowable controls are generated such that u(k,l) ∈ [uc(k) − r, uc(k) + r]. In general, these u(k,l) can be generated by a uniform-distribution approach or by a random-distribution approach. The uniform-distribution approach is adopted when the number of control components, m, is small. Let M equally spaced values inside the region [uc(k) − r, uc(k) + r] be taken for each control component; then there are Nc = M^m allowable control values for each state grid at stage k. For the approach of distributing Nc allowable control values randomly in the region [uc(k) − r, uc(k) + r], it is required to generate a sequence of mNc(Nt − 1) random data, say dj, j = 1, 2, ..., mNcNt − 1, which are uniformly distributed in [0, 1]. Then each component of the allowable control u(k,l) is given by

ui(k,l) = uc,i(k) − ri + 2·ri·dj,   j = (k − 1)mNc + (l − 1)m + (i − 1)

where i = 1, 2, ..., m; l = 1, 2, ..., Nc; and k = 2, 3, ..., Nt. This approach is often adopted for the case of large m in order to avoid an explosion in the number of allowable control values. It should be noted that the allowable control values generated by the above two approaches may violate the specified bounds. If one or more components of a generated allowable control u(k,l) do not lie in the feasible control region [u, ū], a simple clipping technique (Luus, 1989, 1990a-c) or a modified clipping technique (Hartig and Keil, 1993) can be used to shift the infeasible components into the feasible region. Also note that, in order to ensure nonincreasing convergence of the performance index in executing the FIDP algorithm, the optimal control obtained in the current iteration should be used as the center control of the next iteration, and the center control uc(k) should be taken as one of the allowable controls for stage k.
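The two ways of generating allowable control values and the clipping step can be illustrated as follows (our sketch; the random variant simply draws fresh uniform numbers rather than indexing a pre-generated sequence dj, and the function and variable names are not from the paper).

```python
import numpy as np

def uniform_controls(uc_k, r, M):
    """Uniform scheme: M equally spaced values per control component inside
    [uc(k) - r, uc(k) + r], giving Nc = M**m allowable controls for stage k."""
    uc_k, r = np.asarray(uc_k, dtype=float), np.asarray(r, dtype=float)
    axes = [np.linspace(uc_k[i] - r[i], uc_k[i] + r[i], M) for i in range(len(uc_k))]
    mesh = np.meshgrid(*axes, indexing="ij")
    return np.stack([a.ravel() for a in mesh], axis=1)        # shape (M**m, m)

def random_controls(uc_k, r, Nc, rng=None):
    """Random scheme: Nc controls with u_i = uc_i - r_i + 2*r_i*d, d ~ U[0, 1],
    as in the formula above."""
    rng = rng or np.random.default_rng()
    uc_k, r = np.asarray(uc_k, dtype=float), np.asarray(r, dtype=float)
    return uc_k - r + 2.0 * r * rng.random((Nc, len(uc_k)))

def clip_to_bounds(candidates, u_lo, u_hi):
    """Simple clipping: move any component outside the feasible region [u_lo, u_hi]
    to the nearest bound."""
    return np.clip(candidates, u_lo, u_hi)
```

In practice the center control uc(k) itself is kept as one of the allowable values, which is what guarantees the nonincreasing behavior of the performance index from one FIDP iteration to the next.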
Numerical Examples and Comparisons

In this section, three examples from the literature (Chyung and Lee, 1966; Dadebo and Luus, 1992; Dadebo and McAuley, 1995; Oh and Luus, 1976; Tsay et al., 1988) are solved by the FIDP algorithm in order to show the superior computational efficiency and solution accuracy of the FIDP over the IDP. The FIDP algorithm was coded with the Microsoft FORTRAN V5.0 compiler in double precision. In order to compare the computed results with those obtained by the IDP algorithm, the same IDP parameters were used. Since the major computational burden of an IDP algorithm lies in the state integrations, the computational efficiency of the FIDP algorithm is defined as the ratio of the number of time-stage integrations performed to that required by the IDP algorithm, which is a fixed number for given Nx and Nc. In the actual computations, the state equations were integrated by the interpolant, locally fifth-order Runge-Kutta method (Enright et al., 1986) with a fixed step length h = 0.01. For constructing the state grids for each time stage, the integral state gridding strategy (Luus, 1990a,b; de Tremblay and Luus, 1989) was used. Because the number of control components in the presented examples is small, the uniform distribution of allowable control values was adopted. Also, if any allowable control value violates the specified bounds, it is simply clipped to the corresponding boundary of the feasible control region.

Example 1. Consider the linear time-delay system
ẋ(t) = −x(t − 1) + u(t),   x(t) = 1.0 for −1 ≤ t ≤ 0
The problem is to seek an optimal control policy u(t) ≥ 0 which drives the system state x(t) from x(0) = 1 to x(2) = 0 while minimizing the performance index
I = (1/2) ∫₀² u²(t) dt
Chyung and Lee (1966) obtained the exact solution, with a performance index of 0.09375, for this minimum-energy problem. Dadebo and McAuley (1995) applied the IDP algorithm to find the optimal piecewise constant control profile for this problem. To treat the terminal state constraint, they constructed the penalized performance indices
J1 = (1/2) ∫₀² u²(t) dt + θ[x(2)]²

J2 = (1/2) ∫₀² u²(t) dt + ω|x(2)|
Moreover, the state trajectories obtained when constructing the state grids by forward integration of the system equations were used as the initial profiles for the delayed states, and a quadratic approximation of the delayed state profile was used in the numerical integration of the delay-differential equation.
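To show how the terminal constraint is absorbed into the objective, the following short sketch (ours, not the authors' code) evaluates the penalized index J2 for a candidate piecewise-constant policy on example 1, with the delay handled by a simple index offset in an Euler scheme.

```python
def J2_example1(u_seq, omega=0.5, tf=2.0, tau=1.0, h=0.01):
    """J2 = 0.5 * integral of u^2 over [0, 2] + omega*|x(2)| for x'(t) = -x(t-1) + u(t),
    with x(t) = 1 on [-1, 0] and a piecewise-constant control u_seq of Nt stages."""
    N = int(round(tf / h))
    lag = int(round(tau / h))
    Nt = len(u_seq)
    steps_per_stage = max(N // Nt, 1)
    x = [1.0] * (lag + 1)                        # history on [-tau, 0], including x(0) = 1
    cost = 0.0
    for i in range(N):
        u = u_seq[min(i // steps_per_stage, Nt - 1)]
        cost += 0.5 * u * u * h
        x.append(x[lag + i] + h * (-x[i] + u))   # delayed term x(t - tau) is just x[i]
    return cost + omega * abs(x[-1])

# With u identically zero the state drifts to x(2) close to -0.5, so J2 is about 0.25;
# the FIDP search then trades control energy against the |x(2)| penalty.
print(J2_example1([0.0] * 8))
```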
Table 1. Comparisons of Performance J2, Final State x(1), and Computational Efficiency E for Example 1^a

                                   FIDP,              FIDP, single-pass,   FIDP, two-pass,    IDP, two-pass,
                                   Nt = 4             Nt = 8               Nt = 8             Nt = 8
Nc = 3, Nx = 17, ω = 10    x(1)    -6.3072 × 10^-6    1.8559 × 10^-6       -4.2751 × 10^-8    -1.2 × 10^-4
                           J2      0.09589            0.09678              0.09528            0.10611
                           E       23.58%             31.10%               30.89%
Nc = 5, Nx = 21, ω = 0.5   x(1)    -4.2144 × 10^-7    -2.3986 × 10^-8      -1.3205 × 10^-7    -3.2774 × 10^-4
                           J2      0.09527            0.09412              0.09412            0.094963
                           E       41.23%             53.74%               53.41%
Nc = 5, Nx = 41, ω = 0.5   x(1)    -7.7991 × 10^-7    -4.6285 × 10^-8      -1.6478 × 10^-9    -7.464 × 10^-3
                           J2      0.09524            0.09412              0.09412            0.093538
                           E       22.80%             32.35%               31.34%
Nc = 5, Nx = 1, ω = 0.5    x(1)                                                               1.123 × 10^-7
                           J2                                                                 0.095476

^a E = ratio of the number of time-stage integrations performed in FIDP to that required by the IDP.
Table 2. Comparison of the Values of the Performance Index Obtained by FIDP and IDP Algorithms for Example 2^a

τ      method    Nx    Nt = 10              Nt = 20              Nt = 40              CVI
0.05   s-FIDP    21    0.02362 (134320)     0.02327 (592696)     0.02347 (2496888)    0.02297
       m-FIDP    21                         0.02307 (599848)     0.02291 (2527296)
       m-IDP     21    0.02363 (171900)     0.02311 (721800)     0.02294 (2955600)
0.10   s-FIDP    21    0.02377 (133384)     0.02350 (591960)     0.02354 (2481200)    0.02328
       m-FIDP    21                         0.02340 (600240)     0.02322 (2526224)
       m-IDP     21    0.02392 (171900)     0.02340 (721800)     0.02324 (2955600)
0.20   s-FIDP    21    0.02432 (132392)     0.02422 (581112)     0.02416 (2450416)    0.02372
       m-FIDP    21                         0.02392 (596344)     0.02379 (2524168)
       m-IDP     21    0.02435 (171900)     0.02394 (721800)     0.02386 (2955600)
0.40   s-FIDP    21    0.02518 (131312)     0.02503 (576576)     0.02505 (2416720)    0.02476
       m-FIDP    21                         0.02476 (596528)     0.02467 (2520832)
       m-IDP     27    0.02643 (220500)     0.02536 (927000)     0.02495 (3798000)

^a s-FIDP: single-pass FIDP. m-FIDP: multipass FIDP. m-IDP: multipass IDP. CVI: control vector iteration method. A parenthesized number represents the number of time-stage integrations performed.
By using the same IDP parameters as those used by Dadebo and McAuley (1995), we solved this problem with the FIDP algorithm to minimize the penalized performance index J2. We also solved the problem using a 10-pass IDP with a single state grid, i.e., Nx = 1. In each pass of the single-state-grid IDP computation, 20 iterations were performed and the control region was halved. Table 1 compares the results obtained by the single-pass and two-pass FIDP, the two-pass IDP (Dadebo and McAuley, 1995), and the 10-pass IDP with a single state grid (Luus and Bojkov, 1994). The computational efficiencies shown in this table indicate that the FIDP algorithm is indeed more efficient and gives more accurate results than the IDP algorithm.

Example 2. Consider two CSTRs cascaded in series in which a first-order irreversible chemical reaction occurs. The reactor dynamics are described by the following four delay-differential equations:
ẋ1(t) = 0.5 − x1(t) − R1,   x1(0) = 0.15
ẋ2(t) = −2(x2(t) + 0.25) − u1(t)(x2(t) + 0.25) + R1,   x2(0) = −0.03
ẋ3(t) = x1(t − τ) − x3(t) − R2 + 0.25,   x3(0) = 0.1
ẋ4(t) = x2(t − τ) − 2x4(t) − u2(t)(x4(t) + 0.25) + R2 − 0.25,   x4(0) = 0

where the reaction rates in tanks 1 and 2 are respectively given by

R1 = (x1(t) + 0.5) exp[25x2(t)/(x2(t) + 2)]
R2 = (x3(t) + 0.5) exp[25x4(t)/(x4(t) + 2)]
In the above equations, x1 and x3 are the normalized concentration variables in tanks 1 and 2, respectively, and x2 and x4 are the normalized temperature variables in tanks 1 and 2, respectively. The problem is to find the optimal controls u1(t) and u2(t) such that the performance index

J = ∫₀² [x1² + x2² + x3² + x4² + 0.1(u1² + u2²)] dt
is minimized. Recently, Dadebo and Luus (1992) applied the IDP algorithm to solve the above problem for the initial state profiles z1(t) = 0.15 and z2(t) = −0.03 and for various delay times. In the first pass of the IDP computation, the following IDP parameters were used: Nt = 10, Nc = 32, Ni = 20, r1 = 0.3, r2 = 0.2, and uc(k) = [−0.05, 0.15]^T for k = 1, 2, ..., Nt. In the second and third passes, the optimal control profiles obtained in the previous IDP pass were refined by doubling the number of time stages. Here, we solved the problem again using the FIDP algorithm. Table 2 compares the results computed by the multipass IDP (Dadebo and Luus, 1992), the single-pass FIDP, the multipass FIDP, and the control vector iteration (CVI) method (Oh and Luus, 1976). It is verified once again that the FIDP algorithm gives better results while requiring less computation time than the IDP algorithm.
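For reference, the right-hand side of the two-CSTR model, as reconstructed above (with R2 depending on the tank-2 variables x3 and x4), can be written as the following illustrative Python function; in the augmented formulation of eq 4 the integrand x1² + x2² + x3² + x4² + 0.1(u1² + u2²) would simply be appended as a fifth state.

```python
import numpy as np

def cstr_rhs(x, x_del, u):
    """Delayed two-CSTR model of example 2 (as written above; deviation variables).

    x     = [x1, x2, x3, x4]  current concentrations/temperatures
    x_del = state vector at time t - tau (only the delayed x1 and x2 are used)
    u     = [u1, u2]          controls for tanks 1 and 2
    """
    x1, x2, x3, x4 = x
    u1, u2 = u
    R1 = (x1 + 0.5) * np.exp(25.0 * x2 / (x2 + 2.0))   # reaction rate in tank 1
    R2 = (x3 + 0.5) * np.exp(25.0 * x4 / (x4 + 2.0))   # reaction rate in tank 2
    return np.array([
        0.5 - x1 - R1,
        -2.0 * (x2 + 0.25) - u1 * (x2 + 0.25) + R1,
        x_del[0] - x3 - R2 + 0.25,
        x_del[1] - 2.0 * x4 - u2 * (x4 + 0.25) + R2 - 0.25,
    ])
```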
Figure 1. Computational results for example 2: (a) profile of control u1(t); (b) profile of control u2(t); (c) trajectory of state x1(t); (d) trajectory of state x2(t); (e) trajectory of state x3(t); (f) trajectory of state x4(t); (g) convergence of the performance index.
In order to see whether the multipass IDP with a single state grid (Luus and Bojkov, 1994) can be used to find the true optimal control for this highly nonlinear system, we solved the problem again with Nx = 1 while keeping the other IDP parameters the same. The performance index computed using 10-pass IDP computations with 20 iterations per pass converges to 0.027662 for Nt = 10 and 0.027674 for Nt = 20. Clearly, the IDP algorithm with a single state grid fails to give the true optimal solution for this nonlinear system. Computational results for example 2 are given in Figure 1.

Example 3. To demonstrate the ability of the FIDP algorithm to solve optimal control problems for systems with both state and control delays, we consider the second-order system
ẋ1(t) = x2(t) + x1(t − 1)
ẋ2(t) = t x1(t) + 2x1(t − 1) + x2(t − 1) + u(t) − u(t − 0.5)

with

x1(t) = x2(t) = 1   for −1 ≤ t ≤ 0
u(t) = 5(t − 1)   for −0.5 ≤ t ≤ 0
The problem is to find the optimal control u(t) such that the performance index
J = x1²(3) + x2²(3) + (1/2) ∫₀³ [2x1²(t) + 2x1(t)x2(t) + x2²(t) + u²(t)/(t + 2)] dt
is minimized. Tsay et al. (1988) applied the general orthogonal polynomials approach to this problem. Since the system involves a time delay in the input, it cannot be solved by an IDP algorithm with single or multiple state grids. Here, we used the FIDP algorithm with the parameters Nt = 10, Nc = 3, Nx = 13, η = 0.8, uc(k) = −10, k = 1, 2, ..., Nt, r = 10, and h = 0.005 to solve the problem. After Ni = 20 iterations, the value of the performance index converges to J = 155.5382. To obtain a more accurate solution, the control profile obtained by the FIDP algorithm was further refined by a gradient-based method; the refined performance index is J = 155.5178. In Figure 2, we compare the state trajectories and control profiles obtained with the FIDP and gradient-based algorithms.

Figure 2. Computational results for example 3: (a) profile of control u(t); (b) trajectory of state x1(t); (c) trajectory of state x2(t); (d) convergence of the performance index.
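Because the control enters with a delay of 0.5, the forward pass needs u(t − 0.5) at every integration step; in the forward scheme this is always available, either from the given initial profile u(t) = 5(t − 1) on [−0.5, 0] or from the constant value already fixed for an earlier stage. A small illustrative helper (our sketch, assuming tf = 3 and equal-length stages) is:

```python
def delayed_control(t, u_seq, tf=3.0, tau_u=0.5):
    """u(t - tau_u) for example 3 (illustrative sketch): for t - tau_u < 0 the given
    initial profile u(t) = 5(t - 1) is used; otherwise the constant value of the
    stage containing t - tau_u, which the forward pass has already fixed."""
    s = t - tau_u
    if s < 0.0:
        return 5.0 * (s - 1.0)                       # initial control profile w(t)
    Nt = len(u_seq)
    stage = min(int(s / (tf / Nt)), Nt - 1)          # stage index of the delayed time
    return u_seq[stage]
```

A backward dynamic-programming pass cannot do this, because when stage k is processed the control of stage k − 1 has not yet been determined; the forward scheme sidesteps the difficulty entirely.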
Conclusions

In this paper we have demonstrated that the forward IDP algorithm is particularly useful for finding optimal control policies for systems with state and control delays. Besides offering greater potential for saving computation time, the forward IDP algorithm provides the following attractive features. First, since the state integration begins from the first stage and proceeds forward, the exact initial profiles for the delayed states are available for each time-stage integration by storing the computed state trajectories. This avoids using approximate initial profiles for the delayed states and hence leads to more accurate results than the IDP algorithm. Second, the forward computation strategy of the forward IDP is essentially a sequential computation scheme which allows the control of the previous time stage to be used. As a result, optimal control problems for systems with delays in the control can be treated equally well, which is not the case for the IDP algorithm. Finally, the forward IDP algorithm does not need to store the state trajectories associated with
the constructed state grids, and hence it requires less computer memory and is easier to program than the IDP.

Acknowledgment

This work was supported by the National Science Council of the Republic of China under Grant NSC84-2214-E-194-002.

Nomenclature
E = ratio of the number of time-stage integrations performed in FIDP to that required by the IDP
f(·) = vector function of the nonlinear delay-differential equations
I_{k,l} = index of the optimal route starting from state grid x(k,l)
Ic(k) = integer variable to memorize the index of the allowable control being used
Ic*(k) = integer variable to memorize the indices of the optimal control sequence
Ix(k) = integer variable to memorize the index of the state grid being connected
J = performance index defined in (2)
J(k) = variable to record the smallest value of the performance index for the kth stage
L(k,l) = logical variable for state grid x(k,l)
Nc = number of control grid points
Ni = number of iterations
Nt = number of equidistant time stages
Nx = number of state grid points
tf = final time
u(k) = control used for stage k
u(t) = m-component control vector
u(k,l) = lth allowable control for stage k
u = lower bound of the control vector
ū = upper bound of the control vector
uc(k) = center control for stage k
ud(k,t) = initial control profile for stage k
w(t) = initial control function for −τu ≤ t ≤ 0
x(k,l) = lth state grid for stage k
x(t) = n-component state vector
x̂(t) = augmented state vector defined in (4a)
x0 = initial state vector
xd(k,t) = initial state profile for stage k
z(t) = initial state function for −τx ≤ t ≤ 0

Greek Letters
η = contraction factor
Φ(·) = scalar function defined in (2)
φ(·) = scalar function defined in (2)
τu = delay time of the control variable
τx = delay time of the state variable
θ = penalty factor for the square error
ω = penalty factor for the absolute error
Literature Cited

Chan, H. C.; Perkins, W. R. Optimization of Time-Delay Systems Using Parameter Imbedding. Automatica 1973, 9, 257.
Chyung, D. H.; Lee, E. B. Linear Optimal Systems with Time Delay. SIAM J. Control 1966, 4, 548.
Dadebo, S.; Luus, R. Optimal Control of Time-Delay Systems by Dynamic Programming. Optim. Control Appl. Methods 1992, 13, 29.
Dadebo, S. A.; McAuley, K. B. Iterative Dynamic Programming for Minimum Energy Control Problems with Time Delay. Optim. Control Appl. Methods 1995, 16, 217.
de Tremblay, M.; Luus, R. Optimization of Non-Steady-State Operation of Reactors. Can. J. Chem. Eng. 1989, 67, 494.
Enright, W. H.; Jackson, K. R.; Norsett, S. P.; Thomsen, P. G. Interpolants for Runge-Kutta Formulas. ACM Trans. Math. Software 1986, 12, 193.
Gupta, Y. H. Semiexhaustive Search for Solving Nonlinear Optimal Control Problem. Ind. Eng. Chem. Res. 1995, 34, 3878.
Hartig, F.; Keil, F. J. A Modified Algorithm of Iterative Dynamic Programming. Hung. J. Ind. Chem. 1993, 21, 101.
Jiménez-Romero, L. F.; Luus, R. Optimal Control of Time-Delay Systems. Proc. Am. Control Conf. 1991, 1818.
Lin, J. S.; Hwang, C. A Forward Iterative Dynamic Programming Technique for Optimal Control of Nonlinear Dynamical Systems. Submitted for publication in Chem. Eng. Sci. 1995.
Luus, R. Optimal Control by Dynamic Programming Using Accessible Grid Points and Region Reduction. Hung. J. Ind. Chem. 1989, 17, 523.
Luus, R. Optimal Control by Dynamic Programming Using Systematic Reduction in Grid Size. Int. J. Control 1990a, 51, 995.
Luus, R. Application of Dynamic Programming to High-Dimensional Non-linear Optimal Control Problems. Int. J. Control 1990b, 52, 239.
Luus, R. Application of Dynamic Programming to Singular Optimal Control Problems. Proc. Am. Control Conf. 1990c, 2932.
Luus, R.; Bojkov, B. Global Optimization of the Bifunctional Catalyst Problem. Can. J. Chem. Eng. 1994, 72, 160.
Oh, S. E.; Luus, R. Optimal Feedback Control of Time-Delay Systems. AIChE J. 1976, 22, 140.
Pontryagin, L. S.; Boltyanskii, V. G.; Gamkrelidze, R. V.; Mishchenko, E. F. The Mathematical Theory of Optimal Processes; Wiley: New York, 1962.
Tsay, S. C.; Wu, I. L.; Lee, T. T. Optimal Control of Linear Time-Delay Systems via General Orthogonal Polynomials. Int. J. Syst. Sci. 1988, 19, 365.
Received for review February 20, 1996
Revised manuscript received May 9, 1996
Accepted May 12, 1996X

IE960087Q
X Abstract published in Advance ACS Abstracts, July 1, 1996.