Optimal Control of Inequality State Constrained Systems

ACS2GO © 2018. ← → → ←. loading. To add this web app to the home screen open the browser option menu and tap on Add to homescreen...
2 downloads 0 Views 235KB Size
1686

Ind. Eng. Chem. Res. 1997, 36, 1686-1694

Optimal Control of Inequality State Constrained Systems Wichaya Mekarapiruk and Rein Luus* Department of Chemical Engineering, University of Toronto, Toronto, Ontario M5S 3E5

To handle inequality state constraints in nonlinear optimal control problems, we propose a method of introducing an auxiliary state variable for each constraint. The derivatives of these state constraint variables are made positive if the constraint is violated, and zero if there is no constraint violation. By incorporating these state variables then as penalty functions in an augmented performance index, we can ensure that the inequality state constraints are satisfied everywhere inside the given time interval. The procedure, as illustrated and tested with three nonlinear optimal control problems, is found to work well even in the presence of many state constraints. dx ) f(x,u) dt

1. Introduction In solving optimal control problems, one frequently encounters the requirement of keeping some state variables within specified bounds throughout the given time interval. In many cases, the strict satisfaction of such constraints is crucial since the violation may result in the failure of equipment or degradation of a product. There are a number of methods available in the literature for solving such inequality state constrained optimal control problems, but the choice becomes somewhat limited when the systems are highly nonlinear. Goh and Teo (1988) and Teo et al. (1991) used a control parametrization technique to solve nonlinear optimal control problems with several classes of constraints. Logsdon and Biegler (1989) showed that, by first converting the differential equations into a set of algebraic equations by using a collocation technique, sequential quadratic programming could be used for optimization. However, by using this method, the satisfaction of the state constraints is not guaranteed beyond the collocation points. Similarly, by introducing penalty functions into an augmented performance index, Luus (1991) and Dadebo and McAuley (1995) were able to handle inequality state constraints quite readily at the end of each time stage. It was pointed out by Keil (1994) that, although the state constraints are satisfied at the end of each time stage, constraint violation can occur inside a time stage, unless the state variables are examined at points inside each time stage. Although it could be argued that, if a large number of time stages is used, then the extent of violation is quite negligible; the recent work of Bojkov and Luus (1996) and of Luus (1996a) shows that, by using flexible stage lengths, a small number of stages is required to give excellent results. Under these conditions it is important that state violations do not occur inside a time stage. Therefore, in this paper we present a method that will ensure that the constraints are satisfied not only at the end of each time stage but also inside each time stage. 2. Optimal Control Problem Let us consider the system described by the differential equation * Author to whom correspondence should be addressed. E-mail: [email protected]. Fax: 416-978-8605. S0888-5885(96)00583-0 CCC: $14.00

(1)

where the initial state x(0) is given. The state vector x is an (n × 1) vector and u is an (m × 1) control vector bounded by

Rj e uj e βj

j ) 1, 2, ..., m

(2)

In addition, there are q inequality constraints on the state variables

Ψi(x) e 0

i ) 1, 2, ..., q

(3)

The performance index associated with this system is a scalar function of the state at final time tf

I ) Φ(x(tf))

(4)

The optimal control problem is to find the control policy u(t) in the 0 e t e tf time interval that minimizes (or maximizes) the performance index in eq 4. 3. Construction of Augmented Performance Index To deal with inequality constraints in eq 3, we shall use the penalty function approach as used by Luus (1991). Since there are q inequality constraints, we introduce q additional variables xn+1, xn+2, ..., xn+q, which we call state constraint variables, through the relationship

{

dxn+i Ψi(x) if Ψi(x) > 0 ) dt 0 if Ψi(x) e 0

i ) 1, 2, ..., q

(5)

with the initial condition

xn+i(0) ) 0

i ) 1, 2, ..., q

(6)

The final value xn+i(tf) therefore gives the total violation of the ith inequality state constraint integrated over time. For minimization of the performance index in eq 4, we choose the augmented performance index q

J)I+

θixn+i(tf) ∑ i)1

(7)

where θi > 0 are penalty function factors for inequality state constraints. The main idea is to incorporate all penalties into a performance index that is a function of final time, and the state constraint variables provide a © 1997 American Chemical Society

Ind. Eng. Chem. Res., Vol. 36, No. 5, 1997 1687

convenient means of ensuring that no state variable constraints are violated at any time. When these state constraint variables are displayed in exponential format, then even the slightest violation of the inequality constraints can be observed immediately. There is a certain amount of flexibility that one can use in choosing the augmented performance index, as is illustrated in the third example in this paper. For optimization we choose iterative dynamic programming (IDP) as developed by Luus (1989). IDP has been found to be well suited for an extractive fermentation optimization problem by Arnold et al. (1996) and for general constrained optimal control problems by Luus (1991) and by Dadebo and McAuley (1995). To obtain accurate convergence, we shall use IDP in a multipass fashion as used by Luus (1996b) for high dimensional systems. We divide the time interval 0 e t e tf into P stages of equal length and seek piecewise constant control over the P time stages so that the augmented performance index in eq 7 is minimized. If J is equal to I, then it is clear that all of the inequality constraints are satisfied everywhere. Maximization of the performance index in eq 4 creates no difficulty, because then we simply minimize the augmented performance index q

J ) -I +

θixn+i(tf) ∑ i)1

(8)

where the penalty function is chosen as before. Of interest is whether the use of the penalty function in the performance index yields the optimum value of the performance index, and the range of penalty function factors, along with other parameters such as the number of grid points, for which convergence to the optimum will result. To investigate the viability of the approach and to illustrate the procedure, we therefore choose three nonlinear optimal control problems that have been examined recently by other methods. 4. Numerical Examples All of the computations were performed in double precision on Pentium/120 and Pentium/166 personal computers using WATCOM FORTRAN compiler version 9.5. The FORTRAN subroutine DVERK of Hull et al. (1976) was used for the integration of the differential equations. The built-in compiler random number generator URAND was used for the generation of random numbers. Example 1: Plug-Flow Tubular Reactor Problem. Let us first consider the plug-flow tubular reactor problem as studied by Ko and Stevens (1971), Reddy and Husain (1981), Modak et al. (1989), and Luus (1991, 1994). The system is described by the two differential equations

dx1 ) (1 - x1)k1 - x1k2 dt

(9)

dx2 ) 300[(1 - x1)k1 - x1k2] - u(x2 - 290) (10) dt where

(

k1 ) 1.7536 × 105 exp

)

-1.1374 × 104 1.9872x2

(11)

and

k2 ) 2.4885 × 1010 exp

(

)

-2.2748 × 104 1.9872x2

(12)

with the initial condition

x(0) ) [0 380]T

(13)

Here x1 denotes the normalized concentration of the desired product, x2 is the temperature, and u is the normalized coolant flow rate. The coolant flow rate is bounded by

0 e u e 0.5

(14)

In addition, there is an upper constraint on the temperature

x2(t) e 460

(15)

The performance index to be maximized is the yield given by

I ) x1(tf)

(16)

where the final time tf ) 5 min. In order to deal with the inequality state constraint in eq 15, we introduce a state constraint variable x3 by

{

dx3 x - 460 if x2 - 460 > 0 ) 2 0 if x2 - 460 e 0 dt

(17)

with the initial condition x3(0) ) 0. Therefore, whenever the state constraint is violated, x3 is increased in value. Since the performance index in eq 16 is to be maximized, we choose the augmented performance index to be minimized as

J ) -I + θx3(tf)

(18)

where θ is a positive penalty function factor. As was done by Luus (1991), let us divide the given time interval into 25 time stages (P ) 25) each of length 0.2 min. For optimization we took N ) 3, R ) 20, and γ ) 0.9. The local error tolerance of 10-8 was used for integration in subroutine DVERK. We chose u(0) ) 0.25 and r(0) ) 0.5 as the initial control policy and the initial control region, respectively, and used θ ) 10. After 34 iterations requiring 96 s of computation time on a Pentium/120 personal computer, from the initial value of the performance index I(0) ) 0.099 31, we obtained rapid convergence to I ) 0.676 84 with no violation of the inequality constraint at all (i.e., x3(tf) ) 0). By using a lower value for the penalty function factor θ ) 0.1, we obtained almost the same performance index I ) 0.676 85 also with x3(tf) ) 0. Therefore, the choice of the penalty function factor is not very important for this problem. Although, Luus (1991) reported a slightly higher performance index I ) 0.6770, this better value is attributed to a slight violation of the inequality state constraint within the time stages 12 and 13, i.e., in the intervals 2.2 < t < 2.4 and 2.4 < t < 2.6. The optimum value of the performance index is very sensitive to the time of switching of the coolant flow rate, as was shown by Luus (1994). By taking 24 stages, i.e., P ) 24, with θ ) 1, we were able to obtain a somewhat better value for the performance index, I ) 0.67713, than was possible with P ) 25. The resulting optimal control policy is given in Table 1, and the corresponding state

1688 Ind. Eng. Chem. Res., Vol. 36, No. 5, 1997 Table 1. Optimal Control Policy for Example 1 with P ) 24 stage number k

time tk

control u(tk-1)

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24

0.20833 0.41666 0.62499 0.83332 1.04165 1.24998 1.45831 1.66664 1.87497 2.08330 2.29163 2.49996 2.70829 2.91662 3.12495 3.33328 3.54161 3.80202 3.90619 4.16660 4.37493 4.58326 4.79159 5.00000

0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.47323 0.50000 0.47481 0.32578 0.45483 0.35908 0.31999 0.28335 0.24989 0.23732 0.20414 0.19851 0.17647 0.16901 0.15563

Figure 1. Temperature profile in example 1, obtained with P ) 24 stages.

Table 2. Effect of the Number of Grid Points N on the Performance Index Obtained after 10 Passes no. of stages, P

N)5

N ) 11

N ) 15

N ) 21

47 48 49 95 96 97

0.677 07 0.677 20 0.677 07 a 0.677 25 0.677 11

0.677 18 0.677 22 0.677 15 a a a

0.677 19 0.677 23 0.677 16 0.677 24 0.677 22 a

0.677 19 0.677 23 0.677 16 0.677 27 0.677 27 0.677 26

a

Local optimum.

trajectory of x2 is shown in Figure 1. The state trajectory was generated by using the optimal control policy to integrate the system equations with the time interval divided into 240 time steps in order to obtain an accurate profile. As can be seen, the trajectory does not go over the 460 bound. To examine the convergence properties and to obtain a more accurate optimal control policy, we took a larger number of stages. Initially we started by doubling the number of stages to P ) 48. By using the same starting conditions as before, but allowing 10 passes, each consisting of 30 iterations with the region restoration factor η ) 0.7, we obtained with the use of N ) 3 grid points I ) 0.677 11. Since this is lower than was obtained with P ) 24, it is clear that adequate convergence was not obtained. We therefore increased the number of grid points and performed a number of runs, with each run using the same starting conditions as before. By using N ) 15, we obtained I ) 0.677 23. The effect of the number of grid points is shown in Table 2. When the number of stages was doubled again, as shown in the same table, with P ) 96 and N ) 21, we obtained I ) 0.677 27. We also obtained the same value for I with P ) 95 stages. As was found by Hartig et al. (1995), as the number of stages is increased, it is necessary to use a larger number of grid points to increase the likelihood of obtaining the global optimum. Increasing the number of candidates for control, i.e., R, had very little effect. The computation time on Pentium/120 for 1 pass with P ) 95 and N ) 21 was 6991 s. The control policy with P ) 95 is given in Figure 2. It is interesting to note the dip in the control policy around t ) 2.6 min and then

Figure 2. Control policy for example 1 with P ) 95 stages, after 10 passes, yielding I ) 0.677 268 6.

the rise to the second peak. This dip is due to the state constraint, and, therefore, obtaining the optimal control policy is quite difficult. As can be seen, there are numerous small irregularities beyond t ) 3 min, so that, even after 10 passes, the control policy can still be improved somewhat to make it a smoother profile. For this system, small changes in the control policy have very little effect on the performance index. In fact, this lack of sensitivity makes this optimal control problem very challenging. To obtain an improvement to this control policy, it was perturbed slightly and used as an initial control policy for another 10 passes. Although the performance index was improved only marginally from 0.677 269 to 0.677 270, the resulting control policy is considerably smoother, as is seen in Figure 3. Such a low sensitivity of the control policy on the performance index has been observed in the optimal control of a fedbatch reactor (Luus, 1995).

Ind. Eng. Chem. Res., Vol. 36, No. 5, 1997 1689 Table 3. Effect of the Penalty Function Factor θ and the Number of Grid Points on the Results Obtained after 10 Passes θ ) 10

θ ) 100

θ ) 1000

no. of grid points N

I

x4(1)

I

x4(1)

I

x4(1)

3 5 7 9 11 13 15 17 19 21 23 25 27 29 31 33 35

0.193 08 0.176 84 0.175 88 0.173 00 0.172 79 0.172 66 0.172 67 0.172 67 0.172 68 0.172 66 0.172 66 0.172 66 0.172 68 0.172 67 0.172 67 0.172 67 0.172 67

2.57 × 10-5 1.07 × 10-5 1.74 × 10-5 3.50 × 10-6 4.76 × 10-6 3.37 × 10-6 2.19 × 10-6 2.79 × 10-6 1.13 × 10-6 2.20 × 10-6 3.18 × 10-6 2.51 × 10-6 9.64 × 10-7 2.07 × 10-6 1.58 × 10-6 1.78 × 10-6 1.55 × 10-6

0.394 67 0.250 60 0.192 38 0.177 81 0.181 67 0.173 03 0.173 25 0.173 06 0.172 78 0.172 80 0.172 71 0.172 73 0.172 78 0.172 70 0.172 69 0.172 71 0.172 69

6.61 × 10-6 1.54 × 10-7 9.93 × 10-7 4.35 × 10-8 3.01 × 10-8 0.0 5.38 × 10-9 2.34 × 10-10 9.36 × 10-9 1.50 × 10-8 0.0 5.57 × 10-9 1.30 × 10-9 5.08× 10-10 9.61 × 10-10 3.82 × 10-10 4.05 × 10-10

0.460 15 0.365 14 0.197 84 0.191 44 0.201 26 0.193 72 0.173 05 0.172 96 0.172 83 0.172 82 0.172 85 0.172 69 0.172 73 0.172 71 0.172 69 0.172 69 0.172 69

0.0 3.63 × 10-8 0.0 0.0 9.31 × 10-12 2.72 × 10-9 1.82 × 10-10 0.0 7.27 × 10-12 3.16 × 10-10 0.0 0.0 0.0 0.0 0.0 0.0 0.0

The control is bounded by

-20 e u e 20

(24)

The performance index to be minimized is

I ) x3(tf)

(25)

where tf ) 1. To deal with the inequality state constraint in eq 23, we introduce a state constraint variable x4 by

{

dx4 h(x) if h(x) > 0 ) dt 0 if h(x) e 0

(26)

with the initial condition x4(0) ) 0. Since the performance index in eq 25 is to be minimized, we chose the augmented performance index to be minimized as

J ) I + θx4(tf) Figure 3. Optimal control policy for example 1 with P ) 95 stages, yielding I ) 0.6772704.

Example 2: Mathematical System with Nonlinear Inequality Constraint. This system involving a nonlinear inequality constraint was studied by Mehra and Davis (1972), Goh and Teo (1988), Vlassenbroeck (1988), Teo et al. (1991), and Elnagar et al. (1995). The differential equations describing the system are

dx1 ) x2 dt

(19)

dx2 ) -x2 + u dt

(20)

dx3 ) x12 + x22 + 0.005u2 dt

(21)

with the initial condition

x(0) ) [0 -1 0]T

(22)

The nonlinear inequality constraint to be satisfied is

h(x) ) x2 + 0.5 - 8(t - 0.5)2 e 0

(23)

(27)

Let us take P ) 20, R ) 20, and γ ) 0.85 and allow 10 passes, each consisting of 30 iterations, with the region restoration factor of η ) 0.5. Of interest is examination of the effect of the penalty function factor θ. We took as the initial control policy u(0) ) 0 and the initial region size r(0) ) 20. To ensure that the inequality constraint would be satisfied, we divided each time stage into five subintervals over which integration was carried out with the subroutine DVERK with 10-8 local error tolerance. The results in Table 3 show that, in this example, the penalty function factor has a significant effect. The amount of violation is reduced quite considerably when θ ) 100 is used rather than 10, and the violation is reduced even more when θ ) 1000. However, the use of a larger value of θ requires the use of a larger number of grid points to obtain adequate convergence. It is noted that when N is greater than 11 with θ ) 100 and greater than 13 with θ ) 1000, we obtained values for the performance index that were better than the value of I ) 0.1816 reported by Goh and Teo (1988). With θ ) 10, the vicinity of the global minimum was obtained for all values of N greater than 3, but the violations in the constraint were on the order of 10-6. With N ) 23 and θ ) 100, we obtained the performance index I ) 0.172 71 with x4(1) ) 0. For several values of N greater than 23 with θ ) 1000, we obtained I ) 0.172 69. This performance index is marginally lower than the performance index I ) 0.1730

1690 Ind. Eng. Chem. Res., Vol. 36, No. 5, 1997

Example 3: Fed-Batch Fermentor Problem. In solving complex optimal control problems, special consideration may be necessary to make the problem as easy as possible to solve. We illustrate this idea with the problem of determining the optimal feed rate to a fed-batch fermentor in which the maximum production of penicillin is desired. The presence of six state inequality constraints makes this a challenging optimal control problem. This optimal control problem was studied by Lim et al. (1986), Cuthrell and Biegler (1989), Luus (1993a,b), Dadebo and McAuley (1995), and most recently Gupta (1995). The system is described by the four differential equations

( ) ( )

Figure 4. Trajectory of the function h(x) in example 2. Table 4. Optimal Control Policy for Example 2 with 20 Stages of Equal Length stage number k

time tk

control u(tk-1)

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20

0.05 0.10 0.15 0.20 0.25 0.30 0.35 0.40 0.45 0.50 0.55 0.60 0.65 0.70 0.75 0.80 0.85 0.90 0.95 1.00

10.8647 5.1732 2.2683 0.5588 -0.8955 -2.8832 -3.0644 -2.3752 -1.6643 -0.8955 -0.1043 0.7442 1.6157 2.5438 2.2698 1.0094 0.3985 0.1087 -0.0126 -0.0260

reported by Teo et al. (1991) with a mild violation in the inequality state constraint. The optimal control policy is given in Table 4. The computation time required for 10 passes with N ) 23 was 7522 s on Pentium/166. The trajectory of the function of state h(x) is shown in Figure 4. As can be seen, the constraint is satisfied throughout the time interval. By using only half as many allowable values for control, i.e., R ) 10 instead of R ) 20, we obtained essentially the same results, except that the computation time was halved. To refine the results, we divided the time interval into 40 stages. By starting from the control policy in Table 4, we obtained the performance index I ) 0.170 92, with no violation in the inequality state constraint throughout the entire time period. This value of the performance index is very slightly better than the performance index I ) 0.171 85 reported by Elnagar et al. (1995) for a smooth control profile when using nonlinear programming along with the collocation technique. It is interesting to note that doubling the number of stages from 20 to 40 improved the performance index by only 1%.

dx1 x1 ) h1x1 u dt 500x4

(28)

x2 dx2 ) h2x1 - 0.01x2 u dt 500x4

(29)

(

)

h1x1 h2x1 0.029x3x1 x3 u dx3 )+ 1dt 0.47 1.2 0.0001 + x3 500 x4 (30) dx4 u ) dt 500

(31)

where

h1 ) h2 )

0.11x3 0.006x1 + x3 0.0055x3

0.0001 + x3(1 + 10x3)

(32)

(33)

with the initial state

x(0) ) [1.5 0 0 7]T

(34)

The constraints on the feed rate are

0 e u e 50

(35)

and the constraints on the state variables are

0 e x1 e 40

(36)

0 e x3 e 25

(37)

0 e x4 e 10

(38)

The performance index to be maximized is the total amount of penicillin produced at the final time tf given by

I ) x2(tf) x4(tf)

(39)

Although there are two inequality constraints imposed on each of the variables x1, x3, and x4, we do not have to deal with six inequality constraints. It is obvious that the lower constraint on x4 cannot be reached and, therefore, can be neglected. Also, negative values for the concentrations x1 and x3 may lead to numerical difficulties, so we introduce three state constraint variables, x5, x6, and x7 through the differential equations

Ind. Eng. Chem. Res., Vol. 36, No. 5, 1997 1691 Table 5. Optimal Control Policy for Example 3 Using 10 Stages stage number k

time tk

control u(tk-1)

1 2 3 4 5 6 7 8 9 10

12.6 25.2 37.8 50.4 63.0 75.6 88.2 100.8 113.4 126.0

5.8222 38.6031 8.8590 9.0419 9.2029 9.3178 9.4160 9.5069 9.5956 9.6822

Table 6. Effect of Penalty Function Factor θ on Convergence for Example 3 penalty function factor θ

no. of passes for convergence

shifting term s

sensitivity 2θs

1 2 5 10 20 50 100

6 4a 2 2 2 3 4

-14.3629 -7.1799 -2.8660 -1.4357 -0.7168 -0.2864 -0.1433

-28.73 -28.72 -28.66 -28.71 -28.67 -28.64 -28.66

Figure 5. Convergence profile for the first pass in example 3 with different values for the penalty function factor: (s) θ ) 5; (- - -) θ ) 20; and (- -) θ ) 100.

{ {

x1 - 40 dx5 ) 0 dt 1000 x3 - 25 dx6 ) 0 dt 1000 dx7 x - 10 ) 4 0 dt

{

if if if if if if

x1 > 40 0 e x1 e 40 x1 < 0 x3 > 25 0 e x3 e 25 x3 < 0

if x4 > 10 if 0 e x4 e 10

a

(40)

A very slight violation on x3; x6(tf) ) 9.6 × 10-7.

(41) (42)

with the initial conditions x5(0) ) 0, x6(0) ) 0, x7(0) ) 0. Here we have used arbitrarily the large number 1000 in eqs 40 and 41 to eliminate the possibility of negative concentrations. Instead of treating the constraint in eq 38 as an inequality, it is obvious that, for maximum yield, we must have the maximum allowed volume. Therefore, instead of the inequality constraint, we have the equality constraint

x4(tf) - 10 ) 0

(43)

To maximize the performance index in eq 39, the augmented performance index to be minimized is chosen as

J ) -I + θ1x5(tf) + θ2x6(tf) + θ3(x4(tf) - 10 - s)2 (44) where we have included a shifting term s in the penalty function. Here we take all the penalty function factors to be equal; i.e., θ1 ) θ2 ) θ3 ) θ. The use of a shifting term allows a quadratic penalty function to be used for equality constraints (Luus, 1996a,c). The shifting term is initially chosen to be zero and is updated after every iteration by the amount of violation of the equality constraint at final time tf , i.e., by

sj+1 ) sj - (x4(tf) - 10)

(45)

where j is the iteration number. First, let us choose tf ) 126 h and divide the time interval into 10 stages, i.e., P ) 10, and take R ) 11, γ ) 0.9, η ) 0.8, r1(0) ) r2(0) ) 10, 30 iterations per pass, and N ) 5 and use the initial control policy u(0) ) 11.9 as was used by Luus (1993a). A local error tolerance of 10-6 was used for integration with the DVERK sub-

Figure 6. Trajectories of state variables: (- -) x1; (s) x3; and (- - -) x4.

routine. As is shown in Figure 5, the effect of the penalty function factor θ is not that important after about 20 iterations of the first pass. Only the points where all the constraints were satisfied have been plotted. In every case we were able to get convergence to I ) 87.79, with x5(126) ) x6(126) ) x7(126) ) 0 within 6 passes. The computation time on Pentium/166 was approximately 40 min/pass for the first pass and 30 min/ pass for the sixth pass. The optimal control policy is given in Table 5, and the resulting trajectories of the state variables x1, x3, and x4 are shown in Figure 6. This performance index is slightly better than the performance index I ) 87.76 reported by Luus (1993a) and by Dadebo and McAuley (1995). Although the control policy reported by Luus (1993a) does not cause any constraint violation at the end of each time stage, it yields the state constraint variable x6(126) ) 2.92 which shows some violation of the constraint on x3. This state violation inside the time stages in this example was first pointed out by Keil (1994).

1692 Ind. Eng. Chem. Res., Vol. 36, No. 5, 1997 Table 7. Optimal Control Policy for Example 3 Using 20 Stages stage number k

time tk

control u(tk-1)

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20

6.6 13.2 19.8 26.4 33.0 39.6 46.2 52.8 59.4 66.0 72.6 79.2 85.8 92.4 99.0 105.6 112.2 118.8 125.4 132.0

3.6381 6.1262 23.1089 49.9945 8.1939 8.3694 8.6261 8.7754 8.8804 8.9607 9.0286 9.0877 9.1420 9.1927 9.2413 9.2890 9.3358 9.3820 9.4279 9.4721

To investigate further the effect of the penalty function factor θ, additional runs were performed with θ ) 1, 2, 10, and 50. As is shown in Table 6, there appears to be a range of θ for which the optimum can be obtained in only two passes. For this example the best range for θ appears to be between 5 and 20. If θ is too small, it takes longer to get the optimum shifting parameter and there is a possibility of slight state constraint violation in x3, as was experienced with θ ) 2. It is noted that, in the entire range of θ that was used, the product 2θs is constant (at the value -28.7), as would be expected from the work of Luus (1996a,c), because this factor gives the Lagrange multiplier, or sensitivity to the final state constraint. This means that, if the upper constraint on the volume were 10.001, rather than 10, the performance index is expected to increase by 0.0287 to 87.82. Let us now take tf ) 132 h and divide this time interval into 20 stages of equal length. For this case, Luus (1993a) obtained the performance index I ) 87.948, and Gupta (1995) reported I ) 88.00. The corresponding control policies were checked by integrating the system equations (eqs 28-31) and the auxiliary differential equations (eqs 40-42). The control policy given by Luus (1993a) yielded x6(132) ) 3.28, showing that the constraint on x3 is violated inside a time stage, while the policy reported by Gupta (1995) gave x7(132) ) 0.127 × 10-3, showing a very slight violation of the upper bound on x4. In fact, x4(132) ) 10.002 208. Although this violation is only 0.02% of the bound, it changes the performance index by 0.05%. Sometimes small violations of constraints can have a significant effect, as was pointed out by Luus (1976) in the optimization of an alkylation process and by Rosen et al. (1987) in considering time-optimal control. By starting with the control policy given either by Gupta (1995) or by Luus (1993a), we obtained convergence to the performance index I ) 87.959, with no violation of the constraints throughout the time interval (x5(132) ) x6(132) ) x7(132) ) 0), with 2θs ) -28.87. The optimal control policy is given in Table 7. The sensitivity of the volume constraint is shown in Figure 7. This is very similar to the result reported by Luus (1993b) in using piecewise linear continuous control for this example. In relaxing the upper constraint on x4 to 10.01, from the sensitivity factor we would expect the performance index to be increased by

Figure 7. Sensitivity of the volume constraint for example 3.

Figure 8. Effect of initial volume on the yield in example 3.

0.289 to 88.248. In fact, this same value was obtained through computation. It is also interesting to note the significant effect that the choice of the initial volume has, as is shown in Figure 8. The effect of the initial concentration of x1, as shown in Figure 9, is not quite as important. When all these conditions are changed slightly, we expect an additive effect. For example, if we choose x4(0) ) 6.98, x1(0) ) 1.60, and the maximum value for x4 of 10.01, we would expect to get I ) 88.93. A run was made with these values, and the optimum value of I obtained was 88.930. Therefore, such sensitivity information can be used very effectively in design, or operation. Although IDP was used for optimization for the three examples, the proposed method of introducing the state constraint variables to handle state constraints is not restricted to any particular optimization procedure. 5. Conclusions Incorporating state constraint variables into the set of equations describing the system provides a convenient way of monitoring constraint violations. When these variables at the final time are used in the penalty function in the augmented performance index, then a

Ind. Eng. Chem. Res., Vol. 36, No. 5, 1997 1693 Φ ) performance index ψ ) inequality constraint function Subscripts f ) final i ) general index, state variable number j ) general index, control variable number k ) index used for stage number Superscript (0) ) initial value

Literature Cited

Figure 9. Effect of the initial concentration of x1 on the yield in example 3.

good means for handling inequality state constraints in optimal control problems is available. The proposed procedure is well suited for iterative dynamic programming. The method is straightforward, and no computational difficulties were encountered with the test problems. In each case the state constraint variables could be forced to be zero at the final time, showing that the state constraints were satisfied inside each time step throughout the given time period. Acknowledgment Financial support from the Natural Sciences and Engineering Research Council of Canada is gratefully acknowledged. Nomenclature h ) inequality constraint in eq 23 h1,2 ) variables defined in eqs 32 and 33 I ) performance index J ) augmented performance index k ) index used for stage number k1,2 ) rate constants defined in eqs 11 and 12 m ) number of control variables n ) number of state variables N ) number of grid points used in IDP P ) number of time stages q ) number of state constraints r ) region over which allowable values for the control are taken R ) number of sets of random allowable values for the control s ) shifting parameter used for equality constraints t ) time tf ) final time u ) scalar control variable u ) control vector, m × 1 x ) state vector, n × 1 Greek Letters Rj ) lower bound on control uj βj ) upper bound on control uj γ ) reduction factor for the control region after every iteration η ) region restoration factor used after every pass θ ) penalty function factor for state inequality constraints

Arnold, S. G.; McLellan, P. J.; Daugulis, A. J. Dynamic Modelling and Performance Optimization of an Extractive Fermentation. Can. J. Chem. Eng. 1996, 74, 385-393. Bojkov, B.; Luus, R. Optimal Control of Nonlinear Systems with Unspecified Final Times. Chem. Eng. Sci. 1996, 51, 905-919. Cuthrell, J. E.; Biegler, L. T. Simultaneous Optimization and Solution Methods for Batch Reactor Control Profiles. Comput. Chem. Eng. 1989, 13, 49-62. Dadebo, S. A.; McAuley, K. B. Dynamic Optimization of Constrained Chemical Engineering Problems Using Dynamic Programming. Comput. Chem. Eng. 1995, 19, 513-525. Elnagar, G.; Kazemi, M. A.; Razzaghi, M. The Pseudospectral Legendre Method for Discretizing Optimal Control Problems. IEEE Trans. Autom. Control 1995, 40, 1793-1796. Goh, C. J.; Teo, L. K. Control Parameterization: a Unified Approach to Optimal Control Problems with General Constraints. Automatica 1988, 24, 3-18. Gupta, Y. P. Semiexhaustive Search for Solving Nonlinear Optimal Control Problems. Ind. Eng. Chem. Res. 1995, 34, 3878-3884. Hartig, F.; Keil, F. J.; Luus, R. Comparison of Optimization Methods for a Fed-Batch Reactor. Hung. J. Ind. Chem. 1995, 23, 141-148. Hull, T. E.; Enright, W. D.; Jackson, K. R. User Guide to DVERKsA Subroutine for Solving Nonstiff ODE’s. Report 100; Department of Computer Science, University of Toronto, Toronto, Ontario, Canada, 1976. Keil, F. J. Technical University of HamburgsHarburg, personal communication, 1994. Ko, D. Y. C.; Stevens, W. F. Studies of Singular Solutions in Dynamic Optimization. AIChE J. 1971, 17, 160-166. Lim, H. C.; Tayeb, Y. J.; Modak, J. M.; Bonte, P. Computational Algorithms for Optimal Feed Rates for a Class of Fed-Batch Fermentation: Numerical Results for Penicillin and Cell Mass Production. Biotechnol. Bioeng. 1986, 28, 1408-1420. Logsdon, J. S.; Biegler, L. T. Accurate Solution of Differential Algebraic Equations. Ind. Eng. Chem. Res. 1989, 28, 16281639. Luus, R. A Discussion on Optimization of an Alkylation Process. Int. J. Numer. Methods Eng. 1976, 10, 1187-1190. Luus, R. Optimal Control by Dynamic Programming Using Accessible Grid Points and Region Reduction. Hung. J. Ind. Chem. 1989, 17, 523-543. Luus, R. Application of Iterative Dynamic Programming to State Constrained Optimal Control Problems. Hung. J. Ind. Chem. 1991, 19, 245-254. Luus, R. Optimization of Fed-Batch Fermentors by Iterative Dynamic Programming. Biotechnol. Bioeng. 1993a, 41, 599602. Luus, R. Piecewise Linear Continuous Optimal Control by Iterative Dynamic Programming. Ind. Eng. Chem. Res. 1993b, 32, 859-865. Luus, R. Optimal Control of Batch Reactors by Iterative Dynamic Programming. J. Process Control 1994, 4, 218-226. Luus, R. Sensitivity of Control Policy on Yield of a Fed-Batch Reactor. Proceedings IASTED International Conference on Modelling and Simulation, Pittsburgh, PA, April 27-29, 1995, pp 224-226. Luus, R. Use of Iterative Dynamic Programming with Variable Stage Lengths and Fixed Final Time. Hung. J. Ind. Chem. 1996a, 24, 279-284. Luus, R. Numerical Convergence Properties of Iterative Dynamic Programming when Applied to High Dimensional Systems. Chem. Eng. Res. Des. 1996b, 74, 55-62.

1694 Ind. Eng. Chem. Res., Vol. 36, No. 5, 1997 Luus, R. Handling Difficult Equality Constraints in Direct Search Optimization. Hung. J. Ind. Chem. 1996c, 24, 285-290. Mehra, R. K.; Davis, R. E. A Generalized Gradient Method for Optimal Control Problems with Inequality Constraints and Singular Arcs. IEEE Trans. Autom. Control 1972, AC-17, 6978. Modak, J. M.; Bonte, P.; Lim, H. C. An Improved Computational Algorithm for Singular Control Problems in Chemical Reaction Engineering. Chem. Eng. Commun. 1989, 86, 165-183. Reddy, K. V.; Husain, A. Computation of Optimal Control Policy with Singular Subarc. Can. J. Chem. Eng. 1981, 59, 557-559. Rosen, O.; Imanudin; Luus, R. Final-state Sensitivity for TimeOptimal Control Problems. Int. J. Control 1987, 45, 1371-1381. Teo, K. L.; Goh, C. J.; Wong, K. H. A Unified Computational Approach to Optimal Control Problems; Wiley: New York, 1991; pp 146-147.

Vlassenbroeck, J. A. Chebyshev Polynomial Method for Optimal Control with State Constraints. Automatica 1988, 24, 499-506.

Received for review September 24, 1996 Revised manuscript received January 9, 1997 Accepted January 10, 1997X IE960583E

X Abstract published in Advance ACS Abstracts, March 1, 1997.