Ind. Eng. Chem. Res. 1997, 36, 2279-2286

Extension of Iterative Dynamic Programming to Multiobjective Optimal Control Problems

Feng-Sheng Wang* and Tsong-Lin Shieh

Department of Chemical Engineering, National Chung Cheng University, Chia-Yi 621, Taiwan, Republic of China

Multiobjective optimal control problems are converted into a single-objective optimal control problem including constraints. Once the threshold level of the constraints is specified, such single-objective optimal control problems can be solved by dynamic programming. Interactive programming procedures are usually required to find a satisfactory solution to multiobjective optimal control problems. In order to obtain the trade-off information in the interactive programming, the multiplier updating method incorporated in iterative dynamic programming (IDP) is introduced to estimate the Lagrange multipliers of the optimal control problems. Two chemical processes are used to investigate the effect of using Lagrange multiplier updating in iterative dynamic programming.

Introduction

Until very recently, almost all optimal control problems in chemical process engineering have been considered to be single-objective optimization problems. In addition to economic efficiency, the performance of many process systems requires evaluation through a variety of criteria (Farber, 1986; Sophos et al., 1980). For instance, the aim of time optimal control problems is to find a control policy that drives the given initial states to the desired states in a minimum amount of time. This type of problem occurs very often in batch polymerization processes (Thomas and Kiparissides, 1984; Hsu and Chen, 1988; Jang and Yang, 1989; Chang and Lai, 1992). In such processes, reducing the batch reaction time and achieving the desired polymer properties, a near-complete monomer conversion and a high polymer molecular weight, are simultaneously required. These situations become multiobjective optimization problems. Since most of these objective functions are noncommensurable, multiobjective optimization methods have to be employed to solve such problems. The most common way of dealing with multiple objectives puts the onus on designers.
They are asked to assign weights or reference levels to the different objectives and combine them into a single-objective function. The optimization problem can then be solved with this single objective. If the designer finds that some of the original objectives have unacceptable values, he or she can revise the weights or reference levels on those objectives and resolve the problem. Various multiobjective optimization methods provide a framework for understanding the relationships between the various performance criteria and allow the designer to make decisions on how to trade off amongst the objectives to achieve the performance he or she defines as being best. It is an inherently interactive process, with the designer constantly making decisions. A trade-off surface can be referred to in these decision procedures. This surface corresponds to those points where performance in one objective must be given up in order to achieve better performance in another. The goal of multiobjective optimization is to find this boundary. Any point on this surface represents a Pareto optimal operating point. It can be shown that this point is an extreme of a scalar function of the objective functions (Sawaragi et al., 1985). This is known as a utility function. It is up to the designer to determine the best operating point in a particular set of circumstances. The designer intuitively decides what the utility function is and selects the operating point that optimizes this function.

The aim of this paper is to extend the method of iterative dynamic programming (IDP), which was introduced by Luus and co-workers (Bojkov and Luus, 1996; Luus, 1991; Luus and Rosen, 1991), to solve multiobjective optimal control problems. According to the methodologies of multiobjective optimization, such problems are first expressed as a utility problem. Such a utility problem becomes a single-objective optimal control problem with the designer's reference levels so that it can be solved by conventional IDP augmented with penalty functions (Luus, 1991; Luus and Rosen, 1991; Dadebo and McAuley, 1995a,b; Bojkov and Luus, 1996). The single-objective function augmented with penalty functions, squared or absolute terms, is simple in concept and implementation. However, these penalty function methods have certain weaknesses that are most serious when the penalty parameter is large. Such penalty functions tend to be ill-behaved near the boundary of the feasible domain, where the optimum points usually lie. In addition, the trade-off information between objective functions on the Pareto surface cannot be obtained by the penalty methods. In this study, a method of multiplier updating is introduced and used as a remedy to alleviate some of the difficulties of the penalty methods. Moreover, the Lagrange multipliers can be estimated by the proposed method so that the trade-off information in the interactive multiobjective optimization programming can be obtained.

* To whom all correspondence should be addressed. Phone: +886 5 2428153. Fax: +886 5 2721206. E-mail: [email protected].
Multiobjective Optimal Control Problems

Consider the system described by the differential equation

dx/dt = f(x, u),  x(0) = x0   (1)

where x is an (n × 1) state vector and u is an (m × 1)


control vector bounded by

uk^l ≤ uk(t) ≤ uk^h,  k = 1, ..., m   (2)

The system is restricted by ne equality constraints and ni inequality constraints of the type

hk(x, u) = 0,  k = 1, ..., ne   (3)

gk(x, u) ≤ 0,  k = 1, ..., ni   (4)

Multiobjective optimal control problems can now be stated as finding a control policy u(t) that minimizes a set of objective functions simultaneously. The general form of the multiple-objective functions is considered as follows:

min_{u(t)} J = min_{u(t)} [J1(x(tf)) ... Js(x(tf))]^T   (5)

where J is an (s × 1) vector of the objective functions. Multiobjective optimization is a natural extension of the traditional optimization of a single-objective function. If the multiobjective functions are commensurate, minimizing one objective function minimizes all criteria and the problem can be solved using traditional optimization techniques. However, if the objective functions are incommensurate, or competing, then the minimization of one objective function requires a compromise in another objective function. The competition between multiobjective functions gives rise to the distinguishing difference between multiobjective optimization and traditional single-objective optimization. This fact causes the lack of a complete ordering for multiobjective optimization problems. The concept of Pareto optimality or noninferiority is therefore used to characterize a solution to the multiobjective optimization problem. In order to concisely define the Pareto optimal solution, we introduce the following definitions.

Definition 1. The feasible region in input space, Ω, is the set of all admissible control inputs that satisfy the system constraints; i.e.,

Ω = {u | dx/dt = f(x, u), x(0) = x0; uk^l ≤ uk(t) ≤ uk^h, k = 1, ..., m; hk(x, u) = 0, k = 1, ..., ne; gk(x, u) ≤ 0, k = 1, ..., ni}
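In a discretized setting, membership in Ω can be checked pointwise once the dynamics have been integrated. The following is a minimal sketch; the function name, the single sampled control/state pair, and the tolerance `tol` are illustrative choices, not from the paper:

```python
def in_omega(u, x, h_list, g_list, u_lo, u_hi, tol=1e-8):
    """Check one sampled control/state pair against the constraints defining Omega."""
    # bound constraints of eq 2
    if any(not (lo - tol <= uk <= hi + tol) for uk, lo, hi in zip(u, u_lo, u_hi)):
        return False
    # equality constraints of eq 3
    if any(abs(h(x, u)) > tol for h in h_list):
        return False
    # inequality constraints of eq 4
    if any(g(x, u) > tol for g in g_list):
        return False
    return True
```

In practice such a test would be applied along the whole state trajectory after integrating eq 1.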

We are now in a position to define the Pareto optimal solutions.

Definition 2. A control action u* is a Pareto optimal policy if and only if there does not exist u ∈ Ω such that

Ji(u) ≤ Ji(u*),  i = 1, ..., s

Jk(u) < Jk(u*)  for some k
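For a finite set of candidate policies whose objective vectors have been computed, this dominance relation is easy to test in code. A minimal sketch, with all objectives minimized; the function names are illustrative, not from the paper:

```python
def dominates(a, b):
    """True if objective vector a Pareto-dominates b: no worse in every
    component and strictly better in at least one (all objectives minimized)."""
    return all(ai <= bi for ai, bi in zip(a, b)) and any(ai < bi for ai, bi in zip(a, b))

def pareto_set(points):
    """Keep only the non-dominated (Pareto optimal) objective vectors."""
    return [p for p in points if not any(dominates(q, p) for q in points)]
```

For example, `pareto_set([(1, 5), (2, 2), (4, 1), (3, 3)])` drops (3, 3), which is dominated by (2, 2), and returns [(1, 5), (2, 2), (4, 1)].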

The image of a Pareto optimal policy is a Pareto optimal solution. In general, there are an infinite number of Pareto policies for a given multiobjective optimization problem. The collection of Pareto policies is the Pareto set. The image of this set is called the trade-off surface. The literature on multiobjective optimization is huge, and we cannot hope to mention all the techniques that have been used to generate Pareto solutions; however, one method is pervasive in the multiobjective optimization literature. This technique is the weighted sum method.

The weighted sum method is a technique for converting multiobjective optimization problems such as (5) into a single-objective function problem. In this scheme, a weight vector is chosen, and the following single-objective function problem is solved:

min_{u(t)∈Ω} w^T J   (6)

This is one of the most commonly used techniques for the optimization of chemical processes. A problem with this weighted sum technique arises when the lower boundary of the feasible region in output space is not convex. This situation corresponds to a duality gap in the normal theory of Lagrange multipliers and historically indicated the first practical difference between the solution of single-objective and multiobjective optimization problems. Clearly, if the Pareto surface is nonconvex, the weighted sum method may yield poor designs no matter what weights or optimization method are used. The ε-constraint method can be used to overcome the drawback of such a nonconvex problem. This method for characterizing Pareto optimal solutions is to solve the following constraint problem, formulated by taking one objective function, say J1, as the objective function and letting all other objective functions be inequality or equality constraints:

min_{u(t)} J1(x(tf))   (7)

subject to

φk(x(tf)) = Jk(x(tf)) - εk ≤ 0,  k = 2, ..., s

u(t) ∈ Ω

When the threshold level εk in problem 7 is assigned by a designer, such a problem can be solved by conventional single-objective optimization methods. Interactive programming procedures are usually required to find a satisfactory solution to the multiobjective optimal control problem (5) since the solution of the subproblem (7) depends on the designer-selected threshold level. The following theorems guarantee that an optimal solution of the ε-constraint problem (7) is a Pareto optimal solution of the multiobjective optimization problem (5).

Theorem 1. If u* ∈ Ω is a unique optimal solution to the ε-constraint problem (7) for some εk, k = 2, ..., s, then u* is a Pareto optimal solution to the multiobjective optimization problem (5).

Theorem 2. If u* ∈ Ω is a Pareto optimal solution of the multiobjective optimization problem (5), then u* is an optimal solution of the ε-constraint problem (7) for some εk, k = 2, ..., s.

The proofs of these theorems follow immediately from definition 2 of the Pareto optimality by making use of contradiction arguments.

IDP Including Multiplier Updating

If the weight wk or the threshold level εk in the subproblem (6) or (7) is specified by a designer, then such a subproblem becomes a single-objective optimal control problem with constraints. The original algorithm of IDP augmented with penalty functions could be applied to solve this constrained subproblem. The penalty function methods are easy to program and


are considered efficient. The optimization process can, however, be significantly improved by including the Lagrange multipliers of the optimization problem in the algorithm. Indeed, Powell (1978) notes that a penalty function method which does not include the Lagrange multipliers is obsolete as a practical optimization tool. In this study, we consider a way to include the conditions for optimality in the optimization algorithm in order to improve its efficiency and reliability. The principal motivation here is to reduce the dependency of the algorithm on the choice of the penalty parameters and the way in which they are updated during the optimization process.

To apply dynamic programming, the optimal control problem is approximated by seeking a piecewise constant control policy, rather than a continuously varying control policy, so that the control action in the time interval tj-1 ≤ t < tj becomes

u(t) = u(j),  j = 1, ..., P   (8)

The multiple-objective optimal control problem (5) is thus reduced to finding the coefficients of the control policy that minimize the single-objective function in (7). By applying this approximation, the optimal control problem therefore becomes a problem of optimal parameter selection. A modified algorithm of IDP is now introduced to solve the converted problem (6) or (7). The Lagrangian for the converted problem (7) is first defined as

Ĵ = J1(x, u) + Σ_{k=1}^{ne} Σ_{j=1}^{P} λkj hk(x(tj), u(j)) + Σ_{k=1}^{ni} Σ_{j=1}^{P} µkj gk(x(tj), u(j)) + Σ_{k=2}^{s} νk φk(x(tf))   (9)

where λkj, µkj, and νk are Lagrange multipliers. The augmented Lagrangian including quadratic penalty terms is then defined for the problem (7) as

Ja = J1 + (1/2) Σ_{k=1}^{ne} Σ_{j=1}^{P} αk{[hkj + θkj]² - θkj²} + (1/2) Σ_{k=1}^{ni} Σ_{j=1}^{P} βk{⟨gkj + ϑkj⟩+² - ϑkj²} + (1/2) Σ_{k=2}^{s} γk{⟨φk + σk⟩+² - σk²}   (10)

where αkθkj (≡ λkj), βkϑkj (≡ µkj), and γkσk (≡ νk) are the Lagrange multipliers and the bracket operator is defined as ⟨gkj⟩+ = max{gkj, 0}. According to the definition of the bracket operator, this functional can be written in terms of the Lagrange multipliers as follows:

Ja = Ĵ + (1/2) Σ_{k=1}^{ne} Σ_{j=1}^{P} αk hkj² + (1/2) Σ_{k=1}^{ni} Σ_{j=1}^{P} βk gkj² + (1/2) Σ_{k=2}^{s} γk φk²   if gkj + ϑkj ≥ 0 and φk + σk ≥ 0

Ja = Ĵ + (1/2) Σ_{k=1}^{ne} Σ_{j=1}^{P} αk hkj² - (1/2) Σ_{k=1}^{ni} Σ_{j=1}^{P} µkj²/βk - (1/2) Σ_{k=2}^{s} νk²/γk   if gkj + ϑkj < 0 and φk + σk < 0   (11)

The terms (1/2) Σ_{k=1}^{ni} Σ_{j=1}^{P} (µkj²/βk) and (1/2) Σ_{k=2}^{s} (νk²/γk) in (11) are independent of u. Therefore, it can be clearly seen from the above expression that Ja has certain penalty terms that are added to the ordinary Lagrangian (9) for the problem (7).

The initial choices of the penalty parameters, αk, βk, and γk, and the corresponding Lagrange multipliers, θkj, ϑkj, and σk, are very important. Usually the initial θkj, ϑkj, and σk are set to zero. The initial choices of the penalty parameters are important from the point of view of the rate of convergence. Arora et al. (1991) have proposed a strategy for choosing the initial penalty parameters in nonlinear programming problems. Such a strategy expects a decrease in the objective function at the initial control policy, which requires some prior knowledge about the problem. Another way to choose the initial penalty parameters is to require the norms of the gradients of the objective function and the penalty functional to be equal at the initial point. In this study, the initial penalty parameters are accordingly set to be

αk = 2|J1(u(0) + Δu) - J1(u(0))| / max_j |hkj(u(0) + Δu) - hkj(u(0))|,  k = 1, ..., ne

βk = 2|J1(u(0) + Δu) - J1(u(0))| / max_j |gkj(u(0) + Δu) - gkj(u(0))|,  k = 1, ..., ni

γk = 2|J1(u(0) + Δu) - J1(u(0))| / |φk(u(0) + Δu) - φk(u(0))|,  k = 2, ..., s   (12)

where Δu is a small perturbation on the initial guess of the control policy u(0). These penalty parameters can be updated if any constraint in (7) is unsatisfied; the modified IDP is then applied to solve the problem (7) again. The corresponding multipliers, θkj, ϑkj, and σk, are updated in the modified algorithm as

θkj(l+1) = θkj(l) + hkj(l)

ϑkj(l+1) = ϑkj(l) + max{gkj(l), -ϑkj(l)}

σk(l+1) = σk(l) + max{φk(l), -σk(l)}   (13)
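One pass of these updates can be sketched in plain Python, with the constraint residuals hkj, gkj, and φk evaluated at the current iterate; the function and variable names mirror eq 13 but are otherwise illustrative:

```python
def update_multipliers(theta, h_res, vartheta, g_res, sigma, phi_res):
    """Augmented-Lagrangian multiplier update of eq 13.

    theta/vartheta/sigma: current multiplier estimates for the equality (h),
    inequality (g), and epsilon (phi) constraints; *_res are the residuals
    at iteration l."""
    theta_new = [t + h for t, h in zip(theta, h_res)]
    vartheta_new = [v + max(g, -v) for v, g in zip(vartheta, g_res)]
    sigma_new = [s + max(p, -s) for s, p in zip(sigma, phi_res)]
    return theta_new, vartheta_new, sigma_new
```

Note how an inactive inequality constraint (negative residual) pulls its multiplier back toward zero through the max(·, -ϑ) term, while an active constraint accumulates the violation.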

where l is the iteration index in the IDP algorithm. Usually the initial corresponding multipliers are set to zero. The Lagrange multipliers of the problem (7) can therefore be estimated by these equations.

Interactive programming procedures are usually required to find a satisfactory solution to multiobjective optimal control problems since the optimal solution of the problem (7) is dependent on the specified threshold level. Here it should be stressed that any improvement of one objective function can be achieved only at the expense of at least one of the other objective functions. Accordingly, trade-off information between a standing objective function J1 and each of the other objective functions is very useful. Such a trade-off between J1 and Jk can be easily obtained in this study. When a Pareto optimal solution is obtained by solving the constraint problem (7) for some selected values of εk, if all the ε constraints are active, then the corresponding


Lagrange multipliers give the trade-off rates as follows:

νk = -∂J1/∂Jk,  k = 2, ..., s   (14)

Here the Lagrange multiplier represents the marginal decrease of J1 when Jk is increased by one unit. The computational procedures for multiplier updating, which are modified from the IDP algorithm (Bojkov and Luus, 1996), are summarized in the following algorithm.

1. Divide the time interval [0, tf] into P time stages, each of length L.

2. Choose an initial guess of the control policy u(0) and a small perturbation on u(0). Estimate the initial penalty parameters by (12).

3. Choose the number of allowable values M for u and the number of x grid points N. Choose an initial profile for each uj, an initial region size rj, and the contraction factor η.

4. Generate the N values for the x grid at each time stage by using the N allowable values of control.

5. Using each x(tf - L) grid point as an initial value, integrate the state equation from time tf - L to tf once with each of the allowable values for the control vector, which is kept constant throughout the time interval. For each x grid point, choose the value that gives the minimum value of the augmented objective function Ja.

6. Step back to stage P - 1, corresponding to time tf - 2L. Integrate the state equations from tf - 2L to tf - L once with each of the allowable values of the control vector u(P - 1). Continue integration until t = tf, using for the last stage the constant value u(P) from step 5. Compare the M values of Ja and choose the u(P - 1) that gives the smallest value.

7. Repeat this procedure for stage P - 2, P - 3, etc., until stage 1, corresponding to the initial time, t = 0, is reached.

8. Update the multipliers by eq 13 and reduce the region for allowable control,

r(l+1) = ηr(l)

where l is the iteration index. Use the optimal control policy from step 7 as the midpoint for the allowable values of the control u at each stage.

9. Increment the iteration index by 1 and go to step 4. Continue the procedure for a number of iterations and examine the results.

10. Perform the trade-off procedures. If the designer is satisfied with the current levels of the objective functions, stop; the current Pareto optimal solution is the satisfactory solution for the designer. Otherwise, ask the designer to modify the current threshold level to a new level by considering the current levels of the objective functions together with the trade-off rates between the objective functions, and return to step 3. Here it should be stressed to the designer that any improvement of one objective function can be achieved only at the expense of at least one of the other objective functions.

Computational Examples

In order to investigate the effect of using Lagrange multiplier updating in IDP, let us consider two typical chemical processes. All computations were performed in double precision on a Pentium Pro 200 personal

computer using the Microsoft Windows NT Version 4.0 operating system and the Microsoft Fortran PowerStation Version 4.0 compiler.

Example 1. Optimal temperature control of batch styrene polymerization initiated with a mixed initiator system is considered in this example. Reducing the batch reaction time and obtaining the desired polymer properties are simultaneously required in this batch free-radical polymerization process. Although one can use both high temperature and high initiator concentration to achieve a near-complete monomer conversion in a short reaction time, it is often difficult to achieve both high monomer conversion and the desired high polymer molecular weight simultaneously by doing so. It is well-known that merely increasing the reaction temperature decreases the polymer molecular weight. Therefore, it is desired to select the proper polymerization temperature profile to achieve both high monomer conversion and the desired high polymer molecular weight simultaneously. In this case, a multiobjective formulation is the more natural description of this polymerization process. A simplified model for styrene polymerization initiated by mixed initiators is considered in this example. This system was discussed in detail by Butala (1990). The modeling equations are the following:

dx1/dt = -kd1 x1

dx2/dt = -kd2 x2

dx3/dt = [kp(1 - x3 xd)/xd] λ0

dx4/dt = [(kt/2) λ0 + kp Cm M0(1 - x3 xd)] λ0/µ0d

dx5/dt = [kt λ0 + kp Cm M0(1 - x3 xd)] λ2/µ2d + kt λ1²/µ2d   (15)

Here the state variables x1 and x2 are the dimensionless concentrations of the first and second initiator, respectively, x3 is the conversion of monomer, and x4 and x5 are the corresponding zeroth and second moments of the dead polymer distributions. The physical data in the above equations are presented in Table 1. Moreover, the zeroth, first, and second moments, λ0, λ1, and λ2, of the live polymer radicals in this modeling equation are expressed as follows:

λ0 = [2(kd1 I10 x1 + kd2 I20 x2)/kt]^{1/2}

λ1 = α(λ0 + γ)/(1 - α)

λ2 = α(λ0 + γ + 2λ1)/(1 - α)

where

α = M0(1 - x3 xd) / [M0(1 - x3 xd)(1 + Cm) + kt λ0/kp]

γ = 2(kd1 I10 x1 + kd2 I20 x2)/[kp M0(1 - x3 xd)] + Cm λ0
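Given the rate constants and the current states, these moment relations evaluate directly. A minimal sketch, assuming all rate constants and states are supplied by the caller; the function name and any numerical inputs are illustrative, not data from the paper:

```python
from math import sqrt

def live_moments(kd1, kd2, kp, kt, Cm, I10, I20, M0, x1, x2, x3, xd):
    """Zeroth, first, and second moments of the live polymer radicals."""
    lam0 = sqrt(2.0 * (kd1 * I10 * x1 + kd2 * I20 * x2) / kt)
    mono = M0 * (1.0 - x3 * xd)                      # remaining monomer term
    alpha = mono / (mono * (1.0 + Cm) + kt * lam0 / kp)
    gamma = 2.0 * (kd1 * I10 * x1 + kd2 * I20 * x2) / (kp * mono) + Cm * lam0
    lam1 = alpha * (lam0 + gamma) / (1.0 - alpha)
    lam2 = alpha * (lam0 + gamma + 2.0 * lam1) / (1.0 - alpha)
    return lam0, lam1, lam2
```

Since 0 < α < 1 under normal operating conditions, the moments satisfy 0 < λ0 and 0 < λ1 < λ2.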

Table 1. Physical Data for the Bulk Polymerization of Styrene

reaction: parameters (g, mol, L, s, cal, K)

initiation: BPO: kd1 = 1.44 × 10^13 exp(-14 700/T); TBPB: kd2 = 8.315 × 10^13 exp(-16 162/T)

propagation: kp = kp0 = 1.06 × 10^7 exp(-3569/T)

chain transfer to monomer: Cm = kfm/kp = kfmo/kp0 + B1 x3; kfmo = 2.31 × 10^6 exp(-6377/T); B1 = -1.013 × 10^-3 log[(473.12 - T)/202.5]

chain termination: kt = ktc = kt0 gt; kt0 = 1.25 × 10^9 exp(-847/T)

gel effect: gt = exp[-2(A1 x3 + A2 x3² + A3 x3³)]; A1 = 2.57 - 5.05 × 10^-3 T; A2 = 9.56 - 1.76 × 10^-2 T; A3 = -3.03 + 7.85 × 10^-3 T

density of styrene and polystyrene: ρm = 924.0 - 0.918(T - 273.1); ρp = 1084.8 - 0.605(T - 273.1)

initial concn of initiators: I10 = 0.001 536 mol/L; I20 = 0.003 264 mol/L; I0 = I10 + I20

As mentioned earlier, we are interested in the minimization of batch time while the desired polymer properties are achieved. Hence, the objective functions are defined as functions of the squared deviation of the normalized state variables from their desired states at the final time. The normalization is necessary as the orders of magnitude of the state variables are very different from each other. Now, three objective functions can be expressed as

min_{T(t)} J1 = (1 - x3/x4)²   (16)

min_{T(t)} J2 = (1 - x3)²   (17)

min_{T(t)} J3 = tf   (18)

Figure 1. Optimal concentrations of the first and second initiators. Solid line for β1 = 1.84 × 10^4 and γ1 = 5.95 × 10^-2; dashed line for β1 = 0.92 × 10^4 and γ1 = 2.975 × 10^-2.

The goal of these objective functions is to achieve the desired number-average molecular chain length and monomer conversion in minimum time. The reaction temperature T(t) is the only control variable to be determined within the region

343 K ≤ T(t) ≤ 413 K   (19)

to minimize the above objective functions simultaneously. As mentioned earlier, a high initiator residue level in the final polymer product is not desirable because the residual peroxide group decomposes at the final polymer processing stage, leading to deterioration in color and other polymer properties. To avoid this drawback, one more design constraint is included in this optimization problem. The initiator residue at the final time is restricted to be less than 4% of the total initiator concentration:

g1 = x1(tf) + x2(tf) - 0.04 I0 ≤ 0   (20)

where the total initiator concentration I0 is given in Table 1. To apply the ε-constraint method, the first objective function in (16) is considered to be the utility function in this multiobjective optimization problem. The second objective function is converted into an inequality constraint of the form

φ1 = (1 - x3(tf))² - ε2 ≤ 0   (21)

The threshold level is a problem-dependent parameter.

Figure 2. Optimal profile of monomer conversion. Solid line for β1 = 1.84 × 10^4 and γ1 = 5.95 × 10^-2; dashed line for β1 = 0.92 × 10^4 and γ1 = 2.975 × 10^-2.

Here, the value ε2 = 0.1 is used in this example. The third objective function is considered to be an equality constraint in this example. This restriction implies that the reaction time is specified by the designer. The optimal control problem is therefore recast to find the reaction temperature policy T(t), under the specified reaction time, such that the utility function (16) is minimized subject to the system constraints (15) and the inequality constraints (20) and (21).

Using the modified IDP, the penalty parameters are first estimated by (12) to be β1 = 1.84 × 10^4 and γ1 = 5.95 × 10^-2, respectively. By choosing P = 20 stages, a reduction factor η = 0.8, and the number of grid points N = 21, we obtain the optimal results shown in Figures 1, 2, and 3. From these results, the two inequality constraints, (20) and (21), are satisfied after 20 iterations. The initiator residue at the final time tf = 90 min is 3.7% of the total initiator concentration, which satisfies the desired level. The second initiator is almost exhausted at t > 50 min, as observed in Figure 1. The monomer conversion at the final time is 96.9%, as observed in Figure 2. The optimal stepwise temperature profile is shown in Figure 3.

Figure 3. Optimal temperature control action in example 1. Solid line for β1 = 1.84 × 10^4 and γ1 = 5.95 × 10^-2; dashed line for β1 = 0.92 × 10^4 and γ1 = 2.975 × 10^-2.

Figure 4. Convergence profiles of the augmented objective function. (O) β1 = 1.84 × 10^4 and γ1 = 5.95 × 10^-2; (×) β1 = 0.92 × 10^4 and γ1 = 2.975 × 10^-2.

Convergence to the optimal values of the augmented objective function is systematic and rapid. As is shown in Figure 4, convergence to the optimum is obtained in only 10 passes. The computation time is 8 min for the 20 passes on a Pentium Pro computer. The half-values of the original penalty parameters, β1 and γ1, are also used to solve this optimal control problem. The simulated results are shown as dashed lines in Figures 1-4 for comparison. Both optimal solutions are very close, as observed from Figures 1-3. This observation is due to the fact that an optimization method including Lagrange multipliers reduces the dependency of the algorithm on the choice of the penalty parameters.

The original IDP augmented with square penalty terms, i.e.,

Jp = J1 + β1⟨x1(tf) + x2(tf) - 0.04 I0⟩+² + γ1⟨(1 - x3(tf))² - ε2⟩+²

is also used to solve this example for comparison. The penalty parameters are chosen to be the same as in the proposed method. The computation time is almost 8 min for the 20 passes since the evaluations of the objective function are identical to those of the proposed method. After 20 iterations, the first inequality constraint is, however, unsatisfied by this penalty method. The initiator residue at the final time is 4.4% of the total initiator concentration. The penalty parameter β1 is therefore increased to 10 times the original value. IDP with such penalty terms is then performed again. After 20 iterations, the initiator residue at the final time is 4.3% of the total initiator concentration, which still does not satisfy the desired level.

Example 2. An optimal control problem of sugar crystallization in a vacuum pan crystallizer is presented in this example. This optimal control problem was solved by the method of control parameterization (Chew and Goh, 1989). The dynamics of the crystallization process are summarized as follows:

dx1/dt = d1 u2(t) - u1(t)

dx2/dt = d2 u2(t)

dx3/dt = d3 u2(t) - d4{d5 x5(t)[d6 x3(t)/(x1(t) - d7 x2(t)) - d8] exp[-d9 x2(t)/x1(t)] + d10 x4(t) exp[-d11 x2(t)/x1(t)]}

dx4/dt = d5[d6 x3(t)/(x1(t) - d7 x2(t)) - d8] exp[-d9 x2(t)/x1(t)]

dx5/dt = 2 d5 x4(t)[d6 x3(t)/(x1(t) - d7 x2(t)) - d8] exp[-d9 x2(t)/x1(t)] + d10 x4(t) exp[-d11 x2(t)/x1(t)]   (22)

where x1 is the weight of water in the pan, x2 is the weight of impurities in the pan, x3 is the weight of sucrose in the pan, x4 is the mass mean radius of a crystal, and x5 is the second moment of the crystal size distribution. The controls of the evaporation rate u1 and feed rate u2 are subjected to three constraints, namely, the nucleation boundary, viscosity, and pan capacity, in the forms

g1(x) = d6 x3(t)/(x1(t) - d7 x2(t)) - {d12[x2(t)/(x2(t) + x3(t))] + d13[x2(t)/(x2(t) + x3(t))]²} - d14 ≤ 0

g2(x) = (1 - d15) d4[x4(t) x5(t) - 2x4³(t)/3] - d15[x1(t) + x2(t) + x3(t)] ≤ 0

g3(x) = x1(t) + x2(t) + x3(t) + d4[x4(t) x5(t) - 2x4³(t)/3] - d16 ≤ 0

0 ≤ u1(t) ≤ 25

0 ≤ u2(t) ≤ 100   (23)

where the parameters of this optimal control problem are presented in Table 2.

Table 2. Parameters Used in Example 2

parameter: value
d1: 0.32
d2: 0.0714
d3: 1 - d1 - d2
d4: 1699.14
d5: 2.75
d6: 0.3475
d7: 0.088
d8: 1.02
d9: 1.577
d10: 0.0025
d11: 0.92
d12: 0.284
d13: 2.333
d14: 1.12889
d15: 0.55
d16: 80.0
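With the Table 2 values, the three path constraints can be evaluated at any state x = (x1, ..., x5). A sketch based on my reading of eq 23 (the sample state in the usage note is arbitrary, chosen only to exercise the functions; a state is feasible when all three values are nonpositive):

```python
# Table 2 parameters; d3 is defined from d1 and d2.
D = {1: 0.32, 2: 0.0714, 4: 1699.14, 5: 2.75, 6: 0.3475, 7: 0.088,
     8: 1.02, 9: 1.577, 10: 0.0025, 11: 0.92, 12: 0.284, 13: 2.333,
     14: 1.12889, 15: 0.55, 16: 80.0}
D[3] = 1.0 - D[1] - D[2]

def crystal_mass(x):
    """d4 [x4 x5 - (2/3) x4^3]: crystal mass implied by the size moments."""
    return D[4] * (x[3] * x[4] - 2.0 * x[3] ** 3 / 3.0)

def constraints(x):
    """g1 (nucleation boundary), g2 (viscosity), g3 (pan capacity).

    x = (x1 water, x2 impurities, x3 sucrose, x4 mean radius, x5 2nd moment)."""
    y = x[1] / (x[1] + x[2])                       # impurity ratio x2/(x2 + x3)
    g1 = D[6] * x[2] / (x[0] - D[7] * x[1]) - (D[12] * y + D[13] * y ** 2) - D[14]
    g2 = (1.0 - D[15]) * crystal_mass(x) - D[15] * (x[0] + x[1] + x[2])
    g3 = x[0] + x[1] + x[2] + crystal_mass(x) - D[16]
    return g1, g2, g3
```

For a dilute state such as x = (10, 1, 5, 0.01, 0.1), all three constraint values are negative, i.e., the state is feasible.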

This process seeks to control the feed and evaporation rates so that the desired crystal size is produced in minimum time. This problem thus becomes a multiple-objective optimal control problem. The two objective functions are respectively expressed as

max_{u1,u2} J1 = x4(tf)   (24)

and

min_{u1,u2} J2 = tf   (25)

Figure 5. Optimal profile of the crystal size in example 2.

To apply the ε-constraint method, the first objective function is considered to be the utility function in the constraint problem. The second objective function is then converted into an equality constraint. In such a case, once the threshold level is specified, i.e., the final time is explicitly given, the optimal control problem can be recast to find the control policies u1(t) and u2(t) that maximize the final crystal size in a fixed terminal time. By repeating these procedures, the trade-off surface can be obtained. From the trade-off surface, the minimum time of this biobjective optimal control problem can then be easily read off. Here we explain one of the solved cases. The final time of 1.54 h is first assigned. Using the modified IDP, the penalty parameters are estimated by (12) to be β1 = 1.7593 × 10^4, β2 = 0.4719, and β3 = 1.8084 × 10^-2, respectively. By choosing P = 20 stages, a reduction factor η = 0.8, and the number of grid points N = 21, we obtain the results of the optimal crystal size and the optimal control actions shown in Figures 5 and 6, respectively. The three inequality constraints are satisfied after the 20 allowed iterations. The crystal size grows continuously to achieve the maximum size of 0.4161 mm at the final time. The trade-off surface between the crystal size and the operation time is presented in Figure 7. From this trade-off information, we observe that the maximum crystal size is reached near tf = 1.54 h. After this time, the curve consists of non-Pareto solutions, as shown in this figure. This fact implies that any attempt to increase the final crystal size within the non-Pareto region must fail. Moreover, the Pareto surface of this problem is not a convex set, as observed in Figure 7. IDP augmented with square penalty functions is also used to solve this example for comparison. The penalty parameters are chosen to be the same as in the augmented Lagrange multiplier method.
Figure 6. Optimal control actions of evaporation rate and feed rate.

Figure 7. Trade-off information between the maximum final crystal size and the minimum final time.

After 20 iterations of IDP augmented with penalty functions, the second and third inequality constraints are unsatisfied after time t ≥ 1.38 h. Both penalty parameters are increased to 10 times the original values. IDP with such penalty terms is then performed again. After 20 iterations, the third inequality constraint is still unsatisfied at the final time tf = 1.54 h.


Conclusions

A modified algorithm of iterative dynamic programming has been proposed to solve multiobjective optimal control problems. The use of the Lagrange multiplier updating method in dynamic programming appears to be an attractive means of handling the converted problem with equality and inequality constraints. A simple guideline for selecting initial guesses of the penalty parameters and Lagrange multipliers was introduced, making the method promising for obtaining practically feasible solutions. Two chemical processes were used to test the proposed algorithm. No computational difficulties were encountered, so IDP may be used successfully to solve multiobjective optimal control problems.

Literature Cited

Arora, J. S.; Chahande; Paeng, J. K. Multiplier Methods for Engineering Optimization. Int. J. Num. Methods Eng. 1991, 32, 1485-1525.
Bojkov, B.; Luus, R. Optimal Control of Nonlinear Systems with Unspecified Final Times. Chem. Eng. Sci. 1996, 51, 905-919.
Butala, D. Modeling and Optimal Control of Polymerization Reactors. Ph.D. Dissertation, University of Maryland, College Park, 1990.
Chang, J. S.; Lai, J. L. Computation of Optimal Temperature Policy for Molecular Weight Control in a Batch Polymerization Reactor. Ind. Eng. Chem. Res. 1992, 31, 861-868.
Chew, E. P.; Goh, C. J. On Minimum Time Optimal Control of Batch Crystallization of Sugar. Chem. Eng. Commun. 1989, 80, 225-231.
Dadebo, S. A.; McAuley, K. B. A Simultaneous Iterative Solution Technique for Time-Optimal Control Using Dynamic Programming. Ind. Eng. Chem. Res. 1995a, 34, 2077-2083.
Dadebo, S. A.; McAuley, K. B. Dynamic Optimization of Constrained Chemical Engineering Problems Using Dynamic Programming. Comput. Chem. Eng. 1995b, 19, 513-525.
Farber, J. N. Steady State Multiobjective Optimization of Continuous Copolymerization Reactors. Polym. Eng. Sci. 1986, 26, 499-507.
Hsu, K. Y.; Chen, S. A. Minimum End-Time Policies for Batchwise Radical Chain Polymerization: The Piecewise Initiator Addition Policy. Chem. Eng. Sci. 1988, 43, 1311-1321.
Jang, S. S.; Yang, W. L. Dynamic Optimization of Batch Emulsion Polymerization of Vinyl Acetate: An Orthogonal Polynomial Initiator Policy. Chem. Eng. Sci. 1989, 44, 515-528.
Luus, R. Application of Iterative Dynamic Programming to State Constrained Optimal Control Problems. Hung. J. Ind. Chem. 1991, 19, 245-254.
Luus, R.; Rosen, O. Application of Dynamic Programming to Final State Constrained Optimal Control Problems. Ind. Eng. Chem. Res. 1991, 30, 1525-1530.
Powell, M. J. D. Algorithms for Nonlinear Constraints that use Lagrangian Functions. Math. Program. 1978, 14, 224-248.
Sawaragi, Y.; Nakayama, H.; Tanino, T. Theory of Multiobjective Optimization; Academic: Orlando, FL, 1985.
Sophos, A.; Rotstein, E.; Stephanopoulos, G. Multiobjective Analysis in Modeling the Petrochemical Industry. Chem. Eng. Sci. 1980, 35, 2415-2426.
Thomas, I. M.; Kiparissides, C. Computation of the Near-Optimal Temperature and Initiator Policies for a Batch Polymerization Reactor. Can. J. Chem. Eng. 1984, 62, 284-291.

Acknowledgment

This work was supported by the National Science Council of the Republic of China under Grant NSC84-2214E194-005.

Nomenclature

f(·) = vector function of nonlinear differential equations
J = (s × 1) objective function vector
gk(·) = kth inequality constraint
hk(·) = kth equality constraint
u = control variables
u(j) = control action used for stage j
w = weighting vector
x = state variables
x0 = initial state vector

Greek Letters

αk = penalty parameter for the equality constraint
βk = penalty parameter for the inequality constraint
γk = penalty parameter for the converted constraint
Ω = feasible region in the input space
λkj = Lagrange multiplier for the equality constraint
μkj = Lagrange multiplier for the inequality constraint
νk = Lagrange multiplier for the converted constraint
θkj = corresponding Lagrange multiplier for the equality constraint
ϑkj = corresponding Lagrange multiplier for the inequality constraint
σk = corresponding Lagrange multiplier for the converted constraint
εk = threshold level

Received for review October 15, 1996
Revised manuscript received March 6, 1997
Accepted March 7, 1997

IE9606606

Abstract published in Advance ACS Abstracts, May 1, 1997.