
Ind. Eng. Chem. Res. 2004, 43, 6756-6765

Dynamic Optimization of Distributed Parameter Systems Using Second-Order Directional Derivatives

Eva Balsa-Canto*
Department of Applied Mathematics II, Universidad de Vigo, Campus Marcosende, 36280 Vigo, Spain

Julio R. Banga and Antonio A. Alonso Process Engineering Group, IIM-CSIC, Eduardo Cabello 6, 36208 Vigo, Spain

Vassilios S. Vassiliadis Department of Chemical Engineering, Cambridge University, Cambridge, U.K.

The dynamic optimization of combined lumped and distributed process systems, governed by nonlinear ordinary and partial differential equations (ODEs and PDEs), is considered in this work. The application of a recently developed method, based on the combination of the control vector parametrization approach with the calculation of exact gradients and projected Hessians (Hp's), is presented as an alternative for the efficient computation of the control policies needed to optimize a specific performance criterion. The exact first- and second-order information is calculated by means of the solution of an extended initial value problem (IVP) whose particular time-varying Jacobian structure is exploited by a sparse implicit solver to increase efficiency. Finally, the applicability of this method is shown through the solution of a number of case studies demonstrating that a significant speed-up can be obtained through the use of exact second-order information.

1. Introduction

The computation of optimal operating conditions for chemical and biotechnological processes has been a challenging research topic during recent decades. The ever-increasing market competition, demand for high-quality products, and stringent environmental regulations in the industrial environment have led to the necessity of developing adequate computer-aided techniques for the solution of ever-more-complex dynamic optimization problems. The systematic calculation of optimal operating conditions relies on the use of precise mathematical models of the process under consideration. Most processes in the fine chemicals, biotechnology, and food industries are operated in an inherently transient manner (i.e., as batch or semibatch processes) rather than at some nominal steady state. Therefore, state variables depend on time, and mathematical models result in sets of ordinary differential and algebraic equations.
However, a lumped description is usually not sufficient for systems in which the state variables depend not only on time but also on spatial location, such as those involving heat and mass transfer, fluid dynamics, reaction processes, etc. The mathematical modeling of such so-called distributed-parameter systems (DPSs) results in sets of partial and ordinary differential equations. Most dynamic optimization (DO) studies in the process industries have targeted lumped-parameter systems. An example of a type of lumped problem that has received considerable attention is the dynamic optimization of bioreactors (see the recent review by Banga et al.1). However, for many applications, one must use distributed process models, and this fact has motivated a number of ongoing research efforts.2-6 Furthermore, it has been shown that the computed optimal operating policies for these distributed process systems might have significant advantages over current process conditions, as shown, for example, in food processing,7 thermal cracking,8 crystallization,9 adsorption,10 chemical vapor deposition,11 and polymerization.12

* To whom correspondence should be addressed. E-mail: [email protected]. Fax: +34 986 812116.

In practice, dynamic optimization of DPSs usually involves transforming the original DPS into an equivalent large-scale lumped system, usually through spatial discretization, and applying lumped-system DO methods. The most popular techniques are based on complete parametrization, multiple shooting, or control vector parametrization approaches. As a result, a constrained nonlinear programming (NLP) problem is obtained whose solution can be approached with an NLP solver. In this regard, the use of second-order methods is particularly attractive because of the possibility of achieving at least superlinear convergence properties. Recent works14,15 have illustrated how the calculation of first- and second-order forward parametric sensitivities combined with second-order NLP solvers largely enhances the robustness and efficiency of the dynamic optimization of lumped systems. Alternatively, an adjoint sensitivity analysis can be used to calculate gradient information and second-order directional derivatives,16,17 which seems to be more convenient in the presence of a large number of control variables. However, to the authors' knowledge, this methodology has not yet been applied for dynamic optimization purposes. This work presents the application of a truncated Newton (TN) method (e.g., Nash13) within the control vector parametrization framework as introduced by Balsa-Canto et al.14 in the context of the dynamic optimization of lumped systems. This approach makes use of first- and second-order parametric sensitivities

10.1021/ie0497590 CCC: $27.50 © 2004 American Chemical Society Published on Web 09/21/2004


to obtain exact gradients and projected Hessians (Hp's) to be used by a TN method to calculate search directions. This methodology proved to be faster and more robust than its counterparts, e.g., gradient-based and sequential quadratic programming (SQP) approaches, for the solution of several case studies. In this work, a number of refinements to enhance the performance of this methodology in the context of DPSs are presented. In this regard, the fact that the Jacobian for the first- and projected second-order sensitivities is the same as in the original dynamic system15 is exploited to reduce the computational effort associated with the calculation of gradients and second-order directional derivatives. The use of the exact Jacobian and the exact time-dependent Jacobian structure combined with a sparse implicit ODE solver is illustrated here.

The paper is organized as follows: Section 2 presents the mathematical formulation of the dynamic optimization problem to be considered in this work. The most common approaches for the solution of dynamic optimization problems are reviewed in section 3, with special emphasis on the control vector parametrization approach and on how first- and projected second-order sensitivities can be used together with a truncated Newton method to solve dynamic optimization problems. The approaches employed for the simulation of distributed-parameter systems are discussed in sections 4 and 5, where we present the theoretical aspects involved in the calculation of parametric sensitivities for DPSs and their numerical solution with a sparse IVP solver. The last section illustrates the possibilities of this alternative through the solution of a number of case studies of increasing complexity.

2. Dynamic Optimization Problem Formulation

Consider a nonlinear combined lumped/distributed-parameter system described by the following state space equations

ẏ = Ξ(x,y,u,w,t) (1)

xt = Ψ(x,xξ,xξξ,y,u,w,t) (2)

where ξ ∈ Ω ⊂ R³ represents the spatial variables, x(ξ,t) ∈ X ⊂ Rν is the subset of state variables depending on both time and spatial location, y(t) ∈ Y ⊂ Rµ is the subset of time-dependent variables, xξ = ∂x/∂ξ, xξξ = ∂²x/∂ξ², xt = ∂x/∂t, ẏ = dy/dt, u ∈ U ⊂ Rσ represents the control variables, and w ∈ W ⊂ Rη represents time-independent parameters. In addition, eqs 1 and 2 are subject to the following initial and boundary conditions

y(t0) = Ξ0(x(t0),u(t0),w,t0) (3)

x(t0) = Ψ0(y(t0),u(t0),w,t0) (4)

B(x,xξ,u,w,ξ,t) = 0 (5)

The general dynamic optimization (open-loop optimal control) problem considered here can be formulated as follows: Find the controls u(t) and the time-invariant parameters w subject to the system dynamics (eqs 1 and 2) so as to minimize (or maximize) the following objective functional

J = φ(x(ξ,tf),y(tf),w,tf) + ∫t0^tf L(x(ξ,t),y(t),u(t),w,ξ,t) dt (6)

where the scalar functions φ (Mayer term) and L (Lagrangian term) are continuously differentiable with respect to all of their arguments and the final time tf can be either fixed or free. State and control variables might also be subject to constraints that force the satisfaction of safety or environmental regulations, quality requirements, proper operating conditions, etc. These constraints can be classified into two main groups: point constraints and path constraints. Point constraints must be satisfied at certain time points during the process, including endpoint constraints at the final time tf

rk^eq(x(ξ,tk),y(tk),u(tk),w,tk) = 0 (7)

rk^in(x(ξ,tk),y(tk),u(tk),w,tk) ≤ 0 (8)

Path constraints must be satisfied throughout the process, so that

c^eq(x(ξ,t),y(t),u(t),w,t) = 0 (9)

c^in(x(ξ,t),y(t),u(t),w,t) ≤ 0 (10)

Control variables and parameters might also be subject to constraints, usually in the form of bounds expressing operational limits

uL ≤ u(t) ≤ uU (11)

wL ≤ w ≤ wU (12)

3. Dynamic Optimization Solution Methods

The solution algorithms for dynamic optimization problems can be classified into three main groups: dynamic programming methods, indirect methods, and direct methods. Dynamic programming techniques are based on the Bellman optimality conditions, the most popular being the iterative dynamic programming (IDP) technique proposed by Luus in the early 1990s.18,19 Indirect methods use variational calculus to express the necessary optimality conditions as a two- or multipoint boundary value problem (see, for example, Bryson and Ho20), which is usually quite complex to solve. Direct methods transform the infinite-dimensional dynamic optimization problem into a finite-dimensional nonlinear programming (NLP) problem that can then be solved by using an appropriate NLP solver. The most widely used direct methods include the complete parametrization approach, the control vector parametrization approach, and the multiple shooting method. In the complete parametrization technique (also called the simultaneous approach), both the control and state variables are discretized, usually employing a direct collocation approach, so that the coefficients and interval lengths become the decision variables in a larger NLP (see a recent review by Biegler et al.21). In the control vector parametrization (CVP) approach, only the control variables are discretized and approximated using low-order polynomials. The coefficients of these polynomials become the decision variables in a master NLP22,23 whose solution involves the simulation of the system dynamics. On the other hand,


the multiple shooting approach24 divides the time domain into a number of separate elements, and an initial value problem solver is used to simulate the process within each element. In this formulation, the initial conditions for each element, together with the control parameters, become the decision variables for the master NLP. Although all approaches have their own advantages and disadvantages, direct methods have proven to be reliable for both lumped- and distributed-parameter processes.

Considering distributed-parameter systems, the literature offers a wide range of applications of direct approaches. Goldberg and Tröltzsch,25 Kleis and Sachs,26 Mittelman and Maurer,27 and Biegler et al.,6 among others, illustrate the use of the complete parametrization approach. Banga et al.,28,29 Petzold et al.,30 Blatt and Schittkowski,31 Quirijns et al.,32 and Saa et al.,33 again among others, illustrate the use of the control vector parametrization approach. More recently, Serban et al.5 presented the application of a modified multiple shooting approach to an optimal control problem associated with a modified Burgers' equation.

In practice, the dynamic optimization of DPSs typically involves transforming the original DPS into an equivalent lumped system and applying lumped-system DO methods. Therefore, a spatial discretization approach is usually used to transform the original infinite-dimensional partial differential equations (PDEs) into a large-scale, and possibly stiff, set of ordinary differential equations (ODEs). The accurate solution of the overall ODE system [the ODEs present in the model (eq 1) together with the set of ODEs resulting from transforming the PDEs] then requires the use of an implicit ODE solver. Thus, either control vector parametrization or multiple shooting seems to be more efficient in dealing with distributed-parameter systems. The former, control vector parametrization, is the approach used in this work.

3.1. Truncated Newton Method for DO. The control vector parametrization approach proceeds by dividing the duration of the process into a number of intervals. The simplest choice for the control switching time intervals is to choose them to be equal for all control variables and to have equidistantly distributed time nodes. However, it is also possible to include the control time nodes in the parameter set. The control variables are then approximated by low-order polynomials within each interval, and the states are found by integrating the state space equations forward in time with a suitable initial value problem solver. Free initial states, if present, are included in the parameter set w. As a result, a nonlinear programming (NLP) problem is obtained in which the decision variables v correspond to the coefficients of the polynomials, the free control time nodes, and the time-independent parameters w. In Vassiliadis,21 the use of the first-order parametric sensitivity equations within the CVP approach was proposed to obtain gradients with respect to the optimization parameters. The use of exact first-order information resulted in better convergence properties for an SQP-based NLP solver. A second differentiation with respect to v yields a new set of sensitivity equations: the so-called second-order sensitivities, as presented by Vassiliadis et al.15 In that work, the solution of the original dynamic system together with the set of first- and second-order sensitivities (extended IVP) was used in combination with a

Newton-type NLP method to solve dynamic optimization problems. The authors concluded that, although the performance of the CVP approach was considerably enhanced by the use of exact second-order information, the cost of evaluating the Hessian would rapidly grow with problem size, thus limiting its application to "small" dynamic systems with a "small" number of decision variables. Nevertheless, the use of exact second-order directional derivatives combined with algorithms requiring Hessian-vector products, such as the truncated Newton method (e.g., Nash13), reduces these limitations, as illustrated in Balsa-Canto et al.,14,34 allowing for the efficient solution of large-scale dynamic optimization problems when the number of control variables is comparatively small, or for fine discretization levels of the control variables.

4. Solution of an Initial Boundary Value Problem (IBVP)

Only some elementary PDEs have closed-form solutions, that is, solutions that can be expressed in terms of fundamental functions; for most cases, numerical approaches are necessary. Numerical procedures for solving PDEs can be classified into two main groups: techniques that expand the solution as a linear combination of global basis functions and, alternatively, techniques that aim to find the value of the solution at a finite number of nodes within the domain of definition of the equation (see reviews in Schiesser35 and Oh36). Concerning the first type of methods, the global set might correspond to the eigenfunctions of the spatial operator or to more general sets such as those employed in proper orthogonal decomposition techniques.37 This alternative has recently been considered for the solution of dynamic optimization problems with promising results.3,4,38 However, there is still much work to do in terms of automating the generation of these reduced-order models.
Therefore, methods of the second type are still the ones generally used by the dynamic optimization community29-33 and are also the methods used in this work. The numerical method of lines (NUMOL) or the finite-element method (FEM) proceeds by dividing the domain of interest into many smaller subdomains and applying an approximation method within each subdomain. This results in a grid of π time-dependent field values at specific spatial locations, expressed as functions of the field values in their vicinity. The evolution of the field is then described by a possibly stiff set of ordinary differential equations whose solution requires an adequate initial value problem solver. In this regard, Mageroy39 recommends the use of an implicit method so as to enhance the stability of the solution process. Implicit methods for stiff ODE systems require the solution of systems of nonlinear equations. Normally, these systems are solved by an iterative Newton-type method that requires the solution of systems of linear equations. The structure of the dynamic system Jacobian determines the structure of these linear systems. Because, for DPSs, it often occurs that each component of the system dynamics depends on only a small number of state variables, the Jacobian can be quite sparse. This property can be exploited to enhance efficiency, as described in the following sections.
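As a minimal illustration of this kind of semi-discretization (an assumed one-dimensional diffusion example of our own, not one of the paper's case studies; all names are illustrative), the following Python sketch builds the NUMOL right-hand side for u_t = u_xx and its tridiagonal, hence sparse, Jacobian:

```python
import numpy as np

def numol_rhs(u, dx):
    """Second-order central-difference RHS for u_t = u_xx with
    homogeneous Dirichlet boundaries (one ODE per grid node)."""
    du = np.empty_like(u)
    du[0] = (u[1] - 2.0 * u[0]) / dx**2        # left neighbor is 0
    du[-1] = (u[-2] - 2.0 * u[-1]) / dx**2     # right neighbor is 0
    du[1:-1] = (u[2:] - 2.0 * u[1:-1] + u[:-2]) / dx**2
    return du

def numol_jacobian(npts, dx):
    """Tridiagonal Jacobian d(rhs)/du of the semi-discretized system:
    only three nonzero bands, so nnz = 3*npts - 2 << npts**2."""
    J = np.zeros((npts, npts))
    for i in range(npts):
        J[i, i] = -2.0 / dx**2
        if i > 0:
            J[i, i - 1] = 1.0 / dx**2
        if i < npts - 1:
            J[i, i + 1] = 1.0 / dx**2
    return J
```

For this linear example, the right-hand side is exactly J·u, and only 3π − 2 of the π² Jacobian entries are nonzero, which is precisely the kind of structure a sparse implicit solver can exploit.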


5. Computation of Second-Order Directional Derivatives for DPSs

Once the spatial discretization and control vector parametrization approaches have been applied to the original system (eqs 1 and 2), the following set of ODEs is obtained

ż = f(z,v) (13)

with initial conditions

z(t0) = z0(v) (14)

where z, ż, z0 ∈ Rn; n is the number of ODEs resulting from the spatial discretization plus the number of ODEs originally present in the model (n = π + µ); and v represents the vector of decision variables, including the ρ parameters that characterize each of the σ control functions plus the η time-independent parameters, v ∈ Rρ̃ (ρ̃ = σρ + η).

A first differentiation of eq 13 with respect to v results in the following set of first-order sensitivities

ṡ1 = (∂f/∂z)s1 + ∂f/∂v (15)

where s1 = ∂z/∂v is an n × ρ̃ real matrix with columns satisfying

ṡ1i = (∂f/∂z)s1i + ∂f/∂vi, ∀ i ∈ 1, ..., ρ̃ (16)

and the following set of initial conditions

s1(t0) = (∂z0/∂v)(v) (17)

Differentiating eq 15 again (as presented by Vassiliadis et al.15), we obtain the following set of second-order sensitivities

∂²ż/∂v² = (∂f/∂z ⊗ Iρ̃)(∂²z/∂v²) + [In ⊗ (∂z/∂v)T][(∂²f/∂z²)(∂z/∂v) + ∂²f/∂v∂z] + ∂²f/∂v² ∈ R^(n·ρ̃×ρ̃) (18)

with initial conditions given by

(∂²z/∂v²)(t0) = (∂²z0/∂v²)(v) (19)

where ⊗ is the Kronecker matrix-product operator and Iρ̃ and In are identity matrices. Postmultiplying eq 18 by a vector p ∈ Rρ̃ and collecting the resulting vectors and matrices, the equivalent form in eq 20 is obtained

ṡ2 = (∂f/∂z)s2 + A(z,v) ∈ R^(n×ρ̃) (20)

where the columns are given by the expression

ṡ2i = (∂f/∂z)s2i + Σ(k=1..ρ̃) [∂²f/(∂z∂vk)]p(k)s1i + Σ(k=1..ρ̃) [∂²f/(∂vi∂vk)]p(k), ∀ i ∈ 1, ..., ρ̃ (21)

with their corresponding initial conditions

s2i(t0) = pT[∂²z0/∂v²]i(v) (22)

The system's Jacobian in eq 20 is an n × n matrix

J = ∂f/∂z, J ∈ R^(n×n) (23)

that coincides with the Jacobians associated with both the original set (eq 13) and the first-order sensitivities (eq 15), as pointed out by Vassiliadis et al.15 We will make use of this property to enhance the efficiency of the solution of the extended initial value problem (eqs 13, 15, and 20). To that purpose, let us write the extended IVP in concise form as

Ż = F(Z,v) (24)

where Z ∈ R^(n(1+2ρ̃)×1) includes all components in the vector z ∈ Rn and in the matrices s1 and s2 ∈ R^(n×ρ̃). Denoting the Jacobian matrix of the set of ODEs (eq 13) as J, we have, for the states,

∂ż/∂z = Jn×n (25)

The Jacobian matrix components for the first-order sensitivities result in

∂ṡ1i/∂s1i = Jn×n (26)

Because the first-order sensitivities also depend on z, a new set of terms is obtained for the extended IVP Jacobian matrix

∂ṡ1i/∂z = (∂J/∂z)s1i + ∂J/∂vi = (J1i)n×n (27)

Equivalently, for the set of second-order directional derivatives, we have

∂ṡ2i/∂s2i = Jn×n (28)

plus, taking into account the dependence on s1,

∂ṡ2i/∂s1i = Σ(k=1..ρ̃) (∂J/∂vk)p(k) = SJn×n (29)

and the dependence on z

∂ṡ2i/∂z = (∂J/∂z)s2i + Σ(k=1..ρ̃) [∂²J/(∂z∂vk)]p(k)s1i + Σ(k=1..ρ̃) [∂²J/(∂vi∂vk)]p(k) = (J2i)n×n (30)

To avoid the adverse effect of the stiffness of the control functions on the accuracy of the numerical integration method, the integration procedure is organized so that the control switching nodes (tj) coincide with end points of integration. Moreover, some parameters in v affect only parts of the trajectory. For instance, for a control parameter that starts acting at


control time node number j, the sensitivity equations for that parameter need only be integrated from tj. As a consequence, because the structure of the Jacobian matrix for the extended IVP changes each time a sensitivity equation becomes active, the Jacobian matrix at the final time becomes

     ⎡ J                                 ⎤
     ⎢ J11   J                           ⎥
     ⎢ J12        J                      ⎥
     ⎢  ⋮             ⋱                  ⎥
JT = ⎢ J1ρ̃               J              ⎥   (31)
     ⎢ J21   SJ              J           ⎥
     ⎢ J22        SJ             J       ⎥
     ⎢  ⋮             ⋱              ⋱   ⎥
     ⎣ J2ρ̃               SJ           J ⎦

where all blank blocks are zero.
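The extended-IVP idea of eqs 13, 15, and 17 can be illustrated on a scalar toy model (our own example, not one from the paper): for dz/dt = −vz with z(0) = 1, the sensitivity s1 = ∂z/∂v obeys ṡ1 = Js1 + ∂f/∂v with J = −v and ∂f/∂v = −z, and state and sensitivity can be integrated together:

```python
import numpy as np

def extended_rhs(y, v):
    """Extended right-hand side: original state plus its sensitivity."""
    z, s1 = y
    dz = -v * z            # original dynamics (the role of eq 13)
    ds1 = -v * s1 - z      # J*s1 + df/dv with J = -v, df/dv = -z (eq 15)
    return np.array([dz, ds1])

def integrate_rk4(v, tf, nsteps=2000):
    """Fixed-step RK4 integration of the extended IVP."""
    y = np.array([1.0, 0.0])   # z(0) = 1, s1(0) = dz0/dv = 0 (eq 17)
    h = tf / nsteps
    for _ in range(nsteps):
        k1 = extended_rhs(y, v)
        k2 = extended_rhs(y + 0.5 * h * k1, v)
        k3 = extended_rhs(y + 0.5 * h * k2, v)
        k4 = extended_rhs(y + h * k3, v)
        y = y + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return y
```

Since z = e^(−vt) here, the computed s1 can be checked against the analytic value −t e^(−vt); note that both equations share the same Jacobian J, which is the structural property exploited in this section.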

5.1. Solution of the Extended IVP with a Sparse Implicit ODE Solver. Note that, even if the Jacobian of the original system J is dense, the extended Jacobian is sparse. However, for DPSs whose simulation is approached using the NUMOL40 or FEM methods, the resulting J is already sparse. Recall that the result of applying these methods is a grid of π time-dependent field values at specific spatial locations, expressed as functions of the field values in their vicinity. This translates into an ODE set with an almost banded Jacobian. Let NB denote the number of bands with nonzero elements in the Jacobian; then, the number of nonzero elements in the overall Jacobian can be approximated by nnz ≤ (2NB − 1)(π − 1). It becomes clear that nnz ≪ π² for π sufficiently large; thus, the Jacobian can be considered sparse. Note that, although a banded IVP solver could be used for the simulation of this ODE set, the banded structure disappears when the sensitivity set is added. Therefore, a sparse IVP solver, such as LSODES,41 can be employed for the solution of the extended IVP. LSODES offers a number of alternative options for solving ODE systems: an Adams method can be used for nonstiff problems, whereas a BDF-based method combined with sparse matrix calculus is more appropriate for stiff ODE systems. For the latter class, which is the one considered in this work, LSODES allows the Jacobian matrix to be either computed internally by difference quotients or supplied by the user. In the latter case, the exact Jacobian, or the Jacobian matrix together with its sparse structure (i.e., the positions of the nonzero elements), must be provided. The sparse structure is given to the solver through two structure descriptors: jan, an array of size nnz that contains the row indices of the nonzero locations, read columnwise, and ian, an array of n + 1 elements that contains the starting locations in jan of columns 1, ..., n, in that order.
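The construction of these descriptors from a Boolean sparsity mask can be sketched as follows (an illustrative Python fragment; the function name is ours, and the indices are 1-based as the Fortran solver expects):

```python
def sparse_structure(mask):
    """Build LSODES-style descriptors from a 0/1 sparsity mask
    (given as a list of rows): jan holds the 1-based row indices of
    the nonzeros, scanned column by column; ian[j] gives the position
    in jan where column j+1 starts."""
    ncols = len(mask[0])
    jan, ian = [], [1]
    for col in range(ncols):
        for row in range(len(mask)):
            if mask[row][col]:
                jan.append(row + 1)     # 1-based, Fortran convention
        ian.append(len(jan) + 1)
    return jan, ian
```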
As an example, let us consider the following simple matrix

⎡ * 0 0 * 0 ⎤
⎢ 0 * 0 0 * ⎥
⎢ * * * 0 0 ⎥
⎢ 0 0 * * * ⎥
⎣ 0 0 0 0 * ⎦

where the asterisks indicate the locations of nonzero elements. The structure descriptors for this case are then

jan = [1, 3, 2, 3, 3, 4, 1, 4, 2, 4, 5]
ian = [1, 3, 5, 7, 9, 12]

It must be noted that the efficiency of the IVP solver is, to a large extent, conditioned by the Jacobian information supplied: if the exact Jacobian and its structure are provided, the solver does not need to perform extra function and/or Jacobian evaluations to calculate either the Jacobian or its structure, thus increasing efficiency. Considering this, in this work, a number of routines were implemented to compute and update the exact extended IVP Jacobian and its sparse structure at each time step, taking into account the fact that the Jacobian and its structure vary every time a new sensitivity becomes active. To illustrate this point, a simple linear case with one control variable is considered

ż = Jz + Bu (32)

This type of system appears, for example, when the NUMOL or FEM is applied to linear parabolic PDEs. Such cases are particularly simple, as it is not necessary to calculate second-order sensitivities because the objective function Hessian depends only on the states and the first-order sensitivities. Consider that the unique control variable is parametrized using ρ̃ elements and that we intend to solve the extended IVP using a sparse ODE solver. The Jacobian of the extended IVP will then evolve in time as follows

Interval t0-t1:    JT = diag(J, J, 0, ..., 0)

Interval t1-t2:    JT = diag(J, J, J, 0, ..., 0)

⋮

Interval tρ̃-1-tf:  JT = diag(J, J, ..., J)

where the first diagonal block corresponds to the original states, each subsequent block J corresponds to one active set of first-order sensitivities, and the zero blocks correspond to sensitivities that are not yet active.
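A sketch of how this growing block-diagonal pattern can be generated (illustrative Python with hypothetical names, not the authors' FORTRAN routines; one diagonal block for the states plus one per sensitivity set that is active in the current interval):

```python
import numpy as np

def extended_pattern(J_mask, n_params, n_active):
    """Boolean sparsity pattern of the extended-IVP Jacobian for the
    linear case (eq 32): the original J pattern is repeated on the
    diagonal for the states and for each already-active first-order
    sensitivity set; not-yet-active sensitivity blocks stay zero."""
    n = J_mask.shape[0]
    total = n * (1 + n_params)
    P = np.zeros((total, total), dtype=bool)
    for b in range(1 + n_active):
        P[b * n:(b + 1) * n, b * n:(b + 1) * n] = J_mask
    return P
```

Rebuilding this pattern at each switching node is what allows the exact jan/ian descriptors to be kept up to date as new sensitivities become active.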

It is important to point out that one needs to add as many elements to J as there are control variables in the problem, taking care of the switching times, which are usually selected to be the same for all control variables. For nonlinear cases, it will also be necessary to add J1i, J2i, and SJ at the right positions.

Table 1. Solution of a Linear Problem with LSODES^a

ρ̃     neq    nnz     MF   np     nf       nj        TCPU
10     231    704     1    592    3194     0         0.44
                      2    592    764      9240      0.42
                      3    592    764      6930      0.41
20     441    1344    1    941    10239    0         2.6
                      2    941    1259     26460     2.3
                      3    941    1259     17640     2.2
25     546    1664    1    1131   15330    0         4.0
                      2    1131   1484     40404     3.5
                      3    1131   1459     26754     3.3
50     1071   3264    1    1697   56188    0         11.4
                      2    1697   2196     158508    9.3
                      3    1697   2196     104958    8.0
100    2121   6464    1    2395   216262   0         65.5
                      2    2395   3282     625695    49.3
                      3    2395   3282     413595    37
200    4221   12864   1    3178   846080   0         442
                      2    3178   4366     2460843   365
                      3    3178   4366     1620864   217

^a The following notation has been used: ρ̃ indicates the control parametrization level; neq is the number of equations in the extended IVP (n = 21); nnz is the number of nonzero elements in the extended Jacobian at the final time; MF indicates the integration method used, with MF = 1 when the Jacobian is calculated by finite quotients, MF = 2 when the exact Jacobian is provided and its sparse structure is numerically computed, and MF = 3 when the exact Jacobian and its sparse structure are both provided; np is the number of time steps; nf is the total number of system dynamics evaluations; nj is the total number of Jacobian evaluations; and TCPU is the computing time in seconds on a PC Pentium III/450 MHz.

Figure 1. Comparative study: Solution of a linear problem with LSODES.

Table 1 reports the results obtained for the solution of eq 32 with LSODES. From these results, it can be concluded that, as the control discretization level (and therefore the size of the extended IVP) increases, the numerical calculation of the Jacobian (MF = 1) or of its sparse structure (MF = 2) requires a large number of function or Jacobian evaluations and, thus, a larger computational effort. Figure 1 shows how the computational cost rapidly increases with the number of ODEs for cases MF = 1 and 2, whereas this cost is substantially reduced when the exact Jacobian and its structure are provided (case MF = 3).

6. Illustrative Examples

To fully automate the application of this procedure, a Mathematica 3.0 notebook was developed to symbolically derive the extended initial value problem (IVP). The procedure goes as follows:

1. Derive the set of ODEs from the original set (eq 2) through the application of the discretization procedure, particularly the numerical method of lines (NUMOL).
2. Derive the set of first-order sensitivities and the exact product equations, as presented in eqs 15 and 20.
3. Generate the Jacobian matrix and its sparse structure in order to provide exact information to LSODES.
4. Transform the resulting equations into a format readable by LSODES.

A number of FORTRAN routines have been implemented to generate the time-dependent full Jacobian as well as its sparse structure to provide the exact information to LSODES. In this section, we present results obtained for three case studies to illustrate the efficiency gained by using exact second-order information in DPS dynamic optimization problems, especially as compared with standard SQP approaches.

Figure 2. Heating of a slab: Jacobian structure.

Table 2. Results for the Temperature Control of a Slab

ρ     OBJ          Tmax    Tmin    NI/NF/CG   TCPU
10    6.5 × 10⁻⁵   0.203   0.196   3/4/7      1.3
80    7.9 × 10⁻⁵   0.202   0.195   2/3/8      29
160   6.6 × 10⁻⁵   0.202   0.195   2/3/8      87
320   4.4 × 10⁻⁵   0.203   0.196   4/5/13     767

6.1. Temperature Control of a Slab. We first consider the simple problem of adjusting the temperature distribution of a slab of length L = 1 to a desired temperature Tset = 0.2, using the external temperature (Tout) as the control variable.42 The system is described in terms of the heat conduction equation subject to boundary conditions expressing the heat flux through the slab. Formally, the problem can be stated as follows:

Find Tout(t) to minimize

J = ∫0^L [Tset − T(x,tf)]² dx (33)

subject to

Tt = Txx (34)

with the initial and boundary conditions

T(x,0) = 0
Tx(0,t) = K(T − Tout),  Tx(L,t) = 0 (35)
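A runnable sketch of the NUMOL treatment of eqs 34 and 35 (our own illustration, not the authors' Mathematica/FORTRAN implementation; the value of K, the grid, and the explicit Euler stepping are assumptions for demonstration only):

```python
import numpy as np

def heat_slab(Tout, K=1.0, npts=11, tf=1.0, dt=2e-4):
    """Explicit-Euler NUMOL simulation of Tt = Txx on [0, 1] with
    Tx(0,t) = K*(T - Tout), Tx(L,t) = 0, and T(x,0) = 0 (eqs 34-35).
    Boundary conditions are imposed through ghost nodes; K and the
    step sizes are illustrative values."""
    dx = 1.0 / (npts - 1)
    T = np.zeros(npts)                    # T(x, 0) = 0
    for _ in range(int(round(tf / dt))):
        dT = np.empty(npts)
        # left boundary: ghost node eliminated using Tx(0,t) = K*(T - Tout)
        dT[0] = 2.0 * (T[1] - T[0] - dx * K * (T[0] - Tout)) / dx**2
        # interior nodes: Tt = Txx, second-order central differences
        dT[1:-1] = (T[2:] - 2.0 * T[1:-1] + T[:-2]) / dx**2
        # right boundary: insulated end, Tx(L,t) = 0
        dT[-1] = 2.0 * (T[-2] - T[-1]) / dx**2
        T = T + dt * dT
    return T
```

With Tout held constant, the computed profile relaxes toward Tout, which gives a quick sanity check of the discretized boundary conditions before any optimization is attempted.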

To simulate the process, an NUMOL second-order formula with a discretization level of 11 was used, resulting in a system of n = 11 ODEs with the Jacobian sparsity pattern illustrated in Figure 2. The results are reported in Table 2, where ρ is the control discretization level; OBJ is the objective function value; Tmax and Tmin are the maximum and minimum temperature values, respectively, at tf; NI, NF, and CG are the numbers of iterations, function evaluations, and Hp evaluations, respectively; and TCPU is the computation time, in seconds, on a PC Pentium III/450 MHz.

Figure 3. Optimal control profile for the heating of a slab (ρ = 320).

The results achieved with this new methodology are in good agreement with those reported by Ramirez.42 Note that, for ρ = 10, a temperature distribution within 2% of the desired value is obtained in less than 2 s. The optimal control profile obtained for discretization level ρ = 320 is shown in Figure 3. For the sake of comparison, this case study was also solved using a standard gradient-based approach with approximated second-order information. The best result was obtained with ρ = 10, corresponding to a temperature distribution within the interval [0.197, 0.202]. This result was achieved in 15 iterations and 11.6 s of computational effort. Higher discretization levels led the gradient-based approach to suboptimal solutions with very noisy control profiles.

6.2. Temperature Control of a Cylinder. In this case, the objective is to maximize the uniformity of the final temperature around a desired value (Tset) while restricting the movements of the external temperature (the control variable). This problem is formally stated as follows: Find Tout(t) to minimize

J = Z(t_f)^T M Z(t_f) + ∫_0^{t_f} P (T_out - T_set)^2 dt   (36)

where Z is the vector of differences between the temperatures inside the cylinder and T_set, M is a weight matrix that restricts deviations of the states, and P is a weight factor that imposes a penalty on control variable movement. The dynamic constraint corresponds to the 2D Fourier equation with radial and axial symmetry

T_t = (k/(ρc)) (T_rr + (1/r) T_r + T_zz)   (37)

where r and z are the radial and axial coordinates, respectively; k is the conductivity, k = 6.48 × 10^-3 W m^-1 °C^-1; ρ is the medium density, ρ = 1.067 kg m^-3; and c is the specific heat, c = 3.542 J kg^-1 °C^-1. The initial and boundary conditions are given by

T(r,z,0) = 25.0;  T_r(0,z,t) = 0.0;  T_z(r,0,t) = 0.0   (38)

T_r(R,z,t) = h[T_out - T(R,z,t)];  T_z(r,L,t) = h[T_out - T(r,L,t)]

where R = 1.75 cm is the radius, L = 1.75 cm is the half-height of the cylinder, and h is the heat transfer coefficient, h = 17.85 × 10^-4 W m^-2 °C^-1. T_out is the control variable, which is subject to the following bounds:

25 °C ≤ T_out ≤ 200 °C   (39)

Figure 4. Heating of a cylinder: Jacobian structure.

Figure 5. Optimal control profile for the heating of a cylinder (F = 40).

Figure 6. Final temperature profile in the center plane of the cylinder.

Table 3. Results Obtained for the Temperature Control of a Cylinder

F    OBJ           Tmax      Tmin     NI/NF/CG   TCPU
5    4.00 × 10^4   100.532   99.037   3/4/8      22
10   3.84 × 10^4   100.841   98.760   3/4/9      47
20   3.78 × 10^4   100.654   98.904   3/4/9      164
40   3.73 × 10^4   100.640   98.948   3/4/9      835
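Step 3 of the solution procedure supplies LSODES with exactly this kind of sparsity information. A sketch of how the nonzero pattern of the semidiscrete Jacobian might be generated for an n × n grid with a five-point finite-difference stencil (pattern only; the paper's FORTRAN routines are not reproduced here):

```python
# Sparsity pattern of the Jacobian of a semidiscrete 2D diffusion
# equation on an n x n grid (five-point stencil): each state couples
# to itself and to its existing grid neighbors.

def jacobian_pattern(n):
    """Return the set of (row, col) nonzero positions for an n*n grid."""
    def idx(i, j):
        return i * n + j
    nz = set()
    for i in range(n):
        for j in range(n):
            r = idx(i, j)
            nz.add((r, r))                       # diagonal entry
            for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ii, jj = i + di, j + dj
                if 0 <= ii < n and 0 <= jj < n:  # neighbor inside the grid
                    nz.add((r, idx(ii, jj)))
    return nz

pattern = jacobian_pattern(11)
# 5*121 - 4*11 = 561 nonzeros out of 121*121 = 14641 entries
```

For the 11 × 11 grid below, only 561 of the 14641 Jacobian entries are structurally nonzero, which is what makes a sparse solver such as LSODES attractive.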

To simulate the process, a spatial 11 × 11 grid was used, resulting in a system of n = 121 ODEs plus an extra ODE for calculation of the objective function. The simplified sparse structure of the system Jacobian is presented in Figure 4. The results obtained for T_set = 100 °C are reported in Table 3, and the optimal control profile and final temperature distribution are shown in Figures 5 and 6, respectively. Again, a standard gradient-based method was also used to solve this problem. The results obtained with that method compare satisfactorily with the ones reported here. The best objective value of 3.74 × 10^4 was


Figure 7. Diffusion through a membrane: Illustration.

obtained with F = 20, in 18 iterations. However, the computational cost was almost 15 times larger than the time required by the method proposed here.

6.3. Problem in Biochemistry: Diffusion through a Membrane. This example is taken from Neittaanmäki and Tiba.43 A membrane M separates two compartments, I and II, as illustrated in Figure 7. M consists of an inactive protein coreticulated with an enzyme E. At time t = 0, the membrane contains no substrate or product. Then, some substrate S diffuses from I and II into the membrane, where the enzyme E catalyzes a reaction producing a certain product P. By taking suitable units (thickness of membrane = 1, diffusion coefficient = 1, etc.), the model governing the system can be written as

c^s_t = c^s_ξξ - F(c^s),  F(c^s) = σ c^s/(1 + c^s)   (40)

Figure 8. Diffusion through a membrane: Jacobian structure.

Figure 9. Optimal control profile for the diffusion through a membrane.

with the initial and boundary conditions

c^s(x,0) = 0   (41)

c^s(0,t) = α, c^s(1,t) = β   (42)

where c^s is the concentration of substrate and α, β, and σ are constants. Assume that the concentration c^a of an activator in compartments I and II is at our disposal. This activator diffuses in the membrane M, so that the equations describing the process become

c^s_t = c^s_ξξ - F(c^s, c^a)   (43)

c^a_t = c^a_ξξ   (44)

with

F(c^s, c^a) = σ [c^a/(1 + c^a)] [c^s/(1 + c^s)]

and the following initial and boundary conditions

c^s(x,0) = 0, c^a(x,0) = 0   (45)

c^s(0,t) = α, c^s(1,t) = β   (46)

c^a(0,t) = a_1(t), c^a(1,t) = a_2(t)   (47)
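A minimal explicit simulation of the coupled system (43)-(47) might look as follows; σ, α, β, and the constant boundary controls a_1, a_2 are assumed values, and explicit Euler again stands in for the LSODES integration used in the paper:

```python
# Explicit NMOL sketch of the coupled membrane model (eqs 43-47):
#   cs_t = cs_xx - F(cs, ca),  ca_t = ca_xx,
#   F(cs, ca) = sigma * (ca/(1+ca)) * (cs/(1+cs)).
# sigma, alpha, beta, and the constant controls a1, a2 are illustrative.

N = 7                      # spatial nodes, as in the paper
dx = 1.0 / (N - 1)
sigma, alpha, beta = 1.0, 1.0, 1.0
a1, a2 = 1.0, 1.0          # assumed constant boundary controls

def F(cs, ca):
    return sigma * (ca / (1.0 + ca)) * (cs / (1.0 + cs))

def step(cs, ca, dt):
    """One explicit Euler step; Dirichlet boundary values held fixed."""
    cs_new, ca_new = cs[:], ca[:]
    for i in range(1, N - 1):
        lap_s = (cs[i-1] - 2*cs[i] + cs[i+1]) / dx**2
        lap_a = (ca[i-1] - 2*ca[i] + ca[i+1]) / dx**2
        cs_new[i] = cs[i] + dt * (lap_s - F(cs[i], ca[i]))
        ca_new[i] = ca[i] + dt * lap_a
    return cs_new, ca_new

def simulate(tf=0.5, dt=5e-4):
    cs = [alpha] + [0.0]*(N-2) + [beta]   # cs(0,t)=alpha, cs(1,t)=beta
    ca = [a1] + [0.0]*(N-2) + [a2]        # ca(0,t)=a1,   ca(1,t)=a2
    for _ in range(int(tf / dt)):
        cs, ca = step(cs, ca, dt)
    return cs, ca

cs, ca = simulate()
```

Because the reaction term only consumes substrate, the simulated c^s stays below c^a pointwise, a quick sanity check on the discretization.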

The objective is to regulate the concentration of substrate (c^s) at a certain point (ξ = 0.5), that is, to make it close to a desired concentration z_d(t) = σ arctan(30.0t), with the control variables subject to the bounds given in eq 49. The performance index is

J = ∫_0^{t_f} [c^s(0.5,t) - z_d(t)]^2 dt   (48)

Figure 10. The desired concentration of substrate (z_d) vs the concentration at the optimum (c^s).
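Given a sampled trajectory of c^s(0.5, t), the tracking functional (48) can be approximated by quadrature. A trapezoidal-rule sketch, where σ, the sampling grid, and t_f = 1 are illustrative assumptions:

```python
import math

# Trapezoidal approximation of J = int_0^{tf} [cs(0.5,t) - zd(t)]^2 dt
# for a trajectory sampled on a time grid ts. sigma is an assumed value.
sigma = 1.0

def zd(t):
    """Desired substrate concentration at xi = 0.5."""
    return sigma * math.atan(30.0 * t)

def tracking_cost(ts, cs_mid):
    """Quadrature of eq 48 from samples cs_mid[k] = cs(0.5, ts[k])."""
    J = 0.0
    for k in range(len(ts) - 1):
        e0 = (cs_mid[k] - zd(ts[k])) ** 2
        e1 = (cs_mid[k + 1] - zd(ts[k + 1])) ** 2
        J += 0.5 * (e0 + e1) * (ts[k + 1] - ts[k])
    return J

ts = [k * 0.01 for k in range(101)]   # tf = 1, 100 uniform intervals
# a trajectory that exactly matches the target gives J = 0
```

A trajectory identical to z_d yields J = 0, and any deviation makes J strictly positive, consistent with eq 48.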

Table 4. Results Obtained for the Diffusion through a Membrane

F    OBJ       NI/NF/CG   TCPU
10   0.94079   10/15/22   38.8
15   0.93660   8/11/17    48.5
25   0.93307   12/15/24   130.0
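In the tables, F is the control discretization level, i.e., the number of elements in the piecewise parametrization of the control. A minimal piecewise-constant version can be sketched as follows (the paper also uses piecewise-linear elements of variable length; the values here are illustrative):

```python
# Piecewise-constant control vector parametrization: the control u(t)
# on [0, tf] is described by F parameters, one per element of equal
# length. The parameter values below are illustrative only.

def make_control(params, tf):
    """Return u(t) for a piecewise-constant parametrization."""
    F = len(params)
    dt = tf / F
    def u(t):
        # pick the element containing t; clamp t = tf into the last one
        k = min(int(t / dt), F - 1)
        return params[k]
    return u

u = make_control([0.0, 50.0, 100.0, 75.0], tf=1.0)
# u(0.1) -> 0.0 (first element), u(0.99) -> 75.0 (last element)
```

Under control vector parametrization, the dynamic optimization reduces to a finite-dimensional search over these F parameters, with gradients and projected Hessians obtained from the extended IVP.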

The control variables are subject to the bounds

0 ≤ a_i ≤ 100, i = 1, 2   (49)

To solve the PDEs present in the system, 7 spatial nodes were used, resulting in 14 equations to describe the process and one extra equation to deal with the objective function. The resulting (nonbanded) Jacobian structure is illustrated in Figure 8. The results obtained with the TN approach are presented in Table 4. The optimal control profile and final concentration are shown in Figures 9 and 10, respectively.

For the sake of comparison, this case study was also solved using an SQP-based algorithm with two different approximations for the control function: piecewise constant and piecewise linear (10 elements of variable length). The best result was obtained with the linear approximation, giving an objective function value of 0.9432 in 19 iterations and 81 s on a Pentium III/450 MHz computer.

7. Conclusions

This work presents the application of a recently developed technique for the dynamic optimization of


DPSs. This strategy allows the optimization of large-scale dynamic systems, with hundreds of ODEs, resulting from the application of the numerical method of lines or the finite-element approach to the original combined lumped/distributed-parameter system. To reduce the computational effort, the use of a sparse IVP solver together with an exact time-dependent Jacobian structure was introduced here. The results obtained with the new technique are either comparable to or better than those obtained with standard methods based on SQP solvers. Increases in speed of around 15 times and significant reductions in the numbers of iterations and evaluations are achieved with the method proposed here, demonstrating that efficiency and robustness are considerably enhanced using exact second-order information.

Acknowledgment

The authors acknowledge the financial support received from the Spanish Government (MCyT Projects AGL2001-2610-C02-02 and PPQ2001-3643) and Xunta de Galicia (PGIDIT02PXIC40211PN and PGIDIT02PXIC40209PN).

Literature Cited

(1) Banga, J. R.; Balsa-Canto, E.; Moles, C. G.; Alonso, A. A. Dynamic optimization of bioreactors: A review. Proc. Indian Natl. Sci. Acad. 2003, 69A, 257-265.
(2) Blatt, M.; Schittkowski, K. Optimal control of one-dimensional partial differential algebraic equations with applications. Ann. Oper. Res. 2000, 98, 45-60.
(3) Armaou, A.; Christofides, P. D. Dynamic optimization of dissipative PDE systems using nonlinear order reduction. Chem. Eng. Sci. 2002, 57, 5083-5114.
(4) Balsa-Canto, E.; Banga, J. R.; Alonso, A. A. A novel, efficient and reliable method for thermal process design and optimization. Part II: Applications. J. Food Eng. 2002, 52, 225-247.
(5) Serban, R.; Li, S.; Petzold, L. R. Adaptive algorithms for optimal control of time-dependent partial differential-algebraic equation systems. Int. J. Numer. Eng. 2003, 57, 1457-1469.
(6) Biegler, L. T.; Ghattas, O.; Heinkenschloss, M.; van Bloemen Waanders, B.
Large-Scale PDE-Constrained Optimization; Lecture Notes in Computational Science and Engineering 30; Springer: New York, 2003.
(7) Banga, J. R.; Balsa-Canto, E.; Moles, C. G.; Alonso, A. A. Improving food processing using modern optimization methods. Trends Food Sci. Technol. 2003, 14 (4), 131-144.
(8) Edwin, E. H.; Balchen, J. G. Dynamic optimization and production planning of thermal cracking operation. Chem. Eng. Sci. 2001, 56 (3), 989-997.
(9) Nagy, Z. K.; Braatz, R. D. Worst-case and distributional robustness analysis of finite-time control trajectories for nonlinear distributed parameter systems. IEEE Trans. Control Syst. Technol. 2003, 11 (5), 694-704.
(10) Ko, D.; Siriwardane, R.; Biegler, L. T. Optimization of a pressure-swing adsorption process using zeolite 13X for CO2 sequestration. Ind. Eng. Chem. Res. 2003, 42 (2), 339-348.
(11) Itle, G. C.; Salinger, A. G.; Pawlowski, R. P.; Shadid, J. N.; Biegler, L. T. A tailored optimization strategy for PDE-based design: Application to a CVD reactor. Comput. Chem. Eng. 2004, 28 (3), 291-302.
(12) Gentric, C.; Pla, F.; Latifi, M. A.; Corriou, J. P. Optimization and nonlinear control of a batch emulsion polymerization reactor. Chem. Eng. J. 1999, 75 (1), 31-46.
(13) Nash, S. G. Newton-type minimization via the Lanczos method. SIAM J. Numer. Anal. 1984, 21, 770-778.
(14) Balsa-Canto, E.; Banga, J. R.; Alonso, A. A.; Vassiliadis, V. S. Efficient optimal control of bioprocesses using second-order information. Ind. Eng. Chem. Res. 2000, 39, 4287-4295.
(15) Vassiliadis, V. S.; Balsa-Canto, E.; Banga, J. R. Second order sensitivities of general dynamic systems with application

to optimal control problems. Chem. Eng. Sci. 1999, 54 (17), 3851-3860.
(16) Cao, Y.; Li, S.; Petzold, L.; Serban, R. Adjoint sensitivity analysis for differential-algebraic equations: The adjoint DAE system and its numerical solution. SIAM J. Sci. Comput. 2003, 24 (3), 1076-1089.
(17) Özyurt, D. B.; Barton, P. I. Cheap second-order directional derivatives of stiff ODE embedded functionals. Presented at the AIChE Annual Meeting, San Francisco, CA, Nov 16-21, 2003.
(18) Luus, R. Optimal control by dynamic programming using systematic reduction in grid size. Int. J. Control 1990, 51 (5), 995-1013.
(19) Luus, R. On the application of iterative dynamic programming to singular optimal control problems. IEEE Trans. Autom. Control 1992, 37 (11), 1802.
(20) Bryson, A. E.; Ho, Y. C. Applied Optimal Control; Hemisphere Publishing Co.: Washington, DC, 1988.
(21) Biegler, L. T.; Cervantes, A. M.; Wächter, A. Advances in simultaneous strategies for dynamic process optimization. Chem. Eng. Sci. 2002, 57 (4), 575-593.
(22) Goh, C. J.; Teo, K. L. Control parametrization: A unified approach to optimal control problems with general constraints. Automatica 1988, 24 (1), 3-18.
(23) Vassiliadis, V. S. Computational Solution of Dynamic Optimization Problems with General Differential-Algebraic Constraints. Ph.D. Thesis, Imperial College, University of London, London, U.K., Jul 1993.
(24) Bock, H. G.; Plitt, K. J. A multiple shooting algorithm for direct solution of optimal control problems. In Proceedings of the 9th IFAC World Congress; Pergamon Press: New York, 1984; pp 242-247.
(25) Goldberg, H.; Tröltzsch, F. Optimal Control: Theory, Algorithms and Applications; Kluwer Academic Publishers: Dordrecht, The Netherlands, 1997.
(26) Kleis, D.; Sachs, E. W. Optimal control of the sterilization of prepackaged food. SIAM J. Optim. 2000, 10 (4), 1180-1195.
(27) Mittelmann, H. D.; Maurer, H. Solving elliptic control problems with interior point and SQP methods: Control and state constraints.
J. Comp. Appl. Math. 2000, 120, 175-195.
(28) Banga, J. R.; Martín, R. P.; Gallardo, J. M.; Casares, J. J. Optimization of thermal processing of conduction-heated canned foods: Study of several objective functions. J. Food Eng. 1991, 14 (1), 25-51.
(29) Banga, J. R.; Alonso, A. A.; Martín, R. I. P.; Singh, R. P. Optimal control of heat and mass transfer in food and bioproducts processing. Comput. Chem. Eng. 1994, 18, S699-S705.
(30) Petzold, L.; Rosen, J. B.; Gill, P. E.; Jay, L. O.; Park, K. Numerical optimal control of parabolic PDEs using DASOPT. In Large Scale Optimization with Applications, Part II: Optimal Design and Control; Biegler, L. T., Coleman, T. F., Conn, A. R., Santosa, F. N., Eds.; IMA Volumes in Mathematics and Its Applications; Springer Verlag: Berlin, 1997; Vol. 93, pp 271-300.
(31) Blatt, M.; Schittkowski, K. PDECON: A FORTRAN code for solving control problems based on ordinary, algebraic and partial differential equations; Technical Report; Department of Mathematics, University of Bayreuth: Bayreuth, Germany, 1997.
(32) Quirijns, E. J.; Van Willigenburg, L. G.; Van Boxtel, A. J. B. New perspectives for optimal control of drying processes. Presented at ADCHEM (International Symposium on Advanced Control of Chemical Processes), Pisa, Italy, June 14-16, 2000.
(33) Saa, J.; Alonso, A. A.; Banga, J. R. Optimal control of microwave heating using mathematical models of medium complexity. Presented at ACoFoP (Automatic Control of Food and Biological Processes), Göteborg, Sweden, Sep 21-23, 1998.
(34) Balsa-Canto, E.; Banga, J. R.; Alonso, A. A.; Vassiliadis, V. S. Dynamic optimization of chemical and biochemical processes using restricted second-order information. Comput. Chem. Eng. 2001, 25, 539-546.
(35) Schiesser, W. E. Computational Mathematics in Engineering and Applied Science: ODEs, DAEs and PDEs; CRC Press: Boca Raton, FL, 1994.
(36) Oh, M. Modelling and Simulation of Combined Lumped and Distributed Processes. Ph.D.
Thesis, Imperial College, University of London, London, U.K., 1995.
(37) Holmes, P.; Lumley, J. L.; Berkooz, G. Turbulence, Coherent Structures, Dynamical Systems and Symmetry; Cambridge University Press: New York, 1996.
(38) Balsa-Canto, E.; Alonso, A. A.; Banga, J. R. Optimal control of distributed processes using the Karhunen-Loève decomposition.

Presented at the AIChE Annual Meeting, Los Angeles, CA, Nov 12-17, 2000.
(39) Mageroy, E. Numerical integration of systems arising from the method of lines; Technical Report 377; Mathematics Department Research Report Series; Department of Mathematics, University of Auckland: Auckland, New Zealand, 1997.
(40) Schiesser, W. E. The Numerical Method of Lines; Academic Press: New York, 1991.
(41) Hindmarsh, A. C. ODEPACK, A Systematized Collection of ODE Solvers; North-Holland: Amsterdam, 1983; pp 55-64.

(42) Ramirez, W. F. Process Control and Identification; Academic Press: San Diego, CA, 1994.
(43) Neittaanmäki, P.; Tiba, D. Optimal Control of Nonlinear Parabolic Systems; Marcel Dekker: New York, 1994.

Received for review March 26, 2004
Revised manuscript received July 19, 2004
Accepted August 13, 2004

IE0497590