Ind. Eng. Chem. Res. 2006, 45, 8373-8392
8373
Global Methods for Dynamic Optimization and Mixed-Integer Dynamic Optimization Benoıˆt Chachuat,† Adam B. Singer,‡ and Paul I. Barton*,§ Automatic Control Laboratory, Swiss Federal Institute of Technology, Station 9, CH-1015 Lausanne, Switzerland, ExxonMobil Upstream Research Company, Houston, Texas, and Process Systems Engineering Laboratory, Department of Chemical Engineering, Massachusetts Institute of Technology, 77 Massachusetts AVenue, Cambridge, Massachusetts 02139
An overview of global methods for dynamic optimization and mixed-integer dynamic optimization (MIDO) is presented, with emphasis placed on the control parametrization approach. These methods consist of extending existing continuous and mixed-integer global optimization algorithms to encompass solution of problems with ODEs embedded. A prerequisite for so doing is a convexity theory for dynamic optimization as well as the ability to build valid convex relaxations for Bolza-type functionals. For solving dynamic optimization problems globally, our focus is on the use of branch-and-bound algorithms; on the other hand, MIDO problems are handled by adapting the outer-approximation algorithm originally developed for mixed-integer nonlinear problems (MINLPs) to optimization problems embedding ODEs. Each of these algorithms is thoroughly discussed and illustrated. Future directions for research are also discussed, including the recent developments of general, convex, and concave relaxations for the solutions of nonlinear ODEs. 1. Introduction Interest in dynamic simulation and optimization of chemical processes has increased significantly in process industries over the past two decades. This is because process industries are driven by strongly competitive markets and have to face evertighter performance specifications and regulatory limits. The transient behavior of chemical processes is typically modeled by ordinary differential equations (ODEs), differential/algebraic equations (DAEs), or partial differential/algebraic equations (PDAEs). These equations describe mass, energy, and momentum balances and thus ensure physical and thermodynamic consistency. Due to the now widespread industrial use of dynamic process modeling and simulation software (such as ABACUSS/JACOBIAN, gPROMS, and ASPEN CUSTOM MODELER) together with finite element modeling software (e.g., FEMLAB), dynamic models are increasingly developed and applied. It is then natural to ask what other applications might exploit the resources invested in the development of such models. Dynamic optimization is a natural extension of these dynamic simulation tools because it automates many of the decisions required for engineering studies. It consists of determining values for input/control profiles as well as real valued parameters, initial conditions, and/or boundary conditions of a dynamic system that optimize its performance over some period of time according to a specified metric. Examples of dynamic optimization tasks include: determination of optimal control profiles for batch processes;1-3 determination of optimal changeover policies in continuous processes;4,5 design of distributed process units, including adsorption separations, chemical vapor deposition reactors, and microscale reactors;6-8 fitting of chemical reaction kinetic parameters to data;9-11 optimal experimental design;12,13 nonlinear model predictive control with a continuous-time model;14,15 online system * To whom correspondence should be addressed. Tel.: +1 617 253 6526. Fax: +1 617 258 5042. E-mail:
[email protected]. † Swiss Federal Institute of Technology. ‡ ExxonMobil Upstream Research Company. § Massachusetts Institute of Technology.
identification and gross-error detection with a nonlinear process model;16,17 etc. The optimization of a dynamic system is a classical problem in the calculus of variations and its extension by modern optimal control theory.18 A great variety of numerical algorithms have been proposed for such problems over the years, and they generally fall into one of two categories: Variational methods and discretization (or direct) methods. The variational approach19 attempts to find stationary functions via solution of the Euler-Lagrange equations (a multiple-point boundary value problem, MPBVP), which are at least necessary for optimality. When an analytical solution can be found to the Euler-Lagrange equations, this approach has the advantage of being rigorous, in the sense that a solution is found in the original infinite dimensional space. For most optimal control problems, however, MPBVPs are not solvable analytically and are also difficult to solve numerically. In the discretization approach, the infinite dimensional optimization problem is approximated by an optimization problem in a Euclidean space. Discretization approaches can be further subdivided into two broad categories, known as simultaneous and sequential. The simultaneous method,20,21 also termed total discretization, converts the problem into a finite dimensional nonlinear program (NLP) via discretization of both the state and control variables utilizing, for example, orthogonal collocation on finite elements.22 While completely transforming a variational problem into a system of algebraic equalities and inequalities, the simultaneous method typically generates a very large number of variables and equations, yielding large NLPs that are difficult to solve reliably; a great deal of research has thus been devoted to developing tailored algorithms that exploit the special structure of the resulting NLPs.23 Sequential discretization is usually achieved via control parametrization,24,25 where the control variable profiles are approximated by a collection of basis functions that depend on a finite number of real parameters. These parameters become the decision variables in a master NLP, and function evaluations are provided to this NLP via numerical solution of the fully determined initial value problem (IVP) given by fixing the control profiles and any other parameters. This method has
10.1021/ie0601605 CCC: $33.50 © 2006 American Chemical Society Published on Web 08/15/2006
8374
Ind. Eng. Chem. Res., Vol. 45, No. 25, 2006
the advantage of yielding a relatively small NLP (as compared to the simultaneous approach) and exploiting the robustness and efficiency of modern IVP, sensitivity, and adjoint solvers.26-28 A drawback of the sequential approach, however, is that the differential equations must be stable (or at most mildly unstable if the time horizon is small). Nevertheless, a number of approaches have been proposed to handle unstable problems. As just an example, direct multiple shooting29 proceeds by relaxing the original IVP into a number of decoupled IVPs on finite time elements and imposing continuity constraints for the state variables between these time elements. It has been known for a number of years that dynamic optimization problems encountered in chemical engineering applications exhibit suboptimal local minima almost pathologically.30,31 This property, which can be attributed to nonconvexity of the functions participating in most chemical engineering models, implies that the foregoing local optimization methods will often yield suboptimal solutions to problems (the stationary functions of the variational approach need not even be local minima). Multiple suboptimal local minima can be found even in small-scale problems, e.g., for the temperature control of a batch reactor.32 Another striking example is the optimization of catalyst blend profiles in a tubular reactor wherein hundreds of local minima have been identified using local optimization techniques from random starting points.30,33 Implementing a suboptimal operating policy on a real process can have direct economic, safety, and environmental impacts. This has motivated researchers to seek global optimization algorithms for nonconvex dynamic optimization problems. The problem of estimating the parameters of a dynamic model from data provides yet another illustration of the importance of global optimization methods.11,34 Suppose a kineticist has come up with a new mechanism for describing some chemical reactions and wants to validate (or disprove) it by testing its consistency with experimental data. If a set of parameters can be found that gives “sufficiently” good fits with experimental data, then the model is declared valid. But, a problem arises when such a parameter set cannot be obtained, e.g., when local optimization techniques are used. In this case, the kineticist would be left in quandary since he could not conclude whether (i) the model is an incorrect description of the chemistry, or (ii) the correct set of fitting parameters was simply not found by the local search. On the other hand, global optimization allows the detection of the inconsistency of a candidate mechanism with certainty. Both stochastic and deterministic optimization methods have been proposed to overcome convergence to suboptimal, local minima in dynamic optimization problems. Stochastic optimization methods are often able to find much better solutions than the aforementioned local optimization techniques, and many successful applications have been reported in chemical engineering.30,31,35 Notwithstanding the attractive features offered by these methods, they tend to be computationally expensive and have difficulties with highly constrained problems, e.g., for converging to solutions at which multiple constraints are active. Most importantly, even the best stochastic search method cannot guarantee locating a global solution in a finite number of iterations. On the other hand, deterministic global optimization algorithms can guarantee that the optimal performance has been found.36 Said differently, such algorithms amount to a constructive proof that a global solution has been identified. Attempts have been made to extend the RBB method for regular NLP problems37,38 to dynamic embedded NLPs.39-42 Although several such algorithms have yielded rigorous global optimization approaches,41,42 they are very computationally expensive for they
require second-order sensitivities or adjoints to be calculated; moreover, they require twice-continuous differentiability for the solutions of the differential equations and do not address the critical issue of non-quasi-monotone differential equations. Concurrently, methods based on McCormick’s relaxation technique43,44 have been proposed that provide convex relaxations for dynamic optimization problems embedding either linear or nonlinear ODEs.45,46 Not only do McCormick-based relaxations apply to a much wider class of ODEs (for they only require the right-hand side of the differential equations to be Lipschitzcontinuous and factorable), but they were also shown to outperform RBB-based relaxations both in terms of tightness and computational expense and allow treatment of non-quasimonotone differential equations.47 For these reasons, only this latter class of relaxations is considered in this paper. In recent years, there has also been a growing interest in problems in process design and operations that can be formulated as dynamic optimization problems including discrete (binary or integer) decisions; these problems are termed mixedinteger dynamic optimization (MIDO).48-50 Binary or integer decisions are often introduced in an optimization formulation to represent the inclusion or exclusion of elements in a design, nonsmoothness or discontinuities in a process model, as well as temporal sequencing decisions. If the underlying model is dynamic, this yields a MIDO problem. Areas of application for MIDO include the following: batch process synthesis and development;2,51-54 design of batch distillation columns;55-57 solvent design in batch processes;58 simultaneous design and control;59-62 reduction of kinetic mechanisms;63,64 optimization of hybrid discrete/continuous systems,65-69 e.g., for the design of major process transients such as startup or shutdown procedures. Similarly, problems in safety verification have been formulated as mixed-integer optimization problems (MIPs) with a linear, discrete-time model.70 Use of a more realistic nonlinear, continuous-time model would yield a MIDO formulation. It is worthwhile to note that the use of deterministic global optimization is absolutely vital here, for formal safety verification requires a rigorous proof that the formulated safety property is satisfied by the model. Standard algorithms for general MIPs in Euclidean spaces assume that the participating functions are convex.71-73 But when applied to problems wherein the participating functions are nonconvex, these algorithms typically terminate at an arbitrary suboptimal solution, for the cuts added during the solution process are prone to excluding (possibly large) portions of the feasible set within which a global solution may occur.74 In response to this, deterministic global optimization algorithms have begun to emerge in recent years to solve mixed-integer nonlinear problems (MINLPs) with nonconvex participating functions. They rely either on a branch-and-bound procedure (branch-and-reduce,75,76 SMIN-RBB and GMIN-RBB,77 reformulation/spatial branch-and-bound78) or on a decomposition strategy (OA for nonconvex separable and nonseparable MINLPs79,80), both of which have been shown to be particular instances of a generalized branch-and-cut (GBC) framework used with different sets of algorithmic heuristics.81 Building upon these recent developments, along with the foregoing developments in relaxation techniques for nonconvex Bolzatype functionals subject to embedded ODEs, algorithms can be formulated that are capable of finding a global solution of a quite general class of MIDO problems, despite the nonconvexities inherent to the dynamic optimization subproblems. In particular, the extension of the outer-approximation algorithm for nonconvex MIPs80 has been investigated to encompass
Ind. Eng. Chem. Res., Vol. 45, No. 25, 2006 8375
solutions of MIDO problems embedding either linear timevarying or general nonlinear ODEs.50,69,82 Ostensibly, any other branch-and-bound or decomposition-based global MINLP algorithm can be adapted to solve MIDO problems provided that rigorous lower bounding problems can be constructed from existing relaxation techniques. In summary, significant progress has been made in global optimization of problems with ODEs embedded over the past few years. In this contribution, we present an overview of deterministic global optimization methods for dynamic optimization and MIDO problems embedding ODEs, with special emphasis placed on the sequential approach. The remainder of the paper is organized as follows. Global methods for dynamic optimization problems with ODEs embedded are presented in section 2; the focus is on the use of a branch-and-bound procedure, and special attention is paid to construction of convex relaxations for Bolza-type functionals based on McCormick’s composition technique. Global solution techniques for MIDO problems are then discussed in section 3; in particular, the extension of an outer-approximation algorithm originally developed to address MINLPs is presented and illustrated. Finally, a number of challenges are identified in section 4, and future research directions are discussed. 2. Global Dynamic Optimization In this section, we review branch-and-bound techniques for solving optimization problems embedding general, nonlinear ODEs to guaranteed global optimality. For the sake of clarity, we restrict the presentation to continuous dynamic optimization problems of the following form:
min J(p) :) φ0(x(tf, p), p) + p∈P
∫t t ψ0(t, x(t, p), p) dt
s.t. 0 g Gk(p) :) φk(x(tf, p), p) +
f
0
∫t t ψk(t, x(t, p), p) dt, f
0
k ) 1, ..., ng
x3 (t, p) ) f(t, x(t, p), p), ∀ t ∈ (t0, tf] x(t0, p) ) h(p)
(D1)
Here, t ∈ [t0, tf] stands for the independent variable (e.g., time); p ∈ P denotes the continuous time-invariant parameters, with P a nonempty compact convex subset of Rnp; x ∈ X are the continuous variables describing the state of the process, with X ⊂ Rnx such that x(t, p) ∈ X, ∀ (t, p) ∈ [t0, tf] × P; φk: X × P f R and ψk: [t0, tf] × X × P f R, k ) 0, ..., ng are continuous, potentially nonconvex mappings; fi: [t0, tf] × X × P f R, i ) 1, ..., nx is Lipschitz-continuous on [t0, tf] × X and continuous on P; and hi: P f R, i ) 1, ..., nx is continuous on P. The objective and constraint functionals in (D1) are subject to a set of single-stage ODEs, and both the initial and terminal times are fixed. In general, however, the developments presented later on in this section can be readily extended to address problems wherein the objective and constraint functionals include a finite number of point and integral terms on a fixed partition interior to the time domain. Note also that extensions exist to encompass solutions of problems embedding multistage ODEs with varying transition times (including problems with varying initial and/or terminal times);83 this can be done, e.g., by applying the so-called control parametrization enhancing transform (CPET),84 which transforms the time domain from one with varying transition times into one with fixed transition times, via introduction of bilinear terms (i.e., additional non-
convexities) into the right-hand sides of the differential equations. To illustrate this, consider the case where the final time tf is free in (D1). Then, a possible reformulation of this problem into one with fixed final time is
min φ0(ξ(1, p, T), p) +
p∈P,Tg0
∫01ψ0(t0 + θT, ξ(θ, p, T), p) dθ
s.t. 0 g φk(ξ(1, p, T), p) +
∫01ψk(t0 + θT, ξ(θ, p, T), p) dθ,
k ) 1, ..., ng
ξ˙ (θ, p, T) ) Tf(t0 + θT, ξ(θ, p, T), p), ∀ θ ∈ (0, 1] ξ(0, p) ) h(p) where T ) tf - t0 is made an additional decision variable. Last but not least, problems containing function valued decision variables can be converted into the formulation given in (D1) by applying standard control parametrization techniques,24,25 which consist of approximating the infinite dimensional control variables, u(t), with parametric functions, U(t, p), such as piecewise constant or piecewise linear functions. 2.1. Branch-and-Bound. So far, our focus has been on the use of branch-and-bound algorithms36,43,85,86 in order to guarantee location of a global solution of the optimization problem, even when the objective and/or constraint functionals are nonconvex. A branch-and-bound algorithm begins by constructing a relaxation of the original nonconvex problem. This relaxation is solved to generate a lower bound on the solution of the original problem, and should, in some way, be easier to solve than the original problem. In the current context, the relaxation is a convex optimization problem whose objective function underestimates the nonconvex objective functional on P and whose feasible set contains that of the nonconvex problem. This can be achieved by constructing functions that are conVex relaxations (a convex relaxation of a function f on a convex set C is a convex function u:C f R such that u(x) e f(x), ∀ x ∈ C.) of the objective and constraint functionals on P and formulating a convex optimization problem from these relaxed functions. In addition, because every local minimum of a convex optimization problem is a global minimum, (in principle) standard nonlinear programming (NLP) algorithms designed to locate local minima can find this lower bound reliably. An upper bound is generated by the value of the nonconvex objective function at any feasible point (e.g., a local minimum found by a standard NLP algorithm). If these bounds are not within some tolerance, a branching heuristic is used to partition the infinite set P (normally an interval derived from physical considerations) into two new subproblems (e.g., bisect on one of the variables). Relaxations can be constructed on these two smaller sets, and lower and upper bounds can be computed for these partitions. If the lower bound on a partition is greater than the current best upper bound, a global solution cannot exist in that partition and the partition is excluded from further consideration (fathoming). This process of branching, bounding, and fathoming continues until the lower bound on all active partitions is within of the current best upper bound. The branch-and-bound procedure is illustrated in Figure 1. Typically, if ) 0, only infinite convergence of this procedure can be established.36 However, for any > 0, only a finite number of iterations will be required. It should be noted that while finite convergence of a branch-and-bound procedure for an integer program (IP) or a mixed-integer linear program (MILP) is trivial due to the finite number of nodes in the branchand-bound tree, finite convergence of a branch-and-bound
8376
Ind. Eng. Chem. Res., Vol. 45, No. 25, 2006
Figure 1. Example of a convex relaxation of a nonconvex function on R0 (center plot). The region R0 is then subdivided to form regions R1 and R2, and the relaxation is repeated to find new lower bounds for each region (right plot). By finding a point p in region R2 where J(p) is less than the convex lower bound for region R1, one can show that a global minimum cannot exist in region R1; region R1 can now be discarded from the search tree, whereas R2 is further subdivided as shown in the branch-and-bound tree (left plot).
procedure for an NLP on an infinite set P is quite a significant result. Convergence proofs require the convex relaxations of any functions involved to converge in some sense to the original function as the size of the supporting set decreases. On the other hand, given > 0, the branch-and-bound procedure has an exponential running time due to the repeated partitioning of P. The ability to solve practical problems in reasonable time scales hinges on the strength of the convex relaxations, good algorithmic heuristics, and (potentially) the ability to use parallel computing architectures. In recent years, much effort has been devoted to developing algorithmic heuristics38,75,76,87 and the resulting codes have been able to achieve impressive results on many difficult NLP problems. In contrast, global optimization for problems with differential equations embedded is still in a state of infancy. 2.2. Road Map for Construction of Convex Relaxations. Established polynomial time methods for constructing the convex relaxation of a function38,43 assume that the form of the function is known explicitly as an elementary function and can be manipulated symbolically (indeed, this includes many functions that can be implemented as a computer program88). However, the objective functional J and constraints G in (D1) do not fall into this category because they are composed with the solution of the ODEs. Hence, the ability to apply a branchand-bound algorithm to (D1) hinges on the development of a practical computational method for constructing convergent convex relaxations of these functionals. Two basic “ingredients” are needed to compute a convex relaxation of a general Bolza-type functional such as J or G in (D1). First, one needs the ability to construct convex relaxations of the terminal terms φi(x(tf, ‚), ‚), as well as of the integrands ψi(t, x(t, ‚), ‚) pointwise in time, on a given partition of P. Then, valid convex relaxations must be derived for the integral terms ∫tt0f ψi(t, x(t, ‚), ‚) dt. Each of these two steps is thoroughly detailed and illustrated hereafter. 2.2.1. Relaxation of Point Functionals. The idea behind construction of convex relaxations for a point functional such as ψi(t, x(t, ‚), ‚) (at a fixed t ∈ [t0, tf]) on P consists of adapting McCormick’s composition technique43 to the ODE case. This procedure requires that (i) a time-varying enclosure X(t) ) [xL(t), xU(t)] ⊇ X be known for the solutions of the embedded parametric ODEs at t on P and (ii) a convex underestimator ux(t, ‚) and a concave overestimator ox(t, ‚) be known for the continuous state variables x(t, ‚) on P. The critical issues of obtaining time-varying enclosures and convex/concave bounds for the solutions of general, nonlinear differential equations at each t ∈ [t0, tf] are addressed in the
subsequent two paragraphs. Then, the construction of convex relaxations for point functionals is outlined, and a number of numerical considerations are discussed. Finally, the special case of point functionals with linear-time-varying (LTV) ODEs embedded is considered. Implied State Bounds. The first bounding step consists of estimating the image of the parameter set P under the solution of the ODEs. Said differently, one wants to find so-called implied state bounds xL(t) and xU(t) such that
xL(t) e x(t, p) e xU(t) ∀ p ∈ P pointwise in [t0, tf]. In general, obtaining the exact bounds for nonlinear ODEs (i.e., the tightest possible implied state bounds) is itself a nonconvex optimal control problem. Nevertheless, a number of techniques exist that provide rigorous pointwise-intime enclosures of the image set. When the ODEs have a property known as quasi-monotonicity,89 tight bounds can be obtained via the classical theory of differential inequalities.89,90 More specifically, any trajectories satisfying the following differential inequalities in xL and xU yield valid enclosures for the solutions x(‚, p) of the initial value problem in ODEs in problem D1, with p ∈ P:
}
x˘ Li (t) e inf{fi(t, z, q) : z ∈ X(t), zi ) xLi (t), q ∈ P} ∀ t ∈ (t , t ] z,q 0 f x˘ Ui (t) g sup{fi(t, z, q) : z ∈ X(t), zi ) xUi (t),q ∈ P} i ) 1, ..., nx z,q
(1)
xL(t0) e h(p) e xU(t0) ∀p ∈ P Nevertheless, non-quasi-monotone systems are frequently encountered in many engineering and scientific fields. As just an example, even very simple biochemical network models can be non-quasi-monotone. For such systems, it is well-known that the implied state bounds computed using differential inequalities rapidly explode toward +∞ or -∞, hence making them practically useless in a branch-and-bound algorithm. In response to this, new differential inequality results have been developed that enable knowledge concerning constraints and solution invariants to be incorporated in the computation of implied state bounds.46 For example, situations wherein natural bounds and invariants for the state variables are known frequently arise in many engineering problems, e.g., from mass and energy conservation principles. To illustrate this, consider the initial value problem in ODEs in problem D1 and suppose a set X h (t) of natural bounds is known for x(t, p) pointwise in
Ind. Eng. Chem. Res., Vol. 45, No. 25, 2006 8377
[t0, tf], with p ∈ P. Then, any trajectories satisfying the following differential inequalities in xL and xU yield valid enclosures for x(‚, p):46
with the initial conditions: xLB(0) ) xUB (0) ) xB0. To compute a numerical solution, the following values are assigned to the rate constants and initial conditions:
x˘ Li (t) e inf{fi(t, z, q) : z ∈ X(t)∩X h (t), zi ) xLi (t), q ∈ P} ∀ t ∈ (t , t ] z,q 0 f x˘ Ui (t) g sup{fi(t, z, q) : z ∈ X(t)∩X h (t), zi ) xUi (t),q∈ P} i ) 1, ..., nx
kLf ) 100, kUf ) 500, kLr ) 0.001, kUr ) 0.01, xA0 ) 1,
}
z,q
xB0 ) 1.5, xC0 ) 0.5
(2)
xL(t0) e h(p) e xU(t0) ∀ p ∈ P
where the units for concentrations are mol/L, the units for the forward rate constant are L/(mol‚min), and the units for the reverse rate constant are min-1. Moreover, the time scale of interest is set to 1.5 min. A plot of the upper bound for species B is shown in Figure 2 (left plot). Because system 3 is nonquasi-monotone, this bound rapidly explodes, hence making it useless in practice; the same behavior is obtained for the upper bounds of the other species concentrations. To correct the bound explosion, constraints are incorporated in the computation of the implied state bounds. Natural bounds for the state variables are easily obtained from considering species mole balances on the system:
For many non-quasi-monotone systems, this approach yields much tighter implied state bounds than the regular differential inequalities89 and accelerates the convergence of the branchand-bound algorithm. This is illustrated in an example below. From inspection, the most difficult aspect of obtaining implied state bounds appears to be bounding the solutions of the parametric optimization problems defining the right-hand sides of (1) and (2). Solving these optimization problems at each integration step in a numerical integration would be a prohibitively expensive task. Instead, their solutions can be estimated by an interVal arithmetic pointwise in time, e.g., via natural interVal extension.91 Given the inclusion monotonicity property of natural interval extensions, theoretical guarantees can be given that the states bounds approach the original state when the parameter space approaches degeneracy. But, although sufficient to guarantee convergence of the branch-and-bound algorithm, calculation of a lower bound for (D1) based on the implied state bounds only typically yields unacceptable solution times (even for problems containing as few as three optimization parameters41). These considerations motivate the development of tight convex/concave state bounds. Illustrative Example 1. Consider the bimolecular, reversible chemical reaction A + B h C, with forward rate constant kf and reverse rate kr. We wish to compute bounds for the concentrations of the three chemical species over a specified time interval, given a bounding set [kLf , kUf ] × [kLr , kUr ] for the rate constants and the following (non-quasi-monotone) differential system describing the evolution of the species concentrations:
Now, the implied state bounds are computed using eq 2, with X h (t) given by eq 4. The resulting bounds for species B are shown in Figure 2 (right plot). Note that the upper bound no longer explodes because when the right-hand sides of the bounding differential equations exceed a limit (dictated by the natural bounds), they become constant thus bounding the slope of the state variables. Convex/Concave State Bounds. As noted previously, convex relaxations for the (potentially nonconvex) functions φi and ψi in (D1) can be obtained by adapting McCormick’s composition technique43 to the ODE case, hence requiring convex functions ux(t, ‚) and concave functions ox(t, ‚) on P that satisfy the inequality
x˘ A ) -kfxAxB + krxC
ux(t, p) e x(t, p) e ox(t, p) ∀ p ∈ P
x˘ B ) -kfxAxB + krxC
pointwise in [t0, tf]. This is illustrated in Figure 3. Noting that an affine function is both convex and concave and that the solutions of LTV ODEs are affine with respect to the parameters,45 it was proposed to construct ux and ox as solutions of LTV ODEs. A method was therefore developed for constructing such LTV ODEs from the functional form of the nonlinear ODEs in (D1), so that their solutions satisfy inequality 5 at each t ∈ [t0, tf]. Specifically, ux and ox are obtained by constructing and solving the following set of auxiliary differential equations:46
x˘ C ) kfxAxB - krxC xA(0) ) xA0, xB(0) ) xB0, xC(0) ) xC0 The implied state bounds are first derived using Harrison’s theorem,89 i.e., without employing a priori information regarding the behavior of the state variables. When natural interval extensions are used to construct the right-hand sides of (1), this analysis yields the following differential equations for x˘ LB and x˘ UB (see Singer92 for the complete system of ODEs):
x˘ LB
)
-max{min{kLf xLA, kUf xLA, kLf xUA, kUf xUA}xLB, max{kLf xLA, kUf xLA, kLf xUA, kUf xUA}xLB} + min{kLr xLC, kUr xLC, kLr
xA0 - min{xA0, xB0} e xA(t) e xA0 + xC0 xB0 - min{xA0, xB0} e xB(t) e xB0 + xC0 0 e xC(t) e min{xA0, xB0} + xC0
u˘ xi(t, p) ) inf{Lufi(t, z, p)|x*(t),p* : z ∈ C(t, p), zi ) uxi(t, p)}
uxi(t0, p) ) Luhi(p)|p* kUr
xUC }
x˘ UB ) - min{min{kLf xLA, kUf xLA, kLf xUA, kUf xUA}xUB , max{kLf xLA, kUf xLA, kLf xUA, kUf xUA}xUB } + max{kLr xLC, kUr xLC, kLr xUC , kUr xUC } (3)
(5)
}
∀ t ∈ (t0, tf] z o˘ xi(t, p) ) sup{Lofi(t, z, p)|x*(t),p* : z ∈ C(t, p), zi ) oxi(t, p)} i ) 1, ..., nx z
xUC ,
(4)
oxi(t0, p) ) Lohi(p)|p*
}
i ) 1, ..., nx
(6)
for some reference trajectory (x*(t), p*) ∈ X(t) × P (differentiability of uf and of is assumed along (x*(t), p*)). In this expression, uf(t, ‚, ‚) and of(t, ‚, ‚) are convex underestimators and concave overestimators of f(t, ‚, ‚) on X(t) × P, pointwise in [t0, tf], respectively; uh and oh are a convex underestimator
8378
Ind. Eng. Chem. Res., Vol. 45, No. 25, 2006
Figure 2. Upper bound for species B in the bimolecular, reversible chemical reaction example: (left plot) without utilizing a priori information; (right plot) with the natural state bounds given by (4).
Illustrative Example 2. Given a bounding set P ) [pL, pU] for the parameter p, we wish to construct pointwise-in-time convex underestimators ux(t, ‚) and pointwise-in-time concave overestimators ox(t, ‚) on P for the differential variable x(t, ‚) defined by the IVP in ODEs
Figure 3. Illustration of pointwise-in-time affine relaxations: (left plot) time trajectories for a fixed parameter value in P; (right plot) state and relaxations at a fixed time instant in [t0, tf].
and a concave overestimator of h on P, respectively; L denotes the linearization operator; C(t, p) ) {z: ux(t, p) e z e ox(t, p)} for each p ∈ P; and X(t) are pointwise-in-time implied state bounds. Since the foregoing LTV ODEs have a fixed sequence of events, the solution functions ux(t, ‚) and ox(t, ‚) are indeed affine in P, pointwise in time.93 As noted previously with the implied state bounds, the most difficult aspect of obtaining affine state relaxations is constructing the right-hand sides of (6). So far, the convex/concave relaxations uf, of, uh, and oh have been obtained by applying McCormick’s composition technique.43 Accordingly, ux(‚, p) and ox(‚, p) converge to x(‚, p), pointwise in time, as both upper and lower bounds on the range of p narrow. The reference trajectory (x*(t), p*) is another important aspect of the affine state relaxations. It is needed for constructing auxiliary differential equations 6, via the linearization of the relaxations uf(t, ‚, ‚) and of(t, ‚, ‚) of the original ODE righthand sides. In particular, the theory guarantees that any choice of this reference trajectory in X(t) × P yields a valid convex underestimator ux(t, ‚) and a valid concave overestimator ox(t, ‚) on P, pointwise in [t0, tf]. For example, a possible reference trajectory is (x*(t), p*) ) ((xL(t) + xU(t))/2, (pL + pU)/2), even though x*(t) may not be the solution of the ODEs for p*. Experience shows that the choice of (x*(t), p*) may have a large influence on the derived affine state relaxations, especially when the parameter range P is large (i.e., at early nodes of the branch-and-bound tree). Despite this important fact, there is presently no systematic way of selecting a “good” reference trajectory a priori. In practice, however, the influence of the reference trajectory can be mitigated by taking a set of reference trajectories in X(t) × P, calculating the relaxations ux and ox for each of these references in parallel, and then considering the tightest convex and concave relaxations only. The construction of the state relaxations, in a step-by-step fashion, is detailed in an example below. The influence of the reference trajectory is also illustrated in this example.
x˘ (t, p) ) -0.1(x(t, p) - p)2 ∀ t ∈ (0, 1]
(7a)
x(0, p) ) p2 - 0.5
(7b)
First, time-varying enclosures X(t) ) [xL(t), xU(t)] are easily obtained from (2) as the solutions of the differential equations
}
x˘ L(t) ) -0.1 max{(xL(t) - pL)2, (xL(t) - pU)2} ∀ t ∈ (0, 1] x˘ U(t) ) -0.1 mid{xU(t) - pU, xU(t) - pL, 0}2 (8) with the initial conditions
xL(0) ) mid{pL, pU, 0}2 - 0.5 xU(0) ) max{(pL)2, (pU)2} - 0.5
(9)
Here, the upper and lower bounds for the right-hand sides of the differential equation (eq 7a) and the initial conditions (eq 7b) are obtained using natural interval extensions.91 Next, we construct affine relaxations for the solutions of the differential equation based on X(t). Noting that f(t, ‚, ‚) is a concave function on X(t) × P, a pointwise-in-time concave overestimator of(t, ‚, ‚) of f(t, ‚, ‚), t ∈ [t0, tf], is trivially given by
of(t, z, p) ) -0.1(z - p)2 For obtaining a convex underestimator uf(t, ‚, ‚) of f(t, ‚, ‚) on X(t) × P, the following factorization is considered:
V1 :) z - p V2 :) V12 f ) -0.1V2 Convex/concave relaxations for each of the foregoing factors for (z, p) ∈ X(t) × P are easily obtained as
uV1 ) z - p, oV1 ) z - p
Ind. Eng. Chem. Res., Vol. 45, No. 25, 2006 8379
uV2 ) V12, oV2 ) V1(VL1 + VU1 ) - VL1 VU1
zmin(t) ∈ Xi(t)
(12)
uf ) -0.1oV2, of ) -0.1uV2
uxi(t, p) e oxi(t, p) ∀ p ∈ P
(13)
uxi(t, p) e xUi (t) ∀ p ∈ P
(14)
Moreover,
with: VL1 ) xL(t) - pU, VU1 ) xU(t) - pL and the desired convex underestimator follows by back substitution as
uf(t, z, p) ) -0.1(z - p)(x (t) + x (t) - p - p ) + U
L
U
L
0.1(xU(t) - pL)(xL(t) - pU)
for if the converse were true, then [uxi(t, p), oxi(t, p)] ∩ Xi(t) ) L, which contradicts the fact that the state trajectory x(t, p) exists and is unique since x(t, p) ∈ [uxi(t, p), oxi(t, p)] and x(t, p) ∈ X(t) for each p ∈ P. Likewise,
Analogously, a convex underestimator uh and a concave overestimator oh of h on P are
oxi(t, p) g xLi (t) ∀ p ∈ P
uh(p) ) p2 - 0.5
By the definition of the mid function and from (13), (i) either uxi(t, p) e zmin(t) e oxi(t, p), in which case mid{uxi(t, p), oxi(t, p), zmin(t)} ) zmin(t) ∈ Xi(t)sfrom (12); (ii) or zmin(t) e uxi(t, p) e oxi(t, p), in which case mid{uxi(t, p), oxi(t, p), zmin(t)} ) uxi(t, p) ∈ Xi(t)sfrom (12) and (14); (iii) or uxi(t, p) e oxi(t, p) e zmin(t), in which case mid{uxi(t, p), oxi(t, p), zmin(t)} ) oxi(t, p) ∈ Xi(t)sfrom (12) and (15). Overall, mid{uxi(t, p), oxi(t, p), zmin(t)} ∈ Xi(t) for each p ∈ P. The utilization of the mid function also merits some comments as it may introduce nonsmoothness in the resulting convex underestimators. For example, this situation occurs when the bounds xLi (t) or xUi (t) for the state variables yield tighter relaxations than those provided by the functions uxi(t, ‚) or oxi(t, ‚), respectively, for some values p ∈ P. To circumvent this issue, an alternative approach consists of formulating a smooth lower-bounding problem by substituting uφ(ω) to uφoxi(p) in the original functional, with ω being an additional decision variable, and then adding four new inequality constraints as follows:
oh(p) ) p(pL + pU) - pLpU - 0.5 Note that uh and oh are also the convex and concave envelopes of h on P, respectively, in this case. Applying (6) provides the desired relaxations as the solutions of the following auxiliary LTV ODEs u˘ x(t, p) ) -0.1(ux(t, p) - p)(xU(t) + xL(t) - pU - pL) +0.1(xU(t) - pL)(xL(t) - pU) o˘ x(t, p) ) -0.2(ox(t, p) - p)(x* - p*) - 0.1(x* - p*)2
}
∀ t ∈ (0, 1]
(10)
with the affine initial conditions
ux(0, p) ) 2pp* + (p*)2 - 0.5 ox(0, p) ) p(pL + pU) - pLpU - 0.5
(11)
for any reference trajectory (x*(t), p*) ∈ X(t) × P. Note that the differential equations giving ux and ox are decoupled in this simple example. To compute a numerical solution, the bounding set is chosen as P ) [-3, 2]. For comparison purposes, the corresponding relaxations at final time t ) 1 are shown in Figure 4 for various choices of the reference trajectory. Convex Relaxation of Point Functionals. Once time-varying enclosures and convex/concave bounds have been obtained for the solutions of the ODEs, one can apply McCormick’s composition technique43,44 in a straightforward fashion to derive convex relaxations of point functionals. To illustrate this, consider the point functional φ(xi(t, ‚)) for some i ∈ {1, ..., nx}. Let uφ be a convex underestimator (a convex underestimator uφ is obtained by using any suitable technique for relaxing elementary functions38,43) of φ on Xi(t), and denote zmin(t) a point at which uφ attains its infimum on Xi(t). Then,
uφoxi(p) :) uφ(mid{uxi(t, p), oxi(t, p), zmin(t)}) is a convex underestimator for φ(xi(t, ‚)) on P, where the mid function selects the middle value of three scalars.43,44 An analogous result holds for constructing a concave overestimator oφoxi of φ(xi(t, ‚)). Interestingly enough, neither uxi(t, p) nor oxi(t, p) are required to be in the domain of uφ (which is Xi(t)) by virtue of the composition of uφ with the mid function. To see this, note first that the following relations hold true by construction:
(15)
ω g uxi(t, p) ω g xLi (t) ω e oxi(t, p) ω e xUi (t) Illustrative Example 2 [Continued]. Given a bounding set P ) [pL, pU] for the parameter p, we want to construct a convex relaxation on P for the point functional ψ(t, x(t, ‚), ‚) defined as:
ψ(t, x(t, p), p) :) p2 - [x(t, p)]2 ∀ p ∈ P for some t ∈ [0, 1], subject to the IVP in ODEs 7. Obtaining a convex relaxation uψ(t, ‚) requires that a convex underestimator be known for the term -[x(t, p)2]. On the basis of the material presented in the previous paragraph, one such convex underestimator is obtained as the following:
uφ[mid{ux(t, p), ox(t, p), zmin(t)}] where uφ: z |f -z(xL(t) + xU(t)) + xL(t)xU(t); implied state bounds xL and xU and affine state relaxations ux and ox of x on P are obtained at time t by integrating the differential equations
8380
Ind. Eng. Chem. Res., Vol. 45, No. 25, 2006
Figure 4. State variable x at final time and convex/concave relaxations for various reference trajectories: (left plot) (x*(t), p*) ) (xL(t), pL); (center plot) (x*(t), p*) ) ((xL(t) + xU(t))/2, (pL + pU)/2); (right plot) (x*(t), p*) ) (xU(t), pU).
of (8) and (9) and (10) and (11), respectively; and zmin(t) :) xU(t) is the point at which uφ attains its minimum on X(t) :) [xL(t), xU(t)]. Overall, a convex relaxation uψ(t, ‚) of ψ(t, ‚) on P is thus given by
uψ(t, p) :) p2 [mid{ux(t, p), ox(t, p), xU(t)}(xL(t) + xU(t)) - xL(t)xU(t)] ∀ p ∈ P (16) Numerical Considerations. By construction, the auxiliary set of LTV ODEs (6) giving the state relaxations do not necessarily possess Lipschitz-continuous right-hand sides, for the differential equations possess state events. This situation arises because the convex and concave relaxations uf and of used in the right-hand sides of (6) may be nonsmooth. When this occurs, the derivatives of the relaxations uf and of may be discontinuous causing nonsmooth shifts through time in the linearizations defining u3 x and o3 x. Although this nonsmoothness does not pose any theoretical limitations,46 it often causes numerical integration codes to fail. Discontinuity locking with eVent detection can be used for preventing it.47,94 Moreover, because any supporting linearization yields valid relaxations of ux and ox, the mode actually selected at an event is arbitrary, hence causing chattering. However, this can be easily prevented, e.g., by locking the system into an arbitrary mode while chattering ensues.47,92 Clearly, integrating the differential system (2),(6) in the relaxed problem is more expensive than integrating the original ODE system in (D1), for the former is both four times larger (in the number of state equations) and requires discontinuity locking. However, when using an optimizer to determine a lower bound for (D1), one can often exploit the affine structure of ux(t, ‚) and ox(t, ‚) to reduce the number of required integrations based on linear algebra (e.g., if the objective function and constraints in (D1) do not contain integral terms). Moreover, because the implied state bounds xL and xU are parameter independent, they need not be recomputed for each iteration of the optimizer. It is also worthwhile to note that the running time of the algorithms is much more sensitive to the number of degrees of freedom p than to the number of state variables x. This is because the control parametrization approach limits the worstcase exponential branch-and-bound process to the parameter space, whereas evaluation of the function x(‚, p) can be accomplished with polynomial-time numerical integration algorithms. This suggests that optimization problems with a small number of degrees of freedom but a large number of state
variables should be tractable within reasonable computing times. Certainly, many problems of engineering interest fall into this category. Special Case of Linear-Time-Varying Systems. The construction of convex relaxations for dynamic optimization problems greatly simplifies in the case of dynamic optimization problems embedding LTV ODEs
x3 ) A(t)x + B(t)p + q(t) where the elements of the matrices A and B and of the vector q are continuous functions with respect to time, and with an affine initial condition in p. Noting the affine structural form of the solution of LTV ODEs, i.e.,
x(t,p) ) M(t)p + n(t)
(17)
it is immediately seen that the state variables are an affine function of the optimization parameters at any fixed time t ∈ [t0, tf]. A direct consequence of this structural property is that a convex underestimator ux(t, ‚) and a concave overestimator ox(t, ‚) of x(t, ‚) on P are trivially obtained as
ux(t, p) ) x(t, p) ) ox(t, p) ∀ p ∈ P pointwise in [t0, tf]. As regards implied state bounds xL and xU, a tailored, polynomial complexity method has been developed which utilizes a combination of interval analysis91 with affine structural form 17.45 The main interest of using this approach instead of differential inequality results lies in the guarantee that the implied state bounds are now exact in the following sense: for any j ∈ {1, ..., nx} and any t ∈ [t0, tf],
j) ∃p, p j ∈ P : xj(t, p) ) xLj (t) e xj(t, p) e xUj (t) ) xj(t, p ∀p∈P for P being an interval. Several numerical case studies have been considered, which demonstrate tractability of the approach for optimization problems containing up to 10 decision variables (control vector parametrization problems in particular).45,92 2.2.2. Relaxation of Integral Functionals. The second step in the development of convex relaxations for general, Bolzatype functionals is construction of convex relaxations for integral terms. Clearly, a convex relaxation for an integral term can be obtained by reformulating it as an additional differential equation and, then, generating relaxations as indicated in section 2.2.1. A more interesting way of relaxing integral terms that guarantees providing tighter relaxations is to exploit the monotonicity of the Lebesgue integral.45
Ind. Eng. Chem. Res., Vol. 45, No. 25, 2006 8381
Figure 5. Integrand and integral terms, plus convex relaxations at (x*(t), p*) ) ((xL(t) + xU(t))/2, (pL + pU)/2): (left plot) integrand vs t and p; (right plot) integral vs p.
Figure 6. Branch-and-bound algorithm for solution of dynamic optimization problems to guaranteed global optimality: (P) stack; (Rj) node j; (UBD) best upper bound; (UB) upper-bounding problem; (LB) lower bounding-problem.
Let P be a nonempty convex set, ψ: [t0, tf] × X × P f R, a point functional subject to nonlinear ODEs, and Ψ(‚) :) ∫tt0f ψ(t, x(t, ‚), ‚) dt. Suppose uψ(t, ‚) is a pointwise-in-time convex underestimator of ψ(t, x(t, ‚), ‚) on P (as obtained, e.g., by applying the relaxation procedure presented early on in section 2.2.1). Then,
uΨ(p) :)
∫t t uψ(t, p) e Ψ(p) f
0
∫01p2 - [x(t, p)]2 dt
∀p ∈ P
uΨ(p) :)
∫01p2 - [mid{ux(t, p), ox(t, p), xU(t)} (xL(t) + xU(t)) - xL(t) xU(t)] dt (19)
∀p∈P
and uΨ is a convex function on P. In other words, integrating a pointwise-in-time convex underestimator (or respectively a concave overestimator) for an integrand provides a convex underestimator (or respectively a concave overestimator) for an integral. This is illustrated in an example below. Notwithstanding the attractiveness of the foregoing result, it is important to note that the integral of the pointwise-in-time convex envelope (i.e., the tightest possible convex underestimator) of a nonconvex integrand does not provide, in general, the convex envelope for the integral. Illustrative Example 2 [Continued]. Given a bounding set P ) [pL, pU] for the parameter p, we finally want to construct a convex relaxation uΨ on P for the integral functional Ψ given by
Ψ(p) :)
derived previously based on McCormick’s composition technique; it is given by (16). Therefore, by applying the theory presented earlier in this subsection, the following system yields the desired relaxation for the integral:
(18)
subject to the IVP in ODEs 7. A convex underestimator uψ(t, ‚) on P of the integrand ψ(t, x(t, p), p) :) p2 - [x(t, p)]2, at a given t ∈ [0, 1], has been
For the numerical calculations, the bounding set is chosen as P ) [-3, 2] and the reference trajectory in the affine state relaxation is taken as (x*(t), p*) ) ((xL(t) + xU(t))/2, (pL + pU)/2). It is seen that the integral relaxation is convex on P in Figure 5 (right plot). Interestingly, the integrand relaxation is not convex on the joint time/parameter space in this example (Figure 5, left plot). This is because only partial convexity of the integrand with respect to the parameters is needed to ensure convexity of the integral.45 2.3. Application of Branch-and-Bound to Global Dynamic Optimization. The solution process of a branch-and-bound algorithm consists of iterating between two subproblems, the upper-bounding problem (UB) and the lower-bounding problem (LB). The algorithm is shown in Figure 6. Its adaptation to global dynamic optimization warrants some explanations. (i) Any feasible solution to the (nonconvex) dynamic optimization problem D1 yields a valid upper bound on the global solution of (D1). Typically, the upper-bounding problem is taken as (D1), and a local solution to this problem is sought by applying the conventional numerical procedures mentioned in the Introduction (e.g., the sequential approach).
8382
Ind. Eng. Chem. Res., Vol. 45, No. 25, 2006
Figure 7. Principle of the gdoc-compiler.
Figure 8. Principle of the GDOC-CLI.
(ii) The lower bounding problem is a convex dynamic optimization problem of the form:
min uJ (p) :) uφ0(p) + p∈P
s.t. 0 g ugk(p) :) uφk(p) +
∫t t uψ (p, t) dt f
0
0
∫t t uψ (p, t) dt, f
0
k
k ) 1, ..., ng (20)
where uφk and uψk (k ) 0, ..., ng) are pointwise-in-time convex underestimators of φk and ψk in (D1), respectively, for p ∈ P. These relaxations can be constructed by applying the procedure outlined earlier in section 2.2 or any alternative valid relaxation procedure. Because this problem is convex, a global solution can be determined, in principle, by applying conventional numerical procedures. It yields a valid lower bound on the global solution of (D1). On finite termination, this algorithm guarantees finding a global solution of (D1) within finite tolerance > 0 or terminates with an indication that the problem is infeasible (UBD ) +∞). 2.4. GDOC, a Symbolic and Numeric Library for Global Dynamic Optimization. To solve (D1) for practical problems, a general purpose program named GDOC (global dynamic optimization collection) has been developed. Because solving problem D1 is computationally expensive, a compiled code approach was selected in preference to an interpreted approach. GDOC is an open source software library that is made available, without charge, for both education and nonprofit research purposes. More information on how to obtain, install, and use GDOC can be found on the web site http://yoric.mit.edu/gdoc; also refer to the GDOC reference manual95 for additional information. The two major modules contained in GDOC are the gdoccompiler and the GDOC command-line interface (GDOC-CLI). In addition to this, the distribution contains a branch-and-bound code,96 a numerical integrator,97 and an interface to various LP and NLP solvers.98-101 It should be noted that each of these modules is meant to be interchangeable, and some interchange may be necessary due to various licensing restrictions. Using GDOC is easily divided into three distinct tasks: (i) writing a GDOC model, (ii) using the gdoc-compiler to generate the associated residual routine library, and (iii) using the GDOCCLI to obtain the desired results (optimization, simulation, or multistart analysis). Note that GDOC comes with an extensive set of test problems47,92 to illustrate usage for several different types of problems.
(i) Writing a GDOC Model. The GDOC language is a flexible language designed to allow a user without expert knowledge of optimization to perform global optimization provided he or she is capable of expressing the problem in a mathematically abstract manner. In particular, the GDOC language allows the user to specify the objective function (Bolza-type), as well as the constraining ODEs and their initial conditions. An example input file related to the case study presented in section 2.5 below is given in Appendix A. (ii) Using the gdoc-compiler. The gdoc-compiler is the part of GDOC responsible for translating the GDOC input file, applying the dynamic relaxation theory, and generating a Fortran residual file for the numerical integrator. While the upperbounding problem is easily coded manually, the generation of code (state bounds and affine relaxations) for the lowerbounding problem is indeed a very tedious and error-prone task, even for small-scale problems, and this analysis is unique for each individual problem. The gdoc-compiler runs in batch mode, which is suitable for use in a Makefile. Typically, a residual routine library called libres.so is generated from the Fortran residual file; this is illustrated in Figure 7. (iii) Using the GDOC-CLI. The GDOC-CLI is a batch interface for simulating and (globally) optimizing dynamic problems. At run time, the GDOC-CLI must dynamically bind to a shared library called libres.so that contains the residual routines generated by the gdoc-compiler.95 The principle of the GDOC-CLI in global optimization mode is illustrated in Figure 8. Despite the fact that the relaxation techniques and branchand-bound procedure outlined in subsections 2.2 and 2.3, respectively, are applicable to constrained dynamic optimization problems, the current implementation in GDOC can only handle dynamic optimization problems with simple bound constraints. This is suitable, e.g., for use in parameter estimation problems. Future versions of GDOC will allow dealing with constrained problems as well as discrete-valued decision variables (MIDO; see section 3). 2.5. Application: Parameter Estimation in Chemical Kinetics. This subsection presents an application of global dynamic optimization to parameter estimation problems in chemical kinetics.11 It considers the liquid phase reaction of resonantly stabilized cyclohexadienyl radicals with molecular oxygen to form two isomeric cyclohexadienylperoxyl radicals, a reaction which is important early in flame ignition.102 Given a set of experimental data and a candidate reaction mechanism (obtained via quantum mechanical calculations), one
Ind. Eng. Chem. Res., Vol. 45, No. 25, 2006 8383
Figure 9. Proposed reaction mechanism (left plot) and comparison between the optimized model and experimental data for T ) 298 K (right plot).
wants to determine the optimal values of the rate constants ki of the intermediate reactions in the proposed mechanism. The objective function is to minimize the sum of square errors between the measured absorbance values IE,i and the absorbance values predicted by the model IM,i:
min J (p,y) :) φ0(x(tf, p, y), p, y) +
p∈P,y∈Y
∫t t ψ0(t, x(t, p, y), p, y) dt f
0
s.t. 0 g Gk(p, y) :) φk(x(tf, p, y), p, y) +
∫t t ψk(t, x(t, p, y), p, y) dt, f
N
φ)
(IE,i - IM,i)2 ∑ i)1
0
x3 (t, p, y) ) f(t, x(t, p, y), p, y), ∀ t ∈ (t0, tf] x(t0, p, y) ) h(p, y)
with N being the number of measurements. The reaction scheme is shown in the left plot of Figure 9. Note that part of the reactions are well-studied, and the values for their rate constants are taken from the literature. Moreover, some of the rate constants are eliminated from the problem by employing equilibrium constants (e.g., k-1a, k-1b). Overall, the estimation problem therefore consists of five state variables and three degrees of freedom (k1a, k2a, and k4). The corresponding input file in the GDOC language is given in Appendix A. Note that obtaining a global solution to this problem requires 225 CPU s (on a PC with a 3.4 GHz Pentium 4 processor, 1 GB memory, running Linux 2.4.21) and 1850 nodes with GDOC, for a relative tolerance set to ) 10-3. For this problem, the application of a multistart algorithm demonstrates the existence of several distinct local optima, and a global optimization method is certainly warranted. This is illustrated in the right plot of Figure 9, which represents a set of experimental data obtained at T ) 298 K with the predictions provided by the model with either the globally optimal parameter values (obtained with GDOC) or a locally optimal set of parameter values.92 Except at very early times, one sees that the model fitted by global optimization almost perfectly matches the data; on the other hand, the fit provided by the local minimum shows systematic error relative to the experimental data. The numerical analysis of this problem therefore demonstrates one of the most important reasons for utilizing global optimization for kinetic parameter estimation problems: if one were to draw conclusions from these results based solely on the local minima, it would likely be concluded incorrectly that the proposed model demonstrates systematic error, hence risking invalidation the correct reaction mechanism. 3. Global Mixed-Integer Dynamic Optimization We now review global methods for solving mixed-integer optimization problems embedding nonlinear ODEs to global optimality. Throughout this section, we consider MIDO problems of the following form:
k ) 1, ..., ng
(D2)
where the same notation as in problem D1 has been used for the sake of consistency; additionally, y ∈ Y denotes a special set of time-invariant parameters that can only take 0-1 values, with Y ) {0, 1}ny. Note that the developments presented subsequently can be extended to address problems subject to multistage ODEs with varying transition times, as well as problems having a finite number of point and integral terms in the objective and/or constraint functions on a fixed partition interior to the time domain, but these are omitted for the sake of clarity. Also omitted in the problem formulation (D2) are function valued decision variables whose time profile either take values in some compact convex set (u : [t0, tf] f U ⊂ Rnu) or are restricted to take 0-1 values (w : [t0, tf] f {0, 1}nw), both of which can be converted into the formulation (D2) by applying standard control parametrization techniques. Finally, in contrast to other recent MIDO formulations where the dynamic systems are described by a set of DAEs,49,62 problem D2 considers embedded ODE systems only. This is because no technique has been proposed to date for constructing valid relaxations of a problem with DAEs embedded. 3.1. Outer-Approximation Roadmap for MIDO. So far, our focus has been on the use of outer-approximation (OA) algorithms72,73,79,80 in order to guarantee finding a global solution of the MIDO problem. Nevertheless, any other global MINLP algorithm, such as branch-and-bound,75-78 could be used to solve the problem provided that rigorous lower-bounding problems could be constructed from existing relaxation techniques. 3.1.1. Outer-Approximation Algorithms for MIPs. An OA algorithm proceeds by decomposing a difficult MIP into a sequence of subproblems that are easier to solve than the original problem: (i) a continuous optimization problem (referred to as the primal problem, P), obtained by fixing the binary variables in the original MIP, that provides an upper bound on the MIP solution; and (ii) a MILP (referred to as the relaxed-master problem) that provides a new assignment for the binary variables and the solution value of which is a valid lower bound for the subset of binary variable values not yet explored by the
8384
Ind. Eng. Chem. Res., Vol. 45, No. 25, 2006
Figure 10. Outer-approximation algorithm for solution of mixed-integer problems with nonconvex participating functions to guaranteed global optimality.79,80
algorithm.72,73 During the procedure, cuts are progressively added to the relaxed-master problem which restrict the feasible set, until the relaxed-master solution becomes greater than the best known upper bound (or the relaxed-master problem becomes infeasible). If the objective and constraint functions in the MIP are jointly convex with respect to the continuous and discrete valued variables, valid cuts are obtained by linearizing the objective function and (active) constraints of the primal problem, e.g., at its optimal solution. (In case a primal problem is infeasible, a so-called primal feasibility problem is considered.73) However, when the participating functions happen to be nonconvex, linearizations of the primal problem are no longer guaranteed to provide valid cuts, i.e., the procedure is prone to excluding (possibly large) feasible regions into which a global solution may occur. In response to this, an extension of the OA algorithm (GOA) has been proposed that allows solving MIPs with nonconvex participating functions to global optimality.79,80 The GOA algorithm starts by constructing a relaxation of the original nonconvex MIP (referred to as the lower-bounding problem, LBD), i.e., a MIP with convex participating functions. By construction, any linearization of the lower-bounding problem supports the original nonconvex problem and can be added as a cut into the relaxed-master problem (M). In practice, cuts are generated at the solution point of a convex primal-bounding problem (PB) obtained by fixing the binary variables in the lower-bounding problem. (Analogously to the convex case, a primal-bounding-feasibility problem, PBF, is considered in case a primal-bounding problem happens to be infeasible.) On the other hand, because of nonconvexity of the participating functions, each primal problem must be solved to global optimality. Loosely speaking, a GOA algorithms therefore proceeds by solving a sequence of global-primal, primal-bounding, and relaxed-master problems. Because solving primal problems to guaranteed global optimality is typically the most time-consuming step in the procedure, a somewhat more elaborated version of the algorithm has been proposed which consists of two interlinked loops.79 This latter algorithm is shown in Figure 10.
A sequence of primal-bounding and relaxed-master problems is solved in the inner loop, while solution of the primal problems is deferred to the outer loop, hence reducing the overall number of primal problems to be solved during an optimization. 3.1.2. Appliczation of Outer-Approximation to MIDO. A prerequisite for applying the foregoing outer-approximation algorithm to MIDO is a convexity theory for dynamic optimization and the ability to build valid convex relaxations for the functions in problem D2. In particular, the theory presented earlier on in section 2 is trivially extended to state variables that depend on both continuous and discrete valued parameters by relaxing these latter parameters y ∈ Yc, with Yc being the convex hull of Y. As mentioned previously, the solution process of a GOA algorithm is composed of an iteration between several subproblems. Its adaptation to MIDO warrants some explanations.50 (i) The primal problem is a nonconvex dynamic optimization problem obtained by fixing the binary variables y in (D2). Solving this problem to guaranteed global optimality can be done, e.g., by applying the branch-and-bound procedure presented in section 2. (ii) The lower-bounding problem is a convex MIDO problem of the form
min uJ (p, y) :) uφ0(p, y) +
p∈P,y∈Y
s.t. 0 g uφk(p, y) +
∫t t uψ (p, y, t) dt f
0
∫t t uψ (p, y, t) dt, f
0
k
0
k ) 1, ..., ng (21)
where uφk and uψk (k ) 0, ..., ng) are pointwise-in-time convex underestimators of φk and ψk in (D2), respectively, for (p, y) ∈ P × Yc. These relaxations can be constructed by applying the procedure outlined in section 2.2 or any alternative valid relaxation procedure. (iii) The primal-bounding problem is a convex dynamic optimization problem obtained by fixing the binary variables y in (21). As such, a global solution to this problem can be determined, in principle, by applying the conventional numerical procedures mentioned in the Introduction (e.g., the sequential approach).
Ind. Eng. Chem. Res., Vol. 45, No. 25, 2006 8385
Figure 11. Application of the GOA algorithm to MIDO problem 22: (left plot) state and affine relaxations at (x*(t), p*) ) ((xL(t) + xU(t))/2, (pL + pU)/2); (right plot) subproblems in the GOA algorithm.
(ii) Finally, as in the MINLP case, the relaxed-master problem is a MILP for which well-established solution procedures exist.103 The construction of these subproblems is illustrated in an example below. In addition to the foregoing MIDO algorithm (referred to as algorithm 1 subsequently), a second algorithm has been considered in which the primal problem is solved to local optimality only (algorithm 2). On finite termination, algorithm 1 guarantees finding the global solution value of problem D2 within finite tolerance (or terminates with an indication that the MIDO problem is infeasible), while algorithm 2 finds rigorous bounds bracketing the global solution value of (D2) (and returns a potentially suboptimal solution). The advantage of the latter is a substantial reduction in computational expense as will be illustrated by the case study in section 3.2. Illustrative Example 3. Consider the following MIDO problem:
min J (p, y) :)
p∈P,y∈Y
∫01 p2 - [x(t, p, y)]2 dt
s.t. x˘ (p, y, t) ) -0.1[x(t, p, y) - p]2 + yx(t, p, y) ∀ t ∈ (0, 1] x(0, p, y) ) p2 - 0.5
(22)
with P ) [-3, 2] and Y ) {0, 1}. Problem 23 is a nonconvex MIDO problem, with a global solution value of J* ) -38.219 attained at (p*, y*) ) (-3, 1). A direct consequence of these nonconvexities is the presence of suboptimal local minima in each of the primal problems. This is illustrated in the right plot of Figure 11. In this example, any cut obtained from the linearization of a primal problem objective function (at a point other than the global minimum itself) excludes the global solution from the feasible region. Hence, an OA algorithm generating cuts from the nonconvex primal problems directly would fail to locate the global minimum and would terminate at an arbitrary suboptimal point. Applying the GOA algorithm for locating the global solution of (22) starts by constructing a lower-bounding problem. For so doing, pointwise-in-time relaxations of the state variable x on P × Yc ) [-3, 2] × [0, 1] are needed. Applying the relaxation procedure described in section 2.2.1 to the differential equation in (22) presents many similarities with example 2 and is not repeated here for the sake of conciseness. The resulting affine state relaxation functions ux and ox at final time are
depicted in Figure 11, left plot. A convex relaxation uJ of J on P × Yc is then immediately obtained from (19). Such a relaxation is depicted in dotted lines in Figure 11 (right plot). Next, the primal-bounding problems are obtained by assigning a fixed value to the binary variable y; this yields the dashed curves in Figure 11 (right plot). Finally, the cuts added to the relaxedmaster problem correspond to the supporting planes at the minimum point of the primal-bounding problem. It is easily seen that these cuts are valid, for the lower-bounding objective function is convex on [-3, 2] × [0, 1] (see Figure 11, right plot). Numerical Considerations. The efficiency of the GOA algorithm can be further improved by applying bounds tightening techniques in order to reduce the size of the parameter space. As the parameter bounds are tightened, so are the convex relaxations for the nonconvex functions, potentially enhancing the convergence of the iterative procedure. The time required to solve the primal problem to global optimality may also be improved as the search space is reduced. In the GOA algorithm presented above, a bounds tightening procedure can be considered immediately before (or after) solution of a relaxed-master problem. It consists of solving 2 × np additional MILP problems that find the minimum and maximum values of each parameter so that the constraints in the actual relaxed-master problem are satisfied. Clearly, whether bounds tightening techniques should or should not be applied results from a tradeoff between the computational expense associated with the additional MILPs and the overall reduction of computational time obtained from convergence enhancement. Our experience in solving MIDO problems indicates that the bounds tightening procedure is generally beneficial in algorithm 1 because the majority of the CPU time is devoted to solving, globally, the primal problems. On the contrary, bounds tightening tends to no longer be beneficial in algorithm 2, as the solution time for MILPs dominates. In this latter case, one may want to solve the bounds tightening problems once every N g 1 iterations to reduce the cost associated with the additional MILPs. Use of bound tightening techniques is illustrated in the case study presented in sectioin 3.2. 3.2. Application: Batch Process Development. The application of the MIDO algorithms is demonstrated with an example based on the two-stage batch process shown in Figure 12.50 It consists of a series reaction (A f B f C) followed by separation with no intermediate storage (NIS). The objective is to select the optimal process design and operation that minimizes
8386
Ind. Eng. Chem. Res., Vol. 45, No. 25, 2006
Figure 12. Task network of the two-stage batch process. Table 1. Results For the Batch Process Development Case Study (PR primal problem; PB primal bounding problem; RM relaxed master problem) CPU timeb (s) heuristics without domain reduction with domain reduction with domain reduction & screening model cuts without domain reduction with domain reduction with domain reduction & screening model cuts
iterationsa
PR
PB
RM
total
MIDO Algorithm 1 1 305/1 266 53 631 1 082/1 062 50 416 512/482 12 243
272 235 303
1 584 703 420
55 656 51 465 13 020
MIDO Algorithm 2 1 305/1 266 334 1 082/1 062 371 512/482 66
269 234 298
1 587 703 421
2 357 1 417 839
a Number of PB and RM problems solved/number of PR problems solved. PC with a 3.4 GHz Pentium 4 processor, 1 GB memory, running Linux 2.4.21.
b
the overall manufacturing cost subject to a fixed production amount of B in given time. The model equations are derived by assuming that the process operates at a cyclic steady state. The concentration profiles of the reactant A and products B and C are governed by a set of ODEs, assuming first-order kinetics, whereas a perfect split of components is assumed for the batch distillation. The resulting problem is a nonconvex MIDO problem that contains 1 control variable (reactor temperature), 4 design parameters (duration of the manufacturing campaign, batch reactor cycle time, batch distillation cycle time, and reactor volume), 39 binary variables (equipment selection), and 1 integer variable (total number of batches). By accounting for the existing SOS1 (special ordered set of type 1; a set of variables is said to be SOS1 when at most one variable in the set can be nonzero104) sets of binary variables, the total number of discrete alternatives in this problem is 8316. The computational times for solving this problem are reported in Table 1. The application of algorithm 1 provides the global
solution to the problem in about 14.3 and 15.5 h depending on whether domain reduction is applied; in both cases, total enumeration of the process structure alternatives is avoided as only 1082 binary realizations out of 8316 were visited in the former case and 1305 in the latter. As expected, the major computational expense derives from the solution of primal problems to -optimality. Roughly, the more promising binary realizations are visited early by the outer-approximation algorithm, and the upper bound rapidly reaches the global minimum of the problem (found after about 200 iterations); this can be seen from Figure 13 (left plot), which depicts the solutions of the primal and relaxed-master problems versus the iteration count. By using the current upper bound as an incumbent in the branch-and-bound procedure, the computational expense for subsequent primal problems is then progressively reduced as it becomes faster to detect whether a given binary realization will yield a worse upper bound or is infeasible. These considerations are illustrated in the right plot of Figure 13, which depicts the computational expense versus the iteration count. The application of algorithm 2 was also considered to solve this problem. It is worth noting that this simplified algorithm finds the global solution despite the fact that no theoretical guarantee can be given. In addition, the CPU time required to solve the problem is dramatically reduced, for algorithm 2 terminates in approximately 40 and 24 min without and with the application of domain reduction techniques, respectively. To enhance the convergence of the algorithms, valid cuts derived from the screening model53,54 of the batch process are added to the lower-bounding-convex MIDO problem to tighten the relaxations. The results obtained are also reported in Table 1 for algorithms 1 and 2. One sees that the addition of screening model cuts significantly reduces the overall number of iterations and, correlatively, the overall CPU time, for the use of tighter relaxations allows the algorithms to exclude a larger number of binary realizations. These results therefore demonstrate that large benefits can be realized by considering screening models for batch process development purposes. 4. Future Directions Over the past few years, impressive results have been obtained in the global optimization of problems with ODEs embedded, formerly a class of problems for which rigorous global optimization algorithms did not exist. As just an example, the computational cost of branch-and-bound procedures has been reduced by 2-3 orders of magnitude since the first publications in this area. This mainly owes to the development of tighter convex relaxations for functionals subject to embedded ODEs,
Figure 13. Results for the batch process development case study: (left plot) MIDO iterations; (right plot) CPU time (with domain reduction heuristics).
Ind. Eng. Chem. Res., Vol. 45, No. 25, 2006 8387
which in addition can be computed more efficiently.46,47 But despite these tremendous improvements, (mixed-integer) dynamic optimization problems involving more than about 10 degrees of freedom (continuous decision variables) tend to have prohibitive solution times, and some convergence difficulties may still be experienced, in practice, for certain types of differential systems. Still a lot of work is thus needed in order to overcome these obstacles. The major issue governing the slow convergence of the algorithms is the weakness of the current convex relaxations for many problems (i.e., the lower bound computed is much lower than the global minimum). Several factors contribute to the weakness of the overall relaxation in the ODE case: (1) the well-established techniques employed to relax the elementary functions participating in the objective function, constraints, and right-hand sides of the differential equations, on the set furnished by the implied state bounds;38,43,85 (2) the theoretical technique employed to construct bounding differential equations whose solutions are convex and concave, respectively, pointwise in time;46 (3) the method for computing the implied state bounds.46,89 Our experience with a growing set of test problems indicates that the largest source of weakness in the relaxations is the weakness of the implied state bounds, followed by the use of linear bounding differential equations. The first factor contributes much less, because often the convex envelope is available. Furthermore, in general, there seems much less scope to improve the already tight relaxations furnished by McCormick’s technique in conjunction with the known tight relaxations for many functions of special structure. Besides relaxation tightness, the fact that branch-and-bound is a viable tool for global optimization only if it is supplemented with appropriate algorithmic heuristics is well established for NLP problems.75,76,87,105,106 This observation applies equally to dynamic optimization problems. Both domain reduction heuristics and node partitioning schemes are particularly relevant to the branch-and-bound procedure. As regards domain reduction, many techniques developed in the NLP case can be readily extended to the ODE embedded problems. On the other hand, there is still a lack of efficient node partitioning heuristics for optimization problems with embedded ODEs, especially for selection of the branching variables. This is a serious obstacle to the application of branch-and-bound to problems containing more than a few degrees of freedom, e.g., for solving MIDO problems (see section 3). Implied State Bounds. Methods for computing the implied state bounds inherently overestimate the true image of the parameter set P under the solution of the ODEs. This weakens the overall relaxation because much larger sets than necessary are employed to construct the relaxations. Moreover, in the nonquasi-monotone case, the application of standard techniques based on differential inequalities89,90 often yields bounds that rapidly explode in value on short time scales (see, e.g., example 1 for an illustration). As already mentioned in section 2.2.1, our prior work mitigates the “bound explosion” often experienced when applying Harrison’s theorem by incorporating natural bounds and solution invariants that are known independently from the statement of the differential equations themselves.46 For example, when modeling the concentrations of chemical species undergoing chemical reactions, both global upper and lower bounds on the concentrations can generally be obtained from the principle of mass conservation. In our theory, these bounds and solution invariants are incorporated directly as constraints in the optimization problems of Harrison’s theorem, thus
Figure 14. Nonlinear and affine state relaxations for the solutions of differential system 7.
theoretically strengthening the implied state bounds. We have found in practice that this advance makes the difference between being able to solve and not being able to solve many dynamic optimization problems involving non-quasi-monotone ODEs. In the general case, however, the solution invariants may be complex functions of both the state variables and the parameters, and it may no longer be easy to derive natural bounds for the state variables from these expressions. Moreover, the estimates of the infima and suprema in bounding eqs 2 may be extremely pessimistic when generated by simple procedures such as interval arithmetic, because they cannot take into account complex constraints in the bounding optimization problems. In light of this, we are currently exploring computational methods that expend more effort to yield tighter estimates of the infima and suprema, in the hope that this provides stronger and more rapidly converging implied state bounds and reduces the elapsed time of global optimization algorithms. Another avenue to explore concerns the bounding techniques used routinely for verifying and enclosing solutions of initial value problems in ODEs, such as the interval Taylor series method.107,108 Adjoint-based procedures also offer some promise to improve the bounds for non-quasi-monotone differential equations.109 Convex/Concave State Relaxations. The ability to construct tight convex relaxations for functions with state variables participating, such as the objective and constraint functionals in problems D1 and D2, is one of the most important aspects of the global optimization algorithms, for it has a very large effect on their convergence rates. As mentioned in section 2.2, a general procedure has been developed in our earlier work in order to relax such functionals. This procedure relies on the use of McCormick’s composition result43 in conjunction with convex and concave relaxations for the solution of a system of ODEs. However, unlike algebraic functions for which wellestablished and widely applicable relaxation techniques exist, general methods enabling construction of convex/concave relaxations for the solutions of a differential system are not yet sufficiently developed and still present many challenges. In previous work,46 we have developed a general method based on the construction of auxiliary linear systems, the solutions of which underestimate/overestimate the solutions of the original differential system and are affine in p. So far, the application of this relaxation procedure has shown its efficiency for solving small-scale problems to global optimality, i.e., problems containing a few degrees of freedom only.47,50 Before one can use these techniques to optimize larger problems, several issues must be addressed. First, the construction of the convex/
8388
Ind. Eng. Chem. Res., Vol. 45, No. 25, 2006
Chart 1. GDOC Input File for Parameter Estimation Problem
Ind. Eng. Chem. Res., Vol. 45, No. 25, 2006 8389 Chart 1 (Continued)
concave relaxation requires that a reference trajectory be specified a priori. Such a reference trajectory is used for building the outer-approximations, and it is self-evident that a “bad” choice can significantly affect the rate of convergence of the algorithms. More importantly maybe, the influence of the reference trajectory is expected to increase as the dimensions of the state and parameter spaces increase. Despite these important considerations, it is still unclear how to select the reference trajectories. Second, while new results have been obtained that enable knowledge concerning solution invariants to be incorporated in the computation of implied state bounds,46 the use of these invariants has not been investigated thus far for improving the convex/concave bounds. It is envisioned, however, that solution invariants would also significantly tighten the relaxations, especially when dealing with non-quasimonotone differential systems. Recently, very encouraging results have been obtained that suggest a general theory for the construction of general convex/
concave relaxations of the solutions of parameter dependent nonlinear ODEs.110 Let P ⊂ Rnp be a nonempty convex set, and consider the nonlinear scalar differential equation
x˘ ) g(t, x, p) ∀ t ∈ (t0, tf] x(t0, p) ) h(p)
(23)
It has been shown that the solution functions x(t, ‚) are pointwise-in-time convex on P provided that h is convex on P and g(t, ‚, ‚) is pointwise-in-time convex on X(t) × P, with X(t) ⊃ {z ∈ R: ∃p ∈ P, z ) x(t, p)}. Sufficient conditions for the relaxations of systems of ODEs to be convex or concave have also been established.110 To illustrate it, consider the scalar differential system (eq 7) studied in example 2. The solution functions ux(t, ‚) and ox(t, ‚) of the following IVPs provide nonlinear convex and concave relaxations of x(t, ‚), respectively:
8390
Ind. Eng. Chem. Res., Vol. 45, No. 25, 2006
u˘ x(t, p) ) -(ux(t, p) - p)(xU(t) - xL(t) + pU - pL) + (xU(t) - pL)(xL(t) - pU) ∀ t ∈ (t0, tf] ux(t0, p) ) p2 - 0.5 and
o˘ x(t, p) ) -(ox(t, p) - p)2 ∀ t ∈ (t0, tf] ox(t0, p) ) p(pL + pU) - pLpU with the implied state bounds xL and xU given by (8) and (9). It is easily verified that the right-hand sides of the foregoing differential equations are either convex or concave on P × R, and so are their respective initial conditions on P. This is illustrated in Figure 14 which displays x(1, ‚) at final time, together ux(1, ‚) and ox(1, ‚). For the sake of comparison, affine state relaxations are also shown in this figure for a set of reference trajectories. A very desirable property of the developed nonlinear theory is that the specification of a reference trajectory is no longer needed for calculating the relaxations. Moreover, the nonlinear state relaxations being tighter than the affine ones, one may expect a significant improvement of the convergence rate to a global solution in the algorithms. From the standpoint of overall CPU time however, it will not be necessarily advantageous to use the tightest known relaxations. This is because, by relinquishing the affine structure of the relaxations, the number of numerical integrations of the relaxed differential system that is required to solve each lower-bounding problem significantly increases. In addition, the nonlinear relaxations may be nonsmooth at some points, which will necessitate the use of specific NLP solvers. Therefore, further developments of both theoretical and practical aspects of the nonlinear relaxation framework are clearly warranted to exploit fully the potential of these novel relaxations. Acknowledgment B.C. is grateful to the French Ministry of Foreign Affairs (Lavoisier postdoctoral fellowships) for financial support. This work is based upon work supported by the National Science Foundation under Grant CTS-0120441. Appendix A See Chart 1 for the complete GDOC input file for the parameter estimation problem. Literature Cited (1) Rippin, D. W. T. Simulation of single- and multiproduct batch chemical plants for optimal design and operation. Comput. Chem. Eng. 1983, 7 (3), 137-156. (2) Charalambides, M. S. Optimal design of integrated batch processes, Ph.D. Thesis, University of London, UK, 1996. (3) Srinivasan, B.; Palanki, S.; Bonvin, D. Dynamic optimization of batch processes - I. Characterization of the nominal solution. Comput. Chem. Eng. 2003, 27 (1), 1-26. (4) Fikar, M.; Latifi, M. A.; Creff, Y. Optimal changeover profiles for an industrial depropanizer. Chem. Eng. Sci. 1999, 54 (13), 2715-2720. (5) Flores-Tlacuahuac, A.; Biegler, L. T.; Saldivar-Guerra, E. Dynamic optimization of HIPS open-loop unstable polymerization reactors. Ind. Eng. Chem. Res. 2005, 44 (8), 2659-2674. (6) Chachuat, B.; Mitsos, A.; Barton, P. I. Optimal design and steadystate operation of micro power generation employing fuel cells. Chem. Eng. Sci. 2005, 60 (16), 4535-4556.
(7) Nilchan, S.; Pantelides, C. C. On the optimisation of periodic adsorption processes. Adsorption 1998, 4 (2), 113-147. (8) Raja, L. L.; Kee, R. J.; Serban, R.; Petzold, L. R. Computational algorithm for dynamic optimization of chemical vapor deposition processes in stagnation flow reactors. J. Electrochem. Soc. 2000, 147 (7), 27182726. (9) Bock, H. G. Numerical treatment of inverse problems in chemical reaction kinetics, In Modelling of Chemical Reaction Systems; Ebert, K. H., Deuflhard, P., Ja¨ger, W., Eds.; Springer Series in Chemical Physics; Springer-Verlag: Heidelberg, Germany, 1981; Vol. 18, pp 102-125. (10) Walter, E.; Pronzato, L. Identification of Parametric Models from Experimental Data; Communications and Control Engineering; Springer, London, UK, 1997. (11) Singer, A. B.; Taylor, J. W.; Barton, P. I.; Green, W. H., Jr. Global dynamic optimization for parameter estimation in chemical kinetics. J. Phys. Chem. A 2006, 110 (3), 971-976. (12) Emery, A. F.; Nenarokomov, A. V. Optimal experiment design. Meas. Sci. Technol. 1998, 9, 864-876. (13) Bauer, I.; Bock, H. G.; Ko¨rkel, S.; Schlo¨der, J. P. Numerical methods for optimum experimental design in DAE systems. J. Comput. Appl. Math. 2000, 120, 1-25. (14) Diehl, M.; Bock, H. G.; Schlo¨der, J. P.; Findeisen, R.; Allgower, F. Real-time optimization and nonlinear model predictive control of processes governed by differential-algebraic equations. J. Process Control 2002, 12 (4), 577-585. (15) Martinsen, F.; Biegler, L. T.; Foss, B. A. A new optimization algorithm with application to nonlinear MPC. J. Process Control 2004, 14 (8), 853-865. (16) Robertson, D. G.; Lee, J. H.; Rawlings, J. B. A moving horizonbased approach for least-squares estimation. AIChE J. 1996, 42 (8), 22092224. (17) Alamir, M. Optimisation-based nonlinear observers revisited. Int. J. Control 1999, 72 (13), 1204-1217. (18) Troutman, J. L. Variational Calculus and Optimal Control: Optimization with Elementary ConVexity, 2nd ed.; Springer: New York, 1996. (19) Bryson, A. E.; Ho, Y. Applied Optimal Control; Hemisphere: New York, 1975. (20) Neuman, C. P.; Sen, A. A suboptimal control algorithm for constrained problems using cubic splines. Automatica 1973, 9, 601-603. (21) Tsang, T. H.; Himmelblau, D. M.; Edgar, T. F. Optimal control via collocation and nonlinear programming. Int. J. Control 1975, 21, 763768. (22) Cuthrell, J. E.; Biegler, L. T. On the optimization of differentialalgebraic process systems. AIChE J. 1987, 33, 1257-1270. (23) Biegler, L. T.; Cervantes, A. M.; Wa¨chter, A. Advances in simultaneous strategies for dynamic process optimization. Chem. Eng. Sci. 2002, 57 (4), 575-593. (24) Brusch, R.; Schappelle, R. Solution of highly constrained optimal control problems using nonlinear programming. AIAA J. 1973, 11 (2), 135136. (25) Teo, K. L.; Goh, C. J.; Wong, K. H. A Unified Computational Approach to Optimal Control Problems; Pitman Monographs and Surveys in Pure and Applied Mathematics; John Wiley & Sons: New York, 1991. (26) Maly, T.; Petzold, L. R. Numerical methods and software for sensitivity analysis of differential-algebraic systems. Appl. Numer. Math. 1996, 20 (1-2), 57-79. (27) Feehery, W. F.; Tolsma, J. E.; Barton, P. I. Efficient sensitivity analysis of large-scale differential-algebraic systems. Appl. Numer. Math. 1997, 25 (1), 41-54. (28) Cao, Y.; Li, S.; Petzold, L. R. Adjoint sensitivity analysis for differential-algebraic equations: Algorithms and software. J. Comput. Appl. Math. 2002, 149, 171-191. (29) Bock, H. G.; Plitt, K. J. A multiple shooting algorithm for direct solution of optimal control problems, In Proceedings of the IFAC 9th World Congress, Budapest, Hungary, 1984; pp 242-247. (30) Banga, J. R.; Seider, W. D. Global optimization of chemical processes using stochastic algorithms. In State of the Art in Global Optimization; Floudas, C. A., Pardalos, P. M., Eds.; Nonconvex Optimization and Its Applications; Kluwer Academic Publishers: Norwell, MA, 1996; Vol. 7, pp 563-583. (31) Luus, R. IteratiVe Dynamic Programming; Chapman & Hall: Boca Raton, FL, 2000. (32) Luus, R.; Cormack, D. E. Multiplicity of solutions resulting from the use of variational methods in optimal control problems. Can. J. Chem. Eng. 1972, 50, 309-311. (33) Luus, R.; Dittrich, J.; Keil, F. J. Multiplicity of solutions in the optimization of a bifunctional catalyst blend in a tubular reactor. Can. J. Chem. Eng. 1992, 70, 780-785.
Ind. Eng. Chem. Res., Vol. 45, No. 25, 2006 8391 (34) Granvilliers, L.; Cruz, J.; Barahona, P. Parameter estimation using interval computations. SIAM J. Sci. Comput. 2004, 26 (2), 591-612. (35) Balsa-Canto, E.; Vassiliadis, V. S.; Banga, J. R. Dynamic optimization of single- and multi-stage systems using a hybrid stochasticdeterministic method. Ind. Eng. Chem. Res. 2005, 44 (5), 1514-1523. (36) Horst, R.; Tuy, H. Global Optimization: Deterministic Approaches, 3rd ed.; Springer-Verlag: Berlin, Germany, 1996. (37) Maranas, C. D.; Floudas, C. A. Global minimum potential energy conformations of small molecules. J. Global Optim. 1994, 4, 135-170. (38) Adjiman, C. S.; Dallwig, S.; Floudas, C. A.; Neumaier, A. A global optimization method, RBB, for general twice-differentiable constrained NLPs - I. Theoretical advances. Comput. Chem. Eng. 1998, 22 (9), 11371158. (39) Esposito, W. R.; Floudas, C. A. Deterministic global optimization in nonlinear optimal control problems. J. Global Optim. 2000, 17, 96126. (40) Esposito, W. R.; Floudas, C. A. Global optimization for the parameter estimation of differential-algebraic systems. Ind. Eng. Chem. Res. 2000, 39 (5), 1291-1310. (41) Papamichail, I.; Adjiman, C. S. A rigorous global optimization algorithm for problems with ordinary differential equations. J. Global Optim. 2002, 24, 1-33. (42) Chachuat, B.; Latifi, M. A. A new approach in deterministic global optimization of problems with ordinary differential equations. In Frontiers in Global Optimization; Floudas, C. A., Pardalos, P. M., Eds.; Nonconvex Optimization and Its Applications; Kluwer Academic Publishers: Norwell, MA, 2003; Vol. 74, 83-108. (43) McCormick, G. P. Computability of global solutions to factorable nonconvex programs: Part I - Convex underestimating problems. Math. Program. 1976, 10, 147-175. (44) McCormick, G. P. Nonlinear Programming: Theory, Algorithms and Applications; John Wiley & Sons: New York, 1983. (45) Singer, A. B.; Barton, P. I. Global solution of linear dynamic embedded optimization problems. J. Optim. Theory Appl. 2004, 121 (3), 149-182. (46) Singer, A. B.; Barton, P. I. Bounding the solutions of parameter dependent nonlinear ordinary differential equations. SIAM J. Sci. Comput. 2006, 27 (6), 2167-2182. (47) Singer, A. B.; Barton, P. I. Global optimization with nonlinear ordinary differential equations. J. Global Optim. 2006, 34 (2), 159-190. (48) Allgor, R. J.; Barton, P. I. Mixed-integer dynamic optimization. Comput. Chem. Eng. 1997, 21 (S), S451-S456. (49) Allgor, R. J.; Barton, P. I. Mixed-integer dynamic optimization. I-Problem formulation. Comput. Chem. Eng. 1999, 23 (4/5), 567-584. (50) Chachuat, B.; Singer, A. B.; Barton, P. I. Global mixed-integer dynamic optimization. AIChE J. 2005, 51 (8), 2235-2253. (51) Allgor, R. J.; Barrera, M. D.; Barton, P. I.; Evans, L. B. Optimal batch process development. Comput. Chem. Eng. 1996, 20 (6/7), 885896. (52) Bhatia, T. K.; Biegler, L. T. Dynamic optimization in the design and scheduling of multiproduct batch plants. Ind. Eng. Chem. Res. 1996, 35 (7), 2234-2246. (53) Allgor, R. J.; Evans, L. B.; Barton, P. I. Screening models for batch process development, Part 1. Design targets for reaction/distillation networks. Chem. Eng. Sci. 1999, 54, 4145-4164. (54) Allgor, R. J.; Barton, P. I. Screening models for batch process development, Part 2. Case studies. Chem. Eng. Sci. 1999, 54, 4165-4187. (55) Sharif, M.; Shah, N.; Pantelides, C. C. On the design of multicomponent batch distillation columns. Comput. Chem. Eng. 1998, 22 (S), S69-S76. (56) Oldenburg, J.; Marquardt, W.; Heinz, D.; Leineweber, D. B. Mixedlogic dynamic optimization applied to batch distillation process design. AIChE J. 2003, 49 (11), 2900-2917. (57) Low, K. H.; Sorensen, E. Simultaneous optimal design and operation of multipurpose batch distillation columns. Chem. Eng. Process. 2004, 43, 273-289. (58) Giovanoglou, A.; Barlatier, J.; Adjiman, C. S.; Pistikopoulos, E. N.; Cordiner, J. L. Optimal solvent design for batch separation based on economic performance. AIChE J. 2003, 49 (12), 3095-3109. (59) Mohideen, M. J.; Perkins, J. D.; Pistikopoulos, E. N. Towards an efficient numerical procedure for mixed integer optimal control. Comput. Chem. Eng. 1997, 21 (S), S457-S462. (60) Schweiger, C. A.; Floudas, C. A. Interaction of design and control: Optimization with dynamic models. In Optimal Control: Theory, Algorithms, and Applications; Hager, W. W., Pardalos, P. M., Eds.; Kluwer Academic Publishers: Norwell, MA, 1997; pp 388-435. (61) Kookos, I. K.; Perkins, J. D. An algorithm for simultaneous process design and control. AIChE J. 2001, 40, 4079-4088.
(62) Bansal, V.; Sakizlis, V.; Ross, R.; Perkins, J. D.; Pistikopoulos, E. N. New algorithms for mixed-integer dynamic optimization. Comput. Chem. Eng. 2003, 27, 647-668. (63) Petzold, L. R.; Zhu, W. Model reduction for chemical kinetics: An optimization approach. AIChE J. 1999, 45, 869-886. (64) Androulakis, I. P. Kinetic mechanism reduction based on an integer programming approach. AIChE J. 2000, 46 (2), 361-371. (65) Gala´n, S.; Barton, P. I. Dynamic optimization of hybrid systems. Comput. Chem. Eng. 1998, 22 (S), S183-S190. (66) Barton, P. I.; Allgor, R. J.; Feehery, W. F.; Gala´n, S. Dynamic optimization in a discontinuous world. Ind. Eng. Chem. Res. 1998, 37 (3), 966-981. (67) Avraam, M. P.; Shah, N.; Pantelides, C. C. A decomposition algorithm for the optimisation of hybrid dynamic processes. Comput. Chem. Eng. 1999, 23 (S), S451-S454. (68) Barton, P. I.; Banga, J. R.; Gala´n, S. Optimization of hybrid discrete/ continuous dynamic systems. Comput. Chem. Eng. 2000, 24 (9/10), 21712182. (69) Barton, P. I.; Lee, C. K. Design of process operations using hybrid dynamic optimization. Comput. Chem. Eng. 2004, 28 (6/7), 955-969. (70) Dimitriadis, V. D.; Shah, N.; Pantelides, C. C. Modeling and safety verification of discrete/continuous processing systems. AIChE J. 1997, 43 (4), 1041-1059. (71) Geoffrion, A. M. Generalized Benders decomposition. J. Optim. Theory Appl. 1972, 10, 237-262. (72) Duran, M. A.; Grossmann, I. E. An outer-approximation algorithm for a class of mixed-integer nonlinear programs. Math. Program. 1986, 36, 307-339. (73) Fletcher, R.; Leyffer, S. Solving mixed-integer nonlinear programs by outer approximation. Math. Program. 1994, 66, 327-349. (74) Grossmann, I. E. Review of nonlinear mixed-integer and disjunctive programming techniques for process systems engineering. Optim. Eng. 2002, 3 (3), 227-252. (75) Ryoo, H. S.; Sahinidis, N. V. Global optimization of nonconvex NLPs and MINLPs with applications in process design. Comput. Chem. Eng. 1995, 19, 551-566. (76) Tawarmalani, M.; Sahinidis, N. V. Global optimization of mixedinteger nonlinear programs: A theoretical and practical study. Math. Program. 2004, 99 (3), 563-591. (77) Adjiman, C. S.; Androulakis, I. P.; Floudas, C. A. Global optimization of mixed-integer nonlinear problems. AIChE J. 2000, 46 (9), 17691797. (78) Smith, E. M. B.; Pantelides, C. C. A symbolic reformulation/spatial branch-and-bound algorithm for the global optimization of nonconvex MINLPs. Comput. Chem. Eng. 1999, 23, 457-478. (79) Kesavan, P.; Allgor, R. J.; Gatzke, E. P.; Barton, P. I. Outerapproximation algorithms for separable nonconvex mixed-integer nonlinear programs. Math. Program. 2004, 100 (3), 517-535. (80) Kesavan, P.; Barton, P. I. Decomposition algorithms for nonconvex mixed-integer nonlinear programs. AIChE Symp. Ser. 2000, 96, 458-461. (81) Kesavan, P.; Barton, P. I. Generalized branch-and-cut framework for mixed- integer nonlinear optimization problems. Comput. Chem. Eng. 2000, 24, 1361-1366. (82) Lee, C. K.; Barton, P. I. Determining the optimal mode sequence. In Proceedings of the IFAC Conference on Analysis and Design of Hybrid Systems; Engell, S., Zaytoon, J., Gueguen, H., Eds.; St-Malo, France, 2003. (83) Lee, C. K.; Barton, P. I. Global optimization of linear hybrid systems with varying transition times. SIAM J. Control Optim. 2006, submitted for publication. (84) Teo, K. L.; Jennings, L. S.; Lee, H.; Rehbock, V. The control parametrization enhancing transform for constrained optimal control problems. J. Aust. Math. Soc. (Ser. B) 1999, 40 (3), 314-335. (85) Falk, J. E.; Soland, R. M. An algorithm for separable nonconvex programming problems. Manag. Sci. 1969, 15 (9), 550-569. (86) Soland, R. M. An algorithm for separable nonconvex programming problems II: Nonconvex constraints. Manag. Sci. 1971, 17, 759-773. (87) Adjiman, C. S.; Androulakis, I. P.; Floudas, C. A. A global optimization method, RBB, for general twice-differentiable constrained NLPs - II. Implementation and computational results. Comput. Chem. Eng. 1998, 22 (9), 1159-1179. (88) Gatzke, E. P.; Tolsma, J. E.; Barton, P. I. Construction of convex function relaxations using automated code generation. Optim. Eng. 2002, 3 (3), 305-326. (89) Harrison, G. W. Dynamic models with uncertain parameters. In Proceedings of the 1st International Conference on Mathematical Modeling; Avula, X., Ed.; St. Louis, MO, August 29-September 1, 1977; Vol. 1, pp 295-304. (90) Walter, W. Differential and Integral Inequalities; SpringerVerlag: Berlin, Germany, 1970.
8392
Ind. Eng. Chem. Res., Vol. 45, No. 25, 2006
(91) Moore, R. E. Methods and Applications of InterVal Analysis; SIAM: Philadelphia, PA, 1979. (92) Singer, A. B. Global Dynamic Optimization, Ph.D. Thesis, Massachusetts Institute of Technology, Cambridge, MA, 2004. (93) Lee, C. K.; Singer, A. B.; Barton, P. I. Global optimization of linear hybrid systems with explicit transitions. Syst. Contr. Lett. 2004, 51, 363375. (94) Park, T.; Barton, P. I. State event location in differential-algebraic models. ACM Trans. Model. Comput. Simul. 1996, 6 (2), 137-165. (95) Singer, A. B.; Chachuat, B.; Barton, P. I. GDOC Version 1.0 manual; Massachusetts Institute of Technology: Cambridge, MA, 2005. (96) Singer, A. B. LibBandB Version 3.2 manual; Massachusetts Institute of Technology: Cambridge, MA, 2004. (97) Hindmarsh, A. C.; Serban, R. User documentation for CVODES, an ODE solVer with sensitiVity analysis capabilities; Lawrence Livermore National Laboratory: Livermore, CA, 2002. (98) Berkelaar, M.; Eikland, K.; Notebaert, P. Introduction to lp_solVe 5.1.1.3; 2004. URL: http://lpsolve.sourceforge.net/5.1/. (99) Wa¨chter, A.; Biegler, L. T. On the implementation of an interiorpoint filter-line search algorithm for large-scale nonlinear programming. Math. Program. 2006, 106 (1), 25-57. (100) Schittkowski, K. NLPQL: A fortran subroutine solving constrained nonlinear programming problems. Ann. Oper. Res. 1985/1986, 5, 485500. (101) Gill, P. E.; Murray, W.; Saunders, M. SNOPT: An SQP algorithm for large-scale constrained optimization. SIAM J. Optim. 2002, 12, 9791006. (102) Taylor, J. W.; Ehlker, G.; Cartensen, H. H.; Ruslen, L.; Field, R. W.; Green, W. H., Jr. Direct measurement of the fast, reversible reaction of cyclohexadienyl radicals with oxygen in nonpolar solvents. J. Phys. Chem. A 2004, 108 (35), 7193-7203.
(103) Johnson, E. L.; Nemhauser, G. L.; Savelsbergh, M. W. P. Progress in linear programming based branch-and-bound algorithms: Exposition. INFORMS J. Comput. 2000, 12. (104) Beale, E. M. L.; Tomlin, J. A. Special facilities in a general mathematical programming system for nonconvex problems using ordered sets of variables. In Proceedings of the 5th International Conference on Operations Research; Tavistock Publications: London, 1970; pp 447-454; Venice, Italy, June, 1969. (105) Zamora, J.; Grossmann, I. A branch and contract algorithm for problems with concave univariate, bilinear and linear fractional terms. J. Global Optim. 1999, 14, 217-249. (106) Tawarmalani, M.; Sahinidis, N. V. ConVexification and Global Optimization in Continuous and Mixed-Integer Nonlinear Programming: Theory, Algorithms, Software, and Applications; Nonconvex Optimization and Its Applications; Kluwer Academic Publishers: Boston, MA, 2002. (107) Rihm, R. Interval methods for initial value problems in ODE’s. In Topics in Validated Computations; Herzberger, J., Ed.; Elsevier: Amsterdam, 1994; pp 173-207. (108) Nedialkov, N. S.; Jackson, K. R.; Corliss, G. F. Validated solutions of initial value problems for ordinary differential equations. Appl. Math. Comput. 1999, 105 (1), 21-68. (109) Harrison, G. W. Compartmental models with uncertain flow rates. Math. Biosci. 1979, 43, 131-139. (110) Chachuat, B.; Barton, P. I. Sufficient conditions for convex/ concave relaxations of the solutions of nonlinear ODEs, 2006, in preparation.
ReceiVed for reView February 8, 2006 ReVised manuscript receiVed June 19, 2006 Accepted June 26, 2006 IE0601605