Ind. Eng. Chem. Res. 2007, 46, 492-504
Real-Time Optimization of Batch Processes by Tracking the Necessary Conditions of Optimality

B. Srinivasan* (Department of Chemical Engineering, École Polytechnique Montréal, Montreal, Canada H3C 3A7)

D. Bonvin (Laboratoire d'Automatique, École Polytechnique Fédérale de Lausanne, CH-1015 Lausanne, Switzerland)
* To whom correspondence should be addressed. Tel.: 1-514-340-4711, ext. 7472. Fax: 1-514-340-5150. E-mail address: [email protected].

The use of measurements to compensate for the effect of uncertainty has recently gained attention in the context of real-time optimization of dynamic systems. The commonly used approach consists of updating a process model and performing numerical optimization using the refined model. In contrast, this paper presents a two-level approach that does not require repeating the optimization: At the upper level, the constraints that are active in the optimal solution are identified from optimization of a nominal process model; at the lower level, feedback control is used to enforce the necessary conditions of optimality, i.e., meet the identified active constraints and push selected gradients to zero. A key feature of this self-optimizing control scheme is the use of an input parametrization that is tailored to the identified active constraints. Another feature that is specific to batch processes is the possibility to meet the control objectives either online or on a run-to-run basis. The self-optimizing control approach is illustrated on a semibatch reactor example.

1. Introduction

Process optimization has received attention recently because, in the face of growing competition, it represents a natural choice for reducing production costs, improving product quality, and meeting safety requirements and environmental regulations. The standard nominal optimization approach consists of determining the optimal solution for a given process model numerically. In practical situations, however, an accurate process model can rarely be found with affordable effort.1 The resulting modeling errors, together with process variations and disturbances, may lead to either infeasible operation in the presence of constraints or nonoptimality.2-4 Hence, in the presence of uncertainty, open-loop implementation of offline calculated optimal inputs is clearly insufficient.

Two main classes of optimization methods are available for handling uncertainty. The essential difference between the two classes relates to whether measurements are used in the calculation of the optimal strategy. In the absence of measurements, a robust optimization approach is typically used, whereby conservatism (and, thus, also a loss in performance) is introduced to guarantee feasibility for the entire range of expected variations.5,6 When measurements are available, adaptive optimization can help adjust to process changes and disturbances, thereby leading to less conservatism.7 It is interesting to note that the above classification is similar to that found in control problems with the robust and adaptive approaches.

In the context of measurement-based adaptive optimization, it is possible to distinguish between explicit and implicit schemes, depending on whether or not a process model is used online. This is illustrated in Figure 1 and is discussed as follows. Explicit schemes involve two steps: (i) a model update that consists of estimating the current states and, optionally, certain parameters of the process model, and (ii) numerical optimization based on the updated process model. Because the two steps are repeated with the advent of new measurements, the procedure is also called repeated optimization. These ideas have been widely discussed in the literature and used in the context of both static optimization (e.g., real-time optimization, RTO)8,9 and dynamic optimization (e.g., model predictive control, MPC).10,11

Implicit schemes use measurements to update the inputs directly. For example, under the assumption that the active constraints do not change, optimality can be achieved by choosing an appropriate control structure that meets the necessary conditions of optimality (NCO).7 This class of methods, which has been applied to various industrial applications,12-14 will be investigated in this paper. Also, a comparison of explicit and implicit schemes will be presented, along with a suggestion to combine the two types of schemes.

Among the implicit schemes, two mechanisms have been used to handle constraints: a barrier-function approach and an active-set approach. The barrier-function approach consists of using barrier functions to convert a constrained optimization problem to an unconstrained problem.15,16 Optimization then is achieved by pushing the gradients to zero (using, e.g., extremum-seeking methods).15,17 With such an approach, one must compromise between (i) choosing too high a penalty and ending up far from the constraint, thereby achieving poor performance, and (ii) choosing too low a penalty and risking numerical difficulties. The other approach, referred to as the active-set approach, consists of determining the active constraints and using certain input combinations to keep these constraints active. The remaining input combinations are adapted for optimizing the cost by pushing the reduced gradients to zero. The advantage of this scheme is that it uses the constraints, which are often easily measured, to perform a major portion of the optimization. However, the difficulty has been pushed to the mechanism with which the set of active constraints is determined. In this paper, the active-set approach is considered.

In the work on self-optimizing control for unconstrained static systems, it has been shown that a near-optimal solution can be
achieved by choosing a proper control structure.18,19 The same approach is directly applicable to the constrained dynamic case, provided that the active constraints do not change. The contribution of this paper lies in the derivation of self-optimizing control structures for the dynamic optimization case. It will be shown that the choice of the self-optimizing control structure is not unique, and this non-uniqueness can be exploited to ease the adaptation and improve the performance.

Figure 1. Measurement-based adaptive optimization. The left-hand side of the figure is a depiction of the explicit scheme, where the measurements are used to update a process model before repeating numerical optimization; the right-hand side of the figure depicts the implicit scheme, where the measurements feed a self-optimizing controller.

The paper is organized as follows. Section 2 introduces preliminary material related to dynamic optimization. Section 3 presents RTO as the problem of meeting the NCO that involves active constraints and selected gradients. The NCO-tracking procedure is investigated in section 4. A semibatch reactor example is used throughout the paper to illustrate the concepts, and optimization results with this simulated example are presented in section 5. Finally, conclusions are drawn in section 6.

2. Dynamic Optimization Problem

2.1. Problem Formulation. Consider the constrained dynamic optimization problems with finite operational time, where the objective is to determine the input profiles and possibly parameters that optimize a final-time cost function. In addition to constraints that correspond to the dynamic system equations, there might be path constraints (involving inputs and states) as well as terminal constraints. Input bounds are dictated by actuator limitations, whereas state-dependent constraints typically result from safety and operability considerations. Terminal constraints usually result from quality or performance considerations. These dynamic optimization problems can be formulated mathematically as follows:20,21
min_{u[0,tf), ρ}   φ(x(tf), ρ)                                  (1)

s.t.   ẋ = F(x, u, ρ),   x(0) = x0(ρ)                           (2)

       S(x, u, ρ) ≤ 0                                           (3)

       T(x(tf), ρ) ≤ 0                                          (4)
where φ is the final-time scalar cost functional to be minimized, x the n-dimensional vector of states with the known initial conditions x0, u the m-dimensional vector of input profiles, ρ the r-dimensional vector of time-invariant decision variables, S the ζ-dimensional vector of path constraints, T the τ-dimensional vector of terminal constraints, F a smooth vector function that describes the system dynamics, and tf the final time that is finite but can be either fixed or free. If tf is free, it is part of ρ. Note that the initial conditions can also be considered to be decision variables.

The solution of the problems that are described by expressions 1-4 is typically discontinuous and consists of several intervals or arcs.22 Yet, the input profiles are continuous and differentiable in each interval. Each interval is characterized by a set of active path constraints, i.e., this set changes between two successive intervals. The information regarding the type and sequence of arcs can be inferred from the numerical solution of the problems that are described by expressions 1-4 for the nominal process model. Also, note that a method for automatically detecting the input switching structure has recently been proposed.23

Illustrative Example. A simple semibatch reactor with jacket cooling will be considered to illustrate the concepts throughout the paper. The characteristics of the example are as follows:
reaction system: A + B → C, 2B → D; isothermal, exothermic reactions
objective: maximize the amount of C at a given final time
manipulated input: feed rate of B
path constraints: input bounds; heat removal constraint expressed as a lower bound on the cooling jacket temperature
terminal constraint: upper bound on the amount of D at final time
Model equations. Assuming perfect control of the reactor temperature through adjustment of the cooling jacket temperature, the model equations read as follows:
ċA = -k1 cA cB - (u/V) cA,        cA(0) = cA0                                  (5)

ċB = -k1 cA cB - 2 k2 cB² + (u/V)(cBin - cB),        cB(0) = cB0               (6)

V̇ = u,        V(0) = V0                                                        (7)

Tj = Tr - (V/(UA)) [(-ΔH1) k1 cA cB + (-ΔH2) k2 cB²]                           (8)

nC = V0 cA0 - V cA                                                             (9)

nD = (1/2) [V (cA - cB) + V0 (cB0 - cA0) + cBin (V - V0)]                      (10)
Variables and parameters. The variables and parameters in this illustrative example are given in the following list:
cX ≡ concentration of species X
nX ≡ number of moles (amount) of species X
V ≡ reactor volume
ki ≡ kinetic coefficient of reaction i
u ≡ feed rate of B
cBin ≡ inlet concentration of B
ΔHi ≡ enthalpy of reaction i
Tr ≡ reactor temperature
Tj ≡ cooling jacket temperature
U ≡ heat transfer coefficient
A ≡ reactor heat exchange area
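As a concrete illustration of the model equations (5-10), the following minimal simulation sketch integrates the reactor dynamics with the nominal parameter values of Table 1. The constant feed policy, the function names, and the printed summary are illustrative choices and not part of the paper.

```python
# Minimal simulation sketch of the semibatch reactor model (eqs 5-10),
# using the nominal parameter values of Table 1 and a constant feed rate.
import numpy as np
from scipy.integrate import solve_ivp

k1, k2 = 0.11, 0.13            # L/(mol min)
dH1, dH2 = -8e4, -1e5          # J/mol
UA, Tr = 1.25e4, 30.0          # J/(min degC), degC
cBin, umax = 5.0, 1.0          # mol/L, L/min
cA0, cB0, V0, tf = 0.5, 0.0, 1000.0, 180.0

def feed(t):
    return 0.2 * umax          # placeholder feed policy (constant), for illustration only

def rhs(t, x):
    cA, cB, V = x
    u = feed(t)
    dcA = -k1*cA*cB - (u/V)*cA                        # eq 5
    dcB = -k1*cA*cB - 2*k2*cB**2 + (u/V)*(cBin - cB)  # eq 6
    dV = u                                            # eq 7
    return [dcA, dcB, dV]

sol = solve_ivp(rhs, (0.0, tf), [cA0, cB0, V0], max_step=1.0)
cA, cB, V = sol.y
Tj = Tr - (V/UA)*((-dH1)*k1*cA*cB + (-dH2)*k2*cB**2)       # eq 8
nC = V0*cA0 - V*cA                                         # eq 9
nD = 0.5*(V*(cA - cB) + V0*(cB0 - cA0) + cBin*(V - V0))    # eq 10
print(f"nC(tf) = {nC[-1]:.1f} mol, nD(tf) = {nD[-1]:.1f} mol, min Tj = {Tj.min():.1f} degC")
```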
Table 1. Model Parameters, Operating Bounds, and Initial Conditions

parameter   value                      parameter   value
k1          0.11 L/(mol min)           k2          0.13 L/(mol min)
ΔH1         -8 × 10^4 J/mol            ΔH2         -10^5 J/mol
UA          1.25 × 10^4 J/(min °C)     cBin        5 mol/L
Tr          30 °C                      Tj,min      10 °C
umax        1 L/min                    nDf,max     100 mol
cA0         0.5 mol/L                  cB0         0 mol/L
V0          1000 L                     tf          180 min

Optimization problem. The optimization problem is defined as

max_{u[0,tf)}   nC(tf)
s.t.   dynamic system (eqs 5-10)                                (11)
       0 ≤ u(t) ≤ umax
       Tj(t) ≥ Tj,min
       nD(tf) ≤ nDf,max

The model and operational parameters are given in Table 1.

Optimal Solution. The optimal feed rate profile is depicted in Figure 2 and shows three intervals: (i) the feed rate is initially at its upper bound umax, (ii) then ustate(t) keeps the state constraint Tj(cA, cB, V) = Tj,min active, and (iii) usens(t), where no path constraint is active, seeks a compromise between producing the desired C and the undesired D. The switching times t1 and t2 are linked to the state and terminal constraints, respectively; i.e., t1 is determined by Tj(t) reaching Tj,min and t2 is such that nD(tf) = nDf,max.

Figure 2. Optimal input consisting of the three arcs umax[0, t1), ustate[t1, t2), and usens[t2, tf).

2.2. Necessary Conditions of Optimality. 2.2.1. NCO for the Original Problem. The following functions must be defined:

H(t) = λᵀ F(x, u, ρ) + μᵀ S(x, u, ρ)                            (12a)

Φ(x(tf), ρ) = φ(x(tf), ρ) + νᵀ T(x(tf), ρ)                      (12b)

Ψ(x(tf), ρ) = Φ(x(tf), ρ) + ∫0^tf H(t) dt                       (13a)

λ̇ᵀ(t) = -∂H/∂x                                                  (13b)

λᵀ(tf) = ∂Φ/∂x(tf)                                              (13c)

where H(t) is the Hamiltonian function, Φ(x(tf), ρ) is the augmented terminal cost, Ψ(x(tf), ρ) is the total terminal cost, λ(t) ≠ 0 represents the n-dimensional vector of adjoint variables (Lagrange multipliers for the system equations), μ(t) ≥ 0 represents the ζ-dimensional vector of Lagrange multipliers for the path constraints, and ν ≥ 0 represents the τ-dimensional vector of Lagrange multipliers for the terminal constraints. The NCO for the optimization problems that are described by expressions 1-4 can be written as follows:21,22

For constraints:

μᵀ S(x, u, ρ) = 0,  μ ≥ 0        (path)        (14a)
νᵀ T(x(tf), ρ) = 0,  ν ≥ 0       (terminal)    (14b)

For sensitivities:

(∂H/∂u)(t) = 0                   (path)        (14c)
∂Ψ/∂ρ = 0                        (terminal)    (14d)
The Lagrange multipliers μ(t) and ν are zero when the corresponding constraints are strictly less than zero and can be nonzero otherwise, so that the expressions μᵀS = 0 and νᵀT = 0 are always true when the constraints are satisfied (complementarity conditions). These complementarity conditions are discontinuous in nature and represent one of the main challenges in constrained optimization. In fact, the set of active constraints determines the structure of the NCO. In the remainder, it is assumed that the active constraints are known and, in addition, that they do not change with uncertainty.

2.2.2. NCO with Input Parametrization Tailored to Active Constraints. An important feature in this paper is the use of an input parametrization that takes the active constraints into consideration. Some of the arcs are parametrized using a piecewise-polynomial approximation, whereas others (especially those characterized by active path constraints) are left as infinite-dimensional variables. Also, because the inputs are typically
discontinuous, it is helpful to treat the switching times as explicit decision variables. Consequently, the decision variables u[0, tf) and ρ in the problem described by expression 1 can be parametrized as U(η[ts, t′s), π). The vector of input arcs η[ts, t′s), with [ts, t′s) indicating the interval for which a given arc exists, represents time functions or arcs that are left as infinite-dimensional variables. The parameter vector π includes the original time-invariant decision variables ρ as well as the parameters that result from the input parametrization, such as the switching times. With S̄ and T̄ denoting the active path and terminal constraints, respectively, the constraint conditions in expressions 14 read S̄ = 0 and T̄ = 0 and the NCO become as follows:

For constraints:

S̄[ts, t′s) = 0                   (path)        (15a)
T̄ = 0                            (terminal)    (15b)

For sensitivities:

(∂H/∂η)[ts, t′s) = 0             (path)        (15c)
∂Ψ/∂π = 0                        (terminal)    (15d)

Here, the active path constraints S̄[ts, t′s) and the path sensitivities (∂H/∂η)[ts, t′s) are defined over appropriate intervals [ts, t′s).

2.2.3. NCO in Terms of Profile and Pointwise Objectives. With the aforementioned input parametrization, the set of decision variables corresponds to the nη profiles η[ts, t′s) and nπ scalars π. Similarly, the NCO consist of (nS + nη) profile conditions S̄[ts, t′s) = 0 and (∂H/∂η)[ts, t′s) = 0, and (nT + nπ) scalar conditions T̄ = 0 and ∂Ψ/∂π = 0, with nS and nT denoting the number of active path and terminal constraints, respectively. Hence, the number of decision variables does not correspond to the number of NCO, thereby preventing the configuration of a square control problem. This section is devoted to rewriting the NCO so that the number of NCO elements corresponds to the number of decision variables. A mismatch situation arises, for example, when a switching time (i.e., a scalar variable in π) is related to a path constraint (i.e., a profile variable of the NCO), such as t1 in the illustrative example. This particularity, which can be linked to the concept of relative degree of a path constraint, is discussed next.

2.2.3.1. Relative Degree of a Path Constraint. The relative degree of a constraint, with respect to an input, is defined as the number of time differentiations of the constraint that are necessary for that input to appear explicitly. [This definition agrees with the standard definition of relative degree used in nonlinear system theory.24] If the relative degree of an active path constraint S̄ is r, then the conditions S̄ = 0, S̄^(1) = 0, ..., S̄^(r-1) = 0 must hold upon entering the constraint at time ts. This requires the adaptation of certain parameters (typically earlier switching times) to ensure that S̄^(j) = 0 (j = 0, ..., r - 1) all occur simultaneously when entering the constraint. After these conditions are satisfied, the input keeps the highest time derivative at zero, i.e., S̄^(r)[ts, t′s) = 0, where [ts, t′s) is the interval for which the path constraint is active. Hence, the path constraint S̄ results in the pointwise objectives S̄(ts) = 0, S̄^(1)(ts) = 0, ..., S̄^(r-1)(ts) = 0, and the profile objective S̄^(r)[ts, t′s) = 0.21 Profile objectives express conditions over a time interval, whereas pointwise objectives must be met at specific instants in time.

For the sake of illustration, let us consider the simple second-order nonlinear dynamic system, given by

ẋ1 = -x1 x2 + u x1,        x1(0) = x1,0        (16)
ẋ2 = -2 x1 x2,             x2(0) = x2,0        (17)

and the following three path constraints:

S1(x, u) = u - umax ≤ 0           (18)
S2(x, u) = x1 - x1,max ≤ 0        (19)
S3(x, u) = x2 - x2,max ≤ 0        (20)

The constraint S1 contains the input u explicitly and, thus, has a relative degree of r = 0. The constraint is entered when u(ts) = umax and is kept active as long as u(t) = umax. Hence, the constraint S1 generates the profile condition S̄1[ts, t′s) = 0.

The path constraint S2 does not contain u explicitly. Time differentiation of S2 gives

Ṡ2 = ẋ1 = -x1 x2 + u x1        (21)

which contains u explicitly. Hence, the relative degree is r = 1. The condition that must be satisfied upon entering the path constraint is S2 = 0, which implies x1 = x1,max. Hence, the path constraint is entered when x1(ts) = x1,max and is kept active by setting u(t) = x2(t). The constraint S2 generates the pointwise condition S̄2(ts) = 0 and the profile condition S̄2^(1)[ts, t′s) = 0.

The path constraint S3 does not contain u explicitly either. Two time differentiations of S3 are necessary to have the input appear explicitly:

Ṡ3 = -2 x1 x2,        S3^(2) = -2 x1 x2 (-x2 + u - 2 x1)        (22)

Hence, the relative degree is r = 2. The two conditions that must be simultaneously satisfied upon entering the path constraint are S3 = 0 and Ṡ3 = 0, which imply x2(ts) = x2,max and x1(ts) x2,max = 0. The path constraint then is kept active by enforcing S3^(2) = 0, i.e., by setting u(t) = 2 x1 + x2,max. The constraint S3 generates the pointwise conditions S̄3(ts) = 0 and S̄3^(1)(ts) = 0, as well as the profile condition S̄3^(2)[ts, t′s) = 0.

As illustrated with this simple example, the terminal constraints can be extended to include the additional scalar conditions S̄^(j)(ts) = 0 (j = 0, ..., r - 1). More formally, the path and terminal constraints can be generalized to the concepts of profile and pointwise objectives, P̄[ts, t′s) and Q̄(ts), respectively, which are defined as follows:

P̄[ts, t′s) = {S̄^(r)[ts, t′s)} = 0                              (23)

Q̄(ts) = {T̄(tf), S̄^(j)(ts) (j = 0, ..., r - 1)} = 0             (24)
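The relative-degree bookkeeping of the example above can be automated symbolically. The following sketch (not from the paper) differentiates each path constraint (18)-(20) along the dynamics (16)-(17) with sympy until the input u appears explicitly; the helper name and the maximum order are illustrative assumptions.

```python
# Illustrative sketch: determine the relative degree of path constraints (18)-(20)
# for system (16)-(17) by repeated time differentiation along the dynamics.
import sympy as sp

x1, x2, u = sp.symbols('x1 x2 u')
f = sp.Matrix([-x1*x2 + u*x1, -2*x1*x2])   # eqs 16-17: [dx1/dt, dx2/dt]

def relative_degree(S, max_order=5):
    """Differentiate S along the dynamics until u appears explicitly."""
    expr = S
    for r in range(max_order + 1):
        if expr.has(u):
            return r, sp.simplify(expr)
        # time derivative along the dynamics: dS/dt = (dS/dx) * f
        expr = (sp.Matrix([expr]).jacobian([x1, x2]) * f)[0]
        expr = sp.expand(expr)
    raise ValueError("input does not appear up to the requested order")

x1max, x2max, umax = sp.symbols('x1_max x2_max u_max')
for name, S in [("S1", u - umax), ("S2", x1 - x1max), ("S3", x2 - x2max)]:
    r, dS = relative_degree(S)
    print(name, "relative degree =", r, " expression where u first appears:", dS)
```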
Note that the pointwise objectives include both the active terminal constraints T̄(tf) = 0 and the pointwise conditions imposed by active path constraints, S̄(ts) = 0, S̄^(1)(ts) = 0, ..., S̄^(r-1)(ts) = 0.

2.2.3.2. Reduced Gradients. Let nQ denote the number of pointwise objectives Q̄ = 0. The number of scalar decision variables is nπ, which must be larger than or equal to nQ for the problem to have a solution. Because the goal is to define a square control problem, it is necessary to define (nπ - nQ) sensitivity conditions that are associated with ∂Ψ/∂π = 0. This can be done with the concept of a reduced gradient, as discussed next. Consider the (nQ × nπ)-dimensional matrix ∂Q̄/∂π and its null space NQ of dimension nπ × (nπ - nQ). It follows that [(∂Q̄/∂π)ᵀ, NQ] is full rank and (∂Q̄/∂π)NQ = 0. The term (∂Ψ/∂π)NQ represents the reduced (or projected) gradients, because it is orthogonal to the directions necessary to meet the pointwise objectives.

A similar procedure is possible to handle the profile objectives. If there are more arcs than profile objectives in a given interval, some profile sensitivities will need to be pushed to zero. Let nP denote the number of profile objectives P̄[ts, t′s) = 0 and consider the (nP × nη)-dimensional matrix (∂P̄/∂η)[ts, t′s). Reduced sensitivities can be computed as (∂H/∂η)NP[ts, t′s), where the matrix NP[ts, t′s) of dimension nη × (nη - nP) represents the null space of (∂P̄/∂η)[ts, t′s).

2.2.3.3. Square Control Problem. With the aforementioned developments, i.e., considering profile and pointwise objectives and using reduced gradients, the numbers of profiles and scalars in the NCO and in the input parametrization are the same by construction. Table 2 shows the NCO upon input parametrization and the use of reduced gradients. The four NCO parts express the fact that both the profile and pointwise objectives must be met, with each objective involving the enforcement of active constraints and the zeroing of reduced gradients.

Table 2. NCO in Terms of Profile and Pointwise Objectives and Using the Concept of Reduced Gradients

                 profile objectives                 pointwise objectives
constraints      P̄[ts, t′s) = 0                     Q̄(ts) = 0
sensitivities    (∂H/∂η)NP[ts, t′s) = 0             (∂Ψ/∂π)NQ = 0

NCO for the Illustrative Example. The NCO are dependent on the input parametrization. The most natural input parametrization involves three profiles and two scalar variables:

u(t) = U(η1, η2, η3, π1, π2) = { η1(t)   (for 0 ≤ t < π1)
                                 η2(t)   (for π1 ≤ t < π2)        (25)
                                 η3(t)   (for π2 ≤ t < tf)

Expression 25 for u(t) will be termed Input Model I1. The corresponding NCO are given in Table 3.

Table 3. NCO Corresponding to Input Model I1

                 profile objectives                 pointwise objectives
constraints      arc 1: u[0, π1) = umax             Tj(π1) = Tj,min
                 arc 2: Ṫj[π1, π2) = 0              nD(tf) = nDf,max
sensitivities    arc 3: (∂H/∂η3)[π2, tf) = 0

With only a single input, no profile sensitivity is defined in the constrained arcs 1 and 2, because the input is determined by the corresponding constraint. Furthermore, the reduced gradient in arc 3 is the gradient itself, because there are no active constraints in that interval. Finally, with two scalar decision variables and two pointwise constraints, there is no pointwise sensitivity.
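The reduced-gradient construction of section 2.2.3.2 amounts to a null-space projection, as in the following numerical sketch. The dimensions and numerical values are made up for illustration and are not taken from the paper.

```python
# Sketch of the reduced-gradient computation: N_Q spans the null space of dQbar/dpi,
# and (dPsi/dpi) N_Q is the part of the cost gradient orthogonal to the directions
# needed to meet the pointwise objectives.
import numpy as np
from scipy.linalg import null_space

dQ_dpi = np.array([[1.0, 0.5, 0.0]])      # example: n_Q = 1 pointwise objective, n_pi = 3 scalars
dPsi_dpi = np.array([0.3, -0.2, 0.8])     # example cost gradient with respect to pi

N_Q = null_space(dQ_dpi)                  # shape (n_pi, n_pi - n_Q)
reduced_grad = dPsi_dpi @ N_Q             # shape (n_pi - n_Q,)

print("N_Q =\n", N_Q)
print("reduced gradient (dPsi/dpi) N_Q =", reduced_grad)
print("check (dQ/dpi) N_Q is (numerically) zero:", dQ_dpi @ N_Q)
```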
3. Real-Time Optimization via NCO Tracking

As mentioned previously, the standard approach to RTO consists of updating a process model, using available measurements, followed by numerical reoptimization that provides the inputs to the plant.8-11 In contrast, an implicit approach labeled NCO tracking is used in this paper.7 The idea of NCO tracking stems from the fact that optimality requires meeting the necessary conditions of optimality. NCO tracking implements optimal operation via feedback without having to solve a dynamic optimization problem in real time. Furthermore, tracking the NCO using process measurements seeks optimality for the real process, i.e., not simply for the available (possibly inaccurate) model, as is the case with model-based optimization approaches. The main difficulty is that the structure of the NCO changes with the active constraints and, consequently, the active set must be known to the feedback controller.

3.1. NCO-Tracking Scheme. A two-level NCO-tracking scheme is presented in Figure 3. The upper level shows a regular explicit optimization scheme with model update. Unlike standard explicit optimization, this loop can be invoked at a much slower time scale, and only once if the active set does not change. However, the presence of this loop is important for handling variations in the active set and could be triggered as in ref 25. Using information from numerical optimization, a procedure that will be discussed in the next section is invoked to generate a self-optimizing control structure. At the lower level, the self-optimizing control structure tracks the NCO using appropriate measurements. This loop is faster than the loop at the upper level. For batch processes, as will be shown later, this loop can include online as well as run-to-run feedback.

Figure 3. Two-level NCO-tracking scheme with slow explicit optimization and fast self-optimizing control. If the active set does not change, the slow loop is entered only once initially (offline).

3.2. Comparison of Explicit and Implicit Schemes. Three important sub-tasks in constrained optimization problems include (i) the determination of the active constraints, (ii) the choice of appropriate combinations of the manipulated variables to keep these constraints active, and (iii) the use of the remaining degrees of freedom to force the reduced gradients to zero. Numerical optimization solves all three sub-tasks simultaneously based on the available process model. In contrast, NCO tracking treats each step individually and sequentially.

There are additional differences between NCO tracking and numerical reoptimization. In NCO tracking, the objective of the numerical optimization step is not to provide the input values, but only the set of active constraints. Furthermore, NCO tracking is formulated as a control problem that slowly moves the
inputs toward the optimal solution. This is in contrast to numerical reoptimization, which provides input values that "jump" to the computed optimal solution. These issues are discussed next, with respect to specific features of the optimization schemes.

3.2.1. Meeting Active Constraints. Let us assume that the model mismatch is such that the active constraints of the real process are determined correctly by the model, while the input values needed to keep these constraints active are in error. To get to the active constraints using numerical reoptimization, model parameter update is required, which, in turn, necessitates persistency of excitation. Also, because the identification algorithm is tailored to minimize the output prediction error, it does not guarantee that the active constraints of the real process will be reached. On the other hand, with NCO tracking, the problem of reaching the active constraint is formulated as a control problem, for which an appropriate controller with integral action can be designed. This strategy guarantees reaching the active constraints without the need for persistency of excitation. This is clearly a situation where NCO tracking will outperform numerical reoptimization.

This situation can also be considered from a different perspective. Because of uncertainty (model mismatch and process disturbances), numerical optimization cannot compute the input values that bring the plant to its active constraints. Numerical reoptimization tries to compensate for the effect of this uncertainty through model parameter update. In contrast, NCO tracking compensates for the effect of uncertainty through direct input update via the integral action of a controller. The former is more difficult to achieve, because it requires persistency of excitation, whereas the latter is more direct and often straightforward to implement.

3.2.2. Pushing Gradients to Zero. Turning to the gradient component, the numerical reoptimization approach uses the process model to estimate the gradients. Because of model mismatch, these gradients are in error, which leads to incorrect inputs. Parameter update is required with, again, no guarantee of computing the correct gradients. In the NCO-tracking approach, as will be discussed later, a multirate prediction/correction algorithm is used for gradient estimation. This gives a quick prediction of the gradients which, although initially in error, eventually get corrected. Hence, a slower, yet more accurate, gradient estimation is used. Also, the gradients in NCO tracking are pushed to zero using a controller with integral action, and not through system inversion using an imperfect model.

3.2.3. Requirements on the Process Model. The previous points bring into the picture the important question regarding the process model required for optimization. In the numerical reoptimization approach, the model and the reality should eventually (after model update) share the same optimum. This is quite a difficult task, noting that such detailed modeling requires considerable effort, and getting all the parameters correct calls for persistency of excitation and low levels of measurement noise. On the other hand, the NCO-tracking approach only requires a process model that correctly determines the active constraints. This is a much less stringent requirement, which can be met more easily. Furthermore, uncertainty is handled by the integral action of the feedback controller.
This clearly shows the benefit of the two-level structure, as opposed to the single-level numerical optimization scheme. 3.2.4. Regularization Aspect. In regard to the regularization aspect, NCO tracking is particularly attractive, because it allows the optimization problem to be treated as a control problem.
Table 4. Control and Optimization Schemes Available To Enforce the Various NCO Parts Individually

                 profile objectives                     pointwise objectives
constraints      online control26,27                    run-to-run control,28,29 mid-course correction30,31
sensitivities    neighboring-extremal control32,33      extremum-seeking control,17,34 self-optimizing control,18,19 evolutionary operation35,36
The control objective is not to "jump" to the optimal solution, but rather to approach it smoothly as a function of time. Regularization is provided by the controller, which is not just an inversion of the plant. This regularization comes with the advantages of sensitivity reduction and disturbance rejection.

4. Design of Self-Optimizing Controllers

NCO tracking consists of meeting the constraint and sensitivity components of both the profile and pointwise objectives. Hence, the dynamic optimization problem is recast as a control problem that has as many manipulated variables (inputs) as there are control objectives (outputs), thereby representing a square multivariable control problem. However, the objects used for control are heterogeneous, in the sense that there are both profiles and scalars. Many ways of enforcing selected components of the NCO are available in the literature and are summarized in Table 4. Individually, the four NCO-tracking sub-problems have been investigated quite extensively for both static and dynamic optimization scenarios. The novel contribution of this paper is to consider all four sub-problems simultaneously and generate a self-optimizing control structure that is appropriate for the optimization of constrained final-time dynamic optimization problems. The major challenge lies in the fact that the four sub-problems in Table 4 have very different adaptation strategies.

An additional difficulty with NCO tracking in the dynamic scenario results from noncausality. Although path constraints can be met online using feedback control, the terminal constraints can only be evaluated at the end of the run. Also, several completed runs are necessary to evaluate terminal sensitivities experimentally. This implies that full adaptation generally cannot be accomplished within a single run. Consequently, the key to success for NCO tracking in the dynamic scenario will be to consider adaptation both within the run and on a run-to-run basis. Unfortunately, run-to-run adaptation has certain drawbacks: (i) it does not compensate for within-run variations, because only disturbances that are correlated over several runs can be rejected, and (ii) it requires multiple runs for convergence and, thus, optimality. Hence, an important implementation aspect will be to determine which components of the solution are adapted online, and which components are adapted on a run-to-run basis. As opposed to schemes available in the literature, where everything is performed either online10,37,38 or on a run-to-run basis,39-41 the methodology proposed here allows an appropriate mix of the two.

The procedure for generating a self-optimizing controller is illustrated in Figure 4. The flowchart shows the steps taken, from numerical optimization of a nominal process model to validation of the resulting experimental performance. The various tasks are summarized in the following subsections.
Figure 4. Control design for optimization, depicting the design of self-optimizing controllers.
4.1. Structure Detection. The procedure starts with numerical optimization of a nominal process model. The optimal solution can be computed for several uncertainty realizations. From this information, the structure detection step aims at identifying (i) the arcs and the switching times of the optimal solution, and (ii) the active path and terminal constraints. An automatic switching structure detection has been proposed in ref 23. After obtaining a first solution at a coarse resolution of the input profiles, using a direct sequential approach, the possible switching times and arcs are detected automatically via wavelet analysis. Subsequently, the problem is reformulated and solved as a multistage problem, with each stage corresponding to a possible arc. Another avenue for automatic structure detection is given in ref 42. Note that the structure detection step needs to uncover not only the structure of the optimal solution but also the amount of uncertainty for which this structure still holds.43,44 Hence, in addition to a nominal process model, this step requires good knowledge of the type and amount of uncertainty to be expected. At this stage, certain approximations can be introduced. For example, operator experience may suggest that some of the short arcs that are observed in the numerical solution might be related to the need to have appropriate derivatives for entering an active path constraint (see discussion in subsection 2.2.3) and, thus, are not necessary for any practical purpose. 4.2. Formulation of the Control Problem. The key in reformulating an optimization problem into a control problem is the choice of the set of manipulated and controlled variables. This choice is not unique and can affect the control performance significantly. 4.2.1. Choice of Manipulated Variables: Input Model. After the structure of the optimal solution has been detected, the next step is to choose the variables to adapt. To this effect, the inputs are parametrized using the knowledge of the structure of the optimal solution and the active constraints. Typically,
for numerical optimization, the infinite-dimensional inputs are replaced by a finite number of parameters, using a piecewise-polynomial approximation of the inputs. In contrast, here, some arcs (especially those characterized by active path constraints) are left as infinite-dimensional variables. The remaining input arcs are parametrized using a finite number of parameters (e.g., piecewise-constant, piecewise-linear, or exponential profiles), because it is easier to deal with input parameters than input arcs. Such a parametrization is particularly useful for transforming a profile sensitivity into pointwise sensitivities. Thus, the decision variables are parametrized as {u[0, tf), ρ} = U(η[ts, t′s), π), where η represents the profile variables and π represents the scalar variables.

It is sometimes more intuitive to specify the shape of a state variable than that of an input. For example, in batch or semibatch reactors, it is easier to specify the shape of the reactor temperature Tr than that of the inlet jacket temperature Tj,in, i.e., the actual input variable. In such a case, Tr is parametrized, and the actual input Tj,in(t) is obtained by tracking Tr(t) using feedback control.

Alternate Input Model for the Illustrative Example. The last arc is sensitivity-seeking and can be approximated by the constant value π3. This leads to an input parametrization that involves two profiles and three scalar variables:
u(t) = U(η1, η2, π1, π2, π3) = { η1(t)   (for 0 ≤ t < π1)
                                 η2(t)   (for π1 ≤ t < π2)        (26)
                                 π3(t)   (for π2 ≤ t < tf)

Expression 26 will be termed Input Model I2. Note that the change in input parametrization also changes the NCO (Table 5). The profile sensitivity (arc 3) in Table 3 has become a pointwise sensitivity.

Table 5. NCO Corresponding to Input Model I2

                 profile objectives                 pointwise objectives
constraints      arc 1: u[0, π1) = umax             Tj(π1) = Tj,min
                 arc 2: Ṫj[π1, π2) = 0              nD(tf) = nDf,max
sensitivities                                       (∂Ψ/∂π)NQ = 0

There are three scalar decision variables
and two pointwise constraints; therefore, there is a single (reduced) pointwise sensitivity. 4.2.2. Choice of Controlled Variables: Sensor Model. After the manipulated variables have been determined, the controlled variables correspond to the NCO of the parametrized problem (Table 2). However, in most cases, the NCO are not directly measured, or at least are not available online. Thus, at this stage, it becomes crucial to decide how the NCO will be estimated or approximated from the available measurements. The main aspect of the NCO sensor model is whether the information regarding that particular NCO is available continuously online. This observation, in turn, influences the manner in which control will be implemented. Active profile constraints are often straightforward to handle, because they can be measured or estimated online. For active pointwise constraints, two possibilities exist: (i) one can opt to wait until the end of the run, and the corresponding parameters can be adapted on a run-to-run basis, and (ii) if one can predict the final constraints from other online measurements, the constraints can be met within a single run.45 There exist three ways of estimating profile and pointwise sensitivities: (i) the finite-difference or perturbation approach, where the sensitivities are computed from the difference in performance of multiple runs with the inputs being perturbed;17 (ii) the model-based approach, where the gradients are calculated from the adjoint variables;46 and (iii) the neighboring-extremal (NE) approach, where approximations of the gradients are computed using the difference between nominal and measured outputs.33 Also, note that it is not always necessary to evaluate the sensitivities (∂H/∂η) and (∂Ψ/∂π) directly, because quantities that go to zero with them suffice for implementation. The NE approach can be viewed as an extension to the dynamic case of the idea of self-optimizing control.18 In selfoptimizing control, an offline effort is performed to search for the combination of available measurements that approximates the gradients as well as possible over the space of possible perturbations. This search could be based on a process model or historical data. On the other hand, the traditional NE controller consists of a direct update law between the measured states and the inputs. It can be easily split into a gradient estimator and a proportional controller. Hence, the NE approach can also be viewed as searching offline for the combination of the measurements that approximates the gradients, but using the process model this time. Furthermore, in the dynamic case, measurements spread over time are used to compute the gradients. The various methods of gradient estimation have their own advantages and disadvantages. The model-based methods (NE in particular) generally are fast but inaccurate, whereas the perturbation methods are slow, yet more accurate. Thus, a combination of these methods can be envisaged, where the gradients estimated at a slower time scale using a perturbation approach are used to correct the prediction made by the NE approach. This multirate estimation addresses the performance issue, because it gives a quick prediction of the gradients which, although initially in error, eventually are corrected by a more direct estimation via the perturbation method. Approximation of the NCO for the Illustrative Example. The profile sensitivity (∂H/∂η3) in input model I1 can be approximated using the NE approach:
(∂H/∂η3) = KA (cAnom(t) - cA(t)) + KB (cBnom(t) - cB(t)) + KV (Vnom(t) - V(t))        (for π2 ≤ t < tf)        (27)

The reduced pointwise sensitivity (∂Ψ/∂π)NQ can be evaluated using the finite-difference approximation:

(∂Ψ/∂π)NQ ≈ (Ψk - Ψk-1)/(π̃k - π̃k-1)        (28)

The variable π̃ is the one left out after assignment of the scalars to the pointwise constraints. Because the final amount nD(tf) is not available during the batch, the online measurement of nD(t) can be used to predict the final amount in a linear fashion:

nDf,pred(t) = nD(π2) + ((tf - π2)/(t - π2)) (nD(t) - nD(π2))        (for π2 ≤ t < tf)        (29)
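The two "NCO sensors" of eqs 28 and 29 are simple enough to state directly in code. The following sketch uses hypothetical helper functions and made-up numbers purely for illustration; only the formulas themselves come from the paper.

```python
# Sketch of the finite-difference sensitivity estimate (eq 28) and the linear
# extrapolation of nD to final time (eq 29).
def finite_difference_gradient(Psi_k, Psi_km1, pi_k, pi_km1):
    """Eq 28: run-to-run estimate of the reduced sensitivity (dPsi/dpi) N_Q."""
    return (Psi_k - Psi_km1) / (pi_k - pi_km1)

def predict_nDf(t, nD_t, pi2, nD_pi2, tf):
    """Eq 29: linear extrapolation of nD(t) to nD(tf), valid for pi2 <= t < tf."""
    return nD_pi2 + (tf - pi2) / (t - pi2) * (nD_t - nD_pi2)

# usage with illustrative numbers
grad = finite_difference_gradient(Psi_k=390.1, Psi_km1=388.7, pi_k=0.32, pi_km1=0.30)
nDf = predict_nDf(t=150.0, nD_t=80.0, pi2=120.0, nD_pi2=60.0, tf=180.0)
print(grad, nDf)
```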
Two sensor models will be treated in this work: S1 and S2. They both estimate the profile sensitivity using the NE approach and the pointwise sensitivity using finite differences. However, they differ in the way the terminal constraint nD(tf) = nDf,max is treated. In sensor model S1, one waits until the end of the batch to measure nD(tf) and performs the run-to-run correction, and in sensor model S2, a prediction of the final amount nD(tf) is available online, thus allowing online update.

4.3. Control Design. After the input (manipulated) and output (controlled) variables have been chosen, a large body of literature, ranging from single-loop proportional-integral-derivative (PID) controllers to multivariable nonlinear model predictive controllers, is available to solve the control problems.26,27,47 In this paper, a decentralized approach with pairing is presented, although the user is invited to use his/her preferred control method.

4.3.1. Decentralization by Pairing. Decentralization48 is particularly attractive in the context of batch process optimization, because the information needed for adaptation is available at different times, i.e., either online or at the end of the run. Hence, all manipulated variables are not adapted simultaneously. Pairing links manipulated variables to controlled variables.

Pairing to meet profile constraints: The first and most straightforward link is to pair the active profile constraints with the variables η[ts, t′s) reserved for this purpose. When there is only one input available in a given interval to meet an active profile constraint, the link is obvious. However, when there are several inputs to meet one or more profile constraints, a relative gain array (RGA) analysis should be performed for proper pairing.27 Often, the path constraint S̄ = 0 is tracked instead of the profile constraint S̄^(r) = 0.

Pairing to meet pointwise constraints: Here, the pairing corresponds to choosing appropriate scalar variables (π) to make the pointwise constraints active. Again, an RGA analysis can help choose a proper pairing.

Linking the remaining input arcs to profile sensitivities: The remaining unparametrized sensitivity-seeking arcs are linked to profile sensitivities. Although the input arc ηj[ts, t′s) may also affect (∂H/∂ηi)[ts, t′s), if the sufficient conditions of optimality are satisfied (which is normally the case), the Hessian matrix is diagonally dominant. This means that the cross-coupling terms are smaller than the direct effects, and, thus, the link ηi to (∂H/∂ηi) is appropriate.
Linking the remaining input parameters to pointwise sensitivities: The remaining scalar variables are linked to pointwise sensitivities, with each variable linked to its corresponding gradient, again because of the diagonal dominance of the Hessian matrix.

Hence, four types of adaptation laws can exist for the profile and pointwise constraints and the profile and pointwise sensitivities. The possible adaptation laws can formally be written as follows:

ηi[ts, t′s) = Pc(P̄j[ts, t′s))                    (30a)

πi = Qc(Q̄j)                                     (30b)

ηi[ts, t′s) = Ps((∂H/∂ηi)NP[ts, t′s))            (31a)

πi = Qs((∂Ψ/∂πi)NQ)                              (31b)
where Pc, Qc, Ps, and Qs are appropriate operators or controllers for the profile constraints, pointwise constraints, profile sensitivities, and pointwise sensitivities, respectively.

Pairing for Input Model I1. Input model I1, which is given by eq 25, exhibits two profile constraints, one profile sensitivity, and two pointwise constraints. The links between the decision and controlled variables are rather straightforward: (1) η1[0, π1) and η2[π1, π2) can be determined by tracking the respective profile constraints. (2) The condition Tj(π1) = Tj,min is used to fix π1. The terminal constraint nD(tf) = nDf,max is used to determine π2. (3) η3[π2, tf) is left undetermined by the constraints and, thus, is adapted to force (∂H/∂η3) = 0. The pairing leads to the following control structure (i.e., C1):
u(t) = { η1(t) = umax                    (for 0 ≤ t < π1)
         η2(t) = Pc(Tj,min - Tj(t))      (for π1 ≤ t < π2)        (32)
         η3(t) = Ps(∂H/∂η3)              (for π2 ≤ t < tf)

π1 = Qc1(Tj(t) = Tj,min)
π2 = Qc2(nDf,max - nD(tf))
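A minimal closed-loop sketch of control structure C1 is given below, assuming the Table 1 parameter values. Arc 1 applies umax, a simple PI law stands in for Pc in arc 2, Qc1 fixes π1 when Tj first reaches Tj,min, and Qc2 is an integral law in the run index. Arc 3 is held at a fixed nominal level for brevity (in the paper it is adapted by pushing ∂H/∂η3 to zero); the gains, the arc-3 level, and the initial π2 are illustrative choices, not values from the paper.

```python
# Sketch of control structure C1 (eq 32) applied to the reactor model (eqs 5-10).
import numpy as np

k1, k2, dH1, dH2 = 0.11, 0.13, -8e4, -1e5
UA, Tr, cBin, umax = 1.25e4, 30.0, 5.0, 1.0
Tj_min, nDf_max, tf, dt = 10.0, 100.0, 180.0, 0.1
cA0, cB0, V0 = 0.5, 0.0, 1000.0

def run_batch(pi2, u3=0.2, Kp=0.05, Ki=0.005):
    cA, cB, V, integ, pi1 = cA0, cB0, V0, 0.0, None
    for t in np.arange(0.0, tf, dt):
        Tj = Tr - (V / UA) * ((-dH1)*k1*cA*cB + (-dH2)*k2*cB**2)     # eq 8
        if pi1 is None and Tj <= Tj_min:
            pi1 = t                                     # Qc1: constraint first reached
        if pi1 is None:
            u = umax                                    # arc 1: input at its upper bound
        elif t < pi2:
            integ += (Tj - Tj_min) * dt                 # arc 2: PI law tracking Tj_min (stand-in for Pc)
            u = float(np.clip(Kp*(Tj - Tj_min) + Ki*integ, 0.0, umax))
        else:
            u = u3                                      # arc 3: nominal level (not adapted here)
        cA += dt * (-k1*cA*cB - (u/V)*cA)                                   # eq 5 (Euler step)
        cB += dt * (-k1*cA*cB - 2*k2*cB**2 + (u/V)*(cBin - cB))             # eq 6
        V += dt * u                                                         # eq 7
    return 0.5 * (V*(cA - cB) + V0*(cB0 - cA0) + cBin*(V - V0))             # nD(tf), eq 10

pi2, K_r2r = 150.0, 0.2
for k in range(10):                                     # run-to-run loop (Qc2)
    nD_tf = run_batch(pi2)
    pi2 = min(max(pi2 + K_r2r*(nDf_max - nD_tf), 0.0), tf)   # integral update of pi_2
    print(f"run {k}: nD(tf) = {nD_tf:.1f} mol, next pi2 = {pi2:.1f} min")
```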
where Pc represents a controller that regulates Tj(t) to Tj,min, Ps a controller that pushes the path sensitivity (∂H/∂η3) to zero, Qc1 an operator that detects the first occurrence of Tj(t) = Tj,min, and Qc2 a controller that forces nD(tf) to nDf,max. Note that, though the profile constraint is Ṫj = 0, the path constraint Tj(t) = Tj,min is tracked. This strategy gives better disturbance rejection and robustness with respect to errors in the choice of π1.

Pairings for Input Model I2. Input model I2, which is given by eq 26, exhibits two profile constraints and three pointwise constraints. The pairing for the profile constraints is the same as that previously discussed. The link between the heat-removal constraint and the switching time π1 is obvious. For the remaining constraints, two different pairings are possible. (1) The terminal constraint nD(tf) = nDf,max is used to determine the switching time π2, whereas the input level π3 is used to push the sensitivity to zero. For this pairing, the controller structure (i.e., C2) is
u(t) = { η1(t) = umax                    (for 0 ≤ t < π1)
         η2(t) = Pc(Tj,min - Tj(t))      (for π1 ≤ t < π2)        (33)
         π3(t) = Qs((∂Ψ/∂π)NQ)           (for π2 ≤ t < tf)

π1 = Qc1(Tj(t) = Tj,min)
π2 = Qc2(nDf,max - nD(tf))
where Qs represents a controller that pushes the reduced terminal sensitivity to zero. (2) The terminal constraint nD(tf) = nDf,max is used to determine the input level π3, whereas the switching time π2 is used to push the sensitivity to zero. The controller structure (i.e., C3) is
u(t) = { η1(t) = umax                    (for 0 ≤ t < π1)
         η2(t) = Pc(Tj,min - Tj(t))      (for π1 ≤ t < π2)        (34)
         π3(t) = Qc2(nDf,max - nD(tf))   (for π2 ≤ t < tf)

π1 = Qc1(Tj(t) = Tj,min)
π2 = Qs((∂Ψ/∂π)NQ)
4.3.2. Online versus Run-to-Run Implementation. Batch processes are examples of repetitive dynamic processes that are characterized by the presence of two independent "time" variables: the run time (t) and the run counter (k). Thus, NCO tracking can be achieved using either online or run-to-run controllers, as explained next. In principle, each of the control laws (expressions 30 and 31) can be implemented either online (in run time t) or on a run-to-run basis (in run index k), depending on the information available, although certain control laws are more likely to prefer one form of adaptation than the other. Hence, the last part of the design is to decide what component of the controller is implemented online and what component is implemented in a run-to-run manner.

Online controllers: For profile constraints, the design of online controllers corresponds to a standard control design problem.26,27 For pointwise constraints derived from path constraints, the corresponding input parameter can be adjusted online directly from online measurements, e.g., reaching the constraint Tj(t) = Tj,min determines π1 in the illustrative example. For other pointwise constraints and sensitivities, the use of online controllers requires online prediction of the constraints and sensitivities, e.g., using an empirical model such as partial least-squares49,50 or a simple linear extrapolation as given by eq 29.

Run-to-run controllers: For adapting profile variables, the run-to-run controllers are of the iterative learning control (ILC) type.51,52 Information regarding the profile of the current run is used to adapt the entire profile of the next run. For adapting scalar variables in a run-to-run manner, the dynamic process is viewed as a static map between the input variables π available at the beginning of the run and the quantities Q̄ and (∂Ψ/∂π)NQ available at the final time. This special structure is exploited to design simple integral controllers in the run index k,28,53 as sketched below.

Various self-optimizing controllers can be generated by combining the different choices available in (i) the input model, (ii) the sensor model, (iii) the pairing, and (iv) the online vs run-to-run implementation. A few control structures for the illustrative example are discussed next.
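The run-to-run integral controller mentioned above can be written generically as follows. The class name, gain values, and measurements are placeholders for illustration; only the static-map viewpoint and the integral update in the run index k come from the paper.

```python
# Sketch of a run-to-run integral controller in the run index k, viewing the batch
# as a static map from the scalar decision variables pi to the end-of-run
# quantities (terminal constraints and reduced gradients).
import numpy as np

class RunToRunIntegral:
    def __init__(self, pi0, gain):
        self.pi = np.asarray(pi0, dtype=float)
        self.K = np.asarray(gain, dtype=float)

    def update(self, residual):
        """residual: vector of (reference - measured) end-of-run NCO quantities."""
        self.pi = self.pi + self.K @ np.asarray(residual, dtype=float)
        return self.pi

# usage: pi = [pi2, pi3]; residuals = [nDf_max - nD(tf), 0 - reduced gradient]
ctrl = RunToRunIntegral(pi0=[150.0, 0.2], gain=np.diag([0.2, 0.5]))
new_pi = ctrl.update([100.0 - 92.5, 0.0 - (-0.04)])   # made-up measurements
print(new_pi)
```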
4.3.2.1. Self-Optimizing Controller A. This controller is based on input model I1 (three profiles and two scalars), sensor model S1 (with the terminal constraint only available at the end of the batch), and decentralized control structure C1. The controllers Pc and Ps are implemented online and Qc2 is implemented in a run-to-run fashion. This gives the self-optimizing controller shown in Figure 5. Alternatively, Pc and Ps could be ILC controllers implemented in a run-to-run manner.

Figure 5. Self-optimizing controller A.

4.3.2.2. Self-Optimizing Controller B. This controller is based on input model I2 (two profiles and three scalars), sensor model S1 (with the terminal constraint only available at the end of the batch), and decentralized control structure C2. The controller Pc is implemented online, whereas Qc2 and Qs are implemented in a run-to-run fashion. This gives the self-optimizing controller shown in Figure 6.

Figure 6. Self-optimizing controller B.

4.3.2.3. Self-Optimizing Controller C. This controller is based on input model I2 (two profiles and three scalars), sensor model S2 (with online prediction of the terminal constraint), and decentralized control structure C3. The controllers Pc and Qc are implemented online, whereas Qs is implemented in a run-to-run fashion. This gives the self-optimizing controller shown in Figure 7.

Figure 7. Self-optimizing controller C.

4.4. Validation. There might be several controller candidates for a given process; therefore, it is important to be able to compare them and assess the quality of the approximation. For this, the optimality measure Θ, which expresses the ratio of two cost differences, is introduced:54

Θ = [φ(ucons) - φ(uNCO)] / [φ(ucons) - φ(u*)]        (35)

where uNCO represents the inputs (decision variables) obtained via NCO tracking, u* is the true optimal solution, and ucons represents a conservative solution used in practice. The true optimal solution u* is unknown. It could perhaps be approximated via numerical optimization of a detailed process model. However, note that Θ does not require the optimal input u* itself, but rather the optimal cost φ(u*), which could possibly be obtained by extrapolation of the costs obtained with models of increasing complexity. The optimality measure Θ is in the range of 0-1 and provides the fraction of the potential improvement that has been realized. The closer Θ is to 1, the better the optimality. If the quality of the approximate solution is insufficient, it is necessary to go back and propose a different input model with additional degrees of freedom, or a different sensor model, or a different pairing, or a different implementation, or simply retune the parameters of the controller.
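Eq 35 translates directly into a one-line function. The example numbers below are taken from Table 6, with the cost written as φ = -nC(tf) for the maximization problem; the assumption that the open-loop run plays the role of the conservative solution is an inference from the Θ values reported there.

```python
# Sketch of the optimality measure of eq 35.
def optimality_measure(cost_conservative, cost_nco, cost_optimal):
    """Theta = (phi(u_cons) - phi(u_NCO)) / (phi(u_cons) - phi(u*)).

    For the reactor example (maximize nC(tf)), pass phi = -nC(tf); a value close
    to 1 means most of the potential improvement has been realized.
    """
    return (cost_conservative - cost_nco) / (cost_conservative - cost_optimal)

# example with Table 6 values: open loop 374.6 mol, reoptimization 383.7 mol, optimum 392.3 mol
print(optimality_measure(-374.6, -383.7, -392.3))   # about 0.51
```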
5. Optimization Results for the Illustrative Example

Optimization results for the illustrative example considered throughout this paper are presented in this section. It is assumed that uncertainty is present in the form of time-varying kinetic coefficients:
k1(t) = k1,0 (α/(α + t))        (36a)

k2(t) = k2,0 (α/(β - t))        (36b)
with the nominal values k1,0 and k2,0, α = 1800 min, and β = 900 min. The variation may, for example, correspond to a variation of the catalyst or efficiency with time. The information regarding this variation is, of course, not revealed to the optimization algorithms. If this information were available, the ideal cost value would be φ* = 392.3 mol of desired product C. The optimization approaches based on repeated optimization and NCO tracking are compared. In the repeated-optimization approach, two cases are considered, i.e., without and with online model refinement. In the NCO-tracking approach, the three self-optimizing control schemes presented in section 4 are considered.

5.1. Repeated Optimization. The first scenario does not consider online model refinement. The current state information is used as initial conditions for reoptimization based on the nominal process model. In the second scenario, the model parameters k1 and k2 are updated online every 5 min using standard least-squares techniques and assuming that the entire state information is available. Reoptimization then is performed with the refined model, using the current state information as initial conditions. The optimization results are given in Table 6 for reoptimization periods of 60, 30, and 10 min. The perfor-
Table 6. Comparison of Various Repeated Optimization Scenarios: Distance to Constraints, Cost φ, and Optimality Measure Θ^a

scenario | reoptimization period (min) | minimum jacket temperature (°C)^b | final amount of D (mol)^c | final amount of C, cost φ (mol) | optimality measure Θ
optimal | - | 10 | 100 | 392.3 | 1
open loop | - | 10.1 | 77.5 | 374.6 | 0
update of only initial conditions | 60 | 10.1 | 84.9 | 380.4 | 0.33
update of only initial conditions | 30 | 10.1 | 85.8 | 380.4 | 0.33
update of only initial conditions | 10 | 10.6 | 79.7 | 374.6 | 0
update of initial conditions and parameters | 60 | 10.1 | 88.8 | 383.7 | 0.51
update of initial conditions and parameters | 30 | 10.0 | 77.1 | 372.8 |
update of initial conditions and parameters | 10 | 10.1 | 87.4 | 382.1 |