Reduced-Order Models for Nonlinear Distributed Process Systems

which calls for reduced-order descriptions of the distributed process system. In this work, we ... fully applied to many nonlinear dynamic systems.1-4...
9 downloads 0 Views 173KB Size
Ind. Eng. Chem. Res. 2004, 43, 3353-3363

3353

Reduced-Order Models for Nonlinear Distributed Process Systems and Their Application in Dynamic Optimization Eva Balsa-Canto Department of Applied Mathematics II, Universidad de Vigo, Campus Marcosende, 36280 Vigo, Spain

Antonio A. Alonso* and Julio R. Banga Process Engineering Group, IIM-CSIC, Eduardo Cabello 6, 36208 Vigo, Spain

Dynamic optimization of distributed process systems has received considerable attention over the last couple of years. Most approaches proposed for the solution of these types of problems are based on the use of the control vector parametrization method, which transforms the original dynamic optimization problem into an outer nonlinear programming (NLP) problem. The solution of this NLP problem requires the simulation of the process under consideration for each function evaluation. Unfortunately, this task is usually very demanding for this class of dynamic systems, which calls for reduced-order descriptions of the distributed process system. In this work, we exploit the use of low-order models based on the Gale¨rkin projection on a set of proper orthogonal functions as a very efficient alternative to the solution of dynamic optimization problems for nonlinear distributed process systems. 1. Introduction The objective of dynamic optimization is to compute the time-varying control profiles that optimize (minimize or maximize) a desired performance index (objective functional) subject to the system dynamics. This class of problems is usually solved by the so-called direct approaches, i.e., transforming the original, infinitedimensional dynamic optimization problem into a nonlinear programming (NLP) problem through parametrization of the control vector functions, or both the controls and the dynamic states, by means of combinations of piecewise continuous polynomials. In these formulations, the decision variables are now the coefficients of the polynomial set employed. In the particular case of control vector parametrization (CVP), each evaluation of the objective function requires the solution of an initial value problem (IVP) associated to the system’s dynamics. Direct methods have been successfully applied to many nonlinear dynamic systems.1-4 However, direct approaches are, to a large extent, conditioned by the dimensionality of the state-space description. For example, in the particular case of CVP, the highest computational effort is invested in solving the differential and algebraic equation set (DAE) describing the dynamics of the system. In the domain of CVP methods, such computational demand is an especially critical factor in dealing with systems described by sets of partial DAEs (PDAE systems), as is the case, for instance, of polymerization reactors, crystallization units, or thermal processing of bioproducts. The solution of the PDAE systems usually proceeds in two steps: first, the original infinitedimensional system (PDAE) is transformed into a finitedimensional DAE set. Second, the resulting system is then solved by a standard IVP solver. Traditional discretization schemes include finite differences or finite elements applied to the spatial operators, which usually * To whom correspondence should be addressed. Tel.: +34 986 23 19 30. Fax: +34 986 292762. E-mail: antonio@iim.csic.es.

results into a highly dimensional and stiff DAE system. This approach has been applied by many authors in the context of chemical and biochemical processes.5-7 However, the main limitation of employing spatial discretization is that the dimension, and therefore the computational effort, increases very rapidly, with the size and stiffness of the resulting DAE system thus restricting its application in a real-time context such as real-time dynamic optimization or model predictive control. Alternatively, one can take advantage of the dissipative nature of distributed process systems to derive a reduced-order description that captures the relevant (slow) dynamics of the system. The approach relies on the notion of dissipation, which allows the original infinite-dimensional system to be separated into slow (possibly unstable) and fast (stable) dynamic subsystems. Such time-scale separation has been widely used in the area of process control8 to derive robust nonlinear control for diffusion-convection-reaction systems. On the basis of this concept, Alonso and Ydstie9 developed a number of stability analysis tools for distributed process systems. Recently, Alonso et al.10,11 extended this approach to design robust identification tools for tubular reactors and to stabilize limit cycles in diffusion-reaction systems. The way the original PDAE system is transformed into a low-dimensional DAE proceeds as follows: First, a set of spatial basis functions associated with the dynamics of interest is computed by solving a given eigenvalue problem (see, for instance, work by Alonso et al.10,11). Second, the original PDAE set is projected on the selected basis by a Gale¨rkin method. A particular set of basis functions that proved to be useful in the area of fluid dynamics and turbulence are the ones based on the Karhunen-Loe`ve expansion (also called the proper orthogonal decomposition, POD).12 POD-based reduction techniques have been successfully employed to derive efficient dynamic simulation schemes for nonlinear distributed process systems that involve reaction, diffusion, and convection phenomena.13,14 In

10.1021/ie049946y CCC: $27.50 © 2004 American Chemical Society Published on Web 05/18/2004

3354 Ind. Eng. Chem. Res., Vol. 43, No. 13, 2004

fact, this technique has become quite popular over the last years in the context of identification and process control,10,15,16 parameter estimation,17 and steady-state optimization.18,19 It seems, therefore, reasonable to make use of such a technique as a means of reducing the computational burden associated with dynamic optimization of distributed process systems. Nevertheless, despite the encouraging results that in this direction have been recently reported in the literature,20,21 a number of conceptual and computational issues remain that, to the best of our knowledge, have not been properly addressed yet. Among those, we highlight the following: (i) dealing with nonhomogeneous boundary conditions in order to comply with existing projection methods such as Gale¨rkin; (ii) issues regarding the efficiency of POD computation in cylindrical and/or radial geometries where standard techniques fail;22 (iii) efficient Gale¨rkin projection of nonlinear (reaction) terms of diffusion-reaction systems on the POD set. The purpose of this paper, which is an extension of the ideas originally presented by Balsa-Canto et al.,20 is to provide a systematic POD-based model reduction framework that allows us to overcome the aforementioned questions. In this way, we propose a general systematic technique to numerically compute the POD basis set, which takes advantage of the underlying discretization scheme (either finite differences or finite elements) and also solves numerical questions associated with cylindrical or spherical coordinates. In addition, an approach to handle nonhomogeneous boundary conditions is suggested that transforms the original nonhomogeneous problem into a homogeneous equivalent one through well-defined boundary transformations. Projection issues related to the presence of chemical reaction-type nonlinearities are addressed by extending the space of variables so that the original nonlinear terms now become of the polynomial-type. The paper is organized as follows. For the sake of completeness, section 2 summarizes the underlying ingredients on which reduction is based, including the approaches taken to consider general types of boundary conditions as well as on dealing with nonlinearities. A quite general class of basis functions (the so-called POD) is described in section 3 because this set will be the one employed to derive the reduced-order description. Numerical questions related to solving the associated eigenvalue problems and the effect of the geometry will be also considered. The statement of the dynamic optimization problem, with a brief summary of previous results in the context of dynamic optimization is presented in section 4. Finally, in section 5, the construction of a reduced-order description and its application in dynamic optimization will be illustrated in a case study involving the heat processing of solid food stuff. This example will also be used to illustrate the degree of accuracy obtained with the reduced-order description as compared with standard approaches based on finite differences. 2. Reduced-Order Representation of Distributed Process Systems The numerical solution of partial differential equations (PDEs) usually involves the transformation of the PDAEs into a set of DAEs. To perform this transformation, the spatial operator is projected into a set of spatial functions, locally or globally defined. The use of local

functions, as in the numerical method of lines (NMOL) approach,23 provides precise models. However, direct numerical integration of the resulting system might be computationally expensive. Alternatively, the so-called spectral methods make use of global functions, leading to reduced-order models that capture the relevant information of the system with a reasonable computational cost. Consider a distributed process in general form

xt ) L(x) + σ(x)

(1)

where L(‚) denotes a parabolic linear operator defined on the domain ω and σ(x) is a nonlinear function of the field x. To single out a unique solution for eq 1, the field must satisfy additional conditions that are usually specified at the boundary of the domain ω: (i) First-order (or Dirichlet) boundary conditions:

x(Ω,t) ) F1(ξ,t)

(2)

in which the field is specified at each point of the boundary Ω as a function, in general, of the spatial coordinates ξ and/or the time. (ii) Second-order (Neumann):

xn(Ω,t) ) F2(ξ,t)

(3)

where n is a vector normal to the boundary, so that the normal component of the gradient of the field is specified at each point of the boundary, in general, as a function of the spatial coordinates ξ and/or the time. Conditions at an initial time t0 from which eq 1 evolves are also necessary:

F0(ξ,t0) ) 0

(4)

Equation 1 can be interpreted as an infinite-dimensional system on a Hilbert space equipped with an inner product and a norm of the form

(f, g) )

∫Ω f g dΩ

and ||f||22 ) (f, f)

The dissipative nature of system (1) allows the solution to be expressed as9 ∞

x˜ (ξ,t) )

ai(t) φi(ξ) ∑ i)1

(5)

In this expression, the coefficients ai are functions of ∞ contains the spatial time, whereas the basis {φi}i)1 dependency of the solution. On the other hand, each element of this basis can be considered as the solution of the following eigenvalue problem:10

∫Ω K(ξ,ξ′) φi(ξ′) dΩ

φ i ) λi

(6)

where λi is a parameter associated with each element of the basis and K is a given kernel. Note that, for symmetric kernels, the corresponding eigenfunctions are orthonormal,24 so that (φi, φj) ) δij. In addition, for dissipative processes, the eigenvalues (λi) are ordered in such a way that λi < λj for i < j, imposing a timescale separation on the evolution of ai and enforcing the convergence of the series expansion (5). Consequently, the field can be approximated at arbitrary precision with

Ind. Eng. Chem. Res., Vol. 43, No. 13, 2004 3355

a finite, and usually small, set of terms in the series. Depending on the nature of the kernel K(ξ,ξ′), different types of basis sets will emerge. Representative examples include the following: (1) the spectral basis, in the case that the kernel K coincides with the Green function of the associated linear spatial operator L; (2) the Karhunen-Loe`ve expansion (POD),25 in this case the kernel is associated with the so-called two-point correlation function. The resulting basis functions are known as empirical eigenfunctions. As pointed out by Shvartsman and Kevrekidis,16 there are a number of methods (ranging from the approximate inertial manifolds to the Karhunen-Loe`ve expansion) that exploit the intrinsic time-scale separation property of dissipative systems to approximate their long-term behavior by a low-dimensional dynamic representation. In this work, we propose the use of empirical eigenfunctions for the following two basic reasons: (i) first, because, in contrast with the spectral basis functions, the empirical eigenfunctions can be easily generated from experiments or from numerical solutions of the PDEs for any type of geometric domain and nonlinearity; (ii) second, because it can be demonstrated that this set of functions is optimal in the sense that, for the same accuracy, it requires fewer functions than any other expansion.12,13 2.1. Gale1 rkin Projection on the Basis Set. Considering that the field is substituted by its approximation (5) in eq 1, the following residual is obtained:

R(ξ,t) ) x˜ t - [L(x˜ ) + σ(x˜ )]

(7)

where the dynamic evolution of the time-dependent coefficients ai(t) is computed through a Gale¨rkin projection of the residual on each of the elements of the orthonormal basis set φi26 so that

∫Ω R(ξ,t) φi(ξ) dΩ ) 0

i ) 1, ..., N

(8)

k

k

i)1

i)1

∫Ω L[∑ai(t) φi(ξ)] + σ[∑ai(t) φi(ξ)]φi(ξ) dΩ (9)

with the corresponding set of initial conditions

ai(0) )

ai(t) φi(ξ) ∑ i)1

∫Ω x(ξ,0) φi(ξ) dΩ

i ) 1, ..., N

(10)

2.2. Nonhomogeneous Boundary Conditions. The Gale¨rkin method, as formulated in the previous section, can only be used for processes with homogeneous boundary conditions provided that the trial functions fulfill the boundary conditions exactly. However, this condition is not always accomplished. For example, in most of the dynamic optimization problems to be considered here, the boundary conditions are not homogeneous. Moreover, in most of the cases, they are not even constant because they are usually functions of the control variables. The Tau method,27 however, can be employed to deal with nonhomogeneous boundary conditions. This method enforces boundary condition satisfaction through an extended expansion of the field as follows:

(11)

where NB represents the number of nonhomogeneous boundary conditions. Note that the application of the Gale¨rkin method will now result in k + NB ODEs and, therefore, as the dimension of the problem increases, so also does the size of the reduced-order model. As an alternative, we suggest here the use of a set of transformations that will convert the original problem into an equivalent homogeneous one. These transformations will depend on the types of boundary conditions but are mainly based on the definition of a new field ω as follows:

ω(ξ,t) ) x(ξ,t) - χ(ξ,t)

(12)

so that for the case of first-order boundary conditions

χ ) F1(ξ,t)

(13)

For second-order boundary conditions, χ should be a function satisfying

∂χ/∂n ) F2(ξ,t)

(14)

2.3. Projection of Nonlinear Terms. Many relevant distributed processes are described in terms of highly nonlinear PDEs, which cannot be handled efficiently with the standard Gale¨rkin method. To illustrate this point, let us consider an exponential nonlinearity, σ(x) ) exp[g(x)], in eq 1. This type of nonlinearity appears, for example, in reaction-diffusion processes, where the kinetics depend on temperature. Projection of σ(x) on the set of trial functions results in



k

ai(t) φi(ξ)]} dΩ ∑ i)1

φ (ξ) exp{g[ Ω j

leading to the following set of ordinary differential equations (ODEs):

a˘ i(t) )

k+NB

x˜ (ξ,t) )

(15)

It becomes clear that no closed-form solution exists for eq 15. Alternative solutions are possible either by numerical evaluation of the integral or through a Taylor series expansion of the exponential term around a given reference value for the ai coefficients. Note, however, that both approaches demand an extremely high computational cost. Other possibilities would imply, for instance, the use of fast Fourier transform and collocation as described, e.g., by Fletcher26 and Gottlieb and Orszag.27 Unfortunately, this method will usually result in a very large number of equations, particularly in the two-dimensional or three-dimensional case. Artificial neural networks have also been proposed for the approximation of nonlinear terms.28 However, training a neural network is not an easy task because it requires nonobvious steps such as the design of the network and its tuning through the solution of an optimization problem. The approach we follow in this work consists of transforming the original set of nonlinear PDEs into an equivalent, although larger, set of PDEs with (simpler) polynomial nonlinearities. The following sufficient condition is required for a given nonlinear function v1 ) g(x) to be mapped into a polynomial-type nonlinearity:

3356 Ind. Eng. Chem. Res., Vol. 43, No. 13, 2004

(v1)x ) v2

The main steps involved in the solution of eq 22 are summarized below:29 1. Let φ(ξ) ) φ*(ξ) + η(ξ), where φ* corresponds to the optimum,  is a parameter, and η is a test function in the vicinity of φ*. 2. Maximizing the objective through the method of variations

(v2)x ) v3 ‚‚‚ (vj-1)x ) vj ‚‚‚ (vn-1)x ) P(x,v1,...,vn-1)

where P(x,v1,...,vn-1) is a polynomial function. In our case, the following transformations are imposed:

Θ ) exp(v)

(17)

v ) g(x)

(18)

In this way, the transformed system consists of three PDEs, one for each field (x, θ, and v). Thus, when it is assumed that the linear part of the spatial operator corresponds to the Laplacian operator [L(‚) ) ∆(‚)], the resulting equations will be of the form

]

dJ[φ)φ*+η(ξ)] d

(23)

)0

is equivalent to solving the linear integral equation

∫Ω [(∫ξ′ 〈x(ξ) x(ξ′)〉 φ(ξ′) dΩ′) - λφ(ξ)]η(ξ) dΩ ) 0

(24)

Because η(ξ) is a test function, the only possible solution is

∫Ω 〈x(ξ) x(ξ′)〉 φ(ξ′) dΩ′ ) λφ(ξ)

(25)

xt ) L(x) + Θ

(19)

which is the solution of the variational problem, equivalent to eq 6, but where the kernel is defined as a correlation operator:

vt ) L(v) - ∇(vx) ∇x + vxΘ

(20)

K ) 〈x(ξ) x(ξ′)〉

(21)

3.1. Computing the PODs from a Set of Data. In the previous section, the variational formulation for the calculation of the empirical eigenfunctions was presented. However, the system dynamics will be described by a finite set of data. In essence, the POD approach is nothing other than a data compression technique. Given a data set, represented as a cloud of points on an usually high dimensional space, the POD method determines the lowest dimensional subspace that congregates all of the points within a predefined vicinity. This implies finding the subspace that minimizes the distance at which all points will lie. l be a set of data obtained from Let Sd ) {xi}i)1 experiments or from direct simulation using a “full”order model (for example, a NMOL-based model). Each l element represents a vector with components corresponding to the value of the field at given π positions and l the number of time sample snapshots. This set is assumed to be representative of the behavior of the system for a set of initial conditions, parameters, inputs, and/or disturbances of interest. Let φj be the discrete k , with k e π, the equivalent of φj and SBk ) {φj}j)1 subspace that minimizes the average distance to the data. The discrete equivalent of problem (22) now becomes

Θt ) L(Θ) - vx∇x∇Θ - ∇(vx) ∇xΘ + vxΘ

2

where vx ) ∂v/∂x. This approach offers the following advantages: 1. It is very easy to use. Transformations can be fully automated with symbolic manipulation codes. 2. The resulting system of PDEs is completely equivalent to the original one (1) but easily manageable with the Gale¨rkin method because the nonlinear terms are now polynomial. 3. Each one of these new fields can be approximated using an expansion of the form (5), where the spatial trial functions can be calculated using the POD approach. 4. Finally, although the number of ODEs will, in general, increase, the resulting number of equations will always be smaller than the one obtained through the use of the collocation approach, except perhaps for the unidimensional case. 3. POD The POD basis set is defined through the following optimization problem: Given a set of l measurements of the field x at different spatial and time locations (snapshots), calculate the set of k orthonormal functions that best “captures” the data set. For a given a set of functions x(ξ), the problem consists, therefore, of finding a basis of orthonormal functions φ(ξ) representative of the original data set. Such an optimization problem can be formulated as the maximization of its projection on the set of original functions on average, which formally can be stated as

max J ) max 〈(φ, x)2〉 - λ[(φ,φ) - 1] φ

[

(16)

φ

max J(φj,x)

(27)

φj

with

J(φj,x) )

1

l



l j)1

k

(xjT - wjT)(xj - wj) -

λj(φjTφj - 1) ∑ j)1

(28)

(22)

where 〈‚〉 is the averaging operator defined as 〈‚〉 ) m and λ corresponds to the Lagrange limmf∞ (1/m)∑j)1 multiplier associated with the normalization constraint (φ, φ) ) 1.

(26)

k where wi ) ∑j)1 ajiφj is the projection of the vector xj in the new basis. The solution of this optimization problem29 results in a set of functions φj satisfying

Rφj ) λjφj

(29)

Ind. Eng. Chem. Res., Vol. 43, No. 13, 2004 3357

∫r∫θ∫z〈xrx, xr′x′〉Ψ dr′ dθ′ dz′ ) λΨ(r,θ,z)

where R is the correlation matrix:

R)

l

1

xjxjT ∑ l j)1

(30)

In this way, each element xi from the data set Sd can be expressed as

xi ) φai + 

1

l

∑ l i)1

di2 )

1

l

∑ l j)1

1

λi ∑ i)1

(33)

i.e., the eigenvalues provide a measure of how close the data are to the optimal basis. Combining expressions (32) and (33), we obtain l

∑ λi)1/2 i)k+1

(34)

k

)

l

(39)

so that the empirical eigenfunctions can then be computed through the solution of the eigenvalue problem:

Tψ ) λψ

(40)

with eq 40 being the discrete equivalent of eq 38. The solution of this equation will result in a set of empirical eigenfunctions ψ. Nevertheless, note that these functions are not valid to express the field because they do not allow recovery of the dynamics of the system at the origin (r ) 0). To overcome such a limitation, a method to recover the eigenfunctions φi from the functions ψi is proposed here, considering that φi ) ψi/xr. To begin with, vectors φi will be approximated by means of symbolic manipulation of polynomials satisfying the boundary conditions. A symmetry condition is usually imposed at r ) 0 in the form

[ ] [] ∂

aiφi ∑ i)1 ∂r

∂φi

)0w

r)0

∂r

)0

(41)

r)0

replacing φi by its equivalent ψi/xr in the previous expression results:

(35)

λi ∑ i)1 3.2. Cylindrical and Spherical Geometries. For the case of cylindrical/spherical geometries like the ones to be considered in the applications later on in this work, the resolution of the maximization problem (22) results in

∫r∫θ∫z〈x(r,θ,z), x′(r′,θ′,z′)〉

φ(r′,θ′,z′) r′ dr′ dθ′ dz′ ) λφ(r,θ,z) (36)

At a first glance, this equation is similar to eq 25. However, the presence of the radius in the integrand makes the kernel no longer symmetric.22 To solve this equation, the problem is reformulated to obtain an equivalent symmetric kernel. This can be done by multiplying the equation by xr, thus resulting in

∫r∫θ∫zxr〈x(r,θ,z), x′(r′,θ′,z′)〉 φ(r′,θ′,z′) xr′xr′ dr′ dθ′ dz′ ) xrλφ(r,θ,z) (37) and making Ψ ) xrφ

(xrxj)(xrxj)T ∑ l j)1

k

The dimension of the subspace can be estimated by making use of the so-called “captured energy”, defined as

λi ∑ i)1

l

1

(32)

l

λi ) ∑ xjTxj ∑ l j)1 i)1

Dav ) (

T)

k

xjTxj -

Note that k ) l implies that Dav ) 0, so that SBl and Sd coincide. Therefore l

so that the new kernel K ′ ) 〈xrx, xr′x′〉 is now symmetric. 3.2.1. Numerical Aspects. The discrete equivalent of eq 38 (the correlation matrix) for cylindrical and spherical geometries now becomes

(31)

where E is an error vector, orthogonal to the basis, that represents the distance from each element of the data set to the subspace. The average distance of the data set to the low-dimensional projection plane is then computed as

Dav2 )

(38)

[ ] ∂φi ∂r

(

)

xr ∂ψi ψi )0 r ∂r 2r

)

r)0

(42)

The right-hand-side term (∂ψi/∂r - ψi/2r) is then approximated by a polynomial g(r), which verifies the boundary condition (42), that is

lim rf0

[

]

xr g(r) f 0 r

(43)

where the polynomial is assumed to be of the form g(r) ) rP(r), to fulfill condition (43). When this polynomial is replaced in eq 42

∂ψi ψi ) rP(r) ∂r 2r

(44)

and the resulting differential equation is solved, the following expression for the functions ψ is obtained:

ψi ) c0xr +

N

2cj+1

rj+2 ∑ j)03 + 2j

(45)

3358 Ind. Eng. Chem. Res., Vol. 43, No. 13, 2004

Finally, functions φ will be computed as N

φi ) c0 +

2cj+1

rj+3/2 ∑ j)03 + 2j

Find the retort temperature Tret(t) that maximizes

(46)

where the coefficients c0 and cj will be calculated through the approximation of the functions ψi. 4. Case Study: Dynamic Optimization of a Diffusion-Reaction Process Dynamic optimization can be a key tool to improve the processing of foods and bioproducts.30 The thermal processing of canned food is one of the most important operations of the food processing industry. In economic terms, and according to Euromonitor data, the canning food sector was worth 1322.4 billion US$ in 2000, having grown in real terms by 3.4% since 1995, while it is expected to grow by 11.8% in real terms between 2000 and 2004. The objective of thermal processing of food is to obtain products that are safe for consumers and have a longer shelf life. In this way, foods are processed with the aim of eliminating, or at least reducing, the activity of microorganisms that might cause the spoilage of the food product or that can pose a danger for consumers. The thermal sterilization of canned foods (the process considered here) consists of the heating of the prepackaged food inside a steam retort to inactivate possible spores or microorganisms present in the product. However, not only are microorganisms sensitive to heat but nutrients and other quality properties of the food product can also be affected. Therefore, many efforts have been devoted to calculate optimal combinations of heating temperature and time in order to produce the greatest percentage of nutrient retention while ensuring the required lethality (a measure directly related to the reduction of a given microorganism) for a given product. Such a dynamic optimization problem has received considerable attention over the last years.31-33 Typical mechanisms of food heating include conduction and convection, with conduction being the relevant mechanism in the heating of solids and very viscous foods and convection being the dominant mechanism in liquids. With the purpose of describing the effect that temperature produces in the properties of the food, it is necessary to have a rigorous model based on mass and energy balances. The model must include, in addition, the equations that provide the variation of the concentrations of the microorganisms present in the food and the ones corresponding to the concentrations of substances responsible for the quality of the product, such as vitamins or enzymes. The case considered here is related to the thermal sterilization of solid canned food in cylindrical containers. Banga et al.5 considered the formulation and solution of a number of dynamic optimization problems regarding the thermal processing of food with different objective functions and constraints. In this work, the optimal control problem is defined as that of finding the time-temperature profile outside the product (retort temperature) that maximizes the final retention of nutrients, retN(tf), under constraints on the final microbiological lethality, Fs(tf), and the final temperature in the hottest point. Mathematically, the problem can be stated as follows.

J)

(

10 t ∫0V exp -ln ∫ DN,ref 0

1 VT

T

exp

f

(

) )

T(r,z,t) - TN,ref ln 10 dt dV (47) ZN,ref

subject to the system dynamics

(

∂2T ∂T ∂2T ∂T )R 2 + + 2 ∂t ∂r ∂r ∂z

)

(48)

with the following boundary and initial conditions:

T(R,z,t) ) Tret(t)

(49)

T(r,L,t) ) Tret(t)

(50)

∂T (0,z,t) ) 0 ∂r

(51)

∂T (r,0,t) ) 0 ∂z

(52)

T(r,z,0) ) T0

(53)

Note that because of symmetry considerations it is sufficient to model the heat-transfer and kinetic phenomena in the half-plane of the cylinder. As explained previously, two final time constraints are imposed on the system, one related to the maximum temperature in the hottest point inside the container

∀ r, z ∈ VT

T(x,y,z,t)tf) e T0

(54)

and a second constraint on the lethality associated with the point with minimum lethality (critical point)

Fc(tf) g Fc,D

(55)

where for the present case Fc,D ) 876 s with

Fc(tf) )

∫0t 10(T (t)-T f

c

M,ref)/ZM,ref

dt

(56)

These expressions are a consequence of the assumption of pseudo-first-order kinetics for the thermal degradation of nutrients and microorganisms:

( ) ( )

( (

) )

dCM(t) T(r,z,t) - TM,ref ln 10 )CM(t) exp dt DM,ref zM,ref

(57)

T(r,z,t) - TN,ref dCN(t) ln 10 CN(t) exp )dt DN,ref zN,ref

(58)

On the other hand, the retort temperature is subject to the following upper and lower bounds:

20.0 °C e Tret(t) e 140.0 °C

(59)

Table 1 summarizes the parameters for the case considered, as taken from Banga et al.5 among others. It must be remarked that many of the contributions found in the literature rely on the use of more or less rigorous models based on the conservation laws like the one presented here. However, the complexity of these

Ind. Eng. Chem. Res., Vol. 43, No. 13, 2004 3359

(i) The dynamics of the nutrient concentrations is now described in terms of the new fields as follows:

Table 1. Parameters of the Case Study Considered radius (R, m) height (2L, m) R (m2 s-1) T0 (°C) microorganism zM,ref (°C) DM,ref (s) TM,ref (°C) nutrient zN,ref (°C) DN,ref (s) TN,ref (°C)

0.043 75 0.1160 1.5443 × 10-7 71.11 Bacillus stearothermophilus 10.0 240.0 121.11 thiamin 25.56 10716.0 121.11

( ) (

models, with a highly nonlinear character related to the presence of exponential terms in the kinetics, and the possibility of complex geometries often result into computationally very expensive simulations, not suitable for real-time applications. Such characteristics make of this case study an appropriate candidate to illustrate the applicability and advantages of the techniques proposed in this work. 4.1. Reduced-Order Model Representation. The methodology proposed in section 3.1 will be followed here to derive a reduced-order representation of the thermal sterilization of canned food. (i) To begin with, the original problem with nonhomogeneous boundary conditions is transformed into an equivalent one but with homogeneous boundary conditions, according to eqs 12 and 13:

(

)

∂ω ∂Tret ∂2ω 1 ∂ω ∂2ω + ) L(ω) ) R 2 + + 2 ∂t ∂t r ∂r ∂r ∂z

where, for convenience, ln CN has been used instead of CN. (ii) The data set Sd required for the computation of the POD set has been obtained from direct numerical simulation of the nominal model (a NMOL finite difference scheme was employed) under a number of perturbations and initial conditions representative of the dynamics of the process.16,25 (iii) The next step is concerned with the solution of the eigenvalue problem as formulated in eq 40. Table 2 shows the energy captured by the first eigenfunctions, for each of the three fields. (iv) From Table 2, it is clear that, in order to describe ω, five eigenfunctions are enough to capture most of the energy. In the same way, five equations are necessary to describe Θ and three for ln CN so that, at least, the 99.999% of the total energy is captured. These eigenfunctions are represented in Figures 1-3. (v) Each field is then approximated by an expansion of the form kω

ω ˜ (r,z,t) ) (60)

[

]

ln 10 w -1 Di,ref

(61)

which in combination with eq 60 results in a new PDE:

(

) )

∂Tret ∂2Θ 1 ∂Θ ∂2Θ ∂Θ ln 10 (Θ + 1) + )R 2 + + 2 ∂t Di,ref ∂t r ∂r ∂r ∂z ln 10 ∂ω ∂Θ ∂ω ∂Θ R (62) Di,ref ∂r ∂r ∂z ∂z

(

with homogeneous boundary conditions. Note that with this transformation the exponential nonlinear term is now reduced to a bilinear term in the ω and Θ spatial derivatives, thus facilitating the application of the Gale¨rkin approach. The corresponding initial conditions for PDEs (60) and (62) are

ω(r,z,0) ) T0 - Tret(0) Θ(r,z,0) ) exp

[

(63)

]

ln 10 ω(r,z,0) - 1 Di,ref

aωi(t) φωi(r,z) ∑ i)1

(66)



Θ ˜ (r,z,t) )

where the new field ω is defined as ω ) T - Tret. In this new formulation, the control variable appears explicitly in the PDE (60). (ii) The degradation kinetics, which includes an exponential nonlinear term, is also transformed as explained in section 2.3, leading to

Θ ) exp

)

∂ ln CN Tret(t) - TN,ref ln 10 )exp (Θ(r,z,t) + 1) ∂t DN,ref zN,ref (65)

(64)

aΘi(t) φΘi(r,z) ∑ i)1

(67)

kN

ln C ˜ N(r,z,t) )

aNi(t) φNi(r,z) ∑ i)1

(68)

(vi) The trial functions φ are obtained from the empirical eigenfunctions Ψ, as explained in section 3.2. The reduced-order model is then calculated by substituting the approximated fields given in eqs 66-68 into eqs 60-65 and projecting on the trial functions calculated for ω, Θ, and ln CN. As a result, the following set of ODE coefficients aωi, aΘi, and aNi are obtained:

a˘ ωi ) -cωi

∂Tret ∂t



+

bωijaωj ∑ j)1

i ) 1, kω

(69)

k

Θ ln 10 ∂Tret (aΘi + cΘi) + bΘijaΘj + a˘ Θi ) Dref ∂t j)1



kω kΘ

∑ ∑bijkaωkaΘj

(70)

k)1j)1

with i, j ) 1, kΘ and k ) 1, kω

( ) (

a˘ Ni ) -

ln 10

DN,ref

exp

)

Tret(t) - TN,ref zN,ref

(cNi +

∑j bNijaΘj)

(71)

3360 Ind. Eng. Chem. Res., Vol. 43, No. 13, 2004

Figure 2. Empirical eigenfunctions for Θ expansion. Figure 1. Empirical eigenfunctions for ω expansion. Table 2. Relative Captured Energy versus for the First Empirical Eigenfunctions captured energy eigenmode

ω

Θ

ln CN

1 2 3 4 5 6

95.218 99.302 99.921 99.989 99.999 100.00

99.589 99.998 99.999 100.00 100.00 100.00

96.356 99.393 99.988 99.998 100.00 100.00

with i ) 1, kN and j ) 1, kΘ, where b and c in eqs 69-71 are obtained by the Gale¨rkin projection as

bωij )

∫VTφωi(r,z) L[φωj(r,z)] dV

bΘij )

∫VTφΘi(r,z) L[φΘj(r, z)] dV

i ) 1, kω; j ) 1, kω (72)

i ) 1, kΘ; j ) 1, kΘ (73)

∫VTφΘi

ln 10 bijk ) -R Dref

(

)

∂φωk ∂φΘj ∂φωk ∂φΘj dV (74) ∂r ∂r ∂z ∂z

with i, j ) 1, kΘ and k ) 1, kω and

L)R bNij )

(

∫VTφNiφΘj dV cωi )

)

∂2 ∂2 1 ∂ + + ∂r2 r ∂r ∂z2

i ) 1, kN; j ) 1, NΘ

∫VTφωi(r,z) dV

i ) 1, kω

(75) (76) (77)

Figure 3. Empirical eigenfunctions for ln CN expansion.

cΘi )

∫VTφΘi(r,z) dV

i ) 1, kΘ

(78)

cNi )

∫VTφNi(r,z) dV

i ) 1, kN

(79)

The resulting reduced-order model consists of 14 ODEs: five ODEs describing the temperature distribution inside the food load, eight ODEs describing the nutrients degradation, and the remaining equation describing the lethality in the critical point (Fc). Finally, it must be pointed out that different steps involved in the construction of the reduced-order model can be (and, in fact, were) fully automated in the Mathematica symbolic package.

Ind. Eng. Chem. Res., Vol. 43, No. 13, 2004 3361

Figure 4. Comparison of the reduced-order (dots) and nominal (lines) models: temperature evolution in a number of points inside the food load.

Figure 5. Comparison of the reduced-order and nominal models: average nutrient retention.

4.2. Reduced-Order Model Validation. The performances of the reduced-order model and a nominal (full) model obtained using the numerical method of the lines23 were compared to test the ability of the reducedorder model to capture the relevant dynamic features of the system. A spatial discretization level of 41 × 41 elements was used to maintain the desired accuracy. Therefore, the nominal model consisted of 3363 ODEs: 1681 to describe the temperature distribution, 1 equation for the evaluation of the lethality in the critical point (Fc), and the remaining 1681 for the calculation of the nutrient concentration distribution inside the food. The simulations of both, nominal and reduced, models were carried out under a typical constant retort temperature profile. Figure 4 shows the evolution of the temperature in a set of points inside the food load as predicted by both models. Because the nutrient retention average [retN(t)] is a magnitude of interest for the resolution of the dynamic optimization problem, its value is also compared using both models. The results obtained are plotted in Figure 5. As noted from the figures, the reduced-order model preserves the predictive capabilities of the full model, up to the point of both results being indistinguishable. This is especially relevant if one considers that the size of the nominal model is 240 times the size of the reduced-order model. 4.3. Dynamic Optimization Problem Results. The dynamic optimization problem was solved using the

Figure 6. Optimal control profile and temperature at the critical point.

CVP approach and a stochastic global optimization method ICRS/DS,34 for the solution of the resulting NLP problem. The solution of the dynamic optimization problem involves a number a simulations of the process model. Consequently, and with the aim of reducing the overall computational effort, a new nominal model has been calculated using a less dense mesh consisting of 11 × 11 nodes. As a result, the nominal model, used for optimization purposes, consists of 243 ODEs. The resolution of the dynamic optimization problem with a reduced-order model resulted in a maximum retention value of 47.3% (in good agreement with the results presented by Banga et al.6) in less than 50 s of calculation time. The dynamic optimization profile and the temperature at the critical point are shown in Figure 6. The solution of the dynamic optimization problem with the nominal model produced similar results but in 720 s of CPU time. Note that equivalent results may be obtained using a CVP method based on a local deterministic (gradient-based) optimization method35 but requiring up to 3 times this computational cost. Although these results demonstrate the advantages of using the reduced-order model, a more detailed comparison will allow one to draw new conclusions. Figure 7 presents a comparison of the convergence curves (relative error with respect to the best known solution versus computation time) of both optimization processes. In this figure, the superiority in terms of the efficiency of the reduced-order model is illustrated: starting with an initial control profile, which results in a retention value of less than 32%, a value of 46% is reached in only 0.27 s, almost the same time required for the first objective evaluation with the nominal model (0.20 s). In addition, the use of the reduced-order model produces a retention value of about 47% in only 12 s, which is sufficient for most practical applications. These results demonstrate that reduced-order models can be used not only for the efficient solution of dynamic optimization problems but also for applications like realtime optimization or model predictive control. 5. Conclusions In this work, a novel approach to develop reducedorder models has been presented. The possibilities of this approach are beyond the scope of this paper because

3362 Ind. Eng. Chem. Res., Vol. 43, No. 13, 2004

Figure 7. Comparison of the curves of convergence for the dynamic optimization.

it can be easily applied for the calculation of reducedorder models of nonlinear distributed processes of very diverse nature, with nonhomogeneous time-dependent boundary conditions and different types of nonlinear terms and geometries. Moreover, the derived reduced-order models can be applied for the efficient solution of dynamic optimization problems. This work illustrated the possibilities of using the CVP approach combined with a stochastic optimization approach. Nevertheless, identical results could be obtained using any other type of NLP solver or any other direct approach to solve dynamic optimization problems. The results obtained demonstrate that this technique is very promising not only in the context of simulation and dynamic optimization but also in real-time optimization and model predictive control. Acknowledgment The authors acknowledge the financial support received from the Spanish Government (MCyT Projects AGL2001-2610-C02-02 and PPQ2001-3643) and Xunta de Galicia (PGIDIT02PXIC40211PN and PGIDIT02PXIC40209PN). Literature Cited (1) Vassiliadis, V. S. Computational solution of dynamic optimization problems with general differential-algebraic constraints. Ph.D. Thesis, Imperial College, University of London, London, U.K., 1993. (2) Barton, P. I.; Allgor, R. J.; Feehery, W. F.; Gala´n, S. Dynamic optimization in a discontinuous world. Ind. Eng. Chem. Res. 1998, 37, 966-981. (3) Biegler, L. T.; Cervantes, A. M.; Watcher, A. Advances in simultaneous strategies for dynamic process optimization. Chem. Eng. Sci. 2002, 57 (4), 575-593. (4) Biegler, L. T.; Ghattas, O.; Heinkenschloss, M.; van Bloemen Waanders, B. Large-Scale PDE-Constrained Optimization; Lecture Notes in Computer Science and Engineering; Springer: New York, 2003; Vol. 30. (5) Banga, J. R.; Martın, R. P.; Gallardo, J. M.; Casares, J. J. Optimization of thermal processing of conduction-heated canned foods: Study of several objective functions. J. Food Eng. 1991, 14 (1), 25-51. (6) Blatt, M.; Schittkowski, K. PDECON: A FORTRAN code for solving control problems based on ordinary, algebraic and partial differential equations; Technical Report; Department of Mathematics, University of Bayreuth: Bayreuth, Germany, 1997. (7) Balsa-Canto, E.; Banga, J. R.; Alonso, A. A.; Vassiliadis, V. S. Optimal control of distributed processes using restricted second-

order information. In International Symposium on Advanced Control of Chemical Processes (ADCHEM); Biegler, L. T., Brambilla, A., Scali, C., Eds.; 2000; pp 905-910. (8) Christofides, P. D. Nonlinear and Robust Control of PDE Systems; Birkhauser: Boston, 2001. (9) Alonso, A. A.; Ydstie, B. E. Stabilization of distributed process systems using irreversible thermodynamics. Automatica 2001, 37 (11), 1739-1755. (10) Alonso, A. A.; Kevrekidis, Y. G.; Banga, J. R.; Frouzakis, C. E. Optimal sensor location and reduced order observer design for distributed process systems. Comput. Chem. Eng. 2004, 28 (12), 27-35. (11) Alonso, A. A.; Ferna´ndez, C. V.; Banga, J. R. Dissipative systems: from physics to robust nonlinear control. Int. J. Robust Nonlinear Control 2004, 14, 199-225. (12) Sirovich, L. Turbulence and the dynamics of coherent structures. Part I: Coherent structures. Q. Appl. Math. 1987, 561571. (13) Berkooz, G.; Holmes, P.; Lumley, L. The Proper Orthogonal Decomposition in the analysis of turbulent flows. Annu. Rev. Fluid Mech. 1993, 25, 539-375. (14) Bangia, A. K.; Batcho, P. F.; Kevrekidis, I. G.; Karniadakis, G. E. Unsteady two-dimensional flows in complex geometries: Comparative bifurcation studies with global eigenfunction expansions. SIAM. J. Sci. Comput. 1997, 18 (3), 775-805. (15) Burl, J. B. Reduced-order system identification using the Karhunen-Loe`ve transform. Int. J. Model. Simul. 1993, 13 (4), 183-188. (16) Shvarstman, S. Y.; Kevrekidis, I. G. Low-dimensional approximation and control of periodic solutions in spatially extended systems. Phys. Rev. E 1998, 58 (1), 361-368. (17) Park, H. M.; Kim, T. H.; Cho, D. H. Estimation of parameters in flow reactors using the Karhunen-Loe`ve decomposition. Comput. Chem. Eng. 1998, 23, 109-123. (18) Theodoropoulou, A.; Adomaitis, R. A.; Zafiriou, E. Model reduction for optimization of rapid thermal chemical vapor deposition systems. IEEE Trans. Semiconduct. Manuf. 1998, 11 (1), 8598. (19) Bendersky, E.; Christofides, P. D. Optimization of transportreaction processes using nonlinear modelo reduction. Chem. Eng. Sci. 2000, 55 (19), 4349-4366. (20) Balsa-Canto, E.; Alonso, A. A.; Banga, J. R. Optimal control of distributed processes using the Karhunen-Loe`ve decomposition. AIChE Annual Meeting, Los Angeles, CA, 2000. (21) Armaou, A.; Christofides, P. D. Dynamic optimization of dissipative pde systems using nonlinear order reduction. Chem. Eng. Sci. 2002, 57, 5083-5114. (22) Citriniti, J. H. Experimental investigation into the dynamics of the axisymmetric mixing layer utilizing the proper orthogonal decomposition. Ph.D. Thesis, University of New York, Buffalo, NY, 1996. (23) Schiesser, W. E. The Numerical Method of Lines; Academic Press: New York, 1991. (24) Courant, R.; Hilbert, D. Methods of Mathematical Physics; Wiley: New York, 1937. (25) Holmes, P.; Lumley, J. L.; Berkooz, G. Turbulence, Coherent Structures, Dynamical Systems and Symmetry; Cambridge University Press: Cambride, U.K., 1996. (26) Fletcher, C. A. J. Computational Gale¨ rkin methods; Springer-Verlag: New York, 1984. (27) Gottlieb, D.; Orszag, S. A. Numerical Analysis of Spectral Methods: Theory and Applications; Society for Industrial and Applied Mathematics: Philadelphia, PA, 1977. (28) Rico-Martınez, R.; Krischer, K.; Kevrekidis, Y. G. Neural Networks in Chemistry and Chemical Engineering. In Nonlinear system identification using neural networks: dynamics and instabilities; Bulsari, A., Ed.; 1995. (29) Luenberger, D. Optimization by Vector Space Methods; Wiley: New York, 1969. (30) Banga, J. R.; Balsa-Canto, E.; Moles, C. G.; Alonso, A. A. Improving food processing using modern optimization methods. Trends Food Sci. Technol. 2003, 14 (4), 131-144. (31) Chalabi, Z. S.; Van Willigenburg, L. G.; Van Straten, G. Robust optimal receding horizon control of the thermal sterilization of canned foods. J. Food Eng. 1999, 40 (3), 207-218.

Ind. Eng. Chem. Res., Vol. 43, No. 13, 2004 3363 (32) Alvarez-Va´zquez, L. J.; Martınez, A. Modelling and control of natural convection in canned foods. IMA J. Appl. Math. 1999, 63 (3), 247-265. (33) Kleis, D.; Sachs, E. W. Optimal control of the sterilization of prepackaged food. SIAM J. Optim. 2000, 10 (4), 11801195. (34) Banga, J. R.; Irizarry, R.; Seider, W. D. Stochastic optimization for optimal and model-predictive control. Comput. Chem. Eng. 1998, 22 (4-5), 603-612.

(35) Balsa-Canto, E. Algoritmos eficientes para la optimizacio´n dina´mica de procesos distribuidos. Ph.D. Thesis, University of Vigo, Vigo, Spain, 2001.

Received for review January 16, 2004 Revised manuscript received April 6, 2004 Accepted April 16, 2004 IE049946Y