Local Optima in Model-Based Optimal Experimental Design

Sep 21, 2010 - Jan C. Schöneberger*, Harvey Arellano-Garcia, and Günter Wozny ... Calvin Tsay , Richard C. Pattison , Michael Baldea , Ben Weinstein...
0 downloads 0 Views 653KB Size
Ind. Eng. Chem. Res. 2010, 49, 10059–10073

10059

Local Optima in Model-Based Optimal Experimental Design Jan C. Scho¨neberger,*,‡ Harvey Arellano-Garcia, and Gu¨nter Wozny Chair of Process Dynamics and Operation, Sekr. KWT-9, Berlin Institute of Technology, D-10623 Berlin, Germany

Model-based nonlinear optimal experimental design (OED) has advanced to a relevant systematic tool in model development. However, the high nonlinearity of OED problems and their required high computational effort makes the optimization a challenging task. The objective of this work is to point out a crucial aspect of OED problems, namely, the existence of local minima. For this purpose, general characteristics of OED problems are presented and discussed, using different OED criteria and approaches (sequential/parallel). Furthermore, five different approaches are compared to overcome the local minima problem. Tuning options to improve the computation time of hybrid approaches are derived and discussed. The approaches are compared, based on the results of two case studies. To solve high-dimensional complex OED problems, a proposed hybrid framework is found to be a promising approach. 1. Introduction The use of mathematical models for the description of dynamic and/or steady-state behavior in process engineering has become a common technique for tasks of analysis, control, design, and optimization in chemical engineering. Usually, these models consisting of equations, assumptions, parameters, areas of validity, and documentation are embedded in simulation tools (e.g., gPROMS, ChemCAD, Aspen Plus, MOSAIC,1 etc.). Models for new process units, components or mixtures, and new chemical reactions coming with novel or improved processes must first be built and validated before they can be used for such environments. For most of the relevant reactions in the chemical industry, in particular, reaction kinetic data cannot be derived theoretically as, for example, equilibrium data for phase equilibria. Moreover, ∼90% of the industrially relevant reactions are supported by catalysts,2 and, for each new catalyst, new models must be developed, even if the reaction is already known.3 Consequently, the model building process currently determines an important aspect of chemical engineering. For this purpose, diverse theoretical and experimental steps must be performed until a model can be implemented in any simulation tool. However, the experimental part is very currency-, material-, and time-consuming. Thus, much effort has been currently invested to systematize the model building process, to reduce time and costs.4-8 Moreover, the approach of model-based nonlinear optimal experimental design (OED) has become more popular since increased computational power and improved algorithms allow for fast numerical solution and optimization of large-scale and complex systems. Nevertheless, the optimization procedure still remains a challenging task, and, so far, there is no general framework with the help of which a suitable optimization technique, an efficient equation solver, an appropriate discretization method, and an adequate formulation of the objective function can be selected. Concerning the last point, we refer to Franceschini and Macchietto9 for their reported work on this issue. However, local minima of the objective function in OED, irrespective of its formulation, remains still an untreated problem. In this work, it can be shown that local minima naturally appear during OED and that they can, in fact, * To whom correspondence should be addressed. E-mail: [email protected]. ‡ Current address: TU Berlin, Sek. KWT 9, Str. des 17. Juni 135, 10237 Berlin, Germany.

strongly influence the performance of an OED framework. Moreover, because of the fact that, besides the highly nonlinear objective function, the model equation systems (e.g., ordinary differential equations (ODEs) or differential algebraic expressions (DAEs)) must be solved either sequentially (solving the equation system in each step) or simultaneously (taking the equation system into account as nonlinear constrains of the objective function), the solution of OED optimization problems is a computationally expensive task. For details and recommendations about which technique should be used, see, e.g., the work of Biegler et al.10 In chemical reaction engineering, these equation systems show commonly high stiffness, because reactive systems feature an exponential behavior, e.g., using the Arrhenius approach for the description of the temperature dependency of reaction rates. General speaking, the objective function call is the most expensive step in OED. This prohibits the use of stochastic global search algorithms, which commonly require thousands of objective function evaluations to converge. Another drawback is that penalty functions must be implemented to consider variable boundaries. In this work, the key idea is to combine stochastic and gradient-based optimization in such a way that their main drawbacks (long convergence time/local minima) remain effectless. For this purpose, particle swarm optimization (PSO),11 as a stochastic optimizer (GO), and NPSOL,12 as a gradient-based SQP optimizer (LO), are used. However, the approach can also be implemented with other algorithms, such as Simulated Annealing,13,14 Genetic Algorithms,15 or other optimization tools (such as SNOPT,16 CONOPT,17 IPOPT,18 MINOS,18 etc.). A brief overview of state of the art in SQP implementations can be found in the work of Gill et al.19 In the next sections, the issue of local minima in model-based optimal experimental design is discussed. To overcome this problem, a developed hybrid optimization approach and its modifications are introduced. To compare those approaches, two case studies are discussed in detail. Based on these results, tuning options for the hybrid approaches are derived and analyzed on a case-by-case basis. Finally, conclusions and a brief outlook are given. 2. Problem Statement The main task of model-based nonlinear optimal experimental design (OED) for parameter precision is to determine the experimental conditions u with which a function φ of the

10.1021/ie9016094  2010 American Chemical Society Published on Web 09/21/2010

10060

Ind. Eng. Chem. Res., Vol. 49, No. 20, 2010

Figure 1. Step response of a PT1 system.

parameter covariance matrix CP is minimized (see eq 1). Concerning the solution of the optimization problem, in the simultaneous approach, the model equations F and G are solved, giving nonlinear constraints, whereas in the sequential approach, they are solved internally for the calculation of the objective function. In this work, the sequential approach is adopted. min φ(CP) u

(1)

subject to F(x˙, x, u, p, t) ) 0 G(x, u, p, t) ) 0 lb e u e ub Common approaches for the function φ are the so-called A-, D-, and E-Criterion,20 given in eq 2, where, per definition, A minimizes the mean parameter standard deviations, D minimizes the size of the confidence region, and E minimizes the largest parameter spreading.

{

A:

trace(CP) dim(CP)

φ(CP) ) D: det(C )1/dim(CP) P E: max(eig(CP))

(2)

To overcome problems of parameter correlation or robustness,21 these criteria can be modified and combined. However, this work first focuses on the conventional criteria. For models with one parameter, the A-, D-, and E-Criterion result in the same objective function, which is equal to the variance of the considered parameter. The covariance matrix of the parameters in eq 3 is calculated based on the Fisher information matrix F for NDP data points, assuming a quadratic behavior of the objective function of the parameter estimation problem (this means a linear behavior of the measurement function, with respect to the parameters). (See, e.g., the work of Bard.22) Here, a maximum likelihood estimation is used, assuming a normally distributed measurement error characterized by the covariance matrix Cx. The state variables x are measured directly at the position, time

point, or experiment DP. NDP is the number of available data points.

{ ∑ [( NDP

CP g F

-1

)

DP

) ( )]}

∂xDP T -1 ∂xDP Cx ∂p ∂p

-1

(3)

Equation 3 is also called the Cramer-Rao bound. Equality is achieved, if the estimated set of parameters is efficient,23 which means that it represents the parameter values at the global optimum of the maximum likelihood estimation. This is assumed during the entire paper. Local minima of the objective function naturally appear when there is a value of u that results in an uninformative experiment. The number of these local minima increases with the number of experiments designed, which will be demonstrated in the next section. Another reason for local minima, in the case of more than one parameter to be estimated, are repeated experiments. These two causes are elucidated by means of an illustrative example. 2.1. Illustrative Example. As a reference, the first-order lag element PT1 system in eq 4 is first chosen because of its relatively simple solution (see eq 5). Moreover, determining the time constant (τ) and gain (k) are common tasks in control applications. Figure 1 shows the system step response and its corresponding parameters. τx˙ + x ) kc

[

x(0) ) 0

( τt )]

x ) kc 1 - exp -

(4) (5)

In this illustrative example, two cases of experimental design are considered: (1) the choice of the step amplitude u ) c for a fixed measurement time tM and for one parameter p ) τ, and (2) the choice of measurement times ui ) tM,i for a fixed step amplitude c for two parameters, namely, p1 ) τ and p2 ) k. 2.1.1. Optimal Amplitude. Consider the case that the time constant τ of a PT1 system must be identified. The objective is to determine the step amplitude with which maximal information is obtained, concerning the time constant. Here, a fixed

Ind. Eng. Chem. Res., Vol. 49, No. 20, 2010

10061

Table 1. System Conditions for the Optimal Amplitude Design general

x

Cx

specific range/value

x(t ) tM) [-∞, ∞]

2

σxx 1

u

p

fixed

fixed

c [-0.5, 1]

τ 1

tM 1

k 1

measurement time (tM ) 1), a constant measurement variance (σxx2 ) 1), and a gain of k ) 1 are assumed. The parameter value is set to one time unit (p ) τ ) 1). Based on this observation, eqs 6, 7, and 8 denote the sensitivity, the Fisher information matrix, and the objective function of the OED problem, respectively. In addition, Table 1 summarizes the system conditions.

( ∂p∂x )

t)tM)1

t Mc

)-

) -ue-1

2

τ exp(tM /τ) 2 -2

F)ue φ(CP) )

e2 u2

(6) Figure 3. Objective function over the design variables u1 and u2.

(7) (8)

The range of the control variable u ) c is considered to be limited within the interval u ∈ [-0.5, 1]. To avoid infinite values of the objective function, it is assumed that an initial experiment with u0 ) 1 is performed (F1 ) e-2). In the case of one parameter to be estimated, there is no cross effect between earlier and new experiments, as described in Appendix A. The resulting new objective function is denoted in eq 9. φ1(CP) )

e2 1 + u12

(9)

Figure 2 shows its value over the considered interval. The global optimum lies at u1,GB ) 1 and a local optimum is found at u1,LB ) -0.5. Taking a random initial guess for u1 will lead to the global optimum with a combinatorial probability of 66.67% (δC ) 1 - 1/3 ) 2/3). Now, let us consider the case that two experiments are to be performed to improve the parameter accuracy. The initial experiment with u0 ) 1 is again performed. The multivariable objective function (φ1,2) then takes the form given in eq 10. Its surface is depicted in Figure 3. φ1,2(CP) )

e2 1 + u12 + u22

(10)

The global optimum and three local optima can be detected at the positions [1 1], [-0.5 -0.5], [-0.5 1], and [1 -0.5], respectively. The combinatorial probability for finding the global

Table 2. System Conditions for the Optimal Measurement Time Design general

x

Cx

ui

p

fixed

specific range/value

x(t ) tM) [-∞, ∞]

2 σxx

tM,i (0, 2.5]

[k τ] [1 1]

c 1

1

optimum with an initial guess is then reduced to 44.45% (δC ) 1 - (1/3)(1/3) - (1/3)(2/3) -(2/3)(1/3) ) 4/9). It becomes obvious that, with an increasing number of parallel designed experiments, this will become worse. In this example, it is rather straightforward to locate the global optima. Besides, there is a noninformative experiment for u ) 0. However, in complex systems, this is generally not the case. But even for the simple PT1 system in eq 1, different scenarios may occur for which no solution can be found by just taking a look into the equations. This is demonstrated in the next example. 2.1.2. Optimal Measurement Times. The time constant τ of a PT1 system should be identified again. In addition, the gain k is treated as an additional unknown parameter. For the determination of τ, the maximal allowed amplitude c ) 1 leads to the best experiments, as seen above. In this case, however, it should now be determined at which time points (tM,i) measurements are to be performed. Problems of this type are called optimal sensor allocation problems in the field of parameter estimation in distributed-parameter systems.24 The system conditions are given in Table 2. Because there are two parameters involved now, at least two experiments are required to reach identifiably. Thus, a first experiment is designed to estimate the time constant (p1 ) τ) by considering the second parameter to be known. The objective function and its derivatives are given in eqs 11-13. φ(CP) )

( )

τ4 exp(2tM /τ) k2tM2

2τ3 exp(2tM /τ)(τ - tM) ∂φ )∂tM k2tM3

(11)

(12)

( )

2τ2 exp(2tM /τ)(3τ2 - 4τtM + 2tM2) tM ) τ 2e2 ∂2 φ ) 98 2 ∂tM2 k2tM4 k (13)

Figure 2. Objective function over the design variable u1.

It is a known fact that, for any values of k and τ, a measurement at a time point equal to the time constant is optimal to estimate τ. (See, e.g., Sung et al.25) This is proven by the objective function eq 11, which reaches a minimum for tM ) τ,

10062

Ind. Eng. Chem. Res., Vol. 49, No. 20, 2010

Table 3. Results of the OED for the First Two Measurements

intuitive A-Optimal D-Optimal E-Optimal

tM,1

tM,2

A

D

E

σk,k

στ, τ

Fk,τ

1.00 0.66 0.78 0.66

2.50 2.50 2.50 2.50

16.41 13.48 13.86 13.48

4.81 4.67 4.61 4.67

32.10 26.13 26.93 26.13

203 186 190 186

536 485 491 485

-0.897 -0.855 -0.869 -0.855

as eq 12 equals zero and eq 13 is positive at that point. To determine the gain k, the latest possible measurement is optimal because the response x of the system approaches the value of k with time (Figure 1). This initial set of two experiments is considered to be intuitive optimal. The functions used to calculate the information matrix (FM,i) for a single measurement are given in eqs 14-17. FM,i )

[

f11 f12 f21 f22

[

f11 ) 1 - exp f12 ) f21 ) -

]

(14)

( )] -tM,i τ

2

ktM,i[1 - exp(-tM,i /τ)]

f22 ) -

τ2 exp(tM,i /τ) k2tM,i2 τ4 exp(2tM,i /τ)

(15)

(16)

(17)

Because of the required inversion of the sum of the matrices FM,i, the expressions for the objective functions become excessively long, such that they are not explicitly appointed here. Based on the analytic description of the objective function, the global optima for tM,1 and tM,2 can be calculated for different criteria by screening the entire variable interval. The results are summarized in Table 3. Moreover, values of the objective function (A, D, and E), parameter standard deviations (σii), and the correlation coefficient (Fij, defined by eqs 18 and 19) are specified. σii (%) ) √CPii × 100% Fij )

CPij

√σii × √σjj2

(18) (19)

2

Figure 4 shows the calculated 95% confidence regions for the different designs. The axes are not equally spaced, for the purpose of highlighting the shape of the ellipsoids.

Figure 4. 95% confidence regions for the different design cases described in Table 3.

Table 4. Results of the A-Optimal Design for the First Eight Measurements decision variable

tM,1

tM,2

tM,3

tM,4

tM,5

tM,6

tM,7

tM,8

GO GOposition

0.66 left

2.50 right

0.60 left

2.5 right

0.63 left

0.60 left

2.5 right

0.62 left

It can be concluded that the intuitive design is not optimal, because it does not consider the correlation between the parameters. The A- and E-Optimal design criteria yield similar results, while the D-Optimal design disregards the correlation, which leads to a smaller area of the ellipsoidal, but a larger correlation coefficient. For the following studies, the A-Optimal design is adopted. Furthermore, sequential OED steps are performed to increase the parameter accuracies. In Table 4, the additional next six designs are presented, while Figure 5 shows the behavior of the objective function for the A-Criterion concerning the decision variable. The distinctive feature is here, now that the position of the global optimum changes alternatively. For the first experiments, this can be interpreted as follows: After one measurement has been taken at the right position of the interval, the left position gets more attractive, because a “new point” could contain more information than a repeated experiment. However, this rule of thumb is not valid for the sixth experiment! Here, the left position gives the global optimum two times back to back. 2.2. OED for Complex Systems. The illustrative example has shown that local minima occur even in the simplest OED applications. However, OED for chemical engineering problems is often related to much more complex underlying equation systems and the resulting optimization problems are generally multidimensional. Based on the next OED applications, it can be demonstrated that the local minima will not vanish with an increasing problem complexity. 2.2.1. Determination of Protein Adsorption Isotherms. This example is related to the reliable determination of parameters in steric mass (SMA) ion-exchange equilibrium formalisms for the adsorption of proteins.26 To identify the set of six model parameters, a couple of at least four experiments must be designed in parallel. The reason for this relatively high number of required measurements (seven data points per experiment) is that the parameter set is enlarged by the uncertain experimental conditions. In this case study, the existence of local minima can be seen in Figure 6, where different combined sequential and parallel ordinary differential equation (ODE) strategies are compared: The optimal design for the entire parallel approach (10 experiments, denoted with a square mark) includes the optimal solutions of the sequential/parallel approaches (4 plus 6 times 1, 4 plus 2 times 3 and 5 plus 5), and thus, it must at least give the same optimal value for the objective function. It is quite obvious that a local optimum was found only for the entire parallel approach, so as for the 5 plus 5 approach. This supports the statement that the number of local optima, and, therewith, the chance for not finding the global optimum, increases with the number of design conditions (i.e., search directions). In Figure 7, the objective function for the design of the 20th experiment is plotted. Two of the four design variables, the physical meanings of which can be found elsewhere,26 are fixed. The differences in the values of the A-Criterion are small, because of the high number of experiments that were already performed; however, the existence of local optima is still observable. The value of the A-Criterion after 19 optimal experiments is 0.0493, so that the experiment derived with the locally optimal set of design variables leads to a marginal

Ind. Eng. Chem. Res., Vol. 49, No. 20, 2010

10063

Figure 5. Objective function values for A-Optimal designs.

Figure 6. Development of the A-Criterion with sequential and parallel OED strategies disturbed by local minima.

reduction of the A-Criterion and can thus be called an “uninformative experiment”. 2.2.2. Determination of Kinetic Parameters from Isothermal Batch Experiments. Chemical reaction engineering offers a wide field for the application of OED techniques. The following example, which concerns the estimation of kinetic parameters for the catalytic oxidation of methane, was taken from EUROKIN-Test Cases for sequential experimental design.27 Based on the data given by Buzzi-Ferraris and Manenti,28 a set of kinetic parameters has been estimated. Only experiments with the same inlet temperature were considered (T ) 733.1 °C), so that the velocity constant and the absorption equilibrium constant can be determined for this temperature. The experimental conditions, which can be set for the next experiment, in , and are the inlet concentrations of the educts: methane PCH 4 oxygen POin2. The temperature remains the same as in the preliminary experiments. In Figure 8, the surface of the objective function of the OED problem (eq 2, using the A-Criterion) is plotted in the feasible range of the design variables.

in The value of the A-Criterion at the local optimum (PCH ) 4 in 0.02 bar and PO2 ) 0.05 bar) is ∼5 times higher than at the in ) 0.03 bar and POin2 ) 0.01 bar)! It can global optimum (PCH 4 be assumed that the number of local minima increases when the parallel OED approach is applied. 2.2.3. Determination of Kinetic Parameters from Adiabatic FBR Experiments. In a previous work, a detailed study on OED was carried out for the determination of the kinetic parameters of the oxidation of sulfur dioxide (SO2) in an adiabatic catalytic fixed-bed reactor (FBR).29 In Figure 9, the surface of the corresponding objective function is plotted for two of the design variables, namely, the reactor inlet temperature (T) and the Reynolds number (Re), indicating the superficial gas velocity inside the reactor. Because of the high nonlinearity of the underlying equation system, the objective function features several local minima. 2.2.4. Determination of Kinetic Parameters in a PilotPlant Reactor. The former application (section 2.2.3) repre-

10064

Ind. Eng. Chem. Res., Vol. 49, No. 20, 2010

Figure 7. Objective function over the design variables u20, 3 and u20, 4 in the OED for the determination of protein adsorption isotherms.

in Figure 8. Objective function values, with respect to the design variables PCH and POin2 in an OED application for an isothermal batch reactor. 4

sented a preliminary study for the operation of a pilot-plant reactor. Because of technical limitations, an adiabatic operation of the reactor was not possible; thus, the model equations had to be extended by heat-loss relations.30 The pilot-plant setup is depicted in Figure 10. Since toxic and highly corrosive gases and liquids are involved, each experiment implies high costs, in terms of time and money. Thus, most of the units in the pilot plant, which must be allocated in an extractor hood, are related to the reactor off-gas treatment, namely, the gas cooling, the acid condensation, the acid fog separation, and the final gas scrubbing. During 1 h of operation (approximate time needed to reach steady-state profiles), an amount of ∼10 L of concentrated sulfuric acid is produced! Hence, each experiment spared means a considerable reduction of costs. Now, consider the case that the aim of the model building procedure is to reach a parameter spreading below 3%. Figure 11 shows the variation of the A-Criterion and the spreading of the worst determined parameter σmax (given by the E-Criterion), with respect to the reactor inlet temperature T for the OED of the next experiment after several performed experiments. The

other design variables have been fixed to their globally optimal value. Only if either the global optimum at T ) 410 °C or the local optimum at T ) 383 °C are found, a further experiment could be avoided. The performance of the experiment designed with the temperature found with the local optimum at T ) 372 °C would lead to the result that another experiment must be performed to reach the desired model accuracy. 3. Solution Approach To overcome the stated problem of local minima, in this work, the proposed approach is to apply a developed hybrid optimization algorithm, which basically represents the combination of global (GO) and local (LO) optimization techniques. Recently, the basic idea of the hybrid optimization approach was introduced by the authors for the first time.29 However, in that work, an empirical way was used to overcome the local minima in a complex OED problem. The scope of the present paper is to analyze the performance of the proposed framework and to compare it to alternative approaches. Moreover, derived tuning

Ind. Eng. Chem. Res., Vol. 49, No. 20, 2010

10065

Figure 9. Objective function values, with respect to the design variables T and Re in a highly nonlinear case.

Figure 10. Pilot-plant setup: (a) quality measurement, (b) heating section, (c) reactor, (d) tube bundle cooler, (e) quality measurement, and (f) bubble column. (Reproduced from ref 30.)

options for the hybrid framework lead to an efficiency improvement of the solution approach. For the sake of completeness, an optional approach must be mentioned. The use of parameter relaxation for a robust OED leads to a flattening of the objective function.31 This feature can be stressed until the objective function becomes completely convex. However, the computational effort will greatly increase,

because of the fact that either a min-max problem must be solved or higher-order derivatives are required. Furthermore, in cases of relative small changes in the parameter values, the parameter relaxation can lead to nonoptimal designs.32 3.1. Hybrid Optimization Framework. The stated objective functions of the considered optimization problems in eq 2 are highly nonlinear. Furthermore, the functions of the covariance matrix exhibit several local minima within the considered region, as specified above. However, the surface of the objective function shows a trend toward a certain direction. This tendency can be exploited, using the stochastic search algorithm PSO.11 It simulates particles, which move over the surface of the objective function. The particles share their actual and historical information (i.e., the values of the objective function) and change the direction of their movement toward the best particle position. The main problem with this algorithm (and most other GO techniques) is that it is very difficult to find any real minima, local or global, since it does not use any gradient information. This leads to huge convergence times. To overcome this, the algorithm is combined with a gradient-based optimization algorithm. The key idea is that each particle should move to the next optimum using gradient information before it shares information with the other particles. The algorithm framework is roughly depicted in Figure 12. Because of the particle movement, the initial points of the gradient-based algorithm can be in the infeasible region. This requires a robust algorithm such as NPSOL that can handle these types of problems.33 Another drawback of global optimization algorithms is solved in the following way: the bounds of the optimization variables do not have to be forced with penalty functions, because NPSOL corrects the particle positions.The combination of GO and LO algorithms can be understood as a reduction of the search space of the GO, as depicted in Figure 13. Instead of searching the whole decision variable intervals, only the local optima are scanned. This leads to small convergence times and notably reduces the number of iterations required for the GO algorithm. Because of the fact that most of the local minima are explored by the hybrid framework and a local best position only gives a local minimum, a first approach for tuning the PSO algorithm is

10066

Ind. Eng. Chem. Res., Vol. 49, No. 20, 2010

Figure 11. A-Criterion and parameter spreading for the potential last experiment in the model building campaign, as a function of the reactor inlet temperature.

only reduced to a value that is one-half times the global weight. The reference set of the PSO parameters, as well as the applied set in this work, are given in Table 5. The number of particles (Np) and the number of iterations (Niter) are reduced. The applied formula for calculating Np and Niter with the problem dimension Nd in eq 20 was derived empirically. Np ) Niter ) 2Nd

Figure 12. Hybrid optimization framework.

Figure 13. Search space reduction using the hybrid approach.

to set the local weighting c1 to zero (see Appendix B for the description of the algorithm and its parameters). Anyhow, this did not lead to an improvement of the PSO performance. It is assumed that taking out this random effect heavily degrades the PSO efficiency. Hence, the local weight was

(20)

3.2. Alternative Hybrid Frameworks. Alternatively to the presented approach, two different combinations of global and local optimization algorithms are proposed. The former approach tries to explore all the valleys of the objective function, and thus, almost all local minima can be detected. The approach of the first alternative framework is to reduce the effort for the local optimizer by first exploring the hypersurface with only global optimization and finally starting the local optimization in the best particle positions (Alternative A). The LO effort can be reduced further when only the global best particle position is taken and, thus, only one valley is explored (Alternative B). The frameworks are depicted in Figure 14. Because of the reduction of local optimizer calls, the particle number (Np) and the iteration number (Niter) can be increased, in comparison to the basic hybrid framework. However, the effort for the GO steps increases, because the new particle positions must be checked for boundary violations. The applied PSO parameters are given in Table 5. 3.3. Reference Approach. The hybrid optimization approaches increase the number of local optimizer calls. To decide whether a benefit can be gained from this, the approaches must be compared to a random seed of calls to the local optimizer, leading to the same number of LO steps. The reference approach can be implemented as an PSO call with one iteration and a number of particles Np equal to the seed size. The reference approach can also be used to determine the probability of finding the global optimum with a single local optimizer call by repeating the search with a seed of size 1 several times. Thus, a characterization of the objective function hypersurface can be obtained. 3.4. Approach Summary. In Table 5, the PSO parameter sets of the presented approaches are summarized. The basic PSO parameters were taken from Schwaab et al.34 The number of

Ind. Eng. Chem. Res., Vol. 49, No. 20, 2010

10067

Table 5. PSO Parameter Sets

a

PSO hybrid framework Alternative A Alternative B reference approach a

c1

c2

w0

Np

Niter

2 1 1 1

2 2 2 2

0.9 0.9 0.9 0.9

20 2Nd 3Nd 4Nd 4Nd × Nd

100 2Nd 15Nd 17Nd 1

Nφ Np Np Np Np

× × × ×

Niter × NNPSOL Niter + Np × NNPSOL Niter + NNPSOL NNPSOL

N2D 2000 320 300 292 320

From ref 34.

total objective function evaluations (Nφ) is dependent on the number of calls required by the local optimizer (NNPSOL). The last row of Table 5 was calculated assuming a 2D problem and a mean number of LO function evaluations of NNPSOL ) 20. While the number of particles and iterations for the hybrid framework is given by eq 20, the combinations for the other approaches were derived by maintaining the same magnitude of function evaluations, as explained in the respective sections. The termination criterion of the local optimizer (optimality tolerance, εrel), which is calculated using the reduced gradients, is set to εrel ) 10-8. For more details, see Holsto¨m et al.35 4. Case Studies To compare the efficiency of the presented approaches, two case studies are considered. The first one is rather disconnected from OED problems and it is just used to analyze and compare the numerical behavior of the approaches. In the second case study, the already described PT1 system is investigated. For this purpose, sequential as well as parallel OED steps are performed. 4.1. Case Study 1: Test Surface. As a test surface for the optimization approaches, the 2D function given in eq 21 is employed. This function was chosen because it features several local minima and shows a tendency toward a certain search direction. Furthermore, some local minima are located on the boundary of the search intervals, as found also in OED problems. Figures 15 and 16 provide an impression of the test surface in the search range of x1,x2 ∈[0, 11π]. The function φ has a global optimum at the function value φ(x1 ) 33.02, x2 ) 31.51) ) -2092.2. The search algorithms

Figure 14. Alternative hybrid optimization frameworks.

were applied 1000 times to the optimization problem. The results are summarized in Table 6. φ ) -[sin(x1)x1x2 + cos(x2)x22 + x1 + x2]

(21)

The reference PSO algorithm seems to be quite suitable for this type of problem. In all of the cases, it can find the global optimum. In this case study, the high number of objective function evaluations represents no drawback, because of the fact that the function evaluation takes only ∼10-6 s. In comparison to this, the NPSOL initialization takes too much time, such that the mean overall calculation time (tcal) for PSO is even lower than that for the reference approach. Alternatives A and B, as well as the reference approach, find the global optimum in ∼50% of the cases. The hybrid framework leads to the lowest number of function evaluations, and it can find the global optimum in 65% of the cases. The lower number of NPSOL steps, in comparison to the other approaches, can be explained with the overlaying PSO algorithm; i.e., when GO starts to converge, the particle positions become so good that the number of LO steps can be reduced considerably. This feature is discussed more in detail in the section regarding the tuning options. However, in the case of a fast computable objective function and a low dimensionality, PSO is the algorithm to be chosen. 4.2. Case Study 2: OED for a PT1 System. In the second case study, the optimization approaches are applied to an OED problem. Although the system is a simple one, it features some characteristics of OED problems as stated in the problem statement section. Here, the two-parameter problem is considered (p ) [k,τ]) and the decision variables are the amplitude u1 ) c ∈ [-0.5, 1] and the measurement time u2 ) tM ∈ [0.1, 2.5].

10068

Ind. Eng. Chem. Res., Vol. 49, No. 20, 2010

Figure 15. Three-dimensional (3D) plot of the test surface.

Figure 16. Contour plot of the test surface. Table 6. Comparison of the Approaches for the Test Surface

PSO34 hybrid framework Alternative A Alternative B reference approach

GO hits

NNPSOL



tcal (ms)

1000/1000 649/1000 461/1000 483/1000 516/1000

0 16 18 18 18

2000 256 270 263 288

0.107 0.160 0.056 0.023 0.193

Moreover, the A-Criterion is used and the procedure starts taking the A-Optimal experiments (see Table 3) as a basis. To simulate a real OED behavior, the system is formulated as an ordinary differential equation (ODE) and integrated together with the sensitivity equations for s1 ) ∂x/∂k and s2 ) ∂x/∂τ (see eqs 22-24) using the Matlab function ode45.36 Furthermore, the gradient of the objective function is calculated numerically by NPSOL. x˙ )

kc - x τ

(22)

s˙1 )

c - s1 τ

(23)

s˙2 )

x - kc - s2τ τ2

(24)

4.2.1. Sequential Design. The third and fourth experiments are designed in a sequential way. This reduces the number of

dimensions to the number of decision variablessin this particular case, to Nd ) 2. Figure 17 shows the surfaces of the objective functions for the third (left) and fourth (right) experiment. Because of the sequential design procedure, the number of local minima does not increase. Two of them are located in the vertexes and the third is located at the lower bound of the amplitude c. In this case, the position of the global optimum does not change from the third experiment to the fourth experiment, but the valley around it becomes more narrow. For a single call to NPSOL, the probability to find the global optimum is 43% for the third experiment and 27% for the fourth experiment (estimated as stated in the description of the reference approach). In analogy to the first case study, 1000 runs were performed for each approach. The results are summarized in Table 7. Here, the reference approach is the winner. The probability for a random seed of 16 initial values to find the global optimum is ∼99.988%! Even for the fourth experiment with a lower probability for a single call, the chance for the reference approach is >99%. The PSO approach also benefits from these high probabilities, because it contains an initial seed of 20 particles, from which nearly one-half (or onequarter, respectively) starts in the valley of the global optimum. However, because of the high number of function calls, its mean calculation time is more than twice as high as that for the other approaches. The alternative hybrid approaches are slightly faster than the hybrid framework, because of a reduced number of

Ind. Eng. Chem. Res., Vol. 49, No. 20, 2010

10069

Figure 17. Objective function surfaces for the third and fourth experiments. Table 7. Comparison of the Approaches for the Design of the Third Experiment

PSO34 hybrid framework Alternative A Alternative B reference approach

GO hits

NNPSOL



1000/1000 954/1000 903/1000 939/1000 1000/1000

0 27 29 29 29

2000 432 354 301 464

Table 8. Comparison of the Approaches for the Parallel Design of Seven Experiments

tcal 4.5 1.9 1.3 0.8 2.2

s s s s s

NPSOL calls, which also means a reduced number of objective function evaluations. Anyhow, their chance to hit the global optimum is ∼5% (or ∼1%, respectively) lower than that for the hybrid framework. It can also be noted that, because of the convergence of the overlaying GO algorithm, the number of calls to NPSOL can be reduced for the hybrid approach. This is the case when all the particles are located in the same local or global optimum and NPSOL can be terminated with one iteration (which still includes the gradient calculations in this case study). The results for the fourth experiments give no new insights and are omitted here. 4.2.2. Parallel Design. Instead of going on sequentially, the 4th-10th experiments are designed in parallel, leading to Nd ) 14 search directions. For an experimental design based on real performed experiments, which lead to changes in the parameter values, a parallel design is not recommendable. Because of the change in the parameter values resulting from the model fitting to the recorded data, the recently performed experiment is not optimal anymore. For one experiment, this effect cannot be avoided; however, for a designed set of experiments, the advantages and the disadvantages must be considered. As a rule of thumb, it can be said that if the expected change in the parameter values is small, then a parallel design procedure is preferable. If the issue is to collect data for a relatively unknown process with few data available, a sequential approach should be used until the uncertainty of the parameter values becomes small. In the following test case, the parameter values remain constant, such that the parallel design can be compared with the sequential design approach. Mainly, it is used here to increase the problem complexity and the number of local minima. Thus, a problem with 14 setpoint variables to be optimized may exhibit behavior similar to that for a sequential design. The probability to find the global optimum with a single LO call is considerably reduced to a value of 1.8%. However, because of the increasing size of the random seed (see Table 5), the probability to find the global optimum for the random seed is practically 100%. Even after a reduction of the random seed size from 784 to 196 particles (i.e., bisecting the problem dimension), the probability for finding the global optimum is still high (>97%). Thus, for the determination of the particle number and the iteration number, the bisected dimension number

PSOa hybrid framework Alternative A Alternative B Alternative B, tunedb reference approachc

GO hits

NNPSOL



tcal (s)

39/100 99/100 14/100 16/100 76/100 98/100

0 175 166 139 132 310

375000 34300 15792 13467 52436 60760

5254 875 261 130 532 1350

a Np ) 25, Niter ) 15000. Niter ) 1.

b

Np ) 112, Niter ) 467.

c

Np ) 196,

Nd/2 ) 7 is used for the hybrid framework and Alternatives A and B, according to Table 5 and eq 20. The PSO seed is also increased to 25 particle and 15000 iterations, to maintain the relation of the number of function calls for the different approaches. Because of the significantly higher calculation times, only 100 runs were performed for comparison. The results are given in Table 8. The pure GO approach represented by the PSO is not longer able to give satisfactory results. Even with a calculation time that is ∼5 times larger than that for the other algorithms, the global optimum was found only in ∼39% of the runs. But also, the alternatives of the original hybrid framework (A and B) experience this adverse GO behavior. The high dimensionality seems to be a problem for the PSO. Because of the high number of the LO function evaluations required for the original hybrid framework, the alternative approaches are a lot faster. Thus, the particle and iteration numbers can be doubled (obtaining the values as given in Table 5) by still maintaining a comparable computation time. Here, Alternative A is finally discarded, because Alternative B is faster and has a similar chance for finding the global optimum. The result is given in Table 8 under the row “Alternative B, tuned”. Note that the PSO function evaluations take less time than NPSOL function evaluations. Taking the values from Table 8, the mean computation time for PSO evaluations is 0.014 s and for the NPSOL evaluations 0.022 s. This can be explained by the more complex overlaying algorithm used by NPSOL. The original hybrid framework and the reference approach are similar in its probability for finding the global optimum. However, because of the beginning convergence of the overlaying GO, in the case of the hybrid framework, the number of function evaluations required for LO is significantly smaller, leading to an ∼35% reduction in computation time. 5. Tuning Options The main advantage of the hybrid approaches over the random seed is the possibility of tuning the algorithms and, thus, adapting

10070

Ind. Eng. Chem. Res., Vol. 49, No. 20, 2010

Figure 18. Number of LO function evaluations using a variable tolerance.

Figure 19. Number of LO function evaluations using a stepwise change in tolerance.

them to a specific problem. Some options were already introduced in the case studies, e.g., playing with particle and iteration numbers. Although the PSO parameters can be manipulated, it is thought to be difficult to provide general advice here. An option that can lead to a considerable acceleration of the hybrid framework is the use of a variable convergence tolerance for the LO steps. At the beginning of the iteration loop, no accurate information about position and function value of the local optima is required, because the particles will move away from them anyhow. Only after the last GO step is it important to terminate the LO with the desired accuracy. In this work, a constant convergence tolerance was used with a relative error of εrel ) 10-8 for the objective function and the optimization variables. Reducing this tolerance will reduce the number of LO iterations, which is especially beneficial when gradients have to be calculated numerically. A decadal decreasing tolerance can be applied as denoted in eq 25. log10[εrel(k)] )

k {log10[εrel(Niter)] - log10[εrel(0)]} + Niter log10[εrel(0)] (25)

Applying this tuning option (εrel(0) ) 10-3 and εrel(Niter) ) 10-8) reduces the mean number of LO iterations to NNPSOL ) 105 (-40%) and the total computation time reduces to tcal ) 560 s (-36%). Figure 18 depicts the number of LO function calls for each particle along the GO steps. At the beginning, they are low, because of the high convergence tolerance. It then increases in the middle part and, finally, it becomes low again because the GO begins to converge. The chance to detect the global optimum remains high (at ∼96%). Figure 18 motivates a rather different progression of the LO convergence tolerance; i.e., to avoid the increase in the LO iterations, the convergence tolerance can be held low until the final GO step. Taking εrel(k) ) 10-3, k ) 1, ..., Niter -1 and εrel(Niter) ) 10-8 reduces the number of LO function calls to NNPSOL ) 88, and thus, the total computation time to tcal ) 435 s. In Figure 19, the altered behavior of the required function evaluations is depicted. The benefit regarding the computation time is at the cost of the chance to determine the global optimum. This is then reduced to ∼31%. However, the initial tolerance εrel(0) can be adapted to increase this probability.

Ind. Eng. Chem. Res., Vol. 49, No. 20, 2010

6. Conclusions and Outlook The aspect of local minima in model-based optimal experimental design (OED) tasks must be taken into account. Solving OED problems with only one single initial guess for the optimization variables will probably lead to nonoptimal experiments, and, consequently, time and resources are wasted unnecessarily. All the presented frameworks of global and hybrid optimization are able to overcome this problem, but they differ in efficiency and computational effort. The decision on which framework to chose depends strongly on the problem type, complexity, number of decision variables, and computation time required for the evaluation of the objective function. For very fast evaluable objective functions (tφ < 10-5 s), a global optimization algorithm such as PSO is the best choice. However, the function evaluations for OED problems commonly are considerably slower. Although the local optimizer was penalized with a numerical gradient evaluation, the hybrid algorithms demonstrate a more efficient behavior than the PSO, e.g., in the second case study and, in particular, in the high dimensional design of multiple experiments. The random seed is a good approach and can be implemented easily. However, whenever the hypersurface of the objective function shows a tendency that can be used by the overlaying global optimizer, the hybrid approaches are able to outplay the random seed. This effect is expressly demonstrated in the first case study. Alternative approach B, which only takes the global best particle, is faster and finds the global optimum more often than Alternative approach A. This can be explained with the higher number of global optimization steps and the larger particle seed. A significant time benefit can be gained when the number of local optimization iterations is getting high, e.g., through a numerical gradient calculation, as shown in the second case study. However, the probability for finding the global optimum is lower than that for the hybrid framework. The user must decide whether this drawback can be accepted or if it can be compensated by tuning. The original hybrid framework is even unmodified in line with the best approaches for each case. By modifying it with the proposed tuning options, it can be accelerated further and, thus, it can be identified as the best of the presented approaches. However, it should be remarked that if the objective function gradient can be supplied analytically or by automatic differentiation (AD),37 respectively, the hybrid framework is even more advantageous. The high number of function evaluations that are required for a numerical gradient evaluation (NGE) presents a higher computational load for the hybrid approach than for the other approaches. So it can be concluded that if NGE is not required, the hybrid approach saves CPU time disproportionately high. Numerical gradients calculated with finite differences (FD) are a common, but not good, approach. To evaluate the objective function, the solution of a differential equation is required. This is received by applying integration techniques with an adaptive step size control (ODE solver). Thus, the calculated finite difference is superimposed by the accuracy of the ODE solver. However, FD is the simplest to implement approach. Supplying gradient information for OED problems is a difficult and computationalcost-intensive task, because second-order derivatives of the model equation system are required. For this purpose, derivatives of the sensitivities versus the decision variables are required: ∂/∂u ∂x/∂p. Including them as sensitivity equations in the equation system enlarges it by (Nx × Np × Nu) equations. Another approach is shown by Bauer et al.,38

10071

where the gradients are calculated with a tailored derivative calculation in a sequential manner. A detailed study on how to choose the best gradient calculation approach for a specific problem is still missing.

Acknowledgment The Max-Buchner Forschungsstiftung is acknowledged for the financial support and Dr.-Ing. Thielert from the Uhde GmbH for the appointment of the research project that enabled these studies. The authors also wish to thank the support from the Collaborative Research Center SFB/TR 63 InPROMPT “Integrated Chemical Processes in Liquid Multiphase Systems”, coordinated by the Berlin Institute of Technology and funded by the German Research Foundation.

Appendix A Optimal experimental designs for models of one single parameter must be performed for just one experiment, in the case of no changes in the parameter value. The repetition of this experiment is automatically the best next experiment, because there is no influence of the already-performed experiments on the position of the minimum of the objective function φ. The Fisher matrix for an experiment i can be calculated with eq A1. The parameter covariance is then given by eq A2 for a set of NDP data points, leading to the objective function in eq A3.

FMi )

[( ) ( )] ∂xi T -1 ∂xi Cx ∂p ∂p

(A1)

( )

(A2)

NDP

Cp ) F-1 )

∑F

Mi

-1

i)1

φ ) φ A ) φ D ) φE ) Cp )

1 FM1 + FM2 + ... + FNDP (A3)

Because of the fact that F is a scalar, the covariance is simply its reciprocal value, leading to a summation in the denominator of the objective function. So, it can be concluded that the objective function reaches its minimum when each summand FMi is at its maximum, which is the case for the repeated optimal single experiment. In the field of optimal sensor allocation (see Section 2.1.2, “Optimal Measurement Times”), this fact is known as sensor clusterization.39 If physical sensors are to be positioned, it is obviously not desired that more than one sensor is placed in one position. Ucinski40 offers solutions for such problems.

Appendix B The implemented PSO algorithm follows Scheme B1. It was taken from Schwaab et al.,34 and it has been only changed

10072

Ind. Eng. Chem. Res., Vol. 49, No. 20, 2010

Scheme B1. PSO Algorithma

a

Adapted from ref 34.

in terms of variable nomenclature and some paper-specific points. The variable bounds are implemented with an exponentially growing penalty function of the form shown in eqs B1 and B2. The penalty function is added to the value of the objective function. Nd

φPenalty )

∑φ

Penalty,d

(B1)

d)1

φPenalty,d

{

lbd e xp,d e ubd 0 exp[lbd - xp,d] xp,d < lbd ) exp[xp,d - ubd] xp,d > ubd

(B2)

Literature Cited (1) Kuntsche, S.; Arellano-Garcia, H.; Wozny, G. Web-Based ObjectOriented Modelling Environment for the Simulation of Chemical Processes. Chem. Eng. Trans. 2009, 18, 779–784. (2) Bartholomew, C. H.; Farrauto, R. J. Fundamentals of Industrial Catalytic Processes; John Wiley & Sons: Hoboken, NJ, 2006. (3) Fogler, H. S. Elements of Chemical Reaction Engineering, 4th ed.; Pearson International: New Delhi, India, 2006. (4) Ko¨rkel, S.; Kostina, E.; Bock, H.; Schlo¨der, J. Numerical Methods for Optimal Control Problems in Design of Robust Optimal Experiments for Nonlinear Dynamic Processes. Opt. Methods Software J. 2004, 19, 327–228. (5) Brendel, M. L. Incremental Identification of Complex Reaction Systems, Fortschritt-Berichte Nr. 864; VDI-Verlag: Dusseldorf, Germany, 2006. (6) Arellano-Garcia, H.; Scho¨neberger, J. C.; Ko¨rkel, S. Optimale Versuchsplanung in der chemischen Verfahrenstechnik. Chem. Ing. Tech. 2007, 79, 1625–1638. (7) Galvanin, F.; Macchietto, S.; Bezzo, F. Model-based design of parallel experiments in dynamic systems. Ind. Eng. Chem. Res. 2007, 46, 871–882.

Ind. Eng. Chem. Res., Vol. 49, No. 20, 2010 (8) Benabbas, L.; Asprey, S.; Macchietto, S. Curvature-based methods for designing optimally informative experiments in multiresponse nonlinear dynamic situations. Ind. Eng. Chem. Res. 2005, 44, 7120–7131. (9) Franceschini, G.; Macchietto, S. Model-based design of experiments: State of the art. Chem. Eng. Sci. 2008, 63, 4846–4872. (10) Biegler, L. T.; Grossmann, I. E.; Westerberg, A. W. Systematic Methods of Chemical Process Design, Prentice Hall International Series in the Physical and Chemical Engineering Sciences; Prentice-Hall: Upper Saddle River, NJ, 1997. (11) Kennedy, J.; Eberhart, R. Particle swarm optimization. Proc. IEEE Int. Conf. Neuronal Networks 1995, 4, 1942. (12) Gill, P.; Murray, W.; Saunders, M.; Wright, M. Some issues in implementing a sequential quadratic programming algorithm. ACM Signum Newsl. 1985, 20, 13. (13) Faber, R.; Jockenho¨vel, T.; Tsatsaronis, G. Dynamic optimization with simulated annealing. Comput. Chem. Eng. 2005, 29, 273–290. (14) Li, P.; Lo¨we, K.; Arellano-Garcia, H.; Wozny, G. Integration of simulated annealing to a simulation tool for dynamic optimization of chemical processes. Chem. Eng. Process. 2000, 39, 357–363. (15) Wehrens, L. M. C.; Buydens, R. Evolutionary Optimization: A tutorial. Trends Anal. Chem. 1998, 17, 193–203. (16) Gill, P. E.; Murray, W.; Saunders, M. A. SNOPT: An SQP algorithm for large-scale constrained optimization. SIAM J. Optim. 2002, 12, 979–1006. (17) Bartlett, R. A.; Wa¨chter, A.; Biegler, L. T. Active set vs. interior point methods for nonlinear model predictive control. Proc. Am. Control Conf. 2000, 6, 4229–4233. (18) Murtagh, B. A.; Saunders, M. A. MINOS 5.5 User’s Guide, 1998. (19) Gill, P. E.; Murray, W.; Saunders, M. A. SNOPT: An SQP Algorithm for Large-Scale Constrained Optimization. SIAM ReV. 2005, 47, 99–131. (20) Goodwin, G. C.; Payne, R. L. Dynamic System Identification: Experiment Design and Data Analysis; Academic Press: New York, 1977. (21) Rojas, C. R.; Welsh, J. S.; Goodwin, G. C.; Feuer, A. Robust optimal experiment design for system identification. Automatica 2007, 43, 993–1008. (22) Bard, Y. Nonlinear Parameter Estimation; Academic Press: New York, 1974. (23) Shao, J. Mathematical Statistics, Springer Texts in Statistics; Springer: New York, 1999. (24) Rafajlowicz, E. Optimum choice of moving sensor trajectories for distributed-parameter system identification. Int. J. Control 1986, 43, 1441– 1451. (25) Sung, S. W.; Lee, J.; Lee, I.-B. Process Identification and PID Control; Wiley and John Wiley: Chichester, U.K., 2009. (26) Barz, T.; Arellano-Garcia, H.; Wozny, G. Handling Uncertainty in Model-Based Optimal Experimental Design. Ind. Eng. Chem. Res. 2010, 49, 5702-5713.

10073

(27) Auer, R.; Berger, R. EUROKIN Test Case for Sequential Design of Experiments, 2002 (available via the Internet at [email protected]). (28) Buzzi-Ferraris, G.; Manenti, F. Kinetic models analysis. Chem. Eng. Sci. 2009, 64, 1061–1074. (29) Scho¨neberger, J. C.; Arellano-Garcia, H.; Thielert, H.; Krkel, S.; Wozny, G. Experimental Analysis of a Fixed Bed Reactor for Catalytic SO2 Oxidation. Ind. Eng. Chem. Res. 2009, 48, 5165–5176. (30) Scho¨neberger, J. C.; Arellano-Garcia, H.; Thielert, H.; Wozny, G. Identification of Reaction Mechanisms with a Dynamic PFR Model. Presented at ADCHEM 2009, Istanbul, Turkey, July 12-15, 2009; Paper No. 128. (31) Dette, H.; Melas, V.; Pepelyshev, A. Strigul, Robust and efficient experimental design of experiments for the Monode model. J. Theor. Biol. 2005, 234, 537–550. (32) Bock, H. G.; Ko¨rkel, S.; Kostina, E.; Schlo¨der, J. P. In Robustness Aspects in Parameter Estimation: Optimal Design of Experiments and Optimal Control. In ReactiVe Flows, Diffusion and Transport: From Experiments Via Mathematical Modeling to Numerical Simulation and Optimization; Ja¨ger, W., Rannacher, R., Warnatz, J., Eds.; Springer: Berlin, New York, 2007; pp 117-146. (33) Stein, O.; Oldenburg, J.; Marquardt, W. Continuous reformulations of discrete-continuous optimization problems. Comput. Chem. Eng. 2004, 28, 1951–1966. (34) Schwaab, M.; Biscaia, E. C.; Monteiro, J. L.; Pinto, J. C. Nonlinear parameter estimation through particle swarm optimization. Chem. Eng. Sci. 2008, 63, 1542. (35) Holmstro¨m, K.; Go¨ran, A. O.; Edvall, M. M. User’s Guide for TOMLAB/NPSOL, Technical Report, Tomlab Optimization, Inc., 2007. (36) Dormand, J. R.; Prince, P. J. A family of embedded Runge-Kutta formulas. J. Comput. Appl. Math. 1980, 6, 19–26. (37) Forth, S. A. An Efficient Overloaded Implementation of Forward Mode Automatic Differentiation in MATLAB. ACM Trans. Math. Software 2006, 32, 195–222. (38) Bauer, I.; Bock, H.; Ko¨rkel, S.; Schlo¨der, J. Numerical Methods for Optimum Experimental Design in DAE Systems. J. Comput. Appl. Math. 2000, 120, 195–222. (39) Ucinski, D.; Patan, M. D-optimal design of a monitoring network for parameter estimation of distributed systems. J. Global Opt. 2007, 39, 291–322. (40) Ucinski, D. Optimal Measurement Methods for Distributed Parameter System Identification, Systems and Control Series; CRC Press: Boca Raton, FL, 2005.

ReceiVed for reView October 15, 2009 ReVised manuscript receiVed June 24, 2010 Accepted July 8, 2010 IE9016094