Ind. Eng. Chem. Res. 2010, 49, 913–919. DOI: 10.1021/ie900903u. © 2010 American Chemical Society. Published on Web 12/01/2009.


Optimal Experimental Design for Discriminating Numerous Model Candidates: The AWDC Criterion

Claas Michalik, Maxim Stuckert, and Wolfgang Marquardt*

AVT—Process Systems Engineering, RWTH Aachen University, Turmstrasse 46, D-52064 Aachen, Germany

While model-based optimal experimental design (OED) strategies aiming at maximizing the parameter precision are regularly applied in industry and academia, only little attention has been paid to OED techniques for model discrimination in practical applications. A broader use of these techniques is mainly hindered by two drawbacks: (i) The use of such techniques is desirable in an early model identification phase, where only little knowledge on the process is available. The known methods, however, rely on good estimates of the parameters of all candidate model structures. (ii) The available methods are tailored to few (ideally two) model candidates and do not work well if numerous candidate structures are taken into account. In this work we propose a novel design criterion for model-based OED for model discrimination in the case of multiple model candidates. The resulting OED method is thus well-suited for designing experiments in an early stage of the model identification process to efficiently reduce the number of model candidates, thereby reducing the overall cost for model identification.

* To whom correspondence should be addressed. E-mail: [email protected].

1. Introduction

The idea of choosing the experimental degrees of freedom in a way such that the information gain due to the resulting experiment is maximized dates back to the original work by Fisher in 1935.21 While Fisher's ideas were derived on a sound statistical foundation and targeted only at linear models, model-based optimal experimental design strategies for nonlinear models were first proposed by Box and Lucas.10 The basic idea in their work is to choose the experimental degrees of freedom such that, on the basis of the expected measurements, the uncertainty in the parameter estimates is minimized. Box and Lucas10 proposed to minimize the parameter variance by minimizing the determinant of the covariance matrix. A variety of modifications and extensions of this general idea have been proposed thereafter, for instance to account for dynamic models or parameter uncertainty.29 All methods described above fall into a category often referred to as OED techniques for parameter precision, since they aim at determining the parameters of a fixed model structure most precisely. In the case of multiple model structure candidates, another strategy is advisable. Here, the experiments should allow for identifying the best suited model structure most efficiently. Such OED techniques for model discrimination (OED-MD techniques, for short) were first proposed by Hunter and Reiner.28 The basic idea of Hunter and Reiner28 is to design the experiments such that the model predictions of two competing model structures differ most. The objective function is thus the sum of squared differences in the model predictions (referred to as RSOS throughout this paper), which has to be maximized according to

max_e D(e) = [y_1(Θ_1, e) - y_2(Θ_2, e)]^2    (1)

where e are the experimental conditions, Θ_1 and Θ_2 are the (estimated) parameters, and y_1 and y_2 are the predictions of models 1 and 2, respectively. This concept can directly be extended to the general case of m rival model structures,22 yielding

max_e D(e) = Σ_{i=1}^{m-1} Σ_{j=i+1}^{m} [y_i(Θ_i, e) - y_j(Θ_j, e)]^2    (2)
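As a concrete illustration of eq 2, for scalar model predictions the RSOS objective is just the sum of squared differences over all candidate pairs. A minimal sketch (the function name is ours, not from the paper):

```python
import itertools

def rsos_criterion(predictions):
    """Sum of squared pairwise prediction differences (eq 2).

    'predictions' holds the scalar responses y_i(Theta_i, e) of all
    candidate models at one experimental setting e.
    """
    return sum((yi - yj) ** 2
               for yi, yj in itertools.combinations(predictions, 2))
```

For evenly spread illustrative predictions such as [0, 1, 2, 3] the criterion evaluates to 20, while for a lumped pattern such as [0, 0.1, 2.9, 3.0] it evaluates to roughly 33.7; section 2 discusses why this preference for lumped settings becomes problematic with many candidates.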

Box and Hill8 criticize that eqs 1 and 2 do not account for the measurement and model uncertainty. On the basis of the entropy concept in information theory,36 they derive a new OED-MD criterion that also accounts for the model and measurement uncertainty:

max_e D(e) = Σ_{i=1}^{m-1} Σ_{j=i+1}^{m} p_i p_j [ (σ_i^2 - σ_j^2)^2 / ((σ^2 + σ_i^2)(σ^2 + σ_j^2)) + (y_i - y_j)^2 (1/(σ^2 + σ_i^2) + 1/(σ^2 + σ_j^2)) ]    (3)

In eq 3, σ^2 is the measurement variance, σ_i^2 is the model prediction variance (for model i), and p_i is the prior probability of model candidate i, such that the criterion concentrates on discriminating the most probable model candidates. Reilly33 proposes an OED-MD criterion based on eq 3 that avoids an approximation introduced by Box and Hill.8 Hsiang and Reilly27 propose another OED-MD criterion based on the idea of maximizing the change in entropy. In contrast to previous works, they propose to avoid calculating the model and parameter uncertainties by a local linearization of the model but use discrete parameter probability distributions instead. However, this approach significantly increases the technical effort for OED-MD.12 Atkinson and Fedorov6 propose an OED-MD technique for the case of several rival model structures that aims at maximizing the minimal difference to the best model. However, the method requires that a best model is already identified and concentrates on falsifying the alternative model structures. In addition, the method is restricted to linear models (although an extension to nonlinear models is principally possible) and, more severely, results in a discontinuous objective function, which is problematic in a numerical optimization framework. Buzzi-Ferraris and Forzatti16 note that the discrimination among multiple rival models at the same time may be too complex and propose to discriminate between pairs of models instead. Later they present an improved version of their criterion to


correctly account for the experimental errors.17 This formulation reads as

max_e D_{i,j}(e) = (y_i - y_j)^T V_{i,j}^{-1} (y_i - y_j) + trace(2V V_{i,j}^{-1})    (4)

where D_{i,j}(e) is the criterion to discriminate between models i and j and V is the covariance matrix of the measurements. V_{i,j} is defined as V_{i,j} = 2V + V_i + V_j, where V_i is the covariance matrix of the model predictions for model i. However, following this approach, it is hard to determine which models should directly be compared. In addition, the number of experiments is half the number of model candidates only for the first set of experiments. If numerous, diverse model structures are considered, it seems reasonable that the discrimination can be performed more efficiently. Hence, the design criterion presented by Buzzi-Ferraris and Forzatti17 can be adapted to the case of direct discrimination among m rival models:

max_e D_classical(e) = Σ_{i=1}^{m-1} Σ_{j=i+1}^{m} p_i p_j [ (y_i - y_j)^T V_{i,j}^{-1} (y_i - y_j) + trace(2V V_{i,j}^{-1}) ]    (5)

Asprey and Macchietto4 have extended the general concept to the case of dynamic models. Due to its easy applicability, the simple criterion given in eq 2 is still popular in practice.29 More complex criteria rigorously addressing the statistical nature of the problem are known (see, e.g., refs 27 and 29) but are rarely applied due to the high technical effort associated with their use. In addition, comparisons show that, in the case of OED-MD, the designed experiments and thus the overall experimental effort for model discrimination do not vary much if different design criteria are employed.5,13,20,32 Thus, the predominantly applied criteria are still based on the original idea of maximizing the (potentially weighted) RSOS. As shown below, such criteria are likely to lead to model lumping if numerous model candidates are considered. Model lumping means that some of the calculated model responses are very close to one another, while they are far apart from other ones, resulting in an overall inefficient model discrimination process. Hence, in summary, an easy to use OED-MD criterion for the efficient discrimination of numerous model candidates is still lacking.

The remainder of this paper is structured as follows. In the next section the general problem of established OED-MD techniques in the case of numerous candidate model structures is explained in detail. In section 3, the proposed new design criterion and its statistical background (Akaike's information criterion) are presented. In section 4, a work process based on the new criterion is outlined, before a case study is presented in section 5, and the results are summarized in the last section.

2. OED-MD of Numerous Model Candidates: A Motivating Example

While the methodological literature on model-based OED-MD techniques is rich, only a few works deal with the application of such methods to real life problems (see, for instance, refs 18, 19, 22, 23, 39, and 40).
If model-based OED-MD techniques are applied, the number of competing model candidates is usually relatively low and the simple design criterion as given in eq 2 is used. This is probably due to the technical problems associated with a high number of competing model structures and more sophisticated design criteria. That is also the reason why most of the available algorithms are based on eq 2.29 Hence, in practical applications the simple approach of Hunter and Reiner28 is still popular.

Figure 1. Illustration of two alternative experimental settings for the discrimination of four model candidates.

However, as demonstrated below, this and similar but more complex approaches only work well in the case of few model candidates. Due to this limitation the number of model candidates is usually reduced prior to any experiment on the basis of expert knowledge. Hence, well-suited model structures might be dismissed before the first experiment is conducted. Therefore, we propose a novel OED-MD criterion that is easy to use and implement and allows the efficient discrimination of a high number of candidate model structures. This is achieved by avoiding so-called model lumping, which is the main problem of classical OED-MD approaches for a high number of model candidates. Model lumping and its negative effects are best illustrated on the basis of a simple example. It is assumed that four model candidates are considered, and the aim is to efficiently discriminate these candidates. Figure 1 shows the predicted model responses (y_i, i = 1, ..., 4) and their differences for two alternative experimental conditions, e1 and e2. For experimental setting e1 the model predictions are nicely distributed, such that a single model candidate is likely to turn out to be best suited after one experiment. Considering experimental conditions e2, on the other hand, the predictions of model candidates 1 and 2, as well as those of 3 and 4, are close to each other. Hence, performing this experiment will probably allow discrimination of models 1 and 2 from models 3 and 4. Discriminating model 1 from 2 or 3 from 4, however, is likely to be impossible. In this case, both approaches seem reasonable, since the number of model candidates is still fairly low. If, however, tens or hundreds of model candidates are considered, experimental conditions of type e1 are likely to allow reducing the number of model candidates more efficiently.
Assuming equal prior probabilities of 0.25 for all model candidates, a measurement variance of 0.1, and also a model prediction variance of 0.1 for each model candidate, the classical design criterion in eq 5 yields

Obj_classical(e1) = 0.0625 (3 (1^2/0.4 + 0.5) + 2 (2^2/0.4 + 0.5) + (3^2/0.4 + 0.5)) = 3.3125

for experiment e1 and

Obj_classical(e2) = 0.0625 (2 (0.1^2/0.4 + 0.5) + (2.8^2/0.4 + 0.5) + 2 (2.9^2/0.4 + 0.5) + (3^2/0.4 + 0.5)) = 5.45

for experiment e2. Hence, the RSOS-based design criterion prefers experimental setting e2 over e1, which does not seem reasonable if numerous model candidates are present. Therefore,
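The two objective values above can be reproduced with a few lines of code. The following sketch evaluates eq 5 for scalar predictions (function and argument names are ours; for scalars, V_{i,j} collapses to 2·var_meas + var_model_i + var_model_j):

```python
import itertools

def classical_criterion(preds, priors, var_meas, var_model):
    """Scalar version of the weighted classical criterion (eq 5)."""
    total = 0.0
    for i, j in itertools.combinations(range(len(preds)), 2):
        # V_ij = 2V + V_i + V_j reduces to a scalar here.
        v_ij = 2.0 * var_meas + var_model[i] + var_model[j]
        total += priors[i] * priors[j] * (
            (preds[i] - preds[j]) ** 2 / v_ij + 2.0 * var_meas / v_ij)
    return total
```

With illustrative predictions [0, 1, 2, 3] for e1 and [0, 0.1, 2.9, 3.0] for e2 (consistent with the pairwise differences used above), priors of 0.25, and variances of 0.1, this returns 3.3125 and 5.45, respectively.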


we propose a new design criterion favoring experimental settings of type e1.

3. Akaike Weights Design Criterion

In this section, we present a novel design criterion for the efficient discrimination of numerous model candidates. The design criterion builds upon the so-called Akaike weights, which themselves build upon the Akaike information criterion (AIC for short).1 The AIC is a well-established and statistically sound criterion for model discrimination.29 Before presenting the novel Akaike weights design criterion (AWDC), we briefly introduce the AIC and the Akaike weights and their application to model discrimination. For a detailed derivation of the AIC and its relation to information theory, the interested reader is referred to the book of Burnham and Anderson.14

3.1. Akaike's Information Criterion for Model Selection. The AIC has been introduced by Akaike in a sequence of papers in the 1970s.1-3 In these papers, he identifies a relation between the log-likelihood and the expected Kullback-Leibler (KL) distance.30 Roughly speaking, the KL distance I(f, g) is the information loss if the model g is used to approximate f, which is assumed to represent the reality or truth. In model selection, a model g_best is considered to be the best of a set of candidate models if the information loss I(f, g_best) associated with this model is minimal.14 Unfortunately, the truth f is usually not known and may even not exist. Hence, the KL distance was rather a theoretical construct with little practical value until Akaike1 found the following relation:

log(L(g(Θ̂)|data)) - K = constant - E[I(f, g(Θ̂))]    (6)

Here, g(Θ̂) is the approximating model and Θ̂ are the estimated parameter values of that model. L(g(Θ̂)|data) is the likelihood function of model g with the parameter values Θ̂ in light of the measurement data. K is the number of estimable parameters in g, and E[I(f, g(Θ̂))] is the expected value of the information distance (or loss) associated with the model g(Θ̂). Due to the unknown constant present in eq 6, the log-likelihood value cannot be used to calculate a value for E[I(f, g(Θ̂))]. However, the difference in information loss (against truth/reality) associated with alternative models g can be calculated, since the unknown constant automatically drops out in these cases. Motivated by this, Akaike1 defined an information criterion as

AIC = -2 log(L(g(Θ̂)|data)) + 2K    (7)

Assuming for x_i normally distributed noise with zero mean and variance σ_i^2, i = 1, ..., n, with n being the number of observations, the likelihood for model j, obtained at experimental conditions e, can be calculated as

L(m_j(Θ̂_j, e)|x, σ^2) = Π_{i=1}^{n} (1/(2πσ_i^2))^{1/2} exp(-(x_i - m_j(Θ̂_j, e))^2 / (2σ_i^2))    (8)

where m_j(Θ̂_j, e) is the prediction of model j with parameter estimates Θ̂_j at experimental conditions e, σ_i^2 is the variance of measurement x_i, and x and σ^2 are the vectors of the experimental data and the according variances, respectively. The model with the lowest AIC value is the one with the smallest loss of information and is thus considered the best model in this framework. The AIC automatically applies the Principle of Parsimony9 because the (statistically sound) term 2K penalizes the number of parameters to be estimated. Hence, the AIC can be used to rank candidate model structures in a statistically sound manner, even if the alternative model structures comprise a different number of parameters. In model discrimination, however, a model probability, quantifying the likelihood of a model structure, is desirable. For this purpose the so-called Akaike weights14

w_k(Θ̂, e, x, σ^2) = exp(-0.5 ∆_{k,c}) / Σ_{j=1}^{m} exp(-0.5 ∆_{j,c})    (9)

can be used. In eq 9, w_k refers to the probability of model structure k being the correct one (with 0 ≤ w_k ≤ 1), m is the total number of model candidates considered, Θ̂ is the vector of all parameter estimates of all models, and ∆_{k,c} is the difference in the AIC values of model k and model c, which refers to the best model (defined as the one with the lowest AIC value), according to

∆_{k,c} = AIC_k - AIC_c    (10)

In real life applications the true mechanism is often too complex to be fully accessible, and hence often no correct model exists. The calculation of the Akaike weights, on the other hand, theoretically requires that the correct model structure is within the set of model candidates. However, for practical applications it is sufficient if a good model structure is present in the set of candidates.14 In the remainder we assume that such a good model candidate is always present in the set of candidate structures and refer to the best of the model candidates as the correct model.

3.2. Akaike Weights for Optimal Experimental Design. It seems reasonable to use the same criterion for OED-MD that is used for model discrimination. In fact, Kreutz and Timmer29 just recently mentioned in their review paper the possibility of using an AIC-based criterion for OED-MD in the case of nonnormal measurement noise. Theoretically, the optimal experiment for model discrimination is the one resulting in a probability of 1 for the correct model structure and 0 for all other candidates calculated from eq 9. Hence, w_i should be as close as possible to 1 for the correct model structure and as close as possible to 0 for the remaining structures, after all models have been fitted to the experimental data. The resulting design criterion reads as

max_e D(e) = w_c(Θ̂, e, x, σ^2)    (11)

s.t.: c is the correct model    (12)

where e denotes the experimental degrees of freedom. Unfortunately, this design criterion is of little practical use, because the correct model structure is not known. However, the adaption of the criterion is straightforward. Since the correct model is unknown, different scenarios with varying assumptions on the correct model structure have to be considered. Therefore, each model is assumed to be the correct one in a series of sequentially explored scenarios, and the corresponding Akaike weight based on experimental data that are consistent with this correct model is calculated. Thus, in total, m scenarios are considered, each resulting in an Akaike weight of the model assumed being the correct one in that particular scenario. The resulting criterion is the sum of these Akaike weights according to

max_e D(e) = Σ_{c=1}^{m} w_{c,c}(Θ̂, e, x, σ^2)    (13)

where w_{c,c}(Θ̂, e, x, σ^2) is the Akaike weight of model c under the assumption that model c is the correct one. If a priori knowledge on the model probabilities exists, this can be incorporated,


yielding the general formulation of the proposed novel design criterion,

max_e AWDC(Θ̂, e, x, σ^2) = Σ_{c=1}^{m} p_c w_{c,c}(Θ̂, e, x, σ^2)    (14)

where p_c is the prior probability of model c, with Σ_{c=1}^{m} p_c = 1. Note that w_{c,c}(Θ̂, e, x, σ^2) ranges from 1/m to 1, where a value close to 1/m indicates that the predictions of all models are almost equal and a value close to 1 indicates that the prediction of model c is far apart from the predictions of all other model candidates. Therefore, the value of the AWDC allows for a direct interpretation of the quality of the designed experiment. Assuming equal prior probabilities for all models, an AWDC value close to 1 means that the Akaike weight for each model will be close to 1, if the model is the correct one. Hence, in this case the correct model can be identified easily, no matter which model turns out to be the correct one. In other words, an AWDC value close to 1 means that experimental conditions have been identified under which the prediction of each candidate model is nicely separated from the predictions of the alternative model candidates. An AWDC value close to 1/m, on the other hand, means that the Akaike weight for each model is close to 1/m, and thus no model discrimination can be performed. In other words, all models predict a very similar outcome for the proposed experimental setting. This characteristic of the AWDC is also favorable in the context of global optimization. If deterministic or stochastic global optimization algorithms are applied for the OED task, an upper bound for the objective function can be provided according to

AWDC_opt ≤ 1    (15)

Since OED-MD problems tend to be strongly nonlinear, global optimization is especially attractive in this framework, and the information on the theoretically optimal objective value can help reduce the computational load. Similar to the situation of model-based OED for parameter precision, where various criteria based on the covariance matrix of the parameters have been proposed, alternative objective functions based on the Akaike weights are also possible. One could, for instance, minimize the lowest Akaike weight considering the same m scenarios. Such a criterion would aim at experimental conditions allowing dismissal of a single model structure from the candidate set with high certainty. This could be an efficient alternative in a setting where a model structure is only allowed to be dismissed if its probability is extremely low (model falsification framework). However, if various model candidates are considered, criterion (14) seems to be the most reasonable approach and is therefore considered in the remainder of this paper. Note that it is also straightforward to derive design criteria based on other information criteria, as for instance the Bayesian information criterion.15 In general, the information criterion that is used for model selection should also be the basis for the optimal experimental design criterion. Assuming that measurement x_i is consistent with the prediction m_{c,i} of the model structure assumed being correct, the Akaike weights can be calculated according to (for clarity of presentation, in the following equations only the dependence on e is explicitly mentioned)

w_{c,c}(e) = exp(-0.5 ∆_{c,c}(e)) / Σ_{j=1}^{m} exp(-0.5 ∆_{j,c}(e))
           = 1 / Σ_{j=1}^{m} exp(-0.5 ∆_{j,c}(e))
           = 1 / Σ_{j=1}^{m} [Π_{i=1}^{n} exp(((x_i - m_{c,i}(e))^2 - (x_i - m_{j,i}(e))^2) / (2σ_i^2))] exp(K_c - K_j)
           = 1 / Σ_{j=1}^{m} [Π_{i=1}^{n} exp(-(m_{c,i}(e) - m_{j,i}(e))^2 / (2σ_i^2))] exp(K_c - K_j)    (16)

In eq 16, m_{j,i}(e) is the prediction of model candidate j for the ith measurement value x_i, σ_i^2 is the variance of that measurement value, K_j is the number of estimated parameters in model j, and the index c indicates the model assumed to be the correct one. Two different sources contribute to the variance term σ_i^2 in eq 16. First, the prediction of model c is uncertain due to its uncertain parameter estimates, and second, the measurement itself is uncertain due to experimental errors. Assuming that38 (1) the prediction uncertainty of model c for the ith measurement value is normal with zero mean and variance σ_{model,i,c}^2, that (2) the measurement uncertainty at the prediction of model c for the ith measurement value is normal with zero mean and variance σ_{meas,i,c}^2, and that (3) the model prediction uncertainty and the measurement uncertainty are independent, the variance of measurement i (σ_i^2) can be calculated as

σ_i^2 = σ_{meas,i,c}^2 + σ_{model,i,c}^2    (17)
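To make eqs 9, 14, and 16 concrete, the following sketch evaluates the AWDC for scalar model predictions (function names and the closed-form AIC difference are ours; equal measurement variances and x_i = m_{c,i} are assumed, so AIC_j - AIC_c reduces to (m_c - m_j)^2/σ^2 + 2(K_j - K_c)):

```python
import math

def akaike_weights(aic):
    """Akaike weights (eq 9) from a list of AIC values."""
    best = min(aic)
    rel = [math.exp(-0.5 * (a - best)) for a in aic]
    total = sum(rel)
    return [r / total for r in rel]

def awdc(preds, priors, var, n_params):
    """Akaike weights design criterion (eq 14) for scalar predictions.

    In scenario c the measurement is assumed to equal model c's
    prediction, so the AIC of model j is, up to a common constant,
    (preds[c] - preds[j])**2 / var + 2 * n_params[j]  (cf. eq 16).
    """
    value = 0.0
    for c in range(len(preds)):
        aic = [(preds[c] - p) ** 2 / var + 2 * k
               for p, k in zip(preds, n_params)]
        value += priors[c] * akaike_weights(aic)[c]
    return value
```

With the four illustrative predictions [0, 1, 2, 3] (a setting of type e1), equal priors of 0.25, two parameters per model, and total variance σ^2 = 0.02 (eq 17 with σ_meas = σ_model = 0.1), this yields approximately 1; with the lumped predictions [0, 0.1, 2.9, 3.0] (type e2) it yields approximately 0.562, matching the values reported below for the motivating example.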

Back to the motivating example from section 2: here m = 4, σ_model = 0.1, and σ_meas = 0.1 for both experimental conditions and all four models. Assuming that model 1 is correct and two parameters have to be estimated for all model candidates, the expected AIC values for both experimental settings can be calculated using eqs 7, 8, and 17. The results are summarized in Table 1. On the basis of the expected AIC values, the expected Akaike weights for the model candidates given in Table 2 can be calculated using eq 9. The resulting objective function values according to eq 14 for both experimental settings are

AWDC(e1) = 0.25 (1 + 1 + 1 + 1) = 1    (18)

Table 1. AIC Values for Both Experimental Settings and All Four Model Candidates for Given Correct Model (Experimental Setting)

      1 (e1)   1 (e2)   2 (e1)   2 (e2)   3 (e1)   3 (e2)   4 (e1)   4 (e2)
AIC1  1.93     1.93     51.93    2.43     201.93   422.43   451.93   451.93
AIC2  51.93    2.43     1.93     1.93     51.93    393.93   201.93   422.43
AIC3  201.93   422.43   51.93    393.93   1.93     1.93     51.93    2.43
AIC4  451.93   451.93   201.93   422.43   51.93    2.43     1.93     1.93

Table 2. Akaike Weights for Both Experimental Settings and Different Assumptions Regarding the Correct Model Structure

                        e1                          e2
correct model   w1     w2     w3     w4     w1     w2     w3     w4
1               1      0      0      0      0.562  0.438  0      0
2               0      1      0      0      0.438  0.562  0      0
3               0      0      1      0      0      0      0.562  0.438
4               0      0      0      1      0      0      0.438  0.562

AWDC(e2) = 0.25 (0.562 + 0.562 + 0.562 + 0.562) = 0.562    (19)

and thus the AWDC indeed prefers experimental setting e1 over e2. Table 2 also shows that experimental setting e1 is indeed preferable, no matter which model structure turns out to be the correct one: the Akaike weight of the correct model is higher for experimental setting e1 in every case. This emphasizes that if numerous model candidates are considered, classically designed experiments are not optimal.

4. Proposed Work Process for Model Identification

On the basis of the new design criterion, we propose the general work process for model identification depicted in Figure 2. In a first step, a set of model candidates has to be proposed. This set can be based on a literature database or expert knowledge, or be automatically generated by means of specific software.26 Especially in the latter case, the number of competing models can be on the order of hundreds or even thousands.41 The second step depends on the amount of prior knowledge. If reasonable estimates of all model parameters for all models are available, AWDC optimal experiments can be performed on the basis of eq 14. If, for at least some of the candidates, no parameter estimates are available, some initial experiments, for instance based on a factorial design,21 are necessary to increase process knowledge. The results of the (potentially AWDC optimal) experiment are next used to estimate the model parameters and


calculate the AIC values and Akaike weights. Model candidates that have a probability (Akaike weight) below a certain threshold, as for instance 1%, can be dismissed from the set of candidates. It is important to mention that, although generally the update of the candidate set means dismissing candidates, it is also possible to add new candidates that, in the light of the new data, seem reasonable. The steps of OED-MD (employing eq 14), experiment, parameter estimation, and the update of the candidate set are repeated until (i) one model candidate has a probability above a certain threshold (e.g., 95%) or (ii) the maximum number of experiments is reached. Additionally, the accuracy of the parameter estimates for the final model(s) can be improved using classical OED for parameter precision. Again, the steps of OED, experiment, and parameter estimation are repeated until the parameter accuracy is satisfactory. It is important to mention that mainly two reasons can lead to an overall bad model fit. The first and obvious reason is an incorrect model structure; the second is only locally optimal parameter estimates. Hence, a good model candidate might be rejected due to only locally optimal parameter estimates. This has been illustrated by means of some examples from chemical reaction kinetics by Singer et al.37 The problem of local optima can be avoided using global parameter estimation algorithms. Deterministic global optimization methods can provide a level of assurance that the global optimum will be located in finite time. However, these methods are computationally very demanding, especially in the case of dynamic models,24 and can thus only be applied to very small and simple models.
Stochastic global optimization methods are computationally less demanding but cannot guarantee identifying the global optimum in finite time.25 Thus, at this point in time no optimization method is available that (i) guarantees the identification of the global optimum and (ii) can be applied to problems of reasonable complexity. However, two approaches that significantly increase the chances of obtaining the global optimum while keeping the computational cost at a reasonable level can be applied here. These are (a) hybrid optimization methods combining stochastic and deterministic methods7,34 and (b) a novel method called IGPE that has recently been presented by Michalik et al.31

5. Case Study

To test the novel AWDC, we compare a classical design criterion and the AWDC in an exemplary model discrimination task (the MATLAB files for the optimal experimental design, parameter estimation, and calculation of Akaike weights for this case study are available online, free of charge, at http://www.avt.rwth-aachen.de/AVT/index.php?id=721). As an exemplary system we choose a reaction step within the reaction mechanism for the acetoacetylation of pyrrole. The reaction can be described by the simple mechanism35

A + B →(cat.) C

Brendel et al.11 identify the following 10 candidate mechanisms for this reaction:

Figure 2. Proposed work process.

r1 = k1 (20)
r2 = k2 C_A (21)
r3 = k3 C_B (22)
r4 = k4 C_cat. (23)
r5 = k5 C_A C_B (24)


r6 = k6 C_A C_cat. (25)
r7 = k7 C_B C_cat. (26)
r8 = k8 C_A C_B C_cat. (27)
r9 = k9 C_A^2 C_B (28)
r10 = k10 C_A C_B^2 (29)

In eqs 20-29, C_i, i ∈ {A, B, cat.}, denotes the concentration of species i; r_j, j = 1, ..., 10, is the reaction rate according to candidate model structure j; and k_l, l = 1, ..., 10, is the kinetic constant to be estimated in model candidate l. For model discrimination, initial rates can be measured at different concentrations within the bounds given in Table 3. Model 9 with k9 = 1 L^2/(mol^2 · min) is assumed to be the correct model structure; it is used to generate pseudomeasurements, whose noise is assumed to be normally distributed with zero mean and variance σ^2 = 1 mol/L. To obtain initial estimates for the 10 parameters (k_i, i = 1, ..., 10), one (not optimally planned) experiment is performed using the initial concentrations given in the first line of Table 5 (initial guess). On the basis of these simulated experimental data, the kinetic parameters k_i, i = 1, ..., 10, are estimated along with their standard deviations using a classical least-squares approach. Since all models are convex, a global optimum can be guaranteed by any local solver. The results of the parameter estimation step are presented in Table 4. On the basis of this knowledge, we design an optimal experiment for model discrimination using both the classical design criterion (CDC) given in eq 5 and the novel AWDC given in eq 14. The prior probabilities for all models are set to 1/10. The initial guess used for the optimization and the CDC- and AWDC-optimal experimental settings for the first optimally planned experiment are given in Table 5. Table 6 presents the reaction rates predicted by the model candidates for the experimental conditions given in Table 5.

Table 3. Bounds for the Experimental Degrees of Freedom (Initial Concentrations)

component B

catalyst

0 10

0 10

0 100

lower bound (mol/L) upper bound (mol/L)

Table 4. Results of the First Parameter Estimation parameter

estimate

std dev

k1 (mol/(L · min)) k2 (1/min) k3 (1/min) k4 (1/min) k5 (L/(mol · min)) k6 (L/(mol · min)) k7 (L/(mol · min)) k8 (L2/(mol2 · min)) k9 (L2/(mol2 · min)) k10 (L2/(mol2 · min))

125 25 25 2.5 5 0.5 0.5 0.1 1 1

1 0.2 0.2 0.02 0.04 0.004 0.004 0.0008 0.008 0.008
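Since every candidate rate law in this study is linear in its single kinetic constant, the least-squares estimate mentioned above has a closed form, which is why any local solver finds the global optimum. A minimal sketch (the function name is ours), where each regressor phi is the concentration term of the candidate (e.g., C_A·C_B for model 5):

```python
def fit_rate_constant(regressors, rates):
    """Closed-form least squares for a one-parameter model r = k * phi.

    Minimizing sum_i (rates[i] - k * regressors[i])**2 over k gives
    k = sum(phi * r) / sum(phi**2).
    """
    num = sum(phi * r for phi, r in zip(regressors, rates))
    den = sum(phi * phi for phi in regressors)
    return num / den
```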

Table 7. Model Probabilities during Model Identification Using the AWDC and the Classical Design Criterion

AWDC Criterion
no.  wm1  wm2  wm3    wm4  wm5    wm6  wm7    wm8    wm9     wm10
1    0    0    0      0    0      0    0      0      0.9999  0.0001

Classical Criterion
no.  wm1  wm2  wm3    wm4  wm5    wm6  wm7    wm8    wm9     wm10
1    0    0    0.167  0    0.167  0    0.167  0.167  0.167   0.167
2    0    0    0      0    0.25   0    0      0.25   0.25    0.25
3    0    0    0      0    0      0    0      0.33   0.33    0.33
4    0    0    0      0    0      0    0      0.5    0.5     0
5    0    0    0      0    0      0    0      0      1       0

It can be seen that the classical design criterion leads to model lumping, as discussed in section 2; six of the 10 model candidates predict an identical reaction rate of 0 and two an identical rate of 250. Using the AWDC, all model predictions are nicely distributed and no model lumping occurs. The value of the objective function is approximately 1 for the AWDC and 1037 for the CDC. While the latter value does not allow any interpretation, the AWDC value is close to the theoretically optimal value of 1, indicating a good experiment. Next, we estimate the parameters on the basis of simulated measurements generated using the correct model and then calculate the Akaike weights. We repeatedly perform the steps of (i) optimal experimental design, (ii) parameter estimation, and (iii) calculation of Akaike weights until one model has an Akaike weight of more than 95%. Table 7 shows the Akaike weights after all necessary experiments, applying the novel AWDC and the CDC, respectively. It can be seen that one optimally planned experiment based on the novel AWDC is sufficient to identify the correct model structure (number 9), which has an Akaike weight of 99.99% after this experiment. On the other hand, five optimally planned experiments are necessary if the classical design criterion (eq 5) is applied. This is due to model lumping, which hinders a more efficient model discrimination. After the first optimally planned experiment, four model candidates can be dismissed from the candidate set and the remaining six model candidates share the same Akaike weight of 16.7%. In the following experiments, only one or two model candidates are falsified in each experiment, such that in total five optimally planned experiments are necessary to identify the correct model structure if a classical design criterion is applied. It is obvious that the experiments designed by the classical criterion are by no means optimal, and better experiments could probably be designed in an intuitive manner.
However, in the case of more complex model structures, such an approach is unlikely to lead to good results, making a well-suited design criterion necessary.

6. Results and Discussion

In this paper we show that the classical RSOS-based OED-MD criteria are not well-suited in the case of a high number of concurrent model structure candidates. While the approach works well for only a few (ideally two) model candidates, model lumping is likely to occur if numerous model candidates are considered. In the case of model lumping, the designed experiment will generally not be optimal and thus hinders efficient model discrimination.

Table 5. Results of Optimal Experimental Design

                     initial guess     CDC       AWDC
  CA,0 (mol/L)            5             10        7.1863
  CB,0 (mol/L)            5              0        8.4722
  Ccat.,0 (mol/L)        50            100       44.9776

Table 6. Model Predictions and Standard Deviations for the 10 Model Candidates and Both Experimental Settings (mol/(L · min))

         CDC    stdv.    AWDC     stdv.
  r1     125    1        125      1
  r2     250    2        179.7    1.4373
  r3       0    0        211.8    1.6954
  r4     250    2        112.4    0.8996
  r5       0    0        304.4    2.4354
  r6     500    4        161.6    1.2929
  r7       0    0        190.5    1.5242
  r8       0    0        273.8    2.1907
  r9       0    0        437.5    3.5002
  r10      0    0        515.8    4.1266

Table 7. Model Probabilities during Model Identification Using the AWDC and the Classical Design Criterion
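The model-lumping effect is visible directly in the CDC column of Table 6: candidates with identical predictions contribute zero to any sum-of-squared-differences (RSOS) objective, so the experiment cannot discriminate among them. A small Python illustration using the predicted rates from Table 6; the pair-counting helper is ours, for illustration only:

```python
from itertools import combinations

# Predicted reaction rates r1..r10 for the two designed experiments (Table 6).
cdc_preds = [125.0, 250.0, 0.0, 250.0, 0.0, 500.0, 0.0, 0.0, 0.0, 0.0]
awdc_preds = [125.0, 179.7, 211.8, 112.4, 304.4, 161.6, 190.5, 273.8, 437.5, 515.8]

def indistinguishable_pairs(preds):
    """Model pairs with identical predictions; each such pair contributes
    zero to a sum-of-squared-differences (RSOS) objective."""
    return [(i + 1, j + 1)
            for (i, p), (j, q) in combinations(list(enumerate(preds)), 2)
            if p == q]

lumped_cdc = indistinguishable_pairs(cdc_preds)    # lumped pairs under the CDC
lumped_awdc = indistinguishable_pairs(awdc_preds)  # no lumping under the AWDC
```

Under the CDC design, the six zero-rate candidates form 15 mutually indistinguishable pairs and r2/r4 a 16th, while the AWDC design leaves no two candidates with coinciding predictions.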

Ind. Eng. Chem. Res., Vol. 49, No. 2, 2010

We therefore propose a new, statistically sound design criterion for the discrimination of numerous concurrent model candidates based on Akaike weights. This AWDC allows the determination of experimental settings that lead to equally distributed model predictions. This approach efficiently prevents model lumping and thus generally allows the best model structure to be identified more efficiently when numerous diverse model structures are considered. Such a method is desirable, since it helps to decrease the experimental effort for model identification.

We compare the novel AWDC with a classical RSOS-based approach in a case study based on a literature example. Applying the AWDC, one optimally planned experiment is sufficient to identify the correct model structure from a set of 10 candidates, whereas five optimally planned experiments are necessary with a classical design criterion. Hence, in this case study, the AWDC reduces the total number of experiments from six to only two, which corresponds to a reduction of the experimental effort for model identification by 67%. The novel design criterion leads neither to an increased implementation effort nor to an increase in computational times for the optimal experimental design.

Acknowledgment

The financial support by the Deutsche Forschungsgemeinschaft (DFG) via the Collaborative Research Center 540 (Grant SFB 540) is greatly appreciated.

Literature Cited

(1) Akaike, H. Information theory as an extension of the maximum likelihood principle. In Second International Symposium on Information Theory; Petrov, B. N., Csaki, F., Eds.; Akademiai Kiado: Budapest, Hungary, 1973; pp 267–281.
(2) Akaike, H. A new look at statistical model identification. IEEE Trans. Autom. Control 1974, 19, 716–722.
(3) Akaike, H. Canonical correlation analysis of time series and the use of an information criterion. In System Identification: Advances and Case Studies; Mehra, R. K., Lainiotis, D. G., Eds.; Academic Press: New York, 1976; pp 267–281.
(4) Asprey, S. P.; Macchietto, S. Statistical tools for optimal model building. Comput. Chem. Eng. 2000, 24 (1), 1261–1267.
(5) Atkinson, A. C. Posterior probabilities for choosing a regression model. Biometrika 1978, 65 (1), 39–48.
(6) Atkinson, A. C.; Fedorov, V. V. Optimal design: Experiments for discriminating between several models. Biometrika 1975, 62 (2), 289–303.
(7) Banga, J. R.; Balsa-Canto, E.; Moles, C. G.; Alonso, A. A. Dynamic optimization of bioprocesses: Efficient and robust numerical strategies. J. Biotechnol. 2005, 117 (4), 407–419.
(8) Box, G. E. P.; Hill, W. J. Discrimination among mechanistic models. Technometrics 1967, 9 (1), 57–71.
(9) Box, G. E. P.; Jenkins, G. M. Time Series Analysis: Forecasting and Control; Holden-Day: London, 1970.
(10) Box, G. E. P.; Lucas, H. L. Design of experiments in non-linear situations. Biometrika 1959, 46 (1-2), 77–90.
(11) Brendel, M.; Bonvin, D.; Marquardt, W. Incremental identification of complex reaction kinetics in homogeneous systems. Chem. Eng. Sci. 2006, 61 (16), 5404–5420.
(12) Burke, A. L.; Duever, T. A.; Penlidis, A. Model discrimination via designed experiments: Discriminating between the terminal and the penultimate models on the basis of composition data. Macromolecules 1994, 27, 386–399.
(13) Burke, A. L.; Duever, T. A.; Penlidis, A. Discriminating between the terminal and penultimate models using designed experiments: An overview. Ind. Eng. Chem. Res. 1997, 36, 1016–1035.
(14) Burnham, K. P.; Anderson, D. R. Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach, 2nd ed.; Springer-Verlag: New York, 2002.
(15) Burnham, K. P.; Anderson, D. R. Multimodel inference: Understanding AIC and BIC in model selection. Sociol. Methods Res. 2004, 33, 261–304.


(16) Buzzi-Ferraris, G.; Forzatti, P. Sequential experimental design for discrimination in the case of multiple responses. Chem. Eng. Sci. 1984, 39 (1), 81–85.
(17) Buzzi-Ferraris, G.; Forzatti, P. An improved version of a sequential design criterion for discriminating among rival response models. Chem. Eng. Sci. 1990, 45 (2), 477–481.
(18) Chen, B. H.; Bermingham, S.; Neumann, A. H.; Kramer, H. J. M.; Asprey, S. P. On the design of optimally informative experiments for model discrimination among dynamic crystallization process models. Ind. Eng. Chem. Res. 2004, 43, 4889–4902.
(19) Cooney, M. J.; McDonald, K. Optimal dynamic experiments for bioreactor model discrimination. Appl. Microbiol. Biotechnol. 1995, 43, 826–837.
(20) Hosten, L. H.; Dumez, F. J.; Froment, G. The use of sequential discrimination in the study of 1-butene dehydrogenation. Ind. Eng. Chem. Fundam. 1977, 16, 298–301.
(21) Fisher, R. A. The Design of Experiments; Oliver and Boyd: Edinburgh, U.K., 1935.
(22) Froment, G. Model discrimination and parameter estimation in heterogeneous catalysis. AIChE J. 1975, 1041–1056.
(23) Garcia-Ochoa, F.; Santos, A.; Quincoces, C. Kinetic-model discrimination using sequential design: Catalytic dehydration of cyclohexanol. An. Quim. 1992, 88 (5-6), 573–577.
(24) Grossmann, I. E.; Biegler, L. T. Part II. Future perspective on optimization. Comput. Chem. Eng. 2004, 28 (8), 1193–1218.
(25) Boender, C. G. E.; Romeijn, H. E. Stochastic methods. In Handbook of Global Optimization; Horst, R., Pardalos, P. M., Eds.; Kluwer Academic Publishers: Dordrecht, The Netherlands, 1995; pp 829–869.
(26) Haunschild, M. D.; Freisleben, B.; Takors, R.; Wiechert, W. Investigating the dynamic behavior of biochemical networks using model families. Bioinformatics 2005, 21, 1617–1625.
(27) Hsiang, T.; Reilly, P. M. A practical method for discriminating among mechanistic models. Can. J. Chem. Eng. 1971, 49, 865–871.
(28) Hunter, W. G.; Reiner, A. M. Designs for discriminating between two rival models. Technometrics 1965, 7, 307–323.
(29) Kreutz, C.; Timmer, J. Systems biology: Experimental design. FEBS J. 2009, 276, 923–942.
(30) Kullback, S.; Leibler, R. A. On information and sufficiency. Ann. Math. Stat. 1951, 22, 79–86.
(31) Michalik, C.; Chachuat, B.; Marquardt, W. Incremental global parameter estimation in dynamical systems. Ind. Eng. Chem. Res. 2009, 48, 5489–5497.
(32) Pinto, J. C.; Lobao, M. W.; Monteiro, J. L. Sequential experimental design for parameter estimation: A different approach. Chem. Eng. Sci. 1990, 45 (4), 883–892.
(33) Reilly, P. M. Statistical methods in model discrimination. Can. J. Chem. Eng. 1970, 48, 168–173.
(34) Rodriguez-Fernandez, M.; Mendes, P.; Banga, J. R. A hybrid approach for efficient and robust parameter estimation in biochemical pathways. Biosystems 2006, 83 (2-3), 407–419.
(35) Ruppen, D.; Bonvin, D.; Rippin, D. W. T. Implementation of adaptive optimal operation for a semi-batch reaction system. Comput. Chem. Eng. 1997, 22 (1-2), 185–199.
(36) Shannon, C. E. A mathematical theory of communication. Bell Syst. Tech. J. 1948, 27, 379–423.
(37) Singer, A. B.; Taylor, J. W.; Barton, P. I.; Green, W. H. Global dynamic optimization for parameter estimation in chemical kinetics. J. Phys. Chem. A 2006, 110 (3), 971–976.
(38) Soong, T. T. Fundamentals of Probability and Statistics for Engineers; John Wiley and Sons: Chichester, U.K., 2004.
(39) Spiess, A. C.; Zavrel, M.; Ansorge-Schumacher, M. B.; Janzen, C.; Michalik, C.; Schmidt, T. W.; Schwendt, T.; Büchs, J.; Poprawe, R.; Marquardt, W. Model discrimination for the propionic acid diffusion into hydrogel beads using lifetime confocal laser scanning microscopy. Chem. Eng. Sci. 2008, 63 (13), 3457–3465.
(40) Ternbach, M. B.; Bollman, C.; Wandrey, C.; Takors, R. Application of model discriminating experimental design for modeling and development of a fermentative fed-batch L-valine production process. Biotechnol. Bioeng. 2005, 91 (3), 356–368.
(41) Wahl, S. A.; Haunschild, M. D.; Oldiges, M.; Wiechert, W. Unravelling the regulatory structure of biochemical networks using stimulus response experiments and large-scale model selection. IEE Proc. Syst. Biol. 2006, 153 (4), 275–285.

Received for review June 2, 2009
Revised manuscript received October 6, 2009
Accepted November 11, 2009

IE900903U