Ind. Eng. Chem. Res. 2003, 42, 1379-1390


On the Design of Optimally Informative Dynamic Experiments for Model Discrimination in Multiresponse Nonlinear Situations

Bing H. Chen and Steven P. Asprey*

Centre for Process Systems Engineering, Imperial College London, London SW7 2AZ, United Kingdom

*To whom all correspondence should be addressed. E-mail: [email protected]. Tel.: +44 (20) 7594-6656. Fax: +44 (20) 7594-6606.

We present a new method for determining optimally informative dynamic experiments for the purpose of model discrimination among several rival multiresponse nonlinear structured dynamic models generally described by systems of differential and algebraic equations (DAEs). A robust and efficient algorithm, based on an extension to the dynamic case of the discrimination criterion put forth by Buzzi-Ferraris and Forzatti (Chem. Eng. Sci. 1984, 39, 81), is developed to calculate dynamic input trajectories by reformulating the experiment design problem as an optimal control problem. We show that the new approach, by taking parametric uncertainty into account, can provide significant improvements in the ability to distinguish among a series of rival dynamic models over previous attempts to design dynamic experiments, which were based primarily on parameter point estimates and thus maximized the divergence of the model predictions without regard for uncertainty (Espie, D. M.; Macchietto, S. AIChE J. 1989, 35, 223). We illustrate the experiment design concepts with a relatively simple but pedagogical example, the dynamic modeling of the fermentation of baker's yeast, although the methods are general enough to be applied in other modeling exercises.

1. Introduction

Dynamic mechanistic modeling has become a common approach in most engineering disciplines, used to gain a better understanding of the various phenomena occurring within processing systems. In general, this approach yields nonlinear parametric models that give a very accurate mathematical description of the system, much more so than high-dimensional linear models obtained using classical identification algorithms. Through a deeper understanding of the underlying phenomena, these detailed mathematical models are being used to improve product, process, and plant performance through such applications as model-based process design, control, and optimization. Thus, building models of processing systems is a very important activity. In any case, performing just one experimental run to produce data can be costly, in terms of both time and money. As a consequence, there is a distinct need for systematic methods of developing models so as to minimize any incurred loss and to maximize the quality of the information obtained from any one experiment.

When building process models, the situation often arises that more than one model can be proposed to describe the system under study. Notable examples include, but are not limited to, (i) catalytic kinetics studies in which several mechanisms can be proposed and, hence, a different model formulated for each mechanism;1,33 (ii) crystallization kinetics studies, where models consist of population balances coupled with kinetic models that include either power-law or first-principles models involving crystal surface energy;34 (iii) investigations of mammalian cell culture systems in which the kinetics of the metabolic

network are modeled;36 and (iv) polymerization reaction modeling.18 The model discrimination problem also naturally arises in attempts to improve upon an existing process model, perhaps for the purpose of increasing model fidelity; in this case, discrimination experiments can be performed to ensure that the added model improvement provides merit. In any one of these situations, it is natural to want to know which model is the "best" of the candidates proposed. On the basis of the philosophy that mathematical models can guide experimental work by providing a framework for synthesis and analysis, statistical model discrimination methods have been designed specifically for this type of problem1-10 and are the sole focus of this communication. Here, we present a new criterion based on an extension to the works of Buzzi-Ferraris and Forzatti4,5 to discriminate among several dynamic models generally described by sets of differential and algebraic equations (DAEs). This new criterion is embedded within a dynamic optimization framework, allowing one to design time-varying (or dynamic) inputs to the process.

The consideration of model discrimination dates as far back as the original work of Hunter and Reiner1 in 1965 and has been an active area of research since its inception. In their work, Hunter and Reiner1 introduced the simple notion of using the expected value of the divergence between the point predictions of the candidate models to design further experiments. Naturally, in maximizing this divergence, one hopes to collect experimental data that will provide evidence in favor of one model over the others. Of course, when process models are nonlinear in the parameters, an iterative or sequential process consisting of several key steps such as experiment design, parameter estimation, and model adequacy hypothesis testing is required. Over the years since 1965, two different approaches to generate the



Figure 1. Model discrimination algorithm.

discrimination criterion have evolved. The first approach uses Bayesian inference, which assigns to each model a probability of being correct. In the Bayesian analysis, prior probabilities are initially assigned to reflect our belief in the correctness of the competing models. Model discrimination experiments are then designed using the predicted posterior probability calculated using Bayes' theorem, the experiments are carried out, and the prior probabilities for the rival models are updated in light of the newly observed data. This process is repeated until the probability of one model becomes sufficiently large that it can be deemed the best. The second approach is frequentist: experiments are designed using model predictions and their associated confidence intervals (assuming all models are correct with equal probability), the experiments are carried out, and the resulting model regressions to the data are checked for significant lack of fit (LOF) through the use of a null hypothesis. This process is repeated until only one model passes the lack-of-fit test. This process is illustrated in Figure 1 and is the approach used in this work.

We feel it necessary here to point out that the model discrimination issue is only one component of an overall framework necessary to fully scrutinize process models. Asprey and Macchietto11 presented an overall model-building strategy comprising three distinct stages: (i) The first stage involves tests for global parametric identifiability35 and distinguishability before any data are collected. These tests are used to determine, mathematically, whether the parameters in the mathematical formulation of the models can be uniquely identified and whether the model structures can be distinguished from one another. (ii) In the second stage, experiments for model discrimination are designed, with the final purpose of selecting the best model from the given set. (iii) Finally, the third stage involves the design of

experiments for improving the precision of the parameter estimates within a single process model. Note that the identification issue surfaces only in this last stage. To address the identification problem, methods were presented19,26 in the late 1980s and early 1990s for general mechanistic models consisting of DAEs; these works were based on the so-called "alphabetic" design criteria.28 Other previous developments include optimal input signal selection to identify dynamic models in the form of linear discrete-time models29,30,32 based on a frequency-domain approach and the spectral information content of the input signals. An excellent overview of these methods and their relation to the alphabetic criteria can be found in Shirt et al.31 More recent work has begun on the application of alphabetic criteria to systems described by general DAEs by Körkel et al.,27 as well as Asprey and Macchietto.24 However, it is not our intention to cover such methods in this article; rather, we focus solely on the model discrimination problem, while keeping in mind that all of these issues must be addressed in an integrated fashion when a process model is being built. Thus, strictly speaking, the experiments designed for model discrimination are not intended for model identification; rather, their purpose is exclusively discrimination. (For a discussion of an attempt to design experiments for simultaneous discrimination and identification, the reader is referred to Hill et al.37) In this respect, in this work, we are not concerned with the spectral information contained within the time-varying input sequences for identification purposes.

In this paper, we present a brief overview of each of the methods used in the individual steps of the algorithm presented in Figure 1. For completeness, we include parameter estimation techniques based on the maximum-likelihood estimator for multiresponse nonlinear models, as well as model adequacy testing.


Subsequently, we cover statistical model discrimination methods, including those developed for the design of dynamic experiments. We then introduce a new criterion to efficiently determine the best dynamic model from a set of candidate models, on the basis of a dynamic extension of the Buzzi-Ferraris and Forzatti5 divergence measure that accounts for uncertainty in model predictions in the design of discrimination experiments. We demonstrate the efficacy of the newly proposed methods to discriminate among four rival models of a fed-batch fermentation process for baker’s yeast, a test case selected because of its ubiquitous appearance in previous publications that develop dynamic model discrimination experiment design techniques.7,21

2. Process Models: Mathematical Representation

When building models whose use is to explain an observed phenomenon, one uses a priori knowledge such as physical, chemical, or biological conservation laws to propose (conceivably several) possible models. In what follows, we therefore consider the general deterministic model form of a set of (possibly mixed) differential and algebraic equations (DAEs)

f(\dot{x}(t), x(t), u(t), w, \theta, t) = 0
\hat{y}(t) = g[x(t)]    (1)

where x(t) is an n_s-dimensional vector of time-dependent state variables, u(t) is an n_c-dimensional vector of time-varying controls or inputs to the process, w is an n_i-dimensional vector of constant controls, \theta is a P-dimensional vector of model parameters to be determined, and \hat{y}(t) is an M-dimensional vector of predicted response variables that are a function of the state variables, x(t). In most cases, g[x(t)] will simply be a "selector" matrix, selecting those state variables that are in fact measured.

3. Parameter and Variance Estimation

Models such as eq 1 are deterministic, in that they have no error associated with them. However, in the real world, we know that there certainly is variability that comes in many forms: measurement error, human error, random error, unexpected disturbances, etc. Although specific reduced models exist for the case of errors in both the measured and independent variables, we focus on the case where there are errors in the measured response variables only. Here, we use the maximum-likelihood framework for parameter estimation, in which we restrict ourselves to the case of the normal probability density function (pdf) for the variability of the errors. The restriction to the normal pdf is less restrictive than one might think, as many measurements in nature exhibit a normal density. As the number of measurements increases asymptotically to infinity (the large-sample-size property), the central limit theorem provides us with the property that the pdf for the sample mean tends to a normal density, even if the small-sample-size density is not normal. In addition, often, a simple transformation of a function will yield a normal density. We therefore write the model as

y_n = \hat{y}(x_n, \theta) + \epsilon_n,    n = 1, \ldots, N    (2)

where \epsilon_n is an M-dimensional vector of errors that has a pdf conditional on a set of error distribution parameters, \psi, and \hat{y} is the vector of predictions of y obtained from the solution of model eq 1, with

E[\epsilon_{nm}] = 0    (3a)

E[\epsilon_{nm}\epsilon_{ri}] = \{\Sigma\}_{mi} \text{ if } n = r; \quad 0 \text{ if } n \neq r    (3b)

where E[\cdot] denotes expectation and \Sigma is the M \times M covariance matrix of the responses,

\Sigma = \begin{bmatrix} \mathrm{var}(y_1) & \mathrm{cov}(y_1,y_2) & \cdots & \mathrm{cov}(y_1,y_M) \\ \mathrm{cov}(y_2,y_1) & \mathrm{var}(y_2) & \cdots & \vdots \\ \vdots & \vdots & \ddots & \vdots \\ \mathrm{cov}(y_M,y_1) & \cdots & \cdots & \mathrm{var}(y_M) \end{bmatrix}

Assuming that these error terms are uncorrelated from observation to observation but correlated from response variable to response variable when measured at any one instant in time, the conditional likelihood function will be of the standard form12

L(\theta, \Sigma_1, \ldots, \Sigma_N) = (2\pi)^{-NM/2} \prod_{i=1}^{N} |\Sigma_i|^{-1/2} \exp\left\{ -\frac{1}{2} \sum_{i=1}^{N} (y_i - \hat{y}_i)^T \Sigma_i^{-1} (y_i - \hat{y}_i) \right\}    (4)

where \Sigma_i represents the variance-covariance matrix of the error in the ith experiment, which is not necessarily constant, and \hat{y}_i is \hat{y}(x_i, \theta) from eq 2. There are two approaches to the estimation of both the parameters and the variance: (i) The first approach uses a simple preliminary guess of the variance-covariance matrices, \Sigma_i, and maximizes the likelihood function, eq 4, over the values of \theta \in \Theta to provide maximum-likelihood parameter estimates, denoted \hat{\theta}. The variance-covariance of the residuals can then be used to estimate the (r,s)th element of \Sigma via

\tilde{\sigma}_{rs} = \frac{1}{N-1} \sum_{i=1}^{N} [y_{ri} - \hat{y}_{ri}(\hat{\theta})][y_{si} - \hat{y}_{si}(\hat{\theta})]

(ii) The second approach uses a functional approximation to the variance such as \hat{\sigma}_{i,j}^2 = \omega_j^2 (\hat{y}_{i,j}^2)^{\gamma_j/2}, where \omega_j is a proportionality factor, \hat{y}_{i,j} is the model prediction at the ith experimental condition of the jth response, and \gamma_j is an adjustable parameter, often called the "heteroscedasticity parameter". From this result, one can maximize the likelihood function, eq 4, over \theta \in \Theta, \omega \in \Omega, and \gamma \in \Gamma to obtain simultaneous estimates of the parameters and variance.

Either of these methods can be augmented with those of Seber and Wild13 for parameter estimation in the case of autocorrelated errors and, in particular, should be used when the measurements are taken rapidly in succession. As the time between measurements increases, arguably, the autocorrelation between the errors in these measurements decreases rapidly.13
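To make approach (i) concrete, the following is a minimal Python sketch of the alternating scheme, assuming a hypothetical `simulate(theta)` routine that integrates the DAE model of eq 1 and returns the predicted responses at the N sampling points; it is an illustration under those assumptions, not the implementation used in this work, and it takes a single covariance matrix common to all observations for simplicity.

```python
import numpy as np
from scipy.optimize import minimize

def neg_log_likelihood(theta, y_obs, Sigma, simulate):
    """Negative log of eq 4 for a single covariance matrix Sigma (constants dropped)."""
    resid = y_obs - simulate(theta)                       # (N, M) residuals
    Sigma_inv = np.linalg.inv(Sigma)
    # sum over i of (y_i - yhat_i)^T Sigma^{-1} (y_i - yhat_i)
    quad = np.einsum('im,mk,ik->', resid, Sigma_inv, resid)
    _, logdet = np.linalg.slogdet(Sigma)
    return 0.5 * (resid.shape[0] * logdet + quad)

def estimate(theta0, y_obs, simulate, n_outer=5):
    """Alternate theta-estimation at fixed Sigma with re-estimation of Sigma."""
    Sigma = np.eye(y_obs.shape[1])                        # preliminary guess of Sigma
    for _ in range(n_outer):
        res = minimize(neg_log_likelihood, theta0,
                       args=(y_obs, Sigma, simulate), method='Nelder-Mead')
        theta0 = res.x
        resid = y_obs - simulate(theta0)
        Sigma = resid.T @ resid / (resid.shape[0] - 1)    # sample covariance, divisor N - 1
    return theta0, Sigma
```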


4. Assessing Model Adequacy: Postexperiment Testing

Model adequacy pertains to how well the fitted model describes the observed experimental data. Adequacy is tested using hypothesis testing and is assessed after an experiment is designed, the data are collected, and the model parameters and response variable variance-covariance are estimated using the methods outlined in section 3. There are several approaches to hypothesis testing for model adequacy, and all are well documented in the literature.13 The sum of squares of residuals of any parameter estimation scenario consists of a contribution due to the measurement noise, or pure error variance, and a contribution due to the LOF of the model. Here, we briefly review LOF measures for multiresponse models in the cases where the pure error variance is either known or not known (and thus estimated from data). It is prudent to point out that, whether we know or must estimate the pure error variance-covariance of the process, the method used for LOF testing has no bearing on the experiment design itself, nor on the methods for its design.

4.1. Case of Unknown Pure Error Variance-Covariance. When the pure error variance of the measurements, \Sigma, is not known a priori (in general, this will be the case), repeat tests are required to permit a sample estimate M_e of the experimental error contribution to the moment matrix of the residuals, M(\hat{\theta}_j), for the jth model. As pointed out by Stewart et al.,8 the LOF can be tested via

\Lambda_j = (N - P_j) \ln\left|\frac{M(\hat{\theta}_j)}{N - P_j}\right| - (N - P_j - r_e) \ln\left|\frac{M(\hat{\theta}_j) - M_e}{N - P_j - r_e}\right| - r_e \ln\left|\frac{M_e}{r_e}\right|    (5)

where r_e is the number of repeat measurements used for estimation of M_e, N is the number of observations of the response variables, P_j is the number of parameters in the vector \theta_j for the jth model, M(\hat{\theta}_j) is the moment matrix of the residuals of the jth model evaluated at the best estimate of \theta, and M_e is the experimental error moment matrix. As pointed out by Stewart et al.,8 the \Lambda value calculated using eq 5 can be tested by the sampling probability Pr(\Lambda > \Lambda_j) of \Lambda values exceeding \Lambda_j by using eq 26 of Box;14 a small value of this probability casts doubt on the hypothesis and thus on the model itself as a good candidate for describing the process for which data were collected.

4.2. Case of Known Pure Error Variance-Covariance. If the pure error variance (\Sigma) is known exactly (perhaps only in the case of simulated data), we can estimate the LOF of the jth model using a \chi^2 value given by

\chi_j^2 = S(\hat{\theta}_j)    (6)

where

S(\hat{\theta}_j) = \sum_{i=1}^{N} \sum_{k=1}^{M} \sum_{l=1}^{M} [y_{ki} - \hat{y}_{ki}(\hat{\theta}_j)] (\sigma_{kl})^{-1} [y_{li} - \hat{y}_{li}(\hat{\theta}_j)]    (7)

with (\sigma_{kl})^{-1} defined as the (k,l)th element of \Sigma^{-1}. This \chi^2 value can be tested by reference to a \chi^2 distribution with (NM - P_j) degrees of freedom, with smaller \chi^2 values indicating an adequate fit of the model to the observed data. In cases where data are simulated and noise is added (for the purposes of testing experiment design and parameter estimation methods in an experiment-free manner), this LOF test is preferred.
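As an illustration, the statistic of eqs 6 and 7 can be computed directly from the residuals when \Sigma is known. The sketch below assumes `y_obs` and `y_pred` are N × M arrays of observations and model predictions and uses `scipy.stats.chi2` only to look up the reference value; it is a minimal rendering of the test, not the authors' code.

```python
import numpy as np
from scipy.stats import chi2

def lof_chi2(y_obs, y_pred, Sigma, n_params, alpha=0.05):
    """Chi-square lack-of-fit statistic (eqs 6-7) and its reference value."""
    resid = y_obs - y_pred
    Sigma_inv = np.linalg.inv(Sigma)
    # S(theta_hat) = sum_i sum_k sum_l resid[i,k] * Sigma_inv[k,l] * resid[i,l]
    stat = np.einsum('ik,kl,il->', resid, Sigma_inv, resid)
    dof = y_obs.size - n_params                  # N*M - P_j degrees of freedom
    return stat, chi2.ppf(1.0 - alpha, dof)      # adequate fit if stat < reference
```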

5. Design of Experiments for Model Discrimination

5.1. Design of Conventional Discrimination Experiments. The problem of model discrimination occurs when N_M rival models have been proposed to describe a system and it is not certain which of the N_M models is best. For any experiment, it is expected that the best model will provide the most accurate prediction of the observed measurements.

Bayesian Methods. Advocated by such workers as Box and Hill,2 Box and Henson,15,16 and Stewart et al.,8 in Bayesian analysis, a prior probability is initially assigned to each model to reflect our belief in the correctness of the competing models. The initial prior probability of the ith model is represented by p(M_i, 0), and the total probability is restricted such that \sum_{i=1}^{N_M} p(M_i, 0) = 1. The prior probabilities can be set on the basis of existing knowledge about which of the models seems most reasonable, or they can be set to equal values if there is little to no existing knowledge. Model discrimination experiments are then designed and carried out on the basis of a predicted a posteriori distribution. After the nth experiment is performed, the prior probabilities are updated according to the observations y_n using Bayes' theorem

p(M_i, n) = p(M_i, n-1) \cdot L(M_i | y_n)    (8)

where L(M_i | y_n) is the likelihood of model i given the observations y_n. The prior probabilities, once updated, are referred to as posterior probabilities, but it is simpler to use the subscripts n - 1 and n to identify the model probabilities before and after the nth experiment. This process of picking a new experimental point, performing the experiment, and updating the model probabilities is continued until the probability of one model, for example, model j, becomes sufficiently large that model j is judged to be the best.
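The update of eq 8 is a one-line computation. The sketch below (with purely hypothetical likelihood values) renormalizes the products so that the model probabilities continue to sum to 1, consistent with the constraint on the priors; the likelihoods themselves would come from evaluating eq 4 for each model.

```python
import numpy as np

def update_model_probabilities(priors, likelihoods):
    # eq 8: posterior proportional to prior x likelihood; renormalize to sum to 1
    post = np.asarray(priors, dtype=float) * np.asarray(likelihoods, dtype=float)
    return post / post.sum()

# four equally plausible models and hypothetical likelihoods for new data y_n:
p = update_model_probabilities([0.25] * 4, [3.1, 0.4, 0.02, 0.9])
```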

Frequentist Methods. Advocated by such workers as Pritchard and Bacon17 and Buzzi-Ferraris et al.,4-6,10 in the frequentist approach, experiments are designed using model predictions and their associated confidence intervals, e.g., in the case of Buzzi-Ferraris et al.10

T_{ij}(\phi) = [\hat{y}_i(\phi,\theta) - \hat{y}_j(\phi,\theta')]^T S_{ij}(\phi,\theta,\theta')^{-1} [\hat{y}_i(\phi,\theta) - \hat{y}_j(\phi,\theta')]    (9)

where S_{ij}(\phi) is the covariance matrix of the vector of deviations of the responses under models i and j (i \neq j). Once the design has been determined, the experiment is carried out, and the resulting model regressions to the data are checked for significant LOF through use of a null hypothesis, as outlined in eq 5. This process is repeated until only one model passes the lack-of-fit test.

A review of all time-invariant experiment design methods for model discrimination can be found in Burke et al.18 As pointed out by Buzzi-Ferraris,10 the Bayesian-based methods and the frequentist methods, although very different on a philosophical level, generate similar results if used with care.

5.2. Design of Dynamic Discrimination Experiments. Relatively little past work has been presented on discriminating between rival dynamic nonlinear models. Espie19 was the first to attempt an extension of the Hunter-Reiner20 criterion to the dynamic case, in which the summation over the data points in the original formulation was replaced by an integral over a prespecified time interval [t_0, t_f]. The objective function for model discrimination was then defined as

\phi_T = \arg\max_{\phi \in \Phi} \sum_{l=1}^{N_M} \sum_{l'=l+1}^{N_M} \int_{t_0}^{t_f} [\hat{y}_l(\phi,\hat{\theta},t) - \hat{y}_{l'}(\phi,\hat{\theta}',t)]^T W [\hat{y}_l(\phi,\hat{\theta},t) - \hat{y}_{l'}(\phi,\hat{\theta}',t)] \, dt    (10)

and embedded in an optimal control framework. Cooney and McDonald21 later proposed a slightly modified version of this objective function for model discrimination

\phi_T = \arg\max_{\phi \in \Phi} \min_{l,l'} \int_{t_0}^{t_f} [\hat{y}_l(\phi,\hat{\theta},t) - \hat{y}_{l'}(\phi,\hat{\theta}',t)]^T W [\hat{y}_l(\phi,\hat{\theta},t) - \hat{y}_{l'}(\phi,\hat{\theta}',t)] \, dt    (11)

where the minimization step occurs over models l = 1, ..., N_M and l' = l + 1, ..., N_M. Thus, they maximize the minimum distance between the model predictions. In either case, in going from a discrete (original formulation) to a continuous (eqs 10 and 11) function, the meaning of the sampling time of the responses is lost in the translation. In our formulation,11 we keep the practical concept of the discrete sampling time by using the Hunter-Reiner formulation as a starting point for the design of dynamic experiments for model discrimination, retaining the notion of sampling points t_k, k = 1, ..., n_sp

\phi_T = \arg\max_{\phi \in \Phi} \sum_{k=1}^{n_{sp}} \sum_{l=1}^{N_M} \sum_{l'=l+1}^{N_M} \{ [\hat{y}_l(\phi,\hat{\theta},t_k) - \hat{y}_{l'}(\phi,\hat{\theta}',t_k)]^T W [\hat{y}_l(\phi,\hat{\theta},t_k) - \hat{y}_{l'}(\phi,\hat{\theta}',t_k)] \}    (12)

where n_sp is the number of discrete sampling points of each of the response measurements, chosen a priori, and the right-hand side of eq 12 is an overall measure of the divergence between the model predictions. We embed this criterion in a dynamic optimization framework, fully described by Asprey and Macchietto.24 To account for uncertainty in the point estimates of the parameters, and thus variance in the predictions of the models, new extensions of eq 12 could be formulated on the basis of the Bayesian posterior density approach used for nonlinear steady-state process models8 or using either an expected-value criterion

\phi_T^{EV} = \arg\max_{\phi \in \Phi} E_{\theta \in \Theta, \theta' \in \Theta'} \sum_{k=1}^{n_{sp}} \sum_{l=1}^{N_M} \sum_{l'=l+1}^{N_M} \{ [\hat{y}_l(\phi,\theta,t_k) - \hat{y}_{l'}(\phi,\theta',t_k)]^T W [\hat{y}_l(\phi,\theta,t_k) - \hat{y}_{l'}(\phi,\theta',t_k)] \}    (13)

or a worst-case criterion

\phi_T^{WC} = \arg\max_{\phi \in \Phi} \min_{\theta \in \Theta, \theta' \in \Theta'} \sum_{k=1}^{n_{sp}} \sum_{l=1}^{N_M} \sum_{l'=l+1}^{N_M} \{ [\hat{y}_l(\phi,\theta,t_k) - \hat{y}_{l'}(\phi,\theta',t_k)]^T W [\hat{y}_l(\phi,\theta,t_k) - \hat{y}_{l'}(\phi,\theta',t_k)] \}    (14)

6. New Formulation for the Design of Dynamic Experiments for Model Discrimination

Buzzi-Ferraris and Forzatti4,5 took an alternative approach to model discrimination in which they proposed a design criterion formed as the ratio of the average squared difference between expected-value predictions to the average variance in the predictions. As such, this criterion is similar to an F-statistic, and its value can be used to judge the ability to discriminate between the models. For instance, if it is possible to discriminate between models under given experimental conditions, the value of the criterion should be greater than M, the number of responses. If the criterion value is less than M under all possible experimental conditions, no discrimination can be gained, and the sequential design for model discrimination can be stopped, or experiments can be designed and carried out to decrease model prediction uncertainty. In this work, we propose the following extension of the Buzzi-Ferraris criterion for the design of dynamic experiments, using the dynamic sensitivity matrix

V_{l,sp} = \begin{bmatrix} \partial\hat{y}_{l,1}(t_{sp_1},\hat{\theta}_l)/\partial\hat{\theta}_{l,1} & \partial\hat{y}_{l,1}(t_{sp_1},\hat{\theta}_l)/\partial\hat{\theta}_{l,2} & \cdots & \partial\hat{y}_{l,1}(t_{sp_1},\hat{\theta}_l)/\partial\hat{\theta}_{l,P_l} \\ \partial\hat{y}_{l,2}(t_{sp_1},\hat{\theta}_l)/\partial\hat{\theta}_{l,1} & \partial\hat{y}_{l,2}(t_{sp_1},\hat{\theta}_l)/\partial\hat{\theta}_{l,2} & \cdots & \partial\hat{y}_{l,2}(t_{sp_1},\hat{\theta}_l)/\partial\hat{\theta}_{l,P_l} \\ \vdots & \vdots & \ddots & \vdots \\ \partial\hat{y}_{l,M}(t_{sp_1},\hat{\theta}_l)/\partial\hat{\theta}_{l,1} & \partial\hat{y}_{l,M}(t_{sp_1},\hat{\theta}_l)/\partial\hat{\theta}_{l,2} & \cdots & \partial\hat{y}_{l,M}(t_{sp_1},\hat{\theta}_l)/\partial\hat{\theta}_{l,P_l} \end{bmatrix}    (15)

where V_{l,sp} is defined as the sensitivity matrix for model l at sampling time t_{sp_1}. The parametric sensitivity information for models comprising a set of differential and algebraic equations can be obtained by solution of the first-order derivatives of the model equations, eq 1, with respect to the model parameters

\frac{\partial \hat{y}}{\partial \theta_p} = \frac{\partial g(x)}{\partial x} \frac{\partial x}{\partial \theta_p}    (16)

f_{\dot{x}} \frac{\partial \dot{x}}{\partial \theta} + f_x \frac{\partial x}{\partial \theta} + \frac{\partial f}{\partial \theta} = 0    (17)

Subsequently, we can define the variance-covariance of the model l predictions as

W_l = V_{l,sp} \Sigma_{\theta,l}^{-1} V_{l,sp}^T    (18)

We then form the criterion for a single discrete sampling time as

T_{l,l',sp}(\phi, \hat{\theta}_l, \hat{\theta}_{l'}) = [\hat{y}_l(x,\theta,t_{sp_k}) - \hat{y}_{l'}(x,\theta,t_{sp_k})]^T S_{l,l'}^{-1} [\hat{y}_l(x,\theta,t_{sp_k}) - \hat{y}_{l'}(x,\theta,t_{sp_k})]    (19)


where S_{l,l'} = S + W_l + W_{l'} and S is the covariance matrix of experimental error (estimated by using the methods outlined in section 3). To form a measure of divergence over time, we sum the criterion of eq 19 over all sampling times

\max_{\phi} \sum_{sp=1}^{n_{sp}} T_{l,l',sp}(\phi, \hat{\theta}_l, \hat{\theta}_{l'})    (20)

As pointed out by Buzzi-Ferraris et al.,5 when working with their multiresponse criterion, we must maximize above M (the number of measured responses), a limiting value that "indicates that the variance of the divergences between the corresponding responses can be explained in terms of experimental error variance (S) plus the variance of the corresponding expected responses (W_l + W_{l'})". With the new dynamic formulation, we extrapolate this observation, requiring maximization of eq 20 above (n_sp \times M). As in the case of the original Buzzi-Ferraris method, "in order to discriminate among more than two rival models, the criterion requires the choice for the new experiment design vector, \phi, that maximizes eq 20 over any pair (l, l') of models (l \neq l')". The procedure is stopped when no setting of \phi maximizes eq 20 above (n_sp \times M). If left with more than one model, Buzzi-Ferraris et al.5 suggest designing for parameter precision in the remaining models and returning to the discrimination designs with the updated parameter values.

For comparison purposes, we can also use the extended Hunter-Reiner formulation for the design of dynamic experiments for model discrimination, as presented in Asprey and Macchietto11

T(\phi) = \sum_{k=1}^{n_{sp}} \sum_{l=1}^{N_M} \sum_{l'=l+1}^{N_M} [\hat{y}_l(x,\theta,t_{sp_k}) - \hat{y}_{l'}(x,\theta,t_{sp_k})]^T [\hat{y}_l(x,\theta,t_{sp_k}) - \hat{y}_{l'}(x,\theta,t_{sp_k})]    (21)

where, again, n_sp is the number of discrete sampling points of each of the response measurements, chosen a priori by the user. We embed this criterion in the dynamic optimization framework for designing dynamic experiments presented in Asprey and Macchietto.24 To represent mathematically the continuous time-varying inputs, u(t), to the process with a finite number of optimization variables, we use the control vector parametrization technique,24,25 with a piecewise-constant parametrization. This ensures that the infinite-dimensional continuous-time trajectories of the process inputs can be represented with a finite number of optimization variables, thus making the solution of such a problem tractable. As such, the vector of design variables \phi is expressed as

\phi = [t_{sp}^T, t_{sw_{1,1}}, \ldots, t_{sw_{n_c, n_{sw}-1}}, z_{1,1}, \ldots, z_{n_c, n_{sw}}, y_0^T]    (22)

where t_{sp}^T is a row vector of discrete response sampling times, t_{sw_{i,j}} corresponds to the jth time at which the ith time-varying input switches levels, z_{i,j} denotes the corresponding input levels, and y_0^T is a row vector of controllable initial conditions. In this formulation, the sampling points are the same for all response variables, and we have a flexible final time for the end of the experiment, defined by the final sampling point, t_{sp_{n_{sp}}}. Constraints are placed on the design variables, including simple bounds and optional nonlinear equality/inequality constraints on the initial conditions, the state variables, and the sampling and switching times

x_{0_i}^L \leq x_{0_i} \leq x_{0_i}^U,    i = 1, ..., n_s    (23)

g(x_0) \geq 0    (24)

\Delta t_{sp_l}^{min} \leq t_{sp_l} - t_{sp_{l-1}} \leq \Delta t_{sp_l}^{max},    l = 1, ..., n_{sp}    (25)

\Delta t_{sw_{i,j}}^{min} \leq t_{sw_{i,j}} - t_{sw_{i,j-1}} \leq \Delta t_{sw_{i,j}}^{max},    i = 1, ..., n_u; j = 1, ..., n_{sw_i}    (26)

where \Delta t_{sp_l}^{min} and \Delta t_{sp_l}^{max} are, respectively, the minimum and maximum allowable intervals between consecutive sampling points. The constraint in eq 26 is introduced to avoid mathematical singularities caused by the collapse of one or more control intervals. Simple bounds are also imposed on the levels of the piecewise-constant controls within each control interval

z_{i,j}^L \leq z_{i,j} \leq z_{i,j}^U,    i = 1, ..., n_u; j = 1, ..., n_{sw_i}    (27)

Finally, the constraint

t_{sw_{i,n_{sw}}} \leq t_{sp_{n_{sp}}},    i = 1, ..., n_u    (28)

indicates that there is no point in effecting any changes to the control variables past the last sampling point. The constraints in eqs 23-27 can be used to ensure that the experiment design is commensurate with the capabilities of the measurement frequency and actuator dynamics of the experimental equipment used (i.e., the design is physically realizable). The resulting nonlinear programming problem is solved for \phi by using the SQP code E04UFF of the NAG library (www.nag.co.uk).

7. Engineering Example: Fed-Batch Fermentation of Baker's Yeast

We illustrate the experiment design concepts with this relatively simple, but pedagogical, example, chosen for its ubiquitous appearance in past works on the same subject. We consider the fermentation stage of an intracellular enzyme process,22 as presented in Espie and Macchietto7 and Cooney and McDonald.21 As in these past studies, we use a typical unstructured (i.e., the internal composition and structure of the cells are not taken into account), unsegregated (i.e., all cells are considered identical) model in which only one key substrate is assumed to be the limiting factor for growth and product formation. The product is an intracellular enzyme. We assume that the fermenter is operated isothermally and that the feed is free from product.

First, assuming Monod-type kinetics for biomass growth and substrate consumption and invoking the conservation laws of mass, the system can be mathematically modeled by the following set of DAEs18,23

Model MI(\theta,u,t):

dy_1/dt = (r - u_1 - \theta_4) y_1
dy_2/dt = -r y_1/\theta_3 + u_1 (u_2 - y_2)
r = \theta_1 y_2 / (\theta_2 + y_2)    (29)

where y_1 is the biomass concentration (g/L), y_2 is the substrate concentration (g/L), u_1 is the dilution factor (h^{-1}), and u_2 is the substrate concentration in the feed (g/L). The important experimental conditions that characterize a particular experiment are (1) the initial biomass concentration (or inoculation), y_{1,0}, with a range from 1 to 10 g/L; (2) the dilution factor, u_1, with a range from 0.05 to 0.20 h^{-1}; and (3) the substrate concentration in the feed, u_2, with a range from 5 to 35 g/L. The initial substrate concentration, y_{2,0}, is always 0.01 g/L and cannot be manipulated for experiment design purposes. Both y_1 and y_2 can be measured during the experiment.

For the purposes of model discrimination, we also consider the following three variants of the model in eq 29. The second model assumes Contois-type kinetics, the third model assumes a linear biomass growth rate, and the final model assumes Monod-type kinetics with constant maintenance energy

Model MII(\theta,u,t):

dy_1/dt = (r - u_1 - \theta_4) y_1
dy_2/dt = -r y_1/\theta_3 + u_1 (u_2 - y_2)
r = \theta_1 y_2 / (\theta_2 y_1 + y_2)    (30a)

Model MIII(\theta,u,t):

dy_1/dt = (r - u_1 - \theta_3) y_1
dy_2/dt = -r y_1/\theta_2 + u_1 (u_2 - y_2)
r = \theta_1 y_2    (30b)

Model MIV(\theta,u,t):

dy_1/dt = (r - u_1) y_1
dy_2/dt = -r y_1/\theta_3 + u_1 (u_2 - y_2)
r = \theta_1 y_2 / (\theta_2 + y_2)    (30c)
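For illustration, model MI of eq 29 can be simulated with piecewise-constant inputs — the parametrization of section 6 — using a standard ODE integrator. In the sketch below, the parameter values are the "true" values listed for the first simulation group in Table 1, while the switching schedule is an illustrative placeholder, not the designed values of Table 7.

```python
import numpy as np
from scipy.integrate import solve_ivp

def piecewise(t, t_switch, levels):
    """Piecewise-constant input: level active at time t given switching times."""
    return levels[int(np.searchsorted(t_switch, t))]

def model_I(t, y, theta, u1_of_t, u2_of_t):
    """Right-hand side of model MI, eq 29."""
    y1, y2 = y
    u1, u2 = u1_of_t(t), u2_of_t(t)
    r = theta[0] * y2 / (theta[1] + y2)            # Monod growth rate
    return [(r - u1 - theta[3]) * y1,              # dy1/dt
            -r * y1 / theta[2] + u1 * (u2 - y2)]   # dy2/dt

theta = [0.25, 0.25, 0.88, 0.09]                   # "true" values, Table 1, group 1
u1 = lambda t: piecewise(t, [10.0, 20.0], [0.05, 0.12, 0.20])  # illustrative schedule
u2 = lambda t: piecewise(t, [15.0], [30.0, 10.0])
sol = solve_ivp(model_I, (0.0, 40.0), [1.0, 0.01], args=(theta, u1, u2),
                t_eval=np.linspace(0.0, 40.0, 21), max_step=0.5)
```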

7.1. Simulating Data for Algorithm Testing Purposes. For parameter estimation and model discrimination purposes, we produced simulated data with model MI(\theta,u,t) and subsequently added normally distributed noise with a mean of zero; the conditions and corresponding covariances of the simulations are listed in Table 1. The first simulation (simulation 1.1) will be used to illustrate the approach. The complete variance-covariance matrix of the response variables in this case was

\Sigma = \begin{bmatrix} 0.05892 & -0.01143 \\ -0.01143 & 0.04085 \end{bmatrix}    (31)
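In a simulation exercise such as this, correlated measurement noise with the covariance of eq 31 can be generated directly; a brief sketch, continuing from the simulation sketch above:

```python
import numpy as np

rng = np.random.default_rng(0)
Sigma = np.array([[0.05892, -0.01143],
                  [-0.01143, 0.04085]])    # covariance of eq 31 (simulation 1.1)
y_clean = sol.y.T                          # (n_sp, 2) responses from the sketch above
y_data = y_clean + rng.multivariate_normal(np.zeros(2), Sigma, size=len(y_clean))
```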

7.2. Model Verification. Using the simulated data from simulation 1.1 (cf. Table 1) and starting estimates of \theta_I = \theta_{II} = [0.50, 0.50, 0.50, 0.50]^T and \theta_{III} = \theta_{IV} = [0.50, 0.50, 0.50]^T (where the subscript denotes the model number), Tables 2-5 show the results obtained by maximizing the likelihood estimator (eq 4) with the quasi-Newton optimization routine DRMNGB (www.netlib.org) under simple bounds on the parameter values. In addition to the parameter point estimates, we report the corresponding linear-approximation parameter confidence limits and correlation matrices. By inspection of the estimated confidence intervals and correlation matrices, all models are relatively well-behaved; models MIII(\theta,u,t) and MIV(\theta,u,t) show significant uncertainty in the estimates of \theta_2. An LOF analysis for each model is shown in Table 6, where we report the \chi^2 value for each of the models. Included in Table 6 are the estimates of the response variable variance-covariance matrices, calculated using the methods outlined in section 3. Note the relatively good agreement of the estimated covariances for models MI(\theta,u,t) and MII(\theta,u,t) with the true covariance in eq 31 used to produce the data, whereas those for models MIII(\theta,u,t) and MIV(\theta,u,t) show poor agreement because of their statistically significant LOF.

Models MIII(\theta,u,t) and MIV(\theta,u,t) show significant LOF; inspection of the overlay plots in Figure 2 indicates that these two models do not predict proper values for response y_1. As such, we can conclude that MIII(\theta,u,t) is not statistically adequate for modeling the dynamic behavior of the bioreactor system. Although the \chi^2 value of MIV(\theta,u,t) is larger than the reference \chi^2 value (and thus the model fails the hypothesis test of adequacy), we will keep MIV(\theta,u,t) for model discrimination in subsequent iterations of the design-of-experiments algorithm for illustrative purposes only.

7.3. Design of Optimal Dynamic Experiments for Model Discrimination. Using models MI(\theta,u,t), MII(\theta,u,t), and MIV(\theta,u,t) with the parameter estimates from Tables 2, 3, and 5, respectively, and the response variable variance-covariance matrices from Table 6, an experiment design was carried out for model discrimination using the new criterion of eq 20. Theoretically, the discrimination approach based on the new criterion can only design an experiment for one pair of models at a time. This implies that the number of designed experiments increases for discrimination among more than two models. In this example, three experiments would be required, for model pairs MI(\theta,u,t) and MII(\theta,u,t), MI(\theta,u,t) and MIV(\theta,u,t), and MII(\theta,u,t) and MIV(\theta,u,t). To avoid this drawback, we design an experiment to distinguish the "most adequate" model pair according to the model LOF results on the existing data sets; as such, the data that arise from the application of the designed experiment are used for a reassessment of all remaining models.


Table 1. Simulated "Experimental" Conditions and Corresponding Variances Used

           initial conditions    "true" model parameters        time-invariant inputs     variance-covariance
sim no.    y1(0)    y2(0)        θ1     θ2     θ3     θ4        u1      u2                σ²(y1)      σ²(y2)      cov(y1,y2)
1.1        1.00     0.01         0.25   0.25   0.88   0.09      0.05    30.0              0.05892     0.04085     -0.01143
1.2        1.00     0.01         0.25   0.25   0.88   0.09      0.05    30.0              0.38927     0.37937     -0.12430
1.3        1.00     0.01         0.25   0.25   0.88   0.09      0.05    30.0              0.44486     0.43737     -0.12648
1.4        1.00     0.01         0.25   0.25   0.88   0.09      0.05    30.0              0.51856     0.49135     -0.33480
1.5        1.00     0.01         0.25   0.25   0.88   0.09      0.05    30.0              0.62892     0.61173      0.11616
2.1        1.00     0.01         0.3    0.2    0.5    0.05      0.02    35.0              0.04533     0.04698     -0.01088
2.2        1.00     0.01         0.3    0.2    0.5    0.05      0.02    35.0              0.07448     0.07801      0.01572
2.3        1.00     0.01         0.3    0.2    0.5    0.05      0.02    35.0              0.10272     0.09892      0.02626
3.1        5.50     0.01         0.3    0.2    0.5    0.05      0.02    35.0              0.09061     0.09231      0.02038
3.2        5.50     0.01         0.3    0.2    0.5    0.05      0.02    35.0              0.16874     0.17033     -0.02010

Table 2. Parameter Estimation Results from Simulation 1.1 Data for MI(θ,u,t)

parameter   estimate    confidence interval    approximate correlation matrix
θ1          0.25490     ±0.00815               1.00000
θ2          0.24133     ±0.09971               0.36970    1.00000
θ3          0.90729     ±0.03649               0.83455   -0.18014    1.00000
θ4          0.09455     ±0.00720               0.80164   -0.25479    0.98014    1.00000

Table 3. Parameter Estimation Results from Simulation 1.1 Data for MII(θ,u,t)

parameter   estimate    confidence interval    approximate correlation matrix
θ1          0.24987     ±0.01173               1.00000
θ2          0.00912     ±0.00899               0.20146    1.00000
θ3          0.94194     ±0.05749               0.98698    0.13705    1.00000
θ4          0.10154     ±0.01102               0.98588    0.04615    0.98041    1.00000

Table 4. Parameter Estimation Results from Simulation 1.1 Data for MIII(θ,u,t)

parameter   estimate    confidence interval    approximate correlation matrix
θ1          0.02636     ±0.00289               1.00000
θ2          0.68145     ±0.17673               0.55285    1.00000
θ3          0.03381     ±0.02849               0.62541    0.98272    1.00000

Table 5. Parameter Estimation Results from Simulation 1.1 Data for MIV(θ,u,t)

parameter   estimate    confidence interval    approximate correlation matrix
θ1          0.15270     ±0.00757               1.00000
θ2          0.40575     ±0.26506               0.91981    1.00000
θ3          0.47529     ±0.03088               0.29134   -0.08809    1.00000

Figure 2. Overlay plot of data and model predictions.

Here, we choose the model pair MI(\theta,u,t) and MII(\theta,u,t), as they showed the most promise during the initial model assessment in section 7.2.

For the design of dynamic experiments, the two time-varying inputs, u_1 (the dilution factor) and u_2 (the feed substrate concentration), were parametrized by five piecewise-constant time intervals, delineated by four switching times, and 20 sampling points were used, i.e., n_sp = 20. To ensure a physically realizable experiment design, constraints were placed on the minimum time between samples (1 h; cf. eq 25), as well as the minimum switching-time intervals for the time-varying inputs (1 h; cf. eq 26). Using the SQP routine E04UFF, the final values for the design vector

\phi = [t_1, ..., t_{n_{sp}}, t_{1,sw_1}, ..., t_{1,sw_4}, t_{2,sw_1}, ..., t_{2,sw_4}, z_{1,1}, ..., z_{1,5}, z_{2,1}, ..., z_{2,5}, y_{1,0}]^T    (32)

and the initial and final objective values are listed in Table 7 (here, e.g., z_{2,1} and t_{2,sw_1} denote the first piecewise-constant level of u_2(t) and the first switching time of u_2(t), respectively), and the resulting piecewise-constant inputs are illustrated in Figure 3.

Figure 3. Piecewise inputs of the optimally designed dynamic experiment.

Table 6. LOF Analysis^a and the Estimated Response Variable Variance-Covariance Matrices

                          MI(θ,u,t)                MII(θ,u,t)               MIII(θ,u,t)              MIV(θ,u,t)
χ2                        43.9853                  55.8409                  2838.169                 804.4432
estimated (co)variance    [ 0.02008  -0.00104;     [ 0.04059  -0.02162;     [ 0.63751  -0.54745;     [ 1.40402  -0.04438;
of the residuals           -0.00104   0.05532]      -0.02162   0.08342]      -0.54745   7.39473]      -0.04438   0.10147]

^a Reference value: χ2 = 60.4809.

Table 7. Optimal Dynamic Experiment Design Results for Model Discrimination

sampling times, t_sp (sp = 1, ..., n_sp):                        20.7, 21.7, 22.7, 23.7, 25.0, 26.0, 27.0, 28.0, 29.0, 30.0, 31.0, 32.0, 33.0, 34.0, 35.0, 36.0, 37.0, 38.0, 39.0, 40.0
dilution rate levels, z_{1,i} (i = 1, ..., 5):                   0.20, 0.17, 0.12, 0.20, 0.18
inlet feed substrate concentration levels, z_{2,i} (i = 1, ..., 5):  35.0, 35.0, 35.0, 35.0, 5.11
switching times for the dilution rate, t_{1,sw_i}:               3.0, 8.7, 33.5, 38.9
switching times for the feed substrate concentration, t_{2,sw_i}:  3.8, 29.8, 34.0, 38.3
initial biomass concentration, y_{1,0}:                          10.0
initial objective value:                                         111
final objective value:                                           258

Figure 4. Model predictions and 95% inference bands based on the optimally designed dynamic experiments.

The model predictions under these designed conditions are shown in Figure 4, along with their predicted uncertainty bands. Several interesting observations can be drawn from the designed experiment presented in Table 7. Placing the sampling points between 20 and 40 h indicates that the predicted divergence between these two models is larger there than at earlier times; this, in turn, results from relatively large changes in the piecewise-constant inputs during this time period (see Figure 3). As pointed out in section 6, the final value of the objective function (cf. eq 20) can be used to judge the ability of the designed experiment to discriminate between the models. In this example, the minimum value of the criterion in eq 20 needed to ensure discrimination is 40 (20 sampling points × 2 response variables). The final objective function value

of 258 presented in Table 7 indicates that, according to the model predictions and their associated uncertainties, the discrimination experiment should yield sufficient discrimination information.

7.4. Model Reassessment. Having identified the experimental conditions that optimally differentiate between models MI(\theta,u,t) and MII(\theta,u,t), we simulated new "data" under these conditions for the sampling points given by the design of experiments shown in Table 7. The data were then subjected to the same white noise with covariance as indicated for simulation 1.1 in Table 1. All models were then reassessed through reestimation of the model parameters and recalculation of the model adequacy tests, in an attempt to reveal the best model as expected. As mentioned previously, MIV(\theta,u,t) was also reassessed using these data sets, even though the designed experiment was intended for the optimal discrimination between models MI(\theta,u,t) and MII(\theta,u,t).

Using the newly simulated data in combination with the original data and starting estimates of \theta_I = [0.25490, 0.24133, 0.90729, 0.09455]^T, \theta_{II} = [0.24987, 0.00912, 0.94194, 0.10154]^T, and \theta_{IV} = [0.15270, 0.40575, 0.47529]^T, the likelihood estimator (eq 4) was again maximized using quasi-Newton optimization. Model adequacy tests from the reassessment are shown in Table 8, along with the estimates of the response variable covariance obtained from the residuals. Inspection of the \chi^2 test for the three models in Table 8 shows that both MII(\theta,u,t) and MIV(\theta,u,t) reveal significant LOF. Inspection of the overlay plots in Figure 5 shows that MII(\theta,u,t) fails to adequately predict y_2, indicating that the newly designed experiment has aided in discriminating against this model. From this analysis, we can conclude that only MI(\theta,u,t) passes all statistical hypothesis tests, and thus it is deemed the best model (as expected).

A similar approach was used for each of the individual simulation cases listed in Table 1. The discrimination results from this exhaustive exercise are shown in Table 9. There, we include a comparison in performance and efficacy between experiments designed using the new method developed here (eq 20) and the approach using the dynamic extension of the Hunter-Reiner criterion11 (eq 21). From these results, it is clear that the new formulation can either reduce the number of experiments required for model discrimination (simulation 1.5) or lead to the correct best model (simulations 2.1 and 3.2) when the difficulty of the discrimination task was increased by increasing the covariance of the added white noise. Thus, the newly proposed criterion for the design of experiments is very efficient. From Table 9, we can also conclude that the determination of the best model among the proposed models is usually accomplished within only a few iterations when dynamic information is used. Difficulties arise, however, in finding globally optimal results as the number of optimization variables for experiment design increases. However, the added computational effort reduces the required amount of experimental effort to statistically determine the best model and thus will save time and money.


Figure 5. Predictions of the three models after re-estimation of the model parameters with the preliminary data set (a and b) and models MI(θ,u,t) and MII(θ,u,t) with the newly obtained data set (c and d). Plots c and d also show the 95% inference bands of the model predictions.

Table 8. LOF Analysis^a and the Estimated Response Variable Variance-Covariance Matrices

                          MI(θ,u,t)                MII(θ,u,t)               MIV(θ,u,t)
χ2                        81.6213                  972.5257                 2206.8595
estimated (co)variance    [ 0.02662   0.00061;     [ 0.23614  -0.36301;     [ 1.95995  -0.02135;
of the residuals            0.00061   0.04199]      -0.36301   0.75699]      -0.02135   0.05035]

^a Reference value: χ2 = 115.3897.

Table 9. Model Discrimination Results Using Simulations 1.1-3.2

            iterations required          optimal objective function values    final model selected
sim no.     new formulation  H-R         new formulation  Hunter-Reiner       new formulation  Hunter-Reiner
1.1         1                1           258              853                 model 1          model 1
1.2         2                2           209              631                 model 1          model 1
1.3         3                3           1190             2992                model 1          model 1
1.4         3                3           695              849                 model 1          model 1
1.5         3                4           1486             1559                model 1          model 1
2.1         2                2           1179             14984               model 1          reject all
2.2         2                2           5549             30158               reject all       reject all
2.3         3                3           902              444                 model 4          model 4
3.1         2                2           166              46                  model 1          model 1
3.2         2                2           1097             14984               model 1          reject all

8. Concluding Remarks

We have presented a new method for determining optimally informative dynamic experiments for the purpose of model discrimination among several rival multiresponse nonlinear structured dynamic models, generally described by systems of differential and algebraic equations. A robust and efficient algorithm based on an extension to the dynamic case of the divergence criterion put forth by Buzzi-Ferraris and Forzatti5 was developed to calculate dynamic input trajectories by reformulation of the experiment design problem as an optimal control problem. We showed that


the new approach could provide significant improvements in the ability to distinguish among a series of rival dynamic models over previous attempts to design dynamic experiments.7 The new methods showed statistically significant improvement in the ability to discriminate among the four dynamic fermenter models in a smaller number of experiments, as well as in choosing the correct model in simulation exercises.

Acknowledgment

The authors are grateful to the EPSRC (through Grant GR/N38688/01) for the support of this work. The authors also thank Toritseju Jakpa for bibliographic assistance.

Nomenclature

E[·] = expected value
f, g = analytic vector fields (n_s-dimensional)
F() = F-statistic
h() = analytic function
L() = likelihood function
M = number of system outputs or response variables
M(θ,u,t) = model structure as a function of θ, u, and t
M_j(θ) = moment matrix of the residuals of the jth model evaluated at the best estimate of θ
M_e = experimental error moment matrix
n_sp = number of sampling points (or times)
n_u = number of time-varying process inputs or controls
N = number of observations of the response variables
N_j = number of observations of the jth response variable
N_M = number of competing models
p() = probability distribution function (pdf)
P = number of parameters
r_e = number of repeat measurements of the response variables
S_{ij}(φ) = covariance matrix of the vector of deviations of the responses under models i and j
S = covariance matrix of experimental error
t = time
t() = Student's t statistic
t_{sp_i} = ith sampling point of the response variables
t_{sw_{i,j}} = jth switching time of the ith time-varying process input
T(φ) = model discrimination criterion
u = n_u-dimensional vector of time-varying process inputs
V_{l,sp} = dynamic sensitivity matrix for model l at sampling time t_sp
w = n_w-dimensional vector of time-invariant process inputs
W = weighting matrix
x(t) = n_s-dimensional vector of state variables
y(t) = M-dimensional vector of system outputs or response variables
ŷ = estimated or predicted response variables
y_0 = M-dimensional vector of initial conditions

Greek Symbols

ε_n = nth vector of errors or residuals (M-dimensional)
θ = P-dimensional vector of model parameters
θ̂ = present best point estimate of the parameter vector θ
μ = mean value
σ̂² = variance estimate
σ_i² = variance of the expected value of the ith output y_i
(σ_{kl})⁻¹ = (k,l)th element of Σ⁻¹
Σ = M × M covariance matrix of the response variables
Σ_θ = P × P covariance matrix of the parameters
φ = vector of experiment design variables
ψ = error distribution parameters (e.g., mean and variance)
χ²() = χ² statistic
Ω = Hessian matrix

Literature Cited

(1) Hunter, W. G.; Reiner, A. M. Designs for Discriminating Between Two Rival Models. Technometrics 1965, 7, 307.
(2) Box, G. E. P.; Hill, W. J. Discrimination Among Mechanistic Models. Technometrics 1967, 9, 57.
(3) Reilly, P. W. Statistical Methods in Model Discrimination. Can. J. Chem. Eng. 1970, 48, 168.
(4) Buzzi-Ferraris, G.; Forzatti, P. A New Sequential Experimental Design Procedure for Discriminating Among Rival Models. Chem. Eng. Sci. 1983, 38, 225.
(5) Buzzi-Ferraris, G.; Forzatti, P. Sequential Experimental Design for Model Discrimination in the Case of Multiple Responses. Chem. Eng. Sci. 1984, 39, 81.
(6) Buzzi-Ferraris, G.; Forzatti, P.; Canu, P. An Improved Version of a Sequential Design Criterion for Discriminating Among Rival Response Models. Chem. Eng. Sci. 1990, 45, 477.
(7) Espie, D. M.; Macchietto, S. The Optimal Design of Dynamic Experiments. AIChE J. 1989, 35, 223.
(8) Stewart, W. E.; Shon, Y.; Box, G. E. P. Discrimination and Goodness of Fit of Multiresponse Mechanistic Models. AIChE J. 1998, 44, 1404.
(9) Singh, S. Model Selection for a Multiresponse System. Ind. Chem. Eng. A 1999, 77, 138.
(10) Buzzi-Ferraris, G. Planning of Experiments and Kinetic Analysis. Catal. Today 1999, 52, 125.
(11) Asprey, S. P.; Macchietto, S. Statistical Tools for Optimal Dynamic Model Building. Comput. Chem. Eng. 2000, 24, 1261.
(12) Bard, Y. Nonlinear Parameter Estimation; Academic Press: London, 1974.
(13) Seber, G. A. F.; Wild, C. J. Nonlinear Regression; Wiley Series in Probability and Mathematical Statistics; John Wiley & Sons: Toronto, Canada, 1989.
(14) Box, G. E. P. A General Distribution Theory for a Class of Likelihood Criteria. Biometrika 1949, 36, 317.
(15) Box, G. E. P.; Henson, T. L. Model Fitting and Discrimination; Department of Statistics Technical Report 211; University of Wisconsin: Madison, WI, 1969.
(16) Box, G. E. P.; Henson, T. L. Some Aspects of Mathematical Modeling in Chemical Engineering. In Proceedings of the Inaugural Conference of the Scientific Computation Centre and the Institute of Statistical Studies and Research; Cairo University Press: Cairo, Egypt, 1970; p 548.
(17) Pritchard, D. J.; Bacon, D. W. Statistical Assessment of Kinetic Models. Chem. Eng. Sci. 1975, 30, 567.
(18) Burke, A. L.; Duever, T. A.; Penlidis, A. Discriminating Between the Terminal and Penultimate Models Using Designed Experiments: An Overview. Ind. Eng. Chem. Res. 1997, 36, 1016.
(19) Espie, D. M. The Use of Nonlinear Parameter Estimation for Dynamic Chemical Reactor Modelling. Ph.D. Thesis, University of London, London, U.K., 1986.
(20) Hunter, W. G.; Reiner, A. M. Designs for Discriminating Between Two Rival Models. Technometrics 1965, 7, 307.
(21) Cooney, M. J.; McDonald, K. A. Optimal Dynamic Experiments for Bioreactor Model Discrimination. Appl. Microbiol. Biotechnol. 1995, 43, 826.
(22) Uesbeck, F.; Samsatli, N. J.; Papageorgiou, L. G.; Shah, N. Robust Optimal Fermentation Operating Policies. Comput. Chem. Eng. 1998, 22, S167.
(23) Nihtila, M.; Virkunnen, J. Practical Identifiability of Growth and Substrate Consumption Models. Biotechnol. Bioeng. 1977, 19, 1831.
(24) Asprey, S. P.; Macchietto, S. Designing Robust Optimal Dynamic Experiments. J. Process Control 2002, 12, 545.
(25) Vassiliadis, V. S.; Sargent, R. W. H.; Pantelides, C. C. Solution of a Class of Multistage Dynamic Optimization Problems. 1. Problems without Path Constraints. Ind. Eng. Chem. Res. 1994, 33, 2111.
(26) Zullo, L. Ph.D. Thesis, University of London, London, U.K., 1991.
(27) Körkel, S.; Bauer, I.; Bock, H. G.; Schlöder, J. P. In Proceedings of the International Workshop on Scientific Computing in Chemical Engineering; Springer: Berlin, 1999; Vol. II, p 338.
(28) Kiefer, J.; Wolfowitz, J. Optimum Designs in Regression Problems. Ann. Math. Stat. 1959, 30, 271.
(29) Ljung, L. System Identification: Theory for the User; Prentice Hall: Englewood Cliffs, NJ, 1987.
(30) Goodwin, G. C.; Payne, R. L. Dynamic System Identification: Experiment Design and Data Analysis; Academic Press: New York, 1977.
(31) Shirt, R. W.; Harris, T. J.; Bacon, D. W. Experimental Design Considerations for Dynamic Systems. Ind. Eng. Chem. Res. 1994, 33, 2656.
(32) Zarrop, M. B. Optimal Experiment Design for Dynamic System Identification; Springer-Verlag: Berlin, 1979.
(33) Wojciechowski, B. W.; Asprey, S. P. Kinetic Studies Using Temperature-Scanning: The Oxidation of Carbon Monoxide. Appl. Catal. A: Gen. 2000, 190, 1.
(34) Bermingham, S. K. Ph.D. Thesis, Delft University of Technology, Delft, The Netherlands, 2002.
(35) Asprey, S. P.; Mantalaris, A. Global Parametric Identifiability of a Dynamic Unstructured Model of Hybridoma Cell Culture. In Proceedings of the 8th IFAC International Conference on Computer Applications in Biotechnology (CAB-8); Elsevier Science: New York, 2001; p 25.
(36) Sanderson, C. S.; Barford, J. P.; Barton, G. W. A Structured, Dynamic Model for Animal Cell Culture Systems. Biochem. Eng. J. 1999, 3, 203.
(37) Hill, W. J.; Hunter, W. G.; Wichern, D. W. A Joint Criterion for the Dual Problem of Model Discrimination and Parameter Estimation. Technometrics 1968, 10, 145.

Received for review April 22, 2002. Revised manuscript received August 16, 2002. Accepted December 4, 2002.

IE0203025