
Ind. Eng. Chem. Res. 2005, 44, 7120-7131

Curvature-Based Methods for Designing Optimally Informative Experiments in Multiresponse Nonlinear Dynamic Situations

Lamia Benabbas, Steven P. Asprey, and Sandro Macchietto*

Department of Chemical Engineering, Imperial College London, London SW7 2BZ, United Kingdom

* To whom correspondence should be addressed. Tel.: +44-20-75946608. Fax: +44-20-7594661255. E-mail: [email protected].

It is common practice in nonlinear regression situations to use asymptotic linear approximations of the model functions to construct parameter inference regions; such approximations may turn out to be a poor representation of the true underlying surfaces, especially for highly nonlinear situations and small sample sizes. For this reason, experimental designs based on these approximations could well be moderately noninformative. We present a new method for optimal experimental design for improving parametric precision while taking account of curvature in multiresponse nonlinear structured dynamic models. We base the curvature measures in the multiresponse case on the Box-Draper estimation criterion through use of the generalized least-squares model conditioned on the maximum likelihood estimate of the variance-covariance matrix for the responses. Curvature measures commensurate with those found in the literature are used for the generalized least-squares model in the neighborhood of the parameter point estimates. The problem of designing dynamic experiments is cast as an optimal control problem that enables the calculation of a fixed number of optimal sampling points, experiment duration, fixed and variable external control profiles, and initial conditions of a dynamic experiment subject to general constraints on inputs and outputs. We illustrate the experimental design concepts with a relatively simple but pedagogical example of the dynamic modeling of the fermentation of baker's yeast.

1. Introduction

Dynamic mechanistic modeling has become a common approach in most engineering disciplines, used to gain a better understanding of the various phenomena occurring within processing systems. In general, this approach yields nonlinear parametric models that give an accurate mathematical description of the system, much more so than high-dimensional linear models obtained using classical identification algorithms. By capturing and encoding the underlying phenomena, these detailed mathematical models can be used across a wide range of conditions to improve the product, process, and plant performance through such applications as model-based process design, control, and optimization of processes. Thus, building models of processing systems is a very important activity. However, performing even a single experimental run to produce the data required can be costly, in terms of both time and money. As a consequence, there is a distinct need for the use of systematic methods of developing models so as to minimize the cost of materials, analysis, and time and to maximize the quality of the information obtained from any one experiment.

When building models whose use is to explain an observed phenomenon, one uses a priori knowledge such as physical, chemical, or biological "laws" to propose (conceivably several) possible mechanistic or empirical models. These laws dictate the model structure. In each case, these models contain parameters that may have physical meaning, and we simply wish to know if it is at all possible to determine their values and, if so, to do so with maximum precision. In what follows, we therefore consider the general deterministic model form of a set of (possibly mixed) differential and algebraic equations

$$ f(\dot{x}(t),\, x(t),\, u(t),\, w,\, \theta,\, t) = 0 $$

$$ \hat{y}(t) = g(x(t)) \qquad (1) $$

where x(t) ∈ ℝ^ns is an ns-dimensional vector of time-dependent state variables, u(t) ∈ ℝ^nu is an nu-dimensional vector of time-varying controls or inputs to the process, w ∈ ℝ^nw is an nw-dimensional vector of time-invariant controls, θ ∈ ℝ^P is a P-dimensional vector of model parameters to be determined, and ŷ(t) ∈ ℝ^M is an M-dimensional vector of measured response variables that are a function of the state variables, x(t). The model function f(·), with f: ℝ^ns × ℝ^ns × ℝ^nu × ℝ^nw × ℝ^P → ℝ^ns, is assumed to be smooth and twice differentiable (C²) with respect to θ. It should be pointed out that the methods outlined here are not limited to explicit models. In most cases, g(x(t)) will simply be a "selector" matrix, selecting those state variables that are in fact measured (it is not necessary that all states be measured).

For the statistical verification of detailed dynamic mathematical models of this type, Asprey and Macchietto1 have presented a systematic model-building procedure. Overall, the procedure involves proposing a mathematical parametric model for the system under study, designing experiments using the model, estimating the parameters within the model using experimental data collected under the designed conditions, checking for model adequacy (model error), and repeating the procedure until satisfied with the degree of precision in the model predictions. A key component within this procedure is the design of experiments to gain optimal information from the experimental apparatus being modeled. In particular, experimental design is concerned with the following question: how does one select



the time-varying and time-invariant controls, initial conditions, length of experiment, and the sampling times for the experiment to generate the maximum amount of information for the purpose of estimating the parameters with greatest precision?

Despite the importance of this problem, there has been a relatively modest amount of work in the past on the application of experimental design techniques to dynamic experiments. The concept and some results were presented in the late 1980s and early 1990s for general mechanistic models consisting of differential and algebraic equations (DAEs);2-4 these works were based on the so-called "alphabetic" design criteria.5 Other previous developments include optimal input signal selection to identify dynamic models in the form of linear discrete time models based on a frequency-domain approach and the concepts of the spectral information content of the designed input signals and persistent excitation.6-8 An excellent overview of these methods and their relation to the alphabetic criteria can be found in Shirt et al.9 More recent work on the application of alphabetic criteria to systems described by general DAEs is reported by Körkel et al.10 and Asprey and Macchietto.11 An overview of an array of statistical and mathematical methods for dynamic model building is presented in Asprey and Macchietto.1

A key feature of these methods is the optimization of some measure of the expected information content of the next experiment to be performed. This is typically of the form of an a priori estimate of the confidence interval on the parameters that will be obtained when the experiment is performed and the new data are used to update the model parameters. Clearly, the reliability of the estimated confidence intervals is key, and this, in turn, depends on the local curvature of the maximum likelihood function surface. In this paper, we use the methods presented by Guay12 to form a measure of curvature when designing experiments for improving parameter precision in nonlinear multiresponse dynamic models.

Section 2 of the paper covers the central issue of parameter and variance estimation. Section 3 deals with the generalized least-squares approach of Guay12 on curvature measures in the multiresponse case, including use of second-order sensitivity coefficients calculated for the general DAE model in eq 1; section 4 details the dynamic optimization framework in which the curvature methods are embedded to allow the design of dynamic experiments; section 5 illustrates application of the methodology to engineering examples; section 6 provides concluding remarks.

2. Parameter and Variance Estimation

To statistically verify models such as eq 1, we follow the overall model-building strategy presented in Asprey and Macchietto.1 The central issue in statistical verification of models involves estimation of the model parameters from collected experimental data. To estimate the model parameters, θ, from experimental data, y_nm, we write the model including an additive stochastic component

$$ y_{nm} = \hat{y}_{nm}(\xi_n, \theta) + \epsilon_{nm}, \qquad n = 1, \ldots, N, \quad m = 1, \ldots, M \qquad (2) $$

where ξ_n is a vector denoting in compact form the experimental settings for the nth measured data point (elements include sampling time, t_sp, as well as values of the time-varying and -invariant inputs, u(t_sp), w), and

y_nm is the nth data point for the mth response. A common estimator used is that given by the Box-Draper13 criterion

$$ d(\theta) = \left| Z(\theta)^T Z(\theta) \right| \qquad (3) $$

where Z(θ) is the N × M matrix of residuals ((y_nm − ŷ_nm), or (Y − Ŷ) in matrix notation), evaluated at θ. It can be shown that eq 3 gives both a maximum likelihood estimator (MLE) as well as a Bayesian maximum a posteriori (MAP) estimator under the following conditions. The stochastic component is assumed to be multivariate normally distributed, with

$$ E[z_{nm}] = 0, \qquad E[z_{nm} z_{ri}] = \begin{cases} \{\Omega\}_{mi} & n = r \\ 0 & n \neq r \end{cases} \qquad (4) $$

where Ω is an M × M covariance matrix and E[·] denotes expectation. That is to say that we assume that measurements from different experimental settings (ξ_n) are independent, but measurements of the multiple responses taken under the same experimental settings are possibly correlated (i.e., Ω is not diagonal).

Conventional steady-state14,15 and more recent dynamic1,10,11 experimental design methods rely on linear approximations to compute inference regions for parameter estimates in the nonlinear situation. In particular, when using the Box-Draper13 criterion, Kang and Bates16 show that a (1 − α) joint inference region can be approximated by

$$ (\theta - \hat{\theta})^T \, \frac{\Gamma}{2} \, (\theta - \hat{\theta}) \le P s^2 F_{P,\, N-P;\, \alpha} \qquad (5) $$

where Γ is an approximate Hessian of |Ẑ^T Ẑ| evaluated at the parameter point estimates θ̂, F is the standard F-statistic distribution function, and

$$ s^2 = \frac{|\hat{Z}^T \hat{Z}|}{N - P} \qquad (6) $$

An approximate variance-covariance matrix of θ̂ is

$$ W(\hat{\theta}) = 2 s^2 \Gamma^{-1} \qquad (7) $$
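As a rough illustration of this estimation step, the sketch below (not the authors' code; the model callable, optimizer choice, and finite-difference step are assumptions) minimizes the Box-Draper determinant of eq 3 with a derivative-free optimizer and then assembles the linear-approximation covariance of eqs 6 and 7 from a finite-difference Hessian of |Ẑ^T Ẑ|.

```python
import numpy as np
from scipy.optimize import minimize

def box_draper(theta, y, model, settings):
    """Box-Draper criterion d(theta) = |Z(theta)^T Z(theta)| (eq 3)."""
    Z = y - model(settings, theta)              # N x M matrix of residuals
    return np.linalg.det(Z.T @ Z)

def fit_and_linearized_covariance(y, model, settings, theta0, h=1e-5):
    """Minimize eq 3, then build s^2 (eq 6) and W(theta_hat) (eq 7)."""
    N, M = y.shape
    P = len(theta0)
    res = minimize(box_draper, theta0, args=(y, model, settings),
                   method="Nelder-Mead")
    theta_hat, d_hat = res.x, res.fun
    s2 = d_hat / (N - P)                        # eq 6
    # Forward-difference Hessian Gamma of |Z^T Z| at theta_hat
    Gamma = np.zeros((P, P))
    for p in range(P):
        for q in range(P):
            tpq = theta_hat.copy(); tpq[p] += h; tpq[q] += h
            tp = theta_hat.copy(); tp[p] += h
            tq = theta_hat.copy(); tq[q] += h
            Gamma[p, q] = (box_draper(tpq, y, model, settings)
                           - box_draper(tp, y, model, settings)
                           - box_draper(tq, y, model, settings) + d_hat) / h**2
    W = 2.0 * s2 * np.linalg.inv(Gamma)         # eq 7
    return theta_hat, W
```

Here `model(settings, theta)` is any callable returning the N × M prediction matrix Ŷ; for the DAE model of eq 1 it would wrap the integrator.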

In situations in which there exists a moderate to high degree of nonlinearity in the model functions, the approximation given by eq 5 may in fact be a very poor representation of the true underlying joint parameter confidence surface. As pointed out by Hamilton and Watts,17 this is especially true for small sample sizes, in which the use of asymptotic (large sample) design criteria may be inappropriate.

3. Curvature Measures for Multiresponse Regression Models

We follow the developments presented by Guay12 to determine an overall curvature measure of the parameter inference region, based on the generalized least-squares (GLS; see Gallant18) model given by

$$ (E^{-T} \otimes I_{NM})\, y = (E^{-T} \otimes I_{NM})\, \hat{y} + (E^{-T} \otimes I_{NM})\, z \qquad (8) $$

where ⊗ indicates the Kronecker or direct product of two matrixes, E is the Cholesky factor of Ω̂ = N^{-1} Z(θ̂)^T Z(θ̂) = N^{-1} Ẑ^T Ẑ, the maximum likelihood estimate of the covariance matrix Ω for the response variables, and I_NM is the NM × NM identity matrix; we let vec be the operator that stacks the columns of a matrix into a vector, and thus y = vec(Y), ŷ = vec(Ŷ), and z = vec(Z). In this formulation, we encapsulate the nonlinearity in the multiresponse model by considering the NM-dimensional GLS problem as a single-response model. In this form, we are now able to apply the Bates and Watts19 curvature measures, calculated by using the NM-dimensional velocity and acceleration vectors defined by

$$ \dot{V}_p = (E^{-T} \otimes I_{NM})\, \frac{\partial \hat{y}(\xi, \hat{\theta})}{\partial \theta_p}, \qquad \ddot{V}_{pq} = (E^{-T} \otimes I_{NM})\, \frac{\partial^2 \hat{y}(\xi, \hat{\theta})}{\partial \theta_p \partial \theta_q} \qquad (9) $$

where θ̂ are the parameter point estimates obtained by minimizing the Box-Draper13 criterion, eq 3. We can then follow the developments given by Bates and Watts20 to continue to derive curvature measures. Conditioning the parameter inference procedure on the maximum likelihood estimate of Ω gives an approximate maximum likelihood estimate confidence region for θ defined by21

$$ (\theta - \hat{\theta})^T \left\{ \frac{\partial \hat{y}(\xi,\hat{\theta})^T}{\partial \theta} \, (\hat{\Omega}^{-1} \otimes I_{NM}) \, \frac{\partial \hat{y}(\xi,\hat{\theta})}{\partial \theta} \right\} (\theta - \hat{\theta}) \le \frac{NMP}{NM - P}\, F_{\alpha} \qquad (10) $$

The maximum likelihood estimate of the variance-covariance matrix for the parameters is given by W(θ̂) = (NM − P)^{-1} Λ_L^{-1}, where

$$ \Lambda_L = N^{-1}\, \frac{\partial \hat{y}(\xi,\hat{\theta})^T}{\partial \theta} \, (\hat{\Omega}^{-1} \otimes I_{NM}) \, \frac{\partial \hat{y}(\xi,\hat{\theta})}{\partial \theta} \qquad (11) $$
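In practice, eqs 8-11 amount to a handful of linear-algebra steps once the residuals and first-order sensitivities at θ̂ are available. The sketch below is illustrative only (array layouts and names are assumptions of this sketch, and the Kronecker weighting is written for the vec(Y) stacking used here):

```python
import numpy as np

def gls_parameter_covariance(Z, J):
    """Lambda_L and W(theta_hat) of eqs 10-11 from residuals and sensitivities.

    Z : (N, M) residual matrix Y - Y_hat at theta_hat
    J : (N, M, P) array of first-order sensitivities d(y_hat_nm)/d(theta_p)
    """
    N, M, P = J.shape
    Omega_hat = Z.T @ Z / N                    # MLE of the response covariance
    E = np.linalg.cholesky(Omega_hat)          # Cholesky factor of Omega_hat (eq 8)
    Omega_inv = np.linalg.inv(Omega_hat)

    # Stack columns response-wise to match y = vec(Y); shape (N*M, P)
    J_stacked = J.transpose(1, 0, 2).reshape(N * M, P)
    W_gls = np.kron(Omega_inv, np.eye(N))      # GLS weighting in this stacking
    Lambda_L = J_stacked.T @ W_gls @ J_stacked / N          # eq 11
    W_theta = np.linalg.inv(Lambda_L) / (N * M - P)         # W(theta_hat)
    return E, Lambda_L, W_theta
```

The returned factor E is the one used in eq 9 to scale the velocity and acceleration vectors.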

The exact determinant contours give rise to a Bayesian inference region for θ given by

$$ d(\theta) - d(\hat{\theta}) \le \frac{P}{NM - P}\, d(\hat{\theta})\, F_{\alpha} \qquad (12) $$

that can be approximated by eq 5. In the Bayesian approach, the variance-covariance matrix for the parameters is given by W_B(θ) = Λ_B^{-1}/(NM − P), where

$$ \Lambda_B = \tfrac{1}{2}\, d(\hat{\theta})^{-1}\, \Gamma \qquad (13) $$

Following Bates and Watts22 and Guay,12 we perform a QR decomposition of the composite array [V̇, Ẅ], in which V̇ is the NM-dimensional velocity vector and Ẅ = [V̈_11, V̈_12, V̈_22, ..., V̈_1P, ..., V̈_PP] is an NM × P(P + 1)/2 matrix of nonredundant acceleration vectors, such that

$$ [\dot{V}, \ddot{W}] = QR = [Q_1 \,|\, Q_1' \,|\, Q_2] \begin{bmatrix} R_1 & A_{\theta} \\ 0 & A_{\iota} \\ 0 & 0 \end{bmatrix} \qquad (14) $$

where R_1 is a P × P upper-left triangular matrix and A_θ and A_ι are P × P(P + 1)/2 and P′ × P(P + 1)/2 matrixes, respectively, which are used to form the parameter effects and intrinsic components of the curvature or acceleration array Ä. The information contained in A_θ and A_ι corresponds to the tangential and normal components of the acceleration vectors, respectively. The extent to which the acceleration vectors lie outside the tangent plane provides a measure of how much the expectation surface deviates from a plane; this nonplanarity is called the intrinsic nonlinearity. As such, the intrinsic nonlinearity does not depend on the model parametrization but only on the experimental design and the expression for the expectation function. However, the projections of the acceleration vectors in the tangential plane depend on the parametrization of the model and measure the nonuniformity of the parameter lines on the tangential plane; this is termed parameter effects nonlinearity.20

4. Curvature-Based Experimental Design Framework for Dynamic Models

Calculating the curvature arrays presented in eq 14 for dynamic process models involves solving the first- and second-order dynamic sensitivity equations (Guay and McLean21 and Vassiliadis et al.23). We define the NM × P-dimensional matrix of first-order sensitivities as

$$ \frac{\partial \hat{y}}{\partial \theta_p} = \frac{\partial g(x)}{\partial x}\, \frac{\partial x}{\partial \theta_p} \qquad (15) $$

where by partial differentiation of eq 1 with respect to the parameter θ_p

$$ \frac{d}{dt}\frac{\partial x}{\partial \theta_p} = \frac{\partial f}{\partial x}\frac{\partial x}{\partial \theta_p} + \frac{\partial f}{\partial \theta_p} \qquad (16) $$

To obtain the second-order sensitivities of eq 1 with respect to the parameter vector θ, we differentiate eq 16 with respect to θ_q to obtain an NM × P × P array

$$ \frac{d}{dt}\frac{\partial^2 x}{\partial \theta_p \partial \theta_q} = \left[ \sum_{i=1}^{n} \left( \frac{\partial^2 f}{\partial x\, \partial x_i}\, \frac{\partial x_i}{\partial \theta_q} \right) + \frac{\partial^2 f}{\partial x\, \partial \theta_q} \right] \frac{\partial x}{\partial \theta_p} + \sum_{i=1}^{n} \left( \frac{\partial^2 f}{\partial \theta_p\, \partial x_i}\, \frac{\partial x_i}{\partial \theta_q} \right) + \frac{\partial f}{\partial x}\, \frac{\partial^2 x}{\partial \theta_p \partial \theta_q} + \frac{\partial^2 f}{\partial \theta_p \partial \theta_q} \qquad (17) $$
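For the special case of an explicit ODE model dx/dt = f(x, θ, t), the first-order sensitivity system of eq 16 can simply be appended to the state equations and integrated together; the sketch below is a simplified alternative to the decoupled direct method discussed next, and the function names are illustrative assumptions. The same augmentation idea extends to the second-order system of eq 17.

```python
import numpy as np
from scipy.integrate import solve_ivp

def augmented_rhs(t, z, f, dfdx, dfdtheta, theta, ns, P):
    """State equations plus first-order sensitivity ODEs (cf. eq 16)."""
    x = z[:ns]
    S = z[ns:].reshape(ns, P)                 # S = dx/dtheta
    dxdt = f(t, x, theta)
    dSdt = dfdx(t, x, theta) @ S + dfdtheta(t, x, theta)
    return np.concatenate([dxdt, dSdt.ravel()])

def integrate_with_sensitivities(f, dfdx, dfdtheta, x0, theta, t_samples):
    ns, P = len(x0), len(theta)
    z0 = np.concatenate([x0, np.zeros(ns * P)])   # initial sensitivities = 0
    sol = solve_ivp(augmented_rhs, (t_samples[0], t_samples[-1]), z0,
                    t_eval=t_samples, args=(f, dfdx, dfdtheta, theta, ns, P),
                    rtol=1e-8, atol=1e-10)
    X = sol.y[:ns].T                              # states at sampling times
    S = sol.y[ns:].T.reshape(len(t_samples), ns, P)
    return X, S
```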

The DAEs in eqs 15-17 can be solved simultaneously with the model equations, eq 1, using the direct decoupled method (DDM) outlined in Dunker.24 It has been reported21 that step-length adjustment for error control in the integration is usually not an issue for the second-order sensitivities. In fact, as pointed out by Keeping and Pantelides,25 the step length should not be adjusted for integration of either the first- or the second-order sensitivity equations. A further review of the second-order sensitivities of general dynamic systems described by DAEs such as eq 1 is given by Vassiliadis et al.23

Once eqs 1 and 15-17 have been solved for the velocity and acceleration vectors, the calculations in eqs 9 and 14 are straightforward; however, as pointed out in Bates and Watts,20 the curvatures calculated in eq 14 are in units of (1/response), and as such, their values depend on the scaling of the data. We remove this dependence by calculating the (P + P′) × P × P relative curvature array using

$$ C = R_1^{-T}\, \ddot{A}\, R_1^{-1}\, s\sqrt{P} \qquad (18) $$


For experimental design purposes, the use of this curvature information is best accomplished by calculating a simple overall scalar measure of nonlinearity. For this purpose, again we follow Bates and Watts20 and use the root-mean-square (rms) curvature measure, which is the square root of the average over all directions of the squared curvature, calculated by

$$ c^2 = \frac{1}{P(P+2)} \sum_{n} \left[ 2 \sum_{p=1}^{P} \sum_{q=1}^{P} C_{npq}^2 + \left( \sum_{p=1}^{P} C_{npp} \right)^2 \right] \qquad (19) $$
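The chain from eq 14 through eqs 18 and 19 can be sketched numerically as follows, assuming the velocity vectors V̇_p and acceleration vectors V̈_pq of eq 9 are already available as arrays; this is an illustrative reading of the Bates-Watts construction, not the authors' implementation, and it does not separate out the parameter-effects and intrinsic parts.

```python
import numpy as np

def rms_curvature(Vdot, Vddot, s):
    """Relative curvature array (eq 18) and rms curvature (eq 19).

    Vdot  : (NM, P)     velocity vectors
    Vddot : (NM, P, P)  acceleration vectors (symmetric in p, q)
    s     : residual scale used to make the curvatures dimensionless
    """
    NM, P = Vdot.shape
    # Nonredundant acceleration vectors, column order V''_11, V''_12, V''_22, ...
    idx = [(p, q) for q in range(P) for p in range(q + 1)]
    W = np.column_stack([Vddot[:, p, q] for p, q in idx])
    Q, R = np.linalg.qr(np.column_stack([Vdot, W]), mode="reduced")
    R1 = R[:P, :P]
    # Project accelerations onto the orthogonal basis: tangential rows first, normal rows after
    A_cols = Q.T @ W
    A = np.zeros((A_cols.shape[0], P, P))
    for k, (p, q) in enumerate(idx):
        A[:, p, q] = A_cols[:, k]
        A[:, q, p] = A_cols[:, k]
    R1_inv = np.linalg.inv(R1)
    C = np.einsum('pi,nij,jq->npq', R1_inv.T, A, R1_inv) * s * np.sqrt(P)   # eq 18
    c2 = (2.0 * np.sum(C**2, axis=(1, 2)) +
          np.trace(C, axis1=1, axis2=2)**2).sum() / (P * (P + 2))           # eq 19
    return C, np.sqrt(c2)
```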

Experimental designs for dynamic systems can use this overall curvature measure in several ways: (1) Minimize the expected overall curvature

$$ \min_{\phi \in \Phi} \; E[c^2(\phi, \hat{\theta})] \qquad (20) $$

The value of c² after an optimal experimental design has been determined will give an indication of the expected degree of nonlinearity when fitting the model to the newly collected data and hence an indication of whether linear approximations to the joint parameter confidence regions, cf. eq 5, can in fact be trusted and used. (2) Minimize parametric uncertainty as predicted by the asymptotic dynamic Fisher information matrix, subject to an acceptable overall level of curvature

$$ \min_{\phi \in \Phi} \; f\big(M_I(\phi, \hat{\theta})\big) \quad \text{subject to} \quad c(\phi, \hat{\theta})\sqrt{F} \le c \qquad (21) $$

As in the work of Asprey and Macchietto,1,11 here we utilize a criterion based on the information matrix, M_I, for dynamic nonlinear multiresponse systems, first derived by Zullo.4 Zullo4 defined an information matrix for experimental design for the improvement of parameter precision in nonlinear, multiresponse, dynamic situations. The design criterion requires maximizing a scalar metric, f(·), with f: ℝ^P × ℝ^P → ℝ, of the information matrix, defined in the following way for dynamic models

$$ M_I(\hat{\theta}, \phi) \equiv \sum_{r=1}^{M} \sum_{s=1}^{M} \tilde{\sigma}_{rs}\, Q_r^T Q_s \qquad (22) $$

where, as already defined, θ̂ is the vector of the best available point estimates of the model parameters and φ is the vector of experimental design variables. The matrix Q_r is the matrix of first-order sensitivity coefficients of the rth response variable in the model, computed at each of the n_sp sampling intervals (i.e., its elements are ∂ŷ_r(t_sp_i)/∂θ_p for i = 1, ..., n_sp and p = 1, ..., P). The weighting factors, σ̃_rs, in eq 22 either can be equally weighted (e.g., all ones) or, if a previous parameter estimation has been carried out on existing data, can come from an estimate of the variance-covariance matrix of the residuals.
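For reference, eq 22 and the scalar metrics discussed next reduce to a few lines once the per-response sensitivity matrices Q_r are available; the sketch below uses illustrative names and defaults to equal weighting.

```python
import numpy as np

def information_matrix(Q, sigma_tilde=None):
    """Dynamic information matrix M_I of eq 22.

    Q : (M, n_sp, P) array; Q[r] holds the first-order sensitivities of
        response r at each sampling time.
    sigma_tilde : (M, M) weighting factors; defaults to all ones.
    """
    M, n_sp, P = Q.shape
    if sigma_tilde is None:
        sigma_tilde = np.ones((M, M))
    MI = np.zeros((P, P))
    for r in range(M):
        for s in range(M):
            MI += sigma_tilde[r, s] * Q[r].T @ Q[s]
    return MI

def design_metrics(MI):
    """Scalar metrics of the kind used as f(.) in eq 21, via the covariance ~ MI^-1."""
    cov = np.linalg.inv(MI)
    return {"D": np.linalg.det(cov),             # volume of the joint region
            "E": np.linalg.eigvalsh(cov).max(),  # largest axis
            "A": np.trace(cov)}                  # enclosing box
```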

Figure 1. Geometric interpretation of the experimental design situation.

To compare the magnitude of different matrices, various real-valued functions have been suggested as metrics for f(·) in eq 21. Three common criteria are:

(i) D-Optimality. An experimental design is D-optimal if it minimizes the determinant of the covariance matrix, eq 7, and thus minimizes the volume of the joint confidence region.

(ii) E-Optimality. An experimental design is E-optimal if it minimizes the largest eigenvalue of the covariance matrix, eq 7, and thus minimizes the size of the major axis of the joint confidence region.

(iii) A-Optimality. An experimental design is A-optimal if it minimizes the trace of the covariance matrix, eq 7, and thus minimizes the dimensions of the enclosing box around the joint confidence region.

Figure 1 illustrates geometrically the experimental design situation. The designs described above make use of an approximation to the underlying likelihood surface, depicted by the dashed ellipse. The point estimate of the parameters is indicated by a dot (•), located at the center of the figure. The other lines represent the contours of the actual parameter confidence region projected onto the plane of the two parameters θ1 and θ2. It may be noted that the contours are indeed elliptical very close to the best estimate of the parameters but depart significantly from an elliptical shape far from the point estimates.

(3) Minimize parametric uncertainty as predicted by a quadratic (small sample size) approximation to the volume of the joint parameter confidence regions, as achieved by Hamilton and Watts.17

Here, we focus on minimal curvature designs as well as designs subject to an acceptable level of curvature and leave the application of the quadratic criterion17 to dynamic systems for a future exercise.

Dynamic Optimization Framework. The design of experiments for dynamic systems naturally requires the capability to optimize time-varying inputs to the process, i.e., input trajectory optimization. For this purpose, infinite-dimensional continuous time-varying inputs to the process, u(t), are mathematically approximated using the control vector parametrization (CVP) technique,26 using either piecewise-constant, piecewise-linear, or piecewise-quadratic functions with zeroth or first-order continuity. For the piecewise-constant case, we have

$$ u_i(t) = z_{i,j} \quad \forall\, t \in \tau_{sw_j}, \qquad i = 1, \ldots, n_c, \quad j = 1, \ldots, n_{sw} \qquad (24) $$

where n_c is the number of time-varying controls and n_sw is the number of time intervals, with vector τ_sw of switching times defining the intervals in which the time-varying controls remain constant at values z_i,j (in this framework, we allow independent time intervals for each of the time-varying controls). By using CVP, we represent the time-varying input trajectories with a finite number of optimization variables and thus make the problem's solution tractable. A diagrammatic representation of the piecewise-constant control vector parametrization is shown in Figure 2. In this case, all z_i,j and τ_sw_j are design variables, along with a sufficient number of initial conditions of the experiment, x̃_0 (where x̃ ⊂ x), to properly specify the problem, any time-invariant controls w (n_w-dimensional vector), and the sampling times or measurement intervals, τ_sp (n_sp-dimensional vector), of the process responses for which we wish the model to predict. Thus, we may now define the vector of experimental design variables, φ, as

$$ \phi = [\tau_{sp}^T,\, \tau_{sw}^T,\, z_1^T,\, \ldots,\, z_{n_c}^T,\, \tilde{x}_0^T,\, w^T]^T \qquad (25) $$

Figure 2. Illustration of the control vector parametrization for the piecewise-constant case.

In this formulation, the sampling intervals are the same for all response variables; however, there is no reason we could not have different sampling intervals for each response variable, especially if the time scales are drastically different between samples (e.g., thermocouple vs. gas chromatograph). To ensure that the experimental design obtained is physically realizable, constraints may be imposed on the design variables. There are also simple bounds and optional nonlinear equality/inequality constraints on the initial conditions and the state variables

$$ \tilde{x}_0^L \le \tilde{x}_0 \le \tilde{x}_0^U \qquad (26) $$

$$ a(\tilde{x}_0) \ge 0 \qquad (27) $$

$$ x^L \le x \le x^U \qquad (28) $$

$$ b(x) \ge 0 \qquad (29) $$

Similar bounds could also be imposed on the response variables, y. Simple bounds are imposed on the constant inputs as well as the levels of the piecewise-constant controls within each control interval, the time-varying input switching intervals, and the sampling time intervals

$$ w^L \le w \le w^U \qquad (30) $$

$$ z_j^L \le z_j \le z_j^U, \qquad j = 1, \ldots, n_{sw} \qquad (31) $$

$$ \tau_{sw}^L \le \tau_{sw} \le \tau_{sw}^U \qquad (32) $$

$$ \tau_{sp}^L \le \tau_{sp} \le \tau_{sp}^U \qquad (33) $$

To allow maximum versatility, independent limits are allowed for each sampling time interval. As it turns out, these types of constraints are active almost all of the time, as the experimental design criteria repeatedly attempt to place sampling points at the minimum of the nonlinearity measure. The constraint in eq 32 is used to avoid a mathematical singularity caused by the collapse of one or more control time intervals. Finally, we impose a constraint on the time-varying control switching intervals

$$ \sum_{i=1}^{n_{sw}} \tau_{sw_i} \le \sum_{i=1}^{n_{sp}} \tau_{sp_i} \qquad (34) $$

as there is no point in making changes to the time-varying controls past the last sampling interval. As such, in this formulation, we have a flexible final time for the end of the experiment, defined by the sum of the sampling time intervals, Σ_{i=1}^{n_sp} τ_sp_i. In this paper, the resulting NLP is solved using the SRQPD method (see Vassiliadis et al.26 and references therein for further details).

Implementation Algorithm. Here, we outline the algorithm for designing an experiment that minimizes the expected overall curvature. Implementation of the other experimental designs follows the same procedure, except for a change in the equations for the objective function (eq 20).

(1) Given θ̂, set initial values for all elements in φ (cf. eq 25).

(2) Loop (start of optimization).

(3) (a) Integrate the model equations, eq 1, coupled with the first- and second-order sensitivity equations, eqs 15-17, numerically through use of DASOLV (see Keeping and Pantelides25), a DAE solver.

(b) Calculate the expected overall curvature, eq 19. This involves using the model and sensitivity coefficient solutions obtained in step 3a and solving eqs 9, 14, and 18, respectively. The QR decomposition is effected through use of the methods detailed in Bates and Watts.20

(c) Calculate the derivatives of eq 19 with respect to all elements in φ through use of central finite differences. This will involve individually perturbing the elements in φ and iterating steps 3a and 3b.

(d) Through use of a local quadratic approximation and the constraints, eqs 26-34, the optimizer calculates new values of φ such that the value of the expected overall curvature decreases.

(4) Loop until the optimization converges to a specified tolerance.

(5) At the solution, the value of φ obtained is the design vector for which the expected overall curvature is minimized.
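A minimal sketch of this loop, using scipy's SLSQP in place of the SRQPD solver, might look as follows; the layout of φ, the bound handling, the piecewise-constant helper, and the placeholder curvature evaluator are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np
from scipy.optimize import minimize

def piecewise_constant(t, switch_times, levels):
    """Evaluate a piecewise-constant control (eq 24) at time t."""
    edges = np.concatenate([[0.0], np.cumsum(switch_times)])
    j = min(int(np.searchsorted(edges, t, side="right")) - 1, len(levels) - 1)
    return levels[j]

def design_minimum_curvature(expected_curvature, phi0, bounds):
    """Minimize E[c^2(phi, theta_hat)] over the design vector phi (eq 20).

    expected_curvature(phi) must integrate the model and sensitivity
    equations for the design encoded in phi and return c^2 of eq 19;
    gradients are taken internally by finite differences, as in step 3c.
    """
    res = minimize(expected_curvature, phi0, method="SLSQP",
                   bounds=bounds, options={"ftol": 1e-6, "maxiter": 200})
    return res.x, res.fun

# Hypothetical layout of phi for n_sp samples and n_sw switches per control:
#   phi = [tau_sp (n_sp), tau_sw1 (n_sw), tau_sw2 (n_sw),
#          z1 (n_sw), z2 (n_sw), free initial conditions, w]
```

Nonlinear constraints such as eqs 27, 29, and 34 could be passed to the optimizer through its `constraints` argument.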


5. An Engineering Example: Fed-Batch Fermentation of Baker's Yeast

We illustrate the experimental design concepts with this relatively small, but pedagogical, example, chosen for its ubiquitous appearance in past papers on a similar subject.3,4,11 We consider the fermentation stage of an intracellular enzyme process,27 as presented in Asprey and Macchietto.1,11 We use a typical unstructured (i.e., the internal composition and structure of the cells are unaccounted for), unsegregated (i.e., all of the cells are considered identical) model in which only one key substrate is assumed to be the limiting factor for growth and product formation. The product is an intracellular enzyme; nonviable cells are also formed. We assume isothermal operation of the fermenter and that the feed is free from product. Assuming Monod-type kinetics for biomass growth and substrate consumption, the system is described by the following set of DAEs3,27

$$ \frac{dy_1}{dt} = (r - u_1 - \theta_4)\, y_1 $$

$$ \frac{dy_2}{dt} = -\frac{r\, y_1}{\theta_3} + u_1 (u_2 - y_2) $$

$$ r = \frac{\theta_1 y_2}{\theta_2 + y_2} \qquad (35) $$

where y1 is the biomass concentration (g/L), y2 is the substrate concentration (g/L), u1 is the dilution factor (h⁻¹), and u2 is the substrate concentration in the feed (g/L). The important experimental conditions that characterize a particular experiment are (1) the initial biomass concentration (or inoculation), y10, with a range from 1 to 10 g/L, (2) the dilution factor, u1, with a range from 0.05 to 0.20 h⁻¹, and (3) the substrate concentration in the feed, u2, with a range from 5 to 35 g/L. The initial substrate concentration, y20, is always 0.1 g/L and cannot be manipulated for experimental design purposes. Both y1 and y2 can be measured during the experiment. Notice here that, according to the general model formulation, eq 1, both state variables are also measured and thus y = x and g = I. The objective is to design an experiment to yield the best possible information for the precise estimation of the four parameters θi, i = 1, ..., 4. The only a priori available information on the latter is that they each lie in the range 0.0-1.0.

The experiment under consideration involves 20 sampling intervals. A piecewise-constant variation over 10 switching intervals is assumed for both controls. The elapsed time between any two successive sampling points is bounded between 1 and 20 h (cf. eq 33), and the duration of each control interval is between 2 and 20 h (cf. eq 32). For parameter estimation and experimental design purposes, we produced simulated experimental data with θ = [0.30, 0.20, 0.50, 0.05]^T as the "true" values of the parameters and subsequently added multivariate normally distributed noise with a mean of zero; the covariance matrix of the simulated measurement errors was

$$ \Sigma = \begin{bmatrix} 0.01628 & 0.00301 \\ 0.00301 & 0.01055 \end{bmatrix} \qquad (36) $$

Table 1. Parameter Estimates Calculated from the Unplanned Data EXP 0

parameter | estimate | confidence interval
θ1 | 0.29812 | ±0.01974
θ2 | 0.21503 | ±0.09368
θ3 | 0.48796 | ±0.03368
θ4 | 0.04712 | ±0.00915

Figure 3. Overlay plot of the predictions of the model with data from the unplanned experiment EXP 0.
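For orientation, eq 35 with constant inputs (the unplanned experiment EXP 0 of section 5.1) can be simulated directly; the sketch below uses the nominal parameter values quoted above and assumes, for illustration, samples every 2 h over the 40 h horizon.

```python
import numpy as np
from scipy.integrate import solve_ivp

def yeast_rhs(t, y, theta, u1, u2):
    """Fed-batch baker's yeast model, eq 35 (Monod kinetics)."""
    y1, y2 = y
    r = theta[0] * y2 / (theta[1] + y2)            # specific growth rate
    dy1 = (r - u1 - theta[3]) * y1                 # biomass
    dy2 = -r * y1 / theta[2] + u1 * (u2 - y2)      # substrate
    return [dy1, dy2]

# Unplanned experiment EXP 0: u = [0.05, 30.0], y(0) = [1.0, 0.1], 40 h horizon
theta_true = [0.30, 0.20, 0.50, 0.05]
t_samples = np.linspace(2.0, 40.0, 20)
sol = solve_ivp(yeast_rhs, (0.0, 40.0), [1.0, 0.1], t_eval=t_samples,
                args=(theta_true, 0.05, 30.0), rtol=1e-8)
Y_clean = sol.y.T                                   # 20 x 2 noise-free "measurements"
```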

5.1. Unplanned Experiment. To start, an unplanned experiment, herein denoted EXP 0, was carried out with 20 sampling intervals of uniform length over a 40 h period, with input variables set constant at u = [0.05, 30.0]^T and an initial condition of y10 = 1.0.

5.1.1. Model Verification and Assessment. First, we estimated the model parameters using the data produced by EXP 0 described above and a starting estimate for the parameters of θ = [0.50, 0.50, 0.50, 0.50]^T. The Box-Draper13 MLE, eq 3, was minimized using a successive quadratic programming (SQP) method with simple bounds on the parameter values. Results of the estimation are shown in Table 1, along with linear approximation 95% parameter confidence limits calculated using eq 5. Statistical model adequacy was assessed by comparing the calculated χ² value

$$ \chi^2 = \sum_{i=1}^{N} \sum_{j=1}^{M} \sum_{k=1}^{M} (y_{ij} - \hat{y}_{ij})\, (\sigma_{jk})^{-1}\, (y_{kj} - \hat{y}_{kj}) \qquad (37) $$

(where (σ_jk)^{-1} is the j,kth element of Σ^{-1}) to the reference 95% confidence level χ² value; these are 1.1689 and 50.9980, respectively. The calculated value being smaller than the reference value is indicative of an adequate fit of the model to the data. This evidence is further corroborated by inspection of Figure 3, which shows the predictions of the model using the new parameter estimates overlaid with the simulated "experimental" data produced in EXP 0. From these figures, we can infer that the model fits the data well, which is also supported by observation of randomly distributed residuals of y1 and y2 (not shown


Table 2. D-Optimal Designed Dynamic Experiment

design variable | symbol | value(s)
biomass initial condition | y1^0 | 10.00
measurement times | t_sp_i, i = 1, ..., n_sp | 0.73, 1.73, 2.73, 4.02, 7.81, 8.81, 10.28, 11.33, 18.45, 19.45, 20.53, 21.78, 22.82, 33.99, 34.99, 36.00, 37.00, 38.00, 39.00, 40.00
dilution factor control switching times | τ_sw1,i, i = 1, ..., n_sw1 | 0.65, 4.79, 6.79, 8.87, 11.74, 15.21, 17.27, 19.46, 21.52
feed substrate control switching times | τ_sw2,i, i = 1, ..., n_sw2 | 0.61, 7.56, 10.28, 15.52, 22.46, 25.20, 27.98, 31.27, 34.00
dilution factor control levels | z_1,i, i = 1, ..., n_sw1 | 0.05, 0.20, 0.09, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05
feed substrate control levels | z_2,i, i = 1, ..., n_sw2 | 5.00, 35.00, 34.87, 35.00, 5.00, 5.04, 5.18, 8.57, 9.64, 8.98

for brevity). Note that, hypothetically, if we had found significant lack-of-fit (LOF) for the model (indicating the presence of possible modeling error), we would have revisited the model formulation before continuing with the investigation. The rms curvature, c, for this situation was calculated to be 221.59, with a rms parameter effects curvature of 119.73 and a rms intrinsic curvature of 11.85, indicating

Table 3. Parameter Estimates Calculated from the Optimal Experiment, EXP 1, Designed for Improved Parameter Precision

parameter | estimate | confidence interval
θ1 | 0.31334 | ±0.01168
θ2 | 0.29932 | ±0.09224
θ3 | 0.50591 | ±0.00986
θ4 | 0.05090 | ±0.00203

significant curvature. This is also supported by visual inspection of the nonelliptical contours of the actual likelihood surface shown in Figure 4.

5.2. Standard D-Optimal Design of Dynamic Experiments. As a calibration of the new method, a standard D-optimal design (i.e., maximizing the information content without consideration for curvature and thus solving eq 21 without the curvature constraint) was carried out with the same initial conditions as in the previous experiment, EXP 0. Nine switching times for the two time-varying inputs and 20 sampling points were used for the dynamic experimental design. The experiment's duration was fixed at 40 h. The optimal operating conditions obtained are summarized in Table 2, and the resulting piecewise-constant inputs are depicted in Figure 5. In this situation, the rms curvature was 94.71, and the corresponding rms intrinsic curvature and parameter effects curvature were 6.25 and 50.38, respectively. These values indicate significant curvature. The objective function (information content) was evaluated to be 2.511 × 10³ before the design and 4.926 × 10¹⁰ after, indicating that the designed experiment has an estimated 7 orders of magnitude more information than that of the starting (unplanned) design.

5.2.1. Model Reassessment. We generated new data under the newly designed conditions by running the simulated experiment and adding white noise, herein referred to as EXP 1. The parameters were reestimated, and new estimates are given in Table 3.
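These reassessment steps (simulate at the designed conditions, corrupt with the measurement-error covariance of eq 36, refit, and check adequacy with eq 37) can be sketched as follows; `simulate` and `fit_parameters` are hypothetical stand-ins for the model integration and Box-Draper estimation discussed earlier, not actual functions from this work.

```python
import numpy as np

rng = np.random.default_rng(0)
SIGMA = np.array([[0.01628, 0.00301],
                  [0.00301, 0.01055]])              # eq 36

def add_measurement_noise(Y_clean):
    """Add zero-mean multivariate normal noise with covariance Sigma (eq 36)."""
    noise = rng.multivariate_normal(np.zeros(2), SIGMA, size=Y_clean.shape[0])
    return Y_clean + noise

def chi2_lack_of_fit(Y, Y_hat, sigma=SIGMA):
    """Chi-squared adequacy statistic of eq 37, to compare against the 95% reference."""
    resid = Y - Y_hat                               # N x M residuals
    return float(np.einsum('ij,jk,ik->', resid, np.linalg.inv(sigma), resid))

# Y_designed = simulate(phi_optimal, theta_true)    # model run at designed conditions
# Y_obs = add_measurement_noise(Y_designed)
# theta_new = fit_parameters(Y_obs)                 # Box-Draper refit (see eq 3 sketch)
# chi2 = chi2_lack_of_fit(Y_obs, simulate(phi_optimal, theta_new))
```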

Figure 4. Contours of the likelihood surface from the unplanned experiment EXP 0.

Figure 5. The dynamic inputs designed for improving parametric precision.

Table 4. Designed Dynamic Experiment for Reducing Overall rms Curvature

design variable | symbol | value(s)
biomass initial condition | y1^0 | 10.00
measurement times | t_sp_i, i = 1, ..., n_sp | 0.72, 1.72, 2.72, 3.72, 4.72, 5.72, 6.72, 7.72, 8.73, 16.11, 29.40, 3.17, 32.68, 33.68, 34.68, 35.68, 36.68, 37.68, 38.68, 40.00
dilution factor control switching times | τ_sw1,i, i = 1, ..., n_sw1 | 1.67, 7.66, 12.72, 14.72, 16.72, 19.69, 22.59, 28.60, 30.73
feed substrate control switching times | τ_sw2,i, i = 1, ..., n_sw2 | 0.67, 5.76, 8.68, 12.72, 15.20, 17.20, 20.54, 23.97, 30.48
dilution factor control levels | z_1,i, i = 1, ..., n_sw1 | 0.20, 0.05, 0.20, 0.20, 0.20, 0.20, 0.20, 0.20, 0.20, 0.20
feed substrate control levels | z_2,i, i = 1, ..., n_sw2 | 21.57, 6.38, 5.00, 34.56, 5.00, 5.00, 5.00, 5.00, 5.00, 5.13

The LOF was again statistically insignificant, as indicated by a calculated χ² value of 1.6164, compared with the 95% confidence level reference value. This can also be seen from the overlay plots of the predictions of the model calculated using the new estimates of the parameters with the simulated experimental data EXP 1 presented in Figure 6. The likelihood surface contours shown in Figure 7 are smaller than those obtained in the previous case (Figure 4), as one would expect, since the experiment was designed to improve parameter precision and hence would have smaller corresponding inference regions. The shape of the contours is again nonelliptical, because the curvature is high.

5.3. Design of Dynamic Experiments for Reducing Overall Curvature. An experimental design was carried out with the objective of minimizing the overall curvature using eq 20, with the same initial conditions as in the unplanned experiment, EXP 0. As indicated earlier, nine switching times for the two time-varying inputs and 20 sampling points were used for the dynamic experimental design. The optimization converged on the values of the design variables shown in Table 4. Figure 8 illustrates the resulting piecewise-constant inputs for the optimal experiment. The value of the objective function (the overall rms curvature in this case) at the initial conditions was 221.59 and decreased to 9.11 at the optimal point. The corresponding optimal rms intrinsic value was 2.87, with a rms parameter curvature of 4.57. Comparison of these values to those calculated in the case of the unplanned experiment (cf. section 5.1.1) shows a significant reduction in rms curvature, and hence we would expect more elliptically shaped contours of the likelihood surface following the use of this designed experiment (the desired result).

5.3.1. Model Reassessment. Under the designed experimental conditions described above, we generated new simulated experimental data "collected" at the optimal sampling points shown in Table 4, herein referred to as EXP 2. Here again, the same level of noise

as that for the unplanned experiment was added to the simulated data. The model parameters were then reestimated. The resulting estimates are displayed in Table 5. In this case, the χ² value was calculated to be 3.9799, which is much lower than the reference 95% confidence level χ² value of 50.9980. The LOF is hence statistically insignificant. Through use of the model with the new

Figure 6. Overlay plot of the predictions of the model with data from the D-optimal experiment EXP 1.

Figure 7. Contours of the likelihood surface from the D-optimal experiment EXP 1.


Figure 8. Dynamic inputs designed for reducing overall rms curvature.

Figure 9. Overlay plot of the predictions of the model with data from the designed experiment for reducing overall rms curvature EXP 2.

Table 5. Parameter Estimates Calculated from the Optimal Experiment, EXP 2, Designed to Reduce Overall Curvature

parameter | estimate | confidence interval
θ1 | 0.37271 | ±0.11893
θ2 | 0.44271 | ±0.412095
θ3 | 0.50809 | ±0.0228019
θ4 | 0.05097 | ±0.00337498

parameter estimates, Figure 9 shows the predicted responses overlaid with the simulated noisy data for the designed experiment. Inspection of Figure 10 indicates that, as expected, the actual likelihood surface contours are now more elliptically shaped than those obtained for the unplanned experiment.

5.4. Design of Dynamic Experiments for Maximal Information Content Subject to an Acceptable Overall Curvature Level. Through use of eq 21 as the objective function and the same initial conditions as those for the unplanned experiment, an experimental design was carried out for improving parametric precision while keeping the overall curvature constrained to an acceptable level. A value of c = 16 (cf. eq 21) was used in the curvature constraint. The resulting optimal values of the designed variables are presented in Table 6, and an illustration of the resulting piecewise-constant optimal inputs is given in Figure 11. The values of the objective function before and after the design were 2.511 × 10³ and 4.254 × 10⁸, respectively, showing that the newly designed experiment contains an estimated 5 orders of magnitude more information than that of the starting-point design. The

Figure 10. Contours of the likelihood surface from the experiment designed for reducing overall curvature EXP 2.

corresponding value of the overall rms curvature calculated for the new situation was 22.3, with a rms intrinsic curvature value of 4.9 and a rms parameters curvature of 12.3. These values are higher than those calculated in the case of the minimum curvature design but are lower than those in the case of the unplanned experiment, indicating an intermediate level of curvature.

5.4.1. Model Reassessment. For the purpose of parameter reestimation, we simulated experimental data under the conditions given in Table 6, added white noise as in the previous cases (herein referred to as EXP 3), and reestimated the model parameters. We report the new parameter estimates in Table 7, along with their linear approximation 95% confidence intervals. The calculated χ² value in this case was 1.8352, which is again much lower than the reference 95% confidence level χ² value (50.9980). This indicates an insignificant LOF. The profiles of the responses predicted by the model were then computed using the latest estimates of the parameters. Figure 12 shows the overlay plots of those

Table 6. Designed Dynamic Experiment for Improved Parameter Precision while Ensuring an Acceptable Level of Overall rms Curvature

design variable | symbol | value(s)
biomass initial condition | y1^0 | 10.00
measurement times | t_sp_i, i = 1, ..., n_sp | 0.11, 1.33, 2.33, 3.33, 4.33, 8.83, 9.83, 10.83, 15.47, 18.73, 19.73, 20.74, 21.74, 33.94, 34.95, 36.00, 37.00, 38.00, 39.00, 40.00
dilution factor control switching times | τ_sw1,i, i = 1, ..., n_sw1 | 6.42, 15.39, 22.47, 25.18, 29.47, 32.00, 34.00, 36.00, 38.00
feed substrate control switching times | τ_sw2,i, i = 1, ..., n_sw2 | 0.10, 2.10, 4.10, 8.26, 14.28, 32.00, 34.00, 36.00, 38.00
dilution factor control levels | z_1,i, i = 1, ..., n_sw1 | 0.11, 0.05, 0.05, 0.06, 0.05, 0.05, 0.05, 0.05, 0.05, 0.20
feed substrate control levels | z_2,i, i = 1, ..., n_sw2 | 13.06, 20.68, 21.85, 35.00, 5.00, 34.25, 5.00, 5.92, 31.52, 33.20

Table 7. Parameter Estimates Calculated from the Optimal Experiment, EXP 3, Designed for Improved Parameter Precision with Acceptable Overall Curvature

parameter | estimate | confidence interval
θ1 | 0.31665 | ±0.01994
θ2 | 0.25096 | ±0.08489
θ3 | 0.52154 | ±0.02929
θ4 | 0.05413 | ±0.00569864

profiles with the data of EXP 3. For this case of an intermediate level of curvature, the actual likelihood surface contours are shown in Figure 13. Overall, it appears that the shape of the contours is approximately elliptical. We can, therefore, conclude that the curvature in the region corresponding to the designed experiment is limited enough to allow the use of the linear approximations for the parameter inference regions with sufficient accuracy.

Figure 11. Dynamic inputs designed for improved parameter precision while ensuring an acceptable level of overall rms curvature.

Figure 13. Contours of the likelihood surface from the experiment designed for improved parameter precision while ensuring an acceptable level of curvature EXP 3.

Figure 12. Overlay plot of the predictions of the model with data from the designed experiment for improved parameter precision while ensuring an acceptable level of overall rms curvature EXP 3.

Table 8 summarizes the values of the rms curvature, determinant of the information matrix (information content), and confidence intervals in all four cases presented in this paper. We can conclude that there is a marked tradeoff between parameter precision and curvature of inference regions.


Table 8. Summary of Confidence Intervals, rms Curvature, and Information Content in Each of the Experiments Presented

95% linear approximation confidence intervals | unplanned | minimum curvature | acceptable curvature | standard D-optimal design
θ1 | 0.01974 | 0.11893 | 0.01994 | 0.01168
θ2 | 0.09368 | 0.41210 | 0.08489 | 0.09224
θ3 | 0.03368 | 0.02280 | 0.02929 | 0.00986
θ4 | 0.00915 | 0.00338 | 0.005698 | 0.00203
det(MI) | 2.975 × 10³ | 2.103 × 10² | 4.254 × 10⁸ | 4.926 × 10¹⁰
c | 221.59 | 9.11 | 22.3 | 94.71

6. Concluding Remarks

We have presented a new method for optimal experimental design for improving parametric precision while taking account of curvature in multiresponse nonlinear structured dynamic models. We based the curvature measures in the multiresponse case on the Box-Draper estimation criterion through use of the generalized least-squares model conditioned on the maximum likelihood estimate of the variance-covariance matrix for the responses, as done by Guay.12 Curvature measures commensurate with those of Bates and Watts22 are used for the generalized least-squares model in the neighborhood of the parameter point estimates. The problem of designing dynamic experiments was cast as an optimal control problem, which enables the calculation of a fixed number of optimal sampling points, experiment duration, fixed and variable external control profiles, and initial conditions of a dynamic experiment subject to general constraints on inputs and outputs.

We illustrated the experimental design concepts with a relatively simple, but pedagogical, example involving the dynamic modeling of the fermentation of baker's yeast. We showed that the new approach could provide significant improvements in designing experiments to ensure an overall acceptable level of curvature, in turn ensuring that linear approximation parametric confidence intervals and correlation calculations are relevant.

Acknowledgment

Lamia Benabbas is grateful to the Mitsubishi Chemical Company for financial support in the form of a postdoctoral fellowship.

Nomenclature

E[·] = expected value
f, g = analytic vector fields (ns-dimensional)
F(·) = F-statistic
L(·) = likelihood function
M = number of system outputs or response variables
M(θ, u, t) = model structure as a function of θ, u, and t
Mj(θ) = moment matrix of the residuals of the jth model evaluated at the best estimated θ
Me = experimental error moment matrix
nsp = number of sampling points (or times)
nu = number of system inputs
N = number of observations of the response variables
Nj = number of observations of the jth response variable
NM = number of competing models
NC = number of time-varying controls
p(·) = probability distribution function
P = number of parameters

re = number of repeat measurements of the response variables
Sij(φ) = covariance matrix of the vector of deviations of the responses under models i and j
S = covariance matrix of experimental error
t = time
t(·) = Student's t statistic
tspi = ith sampling point of the response variables
tswi,j = jth switching time of the ith time-varying system input
T(φ) = model discrimination criterion
u = nu-dimensional vector of system inputs
Vl,sp = sensitivity matrix for model l at sampling time tsp
w = nw-dimensional vector of time-independent (constant) system inputs
W = weighting matrix
x(t) = ns-dimensional vector of state variables
x̃0 = vector of state variables for which the initial conditions can be set experimentally
y(t) = M-dimensional vector of system outputs or response variables
ŷ = estimated or predicted response variables
y0 = M-dimensional vector of initial conditions
γj = heteroscedasticity parameter of the jth response variable
εn = nth vector of errors or residuals (M-dimensional)
θ = P-dimensional vector of model parameters
θ̂ = present best estimate of the parameter vector θ
λ = condition number of a matrix
µ = mean value
σ̂² = variance estimate
σi = variance of the expected value of the ith output yi
Σ = M × M covariance matrix of residuals
Σθ = M × M covariance matrix
φ = vector of experimental design variables
Ψ = error distribution parameters (e.g., mean and variance)
χ²(·) = χ² statistic
Ω = Hessian matrix

Literature Cited

(1) Asprey, S. P.; Macchietto, S. Statistical Tools for Optimal Dynamic Model Building. Comput. Chem. Eng. 2000, 24, 1261.
(2) Espie, D. M. The Use of Nonlinear Parameter Estimation for Dynamic Chemical Reactor Modelling. Ph.D. Thesis, University of London, 1986.
(3) Espie, D. M.; Macchietto, S. The Optimal Design of Dynamic Experiments. AIChE J. 1989, 35, 223.
(4) Zullo, L. Computer Aided Design of Experiments: An Engineering Approach. Ph.D. Thesis, University of London, 1991.
(5) Kiefer, J.; Wolfowitz, J. Optimum Designs in Regression Problems. Ann. Math. Stat. 1959, 30, 271.
(6) Ljung, L. System Identification: Theory for the User; Prentice Hall: Englewood Cliffs, NJ, 1987.
(7) Goodwin, G. C.; Payne, R. L. Dynamic System Identification: Experiment Design and Data Analysis; Academic Press: New York, 1977.
(8) Zarrop, M. B. Optimal Experiment Design for Dynamic System Identification; Springer: Berlin, 1979.
(9) Shirt, R. W.; Harris, T. J.; Bacon, D. W. Experimental Design Considerations for Dynamic Systems. Ind. Eng. Chem. Res. 1994, 33, 2656.
(10) Körkel, S.; Bauer, I.; Bock, H. G.; Schlöder, J. P. A Sequential Approach for Nonlinear Optimum Experimental Design in DAE Systems. In Proceedings of the International Workshop on Scientific Computing in Chemical Engineering; TU Hamburg-Harburg: Hamburg, Germany, 1999; Vol. II.
(11) Asprey, S. P.; Macchietto, S. Designing Robust Optimal Dynamic Experiments. J. Process Control 2002, 12, 545.
(12) Guay, M. Curvature Measures for Multiresponse Regression Models. Biometrika 1995, 82, 411.

(13) Box, G. E. P.; Draper, N. R. The Bayesian Estimation of Common Parameters from Several Responses. Biometrika 1965, 52, 355.
(14) Seber, G. A. F.; Wild, C. J. Nonlinear Regression; Wiley Series in Probability and Mathematical Statistics; John Wiley & Sons: Toronto, 1989.
(15) Draper, N. R.; Hunter, W. G. Design of Experiments for Parameter Estimation in Multiresponse Situations. Biometrika 1966, 53, 525.
(16) Kang, G.; Bates, D. M. Approximate Inferences in Multiresponse Regression Analysis. Biometrika 1990, 77, 321.
(17) Hamilton, D. C.; Watts, D. G. A Quadratic Design Criterion for Precise Estimation in Nonlinear Regression Models. Technometrics 1985, 27, 241.
(18) Gallant, A. R. Nonlinear Statistical Models; Wiley Series in Probability and Mathematical Statistics: Applied Probability and Statistics; John Wiley & Sons: Toronto, 1996.
(19) Bates, D. M.; Watts, D. G. Multiresponse Estimation with Special Application to Linear Systems of Differential Equations. Technometrics 1985, 27, 329.
(20) Bates, D. M.; Watts, D. G. Nonlinear Regression Analysis and Its Applications; Wiley: New York, 1988.
(21) Guay, M.; McLean, D. D. Optimization and Sensitivity Analysis for Multiresponse Parameter Estimation in Systems of Ordinary Differential Equations. Comput. Chem. Eng. 1995, 19, 1271.

(22) Bates, D. M.; Watts, D. G. Relative Curvature Measures of Nonlinearity (with discussion). J. R. Stat. Soc. 1980, 42, 1.
(23) Vassiliadis, V. S.; Canto, E. B.; Banga, J. R. Second-Order Sensitivities of General Dynamic Systems with Application to Optimal Control Problems. Chem. Eng. Sci. 1999, 54, 3851.
(24) Dunker, A. M. The Decoupled Direct Method of Calculating Sensitivity Coefficients in Chemical Kinetics. J. Chem. Phys. 1984, 81, 2385.
(25) Keeping, B. R.; Pantelides, C. C. A Distributed Memory Parallel Algorithm for the Efficient Computation of Sensitivities of Differential-Algebraic Systems. Math. Comput. Simul. 1998, 44, 545.
(26) Vassiliadis, V. S.; Sargent, R. W. H.; Pantelides, C. C. Solution of a Class of Multistage Dynamic Optimization Problems 1: Problems Without Path Constraints. Ind. Eng. Chem. Res. 1994, 33, 2111.
(27) Nihtila, M.; Virkunnen, J. Practical Identifiability of Growth and Substrate Consumption Models. Biotechnol. Bioeng. 1977, 19, 1831.

Received for review March 26, 2004
Revised manuscript received January 5, 2005
Accepted January 21, 2005

IE040096W