Ind. Eng. Chem. Res. 2007, 46, 871-882
871
Model-Based Design of Parallel Experiments Federico Galvanin,† Sandro Macchietto,†,‡ and Fabrizio Bezzo*,† DIPICsDipartimento di Principi e Impianti di Ingegneria Chimica, UniVersita` di PadoVa,Via Marzolo 9, I-35131 PadoVa, Italy; and Department of Chemical Engineering, Imperial College London, South Kensington Campus, SW7 2AZ London, U.K.
Advanced model-based experiment design techniques are essential for the rapid development, refinement, and statistical assessment of deterministic process models. One objective of experiment design is to devise experiments yielding the most informative data for use in the estimation of the model parameters. Current techniques assume that multiple experiments are designed in a sequential manner. However, multiple equipment can sometimes be available, and simultaneous (parallel) experiments could be advantageous in terms of time and resources utilization. The concept of model-based design of parallel experiments is presented in this paper. Furthermore, a novel criterion for optimal experiment design is proposed: the criterion aims at maximizing complementary information by considering different eigenvalues in the information matrix. The benefits of adopting such an approach are discussed through an illustrative case. 1. Introduction The use of detailed mathematical models to design, control, and optimize individual unit operations as well as entire processes is a well-established approach in the process industry. A wide class of models describes the phenomena being investigated through the statement of physical laws and correlations in the form of a system of differential and algebraic equations (DAEs). The model capability of representing in a reliable and accurate way the underlying phenomena depends on the model structure (i.e., the correlations and laws being used) and on the values of any parameters which may be calibrated to match the model to a specific real system. Experimental data are typically required both to assess the model validity and to estimate the model parameters in the range of expected utilization. The approach of perturbing a process for system identification is a widespread and mature technique for parameter identification, in particular for systems represented by linear models. However, in particular for nonlinear systems, the choice of the experiments to be carried out is critical to establish the most appropriate model structure and the best values of the parameters as well as to save time and effort in experimental trials. Deciding which experiment to carry out is not obvious, and there is usually a trade-off between experimental effort (in time and money) and amount and quality of data. The benefits derived from the use of statistical design of (black-box) experiments are well-known, and a large amount of scientific literature can be found on the subject (see, for instance, the books of Atkinson and Donev1 and Lazic2). Since the very beginning,3,4 such techniques were applied to nonlinear systems (particularly reactions schemes), and their use to estimate parameters in dynamic models has been intensively discussed (see, for instance, the review by Shirt et al.5). However, the application of optimal experimental design to nonlinear-dynamic models, where explicit advantage is taken of the structure of the underlying system, in the form of a DAE model (as opposed to considering it a purely black box) is rather * To whom correspondence should be addressed. Tel.: +39 049 8275468. Fax: +39 049 8275461. E-mail:
[email protected]. † Universita` di Padova. ‡ Imperial College London.
more recent.6,7 To distinguish this from the black-box case, the denomination of “model-based experiment design” has been adopted. Such advanced model-based experiment design techniques are a tool for rapid development, refinement, and statistical assessment of deterministic process models. They allow for selecting conditions for the next experiments that are “best”, in the sense of generating the maximum information content about the underlying process, usually described using the Fisher information matrix. On the basis of the earlier work of Espie and Macchietto8 and Zullo,9 Asprey and Macchietto10 proposed a general systematic procedure to support the development and statistical verification of dynamic process models for both linear and nonlinear dynamic systems described by DAEs. As in blackbox experiment design, assuming that no model discrimination is required beforehand, three consecutive steps are needed to determine the model parameters: (1) the design of a new set of experiments, based on current knowledge (model structure and parameters, and statistics from prior experiments) and on the maximization of some scalar value associated to the dynamic Fisher information matrix; (2) the execution of the designed experiment and collection of new data; and (3) the estimation of new model parameters and statistical assessment. The sequential iteration of steps 1, 2, and 3 leads to a progressive reduction in the uncertainty region of model parameters, thanks to the new information obtained from experimental data.11 What characterizes this approach is the explicit use in step 1 of the model equations (including any constraints) for the evaluation of the experiment design objective function and the use of a deterministic (dynamic) optimization framework for the solution of resulting numerical optimization problem. The procedure has been successfully demonstrated in several applications, such as crystallization processes,12 mammalian cell cultures,13 and biodiesel production.14 Bauer et al.15 presented a similar sequential procedure for optimal model-based experimental design for chemical processes. In particular, they discussed the optimization problem and the sequential quadratic programming (SQP) method used for the numerical solution. The benefits of the approach are demonstrated by an example represented by a stiff set of equations describing chemical reaction kinetics. Leineweber et al.16 further developed the optimization procedure through a
10.1021/ie0611406 CCC: $37.00 © 2007 American Chemical Society Published on Web 01/09/2007
872
Ind. Eng. Chem. Res., Vol. 46, No. 3, 2007
Figure 2. Elementary unit of the information fluxes for parallel experiment design. Figure 1. Elementary unit of the information fluxes for sequential experiment design.
simultaneous solution strategy based on multiple shooting and reduced SQP. Figure 1 illustrates the information flux for a sequential experiment design, with the “OK?” diamond representing a set of statistical model adequacy and parameter significance tests. The use of more and more complex models inevitably requires suitable design procedures in order to estimate reliable values for the model parameters. Furthermore, in all design procedures presented so far, experiments are performed sequentially, thus exploiting new data prior to designing each new experiment. However, there are a number of research and industrial applications where it is possible to envisage the simultaneous execution of several experiments in parallel rather than sequentially. For instance, miniaturization and nanotechnology allow the definition of an array of modules (e.g., microreactors for chemical or biochemical reactions) in which several experiments can be simultaneously carried out in the same or different experimental conditions. However, model-based parallel experiment design has never been properly discussed in the scientific literature. The elementary unit for a pure parallel design (i.e., assuming that no iteration is required) is illustrated in Figure 2. As in Figure 1, the parameter-estimation step could be followed by a statistical test for model adequacy and parameter significance. While many black-box, statistical experimental design methods inherently require the parallel execution of multiple experiments (for example, five experiments in a “latin box”), model-based parallel experiment design has not been previously discussed in the scientific literature. In this work, the possibility of extending the current modelbased experiment design techniques to tackle the design of parallel experiments is discussed. This paper focuses on the optimal design of experiments for parameter estimation, although similar considerations also hold when the scope is to discriminate among rival models. First, the following section
examines as broadly as possible what an experimental design procedure for parameter estimation should ideally take into account. Then, the paper presents a new approach based on a statistical analysis of the variance-covariance matrix of the parameters to be estimated. It is shown that the methodology can be applied to develop new strategies for parallel or sequential design as well as hybrid sequential-parallel experiments. Parallel and sequential-parallel techniques are compared to a standard sequential approach, and potential advantages/ disadvantages are highlighted. The applicability of the new experimental design methods to dynamic systems and their performance is discussed via an illustrative case study. 2. Experimental Design Framework A generic experiment setting is considered, characterized by the following: (1) A set Sy of experimental measured responses Y(t), where t is time. It is assumed that, for each measured variable, the number of measurements is finite and defined by a specified number nsp of sampling points at times tsp. The measured responses are characterized by the variance-covariance matrix Σy of the experimental error. (2) Two sets of control variables (inputs) that can be measured and manipulated by the experimenter. One set (Su) comprises time-varying variables U(t); the other set (Sw) comprises timeinvariant variables W. (3) A set Sz of time-dependent disturbances Z(t), i.e., inputs that cannot be manipulated. For ease of discussion, it is assumed that Sz comprises all disturbances, reducing the informative content of the experiment and potentially hindering data replication as well as masking the system dynamics. Sz comprises unmeasured timedependent disturbances of known nature and randomly distributed disturbances (e.g., related to measurement instru-
Ind. Eng. Chem. Res., Vol. 46, No. 3, 2007 873
mentation, control system, and unidentifiable human errors contribution). The vectorial SE space of the experiment conditions is defined as
SE ) Sy ∪ Su ∪ Sw ∪ Sz ∪ {Y0} ∪ {τ}
(1)
where {Y0} is the set of the measured values of Y at time t ) 0, which is also set by the experimenter or related to the environment, and {τ} is the set of the durations of the experiments, which is also an experimental condition that needs to be defined. Any first-principle model aimed at describing the system will try to include as many of the above factors as possible. It is assumed that the model is described by a set of DAEs of the form
{
f(x3 (t),x(t),u(t),w,θ,t) ) 0 yˆ (t) ) g(x(t))
(2)
where x(t) is the Ns-dimensional vector of time-dependent state variables, u(t) ⊆ U(t) and w ⊆ W are the Nu and Nw model control variables, and θ is the set of Nθ unknown model parameters to be estimated. Parameters are characterized by an initial estimated value θ0 and an initial approximated variancecovariance matrix V0. The circumflex symbol (i.e., yˆ ) is used to indicate the model estimate of a variable (or a set of variables) that can be measured. y(t) ⊆ Y(t) is the M-dimensional vector of measured values of the outputs, while yˆ (t) is the vector of the corresponding values estimated by the model. In general, predicted responses yˆ (t) differ from the measured ones y(t). For ease of discussion, it is assumed that the model does not include any representation of the set Sz, i.e., the set Sz represents the set of phenomena that cannot be described in the model. As noted above, the space of the model experiment conditions is a subset of the space of the experiment condition, i.e., SM ⊆ SE; in fact, the mathematical model may neglect in its representation the effect of Z(t) and some components of the U(t) and W vectors. Model-based experimental design procedures aim at decreasing the expected model parameter uncertainty region by acting on the experiment design vector φ
φ ) {y0,u(t),w,tsp,τ}
(3)
It may be noted that the identity of the measured variables y is not explicitly included in the design vector: it is assumed that these have been “chosen” a priori. However, the frequency at which these variables are sampled is a design variable and can be expressed through the vector tsp of sampling times. Technological limitations usually restrict the choice of φ to a subset Sφ of SM called “technological design vectorial space”. This is usually implemented within model 2 by including a number of additional constraints that define the feasible range or rate of variation of the measured responses (for example, maximum pressure, maximum rate of temperature change, or safe composition range) and control variables (for example, range of an ingredient feed rate or heating rate) during an experiment and the durations of experiments. We may also specify desirable final conditions (values or ranges) of the measured variables at the end of the experiment. Formally, we have
φ ∈ Sφ ⊆ S M ⊆ S E
(4)
In the above discussion, it is implied that the experimental setup is defined a priori: for example, there may be a stirred-
Figure 3. Information fluxes for a combined sequential-parallel experiment design.
tank reactor (and related instrumentation) and all experiments are sequentially run in that piece of equipment. However, the availability of multiple equipment gives the possibility to execute experiments not only sequentially but also simultaneously (i.e., in parallel), as well as adopting hybrid configurations, thus extending the design space Sφ. For example, if we refer to the elementary design units of Figures 1 and 2, it can be observed that parallel experimental design can be incorporated within the inner loop of a sequential iteration scheme, as illustrated in the example in Figure 3, where the number of parallel experiments used at each iteration can be an additional decision variable. In this case, additional degrees of freedom are available to the experimenter, such as the elapsed time (typically related to the number of experiments carried out simultaneously or sequentially) and the allocation decisions (which equipment is used to perform the ith experiment). These can be represented by an experiment structure T,
T ) {N, NR, C}
(5)
where N is the total number of experiments (also defining the maximum number of iterations in a purely sequential, singleexperiment scheme), NR is the total number of available devices (defining the maximum degree of parallelism adoptable), and C is an s-dimensional connectivity vector (s is the number of experiment sequences, representing an experimental campaign mapping). For instance, it may happen that there are two stirredtank reactors (i.e., NR ) 2) and the available budget allows for three experiments (i.e., N ) 3), which are carried out according to the following scheme: two parallel experiments followed by a sequential experiment (i.e., C ) [2 1]T with s ) 2). (Although the pieces of equipment could be different (for example, two different reactors at different scales), here it is assumed that a single model can represent any piece of equipment used for the experimental runs.)
874
Ind. Eng. Chem. Res., Vol. 46, No. 3, 2007
Accordingly, a new design vector Φ including the experiment structure T is defined as
Φ ) {φi,T} (i ) 1, ..., N)
(6)
Prior information on the parameters can be ignored by dropping the dependency of eq 8 on the approximate parameter variance-covariance V0:3 Nexp
In principle, the optimization of Φ would produce an experimental design capable of maximizing the information content by selecting the best sequence of (possibly parallel) experiments, constraining the set of experiments in terms of a maximum number of experiments to be performed and/or a maximum time of experimentation. The optimization of Φ, combining the optimum model-based experimental design problem15 with the optimum experiment structure problem, is quite a challenging problem. In general, its solution could be approached using mixed-integer nonlinear programming (MINLP) algorithms.17 Note that, to include multiple equipment and connectivity, the vectorial space of the experimental conditions should be modified by including the set of experiment structures ST:
SE ) Sy ∪ Su ∪ Sw ∪ Sz ∪ {Y0} ∪ {τ} ∪ ST
(7)
Sφ and SM are modified accordingly. This paper is a first step toward the definition of a general experimental design architecture. However, the discussion and solution of this generic optimization problem, aimed at optimizing the experiment structure T as well as individual experiments, are beyond the scope of this work and will not be considered. Here it will be discussed how parallel and sequential design can be treated, and a new methodology to exploit the informative content of multiple experiment is proposed. 3. Methodology Let us consider a process modeled by a set of DAEs as system 2 and any additional inequality constraints. Here it is assumed for simplicity that all the differential variables x can be measured, i.e., x(t) ≡ yˆ (t). Model-based experimental design for parameter precision aims at determining the optimal vector φ of experimental conditions required to maximize the expected information content from the measured data generated by these experiments, i.e., to optimize the expected “quality” of the estimated model parameters (in some statistical sense, usually defined in terms of the parameters inference region). In order to decrease the size of the inference regions of each of the parameters in a given model, some measure ψ of the variance-covariance matrix Vθ of the parameters has to be minimized. After a number Nexp of experiments, the matrix Vθ is the inverse of the Nθ × Nθ information matrix Hθ,18 * + (V0) -1]-1 ) ∑ Hθ|k
[∑∑∑ Nexp
k)1 M M
k)1 i)1 j)1
]
-1
T σij|kQi|k Qj|k + (V0)-1
(8)
* where Hθ|k is the information matrix of the kth experiment (symbol * indicates that the information matrix refers to a single experiment), σij is the ijth element of the inverse of the estimated measurement variance-covariance matrix Σy, and Qi is the ith state matrix of the sensitivity coefficients at each of the nsp sampling points:9
Qi )
[ ] ∂xil ∂θm
* -1 Hθ|k ] ) ∑ k)1 Nexp M
[
l ) 1, ..., nsp m ) 1, ..., Nθ
(9)
M
T T σi|k Qi|kQj|k]-1 ∑ ∑ ∑ k)1 i)1 j)1
(8a)
The information matrix (eq 8a) is sufficiently generic and can be used for the design of both sequential and simultaneous experiments. In order to assess an experimental design with respect to the parameter estimation accuracy, various functions of the information matrix are discussed in the literature.19 A common choice for the measure ψ is the E-optimality criterion,20 which aims at minimizing the largest eigenvalue λ1 of matrix Vθ. It may be noted that the definition of matrix Vθ and the E-optimality criterion are quite general and do not depend on whether the experiments are run sequentially or simultaneously. If a sequential approach is considered, the information matrix after Nexp experiments is defined as Nexp-1
Hθ )
∑ k)1
* * * Hθi|k + Hθ|N (θ,φNexp) ) K + Hθ|N (θ,φNexp) exp exp
(10)
where K is a constant matrix defined by the previous (Nexp 1) experiments. In the above information matrix, only the vector φNexp of the experimental conditions for the new experiment, Nexp, is available for optimization. 3.1. Parallel Design. The new experiments may be designed simultaneously, i.e., in parallel, by considering as design objective a measure of the overall information expected and as design variables the design vectors of all the parallel experiments. From the former definition (eq 8a), the overall information matrix become Nexp
Hθ )
* (θ,φk) ∑ Hθ,k
(11)
k)1
and the optimum experimental design problem becomes
min ψk(Vθ)
φ1,...,φNexp
Nexp
Vθ(θ,φ) ) Hθ-1(θ,φ) ) [
Vθ(θ,φ) ) Hθ-1(θ,φ) ) [
(12)
Here, all vectors φk, one for each experiment k, are optimized simultaneously. As before, we may use E-optimality to “measure” Hθ (i.e., the largest eigenvalue λ1 of the overall matrix Vθ) and as the objective function to be minimized. It is important Nexp * λ1,k(Hθ,k ), and therefore, to note that, in general, λ1(Hθ) * ∑k)1 a criterion based on a measure of Hθ depending on the matrix’ eigenvalues does not produce the trivial solution φ1 ) φ2 ) ... ) φNexp. In other words, the Nexp new optimal experiments will normally be distinct. A major drawback of this approach is that a much larger optimization problem needs to be solved. 3.2. New Optimum Criterion. An alternative method for optimum design is proposed here, which is particularly suited for parallel experiments. Standard criteria aim at maximizing a measure of the information matrix in order to exploit the
Ind. Eng. Chem. Res., Vol. 46, No. 3, 2007 875
Note that, if a parallel design is performed according to the SV-optimal criterion, matrix 8a is simplified as
Vθ(θ,φ) ) Hθ-1(θ,φ) )
Figure 4. Geometric interpretation of SV-optimality for a model with two parameters.
information content represented by a global measure of the information matrix. However, it may be more convenient to choose a more targeted approach to optimization. From this point of view, the E-optimality criterion is already a selective criterion when compared to the D-optimality criterion (which minimizes the determinant of Vθ): the underlying assumption is that it is more useful to look at a specific direction of the information content, i.e., the direction identified by the largest eigenvalue. Clearly, that is not all the information that may be available to the experimenter, particularly when parallel experiments can be designed, since the minimization of a global measure typically excludes the possibility of exploring complementary information to which different experimental devices could be dedicated. The novel approach discussed here is to design each experiment by selecting a priori a vector of experimental conditions that produces information that is as different as possible (orthogonal) from the other ones. In mathematical terms, that means that the information content of matrix Hθ is split into its singular values, identified by its Nθ eigenvalues λi: the new optimization criterion, called SV-optimality, aims at maximizing the information linked to the Nexp largest singular values of Vθ. Thus, the overall optimization problem is split into Nexp separate optimization problems,
min ψk (Vθ) φk
k ) 1, ..., Nexp e Nθ
(13)
where the kth measure ψk is defined as
ψk ) λk(Vθ)
k ) 1, ..., Nexp e Nθ λ1 > λ2 > ... > λNexp (14)
In other words, we optimize in parallel the largest Nexp eigenvalues, each requiring the solution of a distinct optimal design problem, where each problem is small. As noted in eq 14, the largest number of distinct parallel problems/experiments is the number of parameters, Nθ. As illustrated in Figure 4, SV-optimal design allows for compressing the confidence ellipsoid of the parameters to be estimated by minimizing its axes in a selective way. Although the relative size of the axes (i.e., the relative magnitude of the eigenvalues) does depend on the model being analyzed, it is quite significantly affected by the variance-covariance matrix of the measurements Σy. Large values of the matrix elements (i.e., measurement uncertainty) amplify the difference between the eigenvalues.
[
M
M
]
∑ ∑ σijQiQj i)1 j)1
-1
(15)
since the SV criterion does not require to include the number of parallel experiments in the information matrix. One clear advantage of SV-optimality applied to model-based design of parallel experiments is that now it is possible to solve Nexp small optimization problems (each of the size of φ) rather than a single large one (having size Nexp × φ), as with the overall optimization method of Section 3.1. The second potential advantage is that we do not design the experiments to maximize the information content of the overall set, but each experiment is designed to maximize a specific component of the available information. One potential drawback of the SV-optimal design is that the number of independent experiments is limited by the number of model parameters, although a large number of experiments is usually needed if a large number of parameters also require estimation. Nonetheless, if the available equipment contains more experimental units than model parameters (as may be the case for microtiter plates used by biochemists), it is still possible to combine the SV-optimal criterion with the E-optimal (or D-optimal) criterion for parallel experiments (based, as discussed above, on the information matrix 11). It can be observed that the SV criterion could also be applied within a sequential experiment design scheme: for instance, the first experiment aims at minimizing the largest eigenvalue of the variance-covariance matrix, the second minimizes the second largest eigenvalue, and so on. As is discussed later, it can be noted that this approach adds an additional degree of freedom, since the order according to which experiments are designed (i.e., which eigenvalue is to be minimized first, second, etc.) is also left to be decided by the experimenter. 4. Case Study The methodology discussed in the previous section is applied to a biomass fermentation process for baker’s yeast that appeared in several papers on the subject.8,10,18 Assuming Monod-type kinetics for biomass growth and substrate consumption, the system is described by the following set of DAEs,
dx1 ) (r - u1 - θ4)x1 dt rx1 dx2 )+ u1(u2 - x2) dt θ3 r)
(16)
θ1x2 θ2 + x 2
where x1 is the biomass concentration (g/L), x2 is the substrate concentration (g/L), u1 is the dilution factor (h-1), and u2 is the substrate concentration in the feed (g/L). The experimental conditions that characterize a particular experiment are the initial biomass concentration x01 (range 1-10 g/L), the dilution factor u1 (range 0.05-0.20 h-1), and the substrate concentration in the feed u2 (range 5-35 g/L). The initial substrate concentration x02 is set to 0 g/L and cannot be manipulated for experimental design purposes. Both x1 and x2 can be measured during the experiment (i.e., x(t) ≡ yˆ (t)).
876
Ind. Eng. Chem. Res., Vol. 46, No. 3, 2007
The total duration of a single experiment is set equal to 40 h (i.e., duration τ is fixed). It is assumed that each experimental run involves five sampling times. It is assumed that inputs u(t) can be manipulated and represented as a piecewise-constant profile over five switching intervals. The sampling times and the control variables switching times can be different. The elapsed time between any two sampling points is allowed to be between 0.1 and 20 h, and the duration of each control interval is allowed to be between 0.2 and 20 h. The set (eq 16) will represent both the real system and the system model. Therefore, with reference to Section 2, we have the following:
Z(t) ) 0 y(t) ) Y(t) ()x(t)) u(t) ) U(t) w)W)0 The objective is to design a set of experiments to yield the best possible information for the estimation of the four parameters θi. A total of 24 variables (i.e., one initial condition, five sampling times for the two measured variables, 2 × 4 switching times, and 2 × 5 levels for the two control variables) are optimized in each experiment. As discussed by Chen and Asprey,21 the mathematical representation of continuous time-varying inputs is obtained through a control vector parametrization technique,22 using a piecewise-constant parametrization. The nonlinear programming problem is solved for φ, by using the sequential reduced quadratic programming (SRQP) optimization routine of the gPROMS software,23 with simple bounds on the parameter values. Five starting points for the optimization variables are used in order to reduce the possibility of incurring into local minima. The points are chosen as follows: (1) x01, u1, u2, and control variables’ switching times are set to their allowed maximum values. (2) x01, u1, u2, and control variables’ switching times are set to their allowed minimum values. (3) x01, u1, u2, and control variables’ switching times are set to average values between their allowed maximum and minimum values. (4) x01, u1, and u1’s switching times are set to their maximum values, while u2 and u2’s switching times are set to their minimum values. (5) x01, u1, and u1’s switching times are set to their minimum values, while u2 and u2’s switching times are set to their maximum values. The initial sampling times for the measured variables do not vary and are set to [1, 5, 15, 20, 30]T h. The SRQP routine was also used in the parameter estimation step, with simple bounds on control variables. The DAE’s integration was performed by adopting the DASOLV routine of gPROMS. Synthetic “experimental” data are obtained by simulation of model 16 with θtrue ) [0.310, 0.180, 0.550, 0.050]T as the “true” parameters and by adding normally distributed noise with a mean of zero (the vector of parameter units is [h-1, g/L, dimensionless, h-1]T). Two distinct M × M measurement covariance matrices Σy of the simulated measurements error are considered:
ΣA )
[
0.01 0 0 0.05
]
ΣB )
[
0.1 0 0 0.5
]
(17)
In other words, two degrees of measurement accuracy are assumed: matrix ΣA refers to a case where the experimental measurement devices are known to be reliable and accurate and matrix ΣB assumes that the experimental equipment delivers less good quality data. Matrix Σy is assumed to be diagonal, i.e., no dependence among different measurements is taken into account. Furthermore, we will start from two different initial guesses for the parameters, as this may affect the quality of the design method:18
θA ) [0.357 0.153 0.633 0.043 ]Τ θB ) [0.527 0.054 0.935 0.015 ]T (18) Set θA corresponds to a starting point (perhaps obtained from previous experiments or from literature), which is quite close (within 15%) to the true values for the parameters; set θB represents the availability of a less accurate (within 70% of the true value) initial guess of the parameter values. Note, however, that in general the “designer” does not know whether his/her initial guess is good or not: the initial guess is simply the best available knowledge before setting up the new experiments. The above covariance matrices of the measurement errors and initial guesses for the parameters values identify two limit conditions used to assess the experiment design. Instance A represents a rather ideal situation (reliable initial guess and precise measurements); instance B assumes that the initial guess for the parameter values is less good and that measurements are imprecise (typically, a more realistic situation). 5. Proposed Experimental Designs and Results Several experimental design approaches are compared assuming that we wish to design the same number of new experiments. The following designs are implemented: (1) D1 ) sequential experiment design (E-optimality), two experiments and one experimental device (the experiment structure T is defined as N ) 2; NR ) 1; C ) [1 1]T); (2) D2 ) parallel experiment design (E-optimality), two experiments and two experimental devices (T is defined as N ) 2; NR ) 2; C ) [2]); (3) D3 ) sequential experiment design (SV-optimality: first the largest eigenvalue, λ1, is minimized, then λ2), two experiments and one experimental device (T is defined as N ) 2; NR ) 1; C ) [1 1]T); (4) D4 ) parallel experiment design (SV-optimality), two experiments and two experimental devices (T is defined as N ) 2; NR ) 2; C ) [2]); (5) D5 ) sequential experiment design (D-optimality), two experiments and one experimental device (the experiment structure T is defined as N ) 2; NR ) 1; C ) [1 1]T); (6) D6 ) parallel experiment design (D-optimality), two experiments and two experimental devices (T is defined as N ) 2; NR ) 2; C ) [2]). Each design is applied to each of the two instances, A and B, previously described. The D-optimality criterion (designs D5 and D6) is tested as this is also a very common choice for experimental design. It should be noted that, in a sequential design (D1, D3, and D5), the model parameters are reestimated after the first experiment and this information is used to design the second experiment. Previous designs also determine the constant matrix K, according to formulation 10, for the prior information matrix. In all instances, the quality of the designs after two experiments is evaluated in terms of the statistical quality of the model
Ind. Eng. Chem. Res., Vol. 46, No. 3, 2007 877
Figure 5. Instance A. Representation of control vectors φ for designs D1 (a), D2 (b), D3 (c), D4 (d), D5 (e), and D6 (f). The initial conditions x0 (not shown in the plot) are 1.08 and 8.45 g/L for x1 and x2 in design D1, 1.85 and 1.30 g/L for x1 and x2 in design D2, 1.08 and 4.00 g/L for x1 and x2 in design D3, 1.00 and 9.98 g/L for x1 and x2 in design D4, 1.04 and 5.00 g/L for x1 and x2 in design D5, and 1.01 and 9.00 g/L for x1 and x2 in design D6, respectively. Table 1. Comparison of Sequential and Parallel Approaches for Model-based Experiment Design for Instance A (Two Experiments)a design
param. estimate
conf. interval (95%)
t-value (tref ) 1.75)
χ2 (χ2ref ) 26.30)
D1 D2 D3 D4 D5 D6
θ ) [0.312, 0.189, 0.555, 0.051]T θ ) [0.298, 0.191, 0.507, 0.040]T θ ) [0.310, 0.190, 0.547, 0.048]T θ ) [0.310, 0.217, 0.540, 0.047]T θ ) [0.311, 0.200, 0.550, 0.050]T θ ) [0.308, 0.184, 0.544, 0.048]T
[(0.0114, (0.0427, (0.0285, (0.0094]T [(0.0131, (0.0810, (0.0393, (0.0104]T [(0.0092, (0.0691, (0.0146, (0.0042]T [(0.0109, (0.0934, (0.0167, (0.0050]T [(0.0143, (0.106, (0.0278, (0.0087]T [(0.0111, (0.235, (0.0326, (0.010]T
[27.40, 4.43, 19.47, 5.46]T [22.80, 2.35, 12.92, 3.88]T [28.43, 3.14, 24.60, 4.82]T [28.49, 2.32, 32.30, 9.44]T [21.74, 1.886, 19.78, 5.80]T [27.8, 0.78*, 16.66, 4.78]T
18.39 17.18 22.36 19.35 8.95 10.93
a
Superscript asterisk (*) indicates t-values failing the t-test.
and the parameters obtained once the data collected have been used for parameters fitting. In particular, (1) χ2 value. This gives a statistical evaluation of the fit of the data derived from the designed experiments (compared to a χref2 based on a Student distribution). It should be noted that the χ2 values for the different cases cannot strictly be compared to each other, since each one represents the capability of the model to fit the data collected from the experiments of that specific design, which are different from case to case. (2) t-value statistics for each parameter. This represents the estimation confidence for that parameter. For a set of experi-
ments to produce a reliable parameter estimation, its t-value must be greater than a computed reference value derived from a Student distribution (t-test). It is noted that, in these “simulated” experiments, the different designs could be assessed by comparing the estimated parameter values to the true ones. However, in “real life”, this test is not possible since the true values are, of course, unknown. 5.1. Instance AsGood Initial Parameter Guess and Precise Measurements. Figure 5 illustrates the profiles of the controls in the optimal experiments calculated in the six proposed design cases. The plots show that the design vector
878
Ind. Eng. Chem. Res., Vol. 46, No. 3, 2007
Figure 6. Instance B. Representation of control vectors φ for designs D1 (a), D2 (b), D3 (c), D4 (d), D5 (e), and D6 (f). The initial conditions x0 (not shown in the plot) are 1.08 and 8.45 g/L for x1 and x2 in design D1, 1.85 and 1.30 g/L for x1 and x2 in design D2, 1.08 and 4.00 g/L for x1 and x2 in design D3, 1.00 and 9.98 g/L for x1 and x2 in design D4, 1.04 and 10.00 g/L for x1 and x2 in design D5, and 1.00 and 4.60 g/L for x1 and x2 in design D6, respectively. Table 2. Comparison of Sequential and Parallel Approaches for Model-Based Experiment Design for Instance B (Two Experiments)a design
param. estimate
D1 D2 D3 D4 D5 D6
θ ) [0.267, 0.400, 0.447, θ ) [0.317, 0.950, 0.518, 0.028]T θ ) [0.314, 0.243, 0.547, 0.049]T θ ) [0.318, 0.372, 0.547, 0.048]T θ ) [0.295, 0.149, 0.524, 0.037]T θ ) [0.274, 0.072, 0.452, 0.018]T
a
t-value (tref ) 1.75)
conf. interval (95%) 0.001]T
(0.0793]T
[(0.0775, (0.4417, (0.1689, [(0.0462, (1.4000, (0.0997, (0.0402]T [(0.0163, (0.2120, (0.0283, (0.0091]T [(0.0195, (0.3085, (0.0329, (0.0105]T [(0.0425, (0.1298, (0.0875, (0.0040]T [(0.0317, (0.6257, (0.0853, (0.028]T
0.90*,
0.13*]T
[3.44, 2.65, [6.87, 0.68*, 5.19, 0.71*]T [19.23, 1.147*, 19.36, 5.39]T [16.34, 1.21*, 16.63, 4.55]T [6.94, 1.15*, 5.99, 0.92*]T [8.64, 0.12*, 5.30, 0.64*]T
χ2 (χ2ref ) 26.30) 13.03 19.22 20.26 22.14 2.04 5.13
Superscript asterisk (*) indicates t-values failing the t-test.
depends significantly on the experiment structure (sequential vs parallel) as well as on the optimization criterion used. For sequential designs D1 and D3, the first experiment is exactly the same since it is obtained by solving the same optimization problem. The difference in the information content is given by the second experiment, which is defined by taking into account two different eigenvalues. On the other hand, parallel designs D2 and D4 do not have anything in common since they are obtained by solving totally different optimization problems (note that, quite obviously, one of the experiments in design D4 is
identical to the first experiment in designs D1 and D3). As expected, designs D5 and D6 propose experiments that are different from any of the previous ones. The results, in terms of the a posteriori statistics obtained after the optimally designed experiments were executed and the model parameters were reestimated with the new data, are summarized in Table 1. All designs except D6 provide statistically sound results (t-values are above the reference threshold). Note that, from this point of view, parallel design is a sensible alternative to
Ind. Eng. Chem. Res., Vol. 46, No. 3, 2007 879
Figure 7. Instance B. Perturbation profiles for the dilution factor, u1 (left), and the substrate concentration in the feed, u2 (right). Table 3. Comparison of Sequential and Parallel Approaches for Model-Based Experiment Design for Instance B in Terms of Relative and Absolute Error with Respect to the True Output Values (Index 1 Refers to Biomass Concentration x1, while Index 2 Refers to the Substrate Concentration x2) design
r,1
r,2
1 (g/L)
2 (g/L)
D1 D2 D3 D4 D5 D6
0.1491 0.0728 0.0075 0.0197 0.0387 0.0661
0.7287 7.2696 0.2782 1.1982 0.2086 0.6097
1.0192 0.5115 0.0537 0.1397 0.2758 0.4329
0.7959 0.9595 0.1060 0.3361 0.2779 0.9654
save elapsed experimental time since the experimental session requires half the time from the start of the first experiment as either D1 or D3 (but, of course, double the equipment is needed). One drawback of design D2 is that, as previously stated, it requires the solution of a larger optimization problem (48
variables), and therefore, it may be more affected by numerical convergence issues and, more importantly, by a larger number of local minima. This is not an issue in design D4, which produces an estimation as accurate as both designs D1 and D3. We may note that both D3 and D4 produce a more confident estimation of parameter θ3, hinting that some of the information content related to that parameter belongs to a different direction in the variance-covariance matrix. On the other hand, it seems than D1 gives a more reliable estimation of parameter θ2. If the estimated values are compared to the true ones, it can be seen that both D1 and D3 provide an accurate estimation of all parameters, whereas D2 is less effective in estimating θ3 and θ4 and D4 does not produce a good estimation of θ2. Designs D5 and D6 seem less reliable with respect to the estimation of parameter θ2 (D6 fails the t-test). As regards the estimation of the other three parameters, D5 performs very similarly to D1,
880
Ind. Eng. Chem. Res., Vol. 46, No. 3, 2007
Figure 8. Instance B. Confidence ellipsoids with respect to parameters θ2 and θ3 after designs D1 (top) and D3 (bottom). Symbol (/) indicates the true parameters’ values; the parameters’ estimated values after the first (EV1) and second (EV2) design are shown by symbols (b) and (0), respectively.
Table 4. Comparison of Approaches for Model-Based Experiment Aiming at Minimizing Eigenvalues λ1 and λ2, Respectively, for Instances A and B (One Experiment)a design
param. estimate
conf. interval (95%)
t-value (tref ) 1.75)
χ2
Dλ1-A Dλ2-A Dλ1-B Dλ2-B
θ ) [0.305, 0.293, 0.510, 0.035]T θ ) [0.301, 0.118, 0.541, 0.048]T θ ) [0.288, 0.427, 0.485, 0.018]T θ ) [0.330, 0.680, 0.551, 0.050]T
[(0.0250, (0.1691, (0.0669, (0.0211]T [(0.0206, (0.1672, (0.0204, (0.0060]T [(0.1123, (0.5760, (0.2403, (0.1108]T [(0.0520, (1.2780, (0.0558, (0.0145]T
[12.2, 1.73*, 7.62, 1.68*]T [14.62, 0.70*, 26.49, 7.95]T [2.56, 0.74*, 2.02, 0.16*]T [6.34, 0.53*, 9.87, 3.40]T
18.14 (χ2ref ) 12.59) 4.14 (χ2ref ) 12.59) 13.14 (χ2ref ) 12.59) 4.14 (χ2ref ) 12.59)
a
Superscript asterisk (*) indicates t-values failing the t-test.
Table 5. Comparison of Sequential and Parallel Approaches for Model-Based Experiment Design for Instance B (Three Experiments) design
param. estimate
conf. interval (95%)
t-value (tref ) 1.75)
χ2
D7
θ ) [0.310, 0.193, 0.543, 0.047]T
[(0.0122, (0.0591, (0.0279, (0.0098]T
[25.36, 3.26, 19.45, 4.86]T
24.30 (χ2ref ) 12.59)
while D6 overall appears to be a better design than D2 (but worse than D4). 5.2. Instance BsBad Initial Parameter Guess and Noisy Measurements. Figure 6 illustrates the optimization results for the proposed vector designs. The plots show the designs obtained by adopting the same six experiment structures and/or optimality criteria used in Instance A. The comments given for Instance A are valid also in this case. As can be seen from Table 2, no design is capable of providing a full set of reliable parameters with just two experiments (D2 and D6 produce a particularly bad θ2 estima-
tion) and more experiments are needed. In this case, Doptimality (D5 and D6) is more effective than E-optimality (D1 and D2), but it is SV-optimality that appears to lead to significantly better results. Both designs D3 and D4 are performing sensibly better and provide a statistically sound estimation of three parameters. What is surprising is the poor performance of sequential designs D1 and D5. In fact, their estimation capability is comparable to (possibly even worse than) the one obtained after parallel designs D2 and D6. This seems to be related to both initial parameter uncertainty and measurements noise.24 In such cases, a number of experiments
Ind. Eng. Chem. Res., Vol. 46, No. 3, 2007 881
is simply wasted to “remove” noise and obtain a more reliable initial estimation. That is why, in such cases, parallel design performs similarly to sequential design: in fact, at least at the beginning, the optimization procedure cannot take advantage of the data obtained from the first design. The SV-optimality approach seems to be able to provide more informative experiments: exploiting the orthogonal information related to λ2 is more important than using the information from the first experiment to design the second experiment. From a comparison with the true parameter values, it can be observed in all designs that the accuracy of estimate of parameter θ2 is quite poor (particularly, after design D2); designs D1 and D6 are rather unsatisfactory with respect to parameter θ4. In order to further assess the effect of the various designs on the parameter precision obtained, simulations were carried out comparing the model response with the parameters calculated after each design of Instance B with the “real” process behavior (i.e., the model with the true parameter values). A set of 12 perturbation profiles of the two inputs to the model, i.e., the dilution factor, u1, and the substrate concentration in the feed, u2, were implemented (Figure 7). Such feed variations represent a “random” sequence within the range allowed for the two control variables and do not resemble any of the designs used to estimate the parameters values. The initial biomass concentration x01 is set equal to 5.0 g/L. Table 3 shows the average relative model error r and the average model error with respect to the true output values for all the NT disturbances tests, computed as
r,i )
x
i )
x
1
NT
∑∫
( ) yi,j - yˆ i,j
2
dt
(19)
∑ ∫(yi,j - yˆ i,j)2 dt
(20)
τ × NT j)1
yi,j
and
1
NT
τ × NT j)1
where τ is the simulation time. In general, design D3 seems to be the best approach to obtain reliable parameters, although in terms of the relative error with respect to the substrate concentration x2, it rates second after design D5. However, it can be observed that, even in such a case, the absolute error is sensibly smaller in D3. The reason for a larger relative error is mainly related to the fact that the substrate concentration can get close to zero, therefore affecting the reliability of the relative error. Overall, the data in Table 3 confirm the statistics of Table 2. Sequential design D1 performs rather poorly (the sequential design D5 based on a D-optimal criterion is a better choice). The are no significant differences between parallel designs D2 and D6. However, it should be noted that parallel design D4 produces rather good results. On the basis of this example, it appears as a reasonable choice if parallel equipment is available. 6. Some Comments on Design Directionality The reported results indicate that the SV-optimality criterion can represent a better approach to generate informative data when a good initial guess for the parameter values is not available and there are noisy measurements. It is interesting to note that, in the control field, it has already been pointed out25 that a reliable identification of ill-conditioned systems may require specific optimal design techniques26 to focus on the
smaller eigenvalues after the singular value decomposition (SVD) of the process transfer matrix: in fact, robust control stability is very sensitive to small singular values. The context here is quite different, but it seems that noisy measurements and parameters uncertainty do mask the information directionality in matrix Vθ, so that valuable information content is linked to the smaller eigenvalues. Furthermore, and most importantly, it has been noted that the design choice can affect the capability of estimating some specific parameters. For example, let us consider designs D1 and D3. Figure 8 shows the confidence ellipsoids after the first and second (sequential) experiments, plotted with respect to parameters θ2 and θ3. It can be seen that the largest axis of the ellipsoid (largest eigenvalue) is more related to the uncertainty concerning parameter θ3. On the other hand, the second axis is more related to the uncertainty in the value of θ2. It is also worth noting that, in this case, minimizing the second eigenvalues is beneficial to improve the accuracy with respect to θ3, too, because of the resulting rotation of the ellipsoid. This aspect can be further explored by analyzing the design of single experiments. Let us assume to design an experiment aimed at minimizing λ1 (design Dλ1) and, from the same starting condition, an experiment aimed at minimizing λ2 (design Dλ2). Both designs are performed first with the covariance matrix of the simulated measurements and the initial guess for the parameter values as in Instance A, and then as in Instance B. Statistics are summarized in Table 4. Results are quite significant. In fact, independently of the measurement noise and the initial guess of the parameter value, from the parameter confidence intervals and t-values it appears that parameter θ2 is linked to the information content of eigenvalue λ1; on the other hand, parameters θ3 and θ4 are related to the information content within eigenvalue λ2. This suggests that SV-optimality can be a better criterion for exploiting the information content related to all the model parameters (from this point for view, even D-optimality could be a better approach than E-optimality, since the information related to all the eigenvalues is optimized, although not in a selective way). Furthermore, such an approach is very effective when parallel experiments are possible. Although the number of independent designs is limited by the number of eigenvalues (i.e., of the model parameters) and, in general, it may not be worth minimizing the smaller eigenvalues since they could just represent the system noise, the procedure is demonstrated to be nearly as effective as a sequential approach. Furthermore, this allows the solution of a considerably smaller optimization problem than that with a full parallel design with the overall objective function (formulation 11). In view of the above results, it seems reasonable to design experiments by considering the best experimental sequence. More complicated designs can take advantage of the information structure within matrix Vθ. For example, results in Table 2 show that designs D3 and D4 are very effective in estimating all parameters except θ2. Since θ2 is related to the information content within λ1, it seems very reasonable to run one additional experiment aimed at minimizing such an eigenvalue. The resulting design D7 has a parallel-sequential experimental design structure, according to which two parallel experiments (designed following the SV-optimality criterion) are followed by a single experiment (designed following the E-optimality criterion); i.e., T is defined as N ) 3; NR ) 2; and C ) [2 1]T. Results are summarized in Table 5. As expected, one further experiment designed to maximize the information content related to θ2 is sufficient to estimate the parameter in an accurate way. It may be observed that, in terms of elapsed time, design D7 is
882
Ind. Eng. Chem. Res., Vol. 46, No. 3, 2007
equivalent to D1, since the first two experiments are run simultaneously. 7. Final Remarks Two formulations have been proposed that are suitable for the model-based design of parallel experiments. The first is based on optimizing the overall (expected) information content from all experiments, using the design vector of each experiment as the manipulated variables within a single optimization problem. This can be achieved, albeit at the expense of a larger optimization problem. It may well be possible to reduce the solution effort by exploiting structural characteristics of the problem, but this was not explored here. A second novel procedure, based on the decomposition of the information matrix, has been suggested, which is particularly suitable to the design of parallel experiments. This decomposition is also applicable to the model-based design of sequential experiments. The procedure minimizes (individually and separately) the largest eigenvalues of the variance-covariance matrix Vθ of the parameters, making it possible to exploit orthogonal information contributions to the information matrix. Although the approach proved to be also effective for the sequential design of experiments, the significant advantage is that it allows the simultaneous design of parallel experiments, each aimed at delivering a complementary piece of information. The optimization problem is greatly reduced, since the methodology does not require the maximization of the overall information content of a variance-covariance matrix comprising the whole set of parallel experiments. Each experiment is designed by maximizing the information related to a specific direction of the information matrix. The methodology discussed here is a first attempt to incorporate sequential/parallel experiment structures within a model-based experiment design framework. Results on a small (but not easy) typical illustrative application show that, by using model-based experiment design in combination with an SVoptimality criterion, it is possible to design multiple experiments, both in sequence and parallel, that indeed lead to complementary information content and improved estimation of specific parameters. Future work should aim at (1) assessing the performance of the methods on larger and more complicated applications and (2) developing a systematic procedure to help determine the best approach to model-based experimental design in order to include a topological design variable. Acknowledgment The authors gratefully acknowledge partial financial support for this work by the Italian Ministero dell’Universita` e Ricerca Scientifica (MIUR), Contract B000MIUR03, in particular for a visiting professorship for S.M. at the Universita` di Padova. Literature Cited (1) Atkinson, A. C.; Donev, A. N. Optimum Experiment Designs; Oxford University Press: Oxford, U.K., 1992. (2) Lazic, Z. R. Design of Experiments in Chemical Engineering; WileyVCH: Weinheim, Germany, 2004. (3) Box, G. E. P.; Lucas, H. L. Design of Experiments in Non-Linear Situations. Biometrika 1959, 46, 77-90.
(4) Heineken, F. G.; Tsuchiya, H. N.; Aris, R. On the Mathematical Status of the Pseudo-Steady State Hypothesis of Biochemical Kinetics. Math. Biosci. 1967, 1, 95-113. (5) Shirt, R. W.; Harris, T. J.; Bacon, D. W. Experimental Design Considerations for Dynamic Systems. Ind. Eng. Chem. Res. 1994, 33, 26562667. (6) Baltes, M.; Schneider, R.; Sturm, C.; Reuss, M. Optimal Experiment Design for Parameter Estimation in Unstructured Growth Models. Biotechnol. Prog. 1994, 10, 480-488. (7) Rudolph, P. E.; Herrendo¨rfer, G. Optimal Experiment Design and Accuracy of Parameter Estimation for Nonlinear Regression Models Used in Long Term Selection. Biometrical J. 1995, 37, 183-190. (8) Espie, D.; Macchietto, S. The Optimal Design of Dynamic Experiments. AIChE J. 1989, 35, 223-229. (9) Zullo, L. Computer aided design of experiments. An engineering approach. Ph.D. Thesis, The University of London, U.K., 1991. (10) Asprey, S. P.; Macchietto, S. Statistical Tools for Optimal Dynamic Model Building. Comput. Chem. Eng. 2000, 24, 1261-1267. (11) Bard, Y. Nonlinear parameter estimation; Academic Press, Inc.: New York, 1974. (12) Chen, B. H.; Bermingham, S.; Neumann, A. H.; Kramer, H. J. M.; Asprey, S. P. On the Design of Optimally Informative Experiments for Dynamic Crystallization Process Modeling. Ind. Eng. Chem. Res. 2004, 43, 4889-4902. (13) Sidoli, F. R.; Manthalaris, A.; Asprey, S. P. Towards Global Parametric Estimability of a Large-Scale Kinetic Single-Cell Model for Mammalian Cell Cultures. Ind. Eng. Chem. Res. 2005, 44, 868-878. (14) Franceschini, G.; Macchietto, S. A Numerical Experiment Design Study on a Biodiesel Production Process. In ESCAPE-15; Puigjaner, L., Espuna, A., Eds.; CACE Series 20A; Elsevier: Amsterdam, The Netherlands, 2005; pp 349-354. (15) Bauer, I.; Bock, H. G.; Ko¨rkel, S.; Schlo¨der, J. P. Numerical Methods for Optimum Experimental Design in DAE Systems. J. Comput. Appl. Math. 2000, 120, 1-25. (16) Leineweber, D. B.; Bauer, I.; Bock, H. G.; Schlo¨der, J. P. An Efficient Multiple Shooting Based Reduced SQP Strategy for Large-Scale Dynamic Process Optimization. Part 1: Theoretical Aspects. Comput. Chem. Eng. 2003, 27, 157-166. (17) Grossmann, I. E. Review of Nonlinear Mixed-Integer and Disjunctive Programming Techniques. Optim. Eng. 2002, 3, 227-252. (18) Asprey, S. P.; Macchietto, S. Designing Robust Optimal Dynamic Experiments. J. Process Control 2002, 12, 545-556. (19) Pukelsheim, F. Optimal Design of Experiments; J. Wiley & Sons: New York, 1993. (20) Keifer, J.; Wolfowitz, J. Optimum Designs in Regression Problems. Ann. Math. Stat. 1959, 30, 271-294. (21) Chen, B. H.; Asprey, S. P. On the Design of Optimally Informative Dynamic Experiments for Model Discrimination in Multiresponse Nonlinear Situations. Ind. Eng. Chem. Res. 2003, 42, 1379-1390. (22) Vassiliadis, V. S.; Sargent, R. W. H.; Pantelides, C. C. Solution of a Class of Multistage Dynamic Optimization Problems. 1. Problems with Path Constraints. Ind. Eng. Chem. Res. 1994, 33, 2111-2122. (23) gPROMS AdVanced User Guide, release 2.3; Process Systems Enterprise Ltd.: London, U.K., 2004. (24) Galvanin, F.; Barolo, M.; Bezzo, F.; Macchietto, S. A Framework for Model-Based Design of Parallel Experiments in Dynamic Systems. In Computer-Aided Chemical Engineering 21A/B 16th European Symp. on Computer Aided Process Engineering and 9th International Symp. on Process Systems Engineering; Marquardt, W., Pantelides, C., Eds.; Elsevier: Amsterdam, The Netherlands; 2006, pp 249-254. (25) Misra, P.; Nikolaou, M. Input Design for Model Order Determination in Subspace Identification. AIChE J. 2003, 49, 2124-2132. (26) Bruwer, M. J.; MacGregor, J. F. Robust Multi-Variable Identification: Optimal Experimental Design with Constraints. J. Process Control 2006, 16, 581-600.
ReceiVed for reView August 30, 2006 ReVised manuscript receiVed November 2, 2006 Accepted November 10, 2006 IE0611406