PCA Combined Model-Based Design of Experiments (DOE) Criteria

Ljung , L. System Identification-Theory For the User, 2nd ed.; Prentice Hall: ..... Matías L. Picchio , Silfredo J. Bohórquez , Peter G. C. A. van d...
0 downloads 0 Views 1MB Size
7772

Ind. Eng. Chem. Res. 2008, 47, 7772–7783

PCA Combined Model-Based Design of Experiments (DOE) Criteria for Differential and Algebraic System Parameter Estimation Yang Zhang and Thomas F. Edgar* Department of Chemical Engineering, The UniVersity of Texas at Austin, Austin, Texas 78712

Design of experiments (DOE) for parameter estimation in dynamic systems is receiving more attention from process system engineers. In this paper, a principal component analysis (PCA)-based optimal criterion (Poptimal) for model-based DOE is proposed that combines PCA with information matrix analysis. The P-optimal criterion is a general form that encompasses most widely used optimal design criteria such as D-, E-, and SV-optimal, and it can automatically choose the optimal objective function (criterion) to use for a specific differential and algebraic (DAE) system. Two engineering examples are used to validate the algorithms and assumptions. The advantages of P-optimal DOE include ease of reducing the scale of the optimization process by choosing parameter subsets to increase estimation accuracy of specific parameters and avoid an ill-conditioned information matrix. 1. Introduction As detailed mathematical models are highly desirable in process control, design, monitoring, and optimization areas, modeling is becoming more and more important for engineers. However, a reliable model for real systems under certain conditions (operating constraints) is usually hard to develop, and the procedure is not trivial. Asprey and Macchietto proposed a general modeling procedure:1 (i) Preliminary system analysis and model building based on process information. (ii) Design of optimal experiments according to model structure. (iii) Carrying out of experiments as designed. (iv) Use of experimental data to estimate model parameters and perform model validation by analyzing estimated parameters and available data. By analyzing the procedure, one can see experimental data are playing a key role not only for parameter regression but also for model validation. Thus, how to design experiments to get good parameter estimation results with a limited experimental effort is an important issue. Design of experiments (DOE) is a statistical tool to choose the optimal experimental conditions for deterministic models. Following the pioneering work of Box and Lucas2 and Atkinson and Hunter,3 the advantages from the use of DOE are well-known, with applications to many areas, for example, linear discrete time model frequency domain analysis,4,5 adsorption kinetics,6 reaction kinetics,7-9 biological networks,10,11 and fermentation processes.1 From an algorithm development perspective, DOE has been utilized in models based on algebraic equations for a long time and was applied to differential and algebraic (DAE) systems in the early 1990s.12 Many optimal design criteria have been proposed (D-optimal, E-optimal, etc.) and compared by different case studies; Walter and Pronzato13 gave a detailed discussion of available optimal design criteria and their geometrical interpretations. Recently, Atkinson14 used DOE for nonconstant measurement variance cases and Galvanin et al.15 extended the DOE territory to parallel experiment designs. Despite its many applications and useful results, DOE still has some shortcomings. DOE may not be applicable to large/ * To whom correspondence should be addressed.

medium scale DAE systems due to numerical difficulties, and calculations usually take a long time to converge. No clear rule set is available in choosing an optimal design criterion for a specific case. More discussion on the advantages and shortcomings of available DOE methods is provided in Section 2. This work focuses on a generalized methodology that introduces principal component analysis (PCA) into DOE. By combining the two methods, one can elegantly divide the large system into small pieces and design a series of experiments to avoid the numerical problems discussed above. Furthermore, the new proposed criterion reduces to well-known optimal design criteria (D-, E-, SV-optimal) under certain assumptions, and one can choose a subset of model parameters to increase estimation accuracy without changing the form of the objective function (Ds-optimal 16). Galvanin et al.15 proposed SV-optimal, which also used singular value decomposition (SVD). However, further analysis of the relationship of SVD results with physical background was not performed. In Section 2, the basic idea and popular optimal design criteria of DOE are discussed, and Section 3 introduces PCA. The new criterion is proposed in Section 4, and case studies are used to compare the performance of P-optimal criterion and traditional ones in Section 5. Section 6 contains conclusions and future work. 2. Design of Experiments For static (algebraic) and dynamic (differential) systems, the Fisher information matrix plays a central role in model-based DOE with the goal to design a series of experiments based on the model structure. By carrying out these experiments, the model parameters can be estimated with the best accuracy.17 For an algebraic system with q equations and m model parameters (fj(x, θk), j ) 1, 2,..., q; k ) 1, 2,..., m), suppose we are going to perform n experiments and the measurements are stored in yi. The corresponding errors between experiments and model predictions are yji - fj(xi, θ) which follow N(0, Vj). After the n experiments are performed, the posterior covariance of the error (parameter covariance matrix) is1,12 V(θ, φ) )

[

q

q

∑∑V

-1

m,rs

r)1 s)1

10.1021/ie071206c CCC: $40.75  2008 American Chemical Society Published on Web 09/24/2008

]

JrJs + V0-1

-1

(1)

Ind. Eng. Chem. Res., Vol. 47, No. 20, 2008 7773

where Jr ) ∂fr/∂θ evaluated at n experimental points. φ is the design vector which typically contains the sampling time, initial reactant concentration, control inputs, and so forth. Usually the “real” parameter value (θ) is not available, so the estimated (or initial guessed) value (θˆ ) is used in eq 1 and further analysis. Vm,rs is the rth term in the measurement covariance matrix that can be estimated by n

i)1

(2)

n-1

When prior information V0 is not available, the posterior covariance of the error is

[∑ ∑ q

V(θˆ , φ) )

]

q

r)1 s)1

(3)

The inverse of V is the Fisher information matrix for algebraic equations: q

M)

q

∑∑V

rs

-1 T Jr Js

(4)

r)1 s)1

The larger the information matrix, the smaller the posterior error covariance matrix. If there is only one model parameter (J is n × 1), M should be a scalar and so is V. As a result, maximizing the absolute value of M is the only criterion. However, usually M is an m × m matrix (m is the number of parameters in the model), thus there are several ways to define the minimization criteria: D-optimality: F ) min(|V|)

(5)

φ∈Φ

Minimize the determinant of the prediction error covariance matrix and thus the volume of the parameter estimation covariance matrix. E-optimality: F ) min(a1),

a1 g a2 g ... g am

φ∈Φ

(6)

where a1 is the largest eigenvalue of V which is equivalent to reducing the maximum diameter of the covariance ellipsoid for θ.

∑∑V

-1 T Jr Js

rs

[ ]

(8)

Jr contains the sensitivity coefficient of output yr with respect to parameter θˆ evaluated at different sampling times (ts): ∂yr ∂θˆ

∂yr ∂θˆ

...

2

∂yr r t1 ∂θˆ m

∂yr ∂yr ∂yr r t2 · · · Jr ) ∂θˆ 1 ∂θˆ 2 ∂θˆ m l ∂yr ∂θˆ

1

-1

Vm,rs-1JrTJs

q

r)1 s)1

1

∑ (y

ri - fr(xi, θ))(ysi - fs(xi, θ))

Vm,rs )

q

M)

(9)

· ·· l l ∂yr ∂yr · · · ˆ r tn ∂θˆ 2 ∂θm

As discussed in Section 1, model-based DOE has seen numerous applications in the past few decades, however, there are still some drawbacks that require further study: 1. DOE sometimes fails to find out the optimal experimental condition for medium and large scale DAE systems, and it usually takes a long time even for small scale systems. Franceschini and Macchietto8 applied E-optimal to a biodiesel production dynamic model identification problem. According to their research, designing a single set of experiments aimed at improving the estimation of all parameters together failed as the experimental design calculations did not converge. It is reasonable to design different experiments for a specific group of parameters. However, there is no trivial/automatic way to separate the model parameters into different groups. 2. There are many criteria to choose from (D-optimal; E-optimal; A-optimal; Ds-optimal, etc.), and there is no clear way to tell which one to use for a specific case. 3. All criteria depend on minimizing the prediction error variance-covariance matrix V in some sense. Because V is calculated from M, M may not be full rank or it may be illconditioned (condition number is greater than 1010). In this case M may not be invertible, and V cannot be numerically calculated. As a result, working on M instead of V is suggested. 4. It is hard to focus on improving specific parameters in the DAE system. Ds-optimal suggested above leads to a different form of the objective function and a more complex optimization problem. The different magnitudes of |V| and |Vs| can lead to scaling problems. 3. Principal Component Analysis

Ds-optimality: F ) min φ∈Φ

( ) |V| |Vs|

(7)

PCA involves decomposing the data matrix from experiments and is defined by the following expression: X ) T · PT + E

where

(∑ ∑ q

Vs )

q

r)1 s)1

)

-1

T Vm,rs-1Js,e Js,e

Vs is the parameter covariance matrix for the parameters of less interest, and the criterion minimizes the determinant of the covariance matrix of the important parameters. Vs and V are e × e and m × m matrices, respectively. The only difference between V and Vs is that Jr,e is composed of the parameters that are of less interest (θr). For dynamic system DOE, the parameter covariance matrix can be treated as a sequential experimental design according to Zullo.18 By assuming a piecewise constant control signal, the information matrix can be defined similar to eq 4.

(10)

X is an n × m data matrix, which can be decomposed into score (T, n × l) and loading (P, m × l) matrices, while E is the residual. PCA has very attractive features including the following. 1. ti (the ith column of T) are orthogonal: A)

TT · T ) diag(a1, a2, ..., am) n-1

(11)

where ai is the ith eigenvalue of the covariance matrix (XTX/ (n - 1)) in descending order. ti explains the relationship between each row. 2. pi (the ith column of P) are orthonormal: I ) PT · P ) diag(1, 1, ..., 1)

(12)

7774 Ind. Eng. Chem. Res., Vol. 47, No. 20, 2008

pi focuses on the relationship between each column. Because XTX is symmetric, its eigenvalue and eigenvectors are real. The first few columns of T and P capture most of the variance existing in X, and when l (the number of principal components) ) min(n, m), E is a zero matrix. There have been some studies on choosing the optimal number of principal components (q) to separate useful information and noise in process monitoring.19,20 The cumulative percent variance (CPV) is one such method, and the threshold for this method can be set to 90%.

() k

∑a

j

CPV(k) )

j)1 m

(13)

∑a

j

j)1

The relationship between PCA and SVD can be explained by manipulating the equations for PCA (eqs 14b and 14c) and comparing them with the SVD equation (eq 14a). SVD : PCA :

1 XTX ) U · B · QT n-1

(14a)

1 1 1 XTX ) (TPT)T · (TPT) ) P · TT · T · PT n-1 n-1 n-1 (14b)

1 1 XTX) P · (TT · T) · PT ) P · A · PT (14c) n-1 n-1 XTX is a real symmetric matrix such that the left eigenvector (U(m × m)) and right eigenvector matrix (Q(m × m)) are exactly the same and the corresponding eigenvalues are stored in B. As a result, P)Q A)B 4. PCA Combined Criterion for DOE 4.1. Information Matrix and Parameter Covariance Matrix. For the purpose of simplicity, assume there is only one measured output (q ) 1) and the measurement error is identity (Vm,rs ) 1), such that eq 3 becomes (15) V(θˆ , φ) ) [JTJ]-1 ) [M]-1 The sensitivity matrix J can be viewed as X in the above PCA equations, and M is proportional to the covariance matrix V (the scaling factor (1/(n - 1)) in the covariance is contained in Vrs). Assume the eigenvalue and eigenvector matrices of M are Λ and P, respectively. Substituting eqs 10 and 14c into eq 15, we obtain M ) [J J] ) [(T · P ) · (T · P )] ) [P · Λ · P ] V(θˆ , φ) ) [M]-1 ) [P · Λ · PT]-1 ) P-T · Λ-1 · P-1(16) T

T T

T

T

-1

Because P is orthonormal, P · P ) I and P P-1. T

4.2. Geometric Interpretation. A geometric interpretation of DOE criteria is shown in Figure 1, which shows the covariance matrix of a two parameter system. Two eigenvectors p2, p1 indicate the direction of the largest and second largest direction of variance. The projection of the long axis (p2 direction) on θ1 and short axis (p1 direction) on θ2 is proportional to the confidence region of θ1 and θ2, respectively. Thus, to shrink the estimated confidence region, traditionally D-optimal focuses on shrinking the size of the ellipsoid, but it fails when λ1 becomes very large and λ2 is small (the ellipsoid degenerates into a line which also minimizes the size criterion). E-optimal focuses on minimize the longest the axis of the ellipsoid (maximize λ2). However, this method may not be efficient when several axes have a similar length (multiparameter systems). In Figure 1, when λ1 is much larger than λ2, the ellipsoid will degenerate into a line (D-optimal fails) and it is reasonable to look at λ2 alone, which is E-optimal. Instead, when θ2 is wellknown, one can only focus on shrinking the projection of both ellipsoid axes on the θ1 direction: min(|p1(1)/λ1| · |p2(1)/λ2|). To eliminate the absolute value and take advantage of the unit length of pi, we use the following expression:

(

min

)

p1(1)2 p2(1)2 · ) max(λ1 · p1(1)-2 · λ2 · p2(1)-2) λ1 λ2

4.3. PCA based Optimal Criterion (P-Optimal) and Its Variation. From Sections 4.1 and 4.2, it is reasonable to formulate the new objective function as follows: F ) min

m



ai ·

∑P

2

ji

j

))

(18)

where ai are eigenvalues of V in ascending order (ai ) 1/λi and λi is in descending order) and P is the corresponding eigenvector matrix. The advantage of storing eigenvalues of V in ascending order is that P can be used directly without transformation, otherwise P for V needs to be transformed by

[ ] 1

PV ) PM ·

.··

1

l is the number of optimal number of PCs, and j corresponds to the parameters that are selected to increase estimation accuracy. Suppose we want to improve the precision of all parameters (j ) 1:m) and all the PCs are retained by PCA. Then eq 18 becomes

((

m

m

F ) min ∏ ai · φ∈Φ i)1

∑P

2

ji

j)1

)) m

pi orthonormal w

·P ) I w P ) T

(17) V(θˆ , φ) ) P-T · Λ-1 · P-1 ) P · Λ-1 · PT From PCA analysis, to get V from M, one can efficiently decompose M by SVD or NIPALS21 and replace Λ in eq 16 with Λ-1. In Section 3, it has been shown the columns of P explain the relationship between different columns in J. For example, the smallest eigenvalue in M is λm, and its inverse λm-1 will be the largest eigenvalue for V, which indicates the largest variance in the prediction error covariance matrix. The corresponding eigenvector (pm) gives the direction of the largest variance in the m dimensional parameter space.

( (

φ∈Φ i)m-l+1

∑P

2

ji

)1

j)1

(

m

)

F ) min ∏ (ai) ) min(|V|) w D-optimal(19) φ∈Φ i)1

φ∈Φ

When only the largest eigenvalue of V is suggested by PCA (i ) m) when all parameters are to be estimated, eq 18 is

(

F ) min am · φ∈Φ

m

∑P j)1

)

2

jm

m

pi orthonormal w

∑P

2

jm

)1

j)1

F ) min(am) w E-optimal (20) φ∈Φ

Ind. Eng. Chem. Res., Vol. 47, No. 20, 2008 7775 Table 1. P-Optimal Algorithm Description 1

initialize model parameters (θ0) and design vector (φ0) (i ) 0). calculate sensitivity matrix (J) (eq 9) and information matrix (M) (eq 8). choose the number of PCs using CPV (eq 13) and group all parameters into subsets according to P if necessary. optimize the objective function (eq 18) and find the optimal experimental design (φi). If the parameters are divided into groups in step 3, multiple experiments can be designed in parallel (as discussed in ref 15). carry out real experiments and use the data to regress model parameters (θi). calculate t and χ2 values to justify the quality of the estimated parameter (model validation). When the quality is not satisfied, set i ) i + 1 and return to step 2.

2 3 4

5 6 Figure 1. Geometric interpretation of PCA combined DOE criteria.

When all the PCs are to be used with specific parameters to be estimated (e.g., the first s), eq 18 becomes

((

))

s

m

F ) min ∏ ai · φ∈Φ i)1

∑P

2

ji

j)1

(21)

SV-optimal was used for parallel and sequential process DOE by Galvanin et al.,15 which is similar to our approach. After obtaining the eigenvalues of the V matrix (ai), a series of experiments are designed to minimize a1, a2,..., am, respectively. Furthermore, they noticed that by minimizing some eigenvalues, the estimation of certain parameters will improve, but no analysis was provided. The SV-optimal can be treated as a special case in our generalized approach by setting i ) k and j ) 1:m.

(

F ) min ak · φ∈Φ

m

∑P

2 jk

j)1

)

2. The proposed P-optimal criterion is a general form of most widely used criteria such as D- and E-optimality. 3. By introducing PCA to carry out both eigenvalue calculation and selecting the optimal number of eigenvalues to evaluate, the ill-conditioned M is avoided. The PCA method automatically chooses the optimal number of eigenvalues to be investigated, to filter out unnecessary eigenvalues, and reduces problem scale. 4. P gives a clue on grouping the estimated parameters (selecting subset j in eq 17), so it is easy to design an experiment for improving specific parameter estimation, compared with traditional methods. 5. The criteria can be easily embedded in parallel and sequential DOE routines. 5. Case Study

m

pi orthonormal w

∑P

jk

2

)1

j)1

5.1. Yeast Fermentation Reactor Model. Nearly all fermentation processes are based on mass and energy balances. Neglecting heat effects leads to a set of differential equations:

F ) min(ak) w SV-optimal(22)

dc ) G · r(c(t)) - H · c(t) + u(t) dt

φ∈Φ

Comparing eq 22 with eq 18, SV-optimal cannot be used in designing an experiment that increases the estimation accuracy of specific parameters. When calculating l by the method described by eq 13, one can choose the eigenvalues of either M or V. When using V, the last l eigenvalues (kept in ascending order such that P does not need to be transformed) and the corresponding eigenvectors should be used to define the objective function. When using M, if the first k eigenvalues sum to 90%, then the remaining m - k eigenvalues (l ) m - k) and eigenvectors are used in eq 17. In real applications, using M to do PC number selection will prefer D-optimal results while using V will bias the results toward E-optimal. This is because the first few (most of the time one or two) PCs is enough to capture most of the variance for either V or M. Normally DOE tries to increase the estimation precision for most model parameters, and usually a single eigenvalue cannot include information for most parameters (some elements in pi are close to zero), thus retaining more eigenvalues in the objective function for the first few runs is preferred. As a result, in later examples, M is used for calculations and the computational steps for P-optimal algorithm are summarized in Table 1. Generally speaking, the new criterion enjoys the following advantages: 1. For medium and large scale DAE system cases, it is always easy to reduce the scale of the DOE problem by choosing certain parameters out of the entire set to be the focus.

(23)

where G (n × m) contains the stoichiometry information of the m reactions described in r, H (1 × 1) is the dilution rate, and u(n × 1) contains the input/output information. In this example, a fed-batch model for baker’s yeast is used22

[]

[ ]

([ ] [ ]) [ ]

1 0 x d x ) - 1 · r(θ, x, s) · x + u1 u s dt s 2 θ3

+

-θ4x 0

(24)

x and s are biomass and substrate concentration, respectively, and r is the specific growth rate, which is assumed to be of the Monod-type: r(θ, x, s) )

θ1s θ2 + s

(25)

There are four parameters to be estimated while two of them are kinetic rate coefficients (θ1, θ2) and the other two are yield coefficients (θ3, θ4). The true values are [0.31, 0.18, 0.55, 0.05], and there is no distribution information regarding the parameters; however, there are upper and lower bounds: 0.05 < θ1, θ2, θ3 < 0.98

0.01 < θ4 < 0.98

This classical model has been successfully used to perform D-optimal,1 E-optimal,15 SV-optimal, 15 and mini-max robustoptimal12 experiment design.

7776 Ind. Eng. Chem. Res., Vol. 47, No. 20, 2008

The total batch time is 40 h, and initially four switching points are chosen for both control inputs (dilution factor u1 and substrate concentration in feed u2): tsw ) 5, 13, 21, 29. The design variables include sampling time (ts), initial concentration for biomass (x0) and substrate (s0), and control input magnitude for u1 and u2. Assume both biomass and substrate concentrations can be measured online and the process noise is N(0, 0.04). Ten samples are taken from each batch, and any neighboring two samples should be taken at least 1 h apart and at most 20 h. u1 is bounded by 0.05 and 0.2 h-1 while u2 lies between 5 and 35 g L-1. The initial biomass concentration ranges from 1 to 10 g L-1, and initial substrate concentration is always 0.1 g L-1. All of these constraints can be summarized by following inequalities: 0 e ts,1 · · · ts,10 e 40 h 1 e ts,i+1 - ts,i e 20 h;

i ) 1, 2, ..., 9

0.05 e u1,j e 0.2 h-1 ;

j ) 1, 2, ..., 5

5 e u2,k e 35 g L-1 ; k ) 1, 2, ..., 5 1 e x0 e 10 s0 ) 0.1

The initial design variables are chosen as φ(0): ts ) 2, 6, 10, 14, 18, 22, 26, 30, 34, 38 h u1,j ) 0.12 h-1 ;

u2,k ) 15 g L-1 ; x0 ) 5.5 s0 ) 0.1

j ) 1, 2, ..., 5 k ) 1, 2, ..., 5

Two different initial parameter estimates are used to test the robustness of the optimal design criteria: θI ) [0.5 0.5 0.5 0.5]; θII ) [0.527 0.054 0.935 0.015]. θI represents the case when no a priori information is available so the guess is in the middle of the region. θII is the case when the initial information is inaccurate (up to a 70% error from the true value).15 Case I: θI ) [0.5 0.5 0.5 0.5]. Figure 2 shows the DOE results using different criteria. For the generalized criteria, the initial information matrix (M0) is

[

1450 -189.5 -153.5 -165.3 -189.5 27.8 24.9 216.1 M0 ) -153.5 24.9 29.8 164.7 -165.3 216.1 164.7 1903.5

]

The corresponding eigenvalue and eigenvector matrices are

[

Λ0 ) diag(3384.7 21.0 5.3 0.1 )

-0.6535 -0.3328 -0.6383 0.2340 0.0856 0.2663 -0.5219 -0.8059 P0 ) 0.0673 0.8144 -0.3170 0.4815 0.7491 -0.3939 -0.4688 0.2530

]

According to CPV, the first eigenvalue of M captures 99% of the variance, and the second to fourth eigenvalues (i ) 2, 3,4) need to be included in the objective function if all four parameters are to be estimated (j ) 1, 2, 3, 4). From P we can see the values are more or less equally distributed in each column, which means no eigenvector focuses on increasing the accuracy of a single parameter. As a result, the objective function for P-optimal is

(( 4

F ) min ∏ ai · φ∈Φ i)2

4

∑ j)1

Pji2

)) ( ( 4

) max ∏

φ∈Φ i)2

4



1 / P2 ai j)1 ji

))

(

4

)

) max ∏ (λi) φ∈Φ i)2

(26)

Figure 2. Control inputs and sampling points calculated by different optimal design criteria. Red, control limits; blue, designed control inputs; dash, initial control guesses; square, sampling points. (a) Control inputs designed by D-optimal criterion. (b) Control inputs designed by E-optimal criterion. (c) Control inputs designed by P-optimal criterion.

Figure 2 shows the DOE results according to different optimal design criteria. We note that the three criteria lead to three different experimental approaches. For feed rate u1, E-optimal and P-optimal suggest bang-bang control while D-optimal has more gradual step changes before reaching the lower control limits. For substrate concentration in the feed (u2), the three methods give rather different results. Table 2 lists several

Ind. Eng. Chem. Res., Vol. 47, No. 20, 2008 7777 Table 2. DOE Results by Different Criteria

D-optimal

E-optimal

P-optimal

[

M(φ) 44570 -3906.3 352.9 -3887.3 381.9 5877.0 -4579.0 398.5 3782.5 4724.5

[ [

3572.1 -591.0 100.6 -446.9 69.4 145.7 -3966.8 661.7 402.6 4511.9

4657.4 -564.7 77.4 -578.7 66.9 607.6 -4824.9 596.3 2.8 5674.8

] [ ] [ ] [

Λ (φ)

92378

0 366.0 10.5

0

] [ ] [ ] [

-0.694 0.3179 -0.5892 -0.2641 0.0607 -0.1582 -0.5295 0.8312 0.0591 -0.8485 -0.3525 -0.3903 0.7147 0.3924 -0.4983 -0.2949

1.21

8179.6

0 145.7 2.5

0

2.4

10607

0 888.0 9.672

0

P(φ)

-0.658 -0.3759 -0.4911 0.4283 0.1095 0.0211 -0.7408 -0.6624 0.0747 0.8253 -0.3535 0.4340 0.7405 -0.421 -0.2917 0.4352

1.25

-0.66 0.4121 0.361 0.5062 0.0818 -0.481 0.873 -0.478 0.0415 -0.809 0.2438 0.533 0.7388 0.415 0.2183 0.481

]

] ]

Table 3. Parameter Estimation Results by Carrying Out Designed Experiments t test (tref ) 1.94)

θˆ D-optimal

E-optimal

P-optimal

0.2922 0.1485 0.5340 0.0464 0.2599 0.0883 0.5618 0.0526 0.3070 0.1997 0.5546 0.0516

( ( ( ( ( ( ( ( ( ( ( (

0.0089 0.1693 0.0010 0.001 0.0282 0.1959 0.0055 0.0002 0.0085 0.2447 0.0042 0.0001

true

parameter correlation

32.8 0.877* 53.4 116.0 9.216 0.4507* 102.2 263.0 36.2 0.816* 132.1 516.0

1.0000 0.9860 0.0990 -0.0026 1.0000 0.9721 0.1886 0.0463 1.0000 0.9354 0.3530 0.2700

0.9860 1.0000 -0.0184 -0.1183 0.9721 1.0000 0.0363 -0.1021 0.9354 1.0000 0.0589 -0.0207

-0.0026 -0.1183 0.9190 1.0000 0.0463 -0.1021 0.9467 1.0000 0.2700 -0.0207 0.9636 1.0000

0.0990 -0.0184 1.0000 0.9190 0.1886 0.0363 1.0000 0.9467 0.3530 0.0589 1.0000 0.9636

[0.310 0.180 0.550 0.050 ]T

important values from DOE: M is the information matrix with suggested experimental design and initial parameter guess; Λ and P are the eigenvalue and eigenvector matrices of M, respectively. One can see D-optimal results lead to a much larger Λ(1, 1), and the product of all the eigenvalues is 4.29 × 108 (volume of the parameter covariance matrix is 0.23 × 10-8). E-optimal focuses on increasing the value of the smallest eigenvalue which is 2.4, larger than the other two methods. According to PCA, the generalized method optimizes the product of the last three eigenvalues, which is 1.07 × 104. This is larger than the other two methods (4.65 × 103 for D-optimal; 8.74 × 102 for E-optimal), so in this sense D-optimal is better than E-optimal. In Section 4, the drawbacks of D- and E-optimal methods have been discussed, and to verify these assumptions, Table 3 lists the parameter estimation results by carrying out the corresponding designed experiments. Normal distributed measurement noise is assumed, and the maximum likelihood equation degenerated into a least-squares problem. Standard dynamic model nonlinear parameter estimation is carried out in this paper as described by Bard.17 To be specific, the optimization is carried out by a BFGS algorithm, and numerical integration is carried out by an explicit Runge-Kutta method. A t test23 indicates if the results show a reliable estimation: ˆ θ t ) tR(n - m) σθˆ

(27)

where R is the confidence value (0.95, 0.99); σ{θˆ } is the standard deviation of the estimation which can be calculated by σ{θˆ } )

F(R, n, m) · diag(V1/2); and n and m are the number of individual observations and number of parameters in the model. If the t value is larger than the value on the right-hand side (tR(n - m) ) t0.95(10 - 4) ) 1.94), the parameter is said to be reliable; otherwise, it is statistically unacceptable. According to Table 3, the P-optimal method estimation results are the closest to the true value (which is known in this simulation study). The correlation coefficient matrix has a similar result for all three methods, and 〈θ1, θ2〉 and 〈θ3, θ4〉 are highly correlated. To answer why θ2 cannot be estimated with high accuracy, we examine the elements of P in Table 2. The second row represents θ2, and for all three cases, the third and fourth columns of the row have larger values compared to the other columns. The third and fourth columns correspond to the small eigenvalues in the M matrix, which agrees with our PCA analysis (small eigenvalue in M indicates a larger parameter estimation variance). In real applications, as θ2 is not statistically acceptable, the P-optimal method allows designing an experiment to increasing its estimation accuracy. From analyzing Λ in Table 2, again the first eigenvalue is dominant and the remaining three eigenvalues should be included in the new objective function:

(

4

)

(( 4

F ) min ∏ (ai · P2i2) ) max ∏ φ∈Φ i)2

φ∈Φ i)2

1 ·P 2 ai 2i

))

(

4

)

) max ∏ (λi · P2i2) φ∈Φ i)2

(28) and ai and pi are eigenvalues and eigenvectors of V(θ, φ) ) [∑rq) 1∑sq ) 1Vrs-1JrJs + V0-1]-1. V0 ) M0-1 is taken from the last design in Table 2 because V0 is the a priori information

7778 Ind. Eng. Chem. Res., Vol. 47, No. 20, 2008

Figure 3. Control inputs and sampling points calculated by sequential P-optimal criterion. Red: control limits; blue, designed control inputs; dash, initial control guesses; squares: sampling points.

obtained from earlier estimations. Figure 3 shows the control inputs and sampling times suggested by the sequential design. The sequential estimation results are θ ) [0.321 ( 0.0312

0.1776 ( 0.0924 0.582 ( 0.0459 0.0571 ( 0.01883]T The corresponding t test value is [29.48 5.51 36.327 8.670], and the reference value is 1.75. One can see that all the parameters converged to the true values and are statistically reliable according to the t test. Moreover, the estimation accuracy for θ2 is noticeably enhanced (t value changes from 0.816 to 5.51) and verifies our design goal, which is increased estimation accuracy of θ2. Case II: θII ) [0.527 0.054 0.935 0.015]. Three optimal criteria are employed, and Figure 4 is the control strategy and corresponding sampling points suggested by different DOE. Both D- and P-optimal methods suggest bang-bang control for u1, and the only difference is that the switching times are different. For u2, both criteria also give similar results, which are composed of small set point changes. E-optimal gave different set points and sampling times compared with the other two because it only focuses on the last eigenvalue which may not be enough to capture all the necessary information. Tables 4 and 5 listed the designed information matrix results and nonlinear regression results, respectively. As in Case I, Doptimal leads to a large leading eigenvalue (1.099 × 105), E-optimal results in the largest eigenvalue for λ4 (366), and the P-optimal criterion focuses on improving the product of the remaining three eigenvalues. Table 5 indicates that E-optimal gives a larger confidence region than the other two methods in most estimation and θ2 estimation reaches a lower bound (0.05). D- and P-optimal methods give similar regression results while the P- ptimal method is slightly better for θ2 estimation, because D-optimal gives a lower bound. Because the initial guess is poor in this case, the information matrix (M in Table 4) is larger (in terms of volume) than in Table 2; the estimation results are generally worse by comparing Table 5 with Table 3. In Case I, P in Table 2 clearly shows θ2 may not be accurately estimated. However, one examines P in Table 4, it is reasonable to assume that θ2 can be estimated with high accuracy because P(2, 2) is dominant in the second row of P for both D- and P- optimal criteria. The results in Table 5 again indicate θ2 is not statistically acceptable (using a t test). The apparent contra-

Figure 4. Control inputs and sampling points calculated by different optimal design criteria. Red, control limits; blue, designed control inputs; dash, initial control guesses; squares, sampling points. (a) Control inputs designed by the D-optimal criterion. (b) Control inputs designed by the E-optimal criterion. (c) Control inputs designed by the P-optimal criterion.

diction can be explained by the fact that the initial parameter guesses are poor. Using the estimated parameter value of the P-optimal criterion in Table 5 (0.3072, 0.0913, 0.5571, 0.0521) and the designed experimental condition (Figure 4b), the corresponding P matrix is

[

-0.7427 0.5989 -0.2972 0.0375 0.0372 -0.0276 -0.0227 0.9987 P˜ ) 0.0529 -0.3884 -0.9193 -0.0336 0.6665 0.6998 -0.2569 -0.0114

]

Ind. Eng. Chem. Res., Vol. 47, No. 20, 2008 7779

As the estimated parameter is closer to the true value compared with the initial guess, the corresponding P˜ is more reasonable. One can see P˜(2, 4) is dominant in the second row, which indicates poor estimation accuracy as we saw in Case I. Moreover, when one analyzes P˜ column by column, P˜ is dominant in column 4 with the other three elements very close to zero, which means only θ2 needs to be estimated with better accuracy. As a result, based on P˜ matrix above, it is reasonable to design another experiment for θ2. According to CPV, the last three eigenvalues are included in the optimization and the objective function is the same as Case I (eq 26). As a result of the poor performance of E-optimal in previous cases, only D- and P-optimal are employed for sequential design. The suggested set point changes are shown in Figure 5, and the parameter estimation results are listed in Table 6. Because the P-optimal method is focusing on design of an experiment for estimating θ2, the corresponding results have a more accurate value and a tighter confidence region compared with D-optimal results. Meanwhile, D-optimal has a tighter confidence region for the other three parameters because it has no preference in choosing which parameter improves accuracy. t test results suggest all parameters are statistically acceptable after two sequential designs. 5.2. Polymerization Reaction Model. Polyethylene is the most popular synthetic polymer with over 40 billion tons of

production every year.24 Among the different production techniques, gas phase reaction over Ziegler-Natta catalysts enjoys the advantage of moderate reaction conditions and simple downstream separation. Therefore, a large proportion of polyethylene is produced in this way and a key operating constraint of this reactor is to keep the reactor temperature above the dew point of the reactant to avoid condensation and below the melting point of the polymer to prevent particle melting and agglomeration. At the same time, the system is prone to unstable steady states, limit cycles, and excursions toward unacceptably high temperature steady states.25 As a result of process understanding, optimization, and automation needs, there have been many efforts on first principles modeling during the past decade.25-27 In this work, a gas phase polymerization reactor model25 with seven ordinary differential equations (ODEs) and eight algebraic equations (AEs) is applied. It has been demonstrated that the model can reproduce the process features listed above and has been widely applied in controller design and process optimization. Figure 6 shows the schematic of the gas phase polymerization process.24,25 A feed including ethylene, comonomer, hydrogen, inerts, and catalysts are continuously added to the system, and catalyst is also continuously added through another feed. Unreacted gas is cooled by a counter current heat exchanger. The ODEs and AEs used for this dynamic model are listed in eqs 29 and 30, respectively. Detailed model parameters can be found in ref 24.

Table 4. DOE Results by Different Criteria

D-optimal

E-optimal

P-optimal

[ [ [

M(φ)

] [ ] [ ] [

Λ(φ)

7184.7 -3573.1 13641 -6865.4 1109.5 1312.5 -5.909.8 1572.2 -9880.6 108640

109900

1704.1 -1029.2 2387.0 -80.0 -660.2 3.585.2 -2081.7 4634.7 -17015 95618

98935

5213.7 -5329.5 30506 -36184 8926 980.9 -48773 54409 -6.081 70793

72456

15264 5456

] [ ] [ ] [

156

2940 1053 366

30722 4004

P(φ) -0.057 0.3923 0.9053 0.1524 0.0173 -0.9125 0.4053 -0.0519 -0.0899 -0.1119 -0.1224 0.9820 0.9941 0.02835 0.0339 0.0985

-0.021 0.616 0.6775 0.401 0.048 -0.768 0.6266 0.1238 -0.175 -0.174 -0.3761 0.8299 0.983 0.0203 -0.0834 0.1622

311

]

-0.082 0.1763 0.9699 0.1463 0.136 -0.9718 0.1913 -0.021 -0.0816 -0.0608 -0.1443 0.9843 0.9839 0.1442 0.0421 0.0966

]

Table 5. Parameter Estimation Results by Carrying out Designed Experiments t test (tref ) 1.94)

θˆ D-optimal

E-optimal

P-optimal

true

0.298 0.050 0.531 0.047 0.265 0.050 0.540 0.045 0.307 0.091 0.557 0.052

( ( ( ( ( ( ( ( ( ( ( (

0.013 0.198 0.031 0.008 0.010 0.324 0.108 0.020 0.013 0.228 0.033 0.010

22.9 0.252* 17.48 5.86 26.5 0.154* 5.00 2.45 23.6 0.399* 16.87 5.10

parameter correlation 1.000 0.694 0.543 0.521 1.000 0.267 0.850 0.874 1.000 0.605 0.613 0.614

0.694 1.000 -0.196 -0.237 0.267 1.000 -0.172 -0.214 0.605 1.000 -0.216 -0.248

[0.310 0.180 0.550 0.050 ]T

0.543 -0.196 1.000 0.937 0.850 -0.172 1.000 0.960 0.613 -0.216 1.000 0.935

0.521 -0.237 0.937 1.000 0.874 -0.214 0.960 1.000 0.614 -0.248 0.935 1.000

]

7780 Ind. Eng. Chem. Res., Vol. 47, No. 20, 2008

d[In] ) dt

Fin -

dT Hf + Hg1 - Hg0 - Hr - Hpol ) dt MrCpr + BwCppol

[In] b [M1] + [In] t Vg

dTw1 Fw UA ) (T - Tw1) (T - Tg1) dt Mw wi MwCpw w1

[M1] FM1 b - RM1 d[M1] [M1] + [In] t ) dt Vg

dTg1 Fg UA ) (T - Tg1) (T - Tg1) dt Mg MgCpg w1

RM1MW1Y1 dY1 ) Fcac1 - kd1Y1 dt BW

where

RM1MW1Y2 dY2 ) Fcac2 - kd2Y2 dt BW

bt ) VpCV√([M1] + [In]) · RR · T - PV

(29)

[ ( )]

RM1 ) [M1] · kp0 · exp Cpg )

-Ea 1 1 R T Tf

· (Y1 + Y2)

[M1] [In] C + C [M1] + [In] pm1 [M1] + [In] pIn

Hf ) FM1Cpm1(Tfeed - Tf) + FInCpIn(Tfeed - Tf)

(30)

Hg1 ) Fg(Tg1 - Tf)Cpg Hg0 ) (Fg + bt)(T - Tf)Cpg Hr ) HreacMW1RM1 Hpol ) Cppol(T - Tf)RM1MW1

Figure 5. Control inputs and sampling points calculated by different optimal design criteria. Red, control limits; blue, designed control inputs; dash, initial control guesses; squares, sampling points. (a) Control inputs designed by the D-optimal criterion. (b) Control inputs designed by the P-optimal criterion.

In the original paper,25 the two types of active site have the same characteristics. In this work, two active sites have different concentrations, ac1 ) 0.548 mol/kg and ac2 ) 0.750 mol/kg, and deactivation rates, kd1 ) 0.36 h-1 and kd2 ) 0.72 s-1. All the other model parameters are kept the same as in refs 24 and 25. Suppose most thermodynamic properties are known and reaction-related kinetic parameters are unknown (Table 7). This can be mapped into the case where a new catalyst is introduced to the production system. The only unknown thermodynamic property is assumed to be the heat capacity of polymer (Cppol). Table 7 lists the true value of the parameters and the initial guesses of each parameter. One can see the parameters are distributed widely across many magnitudes (0 to 105), which leads to parameter estimation difficulty. As a result of safety considerations, one can only perturb the catalyst flow rate (u1, Fcss ) 5.8 kg/h) and feeding temperature (u2, Tfeedss ) 293 K) within a tight range ((10% around steady state value). All the state variables ([In], [M1], Y1, Y2, T, Tw1, and Tg1) are measurable (temperature related variables can be read online and concentrations are analyzed off-line), and the corresponding measurement noise are N(0, [4 6.7 0.02 0.02 0.4 0.4 0.1]). Twenty samples are analyzed during one hundred hours experimental time, and each two neighboring samples are taken longer than two hours and shorter than twenty hours. Four switches are scheduled for each control variable (u1 and u2), and the switching times are under certain constraints as shown

Table 6. Parameter Estimation Results by Sequential DOE t test (tref ) 1.75)

θˆ D-optimal

P-optimal

true

0.306 0.204 0.532 0.046 0.314 0.180 0.561 0.054

( ( ( ( ( ( ( (

0.006 0.057 0.062 0.014 0.007 0.052 0.026 0.007

47.15 3.59 24.71 9.29 42.45 3.46 21.57 7.70

parameter correlation 1.0000 0.2604 0.8346 0.8301 1.0000 -0.0653 0.8913 0.8654

0.2604 1.0000 -0.2529 -0.2776 -0.0653 1.0000 -0.4247 -0.5312

[0.310 0.180 0.550 0.050 ]T

0.8346 -0.2529 1.0000 0.9637 0.8913 -0.4247 1.0000 0.9548

0.8301 -0.2776 0.9637 1.0000 0.8654 -0.5312 0.9548 1.0000

Ind. Eng. Chem. Res., Vol. 47, No. 20, 2008 7781 Table 7. Unknown Parameters in the Polymerization Model ac2 (mol/kg)

ac1 (mol/kg) true value initial knowledge initial guess

0.548 [0 1 ] 0.5

0.750 [0 1 ] 0.5

kp0 (m3/(mol h))

Ea (J/mol)

kd1 (h-1)

kd2 (h-1)

Cppol (J/(kg K))

0.306 [0 1 ] 0.5

3.768 × 10 [103 106 ] 0.5 × 105

0.36 [0 1 ] 0.5

0.72 [0 1 ] 0.5

3559 [103 105 ] 5 × 104

4

below. To eliminate the initial transient phase in simulation, the DOE phase is carried out from 100 to 200 h: 100 e ts,1, ..., ts,20 e 200 h 2 e ts,i+1 - ts,i e 10 h; i ) 1, 2, ..., 19 5.22 e u1,j e 6.38 kg h-1 ; j ) 1, 2, ..., 5 263.7 e u2,k e 322 K; k ) 1, 2, ..., 5 5 e tu1,l+1 - tu1,l e 30 h; l ) 1, 2, 3, 4 5 e tu2,m+1 - tu2,m+1 e 30 h; m ) 1, 2, 3, 4 Pseudo random binary sequence (PRBS) signal is widely used for model identification in both academia and industry. Suppose a PRBS signal is applied to this system with switching time equal to 10 hours and constant sampling rates (every five hours), the designed variable will be φPRBS: ts ) 105, 110, ..., 200 h u1 ) 0.522 0.638 0.638 u2 ) 263.7 322.3 322.3 tu1 ) 105 115 125 135 tu2 ) 110 120 130 140

0.522 0.522 0.522 263.7 322.3 322.3 145 155 165 175 150 160 170 180

0.638 0.638 0.638 0.638 kg h-1 322.3 263.7 263.7 322.3 K 185 h 190 h

With this PRBS input and equal sampling rates, the eigenvalue and eigenvector matrices are

[

]

ΛPRBS ) diag(6.0 × 107 2.8 × 105 9.2 × 103 2.2 × 103 3.4 × 101 1.6 × 101 4.5 × 10-3)

-0.3810 -0.0280 -0.0816 -0.1169 -0.5941 0.5777 0.3834 -0.3810 -0.0280 -0.0816 -0.1169 0.5941 0.5777 -0.3834 -0.7621 -0.0558 -0.1654 -0.2372 -0.0000 -0.5767 0.0000 PPRBS ) -0.0878 0.9738 0.1932 -0.0820 0.0000 0.0001 0.0000 0.2461 0.0471 -0.3606 -0.5543 0.3834 0.0015 0.5941 0.2461 0.0471 -0.3606 -0.5543 -0.3834 0.0015 -0.5941 0.0076 -0.2065 0.8136 -0.5434 -0.0000 0.0000 -0.0000

One can see that the first eigenvalue is dominant and the corresponding eigenvector focuses on explaining kp0 (P13 ) -0.76) and ac values and kd values are also included. Furthermore, the second one is dominated by Ea related terms (P24 ) 0.97) and the third by polymer heat capacity (Cppol). Because E-optimal performance was unsatisfactory in Section 5.1, only D- and P-optimal are used here and the objective functions are

Figure 6. Gas phase polymerization system.

Figure 7. Control inputs and sampling points calculated by different optimal design criteria. Red, control limits; blue, designed control inputs; squares, sampling points. (a) Control inputs designed by the D-optimal criterion. (b) Control inputs designed by the P-optimal criterion for all parameters. (c) Control inputs designed by the P-optimal criterion for activation energy (Ea).

7782 Ind. Eng. Chem. Res., Vol. 47, No. 20, 2008

Figure 7 compares the suggested control set point changes for catalyst flow rate (u1) and feed temperature (u2) and the sampling times (ts) for different optimum criteria. It is obvious that D-optimal and P-optimal for all parameters (Figure 7a,b) give similar suggestions on sampling time. Furthermore, during the frequent sampling phase (t < 135 h), catalyst flow rate set points in both cases are very close while P-optimal suggests two-step movements for feed temperature and D-optimal suggests bang-bang control. The reason for this small difference could be due to the continuous operation mode. Compared with batch operation, a continuous process loses some degrees of freedom for DOE such as initial condition and time dependency. This is also one of the reasons why parameter identification is difficult for continuous processes. For case (c), the calculated trajectories differ from cases (a) and (b), and the suggested sampling times are roughly divided into two parts: 110-135 h and 190-200 h, respectively. The control modes are bang-bang control as well. Interestingly, the frequent sampling time periods are roughly under maximum operation limits. The difference

D-optimal: 7

7

( )

( )

F1 ) min ∏ ai ) max ∏ λi P-optimal (all parameters):

(( 7

F2 ) min ∏ ai · φ∈Φ i)2

7

∑P

ji

j)1

2

))

(31)

φ∈Φ i)1

φ∈Φ i)1

( ( )) 7

1 ai

) max ∏

φ∈Φ i)2

7

( )

) max ∏ λi φ∈Φ i)2

(32) P-optimal (Ea):

(

7

)

(( 7

F3 ) min ∏ (ai · P4i2) ) max ∏ φ∈Φ i)2

φ∈Φ i)2

1 · P -2 ai 4i

))

)

(

7

)

max ∏ (λi · P4i-2) (33) φ∈Φ i)2

Two cases are considered in P-optimal: F2 aims to improve the precision of all the parameters while F3 only focuses on activation energy (Ea) because activation energy directly related to optimum operation temperature choice.

Table 8. Parameter Estimation Results by Carrying out Designed Experiments ac1

ac2

kp0

Ea

kd1

kd2

Cppol

PRBS θˆ

0.6478 (3.7562

0.5249 (3.5924

0.3509 (0.9840

42670 (200200

0.5719 (3.7744

0.5260 (4.1985

5840 (18710

t-test (tref ) 1.77) parameter correlation

0.1725 1.0000 -0.3111 -0.5028 -0.3488 0.4970 -0.3828 -0.3482

0.1461 -0.3111 1.0000 -0.6611 0.4444 -0.6692 0.5503 0.3731

0.3566 -0.5028 -0.6611 1.0000 -0.1029 0.1825 -0.1485 0.0018

0.2131 -0.3488 0.4444 -0.1029 1.0000 -0.7483 0.7345 0.6626

0.1515 0.4970 -0.6692 0.1825 -0.7483 1.0000 -0.9718 -0.6219

0.1253 -0.3828 0.5503 -0.1485 0.7345 -0.9718 1.0000 0.7051

0.3121 -0.3482 0.3731 0.0018 0.6626 -0.6219 0.7051 1.0000

θˆ

0.6588 (0.6757

0.5540 (0.7283

0.3080 (0.1720

46300 (103000

0.4831 (0.5263

0.4709 (0.6717

5549 (8600

t-test (tref ) 1.77) parameter correlation

0.9750 1.0000 -0.9900 -0.0717 -0.0178 0.9950 -0.9950 0.0127

0.7607 -0.9900 1.0000 -0.0695 0.0189 -0.9950 0.9950 -0.0105

1.7907 -0.0717 -0.0695 1.0000 -0.0002 -0.0011 0.0010 -0.0004

0.4495 -0.0178 0.0189 -0.0002 1.0000 -0.0179 0.0190 0.8456

0.9179 0.9950 -0.9950 -0.0011 -0.0179 1.0000 -1.0000 0.0128

0.7011 -0.9950 0.9950 0.0010 0.0190 -1.0000 1.0000 -0.0106

0.6452 0.0127 -0.0105 -0.0004 0.8456 0.0128 -0.0106 1.0000

θˆ

0.6455 (0.6189

0.6482 (0.6858

0.3035 (0.1765

40100 (90060

0.4675 (0.3947

0.5779 (0.5351

5076 (9480

t-test (tref ) 1.77) parameter correlation

1.0430 1.0000 -0.3321 -0.5653 -0.3128 0.5966 -0.5142 -0.3320

0.9452 -0.3321 1.0000 -0.5870 0.3222 -0.6333 0.5538 0.3544

1.7195 -0.5653 -0.5870 1.0000 -0.0014 0.0051 0.0044 -0.0011

0.4453 -0.3128 0.3222 -0.0014 1.0000 -0.5589 0.5365 0.5421

1.1844 0.5966 -0.6333 0.0051 -0.5589 1.0000 -0.9854 -0.5950

1.0800 -0.5142 0.5538 0.0044 0.5365 -0.9854 1.0000 0.5925

0.5354 -0.3320 0.3544 -0.0011 0.5421 -0.5950 0.5925 1.0000

D-optimal

P-optimal (All Parameters)

P-optimal (Activation Energy only) θˆ

0.7255 (0.7448

0.5635 (0.6786

0.3053 (0.1757

38190 (18100

0.5594 (0.5883

0.4769 (0.5928

5532 (9060

t-test (tref ) 1.77) parameter correlation

0.9741 1.0000 -0.4507 -0.5246 -0.1447 0.6745 -0.7043 0.1719 0.5480

0.8304 -0.4507 1.0000 -0.5177 0.0900 -0.6308 0.6842 -0.0913 0.7500

1.7376 -0.5246 -0.5177 1.0000 0.0012 -0.0034 -0.0026 -0.0007 0.3060

2.1099 -0.1447 0.0900 0.0012 1.0000 -0.2086 0.1360 -0.4300 37680

0.9509 0.6745 -0.6308 -0.0034 -0.2086 1.0000 -0.9868 0.2477 0.3600

0.8045 -0.7043 0.6842 -0.0026 0.1360 -0.9868 1.0000 -0.1379 0.7200

0.6106 0.1719 -0.0913 -0.0007 -0.4300 0.2477 -0.1379 1.0000 3559

true

Ind. Eng. Chem. Res., Vol. 47, No. 20, 2008 7783

between (c) and the previous two cases can be attributed to the significantly different objective functions because the P matrix is employed to focus on estimate activation energy better. To verify the DOE results, simulation is carried out with measurement noise indicated above. The parameter estimation results for PRBS test, D-optimal, and P-optimal criteria are listed in Table 8. Same as Table 3, the estimated parameter value (θˆ ), t test, and estimated parameter correlation matrix are shown. It can be seen that the PRBS test gave the worst estimation results with large confidence regions for nearly all the parameters. As indicated by φPRBS above, there are nine switching points compared with four for the designed runs, but PRBS still gives unsatisfactory results. For designed experiments, P-optimal for all parameters indicates a better overall parameter estimation result than D-optimal with θˆ closer to true values and tighter confidence intervals for almost all parameters. Furthermore, the parameter correlation matrix suggests some very strong linear dependency in D-optimal results as 〈θ1, θ2, θ5, and θ6〉, 〈θ4 and θ7〉, which indicates unreliable results. These correlated parameters can be replaced by a single parameter and still give a good fit to the D-optimal designed experimental data. This is not surprising for a continuous process because it is regulated as close to the steady state as possible; hence, it does not provide much excitation. But this will surely increase the difficulty for parameter identification. Mathematically, the reason for highly correlated parameters could be attributed to the ill-conditioned information matrix (M) because the correlation matrix is proportional to the inverse of the M matrix. D-optimal maximized the dominant eigenvalue, which led to a large condition number 5 × 109. In comparison, the P-optimal case shows well-decoupled results, which means the two-step movement between 100 to 140 h for feed temperature in Figure 7b increases the nonlinearity of the response, which is helpful for identification. The condition number reduces to 9 × 106. In the mean time, P-optimal design for activation energy only gives the best estimation for Ea with the tightest confidence region, although other parameters do not converge as close to the true value as P-optimal does for all parameters (eq 32). It is worthwhile to notice that, for this continuous example, few estimation cases meet the t-test requirement (1.77). Again, a continuous process with fixed operation condition caused difficulty for DOE and parameter estimation. However, by using P-optimal for one important parameter only, Ea estimation meets the criterion (2.01 > 1.77) and an optimum operation temperature can be defined based on this estimate for later controller design. 6. Conclusions A generalized model-based DOE criterion is proposed in this paper by introducing PCA into the sensitivity matrix (J) and information matrix (M) analysis. The main advantages of this method includes easy rescaling of a large DAE system into small components. In addition, it is easy to focus on improving a specific subset of parameters, and parameters can be grouped according to their sensitivity behavior. Two engineering examples are presented to show the effectiveness of the new proposed method. According to the analysis, it can be concluded that P-optimal is more efficient for complex systems than the traditional D- or E- optimal criteria, and it is robust to different initial parameter estimates. It may be a good idea to combine this identification objective function with the optimal control objective function to achieve simultaneous control and identification, which will be the next step in future work.

Literature Cited (1) Asprey, S. P.; Macchietto, S. Statistical tools for optimal dynamic model building. Comput. Chem. Eng. 2000, 24, 1261–1267. (2) Box, G. E. P.; Lucas, H. L. Design of experiments in non-linear situations. Biometrika 1959, 46, 77–90. (3) Atkinson, A. C.; Hunter, W. G. The Design of Experiments for Parameter Estimation. Technometrics 1968, 10, 271–289. (4) Ljung, L. System Identification-Theory For the User, 2nd ed.; Prentice Hall: Upper Saddle River, 1999. (5) Goodwin, G. C.; Payne, R. L. Dynamic Systems Identification: Experiment Design and Data Analysis; Academic Press: New York, 1977. (6) Asprey, S. P.; Naka, Y. Mathematical Problems in Fitting Kinetic Models - Some New Perspectives. J. Chem. Eng. Jpn. 1999, 32 (3), 328– 337. (7) Issanchou, S.; Cognet, P.; Cabassud, M. Sequential Experimental Design Strategy for Rapid Kinetic Modeling of Chemical Synthesis. AIChE J. 2005, 51, 1773–1781. (8) Franceschini, G.; Macchietto, S. Validation of a Model for Biodiesel Production through Model-Based Experiment Design. Ind. Eng. Chem. Res. 2007, 46, 220–232. (9) Atkinson, A. C.; Bogacka, B. Compound and other optimum designs for systems of nonlinear differential equations arising in chemical kinetics. Chemom. Intell. Lab. Syst. 2002, 61, 17–33. (10) Gadkar, K. G.; Gunawan, R.; Doyle, F., III. Iterative approach to model identification of biological networks. BMC Bioinformatics 2005, 6, 155–174. (11) Sidoli, F. R.; Mantalaris, A.; Asprey, S. P. Toward global parametric estimability of a large scale kinetic single cell model for Mammalian cell cultures. Ind. Eng. Chem. Res. 2005, 44, 868–878. (12) Asprey, S. P.; Macchietto, S. Designing robust optimal dynamic experiments. J. Process Control 2002, 12, 545–556. (13) Walter, E.; Pronzato, L. Qualitative and Quantitative Experiment Design for Phenomenological Models - A Survey. Automatica 1990, 26, 195–213. (14) Atkinson, A. C. Non-constant Variance and the Design of Experiments for Chemical Kinetic Models. In Dynamic Model DeVelopment; Asprey, S. P., Macchietto, S. Eds.; Elsevier: New York, 2003; Vol. 16. (15) Galvanin, F.; Macchietto, S.; Bezzo, F. Model-Based Design of Parallel Experiments. Ind. Eng. Chem. Res. 2007, 46, 871–882. (16) Atkinson, A. C.; Bogacka, B. Compound, D- and Ds-optimum designs for determining the order of a chemical reaction. Technometrics 1997, 39, 347–356. (17) Bard, Y. Nonlinear Parameter Estimation; Academic Press, Inc.: New York, 1974; p 341. (18) Zullo, L. Computer aided design of experiments. An engineering approach; University of London: London, 1991. (19) Qin, S. J.; Dunia, R. H. Determining the number of principal components for best reconstruction. Presented at IFAC DYCOPS’98, Greece, 1998. (20) Zhang, Y.; Edgar, T. F. Multivariate Statistical Process Control In New Directions in Bioprocess Modeling and Control; Boudreau, M., McMillan, G. Eds.; ISA: North Carolina, 2006; Chapter 8. (21) Geladi, P.; Kowalski, B. R. Partial Least Squares Regression: A Tutorial. Anal. Chim. Acta 1986, 185, 1–17. (22) Nihtila, M.; Virkunnen, J. Practical identifiability of growth and substrate consumption models. Biotechnol. Bioeng. 1977, 19, 1831. (23) Kristensen, N. R.; Madsen, H.; Jorgensen, S. B. An investigation of some tools for process model identification for prediction. In Dynamic Model DeVelopment; Asprey, S. P., Macchietto, S. Eds.; Elsevier: New York, 2003; Vol. 16. (24) Gani, A.; Mhaskar, P.; Christofides, P. D. Fault tolerant control of a polyethylene reactor. J. Process Control 2007, 17, 439–451. (25) McAuley, K. B.; Macdonald, D. A.; McLellan, P. J. Effects of operating conditions on stability of gas-phase polyethylene reactors. AIChE J. 1995, 41, 868–879. (26) Hatzantonis, H.; Yiannoulakis, H.; Yiagopoulos, A.; Kiparissides, C. Recent developments in modeling gas-phase catalyzed olefin polymerization fluidized-bed reactors: The effect of bubble size variation on the reactor’s performance. Chem. Eng. Sci. 2000, 55, 3237–3259. (27) McAuley, K. B.; MacGregor, J. F.; Hamielec, A. E. A Kinetic Model for Industrial Gas Phase Ethylene Copolymerization. AIChE J. 1990, 36, 837–850.

ReceiVed for reView September 7, 2007 ReVised manuscript receiVed June 27, 2008 Accepted July 14, 2008 IE071206C