
Mean-Squared-Error Methods for Selecting Optimal Parameter Subsets for Estimation

Kevin A. P. McLean,† Shaohua Wu,‡ and Kimberley B. McAuley*

Department of Chemical Engineering, Queen's University, Kingston, Ontario, Canada K7L 3N6

ABSTRACT: Engineers who develop fundamental models for chemical processes are often unable to estimate all of the parameters, especially when available data are limited or noisy. In these situations, modelers may decide to select only a subset of the parameters for estimation. An orthogonalization algorithm combined with a mean-squared-error (MSE) based selection criterion has been used to rank parameters from most to least estimable and to determine the parameter subset that should be estimated to obtain the best predictions. A robustness test is proposed and applied to a batch reactor model to assess the sensitivity of the selected parameter subset to initial parameter guesses. A new ranking and selection technique is also developed based on the MSE criterion and is compared with existing techniques in the literature. Results obtained using the proposed ranking and selection techniques agree with those from leave-one-out cross-validation but are more computationally attractive.

1. INTRODUCTION

Chemical engineers often build fundamental dynamic models to represent the behavior of chemical phenomena in industrial processes. In many situations, modelers must decide which parameters to select for estimation. In this article, new and existing strategies for ranking parameters from most to least estimable and for determining the parameter subsets that should be estimated are applied to a batch reactor model, and their efficiencies are compared.

It is often impossible for the modeler to estimate all of the model parameters, for several reasons. Nonlinear chemical process models frequently contain many parameters. When the values of some parameters are not well-known, they must be estimated using nonlinear optimization algorithms.1 The data set available to the modeler for estimation is often limited (i.e., the number of experimental runs or measurements is small, the experimental design is highly correlated, or the data are noisy).2−4 In addition, the effects of some parameters on model predictions may be small, or the effects of some parameters may be correlated with the effects of others. Faced with this situation, a modeler may lump several parameters together into a single overall parameter2 or may remove terms from the model equations that are expected to have little influence on the model predictions.5 For example, parameters have been selected for elimination based on statistical significance tests such as the t test or ridge-selection analysis.6 Alternatively, the modeler may estimate only a subset of the model parameters, while keeping other parameters at literature values or initial guesses.7−12 This model simplification approach may also be useful when a modeler has precise prior knowledge about the values of some parameters (i.e., from the literature or from previous experiments) and less knowledge about others. When the modeler is presented with new data, he/she may be unsure about which parameter values should be adjusted to obtain improved model predictions.

Methods for selecting parameter subsets for estimation have been reviewed recently4 and include methods based on scalar measures of the Fisher information matrix,13−15 correlation and collinearity index methods,16−18 methods based on principal component analysis (PCA) and eigenvalues,2,19−21 and orthogonalization methods.10,22,23 Many researchers have successfully used the orthogonalization method to rank parameters from most to least estimable, and various techniques have been employed to select the number of parameters to estimate from the ranked list (e.g., using an arbitrary threshold value,7,8,22,24 judgments based on the objective function,25 or cross-validation10). A statistical method proposed by Wu et al.11,12,26 relies on mean squared error (MSE), which takes into account both the bias and the variance in parameter estimates and model predictions. The bias increases when a modeler chooses to use a simplified model (SM) wherein only a subset of the parameters is estimated. However, a SM with fewer parameters also results in a beneficial decrease in the variance (and standard error) of the parameter estimates and model predictions. When the variance reduction is greater than the increase in bias, the SM should give better predictions, in terms of MSE, than the original model with all parameters estimated.26,27 Wu's MSE-based method has been applied to select optimal parameter subsets in a styrene polymerization model28 and a nylon 66 degradation model.29

One limitation of Wu's MSE-based method and of other methods based on a local sensitivity matrix is their dependence on initial parameter values. Poor initial guesses will lead to poor derivative information in the scaled sensitivity coefficients. Using different parameter values or uncertainty factors can result in changes in the parameter ranking and in selection of a different subset of parameters for estimation. A simple method for assessing the sensitivity of the selected parameters to the initial guesses is proposed in this article, wherein Wu's MSE-based method is repeated using different initial parameter guesses. Another MSE-based technique

Received: October 13, 2011. Revised: March 5, 2012. Accepted: April 9, 2012. Published: April 9, 2012.

dx.doi.org/10.1021/ie202352f | Ind. Eng. Chem. Res. 2012, 51, 6105−6115


proposed by Chu et al.9 employs a Monte Carlo technique that considers a large number of potential parameter values and, as such, may be more robust to poor initial guesses than Wu's MSE-based method. However, the Monte Carlo approach is computationally intensive and may be prohibitive for models with a large number of parameters. The results obtained and the computational effort required for implementing these two robust parameter-selection techniques will be compared. The method of Wu et al. relies on an orthogonalization algorithm to first rank the parameters.10,22 In a second step, the appropriate number of parameters is determined using an MSE-based model selection criterion.11,12,26 A new selection technique will also be proposed, which uses Wu's MSE-based criterion both to rank the parameters and to select the optimal number of parameters to estimate. Section 2 below describes the current parameter subset selection methodology used by our research group (i.e., parameter ranking by orthogonalization, followed by MSE-based subset selection).12,28,29 A simple procedure is then proposed to account for uncertainty in initial parameter guesses when this methodology is used. The MSE-based Monte Carlo method of Chu et al.9 is also described, as well as a proposed method for simultaneous parameter ranking and selection using Wu's MSE-based criterion. In Section 3, a multiresponse batch reactor model and literature data30 are used to demonstrate the three different parameter selection techniques. Results and computational requirements are compared.

Zilm , j =

(1)

y = g (x , u , θ ) + ε

(2)

∂θj syi

(3)

These scaled sensitivity coefficients are calculated for each of the i = 1 ... d response variables, each of the l = 1 ... n measurement times, and each of the m = 1 ... r experimental runs. Because scaling can strongly affect the outcome of the ranking algorithm, the scaling factors must be chosen carefully. As recommended by Thompson et al.,10 the scaling factors sθj0 and syi, which have the same units as θj and yi, should account for uncertainties in the initial guesses for parameters and measured responses, respectively. The initial parameter guesses and their corresponding uncertainty levels can be set by the modeler based on prior knowledge (e.g., information from literature, earlier data or parameter estimation analyses, or knowledge of physically realistic parameter values)31 and the different syi values, which account for inaccuracies of different measurements, can be determined from replicate experiments or information from sensor suppliers. The elements of this scaled sensitivity matrix, Z, can be calculated by difference approximations with perturbed parameter values32 or by solving sensitivity equations.33 The widely used orthogonalization algorithm, which is outlined in Table 1, relies on this scaled sensitivity matrix to Table 1. Orthogonalization Algorithm12 1. Calculate the magnitude (i.e., the Euclidean norm) of each column in the scaled sensitivity matrix Z. The most estimable parameter corresponds to the column in Z with the largest magnitude. Set k = 1. 2. Put the k columns from Z that correspond to parameters that have been ranked into matrix Xk. 3. Use Xk to predict columns in Z using ordinary least-squares: Zk̂ = Xk(Xk TXk)−1Xk TZ (5)

2. EXISTING AND PROPOSED METHODS FOR PARAMETER SUBSET SELECTION Assume that the behavior of the process of interest can be described by a multivariate nonlinear dynamic model of the form: dx = f (x , u , θ ), x(0) = x0 dt

∂gilm sθj0

and calculate the residual matrix R k = Z − Z ̂k

(6)

4. Calculate the magnitude of each column in Rk. The (k + 1)th most estimable parameter corresponds to the column in Rk with the largest magnitude. 5. Increase k by 1 and put the columns corresponding to the k + 1 parameters that have been ranked in matrix Xk. 6. Advance the iteration counter (subscripts for X and R) and repeat steps 3−5, until all parameters are ranked or until it is impossible to perform the leastsquares prediction of Z in step 3 due to matrix singularity.

where x ∈ Rw is the vector of model states with the vector of initial values x0, u ∈ Rv is the vector of input trajectories, θ ∈ Rp is vector of model parameters, y ∈ Rd is the vector of measured response variables, and ε ∈ Rd is the vector of stochastic measurement errors. With several different response variables and experimental runs performed, there may be N = dnr total data points available for estimation, where d is the number of response variables and n is the number of measurements of each response variable in each of the r experimental runs. 2.1. Parameter Ranking and Selection Using Orthogonalization and Wu’s MSE Criterion (O + r CC). Information about influences of parameter values on model predictions is summarized in a sensitivity matrix, Z, which is important for many parameter selection techniques. The sensitivity matrix has dimensions N × p, where p is the total number of model parameters. The outcomes from all of the experimental runs are stacked in one matrix, instead of analyzing individual sensitivity matrices for different runs, to produce one overall parameter ranking that considers the predictions of all of the available measurements. The elements of this matrix, which are scaled for dimensional consistency, are partial derivatives of each response variable prediction gilm with respect to each parameter θj and are calculated as follows:

rank parameters from most estimable to least estimable. The rank of each model parameter is determined based on the influence of the parameter on model predictions and on correlations with other parameters. A complete description of the algorithm is provided by Yao et al.22 and Thompson et al.10 Recently, Wu et al.11,12,26 developed a MSE-based technique to determine the optimal number of parameters to estimate from the ranked list obtained via orthogonalization. This technique, shown in Table 2, relies on a critical ratio, rC. This statistic estimates the ratio of the squared bias to the variance that results from estimating only k model parameters, rather than estimating all p parameters. The ratio rC is computed using optimal values of a weighted least-squares objective function J, which is a function of the parameter vector: ⎛ y − g (θ ) ⎞ 2 ilm ⎟⎟ J(θ ) = ∑ ∑ ∑ ⎜⎜ ilm s ⎝ ⎠ yi i=1 l=1 m=1 d

n

r

(4)

where yilm is the measured value of the ith response variable at the lth measurement time, for the mth experimental run, and 6106


gilm is the corresponding predicted value. This objective function is appropriate when the modeler has prior knowledge about measurement uncertainties and when measurement errors are assumed to be uncorrelated. In Table 2, Jk and Jp are the values of the objective function when k and p parameters are estimated, respectively. In cases where it is impossible to estimate all p parameters simultaneously due to numerical difficulties, Jp can be approximated by estimating a sufficiently large number of parameters. A corrected critical ratio, rCC, is calculated in step 4 of the algorithm to allow for comparisons among simplified models with different numbers of parameters. An rCC,k value that is less than zero indicates that the SM with k parameters estimated is expected to provide better predictions, in terms of MSE, than the model with all parameters estimated. The value of k corresponding to the lowest rCC,k value is the number of parameters that should be selected for estimation. After selecting the number of parameters to estimate, it is important to compare the model predictions and the data to check for lack of fit, which might indicate that the model has structural problems. Once the modeler has estimated some of the parameters, it can be useful to recompute the sensitivity information in the Z matrix (see eq 3) using the updated θj values. However, the uncertainty levels, sθj0, should be left at their initial values. A benefit of scaling based on initial parameter uncertainties is that parameters with large uncertainties tend to be selected for estimation, whereas parameters that are known precisely a priori are left out of the estimation.

Table 2. Wu's MSE-Based Approach for Selecting the Number of Parameters to Estimate12
1. Start with a ranked parameter list (i.e., use the orthogonalization algorithm).
2. Estimate the top-ranked parameter using weighted least-squares regression, with all other parameters fixed at their initial guesses. Next, estimate the top two parameters, followed by the top three parameters, and so on, until all of the ranked parameters have been estimated. Denote as Jk the value of the objective function with the top k parameters estimated and the remaining p − k parameters held fixed.
3. Compute the critical ratio:
rC,k = (Jk − Jp)/(p − k) for k = 1 ... p − 1 (7)
4. For each value of k, compute the corrected critical ratio:
rCC,k = [(p − k)/N](rCKub,k − 1) (8)
where
rCKub,k = max(rC,k − 1, [2/(p − k + 2)]rC,k) (9)
5. Select the value of k corresponding to the lowest value of rCC,k as the appropriate number of parameters to estimate.

2.2. Proposed Method for Assessing Subset Robustness. The ranked parameter list and the parameter subset selected using the algorithms in Tables 1 and 2 depend on the initial parameter guesses and scaling factors used to construct the Z matrix. The robustness of the selected parameter subset to these assumptions can be analyzed by repeating the ranking and parameter selection using different realistic initial parameter guesses or scaling factors. In Section 3, a batch reactor model is analyzed using 100 sets of different initial parameter guesses, chosen randomly from the initial uncertainty ranges specified for the parameters. These random parameter values are then used to determine 100 different Z matrices. Repeating the analysis for many sets of initial guesses may require long computation times, especially if a large number of cases are considered and the model is complicated, so the number of random sets should strike a balance between ensuring robustness and maintaining computational efficiency. The orthogonalization algorithm for parameter ranking is repeated for each set of initial guesses, resulting in 100 ranked parameter lists. Wu's MSE-based method is then applied 100 times to select the appropriate number of parameters for estimation. This technique can be used to determine whether the same subset of parameters is selected frequently for estimation or whether the nominal parameter guesses and scaling factors gave anomalous results.

2.3. MSE-Based Parameter Selection Algorithm of Chu et al. A different MSE-based technique, proposed by Chu et al.,9 relies on Monte Carlo simulations to simultaneously rank parameters and select the optimal number to estimate. The algorithm for this technique is shown in Table 3. Note that, unlike the previous method, this technique does not involve parameter estimation and does not require experimental data. In this method, many different sets of potential true parameter values are considered, and sensitivity matrices are formulated for each candidate set. The potential true parameter values are selected from uncertainty ranges for each

Table 3. MSE-Based Approach Developed by Chu et al.9 to Rank and Select the Number of Parameters to Estimate
1. Specify initial guesses (nominal values), parameter ranges, and distributions for all p parameters, as well as the variance σ2 for the response variable.
2. Randomly select 10^4 possible sets of true parameter values from the specified parameter ranges. For each set of candidate parameter values, evaluate and record the sensitivity matrix S.
3. Starting with no parameters in the ranked list, formulate p candidate parameter subsets, each with one parameter that will be estimated, so that p − 1 parameters will be left out in each case. For each selected parameter (i.e., for j = 1, ..., p), place its corresponding column from S into T and place the remaining columns in W.
4. For j = 1, ..., p, evaluate the bias for each of the 10^4 potential sets of true values, caused by selecting parameter j but leaving the others at their nominal values:
(θ̄U − θU*)^T [W^T W − W^T T(T^T T)^(−1) T^T W](θ̄U − θU*)
For each candidate parameter subset, compute the average of these 10^4 bias values.
5. The parameter with the lowest mean value for the bias is ranked most estimable.
6. With k − 1 parameters ranked, formulate p − k + 1 candidate parameter subsets, each with k parameters. Each of these subsets includes the k − 1 previously ranked parameters and one additional parameter. For each candidate subset, place the corresponding columns in T and put the columns for the unselected parameters in W.
7. For each candidate subset, evaluate the bias for the 10^4 potential sets of true values, caused by selecting k parameters. This bias is obtained using the expression shown in step 4, and the average of the 10^4 bias values is computed for each candidate parameter subset.
8. The additional parameter that gives the lowest mean value for the bias (in combination with the parameters that were previously ranked) is ranked next.
9. Repeat steps 6−8 until the reduction in average bias resulting from the addition of a new parameter is less than the increase in total variance caused by including the parameter, which is equal to σ2.
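The average-bias measure in steps 4 and 7 of Table 3 can be sketched with NumPy. This is a minimal illustration, not the authors' implementation; the matrix sizes, parameter values, and sampling ranges below are hypothetical:

```python
import numpy as np

def average_bias(S, selected, theta_nom, theta_true_samples):
    """Average of the step-4 bias measure over sampled 'true' parameter sets:
    (thbar_U - th*_U)' [W'W - W'T (T'T)^{-1} T'W] (thbar_U - th*_U),
    where T holds the sensitivity columns of the selected parameters and
    W holds the columns of the parameters left at their nominal values."""
    p = S.shape[1]
    unselected = [j for j in range(p) if j not in selected]
    T = S[:, selected]
    W = S[:, unselected]
    # Bracketed matrix W'W - W'T (T'T)^{-1} T'W from step 4 of Table 3
    M = W.T @ W - W.T @ T @ np.linalg.solve(T.T @ T, T.T @ W)
    # Differences between nominal and sampled true values of the
    # unselected parameters, one row per Monte Carlo sample
    diffs = theta_nom[unselected] - theta_true_samples[:, unselected]
    # Quadratic form d' M d for every sampled set, then the average
    return float(np.mean(np.einsum("ij,jk,ik->i", diffs, M, diffs)))

# Hypothetical 3-parameter example with 1000 sampled true parameter sets
rng = np.random.default_rng(1)
S = rng.normal(size=(30, 3))
theta_nom = np.array([1.0, 2.0, 3.0])
samples = theta_nom + rng.uniform(-0.5, 0.5, size=(1000, 3))
# Step 5: the first-ranked parameter minimizes the average bias
first = min(range(3), key=lambda j: average_bias(S, [j], theta_nom, samples))
print(first)
```

The bracketed matrix is of the form W'(I − P)W, with P the projection onto the columns of T, so each sampled bias value is nonnegative.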

dx.doi.org/10.1021/ie202352f | Ind. Eng. Chem. Res. 2012, 51, 6105−6115

Industrial & Engineering Chemistry Research

Article

parameter, which are specified by the user in step 1 of the algorithm. By using many different sets of candidate parameter values to calculate an average bias, the selected parameter subset should be less sensitive to poor initial guesses than in the O + rCC method outlined in Tables 1 and 2. Unscaled sensitivity matrices, S, with elements ∂gilm/∂θj, are calculated many times (e.g., 10^4 times, as recommended by Chu et al.), and each S matrix is partitioned into two parts: T, the matrix of sensitivity vectors corresponding to parameters that may be estimated, and W, the matrix of sensitivity vectors for parameters that will remain fixed at their nominal values. Steps 4 and 7 involve estimating the average bias introduced by considering only a subset of parameters for estimation. The calculated bias is based on the estimated difference between the nominal values and the true values of the unselected parameters: θ̄U in the expression for the bias denotes the vector of nominal values for the unselected parameters, and θU* denotes the vector of possible true values for the unselected parameters. Similar to the O + rCC method in Tables 1 and 2, Chu's method is a forward selection technique (i.e., parameters are ranked one at a time from most estimable to least estimable) that considers the trade-off between bias and variance as additional parameters are included in the estimation. However, in the O + rCC method, ranking of the parameters and selection of the parameter subset are performed in separate procedures, whereas in Chu's method, the ranking and selection steps are combined. For typical chemical process models, it is not clear which method will give better results and which will be more computationally expensive.

2.4. Proposed Simultaneous Parameter Ranking and Subset Selection Method Using the Corrected Critical Ratio (SrCC). A new parameter selection technique is proposed, wherein Wu's MSE-based method is modified to rank parameters by minimizing the MSE at each ranking step, similar to the method of Chu et al. This method allows for simultaneous parameter ranking and subset selection and uses the same criterion to rank and to select parameters. Unlike the methods described above, this new method does not involve sensitivity matrices (which can be inaccurate due to poor initial parameter guesses). Rather, the proposed method uses parameter estimation at each ranking step to compute rCC. The parameter selected at each step is the parameter that gives the lowest value of rCC,k. Thus, this method ranks parameters based on the improvement in the objective function and may select a simpler model with fewer parameters than the algorithm in Section 2.1. This method is useful when the modeler would like to rank parameters using information from the available data. However, it cannot be performed when experimental data are unavailable, and it is more computationally demanding than the existing methods: ranking all of the parameters from most estimable to least estimable using this technique requires p(p + 1)/2 − 1 parameter estimations. The algorithm for this procedure is outlined in Table 4.

Table 4. New Method to Rank and Select Parameters to Estimate
1. Starting with no parameters in the ranked list, formulate p candidate parameter subsets, each with one parameter that will be estimated, so that p − 1 parameters will be held at their initial values.
2. For each selected parameter (i.e., for j = 1, ..., p), fit the candidate model using the weighted nonlinear least-squares criterion in eq 4 to compute an optimal objective function value, which is then used to calculate p different rC,1 and rCC,1 values using eqs 7−9 from Table 2.
3. The parameter corresponding to the lowest rCC,1 value is ranked as the most estimable parameter.
4. With k − 1 parameters ranked, formulate p − k + 1 candidate parameter subsets. Each subset contains the k − 1 parameters that were previously ranked and one additional parameter.
5. For each candidate subset, fit the candidate model using weighted nonlinear least squares to compute an optimal objective function value, which is then used to calculate rC,k and rCC,k values using eqs 7−9 from Table 2.
6. The additional parameter that gives the lowest rCC,k value (in combination with the parameters that were previously ranked) is ranked next.
7. Repeat steps 4−6 until all of the parameters are ranked, or until estimation fails due to numerical conditioning problems. Select the parameter subset corresponding to the lowest overall value of rCC,k as the subset that should be included in the final estimation.

Like Chu's method, this new forward selection technique combines parameter ranking and subset selection by considering the trade-off between bias and variance at each step. Unlike Chu's method, the new technique uses parameter estimation and data. Chu's method (and the orthogonalization algorithm in Table 1) ranks parameters based on sensitivity matrices and thus requires only the settings at which the experimental data will be collected (e.g., measurement times, reactor temperature, and other input conditions). Like Wu's algorithm in Table 2, this new algorithm determines the appropriate number of parameters to estimate based on the lowest value of rCC,k. The user specifies information about uncertainty in the parameter values by enforcing reasonable bounds during parameter estimation. For chemical process models, it is not clear whether the three methods outlined above will give similar results, nor which method will be most reliable. In the next section, the proposed technique for checking the robustness of O + rCC results is demonstrated, and the three subset selection methods are applied to select parameter subsets for estimation in a batch reactor model.
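The corrected-critical-ratio computation in eqs 7−9, which both Wu's method (Table 2, step 5) and the proposed SrCC method use to pick the number of parameters, can be written compactly. A minimal sketch; the objective function values J below are hypothetical:

```python
def corrected_critical_ratios(J, N):
    """Compute rCC,k (eqs 7-9) from objective function values.
    J[k] is the optimal weighted least-squares objective with the top k
    parameters estimated; J[p] is the value with all p estimated.
    N is the total number of data points."""
    p = max(J)
    rCC = {}
    for k in range(1, p):
        rC = (J[k] - J[p]) / (p - k)                   # eq 7: critical ratio
        rCKub = max(rC - 1.0, 2.0 * rC / (p - k + 2))  # eq 9: Kubokawa correction
        rCC[k] = (p - k) / N * (rCKub - 1.0)           # eq 8: corrected ratio
    return rCC

# Hypothetical objective values for p = 5 parameters and N = 100 data points:
# the fit improves sharply up to k = 3 and is nearly flat afterward.
J = {1: 400.0, 2: 150.0, 3: 96.0, 4: 95.5, 5: 95.0}
rCC = corrected_critical_ratios(J, N=100)
k_best = min(rCC, key=rCC.get)  # step 5 of Table 2: lowest rCC,k wins
print(k_best, rCC[k_best])      # -> 3, with a negative rCC value
```

A negative rCC,k, as obtained here for k = 3, indicates that the simplified model is expected to predict better (in the MSE sense) than the full model.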

3. CASE STUDY: PARAMETER SUBSET SELECTION AND ESTIMATION FOR THE DOW CHEMICAL MODEL

The Dow Chemical batch reactor model described by Biegler et al.30 consists of the differential and algebraic equations (DAEs) shown in Table 5. In these equations, [Q+] is the initial catalyst concentration (0.0131 mol/kg) and T0 is a reference temperature (340.15 K). The four response variables in the model include three measured concentrations and a fourth computed concentration:

y1 = x1 + ε1
y2 = x2 + ε2
y3 = x3 + ε3
y4 = y1^0 − y1 − y3 (10)

where y1^0 is the initial value of y1. Biegler et al. provide initial conditions and measured data from dynamic experiments conducted at 40, 67, and 100 °C, with some y1 measurements missing for the experimental run at 40 °C. The uncertainty associated with each measured concentration is assumed to be syi = 0.02 mol/kg. Numerical difficulties were frequently encountered in previous studies of this parameter estimation problem.30 As a


result, some researchers simplified the model by reducing the number of differential equations and the number of parameters.34−36 Careful selection of tolerances37 for the DAE solver "ode15s" and the optimizer "lsqnonlin" in Matlab was required so that the complete model could be used in the current study. Previously, Wu et al.12 applied the O + rCC method to this model using only the data at 40 °C and demonstrated that five of the nine model parameters should be estimated to obtain the best expected predictions. In Section 3.1, this parameter selection study is extended to include the data at 67 and 100 °C to study the effects of including additional data on the number of estimable parameters. These results are compared with leave-one-out cross-validation in Section 3.2. In Section 3.3, the proposed robustness test is applied, and in Sections 3.4 and 3.5, results obtained using Chu's method and SrCC are compared with those from O + rCC.

Table 5. Batch Reactor Model Equations30
dx1/dt = −k2 x2 x8
dx2/dt = −k1 x2 x6 + k−1 x10 − k2 x2 x8
dx3/dt = k2 x2 x8 + k1 x4 x6 − 0.5 k−1 x9
dx4/dt = −k1 x4 x6 + 0.5 k−1 x9
dx5/dt = k1 x2 x6 − k−1 x10
dx6/dt = −k1 x2 x6 + k−1 x10 − k1 x4 x6 + 0.5 k−1 x9
x7 = −[Q+] + x6 + x8 + x9 + x10
x8 = K2 x1/(K2 + x7)
x9 = K3 x3/(K3 + x7)
x10 = K1 x5/(K1 + x7)
k1 = k10 exp[(−E1/R)(1/T − 1/T0)]
k2 = k20 exp[(−E2/R)(1/T − 1/T0)]
k−1 = k−10 exp[(−E−1/R)(1/T − 1/T0)]

Table 6. Initial Parameter Guesses, Uncertainty Factors, and Rankings Obtained Using the Orthogonalization Algorithm in Table 1

parameter | initial guess | uncertainty sθj0 | rank (67 °C data) | rank (all data)
k10 (kg mol−1 h−1) | 1.0 | 0.2 | 3 | 6
E1 (cal mol−1) | 1 × 10^4 | 2 × 10^3 | − | 4
k20 (kg mol−1 h−1) | 2.0 | 0.3 | 2 | 5
E2 (cal mol−1) | 1 × 10^4 | 1 × 10^4 | − | 1
k−10 (h−1) | 2 × 10^3 | 5 × 10^2 | 4 | 7
E−1 (cal mol−1) | 1 × 10^4 | 4 × 10^3 | − | 3
K1 (mol kg−1) | 3 × 10^−16 | 2 × 10^−16 | 1 | 2
K2 (mol kg−1) | 5 × 10^−14 | 5 × 10^−15 | 5 | 8
K3 (mol kg−1) | 2 × 10^−16 | 1 × 10^−17 | 6 | 9

3.1. Parameter Subset Selection Using Orthogonalization and Wu's MSE Criterion (O + rCC). The experimental data are divided into two cases for analysis: (1) only the mid-temperature (67 °C) data are assumed to be available for estimation, and (2) all of the data (40, 67, and 100 °C) are available for estimation. When only the mid-temperature data set is used, the three activation energies are eliminated from the model, because the reference temperature, T0 = 340.15 K, is the same as the reactor temperature, T (see the expressions for k1, k2, and k−1 in Table 5); these parameters are therefore not ranked for the 67 °C case. Table 6 shows the initial parameter guesses and the associated uncertainties used for parameter ranking.12 The parameter rankings obtained using the orthogonalization algorithm in Table 1 are also shown in Table 6. When only the mid-temperature data are used for parameter estimation, K1 ranks first. When all of the data are used for parameter estimation, E2 and K1 rank at the top of the list due to their large influence on model predictions and the large uncertainties in their initial guesses. Parameter K3 ranks at the bottom of the list in both cases because of its small influence, its precise initial value (i.e., small initial uncertainty), and its correlation with parameters that appear higher in the list.

On the basis of each ranked list, a series of SMs was formulated. First, only the top-ranked parameter was estimated, and the other parameters were left at their initial guesses. Next, the top two parameters were estimated, followed by the top three parameters, and so on, until all of the model parameters were estimated. When problems with local optima were encountered, parameter estimation was restarted with updated parameter estimates obtained from the previous SM. As expected, estimating additional parameters resulted in an improved fit of the data, as shown by the trends in the objective functions plotted in Figures 1 and 2. An improvement in the objective function, J (see eq 4), is observed for up to three parameters estimated using the 67 °C data and up to six parameters estimated using all of the data. When additional parameters are estimated, there is negligible improvement in the fit. The algorithm in Table 2 was then used to determine the appropriate number of parameters to estimate from the ranked
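The right-hand side of the Table 5 equations can be transcribed directly. The sketch below is an illustration only (not the Matlab "ode15s" implementation used in the paper): it handles the coupled algebraic states x7−x10 with a simple fixed-point loop rather than a proper implicit DAE solve, and the parameter vector uses the nominal guesses listed in Table 6:

```python
import numpy as np

R_GAS = 1.987    # gas constant, cal mol^-1 K^-1
T0 = 340.15      # reference temperature, K
Q_PLUS = 0.0131  # initial catalyst concentration [Q+], mol/kg

def dow_rhs(x, T, theta):
    """Right-hand side of the six ODEs in Table 5.
    theta = (k10, E1, k20, E2, km10, Em1, K1, K2, K3)."""
    k10, E1, k20, E2, km10, Em1, K1, K2, K3 = theta
    # Arrhenius expressions for k1, k2, k-1 (Table 5)
    arrh = lambda k0, E: k0 * np.exp((-E / R_GAS) * (1.0 / T - 1.0 / T0))
    k1, k2, km1 = arrh(k10, E1), arrh(k20, E2), arrh(km10, Em1)
    x1, x2, x3, x4, x5, x6 = x
    # Algebraic states: x7 depends on x8-x10, which depend on x7.
    # A real implementation solves this implicitly; here we just iterate.
    x7 = 0.0
    for _ in range(100):
        x8 = K2 * x1 / (K2 + x7)
        x9 = K3 * x3 / (K3 + x7)
        x10 = K1 * x5 / (K1 + x7)
        x7 = -Q_PLUS + x6 + x8 + x9 + x10
    return np.array([
        -k2 * x2 * x8,
        -k1 * x2 * x6 + km1 * x10 - k2 * x2 * x8,
        k2 * x2 * x8 + k1 * x4 * x6 - 0.5 * km1 * x9,
        -k1 * x4 * x6 + 0.5 * km1 * x9,
        k1 * x2 * x6 - km1 * x10,
        -k1 * x2 * x6 + km1 * x10 - k1 * x4 * x6 + 0.5 * km1 * x9,
    ])

# Nominal parameter guesses (Table 6)
theta0 = (1.0, 1e4, 2.0, 1e4, 2e3, 1e4, 3e-16, 5e-14, 2e-16)
```

At T = T0 the Arrhenius factors reduce to the pre-exponential values, which is why the activation energies drop out of the model for the 67 °C data set.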

Figure 1. Effect of the number of parameters estimated on the objective function J and the rCC,k values obtained using the algorithm in Table 2 with only the 67 °C data. Note the break and the different scales used on the vertical axis for rCC,k.


Figure 2. Effect of the number of parameters estimated on the objective function J and rCC,k values obtained using the algorithm in Table 2 with all of the data at 40, 67, and 100 °C. Note the break and different scales used on the vertical axis for rCC,k.

Figure 3. Comparison of model predictions obtained by estimating the seven top-ranked parameters (using all of the data) with the measured data obtained at 40 °C. The remaining two parameters were fixed at nominal values specified in Table 6. Solid lines represent model predictions.

⎛ y − y ̂ (θ ) ⎞ 2 q qk ⎟ CVk(θ ) = ∑ ⎜⎜ ⎟ s yq ⎠ q=1 ⎝

lists. The corresponding rCC,k values, shown in Figures 1 and 2, were calculated from the objective function values. With only the 67 °C data, the lowest value of the criterion occurs when the top three parameters are estimated. When all of the data are used for parameter estimation, rCC,k is smallest at k = 7. Thus, estimating three and seven parameters, respectively, should give the best predictions with the lowest MSE. That is, with all the data available, parameters E2, K1, E−1, E1, k20, k10, and k−10 should be estimated; if only 67 °C data are available, parameters K1, k20, and k10 should be estimated and the other parameters should remain at their initial values specified in Table 6. As expected, including additional data at different temperatures enabled the estimation of additional parameters. These results agree with the work of Schittkowski20 who selected seven parameters for estimation in this model using a backward selection eigenvalue method. Note that Schittkowski used different assumptions about the initial parameter guesses and scaling factors and that Schittkowski’s eigenvalue method relies on an arbitrary cutoff to categorize parameters as estimable or inestimable. Model predictions at 40 °C, resulting from estimation of the top seven parameters (with K2 and K3 held fixed at the initial values specified in Table 6) are shown in Figure 3 along with the experimental data. The model predictions match the data very well. Similar good agreement was obtained between predictions and data at 67 and 100 °C (not shown). 3.2. Leave-One-out Cross-Validation. The results obtained using the O + rCC technique were verified using leave-one-out cross-validation, which is a computationally intensive technique used in model selection and validation.38 Prior to the development of Wu’s MSE-based technique, crossvalidation was used by Thompson et al.10 to select the appropriate number of parameters to estimate from a ranked list of parameters in a polyethylene model. 
The method used by Thompson et al. involved leaving out one data point at a time and using the remaining N − 1 data points for parameter estimation. The adjusted parameter values were then used to estimate the value of the withheld data point. This procedure was repeated for several data points to compute a cross-validation objective function. If all of the data points are used for cross-validation, this objective function is

CVk = ∑q=1…N (yq − ŷqk)²  (11)

where yq is the value of the withheld measurement, and ŷqk is the corresponding estimated value. The lowest CVk value corresponds to the k-parameter SM that is expected to give the best predictions. In the current study, this procedure was applied using the 63 experimental points in the 67 °C data set and the 256 data points from all three data sets. The resulting cross-validation objective function values obtained using all of the data are shown in Figure 4. The cross-validation technique

Figure 4. Effect of the number of parameters estimated on the cross-validation objective function values obtained using eq 11 with the data collected at 40, 67, and 100 °C. Note the break and different scales used on the vertical axis.

confirms that seven parameters should be selected for estimation, which is the same result that was obtained using the O + rCC technique. CVk decreases as k increases from one to seven, as the predictive ability of the model improves with additional parameters included in the estimation. However, CVk increases beyond seven parameters, indicating a decline in the predictive ability of the model. With only the 67 °C data, cross-validation confirms that three parameters should be selected for estimation (not shown).

dx.doi.org/10.1021/ie202352f | Ind. Eng. Chem. Res. 2012, 51, 6105−6115

Industrial & Engineering Chemistry Research

Article

3.3. Application of Proposed Robustness Test. A modeler may not be confident in the values of the initial guesses and scaling factors that are required when using the O + rCC technique. Different values for the uncertainties and initial guesses will result in different values for the scaled sensitivity coefficients in Z and possibly in different parameters selected for estimation. To assess the robustness of the results in section 3.1, the O + rCC technique was repeated 100 times. The 100 sets of initial parameter guesses were chosen randomly from a uniform distribution for each of the 9 parameters, with lower and upper parameter bounds set at θj0 ± sθj0 (see Table 6 for nominal parameter values and uncertainty factors). Note that a different distribution could be assumed, e.g., a triangular or normal distribution, depending on the prior beliefs of the modeler. Next, 100 ranked lists were obtained by repeating the orthogonalization technique. The frequency of the rank for each parameter (assuming that data are available at all three temperatures) is shown in the shaded box diagram in Figure 5.39 Along the vertical and horizontal axes of this diagram, the

parameters are listed according to their rank from Table 6. The degree of shading for each cell depends on the frequency of that rank for the parameter. For example, parameter E2, which is ranked first in Table 6, ranks as the top parameter using 93 out of 100 sets of initial parameter guesses. Parameters K1, E−1, K2, and K3 all retain their original ranks more than 50% of the time, whereas the ranks for the other parameters are more variable.

Figure 5. Frequency of parameter rankings obtained using the algorithm in Table 1 with all of the data for 100 random initial parameter guesses.

Wu's MSE-based criterion was then applied 100 times to determine the optimal number of parameters to estimate for each set of initial guesses. The frequencies of the optimal number of parameters selected for estimation are shown in Figure 6. The seven parameters selected in section 3.1 were selected for estimation nearly 80% of the time. Note that different results would be obtained using different models, scaling factors, and data. For example, when the same robustness procedure with 100 initial guesses was performed using only the 67 °C data, the three-parameter subset from section 3.1 was selected 19 out of 100 times, and a four-parameter subset was selected 71 out of 100 times. This type of information could be valuable to a modeler who wants to obtain the best possible model predictions when data are limited.

Figure 6. Frequency of the number of parameters selected using the algorithm in Table 2 with all of the data and 100 random initial parameter guesses.

3.4. Application of Chu's MSE-Based Method. The algorithm in Table 3 was applied to the batch reactor model in Table 5 using the data collected at 67 °C and also the data collected at all temperatures. First, 100 sets of possible true parameter values were selected from the uncertainty ranges θj0 ± sθj0 corresponding to the values in Table 6, assuming a uniform distribution. This procedure was then repeated with 10⁴ sets of possible parameter values, as recommended by Chu et al.9 The parameters were ranked according to the average bias terms calculated at each step. In Table 7, the resulting ranked lists obtained using 10⁴ sets of possible parameter values are compared with the ranked lists obtained in section 3.1. The number of parameters selected using each method is denoted by an asterisk.

Table 7. Comparison of Parameter Rankings and Subsets Selected Using the Orthogonalization Technique in Table 1, Wu's MSE-Based Technique in Table 2, and the Method of Chu et al.9 from Table 3 Using 10 000 Sets of Possible Parameter Values

                 O + rCC technique                      Chu's MSE technique
parameter   rank (67 °C data)  rank (all data)    rank (67 °C data)  rank (all data)
k10                3*                6                    3                 5
E1                 —                 4                    —                 6
k20                2                 5                    4                 7
E2                 —                 1                    —                 1
k−10               4                 7*                   2                 4
E−1                —                 3                    —                 3
K1                 1                 2                    1                 2
K2                 5                 8                    5                 8
K3                 6                 9                    6*                9*

A dash indicates a parameter that does not appear in the isothermal model used for the 67 °C data.

Using the settings for the 67 °C data, the ranked lists and selected parameters were identical using 100 and 10⁴ sets of possible true values. Using the settings for all of the data, the parameters that ranked fourth−sixth changed positions when 100 sets were used compared with 10⁴ sets, but the same subset was selected. The rankings obtained using the O + rCC technique and Chu's MSE method were similar. Using the data at 40, 67, and 100 °C, the top-three and the bottom-two parameters were identical using both methods. Only the parameters that ranked fourth−seventh showed slight differences in rank. Note that these parameters also showed the largest variability using the proposed robustness verification method in section 3.3 (see Figure 5). Unlike the O + rCC method, Chu's method indicates


that all of the parameters should be selected for estimation using either the 67 °C data alone or the data from all temperatures. This discrepancy may be due to the different ways that the two methods approximate the bias. The O + rCC technique uses experimental data and weighted-least-squares regression (see eq 4); Chu's method assumes a uniform distribution of potential true parameter values and calculates an "average" bias from a random sample (see step 4 in Table 3). As a result, the O + rCC technique can make use of information from the data set, whereas Chu's method must rely on assumptions about potential true values. This example illustrates that the trade-off between bias and variance estimated using Chu's method may be very different from the trade-off determined using information from the data. Unlike the two-step O + rCC method, Chu's method combines parameter ranking and selection in a single algorithm. As described in section 2.4, Wu's MSE-based parameter selection method can be extended to simultaneously rank and select parameters using the corrected critical ratio rCC. This new approach is tested using the batch reactor model in the following section.

3.5. Application of Proposed MSE-Based Ranking Method (SrCC). The new MSE-based ranking method outlined in Table 4, which simultaneously ranks and selects parameters, was tested using the model and data from Biegler et al.30 Since the isothermal model that predicts the 67 °C data set contains six parameters, each of these six parameters was first estimated individually with the remaining five held fixed. Estimation of k20 resulted in the lowest rCC,1 value of 16.81, and thus k20 was selected as the top-ranked parameter, as shown in Table 8. Next, k20 was estimated along with each of the remaining five parameters. The parameter pair that gave the lowest value of rCC,2 = 1.24 was k20 and k10. The minimum value of rCC,3 = −0.042 was obtained when parameters k20, k10, and K1 were estimated. Note that the negative value of rCC,3 indicates that the SM with three parameters should give better predictions, in terms of MSE, than the model with all six parameters estimated. When the best four parameters were estimated, rCC,4 was higher than rCC,3. This result indicates that estimating three parameters (k20, k10, and K1), with the remaining three parameters held fixed at their initial values specified in Table 6, is preferable to estimating four. This method was repeated using the data at all temperatures, and the seven-parameter subset containing E2, K1, E−1, E1, k20, k10, and k−10 was found to give the lowest rCC,k value, as shown in Table 8. Note that the subsets selected using this new MSE-based method are identical to the subsets selected using the O + rCC technique but disagree with results from Chu's MSE method. This new MSE-based method and O + rCC use information from the data when assessing the trade-off between bias and variance, whereas Chu's method does not.

Table 8. Comparison of Parameter Rankings and Subsets Selected Using the Orthogonalization Method with Wu's MSE-Based Technique and the New MSE-Based Ranking Method in Table 4

                 O + rCC technique                      SrCC
parameter   rank (67 °C data)  rank (all data)    rank (67 °C data)  rank (all data)
k10                3*                6                    2                 2
E1                 —                 4                    —                 4
k20                2                 5                    1                 1
E2                 —                 1                    —                 7*
k−10               4                 7*                   4                 5
E−1                —                 3                    —                 3
K1                 1                 2                    3*                6
K2                 5                 8                    6                 9
K3                 6                 9                    5                 8

A dash indicates a parameter that does not appear in the isothermal model used for the 67 °C data.

Table 8 shows that the rankings of some parameters differed between the O + rCC technique and the SrCC technique. For example, using the 67 °C data, parameter K1 ranked at the top of the list using the O + rCC technique due to its large influence on model predictions and large specified uncertainty. This ranking was obtained prior to parameter estimation using the data. However, using the SrCC technique, changing K1 from its initial guess did not have as much impact on rCC,1 as adjusting k20. As a result, the SrCC technique selected k20 as the top-ranked parameter. Similarly, when using all the data, the SrCC technique first selected parameter k20 instead of E2. Note that with this forward-selection method, the estimated value of the selected parameter in the 1-parameter subset depends on the initial values of the other parameters. Thus, this method is also prone to problems arising from poor initial guesses. To assess the robustness of this new MSE-based method, the procedure was repeated using 40 sets of initial guesses selected randomly from the uncertainty ranges given in Table 6, using a uniform distribution. As shown in Figures 7 and 8, there was more variation among parameters ranked two−six, but the same seven parameters were selected for estimation 65% of the time.

Figure 7. Frequency of parameter rankings using the new MSE-based ranking method in Table 4 with all of the data for 40 random initial parameter guesses.

A shortcoming of the SrCC method (and also of Chu's method and O + rCC) is that it selects parameters for estimation considering the trade-off between bias and variance for predictions made at the experimental points used for parameter estimation. A modeler will usually be interested in using the model to obtain predictions at different operating conditions, rather than at conditions where data are already available. In the future, it is recommended that these robust parameter selection techniques be extended to select parameters for estimation that will provide good predictions at desired operating conditions.
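The greedy search at the core of the SrCC procedure can be sketched as follows. The `criterion` callable is a hypothetical stand-in: in the paper, each evaluation corresponds to a full weighted-least-squares parameter estimation followed by computation of rCC,k, which is why the method requires on the order of p(p + 1)/2 − 1 estimations per set of initial guesses. The toy criterion below is invented purely to exercise the loop.

```python
def forward_select(params, criterion):
    # Greedy forward selection in the spirit of the SrCC method: at each
    # step, tentatively add each remaining parameter, keep the addition
    # giving the lowest criterion value, and stop once the criterion no
    # longer improves.
    selected, best_val = [], float("inf")
    remaining = list(params)
    while remaining:
        trials = {p: criterion(tuple(selected + [p])) for p in remaining}
        best_p = min(trials, key=trials.get)
        if trials[best_p] >= best_val:  # no improvement: stop growing the subset
            break
        selected.append(best_p)
        best_val = trials[best_p]
        remaining.remove(best_p)
    return selected, best_val

# Hypothetical criterion for illustration only (NOT the batch reactor rCC):
# a fixed "bias reduction" per parameter plus a variance-like penalty on
# subset size.
gains = {"a": -5.0, "b": -3.0, "c": 1.0}
toy_criterion = lambda subset: 10.0 + sum(gains[p] for p in subset) + len(subset)
subset, value = forward_select(["a", "b", "c"], toy_criterion)  # -> ["a", "b"], 4.0
```

Because the subset is grown greedily from estimates that depend on the fixed values of the not-yet-selected parameters, the sketch shares the sensitivity to initial guesses noted above.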


This new MSE-based method, which requires the most parameter estimation steps, is the most computationally expensive of the three techniques tested. The computational requirements of the three different robust MSE-based techniques are compared in the next section.

Figure 8. Frequency of the number of parameters selected using the new MSE-based ranking method in Table 4 with all of the data for 40 random initial parameter guesses.

3.6. Comparison of Computational Effort. The computational efforts required for the various methods are summarized in Table 9. Computing times are compared when using only the 67 °C data (6 parameters and 63 data points) and when using all of the data (9 parameters and 256 data points).

Table 9. Comparison of Computational Effort for the Parameter Selection Techniques^a

technique                                                           67 °C data    all data
O + rCC                                                                123           840
leave-one-out cross-validation                                        9345        268756
proposed robustness test with 100 sets of initial guesses            11439         76140
Chu's MSE-based method, 100 sets of possible parameter values          164           712
Chu's MSE-based method, 10 000 sets of possible parameter values     16368         71243
SrCC, original parameter guesses                                       248          2913
SrCC, 40 sets of initial guesses                                         —         86406
orthogonalization technique with 100 sets of initial guesses
  and Wu's MSE-based selection technique applied to 1 ranking          279          1512

^a Computation times are reported in seconds.

These computation times were determined using the tic and toc functions in MATLAB with an HP Pavilion dv6700 notebook computer. As shown in the first row of Table 9, the computation time required for the O + rCC technique is approximately 2 min when using the 67 °C data and 14 min when using all of the data. This computation time is much shorter than the time required for leave-one-out cross-validation, which takes approximately 2.6 and 75 h, respectively. Completing the proposed robustness test with 100 sets of initial parameter guesses requires 76 140 s using all of the data, which is approximately 21 h of computation time. When 100 sets of possible parameter values are considered, Chu's MSE method requires a similar amount of computational effort to the O + rCC technique. Using 10⁴ sets of possible parameter values, as suggested by Chu et al., requires approximately 100 times the effort. Note that Chu's method does not involve parameter estimation, so obtaining parameter values will require additional computation time. The SrCC technique, which uses Wu's MSE criterion to simultaneously rank and select parameters, involves p(p + 1)/2 − 1 parameter estimations for each set of initial guesses. With the original set of initial guesses, this method requires 248 s to select parameters using the 67 °C data set, which is about twice the effort required for the O + rCC technique. With all of the data, the computation time is approximately 3.5 times greater than the time required for the O + rCC technique. Approximately 24 h of computation time are required to repeat the SrCC analysis to check robustness using 40 different sets of initial parameter guesses and all of the data. For comparison, an additional study was performed wherein the orthogonalization algorithm in Table 1 was repeated with 100 different initial parameter guesses to confirm that the ranking obtained with the initial guesses was the one obtained most frequently. Next, Wu's MSE-based selection technique was performed using the initial set of parameter guesses and the most frequent ranking. As shown in Table 9, the computation time is much lower (by a factor of 50) than for the proposed robustness test from section 3.3. This final strategy is recommended to modelers who want to assess the sensitivity of the parameter ranking to assumptions about initial guesses. If desired, the rCC step could then be repeated for a few initial guesses to check the robustness of the size of the parameter subset. This method would not require as many time-consuming parameter estimation steps as the full robustness technique described in section 3.3 or the repeated SrCC technique.

4. DISCUSSION

In this article, three MSE-based techniques are applied to a batch reactor model and their computational requirements are compared. The method of Wu et al.12 selects seven out of nine parameters for estimation using data from experiments at multiple temperatures. As expected, fewer parameters (only three) are selected using only the data obtained at a single temperature. These results are consistent with those from leave-one-out cross-validation and with the results from the new MSE-based ranking method that uses Wu's corrected critical ratio rCC for simultaneous parameter ranking and selection.26 The method of Chu et al., wherein bias is approximated without using experimental data, gives markedly different results (i.e., all nine parameters should be estimated using all of the data and six parameters should be estimated using the data at a single temperature). The computational requirements of the three parameter selection techniques varied, with the proposed SrCC method, which involves many parameter estimations, requiring more computational effort than Chu's method or Wu's method, but significantly less computational effort than leave-one-out cross-validation. For modelers who seek an efficient method to select parameters for estimation but have concerns about their choice of initial guesses and/or parameter uncertainty factors, we recommend that they repeat the estimability ranking procedure (orthogonalization algorithm in Table 1) several times. If a consistent parameter ranking is obtained, then the MSE-based method of Wu et al. can be applied to several sets of initial guesses that resulted in this ranking. The parameter subset that is obtained most frequently should then be selected for estimation.
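The Monte Carlo robustness check used in section 3.3 and recommended here can be sketched as follows. The `rank_fn` callable is a hypothetical stand-in for the orthogonalization-based ranking, which depends on the initial guesses through the scaled sensitivity matrix Z; a trivial magnitude-based ranking replaces it for illustration, and positive nominal parameter values are assumed.

```python
import numpy as np
from collections import Counter

def robustness_test(theta0, s, rank_fn, n_trials=100, seed=0):
    # Draw initial guesses uniformly from theta0_j +/- s_j * theta0_j
    # (assumes positive nominal values), repeat the ranking procedure for
    # each draw, and tally how often each ranked list occurs.
    rng = np.random.default_rng(seed)
    theta0 = np.asarray(theta0, dtype=float)
    lo, hi = theta0 - s * theta0, theta0 + s * theta0
    tally = Counter()
    for _ in range(n_trials):
        guess = rng.uniform(lo, hi)                       # one random guess set
        tally[tuple(int(i) for i in rank_fn(guess))] += 1 # rank for this guess
    return tally

# Stand-in ranking: order parameter indices by decreasing |guess|.
rank_by_magnitude = lambda g: np.argsort(-np.abs(g))
tally = robustness_test([5.0, 2.0, 1.0], 0.1, rank_by_magnitude)
```

With well-separated nominal values and a 10% uncertainty factor, every draw here yields the same ranked list; a more sensitive ranking would spread the tally over several lists, as in Figures 5 and 7.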

5. CONCLUSIONS

Chemical engineers who build fundamental nonlinear models often encounter numerical difficulties when trying to estimate all of their model parameters when the number of parameters is large and when data are limited. The modeler may know very little about appropriate values for some model parameters and may have precise prior knowledge about other parameter values. When presented with new data, it is important for the modeler to know which parameters should be estimated and which should remain at their initial values so that the best model predictions can be obtained. Several MSE-based methods have been developed to aid modelers in selecting the appropriate parameter subset to estimate. Unfortunately, uncertain initial parameter guesses can result in misleading conclusions from subset selection methods that rely on parametric sensitivity coefficients. A differential algebraic equation model for a batch reactor is used to illustrate a simple Monte Carlo technique to test the robustness of the selected parameter subset to uncertainty in initial parameter guesses. A new ranking and selection method is also developed based on an MSE criterion and is compared with existing techniques. The three robust MSE-based selection techniques compared in this article all determine which parameters should be estimated to obtain the best predictions at the experimental points used for parameter estimation. Often, model users want to make accurate predictions at operating points that are different from those where measurements are available. In the future, MSE-based parameter selection techniques should be extended to select parameters that will give the best predictions at the operating conditions that are most important to the model user.

AUTHOR INFORMATION

Corresponding Author
*E-mail: [email protected]. Phone: +1(613) 533 2768. Fax: +1(613) 533 6637.

Present Addresses
† Hatch, Systems & Process Control, Mississauga, Ontario, Canada L5K 2R7.
‡ Honeywell Aerospace Canada, Mississauga, Ontario, Canada L5L 3S6.

Notes
The authors declare no competing financial interest.

ACKNOWLEDGMENTS

The authors thank MITACS, Mprime, and Shell Global Solutions for financial support of this research.

NOMENCLATURE

d = number of response variables
f = right-hand side of nonlinear ordinary differential equation (ODE) model
g = response prediction from nonlinear ODE model
i = index for response variables
k = (1) index for number of candidate parameters selected; (2) reaction rate constant
l = index for measurements
m = index for experimental runs
n = number of measurements for each response variable
p = number of unknown parameters
q = index for data point in cross-validation
r = number of experimental runs
rC = critical ratio in MSE-based algorithm
rCC = corrected critical ratio in MSE-based algorithm
rCkub = truncated Kubokawa estimator
syi = uncertainty associated with measured response
sθj0 = uncertainty associated with initial parameter guess
t = time
u = input trajectories
v = number of input trajectories
w = number of state variables
x = state variables
y = response variables
ŷ = predicted responses using parameter estimates
E = activation energy
J = objective function in parameter estimation
N = total number of data points
O + rCC = parameter ranking and selection using orthogonalization and Wu's MSE criterion
Q+ = initial catalyst concentration
R = ideal gas constant
Rk = residual matrix in parameter ranking algorithm
S = unscaled sensitivity matrix
SM = simplified model
SrCC = simultaneous parameter ranking and subset selection method using the corrected critical ratio
T = (1) columns from S corresponding to sensitivity vectors for parameters that may be included in the estimation; (2) temperature
T0 = reference temperature
W = columns from S corresponding to sensitivity vectors for parameters that are not included in the estimation
X = subset of Z matrix corresponding to selected parameters
Z = scaled sensitivity matrix

Greek Symbols

ε = stochastic component
θ = unknown parameters
θ̄U = vector of nominal values for unselected parameters
θU* = vector of potential true values for unselected parameters
σ² = noise variance

REFERENCES

(1) Maria, G. A Review of Algorithms and Trends in Kinetic Model Identification for Chemical and Biochemical Systems. Chem. Biochem. Eng. Q. 2004, 18, 195.
(2) Vajda, S.; Rabitz, H.; Walter, E.; Lecourtier, Y. Qualitative and Quantitative Identifiability Analysis of Nonlinear Chemical Kinetic Models. Chem. Eng. Commun. 1989, 83, 191.
(3) Raue, A.; Kreutz, C.; Maiwald, T.; Bachmann, J.; Schilling, M.; Klingmüller, U.; Timmer, J. Structural and Practical Identifiability Analysis of Partially Observed Dynamical Models by Exploiting the Profile Likelihood. Bioinformatics 2009, 25, 1923.
(4) McLean, K. A. P.; McAuley, K. B. Mathematical Modelling of Chemical Processes − Obtaining the Best Model Predictions and Parameter Estimates using Identifiability and Estimability Procedures. Can. J. Chem. Eng. 2012, 90, 351−366.
(5) Maria, G. An Adaptive Strategy for Solving Kinetic Model Concomitant Estimation - Reduction Problems. Can. J. Chem. Eng. 1989, 67, 825.
(6) Maria, G.; Rippin, D. W. T. Note Concerning Two Techniques for Complex Kinetic Pathway Analysis. Chem. Eng. Sci. 1993, 48, 3855.
(7) Kou, B.; McAuley, K. B.; Hsu, J. C. C.; Bacon, D. W. Mathematical Model and Parameter Estimation for Gas-Phase Ethylene/Hexene Copolymerization with Metallocene Catalyst. Macromol. Mater. Eng. 2005, 290, 537.
(8) Kou, B.; McAuley, K. B.; Hsu, C. C.; Bacon, D. W.; Yao, K. Z. Mathematical Model and Parameter Estimation for Gas-Phase


Ethylene Homopolymerization with Supported Metallocene Catalyst. Ind. Eng. Chem. Res. 2005, 44, 2428.
(9) Chu, Y.; Huang, Z.; Hahn, J. Improving Prediction Capabilities of Complex Dynamic Models via Parameter Selection and Estimation. Chem. Eng. Sci. 2009, 64, 4178.
(10) Thompson, D. E.; McAuley, K. B.; McLellan, P. J. Parameter Estimation in a Simplified MWD Model for HDPE Produced by a Ziegler-Natta Catalyst. Macromol. React. Eng. 2009, 3, 160.
(11) Wu, S.; McAuley, K. B.; Harris, T. J. Selection of Simplified Models: II. Development of a Model Selection Criterion Based on Mean Squared Error. Can. J. Chem. Eng. 2011, 89, 325.
(12) Wu, S.; McLean, K. A. P.; Harris, T. J.; McAuley, K. B. Selection of Optimal Parameter Set Using Estimability Analysis and MSE-Based Model-Selection Criterion. Int. J. Adv. Mechatronic Syst. 2011, 3, 188.
(13) Weijers, S. R.; Vanrolleghem, P. A. A Procedure for Selecting Best Identifiable Parameters in Calibrating Activated Sludge Model No. 1 to Full-Scale Plant Data. Water Sci. Technol. 1997, 36, 69.
(14) Freni, G.; Mannina, G.; Viviani, G. Identifiability Analysis for Receiving Water Body Quality Modelling. Environ. Modell. Softw. 2009, 24, 54.
(15) Machado, V. C.; Tapia, G.; Gabriel, D.; Lafuente, J.; Baeza, J. A. Systematic Identifiability Study Based on the Fisher Information Matrix for Reducing the Number of Parameters Calibration of an Activated Sludge Model. Environ. Modell. Softw. 2009, 24, 1274.
(16) Brun, R.; Reichert, P.; Künsch, H. R. Practical Identifiability Analysis of Large Environmental Simulation Models. Water Resour. Res. 2001, 37, 1015.
(17) Brun, R.; Kühni, M.; Siegrist, H.; Gujer, W.; Reichert, P. Practical Identifiability of ASM2d Parameters − Systematic Selection and Tuning of Parameter Subsets. Water Res. 2002, 36, 4113.
(18) Chu, Y.; Hahn, J. Parameter Set Selection via Clustering of Parameters into Pairwise Indistinguishable Groups of Parameters. Ind. Eng. Chem. Res. 2009, 48, 6000.
(19) Degenring, D.; Froemel, C.; Dikta, G.; Takors, R. Sensitivity Analysis for the Reduction of Complex Metabolism Models. J. Process Control 2004, 14, 729.
(20) Schittkowski, K. Experimental Design Tools for Ordinary and Algebraic Differential Equations. Ind. Eng. Chem. Res. 2007, 46, 9137.
(21) Quaiser, T.; Mönnigmann, M. Systematic Identifiability Testing for Unambiguous Mechanistic Modeling − Application to JAK-STAT, MAP Kinase, and NF-κB Signaling Pathway Models. BMC Syst. Biol. 2009, 3, 50.
(22) Yao, K. Z.; Shaw, B. M.; Kou, B.; McAuley, K. B.; Bacon, D. W. Modeling Ethylene/Butene Copolymerization with Multi-Site Catalysts: Parameter Estimability and Experimental Design. Polym. React. Eng. 2003, 11, 563.
(23) Lund, B. F.; Foss, B. A. Parameter Ranking by Orthogonalization − Applied to Nonlinear Mechanistic Models. Automatica 2008, 44, 278.
(24) Jayasankar, B. R.; Ben-Zvi, A.; Huang, B. Identifiability and Estimability Study for a Dynamic Solid Oxide Fuel Cell Model. Comput. Chem. Eng. 2009, 33, 484.
(25) Littlejohns, J. V.; McAuley, K. B.; Daugulis, A. J. Model for a Solid-Liquid Stirred Tank Two-Phase Partitioning Bioscrubber for the Treatment of BTEX. J. Hazard. Mater. 2010, 175, 872.
(26) Wu, S.; Harris, T. J.; McAuley, K. B. The Use of Simplified or Misspecified Models: Linear Case. Can. J. Chem. Eng. 2007, 85, 386.
(27) Hocking, R. R. Analysis and Selection of Variables in Linear Regression. Biometrics 1976, 32, 1.
(28) Woloszyn, J. D.; McAuley, K. B. Application of Parameter Selection and Estimation Techniques in a Thermal Styrene Polymerization Model. Macromol. React. Eng. 2011, 5, 453.
(29) Karimi, H.; Schaffer, M. A.; McAuley, K. B. A Kinetic Model for Non-Oxidative Thermal Degradation of Nylon 66. Macromol. React. Eng. 2012, 6, 93−109.
(30) Biegler, L. T.; Damiano, J. J.; Blau, G. E. Nonlinear Parameter Estimation: A Case Study Comparison. AIChE J. 1986, 32, 29.
(31) Bates, D. M.; Watts, D. G. Nonlinear Regression Analysis and Its Applications; John Wiley & Sons: New York, 1988; pp 72−76 and 147−149.
(32) Saltelli, A.; Chan, K.; Scott, E. M. Sensitivity Analysis; John Wiley & Sons: New York, 2000; pp 15−50.
(33) Leis, J. R.; Kramer, M. A. The Simultaneous Solution and Sensitivity Analysis of Systems Described by Ordinary Differential Equations. ACM Trans. Math. Softw. 1988, 14, 45.
(34) Tjoa, I. B.; Biegler, L. T. Simultaneous Solution and Optimization Strategies for Parameter Estimation of Differential-Algebraic Equation Systems. Ind. Eng. Chem. Res. 1991, 30, 376.
(35) Guay, M.; McLean, D. D. Optimization and Sensitivity Analysis for Multiresponse Parameter Estimation in Systems of Ordinary Differential Equations. Comput. Chem. Eng. 1995, 19, 1271.
(36) Sulieman, H.; Kucuk, I.; McLellan, P. J. Parametric Sensitivity: a Case Study Comparison. Comput. Stat. Data Anal. 2009, 53, 2640.
(37) McLean, K. A. P. Obtaining the Best Model Predictions and Parameter Estimates using Limited Data. Master's Thesis, Queen's University, Kingston, 2011.
(38) Stone, M. Cross-validation Choice and Assessment of Statistical Predictions. J. Roy. Stat. Soc. B 1974, 36, 111.
(39) Stortelder, W. Parameter Estimation in Nonlinear Dynamical Systems. Ph.D. Thesis, CWI, Amsterdam, 1998; p 91.
