Ind. Eng. Chem. Res. 2006, 45, 4716-4725
Comparison of the Luus-Jaakola Optimization and Gauss-Newton Methods for Parameter Estimation in Ordinary Differential Equation Models

Praveen Linga, Nayef Al-Saifi, and Peter Englezos*

Department of Chemical and Biological Engineering, The University of British Columbia, 2360 East Mall, Vancouver, B.C. V6T 1Z3, Canada
The direct search optimization method of Luus and Jaakola (LJ) and the Gauss-Newton (GN) method are employed to solve four parameter estimation problems for chemical and biochemical processes described by ordinary differential equation models, and a comparison of the two solution methods is carried out. It was found that, with the LJ method, the impact of the initial guess values on the optimal parameters and the optimum is minimal. Gauss-Newton, however, was found to be sensitive to the initial guess. Moreover, GN encountered convergence problems, which were alleviated by the use of the Marquardt-Levenberg approach.

* To whom correspondence should be addressed. Tel.: 1-604-822-6184. Fax: 1-604-822-6003. E-mail: [email protected].

1. Introduction

Parameter estimation is essentially an optimization problem whereby the unknown parameters are obtained by minimizing a suitable objective function.1-4 A classic example of parameter estimation is the determination of kinetic parameters from a set of experimental data. The objective function is a suitable measure of the discrepancy between model-calculated values and a set of experimental data. Optimization methods that are broadly used to estimate unknown parameters fall into two categories: direct search methods and gradient methods. Gradient methods require derivatives of the objective function, whereas direct search methods are derivative free.

Among the gradient methods, the Gauss-Newton (GN) method has been found to be the most efficient for estimating parameters in nonlinear models.5 Implementing the GN method for systems described by ordinary differential equation (ODE) models requires the solution of state and sensitivity equations. For example, a model with 3 state equations and 7 parameters requires the simultaneous solution of the 3 model ODEs and the 3 × 7 = 21 sensitivity ODEs.1 Moreover, a partial differential equation (PDE) model that is discretized into M grid blocks to approximate the spatial derivatives yields a system of ODEs; in that case, the number of sensitivity ODEs corresponding to the single PDE is M × p for p parameters. Thus, the dimensionality increases dramatically.

With the availability of very fast computers, direct search optimization techniques are being used extensively to solve optimization problems of interest to chemical engineering. Some of the frequently used techniques are neural networks,6 genetic algorithms,7 simulated annealing,8 and Luus-Jaakola (LJ) optimization.9 Luus and Jaakola9 showed that their method could be used to solve difficult optimization problems with several local minima. The procedure uses random search points and systematic contraction of the search region. Luus10-12 introduced a multipass formulation of the procedure in which the region size is changed between passes according to the extent to which each variable changed during the previous pass. With this method, Luus et al.13 were able to solve a complex, high-dimensional cancer chemotherapy optimization problem that involved 84 variables and several state constraints. LJ optimization was also successfully employed to solve multiphase equilibrium calculations involving both phase and chemical equilibrium.14 The Gauss-Newton method has been successfully employed to solve the parameter estimation problems formulated for thermodynamic (activity coefficient or equation of state) models.1,15-19 These models are algebraic, and therefore there is no need to solve differential equations to obtain the sensitivity coefficients, as they can be obtained directly.1 It should be noted that the GN method should always incorporate the Marquardt-Levenberg modification to improve its convergence characteristics and robustness.1

The broad range of applicability of LJ and its demonstrated ability to converge from an enlarged region of convergence motivated us to compare it with the best gradient (indirect) method for parameter estimation, Gauss-Newton. Another important feature of the LJ optimization method is its ability to handle multiple optima reliably.20,21 Thus, the objective of the present study is to implement the LJ optimization procedure to determine parameter values in ordinary differential equation models and to compare its performance with the GN method and its modifications. This work is also relevant to partial differential equation (PDE) models, since they may be solved by the method of lines. The incentive to find an alternative to Gauss-Newton for ODE or PDE models is the increase in the dimensionality of the problem for systems with large numbers of parameters.

2. Formulation of Parameter Estimation Problem

Two important aspects to be considered in a parameter estimation problem are the type of mathematical model and the type of objective function to be minimized.1,22 The formulation of the parameter estimation problem is as important as the actual solution of the problem. Here, we focus on processes described by ordinary differential equation (ODE) models. The mathematical model is of the following form:
dx(t)/dt = f(x(t), u, k);  x(t0) = x0    (1)

y(t) = C x(t)    (2)
where k = [k1, k2, ..., kp]^T is a p-dimensional vector of parameters whose numerical values are unknown; x = [x1, x2, ..., xn]^T is an n-dimensional vector of state variables; x0 = [x10, x20, ..., xn0]^T is an n-dimensional vector of initial
conditions for the state variables, which are assumed to be known precisely; u = [u1, u2, ..., ur]^T is an r-dimensional vector of manipulated variables that are either set by the experimentalist, known precisely, or measured; f = [f1, f2, ..., fn]^T is an n-dimensional vector function of known form (the differential equations); y = [y1, y2, ..., ym]^T is the m-dimensional output vector, i.e., the set of variables that are measured experimentally; and C is the m × n observation matrix in y = Cx. It is noted that many chemical processes are described by partial differential equations (PDEs). Such models can normally be reduced to ordinary differential equations (ODEs) for parameter estimation purposes.1,22

The second important aspect in the formulation of the parameter estimation problem is the selection of the appropriate objective function. In general, the unknown parameter vector k is found by minimizing a scalar function often referred to as the objective function. The objective function is a measure of the overall departure of the model-calculated values from the measurements. A suitable objective function to be minimized is given by
S(k) = Σ(i=1 to N) [ŷi - y(ti,k)]^T Qi [ŷi - y(ti,k)]    (3)
where S(k) is the objective function, k = [k1, k2, ..., kp]^T is the p-dimensional vector of parameters whose values are unknown, ŷi are the experimentally measured output values, y(ti,k) are the model-calculated output values, Qi is an m × m user-specified weighting matrix, and N is the number of experimentally measured values for each run. The weighting matrix Qi can be chosen so that the parameter estimates have appropriate or desirable statistical properties.1
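For illustration, the objective function of eq 3 for a single experimental run can be evaluated as in the short sketch below, assuming the measured outputs ŷi and the model-calculated outputs y(ti,k) are stored row-wise in arrays; the function name and arguments are illustrative and are not part of the original work.

```python
# Minimal sketch of eq 3 for one run: y_meas[i] and y_model[i] are the measured and
# model-calculated m-dimensional output vectors at time t_i, and Q is the m x m
# user-specified weighting matrix (the identity matrix is used in this work).
import numpy as np

def objective(y_meas, y_model, Q):
    """Return S(k) = sum_i [y_i - y(t_i,k)]^T Q [y_i - y(t_i,k)]."""
    S = 0.0
    for yi, ymi in zip(np.atleast_2d(y_meas), np.atleast_2d(y_model)):
        r = yi - ymi                 # residual at the i-th sampling time
        S += float(r @ Q @ r)
    return S
```

With Qi chosen as the identity matrix, S(k) reduces to a plain sum of squared residuals.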
3. Gauss-Newton Method

Briefly, linearization of the output vector around k^(j) and retention of first-order terms yields

y(ti, k^(j+1)) = y(ti, k^(j)) + (∂y^T/∂k)i^T Δk^(j+1)    (4)
Assuming a linear relationship between the output vector and the state variables (y = Cx), the above equation becomes
y(ti, k^(j+1)) = C x(ti, k^(j)) + C (∂x^T/∂k)i^T Δk^(j+1)    (5)
The sensitivity matrix G(ti) ≡ (∂x^T/∂k)^T is obtained from eq 1 by differentiating both sides with respect to k:
(∂/∂k)(dx/dt) = (∂/∂k) f(x, u, k)    (6)
Reversing the order of differentiation on the left-hand side (LHS) and performing the implicit differentiation of the right-hand side (RHS), we obtain the following (n × p) ODEs
d/dt [(∂x^T/∂k)^T] = (∂f^T/∂x)^T (∂x^T/∂k)^T + (∂f^T/∂k)^T    (7)

or

dG(t)/dt = (∂f^T/∂x)^T G(t) + (∂f^T/∂k)^T;  G(t0) = 0    (8)

This is a matrix differential equation and represents a set of n × p ODEs. Solution of the above equations enables computation of y(ti, k^(j+1)). Substitution of the latter into the objective function and use of the stationary condition ∂S(k^(j+1))/∂k^(j+1) = 0 yields a linear equation for Δk^(j+1):

A Δk^(j+1) = b    (9)

where

A = Σ(i=1 to N) G^T(ti) C^T Qi C G(ti)    (10)

and

b = Σ(i=1 to N) G^T(ti) C^T Qi [ŷi - C x(ti, k^(j))]    (11)

Solution of the above equation yields Δk^(j+1), and hence, k^(j+1) is obtained from

k^(j+1) = k^(j) + µ Δk^(j+1)    (12)
where µ is a stepping parameter (0 < µ ≤ 1) determined by the bisection rule. Thus, a sequence of parameter estimates k^(1), k^(2), ... is generated, which converges to the optimum, k*, if the initial guess, k^(0), is sufficiently close. The iterative procedure stops when the convergence criterion for termination is satisfied. When the unknown parameters are of the same order of magnitude, a typical test for convergence is ||Δk^(j+1)|| < TOL, where TOL is the user-specified tolerance.

The solution of the ordinary differential eqs 1 and 8 is obtained by using the subroutine DIVPAG available in the IMSL libraries of FORTRAN Power Station 4. This routine solves an initial value problem for ordinary differential equations using either the Adams-Moulton or Gear's backward differentiation formula (BDF) method. For systems described by PDEs, a transformation into ODEs will result in a very large number of ODEs.1 Therefore, efficient computation schemes for the state and sensitivity equations are of paramount importance when using the GN method. One such scheme is based on sequential integration of the sensitivity coefficients. Leis and Kramer23 presented implementation guidelines and an error control strategy that ensures the independent satisfaction of local error criteria by the numerical solutions of both the model and the sensitivity equations. This procedure has been adapted and successfully used for automatic history matching (i.e., parameter estimation) in reservoir engineering.24
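To make the iteration of eqs 4-12 concrete, the following sketch outlines one possible Gauss-Newton implementation for an ODE model with y = Cx. It is not the authors' FORTRAN/DIVPAG code: the output sensitivities are approximated here by forward finite differences instead of integrating eq 8, the integrator is SciPy's solve_ivp, and the optional gamma term is the Marquardt-Levenberg damping mentioned above; all names and numerical settings are illustrative.

```python
# Hedged Gauss-Newton sketch for eqs 9-12 (y = C x).  f(t, x, k) is the model
# right-hand side of eq 1; t_meas are the sampling times (starting at t0); y_meas
# is the N x m array of measurements; Q is the m x m weighting matrix.
import numpy as np
from scipy.integrate import solve_ivp

def model_output(f, k, x0, t_meas, C):
    """Integrate dx/dt = f(t, x, k) and return y(t_i, k) = C x(t_i, k), shape (N, m)."""
    sol = solve_ivp(lambda t, x: f(t, x, k), (t_meas[0], t_meas[-1]), x0,
                    t_eval=t_meas, method="LSODA", rtol=1e-8, atol=1e-10)
    return (C @ sol.y).T

def gauss_newton(f, k0, x0, t_meas, y_meas, C, Q=None, gamma=0.0,
                 tol=1e-6, max_iter=50, rel_step=1e-6):
    k = np.asarray(k0, dtype=float)
    y_meas = np.asarray(y_meas, dtype=float)
    N, m = y_meas.shape
    Q = np.eye(m) if Q is None else Q
    S = lambda y_mod: sum(float((yd - ym) @ Q @ (yd - ym))
                          for yd, ym in zip(y_meas, y_mod))      # eq 3
    for _ in range(max_iter):
        y_mod = model_output(f, k, x0, t_meas, C)
        # Output sensitivities C G(t_i) by forward differences (stand-in for eq 8).
        J = np.empty((N, m, k.size))
        for p in range(k.size):
            kp = k.copy()
            kp[p] += rel_step * max(1.0, abs(k[p]))
            J[:, :, p] = (model_output(f, kp, x0, t_meas, C) - y_mod) / (kp[p] - k[p])
        # Normal equations A dk = b (eqs 9-11), with Marquardt-Levenberg damping gamma.
        A = sum(J[i].T @ Q @ J[i] for i in range(N))
        b = sum(J[i].T @ Q @ (y_meas[i] - y_mod[i]) for i in range(N))
        dk = np.linalg.solve(A + gamma * np.eye(k.size), b)
        # Bisection rule on the stepping parameter mu of eq 12.
        S_old, mu = S(y_mod), 1.0
        while mu > 1e-8 and S(model_output(f, k + mu * dk, x0, t_meas, C)) >= S_old:
            mu *= 0.5
        k = k + mu * dk
        if np.linalg.norm(mu * dk) < tol:      # ||dk^(j+1)|| < TOL test
            break
    return k
```

The approach described in this section instead integrates the n × p sensitivity ODEs of eq 8 together with the state equations, which is more efficient and more accurate than the finite-difference stand-in used in this sketch.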
4. LJ Optimization Procedure

The LJ optimization procedure for solving parameter estimation problems has been described by Luus.9,10 We consider the dynamic systems described by eqs 1 and 2, where the unknown parameters are to be found by minimizing the objective function given by eq 3. A schematic flowchart for implementing the LJ optimization method for estimating the unknown p-dimensional vector k is shown in Figure 1.

The first step is to input the initial values for the p-dimensional parameter vector k and the region size described by the p-dimensional vector r. The procedure contains two main loops. The outer loop is over the number of passes, and the inner loop is over the number of iterations per pass. Inside the iteration loop there is another loop over the number of randomly chosen points (Psets), which generates p-dimensional uniformly distributed random numbers between -0.5 and +0.5 using a random number generator. In each iteration, the ordinary differential equations of the model are solved simultaneously to obtain the solution for the output vector. The objective function is evaluated using eq 3 with the unit weighting matrix, and the parameter vector that yields the minimum objective function is saved. Between iterations, the region size is contracted using a contraction factor ξ = 0.95. Between passes, the region size is changed according to the extent of the change in each variable with respect to the previous pass, as shown in the flowchart. To prevent the region size from becoming zero, if the region size is less than the tolerance ε, then the region size is set equal to ε; in the present work, ε = 10^-6. The program continues until the specified number of passes is reached. LJ does not compute confidence interval levels for the calculated optimum parameter values, but these values can be used in the GN method for one iteration to obtain the confidence intervals on the optimum parameters. The solution of the ordinary differential equations is obtained by using the subroutine DIVPAG available in the IMSL libraries of FORTRAN Power Station 4. This routine solves an initial value problem for ordinary differential equations using either the Adams-Moulton or Gear's backward differentiation formula (BDF) method.

Figure 1. Schematic of LJ optimization procedure for parameter estimation.
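The loop structure of Figure 1 can be summarized in the sketch below, which assumes an obj(k) function that integrates the model ODEs and returns S(k) of eq 3 with the unit weighting matrix (for example, built from the objective and model_output helpers sketched above). The function name, the default Psets = 30, and the random number generator are illustrative assumptions, not the authors' FORTRAN implementation.

```python
# Hedged sketch of the multipass LJ random search of Figure 1.
import numpy as np

def luus_jaakola(obj, k0, r0, n_pass=50, n_iter=100, psets=30,
                 xi=0.95, eps=1e-6, seed=0):
    rng = np.random.default_rng(seed)
    k_best = np.asarray(k0, dtype=float)
    r = np.asarray(r0, dtype=float)           # p-dimensional region size vector
    S_best = obj(k_best)
    for _ in range(n_pass):
        k_pass_start = k_best.copy()
        for _ in range(n_iter):
            # Psets trial points: uniform random numbers in [-0.5, 0.5] scaled by r.
            trials = k_best + rng.uniform(-0.5, 0.5, (psets, k_best.size)) * r
            for k_try in trials:
                S_try = obj(k_try)
                if S_try < S_best:            # keep the best point found so far
                    S_best, k_best = S_try, k_try.copy()
            r = xi * r                        # contract the region after each iteration
        # Between passes, set the region from the change in each parameter,
        # with the floor eps that keeps the region size from collapsing to zero.
        r = np.maximum(np.abs(k_best - k_pass_start), eps)
    return k_best, S_best
```

A convergence check between passes, as used later in Section 6.1, can be added by terminating when S_best stops changing within a specified tolerance.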
5. Illustrative Examples

We considered four examples to illustrate the use of the LJ and GN methods and to compare their performance.

5.1. Hydrogenation of HPA (Example 1). The first example is the hydrogenation of 3-hydroxypropanal (HPA) to 1,3-propanediol (PD) over Ni/SiO2/Al2O3 catalyst powder, studied by Zhu et al.25 PD is a potentially attractive monomer for polymers such as poly(propylene terephthalate). They used a batch stirred autoclave. The experimental data consist of measurements of the concentrations of HPA and PD (CHPA and CPD) versus time at various operating temperatures and pressures; the measurements at 318 K for three pressure runs at 2.60, 4.00, and 5.15 MPa are given elsewhere.1 Zhu et al. proposed a reaction scheme and a mathematical model that describes the rates of HPA consumption and PD formation as well as the formation of acrolein (Ac). The model can be formulated in the parameter estimation notation explained in the above section as follows:

x = [x1, x2, x3]^T = [CHPA, CPD, CAc]^T    (13)

k = [k1, k2, k3, k4, k5, k6, k7]^T = [k1, k2, k3, k-3, k4, K1, K2]^T    (14)

y = [y1, y2]^T = [CHPA, CPD]^T    (15)
The model equations now can be written as,
dx1/dt = -u1(r1 + r2) - (k3x1 + k5x3x1 - k4x3)    (16)

dx2/dt = u1(r1 - r2)    (17)

dx3/dt = r3 - r4 - r-3    (18)
where u1 is the concentration of the catalyst (10 g/L). The initial conditions in all the experiments were x1(0) = 1.34953, x2(0) = 0, and x3(0) = 0. The reaction rates are given below:
r1 = k1 u2 x1 / [H (1 + (k6 u2/H)^(1/2) + k7 x1)^3]    (19)

r2 = k2 x1 x2 / (1 + (k6 u2/H)^(1/2) + k7 x1)    (20)

r3 = k3 x1    (21)

r-3 = k4 x3    (22)

r4 = k5 x3 x1    (23)
In the above equations, k1, k2, k3, k4, and k5 are rate constants (L/(mol min g)), and k6 and k7 are the adsorption equilibrium constants (L/mol) for H2 and HPA, respectively. The term u2 is the hydrogen pressure (MPa) in the reactor, and H is the Henry's law constant, with a value equal to 1379 (L bar)/mol at 298 K. The seven parameters (k1, k2, k3, k4, k5, k6, and k7) are to be determined from the measured concentrations.
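For illustration, the model of eqs 16-23 can be written as a right-hand-side function in the dx/dt = f(t, x, k) form used by the sketches above; here u1 is the catalyst concentration (10 g/L), u2 the hydrogen pressure (MPa), and H the Henry's law constant. The function and argument names are illustrative, and the exponents in the denominators follow the reconstruction of eqs 19 and 20, so the expressions should be checked against Zhu et al.25 before use.

```python
# Hedged sketch of the HPA hydrogenation model (eqs 16-23); x = [C_HPA, C_PD, C_Ac].
import numpy as np

def hpa_rhs(t, x, k, u1=10.0, u2=2.60, H=1379.0):
    x1, x2, x3 = x
    k1, k2, k3, k4, k5, k6, k7 = k
    denom = 1.0 + np.sqrt(k6 * u2 / H) + k7 * x1
    r1 = k1 * u2 * x1 / (H * denom**3)   # hydrogenation of HPA (eq 19)
    r2 = k2 * x1 * x2 / denom            # side reaction involving PD (eq 20)
    r3 = k3 * x1                         # eq 21
    r_3 = k4 * x3                        # eq 22
    r4 = k5 * x3 * x1                    # eq 23
    return [-u1 * (r1 + r2) - (r3 + r4 - r_3),   # eq 16
            u1 * (r1 - r2),                      # eq 17
            r3 - r4 - r_3]                       # eq 18
```

Passing hpa_rhs (with the pressure of the corresponding run) to the obj(k) of the LJ sketch, or to gauss_newton with C = [[1, 0, 0], [0, 1, 0]] since only CHPA and CPD are measured (eq 15), reproduces the problem setup; the initial state is x(0) = [1.34953, 0, 0].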
5.2. Toluene Hydrogenation (Example 2). The second illustrative example is the model for toluene hydrogenation.26 The hydrogenation of toluene was performed at ambient temperature and pressure in a semibatch isothermal stirred reactor with a commercial 5% Ru-act catalyst. Hydrogen was automatically added to the system at the same rate at which it was consumed. The particle size of the catalyst used and the efficiency of the stirring were sufficient for carrying out the reaction in the kinetic regime. Under the experimental conditions, the validity of Henry's law was assumed. The experimental data are given elsewhere.1,27 The reaction scheme for the catalytic reaction is

A ⇌(r1, r-1) B →(r2) C    (24)
where A is toluene, B is 1-methylcyclohexene, and C is methylcyclohexane; r1 is the hydrogenation rate (forward reaction), and r-1 is the disproportionation rate (backward reaction). The proposed kinetic model describing the above system is expressed in parameter estimation notation as
x = [x1, x2, x3]^T = [CA, CB, CC]^T    (25)

k = [k1, k2, k3, k4, k5]^T = [kH, k2, kD, KA^rel, KC^rel]^T    (26)

y = [y1, y2, y3]^T = [CA, CB, CC]^T    (27)
where Ci (i = A, B, C) are the reactant concentrations and K^rel are the relative adsorption coefficients. The model equations are as follows:
dx1/dt = -r1 + r-1;  x1(0) = 1    (28)

dx2/dt = r1 - r-1 - r2;  x2(0) = 0    (29)

dx3/dt = r2;  x3(0) = 0    (30)
The rate equations are as follows
r1 = k1 k4 x1 / (k4 x1 + x2 + k5 x3)    (31)

r-1 = k3 x2 / (k4 x1 + x2 + k5 x3)    (32)

r2 = k2 x2 / (k4 x1 + x2 + k5 x3)    (33)

The objective is to determine the five unknown parameters of the model from the measured concentrations.

5.3. Methyl Ester Hydrogenation (Example 3). The third illustrative example is the methyl ester hydrogenation reaction.26 The experiments were performed in an autoclave at elevated pressure and temperature, and the Ni catalyst DM2 was used. The data are given elsewhere.1,26 The reaction scheme for the catalytic reaction is

A →(r1) B →(r2) C →(r3) D    (34)

where A, B, C, and D are the methyl esters of linolenic, linoleic, oleic, and stearic acids and r1, r2, and r3 are the hydrogenation rates. The proposed kinetic model is expressed in parameter estimation notation as follows:
x = [x1, x2, x3, x4]^T = [CA, CB, CC, CD]^T    (35)

k = [k1, k2, k3, k4, k5, k6]^T = [k1, k2, k3, KB^rel, KC^rel, KD^rel]^T    (36)

y = [y1, y2, y3, y4]^T = [CA, CB, CC, CD]^T    (37)
where Ci (i = A, B, C, D) are the reactant concentrations and K^rel are the relative adsorption coefficients. The model equations are as follows:
dx1/dt = -r1;  x1(0) = 0.1012    (38)

dx2/dt = r1 - r2;  x2(0) = 0.2210    (39)

dx3/dt = r2 - r3;  x3(0) = 0.6570    (40)

dx4/dt = r3;  x4(0) = 0.0208    (41)
The rate equations are defined as follows
r1 = k1 x1 / (x1 + k4 x2 + k5 x3 + k6 x4)    (42)

r2 = k2 k4 x2 / (x1 + k4 x2 + k5 x3 + k6 x4)    (43)

r3 = k3 k5 x3 / (x1 + k4 x2 + k5 x3 + k6 x4)    (44)
The objective is to determine the six unknown parameters of the model from the measured concentrations.

5.4. Contact Inhibition in Microcarrier Cultures of Vero Cells (Example 4). The fourth illustrative example considered is Vero cells grown on microcarriers. Hawboldt et al.28 reported experimental data on anchorage-dependent Vero cells grown on Cytodex I microcarriers in 250 mL spinner flasks. The experimental data are given elsewhere.1,28 Frame and Hu29 proposed a model which is expressed in parameter estimation notation as
x = [x]^T    (45)

k = [k1, k2, k3]^T = [1/x∞, C, µmax]^T    (46)

y = [y]^T    (47)
where k1 is the inverse of the maximum cell density, k2 is a constant, and k3 is the maximum specific growth rate. The model equation is of the form

dx/dt = k3{1 - exp[-k2(1 - k1x)]}x    (48)
Table 1. Comparison of the LJ and GN Methods for Various Initial Estimates (Example 1)

 | k1 (10^-3) | k2 (10^-3) | k3 (10^-3) | k4 (10^-3) | k5 (10^-3) | k6 (10^-3) | k7 (10^-3) | S(k)
initial estimates (1) | 2.6866 | 0.236 × 10^-8 | 0.672 × 10^-3 | 0.126 × 10^-5 | 0.0273 | 35.56 | 2.57 | 40.654
GN method | - | - | - | - | - | - | - | 0.3045
LJ method | 12.0853 | 0.148 × 10^-7 | 0.410 × 10^-3 | 0.130 × 10^-4 | 0.0255 | 171.7734 | 4.2008 | 0.2159
initial estimates (2) | 2.6866 | 0.108 × 10^-6 | 0.672 × 10^-3 | 0.68 × 10^-5 | 0.0273 | 35.56 | 2.57 | 0.3045
GN method | 2.6866 | 0.236 × 10^-8 | 0.672 × 10^-3 | 0.126 × 10^-5 | 0.0273 | 35.56 | 2.57 | 0.3044
LJ method | 12.0477 | 0.103 × 10^-6 | 0.393 × 10^-3 | 0.682 × 10^-5 | 0.0262 | 171.6385 | 4.1921 | 0.2159
initial estimates (3) | 2.6866 | 0.236 × 10^-8 | 0.672 × 10^-3 | 0.126 × 10^-5 | 0.0273 | 35.56 | 2.57 | 0.3044
GN method | 2.6866 | 0.236 × 10^-8 | 0.672 × 10^-3 | 0.126 × 10^-5 | 0.0273 | 35.56 | 2.5322 | 0.2781
LJ method | 12.0368 | 0.247 × 10^-8 | 0.390 × 10^-3 | 0.930 × 10^-7 | 0.0272 | 171.5329 | 4.1906 | 0.2159
initial estimates (4) | 2.6866 | 0.236 × 10^-8 | 0.672 × 10^-3 | 0.126 × 10^-5 | 0.0273 | 35.56 | 2.5322 | 0.2781
GN method | 2.7397 | 0.236 × 10^-8 | 0.672 × 10^-3 | 0.126 × 10^-5 | 0.0273 | 35.56 | 2.5322 | 0.2694
LJ method | 12.0172 | 0.233 × 10^-8 | 0.390 × 10^-3 | 0.196 × 10^-5 | 0.0272 | 171.2807 | 4.1881 | 0.2159
initial estimates (5) | 3.0436 | 0.236 × 10^-8 | 0.672 × 10^-3 | 0.126 × 10^-5 | 0.0273 | 35.56 | 2.7119 | 0.2675
GN method | 13.354 | 0.236 × 10^-8 | 0.672 × 10^-3 | 0.126 × 10^-5 | 0.0273 | 4.5435 | 172.25 | 0.2436
LJ method | 12.0336 | 0.216 × 10^-8 | 0.389 × 10^-3 | 0.127 × 10^-5 | 0.0277 | 171.4543 | 4.1905 | 0.2159
initial estimates (6) | 13.354 | 0.236 × 10^-8 | 0.672 × 10^-3 | 0.126 × 10^-5 | 0.0273 | 4.5435 | 172.25 | 0.2436
GN method | 13.502 | 0.236 × 10^-8 | 0.392 × 10^-3 | 0.126 × 10^-5 | 0.0273 | 4.3531 | 191.30 | 0.2161
LJ method | 12.0663 | 0.234 × 10^-8 | 0.399 × 10^-3 | 0.147 × 10^-6 | 0.0236 | 171.9470 | 4.1939 | 0.2159
initial estimates (7) | 13.502 | 0.236 × 10^-8 | 0.392 × 10^-3 | 0.126 × 10^-5 | 0.0273 | 4.3531 | 191.30 | 0.2161
GN method | 13.502 | 0.236 × 10^-8 | 0.392 × 10^-3 | 0.126 × 10^-5 | 0.0273 | 4.3531 | 191.30 | 0.2161
LJ method | 12.0490 | 0.251 × 10^-8 | 0.389 × 10^-3 | 0.784 × 10^-6 | 0.0275 | 171.6883 | 4.1921 | 0.2159
Zhu et al.25 (8) | 6.533 | 0.305 × 10^-4 | 0.623 × 10^-5 | 0.722 × 10^-3 | 3.90 × 10^-6 | 95.00 | 3.227 | 0.2633
LJ method | 12.0523 | 0.180 × 10^-8 | 0.388 × 10^-3 | 0.136 × 10^-5 | 0.0277 | 171.7310 | 4.1924 | 0.2159
The objective is to determine the three unknown parameters of the model from the measured cell densities.

6. Results and Discussion

The program code for all the examples was written in FORTRAN; the compiler was FORTRAN Power Station 4, and the computer used had a Pentium 4 processor. It is also noted that both methods need to solve the model differential equations, and this is done with the same subroutine.

6.1. Hydrogenation of HPA (Example 1). Zhu et al.25 reported the values of 6.533, 3.048 × 10^-4, 6.233 × 10^-6, 7.219 × 10^-4, 3.902 × 10^-6, 95.00, and 3.227 for k1, k2, k3, k4, k5, k6, and k7, respectively. This problem was studied extensively by Englezos and Kalogerakis.1 The results obtained with various initial estimates are shown in Table 1. Each set in the table consists of three rows: the first gives the initial estimates, while the second and third give the parameter values obtained by the GN method, as explained by Englezos and Kalogerakis,1 and by the LJ method, respectively, together with the corresponding objective function values. This sequence is repeated for all of the initial estimates in the table. Englezos and Kalogerakis1 found the problem to be ill-conditioned and applied a stepwise sequential approach, letting only one or two parameters vary at a time. They were able to reduce the objective function from 0.2633 (the value obtained by Zhu et al.25) to a new minimum of 0.2161 and reported the parameter values 13.502, 0.236 × 10^-8, 0.3922 × 10^-3, 0.126 × 10^-5, 0.0273, 4.3531, and 191.30 for k1, k2, k3, k4, k5, k6, and k7, respectively. Using the LJ optimization procedure, we were able to converge to a slightly lower minimum (0.2159) for all of the initial estimate values used by Englezos and Kalogerakis.1 It is also noted that in all cases the calculated values of the parameters were almost the same. Figure 2 shows how the optimum is approached as the number of passes increases; the numbers in the legend correspond to the initial guesses indicated in Table 1. As is evident from the figure, it took 38 passes to reach the minimum for the first initial estimate, as those initial estimates were far from the optimal values. The Gauss-Newton method could not converge to the minimum with the same initial estimates, even with Marquardt's modification.1
Figure 2. Hydrogenation of HPA: values of the objective function for various initial estimates.
Figure 3. Hydrogenation of HPA: parameter variation for k1.
Figure 4. Hydrogenation of HPA: parameter variation for k7.

Figure 5. Hydrogenation of HPA: region size variation for parameter k1.

Figure 6. Hydrogenation of HPA: region size variation for parameter k7.

Figure 7. Hydrogenation of HPA: convergence of the objective function for various values of Psets.

Figure 8. Hydrogenation of HPA: experimental data for the concentration of HPA compared to model calculated values from LJ at 318 K.

Figure 9. Hydrogenation of HPA: experimental data for the concentration of PD compared to model calculated values from LJ at 318 K.

With the introduction of convergence criteria between passes and with the number of iterations fixed at 100, the effect of the number of randomly chosen points (Psets) used per iteration on obtaining the optimum was studied for the initial guess values in the second set of Table 1. Figures 3 and 4 show parameters k1 and k7 as functions of the number of passes at different values of Psets. Similar plots for parameters k2, k3, k4, k5, and k6 were prepared but are not shown. On the basis of Figures 3 and 4 and the similar plots for the other parameters, the optimum value of Psets is 30. Figures 5 and 6 show the region sizes r1 and r7 for parameters k1 and k7, respectively, as functions of the number of passes at different values of Psets; similar plots for the region sizes of the other parameters (k2, k3, k4, k5, and k6) were prepared but are not shown. Finally, the objective function values as functions of the number of passes at different values of Psets are shown in Figure 7. What is seen in Figures 3-7 is that, by changing the region size (r) along the passes according to the extent of change in the parameter values, LJ is able to
locate the optimum irrespective of the initial guesses. The concentrations of HPA and PD obtained with the optimal parameters calculated by the LJ optimization method are compared with the experimental data at 318 K for the three pressures (2.60, 4.00, and 5.15 MPa) in Figures 8 and 9. The values calculated with the LJ optimization method agree well with the experimental results of Zhu et al.25

It is noted from Figures 8 and 9 that the concentration of HPA at time t = 10 min for the runs at 2.60 and 5.15 MPa is higher than the initial concentration at time zero. Obviously, this cannot be the case: the concentration should either be less than the initial concentration, owing to the formation of PD, or remain equal to the initial concentration if no product is formed. These data points can therefore be considered outliers.27 The outliers were removed from the database, and the parameters were estimated again. The parameter values were essentially the same: 11.2327, 0.759 × 10^-9, 0.388 × 10^-3, 0.963 × 10^-8, 0.0268, 163.3680, and 4.0689 for k1, k2, k3, k4, k5, k6, and k7, respectively, giving an objective function value of 0.1903. Consequently, the model calculated output values did not change.

6.2. Toluene Hydrogenation (Example 2). The optimal parameters obtained by the LJ and Gauss-Newton methods are shown in Table 2. The data representation in the table is the same as that explained for example 1, and the standard deviations of the parameters are shown in parentheses. For all of the initial estimates we were able to converge to a minimum value of 0.01590003 using the LJ method. The optimal parameter values are 0.02511, 0.00873, 0.00326, 1.27258, and 1.24230 for k1, k2, k3, k4, and k5, respectively. These parameter values differ quite substantially from those given in the literature,26 namely, k = [0.023, 0.011, 0.005, 1.9, 1.8], which give an objective function value of 0.0178916. The GN method was found to converge but with
Table 2. Comparison of the LJ and GN Methods for Various Initial Estimates (Example 2)

 | k1 | k2 | k3 | k4 | k5 | S(k)
initial guess (1) | 0.10 | 0.10 | 0.10 | 0.10 | 0.10 | 5.11660
GN method | 0.0254 (4.91%) | 0.0098 (22.77%) | 0.0052 (19.27%) | 1.4029 (9.53%) | 1.1312 (7.54%) | 0.06308
LJ method | 0.0251 | 0.0087 | 0.0033 | 1.2726 | 1.2423 | 0.01590
initial guess (2) | 0.25 | 0.25 | 0.25 | 0.25 | 0.25 | 6.90489
GN method | 0.0224 (5.53%) | 0.0199 (28.55%) | 0.0202 (20.93%) | 3.5303 (11.49%) | 2.7073 (9.70%) | 0.22506
LJ method | 0.0251 | 0.0087 | 0.0033 | 1.2726 | 1.2423 | 0.01590
initial guess (3) | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 7.71861
GN method | 0.9665 (59.67%) | 0.0084 (28.32%) | 1.0081 (60.04%) | 1.6787 (55.02%) | 0.6405 (48.92%) | 0.39769
LJ method | 0.0251 | 0.0087 | 0.0033 | 1.2726 | 1.2423 | 0.01590
Belohlav et al.26 (4) | 0.023 | 0.011 | 0.005 | 1.9 | 1.8 | 0.01789
GN method | 0.0249 (3.47%) | 0.0104 (16.42%) | 0.0055 (13.90%) | 1.6724 (7.00%) | 1.6270 (4.82%) | 0.03917
LJ method | 0.0251 | 0.0087 | 0.0033 | 1.2726 | 1.2423 | 0.01590
Table 3. Comparison of the LJ and GN Methods for Various Initial Estimates (Example 3)

 | k1 | k2 | k3 | k4 | k5 | k6 | S(k)
initial guess (1) | 0.10 | 0.10 | 0.10 | 0.10 | 0.10 | 0.10 | 0.759543
GN method | 0.10603 (6.53%) | 0.04915 (4.12%) | 0.04618 (3.01%) | 0.19587 (5.34%) | 0.04725 (5.51%) | 0.03021 (5.24%) | 0.002124
LJ method | 1.86356 | 0.02707 | 0.09369 | 43.59982 | 2.52325 | 4.35220 | 0.000459
initial guess (2) | 0.1 | 0.1 | 0.1 | 1.0 | 1.0 | 1.0 | 0.824056
GN method | 0.23762 (18.36%) | 0.13146 (15.28%) | 0.03318 (5.89%) | 1.23954 (16.12%) | 1.04563 (15.29%) | 0.52968 (15.24%) | 0.018850
LJ method | 1.65212 | 0.02709 | 0.09392 | 38.62274 | 2.23877 | 3.85806 | 0.000459
initial guess (3) | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 3.034587
GN method | 1.00791 (28.99%) | 0.91435 (28.95%) | 0.03062 (6.91%) | 0.97698 (28.99%) | 1.31206 (24.66%) | 0.68548 (24.41%) | 0.048436
LJ method | 1.59859 | 0.02710 | 0.09393 | 37.36340 | 2.16630 | 3.73399 | 0.000459
Belohlav et al.26 (4) | 1.44 | 0.03 | 0.09 | 28 | 1.8 | 2.9 | 0.004343
GN method^a | 1.43433 (4.40%) | 0.02778 (1.84%) | 0.08870 (3.01%) | 28.23632 (3.52%) | 1.80370 (3.71%) | 2.90254 (3.33%) | 0.001040
LJ method | 1.86310 | 0.02706 | 0.09403 | 43.61275 | 2.52250 | 4.35300 | 0.000459

^a Did not converge in 1000 iterations.
Figure 10. Toluene hydrogenation: values of the objective function for various initial estimates.
a nonzero Marquardt-Levenberg parameter (γ = 0.1). In one case (the initial guess of 1 for all parameters), a value of γ = 1 was needed for convergence. Thus, convergence for the "true" (undamped) optimization problem was not achieved, and the reported standard deviations should be viewed with this in mind. Figure 10 shows how the optimum value is reached with respect to the number of passes for various initial values using the LJ optimization method; the numbers in the legend correspond to the initial estimates indicated in Table 2. The model calculated values using the parameters obtained by the LJ method are shown in Figure 11. As seen, the model describes the data well.

This problem was also solved by Luus.12,30 For an initial estimate of k = [0.25, 0.25, 0.25, 0.25, 0.25], he reported the optimal parameters k = [0.02511, 0.00873, 0.00326, 1.27258, 1.24230], giving an objective function value of 0.01590003.

6.3. Methyl Ester Hydrogenation (Example 3). The optimal parameters obtained by the LJ and Gauss-Newton methods are shown in Table 3. The data representation in the table is the same as that explained for example 1, and the standard deviations of the parameters are also shown. It is noted that all of the reported parameters calculated using the GN method were
Figure 11. Toluene hydrogenation: experimental data for CA, CB, CC concentrations compared to model calculated values.
Figure 12. Methyl ester hydrogenation: values of the objective function for various initial estimates.
obtained with a nonzero Marquardt-Levenberg parameter (γ = 1.0). It is noted that for all the initial guesses convergence to a minimum value of 0.000459 was achieved using the LJ method. Figure 12 shows how the optimum value is reached with respect to the number of passes for various initial values. The numbers in the legend correspond to those indicated in
Figure 13. Methyl ester hydrogenation: experimental data for CA, CB, CC, CD concentrations compared to model calculated values.

Figure 14. Growth of Vero cells: values of the objective function for various initial estimates.
Table 3 for various initial estimates. The model calculated values using the parameters obtained by the LJ method are shown in Figure 13. As seen, the model reproduces the data well. The parameter values obtained by LJ differ quite substantially from those given in the literature,26 namely, k = [1.44, 0.03, 0.09, 28.0, 1.8, 2.9], which give an objective function value of 0.004343. They are also different from those obtained by Gauss-Newton. It is also significant that for different initial guess values different optimal parameter values are obtained by LJ, even though the objective function has the same value at the optimum. This was also reported by Luus,30 who suggested that the performance index (objective function) can be further improved by modifying the model equations to contain only five parameters; the objective function obtained by Luus was 0.0004578. We also obtained this value of the objective function and the same parameter values by using Luus's modified model and the LJ method.

6.4. Contact Inhibition in Microcarrier Cultures of Vero Cells (Example 4). This problem was solved using the LJ and GN methods, and the results obtained for various initial values are shown in Table 4. The data representation in the table is the same as that in the previous tables, and the standard deviations of the parameters are also shown. It is noted that for all the various initial estimates we were able to converge to a minimum value of 0.10413 using the LJ method. However, the GN method converged with a nonzero Marquardt-Levenberg parameter (γ = 10^-5). Figure 14 shows how the optimum value is reached with respect to the number of passes for various initial values using the LJ optimization method. The numbers in the legend correspond to those indicated in Table 4 for the various initial estimates. The model
Figure 15. Growth of Vero cells: experimental data for cell density compared with model calculated values from LJ (inoculation level = 2.00 cells/bead).
calculated values using the parameters obtained by the LJ method are shown in Figure 15. As seen, the model reproduces the data well.

6.5. Discussion. It is evident from the results shown above that it is difficult for the GN method to find the optimum if the initial guess values are far from the optimal parameter values. Gradient search methods tend to find local minima; hence, the initial guess values for the parameters to be determined play an important role in the GN method. This is not the case with the LJ optimization method. For instance, for toluene hydrogenation (example 2), the objective function behavior with respect to the number of iterations (GN method) and the number of passes (LJ method) for the various initial estimates of Table 2 is shown in Figure 16. It is seen from the figure how GN finds a local minimum while the LJ method reaches the optimum for the given initial values. The numbers in the legend correspond to the respective initial estimates in Table 2.
Table 4. Comparison of the LJ and GN Methods for Various Initial Estimates (Example 4)

 | k1 | k2 | k3 | S(k)
initial guess (1) | 0.1 | 0.1 | 0.1 | 61.65130
GN method | 0.24795 (2.22%) | 0.41331 (91.21%) | 0.07793 (76.17%) | 0.10413
LJ method | 0.24799 | 0.41700 | 0.07735 | 0.10413
initial guess (2) | 0.25 | 0.25 | 0.25 | 38.60444
GN method | 0.24795 (2.22%) | 0.41331 (91.21%) | 0.07793 (76.17%) | 0.10413
LJ method | 0.24795 | 0.41353 | 0.07790 | 0.10412
initial guess (3) | 2.0 | 2.0 | 2.0 | 64.68140
GN method^a | 0.24300 (3.47%) | 0.00364 (12496.0%) | 7.33178 (12476.0%) | 0.15330
LJ method | 0.24785 | 0.40501 | 0.07927 | 0.10413
initial guess (4) | 3.7 | 3.7 | 3.7 | 75.64075
GN method^a | 0.55921 (22.26%) | 3.99372 (3.12 × 10^12 %) | 3.99597 (2.99 × 10^12 %) | 36.46905
LJ method | 0.24798 | 0.41476 | 0.07771 | 0.10413
initial guess (5) | 0.55 | 5.0 | 0.026 | 20.26149
GN method | 0.24795 (2.22%) | 0.41331 (91.21%) | 0.07793 (76.17%) | 0.10413
LJ method | 0.24794 | 0.41485 | 0.07769 | 0.10413

^a Did not converge in 1000 iterations.
Figure 16. Toluene hydrogenation: values of the objective function obtained by the GN and LJ methods for various initial estimates.

Figure 17. Methyl ester hydrogenation: convergence of the objective function for the GN and LJ methods for initial estimate (4) in Table 3.

In the GN method, there is no direct control over the objective function that is to be minimized. At a given iteration, if the objective function obtained is higher than that of the previous iteration, then the bisection rule is applied until the objective function assumes a smaller value. The problem is that, for models described by ordinary differential equations, the objective function may not necessarily decrease. In direct search methods such as LJ, on the other hand, the objective function is controlled directly. The parameters that give the smallest objective function are saved in the first pass after a search over the given region, and the procedure then proceeds to the next pass with the saved parameters, so that the objective function is driven down from pass to pass until the optimum is reached. For instance, for initial estimate (4) in Table 3, Gauss-Newton did not converge even though it started very near the optimum, as seen in Figure 17: after a number of iterations, the objective function increases and decreases in a repetitive manner without converging, even after 1000 iterations.

7. Conclusions

The LJ optimization method can be successfully employed for parameter estimation in chemical engineering problems that involve models described by ordinary differential equations. The LJ optimization method is simple, easy to implement, and robust in locating the optimum. The impact of the initial guess values on the optimal parameters and the optimum is minimal. Furthermore, it was illustrated that a large number of random points is not necessary. On the other hand, the Gauss-Newton method encountered convergence difficulties, which were alleviated by the use of the Marquardt-Levenberg approach.

Acknowledgment

The financial support of the Natural Sciences and Engineering Research Council of Canada (NSERC) is greatly appreciated. The authors express their gratitude to Professor Hofmann for kindly providing the experimental data.

Nomenclature
C = constant
Ck = concentration of catalyst (g/L)
CHPA = concentration of HPA (mol/L)
CPD = concentration of PD (mol/L)
H = Henry constant (mol/(L bar))
k = p-dimensional parameter vector
N = number of experimentally measured data
NP = number of passes
NT = number of iterations
p = number of unknown parameters
Psets = number of randomly chosen points used per iteration (1 < Psets < 100)
Q = m × m user-specified weighting matrix
r = p-dimensional region size vector
S(k) = objective function
T = temperature (K)
TOL = user-specified convergence tolerance
x = n-dimensional vector of state variables
x0 = n-dimensional vector of initial conditions for the state variables
x∞ = maximum cell density (cells/mL)
ŷi = experimentally measured values
y(ti,k) = model calculated output values
γ = Marquardt-Levenberg parameter
ε = tolerance on the region size
µmax = maximum specific growth rate
ξ = region size contraction factor
φ = p-dimensional vector

Literature Cited

(1) Englezos, P.; Kalogerakis, N. Applied Parameter Estimation for Chemical Engineers; Marcel Dekker: New York, 2001.
(2) Edgar, T. F.; Himmelblau, D. M. Optimization of Chemical Processes; McGraw-Hill: New York, 1988.
(3) Gill, P. E.; Murray, W.; Wright, M. H. Practical Optimization; Academic Press: London, 1981.
(4) Sargent, R. W. H. A Review of Optimization Methods for Nonlinear Problems. In ACS Symposium Series; Squires, R. G., Reklaitis, G. V., Eds.; American Chemical Society: Washington, DC, 1980; pp 37-52.
(5) Bard, Y. Comparison of Gradient Methods for the Solution of Nonlinear Parameter Estimation Problems. SIAM J. Numer. Anal. 1970, 7, 157-186.
(6) Bishop, C. M. Neural Networks for Pattern Recognition; Oxford University Press: London, 1995.
(7) Holland, J. H. Adaptation in Natural and Artificial Systems; University of Michigan Press: Ann Arbor, MI, 1975.
(8) Kirkpatrick, S.; Gelatt, C. D.; Vecchi, M. P. Optimization by Simulated Annealing. Science 1983, 220 (4598), 671-680.
(9) Luus, R.; Jaakola, T. H. I. Optimization by Direct Search and Systematic Reduction of Size of Search Region. AIChE J. 1973, 19 (4), 760-766.
(10) Luus, R. Determination of the region sizes for LJ optimization procedure. Hung. J. Ind. Chem. 1998, 26 (4), 281-286.
(11) Luus, R. Parameter estimation of Lotka-Volterra problem by direct search optimization. Hung. J. Ind. Chem. 1998, 26 (4), 287-292.
(12) Luus, R. Application of direct search optimization for parameter estimation. In Scientific Computing in Chemical Engineering II. Simulation, Image Processing, Optimization and Control; Keil, F., Mackens, W., Voss, H., Werther, J., Eds.; Springer: Berlin, 1999; pp 346-353.
(13) Luus, R.; Hartig, F.; Keil, F. J. Optimal Drug Scheduling of Cancer Chemotherapy by Direct Search Optimization. Hung. J. Ind. Chem. 1995, 23 (1), 55-58.
(14) Lee, Y. P.; Rangaiah, G. P.; Luus, R. Phase and chemical equilibrium calculations by direct search optimization. Comput. Chem. Eng. 1999, 23 (9), 1183-1191.
(15) Englezos, P.; Kalogerakis, N.; Trebble, M. A.; Bishnoi, P. R. Estimation of Multiple Binary Interaction Parameters in Equations of State Using VLE Data - Application to the Trebble-Bishnoi Equation of State. Fluid Phase Equilib. 1990, 58 (1-2), 117-132.
(16) Englezos, P.; Bygrave, G.; Kalogerakis, N. Interaction parameter estimation in cubic equations of state using binary phase equilibrium and critical point data. Ind. Eng. Chem. Res. 1998, 37 (5), 1613-1618.
(17) Englezos, P.; Kalogerakis, N.; Bishnoi, P. R. Simultaneous Regression of Binary VLE and VLLE Data. Fluid Phase Equilib. 1990, 61 (1-2), 1-15.
(18) Englezos, P.; Kalogerakis, N.; Bishnoi, P. R. A Systematic Approach for the Efficient Estimation of Interaction Parameters in Equations of State Using Binary VLE Data. Can. J. Chem. Eng. 1993, 71 (2), 322-326.
(19) Englezos, P.; Kalogerakis, N. Constrained Least-Squares Estimation of Binary Interaction Parameters in Equations of State. Comput. Chem. Eng. 1993, 17 (1), 117-121.
(20) Wang, B. C.; Luus, R. Reliability of Optimization Procedures for Obtaining Global Optimum. AIChE J. 1978, 24 (4), 619-626.
(21) Liao, B.; Luus, R. Comparison of the Luus-Jaakola optimization procedure and the genetic algorithm. Eng. Optimization 2005, 37 (4), 381-398.
(22) Seinfeld, J. S.; Lapidus, L. Mathematical Models in Chemical Engineering; Prentice Hall: Englewood Cliffs, NJ, 1974; Vol. 3.
(23) Leis, J. R.; Kramer, M. A. The Simultaneous Solution and Sensitivity Analysis of Systems Described by Ordinary Differential Equations. ACM Trans. Math. Software 1988, 14 (1), 45-60.
(24) Tan, T. B.; Kalogerakis, N. A Three-Dimensional Three-Phase Automatic History Matching Model: Reliability of Parameter Estimates. J. Can. Pet. Technol. 1992, 31 (3), 34-41.
(25) Zhu, X. D.; Valerius, G.; Hofmann, H.; Haas, T.; Arntz, D. Intrinsic kinetics of 3-hydroxypropanal hydrogenation over Ni/SiO2/Al2O3 catalyst. Ind. Eng. Chem. Res. 1997, 36 (8), 2897-2902.
(26) Belohlav, Z.; Zamostny, P.; Kluson, P.; Volf, J. Application of random-search algorithm for regression analysis of catalytic hydrogenations. Can. J. Chem. Eng. 1997, 75 (4), 735-742.
(27) Barnett, V.; Lewis, T. Outliers in Statistical Data, 3rd ed.; Wiley: New York, 1994.
(28) Hawboldt, K. A.; Kalogerakis, N.; Behie, L. A. A Cellular Automaton Model for Microcarrier Cultures. Biotechnol. Bioeng. 1994, 43 (1), 90-100.
(29) Frame, K. K.; Hu, W. S. A Model for Density-Dependent Growth of Anchorage-Dependent Mammalian Cells. Biotechnol. Bioeng. 1988, 32 (8), 1061-1066.
(30) Luus, R. Iterative Dynamic Programming; Chapman & Hall/CRC: New York, 2001.
Received for review January 12, 2006
Revised manuscript received April 12, 2006
Accepted April 20, 2006

IE060051Q