A Simultaneous Approach for Parameter Estimation of a System of

Aug 24, 2011 - Parameter estimation for system of ordinary differential equations (ODEs) is formulated as a nonlinear programming (NLP) problem. The o...
0 downloads 11 Views 1MB Size
ARTICLE pubs.acs.org/IECR

A Simultaneous Approach for Parameter Estimation of a System of Ordinary Differential Equations, Using Artificial Neural Network Approximation Vivek Dua*,† and Pinky Dua‡ †

Centre for Process Systems Engineering, Department of Chemical Engineering, University College London, Torrington Place, London WC1E 7JE, United Kingdom ‡ Research Clinical Pharmacology, Pfizer PharmaTherapeutics R&D, Sandwich CT13 9NJ, United Kingdom

bS Supporting Information ABSTRACT: Parameter estimation for system of ordinary differential equations (ODEs) is formulated as a nonlinear programming (NLP) problem. The objective function of the NLP consists of two terms, which are simultaneously minimized. The first term is the summed square of the difference between the ODE model predictions and experimental data. The solution of the ODE model is postulated as an artificial neural network (ANN) model given by time points as the inputs and the state variables as the outputs. The outputs of the ANN model can be analytically differentiated with respect to the input, providing the differential terms of the ODE model. The summed square of the difference between these differential terms and the right-hand side of the ODE model represents the second term in the objective function of the NLP problem.

1. INTRODUCTION Model-based process design and operation is becoming increasingly important to ensure profitability and safety of chemical process systems. Developing a model usually requires postulating a model and carrying out experiments to estimate model parameters. The two main issues that must be considered are identifiability and model discrimination. Identifiability is concerned with (i) whether model parameters can be estimated based on the measurements that are available and (ii) whether the parameters can be uniquely identified. The objective of model discrimination is to determine the model that best describes the system, when more than one candidate model seems to represent the observations adequately. This can be achieved by optimally designing the experiments13 to compute the conditions at which experiments should be carried out. The conditions obtained are such that the predictions from only a subset of the candidate models adequately match the experimental observations. Model discrimination is usually an iterative process where a subset of model candidates is rejected at every iteration. In this paper, it is assumed that the model to be investigated has been selected, is given by a system of ODEs, and is identifiable; the experimental observations are available and the objective is to compute the model parameters. Parameter estimation of the system of ODEs requires solving a dynamic optimization problem,47 which is known to be a nonconvex problem usually requiring the tools of global optimization.815 Several approaches have been presented for solving the parameter estimation problem, and they can be broadly classified as decomposition approaches and sequential/ simultaneous approaches, as discussed next. 1.1. Decomposition Algorithms for Parameter Estimation.

Varah16 proposed an approach that was based on fitting the measured data using splines, which can then be differentiated, r 2011 American Chemical Society

with respect to time, to obtain the estimates of the derivatives. These estimates of the derivatives are then used to reduce the dynamic optimization problem into a much simpler algebraic optimization problem. Varziri et al.17 presented an iterative principal differential analysis approach, based on fitting splines to data. Michalik et al.18 presented an incremental parameter estimation approach, where splines are used to fit the data and differentiation of the splines provides an estimate of the derivatives. In their approach a subsystem of the problem is considered and corresponding subset of the parameters is estimated using an iterative strategy. Dua19 presented an approach where an artificial neural network (ANN) model is used to approximate the data, the ANN model is differentiated and then the derivatives at a reduced set of simulated data points can be used for parameter estimation. The abovementioned approaches require solution of two optimization problems: the first problem fits the experimental data, and the second problem minimizes the discrepancy between the estimates of the derivatives obtained from fitted model and the derivatives evaluated from the equations in the given ODE model at the experimental data points. These approaches do not require integration of ODEs while estimating model parameters but usually require a posteriori analysis by integrating the ODEs for the estimates of the parameters obtained. 1.2. Sequential/Simultaneous Algorithms for Parameter Estimation. Parameter estimation algorithms that require Special Issue: Nigam Issue Received: March 28, 2011 Accepted: August 24, 2011 Revised: August 17, 2011 Published: August 24, 2011 1809

dx.doi.org/10.1021/ie200617d | Ind. Eng. Chem. Res. 2012, 51, 1809–1814

Industrial & Engineering Chemistry Research

ARTICLE

integration of the ODEs have also been presented. There are two main types of such approaches: first, the sequential approach, where the optimization problem is decoupled from the numerical integration of differential equations,2026 and, second, the simultaneous approach, where differential equations are converted to algebraic equations, which form part of the overall optimization problem.2731 In this paper, the focus is on the simultaneous approach. Integration of ODEs can be carried out using, for example, orthogonal collocation on finite elements (OCFE). This requires dicretizing the time interval into many subintervals, or finite elements, and approximating the solution of the ODEs by orthogonal polynomial at collocation points within each finite element. The order of the polynomial and the number of finite elements is given by the user and the computation of the coefficients of the terms in the polynomials forms part of the solution of the ODEs. In this paper, ANN is used to integrate the ODEs. The problem of the computation of the parameters of the ANN model is similar to the problem of computation of the coefficients of the terms in the polynomials for the OCFE method. Therefore, the original parameter estimation problem for ODEs becomes an augmented optimization problem where the objective is to simultaneously solve the ODEs, by computing ANN model parameters, and compute the ODE model parameters such that model predictions match closely the experimental observations. The main advantage of using ANN is that highly nonlinear multi-inputmulti-output (MIMO) functions can be approximated.3235 The developments reported in this paper are thus expected to open avenues for parameter estimation of highly nonlinear ODEs. Note that the main difference between the approach presented in the work of Dua19 and this work is that, in the Dua work,19 a solution of ODEs is not required, whereas, in this work, solution of ODEs is required and is part of the optimization problem. Dua19 required solution of two smaller subproblems, whereas only one—but larger—simultaneous optimization problem is solved in this work. The work presented by Dua19 thus falls into the category of the decomposition approach (section 1.1), whereas the work presented here falls into the category of the simultaneous approach (section 1.2). The methodology presented in this paper can also be useful when the measurements for all of the state variables are not available. In the next section, the proposed methodology for parameter estimation is presented, and in section 3, example problems are presented to demonstrate the applicability of the methodology. Discussion and concluding remarks are given in section 4.

2. PROBLEM DEFINITION AND SOLUTION METHODOLOGY The objective is to solve the following problem:

where z is the J-dimensional vector of state variables in the given ODE system, ^z(ti) represents the experimentally observed values of the state variables at time points ti, and θ is the vector of parameters that must be estimated such that the error, ε1, between the observed values and the model predictions is minimized. The solution of the ODE model given by eqs 24, for given values of θ, can be obtained using ANN.36 Consider an ANN with l inputs, one hidden layer, m nodes in the hidden layer, and a linear output. The output from the ANN is given by N ¼

∑m vm σm

ð5Þ

where σm ¼

1 1 þ eam

ð6Þ

∑l wml xl þ bm

ð7Þ

where am ¼

where wml is the weight from the input l to the hidden node m, vm is the weight from the hidden node m to the output, bm is the bias, and σ represents the sigmoid transformation. The kth derivative of the output, with respect to the lth input, is given by ∂k N ¼ ∂xkl

∑m vm wkml σðkÞ m

ð8Þ

where σ(k) m denotes the kth derivative of the sigmoid. Note that, for the ODE model, given by eqs 24, k = 1, and the inputs xl are the time points t. A trial solution of the ODE model is postulated as follows: ¼ z0j þ tNj zANN j

ð9Þ

where an ANN model (Nj) is considered for each trial solution zANN . j Note that the trial solution, given by eq 9, is constructed such that it satisfies the initial condition, given by eq 3. From eq 9: dzANN dNj j ¼ Nj þ t ð10Þ dt dt where dNj/dt is given by eq 8 where x = t and k = 1. Therefore, the solution of the ODE model, given by eqs 24, for given values of θ can be formulated as the following nonlinear programming (NLP) problem:36 Problem P2. ε2 ¼

min

zANN , N, σ, w, v, a, b

∑i ∑j

(

dzANN ðti Þ j dt

)2

 fj ðzðti Þ, θ, ti Þ

ð11Þ

subject to eqs 510

Problem P1.

ε1 ¼ min

∑ ∑ f^zjðti Þ  zjðti Þg2

θ, zðtÞ i ∈ I j ∈ J

ð1Þ

subject to dzj ðtÞ ¼ fj ðzðtÞ, θ, tÞ dt zj ðt ¼ 0Þ ¼ z0j t ∈ ½0, tf 

j∈J

j∈J

ð2Þ ð3Þ ð4Þ

Illustrative ODEs Simulation Example. Consider the following system of ODEs: dz1 ¼  θ1 z1 dt dz2 ¼  θ1 z1  θ2 z2 dt h i     where θ ¼ θ1 θ2 ¼ 5 1 and zðt ¼ 0Þ ¼ 1 0 . The solution of these ODEs by formulating and solving Problem 1810

dx.doi.org/10.1021/ie200617d |Ind. Eng. Chem. Res. 2012, 51, 1809–1814

Industrial & Engineering Chemistry Research

ARTICLE

The parameter estimation Problem P1, where θ must be estimated, therefore can be recast as the following NLP problem. Problem P3.

P2 is given in Figure 1 and is compared with the exact solution of the ODEs. The ANN-based solution obtained by solving Problem P2 is comparable to the exact solution. The numerical details of the solution are reported in Table 1; note that these results are presented for the sake of completeness and may vary for different NLP solvers and initial guesses used for the solvers.

ε3 ¼

min

θ, zðtÞ, zANN , N, σ, w, v, a, b

þ

∑i ∑j

(

∑i ∑j f^zj ðti Þ  zjðti Þg2

dzANN ðti Þ j dt

)2

 fj ðzðti Þ, θ, ti Þ

ð12Þ

subject to eqs 510 The first term in the objective function minimizes the deviation of model predictions from the experimental data, and the second term provides the ANN solution to the ODEs. In the next section, example problems are presented which are solved by formulating and solving Problem P3.

3. EXAMPLE PROBLEMS Figure 1. ODEs simulation using ANN.

3.1. Examples 16. In this section, six example problems are considered; the data used for the problems are the same as that given in the work of Dua.19 A summary of these example problems and the solution obtained is given in Table 2; all the examples were solved using GAMS/SNOPT.37 The parameter estimates obtained are comparable to those reported by Dua.19 A comparison of the experimental data (^z) and model ) is shown in Figure 2. A very good match is predictions (zANN j ) are indistinguishable. obtained: ^z and these predictions (zANN j The corresponding weight matrices and bias vectors are given in the Supporting Information; note that these results are presented for the sake of completeness and may vary for different NLP solvers and initial guesses used for the solvers.

Table 1. ANN Model Solution of the ODE Simulation Example w

v

b

0.134756

513.443000

573.397100

2.01953978

0.00928 0.026558

294.618700 225.051300

1340.551000 627.811000

1.22429663 0.93496132

590.092400

526.499000

1.21430911

1044.580000

1295.633000

5.82772805

0.031748 3.61333 0.01448

121.828800

1598.250000

0.99994572

0.01349

1728.500000

1615.680000

1.19784413

Table 2. Example Problems 16 Estimated θ Values number of nodes ODE model number

name

1

first-order irreversible

2

chain reaction first-order reversible chain reaction

3

catalytic cracking of gas oil

4 5

Bellman’s problem methanol-to-

dz1 dt dz2 dt dz1 dt dz2 dt dz3 dt dz1 dt dz2 dt dz dt dz1 dt

6

LotkaVolterra problem

this work

ε3

layer of the ANN

¼  θ1 z1

θ1 = 5.017

θ1 = 5.001

1.79  108

15

¼ θ1 z1  θ2 z2 ¼  θ1 z1 þ θ2 z2

θ2 = 1.001 θ1 = 4.014

θ2 = 1.000 θ1 = 4.004

1.83  104

20

¼ θ1 z1  ðθ2 þ θ3 Þz2 þ θ4 z3

θ2 = 2.019

θ2 = 2.005

¼  θ4 z3 þ θ3 z2

θ3 = 40.085

θ3 = 38.447

θ4 = 20.045

θ4 = 19.193

¼  ðθ1 þ θ3 Þz1 2

θ1 = 12.232

θ1 = 11.975

2.05  106

15

¼ θ1 z1  θ2 z2

θ2 = 8.131

θ2 = 7.987

θ3 = 2.205

θ3 = 2.024

¼ θ1 ð126:2  zÞð91:9  zÞ2  θ2 z2

θ1 = 4.560  106 θ2 = 2.773  104

θ1 = 4.556  106 θ2 =2.771  104

4.50  103

20

θ1 = 5.195

θ1 = 5.187

1.75  106

40

θ2 = 1.206

θ2 = 1.199

þ θ3 z 1

θ3 = 0

θ3 = 0

þ θ4 z 1

θ4 = 0

θ4 = 2.028  104 1.74  104

30

2

  ¼  2θ1  ðθ2 þ θθ15zÞz2 1 þ z2 þ θ3 þ θ4 z1

hydrocarbon process

in the hidden Dua19

equations

θ1 z1 ðθ2 z1  z2 Þ ðθ2 þ θ5 Þz1 þ z2 θ1 z1 ðz2 þ θ5 z1 Þ ðθ2 þ θ5 Þz1 þ z2

dz2 dt dz3 dt

¼

θ5 = 0

θ5 = 0

dz1 dt dz2 dt

¼ θ1 z1 ð1  z2 Þ

θ1 = 3.453

θ1 = 3.128

¼ θ2 z2 ðz1  1Þ

θ2 = 0.933

θ2 = 0.970

¼

1811

dx.doi.org/10.1021/ie200617d |Ind. Eng. Chem. Res. 2012, 51, 1809–1814

Industrial & Engineering Chemistry Research

ARTICLE

Figure 2. Experimental data and model predictions.

3.2. Example 7. The ODE model of the seventh example problem is given by dz1 ¼  θ1 z1 þ θ2 z2 dt

dz2 ¼ θ1 z1  ðθ2 þ θ3 Þz2 dt The simulated data for this example was generated for θ = [3, 2, 1] and is given in the Supporting Information. Solving parameter estimation problem for this example by formulating and solving Problem P3 gives θ = [2.994, 1.994, 1] as the parameter estimates. For some problems, data for all of the state variables are not available. For this example, if we consider that only ^z1 is available and therefore only ^z1 appears in the objective function of Problem

P3, then the parameter estimates obtained are θ = [2.993, 1.989, 0.998]. Therefore, the approach presented in this work worked quite well, even for the case where the system is identifiable, when all the state variable measurements are not available. 3.3. Example 8. Consider the following first-order reversible reaction:19 k1

A þ BaC þ D k2

The model is given by dA dB dC dD ¼ ¼  ¼  ¼  k1 AB þ k2 CD dt dt dt dt Aðt ¼ 0Þ ¼ 10, Bðt ¼ 0Þ ¼ 9, Cðt ¼ 0Þ ¼ 1, Dðt ¼ 0Þ ¼ 2 1812

dx.doi.org/10.1021/ie200617d |Ind. Eng. Chem. Res. 2012, 51, 1809–1814

Industrial & Engineering Chemistry Research This example has noisy data that were generated at 100 time points by simulating the model for k1 = 4  103 and k2 = 2  103 and then adding the noise to the data.19 In the work of Dua,19 the data at 100 time points were reduced to data at 11 time points and parameter estimates were obtained as k1 = 4.312  103 and k2 = 2.195  103. In this work, when all the data at 100 time points are used, the parameter estimates obtained are k1 = 3.553  103 and k2 = 1.719  103; when the reduced data at 11 time points are used, the parameter estimates obtained are k1 = 4.115  103 and k2 = 2.075  103. This was carried out for 15 nodes in the hidden layer.

4. SUMMARY AND CONCLUDING REMARKS Artificial neural networks (ANNs) have been well-known for their ability to approximate highly nonlinear functions; however, this ability has not been systematically used for parameter estimation of ordinary differential equations (ODEs). Algorithms for parameter estimation of ODEs fall into two main categories: decomposition and simultaneous. In the recent work by Dua,19 ANNs were effectively and successfully used for parameter estimation of ODEs, using decomposition-based algorithm. This required the solution of two smallersized nonlinear programming (NLP) problems. In this paper, a simultaneous algorithm for parameter estimation of the ODEs has been presented. The problem was formulated and solved as one larger size NLP (Problem P3), where the solution of the ODEs was postulated as an ANN model. The proposed approach has been tested on several test example problems. The parameter estimate results obtained from the simultaneous approach presented here are comparable to those obtained in the earlier work, based on the decomposition approach. If a system is identifiable, this methodology can be applied to the system, where measurements for all of the state variables are not available. An example where experimental measurements for all of the state variables are not available was also presented. An advantage of the ANN-based approach that has been presented is that highly nonlinear multi-inputmulti-output (MIMO) functions can be approximated by ANNs, which is expected to open avenues for parameter estimation of highly nonlinear ODEs. In this work, global optimization was not carried out; however, the results obtained were comparable to those that have been obtained using global optimization techniques, as reported in the literature. The future work will include developing global optimization techniques for parameter estimation of the ODEs based on ANN model approximations, to guarantee global optimality of the solution obtained. ’ ASSOCIATED CONTENT

bS

Supporting Information. The weights and biases for the ANNs for Examples 16, and the raw data used for Example 7, are given as supporting information. This information is available free of charge via the Internet at http://pubs.acs.org/.

’ AUTHOR INFORMATION Corresponding Author

*Tel.: +44 (0) 20 7679 0002. Fax: +44 (0) 20 7383 2348. E-mail: [email protected].

’ DEDICATION This paper is dedicated to Professor K. D. P. Nigam. The authors gratefully acknowledge the interesting discussions and interactions they had with Professor Nigam.

ARTICLE

’ REFERENCES (1) Espie, D.; Macchietto, S. The optimal design of dynamic experiments. AIChE J. 1989, 35 (2), 223–229. (2) Michalik, C.; Stuckert, M.; Marquardt, W. Optimal experimental design for discriminating numerous model candidates: The AWDC criterion. Ind. Eng. Chem. Res. 2010, 49, 913–919. (3) Preisig, H. A. Constructing and maintaining proper process models. Comput. Chem. Eng. 2010, 34, 1543–1555. (4) Papamichail, I.; Adjiman, C. S. A rigorous global optimization algorithm for problems with ordinary differential equations. J. Global Optim. 2002, 24, 1–33. (5) Papamichail, I.; Adjiman, C. S. Global optimization of dynamic systems. Comput. Chem. Eng. 2004, 28, 403–415. (6) Sakizlis, V.; Perkins, J. D.; Pistikopoulos, E. N. Parametric controllers in simultaneous process and control design optimization. Ind. Eng. Chem. Res. 2003, 42 (20), 4545–4563. (7) Vassiliadis, V. S.; Sargent, R. W. H.; Pantelides, C. C. Solution of a class of multistage dynamic optimization problems. 1. Problems without path constraints. Ind. Eng. Chem. Res. 1994, 33, 2111. (8) Katare, S.; Bhan, A.; Caruthers, J. M.; Delgass, W. N.; Venkatasubramanian, V. A hybrid genetic algorithm for efficient parameter estimation of large kinetic models. Comput. Chem. Eng. 2004, 28, 2569–2581. (9) Esposito, W. R.; Floudas, C. A. Global optimization for the parameter estimation of differential-algebraic systems. Ind. Eng. Chem. Res. 2000, 39, 1291–1310. (10) Lin, Y.; Stadtherr, M. A. Deterministic global optimization for parameter estimation of dynamic systems. Ind. Eng. Chem. Res. 2006, 45, 8438–8448. (11) Lin, Y.; Stadtherr, M. A. Deterministic global optimization of nonlinear dynamic systems. AIChE J. 2007, 53, 866–875. (12) Park, T.; Froment, G. F. A hybrid genetic algorithm for the estimation of parameters in detailed kinetic models. Comput. Chem. Eng. 1998, 22, S103. (13) Rodriguez-Fernandez, M.; Mendes, P.; Banga, J. R. A hybrid approach for efficient and robust parameter estimation in biochemical pathways. BioSystems 2006, 83, 248–265. (14) Singer, A. B.; Taylor, J. W.; Barton, P. I.; Green, W. H. Global dynamic optimization for parameter estimation in chemical kinetics. J. Phys. Chem. A 2006, 110 (3), 971–976. (15) Wolf, D.; Moros, R. Estimating rate constants of heterogeneous catalytic reactions without supposition of rate determining surface steps—An application of a genetic algorithm. Comput. Chem. Sci. 1997, 52 (7), 1189. (16) Varah, J. M. A spline least squares method for numerical parameter estimation in differential equations. SIAM J. Sci. Stat. Comput. 1982, 3 (1), 28–46. (17) Varziri, M. S.; Poyton, A. A.; McAuley, K. B.; McLellan, P. J.; Ramsay, J. O. Selecting optimal weighting factors in iPDA for parameter estimation in continuous-time dynamic models. Comput. Chem. Eng. 2008, 32, 3011–3022. (18) Michalik, C.; Chachuat, B.; Marquardt, W. Incremental global parameter estimation in dynamical systems. Ind. Eng. Chem. Res. 2009, 48 (11), 5489–5497. (19) Dua, V. An artificial neural network approximation based decomposition approach for parameter estimation of system of ordinary differential equations. Comput. Chem. Eng. 2011, 35 (3), 545–553. (20) Bellman, R.; Jacquez, J.; Kalaba, R.; Schwimmer, S. Quasilinearization and the Estimation of Chemical Rate Constraints from Raw Kinetic Data. Math. Biosci. 1967, 1, 71. (21) Bilardello, P.; Joulia, X.; LeLann, J. M.; Delmas, H.; Koehret, B. A general strategy for parameter estimation in differential-algebraic systems. Comput. Chem. Eng. 1993, 17 (5/6), 517. (22) Hwang, M.; Seinfeld, J. H. A new algorithm for the estimation of parameters in ordinary differential equations. AIChE J. 1972, 18 (1), 90. (23) Kalogerakis, N.; Luus, R. Simplification of quasilinearization method for parameter estimation. AIChE J. 1983, 29 (5), 858. (24) Kim, I. W.; Liebman, M. J.; Edgar, T. F. A sequential error-invariables method for nonlinear dynamic systems. Comput. Chem. Eng. 1991, 15 (9), 663. 1813

dx.doi.org/10.1021/ie200617d |Ind. Eng. Chem. Res. 2012, 51, 1809–1814

Industrial & Engineering Chemistry Research

ARTICLE

(25) Maria, G. An adaptive strategy for solving kinetic model concomitant estimation-reduction problems. Can. J. Chem. Eng. 1989, 67, 825. (26) Wang, B.-C.; Luus, R. Increasing the size of region of convergence for parameter estimation through the use of shorter data length. Int. J. Control 1980, 31 (5), 947. (27) Baden, N.; Villadsen, J. A family of collocation based methods for parameter estimation in differential equations. Chem. Eng. J. 1982, 23, 1. (28) Liebman, M. J.; Edgar, T. F.; Lasdon, L. S. Efficient Data Reconciliation and Estimation for Dynamic Processes Using Nonlinear Programming Techniques. Comput. Chem. Eng. 1992, 16 (10/11), 963. (29) Tjoa, I. B.; Biegler, L. T. Simultaneous solution and optimization strategies for parameter estimation of differential-algebraic equation systems. Ind. Eng. Chem. Res. 1991, 30, 376. (30) Van Den Bosch, B.; Hellinckx, L. A new method for the estimation of parameters in differential equations. AIChE J. 1974, 20 (2), 250. (31) Villadsen, J.; Michelsen, M. L. Solution of Differential Equation Models by Polynomial Approximation; PrenticeHall: Englewood Cliffs, NJ, 1978. (32) Dua, V. A mixed-integer programming approach for optimal configuration of artificial neural networks. Chem. Eng. Res. Des. 2010, 88, 55–60. (33) Himmelblau, D. M. Accounts of experiences in the application of artificial neural networks in chemical engineering. Ind. Eng. Chem. Res. 2008, 27, 5782–5796. (34) Hussain, M. A.; Ng, C.; Aziz, N.; Mujtaba, I. M. Neural network techniques and applications in chemical process control systems. Intell. Syst. Tech. Appl. 2003, 5, 326362. (35) Venkatasubramanian, V.; Chan, K. A neural network methodology for process fault diagnosis. AIChE J. 1989, 35, 1993–2002. (36) Lagaris, I. E.; Likas, A.; Fotiadis, D. I. Artificial neural networks for solving ordinary and partial differential equations. IEEE Trans. Neural Networks 1998, 9 (5), 987–1000. (37) Brooke, A.; Kendrick, D.; Meeraus, A.; Raman, R. GAMS: A User’s Guide; GAMS Development Corporation: Washington, DC, 1998.

1814

dx.doi.org/10.1021/ie200617d |Ind. Eng. Chem. Res. 2012, 51, 1809–1814