Systematic Development of Kinetic Models for the ... - ACS Publications

Jun 6, 2018 - CIEPQPF, Department of Chemical Engineering, University of Coimbra, ... principle it can be manufactured from any oil or fat reacted ...
0 downloads 0 Views 6MB Size
Subscriber access provided by Kaohsiung Medical University

Process Systems Engineering

Systematic development of kinetic models for the glyceride transesterification reaction via alkaline catalysis José Filipe Oliveira Granjo, Belmiro P. M. Duarte, and Nuno M. C. Oliveira Ind. Eng. Chem. Res., Just Accepted Manuscript • DOI: 10.1021/acs.iecr.7b05328 • Publication Date (Web): 06 Jun 2018 Downloaded from http://pubs.acs.org on June 6, 2018

Just Accepted “Just Accepted” manuscripts have been peer-reviewed and accepted for publication. They are posted online prior to technical editing, formatting for publication and author proofing. The American Chemical Society provides “Just Accepted” as a service to the research community to expedite the dissemination of scientific material as soon as possible after acceptance. “Just Accepted” manuscripts appear in full in PDF format accompanied by an HTML abstract. “Just Accepted” manuscripts have been fully peer reviewed, but should not be considered the official version of record. They are citable by the Digital Object Identifier (DOI®). “Just Accepted” is an optional service offered to authors. Therefore, the “Just Accepted” Web site may not include all articles that will be published in the journal. After a manuscript is technically edited and formatted, it will be removed from the “Just Accepted” Web site and published as an ASAP article. Note that technical editing may introduce minor changes to the manuscript text and/or graphics which could affect content, and all legal disclaimers and ethical guidelines that apply to the journal pertain. ACS cannot be held responsible for errors or consequences arising from the use of information contained in these “Just Accepted” manuscripts.

is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.

Page 1 of 40 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Industrial & Engineering Chemistry Research

Systematic development of kinetic models for the glyceride transesterication reaction via alkaline catalysis José F.O. Granjo,

∗,†

Belmiro P.M. Duarte,

‡,†

and Nuno M.C. Oliveira



,

†CIEPQPF, Department of Chemical Engineering, University of Coimbra, Rua Sílvio Lima  Pólo II, 3030790 Coimbra, Portugal

‡Instituto Politécnico de Coimbra, ISEC, Department of Chemical and Biological Engineering, Rua Pedro Nunes, Quinta da Nora, 3030199 Coimbra, Portugal E-mail: [email protected]

Phone: +351 239 798 793. Fax: +351 239 798 703

Abstract This study addresses the glyceride transesterication (TE) kinetics, a crucial step in biodiesel production from vegetable oils. A nth order reversible model is considered to describe the TE reaction rate, using experimental data gathered from dierent authors for a broad range of conditions. An incremental model-building strategy is used, consisting on the following sequence of steps: (a) structural identiability analysis; (b) outliers detection using a robust M-estimator; (c) parameter estimation and subsequent construction of the respective condence intervals; and (d) practical identiability analysis. The methodology is applied to the glyceride TE reaction with sodium methoxide as catalyst, using a data set comprising 144 points collected from literature, and comparing the results for a ve (5P) and a six (6P) parameters' model. Although the results show that both kinetic models are structurally non-identiable, the former is 1

ACS Paragon Plus Environment

Industrial & Engineering Chemistry Research 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 2 of 40

practically identiable for the case study analyzed. Furthermore, the 5P model shows a good capability of explaining the data features, with an average relative deviation of 5.2% for the residuals. The methodology presented can be used to analyze general ki-

netic reactions when experimental data is available and plausible reaction mechanisms are known.

1

Introduction

Biodiesel, also known as fatty acid methyl esters (FAME), is an alternative fuel that is used in pure form or blended with fossil diesel in vehicles without engine modications. Although in principle it can be manufactured from any oil or fat reacted with short-chain alcohols, the large-scale production of biodiesel is usually carried out via transesterication of glycerides (acylglycerols) contained in vegetable oils, with methanol (CH3 OH), in the presence of a catalyst to accelerate the reaction. Factors inuencing the process topology and economy include the duration of the reaction, the temperature at which the reaction is run, the methanol-to-oil mole ratio, mixing rate, the reactant's purity, and the nature and amount of catalyst. 1,2 Alkaline hydroxides and methoxides have shown to be very eective as catalysts, and potassium hydroxide, sodium hydroxide and sodium methoxide (NaOCH3 ) are the most disseminated in the industry. The TE reaction involves the exchange of alkoxyl groups between ester and alcohol compounds, and consists of a sequence of parallel and reversible reactions where tri- (TG), di(DG) and mono- (MG) glycerides, mixed with an alcohol in the presence of a base, react forming FAME and glycerol (GL): TG + CH3OH

DG + R'COOCH3

DG + CH3OH

MG + RCOOCH3

MG + CH3OH

GL + R 'COOCH3

2

ACS Paragon Plus Environment

[R1]

Page 3 of 40 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Industrial & Engineering Chemistry Research

Here R', R and R ' represent long-chain (C6 C24 ) alkyl groups. Initially, the reaction is mass transfer-controlled due to the low solubility of methanol in TG. 3 As the reaction progresses, the liquid-liquid interphase boundary collapses, the drop size of the disperse phase (methanol) decreases and with it the mass transfer resistance. The reacting mixture then behaves as a pseudo-homogeneous medium where the reaction rate becomes kineticallycontrolled. During the later stages, the glycerol-enriched phase splits carrying most of the catalyst from the FAME-enriched phase, and the reaction rate becomes again mass transferlimited until the equilibrium is reached. 4,5 Consequently, the rate at which the TE occurs assumes an overall sigmoid shape, with a lag time that is more pronounced at temperatures below 50 ◦C and low mixing speeds. 2,4 An extensive body of literature devoted to studying the kinetics of TE using dierent catalytic systems has been published. Besides the experimental work carried out by several authors, both mechanistic (based on rst principles) and empirical (hybrid or purely statistical inferred) models have been proposed for data tting. Mechanistic models describe separately the stepwise reactions in [R1] including the formation and consumption of intermediary species, 2,6,7 saponication and ion methoxide formation reactions. 8 In Stamenkovic et al. 4 , the mass transfer-controlled regime in the initial heterogeneous phase is also modeled, while Brásio et al. 9 incorporate the frequency of rotation and time delay factors in the TE rate of reaction models to quantify the impact of the mixing rate. Models of this type have the advantage of providing a detailed description of the evolution of the concentration proles of all components in the TE, based on very detailed phenomenological considerations. Nevertheless, the experimental data required to parameterize the resulting kinetic models typically requires the quantication of the various glyceride species present, which is seldom available, and the mass transfer aspects are often system-specic. Empirical models to describe the TE kinetics may include dierent levels of phenomenological knowledge, such as the general structure of reaction rates and/or dominant phenomena. They can also be statistically inferred, with an empirical structure resulting from the 3

ACS Paragon Plus Environment

Industrial & Engineering Chemistry Research 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

use of a maximum-likelihood (ML) criterion, usually reporting the overall conversion of TG over time. Vicente et al. 10 used a factorial experimental design to obtain second-order models that estimate the nal FAME purity and yield as a function of catalyst concentration and temperature. In a related approach, Granjo et al. 11 obtained kinetic models with reduced complexity and a broad application spectrum, assuming that the TE behaves as an irreversible pseudo-homogeneous reaction of rst or second orders. While these models are useful to screen operating regions of interest, they become of limited use for the optimal reactor design or control, since they did not capture the reversible nature of the TE reaction, among other mechanistic aspects. The development of kinetic models for the TE can benet from the use of systematic approaches to maximize the information gathered from the data and provide guidelines to drive the experiments. Previous works by Kontoravdi et al. 12 , Kiparissides et al. 13 introduced strategies for model development within the context of biological systems, which include model selection, structural identiability analysis, parameter estimation and analysis, optimal design of experiments, and model validation. Bonvin et al. 14 review the state-of-art and discuss current challenges in the eld. Once a model structure is proposed, one may wish to check a priori whether the parameters can be uniquely identied, i.e., whether the solution trajectory is unique for any given parameter's vector and system inputs. To handle this task, usually designated as a structural identiability analysis (MSI), algebra-based methods 15 that attempt to nd the characteristic set of the dierential system using symbolic computation can be used. Other approaches include optimization-based techniques 16 where the MSI problem is formulated as a semi-innite program (SIP), and sensitivity-based analysis, 17 which relies on estimating sensitivity measures averaged over the parameter space. Traditionally, model tting of TE kinetic models, which have the form of a system of dierential algebraic equations (DAEs), involves the embedment of a DAE solver within an optimizer, sequentially solving the problem until convergence is attained. Optimizers employed include the general reduced gradient, 6 Levenberg-Marquardt, 7 Nelder-Mead, 18 and 4

ACS Paragon Plus Environment

Page 4 of 40

Page 5 of 40 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Industrial & Engineering Chemistry Research

genetic algorithms; 19 for the integration techniques, the Runge-Kutta 7,8,18 and Gear's methods have been used. 19 This sequential approach has the advantage of allowing the explicit control of the local truncation error along the integration, and since the parameter estimates are used to integrate the model equations, feasible concentration estimates are always maintained throughout the solution phase. However, the repetitive integration of DAE systems can be computationally expensive, and the solution accuracy reached during the integration phase might limit the ultimate accuracy attainable in the optimization stage. An alternative strategy is to transcribe the DAE system into a set of algebraic constraints through the use of collocation methods, and to solve the resulting large-scale algebraic problem at once, employing a convenient solver; this is the simultaneous approach. 20,21 The resulting nonlinear programming (NLP) formulation generally contains special structure and sparsity patterns that can be exploited by suitable NLP solvers, for numeric eciency. 21 An additional aspect is the detection of outliers in data, whose occurrence may lead to biases in the parameter estimates, as the least squares estimator is highly sensitive to the existence of large residues. To tackle this problem, robust methods have been applied, such as the Hampel's lter, 22 correntropy, 23 and the least median of squares (LMS). 24,25 Contrarily to the Hampel's and correntropy estimators, LMS does not need tuning and has a high breakdown point, but its minimization is hard due to its multi-modality and discontinuous nature. After estimating the model parameters, their reliability and adequacy needs to be quantied, a step surprisingly often omitted in the literature of TE kinetic modeling. 2,6,8,19 For this purpose, a practical identiability analysis is useful to check whether the parameter estimates and the model structure are supported by the amount and quality of the data collected. The possible existence of insensitive parameters, subsets of strongly correlated parameters, or the amount of noise in the data may cause ill-posed estimations. 26 Note that a model which is considered to be structurally identiable might become practically nonidentiable, depending on the data sample space and the variability therein. Conversely, 5

ACS Paragon Plus Environment

Industrial & Engineering Chemistry Research 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

a model classied as practically identiable may be structurally non-identiable, since the parameter estimates for nonlinear models have necessarily a local nature. This work addresses the current shortcomings found in the modeling of the TE reaction with a homogeneous catalysts. A general reversible model for the global kinetic rate is used to capture the eect of key variables over a broad range of operating conditions. The data was selected so that the presence of side reactions (e.g., saponication) can be neglected, with the TE occurring in a kinetically-controlled regime. Consequently, the eect of chemical kinetics on the overall rate of reaction is decoupled from the mass transfer aspects and additional side reactions. To attain this goal, a stepwise methodology is used to build and validate the process models. The main tasks include: (i) a structural identiability analysis (Task 1); (ii) the detection of outliers in data (Task 2); (iii) parameters estimation (Task 3); and (iv) practical identiability and nonlinear regression analysis (Task 4). Tasks 1 and 3 were formulated as a semi-innite and nonlinear program (respectively), and solved simultaneously using orthogonal collocation on nite elements. The minimization of the LMS in Task 2 was achieved by stochastic simulation, while the statistical signicance of parameters and the model adequacy were assessed in Task 4, using metrics based on the asymptotic theory and information contained in the sensitivity matrix. The analysis is limited to experimental data available in the literature; consequently optimal experimental design for parameter estimation or model discrimination was not addressed in this work. The approach is demonstrated for the TE reaction of glycerides, using NaOCH3 as a catalyst. The remaining parts of the paper are organized as follows. In Section 2, the model structure of the TE reaction is derived, and the main assumptions underlying the model representation are introduced. Subsequently, the methodology used to t the model is described in Section 3, and the results obtained from the application to the TE reaction with NaOCH3 as catalyst appear in Section 4. Section 5 considers nal remarks and the analysis of future extensions to this framework. 6

ACS Paragon Plus Environment

Page 6 of 40

Page 7 of 40 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Industrial & Engineering Chemistry Research

2

Kinetics modeling

For the purpose of modeling the TE reaction, the data used for tting was obtained from experiments at lab scale reported in the open literature, where the following operating conditions were assured: (i) the reaction occurs under intense agitation; (ii) the reaction occurs with methanol excess; and (iii) the water and free fatty acids (FFA) contents in the glycerides are within the recommended operating ranges. Under these assumptions, the reacting mixture behaves as a pseudo-homogeneous medium, the concentrations of the intermediary products (DG and MG) are negligible, and the saponication, hydrolysis, and FFA esterication reactions may be essentially neglected. 13,6,10 The maximum limits considered for the concentration of water and FFA were 0.5 wt % and 3 wt %, respectively, usually recommended to carry out base-catalyzed TE without pretreatment. 1,27 Consequently, the TE reaction is represented as an overall equilibrium reaction TG + 3 CH3OH

k1 k2

3 FAME + GL

[R2]

The overall reaction rate is expressed by a reversible kinetic law a b c d −rTG = k1 CTG CCH − k2 CFAME CGL 3 OH

(1)

where −rTG is the rate of disappearance of TG, C the concentration of components (in mole fraction), and k1 and k2 are the kinetic constants for the direct and reverse reactions, respectively. The parameters a, b, c, and d are the individual reaction orders for TG, methanol, FAME and GL in [R2], respectively, and the equilibrium constant of [R2] is denoted by keq = k1 /k2 . Using the stoichiometry of reaction and considering that no products exist at the beginning of the experiments, Eq. (1) is reformulated, similarly to Kumar et al. 19 , so that the rate of reaction is expressed in terms of TG conversion, denoted by XTG , as a function of

7

ACS Paragon Plus Environment

Industrial & Engineering Chemistry Research 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 8 of 40

time (t in min), concentration of catalyst (xa in % w/w.oil), temperature (T in Kelvin) and methanol-to-oil mole ratio (xMOR ), i.e.,

dXTG (t) (a+b−1) (c+d−1) = k1 CTG,0 (1 − XTG (t))a (xMOR − 3XTG (t))b − 3c k2 CTG,0 XTG (t)(c+d) dt

(2)

with the initial concentration of TG, CTG,0 , obtained from

CTG,0 =

1 1 + xMOR + 10−2 ·

MwTG x Mwcat a

,

(3)

and Mw representing the molecular weight. Whenever the molecular weights of TG and FAME are not reported, they are estimated from Phan and Phan 28 , with the mass fractions of the FFA of vegetable oils and corresponding FAME retrieved from the literature. 29 The orders of the reactants and products in the kinetic expression are assumed to be independent of the reaction conditions, k1 is described by an Arrhenius-like equation (Eq. (4)) and k2 is expressed using k1 and keq (Eq. (5)), which in turn is modeled by the Van't Ho equation: −Ea1

k1 (T, xa ) = k10 xa e Rgas T

(4)

k2 (T, xa ) = k1 (T, xa )/keq (T )   ∆rx H ◦ 1 1 ◦ − ln keq (T ) = ln keq − Rgas T 298

(5) (6)

The kinetic model formed by Eqs. (2)(6) is a DAE system, and includes the parameters: (i) k10 , that represents the frequency factor; (ii) Ea1 , the energy of activation (J mol−1 ); ◦ (iii) keq , the equilibrium constant at 298 K; and (iv) the individual orders of reaction a, b, c,

and d. The standard enthalpy of reaction (∆rx H ◦ ) is xed at 30 −300.51 kJ mol−1 , and Rgas is the ideal gas constant. The form of the kinetic model in Eq. (2) is not adequate for data regression purposes, since the parameters contained in Arrhenius- and power law-like terms interact strongly. 8

ACS Paragon Plus Environment

Page 9 of 40 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Industrial & Engineering Chemistry Research

This adverse phenomenon is caused by the product of parameters and typically leads to ill-conditioned problems, due to their mutual inuence. The reciprocal compensation of the values assigned to the parameters will also originate the existence of multiple local optimal solutions. To prevent these numerical diculties, the approach of Buzzi-Ferraris and Manenti 31 was used to reformulate Eq. (2) into a better structure (Γ):

dXTG (t) ≡ Γ(T, xa , xMOR , XTG (t); θ) = dt = k1 (α3 CTG,0 )(a+b−1) (1 − XTG (t))a (xMOR − 3XTG (t))b − − k20 (α3 CTG,0 )(θ4 −1) XTG (t)θ4 ,

(7)

k1 is now (8)

k1 = (α4 · xa ) eθ1 +α1 θ2 (1/T +α2 ) , and −Ea1

3c k10 xa e Rgas T 3c k1 = = (α4 · xa ) eθ3 +α1 θ2 (1/T +α2 )+α5 /T +α6 , k20 = ◦ − ∆Rrx H (1/T −1/298) θ4 −1 keq ◦ gas keq e α4 α3

(9)

where the constants α1 to α4 are introduced to balance Eq. (7), which together with α5 and

α6 are estimated from 1 1 , α2 = − , k1/T + α2 k2 T 1 ∆rx H ◦ α4 = , α5 = , kxa k2 Rgas

α1 =

α3 =

1

kCTG,0 k2 α5 α6 = − , 298

(10)

where k•k2 represents the l2 norm of vectors T , xa and CTG,0 containing the set of their measured values. The new parameters, denoted θ1 , θ2 , θ3 , θ4 , respectively, are related to the original ones as

 θ1 + α1 α2 θ2 = ln

k10 α4 α3a+b−1 9

 ,

Ea1 = −α1 θ2 Rgas ,

ACS Paragon Plus Environment

(11)

Industrial & Engineering Chemistry Research 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

and

θ3 = ln

3c k10 ◦ α αθ4 −1 keq 4 3

Page 10 of 40

! ,

θ4 = c + d.

(12)

◦ , decreasing the number of parameters to be Note that θ3 and θ4 aggregate c, d and keq

tted and avoiding their mutual interaction. Consequently, the complete set of the original ◦ is known. The kinetic model comprising the paparameters can only be estimated if keq

rameter set θ = {a, b, θ1 , θ2 , θ3 , θ4 } is denoted as 6P, while the ve parameter version (5P) is obtained from 6P by setting θ4 = 1, i.e the sum of the reaction orders for FAME and glycerol is 1 (Eq. (2)). The dynamic model formed by Eqs. (7)(12) is compactly represented by M(XTG , T, xa , xMOR , θ, t). The data available from each experiment consists of values of XTG measured at discrete time instants at which the batch reactor is sampled, while the process variables aecting the reaction rate (T , xa and xMOR ) are set at the beginning and kept constant along the experiment.

3

Methodology

The stepwise methodology for the estimation of parameters and analysis of general nonlinear models comprehends the following tasks: ˆ Task 1 - structural identiability analysis (MSI), to assess the theoretical uniqueness of the parameters over the range of operation, given the proposed model; ˆ Task 2 - detection and elimination of outliers in data; ˆ Task 3 - parameter estimation (PE) to nd the optimal parameter set (θ ∗ ) that maximize a ML criterion of the residuals (ϕ); ˆ Task 4 - practical identiability analysis, to verify the uniqueness of the parameters given the noisy data available, as well as the identication of potential collinear parameters. Here, statistical model adequacy tests are performed and the parameters' condence intervals for a 95% signicance level are determined. 10

ACS Paragon Plus Environment

Page 11 of 40 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Industrial & Engineering Chemistry Research

In addition to the assumptions stated in Section 2 for the experimental conditions, it is considered that the experiments are subject to similar random errors with variance σ 2 and zero mean, i.e., the data is statistically independent and identically distributed. Figure 1 represents the methodology steps that are detailed in the next sections. [Figure 1 about here.]

3.1 Task 1  Structural identiability analysis In the MSI analysis, it is assumed that the model adequately describes the data available under a noise free scenario, given the set of measurements available from the process. Since in our setup the complete structure of the data is already established, including all the input variables tested and the output variable measured along the experiments, the results of the MSI do not inuence the subsequent experiments. Consequently, the analysis is performed

a posteriori, and is constrained to the range of the operational variables (T, xa , xMOR ) used in the experimental data set, as well as to a xed initial condition, XTG (0) = 0. Here, the MSI seeks to assess the uniqueness of the parameters tted if the same process structure is maintained through additional batch experiments, where T , xa and xMOR were chosen and kept constant during each test. Let θ, θ 0 ∈ Θ ⊂ Rnθ where Θ is the closed region containing the model parameters, and

Θ is obtained from the Cartesian product of the parameters domain, i.e., Θ =

×

nθ [θL , θiU ], i=1 i

where θiL is the lower plausible value of θi , θiU the upper, and nθ is the number of parameters in the model. The MSI is formulated as a semi-innite programming (SIP) problem. 16 It consists in nding a point in the domain of the input variables that allows maximizing the Cartesian distance between the vectors θ and θ 0 ∈ Θ and yield indistinguishable trajectories 0 0 XTG (T, xa , xMOR , θ, t) (or simply XTG ) and XTG (T, xa , xMOR , θ 0 , t) (or XTG ) for t ∈ [0, tf ]

with tf designating the nal integration time. Here, the distinguishability of trajectories is 0 formulated as a constraint where the integral of the weighted squared dierence XTG − XTG

is below a small positive constant, ξ , previously set. Without loss of generality, the problem 11

ACS Paragon Plus Environment

Industrial & Engineering Chemistry Research 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 12 of 40

can be formulated for a single-output system as

max

θ,θ 0 ∈Θ

s.t.

T

Ψ = (θ − θ 0 ) Wθ (θ − θ 0 ) Z tf 2 2 0 wX (XTG − XTG ) dt − ξ ≤ 0 TG 0

M(XTG , T, xa , xMOR , θ, t)

(SIP1)

0 M(XTG , T, xa , xMOR , θ 0 , t)     L   U , ∀x ∈ x , x ∀T ∈ T L , T U , ∀xa ∈ xLa , xU MOR a MOR MOR

where Wθ is a weighting matrix; in our case an identity matrix was used, since we give equal importance to all the parameters and expect values of the same order of magnitude. wXTG is a scaling factor that allows weighting dierently the points along the experiments, here xed at one. The objective function Ψ represents the maximum distance between the parameter vectors that produce indistinguishable trajectories. If Ψ is below a small threshold value, , also previously set, the model is structurally identiable; otherwise the model is considered structurally non identiable. The operation region for the inputs is denoted by X ⊂ R3 and is formed by the Cartesian product of the plausible range of all the input variables, i.e.,      L  U X = T L , T U × xLa , xU a × xMOR , xMOR . 0 The trajectories XTG (t) and XTG (t) are calculated by integrating Eq. (7) for parameter

sets θ and θ 0 through orthogonal collocation on nite elements. 21 Let the time be discretized into E nite elements numbered by j where j ∈ {1, · · · , E}, and XTG (t) be locally approxiK mated by XTG (t), a K th order polynomial using K + 1 collocation points, where the index

representing the sequence is k ∈ {0, · · · , K}. Also, let tj,k = tj−1 + hj τk , where tj,k is the discretized point of t corresponding to k th collocation point at nite element j , hj = tf /E be the length of the nite element j , and τk ∈ [0, 1] the k th Radau root calculated from the K shifted Legendre polynomials with xed end-point τK = 1. 21 Then, XTG (t) at element j is

given by K XTG (t)

=

K X

lk (τ ) XTGj,k

k=0

12

ACS Paragon Plus Environment

(13)

Page 13 of 40 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Industrial & Engineering Chemistry Research

lk (τ ) =

K Y (τ − τl ) (τk − τl ) l=0,6=k

(14)

with τ0 = 0, XTG1,0 = 0, τk < τk+1 and k, l ∈ {0, . . . , K}. Lagrange interpolation polynoK mials are used for convenience, because XTG (tj,k ) = XTGj,k . Substituting the polynomial K approximation of XTG (t) into Eq. (7), collocation equations in element j approximating the

solution of the system of DAEs at the interpolation points τl are obtained K X

l˙k (τl )XTGj,k = hj Γ(T, xa , xMOR , XTGj,l ; θ),

l = {1, . . . , K}

(15)

k=0

where l˙k (τl ) = dlk (τl )/dτ . Note that the solution is the set of polynomial coecients at the collocation points. Additionally, constraints are introduced to impose the continuity of

XTG (t) between contiguous elements so that the rst node (τ0 ) in element j + 1 coincides with the last node (τK ) of element j , and XTGf is the predicted conversion of TG at tf .

XTGj+1,0 =

K X

lk (1)XTGj,k ,

j = {1, . . . , E − 1}

(16)

k=0

XTGf =

K X

lk (1)XTGE,k

(17)

k=0

The time integral in (SIP1) is calculated with the Lobatto quadrature formula wherein Q a local polynomial approximation of degree Q, XTG (t), is constructed. Here, the Lobatto's

roots, τq ∈ [0, 1], and respective weights ωq used in the quadrature are calculated from Q the shifted Legendre polynomials with both end-points included. The values of XTG (t)

are computed as in Eqs. (13)(17) for K + 1 collocation points. The discretized form of the model M(XTG , T, xa , xMOR , θ, t) is denoted by Md (XTG , T, xa , xMOR , θ, t) and contains Eqs. (8)(12) and Eqs. (13)(17).

13

ACS Paragon Plus Environment

Industrial & Engineering Chemistry Research 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 14 of 40

Problem (SIP1) can now be cast as a SIP and solved simultaneously:

max 0

θ,θ ∈Θ

s.t.

T

Ψ = (θ − θ 0 ) Wθ (θ − θ 0 ) 0.5

Q E X X

2 ωq w X TG



XTGj,q −

0 XTGj,q

2

−ξ ≤0

j=1 q=0

Md (XTGj,q , T, xa , xMOR , θ, tj,q ),

0 Md (XTGj,q , T, xa , xMOR , θ 0 , tj,q )

(SIP2)

0 Md (XTGj,k , T, xa , xMOR , θ, tj,k ), Md (XTGj,k , T, xa , xMOR , θ 0 , tj,k )      L  U ∀T ∈ T L , T U , ∀xa ∈ xLa , xU a , ∀xMOR ∈ xMOR , xMOR

The algorithm used to solve problem (SIP2) is described in Mitsos 32 . The procedure requires setting the value for maximum violation allowed ν (> 0), the threshold for parameter's distance, , the relative tolerance imposed for convergence, `, and an initial point of the space

X , say x0 , that is stored in a vector designated X containing a nite set of feasible points of X that approximate the innite constraints of the original problem where in iteration 1

X = {x0 }. Specically, we use ν = 10−5 , ξ = 10−5 ,  = 10−3 , ` = 10−3 , and consider one of the vertex of X for initial point. Afterwards, the procedure iterates as follows: 1. the (SIP2) problem is solved for X and the solution (Ψ∗ ) corresponds to an upper bound, U B . The optimal solution is the set of parameters θ ∗ and θ 0∗ ; 2. the feasibility of the solution obtained in step 1 is checked by nding the maximum dierence of the sum of squares of the predictions which must be positive; 3. the lower level problem that consists of the maximization of the distance between trajectories for vectors θ ∗ and θ 0∗ is solved for x ∈ X . The problem generates a (local) lower bound of the original problem, LB , and a local maximizer, say x1 ; 4. the feasibility of the solution obtained in step 3 is checked by nding the maximum dierence of the sum of squares of the predictions which must be positive; 5. the convergence is checked, i.e. if −` ≤ (U B − LB)/U B ≤ ` the solution converged; 14

ACS Paragon Plus Environment

Page 15 of 40 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Industrial & Engineering Chemistry Research

otherwise if the feasibility checking step 4 is successful, X is appended with x2 , i.e.

X ← X ∪ {x2 } sets a new bound for the violation, i.e. ν ← ν/2, increases the iteration by 1, and go to step 1. The system is structurally identiable when after converging the problem (SIP2), the value at convergence is lower than or equal to ; otherwise is not identiable. The problem (SIP2) is rst solved with a local NLP solver and if the feasible local maximum of Ψ is below the threshold , problem (SIP2) is then solved for global optimality. The latter step, may not be required once the non identiability of the model has been already established with an optimal solution Ψ∗ > .

3.2 Task 2  Outlier detection The strategy used to detect outliers consists in replacing the least squares criterion ϕ by the median of squares (ψ ), an estimator that is able to deal with a large proportion of outliers in experimental data sets. 25 First, the model is tted to data using the median of squares criterion, which allows nding the outlier observations. In our conceptualization the outliers are associated to experiments that exhibit standard residuals above a previously set threshold level. Next, the outliers are excluded from the original data; and nally the least squares criterion is used for model tting using the remaining data (Section 3.3). Replacing ϕ by ψ as the objective function, the least median of squares problem (LMSP) becomes:

min ψ = med ri2 θ∈Θ

s.t.

i

ˆ TG,i − XTGi,f , ri = X

∀i ∈ {1, . . . , N }

(LMSP)

Md (XTGi,j,k , Ti , xa,i , xMOR,i , θ, ti,j,k ), i ∈ {1, . . . , N } As in (SIP2), the integration of Eq. (7) is performed via orthogonal collocation on nite elements, but now the model M is applied to a multi-experiment scenario where the index

i ∈ {1, · · · , N } is used for the experiment identication, and N is the number of experi-

15

ACS Paragon Plus Environment

Industrial & Engineering Chemistry Research 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

ments available, each one lasting ti,f which matches the maximum time value reported for experiment i. The discretized time instants ti,j,k = ti,j−1,k + hi,j τk can be dierent for each experiment, and consequently hi,j (= ti,f /E ) is the length of element j in experiment i, and

ˆ TG,i ) and the one preri is the residual value between the experimental conversion of TG (X dicted by the model (XTGi,f ) for the ith experiment. Moreover, Ti , xa,i , and xMOR,i represent the temperature (K), catalyst concentration (% w/w.oil) and methanol-to-oil mole ratio for an experiment i collected from data in literature, respectively. The minimization of ψ based on gradient algorithms is hard due to discontinuities and the multi-modality nature of the median function. To overcome this issue we generate a stochastic and suciently large number (S ) of parameter sets (θˆ ∈ Θ), and solve the constrained nonlinear system (CNS) from (LMSP) for each one. The set of discrete instances

θˆ sampled from Θ is denoted by ΘS . This allows computing the model predictions for each θˆ ∈ ΘS , the residuals ri for all the experiments, and the respective values of the least squares criterion, ϕ, and least median, ψ , for each θˆ, which are then stored. Afterwards, the vector

θˆ that produces the lower ψ , designated ψ ∗ , is chosen. The procedure can be compared to multi-start heuristic algorithms for NLP problems. The sampling technique employed uses S points of an Hammersley sequence on a nθ dimensional hyperspace cube, a procedure described in detail elsewhere 33 and omitted here due to space limitations. Hammersley sequences fall into low-discrepancy sequences commonly used in quasi Monte Carlo sampling techniques. Typically, they show low deviation from the uniform distribution, and there is a low correlation between sequences of points on dierent basis if the sampling space is not too large, assuring that the nθ -dimensional cube is uniformly screened for the ϕ and ψ estimators. This strategy has the advantage of sampling only on Rnθ , instead of the domain of the full problem dimension which includes basic variables, and is easily parallelized, reducing the solution time. Although this method based on stochastic simulation cannot guarantee a global minimum of ψ , it is usually ecient to nd a good one to detect outliers. 25 16

ACS Paragon Plus Environment

Page 16 of 40

Page 17 of 40 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Industrial & Engineering Chemistry Research

After sampling the parameter's space and the construction of a pool of local optima, the best solution (ψ ∗ ) is employed to identify the outliers. Each experiment in the initial battery identied as an outlier is given a null weight (wi = 0) in the subsequent least squares problem (WLSQP), while the set of regular experiments are given a weight wi = 1 . The procedure to determine the weights wi from (LMSP) is described in detail in Rousseeuw and Leroy 24 , and starts by considering an initial estimate for the residuals scale factor s0 , which is based on the minimal median multiplied by a nite-sample correction factor:

s0 = 1.4826(1 − 5/(N − nθ ))

q med ri2 i

(18)

The factor s0 is then used to obtain initial guesses for the standard residuals (ri /s0 ) and wi :

   1 wi =   0

if |ri /s0 | ≤ Fcrit.

i ∈ {1, . . . , N },

otherwise

(19)

where Fcrit. is the critical threshold, which is xed by the user. A value of 3.5 was assumed in this work. The robust standard deviation σ ∗ is then computed as follows:

v u P u N w r2 u i i u i=1 ∗ σ =uN tP w i − nθ

(20)

i=1

The nal weights are computed from Eq. (19), where in the standardized residuals, s0 is replaced by σ ∗ . The parameter estimation stage is performed subsequently where problem (WLSQP) is solved with the new weights wi dened here. To increase the convergence rate of the local NLP solver, a subset of the parameter sets θˆ generated in the sampling stage of the outlier detection step are provided as initial guesses.

17

ACS Paragon Plus Environment

Industrial & Engineering Chemistry Research 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 18 of 40

3.3 Task 3  Parameter estimation The data tting is formulated as a weighted least squares problem (WLSQP) of N experiments where the weights wi containing the information of whether the experiments are regular or outliers are determined in Task 2 (Section 3.2).

min ϕ = θ∈Θ

s.t.

N X

wi ri2

i=1

ˆ TG,i − XTGi,f , ri = X

∀i ∈ {1, . . . , N }

(WLSQP)

Md (XTGi,j,k , Ti , xa,i , xMOR,i , θ, ti,j,k ), i ∈ {1, . . . , N } As in Tasks 1 and 2, the orthogonal collocation method was employed, where the resulting NLP problem is solved all-at-once.

3.4 Task 4  Statistical and practical identiability analysis Following the PE stage, a set of metrics were estimated to assess the accuracy and signicance of the parameter estimates. Statistical inferences such as condence regions, parameter's standard error (s.e. θ ∗ ), intervals of condence of XTGi,f and θˆ for a signicance level of 95%, and p-values were computed, assuming that regularity conditions hold for the local linearization. 34 . The model adequacy to capture the underlying features of the data is also tested with a Chi-square test, where a goodness-of-t indicator (χ2red ) is compared with its critical value (χ2crit., red ). Finally, a practical identiability analysis is performed based on the sensitivity matrix (S ) of the (WLSQP) problem at the solution, following the approach described in López et al. 26 . The matrix S is decomposed using a SVD factorization to nd its eigenvalues and, in turn, the corresponding condition number (κ(S)) and the collinearity index (γ(S)). These indicators are used for measuring the practical identiability and detecting subsets of correlated parameters, constituting reformulation candidates in simpler model structures. To prevent bad scaling of S , an algorithm based on Curtis and Reid 35 was employed to scale matrix S before the SVD factorization. The acceptance of the resulting 18

ACS Paragon Plus Environment

Page 19 of 40 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Industrial & Engineering Chemistry Research

model depends on the overall analysis of these statistical and identiability indicators.

4

Application

The data set considered to parameterize the kinetics of the TE with sodium methoxide as the catalyst has a total of 144 batch experiments (or runs) collected from the literature. The temperature (T ) ranges from 318 to 338 K, xa from 0.1 to 2.5 % w/w.oil, and xMOR from

3.2 to 9.0, with the experiences lasting up to 240 min. The sets used were retrieved from: 15 1 ; 68 36 ; 918 37 ; 1925 38 ; 2632 39 ; 33 40 ; 3443 41 ; 4475 42 ; 76105 43 ; 106126 44 ; 127 134 18 ; and 135144 45 . The experimental setups involved batch reactor vessels with volumes between 100 to 2000 mL, initial amounts of oil (mainly rapeseed and sunower) between

60 to 500 g, concentration of FFA between 0.02 to 1.2 wt %, and mixing speeds between 600 to 1200 rpm, except for KoohiKamali et al. 44 which reports a stirring rate of 300 rpm. Although some of these works 1,36,38,41,43 do not report the value of the mixing rate, they were still herein considered. For the MSI analysis, problem (SIP1) was solved in GAMS 46 for the time interval

0 to 240 min, with E = 12, K = 3, within the domain of the input variables X = [318, 338] × [0.1, 2.5] × [3.2, 9.0], the domain of parameters sets θ, θ 0 ∈ [0, 4]2 × [−20, 20]3 for the 5P model θ, θ 0 ∈ [0, 4]2 × [−20, 20]3 × [0, 10] for the 6P model. The local maxima Ψ∗ found for the 5P and 6P models are about 2968 and 4131, respectively, well above the identiability threshold ( = 1 × 10−3 ). The number of equations that each subproblem of (SIP1) reached during the solution cycle and the overall CPU time that took for convergence were, respectively, 5800 and 64 s for the 5P model; 11 500 and 217 s for the 6P. These CPU times were obtained in an Intel Xeon X5675 @ 3.07 GHz workstation using CONOPT 47 as NLP solver. Similar solutions were obtained with the global solver BARON 48 , which completed normally with optimal solution within 13 145 s of CPU time for the 5P model and 24 430 s for the 6P, using an absolute and a relative gap of 10−8 and 10−7 , respectively.

19

ACS Paragon Plus Environment

Industrial & Engineering Chemistry Research 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The results of the MSI analysis show that the 5P and 6P models are structurally nonidentiable, which means that in both cases there are points in the domain X that may produce indistinguishable trajectories for distinct vectors θ and θ 0 . Notice that this analysis only determines if a single run allows the determination of all model parameters uniquely. The results obtained may not apply to (i) simultaneous sets of batch experiments where the input variables are kept constant during the time duration of each experiment but at dierent levels in separate experiments; (ii) dynamic experiments where the inputs are changed along the experiment so that an objective function measuring the information content is maximized; or (iii) multiple measurement devices that can observe various variables describing the reaction progress in time. These alternatives fall into the optimal design of experiments, which is out of this study's scope. In Task 2, the number of sampling points, S , generated for forming the pool of θˆ in LMS minimization was 5 × 104 , where a, b ∈ [0, 4], θ1 , θ2 , θ3 ∈ [−20, 20], for both 5P and 6P kinetic models and θ4 ∈ [0, 10], only required in the 6P model. Table 1 lists the ve best solutions found using the least median of squares criterion in the 5P and 6P models, together with the best solution for ϕ of each model in the sampling stage. The best solution of ψ for the 5P kinetic model was 8 × 10−4 (sample 37 743), while of ϕ it was 1.101 (sample 34 810). It is clear that the least sum of squares estimator is aected by the presence of contaminated data, given that the median of squares of sample 34 810 is considerably higher (26 × 10−4 ) than the lowest value found for ψ . In the absence of outliers, the solutions provided by the estimators least sum of squares and least median of squares tend to be equal. 24 Similar results are observed for the 6P model, where the best solution of ψ was approximately

5 × 10−4 (sample 40 414) and of ϕ was 1.091 (sample 44 526), with a corresponding value of ψ of 27 × 10−4 . [Table 1 about here.] The biases of ϕ are highlighted when comparing the studentized and standard residual plots in Figure 2 for both models. While Figure 2a and Figure 2c exhibit only one or no 20

ACS Paragon Plus Environment

Page 20 of 40

Page 21 of 40 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Industrial & Engineering Chemistry Research

outliers when ϕ is used as estimator, Figure 2b and Figure 2d show 24 and 27 outliers when

ψ is used for the 5P and 6P kinetic models, respectively. The runs identied as outliers in both Figure 2b and Figure 2d were 12 1 , 1925 38 , 2632 39 , 77, 79, 80, 83, 87, 92, 99 and 102 43 . Additionally, experiments 33 40 , and 97 and 101 43 were identied as outliers in 6P kinetic model, as shown in Figure 2d. In these runs, other factors may be in play which were not captured by both models, particularly mass transfer. 1,38,43 [Figure 2 about here.] In the PE stage (Task 3), problem (WLSQP) was also implemented in GAMS 46 employing the NLP solver CONOPT 47 with a subset of the best solutions found in Task 2 as initial guesses, while the subsequent statistical treatment was performed in Mathematica. 49 The number of nite elements E was xed E = 12 with K = 3, which lead to an ARD between predictions of XTGf from GAMS and Mathematica below 10−6 for both kinetic models. Table 2 summarizes the results where (WLSQP) was solved with the outliers previously detected excluded from data set. The optimal value ϕ(θ ∗ ) found for the 5P kinetic model was 0.505, corresponding to a ψ value of 6.41 × 10−4 and an ARD equal to 5.2%. As for the 6P model, ϕ(θ ∗ ) was 0.255, related with a ψ of 2.35 × 10−4 and an ARD of 3.7%. Note that the values of ψ for both models are now closer to the best local optima found in the pool generated to identify outliers via LMS minimization (Table 1), which was expected given the exclusion of outliers. Moreover, both models t well to the remaining data since they have a R2 above 0.995, and a χ2red between 0.94 and 0.95. [Table 2 about here.] The main dierence between the 5P and 6P kinetic models concerns the statistical signicance and correlation between the parameters estimates. Table 2 shows large standard errors (s.e.) for θ ∗ , as well as p-values above the threshold (0.05) in the cases of θ2 and θ4 of the 6P model. This is a clear symptom of model over-tting which is corroborated by the two-dimension condence region ellipses (Figure S2 of Support Information) that dis21

ACS Paragon Plus Environment

Industrial & Engineering Chemistry Research 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

play a strong correlation among parameters. The value of s.e. θ ∗ for the parameters in the 5P kinetic model are signicantly lower and are all statistically signicant. Although the two-dimension condence region ellipses in Figure S1 still display some collinearity between parameters, this is not as extreme as in Figure S2. The average value of the condence intervals for the response variable XTGf,i also decreased from 0.0922 (6P) to 0.0676 (5P). The results of the practical identiability analysis presented in Section 3.4 denote a strong agreement with the those inferred from the basic statistical analysis, since the 6P model reveals high collinearity among the parameters (γ(S) = 162) and the inability to robustly identify them uniquely, since κ(S) is also high (506). In contrast, the 5P model is satisfactory and preferable, since it explains accurately the data (ARD = 5.2%) and its parameters are practically identiable in the sense that κ(S) < 1000 and γ(S) is just above the recommended range γ(S)