Model Building Methodology for Complex Reaction Systems

Apr 15, 2015 - Preliminary experiments provide information for construction of a reaction network containing all feasible reactions. A strategy is emp...
0 downloads 0 Views 468KB Size
Article pubs.acs.org/IECR

Model Building Methodology for Complex Reaction Systems Wenling Zhang,† Michael Binns,‡ Constantinos Theodoropoulos,† Jin-Kuk Kim,*,‡ and Robin Smith† †

Centre for Process Integration, School of Chemical Engineering and Analytical Science, The University of Manchester, P.O. Box 88, Manchester M60 1QD, U.K. ‡ Department of Chemical Engineering, Hanyang University, Wangshimniro 222, Seongdong-gu, Seoul, 133-791, Republic of Korea S Supporting Information *

ABSTRACT: Most often reactor designs are scaled-up using experimental measurements, especially for the manufacture of fine and specialty chemicals. Yet without an appropriate model of the reaction system major opportunities can be missed in the design and optimization of the reactor. In this paper a model building methodology for complex reaction systems combined with experimental evaluation is developed. Preliminary experiments provide information for construction of a reaction network containing all feasible reactions. A strategy is employed for generating rival alternative simple models using different subsets (mechanisms or pathways) extracted from the reaction network. On the basis of model predictions further experiments can be carried out either at the optimal reactor performance conditions or at conditions that allow discrimination among rival models. These additional experiments can be used to refit and improve the rival models and to test and validate the predicted optimal reactor design. An example involving oleic acid epoxidation is used to demonstrate the methodology.

1. INTRODUCTION A wide variety of reactions encountered in processes (in particular for fine and specialty chemicals and pharmaceutical chemicals) are complex multistep processes. Furthermore, processes often feature products with short product life cycles1 that make the development of a detailed model unattractive because the company that first markets the product tends to get 70% of the total sales.2 This creates a disincentive to study the reaction chemistry and kinetics in any detail. In turn this often leads to reactor designs being scaled from laboratory experiments without any model of the chemistry or reaction kinetics. Yet, without a model that reflects the key features of the reaction system major opportunities can be missed. For the design, optimization, and control of a chemical reactor a quantitative description of reaction system behavior is essential. To obtain this quantitative description it is usually necessary to know (1) the molecular compounds present including raw materials, products, byproducts, and intermediates; (2) the reactions and reaction mechanisms occurring in the system; and (3) the kinetic rate expressions for each reaction in terms of molecular concentrations, temperatures, and where appropriate, catalyst concentration. Using these three elements allows the quantitative prediction of reactor performance. However, the application of model reduction methods may also optionally be implemented both to reduce the complexity of the models generated and to accelerate the design process where the models are used within optimization (e.g., to maximize reactor yield). To obtain this information there are a variety of methods and techniques available for the (i) prediction of possible reaction steps; (ii) identification of appropriate reaction mechanisms; (iii) identification of kinetic models based on initial experiments; and (iv) model-based design of experiments for improving model accuracy. For generating reaction steps and mechanisms the review of different approaches by Nishida et al. in 19813 showed that © XXXX American Chemical Society

different approaches used in organic synthesis can be divided into two categories. Knowledge-based methods such as LHASA4 (logic and heuristics applied to synthetic analysis, developed to predict reaction steps which produce a target product) and Zeneth5 (developed to predict drug degradation products) rely on databases or sets of known reactions which effectively teach the computer program different possible reaction transformation rules. Similar knowledge-based systems have also been developed for biochemical systems either using sets of derived transformations6 to represent enzyme-based reactions or with combinatorial algorithms considering different compounds which can attach to enzyme active sites.7 However, such methods for both chemical and biochemical systems still have the fundamental limitation that they are unable to predict reactions if the knowledge base of reactions used is not sufficiently complete. Alternatively the logic-based methods such as RAIN (reaction and intermediates networks) and IGOR (interactive generation of organic reactions)8 based on the ideas of Ugi and Gillespie9 use matrices to represent the chemical structure of molecules and reaction transformations. Methods such as these consider specific rules defining allowed transformations including the making and breaking of individual atom−atom bonds. Hence, logic-based methods allow a more fundamental prediction of reactions including reaction transformation steps which are not considered by knowledge-based methods unless sufficiently similar or identical reactions are included in the knowledge base. However, as previously mentioned logic-based methods are generally based on sets of bond-making and bondbreaking rules and if these are incomplete or incorrect then they might generate both feasible and infeasible reactions. The Received: November 3, 2014 Revised: April 3, 2015 Accepted: April 6, 2015

A

DOI: 10.1021/ie504343d Ind. Eng. Chem. Res. XXXX, XXX, XXX−XXX

Industrial & Engineering Chemistry Research

Article

Figure 1. Model building procedure.

implements simple tendency models13 and while their procedure has a number of advantages they do not consider changes of the basic structure of their models within the procedure. Hence, if the behavior at optimal process conditions is significantly different from the behavior at conditions used to develop the initial model then the final predictive power of their approach will be limited. In this work we develop a systematic procedure which combines model building, experimental design, and process optimization. This is similar to the procedure of Luna and Martinez20 except that in this case we consider the parallel development of multiple different rival models derived from reaction prediction. In this way the new method considers the possibility that the reaction mechanism occurring at optimal conditions could be different from that which is occurring at the initial experimental conditions. Hence, the novelty of this method lies in the combination of reaction prediction tools used together with process and model development. This allows the fast development of both processes and their corresponding models based on minimal initial information about the system. The reaction prediction methodology implemented here is a simple logic-based method which identifies all possible reactions based on knowledge of molecular compounds present. Unlike other logic-based methods (such as IGOR and RAIN8) this approach does not impose complex rules and instead involves a simple combinatorial method which can generate all possible reactions using the available reacting species. This is similar to the method employ by Burnham et al.15 except that here we have employed constraints based on atomic composition rather than molecular mass, and we have separated reactions into stage I and stage II lists which aids the development of feasible reaction schemes. The benefits of this new approach include fast development of processes and importantly the elimination/reduction of uncertainties regarding process performance at the conditions which will be implemented at industrial scales. This is possible thanks to model improvements which are made simultaneously together with experimental process development. Hence, the time and costs related to process development can be reduced if a development procedure such as the one proposed here is followed (removing or reducing the amount of tuning required to achieve optimal reactor output). Model development is demonstrated through an example involving oleic acid epoxidation to illustrate how the methodology can interpret the experimental data for reactor design and optimization.

most recent work in this area considers the use of molecular dynamics10 and density functional theory11 for the prediction of transition states and the corresponding reactions. Unfortunately the addition of complex calculations which permit the more accurate prediction of feasible reactions also implies the use of significant computational time which precludes their application for large reaction pathways or mechanisms unless more efficient methods can be developed or large numbers of powerful computers are available. Following the identification of possible reactions it is desirable to select subsets of the full list of reactions which can be used to describe the behavior of the system as a whole. This is useful because the full list of possible reactions may be very large containing redundant, unused, or insignificant reactions steps. Aris and Mah12 developed the concept of independent reactions from stoichiometry. This provides guidance that a sufficient number of reactions are picked to represent the system. But this information does not suggest which reactions should be included. The use of tendency models13 and target factor analysis14 are methods which allow the identification of reactions and reaction stoichiometry based on measured data. Alternatively Burnham et al.15 identifies species rate expressions without initially specifying which reactions are taking place. To reduce the complexity of these rate expressions (which might initially contain large numbers of parameters) they implement a combinatorial method for the generation of all possible reactions followed by the elimination of reactions which do not satisfy a mass balance. However, these methods are often applied using only initial experiments and the lists of reaction stoichiometries identified are not updated if any additional experiments are carried out. For example Bhatt et al.16 and Burnham et al.15 have both fitted kinetic models using only initial experimental data, where Bhatt et al. have used target factor analysis to identify reactions and reaction stochiometries. Alternatively, there have been many studies looking at model-based design of experiments (MBDOE). For example Box and Hill 17 and more recently Franceschini and Macchietto18 have considered model improvement linked with repeated experiments. However, their goals are only to obtain an accurate model without considering process optimization (i.e., without obtaining the maximum reactor yield). To include the influence of process optimization Galvanin et al.19 considered MBDOE together with constraints to enforce some minimum performance in the resulting experiments. Additionally, Luna and Martinez20 have constructed a MBDOE procedure which includes both information and process yield objectives. This work of Luna and Martinez B

DOI: 10.1021/ie504343d Ind. Eng. Chem. Res. XXXX, XXX, XXX−XXX

Industrial & Engineering Chemistry Research

Article

2. MODEL BUILDING METHODOLOGY The methodology presented in this paper summarizes and combines several different methodologies developed by Zhang21 into a single procedure for the simultaneous experimental design, model development, process design, and optimization in parallel. Hence this procedure aims to identify the following: • important species including raw materials, intermediates, and products (e.g., A, B, C, ...). • possible reactions occurring in the system (e.g., A → B, B → C) • reaction mechanisms (e.g., chains of connected reactions, A → B → C → D) • kinetic expressions which describe the rate of each reaction • different reaction models (including different combinations of reactions with different kinetic expressions) • optimal reactor conditions (e.g., feed composition, pressure, and temperature conditions which give the maximum yield) This combined procedure (illustrated in Figure 1) emphasizes model improvement and is intended to give optimal reactor conditions together with models which are sufficiently precise at these optimal conditions. The suggested overall procedure can be summarized as follows: 1. Perform initial experiments selecting process parameters to maximize the information gathered (e.g., using orthogonal experimental design). Also considering the range of possible process conditions based on the physical properties of the reactants and products, safety considerations, equipment constraints, etc. 2. Identify raw materials, products, and byproducts and collect available measurements. 3. Construct a reaction network for the reaction system. 4. Generate possible reaction schemes (subsets of the reaction network) to describe the process. 5. On the basis of the reaction schemes and experimental data, fit different rival kinetic models. 6. Model improvement option A: i. Use the best of the rival models to predict optimal experimental conditions. ii. Perform additional experiments at the predicted optimal point. iii. Either fit new models or refit the existing models using initial + new experimental results. 7. Model improvement option B: i. Use model discrimination to identify experimental conditions that maximize the deviation between the fitted rival models. ii. Perform additional experiments at the conditions predicted to give the largest deviation. iii. Either fit new models or refit the existing models using initial plus new experimental results. 8. Repeat model improvement options A and/or B until a final model is found which is sufficiently accurate and valid over a range of conditions appropriate for operability of the final design. The following sections 2.1, 2.2, 2.3, and 2.4 describe the methodology required for steps 3, 4, 5 and 7-i, respectively. Methodology required for carrying out experiments will not be discussed here and it will depend on the process under

consideration. The methods described in section 2.3 for the fitting of rival models are implemented at steps 5, 6-iii and 7-iii. 2.1. Reaction Network Construction. Using one reaction step to describe a complex reaction system from raw materials to products is in most cases inadequate for reaction system analysis. A reaction network is defined as a set of reaction steps that describes all the possible interactions between the reacting species. A new algorithm for reaction network construction is proposed in this work to automatically generate all reactions starting from limited knowledge (including only the set of reacting species) with no knowledge of the reaction mechanism. The algorithm as shown in Figure 2 includes a

Figure 2. Reaction generation framework.

two-stage strategy for generating feasible reactions and an integer programming method for testing reaction feasibility. Stage I of the procedure identifies reactions converting raw materials into products. Reactions converting raw materials and/or products into other products and byproducts as reactants are included in Stage II. 2.1.1. Reacting Species Classification. A reaction system consists of n species (t raw materials) represented as Species Set S = {s1, s2, ..., sn} which has two subsets: the Reactant Subset R and the Product Subset P. R and P are dynamic sets changing within the procedure. Equation 1 gives the relationship between the set and subsets. S=R∪P

where

R ⊂ S, P ⊂ S

(1)

In any reaction, reactants must be chosen from Reactant Set R, whereas products must be chosen from Product Set P. For stage I, the Reactant Set is RI = {s1, s2, ..., st} and includes all raw materials fed to the reactor. The Product Set for stage I is PI = {st+1, st+2, ..., sn} and includes all the products produced in the reactor. For stage II, the Reactant Set is RII = {s1, s2, ..., sn} where all reacting species are included. The Product Set for C

DOI: 10.1021/ie504343d Ind. Eng. Chem. Res. XXXX, XXX, XXX−XXX

Industrial & Engineering Chemistry Research

Article

stage II is PII = {st+1, st+2, ..., sn} which is the same as the product set in stage I. 2.1.2. Reaction Feasibility: Integer Programming. Considering the list of possible reactants and the list of possible products a simple combinatorial method is used to generate all possible reactions involving different combinations of reactants and products. However, these reactions must be tested to determine whether they are feasible in terms of their atomic balance (e.g., if the reactant side has more carbon atoms than the product side then it is clearly not feasible). At the same time the smallest value of the sum of absolute reaction coefficients for each combination is chosen. These smallest values are selected in order to represent the reactions in the simplest possible format (e.g., A → 2B rather than 5A → 10B). In this study the stoichiometric coefficients are constrained to integer values when testing if a certain combination of reactants and products form a feasible reaction. A reaction is feasible only if integer programming can find a solution. It is then added to the reaction list with the reaction stoichiometric coefficients given by the solution. The roles of each species in a reaction are defined using the stoichiometric coefficient vector X (defined by eq 2) which specifies the classification of each species as either: reactant, product, or not involved. XT = [ x1 x 2 ⋯ xn ]

stage I procedure for reaction generation and feasibility checking is repeated for stage II to complete the reaction generation process. Additionally, any reactions which are common to both lists are deleted from the stage II list (removing duplicate reactions). Assuming there are u reactions generated from stage I and v reactions from stage II the stoichiometric coefficient matrix Φ is an n × (u + v) matrix. If possible the size of the reaction lists should be reduced by eliminating some reactions. There are several different situations where this might be necessary. For instance prohibiting some products from acting as reactants in reactions. Furthermore, knowledge that particular reactions cannot take place allows them to be eliminated based on a thermodynamic check22 if data is available. 2.2. Generation of Reaction Schemes. The reaction network generated including stage I and stage II reactions comprises a set of useful possible reactions that can be used to describe the reaction system being studied. As this set may contain a large number of reactions which are unnecessary and unimportant at the considered process conditions a reaction scheme construction algorithm is useful to systematically generate feasible reaction schemes (subsets of the full reaction network). The most important feature of a strategy to solve this problem is the ability check if a reaction scheme is feasible while using some combinatorial algorithm to generating possible schemes. Note that the number of possible reactions in any reaction scheme is not in general the same as the number of independent reactions as determined by the algorithm of Aris and Mah.12 The number of independent reactions Nind, is calculated by singular value decomposition (SVD) of the atom-molecule matrix A (Nind = Nspecies − Rank(A)). The numerical SVD of matrix A gives three matrices (U, V, and S) such that A = USVT where U and V contain the left-singular and right-singular vectors and the matrix S contains the singular values of matrix A along its diagonal.23 The number of significant nonzero singular values then determines the rank of matrix A. In principle, reaction pathways can contain any number of possible reactions but optionally and for simplicity shorter pathways can be generated using Nind as an upper limit for the length of reaction pathways. Any feasible reaction scheme is a possible route to connect raw materials with products by selecting reaction steps generated in the two-stage strategy. For a reaction scheme to be feasible all the raw materials must be consumed and all the products must be produced. To start the process reactions must occur using only the raw materials. Hence, the number of reactions selected from stage I must be at least 1 to guarantee that one reaction occurs using only the raw materials. Subsequent reactions must then form chains (reaction pathways) which connect the raw materials with all the products. Verifying that this condition is met (and hence that the reaction scheme is feasible) is generally not a simple task, particularly if there are large numbers of reactions and long pathways involved. A systematic feasibility check for a reaction scheme is carried out using an incidence matrix, which is derived from the stoichiometric coefficient matrix Φ and the reaction-index vector Θ. The elements of this index vector are 1 or 0. If Reaction j is selected to occur in a reaction scheme it is set to be one, otherwise it is set to zero.

(2)

where xi are integer variables. For example in a system with five reacting species the reaction S1 → S2 + S3 would be represented by XT = [−1, 1, 1, 0, 0]. In addition to the classification of reacting species, the number of reactants and products allowed in a reaction should be limited, for example, to a maximum of two reactants and two products. These limits are recommended for both practical reasons (the number of combinations increases exponentially as the number of reactants and products increase) and for physical reasons (reactions with a large numbers of reactants are less probable due to the larger number of collisions required) which mean that reactions with three or more reactants and/or products are less significant and can often be neglected. Also, in one reaction a species cannot be both reactant and a product at the same time and so every species must be assigned a value to specify its role. The objective function for the integer programming optimization is to minimize n

F=

∑ |xi|

(3)

i=1

subject to the constraints: (4)

AX = 0

where A is the atom-molecule matrix,(xi > 0 for products; xi < 0 for reactants; and xi = 0 if the species is not involved) and

A m×n

⎡ a11 a12 ⎢a a ⎢ 21 22 =⎢ ⋮ ⋮ ⎢ ⎣ am1 am2

⋯ a1n ⎤ ⋯ a 2n ⎥ ⎥ aji ⋮ ⎥ ⎥ ⋯ amn ⎦

(5)

aji is the quantity of the jth atom (or functional group) found in the ith species. If a feasible solution exists, then the values of xi become a column vector of the stoichiometric coefficient matrix Φ. The D

DOI: 10.1021/ie504343d Ind. Eng. Chem. Res. XXXX, XXX, XXX−XXX

Industrial & Engineering Chemistry Research ΘT = [ θ1 θ2 ⋯ θu ⋮ θ θu + 2 ⋯ θu + v ]

Article

satisfied this reaction scheme is feasible: (i) The total number of reactions selected from stage I is less than or equal to the number of independent reactions and greater than or equal to 1

(6)

Hence, if a reaction network contains five reactions and a reaction scheme is chosen which contains only the first and third reactions this will be specified as ΘT = [1, 0, 1, 0, 0]. In graph theory an incidence matrix is used to represent the incidence of labeled vertices and labeled arcs.24 In this work incidence matrices will be used to represent both individual reactions and reactions pathways containing multiple reactions. The elements of an incidence matrix D representing a reaction system are defined as (i) dij = 1 (if species i is produced by reaction/reaction pathway j); (ii) dij = −1 (if species i is consumed by reaction/reaction pathway j); (iii) dij = 0 (if species i is neither produced or consumed by reaction/reaction pathway j), and it should be noted that dij = 0 in cases where species i acts as an intermediate in a reaction pathway j where there is no net consumption or production (e.g., species B in the reaction pathway A → B → C). By restricting the elements of the matrix to be no less than −1 and no greater than 1 the incidence matrix D represents the connectivity between certain reactants and certain products.

u

1≤

(9)

j=1

(ii) Every species in the raw materials must be consumed at least once in the reaction scheme excluding any catalysts, solvents, or diluents. u+v

∑ dij(l) ≤ −1,

∀ i = 1, 2, ..., t (10)

j=1

(iii) Every species except raw materials must be produced by the reaction steps in the reaction scheme. u+v

∑ dij(l) ≥ 1,

∀ i = t + 1, t + 2, ..., n (11)

j=1

Because this formulation in the incidence matrix can only represent one orientation of any reaction step irreversible reaction schemes are more straightforward to represent than reversible reactions. To avoid confusion reaction reversibility is not considered at this stage, reversibility will be represented through kinetic expressions inside the developed rival models. 2.3. Rival Model Derivation. For a complex reaction system there may be a large number of possible reactions and a very large number of reaction schemes. In such cases it may be impractical to fit kinetic expressions for all the possible reactions simultaneously due to the large numbers of parameters which would need to be estimated and the large computational time and effort required. Hence, it is desirable to obtain models through the fitting of reactions schemes containing small subsets of the full list of potential reactions. This selection of subsets is also effectively a method of model reduction. However, so as not to rule out important mechanisms the method must consider either models developed from multiple different reaction schemes or including the addition/removal of potential reactions as part of the model fitting process. In this way the most appropriate model and reaction scheme can be identified. It is almost impossible to predict which reaction scheme will best represent the system without any information from kinetics. It is possible that no single model will be more accurate than all others for the range of conditions considered. Different models may be better suited to different ranges of conditions. The most appropriate model will be the one that is most accurate for the conditions associated with the final reactor design. 2.3.1. Evaluation of Rival Models. Finding a model that closely matches the experimental data is an optimization problem. The objective is to minimize the difference between the measured and predicted values. For a reaction system the objective function of data fitting needs to evaluate the difference between measured and predicted concentrations of the different species. This would normally be quantified by an objective function of the type:

The incidence matrix D is divided into 4 submatrices: dIR, dIIR, dIP and dIIP according to the classification of reacting species and reaction lists. To check the feasibility of a reaction scheme the following seven steps are used to guarantee that all the raw materials are consumed and all the products are produced: (1) Generate the incidence matrix D from the stoichiometric coefficient matrix Φ. (2) Modify the matrix D based on any specific reaction scheme. From this D(1) is obtained:

dij(1) = dij·θj

∑ θj ≤ Nind

(8)

(3) For every reaction in stage I which has a column not equal to 0 add these reactions to every reaction in stage II one by one. (4) If all the elements of any addition column in submatrix dIIP are equal to or greater than 0 then this column is labeled as a feasible reaction pathway. If there are no feasible reaction pathways created and there are still elements in submatrix dIIP less than 0 this reaction scheme is not feasible and the feasibility checking procedure stops. Otherwise, the superscript of the incidence matrix is updated and the procedure continues. (5) If there is any reaction whose elements in submatrix dIIP are still less than 0 repeat steps 3 and 4 by adding the feasible reaction pathways generated to the remaining columns in stage II (incrementing the superscript of the incidence matrix). Otherwise the procedure continues. (6) If the value of the superscript reaches Nind but there are still some reactions left (elements in submatrix dIIP remain negative) delete this reaction scheme. (7) If elements in submatrix dIIP are equal or greater than 0 with the superscript l ≤ Nind then calculate the summation (∑) of every row of the matrix. If the following three basic rules are

Min F =

∑ wi(yi (exptl) − yi (calc))2 i

(12)

where yi is the experimental measurement at i data point (including the different species) and wi is the weighting factor. E

DOI: 10.1021/ie504343d Ind. Eng. Chem. Res. XXXX, XXX, XXX−XXX

Industrial & Engineering Chemistry Research

Article

equations might be stiff or nonstiff so the method for solving ODE problems must be able to cope with these different situations.27 The method used for simulating reaction systems here is a backward difference method appropriate for stiff systems which is also capable of handling nonstiff problems. 2.3.2. Optimization Method. The problem of obtaining a reaction model starting from lists of possible reactions where the kinetic parameters are unknown is an MINLP problem. The optimization starts using one of the generated feasible reaction schemes and can be used to find multiple solutions as rival models. To identify accurate solutions hybrid optimization is carried out in two-stages. The initial stage uses stochastic optimization in the form of simulated annealing (SA) which has the ability to obtain optimal or near-optimal solutions regardless of initial guesses used. The second stage makes use of the mathematical programming techniques (e.g., sequential quadratic programming) to fine-tune the solution obtained from stochastic optimization by varying only the continuous variables. The choice of SA allows flexibility in the way that the solution space is represented and searched that is suitable for complex reaction systems with large parameter spaces. Reaction model building using SA optimization includes two kinds of moves: reaction scheme moves and kinetic expression moves. Reaction scheme moves are discrete changes which involve adding or removing reactions. Kinetic expression moves involve continuous changes of either the activation energies, frequency factors, or the reaction orders. As an alternative to changing the reaction model by adding or removing reactions, the optimization can be repeated multiple times for each fixed reaction scheme, fitting kinetic expressions for each of the resulting models. 2.4. Model Discrimination and Optimal Reactor Design. To increase the range of conditions over which the models are accurate and to subsequently identify which is the best model to describe the reaction system further experimental data is needed. The model should be accurate for the conditions that will be used in the final reactor design. However, to make the model more useful it should also be reliable for a range of conditions around this point as the process might not operate at design conditions for a variety of reasons. To improve the model in this way additional experiments should be carried out at conditions which maximize the performance of the reactor and at conditions which maximize the differences between different model outputs. To identify the conditions that maximize the reactor performance the rival models developed from the preliminary experiments can be optimized to predict the optimal experimental conditions and hence to design new experiments (e.g., maximizing yield or selectivity). Carrying out such optimization using the rival models from the preliminary experimental data will give an indication of the reactor conditions required for the final design. However, this assessment might change later. Gathering further experimental data only in the region of the apparent optimum point for a reactor design based on a model developed from preliminary experimental data will not necessarily provide the best model and the true optimum conditions. However, it will give further evidence with which the models can be improved. To discriminate between different models experiments can be designed that maximize the difference between different model outputs. For this purpose Hunter and Reiner28

However, the complexity of the model used (i.e., the number of reaction steps in the corresponding reaction scheme) to describe a specific reaction system is also an important factor to consider. More reaction steps mean that more equations need to be solved which can pose difficulties in the simulation and control of the system. A new objective function is proposed to account for these factors: 2

Min F = M ∑i wi(yi (exptl) − yi (calc)) − 1

(13)

where M is the number of stoichiometric reactions used in the model. Thus, the new objective function minimizes the difference between the measured and predicted concentrations and at the same time minimizes the complexity of the model. Clearly mathematical forms other than that in eq 13 could also have been used. For example Crampin et al.25 describe several different expressions for combining model accuracy and model complexity including the Akaike information criteria.26 These different criteria/objectives/cost functions vary only in that they give different degrees of importance to the two objectives (maximum model accuracy and minimum model complexity) and Crampin et al.25 suggest that there is “no universally applicable cost function”. The objective function used here in eq 13 is chosen such that model complexity has a significant impact on the objective so that the simplest models which are capable of representing the system will give the minimum objective values. In this work both eqs 12 and 13 will be used to evaluate different rival models. Equation 12 will be used to measure the absolute accuracy (i.e., the ability to reproduce experimental results) of each model while eq 13 is used to compare the different models in terms of both accuracy and simplicity. For a single reaction the rate of disappearance or formation of Component A (−rA or rA) depend on temperature and composition. The most commonly used kinetic equations for systems that are kinetically controlled are in the form of a power law. For example: −rA = kACAaC Bb... − kA′ CSsC Tt

(14)

Such expressions can be used to represent both irreversible and reversible reactions determined by the appearance or absence of the reverse term. Other more complicated forms of the kinetic model can be used without affecting the basic procedure. The temperature effect is accounted for using an Arrhenius equation: ⎛ Eaj ⎞ kj = Aj exp⎜ − ⎟ ⎝ RT ⎠

(15)

For a complex reaction system stoichiometric equations can provide the relationship between reaction rates of all species. Consider a chemical reaction system where m reactions involving n species take place: m

Ṙ i =

∑ ϕijrj j=1

(16)

When mole balance equations are combined with reaction rate equations a set of ordinary differential equations (ODEs) is obtained (see eq 16). In addition, material and energy balance equations can be coupled leading to a more complex ODE or ODE/algebraic initial value problem (IVP). This set of F

DOI: 10.1021/ie504343d Ind. Eng. Chem. Res. XXXX, XXX, XXX−XXX

Industrial & Engineering Chemistry Research

Article

including byproducts and intermediates involved in the reaction system are

developed a sequential experimental design approach for discrimination of two rival models. Later, Box and Hill17 proposed new criteria for discrimination among m different rival models. For this purpose their objective function is the maximum divergence of predicted values among all rival models under specific experimental conditions. m

max D =

benzaldehyde, C6H5CHO oxygen, O2 oleic acid, CH3(CH2)7−HC=CH−(CH2)7−COOH perbenzoic acid, C6H5CO3H benzoic acid, C6H5COOH oleic acid epoxide, CH3(CH2)7−HC−O−CH−(CH2)7− COOH Hence, this system has the following atom-species matrix

⎛ p Πin − 1Πjn − 1⎜⎜ pi ln i dyn pj j=i+1 ⎝ m



∑ ∑ i=1

+

A. B. C. D. E. F.



⎞ pj ln dyn ⎟⎟ pi ⎠ pj

(17)

where i, j are the indices of models, (i, j = 1, 2, .... m) and n is the index of experiments, yn. pi is the probability density function of the nth experiment under model i. Πin‑1 and Πin are the probabilities associated with the ith model before and after the nth experimental observation (for details see the work of Box and Hill17). This criterion D called Kullback’s total measurement of information29 is derived from statistical theory based on the concept of entropy30 where experimental errors are considered. The predicted value at the next operating conditions is determined within a certain level of confidence. If the measurements are normally distributed then the probability density function pi in eq 17 is substituted with a normal distribution expression and the discrimination function D becomes D=

1 2

m

with each of the six species containing only carbon, oxygen, and hydrogen. From the SVD calculation the rank of the atom-species matrix is 3 and so the number of independent reactions is Nind = 6 − 3 = 3. Following the procedure described in section 2.1 we see that the full set of species is S = {A, B, C, D, E, F}. Additionally it is known that A, B and C are raw materials and D, E, and F are end products and hence for the generation of stage I reactions the reactant set will be RI = {A, B, C} and the product set will be PI = {D, E, F}. Hence, stage I reactions are generated using combinations of RI and PI (stage I reaction combinations {A, B, C} → {D, E, F}). After checking the stoichiometric feasibility of all reactions using integer programming only three are found to be feasible. The third reaction is in stoichiometric balance but is improbable in practice based on knowledge of the complexity of the required transformations (requiring the breaking of double-bonds in two molecules of oleic acid plus the interaction with an oxygen molecule) and so it is deleted.

⎧ (σi2 − σj2)2 ⎪ Πin − 1Πjn − 1⎨ 2 2 2 2 ⎪ j=i+1 ⎩ (σ + σi )(σ + σj ) m

∑ ∑ i=1

⎫ ⎛ ⎞⎪ 1 2 ⎟⎬ + (yin − yjn )2 ⎜⎜ 2 + 2 σ 2 + σj2 ⎟⎠⎪ ⎝ σ + σi ⎭

(18)

where σ2 is the variance. This can be further simplified according to the specific problem. Not only can the reaction conditions be used to discriminate between rival models but also the mixing characteristics of the experimental reactors. Deliberately making the reaction less efficient can lead to greater understanding of the reaction. The choice of experimental reactor configuration can be made using optimization of a reactor network superstructure.31 This is analogous to the optimization carried out for reactor design but with the objective now of model discrimination rather than optimum yield. In this overall procedure the criteria for refinement and termination will depend very much on the complexity of the system being analyzed and modeled, the quality of the model required and the time and resources available for such an experimental program. Consider now the application of the approach to the following illustrative example epoxidation reaction.

1⎧ A + B = D ⎪ 1⎧ A + B = D 2 ⎨ 2A + B = 2E ⇒ ⎨ 2 ⎩ 2A + B = 2E ⎪ 3 ⎩ B + 2C = 2F

For stage II the reactant set should include all reacting species so that RII = {A, B, C, D, E, F} and the product set remains the same so that PII = {D, E, F}. Stage II reactions are then generated using combinations of RII and PII, testing each combination for feasible reactions (stage II reaction combinations {A, B, C, D, E, F} → {D, E, F}. The resulting list of feasible reactions generated for stage II includes three reaction steps: 3 ⎧C + D = E + F ⎪ 4 ⎨ A + D = 2E ⎪ 5 ⎩ B + 2E = 2D

3. ILLUSTRATIVE EXAMPLE: PRODUCTION OF OLEIC ACID EPOXIDE 3.1. Reaction Network Construction. Oleic acid epoxide is the desired product of the net reaction between oxygen, benzaldehyde, and oleic acid, dissolved in acetone as solvent.32 Through chemical analysis the identified species present

3.2. Testing Feasible Reaction Schemes. Starting from the set of the five reactions generated, the incidence matrix is generated from the stoichiometric coefficient matrix Φ by restricting the elements to be −1 ≤ dij ≤ 1: G

DOI: 10.1021/ie504343d Ind. Eng. Chem. Res. XXXX, XXX, XXX−XXX

Industrial & Engineering Chemistry Research

Article

To demonstrate the methodology for testing if a reaction scheme is either feasible or infeasible two example reaction schemes are tested here. Reaction scheme 1: ΘT = [1 0 ⋮ 1 1 0]

Figure 3. Feasible and infeasible reaction schemes.

species D cannot be produced so it is displayed in a dashed circle, consequently reactions 3 and 4 cannot occur (dashed square boxes). 3.3. Rival Model Derivation. Experimental conditions and data for the oleic acid epoxide reaction are taken from the factorial experimental design of Rastogi et al.32 In this case the working range for the process variables are restricted to a narrow range because of the physical properties of the system.32 In practice laboratory facilities would allow experimental measurements to be taken at any set of conditions within the working range. As no further experimental data are available it is necessary to develop a model based on the sets of experimental data and assume this to be the true model for the prediction of experimental data at other conditions. In this example the weighting factor in the objective function is the reciprocal of the scaling factor given by Rastogi et al.32 It was assumed that there was no evaporation of the liquid phase, reactions occur only in the liquid phase, and that reaction rates are not affected by gas flow rate. Hence, it was assumed that there is no mass transfer resistance in the gas phase and the concentration of gas in the liquid phase was saturated at the gas flow rates used in the experiments. Now it will be assumed that any intrinsic mechanism and kinetic information is unavailable and our objectives are to obtain a simple model (e.g., using a subset of the set of five possible reactions) and to find reactor conditions which give the maximum yield of oleic acid epoxide. To meet these objectives the procedure detailed in section 2 and Figure 1 is followed. Steps 1−3 of this procedure were met in sections 3.1 and 3.2 and so here we follow steps 4−7. First we propose a set of kinetic expressions for the five possible reactions as shown in Table 1. Next a number of different reaction schemes are selected involving different combinations of reactions which satisfy the reaction scheme feasibility tests demonstrated in section 3.2. Optimization can in principle modify the number of reactions used in a reaction scheme. However, in this case because the reaction network is small four selected reaction schemes (models) have been separately optimized without adding or removing any reactions from each. The four reaction schemes considered each use different subsets of the five possible reaction steps. Models 1, 2, and 4 all use three reactions and model 3 uses only two reactions to

This first reaction scheme includes reaction steps 1, 3, and 4 from the list of 5 reactions generated. In the initial form of the incidence matrix D(1), two elements of submatrix dIIP are less than 0. This indicates that both reactions 3 and 4 consume component D which is not one of the raw materials. Since component D is a product of reaction 1 in the procedure we add column r1 to columns r3 and r4 giving the new incidence matrix D(2). A subsequent check shows that all elements in submatrix dIIP are now greater than or equal to zero. For a reaction scheme to be feasible all reactants must be consumed (i.e., the sum must be negative) and all new products generated (i.e., the sum must be positive) which is checked by considering the summation on the right of matrix. So we check the sum for all reacting species belonging to R0, have values less than 0 and the sum of all species belonging to P0 are greater than 0 (on the right side of the matrix). Hence, this reaction scheme is feasible. Reaction scheme 2: ΘT = [1 0 ⋮ 1 1 0]

In the original form of the incidence matrix D(1) two elements of submatrix dIIP are less than 0. As with reaction scheme 1 this indicates that both reactions 3 and 4 consume component D which is not one of the raw materials. So following the procedure we try adding column r2 to columns r3 and r4. Since there are no feasible pathways generated and there are still two elements less than 0 in submatrix dIIP, this is found to be an infeasible reaction scheme. The reason for this is because the species D (perbenzoic acid) is not produced in the stage I and there are no available pathway for producing D so it cannot be a reactant in the following reactions. Finally, these two reaction schemes are drawn in the form of flowsheets in Figure 3 to clearly show the feasibility of the reaction schemes where square boxes represent the reactions (showing the reaction number and stage) and letters in the circles refer to the reacting species. In Reaction Scheme 2 H

DOI: 10.1021/ie504343d Ind. Eng. Chem. Res. XXXX, XXX, XXX−XXX

Industrial & Engineering Chemistry Research

Article

Table 1. Reaction Model Kinetic Expressions reaction step

kinetic expression

kinetic parameters

reaction 1 (I) A + B → D

⎛ E ⎞ z2 r1 = A1 exp⎜− 1 ⎟CAz1Ccat ⎝ RT ⎠

A1, E1, z1, z2

reaction 2 (I) 2A + B → 2E

⎛ E ⎞ z4 r2 = A 2 exp⎜− 2 ⎟CAz3Ccat ⎝ RT ⎠

A2, E2, z3, z4

reaction 3(II) C + D → E + F

⎛ E ⎞ r3 = A3 exp⎜− 3 ⎟CCz5C Dz6 ⎝ RT ⎠

A3, E3, z5, z6

reaction 4 (II) A + D → 2E

⎛ E ⎞ r4 = A4 exp⎜− 4 ⎟CAz 7C Dz8 ⎝ RT ⎠

A4, E4, z7, z8

reaction 5 (II) B + 2E → 2D

⎛ E ⎞ z10 r5 = A5 exp⎜− 5 ⎟C Ez 9Ccat ⎝ RT ⎠

A5, E5, z9, z10

Table 2. Fitted Parameter Values for Different Rival Models Based on Preliminary Experimental Data reactions objective value (eq 15) objective value (eq 16) A1 A2 A3 A4 A5 E1 E2 E3 E4 E5 z1 z2 z3 z4 z5 z6 z7 z8 z9 z10

true model

model 1

model 2

model 3

model 4

1,2,3,4,5 3.6263 342.4893 5.7100 × 10+09 1.3085e+04 1.4207 × 10+08 4.5823 × 10−04 3.1528e+05 6.1948 × 10+04 173.8811 5.9759 × 10+04 0.0426 4.3234 × 10+08 1.4669 0.5479 −2.000 1.6996 −1.1961 1.1790 −0.3472 0.2379 −0.0101 1.0000 × 10−03

1,3,5 7.4977 3.7785 × 10+03 1.8481 × 10+04 0 1.6963 × 10+09 0 3.0100 2.9812 × 10+04 0 6.6350 × 10+04 0 5.3246 × 10+05 1.3354 0.5206 0 0 −0.7793 1.0305 0 0 −1.2107 −4.4054 × 10−04

1,3,4 5.1100 274.2155 7.3706 × 10+06 0 6.0310 × 10+07 0.3759 0 4.7886 × 10+04 0 5.2228 × 10+04 1.2380e+04 0 0.9083 0.3825 0 0 −1.3826 2.000 −1.3714 0.4250 0 0

1,3 6.2892 78.2058 6.2157 × 10+07 0 722.4827 0 0 5.3888 × 10+04 0 2.3533 × 10+04 0 0 1.0466 0.3822 0 0 −1.4520 2.000 0 0 0 0

2,3,5 11.5708 3.3163 × 10+05 0 13.6282 1.8550 × 10+09 0 5.2012e+05 0 2.6314e+04 6.2507 × 10+04 0 2.8224 × 10+04 0 0 1.4487 0.0741 −0.8491 1.2853 0 0 1.4694 0.5117

this model using all five reactions gives the most accurate fit with experiment using eq 12 it is only the third best model overall considering also model complexity using eq 13. As one of the aims of this work is to develop simple models we have focused here on the improvement and development of the four smaller rival models. The “true model” is used instead to generate additional experimental data to demonstrate the procedure. The accuracy of the four models with respect to the experimental data for two of the runs is demonstrated in Figure 4, which gives the change in oleic acid and oleic acid epoxide predicted by the four models and the experimental values. This

describe the behavior of this system. The results of fitting the four models to the preliminary experimental data are given in Table 2 which shows the parameter and objective function values after fitting each model. Considering the four models a comparison shows that using the least-squares objective function (eq 12) model 2 gives the best fit to the preliminary experimental data. However, if the modified least-squares objective function (in eq 13) is considered which weighs the objective in favor of mechanisms with fewer reactions then model 3 gives the best value. In addition the “true model” containing all five reaction is also included in this table for comparison. It is shown that while I

DOI: 10.1021/ie504343d Ind. Eng. Chem. Res. XXXX, XXX, XXX−XXX

Industrial & Engineering Chemistry Research

Article

epoxide is 7.52%. These conditions at the maximum temperature and catalyst concentration and with a relatively low initial concentration of oleic acid are similar to the conditions of run 3 (from the experiments of Rastogi et al.32) except with a higher initial concentration of benzaldehyde. Following these model predictions using model 2, additional experimental data are generated using the “true” model (using all five reactions and the parameters specified in Table 2). However, at the above conditions all the oleic acid in the system is consumed in less than 70 min using the true model inside a batch reactor simulation. For this reason the new pseudoexperimental measurements (run PRF1) are given at 0, 40, and 60 min as shown in Table 3. Table 3. Additional Pseudoexperimental Data Generated Following Optimization of Reactor Yield (PRF1) and Optimization for Model Discrimination (PRF2) run ID

time (min)

scaling factor PRF1 0 40 60 PRF2 0 40 80

Figure 4. Comparison of four models against experimental data from Rastogi et al.32

[A] (M)

[D] (M)

0.2930 3.3173 3.2623 3.0273 2.5039 2.2030 2.0282

0.0705 0.0001 0.1142 0.0876 0.0001 0.0345 0.1012

[C] (M) [E] (M) 0.0609 0.2496 0.1679 0.1272 0.4780 0.4733 0.4461

0.0083 0.0001 0.1044 0.1860 0.0001 0.0291 0.1130

[F] (M) 0.0012 0.0001 0.0760 0.1692 0.0001 0.0234 0.0852

With the use of this additional experimental data the four rival models are refitted to improve their accuracy and range of validity in the optimal yield conditions. This is shown in Figure 5 which demonstrates the improved prediction power of the models after they have been tuned to fit the new experimental conditions. Although it is worth noting that most of the original models before refitting also give a reasonable fit to the new experiments. The advantage of this approach is that model accuracy is improved at the apparent optimal yield conditions at which the models would be expected to be useful for future reactor design. 3.4.2. Model Improvement Option B: Using Optimal Model Discrimination Conditions. Option B which is step 7 from the procedure in section 2 involves the improvement of all models through the use of additional experiments performed at reactor conditions which maximize the differences between the different rival models considered. Following the previous model improvement step it is clear that models 4 and 4* are unable to predict experimental conditions at the optimal reactor yield conditions. Hence in this step the aim is to find conditions which discriminate between models 1*, 2* and 3*. Following optimization of eq 18 the conditions which allow the greatest discrimination between these three models are temperature, 33.84 °C; catalyst concentration, 3.9932 × 10−5 M; initial oleic acid concentration, 0.4780 M; initial benzaldehyde concentration, 2.5039 M. For comparison Figure 6 shows the simulation results using initial conditions from one of the experimental runs of Rastogi et al.32compared with simulation using the model discrimination conditions. This figure shows that at the initial conditions all three models give similar results with some deviation between the results of each. However, at the model discrimination conditions model 1* deviates more significantly from models 2* and 3*. If conditions outside the range

shows that models 1, 2, and 3 can predict experimental conditions reasonably well. However, model 4 is shown to be slightly less accurate. 3.4. Model Improvement and Process Optimization. The next steps in the procedure improve the accuracy and range of validity of the models developed through the design of additional experiments and subsequent refitting of the models. This is achieved through optimization of the reactor conditions either to maximize reactor yield or to maximize the difference between model results (steps 6 or 7 from section 2). Here both model improvement steps are demonstrated using optimization of a batch reactor, varying the same four parameters considered by the factorial experiments of Rastogi et al.32 (temperature, catalyst concentration, initial oleic acid concentration, and initial benzaldehyde concentration). In addition the bounds of these parameters were set to the same upper and lower used in the initial experiments in order to avoid any infeasible operating conditions. 3.4.1. Model Improvement Option A: Using Optimal Reactor Conditions. Option A which is step 6 from the procedure in section 2 involves the improvement of all models through the use of additional experiments performed at the predicted optimal reactor conditions. Using the most accurate of the four models obtained (model 2) and fixing the maximum reaction time to 80 min the maximum yield of oleic acid epoxide with respect to the fed benzaldehyde gives the following reactor conditions: temperature, 34.99 °C; catalyst concentration, 3.9982 × 10−5 M; initial oleic acid concentration, 0.2496 M; initial benzaldehyde concentration, 3.3173 M. At these conditions (according to model 2) all of the oleic acid is consumed after 80 min and the yield of oleic acid J

DOI: 10.1021/ie504343d Ind. Eng. Chem. Res. XXXX, XXX, XXX−XXX

Industrial & Engineering Chemistry Research

Article

specified had also been considered it is possible that a more significant difference between the models could have been obtained. Following these model predictions the above conditions are used with the true model to generate new experimental data that can be used to improve the models. As there are no problems associated with raw material being consumed completely in this case experimental results are generated at 0, 40, and 80 min similar to those in the initial experiments. These new results are given in Table 3, labeled as PRF2. Following the refit using this new data an updated set of rival models is produced (models 1**, 2**, 3**, and 4**). A comparison of all the models developed is given in Table 4 which shows the objective function values following optimization using different quantities of experimental data. Considering the models based on the four different reactions schemes at all stages of the model improvement procedure, models 2, 2*, and 2** give the best fit to the experimental data in terms of the unbiased least-squares objective function (eq 12). While the weighted least-squares objective function (eq 13) favors models with fewer reactions. Models 3, 3*, and 3** give the best fit to this objective since they adequately represent the system performance while including only two reactions, compared with the other models that use three reactions. Each step of model improvement increases the range of validity and the accuracy of the models. However, model discrimination in this case does not lead to a clear “best” model since models 1**, 2**, and 3** can all predict the experimental outputs for this system with reasonable accuracy in the range of conditions specified. This is because model discrimination can only eliminate models if conditions can be found under which some of the models are unable to accurately predict the resulting reactor outlet conditions. These models could be further improved by repeating the optimal yield and optimal model discrimination steps. However, for the range of conditions considered these models cannot be significantly improved any further. To make any further significant improvements, experiments outside these ranges (e.g., at higher temperatures) would be required.

Figure 5. Prediction of oleic acid epoxide concentration at optimal reactor conditions before and after refitting.

4. CONCLUSIONS A new approach for reaction model building is proposed in this work. After all detectable species (reactants, products, and byproducts) are identified and known a combinatorial algorithm is implemented to generate all possible reactions. Integer programming is used to test the stoichiometric feasibility of the reactions generated, and reactions are sorted into stage I and stage II reaction lists. Stage I reactions use only raw materials as input reactants and hence they are the starting point for any reaction scheme/mechanism. Stage II reactions

Figure 6. Comparison of initial conditions (from the experimental run RF4 of Rastogi et al.32) against optimal model discrimination conditions.

Table 4. Comparison of Optimization Results Using Different Levels of Experimental Data data used for fitting preliminary data used for fitting preliminary + optimal yield data used for fitting

objective function

model 1

equation 12 equation 13 objective function

7.4977 3.7785 × 10+03 model 1*

equation 12 7.2821 equation 13 2.9815 × 10+03 objective function model 1**

preliminary + optimal yield + discrimination

equation 12 equation 13

6.2596 969.6119 K

model 2

model 3

5.1100 274.2155 model 2*

6.2892 78.2058 model 3*

5.2545 321.4079 model 2** 4.8580 207.8924

model 4

6.2750 77.4395 model 3** 6.3336 80.6486

11.5708 3.3163 × 10+05 model 4* 15.8780 3.7648 × 10+07 model 4** 17.2481 1.6961 × 10+08

DOI: 10.1021/ie504343d Ind. Eng. Chem. Res. XXXX, XXX, XXX−XXX

Industrial & Engineering Chemistry Research

Article

Aj = frequency factor in Arhenius equation (mola Lb min−1, a and b depending on reaction orders) Ci = concentration for ith species (mol L−1) di,j = incidence matrix element dIR, dIIR, dIP and dIIP = submatrices of matrix D D = incidence matrix Ej = activation energy in Arhenius equation (J mol−1) F = optimization objective function i,j = integer index kj = reaction rate constant for reaction j (mola Lb min−1, a and b depending on reaction orders) m = total number of reaction steps M = number of reaction steps used in a model n = total number of reacting species Nind = number of independent reactions pi = probability density function of the nth experiment under model i P = product subset PI = product subset for stage I PII = product subset for stage II rj = reaction rate for jth reaction (mol L−1 min−1) R = reactant subset; universal gas constant (J mol−1 K−1) RI = reactant subset for stage I RII = reactant subset for stage II Ṙ i = net reaction rate for ith species (mol L−1 min−1) si = ith reacting species S = species set t = number of raw materials T = temperature (K) u = number of reactions generated from stage I v = number of reactions generated from stage II wi = weighting factor xi = reaction stoichiometric coefficient vector element X = reaction stoichiometric coefficient vector yi = experimental measurement

include all other feasible reactions including those which consume the products of stage I reactions. Combining the reactions from both stages gives a reaction network which contains the possible reactions occurring in the system. Additionally, a procedure is described for the generation and testing of feasible reaction schemes, which are reaction mechanisms capable of consuming all the raw materials and producing all the products and byproducts in the given system. Starting from one or more initial reaction scheme, rival models can be constructed by selecting kinetic expressions for each reaction implemented and fitting kinetic parameters using available experimental data. Alternatively it is possible to start by constructing a model based on the full set of possible reactions. However, in cases where there are large numbers of possible reactions this requires the estimation of a correspondingly large number of parameters and significant computational time and effort which may be impractical as part of the “fast” model and process development process. Although it is important to note that if reaction schemes with subsets of the full set of reactions are used then either multiple reaction schemes should be considered or the addition/removal of reactions should be considered as part of the model fitting optimization. The rival models developed from the preliminary experiments can be used for reactor design and optimization. This identifies predicted optimal reactor conditions and further experimental data collected at these critical conditions under which the final design is expected to operate. The models can be improved by refitting using new experimental data around the optimum for the reactor design, increasing the reliability of the models in this important region of conditions. On the other hand including data from model discrimination experiments exaggerates the differences between the rival models potentially identifying regions where one or more model is unable to predict experimental results with reasonable accuracy. Ultimately, following model improvement steps a final model should be selected for further reactor design and optimization. Although, if the models have been improved through optimization of the reactor conditions then a good estimate of the optimal reactor conditions will have already been calculated and tested. Hence, this combined procedure for model development and reactor design can identify and improve models in parallel with the design of reactor conditions (e.g., for obtaining the maximum yield).



Greek Letters



REFERENCES

(1) Mill, P. L.; Chaudhari, R. V. Multiphase catalytic reactor engineering and design for pharmaceuticals and fine chemicals. Catal. Today 1997, 37, 367. (2) Cussler, E. L.; Moggridge, G. D. Chemical Product Design; Cambridge University Press: London, 2001. (3) Nishida, N.; Stephanopoulos, G.; Westerberg, A. W. A review of process synthesis. AIChE J. 1981, 27, 321. (4) Corey, E. J.; Howe, W. J.; Pensak, D. A. Computer-assisted synthetic analysisMethods for machine generation of synthetic intermediates involving multi-step look ahead. J. Am. Chem. Soc. 1974, 96, 7724. (5) Parenty, A. D. C.; Button, W. G.; Ott, M. A. An expert system to predict the forced degradation of organic molecules. Mol. Pharm. 2013, 10, 2961. (6) Hatzimanikatis, V.; Li, C.; Ionita, J. A.; Henry, C. S.; Jankowski, M. D.; Broadbelt, L. J. Exploring the diversity of complex metabolic networks. Bioinformatics 2005, 21, 1603−1609. (7) Binns, M.; Theodoropoulos, C. An integrated knowledge-based approach for modeling biochemical reaction networks. Comput. Chem. Eng. 2011, 35, 3025−3043.

ASSOCIATED CONTENT

S Supporting Information *

Details of the models 1*, 2*, 3*, 4*, 1**, 2**, 3**, and 4** including the fitted parameters for each of these models developed as part of the model improvement procedures. This material is available free of charge via the Internet at http:// pubs.acs.org.



Θ,θ reaction-index vector and elements Φ,ϕ stoichiometric coefficient matrix and elements σ2 variance Πn−1 prior probability Πn posterior probability

AUTHOR INFORMATION

Corresponding Author

*E-mail: [email protected]. Tel.: 82 (0)2 2220 2331. Notes

The authors declare no competing financial interest.



NOTATION aij = atom-species matrix element A = atom-species matrix L

DOI: 10.1021/ie504343d Ind. Eng. Chem. Res. XXXX, XXX, XXX−XXX

Industrial & Engineering Chemistry Research

Article

(8) Ugi, I. K.; Bauer, J.; Baumgartner, R.; Fontain, E.; Forstmeyer, D.; Lohberger, S. Computer assistance in the design of syntheses and a new generation of computer programs for the solution of chemical problems by molecular logic. Pure Appl. Chem. 1988, 60, 1573. (9) Ugi, I.; Gillespie, P. Chemistry and logic structure. 3. Representation of chemical systems and interconversion by BE matrices and their transformation properties. Angew. Chem., Int. Ed. 1971, 10, 914. (10) Wang, L. P.; Titov, A.; McGibbon, R.; Liu, F.; Pande, V. S.; Martinez, T. J. Discovering chemistry with an ab initio nanoreactor. Nat. Chem. 2014, 6, 1044. (11) Zimmerman, P. M. Automated discovery of chemically reasonable elementary reaction steps. J. Comput. Chem. 2013, 34, 1385. (12) Aris, R.; Mah, R. H. S. Independence of chemical reactions. Ind. Eng. Chem. Fundam. 1963, 2, 90−94. (13) Filippi-Bossy, C.; Bordet, J.; Villermaux, J.; Marchal-Brassely, S.; Georgakis, C. Batch reactor optimization by use of tendency models. Comput. Chem. Eng. 1989, 13, 35−47. (14) Bonvin, D.; Rippin, D. W. T. Target factor analysis for the identification of stoichiometric models. Chem. Eng. Sci. 1990, 45, 2417−2426. (15) Burnham, S. C.; Searson, D. P.; Willis, M. J.; Wright, A. R. Inference of chemical reaction networks. Chem. Eng. Sci. 2008, 63, 862−873. (16) Bhatt, N.; Kerimoglu, N.; Amrhein, M.; Marquardt, W.; Bonvin, D. Incremental identification of reaction systemsA comparison between rate-based and extent-based approaches. Chem. Eng. Sci. 2012, 83, 24. (17) Box, G. E. P.; Hill, W. J. Discrimination among mechanistic models. Technometrics 1967, 9, 27−71. (18) Franceschini, G.; Macchietto, S. Novel anticorrelation criteria for design of experiments: algorithm and application. AIChE J. 2008, 54, 3221. (19) Galvanin, F.; Barolo, M.; Bezzo, F.; Macchietto, S. A backoff strategy for model-based experiment design under parametric uncertainty. AIChE J. 2010, 56, 2088. (20) Luna, M.; Martinez, E. A Bayesian approach to run-to-run optimization of animal cell bioreactors using probabilistic tendency models. Ind. Eng. Chem. Res. 2014, 53, 17252. (21) Zhang, W. Model building methodology for complex reaction systems. Ph.D. Thesis, University of Manchester Institute of Science and Technology, 2004. (22) Holiastos, K.; Manousiouthakis, V. Automatic synthesis of thermodynamically feasible reaction clusters. AIChE J. 1998, 44, 164− 173. (23) Golub, G. H.; Van Loan, C. F. Matrix Computations; Johns Hopkins University Press: 1996. (24) Dolan, A.; Aldous, J. Network and Algorithms: An Introductory Approach; John Wiley & Sons: Chichester, UK, 1993. (25) Crampin, E. J.; Schnell, S.; McSharry, P. E. Mathematical and computational techniques to deduce complex biochemical reaction mechanisms. Prog. Biophys. Mol. Biol. 2004, 86, 77. (26) Akaike, H. A new look at the statistical model identification. IEEE Trans. Autom. Control 1974, AC19, 716. (27) Gupta, G. K.; Sacks-Davis, R.; Tischer, P. E. A review of recent developments in solving odes. ACM Comput. Surv. 1985, 17, 5−47. (28) Hunter, W. G.; Reiner, A. M. Design for discriminating between two rival models. Technometrics 1965, 7, 307−323. (29) Kullback, S. Information Theory and Statistics; John Wiley and Sons: New York, 1959. (30) Shannon, C. E. A mathematical theory of communication. Bell Syst. Technol. J. 1948, 27, 379−423 and 623−656. (31) Kokossis, A. C.; Floudas, C. A. Optimization of complex reactor networks I. Isothermal operation. Chem. Eng. Sci. 1990, 45, 595−614. (32) Rastogi, A.; Vega, A.; Georgakis, C.; Stenger, H. G., Jr. Optimization of catalyzed epoxidation of unsaturated fatty acids by using tendency models. Chem. Eng. Sci. 1990, 45, 2067−2074.

M

DOI: 10.1021/ie504343d Ind. Eng. Chem. Res. XXXX, XXX, XXX−XXX