Ind. Eng. Chem. Res. 2007, 46, 220-232
Validation of a Model for Biodiesel Production through Model-Based Experiment Design

Gaia Franceschini* and Sandro Macchietto

Department of Chemical Engineering, Imperial College London, South Kensington Campus, SW7 2AZ London, U.K.
Advanced model-based experiment design techniques are a reliable tool for the rapid development and refinement of process models. Through a practical case study of current interest (a biodiesel production process), we demonstrate the validity of this approach as applied to the planning of optimal experiments for complex kinetics elucidation. The need for an appropriate use of these tools, in particular a judicious problem formulation, is highlighted. A trade-off is found between the predicted precision of the estimates and important experimental aspects (for example, the times and costs of the analytical work). The intelligent application of these techniques allows experiments to be found which are suitable for the collection of data for complex kinetic networks. A methodology capable of dealing with experiment design problems which involve complex reaction mechanisms is also presented, and the advantages and limitations of this procedure are discussed in the light of the results obtained from the parameter estimation.

1. Introduction

Dynamic mechanistic modeling has become a key activity in process engineering: its main advantage is the possibility of gaining a better understanding of the different phenomena which occur within the system under investigation. Building high-quality, validated models of process systems is therefore essential in many applications such as model-based product and process design, control, and optimization. By expressing the underlying phenomena in a compact mathematical form, these detailed models can be used to improve product, process, and plant performance in a wide range of applications. On the other hand, collecting the data required to build and validate a model can be costly, both in terms of time and resources.
Therefore, there is a need to develop such models in a systematic way in order to maximize the information obtained from each experiment and to minimize the number of analyses, the cost of materials, and the time required. When building mechanistic models, one uses a priori knowledge such as physical, chemical, or biological laws to propose one or more possible alternatives. These laws dictate the model structure, and such models invariably contain adjustable parameters that may have physical meaning. Typically, one wishes to establish whether it is at all possible to determine their values with the maximum precision and to validate their values and the model statistically. This second aspect involves adequacy tests for the model overall (i.e., how well the model explains the observed data) and tests to check whether individual parameters are well determined in the model or not (t-test). For the statistical verification of detailed dynamic models, Asprey and Macchietto1 have presented a systematic model-building procedure. This approach is an iterative method which involves the following:
• proposing a mathematical parametric model;
• designing optimal experiments using the model;
• estimating new model parameters using the experimental data collected under the optimally planned conditions;
• checking for parameter precision and significance and model adequacy.

* To whom correspondence should be addressed. Tel.: +44-20-7594 6643. Fax: +44-20-7594 1255 (not direct fax). E-mail address:
[email protected].
The whole procedure must be repeated until a satisfactory degree of precision in the model predictions is achieved (other steps in the procedure deal with selection between alternative models). A key step within this approach is the design of experiments so as to obtain the maximum information from the experimental apparatus being modeled. In particular, model-based experiment design aims at assisting a modeler/experimenter in devising experiments that will yield the most informative data, in a statistical sense, for use in parameter estimation and model validation. In mathematical terms, given an initial model and the current values of its parameters, the aim is to minimize the expected inference region of the new parameters, i.e., to make the elements of the variance-covariance matrix of the parameters small. An experiment design calculation thus involves minimizing some metric of this matrix by choosing a set of experiment decision variables (length, time-varying and time-invariant controls, initial conditions, sampling times, etc.)
subject to equality or inequality constraints.1 Various real-valued functions have been suggested as suitable metrics for the “size” of the variance-covariance matrix of the parameters.2 The most common criteria are these four:
(1) D-optimality, which minimizes the determinant of the covariance matrix and thus the volume of the joint confidence region of the parameters (typically, at a 90% or 95% confidence level);
(2) E-optimality, which minimizes the largest eigenvalue of the covariance matrix and thus the size of the major axis of the joint confidence region;
(3) A-optimality, which minimizes the trace of the covariance matrix and thus the dimensions of the enclosing box around the joint confidence region;
(4) modified E-optimality, which minimizes the condition number of the covariance matrix (i.e., the ratio between the maximum and minimum eigenvalue) and thus makes the joint confidence region as spherical as possible.3
For a detailed discussion of these criteria, reference is made to the work of Walter and Pronzato.4 Despite the importance of the problem, there has been a relatively modest amount of work reported on the application of experiment design techniques to dynamic systems. The basic
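The four criteria can be illustrated numerically. The following is a minimal sketch, assuming only that a parameter variance-covariance matrix is given (the example matrix is made up for illustration):

```python
import numpy as np

def design_metrics(cov):
    """Scalar 'size' metrics of a parameter variance-covariance matrix.

    A smaller metric means a tighter joint confidence region:
      D-optimality  -> determinant (volume of the region)
      E-optimality  -> largest eigenvalue (length of the major axis)
      A-optimality  -> trace (enclosing box around the region)
      modified E    -> condition number (how far from spherical)
    """
    eig = np.linalg.eigvalsh(cov)   # ascending eigenvalues of a symmetric matrix
    return {
        "D": float(np.linalg.det(cov)),
        "E": float(eig[-1]),
        "A": float(np.trace(cov)),
        "modE": float(eig[-1] / eig[0]),
    }

# Example: a 2-parameter covariance with strong correlation
cov = np.array([[4.0, 1.9],
                [1.9, 1.0]])
m = design_metrics(cov)
```

Each optimality criterion then corresponds to choosing the experiment decision variables so as to minimize one of these scalars.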
10.1021/ie060758c CCC: $37.00 © 2007 American Chemical Society Published on Web 12/02/2006
Figure 1. Schematic representation of the transesterification of triglycerides (vegetable oil) with methanol to produce fatty acid methyl esters (biodiesel).
concepts and some results were presented in the late 1980s and early 1990s for general mechanistic models consisting of differential and algebraic equations (DAEs).5-7 More recent work on the application of these techniques has been reported by Asprey and Macchietto,1,2 Körkel et al.,8 Bauer et al.,9 and Versyck et al.3,10,11 The significant recent improvements in the methodology have allowed the inclusion of a routine for experiment design for parameter precision in a commercially available general purpose process modeling software package, gPROMS (Process System Enterprise Ltd.).12,13 Experiment design techniques have been applied in the past to a large range of systems, but most studies made use of simulated “experiments” in order to illustrate the improvements in parameter precision or the decrease in the number of experiments gained from this approach.

1.1. Objectives and Structure of This Paper. This paper has several objectives. First, we wish to demonstrate the application of advanced model-based design of dynamic experiments to a typical application of industrial interest involving the actual, as opposed to simulated, execution of laboratory experiments. The production of biodiesel from vegetable oil was selected as it is both topical and of typical complexity. Second, we intend to highlight some of the more subtle problems that arise in the design and execution of actual experiments, due in particular to the inevitable presence of errors, and to present ways to overcome them. Finally, we trust that a fairly complete discussion of a demanding application will show that there still remains a substantial role for intelligent, judicious human intervention within a model-supported experiment plan-execute-analyze loop, and will give some practical suggestions on the use of model-based experiment design techniques for maximum advantage. The biodiesel process studied is briefly described in section 1.2.
The experimental rig used for all experiments is then introduced, including a discussion of the quantities that can be manipulated, measurements, and analysis methods. A model of this process, within which it is desired to estimate the parameters for a specific kinetic scheme, is briefly outlined in section 3. The initial model-based design of a set of optimal experiments is then discussed in some detail, focusing in particular on appropriate formulations of the optimal design problem. This is followed by the actual experiment results obtained, a thorough discussion of parameter estimation, and statistical assessment of model and parameter validity.

1.2. Biodiesel Process. Vegetable oils are complex mixtures of mainly triglycerides. Their methyl esters, when separated, at concentrations >96.5 wt % and subject to a host of other
properties (see the European standard EN 14214 or the American standard ASTM D6751), are called biodiesel. Biodiesel would be an ideal substitute for conventional diesel fuel if only it were more economically competitive. Efforts have been made to reduce its cost by optimizing its production processes. The subject of our study is one of these processes: the transesterification of a vegetable oil (rapeseed oil) with methanol in the presence of an alkali catalyst (sodium methoxide) under mild pressure conditions in a batch reactor. The basic reaction scheme can be represented as in Figure 1. According to Noureddini and Zhu,14 the reaction consists of three consecutive and reversible steps occurring in parallel:

TG + M ⇌ DG + E   (k1, k2)
DG + M ⇌ MG + E   (k3, k4)
MG + M ⇌ G + E    (k5, k6)   (1)
where TG, DG, and MG are, respectively, the tri-, di- and monoglycerides and M, G, and E indicate methanol, glycerol, and the mixture of methyl esters which form biodiesel. A previous experimental study was carried out in our laboratory in order to maximize the conversion of this process.15 The experimental results collected were then used in a subsequent work16 aimed at building a mathematical model of this system. This esterification reaction has been used for several decades, and a number of studies are available (see Meher et al.17 for a recent review); so, we initially expected the basics (kinetics, thermophysical properties, etc.) to be well understood. However, it turned out that the published kinetics was not adequate over the range of interest. Our attempts to estimate the kinetic parameters involved in the model from our own experimental data proved to be problematic, and only the equilibrium constants could be estimated, but not the desired individual pre-exponential factors and activation energies.18 Analysis of simulation results revealed that almost all the data collected in the previous study were concentrated in the region controlled by equilibrium and, therefore, contained no useful information on the dynamics of the system. In our experience, this situation is not at all uncommon, as kinetics studies are typically completed well ahead of process design and often in conditions quite distinct from those optimal for a process. Consequently, new data were deemed necessary and it was decided to use model-based experiment design techniques to plan a new experiment campaign, with the objective of
Figure 2. General rig scheme.
minimizing the experimental effort required to obtain a better estimation of the kinetic parameters.

2. Experimental Set: Apparatus, Measurements, and Analysis

2.1. Materials. The vegetable oil used for the transesterification was rapeseed oil. For this type of reaction, it is a better raw material than soybean oil because of its lower cloud point. Refined rapeseed oil with a high purity level was obtained from Cargill PLC. On the basis of the technical oil composition, its average molecular weight was calculated to be 884 g/mol. The alcohol used was methanol because of its low cost and its physical and chemical advantages. For this reaction, it must be anhydrous. Methanol with these features (Methanol AnalaR) was bought from VWR International Ltd. The catalyst was sodium methoxide in solution with methanol (technical solution ∼30% by volume in methanol) and was obtained from Fluka Chemika. The reaction in the samples collected was stopped by adding a 0.5 M solution of HCl (see section 2.4.1) prepared using hydrochloric acid (37% AR grade) from Aldrich.

2.2. Reaction Conditions and Apparatus. The apparatus, depicted schematically in Figure 2, includes a 6 L stainless steel cylindrical reactor suitable for reactions under pressure, equipped with a compressed air stirrer to ensure thorough agitation (the reaction involves two liquid phases) and an overhead packed column (for the distillation of methanol vapors). A (manual) sampling system appropriate for operation under pressure enables one to take samples of the mixture at the required sampling times. A metal band heater and a proportional-integral-derivative (PID) temperature controller linked to a computer permit convenient and efficient automatic heating. Full details of the apparatus are given by Lemieuvre.15 The design of the apparatus did not include a cooling system: this is a significant experimental constraint which must be considered in the design of the optimal experiments.
Several operating variables can be manipulated and optimized during the experiment design so as to obtain the best experimental conditions. In our case study, these variables, which constituted the design vector, included the length of the experiment, the initial amounts of the reactants charged (the oil/methanol ratio is a very important variable), the temperature profiles during the reaction, and the sampling times for the measurements to be taken. Other operating conditions, common to all experiments and not included as optimization variables in the experiment design, were the catalyst amount (0.5% of the oil charge by weight), the pressure (1.4 bar, as in the previous study15), and the mixing rate (300 rpm, selected to avoid mass transfer limitations).
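As an illustration only, the design vector just described can be represented as a small data structure; the container below, its field names, and the feasibility check are all hypothetical, but the check encodes the real no-cooling constraint of the rig (section 2.2): temperature set-points may only stay level or increase.

```python
from dataclasses import dataclass

@dataclass
class ExperimentDesign:
    """Hypothetical container for the decision variables of one experiment."""
    duration_s: float            # length of the experiment
    oil_charge_mol: float        # initial oil charge
    methanol_charge_mol: float   # initial methanol charge
    temp_setpoints_K: list       # piecewise-constant temperature profile
    sampling_times_s: list       # when samples are taken

    def is_feasible(self):
        # No cooling system: set-points must be non-decreasing.
        profile_ok = all(b >= a for a, b in
                         zip(self.temp_setpoints_K, self.temp_setpoints_K[1:]))
        # Samples must fall within the experiment duration.
        samples_ok = all(0.0 < t <= self.duration_s
                         for t in self.sampling_times_s)
        return profile_ok and samples_ok

d = ExperimentDesign(7200.0, 3.96, 11.88, [296.0, 330.0, 358.0],
                     [600.0, 900.0, 1800.0, 3600.0])
```

An optimizer searching over such a vector would reject any candidate for which `is_feasible()` is false.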
2.3. Experimental Procedure. Reactants and catalyst were charged into the reactor at the beginning of the process. Their amounts were measured by means of measuring cylinders. Then, all the valves were closed and the system was sealed. The pressure in the system was imposed by a nitrogen flow and set at the desired value (1.4 bar) using the needle valve present at the end of the system. The compressed air stirrer was switched on and kept at the desired rate for the whole experiment in order to ensure efficient mixing of the reaction medium. The reaction was timed from the moment the mechanical stirrer was turned on. The temperature was raised progressively with the metal band heater, which works with a power level variable from 0 to 100%, according to the optimal temperature profile of each specific experiment (see Figure 5).

2.4. Sampling and Analysis. Samples of 30 mL were taken from the reaction mixture at the optimal sampling times. The reaction was immediately stopped by means of an acidification procedure (see section 2.4.1), and the weight and pH of each sample were measured (the pH was measured in order to check the proper application of the stopping procedure). Each acid sample was then demethanolyzed as described in Komers et al.19 by a combination of vacuum, mixing (using a magnetic stirrer), and raising the temperature by about 40 °C (by means of a water bath). This operation was necessary to avoid methanol interference during the analyses and took between 20 and 30 min for each sample. After the demethanolysis, each sample was weighed again (to check the success of the procedure) and then left to separate by decantation into ester and glycerol phases. After separation, the glycerol phase was transferred into a 30 mL bottle (previously weighed) by means of a pipet. Each phase was then weighed, and in this way, the amounts of ester phase (EP) and glycerol phase (GP) could be exactly known.
As discussed in Appendix A, the weight measurements were very important in order to use the experimental data for parameter estimation. Three quantities were analyzed (depending on the experiment): ester and monoglycerides in the ester phase and glycerol in the glycerol phase. Ester and monoglycerides in the glycerol phase were not measured because the technique used to stop the reaction leaves only traces of these substances in this phase.19 On the other hand, glycerol in the ester phase was determined as it was necessary for the analysis of the monoglycerides. The choice of the most suitable techniques for the analyses was made with the help of the laboratory of Uniqema in Bromborough (U.K.). The ester content was measured by gas chromatography (GC): the method used for our analyses was based on prEN 14103 but had been adapted by Uniqema in order to use their GC column and their internal standard. The other two amounts were measured via a titration technique, the “Kruty method”, which was based on international standards but had been adapted for Uniqema use.

2.4.1. Acidification Technique. A simple cooling with a water bath had been used in the previous study15 to stop the reaction in the samples after collection, but it was subsequently demonstrated not to be adequate, as samples taken during the first 20 min (when the reaction is very fast) were not immediately stopped. Another technique was, therefore, employed for this experimental work. Among the methods described in the literature,19-21 the neutralization of the catalyst by means of acidification proposed by Komers et al.19 was considered the most suitable. The procedure adopted for the acidification was therefore based on the method described in Komers’ papers.19-22 After collection, the sample was quickly mixed with a surplus of 0.5 M HCl in order to reach a pH value of the final mixture
Table 1. Kinetic Parameters for the Biodiesel Model As Provided in the Work of Noureddini and Zhu14

                      pre-exponential factor       activation energy/universal gas const
reaction              symbol  value (m3/(mol s))   symbol  value (K)
TG + M → DG + E       A1      0.2157               E1      5892
DG + E → TG + M       A2      0.00312              E2      4269
DG + M → MG + E       A3      33315.6              E3      9280
MG + E → DG + M       A4      77.92                E4      6760.4
MG + M → G + E        A5      0.002933             E5      3994.5
G + E → MG + M        A6      0.009905             E6      5532
between 2 and 3, as suggested in the original technique.19 The neutralization was realized in a glass test tube with a volume of about 50 cm3, with a side arm (used for the later demethanolysis) closed by rubber plugs. The right amount of acid was introduced into the tubes before collecting the samples. According to Komers et al.,19 the moles of HCl must be about 2.5 times the moles of NaOH in the sample to obtain the desired final pH. In order to calculate the amount of acid for each experiment, a simplifying hypothesis was formulated: the concentration of the catalyst (NaOH) was assumed to be the same in each sample and equal to the initial catalyst concentration in the reactor. This hypothesis assumes that no saponification occurs during the process, and it is conservative because in reality the catalyst concentration in the sample is always lower as a result of these side reactions.

3. The Model

In this section, a very brief description of the main characteristics of the model is presented; further details and some of the main equations are provided in Appendix B. The model describes in detail most of the phenomena which occur in the experimental reactor (for example, phase equilibrium, temperature varying with time, methanol evaporation, and energy input and losses); only the kinetic parameters were not known during the building of the model. Mass and energy balances for a batch reactor were used together with suitable assumptions required to resolve some complications of the process under investigation. As explained in the introduction, the alkali transesterification of a vegetable oil with methanol can be described by the kinetic scheme of eq 1, and since nonisothermal processes are considered, 12 parameters are involved in the model. The values of the parameters provided in the work of Noureddini and Zhu14 were used as the best initial estimates currently available for this model (Table 1).
The reaction rates are described by the following equations (all six reactions are of first order with respect to each of the reagents according to several studies14,22,23):

r_j = k_j ∏_{i=1}^{NComp} c_i^{a_ij}   (2)

where c_i indicates the concentration of the various components, a_ij is the order of component i in reaction j, and the kinetic constants, k_j, are calculated with a modified Arrhenius equation:

k_j = A_j T exp(-E_j/T)   (3)
The use of this equation is suggested by Noureddini and Zhu14 in order to model the temperature effect on the process more accurately.
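Equations 2 and 3, together with the scheme of eq 1 and the Table 1 parameters, can be sketched as a small batch simulation. This is an illustrative, isothermal reimplementation only, not the paper's gPROMS model: it assumes a fixed-step RK4 integrator and illustrative charges, and it ignores phase equilibrium, methanol evaporation, and the energy balances described in section 3.

```python
import math

# Table 1 parameters (E_j is already divided by the gas constant, so in K)
A = [0.2157, 0.00312, 33315.6, 77.92, 0.002933, 0.009905]
E = [5892.0, 4269.0, 9280.0, 6760.4, 3994.5, 5532.0]

def k(j, T):
    """Modified Arrhenius constant of eq 3: k_j = A_j * T * exp(-E_j / T)."""
    return A[j - 1] * T * math.exp(-E[j - 1] / T)

def rhs(c, T):
    """Batch mass balances for the scheme of eq 1 (eq 2 with second-order rates)."""
    TG, DG, MG, G, M, Es = c
    r1 = k(1, T) * TG * M - k(2, T) * DG * Es   # TG + M <-> DG + E
    r2 = k(3, T) * DG * M - k(4, T) * MG * Es   # DG + M <-> MG + E
    r3 = k(5, T) * MG * M - k(6, T) * G * Es    # MG + M <-> G  + E
    return [-r1, r1 - r2, r2 - r3, r3, -(r1 + r2 + r3), r1 + r2 + r3]

def simulate(c0, T, dt, steps):
    """Fixed-step RK4 (an assumed integrator, standing in for gPROMS)."""
    c = list(c0)
    for _ in range(steps):
        f1 = rhs(c, T)
        f2 = rhs([x + 0.5 * dt * f for x, f in zip(c, f1)], T)
        f3 = rhs([x + 0.5 * dt * f for x, f in zip(c, f2)], T)
        f4 = rhs([x + dt * f for x, f in zip(c, f3)], T)
        c = [x + dt / 6.0 * (a + 2 * b + 2 * g + h)
             for x, a, b, g, h in zip(c, f1, f2, f3, f4)]
    return c

# 1:3 oil/methanol charge in mol/m3 (illustrative values), 1 h at 323.15 K
c0 = [1000.0, 0.0, 0.0, 0.0, 3000.0, 0.0]
c = simulate(c0, 323.15, dt=1.0, steps=3600)
```

A useful sanity check on any such implementation is that the stoichiometry conserves total glycerides (TG + DG + MG + G) and total methanol plus ester.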
The experimental reactor was assumed to be well mixed thanks to the compressed air stirrer, and the model for a “lumped system” was therefore used. No mass transfer phenomena were modeled because the experimental stirring rate was chosen to be high enough to avoid any limitation caused by the partial immiscibility of the two reactants (oil and methanol). The synthesis of biodiesel is a liquid-phase reaction, but there is phase equilibrium in the reactor because of the partial evaporation of methanol: this equilibrium was described using the γ-φ approach (see Appendix B). The two main thermal effects (other than those due to the reactions) which were modeled are the thermal dispersion toward the surroundings and the thermal inertia of the reactor. The thermal dispersion was computed as follows:

thermal dispersion = u_L(T_reactor - T_room)   (4)
where u_L was calculated from the results of a heating test to have a value of about 0.9 J/(s K). The heat absorbed by the reactor itself was about 20% of the total heat given to the system and so was not negligible. Two different energy balances were therefore used: one for the reactor body and one for the reaction mixture (see Appendix B for the equations). Another problem which affects models of any biodiesel production process is the characterization of the oil. The oil is a complex mixture of various components, mainly fatty acids, organized as TG (90-98%), DG, and MG. The main constituent of rapeseed oil is oleic acid, so this oil was represented in the model by a single triglyceride: triolein. This simplification was essential because it is very difficult to find physical properties for these types of components (other studies22 have applied the same hypothesis with satisfactory results). Nevertheless, we could not find enough information about the enthalpies of triolein and diolein, and so, in order to calculate the reaction mixture enthalpy, the specific mass enthalpies of triolein and oleic acid were assumed to be approximately the same. The results of the simulations confirmed the validity of this hypothesis. Further details can be found in Appendix B. Modeling, simulation, parameter estimation, and experiment design were all carried out using gPROMS,12,13 interfaced with two databases (Infodata and DIPPR24) and a program (Multiflash25) for the calculation of the thermophysical properties of the reaction mixture.

4. Numerical Experiment Design Study

4.1. Methodology.

4.1.1. Problem. The solution of a model-based experiment design problem allows a set of new optimal experiments to be planned. In our case, however, we wanted to carry out these experiments with the available experimental and analytical setup. The goal was to investigate the potential of the methodology even under experimental conditions which were not ideal.
This led to an interesting experiment design problem, which coupled a complex reaction network (three consecutive and competitive reversible reactions) with many practical constraints and limitations of the apparatus, such as, for example, the nonisothermal conditions. Nonisothermal experiments are usually not optimal for kinetic studies and complicate the mathematical resolution of the problem because of the use of Arrhenius’ equations. On the other hand, transient experiments are often used because they can convey more information.26 For our particular example, it proved impossible to design a single set of optimal experiments aimed at improving the estimation of all the parameters together (“overall estimation”).
The experiment design calculations failed, and no convergence was reached. This failure can be ascribed to different factors: the form of the model (the nonlinearity given by Arrhenius’ equations), the large number of highly correlated parameters, and the restrictions imposed by the experimental apparatus (i.e., the absence of a cooling system forces the temperature profiles to have only increasing or stationary steps). In order to overcome this problem, a different approach was tried: the optimal experiments were planned to improve the estimation of only a limited number of selected parameters. The twelve parameters of the kinetic scheme were divided into groups, and experiments were designed for the estimation of these subsets of parameters with the others fixed at their current values. As we will see in the next sections, these groups can include individual parameters, pairs, or sets of three (no more than three; otherwise, convergence problems occurred again). Advantages and limitations of this approach will be discussed together with the results of the parameter estimation in section 6.3.

4.1.2. Experiment Design Procedure. The approach chosen for the experiment design of our case study adds some degrees of freedom to the standard problem: it is, for example, necessary to decide how to group the parameters and, closely related to this question, how many experiments to plan. This requires taking into account aspects of an economic nature: the two extremes are to plan one experiment for all the parameters (which was impossible in this case, as we have seen) and to design one experiment for each parameter (12 experiments for this example), with a number of intermediate possibilities between the two. A criterion was therefore identified to group the parameters so as to reduce the number of experiments required for the elucidation of the kinetics (see below).
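One way to implement such a grouping criterion (the paper's choice, detailed below, is the time at which each parameter's normalized sensitivity peaks) can be sketched as follows. The model function, the 5% perturbation loop, and the grouping window are illustrative stand-ins; the real sensitivities come from the full biodiesel simulation.

```python
import math

def model(t, theta):
    """Hypothetical toy response standing in for the biodiesel model."""
    a, b, c = theta
    return a * t * math.exp(-b * t) + c * t / (1.0 + t)

def peak_sensitivity_times(theta, t_grid, rel_step=0.05):
    """Time at which each parameter's finite-difference sensitivity peaks.

    Each sensitivity curve is normalized by its maximum so that the
    coefficients lie within [0, 1], as in the paper's procedure.
    """
    peaks = []
    for i in range(len(theta)):
        up = list(theta)
        up[i] *= 1.0 + rel_step                       # 5% perturbation
        s = [abs(model(t, up) - model(t, theta)) for t in t_grid]
        smax = max(s)
        s = [v / smax for v in s]                     # normalize to [0, 1]
        peaks.append(t_grid[s.index(max(s))])
    return peaks

def group_by_peak(peaks, window):
    """Group parameter indices whose sensitivity peaks lie within `window`."""
    groups = []
    for i, p in sorted(enumerate(peaks), key=lambda ip: ip[1]):
        if groups and p - groups[-1][-1][1] <= window:
            groups[-1].append((i, p))
        else:
            groups.append([(i, p)])
    return [[i for i, _ in g] for g in groups]

t_grid = [0.1 * n for n in range(1, 201)]             # 0.1 .. 20 (arbitrary units)
peaks = peak_sensitivity_times([1.0, 0.5, 0.3], t_grid)
groups = group_by_peak(peaks, window=2.0)
```

For this toy model, the first two parameters peak close together and are grouped, while the third (whose sensitivity keeps growing) ends up alone; one experiment would then be designed per group.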
The other choices left to the modeler/experimenter involve the selection of which responses to measure and when to measure them (in other words, initial guesses for the sampling times). Regarding the first issue, the question to answer is whether measuring all the possible variables (three in this case) can produce a significant improvement in the parameter estimation or only cause an increase in the analytical burden (the predicted t-values calculated by the experiment design software can be used to address this aspect, as will be explained later). In order to fix all these degrees of freedom, a procedure in several steps was defined. One of the most important steps is the first stage: a dynamic sensitivity analysis (given in Figure 3) based on the initial values of the parameters. This analysis can convey several pieces of information which are very useful for dealing with some of the above issues. First of all, the sensitivity analysis allows the parameters to which the model responses are most sensitive to be identified. These parameters can be considered the most important for the model, and therefore, the efforts of the design should be directed at obtaining the best possible estimates for them in particular. Figure 3 shows the results of this sensitivity analysis as applied to the biodiesel example; it can be seen that in this case the most important parameters were found to be these three: the pre-exponential factors of the third reaction step (A5 and A6) and of the first reverse reaction (A2). If the results of the sensitivity analysis are normalized by the maximum value (so that the new coefficients lie within the range 0-1), other useful information can be obtained. In this way, it is possible to identify the most suitable intervals in which to collect data for the parameter estimation (this information is very helpful to obtain reasonable initial guesses of the sampling
Figure 3. Sensitivity analysis results: sensitivity of ester content (a) to pre-exponential factors and (b) to activation energies. A perturbation of 5% was applied on each parameter. The following initial conditions were used: 296 K for the temperature, 3.96 and 11.88 mol for the oil and methanol charges, respectively (all the other components were not present in the initial reaction mixture). The temperature profile adopted is shown in Figure 5 (the dashed curve named Exp85C_Old). The base values of parameters are those in Table 1.
times). These intervals are identified by the maxima of the various sensitivity curves: for example, as can be seen from Figure 4, the data collected in the period between 800 and 1000 s are (in these conditions) the most sensitive for the estimation of parameters A1 and E1. The criterion chosen to group the parameters was also based on the results of the normalized sensitivity analysis: the parameters which showed the maximum of this sensitivity in approximately the same time interval were grouped together, and a single experiment was planned for their estimation. The information obtainable from measurements taken in that interval is in fact useful for all these parameters. These initial analyses were essential to set up the experiment design problem. In the second step of the procedure, the parameters to be estimated (individually, in pairs, etc.) were chosen according to the criterion explained above. Then, the number of samples and the initial guesses for the sampling times were decided (all in the range around the maximum of the corresponding normalized sensitivity curve, as described above). This step turned out to be very important: for example, using 16 randomly chosen sampling points predicts worse optimal estimates than 8 sampling points chosen according to the sensitivity analysis (t-values of 31.9 and 42.09, respectively, for parameter A1 in experiment G; see section 4.2.2). In the next step, the measurements to use (content of ester, glycerol,
Figure 4. Normalized sensitivity analysis results: sensitivity of ester content (a) to pre-exponential factors and (b) to activation energies. A perturbation of 5% was applied on each parameter.
and monoglycerides) were selected. In this case, no criterion could be identified to choose the responses, and therefore, for each set of parameters, seven experiments were designed with all the possible combinations of responses (only one response measured, or two, or all three). The predicted t-values of the parameters (given by the experiment design calculations) were used to select the most promising experiments for each case. In this way, an initial set of ten optimally designed experiments was found (for some sets of parameters, more than one experiment was still selected at this stage). These first results were then improved by introducing constraints to ensure that the optimal experiments satisfied the many practical limitations of the apparatus. For instance, the reactor can be heated, but not cooled, so the slope of the temperature profile can only be positive or null. In this way, a set of experiments actually realizable in our laboratory was obtained. The next step was to analyze this set of predicted optimal experiments, taking into account, for each, the costs and times required to obtain the experimental data. The chemical analyses are time consuming, and so we tried to reduce as much as possible the number of experiments, of variables to be measured, and of samples. In this way, however, the predicted accuracy of the estimated parameters decreases, and so a trade-off had to be found between the expected precision of the estimation and the overall amount of experimental and analytical work (see section 4.2.1).

4.2. Experiment Design Results. The optimal experiments obtained with the procedure described in section 4.1.2 are presented in this section (the E-optimal criterion was employed for all the studies). As explained in section 2.2, the decision variables optimized during the design of the experiments were as follows: the experiment duration, the initial state of the experimental equipment (initial oil and methanol charges and
initial temperature), the temperature profile during the experiment, and the sampling times for the measurements. 4.2.1. Trade-off. As noted, a very important trade-off exists between the precision of the estimates, the number of samples and experiments, and the analytical effort required. If the analytical methods are quick and cheap, a high number of samples does not constitute a problem; but if the budget is limited, as is usually the case, it is very important to find the best solution within the available resources. To illustrate the importance of this trade-off, the example of the three most important parameters in the biodiesel model (A2, A5, and A6) is presented here. Different options were available: an experiment could be designed for the optimal estimation of each parameter individually (second column of Table 2), one experiment could be designed to estimate all three parameters simultaneously (first column of Table 2), or one experiment for a pair of parameters plus a single experiment for the remaining one could be used (third and fourth columns of Table 2). The pair (A2 and A5) could not be estimated in a single experiment: even with 12 samples and all 3 variables measured, the results were not statistically satisfactory (the predicted t-values were insufficient, a clear indication that the information contained in the planned experiment is not sufficient to estimate the parameters properly). Parameter A5 was the most difficult to estimate and required a very large number of samples. With only eight samples (sufficient for the other parameters), the estimate was not precise enough or the temperature had to be unrealistically high. In the end, two optimal experiments were selected for the estimation of these parameters: one for the pair (A2 and A6) and one for parameter A5 alone.
In fact, as we can see in Table 2, there is an evident trade-off between the number of experiments, the number of analyses, and the predicted precision of the estimation. The chosen solution (experiment set 4 in Table 2) represents the best compromise in this case: with a medium experimental and analytical load, the predicted estimates are, for almost all of the parameters, very similar to those obtainable in the extreme case (one experiment for each parameter; see the second column of Table 2). 4.2.2. Optimal Experiments. As explained in section 4.1.2, it was impossible to decide a priori which responses to use, so all the possible combinations of responses were tried and the planned experiments were compared by means of the predicted t-values. For each case, the experiment which led to the highest predicted t-values was chosen. By taking into account the economic aspect (the trade-off described in the previous section for the three most important parameters of the model applies to the other subsets of parameters as well), the ten initially designed experiments were reduced to a set of six from which the identification of all the parameters could be expected. The experimental conditions and the parameter groupings adopted for the final set of experiments are shown in Table 3, together with the predicted precision of the estimates. Figure 5 reports the optimal temperature profiles obtained for the six optimal experiments planned. The dashed curves refer to the two temperature profiles which were used in all the (unplanned) experiments of the previous study.15 It is noted that a rather wide range of conditions is covered. These profiles were used as set points for the reactor temperature control system. From all the combinations of responses tried, it emerged that different measured variables lead to very different operating conditions and have a significant effect on the predicted precision of the estimations. For example, the ester content was
226
Ind. Eng. Chem. Res., Vol. 46, No. 1, 2007
Table 2. Comparison between the Possible Optimal Experiments for the Three Most Important Parameters of the Model

experiment set               1                2                 3                4
parameter grouping           (A2 + A5 + A6)   A2, A5, A6        A2, (A5 + A6)    (A2 + A6), A5
number of experiments        1                3                 2                2
number of analyses (EP)      45               7 + 14 + 7 = 28   7 + 13 = 20      12 + 14 = 26
number of analyses (GP)      45               7 + 14 + 7 = 28   7 + 13 = 20      12 + 14 = 26
number of analyses (total)   90               56                40               52
precision: t-value (A2)      4.243            13.29             13.29            13.4
precision: t-value (A5)      5.104            24.89             15.28            24.89
precision: t-value (A6)      5.734            13.93             7.938            8.497
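Comparisons of this kind rest on the Fisher information matrix of each candidate design. The sketch below shows only the linear-algebra step behind the predicted precision and the E-optimal criterion; the sensitivity matrix is random, purely for illustration, and none of its numbers correspond to Table 2.

```python
# Sketch of the linear algebra behind "predicted precision" and the
# E-optimal criterion. The sensitivity matrix S is random, purely to
# illustrate the computation; no value corresponds to Table 2.
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_params = 12, 3
S = rng.normal(size=(n_samples, n_params))  # d(response)/d(theta) at each sample
sigma2 = 0.5                                # assumed measurement variance

F = S.T @ S / sigma2                 # Fisher information matrix of the design
V = np.linalg.inv(F)                 # predicted covariance of the estimates

e_criterion = np.linalg.eigvalsh(F).min()     # E-optimality maximizes this
pred_half_width = 1.96 * np.sqrt(np.diag(V))  # approx. 95% half-widths

print("E-criterion:", round(float(e_criterion), 3))
print("predicted 95% half-widths:", np.round(pred_half_width, 3))
```

A predicted t-value for each parameter is then its nominal value divided by the predicted half-width; this is the quantity used above to screen the seven response combinations tried for each parameter set.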
Table 3. Optimal Experimental Conditions for the Estimation of the Transesterification Parameters

experiment                        exp G     exp H     exp I      exp L     exp M      exp N
parameters for the estimation
  of which the exp is planned     A1        A2,A6     A3,A4      A5        E1,E3,E5   E2,E4,E6
process time (min)                19        94        71         64        27         76
T max (K)                         310       368       340        332       316        341
amount of oil/MeOH (mol/mol)      3/23.76   3/14.57   3.3/16.96  3.2/14.9  3.95/12    3.95/12
number of samples                 8         12        8          14        8          8
variables measured                E         E         E          E         MG,E       G,E
number of analyses (EP + GP)      8 + 8     12 + 12   8 + 8      14 + 14   8 + 8      8 + 8
precision (t-value)               36.59     13.4/8.5  9.64/5.2   24.89     96/76/49   12.6/21/52
reference t-value                 1.895     1.812     1.943      1.771     1.771      1.771
demonstrated to be the most important and most suitable variable to measure. Ester is a product or a reactant in all six reactions of the kinetic scheme, and therefore its measurement produces highly informative data for all the parameters of the model. The same cannot be said for glycerol or the monoglycerides. If measured together with the ester, they can sometimes improve the information content of the experiments (see, for instance, experiments M or N), but considered alone they give very poor and often unsatisfactory results (as, for example, in the case of parameter A1; results not shown here). The aim of the experiment design is to maximize the information content of the new experiments. As highlighted in the introduction, the problem with the set of data already available was that most of the sampling points lay in the region controlled by equilibrium and therefore contained very little information on the dynamics of the process. The transesterification is a very quick reaction, and equilibrium is reached after approximately 20 min at room temperature if the amount of methanol used is stoichiometric.19 Therefore, in order to collect more data in the kinetically controlled region, the optimal experiments (obtained from the calculations) try to delay the attainment of equilibrium. This is achieved by using more alcohol (see the fourth row in Table 3) or a temperature profile which varies continuously, either with a constant slope (see, for example, exp M in Figure 5) or with sudden, steep increments (exp L in Figure 5). Figure 6 shows the optimal sampling times
Figure 5. Temperature profiles of the six optimal experiments of Table 3 and of the unplanned experiments (the dashed curves).
for the six experiments of Table 3. It can be observed that over half of the samples were planned to be taken in the first 30 min of the reaction. The samples were not bunched even more tightly in the initial part of the reaction because of a constraint on the minimum distance between sampling times (1 min), dictated by the actual sampling protocol.

5. Experimental Data

The six experiments of Table 3 were carried out; the mass fractions (in weight percent) of ester, glycerol, and monoglycerides were determined according to section 2.4, and their profiles are illustrated in Figure 7. Table 4 reports the numerical data for three of the experiments performed; the times shown in the table are the actual minutes and seconds at which the acidification of each sample began (see section 2.4). These sampling times are as close as experimentally possible to those suggested by the optimized experiment designs. The table shows the mass fractions in the reaction mixture as calculated from eqs A1 and A2 of Appendix A, not the mass fractions directly measured with the analytical techniques. As explained before, the model considers only the overall amount of a substance present in the batch during the reaction and not its distribution between the two liquid phases. The profiles of Figure 7 and the data of Table 4 are consistent with the kinetics of a transesterification reaction (ester and glycerol profiles either increasing or approaching equilibrium and, for the monoglycerides, the typical curve with a maximum,
Figure 6. Sampling points for the six optimal experiments.
Figure 7. Experimental data for the six experiments of Table 3: (a) ester content and (b) glycerol and monoglycerides contents.

Table 4. Experimental Data (weight percent) in the Reaction Mixture

               exp M                     exp N                    exp L
sample   time(a)   E      MG      time(a)   E      G       time(a)   E
1        14'58"  42.16   5.05     28'23"  52.15   4.24     16'48"  46.57
2        16'4"   43.18   5.56     30'04"  51.64   2.88     18'22"  52.22
3        18'2"   44.40   5.86     31'42"  51.07   3.29     21'47"  57.02
4        20'06"  43.76   5.76     33'25"  51.46   3.45     23'34"  55.35
5        23'17"  47.42   5.71     58'12"  54.53   3.95     25'07"  60.74
6        25'00"  47.86   5.84     63'20"  56.85   3.85     26'52"  59.69
7        26'38"  49.66   5.74     66'42"  59.16   4.10     30'04"  62.31
8        30'02"  50.91   5.77     75'24"  60.90   4.35     48'38"  65.21
9                                                          50'02"  66.71
10                                                         51'42"  67.53
11                                                         58'10"  67.78
12                                                         60'44"  68.02
13                                                         61'38"  68.23
14                                                         62'43"  68.88

(a) Time is expressed in minutes (') and seconds (").

Table 5. Error Analysis

substance        average wt % of the 3 measures for 1 of the samples   variance   avg % error
ester            25.83                                                 0.69       1.54
glycerol         19.22                                                 0.455      2.68
monoglycerides   6.34                                                  0.045      2.55

which is characteristic of reaction intermediates) and so demonstrate the validity of the experimental and analytical procedures adopted. The data for the first sampling point are the only exception. In most of the experiments (this is particularly evident for the glycerol data of exp N), this first experimental point shows a larger value than the next data point. This cannot be explained by the kinetic scheme, and it was concluded that it was probably caused by some systematic error in the experimental procedures, most likely in the sampling system. Further investigation revealed that the sampling system was not accessible for cleaning between one experiment and the next. As a consequence, the first sample of each experiment was likely contaminated with residues of the last sample of the previous experiment, taken at higher conversion. This hypothesis is at the moment under experimental verification. A possible solution to eliminate this error could be to open the valves of the sampling system at the beginning of each experiment for around 30 s, so that the new mixture washes away the traces of the previous one. This systematic problem could also explain the strange behavior observed in the previous study15 for the samples at the twentieth minute. That anomaly had previously been attributed to imperfect stopping of the reaction, which led to the use of a different stopping technique in this study. The effect of different stopping techniques on the experimental data is currently under deeper investigation. Concerning the experimental analytical errors, a specific study was carried out for each technique used. As these analytical methods were very time consuming, the error study was performed only on some samples, whose analyses were repeated three times in order to calculate the measurement variance and percent error. Table 5 shows the results of this study; as we can see, the errors are within the normal range of experimental error. The average variance for each type of analysis was then applied to all the other samples. The calculation of the variance of the measurements is very important to assess the precision of the experimental data, in particular for the subsequent use of these data in the parameter estimation.

6. Parameter Estimation
The parameter estimation was carried out in two steps. The first stage was an "individual" estimation: the parameters were estimated singly, in pairs, or in groups of three (depending on how they had been coupled during the experiment design), using only the data of the specifically designed experiment, with the other parameters fixed at their current values. Then, an overall estimation was performed using the data of all the optimal experiments and the parameter values found in the previous step as starting points. The problems which emerged and the solutions proposed to overcome them are described in section 6.2. 6.1. Individual Estimations. Table 6 shows the results obtained for the twelve parameters involved in the model. The table reports the final value (FV) of each parameter together with its 95% confidence interval (CI), initial guess (IG), and lower and upper bounds (LB and UB). The last two columns contain the initial and final values of the objective function (OF) minimized in each estimation. In all the estimations, due to the method adopted for the calculation of the experimental variance (see section 5), the constant variance model (σ2 = ω2) was used. The values of ω2 were estimated together with the parameters (with initial guesses equal to the experimentally measured variances). This was necessary because only the variance of the analytical methods was known; the errors due to the reproducibility of the experiments had not been measured. The estimated variances for the measurements of ester, glycerol, and monoglycerides were found to be on average 1.3, 0.75, and 0.12, respectively (in weight percent). Two statistical tests were performed to check the reliability of the estimation. The t-test is used to establish the statistical
Table 6. Results of the Individual Parameter Estimations

parameter   FV(a)           95% CI(a)        IG(a)            LB(a)      UB(a)        OF(a) (initial)   OF(a) (final)
A1          3.53 × 10^-2    3.912 × 10^-3    2.157 × 10^-2    10^-1      1            5734              13.51
A2          1.121 × 10^-2   1.924 × 10^-3    3.12 × 10^-3     10^-4      10^-1        701               22.93
A3          3098            1003             33 320           10^3       10^5         1285              11.68
A4          1291            508.2            77.92            10^-1      10^4         1285              14.22
A5          3.334 × 10^-3   5.788 × 10^-4    2.933 × 10^-3    10^-4      10^-2        116               33.66
A6          5.371 × 10^-3   2.097 × 10^-3    9.991 × 10^-3    10^-4      10^-1        701               22.93
E1          4814            840              5892             10^3       1.2 × 10^4   36 391            45.73
E2          4303            659              4269             10^3       1.2 × 10^4   1153              41.85
E3          9032            101              9280             3 × 10^3   1.5 × 10^4   36 391            45.73
E4          6769            108              6760             10^3       1.5 × 10^4   1153              41.85
E5          4271            73               3995             10^3       10^4         36 391            45.73
E6          5328            126              5532             10^3       1.2 × 10^4   1153              41.85

(a) FV = final value of each parameter; CI = confidence interval; IG = initial guess; LB = lower bound; UB = upper bound; OF = objective function.
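The objective function minimized in each estimation combines weighted residuals with the constant-variance model described in section 6.1. A minimal sketch of such an objective, in which the variance ω² is estimated together with the model parameters, is given below; the toy model, the data, and every number are invented for illustration.

```python
# Sketch: constant-variance log-likelihood objective in which the variance
# omega^2 is estimated together with the model parameters (cf. section 6.1).
# The toy model, the data, and all numbers are invented for illustration.
import numpy as np

y_meas = np.array([42.2, 47.4, 49.7, 50.9])      # invented measurements, wt %
t_meas = np.array([15.0, 23.0, 27.0, 30.0])      # invented sampling times, min

def model(theta, t):
    k, y_inf = theta
    return y_inf * (1.0 - np.exp(-k * t))        # toy first-order response

def neg_log_likelihood(params):
    k, y_inf, log_omega2 = params
    omega2 = np.exp(log_omega2)                  # keeps the variance positive
    res = y_meas - model((k, y_inf), t_meas)
    n = len(y_meas)
    return 0.5 * (n * np.log(2 * np.pi * omega2) + np.sum(res**2) / omega2)

print(round(neg_log_likelihood((0.08, 55.0, 0.0)), 3))
```

Minimizing this function over (k, y_inf, log ω²), e.g. with scipy.optimize.minimize, yields both the parameter estimates and the estimated measurement variance, mirroring the treatment of ω² described in the text.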
Table 7. Results of the Statistical Tests

parameter   95% t-value   95% pred t-value(a)   ref t-value (95%)   weighted residual   χ2-value (95%)
A1          9.023         36.59                 1.943               8                   12.592
A2          5.826         13.4                  1.833               12                  16.919
A3          3.087         9.64                  2.132               6                   9.488
A4          2.541         5.2                   2.015               9.085               11.07
A5          5.767         24.89                 1.771               20.82               22.362
A6          2.561         8.5                   1.833               12                  16.919
E1          5.728         96                    1.706               33.971              38.885
E2          6.535         12.6                  1.708               32                  37.652
E3          89.9          76                    1.706               33.971              38.885
E4          62.7          21                    1.708               32                  37.652
E5          73.34         49                    1.706               33.971              38.885
E6          42.37         52                    1.708               32                  37.652

(a) The third column contains the t-values as predicted by the experiment design calculations.
significance of each parameter: the t-value expresses the accuracy of an estimated parameter with respect to its 95% confidence interval. The test is satisfied when the calculated value is larger than the reference t-value. The χ2-test concerns the goodness of the model fit and is satisfied when the weighted residual is less than the 95% reference χ2-value. Table 7 reports the results of these statistical tests for each parameter. As we can see, both tests were always satisfied. The experiment design results (Table 3) indicated that it would be possible to estimate all the parameters with sufficient precision. However, the t-values achieved in practice were almost always smaller than those predicted (second and third columns of Table 7). This is because we had been too optimistic about the precision of the measurements: the measurement variance assumed in the experiment design study (a standard value of 0.5 was adopted for all three measured variables, due to lack of knowledge about the real experimental errors associated with the analytical procedures) turned out to be too small. The individual estimations confirmed the results expected from the experiment design and seemed to have produced very satisfactory results: all the parameters appeared to be estimated with good precision, and the experimental data of each single experiment were well fitted. However, when the new parameter values obtained from the individual estimations were used all together, the model performed very badly and none of the sets of experimental data were reproduced by the simulated profiles. This was a clear indication of strong interactions and correlations between the parameters, which the individual estimations had completely neglected. For this reason, an overall estimation of all the parameters together was attempted: the problems encountered and the results obtained are presented in the next section.
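The reference values in Table 7 are standard one-tailed 95% quantiles. Assuming the degrees of freedom equal the number of data points minus the number of estimated parameters (an assumption that happens to reproduce several entries of the table), they can be computed with scipy; this is a sketch, not the authors' code.

```python
# Reference values for the t-test and chi-squared test as one-tailed 95%
# quantiles. The degrees-of-freedom choices below are assumptions that
# happen to reproduce entries of Table 7 (e.g. A5: 14 samples, 1 parameter).
from scipy import stats

t_ref = stats.t.ppf(0.95, df=13)        # ~1.771 (cf. A5 in Table 7)
chi2_ref = stats.chi2.ppf(0.95, df=6)   # ~12.592 (cf. A1 in Table 7)

print(round(t_ref, 3), round(chi2_ref, 3))

# A parameter passes the t-test when its t-value exceeds the reference;
# the fit passes the chi-squared test when the weighted residual is below
# the reference chi-squared value.
def t_test_passed(t_value, t_reference):
    return t_value > t_reference

def chi2_test_passed(weighted_residual, chi2_reference):
    return weighted_residual < chi2_reference
```

With the Table 7 values for A5 (t-value 5.767 against reference 1.771, weighted residual 20.82 against 22.362), both tests pass, in agreement with the text.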
6.2. Overall Estimation. The values found from the individual estimations were used as starting points for an overall estimation of all the parameters. The data of all the optimal experiments were employed simultaneously to estimate all parameters. The first attempts at this overall estimation were unsuccessful because of convergence problems. The cause of these difficulties was sought in the mathematical formulation of the model. It is well known that a model with Arrhenius' law directly incorporated can make the estimation of the parameters very hard because of the high degree of nonlinearity (the correlations between the pre-exponential factors and the corresponding activation energies make the problem even worse). In order to reduce the degree of nonlinearity, three different solutions were tried:
• a reparametrization of the model using a reference temperature;27
• a reparametrization of the model by means of an approximation of the exponential function with a power function;
• a linearization of Arrhenius' equations.
The first solution required Arrhenius' equation to be replaced by the following form:
kj = A*j exp[(-Ej/R)(1/T - 1/Tref)]   (5)

where

A*j = Aj exp(-Ej/(R Tref))   (6)
and Tref is a mean reference temperature. This solution proved to be inadequate. Although effective in overcoming the correlation problem (as pointed out by Asprey and Naka27), this reparametrization generated a new difficulty: the scaling problem became significant, and convergence was hindered again. In this case study, the pre-exponential factors were much smaller than the activation energies because of the units of measurement used (the pre-exponential factors were expressed in cubic meters per mole and second). The reparametrization made the difference between the values of the parameters Ej and A*j even larger and, therefore, did not help the numerical solution of the problem at all. The second solution approximated the exponential function with a power function:
exp(-Ej/(RT)) ≈ bT^c   (7)
where b and c are parameters which can be found by fitting the exponential curves. Arrhenius’ equation was, therefore, replaced
Table 8. Results of the Global Parameter Estimation

parameter(a)   final value           95% CI(b)      95% t-value
A*1 (A1)       0.4865 (1.627)        54.75          4.529 × 10^-2
A*2 (A2)       -6.961 (0.000 95)     518.6          1.995 × 10^-2
A*3 (A3)       9.6362 (15 309)       0.9103         10.59
A*4 (A4)       2.7263 (15.24)        1.598          1.706
A*5 (A5)       -6.88 (0.001)         0.4594         14.98
A*6 (A6)       -3.861 (0.021)        0.5313         7.267
E1             5447                  2.75 × 10^5    1.982 × 10^-2
E2             4947                  7.73 × 10^5    6.405 × 10^-3
E3             9296                  293.2          31.7
E4             6792                  872.6          7.783
E5             3994                  187.7          21.28
E6             5487                  346            15.86

For the pre-exponential factors (A*1-A*6): ref t-value (95%) = 1.69, weighted residual = 45.333, χ2-value (95%) = 49.802. For the activation energies (E1-E6): ref t-value (95%) = 1.68, weighted residual = 58.345, χ2-value (95%) = 60.481.

(a) A* = ln(Aj); the values in parentheses are the parameters in the original Arrhenius formulation. (b) CI = confidence interval.
by the following expression:
kj = A*j T^c   (8)

where

A*j = Aj b   (9)
This solution proved to be satisfactory: the parameters A*j and c were estimated with sufficient precision for most of the parameters. The only drawback of this approach is that, after the estimation, it is impossible to recover the values of the original Arrhenius parameters. The last solution simply required Arrhenius' equation to be replaced by the following linear equation:
costj = A*j - Ej/T   (10)

where

A*j = ln(Aj)   (11)

and

kj = exp(costj)   (12)

This was the approach chosen for the overall estimation. Despite the use of this linearization or of the reparametrization of the second approach, the overall estimation of the 12 parameters still presented some difficulties. The estimation was therefore carried out in two steps. First, only the pre-exponential factors (A*j) were estimated, with the activation energies fixed at their initial values. Then, the activation energies were estimated with the pre-exponential factors fixed at the values found in the previous step. A third iteration was not necessary, because no further improvement was obtained by re-estimating all the parameters together using the values found in the previous two steps as starting points. Table 8 shows the final results of the overall estimation, with the values of the parameters in the original Arrhenius formulation reported in parentheses (only for the pre-exponential factors, since the activation energies are unaffected by the linearization). The table also contains the results of the statistical tests, which were applied to the linearized parameters (A*j and Ej). The χ2-tests show that the fits were always adequate (i.e., the model was able to reproduce the experimental data). Figure 8 shows the goodness of the fit graphically. The overall estimation described was carried out using the data of all the optimal experiments except exp G. This choice needs to be justified. Recent experimental work carried out in our group showed that the initial mixing of the two liquid phases, in which the reactants are present at the beginning of the transesterification, took up to 2 min even for small volumes (50-90 mL). We can, therefore, expect a longer time to be required to obtain good mixing of 4 L or more of reaction mixture. When the experiment design study was performed, this information was not available and, therefore, 1 min was chosen as the minimum time for the first sampling point. Exp G required all its data to be collected in the first 18 min of the reaction (the optimal sampling points of exp G can be seen in Figure 6). We therefore preferred to discard these data, because we could not be sure that they had not been affected by inhomogeneities in the mixing. Table 8 shows that 8 parameters out of 12 were identified with sufficient precision, as indicated by the t-test results and by the confidence intervals (CIs). However, the parameters of the first reaction step (A1, A2, E1, and E2) are not statistically
Figure 8. Comparison between experimental data and simulated profiles (a) for the ester content of experiments L and M and (b) for the monoglycerides content of exp M.
Figure 9. 95% confidence ellipsoids for two pairs of corresponding parameters: (a) A1 with E1 and (b) A2 with E2.
valid. This could be caused by two distinct problems: (i) correlation and (ii) lack of data during the very first minutes of the reaction. In multiparameter models, a t-value is sometimes low because of high correlations between parameters. The correlation matrix (not shown here) reveals a high correlation between the corresponding parameters (A1 with E1 and A2 with E2), as can also be seen from the confidence ellipsoids (Figure 9). Regarding the second problem, the data collected in the first minutes of the reaction were discarded for the reason mentioned above. Unfortunately, as shown by the sensitivity analysis, the model is particularly sensitive to parameters A1 and E1 exactly at the beginning of the process, where no reliable data could be collected. We can therefore conclude that, with the available experimental apparatus, the proper identification of these parameters is very difficult. One possible solution could be to use a higher mixing rate at the beginning of the process to shorten the time required to obtain a homogeneous phase. However, this hypothesis must be verified with additional experiments before proceeding with another experiment design aimed at planning experiments for the estimation of these four parameters. A comparison between the final parameter values obtained from the individual estimations (Table 6) and from the overall estimation (Table 8) shows that almost all of the parameters (except A5 and A6) take very different values. This is a clear indication that interactions between parameters are significant and that only an overall procedure can produce, as in this case, satisfactory results. The model achieved so far is able to reproduce all the experimental
data available and so can now be used for a second experiment design study with increased confidence in its validity. 6.3. Discussion. Some conclusions can now be drawn from this work. The approach proposed for the experiment design (planning experiments for the estimation of subgroups of parameters with the others fixed at their current values, followed by an overall estimation) has been demonstrated to work but requires further development. Its main advantage is the capability to deal with complex reaction networks with a high number of parameters and correlations between them, where the currently available experiment design techniques suffer from convergence problems. In such cases, the standard approach often fails and no experiments can be planned (our biodiesel case is just one example). Nevertheless, the proposed procedure has a limitation: the significant correlations between the parameters are neglected during the planning of the experiments for subsets of parameters, but their effects reappear during the estimation, making the identification of the parameters more difficult. The proposed experiment design technique should therefore be modified to overcome this drawback; this is currently under investigation. With regard to the sampling points, the method tries to place them in the time interval around the maximum of the normalized sensitivity curve. In the proposed design-by-groups approach, only the sensitivities of the responses to the parameters under investigation (i.e., the parameters for which the new experiment is being planned) are taken into account. The technique does not consider the effect of the other parameters in this interval. It is possible (as seen in the previous sections) that an experiment planned for the estimation of one parameter is highly influenced by the values of the other parameters and that this influence will re-emerge during the estimation.
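The correlation effect just described can be quantified from a covariance matrix. The toy sketch below shows how a correlation coefficient and the axes of a 95% confidence ellipse (of the kind plotted in Figure 9) follow from a 2×2 covariance; the covariance values are invented.

```python
# Sketch: correlation coefficient and 95% confidence ellipse axes from a
# 2x2 covariance matrix for a pair (A*_j, E_j). All numbers are invented.
import numpy as np

V = np.array([[4.0, 5.7],
              [5.7, 9.0]])           # cov(A*, E): strong positive coupling

corr = V[0, 1] / np.sqrt(V[0, 0] * V[1, 1])
print("correlation(A*, E) =", round(corr, 3))   # near 1 -> elongated ellipse

# Ellipse semi-axes are sqrt(eigenvalue * chi2_95) along the eigenvectors
eigvals, eigvecs = np.linalg.eigh(V)
chi2_95_2dof = 5.991                  # 95% quantile of chi-squared with 2 dof
semi_axes = np.sqrt(eigvals * chi2_95_2dof)
print("ellipse semi-axes:", np.round(semi_axes, 3))
```

A correlation near ±1, as found for the pairs A1-E1 and A2-E2, means the data constrain only a combination of the two parameters rather than each one individually, which is exactly what the elongated ellipsoids of Figure 9 show.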
One possible way to deal with this is to develop a criterion able to identify the regions of the design space where the sensitivity to the parameters not included in the design is low (so that the effects of these parameters are negligible). This may lead to experiments with lower information content but should permit better decoupling of the parameters than the current technique allows. A trade-off will, of course, have to be found between these two requirements: information content of the experiments on one side and parameter decoupling on the other. Further work in this direction will be reported elsewhere.

7. Conclusions

Model-based experiment design techniques are indeed a powerful tool to reduce the amount of experimental work required for model validation. However, we demonstrated here, via a practical example, that a judicious problem formulation (choice of constraints and of measured and estimated variables) is essential to obtain good results. A preliminary sensitivity analysis is essential, for instance, for the initial choice of sampling points. Our study also highlighted an important trade-off between the precision of the estimates, the number of samples and experiments, and the analytical effort required. By taking into account the most critical variables (in our case the number of analyses; for a biological system it could be the length of the experiment or the total amount of substances available), this trade-off analysis, which can be performed prior to experimentation, allows a modeler/experimenter to establish the best set of optimal experiments. We trust that the messages from this study are useful, generic to many situations, and worthy of further refinement. They indicate possible future developments for the optimal experiment design method, aimed at systematizing the choices highlighted here.
The application of model-based experiment design techniques usually requires an iterative procedure, in particular if the initial values of the parameters are far from the best ones. In this paper, the method has been applied to a model describing a biodiesel production process in order to identify the parameters of its complex and rather difficult kinetic network (three consecutive, competitive, and very similar reactions). The number of parameters and the form of the equations involved (Arrhenius' equations) generated convergence problems in the design of a set of experiments for the estimation of all the parameters of the network. The preliminary solution proposed here involves planning experiments for the estimation of carefully chosen subsets of parameters (individual parameters, pairs, or larger groups), with the others fixed at their current values, and uses sensitivity analysis to identify which parameters can be estimated from the data collected in a single optimal experiment. As demonstrated in this paper, the approach proposed is valid but requires further development to take into account the correlations between the parameters and to avoid possible problems during the subsequent overall estimation. A different formulation of the objective function and some modifications of the experiment design algorithms may be necessary. In spite of these difficulties, using the procedure proposed in this paper, it was possible to validate the model almost completely in a single iteration. Only the parameters of the first reaction step were not identified successfully (from a statistical point of view). Data from the first minutes of the reaction are required to estimate these parameters properly. The reactor used was demonstrated not to be the best apparatus for collecting data in the initial part of the reaction because of the mixing difficulties explained in section 6.2.
It is likely that a different type of reactor (differential or with a different mixing system) should be used to study the initial steps of the first reaction before planning the second iteration of the experiment design. Appendix A Computation of the Overall Reaction Mixture Composition from the Experimental Data. Due to lack of data describing the liquid-phase equilibria, the model cannot predict the phase separation and distribution of methanol, glycerol, and esters between the two phases. Therefore, the model calculates and uses the amounts of methanol, ester, and glycerol as mass fractions in the overall reaction mixture. The results of the analyses give the mass fractions in each phase, and the amount of the two phases and the amount of the original sample collected (without the acid added) were measured, thus enabling the comparison of predicted and experimental results. In order to calculate the mass fraction in the reaction mixture (RM), the weight of ester (or glycerol or monoglycerides) is divided by the total weight of the sample:
gE^TOT = (gE^EP/gEP) gEP + (gE^GP/gGP) gGP = gE^EP + gE^GP   (A1)

wRM^E = gE^TOT/(gSample - gAcid)   (A2)
The total weight of the sample was computed as the difference between the weight of the sample, measured after the experiment and before the demethanolysis, and the weight of the acid added. Each sample was weighed for the first time after the acidification, and therefore, the acid weight must be subtracted in order to obtain the weight of the original sample taken from the reaction mixture.
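As a worked illustration of eqs A1 and A2, the calculation can be sketched as follows. The weights used in the example (in grams) are invented, not experimental values from this study:

```python
# Sketch of eqs A1-A2: overall mass fraction of a component in the
# reaction mixture from the per-phase analysis results.
def overall_mass_fraction(g_ester_phase, g_glycerol_phase, g_sample, g_acid):
    """Mass fraction of a component (ester, glycerol, or monoglycerides)
    in the overall reaction mixture (RM)."""
    g_total = g_sample - g_acid  # eq A2: original sample, acid removed
    return (g_ester_phase + g_glycerol_phase) / g_total  # eq A1

# Example: 6.0 g of the component in the ester phase, 0.3 g in the
# glycerol phase, a 10.5 g acidified sample containing 0.5 g of acid.
w_rm = overall_mass_fraction(6.0, 0.3, 10.5, 0.5)  # -> 0.63
```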
Appendix B

Some Model Details. (A) Phase Equilibrium. The synthesis of biodiesel is a liquid-phase reaction, but a vapor phase is always present in the reactor because of the partial evaporation of methanol; the system was therefore modeled as biphasic. The phase equilibrium was described with the γ-φ approach:
Pvi γi xi = P yi   (B1)
where Pvi is the vapor pressure of component i, γi is its activity coefficient, P is the pressure of the system, and xi and yi are the molar fractions in the liquid and vapor phases, respectively. For the computation of the fugacity coefficients, the vapor phase was considered an ideal gas, since it consists essentially of methanol and the pressure is close to 1 atm (the reactor operates at a constant pressure of 1.4 bar). The activity coefficients were calculated by Multiflash25 using the Unifac method. This model was chosen because it was impossible to find enough information about triolein and diolein to use other methods, but it was demonstrated to be effective. For the computation of the other thermodynamic properties, the Redlich-Kwong-Soave equation was chosen (these calculations were also performed by Multiflash). The reactor was assumed (and verified) to be well mixed. (B) Thermal Effects. Two different energy balances (one for the reactor body and one for the reaction mixture) were considered in order to model the thermal inertia of the reactor:
dHreactor/dt = Q - thermal dispersion - Qmixture   (B2)

dUmixture/dt = Qmixture - Ml Vl Σ_{j=1}^{6} rj ΔHRj + Qcorr   (B3)
where Ml is the total liquid holdup, Vl is the molar volume of the liquid phase, rj and ΔHRj are the reaction rate and the reaction heat of the jth reaction, and Q is a control variable which represents the heat flow from the electrical heater to the reactor walls. For the experiment design calculations, the control input optimized directly was the temperature profile (and Q then became a calculated variable). This change was necessary to avoid problems in the interface between gPROMS and Multiflash. Qcorr in eq B3 is a corrective term and represents the heat flow caused by methanol refluxing into the reactor. The methanol vapors, which leave the system to go into the column, are totally condensed there and refluxed into the reactor. It was impossible to collect data about this reflux, and, therefore, its contribution to the energy balance was characterized by this term. The thermal dispersion term was obtained and validated using some independent oil heating experiments. Qmixture is the heat flow from the reactor body to the reaction mixture, computed as:
Qmixture = uR S (Treactor - Tmixture)   (B4)
where uR is the coefficient of thermal exchange (calculated from the available experimental data) and S is the area of the internal heat-transfer surface. (C) Computation of the Reaction Mixture Enthalpy. The enthalpy of the liquid mixture was calculated as follows:

Hl = Σ_{i=1}^{NC} xi hi + hmix   (B5)
where NC is the number of components (= 6), hi is the enthalpy
of the ith component (pure) at the same temperature as the system, xi is its molar fraction in the liquid phase, and hmix is the mixing enthalpy. Because of the lack of data on the enthalpies of triolein and diolein, the computation of the mixing enthalpy was problematic. A simplifying hypothesis similar to that described in section 3 was adopted: the mass enthalpy of mixing of the real mixture was approximated by that of a fictitious mixture in which oleic acid replaces the same amounts of triolein and diolein. The molar fraction of the fictitious component (oleic acid) was calculated as
x_oleic acid = x_triolein + x_diolein   (B6)
and the mass mixing enthalpy was computed as NCfm
hmix ) hF.M. -
xihi ∑ i)1
(B7)
where hF.M. is the fictitious-mixture enthalpy calculated by Multiflash, NCfm is the number of components in the fictitious mixture (= 5), hi is the enthalpy of the ith component (pure) of the fictitious mixture at the same temperature as the system, and xi is its molar fraction in the liquid phase. Again, the results obtained using this approximation were satisfactory. The full model includes 94 variables (one of which, Q or T, is known) and 93 equations, with 12 parameters to be estimated.

Acknowledgment

The contributions of Uniqema (Bromborough, U.K.) in providing us with technical assistance as well as access to laboratory facilities are gratefully acknowledged. Particular thanks go to Dr. Roger Kemp, who allowed the GC analyses of our samples to be performed in Uniqema's laboratory, and to Mr. Jeff Entwistle and Mr. Brian McKinnell for their invaluable help and useful suggestions.

Literature Cited

(1) Asprey, S. P.; Macchietto, S. Statistical tools for optimal dynamic model building. Comput. Chem. Eng. 2000, 24, 1261. (2) Asprey, S. P.; Macchietto, S. Designing robust optimal dynamic experiments. J. Process Control 2002, 12, 545. (3) Versyck, K. J.; Claes, J. E.; Van Impe, J. F. Practical identification of unstructured growth kinetics by application of optimal experimental design. Biotechnol. Prog. 1997, 13, 524. (4) Walter, E.; Pronzato, L. Qualitative and quantitative experiment design for phenomenological models - a survey. Automatica 1990, 26, 195. (5) Espie, D. M. The use of nonlinear parameter estimation for dynamic chemical reactor modeling. Ph.D. Thesis, University of London, London, U.K., 1986. (6) Espie, D. M.; Macchietto, S. The optimal design of dynamic experiments. AIChE J. 1989, 35, 223. (7) Zullo, L. Computer aided design of experiments. An engineering approach. Ph.D. Thesis, University of London, London, U.K., 1991.
(8) Körkel, S.; Kostina, E.; Bock, H. G.; Schlöder, J. P. Numerical methods for optimal control problems in design of robust optimal experiments for nonlinear dynamic processes. Optim. Methods Software 2004, 19, 327. (9) Bauer, I.; Bock, H. G.; Körkel, S.; Schlöder, J. P. Numerical methods for optimum experimental design in DAE systems. J. Comput. Appl. Math. 1999, 120, 1. (10) Versyck, K. J.; Claes, J. E.; Van Impe, J. F. Optimal experimental design for practical identification of unstructured growth models. Math. Comput. Simul. 1998, 46, 621. (11) Versyck, K. J.; Van Impe, J. F. Trade-offs in design of fed-batch experiments for optimal estimation of biokinetic parameters. Proceedings of the 1998 IEEE International Conference on Control Applications, Trieste, Italy, September 1-4, 1998. (12) gPROMS V.2.3.0. Experiment Design for Parameter Precision in gPROMS; Process Systems Enterprise Ltd: London, U.K., 2004. (13) gPROMS V.2.3.0. Introductory and Advanced User Guide; Process Systems Enterprise Ltd: London, U.K., 2004. (14) Noureddini, H.; Zhu, D. Kinetics of Transesterification of Soybean Oil. J. Am. Oil Chem. Soc. (JAOCS) 1997, 74, 1457. (15) Lemieuvre, M. Biodiesel process study: experiments and simulations. M.Sc. Thesis, University of London, London, U.K., 2002. (16) Franceschini, G. Biodiesel da transesterificazione di olio vegetale: modellazione matematica ed analisi numerica del processo [Biodiesel from transesterification of vegetable oil: mathematical modeling and numerical analysis of the process]. Tesi di laurea, University of Padua, Padua, Italy, 2003. (17) Meher, L. C.; Vidya Sagar, D.; Naik, S. N. Technical aspects of biodiesel production by transesterification - a review. Renewable Sustainable Energy Rev. 2006, 10, 248. (18) Franceschini, G.; Bertucco, A.; Macchietto, S. Simulation and optimisation of a biodiesel production process. Proceedings of the Convegno GRICU 2004: Nuove Frontiere di Applicazione delle Metodologie dell'Ingegneria Chimica, Porto d'Ischia (Napoli), Italy, September 12-15, 2004. (19) Komers, K.; Skopal, F.; Stloukal, R.; Machek, J.
Biodiesel from rapeseed oil, methanol and KOH. 3. Analysis of composition of actual reaction mixture. Eur. J. Lipid Sci. Technol. 2001, 103, 363. (20) Komers, K.; Skopal, F.; Stloukal, R.; Machek, J.; Komersova, A. Biodiesel fuel from rapeseed oil, methanol and KOH. Analytical methods in research and production. Fett/Lipid 1998, 100, 507. (21) Lacaze-Dufaure, C.; Mouloungui, Z. Analysis of Mixtures of Fatty Acid Esters by Chromarod Chromatography/Flame Ionization Detection. J. High Resolut. Chromatogr. 1999, 22, 191. (22) Komers, K.; Skopal, F.; Stloukal, R.; Machek, J. Kinetics and mechanism of the KOH-catalyzed methanolysis of rapeseed oil for biodiesel production. Eur. J. Lipid Sci. Technol. 2002, 104, 728. (23) Freedman, B.; Butterfield, R. O.; Pryde, E. H. Transesterification kinetics of soybean oil. J. Am. Oil Chem. Soc. (JAOCS) 1986, 63, 1375. (24) DIPPR. Data compilation of pure compound properties; Design Institute for Physical Properties, AIChE. (25) Multiflash V.3.0. Multiflash Command Reference; Infochem Computer Services Ltd: London, U.K., 2001. (26) Bernaerts, K.; Van Impe, J. F. Optimal dynamic experiment design for estimation of microbial growth kinetics at sub-optimal temperatures: Modes of implementation. Simul. Modell. Pract. Theory 2005, 13, 129. (27) Asprey, S. P.; Naka, Y. Mathematical problems in fitting kinetics models - Some new perspective. J. Chem. Eng. Jpn. 1999, 32, 328.
Received for review June 14, 2006
Revised manuscript received September 28, 2006
Accepted October 24, 2006

IE060758C