Robust algorithms for simultaneous model identification and optimization in the presence of model-plant mismatch

Jasdeep S. Mandur† and Hector M. Budman*

Department of Chemical Engineering, University of Waterloo, Waterloo, ON, Canada, N2L 3G1

KEYWORDS: Modeling; Model correction; Identification; Robust optimization; Parameter Estimation; Parametric Uncertainty

ABSTRACT: In the presence of model-plant mismatch, the set of parameter estimates obtained using standard model identification procedures cannot accurately predict the gradients of the optimization problem. Hence, a "two-step" approach, involving an identification step followed by an optimization step performed repeatedly, often fails to converge to the true process optimum. In our recent work1, we proposed a new identification procedure that progressively corrects the model for structural error such that the updated set of parameter estimates simultaneously satisfies both identification and optimization objectives with guaranteed convergence to the true optimum. In this paper, we expand our previous methodology in two directions: first, a new model correction based on a higher-order approximation of the model is proposed and, second, the effect of model uncertainties is considered explicitly by formulating a robust optimization problem at each iteration. The resulting improvements are then illustrated using a simulated study of a fed-batch penicillin production process.
1. INTRODUCTION
Model-based optimization provides a systematic framework to improve the profitability of a process while satisfying various process constraints, especially those related to safety and environmental regulations. However, its success relies on the accuracy of the underlying process models in terms of three major factors: (1) how accurately the model represents the actual process, (2) under what operating conditions and how the experiments were carried out, and (3) the quality of the measurements involved. The standard modeling approach involves an iterative procedure that begins with the estimation of model parameters using an initial set of experimental data. Then, the calibrated model is validated over different operating conditions and, if the prediction error is considered significant, the parameter estimates are updated using an additional set of experiments. The procedure is repeated until a model with the required prediction accuracy is obtained. If the assumed model is a true representation of the actual process, the parameter estimates will eventually converge to a unique set of values, provided there is enough excitation from the experiments to identify all the parameters. The final calibrated model will have good prediction accuracy over the entire space of operating conditions and, correspondingly, optimizing this model is equivalent to optimizing the process itself. The rate of convergence in this approach, however, depends on how the experiments were carried out and the degree of noise in the measurements. In most practical situations, the number of experiments is restricted either by the duration of the experiments or by budget constraints. Then, if the main objective is to use the model for process optimization, it is desirable to conduct this limited number of experiments along the search path of the process optimum. To this end, a "two-step" approach is quite promising, where the operating conditions for the next experiments are obtained by optimizing the model calibrated in the previous step. This procedure is also iterative: the model is re-calibrated and then re-optimized at each iteration until convergence is obtained2.
On the other hand, when there is a mismatch between the assumed model structure and the actual process, there is no unique set of parameter estimates that predicts the process with the same level of accuracy over the entire operating region. In fact, the parameters have to compensate for the unmodelled dynamics and, as a result, the calibrated model will be accurate only around the operating conditions where the parameters were estimated. This also implies that such a model may not predict the gradients of the cost function, as well as those of the process constraints, with sufficient accuracy; in the extreme case, the predicted and measured gradients may have opposite signs, driving the optimization search away from the actual process optimum. In these situations, even the standard "two-step" identification/optimization approach would converge to sub-optimal operating conditions3.
To ensure convergence towards the true process optimum, it is necessary that the model predict the optimality conditions of the process accurately4. To achieve this goal, a class of algorithms has been proposed3,5-9 where the optimization objective and the constraints are adjusted for the differences between their predicted and measured gradients. Since these corrections are added directly to the corresponding functions in the optimization problem, they have to be filtered in order to avoid aggressive corrections and to obtain a smoother convergence. However, if the measurements are very noisy, this may slow down the rate of convergence significantly. In another study10, unlike the standard estimation approach where the parameters are estimated only to correct for the prediction error, the authors also included the difference between the predicted and measured gradients of the objective function as feedback; as a result, the parameters were updated to obtain a trade-off between the identification and optimization objectives. If the mismatch between these objectives is significant around the optimum, one of the two objectives has to be compromised. Recently, we proposed an alternative approach1 in which the model is corrected for its structural error in such a way that, upon convergence, the updated set of model parameters not only minimizes the prediction error but also predicts the gradients of the optimization problem within pre-specified bounds, thus simultaneously satisfying the identification and optimization objectives. At any given iteration, the proposed model correction was based on a linearization of the model outputs, which can be quite restrictive if the degree of nonlinearity and/or uncertainty in the model is significant, resulting in poor convergence.
In this paper, we expand our previous study to improve convergence through two modifications: (1) a new model correction term is proposed, based on a quadratic approximation of the model outputs, and (2) a robust formulation is considered for the optimization objective, in which a weighted sum of the nominal performance and its variability due to model uncertainties is minimized. An additional novelty of this work is the description of parametric uncertainty used in the robust optimization problem. The most common approach to estimate this parametric uncertainty is based on a linearization of the model outputs which, for normally distributed measurement errors, results in normally distributed uncertainty in the parameter estimates. However, the linearization assumption may not be valid for the given level of measurement noise, especially in nonlinear problems, and may result in incorrect optimal solutions, as shown in our previous study11. Instead, a more accurate approach based on Bayes' Theorem11, which does not require linearization assumptions, is used in this work. To this end, a comparative study is presented in which the effects of the Bayesian and normal descriptions of parametric uncertainty on the convergence of the proposed algorithm are compared. To the authors' knowledge, this is the first time the convergence of iterative optimization algorithms has been studied with respect to Bayesian uncertainty descriptions.
The organization of this paper is as follows. Sections 2 and 3 cover the theory and the proposed modifications, with the overall optimization methodology summarized in Section 4. The methodology is then illustrated using a penicillin production process in Section 5, followed by the conclusions in Section 6.
2. MODEL CORRECTION

2.1 Preliminaries
Although the algorithms presented in this work can be applied to the optimization of both continuous and batch/fed-batch processes, for the sake of brevity the following discussion and the case study will be limited to a fed-batch process with a duration time of $t_f$. In the proposed setting, the applicability of the algorithms is limited to run-to-run optimization of an end-point property in the case of batch/fed-batch systems, or to steady-state optimization for continuous systems.

Uncertainties in model parameters, resulting from either insufficient excitation in the measurements or an incorrect model structure, often lead to non-optimal operating policies. The standard "two-step" approach, reviewed in the Introduction, aims to tackle this problem by continuously updating the model parameters at the sub-optimal operating conditions calculated using the last calibrated model. The first step in this approach involves model identification, where the model parameters are estimated by minimizing some measure of the errors between the predicted and measured outputs at some initial set of operating conditions. The simplest measure is the sum of squared errors, defined as follows1:
$$\boldsymbol{\theta}_{\mathbf{k}} = \arg\min_{\boldsymbol{\theta}} \sum_{i=1}^{N} \left\| \mathbf{y_m}(\mathbf{u_k}) - \mathbf{y}(\mathbf{u_k}, \boldsymbol{\theta}) \right\|^2$$
$$\text{s.t.} \quad \dot{\mathbf{x}} = f(\mathbf{x}, \boldsymbol{\theta}, \mathbf{u_k}), \qquad \mathbf{y} = h(\mathbf{x}) \tag{P.1}$$
where $\mathbf{x} \in \mathbb{R}^{n_x}$ is the vector of model states, $\boldsymbol{\theta} \in \mathbb{R}^{n_\theta}$ is the vector of model parameters, $\mathbf{u} \in \mathbb{R}^{n_u}$ is the vector of process inputs, $\mathbf{y} \in \mathbb{R}^{n_y}$ and $\mathbf{y_m} \in \mathbb{R}^{n_y}$ are the vectors of predicted and measured outputs respectively, $N$ is the number of time points, $f \in \mathbb{R}^{n_x}$ is the set of differential equations relating the model states and the process inputs, and $h \in \mathbb{R}^{n_y}$ is the mapping between the model states and the predicted outputs. It is important to note that this formulation assumes that the errors in the measured outputs at each time point are normally distributed with zero mean and a known, constant, and uncorrelated variance. Depending on the assumptions made about the distribution of these measurement errors, different formulations can be used12-14. For example, if the variance is assumed to be uncorrelated but not constant among the different outputs, a weighted least squares formulation is more appropriate, as used in the case study in this work. In the more general case where the variance-covariance matrix is unknown, the determinant criterion of Box and Draper14 would be an ideal choice. A detailed comparative discussion of these formulations is beyond the scope of this study. For simplicity, the simple least squares formulation posed in problem P.1 will be used in the following discussion; the proposed algorithms are by no means restricted to any one particular type of identification formulation.
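As a concrete illustration of problem P.1, the minimal sketch below sets up a (weighted) least squares identification for a generic ODE model; the function names and the user-supplied callables `f` and `h` are assumptions for illustration, not the authors' implementation.

```python
# Minimal sketch of the identification step (P.1) as weighted least squares.
# f and h are user-supplied callables for the model ODEs and the output map;
# all names here are illustrative assumptions.
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import least_squares

def simulate(theta, u, t_meas, x0, f, h):
    """Integrate x' = f(x, theta, u) and return predicted outputs y = h(x)."""
    sol = solve_ivp(lambda t, x: f(x, theta, u), (t_meas[0], t_meas[-1]),
                    x0, t_eval=t_meas)
    return h(sol.y)                              # shape (n_y, N)

def identify(theta0, u, t_meas, x0, y_meas, sigma, f, h):
    """Minimize the sum of squared output errors, weighted by output std sigma."""
    def residuals(theta):
        return ((y_meas - simulate(theta, u, t_meas, x0, f, h)) / sigma).ravel()
    return least_squares(residuals, theta0).x    # theta_k
```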
Once the parameter estimates are obtained, the next step is to calculate the optimal time-varying input profiles $u_j(t)$ such that some measure of the process performance is maximized subject to certain terminal and/or path constraints. Although this problem falls under the category of dynamic optimization, it can easily be posed as an NLP using a control vector parameterization approach15. Here, the idea is to parameterize the input profiles using piecewise-constant values for the inputs over pre-specified time intervals, i.e. $u_j(t) = U(\hat{\mathbf{u}}_j, t)$, so that the problem can be solved in terms of the static variables $\hat{\mathbf{u}}_j$. This approach is also referred to as the direct sequential approach, as the model equations are integrated separately using an appropriate numerical integration technique and the outputs are then used to compute the objective function and constraints of the optimization problem. Considering the path constraints in their discretized form, the final NLP problem can be formulated as follows1:
$$\hat{\mathbf{u}}_{\mathbf{k+1}} = \arg\min_{\hat{\mathbf{u}}} \; \phi\!\left(\mathbf{y}(\hat{\mathbf{u}}, \boldsymbol{\theta}_{\mathbf{k}}, t_f)\right)$$
$$\text{s.t.} \quad \dot{\mathbf{x}} = \hat{f}(\mathbf{x}, \boldsymbol{\theta}_{\mathbf{k}}, \hat{\mathbf{u}}), \qquad \mathbf{y} = h(\mathbf{x})$$
$$\mathbf{S}\!\left(\mathbf{y}(\hat{\mathbf{u}}, \boldsymbol{\theta}_{\mathbf{k}}, t_i), \mathbf{U}(\hat{\mathbf{u}}, t_i)\right) \le \mathbf{0} \qquad \forall\, t_i \in [t_0, t_f]$$
$$\mathbf{T}\!\left(\mathbf{y}(\hat{\mathbf{u}}, \boldsymbol{\theta}_{\mathbf{k}}, t_f)\right) \le \mathbf{0} \tag{P.2}$$
where $\hat{\mathbf{u}} \in \mathbb{R}^{n_{\hat{u}}}$ is the vector of parameterized inputs, $\phi$ is the objective function to be minimized at the end of the batch, $\mathbf{S} \in \mathbb{R}^{n_S}$ is a vector of discretized path constraints and $\mathbf{T} \in \mathbb{R}^{n_T}$ is a vector of terminal constraints. Since the constraints are calculated at pre-specified time points, they can be represented collectively by a single vector as:

$$\mathbf{g}\!\left(\mathbf{y}(\hat{\mathbf{u}}, \boldsymbol{\theta}_{\mathbf{k}}), \hat{\mathbf{u}}\right) = \left[\mathbf{S}\!\left(\mathbf{y}(\hat{\mathbf{u}}, \boldsymbol{\theta}_{\mathbf{k}}, t_0), \mathbf{U}(\hat{\mathbf{u}}, t_0)\right), \ldots, \mathbf{S}\!\left(\mathbf{y}(\hat{\mathbf{u}}, \boldsymbol{\theta}_{\mathbf{k}}, t_f), \mathbf{U}(\hat{\mathbf{u}}, t_f)\right), \mathbf{T}\!\left(\mathbf{y}(\hat{\mathbf{u}}, \boldsymbol{\theta}_{\mathbf{k}}, t_f)\right)\right]$$
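The sketch below illustrates the direct sequential approach of problem P.2 under the piecewise-constant parameterization described above, with the collected constraint vector $\mathbf{g}$ enforced as inequality constraints; the choice of SLSQP and all helper names are illustrative assumptions.

```python
# Sketch of the direct sequential approach (P.2): inputs are held piecewise
# constant on a time grid and the NLP is solved over the static variables
# u_hat. Illustrative only; f, phi and g are user-supplied callables.
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import minimize

def piecewise_u(u_hat, t, t_grid):
    """U(u_hat, t): hold u_hat[j] over the interval [t_grid[j], t_grid[j+1])."""
    j = min(np.searchsorted(t_grid, t, side='right') - 1, len(u_hat) - 1)
    return u_hat[j]

def optimize_inputs(u_hat0, theta_k, t_grid, x0, f, phi, g):
    def outputs(u_hat):
        rhs = lambda t, x: f(x, theta_k, piecewise_u(u_hat, t, t_grid))
        return solve_ivp(rhs, (t_grid[0], t_grid[-1]), x0, t_eval=t_grid).y
    cons = {'type': 'ineq', 'fun': lambda u_hat: -g(outputs(u_hat))}  # g <= 0
    return minimize(lambda u_hat: phi(outputs(u_hat)), u_hat0,
                    method='SLSQP', constraints=[cons]).x            # u_hat_{k+1}
```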
The above two steps (P.1 and P.2) are then repeated successively until convergence is obtained. However, in the presence of an error between the assumed model structure $f(\mathbf{x}, \boldsymbol{\theta}, \mathbf{u})$ and the actual process, the convergence of this approach towards the true process optimum cannot be guaranteed. Since this error is a representation of unmodelled dynamics, and therefore a function of the model states and inputs, the parameter estimates have to compensate for this varying error when the model is calibrated against experimental data collected at different operating conditions. As a result, there is no unique set of parameter values that can satisfy the identification objective (P.1) for all possible realizations of the operating conditions. The set of parameter estimates that minimizes the prediction error at any particular set of operating conditions, for instance $\mathbf{u_k}$, may not minimize this error at values slightly different from $\mathbf{u_k}$, which implies that the model may not correctly predict the gradients of the optimization objective and constraints with respect to $\mathbf{u_k}$, i.e.:
$$\frac{\partial \phi_p\!\left(\mathbf{y_m}(\hat{\mathbf{u}}_{\mathbf{k}}, t_f)\right)}{\partial \hat{u}_{i_k}} \neq \frac{\partial \phi\!\left(\mathbf{y}(\hat{\mathbf{u}}_{\mathbf{k}}, \boldsymbol{\theta}_{\mathbf{k}}, t_f)\right)}{\partial \hat{u}_{i_k}} \tag{1a}$$

$$\frac{\partial g_{p_j}\!\left(\mathbf{y_m}(\hat{\mathbf{u}}_{\mathbf{k}}), \hat{\mathbf{u}}_{\mathbf{k}}\right)}{\partial \hat{u}_{i_k}} \neq \frac{\partial g_j\!\left(\mathbf{y}(\hat{\mathbf{u}}_{\mathbf{k}}, \boldsymbol{\theta}_{\mathbf{k}}), \hat{\mathbf{u}}_{\mathbf{k}}\right)}{\partial \hat{u}_{i_k}} \tag{1b}$$

In the worst-case scenario, the above predicted and measured gradients may have opposite signs, thus driving the optimization search away from the true process optimum, i.e.:

$$\phi_p\!\left(\mathbf{y_m}(\hat{\mathbf{u}}_{\mathbf{k+1}}, t_f)\right) > \phi_p\!\left(\mathbf{y_m}(\hat{\mathbf{u}}_{\mathbf{k}}, t_f)\right) \tag{2}$$
To ensure convergence to the true process optimum, one may choose to estimate parameters that predict the above gradients with higher accuracy, but then the accuracy in predicting the process outputs has to be relaxed to a certain extent. To tackle this tradeoff between the identification and optimization objectives, we recently proposed a set of progressive linear corrections to the model outputs such that, upon convergence, the updated model parameters not only minimize the difference between the predicted and measured gradients but also minimize the prediction error in the outputs. In the next sub-section, the idea behind this model correction is explained, and a more accurate model correction based on a quadratic approximation of the model is proposed.
2.2 Quadratic Model Correction

For any kth iteration, the identification problem posed in P.1 calculates the set of parameter estimates $\boldsymbol{\theta}_{\mathbf{k}}$ such that the sum of squared errors between the predicted and measured outputs, $\epsilon_k$, is at its minimum. Mathematically, $\epsilon_k$ can be expressed as:
$$\epsilon_k = \sum_{i=1}^{N} \left\| \mathbf{y_m}(\mathbf{u_k}) - \mathbf{y}(\mathbf{u_k}, \boldsymbol{\theta}_{\mathbf{k}}) \right\|^2 \tag{3}$$
Let $\Delta\boldsymbol{\theta}_{\mathbf{k}}$ be the change in the parameter estimates $\boldsymbol{\theta}_{\mathbf{k}}$ required to minimize the difference between the predicted and measured gradients given in Equations (1a)-(1b). However, with the updated parameter estimates $\boldsymbol{\theta}_{\mathbf{k}} + \Delta\boldsymbol{\theta}_{\mathbf{k}}$, the sum of squared errors between the predicted and measured outputs is no longer at its minimum and can be expressed as follows:

$$\epsilon_k' = \sum_{i=1}^{N} \left\| \mathbf{y_m}(\mathbf{u_k}) - \mathbf{y}(\mathbf{u_k}, \boldsymbol{\theta}_{\mathbf{k}} + \Delta\boldsymbol{\theta}_{\mathbf{k}}) \right\|^2 \tag{4}$$
To ensure that $\epsilon_k' = \epsilon_k$, let us introduce a vector of constant corrections $-\mathbf{c_k}$ in the model outputs such that:

$$\sum_{i=1}^{N} \left\| \mathbf{y_m}(\mathbf{u_k}) - \left(\mathbf{y}(\mathbf{u_k}, \boldsymbol{\theta}_{\mathbf{k}} + \Delta\boldsymbol{\theta}_{\mathbf{k}}) - \mathbf{c_k}\right) \right\|^2 = \sum_{i=1}^{N} \left\| \mathbf{y_m}(\mathbf{u_k}) - \mathbf{y}(\mathbf{u_k}, \boldsymbol{\theta}_{\mathbf{k}}) \right\|^2 \tag{5}$$

To satisfy Equation (5), the corresponding error terms on both sides of the equality should be equal:

$$\mathbf{y_m}(\mathbf{u_k}) - \mathbf{y}(\mathbf{u_k}, \boldsymbol{\theta}_{\mathbf{k}} + \Delta\boldsymbol{\theta}_{\mathbf{k}}) + \mathbf{c_k} = \mathbf{y_m}(\mathbf{u_k}) - \mathbf{y}(\mathbf{u_k}, \boldsymbol{\theta}_{\mathbf{k}}) \tag{6}$$

Following a rearrangement, the correction term $\mathbf{c_k}$ can be expressed as:

$$\mathbf{c_k} = \mathbf{y}(\mathbf{u_k}, \boldsymbol{\theta}_{\mathbf{k}} + \Delta\boldsymbol{\theta}_{\mathbf{k}}) - \mathbf{y}(\mathbf{u_k}, \boldsymbol{\theta}_{\mathbf{k}}) \tag{7}$$

Following a Taylor series expansion, the model outputs $\mathbf{y}(\mathbf{u_k}, \boldsymbol{\theta}_{\mathbf{k}} + \Delta\boldsymbol{\theta}_{\mathbf{k}})$ can be expanded around $\boldsymbol{\theta}_{\mathbf{k}}$ as follows:

$$\mathbf{y}(\mathbf{u_k}, \boldsymbol{\theta}_{\mathbf{k}} + \Delta\boldsymbol{\theta}_{\mathbf{k}}) = \mathbf{y}(\mathbf{u_k}, \boldsymbol{\theta}_{\mathbf{k}}) + D\mathbf{y}(\boldsymbol{\theta}_{\mathbf{k}})\, \Delta\boldsymbol{\theta}_{\mathbf{k}} + \frac{1}{2}\, \Delta\boldsymbol{\theta}_{\mathbf{k}}^T H\mathbf{y}(\boldsymbol{\theta}_{\mathbf{k}})\, \Delta\boldsymbol{\theta}_{\mathbf{k}} + \cdots \tag{8}$$

where $D\mathbf{y}(\boldsymbol{\theta}_{\mathbf{k}})$ is the Jacobian whose elements are the derivatives of the outputs $\mathbf{y}$ with respect to the model parameters $\boldsymbol{\theta}_{\mathbf{k}}$, and $H\mathbf{y}(\boldsymbol{\theta}_{\mathbf{k}})$ is the corresponding Hessian. Combining Equations (7) and (8), the correction term is then given by:

$$\mathbf{c_k} = D\mathbf{y}(\boldsymbol{\theta}_{\mathbf{k}})\, \Delta\boldsymbol{\theta}_{\mathbf{k}} + \frac{1}{2}\, \Delta\boldsymbol{\theta}_{\mathbf{k}}^T H\mathbf{y}(\boldsymbol{\theta}_{\mathbf{k}})\, \Delta\boldsymbol{\theta}_{\mathbf{k}} + \cdots \tag{9}$$

Neglecting terms higher than second order, the above correction can be approximated by the quadratic expansion around $\boldsymbol{\theta}_{\mathbf{k}}$ as follows:

$$\mathbf{c_k} = D\mathbf{y}(\boldsymbol{\theta}_{\mathbf{k}})\, \Delta\boldsymbol{\theta}_{\mathbf{k}} + \frac{1}{2}\, \Delta\boldsymbol{\theta}_{\mathbf{k}}^T H\mathbf{y}(\boldsymbol{\theta}_{\mathbf{k}})\, \Delta\boldsymbol{\theta}_{\mathbf{k}} \tag{10}$$

The accuracy of this quadratic approximation can be quantified using the relative truncation error as follows:

$$\boldsymbol{\epsilon}^{\mathbf{T}} = \frac{\mathbf{y}(\mathbf{u_k}, \boldsymbol{\theta}_{\mathbf{k}} + \Delta\boldsymbol{\theta}_{\mathbf{k}}) - \mathbf{y}(\mathbf{u_k}, \boldsymbol{\theta}_{\mathbf{k}}) - D\mathbf{y}(\boldsymbol{\theta}_{\mathbf{k}})\, \Delta\boldsymbol{\theta}_{\mathbf{k}} - \frac{1}{2}\, \Delta\boldsymbol{\theta}_{\mathbf{k}}^T H\mathbf{y}(\boldsymbol{\theta}_{\mathbf{k}})\, \Delta\boldsymbol{\theta}_{\mathbf{k}}}{\mathbf{y}(\mathbf{u_k}, \boldsymbol{\theta}_{\mathbf{k}})} \tag{11}$$
Following the above discussion, the calculation of $\Delta\boldsymbol{\theta}_{\mathbf{k}}$ to minimize the differences between the predicted and measured gradients (Equations (1a)-(1b)) is then performed using an optimization problem as follows:

$$\Delta\boldsymbol{\theta}_{\mathbf{k}} = \arg\min_{\Delta\boldsymbol{\theta}} \left( \mathbf{w}_\phi \left| \frac{\partial \phi_p\!\left(\mathbf{y_m}(\hat{\mathbf{u}}_{\mathbf{k}}, t_f)\right)}{\partial \hat{\mathbf{u}}} - \frac{\partial \phi\!\left(\hat{\mathbf{y}}(\hat{\mathbf{u}}_{\mathbf{k}}, \boldsymbol{\theta}_{\mathbf{k}}, \Delta\boldsymbol{\theta}, t_f)\right)}{\partial \hat{\mathbf{u}}} \right| + \mathbf{w}_g \left| \frac{\partial \mathbf{g}_p\!\left(\mathbf{y_m}(\hat{\mathbf{u}}_{\mathbf{k}}), \hat{\mathbf{u}}_{\mathbf{k}}\right)}{\partial \hat{\mathbf{u}}} - \frac{\partial \mathbf{g}\!\left(\hat{\mathbf{y}}(\hat{\mathbf{u}}_{\mathbf{k}}, \boldsymbol{\theta}_{\mathbf{k}}, \Delta\boldsymbol{\theta}), \hat{\mathbf{u}}_{\mathbf{k}}\right)}{\partial \hat{\mathbf{u}}} \right| \right)$$
$$\text{s.t.} \quad \dot{\mathbf{x}} = \hat{f}(\mathbf{x}, \boldsymbol{\theta}_{\mathbf{k}} + \Delta\boldsymbol{\theta}, \hat{\mathbf{u}})$$
$$\hat{\mathbf{y}} = h(\mathbf{x}) - D\mathbf{y}(\boldsymbol{\theta}_{\mathbf{k}})\, \Delta\boldsymbol{\theta} - \frac{1}{2}\, \Delta\boldsymbol{\theta}^T H\mathbf{y}(\boldsymbol{\theta}_{\mathbf{k}})\, \Delta\boldsymbol{\theta}$$
$$\left\| \boldsymbol{\epsilon}^{\mathbf{T}} \right\|_\infty \le \epsilon^{T}_{max} \tag{P.3}$$
where $\mathbf{w}_\phi$ and $\mathbf{w}_g$ are vectors of normalizing weights corresponding to the gradients of the objective function and constraints, respectively, and $\epsilon^{T}_{max}$ is the limit imposed on the relative truncation error (Equation 11), representing the required degree of accuracy in the quadratic correction.
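For concreteness, the sketch below evaluates the quadratic correction of Equation (10) and the relative truncation error of Equation (11) for a scalar output, using finite-difference approximations of the Jacobian and Hessian; the step size and function names are assumptions, not the authors' implementation.

```python
# Sketch of Equations (10)-(11) for a scalar output y(theta): finite-difference
# Jacobian and Hessian, then the quadratic correction and its truncation error.
import numpy as np

def quadratic_correction(y_of_theta, theta_k, dtheta, h=1e-5):
    n = len(theta_k)
    e = np.eye(n) * h
    y0 = y_of_theta(theta_k)
    Dy = np.array([(y_of_theta(theta_k + e[i]) - y0) / h for i in range(n)])
    Hy = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            Hy[i, j] = (y_of_theta(theta_k + e[i] + e[j]) - y_of_theta(theta_k + e[i])
                        - y_of_theta(theta_k + e[j]) + y0) / h**2
    c_k = Dy @ dtheta + 0.5 * dtheta @ Hy @ dtheta            # Equation (10)
    eps_T = (y_of_theta(theta_k + dtheta) - y0 - c_k) / y0    # Equation (11)
    return c_k, eps_T
```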
To summarize, at any given set of operating conditions, the estimation of the parameters is divided into two steps. In the first step, initial estimates are obtained independently of the optimization objectives, to minimize only the prediction error as per problem P.1. In the subsequent step, the change in these estimates is computed from P.3 such that the updated model also predicts the measured gradients of the optimization problem with the minimum possible error, while the corresponding set of model corrections guarantees the prediction error to remain within the bounds of its minimum value from the first step. The updated model with the parameter estimates $\boldsymbol{\theta}'_{\mathbf{k}}$ $(= \boldsymbol{\theta}_{\mathbf{k}} + \Delta\boldsymbol{\theta}_{\mathbf{k}})$ is then used to calculate the optimal operating conditions for the next iteration as per problem P.2. This procedure, involving P.1, P.3 and P.2, is repeated until convergence is achieved.
In this paper, the optimization gradients of the process were approximated using a finite difference approach, which requires a set of additional experiments obtained by perturbing the inputs one at a time. For a large number of inputs, a more practical approach would be to approximate these gradients using process information from past and current iterations7,16-17 by modifying the optimization problem P.2. A detailed discussion of this topic is, however, beyond the scope of this paper.
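A minimal sketch of this one-at-a-time perturbation scheme is shown below; `run_batch` is a hypothetical placeholder for a plant experiment that returns the measured objective.

```python
# Forward-difference approximation of the plant gradients: one additional
# batch (run_batch call) per input direction. Names are illustrative.
import numpy as np

def plant_gradient(run_batch, u_hat, delta=0.01):
    phi0 = run_batch(u_hat)
    grad = np.zeros_like(u_hat)
    for i in range(len(u_hat)):
        u_pert = u_hat.copy()
        u_pert[i] += delta            # perturb one input at a time
        grad[i] = (run_batch(u_pert) - phi0) / delta
    return grad
```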
3. ROBUST OPTIMIZATION
The overall performance of the proposed algorithm can be evaluated in terms of (1) its rate of convergence and (2) its robustness to model uncertainties. The rate at which the algorithm converges depends strongly on the change in the parameter estimates $\Delta\boldsymbol{\theta}_{\mathbf{k}}$ along the iterations. For faster convergence, it is desirable to use values of $\Delta\boldsymbol{\theta}_{\mathbf{k}}$ that allow the gradients of the optimization problem to be predicted accurately, but this choice is controlled by the relative truncation error $\epsilon^{T}_{max}$, which restricts the values of $\Delta\boldsymbol{\theta}_{\mathbf{k}}$ to a region where the proposed model corrections are assumed to be valid. Compared to the linear correction proposed earlier in Mandur and Budman (2015)1, the quadratic model correction increases this validity region, thereby allowing larger $\Delta\boldsymbol{\theta}_{\mathbf{k}}$ values and thus a higher rate of convergence. However, this improvement comes at a price, as larger values of $\Delta\boldsymbol{\theta}_{\mathbf{k}}$ make the algorithm more sensitive to noise in the measured gradients. Although, with some trial and error, one can obtain values of $\Delta\boldsymbol{\theta}_{\mathbf{k}}$ that provide a reasonable tradeoff between robustness towards modeling errors and robustness towards noisy gradients, a more systematic approach is to incorporate the effect of model uncertainties explicitly into the algorithm, which increases its overall robustness irrespective of the $\Delta\boldsymbol{\theta}_{\mathbf{k}}$ values. Therefore, the second contribution of this paper is the introduction of robustness in the optimization objective: instead of the nominal objective function as in problem P.2, a weighted sum of the nominal objective function and its variability is minimized as follows:

$$\hat{\mathbf{u}}_{\mathbf{k+1}} = \arg\min_{\hat{\mathbf{u}}} \left( E\!\left(\phi\!\left(\mathbf{y}(\hat{\mathbf{u}}, t_f)\right)\right) + \mathrm{Var}\!\left(\phi\!\left(\mathbf{y}(\hat{\mathbf{u}}, t_f)\right)\right) \right) \tag{12}$$
where Var is a measure of the variability in the optimization objective due to model uncertainties. In the following sub-sections, a computationally efficient approach to first quantify the model uncertainties and then propagate their effect onto the optimization objective is presented.
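As a simple illustration of Equation (12), the sketch below approximates the expectation and variance of the objective from samples of the uncertain parameters; the explicit weight `w` (introduced later in problem P.5) and all names are assumptions for illustration.

```python
# Sample-based sketch of the robust objective E[phi] + w*Var[phi] of Eq. (12);
# phi_of evaluates the objective for given inputs and a parameter realization.
import numpy as np

def robust_objective(u_hat, theta_samples, phi_of, w=1.0):
    vals = np.array([phi_of(u_hat, theta) for theta in theta_samples])
    return vals.mean() + w * vals.var(ddof=1)
```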
3.1 Parametric Uncertainty
The effects of modeling errors and/or measurement noise are usually captured by defining a confidence region around the parameter estimates. This region basically represents the joint probabilities by which the parameter estimates predict the given experimental data. The conventional method to compute this region is based on a linearization of the model around the nominal parameter estimates. Following this linearization, for normally distributed measurement errors, the resulting parametric uncertainty is also normal and can be expressed by the following hyper-ellipsoid12:

$$E_\theta = \left\{ \boldsymbol{\theta} : (\boldsymbol{\theta} - \boldsymbol{\theta}_{\mathbf{k}})^T \mathbf{V}_{\boldsymbol{\theta}}^{-1} (\boldsymbol{\theta} - \boldsymbol{\theta}_{\mathbf{k}}) \le \chi^2_{n_\theta}(\alpha) \right\} \tag{13}$$
where $\mathbf{V}_{\boldsymbol{\theta}} \in \mathbb{R}^{n_\theta \times n_\theta}$ is the covariance matrix of the parameter estimates and $\chi^2_{n_\theta}(\alpha)$ is the chi-square distribution with $n_\theta$ degrees of freedom at the $\alpha$ confidence level. This description is, however, valid only if either the model is linear or the uncertainty region for the parameters is small enough that the underlying assumption of model linearity holds. For nonlinear problems, the Bayesian approach is a more accurate choice11, where the joint probability distribution of the parameters is given by:

$$p(\boldsymbol{\theta}|D) = \frac{L(\boldsymbol{\theta}|D)\, p(\boldsymbol{\theta})}{\int L(\boldsymbol{\theta}|D)\, p(\boldsymbol{\theta})\, d\boldsymbol{\theta}} \tag{14}$$
where $p(\boldsymbol{\theta}|D)$ is the posterior probability of the parameters conditional on the given set of measurements $D$, $p(\boldsymbol{\theta})$ is the prior probability of the parameters representing any a priori available information about them, and $L(\cdot)$ represents the likelihood of the parameters, defining the statistical distribution of the errors. For instance, assuming the errors between the output measurements and predictions to be independent and normally distributed, the likelihood function is given by a $k$-dimensional multivariate normal distribution (with $k$ the total number of measurements) as follows:

$$L(\boldsymbol{\theta}|D) = \frac{1}{(2\pi)^{k/2}\, |\Sigma|^{1/2}} \exp\left( -\frac{1}{2} \left(\mathbf{y_m} - \mathbf{y}(\boldsymbol{\theta})\right)^T \Sigma^{-1} \left(\mathbf{y_m} - \mathbf{y}(\boldsymbol{\theta})\right) \right) \tag{15}$$
In this study, the effect of both the normal and the Bayesian parametric distributions is investigated. Assuming that the only prior information available for the parameters is their lower and upper bounds, a uniform prior is used in the above posterior calculation. Since, for nonlinear models, the above likelihood function does not have an analytical solution, the most common approach to propagate uncertainty is to sample from the posterior distribution (Equation 14) using Markov Chain Monte Carlo (MCMC) sampling techniques18 and to solve, for each realization, the full nonlinear model to calculate the desired output. Given that these calculations have to be repeated at each function evaluation, the resulting robust optimization problem is a computationally challenging task.
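A minimal random-walk Metropolis sketch of the MCMC sampling just described is given below, assuming a uniform (box) prior as in this study; `log_like` stands for the logarithm of Equation (15), and the step size and names are illustrative assumptions.

```python
# Random-walk Metropolis sampling of the posterior in Equation (14) with a
# uniform prior on [lb, ub]; outside the box the prior (and posterior) is zero.
import numpy as np

def metropolis(log_like, theta0, lb, ub, n_samples=5000, step=0.05, seed=0):
    rng = np.random.default_rng(seed)
    theta, ll = np.asarray(theta0, float), log_like(theta0)
    chain = []
    for _ in range(n_samples):
        prop = theta + step * rng.standard_normal(theta.size)
        if np.all(prop >= lb) and np.all(prop <= ub):      # uniform prior support
            ll_prop = log_like(prop)
            if np.log(rng.uniform()) < ll_prop - ll:       # accept/reject
                theta, ll = prop, ll_prop
        chain.append(theta.copy())
    return np.array(chain)
```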
To this end, we implemented an approach based on Polynomial Chaos (PC) expansions, where the key idea is to use a set of orthogonal polynomials to build a surrogate map between the desired output and the model parameters, specifically over the uncertain parameter space11. Compared to conventional Monte Carlo sampling techniques, PC expansions require a significantly smaller number of model runs, making them an ideal choice for robust optimization studies. A brief background on PC-based uncertainty propagation is presented next.
3.2 Polynomial Chaos Expansions

Let us define a probability space $(\Omega, \mathcal{F}, P)$, where $\Omega$ is the sample space, $\mathcal{F}$ is the $\sigma$-algebra over $\Omega$ and $P$ is a probability measure on $\mathcal{F}$. If $\{\xi_i(\omega)\}_{i=1}^{\infty}$ is a set of independent random variables with a standard probability distribution, then the PC expansion of any random variable $X$ with finite variance is as follows:

$$X(\omega) = a_0 \Gamma_0 + \sum_{i_1=1}^{\infty} a_{i_1} \Gamma_1(\xi_{i_1}) + \sum_{i_1=1}^{\infty} \sum_{i_2=1}^{i_1} a_{i_1 i_2} \Gamma_2(\xi_{i_1}, \xi_{i_2}) + \sum_{i_1=1}^{\infty} \sum_{i_2=1}^{i_1} \sum_{i_3=1}^{i_2} a_{i_1 i_2 i_3} \Gamma_3(\xi_{i_1}, \xi_{i_2}, \xi_{i_3}) + \cdots \tag{16}$$
where $\Gamma_p$ is the basis function of order $p$19, $\omega$ is the random event and $a_{(\cdot)}$ is the deterministic coefficient multiplying the corresponding basis function. The expansion can be expressed in a more compact form19 as:

$$X(\omega) = \sum_{k=0}^{\infty} \hat{a}_k \Psi_k(\xi_1, \xi_2, \xi_3, \ldots) \tag{17}$$

The fundamental property of PC expansions is that all basis functions are orthogonal to each other, i.e.:

$$\langle \Psi_i \Psi_j \rangle = \int \Psi_i(\boldsymbol{\xi})\, \Psi_j(\boldsymbol{\xi})\, p(\boldsymbol{\xi})\, d\boldsymbol{\xi} = \delta_{ij} \langle \Psi_i^2 \rangle \tag{18}$$

Then, following the above orthogonality relation, the coefficients in the expansion can be computed using Galerkin projections as follows:

$$\hat{a}_k = \frac{\langle X\, \Psi_k \rangle}{\langle \Psi_k^2 \rangle} \tag{19}$$
To propagate the effect of parametric uncertainty onto the desired variable $X$, e.g. the optimization objective function and/or constraints (if required), the first step is to formulate the PC expansions for the uncertain model parameters. To compute the expansion coefficients, a one-to-one mapping between the set of parameters $\boldsymbol{\theta}$ and the set of independent random variables $\boldsymbol{\xi}$ is needed. For the case of uncorrelated parameters, this can be achieved by an inverse transformation as follows:

$$\xi_i = F^{-1}\!\left( \int_0^{\theta_i} p(\theta_i)\, d\theta_i \right) \tag{20}$$

where $F^{-1}$ is the inverse of the cumulative density function of the independent random variable $\xi_i$ and $p(\theta_i)$ is the probability density of the model parameter $\theta_i$. Then, using these mappings and Equation (19), the calculation of the coefficients in the respective PC expansions is quite straightforward.
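As an illustration of the mapping in Equation (20), used in the reverse direction when parameter values are needed at given collocation points, the sketch below pushes a standard normal $\xi_i$ through a numerically constructed inverse CDF of $p(\theta_i)$; the tabulated-density representation is an assumption.

```python
# Sketch of the isoprobabilistic mapping of Equation (20), applied in reverse:
# theta_i = F_theta^{-1}(Phi(xi)) for xi ~ N(0,1), with the CDF of theta_i
# built numerically from a tabulated density. Illustrative only.
import numpy as np
from scipy.stats import norm

def theta_from_xi(xi, theta_grid, pdf_vals):
    cdf = np.cumsum(pdf_vals)
    cdf = cdf / cdf[-1]                        # normalized numerical CDF
    return np.interp(norm.cdf(xi), cdf, theta_grid)
```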
However, when the parameters are correlated, the following set of transformations based on conditional and marginal probabilities needs to be applied:

$$\xi_1 = F_1^{-1}\!\left( \int_0^{\theta_1} p(\theta_1|D)\, d\theta_1 \right)$$
$$\xi_2 = F_2^{-1}\!\left( \int_0^{\theta_2} p(\theta_2|\theta_1, D)\, d\theta_2 \right)$$
$$\vdots$$
$$\xi_i = F_i^{-1}\!\left( \int_0^{\theta_i} p(\theta_i|\theta_{i-1}, \theta_{i-2}, \ldots, \theta_1, D)\, d\theta_i \right) \tag{21}$$
Based on these conditional mappings, the PC expansions for the different parameters can be obtained, where the coefficients in each expansion depend on the corresponding set of conditioning parameters. A two-parameter example for the case of correlated parameters is presented in11, where the authors proposed building nested PC expansions to deal with the conditional mappings. Once the PC expansions for the set of uncertain parameters are formulated, the next step is to formulate the PC expansion between the desired variable $X$ and the independent random variables $\boldsymbol{\xi}$ as follows:

1. Select the values of the independent random variables $\boldsymbol{\xi}$ at the required collocation points.
2. Calculate the corresponding set of parameter values $\boldsymbol{\theta}$ from their PC expansions.
3. Solve the nonlinear model for each set of parameter values and calculate the desired output $X$ (e.g. the optimization objective).
4. Using the map between $X$ and $\boldsymbol{\xi}$ from steps (1)-(3), build a PC expansion for $X(\boldsymbol{\xi})$.

Due to the orthogonality of the basis functions, the variance of the output variable $X$ can be calculated analytically as follows:
$$\mathrm{Var}(X) = \sum_{i=1}^{P} \hat{a}_i^2\, \langle \Psi_i^2 \rangle \tag{22}$$

This expression can then be applied to calculate the robust optimization cost in Equation (12).
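The sketch below illustrates steps (1)-(4) and Equation (22) for a single standard normal $\xi$ using probabilists' Hermite polynomials, for which $\langle He_k^2 \rangle = k!$; the quadrature order and function names are assumptions.

```python
# One-dimensional Hermite PC sketch: coefficients by Galerkin projection
# (Equation 19) at Gauss-Hermite collocation points, variance from Eq. (22).
import numpy as np
from numpy.polynomial.hermite_e import hermegauss, hermeval
from math import factorial

def pce_fit(X_of_xi, order=4, n_quad=10):
    nodes, weights = hermegauss(n_quad)             # quadrature for exp(-xi^2/2)
    weights = weights / np.sqrt(2 * np.pi)          # normalize to the N(0,1) measure
    X_vals = np.array([X_of_xi(x) for x in nodes])  # step (3): model runs
    return np.array([np.sum(weights * X_vals * hermeval(nodes, [0]*k + [1]))
                     / factorial(k) for k in range(order + 1)])

def pce_variance(coeffs):
    """Equation (22) with <He_k^2> = k! (the k = 0 term carries the mean)."""
    return sum(a**2 * factorial(k) for k, a in enumerate(coeffs) if k > 0)
```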
4. PROPOSED OPTIMIZATION METHODOLOGY
The overall methodology is summarized in the flowchart in Figure 1. The algorithm begins with the first identification step, where the objective is to calculate $\boldsymbol{\theta}_{\mathbf{k}}$ as posed in P.1, with the initial model correction term set to zero. In the second step, the change in the parameter estimates $\Delta\boldsymbol{\theta}_{\mathbf{k}}$ and the set of new model corrections $\mathbf{c_k}$ are calculated using problem P.3. Then, using the corrected model and the parameter estimates $\boldsymbol{\theta}_{\mathbf{k}} + \Delta\boldsymbol{\theta}_{\mathbf{k}}$, a nominal or robust optimization problem, with the objective function given by problem P.2 or Equation 12 respectively, is solved to calculate the optimal operating conditions for the next iteration. This procedure is repeated until convergence is achieved. In the context of the batch or fed-batch systems studied in this work, each iteration corresponds to a new batch experiment: after estimating the parameters using P.1 and P.3, new optimal operating conditions are obtained for the next batch experiment, until the algorithm converges after a finite number of batches. It is important to note that solving the robust optimization problem (Equation 12) involves additional computational time, required to calculate the PC representation of the uncertain model parameters and then to compute the mean and variance of the robust objective function.
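A high-level sketch of this loop is given below; `solve_P1`, `solve_P3` and `solve_P2` are hypothetical callables standing for the three problems defined above, not the authors' code.

```python
# Driver loop of the proposed methodology: identify (P.1), correct (P.3),
# optimize (P.2 or the robust objective of Eq. 12), repeat until convergence.
import numpy as np

def run_algorithm(u_hat, theta, c, solve_P1, solve_P3, solve_P2,
                  max_iter=30, tol=1e-3):
    for _ in range(max_iter):
        theta = solve_P1(u_hat, c)                 # identification step
        dtheta, c = solve_P3(u_hat, theta, c)      # gradient-matching step
        u_next = solve_P2(theta + dtheta, c)       # nominal or robust optimization
        converged = np.linalg.norm(u_next - u_hat) < tol
        u_hat = u_next
        if converged:                              # operating point has settled
            break
    return u_hat, theta + dtheta
```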
[Figure 1 is a flowchart of the proposed algorithm: starting from $\mathbf{c_0} = \mathbf{0}$, problem P.1 is solved for $\boldsymbol{\theta}_{\mathbf{k}}$ with the model outputs corrected by $\mathbf{c_{k-1}}$; problem P.3 is then solved for $\Delta\boldsymbol{\theta}_{\mathbf{k}}$ subject to the quadratic correction and the truncation-error bound; finally, problem P.2 is solved for $\hat{\mathbf{u}}_{\mathbf{k+1}}$ with the corrected model, the correction term is updated, and the loop repeats until convergence.]

Figure 1: Proposed algorithm with quadratic model correction.
5. CASE STUDY
The proposed algorithm is illustrated on a fed-batch bioprocess for penicillin production. The actual process is considered to be described by the following system of differential equations ((23)-(26))20-21:

$$\frac{dX}{dt} = \frac{\mu_X S X}{K_X X + S} - \frac{X}{V}\frac{dV}{dt} \tag{23}$$

$$\frac{dP}{dt} = \frac{\mu_P S X}{K_P + S + \frac{S^2}{K_I}} - K_H P - \frac{P}{V}\frac{dV}{dt} \tag{24}$$

$$\frac{dS}{dt} = -\frac{1}{Y_{X/S}}\frac{\mu_X S X}{K_X X + S} - \frac{1}{Y_{P/S}}\frac{\mu_P S X}{K_P + S + \frac{S^2}{K_I}} - m_X X + \frac{F s_f}{V} - \frac{S}{V}\frac{dV}{dt} \tag{25}$$

$$\frac{dV}{dt} = F - 6.226 \times 10^{-4}\, V \tag{26}$$
where $X$, $P$ and $S$ represent the concentrations of biomass, penicillin and substrate, respectively, and $V$ is the culture volume. The rate constants and other parameters are defined as follows: $\mu_X$ and $\mu_P$ are the specific growth rates for biomass and penicillin, respectively, with $K_X$ and $K_P$ as the respective saturation constants; $K_I$ represents the substrate inhibition in the growth kinetics of penicillin; $K_H$ accounts for the consumption of penicillin by hydrolysis; $Y_{X/S}$ and $Y_{P/S}$ are the yields per unit mass of substrate for biomass and penicillin, respectively; $m_X$ accounts for the consumption of substrate for maintaining the biomass; and, finally, $s_f$ is the concentration of substrate in the feed.
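A minimal sketch of this simulated plant is shown below; the parameter dictionary `p` is a placeholder for the kinetic constants of refs 20-21, and dropping the $K_H P$ hydrolysis term recovers the mismatched model introduced later in Equation (28).

```python
# The simulated "plant" of Equations (23)-(26); the parameter values in p are
# placeholders for the kinetic constants of refs 20-21.
from scipy.integrate import solve_ivp

def plant_rhs(t, z, p, F, s_f):
    X, P, S, V = z
    dV = F - 6.226e-4 * V                                  # Eq. (26)
    growth = p['mu_X'] * S * X / (p['K_X'] * X + S)        # biomass kinetics
    product = p['mu_P'] * S * X / (p['K_P'] + S + S**2 / p['K_I'])
    dX = growth - (X / V) * dV                             # Eq. (23)
    dP = product - p['K_H'] * P - (P / V) * dV             # Eq. (24); omit the
                                                           # K_H*P term for Eq. (28)
    dS = (-growth / p['Y_XS'] - product / p['Y_PS']
          - p['m_X'] * X + F * s_f / V - (S / V) * dV)     # Eq. (25)
    return [dX, dP, dS, dV]

# Usage with the Table 1 conditions (z0 = [X0, P0, S0, V0]; t_f, s_f, p assumed):
# sol = solve_ivp(plant_rhs, (0.0, t_f), [0.1, 0.0, 0.1, 100.0],
#                 args=(p, 0.04, s_f))
```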
The above Equations ((23)-(26)) are then used to generate in silico data for the output variables as well as for the gradients of the cost function and constraints in the optimization problem. The measured outputs, for example, are given by the following relation:

$$y_{m_{ij}} = \hat{y}_{ij} + \varepsilon_{ij} \tag{27}$$

where $\hat{y}_{ij}$ is the deterministic value of the $i$th output at the $j$th time point computed from Equations (23)-(26), and $\varepsilon_{ij}$ is the corresponding stochastic part representing any uncertainty in the measured output $y_{m_{ij}}$.
Assuming that there is no uncertainty in the inputs, that no unmeasured disturbances enter the process, and that the samples are collected and analyzed independently, the only uncertainty in the measurements is that coming from the sensor noise, which is uncorrelated both across the different outputs and over time. Therefore, assuming the measurement errors to be normally distributed with zero mean and a constant variance over time, while different outputs have different variances, the identification problem in P.1 is formulated as a weighted least squares problem.
As for the process model with structural uncertainty, the consumption of penicillin by hydrolysis occurring in the actual process is assumed to be a priori unknown to the user. Accordingly, the rate of change of the penicillin concentration is inaccurately modelled as:

$$\frac{dP}{dt} = \frac{\mu_P S X}{K_P + S + \frac{S^2}{K_I}} - \frac{P}{V}\frac{dV}{dt} \tag{28}$$
Assuming the dynamics of the other states to be known accurately, the set of Equations (23), (25), (26) and (28) then represents the inaccurate model of the process to be used in the proposed algorithm. It should be noted that the above elimination of the hydrolysis term results in a significant model error, as shown in our previous study1, where, without correcting for this error using iterative model corrections, the run-to-run optimization algorithm converged to a sub-optimal solution with nearly a 25% reduction in the final penicillin amount compared to the true process optimum.
The optimization objective is to maximize the amount of penicillin at the end of the batch by manipulating the initial substrate concentration $S_o$ and the input feed rate $F$, while ensuring that the culture volume does not exceed a maximal volume of 120 L. For reference, the true process optimum corresponds to an initial substrate concentration $S_o = 54.9$ g/L and an input feed rate $F = 0.1728$ L/hr, with a final penicillin amount at the end of the batch of ~592 g. Mathematically, the problem is formulated as:

$$\min_{S_o, F} \; -P(\mathbf{x}, \boldsymbol{\theta}, \mathbf{u}, S_o, F, t_f)$$
$$\text{s.t.} \quad \text{Equations (23), (25), (26) and (28)}$$
$$V(\mathbf{x}, \boldsymbol{\theta}, \mathbf{u}, S_o, F, t_f) \le V_{max} \tag{P.4}$$
The algorithm starts with a parameter estimation step using the initial input conditions listed in Table 1. Here, without loss of generality, only two parameters, $K_X$ and $K_I$, are updated in the algorithm, whereas the remaining parameters are kept at the initial values that were estimated in the first iteration with the incorrect model. The reasons for selecting a subset of parameters are: (1) to reduce the sensitivity to noise and (2) to reduce the computational load in the robust optimization problem. It has been shown that, as the number of parameters increases, the propagation of parametric uncertainty using PC expansions needs more model runs, thus increasing the overall computational time, since this propagation step has to be repeated several times in the optimization framework11. In situations where the optimal inputs provide insufficient excitation to estimate all the parameters, updating only a subset of parameters becomes even more relevant. However, this is beyond the scope of this study.
Table 1: Set of initial input conditions

Biomass conc. ($X_0$): 0.1 g/L
Substrate conc. ($S_0$): 0.1 g/L
Product conc. ($P_0$): 0 g/L
Initial culture volume ($V_0$): 100 L
Input feed ($F$): 0.04 L/hr
5.1 Results and discussion
As mentioned in the Introduction, the key motivation behind this work was to improve the convergence of the previously proposed algorithm1 in two directions: (1) by using a new model correction term based on a quadratic approximation of the model and (2) by addressing robustness explicitly in the optimization problem to reduce the effect of model uncertainties. To compare their relative effects, these two modifications were studied independently through several case studies, as discussed below.

5.1.1 Comparison of linear and quadratic model corrections
The effect of the quadratic model correction is discussed first. It is worth mentioning here that, in all optimizations, the optimal feed rate $F$ always converged to a value of 0.1728 L/hr, satisfying the constraint on culture volume; therefore, for clarity, it will not be considered in the following discussion. At any iteration, a minimal trade-off between the identification and optimization objectives is obtained by using the proposed model correction and the bound on the truncation error ($\epsilon^T_{max}$) for which this correction is assumed to be valid. It was hypothesized that, for highly nonlinear models, the linear approximation proposed in our previous work may be valid only in a small neighborhood of the parameter estimates from the first identification step. Thus, when using linear corrections, a small value of $\epsilon^T_{max}$ had to be used to maintain the predictive accuracy along the iterations, which limited the change in the estimates in the second step and therefore the extent to which the optimization gradients could be corrected.

To test the effect of the quadratic correction, the algorithm was solved using both quadratic and linear corrections for $\epsilon^T_{max}$ = 1%, and the corresponding convergence of the optimal $S_o$ as a function of the iteration number is compared in Figure 2. It is clear from this figure that, with the linear correction, the algorithm is not even able to match the sign of the predicted and measured gradients for some of the iterations, thus leading to an oscillatory profile during the transient. It should be remembered that, if the sign of the objective function's gradient is incorrectly predicted, the optimization will proceed in the wrong direction, temporarily moving away from the actual optimum and resulting in the observed oscillations. On the other hand, the quadratic approximation of the given model allows for much larger changes in the parameter estimates in all iterations, increasing the ability to fit the measured optimization gradients more accurately. As a result, the new correction resulted in a smooth (non-oscillatory) and much faster convergence towards the process optimum. In these results, the measurements were assumed to be noise-free so as to evaluate the performance of the algorithm when only the modeling error is present.
Figure 2: Comparing the effect of linear vs. quadratic model correction on the convergence of the optimal $S_o$.
In the second comparison, in order to study how much variability one can expect in each iteration when the algorithm is subject to random measurement noise, the algorithm was solved repeatedly 10 times, with each run subjected to independent noise realizations. This analysis is quite relevant from a practical point of view for batch processes, which are typically operated in parallel in order to test the reproducibility of the process outcomes. To this end, the measured outputs as well as the gradients of the optimization problem are assumed to be corrupted by additive white Gaussian noise as per Equation (27). The comparisons were then conducted for both types of corrections, i.e. linear and quadratic, and for two different levels of $\epsilon^T_{max}$. At the end of the iterative search, the values of the optimal $S_o$ from the 10 independent runs of the algorithm were used collectively to estimate the sample mean ($\bar{S}_o$) and sample variance ($s^2$) of the optimal $S_o$ in each iteration, as a way to quantify the sensitivity of the algorithm to noise. Figure 3 shows the convergence of the average optimal $S_o$ along with the 95% confidence interval calculated as $\bar{S}_o \pm t_{0.025,9}\,(s/\sqrt{10})$. The effectiveness of the algorithm is then evaluated in terms of (1) the integral absolute error (IAE) between the predicted average and the actual optimal $S_o$ and (2) the total variability in the predicted average optimal $S_o$. The results are summarized in Table 2.
Table 2: Comparison of model correction based on linear vs. quadratic approximation

                          $\epsilon^T_{max}$ = 1%                         $\epsilon^T_{max}$ = 5%
                          IAE      Total std. dev. in $\bar{S}_o$         IAE      Total std. dev. in $\bar{S}_o$
Linear approximation      8.7292   9.3147                                 6.3651   8.1248
Quadratic approximation   5.7419   8.3629                                 4.8090   7.5233
Figure 3: Convergence of the average optimal $S_o$ with 95% confidence interval for (a) linear model correction and $\epsilon^T_{max}$ = 1%, (b) quadratic model correction and $\epsilon^T_{max}$ = 1%, (c) linear model correction and $\epsilon^T_{max}$ = 5%, and (d) quadratic model correction and $\epsilon^T_{max}$ = 5%.
Compared to the linear correction, the proposed correction based on the quadratic approximation resulted in a significant reduction in both the IAE and the variability ($\sigma^2$) of the optimal $S_o$ in all cases. For $\epsilon^T_{max}$ = 5%, these reductions are nearly 24.45% and 14.26%, respectively, whereas for $\epsilon^T_{max}$ = 1% the numbers are much higher, with the IAE reduced by nearly 34.22% and the variability by nearly 19.39%. These results are critical since, in a practical situation, a smaller IAE and variability will give plant personnel higher confidence that the algorithm is actually converging to a process optimum rather than appearing to change at random due to process variability and measurement noise.
To better interpret the effect of noise on the convergence, let us recall the updated parameter estimate at each iteration, $\boldsymbol{\theta}_{\mathbf{k}} + \Delta\boldsymbol{\theta}_{\mathbf{k}}$. The first component, $\boldsymbol{\theta}_{\mathbf{k}}$, is the estimate obtained in the first step (P.1) by minimizing the prediction error in the model outputs. As a result, the uncertainty in $\boldsymbol{\theta}_{\mathbf{k}}$ is due to the combination of modeling errors and noise in the measured outputs. The second component, $\Delta\boldsymbol{\theta}_{\mathbf{k}}$, is the change in the estimates with respect to $\boldsymbol{\theta}_{\mathbf{k}}$ and is obtained in the second step (P.3) by minimizing the differences between the predicted and measured gradients of the optimization problem; therefore, its accuracy largely depends on the degree of noise in the gradients. However, the values of $\Delta\boldsymbol{\theta}_{\mathbf{k}}$ are controlled both by the type of model correction $\mathbf{c_k}$, e.g. linear or quadratic, and by the bound on the truncation error $\epsilon^T_{max}$ defining the accuracy of this correction (cf. the last constraint in P.3). Following the above, if $\Delta\boldsymbol{\theta}_{\mathbf{k}}$ is large, the parameter estimates $\boldsymbol{\theta}_{\mathbf{k}} + \Delta\boldsymbol{\theta}_{\mathbf{k}}$ become more sensitive to noise in the gradients, whereas with smaller values of $\Delta\boldsymbol{\theta}_{\mathbf{k}}$ the effect of modeling errors and noise in the measured outputs dominates. Out of the four cases, the correction based on the linear approximation and a 1% truncation error allows for the smallest value of $\Delta\boldsymbol{\theta}_{\mathbf{k}}$ in each iteration and, as a result, the estimates are highly affected by modeling errors and noise in the measured outputs. Compared to the linear approximation, the correction $\mathbf{c_k}$ based on the quadratic approximation always favors larger values of $\Delta\boldsymbol{\theta}_{\mathbf{k}}$, irrespective of the bound on the truncation error ($\epsilon^T_{max}$), thereby reducing the sensitivity of the algorithm to modeling errors and noise in the measured outputs relative to its sensitivity to noise in the gradients. This is evident in the results in Table 2, where, for both levels of truncation error, the proposed quadratic correction resulted in a smaller variability in the optimal $S_o$. However, the improvement is not the same between the two levels. One reason is that using $\epsilon^T_{max}$ = 5% allows for relatively larger values of $\Delta\boldsymbol{\theta}_{\mathbf{k}}$ compared to $\epsilon^T_{max}$ = 1%, thus leaving a smaller margin for improvement between the linear and quadratic corrections. The second reason is that, beyond certain values of $\Delta\boldsymbol{\theta}_{\mathbf{k}}$, the algorithm becomes more sensitive to noise in the gradients irrespective of the type of correction, as discussed above, thus limiting its overall ability to further filter the effect of model uncertainties. Although the new correction term carries an added computational load, as it requires the calculation of the Hessian matrix, this increase has a marginal effect on the overall computational time, which is dominated by the optimization step.
5.1.2 Comparison of robust versus nominal optimization
In the second approach, the sensitivity of the algorithm to noise and modeling error is reduced by explicitly adding a measure of robustness to the optimization objective, as shown in Equation 12. For the given case study, the robust optimization problem is formulated as follows:

$$\min_{S_o, F} \; -\left[ P(\mathbf{x}, \mathbf{u}, S_o, F, t_f) - w\, \mathrm{Var}\!\left( P(\mathbf{x}, \mathbf{u}, S_o, F, t_f) \right) \right]$$
$$\text{s.t.} \quad \text{Equations (23), (25), (26) and (28)}$$
$$\left\| V(\mathbf{x}, \boldsymbol{\theta}, \mathbf{u}, S_o, F, t_f) \right\|_\infty \le V_{max} \tag{P.5}$$
where the term Var represents the variability in the amount of penicillin at the final time, averaged over the uncertain parameter region, and $w$ is the weight on robustness. Since, in the parameter estimation step, the estimates are obtained such that the model can predict the measured gradient of the nominal objective function, the above formulation promotes a trade-off between the corresponding nominal performance and its variability.
The results using a normal distribution for parametric uncertainty are discussed first. Because this uncertainty description is by far the easiest to compute, it is the most commonly used description in probabilistic uncertainty analysis. The robust optimization problem is then solved 10 times with different realizations of noise, for both 1% and 5% truncation error and 𝑤 = 1%. Since one of the motivations was to compare the efficiency of this approach with that of the quadratic model correction, the problem is solved with the previously proposed linear correction. The resulting convergence of the average optimal 𝑆𝑜 in successive runs, with 95% confidence intervals, is shown in Figure 4, and the corresponding IAE and variability in the average optimal 𝑆𝑜 are listed in Table 3. It is evident from these results that the variability (measured as the variance σ²) is significantly reduced in both cases: by nearly 45.15% for 𝜖𝑚𝑎𝑥ᵀ = 1% and 33.79% for 𝜖𝑚𝑎𝑥ᵀ = 5%.
Table 3: Comparison of nominal vs. robust run-to-run optimization (with linear model correction)

                          𝜖𝑚𝑎𝑥ᵀ = 1%                        𝜖𝑚𝑎𝑥ᵀ = 5%
                          IAE      Total std. deviation σ   IAE      Total std. deviation σ
  Nominal Optimization    8.7292   9.3147                   6.3651   8.1248
  Robust Optimization     8.1793   6.8985                   5.6249   6.6109
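For reference, the quoted percentages follow directly from the standard deviations in Table 3 once variability is measured as the variance σ²:

\[
1 - \left(\frac{6.8985}{9.3147}\right)^2 \approx 0.4515,
\qquad
1 - \left(\frac{6.6109}{8.1248}\right)^2 \approx 0.3379
\]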
Figure 4: Robust convergence of the average optimal 𝑆𝑜 with 95% confidence intervals, using the linear model correction, for (a) 𝜖𝑚𝑎𝑥ᵀ = 1% and (b) 𝜖𝑚𝑎𝑥ᵀ = 5%.
By comparing the results in Table 2 and Table 3, it is observed that the proposed quadratic correction provides the larger improvement in terms of IAE, whereas the robust approach is much more effective in reducing the total variability. This difference in behavior is attributed to the way the two methodologies handle model uncertainties. By allowing a larger change in the parameter estimates, the quadratic correction primarily aims to further minimize the difference between the predicted and measured optimization gradients, thus speeding up the rate of convergence significantly. While this indirectly reduces the effect of modeling error and of noise in the measured outputs on the rate of convergence, providing some reduction in variability, it also increases the sensitivity of the algorithm to noise in the gradients. The robust approach, on the other hand, aims directly at reducing the variability in the optimal solutions irrespective of its source. Which method is preferable therefore depends on the final objective, the degree of nonlinearity in the problem, and the level of noise in the measurements.
One of the key limitations of robust formulations is the existence of an offset with respect to the true optimum. This offset is due to the presence of the variability term in the optimization objective, which forces the algorithm to converge to an apparent optimum that differs from the nominal one. For 𝜖𝑚𝑎𝑥ᵀ = 5%, the average optimal 𝑆𝑜 converged to ~55.75 g/L, with an offset of ~1 g/L, and this offset grows as the weight on variability is increased. One option to eliminate the offset is to switch the algorithm to the nominal mode, i.e., to a cost function that does not include the variability term, when approaching convergence; a minimal sketch of such a schedule is given below. The effectiveness of this solution, however, depends on when the switch from robust to nominal optimization is made, and it may result in higher variability after the switch. Another possible solution, as shown in this study, is to use more realistic and less conservative descriptions of model uncertainty, such as the Bayesian-based distributions (equation 14).
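The sketch below shows one possible form of the switching schedule. It is our own illustration, not a method prescribed in this work: solve_step stands in for one complete identification-plus-robust-optimization cycle, and the tolerance and run budget are arbitrary.

```python
def run_to_run_with_switch(solve_step, So0, w, tol=0.5, max_runs=30):
    # solve_step(So, w) performs one identification + optimization cycle
    # (problem (P.5)) and returns the new optimal substrate concentration.
    # Once successive optima agree to within tol, the variability weight
    # is dropped to zero so the nominal objective can remove the offset.
    So, w_k = So0, w
    for _ in range(max_runs):
        So_new = solve_step(So, w_k)
        if abs(So_new - So) < tol:
            w_k = 0.0  # switch from robust to nominal mode
        So = So_new
    return So
```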
For comparison, using the final optimal solutions from the above results as initial conditions, we ran the algorithm with both the normal (equation 13) and the Bayesian (equation 14) uncertainty descriptions, for 𝑤 = 1% and 2%. The results are summarized in Figure 5, where one can clearly see that as the weight of the variability term, w, is increased from 1% to 2%, the difference between the robust optimal solutions obtained with the two uncertainty descriptions also increases. Moreover, compared to its normal counterpart, the Bayesian uncertainty description is less conservative and thus resulted in a steady-state optimal solution closer to the actual process optimum. Figure 6 compares the normal and Bayesian uncertainty descriptions by plotting the resulting joint confidence regions for the parameters 𝐾𝑋 and 𝐾𝐼 at different confidence levels. The two descriptions differ significantly, especially at higher confidence levels; therefore, as the weight of the variability term is increased, which is equivalent to increasing the confidence level of a given uncertainty description, the impact of the difference between the two descriptions on the optimal solution becomes more evident (see Figure 5).
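In implementation terms, the two descriptions enter the algorithm only through the parameter samples used to evaluate the Var term of (P.5), as sketched below. This is our own illustration under stated assumptions: theta_hat, Sigma, and a posterior sample chain (e.g., from an MCMC sampler) are assumed to be available from the estimation step.

```python
import numpy as np

def samples_normal(theta_hat, Sigma, n, rng=np.random.default_rng()):
    # Normal description (equation 13): an elliptical region around the
    # point estimate, defined by the linearized covariance Sigma.
    return rng.multivariate_normal(theta_hat, Sigma, size=n)

def samples_bayesian(chain, n, rng=np.random.default_rng()):
    # Bayesian description (equation 14): re-use posterior draws (e.g., an
    # MCMC chain); the region they cover can be markedly non-elliptical
    # and is typically less conservative (cf. Figure 6).
    return chain[rng.choice(len(chain), size=n, replace=True)]
```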
Figure 5: Effect of Bayesian vs. normal parametric uncertainty on the robust convergence of the optimal 𝑆𝑜 for (a) 𝜖𝑚𝑎𝑥ᵀ = 1% and (b) 𝜖𝑚𝑎𝑥ᵀ = 5% (−o− normal with 𝑤 = 2%; −⟐− Bayesian with 𝑤 = 2%; −∗− normal with 𝑤 = 1%; −x− Bayesian with 𝑤 = 1%).
Figure 6: Comparison of the normal (dotted) and Bayesian (solid) descriptions of parametric uncertainty in 𝐾𝑋 and 𝐾𝐼.
6. CONCLUSION

A quadratic Taylor series approximation has been proposed to progressively correct the model for structural inaccuracies as the model is optimized toward the true process optimum in a run-to-run approach. Compared to the correction based on a linear approximation, the proposed quadratic correction allows for more accurate predictions of the optimization gradients along the iterations, thus achieving much faster convergence. In addition, a significant reduction in the variability of the search path has also been observed. In a second approach, the variability is reduced by explicitly adding a measure of variability to the optimization objective. Although, for the given case study, the first approach based on the quadratic model correction performs better in reducing the IAE, with a reduction in variability as an added advantage, the second approach, with explicit minimization of variability in the robust formulation, performs significantly better in reducing the variability. In a comparative study, it is also shown that the description of parametric uncertainty has a significant effect on the convergence of the algorithm. Compared to a normal description of parametric uncertainty, the Bayesian uncertainty description resulted in less conservative solutions and in a reduced offset of the final converged solution with respect to the true optimum.
AUTHOR INFORMATION

Corresponding Author
*Email: [email protected]. Phone: 519-888-4567 ext. 36980

Present Addresses
†Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA.

Notes
The authors declare no competing financial interest.
ACKNOWLEDGMENT

The authors would like to acknowledge the financial support from the Natural Sciences and Engineering Research Council (NSERC), Canada.
REFERENCES

(1) Mandur, J.; Budman, H. Simultaneous model identification and optimization in presence of model-plant mismatch. Chem. Eng. Sci. 2015, 129, 106-115.
(2) Chen, C. Y.; Joseph, B. On-line Optimization Using a Two-Phase Approach: An Application Study. Ind. Eng. Chem. Res. 1987, 26, 1924.
(3) Chachuat, B.; Srinivasan, B.; Bonvin, D. Adaptation strategies for real-time optimization. Comput. Chem. Eng. 2009, 33, 1557.
(4) Biegler, L. T.; Grossmann, I. E.; Westerberg, A. W. A Note on Approximation Techniques Used for Process Optimization. Comput. Chem. Eng. 1985, 9, 201.
(5) Roberts, P. D. An algorithm for Steady-State System Optimization and Parameter Estimation. Int. J. Syst. Sci. 1979, 10, 719.
(6) Tatjewski, P. Iterative optimizing set point control - The basic principle redesigned. In Proceedings of the 15th Triennial IFAC World Congress, Barcelona, Spain, 2002.
(7) Gao, W.; Engell, S. Iterative set-point optimization of batch chromatography. Comput. Chem. Eng. 2005, 29, 1401.
(8) Marchetti, A.; Chachuat, B.; Bonvin, D. Modifier-adaptation methodology for real-time optimization. Ind. Eng. Chem. Res. 2009, 48, 6022.
(9) Costello, S.; Francois, G.; Srinivasan, B.; Bonvin, D. Modifier Adaptation for Run-to-Run Optimization of Dynamic Processes. In Proceedings of the 18th IFAC World Congress, Milano, Italy, 2011.
(10) Srinivasan, B.; Bonvin, D. Interplay between identification and optimization in run-to-run optimization schemes. In Proceedings of the American Control Conference, Anchorage, Alaska, 2002.
(11) Mandur, J.; Budman, H. Robust optimization of chemical processes using Bayesian description of parametric uncertainty. J. Process Contr. 2014, 24, 422-430.
(12) Beck, J. V.; Arnold, K. J. Parameter Estimation in Engineering and Science; John Wiley & Sons, 1977.
(13) Bates, D. M.; Watts, D. G. Nonlinear Regression Analysis and Its Applications; John Wiley & Sons, 1988.
(14) Box, G. E. P.; Draper, N. R. The Bayesian Estimation of Common Parameters from Several Responses. Biometrika 1965, 52, 355-365.
(15) Banga, J. R.; Balsa-Canto, E.; Moles, C. G.; Alonso, A. A. Dynamic optimization of bioprocesses: Efficient and robust numerical strategies. J. Biotechnol. 2005, 117, 407-419.
(16) Marchetti, A.; Chachuat, B.; Bonvin, D. A dual modifier-adaptation approach for real-time optimization. J. Process Contr. 2010, 20, 1027-1037.
(17) Brdyś, M.; Tatjewski, P. An algorithm for steady-state optimizing dual control of uncertain plants. In Proceedings of the First IFAC Workshop on New Trends in Design of Control Systems, 1994; pp 249-254.
(18) Gilks, W. R.; Richardson, S.; Spiegelhalter, D. J. Markov Chain Monte Carlo in Practice; Chapman & Hall: London, 1996.
(19) Ghanem, R. G.; Spanos, P. Stochastic Finite Elements: A Spectral Approach; Springer: Berlin, 1991.
(20) Bajpai, R. K.; Reuss, M. A Mechanistic Model for Penicillin Production. J. Chem. Technol. Biotechnol. 1980, 30, 332-344.
(21) Birol, G.; Undey, C.; Cinar, A. A modular simulation package for fed-batch fermentation: penicillin production. Comput. Chem. Eng. 2002, 26, 1553-1565.