Robust algorithms for simultaneous model identification and optimization in the presence of model-plant mismatch

Jasdeep S. Mandur† and Hector M. Budman*

Department of Chemical Engineering, University of Waterloo, Waterloo, ON, Canada, N2L 3G1

KEYWORDS: Modeling; Model correction; Identification; Robust optimization; Parameter Estimation; Parametric Uncertainty

ABSTRACT: In the presence of model-plant mismatch, the set of parameter estimates obtained using standard model identification procedures cannot accurately predict the gradients of the optimization problem. Hence, a "two-step" approach, involving an identification step followed by an optimization step performed repeatedly, often fails to converge to the true process optimum. In our recent work1, we proposed a new identification procedure that progressively corrects the model for structural error such that the updated set of parameter estimates simultaneously satisfies both identification and optimization objectives with guaranteed convergence to the true optimum. In this paper, we expand our previous methodology in two directions: first, a new model correction based on a higher-order approximation of the model is proposed and, second, the effect of model uncertainties is considered explicitly by formulating a robust optimization problem at each iteration. The resulting improvements are then illustrated using a simulated study of a fed-batch penicillin production process.
1. INTRODUCTION
Model-based optimization provides a systematic framework to improve the profitability of a process while satisfying various process constraints, especially those related to safety and environmental regulations. However, its success relies on the accuracy of the underlying process models in terms of three major factors: (1) how accurately the model represents the actual process, (2) under what operating conditions and how the experiments were carried out, and (3) the quality of the measurements involved. The standard modeling approach involves an iterative procedure that begins with the estimation of model parameters using an initial set of experimental data. Then, the calibrated model is validated over different operating conditions and, if the prediction error is considered significant, the parameter estimates are updated using an additional set of experiments. The procedure is repeated until a model with the required prediction accuracy is obtained. If the assumed model is a true representation of the actual process, the parameter estimates will eventually converge to a unique set of values, provided there is enough excitation from the experiments to identify all the parameters. The final calibrated model will have good prediction accuracy over the entire space of operating conditions and, correspondingly, optimizing this model is equivalent to optimizing the process itself. The rate of convergence in this approach, however, depends on how the experiments were carried out and the degree of noise in the measurements. In most practical situations, the number of experiments is restricted either by the duration of the experiments or by budget constraints. Then, if the main objective is to use the model for process optimization, it is desirable to conduct this limited number of experiments along the search path of the process optimum. To this end, a "two-step" approach is quite promising, where the operating conditions for the next experiments are obtained by optimizing the model calibrated in the previous step. This procedure is also iterative: the model is re-calibrated and then re-optimized at each iteration until convergence is obtained2.
On the other hand, when there is a mismatch between the assumed model structure and the actual process, there is no unique set of parameter estimates that predicts the process with the same level of accuracy over the entire operating region. In fact, the parameters have to compensate for the unmodelled dynamics and, as a result, the calibrated model will be accurate only around the operating conditions where the parameters were estimated. This also implies that such a model may not predict the gradients of the cost function, as well as those of the process constraints, with sufficient accuracy; in the extreme case, the predicted and measured gradients may have opposite signs, driving the optimization search away from the actual process optimum. In these situations, even the standard "two-step" identification/optimization approach would converge to sub-optimal operating conditions3.
To ensure convergence towards the true process optimum, it is necessary that the model predict the optimality conditions of the process accurately4. To achieve this goal, a class of algorithms has been proposed3,5-9 where the optimization objective and the constraints are adjusted for the differences between their predicted and measured gradients. Since these corrections are added directly to the corresponding functions in the optimization problem, they have to be filtered in order to avoid aggressive corrections and to obtain a smoother convergence. However, if the measurements are very noisy, this may slow down the rate of convergence significantly. In another study10, unlike the standard estimation approach where the parameters are estimated only to correct for the prediction error, the authors also included the difference between the predicted and measured gradients of the objective function as feedback; as a result, the parameters were updated to obtain a trade-off between the identification and optimization objectives. If the mismatch between these objectives is significant around the optimum, one of the two objectives has to be compromised. Recently, we proposed an alternative approach1 in which the model is corrected for its structural error in such a way that, upon convergence, the updated set of model parameters not only minimizes the prediction error but also predicts the gradients of the optimization problem within pre-specified bounds, thus simultaneously satisfying the identification and optimization objectives. At any given iteration, the proposed model correction was based on a linearization of the model outputs, which can be quite restrictive if the degree of nonlinearity and/or uncertainty in the model is significant, resulting in poor convergence.
In this paper, we expand our previous study to improve convergence through two modifications: (1) a new model correction term is proposed, based on a quadratic approximation of the model outputs, and (2) a robust formulation is considered for the optimization objective, in which a weighted sum of the nominal performance and its variability due to model uncertainties is minimized. An additional novelty of this work is the description of parametric uncertainty used in the robust optimization problem. The most common approach to estimate this parametric uncertainty is based on a linearization of the model outputs which, for normally distributed measurement errors, results in normally distributed uncertainty in the parameter estimates. However, the linearization assumption may not be valid for the given level of measurement noise, especially in nonlinear problems, and may result in incorrect optimal solutions, as shown in our previous study11. Instead, a more accurate approach based on Bayes' Theorem11, which does not require linearization assumptions, is used in this work. To this end, a comparative study is presented in which the effects of the Bayesian and normal descriptions of parametric uncertainty on the convergence of the proposed algorithm are compared. To the authors' knowledge, this is the first time the convergence of iterative optimization algorithms has been studied with respect to Bayesian uncertainty descriptions.
The organization of this paper is as follows. Sections 2 and 3 cover the theory and the proposed modifications, with the overall optimization methodology summarized in Section 4. The methodology is then illustrated using a penicillin production process in Section 5, followed by the conclusions in Section 6.
2. MODEL CORRECTION

2.1 Preliminaries
Although the algorithms presented in this work can be applied to the optimization of both continuous and batch/fed-batch processes, for the sake of brevity the following discussion and the case study will be limited to a fed-batch process with a duration time of $t_f$. In the proposed setting, the applicability of the algorithms is limited to run-to-run optimization of an end-point property in the case of batch/fed-batch systems, or to steady-state optimization for continuous systems.

Uncertainties in model parameters, resulting from either insufficient excitation in the measurements or an incorrect model structure, often lead to non-optimal operating policies. The standard "two-step" approach, reviewed in the Introduction, aims to tackle this problem by continuously updating the model parameters at the sub-optimal operating conditions calculated using the last calibrated model. The first step in this approach involves model identification, where the model parameters are estimated by minimizing some measure of the errors between the predicted and measured outputs at some initial set of operating conditions. The simplest measure is the sum of squared errors, defined as follows1:
$$\boldsymbol{\theta}_{\mathbf{k}} = \arg\min_{\boldsymbol{\theta}} \sum_{i=1}^{N} \left\| \mathbf{y_m}(\mathbf{u_k}) - \mathbf{y}(\mathbf{u_k}, \boldsymbol{\theta}) \right\|^2$$
$$\text{s.t.} \quad \dot{\mathbf{x}} = f(\mathbf{x}, \boldsymbol{\theta}, \mathbf{u_k}), \qquad \mathbf{y} = h(\mathbf{x}) \tag{P.1}$$
where $\mathbf{x} \in \mathbb{R}^{n_x}$ is the vector of model states, $\boldsymbol{\theta} \in \mathbb{R}^{n_\theta}$ is the vector of model parameters, $\mathbf{u} \in \mathbb{R}^{n_u}$ is the vector of process inputs, $\mathbf{y} \in \mathbb{R}^{n_y}$ and $\mathbf{y_m} \in \mathbb{R}^{n_y}$ are the vectors of predicted and measured outputs respectively, $N$ is the number of time points, $f \in \mathbb{R}^{n_x}$ is the set of differential equations relating the model states and the process inputs, and $h \in \mathbb{R}^{n_y}$ is the mapping between the model states and the predicted outputs. It is important to note that this formulation assumes that the errors in the measured outputs at each time point are normally distributed with zero mean and a known, constant, and uncorrelated variance. Depending on the assumptions made about the distribution of these measurement errors, different formulations can be used12-14. For example, if the variance is assumed to be uncorrelated but not constant among the different outputs, a weighted least squares formulation is more appropriate, as used in the case study in this work. In the more general case where the variance-covariance matrix is unknown, the determinant criterion of Box and Draper14 would be an ideal choice. A detailed comparative discussion of these formulations is beyond the scope of this study. For simplicity, the simple least squares formulation posed in problem P.1 will be used in the following discussion; the proposed algorithms are by no means restricted to any one particular type of identification formulation.
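As a concrete illustration of problem P.1, the minimal sketch below sets up a (weighted) least squares identification for a generic ODE model; the function names and the user-supplied callables `f` and `h` are assumptions for illustration, not the authors' implementation.

```python
# Minimal sketch of the identification step (P.1) as weighted least squares.
# f and h are user-supplied callables for the model ODEs and the output map;
# all names here are illustrative assumptions.
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import least_squares

def simulate(theta, u, t_meas, x0, f, h):
    """Integrate x' = f(x, theta, u) and return predicted outputs y = h(x)."""
    sol = solve_ivp(lambda t, x: f(x, theta, u), (t_meas[0], t_meas[-1]),
                    x0, t_eval=t_meas)
    return h(sol.y)                              # shape (n_y, N)

def identify(theta0, u, t_meas, x0, y_meas, sigma, f, h):
    """Minimize the sum of squared output errors, weighted by output std sigma."""
    def residuals(theta):
        return ((y_meas - simulate(theta, u, t_meas, x0, f, h)) / sigma).ravel()
    return least_squares(residuals, theta0).x    # theta_k
```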
Once the parameter estimates are obtained, the next step is to calculate the optimal time-varying input profiles $u_j(t)$ such that some measure of the process performance is maximized subject to certain terminal and/or path constraints. Although this problem falls under the category of dynamic optimization, it can easily be posed as an NLP using a control vector parameterization approach15. Here, the idea is to parameterize the input profiles using piecewise-constant values for the inputs over pre-specified time intervals, i.e. $u_j(t) = U(\hat{\mathbf{u}}_j, t)$, so that the problem can be solved in terms of the static variables $\hat{\mathbf{u}}_j$. This approach is also referred to as the direct sequential approach, as the model equations are integrated separately using an appropriate numerical integration technique and the outputs are then used to compute the objective function and constraints of the optimization problem. Considering the path constraints in their discretized form, the final NLP problem can be formulated as follows1:
$$\hat{\mathbf{u}}_{\mathbf{k+1}} = \arg\min_{\hat{\mathbf{u}}} \; \phi\!\left(\mathbf{y}(\hat{\mathbf{u}}, \boldsymbol{\theta}_{\mathbf{k}}, t_f)\right)$$
$$\text{s.t.} \quad \dot{\mathbf{x}} = \hat{f}(\mathbf{x}, \boldsymbol{\theta}_{\mathbf{k}}, \hat{\mathbf{u}}), \qquad \mathbf{y} = h(\mathbf{x})$$
$$\mathbf{S}\!\left(\mathbf{y}(\hat{\mathbf{u}}, \boldsymbol{\theta}_{\mathbf{k}}, t_i), \mathbf{U}(\hat{\mathbf{u}}, t_i)\right) \le \mathbf{0} \qquad \forall\, t_i \in [t_0, t_f]$$
$$\mathbf{T}\!\left(\mathbf{y}(\hat{\mathbf{u}}, \boldsymbol{\theta}_{\mathbf{k}}, t_f)\right) \le \mathbf{0} \tag{P.2}$$
where $\hat{\mathbf{u}} \in \mathbb{R}^{n_{\hat{u}}}$ is the vector of parameterized inputs, $\phi$ is the objective function to be minimized at the end of the batch, $\mathbf{S} \in \mathbb{R}^{n_S}$ is a vector of discretized path constraints and $\mathbf{T} \in \mathbb{R}^{n_T}$ is a vector of terminal constraints. Since the constraints are calculated at pre-specified time points, they can be represented collectively by a single vector as:

$$\mathbf{g}\!\left(\mathbf{y}(\hat{\mathbf{u}}, \boldsymbol{\theta}_{\mathbf{k}}), \hat{\mathbf{u}}\right) = \left[\mathbf{S}\!\left(\mathbf{y}(\hat{\mathbf{u}}, \boldsymbol{\theta}_{\mathbf{k}}, t_0), \mathbf{U}(\hat{\mathbf{u}}, t_0)\right), \ldots, \mathbf{S}\!\left(\mathbf{y}(\hat{\mathbf{u}}, \boldsymbol{\theta}_{\mathbf{k}}, t_f), \mathbf{U}(\hat{\mathbf{u}}, t_f)\right), \mathbf{T}\!\left(\mathbf{y}(\hat{\mathbf{u}}, \boldsymbol{\theta}_{\mathbf{k}}, t_f)\right)\right]$$
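The sketch below illustrates the direct sequential approach of problem P.2 under the piecewise-constant parameterization described above, with the collected constraint vector $\mathbf{g}$ enforced as inequality constraints; the choice of SLSQP and all helper names are illustrative assumptions.

```python
# Sketch of the direct sequential approach (P.2): inputs are held piecewise
# constant on a time grid and the NLP is solved over the static variables
# u_hat. Illustrative only; f, phi and g are user-supplied callables.
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import minimize

def piecewise_u(u_hat, t, t_grid):
    """U(u_hat, t): hold u_hat[j] over the interval [t_grid[j], t_grid[j+1])."""
    j = min(np.searchsorted(t_grid, t, side='right') - 1, len(u_hat) - 1)
    return u_hat[j]

def optimize_inputs(u_hat0, theta_k, t_grid, x0, f, phi, g):
    def outputs(u_hat):
        rhs = lambda t, x: f(x, theta_k, piecewise_u(u_hat, t, t_grid))
        return solve_ivp(rhs, (t_grid[0], t_grid[-1]), x0, t_eval=t_grid).y
    cons = {'type': 'ineq', 'fun': lambda u_hat: -g(outputs(u_hat))}  # g <= 0
    return minimize(lambda u_hat: phi(outputs(u_hat)), u_hat0,
                    method='SLSQP', constraints=[cons]).x            # u_hat_{k+1}
```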
The above two steps (P.1 and P.2) are then repeated successively until convergence is obtained. However, in the presence of an error between the assumed model structure $f(\mathbf{x}, \boldsymbol{\theta}, \mathbf{u})$ and the actual process, the convergence of this approach towards the true process optimum cannot be guaranteed. Since this error is a representation of unmodelled dynamics, and therefore a function of the model states and inputs, the parameter estimates have to compensate for this varying error when the model is calibrated against experimental data collected at different operating conditions. As a result, there is no unique set of parameter values that can satisfy the identification objective (P.1) for all possible realizations of the operating conditions. The set of parameter estimates that minimizes the prediction error at any particular set of operating conditions, for instance $\mathbf{u_k}$, may not minimize this error at values slightly different from $\mathbf{u_k}$, which implies that the model may not correctly predict the gradients of the optimization objective and constraints with respect to $\mathbf{u_k}$, i.e.:
$$\frac{\partial \phi_p\!\left(\mathbf{y_m}(\hat{\mathbf{u}}_{\mathbf{k}}, t_f)\right)}{\partial \hat{u}_{i_k}} \neq \frac{\partial \phi\!\left(\mathbf{y}(\hat{\mathbf{u}}_{\mathbf{k}}, \boldsymbol{\theta}_{\mathbf{k}}, t_f)\right)}{\partial \hat{u}_{i_k}} \tag{1a}$$

$$\frac{\partial g_{p_j}\!\left(\mathbf{y_m}(\hat{\mathbf{u}}_{\mathbf{k}}), \hat{\mathbf{u}}_{\mathbf{k}}\right)}{\partial \hat{u}_{i_k}} \neq \frac{\partial g_j\!\left(\mathbf{y}(\hat{\mathbf{u}}_{\mathbf{k}}, \boldsymbol{\theta}_{\mathbf{k}}), \hat{\mathbf{u}}_{\mathbf{k}}\right)}{\partial \hat{u}_{i_k}} \tag{1b}$$

In the worst-case scenario, the above predicted and measured gradients may have opposite signs, thus driving the optimization search away from the true process optimum, i.e.:

$$\phi_p\!\left(\mathbf{y_m}(\hat{\mathbf{u}}_{\mathbf{k+1}}, t_f)\right) > \phi_p\!\left(\mathbf{y_m}(\hat{\mathbf{u}}_{\mathbf{k}}, t_f)\right) \tag{2}$$
To ensure convergence to the true process optimum, one may choose to estimate parameters that predict the above gradients with higher accuracy, but then the accuracy in predicting the process outputs has to be relaxed to a certain extent. To tackle this tradeoff between the identification and optimization objectives, we recently proposed a set of progressive linear corrections to the model outputs such that, upon convergence, the updated model parameters not only minimize the difference between the predicted and measured gradients but also minimize the prediction error in the outputs. In the next sub-section, the idea behind this model correction is explained, and a more accurate model correction based on a quadratic approximation of the model is proposed.
2.2 Quadratic Model Correction

For any kth iteration, the identification problem posed in P.1 calculates the set of parameter estimates $\boldsymbol{\theta}_{\mathbf{k}}$ such that the sum of squared errors between the predicted and measured outputs, $\epsilon_k$, is at its minimum. Mathematically, $\epsilon_k$ can be expressed as:
$$\epsilon_k = \sum_{i=1}^{N} \left\| \mathbf{y_m}(\mathbf{u_k}) - \mathbf{y}(\mathbf{u_k}, \boldsymbol{\theta}_{\mathbf{k}}) \right\|^2 \tag{3}$$
Let $\Delta\boldsymbol{\theta}_{\mathbf{k}}$ be the change in the parameter estimates $\boldsymbol{\theta}_{\mathbf{k}}$ required to minimize the difference between the predicted and measured gradients given in Equations (1a)-(1b). However, with the updated parameter estimates $\boldsymbol{\theta}_{\mathbf{k}} + \Delta\boldsymbol{\theta}_{\mathbf{k}}$, the sum of squared errors between the predicted and measured outputs is no longer at its minimum and can be expressed as follows:

$$\epsilon_k' = \sum_{i=1}^{N} \left\| \mathbf{y_m}(\mathbf{u_k}) - \mathbf{y}(\mathbf{u_k}, \boldsymbol{\theta}_{\mathbf{k}} + \Delta\boldsymbol{\theta}_{\mathbf{k}}) \right\|^2 \tag{4}$$
To ensure that $\epsilon_k' = \epsilon_k$, let us introduce a vector of constant corrections $-\mathbf{c_k}$ in the model outputs such that:

$$\sum_{i=1}^{N} \left\| \mathbf{y_m}(\mathbf{u_k}) - \left(\mathbf{y}(\mathbf{u_k}, \boldsymbol{\theta}_{\mathbf{k}} + \Delta\boldsymbol{\theta}_{\mathbf{k}}) - \mathbf{c_k}\right) \right\|^2 = \sum_{i=1}^{N} \left\| \mathbf{y_m}(\mathbf{u_k}) - \mathbf{y}(\mathbf{u_k}, \boldsymbol{\theta}_{\mathbf{k}}) \right\|^2 \tag{5}$$

To satisfy Equation (5), the corresponding error terms on both sides of the equality should be equal:

$$\mathbf{y_m}(\mathbf{u_k}) - \mathbf{y}(\mathbf{u_k}, \boldsymbol{\theta}_{\mathbf{k}} + \Delta\boldsymbol{\theta}_{\mathbf{k}}) + \mathbf{c_k} = \mathbf{y_m}(\mathbf{u_k}) - \mathbf{y}(\mathbf{u_k}, \boldsymbol{\theta}_{\mathbf{k}}) \tag{6}$$

Following a rearrangement, the correction term $\mathbf{c_k}$ can be expressed as:

$$\mathbf{c_k} = \mathbf{y}(\mathbf{u_k}, \boldsymbol{\theta}_{\mathbf{k}} + \Delta\boldsymbol{\theta}_{\mathbf{k}}) - \mathbf{y}(\mathbf{u_k}, \boldsymbol{\theta}_{\mathbf{k}}) \tag{7}$$

Following a Taylor series expansion, the model outputs $\mathbf{y}(\mathbf{u_k}, \boldsymbol{\theta}_{\mathbf{k}} + \Delta\boldsymbol{\theta}_{\mathbf{k}})$ can be expanded around $\boldsymbol{\theta}_{\mathbf{k}}$ as follows:

$$\mathbf{y}(\mathbf{u_k}, \boldsymbol{\theta}_{\mathbf{k}} + \Delta\boldsymbol{\theta}_{\mathbf{k}}) = \mathbf{y}(\mathbf{u_k}, \boldsymbol{\theta}_{\mathbf{k}}) + D\mathbf{y}(\boldsymbol{\theta}_{\mathbf{k}})\, \Delta\boldsymbol{\theta}_{\mathbf{k}} + \frac{1}{2}\, \Delta\boldsymbol{\theta}_{\mathbf{k}}^T H\mathbf{y}(\boldsymbol{\theta}_{\mathbf{k}})\, \Delta\boldsymbol{\theta}_{\mathbf{k}} + \cdots \tag{8}$$

where $D\mathbf{y}(\boldsymbol{\theta}_{\mathbf{k}})$ is the Jacobian whose elements are the derivatives of the outputs $\mathbf{y}$ with respect to the model parameters $\boldsymbol{\theta}_{\mathbf{k}}$, and $H\mathbf{y}(\boldsymbol{\theta}_{\mathbf{k}})$ is the corresponding Hessian. Combining Equations (7) and (8), the correction term is then given by:

$$\mathbf{c_k} = D\mathbf{y}(\boldsymbol{\theta}_{\mathbf{k}})\, \Delta\boldsymbol{\theta}_{\mathbf{k}} + \frac{1}{2}\, \Delta\boldsymbol{\theta}_{\mathbf{k}}^T H\mathbf{y}(\boldsymbol{\theta}_{\mathbf{k}})\, \Delta\boldsymbol{\theta}_{\mathbf{k}} + \cdots \tag{9}$$

Neglecting terms higher than second order, the above correction can be approximated by the quadratic expansion around $\boldsymbol{\theta}_{\mathbf{k}}$ as follows:

$$\mathbf{c_k} = D\mathbf{y}(\boldsymbol{\theta}_{\mathbf{k}})\, \Delta\boldsymbol{\theta}_{\mathbf{k}} + \frac{1}{2}\, \Delta\boldsymbol{\theta}_{\mathbf{k}}^T H\mathbf{y}(\boldsymbol{\theta}_{\mathbf{k}})\, \Delta\boldsymbol{\theta}_{\mathbf{k}} \tag{10}$$

The accuracy of this quadratic approximation can be quantified using the relative truncation error as follows:

$$\boldsymbol{\epsilon}^{\mathbf{T}} = \frac{\mathbf{y}(\mathbf{u_k}, \boldsymbol{\theta}_{\mathbf{k}} + \Delta\boldsymbol{\theta}_{\mathbf{k}}) - \mathbf{y}(\mathbf{u_k}, \boldsymbol{\theta}_{\mathbf{k}}) - D\mathbf{y}(\boldsymbol{\theta}_{\mathbf{k}})\, \Delta\boldsymbol{\theta}_{\mathbf{k}} - \frac{1}{2}\, \Delta\boldsymbol{\theta}_{\mathbf{k}}^T H\mathbf{y}(\boldsymbol{\theta}_{\mathbf{k}})\, \Delta\boldsymbol{\theta}_{\mathbf{k}}}{\mathbf{y}(\mathbf{u_k}, \boldsymbol{\theta}_{\mathbf{k}})} \tag{11}$$
Following the above discussion, the calculation of $\Delta\boldsymbol{\theta}_{\mathbf{k}}$ to minimize the differences between the predicted and measured gradients (Equations (1a)-(1b)) is then performed using an optimization problem as follows:

$$\Delta\boldsymbol{\theta}_{\mathbf{k}} = \arg\min_{\Delta\boldsymbol{\theta}} \left( \mathbf{w}_\phi \left| \frac{\partial \phi_p\!\left(\mathbf{y_m}(\hat{\mathbf{u}}_{\mathbf{k}}, t_f)\right)}{\partial \hat{\mathbf{u}}} - \frac{\partial \phi\!\left(\hat{\mathbf{y}}(\hat{\mathbf{u}}_{\mathbf{k}}, \boldsymbol{\theta}_{\mathbf{k}}, \Delta\boldsymbol{\theta}, t_f)\right)}{\partial \hat{\mathbf{u}}} \right| + \mathbf{w}_g \left| \frac{\partial \mathbf{g}_p\!\left(\mathbf{y_m}(\hat{\mathbf{u}}_{\mathbf{k}}), \hat{\mathbf{u}}_{\mathbf{k}}\right)}{\partial \hat{\mathbf{u}}} - \frac{\partial \mathbf{g}\!\left(\hat{\mathbf{y}}(\hat{\mathbf{u}}_{\mathbf{k}}, \boldsymbol{\theta}_{\mathbf{k}}, \Delta\boldsymbol{\theta}), \hat{\mathbf{u}}_{\mathbf{k}}\right)}{\partial \hat{\mathbf{u}}} \right| \right)$$
$$\text{s.t.} \quad \dot{\mathbf{x}} = \hat{f}(\mathbf{x}, \boldsymbol{\theta}_{\mathbf{k}} + \Delta\boldsymbol{\theta}, \hat{\mathbf{u}})$$
$$\hat{\mathbf{y}} = h(\mathbf{x}) - D\mathbf{y}(\boldsymbol{\theta}_{\mathbf{k}})\, \Delta\boldsymbol{\theta} - \frac{1}{2}\, \Delta\boldsymbol{\theta}^T H\mathbf{y}(\boldsymbol{\theta}_{\mathbf{k}})\, \Delta\boldsymbol{\theta}$$
$$\left\| \boldsymbol{\epsilon}^{\mathbf{T}} \right\|_\infty \le \epsilon^{T}_{max} \tag{P.3}$$
where $\mathbf{w}_\phi$ and $\mathbf{w}_g$ are vectors of normalizing weights corresponding to the gradients of the objective function and constraints, respectively, and $\epsilon^{T}_{max}$ is the limit imposed on the relative truncation error (Equation 11), representing the required degree of accuracy in the quadratic correction.
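For concreteness, the sketch below evaluates the quadratic correction of Equation (10) and the relative truncation error of Equation (11) for a scalar output, using finite-difference approximations of the Jacobian and Hessian; the step size and function names are assumptions, not the authors' implementation.

```python
# Sketch of Equations (10)-(11) for a scalar output y(theta): finite-difference
# Jacobian and Hessian, then the quadratic correction and its truncation error.
import numpy as np

def quadratic_correction(y_of_theta, theta_k, dtheta, h=1e-5):
    n = len(theta_k)
    e = np.eye(n) * h
    y0 = y_of_theta(theta_k)
    Dy = np.array([(y_of_theta(theta_k + e[i]) - y0) / h for i in range(n)])
    Hy = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            Hy[i, j] = (y_of_theta(theta_k + e[i] + e[j]) - y_of_theta(theta_k + e[i])
                        - y_of_theta(theta_k + e[j]) + y0) / h**2
    c_k = Dy @ dtheta + 0.5 * dtheta @ Hy @ dtheta            # Equation (10)
    eps_T = (y_of_theta(theta_k + dtheta) - y0 - c_k) / y0    # Equation (11)
    return c_k, eps_T
```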
To summarize, at any given set of operating conditions, the estimation of the parameters is divided into two steps. In the first step, initial estimates are obtained independently of the optimization objectives, to minimize only the prediction error as per problem P.1. In the subsequent step, the change in these estimates is computed from P.3 such that the updated model also predicts the measured gradients of the optimization problem with the minimum possible error, while the corresponding set of model corrections guarantees the prediction error to remain within the bounds of its minimum value from the first step. The updated model with the parameter estimates $\boldsymbol{\theta}'_{\mathbf{k}}$ $(= \boldsymbol{\theta}_{\mathbf{k}} + \Delta\boldsymbol{\theta}_{\mathbf{k}})$ is then used to calculate the optimal operating conditions for the next iteration as per problem P.2. This procedure, involving P.1, P.3 and P.2, is repeated until convergence is achieved.
In this paper, the optimization gradients of the process were approximated using a finite difference approach, which requires a set of additional experiments obtained by perturbing the inputs one at a time. For a large number of inputs, a more practical approach would be to approximate these gradients using process information from past and current iterations7,16-17 by modifying the optimization problem P.2. A detailed discussion of this topic is, however, beyond the scope of this paper.
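A minimal sketch of this one-at-a-time perturbation scheme is shown below; `run_batch` is a hypothetical placeholder for a plant experiment that returns the measured objective.

```python
# Forward-difference approximation of the plant gradients: one additional
# batch (run_batch call) per input direction. Names are illustrative.
import numpy as np

def plant_gradient(run_batch, u_hat, delta=0.01):
    phi0 = run_batch(u_hat)
    grad = np.zeros_like(u_hat)
    for i in range(len(u_hat)):
        u_pert = u_hat.copy()
        u_pert[i] += delta            # perturb one input at a time
        grad[i] = (run_batch(u_pert) - phi0) / delta
    return grad
```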
3. ROBUST OPTIMIZATION
The overall performance of the proposed algorithm can be evaluated in terms of (1) its rate of convergence and (2) its robustness to model uncertainties. The rate at which the algorithm converges depends strongly on the change in the parameter estimates $\Delta\boldsymbol{\theta}_{\mathbf{k}}$ along the iterations. For faster convergence, it is desirable to use values of $\Delta\boldsymbol{\theta}_{\mathbf{k}}$ that allow the gradients of the optimization problem to be predicted accurately, but this choice is controlled by the relative truncation error $\epsilon^{T}_{max}$, which restricts the values of $\Delta\boldsymbol{\theta}_{\mathbf{k}}$ to a region where the proposed model corrections are assumed to be valid. Compared to the linear correction proposed earlier in Mandur and Budman (2015)1, the quadratic model correction increases this validity region, thereby allowing larger $\Delta\boldsymbol{\theta}_{\mathbf{k}}$ values and thus a higher rate of convergence. However, this improvement comes at a price, as larger values of $\Delta\boldsymbol{\theta}_{\mathbf{k}}$ make the algorithm more sensitive to noise in the measured gradients. Although, with some trial and error, one can obtain values of $\Delta\boldsymbol{\theta}_{\mathbf{k}}$ that provide a reasonable tradeoff between robustness towards modeling errors and robustness towards noisy gradients, a more systematic approach is to incorporate the effect of model uncertainties explicitly into the algorithm, which increases its overall robustness irrespective of the $\Delta\boldsymbol{\theta}_{\mathbf{k}}$ values. Therefore, the second contribution of this paper is the introduction of robustness in the optimization objective: instead of the nominal objective function as in problem P.2, a weighted sum of the nominal objective function and its variability is minimized as follows:

$$\hat{\mathbf{u}}_{\mathbf{k+1}} = \arg\min_{\hat{\mathbf{u}}} \left( E\!\left(\phi\!\left(\mathbf{y}(\hat{\mathbf{u}}, t_f)\right)\right) + \mathrm{Var}\!\left(\phi\!\left(\mathbf{y}(\hat{\mathbf{u}}, t_f)\right)\right) \right) \tag{12}$$
where Var is a measure of the variability in the optimization objective due to model uncertainties. In the following sub-sections, a computationally efficient approach to first quantify the model uncertainties and then propagate their effect onto the optimization objective is presented.
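As a simple illustration of Equation (12), the sketch below approximates the expectation and variance of the objective from samples of the uncertain parameters; the explicit weight `w` (introduced later in problem P.5) and all names are assumptions for illustration.

```python
# Sample-based sketch of the robust objective E[phi] + w*Var[phi] of Eq. (12);
# phi_of evaluates the objective for given inputs and a parameter realization.
import numpy as np

def robust_objective(u_hat, theta_samples, phi_of, w=1.0):
    vals = np.array([phi_of(u_hat, theta) for theta in theta_samples])
    return vals.mean() + w * vals.var(ddof=1)
```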
3.1 Parametric Uncertainty
The effects of modeling errors and/or measurement noise are usually captured by defining a confidence region around the parameter estimates. This region basically represents the joint probabilities by which the parameter estimates predict the given experimental data. The conventional method to compute this region is based on a linearization of the model around the nominal parameter estimates. Following this linearization, for normally distributed measurement errors, the resulting parametric uncertainty is also normal and can be expressed by the following hyper-ellipsoid12:

$$E_\theta = \left\{ \boldsymbol{\theta} : (\boldsymbol{\theta} - \boldsymbol{\theta}_{\mathbf{k}})^T \mathbf{V}_{\boldsymbol{\theta}}^{-1} (\boldsymbol{\theta} - \boldsymbol{\theta}_{\mathbf{k}}) \le \chi^2_{n_\theta}(\alpha) \right\} \tag{13}$$
where $\mathbf{V}_{\boldsymbol{\theta}} \in \mathbb{R}^{n_\theta \times n_\theta}$ is the covariance matrix of the parameter estimates and $\chi^2_{n_\theta}(\alpha)$ is the chi-square distribution with $n_\theta$ degrees of freedom at the $\alpha$ confidence level. This description is, however, valid only if either the model is linear or the uncertainty region for the parameters is small enough that the underlying assumption of model linearity holds. For nonlinear problems, the Bayesian approach is a more accurate choice11, where the joint probability distribution of the parameters is given by:

$$p(\boldsymbol{\theta}|D) = \frac{L(\boldsymbol{\theta}|D)\, p(\boldsymbol{\theta})}{\int L(\boldsymbol{\theta}|D)\, p(\boldsymbol{\theta})\, d\boldsymbol{\theta}} \tag{14}$$
where $p(\boldsymbol{\theta}|D)$ is the posterior probability of the parameters conditional on the given set of measurements $D$, $p(\boldsymbol{\theta})$ is the prior probability of the parameters representing any a priori available information about them, and $L(\cdot)$ represents the likelihood of the parameters, defining the statistical distribution of the errors. For instance, assuming the errors between the output measurements and predictions to be independent and normally distributed, the likelihood function is given by a $k$-dimensional multivariate normal distribution (with $k$ the total number of measurements) as follows:

$$L(\boldsymbol{\theta}|D) = \frac{1}{(2\pi)^{k/2}\, |\Sigma|^{1/2}} \exp\left( -\frac{1}{2} \left(\mathbf{y_m} - \mathbf{y}(\boldsymbol{\theta})\right)^T \Sigma^{-1} \left(\mathbf{y_m} - \mathbf{y}(\boldsymbol{\theta})\right) \right) \tag{15}$$
In this study, the effect of both the normal and the Bayesian parametric distributions is investigated. Assuming that the only prior information available for the parameters is their lower and upper bounds, a uniform prior is used in the above posterior calculation. Since, for nonlinear models, the above likelihood function does not have an analytical solution, the most common approach to propagate uncertainty is to sample from the posterior distribution (Equation 14) using Markov Chain Monte Carlo (MCMC) sampling techniques18 and to solve, for each realization, the full nonlinear model to calculate the desired output. Given that these calculations have to be repeated at each function evaluation, the resulting robust optimization problem is a computationally challenging task.
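A minimal random-walk Metropolis sketch of the MCMC sampling just described is given below, assuming a uniform (box) prior as in this study; `log_like` stands for the logarithm of Equation (15), and the step size and names are illustrative assumptions.

```python
# Random-walk Metropolis sampling of the posterior in Equation (14) with a
# uniform prior on [lb, ub]; outside the box the prior (and posterior) is zero.
import numpy as np

def metropolis(log_like, theta0, lb, ub, n_samples=5000, step=0.05, seed=0):
    rng = np.random.default_rng(seed)
    theta, ll = np.asarray(theta0, float), log_like(theta0)
    chain = []
    for _ in range(n_samples):
        prop = theta + step * rng.standard_normal(theta.size)
        if np.all(prop >= lb) and np.all(prop <= ub):      # uniform prior support
            ll_prop = log_like(prop)
            if np.log(rng.uniform()) < ll_prop - ll:       # accept/reject
                theta, ll = prop, ll_prop
        chain.append(theta.copy())
    return np.array(chain)
```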
To this end, we implemented an approach based on Polynomial Chaos (PC) expansions, where the key idea is to use a set of orthogonal polynomials to build a surrogate map between the desired output and the model parameters, specifically over the uncertain parameter space11. Compared to conventional Monte Carlo sampling techniques, PC expansions require a significantly smaller number of model runs, making them an ideal choice for robust optimization studies. A brief background on PC-based uncertainty propagation is presented next.
3.2 Polynomial Chaos Expansions

Let us define a probability space $(\Omega, \mathcal{F}, P)$, where $\Omega$ is the sample space, $\mathcal{F}$ is the $\sigma$-algebra over $\Omega$ and $P$ is a probability measure on $\mathcal{F}$. If $\{\xi_i(\omega)\}_{i=1}^{\infty}$ is a set of independent random variables with a standard probability distribution, then the PC expansion of any random variable $X$ with finite variance is as follows:

$$X(\omega) = a_0 \Gamma_0 + \sum_{i_1=1}^{\infty} a_{i_1} \Gamma_1(\xi_{i_1}) + \sum_{i_1=1}^{\infty} \sum_{i_2=1}^{i_1} a_{i_1 i_2} \Gamma_2(\xi_{i_1}, \xi_{i_2}) + \sum_{i_1=1}^{\infty} \sum_{i_2=1}^{i_1} \sum_{i_3=1}^{i_2} a_{i_1 i_2 i_3} \Gamma_3(\xi_{i_1}, \xi_{i_2}, \xi_{i_3}) + \cdots \tag{16}$$
where $\Gamma_p$ is the basis function of order $p$19, $\omega$ is the random event and $a_{(\cdot)}$ is the deterministic coefficient multiplying the corresponding basis function. The expansion can be expressed in a more compact form19 as:

$$X(\omega) = \sum_{k=0}^{\infty} \hat{a}_k \Psi_k(\xi_1, \xi_2, \xi_3, \ldots) \tag{17}$$

The fundamental property of PC expansions is that all basis functions are orthogonal to each other, i.e.:

$$\langle \Psi_i \Psi_j \rangle = \int \Psi_i(\boldsymbol{\xi})\, \Psi_j(\boldsymbol{\xi})\, p(\boldsymbol{\xi})\, d\boldsymbol{\xi} = \delta_{ij} \langle \Psi_i^2 \rangle \tag{18}$$

Then, following the above orthogonality relation, the coefficients in the expansion can be computed using Galerkin projections as follows:

$$\hat{a}_k = \frac{\langle X\, \Psi_k \rangle}{\langle \Psi_k^2 \rangle} \tag{19}$$
To propagate the effect of parametric uncertainty onto the desired variable $X$, e.g. the optimization objective function and/or constraints (if required), the first step is to formulate the PC expansions for the uncertain model parameters. To compute the expansion coefficients, a one-to-one mapping between the set of parameters $\boldsymbol{\theta}$ and the set of independent random variables $\boldsymbol{\xi}$ is needed. For the case of uncorrelated parameters, this can be achieved by an inverse transformation as follows:

$$\xi_i = F^{-1}\!\left( \int_0^{\theta_i} p(\theta_i)\, d\theta_i \right) \tag{20}$$

where $F^{-1}$ is the inverse of the cumulative density function of the independent random variable $\xi_i$ and $p(\theta_i)$ is the probability density of the model parameter $\theta_i$. Then, using these mappings and Equation (19), the calculation of the coefficients in the respective PC expansions is quite straightforward.
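As an illustration of the mapping in Equation (20), used in the reverse direction when parameter values are needed at given collocation points, the sketch below pushes a standard normal $\xi_i$ through a numerically constructed inverse CDF of $p(\theta_i)$; the tabulated-density representation is an assumption.

```python
# Sketch of the isoprobabilistic mapping of Equation (20), applied in reverse:
# theta_i = F_theta^{-1}(Phi(xi)) for xi ~ N(0,1), with the CDF of theta_i
# built numerically from a tabulated density. Illustrative only.
import numpy as np
from scipy.stats import norm

def theta_from_xi(xi, theta_grid, pdf_vals):
    cdf = np.cumsum(pdf_vals)
    cdf = cdf / cdf[-1]                        # normalized numerical CDF
    return np.interp(norm.cdf(xi), cdf, theta_grid)
```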
However, when the parameters are correlated, the following set of transformations based on conditional and marginal probabilities needs to be applied:

$$\xi_1 = F_1^{-1}\!\left( \int_0^{\theta_1} p(\theta_1|D)\, d\theta_1 \right)$$
$$\xi_2 = F_2^{-1}\!\left( \int_0^{\theta_2} p(\theta_2|\theta_1, D)\, d\theta_2 \right)$$
$$\vdots$$
$$\xi_i = F_i^{-1}\!\left( \int_0^{\theta_i} p(\theta_i|\theta_{i-1}, \theta_{i-2}, \ldots, \theta_1, D)\, d\theta_i \right) \tag{21}$$
Based on these conditional mappings, the PC expansions for the different parameters can be obtained, where the coefficients in each expansion depend on the corresponding set of conditioning parameters. A two-parameter example for the case of correlated parameters is presented in11, where the authors proposed building nested PC expansions to deal with the conditional mappings. Once the PC expansions for the set of uncertain parameters are formulated, the next step is to formulate the PC expansion between the desired variable $X$ and the independent random variables $\boldsymbol{\xi}$ as follows:

1. Select the values of the independent random variables $\boldsymbol{\xi}$ at the required collocation points.
2. Calculate the corresponding set of parameter values $\boldsymbol{\theta}$ from their PC expansions.
3. Solve the nonlinear model for each set of parameter values and calculate the desired output $X$ (e.g. the optimization objective).
4. Using the map between $X$ and $\boldsymbol{\xi}$ from steps (1)-(3), build a PC expansion for $X(\boldsymbol{\xi})$.

Due to the orthogonality of the basis functions, the variance of the output variable $X$ can be calculated analytically as follows:
$$\mathrm{Var}(X) = \sum_{i=1}^{P} \hat{a}_i^2\, \langle \Psi_i^2 \rangle \tag{22}$$

This expression can then be applied to calculate the robust optimization cost in Equation (12).
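The sketch below illustrates steps (1)-(4) and Equation (22) for a single standard normal $\xi$ using probabilists' Hermite polynomials, for which $\langle He_k^2 \rangle = k!$; the quadrature order and function names are assumptions.

```python
# One-dimensional Hermite PC sketch: coefficients by Galerkin projection
# (Equation 19) at Gauss-Hermite collocation points, variance from Eq. (22).
import numpy as np
from numpy.polynomial.hermite_e import hermegauss, hermeval
from math import factorial

def pce_fit(X_of_xi, order=4, n_quad=10):
    nodes, weights = hermegauss(n_quad)             # quadrature for exp(-xi^2/2)
    weights = weights / np.sqrt(2 * np.pi)          # normalize to the N(0,1) measure
    X_vals = np.array([X_of_xi(x) for x in nodes])  # step (3): model runs
    return np.array([np.sum(weights * X_vals * hermeval(nodes, [0]*k + [1]))
                     / factorial(k) for k in range(order + 1)])

def pce_variance(coeffs):
    """Equation (22) with <He_k^2> = k! (the k = 0 term carries the mean)."""
    return sum(a**2 * factorial(k) for k, a in enumerate(coeffs) if k > 0)
```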
4. PROPOSED OPTIMIZATION METHODOLOGY
The overall methodology is summarized in the flowchart in Figure 1. The algorithm begins with the first identification step, where the objective is to calculate $\boldsymbol{\theta}_{\mathbf{k}}$ as posed in P.1, with the initial model correction term set to zero. In the second step, the change in the parameter estimates $\Delta\boldsymbol{\theta}_{\mathbf{k}}$ and the set of new model corrections $\mathbf{c_k}$ are calculated using problem P.3. Then, using the corrected model and the parameter estimates $\boldsymbol{\theta}_{\mathbf{k}} + \Delta\boldsymbol{\theta}_{\mathbf{k}}$, a nominal or robust optimization problem, with the objective function given by problem P.2 or Equation 12 respectively, is solved to calculate the optimal operating conditions for the next iteration. This procedure is repeated until convergence is achieved. In the context of the batch or fed-batch systems studied in this work, each iteration corresponds to a new batch experiment: after estimating the parameters using P.1 and P.3, new optimal operating conditions are obtained for the next batch experiment, until the algorithm converges after a finite number of batches. It is important to note that solving the robust optimization problem (Equation 12) involves additional computational time, required to calculate the PC representation of the uncertain model parameters and then to compute the mean and variance of the robust objective function.
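A high-level sketch of this loop is given below; `solve_P1`, `solve_P3` and `solve_P2` are hypothetical callables standing for the three problems defined above, not the authors' code.

```python
# Driver loop of the proposed methodology: identify (P.1), correct (P.3),
# optimize (P.2 or the robust objective of Eq. 12), repeat until convergence.
import numpy as np

def run_algorithm(u_hat, theta, c, solve_P1, solve_P3, solve_P2,
                  max_iter=30, tol=1e-3):
    for _ in range(max_iter):
        theta = solve_P1(u_hat, c)                 # identification step
        dtheta, c = solve_P3(u_hat, theta, c)      # gradient-matching step
        u_next = solve_P2(theta + dtheta, c)       # nominal or robust optimization
        converged = np.linalg.norm(u_next - u_hat) < tol
        u_hat = u_next
        if converged:                              # operating point has settled
            break
    return u_hat, theta + dtheta
```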
[Figure 1 is a flowchart of the proposed algorithm: starting from $\mathbf{c_0} = \mathbf{0}$, problem P.1 is solved for $\boldsymbol{\theta}_{\mathbf{k}}$ with the model outputs corrected by $\mathbf{c_{k-1}}$; problem P.3 is then solved for $\Delta\boldsymbol{\theta}_{\mathbf{k}}$ subject to the quadratic correction and the truncation-error bound; finally, problem P.2 is solved for $\hat{\mathbf{u}}_{\mathbf{k+1}}$ with the corrected model, the correction term is updated, and the loop repeats until convergence.]

Figure 1: Proposed algorithm with quadratic model correction.
5. CASE STUDY
The proposed algorithm is illustrated on a fed-batch bioprocess for penicillin production. The actual process is considered to be described by the following system of differential equations ((23)-(26))20-21:

$$\frac{dX}{dt} = \frac{\mu_X S X}{K_X X + S} - \frac{X}{V}\frac{dV}{dt} \tag{23}$$

$$\frac{dP}{dt} = \frac{\mu_P S X}{K_P + S + \frac{S^2}{K_I}} - K_H P - \frac{P}{V}\frac{dV}{dt} \tag{24}$$

$$\frac{dS}{dt} = -\frac{1}{Y_{X/S}}\frac{\mu_X S X}{K_X X + S} - \frac{1}{Y_{P/S}}\frac{\mu_P S X}{K_P + S + \frac{S^2}{K_I}} - m_X X + \frac{F s_f}{V} - \frac{S}{V}\frac{dV}{dt} \tag{25}$$

$$\frac{dV}{dt} = F - 6.226 \times 10^{-4}\, V \tag{26}$$
where $X$, $P$ and $S$ represent the concentrations of biomass, penicillin and substrate, respectively, and $V$ is the culture volume. The rate constants and other parameters are defined as follows: $\mu_X$ and $\mu_P$ are the specific growth rates for biomass and penicillin, respectively, with $K_X$ and $K_P$ as the respective saturation constants; $K_I$ represents the substrate inhibition in the growth kinetics of penicillin; $K_H$ accounts for the consumption of penicillin by hydrolysis; $Y_{X/S}$ and $Y_{P/S}$ are the yields per unit mass of substrate for biomass and penicillin, respectively; $m_X$ accounts for the consumption of substrate for maintaining the biomass; and, finally, $s_f$ is the concentration of substrate in the feed.
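A minimal sketch of this simulated plant is shown below; the parameter dictionary `p` is a placeholder for the kinetic constants of refs 20-21, and dropping the $K_H P$ hydrolysis term recovers the mismatched model introduced later in Equation (28).

```python
# The simulated "plant" of Equations (23)-(26); the parameter values in p are
# placeholders for the kinetic constants of refs 20-21.
from scipy.integrate import solve_ivp

def plant_rhs(t, z, p, F, s_f):
    X, P, S, V = z
    dV = F - 6.226e-4 * V                                  # Eq. (26)
    growth = p['mu_X'] * S * X / (p['K_X'] * X + S)        # biomass kinetics
    product = p['mu_P'] * S * X / (p['K_P'] + S + S**2 / p['K_I'])
    dX = growth - (X / V) * dV                             # Eq. (23)
    dP = product - p['K_H'] * P - (P / V) * dV             # Eq. (24); omit the
                                                           # K_H*P term for Eq. (28)
    dS = (-growth / p['Y_XS'] - product / p['Y_PS']
          - p['m_X'] * X + F * s_f / V - (S / V) * dV)     # Eq. (25)
    return [dX, dP, dS, dV]

# Usage with the Table 1 conditions (z0 = [X0, P0, S0, V0]; t_f, s_f, p assumed):
# sol = solve_ivp(plant_rhs, (0.0, t_f), [0.1, 0.0, 0.1, 100.0],
#                 args=(p, 0.04, s_f))
```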
The above Equations ((23)-(26)) are then used to generate in silico data for the output variables as well as for the gradients of the cost function and constraints in the optimization problem. The measured outputs, for example, are given by the following relation:

$$y_{m_{ij}} = \hat{y}_{ij} + \varepsilon_{ij} \tag{27}$$

where $\hat{y}_{ij}$ is the deterministic value of the $i$th output at the $j$th time point computed from Equations (23)-(26), and $\varepsilon_{ij}$ is the corresponding stochastic part representing any uncertainty in the measured output $y_{m_{ij}}$.
Assuming that there is no uncertainty in the inputs, that no unmeasured disturbances enter the process, and that the samples are collected and analyzed independently, the only uncertainty in the measurements is that coming from the sensor noise, which is uncorrelated both across the different outputs and over time. Therefore, assuming the measurement errors to be normally distributed with zero mean and a constant variance over time, while different outputs have different variances, the identification problem in P.1 is formulated as a weighted least squares problem.
As for the process model with structural uncertainty, the consumption of penicillin by hydrolysis occurring in the actual process is assumed to be a priori unknown to the user. Accordingly, the rate of change of the penicillin concentration is inaccurately modelled as:

$$\frac{dP}{dt} = \frac{\mu_P S X}{K_P + S + \frac{S^2}{K_I}} - \frac{P}{V}\frac{dV}{dt} \tag{28}$$
Assuming the dynamics of the other states to be known accurately, the set of Equations (23), (25), (26) and (28) then represents the inaccurate model of the process to be used in the proposed algorithm. It should be noted that the above elimination of the hydrolysis term results in a significant model error, as shown in our previous study1, where, without correcting for this error using iterative model corrections, the run-to-run optimization algorithm converged to a sub-optimal solution with nearly a 25% reduction in the final penicillin amount compared to the true process optimum.
The optimization objective is to maximize the amount of penicillin at the end of the batch by manipulating the initial substrate concentration $S_o$ and the input feed rate $F$, while ensuring that the culture volume does not exceed a maximal volume of 120 L. For reference, the true process optimum corresponds to an initial substrate concentration $S_o = 54.9$ g/L and an input feed rate $F = 0.1728$ L/hr, with a final penicillin amount at the end of the batch of ~592 g. Mathematically, the problem is formulated as:

$$\min_{S_o, F} \; -P(\mathbf{x}, \boldsymbol{\theta}, \mathbf{u}, S_o, F, t_f)$$
$$\text{s.t.} \quad \text{Equations (23), (25), (26) and (28)}$$
$$V(\mathbf{x}, \boldsymbol{\theta}, \mathbf{u}, S_o, F, t_f) \le V_{max} \tag{P.4}$$
The algorithm starts with a parameter estimation step using the initial input conditions listed in Table 1. Here, without loss of generality, only two parameters, $K_X$ and $K_I$, are updated in the algorithm, whereas the remaining parameters are kept at the initial values that were estimated in the first iteration with the incorrect model. The reasons for selecting a subset of parameters are: (1) to reduce the sensitivity to noise and (2) to reduce the computational load in the robust optimization problem. It has been shown that, as the number of parameters increases, the propagation of parametric uncertainty using PC expansions needs more model runs, thus increasing the overall computational time, since this propagation step has to be repeated several times in the optimization framework11. In situations where the optimal inputs provide insufficient excitation to estimate all the parameters, updating only a subset of parameters becomes even more relevant. However, this is beyond the scope of this study.
Table 1: Set of initial input conditions

Biomass conc. ($X_0$): 0.1 g/L
Substrate conc. ($S_0$): 0.1 g/L
Product conc. ($P_0$): 0 g/L
Initial culture volume ($V_0$): 100 L
Input feed ($F$): 0.04 L/hr
5.1 Results and discussion
As mentioned in the Introduction, the key motivation behind this work was to improve the convergence of the previously proposed algorithm1 in two directions: (1) by using a new model correction term based on a quadratic approximation of the model and (2) by addressing robustness explicitly in the optimization problem to reduce the effect of model uncertainties. To compare their relative effects, these two modifications were studied independently through several case studies, as discussed below.

5.1.1 Comparison of linear and quadratic model corrections
The effect of the quadratic model correction is discussed first. It is worth mentioning here that, in all optimizations, the optimal feed rate $F$ always converged to a value of 0.1728 L/hr, satisfying the constraint on culture volume; therefore, for clarity, it will not be considered in the following discussion. At any iteration, a minimal trade-off between the identification and optimization objectives is obtained by using the proposed model correction and the bound on the truncation error ($\epsilon^T_{max}$) for which this correction is assumed to be valid. It was hypothesized that, for highly nonlinear models, the linear approximation proposed in our previous work may be valid only in a small neighborhood of the parameter estimates from the first identification step. Thus, when using linear corrections, a small value of $\epsilon^T_{max}$ had to be used to maintain the predictive accuracy along the iterations, which limited the change in the estimates in the second step and therefore the extent to which the optimization gradients could be corrected.

To test the effect of the quadratic correction, the algorithm was solved using both quadratic and linear corrections for $\epsilon^T_{max}$ = 1%, and the corresponding convergence of the optimal $S_o$ as a function of the iteration number is compared in Figure 2. It is clear from this figure that, with the linear correction, the algorithm is not even able to match the sign of the predicted and measured gradients for some of the iterations, thus leading to an oscillatory profile during the transient. It should be remembered that, if the sign of the objective function's gradient is incorrectly predicted, the optimization will proceed in the wrong direction, temporarily moving away from the actual optimum and resulting in the observed oscillations. On the other hand, the quadratic approximation of the given model allows for much larger changes in the parameter estimates in all iterations, increasing the ability to fit the measured optimization gradients more accurately. As a result, the new correction resulted in a smooth (non-oscillatory) and much faster convergence towards the process optimum. In these results, the measurements were assumed to be noise-free so as to evaluate the performance of the algorithm when only the modeling error is present.
Figure 2: Comparing the effect of linear vs. quadratic model correction on the convergence of the optimal $S_o$.
In the second comparison, in order to study how much variability one can expect in each iteration when the algorithm is subject to random measurement noise, the algorithm was solved repeatedly 10 times, with each run subjected to independent noise realizations. This analysis is quite relevant from a practical point of view for batch processes, which are typically operated in parallel in order to test the reproducibility of the process outcomes. To this end, the measured outputs as well as the gradients of the optimization problem are assumed to be corrupted by additive white Gaussian noise as per Equation (27). The comparisons were then conducted for both types of corrections, i.e. linear and quadratic, and for two different levels of $\epsilon^T_{max}$. At the end of the iterative search, the values of the optimal $S_o$ from the 10 independent runs of the algorithm were used collectively to estimate the sample mean ($\bar{S}_o$) and sample variance ($s^2$) of the optimal $S_o$ in each iteration, as a way to quantify the sensitivity of the algorithm to noise. Figure 3 shows the convergence of the average optimal $S_o$ along with the 95% confidence interval calculated as $\bar{S}_o \pm t_{0.025,9}\,(s/\sqrt{10})$. The effectiveness of the algorithm is then evaluated in terms of (1) the integral absolute error (IAE) between the predicted average and the actual optimal $S_o$ and (2) the total variability in the predicted average optimal $S_o$. The results are summarized in Table 2.
Table 2: Comparison of model correction based on linear vs. quadratic approximation

                          $\epsilon^T_{max}$ = 1%                         $\epsilon^T_{max}$ = 5%
                          IAE      Total std. dev. in $\bar{S}_o$         IAE      Total std. dev. in $\bar{S}_o$
Linear approximation      8.7292   9.3147                                 6.3651   8.1248
Quadratic approximation   5.7419   8.3629                                 4.8090   7.5233
Figure 3: Convergence of the average optimal $S_o$ with 95% confidence interval for (a) linear model correction and $\epsilon^T_{max}$ = 1%, (b) quadratic model correction and $\epsilon^T_{max}$ = 1%, (c) linear model correction and $\epsilon^T_{max}$ = 5%, and (d) quadratic model correction and $\epsilon^T_{max}$ = 5%.
Compared to the linear correction, the proposed correction based on the quadratic approximation resulted in a significant reduction in both the IAE and the variability ($\sigma^2$) of the optimal $S_o$ in all cases. For $\epsilon^T_{max}$ = 5%, these reductions are nearly 24.45% and 14.26%, respectively, whereas for $\epsilon^T_{max}$ = 1% the numbers are much higher, with the IAE reduced by nearly 34.22% and the variability by nearly 19.39%. These results are critical since, in a practical situation, a smaller IAE and variability will give plant personnel higher confidence that the algorithm is actually converging to a process optimum rather than appearing to change at random due to process variability and measurement noise.
To better interpret the effect of noise on the convergence, let us recall the updated parameter estimate at each iteration, $\boldsymbol{\theta}_{\mathbf{k}} + \Delta\boldsymbol{\theta}_{\mathbf{k}}$. The first component, $\boldsymbol{\theta}_{\mathbf{k}}$, is the estimate obtained in the first step (P.1) by minimizing the prediction error in the model outputs. As a result, the uncertainty in $\boldsymbol{\theta}_{\mathbf{k}}$ is due to the combination of modeling errors and noise in the measured outputs. The second component, $\Delta\boldsymbol{\theta}_{\mathbf{k}}$, is the change in the estimates with respect to $\boldsymbol{\theta}_{\mathbf{k}}$ and is obtained in the second step (P.3) by minimizing the differences between the predicted and measured gradients of the optimization problem; therefore, its accuracy largely depends on the degree of noise in the gradients. However, the values of $\Delta\boldsymbol{\theta}_{\mathbf{k}}$ are controlled both by the type of model correction $\mathbf{c_k}$, e.g. linear or quadratic, and by the bound on the truncation error $\epsilon^T_{max}$ defining the accuracy of this correction (cf. the last constraint in P.3). Following the above, if $\Delta\boldsymbol{\theta}_{\mathbf{k}}$ is large, the parameter estimates $\boldsymbol{\theta}_{\mathbf{k}} + \Delta\boldsymbol{\theta}_{\mathbf{k}}$ become more sensitive to noise in the gradients, whereas with smaller values of $\Delta\boldsymbol{\theta}_{\mathbf{k}}$ the effect of modeling errors and noise in the measured outputs dominates. Out of the four cases, the correction based on the linear approximation and a 1% truncation error allows for the smallest value of $\Delta\boldsymbol{\theta}_{\mathbf{k}}$ in each iteration and, as a result, the estimates are highly affected by modeling errors and noise in the measured outputs. Compared to the linear approximation, the correction $\mathbf{c_k}$ based on the quadratic approximation always favors larger values of $\Delta\boldsymbol{\theta}_{\mathbf{k}}$, irrespective of the bound on the truncation error ($\epsilon^T_{max}$), thereby reducing the sensitivity of the algorithm to modeling errors and noise in the measured outputs relative to its sensitivity to noise in the gradients. This is evident in the results in Table 2, where, for both levels of truncation error, the proposed quadratic correction resulted in a smaller variability in the optimal $S_o$. However, the improvement is not the same between the two levels. One reason is that using $\epsilon^T_{max}$ = 5% allows for relatively larger values of $\Delta\boldsymbol{\theta}_{\mathbf{k}}$ compared to $\epsilon^T_{max}$ = 1%, thus leaving a smaller margin for improvement between the linear and quadratic corrections. The second reason is that, beyond certain values of $\Delta\boldsymbol{\theta}_{\mathbf{k}}$, the algorithm becomes more sensitive to noise in the gradients irrespective of the type of correction, as discussed above, thus limiting its overall ability to further filter the effect of model uncertainties. Although the new correction term carries an added computational load, as it requires the calculation of the Hessian matrix, this increase has a marginal effect on the overall computational time, which is dominated by the optimization step.
5.1.2 Comparison of robust versus nominal optimization
In the second approach, the sensitivity of the algorithm to noise and modeling error is reduced by explicitly adding a measure of robustness to the optimization objective, as shown in Equation 12. For the given case study, the robust optimization problem is formulated as follows:

$$\min_{S_o, F} \; -\left[ P(\mathbf{x}, \mathbf{u}, S_o, F, t_f) - w\, \mathrm{Var}\!\left( P(\mathbf{x}, \mathbf{u}, S_o, F, t_f) \right) \right]$$
$$\text{s.t.} \quad \text{Equations (23), (25), (26) and (28)}$$
$$\left\| V(\mathbf{x}, \boldsymbol{\theta}, \mathbf{u}, S_o, F, t_f) \right\|_\infty \le V_{max} \tag{P.5}$$
where the term Var represents the variability in the amount of penicillin at the final time, averaged over the uncertain parameter region, and $w$ is the weight on robustness. Since, in the parameter estimation step, the estimates are obtained such that the model can predict the measured gradient of the nominal objective function, the above formulation promotes a trade-off between the corresponding nominal performance and its variability.
The results using a normal distribution for parametric uncertainty are discussed first. Because this uncertainty description is by far the easiest to compute, it is the most commonly used description in probabilistic uncertainty analysis. The robust optimization problem is then solved 10 times with different realizations of noise, for both 1% and 5% truncation error and 𝑤 = 1%. Since one of the motivations was to compare the efficiency of this approach with that of the quadratic model correction, the problem is solved with the previously proposed linear correction. The resulting convergence of the average optimal 𝑆𝑜 in successive runs, with 95% confidence intervals, is shown in Figure 4, and the corresponding IAE and variability in the average optimal 𝑆𝑜 are listed in Table 3. It is evident from these results that the variability (measured as the variance σ²) is significantly reduced in both cases: by nearly 45.15% for 𝜖𝑚𝑎𝑥ᵀ = 1% and 33.79% for 𝜖𝑚𝑎𝑥ᵀ = 5%.
Table 3: Comparison of nominal vs. robust run-to-run optimization (with linear model correction)

                          𝜖𝑚𝑎𝑥ᵀ = 1%                        𝜖𝑚𝑎𝑥ᵀ = 5%
                          IAE      Total std. deviation σ   IAE      Total std. deviation σ
  Nominal Optimization    8.7292   9.3147                   6.3651   8.1248
  Robust Optimization     8.1793   6.8985                   5.6249   6.6109
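For reference, the quoted percentages follow directly from the standard deviations in Table 3 once variability is measured as the variance σ²:

\[
1 - \left(\frac{6.8985}{9.3147}\right)^2 \approx 0.4515,
\qquad
1 - \left(\frac{6.6109}{8.1248}\right)^2 \approx 0.3379
\]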
Figure 4: Robust convergence of the average optimal 𝑆𝑜 with 95% confidence intervals, using the linear model correction, for (a) 𝜖𝑚𝑎𝑥ᵀ = 1% and (b) 𝜖𝑚𝑎𝑥ᵀ = 5%.
By comparing the results in Table 2 and Table 3, it is observed that the proposed quadratic correction provides the larger improvement in terms of IAE, whereas the robust approach is much more effective in reducing the total variability. This difference in behavior is attributed to the way the two methodologies handle model uncertainties. By allowing a larger change in the parameter estimates, the quadratic correction primarily aims to further minimize the difference between the predicted and measured optimization gradients, thus speeding up the rate of convergence significantly. While this indirectly reduces the effect of modeling error and of noise in the measured outputs on the rate of convergence, providing some reduction in variability, it also increases the sensitivity of the algorithm to noise in the gradients. The robust approach, on the other hand, aims directly at reducing the variability in the optimal solutions irrespective of its source. Which method is preferable therefore depends on the final objective, the degree of nonlinearity in the problem, and the level of noise in the measurements.
One of the key limitations of robust formulations is the existence of an offset with respect to the true optimum. This offset is due to the presence of the variability term in the optimization objective, which forces the algorithm to converge to an apparent optimum that differs from the nominal one. For 𝜖𝑚𝑎𝑥ᵀ = 5%, the average optimal 𝑆𝑜 converged to ~55.75 g/L, with an offset of ~1 g/L, and this offset grows as the weight on variability is increased. One option to eliminate the offset is to switch the algorithm to the nominal mode, i.e., to a cost function that does not include the variability term, when approaching convergence; a minimal sketch of such a schedule is given below. The effectiveness of this solution, however, depends on when the switch from robust to nominal optimization is made, and it may result in higher variability after the switch. Another possible solution, as shown in this study, is to use more realistic and less conservative descriptions of model uncertainty, such as the Bayesian-based distributions (equation 14).
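The sketch below shows one possible form of the switching schedule. It is our own illustration, not a method prescribed in this work: solve_step stands in for one complete identification-plus-robust-optimization cycle, and the tolerance and run budget are arbitrary.

```python
def run_to_run_with_switch(solve_step, So0, w, tol=0.5, max_runs=30):
    # solve_step(So, w) performs one identification + optimization cycle
    # (problem (P.5)) and returns the new optimal substrate concentration.
    # Once successive optima agree to within tol, the variability weight
    # is dropped to zero so the nominal objective can remove the offset.
    So, w_k = So0, w
    for _ in range(max_runs):
        So_new = solve_step(So, w_k)
        if abs(So_new - So) < tol:
            w_k = 0.0  # switch from robust to nominal mode
        So = So_new
    return So
```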
For comparison, using the final optimal solutions from the above results as initial conditions, we ran the algorithm with both the normal (equation 13) and the Bayesian (equation 14) uncertainty descriptions, for 𝑤 = 1% and 2%. The results are summarized in Figure 5, where one can clearly see that as the weight of the variability term, w, is increased from 1% to 2%, the difference between the robust optimal solutions obtained with the two uncertainty descriptions also increases. Moreover, compared to its normal counterpart, the Bayesian uncertainty description is less conservative and thus resulted in a steady-state optimal solution closer to the actual process optimum. Figure 6 compares the normal and Bayesian uncertainty descriptions by plotting the resulting joint confidence regions for the parameters 𝐾𝑋 and 𝐾𝐼 at different confidence levels. The two descriptions differ significantly, especially at higher confidence levels; therefore, as the weight of the variability term is increased, which is equivalent to increasing the confidence level of a given uncertainty description, the impact of the difference between the two descriptions on the optimal solution becomes more evident (see Figure 5).
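In implementation terms, the two descriptions enter the algorithm only through the parameter samples used to evaluate the Var term of (P.5), as sketched below. This is our own illustration under stated assumptions: theta_hat, Sigma, and a posterior sample chain (e.g., from an MCMC sampler) are assumed to be available from the estimation step.

```python
import numpy as np

def samples_normal(theta_hat, Sigma, n, rng=np.random.default_rng()):
    # Normal description (equation 13): an elliptical region around the
    # point estimate, defined by the linearized covariance Sigma.
    return rng.multivariate_normal(theta_hat, Sigma, size=n)

def samples_bayesian(chain, n, rng=np.random.default_rng()):
    # Bayesian description (equation 14): re-use posterior draws (e.g., an
    # MCMC chain); the region they cover can be markedly non-elliptical
    # and is typically less conservative (cf. Figure 6).
    return chain[rng.choice(len(chain), size=n, replace=True)]
```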
Figure 5: Effect of Bayesian vs. normal parametric uncertainty on the robust convergence of the optimal 𝑆𝑜 for (a) 𝜖𝑚𝑎𝑥ᵀ = 1% and (b) 𝜖𝑚𝑎𝑥ᵀ = 5% (−o− normal with 𝑤 = 2%; −⟐− Bayesian with 𝑤 = 2%; −∗− normal with 𝑤 = 1%; −x− Bayesian with 𝑤 = 1%).
Figure 6: Comparison of the normal (dotted) and Bayesian (solid) descriptions of parametric uncertainty in 𝐾𝑋 and 𝐾𝐼.
6. CONCLUSION

A quadratic Taylor series approximation has been proposed to progressively correct the model for structural inaccuracies as the model is optimized toward the true process optimum in a run-to-run approach. Compared to the correction based on a linear approximation, the proposed quadratic correction allows for more accurate predictions of the optimization gradients along the iterations, thus achieving much faster convergence. In addition, a significant reduction in the variability of the search path has also been observed. In a second approach, the variability is reduced by explicitly adding a measure of variability to the optimization objective. Although, for the given case study, the first approach based on the quadratic model correction performs better in reducing the IAE, with a reduction in variability as an added advantage, the second approach, with explicit minimization of variability in the robust formulation, performs significantly better in reducing the variability. In a comparative study, it is also shown that the description of parametric uncertainty has a significant effect on the convergence of the algorithm. Compared to a normal description of parametric uncertainty, the Bayesian uncertainty description resulted in less conservative solutions and in a reduced offset of the final converged solution with respect to the true optimum.
AUTHOR INFORMATION

Corresponding Author
*Email: [email protected]. Phone: 519-888-4567 ext. 36980

Present Addresses
†Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA.

Notes
The authors declare no competing financial interest.
ACKNOWLEDGMENT

The authors would like to acknowledge the financial support from the Natural Sciences and Engineering Research Council (NSERC), Canada.
REFERENCES

(1) Mandur, J.; Budman, H. Simultaneous model identification and optimization in presence of model-plant mismatch. Chem. Eng. Sci. 2015, 129, 106-115.
(2) Chen, C. Y.; Joseph, B. On-line Optimization Using a Two-Phase Approach: An Application Study. Ind. Eng. Chem. Res. 1987, 26, 1924.
(3) Chachuat, B.; Srinivasan, B.; Bonvin, D. Adaptation strategies for real-time optimization. Comput. Chem. Eng. 2009, 33, 1557.
(4) Biegler, L. T.; Grossmann, I. E.; Westerberg, A. W. A Note on Approximation Techniques Used for Process Optimization. Comput. Chem. Eng. 1985, 9, 201.
(5) Roberts, P. D. An algorithm for Steady-State System Optimization and Parameter Estimation. Int. J. Syst. Sci. 1979, 10, 719.
(6) Tatjewski, P. Iterative optimizing set point control - The basic principle redesigned. In Proceedings of the 15th Triennial IFAC World Congress, Barcelona, Spain, 2002.
(7) Gao, W.; Engell, S. Iterative set-point optimization of batch chromatography. Comput. Chem. Eng. 2005, 29, 1401.
(8) Marchetti, A.; Chachuat, B.; Bonvin, D. Modifier-adaptation methodology for real-time optimization. Ind. Eng. Chem. Res. 2009, 48, 6022.
(9) Costello, S.; Francois, G.; Srinivasan, B.; Bonvin, D. Modifier Adaptation for Run-to-Run Optimization of Dynamic Processes. In Proceedings of the 18th IFAC World Congress, Milano, Italy, 2011.
(10) Srinivasan, B.; Bonvin, D. Interplay between identification and optimization in run-to-run optimization schemes. In Proceedings of the American Control Conference, Anchorage, Alaska, 2002.
(11) Mandur, J.; Budman, H. Robust optimization of chemical processes using Bayesian description of parametric uncertainty. J. Process Contr. 2014, 24, 422-430.
(12) Beck, J. V.; Arnold, K. J. Parameter Estimation in Engineering and Science; John Wiley & Sons, 1977.
(13) Bates, D. M.; Watts, D. G. Nonlinear Regression Analysis and Its Applications; John Wiley & Sons, 1988.
(14) Box, G. E. P.; Draper, N. R. The Bayesian Estimation of Common Parameters from Several Responses. Biometrika 1965, 52, 355-365.
(15) Banga, J. R.; Balsa-Canto, E.; Moles, C. G.; Alonso, A. A. Dynamic optimization of bioprocesses: Efficient and robust numerical strategies. J. Biotechnol. 2005, 117, 407-419.
(16) Marchetti, A.; Chachuat, B.; Bonvin, D. A dual modifier-adaptation approach for real-time optimization. J. Process Contr. 2010, 20, 1027-1037.
(17) Brdyś, M.; Tatjewski, P. An algorithm for steady-state optimizing dual control of uncertain plants. In Proceedings of the First IFAC Workshop on New Trends in Design of Control Systems, 1994; pp 249-254.
(18) Gilks, W. R.; Richardson, S.; Spiegelhalter, D. J. Markov Chain Monte Carlo in Practice; Chapman & Hall: London, 1996.
(19) Ghanem, R. G.; Spanos, P. Stochastic Finite Elements: A Spectral Approach; Springer: Berlin, 1991.
(20) Bajpai, R. K.; Reuss, M. A Mechanistic Model for Penicillin Production. J. Chem. Technol. Biotechnol. 1980, 30, 332-344.
(21) Birol, G.; Undey, C.; Cinar, A. A modular simulation package for fed-batch fermentation: penicillin production. Comput. Chem. Eng. 2002, 26, 1553-1565.