Global Sensitivity Analysis and Uncertainty Quantification of Crude

Mar 23, 2018 - Global Sensitivity Analysis and Uncertainty Quantification of Crude Distillation Unit Using Surrogate Model Based on Gaussian Process R...
3 downloads 7 Views 2MB Size
Subscriber access provided by Queen Mary, University of London

Article

Global sensitivity analysis and uncertainty quantification of crude distillation unit using surrogate model based on Gaussian process regression Le Quang Minh, Pham Luu Trung Duong, and Moonyong Lee Ind. Eng. Chem. Res., Just Accepted Manuscript • DOI: 10.1021/acs.iecr.7b05173 • Publication Date (Web): 23 Mar 2018 Downloaded from http://pubs.acs.org on March 26, 2018

Just Accepted “Just Accepted” manuscripts have been peer-reviewed and accepted for publication. They are posted online prior to technical editing, formatting for publication and author proofing. The American Chemical Society provides “Just Accepted” as a service to the research community to expedite the dissemination of scientific material as soon as possible after acceptance. “Just Accepted” manuscripts appear in full in PDF format accompanied by an HTML abstract. “Just Accepted” manuscripts have been fully peer reviewed, but should not be considered the official version of record. They are citable by the Digital Object Identifier (DOI®). “Just Accepted” is an optional service offered to authors. Therefore, the “Just Accepted” Web site may not include all articles that will be published in the journal. After a manuscript is technically edited and formatted, it will be removed from the “Just Accepted” Web site and published as an ASAP article. Note that technical editing may introduce minor changes to the manuscript text and/or graphics which could affect content, and all legal disclaimers and ethical guidelines that apply to the journal pertain. ACS cannot be held responsible for errors or consequences arising from the use of information contained in these “Just Accepted” manuscripts.

Industrial & Engineering Chemistry Research is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.

Page 1 of 40 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Industrial & Engineering Chemistry Research

1

Global sensitivity analysis and uncertainty quantification of crude

2

distillation unit using surrogate model based on Gaussian process

3

regression Le Quang Minh1, Pham Luu Trung Duong1, Moonyong Lee* School of Chemical Engineering, Yeungnam University, Gyeongsan, Rep. of Korea

4

Submitted to Industrial & Engineering Chemistry Research

5 6 7 8

1

9

*

These authors contributed equally to this work. Correspondence concerning this article should be addressed to:

10

Prof. Moonyong Lee

11

Email: [email protected]

12

Telephone: +82-53-810-2512

13 14 15 16 17

1

ACS Paragon Plus Environment

Industrial & Engineering Chemistry Research 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

18 19

Abstract

20

This study presents a global sensitivity analysis to simplify a surrogate-model-based

21

uncertainty quantification of a crude distillation unit with a large number of uncertainties. To

22

overcome the computational limitation of a conventional surrogate model based approach where

23

the number of simulations required grows exponentially as the input dimension increases, a

24

novel two-stage approach was proposed in this study: in the first stage, a multiplicative

25

dimensional reduction method is applied to identify factors that exert the highest influence on the

26

model outputs. In the second stage, the Gaussian process regression is exploited for uncertainty

27

quantification from the simplified model derived in the first stage. As a result, the computational

28

efforts for uncertainty quantification were significantly reduced (approximately more than 95 %)

29

compared to the conventional Quasi Monte Carlo, while the predicted density functions by the

30

proposed method closely matched with those from the Quasi Monte Carlo. The proposed two-

31

stage approach was executed for sensitivity analysis and uncertainty quantification of a crude

32

distillation unit by an interface between MATLAB and HYSYS. The economic revenue and the

33

operating cost per unit of crude oil processed were selected as the output of interests for the

34

crude distillation unit. The global sensitivity analysis result showed that the flow rates of crude

35

oil and naphtha products are critical for both of the economic revenue and the operating cost.

36

Keywords: Gaussian process regression; Global sensitivity analysis; Uncertainty

37

quantification; Crude distillation unit; Operating cost reduction; Maximum economic revenue.

38

2

ACS Paragon Plus Environment

Page 2 of 40

Page 3 of 40 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Industrial & Engineering Chemistry Research

39

Abbreviations

40

AGO: atmospheric gas oil

41

ASTMD86: American Society for Testing and Materials (standard test method for

42

distillation of petroleum products and liquid fuels at atmospheric pressure)

43

CDU: crude distillation unit

44

ER: economic revenue per unit of crude oil processed

45

GP: Gaussian process

46

GPR: Gaussian process regression

47

HP: high pressure

48

HSR: heavy straight run

49

LSR: light straight run

50

MC: Monte Carlo

51

MDRM: multiplicative dimensional reduction method

52

OP: operating cost per unit of crude oil processed

53

PA: pump around

54

PAQ: outlet temperature of the pump around from the cooler

55

QMC: Quasi Monte Carlo

56

SA: sensitivity analysis

57

UQ: uncertainty quantification 3

ACS Paragon Plus Environment

Industrial & Engineering Chemistry Research 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

58

1. Introduction

59

The use of fossil resources for the production of fuels and commodity chemicals has been

60

a focal point of discussion for the past few decades. It is anticipated that the total amount of

61

fossil fuels (coal, natural gas, and oil) available on the earth can support various needs for at least

62

another hundred years1. Identifying a profitable/efficient production process is a crucial task for

63

engineers which facilitates higher cost savings and is safer and cleaner. Most industries would

64

rather maximize available resources for maximum profitability than opt for large-scale

65

expansion2.

66

optimization, which is a major quantitative tool in decision-making for process availability3.

67

Mathematical models, which elucidate the relationship among input variables, play a vital role in

68

the final design to comply with the safety and serviceability constraints4.

A common approach that has gained widespread application in this regard is

69

A crude distillation unit (CDU) is among the largest energy-consuming processes owing

70

to the large amount of energy consumption and high operating temperature5. It is reported that

71

the operation of the CDU and vacuum distillation consumes over 35% of the energy required in

72

the refinery plants6. Therefore, research on crude distillation systems has received considerable

73

attention. In optimizing the CDU, the objective is to maximize the profits of a CDU while

74

minimizing the energy consumption (resulting in reduced CO2 emissions), subject to constraints

75

on the quality of the products7. Previous studies recommended that the separation and energy

76

performance of distillation processes can be represented by different type of models such as

77

rigorous, simplified, and statistical/empirical models8-10. Nevertheless, the development of such

78

models generally requires significant amount of effort because this involves various differential

79

and algebraic equations associated with a large number of input variables. Simultaneously,

80

uncertainties are sometimes correlated in the process design because of the connected process 4

ACS Paragon Plus Environment

Page 4 of 40

Page 5 of 40 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Industrial & Engineering Chemistry Research

81

and various random factors. The complex nature of CDUs, including their interactions with the

82

associated heat recovery network, the large number of degrees of freedom, and a large number of

83

uncertainties in feed mixture and process conditions, renders their optimization a highly

84

challenging task.

85

The goal of uncertainty quantification11, and in particular, a priority in the design and

86

modeling of an efficient process, is to assess the effect of the input uncertainties on the model

87

output and consequently, on the system performance12. Certain parameters such as the crude oil

88

property and tray efficiency, which depend on physical properties (e.g., tray hydraulics, liquid

89

mixing and entrainment), are likely to illustrate a natural disturbance, which is not revealed in

90

advance (this is referred to as an aleatory uncertainty). However, certain other factors such as

91

pressure, temperature, and flow rate, which are likely to exhibit a unique value, are directly

92

measurable and susceptible to lack of knowledge (epistemic uncertainty). The former are

93

typically observed to be irreducible, while the latter can be quantified and minimized13. For

94

considering the unknown factors, a global sensitivity analysis (SA), which is a widely used

95

technique for selecting representative parameters from mathematical models14, 15, is required to

96

reduce their quantity and select the ones to be incorporated.

97

Monte Carlo (MC)16 and Quasi Monte Carlo (QMC) methods are representative

98

probabilistic approaches for the propagation of uncertainties in the model inputs to its

99

output17. However, notwithstanding the simplicity of their implementation, the mean

100

convergence rate was estimated to be in the order of O(1/ M ) , where M is the number of

101

samples; this renders MC/QMC-based approaches computationally expensive, and they are used

102

only as the final option. To overcome the problem, a convenient-to-evaluate function may be

103

used to replace the original computational model18. Shen and Braatz19 developed a new 5

ACS Paragon Plus Environment

Industrial & Engineering Chemistry Research 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

104

polynomial chaos based algorithm on the design of batch and continuous-flow chemical reactors

105

with probability uncertainties. Duong et al.20 studied the UQ and SA of chemical processes using

106

the standard polynomial chaos method for systems with a small number of random inputs. Using

107

an efficient-to-evaluate surrogate model, Celse et al.21 investigated the influence of each input on

108

the construction of the model of the hydrodesulfurization/hydrodenitrogenation reaction in

109

vacuum gas oil hydrotreatment. Duong et al.22 and Minh et al.23 recently developed a two-stage

110

approach using a compressive polynomial chaos method to overcome the computational

111

limitation in a system with a moderate/large number of uncertain parameters.

112

Surrogate models, also known as meta-models, can approximate detailed mechanistic

113

models, which are capable of simplifying highly nonlinear and computationally expensive

114

problems. The Kriging technique, which originated from interpolating geographical data in

115

mining, is also known as Gaussian process (GP) modeling24, 25. In general, the GP modeling is a

116

Bayesian approach, which provides a complete posterior distribution over feasible functions

117

from the observed data. It is completely determined by its mean and covariance function. The

118

mean is used to encode the prior knowledge of the output. The GP prediction is a distribution.

119

Use of the predictive GP can facilitate the selection of the new data. A high predictive variance

120

implies an absence of the training data, whereas, a low predictive variance implies a high

121

confidence in the model. Note that the higher confidence in the model results in a more reliable

122

design. The covariance function is used to capture a straightforward intuition: if the inputs are

123

close, the function outputs are likely to be closed. It can also be used to encode the periodic

124

property of the output. They are widely used in diverse fields of engineering, such as structural

125

reliability analysis and design optimization26-28.

6

ACS Paragon Plus Environment

Page 6 of 40

Page 7 of 40 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Industrial & Engineering Chemistry Research

126

For the Gaussian process regression (GPR), however, the number of hyper-parameters

127

increases with the dimension of the input space so that their statistical inference can be

128

problematic in high-dimensional situations29. The number of simulations required to learn a

129

generic multivariate response grows exponentially as the input dimension increases. This is

130

because the GPR defines an input space correlation with Euclidean (2-norm) distance. Since the

131

Euclidean distance becomes noninformative as the dimensionality of uncertain inputs increases30,

132

more evaluations of the expensive-to-evaluate models are required to construct an accurate GPR.

133

Therefore, it is desirable to reduce the dimension of the problem from the perspectives of both

134

accuracy and efficiency. In this study, a two-stage approach was presented to tackle this issue:

135

firstly, a multiplicative dimensional reduction method is used for detecting non-influential inputs,

136

then a GPR model is constructed with the influential inputs only. Although the GPR possesses

137

the automatic relevant determination mechanism wherein the kernel scales can be used to

138

determine the relevancy of inputs, this approach may be not reliable23.

139

This work aimed to overcome a current limitation of the conventional GPR where the

140

number of model evaluations grows exponentially and may not be applicable to systems with a

141

moderate/large number of uncertainties such as a CDU. One of main contributions of this work is

142

to propose a novel two-stage approach by combining the multiplicative dimensional reduction

143

with Gaussian process kriging regression methods. The proposed two-stage approach could

144

remarkably reduce the simulation time (one of the most burdens in handling UQ) so that the UQ

145

for complex systems with a large/moderate number of uncertainties can be popular in practice.

146

Another main contribution of this work is to carry out a global SA and UQ of the CDU which is

147

highly nonlinear and complex. In terms of our understanding, there is no report of the global SA

148

that includes a discussion on the uncertain inputs effects on the process performance for the

7

ACS Paragon Plus Environment

Industrial & Engineering Chemistry Research 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

149

conventional CDU, particularly with a large number of random variables. Thus, this work is

150

expected to describe the uncertain effects of the process inputs on main process interests of the

151

CDU such as the economic revenue and operating costs. Also, this approach attracts considerable

152

attention in refinery modeling and optimization because it directly determines the quality of final

153

products and its profitability.

154

This paper is organized as follows: Section 2 addresses the global SA of the

155

multiplicative dimensional reduction method, and subsequently, the UQ through the GPR of the

156

identified influential factors from the previous step. Section 3 briefly describes the CDU and

157

analysis methods of the system including the economic revenue and operating costs. Section 4

158

discusses the results, and Section 5 concludes the study.

159 160

2. Methodologies

161

UQ requires a large number of original model evaluations, particularly for high-nonlinear

162

and high-dimensional problems. The GPR is a Bayesian modeling technique for UQ purposes

163

that has recently attracted engineers from different fields such as biological, chemical, and

164

control engineering24, 31, 32 owing to its computational efficiency. Although the GPR is flexible,

165

the optimizing approach by selecting hyper-parameters either got stuck in local solutions24, 31 or

166

achieved non-satisfactory results when it is applied to model systems with a large number of

167

random inputs. To address these problems, an efficient two-stage UQ method, which combines

168

the multiplicative dimensional reduction method (MDRM) with the GPR, is proposed in this

169

study. In the proposed method, the MDRM is first adopted by determining critical inputs for a

170

model-dimension reduction. The meta-model by the GPR is then consequently generated

8

ACS Paragon Plus Environment

Page 8 of 40

Page 9 of 40 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Industrial & Engineering Chemistry Research

171

corresponding to the critical inputs determined in the previous step. The work intends to explain

172

the efficient and practical framework, and the typical steps involved in SA and UQ in order to

173

reduce the computational efforts.

174

2.1. MDRM for SA

175

SA is a procedure aimed at determining the inputs that are relatively more critical than

176

the others. It is of two types: local SA and global SA. The first utilizes the information from the

177

partial derivative of the model output with respect to the model inputs. Thus, it is used only to

178

study the vicinity of a nominal value of the selected inputs. That is, the global SA involving

179

Sobol indices14 is addressed with the global output disturbance for the entire ranges of input

180

variation. Therefore, it is preferable over the local SA for an incomplete determination of the

181

sensitivities of a specific system. Several approaches for global SA with Sobol indices are

182

available, such as Fourier amplitude sensitivity test15, extended Fourier amplitude sensitivity

183

test33, MC, and MDRM. A multivariate function in MDRM34 is approximated as a product of

184

univariate ones, which demonstrates a reasonable tradeoff between the accuracy and

185

computational load. The main theory of global SA by the MDRM is briefly described in this

186

subsection.

187

A system defined by a computational model (1) is considered:

188

y ( ξ ) = Μ (ξ )

189

where M denotes the computational model, y is the output of interest, and

190

represents the vector of uncertain inputs.

(1)

9

ACS Paragon Plus Environment

ξ = ( ξ1,...,ξn )

Industrial & Engineering Chemistry Research 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 10 of 40

191

The uncertain inputs are assumed to be randomly independent variables with density

192

functions pi (ξi ): Γi → R . If the output y exhibits a finite variance, it can be decomposed into

193

the sum of increasing dimension terms:

+

n

n

n

y (ξ ) = y0 + ∑ yi (ξi ) + ∑∑ yij (ξi , ξ j ) + ... + yi1 ...in (ξi1 ,..., ξ n ) i =1

194



= y0 +

i =1 i < j

(2)

yA (ξ A )

A ⊂{1,..., n} A ≠∅

A ={i1,..., is} ⊂{1,.., n} denotes a subset of indices, and ξA is a subset of ξ that contains

195

where

196

only parameters indexed by A . The terms in Eq. (2) are computed recursively as

y0 = E [ y(ξ)] 197

yi (ξ) = E [ y(ξ ) | zi ] − y0 (t )

yij (ξi , ξ j ) = E  y (t , ξ ) | ξi , ξ j  − yi (ξi ) − y j (ξ j ) − y0 (t )

(3)

...

E(ξ | ξA ) is the conditional expectation. Because the

198

where E is the expectation operator, and

199

functions defined in Eq. (3) are orthogonal to each other, the variance of the model output can be

200

decomposed into

201

Vy =



Var ( yA (ξ A ))

(4)

A ⊂{1,..., n} A ≠∅

202

The first order Sobol indices (Si), which measure the contribution of a single input factor

203 204

ξi

205

by Eq. (5):

on the variance of the output without considering the joint effect with other inputs, is defined

10

ACS Paragon Plus Environment

Page 11 of 40 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Industrial & Engineering Chemistry Research

Vi Vy

206

Si =

207

where Vi denotes the variance of

(5)

yi (ξi ) .

The Sobol indices for a subset of parameters A are defined by Eq. (6), which quantifies

208 209

the joint effect of the subset of parameters A .

210

SA =

VA Vy

(6)

Finally, the total Sobol indices (Ti), including all the interactions with the other inputs,

211 212

quantify the total impact of a given input ξi :

213

Ti = ∑ SA

(7)

i∈A

214

215

The main concept behind the MDRM is to approximate the model output in Eq. (1) as n

n

i =1

i =1

y(ξ) ≈ y(c)1−n ∏ y(c1 ,..., ci−1 , ξi , ci+1 ,..., cn ) = h01−n ∏ y(ξi , c−i )

(8)

216

where c = (c1,..., cn ) is an anchor point, and c−i = (c1,..., ci−1, ci+1,..., cn ) . Generally, the mean

217

vector of the uncertain input is used as the anchor point.

218

219

Let us denote the ith mean and the mean square of the dimensional function as µ i = E [ y (ξ i , c− i )]

(9)

σ i = E [ y (ξ i , c− i ) 2 ]

11

ACS Paragon Plus Environment

Industrial & Engineering Chemistry Research 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Utilizing a Gaussian quadrature with q nodes {wi , ξi }l =1 , the ith mean and mean (l )

220 221

Page 12 of 40

square of the dimensional function ( µi and

(l ) q

σi , respectively) can be computed as

q

µi = ∑ wi (l ) y(c1 ,..., ci −1 , ξi (l ) , ci +1 ,..., cn ) 222

l =1

(10)

q

σ i = ∑ wi (l ) ( y (c1 ,..., ci −1 , ξi (l ) , ci +1 ,..., cn ))2 l =1

223 224 225

226

Using Eq. (8), it can be demonstrated that the first order and total sensitivity functions can be approximated34 as

Si ≈

−1 + σ i / µi 2 n

−1 + (∏σ k / µk 2 )

;

Ti ≈

k =1

1 − µi 2 / σ i

(11)

n

1 − (∏ µk 2 / σ k ) k =1

227

where the ith mean and mean square of the dimensional function are computed as in Eq. (10).

228

Remarks: If Ti is approximately equal to zero, it implies that the input

229

can be fixed at any point in its distribution without affecting the variance of the output (4). The

230

difference is an indicator of the presence of interactions in the model.

ξi

is non-influential and

231 232

2.2. GPR method

233

We assume that m (< n) important inputs ( ξ m = (ξ i , ..., ξ i 1

234

m

) ) were detected in the previous

{

}

step. Let us make p simulations and collect the following data set ξm , y = M (ξm ) (i )

(i )

12

ACS Paragon Plus Environment

(i )

p

i=1

.

Page 13 of 40 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

235 236

237

Industrial & Engineering Chemistry Research

A GPR model is a generalization of the multivariate Gaussian random variables to an infinite dimension. It is defined as

y(ξm ) = h(ξm )T β + f (ξm )

(12)

238

where f (ξm ) is a zero mean Gaussian random process with the covariance function k ( ⋅, ⋅) ,

239

and h(ξm) is a set of basic functions or a trend which transforms the original input space into a

240

new feature vector h(ξm) in Rm, a vector of basis coefficients. That is, the GPR is a statistical

241

model that exhibits an instant response of the form

242

P( y(i ) | f (ξm(i ) ), ξm(i ) ) ≈ N ( y(i ) | h(ξm(i ) )T β + f (ξm(i ) ),σ 2 )

243

where N () is a normal distribution in which the polynomial functions of the inputs can be

244

normally used as the basis functions. It is noted that the latent function, f , is introduced for

245

each observation. Various covariance functions are described in the literature25, 32. In this study,

246

the automatic relevance determination squared exponential is used, which is defined as follows:

247

k (ξ m(i ) , ξ m( j ) | θ) = θ f exp(−

(13)

1 m (i ) ∑ (ξl − ξl ( j ) )2 / θl ) 2 l =1

(14)

248

m +1 where θ ∈ R = (θ f , θ1 ,..., θm ) is a vector of hyper-parameters. It is expected that similar values of

249

inputs should exhibit similar responses. Thus, the covariance function can be used to characterize

250

the similarity between two input vectors ξm , ξm . The length scale

251

extent to which one is required to move along a particular input space for its response to become

252

uncorrelated. Such a covariance function implements the automatic relevance determination

(i )

( j)

θ1,...,θm

13

ACS Paragon Plus Environment

quantifies the

Industrial & Engineering Chemistry Research 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 14 of 40

253

mechanism24, which determines the relevance of an input is owing to the inverse of the length-

254

scale. Therefore, the GPR model is defined by the coefficient vectors β , hyper parameters ɵ,

255

σ . The values for these parameters will be updated and learnt

256

and measurement noise variance,

257

from the given data by maximizing the logarithm likelihood expressed in Eq. (15): log P ( y | ξ m , β , θ , σ )

258

=

1 2

259

T

'

2

p 2

log(2π ) −

1 2

'

log K ( ξ m , ξ m | θ ) + σ I p 2

(15)

where Ip is an identity matrix of size p, and

 (ξ m (1) )T  (2 T  (ξ ) X = m ...   (ξ ( P ) )T  m 260

−1

( y − H β )  K ( ξ m , ξ m | θ ) + σ I p  ( y − H β ) −

  h(ξ m(1) )T   (2) T   h(ξ m ) = , H   ...     h(ξ ( P ) )T   m

      

 y (1)    y =  ...   y( P)   

(16)

 k (ξ m (1) , ξ m (1) ) ... k (ξ m(1) , ξ m( P ) )    K (ξ m , ξ m ) = ... ... ...   k (ξ ( P ) , ξ (1) ) ... k (x( P ) , ξ ( P ) )  m m  m  261

where H p×r is a matrix of values of the basic vector at training points, r is the number of basic

262

function, and K is the covariance matrix at the training points.

14

ACS Paragon Plus Environment

Page 15 of 40 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Industrial & Engineering Chemistry Research

263

Regarding the estimated optimal parameters {βˆ , θˆ, σˆ } , which are obtained by the

264

maximization of the logarithm likelihood function, the density function of the response at a new

265

data point ξ m is

266

N (h(ξm* )βˆ + µ,σˆ 2 + Σ)

267

where

*

(17)

µ = K (ξ m*T , ξm | θˆ)( K (ξm , ξm | θˆ) + σˆ I p )-1 ( y - H β ) (18)

268

K (ξ

*T m

, ξm | θˆ) = ( k (ξm* , ξm(1) ),..., k (ξm* , ξm (n ) ) )

269

and

270

∑ = k (ξ m * , ξ m * ) - K (ξ m * T , ξ m | θˆ )( K (ξ m , ξ m | θˆ ) + σˆ I p ) -1 K (ξ m , ξ m * T | θˆ )

(19)

271

The mean value calculated in Eq. (19) is used as the prediction for the new data, whereas

272

the variance ∑ is an indicator of the uncertainty of the prediction. One can use Eq. (18) as a

273

surrogate to sample large numbers of output responses and to estimate density functions and

274

statistics of the output interest.

275 276

3. Crude oil distillation unit

277

Table 1 presents the compositions of the crude oil processed in this study35.

278

simulation is carried out in Aspen HYSYS v10. The property package is Peng–Robinson, which

279

utilizes the properties of equation of state in its enthalpy calculations. Owing to the assay

280

characterization, the feed contains 36 components, including 5 light components and 31 pseudo15

ACS Paragon Plus Environment

The

Industrial & Engineering Chemistry Research 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

281

components. Figure 1 shows the oil cut production distribution. Table S1 lists the product

282

specifications (see Supporting Information). The temperature of each cut is referred at 95%

283

liquid fraction recovery (ASTMD86-95%). The boiling point curves of column products are

284

shown in Figure S1 (see Supporting Information); these products are the naphtha product, three

285

side products (kerosene, diesel, and atmospheric gas oil), and the residues at the bottom of the

286

column. In this study, it is assumed that the light and heavy straight run naphtha streams are

287

mixed to be the naphtha product stream. It is also assumed that the input crude oil is at a

288

temperature of 232.2 °C and pressure of 5.17 bar, operating at 100,000 barrels/day.

289

Table 1. Feed compositions Type

Light component

Hypothetical components

Components Mole fraction (%) Methane Ethane Propane i-Butane n-Butane Water NBP49 NBP79 NBP111 NBP144 NBP176 NBP208 NBP240 NBP272 NBP304 NBP336 NBP368 NBP400 NBP433 NBP464 NBP496 NBP528 NBP560

0.031 0.069 0.952 0.601 2.131 0.000 0.000 0.000 5.773 2.995 3.040 5.381 5.277 4.821 4.489 4.105 3.811 3.718 3.837 4.275 4.301 4.053 3.617 16

ACS Paragon Plus Environment

Page 16 of 40

Page 17 of 40

NBP592 NBP624 NBP656 NBP688 NBP720 NBP752 NBP784 NBP830 NBP888 NBP947 NBP1009 NBP1062 NBP1124

3.205 2.824 2.438 2.138 1.893 1.691 1.490 1.876 2.173 2.043 2.250 3.355 5.346

290

0.5 Light straight run naphtha Heavy straight run naphtha Kerosene Diesel Atmospheric gas oil Residues

0.4

Liquid volume fraction

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Industrial & Engineering Chemistry Research

0.3

0.2

0.1

0.0 0

200

400

600 o

291 292

ASTM D86 - 95% ( C) Figure 1. Cut distributions of crude oil.

17

ACS Paragon Plus Environment

800

Industrial & Engineering Chemistry Research 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

293

Figure 2 demonstrates a rigorous CDU model as a substitute for the actual plant35. The

294

tower consists of 29 stages in the main shell and three side columns associated with three pump-

295

arounds for stripping the middle products. Furthermore, a pre-flash separator is used to enhance

296

the separation efficiency in separating the light components from the liquid phase36. After

297

leaving the pre-flash separator, the crude oil enters the CDU furnace (see Figure 2). The crude is

298

vaporized through the furnace in order to recover the products from the side strippers. The outlet

299

temperature of the furnace is normally set so that the vaporized fraction equals the sum of the

300

column products plus a small percentage. This small percentage excess is the “over-flash,” which

301

indicates the amount of the heavy residues recovered in the light fractions37. The two-phase

302

mixture enters the column at 333.4 °C and is flashed in the bottom section (flash zone).

303

Furthermore, a significant amount of the main stripping steam is included at the bottom, which

304

serves to strip any residue and prevent the excessive thermal cracking of crude oil because of the

305

high temperature at the flash zone. Other light fractions of the crude were retrieved by side

306

strippers at different locations using steam or reboilers. Both the atmospheric gas oil and diesel

307

side-strippers use stripping steams, while the kerosene side-stripper includes a reboiler. The

308

presence of side coolers or pump-arounds is an additional feature in the CDU. These units permit

309

the reduction of the condenser cooling requirements by reducing the vapor flow in the column

310

and allowing for heat recovery. The residue from the column contains valuable hydrocarbons,

311

which is typically further separated in a vacuum distillation column (not discussed in this study).

18

ACS Paragon Plus Environment

Page 18 of 40

Page 19 of 40 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Industrial & Engineering Chemistry Research

312 313

Figure 2. Process flow diagram of crude oil distillation unit.

314 315

3.1. Process decision variables

316

Rigorous models, which simulate the distillation column according to the material and

317

energy balances as well as equilibrium relations in each stage of the column38, are sufficiently

19

ACS Paragon Plus Environment

Industrial & Engineering Chemistry Research 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

318

realistic to provide accurate predictions39. In this work, Aspen HYSYS v10 is used to simulate

319

the CDU. To facilitate the data transfer required for the global SA and the UQ, an interface

320

between MATLAB and HYSYS is established using the automation client–server application

321

provided by HYSYS; meanwhile, the algorithm is coded in MATLAB.

322

For UQ and SA, the inputs of the surrogate model are varied, including the flow rate of

323

the crude and that of the main stripping steam, product flow rates, outlet temperature of crude oil

324

from the furnace, pump-around rates and their temperature changes, stripping steam rates, and

325

reboiler duty of the side strippers. Sensitivity analysis indicates the parameters that are most

326

influential and are to be focused upon and those which can be discarded. Furthermore, the UQ

327

study can examine their relative influences on the outputs. All the inputs are assumed to be

328

distributed uniformly or normally and to independently affect the process performance

329

(economic revenue and operating costs). Table 2 lists all the uniformly distributed input variables

330

associated with the confidence boundaries. Note that the normally distributed input variances, i.e.,

331

the Gaussian uncertainties, were also investigated in the case study without fixing product flow

332

rates. In that case study, the same means and variances as in the uniform case were used.

333 334

Table 2. Boundaries of random inputs under uniform uncertainties Variables

Base

Lower

Upper

Unit

Crude oil

4166. 7

3958.3

4375.0

bbl/h

Naphtha

958.3

910.4

1006.2

bbl/h

Kerosene

387.5

368.1

406.9

bbl/h

Diesel

802.1

762.0

842.2

bbl/h

AGO

187.5

178.1

196.9

bbl/h

20

ACS Paragon Plus Environment

Page 20 of 40

Page 21 of 40 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Industrial & Engineering Chemistry Research

PA1

2083.3

1979.2

2187.5

kmol/h

PA1Q

95.0

90.2

99.8

°C

PA2

1250.0

1187.5

1312.5

kmol/h

PA2Q

80.0

76.0

84.0

°C

PA3

1250.0

1187.5

1312.5

kmol/h

PA3Q

72.0

68.4

75.6

°C

Diesel steam

75.5

71.8

79.3

kmol/h

AGO steam

62.9

59.8

66.1

kmol/h

Main steam

188.8

179.4

198.3

kmol/h

Kerosene HP steam

157.1

149.3

165.0

kmol/h

Hot crude temperature

343.3

326.1

360.5

°C

335 336

3.2. Objective functions in the CDU

337

The economic revenue (ER) and operating cost (OP) per unit crude oil processed are

338

selected as the performance indicator for UQ and SA. Fundamentally, the ER is estimated on the

339

basis of the operating costs in the column, as follows40: n

OP = 340

Fsteam C steam + ∑ Qx C x x =1

Fcrude n

n

∑ Fj C j − [ FcrudeCcrude + FsteamCsteam + ∑ QxCx ] ER =

j =1

(20)

x =1

Fcrude

341

where Fj and Cj are the flow rates and costs, respectively of the jth product; and Fsteam and Csteam

342

are the flow rates and costs, respectively, of the low pressure steam. Qx and Cx are the heat

343

requirement and its utility cost, respectively. The estimation is based on the assumption of 8600

344

hour per year. 21

ACS Paragon Plus Environment

Industrial & Engineering Chemistry Research 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

345

The ER is based on the product prices, raw material cost, and utility costs to calculate the

346

operating costs within a refinery. Furthermore, the fixed costs such as capital modification, labor,

347

and maintenance were not considered in this study. Table 3 lists the prices of the feed, products,

348

and utility, respectively.

349 350

Table 3. Feed, product, and utility prices40,41 Items Crude oil Naphtha Kerosene Diesel Atmospheric gas oil Residue Fired heating Cooling water Kerosene HP steam Stripping steam

Cost 80.00 136.00 122.70 121.70 95.29 89.71 150.00 5.25 17.70 0.14

Unit $/bbl $/bbl $/bbl $/bbl $/bbl $/bbl $/kJ $/kJ $/GJ $/kmol

351 352

4. Results and discussions

353

Here, we address the UQ of a complex CDU by integrating a rigorous simulation with a

354

global SA. The column is simulated rigorously in HYSYS, while the independent input

355

variables are uniformly or normally distributed. All the variables are linked in a spreadsheet-

356

variables window in HYSYS and executed to MATLAB via the ActiveX/COM connection.

357

The ER and the OP were calculated as the outputs in HYSYS spreadsheet cells and used for

358

the global SA. The Sobol’s sensitivity indices were derived in Eq. (11).

359 360

4.1. Global SA and UQ (unfixed product flow rates with uncertain inputs of uniform distributions)

22

ACS Paragon Plus Environment

Page 22 of 40

Page 23 of 40 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Industrial & Engineering Chemistry Research

361

The independent input variables are uniformly varied along their upper and lower

362

boundaries. According to the MDRM, a set of 160 Gauss Legendre quadrature nodes was

363

generated from the orthogonal polynomial toolbox42 passed to HYSYS. Figure 3 illustrates the

364

percentage of the total SA indices obtained from the MDRM, which indicates the influential

365

factors for the two outputs. The values of total SA indices are listed in Table S2 in Supporting

366

Information. As a result, for the ER and OP, two random variables such as the flow rates of the

367

crude oil and naphtha products were detected as being critical. Other random variables become

368

non-influential factors that can be omitted in the next stage11.

369 370

Figure 3. Percentage of total SA indices obtained by the MDRM for (a) the ER and (b) the OP

371

under uniform uncertainties.

Comment [ML1]: I have changed the Figure.

372

The effective detection of the non-influential inputs from the SA step enables one to

373

simplify the model. Two simulation sets of size 500 and 300 were considered for training the

374

GPR24 for the ER and the OP, respectively. Because precise estimates of the process outputs are

375

not accessible, the results of the proposed method were compared with those of the QMC method

23

ACS Paragon Plus Environment

Industrial & Engineering Chemistry Research 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

376

with a sufficient number of simulations. For an accurate estimation of the probability for the

377

QMC method, the number of samples was selected from Chernoff bound following Tempo et

378

al.43. Figure 4 compares the density functions of the ER obtained by the GPR (with 500 training

379

data points) and by the QMC (with 10,000 simulations) methods with two influential random

380

inputs (flow rates of crude oil and naphtha). Furthermore, these density functions were compared

381

with that by the QMC method with all 16 random inputs (using the 10,000 simulations from

382

Halton sequence). Similarly, as observed in Figure 5, the density functions of the OP were

383

obtained using the GPR and the QMC methods with two critical random variables, namely, the

384

flow rates of crude oil and naphtha product, which were then compared with those by the QMC

385

method with all 16 random inputs. Table 4 lists the statistical properties and computational time

386

achieved by the proposed (MDRM + GPR) and QMC methods for the ER and the OP. The

387

computational time of the proposed method includes the computational time for running 160

388

simulations for SA permitted by the MDRM and for constructing two surrogate GPR models

389

with 500 quadrature nodes for UQ. The QMC/MC methods require a large number of

390

simulations (approximately 10,000) to estimate the expected values, variances, and densities

391

accurately; hence, they are computationally expensive. As compared with the conventional QMC

392

method, the computation time was reduced by 95 % in the proposed method. Also, it was

393

observed by comparing Figures 4 and 5 that the proposed method can achieve acceptable results

394

with only a small computational cost in comparison with the conventional QMC method.

395 396

Table 4. Statistical data and simulation time obtained by the proposed and QMC methods

397

under uniform uncertainties Methods

No. of simulations

Runtime (sec.)

Mean µ (ER)

24

ACS Paragon Plus Environment

Mean µ (OP)

Page 24 of 40

Page 25 of 40 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

QMC – sixteen inputs for the ER and OP QMC – two inputs for the ER QMC – two inputs for the OP Proposed (MDRM+GPR) for the ER Proposed (MDRM+GPR) for the OP

Industrial & Engineering Chemistry Research

10,000

11,877.4

255,729.4

1,742.66

10,000

11,809.7

255,196.8

-

10,000

11 820.2

-

1730.84

660

786.7

255,182.2

-

460

555.6

-

1730.62

398 399 400

25

ACS Paragon Plus Environment

Industrial & Engineering Chemistry Research 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

401 402

Figure 4. Density profiles of the ER obtained by the GPR and QMC methods under

403

uniform uncertainties.

404 405 406

26

ACS Paragon Plus Environment

Page 26 of 40

Page 27 of 40 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Industrial & Engineering Chemistry Research

407 408

Figure 5. Density profiles of the OP obtained by the GPR and QMC methods under

409

uniform uncertainties.

410

Figure 6 illustrates the accuracy of the surrogate GPR model with respect to the number

411

of fitting points by using the Nash-Sutcliffe model efficiency coefficient. The closer the Q value

412

is to 1, the more accurate the surrogate can be. As seen in Figure 6, the Q value seems to be

413

constant at 500, which is a reasonable value for the fitted points. The Nash-Sutcliffe model

414

efficiency coefficient is defined as44: 10000

415

Q =1−

∑ (µ

i

i =1 10000

∑ (f

− fi ) 2 ;

i

− f)

2

f =

1 10000 ∑ fi 10000 i =1

i =1

27

ACS Paragon Plus Environment

(21)

Industrial & Engineering Chemistry Research

416

where µi is the prediction of the trained GPR model at the ith test point of the test sample of size

417

of 10,000 nodes from Halton sequence, fi is the real model output from HYSYS with the ith test

418

point.

0.9447420

0.9447415

0.9447410

0.9447405

Q

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

0.9447400

0.9447395

0.9447390

0.9447385 0

200

300

400

500

600

n fitting points

419 420

100

Figure 6. Nash-Sutcliffe model efficiency coefficient vs. number of training points

421 422

4.2. Global SA and UQ (unfixed product flow rates and uncertain inputs of Gaussian

423

distributions)

424

In this case study, 16 input variables were assumed to be of Gaussian distribution with the same

425

means and variances as in the uniform distribution case. The ER may vary due to numerous

426

factors, and thus, the refinery engineers have to adjust its operations frequently to maintain its

427

profitability. Although the optimal solution based on the nominal parameter values were

428

assumed to achieve the highest profit, it possibly leads to serious losses in other cases. Therefore, 28

ACS Paragon Plus Environment

Page 28 of 40

Page 29 of 40 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Industrial & Engineering Chemistry Research

429

a reliable optimization approach should take uncertainties into account and guarantee that its

430

solution is robust enough to parameter perturbations. Table S3 in Supporting Information shows

431

the mean and standard deviation of the parameter distributions for the objective. After

432

identifying the influential variables, the proposed GPR method used 500 training simulations.

433

From the simulation result, the surrogate GPR model was obtained for ER using the theory

434

described in Section 2.2. Table 5 lists the statistical properties of the ER obtained from the GPR

435

and MC/QMC methods. To obtain accurate estimations, the QMC sampling requires 10,000

436

simulations on these 16 parameters independently according to their probability distribution.

437

Figure 7 compares the density functions of the ER obtained from the proposed and QMC

438

methods. The results from the proposed method closely match with those from the traditional

439

QMC method. Note that the number of simulations required by the proposed method is

440

significantly lower than those by the QMC method (about 20 times). Also, another unique

441

advantage of the proposed method is that Sobol’s sensitivity indices, which are used to identify

442

the influential inputs in the propagation of process uncertainty, can be obtained directly from the

443

surrogate analytical model. Thus, the number of simulations is significantly reduced from

444

10~100 times, which leads to a drastic decrease in the computational efforts needed for UQ.

445

Note that to obtain the Sobol indices by using the MC/QMC methods, a huge number of

446

simulations are required which hurdles its application in a simplifying model for UQ. In this

447

example, the SA indices for the process output with respect to 16 uncertain inputs (see Table S4

448

in Supporting Information) were obtained from the MDRM for the Gaussian uncertainties.

449

Interestingly, the SA indices identify three important inputs, the flow rates of crude oil feed,

450

naphtha, and diesel products. Others are non-influential and can be maintained at the median

451

values for the uncertainty propagation. It is worth to say that the Sobol indices obtained from the

29

ACS Paragon Plus Environment

Industrial & Engineering Chemistry Research 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

452

Gaussian distribution were similar to those estimated by uniformly distributed uncertainties.

453

Note that in the Gaussian distribution case, independent variables have no limits of upper and

454

lower boundaries (the mean and variance of the input are selected to be equal to the case of

455

uniform distribution). For the case associated with independent variables as continuous variables

456

with upper and lower boundaries, the GPR method can make the prediction when the inputs

457

exceed the fixed boundaries, and using Eq. (19), one can quantify the reliability of the prediction

458

at the outlier point. The variance Σ is expected to be large in this case, which indicates the

459

unreliable prediction.

460 461

Table 5. Statistical data and simulation time obtained by the proposed and QMC methods

462

under Gaussian uncertainties

No. of

Runtime

simulations

(sec.)

QMC – sixteen inputs

10,000

11,777.7

256,208.6

QMC – two inputs

10,000

11,736.4

257,483.7

Proposed (MDRM+GPR)

660

775.8

257,416.2

Methods

Mean µ (ER)

463 464

30

ACS Paragon Plus Environment

Page 30 of 40

Page 31 of 40 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Industrial & Engineering Chemistry Research

465 466

Figure 7. Density profiles of the ER obtained by the GPR and QMC methods under Gaussian

467

uncertainties.

468 469

4.3. Global SA and UQ (fixed product flow rates with uncertain inputs of uniform

470

distributions)

471

In this study, the product flow rates of naphtha, diesel, kerosene, and AGO were specified

472

at the upper bounds when the maximum flow rates of each product were obtained. Twelve

473

random inputs were run at 120 simulations with 120 Gauss-Legendre quadrature nodes to derive

474

the Sobol indices for the outputs; Table S5 in Supporting Information lists the Sobol’s total SA

475

indices. As a result, the flow rate of the crude oil was the most influential factor for both ER and

476

OP when product flow rates were fixed. 31

ACS Paragon Plus Environment

Industrial & Engineering Chemistry Research 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

477

To train the GPR for the simplified model, 100 training data points were used (from

478

Halton sequence). Figures 8 and 9 present the density function results for the ER and the OP,

479

respectively. The density functions of the ER obtained by the GPR and QMC methods (with

480

10,000 simulations) were compared with those by the QMC method with all 12 random inputs

481

(using the 10,000 simulations from Halton sequence). Similarly, the density functions of the OP

482

obtained by the GPR and QMC methods with two influential random variables (the flow rates of

483

crude oil and naphtha product), were compared with those by the QMC method with all 12

484

random inputs. A summary of the statistical results and computational time for this case is

485

presented in Table 6.

486

Table 6. Statistical data and simulation time obtained by the proposed and QMC methods

487

with fixed product flow rates Methods

No. of simulations

Runtime (sec.)

Mean µ (ER)

Mean µ (OP)

QMC - twelve inputs

10,000

11,830.4

254,381.6

1721.16

QMC - one input

10,000

11,597.7

254,385.3

-

QMC - one input

10,000

11,528.5

-

1720.64

proposed (MDRM+GPR) for the ER

220

264.6

254,382.9

-

proposed (MDRM+GPR) for the OP

220

267.3

-

1720.75

488

32

ACS Paragon Plus Environment

Page 32 of 40

Page 33 of 40 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Industrial & Engineering Chemistry Research

489 490

Figure 8. Density profiles of the ER obtained by the GPR and QMC methods with fixed product

491

flow rates under uniform uncertainties.

33

ACS Paragon Plus Environment

Industrial & Engineering Chemistry Research 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

492 493

Figure 9. Density profiles of the OP obtained by the GPR and QMC methods with fixed product

494

flow rates under uniform uncertainties.

495 496

To summarize, in order to reduce the cost of simulations, one can replace the rigorous

497

simulation model with an inexpensive surrogate based on conventional GPR. However, the

498

number of simulations required to learn a generic multivariate response can increase significantly

499

as the input dimension increases. Furthermore, the optimization of hyper-parameters for the GPR

500

can be stuck straightforwardly in a local solution when a large number of random inputs are

501

considered. This paper addresses the dimensionality problem with a two-stage approach: Firstly,

502

the inputs are screened with the MDRM method; secondly, the GPR is constructed with respect 34

ACS Paragon Plus Environment

Page 34 of 40

Page 35 of 40 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Industrial & Engineering Chemistry Research

503

to the critical inputs to reduce the number of simulations required to learn the model. The

504

proposed approach is used for modeling the ER and the OP of CDU. Under this circumstance,

505

the result demonstrated that the flow rates of crude oil and naphtha product are critical for the ER

506

and the OP. Prediction results by the proposed method are verified by comparison with those of

507

the QMC method. It is demonstrated that there is precise agreement between the proposed and

508

the QMC methods. Moreover, the proposed method is computationally superior to the standard

509

QMC method.

510 511

5. Conclusions

512

This work focused on UQ and SA for testing the model robustness against uncertainties

513

in the CDU with the following objectives: obtaining a better understanding of the effects of

514

inputs on the output, determining the key inputs that influence the uncertainty of model outputs

515

(such as the EP and OP), and achieving model simplification. To address a large number of

516

uncertainties in the CDU with the computational limitation, an efficient two-stage SA approach

517

combining the MDRM and GPR was presented and successfully applied to UQ and SA of

518

complex CDU problem. Density functions of the outputs predicted by the Gaussian regression

519

from the simplified model showed a good agreement with those of the conventional QMC

520

methods. The proposed method is superior to the popular QMC approach primarily in terms of

521

the computational efforts (approximately more than 95 %). The SA and UQ results indicate that

522

the ER and the OP are mostly influenced by the fluctuation of the flow rates of crude oil and

523

light fraction (e.g., naphtha) in the CDU. The proposed approach is expected to offer an effective

524

method to address both SA and UQ for other complex chemical processes associated with a large

525

number of uncertainties. In future work, another study of SA is planned in order to focus on the 35

ACS Paragon Plus Environment

Industrial & Engineering Chemistry Research 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

526

scheduling of the blending of crude oils to emulate an actual refinery optimization process and

527

therefore provide more realistic results regarding the practical application.

528 529

Acknowledgment

530

This study was supported by Engineering Development Research Center (EDRC) funded

531

by the Ministry of Trade, Industry & Energy (MOTIE) (No. N0000990), the Priority Research

532

Centers Program through the National Research Foundation of Korea (NRF) funded by the

533

Ministry of Education (2014R1A6A1031189), and the R&D Center for Reduction of Non-CO2

534

Greenhouse Gases (201700240008) funded by the Ministry of Environment as a ‘Global Top

535

Environment R&D Program’.

536 537 538 539

Supporting Information

The supporting information contains Sobol’s sensitivity analysis indices from all case studies.

540 541 542

Conflict of Interest

The authors declare no competing financial interest.

543

36

ACS Paragon Plus Environment

Page 36 of 40

Page 37 of 40 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

544

Literature cited

Industrial & Engineering Chemistry Research

545

(1) Kong, Q.; Shah, N. Development of an Optimization-Based Framework for Simultaneous Process

546

Synthesis and Heat Integration. Ind. Eng. Chem. Res. 2017, 56, 5000-5013.

547

(2) Saltelli, A.; Ratto, M.; Tarantola, S.; Campolongo, F. Update 1 of: Sensitivity Analysis for Chemical

548

Models. Chem. Rev. 2012, 112, PR1-PR21.

549

(3) Basak, K.; Abhilash, K. S.; Ganguly, S.; Saraf, D. N. On-Line Optimization of a Crude Distillation

550

Unit with Constraints on Product Properties. Ind. Eng. Chem. Res. 2002, 41, 1557-1568.

551

(4) Osuolale, F. N.; Zhang, J. Energy efficiency optimisation for distillation column using artificial

552

neural network models. Energy 2016, 106, 562-578.

553

(5) Kim, Y. H. An Energy-Efficient Crude Distillation Unit with a Prefractionator. Chem. Eng. Technol.

554

2017, 40, 588-597.

555

(6) Waheed, M. A.; Oni, A. O. Performance improvement of a crude oil distillation unit. Appl. Therm.

556

Eng. 2015, 75, 315-324.

557

(7) Mahalec, V.; Sanchez, Y. Inferential monitoring and optimization of crude separation units via

558

hybrid models. Comput. Chem. Eng. 2012, 45, 15-26.

559

(8) More, R. K.; Bulasara, V. K.; Uppaluri, R.; Banjara, V. R. Optimization of crude distillation system

560

using aspen plus: Effect of binary feed selection on grass-root design. Chem. Eng. Res. Des. 2010, 88,

561

121-134.

562

(9) Enríquez-Gutiérrez, V. M.; Jobson, M.; Ochoa-Estopier, L. M.; Smith, R. Retrofit of heat-integrated

563

crude oil distillation columns. Chem. Eng. Res. Des. 2015, 99, 185-198.

564

(10) Ochoa-Estopier, L. M.; Jobson, M.; Smith, R. The use of reduced models for design and

565

optimisation of heat-integrated crude oil distillation systems. Energy 2014, 75, 5-13.

566

(11) Gu, W.; Huang, Y.; Wang, K.; Zhang, B.; Chen, Q.; Hui, C.-W. Comparative analysis and

567

evaluation of three crude oil vacuum distillation processes for process selection. Energy 2014, 76, 559-

568

571.

569

(12) Doostan, A.; Owhadi, H. A non-adapted sparse approximation of PDEs with stochastic inputs. J,

570

Comput. Phys. 2011, 230, 3015-3034.

571

(13) Uusitalo, L.; Lehikoinen, A.; Helle, I.; Myrberg, K. An overview of methods to evaluate uncertainty

572

of deterministic models in decision support. Environ. Model. Softw. 2015, 63, 24-31.

573

(14) Sobol′, I. M. Global sensitivity indices for nonlinear mathematical models and their Monte Carlo

574

estimates. Math. Comput. Simulation 2001, 55, 271-280.

575

(15) Saltelli, A. Making best use of model evaluations to compute sensitivity indices. Comput. Phys.

576

Commun. 2002, 145, 280-297.

37

ACS Paragon Plus Environment

Industrial & Engineering Chemistry Research 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

577

(16) Webster, M. D.; Tatang, M. A.; McRae, G. J., Application of the Probabilistic Collocation Method

578

for an Uncertainty Analysis of a Simple Ocean Model. Report 4 (MIT Joint Program on the Science and

579

Policy of Global Change). In 1996; Vol. 32.

580

(17) Abubakar, U.; Sriramula, S.; Renton, N. C. Reliability of complex chemical engineering processes.

581

Comput. Chem. Eng. 2015, 74, 1-14.

582

(18) Schobi, R.; Sudret, B.; Wiart, J. Polynomial-chaos-based Kriging. Int. J. Unc. Quant. 2015, 5, 171-

583

193.

584

(19) Shen, D. E.; Braatz, R. D. Polynomial chaos-based robust design of systems with probabilistic

585

uncertainties. AlChE J. 2016, 62, 3310-3318.

586

(20) Duong, P. L. T.; Ali, W.; Kwok, E.; Lee, M. Uncertainty quantification and global sensitivity

587

analysis of complex chemical process using a generalized polynomial chaos approach. Comput. Chem.

588

Eng. 2016, 90, 23-30.

589

(21) Celse, B.; Costa, V.; Wahl, F.; Verstraete, J. J. Dealing with uncertainties: Sensitivity analysis of

590

vacuum gas oil hydrotreatment. Chem. Eng. J. 2015, 278, 469-478.

591

(22) Duong, P. L. T.; Minh, L. Q.; Pham, T. N.; Goncalves, J.; Kwok, E.; Lee, M. Uncertainty

592

quantification and global sensitivity analysis of complex chemical processes with a large number of input

593

parameters using compressive polynomial chaos. Chem. Eng. Res. Des. 2016, 115, Part A, 204-213.

594

(23) Minh, L. Q.; Duong, P. L. T.; Goncalves, J.; Kwok, E.; Lee, M. A two-stage approach of

595

multiplicative dimensional reduction and polynomial chaos for global sensitivity analysis and uncertainty

596

quantification with a large number of process uncertainties. J. Taiwan Inst. Chem. Eng. 2017, 78, 254-264.

597

(24) Clifton, L.; Clifton, D. A.; Pimentel, M. A. F.; Watkinson, P. J.; Tarassenko, L. In Gaussian

598

process regression in vital-sign early warning systems, 2012 Annual International Conference of the

599

IEEE Engineering in Medicine and Biology Society, Aug. 28 2012-Sept. 1 2012, 2012; 2012; pp 6161-

600

6164.

601

(25) Rasmussen, C. E.; Williams, C. K. I., Gaussian Processes for Machine Learning (Adaptive

602

Computation and Machine Learning). The MIT Press: 2005.

603

(26) Quirante, N.; Javaloyes, J.; Caballero, J. A. Rigorous design of distillation columns using surrogate

604

models based on Kriging interpolation. AlChE J. 2015, 61, 2169-2187.

605

(27) Kajero, O. T.; Chen, T.; Yao, Y.; Chuang, Y.-C.; Wong, D. S. H. Meta-modelling in chemical

606

process system engineering. J. Taiwan Inst. Chem. Eng. 2017, 73, 135-145.

607

(28) Kaymaz, I. Application of kriging method to structural reliability problems. Struct. Safety 2005, 27,

608

133-151.

38

ACS Paragon Plus Environment

Page 38 of 40

Page 39 of 40 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

609

Industrial & Engineering Chemistry Research

(29) Liu, X.; Guillas, S. Dimension Reduction for Gaussian Process Emulation: An Application to the

610

Influence of Bathymetry on Tsunami Heights. SIAM/ASA Journal on Uncertainty Quantification 2017, 5,

611

787-812.

612

(30) Bengio, Y.; Delalleau, O.; Roux, N. L. In The curse of highly variable functions for local kernel

613

machines, Advances in Neural Information Processing Systems 18 (NIPS'05), 2005; Weiss, Y.; Schölkopf,

614

B.; Platt, J., Eds. Cambridge, MA: MIT Press: 2005; pp 107- 114.

615

(31) Kalaitzis, A. A.; Lawrence, N. D. A Simple Approach to Ranking Differentially Expressed Gene

616

Expression Time Courses through Gaussian Process Regression. BMC Bioinformatics 2011, 12, 180.

617

(32) Kocijan, J., In Modelling and Control of Dynamic Systems Using Gaussian Process Models,

618

Springer Cham, 2016.

619

(33) Saltelli, A.; Chan, K.; Scott, E. M., Sensitivity analysis. Wiley New York: 2000; Vol. 1.

620

(34) Zhang, X.; Pandey, M. D. An effective approximation for variance-based global sensitivity analysis.

621

Reliab, Eng. Sys. Safety 2014, 121, 164-174.

622

(35) HYSYS, A. Aspen HYSYS customization guide. http://www.aspentech.com/core/aspen-hysys.aspx

623

(36) Errico, M.; Tola, G.; Mascia, M. Energy saving in a crude distillation unit by a preflash

624

implementation. Appl. Therm. Eng. 2009, 29, 1642-1647.

625

(37) Chang, A.-F.; Pashikanti, K.; Liu, Y. A., Atmospheric Distillation Unit. In Refinery Engineering,

626

Wiley-VCH Verlag GmbH & Co. KGaA: 2012; pp 57-116.

627

(38) Javaloyes-Antón, J.; Ruiz-Femenia, R.; Caballero, J. A. Rigorous Design of Complex Distillation

628

Columns Using Process Simulators and the Particle Swarm Optimization Algorithm. Ind. Eng. Chem. Res.

629

2013, 52, 15621-15634.

630

(39) Ibrahim, D.; Jobson, M.; Guillén-Gosálbez, G. Optimization-Based Design of Crude Oil Distillation

631

Units Using Rigorous Simulation Models. Ind. Eng. Chem. Res. 2017, 56, 6728-6740.

632

(40) Osuolale, F. N.; Zhang, J. Thermodynamic optimization of atmospheric distillation unit. Comput.

633

Chem. Eng. 2017, 103, 201-209.

634

(41) Turton, R., Analysis, Synthesis, and Design of Chemical Processes. Prentice Hall: 2012.

635

(42) Gautschi, W., Orthogonal Polynomials: Computation and Approximation. Clarendon Press: 2004.

636

(43) Tempo, R.; Calafiore, G.; Dabbene, F., Randomized algorithms for analysis and control of

637

uncertain systems: with applications. Springer Science & Business Media: 2012.

638

(44) N. Moriasi, D.; G. Arnold, J.; W. Van Liew, M.; L. Bingner, R.; D. Harmel, R.; L. Veith, T. Model

639

Evaluation Guidelines for Systematic Quantification of Accuracy in Watershed Simulations. Transactions

640

of the ASABE 2007, 50, 885.

641 642

39

ACS Paragon Plus Environment

Industrial & Engineering Chemistry Research 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

643

Table of Contents (TOC) Graphic

644

645

40

ACS Paragon Plus Environment

Page 40 of 40