Linear Subspace Principal Component Regression Model for Quality

May 8, 2017 - State Key Laboratory of Industrial Control Technology, Institute of Industrial Process ... Copyright © 2017 American Chemical Society ...

Article

Linear subspace PCR model for quality estimation of nonlinear and multimode industrial processes
Junhua Zheng and Zhihuan Song
Ind. Eng. Chem. Res., Just Accepted Manuscript • Publication Date (Web): 08 May 2017
Downloaded from http://pubs.acs.org on May 16, 2017


Industrial & Engineering Chemistry Research is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.


Industrial & Engineering Chemistry Research

Linear subspace PCR model for quality estimation of nonlinear and multimode industrial processes

Junhua Zheng and Zhihuan Song∗

State Key Laboratory of Industrial Control Technology, Institute of Industrial Process Control, Department of Control Science and Engineering, Zhejiang University, Hangzhou 310027, Zhejiang, China

Abstract

Principal component regression (PCR) has been widely used for quality estimation in industrial processes. However, the traditional PCR method is restricted to linear processes. Although several nonlinear forms of PCR have been proposed, most of them have high algorithmic complexity, which makes them difficult to use in practice. In this paper, a new linear subspace PCR model is proposed for quality estimation of nonlinear processes. Through key monitoring statistics and Bayesian inference, the quality estimation results in different linear subspaces can be effectively integrated. By introducing an additional information combination direction, the basic linear subspace PCR model is further extended to a two-dimensional form for quality estimation of multimode processes. Two industrial examples are provided for performance evaluation of the proposed methods.

Keywords: Quality estimation; Principal component regression; Linear subspace; Nonlinear processes; Multimode processes.



To whom all correspondence should be addressed. Email: [email protected]


ACS Paragon Plus Environment


1. Introduction

In the process industry, some key variables are difficult to measure online, such as the product composition in distillation columns and the concentration of the reaction mass in chemical reactors. As a result, quality-related process control becomes quite difficult, since it depends on measured quality variables. Typically, these important variables are determined by offline laboratory analyses or by online analyzers. However, both approaches are expensive or time-consuming, and may introduce a significant delay into the control system. Therefore, it is necessary to carry out online quality estimation or prediction using easy-to-measure process variables, an approach also known as soft sensing or virtual measurement. With the wide use of distributed control systems (DCS) in modern industrial processes, a huge amount of process data has been collected, and data-based quality estimation and prediction methods have therefore gained much attention in recent years.[1-4] Compared to first-principles model-based methods, which rely heavily on process knowledge, data-based quality prediction methods rarely need process knowledge or expert experience; instead, they try to extract useful information directly from the process data.
In the past years, many data-based methods have been developed, among which principal component regression (PCR) and partial least squares (PLS) may be two of the most widely used.[5-10] Nonlinear quality estimation methods have also been proposed, such as artificial neural networks (ANN), support vector machines (SVM), and kernel-based methods.[11-22] In addition, several nonlinear extensions of PCR/PLS models have been proposed[23-25] that incorporate nonlinear modeling techniques such as support vector regression (SVR) and Gaussian process regression.[14-16, 26-30] Recently, several ensemble learning methods, such as bagging and random subspace, have also been introduced to enhance the performance of soft sensor models.[31-36] For example, Jin et al.[31] developed an adaptive soft sensor for nonlinear time-varying batch processes based on ensemble Gaussian process regression. Shao and Tian[32] proposed an adaptive soft sensor for quality prediction based on selective ensemble local partial least squares models. Kaneko and Funatsu[34] developed an ensemble locally weighted partial least squares method for just-in-time modeling and applied it to soft sensing.

Since the quality estimation step is a part of the process control system, its real-time performance is important. In the present paper, an efficient nonlinear quality estimation method is developed, which is based on PCR and the linear subspace method. The proposed method is then extended to model multimode industrial processes, which have several different operating conditions. First, to approximate the nonlinear process, the linear subspace method is introduced, through which several local PCR models are developed for quality estimation in different linear subspaces. Compared to traditional nonlinear modeling methods, the computational complexity of the linear subspace method is much lower. Second, a probabilistic combination strategy based on PCA monitoring statistics and Bayesian inference is proposed to combine the results obtained in different linear subspaces. Based on the developed linear subspace model, an additional probabilistic combination direction is formulated to capture the mode information in the multimode process. As a result, a two-dimensional probabilistic combination method can be constructed for quality estimation. To address the mode localization problem for a new data sample, a similar Bayesian inference approach is developed, so that the posterior probability of each operating mode can be determined for the new data sample.
The main contributions of this paper are summarized as follows: (1) a new linear subspace PCR model is proposed for quality estimation of nonlinear processes, and (2) the linear subspace PCR model is extended to a two-dimensional form in order to handle the multimode condition of the process. The rest of this paper is structured as follows. In section 2, the principle of the traditional PCR method is introduced, followed by the proposed linear subspace based method for nonlinear quality estimation in section 3. In section 4, the linear subspace based quality estimation method is extended to multimode processes. Two industrial examples are provided in section 5 for performance evaluation of the proposed methods. Finally, conclusions are drawn.

2. Principal component regression (PCR)

Denote the measurement matrix of the easy-to-measure secondary process variables as $X \in \mathbb{R}^{n \times m}$, where $n$ is the number of data samples and $m$ is the number of measured variables. The predicted variable matrix is given as $Y \in \mathbb{R}^{n \times r}$, where $r$ is the number of predicted variables. The main idea of PCR is first to find a set of principal components that span the original measurement variable space, and then to calculate the regression matrix between the extracted principal components and the predicted variable matrix $Y$. The traditional derivation of PCR can be expressed as follows[2]

$$X = TP^{T} + E \tag{1}$$

$$Y = TC^{T} + F \tag{2}$$

where $P \in \mathbb{R}^{m \times k}$ is the loading matrix, $T \in \mathbb{R}^{n \times k}$ is the principal component matrix, $k$ is the selected number of principal components, $C \in \mathbb{R}^{r \times k}$ is the regression matrix, and $E$ and $F$ are the residual matrices with appropriate dimensions. For a new data sample $x_{new} \in \mathbb{R}^{m \times 1}$, the predicted variables can be calculated as

$$\hat{y}_{new} = CP^{T} x_{new} \tag{3}$$
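As an illustration of eqs. (1)-(3), the following is a minimal NumPy sketch of PCR rather than the authors' implementation. The function names `fit_pcr` and `predict_pcr` are hypothetical, the data are assumed to be mean-centered beforehand, and the loadings are obtained from an eigendecomposition of the sample covariance, which is one common choice.

```python
import numpy as np

def fit_pcr(X, Y, k):
    """Fit PCR with k components. X: (n, m), Y: (n, r), both assumed mean-centered."""
    # Loadings P (m x k): leading eigenvectors of the sample covariance of X.
    cov = X.T @ X / (X.shape[0] - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)       # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:k]        # indices of the k largest
    P = eigvecs[:, order]
    T = X @ P                                    # scores, (n x k)
    # Least-squares fit of Y ~ T C^T, so C has shape (r x k) as in eq. (2).
    C = np.linalg.lstsq(T, Y, rcond=None)[0].T
    return P, C

def predict_pcr(P, C, x_new):
    """Eq. (3): y_hat = C P^T x_new for one centered sample x_new of length m."""
    return C @ P.T @ x_new
```

When all $m$ components are retained and $X$ has full column rank, this reduces to ordinary least squares; retaining fewer components trades some fit for robustness to collinear inputs.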

3. Linear subspace method for nonlinear quality estimation

In the conventional nonlinear PCA method, the original nonlinear space is expressed by nonlinear principal components. Based on the nonlinear PCA method, the nonlinear PCR method can be formulated as

$$\begin{aligned} X &= \Phi(T_{nl}) + E_{nl} \\ Y &= f(X) + E_{y} = f(\Phi(T_{nl})) + E_{y} \end{aligned} \tag{4}$$

where $\Phi(\cdot)$ defines the nonlinear relationship between the process variables and the latent variables, the form of which depends on the nonlinearity between those two types of variables, $T_{nl}$ is the latent variable matrix, $E_{nl}$ is the residual matrix, $f(\cdot)$ defines the nonlinear relationship between the process variables and the quality variables, and $E_{y}$ represents the prediction error. The main idea of the linear subspace method is to approximate the nonlinear process by several linear subspaces, which can greatly reduce the algorithmic complexity while retaining the modeling efficiency for the nonlinear process. An important issue in the linear subspace method is how to construct linear subspaces that are both diverse and accurate. A simple illustration of the linear subspace modeling method is provided in Figure 1. In this figure, the whole nonlinear space is divided into several linear subspaces, each of which can be described by a linear model. In other words, the whole nonlinear space is approximated by several linear subspaces, which is why the method is called the linear subspace modeling method.

Figure 1: Linear subspace method for nonlinear process modeling

According to the PCA method, different principal components are orthogonal to each other. If we build individual linear subspaces along different principal directions and select the most relevant variables in the corresponding subspace, both the diversity and the accuracy of each linear subspace model can be obtained. Since the retained principal components capture most of the data information of the process, linear subspaces constructed along those principal component directions can provide an approximation of the nonlinear process. Therefore, the traditional linear PCA decomposition is first applied to the dataset $X$, thus

$$X = TP^{T} + E \tag{5}$$

Suppose $k$ principal components are selected in the PCA model, the loading matrix can be decomposed as follows

$$P = [P_1, P_2, \cdots, P_k] \tag{6}$$

Hence, the nonlinear space can be approximated by $k$ linear subspaces developed along each principal component direction. However, before developing individual subspace models, a subset of process variables should be selected in each subspace. A subspace contribution index can be defined to measure the importance of each variable in the linear subspace, given as

$$CT(i,j) = p_{ij}^{2} \Big/ \sum_{l=1}^{m} p_{lj}^{2} \tag{7}$$

where $i = 1, 2, \cdots, m$, $j = 1, 2, \cdots, k$, and $p_{ij}^{2}$ is the squared value of the corresponding element of the loading matrix. Thus, to select the variable index for each linear subspace, the variables with large contribution values are chosen as the variable subset. In practice, the number of variables in each subspace can be determined either by a cut-off value or by a contribution ratio (e.g. 80%). In the current paper, a contribution ratio of around 80% has been used for variable selection in each subspace.
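The variable selection step can be sketched as follows. This is a hedged illustration: the names `contribution_index` and `select_variables` are hypothetical, and the contribution ratio is interpreted here as a cumulative threshold (keep the largest contributors until their summed contribution reaches the ratio), which is an assumption because the text does not spell out the exact rule.

```python
import numpy as np

def contribution_index(P):
    """Eq. (7): CT[i, j] = p_ij^2 / sum_l p_lj^2 for a loading matrix P of shape (m, k)."""
    sq = P ** 2
    return sq / sq.sum(axis=0, keepdims=True)

def select_variables(P, ratio=0.8):
    """For each principal direction j, return the sorted variable indices d_j whose
    cumulative contribution first reaches `ratio` (assumed selection rule)."""
    CT = contribution_index(P)
    subsets = []
    for j in range(P.shape[1]):
        order = np.argsort(CT[:, j])[::-1]              # largest contribution first
        cum = np.cumsum(CT[order, j])
        n_keep = int(np.searchsorted(cum, ratio) + 1)   # first position with cum >= ratio
        n_keep = min(n_keep, P.shape[0])                # guard against rounding at the end
        subsets.append(np.sort(order[:n_keep]))
    return subsets
```

Each returned index set plays the role of $d_j$ in the local models; note that this rule may select a different number of variables per subspace, whereas the text uses a common $m_{sub}$.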

3.1. Linear PCR model development

After the linear subspaces have been constructed, a local PCR model is developed in each linear subspace. Suppose the number of process variables selected in each linear subspace is $m_{sub}$, and the index of the process variables in each linear subspace is represented as $d_i$, $i = 1, 2, \cdots, k$. Then, the subspace datasets for each local PCR model development can be denoted as $X_i = X(:, d_i) \in \mathbb{R}^{n \times m_{sub}}$ and $Y \in \mathbb{R}^{n \times r}$. Therefore, the local linear PCR model in each subspace can be modeled as follows

$$Y = X_i P_i C_i^{T} + E_i = X_i R_i^{pcr} + E_i \tag{8}$$

where $i = 1, 2, \cdots, k$, and $R_i^{pcr} = P_i C_i^{T} \in \mathbb{R}^{m_{sub} \times r}$ is the regression matrix of the i-th local PCR model.
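A local model of the form in eq. (8) could be built along the following lines. This is a sketch under stated assumptions: `fit_local_pcr` is a hypothetical name, the data are assumed to be mean-centered, and the number of local components `k_local` is passed in by the caller rather than chosen automatically.

```python
import numpy as np

def fit_local_pcr(X, Y, d, k_local):
    """Local PCR on the columns d of X. Returns the loadings P_i, the retained
    eigenvalues (diagonal of Lambda_i) and the regression matrix R_i of eq. (8)."""
    Xi = X[:, d]                                   # subspace dataset X_i = X(:, d_i)
    # Right singular vectors of the centered data are the PCA loadings.
    _, s, Vt = np.linalg.svd(Xi, full_matrices=False)
    P = Vt[:k_local].T                             # (m_sub, k_i)
    lam = (s[:k_local] ** 2) / (X.shape[0] - 1)    # covariance eigenvalues, descending
    T = Xi @ P                                     # local scores
    C = np.linalg.lstsq(T, Y, rcond=None)[0].T     # (r, k_i)
    R = P @ C.T                                    # (m_sub, r)
    return P, lam, R
```

The eigenvalues are returned alongside the regression matrix because the monitoring statistics used later for weighting need them.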

3.2. Nonlinear quality estimation method

Based on the developed local PCR models, the nonlinear quality estimation method can be formulated by combining the local estimation results in different linear subspaces. Therefore, when a new data sample $x_{new} \in \mathbb{R}^{m}$ is available, the local PCR estimation results can first be calculated as

$$y_{new}^{i} = (R_i^{pcr})^{T} x_{new}(d_i, 1) \tag{9}$$

where $i = 1, 2, \cdots, k$, and $y_{new}^{i} \in \mathbb{R}^{r}$ is the i-th local estimation result of the linear PCR model. When all of the local PCR estimation results have been generated, the next step is to combine them to get the final quality estimation result. Here, a probabilistic weighted form is used to combine the local results. Therefore, the weighted value of each local PCR model for the new data sample should be determined first. In the traditional PCA-based process monitoring method, two statistics, $T^2$ and SPE, are constructed for monitoring the status of a data sample. If the value of a statistic exceeds its confidence limit, the data sample is considered abnormal. In other words, comparing the statistic with its confidence limit for the normal region describes the status of the data sample. In this paper, these two PCA-based monitoring statistics are used to calculate the weighted value of each local PCR model. For the new data sample $x_{new}$ in the i-th subspace, its $T^2$ and SPE values are calculated as[1]

$$\begin{aligned} t_{new}^{i} &= (P_i)^{T} x_{new}(d_i, 1) \\ T_{i,new}^{2} &= (t_{new}^{i})^{T} (\Lambda_i)^{-1} (t_{new}^{i}) \end{aligned} \tag{10}$$

$$\begin{aligned} e_{new}^{i} &= x_{new}(d_i, 1) - P_i t_{new}^{i} \\ SPE_{i,new} &= (e_{new}^{i})^{T} (e_{new}^{i}) \end{aligned} \tag{11}$$
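The two statistics in eqs. (10) and (11) can be computed as in the sketch below. The function name `monitoring_statistics` is hypothetical, and the eigenvalue matrix $\Lambda_i$ is assumed to be diagonal and passed as a vector of its diagonal entries.

```python
import numpy as np

def monitoring_statistics(x_sub, P, lam):
    """T^2 and SPE of one centered sample restricted to a subspace.
    x_sub: (m_sub,) sample x_new(d_i), P: (m_sub, k_i) loadings,
    lam: (k_i,) eigenvalues, i.e. the diagonal of Lambda_i."""
    t = P.T @ x_sub                  # score vector, eq. (10)
    t2 = float(t @ (t / lam))        # t^T Lambda^{-1} t for diagonal Lambda
    e = x_sub - P @ t                # residual, eq. (11)
    spe = float(e @ e)
    return t2, spe
```

$T^2$ measures how far the sample lies within the retained principal subspace, while SPE measures how far it lies from that subspace.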

where $i = 1, 2, \cdots, k$, and $\Lambda_i$ is the eigenvalue matrix of the i-th local PCR model. Then, the probabilities of the data sample in each linear subspace corresponding to the $T^2$ and SPE statistics are estimated as follows

$$P_{T^2}[x_{new}(d_i,1) \mid i] = \exp\left\{ -\frac{T_{i,new}^{2}}{T_{i,lim}^{2}} \right\} \tag{12}$$

$$P_{SPE}[x_{new}(d_i,1) \mid i] = \exp\left\{ -\frac{SPE_{i,new}}{SPE_{i,lim}} \right\} \tag{13}$$

where $T_{i,lim}^{2}$ and $SPE_{i,lim}$ are the confidence limits of the $T^2$ and SPE monitoring statistics in the different subspaces, which can be determined by the $F$ and $\chi^2$ distributions as follows[1]

$$T_{i,lim}^{2} = \frac{k_i (n-1)}{n - k_i} F_{k_i,(n-k_i),\alpha} \tag{14}$$

$$SPE_{i,lim} = g_i \chi_{h_i,\alpha}^{2} \tag{15}$$

where $k_i$ is the number of principal components in each local PCR model, $F_{k_i,(n-k_i),\alpha}$ denotes the F-distribution with $k_i$ and $n - k_i$ degrees of freedom, $\alpha$ is the selected significance level, and $g_i = v_i / (2m_i)$, $h_i = 2m_i^{2} / v_i$, in which $m_i$ and $v_i$ are the mean and variance of SPE within each model. Compared to traditional probability estimation methods such as kernel density estimation (KDE), this estimation method is much simpler. Nevertheless, owing to the properties of the two statistics, the estimated probabilities are still effective: the smaller the value of the statistic, the higher the probability that the data sample belongs to the corresponding subspace. In contrast, when a large value of the statistic is obtained, a small probability is assigned to this sample in the corresponding subspace, and its membership in that subspace can be quite small. Since there is no prior knowledge about the constructed subspaces, the prior probability of each subspace is assumed to be equal, thus $P(i) = 1/k$, $i = 1, 2, \cdots, k$. However, if any information about the linear subspaces is available, the prior probability of each subspace could be further considered to improve the performance of this probabilistic transformation. The posterior probability of the new data sample corresponding to each subspace can be calculated by Bayesian inference, given as follows

$$P_{T^2}[i \mid x_{new}(d_i,1)] = \frac{P_{T^2}[i, x_{new}(d_i,1)]}{P_{T^2}[x_{new}(d_i,1)]} = \frac{P_{T^2}[x_{new}(d_i,1) \mid i]\,P(i)}{\sum_{i=1}^{k} \{ P_{T^2}[x_{new}(d_i,1) \mid i]\,P(i) \}} \tag{16}$$

$$P_{SPE}[i \mid x_{new}(d_i,1)] = \frac{P_{SPE}[i, x_{new}(d_i,1)]}{P_{SPE}[x_{new}(d_i,1)]} = \frac{P_{SPE}[x_{new}(d_i,1) \mid i]\,P(i)}{\sum_{i=1}^{k} \{ P_{SPE}[x_{new}(d_i,1) \mid i]\,P(i) \}} \tag{17}$$
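As a short worked step under the equal-prior assumption $P(i) = 1/k$, the prior cancels in eq. (16) and the posterior reduces to a normalized likelihood. The summation index is written as $j$ here only to keep it distinct from the subspace index $i$ on the left-hand side; the same simplification applies to the SPE-based posterior in eq. (17).

```latex
P_{T^2}[i \mid x_{new}(d_i,1)]
  = \frac{P_{T^2}[x_{new}(d_i,1) \mid i]\,(1/k)}
         {\sum_{j=1}^{k} P_{T^2}[x_{new}(d_j,1) \mid j]\,(1/k)}
  = \frac{\exp\{-T^2_{i,new}/T^2_{i,lim}\}}
         {\sum_{j=1}^{k} \exp\{-T^2_{j,new}/T^2_{j,lim}\}}
```

In other words, with equal priors each weight depends only on the ratios of the statistics to their confidence limits.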

Finally, the local PCR prediction results can be combined using their corresponding posterior probabilities in the different linear subspaces. For simplicity, the mean value of $P_{T^2}[i \mid x_{new}(d_i,1)]$ and $P_{SPE}[i \mid x_{new}(d_i,1)]$ in each subspace is used as the weighted value for combination. Thus, the final estimation result of the new data sample can be calculated as

$$\begin{aligned} P[i \mid x_{new}(d_i,1)] &= \{ P_{T^2}[i \mid x_{new}(d_i,1)] + P_{SPE}[i \mid x_{new}(d_i,1)] \} / 2 \\ y_{new} &= \sum_{i=1}^{k} P[i \mid x_{new}(d_i,1)]\, y_{new}^{i} \end{aligned} \tag{18}$$
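The weighting and combination in eqs. (12), (13) and (16)-(18) can be sketched in plain Python as follows. The names `subspace_weights` and `combine` are hypothetical, and the inputs are assumed to be the already computed ratios $T^2_{i,new}/T^2_{i,lim}$ and $SPE_{i,new}/SPE_{i,lim}$, so the confidence limits of eqs. (14) and (15) are taken as given.

```python
import math

def subspace_weights(t2_ratios, spe_ratios, priors=None):
    """Posterior weights per subspace from the ratios statistic / confidence limit.
    Equal priors 1/k are used unless `priors` is supplied."""
    k = len(t2_ratios)
    priors = priors or [1.0 / k] * k

    def posterior(ratios):
        # Likelihood exp(-ratio) as in eqs. (12)-(13), then Bayes' rule, eqs. (16)-(17).
        joint = [math.exp(-r) * p for r, p in zip(ratios, priors)]
        total = sum(joint)
        return [v / total for v in joint]

    p_t2 = posterior(t2_ratios)
    p_spe = posterior(spe_ratios)
    # Eq. (18): average the two posteriors.
    return [(a + b) / 2.0 for a, b in zip(p_t2, p_spe)]

def combine(weights, local_predictions):
    """Eq. (18): weighted sum of local predictions, each a list of r values."""
    r = len(local_predictions[0])
    return [sum(w * y[q] for w, y in zip(weights, local_predictions)) for q in range(r)]
```

Because each posterior sums to one, the averaged weights also sum to one, so the final estimate is a convex combination of the local estimates.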

4. Two-dimensional linear subspace modeling method

The developed nonlinear quality estimation method can be extended to processes with multiple operating modes (operating conditions), e.g. processes that produce multiple products or have multiple operational loads. Here, ‘mode’ means the type of process operation or the process condition. In the past years, data-based modeling methods and some applications have been developed for multimode processes.[37-41] By extending the proposed method to a two-dimensional form, both the nonlinear and the multimode process behaviors can be addressed simultaneously. To combine the quality estimation results in different operating modes, an additional probabilistic combination direction should be introduced. Suppose the process contains $C$ operating modes; the overall dataset can then be partitioned into $C$ sub-datasets, which correspond to the different operating modes. Then, the proposed linear subspace method is carried out on each sub-dataset $X_c^{sub}$, $c = 1, 2, \cdots, C$. Thus, different local PCR models can be built on each linear subspace of sub-dataset $X_c^{sub}$, which are given as follows

$$Y = X_{c,i}^{sub} P_{c,i}^{sub} C_{c,i}^{sub\,T} + E_{c,i}^{sub} = X_{c,i}^{sub} R_{c,i}^{sub} + E_{c,i}^{sub} \tag{19}$$

where $i = 1, 2, \cdots, k$, and $R_{c,i}^{sub} = P_{c,i}^{sub} C_{c,i}^{sub\,T} \in \mathbb{R}^{m_{sub} \times r}$ is the regression matrix of the i-th local PCR model in the c-th operating mode.

Given the new data sample $x_{new} \in \mathbb{R}^{m}$, two weighted parameters should be defined: the linear subspace weighted value and the mode weighted value, from which the final estimation result can be obtained by combining the results in the different linear subspaces and operating modes. First, following the procedure of the linear subspace combination method in section 3, the prediction result of the new data sample in each operating mode can be calculated as follows

$$\begin{aligned} y_{c,i,new} &= (R_{c,i}^{sub})^{T} x_{new}(d_i,1) \\ P^{c}[i \mid x_{new}(d_i,1)] &= \{ P_{T^2}^{c}[i \mid x_{new}(d_i,1)] + P_{SPE}^{c}[i \mid x_{new}(d_i,1)] \} / 2 \\ y_{c,new} &= \sum_{i=1}^{k} P^{c}[i \mid x_{new}(d_i,1)]\, y_{c,i,new} \end{aligned} \tag{20}$$

where the linear subspace weighted value $P^{c}[i \mid x_{new}(d_i,1)]$, $i = 1, 2, \cdots, k$, $c = 1, 2, \cdots, C$, can be calculated through eqs. (10)-(17). Next, the mode weighted value of the new data sample can be defined and calculated as follows

$$P_{T^2}(x_{new} \mid c) = \frac{1}{k} \sum_{i=1}^{k} P_{T^2}^{i}[x_{new}(d_i,1) \mid c] = \frac{1}{k} \sum_{i=1}^{k} \exp\left\{ -\frac{T_{c,i,new}^{2}}{T_{c,i,lim}^{2}} \right\} \tag{21}$$

$$P_{SPE}(x_{new} \mid c) = \frac{1}{k} \sum_{i=1}^{k} P_{SPE}^{i}[x_{new}(d_i,1) \mid c] = \frac{1}{k} \sum_{i=1}^{k} \exp\left\{ -\frac{SPE_{c,i,new}}{SPE_{c,i,lim}} \right\} \tag{22}$$

$$P_{T^2}(c \mid x_{new}) = \frac{P_{T^2}(c, x_{new})}{P_{T^2}(x_{new})} = \frac{P_{T^2}(x_{new} \mid c)\,P(c)}{\sum_{c=1}^{C} [ P_{T^2}(x_{new} \mid c)\,P(c) ]} \tag{23}$$

$$P_{SPE}(c \mid x_{new}) = \frac{P_{SPE}(c, x_{new})}{P_{SPE}(x_{new})} = \frac{P_{SPE}(x_{new} \mid c)\,P(c)}{\sum_{c=1}^{C} [ P_{SPE}(x_{new} \mid c)\,P(c) ]} \tag{24}$$

where $P(c)$, $c = 1, 2, \cdots, C$, are prior probabilities, which can be simply determined as $P(c) = n_c / n$, where $n_c$ is the number of data samples in operating mode $c$. The two monitoring statistic values $T_{c,i,new}^{2}$ and $SPE_{c,i,new}$ can be calculated as

$$\begin{aligned} t_{c,i,new} &= (P_{c,i}^{sub})^{T} x_{new}(d_i,1) \\ T_{c,i,new}^{2} &= (t_{c,i,new})^{T} (\Lambda_{c,i}^{sub})^{-1} (t_{c,i,new}) \end{aligned} \tag{25}$$

$$\begin{aligned} e_{c,i,new} &= x_{new}(d_i,1) - P_{c,i}^{sub} t_{c,i,new} \\ SPE_{c,i,new} &= (e_{c,i,new})^{T} (e_{c,i,new}) \end{aligned} \tag{26}$$

where $i = 1, 2, \cdots, k$, and $\Lambda_{c,i}^{sub}$ is the eigenvalue matrix of the i-th local PCR model in the c-th operating mode. As a result, the quality estimation results in different operating modes can finally be combined as

$$\begin{aligned} P[c \mid x_{new}] &= \{ P_{T^2}[c \mid x_{new}] + P_{SPE}[c \mid x_{new}] \} / 2 \\ y_{new} &= \sum_{c=1}^{C} P[c \mid x_{new}]\, y_{c,new} \end{aligned} \tag{27}$$

where $y_{c,new}$ is the estimation result of the c-th operating mode. It should be noted that the aim of calculating the posterior probability of the new data sample in each operating mode is to determine the operating condition automatically. In addition, in some processes there are transition periods between different operating conditions. In this case, the new data sample may be assigned to several operating modes with different probabilities. However, if the operating mode of the new data sample can be determined beforehand, then only the corresponding model needs to be used for quality estimation.
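The mode-level combination of eqs. (21)-(24) and (27) can be sketched in plain Python as below. The names `mode_posteriors` and `combine_modes` are hypothetical, and the inputs are assumed to be precomputed ratios of each statistic to its confidence limit for every mode and subspace.

```python
import math

def mode_posteriors(t2_ratios, spe_ratios, n_samples):
    """Mode weights P[c | x_new].
    t2_ratios[c][i], spe_ratios[c][i]: statistic / limit for mode c, subspace i.
    n_samples[c]: number of training samples in mode c, giving P(c) = n_c / n."""
    n = sum(n_samples)
    priors = [nc / n for nc in n_samples]

    def posterior(ratios):
        # Eqs. (21)-(22): average the subspace likelihoods within each mode.
        lik = [sum(math.exp(-r) for r in row) / len(row) for row in ratios]
        # Eqs. (23)-(24): Bayes' rule over the modes.
        joint = [l * p for l, p in zip(lik, priors)]
        total = sum(joint)
        return [j / total for j in joint]

    p_t2 = posterior(t2_ratios)
    p_spe = posterior(spe_ratios)
    # Eq. (27): average the two posteriors.
    return [(a + b) / 2.0 for a, b in zip(p_t2, p_spe)]

def combine_modes(mode_weights, mode_predictions):
    """Eq. (27): weighted sum of the per-mode estimates y_{c,new} (lists of r values)."""
    r = len(mode_predictions[0])
    return [sum(w * y[q] for w, y in zip(mode_weights, mode_predictions)) for q in range(r)]
```

For a sample that clearly belongs to one mode, its weight approaches one and the result approaches that mode's estimate; during a transition the weights are spread over the neighboring modes.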


5. Industrial case studies

In this section, the quality estimation performance of the proposed method is evaluated through two industrial examples. The first is a fluid catalytic cracking unit (FCCU), which is the core unit of secondary oil processing. The other industrial example is a polypropylene production process, which often has multiple operating conditions. To evaluate the performance of the proposed model, the root mean square error (RMSE) criterion is used, which is defined as follows

$$RMSE = \sqrt{ \sum_{j=1}^{n_{te}} \| y_j - \hat{y}_j \|^{2} \Big/ n_{te} } \tag{28}$$

where $j = 1, 2, \cdots, n_{te}$, $y_j$ and $\hat{y}_j$ are the real and estimated values, respectively, and $n_{te}$ is the total number of test data samples.
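For a single quality variable, eq. (28) can be computed as in this short sketch. The function name `rmse` is illustrative, and scalar targets are assumed.

```python
import math

def rmse(y_true, y_pred):
    """Root mean square error over n_te test samples, eq. (28), for scalar targets."""
    n_te = len(y_true)
    squared_error = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    return math.sqrt(squared_error / n_te)
```

Applied separately to each quality variable, this yields one RMSE value per variable.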

5.1. FCCU process

5.1.1. System description

The traditional FCCU process consists of four subsystems: the reactor-regenerator subsystem, the fractionator subsystem, the absorber-stabilizer subsystem, and the gas sweetening subsystem. A simplified flowchart of the FCCU process is shown in Figure 2.[19] The running condition of the FCCU process strongly affects the yield of light oil in petroleum refining processes. Therefore, the FCCU plays an important role in the overall economic performance of the refinery process. The main function of the fractionator subsystem is to split the crude oil by fractional distillation. After the crude oil has been split, the main products include gasoline, light diesel oil, and liquefied petroleum gas. To control the quality of the product, these three yields should be estimated online. Conventionally, however, the yield rates can only be analyzed offline every 8 hours. Therefore, a significant delay is introduced, which affects the performance of the feedback control system.


Figure 2: A simplified flowchart of the FCCU process.[19]

To this end, a quality estimation method is necessary for online estimation and prediction of these important quality variables. For performance comparison of different quality estimation methods, the traditional PCR-based method, the PLS-based method, the SVR-based method,[14] kernel PCR,[42] kernel PLS,[11] and the proposed linear subspace PCR method (LSPCR) are all developed. Three product concentrations (gasoline, light diesel oil, and liquefied petroleum gas) are selected as quality variables, while six other process variables, which are highly correlated with the quality variables, are chosen as the input vector of the quality estimation model. Detailed descriptions of both input and output variables of the quality estimation model are given in Table 1. It should be noted that the unit of each concentration value in the three quality variables is percentage, i.e. between 0% and 100%. The modeling and test datasets are both collected from the distributed control system and the daily laboratory analysis of an industrial FCCU refinery in China. After data preprocessing, a total of 104 data samples were obtained, of which 74 are used for modeling and the remaining 30 for testing. Data characteristics of both input and output variables are shown in Figure 3. The computer used in this case study is configured as follows. CPU: Intel Core i5-4590, 3.30 GHz; Memory: 4 GB; Operating System: Windows 7; Simulation Platform: MATLAB 2013.


Table 1. Selected process and quality variables in FCCU process

| No. | Process variable description | Minimum value | Maximum value | Mean value | Standard deviation |
|-----|------------------------------|---------------|---------------|------------|--------------------|
| 1 | Flow of crude oil | 95.0 t/h | 110.0 t/h | 103.4554 t/h | 2.9114 t/h |
| 2 | Flow of cycle oil | 10.0 t/h | 35.0 t/h | 22.6135 t/h | 5.7656 t/h |
| 3 | Catalyst reactor temperature | 490.0 ℃ | 515.0 ℃ | 504.3595 ℃ | 3.8242 ℃ |
| 4 | Top temperature of the main fractionating tower | 100.0 ℃ | 120.0 ℃ | 109.8770 ℃ | 3.1809 ℃ |
| 5 | Extraction temperature | 225.0 ℃ | 250.0 ℃ | 235.1581 ℃ | 6.8288 ℃ |
| 6 | Bottom temperature of stabilization tower | 165.0 ℃ | 175.0 ℃ | 169.5216 ℃ | 2.3795 ℃ |

| No. | Quality variable description | Minimum value | Maximum value | Mean value | Standard deviation |
|-----|------------------------------|---------------|---------------|------------|--------------------|
| 1 | Gasoline | 0 | 1 | 43.0811% | 2.0256% |
| 2 | Light diesel oil | 0 | 1 | 22.0676% | 1.9326% |
| 3 | Liquefied petroleum gas | 0 | 1 | 14.4189% | 1.3750% |

Figure 3: Data characteristics of input and output variables in FCCU process: (a) input variables versus samples; (b) output variables versus samples.


Table 2: Quality estimation results (RMSE and relative prediction error) of different methods

| Quality variables / methods | LSPCR | PCR | PLS | SVR | Kernel PCR | Kernel PLS | Bagging PCR | Random subspace PLS | Ensemble GPR model |
|-----------------------------|-------|-----|-----|-----|------------|------------|-------------|---------------------|--------------------|
| gasoline | 1.5455 | 1.7158 | 1.6012 | 1.3460 | 1.5203 | 1.5187 | 1.6543 | 1.6023 | 1.5551 |
| light diesel oil | 1.6606 | 1.7593 | 1.6679 | 1.7795 | 1.7014 | 1.6601 | 1.7432 | 1.6656 | 1.6595 |
| liquefied petroleum gas | 1.0258 | 1.1213 | 1.0413 | 1.0594 | 1.0321 | 1.0221 | 1.0987 | 1.0399 | 1.0287 |
| CPU running time-offline (s) | 7.9 | 1.1 | 1.6 | 117.1 | 44 | 45.7 | 9.6 | 10.8 | 157.9 |
| CPU running time-online (s) | 0.8 | 0.4 | 0.6 | 1.5 | 1.6 | 1.5 | 0.9 | 0.8 | 1.9 |

5.1.2.

Illustrations and results

Before the model is developed, outliers and missing data must be handled. In the present paper, outliers are eliminated with the PCA method, and samples with missing data are simply deleted from both the calibration and validation datasets. For LSPCR model development, an initial PCA model is first calculated for linear subspace construction. Three principal components are retained in this PCA model, which together explain over 85% of the process data information. Therefore, a total of three linear subspaces can be constructed along the three principal directions of the initial PCA model. To model each linear subspace, four dominant variables have been selected and used to construct the local PCR model in that subspace. Each local PCR model retains enough principal components to explain over 85% of the data information of the corresponding subspace. To provide a reasonable result for each subspace, the linearity among the selected variables in each subspace has been evaluated. It turns out that the linearity is greatly enhanced by dividing the original space into different subspaces. The numbers of latent variables in the conventional PCR and PLS models have been determined through the leave-one-out cross-validation strategy. In the SVR-based quality estimation model, the Gaussian kernel has been used, with the width parameter set to 2. In both the kernel PCR and kernel PLS methods, the parameters are tuned by trial and error in order to obtain the best modeling performance.

The RMSE values of the different methods are tabulated in Table 2, from which it can be found that the prediction performance of the LSPCR method is comparable to that of the kernel PCR and kernel PLS methods. The CPU computation times of the different methods have also been evaluated; detailed results are provided in the last two rows of Table 2. It can be seen that the computational burden of the linear modeling methods is much lower than that of the nonlinear modeling methods such as SVR, kernel PCR, and kernel PLS. Therefore, even though the linear subspace method has prediction performance similar to that of the nonlinear modeling methods, it still has the advantage of a lower computational burden. In addition, comparison results for several ensemble learning based methods are provided in Table 2. The proposed method gives results comparable to those of the ensemble nonlinear model and performs better than the two ensemble linear models. Again, the computational complexity of the ensemble nonlinear model is much higher than that of the proposed method.
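The model-development steps described above (an initial PCA model retaining enough components to explain over 85% of the variance, one linear subspace per principal direction, four dominant variables per subspace, and a local PCR model in each subspace) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the data are synthetic, and the rule that "dominant" variables are those with the largest absolute loadings is an assumption made here for illustration.

```python
import numpy as np

def autoscale(X):
    """Zero-mean, unit-variance scaling; returns scaled data, mean, std."""
    mu, sd = X.mean(axis=0), X.std(axis=0, ddof=1)
    return (X - mu) / sd, mu, sd

def retained_loadings(Xs, var_target=0.85):
    """Loadings of the fewest principal components whose cumulative
    explained variance reaches var_target (Xs is already autoscaled)."""
    _, s, Vt = np.linalg.svd(Xs, full_matrices=False)
    explained = np.cumsum(s**2) / np.sum(s**2)
    k = min(int(np.searchsorted(explained, var_target)) + 1, len(s))
    return Vt[:k].T  # shape (n_variables, k)

def dominant_variables(loadings, n_dominant=4):
    """ASSUMPTION: the dominant variables of a principal direction are the
    ones with the largest absolute loadings on that direction."""
    return [np.sort(np.argsort(-np.abs(loadings[:, j]))[:n_dominant])
            for j in range(loadings.shape[1])]

def fit_local_pcr(X_sub, y, var_target=0.85):
    """Local PCR: least-squares regression of y on the retained PCA scores."""
    Xs, mu, sd = autoscale(X_sub)
    P = retained_loadings(Xs, var_target)
    coef, *_ = np.linalg.lstsq(Xs @ P, y - y.mean(), rcond=None)
    return {"mu": mu, "sd": sd, "P": P, "coef": coef, "y_mean": y.mean()}

def predict_local_pcr(model, X_sub):
    """Project new data on the stored loadings and apply the regression."""
    scores = ((X_sub - model["mu"]) / model["sd"]) @ model["P"]
    return scores @ model["coef"] + model["y_mean"]

# Synthetic stand-in data (hypothetical; not the FCCU data set).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = X[:, 0] - 0.5 * X[:, 3] + 0.1 * rng.normal(size=200)

loadings = retained_loadings(autoscale(X)[0])            # initial PCA model
subspaces = dominant_variables(loadings, n_dominant=4)   # one index set per subspace
models = [fit_local_pcr(X[:, idx], y) for idx in subspaces]
```

With uncorrelated synthetic data the 85% threshold keeps many more components than the three reported for the FCCU data; the sketch is only meant to show how the pieces fit together.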

Figure 4: Posterior probabilities of the test data samples in different linear subspaces (three panels; horizontal axis: samples; vertical axis: posterior probability)

Table 3. Quality estimation results (RMSE) of different local PCR models in FCCU process
Quality variable           1st local PCR   2nd local PCR   3rd local PCR
gasoline                   2.0664          1.4775          1.6098
light diesel oil           1.7598          1.8390          1.7452
liquefied petroleum gas    1.0129          1.0353          1.0770


To examine the extent to which each local subspace PCR model is involved, the posterior probability of each data sample in each subspace is plotted in Figure 4. As can be seen from this figure, the three subspaces have similar weights in the linear subspace combination. Since the posterior probability is calculated from the two monitoring statistics (T2 and SPE), their monitoring results should be in accordance with the results presented in Figure 4. Thus, most of the data samples should have similar monitoring results under each local PCR model, as shown in Figure 5. In addition, the quality estimation result of each local PCR model is shown in Table 3.
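To make the link between the monitoring statistics and the subspace weights concrete, the sketch below computes the T2 and SPE statistics of one sample in the standard way and then turns them into posterior weights with Bayes' rule. The exponential likelihood used here is an assumption made for illustration; the paper defines its own likelihood in an earlier section, which may differ. The control limits are taken as given inputs, and all function names and numbers are hypothetical.

```python
import numpy as np

def t2_spe(x, P, score_var):
    """Hotelling T2 and SPE of one scaled sample x for a PCA model with
    loadings P (n_vars, k) and score variances score_var (k,)."""
    t = P.T @ x                          # scores of the sample
    t2 = float(np.sum(t**2 / score_var))
    resid = x - P @ t                    # part of x not explained by the model
    spe = float(resid @ resid)
    return t2, spe

def subspace_posteriors(t2, spe, t2_lim, spe_lim, prior=None):
    """ASSUMED likelihood per subspace: exp(-T2/T2_lim) * exp(-SPE/SPE_lim),
    normalized over the subspaces with Bayes' rule."""
    t2, spe = np.asarray(t2, float), np.asarray(spe, float)
    like = np.exp(-t2 / np.asarray(t2_lim, float)) \
         * np.exp(-spe / np.asarray(spe_lim, float))
    if prior is None:
        prior = np.full(like.shape, 1.0 / like.size)  # equal priors
    post = like * np.asarray(prior, float)
    return post / post.sum()

def combine(posteriors, local_predictions):
    """Posterior-weighted combination of the local PCR predictions."""
    return float(np.dot(posteriors, local_predictions))

# Hypothetical numbers: one-component model in two variables.
t2, spe = t2_spe(np.array([1.0, 0.0]), np.array([[1.0], [0.0]]), np.array([2.0]))

# Three subspaces with identical statistics receive identical weights.
post = subspace_posteriors([2.0, 2.0, 2.0], [1.0, 1.0, 1.0],
                           [10.0, 10.0, 10.0], [5.0, 5.0, 5.0])
y_hat = combine(post, [1.0, 2.0, 3.0])
```

Under this assumed likelihood, subspaces whose monitoring statistics are similar for a given sample receive similar weights, which is the behaviour the text reads off Figures 4 and 5.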

Figure 5: PCA monitoring results of the test data samples (T2 and SPE statistics versus sample number): (a) first linear subspace; (b) second linear subspace; (c) third linear subspace.

5.2. Polypropylene production process

Polypropylene is an important industrial material that is widely used in many fields, such as the chemical, light, and medical industries. With increasing market demand for different product types, producing polypropylene with different characteristics has become a key issue for this industrial process. A typical flowchart of the process is given in Figure 6. In this process, about 40 variables are measured online. As an important indicator of product quality, the melt index has been selected as the quality variable. A total of 14 process variables (listed in Table 4), which are highly correlated with the melt index, have been selected for quality estimation model development. As in the previous case study, outliers and missing data samples have been handled before model training and validation.

Figure 6: Flowchart of the polypropylene production process (labels in the flowchart: catalyst, catalytic body system, hydrogen, propylene, Reactor #1, Reactor #2, Reactor #3, polypropylene)

Table 4. Selected variables for quality estimation
No.   Variable description                            No.   Variable description
1     Hydrogen concentration of the first reactor     8     Propylene feed of the first reactor
2     Hydrogen concentration of the second reactor    9     Propylene feed of the second reactor
3     Density of the first reactor                    10    Power for the first reactor
4     Density of the second reactor                   11    Power for the second reactor
5     TEAL flow                                       12    Level of the second reactor
6     DONOR flow                                      13    Temperature of the first reactor
7     Atmer-163 flow                                  14    Temperature of the second reactor

To train the two-dimensional LSPCR (TD-LSPCR) model, 100 data samples have been collected under each of the three operating conditions. The first three process variables are plotted in Figure 7, which clearly shows the multimode characteristic of the data samples. Initially, under each operating mode, five principal components have been selected for the PCA model, from which five different linear subspaces are constructed. Then eight dominant variables have been selected in each linear subspace to build the local PCR model in the corresponding operating mode. Therefore, a total of 15 local linear PCR models (three operating modes with five subspaces each) have been built. As in the previous case study, the linearity among the selected variables in each subspace is improved compared with that among the variables in the original space.
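The layout of the 15 local models can be pictured as a grid indexed by operating mode and subspace. The sketch below builds such a grid of variable index sets for three modes, five principal directions per mode, and eight dominant variables out of 14. It is a hedged illustration only: the data are synthetic, the helper name is hypothetical, and selecting dominant variables by largest absolute loading is an assumption rather than the paper's stated rule.

```python
import numpy as np

def dominant_sets(X, n_pc, n_dominant):
    """Index sets of the n_dominant variables with the largest absolute
    loadings on each of the first n_pc principal directions (ASSUMED rule)."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    _, _, Vt = np.linalg.svd(Xs, full_matrices=False)
    return [np.sort(np.argsort(-np.abs(Vt[j]))[:n_dominant])
            for j in range(n_pc)]

# Synthetic stand-in: 3 operating modes, 100 samples each, 14 variables.
rng = np.random.default_rng(1)
modes = {m: rng.normal(loc=float(m), size=(100, 14)) for m in range(3)}

# One entry per (mode, subspace) pair; each would host one local PCR model.
grid = {(m, s): idx
        for m, X in modes.items()
        for s, idx in enumerate(dominant_sets(X, n_pc=5, n_dominant=8))}

print(len(grid))
```

Each key of `grid` identifies one local model, so the dictionary has 3 x 5 = 15 entries, matching the count given in the text.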

Figure 7: Three-dimensional data characteristic of different operation modes (axes: first variable, second variable, third variable; legend: first mode, second mode, third mode)

To evaluate the performance of the TD-LSPCR model, 20 data samples have been collected in each operating mode, so a total of 60 samples are used for model testing. First, the mode posterior probabilities of each test data sample are examined; they are presented in Figure 8. As can be seen, the first 20 data samples are from the first operating mode of the process, because the mode posterior probability corresponding to the first operating mode is very close to one while the other two posterior probabilities remain close to zero. However, the mode posterior probabilities of the remaining 40 data samples show some mixed characteristics between the second and third operating modes. It can be found in Figure 8 that data samples 21-40 have higher posterior probabilities in the second operating mode, while data samples 41-60 have higher posterior probabilities in the third operating mode. The estimation results of the TD-LSPCR model, the three single LSPCR models, several nonlinear modeling methods, and the ensemble learning based methods are tabulated together in Table 5, from which it can be found that the TD-LSPCR method obtains the best estimation result (smallest RMSE value). The CPU running times of these methods are also compared and are listed in the last two columns of Table 5. It can be found that the computational burden of the TD-LSPCR method is much lower than that of the nonlinear modeling methods (24.6 s offline, against 97 s to 320 s for SVR, kernel PCR, kernel PLS, and the ensemble GPR model). Detailed results of the TD-LSPCR and single LSPCR models are shown in Figure 9. Clearly, compared with the results of the single LSPCR models, the quality estimation performance is greatly improved by the TD-LSPCR method.
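The two levels of combination discussed above (subspaces within a mode, then modes) can be written as a nested weighted sum. The sketch below assumes that the final estimate is the mode-posterior-weighted average of the per-mode LSPCR estimates, each of which is itself a subspace-posterior-weighted average of local PCR predictions; the paper's exact combination rule is defined in an earlier section and may differ in detail. The function name and all numbers are hypothetical.

```python
import numpy as np

def td_lspcr_estimate(mode_post, sub_post, local_pred):
    """Two-level weighted combination (ASSUMED form).

    mode_post  : (M,)   mode posterior probabilities, summing to 1
    sub_post   : (M, S) subspace posterior probabilities, rows summing to 1
    local_pred : (M, S) predictions of the local PCR models
    """
    mode_post = np.asarray(mode_post, dtype=float)
    sub_post = np.asarray(sub_post, dtype=float)
    local_pred = np.asarray(local_pred, dtype=float)
    per_mode = np.sum(sub_post * local_pred, axis=1)  # one LSPCR estimate per mode
    return float(mode_post @ per_mode), per_mode

# Hypothetical sample that clearly belongs to the first of three modes,
# with five equally weighted subspaces in every mode.
mode_post = [0.98, 0.01, 0.01]
sub_post = np.full((3, 5), 0.2)
local_pred = np.array([[60.0] * 5, [59.0] * 5, [61.0] * 5])

y_hat, per_mode = td_lspcr_estimate(mode_post, sub_post, local_pred)
```

When one mode posterior is close to one, as for samples 1-20 in Figure 8, the estimate is dominated by that mode's LSPCR model; mixed posteriors, as for samples 21-60, blend the per-mode estimates.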

Figure 8: Mode posterior probabilities of the test data samples in different operation modes (horizontal axis: samples; vertical axis: mode posterior probability): (a) first operation mode; (b) second operation mode; (c) third operation mode.

Figure 9: Estimation results of two-dimensional and single LSPCR models (estimated and real values of the quality variable versus sample number): (a) two-dimensional LSPCR; (b) first single LSPCR; (c) second single LSPCR; (d) third single LSPCR.

Table 5. Quality estimation results (RMSE) of one and two dimensional LSPCR models
Estimation method/Quality variable      y        CPU running time-offline (s)   CPU running time-online (s)
One-dimensional LSPCR, First LSPCR      0.3256   12                             0.9
One-dimensional LSPCR, Second LSPCR     1.3427   12.4                           0.8
One-dimensional LSPCR, Third LSPCR      0.8751   12.2                           0.9
Two-dimensional LSPCR                   0.3027   24.6                           1.2
SVR                                     0.5687   299.7                          1.7
Kernel PCR                              0.6543   97                             1.6
Kernel PLS                              0.6123   99.6                           1.7
Bagging PCR                             0.8345   12.9                           0.7
Random subspace PLS                     0.7654   12.8                           0.6
Ensemble GPR model                      0.4768   320                            1.9

It is noted that the RMSE value of the first single LSPCR model is much smaller than those of the other two single LSPCR models. To examine the reason for this, the estimation results of each linear subspace PCR model in the three operating modes are given in Table 6. In addition, the subspace posterior probabilities of all test data samples in the different linear subspaces are provided in Figure 10. It can be found that while the first 20 data samples have similar posterior probabilities in the different subspaces, the posterior probabilities of the remaining 40 data samples are concentrated mostly in the fourth and fifth subspaces. Note that, in the first LSPCR model, the RMSE values of the fourth and fifth subspaces are smaller than those of the first three subspaces (Table 6); therefore, the overall RMSE of the first single LSPCR model is much smaller than those of the other two single LSPCR models. More precisely, detailed results for the different data samples under the single local PCR models are presented in Table 7. The lowest RMSE values of the sub-datasets (1-20, 21-40, and 41-60) predicted by each local PCR model in the different single LSPCR spaces are highlighted in bold. It can be seen that most data samples have low RMSE values when they are estimated by the local PCR model developed in the corresponding operating mode.


Table 6. RMSE of different local PCR models in each one-dimensional LSPCR model
One-dimensional LSPCR   1st local PCR   2nd local PCR   3rd local PCR   4th local PCR   5th local PCR
First LSPCR             1.2078          1.8878          1.4320          0.9963          0.3795
Second LSPCR            0.7055          0.2483          2.9677          3.2573          3.2608
Third LSPCR             2.3916          0.4228          2.3475          1.7480          2.1424

Table 7. Results predicted by different local PCR models in each LSPCR model space
Local PCR       Sample number   First LSPCR   Second LSPCR   Third LSPCR
1st local PCR   1-20            0.4489        0.9617         3.2097
                21-40           1.0926        0.3926         2.6086
                41-60           1.7265        0.6436         0.2275
2nd local PCR   1-20            0.3391        0.1781         0.3760
                21-40           3.2177        0.3815         0.5952
                41-60           0.4718        0.0878         0.2017
3rd local PCR   1-20            0.3770        4.8721         2.9922
                21-40           2.3481        0.4907         2.7468
                41-60           0.7044        1.5629         0.1847
4th local PCR   1-20            0.2726        5.6128         2.9280
                21-40           1.3121        0.4499         0.7500
                41-60           1.0872        0.3514         0.1761
5th local PCR   1-20            0.4717        5.4578         3.6582
                21-40           0.1806        0.3340         0.5832
                41-60           0.4205        1.4140         0.2167
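Tables 6 and 7 can be related through the standard rule for pooling RMSE values over disjoint subsets: the overall RMSE is the square root of the sample-weighted mean of the squared subset RMSEs. The short check below applies this rule to the three sub-dataset values of the 1st local PCR model, assuming that each sub-dataset contains 20 samples (as stated for the test set) and that the Table 6 entries are computed over all 60 test samples. The function name is hypothetical.

```python
import math

def pooled_rmse(rmses, sizes):
    """Overall RMSE of disjoint subsets from per-subset RMSEs and sizes."""
    total_sq_error = sum(n * r**2 for r, n in zip(rmses, sizes))
    return math.sqrt(total_sq_error / sum(sizes))

sizes = [20, 20, 20]  # samples 1-20, 21-40, 41-60

# 1st local PCR model, values taken from Table 7.
first = pooled_rmse([0.4489, 1.0926, 1.7265], sizes)   # First LSPCR column
second = pooled_rmse([0.9617, 0.3926, 0.6436], sizes)  # Second LSPCR column
third = pooled_rmse([3.2097, 2.6086, 0.2275], sizes)   # Third LSPCR column

print(first, second, third)
```

Under these assumptions the pooled values agree, to within rounding, with the 1st local PCR column of Table 6 (1.2078, 0.7055, and 2.3916), which supports reading Table 7 as a per-sub-dataset breakdown of Table 6.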

Figure 10: Subspace posterior probabilities of different operation modes (one panel per subspace; horizontal axis: samples; vertical axis: subspace posterior probability): (a) first operation mode; (b) second operation mode; (c) third operation mode.

6. Conclusions

In the present paper, a new linear subspace PCR model has been proposed for quality estimation of nonlinear processes. Compared with the traditional PCR, PLS, and SVR methods, the new LSPCR method has obtained better performance. In addition, the contribution of each subspace estimation result can be evaluated through the posterior probability, which is calculated from the monitoring statistics by Bayesian inference. Under the linear subspace modeling structure, the basic LSPCR model has been extended to a two-dimensional form by introducing an additional information combination direction, which can be effectively used for quality estimation of processes with both nonlinear and multimode behaviors. Two industrial case studies show the feasibility and efficiency of the proposed methods. The advantage of the developed method in terms of CPU running time could be significant if online modeling is required or if the model needs to be updated because of changes in the process conditions.

It is worth noting that the linear subspace modeling idea can be extended to other methods in order to handle more complex cases. For example, the basic PCA model can be used to construct multiple linear subspaces, with different models then used in different subspaces; alternatively, the linear subspaces can be constructed by other methods while the modeling within each subspace remains similar to the original model. However, how to select the structure of the linear model is worth further investigation. Another important issue is how to effectively determine the number of linear subspaces, which is a quite practical problem. Generally, this should be closely related to the nature of the process. Therefore, further research on this topic may take the data characteristics into consideration. As a result, different ranges and different types of linear subspaces may be built to accommodate various process conditions.

Acknowledgement

This work was supported in part by the National Natural Science Foundation of China (61370029).

References

1. Kano, M.; Nakagawa, Y. Data-based process monitoring, process control and quality improvement: recent developments and applications in steel industry. Comput. Chem. Eng. 2008, 32, 12-24.
2. Kadlec, P.; Gabrys, B.; Strandt, S. Data-driven soft sensors in the process industry. Comput. Chem. Eng. 2009, 33, 795-814.
3. Qin, S. Survey on data-driven industrial process monitoring and diagnosis. Annual Reviews in Control 2012, 36, 220-234.
4. Ge, Z.; Song, Z.; Gao, F. Review of Recent Research on Data-Based Process Monitoring. Ind. Eng. Chem. Res. 2013, 52, 3543-3562.
5. Zhu, J.; Ge, Z.; Song, Z. Robust modeling of mixture probabilistic principal component analysis and process monitoring application. AIChE J. 2014, 60, 2143-2157.
6. Yu, J.; Rashid, M. M. A novel dynamic bayesian network-based networked process monitoring approach for fault detection, propagation identification, and root cause diagnosis. AIChE J. 2013, 59, 2348-2365.
7. Yan, Z.; Huang, B.; Yao, Y. Multivariate statistical process monitoring of batch-to-batch startups. AIChE J. 2015, 61, 3719-3727.
8. Jiang, Q.; Yan, X. Monitoring multi-mode plant-wide processes by using mutual information-based multi-block PCA, joint probability, and Bayesian inference. Chem. Intell. Lab. Syst. 2014, 136, 121-137.
9. Kruger, U.; Chen, Q.; Sandoz, D.; McFarlane, R. Extended PLS Approach for Enhanced Condition Monitoring of Industrial Processes. AIChE J. 2001, 47, 2076-2091.
10. Zhang, K.; Hao, H.; Chen, Z.; Ding, S.; Peng, K. A comparison and evaluation of key performance indicator-based multivariate statistics process monitoring approaches. J. Process Control 2015, 33, 112-126.
11. Yu, J. Multiway Gaussian mixture model based adaptive kernel partial least squares regression method for soft sensor estimation and reliable quality prediction of nonlinear multiphase batch processes. Ind. Eng. Chem. Res. 2012, 51, 13227-13237.
12. Galicia, H.; He, Q.; Wang, J. Comparison of the performance of a reduced-order dynamic PLS soft sensor with different updating schemes for digester control. Contr. Eng. Prac. 2012, 20, 747-760.
13. Ni, W.; Tan, S.; Ng, W.; Brown, S. Localized adaptive recursive partial least squares regression for dynamic system modeling. Ind. Eng. Chem. Res. 2012, 51, 8025-8039.
14. Liu, Y.; Chen, J. Integrated soft sensor using just-in-time support vector regression and probabilistic analysis for quality prediction of multi-grade processes. J. Process Control 2013, 23, 793-804.
15. Gonzaga, J.; Meleiro, L.; Kiang, C.; Filho, R. ANN-based soft-sensor for real-time process monitoring and control of an industrial polymerization process. Comput. Chem. Eng. 2009, 33, 43-49.
16. Yan, X. Hybrid artificial neural network based on BP-PLSR and its application in development of soft sensors. Chem. Intell. Lab. Syst. 2010, 103, 152-159.
17. Bhattacharya, S.; Pal, K.; Pal, S. Multi-sensor based prediction of metal deposition in pulsed gas metal arc welding using various soft computing models. Applied Soft Computing 2012, 12, 498-505.
18. De Canete, J.; del Saz-Orozco, P.; Gonzalez, S.; Garcia-Moral, I. Dual composition control and soft estimation for a pilot distillation column using a neurogenetic design. Comput. Chem. Eng. 2012, 40, 157-170.
19. Liu, Y.; Hu, N.; Wang, H.; Li, P. Soft chemical analyzer development using adaptive least-squares support vector regression with selective pruning and variable moving window size. Ind. Eng. Chem. Res. 2009, 48, 5731-5741.
20. Ge, Z.; Song, Z. Nonlinear soft sensor development based on relevance vector machine. Ind. Eng. Chem. Res. 2010, 49, 8685-8693.
21. Kaneko, H.; Funatsu, K. Development of soft sensor models based on time difference of process variables with accounting for nonlinear relationship. Ind. Eng. Chem. Res. 2011, 50, 10643-10651.
22. Yu, J. A Bayesian inference based two-stage support vector regression framework for soft sensor development in batch bioprocesses. Comput. Chem. Eng. 2012, 41, 134-144.
23. Yoo, C. Nonlinear monitoring and prediction model in an industrial environmental process. J. Chemical Engineering Japan 2008, 41, 32-42.
24. Wibowo, A.; Desa, M. Kernel based regression and genetic algorithms for estimating cutting conditions of surface roughness in end milling machining process. Expert Systems with Application 2012, 12, 1634-11641.
25. Ge, Z.; Huang, B.; Song, Z. Nonlinear semi-supervised principal component regression for soft sensor modeling and its mixture form. J. Chemometrics 2014, 28, 793-804.
26. Liu, Y.; Chen, T.; Chen, J. Auto-Switch Gaussian Process Regression-Based Probabilistic Soft Sensors for Industrial Multigrade Processes with Transitions. Ind. Eng. Chem. Res. 2015, 54, 5037-5047.
27. Gholami, A.; Shahbazian, M.; Safian, G. Soft Sensor Development for Distillation Columns Using Fuzzy C-Means and the Recursive Finite Newton Algorithm with Support Vector Regression (RFN-SVR). Ind. Eng. Chem. Res. 2015, 54, 12031-12039.
28. Li, Z.; Kruger, U.; Xie, L.; Almansoori, A.; Su, H. Adaptive KPCA modeling of nonlinear systems. IEEE T. Signal Processing 2015, 63, 2364-2376.
29. Yang, K.; Jin, H.; Chen, X.; Dai, J.; Wang, L.; Zhang, D. Soft sensor development for online quality prediction of industrial batch rubber mixing process using ensemble just-in-time Gaussian process regression models. Chem. Intell. Lab. Syst. 2016, 155, 170-182.
30. He, Y.; Geng, Z.; Zhu, Q. Soft sensor development for the key variables of complex chemical processes using a novel robust bagging nonlinear model integrating improved extreme learning machine with partial least square. Chem. Intell. Lab. Syst. 2016, 151, 78-88.
31. Jin, H.; Chen, X.; Wang, L.; Yang, K.; Wu, L. Adaptive Soft Sensor Development Based on Online Ensemble Gaussian Process Regression for Nonlinear Time-Varying Batch Processes. Ind. Eng. Chem. Res. 2015, 54, 7320-7345.
32. Shao, W.; Tian, X. Adaptive soft sensor for quality prediction of chemical processes based on selective ensemble of local partial least squares models. Chem. Eng. Res. Des. 2015, 95, 113-132.
33. Liu, Y.; Zhang, Z.; Chen, J. Ensemble local kernel learning for online prediction of distributed product outputs in chemical processes. Chem. Eng. Sci. 2015, 137, 140-151.
34. Kaneko, H.; Funatsu, K. Ensemble locally weighted partial least squares as a just-in-time modeling method. AIChE J. 2016, 62, 717-725.
35. Tong, C.; Lang, T.; Shi, X. Soft sensing of non-Gaussian processes using ensemble modified independent component regression. Chem. Intell. Lab. Syst. 2016, 157, 120-126.
36. Shao, W.; Tian, X. Semi-supervised selective ensemble learning based on distance to model for nonlinear soft sensor development. Neurocomputing 2017, 222, 91-104.
37. Feital, T.; Kruger, U.; Dutra, J.; Pinto, J.; Lima, E. Modeling and performance monitoring of multivariate multimodal processes. AIChE J. 2013, 59, 1557-1569.
38. Liu, Y.; Wang, F.; Chang, Y.; Ma, R. Operating optimality assessment and nonoptimal cause identification for non-Gaussian multimode processes with transitions. Chem. Eng. Sci. 2015, 137, 106-118.
39. Ge, Z. Mixture Bayesian Regularization of PCR Model and Soft Sensing Application. IEEE T. Ind. Elect. 2015, 62, 4336-4343.
40. Wang, F.; Tan, S.; Yang, Y.; Shi, H. Hidden Markov Model-Based Fault Detection Approach for a Multimode Process. Ind. Eng. Chem. Res. 2016, 55, 4613-4621.
41. Jiang, Q.; Huang, B.; Yan, X. GMM and optimal principal components-based Bayesian method for multimode fault diagnosis. Comput. Chem. Eng. 2016, 84, 338-349.
42. Yuan, X.; Ge, Z.; Song, Z. Locally weighted kernel PCR model for soft sensing of nonlinear time-variant processes. Ind. Eng. Chem. Res. 2014, 53, 13736-13749.
