Quality-related locally weighted non-Gaussian regression based soft sensing for multimode processes

Yuchen He1, Binbin Zhu1, Chenyang Liu1, Jiusun Zeng2*

1College of Mechanical & Electrical Engineering, China Jiliang University, Hangzhou 310018, Zhejiang, China
2College of Metrology & Measurement Engineering, China Jiliang University, Hangzhou 310018, Zhejiang, China
Abstract

This paper develops a novel just-in-time (JIT) learning based soft sensor for multimode processes. The involved multimode datasets are assumed to be non-Gaussian distributed and time varying. A supervised non-Gaussian latent structure (SNGLS) is introduced to model the relationship between predictor variables and quality variables. In order to handle the multimode process, a moving window approach is adopted, based on which a new similarity measure is proposed by integrating window confidence and between-sample local similarity. The similarity between the current query sample and the dataset in a specific window is quantified by the window confidence using the support vector data description (SVDD). Based on the data in the moving window, the SNGLS model is constructed and used to estimate the between-sample local similarity. The two similarities are integrated and used as sample weights, and a locally weighted structure is designed for key quality variable estimation. The performance of the developed method is demonstrated by application studies on the Tennessee Eastman (TE) process and a pre-decarburization absorption unit. It is shown that the proposed method outperforms competitive methods in the prediction accuracy of key quality variables.
Keywords: Multimode processes; Locally weighted non-Gaussian regression; Just-in-time learning; Soft sensor
1. Introduction

In the past few decades, soft sensors have been widely employed as an important technique to predict key variables in industrial processes. It is a common practice to construct regression models to describe the relationship between easy-to-measure process variables and key quality variables that are difficult to measure online1-4. By introducing these regression models, it is possible to predict the variables that are of interest. Among different kinds of soft sensing models, data-driven multivariate calibration methods have been widely used. Classical data-driven multivariate calibration methods, such as principal component regression (PCR)5,6, partial least squares (PLS)7,8, kernel partial least squares (KPLS)9,10 and support vector machines (SVM)11,12, have been successfully applied to soft sensor modeling in fields like chemical engineering and medical processes.
For soft sensor models, it is often the case that they perform well in the early stage. However, due to variations of system and material conditions, changes of process dynamics are inevitable. The performance of soft sensors may deteriorate as time evolves if the model is not consistent with the current situation. Hence, soft sensors should be routinely updated to maintain their predictive accuracy, which is time-consuming and costly. As a consequence, a large number of methods have been proposed to cope with this situation. Among them, the most popular is the just-in-time learning (JITL) method13,14. The traditional JITL method mainly consists of three steps: selection of similar samples, construction of online models, and making predictions according to the query sample, among which the selection of similar samples and the construction of online models are the most critical.
In JITL methods, proper sample selection will lead to good prediction performance. Several kinds of similarity measures have been proposed for sample selection, such as Euclidean distance and subspace angle15,16. When defining a similarity measure, most methods consider the similarity between the inputs of the training samples and the query sample, whilst the output information in the training samples is largely ignored. However, the absence of output data may result in inappropriate sample selection due to loss of information17. In order to cope with this problem, researchers have proposed several schemes to employ the output information in the similarity measures. One possibility is to predict the output of the query sample using an appropriate model first, and then define the similarity between the training samples and the query sample in both the input and output spaces18,19. Note that since the output of the query sample in those methods is estimated based on the input data, no new information is added. In addition, introducing a new model may bring in more uncertainty. Hence, such methods are problematic.
Bearing this in mind, Yuan et al. constructed a PLS model within a supervised latent space20. In this method, the inputs of both the training samples and the query sample are projected into the latent space, which are further employed to construct the similarity measure. In this way, the output information is utilized in the modeling stage, and the proposed method was shown to be more accurate than conventional methods. Despite the above progress, two problems still remain. As is shown in the work of Yuan20, the supervised latent structure is built up based on the assumptions that the process is Gaussian distributed and relatively stable. Unfortunately, both assumptions do not hold for many practical processes. In addition, traditional locally weighting techniques only consider the one-to-one similarity between the query and training samples, whilst ignoring the mode information of the training samples. In practice, even the same inputs may produce sharply different output values in different operational modes. Therefore, it is necessary to consider the mode information in soft sensor development.
In this paper, a locally weighted non-Gaussian regression method is proposed using the SNGLS method, which is a non-Gaussian extension of traditional independent component regression (ICR) models21. Instead of maximizing covariance, the latent structure is established by maximizing the mutual information between input and output variables22,23. The SNGLS considers not only the second-order information, but also the higher-order information omitted in the PLS model; hence it is more suitable for modeling non-Gaussian processes. In order to cope with the multimode characteristics of the process, a moving window strategy is applied. The similarity between the query sample and a training sample is defined by integrating the window confidence and the between-sample similarity. The window confidence between the query sample and a data window is estimated using the SVDD classifier. On the other hand, the similarity between the query sample and a specific sample (the between-sample similarity) is obtained using the SNGLS model. The reasons for using the integrated similarity are twofold. As the window confidence only considers the similarity between the query sample and the whole data window, it ignores the differences between samples in the same window. Meanwhile, the between-sample similarity is estimated on an individual basis, so it ignores the mode information generally captured in a data window. In this paper, by using the integrated similarity, both the window confidence and the local sample-to-sample similarity are considered, so that more information is taken into account and the method is more suitable for modeling multimode processes.
Compared with traditional JITL methods, which define similarity measures using only the information in the input space whilst ignoring the information in the output space, in this paper we consider the information in both the input and output spaces using the SNGLS. Moreover, both the window confidence and the between-sample similarity are considered in the similarity measure, so that better modeling performance can be achieved.
The remainder of this paper is organized as follows. Sample selection using supervised structures is explained in Section 2. The NGR (non-Gaussian regression) and SVDD methods are briefly introduced as preliminaries in Section 3. Section 4 provides the details of the SNGLS. The locally weighted non-Gaussian regression is discussed in Section 5, where sample selection and the supervised non-Gaussian regression are implemented for multimode process soft sensing. In Section 6, the performance of the proposed method is demonstrated through two case studies including the Tennessee Eastman (TE) benchmark and an industrial process. Finally, conclusions are drawn in Section 7.
2. Similarity measurement and sample selection using supervised latent space

In conventional JITL soft sensors, the similarities between the query sample and the historical samples should be calculated first. These similarities show the importance of individual historical samples in model construction. Samples with larger similarities are commonly assigned larger weights in the training model. A good similarity measure can greatly enhance the accuracy of online modeling. According to the different types of data information involved, the similarity measures in the literature can be roughly divided into three categories.
2.1 Input information of historical data and the query sample is used

In this category, only the inputs of the historical data and the query sample are involved in the similarity measure, while the outputs are not considered. Among the numerous similarity measures proposed in the literature, Euclidean distance based similarity is perhaps the most popular, which is expressed as follows:

$$DIS = \sqrt{(x_{new} - x_i)^T (x_{new} - x_i)} \qquad (1)$$

where $x_i$ and $DIS$ represent the $i$th historical sample and its corresponding Euclidean distance to the new query sample $x_{new}$. Other relevant similarity measures include vector-angle based and relevance based methods12,24, among others. The problem with these similarity measures is that the output of the historical data is totally neglected, which may result in loss of important information and have a negative impact on the prediction accuracy of the developed soft sensors.
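To make the Euclidean distance based selection concrete, the following minimal sketch (Python/NumPy, not part of the original paper) computes Eq. (1) for every historical sample and converts the distances into weights; the Gaussian kernel and its bandwidth `phi` are illustrative assumptions, not parameters taken from the paper.

```python
import numpy as np

def euclidean_similarity_weights(X_hist, x_new, phi=1.0):
    """Eq. (1): Euclidean distance of each historical sample to the query,
    converted to weights with a Gaussian kernel (bandwidth phi is assumed)."""
    dis = np.sqrt(np.sum((X_hist - x_new) ** 2, axis=1))   # DIS for every historical sample
    weights = np.exp(-dis / phi)                           # larger distance -> smaller weight
    return dis, weights

# usage: 100 historical samples with 5 inputs and one query sample
X_hist = np.random.randn(100, 5)
x_new = np.random.randn(5)
dis, w = euclidean_similarity_weights(X_hist, x_new)
print(dis.shape, w[:5])
```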
2.2 Both the input and output information of historical and query samples are used

To solve the problem mentioned in sub-section 2.1, an intuitive solution is to take the output data into consideration so that the similarity of both input and output data can be measured. Unfortunately, the output part of the query sample is always inaccessible. One solution is to replace the true output corresponding to the query inputs by the predicted output $\tilde{y}_{new}$, obtained from the available data ($x_{new}$, X and Y) using conventional methods like PCR25. Then the output distance between the $i$th training sample and the query sample can be further defined as

$$D_{i,y} = \frac{\| y_i - \tilde{y}_{new} \|}{\sum_{j=1}^{N} \| y_j - \tilde{y}_{new} \|} \qquad (2)$$

The distances defined in Eqs. (1) and (2) are then integrated into a single similarity measure. Compared to the approach in sub-section 2.1, this approach may improve the modeling accuracy to some extent since the output is also involved. However, it should be noticed that the performance of the sample selection highly depends on the accuracy of the predicted output $\tilde{y}_{new}$. This may cause problems when the prediction of the output is not reliable.
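A minimal sketch of this second strategy is given below (Python/NumPy, not from the paper): a simple least-squares model plays the role of the "conventional method like PCR" that produces $\tilde{y}_{new}$, Eq. (2) gives the normalized output distance, and the equal-weight sum used to combine the two distances is an assumption, since the paper does not fix the integration rule here.

```python
import numpy as np

def input_output_distance(X, y, x_new):
    """Combine the input distance of Eq. (1) with the output distance of Eq. (2).

    The global linear model used to predict y_new is only a stand-in for
    'conventional methods like PCR'; the equal-weight sum of the two distances
    is an illustrative choice, not the paper's prescription."""
    # input distance, Eq. (1)
    d_x = np.sqrt(np.sum((X - x_new) ** 2, axis=1))
    # predicted output for the query sample (least-squares stand-in for PCR)
    beta, *_ = np.linalg.lstsq(np.c_[np.ones(len(X)), X], y, rcond=None)
    y_new_hat = np.r_[1.0, x_new] @ beta
    # normalized output distance, Eq. (2)
    d_y = np.abs(y - y_new_hat)
    d_y = d_y / d_y.sum()
    return d_x / d_x.sum() + d_y          # integrated distance (assumed combination)

X = np.random.randn(200, 4)
y = X[:, 0] ** 2 + X[:, 1] ** 2 + 0.1 * np.random.randn(200)
print(input_output_distance(X, y, np.zeros(4))[:5])
```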
2.3 Inputs of the query sample and all historical data are used

To solve the problems of the similarity measures used in sub-sections 2.1 and 2.2, another method is proposed in Ref. 20. Compared with the methods described in sub-section 2.2, this method does not require prediction of the output of the query sample and thus avoids additional modeling errors in the similarity measure. Instead, a supervised structure is first established based on the training data. Then both the historical samples and the query sample are projected into the structure so that the similarities between the corresponding score vectors can be measured. Here, a simple example is given. Suppose there are three predictor variables $x_1$, $x_2$, $x_3$, which have the following correlation with the key variable: $y = x_1^2 + x_2^2$. The query sample is given as $S_Q = (0, 0, 0)$, while three candidate historical samples are $S_1 = (1, 0, 2)$, $S_2 = (0, 1, 2)$ and $S_3 = (2, 0, 0)$. Apparently, based on the criterion in sub-section 2.1, $S_3$ is likely to be the most similar sample since it is the closest sample to $S_Q$. However, the key variable $y$ has nothing to do with $x_3$, which indicates that it is inappropriate to apply the Euclidean distance directly to the raw data in similar sample selection. The above example shows the importance of the variable-wise relationship in the sample selection procedure. Compared with previous unsupervised methods, the third kind of method better utilizes the relationship between variables during the sample selection procedure.
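The toy example can be verified numerically; the short sketch below (not part of the paper) contrasts the raw Euclidean ranking with the ranking obtained after discarding the quality-irrelevant direction $x_3$, which mimics what a supervised projection would do.

```python
import numpy as np

S_Q = np.array([0.0, 0.0, 0.0])
S = np.array([[1.0, 0.0, 2.0],    # S1
              [0.0, 1.0, 2.0],    # S2
              [2.0, 0.0, 0.0]])   # S3
y = S[:, 0] ** 2 + S[:, 1] ** 2   # y = x1^2 + x2^2, so x3 is irrelevant

raw_dist = np.linalg.norm(S - S_Q, axis=1)                      # ~[2.24, 2.24, 2.00] -> S3 "closest"
supervised_dist = np.linalg.norm(S[:, :2] - S_Q[:2], axis=1)    # drop x3: [1, 1, 2] -> S3 farthest

print("raw Euclidean distances:   ", np.round(raw_dist, 2))
print("quality-relevant distances:", np.round(supervised_dist, 2))
print("outputs of the candidates: ", y)    # S3 is in fact the least similar in y
```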
This is achieved using the supervised structure of PLS. In PLS, score vectors of the inputs can be obtained by projecting the raw data into the latent space, and the relationship between the score vectors and the output can be constructed simultaneously. By using the latent space structure of PLS, both the input and output information are considered in the modeling stage. It does not require an additional modeling stage; hence it can be more effective than the method described in sub-section 2.2.

The problem with the method proposed in Ref. 20 is that the Gaussian distribution assumption of PLS limits its application in many non-Gaussian situations. Also, a PLS model is built by maximizing the covariance of two latent variables, which means the variables in the latent space only reveal second-order information. In fact, variables in most modern industrial processes, such as ammonia processes and polymerization reaction processes, usually contain higher-order correlation. In these processes, the PLS method is generally not suitable.
In this paper, a novel SNGLS based method is designed to overcome the shortcomings of PLS based approaches. The details of the SNGLS are discussed in Section 4. Before that, some preliminaries will be presented regarding the SVDD classifier and the NGR method.
3. Preliminaries

3.1 SVDD
The SVDD is an effective approach to determine whether one sample is similar to a reference dataset for non-Gaussian distributed data. It has been widely used as a similarity measure on many occasions26,27. It constructs a hypersphere with a minimal volume to envelop as many samples as possible. The hypersphere contains two important parameters: the center $a$ and the radius $R_s$. Suppose $W_L$ samples are included in the reference dataset; the parameters of the hypersphere can be derived as follows:

$$\min F(R_s, a) = R_s^2 + C \sum_{l} \xi_l \quad \text{s.t. } \|s_i - a\|^2 \le R_s^2 + \xi_i,\ \xi_i \ge 0,\ i = 1, 2, \ldots, W_L \qquad (3)$$

where $s_i$ indicates the corresponding data, while $C$ and $\xi_i$ represent the tuning parameter and slack variable, respectively. The above optimization problem can be transformed into the following form:

$$\max \sum_i h_i (s_i, s_i) - \sum_i \sum_j h_i h_j (s_i, s_j) \quad \text{s.t. } \sum_i h_i = 1,\ h_i \in [0, C] \qquad (4)$$

where $h_i$ represents the coefficient of the $i$th support vector, and the two parameters of the hypersphere can be further given as follows:

$$a = \sum_i h_i s_i, \qquad R_s^2 = (s_k, s_k) - 2\sum_i h_i (s_k, s_i) + \sum_i \sum_j h_i h_j (s_i, s_j) \qquad (5)$$

where $s_k,\ k = 1, \ldots, K$ are the corresponding support vectors. The closer a sample is to the center, the higher similarity it has with the reference dataset. On the other hand, if a new sample falls outside the hypersphere, it means that the sample is likely to be an outlier.
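As a concrete illustration, the sketch below (Python with NumPy/SciPy, not part of the paper) solves the dual problem of Eq. (4) for a Gaussian kernel with a general-purpose SLSQP solver and then evaluates the radius and center distance of Eq. (5); the kernel choice, its width and the tuning parameter C are assumptions, and in practice a dedicated QP solver would normally be used.

```python
import numpy as np
from scipy.optimize import minimize

def rbf_kernel(A, B, gamma=0.5):
    """Gaussian kernel matrix; the kernel and gamma are illustrative assumptions."""
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * d2)

def fit_svdd(S, C=0.5, gamma=0.5):
    """Solve the SVDD dual of Eq. (4) and return the coefficients h and radius Rs."""
    K = rbf_kernel(S, S, gamma)
    n = len(S)

    def neg_dual(h):                      # maximize sum_i h_i K_ii - h^T K h
        return -(h @ np.diag(K) - h @ K @ h)

    cons = ({'type': 'eq', 'fun': lambda h: h.sum() - 1.0},)
    res = minimize(neg_dual, np.full(n, 1.0 / n), method='SLSQP',
                   bounds=[(0.0, C)] * n, constraints=cons)
    h = res.x
    # radius from an unbounded support vector (0 < h_k < C), Eq. (5)
    sv = np.where((h > 1e-6) & (h < C - 1e-6))[0]
    k = sv[0] if len(sv) else int(np.argmax(h))
    Rs2 = K[k, k] - 2 * h @ K[:, k] + h @ K @ h
    return h, np.sqrt(max(Rs2, 0.0))

def distance_to_center(x, S, h, gamma=0.5):
    """Kernel-space distance between a new sample and the SVDD center a."""
    Kxs = rbf_kernel(x[None, :], S, gamma)[0]
    KSS = rbf_kernel(S, S, gamma)
    return np.sqrt(max(1.0 - 2 * h @ Kxs + h @ KSS @ h, 0.0))   # k(x, x) = 1 for the RBF kernel

# usage: a small reference window and one query sample
S = np.random.randn(30, 4)
h, Rs = fit_svdd(S)
print(Rs, distance_to_center(np.random.randn(4), S, h))
```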
3.2 NGR

In our previous work, the NGR method was introduced to give a novel solution for non-Gaussian regression28, which is an extension of the ICA method. In ICA, independent components and their weight vectors are obtained by maximizing the negentropy of the latent variable in the input data space. However, it does not consider the relationship between input and output. In order to consider the relationship between input and output variables, the NGR method was proposed in our previous work by introducing the mutual information between the latent variables extracted from the input and output spaces. In this way, both the non-Gaussianity and the relationship between the input and output spaces are considered. Different from ICA, independent components are extracted by solving the following optimization problem:

$$w_i, c_i = \arg\max_{w_i, c_i} \left\{ \alpha J(w_i^T \Gamma_x X_O) + \beta I(w_i^T \Gamma_x X_O, c_i^T \Gamma_y Y_O) + \gamma J(c_i^T \Gamma_y Y_O) \right\},\ i = 1, \ldots, d_i \quad \text{s.t. } w_i^T w_i = 1,\ c_i^T c_i = 1 \qquad (6)$$

where $X_O$ and $Y_O$ represent the original data; $w_i$, $c_i$ and $d_i$ represent the corresponding weight vectors and the number of latent variables in the NGR method, respectively; and $\alpha$, $\beta$ and $\gamma$ are weighting parameters. $I(w_i^T \Gamma_x X_O, c_i^T \Gamma_y Y_O)$ indicates the mutual information between the two independent components, where $\Gamma_x$ and $\Gamma_y$ are the whitening matrices of $X_O$ and $Y_O$, respectively. $J(\cdot)$ represents the negentropy operator. It should be noted that the first and third terms in Eq. (6) have the same form as the optimization problem in traditional independent component analysis (ICA). When the data are Gaussian distributed, $J(w_i^T \Gamma_x X_O)$ and $J(c_i^T \Gamma_y Y_O)$ equal zero. In the ICA method, the non-Gaussian components can be obtained through the maximization of these terms, while the correlation between the two different spaces is neglected. To overcome this problem, the NGR method tries to strike a balance between data non-Gaussianity and data correlation in Eq. (6). When $\alpha = 1, \beta = 0, \gamma = 0$ or $\alpha = 0, \beta = 0, \gamma = 1$, the latent variables in Eq. (6) reduce to independent components in ICA. The optimization problem becomes the maximization of the correlation coefficient if $X_O$ and $Y_O$ are Gaussian distributed. For more details about the NGR method, please refer to our previous work28.
4. Supervised non-Gaussian latent structure (SNGLS)

The supervised modeling framework mentioned in sub-section 2.3 constructs a latent space for sample selection. However, most current methods are not suitable for modeling non-Gaussian distributed data in multimode processes. To solve this problem, a supervised non-Gaussian structure is applied to replace PLS in the sample selection. In this paper, the NGR method is extended to form a non-Gaussian latent space model. In order to clearly illustrate the method, the process data are assumed to be stable in this section. In the next section, the non-Gaussian regression method will be extended to a locally weighted form to cope with the time-varying feature of multimode processes.
4.1 The model structure of SNGLS

Suppose that $X_O$ ($n \times m$) and $Y_O$ ($n \times k$) are the original input and output, respectively. The model structure of SNGLS can be presented as follows:

$$X_O = T P^T + E, \qquad Y_O = U Q^T + F \qquad (7)$$

where $T = [t_1, t_2, \ldots, t_d]$ and $U = [u_1, u_2, \ldots, u_d]$ are the score matrices, $P$ and $Q$ are the corresponding loading matrices, and $E$ and $F$ are the residual matrices of $X_O$ and $Y_O$. Note that Eq. (7) has a structure similar to the PLS model, where the weight vectors are obtained by maximizing the following covariance of latent variables29:

$$w_j, c_j = \arg\max_{w_j, c_j} \operatorname{cov}(w_j^T \Gamma_x X_O, c_j^T \Gamma_y Y_O),\ j = 1, \ldots, d_j \quad \text{s.t. } w_j^T w_j = 1,\ c_j^T c_j = 1 \qquad (8)$$

where $w_j$ and $c_j$ are the corresponding weight vectors of the PLS method, $\Gamma_x$ and $\Gamma_y$ are the whitening matrices of $X_O$ and $Y_O$, respectively, and $d_j$ represents the number of latent variables in the PLS method. It is noticed that the covariance is only a second-order statistic, which cannot well describe the behavior of higher-order information. Also, the basic assumption that data in multimode processes are Gaussian distributed is often invalid in practice. In order to deal with non-Gaussian distributed datasets, a non-Gaussian calibration framework is proposed by replacing the covariance in Eq. (8) with the mutual information in Eq. (6). The next sub-section discusses the relationship between SNGLS and PLS.
4.2 Relationship between SNGLS and PLS

Before discussing the relationship between SNGLS and PLS, the concept of mutual information should be introduced. According to previous references22,30, the mutual information between two random variables $f$ and $g$ can be estimated as

$$I(f, g) = H(f) + H(g) - H(f, g) \qquad (9)$$

where $H(f) = -\int p(f)\log p(f)\,df$ and $H(g) = -\int p(g)\log p(g)\,dg$ represent the marginal entropies of the two variables, respectively, and $p(f)$ and $p(g)$ indicate the corresponding marginal probability density functions. $H(f, g) = -\iint p(f, g)\log p(f, g)\,df\,dg$ is the joint entropy of the variables $f$ and $g$, with $p(f, g)$ representing the joint probability density function. Several methods have been proposed to solve the problem of mutual information estimation22,30,31. Among these methods, the Edgeworth expansion is applied in this paper. According to previous work31, the entropy can be estimated as follows:

$$H(p) \approx H(p_\phi) - \frac{1}{12}\sum_{i=1}^{d}(\kappa^{i,i,i})^2 - \frac{1}{4}\sum_{i,j=1,\ i \neq j}^{d}(\kappa^{i,i,j})^2 - \frac{1}{72}\sum_{i,j,k=1,\ i \neq j \neq k}^{d}(\kappa^{i,j,k})^2 \qquad (10)$$

where $p_\phi$ is a Gaussian density function with the same mean and covariance; $(i, j, k)$, $(i, j, k, l)$ and $(i, j, k, l, p, q)$ are index tuples over the input dimensions; $h_{i,j,k}$, $h_{i,j,k,l}$ and $h_{i,j,k,l,p,q}$ are the Hermite polynomials of the corresponding orders; and $\kappa^{i,j,k}$, $\kappa^{i,j,k,l}$ and $\kappa^{i,j,k,l,p,q}$ represent the standardized cumulants with the corresponding indices. Eq. (10) gives a comprehensive approach to estimate the marginal and joint entropies in Eq. (9). Then the corresponding entropies can be obtained31:

$$H(f) = \frac{1}{2}\ln\sigma_f^2 + \frac{1}{2}\ln 2\pi + \frac{1}{2} - \frac{E(f^3)^2}{12(\sigma_f^2)^3}, \qquad H(g) = \frac{1}{2}\ln\sigma_g^2 + \frac{1}{2}\ln 2\pi + \frac{1}{2} - \frac{E(g^3)^2}{12(\sigma_g^2)^3}$$
$$H(f, g) = \frac{1}{2}\ln\!\left(\sigma_f^2\sigma_g^2 - (\sigma_{fg})^2\right) + \ln 2\pi + 1 - \frac{1}{12}\left[\frac{E(f^3)^2}{(\sigma_f^2)^3} + \frac{E(g^3)^2}{(\sigma_g^2)^3}\right] - \frac{1}{4}\left[\frac{E(f^2 g)^2}{(\sigma_f^2)^2\sigma_g^2} + \frac{E(g^2 f)^2}{(\sigma_g^2)^2\sigma_f^2}\right] \qquad (11)$$

where $E(\cdot)$ represents the expectation of a variable, and $\sigma_f^2$, $\sigma_g^2$ and $\sigma_{fg}$ indicate the variances and covariance of $f$ and $g$, respectively. Substituting Eq. (11) into Eq. (9), the mutual information between the two random variables can be rewritten as

$$I(f, g) = \frac{1}{4}\left[\frac{E(f^2 g)^2}{(\sigma_f^2)^2\sigma_g^2} + \frac{E(g^2 f)^2}{(\sigma_g^2)^2\sigma_f^2}\right] - \frac{1}{2}\ln\!\left(1 - \frac{(\sigma_{fg})^2}{\sigma_f^2\sigma_g^2}\right) \qquad (12)$$

Suppose $f = w_i^T \Gamma_x X_O$ and $g = c_i^T \Gamma_y Y_O$. Note that the variances of both $f$ and $g$ equal 1 owing to the whitening matrices $\Gamma_x$, $\Gamma_y$ and the constraints in Eq. (6). Therefore, Eq. (12) can be simplified as follows:

$$I(f, g) = \frac{1}{4}\left[E(f^2 g)^2 + E(g^2 f)^2\right] - \frac{1}{2}\ln\!\left(1 - (\sigma_{fg})^2\right) \qquad (13)$$

Substituting Eq. (13) into Eq. (6), the pairs of weight vectors and the novel supervised structure in Eq. (7) can then be obtained. It should be noted that the cost function in Eq. (8) is already involved in the second term of Eq. (13), which means the covariance-based supervised structure can be considered a special case of SNGLS. Compared with the PLS method, SNGLS retains extra higher-order information, which improves the modeling performance when the data are non-Gaussian distributed. In the next section, SNGLS will be extended into a locally weighted form to deal with multimode process regression.
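To illustrate how Eq. (13) can be evaluated from data, the short sketch below (Python/NumPy, not from the paper) estimates the Edgeworth-based mutual information between two whitened latent scores from their sample moments; replacing the expectations by sample averages is the obvious plug-in choice, though the paper does not spell out the estimator.

```python
import numpy as np

def edgeworth_mi(f, g):
    """Eq. (13): Edgeworth-based mutual information for unit-variance scores f, g.

    f and g are assumed to be samples of the latent variables w^T Gamma_x X_O and
    c^T Gamma_y Y_O; expectations are replaced by sample means after standardization."""
    f = (f - f.mean()) / f.std()
    g = (g - g.mean()) / g.std()
    e_f2g = np.mean(f ** 2 * g)          # E(f^2 g)
    e_g2f = np.mean(g ** 2 * f)          # E(g^2 f)
    cov_fg = np.mean(f * g)              # sigma_fg (unit variances)
    return 0.25 * (e_f2g ** 2 + e_g2f ** 2) - 0.5 * np.log(1.0 - cov_fg ** 2)

# usage: two correlated, mildly non-Gaussian scores
rng = np.random.default_rng(0)
s = rng.standard_normal(5000)
f = s + 0.3 * s ** 2
g = 0.8 * s + 0.6 * rng.standard_normal(5000)
print(edgeworth_mi(f, g))
```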
5. Locally weighted non-Gaussian regression for multimode processes

In this section, a novel framework involving a sample similarity measure and a locally weighting scheme is designed for time-varying processes. For the purpose of sample selection and online modeling, a new similarity measure is proposed by integrating the window confidence and the between-sample local similarities. The flowchart of the proposed method is shown in Figure 1. Firstly, in order to handle the time-varying characteristics of the process, the training data are divided into a series of overlapping windows, each of which is considered stable so that it can be modeled by a single SNGLS model. The window confidence with respect to the current query sample is quantified by SVDD. In order to get the between-sample similarities, the SNGLS is constructed in each moving window to set up a series of local models. After that, both the training samples and the current query sample are projected onto the local models to obtain local similarities. The global similarity between the same pair of training sample and query sample is then derived through the integration of the local similarities and the corresponding window confidence. Those global similarities are considered as weights and assigned to each training sample, which finally leads to the locally weighted non-Gaussian regression solution. In summary, the locally weighted non-Gaussian regression method consists of four steps: 1) window confidence calculation, 2) local similarity measurement, 3) global similarity measurement, 4) modeling and output prediction. These steps will be explained in detail in the following four sub-sections.

[Figure 1 about here.]
5.1 Window confidence

In continuous multimode processes, it is always reasonable to assume that the data characteristics of neighboring samples are similar, which makes it feasible to model the relationship between the input and output variables within a local area through a single SNGLS. It is important to determine the window confidence by quantifying the similarity between the query sample and the data window. The schematic of the window confidence calculation is shown in Figure 2. The training samples are first divided into a series of overlapping windows of length L. In order to reduce the computational load and avoid updating the SVDD classifier each time the window slides forward, a window step of M is introduced, which can be determined according to the discussion of Zhu et al.32 Local information can be well captured using the moving window strategy, and the overlapping structure can also guarantee the consistency of process modeling.

Since the process data are assumed to be non-Gaussian distributed, SVDD is applied to determine the window confidences between the query sample and each moving window. The confidence of the $\gamma$th window is defined as follows:

$$WS_\gamma = \frac{R_{s,\gamma}}{\|\phi - a_\gamma\|^2} \qquad (14)$$

where $\phi$ represents the query sample projected into the kernel space, and $a_\gamma$ and $R_{s,\gamma}$ represent the center and radius of the hypersphere in the $\gamma$th window, respectively. $\|\phi - a_\gamma\|^2$ is the corresponding distance between the query sample and the center in the high-dimensional space. For the window confidence defined in Eq. (14), a greater $WS_\gamma$ value indicates that the query sample is more similar to the reference data in the moving window.

[Figure 2 about here.]
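As a small illustration of the moving window strategy (not part of the paper), the sketch below generates the index ranges of overlapping windows of length L with step M and shows how the window confidence of Eq. (14) would be assembled; the `radius` and `dist_to_center` values are placeholders standing in for the quantities produced by the SVDD of Section 3.1.

```python
import numpy as np

def overlapping_windows(n_samples, L=20, M=4):
    """Index ranges [start, start + L) of the overlapping moving windows."""
    starts = range(0, n_samples - L + 1, M)
    return [np.arange(s, s + L) for s in starts]

def window_confidence(radius, dist_to_center):
    """Eq. (14): confidence of one window for the current query sample.

    radius and dist_to_center are assumed to come from an SVDD fitted on the
    window (see the sketch in Section 3.1)."""
    return radius / dist_to_center ** 2

windows = overlapping_windows(100, L=20, M=4)
print(len(windows), windows[0][:5], windows[1][:5])
print(window_confidence(radius=0.8, dist_to_center=1.1))
```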
5.2 Between-sample similarity computation

On the basis of the window division, a series of SNGLS models can be established in the corresponding moving windows. In order to capture the local information, the between-sample similarity is estimated based on the SNGLS model in each moving window. The schematic of the between-sample similarity measurement is shown in Figure 3. Suppose that the $r$th training sample belongs to the $\gamma$th moving window. An SNGLS can be established using the samples in the window. Then both the input of the training sample $x_r$ and the current query sample $x_q$ are projected into the latent space to obtain the corresponding latent scores $t_{\gamma,r}$ and $t_{\gamma,q}$. The sample distance based on the $\gamma$th SNGLS can then be modified into the following form:

$$DIS_\gamma = \sqrt{(t_{\gamma,q} - t_{\gamma,r})^T (t_{\gamma,q} - t_{\gamma,r})} \qquad (16)$$

The local similarity between the two samples is further defined as follows:

$$Sim_{\gamma,r} = \frac{1}{DIS_\gamma} \qquad (17)$$

where $Sim_{\gamma,r}$ represents the between-sample local similarity between the $r$th training sample and the query sample under the $\gamma$th SNGLS.

[Figure 3 about here.]
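The following sketch (Python/NumPy, not from the paper) shows the mechanics of Eqs. (16)-(17) for one window: the samples are projected onto a set of latent weight vectors and the inverse latent-space distance serves as the local similarity. The weight matrix `W_latent` is a placeholder for the SNGLS weights that would be obtained from Eq. (6); the small constant added to the distance only guards against division by zero and is an implementation assumption.

```python
import numpy as np

def local_similarity(x_r, x_q, W_latent, eps=1e-8):
    """Eqs. (16)-(17): latent-space distance and inverse-distance similarity.

    W_latent (m x d) is a placeholder for the SNGLS weight vectors of one
    moving window; eps avoids division by zero (assumption)."""
    t_r = x_r @ W_latent                 # latent score of the training sample
    t_q = x_q @ W_latent                 # latent score of the query sample
    dis = np.sqrt((t_q - t_r) @ (t_q - t_r))     # Eq. (16)
    return 1.0 / (dis + eps)                     # Eq. (17)

rng = np.random.default_rng(1)
W_latent = rng.standard_normal((6, 2))   # 6 inputs projected to 2 latent scores
x_r, x_q = rng.standard_normal(6), rng.standard_normal(6)
print(local_similarity(x_r, x_q, W_latent))
```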
5.3 Integrated similarities

In order to consider both the between-sample similarity and the mode information, a new similarity measure is obtained by integrating the between-sample similarity and the window confidence. It should be noted that, by using the overlapping moving window, each sample may be assigned to several overlapping windows, making it difficult to determine which local similarity to use for the similarity integration. As a result, there are several different between-sample similarities between each training sample and the query sample. To handle this problem, the integrated similarity is defined as follows:

$$Sim_r = \sum_{\gamma} WS_\gamma \, Sim_{\gamma,r} \qquad (18)$$

where the summation runs over all moving windows that contain the $r$th training sample and $Sim_r$ represents the integrated global similarity between the $r$th training sample and the query sample. In fact, many kinds of integration strategies can be used to obtain the integrated similarity from the window confidence and the between-sample similarities, for example, simply summing them up. However, by multiplying them, both the window confidence and the between-sample similarity must be relatively large to produce a significant similarity, while other integration strategies such as summation are not as effective. In the integration, the local similarity $Sim_{\gamma,r}$ is given a large weight when the corresponding window confidence is large. On the contrary, the local similarity is unreliable if the window does not have a strong relationship with the query sample. Eq. (18) tries to balance the different local similarities among different moving windows with respect to the same pair of samples. Compared with the local similarities, the global similarity can handle similarity measurement when the process is time-varying. Furthermore, it should be noted that the SNGLS parameters can be obtained in advance in the offline procedure, which greatly reduces the computational complexity.

The schematic of the global similarity measurement is shown in Figure 4.

[Figure 4 about here.]
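A minimal sketch of Eq. (18) is given below (Python/NumPy, not from the paper): given, for one training sample, the confidences of the windows it belongs to and the corresponding local similarities, the global similarity is the sum of their products.

```python
import numpy as np

def integrated_similarity(window_confidences, local_similarities):
    """Eq. (18): global similarity of one training sample w.r.t. the query.

    window_confidences[k] is WS of the k-th window containing the sample and
    local_similarities[k] is the matching between-sample similarity."""
    ws = np.asarray(window_confidences)
    sim = np.asarray(local_similarities)
    return float(np.sum(ws * sim))

# a training sample that happens to fall into three overlapping windows
print(integrated_similarity([0.9, 0.4, 0.1], [1.8, 0.6, 2.0]))
```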
5.4 Output prediction using the locally weighted structure

On the basis of the previous sub-section, the global similarities of each training sample with respect to the current query sample are obtained as $SIM = [Sim_1\ Sim_2\ \cdots\ Sim_r\ \cdots\ Sim_n]$. These similarities are then considered as weights and assigned to the training samples for online modeling and output prediction. The locally weighted training datasets $X_{LW}$ and $Y_{LW}$ can be obtained as follows:

$$X_{LW} = SIM \cdot X_O, \qquad Y_{LW} = SIM \cdot Y_O \qquad (19)$$

Substituting the modified training datasets into Eq. (7) gives

$$X_{LW} = T_{LW} P_{LW}^T + E_{LW}, \qquad Y_{LW} = U_{LW} Q_{LW}^T + F_{LW} \qquad (20)$$

where $T_{LW}$, $U_{LW}$, $P_{LW}$, $Q_{LW}$, $E_{LW}$ and $F_{LW}$ are the corresponding parameters, and the unmixing matrix $\hat{W}_{LW}$ can be obtained simultaneously. The regression parameter $\Theta$ between the key variables and the extracted independent score vectors $t = x_q \hat{W}_{LW}$ can be derived using least squares regression, which is shown as follows:

$$\hat{Y} = \Theta^T t + e, \qquad \Theta^T = \Sigma_{X_{LW} Y_{LW}}^T \Gamma_{X_{LW}}^T \hat{W}_{LW} \left( \hat{W}_{LW}^T \Gamma_{X_{LW}} \Lambda_{X_{LW}} \Gamma_{X_{LW}}^T \hat{W}_{LW} \right)^{-1} \qquad (21)$$

where $\Sigma_{X_{LW} Y_{LW}}$, $\Gamma_{X_{LW}}$ and $\Lambda_{X_{LW}}$ are the covariance matrix, whitening matrix and variance matrix of the modified training data, and $\hat{Y}$ represents the predicted output variables.

In the latent structure of Eq. (20), both the sample importance and the variable importance are considered in the locally weighted SNGLS, which makes it more applicable to the soft sensing of multimode processes. Latent variables that can best describe the input-output relationship are chosen to measure the sample similarity, while the sample importance is represented by the different weights assigned to the training samples. Compared with other locally weighted methods, the proposed method can better handle variable prediction when the process data are non-Gaussian distributed and time-varying. In the next section, the effectiveness of the proposed method will be demonstrated on a simulation benchmark and a real industrial process. The main steps of the locally weighted non-Gaussian regression are summarized as follows (a simplified code sketch of the weighting and prediction step is given after the list):

1) Divide the training samples into a series of moving windows;
2) Calculate the window confidences between each moving window and the query sample using Eq. (14);
3) Establish a series of SNGLS models using the data in each moving window;
4) Project the training samples in the window and the query sample onto the corresponding SNGLS;
5) Calculate the local similarities using Eq. (16) and Eq. (17);
6) Calculate the global similarities using Eq. (18);
7) Assign the above global similarities as weights to each training sample;
8) Construct a locally weighted non-Gaussian latent structure using Eq. (20);
9) Predict the key variables using Eq. (21);
10) Turn to the next query sample and repeat steps 3) to 9) until all online samples are predicted.
Case 1: TE benchmark
16
The Tennessee Eastman (TE) industrial process has been widely applied to test the performance of
17
process monitoring and soft sensing. The TE process includes 41 measurement variables and 12 manipulated
18
variables with 5 basic operation units. A detailed description of the process can be referred to Ref33. In order
19
to test the proposed method, 10000 training samples operated under two stable modes and one transition
20
mode are collected. For the purpose of soft sensor modeling, process variables with small variations are not
21
considered in order to avoid singularity. As a result, 28 variables are selected as input variables and the 19
ACS Paragon Plus Environment
Industrial & Engineering Chemistry Research 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
1
variable-A constituent in stream 6 is selected as the output variable, which are shown in Table 1. Hence, the
2
dimension of training data is 10000×29. Meanwhile, another 10000 samples are chosen as the test samples
3
which are collected under the same operation conditions as the training data. The dimension of online query
4
samples is 10000×28. According to Ref.32, window length L and window step M are set to be 20 and 4,
5
respectively. For the SNGLS method, the parameters and are set to 0.8 and 0.2 according to Ref.28.
6
[Table 1 about here.]
7
For comparison, LWNGR (Locally weighted Non-Gaussian Regression) and LWPLS (Locally weighted
8
Partial Least Squares) methods are also tested28. In LWNGR and LWPLS, sample similarities are obtained
9
based on Euclidean distance. These similarities are then designated as weights and assigned to each training
10
sample. The prediction results are shown in Figure 5, with subplot (a) showing the results of the proposed
11
method, subplots (b) and (c) showing the results of LWNGR and LWPLS. It can be seen that the proposed
12
methods can well track the quality variable even when fluctuation occurs at about 5000th -6000th sample
13
points.
14
On the other hand, LWNGR can also track the quality variable, however, not as well as the proposed
15
method. In sharp contrast, the performance of LWPLS deteriorates when sudden changes occur in process
16
condition. This is expected, as the basic assumption of LWPLS method that the process data follows Gaussian
17
distribution is not valid, making the similarity measure in LWPLS not reliable.
18 19 20
[Figure 5 about here.] To better evaluate the performance, the root mean squares error (RMSE) and the R 2 statistic are considered, which are defined as
20
ACS Paragon Plus Environment
Page 20 of 40
Page 21 of 40 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Industrial & Engineering Chemistry Research
1 N test
RMSE=
N test
1 R 1 2
(y i 1 N test
i
N test
( yˆ - y ) i
i 1
2
i
yˆ ) 2
( y yˆ )
(22)
2
i 1
2
where yi , y and yˆ i are the real value, average value and estimated value. N test is the number of test
3
samples. A higher RMSE value indicates worse accuracy and higher R 2 value indicate better accuracy. The
4
RMSE and R 2 values of all methods are shown in Table 2. It can be seen that the proposed method has a
5
better prediction performance in terms of both the RMSE and R 2 values.
6
[Table 2 about here.]
7
6.2
8
In this subsection, pre-decarburization absorption unit of a real industrial ammonia synthesis process is
9
considered. This unit is a critical component in the ammonia synthesis process which produces source
10
material NH3. In ammonia synthesis process, hydrogen is considered as one of the most critical variables.
11
Hydrogen is mainly generated from the raw material methane. Therefore, the pretreatment of the ammonia
12
synthesis process includes the methane transforming procedure. Classical methane transformation usually
13
includes three parts: the pre-reformer, the primary reformer and the secondary reformer. According to
14
practical operation, the transformation procedure is mainly triggered in the primary reformer. The natural
15
raw gas is the mixture of NH3 and CO2. Carbon dioxide should be eliminated from the natural gas. The Pre-
16
Decarburization absorption unit aims at absorbing the carbon dioxide in the raw natural gas, and this unit is
17
expected to be optimized in order to enhance the performance of hydrogen production processes. For more
18
details about the ammonia processes, interested readers can refer to Ref.33.
19
Case 2: Pre-decarburization absorption unit
In this case study, the content of residual CO2 (AI15002.PV) is considered as the quality variable and 21
ACS Paragon Plus Environment
Industrial & Engineering Chemistry Research 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
1
another 20 process variables are collected as input variables, which are shown in Table 3.
2
[Table 3 about here.]
3
In this case, the training dataset containing 6400 samples with fluctuations in the supplies are considered.
4
Another 6500 samples are used as test samples, which are collected under the same conditions as the training
5
data. Again, LWNGR and LWPLS are introduced for comparison. The prediction results are shown in Figure
6
6, with subplot (a) showing the results of the proposed method, subplots (b) and (c) showing the results of
7
LWNGR and LWPLS.
8
In Figure 6(c), the trends of the key variable can be roughly tracked (such as big changes around the
9
1000th sample point) using LWPLS due to the local weighted structure. However, it is noted that the
10
prediction deviations are relatively large across the whole process. This is expected, as in the LWPLS method,
11
the similarities between training samples and the query sample are measured according to Euclidean distance,
12
which may be inaccurate due to the non-Gaussianity of the process data. Furthermore, the latent structure of
13
PLS is constructed by maximizing the covariance between the predictor variables and the quality variable,
14
which neglecting the high order information. In Figure 6(b), a similar situation occurs when the LWNGR
15
method is applied. Compared to LWPLS method, the non-Gaussianity of process variables is considered in
16
the latent space construction. It can be seen that the prediction performance is improved comparing to that of
17
LWPLS. The residual content of CO2 can be well tracked when the process is relatively stable. Unfortunately,
18
the sample selection steps in LWPLS and LWNGR remain the same, making it difficult to track the frequent
19
changes occurring in the pre-decarburization unit. This can be seen from the comparison between Figure 6(a)
20
and Figure 6(b). The result indicates that the proposed method can better predict the content of residual CO2
21
even though strong fluctuations occur in the process. In order to better evaluate the prediction performance, 22
ACS Paragon Plus Environment
Page 22 of 40
Page 23 of 40 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Industrial & Engineering Chemistry Research
1
the RMSE and R 2 values are also considered and shown in Table 4, which further confirmed that our
2
method outperforms both LWNGR and LWPLS.
3
[Figure 6 about here.]
4
[Table 4 about here.]
5
7. Conclusion
6
In this work, a novel quality-related locally weighted non-Gaussian regression method is proposed for
7
multimode processes. Firstly, the SNGLS is proposed to depict the variable relationship in non-Gaussian
8
distributed data. Both the high order and low order information are considered in the latent space construction.
9
Secondly, a locally weighted approach is introduced to cope with the time-varying characteristics in the
10
process. Different from classical JITL method and its variants, sample selection is implemented under a
11
supervised manner where the latent space of SNGLS is used. Moving window strategy is applied to separate
12
the whole process into several overlapping segments, each of which is described by an SNGLS. The same
13
pair of training sample and query sample are then projected onto relevant SNGLS models to obtain a series
14
of local similarities. Simultaneously, the window confidence is evaluated by the SVDD method. On the basis
15
of the above discussion, the final similarity can be derived through the combination of local similarities and
16
window confidence. These similarities are considered as weights and assigned to original data. After that,
17
the new locally weighted dataset is adopted for non-Gaussian regression. Finally, algorithm effectiveness is
18
validated on the TE benchmark and the pre-decarburization absorption unit. The results show that the
19
proposed method can well predict the key variables.
20 21
Acknowledgement 23
ACS Paragon Plus Environment
Industrial & Engineering Chemistry Research 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
1 2
This work is supported by Zhejiang Provincial Natural Science Foundation (LQ19F030007) and National Natural Science Foundation of China (61673358).
3
24
ACS Paragon Plus Environment
Page 24 of 40
Page 25 of 40 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Industrial & Engineering Chemistry Research
1
Reference
2
1.
3
squares with orthogonal signal correction. Chemometrics and Intelligent Laboratory Systems 2005, 79, (1–
4
2), 22-30.
5
2.
6
Intelligent Laboratory Systems 2001, 58, (2), 109-130.
7
3.
8
independent component regression. Chemometrics and Intelligent Laboratory Systems 2009, 98, (2), 143-
9
148.
Kim, K.; Lee, J.-M.; Lee, I.-B., A novel multivariate regression approach based on kernel partial least
Wold, S.; Sjöström, M.; Eriksson, L., PLS-regression: a basic tool of chemometrics. Chemometrics and
Zhang, Y.; Zhang, Y., Complex process monitoring using modified partial least squares method of
10
4.
Massy, W. F., Principal Components Regression in Exploratory Statistical Research. Journal of the
11
American Statistical Association 1965, 60, (309), 234-256.
12
5.
Heidelberg, S. B., Principal Component Regression. Betascript Publishing 2010, 1954-1954.
13
6.
Surhone, L. M.; Timpledon, M. T.; Marseken, S. F.; Correlation, C.; Squares, T. S. O.; Analysis, R.,
14
Principal Component Regression. Springer Berlin Heidelberg: 2013; p 1954-1954.
15
7.
16
2007.
17
8.
18
Window Partial Least Squares. Industrial & Engineering Chemistry Research 2016, 49, (22), 11530-11546.
19
9.
20
Journal of Machine Learning Research 2002, 2, (2), 97-123.
21
10. Bai, Y.; Jian, X.; Long, Y. In Kernel Partial Least-Squares Regression, IEEE International Joint
22
Conference on Neural Network Proceedings, 2006; 2006; pp 1231-1238.
23
11. Feng, R.; Shen, W.; Shao, H. In A soft sensor modeling approach using support vector machines,
24
American Control Conference, 2003. Proceedings of the, 2003; 2003; pp 3702-3707 vol.5.
25
12. Zheng, X. X.; Feng, Q., Soft Sensor Modeling Based on PCA and Support Vector Machines. Journal of
26
System Simulation 2006, 18, (3), 739-741.
27
13. Ge, Z.; Song, Z., A comparative study of just-in-time-learning based methods for online soft sensor
Abdi, H., Partial Least Square Regression PLS-Regression. Encyclopedia of Measurement & Statistics
Liu, J.; Chen, D. S.; Shen, J. F., Development of Self-Validating Soft Sensors Using Fast Moving
Rosipal, R.; Trejo, L. J., Kernel partial least squares regression in reproducing kernel hilbert space.
25
ACS Paragon Plus Environment
Industrial & Engineering Chemistry Research 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
1
modeling. Chemometrics & Intelligent Laboratory Systems 2010, 104, (2), 306-317.
2
14. Cybenko, G., Just-in-Time Learning and Estimation. Nato Asi 1996.
3
15. Cheng, C.; Chiu, M. S., A new data-based methodology for nonlinear process modeling. Chemical
4
Engineering Science 2004, 59, (13), 2801-2810.
5
16. Cheng, C.; Chiu, M.-S., Nonlinear process monitoring using JITL-PCA. Chemometrics and Intelligent
6
Laboratory Systems 2005, 76, (1), 1-13.
7
17. Chen, M.; Khare, S.; Huang, B., A unified recursive just-in-time approach with industrial near infrared
8
spectroscopy application. Chemometrics & Intelligent Laboratory Systems 2014, 135, (14), 133-140.
9
18. Chen, M.; Khare, S.; Huang, B.; Zhang, H.; Lau, E.; Feng, E., Recursive Wavelength-Selection Strategy
10
to Update Near-Infrared Spectroscopy Model with an Industrial Application. Industrial & Engineering
11
Chemistry Research 2013, 52, (23), 7886-7895.
12
19. Chang, S. Y.; Baughman, E. H.; Mcintosh, B. C., Implementation of Locally Weighted Regression to
13
Maintain Calibrations on FT-NIR Analyzers for Industrial Processes. Applied Spectroscopy 2001, 55, (9),
14
1199-1206.
15
20. Yuan, X.; Huang, B.; Ge, Z.; Song, Z., Double locally weighted principal component regression for soft
16
sensor with sample selection under supervised latent structure. Chemometrics & Intelligent Laboratory
17
Systems 2016, 153, 116-125.
18
21. Zhao, C.; Gao, F.; Wang, F., An improved independent component regression modeling and quantitative
19
calibration procedure. AIChE Journal 2010, 56, (6), 1519-1535.
20
22. Kraskov, A.; Stögbauer, H.; Grassberger, P., Estimating mutual information. Physical Review E 2004,
21
69, (6), 066138.
22
23. Rashid, M. M.; Yu, J., A new dissimilarity method integrating multidimensional mutual information
23
and independent component analysis for non-Gaussian dynamic process monitoring. Chemometrics and
24
Intelligent Laboratory Systems 2012, 115, (0), 44-58.
25
24. Chen, K.; Liang, Y.; Gao, Z.; Liu, Y., Just-in-Time Correntropy Soft Sensor with Noisy Data for
26
Industrial Silicon Content Prediction. Sensors 2017, 17, (8), 1830.
27
25. Wang, Z.; Isaksson, T.; Kowalski, B. R., New approach for distance measurement in locally weighted
28
regression. Analytical Chemistry 1994, 66, (2), 249-260. 26
ACS Paragon Plus Environment
Page 26 of 40
Page 27 of 40 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Industrial & Engineering Chemistry Research
1
26. Zhu, Z.; Song, Z.; Palazoglu, A., Transition Process Modeling and Monitoring Based on Dynamic
2
Ensemble Clustering and Multiclass Support Vector Data Description. Industrial & Engineering Chemistry
3
Research 2011, 50, (24), 13969-13983.
4
27. Yao, M.; Wang, H.; Xu, W., Batch process monitoring based on functional data analysis and support
5
vector data description. Journal of Process Control 2014, 24, (7), 1085-1097.
6
28. Zeng, J.; Xie, L.; Kruger, U.; Gao, C., A non-Gaussian regression algorithm based on mutual
7
information maximization. Chemometrics and Intelligent Laboratory Systems 2012, 111, (1), 1-19.
8
29. Chiang, L. H., Data-driven Methods for Fault Detection and Diagnosis in Chemical Processes. 2000.
9
30. Tsimpiris, A.; Vlachos, I.; Kugiumtzis, D., Nearest neighbor estimate of conditional mutual information
10
in feature selection. Expert Systems with Applications 2012, 39, (16), 12697-12708.
11
31. Hulle, M. M. V., Edgeworth Approximation of Multivariate Differential Entropy. MIT Press: 2005; p
12
1903-1910.
13
32. Zhu, Z.; Song, Z.; Palazoglu, A., Process pattern construction and multi-mode monitoring. Journal of
14
Process Control 2012, 22, (1), 247-262.
15
33. Filippi, E., AMMONIA SYNTHESIS PROCESS. In US: 2006.
16 17
27
ACS Paragon Plus Environment
List of Figures

Figure 1. The flowchart of the proposed non-Gaussian regression method
Figure 2. The schematic of process localization and window confidence calculation
Figure 3. The schematic of between-sample local similarity measurement
Figure 4. The schematic of global similarity measurement
Figure 5. Comparison of prediction performance between the three methods in the TE process
Figure 6. Comparison of prediction performance between the three methods in the pre-decarburization absorption unit
Figure 1. The flowchart of the proposed non-Gaussian regression method

Figure 2. The schematic of process localization and window confidence calculation

Figure 3. The schematic of between-sample local similarity measurement

Figure 4. The schematic of global similarity measurement

Figure 5. Comparison of prediction performance between the three methods in the TE process: (a) the proposed method; (b) LWNGR; (c) LWPLS

Figure 6. Comparison of prediction performance between the three methods in the pre-decarburization absorption unit: (a) the proposed method; (b) LWNGR; (c) LWPLS
List of Tables

Table 1. The predictor variables and output variable in the TE process
Table 2. The comparison of RMSE and R² statistics in the TE process
Table 3. The predictor variables and output variable in the pre-decarburization absorption unit
Table 4. The comparison of RMSE and R² statistics in the pre-decarburization absorption unit
Table 1. The predictor variables and output variable in the TE process

No.  Predictor variable
1    A feed (stream 1)
2    D feed (stream 2)
3    E feed (stream 3)
4    A & C feed (stream 4)
5    Recycle flow (stream 8)
6    Reactor feed rate (stream 6)
7    Reactor pressure
8    Reactor level
9    Reactor temperature
10   Purge rate (stream 9)
11   Product separator temperature
12   Product separator level
13   Product separator pressure
14   Product separator underflow (stream 10)
15   Stripper level
16   Stripper pressure
17   Stripper underflow (stream 11)
18   Stripper temperature
19   Stripper steam flow
20   Reactor cooling water outlet temperature
21   Separator cooling water outlet temperature
22   D feed flow (stream 2)
23   E feed flow (stream 3)
24   A feed flow (stream 1)
25   A & C feed flow (stream 4)
26   Separator pot liquid flow (stream 10)
27   Stripper liquid product flow (stream 11)
28   Reactor cooling water flow

Output variable: A constituent in stream 6
Table 2. The comparison of RMSE and R² statistics in the TE process

                 The proposed method    LWNGR     LWPLS
RMSE value       0.1418                 0.2497    1.1431
R² statistic     0.9026                 0.6981    -5.3266
Table 3. The predictor variables and output variable in the pre-decarburization absorption unit

Tag            Predictor variable
FI15001.PV     The Flow-rate of Feed NG
LI15001.PV     The Level of 15-F001
PDI15001.PV    The Pressure Difference of 15-F001
PI15001.PV     The Pressure of Feed NG
TI15001.PV     The Temperature of Feed NG
LIC15002.PV    The Level of 15-F002
PDI15002.PV    The Pressure Difference of 15-C001
PIC15002.PV    The Pressure of Process Gas at 15-F001
TI15002.PV     The Temperature of Process Gas at 15-F002
PI15003.PV     The Pressure of Process Gas at 15-F002
TI15003.PV     The Temperature of Process Gas at 15-C001
LI15004.PV     The Level #1 of 15-C001
PI15004.PV     The Pressure of Process Gas to 15-C001
LI15005.PV     The Level #2 of 15-C001
TI15005.PV     The Temperature in the Middle of 15-C001
LI15006.PV     The Level #3 of 15-C001
PI15006.PV     The Pressure of Process Gas at the Top of 15-C001
TI15006.PV     The Temperature of Amine Liquor to 15-C001
TI15007.PV     The Temperature of Process Gas at the Top of 15-C001
LIC15010.PV    The Level of Regeneration Column

Tag            Output variable
AI15002.PV     The content of Residual CO2 in the process gas
Table 4. The comparison of RMSE and R² statistics in the pre-decarburization absorption unit

                 The proposed method    LWNGR     LWPLS
RMSE value       0.0667                 0.1872    0.2845
R² statistic     0.9572                 0.6644    0.2135