An Online Performance Monitoring and Modeling Paradigm Based on Just-in-time Learning and Extreme Learning Machine for Non-Gaussian Chemical Processes

Authors
Xin Peng, Yang Tang, Wenli Du, Feng Qian*
Key Laboratory of Advanced Control and Optimization for Chemical Processes, Ministry of Education, East China University of Science and Technology, Shanghai, China

*Corresponding author: Feng Qian, [email protected], Key Laboratory of Advanced Control and Optimization for Chemical Processes, Ministry of Education, East China University of Science and Technology, Shanghai, China
Abstract
This paper proposes a novel performance monitoring and online modeling method for non-Gaussian chemical processes with multiple operating conditions. Within the proposed framework, a kernel Extreme Learning Machine (ELM) technique is used to efficiently extract features from high-dimensional process data. In addition, the Fastfood kernel is introduced into the kernel ELM to reduce the computational cost, which is otherwise high at prediction time. A modified Just-in-time learning (JITL) technique is then applied for online modeling. Within JITL, a novel similarity index, the modified adjusted cosine similarity (MACS), is proposed to improve the prediction performance of online modeling. The proposed paradigm provides an efficient, accurate, and fast approach to monitoring and modeling multimode chemical processes. Its validity and effectiveness are evaluated on a synthetic non-Gaussian multimode model and a distillation system.

Keywords: Non-linear process, Non-Gaussian process, Extreme learning machine, Fastfood kernel, Gaussian kernel, Just-in-time learning, Online modeling, Process monitoring
1 Introduction

In the last decade, the monitoring of modern industrial manufacturing systems (e.g., chemical engineering 1, 2 and the biological industry 3, 4) has received increasing attention because of higher process safety and quality requirements 5-7. In such large systems, most maintenance costs are attributable to control system malfunctions (i.e., aging, unanticipated interactions of components, and misuse of components) 8. However, detecting abnormalities and their corresponding root causes becomes increasingly difficult because of the complexity and scale of these modern process systems. Statistical learning based methods for chemometrics and process modeling and monitoring have recently been developed 9. These methods build on conventional statistical approaches related to time series analysis, regression, and classification.

Statistical methods, often referred to as statistical process monitoring (SPM), have achieved great success owing to the high availability of historical process data from distributed control systems (DCS). These data can be used to construct a statistical model for predicting and supervising the status of the process. SPM-based monitoring methods take the serial and spatial correlation of process data into account, which provides a more efficient and precise monitoring framework than conventional model-based approaches and methods that merely set a control threshold for each observation. This feature renders SPM-based methods effective especially when the process data form high-dimensional, highly correlated multivariate datasets. To recapitulate, Multivariate Statistical Process Monitoring (MSPM) 10 mainly focuses on decorrelating the high-dimensional data in order to extract the key features contained in it. By analyzing the information in key features that reflect the operating condition of the process 11, faulty behavior occurring in the process can be detected.

Principal component analysis (PCA) and partial least squares (PLS) are two of the most comprehensively researched MSPM methods 12, 13 in the field of high-dimensional process modeling and monitoring. PCA is a dimension reduction procedure concerned with elucidating the covariance structure of a set of measurements; in particular, it identifies the principal directions in which the raw process data vary. Like PCA, PLS exploits the spatial correlation of process data; unlike PCA, however, PLS decomposes the measurements into a feature space where the correlation between the predictor and predicted variables is maximized 14. For these methods, the derivation of the control limits for Hotelling's T2 and squared prediction error (SPE) monitoring statistics is based on the assumption that the normal operation data follow a multivariate Gaussian distribution 15, which may not be satisfied in practice. In fact, practical industrial process data are often significantly non-Gaussian.
Consequently, conventional PCA and PLS usually suffer when no proper modification is made to handle the non-Gaussianity. Total PLS (T-PLS) was proposed to address an inherent flaw of PLS and provide a more detailed decomposition of the process variable matrix, which is beneficial for non-Gaussian process monitoring 16. Meanwhile, Peng proposed the Total KPLS (T-KPLS) 17 model for nonlinear non-Gaussian processes, and Jia 18 extended this model to quality-related non-Gaussian process monitoring via singular value decomposition. Beyond PCA and PLS, novel methods such as Independent Component Analysis (ICA) have been proposed to monitor non-Gaussian processes by separating the raw data into independent components (ICs) instead of the orthogonal components used in the PCA framework 19. The ICs are assumed to be non-Gaussian and mutually independent in terms of high-order statistics, which implies that they can preserve non-Gaussian features more effectively than traditional PCA/PLS-based methods 20. A number of ICA-based applications have been successfully applied to chemical process monitoring 21. However, conventional ICA is still based on a linearity assumption, which may result in unsatisfactory performance in practice. Hence, several modifications of ICA have been proposed. For instance, Zhao et al. 22 proposed a Kernel ICA-PCA to deal with nonlinear features of the data, and Jiang et al. developed a double-weighted ICA strategy 23 to increase the sensitivity of fault detection.

However, some industrial processes change their operating mode according to their manufacturing strategies, product requirements, and economic considerations 24. In these cases, conventional MSPM methods, such as PCA and ICA, may perform poorly in multimode situations owing to their single-operating-mode assumption 25. Considering that both PCA and ICA can improperly interpret such process data, a Bayesian network based dimension reduction method 26 was proposed to deal with non-linear non-Gaussian variables. Although this method was successfully applied to non-linear chemical processes, it, like traditional PCA and ICA, still lacks a mechanism for multimode process data because it is merely a dimension reduction method.

In terms of multimode process modeling, current state-of-the-art methods can be categorized into three major classes. The first class is based on a precise global model 27: by combining features from all operating conditions, it merges all the local models while minimizing the dissimilarity among them. The second class builds multiple local models, one for each individual operating condition; in this strategy, the key factors for performance monitoring are the classification criterion for the monitored sample and the clustering method used to divide the training data into multiple subsets. K-means and fuzzy C-means are two of the most commonly used approaches to separate the training data into
their corresponding clusters. The third class uses adaptive models that are updated periodically to track variations in the data structure. Adaptive techniques include recursive methods 28, moving windows (MW) 29, time difference (TD) 30, and Just-in-time learning (JITL) 31. These methods are effective especially when applied to the performance monitoring of processes with mode shifting 32. However, an adaptive model tends to be over-trained when the updating criterion is improperly selected.
The present paper constructs an appropriate adaptive paradigm for online performance monitoring and modeling of a multimode non-Gaussian process. Conventional monitoring methods (e.g., PCA and ICA) extract non-Gaussian features from raw process data inefficiently, and the traditional Gaussian kernel may incur a high computational cost. The newly proposed paradigm avoids these disadvantages by providing an efficient way to project high-dimensional non-linear raw process data into a low-dimensional linear feature space at relatively low computational cost. In this paper, the extreme learning machine (ELM) is introduced into performance monitoring and modeling because of its outstanding performance in extracting features from high-dimensional process data. Specifically, the proposed method first identifies the current operating mode by kernel density estimation (KDE). Meanwhile, a promising approximate kernel expansion, the Fastfood kernel, which significantly accelerates the evaluation of the kernel function, is introduced to replace conventional kernel tricks and thereby speed up the ELM. This modified ELM provides an efficient framework for online fault detection. Then, once a fault is detected in the process, the online JITL method is applied to estimate the current status. In this phase, a modified similarity index is introduced to enhance the precision of the predictions of the proposed ELM-based modeling and monitoring method. This hybrid paradigm provides a rapid and accurate approach to online modeling and monitoring of the status of a non-Gaussian process with multiple operating modes. The newly proposed method is applied to a distillation system with three operating points to validate its efficiency and efficacy.

The remainder of the paper is organized as follows. Section 2 briefly discusses the preliminaries of the extreme learning machine, non-linear projection kernel tricks, and JITL. Section 3 presents the details of the proposed method. Section 4 demonstrates the performance of the proposed method on a synthetic non-Gaussian multimode toy model and a distillation system with three operating modes. Finally, a brief conclusion is presented in Section 5.
2 Preliminaries

2.1 Extreme learning machine
ELM is a promising data mining method that has been successfully applied to regression and to the classification of large and multi-labeled datasets. The ELM framework (shown in Figure 1) is based on least squares theory for modeling the training dataset. ELM was originally designed for single-hidden-layer feedforward neural networks and was subsequently extended to more generalized networks that need not resemble conventional neural networks 33.

In contrast to conventional training methods such as support vector machines (SVM), ELM models data with a single layer of hidden nodes whose input weights are randomly assigned and never updated. Therefore, the training speed of ELM is considerably faster than that of SVM, which renders ELM a suitable method for online process modeling and monitoring. The training stage of ELM consists of two phases. In the first phase, the hidden layer is constructed from a fixed number of randomly generated mapping neurons; a sigmoid or Gaussian function can be selected as the mapping function 34. In the second phase, ELM determines the output weights by minimizing the sum of the squared prediction errors, as shown in Equation (6).

In the ELM framework, the data sample $x = \{x_1, x_2, \ldots, x_n\}^T \in \mathbb{R}^{n \times m}$, where $n$ is the number of samples and $m$ is the dimension of each sample, is observed and used to learn features. The sample is expressed as follows:
\[ x = Hs, \quad s = WBx, \tag{1} \]

where $s$ denotes the set of independent components and $H \in \mathbb{R}^{n \times n}$ represents the unknown mixing matrix; $W \in \mathbb{R}^{n \times n}$ and $B \in \mathbb{R}^{n \times n}$ are the demixing matrix and the component-selecting matrix, respectively. Therefore, each component $s_i$ can be defined as:

\[ s_i(x) = \sum_{i=1}^{L} \beta_i\, G_i(x, a_i, b_i), \tag{2} \]

where $G_i(x, a_i, b_i)$ is the output of the $i$th hidden node. ELM aims to resolve the optimization problem:

\[ J(\beta, s) = \left\| G(x, a, b)\,\beta - s \right\|_2^2. \tag{3} \]

In Equations (2) and (3), $L$ denotes the number of hidden nodes in the ELM; $\beta$, defined as $\beta = [\beta_1, \ldots, \beta_L]$, is the weight vector in the matrix $B$ that connects the feature
space and its corresponding components; $g_i$ is the activation function; and $a_i$ and $b_i$ are the weight and bias parameters between the hidden nodes and the inputs, respectively.

The activation function for a specific hidden node can be expressed as:

\[ G_i(x, a_i, b_i) = g_i(a_i x + b_i). \tag{4} \]

In contrast to conventional training methods such as SVM, all parameters except $\beta_i$ are generated randomly. After resolving the optimization problem (using the least squares method), for a specific input $x = \{x_1, x_2, \ldots, x_n\}^T$, the weight matrix $\beta$ can be calculated by:

\[ \beta = H^{\dagger} s, \tag{5} \]

where $H^{\dagger}$ is the Moore-Penrose generalized inverse of $H$, with

\[ H = \begin{bmatrix} G(x_1, a_1, b_1) & \cdots & G(x_1, a_L, b_L) \\ \vdots & \ddots & \vdots \\ G(x_n, a_1, b_1) & \cdots & G(x_n, a_L, b_L) \end{bmatrix}_{n \times L}. \]

However, in the original ELM framework, process noise affects the model accuracy because all the parameters are generated randomly, so some of the hidden nodes are irrelevant to the process data. With too many nodes the ELM model is over-fitted; with too few, it is under-fitted. Therefore, a sparse method is introduced into ELM to improve its performance on noisy data, and the objective function of ELM becomes:

\[ J(\beta, s) = \left\| H\beta - s \right\|_2^2 + \zeta \left\| \beta \right\|_1, \tag{6} \]

where $\|\beta\|_1$ is the L1-norm of $\beta$ and $\zeta$ is the sparsity tuning parameter. Initially, more nodes are generated than are needed to represent the process data, and the L1-norm penalty prunes the redundant hidden nodes. Alternatively, other modified methods, such as pruned-ELM (P-ELM) 35 and optimally pruned-ELM (OP-ELM) 36, have been proposed to avoid overfitting in the presence of noisy data.
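To make the two training phases concrete, the following is a minimal sketch of basic (non-sparse) ELM fitting per Equations (4)-(5), written in Python with NumPy; the function names and the choice of a sigmoid activation are our own illustrative assumptions, not code from the paper.

```python
import numpy as np

def elm_train(X, S, L=50, seed=0):
    """Minimal ELM fit per Eqs. (4)-(5): random hidden layer, least-squares output weights."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    A = rng.standard_normal((m, L))          # random input weights a_i (never updated)
    b = rng.standard_normal(L)               # random biases b_i
    H = 1.0 / (1.0 + np.exp(-(X @ A + b)))   # sigmoid activation, Eq. (4)
    beta = np.linalg.pinv(H) @ S             # Moore-Penrose solution, Eq. (5)
    return A, b, beta

def elm_predict(X, A, b, beta):
    """Propagate new samples through the fixed random hidden layer."""
    H = 1.0 / (1.0 + np.exp(-(X @ A + b)))
    return H @ beta
```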
2.2 Kernel Trick

Kernel methods, a class of algorithms for machine learning and pattern analysis, have received considerable attention in the field of nonlinear chemical processes in recent decades. By using the kernel trick, ELM can effectively handle nonlinear modeling. The idea of kernel-based ELM is to nonlinearly map the process data into a feature space in which the data have a more linear structure. In that feature space, ELM can be used to
properly extract the data features. Specifically, by computing the inner products between the projections of all pairs of data points in the higher-dimensional feature space, the kernel trick yields a more linear data structure in that space. The key idea is to define a nonlinear mapping $x_i \mapsto \Phi(x_i) \in F$ for $x_i \in \mathbb{R}^n$ ($i = 1, 2, \ldots, n$) and then apply a linear method (e.g., PCA, PLS, or ELM) in the newly defined feature space $F$.

After whitening the data (viz. $\sum_{i=1}^{n} \Phi(x_i) = 0$), the covariance matrix in the space $F$ can be calculated by:

\[ \mathrm{Cov} = \frac{1}{n} \sum_{i=1}^{n} \Phi(x_i)\, \Phi(x_i)^T. \tag{7} \]

Thus, the principal components can be extracted by computing the eigenvectors of the matrix $\mathrm{Cov}$:

\[ \lambda \cdot Q = \mathrm{Cov} \cdot Q. \tag{8} \]

Instead of directly eigen-decomposing the covariance matrix, the kernel trick can be applied as an alternative way to find the principal components. The kernel is defined through a Gram matrix as:

\[ [K]_{ij} = K(x_i, x_j) = \Phi(x_i)^T \Phi(x_j) = \left\langle \Phi(x_i), \Phi(x_j) \right\rangle. \tag{9} \]

Thus, the kernel matrix is obtained as $K = \Theta^T \Theta$, where $\Theta = \left( \Phi(x_1), \ldots, \Phi(x_n) \right)$. By introducing a kernel function $k(x_i, x_j) = \left\langle \Phi(x_i), \Phi(x_j) \right\rangle$,
the inner products in the feature space can be calculated and the explicit nonlinear mapping avoided. Some widely used kernel functions are:

Sigmoid: \( k(x_i, x_j) = \tanh\!\left( \delta_0 \langle x_i, x_j \rangle + \delta_1 \right) \)

Polynomial: \( k(x_i, x_j) = \left( \langle x_i, x_j \rangle + 1 \right)^{\tau} \)

Gaussian RBF: \( k(x_i, x_j) = \exp\!\left( -\| x_i - x_j \|^2 / 2\sigma^2 \right) \)

where $\delta_0$ and $\delta_1$ are the parameters of the sigmoid kernel, $\tau$ is a positive integer for the polynomial kernel, and $\sigma$ is the bandwidth of the Gaussian RBF kernel. Once the kernel matrix is obtained, mean centering and variance scaling can be performed as:

\[ \tilde{K} = K - 1_n K - K 1_n + 1_n K 1_n. \tag{10} \]
In Equation (10), the matrix $1_n$, whose entries all equal $1/n$, is defined as:

\[ 1_n = \frac{1}{n} \begin{bmatrix} 1 & \cdots & 1 \\ \vdots & \ddots & \vdots \\ 1 & \cdots & 1 \end{bmatrix}_{n \times n}. \tag{11} \]
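As a small worked example of the computations above, the sketch below builds a Gaussian RBF Gram matrix per Equation (9) and centers it per Equations (10)-(11). It assumes Python with NumPy; the function names are our own illustrative choices.

```python
import numpy as np

def gaussian_kernel_matrix(X, sigma=1.0):
    """Gram matrix K of the Gaussian RBF kernel, Eq. (9)."""
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T   # squared pairwise distances
    return np.exp(-d2 / (2.0 * sigma**2))

def center_kernel(K):
    """Mean-center the kernel matrix in feature space, Eq. (10)."""
    n = K.shape[0]
    one_n = np.full((n, n), 1.0 / n)                 # the 1_n matrix of Eq. (11)
    return K - one_n @ K - K @ one_n + one_n @ K @ one_n
```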
2.3 Just-in-time learning framework
Just-in-Time learning (JITL), also called model-on-demand, instance-based learning, or lazy learning, was developed as a dynamic approach to modeling nonlinear systems 37, 38 in the fields of chemical process modeling, monitoring, and control. Compared with conventional modeling methods, JITL focuses on modeling the current situation from a set of the nearest or most similar samples, and it performs online learning only when it is needed 39. The model is therefore inherently adaptive to changes in the process characteristics 40. This feature enables JITL to use process data collected under the nominal operating condition for offline modeling and then to update the model according to the online process data.

In brief, the JITL method is particularly suitable when the samples are not fully available or when the process modes change during the online monitoring phase. Compared with conventional offline global modeling, JITL modeling focuses on local model structures constructed from the relevant samples; consequently, the current status of the process can be described by a local JITL model. The frameworks of both global and JITL modeling are illustrated in Figure 2.
The major steps of JITL (sketched in code below) are:
1. Relevant data samples are selected to match the new monitored sample according to a similarity measure (e.g., Euclidean distance, Mahalanobis distance, or mutual information 41);
2. A local model is constructed based on the relevant samples;
3. Model outputs (e.g., monitoring statistics or model predictions) are derived from both the local model and the new monitored sample.
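As a concrete illustration of these three steps, the following is a minimal Python/NumPy sketch. The Euclidean distance and the ridge-regularized local linear model are placeholder choices for exposition (the paper itself uses the MACS index of Section 3.2 and a kernel ELM as the local model); `n_relevant` and all names are illustrative assumptions.

```python
import numpy as np

def jitl_predict(x_new, X_train, Y_train, n_relevant=12):
    """Generic JITL cycle: select relevant samples, fit a local model, predict."""
    dist = np.linalg.norm(X_train - x_new, axis=1)       # step 1: similarity ranking
    idx = np.argsort(dist)[:n_relevant]                  # most relevant samples
    Xr, Yr = X_train[idx], Y_train[idx]
    reg = 1e-6 * np.eye(Xr.shape[1])                     # small ridge for stability
    theta = np.linalg.solve(Xr.T @ Xr + reg, Xr.T @ Yr)  # step 2: local model
    return x_new @ theta                                 # step 3: local prediction
```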
Recently, several interesting JITL-based modeling methods have been proposed. A probabilistic JITL (P-JITL) has been proposed to deal with data samples that contain missing values in chemical processes; this method uses a symmetric Kullback-Leibler divergence to measure the difference between two distributions 31. Distance-based 42 and angle-based 43 JITLs have also been applied to process modeling. The monitoring results and estimates of JITL are degraded by improper selection of the relevant samples, so a suitable similarity criterion for the relevant samples is important in JITL, and proposing a modified JITL online modeling method for online ELM modeling is therefore beneficial.

3 Online ELM modeling and monitoring based on Fastfood kernel and JITL

3.1 Fastfood kernel-based ELM

Since conventional ELM and the aforementioned LARS-based ELM are still linear modeling methods, their ability to model and monitor high-dimensional nonlinear processes is limited. A natural idea is to combine an effective kernel with ELM for nonlinear processes 44. By using a kernel trick, the lower-dimensional nonlinear data can be mapped to a higher-dimensional linear feature space described by the hidden nodes.

The main advantage of the kernel trick is that it avoids nonlinear optimization, which can be complicated and computationally expensive. To clarify, for a process dataset $x = \{x_1, x_2, \ldots, x_k, \ldots, x_n\}^T$, given that the nonlinear mapping $h(x_k)$ of the process data is unknown, the hidden-layer feature mapping can be expressed as $h(x_k) = [g(x_k, a_1, b_1), \ldots, g(x_k, a_L, b_L)]$, where $L$ denotes the number of hidden nodes in the ELM.

According to Equation (5) and the partial derivative of Equation (6), $\beta$ can be defined as:

\[ \beta = H^T \left( \frac{I}{\zeta} + H H^T \right)^{-1} s, \tag{12} \]

where $s = \{s_1, s_2, \ldots, s_n\}^T$ and $H$ is the hidden-layer output matrix defined in Section 2.1. Thus, the corresponding output function is calculated by:

\[ f(x_k) = h(x_k)\,\beta = h(x_k)\, H^T \left( \frac{I}{\zeta} + H H^T \right)^{-1} s. \tag{13} \]

By utilizing the kernel trick, in the kernel space,

\[ \Theta = H H^T, \qquad \Theta_{i,j} = k(x_i, x_j) = \left\langle \Phi(x_i), \Phi(x_j) \right\rangle. \tag{14} \]

Thus, Equation (13) can be transformed into:

\[ f(x_k) = h(x_k)\, H^T \left( \frac{I}{\zeta} + H H^T \right)^{-1} s = \begin{bmatrix} k(x_k, x_1) \\ \vdots \\ k(x_k, x_n) \end{bmatrix}^T \left( \frac{I}{\zeta} + \Theta \right)^{-1} s. \tag{15} \]

The conventional Gaussian or sigmoid function can be selected as the kernel for kernel-based ELM. However, owing to the computational cost of these two kernels 45, they may not be suitable for online modeling, and a specific approximate kernel expansion that cuts down the computation time is needed. Despite the successful application of kernels to nonlinear process modeling, the disadvantage of state-of-the-art kernel methods in high-dimensional process monitoring is that large-scale data make evaluating the kernel function extremely expensive, especially in the online modeling and monitoring phase. To overcome this problem, Le, Sarlos, and Smola 46 proposed an approximate kernel expansion called the Fastfood kernel to accelerate computation. The idea of the Fastfood kernel is to fit the kernel approximation via a product of diagonal and simple matrices based on the Walsh-Hadamard transform 47.

For an m-dimensional dataset $x = \{x_1, x_2, \ldots, x_n\}^T \in \mathbb{R}^{n \times m}$, the approximate Gaussian radial basis function (RBF) kernel feature mapping can be defined as:

\[ \Phi_j(x_i) = \frac{1}{\sqrt{n}} \exp\!\left( \mathrm{i}\, [V x_i]_j \right), \quad i = 1, 2, \ldots, n; \; j = 1, 2, \ldots, m, \tag{16} \]

so that $k(x_i, x_j) = \langle \Phi(x_i), \Phi(x_j) \rangle$, where

\[ V = \frac{1}{\sigma \sqrt{n}}\, S H G \Pi A E. \tag{17} \]

In Equation (17), $S$, $G$, and $E$ are diagonal random matrices that can be computed once and then stored, $\Pi$ denotes a random permutation matrix, and $A$ stands for a Walsh-Hadamard transform matrix, which can be calculated recursively by using the Hadamard-ordered fast Walsh-Hadamard transform (FWHT) as:

\[ A_2 = \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} \quad \text{and} \quad A_n = \frac{1}{\sqrt{2}} \begin{bmatrix} A_{n-1} & A_{n-1} \\ A_{n-1} & -A_{n-1} \end{bmatrix}. \tag{18} \]

The FWHT can be considered a generalized form of the Fourier transform; it is orthogonal, symmetric, and linear. In summary, the computational cost of the Fastfood kernel, $O(n \log n)$, is lower than that of the conventional Gaussian RBF kernel, $O(n^2 m)$ 48. This characteristic makes the Fastfood kernel suitable for online modeling: by using the Fastfood kernel trick instead of the conventional Gaussian RBF kernel, ELM can effectively extract the nonlinear features in the feature space at a relatively low computational cost.

3.2 Similarity index based on posterior probability and cosine similarity

In the JITL framework, the samples with high similarity to the new monitored sample (often referred to as the relevant dataset) are selected to build a local model, so the similarity criterion is crucial for online JITL modeling. The Euclidean distance and the Mahalanobis distance are two of the most commonly used similarity indices. However, because they neglect the non-Gaussian features of process data, the performance of a conventional JITL model is not always satisfactory. In particular, when the process data have pronounced multimode features, a global distance-based index may not characterize the local features of the dataset properly. Cheng and Chiu proposed a comprehensive similarity factor (SF), which combines distance and angle indices and shows better performance than the Euclidean distance alone 43. In practice, the assumption that the process data follow a unimodal Gaussian distribution may be invalid. Hence, a similarity index that combines mode-cluster information with a conventional similarity index could be more appropriate for modeling practical process data. A schematic diagram of the proposed index is shown in Figure 3.

Given a process with $M$ modes $x = \{x_1, x_2, \ldots, x_i, \ldots, x_n\}^T$, where $k$ denotes the $k$th mode and $X_k = \{x_1, x_2, \ldots, x_{n_k}\}_k^T \in \mathbb{R}^{n_k \times m}$ represents the process data sampled from the $k$th operating mode, the whole process dataset can be expressed as $X = \{X_1, X_2, \ldots, X_M\}^T$. For a new monitored sample $x_{new}$ that needs online modeling, a cosine similarity index can be calculated as:

\[ \mathrm{sim}_{\cos}(x_{new}, x_i) = \frac{\left\langle x_{new}, x_i \right\rangle}{\| x_i \|_2 \, \| x_{new} \|_2}. \tag{19} \]

However, measuring the similarity with Equation (19) has two crucial drawbacks: (1) the differences in distribution scales among different modes are not considered, and (2) the equation is a global similarity measure, whereas each point belongs to a particular sub-mode, so its similarity index should be weighted toward that sub-mode. To solve this problem, an adjusted local cosine similarity (ACS) index is constructed for each sub-mode, and a single index (called the modified adjusted local cosine similarity (MACS) index) is subsequently obtained by weighting each sub-mode according to the probability that the data point belongs to it. The ACS is defined as:

\[ \mathrm{ACS}(x_{new}, x_i) = \frac{\left\langle x_{new} - \bar{X}_k, \; x_i - \bar{X}_k \right\rangle}{\| x_i - \bar{X}_k \|_2 \, \| x_{new} - \bar{X}_k \|_2}, \tag{20} \]

where $\bar{X}_k$ denotes the mean of the samples from the $k$th mode. To determine the probabilities of membership in each mode, we use Bayesian inference:

\[ p(C_k \mid x_{new}) = \frac{p(C_k)\, p(x_{new} \mid C_k)}{\sum_{i=1}^{M} p(C_i)\, p(x_{new} \mid C_i)}. \tag{21} \]

In Equation (21), $p(C_k)$ denotes the prior probability that an arbitrary sample is generated from the $k$th mode $C_k$. Then, we can define the MACS:

\[ \mathrm{MACS}(x_{new}, x_i) = \sum_{j=1}^{M} p(C_j \mid x_{new}) \cdot \mathrm{ACS}_j(x_{new}, x_i), \tag{22} \]

where $\mathrm{ACS}_j$ is the ACS of Equation (20) evaluated with respect to mode $j$. In the above equation, the posterior probability $p(C_j \mid x_{new})$ is determined by the conditional probability $p(x_{new} \mid C_k)$, which can be estimated by using KDE.

Multivariate KDE via the Parzen-Rosenblatt window method is extensively used as a nonparametric approach to estimate the probability density function of each output $s_i$ from the local kernel ELM model that represents the $k$th mode. Given a new monitored sample $x_{new}$, a multivariate kernel estimator can be constructed as:

\[ p(x_{new} \mid C_k) = \frac{1}{n_k} \sum_{j=1}^{n_k} \frac{1}{h} K\!\left( \frac{x_{new} - x_{j,k}}{h} \right), \tag{23} \]

where $K$ stands for a kernel function, $h$ is the Parzen-window bandwidth that acts as a smoothing parameter in the KDE method, and $n_k$ is the number of samples in the mode. The density can be regarded as a measure of whether a new monitored sample belongs to the same mode as the reference data: a high density of the monitored sample with respect to a specific mode indicates that the sample probably belongs to the same mode as the reference data. The kernel function $K$ should be unimodal and symmetrically smooth with a peak at zero. As discussed in Section 2.2, several widely used kernels can be applied to estimate the density; however, given that the choice of kernel function is not the key point of this section and a Gaussian kernel is always a safe choice 49, it will be used for the estimation below:

\[ \varphi(u) = \frac{1}{\sqrt{2\pi}}\, e^{-u^2/2}. \tag{24} \]

Thus, by introducing a Gaussian kernel, Equation (23) can be transformed into:

\[ p(x_{new} \mid C_k) = \frac{1}{n_k h_k \sqrt{2\pi}} \sum_{i=1}^{n_k} \exp\!\left( -\frac{\left( x_{new} - x_{i,k} \right)^2}{2 h_k^2} \right). \tag{25} \]

According to Bayesian inference theory, once the prior is determined, the posterior probability is affected only by the conditional probability. This value, which will be relatively small or even zero when the new monitored sample does not belong to the $k$th mode, reflects the actual local structural relationship of the monitored sample with each sub-mode.

The choice of window bandwidth is crucial for the final KDE estimate and depends on several factors, such as the dimension of the training dataset, the data distribution, and the choice of kernel function. In this paper, the adaptive window bandwidth method proposed by Botev, Grotowski, and Kroese 50 is applied to obtain a suitable bandwidth $h$ for each mode. Because sufficient training data from each mode of the multimode process can easily be obtained, the performance of KDE can be assured. Once $p(x_{new} \mid C_k)$ is calculated, the corresponding similarity index can be derived from Equation (22), and the relevant samples are subsequently selected according to this similarity index for online JITL modeling.

In summary, the proposed method has two major stages, as illustrated in Chart 1. In the framework of the proposed method, the JITL, based on the newly proposed similarity index (i.e., MACS), focuses on selecting a suitable sample set; the Fastfood kernel based modified ELM then uses the sample set selected by the JITL for local modeling. Specifically, in the offline stage, the training dataset is used to construct local ELM models with the Fastfood kernel-based ELM algorithm, and both the local models and the training data are stored for the online stage. In the online monitoring stage, each new monitored sample is classified into a certain mode according to its posterior probability and is monitored by its corresponding local ELM model. Once an abnormality is detected, the relevant dataset is selected according to the modified cosine similarity index, and the online JITL-based modeling and estimation method is then used to predict the process status.
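To make the pieces of this section concrete, the following is a minimal, self-contained Python/NumPy sketch of the online building blocks: an FWHT and Fastfood-style random features standing in for Equations (16)-(18), a regularized feature-space solve analogous to Equation (15), the KDE-based mode posterior of Equations (21) and (25), and the MACS index of Equations (20)-(22). All function names, normalizations, and the simplified isotropic KDE are our own illustrative assumptions, not the paper's reference implementation.

```python
import numpy as np

def fwht(a):
    """Iterative Hadamard-ordered fast Walsh-Hadamard transform (unnormalized), cf. Eq. (18)."""
    a = a.astype(float).copy()
    h, n = 1, len(a)
    while h < n:
        for i in range(0, n, 2 * h):
            for j in range(i, i + h):
                a[j], a[j + h] = a[j] + a[j + h], a[j] - a[j + h]
        h *= 2
    return a

def fastfood_map(X, sigma=1.0, seed=0):
    """Fastfood-style random Fourier features approximating a Gaussian RBF kernel,
    cf. Eqs. (16)-(17). Assumes the dimension d is a power of two (zero-pad otherwise)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    B = rng.choice([-1.0, 1.0], d)                          # random sign diagonal
    G = rng.standard_normal(d)                              # Gaussian diagonal
    P = rng.permutation(d)                                  # permutation Pi
    S = np.sqrt(rng.chisquare(d, d)) / np.linalg.norm(G)    # row-length correction
    Z = np.array([S * fwht(G * fwht(x * B)[P]) for x in X]) / (sigma * np.sqrt(d))
    return np.hstack([np.cos(Z), np.sin(Z)]) / np.sqrt(d)   # real-valued features

def kernel_elm_fit(Phi, S_out, zeta=1.0):
    """Regularized least-squares output weights, the feature-space analogue of Eq. (15)."""
    A = Phi.T @ Phi + np.eye(Phi.shape[1]) / zeta
    return np.linalg.solve(A, Phi.T @ S_out)

def mode_posteriors(x_new, modes, h=1.0, priors=None):
    """Posterior p(C_k | x_new) via a simplified isotropic Gaussian KDE, Eqs. (21), (25).
    `modes` is a list of (n_k x m) training arrays, one per operating mode."""
    M = len(modes)
    priors = np.full(M, 1.0 / M) if priors is None else np.asarray(priors)
    lik = np.empty(M)
    for k, Xk in enumerate(modes):
        d2 = np.sum((Xk - x_new) ** 2, axis=1)
        lik[k] = np.mean(np.exp(-d2 / (2 * h**2))) / (h * np.sqrt(2 * np.pi))
    post = priors * lik
    return post / post.sum()

def macs(x_new, x_i, modes, post):
    """MACS index, Eqs. (20)-(22): posterior-weighted adjusted cosine similarities."""
    val = 0.0
    for k, Xk in enumerate(modes):
        mu = Xk.mean(axis=0)                                # mode mean, Eq. (20)
        a, b = x_new - mu, x_i - mu
        val += post[k] * (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)
    return val
```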
4 Case Studies and discussion

4.1 Illustrative synthetic example
A synthetic example is generated to demonstrate the performance of the proposed multimode Fastfood-kernel-based ELM approach. The synthetic model consists of 7 inputs and 1 output, and there are 3 different operating modes. First, the 5 source variables are generated according to the following equations:

\[
\begin{aligned}
s_1(t) &= 2\cos(0.08t)\sin(0.06t) \\
s_2(t) &= \sin(0.3t) + 3\cos(0.1t) \\
s_3(t) &= \sin(0.4t) + 3\cos(0.1t) \\
s_4(t) &= \cos(0.1t) - \sin(0.05t) \\
s_5(t) &= \text{uniformly distributed noise in } [-1, 1].
\end{aligned}
\tag{26}
\]
The mixing matrices $\Gamma_1$ and $\Gamma_2$ are defined as:

\[
\Gamma_1 = \begin{bmatrix}
0.86 & -0.55 & 0.17 & -0.33 & 0.65 \\
0.79 & 0.32 & 0.12 & 0.46 & -0.28 \\
0.67 & 0.27 & 0.15 & 0.56 & 0.84 \\
0.23 & 0.95 & 0.12 & 0.47 & 0.89 \\
0.34 & 0.2 & 0.8 & -0.97 & 0.4 \\
0.5 & -0.74 & -0.3 & -0.45 & 0.23 \\
0.13 & 0.14 & 0.92 & 0.19 & 0.56
\end{bmatrix}, \qquad
\Gamma_2 = \begin{bmatrix}
1 & 0 & 0 & 0 & 0 & 0 & 0 \\
1 & 1 & 0 & 0 & 0 & 0 & 0 \\
1 & 1 & 1 & 0 & 0 & 0 & 0 \\
1 & 1 & 1 & 1 & 0 & 0 & 0 \\
1 & 1 & 1 & 1 & 1 & 0 & 0 \\
1 & 1 & 1 & 1 & 1 & 1 & 0 \\
1 & 1 & 1 & 1 & 1 & 1 & 1
\end{bmatrix}. \tag{27}
\]

Modes 1, 2, and 3 are generated by:

Mode 1: \( x(t) = \Gamma_1 \left( s(t) - 8 \right) + e_{\mathrm{noise}}, \quad y(t) = 0.8\,x_1(t) + 0.6\,x_2(t) + 1.5\,x_3(t) \)

Mode 2: \( x(t) = \Gamma_2 \Gamma_1 \left( s(t) - 2 \right) + e_{\mathrm{noise}}, \quad y(t) = 2.4\,x_2(t) + 1.6\,x_3(t) + 4\,x_4(t) \)

Mode 3: \( x(t) = \Gamma_2^2 \Gamma_1 \left( s(t) + 2 \right) + e_{\mathrm{noise}}, \quad y(t) = 1.2\,x_1(t) + 0.4\,x_2(t) + x_4(t) \)

where $e_{\mathrm{noise}}$ is normally distributed noise, $e_{\mathrm{noise}} \sim N(0, 0.01)$, and the output is polluted by Gaussian noise:

\[ y = y + \eta, \quad \eta \sim N(0, 0.1). \]
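For reproducibility, a short Python/NumPy sketch of this data generator follows, covering the Equation (26) sources and the three mode equations. The helper names are our own, and the second parameter of $N(\cdot,\cdot)$ is read as a variance.

```python
import numpy as np

def generate_sources(T, seed=0):
    """Source signals s_1..s_5 of Eq. (26), shape (T, 5)."""
    t = np.arange(1, T + 1)
    rng = np.random.default_rng(seed)
    return np.column_stack([
        2 * np.cos(0.08 * t) * np.sin(0.06 * t),
        np.sin(0.3 * t) + 3 * np.cos(0.1 * t),
        np.sin(0.4 * t) + 3 * np.cos(0.1 * t),
        np.cos(0.1 * t) - np.sin(0.05 * t),
        rng.uniform(-1, 1, T),
    ])

def generate_mode(s, G1, G2, mode, rng):
    """Inputs x(t) and output y(t) for one operating mode, per the mode equations above."""
    if mode == 1:
        x = (s - 8) @ G1.T
        y = 0.8 * x[:, 0] + 0.6 * x[:, 1] + 1.5 * x[:, 2]
    elif mode == 2:
        x = (s - 2) @ (G2 @ G1).T
        y = 2.4 * x[:, 1] + 1.6 * x[:, 2] + 4 * x[:, 3]
    else:
        x = (s + 2) @ (G2 @ G2 @ G1).T
        y = 1.2 * x[:, 0] + 0.4 * x[:, 1] + x[:, 3]
    x += rng.normal(0, np.sqrt(0.01), x.shape)   # e_noise ~ N(0, 0.01)
    y += rng.normal(0, np.sqrt(0.1), len(y))     # output noise eta ~ N(0, 0.1)
    return x, y
```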
In the training phase, 250 samples are generated in each mode, so the training dataset contains 750 samples representing all 3 modes under normal conditions. The process starts in mode 1 and is followed by mode 2 and mode 3 sequentially. As Figure 4 shows, the illustrative process operating under 3 different modes is significantly non-Gaussian. Figure 5 shows the observations of the seven inputs ($x(t)$), which reveal that the dynamic process works under three different modes.

In the testing phase, a set of test data generated from all 3 modes is used to illustrate the performance of the proposed method. The process starts in mode 2, is followed by mode 1, and ends in mode 3. Three samples (the 15th sample for mode 2, the 275th sample for mode 1, and the 550th sample for mode 3) are selected as examples, and their cosine similarity and adjusted cosine similarity to each sample of the training data are shown in Figures 6 and 7. As the figures show, although both similarity indices can correctly discriminate the modes, their performance is unsatisfactory, especially in modes 1 and 2, because of the very small differences between the samples in these modes. In a more complex system, this can easily lead to an improper selection of the relevant dataset in JITL online modeling.

The posterior probability is then introduced into the similarity index to improve its performance and, meanwhile, to identify the modes. The conditional probability of membership in mode 1 for all 7 inputs is shown in Figure 8. Certain dimensions of the conditional probabilities are relatively small or even zero owing to the drift of the model structure; this drives the joint posterior probability toward zero when the sample does not belong to that mode. The joint posterior probability over all 7 inputs is illustrated in Figure 9, which shows that posterior probability techniques can be used to identify the mode structure. Therefore, by combining the adjusted cosine similarity and the posterior probability, the newly proposed MACS can select the relevant dataset more appropriately than the conventional cosine similarity. The performance of MACS is shown in Figure 10; all three modes are unambiguously identified from the data.

Following this discussion of the similarity index for online JITL modeling, a scenario (Scenario 1) is designed to demonstrate the performance of the newly proposed online ELM modeling and monitoring method. Because the data from mode 3 are easily differentiated from the other 2 modes (even with the basic cosine similarity), mode 3 is excluded in order to highlight the differences between ACS and MACS. In this scenario, the process starts in mode 1 under normal conditions. In the second phase, a step fault (Fault 1) $[0, 0, 1, 1, 0, 0, 0]^T$ is introduced into the process from the 251st sample and lasts for 250 samples. The process then switches to mode 2 under normal conditions from samples 1001 to 1501. Finally, another step fault (Fault 2) $[-1, 1, 0, 0, 0, 0, 0]^T$ is added to the process in phase 4 for another 500 samples. Hotelling's T2 is applied to monitor the process, and the confidence level of the control limit is set to 95%. The performance of the proposed method is shown in Figure 11. Taking the second phase as an example, when the fault is detected, both the MACS and ACS similarity indices are applied to select the relevant dataset. Figure 12 shows that the estimation result of MACS outperforms that of ACS.
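The Hotelling's T2 statistic used here can be computed as in the following sketch (Python with NumPy/SciPy); the function name is ours, and the standard F-distribution control-limit formula for a new observation is our illustrative rendering, not code from the paper.

```python
import numpy as np
from scipy import stats

def hotelling_t2(X_train, x_new, alpha=0.05):
    """Hotelling's T2 statistic for a new sample and its F-distribution
    control limit at the (1 - alpha) confidence level."""
    n, m = X_train.shape
    mu = X_train.mean(axis=0)
    S_inv = np.linalg.inv(np.cov(X_train, rowvar=False))
    d = x_new - mu
    t2 = d @ S_inv @ d
    limit = (m * (n - 1) * (n + 1)) / (n * (n - m)) * stats.f.ppf(1 - alpha, m, n - m)
    return t2, t2 > limit   # statistic and fault flag
```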
4.2 Distillation column system

Distillation is the physical separation of a mixture into two or more chemical products with different boiling points through selective evaporation and condensation; the more volatile components are separated from the mixture when it is heated. The vapor that comes off may represent an essentially complete separation, or a partial separation that increases the concentration of selected components relative to the liquid from which it evolved. In industrial chemistry, the distillation column is considered a practically universal operation unit. In continuous distillation (also known as fractional distillation), the process has at least two output fractions. The overhead product, which is one or more volatile distillate fractions, is boiled, vaporized, and then condensed to a liquid, while the bottom product, which consists of the less volatile components, exits from the bottom of the column 51. In practice, the continuous rectification process is a multistage countercurrent distillation operation. Real-time monitoring and modeling of the status of the distillation column are crucial for improving control quality.

The original model, created by Villalba Torán 52, has 4 manipulated variables, 4 controlled variables, and 3 measured input disturbances, plus 41 mole fractions and temperatures corresponding to every column stage. A schematic of the distillation column illustrating this model is shown in Figure 13, and explanatory measurements are listed in Table 1. Figure 14 depicts the data characteristics of the normalized output measurements B, D, xB, and yD. As can be seen in this figure, the process status varies drastically and dynamically. In practice, product purity is the key index for a continuous process; however, online component analysis of xB and yD may not be available in a distillation column. Therefore, following Jain 53, the product purity index PI is defined as:

\[ PI = x_B \cdot y_D. \tag{28} \]

In practical industrial processes, distillation columns are often operated under different operating conditions related to various quality requirements. In this paper, 3 operating
conditions are introduced into the fractional distillation system; the details of these 3 modes can be found in Table 2.

Figure 15 shows the normal probability plots for each variable in all three modes. The data behave linearly in such a plot if the dataset follows a Gaussian distribution; otherwise, distortion occurs when the process actually follows another kind of distribution. It can be seen from Figure 15 that all the outputs are non-Gaussian, and the other measurements (not shown) are likewise non-Gaussian. Therefore, applying a kernel-based multimode learning method is reasonable.

In constructing an appropriate Fastfood kernel-based ELM for each mode, it is important to determine the ideal number of outputs. According to Sivalingamaiah and Reddy 54, the number $k_s$ of outputs $s$ can be calculated by:

\[ k_s = \left\lceil \log_2 \left( N_{msr} / 2 \right) \right\rceil, \tag{29} \]

where $N_{msr}$ denotes the dimension of the measurement vector. In this distillation process, 90 measurements are taken at each sample; hence, the output number is set to 6. Note that the output number can be even less than 6 because of data redundancy. Consequently, the sparsity requirement of ELM in Equation (6) can be met, and satisfactory performance can be expected.

Several types of abnormal conditions can be introduced into the distillation column process.
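As a quick check of Equation (29), the following hypothetical helper (reading the formula with a ceiling, which makes $N_{msr} = 90$ indeed yield 6) computes the output count:

```python
import math

def n_outputs(n_msr):
    """Eq. (29): number of ELM outputs from the measurement dimension."""
    return math.ceil(math.log2(n_msr / 2))

print(n_outputs(90))  # ceil(log2(45)) = 6
```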
A detailed description is listed in Table 3. In this distillation process simulation, a proportional-integral (PI) controller failure refers to the case where the valve malfunctions, preventing the PI controller from tracking the set point. Two scenarios are designed to assess the monitoring and modeling qualities of the newly proposed method; detailed information on these 2 cases is listed in Table 4. These 2 scenarios cover all 3 major operating modes and the typical faults (ramp, spike, and PI controller failure). The confidence parameter ($\alpha$) of the control limits used in both scenarios is set to 5%. The training data are sampled from all 3 operating modes. In each mode, the distillation column runs for 6 hours with a sampling time of 0.5 minutes, giving 720 samples per mode; the whole training dataset thus consists of 2160 samples representing the 3 normal operating modes. In addition, 20 samples of each type of abnormal event are added to the training dataset. Additive white Gaussian noise (AWGN) is added to the process data in both the offline training and online monitoring phases, with a signal-to-noise ratio of 99%. To demonstrate the effectiveness of our technique, we constructed Scenarios 2 and 3 (Scenario 1 was presented in Section 4.1); the details of each scenario are shown in Table 4.
As seen in Figure 16, the occurrences of the predefined faults 2 and 4 cause a drift of the process model, which leads to a drop in the joint probability. However, because the joint conditional probability is then extremely low or even zero, the proposed method can still identify the mode effectively. Once a new monitored sample is classified into a certain mode according to its posterior probability, online monitoring based on the newly proposed Fastfood kernel based ELM can be employed to monitor the distillation process status; the performance of this method applied to Scenario 2 is shown in Figure 17. The method is capable of quickly detecting both faults.

In Scenario 2, when the abnormalities are detected, online JITL modeling can be applied as a soft sensor to predict the behavior of the distillation system. For example, in stage 2, fault 2 is introduced into the process and monitored by the proposed Fastfood kernel based method, and JITL modeling is invoked once the fault is detected. In this paper, the number of relevant samples is 12. The prediction results are shown in Figure 18, revealing that both ACS and MACS perform reasonably well. The root-mean-square error (RMSE) index is applied to measure the online modeling performance; it is $7.3071 \times 10^{-5}$ for MACS and $1.0617 \times 10^{-4}$ for ACS, indicating that MACS outperforms ACS in this case.

From Equation (29), the number of outputs of the kernel-based ELM can be set to 7 to achieve satisfactory modeling performance; however, owing to data redundancy, the dimension of the outputs can be reduced even further. Given the complexity of MACS, which requires the calculation of the posterior probability, the CPU time should be taken into consideration. Figure 19 shows the effect of the number of outputs on the CPU time and RMSE of the MACS calculation. The computation cost increases approximately linearly, whereas the corresponding RMSE does not decrease significantly beyond 3 outputs. Consequently, to balance modeling quality against computation cost, a rational choice for the output number of the ELM is 3 or 4; in this paper, the output number is set to 4 for the distillation system.

Once the output number of the ELM is determined, the results of analyzing the three indices (standard cosine, ACS, and MACS) with the original RBF kernel or the Fastfood kernel are shown in Table 5. The RMSE of MACS with a Fastfood kernel is slightly inferior to that with the original Gaussian RBF kernel, but the computational cost of MACS is reduced by 57.86% relative to the RBF kernel while the performance decreases only slightly. In addition, the KDE and the calculation of the posterior probability significantly increase the CPU time (which explains the faster performance of the cosine and ACS methods), but the resulting gain in modeling quality should not be neglected. Therefore, the combination of MACS and the Fastfood kernel keeps a proper balance between CPU time and model quality.
The online JITL modeling result and its corresponding estimation error in the fourth phase (Mode 2, Fault 4) are shown in Figure 20.

For Scenario 3, the process starts in mode 2. In phase 2, a ramp fault occurs on the molar feed rate F, with an amplitude of 20% and a duration of 10 minutes. In the next phase (phase 3), the process switches into mode 3 for an hour. Finally, a PI controller failure occurs on yD in the last phase. In brief, the monitoring result, the online JITL prediction results, and the corresponding estimation errors are illustrated in Figures 21, 22, and 23, respectively. As shown in Figure 21, the monitoring performance for Scenario 3 is satisfactory, and both faults are quickly identified. This is achieved because the proposed method can identify the differences between mode 2 and mode 3 and monitor the process status according to the corresponding mode. Once the abnormalities are detected, the online JITL model can properly track the dynamic characteristics via the KDE-based cosine similarity technique. The predictions of the JITL model are beneficial for subsequent fault diagnosis and for loss assessment when a fault happens; in particular, they can serve as a soft sensor that provides useful information for further system control when an abnormality occurs.
5 Conclusion

In this paper, a novel online monitoring and modeling method is proposed for performance monitoring and mode identification of non-Gaussian multimode processes. In the proposed paradigm, KDE techniques and a Bayesian classifier are used to estimate the likelihood of each operating mode for each monitored sample according to its posterior probability. Performance monitoring is then carried out with a local model constructed using the Fastfood kernel based ELM method, which can identify abnormalities in the process efficiently. ELM provides a novel and accurate way to model a non-Gaussian process by projecting the raw process data into a higher-dimensional hidden-node space and then extracting the proper information from the hidden nodes. Once a fault is detected, the current process status, which is crucial for further fault diagnosis, correction, and loss evaluation, can be estimated by an online JITL-based method. The relevant dataset of the monitored samples is always beneficial because it provides underlying process information that is helpful for predicting the process status. Therefore, to improve the quality of online modeling, a novel relevant-dataset selection method based on the adjusted cosine similarity and a Bayesian classifier is proposed in this paper. Case studies on both a numerical model and the distillation system benchmark process reveal its effectiveness for non-Gaussian multimode process monitoring. However, the transitional status between different modes cannot be strictly classified into one certain mode.
The Bayesian classifier of the proposed method does not take transition data into account, which may leave the method unable to distinguish transition data from fault data. Further study will focus on online modeling and monitoring of multimode processes with transition data; the differences between transition status and fault status in a multimode process should be analyzed to avoid false alarms when the process switches from one operating point to another.
References
1. Chen, J. Y.; Yu, J.; Mori, J.; Rashid, M. M.; Hu, G. S.; Yu, H. L.; Flores-Cerrillo, J.; Megan, L., A non-Gaussian pattern matching based dynamic process monitoring approach and its application to cryogenic air separation process. Comput Chem Eng 2013, 58, (45), 40-53.
2. Peng, X.; Tang, Y.; Du, W.; Qian, F., Multimode Process Monitoring and Fault Detection: A Sparse Modeling and Dictionary Learning Method. IEEE Transactions on Industrial Electronics 2017, PP, (99), 1-1.
3. Gregersen, L.; Jorgensen, S. B., Supervision of fed-batch fermentations. Chemical Engineering Journal 1999, 75, (1), 69-76.
4. Peng, X.; Tang, Y.; He, W.; Du, W.; Qian, F., A Just-in-Time Learning based Monitoring and Classification Method for Hyper/Hypocalcemia Diagnosis. IEEE/ACM Trans Comput Biol Bioinform 2017.
5. Qin, S. J.; Valle, S.; Piovoso, M. J., On unifying multiblock analysis with application to decentralized process monitoring. J Chemom 2001, 15, (9), 715-742.
6. Ge, Z.; Song, Z., Process Monitoring Based on Independent Component Analysis-Principal Component Analysis (ICA-PCA) and Similarity Factors. Ind Eng Chem Res 2007, 46, (7), 2054-2063.
7. Zhao, C. H.; Gao, F. R., Fault-relevant Principal Component Analysis (FPCA) method for multivariate statistical modeling and process monitoring. Chemometrics and Intelligent Laboratory Systems 2014, 133, 1-16.
8. Venkatasubramanian, V.; Rengaswamy, R.; Kavuri, S. N.; Yin, K., A review of process fault detection and diagnosis. Comput Chem Eng 2003, 27, (3), 327-346.
9. Yin, S.; Ding, S. X.; Haghani, A.; Hao, H.; Zhang, P., A comparison study of basic data-driven fault diagnosis and process monitoring methods on the benchmark Tennessee Eastman process. J. Process. Contr. 2012, 22, (9), 1567-1581.
10. MacGregor, J.; Cinar, A., Monitoring, fault diagnosis, fault-tolerant control and optimization: Data driven methods. Comput Chem Eng 2012, 47, 111-120.
11. Joe Qin, S., Statistical process monitoring: basics and beyond. J Chemom 2003, 17, (8-9), 480-502.
12. Zhang, Y. W.; Ma, C., Fault diagnosis of nonlinear processes using multiscale KPCA and multiscale KPLS. Chem Eng Sci 2011, 66, (1), 64-72.
13. Ding, S. X., Data-driven design of monitoring and diagnosis systems for dynamic processes: A review of subspace technique based schemes and some recent results. J Process Control 2014, 24, (2), 431-449.
14. Li, G.; Liu, B.; Qin, S. J.; Zhou, D., Quality relevant data-driven modeling and monitoring of multivariate dynamic processes: the dynamic T-PLS approach. IEEE Trans Neural Netw 2011, 22, (12), 2262-2271.
15. Yu, J.; Qin, S. J., Multiway Gaussian Mixture Model Based Multiphase Batch Process Monitoring. Ind Eng Chem Res 2009, 48, (18), 8585-8594.
16. Zhou, D.; Li, G.; Qin, S. J., Total projection to latent structures for process monitoring. AIChE Journal 2009, 56, (1), 168-178.
17. Peng, K. X.; Zhang, K.; Li, G., Quality-Related Process Monitoring Based on Total Kernel PLS Model and Its Industrial Application. Mathematical Problems in Engineering 2013, 2013, (4), 1-14.
18. Jia, Q. L.; Zhang, Y. W., Quality-related fault detection approach based on dynamic kernel partial least squares. Chemical Engineering Research & Design 2016, 106, 242-252.
19. Chien, J. T.; Hsieh, H. L., Convex Divergence ICA for Blind Source Separation. IEEE Transactions on Audio Speech and Language Processing 2012, 20, (1), 302-313.
20. Ge, Z.; Song, Z., Process Monitoring Based on Independent Component Analysis-Principal Component Analysis (ICA-PCA) and Similarity Factors. Ind Eng Chem Res 2007, 46, (7), 2054-2063.
21. Chen, J.; Yu, J.; Mori, J.; Rashid, M. M.; Hu, G.; Yu, H.; Flores-Cerrillo, J.; Megan, L., An independent component analysis and mutual information based non-Gaussian pattern matching method for fault detection and diagnosis of complex cryogenic air separation process. 2013; pp 2797-2802.
22. Zhao, C.; Gao, F.; Wang, F., Nonlinear Batch Process Monitoring Using Phase-Based Kernel Independent Component Analysis-Principal Component Analysis (KICA-PCA). Ind Eng Chem Res 2009, 48, 12.
23. Jiang, Q. C.; Yan, X. F.; Tong, C. D., Double-Weighted Independent Component Analysis for Non-Gaussian Chemical Process Monitoring. Ind Eng Chem Res 2013, 52, (40), 14396-14405.
24. Wang, F. L.; Tan, S.; Peng, J.; Chang, Y. Q., Process monitoring based on mode identification for multi-mode process with transitions. Chemometrics and Intelligent Laboratory Systems 2012, 110, (1), 144-155.
25. Ma, H. H.; Hu, Y.; Shi, H. B., A novel local neighborhood standardization strategy and its application in fault detection of multimode processes. Chemometrics and Intelligent Laboratory Systems 2012, 118, 287-300.
26. Gonzalez, R.; Huang, B.; Lau, E., Process monitoring using kernel density estimation and Bayesian networking with an industrial case study. ISA Trans 2015, 58, 330-347.
27. Ma, Y. X.; Shi, H. B., Multimode Process Monitoring Based on Aligned Mixture Factor Analysis. Ind Eng Chem Res 2014, 53, (2), 786-799.
28. Qin, S. J., Recursive PLS algorithms for adaptive data modeling. Comput Chem Eng 1998, 22, (4-5), 503-514.
29. Jiang, J. H.; Berry, R. J.; Siesler, H. W.; Ozaki, Y., Wavelength interval selection in multicomponent spectral analysis by moving window partial least-squares regression with applications to mid-infrared and near-infrared spectroscopic data. Anal Chem 2002, 74, (14), 3555-3565.
30. Kaneko, H.; Funatsu, K., Maintenance-free soft sensor models with time difference of process variables. Chemometrics and Intelligent Laboratory Systems 2011, 107, (2), 312-317.
31. Yuan, X.; Ge, Z.; Huang, B.; Song, Z., A Probabilistic Just-in-Time Learning Framework for Soft Sensor Development With Missing Data. IEEE Transactions on Control Systems Technology 2017, 25, (3), 1124-1132.
32. Choi, S. W.; Martin, E. B.; Morris, A. J.; Lee, I. B., Adaptive multivariate statistical process control for monitoring time-varying processes. Ind Eng Chem Res 2006, 45, (9), 3108-3118.
33. Huang, G. B.; Chen, L., Enhanced random search based incremental extreme learning machine. Neurocomputing 2008, 71, (16-18), 3460-3468.
34. Huang, G.; Song, S.; Gupta, J. N.; Wu, C., Semi-supervised and unsupervised extreme learning machines. IEEE Trans Cybern 2014, 44, (12), 2405-2417.
35. Rong, H. J.; Ong, Y. S.; Tan, A. H.; Zhu, Z. X., A fast pruned-extreme learning machine for classification problem. Neurocomputing 2008, 72, (1-3), 359-366.
36. Miche, Y.; Sorjamaa, A.; Bas, P.; Simula, O.; Jutten, C.; Lendasse, A., OP-ELM: optimally pruned extreme learning machine. IEEE Trans Neural Netw 2010, 21, (1), 158-162.
37. Fdez-Riverola, F.; Iglesias, E. L.; Diaz, F.; Mendez, J. R.; Corchado, J. M., Applying lazy learning algorithms to tackle concept drift in spam filtering. Expert Syst Appl 2007, 33, (1), 36-48.
38. Ge, Z. Q.; Song, Z. H., Online monitoring of nonlinear multiple mode processes based on adaptive local model approach. Control Eng Pract 2008, 16, (12), 1427-1437.
39. Cybenko, G., Just-in-time learning and estimation. NATO ASI Series F: Computer and Systems Sciences 1996, 153, 423-434.
40. Kim, S.; Kano, M.; Hasebe, S.; Takinami, A.; Seki, T., Long-Term Industrial Applications of Inferential Control Based on Just-In-Time Soft-Sensors: Economical Impact and Challenges. Ind Eng Chem Res 2013, 52, (35), 12346-12356.
41. Chen, J.; Yu, J., Independent component analysis mixture model based dissimilarity method for performance monitoring of Non-Gaussian dynamic processes with shifting operating conditions. Ind Eng Chem Res 2014, 53, (13), 5055-5066.
42. Ge, Z.; Song, Z., A comparative study of just-in-time-learning based methods for online soft sensor modeling. Chemometrics and Intelligent Laboratory Systems 2010, 104, 306-317.
43. Cheng, C.; Chiu, M. S., A new data-based methodology for nonlinear process modeling. Chem Eng Sci 2004, 59, (13), 2801-2810.
44. Huang, G. B.; Zhou, H.; Ding, X.; Zhang, R., Extreme learning machine for regression and multiclass classification. IEEE Trans Syst Man Cybern B Cybern 2012, 42, (2), 513-529.
45. Yang, Z.; Moczulski, M.; Denil, M.; de Freitas, N.; Smola, A.; Song, L.; Wang, Z., Deep Fried Convnets. arXiv preprint arXiv:1412.7149, 2014.
46. Le, Q.; Sarlos, T.; Smola, A., Fastfood: computing Hilbert space expansions in loglinear time. In Proceedings of the 30th International Conference on Machine Learning, 2013; pp 244-252.
47. Ahmed, N.; Rao, K. R., Walsh-Hadamard transform. In Orthogonal Transforms for Digital Signal Processing, Springer: 1975; pp 99-152.
48. Zhao, J.; Meng, D., FastMMD: Ensemble of Circular Discrepancy for Efficient Two-Sample Test. Neural Comput 2015, 27, (6), 1345-1372.
49. Elgammal, A.; Duraiswami, R.; Harwood, D.; Davis, L. S., Background and foreground modeling using nonparametric kernel density estimation for visual surveillance. Proceedings of the IEEE 2002, 90, (7), 1151-1163.
50. Botev, Z. I.; Grotowski, J. F.; Kroese, D. P., Kernel Density Estimation Via Diffusion. Ann Stat 2010, 38, (5), 2916-2957.
51. Henley, E. J.; Seader, J. D.; Roper, D. K., Separation process principles. Wiley: 2011.
52. Villalba Torán, P. M., Multivariate statistical process monitoring of a distillation. 2013.
53. Jain, S.; Kim, J. K.; Smith, R., Operational Optimization of Batch Distillation Systems. Ind Eng Chem Res 2012, 51, (16), 5749-5761.
54. Sivalingamaiah, M.; Reddy, B. V., Texture Segmentation Using Multichannel Gabor Filtering. IOSR Journal of Electronics and Communication Engineering 2012, 2, 22-26.
Acknowledgement
This work was supported by the National Natural Science Foundation of China (61590923, 61422303, 61333010) and by the "Shu Guang" project supported by the Shanghai Municipal Education Commission and the Shanghai Education Development Foundation.
Figure Captions
Fig. 1. Illustration of the Extreme Learning Machine in the field of process modeling
Fig. 2. Comparison between the global modeling and JITL modeling monitoring methods
Fig. 3. Schematic diagram of the proposed similarity based on cosine similarity and posterior probability
Fig. 4. Normal probability plot for the outputs of the 3 modes in the illustrative synthetic process
Fig. 5. Process status depicted by the input observations of the synthetic process
Fig. 6. Cosine similarity of 3 samples (15th, 275th, and 550th) to the training dataset
Fig. 7. Adjusted cosine similarity of 3 samples (15th, 275th, and 550th) to the training dataset
Fig. 8. Conditional probabilities of each input according to Mode 1
Fig. 9. Mode identification result based on joint posterior probability method (Mode 1)
Fig. 10. Modified adjusted cosine similarity (MACS) of 3 samples (15th, 275th, and 550th) to the training dataset
Fig. 11. Hotelling's T2 statistics monitoring performance of the illustrative synthetic example
Fig. 12. Prediction results of the two similarity indices in the second phase (Mode 1, Fault 1; (a) MACS, (b) ACS)
Fig. 13. Schematic diagram of a typical distillation column
Fig. 14. Characteristics of the normalized output data in the distillation system
Fig. 15. Characteristics of the normalized output data in the distillation system
Fig. 16. Joint conditional probability according to Mode 1 and Mode 2 in Scenario 2
Fig. 17. Hotelling's T2 statistics monitoring performance for the predesigned Scenario 2 of the distillation system
Fig. 18. Prediction results of the two similarity indices in Scenario 2 (Mode 1, Fault 2; above, MACS; below, ACS)
Fig. 19. CPU time and RMSE for different numbers of ELM outputs with 12 relevant samples
Fig. 20. Prediction and estimation results for Mode 2, Fault 4 in Scenario 2
Fig. 21. Hotelling's T2 statistics monitoring performance for the predesigned Scenario 3 of the distillation system
Fig. 22. Prediction and estimation results for Mode 2, Fault 1 in Scenario 3
Fig. 23. Prediction and estimation results for Mode 3, Fault 5 in Scenario 3
Chart 1. Flow chart of the proposed kernel-based ELM monitoring and modeling method
Table 1. Measurements and outputs for the distillation column

Outputs        Description
yD             the vapor composition of light components of D (mole fraction)
xB             the liquid composition of light components of B (mole fraction)

Measurements   Description
FM             molar feed rate (kmol/min)
FV             volumetric feed flow (L/h)
zF             feed composition (mole fraction)
TF             feed temperature (Celsius)
qF             the fraction of liquid in the feed
MD             the liquid holdup of the overhead (kmol)
MB             the liquid holdup of the bottom (kmol)
deltaL         variation of reflux flow rate (kmol/min)
deltaV         variation of boilup flow rate (kmol/min)
L              reflux flow rate (kmol/min)
V              boilup flow rate (kmol/min)
D              top product flow rate (kmol/min)
B              bottom product flow rate (kmol/min)
x2-x40         the liquid composition of light components at stages 2-40
T2-T40         temperature at stages 2-40 (Celsius)
Table 2. Description of operation modes in the distillation system

Mode   xB     yD     L        V
1      0.01   0.99   2.6889   3.2294
2      0.01   0.96   2.2537   2.8100
3      0.05   0.99   2.3243   2.8435
Table 3. Abnormal events and corresponding fault types

Fault   Description                              Type
1       Molar feed rate in F                     Pulse/Ramp/Spike
2       Feed composition change in zF            Pulse/Ramp/Spike
3       Fluctuation of feed temperature in TF    Spike/Pulse
4       PI controller failure (xB)               Step
5       PI controller failure (yD)               Step
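For readers reproducing the benchmark, the sketch below shows one plausible way to synthesize the fault signatures named in Table 3 (step, pulse, ramp, spike) as additive disturbances on a measured variable. The shapes follow the usual definitions; the function name and parameterization are illustrative assumptions, not the authors' simulation code.

```python
# Hedged sketch of additive fault signatures (cf. Table 3). Assumes
# start + duration <= n; parameters are illustrative only.
import numpy as np

def fault_signal(kind, n, start, amplitude, duration=1):
    """Additive disturbance of length n: 'step' holds after start, 'pulse'
    lasts `duration` samples, 'ramp' grows linearly over `duration` and then
    holds, and 'spike' hits a single sample."""
    f = np.zeros(n)
    if kind == "step":
        f[start:] = amplitude
    elif kind == "pulse":
        f[start:start + duration] = amplitude
    elif kind == "ramp":
        f[start:start + duration] = np.linspace(0, amplitude, duration)
        f[start + duration:] = amplitude
    elif kind == "spike":
        f[start] = amplitude
    return f

# e.g., a 20% ramp on the molar feed rate F over 10 min (Fault 1 as
# configured for Scenario 3 in Table 4 below):
# F_faulty = F_nominal * (1 + fault_signal("ramp", len(F_nominal), 0, 0.20, 10))
```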
Table 4. Benchmark scenarios of the distillation column system

No.          Test scenario
Scenario 2   Normal condition: Samples 1-120, Mode 1
             Faulty condition: Samples 121-240, Mode 1, Fault 2: feed composition change in zF (type: spike, start time: 0 min, amplitude: 15%, duration: 20 min)
             Normal condition: Samples 241-360, Mode 2
             Faulty condition: Samples 361-480, Mode 2, Fault 4: PI controller failure (xB) (start time: 0 min)
Scenario 3   Normal condition: Samples 1-120, Mode 2
             Faulty condition: Samples 481-960, Mode 2, Fault 1: molar feed rate change in F (type: ramp, start time: 0 min, amplitude: 20%, duration: 10 min)
             Normal condition: Samples 961-1440, Mode 3
             Faulty condition: Samples 1441-1920, Mode 3, Fault 5: PI controller failure (yD) (start time: 0 min)
Table 5. RMSE and CPU time of three different similarity indices with different kernels

Method                     RMSE (10^-5)   CPU time (s)
Cosine, Fastfood           17.131         7.22
ACS, Fastfood              11.9314        8.11
MACS, exact Gaussian RBF   6.6825         68.78
MACS, Fastfood             7.3071         39.8
Schematic diagram of the proposed similarity based on cosine similarity and posterior probability