Enhanced Fault Detection Based on Ensemble Global–Local


Article

Enhanced fault detection based on ensemble Global-Local Preserving Projections with quantitative global-local structure analysis Chengjun Zhan, Shuanghong Li, and Yupu Yang Ind. Eng. Chem. Res., Just Accepted Manuscript • DOI: 10.1021/acs.iecr.7b01642 • Publication Date (Web): 02 Sep 2017 Downloaded from http://pubs.acs.org on September 3, 2017


Industrial & Engineering Chemistry Research is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.


Industrial & Engineering Chemistry Research

Enhanced fault detection based on ensemble Global-Local Preserving Projections with quantitative global-local structure analysis
Chengjun Zhan,* Shuanghong Li, Yupu Yang
Department of Automation, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai 200240, China

ABSTRACT: A novel data-driven fault detection strategy that fully considers the global-local structure of process data is proposed. Traditional methods that address the global-local structure are one-sided, so the global-local information is only partly extracted. One way to extract the information more fully is to combine such methods. For feasibility, a single type of base model is used in this paper: the Global-Local Preserving Projections (GLPP) model, which is well suited to extracting the global-local structure. The purpose of this paper is to study the combination of GLPP models by means of an ensemble learning strategy. This idea raises two main problems: how to choose diverse GLPP models and how to combine them. To solve them, a global index Gper that analyzes the global and local structure quantitatively is first proposed, and diverse GLPP models are then selected according to Gper. The Kernel Density Estimation (KDE) method is used to obtain more accurate results, from which the control limits are determined. Next, Bayesian inference and weighted sum strategies are adopted to combine the selected GLPP models. The weighted sum strategy implements the motivation of fully analyzing the global-local structure in a simple and useful way. However, the False Alarm Rate (FAR) of the ensemble model is slightly higher, since false alarm information is accumulated as well. To help reduce the FAR of the ensemble GLPP (EGLPP) model, an indicator called WFAR that represents the reliability of each GLPP model is proposed. It is shown that the proposed EGLPP monitoring method enhances monitoring performance; in other words, it is more robust to the selection of the global index and significantly improves detection performance at a relatively low FAR. This is demonstrated on the widely accepted Tennessee Eastman benchmark process.

1. INTRODUCTION

Process monitoring serves two main purposes: ensuring process safety and improving product quality. When faults occur, detecting and diagnosing them as quickly as possible is of great significance, especially in modern industrial processes.1 As technology progresses, modern industrial processes become more complicated, and an initial accident may trigger a chain of accidents through the domino effect. Therefore, the choice of process monitoring method is critical and should be made carefully.2,3 Because distributed control systems (DCS) are widely used in modern industrial processes, a large amount of data has been collected and stored over the past decades. Consequently, multivariate statistical process monitoring (MSPM) methods have become an active research topic that attracts increasing attention.4-9 Among these methods, principal component analysis (PCA) is popular and representative because of its effectiveness.10-13 PCA is designed to extract the global information of the process data, i.e., the variance information. However, it ignores the local neighborhood information, which is another significant constituent of the data. As a manifold learning algorithm, locality preserving projections (LPP) is adept at extracting the local information.14 Naturally, combining the two methods has been pursued, and some research has been done on this topic. Zhang et al. proposed the global-local structure analysis (GLSA) model for fault detection and identification by combining the global information and the local manifold structure.15 Yu was motivated by GLSA and developed a model based on local and global PCA (LGPCA).16 Luo inherited this idea and developed Global-Local Preserving Projections (GLPP),17 introducing a balance parameter that adjusts the proportion of global and local structure. The GLPP model is carefully designed to reflect both the global and local structure of the data, and that work demonstrated its effectiveness. Recently, the GLPP model has been extended to nonlinear, sparse, and parameterless versions. The nonlinear version has two variants, KGLPP and DDKGLPP:18,19 DDKGLPP uses a data-dependent kernel, while KGLPP uses a conventional kernel. The sparse version, SGLPP, extracts sparse transformation vectors from the data set.20 As an alternative to GLPP, the parameterless version, NLLSPP, overcomes the problem of parameter selection by using similarity weight coefficients.21 However, this parameterless version also removes the balance parameter, which is what allows the global-local information to be extracted quantitatively. All these versions have demonstrated the effectiveness and applicability of the GLPP model. It should be noted that the way in which the global-local structure is violated under abnormal conditions is unpredictable. A single GLPP model with a fixed balance parameter can therefore monitor only certain kinds of abnormal conditions. A natural way to consider the global-local structure fully is to combine different GLPP models.
The purpose of this paper is to study the combination of GLPP models by means of an ensemble learning strategy. This idea raises two problems, i.e., how to choose GLPP models that are as diverse as possible and how to combine them. By analyzing the global and local structure quantitatively, a global index called Gper is proposed to represent the global proportion of the global-local structure. In this way, diverse GLPP models can be selected according to Gper. The second problem is solved with Bayesian inference and a weighted sum strategy. The framework follows the logic of ensemble learning, which has been demonstrated to be effective in machine learning.22,23 The core idea of ensemble learning is to obtain a series of base models and combine them into an ensemble model. The ensemble model can greatly improve performance, as verified by many studies.24,25 The most important foundation of ensemble learning is high diversity among the base models. In this paper, the global index Gper is used to analyze the global-local structure of the data quantitatively and to ensure the diversity of the GLPP models. Compared with the conventional model, the proposed model monitors the process from the ensemble learning viewpoint. First, the global index Gper is calculated by quantitatively analyzing the global-local structure of the GLPP model, which satisfies the diversity requirement of ensemble learning. Then the GLPP models with different values of Gper are applied to monitor the process. The Kernel Density Estimation (KDE) method is used to obtain more precise control limits.26 To form the ensemble model, Bayesian inference converts each base model's monitoring statistic into a probability, and the results are integrated with the weighted sum strategy. Larger weights are assigned to those sub-models that give higher probabilities.
In other words, the weighted sum strategy implements the motivation of fully analyzing the global-local structure in a simple and useful way. However, the ensemble learning strategy also accumulates false alarm information. A possible remedy is to reduce the influence of GLPP models that show a higher False Alarm Rate (FAR). An indicator called WFAR is designed to represent the reliability of each GLPP model and acts within the weighted sum strategy. The results of the TE case study demonstrate that the proposed ensemble GLPP (EGLPP) method greatly enhances detection performance compared with the conventional GLPP method.

The remainder of this paper is organized as follows. In section 2, Global-Local Preserving Projections (GLPP) process monitoring is reviewed. In section 3, the details of the proposed EGLPP model, i.e., the selection of the global index Gper and the process monitoring procedure, are described. In section 4, the widely used Tennessee Eastman (TE) benchmark is used to demonstrate the effectiveness of the proposed model. Finally, conclusions and some discussion are given in section 5.

2. GLPP-BASED PROCESS MONITORING

The Global-Local Preserving Projections model was designed to extract both global and local information from the data. It was originally proposed by Luo17 to overcome the drawbacks of PCA and LPP: PCA extracts only the variance information, whereas LPP preserves only the local neighborhood structure. The core idea of GLPP is to determine projection directions that take both the global and local structure into account. Let $\mathbf{X}$ denote the data matrix, i.e., $\mathbf{X} = [\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n]^T \in \Re^{n \times m}$. The optimization goal is to find a transformation vector $\mathbf{p} \in \Re^{m}$ that maps $\mathbf{X}$ to $\mathbf{y} = [y_1, y_2, \ldots, y_n]^T \in \Re^{n}$, i.e., $\mathbf{y} = \mathbf{X}\mathbf{p}$. The objective function, which combines the global and local structure, can be written as

$$J_{GLPP}(\mathbf{p}) = \min \{\eta J_{Global}(\mathbf{p}) + (1-\eta) J_{Local}(\mathbf{p})\} \tag{1}$$

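where the two sub-objectives $J_{Global}(\mathbf{p})$ and $J_{Local}(\mathbf{p})$ represent the global and local structure, respectively. They are given by

$$
\begin{aligned}
J_{Global}(\mathbf{p}) &= \max_{\mathbf{p}} \sum_{ij} (y_i - y_j)^2 \bar{S}_{ij} \\
&= -\min_{\mathbf{p}} \Big\{ \sum_{i} \mathbf{p}^T \mathbf{x}_i \bar{D}_{ii} \mathbf{x}_i^T \mathbf{p} - \sum_{ij} \mathbf{p}^T \mathbf{x}_i \bar{S}_{ij} \mathbf{x}_j^T \mathbf{p} \Big\} \\
&= -\min_{\mathbf{p}} \mathbf{p}^T \mathbf{X} (\bar{\mathbf{D}} - \bar{\mathbf{S}}) \mathbf{X}^T \mathbf{p} \\
&= -\min_{\mathbf{p}} \mathbf{p}^T \mathbf{X} \bar{\mathbf{L}} \mathbf{X}^T \mathbf{p}
\end{aligned} \tag{2}
$$

$$
\begin{aligned}
J_{Local}(\mathbf{p}) &= \min_{\mathbf{p}} \sum_{ij} (y_i - y_j)^2 S_{ij} \\
&= \min_{\mathbf{p}} \Big\{ \sum_{i} \mathbf{p}^T \mathbf{x}_i D_{ii} \mathbf{x}_i^T \mathbf{p} - \sum_{ij} \mathbf{p}^T \mathbf{x}_i S_{ij} \mathbf{x}_j^T \mathbf{p} \Big\} \\
&= \min_{\mathbf{p}} \mathbf{p}^T \mathbf{X} (\mathbf{D} - \mathbf{S}) \mathbf{X}^T \mathbf{p} \\
&= \min_{\mathbf{p}} \mathbf{p}^T \mathbf{X} \mathbf{L} \mathbf{X}^T \mathbf{p}
\end{aligned} \tag{3}
$$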

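where $S_{ij}$ is the quantitative measure of the adjacency relationship between $\mathbf{x}_i$ and $\mathbf{x}_j$, and $\mathbf{S}$ is the adjacency weighting matrix. The value of $S_{ij}$ depends on the distance between the data points $\mathbf{x}_i$ and $\mathbf{x}_j$. As in LPP, a nonzero value is assigned to $S_{ij}$ when $\mathbf{x}_i$ and $\mathbf{x}_j$ are close to each other. The dual parameter $\bar{S}_{ij}$ is complementary to $S_{ij}$: the former corresponds to the global structure and the latter to the local structure. To solve this dual-objective problem, a weighting parameter $\eta$ is introduced, and the objective function of the GLPP model is reformulated as

$$
\begin{aligned}
J_{GLPP}(\mathbf{p}) &= \min_{\mathbf{p}} \{\eta J_{Global}(\mathbf{p}) + (1-\eta) J_{Local}(\mathbf{p})\} \\
&= \min_{\mathbf{p}} \Big\{ (1-\eta) \sum_{ij} (y_i - y_j)^2 S_{ij} - \eta \sum_{ij} (y_i - y_j)^2 \bar{S}_{ij} \Big\} \\
&= \min_{\mathbf{p}} \sum_{ij} (y_i - y_j)^2 R_{ij} \\
&= \min_{\mathbf{p}} \Big\{ \sum_{i} \mathbf{p}^T \mathbf{x}_i H_{ii} \mathbf{x}_i^T \mathbf{p} - \sum_{ij} \mathbf{p}^T \mathbf{x}_i R_{ij} \mathbf{x}_j^T \mathbf{p} \Big\} \\
&= \min_{\mathbf{p}} \mathbf{p}^T \mathbf{X} (\mathbf{H} - \mathbf{R}) \mathbf{X}^T \mathbf{p} \\
&= \min_{\mathbf{p}} \mathbf{p}^T \mathbf{X} \mathbf{M} \mathbf{X}^T \mathbf{p}
\end{aligned} \tag{4}
$$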

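where $R_{ij} = (1-\eta) S_{ij} - \eta \bar{S}_{ij}$, i.e., $\mathbf{R} = (1-\eta)\mathbf{S} - \eta \bar{\mathbf{S}}$. $\mathbf{H}$ is a diagonal matrix with entries $H_{ii} = \sum_j R_{ij}$, and $\mathbf{M} = \mathbf{H} - \mathbf{R}$. To avoid the singularity problem of $\mathbf{X}\mathbf{H}\mathbf{X}^T$, the following constraint is imposed on the GLPP objective function:

$$\mathbf{p}^T (\eta \mathbf{X}\mathbf{H}\mathbf{X}^T + (1-\eta)\mathbf{I}) \mathbf{p} = \mathbf{p}^T \mathbf{N} \mathbf{p} = 1 \tag{5}$$

The problem can then be solved as a generalized eigenvalue problem:

$$\mathbf{X}\mathbf{M}\mathbf{X}^T \mathbf{p} = \lambda \mathbf{N} \mathbf{p} \tag{6}$$

The obtained eigenvectors are $\mathbf{p}_1, \mathbf{p}_2, \ldots, \mathbf{p}_d, \mathbf{p}_{d+1}, \ldots, \mathbf{p}_m$, corresponding to the eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_d, \lambda_{d+1}, \ldots, \lambda_m$, where $\lambda_1 < \lambda_2 < \cdots < \lambda_d < \lambda_{d+1} < \cdots < \lambda_m$. These eigenvectors are the projection directions that preserve both the global and local structure of the data set, and retaining the first $d$ eigenvectors achieves dimensionality reduction. Finally, the GLPP projections are

$$\mathbf{x}_i \rightarrow \mathbf{y}_i = \mathbf{P}^T \mathbf{x}_i, \quad \mathbf{P} = [\mathbf{p}_1, \mathbf{p}_2, \ldots, \mathbf{p}_d] \tag{7}$$

Hotelling's $T^2$ and SPE statistics are used for process monitoring and are constructed as

$$T^2 = \mathbf{y}^T \mathbf{S}^{-1} \mathbf{y} \tag{8}$$

$$SPE = \| \mathbf{x} - \mathbf{P}\mathbf{y} \|^2 \tag{9}$$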

where $\mathbf{y}$ is the latent variable vector and $\mathbf{S}$ here denotes the covariance matrix of the latent variable matrix $\mathbf{Y}$, i.e., $\mathbf{S} = \mathbf{Y}^T\mathbf{Y}/(n-1)$, rather than the adjacency matrix of eqs 2-4. The two statistics characterize the data structure statistically, and their control limits are estimated from data collected under normal operating conditions. Traditionally, the control limit of the $T^2$ statistic is calculated from the F-distribution and that of the SPE statistic from the $\chi^2$-distribution; details can be found in ref 27. The shortcoming of this approach is that the projected variables do not necessarily follow a normal distribution, so the sum of their normalized squares does not necessarily follow the F-distribution. An alternative way of estimating the control limit is kernel density estimation (KDE). This technique acknowledges that the true distribution of a random variable may be complex and instead estimates the probability density function from the given data. Given the data samples $\{x_1, x_2, \ldots, x_n\}$, the estimated probability density function (PDF) $f(x)$ is defined as

$$f(x) = \frac{1}{nh} \sum_{i} K\left( \frac{x - x_i}{h} \right) \tag{10}$$

where $x$ is the point at which the density is estimated, $x_i$ is the $i$-th data sample, $h$ is the bandwidth parameter, and $K$ is the kernel function, usually chosen as the Gaussian kernel. Because the bandwidth parameter $h$ has a great influence on KDE, Mugdadi and Ahmad studied its selection.28 They used the least-squares cross-validation method to obtain a suitable value, and the parameter $h$ can be computed as

$$h = 1.06\, \sigma\, n^{-0.2} \tag{11}$$

where $\sigma$ and $n$ are the standard deviation and the number of data samples, respectively. Once the probability density function is obtained, the cumulative distribution function (CDF) follows as

$$P(x < \theta) = \int_{-\infty}^{\theta} f(x)\, dx \tag{12}$$

Given the CDF, a suitable confidence level such as 0.99 is chosen, and the control limit is determined accordingly.
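As a rough illustration of eqs 10-12, the following Python sketch estimates a control limit from statistic values collected under normal operation. It is not the authors' implementation: the function names, the bisection search, the use of the sample standard deviation for $\sigma$, and the synthetic input values are all assumptions made for this example. With a Gaussian kernel, the CDF in eq 12 reduces to an average of normal CDFs centred on the samples, which the sketch exploits.

```python
import math


def rule_of_thumb_bandwidth(samples):
    """Bandwidth h = 1.06 * sigma * n**(-0.2) (eq 11); sigma is the sample std (assumption)."""
    n = len(samples)
    mean = sum(samples) / n
    variance = sum((s - mean) ** 2 for s in samples) / (n - 1)
    return 1.06 * math.sqrt(variance) * n ** (-0.2)


def kde_cdf(theta, samples, h):
    """CDF of a Gaussian-kernel KDE (eq 12): the mean of normal CDFs centred on the samples."""
    scale = h * math.sqrt(2.0)
    return sum(0.5 * (1.0 + math.erf((theta - s) / scale)) for s in samples) / len(samples)


def kde_control_limit(samples, confidence=0.99, tol=1e-9):
    """Find theta with CDF(theta) = confidence by bisection (the search method is an assumption)."""
    h = rule_of_thumb_bandwidth(samples)
    lo = min(samples) - 10.0 * h  # the CDF is practically 0 here
    hi = max(samples) + 10.0 * h  # the CDF is practically 1 here
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if kde_cdf(mid, samples, h) < confidence:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Synthetic stand-in for statistic values under normal operation (not real process data).
normal_stats = [0.2 * i for i in range(1, 101)]
limit_99 = kde_control_limit(normal_stats, confidence=0.99)
print(f"99% control limit: {limit_99:.3f}")
```

The sketch assumes the samples are not all identical, since a zero standard deviation would give a zero bandwidth.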

3. EGLPP FOR ENHANCED PROCESS MONITORING

3.1. Motivation illustration. The traditional GLPP model works well because it combines global and local information. However, each single GLPP model extracts only one particular combination of the global and local structure, as illustrated in Figure 1. Figure 1 shows the one-dimensional (1-D) projections of the global, local, and two global-local (G-L) directions on the two-dimensional (2-D) data set X. The two G-L directions correspond to two GLPP models with different balance parameters. The three red dots x1, x2, and x3 are abnormal data. Compared with the local and global directions, the first G-L direction detects both x1 and x2 easily, whereas the global direction detects only x2 and the local direction detects only x1. However, the first G-L direction is not sensitive to x3, while the second cannot detect x2. The illustration explains why the GLPP model is effective and also shows the drawback of a single GLPP model. Motivated by this illustration, a new model that combines different GLPP models is proposed.

Figure 1. The 1-D projections of global, local and two GLPP directions on a 2-D dataset X.

3.2. Development of the EGLPP model. To take better advantage of different GLPP models, ensemble learning is applied to the GLPP model. Instead of selecting a single balance parameter, a balance parameter set $\Omega$ is chosen and a series of GLPP models is built, each with the objective function

$$
\begin{aligned}
J_{GLPP}(\mathbf{p})^{(l)} &= \min_{\mathbf{p}} \{\eta^{(l)} J_{Global}(\mathbf{p}) + (1-\eta^{(l)}) J_{Local}(\mathbf{p})\} \\
&= \min_{\mathbf{p}} \sum_{ij} (y_i - y_j)^2 R_{ij}^{(l)} \\
&= \min_{\mathbf{p}} \mathbf{p}^T \mathbf{X} (\mathbf{H}^{(l)} - \mathbf{R}^{(l)}) \mathbf{X}^T \mathbf{p} \\
&= \min_{\mathbf{p}} \mathbf{p}^T \mathbf{X} \mathbf{M}^{(l)} \mathbf{X}^T \mathbf{p}
\end{aligned} \tag{13}
$$

where $R_{ij}^{(l)} = (1-\eta^{(l)}) S_{ij} - \eta^{(l)} \bar{S}_{ij}$, i.e., $\mathbf{R}^{(l)} = (1-\eta^{(l)})\mathbf{S} - \eta^{(l)} \bar{\mathbf{S}}$, and $\eta^{(l)} \in \Omega$. $\mathbf{H}^{(l)}$ is a diagonal matrix whose entries are the row sums of $\mathbf{R}^{(l)}$, $H_{ii}^{(l)} = \sum_j R_{ij}^{(l)}$, and $\mathbf{M}^{(l)} = \mathbf{H}^{(l)} - \mathbf{R}^{(l)}$. The constraint for each GLPP model is

$$\mathbf{p}^T (\eta^{(l)} \mathbf{X}\mathbf{H}^{(l)}\mathbf{X}^T + (1-\eta^{(l)})\mathbf{I}) \mathbf{p} = \mathbf{p}^T \mathbf{N}^{(l)} \mathbf{p} = 1 \tag{14}$$

After solving the generalized eigenvalue problem as in eq 6, the projection matrix of each GLPP model is denoted $\mathbf{P}^{(l)} = [\mathbf{p}_1^{(l)}, \ldots, \mathbf{p}_d^{(l)}]$. The EGLPP model is built on a series of well-selected GLPP models. How to choose the GLPP models and how to combine them are the two key issues to be solved.

3.3. The selection strategy for balance parameter. The key parameter of the EGLPP model is the balance parameter set $\Omega$, which corresponds to a series of GLPP models. The set $\Omega$ should contain balance parameters that represent the global and local proportions well. To find this set, the objective function of the GLPP model is reconsidered:

$$
\begin{aligned}
J_{GLPP}(\mathbf{p}) &= \min_{\mathbf{p}} \{\eta J_{Global}(\mathbf{p}) + (1-\eta) J_{Local}(\mathbf{p})\} \\
&= \min_{\mathbf{p}} \{(1-\eta)\, \mathbf{p}^T \mathbf{X}\mathbf{L}\mathbf{X}^T \mathbf{p} - \eta\, \mathbf{p}^T \mathbf{X}\bar{\mathbf{L}}\mathbf{X}^T \mathbf{p}\}
\end{aligned} \tag{15}
$$

In this objective function, the former part $L(\mathbf{p}) = (1-\eta)\, \mathbf{p}^T \mathbf{X}\mathbf{L}\mathbf{X}^T \mathbf{p}$ represents the local information, while the latter part $G(\mathbf{p}) = \eta\, \mathbf{p}^T \mathbf{X}\bar{\mathbf{L}}\mathbf{X}^T \mathbf{p}$ corresponds to the global information. The objective function minimizes $L(\mathbf{p})$ and maximizes $G(\mathbf{p})$. To analyze the two parts quantitatively, normalization is introduced, with the standard values of the two parts given by

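$$G_{Standard}(\mathbf{u}) = \mathbf{u}^T \mathbf{X}\bar{\mathbf{L}}\mathbf{X}^T \mathbf{u} \tag{16}$$

$$L_{Standard}(\mathbf{w}) = \mathbf{w}^T \mathbf{X}\mathbf{L}\mathbf{X}^T \mathbf{w} \tag{17}$$

where $\mathbf{u} \in \Phi(\mathbf{p}) = \arg\max \{\mathbf{p}^T \mathbf{X}\bar{\mathbf{L}}\mathbf{X}^T \mathbf{p}\}$ and $\mathbf{w} \in \Psi(\mathbf{q}) = \arg\min \{\mathbf{q}^T \mathbf{X}\mathbf{L}\mathbf{X}^T \mathbf{q}\}$. In this way, the normalized global and local proportions can be written as

$$
\begin{aligned}
V_{Global} : V_{Local} &= \frac{G(\mathbf{p})}{G_{Standard}(\mathbf{u})} : \frac{L(\mathbf{p})}{L_{Standard}(\mathbf{w})} \\
&= \frac{\eta\, \mathbf{p}^T \mathbf{X}\bar{\mathbf{L}}\mathbf{X}^T \mathbf{p}}{\mathbf{u}^T \mathbf{X}\bar{\mathbf{L}}\mathbf{X}^T \mathbf{u}} : \frac{(1-\eta)\, \mathbf{p}^T \mathbf{X}\mathbf{L}\mathbf{X}^T \mathbf{p}}{\mathbf{w}^T \mathbf{X}\mathbf{L}\mathbf{X}^T \mathbf{w}}
\end{aligned} \tag{18}
$$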

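Considering all $d$ projection vectors, the ratio between the global and local parts becomes

$$
\begin{aligned}
V_{Global} : V_{Local} &= \sum_{i=1}^{d} \frac{G(\mathbf{p}_i)}{G_{Standard}(\mathbf{u}_i)} : \sum_{j=1}^{d} \frac{L(\mathbf{p}_j)}{L_{Standard}(\mathbf{w}_j)} \\
&= \eta \sum_{i=1}^{d} \frac{\mathbf{p}_i^T \mathbf{X}\bar{\mathbf{L}}\mathbf{X}^T \mathbf{p}_i}{\mathbf{u}_i^T \mathbf{X}\bar{\mathbf{L}}\mathbf{X}^T \mathbf{u}_i} : (1-\eta) \sum_{j=1}^{d} \frac{\mathbf{p}_j^T \mathbf{X}\mathbf{L}\mathbf{X}^T \mathbf{p}_j}{\mathbf{w}_j^T \mathbf{X}\mathbf{L}\mathbf{X}^T \mathbf{w}_j}
\end{aligned} \tag{19}
$$

where $d$ is the reduced dimension. For simplicity, an index Gper that represents the global proportion is defined as

$$G_{per} = \frac{V_{Global}}{V_{Global} + V_{Local}} \tag{20}$$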
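The arithmetic of eqs 19 and 20 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `global_index` and all numerical values are hypothetical, and the quadratic forms are passed in as precomputed numbers rather than derived from a data matrix.

```python
def global_index(eta, global_vals, global_std, local_vals, local_std):
    """Gper of eq 20 from the d normalized quadratic forms of eq 19.

    global_vals[i] = p_i^T X Lbar X^T p_i,  global_std[i] = u_i^T X Lbar X^T u_i
    local_vals[j]  = p_j^T X L X^T p_j,     local_std[j]  = w_j^T X L X^T w_j
    """
    v_global = eta * sum(g / gs for g, gs in zip(global_vals, global_std))
    v_local = (1.0 - eta) * sum(l / ls for l, ls in zip(local_vals, local_std))
    return v_global / (v_global + v_local)


# Hypothetical quadratic-form values for d = 2 retained directions.
g_vals, g_std = [8.0, 6.0], [10.0, 8.0]
l_vals, l_std = [0.6, 0.9], [0.5, 0.6]
for eta in (0.1, 0.5, 0.9):
    print(eta, round(global_index(eta, g_vals, g_std, l_vals, l_std), 4))
```

In an actual GLPP model the projection vectors are themselves obtained from eq 6 for the given $\eta$, so the quadratic forms would have to be recomputed for every candidate balance parameter. The fixed values above only show how the index is assembled.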

The balance parameter set $\Omega$ can then be selected with reference to the global index Gper.

3.4. Reconsideration of the EGLPP model. As described above, the proposed EGLPP model can enhance the single GLPP model. However, one issue needs to be reconsidered. Process monitoring based on the EGLPP model is built on the ensemble learning strategy. Two indicators assess process monitoring performance quantitatively: the Fault Detection Rate (FDR) and the False Alarm Rate (FAR). FDR is defined as the percentage of samples above the control limit under fault conditions, while FAR is the percentage of samples above the control limit under normal conditions. EGLPP-based process monitoring is expected to achieve a higher FDR than monitoring based on a single GLPP model, since the EGLPP model captures more information about the process. Unavoidably, the FAR of EGLPP is also higher than that of GLPP, because the false alarm information from the different GLPP models is accumulated. The proposed EGLPP model is designed to improve on the single GLPP model, but the FAR must be considered at the same time. Since ensemble learning combines the information from different GLPP models, models with a higher FAR should be handled carefully. One way to deal with this problem is to assign a corresponding weight to each GLPP model before the ensemble step. A GLPP model with a lower FAR is regarded as more reliable and is therefore assigned a larger weight in the combination. In this way, the EGLPP model can combine the detection information while reducing the false alarm information from the different GLPP models.

3.5. Process monitoring based on EGLPP. As is common practice, two statistics, $T^{2(l)}$ and $SPE^{(l)}$ ($\eta^{(l)} \in \Omega$), are constructed for each GLPP model. These monitoring statistics are calculated as

$$
\begin{aligned}
\mathbf{y}_j &= \mathbf{P}^{(l)T} \mathbf{x}_j \\
T^{2(l)} &= \mathbf{y}^T (\mathbf{S}^{(l)})^{-1} \mathbf{y} \\
SPE^{(l)} &= \| \mathbf{x} - \mathbf{P}^{(l)} \mathbf{y} \|^2, \quad \eta^{(l)} \in \Omega,\ j = 1, \ldots, n
\end{aligned} \tag{21}
$$

where $\mathbf{S}^{(l)}$ is the covariance matrix of the latent variable matrix of the $l$-th GLPP model and $\mathbf{P}^{(l)}$ is the projection matrix of the $l$-th GLPP model. To combine the monitoring results, the Bayesian strategy is adopted to convert each model's statistic value into a fault probability. The sub-models' monitoring results are thereby normalized, and the indication of fault behavior becomes clearer. The fault probabilities are given by

$$P_{T^2}^{(l)}(F|x) = \frac{P_{T^2}^{(l)}(x|F)\, P_{T^2}^{(l)}(F)}{P_{T^2}^{(l)}(x)} \tag{22}$$

$$P_{SPE}^{(l)}(F|x) = \frac{P_{SPE}^{(l)}(x|F)\, P_{SPE}^{(l)}(F)}{P_{SPE}^{(l)}(x)} \tag{23}$$

where the two probabilities $P_{T^2}^{(l)}(x)$ and $P_{SPE}^{(l)}(x)$ can be calculated as

$$P_{T^2}^{(l)}(x) = P_{T^2}^{(l)}(x|F)\, P_{T^2}^{(l)}(F) + P_{T^2}^{(l)}(x|N)\, P_{T^2}^{(l)}(N) \tag{24}$$

$$P_{SPE}^{(l)}(x) = P_{SPE}^{(l)}(x|F)\, P_{SPE}^{(l)}(F) + P_{SPE}^{(l)}(x|N)\, P_{SPE}^{(l)}(N) \tag{25}$$

where $N$ and $F$ denote the normal and fault conditions. The prior probabilities $P(N)$ and $P(F)$ can be set to $1-\alpha$ and $\alpha$, respectively. The likelihood terms $P_{T^2}^{(l)}(x|F)$, $P_{T^2}^{(l)}(x|N)$, $P_{SPE}^{(l)}(x|F)$, and $P_{SPE}^{(l)}(x|N)$ link the statistics to the probabilities and are defined as

$$P_{T^2}^{(l)}(x|F) = \exp\left( -\frac{T_{limit}^{2(l)}}{T^{2(l)}} \right) \tag{26}$$

$$P_{T^2}^{(l)}(x|N) = \exp\left( -\frac{T^{2(l)}}{T_{limit}^{2(l)}} \right) \tag{27}$$

$$P_{SPE}^{(l)}(x|F) = \exp\left( -\frac{SPE_{limit}^{(l)}}{SPE^{(l)}} \right) \tag{28}$$

$$P_{SPE}^{(l)}(x|N) = \exp\left( -\frac{SPE^{(l)}}{SPE_{limit}^{(l)}} \right) \tag{29}$$

Since all the sub-model monitoring results have been converted to fault probabilities, the weighted combination strategy can be used to combine them:

$$ET^2 = \sum_{l=1}^{\mathrm{card}(\Omega)} \left\{ \frac{P_{T^2}^{(l)}(F|x)}{\sum_{p=1}^{\mathrm{card}(\Omega)} P_{T^2}^{(p)}(F|x)}\, P_{T^2}^{(l)}(F|x) \right\} \tag{30}$$

$$ESPE = \sum_{l=1}^{\mathrm{card}(\Omega)} \left\{ \frac{P_{SPE}^{(l)}(F|x)}{\sum_{p=1}^{\mathrm{card}(\Omega)} P_{SPE}^{(p)}(F|x)}\, P_{SPE}^{(l)}(F|x) \right\} \tag{31}$$
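The conversion of eqs 22-29 and the weighted sum of eqs 30 and 31 can be sketched in Python as follows. This is a minimal illustration rather than the authors' code: the function names are hypothetical, the prior $\alpha = 0.01$ is an assumed value, the statistic values and control limits are made up, and the statistic is assumed to be strictly positive so that the ratio in eqs 26 and 28 is defined. The same functions apply to either $T^2$ or SPE.

```python
import math


def fault_probability(stat, limit, alpha=0.01):
    """Posterior P(F|x) of eqs 22/23 from one statistic value and its control limit."""
    p_x_given_f = math.exp(-limit / stat)   # eqs 26/28 (stat must be > 0)
    p_x_given_n = math.exp(-stat / limit)   # eqs 27/29
    p_f, p_n = alpha, 1.0 - alpha           # priors P(F) and P(N)
    evidence = p_x_given_f * p_f + p_x_given_n * p_n  # eqs 24/25
    return p_x_given_f * p_f / evidence


def ensemble_statistic(stats, limits, alpha=0.01):
    """Weighted sum of eqs 30/31: each model is weighted by its share of the total probability."""
    probs = [fault_probability(s, l, alpha) for s, l in zip(stats, limits)]
    total = sum(probs)
    return sum(p / total * p for p in probs)


# Made-up statistic values of three GLPP models for one sample, with a common limit of 10.
stats = [2.0, 12.0, 30.0]
limits = [10.0, 10.0, 10.0]
print([round(fault_probability(s, l), 5) for s, l in zip(stats, limits)])
print(round(ensemble_statistic(stats, limits), 5))
```

Two properties follow directly from the formulas: a statistic exactly at its control limit yields a posterior equal to $\alpha$, and because the weights are proportional to the probabilities themselves, the fused value is never below the plain average of the sub-model probabilities.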

Considering the side effect of ensemble learning, the FAR of the ensemble model will be slightly higher than that of a single model. As discussed in section 3.4, an indicator WFAR is designed to represent the reliability of the chosen GLPP models:

$$W_{FAR}^{(l)} = 1 - \frac{q^{(l)}}{\sum_{i=1}^{\mathrm{card}(\Omega)} q^{(i)}} \tag{32}$$

where $W_{FAR}^{(l)}$ is the reliability weight of the $l$-th GLPP model, $q^{(l)}$ is the FAR value of the statistic ($T^2$ or SPE) of the $l$-th GLPP model, and $\mathrm{card}(\Omega)$ is the number of chosen GLPP models. The value of $q^{(l)}$ can be calculated with $k$-fold cross-validation. In detail, the training set is divided equally into $k$ parts (this $k$ is different from the parameter of the k-NN method), and one part is selected as the test set for FAR. The process is repeated until every part has served as the test set. The FAR value of the GLPP model is the average over the $k$ tests. Clearly, the higher the FAR of a GLPP model, the lower its $W_{FAR}^{(l)}$. The indicator $W_{FAR}^{(l)}$ reflects the reliability of the chosen GLPP model and is incorporated into the weighted sum strategy as

$$ET^2 = \sum_{l=1}^{\mathrm{card}(\Omega)} \left\{ \frac{W_{T^2 FAR}^{(l)} \times P_{T^2}^{(l)}(F|x)}{\sum_{p=1}^{\mathrm{card}(\Omega)} \left( W_{T^2 FAR}^{(p)} \times P_{T^2}^{(p)}(F|x) \right)}\, P_{T^2}^{(l)}(F|x) \right\} \tag{33}$$

$$ESPE = \sum_{l=1}^{\mathrm{card}(\Omega)} \left\{ \frac{W_{SPE\,FAR}^{(l)} \times P_{SPE}^{(l)}(F|x)}{\sum_{p=1}^{\mathrm{card}(\Omega)} \left( W_{SPE\,FAR}^{(p)} \times P_{SPE}^{(p)}(F|x) \right)}\, P_{SPE}^{(l)}(F|x) \right\} \tag{34}$$
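A small numerical sketch may help to show the effect of eq 32 on eqs 33 and 34. The Python below is an illustration only: the function names are hypothetical, and the FAR values and fault probabilities are invented numbers standing in for cross-validated FARs and the posteriors of eqs 22 and 23.

```python
def reliability_weights(fars):
    """W_FAR^(l) = 1 - q^(l) / sum_i q^(i) (eq 32); needs >= 2 models and a nonzero FAR sum."""
    total = sum(fars)
    return [1.0 - q / total for q in fars]


def weighted_ensemble_statistic(fault_probs, weights):
    """Eqs 33/34: fusion weights are proportional to W_FAR^(l) * P^(l)(F|x)."""
    scaled = [w * p for w, p in zip(weights, fault_probs)]
    total = sum(scaled)
    return sum(s / total * p for s, p in zip(scaled, fault_probs))


# Invented FAR values and fault probabilities for three GLPP models.
fars = [0.01, 0.02, 0.07]
fault_probs = [0.2, 0.2, 0.8]
weights = reliability_weights(fars)

# Unweighted fusion (eqs 30/31) for comparison.
plain = sum(p * p for p in fault_probs) / sum(fault_probs)
weighted = weighted_ensemble_statistic(fault_probs, weights)
print(weights, plain, weighted)
```

In this invented case the third model has both the highest FAR and the highest fault probability, so its reduced weight pulls the fused statistic below the unweighted value. With a single model, eq 32 gives a weight of zero, so the sketch is only meaningful for two or more models.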

Comparing eqs 33 and 34 with eqs 30 and 31, the modification makes a less reliable GLPP model contribute less to the combination. Larger weights are assigned to those sub-models that give higher fault probabilities and lower FAR. This design implements the motivation of fully considering the global-local structure in a simple and useful way. Finally, whether the process is normal is determined by simply comparing the two final monitoring statistics with their confidence limits. In more detail, when ET2