System Decomposition for Distributed Multivariate Statistical Process Monitoring by Performance Driven Agglomerative Clustering


Process Systems Engineering

Shaaz Khatib, Prodromos Daoutidis, and Ali Almansoori
Ind. Eng. Chem. Res., Just Accepted Manuscript • DOI: 10.1021/acs.iecr.8b01708 • Publication Date (Web): 30 May 2018




Industrial & Engineering Chemistry Research

System Decomposition for Distributed Multivariate Statistical Process Monitoring by Performance Driven Agglomerative Clustering

Shaaz Khatib,† Prodromos Daoutidis,∗,† and Ali Almansoori‡

†Department of Chemical Engineering and Materials Science, University of Minnesota, Minneapolis, MN 55455, USA
‡Department of Chemical Engineering, Khalifa University of Science and Technology, Abu Dhabi, UAE

E-mail: [email protected]

Abstract

Conventional multivariate statistical process monitoring methods like Principal Component Analysis perform poorly in detecting faults in large systems. Partitioning the system and implementing a multivariate statistical process monitoring method in a distributed manner improves monitoring performance. A simulation optimization method is proposed whose objective is to find the system decomposition for which the performance of a distributed multivariate statistical process monitoring method is optimal. The proposed method uses the search strategy used in agglomerative clustering in finding the optimal system decomposition. To demonstrate its effectiveness, the proposed method is incorporated into a distributed principal component analysis based monitoring scheme and applied to the benchmark Tennessee Eastman Process case study.


Keywords: Decentralized Fault Detection, Multivariate Statistical Process Monitoring, Agglomerative Clustering, Simulation Optimization, System Decomposition, Principal Component Analysis

1 Introduction

Due to the increasing complexity of chemical plants, ensuring safe and cost-effective operation has become more challenging over the years. Over the last few decades, a number of process monitoring methods for fast and accurate fault detection and diagnosis have been developed to tackle this issue [1–4]. These methods can be classified into three categories, namely analytical, data-driven and knowledge-based methods [1]. Multivariate Statistical Process Monitoring (MSPM) methods are a class of data-driven methods in which measured data is first projected onto a feature space. A statistical model is used to fit the data in the feature space and to derive statistical tests for online process monitoring. MSPM methods are popular in the chemical industry since they are easy to deploy and do not require a process model [4]. Principal Component Analysis (PCA) is the most popular MSPM method due to its simplicity and efficiency in dimensionality reduction [5–7] and will be used in this study. In PCA, the measurements are projected onto a lower dimensional subspace of the measurement space in which the data has maximum variance. In recent years, a lot of research has focused on improving the performance and tackling some of the deficiencies of MSPM methods, including PCA [8–11].

One of the deficiencies of standard MSPM methods like PCA is that they perform poorly in large-scale processes [12], since detecting a fault becomes difficult when a large number of the measured variables of a process are unaffected by the fault. Dividing the system and implementing MSPM methods in a distributed manner could in principle address this problem. We will use the term distributed MSPM methods to include those methods in which each subsystem’s local measurements are used to estimate local test statistics which can then be used to make a local fault decision. The local test statistics or local fault decisions of all the subsystems can then be combined via a consensus strategy [6,13] to make a global fault decision to determine if the measurements indicate that the process is faulty. This definition covers “decentralized” [14–17], “distributed” [5,6,18,19] and “multi-block” [15–17,20,21] MSPM methods. A schematic of distributed process monitoring is provided in Figure 1.

Figure 1: Schematic of Distributed Multivariate Statistical Process Monitoring

The division of measured variables into subsystems can have a major impact on the monitoring performance of a distributed MSPM method. Therefore, system decomposition for distributed MSPM is an open problem that has attracted a considerable amount of research in the last few years. The multi-block monitoring methods in [15–17,22] place measured variables from different process units into different subsystems. In [6,14,18], normal data is used to decompose the system such that when PCA is applied to the measurements of each subsystem, a small number of loading vectors are needed to capture most of the variance in the measurements. Other system decomposition methods in [20,21,23,24] use a correlation metric (calculated using normal data) to define the closeness between two measured variables of the system. This is then used to divide the measured variables into subsystems via clustering or based on whether the metric exceeds a certain threshold.

Although the system decomposition methods described in the previous paragraph have their merits, the monitoring performance of a distributed MSPM method is unlikely to be optimal if a decomposition generated by one of these methods is used, since they do not use any fault information. This is identified in [5], where a performance driven system decomposition method is proposed. In this method, the measured variables relevant to detecting a fault type are placed in a subsystem. This set of variables is found by solving a simulation optimization problem using a genetic algorithm in which the performance of an MSPM method is simulated using data of the fault. However, the decomposition generated by the method in [5] may contain a large number of subsystems since the number of possible faults affecting a process may be very large. This can in turn lead to very high false alarm rates unless the thresholds of the monitoring statistics are significantly increased, which would in turn degrade fault monitoring performance. Also, the method in [5] does not simulate the consensus between the different subsystems in determining the optimal set of measured variables in a subsystem.

A performance driven decomposition method, which simulates the distributed MSPM method as a whole to not only find the optimal allocation of measured variables in a certain number of subsystems, but also find the optimal number of subsystems, is proposed in the present paper. We name it the Performance Driven Agglomerative Clustering (PDAC) method. The proposed PDAC method uses agglomerative clustering [25] to generate candidate decompositions. Agglomerative clustering is a class of algorithms in which clusters are iteratively merged at each step such that the constituents of each cluster are maximally similar with respect to a certain desired characteristic [25,26]. Normal and faulty data are then used to simulate the performance of a distributed MSPM method. The missed detection rate (MDR) is used to compare the monitoring performance of the distributed MSPM method when different candidate decompositions are used. The proposed PDAC method generates a near optimal decomposition for every possible value of the number of subsystems in a decomposition of the system. A fine tuning algorithm is also included which can be used to further improve the quality of a decomposition output by the initial agglomerative clustering step. The monitoring performance of the decompositions output by the agglomerative clustering and fine tuning algorithms is compared to finally select the optimal decomposition. To illustrate the proposed PDAC decomposition method, it is used in a distributed PCA [6] based monitoring method. A case study and a number of comparisons document the performance and robustness of the proposed method.

The remaining sections of the paper are organized as follows. In section 2 the problem is formulated and motivation for using the proposed PDAC method is given. In section 3 the proposed PDAC method is described in detail. The robustness and performance of the PDAC method are analyzed by applying it to the Tennessee Eastman Process in section 4. In section 5 final conclusions are made and future research directions are discussed. The appendix contains a description of the commonly used fault detection performance metrics and a table describing commonly used abbreviations.

2 Motivation and Overview of the Proposed Method

The system decomposition has a significant impact on the performance of a distributed MSPM method. The performance of a distributed MSPM method based on a decomposition depends on the faults that affect the system. Placing measured variables that are affected by similar faults in the same subsystem would enhance monitoring performance since each measured variable would then have a significant contribution to the local (subsystem) test statistic(s) when a fault occurs [27]. However, a measured variable that is unaffected by a fault, when placed in a subsystem containing other measured variables that are affected by the fault, could either enhance or degrade the monitoring performance with respect to that fault [5]. The performance of a distributed MSPM method based on a decomposition also depends on the number of subsystems used in the decomposition. If the number of subsystems is too large, then a large number of hypothesis tests (each of which generates false alarms at a certain rate) may have to be simultaneously employed, which would together generate a large number of false alarms. The detection thresholds of the hypothesis tests would then have to be increased, which would make them less sensitive to faults. However, if the number of subsystems in the decomposition is too small, then there is a greater probability of variables that tend to be affected by different faults being placed in the same subsystem. The consensus between the subsystems could also have a significant effect on monitoring performance. If the system decomposition is such that all the subsystems are more sensitive to faults during the same time instants and less sensitive to faults during different time instants of faulty operation, then monitoring performance of a distributed MSPM method will not be optimal. The performance of a distributed MSPM method based on a decomposition also depends on the MSPM method which is implemented in a distributed configuration. The decomposition for which the performance of a distributed MSPM method is optimal may not be the decomposition for which the performance of another distributed MSPM method is optimal.

A distributed MSPM method should ideally be implemented using a decomposition for which its performance is optimal. Finding such a decomposition is a complex task due to the large number of factors that have an effect on the performance of a distributed MSPM method based on a decomposition and due to the large number of measured variables that potentially may have to be partitioned. Simulation optimization is an effective strategy to find the optimal decomposition since it requires no insight into these factors and also directly optimizes performance. Therefore, to find the optimal decomposition, the monitoring performance of various candidate decompositions can be evaluated and compared by simulating the distributed MSPM method using these candidate decompositions and process data. The false alarm rate (FAR), missed detection rate (MDR) and detection delay (DD) are widely used performance metrics (see Appendix A).

In determining the decomposition of a system, two decisions must be made. The first involves selecting the number of subsystems and the second one involves allocating the measured variables into the subsystems. The allocation of measured variables into subsystems is expected to have a significant impact on the MDR and DD but not the FAR. This is because the detection thresholds for the test statistic(s) in each subsystem are set based on the level of significance (which is the expected false alarm rate that a statistical hypothesis test generates). For example, consider a distributed MSPM method in which a single hypothesis test is being carried out in each subsystem (with level of significance set equal to α) and the indication of a fault by one hypothesis test is sufficient for the monitoring method as a whole to indicate the fault. If, for a decomposition with B subsystems, the B statistical tests are assumed to be independent and the statistical models used to model the test statistics in each hypothesis test are precise, then the expected FAR is given by:

$$\mathrm{FAR} = 1 - (1 - \alpha)^{B} \tag{1}$$

Equation (1) indicates that, for the considered distributed MSPM method, the FAR mainly depends on the number of subsystems (B) and not the allocation of the variables in the subsystems. Therefore, given a number of subsystems in the decomposition, we can find the optimal allocation of measured variables by minimizing the MDR or the DD. Since, in the determination of the performance metrics, the distributed MSPM method is simulated using noisy data, the performance metric we choose to minimize should be robust to a change in the data set. The MDR is expected to be more robust than the DD because the MDR depends on the monitoring decision at every time instant (or for every sample). The DD, on the other hand, depends on the monitoring decision over a small number of time instants which can easily be corrupted by noise in the data. Therefore, we determine the optimal allocation of measured variables into B subsystems (i.e. the optimal decomposition with B subsystems) by finding the candidate decomposition with B subsystems which yields the lowest MDR when the distributed MSPM method is simulated using it. We refer to the aforementioned optimization problem as the allocation optimization problem.
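The robustness argument can be illustrated with a small sketch. It assumes the usual definitions of the two metrics (the paper's own definitions are in Appendix A, which is not reproduced here): the MDR is the fraction of faulty samples that are not flagged, and the DD is the number of samples between fault onset and the first alarm. The function names and the two decision sequences are hypothetical.

```python
def missed_detection_rate(decisions):
    """Fraction of faulty samples that were NOT flagged (decision == 0)."""
    missed = sum(1 for y in decisions if y == 0)
    return missed / len(decisions)


def detection_delay(decisions):
    """Index of the first alarm after fault onset (assumed definition of DD)."""
    for k, y in enumerate(decisions):
        if y == 1:
            return k
    return len(decisions)  # the fault was never detected


# Hypothetical system fault decisions for ten faulty samples.
clean = [0, 0, 0, 1, 1, 1, 1, 1, 1, 1]
# Same sequence, but noise triggers a single spurious alarm on the first sample.
noisy = [1, 0, 0, 1, 1, 1, 1, 1, 1, 1]

for name, seq in (("clean", clean), ("noisy", noisy)):
    print(name, missed_detection_rate(seq), detection_delay(seq))
```

In this toy case a single flipped decision moves the MDR from 0.3 to 0.2 but collapses the DD from 3 samples to 0, which is the sensitivity the paragraph above describes.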

The number of subsystems in a decomposition of the system can range from one to the number of measured variables (m) in the system. To determine the number of subsystems, the allocation optimization problem can be solved for all possible values of B thus generating m decompositions. We cannot directly compare the monitoring performance of these m decompositions. This is because the number of subsystems in a decomposition has a significant effect on both the MDR and the FAR and the decomposition having the lowest MDR may also be the decomposition with the highest FAR. Note however that the objective is not to minimize the FAR but to ensure that it is below a predefined threshold. This can be done by altering the detection thresholds of the test statistics used in the distributed MSPM method (by tuning the level of significance). Once this is done for the m decompositions, the decomposition for which the distributed MSPM method (with the detection thresholds being set based on the threshold FAR) yields the lowest MDR will be considered optimal.
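As a rough illustration of the tuning step, the sketch below inverts Equation (1) to obtain the per-test level of significance that meets a target FAR for a given B. It inherits the assumptions behind Equation (1) (independent tests, precise statistical models, one alarm sufficient); the function names and numerical values are hypothetical and are not taken from the paper.

```python
def expected_far(alpha, B):
    """Equation (1): expected system FAR for B independent tests at level alpha."""
    return 1 - (1 - alpha) ** B


def alpha_for_target_far(target_far, B):
    """Invert Equation (1): per-test level of significance that gives target_far."""
    return 1 - (1 - target_far) ** (1 / B)


# With a fixed alpha, the expected FAR grows with the number of subsystems.
for B in (1, 5, 20):
    print(B, round(expected_far(0.01, B), 4))

# Tightening alpha keeps the system FAR at the same target for every B.
for B in (1, 5, 20):
    print(B, alpha_for_target_far(0.05, B))
```

A smaller per-test level of significance means higher detection thresholds, which is why decompositions with many subsystems tend to pay for their FAR control with a higher MDR.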

The allocation optimization problem is difficult to solve. First, we cannot use conventional optimization approaches to solve this problem since the objective function (i.e. MDR) is a ‘black box’ as its evaluation involves simulating the distributed MSPM method for a candidate decomposition (the decision variable); thus the objective function evaluation is computationally expensive. Second, we cannot use a brute-force search to find the optimal decomposition since the number of candidate decompositions in a brute-force search is given by the Stirling number of the second kind, which grows faster than exponentially with m for most values of B [28]. Therefore we are forced to use heuristics to find an approximate solution to the allocation optimization problem. Another problem is that we have to solve the allocation optimization problem m times to obtain the optimal decompositions for every possible value of B. The proposed PDAC method uses the greedy search employed in Ward’s agglomerative clustering method [25] to generate candidate decompositions to solve the m allocation optimization problems. The main advantages of using the search in Ward’s agglomerative clustering method are that all m allocation optimization problems can be solved by applying the search once and the number of candidate decompositions generated during the search ($O(m^3)$) does not rise exponentially with m. A fine tuning algorithm can be applied to improve the quality of some of the decompositions output by the initial agglomerative clustering step. The monitoring performance of the decompositions output by the agglomerative clustering and fine tuning algorithms can be compared to select a final decomposition for the distributed MSPM method.
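To give a feel for the two search-space sizes, the sketch below compares the brute-force count (Stirling numbers of the second kind, summed over all B) with the number of candidate decompositions visited by the greedy agglomerative search. The function names are hypothetical; the greedy count assumes every pairwise merger is evaluated in each iteration and that the search stops once the two-subsystem decomposition is found.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def stirling2(m, B):
    """Number of ways to partition m variables into B non-empty subsystems."""
    if m == B:
        return 1
    if B == 0 or B > m:
        return 0
    # Variable m either joins one of the B existing subsystems or starts a new one.
    return B * stirling2(m - 1, B) + stirling2(m - 1, B - 1)


def agglomerative_candidates(m):
    """Candidate decompositions generated by the greedy search.

    An iteration that starts from k subsystems evaluates all C(k, 2) mergers;
    iterations start from k = m, m - 1, ..., 3.
    """
    return sum(comb(k, 2) for k in range(3, m + 1))


for m in (4, 10, 20, 30):
    brute_force = sum(stirling2(m, B) for B in range(1, m + 1))
    print(m, brute_force, agglomerative_candidates(m))
```

For the four-variable example used later in the paper, the greedy search visits 6 + 3 = 9 candidates, while the gap to exhaustive enumeration widens rapidly as m grows.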

3 Performance Driven Agglomerative Clustering

In this section the proposed PDAC method will be described in detail. To illustrate the PDAC method, we incorporate it into the distributed PCA (DPCA) monitoring method [6]. First, the calculation of the MDR (objective function) in the PDAC method is described. After this, the proposed agglomerative clustering and fine tuning algorithms of the PDAC method are described. Finally, a procedure that uses the agglomerative clustering and fine tuning algorithms to select a decomposition for a distributed MSPM method is discussed.

3.1 Objective Function Evaluation

To calculate the MDR of a candidate decomposition, the distributed MSPM method is simulated using the candidate decomposition. To simulate the distributed MSPM method, both normal and faulty data are required. The data can be obtained from a historical database or by running simulations, if knowledge about the process models is available. The normal data is used in the derivation of one or more statistical hypothesis tests which are used to determine if a sample is faulty. The statistical tests are applied to every sample in the faulty data set to determine the fraction of samples that are incorrectly classified (i.e. the MDR). The PDAC method requires that the calculations involved in simulating the distributed MSPM method be divided into two stages. The calculations in stage 1 are carried out independently for each subsystem and require no exchange of information between subsystems. The calculations in stage 2, on the other hand, combine the information from each subsystem in some way. The two stage division of the calculations involved in the DPCA monitoring method is discussed below.

3.1.1 Stage 1: Local Process Monitoring

Consider a subsystem (indexed as $b \in [1..B]$) with $m_b$ measured variables in a candidate decomposition of the system. The subsystem’s measurements in the normalized normal and faulty data sets are the inputs required to carry out the stage 1 calculations. PCA is applied to the subsystem’s normal data matrix ($X_b \in \mathbb{R}^{n \times m_b}$), containing $n$ samples of the $m_b$ measured variables, to calculate the loading matrix $P_b \in \mathbb{R}^{m_b \times a_b}$. The number of loading vectors ($a_b$) is found by using the percentage variance test [1], where the $a_b$ loading vectors should explain at least a certain minimum percentage of the variance in the normal data. The faulty data set contains $n_f$ faulty samples of the subsystem’s measurements. If the $T^2$ statistic is used for monitoring, the subsystem $T^2$ statistic ($T^2_{bf}$) is calculated for all the faulty samples ($x_{bf} \in \mathbb{R}^{m_b \times 1}$, where $f \in [1..n_f]$ indexes the faulty samples):

$$T^2_{bf} = \sum_{i=1}^{a_b} \left( \frac{x_{bf}^{T} \, p_{b,i}}{\sigma_i} \right)^{2} \tag{2}$$

where $p_{b,i}$ is the $i$th loading vector and $\sigma_i$ is the standard deviation of the projection of a sample of measured variables onto $p_{b,i}$. Given the level of significance, the detection threshold ($T^2_{b,\lim}$) for the $T^2$ statistic is calculated. The outputs of stage 1 are the subsystem (local) fault decisions ($y_{bf}$) for all the faulty samples, where:

$$y_{bf} = \begin{cases} 1, & \text{if } T^2_{bf} > T^2_{b,\lim} \text{ (i.e. a fault is detected by the subsystem)} \\ 0, & \text{otherwise} \end{cases} \tag{3}$$

The stage 1 calculations are carried out for all B subsystems.

3.1.2 Stage 2: Decision Fusion

The stage 1 calculation outputs for all the subsystems (i.e. the subsystem fault decisions $y_{bf}$ $\forall b \in [1..B]$ and $f \in [1..n_f]$) are the inputs required to carry out the stage 2 calculations. The subsystems reach a consensus that a sample of measured variables is faulty if a certain minimum number ($y_{\text{threshold}}$) of subsystems detect that the sample is faulty. The system fault decision $y_f$ is calculated for every faulty sample:

$$y_f = \begin{cases} 1, & \text{if } \sum_{b=1}^{B} y_{bf} \geq y_{\text{threshold}} \text{ (i.e. a fault is detected by the system)} \\ 0, & \text{otherwise} \end{cases} \tag{4}$$

The stage 2 calculation output is the MDR:

$$\mathrm{MDR} = 1 - \frac{\sum_{f=1}^{n_f} y_f}{n_f} \tag{5}$$
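The two stages can be sketched together as follows. This is a minimal NumPy illustration of Equations (2) to (5), not the authors' implementation. It assumes the data are already normalized with the normal-data mean and standard deviation, and it sets the $T^2$ threshold to the empirical $(1-\alpha)$ quantile of the normal-data $T^2$ values, which is a stand-in chosen here because the text does not state how the threshold is derived. All function and parameter names are hypothetical.

```python
import numpy as np


def stage1_local_decisions(X_normal, X_faulty, min_variance=0.9, alpha=0.01):
    """Stage 1 for one subsystem: PCA on normal data, then local decisions y_bf."""
    n = X_normal.shape[0]
    # Loading vectors are the right singular vectors of the (centered) data matrix X_b.
    _, s, Vt = np.linalg.svd(X_normal, full_matrices=False)
    variances = s ** 2 / (n - 1)  # variance of the projection onto each loading vector
    # Percentage variance test: fewest loading vectors explaining min_variance.
    explained = np.cumsum(variances) / variances.sum()
    a_b = min(int(np.searchsorted(explained, min_variance)) + 1, len(s))
    P_b, sigma = Vt[:a_b].T, np.sqrt(variances[:a_b])

    def t2(X):
        return np.sum((X @ P_b / sigma) ** 2, axis=1)  # Equation (2)

    # ASSUMPTION: empirical quantile threshold, not necessarily the paper's choice.
    t2_lim = np.quantile(t2(X_normal), 1 - alpha)
    return (t2(X_faulty) > t2_lim).astype(int)  # Equation (3)


def stage2_mdr(local_decisions, y_threshold=1):
    """Stage 2: fuse the B local decision vectors and return the MDR."""
    votes = np.asarray(local_decisions).sum(axis=0)
    y = (votes >= y_threshold).astype(int)  # Equation (4)
    return 1.0 - y.mean()  # Equation (5)


def simulate_mdr(X_normal, X_faulty, decomposition, y_threshold=1):
    """MDR of a candidate decomposition given as a list of column-index lists."""
    local = [stage1_local_decisions(X_normal[:, cols], X_faulty[:, cols])
             for cols in decomposition]
    return stage2_mdr(local, y_threshold)
```

For instance, `simulate_mdr(Xn, Xf, [[0, 1], [2, 3]])` would evaluate a two-subsystem decomposition of four variables. Keeping stage 1 as a separate function per subsystem is what allows its outputs to be cached and reused across candidate decompositions.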

Remark 1: The outputs of calculations in the distributed MSPM simulation that do not depend on the candidate decomposition should be stored before the calculations involved in the two stages are carried out. This is done to ensure that calculations are not repeated when the distributed MSPM simulation is re-run using different decompositions. For example, the normalization of the normal and faulty data does not depend on the candidate decomposition and is thus carried out before stage 1.

3.2 Agglomerative Clustering

The PDAC method uses the greedy search in Ward’s agglomerative clustering method [25] to solve the m allocation optimization problems (each of which involves finding the optimal decomposition of the system with $B \in [1..m]$ subsystems). The first iteration of the search starts with an initial decomposition in which each measured variable is its own subsystem. This decomposition represents the only possible way of allocating the m measured variables into m non-empty subsystems and is thus the optimal decomposition with m subsystems. Out of the m subsystems, two can be merged to generate a candidate decomposition with m − 1 subsystems. Since there are $m(m-1)/2$ ways of pairing two subsystems, $m(m-1)/2$ candidate decompositions with m − 1 subsystems are generated and the MDR of each candidate decomposition is calculated by simulating the distributed MSPM method. The candidate decomposition with the lowest MDR is the optimal decomposition with m − 1 subsystems. The process is repeated in subsequent iterations in which the initial decomposition of the current iteration is the optimal decomposition generated in the previous iteration, all possible mergers of two subsystems of the initial decomposition are considered and the optimal decomposition for the current iteration is found. The search stops when the optimal decomposition with two subsystems is found. The optimal decomposition with one subsystem involves allocating all the measured variables into a single subsystem. The algorithm used for agglomerative clustering is formulated in a way such that the stage 1 simulation calculations being carried out in a certain iteration of the search are not repeated in subsequent iterations. A simple example is used to illustrate the main steps of the agglomerative clustering algorithm. A general description of the agglomerative clustering algorithm applied to the DPCA monitoring method is provided in the Supporting Information.
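The search described above can be sketched in a few lines. This is an illustrative outline rather than the algorithm given in the Supporting Information: `stage1` is a hypothetical callable that returns the local fault decision vector of a subsystem (a frozenset of variable indices), stage 2 is the vote-counting rule of Equations (4) and (5), and a dictionary caches stage 1 outputs so that no subsystem is evaluated twice. Unlike the description above, the sketch never deletes cached results and continues the loop down to the trivial one-subsystem decomposition.

```python
from itertools import combinations


def pdac_greedy_search(m, stage1, y_threshold=1):
    """Greedy agglomerative search; returns {B: (decomposition, MDR)} for B = m..1."""
    cache = {}  # stage 1 outputs keyed by subsystem (frozenset of variable indices)

    def lfd(subsystem):
        if subsystem not in cache:
            cache[subsystem] = stage1(subsystem)
        return cache[subsystem]

    def mdr(decomposition):
        rows = [lfd(s) for s in decomposition]
        n_f = len(rows[0])
        detected = sum(1 for f in range(n_f)
                       if sum(r[f] for r in rows) >= y_threshold)
        return 1 - detected / n_f

    # Initial decomposition: every measured variable is its own subsystem.
    current = [frozenset([v]) for v in range(m)]
    best = {m: (current, mdr(current))}
    while len(current) > 1:
        candidates = []
        for a, b in combinations(current, 2):
            merged = [s for s in current if s not in (a, b)] + [a | b]
            candidates.append((mdr(merged), merged))
        # Keep the merger with the lowest MDR (first one wins on ties).
        score, current = min(candidates, key=lambda c: c[0])
        best[len(current)] = (current, score)
    return best


# Hypothetical toy stage 1: a subsystem flags sample f only if it contains
# variable f and holds at most two variables (larger subsystems dilute the fault).
calls = []

def toy_stage1(subsystem):
    calls.append(subsystem)
    return [int(f in subsystem and len(subsystem) <= 2) for f in range(4)]

result = pdac_greedy_search(4, toy_stage1)
for B in sorted(result, reverse=True):
    print(B, [sorted(s) for s in result[B][0]], result[B][1])
```

With this toy rule the search pairs the variables two by two, because merging three variables into one subsystem would stop that subsystem from detecting anything.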


Example

Consider a system with four measured variables $v_1, v_2, v_3, v_4$. Distributed PCA is used for fault detection with $y_{\text{threshold}} = 1$. The faulty data set used in the calculation of the MDR contains four multivariate faulty samples.

In the initial decomposition of the first iteration, each measured variable is its own subsystem. The initial decomposition of the first iteration is the only possible and hence optimal decomposition of the system with four subsystems. The initial decomposition and the calculations involved in the first iteration are shown graphically in Figure 2. Each node (circle) in the graph represents a subsystem. We carry out the stage 1 calculations of the DPCA simulation for each subsystem defined by the initial decomposition and store the output local fault decisions (LFD). For example, after carrying out the stage 1 calculations for subsystem ‘1’ we find that subsystem ‘1’ identifies the first faulty sample as faulty and the remaining faulty samples as normal. The LFD vector [1, 0, 0, 0] is thus stored in the subsystem ‘1’ node in Figure 2. There are six possible ways of merging 2 subsystems of the initial decomposition to form a candidate subsystem. Each candidate subsystem is represented graphically in Figure 2 by an edge linking the two subsystems merged to form it. We carry out the stage 1 calculations of the DPCA simulation for each candidate subsystem and store the output local fault decisions. For example, after carrying out the stage 1 calculations for candidate subsystem ‘12’ we find that candidate subsystem ‘12’ identifies the first faulty sample as faulty and the remaining faulty samples as normal. The LFD vector [1, 0, 0, 0] is thus stored near the edge corresponding to candidate subsystem ‘12’ in Figure 2.

Six candidate decompositions are generated in the first iteration. Each candidate decomposition contains a candidate subsystem and the remaining subsystems which were not merged to form the candidate subsystem. For example, candidate decomposition ‘12’ consists of candidate subsystem ‘12’, subsystem ‘3’ and subsystem ‘4’. Using the local fault decisions of the subsystems of a candidate decomposition, we carry out the stage 2 calculations of the DPCA simulation and find the MDR of the candidate decomposition. For example, using the LFD of candidate subsystem ‘12’, subsystem ‘3’ and subsystem ‘4’ (shown in Figure 2), the system with candidate decomposition ‘12’ will identify the first three faulty samples as faulty and the fourth faulty sample as normal. The MDR of candidate decomposition ‘12’ is thus 0.25 and is shown in Figure 2 near the edge representing candidate subsystem ‘12’. The MDRs of all six candidate decompositions are calculated and candidate decomposition ‘12’ is found to have the lowest MDR. Therefore, the decomposition with subsystem ‘12’, subsystem ‘3’ and subsystem ‘4’ is the optimal decomposition with three subsystems.

Figure 2: Graphical Description of the First Iteration Calculations

The optimal decomposition with three subsystems is the initial decomposition of the second iteration. The initial decomposition and the calculations involved in the second iteration are shown graphically in Figure 3. The local fault decisions of subsystem ‘12’, subsystem ‘3’, subsystem ‘4’ and candidate subsystem ‘34’ were already calculated and stored in the first iteration and are therefore transferred from the graph in Figure 2 to the graph in Figure 3. The local fault decisions of subsystem ‘1’, subsystem ‘2’, candidate subsystem ‘13’, candidate subsystem ‘14’, candidate subsystem ‘23’ and candidate subsystem ‘24’ were calculated and stored in the first iteration but are not required in the second iteration and are thus deleted. Only the local fault decisions of candidate subsystem ‘123’ and candidate subsystem ‘124’ need to be calculated in the second iteration. We carry out the stage 1 calculations of the DPCA simulation for these two candidate subsystems and store the output local fault decisions (LFD). The MDRs of the three candidate decompositions generated in the second iteration are calculated by carrying out the stage 2 calculations of the DPCA simulation for each candidate decomposition using its subsystems’ local fault decisions. Candidate decomposition ‘124’, which contains subsystem ‘3’ and candidate subsystem ‘124’, has the lowest MDR out of the three candidate decompositions and is the optimal decomposition with two subsystems. The optimal and only possible decomposition with one subsystem has all four measured variables in one subsystem.

Remark 2: The algorithm requires $O(m^2)$ evaluations of stage 1 of the DPCA simulation and $O(m^3)$ evaluations of stage 2 of the DPCA simulation. A naive implementation of Ward’s agglomerative clustering method, in which the two stage division of the DPCA simulation is not considered, would require $O(m^3)$ simulations of the entire DPCA method which is equivalent to $O(m^4)$ stage 1 evaluations and $O(m^3)$ stage 2 evaluations. The reduction in the number of stage 1 evaluations is significant since most of the computations involved in the distributed MSPM method occur in stage 1.
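The reuse of stored local fault decisions described in the two iterations above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' MATLAB implementation: `stage1` is a hypothetical callable standing in for the stage 1 calculations of the DPCA simulation, `toy_stage1` and `TOY_SIGNALS` are made-up data, and stage 2 is assumed to be the voting rule with a threshold of 1. Unlike the procedure described above, the sketch never deletes stored local fault decisions; it only avoids recomputing them.

```python
from itertools import combinations


def mdr_from_lfds(lfds, y_threshold=1):
    """Stage 2 (voting): fraction of faulty samples flagged by fewer than
    y_threshold subsystems."""
    n_samples = len(lfds[0])
    missed = sum(
        1 for k in range(n_samples)
        if sum(lfd[k] for lfd in lfds) < y_threshold
    )
    return missed / n_samples


def agglomerate(variables, stage1, y_threshold=1):
    """Greedy pairwise merging with cached stage 1 results.

    stage1(subsystem) is a hypothetical callable returning a 0/1 list of
    local fault decisions for a frozenset of variables. Returns the list of
    (decomposition, MDR) pairs for B = m, ..., 1 and the number of stage 1 calls.
    """
    cache = {}

    def lfd(subsystem):
        if subsystem not in cache:  # stage 1 runs once per distinct subsystem
            cache[subsystem] = stage1(subsystem)
        return cache[subsystem]

    current = [frozenset([v]) for v in variables]
    history = [(current, mdr_from_lfds([lfd(s) for s in current], y_threshold))]
    while len(current) > 1:
        best = None
        for a, b in combinations(current, 2):
            # Candidate decomposition: merge a and b, keep the other subsystems.
            candidate = [s for s in current if s not in (a, b)] + [a | b]
            mdr = mdr_from_lfds([lfd(s) for s in candidate], y_threshold)
            if best is None or mdr < best[1]:
                best = (candidate, mdr)
        current = best[0]
        history.append(best)
    return history, len(cache)


# Toy stage 1 (assumption): a subsystem flags a sample if any member variable does.
TOY_SIGNALS = {1: [1, 0, 0, 0], 2: [0, 0, 0, 0], 3: [0, 1, 0, 0], 4: [0, 0, 1, 0]}


def toy_stage1(subsystem):
    return [int(any(TOY_SIGNALS[v][k] for v in subsystem)) for k in range(4)]


history, n_stage1_calls = agglomerate([1, 2, 3, 4], toy_stage1)
```

For four variables the cache is filled by the four single-variable subsystems, the six pairs of the first iteration, two new candidates in the second iteration and one in the third, which is the kind of saving the remark above refers to.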


Figure 3: Graphical Description of the Second Iteration Calculations

3.3 Fine Tuning

Agglomerative clustering is only able to generate approximate solutions to the allocation optimization problems. This is because it is a greedy algorithm in which the optimal pairing of subsystems in a particular iteration is assumed to be optimal for subsequent iterations and this may not be the case. To solve this problem a fine tuning procedure can be used to slightly modify a decomposition obtained by agglomerative clustering and thus potentially generate a decomposition with a lower MDR.

Newman used a fine tuning step, based on Kernighan-Lin’s algorithm (ref 29), to improve the quality of cluster bi-partitions in his community detection method (ref 30). The fine tuning algorithm we employ extends the greedy search used in Newman’s fine tuning step to multiple subsystems. The fine tuning algorithm first requires an initial decomposition (i.e. an initial guess) which is modified in the course of the algorithm. A measured variable is then shifted from its current subsystem (defined by the initial decomposition) to another subsystem, thus generating a candidate decomposition. The distributed MSPM method is simulated using the candidate decomposition and its MDR is calculated. Out of all possible moves of measured variables from their current subsystem to a different subsystem, the move for which the corresponding candidate decomposition has the lowest MDR is carried out permanently (even if the resulting candidate decomposition has a higher MDR than the initial decomposition). The process is repeated in subsequent iterations where the initial decomposition is the candidate decomposition with the lowest MDR in the preceding iteration.

Once a variable has been permanently moved in an iteration, it is constrained to remain in the same subsystem in subsequent iterations. The candidate decompositions are also constrained to have the same number of subsystems as the initial guess. Therefore, if, in an iteration of the fine tuning algorithm, a measured variable is the only constituent of its subsystem, then the measured variable cannot be moved to another subsystem. The algorithm stops when there are no measured variables which can undergo moves to other subsystems. The decomposition with the lowest MDR out of all the decompositions generated in the search (which includes all the candidate decompositions generated in all the iterations and the initial guess of the first iteration) is the output of the fine tuning algorithm.

The fine tuning algorithm is formulated such that the stage 1 simulation calculations are, as far as possible, not repeated. A simple example is used to illustrate the main steps of the fine tuning algorithm. A general description of the fine tuning algorithm applied to the DPCA monitoring method is provided in the Supporting Information.
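The search described above can be sketched in a few lines. This is only an illustrative sketch: `mdr_of` is a hypothetical callable that simulates the distributed MSPM method for a decomposition and returns its MDR, `toy_mdr` is a made-up objective, and the sketch recomputes the MDR from scratch for every candidate instead of reusing stored stage 1 results.

```python
def fine_tune(initial, mdr_of):
    """Kernighan-Lin style search over single-variable moves.

    initial: list of sets of variables (the initial guess).
    mdr_of:  hypothetical callable mapping a decomposition (list of
             frozensets) to its MDR.
    """
    current = [frozenset(s) for s in initial]
    best, best_mdr = list(current), mdr_of(current)
    locked = set()  # variables that have already been moved permanently
    while True:
        move = None  # (MDR, candidate decomposition, moved variable)
        for i, src in enumerate(current):
            if len(src) == 1:
                continue  # moving the only variable would remove a subsystem
            for v in sorted(src - locked):
                for j, dst in enumerate(current):
                    if j == i:
                        continue
                    candidate = list(current)
                    candidate[i] = src - {v}
                    candidate[j] = dst | {v}
                    mdr = mdr_of(candidate)
                    if move is None or mdr < move[0]:
                        move = (mdr, candidate, v)
        if move is None:
            break  # no variable can be moved any more
        mdr, current, v = move  # accept the best move even if the MDR rises
        locked.add(v)
        if mdr < best_mdr:
            best, best_mdr = list(current), mdr
    return best, best_mdr


def toy_mdr(decomposition):
    # Made-up objective: zero only if variables 1 and 2 share a subsystem.
    return 0.0 if any({1, 2} <= s for s in decomposition) else 0.5


best, best_mdr = fine_tune([{1, 3}, {2, 4}], toy_mdr)
```

Tracking only the best candidate of each iteration is sufficient here, because that candidate already has the lowest MDR among all candidates generated in the iteration.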

Example

Consider a system with seven measured variables $v_1, v_2, v_3, v_4, v_5, v_6, v_7$. Distributed PCA is used for fault detection with $y_{threshold} = 1$. The faulty data set used in the calculation of the MDR contains four multivariate faulty samples.

The initial guess is a decomposition with four subsystems. The four subsystems of the initial guess are $S_1 = \{v_1, v_2\}$, $S_2 = \{v_3, v_4\}$, $S_3 = \{v_5, v_6\}$ and $S_4 = \{v_7\}$. We carry out the stage 1 calculations of the DPCA simulation for each of the four subsystems and store the output local fault decisions. Using the local fault decisions of the four subsystems, we calculate the MDR of the initial guess by carrying out stage 2 of the DPCA simulation. The calculations involved in finding the MDR of the initial guess are shown graphically in Figure 4.

Figure 4: MDR Calculation for the Initial Guess

In the first iteration, all the measured variables except $v_7$ can be shifted from their current subsystems since $v_7$ is the only variable in its subsystem $S_4$ and shifting it to another subsystem would reduce the number of subsystems in the decomposition. Consider moves of measured variable $v_1$ from subsystem $S_1$ to subsystems $S_2$, $S_3$ and $S_4$. These moves generate four new subsystems $S'_1 = \{v_2\}$, $S'_2 = \{v_1, v_3, v_4\}$, $S'_3 = \{v_1, v_5, v_6\}$ and $S'_4 = \{v_1, v_7\}$. We carry out the stage 1 calculations of the DPCA simulation for each of the four new subsystems and store the output local fault decisions. The moves of $v_1$ from $S_1$ to $S_2$, $S_3$ and $S_4$ generate three candidate decompositions (see Figure 5). We find the MDR of each one of these candidate decompositions by carrying out stage 2 of the DPCA simulation using the local fault decisions of its subsystems. The MDR calculations for the candidate decompositions are shown in Figure 5.

Figure 5: MDR Calculation for Some of the Candidate Decompositions in the First Iteration

The measured variables $v_2$, $v_3$, $v_4$, $v_5$ and $v_6$ are shifted from their current subsystems to other subsystems in a similar way to $v_1$. The MDRs of the candidate decompositions generated by these shifts are calculated. The decomposition generated when $v_3$ is shifted from subsystem $S_2$ to $S_4$ is found to have the lowest MDR out of all the candidate decompositions generated in the first iteration (not including the initial guess). Measured variable $v_3$ is shifted permanently from subsystem $S_2$ to $S_4$ at the end of the first iteration. The resulting decomposition has the subsystems $S_1$, $S_{2new} = \{v_4\}$, $S_3$ and $S_{4new} = \{v_3, v_7\}$ and is shown in Figure 6.

Figure 6: Best Candidate Decompositions in the First Iteration

In the second iteration, all the measured variables, except $v_3$ and $v_4$, can be shifted from their current subsystems since $v_3$ has already been moved permanently in the first iteration and $v_4$ is the only variable in its subsystem $S_{2new}$. Consider moves of measured variable $v_1$ from subsystem $S_1$ to subsystems $S_{2new}$, $S_3$ and $S_{4new}$. These moves generate two new subsystems $S'_{2new} = \{v_1, v_4\}$ and $S'_{4new} = \{v_1, v_3, v_7\}$ along with the subsystems $S'_1$ and $S'_3$ that were generated when moves of $v_1$ were carried out in the first iteration. Carry out the stage 1 calculations of the DPCA simulation for subsystems $S'_{2new}$ and $S'_{4new}$ and store the output local fault decisions. The stored local fault decisions of the subsystems $S'_1$ and $S'_3$ from the first iteration are required in the calculation of the MDRs of the three candidate decompositions generated when the measured variable $v_1$ is shifted to another subsystem. We find the MDR of each one of these three candidate decompositions by carrying out stage 2 of the DPCA simulation using the local fault decisions of its subsystems. The MDR calculations for the candidate decompositions are shown in Figure 7. Moves of measured variables $v_2$, $v_5$, $v_6$ and $v_7$ are also carried out and the MDRs of the resulting candidate decompositions are calculated. Measured variable $v_1$ is permanently shifted from subsystem $S_1$ to $S_{2new}$ at the end of the second iteration since the resulting decomposition (candidate decomposition 1 in Figure 7) has the lowest MDR out of all the candidate decompositions generated in the second iteration.

Figure 7: MDR Calculation for Some of the Candidate Decompositions in the Second Iteration

The procedure carried out in the first and second iterations is repeated in subsequent iterations. The algorithm stops at the end of the sixth iteration since none of the measured variables can be moved in the seventh iteration. The MDRs of the initial guess and the best candidate decomposition in each of the six iterations are compared. The best candidate decomposition in the sixth iteration has an MDR of zero which is lower than the MDR of the initial guess and all the candidate decompositions generated in other iterations. This decomposition is thus considered optimal and is the algorithm output.

Remark 3: The fine tuning algorithm requires $O(m^2)$ evaluations of stage 1 of the DPCA simulation and $O(Bm^2)$ evaluations of stage 2 of the DPCA simulation, in the worst case, for $B \le m/2$. A naive implementation of the fine tuning algorithm, in which the two stage division of the DPCA simulation is not considered, would require $O(Bm^2)$ simulations of the entire DPCA method which is equivalent to $O(B^2 m^2)$ stage 1 evaluations and $O(Bm^2)$ stage 2 evaluations, in the worst case, for $B \le m/2$.

3.4 Decomposition Selection

The agglomerative clustering algorithm outputs m decompositions with the number of subsystems in each decomposition ranging from 1 to m. The FARs generated when the distributed MSPM method is simulated using each of these decompositions must be comparable before the MDRs of these decompositions can be compared. Besides the system decomposition, the level of significance (α) is another parameter that the user must define for a distributed MSPM method. The level of significance is the expected FAR that a statistical hypothesis test will generate. The thresholds for the test statistics used in a distributed MSPM method are a function of α. The level of significance can therefore be tuned to set the FAR to an acceptable level.

Figure 8: Schematic Flow Sheet of the PDAC Method

Therefore, a value of α for which the FAR is comparable in value to a user-defined threshold ($FAR_{threshold}$) is determined for all the m decompositions. The distributed MSPM method is simulated for each of the m decompositions with the determined value of α being used to set the detection threshold for the test statistics, and the MDR of each of the m decompositions is determined. The decomposition with the lowest MDR is then subjected to the fine tuning algorithm. The decompositions with fewer subsystems than the decomposition with the lowest MDR, and whose MDRs are within a small percentage of the lowest MDR, are also subjected to the fine tuning algorithm. The decomposition with the lowest MDR after fine tuning is selected for the distributed MSPM method. A flow sheet describing the PDAC method is provided in Figure 8. A MATLAB® code for the implementation of the PDAC method is provided at daoutidis.cems.umn.edu/software and it can be used to find the optimal decomposition of a system for a distributed MSPM method that uses a voting based consensus strategy with $y_{threshold} = 1$, i.e. the system detects a fault if at least one of its subsystems detects the fault.

The steps involved in the selection of a decomposition for the DPCA monitoring method in which $y_{threshold} = 1$ and in which the $T^2$ statistic is the only test statistic used for monitoring are:

1. Initially set $\alpha = FAR_{threshold}$ in the DPCA simulation. Apply the agglomerative clustering algorithm to generate m optimal decompositions to the m allocation optimization problems (with the number of subsystems (B) in the optimal decompositions ranging from 1 to m).

2. If a decomposition output by the agglomerative clustering algorithm has B subsystems, then B statistical tests (in which $T_b^2$ is compared with $T_{b,lim}^2$, $b \in [1..B]$) will be carried out simultaneously in the DPCA method to determine if a sample is faulty. If we assume that the statistical tests are independent and that all the measured variables are Gaussian, then the expected FAR is given by:

$$FAR = 1 - (1 - \alpha_B)^B \qquad (6)$$

where $\alpha_B$ is the level of significance used to set the threshold value of the $T^2$ statistic ($T_{b,lim}^2$, $b \in [1..B]$) for each statistical test. Setting $FAR = FAR_{threshold}$ and solving equation (6) for $\alpha_B$ gives:

$$\alpha_B = 1 - (1 - FAR_{threshold})^{1/B} \qquad (7)$$

Calculate $\alpha_B$ $\forall B \in [1..m]$ using equation (7).

3. Set $\alpha = \alpha_B$ in the DPCA simulation. Run the DPCA simulation using the decomposition with B subsystems output by the agglomerative clustering algorithm and find its MDR. Use this procedure to find the MDRs of the m decompositions output by the agglomerative clustering algorithm.

4. Find the decomposition with the lowest MDR. The decomposition with $B_{min}$ subsystems, output by the agglomerative clustering algorithm, has the lowest MDR equal to $MDR_{min}$. Set $\alpha = \alpha_{B_{min}}$ in the DPCA simulation and apply the fine tuning algorithm using the decomposition with $B_{min}$ subsystems as the initial guess. Find the decompositions output by the agglomerative clustering algorithm with $B < B_{min}$, $B \neq 1$ and with MDR within Z% of $MDR_{min}$. Apply the fine tuning algorithm using each one of these decompositions as the initial guess after setting α in the DPCA simulation using equation (7).

5. Out of all the decompositions generated as outputs after applying the fine tuning algorithm in the previous step, the decomposition with the lowest MDR is selected for the distributed PCA method.

Remark 4: If the statistical tests in the DPCA monitoring method are not independent due to correlation between the measured variables from different subsystems, then $FAR \le 1 - (1 - \alpha_B)^B$ (ref 31), which implies $FAR \le FAR_{threshold}$. If the measured variables are not Gaussian then it is possible that the statistical model used to fit the $T^2$ statistic might underfit it, which may cause FAR to exceed $FAR_{threshold}$. However, since the aim is to find an $\alpha_B$ for each decomposition output by the agglomerative clustering algorithm such that the FAR of each decomposition is comparable in value, estimating α using equation (7) is acceptable.
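Equations (6) and (7) are easy to check numerically. The short sketch below computes $\alpha_B$ for a range of B and substitutes it back into equation (6). The function names and the values of m and $FAR_{threshold}$ used in the usage lines are illustrative assumptions, and the formulas hold only under the independence and Gaussianity assumptions of step 2.

```python
def alpha_for(n_subsystems, far_threshold):
    """Equation (7): per-test level of significance for B independent tests."""
    return 1.0 - (1.0 - far_threshold) ** (1.0 / n_subsystems)


def expected_far(alpha_b, n_subsystems):
    """Equation (6): expected FAR when B independent tests each use alpha_B."""
    return 1.0 - (1.0 - alpha_b) ** n_subsystems


# Illustrative values (assumptions): m = 52 variables, FAR threshold of 0.01.
alphas = {b: alpha_for(b, 0.01) for b in range(1, 53)}
round_trip = {b: expected_far(a, b) for b, a in alphas.items()}
```

The per-test level shrinks as B grows, so that the system-level FAR stays at the threshold; this is the same adjustment that is known in the statistics literature as the Šidák correction.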

Remark 5: The FAR can also be estimated using k-fold cross validation (ref 32). In k-fold cross validation, the normal data set is divided into k parts. Out of the k parts, k − 1 parts of the data set are used as training data and the remaining part is used as testing data to estimate the FAR. The FAR is calculated for k different divisions of the normal data and is averaged over the k divisions of the data that are considered. The value of $\alpha_B$ is tuned until the estimated FAR is equal to $FAR_{threshold}$. This method for estimating α is more computationally expensive than using probability concepts to estimate $\alpha_B$ (in equation (7), for example). However, if there is still a considerable dependence of FAR on B after the level of significance is set based on probability concepts, then k-fold cross validation can be used to tune $\alpha_B$.

Remark 6: The level of significance used in the simulation of the distributed MSPM method in the agglomerative clustering algorithm is not the level of significance for which $FAR \approx FAR_{threshold}$. The level of significance also has an impact on the optimal allocation of measured variables into subsystems. The fine tuning algorithm takes this factor into account by simulating the distributed MSPM method using a level of significance for which $FAR \approx FAR_{threshold}$.

Remark 7: The value of Z in step 4 should be set in the 1-2% range. If Z is set equal to zero in step 4, then only the decomposition with the lowest MDR will be fine tuned. However, the extent to which fine tuning will reduce the MDR of a decomposition cannot be determined before the decomposition is fine tuned. Therefore, decompositions with an MDR within 1-2% of $MDR_{min}$ are subject to fine tuning. Only decompositions with number of subsystems $B < B_{min}$ are fine tuned since the reduction in MDR after fine tuning is applied is expected to be higher for decompositions with a smaller number of subsystems. This is because a decomposition output by the agglomerative clustering algorithm with a smaller number of subsystems is generated by merging subsystems that tend to contain a greater number of measured variables. This would increase the likelihood of variable misplacement and hence suboptimal variable allocation. Also, a smaller number of candidate decompositions is analyzed in the agglomerative clustering algorithm to solve the allocation optimization problem for a smaller value of B. Fine tuning should not be applied to all the decompositions output by the agglomerative clustering algorithm since the computational cost of applying the fine tuning algorithm once is comparable to the computational cost of applying the agglomerative clustering algorithm. The value of Z should be set equal to zero if the fine tuning algorithm is too slow.

Remark 8: If the PDAC method becomes too slow due to the number of normal and faulty samples in the training data set being too large, then only a subset of the samples could be used in the PDAC method. The fraction of faulty samples of a certain fault type in the faulty data set used in the PDAC method should ideally be equal to the fraction of faulty samples of the fault type in the original faulty data set. PDAC will give more weight to a fault type if the number of faulty samples of that fault type is high in comparison to the number of faulty samples of other fault types. The performance of the decomposition generated by the PDAC method will only be optimal in monitoring those faults whose samples are part of the faulty data set and not in monitoring unanticipated faults.
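One way to draw such a subset while preserving the fraction of samples of each fault type is stratified sampling. The sketch below is an illustration only: the function name, the `fault_labels` input (one fault-type label per faulty sample) and the fixed seed are assumptions rather than part of the PDAC method.

```python
import random
from collections import defaultdict


def stratified_subsample(fault_labels, fraction, seed=0):
    """Return sorted sample indices that keep roughly `fraction` of the
    samples of every fault type, so fault proportions are preserved."""
    rng = random.Random(seed)
    by_fault = defaultdict(list)
    for index, label in enumerate(fault_labels):
        by_fault[label].append(index)
    kept = []
    for indices in by_fault.values():
        n_keep = max(1, round(fraction * len(indices)))  # keep at least one sample
        kept.extend(rng.sample(indices, n_keep))
    return sorted(kept)


# Illustrative labels (assumption): three fault types with unequal sample counts.
labels = [1] * 480 + [2] * 480 + [3] * 240
kept = stratified_subsample(labels, 0.25)
```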

4 Case Study

In this section, the robustness and performance of the proposed PDAC method are analyzed by applying it to find the optimal decomposition of the Tennessee Eastman Process for process monitoring via the DPCA monitoring method. The simulations are carried out in MATLAB® using a 2.7 GHz Intel® Core™ i7-7500U processor.

4.1 Tennessee Eastman Process

The Tennessee Eastman Process (TEP) is a benchmark case study in the process monitoring research community. The process has five main process units: a reactor where a set of exothermic gas-liquid reactions occur and where cooling water is used to remove the heat generated by these reactions, a product condenser, a separator which separates some of the non-condensible reactants from the liquid products, a compressor which recycles the non-condensible reactants and a stripper which further purifies the products. The process has 52 measured variables (indexed from 1 to 52), all of which are used for fault detection in this paper. A detailed process description is given in ref 33. The data set generated in ref 1 is used in this paper and can be downloaded from http://web.mit.edu/braatzgroup. The data set has both normal and faulty data. The faulty data was generated by simulating 21 faults. The data set consists of a training data set and a testing data set. The data in the training set is used to find the optimal decomposition via the PDAC method. The data in the testing set is used to test the DPCA monitoring method using the decompositions output by the agglomerative clustering and fine tuning algorithms of the PDAC method. The data in the training set consists of 500 samples of normal operation data and 480 samples per fault of faulty data (a total of 10080 faulty samples). The data in the testing set consists of 960 samples of normal operation data and 800 samples per fault of faulty data. The process flow sheet for the TEP, the list of measured variables and the list of faults are provided in the Supporting Information.

4.2 PDAC Input Parameters

Besides the system decomposition, certain other parameters of the DPCA monitoring method must be defined before it is simulated to calculate the MDR. These include the test statistic, the consensus strategy, the level of significance and the method used to find the dimension of the score space ($a_b$) of each subsystem’s PCA model. The minimum variance test is used to calculate $a_b$. In the minimum variance test, the threshold cumulative sum of variance (CUMSUM) must be defined. Unless otherwise stated, the default values of the parameters listed in Table 1 will be used.

Table 1: Default Parameter Values

| Parameter | Default Value |
|---|---|
| α | 0.01 |
| CUMSUM | 85% of the total variance |
| Test Statistic | $T^2$ |
| Consensus Strategy | Voting based Decision Fusion with $y_{threshold} = 1$ |

4.3 Comparison Metrics

A comparison metric is the percentage by which the performance metric (PM) of a particular decomposition with B subsystems is better (lower) than the average performance metric of a decomposition with B subsystems. The average performance metric of a decomposition with B subsystems is calculated by randomly generating 50 decompositions with B subsystems and taking the average of the performance metrics of the randomly generated decompositions. The value of the performance metric of a particular decomposition is denoted by $PM_j$ where $j \in \{test, train\}$ represents the data set used to test the monitoring method and find the performance metric of the decomposition. For example, to calculate $MDR_{test}$ of a decomposition, the normal data in the training data set is used to train the PCA models and the faulty data in the testing data set is used to calculate the MDR of the decomposition. Similarly, the average performance metric of a decomposition is denoted by $PM_j^{avg}$. The comparison metric is denoted by $PM_j^{scale}$ and is calculated using the equation:

$$PM_j^{scale} = \frac{PM_j^{avg} - PM_j}{PM_j^{avg}} \times 100 \qquad (8)$$
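As a small worked illustration of equation (8), the sketch below computes the comparison metric from a decomposition's performance metric and the performance metrics of a set of randomly generated decompositions. The function name and the numbers in the usage lines are illustrative assumptions, not results from the case study.

```python
def pm_scale(pm, pm_random):
    """Equation (8): percentage by which `pm` is lower than the average
    performance metric of the randomly generated decompositions."""
    pm_avg = sum(pm_random) / len(pm_random)
    return (pm_avg - pm) / pm_avg * 100.0


# Illustrative numbers (assumptions): two random decompositions with MDR 0.5.
better = pm_scale(0.4, [0.5, 0.5])  # lower than average, so positive
worse = pm_scale(0.6, [0.5, 0.5])   # higher than average, so negative
```

A positive value means the decomposition performs better than the random average and a negative value means it performs worse.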


4.4 Performance Analysis

The agglomerative clustering and fine tuning algorithms of the PDAC method are applied using the normal and faulty data in the training data set. The comparison metric $MDR_{train}^{scale}$ of the decompositions output by the agglomerative clustering and fine tuning algorithms is calculated using the objective function value of the decompositions. The plot of the comparison metric $MDR_{train}^{scale}$ vs B in Figure 9 shows that the MDRs of the decompositions output by the agglomerative clustering algorithm are significantly lower than those of randomly generated decompositions. The plot also shows that there is a significant reduction in the MDR when the decompositions with small B, output by the agglomerative clustering algorithm, are subject to fine tuning. However, fine tuning did not significantly lower the MDR of the decompositions with large B.

Figure 9: Performance of the Agglomerative Clustering and Fine Tuning Algorithms

To further test the performance of the PDAC method, it is compared with two optimization strategies. The first of these simply involves picking the decomposition with the lowest MDR out of 10000 randomly generated decompositions. Table 2 shows that PDAC outperformed this strategy and that a simple random search will not yield high quality decompositions. The second optimization strategy involves running a genetic algorithm (ref 34) for the same time (i.e. 794 seconds) as the PDAC method. PDAC outperformed the genetic algorithm since $MDR_{train}^{scale}$ for the optimal decomposition with B = 2 obtained using the genetic algorithm is 15.6 whereas the $MDR_{train}^{scale}$ for the optimal decomposition with B = 2 obtained using the agglomerative clustering algorithm with fine tuning is 17.3. Therefore, PDAC performs well in generating optimal decompositions for distributed MSPM methods.

Table 2: Comparison of PDAC with Random Search

| No. of Subsystems | Minimum $MDR_{train}^{scale}$ using PDAC | Minimum $MDR_{train}^{scale}$ using random search |
|---|---|---|
| 2 | 17.3 (with fine tuning) | 10.6 |
| 15 | 19.1 (without fine tuning) | 8.3 |
| 30 | 17.0 (without fine tuning) | 8.9 |

4.5 Robustness Analysis

To analyze the robustness of the PDAC method, the comparison metric $MDR_{test}^{scale}$ of the decompositions output by the agglomerative clustering and fine tuning algorithms is calculated by using the faulty data in the testing data set to simulate the performance of the DPCA monitoring method. The plot of the comparison metric $MDR_{test}^{scale}$ vs B in Figure 10 shows that the MDRs of the decompositions output by the agglomerative clustering algorithm are significantly lower than those of randomly generated decompositions even though the MDRs are calculated using faulty data that is not used in the agglomerative clustering algorithm. The plot in Figure 10 also shows that there is a significant reduction in the MDR when the decompositions with small B, output by the agglomerative clustering algorithm, are subject to fine tuning even though the MDRs are calculated using faulty data that is not used in the fine tuning algorithm. The plot in Figure 10 thus indicates that MDR is a robust performance metric and the agglomerative clustering and fine tuning algorithms generate decompositions whose monitoring performance is robust to a change in the data set.

Figure 10: Robustness of the Agglomerative Clustering and Fine Tuning Algorithms

The DD (summed over all faults) is another performance metric that could have been used as the objective function in the PDAC method. In the calculation of the DD for a fault, the fault will be indicated only when it is detected in four consecutive time instants to ensure that a fault is not indicated due to a false alarm. DD is used as the objective function in the agglomerative clustering algorithm and the consensus metrics $DD_{train}^{scale}$ and $DD_{test}^{scale}$ are calculated for the resulting output decompositions. The consensus metrics $DD_{train}^{scale}$ and $DD_{test}^{scale}$ are also calculated for the decompositions output by the agglomerative clustering algorithm when MDR was used as the objective function and the results are plotted in Figure 11.
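The four-consecutive-detections rule can be sketched as below. The text does not state whether the delay is counted to the first or the last sample of the run, so this sketch assumes the fault is indicated at the fourth consecutive detection and reports that sample's index, counted from fault onset. The function name and the example sequences are illustrative.

```python
def detection_delay(decisions, run_length=4):
    """Index (counted from fault onset) of the sample at which the fault is
    indicated, i.e. the last sample of the first run of `run_length`
    consecutive detections. Returns None if the fault is never indicated."""
    run = 0
    for k, detected in enumerate(decisions):
        run = run + 1 if detected else 0  # an undetected sample resets the run
        if run == run_length:
            return k
    return None


# Illustrative 0/1 system-level fault decisions after fault onset (assumptions).
delay = detection_delay([0, 1, 1, 0, 1, 1, 1, 1, 0])
never = detection_delay([1, 1, 1, 0, 1, 1])
```

Because a single undetected sample resets the run, a few noisy decisions can shift the delay by many samples.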

The plots in Figure 11 indicate that the DD is not a robust performance metric. For the training data set, the DD performance of the decompositions output by the agglomerative clustering algorithm when the DD is used as the objective function is far better than average since DD is calculated and optimized in the PDAC method using the same training data set. However, for the testing data set, the DD performance of a decomposition output by the agglomerative clustering algorithm when the DD is used as the objective function can be significantly worse than average (since $DD_{test}^{scale}$ is negative for some of the decompositions), thus showing that noise in the data has a large effect on the DD. In fact, for the testing data set, the DD of the decompositions output by the agglomerative clustering algorithm when the MDR is used as the objective function tends to be lower than the DD of the decompositions output by the agglomerative clustering algorithm when the DD is used as the objective function. The use of MDR as the objective function in the PDAC method is thus justified.

(a) Detection Delay Consensus Metric Calculated Using Training Data

(b) Detection Delay Consensus Metric Calculated Using Testing Data

Figure 11: Robustness Analysis of the Detection Delay

Remark 9: The optimal decompositions output by the agglomerative clustering algorithm will also depend on parameters of the DPCA monitoring method such as the test statistic, the consensus strategy employed, CUMSUM, α and the faulty data set used to compute the objective function. This is illustrated by changing one of the parameters in the DPCA simulation from its default value to some other value. The agglomerative clustering algorithm is then applied to find the optimal decompositions and the comparison metric $MDR^{scale}_{test}$ of the decompositions is calculated by simulating the DPCA monitoring method with the changed parameter. The comparison metric $MDR^{scale}_{test}$ of the decompositions output by the agglomerative clustering algorithm, when the DPCA parameters were set equal to the default values, is also calculated by simulating the DPCA monitoring method with the changed parameter. $MDR^{scale}_{test}$ is averaged over the decompositions with B ∈ [2..(m − 1)] subsystems output by the agglomerative clustering algorithm. The results are shown in Table 3. It is also worth noting that the MDRs of decompositions generated using the default parameters were still significantly better (lower) than average.

Table 3: Dependence of the PDAC method on the DPCA parameters

| Change in Default Parameter Set | Average $MDR^{scale}_{test}$ for decompositions output by PDAC with the parameter updated | Average $MDR^{scale}_{test}$ for decompositions output by PDAC using default parameters |
|---|---|---|
| $T^2$ and Q statistics used together | 43.8 | 7.8 |
| Voting based decision fusion with $y_{threshold} = 2$ | 10.4 | 6.3 |
| CUMSUM = 0.95 | 15.1 | 11.0 |
| α = 0.005 | 10.7 | 10.6 |
| Faulty data in the testing set used for objective function evaluation | 17.5 | 12.7 |

4.6 Decomposition Selection

The results of the decomposition selection procedure are presented in this subsection. The threshold false alarm rate ($FAR_{threshold}$) is set equal to 0.01. Besides α, all other parameters of the DPCA simulation are set equal to the default values defined in Table 1. After setting α = $FAR_{threshold}$ = 0.01 in the DPCA simulation, the agglomerative clustering algorithm is applied. The MDRs of the decompositions output by the agglomerative clustering algorithm are found after the value of α for each decomposition is set, in the DPCA simulation, using equation (7). For example, the value of α for the decomposition with 2 subsystems is set equal to $1 - (1 - 0.01)^{0.5} \approx 0.005$. The MDR of the decomposition with 2 subsystems is calculated by simulating the DPCA method using α = 0.005 to set the detection thresholds and the faulty data in the training data set to calculate the MDR. The MDRs of the decompositions output by the agglomerative clustering algorithm are plotted in Figure 12.
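The two-subsystem example above is consistent with a Šidák-type correction of the form α = 1 − (1 − FAR_threshold)^(1/B). Equation (7) is not reproduced in this section, so the sketch below assumes that form; the function name `local_alpha` is hypothetical and is used only for illustration.

```python
def local_alpha(far_threshold: float, n_subsystems: int) -> float:
    """Per-subsystem significance level under an assumed Sidak-type correction.

    Assumption: equation (7) has the form 1 - (1 - FAR_threshold)**(1/B),
    which reproduces the two-subsystem example given in the text.
    """
    if not 0.0 < far_threshold < 1.0:
        raise ValueError("far_threshold must lie strictly between 0 and 1")
    if n_subsystems < 1:
        raise ValueError("n_subsystems must be a positive integer")
    return 1.0 - (1.0 - far_threshold) ** (1.0 / n_subsystems)


# Two subsystems with FAR_threshold = 0.01, as in the worked example.
alpha_two = local_alpha(0.01, 2)
print(round(alpha_two, 3))  # 0.005
```

Under this assumed form, α decreases as the number of subsystems grows, so decompositions with more subsystems use more conservative local detection thresholds.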

The decomposition with 13 subsystems is found to have the lowest MDR. The MDRs of decompositions with 8, 9, 10 and 12 subsystems were within 1% of the MDR of the decomposition with 13 subsystems. The decompositions with 8, 9, 10, 12 and 13 subsystems are thus subject to fine tuning with the value of α set based on the number of subsystems in the decomposition. The decomposition generated when the decomposition with 9 subsystems is fine tuned, is found to have the lowest MDR and is thus selected. The selected decomposition is given in Table 4. The computation time of the PDAC method is 2569 seconds. The agglomerative clustering algorithm required a computation time of 646 seconds and fine tuning of the five decompositions required a computation time of 1899 seconds.

Table 4: Allocation of measured variables in the selected decomposition

| Subsystem Index | Measured Variables in the System |
|---|---|
| 1 | 1, 2, 8, 12, 14, 15, 17, 23, 25, 26, 28, 32, 35, 36, 39, 40, 41, 42, 43, 45, 48, 49, 52 |
| 2 | 4, 22, 29, 30, 31, 34, 37, 44 |
| 3 | 7, 21, 50 |
| 4 | 5, 13 |
| 5 | 6, 33, 46 |
| 6 | 11, 20, 27 |
| 7 | 16, 19, 38 |
| 8 | 9, 51 |
| 9 | 3, 10, 18, 24, 47 |
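As a quick consistency check, the allocation in Table 4 can be transcribed into a mapping and tested for being a partition of the 52 measured variables. The sketch below is illustrative only; the names `SELECTED_DECOMPOSITION`, `is_partition` and `subsystem_of` are hypothetical and do not come from the paper.

```python
# Table 4 transcribed as a mapping: subsystem index -> measured variable indices.
SELECTED_DECOMPOSITION = {
    1: [1, 2, 8, 12, 14, 15, 17, 23, 25, 26, 28, 32, 35, 36,
        39, 40, 41, 42, 43, 45, 48, 49, 52],
    2: [4, 22, 29, 30, 31, 34, 37, 44],
    3: [7, 21, 50],
    4: [5, 13],
    5: [6, 33, 46],
    6: [11, 20, 27],
    7: [16, 19, 38],
    8: [9, 51],
    9: [3, 10, 18, 24, 47],
}


def is_partition(decomposition, n_variables):
    """Return True if every variable 1..n_variables is in exactly one subsystem."""
    assigned = [v for variables in decomposition.values() for v in variables]
    return sorted(assigned) == list(range(1, n_variables + 1))


def subsystem_of(decomposition, variable):
    """Return the index of the subsystem that contains the given variable."""
    for index, variables in decomposition.items():
        if variable in variables:
            return index
    raise KeyError(f"variable {variable} is not allocated to any subsystem")


print(is_partition(SELECTED_DECOMPOSITION, 52))  # True
print(subsystem_of(SELECTED_DECOMPOSITION, 9))   # 8
print(subsystem_of(SELECTED_DECOMPOSITION, 51))  # 8
```

The check confirms that the nine subsystems do not overlap and together cover all 52 measured variables, and that variables 9 and 51 share subsystem 8.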

Figure 12 also shows that neither the completely decentralized decomposition (corresponding to univariate statistical process monitoring (USPM)) nor the completely centralized decomposition (corresponding to PCA) performs optimally. This indicates that, in the selection of the number of subsystems in the decomposition, there is a trade-off between two options. A decomposition with a small number of subsystems requires less conservative detection thresholds (i.e. higher α) to meet the constraint on the FAR. A decomposition with a larger number of subsystems has a smaller number of variables per subsystem, so there is a smaller chance that the detection of a fault affecting a small number of variables will be suppressed by the variables that are unaffected by the fault. We are essentially selecting the decomposition which achieves the best balance between these two factors.

Figure 12: Decomposition Selection

The decomposition selected using the PDAC method is tested by simulating the DPCA method using the testing data set. The monitoring performance of the DPCA method using the selected decomposition is compared with the monitoring performance of DPCA with a single subsystem (i.e. PCA) and the monitoring performance of the DPCA method using the decomposition generated by the performance driven decomposition method (FBPCA) proposed in ref 5. Besides α, the parameters of the DPCA simulation are set equal to the default values specified in Section 4.2 and Table 1. If the value of α in the DPCA simulations is set using equation (7), then the FAR and MDR generated by simulating the DPCA monitoring method using the three decompositions are as shown in Table 5. In all three cases the FAR, calculated using the normal data in the testing data set, significantly exceeds the threshold value of 0.01. This is because dynamics were introduced in 260 out of the 960 samples of the normal data in the testing data set, whereas the normal data in the training data set comprised only steady state data (ref 5). In all three cases the FAR, calculated using the steady state normal data of the testing data set, is within the threshold value. The MDR generated when the decomposition selected using the PDAC method is used in the DPCA monitoring method is the lowest of the three cases.

Table 5: Comparison of PCA, FBPCA and PDAC

| Method | MDR | FAR calculated using the normal data in the testing data set | FAR calculated using the steady state normal data in the testing data set |
|---|---|---|---|
| PCA | 0.4118 | 0.0219 | 0.0057 |
| FBPCA | 0.3120 | 0.0292 | 0.0071 |
| PDAC | 0.2774 | 0.0708 | 0.0086 |

For a fair comparison of the monitoring performance of the DPCA method using the three decompositions, the FAR must be equal in all three cases. The value of α is thus tuned such that it is the minimum value for which the DPCA monitoring method generates a FAR (calculated using all 960 samples of the normal data in the testing data set) of 0.0104 (i.e. 10 samples out of 960). Once the value of α is tuned in each of the three cases, the MDR and DD are calculated by simulating the DPCA monitoring method using the tuned values of α. In the calculation of the DD of a fault, the fault is indicated only when it is detected in three consecutive time instants. The results are presented in Table 6.


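Table 6: Comparison of MDR and DD of PCA, FBPCA and PDAC when FAR = 0.0104

| Fault Index | PCA MDR | PCA DD | FBPCA MDR | FBPCA DD | PDAC MDR | PDAC DD |
|---|---|---|---|---|---|---|
| 1 | 0.0063 | 9 | 0.0063 | 8 | 0.0025 | 5 |
| 2 | 0.0175 | 17 | 0.0163 | 17 | 0.0150 | 15 |
| 3 | 0.9900 | - | 0.9738 | 85 | 0.9625 | 53 |
| 4 | 0.7175 | 77 | 0 | 3 | 0 | 3 |
| 5 | 0.7550 | 3 | 0.7363 | 3 | 0.7350 | 3 |
| 6 | 0.0100 | 11 | 0 | 3 | 0 | 3 |
| 7 | 0 | 3 | 0 | 3 | 0 | 3 |
| 8 | 0.0275 | 25 | 0.0263 | 24 | 0.0225 | 22 |
| 9 | 0.9888 | 8 | 0.9775 | 6 | 0.9788 | 3 |
| 10 | 0.6375 | 87 | 0.5263 | 76 | 0.4400 | 37 |
| 11 | 0.5263 | 8 | 0.3238 | 8 | 0.1863 | 8 |
| 12 | 0.0150 | 9 | 0.0125 | 5 | 0.0100 | 5 |
| 13 | 0.0500 | 41 | 0.0475 | 39 | 0.0513 | 44 |
| 14 | 0.0013 | 3 | 0 | 3 | 0 | 3 |
| 15 | 0.9775 | - | 0.9313 | 575 | 0.9075 | 575 |
| 16 | 0.8175 | 199 | 0.6325 | 34 | 0.5163 | 32 |
| 17 | 0.2200 | 31 | 0.0938 | 25 | 0.0613 | 24 |
| 18 | 0.1100 | 94 | 0.1075 | 90 | 0.1038 | 87 |
| 19 | 0.9563 | - | 0.8638 | 187 | 0.8613 | 187 |
| 20 | 0.6763 | 89 | 0.4700 | 83 | 0.4363 | 77 |
| 21 | 0.6413 | 512 | 0.5150 | 270 | 0.4863 | 245 |
| Average | 0.4353 | - | 0.3457 | 74 | 0.3227 | 68 |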
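The averages in the last row of Table 6 can be checked from the per-fault entries. The sketch below transcribes the table and recomputes the means; the helper `average` and the use of `None` for faults that PCA never indicates (faults 3, 15 and 19) are conventions chosen for this illustration, not part of the paper.

```python
# Per-fault values transcribed from Table 6 (faults 1 to 21, FAR = 0.0104).
MDR = {
    "PCA": [0.0063, 0.0175, 0.9900, 0.7175, 0.7550, 0.0100, 0.0,
            0.0275, 0.9888, 0.6375, 0.5263, 0.0150, 0.0500, 0.0013,
            0.9775, 0.8175, 0.2200, 0.1100, 0.9563, 0.6763, 0.6413],
    "FBPCA": [0.0063, 0.0163, 0.9738, 0.0, 0.7363, 0.0, 0.0,
              0.0263, 0.9775, 0.5263, 0.3238, 0.0125, 0.0475, 0.0,
              0.9313, 0.6325, 0.0938, 0.1075, 0.8638, 0.4700, 0.5150],
    "PDAC": [0.0025, 0.0150, 0.9625, 0.0, 0.7350, 0.0, 0.0,
             0.0225, 0.9788, 0.4400, 0.1863, 0.0100, 0.0513, 0.0,
             0.9075, 0.5163, 0.0613, 0.1038, 0.8613, 0.4363, 0.4863],
}
# None marks a fault that is never indicated by the method.
DD = {
    "PCA": [9, 17, None, 77, 3, 11, 3, 25, 8, 87, 8,
            9, 41, 3, None, 199, 31, 94, None, 89, 512],
    "FBPCA": [8, 17, 85, 3, 3, 3, 3, 24, 6, 76, 8,
              5, 39, 3, 575, 34, 25, 90, 187, 83, 270],
    "PDAC": [5, 15, 53, 3, 3, 3, 3, 22, 3, 37, 8,
             5, 44, 3, 575, 32, 24, 87, 187, 77, 245],
}


def average(values):
    """Arithmetic mean, or None if any entry is missing."""
    if any(v is None for v in values):
        return None
    return sum(values) / len(values)


for method in ("PCA", "FBPCA", "PDAC"):
    mean_dd = average(DD[method])
    print(method,
          round(average(MDR[method]), 4),
          None if mean_dd is None else round(mean_dd))
```

With this transcription the recomputed averages agree with the reported ones: 0.4353 for PCA (no average DD, since three faults are never indicated), 0.3457 and 74 for FBPCA, and 0.3227 and 68 for PDAC.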

PCA performs poorly compared to DPCA when DPCA uses the FBPCA and PDAC decompositions. If indication that a fault has occurred requires its detection over three consecutive time instants, then PCA is not able to indicate faults 3, 15 and 19, whereas DPCA, with the aforementioned decompositions, is able to indicate each of the faults. Not only is the average MDR generated by implementing the DPCA method using the FBPCA and PDAC decompositions significantly lower than that of PCA, but the MDRs of almost all the faults are also lower when the DPCA monitoring method is implemented. These observations suggest that implementing PCA in a distributed manner would significantly improve its overall monitoring performance. The decomposition used in the DPCA monitoring method also has a significant impact on its monitoring performance. The average MDR and DD when the decomposition is selected using the PDAC method are lower than when the decomposition is selected using the FBPCA method.

4.7 Fault 4

The effect of fault 4 on the allocation of the measured variables of TEP into the subsystems is analyzed in this subsection to highlight the physical significance of the decomposition selected using PDAC. There is a step change in the inlet cooling water temperature of the reactor as a consequence of fault 4. This initially causes a spike in the reactor's temperature (i.e. measured variable 9) (Figure 13(a)). A cascade controller manipulates the reactor cooling water flow rate (i.e. measured variable 51) to compensate for the effect of the fault on the reactor's temperature. This causes a step increase in the reactor cooling water flow rate (Figure 13(b)). The effect of the fault does not propagate to the other process units due to the quick action of the cascade controller. The mean and variance of the remaining 50 measured variables, therefore, do not deviate significantly from their normal values as a result of fault 4 (ref 1).

(a) Reactor Temperature Testing Data

(b) Reactor Cooling Water Flow Rate Testing Data

Figure 13: Measured Variables Affected by Fault 4

PCA is only able to detect 28% of the samples of fault 4 (Table 6). This is because all 52 measured variables contribute to the $T^2$ statistic when PCA is applied to the entire TEP. Therefore, a large detection threshold for the $T^2$ statistic has to be used to account for the variances of all 52 measured variables captured by the PCA model. However, only the reactor temperature and reactor cooling water flow rate (variables 9 and 51) are expected to have a significant contribution to the $T^2$ statistic when fault 4 occurs. Therefore, for most fault 4 samples, the statistical test used in PCA interprets the deviation of the $T^2$ statistic as being caused by normal variance, since most of the variables do not contribute significantly to it. An obvious solution to this problem would be to employ a statistical test in which the test statistic is calculated only using variables 9 and 51. This is what is done in DPCA when the decomposition selected using PDAC is used. The reactor temperature and reactor cooling water flow rate are placed in a single subsystem (i.e. subsystem 8) in the decomposition selected using PDAC. Therefore, the two variables that contribute to subsystem 8's local $T^2$ statistic are the two variables that are affected by fault 4, and the detection threshold of the local test statistic is smaller since the variance of only two measured variables needs to be taken into account. Therefore, subsystem 8's local statistical test and the DPCA monitoring method detect all fault 4 samples.
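The masking effect described above can be illustrated with a small Monte Carlo sketch. It is a deliberately simplified stand-in for the actual study: the 52 variables are assumed independent with unit variance, a plain sum of squares replaces the PCA-based $T^2$ statistic, the fault is a mean shift of 3 standard deviations on two variables, and both tests use the same α = 0.01 with empirical thresholds. None of these choices come from the paper, and all names are hypothetical.

```python
import random


def sum_of_squares(sample, indices):
    """Sum-of-squares statistic over the chosen variables (a T^2 analogue for
    independent, unit-variance variables; an assumption of this sketch)."""
    return sum(sample[i] ** 2 for i in indices)


def empirical_threshold(values, alpha):
    """Empirical (1 - alpha) quantile of the statistic on normal data."""
    ordered = sorted(values)
    return ordered[round((1.0 - alpha) * len(ordered)) - 1]


def draw(n_samples, n_variables, shifted, shift, rng):
    """Independent standard normal samples; variables in `shifted` get a mean shift."""
    return [
        [rng.gauss(shift if j in shifted else 0.0, 1.0) for j in range(n_variables)]
        for _ in range(n_samples)
    ]


rng = random.Random(0)
n_variables, alpha = 52, 0.01
affected = {8, 50}  # zero-based positions standing in for variables 9 and 51

normal = draw(5000, n_variables, set(), 0.0, rng)
faulty = draw(2000, n_variables, affected, 3.0, rng)

rates = {}
tests = {"global": list(range(n_variables)), "local": sorted(affected)}
for name, indices in tests.items():
    threshold = empirical_threshold(
        [sum_of_squares(s, indices) for s in normal], alpha
    )
    detected = sum(sum_of_squares(s, indices) > threshold for s in faulty)
    rates[name] = detected / len(faulty)

print(rates)  # detection rate of the 52-variable test vs. the 2-variable test
```

In this toy setting the two-variable test detects a much larger fraction of the faulty samples than the 52-variable test, because the contribution of the 50 unaffected variables inflates the global threshold without adding any fault signal.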

Fault 4 is a special case where only 2 variables are affected by the fault. Only subsystem 8 in the decomposition selected using PDAC is able to detect faults 4 and 11 (fault 11 has a similar effect on the measured variables of TEP as fault 4). The relation between these faults and variables 9 and 51 drives PDAC to allocate variables 9 and 51 into a single subsystem. The decomposition selected using PDAC is, therefore, physically meaningful in this sense. However, fault-variable relationships tend to be more complex for other faults affecting TEP. Other factors (such as the correlation between the measured variables of TEP) also have a significant effect on the performance of DPCA for a certain decomposition. Therefore, it is highly unlikely that a near optimal decomposition of a large complex system like TEP can be determined simply through physical intuition.


5 Conclusions

The Performance Driven Agglomerative Clustering method, proposed in this paper, generated near optimal decompositions for distributed multivariate statistical process monitoring. The method uses the proposed agglomerative clustering and fine tuning algorithms along with the proposed procedure for selecting a decomposition to not only find the optimal allocation of the measured variables into a certain number of subsystems, but also to find the optimal number of subsystems. Although the decomposition generated by PDAC will not be robust to a change in the data set because of noise in the data, the performance (i.e. MDR) of the decomposition generated by PDAC is robust to a change in the data set as has been illustrated in the case study. The PDAC method is also very flexible and can, in principle, be incorporated into most distributed MSPM methods. Although the computation time for the method (especially the fine tuning algorithm) is significant, selecting the system decomposition is a one time decision carried out off-line.

In future work, the PDAC method will be extended to account for additional factors that are important in selecting the system decomposition. Specifically, the PDAC method will be extended to include a layout constraint which prevents certain measured variables from being paired in the same subsystem. A pattern classification method can be part of a distributed data driven monitoring scheme to diagnose a fault once it is detected, and the PDAC method will be extended to account for the performance of this classification method in selecting the system decomposition. The PDAC method will also be modified to allow overlap between subsystems to further improve its performance.

Acknowledgement

Partial financial support from the Petroleum Institute, Abu Dhabi, is gratefully acknowledged.


Supporting Information Available

A general description of the agglomerative clustering and fine tuning algorithm applied to the DPCA monitoring method; process flow sheet, the list of measured variables and the list of faults of the TEP process. This information is available free of charge via the Internet at http://pubs.acs.org.

Appendix A Performance Metrics

The performance metrics which are commonly used to quantify the performance of an MSPM method for fault detection are:

1. Missed Detection Rate (MDR): This is the fraction of faulty samples that are identified as normal:

$$MDR = \frac{\text{Number of faulty samples in which the fault is not detected}}{\text{Total number of faulty samples}} \tag{9}$$

For accurate fault detection, the MDR should be low.

2. Detection delay (DD):

$$DD = \sum_{f=1}^{F} \left(\text{Time when fault } f \text{ is first indicated} - \text{Time when fault } f \text{ is introduced}\right) \tag{10}$$

The detection delay is summed over all F faults in the data set. To ensure that a fault is not indicated due to a false alarm, a fault is indicated only when a fault is consecutively detected by the monitoring system over a certain number of time instants.

3. False Alarm Rate (FAR): This is the fraction of normal samples that are identified as faulty:

$$FAR = \frac{\text{Number of normal samples in which a fault is detected}}{\text{Total number of normal samples}} \tag{11}$$

The monitoring scheme will be impractical if the FAR is too high.
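The three metrics can be sketched in a few lines of Python operating on per-sample detection flags. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, and the detection delay uses an assumed convention in which the fault is indicated at the last sample of the first run of consecutive detections, counted from 1 at the first faulty sample.

```python
def missed_detection_rate(detected_on_faulty):
    """Fraction of faulty samples in which the fault is not detected (eq. 9)."""
    return sum(not d for d in detected_on_faulty) / len(detected_on_faulty)


def false_alarm_rate(detected_on_normal):
    """Fraction of normal samples in which a fault is detected (eq. 11)."""
    return sum(bool(d) for d in detected_on_normal) / len(detected_on_normal)


def detection_delay(detected_on_faulty, n_consecutive):
    """Samples elapsed from fault introduction until the fault is indicated.

    Assumed convention: the fault is indicated at the last sample of the first
    run of `n_consecutive` detections, counting from 1 at the first faulty
    sample. Returns None if the fault is never indicated.
    """
    run = 0
    for position, detected in enumerate(detected_on_faulty, start=1):
        run = run + 1 if detected else 0
        if run >= n_consecutive:
            return position
    return None


# Toy detection flags (illustrative only, not data from the case study).
faulty_flags = [False, True, False, True, True, True, True, False]
normal_flags = [False] * 95 + [True] * 5

print(missed_detection_rate(faulty_flags))  # 0.375
print(false_alarm_rate(normal_flags))       # 0.05
print(detection_delay(faulty_flags, 3))     # 6
```

The total DD of equation (10) would then be the sum of `detection_delay` over all faults in the data set, provided every fault is indicated.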

Appendix B Nomenclature

Table 7: Nomenclature for commonly used notations

| Notation | Description |
|---|---|
| PCA | Principal Component Analysis |
| MSPM | Multivariate Statistical Process Monitoring |
| B | Number of subsystems |
| b | Subsystem index |
| PDAC | Performance Driven Agglomerative Clustering |
| n | Number of normal samples (per measured variable) |
| $n_f$ | Number of faulty samples (per measured variable) |
| m | Number of measured variables |
| CUMSUM | Threshold for cumulative variance of loading vectors used in the selection of number of loading vectors |
| α | Level of significance |
| $x_b$ | Local measurement vector |
| $y_{threshold}$ | Minimum number of votes (that a fault has occurred) needed from the subsystems to detect a fault |
| MDR | Missed Detection Rate |
| FAR | False Alarm Rate |
| DD | Detection Delay |
| DPCA | Distributed PCA |
| PM | Performance Metric |
| $PM^{scale}_{j}$ | Comparison Metric |
| $FAR_{threshold}$ | Threshold on the false alarm rate |
| LFD | Local Fault Decision |
| $S_i$ | Set of variables in subsystem i |
| $v_j$ | Variable j |
| [a..b] | Integers between and including a and b |
| TEP | Tennessee Eastman Process |


References

(1) Chiang, L. H.; Russell, E. L.; Braatz, R. D. Fault Detection and Diagnosis in Industrial Systems; Springer: London; New York, 2001.
(2) Venkatasubramanian, V.; Rengaswamy, R.; Yin, K.; Kavuri, S. N. A Review of Process Fault Detection and Diagnosis Part I: Quantitative Model-Based Methods. Computers & Chemical Engineering 2003, 27, 293–311.
(3) Venkatasubramanian, V.; Rengaswamy, R.; Kavuri, S. N. A Review of Process Fault Detection and Diagnosis Part II: Qualitative Models and Search Strategies. Computers & Chemical Engineering 2003, 27, 313–326.
(4) Venkatasubramanian, V.; Rengaswamy, R.; Kavuri, S. N.; Yin, K. A Review of Process Fault Detection and Diagnosis Part III: Process History Based Methods. Computers & Chemical Engineering 2003, 27, 327–346.
(5) Jiang, Q.; Yan, X.; Huang, B. Performance-Driven Distributed PCA Process Monitoring Based on Fault-Relevant Variable Selection and Bayesian Inference. IEEE Transactions on Industrial Electronics 2016, 63, 377–386.
(6) Ge, Z.; Song, Z. Distributed PCA Model for Plant-Wide Process Monitoring. Industrial & Engineering Chemistry Research 2013, 52, 1947–1957.
(7) Gao, Z.; Cecati, C.; Ding, S. X. A Survey of Fault Diagnosis and Fault-Tolerant Techniques Part II: Fault Diagnosis With Knowledge-Based and Hybrid/Active Approaches. IEEE Transactions on Industrial Electronics 2015, 62, 3768–3774.
(8) Ge, Z.; Song, Z.; Gao, F. Review of Recent Research on Data-Based Process Monitoring. Industrial & Engineering Chemistry Research 2013, 52, 3543–3562.
(9) Qin, S. J. Survey on Data-Driven Industrial Process Monitoring and Diagnosis. Annual Reviews in Control 2012, 36, 220–234.
(10) Rato, T. J.; Reis, M. S. Fault Detection in the Tennessee Eastman Benchmark Process Using Dynamic Principal Components Analysis Based on Decorrelated Residuals (DPCA-DR). Chemometrics and Intelligent Laboratory Systems 2013, 125, 101–108.
(11) Zhao, C.; Gao, F. Fault-Relevant Principal Component Analysis (FPCA) Method for Multivariate Statistical Modeling and Process Monitoring. Chemometrics and Intelligent Laboratory Systems 2014, 133, 1–16.
(12) Kohonen, J.; Reinikainen, S.-P.; Aaljoki, K.; Perkiö, A.; Väänänen, T.; Høskuldsson, A. Multi-Block Methods in Multivariate Process Control. Journal of Chemometrics 2008, 22, 281–287.
(13) Ge, Z.; Zhang, M.; Song, Z. Nonlinear Process Monitoring Based on Linear Subspace and Bayesian Inference. Journal of Process Control 2010, 20, 676–688.
(14) Grbovic, M.; Li, W.; Xu, P.; Usadi, A. K.; Song, L.; Vucetic, S. Decentralized Fault Detection and Diagnosis via Sparse PCA Based Decomposition and Maximum Entropy Decision Fusion. Journal of Process Control 2012, 22, 738–750.
(15) Zhang, Y.; Zhou, H.; Qin, S. J.; Chai, T. Decentralized Fault Diagnosis of Large-Scale Processes Using Multiblock Kernel Partial Least Squares. IEEE Transactions on Industrial Informatics 2010, 6, 3–10.
(16) Qin, S. J.; Valle, S.; Piovoso, M. J. On Unifying Multiblock Analysis with Application to Decentralized Process Monitoring. Journal of Chemometrics 2001, 15, 715–742.
(17) Liu, Q.; Qin, S. J.; Chai, T. Multiblock Concurrent PLS for Decentralized Monitoring of Continuous Annealing Processes. IEEE Transactions on Industrial Electronics 2014, 61, 6429–6437.
(18) Tong, C.; Song, Y.; Yan, X. Distributed Statistical Process Monitoring Based on Four-Subspace Construction and Bayesian Inference. Industrial & Engineering Chemistry Research 2013, 52, 9897–9907.
(19) Zhu, J.; Ge, Z.; Song, Z. Distributed Parallel PCA for Modeling and Monitoring of Large-scale Plant-Wide Processes with Big Data. IEEE Transactions on Industrial Informatics 2017,
(20) Jiang, Q.; Wang, B.; Yan, X. Multiblock Independent Component Analysis Integrated with Hellinger Distance and Bayesian Inference for Non-Gaussian Plant-Wide Process Monitoring. Industrial & Engineering Chemistry Research 2015, 54, 2497–2508.
(21) Huang, J.; Yan, X. Angle-Based Multiblock Independent Component Analysis Method with a New Block Dissimilarity Statistic for Non-Gaussian Process Monitoring. Industrial & Engineering Chemistry Research 2016, 55, 4997–5005.
(22) Smilde, A. K.; Westerhuis, J. A.; Boque, R. Multiway Multiblock Component and Covariates Regression Models. Journal of Chemometrics 2000, 14, 301–331.
(23) Huang, J.; Yan, X. Double-Step Block Division Plant-Wide Fault Detection and Diagnosis Based on Variable Distributions and Relevant Features. Journal of Chemometrics 2015, 29, 587–605.
(24) Jiang, Q.; Yan, X. Nonlinear Plant-Wide Process Monitoring Using MI-Spectral Clustering and Bayesian Inference-Based Multiblock KPCA. Journal of Process Control 2015, 32, 38–50.
(25) Ward Jr, J. H. Hierarchical Grouping to Optimize an Objective Function. Journal of the American Statistical Association 1963, 58, 236–244.
(26) Murtagh, F.; Contreras, P. Algorithms for Hierarchical Clustering: An Overview. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery 2012, 2, 86–97.
(27) Ghosh, K.; Ramteke, M.; Srinivasan, R. Optimal Variable Selection for Effective Statistical Process Monitoring. Computers & Chemical Engineering 2014, 60, 260–276.
(28) Fortunato, S. Community Detection in Graphs. Physics Reports 2010, 486, 75–174.
(29) Kernighan, B. W.; Lin, S. An Efficient Heuristic Procedure for Partitioning Graphs. The Bell System Technical Journal 1970, 49, 291–307.
(30) Newman, M. E. Modularity and Community Structure in Networks. Proceedings of the National Academy of Sciences 2006, 103, 8577–8582.
(31) Abdi, H. The Bonferroni and Šidák Corrections for Multiple Comparisons. In Encyclopedia of Measurement and Statistics; Sage: Thousand Oaks (CA), 2007.
(32) Cheng, S.; Pecht, M. Using Cross-Validation for Model Parameter Selection of Sequential Probability Ratio Test. Expert Systems with Applications 2012, 39, 8467–8473.
(33) Downs, J. J.; Vogel, E. F. A Plant-Wide Industrial Process Control Problem. Computers & Chemical Engineering 1993, 17, 245–255.
(34) Chipperfield, A.; Fleming, P. The MATLAB Genetic Algorithm Toolbox. In IEE Colloquium on Applied Control Techniques Using MATLAB; 1995.


TOC Graphic


Figure 1: Schematic of Distributed Multivariate Statistical Process Monitoring


Figure 2: Graphical Description of the First Iteration Calculations


Figure 3: Graphical Description of the Second Iteration Calculations

Figure 4: MDR Calculation for the Initial Guess


Figure 5: MDR Calculation for Some of the Candidate Decompositions in the First Iteration

Figure 6: Best Candidate Decompositions in the First Iteration


Figure 7: MDR Calculation for Some of the Candidate Decompositions in the Second Iteration

Figure 8: Schematic Flow Sheet of the PDAC Method


Figure 9: Performance of the Agglomerative Clustering and Fine Tuning Algorithms
