Article Cite This: Ind. Eng. Chem. Res. XXXX, XXX, XXX−XXX

pubs.acs.org/IECR

Evaluation of a Hybrid Clustering Approach for a Benchmark Industrial System

Cristiano Hora Fontes*,† and Hector Budman‡

† Graduate Program in Industrial Engineering, Polytechnic School, Federal University of Bahia, Salvador, Bahia 40170-115, Brazil
‡ Department of Chemical Engineering, University of Waterloo (Canada), Waterloo, Ontario N2L 3G1, Canada




ABSTRACT: This paper discusses a novel algorithm for classifying data represented as multivariate time series based on similarity metrics. To improve on the performance of existing classification methods based on a single similarity metric, the method used in this study combines the principal component analysis similarity factor and the average-based Euclidean distance within a fuzzy clustering approach. Additionally, an approach is proposed to cope with the changes of these metrics over the time window, improving the similarity analysis between the objects. The method is applied to the Tennessee Eastman process, a well-known benchmark industrial system used to compare various fault detection and diagnosis approaches. The results were compared with standard multivariate techniques, showing the efficiency and flexibility of the proposed method in fault detection and classification problems when considering different types of failures, process variables, and changes in operating conditions.



1. INTRODUCTION

The availability of large sets of data collected from industrial processes motivates the use of Data Mining (DM) approaches for extracting knowledge from these data.1 Time series are widely used in the fields of process engineering, medicine, and finance, among others,2,3 and represent an important class of data objects for process monitoring. Pattern recognition in univariate time series has been investigated4,5 using standard approaches such as dimensionality reduction based methods.6−9 In contrast, pattern recognition from multivariate time series (MTS) represents a more complex problem (nonpoint prototyping problem) with intrinsic features such as similarity and cluster validity.10,11 The adoption of dimensionality reduction approaches for MTS that do not explicitly account for time may lead to a loss of information, reducing the ability to detect joint features or hidden information behind the time profiles as a whole.12,11,13,2 Some studies on MTS are based on the measurement of similarity between two MTS12,13 within a fuzzy clustering approach.14,15 Other works highlight the use of PCA-based similarity metrics (SPCA) (and modified versions) in pattern recognition of MTS and present case studies associated with nonlinear systems.16,11,17,18,12,19,20 In general, results provided by clustering and classification methods involving MTS are highly dependent on the similarity/dissimilarity metrics used for classification.2

Fault detection and diagnosis (FDD) in industrial plants can increase production efficiency by providing early detection of an undesirable operating condition, thus enabling corrective actions. Techniques comprising clustering and pattern recognition methods have been widely used in FDD problems.21−23,16 On the other hand, since industrial plants exhibit highly nonlinear behavior and are often operated at different times around different steady states, FDD algorithms should take this information into account. In these cases, a single similarity metric is usually limited in its ability and flexibility to compare MTS and recognize clusters and patterns. Some studies have proposed the use of hybrid approaches combining the SPCA with another similarity metric.10,3 Singhal and Seborg10 present a hybrid approach for clustering MTS applying a combination of the SPCA and another similarity metric based on the Mahalanobis distance. The hybrid-based technique of Singhal and Seborg10 was specifically tailored for pattern matching to detect deterministic faults (steps, ramps, sines), and as such, it is not evident how to select a particular pattern that is representative of a stochastic fault. The fuzzy clustering approach selected in the current work can deal with both stochastic and deterministic faults. Furthermore, the intrinsic dynamic behavior of SPCA suggests that the level of similarity between two MTS should consider the history of a given value of this metric (high or low) over time. To account for the time history, the time integral, that is, the area under the time profile of SPCA, is proposed in this work as a new metric to compare two MTS. Izakian et al.3 propose the use of a hybrid approach combining two fuzzy clustering techniques (Fuzzy C-Means and Fuzzy C-Medoids) which apply the dynamic time warping (DTW) and Euclidean distance separately. However, the work of Izakian et al.3 is only applicable to univariate time series, and thus the extension to MTS is not evident.

This paper presents a novel method for clustering MTS based on a combination of two similarity metrics, namely, the modified PCA similarity factor16 and the average-based Euclidean distance (AED), which are combined for classification within a fuzzy clustering approach. The proposed hybrid metric compares different MTS based on a weighted sum of these two metrics, that is, the direction of the principal components of the time series and the averages calculated from the MTS. Additionally, this work shows through a tailored example the need to consider the dynamic behavior of the SPCA over the time window, rather than obtaining a single value of similarity between two MTS based on the whole window. To this end, this paper proposes that both metrics (SPCA and AED) be considered as functions of time, and their respective time integrals are used to evaluate the similarity between two MTS. The tailored examples and results show that this approach is capable of recognizing hidden similarities among objects (MTS) that are not observable through a static analysis which considers the time series as a whole.

The proposed method is applied in a case study comprising the clustering and pattern recognition of normal and abnormal (failure) operation for the benchmark Tennessee Eastman (TE) process,24 widely used for the assessment of FDD techniques. Various process monitoring approaches have been applied to the TE process involving different experimental conditions and different closed-loop strategies.16,25−27 The results (percentage of misclassifications) obtained by the proposed method are compared to two well-known monitoring statistics, namely, the Q statistic (squared prediction error, SPE) and Hotelling's T2 statistic,10 used for detecting faults.

This paper is structured as follows. In section 2, the hybrid metric and a formulation for the clustering problem are presented. Section 3 applies the hybrid metric to simple examples to elucidate the relevance of the hybrid method, especially in an FDD problem, and also proposes the time-based global metrics (SPCAG and AEDG). Section 4 presents the results of fault detection involving three different cases of faults occurring in the Tennessee Eastman (TE) process.

Received: January 26, 2018. Revised: May 8, 2018. Accepted: July 21, 2018. Published: July 22, 2018. Industrial & Engineering Chemistry Research. DOI: 10.1021/acs.iecr.8b00429. © XXXX American Chemical Society

2. HYBRID CLUSTERING APPROACH

2.1. Preliminaries. Let us consider a general series of observations over time xj(t) (j = 1, ···, p; t = 1, ···, m), where p is the number of process variables, m is the number of observations (window length), and t indexes the measurements made at each time instant. An MTS is the case in which each object is composed of two or more time series (p ≥ 2) and can be represented by the following m × p matrix:

$$X_i = \begin{bmatrix} x_{i1}(1) & \cdots & x_{ip}(1) \\ \vdots & \ddots & \vdots \\ x_{i1}(m) & \cdots & x_{ip}(m) \end{bmatrix} \tag{1}$$

where Xi is the object and xij(t) is the measurement of variable j (j = 1, ···, p) at time instant t (t = 1, ···, m) in the object Xi (i = 1, ···, n objects). Each object is a matrix comprising p time series extracted over the same period of time (time window). Different objects refer to sets of time series extracted in different time windows (different periods of operation of the process).

The principal component analysis (PCA) similarity metric (SPCA index11,17) measures the level of similarity between two different objects (MTS represented by two m × p matrices) with the same number of variables (p), but it does not require the same number of observations (m). In this work a modified version of the SPCA index (SPCAλ) has been applied, where each principal component is weighted by the square root of its corresponding eigenvalue as done in Singhal and Seborg.16 The PCA is performed on each mean-centered time series.

$$\mathrm{SPCA}_{\lambda}(A,B) = \frac{1}{\sum_{i=1}^{k_0} \left(\lambda_i^{A} \times \lambda_i^{B}\right)} \cdot \sum_{i=1}^{k_0}\sum_{j=1}^{k_0} \left(\lambda_i^{A} \times \lambda_j^{B}\right)\cos^2\theta_{ij} \tag{2}$$

where A and B are the matrices (MTS objects) being compared and θij is the angle between the ith principal component of A and the jth principal component of B. k0 is the larger of k1 and k2, which are the numbers of principal components needed to describe a significant percentage of the variance in A and B, respectively. λA and λB are vectors of eigenvalues of AᵀA and BᵀB, respectively, and "×" indicates a scalar product. The SPCAλ is limited to the interval [0;1]. Values close to 0 indicate high dissimilarity.

2.2. The Hybrid Distance and Clustering Approach. Clustering is one of the most fundamental data mining problems and comprises the partition of data into a set of groups according to some predefined measure of similarity. One of the challenges lies in the unsupervised nature of the clustering problem: there is no a priori information to distinguish the objects from each other (nonlabeled objects).8 Since similarity metrics play a decisive role in problems involving the clustering of time series, it is also important to note that a single metric (just SPCA) may be limited in its ability to compare MTS and recognize clusters and patterns, which is the essence of extracting knowledge from the data sample.

This work presents a formulation for the clustering problem based on a hybrid approach for measuring similarity between MTS. This approach combines two similarity/dissimilarity metrics: the modified SPCA similarity factor (SPCAλ) and a metric that evaluates the distance between the averages of different MTS (average-based Euclidean distance, AED).20 The clustering problem follows the Fuzzy C-Means (FCM) formulation,28,29 suitable for clustering objects that can be represented by vectors in the space ℜ^w (w is the dimensionality of the data set). Accordingly, the fuzzy clustering problem based on the proposed combined metric is as follows:

$$\min_{U,V} J_{\varepsilon}(U,V) = \sum_{i=1}^{c}\sum_{k=1}^{n} \left\{ \alpha\, u_{ik}^{\varepsilon}\, \lVert X_k - V_i \rVert^2 + (1-\alpha)\, u_{ik}^{\varepsilon}\, \lVert \bar{X}_k - \bar{V}_i \rVert^2 \right\} \tag{3}$$

subject to

$$U \in \Re^{c\times n}:\; u_{ik}\in[0,1],\quad \sum_{k=1}^{n} u_{ik} > 0 \;\;\forall i \quad\text{and}\quad \sum_{i=1}^{c} u_{ik} = 1 \;\;\forall k \tag{4}$$

where c is the number of clusters, n is the number of objects, uik is the membership degree of the kth object to the ith cluster, U is the partition matrix (c × n matrix), and ε (ε > 1) is a fuzzification coefficient (the value commonly recommended in the literature, ε = 2, was adopted in this work). Xk (k = 1, ···, n) is an object (MTS), Vi (i = 1, ···, c) is a prototype/pattern (Xk and Vi ∈ ℜ^{m×p}), and V is the set of prototype matrices {V1, V2, ..., Vc}, each in ℜ^{m×p}. The first norm in the double summation in eq 3, ∥Xk − Vi∥, is the distance between an object and the center of a cluster based on the modified SPCA. Since values of SPCAλ close to 1 imply high similarity and the problem in eq 3 is a minimization, a complementary SPCA distance is used as follows:

$$\lVert X_k - V_i \rVert = \mathrm{SPCA}_{c}(X_k, V_i) = 1 - \mathrm{SPCA}_{\lambda}(X_k, V_i) \tag{5}$$

that is, SPCAc(Xk,Vi) is the complement of SPCAλ(Xk,Vi) (eq 2). X̄k and V̄i ∈ ℜ^p are vectors composed of the averages associated with each time series in the object and in the cluster's center, respectively, and ∥X̄k − V̄i∥ is the AED between these vectors. α (α ∈ [0,1]) is a tuning parameter. The application of the first-order necessary conditions to the problem given in eqs 3 and 4 leads to the following analytical solution for each membership degree:

$$u_{ik} = \frac{\left[\dfrac{1}{\alpha \lVert X_k - V_i \rVert^2 + (1-\alpha)\lVert \bar{X}_k - \bar{V}_i \rVert^2}\right]^{1/(\varepsilon-1)}}{\sum_{j=1}^{c}\left[\dfrac{1}{\alpha \lVert X_k - V_j \rVert^2 + (1-\alpha)\lVert \bar{X}_k - \bar{V}_j \rVert^2}\right]^{1/(\varepsilon-1)}} \tag{6}$$

The weighted average

$$\alpha \cdot \lVert X_k - X_j \rVert + (1-\alpha)\cdot \lVert \bar{X}_k - \bar{X}_j \rVert$$

is referred to as the hybrid distance between two MTS (Xk and Xj; j, k = 1, ···, n). The optimization problem in eqs 3 and 4 is solved in MATLAB using a classical second-order optimization method with the cluster centers as decision variables and the analytical expressions for the membership degrees (eq 6) included as an additional equality constraint within the optimization problem defined in eq 3. The problem presented by eqs 3 and 4 can also be considered a bicriterion constrained clustering30 which involves nonconflicting criteria (SPCA and AED metrics) capable of improving the recognition of patterns and similarities in MTS.

2.3. The Effect of Set Point Changes on AED. As discussed in section 1.1, fault and normal objects were collected at different steady states (operating modes) defined by different production rates. It is expected that the AED metric, which represents changes in the mean values of the MTS and is included as part of the hybrid distance used in eq 3, will help discern between operating levels of normal and fault objects, whereas these differences cannot be properly identified through the SPCA, which is based on mean-centered objects.

The AED-based distance between two MTS was obtained considering each time series in deviation variable (dv) form with respect to its initial value, that is, the value at the beginning of the time window. Working in deviation variables allows decoupling the changes related to faulty versus normal conditions from set point changes.

3. MOTIVATION EXAMPLES

3.1. The Relevance of the Hybrid Approach. This section briefly summarizes two simple numerical examples specifically tailored to elucidate the relevance of the hybrid method, especially in an FDD problem.20 Although they show extreme cases, these examples aim to motivate the proposed approach by showing that the PCA similarity index (SPCA) will be ineffective in particular situations unless it is combined with the AED. These two examples are also presented in Fontes and Budman20 but are repeated here for clarity. Initially, consider MTS described by straight lines such that a sample with 10 objects (6 normal objects and 4 fault objects) is available (Figure 1). Each MTS comprises two time series (two different process variables) as follows:

$$y_{1j} = k_j \cdot t \quad\text{and}\quad y_{2j} = 0.7 \cdot k_j \cdot t, \qquad j = 1, \ldots, 10 \tag{7}$$

where t is the time vector, kj is a parameter whose value is specific for each object, and y1j and y2j are the time series related to the jth object. The following rule establishes the fault status of the process:

$$\begin{cases} j\text{th object is a fault operation} & \text{if } |k_j - k_0| \geq \epsilon \;\; (\epsilon \text{ is a small threshold}) \\ j\text{th object is a normal operation} & \text{otherwise} \end{cases} \tag{8}$$

where k0 determines the boundary between the normal and faulty situations. This example was specifically tailored to illustrate a case in which the SPCA similarity metric is not applicable because there is no difference between the directions of the principal components of different objects. The different objects have a complementary SPCA distance (SPCAc) equal to zero since the ratio y2j/y1j is constant. Therefore, in this case the AED is the only way to differentiate between the classes. Figure 1 presents the resulting patterns (centers) of each cluster obtained by applying the hybrid metric (eq 3) with α equal to zero, the only setting able to cluster the objects correctly (0% misclassification) and recognize the respective patterns.
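The behavior of the two metrics on the straight-line example of section 3.1 can be checked numerically. The sketch below evaluates SPCAλ (eq 2), the AED, and the hybrid distance for two such objects. It is a minimal illustration and not the authors' MATLAB implementation: the 95% variance threshold used to choose k0, the slopes 1.0 and 2.0, the time grid, and all function names are assumptions made here, and the AED is computed on deviation variables as described in section 2.3.

```python
import numpy as np

def pca_eig(X, var_frac=0.95):
    """Eigenvalues and eigenvectors of Xc^T Xc (Xc = mean-centered X), plus the
    number of components needed to reach var_frac of the variance (assumed rule)."""
    Xc = X - X.mean(axis=0)
    lam, vec = np.linalg.eigh(Xc.T @ Xc)            # eigh returns ascending order
    lam, vec = np.clip(lam[::-1], 0.0, None), vec[:, ::-1]
    k = int(np.searchsorted(np.cumsum(lam) / lam.sum(), var_frac) + 1)
    return lam, vec, k

def spca_lambda(A, B, var_frac=0.95):
    """Eigenvalue-weighted PCA similarity factor, following eq 2."""
    lam_a, U, k1 = pca_eig(A, var_frac)
    lam_b, W, k2 = pca_eig(B, var_frac)
    k0 = max(k1, k2)
    cos2 = (U[:, :k0].T @ W[:, :k0]) ** 2           # cos^2(theta_ij)
    num = np.sum(np.outer(lam_a[:k0], lam_b[:k0]) * cos2)
    den = np.sum(lam_a[:k0] * lam_b[:k0])
    return num / den

def aed(A, B):
    """Euclidean distance between column averages, in deviation-variable form."""
    mean_a = (A - A[0]).mean(axis=0)
    mean_b = (B - B[0]).mean(axis=0)
    return np.linalg.norm(mean_a - mean_b)

def hybrid_distance(A, B, alpha):
    """alpha * SPCAc + (1 - alpha) * AED."""
    return alpha * (1.0 - spca_lambda(A, B)) + (1.0 - alpha) * aed(A, B)

t = np.linspace(0.0, 5.0, 51)                       # hypothetical time grid

def line_object(k):
    """One object of eq 7: y1 = k*t and y2 = 0.7*k*t."""
    return np.column_stack([k * t, 0.7 * k * t])

normal, fault = line_object(1.0), line_object(2.0)  # hypothetical slopes
spca_c = 1.0 - spca_lambda(normal, fault)           # numerically zero
aed_value = aed(normal, fault)                      # clearly nonzero
```

Because both objects share the same principal direction, `spca_c` vanishes up to rounding, while `aed_value` separates the two slopes, so only `hybrid_distance(normal, fault, alpha=0.0)` carries information in this case.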

Figure 1. Patterns recognized and data sample (four fault objects and six normal objects, each one consisting of one time series associated with y1 and another time series associated with y2).


On the other hand, the opposite extreme illustrates a situation in which the AED metric alone cannot be used to recognize differences between objects. Consider a sample of 12 objects, each one comprising an MTS consisting of two sigmoid functions as follows:

$$y_{1j} = \frac{1}{1 + e^{-\beta_j (t - t_0)}} \quad\text{and}\quad y_{2j} = \frac{1}{1 + e^{-\rho\,\beta_j (t - t_0)}}, \qquad j = 1, \ldots, 12 \tag{9}$$

where t is the time vector, t0 is the time instant of the inflection point, βj is a parameter whose value is specific for each object, and y1j and y2j are the time series related to the jth object. The following rule establishes the fault condition of the process:

$$\begin{cases} j\text{th object is a fault operation} & \text{if } |\beta_j - \beta_0| \geq \epsilon \text{ and } \rho = \rho_0 = 0.7 \\ j\text{th object is a normal operation} & \text{if } |\beta_j - \beta_0| < \epsilon \text{ and } \rho = \rho_1 = 0.5 \end{cases} \tag{10}$$

β0, ρ0, and ρ1 are the fault parameters of the process. In this case (Figure 2) only the SPCA similarity metric can be used to cluster the objects (also with 0% misclassification) and recognize their patterns.

Figure 2. Patterns recognized and data sample (five fault objects and seven normal objects, each one consisting of one time series associated with y1 and another time series associated with y2).

The time series shown in Figures 1 and 2 have the same time window length, and each pair of series associated with the same object represents a given process operation period related to a failure (fault) or nonfailure (nonfault) event.

3.2. Dynamic Feature of the SPCA: Time-Based Global SPCA (SPCAG). The intrinsic dynamic behavior of the time series can also produce significant variations of the SPCA over the time window considered, such that a single measurement of similarity between two objects (MTS) using SPCA cannot be used to recognize hidden patterns of behavior. This section shows the relevance of considering the dynamic behavior of a similarity metric (SPCA) when analyzing the level of similarity/dissimilarity between time series in order to recognize different patterns of process behavior. This analysis should be considered in applications involving time series limited to a time domain (whole time series clustering9) as well as in pattern matching methods that compare historical data with some predefined pattern (snapshot data16).

For example, Figure 3a shows two hypothetical MTS objects, each comprising two time series collected over the same time period and associated with the process variables z1 and z2, respectively. Consider that each object (MTS) is associated with a specific period of a given process and related to a fault or normal operation. Figure 3b presents the measure of similarity (SPCAc, eq 5) between the objects over time, suggesting, in this case, the need to consider the initial instants of the time window in order to recognize the differences between the two objects. Although easily visible, the difference between fault and normal operation would not be detected or recognized if only the final value of SPCAc (0.0064 at 5 min, whole time window) were available.

Figure 3. Two objects (MTS) (fault and normal) (a) and similarity PCA index (SPCAc) over the time window (time window and sampling period equal to 5 and 0.1 units of time, respectively) (b).

The intrinsic dynamic behavior of SPCAc suggests that the level of similarity between two MTS should consider the history of a given value of this metric (high or low) over time. A high dissimilarity between two objects would correspond to SPCAc values close to unity over the entire time window. To account for the time history, the time integral, that is, the area under the time profile of SPCAc (SPCAc(t)), is proposed in this work as a new metric to compare two MTS. So, considering that SPCAc(Xk,Vi) = SPCAc(t,Xk,Vi), the time-based global SPCA (SPCAG) is defined by

$$\lVert X_k - V_i \rVert = \mathrm{SPCA}_{G}(X_k, V_i) = \frac{\int_0^{t_f} \mathrm{SPCA}_{c}(t, X_k, V_i)\, dt}{t_f} \tag{11}$$

where tf is the length of the time window. SPCAG is simply an average in time, where the normalization by the duration of the time window tf permits comparing objects (MTS) with different time windows.

Extending the case presented in Figure 3a, consider a sample with eight objects prelabeled as fault and another eight prelabeled as normal whose differences are also easily detectable (Figure 4).

Figure 4. Data sample (16 time series/8 fault objects and 16 time series/8 normal objects).

The box-plots (Figure 5a,b) associated with this sample are defined by the mean and variance of the similarity metric calculated for three cases: (i) between fault objects (1 and 1), (ii) between objects from the entire sample (1 and 2), and (iii) between normal objects (2 and 2), respectively. The closer the value of the metric is to unity, the greater is the dissimilarity among objects. Figure 5 panels a and b show the results obtained using SPCAc (whole time window) and SPCAG (eq 11) in the measure of similarity of each pair of objects, respectively. The box related to the distribution of similarities among the objects of the whole sample shows that the recognition of differences between failure and normal patterns in this example would not be feasible if the similarity were evaluated only at the end of the time series domain (SPCAc).

Figure 5. Similarities distribution within and between labeled objects (normal and fault) using SPCAG (a) and SPCAc (b).

The proposal for a time-based global SPCA (SPCAG) instead of the classical measure (SPCAc) is based on the premise that the similarity between two MTS should be evaluated over time and not simply by the final static picture of the objects. This means that cases of high or low SPCAc localized (or concentrated) only at the end of the time window do not represent dissimilar (SPCAc close to unity) or similar objects (SPCAc close to zero), respectively. Moreover, in a continuous process it is expected that the similarity or dissimilarity between two objects (MTS) is established gradually over time rather than instantaneously.

Similarly to the time-based global SPCA (SPCAG), a time-based global AED (AEDG) is defined in order to consider the dynamic behavior of the AED, based on the same time window and the same sampling period adopted to obtain SPCAG:

$$\lVert \bar{X}_k - \bar{V}_i \rVert = \mathrm{AED}_{G}(X_k, V_i) = \frac{\int_0^{t_f} \mathrm{AED}(t, \bar{X}_k, \bar{V}_i)\, dt}{t_f} \tag{12}$$
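Eqs 11 and 12 are time averages of a distance profile, so on sampled data they can be approximated with the trapezoidal rule. The sketch below is illustrative only: the function name is hypothetical, and the exponentially decaying profile is a made-up stand-in for an SPCAc(t) curve that starts near unity and ends near zero; it is not taken from the paper's data. The window length (5) and sampling period (0.1) follow the caption of Figure 3.

```python
import numpy as np

def time_average(profile, t):
    """Trapezoidal approximation of (1/tf) * integral of profile(t) over the window."""
    profile = np.asarray(profile, dtype=float)
    t = np.asarray(t, dtype=float)
    area = np.sum(0.5 * (profile[1:] + profile[:-1]) * np.diff(t))
    return area / (t[-1] - t[0])

t = np.linspace(0.0, 5.0, 51)            # window of 5 time units, sampling period 0.1
spca_c_profile = np.exp(-2.0 * t)        # hypothetical SPCAc(t): high early, low late

final_value = spca_c_profile[-1]         # what a whole-window snapshot would report
global_value = time_average(spca_c_profile, t)   # time-based global value (eq 11)
```

In this made-up case `final_value` is practically zero, which on its own would suggest two similar objects, whereas `global_value` stays well above it because the early dissimilarity is retained in the average.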

4. RESULTS AND DISCUSSION

4.1. Case Study and Data Acquisition. The Tennessee Eastman (TE) process comprises four unit operations, namely: (i) an exothermic two-phase reactor; (ii) a flash separator; (iii) a product stripper; and (iv) a recycle compressor. A portion of the recycle stream (gas stream exhausted from the separator) is purged to avoid the accumulation of the inert and byproduct in the process. The plant produces two products (G and H) from four reactants (A, C, D, and E). Two other species are also present: an inert (B) and a byproduct (F). All the reactions (listed below) are irreversible and exothermic, and the rates are functions of temperature according to the Arrhenius equation. Both products G and H exit the stripper base and are separated in a downstream unit.

A(g) + C(g) + D(g) → G(l)
A(g) + C(g) + E(g) → H(l)
A(g) + E(g) → F(l)
3D(g) → 2F(l)

Figure 6. Tennessee Eastman challenge process schematic with the base control strategy.31

Among the alternative control strategies that have been proposed for this process, the decentralized control system proposed by Ricker31 (Figure 6) is adopted for the case study in this paper. The process has 12 manipulated variables and 41 measurements (22 continuous process measurements and 19 composition measurements collected at lower sampling rates). Measured variables in a closed-loop process may exhibit small deviations from the original operating condition for a short time period when a fault occurs (even if the fault is persistent). This represents an additional challenge for distinguishing between faulty and normal conditions, which may contribute to increased rates of missed detections and false alarms.

A list of 20 faults occurring at different points in the process has been previously proposed.24 Regarding these 20 possible fault types, the detection techniques reported in the literature have shown that most of the faults (17 out of 20) exhibit a similar level of difficulty for their detection and diagnosis, whereas three of them (faults 3, 9, and 15) are more difficult to detect and generally result in poor classification rates. Both faults 3 and 9 are related to a disturbance in the same process variable (D feed temperature, Figure 6). Accordingly, in order to evaluate the performance of the proposed approach in the detection of failures associated with the TE process, two typical faults (2 and 8) with the same level of complexity as the majority of the faults were considered together with an additional fault that is more challenging to detect (fault 3). The first (fault 2, Figure 6) comprises a step change in the inert (B) composition with the A/C ratio constant (stream 4, which is a mixture of the A and C reactants). The second (fault 8) comprises random variations in the A, B, and C feed compositions (stream 4). Both of these faults result in changes in the amount of inert in the system, in the reactor pressure, in the partial pressure of reactants inside the reactor, and in the production rate (G and H products). A third fault (fault 3) comprises a step change in the temperature of the D feed stream (stream 2) and results in a much smaller effect on the process variables as compared to faults 2 and 8. Fault 3 is of particular interest since it has often been misclassified by many methodologies due to its low signal-to-noise ratio.

The selection of the appropriate measured variables (manipulated or controlled) to build the MTS (objects) for classification with a clustering-based FDD approach represents an additional challenge, since the exclusion or inclusion of a variable can lead to loss of information or hinder the recognition of similarities (dissimilarities) among objects of the same (different) class (normal or fault). For both faults 2 and 8, the MTS of three manipulated variables (D feed flow, stream 2; E feed flow, stream 3; and the flow of stream 4) were used to infer the faults. For fault 3, the MTS of the only two variables that exhibit non-negligible changes in response to the occurrence of the fault were used to detect the fault: (i) the reactor cooling water flow and (ii) the reactor temperature.

We consider a data set of faults occurring one at a time, i.e., nonsimultaneous faults, and a data set corresponding to normal operating conditions generated around different set point changes. Samples of data were created for each fault by simulating the TE process for 2500 h with a sampling period equal to 30 s. The production rate (set point) was randomly changed within a range ([19.5;26.3] m³/h) and introduced at different time instants to investigate the ability to detect faults in the neighborhood of different steady states of the process. The changes in steady states corresponded to changes in production rates. The MTS were chosen with a particular time length that was set according to the dynamic time constant of the response for each particular fault. These time constants were assumed to be known a priori from simulations. Accordingly, for faults 2 and 8, each object had a time window of 500 min, whereas for fault 3 the time window was only 20 min due to the fast dynamic closed-loop response in this latter case. Two samples of the same size, 50 objects related to normal operation and 50 objects related to fault/disturbance, were considered for each kind of fault.


For analysis, each time series was normalized within the range [−1; +1] considering the maximum and minimum values of the respective process variable along the entire sample (set of objects). This normalizing procedure was required to account for the different magnitudes in engineering units associated with the different types of process variables used to infer faults.

4.2. Tennessee Eastman Process: Faults 2 and 8. A cross-validation procedure was adopted for both faults 2 and 8, and 30 normal and 30 fault objects were randomly selected for the training sample in each case. This cross-validation procedure is based on minimizing the percentage of misclassifications obtained with the training and test data with respect to a set of discrete values of the tuning parameter α. An external clustering validation32,33 was considered in order to evaluate the results. In this case, the external information, consisting of objects previously labeled as normal or fault classes, is used to evaluate the extent to which the clustering results match the given class labels. To that purpose, the performance of the proposed method is quantified by the fault detection rate and the false alarm rate.

Figure 7 presents the composition and the distribution of faulty and normal objects within the clusters for the training and test samples for two different cases considering 2 and 3 clusters, respectively, with the best α value (see eq 3) for fault 2 and a hybrid distance using both SPCAG and AEDG. Despite the higher number of faulty objects in the same cluster (for both training and test samples), the clustering method with two clusters does not properly segregate the normal objects from the faulty ones in cluster 2, which implies a high rate of false alarms. So, in this case, although cluster 1 has only faulty objects, the second cluster could not be characterized as a cluster of normal objects. On the other hand, clustering with 3 clusters is capable of separating the normal and faulty objects almost completely. These 3 clusters include a typical fault cluster (cluster 1) and two normal clusters (clusters 2 and 3) associated with different patterns.

As already mentioned, the effectiveness of the fault detection procedure is evaluated through the fraction of fault data classified as normal (misdetection rate or false negative) and the fraction of normal data classified as fault (false positive). Table 1 presents the percentage of normal (false positive) and fault (false negative) objects wrongly classified in both training and test samples (3 clusters) together with the results obtained by the standard statistics (T2 and Q). The assignment of each of the training objects to the clusters is based on the largest membership value calculated for each one of these objects. On the other hand, the classification of each object in the test sample is based on the smallest distance (hybrid distance) of each object from the center of each cluster identified in the training step.

Table 1 shows that in the case of fault 2 (step disturbance), the best results of the hybrid method were obtained with α = 0.5. In this case, the use of the time-based global approaches (SPCAG and AEDG) provides the same results obtained with the standard ones (SPCAc and AED). Also, none of the metrics (SPCAG or AEDG) when used alone (α = 1 and α = 0, respectively) gave better results than the ones obtained with the hybrid metric (α = 0.5). It is also evident that the SPCAG metric alone (α = 1) is less informative about the faulty situations (53% and 60% of misdetection rates), but it presents better results than the standard SPCAc alone which, together with the Q statistic, provides 100% misclassification in the normal detection (false alarm rate) for both training and test samples. Both statistics (Q and T2) present a high rate of false alarms, that is, normal objects classified as fault due to the measurement noise, highlighting the better performance of the hybrid approach as compared to the detection based on these traditional statistics, which are not capable of distinguishing the normal objects (associated with different production rates) from the faulty ones.

Figure 8 presents the composition and the distribution of faulty and normal objects within the clusters for the training and test samples, considering only two clusters for fault 8, which involves a random disturbance. Table 2 presents the percentage of misclassifications in both training and testing together with the T2 and Q results. In this case, adopting the same manipulated variables and the same window length as for fault 2, only two clusters were enough to segregate normal and faulty objects with low rates of false alarms and missed detections. The T2 and Q statistics also have difficulty in distinguishing the normal objects from the faulty ones. The best α value (α = 1) shows that in this case of random disturbance for fault 8, the SPCAG metric alone is capable of recognizing dissimilarities among normal and faulty objects. In addition, the use of a time-based global approach (SPCAG) is capable of further improving the classification results compared to the static approach (SPCAc).

Figure 7. Clustering results. Number of objects in each cluster; fault 2 (α = 0.5), 2 and 3 clusters.
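The normalization described in section 4.2 can be illustrated with a minimal sketch. It assumes that each process variable is handled separately and that the minimum and maximum are taken over all objects of the sample, as the text states. The function name `normalize_series` and the handling of a constant variable are illustrative choices, not taken from the paper.

```python
def normalize_series(sample):
    """Scale all series of one process variable to the range [-1, +1].

    `sample` is a list of time series (lists of floats), one per object.
    The minimum and maximum are taken over the whole sample, not per series.
    """
    lo = min(min(series) for series in sample)
    hi = max(max(series) for series in sample)
    if hi == lo:
        # Assumption: a constant variable is mapped to 0 to avoid dividing by zero.
        return [[0.0 for _ in series] for series in sample]
    scale = 2.0 / (hi - lo)
    return [[(x - lo) * scale - 1.0 for x in series] for series in sample]


# Two objects of the same variable, in arbitrary engineering units.
print(normalize_series([[10.0, 20.0], [30.0, 50.0]]))
```

Because the bounds come from the entire sample, differences in level between objects survive the scaling, which matters for a metric based on averages such as the AED.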


Table 1. Percentage (%) of Misclassifications: Fault 2

                        α = 0    α = 0   α = 1     α = 1     α = 0.5        α = 0.5
                        (AEDG)   (AED)   (SPCAG)   (SPCAc)   (SPCAG/AEDG)   (SPCAc/AED)   T2    Q
training data−fault     0        0       0         0         0              0             0     0
training data−normal    7        10      53        100       7              7             97    100
test data−fault         0        0       5         0         0              0             0     0
test data−normal        10       15      60        100       10             15            95    100

Figure 8. Clustering results. Number of objects in each cluster; fault 8 (α = 1).
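The two assignment rules and the two error rates reported in Table 1 can be sketched as follows. The sketch assumes that the fuzzy memberships and the hybrid distances to the cluster centers have already been computed elsewhere. The function names and the `fault_clusters` argument (the set of cluster indices regarded as fault clusters) are hypothetical and serve only the illustration.

```python
def assign_training(memberships):
    """Training rule: index of the cluster with the largest membership value."""
    return [max(range(len(u)), key=u.__getitem__) for u in memberships]


def assign_test(distances):
    """Test rule: index of the cluster center at the smallest distance."""
    return [min(range(len(d)), key=d.__getitem__) for d in distances]


def misclassification_rates(true_is_fault, cluster_ids, fault_clusters):
    """Return (% of fault objects classified as normal,
               % of normal objects classified as fault).

    Assumption: `fault_clusters` is supplied by the user from the external labels.
    """
    n_fault = sum(1 for t in true_is_fault if t)
    n_normal = len(true_is_fault) - n_fault
    missed = sum(1 for t, c in zip(true_is_fault, cluster_ids)
                 if t and c not in fault_clusters)
    false_alarms = sum(1 for t, c in zip(true_is_fault, cluster_ids)
                       if not t and c in fault_clusters)
    return 100.0 * missed / n_fault, 100.0 * false_alarms / n_normal


# Toy example: 4 test objects, 3 clusters, cluster 0 taken as the fault cluster.
ids = assign_test([[0.1, 0.9, 0.8], [0.7, 0.2, 0.6],
                   [0.3, 0.5, 0.9], [0.9, 0.8, 0.1]])
print(ids, misclassification_rates([True, False, False, True], ids, {0}))
```

Separating the assignment from the scoring mirrors the external validation used in the paper, in which the class labels play no part in the clustering itself and are used only afterwards to evaluate it.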

objects (in both training and test data), which highlights the efficiency of the proposed approach, considering that the clustering problem itself (optimization-based clustering, eqs 3 and 4) copes with nonlabeled objects.

4.3. Tennessee Eastman Process: Fault 3. Fault 3 in the TE process has often been reported in earlier FDD studies as very challenging to isolate because of its small signal-to-noise ratio. Shams et al.27 proposed a method based on the cumulative sum (CUSUM) of available measurements in combination with PCA for the detection and diagnosis of faults in the TE process, and they illustrated the ability of their algorithm to detect fault 3. The step disturbance in the D feed temperature (fault 3) has a negligible effect on the closed-loop TE process, and only the cooling water flow and the reactor temperature show a small change with a fast dynamic response. Shams et al.27 showed that integrating the cooling flow over time (CUSUM approach) allows detection of this fault.
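The idea of integrating a measurement over time can be illustrated with a plain cumulative sum of deviations from a reference value. This is only a sketch of the general principle and not the CUSUM-based PCA algorithm of Shams et al.; the reference value, the threshold, and the function names are assumptions made for the example.

```python
def cusum(series, reference):
    """Running sum of the deviations of `series` from `reference`."""
    total = 0.0
    trace = []
    for x in series:
        total += x - reference
        trace.append(total)
    return trace


def first_crossing(trace, threshold):
    """Index of the first sample whose absolute CUSUM exceeds `threshold`, or None."""
    for i, value in enumerate(trace):
        if abs(value) > threshold:
            return i
    return None


# A small step of 0.1 units starting at sample 5 (noise-free for clarity).
signal = [0.0] * 5 + [0.1] * 10
trace = cusum(signal, reference=0.0)
print(first_crossing(trace, threshold=0.45))
```

A shift that is small in every individual sample grows steadily in the cumulative sum, which is why detection of this kind needs some time to elapse after the onset of the fault.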

Table 2. Percentage (%) of Misclassifications, Fault 8

training data−fault training data− normal test data− fault test data− normal

α=0 (AEDG)

α = 0.5 (SPCAG/ AEDG)

0

0

0

65

45

0 35

α=1 α=1 (SPCAG) (SPCAc)

T2

Q

0

0

11

0

7

97

60

0

0

0

0

5

65

5

10

95

50

Although the selection of the best values for the trade-off parameter (α) is based on a cross-validation approach, the best results in both cases (fault 2 with α = 0.5 and 3 clusters and fault 8 with α = 1 and 2 clusters) indicate a small percentage (or zero) of misclassifications related to the normal and fault

Figure 9. Clustering results. Number of objects in each cluster; fault 3 (α = 0.5, SPCAG and AEDG).


Figure 10. Time series (normal and fault time series): cooling water flow (a) and reactor temperature (b).

In this work, the cooling flow and reactor temperature were considered together within the hybrid approach. Thirty normal and 30 fault objects were randomly selected as the training sample, and the same TE process simulation procedure (section 1.1) used for faults 2 and 8 in the previous section was employed to generate faulty and normal situations. Figure 9 presents the composition and distribution of the fault and normal objects across the clusters for the training and test samples, considering 2 clusters. The best α value in this case, obtained by cross-validation (α = 0.5, with both SPCAG and AEDG), shows that the time behavior of the variables considered (reactor temperature and cooling water flow) and the specific features of fault 3 generate a different scenario for the clustering problem. For fault 3, unlike fault 8, the SPCA metric alone does not provide enough information to recognize dissimilarities between normal and faulty objects. The results show that the hybrid approach together with the global metrics (SPCAG and AEDG) is capable of recognizing 2 clusters, one containing the majority of the faulty objects (cluster 1) and the other the majority of the normal objects (cluster 2), in both test and training samples; 23% and 10% of fault objects were misclassified in the training

and test data, respectively, and 27% and 15% of normal objects were misclassified (false alarm rates) in the training and test data, respectively. The results are acceptable considering that few variables are affected by fault 3 and that the fault is poorly observable because of the low signal-to-noise ratio. Furthermore, as reported in other works,25−27 the standard approaches T2 and Q provide high misdetection rates (close to 100%) for this type of fault. It should be noted that the approach of Shams et al. identifies the fault with high accuracy through a CUSUM operation only after sufficient time has elapsed since the onset of the fault, whereas the present approach was tested for its detection ability immediately after the onset of the fault.

The low level of observability and the fast dynamics of fault 3 suggest that the length of the time window may affect the classification results. A very short time window may not contain enough information to recognize the differences between fault objects and normal operation. On the other hand, a longer time window does not necessarily imply a better classification, since the effect of the disturbance fades out over time and is therefore reduced in both metrics




ACKNOWLEDGMENTS

The authors acknowledge the financial support provided by the Federal Agency for Support and Evaluation of Graduate Education (Coordenação de Aperfeiçoamento de Pessoal de Nível Superior, CAPES-BRAZIL) through a grant to C. Fontes (Bolsista CAPES, Proc. No. 6773/14-1).

proposed (SPCAG and AEDG). Figure 10 presents some fault and normal objects (time series) associated with the two process variables selected for fault detection (cooling water flow and reactor temperature). The best results presented in Table 3 (α = 0.5) were obtained with a time window equal to 7.8 min.
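The Introduction describes the global metric as the time integral, that is, the area related to the time profile of the SPCA over the window. A minimal sketch of that idea, assuming a uniformly sampled similarity profile and a trapezoidal rule, is given here. The profile values, the sampling step, and the function name are hypothetical, and any normalization used in the paper is not reproduced.

```python
def profile_area(values, dt):
    """Trapezoidal area under a uniformly sampled profile with step `dt`."""
    return sum(0.5 * (a + b) * dt for a, b in zip(values, values[1:]))


# Two hypothetical similarity profiles that end at the same value (0.9)
# but have different histories over the time window.
always_similar = [0.9, 0.9, 0.9, 0.9]
similar_only_at_end = [0.1, 0.1, 0.1, 0.9]

print(profile_area(always_similar, dt=1.0))
print(profile_area(similar_only_at_end, dt=1.0))
```

The two profiles are indistinguishable if only the final value is inspected, whereas their areas differ clearly, which is the motivation the paper gives for taking the history of the metric into account.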



Table 3. Percentage (%) of MisclassificationsFault 3 α=0 α=1 (AEDG) (SPCAG) training data− fault training data− normal test data− fault test data− normal

α = 0.5 (SPCAG/ AEDG)

α = 0.5 (SPCAc/ AED)

T2

Q

23

17

23

27

0

11

27

67

27

27

97

60

15

15

10

15

0

5

15

95

15

15

95

50
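The selection of α over a discrete grid can be sketched with the Table 3 values for the global metrics. The paper states only that cross-validation minimizes the percentage of misclassifications on the training and test data. The use of a plain sum over the four categories as the score, and the function name `select_alpha`, are assumptions made for this illustration.

```python
def select_alpha(results):
    """Return the alpha with the smallest summed misclassification percentage.

    `results` maps alpha to a tuple
    (training fault, training normal, test fault, test normal), in percent.
    Assumption: the four percentages are weighted equally.
    """
    return min(results, key=lambda alpha: sum(results[alpha]))


# Values as read from Table 3 (fault 3, global metrics SPCAG and AEDG).
table3_global = {
    0.0: (23, 27, 15, 15),   # AEDG alone
    0.5: (23, 27, 10, 15),   # hybrid SPCAG/AEDG
    1.0: (17, 67, 15, 95),   # SPCAG alone
}

best_alpha = select_alpha(table3_global)
print(best_alpha, sum(table3_global[best_alpha]))
```

Under this scoring, the hybrid setting comes out ahead of either metric used alone, in line with the α = 0.5 reported for fault 3; a different weighting of false alarms and missed detections could change the ranking.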

REFERENCES

(1) Strachan, S. M.; Stephen, B.; McArthur, S. D. J. Practical Applications of Data Mining in Plant Monitoring and Diagnostics. In Proceedings of the IEEE Power Engineering Society General Meeting, Florida (USA); IEEE, 2007; pp 1−7.
(2) Bankó, Z.; Abonyi, J. Correlation Based Dynamic Time Warping of Multivariate Time Series. Expert Syst. Appl. 2012, 39, 12814−12823.
(3) Izakian, H.; Pedrycz, W.; Jamal, I. Fuzzy Clustering of Time Series Data Using Dynamic Time Warping Distance. Eng. Appl. Artif. Intell. 2015, 39, 235−244.
(4) Liao, T. W. Clustering of Time Series Data - a Survey. Pattern Recognit. 2005, 38, 1857−1874.
(5) Keogh, E. J.; Kasetty, S. On the Need for Time Series Data Mining Benchmarks: A Survey and Empirical Demonstration. In Proceedings of the International Conference on Knowledge Discovery and Data Mining (ACM SIGKDD); Edmonton, Alberta, Canada, 2002; pp 23−26.
(6) Trebuňa, P.; Halčinová, J. Mathematical Tools of Cluster Analysis. Appl. Math 2013, 4, 814−816.
(7) Fu, T. A Review on Time Series Data Mining. Eng. Appl. Artif. Intell. 2011, 24, 164−181.
(8) Kavitha, V.; Punithavalli, M. Clustering Time Series Data Stream - A Literature Survey. Int. J. Comput. Sci. Inf. Secur. 2010, 8 (1), 289−294.
(9) Aghabozorgi, S.; Shirkhorshidi, A. S.; Wah, T. Y. Time-Series Clustering − A Decade Review. Inf. Syst. 2015, 53, 16−38.
(10) Singhal, A.; Seborg, D. E. Pattern Matching in Multivariate Time Series Databases Using a Moving-Window Approach. Ind. Eng. Chem. Res. 2002, 41, 3822−3838.
(11) Yang, K.; Shahabi, C. A PCA-Based Similarity Measure for Multivariate Time Series. In Proceedings of the International Workshop on Multimedia Databases, ACM-MMDB, Washington DC, USA; 2004; pp 1−10.
(12) Xun, L.; Zhishu, L. The Similarity of Multivariate Time Series and Its Application. In Proceedings of the International Conference on Management of e-Commerce and e-Government, Sichuan, China; 2010; pp 76−81.
(13) Plant, C.; Wohlschlager, A. M.; Zherdin, A. Interaction-Based Clustering of Multivariate Time Series. In Proceedings of the Ninth IEEE International Conference on Data Mining; Miami, Florida, USA, 2009; pp 914−919.
(14) Coppi, R.; D’urso, P.; Giordani, P. A Fuzzy Clustering Model for Multivariate Spatial Time Series. J. Classif. 2010, 27 (1), 54−88.
(15) D’urso, P.; Maharaj, E. A. Autocorrelation-Based Fuzzy Clustering of Time Series. Fuzzy Sets Syst. 2009, 160 (24), 3565−3589.
(16) Singhal, A.; Seborg, D. E. Evaluation of a Pattern Matching Method for the Tennessee Eastman Challenge Process. J. Process Control 2006, 16, 601−613.
(17) Dobos, L.; Abonyi, J. On-Line Detection of Homogeneous Operation Ranges by Dynamic Principal Component Analysis Based Time-Series Segmentation. Chem. Eng. Sci. 2012, 75, 96−105.
(18) Deng, X.; Tian, X. Nonlinear Process Fault Pattern Recognition Using Statistics Kernel PCA Similarity Factor. Neurocomputing 2013, 121, 298−308.
(19) Zhang, Y.; Wang, Z.; Zhang, J.; Ma, J. Fault Localization in Electrical Power Systems: A Pattern Recognition Approach. Electr. Power Energy Syst. 2011, 33, 791−798.

5. CONCLUSIONS

A hybrid clustering approach combining two metrics, (1) the direction of the principal components, quantified by the SPCA, and (2) the averages, quantified by the AED, is applied to improve the performance of MTS classification. The effect of the intrinsic dynamic behavior of the SPCA over the time window, using a static PCA, is also investigated, and a modification of this metric (time-based global SPCA) is proposed for recognizing dynamic similarities among objects (MTS) that are not observable through a standard (static) approach, which considers the time series as a whole. The proposed method is applied to detect normal and abnormal closed-loop operation of the Tennessee Eastman (TE) process under different operating conditions produced by changes in the production rate. Three particular faults were considered to test the algorithm. In all cases, time series of preselected process variables, directly related to the control actions, were considered in each object. The results highlight the ability of the method to recognize typical clusters associated with fault or normal objects, according to a learning procedure (clustering) combined with a cross-validation approach. The results show that the hybrid metric also provides a flexible way to cope with and identify specific situations in which only one type of metric is capable of recognizing structure in the data. Furthermore, the time-based global SPCA (SPCAG) represents a feasible alternative for improving classifier performance even in complex classification problems (fault 3, TE process) and can provide better results than the traditional application of the SPCA (SPCAc).




AUTHOR INFORMATION

Corresponding Author

*E-mail: [email protected].

ORCID

Cristiano Hora Fontes: 0000-0001-8020-6815
Hector Budman: 0000-0002-0773-7457

Notes

The authors declare no competing financial interest.


(20) Fontes, C. H.; Budman, H. A Hybrid Clustering Approach for Multivariate Time Series - A Case Study Applied to Failure Analysis in a Gas Turbine. ISA Trans. 2017, 71, 513.
(21) Li, S.; Wen, J. Application of Pattern Matching Method for Detecting Faults in Air Handling Unit System. Autom. Constr. 2014, 43, 49−58.
(22) Venkatasubramanian, V.; Rengaswamy, R.; Yin, K.; Kavuri, S. N. A Review of Process Fault Detection and Diagnosis - Part I: Quantitative Model-Based Methods. Comput. Chem. Eng. 2003, 27, 293−311.
(23) Lee, J. M.; Yoo, C.; Lee, I. B. Fault Detection of Batch Processes Using Multiway Kernel Principal Component Analysis. Comput. Chem. Eng. 2004, 28 (9), 1837−1847.
(24) Downs, J.; Vogel, E. F. A Plant-Wide Industrial Process Control Problem. Comput. Chem. Eng. 1993, 17 (3), 245−255.
(25) Rato, T. J.; Reis, M. S. Fault Detection in the Tennessee Eastman Benchmark Process Using Dynamic Principal Components Analysis Based on Decorrelated Residuals (DPCA-DR). Chemom. Intell. Lab. Syst. 2013, 125, 101−108.
(26) Lau, C. K.; Ghosh, K.; Hussain, M. A.; Hassan, C. R. Fault Diagnosis of Tennessee Eastman Process with Multi-Scale PCA and ANFIS. Chemom. Intell. Lab. Syst. 2013, 120, 1−14.
(27) Shams, M. A.; Budman, H. M.; Duever, T. A. Fault Detection, Identification and Diagnosis Using CUSUM Based PCA. Chem. Eng. Sci. 2011, 66, 4488−4498.
(28) Ozkan, C.; Keskin, G. A. A Variant Perspective to Performance Appraisal System: Fuzzy-C-Means Algorithm. Int. J. Ind. Eng. 2014, 21 (3), 168−178.
(29) Döring, C.; Lesot, M.-J.; Kruse, R. Data Analysis with Fuzzy Clustering Methods. Comput. Stat. Data Anal. 2006, 51, 192−214.
(30) Dao, T. B. H.; Duong, K. C.; Vrain, C. Constrained Clustering by Constraint Programming. Artif. Intell. 2017, 244, 70−94.
(31) Ricker, N. L. Decentralized Control of the Tennessee Eastman Challenge Process. J. Process Control 1996, 6 (4), 205−221.
(32) Xiong, H.; Li, Z. Clustering Validation Measures. In Data Clustering: Algorithms and Applications; Aggarwal, C. C., Reddy, C. K., Eds.; CRC Press, Taylor & Francis Group: New York, 2013; pp 571−605.
(33) Arbelaitz, O.; Gurrutxaga, I.; Muguerza, J.; Pérez, J.; Perona, I. An Extensive Comparative Study of Cluster Validity Indices. Pattern Recognit. 2013, 46, 243−256.
