Article pubs.acs.org/IECR
Self-Organizing Map Based Fault Diagnosis Technique for Non-Gaussian Processes
Hongyang Yu,† Faisal Khan,*,† Vikram Garaniya,† and Arshad Ahmad‡
† National Centre for Maritime Engineering and Hydrodynamics, Australian Maritime College, University of Tasmania, Launceston, TAS 7250, Australia
‡ Institute of Hydrogen Economy, Faculty of Chemical Engineering, University Technology Malaysia, 81300 Skudai, Johor, Malaysia
ABSTRACT: A self-organizing map (SOM) based methodology is proposed for fault detection and diagnosis of processes with nonlinear and non-Gaussian features. The SOM is trained to represent the characteristics of normal operation as a cluster in a two-dimensional space. The dynamic behavior of the process system is then mapped as a two-dimensional trajectory on the trained SOM. A dissimilarity index based on the deviation of the trajectory from the center of the cluster is derived to classify the operating condition of the process system. Furthermore, the coordinate of each best matching neuron on the trajectory is used to compute the dynamic loading of each process variable. For fault diagnosis, the contribution plot of the process variables is generated by quantifying the divergences of the dynamic loadings. The proposed technique is first tested using a simple non-Gaussian model and is then applied to monitor the simulated Tennessee Eastman chemical process. The results from both cases demonstrate the superiority of the proposed technique over the conventional principal component analysis (PCA) technique.
1. INTRODUCTION
Modern process systems have become more versatile and sophisticated owing to the rapid development of technology. As a result, the number of process variables and components has grown sharply. In addition, the interactions among the process variables have become more intricate, resulting in nonlinear and non-Gaussian process behavior. A fault condition in a simple component can easily be concealed by these variations and quickly propagate to cause multiple upsets. These upsets, if not dealt with properly, can lead to serious safety issues and significant degradation in process performance. To minimize the impact of these upsets, it is necessary to develop a monitoring technique that can detect abnormal process behavior in a timely manner and diagnose it precisely, so that the proper remedial measures can be determined efficiently and deployed promptly. A systematic review of recent developments in process monitoring techniques can be found in the literature.1 Broadly, these techniques can be divided into model-based techniques and data-driven techniques. Model-based techniques require an in-depth understanding and analysis of the system to derive an accurate mechanistic model.1−5 However, such first-principles models are often difficult to obtain explicitly because of the nonlinear and non-Gaussian variations of modern process systems.5 Data-driven techniques, on the other hand, rely only on process history data rather than an explicit first-principles model. They have therefore been widely applied to complex industrial processes.6−8 Among the various data-driven techniques, principal component analysis (PCA) and partial least-squares (PLS) are the most extensively adopted.9−11 These two techniques project the process data from the high-dimensional measurement space into a low-dimensional feature space.
The boundary of the feature space can be defined by specifying a control limit for either the Hotelling's statistic (T²) or the Q statistic (SPE).12 Subsequently, multivariate contribution plots can be generated for the out-of-control data samples to diagnose the faulty process variables.13
The major drawback of PCA and PLS based techniques is the assumption that the intrinsic process variations follow a multivariate Gaussian distribution and that this multivariate distribution can be decomposed into a group of independent univariate Gaussian distributions with descending variance.14−16 The variance and orientation of each univariate Gaussian distribution are captured by the corresponding eigenvalue and loading vector (principal component), respectively. In this regard, PCA only takes into account the second-order statistic (variance-covariance) of the process data.17−19 However, the underlying data generation mechanism of modern industrial processes is often governed by high-order and non-Gaussian parameters. Therefore, PCA and PLS based techniques may not be able to effectively extract important information on the process, and the fault detection and diagnosis results can be misleading.20
In recent years, the self-organizing map (SOM) has gained popularity as an alternative to PCA in fault detection and diagnosis for complex industrial processes.21−23 SOM goes beyond PCA in that it is a nonlinear dimensionality reduction technique. It is able to capture the nonlinear variations of the process and visualize them on a low-dimensional display in a topologically ordered fashion.24−26 The following example is generated from a Java applet developed by Mirkes27 to illustrate the comparison between PCA and SOM based process monitoring. The simulated two-dimensional data D = [d1 d2] is generated from a nonlinear and non-Gaussian process. Both PCA and SOM provide a one-dimensional approximation of the distribution of the data. In Figure 1, PCA with a single principal component (PC) determines a linear regression that best explains the data. However, PCA is only able to capture approximately 79% of the variance, owing to the fact that PCA assumes the data is Gaussian distributed. In comparison, SOM provides a discrete approximation of the distribution of the data with a single row of 20 best matching neurons (red dots). Each best matching neuron points to a local region with high data concentration. It is observed that SOM in fact constructs a high-order nonlinear regression that explains the nonlinear variation of the data much better. As a result, SOM is able to capture approximately 95% of the variance. In addition, when a fault data sample, shown as a green dot in Figure 1, is introduced, PCA may not be able to correctly classify the data sample, as it is still in the normal region defined by the control limits of the T² and Q statistics. On the other hand, the fault data sample can easily be classified by SOM, as it lies well outside the normal region.
Figure 1. Process monitoring comparison between PCA and SOM.
The normal region of SOM can be defined by specifying a control limit for the quantization error of the process data.21,25,28−30 The quantization error provides a measure of similarity between the process data vector and the weight vector of the corresponding best matching neuron. For fault diagnosis, SOM is used to visualize the dynamic behavior of the process as a two-dimensional trajectory which can easily be interpreted to identify the root cause of a fault.24,28,30−35 These conventional SOM based fault detection and diagnosis techniques rely heavily on the availability of various types of faulty process data.21,23,30 However, in practice, generating faulty process data that takes into account all possible fault conditions of the process is infeasible. As a result, these techniques can only detect and diagnose a limited number of faults.
In this study, a dissimilarity index is derived for fault detection. It measures the difference between the best matching neuron representing the most significant variation of the normal operation and the best matching neuron representing each monitored process data sample. The control limit for the dissimilarity index is estimated by adopting a nonparametric kernel density estimator. For fault diagnosis, a dynamic loading vector is computed for each best matching neuron that captures a predefined number of monitored process data samples. These dynamic loading vectors are compared with the dynamic loading vector of the normal process data. A multivariate contribution plot is then generated for fault diagnosis based on how much each process variable affects the divergence of the dynamic loading vector from the normal condition.
The remainder of the paper is divided into the following sections. A brief review of the fundamental concepts of PCA and SOM is presented in the Background section. The derivation of the proposed SOM based fault detection and diagnosis technique is illustrated in the Methodology section. In the Case Studies section, the effectiveness of the proposed technique is first validated on a simple numerical example and is then verified on the well-established Tennessee Eastman chemical process system. The superior performance of the proposed technique over conventional PCA based techniques is demonstrated by comparing the results. The major conclusions, along with the key advantages and potential for future development of the proposed technique, are presented in the final section.
Received: February 25, 2014. Revised: April 19, 2014. Accepted: May 6, 2014. Published: May 6, 2014. © 2014 American Chemical Society. Industrial & Engineering Chemistry Research, dx.doi.org/10.1021/ie500815a | Ind. Eng. Chem. Res. 2014, 53, 8831−8843
2. BACKGROUND
2.1. Principal Component Analysis (PCA). PCA is an extensively applied multivariate statistical technique that is able to handle data with high correlation and high dimensionality. The basic PCA is conducted by applying singular value decomposition (SVD) to the data matrix.36 Consider a zero-meaned data matrix X with n samples (denoted N in eqs 1 and 2) and m features, $X \in \mathbb{R}^{n \times m}$. The covariance matrix of X can be computed in two forms.37
$$\Sigma_1 = \frac{X^T X}{N - 1} \tag{1}$$
$$\Sigma_2 = \frac{X X^T}{N - 1} \tag{2}$$
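The eigenvalue decomposition is then applied to both covariance matrices
$$\Sigma_1 = V \Sigma^2 V^T \tag{3}$$
$$\Sigma_2 = U \Sigma^2 U^T \tag{4}$$
where $V \in \mathbb{R}^{m \times m}$ is an orthogonal matrix ($V^T V = I$) whose column vectors are the loading vectors (principal components), $U \in \mathbb{R}^{n \times m}$ is a matrix with orthonormal columns, and $\Sigma \in \mathbb{R}^{m \times m}$ (written without a subscript, to distinguish it from the covariance matrices $\Sigma_1$ and $\Sigma_2$) is a diagonal matrix in which the diagonal elements are the eigenvalues in descending order. Finally, the SVD of data matrix X is expressed as
$$X = U \Sigma V^T \tag{5}$$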
The projection of X into the PCA subspace determined by V is achieved by computing the following product.
$$X V = U \Sigma V^T V = U \Sigma = T \tag{6}$$
Therefore, $U \Sigma = T$ is the score matrix. If only the first K principal components are considered for projection, the PCA approximation of X is computed as
$$X \approx U_K \Sigma_K V_K^T \tag{7}$$
where $U_K$, $\Sigma_K$, and $V_K$ comprise only the first K columns of U, Σ, and V, respectively. The Hotelling's statistic (T²) that measures the variation of X in the K-dimensional PCA subspace is given by
$$T^2 = X^T V_K \Sigma_K^{-1} V_K^T X \tag{8}$$
The variation that is not measurable by the T² statistic can be described by the Q statistic, which computes the residual between the original data and the reconstructed data.
$$Q = X (I - V_K V_K^T) X^T \tag{9}$$
The control limit for the T² statistic is determined by using the F-distribution with K degrees of freedom,38 and the control limit for Q is computed based on the assumption that the data follows a standard normal distribution.39
2.2. Self-Organizing Map. The self-organizing map (SOM) was first proposed by Kohonen40 as a specific type of unsupervised neural network. The SOM is an effective dimensionality reduction tool for feature extraction and classification of high-dimensional data. The basic SOM is constructed by arranging a group of neurons in a lattice formation in a discrete two-dimensional (2-D) space. An input data vector is connected to each neuron through a set of weights. A typical 2-D SOM is shown in Figure 2.
Figure 2. Basic SOM structure.
There are three fundamental steps involved in the training of the SOM: the competition step, the cooperation step, and the adaptation step. In the competition step, the weights of each neuron are randomly initialized; these weights are then compared with the input data vector. The neuron whose weights have the highest similarity to the input data vector is declared the winner neuron or the best matching unit (BMU).40 An arbitrary data vector with m entries is expressed as $I_{in} = [x_1\ x_2\ x_3 \ldots x_m]^T$, and the initial weight vector of a neuron, which also has m entries, is represented by $W = [w_1\ w_2\ w_3 \ldots w_m]^T$. The similarity between these two vectors is measured by the root-squared error (Euclidean distance) between them.40
$$E = \lVert I_{in} - W \rVert = \sqrt{\sum_{i=1}^{m} (x_i - w_i)^2} \tag{10}$$
The BMU has the smallest E of all the neurons. Subsequently, in the cooperation step, the direct neighborhood neurons of the BMU are identified. Finally, in the adaptation step, the weights of the BMU and the neighborhood neurons are selectively tuned to form a 2-D cluster that represents various features of the input data space.41 The tuning function is expressed as
$$W(t + 1) = W(t) + \alpha(t)\,\theta(t)\,[I(t) - W(t)] \tag{11}$$
where α(t) is the learning rate and θ(t) is the Gaussian neighborhood function. α(t) decreases exponentially over time, resulting in more refined learning as the training progresses. θ(t), on the other hand, is maximized at the BMU and decays exponentially with the distance from the BMU; neurons that are closer to the BMU are updated more rapidly. As a result of this differential updating, the neurons surrounding the BMU become more similar to the BMU, whereas the neurons far away from the BMU suffer from a low learning rate and become less similar to the BMU. In this regard, the similarity between data samples can simply be measured by the Euclidean distance between the BMUs that capture them. If the data samples are generated from the same process in a steady-state condition, they have high similarity and are captured by BMUs that are very close to each other. These BMUs then form a topologically ordered 2-D cluster that represents the steady-state feature of the process. In addition, each BMU in the cluster has a slightly different weight vector, which allows the BMUs to further discretize the input data space into mini-batches of data samples; the weight vectors represent the expectation of each mini-batch. In other words, these BMUs provide a discrete approximation of the distribution of the input data space.
3. METHODOLOGY
3.1. SOM Based Dissimilarity Index and Fault Detection. The SOM is trained with the historical data from the normal operating condition of the process and also with random data that represent the fault condition. The random data prevents the BMUs from occupying the entire feature space and defines a clear boundary around the normal cluster. In this study, the random data is sampled from the beta distribution, X ∼ Beta(α, β); for each data sample, the parameters are sampled from a uniform distribution, α, β ∼ U[0, 1]. An example of the trained SOM is shown in Figure 3(a), where the green cluster represents the normal operating condition and the red cluster represents the random data. The coordinate of each BMU is determined by its distance to the corresponding reference axes C1 and C2. The center of the green cluster is located at the BMU that captures the largest number of data samples. This BMU represents the most significant variation of the normal operating process.
Figure 3. SOM training (a) with normal data and random data and process monitoring (b) with the trained SOM (green cluster represents normal condition and red cluster represents fault condition).
The trained SOM is then used for online monitoring of the process. A BMU is determined for each online data sample. A trajectory that shows the dynamic behavior of the process is formed by connecting these BMUs. An example of dynamic process monitoring using the SOM is shown in Figure 3(b). When the process is in normal operating condition, the online process data samples are very similar to each other. Their corresponding BMUs concentrate in the green cluster. Thus, the dynamic trajectory also varies within the green cluster. A fault condition introduces abnormal variations in the process, which results in the generation of less similar process data. These fault data samples are then mapped to BMUs outside the green cluster. As the fault progresses, the process becomes more disturbed and the BMUs move further toward the inner region of the red cluster.
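As a rough illustration of the training rule in eqs 10 and 11 and of the random Beta-distributed data used to bound the normal cluster, the following NumPy sketch trains a small rectangular SOM and maps samples to BMU coordinates. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, grid size, decay schedule, and the small positive lower bound on α and β (needed because the Beta parameters must be positive) are all hypothetical choices.

```python
import numpy as np

def random_fault_data(n, m, rng, low=0.05):
    """Random samples: each row is drawn from Beta(a, b) with its own a, b."""
    # Assumption: a, b ~ U(low, 1) with a small positive lower bound.
    a = rng.uniform(low, 1.0, size=(n, 1)) * np.ones((1, m))
    b = rng.uniform(low, 1.0, size=(n, 1)) * np.ones((1, m))
    return rng.beta(a, b)

def train_som(data, rows=8, cols=8, n_iter=2000, lr0=0.5, sigma0=3.0, seed=0):
    """Train a rectangular SOM with the update rule of eq 11."""
    rng = np.random.default_rng(seed)
    m = data.shape[1]
    weights = rng.random((rows, cols, m))  # random initial weights
    # Grid coordinate (C1, C2) of every neuron, shape (rows, cols, 2).
    grid = np.stack(
        np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1
    )
    for t in range(n_iter):
        x = data[rng.integers(len(data))]
        # Competition: the BMU minimizes the Euclidean error E of eq 10.
        err = np.linalg.norm(weights - x, axis=-1)
        bmu = np.unravel_index(np.argmin(err), err.shape)
        # Assumed schedule: exponential decay of learning rate and width.
        decay = np.exp(-t / n_iter)
        lr, sigma = lr0 * decay, sigma0 * decay
        # Cooperation: Gaussian neighborhood centered on the BMU.
        d2 = np.sum((grid - np.array(bmu)) ** 2, axis=-1)
        theta = np.exp(-d2 / (2.0 * sigma ** 2))
        # Adaptation: W(t+1) = W(t) + lr * theta * (x - W(t)).
        weights += lr * theta[..., None] * (x - weights)
    return weights

def bmu_coords(weights, samples):
    """Return the (C1, C2) grid coordinate of the BMU of each sample."""
    rows, cols, m = weights.shape
    flat = weights.reshape(-1, m)
    d = np.linalg.norm(samples[:, None, :] - flat[None, :, :], axis=-1)
    return np.stack(np.unravel_index(np.argmin(d, axis=1), (rows, cols)), axis=1)
```

Connecting the coordinates returned by `bmu_coords` for consecutive online samples gives the kind of 2-D trajectory described in Section 3.1. The sketch also assumes the normal data has been scaled to [0, 1] so that it shares the support of the Beta samples.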
Therefore, the Euclidean distance between the BMU of each online data sample and the center of the cluster on the two-dimensional SOM is used as an index to measure the dissimilarity between the online data sample and those data samples representing significant variations of the normal process. This dissimilarity index, denoted as Disim and calculated as shown below, is then used for fault detection
$$Disim = \lVert C^i - C \rVert = \sqrt{\sum_{j=1}^{2} (C_j^i - C_j)^2} \tag{12}$$
where $C^i = [C_1^i\ C_2^i]$ is the coordinate of the BMU representing data sample i, and $C = [C_1\ C_2]$ is the coordinate of the center BMU of the green cluster. The upper control limit of the dissimilarity index is estimated using a kernel density estimator at the 95% confidence level
$$\hat{f}_h(Disim) = \frac{1}{nh} \sum_{i=1}^{n} K\!\left(\frac{Disim - Disim_i}{h}\right) \tag{13}$$
$$\int_{-\infty}^{Disim_{UCL}} \hat{f}_h(Disim)\, dDisim = 0.95 \tag{14}$$
where h is the bandwidth and K(·) is the Gaussian density function.42 The optimal bandwidth is determined by adopting a diffusion-based plug-in selection method.43 The upper control limit is denoted as $Disim_{UCL}$. The complete SOM-based fault detection technique is illustrated in Figure 4.
Figure 4. Illustration of SOM-based fault detection.
3.2. SOM-Based Multivariate Contribution Plot for Fault Diagnosis. The dissimilarity index (Disim) provides not only a means of fault detection but also a dynamic measure of the variation of the process in reference to the normal condition. In this regard, the dynamic behavior of the process is captured by a one-dimensional quantity, Disim. Similar to PCA, the projection from the high-dimensional process data space to the one-dimensional Disim is expressed as
$$Disim^i = X^i V^i \tag{15}$$
where $X^i$ is the data samples that are captured by $BMU^i$ at sample time i, and $V^i$ is the dynamic loading vector of the captured data samples. In this case, $Disim^i$ is equivalent to the dynamic score vector corresponding to the dynamic loading vector (dynamic PC). Therefore, the dynamic loading vector is calculated as
$$V^i = (X^i)^{-1} Disim^i \tag{16}$$
Equation 16 computes the inverse of $X^i$, which requires $X^i$ to be a well-conditioned square matrix. This means that, to compute $V^i$, the number of data samples (rows of $X^i$) has to be equal to the number of monitored process variables (columns of $X^i$). In SOM, data samples are classified according to their similarity; data samples with high similarity are captured by the same BMU. During online monitoring, the BMUs that capture a large number of data samples represent significant behaviors of the process system. Conversely, the BMUs that capture fewer data samples only represent insignificant behaviors of the system. In this regard, when solving for $V^i$, these BMUs are neglected. In essence, this is a data filtering technique based on SOM that filters out the less significant information on the process system. This data filtering technique is demonstrated in Figure 5. In comparison with the unfiltered trajectory, the filtered trajectory in Figure 5(b) clearly demonstrates the progression of the process state from normal to fault condition. The dynamic loading vector is equivalent to the first PC of the mini-batch of data samples that are captured by $BMU^i$ at sample time i. It also provides an approximation of the orientation of the distribution of the captured data samples. Each entry in this vector represents the influence of each process variable on the orientation. When the process is operating in a steady-state normal condition, highly similar normal process data is generated and is discretized by BMUs that are very close to each other. The variation of the dissimilarity index is very small, and, as a result, the computed dynamic loading vectors are very similar to each other. After a fault is introduced in the process system, the process starts generating abnormal and dissimilar fault data samples, which are captured by BMUs outside the normal region. Consequently, the computed dynamic loading vector starts to diverge from those in normal operating condition.
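To make the dynamic loading vector of eqs 15 and 16 concrete, together with the squared-difference divergence that Section 3.2 builds on it (eqs 17 and 18), a minimal NumPy sketch is given here. The function names are hypothetical. The square case follows eq 16 directly; the least-squares branch for a non-square $X^i$ is an added assumption for convenience and is not part of the method described in the paper.

```python
import numpy as np

def dynamic_loading(X_i, disim_i):
    """Solve Disim^i = X^i V^i for the dynamic loading vector V^i."""
    X_i = np.asarray(X_i, dtype=float)
    disim_i = np.asarray(disim_i, dtype=float)
    if X_i.shape[0] == X_i.shape[1]:
        # Square, well-conditioned case required by eq 16.
        return np.linalg.solve(X_i, disim_i)
    # Assumption (not in the paper): least squares when X^i is not square.
    return np.linalg.lstsq(X_i, disim_i, rcond=None)[0]

def contribution(V_i, V_normal):
    """Return DV (eq 17) and the per-variable contributions (eq 18)."""
    cont = (np.asarray(V_i, float) - np.asarray(V_normal, float)) ** 2
    dv = np.sqrt(cont.sum())
    return dv, cont
```

Because DV is the square root of the summed contributions, the entries of `cont` can be read as each variable's share of the squared divergence, which is what the contribution plot displays.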
Figure 5. Process monitoring trajectory (a) before filtering and (b) after filtering.
An example of such divergence is illustrated in Figure 6. The multivariate contribution plot is generated based on how much each process variable influences this divergence, which is measured by the root-squared error between the dynamic loading vector of the captured online fault data samples and the dynamic loading vector of the historical data samples captured by the center BMU
$$DV = \lVert V^i - V \rVert = \sqrt{\sum_{j=1}^{m} (v_j^i - v_j)^2} \tag{17}$$
where m is the number of monitored process variables, $V^i$ is the loading vector for the fault data samples captured by $BMU^i$ at sample time i, and V is the normal dynamic loading vector. $v_j^i$ and $v_j$ are the jth elements of $V^i$ and V, respectively. The contribution of each monitored process variable is calculated as
$$Cont_j = (v_j^i - v_j)^2 \tag{18}$$
To further illustrate the advantage of the proposed SOM-based technique over the conventional PCA-based technique, a comparison of the fault detection and diagnostic mechanisms of these two techniques is summarized in Table 1.
Figure 6. Divergence of dynamic loading vector in fault condition.
4. CASE STUDIES
4.1. Fault Detection Rate and False Alarm Rate. The fault detection rate and the false alarm rate are used to assess and compare the performance of the proposed technique and the conventional PCA-based technique. The fault detection rate (FDR) is calculated after the fault is introduced as the percentage of out-of-control data samples (those exceeding the upper control limit) detected by the corresponding technique.
$$FDR = \frac{\text{Number of out-of-control data samples after fault}}{\text{Total number of data samples after fault}} \times 100\% \tag{19}$$
On the other hand, the false alarm rate (FAR) is the percentage of data samples that have been classified as out-of-control before the fault is introduced.
$$FAR = \frac{\text{Number of out-of-control data samples before fault}}{\text{Total number of data samples before fault}} \times 100\% \tag{20}$$
4.2. A Simple Non-Gaussian Model. A simple non-Gaussian model is first used to demonstrate the validity of the proposed fault detection and diagnosis technique. This model has 4 output variables, $X = [x_1\ x_2\ x_3\ x_4]^T$, which are monitored to determine the state of the model. These monitored variables are generated based on the following model
$$X = AZ + \Phi \tag{21}$$
where A ($A \in \mathbb{R}^{4 \times 2}$) is the coefficient matrix, and Φ is the zero-meaned Gaussian noise having a correlation matrix of 0.25I, $I \in \mathbb{R}^{4 \times 4}$. The input signal Z has nonlinear and non-Gaussian variation and is generated from the following source
$$Z = \begin{bmatrix} u^4 + 4u^3 + 2u^2 + u \\ 2v^4 + v^3 - 2v^2 + 3v \end{bmatrix} \tag{22}$$
where u and v are the Gaussian input variables of the model, u ∼ N(0, 0.1) and v ∼ N(0, 0.4). Two fault conditions have been introduced to x1 and x3, respectively, at sample time 3000. The Gaussian noise functions for x1 and x3 are switched to gamma noise functions, which further introduce non-Gaussian variations into the model. The SOM is trained with 1000 normal samples generated from this model and 1000 random fault samples generated according to the method outlined in the first paragraph of section 3.1. The trained SOM is then used for model state monitoring, and the dynamic trajectories for both fault conditions are shown in Figure 7. The fault detection results for both the proposed technique and the PCA-based technique are presented in Figure 8. For PCA, the first two PCs, which capture 82.5% of the variance, are used to construct the feature space. In addition, a comparison of the fault detection rate between these two techniques is shown in Table 2.
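For readers who want to reproduce the flavor of this case study, the sketch below generates data from the model of eqs 21 and 22 and evaluates the FDR and FAR of eqs 19 and 20 for an arbitrary monitoring statistic. Several details are assumptions because the paper does not specify them: the coefficient matrix `A_EXAMPLE` is purely hypothetical, the second arguments of N(0, 0.1) and N(0, 0.4) are interpreted as variances, and the function names are illustrative.

```python
import numpy as np

# Hypothetical 4 x 2 coefficient matrix; the paper does not list A.
A_EXAMPLE = np.array([[1.0, 0.5],
                      [0.3, 1.2],
                      [0.8, -0.4],
                      [-0.6, 0.9]])

def simulate(n, A, rng):
    """Generate n samples of X = A Z + Phi (eqs 21 and 22), one row each."""
    # Assumption: 0.1 and 0.4 are variances of u and v.
    u = rng.normal(0.0, np.sqrt(0.1), n)
    v = rng.normal(0.0, np.sqrt(0.4), n)
    Z = np.vstack([u**4 + 4 * u**3 + 2 * u**2 + u,
                   2 * v**4 + v**3 - 2 * v**2 + 3 * v])  # shape (2, n)
    # Zero-mean Gaussian noise with covariance 0.25 I.
    Phi = rng.normal(0.0, np.sqrt(0.25), (A.shape[0], n))
    return (A @ Z + Phi).T

def detection_rates(stat, ucl, fault_start):
    """Return (FDR, FAR) in percent for a statistic and its control limit."""
    alarms = np.asarray(stat) > ucl
    far = 100.0 * alarms[:fault_start].mean()  # eq 20, before the fault
    fdr = 100.0 * alarms[fault_start:].mean()  # eq 19, after the fault
    return fdr, far
```

In the paper's setup `fault_start` would correspond to sample time 3000, and `stat` would be the T², Q, or Disim series compared against its own upper control limit.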
Table 1. Comparison of PCA-Based Technique and SOM-Based Technique
Fault Detection
  PCA-based technique:
  • Construct a linear regression which utilizes the second-order statistic to explain process variance
  • Fault detection statistics: T² and Q statistic
  • Compute upper control limits for fault detection based on the assumption of Gaussian variation of the process
  SOM-based technique:
  • Construct a discrete nonlinear regression based on the similarity of data samples to capture process variance
  • Fault detection statistic: Disim index
  • Compute upper control limit using a nonparametric kernel density function without any prior assumption
Fault Diagnosis
  PCA-based technique:
  • Calculate individual contribution of each process variable to the T² and Q statistics (which assume Gaussian and linear variation)
  SOM-based technique:
  • Calculate individual contribution based on the influence of each process variable on the divergence of the dynamic loading vector (nonlinear and no prior assumption)
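The SOM-based detection entries of Table 1 (the Disim index of eq 12 and the kernel-density control limit of eqs 13 and 14) can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, the BMU coordinates are assumed to come from an already trained SOM, and Silverman's rule of thumb is swapped in for the diffusion-based plug-in bandwidth selector used in the paper.

```python
import numpy as np

def center_bmu(coords):
    """Coordinate of the BMU that captures the most normal samples."""
    uniq, counts = np.unique(np.asarray(coords), axis=0, return_counts=True)
    return uniq[np.argmax(counts)]

def disim(coords, center):
    """Eq 12: map distance between each sample's BMU and the center BMU."""
    diff = np.asarray(coords, dtype=float) - np.asarray(center, dtype=float)
    return np.linalg.norm(diff, axis=1)

def kde_ucl(values, level=0.95, h=None, grid_size=1000):
    """Upper control limit from a Gaussian KDE (eqs 13 and 14)."""
    values = np.asarray(values, dtype=float)
    n = values.size
    if h is None:
        # Assumption: Silverman's rule, not the paper's plug-in selector.
        h = 1.06 * values.std(ddof=1) * n ** (-0.2)
    h = max(h, 1e-6)
    grid = np.linspace(values.min() - 4 * h, values.max() + 4 * h, grid_size)
    z = (grid[:, None] - values[None, :]) / h
    # Eq 13: average of Gaussian kernels centered on the samples.
    pdf = np.exp(-0.5 * z**2).sum(axis=1) / (n * h * np.sqrt(2.0 * np.pi))
    # Eq 14: integrate numerically until the requested level is reached.
    cdf = np.cumsum(pdf) * (grid[1] - grid[0])
    idx = min(int(np.searchsorted(cdf, level)), grid_size - 1)
    return grid[idx]
```

A sample would then be flagged as faulty when its Disim value exceeds the limit returned by `kde_ucl` for the Disim values of the normal training data.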
Table 2. Comparison of Fault Detection Rate between the Proposed Technique and PCA
fault condition no. | detection technique | fault detection rate (%) | early detection ([Y/N])
1 | T² statistic | 3.66 | N
1 | Q statistic | 4.4 | N
1 | Disim index | 99.5 | Y
2 | T² statistic | 3.31 | N
2 | Q statistic | 5.5 | N
2 | Disim index | 100 | Y
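For reference, the two PCA statistics compared in Table 2 can be computed along the lines of eqs 5, 8, and 9 as in the sketch below. It is a minimal sketch, not the authors' code: the function names are hypothetical, each row of the input is treated as one sample, and the scores are scaled by the eigenvalues of the sample covariance matrix, which is the usual convention and is assumed here. Control limits are not included.

```python
import numpy as np

def fit_pca(X_train, K):
    """Fit a K-component PCA model via SVD of the zero-meaned data (eq 5)."""
    mean = X_train.mean(axis=0)
    Xc = X_train - mean
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    V_K = Vt[:K].T                               # m x K loading matrix
    eigvals = s[:K] ** 2 / (len(X_train) - 1)    # variances of retained PCs
    return mean, V_K, eigvals

def t2_q(X, mean, V_K, eigvals):
    """Return Hotelling's T^2 (eq 8) and Q / SPE (eq 9) for each row of X."""
    Xc = X - mean
    scores = Xc @ V_K
    t2 = np.sum(scores**2 / eigvals, axis=1)
    resid = Xc - scores @ V_K.T                  # part outside the PC subspace
    q = np.sum(resid**2, axis=1)
    return t2, q
```

With all components retained the residual vanishes, so Q is zero up to rounding; this is a convenient sanity check before choosing a smaller K.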
Figure 7. Dynamic monitoring of the process with SOM: (a) Fault condition 1 and (b) Fault condition 2.
Figure 8. Comparison of fault detection results between SOM-based Disim index and PCA.
It is clearly shown that the PCA-based technique is not able to detect either of the fault conditions because of the significant non-Gaussian variation of the input signal source. On the other hand, the proposed technique demonstrates superior performance and successfully detects both fault conditions with detection rates close to 100%. Furthermore, a comparison of the false alarm rates between these two techniques is presented in Table 3, which shows that both techniques yield similar reliability in classifying the data samples when the simple process model is in normal condition.
Table 3. Comparison of False Alarm Rate between the Proposed Technique and PCA
fault condition no. | detection technique | false alarm rate (%)
1 | T² statistic | 4.64
1 | Q statistic | 5.71
1 | Disim index | 5.54
2 | T² statistic | 3.72
2 | Q statistic | 6.01
2 | Disim index | 5.11
The contribution plots for fault diagnosis of the proposed dynamic loading vector based technique and the PCA based techniques are shown in Figure 9. In comparison, the proposed technique and the Q statistic contribution are both able to accurately locate the root-cause output variable. However, the results from the T² contribution are misleading. The main reason for the difference in diagnostic performance between the T² contribution and the Q contribution is that the Q contribution is able to capture the non-Gaussian variation of the model in the residual space. Nevertheless, the poor fault detection result of the Q statistic, caused by the assumption that the residual space contains only standard normal noise, greatly restricts the fault diagnosis capability of the Q statistic. As demonstrated by this simple case study, the overall performance of the PCA-based technique is less satisfactory than that of the proposed technique for fault detection and diagnosis when significant non-Gaussian variations exist.
Figure 9. Comparison of fault diagnosis results between SOM-based Disim index and PCA.
4.3. Tennessee Eastman Chemical Process. In this section, the effectiveness of the proposed fault detection and diagnosis technique is verified by further testing on a well-established simulation program of the Tennessee Eastman chemical process. The simulation program adopts the decentralized control strategy to construct a closed-loop stable simulation of the process.44 The Tennessee Eastman chemical process consists of five major operating units: a reactor, a product condenser, a vapor−liquid separator, a recycle compressor, and a product stripper.45 The process flow diagram of the process system is shown in Figure 10. Three gaseous reactants are fed to the reactor, where catalyzed chemical reactions occur to form liquid products. The product stream exits the reactor as vapor and is condensed at the condenser. Subsequently, the product stream from the condenser passes through the vapor−liquid separator, where the condensed product and the noncondensed product are separated. The noncondensed product stream is then recycled back to the reactor feed through a centrifugal compressor. Meanwhile, the undesirable byproducts and inerts are purged from the process as vapor. Finally, the condensed product stream moves further into the stripper to be stripped with stream 4 to remove the residual reactants. The final product stream exits from the base of the stripper and is pumped to the downstream section for further refinement. In total, 41 process variables are measured for the process system, among which 22 process variables are monitored to determine the operating condition of the process system. These monitored variables are listed in Table 4.
Figure 10. Process flow diagram of the Tennessee Eastman chemical process system.
Table 4. Monitored Variables of the Tennessee Eastman Chemical Process System
variable no. | variable description | unit
X1 | A feed (stream 1) | kscmh
X2 | D feed (stream 2) | kg/h
X3 | E feed (stream 3) | kg/h
X4 | A and C feed (stream 4) | kscmh
X5 | recycle flow (stream 8) | kscmh
X6 | reactor feed rate (stream 6) | kscmh
X7 | reactor pressure | kPa gauge
X8 | reactor level | %
X9 | reactor temperature | °C
X10 | purge rate (stream 9) | kscmh
X11 | separator temperature | °C
X12 | separator level | %
X13 | separator pressure | kPa gauge
X14 | separator underflow (stream 10) | m³/h
X15 | stripper level | %
X16 | stripper pressure | kPa gauge
X17 | stripper underflow (stream 11) | m³/h
X18 | stripper temperature | °C
X19 | stripper steam flow | kg/h
X20 | compressor work | kW
X21 | reactor cooling water outlet temperature | °C
X22 | condenser cooling water outlet temperature | °C
In addition, Table 5 summarizes the 15 known fault conditions out of the 20 that have been preprogrammed in the Tennessee Eastman process simulation and have been widely used by the process monitoring community for verifying and comparing various techniques.45 In this study, the 15 known fault conditions are adopted to test the proposed fault detection technique, and 4 of these conditions are used to verify the proposed fault diagnosis technique. The sampling interval for data collection is 3 min.
Table 5. Fault Conditions of the Tennessee Eastman Chemical Process
fault no. | fault description | signal type
IDV1 | A/C feed ratio, B composition constant (stream 4) | step
IDV2 | B composition, A/C feed ratio constant (stream 4) | step
IDV3 | D feed temp (stream 2) | step
IDV4 | reactor cooling water inlet temperature | step
IDV5 | condenser cooling water inlet temperature | step
IDV6 | A feed loss (stream 1) | step. Switch pressure controller to purge stream and reduce production rate by 23.8%
IDV7 | C header pressure loss-reduced availability (stream 4) | step
IDV8 | A, B, C feed composition (stream 4) | random variation
IDV9 | D feed temperature (stream 2) | random variation
IDV10 | C feed temperature (stream 4) | random variation
IDV11 | reactor cooling water inlet temperature | random variation
IDV12 | condenser cooling water inlet temperature | random variation
IDV13 | reaction kinetics | slow drift
IDV14 | reactor cooling water valve | sticking
IDV15 | condenser cooling water valve | sticking
The testing results are also compared with the results obtained by PCA-based techniques to demonstrate the superiority of the proposed technique. The SOM is trained with 1000 samples of historical normal process data, generated by running the simulation program normally for 1000 sample times, and 1000 samples of random data, generated as outlined in the first paragraph of section 3.1, for all 15 conditions. The first 16 PCs, which capture 80% of the variance of the process data, are selected for the PCA-based techniques. The fault detection results of the PCA T² statistic, the Q statistic, and the proposed Disim index for all 15 conditions are shown in Figure 11, Figure 12, and Figure 13, respectively. For fault condition IDV6, the low stripper liquid level caused the shutdown of the process, and the simulation stopped at sample time 3709. A comparison of the fault detection rates and false alarm rates for these three techniques is shown in Table 6 and Table 7, respectively. It is evident that the proposed technique demonstrates superior performance in fault detection as compared to the PCA-based techniques. The PCA-based techniques are incapable of detecting most of the fault conditions. In IDV1 and IDV8, the T² statistic and the Q statistic are able to capture the abnormal behavior of the process relatively early; however, because of the increasing non-Gaussian disturbance introduced by the step fault as well as the closed-loop control actions, the PCA-based techniques fail to capture the abnormality shortly after sample time 4000. On the other hand, the proposed technique is able to characterize the non-Gaussian variation of the process and correctly identifies 12 of the 15 fault conditions with minimum delay and a high fault detection rate. In terms of the false alarm rate, it is observed in Table 7 that both techniques are able to correctly classify most of the normal online data samples.
In addition, fault conditions IDV3, IDV9, and IDV15 have been suggested to be difficult to detect when the control strategy is implemented.19,46−48 Similar results have been obtained in this study. Furthermore, the 2-D dynamic trajectories representing the dynamic behavior of the process under the fault conditions IDV6,
dx.doi.org/10.1021/ie500815a | Ind. Eng. Chem. Res. 2014, 53, 8831−8843
Industrial & Engineering Chemistry Research
Article
Figure 11. Fault detection results of PCA T2 statistic.
Figure 12. Fault detection results of PCA Q statistic.
Figure 13. Fault detection results of SOM Disim index.
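As a point of reference for the PCA baseline shown in Figures 11 and 12, the following is a minimal sketch of how the T² and Q statistics might be computed once the number of retained PCs is chosen from a cumulative-variance target (80% in this study). The function names, the autoscaling step, and the synthetic data are assumptions made for illustration; they are not taken from the authors' implementation, and control limits are not included.

```python
import numpy as np

def fit_pca_monitor(X_normal, var_target=0.80):
    """Fit a PCA monitoring model on normal data (rows = samples)."""
    mean = X_normal.mean(axis=0)
    std = X_normal.std(axis=0, ddof=1)
    Z = (X_normal - mean) / std                      # autoscaling (assumed)
    # SVD of the scaled data: Z = U S V^T
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    eigvals = s ** 2 / (Z.shape[0] - 1)              # variance of each PC
    cum = np.cumsum(eigvals) / eigvals.sum()
    # smallest k whose cumulative variance reaches the target
    k = int(np.searchsorted(cum, var_target) + 1)
    return {"mean": mean, "std": std, "P": Vt[:k].T, "lam": eigvals[:k], "k": k}

def t2_and_q(model, X):
    """Return Hotelling's T^2 and the Q (SPE) statistic for each row of X."""
    Z = (X - model["mean"]) / model["std"]
    T = Z @ model["P"]                               # scores on retained PCs
    t2 = np.sum(T ** 2 / model["lam"], axis=1)
    resid = Z - T @ model["P"].T                     # part not explained by the PCs
    q = np.sum(resid ** 2, axis=1)
    return t2, q

# Hypothetical usage on synthetic data (not the Tennessee Eastman data set)
rng = np.random.default_rng(0)
X_train = rng.normal(size=(1000, 8)) @ rng.normal(size=(8, 8))
model = fit_pca_monitor(X_train, var_target=0.80)
t2, q = t2_and_q(model, X_train[:5])
print(model["k"], t2.shape, q.shape)
```

Both statistics rest on second-order information only, which is one way to read the poor PCA detection rates reported here for non-Gaussian disturbances.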
Table 6. Fault Detection Rates of PCA T² Statistic, PCA Q Statistic, and SOM Disim Index
fault detection rate (%), early detection ([Y/N])
fault no.   PCA T²       PCA Q        SOM Disim index
IDV1        18.73 [Y]    20.85 [Y]    99.31 [Y]
IDV2         9.85 [N]    22.04 [N]    99.71 [Y]
IDV3         6.24 [N]    10.66 [N]     2.52 [N]
IDV4         6.04 [N]    10.30 [N]    94.93 [Y]
IDV5         6.59 [N]     8.61 [N]    87.32 [Y]
IDV6         8.66 [Y]     9.19 [Y]    98.17 [Y]
IDV7         7.45 [N]    11.02 [N]    98.45 [Y]
IDV8        30.77 [N]    19.44 [N]    97.50 [Y]
IDV9         7.33 [N]    12.28 [N]    22.82 [N]
IDV10        6.12 [N]    40.86 [N]   100    [Y]
IDV11       14.75 [N]    32.15 [N]    99.69 [Y]
IDV12       16.66 [N]     6.12 [N]    98.98 [Y]
IDV13       41.31 [N]    41.03 [N]    91.93 [Y]
IDV14        7.14 [N]    19.85 [N]    99.95 [Y]
IDV15        6.59 [N]     8.42 [N]     9.40 [N]

Table 7. Comparison of False Alarm Rate of PCA T² Statistic, PCA Q Statistic, and SOM Disim Index
false alarm rate (%)
fault no.   PCA T²   PCA Q   SOM Disim index
IDV1        4.31     3.35    5.50
IDV2        2.44     6.01    5.23
IDV3        3.63     6.33    4.57
IDV4        3.17     5.80    5.27
IDV5        5.14     4.57    5.03
IDV6        0.25     2.31    7.47
IDV7        8.70     5.67    5.23
IDV8        0.41     0.25    5.33
IDV9        7.25     9.83    5.17
IDV10       8.97     7.92    5.07
IDV11       6.71     8.12    5.00
IDV12       5.06     8.57    5.13
IDV13       0.01     0.33    5.23
IDV14       5.70     6.14    5.67
IDV15       3.43     3.83    5.80
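The rates in Tables 6 and 7 are percentages of samples flagged by a monitoring index. A minimal sketch of that bookkeeping is given below, assuming that an alarm is raised whenever the index exceeds its control limit and that the sample at which the fault is introduced is known. The function name and the `fault_start` argument are hypothetical, since this section does not state the fault introduction time.

```python
import numpy as np

def detection_and_false_alarm_rate(index, limit, fault_start):
    """Return (fault detection rate, false alarm rate) in percent.

    index       : monitoring index per sample (e.g., T^2, Q, or Disim)
    limit       : upper control limit of the index
    fault_start : position of the first faulty sample (assumed known)
    """
    index = np.asarray(index, dtype=float)
    if not 0 < fault_start < index.size:
        raise ValueError("fault_start must split the record into two parts")
    alarms = index > limit
    false_alarm_rate = 100.0 * alarms[:fault_start].mean()   # alarms before the fault
    detection_rate = 100.0 * alarms[fault_start:].mean()     # alarms after the fault
    return detection_rate, false_alarm_rate

# Hypothetical usage: 4 normal samples followed by 4 faulty samples
fdr, far = detection_and_false_alarm_rate(
    [0.2, 0.4, 2.0, 0.1, 3.0, 3.5, 0.5, 2.8], limit=1.0, fault_start=4
)
print(fdr, far)  # 75.0 25.0
```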
Figure 14. Dynamic monitoring of the TEP process with SOM for IDV6, IDV7, IDV10, and IDV11.
Figure 15. Fault diagnosis results of PCA-based techniques and the proposed dynamic loading vector based technique.
IDV7, IDV10, and IDV11 have been presented in Figure 14 to demonstrate the visual power of SOM for process monitoring.
The fault conditions IDV6, IDV7, IDV10, and IDV11 have been selected to verify the effectiveness of the proposed dynamic loading vector based fault diagnosis technique. The results of the diagnosis for the PCA-based techniques and the proposed technique are presented in Figure 15. In fault condition IDV6, a step change is introduced to feed A (X1) of the process. The Q statistic contribution and the proposed technique have correctly identified the root-cause variables. Feed A has the highest contribution to the fault. This fault condition is then propagated to the reactor, which introduces a disturbance to the catalyzed chemical reactions. Consequently, the stability of the reactor pressure (X7) and the reactor temperature (X9) is affected. On the other hand, the T² statistic has failed to identify the closely related root-cause variable. In the second fault condition, IDV7, a step change has been introduced to the C header pressure, which has reduced the availability of feed C (stream 4). The stripper is the first operating unit that is affected by this fault condition. The fault diagnosis results of the proposed technique have correctly captured the abnormality in the stripper pressure (X16), whereas the PCA-based techniques fail to identify the root-cause variable. Similarly, in fault condition IDV10, due to the random variation of temperature in feed C, the temperature of the stripper column (X18) is directly affected. This random variation in stripper temperature is correctly captured by the proposed technique. However, this fault has also introduced a non-Gaussian disturbance into the process, which leads to poor performance of the PCA-based techniques. In the last case, IDV11, the random variation in the reactor cooling water inlet temperature has resulted in abnormal behavior of the reactor temperature (X9). Because no online monitored data are available for this variable, the proposed technique, as a purely data-driven technique, is not able to identify the true root-cause process variable. Nevertheless, the most closely related monitored process variable, reactor temperature (X9), has been successfully identified. In contrast, the PCA-based techniques are incapable of diagnosing the most closely related monitored process variable.
In summary, the effectiveness of the proposed fault detection and diagnosis technique has been successfully verified on the Tennessee Eastman chemical process. It has been demonstrated that the proposed technique has superior performance over the conventional PCA-based techniques.
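The detection idea verified in this section (train the SOM on normal data plus random data, follow the best matching neuron of each online sample as a 2-D trajectory, and raise an alarm when the trajectory leaves the normal cluster) can be sketched as follows. This is only an illustration under stated assumptions: the dissimilarity is taken here as the Euclidean distance between the BMU coordinate and the mean BMU coordinate of the normal data, and the control limit as an empirical percentile of that distance. The index and limit derived earlier in the paper may differ, and all function names, map sizes, and schedules below are hypothetical.

```python
import numpy as np

def train_som(X, rows=10, cols=10, n_iter=3000, lr0=0.5, sigma0=3.0, seed=0):
    """Train a small rectangular SOM with online updates (assumed settings)."""
    rng = np.random.default_rng(seed)
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    W = X[rng.integers(0, len(X), size=rows * cols)].astype(float)  # init from data
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)                      # learning rate alpha(t)
        sigma = sigma0 * (1.0 - frac) + 0.5          # neighborhood width
        x = X[rng.integers(0, len(X))]
        bmu = np.argmin(np.sum((W - x) ** 2, axis=1))
        d2 = np.sum((grid - grid[bmu]) ** 2, axis=1)
        theta = np.exp(-d2 / (2.0 * sigma ** 2))     # Gaussian neighborhood theta(t)
        W += lr * theta[:, None] * (x - W)
    return W, grid

def bmu_trajectory(W, grid, X):
    """Map each sample to the 2-D coordinate of its best matching neuron."""
    d2 = ((X[:, None, :] - W[None, :, :]) ** 2).sum(axis=2)
    return grid[np.argmin(d2, axis=1)]

def fit_normal_cluster(traj_normal, percentile=95.0):
    """Center of the normal cluster and an empirical upper control limit."""
    center = traj_normal.mean(axis=0)
    limit = np.percentile(np.linalg.norm(traj_normal - center, axis=1), percentile)
    return center, limit

def dissimilarity(traj, center):
    """Assumed dissimilarity: distance of the trajectory from the cluster center."""
    return np.linalg.norm(traj - center, axis=1)

# Hypothetical usage with synthetic data
rng = np.random.default_rng(1)
normal = rng.normal(0.0, 0.2, size=(500, 3))
random_fill = rng.uniform(-3.0, 3.0, size=(500, 3))   # random data to span the space
W, grid = train_som(np.vstack([normal, random_fill]))
center, limit = fit_normal_cluster(bmu_trajectory(W, grid, normal))
online = normal[:50] + 2.5                             # a shifted, "faulty" batch
alarms = dissimilarity(bmu_trajectory(W, grid, online), center) > limit
print(alarms.mean())
```

Plotting the rows of `bmu_trajectory` in order gives a trajectory of the kind shown in Figure 14. No expected output is quoted because the result depends on the random initialization.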
5. CONCLUSIONS In this study, a SOM-based dynamic fault detection and diagnosis technique for non-Gaussian processes is proposed. The dynamic behavior of the process is represented as a 2-D trajectory on the SOM. A fault is detected when this trajectory exceeds the predefined upper control limit of the normal cluster. The dynamic loading vector is computed using the coordinates of the BMU on the trajectory and the captured mini-batch of data samples. A multivariate contribution plot is then generated for root-cause variable identification based on the influence of each process variable on the divergence of the dynamic loading vector.
The effectiveness of the proposed technique has been verified using a simple non-Gaussian model and the Tennessee Eastman chemical process. The results from both case studies have demonstrated that the proposed technique is able to detect faults at an early stage. By analyzing the dynamic loadings, the root-cause process variables are also correctly identified. The comparison has also demonstrated the superiority of the proposed technique over the conventional PCA-based techniques. As a data-driven technique, SOM-based diagnosis does not take into account the causal relationships among process variables; therefore, it is not able to identify a faulty process variable for which no online monitored data are available. In future research, this issue will be addressed by integrating the SOM-based technique with a Bayesian network, which allows further inference to identify root-cause process variables that are not monitored. In addition, this work will be extended by taking advantage of the timely fault detection and diagnosis of the SOM-based technique to develop a real-time risk management system for industrial processes. This development will focus on constructing a dynamic risk assessment model and on developing remedial measures/actions for different fault scenarios.
■
AUTHOR INFORMATION
Corresponding Author
*E-mail: fi[email protected].
Notes
The authors declare no competing financial interest.
■
SYMBOLS AND ACRONYMS
A = coefficient matrix of the first case study
C = coordinate of the neuron
I = identity matrix
T = principal score matrix
U = left-singular matrix
V = loading matrix
X = data sample matrix
Z = input matrix of the first case study
Dv = divergence of dynamic loading vector
T² = Hotelling’s statistic
Vk = loading matrix containing first k columns of V
Uk = left-singular matrix containing first k rows of U
SPE/Q = squared prediction error
Iin = input vector to self-organizing map
α(t) = learning rate at training time t
θ(t) = Gaussian neighborhood function at training time t
α = beta distribution first shape parameter
β = beta distribution second shape parameter
Φ = multivariate Gaussian noise of the first case study
BMU = best matching neuron
Contj = contribution of variable j to the divergence
PCA = principal component analysis
SOM = self-organizing map
Disim = dissimilarity index
■
REFERENCES
(1) Venkatasubramanian, V.; Rengaswamy, R.; Yin, K.; Kavuri, S. N. A review of process fault detection and diagnosis: Part I: Quantitative model-based methods. Comput. Chem. Eng. 2003, 27 (3), 293−311.
(2) Mina, J.; Verde, C. Fault detection for large scale systems using Dynamic Principal Components Analysis with adaptation. Int. J. Comput., Commun. Control 2007, 2 (2), 185−194.
(3) Yu, J.; Rashid, M. M. A novel dynamic bayesian network-based networked process monitoring approach for fault detection, propagation identification, and root cause diagnosis. AIChE J. 2013, 59 (7), 2348−2365.
(4) Doyle, F. J., III Nonlinear inferential control for process applications. J. Process Control 1998, 8 (5), 339−353.
(5) Mori, J.; Yu, J. A quality relevant non-Gaussian latent subspace projection method for chemical process monitoring and fault detection. AIChE J. 2014, 60 (2), 485−499.
(6) Chiang, L. H.; Braatz, R. D.; Russell, E. L. Fault detection and diagnosis in industrial systems; Springer: 2001.
(7) Nomikos, P.; MacGregor, J. F. Monitoring batch processes using multiway principal component analysis. AIChE J. 1994, 40 (8), 1361−1375.
(8) Kourti, T.; Lee, J.; Macgregor, J. F. Experiences with industrial applications of projection methods for multivariate statistical process control. Comput. Chem. Eng. 1996, 20, S745−S750.
(9) Zadakbar, O.; Imtiaz, S.; Khan, F. Dynamic risk assessment and fault detection using principal component analysis. Ind. Eng. Chem. Res. 2012, 52 (2), 809−816.
(10) Bakshi, B. R. Multiscale PCA with application to multivariate statistical process monitoring. AIChE J. 1998, 44 (7), 1596−1610.
(11) Gertler, J. Fault detection and diagnosis in engineering systems; CRC Press: 1998.
(12) Li, W.; Yue, H. H.; Valle-Cervantes, S.; Qin, S. J. Recursive PCA for adaptive process monitoring. J. Process Control 2000, 10 (5), 471−486.
(13) Qin, S. J. Statistical process monitoring: basics and beyond. J. Chemom. 2003, 17 (8−9), 480−502.
(14) Draper, B. A.; Yambor, W. S.; Beveridge, J. R. Analyzing PCA-based face recognition algorithms: Eigenvector selection and distance measures. Empirical Evaluation Methods in Computer Vision, Singapore; World Scientific: 2002; pp 1−15.
(15) Weber, J. J.; van Thuijl, J.; de Jong, H. J. Principal component analysis applied to collisionally-activated-decomposition mass spectra of alkylbenzenes. Anal. Chim. Acta 1986, 188, 195−204.
(16) Adler, N.; Yazhemsky, E. Improving discrimination in data envelopment analysis: PCA−DEA or variable reduction. Eur. J. Oper. Res. 2010, 202 (1), 273−284.
(17) Lee, J.-M.; Yoo, C.; Lee, I.-B. Statistical monitoring of dynamic processes based on dynamic independent component analysis. Chem. Eng. Sci. 2004, 59 (14), 2995−3006.
(18) Rashid, M. M.; Yu, J. Hidden Markov model based adaptive independent component analysis approach for complex chemical process monitoring and fault detection. Ind. Eng. Chem. Res. 2012, 51 (15), 5506−5514.
(19) Wang, J.; He, Q. P. Multivariate statistical process monitoring based on statistics pattern analysis. Ind. Eng. Chem. Res. 2010, 49 (17), 7858−7869.
(20) Martin, E.; Morris, A. Non-parametric confidence bounds for process performance monitoring charts. J. Process Control 1996, 6 (6), 349−358.
(21) Kohonen, T.; Oja, E.; Simula, O.; Visa, A.; Kangas, J. Engineering applications of the self-organizing map. Proc. IEEE 1996, 84 (10), 1358−1384.
(22) López-Rubio, E.; Munoz-Pérez, J.; Gómez-Ruiz, J. A. A principal components analysis self-organizing map. Neural Networks 2004, 17 (2), 261−270.
(23) Chen, X.; Yan, X. Using improved self-organizing map for fault diagnosis in chemical industry process. Chem. Eng. Res. Des. 2012, 90 (12), 2262−2277.
(24) Chopra, T.; Vajpai, J. Classification of Faults in DAMADICS Benchmark Process Control System Using Self Organizing Maps. Int. J. Soft Comput. 2011, 1 (3), 2231−2307.
(25) Gonçalves, L. F.; Bosa, J. L.; Balen, T. R.; Lubaszewski, M. S.; Schneider, E. L.; Henriques, R. V. Fault detection, diagnosis and prediction in electrical valves using self-organizing maps. J. Electronic Testing 2011, 27 (4), 551−564.
(26) Kowalski, C. T.; Orlowska-Kowalska, T. Neural networks application for induction motor faults diagnosis. Math. Comput. Simul. 2003, 63 (3), 435−448.
(27) Mirkes, E. M. Principal Component Analysis and Self-Organizing Maps: applet. University of Leicester: 2011.
(28) Deventer, J.; Moolman, D. W.; Aldrich, C. Visualisation of plant disturbances using self-organising maps. Comput. Chem. Eng. 1996, 20.
(29) Ng, Y.; Srinivasan, R. A Self-Organizing Map Approach for Process Fault Diagnosis during Process Transitions. AIChE Annual Meeting, Austin, TX, November 7−12, 2004.
(30) Song, Y.; Jiang, Q.; Yan, X.; Guo, M. A Multi-SOM with Canonical Variate Analysis for Chemical Process Monitoring and Fault Diagnosis. J. Chem. Eng. Jpn. 2014, 47 (1), 40−51.
(31) Vapola, M.; Simula, O.; Kohonen, T.; Meriläinen, P. Representation and identification of fault conditions of an anaesthesia system by means of the Self-Organizing Map. In ICANN’94; Springer: 1994; pp 350−353.
(32) Gonçalves, L. F.; Schneider, E. L.; Henriques, R. V. B.; Lubaszewski, M.; Bosa, J. L.; Engel, P. M. In Fault prediction in electrical valves using temporal Kohonen maps, Test Workshop (LATW), 2010 11th Latin American, IEEE: 2010; pp 1−6.
(33) Sirola, M.; Talonen, J.; Lampi, G. In SOM based methods in early fault detection of nuclear industry, Proceedings of the 17th European Symposium On Artificial Neural Networks, ESANN, 2009.
(34) Ng, Y.; Srinivasan, R. In Monitoring of distillation column operation through self-organizing maps, Dynamics and Control of Process Systems 2004 (DYCOPS-7): A Proceedings Volume from the 7th IFAC Symposium, Cambridge, Massachusetts, USA, 5−7 July 2004, Access Online via Elsevier: 2004; p 559.
(35) Domínguez, M.; Fuertes, J.; Reguera, P.; Díaz, I.; Cuadrado, A. A. Internet-based remote supervision of industrial processes using self-organizing maps. Eng. Appl. Artif. Intell. 2007, 20 (6), 757−765.
(36) Wall, M. E.; Rechtsteiner, A.; Rocha, L. M. Singular value decomposition and principal component analysis. A practical approach to microarray data analysis; 2003; p 91.
(37) Golub, G. H.; Reinsch, C. Singular value decomposition and least squares solutions. Numerische Math. 1970, 14 (5), 403−420.
(38) Jackson, J. E. A user’s guide to principal components; John Wiley & Sons: 2005; Vol. 587.
(39) Jackson, J. E.; Mudholkar, G. S. Control procedures for residuals associated with principal component analysis. Technometrics 1979, 21 (3), 341−349.
(40) Kohonen, T. The self-organizing map. Proc. IEEE 1990, 78 (9), 1464−1480.
(41) Kohonen, T. Self-organizing maps; Springer: 2001; Vol. 30.
(42) Murphy, K. P. Machine learning: a probabilistic perspective; MIT Press: 2012.
(43) Botev, Z.; Grotowski, J.; Kroese, D. Kernel density estimation via diffusion. Ann. Stat. 2010, 38 (5), 2916−2957.
(44) Ricker, N. L. Decentralized control of the Tennessee Eastman challenge process. J. Process Control 1996, 6 (4), 205−221.
(45) Downs, J. J.; Vogel, E. F. A plant-wide industrial process control problem. Comput. Chem. Eng. 1993, 17 (3), 245−255.
(46) Lee, J. M.; Qin, S. J.; Lee, I. B. Fault detection and diagnosis based on modified independent component analysis. AIChE J. 2006, 52 (10), 3501−3514.
(47) Kano, M.; Hasebe, S.; Hashimoto, I.; Ohno, H. Statistical process monitoring based on dissimilarity of process data. AIChE J. 2002, 48 (6), 1231−1240.
(48) Russell, E. L.; Chiang, L. H.; Braatz, R. D. Fault detection in industrial processes using canonical variate analysis and dynamic principal component analysis. Chemom. Intell. Lab. Syst. 2000, 51 (1), 81−93.